(2023-04-10) Why HTTP/0.9 sucks but still matters
-------------------------------------------------
Among various posts and discussions around some "big Web" proponents
criticizing Gopher and Gemini, sometimes the HTTP/0.9 abbreviation comes up.
This is the original specification of HTTP ([1]) from 1991, which is notably
similar to the Gopher protocol in some aspects: first, the server accepts a
single CRLF-terminated request string (where CR is optional); second, the
server returns the response body immediately, with no headers or status
messages, and closes the TCP connection when finished. All this sounds so
familiar that you might think "wow, the only HTTP version that could be
integrated within Gophermaps!"

Well, yes, but actually not quite.

First, the request string we send to the HTTP/0.9 server really can only
refer to the pathname, similar to the selectors in Gopher, but it must be
prepended with the "GET" method name and a space. There actually are no other
methods in this HTTP version, just "GET", but it must be specified. This
already makes building a selector not so straightforward but still doable.

Well, the second problem is far more serious. The standard explicitly says
that the response must be returned in the HTML format. It is the only content
type allowed. There is, however, an interesting remark in the original
statement:

> The format of the message is HTML - that is, a trimmed SGML document. Note
> that this format allows for menus and hit lists to be returned as
> hypertext. It also allows for plain ASCII text to be returned following
> the PLAINTEXT tag.

Dafuq is "the PLAINTEXT tag"? As someone who has known HTML for about 15
years (and its basics for like 20), I had never heard of one. But I looked it
up on MDN, and indeed, here it is ([2]):

> The <plaintext> HTML element renders everything following the start tag as
> raw text, ignoring any following HTML. There is no closing tag, since
> everything after it is considered raw text.

Wow. Just wow.
This is even cooler than <xmp> and <marquee>. This tag has been deprecated
since HTML 2 (!) and the MDN page is half-red from trying to tell you that
you should not use it, but all major browsers still support it.

What does all this mean to us? It essentially means two things:

1) we still can view/download HTTP/0.9 documents directly from Gopher clients
   by shaping special GET-prefixed selectors in the maps or addresses;
2) we still can serve plaintext documents to Web browsers supporting HTTP/0.9
   requests from a Gopher server under special GET-prefixed selectors by
   automatically prepending <plaintext> before our response.

Unfortunately, case 1 is much harder to try out nowadays because more and
more servers are dropping HTTP/0.9 support and treat single-line requests as
malformed HTTP/1.0 or HTTP/1.1 requests instead. For instance, anything
behind Cloudflare or Traefik can't be accessed using HTTP/0.9. Unsurprisingly
though, textfiles.com and frogfind.com worked, and you actually can shape
URLs like gopher://frogfind.com:80/0GET%20/?q=example or
gopher://textfiles.com:80/0GET%20/internet/acronyms.txt. Of course, Textfiles
has a direct mirror in Gopherspace anyway, but you get the idea. Nothing
prevents you from appending " HTTP/1.0" to the selector if you want to, but
then you'll have to deal with all the header parsing/skipping yourself.

For case 2, there also might be another issue with Chromium-based browsers
dropping HTTP/0.9 support for non-default ports (and for insecure
connections, this only leaves port 80 for us). Gopher, thankfully, doesn't
care which port it runs on, so we can proxy all requests coming to port 80 by
removing the "GET " part from the request and then prepending <plaintext> to
all the responses. Of course, it won't do any good for gophermaps and
binaries, but it will at least make any plaintext documents accessible from
the browser with little to no overhead.
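
To make the two cases a bit more concrete, here is a rough Python sketch. It
is not what this server actually runs, and every function name in it is made
up for illustration. It only handles the single-line HTTP/0.9 form of the
request, and fetch_http09 assumes the remote host still answers such requests
at all, which, as noted above, many no longer do.

```python
import socket
from urllib.parse import quote


def http09_request(path: str) -> bytes:
    """Build the whole HTTP/0.9 request: 'GET', a space, the path, CRLF."""
    return f"GET {path}\r\n".encode("ascii")


def gopher_url(host: str, path: str, port: int = 80) -> str:
    """Case 1: a Gopher URL whose selector doubles as an HTTP/0.9 request.

    Item type 0 (plain text) is an assumption; the response is HTML
    according to the spec, but a Gopher client will just show it as text.
    """
    selector = quote(f"GET {path}", safe="/?=&")  # the space becomes %20
    return f"gopher://{host}:{port}/0{selector}"


def fetch_http09(host: str, path: str, port: int = 80,
                 timeout: float = 10.0) -> bytes:
    """Send the one-line request and read until the server closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(http09_request(path))
        while True:
            data = sock.recv(4096)
            if not data:  # no headers, no length: EOF ends the response
                break
            chunks.append(data)
    return b"".join(chunks)


def selector_from_request(line: bytes) -> str:
    """Case 2: strip 'GET ' and the line ending to get a Gopher selector."""
    text = line.decode("ascii", errors="replace").rstrip("\r\n")
    if not text.startswith("GET "):
        raise ValueError("not an HTTP/0.9-style request")
    return text[len("GET "):]


def wrap_plaintext(body: bytes) -> bytes:
    """Case 2: make a text document renderable as-is by a Web browser."""
    return b"<plaintext>" + body
```

With these helpers, gopher_url("textfiles.com", "/internet/acronyms.txt")
produces the same gopher://textfiles.com:80/0GET%20/internet/acronyms.txt
address as the one shown above, and a port 80 proxy would boil down to
calling selector_from_request on the incoming line and wrap_plaintext on
whatever the Gopher server returns for that selector.
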
Or, we can add a bit of overhead to generate a valid HTTP/1.0 response by
prepending the following instead of the <plaintext> tag:

  HTTP/1.0 200 OK<CRLF>Content-Type: text/plain<CRLF><CRLF>

For experimental purposes, I created a "GET " directory in the content root
of this server and placed an index.map file with my <plaintext> message
there. If you visit http://hoi.st:70/ from a Web browser that still supports
HTTP/0.9 _requests_, you should be able to see my message about HTTP 0.9.
Otherwise you'll see a message about HTTP 1.0 or 1.1, depending on which
version you're trying to access the server with. The sources of those
messages are located in the corresponding files in the " HTTP" directory.

As you can see, some interoperability still can be achieved, but of course,
it should only be viewed as a temporary measure. A proper Gopher client is
something no Web browser or proxy can ever replace.

--- Luxferre ---

[1]: https://www.w3.org/Protocols/HTTP/AsImplemented.html
[2]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/plaintext