mala::home Davide “+mala” Eynard’s website

8 Nov 2009

Request for Comments: learn Internet standards by reading the documents that gave them birth

[Foreword: this is article number 3 of the new "hacks" series. Read here if you want to know more about this. A huuuuuge THANX to Aliosha who helped me with the translation of this article!]

A typical characteristic of hackers is the desire to understand, in the most intimate detail, how any piece of machinery works. From this point of view, the Internet is one of the most interesting objects of study, since it offers a huge variety of concepts to learn: just think how many basic formats and standards it relies on... And all the nice hacks we could perform once we understand how they work!

Luckily, most of these standards are published in freely accessible and easily obtainable notes: these documents are called RFCs (Requests for Comments), and have been used for almost 40 years to share information and observations regarding Internet formats, technologies, protocols and standards. The first RFC dates back to 1969, and since then more than 5500 have been published. Each one of them has to pass a demanding selection process led by the IETF (Internet Engineering Task Force), whose mission, as described in RFC 3935 and RFC 4677, is to produce documents that influence the way people design, use, and manage the Internet "in such a way as to make the Internet work better."

Jon Postel, one of the authors of the first RFCs and a contributor to the project for 28 years, always typed on his keyboard using only two fingers :-)

The RFC format

In order to become an RFC, a technical document must first of all follow a very strict format. At first glance its stark look is striking: a simple text file with lines of at most 72 characters, formatted exclusively with standard ASCII characters. On second thought, the reason for this choice is easy to understand: what other format has not changed since 1969 and can be displayed on any computer, no matter how old it is or which OS it runs?
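To make the format constraints concrete, here is a minimal sketch of a checker that flags lines that are too wide or contain non-ASCII characters. The function name `format_problems` is my own invention, and the width limit is passed in as a parameter rather than hard-coded, so take it as an illustration rather than the official validation procedure.

```python
from typing import List


def format_problems(text: str, max_width: int) -> List[str]:
    """Return a list of human-readable problems found in a plain-text document.

    max_width is supplied by the caller (hypothetical parameter): the sketch
    only checks line width and the ASCII-only rule, nothing else.
    """
    problems = []
    # Split on "\n" only, so that form feeds (page breaks) do not count as line ends.
    for lineno, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")
        if len(line) > max_width:
            problems.append(f"line {lineno}: {len(line)} columns (limit {max_width})")
        if not line.isascii():
            problems.append(f"line {lineno}: non-ASCII character")
    return problems
```

Calling it on the text of a downloaded RFC should return an empty list if the document respects both constraints.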

Despite the fact that many standards are now more than stable, the number of published RFCs keeps increasing.

Every RFC has a header carrying information that is especially important for the reader. In addition to the title, date and authors, it contains the unique serial number of the document, its relations with preceding documents, and its category. For example (see figure below), the most recent RFC that describes the SMTP protocol is 5321, which updates RFC 1123, obsoletes RFC 2821, and belongs to the "Standards Track" category. Conversely, if we read that a document has been "obsoleted" by another, it is better to look for that other one, since it will contain more up-to-date information.
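Since the header is plain text with a regular layout, its metadata can be pulled out with a few lines of code. The sketch below assumes the left-hand column uses "Key: value" lines as in the RFC 5321 header; `SAMPLE_HEADER` is a hand-typed sample modeled on that header, and `parse_rfc_header` is a hypothetical helper, not part of any official tool. Older RFCs may lay out their headers differently.

```python
import re

# Hand-typed sample modeled on the first lines of RFC 5321 (an assumption
# about the exact spacing; only the "Key: value" structure matters here).
SAMPLE_HEADER = """\
Network Working Group                                         J. Klensin
Request for Comments: 5321                                  October 2008
Obsoletes: 2821
Updates: 1123
Category: Standards Track
"""


def parse_rfc_header(header: str) -> dict:
    """Extract number, relations and category from the left column of a header."""
    info = {"number": None, "obsoletes": [], "updates": [], "category": None}
    for line in header.splitlines():
        # The right-hand column (authors, date) is separated by runs of spaces.
        left = re.split(r"\s{2,}", line.strip())[0]
        key, sep, value = left.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key, value = key.strip().lower(), value.strip()
        if key == "request for comments":
            info["number"] = int(value)
        elif key in ("obsoletes", "updates"):
            info[key] = [int(n) for n in re.findall(r"\d+", value)]
        elif key == "category":
            info["category"] = value
    return info


info = parse_rfc_header(SAMPLE_HEADER)
print(info["number"], info["obsoletes"], info["updates"], info["category"])
```

The "obsoletes" list is the one to follow backwards when tracing the history of a protocol, while an "Obsoleted by" note on an old document points forwards to its replacement.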

RFCs fall into several categories, depending on the level of standardization reached by the described protocol or format at the moment of publication. The documents considered the most official are split into three main categories: well-established standards (standard), drafts (draft) and proposed standards (proposed). There are also three non-standard classes: experimental documents (experimental), informative ones (informational), and historical ones (historic). Finally, there is an "almost standard" category containing the Best Current Practices (BCP), that is, all those non-official practices that are considered the most sensible to adopt.

The header of RFC 5321, the document devoted to the SMTP protocol.

Finding the document we want

Now that we understand the meaning of the metadata associated with an RFC (the data that pertain not to the content of the document but to the document itself), we only have to take a peek inside the official IETF archive to see if there is information of interest to us. There are several ways to find an RFC: the first, and simplest, can be used when we know the document's serial number, and consists in opening the address http://www.ietf.org/rfc/rfcxxxx.txt, where xxxx is that number. For instance, the first RFC in computer history is available at http://www.ietf.org/rfc/rfc0001.txt.
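The address pattern above is easy to automate. Here is a minimal sketch that builds the URL from a serial number, zero-padding it to four digits as in the rfc0001.txt example, and optionally downloads the text. The names `rfc_url` and `fetch_rfc` are hypothetical, and I am assuming the padding rule generalizes from that single example and that the archive still answers at this address.

```python
from urllib.request import urlopen

# URL pattern taken from the article; whether it still resolves is an assumption.
RFC_URL_TEMPLATE = "http://www.ietf.org/rfc/rfc{number:04d}.txt"


def rfc_url(number: int) -> str:
    """Build the archive URL for the RFC with the given serial number."""
    if number < 1:
        raise ValueError("RFC numbers start at 1")
    return RFC_URL_TEMPLATE.format(number=number)


def fetch_rfc(number: int) -> str:
    """Download an RFC as text (requires network access)."""
    with urlopen(rfc_url(number)) as response:
        # RFCs of this era are plain ASCII; replace anything unexpected.
        return response.read().decode("ascii", errors="replace")


print(rfc_url(1))
print(rfc_url(5321))
```

Only `rfc_url` runs in the example; `fetch_rfc(5321)` would retrieve the SMTP document discussed earlier when a connection is available.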

Another search approach consists in starting from a protocol name and searching for all the documents that are related to it. To do this, we can use the list of Official Internet Protocol Standards that is available at http://www.rfc-editor.org/rfcxx00.html. In this list you can find the acronyms of many protocols, their full names and the standards they belong to: for instance, the IP, ICMP, and IGMP protocols are described in different RFCs but they are all part of the same standard (number 5).

Finally, you can search for documents according to their status or category: at http://www.rfc-editor.org/category.html you can find an index of RFCs ordered by publication status. In each section, current documents appear in black while obsolete ones appear in red, together with the number of the RFC that obsoleted them.

The tools we have just described should be enough in most cases: in fact, we usually know at least the name of the protocol we want to study, if not the number of the RFC where it is described. However, whenever we only have a vague idea of the concepts we want to learn, we can use the search engine available at http://www.rfc-editor.org/rfcsearch.html. If, for instance, we want to know more about the encoding used for mail attachments, we can just search for "mail attachment" and obtain the list of titles of the RFCs that deal with this topic (in this case, RFC 2183).

What should I read now?

When you have an archive like this available, the biggest problem you have to face is the sheer quantity of information: a lifetime is not enough to read all of these RFCs! Fortunately, the search options help filter out everything that is not of interest to us. But what would be good starting points for our research?

If we don't know where to begin, having a look at the basic protocols is always a good way to start: you can begin with the easiest, higher-level ones, such as the ones regarding email (POP3, IMAP, and SMTP, already partially described here), the Web (HTTP), or other famous application-level protocols (FTP, IRC, DNS, and so on). Transport and network protocols, such as TCP and IP, are more complicated but no less interesting than the others.

If, instead, you are searching for something simpler you can check the informational RFCs: they include many interesting documents, such as RFC 2151 (A Primer On Internet and TCP/IP Tools and Utilities) and RFC 2504 (Users' Security Handbook). April Fools' documents deserve a special mention, being funny jokes written as formal RFCs (http://en.wikipedia.org/wiki/April_Fools%27_Day_RFC). Finally, if you still have problems with English (so, why are you reading this? ;-)) you might want to search the Internet for RFCs translated into your language. For instance, at http://rfc.altervista.org, you can find the Italian version of the RFCs describing the most common protocols.

Filed under: hacks