Network Working Group                                          M. Horton
Request for Comments: 1036                        AT&T Bell Laboratories
Obsoletes: RFC-850                                              R. Adams
                                              Center for Seismic Studies
                                                           December 1987

              Standard for Interchange of USENET Messages

STATUS OF THIS MEMO

   This document defines the standard format for the interchange of
   network News messages among USENET hosts.  It updates and replaces
   RFC-850, reflecting version B2.11 of the News program.  This memo is
   distributed as an RFC to make this information easily accessible to
   the Internet community.  It does not specify an Internet standard.
   Distribution of this memo is unlimited.

1. Introduction

   This document defines the standard format for the interchange of
   network News messages among USENET hosts.  It describes the format
   for messages themselves and gives partial standards for transmission
   of news.  The news transmission is not entirely specified, in order
   to give hosts a good deal of flexibility to choose transmission
   hardware and software, to batch news, and so on.

   There are five sections to this document.  Section two defines the
   format.  Section three defines the valid control messages.  Section
   four specifies some valid transmission methods.  Section five
   describes the overall news propagation algorithm.

2. Message Format

   The primary consideration in choosing a message format is that it
   fit in with existing tools as well as possible.  Existing tools
   include implementations of both mail and news.  (The notesfiles
   system from the University of Illinois is considered a news
   implementation.)  A standard format for mail messages has existed
   for many years on the Internet, and this format meets most of the
   needs of USENET.  Since the Internet format is extensible,
   extensions to meet the additional needs of USENET are easily made
   within the Internet standard.  Therefore, the rule is adopted that
   all USENET news messages must be formatted as valid Internet mail
   messages, according to the Internet standard RFC-822.
   The USENET News standard is more restrictive than the Internet
   standard, placing additional requirements on each message and
   forbidding use of certain Internet features.  However, it should
   always be possible to use a tool expecting an Internet message to
   process a news message.  In any situation where this standard
   conflicts with the Internet standard, RFC-822 should be considered
   correct and this standard in error.

   Here is an example USENET message to illustrate the fields.

      From: jerry@eagle.ATT.COM (Jerry Schwarz)
      Path: cbosgd!mhuxj!mhuxt!eagle!jerry
      Newsgroups: news.announce
      Subject: Usenet Etiquette -- Please Read
      Message-ID: <642@eagle.ATT.COM>
      Date: Fri, 19 Nov 82 16:14:55 GMT
      Followup-To: news.misc
      Expires: Sat, 1 Jan 83 00:00:00 -0500
      Organization: AT&T Bell Laboratories, Murray Hill

      The body of the message comes here, after a blank line.

   Here is an example of a message in the old format (before the
   existence of this standard).  It is recommended that implementations
   also accept messages in this format to ease upward conversion.

      From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
      Newsgroups: news.misc
      Title: Usenet Etiquette -- Please Read
      Article-I.D.: eagle.642
      Posted: Fri Nov 19 16:14:55 1982
      Received: Fri Nov 19 16:59:30 1982
      Expires: Mon Jan 1 00:00:00 1990

      The body of the message comes here, after a blank line.

   Some news systems transmit news in the A format, which looks like
   this:

      Aeagle.642
      news.misc
      cbosgd!mhuxj!mhuxt!eagle!jerry
      Fri Nov 19 16:14:55 1982
      Usenet Etiquette - Please Read
      The body of the message comes here, with no blank line.

   A standard USENET message consists of several header lines, followed
   by a blank line, followed by the body of the message.  Each header
   line consists of a keyword, a colon, a blank, and some additional
   information.
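
   As an illustration only, the layout just described (header lines, a
   blank line, then the body) could be split apart with a minimal
   Python sketch like the one below.  The name parse_message is
   hypothetical and not part of this standard.  The sketch assumes
   lines end in a single newline character, and it folds the
   continuation header lines that this section allows.

```python
def parse_message(text):
    """Split a message into (headers, body).

    headers is a list of (keyword, value) pairs in their original order.
    """
    # The first blank line separates the headers from the body.
    head, _, body = text.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: append it to the previous header's value.
            keyword, value = headers[-1]
            headers[-1] = (keyword, value + " " + line.strip())
            continue
        keyword, colon, value = line.partition(":")
        if not colon:
            raise ValueError("not a header line: %r" % line)
        headers.append((keyword, value.strip()))
    return headers, body


example = (
    "From: jerry@eagle.ATT.COM (Jerry Schwarz)\n"
    "Path: cbosgd!mhuxj!mhuxt!eagle!jerry\n"
    "Newsgroups: news.announce\n"
    "Subject: Usenet Etiquette -- Please Read\n"
    "Message-ID: <642@eagle.ATT.COM>\n"
    "Date: Fri, 19 Nov 82 16:14:55 GMT\n"
    "\n"
    "The body of the message comes here, after a blank line.\n"
)
headers, body = parse_message(example)
fields = dict(headers)
```

   Splitting each header at its first colon keeps values that contain
   further colons, such as the time of day in "Date", intact.
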
   This header syntax is a subset of the Internet standard, simplified
   to allow simpler software to handle it.  The "From" line may
   optionally include a full name, in the format above, or use the
   Internet angle bracket syntax.  To keep the implementations simple,
   other formats (for example, with part of the machine address after
   the close parenthesis) are not allowed.  The Internet convention of
   continuation header lines (beginning with a blank or tab) is
   allowed.

   Certain headers are required, and certain other headers are
   optional.  Any unrecognized headers are allowed, and will be passed
   through unchanged.  The required header lines are "From", "Date",
   "Newsgroups", "Subject", "Message-ID", and "Path".  The optional
   header lines are "Followup-To", "Expires", "Reply-To", "Sender",
   "References", "Control", "Distribution", "Keywords", "Summary",
   "Approved", "Lines", "Xref", and "Organization".  Each of these
   header lines will be described below.

2.1. Required Header lines

2.1.1. From

   The "From" line contains the electronic mailing address of the
   person who sent the message, in the Internet syntax.  It may
   optionally also contain the full name of the person, in parentheses,
   after the electronic address.  The electronic address is the same as
   the entity responsible for originating the message, unless the
   "Sender" header is present, in which case the "From" header might
   not be verified.  Note that in all host and domain names, upper and
   lower case are considered the same, thus "mark@cbosgd.ATT.COM",
   "mark@cbosgd.att.com", and "mark@CBosgD.ATt.COm" are all equivalent.
   User names may or may not be case sensitive, for example,
   "Billy@cbosgd.ATT.COM" might be different from
   "BillY@cbosgd.ATT.COM".  Programs should avoid changing the case of
   electronic addresses when forwarding news or mail.

   RFC-822 specifies that all text in parentheses is to be interpreted
   as a comment.  It is common in Internet mail to place the full name
   of the user in a comment at the end of the "From" line.
   This standard specifies a more rigid syntax.  The full name is not
   considered a comment, but an optional part of the header line.
   Either the full name is omitted, or it appears in parentheses after
   the electronic address of the person posting the message, or it
   appears before an electronic address which is enclosed in angle
   brackets.  Thus, the three permissible forms are:

      From: mark@cbosgd.ATT.COM
      From: mark@cbosgd.ATT.COM (Mark Horton)
      From: Mark Horton <mark@cbosgd.ATT.COM>

   Full names may contain any printing ASCII characters from space
   through tilde, except that they may not contain "(" (left
   parenthesis), ")" (right parenthesis), "<" (left angle bracket), or
   ">" (right angle bracket).  Additional restrictions may be placed on
   full names by the mail standard, in particular, the characters ","
   (comma), ":" (colon), "@" (at), "!" (bang), "/" (slash), "="
   (equal), and ";" (semicolon) are inadvisable in full names.

2.1.2. Date

   The "Date" line (formerly "Posted") is the date that the message was
   originally posted to the network.  Its format must be acceptable
   both in RFC-822 and to the getdate(3) routine that is provided with
   the Usenet software.  This date remains unchanged as the message is
   propagated throughout the network.  One format that is acceptable to
   both is:

      Wdy, DD Mon YY HH:MM:SS TIMEZONE

   Several examples of valid dates appear in the sample message above.
   Note in particular that ctime(3) format:

      Wdy Mon DD HH:MM:SS YYYY

   is not acceptable because it is not a valid RFC-822 date.  However,
   since older software still generates this format, news
   implementations are encouraged to accept this format and translate
   it into an acceptable format.

   There is no hope of having a complete list of timezones.  Universal
   Time (GMT), the North American timezones (PST, PDT, MST, MDT, CST,
   CDT, EST, EDT) and the +/-hhmm offset specified in RFC-822 should be
   supported.
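
   The date handling described above can be sketched in Python as
   follows.  This is a minimal sketch, not the getdate(3) routine.  The
   names parse_news_date and format_news_date are hypothetical,
   two-digit years are assumed to fall in the 1900s, and a ctime(3)
   date, which carries no zone, is assumed to be in GMT.

```python
import re
from datetime import datetime, timedelta, timezone

MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()
DAYS = "Mon Tue Wed Thu Fri Sat Sun".split()

# Hour offsets from GMT for the zone names listed in the text.
ZONES = {"GMT": 0, "EST": -5, "EDT": -4, "CST": -6, "CDT": -5,
         "MST": -7, "MDT": -6, "PST": -8, "PDT": -7}

# Wdy, DD Mon YY HH:MM:SS TIMEZONE
RFC822_DATE = re.compile(
    r"[A-Za-z]{3},\s+(\d{1,2})\s+([A-Za-z]{3})\s+(\d{2})\s+"
    r"(\d{2}):(\d{2}):(\d{2})\s+(\S+)$")
# Wdy Mon DD HH:MM:SS YYYY (ctime(3) format: accepted, never generated)
CTIME_DATE = re.compile(
    r"[A-Za-z]{3}\s+([A-Za-z]{3})\s+(\d{1,2})\s+"
    r"(\d{2}):(\d{2}):(\d{2})\s+(\d{4})$")


def zone_offset(name):
    """Return the offset from GMT for a zone name or a +/-hhmm string."""
    if name.upper() in ZONES:
        return timedelta(hours=ZONES[name.upper()])
    m = re.fullmatch(r"([+-])(\d{2})(\d{2})", name)
    if m is None:
        raise ValueError("unsupported time zone: %r" % name)
    offset = timedelta(hours=int(m.group(2)), minutes=int(m.group(3)))
    return -offset if m.group(1) == "-" else offset


def month_number(name):
    """Map a three-letter month name to 1..12 (ValueError if unknown)."""
    return MONTHS.index(name.capitalize()) + 1


def parse_news_date(text):
    """Parse either accepted format and return an aware datetime in GMT."""
    text = text.strip()
    m = RFC822_DATE.match(text)
    if m:
        day, mon, yy, hh, mm, ss, zone = m.groups()
        # Assumption: two-digit years belong to the 1900s.
        local = datetime(1900 + int(yy), month_number(mon), int(day),
                         int(hh), int(mm), int(ss),
                         tzinfo=timezone(zone_offset(zone)))
        return local.astimezone(timezone.utc)
    m = CTIME_DATE.match(text)
    if m:
        mon, day, hh, mm, ss, year = m.groups()
        # Assumption: no zone is present, so the time is taken as GMT.
        return datetime(int(year), month_number(mon), int(day),
                        int(hh), int(mm), int(ss), tzinfo=timezone.utc)
    raise ValueError("unrecognized date: %r" % text)


def format_news_date(when):
    """Format an aware datetime as 'Wdy, DD Mon YY HH:MM:SS GMT'."""
    when = when.astimezone(timezone.utc)
    return "%s, %d %s %02d %02d:%02d:%02d GMT" % (
        DAYS[when.weekday()], when.day, MONTHS[when.month - 1],
        when.year % 100, when.hour, when.minute, when.second)
```

   With this sketch, the "Date" and "Expires" values of the sample
   message parse to 16:14:55 GMT on 19 November 1982 and 05:00:00 GMT
   on 1 January 1983 respectively.  Day and month names are spelled out
   in tables rather than taken from strftime so that the output does
   not depend on the local language settings.
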
   It is recommended that times in message headers be transmitted in
   GMT and displayed in the local time zone.

2.1.3. Newsgroups

   The "Newsgroups" line specifies the newsgroup or newsgroups in which
   the message belongs.  Multiple newsgroups may be specified,
   separated by a comma.  Newsgroups specified must all be the names of
   existing newsgroups, as no new newsgroups will be created by simply
   posting to them.

   Wildcards (e.g., the word "all") are never allowed in a "Newsgroups"
   line.  For example, a newsgroup comp.all is illegal, although a
   newsgroup rec.sport.football is permitted.

   If a message is received with a "Newsgroups" line listing some valid
   newsgroups and some invalid newsgroups, a host should not remove
   invalid newsgroups from the list.  Instead, the invalid newsgroups
   should be ignored.  For example, suppose host A subscribes to the
   classes btl.all and comp.all, and exchanges news messages with host
   B, which subscribes to comp.all but not btl.all.  Suppose A receives
   a message with "Newsgroups: comp.unix,btl.general".  This message is
   passed on to B because B receives comp.unix, but B does not receive
   btl.general.  A must leave the "Newsgroups" line unchanged.  If it
   were to remove btl.general, the edited header could eventually
   re-enter the btl.all class, resulting in a message that is not shown
   to users subscribing to btl.general.  Also, follow-ups from outside
   btl.all would not be shown to such users.

2.1.4. Subject

   The "Subject" line (formerly "Title") tells what the message is
   about.  It should be suggestive enough of the contents of the
   message to enable a reader to make a decision whether to read the
   message based on the subject alone.  If the message is submitted in
   response to another message (e.g., is a follow-up) the default
   subject should begin with the four characters "Re: " (including the
   trailing blank), and the "References" line is required.  For
   follow-ups, the use of the "Summary" line is encouraged.

2.1.5. Message-ID

   The "Message-ID" line gives the message a unique identifier.  The
   Message-ID may not be reused during the lifetime of any previous
   message with the same Message-ID.  (It is recommended that no
   Message-ID be reused for at least two years.)  Message-ID's have the
   syntax:

      <string not containing blank or ">">

   In order to conform to RFC-822, the Message-ID must have the format:

      <unique@full_domain_name>

   where full_domain_name is the full name of the host at which the
   message entered the network, including a domain that host is in, and
   unique is any string of printing ASCII characters, not including "<"
   (left angle bracket), ">" (right angle bracket), or "@" (at sign).

   For example, the unique part could be an integer representing a
   sequence number for messages submitted to the network, or a short
   string derived from the date and time the message was created.  For
   example, a valid Message-ID for a message submitted from host ucbvax
   in domain "Berkeley.EDU" would be "<4123@ucbvax.Berkeley.EDU>".
   Programmers are urged not to make assumptions about the content of
   Message-ID fields from other hosts, but to treat them as unknown
   character strings.  It is not safe, for example, to assume that a
   Message-ID will be under 14 characters, that it is unique in the
   first 14 characters, nor that it does not contain a "/".

   The angle brackets are considered part of the Message-ID.  Thus, in
   references to the Message-ID, such as the ihave/sendme and cancel
   control messages, the angle brackets are included.  White space
   characters (e.g., blank and tab) are not allowed in a Message-ID.
   Slashes ("/") are strongly discouraged.  All characters between the
   angle brackets must be printing ASCII characters.

2.1.6. Path

   This line shows the path the message took to reach the current
   system.  When a system forwards the message, it should add its own
   name to the list of systems in the "Path" line.
   The names may be separated by any punctuation character or
   characters (except "." which is considered part of the hostname).
   Thus, the following are valid entries:

      cbosgd!mhuxj!mhuxt
      cbosgd, mhuxj, mhuxt
      @cbosgd.ATT.COM,@mhuxj.ATT.COM,@mhuxt.ATT.COM
      teklabs, zehntel, sri-unix@cca!decvax

   (The last path indicates a message that passed through decvax, cca,
   sri-unix, zehntel, and teklabs, in that order.)  Additional names
   should be added from the left.  For example, the most recently added
   name in the fourth example was teklabs.  Letters, digits, periods
   and hyphens are considered part of host names; other punctuation,
   including blanks, is treated as a separator.

   Normally, the rightmost name will be the name of the originating
   system.  However, it is also permissible to include an extra entry
   on the right, which is the name of the sender.  This is for upward
   compatibility with older systems.

   The "Path" line is not used for replies, and should not be taken as
   a mailing address.  It is intended to show the route the message
   traveled to reach the local host.  There are several uses for this
   information.  One is to monitor USENET routing for performance
   reasons.  Another is to establish a path to reach new hosts.
   Perhaps the most important use is to cut down on redundant USENET
   traffic by declining to forward a message to a host that is known to
   have already received it.  In particular, when host A sends a
   message to host B, the "Path" line includes A, so that host B will
   not immediately send the message back to host A.  The name each host
   uses to identify itself should be the same as the name by which its
   neighbors know it, in order to make this optimization possible.

   A host adds its own name to the front of a path when it receives a
   message from another host.
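
   The path handling described above can be sketched in Python as
   follows.  This is a minimal sketch under stated assumptions: the
   function names are hypothetical, host names are compared without
   regard to case (as Section 2.1.1 says of host names), new names are
   joined with "!", and an optional sender name at the right end is not
   distinguished from a host name.

```python
import re

# Letters, digits, periods and hyphens form a name; anything else separates.
NAME = re.compile(r"[A-Za-z0-9.-]+")


def path_names(path):
    """Return the names in a Path value, most recently added first."""
    return NAME.findall(path)


def already_seen(path, host):
    """True if host appears in the path, so forwarding to it is redundant."""
    return host.lower() in (name.lower() for name in path_names(path))


def add_host(path, host):
    """Add host to the front (left) of the path."""
    return host + "!" + path


# The fourth sample entry, read right to left, gives the travel order.
route = list(reversed(path_names("teklabs, zehntel, sri-unix@cca!decvax")))
```

   Here route lists decvax, cca, sri-unix, zehntel, and teklabs, the
   order given in the text.  A host would call already_seen with a
   neighbor's name before forwarding to that neighbor.
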
   For example, if a message with path "A!X!Y!Z" is passed from host A
   to host B, B will add its own name to the path when it receives the
   message from A, giving "B!A!X!Y!Z".  If B then passes the message on
   to C, the message sent to C will contain the