2.7.  #RULE:  LISTS

A construct "#" is defined, similar to "*", as follows:

    <l>#<m>element

indicating at least <l> and at most <m> elements, each separated
by one or more commas (","). This makes the usual form of lists
very easy; a rule such as '(element *("," element))' can be shown
as "1#element". Wherever this construct is used, null elements
are allowed, but do not contribute to the count of elements
present. That is, "(element),,(element)" is permitted, but counts
as only two elements. Therefore, where at least one element is
required, at least one non-null element must be present. Default
values are 0 and infinity so that "#(element)" allows any number,
including zero; "1#element" requires at least one; and
"1#2element" allows one or two.

2.8.  ; COMMENTS

A semi-colon, set off some distance to the right of rule text,
starts a comment that continues to the end of line. This is a
simple way of including useful notes in parallel with the
specifications.

August 13, 1982              - 4 -                      RFC #822

Standard for ARPA Internet Text Messages

3.  LEXICAL ANALYSIS OF MESSAGES

3.1.  GENERAL DESCRIPTION

A message consists of header fields and, optionally, a body. The
body is simply a sequence of lines containing ASCII characters.
It is separated from the headers by a null line (i.e., a line
with nothing preceding the CRLF).

3.1.1.  LONG HEADER FIELDS

Each header field can be viewed as a single, logical line of
ASCII characters, comprising a field-name and a field-body. For
convenience, the field-body portion of this conceptual entity can
be split into a multiple-line representation; this is called
"folding". The general rule is that wherever there may be
linear-white-space (NOT simply LWSP-chars), a CRLF immediately
followed by AT LEAST one LWSP-char may instead be inserted. Thus,
the single line

    To: "Joe & J. Harvey" <ddd @Org>, JJV @ BBN

can be represented as:

    To: "Joe & J. Harvey" <ddd @ Org>,
            JJV@BBN

and

    To: "Joe & J. Harvey"
                    <ddd@ Org>, JJV
     @BBN

and

    To: "Joe &
     J. Harvey" <ddd @ Org>, JJV @ BBN

The process of moving from this folded multiple-line
representation of a header field to its single line
representation is called "unfolding". Unfolding is accomplished
by regarding CRLF immediately followed by a LWSP-char as
equivalent to the LWSP-char.

Note:  While the standard permits folding wherever
       linear-white-space is permitted, it is recommended that
       structured fields, such as those containing addresses,
       limit folding to higher-level syntactic breaks. For
       address fields, it is recommended that such folding occur
       between addresses, after the separating comma.

3.1.2.  STRUCTURE OF HEADER FIELDS

Once a field has been unfolded, it may be viewed as being
composed of a field-name followed by a colon (":"), followed by a
field-body, and terminated by a carriage-return/line-feed. The
field-name must be composed of printable ASCII characters (i.e.,
characters that have values between 33. and 126., decimal, except
colon). The field-body may be composed of any ASCII characters,
except CR or LF. (While CR and/or LF may be present in the actual
text, they are removed by the action of unfolding the field.)

Certain field-bodies of headers may be interpreted according to
an internal syntax that some systems may wish to parse. These
fields are called "structured fields". Examples include fields
containing dates and addresses. Other fields, such as "Subject"
and "Comments", are regarded simply as strings of text.

Note:  Any field which has a field-body that is defined as other
       than simply <text> is to be treated as a structured field.

       Field-names, unstructured field bodies and structured
       field bodies each are scanned by their own, independent
       "lexical" analyzers.

3.1.3.  UNSTRUCTURED FIELD BODIES

For some fields, such as "Subject" and "Comments", no structuring
is assumed, and they are treated simply as <text>s, as in the
message body. Rules of folding apply to these fields, so that
such field bodies which occupy several lines must therefore have
the second and successive lines indented by at least one
LWSP-char.

3.1.4.  STRUCTURED FIELD BODIES

To aid in the creation and reading of structured fields, the free
insertion of linear-white-space (which permits folding by
inclusion of CRLFs) is allowed between lexical tokens. Rather
than obscuring the syntax specifications for these structured
fields with explicit syntax for this linear-white-space, the
existence of another "lexical" analyzer is assumed. This analyzer
does not apply for unstructured field bodies that are simply
strings of text, as described above. The analyzer provides an
interpretation of the unfolded text composing the body of the
field as a sequence of lexical symbols.

These symbols are:

    -  individual special characters
    -  quoted-strings
    -  domain-literals
    -  comments
    -  atoms

The first four of these symbols are self-delimiting. Atoms are
not; they are delimited by the self-delimiting symbols and by
linear-white-space. For the purposes of regenerating sequences of
atoms and quoted-strings, exactly one SPACE is assumed to exist,
and should be used, between them. (Also, in the "Clarifications"
section on "White Space", below, note the rules about treatment
of multiple contiguous LWSP-chars.)

So, for example, the folded body of an address field

    ":sysmail"@ Some-Group. Some-Org,
     Muhammed.(I am the greatest) Ali @(the)Vegas.WBA

is analyzed into the following lexical symbols and types:

    :sysmail              quoted string
    @                     special
    Some-Group            atom
    .                     special
    Some-Org              atom
    ,                     special
    Muhammed              atom
    .                     special
    (I am the greatest)   comment
    Ali                   atom
    @                     special
    (the)                 comment
    Vegas                 atom
    .                     special
    WBA                   atom

The canonical representations for the data in these addresses are
the following strings:

    ":sysmail"@Some-Group.Some-Org

and

    Muhammed.Ali@Vegas.WBA

Note:  For purposes of display, and when passing such structured
       information to other systems, such as mail protocol
       services, there must be NO linear-white-space between
       <word>s that are separated by period (".") or at-sign
       ("@") and exactly one SPACE between all other <word>s.
       Also, headers should be in a folded form.

3.2.  HEADER FIELD DEFINITIONS

These rules show a field meta-syntax, without regard for the
particular type or internal syntax. Their purpose is to permit
detection of fields; also, they present to higher-level parsers
an image of each field as fitting on one line.

    field       =  field-name ":" [ field-body ] CRLF

    field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">

    field-body  =  field-body-contents
                   [CRLF LWSP-char field-body]

    field-body-contents =
                  <the ASCII characters making up the field-body, as
                   defined in the following sections, and consisting
                   of combinations of atom, quoted-string, and
                   specials tokens, or else consisting of texts>

3.3.  LEXICAL TOKENS
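
The unfolding rule of Section 3.1.1 and the lexical analysis of Section 3.1.4 can be sketched in a few lines of Python. This is an illustrative sketch, not part of the standard. The function names (unfold, tokenize, split_on_commas, canonical) are hypothetical, and the set of special characters is taken from the definition of "specials" in Section 3.3. As a simplification, control characters are not rejected and unterminated quoted-strings, comments, and domain-literals are not diagnosed.

```python
import re

# The "specials" of Section 3.3: ( ) < > @ , ; : \ " . [ ]
SPECIALS = set('()<>@,;:\\".[]')


def unfold(text):
    """Regard CRLF immediately followed by a LWSP-char as that LWSP-char."""
    return re.sub(r"\r\n(?=[ \t])", "", text)


def _scan_delimited(body, start, close):
    """Return the index just past the closing delimiter, honoring backslash quoting."""
    j = start + 1
    while j < len(body) and body[j] != close:
        j += 2 if body[j] == "\\" else 1
    return j + 1


def tokenize(body):
    """Split an unfolded structured field body into (text, type) pairs."""
    tokens = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch in " \t":                      # linear-white-space only delimits
            i += 1
        elif ch == '"':
            j = _scan_delimited(body, i, '"')
            tokens.append((body[i:j], "quoted-string"))
            i = j
        elif ch == "[":
            j = _scan_delimited(body, i, "]")
            tokens.append((body[i:j], "domain-literal"))
            i = j
        elif ch == "(":                      # comments may nest
            depth, j = 1, i + 1
            while j < n and depth:
                if body[j] == "\\":
                    j += 1                   # skip the quoted character
                elif body[j] == "(":
                    depth += 1
                elif body[j] == ")":
                    depth -= 1
                j += 1
            tokens.append((body[i:j], "comment"))
            i = j
        elif ch in SPECIALS:
            tokens.append((ch, "special"))
            i += 1
        else:                                # atom: runs until a delimiter
            j = i
            while j < n and body[j] not in SPECIALS and body[j] not in " \t":
                j += 1
            tokens.append((body[i:j], "atom"))
            i = j
    return tokens


def split_on_commas(tokens):
    """Group tokens into one list per comma-separated list element."""
    groups, current = [], []
    for tok in tokens:
        if tok == (",", "special"):
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)
    return groups


def canonical(tokens):
    """Drop comments; put exactly one SPACE between adjacent words, none elsewhere."""
    out, prev_word = [], False
    for text, kind in tokens:
        if kind == "comment":
            continue
        is_word = kind in ("atom", "quoted-string")
        if is_word and prev_word:
            out.append(" ")
        out.append(text)
        prev_word = is_word
    return "".join(out)


folded = ('":sysmail"@ Some-Group. Some-Org,\r\n'
          ' Muhammed.(I am the greatest) Ali @(the)Vegas.WBA')
tokens = tokenize(unfold(folded))
print([canonical(group) for group in split_on_commas(tokens)])
# ['":sysmail"@Some-Group.Some-Org', 'Muhammed.Ali@Vegas.WBA']
```

Run on the folded address field from Section 3.1.4, the sketch yields the fifteen symbols listed there and the two canonical strings. Unlike the table, it keeps the surrounding quotes as part of the quoted-string token. Splitting on the comma token rather than on the comma character means a comma inside a quoted-string or comment does not end a list element.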