   sequence of lines of characters with special syntax as defined in
   this standard. The body is simply a sequence of characters that
   follows the header and is separated from the header by an empty line
   (i.e., a line with nothing preceding the CRLF).

2.1.1. Line Length Limits

   There are two limits that this standard places on the number of
   characters in a line. Each line of characters MUST be no more than
   998 characters, and SHOULD be no more than 78 characters, excluding
   the CRLF.

   The 998 character limit is due to limitations in many implementations
   which send, receive, or store Internet Message Format messages that
   simply cannot handle more than 998 characters on a line. Receiving
   implementations would do well to handle an arbitrarily large number
   of characters in a line for robustness' sake. However, there are so
   many implementations which (in compliance with the transport
   requirements of [RFC2821]) do not accept messages containing more
   than 1000 characters including the CR and LF per line, it is important
   for implementations not to create such messages.

   The more conservative 78 character recommendation is to accommodate
   the many implementations of user interfaces that display these
   messages which may truncate, or disastrously wrap, the display of
   more than 78 characters per line, in spite of the fact that such
   implementations are non-conformant to the intent of this
   specification (and that of [RFC2821] if they actually cause
   information to be lost). Again, even though this limitation is put on
   messages, it is incumbent upon implementations which display messages

Resnick                     Standards Track                     [Page 6]

RFC 2822                Internet Message Format               April 2001

   to handle an arbitrarily large number of characters in a line
   (certainly at least up to the 998 character limit) for the sake of
   robustness.

2.2. Header Fields

   Header fields are lines composed of a field name, followed by a colon
   (":"), followed by a field body, and terminated by CRLF.  A field
   name MUST be composed of printable US-ASCII characters (i.e.,
   characters that have values between 33 and 126, inclusive), except
   colon.  A field body may be composed of any US-ASCII characters,
   except for CR and LF.  However, a field body may contain CRLF when
   used in header "folding" and "unfolding" as described in section
   2.2.3.  All field bodies MUST conform to the syntax described in
   sections 3 and 4 of this standard.

2.2.1. Unstructured Header Field Bodies

   Some field bodies in this standard are defined simply as
   "unstructured" (which is specified below as any US-ASCII characters,
   except for CR and LF) with no further restrictions.  These are
   referred to as unstructured field bodies.  Semantically, unstructured
   field bodies are simply to be treated as a single line of characters
   with no further processing (except for header "folding" and
   "unfolding" as described in section 2.2.3).

2.2.2. Structured Header Field Bodies

   Some field bodies in this standard have specific syntactical
   structure more restrictive than the unstructured field bodies
   described above. These are referred to as "structured" field bodies.
   Structured field bodies are sequences of specific lexical tokens as
   described in sections 3 and 4 of this standard.  Many of these tokens
   are allowed (according to their syntax) to be introduced or end with
   comments (as described in section 3.2.3) as well as the space (SP,
   ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters
   (together known as the white space characters, WSP), and those WSP
   characters are subject to header "folding" and "unfolding" as
   described in section 2.2.3.  Semantic analysis of structured field
   bodies is given along with their syntax.

2.2.3. Long Header Fields

   Each header field is logically a single line of characters comprising
   the field name, the colon, and the field body.  For convenience
   however, and to deal with the 998/78 character limitations per line,
   the field body portion of a header field can be split into a multiple
   line representation; this is called "folding".  The general rule is
   that wherever this standard allows for folding white space (not
   simply WSP characters), a CRLF may be inserted before any WSP.  For
   example, the header field:

           Subject: This is a test

   can be represented as:

           Subject: This
            is a test

   Note: Though structured field bodies are defined in such a way that
   folding can take place between many of the lexical tokens (and even
   within some of the lexical tokens), folding SHOULD be limited to
   placing the CRLF at higher-level syntactic breaks.  For instance, if
   a field body is defined as comma-separated values, it is recommended
   that folding occur after the comma separating the structured items in
   preference to other places where the field could be folded, even if
   it is allowed elsewhere.

   The process of moving from this folded multiple-line representation
   of a header field to its single line representation is called
   "unfolding". Unfolding is accomplished by simply removing any CRLF
   that is immediately followed by WSP.  Each header field should be
   treated in its unfolded form for further syntactic and semantic
   evaluation.

2.3. Body

   The body of a message is simply lines of US-ASCII characters.  The
   only two limitations on the body are as follows:

   - CR and LF MUST only occur together as CRLF; they MUST NOT appear
     independently in the body.

   - Lines of characters in the body MUST be limited to 998 characters,
     and SHOULD be limited to 78 characters, excluding the CRLF.

   Note: As was stated earlier, there are other standards documents,
   specifically the MIME documents [RFC2045, RFC2046, RFC2048, RFC2049]
   that extend this standard to allow for different sorts of message
   bodies.  Again, these mechanisms are beyond the scope of this
   document.

3. Syntax

3.1. Introduction

   The syntax as given in this section defines the legal syntax of
   Internet messages.  Messages that are conformant to this standard
   MUST conform to the syntax in this section.  If there are options in
   this section where one option SHOULD be generated, that is indicated
   either in the prose or in a comment next to the syntax.

   For the defined expressions, a short description of the syntax and
   use is given, followed by the syntax in ABNF, followed by a semantic
   analysis.  Primitive tokens that are used but otherwise unspecified
   come from [RFC2234].

   In some of the definitions, there will be nonterminals whose names
   start with "obs-".  These "obs-" elements refer to tokens defined in
   the obsolete syntax in section 4.  In all cases, these productions
   are to be ignored for the purposes of generating legal Internet
   messages and MUST NOT be used as part of such a message.  However,
   when interpreting messages, these tokens MUST be honored as part of
   the legal syntax.  In this sense, section 3 defines a grammar for
   generation of messages, with "obs-" elements that are to be ignored,
   while section 4 adds grammar for interpretation of messages.

3.2. Lexical Tokens

   The following rules are used to define an underlying lexical
   analyzer, which feeds tokens to the higher-level parsers.  This
   section defines the tokens used in structured header field bodies.
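
   As an illustration only (it is not part of this standard), the rules
   of section 2 that a lexical analyzer relies on can be sketched in
   Python: unfolding per section 2.2.3, the field name rule of section
   2.2, the 998/78 limits of section 2.1.1, and the CRLF rule of
   section 2.3.  All function and constant names below are hypothetical
   and chosen for this sketch, and the obsolete syntax of section 4 is
   deliberately not handled.

```python
from __future__ import annotations

import re

CRLF = "\r\n"
MAX_LINE = 998   # MUST limit, excluding the CRLF (section 2.1.1)
SOFT_LINE = 78   # SHOULD limit, excluding the CRLF (section 2.1.1)

# A CRLF that is immediately followed by WSP (space or horizontal tab).
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(field: str) -> str:
    """Remove every CRLF that is immediately followed by WSP (section 2.2.3)."""
    return _FOLD.sub("", field)


def split_field(field: str) -> tuple[str, str]:
    """Split an unfolded field (without its final CRLF) into (name, body)."""
    name, sep, body = field.partition(":")
    if not sep:
        raise ValueError("header field has no colon")
    # Field names are printable US-ASCII (values 33..126) except colon.
    if not name or any(not 33 <= ord(ch) <= 126 for ch in name):
        raise ValueError(f"invalid field name: {name!r}")
    return name, body


def line_length_problems(message: str) -> list[tuple[int, str]]:
    """Return (line_number, level) for each line over the 78 or 998 limit."""
    problems = []
    for number, line in enumerate(message.split(CRLF), start=1):
        if len(line) > MAX_LINE:
            problems.append((number, "MUST"))
        elif len(line) > SOFT_LINE:
            problems.append((number, "SHOULD"))
    return problems


def has_bare_cr_or_lf(body: str) -> bool:
    """True if CR or LF occurs other than as a CRLF pair (section 2.3)."""
    return re.search(r"\r(?!\n)|(?<!\r)\n", body) is not None


if __name__ == "__main__":
    folded = "Subject: This\r\n is a test"
    print(split_field(unfold(folded)))   # ('Subject', ' This is a test')
    print(has_bare_cr_or_lf("a\nb"))     # True
```

   The sketch keeps the WSP that follows a removed CRLF, so the folded
   and unfolded forms of the "Subject" example in section 2.2.3 compare
   equal after unfolding.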

   Note: Readers of this standard need to pay special attention to how
   these lexical tokens are used in both the lower-level and
   higher-level syntax later in the document.  Particularly, the white
   space tokens and the comment tokens defined in section 3.2.3 get used
   in the lower-level tokens defined here, and those lower-level tokens
   are in turn used as parts of the higher-level tokens defined later.
   Therefore, the white space and comments may be allowed in the
   higher-level tokens even though they may not explicitly appear in a
   particular definition.

3.2.1. Primitive Tokens

   The following are primitive tokens referred to elsewhere in this
   standard, but not otherwise defined in [RFC2234].  Some of them will
   not appear anywhere else in the syntax, but they are convenient to
   refer to in other parts of this document.

   Note: The "specials" below are just such an example.  Though the
   specials token does not appear anywhere else in this standard, it is
   useful for implementers who use tools that lexically analyze
   messages.  Each of the characters in specials can be used to indicate
   a tokenization point in lexical analysis.

NO-WS-CTL       =       %d1-8 /         ; US-ASCII control characters
                        %d11 /          ;  that do not include the
                        %d12 /          ;  carriage return, line feed,
                        %d14-31 /       ;  and white space characters
                        %d127

text            =       %d1-9 /         ; Characters excluding CR and LF
                        %d11 /
                        %d12 /
                        %d14-127 /
                        obs-text

specials        =       "(" / ")" /     ; Special characters used in
                        "<" / ">" /     ;  other parts of the syntax
                        "[" / "]" /
                        ":" / ";" /
                        "@" / "\" /
                        "," / "." /
                        DQUOTE

   No special semantics are attached to these tokens.  They are simply
   single characters.

3.2.2. Quoted characters

   Some characters are reserved for special interpretation, such as
   delimiting lexical tokens.  To permit use of these characters as
   uninterpreted data, a quoting mechanism is provided.

quoted-pair     =       ("\" text) / obs-qp

   Where any quoted-pair appears, it is to be interpreted as the text
   character alone.  That is to say, the "\" character that appears as
   part of a quoted-pair is semantically "invisible".

   Note: The "\" character may appear in a message where it is not part
   of a quoted-pair.  A "\" character that does not appear in a
   quoted-pair is not semantically invisible.  The only places in this
   standard where quoted-pair currently appears are ccontent, qcontent,
   dcontent, no-fold-quote, and no-fold-literal.

3.2.3. Folding white space and comments

   White space characters, including white space used in folding
   (described in section 2.2.3), may appear between many elements in
   header field bodies.  Also, strings of characters that are treated as
   comments may be included in structured field bodies as characters
   enclosed in parentheses.  The following defines the folding white
   space (FWS) and comment constructs.

   Strings of characters enclosed in parentheses are considered comments
   so long as they do not appear within a "quoted-string", as defined in
   section 3.2.5.  Comments may nest.

   There are several places in this standard where comments and FWS may
   be freely inserted.  To accommodate that syntax, an additional token
   for "CFWS" is defined for places where comments and/or FWS can occur.
   However, where CFWS occurs in this standard, it MUST NOT be inserted
   in such a way that any line of a folded header field is made up
   entirely of WSP characters and nothing else.

FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white space
                        obs-FWS

ctext           =       NO-WS-CTL /     ; Non white space controls
                        %d33-39 /       ; The rest of the US-ASCII
                        %d42-91 /       ;  characters not including "(",
                        %d93-126        ;  ")", or "\"

ccontent        =       ctext / quoted-pair / comment

comment         =       "(" *([FWS] ccontent) [FWS] ")"

CFWS            =       *([FWS] comment) (([FWS] comment) / FWS)

   Throughout this standard, where FWS (the folding white space token)
   appears, it indicates a place where header folding, as discussed in
   section 2.2.3, may take place.  Wherever header folding appears in a