📄 rfc733.txt
counts as only two elements.  Therefore, where at least one
element is required, at least one non-null element must be
present.

6.  [optional]

Square brackets enclose optional elements; "[foo bar]" is
equivalent to "*1(foo bar)".

7.  ; Comments

A semi-colon, set off some distance to the right of rule text,
starts a comment which continues to the end of line.  This is a
simple way of including useful notes in parallel with the
specifications.

B.  LEXICAL ANALYSIS OF MESSAGES

1.  General Description

A message consists of headers and, optionally, a body (i.e., a
series of text lines).  The text part is just a sequence of lines
containing ASCII characters; it is separated from the headers by
a null line (i.e., a line with nothing preceding the CRLF).

Standard for the Format of Text Messages                        6
III. Syntax                                   B. Lexical Analysis

a.  Folding and unfolding of headers

    Each header item can be viewed as a single, logical line of
    ASCII characters.  For convenience, the field-body portion of
    this conceptual entity can be split into a multiple-line
    representation (i.e., "folded").  The general rule is that
    wherever there can be linear-white-space (NOT simply LWSP-
    chars), a CRLF immediately followed by AT LEAST one LWSP-char
    can instead be inserted.  (However, a header's name and the
    following colon (":"), which occur at the beginning of the
    header item, may NOT be folded onto multiple lines.)  Thus,
    the single line

        To:  "Joe Dokes & J. Harvey" <ddd at Host>, JJV at BBN

    can be represented as

        To:  "Joe Dokes & J. Harvey" <ddd at Host>,
                JJV at BBN

    and

        To:  "Joe Dokes & J. Harvey"
                        <ddd at Host>,
                JJV at BBN

    and

        To:  "Joe Dokes
           & J. Harvey" <ddd at Host>, JJV at BBN

    The process of moving from this folded multiple-line
    representation of a header field to its single line
    representation will be called "unfolding".  Unfolding is
    accomplished by regarding CRLF immediately followed by a
    LWSP-char as equivalent to the LWSP-char.

b.  Structure of header fields

    Once header fields have been unfolded, they may be viewed as
    being composed of a field-name followed by a colon (":"),
    followed by a field-body.  The field-name must be composed of
    printable ASCII characters (i.e., characters which have
    values between 33. and 126., decimal, except colon) and
    LWSP-chars.  The field-body may be composed of any ASCII
    characters (other than an unquoted CRLF, which has been
    removed by unfolding).

    Certain field-bodies of header fields may be interpreted
    according to an internal syntax which some systems may wish
    to parse.  These fields will be referred to as "structured"
    fields.  Examples include fields containing dates and
    addresses.  Other fields, such as "Subject" and "Comments",
    are regarded simply as strings of text.

    NOTE:  Field-names, unstructured field bodies and structured
           field bodies each are scanned by their own,
           INDEPENDENT "lexical" analyzer.

c.  Field-names

    To aid in the creation and reading of field-names, the free
    insertion of LWSP-chars is allowed in reasonable places.
    Rather than obscuring the syntax specification for
    field-name with the explicit syntax for these LWSP-chars, the
    existence of a "lexical" analyzer is assumed.  The analyzer
    interprets the text which comprises the field-name as a
    sequence of field-name atoms (fnatoms) separated by
    LWSP-chars.

    Note that ONLY LWSP-chars may occur between the fnatoms of a
    field-name and that CRLFs may NOT.  In addition, comments are
    NOT lexically recognized, as such, but parenthesized strings
    are legal as part of field-names.  These constraints are
    different from what is permissible within structured field
    bodies.  In particular, this means that header field-names
    must wholly occur on the FIRST line of a folded header item
    and may NOT be split across two or more lines.

d.  Unstructured field bodies

    For some fields, such as "Subject" and "Comments", no
    structuring is assumed; and they are treated simply as texts,
    like those in the message body.  Rules of folding apply to
    these fields, so that such field bodies which occupy several
    lines must therefore have the second and successive lines
    indented by at least one LWSP-char.

e.  Structured field bodies

    To aid in the creation and reading of structured fields, the
    free insertion of linear-white-space (which permits folding
    by inclusion of CRLFs) is allowed in reasonable places.
    Rather than obscuring the syntax specifications for these
    structured fields with explicit syntax for this
    linear-white-space, the existence of another "lexical"
    analyzer is assumed.  This analyzer does not apply for field
    bodies which are simply unstructured strings of text, as
    described above.  It provides an interpretation of the
    unfolded text comprising the body of the field as a sequence
    of lexical symbols.  These symbols are:

       -  individual special characters
       -  quoted-strings
       -  comments
       -  atoms

    The first three of these symbols are self-delimiting.  Atoms
    are not; they therefore are delimited by the self-delimiting
    symbols and by linear-white-space.  For the purposes of
    regenerating sequences of atoms and quoted-strings, exactly
    one SPACE is assumed to exist and should be used between
    them.  (Also, in Section III.B.3.a, note the rules concerning
    treatment of multiple contiguous LWSP-chars.)
    So, for example, the folded body of an address field

        ":sysmail"@ Some-Host,
         Muhammed(I am the greatest)Ali at(the)WBA

    is analyzed into the following lexical symbols and types:

        ":sysmail"              quoted string
        @                       special
        Some-Host               atom
        ,                       special
        Muhammed                atom
        (I am the greatest)     comment
        Ali                     atom
        at                      atom
        (the)                   comment
        WBA                     atom

    The canonical representations for the data in these addresses
    are the following strings (note that there is exactly one
    SPACE between words):

        :sysmail at Some-Host

    and

        Muhammed Ali at WBA

2.  Formal Definitions

The first four rules, below, indicate a meta-syntax for fields,
without regard to their particular type or internal syntax.  The
remaining rules define basic syntactic structures which are used
by the rules in Sections III.C, III.D, and III.E.

field       =   field-name ":" [ field-body ] CRLF

field-name  =   fnatom *( LWSP-char [fnatom] )

fnatom      =   1*<any CHAR, excluding CTLs, SPACE, and ":">

field-body  =   field-body-contents
                [CRLF LWSP-char field-body]

field-body-contents =
               <the TELNET ASCII characters making up the
                field-body, as defined in the following sections,
                and consisting of combinations of atom, quoted-
                string, and specials tokens, or else consisting
                of texts>

                                            ; (  Octal, Decimal.)
CHAR        =  <any TELNET ASCII character> ; (  0-177,  0.-127.)
ALPHA       =  <any TELNET ASCII alphabetic character>
                                            ; (101-132, 65.- 90.)
                                            ; (141-172, 97.-122.)
DIGIT       =  <any TELNET ASCII digit>     ; ( 60- 71, 48.- 57.)
CTL         =  <any TELNET ASCII control    ; (  0- 37,  0.- 31.)
                character and DEL>          ; (    177,     127.)
CR          =  <TELNET ASCII carriage return>
                                            ; (     15,      13.)
LF          =  <TELNET ASCII linefeed>      ; (     12,      10.)
SPACE       =  <TELNET ASCII space>         ; (     40,      32.)
HTAB        =  <TELNET ASCII horizontal-tab>
                                            ; (     11,       9.)
<">         =  <TELNET ASCII quote mark>    ; (     42,      34.)
CRLF        =  CR LF

LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE

linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
                                            ; CRLF => folding

specials    =  "(" / ")" / "<" / ">" / "@"  ; To use in a word,
            /  "," / ";" / ":" / "\" / <">  ;  word must be a
                                            ;  quoted-string.

delimiters  =  specials / comment / linear-white-space

text        =  <any CHAR, including bare    ; => atoms, specials,
                CR and/or bare LF, but NOT  ;  comments and
                including CRLF>             ;  quoted-strings are
                                            ;  NOT interpreted.

atom        =  1*<any CHAR except specials and CTLs>

quoted-string = <"> *(qtext/quoted-pair) <">
                                            ; Any number of qtext
                                            ;  chars or any
                                            ;  quoted char.

qtext       =  <any CHAR excepting <">      ; => may be folded
                and CR, and including
                linear-white-space>

comment     =  "(" *(ctext / comment / quoted-pair) ")"

ctext       =  <any CHAR excluding "(",     ; => may be folded
                ")" and CR, and including
                linear-white-space>

quoted-pair =  "\" CHAR

3.  Clarifications

a.  "White space"

    Remember that in field-names and structured field bodies,
    MULTIPLE LINEAR WHITE SPACE TELNET ASCII CHARACTERS (namely
    HTABs and SPACEs) ARE TREATED AS SINGLE SPACES AND MAY FREELY
    SURROUND ANY SYMBOL.  In all header fields, the only place in
    which at least one space is REQUIRED is at the beginning of
    continuation lines in a folded field.  When passing text to
    processes which do not interpret text according to this
    standard (e.g., ARPANET FTP mail servers), then exactly one
    SPACE should be used in place of arbitrary linear-white-space
    and comment sequences.

    WHEREVER A MEMBER OF THE LIST OF <DELIMITER>S IS ALLOWED,
    LWSP-CHARS MAY ALSO OCCUR BEFORE AND/OR AFTER IT.

    Writers of mail-sending (i.e., header generating) programs
    should realize that there is no Network-wide definition of
    the effect of horizontal-tab TELNET ASCII characters on the
    appearance of text at another Network host; therefore, the
    use of tabs in message headers, though permitted, is
    discouraged.

    Note that during transmissions across the ARPANET using
    TELNET NVT connections, data must conform to TELNET NVT
    conventions (e.g., CR must be followed by either LF, making a
    CRLF, or <null>, if the CR is to stand alone).

b.  Comments

    Comments are detected as such only within field-bodies of
    structured fields.  A comment is a set of TELNET ASCII
    characters, which is not within a quoted-string and which is
    enclosed in matching parentheses; parentheses nest, so that
    if an unquoted left parenthesis occurs in a comment string,
    there must also be a matching right parenthesis.  When a
    comment is used to act as the delimiter between a sequence of
    two lexical symbols, such as two atoms, it is lexically
    equivalent with one SPACE, for the purposes of regenerating
    the sequence, such as when passing the sequence onto an FTP
    mail server.

    In particular, comments are NOT passed to the FTP server, as
    part of a MAIL or MLFL command, since comments are not part
    of the "formal" address.

    If a comment is to be "folded" onto multiple lines, then the
    syntax for folding must be adhered to.  (See items III.B.1.a,
    above, and III.B.3.f, below.)  Note that the official
    semantics therefore do not "see" any unquoted CRLFs which are
    in comments, although particular parsing programs may wish to
    note their presence.  For these programs, it would be
    reasonable to interpret a "CRLF LWSP-char" as being a CRLF
    which is part of the comment; i.e., the CRLF is kept and the
    LWSP-char is discarded.  Quoted CRLFs (i.e., a backslash
    followed by a CR followed by a LF) still must be followed by
    at least one LWSP-char.

c.  Delimiting and quoting characters
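
    As an illustration of how delimiters and quoting interact
    during lexical analysis, the unfolding rule of III.B.1.a and
    the symbol classes of III.B.1.e can be sketched in Python.
    This is a minimal sketch and not part of the standard: the
    names unfold, tokenize, SPECIALS, LWSP and EXAMPLE are
    hypothetical, the input is assumed to use CRLF line endings,
    and anything other than printable ASCII outside a
    quoted-string or comment is simply rejected.

```python
import re
from typing import List, Tuple

# The "specials" of III.B.2; the names below are illustrative only.
SPECIALS = set('()<>@,;:\\"')
LWSP = " \t"


def unfold(field: str) -> str:
    """Regard CRLF immediately followed by an LWSP-char as that LWSP-char."""
    return re.sub(r"\r\n(?=[ \t])", "", field)


def tokenize(body: str) -> List[Tuple[str, str]]:
    """Split an unfolded structured field body into (type, text) symbols."""
    tokens = []
    i, n = 0, len(body)
    while i < n:
        c = body[i]
        if c in LWSP:
            # Linear white space only delimits; it yields no symbol.
            i += 1
        elif c == '"':
            # quoted-string: runs to the next quote not hidden by a quoted-pair.
            j = i + 1
            while j < n and body[j] != '"':
                j += 2 if body[j] == "\\" else 1
            if j >= n:
                raise ValueError("unterminated quoted-string")
            tokens.append(("quoted-string", body[i:j + 1]))
            i = j + 1
        elif c == "(":
            # comment: parentheses nest; a quoted-pair hides its character.
            depth, j = 1, i + 1
            while j < n and depth > 0:
                if body[j] == "\\":
                    j += 2
                    continue
                if body[j] == "(":
                    depth += 1
                elif body[j] == ")":
                    depth -= 1
                j += 1
            if depth > 0:
                raise ValueError("unterminated comment")
            tokens.append(("comment", body[i:j]))
            i = j
        elif c in SPECIALS:
            tokens.append(("special", c))
            i += 1
        elif not (32 < ord(c) < 127):
            # Assumption of this sketch: reject CTLs and non-ASCII here.
            raise ValueError("character not allowed in an atom: %r" % c)
        else:
            # atom: ends at a special, an LWSP-char, or a disallowed character.
            j = i
            while (j < n and body[j] not in SPECIALS
                   and body[j] not in LWSP and 32 < ord(body[j]) < 127):
                j += 1
            tokens.append(("atom", body[i:j]))
            i = j
    return tokens


# The folded address field body used as the example in III.B.1.e.
EXAMPLE = '":sysmail"@ Some-Host,\r\n Muhammed(I am the greatest)Ali at(the)WBA'

for kind, text in tokenize(unfold(EXAMPLE)):
    print(f"{text:<24}{kind}")
```

    Run on the example body from III.B.1.e, the sketch should
    yield the same ten symbols, in the same order, as the table
    given there.  Comments are returned as symbols so that a
    caller can choose to drop them or to treat each one as a
    single SPACE when regenerating the sequence.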