                  <l>#<m>element

indicating at least <l> and at most <m> elements, each  separated
by  one or more commas (",").  This makes the usual form of lists
very easy; a rule such as '(element *("," element))' can be shown
as  "1#element".   Wherever this construct is used, null elements
are allowed, but do not  contribute  to  the  count  of  elements
present.   That  is,  "(element),,(element)"  is  permitted,  but
counts as only two  elements.   Therefore,  where  at  least  one
element  is  required,  at  least  one  non-null  element must be
present.
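
The counting rule for null elements can be sketched in Python as
follows.  This is only an illustration: the function name
parse_hash_list and its parameters are hypothetical, and the plain
split on "," assumes that no element contains a quoted-string or a
comment holding a comma.

```python
def parse_hash_list(text, minimum=0, maximum=None):
    """Split a "<l>#<m>element" style list, ignoring null elements.

    Assumption: elements hold no quoted-strings or comments, so a
    plain split on "," is sufficient for this illustration.
    """
    parts = [part.strip(" \t") for part in text.split(",")]
    elements = [p for p in parts if p]  # null elements are not counted
    if len(elements) < minimum:
        raise ValueError(
            f"need at least {minimum} element(s), got {len(elements)}")
    if maximum is not None and len(elements) > maximum:
        raise ValueError(
            f"at most {maximum} element(s) allowed, got {len(elements)}")
    return elements


# "(element),,(element)" is permitted and counts as two elements.
print(parse_hash_list("(element),,(element)", minimum=1))
```

A list made up only of null elements, such as ",,", is rejected
when minimum is 1, since no non-null element is present.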


6.  [optional]

Square  brackets  enclose  optional  elements;  "[foo  bar]"   is
equivalent to "*1(foo bar)".


7.  ; Comments

A semi-colon, set off some distance to the right  of  rule  text,
starts  a  comment which continues to the end of line.  This is a
simple way  of  including  useful  notes  in  parallel  with  the
specifications.



B.  LEXICAL ANALYSIS OF MESSAGES


1.  General Description

A message consists of headers and, optionally,  a  body  (i.e.  a
series of text lines).  The text part is just a sequence of lines
containing ASCII characters; it is separated from the headers  by
a null line (i.e., a line with nothing preceding the CRLF).


Standard for the Format of Text Messages                        6
III. Syntax
  B. Lexical Analysis



a.  Folding and unfolding of headers

    Each header item can be viewed as a single, logical  line  of
    ASCII characters.  For convenience, the field-body portion of
    this conceptual entity can  be  split  into  a  multiple-line
    representation  (i.e.,  "folded").   The general rule is that
    wherever there can be linear-white-space  (NOT  simply  LWSP-
    chars), a CRLF immediately followed by AT LEAST one LWSP-char
    can instead be inserted.  (However, a header's name  and  the
    following  colon  (":"),  which occur at the beginning of the
    header item, may NOT be folded onto multiple  lines.)   Thus,
    the single line

       To:  "Joe Dokes & J. Harvey" <ddd at Host>, JJV at BBN

    can be represented as

       To:  "Joe Dokes & J. Harvey" <ddd at Host>,
            JJV at BBN

    and

       To:  "Joe Dokes & J. Harvey"
                        <ddd at Host>,
        JJV at BBN

    and

       To:  "Joe Dokes
        & J. Harvey" <ddd at Host>, JJV at BBN

    The  process  of  moving  from  this   folded   multiple-line
    representation   of   a  header  field  to  its  single  line
    representation will  be  called  "unfolding".   Unfolding  is
    accomplished  by  regarding  CRLF  immediately  followed by a
    LWSP-char as equivalent  to  the  LWSP-char.
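
    A minimal sketch of unfolding in Python, assuming the  header
    item  is  held  in a string that uses CRLF line endings.  The
    name unfold is hypothetical.

```python
import re

# A fold is a CRLF immediately followed by a LWSP-char (SPACE or HTAB).
# The lookahead keeps the LWSP-char and removes only the CRLF.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(header_item):
    """Return the single-line representation of a folded header item."""
    return _FOLD.sub("", header_item)


folded = 'To:  "Joe Dokes & J. Harvey" <ddd at Host>,\r\n     JJV at BBN'
print(unfold(folded))
```

    A CRLF that is not followed by a LWSP-char is left alone,  as
    it ends the header item rather than folding it.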

b.  Structure of header fields

    Once header fields have been unfolded, they may be viewed  as
    being  composed  of  a  field-name followed by a colon (":"),
    followed by a field-body.  The field-name must be composed of
    printable  ASCII  characters  (i.e.,  characters  which  have
    values between 33.  and  126.,  decimal,  except  colon)  and
    LWSP-chars.   The  field-body  may  be  composed of any ASCII
    characters (other than  an  unquoted  CRLF,  which  has  been
    removed by unfolding).

    Certain field-bodies of  header  fields  may  be  interpreted
    according  to  an internal syntax which some systems may wish
    to parse.  These fields will be referred to  as  "structured"
    fields.    Examples   include  fields  containing  dates  and
    addresses.  Other fields, such as "Subject"  and  "Comments",
    are regarded simply as strings of text.

    NOTE:  Field-names, unstructured field bodies and  structured
    field  bodies  each  are  scanned  by  their own, INDEPENDENT
    "lexical" analyzer.
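
    The split of an unfolded header field into its field-name and
    field-body  might  be  sketched  as  below.  The  name
    split_field is hypothetical, and the sketch  assumes  that
    the trailing CRLF has already been removed.

```python
def split_field(unfolded):
    """Split an unfolded header field at its first colon.

    Assumption: the terminating CRLF has already been stripped.
    """
    name, colon, body = unfolded.partition(":")
    if not colon:
        raise ValueError("no colon found: not a header field")
    # field-name: printable ASCII (33..126 decimal, colon excluded by
    # the split itself) and LWSP-chars.
    for ch in name:
        if not (33 <= ord(ch) <= 126 or ch in " \t"):
            raise ValueError(f"illegal character {ch!r} in field-name")
    if not name.strip(" \t"):
        raise ValueError("empty field-name")
    return name, body


print(split_field("Subject: Re: lunch"))
```

    Only the first colon separates name from body, so colons that
    occur later in the field-body are kept as part of the body.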

c.  Field-names

    To aid in the creation and reading of field-names,  the  free
    insertion  of  LWSP-chars  is  allowed in  reasonable places.

    Rather than obscuring the syntax specification for field-name
    with  the explicit syntax for these LWSP-chars, the existence
    of a "lexical" analyzer is assumed.  The analyzer  interprets
    the  text  which  comprises  the  field-name as a sequence of
    field-name atoms (fnatoms) separated by LWSP-chars.

    Note that ONLY LWSP-chars may occur between the fnatoms of  a
    field-name and that CRLFs may NOT.  In addition, comments are
    NOT lexically recognized, as such, but parenthesized  strings
    are  legal  as  part  of  field-names.  These constraints are
    different from what is permissible  within  structured  field
    bodies.   In  particular,  this means that header field-names
    must wholly occur on the FIRST line of a folded  header  item
    and may NOT be split across two or more lines.
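
    The field-name analyzer described above could be sketched  in
    Python  as  follows.   The  names  fnatoms  and
    canonical_field_name are hypothetical, and joining the fnatoms
    with exactly one SPACE is an assumption  borrowed  from  the
    treatment of multiple LWSP-chars in Section III.B.3.a.

```python
def fnatoms(field_name):
    """Split a field-name into its fnatoms.

    Only LWSP-chars (SPACE, HTAB) separate fnatoms; a CR or LF inside
    a field-name is an error because field-names may not be folded.
    """
    if "\r" in field_name or "\n" in field_name:
        raise ValueError("field-names may not be folded")
    return [atom for atom in field_name.replace("\t", " ").split(" ") if atom]


def canonical_field_name(field_name):
    """Join the fnatoms with exactly one SPACE (an assumed convention)."""
    return " ".join(fnatoms(field_name))


print(canonical_field_name("Reply \t  To"))
# Parentheses are not comments here; they are part of the fnatoms.
print(fnatoms("Sender (via relay)"))
```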

d.  Unstructured field bodies

    For  some  fields,  such  as  "Subject"  and  "Comments",  no
    structuring is assumed; and they are treated simply as texts,
    like those in the message body.  Rules of  folding  apply  to
    these  fields, so that such field bodies which occupy several
    lines must therefore have the  second  and  successive  lines
    indented by at least one LWSP-char.
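
    One possible way to fold a long header item is  shown  below.
    This  is  a sketch under assumptions: the name fold, the
    default width of 72 characters and the greedy choice of break
    points are illustrative and are not taken from this standard.
    A CRLF is inserted only in front of an existing LWSP-char, so
    every continuation line starts with a LWSP-char.

```python
def fold(line, width=72):
    """Fold an unfolded header item at LWSP-chars (greedy sketch).

    Assumptions: `line` has no CRLF, and `width` is an arbitrary
    illustrative limit.  The field-name and colon stay on line one.
    """
    colon = line.index(":")
    pieces, start = [], 0
    while len(line) - start > width:
        lo = max(start + 1, colon + 1)
        hi = start + width + 1
        # Last LWSP-char that keeps the current line within `width`.
        cut = max(line.rfind(" ", lo, hi), line.rfind("\t", lo, hi))
        if cut == -1:
            # None in range: fall back to the next LWSP-char, if any.
            later = [i for i in (line.find(" ", lo), line.find("\t", lo))
                     if i != -1]
            if not later:
                break
            cut = min(later)
        pieces.append(line[start:cut])
        start = cut  # the LWSP-char begins the continuation line
    pieces.append(line[start:])
    return "\r\n".join(pieces)


subject = "Subject: " + " ".join(["word"] * 30)
print(fold(subject, width=40))
```

    Because only CRLFs are added, removing each CRLF that precedes
    a LWSP-char restores the original single-line form.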

e.  Structured field bodies

    To aid in the creation and reading of structured fields,  the
    free  insertion  of linear-white-space (which permits folding
    by inclusion of  CRLFs)  is  allowed  in  reasonable  places.
    Rather  than  obscuring  the  syntax specifications for these
    structured fields  with  explicit  syntax  for  this  linear-
    white-space,  the  existence of another "lexical" analyzer is
    assumed.  This analyzer does not apply for field bodies which
    are  simply unstructured strings of text, as described above.
    It provides an interpretation of the unfolded text comprising
    the  body  of  the  field  as  a sequence of lexical symbols.
    These symbols are:

            -  individual special characters
            -  quoted-strings
            -  comments
            -  atoms

    The first three of these symbols are self-delimiting.   Atoms
    are  not; they therefore are delimited by the self-delimiting
    symbols and by linear-white-space.  For the purposes  of  re-
    generating sequences of atoms and quoted-strings, exactly one
    SPACE is assumed to exist and should be  used  between  them.
    (Also,  in  Section  III.B.3.a,  note  the  rules  concerning
    treatment of multiple contiguous LWSP-chars.)

    So, for example, the folded body of an address field

            ":sysmail"@   Some-Host,
            Muhammed(I am   the greatest)Ali   at(the)WBA

    is analyzed into the following lexical symbols and types:

            ":sysmail"              quoted string
            @                       special
            Some-Host               atom
            ,                       special
            Muhammed                atom
            (I am   the greatest)   comment
            Ali                     atom
            at                      atom
            (the)                   comment
            WBA                     atom

    The canonical representations for the data in these addresses
    are  the  following  strings  (note that there is exactly one
    SPACE between words):

                :sysmail at Some-Host

    and

                Muhammed Ali at WBA
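
    The analysis into lexical symbols shown above can be sketched
    in  Python.   This  is  an illustration only: the function
    tokenize is hypothetical, and it assumes that  the  field
    body  has  already  been  unfolded and holds no control
    characters other than HTAB.

```python
SPECIALS = set('()<>@,;:\\"')
LWSP = " \t"


def tokenize(body):
    """Return (text, kind) pairs for an unfolded structured field body."""
    tokens = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch in LWSP:                      # white space only delimits
            i += 1
        elif ch == '"':                     # quoted-string
            j = i + 1
            while j < n and body[j] != '"':
                j += 2 if body[j] == "\\" else 1   # skip quoted-pairs
            if j >= n:
                raise ValueError("unterminated quoted-string")
            tokens.append((body[i:j + 1], "quoted-string"))
            i = j + 1
        elif ch == "(":                     # comment, which may nest
            depth, j = 1, i + 1
            while j < n and depth:
                if body[j] == "\\":
                    j += 1                  # skip the quoted character
                elif body[j] == "(":
                    depth += 1
                elif body[j] == ")":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unterminated comment")
            tokens.append((body[i:j], "comment"))
            i = j
        elif ch in SPECIALS:                # individual special character
            tokens.append((ch, "special"))
            i += 1
        else:                               # atom: runs up to a delimiter
            j = i
            while j < n and body[j] not in SPECIALS and body[j] not in LWSP:
                j += 1
            tokens.append((body[i:j], "atom"))
            i = j
    return tokens


body = '":sysmail"@   Some-Host, Muhammed(I am   the greatest)Ali   at(the)WBA'
for text, kind in tokenize(body):
    print(f"{text:24}{kind}")
```

    The sketch stops at the lexical level.  Producing the canonical
    strings shown above needs further, address-level steps, such as
    dropping comments and the quotes around ":sysmail" and writing
    "@" as "at", which are not attempted here.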



2.  Formal Definitions

The first four rules, below, indicate a meta-syntax  for  fields,
without  regard to their particular type or internal syntax.  The
remaining rules define basic syntactic structures which are  used
by the rules in Sections III.C, III.D, and III.E.

field       =  field-name ":" [ field-body ] CRLF

field-name  =  fnatom *( LWSP-char [fnatom] )

fnatom      =  1*<any CHAR, excluding CTLs, SPACE, and ":">

field-body  =  field-body-contents
               [CRLF LWSP-char field-body]

field-body-contents = <the TELNET ASCII characters making up the
               field-body, as defined in the following sections,
               and consisting of combinations of atom, quoted-
               string, and specials tokens, or else consisting of
               texts>

                                            ; (  Octal, Decimal.)
CHAR        =  <any TELNET ASCII character> ; (  0-177,  0.-127.)
ALPHA       =  <any TELNET ASCII alphabetic character>
                                            ; (101-132, 65.- 90.)
                                            ; (141-172, 97.-122.)
DIGIT       =  <any TELNET ASCII digit>     ; ( 60- 71, 48.- 57.)
CTL         =  <any TELNET ASCII control    ; (  0- 37,  0.- 31.)
                character and DEL>          ; (    177,     127.)
CR          =  <TELNET ASCII carriage return>;(     15,      13.)
LF          =  <TELNET ASCII linefeed>      ; (     12,      10.)
SPACE       =  <TELNET ASCII space>         ; (     40,      32.)
HTAB        =  <TELNET ASCII horizontal-tab>; (     11,       9.)
<">         =  <TELNET ASCII quote mark>    ; (     42,      34.)
CRLF        =  CR LF

LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE
linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
                                            ; CRLF => folding

specials    =  "(" / ")" / "<" / ">" / "@"  ; To use in a word,
            /  "," / ";" / ":" / "\" / <">  ;  word must be a
                                            ;  quoted-string.

delimiters  =  specials / comment / linear-white-space

text        =  <any CHAR, including bare    ; => atoms, specials,
                CR and/or bare LF, but NOT  ;  comments and
                including CRLF>             ;  quoted-strings are
                                            ;  NOT interpreted.

atom        =  1*<any CHAR except specials and CTLs>

quoted-string = <"> *(qtext/quoted-pair) <">; Any number of qtext
                                            ;   chars or any
                                            ;   quoted char.

qtext       =  <any CHAR excepting <">      ; => may be folded
                and CR, and including
                linear-white-space>

comment     =  "(" *(ctext / comment / quoted-pair) ")"
ctext       =  <any CHAR excluding "(",     ; => may be folded
                ")" and CR, and including
                linear-white-space>

quoted-pair =  "\" CHAR
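
The character classes above translate directly into Python sets.
The sketch below is illustrative: the names is_atom and
quote_if_needed are hypothetical.  SPACE is left out of the atom
characters because Section III.B.1.e states that atoms are delimited
by linear-white-space, although the atom rule itself does not list
it.

```python
CHAR = {chr(c) for c in range(128)}                # 0.-127.
CTL = {chr(c) for c in range(32)} | {chr(127)}     # 0.-31. and DEL
SPECIALS = set('()<>@,;:\\"')
# Assumption: SPACE is excluded too, since white space delimits atoms.
ATOM_CHARS = CHAR - SPECIALS - CTL - {" "}


def is_atom(word):
    """True if `word` is one or more characters allowed in an atom."""
    return bool(word) and all(ch in ATOM_CHARS for ch in word)


def quote_if_needed(word):
    """Return `word` unchanged if it is an atom, else as a quoted-string."""
    if not all(ch in CHAR for ch in word):
        raise ValueError("word contains non-ASCII characters")
    if is_atom(word):
        return word
    # Backslash-quote the quote mark, the backslash and CR.
    escaped = "".join("\\" + ch if ch in '"\\\r' else ch for ch in word)
    return '"' + escaped + '"'


print(quote_if_needed("Some-Host"))
print(quote_if_needed(":sysmail"))
```

This follows the note beside the specials rule: a word that uses a
special character must be written as a quoted-string.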


3.  Clarifications

a.  "White space"

    Remember that in field-names  and  structured  field  bodies,
    MULTIPLE  LINEAR  WHITE SPACE TELNET ASCII CHARACTERS (namely
    HTABs and SPACEs) ARE TREATED AS SINGLE SPACES AND MAY FREELY
    SURROUND ANY SYMBOL.  In all header fields, the only place in
    which at least one space is REQUIRED is at the  beginning  of
    continuation  lines  in a folded field.  When passing text to
    processes which do  not  interpret  text  according  to  this
    standard  (e.g.,  ARPANET FTP mail servers), then exactly one
    SPACE should be used in place of arbitrary linear-white-space
    and comment sequences.

    WHEREVER A MEMBER OF THE LIST  OF  <DELIMITER>S  IS  ALLOWED,
    LWSP-CHARS MAY ALSO OCCUR BEFORE AND/OR AFTER IT.

    Writers of mail-sending  (i.e.  header  generating)  programs
    should  realize  that  there is no Network-wide definition of
    the effect of horizontal-tab TELNET ASCII characters  on  the
    appearance  of  text  at another Network host; therefore, the
    use  of  tabs  in  message  headers,  though  permitted,   is
    discouraged.

    Note that  during  transmissions  across  the  ARPANET  using
    TELNET  NVT  connections,  data  must  conform  to TELNET NVT
    conventions (e.g., CR must be followed by either LF, making a
    CRLF, or <null>, if the CR is to stand alone).
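
    The TELNET NVT convention for a stand-alone  CR  might  be
    applied as sketched below.  The name nvt_encode is
    hypothetical, and the input is assumed to be raw data that has
    not yet been encoded (applying the function twice would add a
    second null).

```python
import re

# A CR that is not followed by LF must be sent as CR followed by null.
_BARE_CR = re.compile(rb"\r(?!\n)")


def nvt_encode(data):
    """Follow every stand-alone CR in `data` (bytes) with a null byte."""
    return _BARE_CR.sub(lambda match: b"\r\x00", data)


print(nvt_encode(b"one\rtwo\r\nthree"))
```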

b.  Comments

    Comments are detected as such  only  within  field-bodies  of
    structured  fields.   A  comment  is  a  set  of TELNET ASCII
    characters, which is not within a quoted-string and which  is
    enclosed  in  matching parentheses; parentheses nest, so that
    if an unquoted left parenthesis occurs in a  comment  string,
    there  must  also  be  a  matching right parenthesis.  When a
    comment is used to act as the delimiter between a sequence of
    two  lexical  symbols,  such  as  two  atoms, it is lexically
    equivalent with one SPACE, for the purposes  of  regenerating
    the  sequence,  such as when passing the sequence onto an FTP
    mail server.
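
    The replacement of  white  space  and  comment  sequences  by
    exactly  one  SPACE  can  be  sketched as follows.  The name
    simplify is hypothetical, and the sketch assumes an  unfolded
    structured  field  body.   Leading and trailing sequences are
    dropped rather than replaced, which is a design choice of this
    sketch.

```python
def simplify(body):
    """Replace each run of LWSP-chars and comments by one SPACE."""
    out = []
    pending_space = False
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch in " \t":
            pending_space = True
            i += 1
        elif ch == "(":                      # skip a (possibly nested) comment
            depth, i = 1, i + 1
            while i < n and depth:
                if body[i] == "\\":
                    i += 1                   # skip the quoted character
                elif body[i] == "(":
                    depth += 1
                elif body[i] == ")":
                    depth -= 1
                i += 1
            if depth:
                raise ValueError("unbalanced parentheses in comment")
            pending_space = True
        else:
            if pending_space and out:
                out.append(" ")
            pending_space = False
            if ch == '"':                    # copy a quoted-string untouched
                j = i + 1
                while j < n and body[j] != '"':
                    j += 2 if body[j] == "\\" else 1
                if j >= n:
                    raise ValueError("unterminated quoted-string")
                out.append(body[i:j + 1])
                i = j + 1
            else:
                out.append(ch)
                i += 1
    return "".join(out)


print(simplify("Muhammed(I am   the greatest)Ali   at(the)WBA"))
```

    Parentheses inside a quoted-string are not treated as comments,
    so the quoted-string is copied through unchanged.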


    In particular comments are NOT passed to the FTP  server,  as