rfc733.txt

counts as only two elements.  Therefore, where at least one
element is required, at least one non-null element must be
present.

6.  [optional]

Square brackets enclose optional elements; "[foo bar]" is
equivalent to "*1(foo bar)".

7.  ; Comments

A semi-colon, set off some distance to the right of rule text,
starts a comment which continues to the end of line.  This is a
simple way of including useful notes in parallel with the
specifications.

B.  LEXICAL ANALYSIS OF MESSAGES

1.  General Description

A message consists of headers and, optionally, a body (i.e. a
series of text lines).  The text part is just a sequence of lines
containing ASCII characters; it is separated from the headers by
a null line (i.e., a line with nothing preceding the CRLF).

Standard for the Format of Text Messages                        6
III. Syntax  B. Lexical Analysis

a.  Folding and unfolding of headers

    Each header item can be viewed as a single, logical line of
    ASCII characters.  For convenience, the field-body portion of
    this conceptual entity can be split into a multiple-line
    representation (i.e., "folded").  The general rule is that
    wherever there can be linear-white-space (NOT simply
    LWSP-chars), a CRLF immediately followed by AT LEAST one
    LWSP-char can instead be inserted.  (However, a header's name
    and the following colon (":"), which occur at the beginning
    of the header item, may NOT be folded onto multiple lines.)
    Thus, the single line

       To:  "Joe Dokes & J. Harvey" <ddd at Host>, JJV at BBN

    can be represented as

       To:  "Joe Dokes & J. Harvey" <ddd at Host>,
            JJV at BBN

    and

       To:  "Joe Dokes & J. Harvey"
                        <ddd at Host>,
        JJV at BBN

    and

       To:  "Joe Dokes
        & J. Harvey" <ddd at Host>, JJV at BBN

    The process of moving from this folded multiple-line
    representation of a header field to its single line
    representation will be called "unfolding".  Unfolding is
    accomplished by regarding CRLF immediately followed by a
    LWSP-char as equivalent to the LWSP-char.

b.  Structure of header fields

    Once header fields have been unfolded, they may be viewed as
    being composed of a field-name followed by a colon (":"),
    followed by a field-body.  The field-name must be composed of
    printable ASCII characters (i.e., characters which have
    values between 33. and 126., decimal, except colon) and
    LWSP-chars.  The field-body may be composed of any ASCII
    characters (other than an unquoted CRLF, which has been
    removed by unfolding).

    Certain field-bodies of header fields may be interpreted
    according to an internal syntax which some systems may wish
    to parse.  These fields will be referred to as "structured"
    fields.  Examples include fields containing dates and
    addresses.  Other fields, such as "Subject" and "Comments",
    are regarded simply as strings of text.

    NOTE:  Field-names, unstructured field bodies and structured
    field bodies each are scanned by their own, INDEPENDENT
    "lexical" analyzer.

c.  Field-names

    To aid in the creation and reading of field-names, the free
    insertion of LWSP-chars is allowed in reasonable places.
    Rather than obscuring the syntax specification for field-name
    with the explicit syntax for these LWSP-chars, the existence
    of a "lexical" analyzer is assumed.
    The analyzer interprets the text which comprises the
    field-name as a sequence of field-name atoms (fnatoms)
    separated by LWSP-chars.

    Note that ONLY LWSP-chars may occur between the fnatoms of a
    field-name and that CRLFs may NOT.  In addition, comments are
    NOT lexically recognized, as such, but parenthesized strings
    are legal as part of field-names.  These constraints are
    different from what is permissible within structured field
    bodies.  In particular, this means that header field-names
    must wholly occur on the FIRST line of a folded header item
    and may NOT be split across two or more lines.

d.  Unstructured field bodies

    For some fields, such as "Subject" and "Comments", no
    structuring is assumed; and they are treated simply as texts,
    like those in the message body.  Rules of folding apply to
    these fields, so that such field bodies which occupy several
    lines must therefore have the second and successive lines
    indented by at least one LWSP-char.

e.  Structured field bodies

    To aid in the creation and reading of structured fields, the
    free insertion of linear-white-space (which permits folding
    by inclusion of CRLFs) is allowed in reasonable places.
    Rather than obscuring the syntax specifications for these
    structured fields with explicit syntax for this
    linear-white-space, the existence of another "lexical"
    analyzer is assumed.  This analyzer does not apply for field
    bodies which are simply unstructured strings of text, as
    described above.  It provides an interpretation of the
    unfolded text comprising the body of the field as a sequence
    of lexical symbols.  These symbols are:

            -  individual special characters
            -  quoted-strings
            -  comments
            -  atoms

    The first three of these symbols are self-delimiting.  Atoms
    are not; they therefore are delimited by the self-delimiting
    symbols and by linear-white-space.  For the purposes of
    regenerating sequences of atoms and quoted-strings, exactly
    one SPACE is assumed to exist and should be used between
    them.  (Also, in Section III.B.3.a, note the rules concerning
    treatment of multiple contiguous LWSP-chars.)

    So, for example, the folded body of an address field

            ":sysmail"@   Some-Host,
            Muhammed(I am   the greatest)Ali   at(the)WBA

    is analyzed into the following lexical symbols and types:

            ":sysmail"              quoted string
            @                       special
            Some-Host               atom
            ,                       special
            Muhammed                atom
            (I am   the greatest)   comment
            Ali                     atom
            at                      atom
            (the)                   comment
            WBA                     atom

    The canonical representations for the data in these addresses
    are the following strings (note that there is exactly one
    SPACE between words):

                :sysmail at Some-Host

    and

                Muhammed Ali at WBA

2.  Formal Definitions

The first four rules, below, indicate a meta-syntax for fields,
without regard to their particular type or internal syntax.  The
remaining rules define basic syntactic structures which are used
by the rules in Sections III.C, III.D, and III.E.

field       =  field-name ":" [ field-body ] CRLF

field-name  =  fnatom *( LWSP-char [fnatom] )

fnatom      =  1*<any CHAR, excluding CTLs, SPACE, and ":">

field-body  =  field-body-contents
               [CRLF LWSP-char field-body]

field-body-contents = <the TELNET ASCII characters making up the
               field-body, as defined in the following sections,
               and consisting of combinations of atom,
               quoted-string, and specials tokens, or else
               consisting of texts>

                                            ; (  Octal, Decimal.)
CHAR        =  <any TELNET ASCII character> ; (  0-177,  0.-127.)
ALPHA       =  <any TELNET ASCII alphabetic character>
                                            ; (101-132, 65.- 90.)
                                            ; (141-172, 97.-122.)
DIGIT       =  <any TELNET ASCII digit>     ; ( 60- 71, 48.- 57.)
CTL         =  <any TELNET ASCII control    ; (  0- 37,  0.- 31.)
                character and DEL>          ; (    177,     127.)
CR          =  <TELNET ASCII carriage return>;(     15,      13.)
LF          =  <TELNET ASCII linefeed>      ; (     12,      10.)
SPACE       =  <TELNET ASCII space>         ; (     40,      32.)
HTAB        =  <TELNET ASCII horizontal-tab>; (     11,       9.)
<">         =  <TELNET ASCII quote mark>    ; (     42,      34.)
CRLF        =  CR LF

LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE

linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
                                            ; CRLF => folding

specials    =  "(" / ")" / "<" / ">" / "@"  ; To use in a word,
            /  "," / ";" / ":" / "\" / <">  ;  word must be a
                                            ;  quoted-string.

delimiters  =  specials / comment / linear-white-space

text        =  <any CHAR, including bare    ; => atoms, specials,
                CR and/or bare LF, but NOT  ;  comments and
                including CRLF>             ;  quoted-strings are
                                            ;  NOT interpreted.

atom        =  1*<any CHAR except specials and CTLs>

quoted-string = <"> *(qtext/quoted-pair) <">; Any number of qtext
                                            ;   chars or any
                                            ;   quoted char.

qtext       =  <any CHAR excepting <">      ; => may be folded
                and CR, and including
                linear-white-space>

comment     =  "(" *(ctext / comment / quoted-pair) ")"

ctext       =  <any CHAR excluding "(",     ; => may be folded
                ")" and CR, and including
                linear-white-space>

quoted-pair =  "\" CHAR

3.  Clarifications

a.  "White space"

    Remember that in field-names and structured field bodies,
    MULTIPLE LINEAR WHITE SPACE TELNET ASCII CHARACTERS (namely
    HTABs and SPACEs) ARE TREATED AS SINGLE SPACES AND MAY FREELY
    SURROUND ANY SYMBOL.  In all header fields, the only place in
    which at least one space is REQUIRED is at the beginning of
    continuation lines in a folded field.  When passing text to
    processes which do not interpret text according to this
    standard (e.g., ARPANET FTP mail servers), then exactly one
    SPACE should be used in place of arbitrary linear-white-space
    and comment sequences.

    WHEREVER A MEMBER OF THE LIST OF <DELIMITER>S IS ALLOWED,
    LWSP-CHARS MAY ALSO OCCUR BEFORE AND/OR AFTER IT.

    Writers of mail-sending (i.e. header generating) programs
    should realize that there is no Network-wide definition of
    the effect of horizontal-tab TELNET ASCII characters on the
    appearance of text at another Network host; therefore, the
    use of tabs in message headers, though permitted, is
    discouraged.
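
    As an informal illustration only (it is not part of this
    standard), the unfolding rule of Section III.B.1.a, the
    lexical symbols of Section III.B.1.e and the single-SPACE
    rule above might be sketched in Python as follows.  The names
    unfold, lex and regenerate are hypothetical, chosen for this
    sketch.  It assumes the field body is already separated from
    its field-name, and it does not check atoms for CTLs.

```python
import re

SPECIALS = set('()<>@,;:\\"')   # the "specials" rule
LWSP = " \t"                    # LWSP-char = SPACE / HTAB


def unfold(field: str) -> str:
    """Regard CRLF immediately followed by an LWSP-char as that LWSP-char."""
    return re.sub(r"\r\n(?=[ \t])", "", field)


def lex(body: str):
    """Split an unfolded structured field body into (type, text) symbols."""
    symbols = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch in LWSP:
            i += 1                          # white space only delimits
        elif ch == '"':                     # quoted-string
            j = i + 1
            while j < n and body[j] != '"':
                j += 2 if body[j] == "\\" else 1   # step over quoted-pair
            symbols.append(("quoted-string", body[i:j + 1]))
            i = j + 1
        elif ch == "(":                     # comment; parentheses nest
            depth, j = 0, i
            while j < n:
                if body[j] == "\\":
                    j += 1                  # skip the quoted character
                elif body[j] == "(":
                    depth += 1
                elif body[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            symbols.append(("comment", body[i:j + 1]))
            i = j + 1
        elif ch in SPECIALS:                # individual special character
            symbols.append(("special", ch))
            i += 1
        else:                               # atom, ended by any delimiter
            j = i
            while j < n and body[j] not in SPECIALS and body[j] not in LWSP:
                j += 1
            symbols.append(("atom", body[i:j]))
            i = j
    return symbols


def regenerate(symbols):
    """Drop comments and place exactly one SPACE between the other symbols."""
    return " ".join(text for kind, text in symbols if kind != "comment")


folded = ('":sysmail"@   Some-Host,\r\n'
          ' Muhammed(I am   the greatest)Ali   at(the)WBA')
symbols = lex(unfold(folded))
for kind, text in symbols:
    print(f"{text:24}{kind}")
print(regenerate(symbols[4:]))   # second address: Muhammed Ali at WBA
```

    Applied to the folded address field of Section III.B.1.e, the
    sketch yields the same ten symbols as the table given there.
    Turning the first address into ":sysmail at Some-Host" needs
    address-level rules that this sketch does not attempt.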
    Note that during transmissions across the ARPANET using
    TELNET NVT connections, data must conform to TELNET NVT
    conventions (e.g., CR must be followed by either LF, making a
    CRLF, or <null>, if the CR is to stand alone).

b.  Comments

    Comments are detected as such only within field-bodies of
    structured fields.  A comment is a set of TELNET ASCII
    characters, which is not within a quoted-string and which is
    enclosed in matching parentheses; parentheses nest, so that
    if an unquoted left parenthesis occurs in a comment string,
    there must also be a matching right parenthesis.  When a
    comment is used to act as the delimiter between a sequence of
    two lexical symbols, such as two atoms, it is lexically
    equivalent with one SPACE, for the purposes of regenerating
    the sequence, such as when passing the sequence onto an FTP
    mail server.

    In particular comments are NOT passed to the FTP server, as
    part of a MAIL or MLFL command, since comments are not part
    of the "formal" address.

    If a comment is to be "folded" onto multiple lines, then the
    syntax for folding must be adhered to.  (See items III.B.1.a,
    above, and III.B.3.f, below.)  Note that the official
    semantics therefore do not "see" any unquoted CRLFs which are
    in comments, although particular parsing programs may wish to
    note their presence.  For these programs, it would be
    reasonable to interpret a "CRLF LWSP-char" as being a CRLF
    which is part of the comment; i.e., the CRLF is kept and the
    LWSP-char is discarded.  Quoted CRLFs (i.e., a backslash
    followed by a CR followed by a LF) still must be followed by
    at least one LWSP-char.

c.  Delimiting and quoting characters