<l>#<m>element
indicating at least <l> and at most <m> elements, each separated
by one or more commas (","). This makes the usual form of lists
very easy; a rule such as '(element *("," element))' can be shown
as "1#element". Wherever this construct is used, null elements
are allowed, but do not contribute to the count of elements
present. That is, "(element),,(element)" is permitted, but
counts as only two elements. Therefore, where at least one
element is required, at least one non-null element must be
present.
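The counting rule above can be sketched in Python. This is a minimal
illustration, not part of the standard: the helper names
count_elements and satisfies are hypothetical, and the sketch assumes
that no comma appears inside a quoted-string or comment, so a plain
split on "," is enough.

```python
from typing import Optional


def count_elements(text: str) -> int:
    """Count the non-null elements of a comma-separated list."""
    # An element that is empty or holds only white space is a null
    # element and does not contribute to the count.
    return sum(1 for part in text.split(",") if part.strip())


def satisfies(text: str, at_least: int, at_most: Optional[int] = None) -> bool:
    """Check a list against <l>#<m>; at_most=None means no upper bound here."""
    n = count_elements(text)
    return n >= at_least and (at_most is None or n <= at_most)
```

For example, count_elements("(element),,(element)") counts two
elements, and satisfies(",,", 1) is false because no non-null element
is present.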
6. [optional]
Square brackets enclose optional elements; "[foo bar]" is
equivalent to "*1(foo bar)".
7. ; Comments
A semi-colon, set off some distance to the right of rule text,
starts a comment which continues to the end of line. This is a
simple way of including useful notes in parallel with the
specifications.
B. LEXICAL ANALYSIS OF MESSAGES
1. General Description
A message consists of headers and, optionally, a body (i.e. a
series of text lines). The text part is just a sequence of lines
containing ASCII characters; it is separated from the headers by
a null line (i.e., a line with nothing preceding the CRLF).
Standard for the Format of Text Messages 6
III. Syntax
B. Lexical Analysis
a. Folding and unfolding of headers
Each header item can be viewed as a single, logical line of
ASCII characters. For convenience, the field-body portion of
this conceptual entity can be split into a multiple-line
representation (i.e., "folded"). The general rule is that
wherever there can be linear-white-space (NOT simply LWSP-
chars), a CRLF immediately followed by AT LEAST one LWSP-char
can instead be inserted. (However, a header's name and the
following colon (":"), which occur at the beginning of the
header item, may NOT be folded onto multiple lines.) Thus,
the single line
To: "Joe Dokes & J. Harvey" <ddd at Host>, JJV at BBN
can be represented as
To:  "Joe Dokes & J. Harvey" <ddd at Host>,
        JJV at BBN
and
To:  "Joe Dokes & J. Harvey"
                <ddd at Host>,
        JJV at BBN
and
To:  "Joe Dokes
        & J. Harvey" <ddd at Host>, JJV at BBN
The process of moving from this folded multiple-line
representation of a header field to its single line
representation will be called "unfolding". Unfolding is
accomplished by regarding CRLF immediately followed by a
LWSP-char as equivalent to the LWSP-char.
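As a rough sketch of the unfolding rule just stated, the following
Python removes each CRLF that is immediately followed by a LWSP-char
and keeps the LWSP-char itself. The function name unfold is an
assumption made for illustration; a CRLF not followed by SPACE or HTAB
is left alone, since it ends the header item rather than folding it.

```python
import re

# A CRLF immediately followed by a LWSP-char (SPACE or HTAB).  The
# lookahead leaves the LWSP-char in place, so the pair collapses to
# the LWSP-char alone.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(header_item: str) -> str:
    """Return the single-line representation of a folded header item."""
    return _FOLD.sub("", header_item)
```

Note that the white space which began the continuation line survives
unfolding; reducing runs of LWSP-chars to one SPACE is a separate
step.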
b. Structure of header fields
Once header fields have been unfolded, they may be viewed as
being composed of a field-name followed by a colon (":"),
followed by a field-body. The field-name must be composed of
printable ASCII characters (i.e., characters which have
values between 33. and 126., decimal, except colon) and
LWSP-chars. The field-body may be composed of any ASCII
characters (other than an unquoted CRLF, which has been
removed by unfolding).
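The division of an unfolded field into field-name and field-body might
be sketched as follows. The function name split_field is hypothetical,
and the sketch assumes the trailing CRLF has already been removed from
the unfolded line.

```python
from typing import Tuple


def split_field(unfolded: str) -> Tuple[str, str]:
    """Split an unfolded header field at its first colon."""
    name, colon, body = unfolded.partition(":")
    if not colon:
        raise ValueError("header field has no colon")
    if not name.strip(" \t"):
        raise ValueError("field-name is empty")
    for ch in name:
        # Printable ASCII (33. to 126. decimal) or a LWSP-char; a colon
        # cannot occur here because the split is at the first colon.
        if not (33 <= ord(ch) <= 126 or ch in " \t"):
            raise ValueError("illegal character in field-name: %r" % ch)
    return name, body
```

Only the first colon separates name from body, so a colon inside the
field-body (as in "Subject: Re: hello") stays in the body.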
Certain field-bodies of header fields may be interpreted
according to an internal syntax which some systems may wish
to parse. These fields will be referred to as "structured"
fields. Examples include fields containing dates and
addresses. Other fields, such as "Subject" and "Comments",
are regarded simply as strings of text.
NOTE: Field-names, unstructured field bodies and structured
field bodies each are scanned by their own, INDEPENDENT
"lexical" analyzer.
c. Field-names
To aid in the creation and reading of field-names, the free
insertion of LWSP-chars is allowed in reasonable places.
Rather than obscuring the syntax specification for field-name
with the explicit syntax for these LWSP-chars, the existence
of a "lexical" analyzer is assumed. The analyzer interprets
the text which comprises the field-name as a sequence of
field-name atoms (fnatoms) separated by LWSP-chars.
Note that ONLY LWSP-chars may occur between the fnatoms of a
field-name and that CRLFs may NOT. In addition, comments are
NOT lexically recognized, as such, but parenthesized strings
are legal as part of field-names. These constraints are
different from what is permissible within structured field
bodies. In particular, this means that header field-names
must wholly occur on the FIRST line of a folded header item
and may NOT be split across two or more lines.
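A field-name analyzer along these lines could look like the sketch
below. The names fnatoms and canonical_field_name are illustrative
only, and joining the fnatoms with a single SPACE is an assumption
based on the rule that LWSP-chars carry the semantics of SPACE.

```python
import re
from typing import List


def fnatoms(field_name: str) -> List[str]:
    """Split a field-name into fnatoms separated by LWSP-chars."""
    if "\r" in field_name or "\n" in field_name:
        # Only LWSP-chars may separate fnatoms; a field-name may not
        # be folded across lines.
        raise ValueError("CRLF is not allowed inside a field-name")
    # Parenthesized strings are not comments here; they simply remain
    # part of whichever fnatom contains them.
    return [atom for atom in re.split(r"[ \t]+", field_name) if atom]


def canonical_field_name(field_name: str) -> str:
    """Rejoin the fnatoms with exactly one SPACE between them."""
    return " ".join(fnatoms(field_name))
```

With this sketch, "Reply \t To" (a made-up name used only to show the
white-space handling) yields the fnatoms "Reply" and "To", while
"Note(private)" stays a single fnatom.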
d. Unstructured field bodies
For some fields, such as "Subject" and "Comments", no
structuring is assumed; and they are treated simply as texts,
like those in the message body. Rules of folding apply to
these fields, so that such field bodies which occupy several
lines must therefore have the second and successive lines
indented by at least one LWSP-char.
e. Structured field bodies
To aid in the creation and reading of structured fields, the
free insertion of linear-white-space (which permits folding
by inclusion of CRLFs) is allowed in reasonable places.
Rather than obscuring the syntax specifications for these
structured fields with explicit syntax for this linear-
white-space, the existence of another "lexical" analyzer is
assumed. This analyzer does not apply for field bodies which
are simply unstructured strings of text, as described above.
It provides an interpretation of the unfolded text comprising
the body of the field as a sequence of lexical symbols.
These symbols are:
- individual special characters
- quoted-strings
- comments
- atoms
The first three of these symbols are self-delimiting. Atoms
are not; they therefore are delimited by the self-delimiting
symbols and by linear-white-space. For the purposes of re-
generating sequences of atoms and quoted-strings, exactly one
SPACE is assumed to exist and should be used between them.
(Also, in Section III.B.3.a, note the rules concerning
treatment of multiple contiguous LWSP-chars.)
So, for example, the folded body of an address field
":sysmail"@ Some-Host,
   Muhammed(I am the greatest)Ali at(the)WBA
is analyzed into the following lexical symbols and types:
":sysmail"              quoted string
@                       special
Some-Host               atom
,                       special
Muhammed                atom
(I am the greatest)     comment
Ali                     atom
at                      atom
(the)                   comment
WBA                     atom
The canonical representations for the data in these addresses
are the following strings (note that there is exactly one
SPACE between words):
:sysmail at Some-Host
and
Muhammed Ali at WBA
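The analysis of the example above can be reproduced with a small
lexer. What follows is a sketch under stated assumptions, not a
complete implementation: the names tokenize and canonical_addresses
are hypothetical, the input is taken to be already unfolded, and the
rendering of "@" as "at" and the removal of the surrounding quotes are
inferred from the canonical strings shown above rather than from a
rule given in this section.

```python
from typing import List, Tuple

SPECIALS = frozenset('()<>@,;:\\"')
LWSP = " \t"


def tokenize(body: str) -> List[Tuple[str, str]]:
    """Split an unfolded structured field body into (type, text) pairs."""
    tokens, i, n = [], 0, len(body)
    while i < n:
        ch = body[i]
        if ch in LWSP:                      # white space only delimits
            i += 1
        elif ch == '"':                     # quoted-string
            j = i + 1
            while j < n and body[j] != '"':
                j += 2 if body[j] == "\\" else 1   # skip a quoted-pair
            if j >= n:
                raise ValueError("unterminated quoted-string")
            tokens.append(("quoted-string", body[i:j + 1]))
            i = j + 1
        elif ch == "(":                     # comment; parentheses nest
            depth, j = 1, i + 1
            while j < n and depth:
                if body[j] == "\\":
                    j += 1                  # skip the quoted character
                elif body[j] == "(":
                    depth += 1
                elif body[j] == ")":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unterminated comment")
            tokens.append(("comment", body[i:j]))
            i = j
        elif ch in SPECIALS:
            tokens.append(("special", ch))
            i += 1
        elif ord(ch) < 32 or ord(ch) == 127:
            raise ValueError("unexpected control character")
        else:                               # atom: runs to next delimiter
            j = i
            while (j < n and body[j] not in SPECIALS and body[j] not in LWSP
                   and 32 <= ord(body[j]) != 127):
                j += 1
            tokens.append(("atom", body[i:j]))
            i = j
    return tokens


def canonical_addresses(body: str) -> List[str]:
    """Regenerate each comma-separated address with one SPACE between words."""
    groups, words = [], []
    for kind, text in tokenize(body):
        if kind == "comment":
            continue                        # comments are dropped
        if kind == "special" and text == ",":
            groups.append(" ".join(words))
            words = []
        elif kind == "special" and text == "@":
            words.append("at")              # assumption, from the example
        elif kind == "quoted-string":
            words.append(text[1:-1])        # quotes dropped, not unescaped
        else:
            words.append(text)
    groups.append(" ".join(words))
    return [g for g in groups if g]
```

Applied to the unfolded example body, tokenize yields the ten symbols
listed above, and canonical_addresses yields the two canonical strings.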
2. Formal Definitions
The first four rules, below, indicate a meta-syntax for fields,
without regard to their particular type or internal syntax. The
remaining rules define basic syntactic structures which are used
by the rules in Sections III.C, III.D, and III.E.
field = field-name ":" [ field-body ] CRLF
field-name = fnatom *( LWSP-char [fnatom] )
fnatom      = 1*<any CHAR, excluding CTLs, SPACE, and ":">
field-body  = field-body-contents
              [CRLF LWSP-char field-body]
field-body-contents =
              <the TELNET ASCII characters making up the
               field-body, as defined in the following sections,
               and consisting of combinations of atom, quoted-
               string, and specials tokens, or else consisting of
               texts>
                                              ; (  Octal, Decimal.)
CHAR        = <any TELNET ASCII character>    ; (  0-177,  0.-127.)
ALPHA       = <any TELNET ASCII alphabetic character>
                                              ; (101-132, 65.- 90.)
                                              ; (141-172, 97.-122.)
DIGIT       = <any TELNET ASCII digit>        ; ( 60- 71, 48.- 57.)
CTL         = <any TELNET ASCII control       ; (  0- 37,  0.- 31.)
               character and DEL>             ; (    177,     127.)
CR          = <TELNET ASCII carriage return>  ; (     15,      13.)
LF          = <TELNET ASCII linefeed>         ; (     12,      10.)
SPACE       = <TELNET ASCII space>            ; (     40,      32.)
HTAB        = <TELNET ASCII horizontal-tab>   ; (     11,       9.)
<">         = <TELNET ASCII quote mark>       ; (     42,      34.)
CRLF        = CR LF
LWSP-char   = SPACE / HTAB                    ; semantics = SPACE
linear-white-space = 1*([CRLF] LWSP-char)     ; semantics = SPACE
                                              ; CRLF => folding
specials    = "(" / ")" / "<" / ">" / "@"     ; To use in a word,
            / "," / ";" / ":" / "\" / <">     ; word must be a
                                              ; quoted-string.
delimiters  = specials / comment / linear-white-space
text        = <any CHAR, including bare       ; => atoms, specials,
               CR and/or bare LF, but NOT     ;  comments and
               including CRLF>                ;  quoted-strings are
                                              ;  NOT interpreted.
atom        = 1*<any CHAR except specials and CTLs>
quoted-string = <"> *(qtext/quoted-pair) <">  ; Any number of qtext
                                              ;  chars or any
                                              ;  quoted char.
qtext       = <any CHAR excepting <">         ; => may be folded
               and CR, and including
               linear-white-space>
comment     = "(" *(ctext / comment / quoted-pair) ")"
ctext       = <any CHAR excluding "(",        ; => may be folded
               ")" and CR, and including
               linear-white-space>
quoted-pair = "\" CHAR
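The character classes in these rules translate directly into
predicates over character codes. The sketch below uses the octal
values from the rules; the function names are illustrative. One
deliberate departure is labeled in the code: is_atom also rejects
SPACE, which the atom rule does not exclude by itself but which the
lexical description treats as a delimiter between atoms.

```python
SPECIALS = frozenset('()<>@,;:\\"')


def is_char(c: str) -> bool:
    return 0 <= ord(c) <= 0o177                 # 0.-127.


def is_alpha(c: str) -> bool:
    # Upper case 101-132 octal, lower case 141-172 octal.
    return 0o101 <= ord(c) <= 0o132 or 0o141 <= ord(c) <= 0o172


def is_digit(c: str) -> bool:
    return 0o60 <= ord(c) <= 0o71               # 48.-57.


def is_ctl(c: str) -> bool:
    return 0 <= ord(c) <= 0o37 or ord(c) == 0o177   # controls and DEL


def is_atom(s: str) -> bool:
    """1*<any CHAR except specials and CTLs>, additionally excluding SPACE."""
    return len(s) > 0 and all(
        is_char(c) and not is_ctl(c) and c not in SPECIALS
        and c != " "                            # assumption: SPACE delimits
        for c in s
    )
```

Under these definitions "Some-Host" is an atom, while "a@b" is not,
because "@" is one of the specials.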
3. Clarifications
a. "White space"
Remember that in field-names and structured field bodies,
MULTIPLE LINEAR WHITE SPACE TELNET ASCII CHARACTERS (namely
HTABs and SPACEs) ARE TREATED AS SINGLE SPACES AND MAY FREELY
SURROUND ANY SYMBOL. In all header fields, the only place in
which at least one space is REQUIRED is at the beginning of
continuation lines in a folded field. When passing text to
processes which do not interpret text according to this
standard (e.g., ARPANET FTP mail servers), then exactly one
SPACE should be used in place of arbitrary linear-white-space
and comment sequences.
WHEREVER A MEMBER OF THE LIST OF <DELIMITER>S IS ALLOWED,
LWSP-CHARS MAY ALSO OCCUR BEFORE AND/OR AFTER IT.
Writers of mail-sending (i.e. header generating) programs
should realize that there is no Network-wide definition of
the effect of horizontal-tab TELNET ASCII characters on the
appearance of text at another Network host; therefore, the
use of tabs in message headers, though permitted, is
discouraged.
Note that during transmissions across the ARPANET using
TELNET NVT connections, data must conform to TELNET NVT
conventions (e.g., CR must be followed by either LF, making a
CRLF, or <null>, if the CR is to stand alone).
b. Comments
Comments are detected as such only within field-bodies of
structured fields. A comment is a set of TELNET ASCII
characters, which is not within a quoted-string and which is
enclosed in matching parentheses; parentheses nest, so that
if an unquoted left parenthesis occurs in a comment string,
there must also be a matching right parenthesis. When a
comment is used to act as the delimiter between a sequence of
two lexical symbols, such as two atoms, it is lexically
equivalent with one SPACE, for the purposes of regenerating
the sequence, such as when passing the sequence onto an FTP
mail server.
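The treatment of a comment as one SPACE might be sketched as follows.
The function name replace_comments is hypothetical. The sketch leaves
parentheses inside a quoted-string untouched, honors nesting, and
skips quoted-pairs; it does not reduce the result to single SPACEs,
so a comment that is already surrounded by white space leaves a run
of several.

```python
def replace_comments(body: str) -> str:
    """Replace each top-level comment in a structured body by one SPACE."""
    out, i, n = [], 0, len(body)
    in_quotes = False
    while i < n:
        ch = body[i]
        if in_quotes and ch == "\\" and i + 1 < n:
            out.append(body[i:i + 2])       # quoted-pair copied verbatim
            i += 2
        elif ch == '"':
            in_quotes = not in_quotes       # enter or leave a quoted-string
            out.append(ch)
            i += 1
        elif ch == "(" and not in_quotes:
            depth = 1
            i += 1
            while i < n and depth:
                if body[i] == "\\":
                    i += 1                  # skip the quoted character
                elif body[i] == "(":
                    depth += 1              # nested comment opens
                elif body[i] == ")":
                    depth -= 1
                i += 1
            if depth:
                raise ValueError("unbalanced parentheses in comment")
            out.append(" ")                 # the comment acts as one SPACE
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For instance, "Muhammed(I am the greatest)Ali at(the)WBA" becomes
"Muhammed Ali at WBA", and a nested comment such as "a(b(c)d)e"
becomes "a e".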
In particular comments are NOT passed to the FTP server, as