rfc2049.txt

Freed & Borenstein          Standards Track                     [Page 6]
RFC 2049                    MIME Conformance               November 1996

   The following guidelines may be useful to anyone devising a data
   format (media type) that is supposed to survive the widest range of
   networking technologies and known broken MTAs unscathed.  Note that
   anything encoded in the base64 encoding will satisfy these rules, but
   that some well-known mechanisms, notably the UNIX uuencode facility,
   will not.  Note also that anything encoded in the Quoted-Printable
   encoding will survive most gateways intact, but possibly not some
   gateways to systems that use the EBCDIC character set.

    (1)   Under some circumstances the encoding used for data may
          change as part of normal gateway or user agent
          operation.  In particular, conversion from base64 to
          quoted-printable and vice versa may be necessary.  This
          may result in the confusion of CRLF sequences with line
          breaks in text bodies.  As such, the persistence of
          CRLF as something other than a line break must not be
          relied on.

    (2)   Many systems may elect to represent and store text data
          using local newline conventions.  Local newline
          conventions may not match the RFC822 CRLF convention --
          systems are known that use plain CR, plain LF, CRLF, or
          counted records.  The result is that isolated CR and LF
          characters are not well tolerated in general; they may
          be lost or converted to delimiters on some systems, and
          hence must not be relied on.

    (3)   The transmission of NULs (US-ASCII value 0) is
          problematic in Internet mail.  (This is largely the
          result of NULs being used as a termination character by
          many of the standard runtime library routines in the C
          programming language.)  The practice of using NULs as
          termination characters is so entrenched now that
          messages should not rely on them being preserved.

    (4)   TAB (HT) characters may be misinterpreted or may be
          automatically converted to variable numbers of spaces.
          This is unavoidable in some environments, notably those
          not based on the US-ASCII character set.  Such
          conversion is STRONGLY DISCOURAGED, but it may occur,
          and mail formats must not rely on the persistence of
          TAB (HT) characters.

    (5)   Lines longer than 76 characters may be wrapped or
          truncated in some environments.  Line wrapping or line
          truncation imposed by mail transports is STRONGLY
          DISCOURAGED, but unavoidable in some cases.
          Applications which require long lines must somehow
          differentiate between soft and hard line breaks.  (A
          simple way to do this is to use the quoted-printable
          encoding.)

    (6)   Trailing "white space" characters (SPACE, TAB (HT)) on
          a line may be discarded by some transport agents, while
          other transport agents may pad lines with these
          characters so that all lines in a mail file are of
          equal length.  The persistence of trailing white space,
          therefore, must not be relied on.

    (7)   Many mail domains use variations on the US-ASCII
          character set, or use character sets such as EBCDIC
          which contain most but not all of the US-ASCII
          characters.  The correct translation of characters not
          in the "invariant" set cannot be depended on across
          character converting gateways.  For example, this
          situation is a problem when sending uuencoded
          information across BITNET, an EBCDIC system.  Similar
          problems can occur without crossing a gateway, since
          many Internet hosts use character sets other than US-
          ASCII internally.  The definition of Printable Strings
          in X.400 adds further restrictions in certain special
          cases.  In particular, the only characters that are
          known to be consistent across all gateways are the 73
          characters that correspond to the upper and lower case
          letters A-Z and a-z, the 10 digits 0-9, and the
          following eleven special characters:

            "'"  (US-ASCII decimal value 39)
            "("  (US-ASCII decimal value 40)
            ")"  (US-ASCII decimal value 41)
            "+"  (US-ASCII decimal value 43)
            ","  (US-ASCII decimal value 44)
            "-"  (US-ASCII decimal value 45)
            "."  (US-ASCII decimal value 46)
            "/"  (US-ASCII decimal value 47)
            ":"  (US-ASCII decimal value 58)
            "="  (US-ASCII decimal value 61)
            "?"  (US-ASCII decimal value 63)

          A maximally portable mail representation will confine
          itself to relatively short lines of text in which the
          only meaningful characters are taken from this set of
          73 characters.  The base64 encoding follows this rule.

    (8)   Some mail transport agents will corrupt data that
          includes certain literal strings.  In particular, a
          period (".") alone on a line is known to be corrupted
          by some (incorrect) SMTP implementations, and a line
          that starts with the five characters "From " (the fifth
          character is a SPACE) is commonly corrupted as well.
          A careful composition agent can prevent these
          corruptions by encoding the data (e.g., in the quoted-
          printable encoding using "=46rom " in place of "From "
          at the start of a line, and "=2E" in place of "." alone
          on a line).

   Please note that the above list is NOT a list of recommended
   practices for MTAs.  RFC 821 MTAs are prohibited from altering the
   character of white space or wrapping long lines.  These BAD and
   invalid practices are known to occur on established networks, and
   implementations should be robust in dealing with the bad effects they
   can cause.

4.  Canonical Encoding Model

   There was some confusion, in earlier versions of these documents,
   regarding the model for when email data was to be converted to
   canonical form and encoded, and in particular how this process would
   affect the treatment of CRLFs, given that the representation of
   newlines varies greatly from system to system.  For this reason, a
   canonical model for encoding is presented below.

   The process of composing a MIME entity can be modeled as being done
   in a number of steps.  Note that these steps are roughly similar to
   those steps used in PEM [RFC-1421] and are performed for each
   "innermost level" body:

    (1)   Creation of local form.

          The body to be transmitted is created in the system's
          native format.  The native character set is used and,
          where appropriate, local end of line conventions are
          used as well.  The body may be a UNIX-style text file,
          or a Sun raster image, or a VMS indexed file, or audio
          data in a system-dependent format stored only in
          memory, or anything else that corresponds to the local
          model for the representation of some form of
          information.  Fundamentally, the data is created in the
          "native" form that corresponds to the type specified by
          the media type.

    (2)   Conversion to canonical form.

          The entire body, including "out-of-band" information
          such as record lengths and possibly file attribute
          information, is converted to a universal canonical
          form.  The specific media type of the body as well as
          its associated attributes dictate the nature of the
          canonical form that is used.  Conversion to the proper
          canonical form may involve character set conversion,
          transformation of audio data, compression, or various
          other operations specific to the various media types.
          If character set conversion is involved, however, care
          must be taken to understand the semantics of the media
          type, which may have strong implications for any
          character set conversion, e.g. with regard to
          syntactically meaningful characters in a text subtype
          other than "plain".

          For example, in the case of text/plain data, the text
          must be converted to a supported character set and
          lines must be delimited with CRLF delimiters in
          accordance with RFC 822.  Note that the restriction on
          line lengths implied by RFC 822 is eliminated if the
          next step employs either quoted-printable or base64
          encoding.

    (3)   Apply transfer encoding.

          A Content-Transfer-Encoding appropriate for this body
          is applied.  Note that there is no fixed relationship
          between the media type and the transfer encoding.  In
          particular, it may be appropriate to base the choice of
          base64 or quoted-printable on character frequency
          counts which are specific to a given instance of a
          body.

    (4)   Insertion into entity.

          The encoded body is inserted into a MIME entity with
          appropriate headers. The entity is then inserted into
          the body of a higher-level entity (message or
          multipart) as needed.

   Conversion from entity form to local form is accomplished by
   reversing these steps. Note that reversal of these steps may produce
   differing results since there is no guarantee that the original and
   final local forms are the same.

   It is vital to note that these steps are only a model; they are
   specifically NOT a blueprint for how an actual system would be built.
   In particular, the model fails to account for two common designs:

    (1)   In many cases the conversion to a canonical form prior
          to encoding will be subsumed into the encoder itself,
          which understands local formats directly.  For example,
          the local newline convention for text bodies might be
          carried through to the encoder itself along with
          knowledge of what that format is.

    (2)   The output of the encoders may have to pass through one
          or more additional steps prior to being transmitted as
          a message.  As such, the output of the encoder may not
          be conformant with the formats specified by RFC 822.
          In particular, once again it may be appropriate for the
          converter's output to be expressed using local newline
          conventions rather than using the standard RFC 822 CRLF
          delimiters.

   Other implementation variations are conceivable as well.  The vital
   aspect of this discussion is that, in spite of any optimizations,
   collapsings of required steps, or insertion of additional processing,
   the resulting messages must be consistent with those produced by the
   model described here.  For example, a message with the following
   header fields:

     Content-type: text/foo; charset=bar
     Content-Transfer-Encoding: base64

   must be first represented in the text/foo form, then (if necessary)
   represented in the "bar" character set, and finally transformed via
   the base64 algorithm into a mail-safe form.

   NOTE: Some confusion has been caused by systems that represent
   messages in a format which uses local newline conventions which
   differ from the RFC822 CRLF convention.  It is important to note that
   these formats are not canonical RFC822/MIME.  These formats are
   instead *encodings* of RFC822, where CRLF sequences in the canonical
   representation of the message are encoded as the local newline
   convention.  Note that formats which encode CRLF sequences as, for
   example, LF are not capable of representing MIME messages containing
   binary data which contains LF octets not part of CRLF line separation
   sequences.

5.  Summary

   This document defines what is meant by MIME Conformance. It also
   details various problems known to exist in the Internet email system
   and how to use MIME to overcome them. Finally, it describes MIME's
   canonical encoding model.

6.  Security Considerations

   Security issues are discussed in the second document in this set, RFC
   2046.

7.  Authors' Addresses

   For more information, the authors of this document are best contacted
   via Internet mail:

   Ned Freed
   Innosoft International, Inc.
   1050 East Garvey Avenue South
   West Covina, CA 91790
   USA

   Phone: +1 818 919 3600
   Fax:   +1 818 919 3614
   EMail: ned@innosoft.com

   Nathaniel S. Borenstein
   First Virtual Holdings
   25 Washington Avenue
   Morristown, NJ 07960
   USA

   Phone: +1 201 540 8967
   Fax:   +1 201 993 3032
   EMail: nsb@nsb.fv.com

   MIME is a result of the work of the Internet Engineering Task Force
   Working Group on RFC 822 Extensions.  The chairman of that group,
   Greg Vaudreuil, may be reached at:

   Gregory M. Vaudreuil
   Octel Network Services
   17080 Dallas Parkway
   Dallas, TX 75248-1905
   USA

   EMail: Greg.Vaudreuil@Octel.Com
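
Illustrative sketch (not part of RFC 2049)

   The composition steps of section 4, the 73-character "invariant"
   set of guideline (7), and the "=46rom " / "=2E" protection of
   guideline (8) can be illustrated with a minimal Python sketch.  It
   is only one possible reading of the model, not a reference
   implementation.  The function names (to_canonical,
   apply_transfer_encoding, build_entity, to_local, protect_qp_line),
   the default us-ascii charset, and the choice of base64 are all
   assumptions made for illustration; only the standard library
   modules base64, quopri, and string are used.

```python
import base64
import quopri
import string

# Guideline (7): letters, digits, and the eleven listed special characters.
INVARIANT = set(string.ascii_letters + string.digits + "'()+,-./:=?")


def to_canonical(local_text, charset="us-ascii"):
    """Step 2: convert local newlines (CR, LF, or CRLF) to CRLF and
    encode the text in the chosen character set."""
    unified = local_text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n").encode(charset)


def apply_transfer_encoding(canonical):
    """Step 3: base64-encode the canonical bytes.  encodebytes() emits
    lines of at most 76 characters ending in LF; these are rewritten
    as CRLF so the result can be placed in an RFC 822 body."""
    return base64.encodebytes(canonical).replace(b"\n", b"\r\n")


def build_entity(encoded, media_type="text/plain", charset="us-ascii"):
    """Step 4: prepend the header fields describing steps 2 and 3."""
    header = (
        f"Content-Type: {media_type}; charset={charset}\r\n"
        "Content-Transfer-Encoding: base64\r\n"
        "\r\n"
    )
    return header.encode("ascii") + encoded


def to_local(entity, charset="us-ascii", newline="\n"):
    """Reverse the steps: split off the header, undo base64, decode the
    charset, and map CRLF to the (assumed) local newline convention."""
    _, _, body = entity.partition(b"\r\n\r\n")
    canonical = base64.decodebytes(body)
    return canonical.decode(charset).replace("\r\n", newline)


def protect_qp_line(line):
    """Guideline (8): hide a leading "From " and a lone "." from broken
    agents.  Assumes `line` is already valid quoted-printable text."""
    if line.startswith("From "):
        return "=46rom " + line[5:]
    if line == ".":
        return "=2E"
    return line


if __name__ == "__main__":
    local = "line one\nline two\n"          # step 1: local (LF) form
    entity = build_entity(apply_transfer_encoding(to_canonical(local)))
    print(entity.decode("ascii"))
    print(to_local(entity) == local)
    print(quopri.decodestring(protect_qp_line("From here").encode("ascii")))
```

   Keeping each step as a separate function mirrors the model only for
   clarity.  As the text of section 4 notes, a real system may fold
   canonicalization into the encoder, provided the resulting message is
   the same as the one the model would produce.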