            the mechanisms described here are mechanisms for encoding
            arbitrary byte streams, not bit streams. If a bit stream is
            to be encoded via one of these mechanisms, it must first be
            converted to an 8-bit byte stream using the network standard
            bit order ("big-endian"), in which the earlier bits in a
            stream become the higher-order bits in a byte. A bit stream
            not ending at an 8-bit boundary must be padded with zeroes.
            This document provides a mechanism for noting the addition
            of such padding in the case of the application Content-Type,
            which has a "padding" parameter.

            The encoding mechanisms defined here explicitly encode all
            data in ASCII. Thus, for example, suppose an entity has
            header fields such as:

                 Content-Type: text/plain; charset=ISO-8859-1
                 Content-transfer-encoding: base64

            This should be interpreted to mean that the body is a base64
            ASCII encoding of data that was originally in ISO-8859-1,
            and will be in that character set again after decoding.

            The following sections will define the two standard encoding
            mechanisms. The definition of new
            content-transfer-encodings is explicitly discouraged and
            should only occur when absolutely necessary. The entire
            content-transfer-encoding namespace, except for names
            beginning with "X-", is explicitly reserved to the IANA for
            future use. Private agreements about
            content-transfer-encodings are also explicitly discouraged.

            Certain Content-Transfer-Encoding values may only be used on
            certain Content-Types.
            In particular, it is expressly forbidden to use any
            encodings other than "7bit", "8bit", or "binary" with any
            Content-Type that recursively includes other Content-Type
            fields, notably the "multipart" and "message" Content-Types.
            All encodings that are desired for bodies of type multipart
            or message must be done at the innermost level, by encoding
            the actual body that needs to be encoded.

            Borenstein & Freed                                 [Page 12]

            RFC 1341  MIME: Multipurpose Internet Mail Extensions  June 1992

            NOTE ON ENCODING RESTRICTIONS: Though the prohibition
            against using content-transfer-encodings on data of type
            multipart or message may seem overly restrictive, it is
            necessary to prevent nested encodings, in which data are
            passed through an encoding algorithm multiple times, and
            must be decoded multiple times in order to be properly
            viewed. Nested encodings add considerable complexity to
            user agents: aside from the obvious efficiency problems
            with such multiple encodings, they can obscure the basic
            structure of a message. In particular, they can imply that
            several decoding operations are necessary simply to find out
            what types of objects a message contains. Banning nested
            encodings may complicate the job of certain mail gateways,
            but this seems less of a problem than the effect of nested
            encodings on user agents.

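The restriction above on "multipart" and "message" bodies can be sketched as a small check. This is an illustration only, written in Python; the names `ALLOWED_FOR_COMPOSITE` and `encoding_allowed` are hypothetical and do not come from this document. It tests only the prohibition described above and does not validate encoding names for other Content-Types.

```python
# Encodings permitted on Content-Types that contain other Content-Type fields.
ALLOWED_FOR_COMPOSITE = {"7bit", "8bit", "binary"}


def encoding_allowed(content_type: str, transfer_encoding: str) -> bool:
    """Return False when a multipart or message body declares an encoding
    other than 7bit, 8bit, or binary; return True in every other case."""
    # Take the primary type, e.g. "multipart" from "multipart/mixed; boundary=x".
    primary = content_type.split("/", 1)[0].strip().lower()
    encoding = transfer_encoding.strip().lower()
    if primary in ("multipart", "message"):
        return encoding in ALLOWED_FOR_COMPOSITE
    return True


print(encoding_allowed("multipart/mixed; boundary=x", "base64"))  # False
print(encoding_allowed("text/plain; charset=ISO-8859-1", "base64"))  # True
```

A composer that needs base64 or quoted-printable for a part nested inside a multipart body would apply it to the innermost body part and leave the enclosing entity at "7bit", "8bit", or "binary".
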
            NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND
            CONTENT-TRANSFER-ENCODING: It may seem that the
            Content-Transfer-Encoding could be inferred from the
            characteristics of the Content-Type that is to be encoded,
            or, at the very least, that certain
            Content-Transfer-Encodings could be mandated for use with
            specific Content-Types. There are several reasons why this
            is not the case. First, given the varying types of
            transports used for mail, some encodings may be appropriate
            for some Content-Type/transport combinations and not for
            others. (For example, in an 8-bit transport, no encoding
            would be required for text in certain character sets, while
            such encodings are clearly required for 7-bit SMTP.)
            Second, certain Content-Types may require different types
            of transfer encoding under different circumstances. For
            example, many PostScript bodies might consist entirely of
            short lines of 7-bit data and hence require little or no
            encoding. Other PostScript bodies (especially those using
            Level 2 PostScript's binary encoding mechanism) may only be
            reasonably represented using a binary transport encoding.
            Finally, since Content-Type is intended to be an open-ended
            specification mechanism, strict specification of an
            association between Content-Types and encodings effectively
            couples the specification of an application protocol with a
            specific lower-level transport. This is not desirable since
            the developers of a Content-Type should not have to be aware
            of all the transports in use and what their limitations are.

            NOTE ON TRANSLATING ENCODINGS: The quoted-printable and
            base64 encodings are designed so that conversion between
            them is possible. The only issue that arises in such a
            conversion is the handling of line breaks. When converting
            from quoted-printable to base64 a line break must be
            converted into a CRLF sequence. Similarly, a CRLF sequence
            in base64 data should be converted to a quoted-printable
            line break, but ONLY when converting text data.

            NOTE ON CANONICAL ENCODING MODEL: There was some
            confusion, in earlier drafts of this memo, regarding the
            model for when email data was to be converted to canonical
            form and encoded, and in particular how this process would
            affect the treatment of CRLFs, given that the representation
            of newlines varies greatly from system to system. For this
            reason, a canonical model for encoding is presented as
            Appendix H.

            5.1  Quoted-Printable Content-Transfer-Encoding

            The Quoted-Printable encoding is intended to represent data
            that largely consists of octets that correspond to printable
            characters in the ASCII character set. It encodes the data
            in such a way that the resulting octets are unlikely to be
            modified by mail transport. If the data being encoded are
            mostly ASCII text, the encoded form of the data remains
            largely recognizable by humans. A body which is entirely
            ASCII may also be encoded in Quoted-Printable to ensure the
            integrity of the data should the message pass through a
            character-translating and/or line-wrapping gateway.

            In this encoding, octets are to be represented as determined
            by the following rules:

                 Rule #1: (General 8-bit representation) Any octet,
                 except those indicating a line break according to the
                 newline convention of the canonical form of the data
                 being encoded, may be represented by an "=" followed by
                 a two digit hexadecimal representation of the octet's
                 value. The digits of the hexadecimal alphabet, for this
                 purpose, are "0123456789ABCDEF". Uppercase letters must
                 be used when sending hexadecimal data, though a robust
                 implementation may choose to recognize lowercase
                 letters on receipt. Thus, for example, the value 12
                 (ASCII form feed) can be represented by "=0C", and the
                 value 61 (ASCII EQUAL SIGN) can be represented by
                 "=3D". Except when the following rules allow an
                 alternative encoding, this rule is mandatory.

                 Rule #2: (Literal representation) Octets with decimal
                 values of 33 through 60 inclusive, and 62 through 126,
                 inclusive, MAY be represented as the ASCII characters
                 which correspond to those octets (EXCLAMATION POINT
                 through LESS THAN, and GREATER THAN through TILDE,
                 respectively).

                 Rule #3: (White Space): Octets with values of 9 and 32
                 MAY be represented as ASCII TAB (HT) and SPACE
                 characters, respectively, but MUST NOT be so
                 represented at the end of an encoded line.
                 Any TAB (HT) or SPACE characters on an encoded line
                 MUST thus be followed on that line by a printable
                 character. In particular, an "=" at the end of an
                 encoded line, indicating a soft line break (see rule
                 #5) may follow one or more TAB (HT) or SPACE
                 characters. It follows that an octet with value 9 or
                 32 appearing at the end of an encoded line must be
                 represented according to Rule #1. This rule is
                 necessary because some MTAs (Message Transport Agents,
                 programs which transport messages from one user to
                 another, or perform a part of such transfers) are known
                 to pad lines of text with SPACEs, and others are known
                 to remove "white space" characters from the end of a
                 line. Therefore, when decoding a Quoted-Printable body,
                 any trailing white space on a line must be deleted, as
                 it will necessarily have been added by intermediate
                 transport agents.

                 Rule #4 (Line Breaks): A line break in a text body
                 part, independent of what its representation is
                 following the canonical representation of the data
                 being encoded, must be represented by a (RFC 822) line
                 break, which is a CRLF sequence, in the
                 Quoted-Printable encoding. If isolated CRs and LFs, or
                 LF CR and CR LF sequences are allowed to appear in
                 binary data according to the canonical form, they must
                 be represented using the "=0D", "=0A", "=0A=0D" and
                 "=0D=0A" notations respectively.

                 Note that many implementations may elect to encode the
                 local representation of various content types directly.
                 In particular, this may apply to plain text material on
                 systems that use newline conventions other than CRLF
                 delimiters. Such an implementation is permissible, but
                 the generation of line breaks must be generalized to
                 account for the case where alternate representations of
                 newline sequences are used.

                 Rule #5 (Soft Line Breaks): The Quoted-Printable
                 encoding REQUIRES that encoded lines be no more than 76
                 characters long. If longer lines are to be encoded with
                 the Quoted-Printable encoding, 'soft' line breaks must
                 be used. An equal sign as the last character on an
                 encoded line indicates such a non-significant ('soft')
                 line break in the encoded text. Thus if the "raw" form
                 of the line is a single unencoded line that says:

                      Now's the time for all folk to come to the aid of
                      their country.

                 This can be represented, in the Quoted-Printable
                 encoding, as

                      Now's the time =
                      for all folk to come=
                       to the aid of their country.

                 This provides a mechanism with which long lines are
                 encoded in such a way as to be restored by the user
                 agent. The 76 character limit does not count the
                 trailing CRLF, but counts all other characters,
                 including any equal signs.

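Rules #1 through #5 above can be sketched as a small encoder. This is a minimal illustration in Python, not part of the specification. The names `qp_encode` and `_encode_octet` are hypothetical, and the sketch assumes text input that is already in canonical form with CRLF line breaks. Under that assumption any isolated CR or LF octet falls outside the literal ranges and is hex-encoded by Rule #1.

```python
def _encode_octet(octet: int, at_line_end: bool) -> str:
    """Encode one octet of a line according to Rules #1 to #3."""
    # Rule #2: printable ASCII other than "=" (61) may stand for itself.
    if 33 <= octet <= 60 or 62 <= octet <= 126:
        return chr(octet)
    # Rule #3: TAB and SPACE may be literal, except at the end of a line.
    if octet in (9, 32) and not at_line_end:
        return chr(octet)
    # Rule #1: everything else becomes "=" plus two uppercase hex digits.
    return "=%02X" % octet


def qp_encode(data: bytes, max_len: int = 76) -> bytes:
    """Encode canonical (CRLF-delimited) text as quoted-printable."""
    out_lines = []
    # Rule #4: each CRLF in the input stays a CRLF in the output.
    for line in data.split(b"\r\n"):
        tokens = [
            _encode_octet(octet, index == len(line) - 1)
            for index, octet in enumerate(line)
        ]
        current = ""
        for index, token in enumerate(tokens):
            is_last = index == len(tokens) - 1
            # Rule #5: keep one column free for the soft-break "=" unless
            # this token is the last one on the line.
            limit = max_len if is_last else max_len - 1
            if len(current) + len(token) > limit:
                out_lines.append(current + "=")  # soft line break
                current = ""
            current += token
        out_lines.append(current)
    return "\r\n".join(out_lines).encode("ascii")


print(qp_encode(b"a=b\x0c \r\n"))  # b'a=3Db=0C=20\r\n'
```

The sketch never splits an "=XX" triplet across a soft line break, and it counts the trailing "=" toward the 76 character limit, as Rule #5 requires.
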
            Since the hyphen character ("-") is represented as itself in
            the Quoted-Printable encoding, care must be taken, when
            encapsulating a quoted-printable encoded body in a multipart
            entity, to ensure that the encapsulation boundary does not
            appear anywhere in the encoded body. (A good strategy is to
            choose a boundary that includes a character sequence such as
            "=_" which can never appear in a quoted-printable body. See
            the definition of multipart messages later in this
            document.)

            NOTE: The quoted-printable encoding represents something of
            a compromise between readability and reliability in
            transport. Bodies encoded with the quoted-printable
            encoding will work reliably over most mail gateways, but may
            not work perfectly over a few gateways, notably those
            involving translation into EBCDIC. (In theory, an EBCDIC