            cooperating user agents.

            If a Content-Transfer-Encoding header field appears as  part
            of  a  message header, it applies to the entire body of that
            message.   If  a  Content-Transfer-Encoding   header   field
            appears as part of a body part's headers, it applies only to
            the body of that  body  part.   If  an  entity  is  of  type
            "multipart"  or  "message", the Content-Transfer-Encoding is
            not permitted to have any  value  other  than  a  bit  width
            (e.g., "7bit", "8bit", etc.) or "binary".

            It should be noted that email is character-oriented, so that
            the  mechanisms  described  here are mechanisms for encoding
            arbitrary byte streams, not bit streams.  If a bit stream is
            to  be encoded via one of these mechanisms, it must first be
            converted to an 8-bit byte stream using the network standard
            bit  order  ("big-endian"),  in  which the earlier bits in a
            stream become the higher-order bits in a byte.  A bit stream
            not  ending at an 8-bit boundary must be padded with zeroes.
            This document provides a mechanism for noting  the  addition
            of such padding in the case of the application Content-Type,
            which has a "padding" parameter.
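
            The bit-to-byte conversion described above can be sketched as
            follows.  This is a minimal illustration, not part of the
            specification; the function name pack_bits and its return
            convention (data plus padding count) are assumptions made
            for the example.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into bytes, earliest bit first.

    Earlier bits become the higher-order bits of each byte
    ("big-endian" bit order).  Returns (data, padding), where padding
    is the number of zero bits appended to reach an 8-bit boundary.
    """
    bits = list(bits)
    padding = (-len(bits)) % 8          # zero bits needed to fill the last byte
    bits.extend([0] * padding)
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | (bit & 1)   # shift earlier bits upward
        out.append(byte)
    return bytes(out), padding
```

            The padding count returned here is the kind of value a
            sender could record in the "padding" parameter.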

            The encoding mechanisms defined here explicitly  encode  all
            data  in  ASCII.   Thus,  for example, suppose an entity has
            header fields such as:

                 Content-Type: text/plain; charset=ISO-8859-1
                 Content-transfer-encoding: base64

            This should be interpreted to mean that the body is a base64
            ASCII  encoding  of  data that was originally in ISO-8859-1,
            and will be in that character set again after decoding.
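
            As an illustration of this interpretation, the sketch below
            encodes ISO-8859-1 text as base64 and restores it.  The
            helper names encode_body and decode_body are hypothetical;
            only the Python standard library base64 module is used.

```python
import base64


def encode_body(text, charset="iso-8859-1"):
    """Convert text to octets in the given charset, then to base64 ASCII."""
    raw = text.encode(charset)
    return base64.b64encode(raw)


def decode_body(body, charset="iso-8859-1"):
    """Reverse encode_body: base64 ASCII back to text in the charset."""
    return base64.b64decode(body).decode(charset)


body = encode_body("caf\u00e9")     # "café"; the last character is octet 0xE9
assert body == b"Y2Fm6Q=="          # the encoded body is pure ASCII
assert decode_body(body) == "caf\u00e9"
```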

            The following sections will define the two standard encoding
            mechanisms.    The   definition   of  new  content-transfer-
            encodings is explicitly discouraged and  should  only  occur
            when absolutely necessary.  The entire namespace of
            content-transfer-encoding names, except those beginning
            with "X-", is explicitly reserved to the IANA for future
            use.  Private agreements about content-transfer-encodings
            are also explicitly discouraged.

            Certain Content-Transfer-Encoding values may only be used on
            certain  Content-Types.   In  particular,  it  is  expressly
            forbidden to use any encodings other than "7bit", "8bit", or
            "binary"  with  any  Content-Type  that recursively includes
            other Content-Type  fields,   notably  the  "multipart"  and



            Borenstein & Freed                                 [Page 12]




            RFC 1341    MIME: Multipurpose Internet Mail Extensions    June 1992


            "message" Content-Types.  All encodings that are desired for
            bodies of type multipart or message  must  be  done  at  the
            innermost  level,  by encoding the actual body that needs to
            be encoded.
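
            A user agent might check this restriction roughly as shown
            below.  This is a sketch under stated assumptions: the
            function name encoding_permitted is hypothetical, and the
            check looks only at the primary type of the Content-Type
            value.

```python
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}
COMPOSITE_TYPES = {"multipart", "message"}


def encoding_permitted(content_type, transfer_encoding):
    """Return True if the encoding may be used with the content type.

    Types that recursively contain other Content-Type fields
    ("multipart" and "message") may only carry "7bit", "8bit",
    or "binary".
    """
    primary = content_type.split("/", 1)[0].strip().lower()
    encoding = transfer_encoding.strip().lower()
    if primary in COMPOSITE_TYPES:
        return encoding in IDENTITY_ENCODINGS
    return True
```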

            NOTE  ON  ENCODING  RESTRICTIONS:   Though  the  prohibition
            against  using  content-transfer-encodings  on  data of type
            multipart or message may  seem  overly  restrictive,  it  is
            necessary  to  prevent  nested  encodings, in which data are
            passed through an encoding  algorithm  multiple  times,  and
            must  be  decoded  multiple  times  in  order to be properly
            viewed.  Nested encodings  add  considerable  complexity  to
            user  agents:   aside  from  the obvious efficiency problems
            with such multiple encodings, they  can  obscure  the  basic
            structure  of a message.  In particular, they can imply that
            several decoding operations are necessary simply to find out
            what  types  of  objects a message contains.  Banning nested
            encodings may complicate the job of certain  mail  gateways,
            but  this  seems less of a problem than the effect of nested
            encodings on user agents.

            NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE  AND  CONTENT-
            TRANSFER-ENCODING:   It  may seem that the Content-Transfer-
            Encoding could be inferred from the characteristics  of  the
            Content-Type  that  is to be encoded, or, at the very least,
            that certain Content-Transfer-Encodings  could  be  mandated
            for  use  with  specific  Content-Types.  There  are several
            reasons why this is not the case. First, given  the  varying
            types  of  transports  used  for mail, some encodings may be
            appropriate for some Content-Type/transport combinations and
            not  for  others.  (For  example, in an  8-bit transport, no
            encoding would be required for  text  in  certain  character
            sets,  while  such  encodings are clearly required for 7-bit
            SMTP.)  Second, certain Content-Types may require  different
            types  of  transfer  encoding under different circumstances.
            For example, many PostScript bodies might  consist  entirely
            of  short lines of 7-bit data and hence require little or no
            encoding. Other PostScript bodies  (especially  those  using
            Level  2 PostScript's binary encoding mechanism) may only be
            reasonably represented using a  binary  transport  encoding.
            Finally,  since Content-Type is intended to be an open-ended
            specification  mechanism,   strict   specification   of   an
            association  between Content-Types and encodings effectively
            couples the specification of an application protocol with  a
            specific  lower-level transport. This is not desirable since
            the developers of a Content-Type should not have to be aware
            of all the transports in use and what their limitations are.

            NOTE ON TRANSLATING  ENCODINGS:   The  quoted-printable  and
            base64  encodings  are  designed  so that conversion between
            them is possible. The only  issue  that  arises  in  such  a
            conversion  is  the handling of line breaks. When converting
            from  quoted-printable  to  base64  a  line  break  must  be
            converted  into  a CRLF sequence. Similarly, a CRLF sequence
            in base64 data should be  converted  to  a  quoted-printable
            line break, but ONLY when converting text data.
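
            A conversion from quoted-printable to base64 for text data
            could look like the sketch below.  It relies on the Python
            standard library modules quopri and base64; the function
            name qp_to_base64 is hypothetical, and the code assumes the
            body is text, so every decoded line break is rewritten as
            CRLF.

```python
import base64
import quopri
import re


def qp_to_base64(qp_body):
    """Re-encode a quoted-printable text body (bytes) as base64."""
    decoded = quopri.decodestring(qp_body)
    # Assumption: text data, so each line break becomes a CRLF sequence
    # before base64 encoding.
    canonical = re.sub(rb"\r?\n", b"\r\n", decoded)
    return base64.encodebytes(canonical)
```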

            NOTE  ON  CANONICAL  ENCODING  MODEL:     There   was   some
            confusion,  in  earlier  drafts  of this memo, regarding the
            model for when email data was to be converted  to  canonical
            form  and  encoded, and in particular how this process would
            affect the treatment of CRLFs, given that the representation
            of  newlines  varies greatly from system to system. For this
            reason, a canonical  model  for  encoding  is  presented  as
            Appendix H.

            5.1  Quoted-Printable Content-Transfer-Encoding

            The Quoted-Printable encoding is intended to represent  data
            that largely consists of octets that correspond to printable
            characters in the ASCII character set.  It encodes the  data
            in  such  a way that the resulting octets are unlikely to be
            modified by mail transport.  If the data being  encoded  are
            mostly  ASCII  text,  the  encoded  form of the data remains
            largely recognizable by humans.  A body  which  is  entirely
            ASCII  may also be encoded in Quoted-Printable to ensure the
            integrity of the data should  the  message  pass  through  a
            character-translating and/or line-wrapping gateway.

            In this encoding, octets are to be represented as determined
            by the following rules:

                 Rule #1:  (General  8-bit  representation)  Any  octet,
                 except  those  indicating a line break according to the
                 newline convention of the canonical form  of  the  data
                 being encoded, may be represented by an "=" followed by
                 a two digit hexadecimal representation of  the  octet's
                 value. The digits of the hexadecimal alphabet, for this
                 purpose, are "0123456789ABCDEF". Uppercase letters must
                 be used when sending hexadecimal data, though a robust
                 implementation   may   choose  to  recognize  lowercase
                 letters on receipt. Thus, for  example,  the  value  12
                 (ASCII  form feed) can be represented by "=0C", and the
                 value 61 (ASCII  EQUAL  SIGN)  can  be  represented  by
                 "=3D".   Except  when  the  following  rules  allow  an
                 alternative encoding, this rule is mandatory.
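
                 Rule #1 can be sketched as a pair of helpers.  The
                 names encode_octet and decode_octet are hypothetical;
                 the decoder accepts lowercase digits, which the rule
                 allows a robust implementation to do.

```python
HEX_DIGITS = "0123456789ABCDEF"


def encode_octet(octet):
    """Represent one octet as "=" followed by two uppercase hex digits."""
    if not 0 <= octet <= 255:
        raise ValueError("octet out of range")
    return "=" + HEX_DIGITS[octet >> 4] + HEX_DIGITS[octet & 0x0F]


def decode_octet(token):
    """Decode an "=XX" token, tolerating lowercase hex digits."""
    digits = token[1:].upper()
    if len(token) != 3 or token[0] != "=":
        raise ValueError("not an =XX token")
    if any(c not in HEX_DIGITS for c in digits):
        raise ValueError("invalid hexadecimal digit")
    return int(digits, 16)
```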

                 Rule #2: (Literal representation) Octets  with  decimal
                 values  of 33 through 60 inclusive, and 62 through 126,
                 inclusive, MAY be represented as the  ASCII  characters
                 which  correspond  to  those  octets (EXCLAMATION POINT
                 through LESS THAN,  and  GREATER  THAN  through  TILDE,
                 respectively).

                 Rule #3: (White Space): Octets with values of 9 and  32
                 MAY   be  represented  as  ASCII  TAB  (HT)  and  SPACE
                 characters,  respectively,   but   MUST   NOT   be   so
                 represented at the end of an encoded line. Any TAB (HT)
                 or SPACE characters on an encoded  line  MUST  thus  be
                 followed  on  that  line  by a printable character.  In
                 particular, an "=" at  the  end  of  an  encoded  line,
                 indicating  a  soft line break (see rule #5) may follow
                 one or more TAB (HT) or SPACE characters.   It  follows
                 that  an  octet with value 9 or 32 appearing at the end
                 of an encoded line must  be  represented  according  to
                 Rule  #1.  This  rule  is  necessary  because some MTAs
                 (Message Transport  Agents,  programs  which  transport
                 messages from one user to another, or perform a part of
                 such transfers) are known to pad  lines  of  text  with
                 SPACEs,  and  others  are known to remove "white space"
                 characters from the end  of  a  line.  Therefore,  when
                 decoding  a  Quoted-Printable  body, any trailing white
                 space on a line must be deleted, as it will necessarily
                 have been added by intermediate transport agents.
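
                 Rules #1 through #3 can be combined for a single line
                 as in the sketch below.  It is illustrative only: the
                 function names are hypothetical, the input is one line
                 of octets without its line break, and the 76 character
                 limit of Rule #5 is not applied.

```python
def encode_line(data):
    """Encode one line of octets (bytes, no line break) per Rules #1-#3."""
    out = []
    last = len(data) - 1
    for i, octet in enumerate(data):
        # Rule #2: printable octets other than "=" may appear literally.
        literal = 33 <= octet <= 60 or 62 <= octet <= 126
        # Rule #3: TAB and SPACE may appear literally, except at line end.
        if octet in (9, 32) and i != last:
            literal = True
        # Rule #1: everything else becomes "=XX" with uppercase hex digits.
        out.append(chr(octet) if literal else "=%02X" % octet)
    return "".join(out)


def strip_transport_padding(encoded_line):
    """Delete trailing white space, which can only come from transport."""
    return encoded_line.rstrip(" \t")
```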

                 Rule #4 (Line Breaks): A line  break  in  a  text  body
                 part,   independent   of  what  its  representation  is
                 following the  canonical  representation  of  the  data
                 being  encoded, must be represented by a (RFC 822) line
                 break,  which  is  a  CRLF  sequence,  in  the  Quoted-
                 Printable  encoding.  If isolated CRs and LFs, or LF CR
                 and CR LF sequences are allowed  to  appear  in  binary
                 data  according  to  the  canonical  form, they must be
                 represented   using  the  "=0D",  "=0A",  "=0A=0D"  and
                 "=0D=0A" notations respectively.

                 Note that many implementations may elect to encode the
                 local representation of various content types directly.
                 In particular, this may apply to plain text material on
                 systems  that  use  newline conventions other than CRLF
                 delimiters. Such an implementation is permissible,  but
                 the  generation  of  line breaks must be generalized to
                 account for the case where alternate representations of
                 newline sequences are used.
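
                 The distinction drawn by Rule #4 can be sketched as
                 follows.  The function name and parameters are
                 assumptions for illustration; the text branch assumes
                 the input uses only the given local newline sequence,
                 and the binary branch handles only CR and LF, leaving
                 all other octets to the remaining rules.

```python
def encode_line_breaks(data, is_text, local_newline=b"\n"):
    """Apply the Rule #4 treatment of line breaks to a bytes value."""
    if is_text:
        # A text line break is sent as an RFC 822 line break (CRLF).
        return data.replace(local_newline, b"\r\n")
    # In binary data, CR and LF are ordinary octets and must be escaped.
    return data.replace(b"\r", b"=0D").replace(b"\n", b"=0A")
```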

                 Rule  #5  (Soft  Line  Breaks):  The   Quoted-Printable
                 encoding REQUIRES that encoded lines be no more than 76
                 characters long. If longer lines are to be encoded with
                 the  Quoted-Printable encoding, 'soft' line breaks must
                 be used. An equal sign as the last character on an
                 encoded  line indicates such a non-significant ('soft')
                 line break in the encoded text. Thus if the "raw"  form
                 of the line is a single unencoded line that says:

                      Now's the time for all folk to come to the aid of
                      their country.

                 This  can  be  represented,  in  the   Quoted-Printable
                 encoding, as

                      Now's the time =
                      for all folk to come=
                       to the aid of their country.

                 This provides a mechanism with  which  long  lines  are
                 encoded  in  such  a  way as to be restored by the user
                 agent.  The 76  character  limit  does  not  count  the
                 trailing   CRLF,   but  counts  all  other  characters,
                 including any equal signs.
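
                 Soft line breaks can be inserted and removed as in the
                 sketch below.  The names soft_wrap and soft_unwrap are
                 hypothetical.  The input to soft_wrap is assumed to be
                 one line that is already encoded, so that "=" appears
                 only at the start of an "=XX" sequence.

```python
def soft_wrap(encoded, limit=76):
    """Split one encoded line into lines of at most `limit` characters."""
    lines = []
    while len(encoded) > limit:
        cut = limit - 1                  # leave room for the trailing "="
        # Never cut in the middle of an "=XX" sequence.
        if encoded[cut - 1] == "=":
            cut -= 1
        elif encoded[cut - 2] == "=":
            cut -= 2
        lines.append(encoded[:cut] + "=")
        encoded = encoded[cut:]
    lines.append(encoded)
    return lines


def soft_unwrap(lines):
    """Rejoin the lines of one logical line, dropping soft line breaks."""
    return "".join(l[:-1] if l.endswith("=") else l for l in lines)
```

                 Applied to the three encoded lines of the example
                 above, soft_unwrap restores the original single line.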