rfc2047.txt

Network Working Group                                           K. Moore
Request for Comments: 2047                       University of Tennessee
Obsoletes: 1521, 1522, 1590                                November 1996
Category: Standards Track


        MIME (Multipurpose Internet Mail Extensions) Part Three:
              Message Header Extensions for Non-ASCII Text

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Abstract

   STD 11, RFC 822, defines a message representation protocol specifying
   considerable detail about US-ASCII message headers, and leaves the
   message content, or message body, as flat US-ASCII text.  This set of
   documents, collectively called the Multipurpose Internet Mail
   Extensions, or MIME, redefines the format of messages to allow for

   (1) textual message bodies in character sets other than US-ASCII,

   (2) an extensible set of different formats for non-textual message
       bodies,

   (3) multi-part message bodies, and

   (4) textual header information in character sets other than US-ASCII.

   These documents are based on earlier work documented in RFC 934, STD
   11, and RFC 1049, but extend and revise them.  Because RFC 822 said
   so little about message bodies, these documents are largely
   orthogonal to (rather than a revision of) RFC 822.

   This particular document is the third document in the series.
   It describes extensions to RFC 822 to allow non-US-ASCII text data in
   Internet mail header fields.

Moore                       Standards Track                     [Page 1]

RFC 2047               Message Header Extensions           November 1996

   Other documents in this series include:

   + RFC 2045, which specifies the various headers used to describe
     the structure of MIME messages,

   + RFC 2046, which defines the general structure of the MIME media
     typing system and defines an initial set of media types,

   + RFC 2048, which specifies various IANA registration procedures
     for MIME-related facilities, and

   + RFC 2049, which describes MIME conformance criteria and
     provides some illustrative examples of MIME message formats,
     acknowledgements, and the bibliography.

   These documents are revisions of RFCs 1521, 1522, and 1590, which
   themselves were revisions of RFCs 1341 and 1342.  An appendix in RFC
   2049 describes differences and changes from previous versions.

1. Introduction

   RFC 2045 describes a mechanism for denoting textual body parts which
   are coded in various character sets, as well as methods for encoding
   such body parts as sequences of printable US-ASCII characters.  This
   memo describes similar techniques to allow the encoding of non-ASCII
   text in various portions of an RFC 822 [2] message header, in a
   manner which is unlikely to confuse existing message handling
   software.

   Like the encoding techniques described in RFC 2045, the techniques
   outlined here were designed to allow the use of non-ASCII characters
   in message headers in a way which is unlikely to be disturbed by the
   quirks of existing Internet mail handling programs.
   In particular, some mail relaying programs are known to (a) delete
   some message header fields while retaining others, (b) rearrange the
   order of addresses in To or Cc fields, (c) rearrange the (vertical)
   order of header fields, and/or (d) "wrap" message headers at
   different places than those in the original message.  In addition,
   some mail reading programs are known to have difficulty correctly
   parsing message headers which, while legal according to RFC 822,
   make use of backslash-quoting to "hide" special characters such as
   "<", ",", or ":", or which exploit other infrequently-used features
   of that specification.

   While it is unfortunate that these programs do not correctly
   interpret RFC 822 headers, to "break" these programs would cause
   severe operational problems for the Internet mail system.  The
   extensions described in this memo therefore do not rely on
   little-used features of RFC 822.

   Instead, certain sequences of "ordinary" printable ASCII characters
   (known as "encoded-words") are reserved for use as encoded data.  The
   syntax of encoded-words is such that they are unlikely to
   "accidentally" appear as normal text in message headers.
   Furthermore, the characters used in encoded-words are restricted to
   those which do not have special meanings in the context in which the
   encoded-word appears.

   Generally, an "encoded-word" is a sequence of printable ASCII
   characters that begins with "=?", ends with "?=", and has two "?"s in
   between.  It specifies a character set and an encoding method, and
   also includes the original text encoded as graphic ASCII characters,
   according to the rules for that encoding method.

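   The general shape described above ("=?" at the start, "?=" at the
   end, two "?"s in between) can be illustrated with a minimal Python
   sketch.  The function name split_encoded_word is hypothetical and
   not part of this memo; the sketch only separates the three fields
   and does not validate them against the full grammar.

```python
def split_encoded_word(word: str) -> tuple[str, str, str]:
    """Split '=?charset?encoding?encoded-text?=' into its three fields.

    Illustrative only: checks the outer delimiters and the two inner
    '?' separators, nothing more.
    """
    if not (word.startswith("=?") and word.endswith("?=")):
        raise ValueError("missing '=?' ... '?=' delimiters")
    # Drop the leading '=?' and the trailing '?=' before splitting.
    parts = word[2:-2].split("?")
    if len(parts) != 3:
        raise ValueError("expected exactly two '?' between the delimiters")
    charset, encoding, encoded_text = parts
    return charset, encoding, encoded_text


fields = split_encoded_word("=?iso-8859-1?q?this=20is=20some=20text?=")
print(fields)
```

   Splitting on "?" is safe here only because the encoded text itself
   may not contain a "?" character.
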
   A mail composer that implements this specification will provide a
   means of inputting non-ASCII text in header fields, but will
   translate these fields (or appropriate portions of these fields) into
   encoded-words before inserting them into the message header.

   A mail reader that implements this specification will recognize
   encoded-words when they appear in certain portions of the message
   header.  Instead of displaying the encoded-word "as is", it will
   reverse the encoding and display the original text in the designated
   character set.

NOTES

   This memo relies heavily on notation and terms defined in RFC 822
   and RFC 2045.  In particular, the syntax for the ABNF used in this
   memo is defined in RFC 822, and many of the terminal or nonterminal
   symbols from RFC 822 are used in the grammar for the header
   extensions defined here.  Among the symbols defined in RFC 822 and
   referenced in this memo are: 'addr-spec', 'atom', 'CHAR', 'comment',
   'CTLs', 'ctext', 'linear-white-space', 'phrase', 'quoted-pair',
   'quoted-string', 'SPACE', and 'word'.  Successful implementation of
   this protocol extension requires careful attention to the RFC 822
   definitions of these terms.

   When the term "ASCII" appears in this memo, it refers to the "7-Bit
   American Standard Code for Information Interchange", ANSI X3.4-1986.
   The MIME charset name for this character set is "US-ASCII".  When not
   specifically referring to the MIME charset name, this document uses
   the term "ASCII", both for brevity and for consistency with RFC 822.
   However, implementors are warned that the character set name must be
   spelled "US-ASCII" in MIME message and body part headers.

   This memo specifies a protocol for the representation of non-ASCII
   text in message headers.
   It specifically DOES NOT define any translation between "8-bit
   headers" and pure ASCII headers, nor is any such translation assumed
   to be possible.

2. Syntax of encoded-words

   An 'encoded-word' is defined by the following ABNF grammar.  The
   notation of RFC 822 is used, with the exception that white space
   characters MUST NOT appear between components of an 'encoded-word'.

   encoded-word = "=?" charset "?" encoding "?" encoded-text "?="

   charset = token    ; see section 3

   encoding = token   ; see section 4

   token = 1*<Any CHAR except SPACE, CTLs, and especials>

   especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / <"> /
               "/" / "[" / "]" / "?" / "." / "="

   encoded-text = 1*<Any printable ASCII character other than "?"
                     or SPACE>
                  ; (but see "Use of encoded-words in message
                  ; headers", section 5)

   Both 'encoding' and 'charset' names are case-independent.  Thus the
   charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the
   encoding named "Q" may be spelled either "Q" or "q".

   An 'encoded-word' may not be more than 75 characters long, including
   'charset', 'encoding', 'encoded-text', and delimiters.  If it is
   desirable to encode more text than will fit in an 'encoded-word' of
   75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may
   be used.

   While there is no limit to the length of a multiple-line header
   field, each line of a header field that contains one or more
   'encoded-word's is limited to 76 characters.

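   As an illustration of the grammar and the 75-character limit above,
   the following Python sketch builds a regular expression from the
   'token' and 'encoded-text' definitions.  The names ESPECIALS,
   ENCODED_WORD, and parse_encoded_word are hypothetical; lower-casing
   the charset and encoding is merely one way to reflect their
   case-independence.  The further restrictions of section 5 are not
   applied.

```python
import re

# 'especials' from the grammar; these may not appear in a token.
ESPECIALS = '()<>@,;:"/[]?.='

# token: any CHAR except SPACE, CTLs, and especials, i.e. printable
# ASCII 33..126 minus the especials.
TOKEN_CHARS = "".join(
    chr(c) for c in range(33, 127) if chr(c) not in ESPECIALS
)
TOKEN = "[" + re.escape(TOKEN_CHARS) + "]+"

# encoded-text: printable ASCII other than "?" or SPACE
# ("!" through ">" and "@" through "~").
ENCODED_TEXT = r"[!->@-~]+"

ENCODED_WORD = re.compile(
    r"=\?(?P<charset>" + TOKEN + r")"
    r"\?(?P<encoding>" + TOKEN + r")"
    r"\?(?P<text>" + ENCODED_TEXT + r")\?="
)

MAX_ENCODED_WORD_LEN = 75


def parse_encoded_word(word: str):
    """Return (charset, encoding, encoded_text) or None if not valid."""
    if len(word) > MAX_ENCODED_WORD_LEN:
        return None
    m = ENCODED_WORD.fullmatch(word)
    if m is None:
        return None
    # charset and encoding names are case-independent; normalise them.
    return m.group("charset").lower(), m.group("encoding").lower(), m.group("text")


print(parse_encoded_word("=?ISO-8859-1?Q?this=20is=20some=20text?="))
print(parse_encoded_word("=?iso-8859-1?q?this is some text?="))
```

   The second call contains unencoded SPACE characters, so it does not
   match the grammar and the sketch rejects it.
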
   The length restrictions are included both to ease interoperability
   through internetwork mail gateways, and to impose a limit on the
   amount of lookahead a header parser must employ (while looking for a
   final ?= delimiter) before it can decide whether a token is an
   "encoded-word" or something else.

   IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's
   by an RFC 822 parser.  As a consequence, unencoded white space
   characters (such as SPACE and HTAB) are FORBIDDEN within an
   'encoded-word'.  For example, the character sequence

      =?iso-8859-1?q?this is some text?=

   would be parsed as four 'atom's, rather than as a single 'atom' (by
   an RFC 822 parser) or 'encoded-word' (by a parser which understands
   'encoded-words').  The correct way to encode the string "this is some
   text" is to encode the SPACE characters as well, e.g.

      =?iso-8859-1?q?this=20is=20some=20text?=

   The characters which may appear in 'encoded-text' are further
   restricted by the rules in section 5.

3. Character sets

   The 'charset' portion of an 'encoded-word' specifies the character
   set associated with the unencoded text.  A 'charset' can be any of
   the character set names allowed in a MIME "charset" parameter of a
   "text/plain" body part, or any character set name registered with
   IANA for use with the MIME text/plain content-type.

   Some character sets use code-switching techniques to switch between
   "ASCII mode" and other modes.  If unencoded text in an 'encoded-word'
   contains a sequence which causes the charset interpreter to switch
   out of ASCII mode, it MUST contain additional control codes such that
   ASCII mode is again selected at the end of the 'encoded-word'.  (This
   rule applies separately to each 'encoded-word', including adjacent
   'encoded-word's within a single header field.)

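   The code-switching rule can be illustrated with ISO-2022-JP, which
   is used here only as an assumed example of such a character set;
   this memo does not single it out.  In that charset the escape
   sequence ESC "(" "B" selects ASCII.  The helper name
   ends_in_ascii_mode is hypothetical, and the check is a sketch that
   looks only at the last escape sequence in the unencoded bytes.

```python
ESC = b"\x1b"
ASCII_SHIFT = ESC + b"(B"  # ISO-2022-JP escape sequence selecting ASCII


def ends_in_ascii_mode(raw: bytes) -> bool:
    """True if the last escape sequence in raw (if any) selects ASCII.

    Text with no escape sequence at all never left ASCII mode.
    """
    last = raw.rfind(ESC)
    if last == -1:
        return True
    return raw[last:last + 3] == ASCII_SHIFT


# Python's iso2022_jp codec appends the shift back to ASCII when it
# finishes encoding, so a complete encode of one word passes the check.
complete = "\u65e5\u672c\u8a9e".encode("iso2022_jp")
print(ends_in_ascii_mode(complete))

# Cutting off the trailing escape sequence leaves the text in a
# non-ASCII mode, which the rule above does not permit.
truncated = complete[:-len(ASCII_SHIFT)]
print(ends_in_ascii_mode(truncated))
```

   Because the rule applies to each 'encoded-word' separately, a
   composer that splits long text across several 'encoded-word's would
   need to encode each piece on its own rather than cut one encoded
   byte string at arbitrary points.
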
   When there is a possibility of using more than one character set to
   represent the text in an 'encoded-word', and in the absence of
   private agreements between sender and recipients of a message, it is
   recommended that members of the ISO-8859-* series be used in
   preference to other character sets.

4. Encodings

   Initially, the legal values for "encoding" are "Q" and "B".  These
   encodings are described below.  The "Q" encoding is recommended for
   use when most of the characters to be encoded are in the ASCII
   character set; otherwise, the "B" encoding should be used.
   Nevertheless, a mail reader which claims to recognize 'encoded-word's
   MUST be able to accept either encoding for any character set which it
   supports.
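
   The recommendation above can be sketched in Python as follows.  The
   function names choose_encoding and decode_word are hypothetical, and
   reading "most" as "more than half" is an assumption made for this
   sketch, not a threshold given in this memo.  Decoding uses
   decode_header from the Python standard library's email.header
   module, which accepts both encodings.

```python
import base64
from email.header import decode_header


def choose_encoding(text: str) -> str:
    """Pick "Q" when more than half of the characters are ASCII, else "B".

    The one-half threshold is an assumption for illustration.
    """
    ascii_count = sum(1 for ch in text if ord(ch) < 128)
    return "Q" if ascii_count * 2 > len(text) else "B"


def decode_word(word: str) -> str:
    """Decode a single encoded-word (Q or B) to text.

    Only meant for input that is exactly one encoded-word.
    """
    raw, charset = decode_header(word)[0]
    return raw.decode(charset)


print(choose_encoding("Andr\u00e9 Pirard"))      # mostly ASCII
print(choose_encoding("\u65e5\u672c\u8a9e"))    # no ASCII at all

# The same text carried in each encoding decodes to the same string.
q_word = "=?iso-8859-1?q?this=20is=20some=20text?="
b_payload = base64.b64encode(b"this is some text").decode("ascii")
b_word = "=?iso-8859-1?b?" + b_payload + "?="
print(decode_word(q_word) == decode_word(b_word))
```
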