rfc1345.txt

Network Working Group                                        K. Simonsen
Request for Comments: 1345                   Rationel Almen Planlaegning
                                                               June 1992


                  Character Mnemonics & Character Sets

Status of the Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard.  Distribution of this memo is
   unlimited.

Summary

   This memo lists a selection of characters and their presence in some
   coded character sets. To facilitate the coded character set
   tabulations an unambiguous mnemonic for each character is used, and a
   format for tabulating the coded character sets is defined. The coded
   character sets are given names for easy reference. A family of coded
   character sets called the mnemonic character sets and conversion
   between these coded character sets without information loss are
   defined.

   The character set names are registered with the Internet Assigned
   Numbers Authority (IANA).  Additional character sets not described in
   this memo should be registered with the IANA. This memo may be
   updated periodically, or additional specifications may be published,
   to reflect other coded character sets.

   Please send any comments including comments about the accuracy of the
   tables to the author, Keld Simonsen <Keld.Simonsen@dkuug.dk>.

1.  INTRODUCTION

   With the growing internationalization of the Internet, support for
   many coded character sets is required. It is the intention of this
   memo to document precisely the mapping between all characters and
   their corresponding coded representations in various coded character
   sets, and give names to these coded character sets, so they can be
   referenced unambiguously in Internet standards.

   This memo does not indicate anything about the validity of using
   these specifications in any Internet standard, so you should consult
   each individual Internet standard to see which coded character sets
   and names are allowed there.

   Unambiguous character mnemonics are specified, which provide a
   practical way of identifying a character, without reference to a
   coded character set and its code in this coded character set.  The
   mnemonics are written in a minimal set of characters, namely the
   invariant 83 graphical characters of ISO 646, which is a kind of
   greatest common subset to be found between the majority of coded
   character sets, including ASCII, national variants of the ISO 646
   7-bit character set and various EBCDICs.  In addition, the numeric
   values of the coded representations of all these characters are the
   same in all coded character sets compatible with ISO standards.  All
   of them except two, EXCLAMATION MARK and QUOTATION MARK, have the
   same coded representation in all variants of EBCDIC.  This minimal
   set of characters is called the reference character set in this memo.

   The mnemonics can be used in Internet standards for easy and
   unambiguous reference, and they can also serve as a fallback
   representation in various Internet specifications.

   The coded character sets covered include all parts of ISO 8859, ISO
   6937-2 and all ISO 646 conforming coded character sets in the ISO
   character set registry managed by ECMA according to ISO 2375.  Almost
   all graphic coded character sets in the ECMA registry (1) are
   covered.  The graphic coded character sets not included are registry
   numbers 31, 38, 39, 53, 59, 68, 71, 72, 129 and 137.
   In addition many vendor defined character sets are covered, including
   PC codepages (4), (7), (8), many EBCDIC character sets (4), (5), (6)
   and HP, DEC and Apple character sets (8), (9), (10), (13), (14).  The
   East-Asian 16-bit character sets from the ECMA registry are also
   included in this memo.

2.  CHARACTER MNEMONICS

2.1  General Syntax

   The character mnemonics are taken from the ISO committee draft (CD)
   of the POSIX.2 standard (3).  They are classified into two groups:

   1. A group with two-character mnemonics

      - Primarily intended for alphabetic scripts like Latin, Greek,
        Cyrillic, Hebrew and Arabic, and special characters.

   2. A group with variable-length mnemonics

      - Primarily intended for non-alphabetic scripts like Japanese and
        Chinese, but also used for some accented letters and special
        characters.

   In the two-character mnemonics, all invariant graphic characters in
   the ISO 646 character codes except "&" are used, i.e. the following
   characters:

           ! "     %   ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
             A B C D E F G H I J K L M N O P Q R S T U V W X Y Z       _
             a b c d e f g h i j k l m n o p q r s t u v w x y z

   The character "_" is not used as the first character.

   In the variable-length mnemonics, the character "_" is not used as
   the first character. If it is used in a name, its presence is
   doubled.

   The mnemonics can be used in several different ways for different
   purposes.  One of these is description of coded character sets, which
   is detailed in section 3.  Another is for extending a given coded
   character set to a mnemonic character set.  This is described in
   section 4.
   The restrictions on the use of the characters "&" and "_" are due to
   demands of the compositional methods of these techniques.

2.2  ISO Official Long Descriptive Character Name

   For each mnemonic, the character for which it stands is indicated in
   the following table by a long descriptive name.  This name is
   identical to the ISO name of the character as given in reference (2).
   For a few characters that are not included there, descriptive names
   of the same kind are introduced in this memo.  The source of each
   character is stated in the table after the name and should be
   consulted for a reliable identification of the character.

   These long descriptive names consist only of the capital Latin
   letters of the invariant part of ISO 646, the digits, "-", and SPACE.
   Digits are only used in names of ideographic and Hangul characters
   and never as the first character.

2.3  The 2-character Mnemonics

   The two-character mnemonics include various accented Latin letters,
   Greek, Cyrillic, Hebrew, Arabic, Hiragana and Katakana.  Also a fair
   number of special characters are included.  Almost all ISO or ISO
   registered 7- and 8-bit graphical coded character sets are covered
   with these two-character mnemonics.

   The two characters are chosen so the graphical appearance in the
   reference set resembles as much as possible (within the possibilities
   available) the graphical appearance of the character. The basic
   character set of ISO 646 is used as the reference set, as mentioned
   above.

   The characters in the reference character set are chosen to represent
   themselves.

   For control characters from ISO 646 the two-character acronyms of ISO
   2047 are used as mnemonics.  For the other control characters of ISO
   6429, two-character mnemonics have been selected based on the
   variable-length acronyms used in that standard.

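   The character restrictions that section 2.2 places on long
   descriptive names lend themselves to a mechanical check.  The
   following is a minimal sketch, not part of the memo: the function
   name is_valid_long_name is hypothetical, and the pattern encodes only
   the two constraints stated there (allowed characters, no leading
   digit).  It does not check that digits appear solely in ideographic
   or Hangul names.

```python
import re

# Assumption: a name is non-empty, uses only A-Z, 0-9, "-" and SPACE,
# and does not start with a digit (the two rules given in section 2.2).
_LONG_NAME_RE = re.compile(r"[A-Z -][A-Z0-9 -]*")


def is_valid_long_name(name: str) -> bool:
    """Return True if `name` satisfies the character rules of section 2.2."""
    return _LONG_NAME_RE.fullmatch(name) is not None
```

   Under these assumptions, names such as "EXCLAMATION MARK" and
   "HYPHEN-MINUS" pass, while a name with lowercase letters or a leading
   digit is rejected.
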
   Letters, including Greek, Cyrillic, Arabic and Hebrew, are
   represented with the base letter as the first letter, and the second
   letter represents an accent or relation to a non-Latin script.
   Non-Latin letters are transliterated to Latin letters, following
   transliteration standards as closely as possible.  This is also done
   with the Latin letters such as ETH and THORN, and the
   Danish/Norwegian/Swedish letter A WITH RING ABOVE is transliterated
   into "aa".

   After a letter, the second character signifies the following:

     Exclamation mark           ! Grave
     Apostrophe                 ' Acute accent
     Greater-Than sign          > Circumflex accent
     Question Mark              ? Tilde
     Hyphen-Minus               - Macron
     Left parenthesis           ( Breve
     Full Stop                  . Dot Above
     Colon                      : Diaeresis
     Comma                      , Cedilla
     Underline                  _ Underline
     Solidus                    / Stroke
     Quotation mark             " Double acute accent
     Semicolon                  ; Ogonek
     Less-Than sign             < Caron
     Zero                       0 Ring above
     Two                        2 Hook
     Nine                       9 Horn
     Equals                     = Cyrillic
     Asterisk                   * Greek
     Percent sign               % Greek/Cyrillic special
     Plus                       + smalls: Arabic, capitals: Hebrew
     Three                      3 some Latin/Greek/Cyrillic letters
     Four                       4 Bopomofo
     Five                       5 Hiragana
     Six                        6 Katakana

   In designing the mnemonics the following special characters were
   reserved: The ampersand is reserved as an intro character, indicating
   that the following string is in the mnemonic character set.
   The underline character is reserved for the variable-length
   mnemonics.  This use does not eliminate usage as an accent or
   language identifier.

   Special characters are encoded with some mnemonic value.  These are
   not systematic throughout, but most mnemonics start with a related
   special character of the reference set.

2.4  The Variable-length Character Mnemonics

   The Variable-length Character Mnemonics are primarily meant for the
   ideographic characters in larger Asian character sets, but are also
   used for accented characters with several accents and some special
   characters. To keep the mnemonics as short as possible, which both
   saves storage and eases input, a quite short name is preferred.
   In the Chinese standard GB 2312-1980, the Japanese standards JIS
   X0208 and JIS X0212, and the Korean standard KS C 5601, characters
   are all given by row and column numbers between 1 and 94. So two
   positions for row and column and a character set identifier of one
   character would be almost as short as possible.

   The following character set identifiers are defined:

            c   GB 2312-1980
            j   JIS X0208-1990
            J   JIS X0212-1990
            k   KS C 5601-1987

   This system for the representation of ideographic characters and
   Hangul characters is not truly mnemonic, but it provides short
   representations that are easy to connect to the corresponding
   character by means of the code table of an official character set
   standard. Alternative methods based on the graphic appearance or the
   pronunciation of the characters are thought to be unfeasible.

   One prominent character in the reference character set is reserved
   for identifying variable-length mnemonics, namely the underline
   character "_".
   This character is intended as a delimiter both at the front and at
   the end of the mnemonic. An example of its use would be (&=intro):

             &_j3210_ &_j4436_&_j6530_

3.  CHARACTER MNEMONIC TABLE

   The following table contains the character mnemonic and the encoding
   and long descriptive name of ISO 2DIS 10646 (2).  Although the ISO
   10646 is only at DIS stage at this moment of writing and there is
   quite some debate about it, the long descriptive naming in the DIS is
   considered to be stable and the best official ISO reference to
   character names. The 2-octet encoded value of the ISO 2DIS 10646 is
   also used, but only as an identification of the character, and it
   should only be used for identification purposes as the coded
   representation may be changed in the final 10646 international
   standard. Some characters not in the ISO 2DIS 10646 are allocated
   values in the private use zone and given names and references to a
   character set where they are used.

   The format of the table is:

   1st field is the character mnemonic (mostly 2 characters).
   2nd field is the ISO 2DIS 10646 code in hexadecimal.
   3rd field is the long descriptive name of ISO 2DIS 10646.

 SP     0020    SPACE
 !      0021    EXCLAMATION MARK
 "      0022    QUOTATION MARK
 Nb     0023    NUMBER SIGN
 DO     0024    DOLLAR SIGN
 %      0025    PERCENT SIGN
 &      0026    AMPERSAND
 '      0027    APOSTROPHE
 (      0028    LEFT PARENTHESIS
 )      0029    RIGHT PARENTHESIS
 *      002a    ASTERISK
 +      002b    PLUS SIGN
 ,      002c    COMMA
 -      002d    HYPHEN-MINUS
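
   The two notations introduced above, the three-field table row and
   the delimited row/column mnemonic, can both be read with a few lines
   of code.  The following is a rough sketch rather than anything the
   memo defines: the names TableEntry, parse_table_row and
   find_rowcol_mnemonics are hypothetical, and the row/column layout
   (one identifier character followed by two digits of row and two
   digits of column) is an assumption inferred from the "&_j3210_"
   example and the stated range of 1 to 94.

```python
import re
from typing import List, NamedTuple, Tuple


class TableEntry(NamedTuple):
    mnemonic: str  # 1st field: the character mnemonic
    code: int      # 2nd field: the code, parsed from hexadecimal
    name: str      # 3rd field: the long descriptive name


def parse_table_row(line: str) -> TableEntry:
    """Split one mnemonic-table row into its three fields."""
    # Split at most twice so the name, which may contain spaces, stays whole.
    fields = line.split(None, 2)
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields: {line!r}")
    mnemonic, code_hex, name = fields
    return TableEntry(mnemonic, int(code_hex, 16), name.rstrip())


# Assumed layout: intro "&", "_", one set identifier (c, j, J or k),
# two digits of row, two digits of column, closing "_".
_ROWCOL_RE = re.compile(r"&_([cjJk])(\d{2})(\d{2})_")


def find_rowcol_mnemonics(text: str) -> List[Tuple[str, int, int]]:
    """Return (set identifier, row, column) for each mnemonic in `text`."""
    found = []
    for ident, row, col in _ROWCOL_RE.findall(text):
        row_n, col_n = int(row), int(col)
        # The memo states that rows and columns run from 1 to 94.
        if not (1 <= row_n <= 94 and 1 <= col_n <= 94):
            raise ValueError(f"row/column out of range: {ident}{row}{col}")
        found.append((ident, row_n, col_n))
    return found
```

   Applied to the row " SP     0020    SPACE", parse_table_row yields
   the mnemonic "SP", the code 0x20 and the name "SPACE".  Mnemonics
   containing a doubled underline, and variable-length mnemonics that
   are not of the row/column kind, are outside the scope of this sketch.
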