draft-ietf-idn-race-03.txt

two-octet mode using anything other than the logic given in this
section.

2.4.1 Compressing a string

The input string is in big-endian UTF-16 encoding with no byte order
mark.

Design note: No checking is done on the input to this algorithm. It is
assumed that all checking for valid ISO/IEC 10646 characters has already
been done by a previous step in the conversion process.

Design note: 0xFF, used in steps 6 through 8, was chosen as the escape
character because it appears in the fewest scripts in ISO 10646, and
therefore the "escaped escape" will be needed the least. 0x99 was chosen
as the second octet for the "escaped escape" because the character
U+0099 has no value, and is not even used as a control character in the
C1 controls or in ISO 6429.

1) Starting at the beginning of the input, read each pair of octets in
the input stream, comparing the upper octet of each. Reset the input
pointer to the beginning of the input again. If all of the upper octets
(called U1) are the same, go to step 4. Note that if the input is only
one character, this test will always be true.

2) Read each pair of octets in the input stream, comparing the upper
octet of each. Reset the input pointer to the beginning of the input
again. If all of the upper octets are either 0x00 or one single other
value (called U1), go to step 4.

3) Output 0xD8, followed by the entire input stream. Finish.

4) If U1 is in the range 0xD8 to 0xDC, stop with an error. Otherwise,
output U1.

5) If you are at the end of the input string, finish. Otherwise, read
the next octet, called U2, and the octet after that, called N1. If U2 is
0x00 and N1 is 0x99, stop with an error.

6) If U2 is equal to U1, and N1 is not equal to 0xFF, output N1, and go
to step 5.

7) If U2 is equal to U1, and N1 is equal to 0xFF, output 0xFF followed
by 0x99, and go to step 5.

8) Output 0xFF followed by N1. Go to step 5.

2.4.2 Decompressing a string

1) Read the first octet of the input string. Call the value of the first
octet U1.
If there are no more octets in the input string (that is, if
the input string had only one octet total), stop with an error. If U1 is
0xD8, go to step 8.

2) If you are at the end of the input string, go to step 11. Otherwise,
read the next octet in the input string, called N1. If N1 is 0xFF, go to
step 5.

3) If U1 is 0x00 and N1 is 0x99, stop with an error.

4) Put U1 followed by N1 in the output buffer. Go to step 2.

5) If you are at the end of the input string, stop with an error.

6) Read the next octet of the input string, called N1. If N1 is 0x99,
put U1 followed by 0xFF in the output buffer, and go to step 2.

7) Put 0x00 followed by N1 in the output buffer. Go to step 2.

8) Read the rest of the input stream into a temporary string called
LCHECK. If the length of LCHECK is an odd number, stop with an error.

9) Perform the checks from steps 1 and 2 of the compression algorithm in
section 2.4.1 on LCHECK. If either check passes (that is, if either
would have created a compressed string), stop with an error because the
input to the decompression is in the wrong format.

10) If the length of LCHECK is odd, stop with an error. Otherwise,
output LCHECK and finish.

11) If the length of the output buffer is odd, stop with an error.
Otherwise, emit the output buffer and finish.

2.4.3 Compression examples

For the input string of <U+012D><U+0111><U+014B>, all characters are in
the same row, 0x01. Thus, the output is 0x012D114B.

For the input string of <U+012D><U+00E0><U+014B>, the characters are all
in row 0x01 or row 0x00. Thus, the output is 0x012DFFE04B.

For the input string of <U+1290><U+12FF><U+120C>, the characters are all
in row 0x12. Thus, the output is 0x1290FF990C.

For the input string of <U+012D><U+00E0><U+24D3>, the characters are
from two rows other than 0x00. Thus, the output is 0xD8012D00E024D3.

2.5 Base32

In order to encode non-ASCII characters in DNS-compatible host name
parts, they must be converted into legal characters.
This is done with Base32
encoding, described here.

Table 1 shows the mapping between input bits and output characters in
Base32. Design note: the digits used in Base32 are "2" through "7"
instead of "0" through "5" in order to avoid digits "0" and "1". This
helps reduce errors for users who are entering a Base32 stream and may
mistake a "0" for an "O" or a "1" for an "l".

                    Table 1: Base32 conversion

             bits   char  hex         bits   char  hex
             00000   a    0x61        10000   q    0x71
             00001   b    0x62        10001   r    0x72
             00010   c    0x63        10010   s    0x73
             00011   d    0x64        10011   t    0x74
             00100   e    0x65        10100   u    0x75
             00101   f    0x66        10101   v    0x76
             00110   g    0x67        10110   w    0x77
             00111   h    0x68        10111   x    0x78
             01000   i    0x69        11000   y    0x79
             01001   j    0x6a        11001   z    0x7a
             01010   k    0x6b        11010   2    0x32
             01011   l    0x6c        11011   3    0x33
             01100   m    0x6d        11100   4    0x34
             01101   n    0x6e        11101   5    0x35
             01110   o    0x6f        11110   6    0x36
             01111   p    0x70        11111   7    0x37

2.5.1 Encoding octets as Base32

The input is a stream of octets.
However, the octets are then treated
as a stream of bits.

Design note: The assumption that the input is a stream of octets
(instead of a stream of bits) was made so that no padding was needed.
If you are reusing this algorithm for a stream of bits, you must add a
padding mechanism in order to differentiate different lengths of input.

1) If the input bit stream is not an exact multiple of five bits, pad
the input stream with 0 bits until it is an exact multiple of five bits.
Set the read pointer to the beginning of the input bit stream.

2) Look at the five bits after the read pointer.

3) Look up the value of the set of five bits in the bits column of
Table 1, and output the character from the char column (whose hex value
is in the hex column).

4) Move the read pointer five bits forward. If the read pointer is at
the end of the input bit stream (that is, there are no more bits in the
input), stop. Otherwise, go to step 2.

2.5.2 Decoding Base32 as octets

The input is octets in network byte order. The input octets MUST be
values from the second column in Table 1.

1) Count the number of octets in the input and divide it by 8; call the
remainder INPUTCHECK. If INPUTCHECK is 1 or 3 or 6, stop with an error.

2) Set the read pointer to the beginning of the input octet stream.

3) Look up the character value of the octet in the char column (or hex
value in hex column) of Table 1, and add the five bits from the bits
column to the output buffer.

4) Move the read pointer one octet forward. If the read pointer is not
at the end of the input octet stream (that is, there are more octets in
the input), go to step 3.

5) Count the number of bits that are in the output buffer and divide it
by 8; call the remainder PADDING. If the PADDING number of bits at the
end of the output buffer are not all zero, stop with an error.
Otherwise, emit the output buffer and stop.

2.5.3 Base32 example

Assume you want to encode the value 0x3a270f93.
The bit string is:

3   a    2   7    0   f    9   3
00111010 00100111 00001111 10010011

Broken into chunks of five bits, this is:

00111 01000 10011 10000 11111 00100 11

Padding is added to make the last chunk five bits:

00111 01000 10011 10000 11111 00100 11000

The output of encoding is:

00111 01000 10011 10000 11111 00100 11000
  h     i     t     q     7     e     y

or "hitq7ey".

3. Security Considerations

Much of the security of the Internet relies on the DNS. Thus, any
change to the characteristics of the DNS can change the security of
much of the Internet. Thus, RACE makes no changes to the DNS
itself.

Host names are used by users to connect to Internet servers. The
security of the Internet would be compromised if a user entering a
single internationalized name could be connected to different servers
based on different interpretations of the internationalized host
name.

RACE is designed so that every internationalized host name part
can be represented as one and only one DNS-compatible string. If there
is any way to follow the steps in this document and get two or more
different results, it is a severe and fatal error in the protocol.

4. References

[IDNComp] Paul Hoffman, "Comparison of Internationalized Domain Name
Proposals", draft-ietf-idn-compare.

[IDNReq] James Seng, "Requirements of Internationalized Domain Names",
draft-ietf-idn-requirement.

[ISO10646] ISO/IEC 10646-1:1993. International Standard -- Information
technology -- Universal Multiple-Octet Coded Character Set (UCS) --
Part 1: Architecture and Basic Multilingual Plane.  Five amendments and
a technical corrigendum have been published up to now. UTF-16 is
described in Annex Q, published as Amendment 1. 17 other amendments are
currently at various stages of standardization.
[[[ THIS REFERENCE
NEEDS TO BE UPDATED AFTER DETERMINING ACCEPTABLE WORDING ]]]

[RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", March 1997, RFC 2119.

[STD13] Paul Mockapetris, "Domain names - implementation and
specification", November 1987, STD 13 (RFC 1035).

[Unicode3] The Unicode Consortium, "The Unicode Standard -- Version
3.0", ISBN 0-201-61633-5. Described at
<http://www.unicode.org/unicode/standard/versions/Unicode3.0.html>.

A. Acknowledgements

Mark Davis contributed many ideas to the initial draft of this document,
as well as comments in later versions. Graham Klyne and Martin Duerst
offered technical comments on the algorithms used. GIM Gyeongseog and
Pongtorn Jentaweepornkul helped fix technical errors in early drafts.
Rick Wesson and Mark Davis contributed many suggestions on error
conditions in the processing.

Base32 is quite obviously inspired by the tried-and-true Base64
Content-Transfer-Encoding from MIME.

B. Changes from Versions -02 to -03 of this Draft

1: Wording corrections to third paragraph.

2.2 and 2.3: Added need to check for all-STD13.

2.4.1: Wording corrections in the first two paragraphs. Made steps 1 and
2 clearer with resetting the input pointer. Also added sentence at the
end of step 1. Also added error conditions in steps 4 and 5.

2.4.2: Added error condition in step 1. Added a new step 3 for an error
check. Expanded step 8 to check for malformed input error. Added error
check for odd-length output.

2.4.3: Changed all the examples to use lowercase characters on input.

2.5.1: Made the list of steps shorter by padding with 0 bits at the
beginning of the steps.

2.5.2: Changed the sense of the test in step 3 and added step 4 to
check for malformed input. Also made the output a buffer. Also added
new step 1.

C. IANA Considerations

There are no IANA considerations in this document.

D. Author Contact Information

Paul Hoffman
Internet Mail Consortium and VPN Consortium
127 Segre Place
Santa Cruz, CA  95060 USA
paul.hoffman@imc.org and paul.hoffman@vpnc.org
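
Illustrative sketch (not part of the draft's normative text): the
compression steps of section 2.4.1 and the Base32 steps of sections
2.5.1 and 2.5.2 could be written roughly as follows in Python. The names
ALPHABET, race_compress, base32_encode and base32_decode are
hypothetical and do not appear in the draft. Two behaviours are
assumptions rather than anything the draft states: race_compress rejects
empty or odd-length input, and base32_decode drops the checked padding
bits before emitting the output buffer.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"  # Table 1, values 0..31


def race_compress(utf16be: bytes) -> bytes:
    """Sketch of steps 1-8 of section 2.4.1 (big-endian UTF-16 input)."""
    # Assumption: the draft does no input checking; this guard is added
    # only so the sketch fails loudly on unusable input.
    if not utf16be or len(utf16be) % 2:
        raise ValueError("input must be a non-empty, even-length string")
    uppers = set(utf16be[0::2])
    nonzero = uppers - {0x00}
    if len(uppers) == 1:            # step 1: one shared upper octet
        u1 = next(iter(uppers))
    elif len(nonzero) == 1:         # step 2: 0x00 plus one other value
        u1 = next(iter(nonzero))
    else:                           # step 3: no compression possible
        return b"\xd8" + utf16be
    if 0xD8 <= u1 <= 0xDC:          # step 4
        raise ValueError("U1 is in the range 0xD8 to 0xDC")
    out = bytearray([u1])
    for u2, n1 in zip(utf16be[0::2], utf16be[1::2]):
        if u2 == 0x00 and n1 == 0x99:       # step 5
            raise ValueError("U+0099 is not allowed in the input")
        if u2 == u1:                        # steps 6 and 7
            out += b"\xff\x99" if n1 == 0xFF else bytes([n1])
        else:                               # step 8
            out += bytes([0xFF, n1])
    return bytes(out)


def base32_encode(octets: bytes) -> str:
    """Sketch of section 2.5.1: pad with 0 bits, map 5 bits at a time."""
    bits = "".join(f"{b:08b}" for b in octets)
    bits += "0" * (-len(bits) % 5)
    return "".join(ALPHABET[int(bits[i:i + 5], 2)]
                   for i in range(0, len(bits), 5))


def base32_decode(text: str) -> bytes:
    """Sketch of section 2.5.2, including the two error checks."""
    if len(text) % 8 in (1, 3, 6):          # step 1 (INPUTCHECK)
        raise ValueError("impossible Base32 length")
    # str.index raises ValueError for characters outside Table 1.
    bits = "".join(f"{ALPHABET.index(c):05b}" for c in text)
    padding = len(bits) % 8                 # step 5 (PADDING)
    if padding and "1" in bits[-padding:]:
        raise ValueError("non-zero padding bits")
    # Assumption: the padding bits are dropped before output.
    bits = bits[:len(bits) - padding]
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

With this sketch, race_compress("\u012d\u00e0\u014b".encode("utf-16-be"))
gives the octets 0x012DFFE04B from section 2.4.3, and
base32_encode(bytes.fromhex("3a270f93")) gives "hitq7ey" as in section
2.5.3.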
