draft-ietf-idn-utf6-00.txt
      5a. Set HN to the result of the bitwise AND of the input
          character and the mask;

      5b. Emit the variable length nibble encoding of HN.

2.4.3 Forward Transformation Algorithm

The UTF-6 transformation algorithm accepts a string in UTF-16
[ISO10646] format as input. The encoding algorithm is as follows:

   1. Break the hostname string into dot-separated hostname parts.
      For each hostname part, perform steps 2 and 3 below;

   2. Compress the component using the method described in section
      2.4.2 above, and encode using the encoding described in section
      2.4.1;

   3. Prepend the post-converted name prefix 'wq--' (see section 2.1
      above) to the resulting string.

2.5 UTF-6 Decoding

2.5.1 Variable Length Hex Decoding

   1. Let N be the lower case of the first input character; if N is
      not in the set [ghijklmnopqrstuv], return error, else consume
      the input character;

   2. Let R = N - 'g';

   3. If another input character exists, then let N be the lower case
      of the next input character, else go to Step 9;

   4. If N is not in the set [0123456789abcdef], go to Step 9;

   5. Let N = the lower case of the next input character and consume
      the input character;

   6. Let R = R * 16;

   7. If N is in the set [0123456789], then let R = R + (N - '0'),
      else let R = R + (N - 'a') + 10;

   8. Go to Step 3;

   9. Return decoded result R.

2.5.2 UTF-6 Decompression Algorithm

   1. Let N be the lower case of the first input character;

   2. If N != 'y' and N != 'z',

      2a. Let CPART be 0;

      2b. Let VMAX be 0xFFFF; this is the no-compression case;

   3. If N == 'y',

      3a. Let M be the variable length hex decoding of the next
          character;

      3b. Let CPART be the result of M * 0x0100;

      3c. Let VMAX be 0x00FF;

      3d. Continue to Step 5;

   4. If N == 'z',

      4a. Let M be the variable length hex decoding of the next
          character;

      4b. Let CPART be the result of M * 0x1000;

      4c. Let VMAX be 0x0FFF;

      4d. Continue to Step 5;

   5. While another input character exists, let N be the lower case
      of the next input character, and do the following:

      5a. If N == '-', consume the character, append '-' to the
          result string, and continue with the next input character;
          else let VPART be the next variable hex decoded value;

      5b. If VPART > VMAX, return error, else append CPART + VPART to
          the result string;

   6. Return the result string.

2.5.3 Reverse Transformation Algorithm

   1. Break the string into dot-separated components and apply Steps
      2 through 4 to each component:

   2. Check for legality (in terms of RFC1035 permitted characters)
      and return error status if illegal;

   3. Remove the post-converted name prefix 'wq--' (see Section 2.1);

   4. Decompress the component using the decompression algorithm
      described above;

   5. Concatenate the decoded segments with dot separators and
      return.

3. Examples

The examples below illustrate the encoding algorithm and provide
comparisons to alternative encoding schemes. UTF-5 sequences are
prefixed with '----', as no ACE prefix was defined for that encoding.

3.1 'www.walid.com' (in Arabic):

   UTF-16: U+0645 U+0648 U+0642 U+0639 . U+0648 U+0644 U+064A U+062F .
           U+0634 U+0631 U+0643 U+0629
   UTF-6:  wq--ymk5k8k2j9.wq--ymk8k4kaif.wq--ymj4j1k3i9
   UTF-5:  ----m45m48m42m39.----m48m44m4am2f.----m34m31m43m29
   RACE:   bq--azcuqqrz.bq--azeeisrp.bq--ay2dcqzj
   LACE:   bq--aqdekscche.bq--aqdeqrckf5.bq--aqddimkdfe

3.2 Mixed Katakana and Hiragana (SOREZORENOBASHO)

   UTF-16: U+305D U+308C U+305E U+308C U+306E U+5834 U+6240
   UTF-6:
   UTF-5:
   RACE:   bq--4ayf3memgbpdbdbqnzmdiysa
   LACE:   bq--auyf4dc7rrxacwbuafrea

3.3 Currently Disallowed ASCII Characters ($OneBillionDollars!):

   UTF-16: U+0024 U+004F U+006E U+0065 U+0042 U+0069 U+006C U+006C
           U+0069 U+006F U+006E U+0044 U+006F U+006C U+006C U+0061
           U+0072 U+0073 U+0021
   UTF-6:
   UTF-5:
   RACE:   bq--aase74tfijuwy4djn6xei44mnrqxe5zb
   LACE:   bq--cmacit4omvbgs4dmnfxw5rdpnrwgc5ttee

4. Security Considerations

Much of the security of the Internet relies on the DNS and any
change to the characteristics of the DNS may change the security of
much of the Internet. Therefore UTF-6 makes no changes to the DNS
itself.

UTF-6 is designed so that distinct Unicode sequences map to distinct
domain name sequences (modulo the Unicode and DNS equivalence rules).
Therefore use of UTF-6 with DNS will not negatively affect security.

5. References

[IDNCOMP] Paul Hoffman, "Comparison of Internationalized Domain Name
Proposals", draft-ietf-idn-compare.

[IDNREQ] James Seng, "Requirements of Internationalized Domain Names",
draft-ietf-idn-requirement.

[IDNNAMEPREP] Paul Hoffman and Marc Blanchet, "Preparation of
Internationalized Host Names", draft-ietf-idn-nameprep.

[IDNDUERST] M. Duerst, "Internationalization of Domain Names",
draft-duerst-dns-i18n.

[ISO10646] ISO/IEC 10646-1:1993. International Standard -- Information
technology -- Universal Multiple-Octet Coded Character Set (UCS) --
Part 1: Architecture and Basic Multilingual Plane. Five amendments and
a technical corrigendum have been published up to now. UTF-16 is
described in Annex Q, published as Amendment 1. 17 other amendments are
currently at various stages of standardization.

[RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", March 1997, RFC 2119.

[STD13] Paul Mockapetris, "Domain names - implementation and
specification", November 1987, STD 13 (RFC 1035).

[UNICODE3] The Unicode Consortium, "The Unicode Standard -- Version
3.0", ISBN 0-201-61633-5. Described at
<http://www.unicode.org/unicode/standard/versions/Unicode3.0.html>.

A. Acknowledgements

The structure (and some of the structural text) of this document is
intentionally borrowed from the LACE IDN draft (draft-ietf-idn-lace)
by Mark Davis and Paul Hoffman.

The 'SOREZORENOBASHO' example was taken from the draft-ietf-idn-brace
draft by Adam Costello.

B. IANA Considerations

There are no IANA considerations in this document.

C. Author Contact Information

Mark Welter
Brian W. Spolarich
WALID, Inc.
State Technology Park
2245 S. State St.
Ann Arbor, MI 48104
+1-734-822-2020
mwelter@walid.com
briansp@walid.com

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE6FaCt/DkPcNgtD/0RAtRmAJwISVeJGY6qmll71mL+Axc51o8iIwCgmNt/
86RcQh1JQYWTux+8FS+XvMU=
=bxiv
-----END PGP SIGNATURE-----
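
The decoding steps of sections 2.5.1 through 2.5.3 can be sketched in
Python as shown below. This is an illustrative reading of the prose
and not part of the draft. The function names are hypothetical. The
sketch makes three assumptions the text does not state explicitly:
the 'y' or 'z' marker is consumed before M is decoded, a component
without the 'wq--' prefix is treated as an error, and M is rejected if
it would push results beyond 16 bits.

```python
LEAD_CHARS = "ghijklmnopqrstuv"   # first character of a value (2.5.1 step 1)
HEX_DIGITS = "0123456789abcdef"   # continuation characters (2.5.1 step 4)
PREFIX = "wq--"                   # post-converted name prefix (section 2.1)


def decode_vlhex(text, pos):
    """Section 2.5.1: decode one variable length hex value at text[pos].

    Returns (value, index of the first unconsumed character).
    """
    if pos >= len(text):
        raise ValueError("unexpected end of input")
    value = LEAD_CHARS.find(text[pos].lower())
    if value < 0:
        raise ValueError(f"invalid lead character {text[pos]!r}")
    pos += 1
    # Steps 3 to 8: fold in hex digits until a non-digit or end of input.
    while pos < len(text) and text[pos].lower() in HEX_DIGITS:
        value = value * 16 + HEX_DIGITS.index(text[pos].lower())
        pos += 1
    return value, pos


def decompress(label):
    """Section 2.5.2: expand one component (prefix already removed)."""
    pos = 0
    cpart, vmax = 0, 0xFFFF                 # no-compression case
    marker = label[:1].lower()
    if marker in ("y", "z"):
        # Assumption: the marker itself is consumed before decoding M.
        m, pos = decode_vlhex(label, 1)
        if marker == "y":
            cpart, vmax = m * 0x0100, 0x00FF
        else:
            cpart, vmax = m * 0x1000, 0x0FFF
        # Assumption (not in the draft): keep results within 16 bits.
        if cpart + vmax > 0xFFFF:
            raise ValueError("compression base out of range")
    out = []
    while pos < len(label):
        if label[pos] == "-":               # step 5a: literal hyphen
            out.append("-")
            pos += 1
            continue
        vpart, pos = decode_vlhex(label, pos)
        if vpart > vmax:                    # step 5b
            raise ValueError("value exceeds the compression range")
        out.append(chr(cpart + vpart))
    return "".join(out)


def decode_hostname(name):
    """Section 2.5.3: reverse transformation of a dotted hostname."""
    parts = []
    for label in name.split("."):
        # Step 2: only letters, digits and hyphen are accepted here.
        if not all(c.isascii() and (c.isalnum() or c == "-") for c in label):
            raise ValueError(f"illegal character in {label!r}")
        # Assumption: a missing prefix is reported as an error.
        if label[:len(PREFIX)].lower() != PREFIX:
            raise ValueError(f"missing {PREFIX!r} prefix in {label!r}")
        parts.append(decompress(label[len(PREFIX):]))
    return ".".join(parts)
```

Applied to the UTF-6 form in section 3.1, decode_hostname(
"wq--ymk5k8k2j9.wq--ymk8k4kaif.wq--ymj4j1k3i9") yields the UTF-16
sequence listed there: for the first component, 'y' followed by 'm'
gives CPART = 6 * 0x0100 = 0x0600, and 'k5' decodes to 0x45, giving
U+0645.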