RFC 3490                          IDNA                        March 2003

   2. Perform the steps specified in [NAMEPREP] and fail if there is an
      error. (If step 3 of ToASCII is also performed here, it will not
      affect the overall behavior of ToUnicode, but it is not
      necessary.) The AllowUnassigned flag is used in [NAMEPREP].

   3. Verify that the sequence begins with the ACE prefix, and save a
      copy of the sequence.

   4. Remove the ACE prefix.

   5. Decode the sequence using the decoding algorithm in [PUNYCODE]
      and fail if there is an error. Save a copy of the result of
      this step.

   6. Apply ToASCII.

   7. Verify that the result of step 6 matches the saved copy from
      step 3, using a case-insensitive ASCII comparison.

   8. Return the saved copy from step 5.

5. ACE prefix

   The ACE prefix, used in the conversion operations (section 4), is two
   alphanumeric ASCII characters followed by two hyphen-minuses. It
   cannot be any of the prefixes already used in earlier documents,
   which includes the following: "bl--", "bq--", "dq--", "lq--", "mq--",
   "ra--", "wq--" and "zq--".

   The ToASCII and ToUnicode operations MUST recognize the ACE prefix in
   a case-insensitive manner.

   The ACE prefix for IDNA is "xn--" or any capitalization thereof.

   This means that an ACE label might be "xn--de-jg4avhby1noc0d", where
   "de-jg4avhby1noc0d" is the part of the ACE label that is generated by
   the encoding steps in [PUNYCODE].

   While all ACE labels begin with the ACE prefix, not all labels
   beginning with the ACE prefix are necessarily ACE labels. Non-ACE
   labels that begin with the ACE prefix will confuse users and SHOULD
   NOT be allowed in DNS zones.

Faltstrom, et al.           Standards Track                    [Page 12]

6. Implications for typical applications using DNS

   In IDNA, applications perform the processing needed to input
   internationalized domain names from users, display internationalized
   domain names to users, and process the inputs and outputs from DNS
   and other protocols that carry domain names.
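
   The ToUnicode steps 3 through 8 and the case-insensitive prefix rule
   of section 5 can be sketched as follows. This is an illustrative
   sketch only, not part of the specification: it assumes Python and its
   standard-library "punycode" codec and encodings.idna.ToASCII function,
   it handles only labels that are already all ASCII (so the [NAMEPREP]
   step is skipped), and the names has_ace_prefix and to_unicode_sketch
   are hypothetical.

```python
from encodings import idna  # standard-library ToASCII used for step 6

ACE_PREFIX = "xn--"


def has_ace_prefix(label: str) -> bool:
    """Recognize the ACE prefix in a case-insensitive manner (section 5)."""
    return label[:len(ACE_PREFIX)].lower() == ACE_PREFIX


def to_unicode_sketch(label: str) -> str:
    """Follow ToUnicode steps 3-8 for a label that is already all ASCII.

    Assumption of this sketch: any failure returns the input unchanged.
    """
    # Step 3: verify the prefix and save a copy of the sequence.
    if not has_ace_prefix(label):
        return label
    saved = label
    try:
        # Steps 4-5: remove the prefix, then Punycode-decode the remainder.
        decoded = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
        # Step 6: apply ToASCII to the decoded result.
        round_trip = idna.ToASCII(decoded).decode("ascii")
    except UnicodeError:
        return label
    # Step 7: case-insensitive ASCII comparison with the saved copy.
    if round_trip.lower() != saved.lower():
        return label
    # Step 8: return the saved copy from step 5.
    return decoded
```

   The round trip in steps 6 and 7 is what rejects a label such as
   "xn--abc-": it decodes to the plain ASCII string "abc", ToASCII of
   "abc" does not reproduce the original label, and so the label is left
   unaltered.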

   The components and interfaces between them can be represented
   pictorially as:

                    +------+
                    | User |
                    +------+
                       ^
                       | Input and display: local interface methods
                       | (pen, keyboard, glowing phosphorus, ...)
   +-------------------|-------------------------------+
   |                   v                               |
   |          +-----------------------------+          |
   |          |         Application         |          |
   |          |   (ToASCII and ToUnicode    |          |
   |          |      operations may be      |          |
   |          |        called here)         |          |
   |          +-----------------------------+          |
   |                   ^        ^                      | End system
   |                   |        |                      |
   | Call to resolver: |        | Application-specific |
   |              ACE  |        | protocol:            |
   |                   v        | ACE unless the       |
   |           +----------+     | protocol is updated  |
   |           | Resolver |     | to handle other      |
   |           +----------+     | encodings            |
   |                 ^          |                      |
   +-----------------|----------|----------------------+
       DNS protocol: |          |
                 ACE |          |
                     v          v
          +-------------+    +---------------------+
          | DNS servers |    | Application servers |
          +-------------+    +---------------------+

   The box labeled "Application" is where the application splits a
   domain name into labels, sets the appropriate flags, and performs the
   ToASCII and ToUnicode operations. This is described in section 4.

6.1 Entry and display in applications

   Applications can accept domain names using any character set or sets
   desired by the application developer, and can display domain names in
   any charset. That is, the IDNA protocol does not affect the
   interface between users and applications.

   An IDNA-aware application can accept and display internationalized
   domain names in two formats: the internationalized character set(s)
   supported by the application, and as an ACE label. ACE labels that
   are displayed or input MUST always include the ACE prefix.
   Applications MAY allow input and display of ACE labels, but are not
   encouraged to do so except as an interface for special purposes,
   possibly for debugging, or to cope with display limitations as
   described in section 6.4. ACE encoding is opaque and ugly, and
   should thus only be exposed to users who absolutely need it.
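
   What the "Application" box does with a name entered in either format
   can be sketched as follows. This is an illustration under stated
   assumptions, not a definitive implementation: it assumes Python and
   the ToASCII and ToUnicode functions of its standard-library
   encodings.idna module, splits labels on U+002E only, and uses the
   hypothetical names domain_to_ascii and domain_to_unicode.

```python
from encodings import idna  # standard-library per-label ToASCII / ToUnicode


def domain_to_ascii(name: str) -> str:
    """Split a domain name into labels and apply ToASCII to each one.

    Assumption: only U+002E is treated as a label separator here, and
    empty labels (such as the one after a trailing dot) are kept as-is.
    """
    labels = name.split(".")
    return ".".join(
        idna.ToASCII(label).decode("ascii") if label else ""
        for label in labels
    )


def _label_to_unicode(label: str) -> str:
    """Apply ToUnicode to one label, leaving it unaltered on failure."""
    try:
        return idna.ToUnicode(label)
    except UnicodeError:
        return label


def domain_to_unicode(name: str) -> str:
    """Split a domain name into labels and apply ToUnicode to each one."""
    return ".".join(_label_to_unicode(label) for label in name.split("."))
```

   With these helpers, a name typed in the internationalized form and
   the same name typed as an ACE label both yield the same ASCII name
   for use in protocols, while domain_to_unicode gives the form intended
   for display.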

   Because name labels encoded as ACE name labels can be rendered either
   as the encoded ASCII characters or the proper decoded characters, the
   application MAY have an option for the user to select the preferred
   method of display; if it does, rendering the ACE SHOULD NOT be the
   default.

   Domain names are often stored and transported in many places. For
   example, they are part of documents such as mail messages and web
   pages. They are transported in many parts of many protocols, such as
   both the control commands and the RFC 2822 body parts of SMTP, and
   the headers and the body content in HTTP. It is important to
   remember that domain names appear both in domain name slots and in
   the content that is passed over protocols.

   In protocols and document formats that define how to handle
   specification or negotiation of charsets, labels can be encoded in
   any charset allowed by the protocol or document format. If a
   protocol or document format only allows one charset, the labels MUST
   be given in that charset.

   In any place where a protocol or document format allows transmission
   of the characters in internationalized labels, internationalized
   labels SHOULD be transmitted using whatever character encoding and
   escape mechanism that the protocol or document format uses at that
   place.

   All protocols that use domain name slots already have the capacity
   for handling domain names in the ASCII charset. Thus, ACE labels
   (internationalized labels that have been processed with the ToASCII
   operation) can inherently be handled by those protocols.

6.2 Applications and resolver libraries

   Applications normally use functions in the operating system when they
   resolve DNS queries. Those functions in the operating system are
   often called "the resolver library", and the applications communicate
   with the resolver libraries through a programming interface (API).
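
   Since resolver APIs of this kind take host names as ASCII strings, an
   application would convert a name to its ACE form before making the
   call. A minimal sketch, assuming Python, its built-in "idna" codec
   (which applies ToASCII label by label), and the standard
   socket.getaddrinfo function standing in for the resolver API; the
   names prepare_for_resolver and lookup are hypothetical:

```python
import socket


def prepare_for_resolver(name: str) -> str:
    """Return the all-ASCII (ACE) form of a domain name.

    The built-in "idna" codec splits the name into labels and applies
    ToASCII to each; it raises UnicodeError if a label cannot be converted.
    """
    return name.encode("idna").decode("ascii")


def lookup(name: str):
    """Hand only the ASCII form of the name to the resolver API."""
    return socket.getaddrinfo(prepare_for_resolver(name), None)
```

   For example, prepare_for_resolver("bücher.example") yields
   "xn--bcher-kva.example", and a name that is already all ASCII passes
   through unchanged.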

   Because these resolver libraries today expect only domain names in
   ASCII, applications MUST prepare labels that are passed to the
   resolver library using the ToASCII operation. Labels received from
   the resolver library contain only ASCII characters; internationalized
   labels that cannot be represented directly in ASCII use the ACE form.
   ACE labels always include the ACE prefix.

   An operating system might have a set of libraries for performing the
   ToASCII operation. The input to such a library might be in one or
   more charsets that are used in applications (UTF-8 and UTF-16 are
   likely candidates for almost any operating system, and
   script-specific charsets are likely for localized operating systems).

   IDNA-aware applications MUST be able to work with both
   non-internationalized labels (those that conform to [STD13] and
   [STD3]) and internationalized labels.

   It is expected that new versions of the resolver libraries in the
   future will be able to accept domain names in other charsets than
   ASCII, and application developers might one day pass not only domain
   names in Unicode, but also in local script to a new API for the
   resolver libraries in the operating system. Thus the ToASCII and
   ToUnicode operations might be performed inside these new versions of
   the resolver libraries.

   Domain names passed to resolvers or put into the question section of
   DNS requests follow the rules for "queries" from [STRINGPREP].

6.3 DNS servers

   Domain names stored in zones follow the rules for "stored strings"
   from [STRINGPREP].

   For internationalized labels that cannot be represented directly in
   ASCII, DNS servers MUST use the ACE form produced by the ToASCII
   operation. All IDNs served by DNS servers MUST contain only ASCII
   characters.

   If a signaling system which makes negotiation possible between old
   and new DNS clients and servers is standardized in the future, the
   encoding of the query in the DNS protocol itself can be changed from
   ACE to something else, such as UTF-8. The question whether or not
   this should be used is, however, a separate problem and is not
   discussed in this memo.

6.4 Avoiding exposing users to the raw ACE encoding

   Any application that might show the user a domain name obtained from
   a domain name slot, such as from gethostbyaddr or part of a mail
   header, will need to be updated if it is to prevent users from seeing
   the ACE.

   If an application decodes an ACE name using ToUnicode but cannot show
   all of the characters in the decoded name, such as if the name
   contains characters that the output system cannot display, the
   application SHOULD show the name in ACE format (which always includes
   the ACE prefix) instead of displaying the name with the replacement
   character (U+FFFD). This is to make it easier for the user to
   transfer the name correctly to other programs. Programs that by
   default show the ACE form when they cannot show all the characters in
   a name label SHOULD also have a mechanism to show the name that is
   produced by the ToUnicode operation with as many characters as
   possible and replacement characters in the positions where characters
   cannot be displayed.

   The ToUnicode operation does not alter labels that are not valid ACE
   labels, even if they begin with the ACE prefix. After ToUnicode has
   been applied, if a label still begins with the ACE prefix, then it is
   not a valid ACE label, and is not equivalent to any of the
   intermediate Unicode strings constructed by ToUnicode.

6.5 DNSSEC authentication of IDN domain names

   DNS Security [RFC2535] is a method for supplying cryptographic
   verification information along with DNS messages. Public Key
   Cryptography is used in conjunction with digital signatures to
   provide a means for a requester of domain information to authenticate
   the source of the data.
   This ensures that it can be traced back to a trusted source, either
   directly, or via a chain of trust linking the source of the
   information to the top of the DNS hierarchy.

   IDNA specifies that all internationalized domain names served by DNS
   servers that cannot be represented directly in ASCII must use the ACE
   form produced by the ToASCII operation. This operation must be
   performed prior to a zone being signed by the private key for that
   zone. Because of this ordering, it is important to recognize that
   DNSSEC authenticates the ASCII domain name, not the Unicode form or
   the mapping between the Unicode form and the ASCII form. In the
   presence of DNSSEC, this is the name that MUST be signed in the zone
   and MUST be validated against.

   One consequence of this for sites deploying IDNA in the presence of
   DNSSEC is that any special purpose proxies or forwarders used to
   transform user input into IDNs must be earlier in the resolution flow
   than DNSSEC authenticating nameservers for DNSSEC to work.

7. Name server considerations

   Existing DNS servers do not know the IDNA rules for handling
   non-ASCII forms of IDNs, and therefore need to be shielded from them.
   All existing channels through which names can enter a DNS server
   database (for example, master files [STD13] and DNS update messages
   [RFC2136]) are IDN-unaware because they predate IDNA, and therefore
   requirement 2 of section 3.1 of this document provides the needed
   shielding, by ensuring that internationalized domain names entering
   DNS server databases through such channels have already been
   converted to their equivalent ASCII forms.

   It is imperative that there be only one ASCII encoding for a
   particular domain name. Because of the design of the ToASCII and
   ToUnicode operations, there are no ACE labels that decode to ASCII
   labels, and therefore name servers cannot contain multiple ASCII