INTERNET-DRAFT                                               Stuart Kwan
                                                            James Gilroy
                                                            Levon Esibov
                                                         Microsoft Corp.
                                                                May 2001
<draft-skwan-utf8-dns-06.txt>                      Expires November 2001


        Using the UTF-8 Character Set in the Domain Name System


Status of this Memo

This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other than as
"work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.


Abstract

The Domain Names standard specifies that hostnames are represented
using the ASCII character encoding. This document expands that
specification to allow the use of the UTF-8 character encoding, a
superset of ASCII and a translation of the UCS-2 character encoding.


1. Introduction

The Domain Names standard [RFC1123] specifies that hostnames are
represented using the ASCII character encoding. This document expands
that specification to allow the use of the UTF-8 character encoding
[RFC2044], a superset of ASCII and a translation of the UCS-2
character encoding.

Interpreting names as ASCII-only limits the utility of DNS in an
international setting. The UTF-8 character set includes characters
from most of the world's written languages, allowing a far greater
range of possible names and allowing names to use characters that are
relevant to a particular locality.
UTF-8 is the recommended character
set for protocols that are evolving beyond ASCII [RFC2130].

This document defines the technology for a richer character set in
DNS. This document specifically does not define policy for the
characters allowed in a name when used in a particular application.
For example, some protocols place restrictions on the characters
allowed in a name.


2. Protocol Description

2.1 Components and roles

Before describing the protocol itself, the authors clarify which
components are involved in processing hostnames and how each
component uses them. The following list contains this information.

User.
A user may be a human or an application. Its role is to specify (also
known as "write") and retrieve (also known as "read") the hostname to
and from an application. Examples of such operations include
typing the hostname, writing it on a touch-sensitive screen, reading
the name from a monitor, and listening to a voicemail.

Application.
The application's role is to
- process the hostname specified by the user or by another local or
  remote application.
- return to the user (for example, display on a monitor screen) the
  hostname returned by the DNS resolver.
- call DNS name resolution APIs to request that the resolver perform
  the name resolution.

Resolver.
The resolver's role is to
- process the name resolution requests from an application and submit
  the appropriate DNS query to the DNS servers.
- process the response from a DNS server and pass the response to the
  application.

DNS server.
The role of the DNS server is to store and maintain the DNS data,
process the updates to its database, update the replica copies of the
databases, and perform DNS name resolution by responding to
DNS queries.

2.2 Protocol details

This section describes the modifications (if any) to each of these
components and to the interfaces between the communicating components.

2.2.1 Users

No modifications to the users are proposed in this document. However,
support of this protocol by the other components specified later
in this section may enable users to use characters in hostnames from
a wider set than the one specified in [RFC1123].

2.2.2 Interface between users and applications

A user may use any character set, or multiple character sets,
supported by the particular application. Specification of the
character sets supported by an application is outside the scope of
this document. The decision on which character sets users may use to
input and retrieve hostnames is left to the implementers of the
particular application, unless a protocol underlying a specific
application specifies the supported character sets. Thus this protocol
does not affect the interface between users and applications.

2.2.3 Applications

The storage format of hostnames in applications is outside the
scope of this protocol.

2.2.4 Interface between applications and resolvers

This protocol does not specify the APIs that applications should use
to request the resolver to perform DNS name resolution of
internationalized hostnames. Instead, it specifies only the format of
the hostnames in the input and output of such APIs.

Applications supporting non-ASCII characters in hostnames MUST
pass to the resolvers a hostname in ISO/IEC 10646 encoding. If the
response returned by the resolver to the application contains a
hostname, then the application should expect the hostname to be
encoded using ISO/IEC 10646.

2.2.5 Resolvers

Before sending the hostname in the query packet, the resolver MUST
prepare each name part as specified in [NAMEPREP]. After name
preparation, the resolver MUST convert the hostname to UTF-8 as
specified in [RFC2044].

Names encoded in UTF-8 must not exceed the size limits clarified in
[RFC2181].
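
The resolver-side steps above (prepare each name part, convert to
UTF-8, check the size limits) can be sketched in Python as follows.
This is an illustration only and not part of the protocol: the
function names are hypothetical, and prepare_label() is a simplified
stand-in (NFKC normalization plus lowercasing) for the full
[NAMEPREP] rules. The limits of 63 octets per label and 255 octets
per name are those stated in [RFC2181].

```python
import unicodedata
from typing import List

MAX_LABEL_OCTETS = 63    # per-label limit from [RFC2181]
MAX_NAME_OCTETS = 255    # whole-name limit (wire form) from [RFC2181]


def prepare_label(label: str) -> str:
    """Simplified stand-in for [NAMEPREP] (ASSUMPTION, not the real rules)."""
    return unicodedata.normalize("NFKC", label).lower()


def to_wire_labels(hostname: str) -> List[bytes]:
    """Prepare each name part, encode it as UTF-8, and check octet limits."""
    labels = []
    for part in hostname.rstrip(".").split("."):
        encoded = prepare_label(part).encode("utf-8")
        # The limit applies to octets, not to characters.
        if not 1 <= len(encoded) <= MAX_LABEL_OCTETS:
            raise ValueError(f"label is {len(encoded)} octets; must be 1..63")
        labels.append(encoded)
    # Wire form: one length octet per label plus the terminating root label.
    wire_length = sum(1 + len(label) for label in labels) + 1
    if wire_length > MAX_NAME_OCTETS:
        raise ValueError(f"name is {wire_length} octets; limit is 255")
    return labels
```

Under these assumptions, to_wire_labels("Bücher.example") yields a
first label of 6 characters but 7 octets, and a label made of 40
copies of "é" is rejected because it occupies 80 octets.
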
Character count is insufficient to determine size, since
some UTF-8 characters exceed one octet in length.

When a resolver receives a response to a query from a DNS server, it
MUST convert all of the hostnames from UTF-8 to the ISO/IEC 10646
encoding before passing these hostnames back to the application.

2.2.6 DNS servers

DNS servers authoritative for records containing hostnames with
characters not allowed by [RFC1123] MUST allow use of the nameprepped
UTF-8 format to store and transmit those parts of the hostnames.

According to existing standards, any binary string can be used in a
DNS name [RFC2181], but names must be compared case-insensitively
[RFC1035]. At the same time, the DNS protocol standard states that
original case SHOULD be preserved when possible as data is entered
into the DNS database. This requirement is modified as follows: a DNS
server authoritative for internationalized hostnames MUST nameprep and
perform UTF-8 conversion on all names containing internationalized
characters, in both record names and record data, before storing these
hostnames and transmitting them in any message. This new
requirement guarantees case-insensitive comparison of
internationalized hostnames even by those DNS servers that do not
support this protocol.

DNS servers must compare names that contain UTF-8 characters
byte-for-byte, as opposed to using Unicode equivalency rules.


3. Interoperability Considerations

If users continue to use ASCII-only characters in hostnames, then
there is no need to upgrade any applications and/or resolvers.
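
ASCII-only names need no upgrade because UTF-8 encodes every ASCII
character as the same single octet. The Python sketch below checks
that property, and also illustrates the byte-for-byte comparison rule
of Section 2.2.6: two canonically equivalent spellings of a name
differ as octets until they are normalized. The function names are
illustrative, and NFC normalization is used only as an assumed
stand-in for the preparation step in [NAMEPREP].

```python
import unicodedata


def utf8_octets(hostname: str) -> bytes:
    """Octets of a hostname when encoded as UTF-8."""
    return hostname.encode("utf-8")


def same_name_bytewise(a: bytes, b: bytes) -> bool:
    """Byte-for-byte comparison, with no Unicode equivalence applied."""
    return a == b


# An ASCII-only name produces identical octets under ASCII and UTF-8.
ascii_name = "www.example.com"
ascii_is_unchanged = utf8_octets(ascii_name) == ascii_name.encode("ascii")

# The same visible label spelled two ways: e-acute as one code point,
# and 'e' followed by a combining acute accent.
composed = utf8_octets("caf\u00e9")
decomposed = utf8_octets("cafe\u0301")

# Equivalent under Unicode rules, yet different as octets.
differ_as_octets = not same_name_bytewise(composed, decomposed)

# After normalization (stand-in for name preparation) the octets match.
normalized = utf8_octets(unicodedata.normalize("NFC", "cafe\u0301"))
match_after_prep = same_name_bytewise(composed, normalized)
```

Because servers compare octets only, two spellings that a user would
regard as the same name match only if both were prepared identically
before being stored or sent.
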