IETF IDN Working Group               Editors Zita Wenzel, James Seng
Internet Draft                       draft-ietf-idn-requirements-05.txt
24 April 2001                        Expires 24 October 2001

             Requirements of Internationalized Domain Names

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as
"work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Intended Scope

The intended scope of this document is to explore requirements for the
internationalization of domain names on the Internet. It is not
intended to document user requirements. It is recommended that
solutions not necessarily be within the DNS itself, but could be a layer
interjected between the application and the DNS. Proposals SHOULD
fulfill most, if not all, of the requirements. This document MAY be
updated based on clinical trials.

Abstract

This document describes the requirement for encoding international
characters into DNS names and records. This document is guidance for
developing protocols for internationalized domain names.

1. Introduction

At present, the encoding of Internet domain names is restricted to a
subset of 7-bit ASCII (ISO/IEC 646). HTML, XML, IMAP, FTP, and many
other text based items on the Internet have already been at least
partially internationalized.
It is important for domain names to be
similarly internationalized or for an equivalent solution to be found.
This document assumes that the most effective solution involves putting
non-ASCII names inside some parts of the overall DNS system.

This document is being discussed on the "idn" mailing list. To join the
list, send a message to <majordomo@ops.ietf.org> with the words
"subscribe idn" in the body of the message. Archives of the mailing
list can also be found at ftp://ops.ietf.org/pub/lists/idn*.

1.1 Definitions and Conventions

A language is a way that humans interact. In computerised form, a text
in a written language can be expressed as a string of characters.
The same set of characters can often be used for many written languages,
and many written languages can be expressed using different scripts.
The same characters are often shown with somewhat different glyphs
(shapes) for display of a text depending on the font used, the
automatic shaping applied, or the automatic formation of ligatures. In
addition, the same characters can be shown with somewhat different
glyphs (shapes) for display of a text depending on the language being
used, even within the same font or through automatic font change.

A character is a member of a set of elements used for organization,
control, or representation of textual data.

A graphic character is a character, other than a control function,
that has a visual representation normally handwritten, printed, or
displayed.

Characters mentioned in this document are identified by their position
in the Unicode [UNICODE] character set.  This character set is also
known as the UCS [ISO10646]. The notation U+12AB, for example, indicates
the character at position 12AB (hexadecimal) in the Unicode character
set.  Note that the use of this notation is not an indication of a
requirement to use Unicode.

Examples quoted in this document should be considered as a method to
further explain the meanings and principles adopted by the document.
It is not a requirement for the protocol to satisfy the examples.

Unicode Technical Report 17 [UTR17] defines a character encoding
model in several levels (much of the text below is quoted from
Unicode Technical Report 17 [UTR17]):

1. An abstract character repertoire (ACR) is defined as the set of
   abstract characters to be encoded, normally a familiar alphabet
   or symbol set. The word abstract just means that these objects
   are defined by convention (such as the 26 letters of the English
   alphabet, uppercase and lowercase forms). Examples: the ASCII
   repertoire, the Latin-15 repertoire, the JIS X 0208 repertoire,
   the UCS repertoire (of a particular version).

2. A coded character set (CCS) is defined to be a mapping from a
   set of abstract characters to the set of non-negative integers.
   This range of integers need not be contiguous. An abstract
   character is defined to be in a coded character set if the coded
   character set maps from it to an integer. That integer is said
   to be the code point for the abstract character. That abstract
   character is then an encoded character. Examples: ASCII, Latin-15,
   JIS X 0208, the UCS.

3. A character encoding form (CEF) is a mapping from the set of integers
   used in a CCS to the set of sequences of code units. A code unit
   is an integer occupying a specified binary width in a computer
   architecture, such as a septet, an octet, or a 16-bit unit. The
   encoding form enables character representation as actual data in
   a computer. The sequences of code units do not necessarily have the
   same length. Examples: ASCII, Latin-15, Shift-JIS, UTF-16, UTF-8.

4. A character encoding scheme (CES) is a mapping of code units into
   serialized octet sequences. Character encoding schemes are relevant
   to the issue of cross-platform persistent data involving code units
   wider than a byte, where byte-swapping may be required to put data
   into the byte polarity canonical for a particular platform.
   The CES may involve two or more CCS's, and may include code units
   (e.g. single shifts, SI/SO, or escape sequences) that are not part
   of the CCS per se, but which are defined by the character encoding
   architecture and which may require an external registry of particular
   values (as for the ISO 2022 escape sequences). In such a case, the
   CES is called a compound CES. (A CES that only involves a single
   CCS is called a simple CES.)
   Examples: ASCII, Latin-15, Shift-JIS, UTF-16BE, UTF-16LE, UTF-8.

5. The mapping from an abstract character repertoire (ACR) to a
   serialised sequence of octets is called a Character Map (CM). A simple
   character map thus implicitly includes a CCS, a CEF, and a CES,
   mapping from abstract characters to code units to octets. A compound
   character map includes a compound CES, and thus includes more than one
   CCS and CEF. In that case, the abstract character repertoire for the
   character map is the union of the repertoires covered by the coded
   character sets involved.

   Character Maps are the things that in the IAB architecture get IANA
   charset identifiers. A sequence of encoded characters must be
   unambiguously mapped onto a sequence of octets by the charset. The
   charset must be specified in all instances, as in Internet
   protocols, where textual content is treated as an ordered sequence
   of octets, and where the textual content must be reconstructible
   from that sequence of octets.  Charset names are registered by the
   IANA according to procedures documented in [RFC2278]. In many cases,
   the same name is used for both a character map and for a character
   encoding scheme, such as UTF-16BE. Typically this is done for simple
   character maps when such usage is clear from context.

6. A transfer encoding syntax (TES) is a reversible transform of encoded
   data which may (or may not) include textual data represented in
   one or more character encoding schemes.
   Examples: 8bit, Quoted-Printable, BASE64, UTF-7 (defunct), (UTF-5,
   and RACE).

1.2 Description of the Domain Name System

The Domain Name System is defined by [RFC1034] and [RFC1035], with
clarifications, extensions and modifications given in [RFC1123],
[RFC1996], [RFC2181], and others. Of special importance here are the
security extensions described in [RFC2535] and companions.

Over the years, many different words have been used to describe the
components of resource naming on the Internet (e.g., URI, URN); to make
certain that the terms used in this document are well-defined and
non-ambiguous, the definitions are given here.

A master server for a zone holds the main copy of that zone. This copy
is sometimes stored in a zone file. A slave server for a zone holds a
complete copy of the records for that zone. Slave servers MAY be either
authorized by the zone owner (secondary servers) or unauthorized
(so-called "stealth secondaries"). Master and authorized slave servers
are listed in the NS records for the zone, and are termed
"authoritative" servers. In many contexts outside this document, the
term "primary" is used interchangeably with "master" and "secondary" is
used interchangeably with "slave".

A caching server holds temporary copies of DNS records; it uses records
to answer queries about domain names. Further explanation of these terms
can be found in [RFC1034] and [RFC1996].

DNS names can be represented in multiple forms, with different
properties for internationalization. The most important ones are:

- Domain name: The binary representation of a name used internally in
  the DNS protocol.
  This consists of a series of components of 1-63
  octets, with an overall length limited to 255 octets (including the
  length fields).

- Master file format domain name: This is a representation of the name
  as a sequence of characters in some character sets; the common
  convention (derived from [RFC1035] section 5.1) is to represent the
  octets of the name as ASCII characters where the octet is in the set
  corresponding to the ASCII values for [a-zA-Z0-9-], using an escape
  mechanism (\x or \NNN) where not, and separating the components of the
  name by the dot character (".").

The form specified for most protocols using the DNS is a limited form of
the master file format domain name. This limited form is defined in
[RFC1034] Section 3.5 and [RFC1123]. In most implementations of
applications today, domain names in the Internet have been limited to
the much more restricted forms used, e.g., in email.  Those names are
limited to the upper- and lower-case letters a-z (interpreted in a
case-independent fashion), the digits, and the hyphen-minus, all in
ASCII.

1.3 Definition of "hostname" and "Internationalized Domain Name"

In the DNS protocols, a name is referred to as a sequence of octets.
However, when discussing requirements for internationalized domain
names, what we are looking for are ways to represent characters that
are meaningful for humans.

In this document, this is referred to as a "hostname". While this term
has been used for many different purposes over the years, it is used
here in the sense of a sequence of characters (not octets) representing
a domain name conforming to the limited hostname syntax [RFC952].

This document attempts to define the requirements for an
"Internationalized Domain Name" (IDN).
This is defined as a sequence of
characters that can be used in the context of functions where a hostname
is used today, but contains one or more characters that are outside the
set of characters specified as legal characters for host names
[RFC1123].

1.4 A multilayer model of the DNS function

The DNS can be seen as a multilayer function:

- The bottom layer is where the packets are passed across the Internet
  in a DNS query and a DNS response. At this level, what matters is
  the format and meaning of bits and octets in a DNS packet.

- Above that is the "DNS service", created by an infrastructure of DNS
  servers and NS records that point to those DNS servers, which is
  pointed to by the root servers (listed in the "root cache file" on
  each DNS server, often called "named.cache"). It is at this level
  that the statement "the DNS has a single root" [RFC2826] makes
  sense, but still, what are being transferred are octets, not
  characters.

- Interfacing to the user is a service layer, often called "the resolver
  library", and often embedded in the operating system or system
  libraries of the client machines. It is at the top of this layer that
  the API calls commonly known as "gethostbyname" and "gethostbyaddr"
  reside.  These calls are modified to support IPv6 [RFC2553]. A
  conceptually similar layer exists in authoritative DNS servers,
  comprising the parts that generate "meaningful" strings in DNS files.
  Due to the popularity of the "master file" format, this layer often
  exists only in the administrative routines of the service maintainers.

- The users of this layer (the resolver library) are the application
  programs that use the DNS, such as mailers, mail servers, Web clients,
  Web servers, Web caches, IRC clients, FTP clients, distributed file
  systems, distributed databases, and almost all other applications on
  TCP/IP.

Graphically, one can illustrate it like this:

+---------------+                            +---------------------+
| Application   |                            | (Base data)         |
+---------------+                            +---------------------+
      |  Application service interface                 |
      |  For ex. GethostbyXXXX interface               | (no standard)
+---------------+                            +---------------------+
| Resolver      |                            | Auth DNS server     |
+---------------+                            +---------------------+
      |     <-----   DNS service interface   ----->    |
+------------------------------------------------------------------+
|  DNS service                                                     |
|  +-----------------------+         +--------------------+        |
|  | Forwarding DNS server |         | Caching DNS server |        |
|  +-----------------------+         +--------------------+        |
|                                                                  |
|                 +-------------------------+                      |
|                 | Parent-zone DNS servers |                      |
|                 +-------------------------+                      |
|                                                                  |
|                 +-------------------------+                      |
|                 | Root DNS servers        |                      |
|                 +-------------------------+                      |
|                                                                  |
+------------------------------------------------------------------+

1.5 Service model of the DNS

The Domain Name Service is used for multiple purposes, each of which is
characterized by what it puts into the system (the query) and what it
expects as a result (the reply).

The most used ones in the current DNS are:

- Hostname-to-address service (A, AAAA, A6): Enter a hostname, and get
  back an IPv4 or IPv6 address.

- Hostname-to-Mail server service (MX): As above, but the expected
  return value is a hostname and a priority for SMTP servers.

- Address-to-hostname service (PTR): Enter an IPv4 or IPv6 address (in
  in-addr.arpa or ip6.int form respectively) and get back a hostname.

- Domain delegation service (NS): Enter a domain name and get back
  nameserver records (designated hosts that provide authoritative
  nameservice) for the domain.

New services are being defined, either as entirely new services (IPv6 to
hostname mapping using binary labels) or as embellishments to other
services (DNSSEC returning information about whether a given DNS service
is performed securely or not).

These services exist, conceptually, at the Application/Resolver
interface, NOT at the DNS-service interface. This document attempts to
set requirements for an equivalent of the "used services" given above,
where "hostname" is replaced by "Internationalized Domain Name". This
doesn't preclude the fact that IDN should work with any kind of DNS
queries.  IDN is a new service. Since existing protocols like SMTP or
HTTP use the old service, it is a matter of great concern how the new
and old services work together, and how other protocols can take
advantage of the new service.
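
As an illustration only, the difference between a "hostname" (a
sequence of characters in the limited syntax) and a "domain name" (the
binary form of 1-63 octet components, at most 255 octets including the
length fields) can be sketched in Python. The function names
is_hostname and to_wire are hypothetical and are not part of any
proposal. The rule that a label does not begin or end with a hyphen is
an assumption drawn from [RFC952] and [RFC1123], and the UTF-8 octets
in the last call are just one possible octet sequence, not a
recommended encoding.

```python
import re
from typing import Iterable

# Limited hostname syntax: ASCII letters, digits and hyphen-minus only.
# Assumption: a label may not begin or end with a hyphen.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_hostname(name: str) -> bool:
    """Return True if every dot-separated label fits the limited syntax."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing dot
    return all(_LABEL.fullmatch(label) for label in name.split("."))


def to_wire(labels: Iterable[bytes]) -> bytes:
    """Build the binary domain name: one length octet before each component."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("each component must be 1-63 octets")
        out.append(len(label))
        out += label
    out.append(0)  # the zero-length root label ends the name
    if len(out) > 255:
        raise ValueError("name exceeds 255 octets including length fields")
    return bytes(out)


if __name__ == "__main__":
    print(is_hostname("www.example.com"))       # ASCII-only name
    print(is_hostname("b\u00fccher.example"))   # contains U+00FC
    print(to_wire([b"www", b"example", b"com"]))
    # At the packet level only octets matter, so any octets fit in a label.
    print(to_wire(["b\u00fccher".encode("utf-8"), b"example"]))
```

The sketch separates the two checks on purpose: the character-level
restriction lives at the Application/Resolver interface, while the
octet-level limits belong to the DNS service interface.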
