into the DNS to calling a directory service and then the DNS (in many situations, both actions could be accomplished in a single API call).  A directory approach can be consistent both with "flat" models and multi-attribute ones.  The DNS requires strict hierarchies, limiting its ability to differentiate among names by their properties.  By contrast, modern directories can utilize independently-searched attributes and other structured schema to provide flexibilities not present in a strictly hierarchical system.

There is a strong historical argument for a single directory structure (implying a need for mechanisms for registration, delegation, etc.).  But a single structure is not a strict requirement, especially if in-depth case analysis and design work leads to the conclusion that reverse-mapping to directory names is not a requirement (see section 5).  If a single structure is not needed, then, unlike the DNS, there would be no requirement for a global organization to authorize or delegate operation of portions of the structure.

The "no single structure" concept could be taken further by moving away from simple "names" in favor of, e.g., multiattribute, multihierarchical, faceted systems in which most of the facets use restricted vocabularies.  (These terms are fairly standard in the information retrieval and classification system literature; see, e.g., [IS5127].)  Such systems could be designed to avoid the need for procedures to ensure uniqueness across, or even within, providers and databases of the faceted entities for which the search is to be performed.  (See [DNS-Search] for further discussion.)

While the discussion above includes very general comments about attributes, it appears that only a very small number of attributes would be needed.  The list would almost certainly include country and language for internationalization purposes.  It might require "charset" if we cannot agree on a character set and encoding, although there are strong arguments for simply using ISO 10646 (also known as Unicode or "UCS", for Universal Character Set) [UNICODE], [IS10646] coding in interchange.  Trademark issues might motivate "commercial" and "non-commercial" (or other) attributes if they would be helpful in bypassing trademark problems.  And applications to resource location, such as those contemplated for Uniform Resource Identifiers (URIs) [RFC2396, RFC3305] or the Service Location Protocol [RFC2608], might argue for a few other attributes (as outlined above).
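As a purely hypothetical illustration of the attribute model sketched above, the short Python fragment below shows a faceted lookup over a handful of records keyed by independently-searched attributes such as country and language.  None of the names, attributes, or addresses are real, and nothing here is part of any specified protocol; it only illustrates how matching on attributes avoids any requirement for global uniqueness of the "name" facet.

   # Hypothetical faceted registry: entries are described by a few
   # independently-searched attributes rather than by a position in a
   # single hierarchy.  The "name" facet need not be unique.
   records = [
       {"name": "example", "country": "de", "lang": "de",
        "commercial": True,  "target": "192.0.2.10"},
       {"name": "example", "country": "fr", "lang": "fr",
        "commercial": False, "target": "192.0.2.20"},
   ]

   def lookup(**facets):
       """Return every record whose attributes match all supplied facets."""
       return [r for r in records
               if all(r.get(k) == v for k, v in facets.items())]

   print(lookup(name="example", country="de"))   # narrows to one record
   print(lookup(name="example"))                 # two records; no uniqueness needed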
4. Internationalization

Much of the thinking underlying this document was driven by considerations of internationalizing the DNS or, more specifically, providing access to the functions of the DNS from languages and naming systems that cannot be accurately expressed in the traditional DNS subset of ASCII.  Much of the relevant work was done in the IETF's "Internationalized Domain Names" Working Group (IDN-WG), although this document also draws on extensive parallel discussions in other forums.  This section contains an evaluation of what was learned as an "internationalized DNS" or "multilingual DNS" was explored and suggests future steps based on that evaluation.

When the IDN-WG was initiated, it was obvious to several of the participants that its first important task was an undocumented one: to increase the understanding of the complexities of the problem sufficiently that naive solutions could be rejected and people could go to work on the harder problems.  The IDN-WG clearly accomplished that task.  The belief that the problems were simple, along with the corresponding simplistic approaches and their promises of quick and painless deployment, effectively disappeared as the WG's efforts matured.

Some of the lessons learned from increased understanding and the dissipation of naive beliefs should be taken as cautions by the wider community: the problems are not simple.  Specifically, extracting small elements for solution, rather than looking at whole systems, may obscure the problems without solving any problem that is worth the trouble.

4.1 ASCII Isn't Just Because of English

The hostname rules chosen in the mid-70s weren't just "ASCII because English uses ASCII", although that was a starting point.  We have discovered that almost every other script (and even ASCII itself, if we permit the rest of the characters specified in the ISO 646 International Reference Version) is more complex than hostname-restricted ASCII (the "LDH" form, see section 1.1).  And ASCII isn't sufficient to completely represent English -- there are several words in the language that are correctly spelled only with characters or diacritical marks that do not appear in ASCII.

With a broader selection of scripts, in some examples, case mapping works from one case to the other but is not reversible.  In others, there are conventions about alternate ways to represent characters (in the language, not [only] in character coding) that work most of the time, but not always.  And there are issues in coding, with Unicode/10646 providing different ways to represent the same character ("character", rather than "glyph", is used deliberately here).  And, in still others, there are questions as to whether two glyphs "match", which may be a distance-function question, not one with a binary answer.  The IETF approach to these problems is to require pre-matching canonicalization (see the "stringprep" discussion below).
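As a concrete illustration of two of the points above -- irreversible case mapping and multiple codings of the same character -- consider the following Python fragment, which uses only the standard library.  The German sharp s and the accented "e" are merely convenient examples; the same issues arise in many scripts.

   import unicodedata

   # Case mapping is not always reversible: the German sharp s ("ß")
   # uppercases to "SS", and lowercasing that yields "ss", not "ß".
   print("straße".upper())           # STRASSE
   print("straße".upper().lower())   # strasse -- the round trip loses "ß"

   # Unicode/10646 allows more than one coded representation of the same
   # character: precomposed U+00E9 versus "e" followed by U+0301
   # (combining acute accent).  They compare unequal until canonicalized.
   precomposed = "\u00e9"            # é as a single code point
   decomposed  = "e\u0301"           # e + combining acute accent
   print(precomposed == decomposed)                                # False
   print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True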
The IETF has resisted the temptations to either try to specify an entirely new coded character set, or to pick and choose Unicode/10646 characters on a per-character basis rather than by using well-defined blocks.  While it may appear that a character set designed to meet Internet-specific needs would be very attractive, the IETF has never had the expertise, resources, and representation from critically-important communities to actually take on that job.  Perhaps more important, a new effort might have chosen to make some of the many complex tradeoffs differently than the Unicode committee did, producing a code with somewhat different characteristics.  But there is no evidence that doing so would produce a code with fewer problems and side-effects.  It is much more likely that making tradeoffs differently would simply result in a different set of problems, which would be equally or more difficult.

4.2 The "ASCII Encoding" Approaches

While the DNS can handle arbitrary binary strings without known internal problems (see [RFC2181]), some restrictions are imposed by the requirement that text be interpreted in a case-independent way ([RFC1034], [RFC1035]).  More important, most Internet applications assume the hostname-restricted "LDH" syntax that is specified in the host table RFCs and described as "prudent" in RFC 1035.  If those assumptions are not met, many conforming implementations of those applications may exhibit behavior that would surprise implementors and users.

To avoid these potential problems, IETF internationalization work has focused on "ASCII-Compatible Encodings" (ACE).  These encodings preserve the LDH conventions in the DNS itself.  Implementations of applications that have not been upgraded utilize the encoded forms, while newer ones can be written to recognize the special codings and map them into non-ASCII characters.

These approaches are, however, not problem-free even if human interface issues are ignored.  Among other issues, they rely on what is ultimately a heuristic to determine whether a DNS label is to be considered as an internationalized name (i.e., encoded Unicode) or interpreted as an actual LDH name in its own right.  And, while all determinations of whether a particular query matches a stored object are traditionally made by DNS servers, the ACE systems, when combined with the complexities of international scripts and names, require that much of the matching work be moved into a separate, client-side canonicalization or "preparation" process before the DNS matching mechanisms are invoked [STRINGPREP].
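For illustration only, the fragment below uses Python's standard library, whose built-in "idna" codec implements the IDNA2003 ToASCII/ToUnicode operations (nameprep followed by the punycode-based ACE) that grew out of the work described here.  The domain name shown is purely illustrative, and the final line is only a rough approximation of the client-side preparation step, not the full nameprep profile.

   import unicodedata

   # Python's built-in "idna" codec implements the IDNA2003 ToASCII and
   # ToUnicode operations (nameprep, then the punycode-based ACE).
   # "bücher.example" is an illustrative name, not a registered one.
   name = "Bücher.example"
   ace  = name.encode("idna")
   print(ace)                  # b'xn--bcher-kva.example'
   print(ace.decode("idna"))   # bücher.example

   # The "xn--" ACE prefix is the heuristic an upgraded application uses
   # to decide whether an LDH label should be treated as encoded Unicode.
   print(ace.split(b".")[0].startswith(b"xn--"))   # True

   # A rough approximation of the client-side "preparation" step:
   # compatibility normalization plus case folding before encoding.
   print(unicodedata.normalize("NFKC", "Bücher").casefold())   # bücher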
4.3 "Stringprep" and Its Complexities

As outlined above, the model for avoiding problems associated with putting non-ASCII names in the DNS and elsewhere evolved into the principle that strings are to be placed into the DNS only after being passed through a string preparation function that eliminates or rejects spurious character codes, maps some characters onto others, performs some sequence canonicalization, and generally creates forms that can be accurately compared.  The impact of this process on hostname-restricted ASCII (i.e., "LDH") strings is trivial and essentially adds only overhead.  For other scripts, the impact is, of necessity, quite significant.

Although the general notion underlying stringprep is simple, the many details are quite subtle and the associated tradeoffs are complex.  A design team worked on it for months, with considerable effort placed into clarifying and fine-tuning the protocol and tables.  Despite general agreement that the IETF would avoid getting into the business of defining character sets, character codings, and the associated conventions, the group several times considered and rejected special treatment of code positions to bring the distinctions made by Unicode more nearly into line with user perceptions of similarities and differences between characters.  But there were intense temptations (and pressures) to incorporate language-specific or country-specific rules.  Those temptations, even when resisted, were indicative of parts of the ongoing controversy or of the basic unsuitability of the DNS for fully internationalized names that are visible, comprehensible, and predictable for end users.

There have also been controversies about how far one should go in these processes of preparation and transformation and, ultimately, about the validity of various analogies.  For example, each of the following operations has been claimed to be similar to case-mapping in ASCII:

   o  stripping of vowels in Arabic or Hebrew
   o  matching of "look-alike" characters such as upper-case Alpha in Greek and upper-case A in Roman-based alphabets
   o  matching of Traditional and Simplified Chinese characters that represent the same words
   o  matching of Serbo-Croatian words whether written in Roman-derived or Cyrillic characters

A decision to support any of these operations would have implications for other scripts or languages and would increase the overall complexity of the process.  For example, unless language-specific information is somehow available, performing matching between Traditional and Simplified Chinese has impacts on Japanese and Korean uses of the same "traditional" characters (e.g., it would not be appropriate to map Kanji into Simplified Chinese).

Even if the IDN-WG's other work had been abandoned completely, or if it fails in the marketplace, the stringprep and nameprep work will continue to be extremely useful, both in identifying issues and problem code points and in providing a reasonable set of basic rules.  Where problems remain, they are arguably not with nameprep, but with the DNS-imposed requirement that its results, as with all other parts of the matching and comparison process, yield a binary "match or no match" answer, rather than, e.g., a value on a similarity scale that can be evaluated by the user or by user-driven heuristic functions.

4.4 The Unicode Stability Problem