Network Working Group                                          C. Newman
Request for Comments: 4790                              Sun Microsystems
Category: Standards Track                                      M. Duerst
                                                Aoyama Gakuin University
                                                          A. Gulbrandsen
                                                                    Oryx
                                                              March 2007


            Internet Application Protocol Collation Registry

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   Many Internet application protocols include string-based lookup,
   searching, or sorting operations.  However, the problem space for
   searching and sorting international strings is large, not fully
   explored, and is outside the area of expertise for the Internet
   Engineering Task Force (IETF).  Rather than attempt to solve such a
   large problem, this specification creates an abstraction framework so
   that application protocols can precisely identify a comparison
   function, and the repertoire of comparison functions can be extended
   in the future.

Newman, et al.              Standards Track                     [Page 1]
RFC 4790                  Collation Registry                  March 2007

Table of Contents

   1.  Introduction
     1.1.  Conventions Used in This Document
   2.  Collation Definition and Purpose
     2.1.  Definition
     2.2.  Purpose
     2.3.  Some Other Terms Used in this Document
     2.4.  Sort Keys
   3.  Collation Identifier Syntax
     3.1.  Basic Syntax
     3.2.  Wildcards
     3.3.  Ordering Direction
     3.4.  URIs
     3.5.  Naming Guidelines
   4.  Collation Specification Requirements
     4.1.  Collation/Server Interface
     4.2.  Operations Supported
       4.2.1.  Validity
       4.2.2.  Equality
       4.2.3.  Substring
       4.2.4.  Ordering
     4.3.  Sort Keys
     4.4.  Use of Lookup Tables
   5.  Application Protocol Requirements
     5.1.  Character Encoding
     5.2.  Operations
     5.3.  Wildcards
     5.4.  String Comparison
     5.5.  Disconnected Clients
     5.6.  Error Codes
     5.7.  Octet Collation
   6.  Use by Existing Protocols
   7.  Collation Registration
     7.1.  Collation Registration Procedure
     7.2.  Collation Registration Format
       7.2.1.  Registration Template
       7.2.2.  The Collation Element
       7.2.3.  The Identifier Element
       7.2.4.  The Title Element
       7.2.5.  The Operations Element
       7.2.6.  The Specification Element
       7.2.7.  The Submitter Element
       7.2.8.  The Owner Element
       7.2.9.  The Version Element
       7.2.10. The Variable Element
     7.3.  Structure of Collation Registry
     7.4.  Example Initial Registry Summary
   8.  Guidelines for Expert Reviewer
   9.  Initial Collations
     9.1.  ASCII Numeric Collation
       9.1.1.  ASCII Numeric Collation Description
       9.1.2.  ASCII Numeric Collation Registration
     9.2.  ASCII Casemap Collation
       9.2.1.  ASCII Casemap Collation Description
       9.2.2.  ASCII Casemap Collation Registration
     9.3.  Octet Collation
       9.3.1.  Octet Collation Description
       9.3.2.  Octet Collation Registration
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
     13.1. Normative References
     13.2. Informative References

1.  Introduction

   The Application Configuration Access Protocol ACAP [11] specification
   introduced the concept of a comparator (which we call collation in
   this document), but failed to create an IANA registry.  With the
   introduction of stringprep [6] and the Unicode Collation Algorithm
   [7], it is now time to create that registry and populate it with some
   initial values appropriate for an international community.
   This specification replaces and generalizes the definition of a
   comparator in ACAP, and creates a collation registry.

1.1.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as defined in "Key words for
   use in RFCs to Indicate Requirement Levels" [1].

   The attribute syntax specifications use the Augmented Backus-Naur
   Form (ABNF) [2] notation, including the core rules defined in
   Appendix A.  The ABNF production "Language-tag" is imported from
   Language Tags [5] and "reg-name" from URI: Generic Syntax [4].

2.  Collation Definition and Purpose

2.1.  Definition

   A collation is a named function which takes two arbitrary length
   strings as input and can be used to perform one or more of three
   basic comparison operations: equality test, substring match, and
   ordering test.

2.2.  Purpose

   Collations are an abstraction for comparison functions so that these
   comparison functions can be used in multiple protocols.  The details
   of a particular comparison operation can be specified by someone with
   appropriate expertise, independent of the application protocols that
   use that collation.  This is similar to the way a charset [13]
   separates the details of octet to character mapping from a protocol
   specification, such as MIME [9], or the way SASL [10] separates the
   details of an authentication mechanism from a protocol specification,
   such as ACAP [11].
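
   As a rough illustration of the definition in Section 2.1 and the
   separation described above, a collation can be modelled as a named
   bundle of up to three operations that a protocol looks up by name.
   This is a minimal sketch, not part of the specification: the class
   Collation, the REGISTRY table, the helper protocol_search, and the
   name "example-octets" are all hypothetical and chosen only for
   illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Collation:
    """Hypothetical model: a named bundle of up to three operations."""
    name: str
    equality: Optional[Callable[[bytes, bytes], bool]] = None
    substring: Optional[Callable[[bytes, bytes], bool]] = None
    ordering: Optional[Callable[[bytes, bytes], int]] = None  # -1, 0, or +1


def _octet_order(a: bytes, b: bytes) -> int:
    # Python compares bytes objects octet by octet.
    return (a > b) - (a < b)


# Hypothetical in-memory registry: protocols refer to a collation by name
# instead of defining comparison rules themselves.
REGISTRY: Dict[str, Collation] = {}


def register(collation: Collation) -> None:
    REGISTRY[collation.name] = collation


register(Collation(
    name="example-octets",  # illustrative name, not a registered identifier
    equality=lambda a, b: a == b,
    substring=lambda needle, haystack: needle in haystack,
    ordering=_octet_order,
))


def protocol_search(collation_name: str, needle: bytes,
                    values: List[bytes]) -> List[bytes]:
    """What a protocol's search command might do: delegate to the collation."""
    collation = REGISTRY[collation_name]
    if collation.substring is None:
        raise ValueError("collation does not support substring matching")
    return [v for v in values if collation.substring(needle, v)]
```

   In this sketch a protocol only needs the name; swapping in a
   different collation changes the matching rules without touching
   protocol_search.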
   Here is a small diagram to help illustrate the value of this
   abstraction:

   +-------------------+                         +-----------------+
   | IMAP i18n SEARCH  |--+                      | Basic           |
   +-------------------+  |                   +--| Collation Spec  |
                          |                   |  +-----------------+
   +-------------------+  |  +-------------+  |  +-----------------+
   | ACAP i18n SEARCH  |--+--| Collation   |--+--| A stringprep    |
   +-------------------+  |  | Registry    |  |  | Collation Spec  |
                          |  +-------------+  |  +-----------------+
   +-------------------+  |                   |  +-----------------+
   | ...other protocol |--+                   |  | locale-specific |
   +-------------------+                      +--| Collation Spec  |
                                                 +-----------------+

   Thus IMAP, ACAP, and future application protocols with international
   search capability simply specify how to interface to the collation
   registry instead of each protocol specification having to specify all
   the collations it supports.

2.3.  Some Other Terms Used in this Document

   The terms client, server, and protocol are used in somewhat unusual
   senses.

   Client means a user, or a program acting directly on behalf of a
   user.  This may be a mail reader acting as an IMAP client, or it may
   be an interactive shell, where the user can type protocol
   commands/requests directly, or it may be a script or program written
   by the user.

   Server means a program that performs services requested by the
   client.  This may be a traditional server such as an HTTP server, or
   it may be a Sieve [14] interpreter running a Sieve script written by
   a user.  A server needs to use the operations provided by collations
   in order to fulfill the client's requests.

   The protocol describes how the client tells the server what it wants
   done, and (if applicable) how the server tells the client about the
   results.  IMAP is a protocol by this definition, and so is the Sieve
   language.

2.4.  Sort Keys

   One component of a collation is a transformation, which turns a
   string into a sort key, which is then used while sorting.
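
   One way to picture the transformation just described is as a function
   from a string to a key whose plain octet order yields the collation's
   order.  The sketch below is illustrative only.  The function names
   are hypothetical, and the case-folding rule (ASCII lower-case letters
   mapped to upper case) is an assumption picked as a simple example; it
   is not quoted from any collation's specification.

```python
from typing import Callable, List


def identity_sort_key(value: bytes) -> bytes:
    """Identity mapping: the sort key is the string itself."""
    return value


def ascii_upper_sort_key(value: bytes) -> bytes:
    """Map ASCII a-z (0x61-0x7A) to A-Z; other octets pass through."""
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in value)


def sort_with(key_fn: Callable[[bytes], bytes],
              values: List[bytes]) -> List[bytes]:
    """Sort by comparing the sort keys octet by octet (stable sort)."""
    return sorted(values, key=key_fn)


names = [b"bob", b"Alice", b"alice", b"Bob"]

# Octet order puts every upper-case letter before every lower-case one.
by_octets = sort_with(identity_sort_key, names)

# After case folding, strings that differ only in ASCII case share a key
# and keep their input order relative to each other.
by_folded = sort_with(ascii_upper_sort_key, names)
```

   The caller only ever compares keys, so the key's internal format can
   stay private to whoever computes it.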
   The transformation can range from an identity mapping (e.g., the
   i;octet collation Section 9.3) to a mapping that makes the string
   unreadable to a human.

   This is an implementation detail of collations or servers.  A
   protocol SHOULD NOT expose it to clients, since some collations leave
   the sort key's format up to the implementation, and current
   conformant implementations are known to use different formats.

3.  Collation Identifier Syntax

3.1.  Basic Syntax

   The collation identifier itself is a single US-ASCII string.  The
   identifier MUST NOT be longer than 254 characters, and obeys the
   following grammar:

     collation-char  = ALPHA / DIGIT / "-" / ";" / "=" / "."

     collation-id    = collation-prefix ";" collation-core-name
                       *collation-arg

     collation-scope = Language-tag / "vnd-" reg-name

     collation-core-name = ALPHA *( ALPHA / DIGIT / "-" )

     collation-arg   = ";" ALPHA *( ALPHA / DIGIT ) "="
                       1*( ALPHA / DIGIT / "." )

   Note: the ABNF production "Language-tag" is imported from Language
   Tags [5] and "reg-name" from URI: Generic Syntax [4].

   There is a special identifier called "default".  For protocols that
   have a default collation, "default" refers to that collation.  For
   other protocols, the identifier "default" MUST match no collations,
   and servers SHOULD treat it in the same way as they treat nonexistent
   collations.

3.2.  Wildcards

   The string a client uses to select a collation MAY contain one or
   more wildcard ("*") characters that match zero or more
   collation-chars.  Wildcard characters MUST NOT be adjacent.  If the
   wildcard string matches multiple collations, the server SHOULD
   attempt to select a widely useful collation in preference to a
   narrowly useful one.

     collation-wild  =  ("*" / (ALPHA ["*"])) *(collation-char ["*"])
                        ; MUST NOT exceed 254 characters total

3.3.  Ordering Direction

   When used as a protocol element for ordering, the collation
   identifier MAY be prefixed by either "+" or "-" to explicitly specify
   an ordering direction.  "+" has no effect on the ordering operation,
   while "-" inverts the result of the ordering operation.  In general,
   collation-order is used when a client requests a collation, and
   collation-selected is used when the server informs the client of the
   selected collation.

     collation-selected = ["+" / "-"] collation-id

     collation-order    = ["+" / "-"] collation-wild

3.4.  URIs

   Some protocols are designed to use URIs [4] to refer to collations
   rather than simple tokens.  A special section of the IANA URL space
   is reserved for such usage.  The "collation-uri" form is used to
   refer to a specific named collation (the collation registration may