Network Working Group                                           R. Faith
Request for Comments: 2229                U. North Carolina, Chapel Hill
Category: Informational                                        B. Martin
                                                     Miranda Productions
                                                            October 1997


                      A Dictionary Server Protocol

Status of this Memo

   This memo provides information for the Internet community. It does
   not specify an Internet standard of any kind. Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1997). All Rights Reserved.

Abstract

   The Dictionary Server Protocol (DICT) is a TCP transaction based
   query/response protocol that allows a client to access dictionary
   definitions from a set of natural language dictionary databases.

Table of Contents

   1.      Introduction
   1.1.    Requirements
   2.      Protocol Overview
   2.1.    Link Level
   2.2.    Lexical Tokens
   2.3.    Commands
   2.4.    Responses
   2.4.1.  Status Responses
   2.4.2.  General Status Responses
   2.4.3.  Text Responses
   3.      Command and Response Details
   3.1.    Initial Connection
   3.2.    The DEFINE Command
   3.3.    The MATCH Command
   3.4.    A Note on Virtual Databases
   3.5.    The SHOW Command
   3.5.1.  SHOW DB
   3.5.2.  SHOW STRAT
   3.5.3.  SHOW INFO
   3.5.4.  SHOW SERVER
   3.6.    The CLIENT Command

Faith & Martin               Informational                      [Page 1]

RFC 2229              A Dictionary Server Protocol          October 1997

   3.7.    The STATUS Command
   3.8.    The HELP Command
   3.9.    The QUIT Command
   3.10.   The OPTION Command
   3.10.1. OPTION MIME
   3.11.   The AUTH Command
   3.12.   The SASLAUTH Command
   4.      Command Pipelining
   5.      URL Specification
   6.      Extensions
   6.1.    Experimental Command Syntax
   6.2.    Experimental Commands and Pipelining
   7.      Summary of Response Codes
   8.      Sample Conversations
   8.1.    Sample 1 - HELP, DEFINE, and QUIT commands
   8.2.    Sample 2 - SHOW commands, MATCH command
   8.3.    Sample 3 - Server downtime
   8.4.    Sample 4 - Authentication
   9.      Security Considerations
   10.     References
   11.     Acknowledgements
   12.     Authors' Addresses
   13.     Full Copyright Statement

1.  Introduction

   For many years, the Internet community has relied on the "webster"
   protocol for access to natural language definitions. The webster
   protocol supports access to a single dictionary and (optionally) to a
   single thesaurus. In recent years, the number of publicly available
   webster servers on the Internet has dramatically decreased.

   Fortunately, several freely-distributable dictionaries and lexicons
   have recently become available on the Internet.
   However, these freely-distributable databases are not accessible via
   a uniform interface, and are not accessible from a single site. They
   are often small and incomplete individually, but would collectively
   provide an interesting and useful database of English words. Examples
   include the Jargon file [JARGON], the WordNet database [WORDNET],
   MICRA's version of the 1913 Webster's Revised Unabridged Dictionary
   [WEB1913], and the Free Online Dictionary of Computing [FOLDOC].
   Translating and non-English dictionaries are also becoming available
   (for example, the FOLDOC dictionary is being translated into
   Spanish).

   The webster protocol is not suitable for providing access to a large
   number of separate dictionary databases, and extensions to the
   current webster protocol were not felt to be a clean solution to the
   dictionary database problem.

   The DICT protocol is designed to provide access to multiple
   databases. Word definitions can be requested, the word index can be
   searched (using an easily extended set of algorithms), information
   about the server can be provided (e.g., which index search strategies
   are supported, or which databases are available), and information
   about a database can be provided (e.g., copyright, citation, or
   distribution information). Further, the DICT protocol has hooks that
   can be used to restrict access to some or all of the databases.

1.1.  Requirements

   In this document, we adopt the convention discussed in Section 1.3.2
   of [RFC1122] of using the capitalized words MUST, REQUIRED, SHOULD,
   RECOMMENDED, MAY, and OPTIONAL to define the significance of each
   particular requirement specified in this document.
   In brief: "MUST" (or "REQUIRED") means that the item is an absolute
   requirement of the specification; "SHOULD" (or "RECOMMENDED") means
   there may exist valid reasons for ignoring this item, but the full
   implications should be understood before doing so; and "MAY" (or
   "OPTIONAL") means that this item is optional, and may be omitted
   without careful consideration.

2.  Protocol Overview

2.1.  Link Level

   The DICT protocol assumes a reliable data stream such as provided by
   TCP. When TCP is used, a DICT server listens on port 2628.

   This server is only an interface between programs and the dictionary
   databases. It does not perform any user interaction or
   presentation-level functions.

2.2.  Lexical Tokens

   Commands and replies are composed of characters from the UCS
   character set [ISO10646] using the UTF-8 [RFC2044] encoding. More
   specifically, using the grammar conventions from [RFC822]:

                                               ; (  Octal, Decimal.)
   CHAR        =  <any UTF-8 character (1 to 6 octets)>
   CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
                   character and DEL>          ; (    177,     127.)
   CR          =  <ASCII CR, carriage return>  ; (     15,      13.)
   LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
   SPACE       =  <ASCII SP, space>            ; (     40,      32.)
   HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
   <">         =  <ASCII quote mark>           ; (     42,      34.)
   <'>         =  <ASCII single quote mark>    ; (     47,      39.)
   CRLF        =  CR LF
   WS          =  1*(SPACE / HTAB)

   dqstring    =  <"> *(dqtext/quoted-pair) <">
   dqtext      =  <any CHAR except <">, "\", and CTLs>
   sqstring    =  <'> *(sqtext/quoted-pair) <'>
   sqtext      =  <any CHAR except <'>, "\", and CTLs>
   quoted-pair =  "\" CHAR

   atom        =  1*<any CHAR except SPACE, CTLs, <'>, <">, and "\">
   string      =  *<dqstring / sqstring / quoted-pair>
   word        =  *<atom / string>
   description =  *<word / WS>
   text        =  *<word / WS>

2.3.  Commands

   Commands consist of a command word followed by zero or more
   parameters. Commands with parameters must separate the parameters
   from each other and from the command by one or more space or tab
   characters.
   Command lines must be complete with all required parameters, and may
   not contain more than one command.

   Each command line must be terminated by a CRLF.

   The grammar for commands is:

   command     =  cmd-word *<WS cmd-param>
   cmd-word    =  atom
   cmd-param   =  database / strategy / word
   database    =  atom
   strategy    =  atom

   Commands are not case sensitive.

   Command lines MUST NOT exceed 1024 characters in length, counting all
   characters including spaces, separators, punctuation, and the
   trailing CRLF. There is no provision for the continuation of command
   lines. Since UTF-8 may encode a character using up to 6 octets, the
   command line buffer MUST be able to accept up to 6144 octets.

2.4.  Responses

   Responses are of two kinds, status and textual.

2.4.1.  Status Responses

   Status responses indicate the server's response to the last command
   received from the client.

   Status response lines begin with a 3 digit numeric code which is
   sufficient to distinguish all responses. Some of these may herald the
   subsequent transmission of text.

   The first digit of the response broadly indicates the success,
   failure, or progress of the previous command (based generally on
   [RFC640,RFC821]):

      1yz - Positive Preliminary reply
      2yz - Positive Completion reply
      3yz - Positive Intermediate reply
      4yz - Transient Negative Completion reply
      5yz - Permanent Negative Completion reply

   The next digit in the code indicates the response category:

      x0z - Syntax
      x1z - Information (e.g., help)
      x2z - Connections
      x3z - Authentication
      x4z - Unspecified as yet
      x5z - DICT System (These replies indicate the status of the
            receiver DICT system vis-a-vis the requested transfer or
            other DICT system action.)
      x8z - Nonstandard (private implementation) extensions

   The exact response codes that should be expected from each command
   are detailed in the description of that command.

   Certain status responses contain parameters such as numbers and
   strings.
   The number and type of such parameters is fixed for each response
   code to simplify interpretation of the response. Other status
   responses do not require specific text identifiers. Parameter
   requirements are detailed in the description of relevant commands.
   Except for specifically detailed parameters, the text following
   response codes is server-dependent.

   Parameters are separated from the numeric response code and from each
   other by a single space. All numeric parameters are decimal, and may
   have leading zeros. All string parameters MUST conform to the "atom"
   or "dqstring" grammar productions.

   If no parameters are present, and the server implementation provides
   no implementation-specific text, then there MAY or MAY NOT be a space
   after the response code.

   Response codes not specified in this standard may be used for any
   installation-specific additional commands also not specified. These
   should be chosen to fit the pattern of x8z specified above. The use
   of unspecified response codes for standard commands is prohibited.

2.4.2.  General Status Responses

   In response to every command, the following general status responses
   are possible:

      500 Syntax error, command not recognized
      501 Syntax error, illegal parameters
      502 Command not implemented
      503 Command parameter not implemented
      420 Server temporarily unavailable
      421 Server shutting down at operator request

2.4.3.  Text Responses

   Before text is sent a numeric status response line, using a 1yz code,
   will be sent indicating text will follow. Text is sent as a series of
   successive lines of textual matter, each terminated with a CRLF. A
   single line containing only a period (decimal code 46, ".") is sent
   to indicate the end of the text (i.e., the server will send a CRLF at
   the end of the last line of text, a period, and another CRLF).
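
   As an illustration of the framing rules in sections 2.4.1 and 2.4.3,
   the following Python sketch shows one way a client might classify a
   status line by its first digit and collect a text response up to the
   terminating period, collapsing a doubled leading period along the
   way. It is not part of the protocol definition: the names
   parse_status and read_text and the sample lines are hypothetical,
   chosen only for this example, and the input is assumed to be already
   decoded from UTF-8 into one string per line.

```python
from typing import Iterable, List, Tuple

# Meaning of the first digit of a status code (section 2.4.1).
REPLY_CLASS = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}


def parse_status(line: str) -> Tuple[int, str, str]:
    """Split a status line into (code, reply class, remaining text)."""
    line = line.rstrip("\r\n")
    code, rest = line[:3], line[3:]
    well_formed = (
        len(code) == 3
        and code.isascii()
        and code.isdigit()
        and code[0] in REPLY_CLASS
        # The space after the code may be absent when nothing follows it.
        and (rest == "" or rest.startswith(" "))
    )
    if not well_formed:
        raise ValueError(f"not a DICT status line: {line!r}")
    return int(code), REPLY_CLASS[code[0]], rest[1:]


def read_text(lines: Iterable[str]) -> List[str]:
    """Collect text lines up to the line holding only a period."""
    body: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == ".":
            return body  # end-of-text marker, not part of the text
        if line.startswith(".."):
            line = line[1:]  # undo the server's doubling of a leading period
        body.append(line)
    raise EOFError("text response ended without a terminating period")


if __name__ == "__main__":
    # Invented sample exchange; real server text is implementation-specific.
    stream = iter([
        "150 1 definitions retrieved\r\n",
        "first line\r\n",
        "..a line that originally began with one period\r\n",
        ".\r\n",
        "250 ok\r\n",
    ])
    print(parse_status(next(stream)))  # the 1yz line announcing text
    print(read_text(stream))           # the text itself
    print(parse_status(next(stream)))  # the 2yz line that follows the text
```

   Passing an iterator rather than a list lets the caller keep reading
   from the same stream, so the status line that follows the text can be
   parsed next.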
   If a line of original text contained a period as the first character
   of the line, that first period is doubled by the DICT server.
   Therefore, the client must examine the first character of each line
   received. Those that begin with two periods must have those two
   periods collapsed into one period. Those that contain only a single
   period followed by a CRLF indicate the end of the text response.

   If the OPTION MIME command has been given, all textual responses will
   be prefaced by a MIME header [RFC2045] followed by a single blank
   line (CRLF). See section 3.10.1 for more details on OPTION MIME.

   Following a text response, a 2yz response code will be sent.

   Text lines MUST NOT exceed 1024 characters in length, counting all
   characters including spaces, separators, punctuation, the extra
   initial period (if needed), and the trailing CRLF. Since UTF-8 may
   encode a character using up to 6 octets, the text line input buffer
   MUST be able to accept up to 6144 octets.

   By default, the text of the definitions MUST be composed of
   characters from the UCS character set [ISO10646] using the UTF-8
   [RFC2044] encoding. The UTF-8 encoding has the advantage of
   preserving the full range of 7-bit US ASCII [USASCII] values. Clients
   and servers MUST support UTF-8, even if only in some minimal fashion.

3.  Command and Response Details

   Below, each DICT command and appropriate responses are detailed. Each
   command is shown in upper case for clarity, but the DICT server is
   case-insensitive.

   Except for the AUTH and SASLAUTH commands, every command described in
   this section MUST be implemented by all DICT servers.

3.1.  Initial Connection

   When a client initially connects to a DICT server, a code 220 is sent
   if the client's IP is allowed to connect:

             220 text capabilities msg-id

   The code 220 is a banner, usually containing host name and DICT
   server version information.
   The second-to-last sequence of characters in the banner is the
   optional capabilities string, which will allow servers to declare
   support for extensions to the DICT protocol. The capabilities string
   is defined below:

   capabilities =  ["<" msg-atom *("." msg-atom) ">"]
   msg-atom     =  1*<any CHAR except SPACE, CTLs,
                     "<", ">", ".", and "\">

   Individual capabilities are described by a single msg-atom. For
   example, the string <html.gzip> might be used to describe a server
   that supports extensions which allow HTML or compressed output.
   Capability names beginning with "x" or "X" are reserved for
   experimental extensions, and SHOULD NOT be defined in any future DICT
   protocol specification. Some of these capabilities may inform the
   client that certain functionality is available or can be requested.
   The following capabilities are currently defined:

      mime        The OPTION MIME command is supported
      auth        The AUTH command is supported
      kerberos_v4 The SASL Kerberos version 4 mechanism is supported
      gssapi      The SASL GSSAPI [RFC2078] mechanism is supported
      skey        The SASL S/Key [RFC1760] mechanism is supported
      external    The SASL external mechanism is supported

   The last sequence of characters in the banner is a msg-id, similar to
   the format specified in [RFC822]. The simplified description is given
   below:

   msg-id       =  "<" spec ">"            ; Unique message id
   spec         =  local-part "@" domain
   local-part   =  msg-atom *("." msg-atom)