📄 rfc2376.txt
字号:
Network Working Group E. WhiteheadRequest for Comments: 2376 UC IrvineCategory: Informational M. Murata Fuji Xerox Info. Systems July 1998 XML Media TypesStatus of this Memo This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.Copyright Notice Copyright (C) The Internet Society (1998). All Rights Reserved.Abstract This document proposes two new media subtypes, text/xml and application/xml, for use in exchanging network entities which are conforming Extensible Markup Language (XML). XML entities are currently exchanged via the HyperText Transfer Protocol on the World Wide Web, are an integral part of the WebDAV protocol for remote web authoring, and are expected to have utility in many domains.Table of Contents 1 INTRODUCTION ....................................................2 2 NOTATIONAL CONVENTIONS ..........................................3 3 XML MEDIA TYPES .................................................3 3.1 Text/xml Registration ........................................3 3.2 Application/xml Registration .................................6 4 SECURITY CONSIDERATIONS .........................................8 5 THE BYTE ORDER MARK (BOM) AND CONVERSIONS TO/FROM UTF-16 ........9 6 EXAMPLES ........................................................9 6.1 text/xml with UTF-8 Charset .................................10 6.2 text/xml with UTF-16 Charset ................................10 6.3 text/xml with ISO-2022-KR Charset ...........................10 6.4 text/xml with Omitted Charset ...............................11 6.5 application/xml with UTF-16 Charset .........................11 6.6 application/xml with ISO-2022-KR Charset ....................11 6.7 application/xml with Omitted Charset and UTF-16 XML Entity ..12 6.8 application/xml with Omitted Charset and UTF-8 Entity .......12 6.9 application/xml with Omitted Charset and Internal Encoding Declaration.......................................................12Whitehead & Murata Informational [Page 1]RFC 2376 XML Media Types July 1998 7 REFERENCES .....................................................13 8 ACKNOWLEDGEMENTS ...............................................14 9 ADDRESSES OF AUTHORS ...........................................14 10 FULL COPYRIGHT STATEMENT ......................................151 Introduction The World Wide Web Consortium (W3C) has issued a Recommendation [REC-XML] which defines the Extensible Markup Language (XML), version 1. To enable the exchange of XML network entities, this document proposes two new media types, text/xml and application/xml. XML entities are currently exchanged on the World Wide Web, and XML is also used for property values and parameter marshalling by the WebDAV protocol for remote web authoring. Thus, there is a need for a media type to properly label the exchange of XML network entities. (Note that, as sometimes happens between two communities, both MIME and XML have defined the term entity, with different meanings.) Although XML is a subset of the Standard Generalized Markup Language (SGML) [ISO-8897], and currently is assigned the media types text/sgml and application/sgml, there are several reasons why use of text/sgml or application/sgml to label XML is inappropriate. First, there exist many applications which can process XML, but which cannot process SGML, due to SGML's larger feature set. Second, SGML applications cannot always process XML entities, because XML uses features of recent technical corrigenda to SGML. Third, the definition of text/sgml and application/sgml [RFC-1874] includes parameters for SGML bit combination transformation format (SGML- bctf), and SGML boot attribute (SGML-boot). Since XML does not use these parameters, it would be ambiguous if such parameters were given for an XML entity. For these reasons, the best approach for labeling XML network entities is to provide new media types for XML. Since XML is an integral part of the WebDAV Distributed Authoring Protocol, and since World Wide Web Consortium Recommendations have conventionally been assigned IETF tree media types, and since similar media types (HTML, SGML) have been assigned IETF tree media types, the XML media types also belong in the IETF media types tree.Whitehead & Murata Informational [Page 2]RFC 2376 XML Media Types July 19982 Notational Conventions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC-2119].3 XML Media Types This document introduces two new media types for XML entities, text/xml and application/xml. Registration information for these media types are described in the sections below. Every XML entity is suitable for use with the application/xml media type without modification. But this does not exploit the fact that XML can be treated as plain text in many cases. MIME user agents (and web user agents) that do not have explicit support for application/xml will treat it as application/octet-stream, for example, by offering to save it to a file. To indicate that an XML entity should be treated as plain text by default, use the text/xml media type. This restricts the encoding used in the XML entity to those that are compatible with the requirements for text media types as described in [RFC-2045] and [RFC-2046], e.g., UTF-8, but not UTF-16 (except for HTTP). XML provides a general framework for defining sequences of structured data. In some cases, it may be desirable to define new media types which use XML but define a specific application of XML, perhaps due to domain-specific security considerations or runtime information. This document does not prohibit future media types dedicated to such XML applications. However, developers of such media types are recommended to use this document as a basis. In particular, the charset parameter should be used in the same manner. Within the XML specification, XML entities can be classified into four types. In the XML terminology, they are called "document entities", "external DTD subsets", "external parsed entities", and "external parameter entities". The media types text/xml and application/xml can be used for any of these four types.3.1 Text/xml Registration MIME media type name: text MIME subtype name: xml Mandatory parameters: noneWhitehead & Murata Informational [Page 3]RFC 2376 XML Media Types July 1998 Optional parameters: charset Although listed as an optional parameter, the use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the character encoding of the XML entity. The charset parameter can also be used to provide protocol-specific operations, such as charset-based content negotiation in HTTP. "UTF-8" [RFC-2279] is the recommended value, representing the UTF-8 charset. UTF-8 is supported by all conforming XML processors [REC-XML]. If the XML entity is transmitted via HTTP, which uses a MIME-like mechanism that is exempt from the restrictions on the text top- level type (see section 19.4.1 of HTTP 1.1 [RFC-2068]), "UTF-16" (Appendix C.3 of [UNICODE] and Amendment 1 of [ISO-10646]) is also recommended. UTF-16 is supported by all conforming XML processors [REC-XML]. Since the handling of CR, LF and NUL for text types in most MIME applications would cause undesired transformations of individual octets in UTF-16 multi-octet characters, gateways from HTTP to these MIME applications MUST transform the XML entity from a text/xml; charset="utf-16" to application/xml; charset="utf-16". Conformant with [RFC-2046], if a text/xml entity is received with the charset parameter omitted, MIME processors and XML processors MUST use the default charset value of "us-ascii". In cases where the XML entity is transmitted via HTTP, the default charset value is still "us-ascii". Since the charset parameter is authoritative, the charset is not always declared within an XML encoding declaration. Thus, special care is needed when the recipient strips the MIME header and provides persistent storage of the received XML entity (e.g., in a file system). Unless the charset is UTF-8 or UTF-16, the recipient SHOULD also persistently store information about the charset, perhaps by embedding a correct XML encoding declaration within the XML entity. Encoding considerations: This media type MAY be encoded as appropriate for the charset and the capabilities of the underlying MIME transport. For 7-bit transports, data in both UTF-8 and UTF-16 is encoded in quoted- printable or base64. For 8-bit clean transport (e.g., ESMTP, 8BITMIME, or NNTP), UTF-8 is not encoded, but UTF-16 is base64 encoded. For binary clean transports (e.g., HTTP), no content- transfer-encoding is necessary.Whitehead & Murata Informational [Page 4]RFC 2376 XML Media Types July 1998 Security considerations: See section 4 below. Interoperability considerations: XML has proven to be interoperable across WebDAV clients and servers, and for import and export from multiple XML authoring tools. Published specification: see [REC-XML] Applications which use this media type: XML is device-, platform-, and vendor-neutral and is supported by a wide range of Web user agents, WebDAV clients and servers, as well as XML authoring tools. Additional information: Magic number(s): none Although no byte sequences can be counted on to always be present, XML entities in ASCII-compatible charsets (including UTF-8) often begin with hexadecimal 3C 3F 78 6D 6C ("<?xml"). For more information, see Appendix F of [REC-XML]. File extension(s): .xml, .dtd Macintosh File Type Code(s): "TEXT" Person & email address for further information: Dan Connolly <connolly@w3.org> Murata Makoto (Family Given) <murata@fxis.fujixerox.co.jp> Intended usage: COMMON Author/Change controller: The XML specification is a work product of the World Wide Web Consortium's XML Working Group, and was edited by: Tim Bray <tbray@textuality.com> Jean Paoli <jeanpa@microsoft.com> C. M. Sperberg-McQueen <cmsmcq@uic.edu> The W3C, and the W3C XML working group, has change control over the XML specification.Whitehead & Murata Informational [Page 5]
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -