Network Working Group E. Whitehead
Request for Comments: 2376 UC Irvine
Category: Informational M. Murata
Fuji Xerox Info. Systems
July 1998
XML Media Types
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1998). All Rights Reserved.
Abstract
This document proposes two new media subtypes, text/xml and
application/xml, for use in exchanging network entities which are
conforming Extensible Markup Language (XML). XML entities are
currently exchanged via the HyperText Transfer Protocol on the World
Wide Web, are an integral part of the WebDAV protocol for remote web
authoring, and are expected to have utility in many domains.
Table of Contents
1 INTRODUCTION ....................................................2
2 NOTATIONAL CONVENTIONS ..........................................3
3 XML MEDIA TYPES .................................................3
3.1 Text/xml Registration ........................................3
3.2 Application/xml Registration .................................6
4 SECURITY CONSIDERATIONS .........................................8
5 THE BYTE ORDER MARK (BOM) AND CONVERSIONS TO/FROM UTF-16 ........9
6 EXAMPLES ........................................................9
6.1 text/xml with UTF-8 Charset .................................10
6.2 text/xml with UTF-16 Charset ................................10
6.3 text/xml with ISO-2022-KR Charset ...........................10
6.4 text/xml with Omitted Charset ...............................11
6.5 application/xml with UTF-16 Charset .........................11
6.6 application/xml with ISO-2022-KR Charset ....................11
6.7 application/xml with Omitted Charset and UTF-16 XML Entity ..12
6.8 application/xml with Omitted Charset and UTF-8 Entity .......12
6.9 application/xml with Omitted Charset and Internal Encoding
Declaration.......................................................12
Whitehead & Murata Informational [Page 1]
RFC 2376 XML Media Types July 1998
7 REFERENCES .....................................................13
8 ACKNOWLEDGEMENTS ...............................................14
9 ADDRESSES OF AUTHORS ...........................................14
10 FULL COPYRIGHT STATEMENT ......................................15
1 Introduction
The World Wide Web Consortium (W3C) has issued a Recommendation
[REC-XML] which defines the Extensible Markup Language (XML), version
1.0. To enable the exchange of XML network entities, this document
proposes two new media types, text/xml and application/xml.
XML entities are currently exchanged on the World Wide Web, and XML
is also used for property values and parameter marshalling by the
WebDAV protocol for remote web authoring. Thus, there is a need for a
media type to properly label the exchange of XML network entities.
(Note that, as sometimes happens between two communities, both MIME
and XML have defined the term entity, with different meanings.)
Although XML is a subset of the Standard Generalized Markup Language
(SGML) [ISO-8897], and currently is assigned the media types
text/sgml and application/sgml, there are several reasons why use of
text/sgml or application/sgml to label XML is inappropriate. First,
there exist many applications which can process XML, but which cannot
process SGML, due to SGML's larger feature set. Second, SGML
applications cannot always process XML entities, because XML uses
features of recent technical corrigenda to SGML. Third, the
definition of text/sgml and application/sgml [RFC-1874] includes
parameters for SGML bit combination transformation format (SGML-
bctf), and SGML boot attribute (SGML-boot). Since XML does not use
these parameters, it would be ambiguous if such parameters were given
for an XML entity. For these reasons, the best approach for labeling
XML network entities is to provide new media types for XML.
Since XML is an integral part of the WebDAV Distributed Authoring
Protocol, and since World Wide Web Consortium Recommendations have
conventionally been assigned IETF tree media types, and since similar
media types (HTML, SGML) have been assigned IETF tree media types,
the XML media types also belong in the IETF media types tree.
2 Notational Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC-2119].
3 XML Media Types
This document introduces two new media types for XML entities,
text/xml and application/xml. Registration information for these
media types is given in the sections below.
Every XML entity is suitable for use with the application/xml media
type without modification. But this does not exploit the fact that
XML can be treated as plain text in many cases. MIME user agents
(and web user agents) that do not have explicit support for
application/xml will treat it as application/octet-stream, for
example, by offering to save it to a file.
To indicate that an XML entity should be treated as plain text by
default, use the text/xml media type. This restricts the encoding
used in the XML entity to those that are compatible with the
requirements for text media types as described in [RFC-2045] and
[RFC-2046], e.g., UTF-8, but not UTF-16 (except for HTTP).
XML provides a general framework for defining sequences of structured
data. In some cases, it may be desirable to define new media types
which use XML but define a specific application of XML, perhaps due
to domain-specific security considerations or runtime information.
This document does not prohibit future media types dedicated to such
XML applications. However, developers of such media types are
encouraged to use this document as a basis. In particular, the
charset parameter should be used in the same manner.
Within the XML specification, XML entities can be classified into
four types. In the XML terminology, they are called "document
entities", "external DTD subsets", "external parsed entities", and
"external parameter entities". The media types text/xml and
application/xml can be used for any of these four types.
3.1 Text/xml Registration
MIME media type name: text
MIME subtype name: xml
Mandatory parameters: none
Optional parameters: charset
Although listed as an optional parameter, the use of the charset
parameter is STRONGLY RECOMMENDED, since this information can be
used by XML processors to determine authoritatively the character
encoding of the XML entity. The charset parameter can also be used
to provide protocol-specific operations, such as charset-based
content negotiation in HTTP. "UTF-8" [RFC-2279] is the
recommended value, representing the UTF-8 charset. UTF-8 is
supported by all conforming XML processors [REC-XML].
If the XML entity is transmitted via HTTP, which uses a MIME-like
mechanism that is exempt from the restrictions on the text top-
level type (see section 19.4.1 of HTTP 1.1 [RFC-2068]), "UTF-16"
(Appendix C.3 of [UNICODE] and Amendment 1 of [ISO-10646]) is also
recommended. UTF-16 is supported by all conforming XML processors
[REC-XML]. Since the handling of CR, LF and NUL for text types in
most MIME applications would cause undesired transformations of
individual octets in UTF-16 multi-octet characters, gateways from
HTTP to these MIME applications MUST transform the XML entity from
text/xml; charset="utf-16" to application/xml; charset="utf-16".
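As an illustration only, the gateway relabeling described above might
be sketched as follows. The function name relabel_for_mime_gateway is
hypothetical and does not come from this memo; the sketch uses the
Python standard library to parse the Content-Type value and changes
only the combination of text/xml with a utf-16 charset.

```python
from email.message import Message


def relabel_for_mime_gateway(content_type: str) -> str:
    """Return the Content-Type a gateway might emit when an entity leaves HTTP.

    Hypothetical helper: only text/xml labeled with charset utf-16 is
    relabeled; every other value is passed through unchanged.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_type() and get_content_charset() both return lowercase.
    is_text_xml = msg.get_content_type() == "text/xml"
    is_utf16 = msg.get_content_charset("") == "utf-16"
    if is_text_xml and is_utf16:
        return 'application/xml; charset="utf-16"'
    return content_type
```

A real gateway would also have to preserve any other parameters
carried on the header, which this sketch does not attempt.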
Conformant with [RFC-2046], if a text/xml entity is received with
the charset parameter omitted, MIME processors and XML processors
MUST use the default charset value of "us-ascii". In cases where
the XML entity is transmitted via HTTP, the default charset value
is still "us-ascii".
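The default described above can be illustrated with a short sketch.
The helper names text_xml_charset and decode_text_xml are assumptions
made for this example. The point shown is that, for text/xml, an
omitted charset parameter means "us-ascii", whatever the entity itself
declares.

```python
from email.message import Message


def text_xml_charset(content_type: str) -> str:
    """Return the charset that governs a text/xml entity (lowercased)."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/xml":
        raise ValueError("this sketch only covers text/xml")
    # An omitted charset parameter defaults to us-ascii.
    return msg.get_content_charset("us-ascii")


def decode_text_xml(content_type: str, body: bytes) -> str:
    """Decode the entity body using only the MIME-level charset."""
    return body.decode(text_xml_charset(content_type))
```

Under this sketch, a body containing octets above 0x7F that arrives
labeled as plain text/xml fails to decode, even if it carries its own
encoding declaration.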
Since the charset parameter is authoritative, the charset is not
always declared within an XML encoding declaration. Thus, special
care is needed when the recipient strips the MIME header and
provides persistent storage of the received XML entity (e.g., in a
file system). Unless the charset is UTF-8 or UTF-16, the recipient
SHOULD also persistently store information about the charset,
perhaps by embedding a correct XML encoding declaration within the
XML entity.
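One possible way to embed the encoding declaration before storage is
sketched below. The function with_encoding_declaration and its regular
expressions are illustrative assumptions, not part of this memo. The
sketch assumes that the charset name is also a codec name known to
Python, and it is meant for charsets other than UTF-8 and UTF-16.

```python
import re

# Patterns for a leading XML or text declaration and its pseudo-attributes.
_DECL = re.compile(r"\A<\?xml(\s.*?)\?>", re.DOTALL)
_ENC = re.compile(r"""\s+encoding\s*=\s*("[^"]*"|'[^']*')""")
_VER = re.compile(r"""\s+version\s*=\s*("[^"]*"|'[^']*')""")


def with_encoding_declaration(body: bytes, charset: str) -> bytes:
    """Return body with a declaration that names the MIME charset."""
    text = body.decode(charset)
    new_enc = f' encoding="{charset}"'
    match = _DECL.match(text)
    if match is None:
        # No declaration present: prepend one.
        text = f'<?xml version="1.0"{new_enc}?>' + text
    else:
        # Drop any existing encoding, then insert the new one right
        # after the version pseudo-attribute (or first, if none).
        attrs = _ENC.sub("", match.group(1), count=1)
        version = _VER.match(attrs)
        cut = version.end() if version else 0
        attrs = attrs[:cut] + new_enc + attrs[cut:]
        text = f"<?xml{attrs}?>" + text[match.end():]
    return text.encode(charset)
```

Storing the charset out of band (for example, in file system metadata)
would be an alternative that leaves the entity's octets untouched.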
Encoding considerations:
This media type MAY be encoded as appropriate for the charset and
the capabilities of the underlying MIME transport. For 7-bit
transports, data in both UTF-8 and UTF-16 is encoded in quoted-
printable or base64. For 8-bit clean transport (e.g., ESMTP,
8BITMIME, or NNTP), UTF-8 is not encoded, but UTF-16 is base64
encoded. For binary clean transports (e.g., HTTP), no content-
transfer-encoding is necessary.
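The encoding considerations above amount to a small decision table,
sketched here for illustration. The function name and the transport
labels "7bit", "8bit" and "binary" are assumptions of this example.
Where the text allows either quoted-printable or base64 on 7-bit
transports, the sketch arbitrarily picks quoted-printable for UTF-8
and base64 for UTF-16.

```python
from typing import Optional


def content_transfer_encoding(charset: str, transport: str) -> Optional[str]:
    """Pick a Content-Transfer-Encoding label, or None if none is needed."""
    cs = charset.lower()
    if cs not in ("utf-8", "utf-16"):
        raise ValueError("only utf-8 and utf-16 are covered by this sketch")
    if transport == "binary":
        # Binary clean transports (e.g., HTTP) need no encoding.
        return None
    if transport == "8bit":
        # UTF-8 is sent as is, labeled 8bit here; UTF-16 is base64 encoded.
        return "base64" if cs == "utf-16" else "8bit"
    if transport == "7bit":
        # Assumed preference; either encoding is permitted for both.
        return "base64" if cs == "utf-16" else "quoted-printable"
    raise ValueError(f"unknown transport: {transport!r}")
```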
Security considerations:
See section 4 below.
Interoperability considerations:
XML has proven to be interoperable across WebDAV clients and
servers, and for import and export from multiple XML authoring
tools.
Published specification: see [REC-XML]
Applications which use this media type:
XML is device-, platform-, and vendor-neutral and is supported by
a wide range of Web user agents, WebDAV clients and servers, as
well as XML authoring tools.
Additional information:
Magic number(s): none
Although no byte sequences can be counted on to always be present,
XML entities in ASCII-compatible charsets (including UTF-8) often
begin with hexadecimal 3C 3F 78 6D 6C ("<?xml"). For more
information, see Appendix F of [REC-XML].
File extension(s): .xml, .dtd
Macintosh File Type Code(s): "TEXT"
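The byte sequence noted under "Magic number(s)" can be checked as in
the following sketch. The names XML_DECL_PREFIX and
looks_like_xml_declaration are hypothetical. Because the sequence is
not guaranteed to be present, a negative result says nothing about
whether the data is XML.

```python
# Hexadecimal 3C 3F 78 6D 6C, i.e. "<?xml" in ASCII-compatible charsets.
XML_DECL_PREFIX = bytes.fromhex("3C3F786D6C")


def looks_like_xml_declaration(data: bytes) -> bool:
    """Return True if data starts with the "<?xml" octets (a hint only)."""
    return data.startswith(XML_DECL_PREFIX)
```

A well-formed entity such as "<doc/>" has no declaration and is
therefore not detected by this check.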
Person & email address for further information:
Dan Connolly <connolly@w3.org>
Murata Makoto (Family Given) <murata@fxis.fujixerox.co.jp>
Intended usage: COMMON
Author/Change controller:
The XML specification is a work product of the World Wide Web
Consortium's XML Working Group, and was edited by:
Tim Bray <tbray@textuality.com>
Jean Paoli <jeanpa@microsoft.com>
C. M. Sperberg-McQueen <cmsmcq@uic.edu>
The W3C and the W3C XML Working Group have change control over
the XML specification.