XML provides a general framework for defining sequences of structured
data. In some cases, it may be desirable to define new media types
that use XML but define a specific application of XML, perhaps due to
domain-specific security considerations or runtime information.
Furthermore, such media types may allow UTF-8 or UTF-16 only and
prohibit other charsets. This document does not prohibit such media
types and in fact expects them to proliferate. However, developers
of such media types are STRONGLY RECOMMENDED to use this document as
a basis for their registration. In particular, the charset parameter
SHOULD be used in the same manner, as described in Section 7.1, in
order to enhance interoperability.
An XML document labeled as text/xml or application/xml might contain
namespace declarations, stylesheet-linking processing instructions
(PIs), schema information, or other declarations that might be used
to suggest how the document is to be processed. For example, a
document might have the XHTML namespace and a reference to a CSS
stylesheet. Such a document might be handled by applications that
would use this information to dispatch the document for appropriate
processing.
Murata, et al. Standards Track [Page 6]
RFC 3023 XML Media Types January 2001
3.1 Text/xml Registration
MIME media type name: text
MIME subtype name: xml
Mandatory parameters: none
Optional parameters: charset
Although listed as an optional parameter, the use of the charset
parameter is STRONGLY RECOMMENDED, since this information can be
used by XML processors to determine authoritatively the character
encoding of the XML MIME entity. The charset parameter can also
be used to provide protocol-specific operations, such as charset-
based content negotiation in HTTP. "utf-8" [RFC2279] is the
recommended value, representing the UTF-8 charset. UTF-8 is
supported by all conforming processors of [XML].
If the XML MIME entity is transmitted via HTTP, which uses a
MIME-like mechanism that is exempt from the restrictions on the
text top-level type (see section 19.4.1 of [RFC2616]), "utf-16"
[RFC2781] is also recommended. UTF-16 is supported by all
conforming processors of [XML]. Since the handling of CR, LF and
NUL for text types in most MIME applications would cause undesired
transformations of individual octets in UTF-16 multi-octet
characters, gateways from HTTP to these MIME applications MUST
transform the XML MIME entity from text/xml; charset="utf-16" to
application/xml; charset="utf-16".
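The gateway rule above can be sketched as follows. This is only an
illustration, not part of the registration: the function name
relabel_for_mime_gateway is hypothetical, the header is parsed with
Python's standard email.message.Message, and any parameters other
than charset are assumed to be absent.

```python
from email.message import Message


def relabel_for_mime_gateway(content_type: str) -> str:
    """Return the Content-Type a gateway would emit when passing an
    HTTP entity on to a MIME transport (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")  # unquoted value, or None if absent
    is_utf16 = isinstance(charset, str) and charset.lower() == "utf-16"
    if msg.get_content_type() == "text/xml" and is_utf16:
        # CR, LF and NUL handling for text types would damage UTF-16,
        # so the entity is relabeled as application/xml.
        return 'application/xml; charset="utf-16"'
    # Every other label is passed through unchanged.
    return content_type
```

Only the label changes in this sketch; the entity body itself is
left untouched.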
Conformant with [RFC2046], if a text/xml entity is received with
the charset parameter omitted, MIME processors and XML processors
MUST use the default charset value of "us-ascii"[ASCII]. In cases
where the XML MIME entity is transmitted via HTTP, the default
charset value is still "us-ascii". (Note: There is an
inconsistency between this specification and HTTP/1.1, which uses
ISO-8859-1[ISO8859] as the default for a historical reason. Since
XML is a new format, a new default should be chosen for better
I18N. US-ASCII was chosen, since it is the intersection of UTF-8
and ISO-8859-1 and since it is already used by MIME.)
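As a minimal sketch of the default rule above, a recipient could
determine the charset of a text/xml entity as shown below. The
function name text_xml_charset is hypothetical, and the header is
parsed with Python's standard email.message.Message.

```python
from email.message import Message


def text_xml_charset(content_type: str) -> str:
    """Charset a recipient would use for a text/xml entity
    (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")  # None when the parameter is omitted
    if isinstance(charset, str) and charset:
        # The charset parameter is authoritative when present.
        return charset.lower()
    # Omitted parameter: the default is "us-ascii", even over HTTP.
    return "us-ascii"
```

Note that the sketch never inspects the entity body: for text/xml,
an encoding declaration inside the document does not override the
header.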
There are several reasons that the charset parameter is
authoritative. First, some MIME processing engines do transcoding
of MIME bodies of the top-level media type "text" without
reference to any of the internal content. Thus, it is possible
that some agent might change text/xml; charset="iso-2022-jp" to
text/xml; charset="utf-8" without modifying the encoding
declaration of an XML document. Second, text/xml must be
compatible with text/plain, since MIME agents that do not
understand text/xml will fall back to handling it as text/plain.
If the charset parameter for text/xml were not authoritative, such
fallback would cause data corruption. Third, recent web servers
have been improved so that users can specify the charset
parameter. Fourth, [RFC2130] specifies that the recommended
specification scheme is the "charset" parameter.
Since the charset parameter is authoritative, the charset is not
always declared within an XML encoding declaration. Thus, special
care is needed when the recipient strips the MIME header and
provides persistent storage of the received XML MIME entity (e.g.,
in a file system). Unless the charset is UTF-8 or UTF-16, the
recipient SHOULD also persistently store information about the
charset, perhaps by embedding a correct XML encoding declaration
within the XML MIME entity.
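One way a recipient might embed a correct encoding declaration
before storing an entity is sketched below. The function name
with_encoding_declaration is hypothetical. The sketch assumes that
the charset name is also known to Python as a codec name, that the
entity is an XML document rather than an external parsed entity,
and that a missing XML declaration implies version 1.0.

```python
import re

# XML declaration at the very start of the document.
_XML_DECL = re.compile(
    r'^<\?xml\s+version\s*=\s*(["\'])(?P<version>[^"\']+)\1(?P<rest>[^?]*)\?>'
)
# An existing encoding pseudo-attribute inside that declaration.
_ENCODING = re.compile(r'\s+encoding\s*=\s*(["\'])[^"\']*\1')


def with_encoding_declaration(body: bytes, charset: str) -> bytes:
    """Return the entity with an XML declaration naming `charset`
    (hypothetical helper; `charset` comes from the MIME header)."""
    text = body.decode(charset)
    match = _XML_DECL.match(text)
    if match:
        version = match.group("version")
        # Drop a stale encoding name but keep e.g. standalone="yes".
        rest = _ENCODING.sub("", match.group("rest"))
        text = text[match.end():]
    else:
        version, rest = "1.0", ""
    decl = f'<?xml version="{version}" encoding="{charset}"{rest}?>'
    return (decl + text).encode(charset)
```

The stored octets stay in the original charset; only the declaration
is added or corrected, so the file remains interpretable after the
MIME header has been discarded.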
Encoding considerations: This media type MAY be encoded as
appropriate for the charset and the capabilities of the underlying
MIME transport. For 7-bit transports, data in UTF-8 MUST be
encoded in quoted-printable or base64. For 8-bit clean transport
(e.g., 8BITMIME[RFC1652] ESMTP or NNTP[RFC0977]), UTF-8 does not
need to be encoded. Over HTTP[RFC2616], no content-transfer-
encoding is necessary and UTF-16 may also be used.
Security considerations: See Section 10.
Interoperability considerations: XML has proven to be interoperable
across WebDAV clients and servers, and for import and export from
multiple XML authoring tools. For maximum interoperability,
validating processors are recommended. Although non-validating
processors may be more efficient, they are not required to handle
all features of XML. For further information, see sub-section 2.9
"Standalone Document Declaration" and section 5 "Conformance" of
[XML].
Published specification: Extensible Markup Language (XML) 1.0 (Second
Edition)[XML].
Applications which use this media type: XML is device-, platform-,
and vendor-neutral and is supported by a wide range of Web user
agents, WebDAV[RFC2518] clients and servers, as well as XML
authoring tools.
Additional information:
Magic number(s): None.
Although no byte sequences can be counted on to always be
present, XML MIME entities in ASCII-compatible charsets
(including UTF-8) often begin with hexadecimal 3C 3F 78 6D 6C
("<?xml"), and those in UTF-16 often begin with hexadecimal FE
FF 00 3C 00 3F 00 78 00 6D 00 6C or FF FE 3C 00 3F 00 78 00 6D
00 6C 00 (the Byte Order Mark (BOM) followed by "<?xml"). For
more information, see Appendix F of [XML].
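The byte sequences listed above could be checked as in the following
sketch. The function name sniff_xml_prefix and its return labels are
hypothetical. Because none of these sequences is guaranteed to be
present, a result of None does not mean the entity is not XML.

```python
from typing import Optional

# Byte sequences quoted in the registration text above.
ASCII_COMPATIBLE = bytes.fromhex("3C 3F 78 6D 6C")  # "<?xml"
UTF16_BE = bytes.fromhex("FE FF 00 3C 00 3F 00 78 00 6D 00 6C")
UTF16_LE = bytes.fromhex("FF FE 3C 00 3F 00 78 00 6D 00 6C 00")


def sniff_xml_prefix(data: bytes) -> Optional[str]:
    """Return a label for the matching prefix, or None if no listed
    sequence matches (hypothetical helper)."""
    if data.startswith(UTF16_BE):
        return "utf-16 big-endian"
    if data.startswith(UTF16_LE):
        return "utf-16 little-endian"
    if data.startswith(ASCII_COMPATIBLE):
        return "ascii-compatible"
    return None
```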
File extension(s): .xml
Macintosh File Type Code(s): "TEXT"
Person and email address for further information:
MURATA Makoto (FAMILY Given) <mmurata@trl.ibm.co.jp>
Simon St.Laurent <simonstl@simonstl.com>
Daniel Kohn <dan@dankohn.com>
Intended usage: COMMON
Author/Change controller: The XML specification is a work product of
the World Wide Web Consortium's XML Working Group, and was edited
by:
Tim Bray <tbray@textuality.com>
Jean Paoli <jeanpa@microsoft.com>
C. M. Sperberg-McQueen <cmsmcq@uic.edu>
Eve Maler <eve.maler@east.sun.com>
The W3C, and the W3C XML Core Working Group, have change control
over the XML specification.
3.2 Application/xml Registration
MIME media type name: application
MIME subtype name: xml
Mandatory parameters: none
Optional parameters: charset
Although listed as an optional parameter, the use of the charset
parameter is STRONGLY RECOMMENDED, since this information can be
used by XML processors to determine authoritatively the charset of
the XML MIME entity. The charset parameter can also be used to
provide protocol-specific operations, such as charset-based
content negotiation in HTTP.
"utf-8" [RFC2279] and "utf-16" [RFC2781] are the recommended
values, representing the UTF-8 and UTF-16 charsets, respectively.
These charsets are preferred since they are supported by all
conforming processors of [XML].
If an application/xml entity is received where the charset
parameter is omitted, no information is being provided about the
charset by the MIME Content-Type header. Conforming XML
processors MUST follow the requirements in section 4.3.3 of [XML]
that directly address this contingency. However, MIME processors
that are not XML processors SHOULD NOT assume a default charset if
the charset parameter is omitted from an application/xml entity.
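The handling described above might look like the following sketch
for an XML processor. The function name application_xml_charset is
hypothetical, and the body inspection is a deliberately simplified
stand-in for section 4.3.3 and Appendix F of [XML]: it recognizes
only a BOM or an encoding declaration in an ASCII-compatible
encoding, and otherwise assumes UTF-8.

```python
import re
from email.message import Message

# Encoding declaration readable in an ASCII-compatible encoding.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def application_xml_charset(content_type: str, body: bytes) -> str:
    """Charset an XML processor would use for an application/xml
    entity (hypothetical, simplified helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if isinstance(charset, str) and charset:
        return charset.lower()  # authoritative when present
    # Parameter omitted: fall back to the document itself.
    if body.startswith((b"\xfe\xff", b"\xff\xfe")):
        return "utf-16"  # UTF-16 byte order mark
    if body.startswith(b"\xef\xbb\xbf"):
        body = body[3:]  # skip a UTF-8 byte order mark
    match = _ENCODING_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # neither BOM nor declaration
```

A MIME processor that is not an XML processor would stop after the
header check and report that no charset information is available,
rather than apply the fallback steps.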
There are several reasons that the charset parameter is
authoritative. First, recent web servers have been improved so
that users can specify the charset parameter. Second, [RFC2130]
specifies that the recommended specification scheme is the
"charset" parameter.
On the other hand, it has been argued that the charset parameter
should be omitted and the mechanism described in Appendix F of
[XML] (which is non-normative) should be solely relied on. This
approach would allow users to avoid configuration of the charset
parameter; an XML document stored in a file is likely to contain a
correct encoding declaration or BOM (if necessary), since the
operating system does not typically provide charset information
for files. If users would like to rely on the encoding
declaration or BOM and to hide charset information from protocols,
they may choose not to use the parameter.
Since the charset parameter is authoritative, the charset is not
always declared within an XML encoding declaration. Thus, special
care is needed when the recipient strips the MIME header and
provides persistent storage of the received XML MIME entity (e.g.,
in a file system). Unless the charset is UTF-8 or UTF-16, the
recipient SHOULD also persistently store information about the
charset, perhaps by embedding a correct XML encoding declaration
within the XML MIME entity.
Encoding considerations: This media type MAY be encoded as
appropriate for the charset and the capabilities of the underlying
MIME transport. For 7-bit transports, data in either UTF-8 or
UTF-16 MUST be encoded in quoted-printable or base64. For 8-bit
clean transport (e.g., 8BITMIME[RFC1652] ESMTP or NNTP[RFC0977]),
UTF-8 is not encoded, but the UTF-16 family MUST be encoded in
base64. For binary clean transports (e.g., HTTP[RFC2616]), no
content-transfer-encoding is necessary.
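The encoding considerations above can be read as a small decision
table, sketched below. The function name content_transfer_encoding
and the transport labels "7bit", "8bit" and "binary" are
hypothetical. Where the text allows either quoted-printable or
base64, the sketch arbitrarily picks quoted-printable for UTF-8 and
base64 for UTF-16. Charsets other than UTF-8 and the UTF-16 family
are not covered by the text and are rejected here.

```python
from typing import Optional


def content_transfer_encoding(charset: str, transport: str) -> Optional[str]:
    """Pick a content-transfer-encoding for an application/xml entity
    (hypothetical helper); None means no encoding is necessary."""
    charset = charset.lower()
    is_utf16 = charset.startswith("utf-16")  # utf-16, utf-16be, utf-16le
    if not is_utf16 and charset != "utf-8":
        raise ValueError(f"charset not covered by this sketch: {charset}")
    if transport == "binary":  # e.g., HTTP
        return None
    if transport == "8bit":  # e.g., 8BITMIME ESMTP or NNTP
        return "base64" if is_utf16 else "8bit"
    if transport == "7bit":
        # Either encoding is permitted; this choice is an assumption.
        return "base64" if is_utf16 else "quoted-printable"
    raise ValueError(f"unknown transport: {transport}")
```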
Security considerations: See Section 10.
Interoperability considerations: Same as Section 3.1.
Published specification: Same as Section 3.1.
Applications which use this media type: Same as Section 3.1.
Additional information: Same as Section 3.1.
Person and email address for further information: Same as Section
3.1.
Intended usage: COMMON
Author/Change controller: Same as Section 3.1.
3.3 Text/xml-external-parsed-entity Registration
MIME media type name: text
MIME subtype name: xml-external-parsed-entity
Mandatory parameters: none
Optional parameters: charset
The charset parameter of text/xml-external-parsed-entity is
handled the same as that of text/xml as described in Section 3.1.
Encoding considerations: Same as Section 3.1.
Security considerations: See Section 10.
Interoperability considerations: XML external parsed entities are as
interoperable as XML documents, though they have a less tightly
constrained structure and therefore need to be referenced by XML
documents for proper handling by XML processors. Similarly, XML
documents cannot be reliably used as external parsed entities
because external parsed entities are prohibited from having
standalone document declarations or DTDs. Identifying XML