Network Working Group                                     E. Whitehead
Request for Comments: 2376                                   UC Irvine
Category: Informational                                      M. Murata
                                              Fuji Xerox Info. Systems
                                                             July 1998


                            XML Media Types

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1998).  All Rights Reserved.

Abstract

   This document proposes two new media subtypes, text/xml and
   application/xml, for use in exchanging network entities which are
   conforming Extensible Markup Language (XML). XML entities are
   currently exchanged via the HyperText Transfer Protocol on the World
   Wide Web, are an integral part of the WebDAV protocol for remote web
   authoring, and are expected to have utility in many domains.

Table of Contents

   1 INTRODUCTION ....................................................2
   2 NOTATIONAL CONVENTIONS ..........................................3
   3 XML MEDIA TYPES .................................................3
   3.1  Text/xml Registration ........................................3
   3.2  Application/xml Registration .................................6
   4 SECURITY CONSIDERATIONS .........................................8
   5 THE BYTE ORDER MARK (BOM) AND CONVERSIONS TO/FROM UTF-16 ........9
   6 EXAMPLES ........................................................9
   6.1  text/xml with UTF-8 Charset .................................10
   6.2  text/xml with UTF-16 Charset ................................10
   6.3  text/xml with ISO-2022-KR Charset ...........................10
   6.4  text/xml with Omitted Charset ...............................11
   6.5  application/xml with UTF-16 Charset .........................11
   6.6  application/xml with ISO-2022-KR Charset ....................11
   6.7  application/xml with Omitted Charset and UTF-16 XML Entity ..12
   6.8  application/xml with Omitted Charset and UTF-8 Entity .......12
   6.9  application/xml with Omitted Charset and Internal Encoding
   Declaration.......................................................12



Whitehead & Murata           Informational                      [Page 1]

RFC 2376                    XML Media Types                    July 1998


   7 REFERENCES .....................................................13
   8 ACKNOWLEDGEMENTS ...............................................14
   9 ADDRESSES OF AUTHORS ...........................................14
   10 FULL COPYRIGHT STATEMENT ......................................15

1  Introduction

   The World Wide Web Consortium (W3C) has issued a Recommendation
   [REC-XML] which defines the Extensible Markup Language (XML), version
   1.0. To enable the exchange of XML network entities, this document
   proposes two new media types, text/xml and application/xml.

   XML entities are currently exchanged on the World Wide Web, and XML
   is also used for property values and parameter marshalling by the
   WebDAV protocol for remote web authoring. Thus, there is a need for a
   media type to properly label the exchange of XML network entities.
   (Note that, as sometimes happens between two communities, both MIME
   and XML have defined the term entity, with different meanings.)

   Although XML is a subset of the Standard Generalized Markup Language
   (SGML) [ISO-8897], and currently is assigned the media types
   text/sgml and application/sgml, there are several reasons why use of
   text/sgml or application/sgml to label XML is inappropriate. First,
   there exist many applications which can process XML, but which cannot
   process SGML, due to SGML's larger feature set. Second, SGML
   applications cannot always process XML entities, because XML uses
   features of recent technical corrigenda to SGML.  Third, the
   definition of text/sgml and application/sgml [RFC-1874] includes
   parameters for SGML bit combination transformation format
   (SGML-bctf), and SGML boot attribute (SGML-boot). Since XML does not
   use these parameters, it would be ambiguous if such parameters were
   given for an XML entity.  For these reasons, the best approach for
   labeling XML network entities is to provide new media types for XML.

   Since XML is an integral part of the WebDAV Distributed Authoring
   Protocol, and since World Wide Web Consortium Recommendations have
   conventionally been assigned IETF tree media types, and since similar
   media types (HTML, SGML) have been assigned IETF tree media types,
   the XML media types also belong in the IETF media types tree.












2  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC-2119].

3  XML Media Types

   This document introduces two new media types for XML entities,
   text/xml and application/xml.  Registration information for these
   media types is described in the sections below.

   Every XML entity is suitable for use with the application/xml media
   type without modification.  But this does not exploit the fact that
   XML can be treated as plain text in many cases.  MIME user agents
   (and web user agents) that do not have explicit support for
   application/xml will treat it as application/octet-stream, for
   example, by offering to save it to a file.

   To indicate that an XML entity should be treated as plain text by
   default, use the text/xml media type.  This restricts the encoding
   used in the XML entity to those that are compatible with the
   requirements for text media types as described in [RFC-2045] and
   [RFC-2046], e.g., UTF-8, but not UTF-16 (except for HTTP).

   XML provides a general framework for defining sequences of structured
   data.  In some cases, it may be desirable to define new media types
   which use XML but define a specific application of XML, perhaps due
   to domain-specific security considerations or runtime information.
   This document does not prohibit future media types dedicated to such
   XML applications. However, developers of such media types are
   advised to use this document as a basis.  In particular, the
   charset parameter should be used in the same manner.

   Within the XML specification, XML entities can be classified into
   four types.  In the XML terminology, they are called "document
   entities", "external DTD subsets", "external parsed entities", and
   "external parameter entities".  The media types text/xml and
   application/xml can be used for any of these four types.

3.1 Text/xml Registration

   MIME media type name: text

   MIME subtype name: xml

   Mandatory parameters: none




   Optional parameters: charset

      Although listed as an optional parameter, the use of the charset
      parameter is STRONGLY RECOMMENDED, since this information can be
      used by XML processors to determine authoritatively the character
      encoding of the XML entity. The charset parameter can also be used
      to provide protocol-specific operations, such as charset-based
      content negotiation in HTTP.  "UTF-8" [RFC-2279] is the
      recommended value, representing the UTF-8 charset. UTF-8 is
      supported by all conforming XML processors [REC-XML].

      If the XML entity is transmitted via HTTP, which uses a MIME-like
      mechanism that is exempt from the restrictions on the text top-
      level type (see section 19.4.1 of HTTP 1.1 [RFC-2068]), "UTF-16"
      (Appendix C.3 of [UNICODE] and Amendment 1 of [ISO-10646]) is also
      recommended.  UTF-16 is supported by all conforming XML processors
      [REC-XML].  Since the handling of CR, LF and NUL for text types in
      most MIME applications would cause undesired transformations of
      individual octets in UTF-16 multi-octet characters, gateways from
      HTTP to these MIME applications MUST transform the XML entity from
      text/xml; charset="utf-16" to application/xml; charset="utf-16".
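
      The gateway rule above can be sketched in Python as follows.
      This is a minimal illustration, not part of the registration: the
      function name relabel_for_mime is hypothetical, and the standard
      library class email.message.Message is used only to parse the
      Content-Type value.

```python
from email.message import Message


def relabel_for_mime(content_type: str) -> str:
    """Return the Content-Type a gateway might emit when passing an
    entity from HTTP into a MIME transport (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")  # None when the parameter is absent
    is_utf16 = isinstance(charset, str) and charset.lower() == "utf-16"
    if msg.get_content_type() == "text/xml" and is_utf16:
        # UTF-16 octets would be damaged by CR/LF/NUL handling for text types.
        return 'application/xml; charset="utf-16"'
    return content_type
```

      Any other label, such as text/xml with charset="utf-8", is passed
      through unchanged in this sketch.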

      Conformant with [RFC-2046], if a text/xml entity is received with
      the charset parameter omitted, MIME processors and XML processors
      MUST use the default charset value of "us-ascii".  In cases where
      the XML entity is transmitted via HTTP, the default charset value
      is still "us-ascii".
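
      As an illustration of the default described above, the following
      Python sketch derives the charset of a text/xml entity from its
      Content-Type value. The function name text_xml_charset is an
      assumption made for this example; it deliberately rejects other
      media types, since their defaults are defined elsewhere.

```python
from email.message import Message


def text_xml_charset(content_type: str) -> str:
    """Return the charset of a text/xml entity, falling back to
    "us-ascii" when the charset parameter is omitted (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/xml":
        raise ValueError("this sketch only covers text/xml")
    charset = msg.get_param("charset", "us-ascii")
    if not isinstance(charset, str):
        # RFC 2231 encoded parameters come back as a 3-tuple; take the value.
        charset = charset[2]
    return charset.lower()
```

      The same fallback applies whether or not the entity arrived via
      HTTP, so the sketch takes no transport argument.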

      Since the charset parameter is authoritative, the charset is not
      always declared within an XML encoding declaration.  Thus, special
      care is needed when the recipient strips the MIME header and
      provides persistent storage of the received XML entity (e.g., in a
      file system). Unless the charset is UTF-8 or UTF-16, the recipient
      SHOULD also persistently store information about the charset,
      perhaps by embedding a correct XML encoding declaration within the
      XML entity.
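
      One way a recipient might embed the charset before storing an
      entity is sketched below in Python. The helper name
      prepare_for_storage and the regular expression are assumptions
      made for illustration; the sketch also assumes the charset name
      is one that Python's codecs recognize and that the entity begins
      directly with its declaration, if it has one.

```python
import re

_Q = r"""(?:"[^"]*"|'[^']*')"""  # a quoted pseudo-attribute value
_DECL = re.compile(
    r"^<\?xml(?=\s)"
    r"(\s+version\s*=\s*" + _Q + r")?"      # group 1: optional version
    r"(?:\s+encoding\s*=\s*" + _Q + r")?"   # stale encoding, discarded
    r"([^?]*)\?>"                           # group 2: e.g. standalone
)


def prepare_for_storage(payload: bytes, charset: str) -> bytes:
    """Return payload with an encoding declaration naming charset,
    unless the charset is UTF-8 or UTF-16 (hypothetical helper)."""
    if charset.lower() in ("utf-8", "utf-16"):
        return payload  # no extra charset information is needed
    text = payload.decode(charset)
    match = _DECL.match(text)
    if match is None:
        # No declaration present: prepend one.
        decl = f'<?xml version="1.0" encoding="{charset}"?>\n'
        return (decl + text).encode(charset)
    version = match.group(1) or ""
    rest = match.group(2)
    decl = f'<?xml{version} encoding="{charset}"{rest}?>'
    return (decl + text[match.end():]).encode(charset)
```

      Storing the charset in separate metadata alongside the file would
      satisfy the same recommendation without rewriting the entity.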

   Encoding considerations:

      This media type MAY be encoded as appropriate for the charset and
      the capabilities of the underlying MIME transport. For 7-bit
      transports, data in both UTF-8 and UTF-16 is encoded in
      quoted-printable or base64.  For 8-bit clean transports (e.g.,
      ESMTP, 8BITMIME, or NNTP), UTF-8 is not encoded, but UTF-16 is
      base64 encoded.  For binary clean transports (e.g., HTTP), no
      content-transfer-encoding is necessary.
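
      The choices above can be summarized as a small lookup, sketched
      here in Python. The function name and the transport labels
      "7bit", "8bit" and "binary" are assumptions made for this
      example; only the two charsets discussed above are handled.

```python
def transfer_encoding_options(transport: str, charset: str) -> tuple:
    """Return the content-transfer-encodings suitable for a text/xml
    entity; an empty tuple means no encoding is needed
    (hypothetical helper)."""
    cs = charset.lower()
    if cs not in ("utf-8", "utf-16"):
        raise ValueError("this sketch only covers UTF-8 and UTF-16")
    if transport == "7bit":
        return ("quoted-printable", "base64")
    if transport == "8bit":
        # UTF-8 passes through unencoded; UTF-16 still needs base64.
        return () if cs == "utf-8" else ("base64",)
    if transport == "binary":
        return ()
    raise ValueError(f"unknown transport class: {transport!r}")
```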





   Security considerations:

      See section 4 below.

   Interoperability considerations:

      XML has proven to be interoperable across WebDAV clients and
      servers, and for import and export from multiple XML authoring
      tools.

   Published specification: see [REC-XML]

   Applications which use this media type:

      XML is device-, platform-, and vendor-neutral and is supported by
      a wide range of Web user agents, WebDAV clients and servers, as
      well as XML authoring tools.

   Additional information:

      Magic number(s): none

      Although no byte sequences can be counted on to always be present,
      XML entities in ASCII-compatible charsets (including UTF-8) often
      begin with hexadecimal 3C 3F 78 6D 6C ("<?xml").  For more
      information, see Appendix F of [REC-XML].

      File extension(s): .xml, .dtd
      Macintosh File Type Code(s): "TEXT"
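
      The byte sequence mentioned under "Magic number(s)" can serve as a
      hint, though never as proof, that an entity is XML. A minimal
      Python sketch follows; the names XML_DECL_PREFIX and
      may_start_with_xml_declaration are hypothetical.

```python
# Hexadecimal 3C 3F 78 6D 6C is "<?xml" in ASCII-compatible charsets.
XML_DECL_PREFIX = bytes.fromhex("3C 3F 78 6D 6C")


def may_start_with_xml_declaration(data: bytes) -> bool:
    """Return True when data begins with the "<?xml" octets.

    A False result does not mean the entity is not XML: the declaration
    is optional, and UTF-16 entities use different octets.
    """
    return data.startswith(XML_DECL_PREFIX)
```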

   Person & email address for further information:

      Dan Connolly <connolly@w3.org>
      Murata Makoto (Family Given) <murata@fxis.fujixerox.co.jp>

   Intended usage: COMMON

   Author/Change controller:

      The XML specification is a work product of the World Wide Web
      Consortium's XML Working Group, and was edited by:

      Tim Bray <tbray@textuality.com>
      Jean Paoli <jeanpa@microsoft.com>
      C. M. Sperberg-McQueen <cmsmcq@uic.edu>

      The W3C, and the W3C XML working group, have change control over
      the XML specification.


