📄 rec-xml-20001006

<dd> <p>[<a title="Must" name="dt-must">Definition</a>: Conforming documents
and XML processors are required to behave as described; otherwise they are
in error. ]</p> </dd>
<dt class="label">error</dt>
<dd> <p>[<a title="Error" name="dt-error">Definition</a>: A violation of the
rules of this specification; results are undefined. Conforming software may
detect and report an error and may recover from it.]</p> </dd>
<dt class="label">fatal error</dt>
<dd> <p>[<a title="Fatal Error" name="dt-fatal">Definition</a>: An error which
a conforming <a title="XML Processor" href="#dt-xml-proc">XML processor</a>
must detect and report to the application. After encountering a fatal error,
the processor may continue processing the data to search for further errors
and may report such errors to the application. In order to support correction
of errors, the processor may make unprocessed data from the document (with
intermingled character data and markup) available to the application. Once
a fatal error is detected, however, the processor must not continue normal
processing (i.e., it must not continue to pass character data and information
about the document's logical structure to the application in the normal way).]</p> </dd>
<dt class="label">at user option</dt>
<dd> <p>[<a title="At user option" name="dt-atuseroption">Definition</a>:
Conforming software may or must (depending on the modal verb in the sentence)
behave as described; if it does, it must provide users a means to enable or
disable the behavior described.]</p> </dd>
<dt class="label">validity constraint</dt>
<dd> <p>[<a title="Validity constraint" name="dt-vc">Definition</a>: A rule
which applies to all <a title="Validity" href="#dt-valid">valid</a> XML documents.
Violations of validity constraints are errors; they must, at user option,
be reported by <a title="Validating Processor" href="#dt-validating">validating
XML processors</a>.]</p> </dd>
<dt class="label">well-formedness constraint</dt>
<dd> <p>[<a title="Well-formedness constraint" name="dt-wfc">Definition</a>:
A rule which applies to all <a title="Well-Formed" href="#dt-wellformed">well-formed</a>
XML documents. Violations of well-formedness constraints are <a title="Fatal Error"
href="#dt-fatal">fatal errors</a>.]</p> </dd>
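<dd> <p>As an illustration only, the following sketch shows how a violated well-formedness constraint surfaces as an error reported to the application. It assumes Python's standard <code>xml.etree.ElementTree</code> module, a non-validating parser that is not mentioned in this specification; the helper name <code>check_well_formed</code> is hypothetical.</p>

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return (True, None) if the text parses, else (False, error message).

    ElementTree stops at the first well-formedness violation and raises
    ParseError, so no partial tree is handed to the caller.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, None


# Properly nested elements parse without error.
nested_ok, _ = check_well_formed("<a><b>text</b></a>")

# Overlapping start- and end-tags violate well-formedness.
overlap_ok, overlap_message = check_well_formed("<a><b>text</a></b>")

# A second top-level element does not match the document production.
two_roots_ok, _ = check_well_formed("<a/><b/>")
```

<p>In this sketch the parser raises instead of returning a tree, which is one way to stop normal processing once such an error is detected; other processors may additionally keep scanning in order to report further errors.</p> </dd>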
<dt class="label">match</dt>
<dd> <p>[<a title="match" name="dt-match">Definition</a>: (Of strings or names:)
Two strings or names being compared must be identical. Characters with multiple
possible representations in ISO/IEC 10646 (e.g. characters with both precomposed
and base+diacritic forms) match only if they have the same representation
in both strings. No case folding is performed. (Of strings and rules in the
grammar:) A string matches a grammatical production if it belongs to the language
generated by that production. (Of content and content models:) An element
matches its declaration when it conforms in the fashion described in the constraint <a
href="#elementvalid"><b>[VC: Element Valid]</b></a>.]</p> </dd>
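<dd> <p>The string case of this definition can be sketched as a plain code point comparison. The function name <code>xml_match</code> is hypothetical, and the call to <code>unicodedata.normalize</code> appears only to show that the two spellings are canonically equivalent in Unicode; the definition above does not call for any normalization.</p>

```python
import unicodedata


def xml_match(a: str, b: str) -> bool:
    """Compare two strings code point by code point.

    No Unicode normalization and no case folding are applied.
    """
    return a == b


precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Same rendered character, different representation: no match.
different_forms_match = xml_match(precomposed, decomposed)

# The two forms are nevertheless canonically equivalent.
equivalent_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

# No case folding is performed.
case_insensitive_match = xml_match("Title", "title")
```

<p>An application that wants precomposed and decomposed forms to compare equal would, under this reading, have to normalize its data before it reaches the comparison.</p> </dd>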
<dt class="label">for compatibility</dt>
<dd> <p>[<a title="For Compatibility" name="dt-compat">Definition</a>: Marks
a sentence describing a feature of XML included solely to ensure that XML
remains compatible with SGML.]</p> </dd>
<dt class="label">for interoperability</dt>
<dd> <p>[<a title="For interoperability" name="dt-interop">Definition</a>:
Marks a sentence describing a non-binding recommendation included to increase
the chances that XML documents can be processed by the existing installed
base of SGML processors which predate the WebSGML Adaptations Annex to ISO
8879.]</p> </dd>
</dl><p></p> </div> </div>  <div class="div1"> <h2><a name="sec-documents"></a>2
Documents</h2> <p>[<a title="XML Document" name="dt-xml-doc">Definition</a>:
 A data object is an <b>XML document</b> if it is <a title="Well-Formed" href="#dt-wellformed">well-formed</a>,
as defined in this specification. A well-formed XML document may in addition
be <a title="Validity" href="#dt-valid">valid</a> if it meets certain further
constraints.]</p> <p>Each XML document has both a logical and a physical structure.
Physically, the document is composed of units called <a title="Entity" href="#dt-entity">entities</a>.
An entity may <a title="Entity Reference" href="#dt-entref">refer</a> to other
entities to cause their inclusion in the document. A document begins in a
"root" or <a title="Document Entity" href="#dt-docent">document entity</a>.
Logically, the document is composed of declarations, elements, comments, character
references, and processing instructions, all of which are indicated in the
document by explicit markup. The logical and physical structures must nest
properly, as described in <a href="#wf-entities"><b>4.3.2 Well-Formed Parsed
Entities</b></a>.</p> <div class="div2"> <h3><a name="sec-well-formed"></a>2.1
Well-Formed XML Documents</h3> <p>[<a title="Well-Formed" name="dt-wellformed">Definition</a>:
 A textual object is a <b>well-formed</b> XML document if:]</p> <ol>
<li><p>Taken as a whole, it matches the production labeled <a href="#NT-document">document</a>.</p> </li>
<li><p>It meets all the well-formedness constraints given in this specification.</p> </li>
<li><p>Each of the <a title="Text Entity" href="#dt-parsedent">parsed entities</a>
which is referenced directly or indirectly within the document is <a title="Well-Formed"
href="#dt-wellformed">well-formed</a>.</p></li>
</ol> <h5>Document</h5><table class="scrap"><tbody>
<tr valign="baseline">
<td><a name="NT-document"></a>[1]&nbsp;&nbsp;&nbsp;</td>
<td><code>document</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code><a href="#NT-prolog">prolog</a> <a href="#NT-element">element</a> <a
href="#NT-Misc">Misc</a>*</code></td>
</tr>
</tbody></table> <p>Matching the <a href="#NT-document">document</a> production
implies that:</p> <ol>
<li><p>It contains one or more <a title="Element" href="#dt-element">elements</a>.</p> </li>
<li><p>[<a title="Root Element" name="dt-root">Definition</a>: There is exactly
one element, called the <b>root</b>, or document element, no part of which
appears in the <a title="Content" href="#dt-content">content</a> of any other
element.] For all other elements, if the <a title="Start-Tag" href="#dt-stag">start-tag</a>
is in the content of another element, the <a title="End Tag" href="#dt-etag">end-tag</a>
is in the content of the same element. More simply stated, the elements, delimited
by start- and end-tags, nest properly within each other.</p></li>
</ol> <p>[<a title="Parent/Child" name="dt-parentchild">Definition</a>: As
a consequence of this, for each non-root element <code>C</code> in the document,
there is one other element <code>P</code> in the document such that <code>C</code>
is in the content of <code>P</code>, but is not in the content of any other
element that is in the content of <code>P</code>. <code>P</code> is referred
to as the <b>parent</b> of <code>C</code>, and <code>C</code> as a <b>child</b>
of <code>P</code>.]</p> </div> <div class="div2"> <h3><a name="charsets"></a>2.2
Characters</h3> <p>[<a title="Text" name="dt-text">Definition</a>: A parsed
entity contains <b>text</b>, a sequence of <a title="Character" href="#dt-character">characters</a>,
which may represent markup or character data.] [<a title="Character" name="dt-character">Definition</a>:
A <b>character</b> is an atomic unit of text as specified by ISO/IEC 10646 <a
href="#ISO10646">[ISO/IEC 10646]</a> (see also <a href="#ISO10646-2000">[ISO/IEC
10646-2000]</a>). Legal characters are tab, carriage return, line feed, and
the legal characters of Unicode and ISO/IEC 10646. The versions of these standards
cited in <a href="#sec-existing-stds"><b>A.1 Normative References</b></a>
were current at the time this document was prepared. New characters may be
added to these standards by amendments or new editions. Consequently, XML
processors must accept any character in the range specified for <a href="#NT-Char">Char</a>.
The use of "compatibility characters", as defined in section 6.8 of <a href="#Unicode">[Unicode]</a>
(see also D21 in section 3.6 of <a href="#Unicode3">[Unicode3]</a>), is discouraged.]</p> <h5>Character
Range</h5><table class="scrap"><tbody>
<tr valign="baseline">
<td><a name="NT-Char"></a>[2]&nbsp;&nbsp;&nbsp;</td>
<td><code>Char</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code>#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]</code></td>
<td><i>/* any Unicode character, excluding the surrogate blocks, FFFE, and
FFFF. */</i></td>
</tr>
</tbody></table> <p>The mechanism for encoding character code points into
bit patterns may vary from entity to entity. All XML processors must accept
the UTF-8 and UTF-16 encodings of 10646; the mechanisms for signaling which
of the two is in use, or for bringing other encodings into play, are discussed
later, in <a href="#charencoding"><b>4.3.3 Character Encoding in Entities</b></a>.</p>
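<p>The <code>Char</code> production above can be read as a range test on code points. The following is a minimal sketch of that test; the function names <code>is_xml_char</code> and <code>first_illegal_char</code> are hypothetical and not part of this specification.</p>

```python
def is_xml_char(ch: str) -> bool:
    """Return True if a single character falls within production [2] Char."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)           # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF         # up to the start of the surrogate blocks
        or 0xE000 <= cp <= 0xFFFD       # excludes FFFE and FFFF
        or 0x10000 <= cp <= 0x10FFFF    # supplementary planes
    )


def first_illegal_char(text: str):
    """Return the index of the first character outside Char, or None."""
    for index, ch in enumerate(text):
        if not is_xml_char(ch):
            return index
    return None
```

<p>For example, <code>first_illegal_char("ok\x01")</code> returns <code>2</code>, because #x1 is a control character outside every range in the production. The check operates on decoded characters, so it applies regardless of whether the entity was encoded in UTF-8 or UTF-16.</p>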
 </div> <div class="div2"> <h3><a name="sec-common-syn"></a>2.3 Common Syntactic
Constructs</h3> <p>This section defines some symbols used widely in the grammar.</p> <p><a
href="#NT-S">S</a> (white space) consists of one or more space (#x20) characters,
carriage returns, line feeds, or tabs.</p> <h5>White Space</h5><table class="scrap">
<tbody>
<tr valign="baseline">
<td><a name="NT-S"></a>[3]&nbsp;&nbsp;&nbsp;</td>
<td><code>S</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code>(#x20 | #x9 | #xD | #xA)+</code></td>
</tr>
</tbody></table> <p>Characters are classified for convenience as letters,
digits, or other characters. A letter consists of an alphabetic or syllabic
base character or an ideographic character. Full definitions of the specific
characters in each class are given in <a href="#CharClasses"><b>B Character
Classes</b></a>.</p> <p>[<a title="Name" name="dt-name">Definition</a>: A <b>Name</b>
is a token beginning with a letter or one of a few punctuation characters,
and continuing with letters, digits, hyphens, underscores, colons, or full
stops, together known as name characters.] Names beginning with the string
"<code>xml</code>", or any string which would match <code>(('X'|'x') ('M'|'m')
('L'|'l'))</code>, are reserved for standardization in this or future versions
of this specification.</p> <div class="note"><p class="prefix"><b>Note:</b></p> <p>The
Namespaces in XML Recommendation <a href="#xml-names">[XML Names]</a> assigns
a meaning to names containing colon characters. Therefore, authors should
not use the colon in XML names except for namespace purposes, but XML processors
must accept the colon as a name character.</p> </div> <p>An <a href="#NT-Nmtoken">Nmtoken</a>
(name token) is any mixture of name characters.</p> <h5>Names and Tokens</h5><table
class="scrap"><tbody>
<tr valign="baseline">
<td><a name="NT-NameChar"></a>[4]&nbsp;&nbsp;&nbsp;</td>
<td><code>NameChar</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code><a href="#NT-Letter">Letter</a> | <a href="#NT-Digit">Digit</a>
| '.' | '-' | '_' | ':' | <a href="#NT-CombiningChar">CombiningChar</a> | <a
href="#NT-Extender">Extender</a></code></td>
</tr>
</tbody><tbody>
<tr valign="baseline">
<td><a name="NT-Name"></a>[5]&nbsp;&nbsp;&nbsp;</td>
<td><code>Name</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code>(<a href="#NT-Letter">Letter</a> | '_' | ':') (<a href="#NT-NameChar">NameChar</a>)*</code></td>
</tr>
</tbody><tbody>
<tr valign="baseline">
<td><a name="NT-Names"></a>[6]&nbsp;&nbsp;&nbsp;</td>
<td><code>Names</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code><a href="#NT-Name">Name</a> (<a href="#NT-S">S</a> <a href="#NT-Name">Name</a>)*</code></td>
</tr>
</tbody><tbody>
<tr valign="baseline">
<td><a name="NT-Nmtoken"></a>[7]&nbsp;&nbsp;&nbsp;</td>
<td><code>Nmtoken</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code>(<a href="#NT-NameChar">NameChar</a>)+</code></td>
</tr>
</tbody><tbody>
<tr valign="baseline">
<td><a name="NT-Nmtokens"></a>[8]&nbsp;&nbsp;&nbsp;</td>
<td><code>Nmtokens</code></td>
<td>&nbsp;&nbsp;&nbsp;::=&nbsp;&nbsp;&nbsp;</td>
<td><code><a href="#NT-Nmtoken">Nmtoken</a> (<a href="#NT-S">S</a> <a href="#NT-Nmtoken">Nmtoken</a>)*</code></td>
</tr>
</tbody></table> <p>Literal data is any quoted string not containing the quotation
mark used as a delimiter for that string. Literals are used for specifying
the content of internal entities (<a href="#NT-EntityValue">EntityValue</a>),
the values of attributes (<a href="#NT-AttValue">AttValue</a>), and external
identifiers (<a href="#NT-SystemLiteral">SystemLiteral</a>). Note that a <a
href="#NT-SystemLiteral">SystemLiteral</a> can be parsed without scanning
for markup.</p> <h5>Literals</h5><table class="scrap"><tbody>