It would be nice if we could specify that an <code class="cCode">item</code> contains either text, or text followed by one or more list items. But that kind of specification turns out to be hard to achieve in a DTD. For example, you might be tempted to define an <code class="cCode">item</code> like this: </p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!ELEMENT item (#PCDATA | (#PCDATA, item+)) &gt;<a name="wp68038"> </a></pre></div><a name="wp68039"> </a><p class="pBody">That would certainly be accurate, but as soon as the parser sees #PCDATA and the vertical bar, it requires the remaining definition to conform to the mixed-content model. This specification doesn't, so you get an error that says: <code class="cCode">Illegal mixed content model for &#39;item&#39;. Found &amp;#x28; ...,</code> where the hex character 28 is the left parenthesis that opens the nested group. </p><a name="wp68041"> </a><p class="pBody">Trying to double-define the item element doesn't work, either. A specification like this:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!ELEMENT item (#PCDATA) &gt;
&lt;!ELEMENT item (#PCDATA, item+) &gt;<a name="wp68042"> </a></pre></div><a name="wp68043"> </a><p class="pBody">produces a &quot;duplicate definition&quot; warning when the validating parser runs. The second definition is, in fact, ignored. So it seems that defining a mixed content model (which allows <code class="cCode">item</code> elements to be interspersed in text) is about as good as we can do. </p><a name="wp68044"> </a><p class="pBody">In addition to the limitations of the mixed content model mentioned above, there is no way to further qualify the kind of text that can occur where <code class="cCode">PCDATA</code> has been specified. Should it contain only numbers? Should it be in a date format, or possibly a monetary format? There is no way to say in the context of a DTD. 
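</p><p class="pBody">As a minimal sketch of that limitation, assume a hypothetical <code class="cCode">price</code> element (it is not part of the slideshow DTD and is used here only for illustration) that is declared as plain text, followed by two sample instances:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!-- Hypothetical declaration, for illustration only --&gt;
&lt;!ELEMENT price (#PCDATA)&gt;

&lt;!-- Two instances of that element in a document --&gt;
&lt;price&gt;19.95&lt;/price&gt;
&lt;price&gt;call for a quote&lt;/price&gt;</pre></div><p class="pBody">Both instances are valid against that declaration, because <code class="cCode">#PCDATA</code> constrains only where text may appear, not what the text contains.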
</p><a name="wp68045"> </a><p class="pBody">Finally, note that the DTD offers no sense of hierarchy. The definition for the <code class="cCode">title</code> element applies equally to a <code class="cCode">slide</code> title and to an <code class="cCode">item</code> title. When we expand the DTD to allow HTML-style markup in addition to plain text, it would make sense to restrict the size of an <code class="cCode">item</code> title compared to a <code class="cCode">slide</code> title, for example. But the only way to do that would be to give one of them a different name, such as &quot;<code class="cCode">item-title</code>&quot;. The bottom line is that the lack of hierarchy in the DTD forces you to introduce a &quot;hyphenation hierarchy&quot; (or its equivalent) in your namespace. All of these limitations are fundamental motivations behind the development of schema-specification standards.</p><a name="wp68046"> </a><h4 class="pHeading3">Special Element Values in the DTD</h4><a name="wp68047"> </a><p class="pBody">Rather than specifying a parenthesized list of elements, the element definition could use one of two special values: <code class="cCode">ANY</code> or <code class="cCode">EMPTY</code>. The <code class="cCode">ANY</code> specification says that the element may contain any other defined element, or <code class="cCode">PCDATA</code>. Such a specification is usually used for the root element of a general-purpose XML document such as you might create with a word processor. Textual elements could occur in any order in such a document, so specifying <code class="cCode">ANY</code> makes sense.</p><a name="wp68050"> </a><p class="pBody">The <code class="cCode">EMPTY</code> specification says that the element contains no contents. 
So a DTD for e-mail messages that lets you &quot;flag&quot; a message with <code class="cCode">&lt;flag/&gt;</code> might have a line like this:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!ELEMENT flag EMPTY&gt;<a name="wp68051"> </a></pre></div><a name="wp68053"> </a><h4 class="pHeading3">Referencing the DTD</h4><a name="wp68054"> </a><p class="pBody">In this case, the DTD is in a separate file from the XML document. That means you have to reference it from the XML document, which makes the DTD file part of the external subset of the full Document Type Definition (DTD) for the XML file. As you'll see later on, you can also include parts of the DTD within the document. Such definitions constitute the local subset of the DTD.</p><hr><a name="wp68056"> </a><p class="pNote">Note: The XML written in this section is contained in <code class="cCode"><a  href="../examples/xml/samples/slideSample05.xml" target="_blank">slideSample05.xml</a></code>. (The browsable version is <code class="cCode"><a  href="../examples/xml/samples/slideSample05-xml.html" target="_blank">slideSample05-xml.html</a></code>.) </p><hr><a name="wp68058"> </a><p class="pBody">To reference the DTD file you just created, add the line highlighted below to your <code class="cCode">slideSample.xml</code> file, and save a copy of the file as <code class="cCode">slideSample05.xml</code>:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!--  A SAMPLE set of slides  --&gt;<a name="wp68059"> </a>
<code class="cCodeBold">&lt;!DOCTYPE slideshow SYSTEM &quot;slideshow.dtd&quot;&gt;</code><a name="wp68060"> </a>
&lt;slideshow<a name="wp68061"> </a></pre></div><a name="wp68062"> </a><p class="pBody">Again, the DTD tag starts with <code class="cCode">&quot;&lt;!&quot;</code>. 
In this case, the tag name, <code class="cCode">DOCTYPE</code>, says that the document is a <code class="cCode">slideshow</code>, which means that the document consists of the <code class="cCode">slideshow</code> element and everything within it: </p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;slideshow&gt;...&lt;/slideshow&gt;<a name="wp68063"> </a></pre></div><a name="wp68064"> </a><p class="pBody">This tag defines the <code class="cCode">slideshow</code> element as the root element for the document. An XML document must have exactly one root element. This is where that element is specified. In other words, this tag identifies the document <span style="font-style: italic">content</span> as a <code class="cCode">slideshow</code>. </p><a name="wp68065"> </a><p class="pBody">The <code class="cCode">DOCTYPE</code> tag occurs after the XML declaration and before the root element. The <code class="cCode">SYSTEM</code> identifier specifies the location of the DTD file. Since it does not start with a prefix like <code class="cCode">http:/</code> or <code class="cCode">file:/</code>, the path is relative to the location of the XML document. Remember the <code class="cCode">setDocumentLocator</code> method? The parser is using that information to find the DTD file, just as your application would to find a file relative to the XML document. A <code class="cCode">PUBLIC</code> identifier could also be used to specify the DTD file using a unique name--but the parser would have to be able to resolve it.</p><a name="wp68066"> </a><p class="pBody">The <code class="cCode">DOCTYPE</code> specification could also contain DTD definitions within the XML document, instead of, or in addition to, referring to an external DTD file. 
Such definitions would be contained in square brackets, like this:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!DOCTYPE slideshow SYSTEM &quot;slideshow1.dtd&quot; [
<code class="cCodeBold">&nbsp;&nbsp;...local subset definitions here...
]&gt;</code><a name="wp68067"> </a></pre></div><a name="wp68068"> </a><p class="pBody">You'll take advantage of that facility in a moment to define some entities that can be used in the document.</p><a name="wp68962"> </a><h3 class="pHeading2">Documents and Data</h3><a name="wp68964"> </a><p class="pBody">Earlier, you learned that one reason you hear about XML <span style="font-style: italic">documents</span>, on the one hand, and XML <span style="font-style: italic">data</span>, on the other, is that XML handles both comfortably, depending on whether text is or is not allowed between elements in the structure. </p><a name="wp68966"> </a><p class="pBody">In the sample file you have been working with, the <code class="cCode">slideshow</code> element is an example of a <span style="font-style: italic">data element</span>--it contains only subelements with no intervening text. The <code class="cCode">item</code> element, on the other hand, might be termed a <span style="font-style: italic">document element</span>, because it is defined to include both text and subelements. </p><a name="wp68969"> </a><p class="pBody">As you work through this tutorial, you will see how to expand the definition of the title element to include HTML-style markup, which will turn it into a document element as well.</p><a name="wp68104"> </a><h3 class="pHeading2">Defining Attributes and Entities in the DTD</h3><a name="wp68105"> </a><p class="pBody">The DTD you've defined so far is fine for use with the nonvalidating parser. It tells where text is expected and where it isn't, which is all the nonvalidating parser is going to pay attention to. 
But for use with the validating parser, the DTD needs to specify the valid attributes for the different elements. You'll do that in this section, after which you'll define one internal entity and one external entity that you can reference in your XML file. </p><a name="wp68109"> </a><h4 class="pHeading3">Defining Attributes in the DTD</h4><a name="wp68110"> </a><p class="pBody">Let's start by defining the attributes for the elements in the slide presentation.</p><hr><a name="wp68112"> </a><p class="pNote">Note: The DTD written in this section is contained in <code class="cCode"><a  href="../examples/xml/samples/slideshow1b.dtd" target="_blank">slideshow1b.dtd</a></code>. (The browsable version is <code class="cCode"><a  href="../examples/xml/samples/slideshow1b-dtd.html" target="_blank">slideshow1b-dtd.html</a></code>.) </p><hr><a name="wp68114"> </a><p class="pBody">Add the text highlighted below to define the attributes for the <code class="cCode">slideshow</code> element:</p><div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!ELEMENT slideshow (slide+)&gt;
<code class="cCodeBold">&lt;!ATTLIST slideshow
&nbsp;&nbsp;&nbsp;&nbsp;title    CDATA    #REQUIRED
&nbsp;&nbsp;&nbsp;&nbsp;date     CDATA    #IMPLIED
&nbsp;&nbsp;&nbsp;&nbsp;author   CDATA    &quot;unknown&quot;&gt;</code>
&lt;!ELEMENT slide (title, item*)&gt;<a name="wp68115"> </a></pre></div><a name="wp68116"> </a><p class="pBody">The DTD tag <code class="cCode">ATTLIST</code> begins the series of attribute definitions. The name that follows <code class="cCode">ATTLIST</code> specifies the element for which the attributes are being defined. In this case, the element is the <code class="cCode">slideshow</code> element. (Note once again the lack of hierarchy in DTD specifications.)</p><a name="wp68117"> </a><p class="pBody">Each attribute is defined by a series of three space-separated values. 
Commas and other separators are not allowed, so formatting the definitions as shown above is helpful for readability. The first element in each line is the name of the attribute: <code class="cCode">title</code>, <code class="cCode">date</code>, or <code class="cCode">author</code>, in this case. The second element indicates the type of the data: <code class="cCode">CDATA</code> is character data--unparsed data, once again, in which a left-angle bracket (&lt;) will never be construed as part of an XML tag. <a  href="IntroXML4.html#wp68125">Table 2-3</a> presents the valid choices for the attribute type. </p><div align="left"><table border="1" summary="Attribute Types" id="wp68125">  <caption><a name="wp68125"> </a><div class="pTableTitle">Table 2-3   Attribute Types</div></caption>  <tr align="center">    <th><a name="wp68130"> </a><div class="pCellHeading"> Attribute Type</div></th>    <th><a name="wp68132"> </a><div class="pCellHeading">Specifies...</div></th></tr>  <tr align="left">    <td><a name="wp68134"> </a><div class="pCellBody"><code class="cCode">(value1 | value2 | ...)</code></div></td>    <td><a name="wp68136"> </a><div class="pCellBody">A list of values separated by vertical bars. (Example below)</div></td></tr>  <tr align="left">    <td><a name="wp68138"> </a><div class="pCellBody"><code class="cCode">CDATA</code></div></td>    <td><a name="wp68140"> </a><div class="pCellBody">&quot;Unparsed character data&quot;. 
(In other words, a text string.)</div></td></tr>  <tr align="left">    <td><a name="wp68142"> </a><div class="pCellBody"><code class="cCode">ID</code></div></td>    <td><a name="wp68144"> </a><div class="pCellBody">A name that no other ID attribute shares.</div></td></tr>  <tr align="left">    <td><a name="wp68146"> </a><div class="pCellBody"><code class="cCode">IDREF</code></div></td>    <td><a name="wp68148"> </a><div class="pCellBody">A reference to an ID defined elsewhere in the document.</div></td></tr>  <tr align="left">    <td><a name="wp68150"> </a><div class="pCellBody"><code class="cCode">IDREFS</code></div></td>    <td><a name="wp68152"> </a><div class="pCellBody">A space-separated list containing one or more ID references.</div></td></tr>  <tr align="left">    <td><a name="wp68154"> </a><div class="pCellBody"><code class="cCode">ENTITY</code></div></td>    <td><a name="wp68156"> </a><div class="pCellBody">The name of an unparsed entity defined in the DTD.</div></td></tr>  <tr align="left">    <td><a name="wp68158"> </a><div class="pCellBody"><code class="cCode">ENTITIES</code></div></td>    <td><a name="wp68160"> </a><div class="pCellBody">A space-separated list of entity names.</div></td></tr>  <tr align="left">    <td><a name="wp68162"> </a><div class="pCellBody"><code class="cCode">NMTOKEN</code></div></td>    <td><a name="wp68164"> </a><div class="pCellBody">A name token composed of letters, digits, hyphens, underscores, periods, and colons. Unlike an XML name, it may begin with any of those characters.</div></td></tr>  <tr align="left">    <td><a name="wp68166"> </a><div class="pCellBody"><code class="cCode">NMTOKENS</code></div></td>    <td><a name="wp68168"> </a><div class="pCellBody">
