<div class="pPreformattedRelative"><pre class="pPreformattedRelative"> Market Size &lt; predicted<a name="wp67727"> </a></pre></div>
<a name="wp67728"> </a><p class="pBody">The problem with putting that line into an XML file directly is that when the parser sees the left-angle bracket (&lt;), it starts looking for a tag name, which throws off the parse. To get around that problem, you put <code class="cCode">&amp;lt;</code> in the file, instead of <code class="cCode">&quot;&lt;&quot;</code>.</p>
<hr><a name="wp67730"> </a><p class="pNote">Note: The results of the modifications below are contained in <code class="cCode"><a  href="../examples/xml/samples/slideSample03.xml" target="_blank">slideSample03.xml</a></code>.</p><hr>
<a name="wp67734"> </a><p class="pBody">Add the text highlighted below to your <code class="cCode">slideSample.xml</code> file, and save a copy of it for future use as <code class="cCode">slideSample03.xml</code>:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">&nbsp;&nbsp;&lt;!-- OVERVIEW --&gt;
&nbsp;&nbsp;&lt;slide type=&quot;all&quot;&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;title&gt;Overview&lt;/title&gt;
&nbsp;&nbsp;&nbsp;&nbsp;...
&nbsp;&nbsp;&lt;/slide&gt;<a name="wp67735"> </a>
<code class="cCodeBold">&nbsp;&nbsp;&lt;slide type=&quot;exec&quot;&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;title&gt;Financial Forecast&lt;/title&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;Market Size &amp;lt; predicted&lt;/item&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;Anticipated Penetration&lt;/item&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;Expected Revenues&lt;/item&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;Profit Margin&lt;/item&gt;
&nbsp;&nbsp;&lt;/slide&gt;</code><a name="wp67736"> </a>
&lt;/slideshow&gt;<a name="wp67737"> </a></pre></div>
<a name="wp67738"> </a><p class="pBody">When you use an XML parser to echo this data, you will see the desired output:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">Market Size &lt; predicted<a 
name="wp67739"> </a></pre></div>
<a name="wp67827"> </a><p class="pBody">You see a left angle bracket (&quot;&lt;&quot;) where you coded &quot;&amp;lt;&quot;, because the XML parser replaces the entity reference with the character it represents, and passes that character to the application.</p>
<a name="wp67742"> </a><h4 class="pHeading3">Handling Text with XML-Style Syntax</h4>
<a name="wp67743"> </a><p class="pBody">When you are handling large blocks of XML or HTML that include many of the special characters, it would be inconvenient to replace each of them with the appropriate entity reference. For those situations, you can use a <code class="cCode">CDATA</code> section.</p>
<hr><a name="wp67745"> </a><p class="pNote">Note: The results of the modifications below are contained in <code class="cCode"><a  href="../examples/xml/samples/slideSample04.xml" target="_blank">slideSample04.xml</a></code>.</p><hr>
<a name="wp67750"> </a><p class="pBody">A <code class="cCode">CDATA</code> section works like <code class="cCode">&lt;pre&gt;...&lt;/pre&gt;</code> in HTML, only more so: all whitespace in a <code class="cCode">CDATA</code> section is significant, and the characters in it are not interpreted as XML. A <code class="cCode">CDATA</code> section starts with <code class="cCode">&lt;![CDATA[</code> and ends with <code class="cCode">]]&gt;</code>.
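</p>
<p class="pBody">As a minimal sketch of the difference between the two approaches, the two <code class="cCode">item</code> elements below should deliver the same characters to the application. The item text is hypothetical and is not part of <code class="cCode">slideSample.xml</code>. The first element escapes each special character with an entity reference; the second wraps the literal text in a <code class="cCode">CDATA</code> section:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!-- Hypothetical item: each special character escaped with an entity reference --&gt;
&lt;item&gt;if (a &amp;lt; b &amp;amp;&amp;amp; b &amp;lt; c)&lt;/item&gt;

&lt;!-- The same hypothetical text written literally inside a CDATA section --&gt;
&lt;item&gt;&lt;![CDATA[if (a &lt; b &amp;&amp; b &lt; c)]]&gt;&lt;/item&gt;</pre></div>
<p class="pBody">The one sequence a <code class="cCode">CDATA</code> section cannot contain is <code class="cCode">]]&gt;</code>, because the parser treats it as the end of the section.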
</p><a name="wp68519"> </a><p class="pBody">Add the text highlighted below to your <code class="cCode">slideSample.xml</code> file to define a <code class="cCode">CDATA</code> section for a fictitious technical slide, and save a copy of the file as <code class="cCode">slideSample04.xml</code>:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">&nbsp;&nbsp; ...
<code class="cCodeBold">&nbsp;&nbsp;&lt;slide type=&quot;tech&quot;&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;title&gt;How it Works&lt;/title&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;First we fozzle the frobmorten&lt;/item&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;Then we framboze the staten&lt;/item&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;Finally, we frenzle the fuznaten&lt;/item&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&lt;item&gt;&lt;![CDATA[Diagram:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;frobmorten &lt;-------------- fuznaten
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;     |          &lt;3&gt;          ^
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;     | &lt;1&gt;                   |   &lt;1&gt; = fozzle
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;     V                       |   &lt;2&gt; = framboze
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;  staten---------------------+   &lt;3&gt; = frenzle
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;           &lt;2&gt;
&nbsp;&nbsp;&nbsp;&nbsp;]]&gt;&lt;/item&gt;
&nbsp;&nbsp;&lt;/slide&gt;</code>
&lt;/slideshow&gt;<a name="wp67751"> </a></pre></div>
<a name="wp67752"> </a><p class="pBody">When you echo this file with an XML parser, you'll see the following output:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">Diagram:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;frobmorten &lt;-------------- fuznaten
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;     |          &lt;3&gt;          ^
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;     | &lt;1&gt;                   |   &lt;1&gt; = fozzle
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;     V                       |   &lt;2&gt; = framboze
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;  staten---------------------+   &lt;3&gt; = frenzle
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;           &lt;2&gt;<a name="wp67753"> </a></pre></div>
<a name="wp67754"> </a><p class="pBody">The point here is that the text in the <code class="cCode">CDATA</code> section arrives exactly as it was written. Because the parser doesn't treat the angle brackets as XML, they don't generate the fatal errors they would otherwise cause. (Outside a <code class="cCode">CDATA</code> section, those angle brackets would keep the document from being well-formed.)</p>
<a name="wp67965"> </a><h3 class="pHeading2">Creating a Document Type Definition</h3>
<a name="wp67967"> </a><p class="pBody">After the XML declaration, the document prolog can include a DTD, which lets you specify the kinds of tags that can be included in your XML document. In addition to telling a validating parser which tags are valid, and in what arrangements, a DTD tells both validating and nonvalidating parsers where text is expected, which lets the parser determine whether the whitespace it sees is significant or ignorable.</p>
<a name="wp67969"> </a><h4 class="pHeading3">Basic DTD Definitions</h4>
<a name="wp68545"> </a><p class="pBody">To begin learning about DTD definitions, let's start by telling the parser where text is expected and where any text (other than whitespace) would be an error. (Whitespace in such locations is <span style="font-style: italic">ignorable</span>.)</p>
<hr><a name="wp68547"> </a><p class="pNote">Note: The DTD defined in this section is contained in <code class="cCode"><a  href="../examples/xml/samples/slideshow1a.dtd" target="_blank">slideshow1a.dtd</a></code>. 
(The browsable version is <code class="cCode"><a  href="../examples/xml/samples/slideshow1a-dtd.html" target="_blank">slideshow1a-dtd.html</a></code>.)</p><hr>
<a name="wp67974"> </a><p class="pBody">Start by creating a file named <code style="font-weight: normal" class="cCode">slideshow.dtd</code><span style="font-weight: bold">.</span> Enter an XML declaration and a comment to identify the file, as shown below:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;?xml version=&#39;1.0&#39; encoding=&#39;utf-8&#39;?&gt;<a name="wp67975"> </a>
&lt;!-- DTD for a simple &quot;slide show&quot;. --&gt;<a name="wp67976"> </a></pre></div>
<a name="wp67977"> </a><p class="pBody">Next, add the text highlighted below to specify that a <code class="cCode">slideshow</code> element contains <code class="cCode">slide</code> elements and nothing else:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!-- DTD for a simple &quot;slide show&quot;. --&gt;<a name="wp67978"> </a>
<code class="cCodeBold">&lt;!ELEMENT slideshow (slide+)&gt;</code><a name="wp67980"> </a></pre></div>
<a name="wp67981"> </a><p class="pBody">As you can see, the DTD tag starts with <code class="cCode">&lt;!</code> followed by the tag name (<code class="cCode">ELEMENT</code>). After the tag name comes the name of the element that is being defined (<code class="cCode">slideshow</code>) and, in parentheses, one or more items that indicate the valid contents for that element. In this case, the notation says that a <code class="cCode">slideshow</code> consists of one or more <code class="cCode">slide</code> elements.</p>
<a name="wp67982"> </a><p class="pBody">Without the plus sign, the definition would be saying that a <code class="cCode">slideshow</code> consists of a single <code class="cCode">slide</code> element. 
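</p>
<p class="pBody">As a sketch of that difference, here are the two alternative declarations side by side. A DTD should declare each element type only once, so a real file would contain just one of them:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!-- Alternative 1: a slideshow holds exactly one slide --&gt;
&lt;!ELEMENT slideshow (slide)&gt;

&lt;!-- Alternative 2: a slideshow holds one or more slides (the form used here) --&gt;
&lt;!ELEMENT slideshow (slide+)&gt;</pre></div>
<p class="pBody">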
The qualifiers you can add to an element definition are listed in <a  href="IntroXML4.html#wp67991">Table 2-2</a>.</p>
<div align="left"><table border="1" summary="DTD Element Qualifiers" id="wp67991">
  <caption><a name="wp67991"> </a><div class="pTableTitle">Table 2-2   DTD Element Qualifiers&nbsp;</div></caption>
  <tr align="center">
    <th><a name="wp67997"> </a><div class="pCellHeading">Qualifier</div></th>
    <th><a name="wp67999"> </a><div class="pCellHeading">Name</div></th>
    <th><a name="wp68001"> </a><div class="pCellHeading">Meaning</div></th></tr>
  <tr align="left">
    <td><a name="wp68003"> </a><div class="pCellBody"><code class="cCode">?</code></div></td>
    <td><a name="wp68005"> </a><div class="pCellBody">Question Mark</div></td>
    <td><a name="wp68007"> </a><div class="pCellBody">Optional (zero or one)</div></td></tr>
  <tr align="left">
    <td><a name="wp68009"> </a><div class="pCellBody"><code class="cCode">*</code></div></td>
    <td><a name="wp68011"> </a><div class="pCellBody">Asterisk</div></td>
    <td><a name="wp68013"> </a><div class="pCellBody">Zero or more</div></td></tr>
  <tr align="left">
    <td><a name="wp68015"> </a><div class="pCellBody"><code class="cCode">+</code></div></td>
    <td><a name="wp68017"> </a><div class="pCellBody">Plus Sign</div></td>
    <td><a name="wp68019"> </a><div class="pCellBody">One or more</div></td></tr>
</table></div><p class="pBody"></p>
<a name="wp68020"> </a><p class="pBody">You can include multiple elements inside the parentheses in a comma-separated list, and use a qualifier on each element to indicate how many instances of that element may occur. The comma-separated list tells which elements are valid and the order in which they must occur.</p>
<a name="wp68021"> </a><p class="pBody">You can also nest parentheses to group multiple items. 
For example, after defining an <code class="cCode">image</code> element (coming up shortly), you could declare that every <code class="cCode">image</code> element must be paired with a <code class="cCode">title</code> element in a slide by specifying <code class="cCode">((image, title)+)</code>. Here, the plus sign applies to the <code class="cCode">image/title</code> pair to indicate that one or more pairs of the specified items can occur.</p>
<a name="wp68025"> </a><h4 class="pHeading3">Defining Text and Nested Elements</h4>
<a name="wp68026"> </a><p class="pBody">Now that you have told the parser something about where <span style="font-style: italic">not</span> to expect text, let's see how to tell it where text <span style="font-style: italic">can</span> occur. Add the text highlighted below to define the <code class="cCode">slide</code>, <code class="cCode">title</code>, and <code class="cCode">item</code> elements:</p>
<div class="pPreformattedRelative"><pre class="pPreformattedRelative">&lt;!ELEMENT slideshow (slide+)&gt;
<code class="cCodeBold">&lt;!ELEMENT slide (title, item*)&gt;
&lt;!ELEMENT title (#PCDATA)&gt;
&lt;!ELEMENT item (#PCDATA | item)* &gt;</code><a name="wp68027"> </a></pre></div>
<a name="wp68028"> </a><p class="pBody">The first line you added says that a slide consists of a <code class="cCode">title</code> followed by zero or more <code class="cCode">item</code> elements. Nothing new there. The next line says that a title consists entirely of <span style="font-style: italic">parsed character data</span> (<code class="cCode">PCDATA</code>). That's known as &quot;text&quot; in most parts of the country, but in XML-speak it's called &quot;parsed character data&quot;. (That distinguishes it from <code class="cCode">CDATA</code> sections, which contain character data that is not parsed.) 
The <code class="cCode">&quot;#&quot;</code> that precedes <code class="cCode">PCDATA</code> indicates that what follows is a reserved keyword, rather than an element name.</p>
<a name="wp68032"> </a><p class="pBody">The last line introduces the vertical bar (<code class="cCode">|</code>), which indicates an <span style="font-style: italic">or</span> condition. In this case, either <code class="cCode">PCDATA</code> or an <code class="cCode">item</code> can occur. The asterisk at the end says that either one can occur zero or more times in succession. The result of this specification is known as a <em class="cEmphasis">mixed-content model</em>, because any number of <code class="cCode">item</code> elements can be interspersed with the text. Such models must always be defined with <code class="cCode">#PCDATA</code> specified first, followed by any number of alternative element names separated by vertical bars (<code class="cCode">|</code>), and an asterisk (<code class="cCode">*</code>) at the end.</p>
<a name="wp68551"> </a><p class="pBody">Save a copy of this DTD as <code class="cCode">slideshow1a.dtd</code>, for use when experimenting with basic DTD processing.</p>
<a name="wp68036"> </a><h4 class="pHeading3">Limitations of DTDs</h4><a name="wp68037"> </a><p class="pBody">