<!ELEMENT title (#PCDATA)>
<!ELEMENT item (#PCDATA | item)* ></b></pre>
</blockquote>
<p>The first line you added says that a slide consists of a <code>title</code>
followed by zero or more <code>item</code> elements. Nothing new there. The
next line says that a title consists entirely of <i>parsed character data</i>
(<code>PCDATA</code>). That's known as "text" in most parts of the
country, but in XML-speak it's called "parsed character data". (That
distinguishes it from <code>CDATA</code> sections, which contain character data
that is not parsed.) The "#" that precedes <code>PCDATA</code> indicates
that what follows is a special word, rather than an element name. </p>
<p>The last line introduces the vertical bar (<code>|</code>), which indicates
an <i>or</i> condition. In this case, either <code>PCDATA</code> or an <code>item</code>
can occur. The asterisk at the end says that either one can occur zero or more
times in succession. The result of this specification is known as a <a href="../glossary.html#mixedContent">mixed-content
model</a>, because any number of <code>item</code> elements can be interspersed
with the text. Such models must always be defined with <code>#PCDATA</code>
specified first, followed by some number of alternative element names separated by
vertical bars (<code>|</code>), and an asterisk (<code>*</code>) at the end.</p>
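<p>As a rough illustration, assuming the <code>item</code> declaration shown above,
each of the following elements should validate, because text and nested <code>item</code>
elements may appear in any order and any number of times. The text inside the elements
is made up for this example:</p>
<blockquote>
<pre><item>Plain text only</item>
<item>Text before <item>a nested item</item> and text after</item>
<item><item>First</item><item>Second</item></item>
<item></item></pre>
</blockquote>
<p>The same pattern extends to more alternatives. In this sketch, <code>para</code>,
<code>em</code>, and <code>strong</code> are hypothetical element names that are not
part of the slideshow DTD:</p>
<blockquote>
<pre><!ELEMENT para (#PCDATA | em | strong)* ></pre>
</blockquote>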
<h3><a name="limitations"></a>Limitations of DTDs</h3>
<p>It would be nice if we could specify that an <code>item</code> contains either
text, or text followed by one or more list items. But that kind of specification
turns out to be hard to achieve in a DTD. For example, you might be tempted
to define an <code>item</code> like this: </p>
<blockquote>
<pre><!ELEMENT item (#PCDATA | (#PCDATA, item+)) ></pre>
</blockquote>
<p>That would certainly be accurate, but as soon as the parser sees <code>#PCDATA</code>
and the vertical bar, it requires the rest of the definition to conform to the mixed-content
model. This specification doesn't, so you get an error that says: <code>Illegal
mixed content model for 'item'. Found &#x28; ...</code>, where hex character
28 is the left parenthesis that opens the nested group.</p>
<p>Trying to double-define the item element doesn't work, either. A specification
like this:</p>
<blockquote>
<pre><!ELEMENT item (#PCDATA) >
<!ELEMENT item (#PCDATA, item+) ></pre>
</blockquote>
<p>produces a "duplicate definition" warning when the validating parser
runs. The second definition is, in fact, ignored. So it seems that defining
a mixed content model (which allows <code>item</code> elements to be interspersed
in text) is about as good as we can do. </p>
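<p>To see what that compromise costs, consider the two fragments below (the text is
invented for illustration). Under <code>(#PCDATA | item)*</code> both should be valid,
although only the first has the intended shape of text followed by list items:</p>
<blockquote>
<pre><!-- Intended shape: text first, then sub-items -->
<item>Overview
  <item>First point</item>
  <item>Second point</item>
</item>

<!-- Also valid, though not what we wanted: sub-item first, text last -->
<item>
  <item>First point</item>
  trailing text
</item></pre>
</blockquote>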
<p>In addition to the limitations of the mixed-content model mentioned above,
there is no way to further qualify the kind of text that can occur where <code>PCDATA</code>
has been specified. Should it contain only numbers? Should it be in a date format,
or possibly a monetary format? There is no way to say in the context of a DTD.
</p>
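<p>A small sketch makes the point. The <code>date</code> and <code>price</code> elements
here are hypothetical and are not part of the slideshow DTD:</p>
<blockquote>
<pre><!ELEMENT date  (#PCDATA)>
<!ELEMENT price (#PCDATA)></pre>
</blockquote>
<p>Given those declarations, a validating parser has no grounds to reject any of the
following, because each element contains nothing but text:</p>
<blockquote>
<pre><date>2001-03-15</date>
<date>next Tuesday, probably</date>
<price>19.95</price>
<price>ask the sales team</price></pre>
</blockquote>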
<p>Finally, note that the DTD offers no sense of hierarchy. The definition for
the <code>title</code> element applies equally to a <code>slide</code> title
and to an <code>item</code> title. When we expand the DTD to allow HTML-style
markup in addition to plain text, it would make sense to restrict the size of
an <code>item</code> title compared to a <code>slide</code> title, for example.
But the only way to do that would be to give one of them a different name, such
as "<code>item-title</code>". The bottom line is that the lack of
hierarchy in the DTD forces you to introduce a "hyphenation hierarchy"
(or its equivalent) into your element names. All of these limitations are fundamental
motivations behind the development of schema-specification standards.</p>
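<p>A minimal sketch of such a hyphenation hierarchy might look like the following.
The <code>slide</code> and <code>title</code> declarations follow the description earlier
in this section; the way <code>item-title</code> is added to the <code>item</code> content
model is an assumption made for illustration:</p>
<blockquote>
<pre><!ELEMENT slide      (title, item*)>
<!ELEMENT title      (#PCDATA)>
<!ELEMENT item       (#PCDATA | item | item-title)* >
<!ELEMENT item-title (#PCDATA)></pre>
</blockquote>
<p>Note that because <code>item</code> uses a mixed-content model, this sketch still cannot
require that an <code>item-title</code> appear exactly once, or that it come first.</p>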
<h3><a name="specialValues"></a>Special Element Values in the DTD</h3>
<p>Rather than specifying a parenthesized list of elements, the element definition
could use one of two special values: <code>ANY</code> or <code>EMPTY</code>.
The <code>ANY</code> specification says that the element may contain any other
defined element, or <code>PCDATA</code>. Such a specification is usually used
for the root element of a general-purpose XML document such as you might create
with a word processor. Textual elements could occur in any order in such a document,
so specifying <code>ANY</code> makes sense.</p>
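<p>As a sketch, assume a hypothetical word-processor DTD with a <code>document</code>
root element and two textual elements, <code>heading</code> and <code>para</code>
(none of these names come from the slideshow example):</p>
<blockquote>
<pre><!ELEMENT document ANY>
<!ELEMENT heading  (#PCDATA)>
<!ELEMENT para     (#PCDATA)></pre>
</blockquote>
<p>With those declarations, text and declared elements can be mixed freely inside the root:</p>
<blockquote>
<pre><document>
  Loose text is allowed,
  <para>as are declared elements</para>
  <heading>in any order</heading>
</document></pre>
</blockquote>
<p>An element that is not declared anywhere in the DTD would still be reported as invalid,
even inside an <code>ANY</code> element.</p>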
<p>The <code>EMPTY</code> specification says that the element has no content.
So a DTD for email messages that lets you "flag" a message with
<code><flag/></code> might contain a line like this:</p>
<blockquote>
<pre><!ELEMENT flag EMPTY></pre>
</blockquote>
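<p>Under the XML 1.0 validity rules, an element declared <code>EMPTY</code> may be written
as an empty-element tag or as a start tag immediately followed by an end tag, but it may
contain nothing at all, not even white space. The fragments below sketch how a validating
parser would be expected to treat each form:</p>
<blockquote>
<pre><flag/>              <!-- valid -->
<flag></flag>        <!-- valid: start tag immediately followed by end tag -->
<flag> </flag>       <!-- invalid: white space counts as content -->
<flag>urgent</flag>  <!-- invalid: text content --></pre>
</blockquote>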
<h3><a name="referencing"></a>Referencing the DTD</h3>
<p>In this case, the DTD definition is in a separate file from the XML document.
That means you have to reference it from the XML document, which makes the DTD
file part of the <a href="../glossary.html#externalSubset">external subset</a>
of the full Document Type Definition (DTD) for the XML file. As you'll see later
on, you can also include parts of the DTD within the document. Such definitions
constitute the <a href="../glossary.html#localSubset">local subset</a> of the
DTD.</p>
<blockquote>
<p><b>Note: </b>The XML written in this section is contained in <a href="samples/slideSample05.xml"><code>slideSample05.xml</code></a>.</p>
</blockquote>
<p>To reference the DTD file you just created, add the line highlighted below
to your <code>slideSample.xml</code> file:</p>
<blockquote>
<pre><!-- A SAMPLE set of slides -->
<b><!DOCTYPE slideshow SYSTEM "slideshow.dtd"></b>
<slideshow
</pre>
</blockquote>
<p><a name="root"></a>Again, the DTD tag starts with "<code><!</code>".
In this case, the tag name, <code>DOCTYPE</code>, says that the document is
a <code>slideshow</code>, which means that the document consists of the <code>slideshow</code>
element and everything within it: </p>
<blockquote>
<pre><slideshow>
...
</slideshow></pre>
</blockquote>
<p>This tag defines the <code>slideshow</code> element as the <a href="../glossary.html#root">root</a>
element for the document. An XML document must have exactly one root element.
This is where that element is specified. In other words, this tag identifies
the document <i>content</i> as a <code>slideshow</code>. </p>
<p><a name="DOCTYPE"></a>The <code>DOCTYPE</code> tag occurs after the XML declaration
and before the root element. The <code>SYSTEM</code> identifier specifies the
location of the DTD file. Since it does not start with a prefix like <code>http:/</code>
or <code>file:/</code>, the path is relative to the location of the XML
document. Remember the <code>setDocumentLocator</code> method? The parser is
using that information to find the DTD file, just as your application would
to find a file relative to the XML document. A <code>PUBLIC</code> identifier
could also be used to specify the DTD file using a unique name, but the parser
would have to be able to resolve it.</p>
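<p>Putting the pieces in order, the start of the document might look like the sketch below.
The exact form of the XML declaration is an assumption here; what matters is that
<code>DOCTYPE</code> sits between it and the root element:</p>
<blockquote>
<pre><?xml version='1.0' encoding='utf-8'?>
<!-- A SAMPLE set of slides -->
<!DOCTYPE slideshow SYSTEM "slideshow.dtd">
<slideshow>
...
</slideshow></pre>
</blockquote>
<p>For comparison, here are two alternative forms. The URL and the public identifier are
hypothetical values made up for this example. Note that in XML a <code>PUBLIC</code>
identifier is always followed by a system identifier, which the parser can fall back on:</p>
<blockquote>
<pre><!-- Absolute system identifier (hypothetical URL) -->
<!DOCTYPE slideshow SYSTEM "http://www.example.com/dtds/slideshow.dtd">

<!-- Public identifier plus system identifier (hypothetical public ID) -->
<!DOCTYPE slideshow PUBLIC "-//Example//DTD Slideshow 1.0//EN" "slideshow.dtd"></pre>
</blockquote>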
<p>The <code>DOCTYPE</code> specification can also contain DTD definitions within
the XML document, rather than referring only to an external DTD file. Such definitions
are contained in square brackets, like this:</p>
<blockquote>
<pre><!DOCTYPE slideshow SYSTEM "slideshow1.dtd" <b>[
...<i>local subset definitions here</i>...
]></b></pre>
</blockquote>
<p>You'll take advantage of that facility later on to define some entities that
can be used in the document. </p>
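<p>As a preview, a local subset that defines one entity could be sketched as follows.
The entity name <code>product</code> and its value are hypothetical, and the fragment
assumes that the external DTD allows a <code>slide</code> with a <code>title</code>
inside <code>slideshow</code>:</p>
<blockquote>
<pre><!DOCTYPE slideshow SYSTEM "slideshow.dtd" [
  <!ENTITY product "WonderWidget">
]>
<slideshow>
  <slide>
    <title>Introducing &product;</title>
  </slide>
</slideshow></pre>
</blockquote>
<p>When the document is parsed, the reference <code>&product;</code> is replaced by the
entity's text.</p>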
<blockquote>
<p><b><a name="Resolver"></a>Note:</b> <br>
If a public ID (<a href="../glossary.html#URN">URN</a>) is specified instead
of a system ID (<a href="../glossary.html#URL">URL</a>), then the parser has
to be able to resolve it to an actual address in order to use it. To do that,
the parser can be configured with a <a href="../../api/internal/com/sun/xml/parser/Resolver.html"><code>com.sun.xml.parser.Resolver</code></a>
using the parser's <code>setEntityResolver</code> method, and the URN can
be associated with a local URL using the resolver's <code>registerCatalogEntry</code>
method. </p>
</blockquote>
</body>
</html>