<!DOCTYPE HTML PUBLIC "html.dtd"><HTML><HEAD><TITLE>Presenting XML:Logical Structures in XML Documents:EarthWeb Inc.-</TITLE><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><SCRIPT><!--function displayWindow(url, width, height) { var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');}//--></SCRIPT></HEAD><BODY BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000"><TD WIDTH="540" VALIGN="TOP"><!-- <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15" HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE=" Book Search "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> --><!-- ISBN=1575213346 //--><!-- TITLE=Presenting XML//--><!-- AUTHOR=Richard Light//--><!-- PUBLISHER=Macmillan Computer Publishing//--><!-- IMPRINT=Sams//--><!-- CHAPTER=06 //--><!-- PAGES=0085-0108 //--><!-- UNASSIGNED1 //--><!-- UNASSIGNED2 //--><P><CENTER><A HREF="../ch05/0081-0084.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0089-0092.html">Next</A></CENTER></P><A NAME="PAGENUM-85"><P>Page 85</P></A><H3><A NAME="ch06_ 1">CHAPTER 
6</A></H3>
<H2>Logical Structures in <BR>XML Documents</H2>
<B>by Richard Light</B>
<P>In Chapter 5, "The XML Approach," I reviewed the basics of the XML language and discussed types of markup (such as comments, processing instructions, and CDATA sections) that are generally useful but rather incidental to the real job that XML is trying to do. In this chapter, you will see the logical structure of XML documents and what makes them into useful pieces of structured information.</P>
<H3><A NAME="ch06_ 2">XML Documents</A></H3>
<P>Now that you've learned the basics, you can finally find out what an XML document is! This process will be rather like peeling an onion. Each time I define a new XML concept, you will see that some more specific concepts are hidden inside it.</P>
<A NAME="PAGENUM-86"><P>Page 86</P></A>
<P>Starting with the whole onion, let's look at the top-level logical structure of XML documents. XML documents have an optional prolog, followed by a required element known as the document element, and then optional miscellaneous stuff (such as comments, processing instructions, and white space) at the end.</P>
<P>Here is a simple example. Believe it or not, this is a genuine, complete, and well-formed XML document:</P>
<!-- CODE SNIP //--><PRE>&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</PRE><!-- END CODE SNIP //-->
<P>This example is a document element, without any prolog. You'll take this example and build on it as you go along.</P>
<P>The prolog consists of two main components, both of which are optional.</P>
<H4><A NAME="ch06_ 3">XML Declaration</A></H4>
<P>The XML declaration is a special processing instruction declaring that this is an XML document and quoting the version of XML to which it conforms (currently 1.0):</P>
<!-- CODE SNIP //--><PRE>&lt;?XML version="1.0"?&gt;</PRE><!-- END CODE SNIP //-->
<P>In practice, it is a very good idea to include an XML declaration in even the simplest XML documents.
Here is how to add one to your example:</P>
<!-- CODE SNIP //--><PRE>&lt;?XML version="1.0"?&gt;
&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</PRE><!-- END CODE SNIP //-->
<P>By including an XML declaration, you are making it crystal clear both to human readers and to software that this is intended to be an XML document. The XML declaration, if present, is always the first piece of markup in an XML document.</P>
<P>The XML declaration can also contain information about the character encoding scheme used in the document:</P>
<!-- CODE SNIP //--><PRE>&lt;?XML version="1.0" encoding="UTF-8"?&gt;</PRE><!-- END CODE SNIP //-->
<P>This is associated more with the physical than the logical structure of the XML document, because it helps software to read the characters in the document correctly. This aspect of the XML declaration is discussed in Chapter 7, "Physical Structures in XML Documents," in the section "Character Encoding in XML Text Entities."</P>
<A NAME="PAGENUM-87"><P>Page 87</P></A>
<P>The XML declaration can contain guidelines on whether it is necessary to process all or part of the DTD. These guidelines are the required markup declaration (RMD) and look like the following:</P>
<!-- CODE SNIP //--><PRE>&lt;?XML version="1.0" RMD="INTERNAL"?&gt;</PRE><!-- END CODE SNIP //-->
<P>This is discussed in Chapter 8, "Keeping It Tidy: The XML Rule Book."</P>
<H4><A NAME="ch06_ 4">Document Type Declaration</A></H4>
<P>The second main part of the prolog is the document type declaration. It must appear between the XML declaration and the start of the document element. The document type declaration indicates the rules that the XML document is following (or at least trying to follow).
These rules are collectively known as the document type definition (DTD).</P>
<P>These rules can be held in another entity (that is, another file), as in the following example:</P>
<!-- CODE SNIP //--><PRE>&lt;?XML version="1.0"?&gt;
&lt;!DOCTYPE greeting SYSTEM "hello.dtd"&gt;
&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</PRE><!-- END CODE SNIP //-->
<P>Here, the rules for this document can be found in the file referenced by the URL hello.dtd. This entity is called the external subset of the DTD.</P>
<P>The document type declaration can also include some or all of these rules within itself:</P>
<!-- CODE //--><PRE>&lt;?XML version="1.0"?&gt;
&lt;!DOCTYPE greeting [
&lt;!ELEMENT greeting (#PCDATA)&gt;
]&gt;
&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</PRE><!-- END CODE //-->
<P>In this case, the single rule for this document (an element declaration for greeting) is actually inside the document type declaration before the start of the document proper. This is called the internal subset of the DTD.</P>
<P>In general, you will have both an external and an internal DTD subset in your document type declarations. The general idea behind this two-part approach is that you can use the external subset to refer to a standard DTD. Then you declare any features that are specific to this particular document in the internal DTD subset. (The rule to support this approach states that declarations in the internal DTD subset are read first and take precedence over declarations in the external DTD subset.)</P>
<A NAME="PAGENUM-88"><P>Page 88</P></A>
<P>Normally, you should not use your internal DTD subset to change the structural rules defined by the external DTD, although it is quite possible to do that with XML. However, you still need this opportunity to declare, for example, all of the entities that your document uses, such as image files.</P>
<H4><A NAME="ch06_ 5">The Document Element</A></H4>
<P>So the "meat" of an XML document consists of exactly one element: the document element.
This doesn't sound like much, until you consider that inside this element any number of subelements can be nested to any depth, and any amount of text can be included.</P>
<P>The key points are that an XML document cannot consist of more than one top-level element, and it cannot be just part of an element.</P>
<P>The following is not an XML document, because it contains two top-level elements:</P>
<!-- CODE SNIP //--><PRE>&lt;?XML version="1.0"?&gt;
&lt;greeting&gt;Hello, world!&lt;/greeting&gt;
&lt;response&gt;Hello, XML!&lt;/response&gt;</PRE><!-- END CODE SNIP //-->
<P>On the other hand, the following is a well-formed XML document, because it has a single conversation element containing the greeting and the response:</P>
<!-- CODE //--><PRE>&lt;?XML version="1.0"?&gt;
&lt;conversation&gt;
&lt;greeting&gt;Hello, world!&lt;/greeting&gt;
&lt;response&gt;Hello, XML!&lt;/response&gt;
&lt;/conversation&gt;</PRE><!-- END CODE //-->
<H3><A NAME="ch06_ 6">Well-Formed and Valid Documents</A></H3>
<P>I've casually used the adjectives well-formed and valid to describe XML documents. What do these words mean, exactly?</P>
<H4><A NAME="ch06_ 7">Well-Formed Documents</A></H4>
<P>A well-formed XML document is one that "looks right," but whose logical structure is not validated against the DTD (if any) associated with the document. In other words, it needs to have a single outermost element (the root or document element) within which all the other elements and character data are neatly nested. But it isn't necessary for the document's elements and their attributes to be declared in the DTD, and no check is made that each element contains the subelements that the DTD says it should contain.</P>
<P><CENTER><A HREF="../ch05/0081-0084.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0089-0092.html">Next</A></CENTER></P></TD></TR></TABLE></BODY></HTML>