<!DOCTYPE HTML PUBLIC "html.dtd"><HTML><HEAD><TITLE>Presenting XML:The XML Advantage:EarthWeb Inc.-</TITLE><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><SCRIPT><!--function displayWindow(url, width, height) { var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');}//--></SCRIPT></HEAD><BODY BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000"><TD WIDTH="540" VALIGN="TOP"><!-- <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15" HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE=" Book Search "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> --><!-- ISBN=1575213346 //--><!-- TITLE=Presenting XML//--><!-- AUTHOR=Richard Light//--><!-- PUBLISHER=Macmillan Computer Publishing//--><!-- IMPRINT=Sams//--><!-- CHAPTER=03 //--><!-- PAGES=0037-0050 //--><!-- UNASSIGNED1 //--><!-- UNASSIGNED2 //--><P><CENTER><A HREF="../ch02/0036-0036.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0041-0043.html">Next</A></CENTER></P><A NAME="PAGENUM-37"><P>Page 37</P></A><H3><A NAME="ch03_ 1">CHAPTER 3</A></H3><H2>The XML 
Advantage</H2><B>by Richard Light</B><P>So far, you've had a brief introduction to the theory and practice of generalized markup, and you've learned where XML came from. You have seen a broad overview of XML's main features. Now it's time to think, again in very broad terms, about the impact that XML might have on the world of the Web.</P><P>Let's kick off with a metaphor. Structured information resources might see XML as the Model T Ford of the Web. How's that? Well, think of the car as representing personal transport: being able to get out and about on your own terms. HTML doesn't offer this: information has to dress up in overalls (adopt HTML tagging) to climb aboard the HTML bus. Up to now, if information wanted personal transport on the Web, it had to buy into a Rolls-Royce technology</P><A NAME="PAGENUM-38"><P>Page 38</P></A><P>such as SGML. At last, with XML, it can find transportation for a reasonable cost. XML provides a standard packaging/transport mechanism for any type of information, small or gigantic, simple or mind-bendingly complex. XML transports any number of different types of travelers, all heading somewhere they'll be welcome.</P><P>Now let's see how far that metaphor will travel!</P><H3><A NAME="ch03_ 2">Documents that Know Themselves</A></H3><P>When an XML document arrives, it brings with it a knowledge of its own structure and semantics that makes it much more useful in its own right. This occurs before you even start to think about potential applications for XML. In this section, you look at the general advantages of the XML approach to documents.</P><H4><A NAME="ch03_ 3">Header Information: The Owner's Handbook</A></H4><P>A key feature of valid XML documents is that they all start with a definition of their own rules and resource requirements. 
Even when the document type declaration is just a single line, like the following declaration, it specifies a URL that can, and routinely will, be resolved as the file is read:</P><!-- CODE SNIP //--><PRE>&lt;!DOCTYPE TEI.2 SYSTEM "http://www-tei.uic.edu/orgs/tei/p3/dtd/teilite-xml.dtd"&gt;</PRE><!-- END CODE SNIP //--><P>Therefore, the complete document type definition (DTD), in this case the TEI Lite XML DTD, will always be available to the processor that is trying to interpret the XML document. (See Chapter 8, "Keeping It Tidy: The XML Rule Book," for a full description of the information that can be found in the DTD.)</P><P>The DTD states the overall document type, and it goes on to specify which element types are allowed and the properties of each type.</P><P>This header information allows XML applications to give a much better service to users. Generic XML tools, such as editors, can use the rules in the DTD to offer context-sensitive lists of allowed elements and to fill out new elements with a template for any mandatory attributes or subelements. Application-specific software can use the document type to recognize the XML documents it is able to process.</P><A NAME="PAGENUM-39"><P>Page 39</P></A><P>Compare this with the HTML case. Even where an external DTD subset is specified at the start of an HTML page, most HTML software will make no attempt to resolve it, and even less attempt to act on the information the DTD contains. Here is an example:</P><!-- CODE SNIP //--><PRE>&lt;!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"&gt;</PRE><!-- END CODE SNIP //--><P>Yet useful distinctions can be made between the structure of HTML 2.0 and 3.2 documents and documents using vendor-specific extensions to the HTML tagset. Also, the DTD must be read if software is to correctly infer the implied structure of the HTML document when markup (such as end-tags) has been omitted. 
As you have already seen, the HTML browser philosophy is to ignore the DTD, and also to ignore any markup that the particular processor doesn't understand. This makes for robustness, but at a considerable price in lost information.</P><P>If instead HTML browsers were to insist on having a DTD, and were to actually read it, then a generic HTML browser would be able to deal easily with updates to the HTML spec and variants on standard HTML without itself requiring an upgrade.</P><P>Another aspect of the DTD is that it declares up front the full set of resources that constitute the document (or at least that might form part of it). This allows XML processors to spot any potential problems (such as the unavailability of a URL, or a file type that the XML application cannot handle) before the full file is processed. The DTD might also contain processing instructions that link in XS style sheets, which state how the document expects itself to be displayed.</P><H4><A NAME="ch03_ 4">Browseable Document Structure</A></H4><P>Thanks to XML's refusal to let you omit any tags, the internal logical structure of all XML documents is clear for all to see. This is true of well-formed XML documents as well as valid ones. Every XML document can be thought of and processed as a neatly organized tree structure of elements with associated data content. Again, this brings benefits at both generic and application-specific levels.</P><P>At the generic level, having all the markup given explicitly means that XML-aware browsers and editors can present the structure of any XML document as a nested set of folders. These can work just like the Windows File Manager</P><A NAME="PAGENUM-40"><P>Page 40</P></A><P>(or the Windows Explorer in Windows 95). The left pane allows you to open up or collapse the child elements of any element. The right pane shows the content of the currently selected element. Figure 3.1 shows the general idea. This section is being browsed by the SGML Panorama Pro package. 
The element structure appears on the left, with the current element highlighted.</P><A HREF="javascript:displayWindow('images/ch03fg01.jpg',288,204)"><IMG SRC="images/tn_ch03fg01.jpg"></A><BR>Figure 3.1.<BR>Browsing a document's structure.<BR><H4><A NAME="ch03_ 5">Searchable Document Structure</A></H4><P>The transparent structure of XML documents also means that searching within a document can be much more precise. Each word in the document has a well-defined context. (To be precise, each data character has a well-defined context, but that's probably too precise for most purposes!) Each data character forms part of the data content of a parent element, which itself has a parent element, and so on, all the way up to the root element, which contains the whole document. For example, this "quoted" word is inside a q element, inside a p element containing this paragraph, and so on. Its full context is currently as follows:</P><!-- CODE SNIP //--><PRE>div1 - div2 - div3 - div4 - p - q</PRE><!-- END CODE SNIP //--><P><CENTER><A HREF="../ch02/0036-0036.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0041-0043.html">Next</A></CENTER></P></TD></TR></TABLE></BODY></HTML>