<!DOCTYPE HTML PUBLIC "html.dtd">
<HTML>
<HEAD>
<TITLE>Presenting XML:Morphing Existing HTML into XML:EarthWeb Inc.-</TITLE>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000">
<TD WIDTH="540" VALIGN="TOP">
<!--  <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15"   HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="  Book Search  "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> -->
<!--  ISBN=1575213346 //-->
<!--  TITLE=Presenting XML//-->
<!--  AUTHOR=Richard Light//-->
<!--  PUBLISHER=Macmillan Computer Publishing//-->
<!--  IMPRINT=Sams//-->
<!--  CHAPTER=12 //-->
<!--  PAGES=0213-0234 //-->
<!--  UNASSIGNED1 //-->
<!--  UNASSIGNED2 //-->
<P><CENTER><A HREF="../ch11/0209-0212.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0219-0222.html">Next</A></CENTER></P>
<A NAME="PAGENUM-213"><P>Page 213</P></A>
PART  
III</A></H3>
<H2><A NAME="ch11_ 17">Using XML<BR></A></H2>
<OL START="12">
<LI>     Morphing Existing HTML into XML
<LI>     Developing an XML Memo Application
<LI>     Creating an XML Museum Information Application
<LI>     Automating the Web: Rapid Integration with XML
</OL>
<H3><A NAME="ch12_ 1">CHAPTER 12</A></H3>
<H2>Morphing Existing<BR>HTML into XML</H2>
<B>by Richard Light</B>
<P>Let's start the XML practicals by taking a fairly typical Web page written in HTML and converting it to XML. The reason for doing this is not to suggest that all HTML pages will require this treatment! Rather, it's a good way to explore the differences in approach between HTML and XML.</P>
<A NAME="PAGENUM-214"><P>Page 214</P></A>
<TABLE BGCOLOR="#FFFF99"><TR><TD>Note:</TD></TR><TR><TD><BLOCKQUOTE>Surprisingly, a fair proportion of the markup that you need to change to make your Web page into a valid XML document is actually valid SGML. SGML provides a number of shortcuts that have been widely used in HTML markup. They aren't valid XML because XML deliberately chooses not to use these markup minimization techniques.</BLOCKQUOTE></TD></TR></TABLE>
<P>Listing 12.1 shows the sample Web page. It's not a real one, but every element in it is based on features commonly found in real pages.</P>
<P>Listing 12.1. The source of the sample HTML page.</P>
<!--  CODE //-->
<PRE>
&lt;html&gt;
&lt;head&gt;
&lt;title&gt;Morphing existing HTML into XML
&lt;hr&gt;
&lt;a href=&quot;l12_1.htm&quot;&gt;&lt;IMG ALIGN=MIDDLE SRC=&quot;home.gif&quot;
alt=&quot;[home page]&quot;&gt;&lt;/a&gt;
&lt;a href=&quot;l12html1.htm&quot;&gt;&lt;IMG ALIGN=MIDDLE SRC=&quot;html.gif&quot;
alt=&quot;[HTML]&quot;&gt;&lt;/a&gt;
&lt;a href=&quot;l12xml1.htm&quot;&gt;&lt;IMG ALIGN=MIDDLE SRC=&quot;xml.gif&quot;
alt=&quot;[XML]&quot;&gt;&lt;/a&gt;
&lt;hr&gt;
&lt;h1&gt;Morphing existing HTML into XML&lt;/h1&gt;
&lt;P&gt;We will start our XML practicals by taking a fairly typical
Web page, written in HTML, and converting it to XML. 
The reason
for doing this is &lt;i&gt;not&lt;/i&gt; to suggest that all HTML
pages will require this treatment! Rather, it's a good way to
explore the differences in approach between HTML and XML.
&lt;p&gt;&lt;b&gt;&lt;tt&gt;SGML short-cuts&lt;/b&gt;&lt;/tt&gt; are probably to blame for much
of the incorrect HTML that we see [&lt;a href=&quot;l12note1.htm&quot;&gt;Note 1&lt;/a&gt;]
&lt;p&gt;We will:
&lt;ul compact&gt;
&lt;li&gt;make our web page well-formed&lt;/li&gt;
&lt;li&gt;update the HTML DTD for XML and make our page valid&lt;/li&gt;
&lt;li&gt;use XML features to enhance our page&lt;/li&gt;
&lt;p&gt;&lt;tt&gt;Page last updated July 8th 1997 by Richard Light</PRE>
<!--  END CODE //-->
<P>When displayed by a Web browser, it looks something like what is shown in Figure 12.1.</P>
<A NAME="PAGENUM-215"><P>Page 215</P></A>
<BR><A HREF="javascript:displayWindow('images/ch12fg01.jpg',288,204)"><IMG SRC="images/tn_ch12fg01.jpg"></A>
<BR>Figure 12.1.<BR>A sample HTML page.<BR>
<H3><A NAME="ch12_ 2">Toward Well-Formedness: <BR>From Tag Soup to Neatly <BR>Packed Suitcases</A></H3>
<P>Your first objective is to make the sample page well-formed. This means that each element must have both a start-tag and an end-tag. It also means that the tags must nest neatly. (Remember the nested suitcases from Chapter 6, &quot;Logical Structures in XML Documents.&quot;)</P>
<H4><A NAME="ch12_ 3">Non-Nested Tags</A></H4>
<P>A common error is to treat HTML start-tags and end-tags as simply a means of switching features on and off. Therefore, users put them in the wrong order. For instance, the following code is not well-formed XML, because the b and tt tags do not nest properly:</P>
<!--  CODE SNIP //-->
<PRE>&lt;b&gt;&lt;tt&gt;SGML short-cuts&lt;/b&gt;&lt;/tt&gt;</PRE>
<!--  END CODE SNIP //-->
<A NAME="PAGENUM-216"><P>Page 216</P></A>
<P>Having encountered the &lt;b&gt; start-tag, you are &quot;inside&quot; a b element when you insert a &lt;tt&gt; start-tag. 
Thus, you need to finish the tt element before you can finish the b element. Let's indent the markup to see that more clearly:</P>
<!--  CODE SNIP //-->
<PRE>
&lt;b&gt;
    &lt;tt&gt;
        SGML short-cuts
    &lt;/b&gt; &lt;!-- wrong!! --&gt;
&lt;/tt&gt; &lt;!-- wrong!! --&gt;</PRE>
<!--  END CODE SNIP //-->
<P>Switching the end-tags solves the problem:</P>
<!--  CODE SNIP //-->
<PRE>&lt;b&gt;&lt;tt&gt;SGML short-cuts&lt;/tt&gt;&lt;/b&gt;</PRE>
<!--  END CODE SNIP //-->
<P>The tags now nest neatly.</P>
<!--  CODE SNIP //-->
<PRE>
&lt;b&gt;
    &lt;tt&gt;
        SGML short-cuts
    &lt;/tt&gt;
&lt;/b&gt;</PRE>
<!--  END CODE SNIP //-->
<H4><A NAME="ch12_ 4">Adding Implied Start-Tags and End-Tags</A></H4>
<P>In this page, many end-tags and even a few start-tags have been omitted. You can do this in SGML (if your DTD allows the tags in question to be omitted), but never in XML. For example, both the start-tag and the end-tag for the body element are completely absent, and there is no end-tag for the html element that contains the whole page. As you have just seen when sorting non-nested tags, all the p elements also have a start-tag but no end-tag.</P>
<P>Listing 12.2 shows what the code looks like after you add all those missing tags.</P>
<P>Listing 12.2. 
An HTML page with all tagging explicit.</P>
<!--  CODE //-->
<PRE>
&lt;html&gt;
&lt;head&gt;
&lt;title&gt;Morphing existing HTML into XML&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;hr&gt;
&lt;a href=&quot;l12_1.htm&quot;&gt;&lt;IMG ALIGN=MIDDLE SRC=&quot;home.gif&quot;
alt=&quot;[home page]&quot;&gt;&lt;/a&gt;
&lt;a href=&quot;l12html1.htm&quot;&gt;&lt;IMG ALIGN=MIDDLE SRC=&quot;html.gif&quot;
alt=&quot;[HTML]&quot;&gt;&lt;/a&gt;
&lt;a href=&quot;l12xml1.htm&quot;&gt;&lt;IMG ALIGN=MIDDLE SRC=&quot;xml.gif&quot;
alt=&quot;[XML]&quot;&gt;&lt;/a&gt;
&lt;hr&gt;
&lt;h1&gt;Morphing existing HTML into XML&lt;/h1&gt;
&lt;P&gt;We will start our XML practicals by taking a fairly typical</PRE>
<!--  END CODE //-->
<A NAME="PAGENUM-217"><P>Page 217</P></A>
<!--  CODE //-->
<PRE>
Web page, written in HTML, and converting it to XML. The reason
for doing this is &lt;i&gt;not&lt;/i&gt; to suggest that all HTML
pages will require this treatment! Rather, it's a good way to
explore the differences in approach between HTML and XML.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;tt&gt;SGML short-cuts&lt;/tt&gt;&lt;/b&gt; are probably to blame for much
of the incorrect HTML that we see [&lt;a href=&quot;l12note1.htm&quot;&gt;Note 1&lt;/a&gt;]&lt;/p&gt;
&lt;p&gt;We will:
&lt;ul compact&gt;
&lt;li&gt;make our web page well-formed&lt;/li&gt;
&lt;li&gt;update the HTML DTD for XML and make our page valid&lt;/li&gt;
&lt;li&gt;use XML features to enhance our page&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;
&lt;p&gt;&lt;tt&gt;Page last updated July 8th 1997 by Richard Light&lt;/tt&gt;&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</PRE>
<!--  END CODE //-->
<H3><A NAME="ch12_ 5">Tidying Up Those Attributes</A></H3>
<P>The rules for XML attributes state that all attribute values need to be quoted. Single or double quotes can be used. Numerical attribute values, in particular, tend not to be quoted in HTML. 
In this case, the alignment of the image</P>
<!--  CODE SNIP //-->
<PRE>&lt;IMG ALIGN=MIDDLE SRC=&quot;home.gif&quot;&gt;</PRE>
<!--  END CODE SNIP //-->
<P>should be quoted:</P>
<!--  CODE SNIP //-->
<PRE>&lt;IMG ALIGN=&quot;MIDDLE&quot; SRC=&quot;home.gif&quot;&gt;</PRE>
<!--  END CODE SNIP //-->
<P>Also, where an attribute has a single possible value, it is common practice to enter that value without giving the attribute's name at all:</P>
<!--  CODE SNIP //-->
<PRE>&lt;ul compact&gt;</PRE>
<!--  END CODE SNIP //-->
<P>SGML allows this, but for XML the attribute specification needs to be entered in full:</P>
<!--  CODE SNIP //-->
<PRE>&lt;ul compact=&quot;compact&quot;&gt;</PRE>
<!--  END CODE SNIP //-->
<P>This is much less elegant, but then the HTML DTD was undoubtedly designed with this SGML shortcut in mind!</P>
<A NAME="PAGENUM-218"><P>Page 218</P></A>
<H3><A NAME="ch12_ 6">Converting Empty Elements to <BR>XML Format</A></H3>
<P>All elements without any content, formally known as empty elements, need to be marked up in a different way in XML. The special format used was invented for XML. (It's a novelty for people coming to XML from SGML.) Its advantage is that it marks an empty element quite unambiguously. With SGML conventions, you cannot tell whether an element is empty without looking up its declaration in the DTD.</P>
<P>You need to change the img element from</P>
<!--  CODE SNIP //-->
<PRE>&lt;IMG ALIGN=&quot;MIDDLE&quot; SRC=&quot;home.gif&quot;&gt;</PRE>
<!--  END CODE SNIP //-->
<P>to</P>
<!--  CODE SNIP //-->
<PRE>&lt;IMG ALIGN=&quot;MIDDLE&quot; SRC=&quot;home.gif&quot;/&gt;</PRE>
<!--  END CODE SNIP //-->
<P>At this point you start paying a price: your HTML page is no longer valid HTML, because HTML does not recognize the XML convention for empty elements.</P>
<H3><A NAME="ch12_ 7">A Well-Formed XML Document</A></H3>
<P>With all these changes, your sample page will now report as well-formed if an XML processor gives it the once-over. 
To celebrate this progress, add an XML declaration to the start of the page to make your intentions perfectly clear. The declaration states that you intend the page to be interpreted as a piece of XML.</P>
<P>The point of all the changes you have been making is that the logical structure of the page is now quite explicit. Notice that you haven't yet formally stated that this is actually an HTML document of any sort, yet an XML processor would be able to read this document and correctly deduce its markup from the start-tags and end-tags, empty elements, and clearly labeled attribute specifications that you have provided. It doesn't need the HTML DTD to do those things. That is the advantage that well-formedness provides.</P>
<P>Listing 12.3 shows how the whole page looks now.</P>
<P><CENTER><A HREF="../ch11/0209-0212.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0219-0222.html">Next</A></CENTER></P>
</TD></TR></TABLE></BODY></HTML>
