<!DOCTYPE HTML PUBLIC "html.dtd"><HTML><HEAD><TITLE>Presenting XML:The XML Processor:EarthWeb Inc.-</TITLE><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><SCRIPT><!--function displayWindow(url, width, height) {        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');}//--></SCRIPT></HEAD><BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000"><TD WIDTH="540" VALIGN="TOP"><!--  <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15"   HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="  Book Search  "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> --><!--  ISBN=1575213346 //--><!--  TITLE=Presenting XML//--><!--  AUTHOR=Richard Light//--><!--  PUBLISHER=Macmillan Computer Publishing//--><!--  IMPRINT=Sams//--><!--  CHAPTER=11 //--><!--  PAGES=0201-0212 //--><!--  UNASSIGNED1 //--><!--  UNASSIGNED2 //--><P><CENTER><A HREF="../ch10/0200-0200.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0205-0208.html">Next</A></CENTER></P><A NAME="PAGENUM-201"><P>Page 201</P></A><H3><A NAME="ch11_ 1">CHAPTER 
11</A></H3>
<H2>The XML Processor</H2>
<B>by Richard Light</B>
<P>As I discuss in Chapter 2, &quot;Enter XML,&quot; the XML language draft describes a piece of software called the XML processor. In this chapter, I outline the jobs that the XML processor is required to do.</P>
<P>The XML-Lang specification says the following about the XML processor:
<BLOCKQUOTE>&quot;A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, referred to as the application. This specification describes the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application.&quot;</BLOCKQUOTE>
<P>In this chapter, I go through the XML-Lang specification and summarize that &quot;required behavior of an XML processor.&quot;</P>
<A NAME="PAGENUM-202"><P>Page 202</P></A>
<H3><A NAME="ch11_ 2">Types of XML Processors</A></H3>
<P>Conforming XML processors can work at one of two levels.</P>
<P>A nonvalidating XML processor checks XML documents for well-formedness. This means that the XML processor does not always have to make use of the information in the DTD. However, in order to be conforming, a nonvalidating XML processor must be able to process XML DTDs correctly. It must be able to read the DTD on request in order to check that an XML document is well-formed.</P>
<P>A validating XML processor additionally checks that the documents are valid; that is, that they conform to the rules in their DTD and to all the validity constraints specified in the XML-Lang specification.
Clearly, the DTD must always be processed to achieve this.</P>
<H3><A NAME="ch11_ 3">General Ground Rules</A></H3>
<P>There are some general conventions about error handling and the treatment of characters that apply to all XML processors.</P>
<H4><A NAME="ch11_ 4">Treatment of Errors</A></H4>
<P>XML processors are expected to be good citizens in the XML community, which requires them to know the law and to report any misdemeanors they encounter. In other words, XML processors must have full knowledge of the XML standard. Where the standard says that something must apply, the XML processor is required to behave as described. Where the standard says that something might apply, the XML processor is permitted to, but is not required to, behave as described.</P>
<P>On encountering a fatal error in an XML document, an XML processor must report it to the application and discontinue normal processing. It can, however, pass undigested text to the application to help with error correction.</P>
<P>The only fatal errors I can find mentioned are violations of the rules for a well-formed XML document. Violating the rules for validity, by contrast, is a reportable error and is not fatal. This makes sense. A well-formed document gives you a fully tree-structured object that you can usefully process. Anything less is not really usable.</P>
<A NAME="PAGENUM-203"><P>Page 203</P></A>
<P>This strategy for error handling is radically different from the approaches adopted by both HTML and SGML processors (parsers). On one hand, the HTML philosophy is that any page should be processed without complaint, whatever errors it might contain. On the other hand, SGML parsers will diligently report each and every error in an SGML document.
In both cases, however, the processor will battle on through the document, attempting to recover from the errors it encounters whenever possible.</P>
<P>The XML approach to validity checking is similar to the SGML approach: violations of validity constraints will be reported to the application, and the processor will continue. However, the draconian approach to well-formedness constraints in XML is quite new. By saying that an XML processor will stop processing, and so reject any XML document that is not well-formed, the XML-Lang specification is stating quite categorically that well-formedness is the price of admission to the XML world.</P>
<H4><A NAME="ch11_ 5">Character Processing</A></H4>
<P>Where strings of characters have to be folded to a single case, the XML processor has to fold to uppercase (despite the normal convention for Unicode, which is to fold to lowercase). Uppercase is necessary so that XML can be compatible with SGML, which uses uppercase.</P>
<P>All XML processors must accept the UTF-8 and UCS-2 encodings of ISO 10646 as a minimum. They must be able to use the Byte Order Mark to distinguish between these two encodings. In addition, the XML-Lang specification suggests that it is advantageous for XML processors (and applications) to be capable of interpreting the widest possible range of character encodings.</P>
<P>XML processors must read and act upon any encoding declaration that appears at the start of a text entity. Encoding declarations are described in detail in the &quot;Character Encoding in XML Text Entities&quot; section in Chapter 7, &quot;Physical Structures in XML Documents.&quot; It is recommended that XML processors be equipped with a variety of methods of deducing the encoding of an entity when it lacks a suitable encoding declaration. They should use:
They should use</P><UL><LI>          HTTP header information<LI>          MIME headers<LI>          Information about the XML text entity provided by theoperating system or document management software</UL><A NAME="PAGENUM-204"><P>Page 204</P></A><P>If all of the preceding techniques fail to provide reliable information onthe encoding of an XML text entity, the XML-Lang specification describesa method for auto-detecting the general class of encoding within an XMLtext entity that is not encoded in UTF-8 or UCS-2. This technique is based onthe knowledge that an XML text entity must begin with an encodingdeclaration, which in turn must begin with the characters&lt;?XML. When the general class of encoding has been recognized, the XML processor will be able to read the <BR>encoding declaration, because that is guaranteed to contain only ASCIIcharacters. It will then know the precise encoding that applies to this XMLtext entity, and can reliably process it.</P><P>If an XML processor cannot recognize or process the encoding of an entity,it should tell the application and offer it the choice of either treating theentity as a binary entity (one that the XML processor does not try to process) orgiving up.</P><P>A nonvalidating XML processor (which is just looking for well-formednessand is not taking much notice of the DTD) must pass all characters in adocument that are not markup to the application. This includes all white spacecharacters. In addition, a validating XML processor must spot where white spacecan be ignored (because it comes at the start or end of an element withelement content only). It must signal to the application that this white space is notsignificant.</P><P>This strategy represents a departure from the precedent set by SGMLand HTML processors, which do not pass through to the application anywhite space around markup that they consider to be insignificant. An XMLapplication is guaranteed to receive every character that appears in the sourceXML document. 
The XML processor will tell the application that certain characters are not significant, but it is up to the application to decide whether to suppress them.</P>
<H4><A NAME="ch11_ 6">Treatment of Logical Markup</A></H4>
<P>The XML processor must follow certain conventions when it is interpreting the logical structure of an XML document. (See Chapter 6, &quot;Logical Structures in XML Documents,&quot; for details of XML's logical structures.)</P>
<P><CENTER><A HREF="../ch10/0200-0200.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0205-0208.html">Next</A></CENTER></P>
</TD></TR></TABLE></BODY></HTML>
