<!DOCTYPE HTML PUBLIC "html.dtd">
<HTML>
<HEAD>
<TITLE>Presenting XML:The XML Approach:EarthWeb Inc.-</TITLE>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000">
<TD WIDTH="540" VALIGN="TOP">
<!--  ISBN=1575213346 //-->
<!--  TITLE=Presenting XML//-->
<!--  AUTHOR=Richard Light//-->
<!--  PUBLISHER=Macmillan Computer Publishing//-->
<!--  IMPRINT=Sams//-->
<!--  CHAPTER=05 //-->
<!--  PAGES=0067-0084 //-->
<!--  UNASSIGNED1 //-->
<!--  UNASSIGNED2 //-->
<P><CENTER><A HREF="0071-0073.html">Previous</A> | <A HREF="../ewtoc.html">Table of Contents</A> | <A HREF="0077-0080.html">Next</A></CENTER></P>
<A NAME="PAGENUM-74"><P>Page 74</P></A>
<H4><A NAME="ch05_ 14">8: The Design of XML 
Shall Be Formal and Concise</A></H4>
<P>The XML language specification uses Extended Backus-Naur Form (EBNF), a standard notation for describing the syntax of programming languages. Wherever possible, EBNF notation is used in preference to a prose description of an XML feature. Conciseness has also been achieved by including only those parts of SGML that are absolutely necessary, as noted earlier.</P>
<H4><A NAME="ch05_ 15">9: XML Documents Shall Be Easy to Create</A></H4>
<P>This is perhaps more of a user perspective. Ease of authoring, like ease of DTD design, is helped most by good authoring software. This software has two main goals. It keeps your documents valid by letting you select only markup that is allowed in the current context. And it saves you keystrokes by putting in the markup for you. Such software lets you select element types (tags) from drop-down lists, add attribute values via dialog boxes, and so on. This is very similar to the aspects of a word processor that have to do with style selection.</P>
<P>Conversely, it is quite feasible to create (and update) XML documents with any simple editor or word processor that can deal with ASCII documents and offers the facility of saving the resulting file as plain ASCII. This might not be an attractive method for creating a major piece of work, but it is very useful to be able to use an editor to tweak XML documents that have been delivered to you requiring minor changes. This compares well with page-oriented delivery methods such as PostScript or PDF, where changes are not possible.</P>
<H4><A NAME="ch05_ 16">10: Terseness in XML Markup Is of Minimal Importance</A></H4>
<P>This is another lesson from SGML. The SGML standard has options that allow you to omit tags (particularly end tags) and leave an SGML processor to infer their presence. 
These features were included primarily to save keystrokes and to reduce the disc storage requirements of the resulting documents. However, if software is doing the work of adding the markup, there is little point in your being able to omit it. You don't have to type the tags yourself, and the disc storage overhead is not really an issue these days. (Just look at the size of a typical image file or the large and uncontrollable overhead of &quot;binary stuff&quot;</P>
<A NAME="PAGENUM-75"><P>Page 75</P></A>
<P>in typical word processor files! The Word file containing this chapter is currently 66,048 bytes, but it contains only 25,197 data characters. Nearly 62% of the file is overhead. So even if markup adds 100% to the size of an XML document, it is still doing better than Word.)</P>
<P>If you minimize markup, the structure of the document is much less clear to the human eye. Also, software finds it more difficult to discern the structure.</P>
<H3><A NAME="ch05_ 17">Representation of Characters in XML</A></H3>
<P>An XML document consists of a sequence of characters, each of which is &quot;an atomic unit of text represented by a bit string.&quot; This is a familiar concept to anyone who has probed around inside computer systems. For example, PC-compatible computers use an extension of the long-established ASCII character set. In the original 7-bit ASCII scheme, each letter, number, and punctuation symbol was given a different 7-bit code, similar to the following examples:</P>
<UL>
<LI>0110000 stands for the digit 0.
<LI>0110001 stands for the digit 1.
<LI>1000001 stands for the letter A.
</UL>
<P>These bit patterns are commonly represented by a hexadecimal number, such as 30 standing for the digit 0. Only 128 different 7-bit patterns exist, so 7-bit ASCII is able to represent only 128 different characters. With the advent of PCs, this proved to be a limitation; therefore, 8-bit character sets have been used most often since the advent of MS-DOS. 
Going to 8 bits doubles the number of possible patterns to 256, and it allows the PC character set to include accented characters, drawing shapes, a selection of Greek letters, and <BR>so on.</P>
<P>XML uses ISO 10646, or Unicode, which takes the next logical step and supports the use of 16-bit patterns to represent characters. These allow more than 65,000 different characters to be represented. ISO 10646 provides a standard definition for all the characters found in many European and Asian languages. (16-bit encoding is not mandatory; it's just one of the options supported by ISO 10646. For cases in which a more compact encoding built on 8-bit units is sufficient, ISO 10646 offers the UTF-8 encoding scheme.) In addition, ISO 10646 provides private use areas for user extensions to the standard character set. Chapter 7, &quot;Physical Structures in XML Documents,&quot; discusses character encoding issues in more detail.</P>
<A NAME="PAGENUM-76"><P>Page 76</P></A>
<P>XML defines the range of legal characters (which are the characters that can appear in an XML document) as those with the following hexadecimal values:</P>
<UL>
<LI>     09 is the tab.
<LI>     0D is the carriage return.
<LI>     0A is the line feed.
<LI>     20-FFFD and 00010000-7FFFFFFF are the legal graphics characters of Unicode and ISO 10646.
</UL>
<P>Characters are classified for convenience as letters, digits, and other characters. There are detailed rules for dealing with compound and ideographic characters and with layout and format-control characters.</P>
<H3><A NAME="ch05_ 18">Primitive Constructs</A></H3>
<P>I will introduce here a few low-level concepts that crop up throughout the XML specification. Name, Names, Nmtoken, and Nmtokens are used within the definition of many XML constructs, such as start-tags, attributes, and element declarations. 
They contain name characters, so I will define those first. XML defines name characters as</P>
<UL>
<LI>   Letters
<LI>   Digits
<LI>   Hyphens
<LI>   Underscores
<LI>   Full stops
<LI>   Special characters in classes CombiningChar, Ignorable, and Extender
</UL>
<P>The four primitive constructs are then defined thus:</P>
<UL>
<LI>  A Name consists of a letter or underscore, followed by zero or more name characters.
<LI>     Names refers to one or more Name entries, separated by white space.
<LI>    A name token (commonly known as an Nmtoken) is any mixture of name characters.
<LI>Nmtokens refers to one or more Nmtoken entries, separated by white space.
</UL>
</TD></TR></TABLE></BODY></HTML>
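
The legal-character list and the four primitive constructs described above can be sketched as a few small Python predicates. This is a minimal illustration, not the specification's grammar. The function names are hypothetical, the letter and digit tests lean on Python's Unicode general categories, and the CombiningChar class is approximated by the "mark" categories. Ignorable and Extender are not modelled at all. The character ranges are transcribed as printed in the text.

```python
import unicodedata


def is_legal_char(ch: str) -> bool:
    """Legal-character test using the hexadecimal ranges listed in the text."""
    cp = ord(ch)
    if cp in (0x09, 0x0A, 0x0D):  # tab, line feed, carriage return
        return True
    # 20-FFFD, plus everything from 00010000 upward (as printed in the text).
    return 0x20 <= cp <= 0xFFFD or cp >= 0x10000


def is_name_char(ch: str) -> bool:
    """Approximate 'name character' test for a single character."""
    category = unicodedata.category(ch)
    return (
        category.startswith("L")     # letters
        or category == "Nd"          # decimal digits
        or category.startswith("M")  # assumption: stand-in for CombiningChar
        or ch in "-_."               # hyphen, underscore, full stop
    )


def is_name(text: str) -> bool:
    """A Name: a letter or underscore, then zero or more name characters."""
    if not text:
        return False
    first = text[0]
    starts_ok = first == "_" or unicodedata.category(first).startswith("L")
    return starts_ok and all(is_name_char(ch) for ch in text[1:])


def is_nmtoken(text: str) -> bool:
    """An Nmtoken: any non-empty mixture of name characters."""
    return bool(text) and all(is_name_char(ch) for ch in text)


def is_names(text: str) -> bool:
    """Names: one or more Name entries separated by white space."""
    parts = text.split()
    return bool(parts) and all(is_name(part) for part in parts)


def is_nmtokens(text: str) -> bool:
    """Nmtokens: one or more Nmtoken entries separated by white space."""
    parts = text.split()
    return bool(parts) and all(is_nmtoken(part) for part in parts)
```

Under these assumptions, `is_name("1st")` is false because a Name cannot start with a digit, while `is_nmtoken("1st")` is true because an Nmtoken places no restriction on its first character.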
