<html><head><title>Tree Processing (Perl and XML)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" /><meta name="DC.Creator" content="Erik T. Ray and Jason McIntosh" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly &amp; Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="059600205XL" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Perl and XML" /><meta name="DC.Type" content="Text.Monograph" /></head><body bgcolor="#ffffff"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl &amp; XML" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch05_07.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228" /><td align="right" valign="top" width="228"><a href="ch06_02.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div><h1 class="chapter">Chapter 6. Tree Processing</h1><div class="htmltoc"><h4 class="tochead">Contents:</h4>  <p> <a href="#perlxml-CHP-6-SECT-1">XML Trees</a><br /><a href="ch06_02.htm">XML::Simple</a><br /><a href="ch06_03.htm">XML::Parser's Tree Mode</a><br /><a href="ch06_04.htm">XML::SimpleObject</a><br /><a href="ch06_05.htm">XML::TreeBuilder</a><br /><a href="ch06_06.htm">XML::Grove</a><br /></p></div><p>Having<a name="INDEX-478" /></a> done just about all we can do with streams, it's time to move on to another style of XML processing. Instead of letting the XML fly past the program one tiny piece at a time, we will capture the whole document in memory and <em class="emphasis">then</em> start working on it. 
Having an in-memory representation built behind the scenes for us makes our job much easier, although it tends to require more memory and CPU cycles.</p><p>This chapter is an overview of programming with persistent XML objects, better known as <a name="INDEX-479" /></a><em class="emphasis">tree processing</em>. It looks at a variety of different modules and strategies for building and accessing XML trees, including the rigorous, standard Document Object Model (DOM), fast access to internal document parts with XPath, and efficient tree processing methods.</p><div class="sect1"><a name="perlxml-CHP-6-SECT-1" /></a><h2 class="sect1">6.1. XML Trees</h2><p>Every XML document can be represented as a collection of data objects linked in an acyclic structure called a tree. Each object, or <em class="emphasis">node</em><a name="INDEX-480" /></a>, is a small piece of the document, such as an element, a piece of text, or a processing instruction. One node, called the <em class="emphasis">root</em>, links to other nodes, which link to others in turn, and so on down to nodes that have no children of their own. Draw this structure out and it looks like a big, bushy tree -- hence the name.</p><p>A tree structure representing a piece of XML is a handy thing to have. Since a tree is acyclic (it has no circular links), you can use simple traversal methods that won't get stuck in infinite loops. As with a filesystem directory tree, you can represent the location of a node easily in simple shorthand. As with real trees, you can break a piece off and treat it like a smaller tree -- a tree is just a collection of subtrees joined by a root node. Best of all, you have all the information in one place and can search through it like a database.</p><p>For the programmer, a tree makes life much easier. Stream processing, you will recall, requires you to remember fleeting details to use later in constructing another data structure or printing out information. This work is tedious, and can be downright horrible for very complex documents. 
If you have to combine pieces of information from different parts of the document, then you might go mad. If you have a tree containing the document, though, all the details are right in front of you. You only need to write code to sift through the nodes and pull out what you need.</p><p>Of course, you don't get anything good for free. There is a penalty for having easy access to every point in a document. Building the tree in the first place takes time and precious CPU cycles, and even more if you use object-oriented method calls. There is also a memory tax to pay, since each object in the tree takes up some space. With very large documents (trees with millions of nodes are not unheard of), you could bring your poor machine down to its knees with a tree processing program. On average, though, processing trees can get you pretty good results (especially with a little optimizing, as we show later in the chapter), so don't give up just yet.</p><p>As we talk about trees, we will frequently use genealogical terms to describe relationships between nodes. A <a name="INDEX-481" /></a>container node is said to be the <em class="emphasis">parent</em><a name="INDEX-482" /></a> of the nodes it branches to, each of which may be called a <em class="emphasis">child</em> of the container node. Likewise, the terms <em class="emphasis">descendant</em><a name="INDEX-483" /></a>, <em class="emphasis">ancestor</em><a name="INDEX-484" /></a>, and <em class="emphasis">sibling</em><a name="INDEX-485" /></a> mean pretty much what you think they would. So two sibling nodes share the same parent node, and every node other than the root has the root node as an ancestor.</p><p>There are several different species of trees, depending on the implementation you're talking about. Each species models the document in a slightly different way. For example, do you consider an entity reference to be a separate node from text, or would you include the reference in the same package? 
You have to pay attention to the individual scheme of each module. <a href="ch06_01.htm#perlxml-CHP-6-TABLE-1">Table 6-1</a> shows a common selection of node types.</p><a name="perlxml-CHP-6-TABLE-1" /></a><h4 class="objtitle">Table 6-1. Typical node type definitions</h4><table border="1"><tr><th><p>Type</p></th><th><p>Properties</p></th></tr><tr><td><p><a name="INDEX-486" /></a>Element</p></td><td><p>Name, attributes, references to children</p></td></tr><tr><td><p><a name="INDEX-487" /></a>Namespace</p></td><td><p>Prefix name, URI</p></td></tr><tr><td><p><a name="INDEX-488" /></a>Character data</p></td><td><p>String of characters</p></td></tr><tr><td><p><a name="INDEX-489" /></a>Processing instruction</p></td><td><p>Target, Data</p></td></tr><tr><td><p><a name="INDEX-490" /></a>Comment</p></td><td><p>String of characters</p></td></tr><tr><td><p><a name="INDEX-491" /></a>CDATA section</p></td><td><p>String of characters</p></td></tr><tr><td><p><a name="INDEX-492" /></a>Entity reference</p></td><td><p>Name, Replacement text (or System ID and/or Public ID)</p></td></tr></table><p><p>In addition to this set, some implementations define node types for the DTD, allowing a programmer to access declarations for elements, entities, notations, and attributes. Nodes may also exist for the XML declaration and document type declarations.</p></div><hr width="684" align="left" /><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch05_07.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"><img alt="Home" border="0" src="../gifs/txthome.gif" /></a></td><td align="right" valign="top" width="228"><a href="ch06_02.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr><tr><td align="left" valign="top" width="228">5.7. 
XML::SAX: The Second Generation</td><td align="center" valign="top" width="228"><a href="index/index.htm"><img alt="Book Index" border="0" src="../gifs/index.gif" /></a></td><td align="right" valign="top" width="228">6.2. XML::Simple</td></tr></table></div><hr width="684" align="left" /><img alt="Library Navigation Links" border="0" src="../gifs/navbar.gif" usemap="#library-map" /><p><p><font size="-1"><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font></p><map name="library-map"><area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map></body></html>