📁 Perl & XML. by Erik T. Ray and Jason McIntosh ISBN 0-596-00205-X First Edition, published April
<html><head><title>Stream-Based Versus Tree-Based Processing (Perl and XML)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" /><meta name="DC.Creator" content="Erik T. Ray and Jason McIntosh" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly &amp; Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="059600205XL" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Perl and XML" /><meta name="DC.Type" content="Text.Monograph" /></head><body bgcolor="#ffffff"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl &amp; XML" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch03_02.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228" /><td align="right" valign="top" width="228"><a href="ch03_04.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div><h2 class="sect1">3.3. Stream-Based Versus Tree-Based Processing</h2><p>Remember<a name="INDEX-230" /> the<a name="INDEX-231" /> Perl mantra<a name="INDEX-232" />, "There's more than one way to do it<a name="INDEX-233" />"? It is also true when working with XML. Depending on how you want to work and what kind of resources you have, many options are available. One developer may prefer a low-maintenance parsing job and is prepared to be loose and sloppy with memory to get it. 
Another will need to squeeze out faster and leaner performance at the expense of more complex code. XML processing tasks vary widely, so you should be free to choose the shortest path to a solution.</p><p>There are a lot of different XML processing strategies. Most fall into two categories: stream-based and tree-based. With the <em class="emphasis">stream-based strategy</em>, the parser continuously alerts a program to patterns in the XML. The parser functions like a pipeline, taking XML markup on one end and pumping out processed nuggets of data to your program. We call this pipeline an <em class="emphasis">event stream</em><a name="INDEX-234" /> because each chunk of data sent to the program signals something new and interesting in the XML stream. For example, the beginning of a new element is a significant event. So is the discovery of a processing instruction in the markup. With each update, your program does something new -- perhaps translating the data and sending it to another place, testing it for some specific content, or sticking it onto a growing heap of data.</p><p>With the <em class="emphasis">tree-based strategy</em>, the parser keeps the data to itself until the very end, when it presents a complete model of the document to your program. Instead of a pipeline, it's like a camera that takes a picture and transmits the replica to you. The model is usually in a much more convenient state than raw XML. For example, nested elements may be represented in native Perl structures like lists or hashes, as we saw in an earlier example. Even more useful are trees of blessed objects with methods that help navigate the structure from one place to another. The whole point to this strategy is that your program can pull out any data it needs, in any order.</p><p>Why would you prefer one over the other? Each has strong and weak points. Event streams are fast and often have a much slimmer memory footprint, but at the expense of greater code complexity and impermanent data. 
Tree building, on the other hand, lets the data stick around for as long as you need it, and your code is usually simple because you don't need special tricks to do things like backwards searching. However, trees wither when it comes to economical use of processor time and memory.</p><p>All of this is relative, of course. Small documents don't cause much hardship to a typical computer, especially since CPU cycles and megabytes are getting cheaper every day. Maybe the convenience of a persistent data structure will outweigh any drawbacks. On the other hand, when working with Godzilla-sized documents like books, or huge numbers of documents all at once, you'll definitely notice the crunch. Then the agility of event stream processors will start to look better. It's impossible to give you any hard-and-fast rules, so we'll leave the decision up to you.</p><p>An interesting thing to note about the stream-based and tree-based strategies is that one is the basis for the other. That's right, an event stream drives the process of building a tree data structure. Thus, most low-level parsers are event streams because you can always write a tree-building layer on top. This is how <tt class="literal">XML::Parser</tt> and most other parsers work.</p><p>In a related, more recent, and very cool development, you can also use XML event streams to turn any kind of document into some form of XML by writing stream-based parsers that generate XML events from whatever data structures lurk in that document type.</p><p>There's a lot more to say about event streams and tree builders -- so much, in fact, that we've devoted two whole chapters to the topics. <a href="ch04_01.htm">Chapter 4, "Event Streams"</a> takes a deep plunge into the theory behind event streams with lots of examples for making useful programs out of them. <a href="ch06_01.htm">Chapter 6, "Tree Processing"</a> takes you deeper into the forest with lots of tree-based examples. 
After that, <a href="ch08_01.htm">Chapter 8, "Beyond Trees: XPath, XSLT, and More"</a> shows you<a name="INDEX-235" /> unusual<a name="INDEX-236" /> hybrids that provide<a name="INDEX-237" /> the best of both worlds.</p><hr width="684" align="left" /><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch03_02.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"><img alt="Home" border="0" src="../gifs/txthome.gif" /></a></td><td align="right" valign="top" width="228"><a href="ch03_04.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr><tr><td align="left" valign="top" width="228">3.2. XML::Parser</td><td align="center" valign="top" width="228"><a href="index/index.htm"><img alt="Book Index" border="0" src="../gifs/index.gif" /></a></td><td align="right" valign="top" width="228">3.4. Putting Parsers to Work</td></tr></table></div><hr width="684" align="left" /><img alt="Library Navigation Links" border="0" src="../gifs/navbar.gif" usemap="#library-map" /><p><p><font size="-1"><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font></p><map name="library-map"><area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map></body></html>
