
Perl & XML, by Erik T. Ray and Jason McIntosh. ISBN 0-596-00205-X. First Edition, published April
;
    bless ($self,$class);
    $self-&gt;_initialize(@_);
    return $self;
}</pre></blockquote>
<p>Note how the module calls its parent's <tt class="literal">new</tt> with very specific arguments. All are standard and well-documented setup instructions in <tt class="literal">XML::Parser</tt>'s public interface, but by taking these parameters out of the user's hands and into its own, the <tt class="literal">XML::RSS</tt> module knows exactly what it's getting -- in this case, a parser object with namespace processing enabled, but not expansion or parsing of parameter entities -- and defines for itself what its handlers are.</p>
<p>The result of calling <tt class="literal">SUPER::new</tt> is an <tt class="literal">XML::Parser</tt> object, which this module doesn't want to hand back to its users -- doing so would diminish the point of all this abstraction! Therefore, it reblesses the object (at this point, deemed to be a new <tt class="literal">$self</tt> for this class) using the Perl-itically correct two-argument form of <tt class="literal">bless</tt>, so that the returned object claims fealty to <tt class="literal">XML::RSS</tt>, not <a name="INDEX-732" /><tt class="literal">XML::Parser</tt>.</p></div></div>
<a name="perlxml-CHP-9-SECT-2.3" /><div class="sect2"><h3 class="sect2">9.2.3. The Object Model </h3>
<p>Since <tt class="literal">XML::RSS</tt><a name="INDEX-733" /> does nothing unusual in constructing its parser object or parsing documents, let's look at where it starts to cut an edge of its own: the shape of the internal data structure it builds, and to which it applies its method-based API.</p>
<p><tt class="literal">XML::RSS</tt>'s code is made up mostly of accessors -- methods that read and write to predefined places in the structure it's building. 
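</p>
<p>To make the idea of an accessor concrete, here is a minimal sketch. The <tt class="literal">My::Feed</tt> class, its <tt class="literal">channel</tt> method, and the field names are hypothetical, invented for illustration; this is not <tt class="literal">XML::RSS</tt>'s own code, whose accessors may well differ in detail:</p>
<blockquote><pre class="code">#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class for illustration only; not XML::RSS's own code.
package My::Feed;

sub new {
    my $class = shift;
    # The predefined places that the accessor may read and write.
    my $self = { channel =&gt; { title =&gt; '', link =&gt; '', description =&gt; '' } };
    return bless($self, $class);
}

# With one argument, return the value stored under that key.
# With several, treat them as key/value pairs and store each one.
sub channel {
    my $self = shift;
    if (@_ == 1) {
        my $key = shift;
        return $self-&gt;{channel}{$key};
    }
    my %pairs = @_;
    $self-&gt;{channel}{$_} = $pairs{$_} for keys %pairs;
    return $self-&gt;{channel};
}

package main;

my $feed = My::Feed-&gt;new;
$feed-&gt;channel(title =&gt; 'Example', link =&gt; 'http://example.org/');
print $feed-&gt;channel('title'), "\n";   # prints "Example"
</pre></blockquote>
<p>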
Using nothing more complex than a few Perl hashes, <tt class="literal">XML::RSS</tt> builds maps of what it expects to see in the document, made of nested hash references with keys named after the elements and attributes it might encounter, nested to match the way one might find them in a real RSS XML document. The module defines one of these maps for each version of RSS that it handles. Here's the simplest one, which covers RSS Version 0.9:</p>
<blockquote><pre class="code">my %v0_9_ok_fields = (
    channel =&gt; {
        title       =&gt; '',
        description =&gt; '',
        link        =&gt; '',
        },
    image  =&gt; {
        title =&gt; '',
        url   =&gt; '',
        link  =&gt; ''
        },
    textinput =&gt; {
        title       =&gt; '',
        description =&gt; '',
        name        =&gt; '',
        link        =&gt; ''
        },
    items =&gt; [],
    num_items =&gt; 0,
    version         =&gt; '',
    encoding        =&gt; ''
);</pre></blockquote>
<p>This model is not entirely made up of hash references, of course; the top-level "items" key holds an empty array reference, and otherwise, all the end values for all the keys are scalars -- all empty strings. The exception is <tt class="literal">num_items</tt>, which holds a number and isn't among RSS's elements. Instead, it exists for convenience, trading a little structural elegance for simpler code (presumably so the code doesn't have to keep explicitly dereferencing the <tt class="literal">items</tt> array reference and evaluating it in scalar context to get its length).</p>
<p>On the other hand, a stored count like this risks going out of sync with reality if the array it describes changes and the programmer doesn't remember to update the number when that happens. 
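</p>
<p>A small sketch of that risk, and of the alternative of deriving the count on demand. The <tt class="literal">%feed</tt> hash here is a hypothetical stand-in that borrows only the <tt class="literal">items</tt> and <tt class="literal">num_items</tt> key names from the map above; it is not taken from the module's code:</p>
<blockquote><pre class="code">use strict;
use warnings;

# Hypothetical structure for illustration; only the key names follow the map.
my %feed = (items =&gt; [], num_items =&gt; 0);

# Cached count: every push must remember to update num_items as well.
push @{ $feed{items} }, { title =&gt; 'First entry' };
$feed{num_items}++;

# Here the update is forgotten, so the cached number goes stale.
push @{ $feed{items} }, { title =&gt; 'Second entry' };

# Derived count: an array in scalar context yields its length,
# so this value cannot drift away from the real contents.
my $derived = scalar @{ $feed{items} };

print "cached: $feed{num_items}, derived: $derived\n";   # cached: 1, derived: 2
</pre></blockquote>
<p>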
However, this sort of thing often comes down to programming style, which is far beyond the bounds of this book.</p>
<p>There's good reason for this arrangement, besides the fact that hash values have to be set to something (or <tt class="literal">undef</tt>, which is a special sort of something). Each hash doubles as a map for the module's subroutines to follow and a template for the structures themselves. With that in mind, let's see what happens when an <tt class="literal">XML::Parser</tt> item is constructed via this module's <tt class="literal">new</tt> class method.</p></div>
<a name="perlxml-CHP-9-SECT-2.4" /><div class="sect2"><h3 class="sect2">9.2.4. Input: User or File </h3>
<p>After <a name="INDEX-734" />construction, an <tt class="literal">XML::RSS</tt><a name="INDEX-735" /> object is ready to chew through an RSS document, thanks to the parsing powers afforded to it by its proud parent, <tt class="literal">XML::Parser</tt>. A user only needs to call the object's <tt class="literal">parse</tt> or <tt class="literal">parsefile</tt> methods, and off it goes -- filling itself up with data.</p>
<p>Despite this, many of these objects will live long<a href="#FOOTNOTE-31">[31]</a> and productive lives without sinking their teeth into an existing XML document. Often RSS users would rather have the module help build a document from scratch -- or rather, from the bits of text that programs we write will feed to it. This is when all those accessors come in handy.</p>
<blockquote class="footnote"><a name="FOOTNOTE-31" /><p>[31] Well, a few hundredths of a second on a typical whizbang PC, but we mean long in the poetic sense.</p> </blockquote>
<p>Thus, let's say we have a SQL database somewhere that contains some web log entries we'd like to RSS-ify. We could write up this little script:</p>
<blockquote><pre class="code">#!/usr/bin/perl

# Turn the last 15 entries of Dr. Link's Weblog into an RSS 1.0 document,
# which gets printed to STDOUT.

use warnings;
use strict;

use XML::RSS;
use DBIx::Abstract;

my $MAX_ENTRIES = 15;

my ($output_version) = @ARGV;
$output_version ||= '1.0';
unless ($output_version eq '1.0' or $output_version eq '0.9'
                                  or $output_version eq '0.91') {
  die "Usage: $0 [version]\nWhere [version] is an RSS version to output: 0.9, 0.91, or 1.0\nDefault is 1.0\n";
}

my $dbh = DBIx::Abstract-&gt;connect({dbname=&gt;'weblog',
                                   user=&gt;'link',
                                   password=&gt;'dirtyape'})
  or die "Couldn't connect to database.\n";

my ($date) = $dbh-&gt;select('max(date_added)',
                          'entry')-&gt;fetchrow_array;
my ($time) = $dbh-&gt;select('max(time_added)',
                          'entry')-&gt;fetchrow_array;

my $time_zone = "+05:00"; # This happens to be where I live. :)
my $rss_time = "${date}T$time$time_zone";
# base time is when I started the blog, for the syndication info
my $base_time = "2001-03-03T00:00:00$time_zone";

# I'll choose to use RSS version 1.0 here, which stuffs some meta-information into
# 'modules' that go into their own namespaces, such as 'dc' (for Dublin Core) or
# 'syn' (for RSS Syndication), but fortunately it doesn't make defining the document
# any more complex, as you can see below...

my $rss = XML::RSS-&gt;new(version=&gt;'1.0', output=&gt;$output_version);

$rss-&gt;channel(
              title=&gt;"Dr. Link's Weblog",
              link=&gt;'http://www.jmac.org/linklog/',
              description=&gt;"Dr. Link's weblog and online journal",
              dc=&gt; {
                    date=&gt;$rss_time,
                    creator=&gt;'llink@jmac.org',
                    rights=&gt;'Copyright 2002 by Dr. Lance Link',
                    language=&gt;'en-us',
                   },
              syn=&gt; {
                     updatePeriod=&gt;'daily',
                     updateFrequency=&gt;1,
                     updateBase=&gt;$base_time,
                    },
             );

$dbh-&gt;query("select * from entry order by id desc limit $MAX_ENTRIES");
while (my $entry = $dbh-&gt;fetchrow_hashref) {
  # Replace XML-naughty characters with entities
  # (the ampersand must go first, or it would re-escape the other entities)
  $$entry{entry} =~ s/&amp;/&amp;amp;/g;
  $$entry{entry} =~ s/&lt;/&amp;lt;/g;
  $$entry{entry} =~ s/'/&amp;apos;/g;
  $$entry{entry} =~ s/"/&amp;quot;/g;
  $rss-&gt;add_item(
         title=&gt;"$$entry{date_added} $$entry{time_added}",
         link=&gt;"http://www.jmac.org/weblog?$$entry{date_added}#$$entry{time_added}",
         description=&gt;$$entry{entry},
                );
}

# Just throw the results into standard output. :)
print $rss-&gt;as_string;</pre></blockquote>
<p>Did you see any XML there? We didn't. Well, OK, we did have to give the truth of the matter a little nod by tossing in those entity-escape regexes, but other than that, we were reading from a database and then stuffing what we found into an object by way of a few method calls (or rather, a single, looped call to its <tt class="literal">add_item</tt> method). These calls accepted, as their sole argument, a hash made of some straightforward strings. While we (presumably) wrote this program to let our web log take advantage of everything RSS has to offer, no actual XML was munged in the production of this file.</p></div>
<a name="perlxml-CHP-9-SECT-2.5" /><div class="sect2"><h3 class="sect2">9.2.5. 
Off-the-Cuff Output </h3>
<p>By<a name="INDEX-736" /> the way, <tt class="literal">XML::RSS</tt><a name="INDEX-737" /> doesn't use XML-generation-helper modules such as <tt class="literal">XML::Writer</tt> to produce its output; it just builds one long scalar based on what the map-hash looks like, running through ordinary <tt class="literal">if</tt>, <tt class="literal">else</tt>, and <tt class="literal">elsif</tt> blocks, each of which tends to use the <tt class="literal">.=</tt> self-concatenation operator. If you think you can get away with it in your own XML-generating modules, you might try this approach, building up the literal document-to-be in memory and <tt class="literal">print</tt>ing it to a filehandle; that way, you'll save a lot of overhead and gain control, but give up some safety in the process. Just be sure to test your output thoroughly for well-formedness. (If you're making a dual-purpose parser/generator like <tt class="literal">XML::RSS</tt>, you might try to have the module parse some of its own output and make sure everything<a name="INDEX-738" /> <a name="INDEX-739" /> <a name="INDEX-740" /> looks<a name="INDEX-741" /> as you'd expect.)</p></div>
<hr width="684" align="left" /><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch09_01.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"><img alt="Home" border="0" src="../gifs/txthome.gif" /></a></td><td align="right" valign="top" width="228"><a href="ch09_03.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr><tr><td align="left" valign="top" width="228">9. RSS, SOAP, and Other XML Applications </td><td align="center" valign="top" width="228"><a href="index/index.htm"><img alt="Book Index" border="0" src="../gifs/index.gif" /></a></td><td align="right" valign="top" width="228">9.3. 
XML Programming Tools </td></tr></table></div><hr width="684" align="left" /><img alt="Library Navigation Links" border="0" src="../gifs/navbar.gif" usemap="#library-map" /><p><p><font size="-1"><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font></p><map name="library-map"><area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map></body></html>
