</p></div><a name="INDEX-2604" /><div class="refentry"><table width="515" border="0" cellpadding="5"><tr><td align="left"><font size="+1"><b>tag</b></font></td><td align="right"><i></i></td></tr></table><hr width="515" size="3" noshade="true" align="left" color="black" /><pre>$<em class="replaceable">h</em>->tag([<em class="replaceable">name</em>])</pre><p><a name="INDEX-2604" />Sets or retrieves the tag <em class="replaceable"><tt>name</tt></em> for the element. Tag names are always converted to lowercase.</p></div>
<a name="INDEX-2605" /><a name="INDEX-2606" /><a name="INDEX-2607" /><div class="refentry"><table width="515" border="0" cellpadding="5"><tr><td align="left"><font size="+1"><b>traverse</b></font></td><td align="right"><i></i></td></tr></table><hr width="515" size="3" noshade="true" align="left" color="black" /><pre>$<em class="replaceable">h</em>->traverse(<em class="replaceable">sub</em>, [<em class="replaceable">ignoretext</em>])</pre><p><a name="INDEX-2605" />Traverses the current element and all of its children, invoking the callback routine <em class="replaceable"><tt>sub</tt></em> for each element. The callback routine is called with three arguments: a reference to the current element (the node), a start flag, and the depth. The start flag is <tt class="literal">1</tt> when entering a node and <tt class="literal">0</tt> when leaving (returning to a parent element). If the <em class="replaceable"><tt>ignoretext</tt></em> parameter is true (the default), the callback routine will not be invoked for text content. If the callback routine returns false, the method will not traverse any child elements of that node.<a name="INDEX-2606" /><a name="INDEX-2607" /></p></div></div>
<a name="perlnut2-CHP-20-SECT-4.4" /><div class="sect2"><h3 class="sect2">20.4.4. HTML::TreeBuilder</h3><p><a name="INDEX-2608" /><a name="INDEX-2609" />The HTML::TreeBuilder class provides a parser that creates an HTML syntax tree. Each node of the tree is an HTML::Element object.
This class inherits from both HTML::Parser and HTML::Element, so methods from both of those classes can be used on its objects.</p><p>The methods provided by HTML::TreeBuilder control how the parsing is performed. Each method sets its option from the Boolean value given as its argument.</p>
<a name="INDEX-2610" /><div class="refentry"><table width="515" border="0" cellpadding="5"><tr><td align="left"><font size="+1"><b>ignore_text</b></font></td><td align="right"><i></i></td></tr></table><hr width="515" size="3" noshade="true" align="left" color="black" /><pre>$<em class="replaceable">p</em>->ignore_text(<em class="replaceable">boolean</em>)</pre><p><a name="INDEX-2610" />If set to true, the text content of elements will not be included in the parse tree. The default is false.</p></div>
<a name="INDEX-2611" /><div class="refentry"><table width="515" border="0" cellpadding="5"><tr><td align="left"><font size="+1"><b>ignore_unknown</b></font></td><td align="right"><i></i></td></tr></table><hr width="515" size="3" noshade="true" align="left" color="black" /><pre>$<em class="replaceable">p</em>->ignore_unknown(<em class="replaceable">boolean</em>)</pre><p><a name="INDEX-2611" />If set to true (the default), unknown tags in the HTML are ignored and will not be represented as elements in the parse tree. If set to false, unknown tags are represented as elements.</p></div>
<a name="INDEX-2612" /><div class="refentry"><table width="515" border="0" cellpadding="5"><tr><td align="left"><font size="+1"><b>implicit_tags</b></font></td><td align="right"><i></i></td></tr></table><hr width="515" size="3" noshade="true" align="left" color="black" /><pre>$<em class="replaceable">p</em>->implicit_tags(<em class="replaceable">boolean</em>)</pre><p><a name="INDEX-2612" />If set to true (the default), the parser will try to deduce implicit tags, such as missing elements or end tags, that are required to conform to proper HTML structure.
If false, the parse tree will reflect the HTML as is.</p></div>
<a name="INDEX-2613" /><a name="INDEX-2614" /><a name="INDEX-2615" /><div class="refentry"><table width="515" border="0" cellpadding="5"><tr><td align="left"><font size="+1"><b>warn</b></font></td><td align="right"><i></i></td></tr></table><hr width="515" size="3" noshade="true" align="left" color="black" /><pre>$<em class="replaceable">p</em>->warn(<em class="replaceable">boolean</em>)</pre><p><a name="INDEX-2613" />If set to true, the parser will make calls to <tt class="literal">warn</tt> with messages describing syntax errors when they occur. Error messages are off by default.<a name="INDEX-2614" /><a name="INDEX-2615" /></p></div></div>
<a name="perlnut2-CHP-20-SECT-4.5" /><div class="sect2"><h3 class="sect2">20.4.5. HTML::FormatPS</h3><p><a name="INDEX-2616" /><a name="INDEX-2617" /><a name="INDEX-2618" />The HTML::FormatPS module converts an HTML parse tree into PostScript. The formatter object is created with the <tt class="literal">new</tt> constructor, which can take parameters that assign PostScript attributes. For example:</p><blockquote><pre class="code">$formatter = HTML::FormatPS->new('papersize' => 'Letter');</pre></blockquote><p>You can now give parsed HTML to the formatter and produce PostScript output for printing. HTML::FormatPS does not handle table or form elements at this time.</p><p><a name="INDEX-2619" />The method for this class is <tt class="literal">format</tt>. <tt class="literal">format</tt> takes a reference to an HTML::TreeBuilder object representing a parsed HTML document. It returns a scalar containing the document formatted in PostScript.
The following example shows how to use this module to print a file in PostScript:</p><blockquote><pre class="code">use HTML::TreeBuilder;
use HTML::FormatPS;

# 'somefile' stands in for the path of the HTML file to convert
$html = HTML::TreeBuilder->new->parse_file('somefile');
$formatter = HTML::FormatPS->new( );
print $formatter->format($html);</pre></blockquote><p>The following list describes the attributes that can be set in the constructor:</p>
<dl><dt><i><em class="replaceable"><tt>PaperSize</tt></em></i></dt><dd><p>Possible values are <tt class="literal">A3</tt>, <tt class="literal">A4</tt>, <tt class="literal">A5</tt>, <tt class="literal">B4</tt>, <tt class="literal">B5</tt>, <tt class="literal">Letter</tt>, <tt class="literal">Legal</tt>, <tt class="literal">Executive</tt>, <tt class="literal">Tabloid</tt>, <tt class="literal">Statement</tt>, <tt class="literal">Folio</tt>, <tt class="literal">10x14</tt>, and <tt class="literal">Quarto</tt>. The default is <tt class="literal">A4</tt>.</p></dd>
<dt><i><em class="replaceable"><tt>PaperWidth</tt></em></i></dt><dd><p>Width of the paper in points.</p></dd>
<dt><i><em class="replaceable"><tt>PaperHeight</tt></em></i></dt><dd><p>Height of the paper in points.</p></dd>
<dt><i><em class="replaceable"><tt>LeftMargin</tt></em></i></dt><dd><p>Left margin in points.</p></dd>
<dt><i><em class="replaceable"><tt>RightMargin</tt></em></i></dt><dd><p>Right margin in points.</p></dd>
<dt><i><em class="replaceable"><tt>HorizontalMargin</tt></em></i></dt><dd><p>Left and right margin. Default is 4 cm.</p></dd>
<dt><i><em class="replaceable"><tt>TopMargin</tt></em></i></dt><dd><p>Top margin in points.</p></dd>
<dt><i><em class="replaceable"><tt>BottomMargin</tt></em></i></dt><dd><p>Bottom margin in points.</p></dd>
<dt><i><em class="replaceable"><tt>VerticalMargin</tt></em></i></dt><dd><p>Top and bottom margin. Default is 2 cm.</p></dd>
<dt><i><em class="replaceable"><tt>PageNo</tt></em></i></dt><dd><p>Boolean value to display page numbers.
Default is <tt class="literal">0</tt> (off).</p></dd>
<dt><i><em class="replaceable"><tt>FontFamily</tt></em></i></dt><dd><p>Font family to use on the page. Possible values are <tt class="literal">Courier</tt>, <tt class="literal">Helvetica</tt>, and <tt class="literal">Times</tt>. Default is <tt class="literal">Times</tt>.</p></dd>
<dt><i><em class="replaceable"><tt>FontScale</tt></em></i></dt><dd><p>Scale factor for the font.</p></dd>
<dt><i><em class="replaceable"><tt>Leading</tt></em></i></dt><dd><p>Space between lines, as a factor of the font size. Default is <tt class="literal">0.1</tt>. <a name="INDEX-2620" /><a name="INDEX-2621" /><a name="INDEX-2622" /> </p></dd></dl></div>
<a name="perlnut2-CHP-20-SECT-4.6" /><div class="sect2"><h3 class="sect2">20.4.6. HTML::FormatText</h3><p><a name="INDEX-2623" /><a name="INDEX-2624" />The HTML::FormatText module takes a parsed HTML file and outputs a plain-text version of it. Character attributes such as bold or italic fonts and font sizes are not carried over into the output.</p><p><a name="INDEX-2625" />This module is similar to HTML::FormatPS in that the constructor takes attributes for formatting, and the <tt class="literal">format</tt> method produces the output. A formatter object can be constructed like this:</p><blockquote><pre class="code">$formatter = HTML::FormatText->new(leftmargin => 10, rightmargin => 80);</pre></blockquote><p>The constructor can take two parameters: <tt class="literal">leftmargin</tt> and <tt class="literal">rightmargin</tt>. The margin values are given as column numbers. The aliases <tt class="literal">lm</tt> and <tt class="literal">rm</tt> can also be used.</p><p>The <tt class="literal">format</tt> method takes an HTML::TreeBuilder object and returns a scalar containing the formatted text.
You can print it with:</p><blockquote><pre class="code">print $formatter->format($html);</pre></blockquote></div><hr width="684" align="left" /><p><font size="-1"><a href="copyrght.htm">Copyright © 2002</a> O'Reilly &amp; Associates. All rights reserved.</font></p></body></html>