
📄 ch20_05.htm

By Tom Christiansen and Nathan Torkington. ISBN 1-56592-243-3. First Edition, published August 1998.
<HTML><HEAD><TITLE>Recipe 20.4. Converting ASCII to HTML (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:45:57Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch20_01.htm" TITLE="20. Web Automation">
<LINK REL="prev" HREF="ch20_04.htm" TITLE="20.3. Extracting URLs">
<LINK REL="next" HREF="ch20_06.htm" TITLE="20.5. Converting HTML to ASCII">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" />
<map name="banner-map">
<area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook">
<area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" />
</map>
<div class="navbar"><p>
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_04.htm" TITLE="20.3. Extracting URLs"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 20.3. Extracting URLs" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"><A CLASS="chapter" REL="up" HREF="ch20_01.htm" TITLE="20. Web Automation"></A></FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_06.htm" TITLE="20.5. Converting HTML to ASCII"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 20.5. Converting HTML to ASCII" BORDER="0"></A></TD>
</TR>
</TABLE></DIV>
<DIV CLASS="sect1"><H2 CLASS="sect1"><A CLASS="title" NAME="ch20-25410">20.4. 
Converting ASCII to HTML</A></H2>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-447">Problem<A CLASS="indexterm" NAME="ch20-idx-1000002612-0"></A><A CLASS="indexterm" NAME="ch20-idx-1000002612-1"></A><A CLASS="indexterm" NAME="ch20-idx-1000002612-2"></A><A CLASS="indexterm" NAME="ch20-idx-1000002612-3"></A><A CLASS="indexterm" NAME="ch20-idx-1000002612-4"></A><A CLASS="indexterm" NAME="ch20-idx-1000002612-5"></A></A></H3>
<P CLASS="para">You want to convert ASCII text to HTML.</P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-453">Solution</A></H3>
<P CLASS="para">Use the simple little encoding filter in <A CLASS="xref" HREF="ch20_05.htm#ch20-37199" TITLE="text2html">Example 20.3</A>.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch20-37199">Example 20.3: text2html</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w -p00
# text2html - trivial html encoding of normal text
# -p means apply this script to each record.
# -00 means that a record is now a paragraph

use HTML::Entities;
$_ = encode_entities($_, &quot;\200-\377&quot;);

if (/^\s/) {
    # Paragraphs beginning with whitespace are wrapped in &lt;PRE&gt;
    s{(.*)$}        {&lt;PRE&gt;\n$1&lt;/PRE&gt;\n}s;           # indented verbatim
} else {
    s{^(&gt;.*)}       {$1&lt;BR&gt;}gm;                    # quoted text
    s{&lt;URL:(.*?)&gt;}    {&lt;A HREF=&quot;$1&quot;&gt;$1&lt;/A&gt;}gs         # embedded URL  (good)
                    ||
    s{(http:\S+)}   {&lt;A HREF=&quot;$1&quot;&gt;$1&lt;/A&gt;}gs;        # guessed URL   (bad)
    s{\*(\S+)\*}    {&lt;STRONG&gt;$1&lt;/STRONG&gt;}g;         # this is *bold* here
    s{\b_(\S+)\_\b} {&lt;EM&gt;$1&lt;/EM&gt;}g;                 # this is _italics_ here
    s{^}            {&lt;P&gt;\n};                        # add paragraph tag
}</PRE></DIV></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-499">Discussion</A></H3>
<P CLASS="para">Converting arbitrary plain text to HTML has no general solution because 
there are too many different, conflicting ways of representing formatting information in a plain text file. The more you know about the input, the better you can format it.</P>
<P CLASS="para">For example, if you knew that you would be fed a mail message, you could add this block to format the mail headers:</P>
<PRE CLASS="programlisting">BEGIN {
    print &quot;&lt;TABLE&gt;&quot;;
    $_ = encode_entities(scalar &lt;&gt;);
    s/\n\s+/ /g;  # continuation lines
    while ( /^(\S+?:)\s*(.*)$/gm ) {                # parse heading
        print &quot;&lt;TR&gt;&lt;TH ALIGN='LEFT'&gt;$1&lt;/TH&gt;&lt;TD&gt;$2&lt;/TD&gt;&lt;/TR&gt;\n&quot;;
    }
    print &quot;&lt;/TABLE&gt;&lt;HR&gt;&quot;;
}<A CLASS="indexterm" NAME="ch20-idx-1000002621-0"></A></PRE></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-525">See Also</A></H3>
<P CLASS="para">The documentation for the CPAN module HTML::Entities.</P></DIV></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_04.htm" TITLE="20.3. Extracting URLs"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 20.3. Extracting URLs" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_06.htm" TITLE="20.5. Converting HTML to ASCII"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 20.5. Converting HTML to ASCII" BORDER="0"></A></TD>
</TR>
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">20.3. Extracting URLs</TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">20.5. 
Converting HTML to ASCII</TD>
</TR>
</TABLE>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer"><FONT SIZE="-1"></DIV>
<!-- LIBRARY NAV BAR -->
<img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links">
<p> <a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font> </p>
<map name="library-map">
<area shape="rect" coords="1,0,85,94" href="../index.htm">
<area shape="rect" coords="86,1,178,103" href="../lwp/index.htm">
<area shape="rect" coords="180,0,265,103" href="../lperl/index.htm">
<area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm">
<area shape="rect" coords="354,1,446,115" href="../prog/index.htm">
<area shape="rect" coords="448,0,526,132" href="../tk/index.htm">
<area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm">
<area shape="rect" coords="617,0,690,135" href="../pxml/index.htm">
</map>
</BODY></HTML>
