
📄 webautomation.mht

📁 A collection of Perl programming techniques, suitable for beginners.
#-----------------------------
#% perl -0777 -pe 's/<[^>]*>//gs' file
#-----------------------------
</FONT>{
    local $/;               # temporary whole-file input mode
    $html = &lt;FILE&gt;;
    $html =~ s/&lt;[^&gt;]*&gt;//gs;
}
#-----------------------------
# Examples of valid HTML on which the regex approach fails:
#&lt;IMG SRC = "foo.gif" ALT = "A &gt; B"&gt;
#
#&lt;!-- &lt;A comment&gt; --&gt;
#
#&lt;script&gt;if (a&lt;b &amp;&amp; a&gt;c)&lt;/script&gt;
#
#&lt;# Just data #&gt;
#
#&lt;![INCLUDE CDATA [ &gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt; ]]&gt;
#-----------------------------
# Comments that contain other tags break it too:
#&lt;!-- This section commented out.
#    &lt;B&gt;You can't see me!&lt;/B&gt;
#--&gt;
#-----------------------------
package MyParser;
use HTML::Parser;
use HTML::Entities qw(decode_entities);

@ISA = qw(HTML::Parser);    # inherit the parsing machinery

# HTML::Parser calls text() for each chunk of plain text it finds between tags
sub text {
    my($self, $text) = @_;
    print decode_entities($text);
}

package main;
# F must already be open on the HTML file
MyParser-&gt;new-&gt;parse_file(*F);
#-----------------------------
($title) = ($html =~ m#&lt;TITLE&gt;\s*(.*?)\s*&lt;/TITLE&gt;#is);
#-----------------------------
# <A href="http://pleac.sourceforge.net/include/perl/ch20/htitle">download the following standalone program</A>
#!/usr/bin/perl
# htitle - get html title from URL

die "usage: $0 url ...\n" unless @ARGV;
require LWP;

foreach $url (@ARGV) {
    $ua = LWP::UserAgent-&gt;new();
    $res = $ua-&gt;request(HTTP::Request-&gt;new(GET =&gt; $url));
    print "$url: " if @ARGV &gt; 1;
    if ($res-&gt;is_success) {
        print $res-&gt;title, "\n";
    } else {
        print $res-&gt;status_line, "\n";
    }
}

<FONT color=#bebebe>#-----------------------------
#% htitle http://www.ora.com
#www.oreilly.com -- Welcome to O'Reilly &amp; Associates!
#
#% htitle http://www.perl.com/ http://www.perl.com/nullvoid
#http://www.perl.com/: The www.perl.com Home Page
#http://www.perl.com/nullvoid: 404 File Not Found
#-----------------------------
</FONT></PRE></FONT></TD></TR></TBODY></TABLE></DIV>
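The gap between the one-line regex and the parser-based version is easiest to see on concrete input. The following is a minimal sketch using only core Perl. The input strings are made up, and the helper names strip_tags_naive and extract_title are hypothetical; the two regexes are the ones shown in this section. It strips a well-behaved fragment correctly, then shows the attribute case from the list of problem inputs going wrong.

```perl
use strict;
use warnings;

# Naive stripper from the recipe: delete everything from '<' to the next '>'.
sub strip_tags_naive {
    my ($html) = @_;
    $html =~ s/<[^>]*>//gs;
    return $html;
}

# Title extraction with the same pattern as the recipe:
# case-insensitive, '.' matches newlines, surrounding whitespace trimmed.
sub extract_title {
    my ($html) = @_;
    my ($title) = $html =~ m#<TITLE>\s*(.*?)\s*</TITLE>#is;
    return $title;
}

# Simple markup: the regex does what you expect.
my $simple = strip_tags_naive('<B>bold</B> and <I>italic</I>');

# A '>' inside an attribute value ends the match too early,
# so the tail of the tag leaks into the output.
my $broken = strip_tags_naive('<IMG SRC = "foo.gif" ALT = "A > B">text');

# Hypothetical page with the title spread over several lines.
my $title = extract_title("<html><head><title>\n  Hello World \n</title></head></html>");

print "simple: [$simple]\n";
print "broken: [$broken]\n";
print "title:  [$title]\n";
```

The second result still contains part of the IMG tag, which is the reason the recipe moves on to an HTML::Parser subclass for anything beyond trivial markup.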
<DIV class=SECT2>
<H2 class=SECT2><A name=AEN1075>Finding Stale Links</A></H2>
<TABLE width="100%" bgColor=#2f4f4f border=0>
  <TBODY>
  <TR>
    <TD><PRE class=SCREEN><FONT color=#f5deb3 size=+1><FONT color=#bebebe>#-----------------------------
# <FONT size=-1><A href="http://pleac.sourceforge.net/include/perl/ch20/churl">download the following standalone program</A></FONT>
#!/usr/bin/perl -w
# churl - check urls
</FONT>
use HTML::LinkExtor;
use LWP::Simple qw(get head);

$base_url = shift
    or die "usage: $0 &lt;start_url&gt;\n";
# Passing a base URL makes the extractor return absolute URL objects
$parser = HTML::LinkExtor-&gt;new(undef, $base_url);
$parser-&gt;parse(get($base_url));
@links = $parser-&gt;links;
print "$base_url: \n";
# Each entry of @links is an array ref: [ tag, attr_name =&gt; url, ... ]
foreach $linkarray (@links) {
    my @element  = @$linkarray;
    my $elt_type = shift @element;
    while (@element) {
        my ($attr_name, $attr_value) = splice(@element, 0, 2);
        # Only check schemes that head() can fetch
        if ($attr_value-&gt;scheme =~ /\b(ftp|https?|file)\b/) {
            print "  $attr_value: ", head($attr_value) ? "OK" : "BAD", "\n";
        }
    }
}

<FONT color=#bebebe>#-----------------------------
#% churl http://www.wizards.com
#http://www.wizards.com:
#
#  FrontPage/FP_Color.gif:  OK
#
#  FrontPage/FP_BW.gif:  BAD
#
#  #FP_Map:  OK
#
#  Games_Library/Welcome.html:  OK
#-----------------------------
</FONT></PRE></FONT></TD></TR></TBODY></TABLE></DIV>
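The loop in churl depends on the shape of what links returns: one array reference per element, holding the tag name followed by attribute name and URL pairs. The sketch below walks that shape with the same shift and splice pattern, but on hand-written sample data so it runs without LWP or a network connection. The sample URLs and the helper name checkable_urls are hypothetical, and the scheme is pulled out with a plain regex as a stand-in for the scheme method that churl calls on URL objects.

```perl
use strict;
use warnings;

# Hypothetical sample data shaped like the list returned by links():
# [ tag, attr_name => url, attr_name => url, ... ]
my @links = (
    [ 'a',   href => 'http://www.example.com/index.html' ],
    [ 'img', src  => 'http://www.example.com/logo.gif',
             lowsrc => 'ftp://ftp.example.com/logo-small.gif' ],
    [ 'a',   href => 'mailto:webmaster@example.com' ],
);

# Return "tag attr url" for every URL whose scheme a link checker could fetch.
sub checkable_urls {
    my @found;
    foreach my $linkarray (@_) {
        my @element  = @$linkarray;      # copy, so the caller's data is untouched
        my $elt_type = shift @element;   # first item is the tag name
        while (@element) {
            # the remaining items come in (attribute name, URL) pairs
            my ($attr_name, $attr_value) = splice(@element, 0, 2);
            # regex stand-in for the URL object's scheme method (an assumption)
            my ($scheme) = $attr_value =~ /^([A-Za-z][A-Za-z0-9+.-]*):/;
            next unless defined $scheme && $scheme =~ /^(?:ftp|https?|file)$/i;
            push @found, "$elt_type $attr_name $attr_value";
        }
    }
    return @found;
}

my @urls = checkable_urls(@links);
print "$_\n" for @urls;
```

The mailto link is skipped because its scheme is not one of ftp, http, https, or file. In the real program, each surviving URL would then be passed to head.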
<DIV class=SECT2>
<H2 class=SECT2><A name=AEN1078>Finding Fresh Links</A></H2>
<TABLE width="100%" bgColor=#2f4f4f border=0>
  <TBODY>
  <TR>
    <TD><PRE class=SCREEN><FONT color=#f5deb3 size=+1><FONT color=#bebebe>#-----------------------------
# <FONT size=-1><A href="http://pleac.sourceforge.net/include/perl/ch20/surl">download the following standalone program</A></FONT>
#!/usr/bin/perl -w
# surl - sort URLs by their last modification date
</FONT>
use LWP::UserAgent;
use HTTP::Request;
use URI::URL qw(url);

my($url, %Date);
my $ua = LWP::UserAgent-&gt;new();

while ( $url = url(scalar &lt;&gt;) ) {
    my($req, $ans);
    next unless $url-&gt;scheme =~ /^(file|https?)$/;
    $ans = $ua-&gt;request(HTTP::Request-&gt;new("HEAD", $url));
    if ($ans-&gt;is_success) {
        $Date{$url} = $ans-&gt;last_modified || 0;  # unknown
    } else {
        print STDERR "$url: Error [", $ans-&gt;code, "] ", $ans-&gt;message, "!\n";
    }
}

foreach $url ( sort { $Date{$b} &lt;=&gt; $Date{$a} } keys %Date ) {
    printf "%-25s %s\n", $Date{$url} ? (scalar localtime $Date{$url})
                                     : "&lt;NONE SPECIFIED&gt;", $url;
}

<FONT color=#bebebe>#-----------------------------
#% xurl http://www.perl.com/  | surl | head
#Mon Apr 20 06:16:02 1998  http://electriclichen.com/linux/srom.html
#
#Fri Apr 17 13:38:51 1998  http://www.oreilly.com/
#
#Fri Mar 13 12:16:47 1998  http://www2.binevolve.com/
#
#Sun Mar  8 21:01:27 1998  http://www.perl.org/
#
