> a string containing the definition of an anonymous subroutine to match any of the supplied patterns. Perl compiles the pattern once, when the subroutine is defined. The string is evaluated to give you comparatively quick matching ability. An explanation of the algorithm can be found at the end of the section "Regex Compilation, the /o Modifier, and Efficiency" in Chapter 7 of <EM CLASS="emphasis">Mastering Regular Expressions</EM>.</P>
<P CLASS="para"><A CLASS="xref" HREF="ch06_11.htm#ch06-35632" TITLE="popgrep3">Example 6.6</A> is a version of our pop grepper that uses that technique.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch06-35632">Example 6.6: popgrep3</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl
# <A CLASS="indexterm" NAME="ch06-idx-1000007799-0"></A>popgrep3 - grep for abbreviations of places that say "pop"
# version 3: use build_match_func algorithm
@popstates = qw(CO ON MI WI MN);
$expr = join('||', map { "m/\\b\$popstates[$_]\\b/o" } 0..$#popstates);
$match_any = eval "sub { $expr }";
die if $@;
while (<>) {
    print if &$match_any;
}</PRE></DIV>
<P CLASS="para">The string that gets evaluated ends up looking like this, modulo formatting:</P>
<PRE CLASS="programlisting">sub {
      m/\b$popstates[0]\b/o || m/\b$popstates[1]\b/o ||
      m/\b$popstates[2]\b/o || m/\b$popstates[3]\b/o ||
      m/\b$popstates[4]\b/o
}</PRE>
<P CLASS="para">The reference to the <CODE CLASS="literal">@popstates</CODE> array is locked up inside the closure. 
Each one is different, so the <CODE CLASS="literal">/o</CODE> is safe here.</P>
<P CLASS="para"><A CLASS="xref" HREF="ch06_11.htm#ch06-40846" TITLE="grepauth">Example 6.7</A> is a generalized form of this technique showing how to create functions that return true if any of the patterns match or if all match.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch06-40846">Example 6.7: grepauth</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl
# <A CLASS="indexterm" NAME="ch06-idx-1000007801-0"></A>grepauth - print lines that mention both Tom and Nat

$multimatch = build_match_all(q/Tom/, q/Nat/);
while (<>) {
    print if &$multimatch;
}
exit;

sub build_match_any { build_match_func('||', @_) }
sub build_match_all { build_match_func('&&', @_) }
sub build_match_func {
    my $condition = shift;
    my @pattern = @_;  # must be lexical variable, not dynamic one
    my $expr = join $condition => map { "m/\$pattern[$_]/o" } (0..$#pattern);
    my $match_func = eval "sub { local \$_ = shift if \@_; $expr }";
    die if $@;  # propagate $@; this shouldn't happen!
    return $match_func;
}</PRE></DIV>
<P CLASS="para">Using <CODE CLASS="literal">eval</CODE> <CODE CLASS="literal">"STRING"</CODE> on interpolated strings as we did in <EM CLASS="emphasis">popgrep2</EM> is a hack that happens to work. Using lexical variables that get bound up in a closure as in <EM CLASS="emphasis">popgrep3</EM> and the <CODE CLASS="literal">build_match_*</CODE> functions is deep enough magic that even Perl wizards stare at it a while before they believe in it. Of course, it still works whether they believe in it or not.</P>
<P CLASS="para">What you really need is some way to get Perl to compile each pattern once and let you directly refer to the compiled form later on. Such functionality is directly supported in the 5.005 release in the form of a <CODE CLASS="literal">qr//</CODE><A CLASS="indexterm" NAME="ch06-idx-1000008349-0"></A> regular-expression quoting operator. 
For prior releases, that's exactly what the experimental Regexp module from CPAN was designed for. Objects created by this module represent compiled regular expression patterns. Using the <CODE CLASS="literal">match</CODE> method on these objects matches the pattern against the string argument. Methods in the class exist for extracting backreferences, determining where the pattern matched, and passing flags corresponding to modifiers like <CODE CLASS="literal">/i</CODE>.</P>
<P CLASS="para"><A CLASS="xref" HREF="ch06_11.htm#ch06-36674" TITLE="popgrep4">Example 6.8</A> is a version of our program that demonstrates a simple use of this module.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch06-36674">Example 6.8: popgrep4</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl
# <A CLASS="indexterm" NAME="ch06-idx-1000007803-0"></A>popgrep4 - grep for abbreviations of places that say "pop"
# version 4: use Regexp module
use Regexp;
@popstates = qw(CO ON MI WI MN);
@poppats = map { Regexp->new( '\b' . $_ . '\b') } @popstates;
while (defined($line = <>)) {
    for $patobj (@poppats) {
        if ($patobj->match($line)) {
            print $line;
            last;  # print each line once, even if several patterns match
        }
    }
}</PRE></DIV>
<P CLASS="para">You might wonder about the comparative speeds of these approaches. When run against a 22,000-line text file (the Jargon File, to be exact), version 1 ran in 7.92 seconds, version 2 in a mere 0.53 seconds, version 3 in 0.79 seconds, and version 4 in 1.74 seconds. The last technique is much easier to understand than the others, although at 1.74 seconds it runs slower than versions 2 and 3. It's also more flexible. 
<A CLASS="indexterm" NAME="ch06-idx-1000007627-0"></A><A CLASS="indexterm" NAME="ch06-idx-1000007627-1"></A></P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch06-pgfId-1353">See Also</A></H3>
<P CLASS="para">Interpolation is explained in the "Scalar Value Constructors" section of <I CLASS="filename">perldata</I>(1), and in the <A CLASS="olink" HREF="../prog/ch02_03.htm#PERL2-CH-2-SECT-3.2.2">"String literals"</A> section of <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the <CODE CLASS="literal">/o</CODE> modifier in <I CLASS="filename">perlre</I>(1) and the <A CLASS="olink" HREF="../prog/ch02_04.htm">"Pattern Matching"</A> section of <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the "Regex Compilation, the /o Modifier, and Efficiency" section of Chapter 7 of <CITE CLASS="citetitle">Mastering Regular Expressions</CITE>; the documentation with the CPAN module Regexp.</P></DIV></DIV>
<p><font size="-1"><a href="copyrght.htm">Copyright © 2002</a> O'Reilly & Associates. All rights reserved.</font></p>
</BODY></HTML>