<html><head><title>Concepts of Regular Expressions (Learning Perl, 3rd Edition)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" />
<meta name="DC.Creator" content="Randal L. Schwartz and Tom Phoenix" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly &amp; Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="0596001320L" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Learning Perl, 3rd Edition" /><meta name="DC.Type" content="Text.Monograph" />
</head><body bgcolor="#ffffff">
<img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Learning Perl, 3rd Edition" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map>
<div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch06_06.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"></a></td><td align="right" valign="top" width="228"><a href="ch07_02.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div>
<h1 class="chapter">Chapter 7. Concepts of Regular Expressions</h1>
<div class="htmltoc"><h4 class="tochead">Contents:</h4>
<p> <a href="#lperl3-CHP-7-SECT-1">What Are Regular Expressions?</a><br />
<a href="ch07_02.htm">Using Simple Patterns</a><br />
<a href="ch07_03.htm">A Pattern Test Program</a><br />
<a href="ch07_04.htm">Exercises</a><br /></p></div>
<p><a name="INDEX-492" /></a>Perl has many features that set it
apart from other languages. Of all those features, one of the most
important is its strong support for regular expressions. These allow
fast, flexible, and reliable string handling.
</p>
<p>But that power comes at a price. Regular expressions are actually
tiny programs in their own special language, built inside Perl. (Yes,
you're about to learn <em class="emphasis">another</em> programming
language!<a href="#FOOTNOTE-162">[162]</a> Fortunately it's a simple
one.) So for the next two chapters, we'll be learning that
language; then we'll take what we've learned back to the
world of Perl in <a href="ch09_01.htm">Chapter 9, "Using Regular Expressions"</a>.
</p><blockquote class="footnote"> <a name="FOOTNOTE-162" /></a><p>[162] Some might argue that regular expressions
are not a <em class="emphasis">complete</em> programming language. We
won't argue.</p> </blockquote>
<p><a name="INDEX-493" /></a>Regular expressions aren't merely
part of Perl; they're also found in <i class="command">sed</i> and
<i class="command">awk</i>, <i class="command">procmail</i>,
<i class="command">grep</i>, most programmers' text editors like
<i class="command">vi</i> and <i class="command">emacs</i>, and even in more
esoteric places. If you've seen some of these already,
you're ahead of the game. Keep watching, and you'll see
many more tools that use or support regular expressions, such as
search engines on the Web (often written in Perl), email clients, and
others.
</p>
<div class="sect1"><a name="lperl3-CHP-7-SECT-1" /></a>
<h2 class="sect1">7.1. What Are Regular Expressions?</h2>
<p>A <em class="firstterm">regular expression</em>, often called a
<em class="firstterm">pattern</em><a name="INDEX-494" /></a>
<a name="INDEX-495" /></a> in Perl, is a template that either
matches or doesn't match a given string.<a href="#FOOTNOTE-163">[163]</a> That is, there are an infinite
number of possible text strings; a given pattern divides that
infinite set into two groups: the ones that match, and the ones that
don't. There's never any kinda-sorta-almost-up-to-here
wishy-washy matching: either it matches or it doesn't. A
pattern may match just one possible string, or just two or three, or
a dozen, or a hundred, or an infinite number. Or it may match all
strings <em class="emphasis">except</em> for one, or except for some, or
except for an infinite number.<a href="#FOOTNOTE-164">[164]</a>
</p><blockquote class="footnote">
<a name="FOOTNOTE-163" /></a><p>[163] Purists would ask for a more rigorous definition. But then
again, purists say that Perl's patterns aren't really
regular expressions. If you're serious about regular
expressions, we highly recommend the book <em class="emphasis">Mastering Regular
Expressions</em> by Jeffrey Friedl (O'Reilly &amp;
Associates, Inc.).</p> </blockquote><blockquote class="footnote"> <a name="FOOTNOTE-164" /></a><p>[164] And as we'll
see, you could have a pattern that always matches or that never does.
In rare cases, even these may be useful. Generally, though,
they're mistakes.</p> </blockquote>
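<p>To make the two groups concrete, here is a minimal sketch in Perl. The
list of strings is made up for this illustration, and the pattern
<tt class="literal">/stone/</tt> is only one possible choice; the
pattern-matching syntax itself is explained in the sections that follow.
</p>
<blockquote><pre class="code">#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample strings, invented for this illustration.
my @strings = ("flintstone", "sandstone", "flint", "milestone", "rubble");

# The pattern /stone/ sorts every string into exactly one of two groups.
my (@matched, @unmatched);
foreach (@strings) {
    if (/stone/) {
        push @matched, $_;      # the pattern matches this string
    } else {
        push @unmatched, $_;    # the pattern doesn't match this string
    }
}

print "matches:       @matched\n";
print "doesn't match: @unmatched\n";</pre></blockquote>
<p>Here <tt class="literal">flintstone</tt>, <tt class="literal">sandstone</tt>,
and <tt class="literal">milestone</tt> land in the first group, while
<tt class="literal">flint</tt> and <tt class="literal">rubble</tt> land in
the second. No string ends up in both groups, and none is left over.
</p>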
<p>We already referred to regular expressions as being little programs
in their own simple programming language. It's a simple
language because the programs have just one task: to look at a string
and say "it matches" or "it doesn't
match".<a href="#FOOTNOTE-165">[165]</a> That's all they do.
</p><blockquote class="footnote"> <a name="FOOTNOTE-165" /></a><p>[165] The programs also pass back some
information that Perl can use later. One such piece of information is
the "regular expression memories" that we'll learn
about a little later.</p> </blockquote>
<p>One of the places you're likely to have seen regular
expressions is in the Unix
<i class="command">grep</i><a name="INDEX-496" /></a> command, which prints out text lines
matching a given pattern. For example, if you wanted to see which
lines in a given file mention
<tt class="literal">flint</tt><a name="INDEX-497" /></a> and, somewhere later on the
same line, <tt class="literal">stone</tt>, you might do something like
this:
</p>
<blockquote><pre class="code">$ <tt class="userinput"><b>grep 'flint.*stone' some_file</b></tt>
a piece of flint, a stone which may be used to start a fire by striking
found obsidian, flint, granite, and small stones of basaltic rock, which
a flintlock rifle in poor condition. The sandstone mantle held several</pre></blockquote>
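<p>The same test can be sketched in Perl. This is only an illustration, not
a replacement for <i class="command">grep</i>: the first three strings are
the output lines shown above, and the fourth is an invented line that
mentions <tt class="literal">stone</tt> before <tt class="literal">flint</tt>.
</p>
<blockquote><pre class="code">#!/usr/bin/perl
use strict;
use warnings;

# The first three lines come from the grep output above; the last one is
# a made-up line in which stone comes BEFORE flint.
my @lines = (
    "a piece of flint, a stone which may be used to start a fire by striking",
    "found obsidian, flint, granite, and small stones of basaltic rock, which",
    "a flintlock rifle in poor condition. The sandstone mantle held several",
    "a stone lying beside a piece of flint",
);

foreach (@lines) {
    # Same pattern as the grep command: flint, then anything, then stone.
    if (/flint.*stone/) {
        print "match:    $_\n";
    } else {
        print "no match: $_\n";
    }
}</pre></blockquote>
<p>The third line matches even though neither word stands alone in it: the
pattern finds <tt class="literal">flint</tt> inside
<tt class="literal">flintlock</tt> and <tt class="literal">stone</tt> inside
<tt class="literal">sandstone</tt>. The invented fourth line is rejected
because no <tt class="literal">stone</tt> follows its
<tt class="literal">flint</tt>.
</p>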
<p>Now, if you've used regular expressions somewhere else,
that's good, because you have a head start on these three
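chapters. But Perl's regular expressions have a somewhat
different syntax from most other implementations; in fact,
everybody's regular expressions are a little different. So, if
you needed a backslash to do something in another
implementation, you may need to leave it off in Perl, or
vice versa. For example, traditional <i class="command">grep</i>
groups part of a pattern with <tt class="literal">\(</tt> and
<tt class="literal">\)</tt>, where Perl uses plain parentheses.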
</p>
<p>Don't confuse regular expressions with shell filename-matching
patterns, called
<em class="firstterm">globs</em><a name="INDEX-498" /></a>.
A typical glob is what you use when you type
<tt class="literal">*.pm</tt><a name="INDEX-499" /></a> to the Unix shell to match all filenames
that end in <tt class="literal">.pm</tt>. Globs use a lot of the same
characters that we use in regular expressions, but those characters
are used in totally different ways.<a href="#FOOTNOTE-166">[166]</a> We'll visit globs later, in <a href="ch12_01.htm">Chapter 12, "Directory Operations"</a>, but for now try to put them totally out of
your mind.
</p><blockquote class="footnote"> <a name="FOOTNOTE-166" /></a><p>[166] Globs are also
(alas) sometimes called patterns. What's worse, though, is that
some bad Unix books for beginners (and possibly
<em class="emphasis">written</em> by beginners) have taken to calling
globs "regular expressions", which they certainly are
not. This confuses many folks at the start of their work with
Unix.</p> </blockquote>
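<p>As a rough illustration of how differently the same characters behave,
consider the dot. In the glob <tt class="literal">*.pm</tt> it is an
ordinary character, but in a pattern it matches any single character
unless it is escaped with a backslash. The sketch below uses hypothetical
filenames and two patterns chosen for this example; the trailing
<tt class="literal">$</tt>, which ties the match to the end of the string,
has not been introduced yet.
</p>
<blockquote><pre class="code">#!/usr/bin/perl
use strict;
use warnings;

# Made-up filenames for illustration.
my @names = ("Fred.pm", "Barney.pm", "notes_4pm", "wilma.txt");

foreach (@names) {
    # A bare dot matches ANY single character before "pm".
    my $unescaped = /.pm$/  ? "yes" : "no";
    # An escaped dot matches only a literal dot, which is closer to
    # what the glob *.pm selects.
    my $escaped   = /\.pm$/ ? "yes" : "no";
    print "$_: unescaped dot $unescaped, escaped dot $escaped\n";
}</pre></blockquote>
<p>Only <tt class="literal">notes_4pm</tt> is treated differently by the two
patterns: the bare dot accepts the <tt class="literal">4</tt>, while the
escaped dot insists on a real dot, so that name is rejected just as the
glob would reject it.
</p>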
</div>
<hr width="684" align="left" />
<div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch06_06.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"><img alt="Home" border="0" src="../gifs/txthome.gif" /></a></td><td align="right" valign="top" width="228"><a href="ch07_02.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr><tr><td align="left" valign="top" width="228">6.6. Exercises</td><td align="center" valign="top" width="228"><a href="index/index.htm"><img alt="Book Index" border="0" src="../gifs/index.gif" /></a></td><td align="right" valign="top" width="228">7.2. Using Simple Patterns</td></tr></table></div>
<hr width="684" align="left" />
<img alt="Library Navigation Links" border="0" src="../gifs/navbar.gif" usemap="#library-map" />
<p><font size="-1"><a href="copyrght.htm">Copyright © 2002</a> O'Reilly &amp; Associates. All rights reserved.</font></p>
<map name="library-map"><area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map>
</body></html>