ch06_07.htm

From "By Tom Christiansen and Nathan Torkington" · HTM source · 686 lines · page 1 of 2
>Chemisery&quot;</CODE>. It wraps these with an appropriate HTML level-one header. Because the pattern is relatively complex, we use the <CODE CLASS="literal">/x</CODE> modifier so we can embed whitespace and comments.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch06-31611">Example 6.3: headerfy</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl
# <A CLASS="indexterm" NAME="ch06-idx-1000007793-0"></A>headerfy: change certain chapter headers to html
$/ = '';
while ( &lt;&gt; ) {              # fetch a paragraph
    s{
        \A                  # start of record
        (                   # capture in $1
            Chapter         # text string
            \s+             # mandatory whitespace
            \d+             # decimal number
            \s*             # optional whitespace
            :               # a real colon
            .*              # anything not a newline till end of line
        )
    }{&lt;H1&gt;$1&lt;/H1&gt;}gx;
    print;
}</PRE></DIV>
<P CLASS="para">Here it is as a one-liner from the command line, in case those extended comments just get in the way of understanding:</P>
<PRE CLASS="programlisting">% perl -00pe 's{\A(Chapter\s+\d+\s*:.*)}{&lt;H1&gt;$1&lt;/H1&gt;}gx' datafile</PRE>
<P CLASS="para">This problem is interesting because we need to be able to specify both start-of-record and end-of-line in the same pattern. Normally we could use <CODE CLASS="literal">^</CODE> for start-of-record, but we need <CODE CLASS="literal">$</CODE> to indicate not only end-of-record but also end-of-line. The <CODE CLASS="literal">/m</CODE> modifier provides that, but it changes the meaning of both <CODE CLASS="literal">^</CODE> and <CODE CLASS="literal">$</CODE>. So rather than using <CODE CLASS="literal">^</CODE> to match beginning-of-record, we use <CODE CLASS="literal">\A</CODE>. 
(We're not using it here, but in case you're interested, the version of <CODE CLASS="literal">$</CODE> that always matches end-of-record even in the presence of <CODE CLASS="literal">/m</CODE> is <CODE CLASS="literal">\Z</CODE>.)</P>
<P CLASS="para">The following example uses both <CODE CLASS="literal">/s</CODE> and <CODE CLASS="literal">/m</CODE> together, because we want <CODE CLASS="literal">^</CODE> to match the beginning of any line in the paragraph and also want dot to be able to match a newline. (Because the two modifiers are unrelated, using them together is simply the sum of the parts. If you have the questionable habit of using "single line" as a mnemonic for <CODE CLASS="literal">/s</CODE> and "multiple line" for <CODE CLASS="literal">/m</CODE>, then you may think you can't use them together.) The predefined variable <CODE CLASS="literal">$.</CODE> holds the current record number of the most recently read filehandle. The predefined variable <CODE CLASS="literal">$ARGV</CODE> holds the name of the file currently being read by implicit <CODE CLASS="literal">&lt;ARGV&gt;</CODE> processing.</P>
<PRE CLASS="programlisting">$/ = '';            # paragraph read mode for readline access
while (&lt;ARGV&gt;) {
    while (m#^START(.*?)^END#smg) { # /s makes . span line boundaries
                                    # /m makes ^ match after embedded newlines
                                    # /g resumes after each match so the loop ends
        print &quot;chunk $. in $ARGV has &lt;&lt;$1&gt;&gt;\n&quot;;
    }
}</PRE>
<P CLASS="para">If you've already committed to using the <CODE CLASS="literal">/m</CODE> modifier, you can use <CODE CLASS="literal">\A</CODE> and <CODE CLASS="literal">\Z</CODE> to get the old meanings of <CODE CLASS="literal">^</CODE> and <CODE CLASS="literal">$</CODE> respectively. But what if you've used the <CODE CLASS="literal">/s</CODE> modifier and want to get the original meaning of <CODE CLASS="literal">.</CODE>? You can use <CODE CLASS="literal">[^\n]</CODE>. 
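</P>
<P CLASS="para">As a minimal sketch of the anchors just described, the following compares <CODE CLASS="literal">^</CODE> with <CODE CLASS="literal">\A</CODE> and <CODE CLASS="literal">$</CODE> with <CODE CLASS="literal">\Z</CODE> under <CODE CLASS="literal">/m</CODE>, and dot with <CODE CLASS="literal">[^\n]</CODE> under <CODE CLASS="literal">/s</CODE>. The sample record and the <CODE CLASS="literal">show</CODE> helper are made up for illustration and are not part of the original recipe:</P>
<PRE CLASS="programlisting">#!/usr/bin/perl
# anchordemo: illustrative only; $record and show() are hypothetical
use strict;
use warnings;

my $record = "alpha\nbeta\ngamma\n";      # a made-up three-line record

sub show {                                # print a label and what matched
    my ($label, @found) = @_;
    printf "%-14s %s\n", $label, join(", ", @found);
}

# Under /m, ^ and $ match at every line; \A and \Z only at the record's edges.
my @caret  = $record =~ /^(\w+)/mg;       # alpha, beta, gamma
my @big_a  = $record =~ /\A(\w+)/mg;      # alpha
my @dollar = $record =~ /(\w+)$/mg;       # alpha, beta, gamma
my @big_z  = $record =~ /(\w+)\Z/mg;      # gamma

# Under /s, dot crosses newlines; [^\n] keeps the old one-line behavior.
my ($dot)    = $record =~ /\A(.*)/s;      # the whole record
my ($not_nl) = $record =~ /\A([^\n]*)/s;  # alpha

show('^  with /m',    @caret);
show('\A with /m',    @big_a);
show('$  with /m',    @dollar);
show('\Z with /m',    @big_z);
show('.  with /s',    length($dot) . " characters");
show('[^\n] with /s', $not_nl);</PRE>
<P CLASS="para">With this sample record, <CODE CLASS="literal">^</CODE> and <CODE CLASS="literal">$</CODE> should each report all three words, <CODE CLASS="literal">\A</CODE> only the first, <CODE CLASS="literal">\Z</CODE> only the last, dot the full 17 characters, and <CODE CLASS="literal">[^\n]</CODE> only the first line.</P>
<P CLASS="para">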
If you don't care to use <CODE CLASS="literal">/s</CODE> but want the notion of matching any character, you could construct a character class that matches any one byte, such as <CODE CLASS="literal">[\000-\377]</CODE> or even <CODE CLASS="literal">[\d\D]</CODE>. You can't use <CODE CLASS="literal">[.\n]</CODE> because <CODE CLASS="literal">.</CODE> is not special in a character <A CLASS="indexterm" NAME="ch06-idx-1000007584-0"></A><A CLASS="indexterm" NAME="ch06-idx-1000007584-1"></A>class; there it matches only a literal period.<A CLASS="indexterm" NAME="ch06-idx-1000007573-0"></A><A CLASS="indexterm" NAME="ch06-idx-1000007573-1"></A><A CLASS="indexterm" NAME="ch06-idx-1000007573-2"></A><A CLASS="indexterm" NAME="ch06-idx-1000007573-3"></A></P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch06-pgfId-855">See Also</A></H3>
<P CLASS="para">The <CODE CLASS="literal">$/</CODE> variable in <I CLASS="filename">perlvar</I>(1) and in the <A CLASS="olink" HREF="../prog/ch02_09.htm">"Special Variables"</A> section of <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the <CODE CLASS="literal">/s</CODE> and <CODE CLASS="literal">/m</CODE> modifiers in <I CLASS="filename">perlre</I>(1) and the <A CLASS="olink" HREF="../prog/ch02_04.htm#PERL2-CH-2-SECT-4.1.3">"fine print"</A> section of <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the "String Anchors" section of <CITE CLASS="citetitle">Mastering Regular Expressions</CITE>; we talk more about the special variable <CODE CLASS="literal">$/</CODE> in <A CLASS="xref" HREF="ch08_01.htm" TITLE="File Contents">Chapter 
8</A></P></DIV></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch06_06.htm" TITLE="6.5. Finding the Nth Occurrence of a Match"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 6.5. Finding the Nth Occurrence of a Match" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch06_08.htm" TITLE="6.7. Reading Records with a Pattern Separator"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 6.7. Reading Records with a Pattern Separator" BORDER="0"></A></TD></TR>
<TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">6.5. Finding the Nth Occurrence of a Match</TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">6.7. Reading Records with a Pattern Separator</TD></TR>
</TABLE><HR ALIGN="LEFT" WIDTH="684" TITLE="footer"></DIV>
<!-- LIBRARY NAV BAR --> <img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links">
<p><font size="-1"> <a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font> </p>
<map name="library-map"> <area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map> </BODY></HTML>
This page shows the source file ch06_07.htm from "By Tom Christiansen and Nathan Torkington ISBN 1-56592-243-3 First Edition, published August 1998", an HTM file of 686 lines in total.