<HTML><TITLE>Regexp and Regsub: Use Parentheses to Build more Complicated Patterns</TITLE><BODY BGCOLOR="#FFF0E0" VLINK="#0FBD0F" TEXT="#101000" LINK="#0F0FDD">
<A NAME="top"><H1>Use Parentheses to Build more Complicated Patterns</H1></A>


<P>  We now change the rules so that more complicated regular expressions
can be written:

<DL><DD> <CITE> A quasichar may be replaced with an entire pattern if that
pattern is placed inside parentheses and the resulting overall pattern does
not apply a repeater to a pattern that can match an empty string.
<P> </CITE> </DL>

<P>  In the previous section, we built regular expressions from quasichars,
anchors, repeaters, and branches. The rules we gave for those regular
expressions did not really require that quasichars only match single
characters.  That just made the rules easier to explain.  All that mattered
was that a quasichar could be tested to see if it matches a substring
beginning at a definite place.  A pattern, too, can be tested to see if it
matches a substring beginning at a definite place.  So, there is no reason not
to let quasichars be patterns.

<P>  Therefore, we do let quasichars be patterns but we insist that such
quasichar patterns be surrounded with parentheses to keep things unambiguous.

<P>  Explaining why a quasichar pattern that can match an empty string cannot
be followed by a repeater is more difficult.  After all, the theory says that
the <TT>*</TT> repeater is idempotent, which should mean that <TT>a**</TT> is the same as <TT>a*</TT>.
Why then should the practice forbid <TT>a**</TT> or <TT>(a*)*</TT>?  I have not looked
at the code to see why, but I suppose it has something to do with avoiding infinite
recursion or an infinite loop.  Whatever the reason, theory and practice
differ here.  However, the divergence is not very consequential.
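
<P>  One way to see how a particular interpreter treats such patterns is to
ask it.  The following is only a sketch added for illustration: the helper
name <TT>tryPattern</TT> and the test string <TT>aaa</TT> are assumptions, not
part of the original text.  It wraps <TT>regexp</TT> in <TT>catch</TT> so that
a pattern the interpreter refuses to compile produces an error message instead
of stopping the script.  The outcome may depend on the Tcl version in use, so
no particular result is claimed here.

<PRE>
# tryPattern is a hypothetical helper, not from the book.
# It reports whether the interpreter accepts a pattern at all.
proc tryPattern Pattern {
    # catch returns true when regexp raises an error,
    # leaving the error message in Result
    if {[catch {regexp -- $Pattern aaa} Result]} {
        return "rejected: $Result"
    }
    return "accepted"
}

foreach Pattern {a* a** (a*)*} {
    puts "pattern $Pattern is [tryPattern $Pattern]"
}
</PRE>

Running this on your own system shows which of the three forms your version of
Tcl is willing to compile.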

<P>  Now for an example.  Consider this,

<PRE>
x*
</PRE>

which matches zero or more copies of the letter <TT>x</TT> and this,

<PRE>
cat|dog
</PRE>

which matches "cat" or "dog."  If we replace the quasichar <TT>x</TT> with
the pattern in parentheses, we get

<PRE>
(cat|dog)*
</PRE>

which matches zero or more consecutive substrings, each of which is "cat"
or "dog."

<P> To be even more concrete,

<PRE>
regexp "(cat|dog)*" catdogcatbert Match
</PRE>

will return true and set <TT>Match</TT> to <TT>catdogcat</TT>.
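
<P>  Two details of this pattern are easy to overlook, and the short sketch
below illustrates them.  The variable names <TT>Match1</TT>, <TT>Match2</TT>,
and <TT>Found</TT> and the string <TT>bertcat</TT> are made up for this
illustration.  First, without the parentheses the <TT>*</TT> applies only to
the <TT>g</TT>.  Second, because zero copies count as a match, the
parenthesized pattern also succeeds on a string that does not begin with
"cat" or "dog".

<PRE>
# Without parentheses the * applies only to the g, so the
# pattern means "cat", or "do" followed by zero or more g's.
regexp "cat|dog*" catdogcatbert Match1
puts "Match1 is '$Match1'"

# Zero copies is a legal match, so this succeeds at the very
# start of the string and the matched substring is empty.
set Found [regexp "(cat|dog)*" bertcat Match2]
puts "Found is $Found and Match2 is '$Match2'"
</PRE>

The first <TT>puts</TT> reports that <TT>Match1</TT> is <TT>cat</TT>.  The
second reports that <TT>Found</TT> is 1 and that <TT>Match2</TT> is empty.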

<P> <P><A NAME="7.5a">
<STRONG>Exercise 7.5a</STRONG> </A><DL><DD>
 
<P>  Which of the following will return true?  For those that
do, what is assigned to the variable <TT>Match</TT>?  For those that do not, why not?

<PRE>
set NoLetter_ {[^A-Za-z]}
set OkChar_ {[a-z@\.]}
regexp "(cat | dog)*bert"  catdogbert Match
regexp "($NoLetter_+|nil) + ($NoLetter_+|nil)" "Answer: 2.6 + nillem" Match
regexp -nocase "^(From:|To:) *$OkChar_+$" \
       "From: jazimmer@acm.org\n" \
       Match
</PRE> <P>
<A HREF="7.9.html#Sol7.5a" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.9.html#Sol7.5a">Solution</A></DL>


<P>  Here is a short example of the power of parentheses.  Recall that the Tcl
pattern matcher interprets <TT>^</TT> as an empty string just before the first
character of the string you are trying to match.  In other words, <TT>^</TT> is
not just a control character the way <TT>(</TT> is. Instead, <TT>^</TT> is seen as
matching something.  Now, consider the following,

<PRE>
set LineBrk_ "\n"
regexp "(^|$LineBrk_)To:" $Str Match
</PRE>

This will match the first occurrence of "To:" which is immediately 
preceded by the start of the given string or a break between lines.
In other words, it matches the first occurrence of "To:" at the
beginning of a line.  
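
<P>  To try this out, <TT>Str</TT> needs a value.  The sample text in the
sketch below is invented for illustration.  It contains a "To:" in the middle
of a line, which should be skipped, and a "To:" at the beginning of a later
line, which should be found.

<PRE>
set LineBrk_ "\n"
# Sample text invented for this illustration.
set Str "From: someone\nCc: Reply To: nobody\nTo: somebody\n"
if {[regexp "(^|$LineBrk_)To:" $Str Match]} {
    # Match holds the line break as well as the characters To:
    puts "matched [string length $Match] characters"
}
</PRE>

This reports 4 characters rather than 3, because the line break that precedes
"To:" is part of the match.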

<P> <P><A NAME="7.5b">
<STRONG>Exercise 7.5b</STRONG> </A><DL><DD>
  Finish implementing this procedure,

<PRE>
proc getSummary String { ... }
</PRE>

<TT>String</TT> is viewed as a sequence of lines.  Lines are separated with the
<TT>\n</TT> character.  There may be any number of lines.  The last line may, or
may not, end with a <TT>\n</TT>.

<P>   The purpose of <TT>getSummary</TT> is to return the complete line that begins
with the word "Summary", not including any <TT>\n</TT>.  "Summary" may be
indented.  If the word "Summary" begins more than one line, then the first
such line is returned.  If the word "Summary" begins no lines, then the empty
string is returned.

<P>  To discover that "Summary" begins a line, you have to make sure the "S"
is the very first letter or follows an end-of-line character.
This may get an unwanted <TT>\n</TT> into your match.  You can get rid of
it with a <TT>string</TT> action.  (There is another way to accomplish this
match, which is described in the next
section.  Use it if you like.)  <P>
<A HREF="7.9.html#Sol7.5b" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.9.html#Sol7.5b">Solution</A></DL>


<!-- Linkbar -->
<P><CENTER><FONT SIZE=2><NOBR>
<STRONG>From</STRONG>
<A HREF="http://www.mapfree.com/sbf/tcl/book/home.html" tppabs="http://www.mapfree.com/sbf/tcl/book/home.html">Tcl/Tk For Programmers</A><WBR>
<STRONG>Previous</STRONG>
<A HREF="7.4.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.4.html">section</A><WBR>
<STRONG>Next</STRONG>
<A HREF="7.6.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.6.html">section</A><WBR>
<STRONG>All</STRONG>
<A HREF="7.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.html">sections</A><WBR>
<STRONG>Author</STRONG>
<A HREF="http://www.mapfree.com/mp/jaz/home.html" tppabs="http://www.mapfree.com/mp/jaz/home.html">J. A. Zimmer</A><WBR>
<STRONG>Copyright</STRONG>
<A HREF="copyright.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/copyright.html">Notice</A><WBR>
<P>
<I>Jun 17, 1998</I>
 </NOBR></FONT></CENTER></BODY></HTML>