<HTML><TITLE>Regexp and Regsub: Use Parentheses to Extract Subpatterns</TITLE><BODY BGCOLOR="#FFF0E0" VLINK="#0FBD0F" TEXT="#101000" LINK="#0F0FDD">
<A NAME="top"><H1>Use Parentheses to Extract Subpatterns</H1></A>
<P> While parentheses can permit you to write more complicated regular
expressions, their main purpose may be to let you extract substrings from a
matching string. Suppose you have a document in which negative numbers are
represented by placing them inside parentheses, for example, (45.32) (2.94). All
numbers have two digits to the right of the decimal point. There may or may
not be digits to the left of the decimal point. Here is a pattern that
matches those numbers:
<PRE>
$LParen_$Number_$RParen_
</PRE>
The pattern relies on these preassigned subpatterns:
<PRE>
set LParen_ {\(}
set RParen_ {\)}
set Digit_ {[0-9]}
set Dot_ {\.}
set Number_ $Digit_*$Dot_$Digit_$Digit_
</PRE>
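<P> As a minimal sketch of the pattern in use, the commands below apply it to a
sample string. The variable names <TT>Text</TT>, <TT>Found</TT>, and
<TT>Match</TT>, and the sample text itself, are assumptions made for this
illustration.
<PRE>
set LParen_ {\(}
set RParen_ {\)}
set Digit_ {[0-9]}
set Dot_ {\.}
set Number_ $Digit_*$Dot_$Digit_$Digit_

# Hypothetical sample text containing one parenthesized number.
set Text {The loss was (45.32) this quarter.}

# Found is 1 if the pattern matches somewhere in Text, otherwise 0.
# Match receives the entire matching substring, parentheses included.
set Found [regexp $LParen_$Number_$RParen_ $Text Match]
puts "$Found $Match"

# A number with no digits left of the decimal point also matches.
set FoundNoLeft [regexp $LParen_$Number_$RParen_ {(.94)} MatchNoLeft]
puts "$FoundNoLeft $MatchNoLeft"
</PRE>
With this sample text, the first command should report the match
<TT>(45.32)</TT> and the second <TT>(.94)</TT>.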
<P> Now, suppose you want to search for a parenthesized negative number, extract
the nonnegative number in the parentheses, and make it negative. There is
a variation of <TT>regexp</TT> that will help:
<P><CENTER><TABLE BORDER><TR><TD><DL>
<DT><STRONG><PRE>regexp <CITE>?SWITCHES? PATTERN STRING VAR_NAME1 VAR_NAME2 ... VAR_NAMEn</CITE></PRE></STRONG><DD>
As with the other forms of <TT>regexp</TT>, <CITE>VAR_NAME1</CITE> is the name of the
variable that will be assigned the entire matching substring. The other
<CITE>VAR_NAMEi</CITE>s are new. They are assigned substrings determined by the way
you add parentheses to your pattern.
<P> <CITE>VAR_NAME2</CITE> is the name of a variable that will be assigned the part of
the matching substring that matches the subpattern in the leftmost
parentheses. <CITE>VAR_NAME3</CITE> is the name of a variable that will be assigned
the part of the matching substring that matches the subpattern in the
next-to-leftmost parentheses. And so on.
<P> You decide whether one set of parentheses is to the left of another by
looking only at the positions of the two left parentheses. Branches and
nesting do not matter. A set of parentheses is to the left of another if
its left parenthesis appears first in the pattern.
<P> The return value is a boolean indicating whether the complete match
was successful.
</DL></TD></TR></TABLE></CENTER></P>
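<P> The ordering rule is easiest to see with nested parentheses. The following
is a small sketch. The pattern, the sample string, and the variable names
<TT>Whole</TT>, <TT>Outer</TT>, <TT>As</TT>, and <TT>Bs</TT> are all
hypothetical and chosen only to illustrate the numbering.
<PRE>
# Parentheses are numbered by the position of their left parenthesis:
#   first:  ((a+)(b+))   the outer pair, whose left parenthesis is leftmost
#   second: (a+)
#   third:  (b+)
set Pattern {((a+)(b+))c}

# Whole gets the entire match. Outer, As, and Bs get the three subpatterns.
set Found [regexp $Pattern {xxaabbbcyy} Whole Outer As Bs]
puts "$Found $Whole $Outer $As $Bs"
</PRE>
Here <TT>Whole</TT> should be <TT>aabbbc</TT>, <TT>Outer</TT> should be
<TT>aabbb</TT>, <TT>As</TT> should be <TT>aa</TT>, and <TT>Bs</TT> should be
<TT>bbb</TT>. The outer pair comes first even though it encloses the other two.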
<P> To extract the number part of the previous pattern, we need to put
parentheses around it, something like this:
<PRE>
$LParen_($Number_)$RParen_
</PRE>
<P> The variable <TT>LParen_</TT> has been defined so that it will match a left
parenthesis and not be seen as a special symbol by <TT>regexp</TT>.
Unfortunately, the string shown above is a case where the left parenthesis
can also be a special symbol for Tcl. When interpreting the string, Tcl
reads <TT>$LParen_($Number_)</TT> as a reference to an element of an array
named <TT>LParen_</TT>!
<P> Tcl has ways of handling this problem. The left parenthesis could be
protected with a backslash, or the variable name could be delimited with
curly braces. Using the second trick, the <TT>regexp</TT> command looks like
this:
<PRE>
regexp ${LParen_}($Number_)$RParen_ $Text Junk Number
</PRE>
If a match is found, <TT>Junk</TT> will contain the entire matching substring,
which we do not care about, and <TT>Number</TT> will contain the desired number.
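<P> Putting the pieces together, the sketch below extracts the number and then
makes it negative by prefixing a minus sign. It also tries the backslash form
mentioned above. The sample text and the names <TT>Text</TT>, <TT>Found</TT>,
<TT>Negative</TT>, <TT>Junk2</TT>, and <TT>Number2</TT> are assumptions made
for this illustration.
<PRE>
set LParen_ {\(}
set RParen_ {\)}
set Digit_ {[0-9]}
set Dot_ {\.}
set Number_ $Digit_*$Dot_$Digit_$Digit_

# Hypothetical sample text.
set Text {Adjustment: (45.32)}

# Braces end the variable name, so Tcl does not see an array reference.
set Found [regexp ${LParen_}($Number_)$RParen_ $Text Junk Number]
if {$Found} {
    # Number holds the digits without parentheses; prefix a minus sign.
    set Negative -$Number
    puts $Negative
}

# The backslash form: \( also ends the variable name and yields a plain
# left parenthesis, so regexp receives the same pattern.
set Found2 [regexp $LParen_\($Number_)$RParen_ $Text Junk2 Number2]
puts "$Found2 $Number2"
</PRE>
Both forms hand <TT>regexp</TT> the same pattern, so which one to use is a
matter of taste.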
<P><A NAME="7.6a">
<STRONG>Exercise 7.6a</STRONG> </A><DL><DD>
A table is represented in an ASCII file. Each line contains
three things: first a label, then two nonnegative
integers representing "before" and "after" values for the item named in
the label. These three things are separated by blanks or tabs. The label can
contain anything that is not a blank or tab. The label may be indented.
Here is an example line:
<PRE>
BrandX 17 18
</PRE>
<P> Assume that <TT>Line</TT> contains one of these lines. Write a
<TT>regexp</TT> command that extracts the three things and places them in
the variables <TT>Label</TT>, <TT>Before</TT>, and <TT>After</TT>. <P>
<A HREF="7.9.html#Sol7.6a" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.9.html#Sol7.6a">Solution</A></DL>
<!-- Linkbar -->
<P><CENTER><FONT SIZE=2><NOBR>
<STRONG>From</STRONG>
<A HREF="http://www.mapfree.com/sbf/tcl/book/home.html" tppabs="http://www.mapfree.com/sbf/tcl/book/home.html">Tcl/Tk For Programmers</A><WBR>
<STRONG>Previous</STRONG>
<A HREF="7.5.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.5.html">section</A><WBR>
<STRONG>Next</STRONG>
<A HREF="7.7.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.7.html">section</A><WBR>
<STRONG>All</STRONG>
<A HREF="7.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.html">sections</A><WBR>
<STRONG>Author</STRONG>
<A HREF="http://www.mapfree.com/mp/jaz/home.html" tppabs="http://www.mapfree.com/mp/jaz/home.html">J. A. Zimmer</A><WBR>
<STRONG>Copyright</STRONG>
<A HREF="copyright.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/copyright.html">Notice</A><WBR>
<P>
<I>Jun 17, 1998</I>
</NOBR></FONT></CENTER></BODY></HTML>