<HTML><TITLE>Regexp and Regsub: Character Set, Quoting, and Style</TITLE><BODY BGCOLOR="#FFF0E0" VLINK="#0FBD0F" TEXT="#101000" LINK="#0F0FDD">
<A NAME="top"><H1>Character Set, Quoting, and Style</H1></A>


<P>  Tcl regular expressions describe sets of strings of ASCII characters.  You
already know how to represent ASCII characters in Tcl; this is discussed
above in the section
<A HREF="2.10.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/2.10.html">More about Substitution</A>.  For example, the statement

<PRE>
set Salutation "hi there\n"
</PRE>

assigns a string containing two words, a blank, and an end-of-line character to the
variable <TT>Salutation</TT>.

<P> You can pass arbitrary ASCII characters to Tcl's regular-expression command
by writing them the Tcl way.  Just make sure your arguments are in quotes and
not in curly brackets.  If your arguments are in curly brackets, it is the
regular-expression command that must do the backslash substitution.  The first
Tcl version whose regular-expression commands do backslash substitution is (the
currently experimental) version 8.1.
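
<P> The difference between the two quoting styles can be checked directly.  The
following is a minimal sketch, not taken from the book; the variable names
<TT>Quoted</TT> and <TT>Braced</TT> are chosen for illustration only.

<PRE>
# In quotes, the Tcl interpreter performs backslash substitution,
# so the variable holds a single tab character.
set Quoted "\t"

# In curly brackets, no substitution takes place; the variable holds
# two characters, a backslash and the letter t.
set Braced {\t}

puts [string length $Quoted]
puts [string length $Braced]
</PRE>

The first length is 1 and the second is 2.  Whether a regular-expression command
treats the two-character form as a tab depends on the Tcl version, as described
above.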

<P> Another point concerning character sets is that Tcl has special characters
that have to be protected with backslashes if they are to appear in arguments
surrounded by quotes.  Regular expressions also have special characters that
have to be protected with backslashes whenever they are passed without their
special meaning to a regular-expression command.  You can avoid the confusion
of two different sets of special characters by simply not involving the Tcl
interpreter, i.e. by placing all your arguments in curly brackets.  However,
if you do this, and you are working with a version of Tcl earlier than 8.1,
you cannot work with nonprintable characters.
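
<P> As an illustration of the two sets of special characters, here is a hedged
sketch that matches a literal dollar sign followed by digits.  The sample text
and the variable names are assumptions made for this example.

<PRE>
set Text {cost: $45}

# Curly brackets: only the regular-expression parser sees the pattern,
# so one backslash protects the dollar sign.
set InBraces [regexp {\$[0-9]+} $Text]

# Quotes: the Tcl interpreter sees the pattern first.  The backslash, the
# dollar sign, and the open square bracket must all be protected from it.
set InQuotes [regexp "\\\$\[0-9]+" $Text]

puts "$InBraces $InQuotes"
</PRE>

Both commands hand the same pattern to the regular-expression parser, and both
report a match.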

<P> This problem also exists with glob patterns.  I chose to ignore it in the previous
chapter by
insisting that all glob arguments be placed in curly brackets.  One reason I
did that is that I prefer to use regular expressions when things start to get
complicated.

<P>  The question of how to deal with two sets of special characters is more
serious for those who use the Tcl extension named Expect.  This is because
Expect users do lots of pattern matching on the strings of characters that
computers send to terminals.  Since these strings often contain nonprintable
characters, the use of curly brackets by Expect users is very often impossible.
(It will continue to be impossible until the experimental Tcl 8.1 is finished
and has percolated its way into Expect.)  Thus it is easy to understand why
Don Libes, the creator of Expect, urges you to use quotes.
He wants you to
work in a consistent environment, even if that environment has two sets of
special characters to contend with.  Unfortunately, working exclusively with
quotes can have you writing such commands as

<PRE>
expect -re "(%|$|\\\$a) $"
</PRE>

so that the regular-expression processor will see <TT>(%|$|\$a) $</TT>.
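
<P> You can confirm that substitution without running Expect.  This sketch uses
only the core <TT>set</TT> and <TT>string</TT> commands; the variable names are
hypothetical.  It stores the quoted pattern in a variable and compares it with
the same pattern written in curly brackets.

<PRE>
# What the Tcl interpreter produces from the quoted form.
set FromQuotes "(%|$|\\\$a) $"

# What the regular-expression processor is meant to see.
set FromBraces {(%|$|\$a) $}

# string compare returns 0 when the two strings are identical.
puts [string compare $FromQuotes $FromBraces]
</PRE>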

<P>  My own method is somewhat different.  It is motivated by the observation
that regular expressions tend to be messy and difficult to get right.

<P> There was a time when we thought the same thing about almost all programming.
Then Edsger Dijkstra wrote a letter that said,
essentially, "You
know, I have noticed that programmers who organize their code into neat blocks
get their work done faster and have fewer bugs than programmers who do not."
He then pointed to the <TT>goto</TT> statement as a license to avoid neat blocks.

<P>  It is my contention that programmers who structure their regular expressions
by preassigning relevant subpatterns to variables will get their patterns
done faster and with fewer bugs.  A one-line regular expression is a license
to avoid neat blocks.

<P>  Preassigning relevant subpatterns to variables is also a solution to the
"brackets or quotes" dilemma.  Follow Don Libes' rule.  Place all your
regular expressions in quotes.  But, preassign any part of any regular
expression that needs a backslash for any reason.  When you do a
preassignment, you will be using the <TT>set</TT> command and you will have a
choice of passing the subpattern to <TT>set</TT> surrounded by curly brackets or
quotes.  Choosing quotes means your focus is on processing by the Tcl
interpreter.  Choosing curly brackets means your focus is on processing by the
regular-expression parser.
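
<P> A minimal sketch of this rule follows.  The subpattern names <TT>Dollar</TT>,
<TT>Tab</TT>, and <TT>Digits</TT> and the sample line are hypothetical.

<PRE>
# Focus on the regular-expression parser: it must see a backslash
# before the dollar sign, so use curly brackets.
set Dollar {\$}

# Focus on the Tcl interpreter: it must turn \t into a tab, so use quotes.
set Tab "\t"

# In quotes this subpattern would need a backslash before the square
# bracket; curly brackets avoid it.
set Digits {[0-9]+}

# The full pattern goes in quotes and contains no backslashes at all.
set Line "\$45\tpaid"
puts [regexp "^$Dollar$Digits$Tab" $Line]
</PRE>

The final command reports a match.  Every backslash now lives in a
one-line <TT>set</TT> command where its purpose is easy to see.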

<P>  To judge the value of this solution, all you have to do is take some of the
more complicated examples/exercises below and rewrite them without the
preassigned subpatterns.

<P> One last point: my style of writing regular-expression patterns causes a
proliferation in the number of variables that must be remembered.  As I argue
<A HREF="http://www.mapfree.com/sbf/tips/name1.html" tppabs="http://www.mapfree.com/sbf/tips/name1.html">elsewhere,</A>
a proliferation of variable names is not a good idea unless your variable
names are organized into neat blocks.

<P>  One way to create these blocks would be to write a new procedure every time
you want to use a regular expression.  That seems excessive.  Another way is
to adopt a naming convention so that it is easy to see which variable names
exist solely for use with regular-expression patterns.  My naming convention
is simple: I put underscores at the end of the names of variables containing
regular expression subpatterns.

<P>  It is possible to look at an example now because it happens that
regular-expression patterns accept the square bracket notation you have seen
for globs.  So you already know the pattern that means "digit."  Here
is a preassigned subpattern.

<PRE>
set Digit_ {[0-9]}
</PRE>

With this preassignment, you can write <TT>$Digit_</TT> instead of <TT>\[0-9]</TT>
to mean "digit" in your regular expressions.
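
<P> As a hedged illustration, the subpattern can be reused several times inside
one quoted pattern.  The name <TT>Code_</TT> and the sample strings are
assumptions made for this sketch.

<PRE>
set Digit_ {[0-9]}

# Two digits, a hyphen, and two more digits, anchored at both ends.
set Code_ "^$Digit_$Digit_-$Digit_$Digit_$"

puts [regexp $Code_ 12-45]
puts [regexp $Code_ 12-4x]
</PRE>

The first command prints 1 and the second prints 0.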

<P> <P><A NAME="7.2a">
<STRONG>Exercise 7.2a</STRONG> </A><DL><DD>
 Will the following preassignment work for the
regular-expression commands of all Tcl versions?  Explain.

<PRE>
set Space_ {[ \t]}
</PRE>
<P>
<A HREF="7.9.html#Sol7.2a" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.9.html#Sol7.2a">Solution</A></DL>



<!-- Linkbar -->
<P><CENTER><FONT SIZE=2><NOBR>
<STRONG>From</STRONG>
<A HREF="http://www.mapfree.com/sbf/tcl/book/home.html" tppabs="http://www.mapfree.com/sbf/tcl/book/home.html">Tcl/Tk For Programmers</A><WBR>
<STRONG>Previous</STRONG>
<A HREF="7.1.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.1.html">section</A><WBR>
<STRONG>Next</STRONG>
<A HREF="7.3.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.3.html">section</A><WBR>
<STRONG>All</STRONG>
<A HREF="7.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.html">sections</A><WBR>
<STRONG>Author</STRONG>
<A HREF="http://www.mapfree.com/mp/jaz/home.html" tppabs="http://www.mapfree.com/mp/jaz/home.html">J. A. Zimmer</A><WBR>
<STRONG>Copyright</STRONG>
<A HREF="copyright.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/copyright.html">Notice</A><WBR>
<P>
<I>Jun 17, 1998</I>
 </NOBR></FONT></CENTER></BODY></HTML>

