<HTML><TITLE>Regexp and Regsub: Solutions to Exercises</TITLE><BODY BGCOLOR="#FFF0E0" VLINK="#0FBD0F" TEXT="#101000" LINK="#0F0FDD">
<A NAME="top"><H1>Solutions to Exercises</H1></A>


<P> <A NAME="Sol7.2a">
<STRONG> Solution To Exercise 7.2a</STRONG> </A> <P>


The <TT>\t</TT> will not be substituted with the nonprintable tab character unless
you are running version 8.1 or later.  Instead, the regular expression
evaluator will see it as a letter "t."  Fix the problem this way:

<PRE>
set Space_ "\[ \t]"
</PRE>

The backslash before the left square bracket prevents the Tcl interpreter from
doing command substitution.
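
<P> A quick check at the Tcl prompt can confirm this.  The following is a minimal
sketch, assuming version 8.1 or later; the sample string <TT>a\tb</TT> is made up for
illustration.  It should print 4 and then 1.

<PRE>
# Double quotes let the Tcl interpreter turn \t into a real tab,
# while \[ keeps the bracket from starting a command substitution.
set Space_ "\[ \t]"

# Four characters: left bracket, space, tab, right bracket.
puts [string length $Space_]

# The character class now matches the tab in the sample string.
puts [regexp $Space_ "a\tb"]
</PRE>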


<P> <A NAME="Sol7.3a">
<STRONG> Solution To Exercise 7.3a</STRONG> </A> <P>



The first part, <TT>$Pre1_</TT>, is not interpreted by the Tcl interpreter during
preassignment.  The backslashes tell the regular-expression translator that
there are no special symbols.  This part matches <TT>[0-9]</TT> exactly.

<P> The second part, <TT>$Pre2_</TT>, is interpreted by the Tcl interpreter during
preassignment.  The regular-expression translator sees <TT>[0-9]</TT>, which
matches any single digit.

<P> So, the whole regular expression pattern would match <TT>[0-9]3</TT>, but not
<TT>3[0-9]</TT>.

<P> The second preassignment to <TT>Pre2_</TT> would cause an error because <TT>0-9</TT>
is not a command name and so command substitution fails.

<P> By the way, the preassignment to <TT>Pre1_</TT> could have been written this way:

<PRE>
set Pre1_ {\[0-9]}
</PRE>

because the hyphen and the right square bracket are not considered to be
special symbols by the regular-expression translator unless they follow a left
square bracket.
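
<P> The behavior described in this solution can be checked with a short sketch.
The assignment to <TT>Pre2_</TT> below is an assumption consistent with the discussion
(the exercise's own preassignments are not repeated here); the test strings
come from the text.

<PRE>
# Braces: the Tcl interpreter leaves the backslash alone, so the
# regular-expression translator sees \[ and takes the bracket literally.
set Pre1_ {\[0-9]}

# Double quotes: the Tcl interpreter removes the backslash, so the
# regular-expression translator sees the character class [0-9].
set Pre2_ "\[0-9]"

puts [regexp $Pre1_$Pre2_ {[0-9]3}]   ;# 1
puts [regexp $Pre1_$Pre2_ {3[0-9]}]   ;# 0

# Without the backslash, command substitution is attempted and fails.
puts [catch {set Pre2_ "[0-9]"} Msg]  ;# 1
puts $Msg
</PRE>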



<P> <A NAME="Sol7.3b">
<STRONG> Solution To Exercise 7.3b</STRONG> </A> <P>



<PRE>
regexp "^Tcl$" $Name
</PRE>

Variable substitution is not attempted when a symbol other than a letter,
number, or underscore follows a dollar sign.  This rule is consistent
with what you have had to learn about safe variable names.  
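
<P> A small sketch of the rule follows; the value of <TT>Name</TT> and the second test
string are assumptions chosen for illustration.

<PRE>
set Name Tcl

# The dollar sign before the closing quote is not followed by a letter,
# number, or underscore, so it reaches regexp as the end-of-string anchor.
puts [regexp "^Tcl$" $Name]      ;# 1
puts [regexp "^Tcl$" "Tcl/Tk"]   ;# 0
</PRE>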




<P> <A NAME="Sol7.3c">
<STRONG> Solution To Exercise 7.3c</STRONG> </A> <P>



<PRE>
set NoDot_ {[^\.]}
</PRE>

As it happens, the backslash is not necessary.  Within square brackets, the
only special symbols that are recognized are <TT>^</TT>, <TT>-</TT>, and <TT>]</TT>.  I
prefer to ignore this rule and put a backslash before
nonalphameric characters.  (The word "nonalphameric" is important here.
Indeed, with version 8.1, a backslash before a letter is either a request for a
special backslash substitution, such as <TT>\t</TT> or <TT>\n</TT>, or an error.)  If
you want to take advantage of the rule, you should know that it even has a
counterpart in glob pattern matching that I did not mention there.
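
<P> The following sketch compares the pattern with and without the backslash.  It
assumes version 8.1 or later, and the name <TT>NoDotPlain_</TT> is hypothetical,
introduced only for the comparison.  Both columns of results should agree.

<PRE>
set NoDot_ {[^\.]}
set NoDotPlain_ {[^.]}

foreach Str {. a 7 -} {
    # Each line shows the test string and the result for both patterns.
    puts "$Str [regexp $NoDot_ $Str] [regexp $NoDotPlain_ $Str]"
}
</PRE>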

<P> <A NAME="Sol7.3d">
<STRONG> Solution To Exercise 7.3d</STRONG> </A> <P>



<PRE>
% regexp -indices "\[a-z]ab"  abab Match
1
% set Match
1 3
% regexp -indices t$ catbert Match
1
% set Match
6 6
</PRE>



<P> <A NAME="Sol7.4a">
<STRONG> Solution To Exercise 7.4a</STRONG> </A> <P>



<PRE>
regexp -indices $Space_$Quote_ {  "} Match
                                  <CITE>Matches and</CITE> Match <CITE>is</CITE> 1 2
regexp $Digit_.$Digit_ 201 Match  <CITE>Matches and</CITE> Match <CITE>is</CITE> 201
regexp $NoDot_*$Dot_ "Interesting. But not relevant." Match
                                  <CITE>Matches and</CITE> Match <CITE>is</CITE> Interesting.
regexp ".*" "" Match              <CITE>Matches and</CITE> Match <CITE>is the empty string.</CITE>
</PRE>




<P> <A NAME="Sol7.4b">
<STRONG> Solution To Exercise 7.4b</STRONG> </A> <P>



<PRE>
regexp catbert|cat catbert Match  <CITE>Matches and</CITE> Match <CITE>is</CITE> catbert
regexp cat|catbert catbert Match  <CITE>Matches and</CITE> 
                                  Match <CITE>is</CITE> cat in version 8.0 and earlier
                                  Match <CITE>is</CITE> catbert in version 8.1 and later
regexp c?t|at catbert Match       <CITE>Matches and</CITE> Match <CITE>is</CITE> at
regexp $NoLowerCase_*at|catbert Catbert Match
                                  <CITE>Matches and</CITE> Match <CITE>is</CITE> Cat
regexp $NoLowerCase_*bert|bert Catbert Match
                                  <CITE>Matches and</CITE> Match <CITE>is</CITE> bert
</PRE>

In the last one it is the leftmost branch that is used.  Remember that the
<TT>*</TT> repeater lets a quasichar match an empty string, that an imaginary empty
string exists in front of each character in a string, and that when two
matches are the same length the leftmost branch prevails in all versions of Tcl.
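
<P> These rules can be observed directly.  The sketch below assumes version 8.1 or
later and assumes that <TT>NoLowerCase_</TT> is defined as shown, which is consistent
with the results above but is not taken from the text.

<PRE>
set NoLowerCase_ {[^a-z]}   ;# assumed definition

# Both branches match at the first character; 8.1 and later keep the longer one.
regexp cat|catbert catbert Match
puts $Match                 ;# catbert

# Nothing matches before the first b.  There the * repeater matches an
# empty string, so both branches give a match of the same length.
regexp $NoLowerCase_*bert|bert Catbert Match
puts $Match                 ;# bert
</PRE>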


<P> <A NAME="Sol7.4c">
<STRONG> Solution To Exercise 7.4c</STRONG> </A> <P>



<PRE>
set CarriageRet_ "\n"
set NoCarriageRet_ "\[^\n]"
regexp "^$NoCarriageRet_*$CarriageRet_" $Str Match
</PRE>



<P> <A NAME="Sol7.5a">
<STRONG> Solution To Exercise 7.5a</STRONG> </A> <P>



<P> This,

<PRE>
regexp "(cat | dog)*bert"  catdogbert Match
</PRE>

returns 1, but the "<TT>(cat | dog)*</TT>" part had to match an empty string because
its branches are "<TT>cat </TT>" (with a trailing space) and "<TT> dog</TT>" (with a
leading space), and there are no spaces in "<TT>catdogbert</TT>".  <TT>Match</TT> is
"<TT>bert</TT>."

<P> This, 
<PRE> 
regexp "($NoLetter_+|nil) + ($NoLetter_+|nil)" "Answer: 2.6 +nillem" Match 
</PRE> 

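returns 0.  The <TT>+</TT> in the pattern does not match the "+" in the string because
it is a repeater: it applies to the space before it, so the pattern requires at
least two consecutive spaces between the two parenthesized parts.

<P> The match you may have thought you were getting happens with this version: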

<PRE>
set Plus_ {\+}
regexp "($NoLetter_+|nil) $Plus_ ($NoLetter_+|nil)" "Answer: 2.6 + nillem" 
</PRE>
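
<P> The two patterns can be compared side by side.  This is only a sketch: the
definition of <TT>NoLetter_</TT> is an assumption, and both calls use the string from
the corrected version.  Note that <TT>Match</TT> starts at the colon, because the colon
and the space after it are nonletters too.

<PRE>
set NoLetter_ {[^a-zA-Z]}   ;# assumed definition
set Plus_ {\+}
set Str "Answer: 2.6 + nillem"

# Unescaped, the plus sign repeats the space before it, and another
# space follows, so two consecutive spaces are required: no match.
puts [regexp "($NoLetter_+|nil) + ($NoLetter_+|nil)" $Str]          ;# 0

# Escaped, the plus sign stands for itself and the pattern matches.
puts [regexp "($NoLetter_+|nil) $Plus_ ($NoLetter_+|nil)" $Str Match]  ;# 1
puts $Match                                                         ;# : 2.6 + nil
</PRE>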

<P> This,
<PRE>
regexp -nocase "^(From:|To:) *$OkChar_+$" \
       "From: jazimmer@acm.org\n" \
       Match
</PRE>

returns 0.  Here it is the <TT>\n</TT> that causes the trouble.  The <TT>$</TT> in the
pattern matches only at the end of the string, and the position just before the
<TT>\n</TT> is the end of a line, not the end of the string.  This string,
"<TT>From: jazimmer@acm.org</TT>", would match just fine.
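
<P> A minimal sketch of the difference follows.  The definition of <TT>OkChar_</TT> is an
assumption made for this example; any definition that excludes the newline
behaves the same way.

<PRE>
set OkChar_ {[a-zA-Z0-9@.]}   ;# assumed definition

# The trailing newline keeps the end-of-string anchor from matching.
puts [regexp -nocase "^(From:|To:) *$OkChar_+$" "From: jazimmer@acm.org\n"]   ;# 0

# Without the newline, the same pattern matches.
puts [regexp -nocase "^(From:|To:) *$OkChar_+$" "From: jazimmer@acm.org"]     ;# 1
</PRE>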



<P> <A NAME="Sol7.5b">
<STRONG> Solution To Exercise 7.5b</STRONG> </A> <P>


<PRE>
proc getSummary String {
  set Beginning_ "(^|\n)"
  set Space_ "\[ \t]"
  set InLine_ "\[^\n]"
  if [regexp "$Beginning_$Space_*Summary$InLine_*" $String Line] {
     return [string trim $Line "\n \t"]
  } else {
     return ""
  }
}
</PRE>

<P> Here is the way it is done using parentheses to extract subpatterns as
described above in 
<A HREF="7.6.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.6.html">Use Parentheses to Extract Subpatterns</A>.

<PRE>
proc getSummary String {
  set Beginning_ "(^|\n)"
  set Space_ "\[ \t]"
  set InLine_ "\[^\n]"
  if [regexp "$Beginning_$Space_*(Summary$InLine_*)" $String \
             Junk1 Junk2 Summary] \
  {
     return $Summary
  } else {
     return ""
  }
}
</PRE>
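
<P> Here is a short usage sketch for the second version.  The procedure is repeated
so that the example stands alone; the input strings are hypothetical.

<PRE>
proc getSummary String {
  set Beginning_ "(^|\n)"
  set Space_ "\[ \t]"
  set InLine_ "\[^\n]"
  if [regexp "$Beginning_$Space_*(Summary$InLine_*)" $String \
             Junk1 Junk2 Summary] \
  {
     return $Summary
  } else {
     return ""
  }
}

# Hypothetical input: the Summary line is indented and is not the first line.
set Text "Title: Regexp\n  Summary: patterns and substitutions\nAuthor: unknown"
puts [getSummary $Text]

# With no Summary line, the result is the empty string (length 0).
puts [string length [getSummary "no such line here"]]
</PRE>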



<P> <A NAME="Sol7.6a">
<STRONG> Solution To Exercise 7.6a</STRONG> </A> <P>



<PRE>
set Space_ "\[ \t]"
set Labl_ "\[^ \t]+"
set Int_ {[0-9]+}
regexp "$Space_*($Labl_)$Space_+($Int_)$Space_+($Int_)" $Line \
       Junk Label Before After
</PRE>
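
<P> The sketch below runs the pattern on a made-up line.  The value of <TT>Line</TT> is
hypothetical, and <TT>Int_</TT> is written with the <TT>+</TT> repeater on the assumption that
each integer has at least one digit.

<PRE>
set Space_ "\[ \t]"
set Labl_ "\[^ \t]+"
set Int_ {[0-9]+}   ;# assumes each integer has at least one digit

# Hypothetical input line: a label followed by two integers.
set Line "  total 12 34"

if [regexp "$Space_*($Labl_)$Space_+($Int_)$Space_+($Int_)" $Line \
           Junk Label Before After] {
    puts "$Label $Before $After"   ;# total 12 34
}
</PRE>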


<P> <A NAME="Sol7.7a">
<STRONG> Solution To Exercise 7.7a</STRONG> </A> <P>



<PRE>
regsub -all &#38; $Str &#38;&#38; Str
</PRE>



<P> <A NAME="Sol7.7b">
<STRONG> Solution To Exercise 7.7b</STRONG> </A> <P>



<PRE>
set ToLft_ "^|\[^a-zA-Z]"
set ToRght_ "\[^a-zA-Z]|$"
regsub -all ($ToLft_)cat($ToRght_) $Str \\1dog\\2 Str

regsub -all ($ToLft_)cat(s?)($ToRght_) $Str \\1dog\\2\\3 Str
</PRE>
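
<P> To see the two substitutions at work, here is a sketch with hypothetical input
strings.  The first shows that a word merely containing "cat" is left alone; the
second shows the optional "s" being carried over.

<PRE>
set ToLft_ "^|\[^a-zA-Z]"
set ToRght_ "\[^a-zA-Z]|$"

# Hypothetical input: the word catalog must not be changed.
set Str "the cat sat on a catalog"
regsub -all ($ToLft_)cat($ToRght_) $Str \\1dog\\2 Str
puts $Str   ;# the dog sat on a catalog

# Hypothetical input: the plural keeps its final s.
set Str "two cats"
regsub -all ($ToLft_)cat(s?)($ToRght_) $Str \\1dog\\2\\3 Str
puts $Str   ;# two dogs
</PRE>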








<!-- Linkbar -->
<P><CENTER><FONT SIZE=2><NOBR>
<STRONG>From</STRONG>
<A HREF="http://www.mapfree.com/sbf/tcl/book/home.html" tppabs="http://www.mapfree.com/sbf/tcl/book/home.html">Tcl/Tk For Programmers</A><WBR>
<STRONG>Previous</STRONG>
<A HREF="7.8.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.8.html">section</A><WBR>
<STRONG>All</STRONG>
<A HREF="7.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/7.html">sections</A><WBR>
<STRONG>Author</STRONG>
<A HREF="http://www.mapfree.com/mp/jaz/home.html" tppabs="http://www.mapfree.com/mp/jaz/home.html">J. A. Zimmer</A><WBR>
<STRONG>Copyright</STRONG>
<A HREF="copyright.html" tppabs="http://www.mapfree.com/sbf/tcl/book/select/Html/copyright.html">Notice</A><WBR>
<P>
<I>Jun 17, 1998</I>
 </NOBR></FONT></CENTER></BODY></HTML>

