<HTML><!--Distributed by F --><HEAD><TITLE>[Chapter 29] 29.9 Looking for Closure </TITLE>
<META NAME="DC.title" CONTENT="UNIX Power Tools">
<META NAME="DC.creator" CONTENT="Jerry Peek, Tim O'Reilly &amp; Mike Loukides">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1998-08-04T21:45:11Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-260-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch29_01.htm" TITLE="29. Spell Checking, Word Counting, and Textual Analysis">
<LINK REL="prev" HREF="ch29_08.htm" TITLE="29.8 Find a a Doubled Word ">
<LINK REL="next" HREF="ch29_10.htm" TITLE="29.10 Just the Words, Please "></HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000"><DIV CLASS="htmlnav"><H1><IMG SRC="gifs/smbanner.gif" ALT="UNIX Power Tools" USEMAP="#srchmap" BORDER="0"></H1>
<MAP NAME="srchmap"><AREA SHAPE="RECT" COORDS="0,0,466,58" HREF="index.htm" ALT="UNIX Power Tools"><AREA SHAPE="RECT" COORDS="467,0,514,18" HREF="jobjects/fsearch.htm" ALT="Search this book"></MAP>
<TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch29_08.htm" TITLE="29.8 Find a a Doubled Word "><IMG SRC="gifs/txtpreva.gif" ALT="Previous: 29.8 Find a a Doubled Word " BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1">Chapter 29<BR>Spell Checking, Word Counting, and Textual Analysis</FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch29_10.htm" TITLE="29.10 Just the Words, Please "><IMG SRC="gifs/txtnexta.gif" ALT="Next: 29.10 Just the Words, Please " BORDER="0"></A></TD></TR></TABLE>
<HR ALIGN="LEFT" WIDTH="515" TITLE="footer"></DIV>
<DIV CLASS="SECT1"><H2 CLASS="sect1"><A CLASS="title" NAME="UPT-ART-0038">29.9 Looking for Closure </A></H2>
<P CLASS="para"><A CLASS="indexterm" NAME="AUTOID-32394"></A><A CLASS="indexterm" NAME="AUTOID-32396"></A>A common problem in text processing is making sure that items that need to occur in pairs actually do so.</P>
<P CLASS="para">Most UNIX text editors include support for making sure that elements of C syntax such as parentheses and braces are closed properly. There's much less support for making sure that textual documents, such as <SPAN CLASS="link"><EM CLASS="emphasis">troff</EM> (<A CLASS="linkend" HREF="ch43_13.htm" TITLE="The Text Formatters nroff, troff, ditroff, ... ">43.13</A>)</SPAN> source files, have the proper structure. For example, tables must start with a <CODE CLASS="literal">.TS</CODE> macro and end with <CODE CLASS="literal">.TE</CODE>. HTML documents that start a list with <CODE CLASS="literal">&lt;UL&gt;</CODE> need a closing <CODE CLASS="literal">&lt;/UL&gt;</CODE>.</P>
<P CLASS="para">UNIX provides a number of tools that might help you to tackle this problem. Here's a script written by Dale Dougherty that uses <EM CLASS="emphasis">awk</EM> to make sure that <CODE CLASS="literal">.TS</CODE> and <CODE CLASS="literal">.TE</CODE> macros come in pairs:<A CLASS="indexterm" NAME="AUTOID-32410"></A></P>
<P CLASS="para"><TABLE CLASS="screen.co" BORDER="1"><TR><TH VALIGN="TOP"><PRE CLASS="calloutlist"> <A CLASS="co" HREF="ch33_12.htm" TITLE="33.12 Versions of awk ">gawk</A> </PRE></TH><TD VALIGN="TOP"><PRE CLASS="screen">
#! /usr/local/bin/<CODE CLASS="literal">gawk</CODE> -f
BEGIN {
    inTable = 0
    TSlineno = 0
    TElineno = 0
    prevFile = ""
}
# check for unclosed table in first file, when more than one file
FILENAME != prevFile {
    if (inTable)
        printf ("%s: found .TS at File %s: %d without .TE before end of file\n",
            $0, prevFile, TSlineno)
    inTable = 0
    prevFile = FILENAME
}
# match TS and see if we are in Table
/^\.TS/ {
    if (inTable) {
        printf("%s: nested starts, File %s: line %d and %d\n",
            $0, FILENAME, TSlineno, FNR)
    }
    inTable = 1
    TSlineno = FNR
}
# match TE; it must follow an open TS
/^\.TE/ {
    if (! inTable)
        printf("%s: too many ends, File %s: line %d and %d\n",
            $0, FILENAME, TElineno, FNR)
    else
        inTable = 0
    TElineno = FNR
}
# this catches end of input
END {
    if (inTable)
        printf ("found .TS at File %s: %d without .TE before end of file\n",
            FILENAME, TSlineno)
}</PRE></TD></TR></TABLE></P>
<P CLASS="para">You can adapt this type of script for any place you need to check for something that has a start and finish.</P>
<P CLASS="para">A more complete syntax checking program could be written with the help of a lexical analyzer like <EM CLASS="emphasis">lex</EM>. <EM CLASS="emphasis">lex</EM> is normally used by experienced C programmers, but it can be used profitably by someone who has mastered <EM CLASS="emphasis">awk</EM> and is just beginning with C, since it combines an <EM CLASS="emphasis">awk</EM>-like pattern-matching process using regular expression syntax with actions written in the more powerful and flexible C language. (See O'Reilly &amp; Associates' <EM CLASS="emphasis">lex &amp; yacc</EM>.)</P>
<P CLASS="para">And of course, this kind of problem could be very easily tackled in <SPAN CLASS="link"><EM CLASS="emphasis">perl</EM> (<A CLASS="linkend" HREF="ch37_01.htm#UPT-ART-5560" TITLE="What We Do and Don't Tell You About Perl ">37.1</A>)</SPAN>.</P>
<DIV CLASS="sect1info"><P CLASS="SECT1INFO">- 
<SPAN CLASS="authorinitials">TOR</SPAN></P></DIV></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="515" TITLE="footer">
<TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch29_08.htm" TITLE="29.8 Find a a Doubled Word "><IMG SRC="gifs/txtpreva.gif" ALT="Previous: 29.8 Find a a Doubled Word " BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="book" HREF="index.htm" TITLE="UNIX Power Tools"><IMG SRC="gifs/txthome.gif" ALT="UNIX Power Tools" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch29_10.htm" TITLE="29.10 Just the Words, Please "><IMG SRC="gifs/txtnexta.gif" ALT="Next: 29.10 Just the Words, Please " BORDER="0"></A></TD></TR>
<TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172">29.8 Find a a Doubled Word </TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="index" HREF="index/idx_0.htm" TITLE="Book Index"><IMG SRC="gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172">29.10 Just the Words, Please </TD></TR></TABLE>
<HR ALIGN="LEFT" WIDTH="515" TITLE="footer"><IMG SRC="gifs/smnavbar.gif" USEMAP="#map" BORDER="0" ALT="The UNIX CD Bookshelf Navigation">
<MAP NAME="map"><AREA SHAPE="RECT" COORDS="0,0,73,21" HREF="../index.htm" ALT="The UNIX CD Bookshelf"><AREA SHAPE="RECT" COORDS="74,0,163,21" HREF="index.htm" ALT="UNIX Power Tools"><AREA SHAPE="RECT" COORDS="164,0,257,21" HREF="../unixnut/index.htm" ALT="UNIX in a Nutshell"><AREA SHAPE="RECT" COORDS="258,0,321,21" HREF="../vi/index.htm" ALT="Learning the vi Editor"><AREA SHAPE="RECT" COORDS="322,0,378,21" HREF="../sedawk/index.htm" ALT="sed &amp; awk"><AREA SHAPE="RECT" COORDS="379,0,438,21" HREF="../ksh/index.htm" ALT="Learning the Korn Shell"><AREA SHAPE="RECT" COORDS="439,0,514,21" HREF="../lrnunix/index.htm" ALT="Learning the UNIX Operating System"></MAP></DIV></BODY></HTML>