<HTML><!--Distributed by F --><HEAD>
<TITLE>[Chapter 35] 35.11 Hacking on Characters with tr </TITLE>
<META NAME="DC.title" CONTENT="UNIX Power Tools">
<META NAME="DC.creator" CONTENT="Jerry Peek, Tim O'Reilly &amp; Mike Loukides">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1998-08-04T21:48:11Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-260-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch35_01.htm" TITLE="35. You Can't Quite Call This Editing">
<LINK REL="prev" HREF="ch35_10.htm" TITLE="35.10 Splitting Files by Context: csplit ">
<LINK REL="next" HREF="ch35_12.htm" TITLE="35.12 Converting Between ASCII and EBCDIC ">
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<DIV CLASS="htmlnav">
<H1><IMG SRC="gifs/smbanner.gif" ALT="UNIX Power Tools" USEMAP="#srchmap" BORDER="0"></H1>
<MAP NAME="srchmap">
<AREA SHAPE="RECT" COORDS="0,0,466,58" HREF="index.htm" ALT="UNIX Power Tools">
<AREA SHAPE="RECT" COORDS="467,0,514,18" HREF="jobjects/fsearch.htm" ALT="Search this book">
</MAP>
<TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_10.htm" TITLE="35.10 Splitting Files by Context: csplit "><IMG SRC="gifs/txtpreva.gif" SRC="gifs/txtpreva.gif" ALT="Previous: 35.10 Splitting Files by Context: csplit " BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1">Chapter 35<BR>You Can't Quite Call This Editing</FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_12.htm" TITLE="35.12 Converting Between ASCII and EBCDIC "><IMG SRC="gifs/txtnexta.gif" SRC="gifs/txtnexta.gif" ALT="Next: 35.12 Converting Between ASCII and EBCDIC " BORDER="0"></A></TD>
</TR>
</TABLE>&nbsp;<HR ALIGN="LEFT" WIDTH="515" TITLE="footer">
</DIV>
<DIV CLASS="SECT1">
<H2 CLASS="sect1"><A CLASS="title" NAME="UPT-ART-3780">35.11 Hacking on Characters with tr </A></H2>
<P CLASS="para">The <EM CLASS="emphasis">tr</EM> command is a character translation filter, reading <SPAN CLASS="link">standard input (<A CLASS="linkend" HREF="ch13_01.htm#UPT-ART-1023" TITLE="Using Standard Input and Output">13.1</A>)</SPAN> and either deleting specific characters or substituting one character for another.</P>
<P CLASS="para">The most common use of <EM CLASS="emphasis">tr</EM> is to change each character in one string to the corresponding character in a second string. (A string of consecutive <SPAN CLASS="link">ASCII (<A CLASS="linkend" HREF="ch51_03.htm" TITLE="ASCII Characters: Listing and Getting Values ">51.3</A>)</SPAN> characters can be represented as a hyphen-separated range.)<A CLASS="indexterm" NAME="AUTOID-40678"></A></P>
<P CLASS="para">For example, the command:</P>
<P CLASS="para"><TABLE CLASS="screen.co" BORDER="1"><TR><TH VALIGN="TOP"><PRE CLASS="calloutlist"><A CLASS="co" HREF="ch13_01.htm" TITLE="13.1 Using Standard Input and Output">&lt;</A> </PRE></TH><TD VALIGN="TOP"><PRE CLASS="screen">$ <CODE CLASS="userinput"><B>tr 'A-Z' 'a-z' &lt;</B></CODE><CODE CLASS="replaceable"><I> file</I></CODE>   <EM CLASS="emphasis">Berkeley version</EM></PRE></TD></TR></TABLE></P>
<P CLASS="para">will convert all uppercase characters in <EM CLASS="emphasis">file</EM> to the equivalent lowercase characters. The result is printed on standard output.</P>
<P CLASS="para">In the System V version of <EM CLASS="emphasis">tr</EM>, square brackets must surround any range of characters. That is, you have to say: <CODE CLASS="literal">[a-z]</CODE> instead of simply <CODE CLASS="literal">a-z</CODE>. And of course, because square brackets are meaningful to the shell, you must protect them from interpretation by putting the string in quotes.</P>
<P CLASS="para">If you aren't sure which version you have, here's a test. The Berkeley version converts the input <CODE CLASS="literal">[]</CODE> to <CODE CLASS="literal">A</CODE> characters because <CODE CLASS="literal">[]</CODE> aren't treated as range operators:</P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">% <CODE CLASS="userinput"><B>echo '[]' | tr '[a-z]' A</B></CODE>
AA <I CLASS="lineannotation">Berkeley version</I>
% <CODE CLASS="userinput"><B>echo '[]' | tr '[a-z]' A</B></CODE>
[] <I CLASS="lineannotation">System V version</I></PRE></BLOCKQUOTE></P>
<P CLASS="para">There's one place you don't have to worry about the difference between the two versions: when you're converting one range to another range, and both ranges have the same number of characters. For example, this command works in both versions:</P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">$ <CODE CLASS="userinput"><B>tr '[A-Z]' '[a-z]' &lt; </B></CODE><CODE CLASS="replaceable"><I>file</I></CODE> <I CLASS="lineannotation">both versions</I></PRE></BLOCKQUOTE></P>
<P CLASS="para">The Berkeley <EM CLASS="emphasis">tr</EM> will convert a <CODE CLASS="literal">[</CODE> from the first string into the same character <CODE CLASS="literal">[</CODE> in the second string, and the same for the <CODE CLASS="literal">]</CODE> characters. The System V version uses the <CODE CLASS="literal">[]</CODE> characters as range operators. In both versions, you get what you want: the range <CODE CLASS="literal">A-Z</CODE> is converted to the corresponding range <CODE CLASS="literal">a-z</CODE>. Again, this trick works only when both ranges have the same number of characters.</P>
<P CLASS="para">The System V version also has a nice feature: the syntax <CODE CLASS="literal">[a*n]</CODE>, where <EM CLASS="emphasis">n</EM> is some digit, means that the string should consist of <EM CLASS="emphasis">n</EM> repetitions of character &quot;a.&quot; If <EM CLASS="emphasis">n</EM> isn't specified, or is 0, it is taken to be some indefinitely large number. This is useful if you don't know how many characters might be included in the first string.</P>
<P CLASS="para">As described in article <A CLASS="xref" HREF="ch30_22.htm" TITLE="Filtering Text Through a UNIX Command ">30.22</A>, this translation (and the reverse) can be useful from within <EM CLASS="emphasis">vi</EM> for translating a string. You can also delete specific characters. The <EM CLASS="emphasis">-d</EM> option deletes from the input each occurrence of one or more characters specified in a string (special characters should be placed within quotation marks to protect them from the shell). For instance, the following command passes to standard output the contents of <EM CLASS="emphasis">file</EM> with all punctuation deleted (and is a great exercise in <SPAN CLASS="link">shell quoting (<A CLASS="linkend" HREF="ch08_14.htm" TITLE="Bourne Shell Quoting ">8.14</A>)</SPAN>):<A CLASS="indexterm" NAME="AUTOID-40729"></A></P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">$ <CODE CLASS="userinput"><B>tr -d &quot;,.!?;:'&quot;'&quot;`' &lt; </B></CODE><CODE CLASS="replaceable"><I>file</I></CODE></PRE></BLOCKQUOTE></P>
<P CLASS="para">The <EM CLASS="emphasis">-s</EM> (<EM CLASS="emphasis">squeeze</EM>) option of <EM CLASS="emphasis">tr</EM> removes multiple consecutive occurrences of the same character in the second argument. For example, the command:</P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">$ <CODE CLASS="userinput"><B>tr -s &quot; &quot; &quot; &quot; &lt;</B></CODE> <CODE CLASS="replaceable"><I>file</I></CODE></PRE></BLOCKQUOTE></P>
<P CLASS="para">will print on standard output a copy of <EM CLASS="emphasis">file</EM> in which multiple spaces in sequence have been replaced with a single space.</P>
<P CLASS="para">We've also found <EM CLASS="emphasis">tr</EM> useful when converting documents created on other systems for use under UNIX. For example, as described in article <A CLASS="xref" HREF="ch01_05.htm" TITLE="Anyone Can Program the Shell ">1.5</A>, <EM CLASS="emphasis">tr</EM> can be used to change the carriage returns at the end of each line in a Macintosh text file into the newline UNIX expects. <EM CLASS="emphasis">tr</EM> allows you to specify characters as octal values by preceding the value with a backslash, so the command:<A CLASS="indexterm" NAME="AUTOID-40751"></A></P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">$ <CODE CLASS="userinput"><B>tr '\015' '\012' &lt; file.mac &gt; file.unix</B></CODE></PRE></BLOCKQUOTE></P>
<P CLASS="para">does the trick.</P>
<P CLASS="para">The command:</P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">$ <CODE CLASS="userinput"><B>tr -d '\015' &lt; pc.file</B></CODE></PRE></BLOCKQUOTE></P>
<P CLASS="para">will remove the carriage return from the carriage return/newline pair that a PC file uses as a line terminator. (This command is also handy for removing the excess carriage returns from a file created with <SPAN CLASS="link"><EM CLASS="emphasis">script</EM> (<A CLASS="linkend" HREF="ch51_05.htm" TITLE="Copy What You Do with script ">51.5</A>)</SPAN>.)</P>
<P CLASS="para">Article <A CLASS="xref" HREF="ch29_10.htm" TITLE="Just the Words, Please ">29.10</A> uses <EM CLASS="emphasis">tr</EM> to split sentences into words.<A CLASS="indexterm" NAME="AUTOID-40767"></A></P>
<DIV CLASS="sect1info"><P CLASS="SECT1INFO">- <SPAN CLASS="authorinitials">TOR</SPAN>, <SPAN CLASS="authorinitials">JP</SPAN></P></DIV>
</DIV>
<DIV CLASS="htmlnav">
<P></P>
<HR ALIGN="LEFT" WIDTH="515" TITLE="footer">
<TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_10.htm" TITLE="35.10 Splitting Files by Context: csplit "><IMG SRC="gifs/txtpreva.gif" SRC="gifs/txtpreva.gif" ALT="Previous: 35.10 Splitting Files by Context: csplit " BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="book" HREF="index.htm" TITLE="UNIX Power Tools"><IMG SRC="gifs/txthome.gif" SRC="gifs/txthome.gif" ALT="UNIX Power Tools" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_12.htm" TITLE="35.12 Converting Between ASCII and EBCDIC "><IMG SRC="gifs/txtnexta.gif" SRC="gifs/txtnexta.gif" ALT="Next: 35.12 Converting Between ASCII and EBCDIC " BORDER="0"></A></TD>
</TR>
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172">35.10 Splitting Files by Context: csplit </TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="index" HREF="index/idx_0.htm" TITLE="Book Index"><IMG SRC="gifs/index.gif" SRC="gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172">35.12 Converting Between ASCII and EBCDIC </TD>
</TR>
</TABLE>
<HR ALIGN="LEFT" WIDTH="515" TITLE="footer">
<IMG SRC="gifs/smnavbar.gif" SRC="gifs/smnavbar.gif" USEMAP="#map" BORDER="0" ALT="The UNIX CD Bookshelf Navigation">
<MAP NAME="map">
<AREA SHAPE="RECT" COORDS="0,0,73,21" HREF="../index.htm" ALT="The UNIX CD Bookshelf">
<AREA SHAPE="RECT" COORDS="74,0,163,21" HREF="index.htm" ALT="UNIX Power Tools">
<AREA SHAPE="RECT" COORDS="164,0,257,21" HREF="../unixnut/index.htm" ALT="UNIX in a Nutshell">
<AREA SHAPE="RECT" COORDS="258,0,321,21" HREF="../vi/index.htm" ALT="Learning the vi Editor">
<AREA SHAPE="RECT" COORDS="322,0,378,21" HREF="../sedawk/index.htm" ALT="sed &amp; awk">
<AREA SHAPE="RECT" COORDS="379,0,438,21" HREF="../ksh/index.htm" ALT="Learning the Korn Shell">
<AREA SHAPE="RECT" COORDS="439,0,514,21" HREF="../lrnunix/index.htm" ALT="Learning the UNIX Operating System">
</MAP>
</DIV>
</BODY></HTML>
