<HTML><!--Distributed by F --><HEAD><TITLE>[Chapter 25] 25.7 Show Non-Printing Characters with cat -v or od -c </TITLE><META NAME="DC.title" CONTENT="UNIX Power Tools"><META NAME="DC.creator" CONTENT="Jerry Peek, Tim O'Reilly & Mike Loukides"><META NAME="DC.publisher" CONTENT="O'Reilly & Associates, Inc."><META NAME="DC.date" CONTENT="1998-08-04T21:43:11Z"><META NAME="DC.type" CONTENT="Text.Monograph"><META NAME="DC.format" CONTENT="text/html" SCHEME="MIME"><META NAME="DC.source" CONTENT="1-56592-260-3" SCHEME="ISBN"><META NAME="DC.language" CONTENT="en-US"><META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0"><LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments"><LINK REL="up" HREF="ch25_01.htm" TITLE="25. Showing What's in a File"><LINK REL="prev" HREF="ch25_06.htm" TITLE="25.6 What's in That White Space? "><LINK REL="next" HREF="ch25_08.htm" TITLE="25.8 Finding File Types "></HEAD><BODY BGCOLOR="#FFFFFF" TEXT="#000000"><DIV CLASS="htmlnav"><H1><IMG SRC="gifs/smbanner.gif" ALT="UNIX Power Tools" USEMAP="#srchmap" BORDER="0"></H1><MAP NAME="srchmap"><AREA SHAPE="RECT" COORDS="0,0,466,58" HREF="index.htm" ALT="UNIX Power Tools"><AREA SHAPE="RECT" COORDS="467,0,514,18" HREF="jobjects/fsearch.htm" ALT="Search this book"></MAP><TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch25_06.htm" TITLE="25.6 What's in That White Space? "><IMG SRC="gifs/txtpreva.gif" SRC="gifs/txtpreva.gif" ALT="Previous: 25.6 What's in That White Space? " BORDER="0"></A></TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1">Chapter 25<BR>Showing What's in a File</FONT></B></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch25_08.htm" TITLE="25.8 Finding File Types "><IMG SRC="gifs/txtnexta.gif" SRC="gifs/txtnexta.gif" ALT="Next: 25.8 Finding File Types " BORDER="0"></A></TD></TR></TABLE> <HR ALIGN="LEFT" WIDTH="515" TITLE="footer"></DIV><DIV CLASS="SECT1"><H2 CLASS="sect1"><A CLASS="title" NAME="UPT-ART-2640">25.7 Show Non-Printing Characters with cat -v or od -c </A></H2><P CLASS="para"><A CLASS="indexterm" NAME="UPT-ART-2640-IX-CAT-COMMAND-V-OPTION-V-OPTION"></A><A CLASS="indexterm" NAME="UPT-ART-2640-IX-OD-COMMAND-C-OPTION-C-OPTION"></A><A CLASS="indexterm" NAME="UPT-ART-2640-IX-FILES-DISPLAYING-CONTENTS-OF-CAT-COMMAND"></A><A CLASS="indexterm" NAME="UPT-ART-2640-IX-FILES-DISPLAYING-CONTENTS-OF-OD-UTILITY"></A><A CLASS="indexterm" NAME="AUTOID-27513"></A>Especially if you use an ASCII-based terminal, files can have characters that your terminal can't display. Some characters will lock up your communications software or hardware, make your screen look strange, or cause other weird problems. So if you'd like to look at a file and you aren't sure what's in there, it's not a good idea to just <EM CLASS="emphasis">cat</EM> the file!</P><P CLASS="para">Instead, try <EM CLASS="emphasis">cat -v</EM>. It turns non-printable characters into a printable form. In fact, although most manual pages don't explain how, you can read the output and see what's in the file. Another utility for displaying non-printable files is <EM CLASS="emphasis">od</EM>. I usually use its <EM CLASS="emphasis">-c</EM> option when I need to look at a file character by character.</P><P CLASS="para">Let's look at a file that's almost guaranteed to be unprintable: a directory file. This example is on a standard V7 (UNIX Version 7) filesystem. (Unfortunately, some UNIX systems won't let you read a directory. If you want to follow along 
on one of those systems, try a <SPAN CLASS="link">compressed file (<A CLASS="linkend" HREF="ch24_07.htm" TITLE="Compressing Files to Save Space ">24.7</A>)</SPAN> or an executable program from <EM CLASS="emphasis">/bin</EM>.) A directory usually has some long lines, so it's a good idea to pipe <EM CLASS="emphasis">cat</EM>'s output through <SPAN CLASS="link"><EM CLASS="emphasis">fold</EM> (<A CLASS="linkend" HREF="ch43_08.htm" TITLE="Fixing Margins with pr and fold ">43.8</A>)</SPAN>:</P><P CLASS="para"><TABLE CLASS="screen.co" BORDER="1"><TR><TH VALIGN="TOP"><PRE CLASS="calloutlist"><A CLASS="co" HREF="ch24_16.htm" TITLE="24.16 Trimming a Huge Directory ">-f</A> </PRE></TH><TD VALIGN="TOP"><PRE CLASS="screen">% <CODE CLASS="userinput"><B>ls -fa</B></CODE>
.
..
comp
% <CODE CLASS="userinput"><B>cat -v . | fold -62</B></CODE>
M-^?^N.^@^@^@^@^@^@^@^@^@^@^@^@^@>^G..^@^@^@^@^@^@^@^@^@^@^@^@M-a
comp^@^@^@^@^@^@^@^@^@^@^@^@MassAveFood^@^@^@^@^@hist^@^@^@^@^@^@^@^@^@^@
% <CODE CLASS="userinput"><B>od -c .</B></CODE>
0000000 377 016   .  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000020   > 007   .   .  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000040 341  \n   c   o   m   p  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000060  \0  \0   M   a   s   s   A   v   e   F   o   o   d  \0  \0  \0
0000100  \0  \0   h   i   s   t  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000120</PRE></TD></TR></TABLE></P><P CLASS="para">Each entry in a V7-type directory is 16 bytes long (that's also 16 characters, in the <SPAN CLASS="link">ASCII (<A CLASS="linkend" HREF="ch51_03.htm" TITLE="ASCII Characters: Listing and Getting Values ">51.3</A>)</SPAN> system). The <EM CLASS="emphasis">od -c</EM> command starts each line with the number of bytes, in octal, shown since the start of the file. The first line starts at byte 0. The second line starts at byte 20 (that's byte 16 in decimal, the way most of us count). And so on. Enough about <EM CLASS="emphasis">od</EM> for now, though. We'll come back in a minute. Time to dissect the <EM CLASS="emphasis">cat -v</EM> output:</P><UL CLASS="itemizedlist"><LI CLASS="listitem"><P CLASS="para">You've probably seen sequences like <CODE CLASS="literal">^N</CODE> and <CODE CLASS="literal">^G</CODE>. Those are control characters. (Find them in the <EM CLASS="emphasis">cat -v</EM> output, please.)</P><P CLASS="para">Another character like this is <CODE CLASS="literal">^@</CODE>, the character NUL (ASCII 0). There are a lot of NULs in the directory; more about that below. A DEL character (ASCII 177 octal) is shown as <CODE CLASS="literal">^?</CODE>. Check an <SPAN CLASS="link">ASCII chart (<A CLASS="linkend" HREF="ch51_03.htm" TITLE="ASCII Characters: Listing and Getting Values ">51.3</A>)</SPAN>.</P></LI><LI CLASS="listitem"><P CLASS="para"><EM CLASS="emphasis">cat -v</EM> has its own symbol for characters outside the ASCII range with their high bits set, also called metacharacters. <EM CLASS="emphasis">cat -v</EM> prints those as <CODE CLASS="literal">M-</CODE> followed by another character. There are two of them in the <EM CLASS="emphasis">cat -v</EM> output: <CODE CLASS="literal">M-^?</CODE> and <CODE CLASS="literal">M-a</CODE>.</P><P CLASS="para">To get a metacharacter, you add 200 
octal. "Say what?" Let's look at <CODE CLASS="literal">M-a</CODE> first. The octal value of the letter <CODE CLASS="literal">a</CODE> is 141. When <EM CLASS="emphasis">cat -v</EM> prints <CODE CLASS="literal">M-a</CODE>, it means the character you get by adding 141+200, or 341 octal.</P><P CLASS="para">You can decode the character <EM CLASS="emphasis">cat</EM> prints as <CODE CLASS="literal">M-^?</CODE> in the same way. The <CODE CLASS="literal">^?</CODE> stands for the DEL character, which is octal 177. Add 200+177 to get 377 octal.</P></LI><LI CLASS="listitem"><P CLASS="para">If a character isn't <CODE CLASS="literal">M-</CODE><CODE CLASS="replaceable"><I>something</I></CODE> or <CODE CLASS="literal">^</CODE><CODE CLASS="replaceable"><I>something</I></CODE>, it's a regular printable character. The entries in the directory (<CODE CLASS="literal">.</CODE>, <CODE CLASS="literal">..</CODE>, <CODE CLASS="literal">comp</CODE>, <CODE CLASS="literal">MassAveFood</CODE>, and <CODE CLASS="literal">hist</CODE>) are all made of regular ASCII characters.</P><P CLASS="para">If you're wondering where the entries <CODE CLASS="literal">MassAveFood</CODE> and <CODE CLASS="literal">hist</CODE> are in the <EM CLASS="emphasis">ls</EM> listing, the answer is: they aren't. Those entries have been deleted from the directory. UNIX puts two NUL (ASCII 0, or <CODE CLASS="literal">^@</CODE>) bytes in front of the name when a file has been deleted.</P></LI></UL><P CLASS="para"><EM CLASS="emphasis">cat</EM> has two options, <EM CLASS="emphasis">-t</EM> and <EM CLASS="emphasis">-e</EM>, for displaying white space in a line. The <EM CLASS="emphasis">-v</EM> option doesn't convert TAB and trailing space characters to a visible form without those options. See article <A CLASS="xref" HREF="ch25_06.htm" TITLE="What's in That White Space? ">25.6</A>.</P><P CLASS="para">Next, time for <EM CLASS="emphasis">od -c</EM>; it's easier to explain than <EM CLASS="emphasis">cat -v</EM>:</P><UL CLASS="itemizedlist"><LI CLASS="listitem"><P CLASS="para"><EM CLASS="emphasis">od -c</EM> shows some characters starting with a backslash (<CODE CLASS="literal">\</CODE>). It uses the standard UNIX and C abbreviations for <SPAN CLASS="link">control characters (<A CLASS="linkend" HREF="glossary.htm#UPT-ART-1010" TITLE="Glossary">52.9</A>)</SPAN> where it can. For instance, <CODE CLASS="literal">\n</CODE> stands for a newline character, <CODE CLASS="literal">\t</CODE> for a tab, etc. There's a newline at the start of the <CODE CLASS="literal">comp</CODE> entry - see it in the <EM CLASS="emphasis">od -c</EM> output? That explains why the <EM CLASS="emphasis">cat -v</EM> output was broken onto a new line at that place: <EM CLASS="emphasis">cat -v</EM> doesn't translate newlines when it finds them.</P><P CLASS="para">The <CODE CLASS="literal">\0</CODE> is a NUL character (ASCII 0). It's used to pad the ends of entries in V7 directories when a name isn't the full 14 characters long.</P></LI><LI CLASS="listitem"><P CLASS="para"><EM CLASS="emphasis">od -c</EM> shows the octal value of other characters as three digits. For instance, the <CODE CLASS="literal">007</CODE> means "the character 7 octal." <EM CLASS="emphasis">cat -v</EM> shows this as <CODE CLASS="literal">^G</CODE> (CTRL-g).</P><P CLASS="para">Metacharacters, the ones with octal values 200 and above, are shown as <CODE CLASS="literal">M-</CODE><CODE CLASS="replaceable"><I>something</I></CODE> by <EM CLASS="emphasis">cat -v</EM>. In <EM CLASS="emphasis">od -c</EM>, you'll see their octal values - like <CODE CLASS="literal">341</CODE>.</P><P CLASS="para">Each directory entry on a UNIX Version 7 filesystem starts with a two-byte "pointer" to its location in the disk's inode table. When you type a filename, UNIX uses this pointer to find the actual file information on the disk. The entry for this directory (named 
<CODE CLASS="literal">.</CODE>) is <CODE CLASS="literal">377 016</CODE>. Its parent (named <CODE CLASS="literal">..</CODE>) is at <CODE CLASS="literal">> 007</CODE>. And <EM CLASS="emphasis">comp</EM>'s entry is <CODE CLASS="literal">341 \n</CODE>. Find those in the <EM CLASS="emphasis">cat -v</EM> output, if you want - and compare the two outputs.</P></LI><LI CLASS="listitem"><P CLASS="para">Like <EM CLASS="emphasis">cat -v</EM>, <EM CLASS="emphasis">od -c</EM> shows regular printable characters as is.</P></LI></UL><P CLASS="para">The <SPAN CLASS="link"><EM CLASS="emphasis">strings</EM> (<A CLASS="linkend" HREF="ch27_19.htm" TITLE="Finding Words Inside Binary Files ">27.19</A>)</SPAN> program finds printable strings of characters (such as filenames) inside mostly non-printable files (like executable binaries).<A CLASS="indexterm" NAME="AUTOID-27631"></A><A CLASS="indexterm" NAME="AUTOID-27632"></A><A CLASS="indexterm" NAME="AUTOID-27633"></A><A CLASS="indexterm" NAME="AUTOID-27634"></A><A CLASS="indexterm" NAME="AUTOID-27635"></A></P><DIV CLASS="sect1info"><P CLASS="SECT1INFO">- <SPAN CLASS="authorinitials">JP</SPAN></P></DIV></DIV><DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="515" TITLE="footer"><TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch25_06.htm" TITLE="25.6 What's in That White Space? "><IMG SRC="gifs/txtpreva.gif" SRC="gifs/txtpreva.gif" ALT="Previous: 25.6 What's in That White Space? " BORDER="0"></A></TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="book" HREF="index.htm" TITLE="UNIX Power Tools"><IMG SRC="gifs/txthome.gif" SRC="gifs/txthome.gif" ALT="UNIX Power Tools" BORDER="0"></A></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch25_08.htm" TITLE="25.8 Finding File Types "><IMG SRC="gifs/txtnexta.gif" SRC="gifs/txtnexta.gif" ALT="Next: 25.8 Finding File Types " BORDER="0"></A></TD></TR><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172">25.6 What's in That White Space? 
</TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="index" HREF="index/idx_0.htm" TITLE="Book Index"><IMG SRC="gifs/index.gif" SRC="gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172">25.8 Finding File Types </TD></TR></TABLE><HR ALIGN="LEFT" WIDTH="515" TITLE="footer"><IMG SRC="gifs/smnavbar.gif" SRC="gifs/smnavbar.gif" USEMAP="#map" BORDER="0" ALT="The UNIX CD Bookshelf Navigation"><MAP NAME="map"><AREA SHAPE="RECT" COORDS="0,0,73,21" HREF="../index.htm" ALT="The UNIX CD Bookshelf"><AREA SHAPE="RECT" COORDS="74,0,163,21" HREF="index.htm" ALT="UNIX Power Tools"><AREA SHAPE="RECT" COORDS="164,0,257,21" HREF="../unixnut/index.htm" ALT="UNIX in a Nutshell"><AREA SHAPE="RECT" COORDS="258,0,321,21" HREF="../vi/index.htm" ALT="Learning the vi Editor"><AREA SHAPE="RECT" COORDS="322,0,378,21" HREF="../sedawk/index.htm" ALT="sed & awk"><AREA SHAPE="RECT" COORDS="379,0,438,21" HREF="../ksh/index.htm" ALT="Learning the Korn Shell"><AREA SHAPE="RECT" COORDS="439,0,514,21" HREF="../lrnunix/index.htm" ALT="Learning the UNIX Operating System"></MAP></DIV></BODY></HTML>