<HTML><!--Distributed by F --><HEAD><TITLE>[Chapter 35] 35.9 Splitting Files at Fixed Points: split </TITLE>
<META NAME="DC.title" CONTENT="UNIX Power Tools">
<META NAME="DC.creator" CONTENT="Jerry Peek, Tim O'Reilly & Mike Loukides">
<META NAME="DC.publisher" CONTENT="O'Reilly & Associates, Inc.">
<META NAME="DC.date" CONTENT="1998-08-04T21:48:06Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-260-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch35_01.htm" TITLE="35. You Can't Quite Call This Editing">
<LINK REL="prev" HREF="ch35_08.htm" TITLE="35.8 Centering Lines in a File ">
<LINK REL="next" HREF="ch35_10.htm" TITLE="35.10 Splitting Files by Context: csplit ">
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<DIV CLASS="htmlnav"><H1><IMG SRC="gifs/smbanner.gif" ALT="UNIX Power Tools" USEMAP="#srchmap" BORDER="0"></H1>
<MAP NAME="srchmap"><AREA SHAPE="RECT" COORDS="0,0,466,58" HREF="index.htm" ALT="UNIX Power Tools"><AREA SHAPE="RECT" COORDS="467,0,514,18" HREF="jobjects/fsearch.htm" ALT="Search this book"></MAP>
<TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_08.htm" TITLE="35.8 Centering Lines in a File "><IMG SRC="gifs/txtpreva.gif" ALT="Previous: 35.8 Centering Lines in a File " BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1">Chapter 35<BR>You Can't Quite Call This Editing</FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_10.htm" TITLE="35.10 Splitting Files by Context: csplit "><IMG SRC="gifs/txtnexta.gif" ALT="Next: 35.10 Splitting Files by Context: csplit " BORDER="0"></A></TD>
</TR></TABLE>
<HR ALIGN="LEFT" WIDTH="515" TITLE="footer"></DIV>
<DIV CLASS="SECT1"><H2 CLASS="sect1"><A CLASS="title" NAME="UPT-ART-2880">35.9 Splitting Files at Fixed Points: split </A></H2>
<P CLASS="para"><A CLASS="indexterm" NAME="UPT-ART-2880-IX-SPLIT-PROGRAM"></A>Most versions of UNIX come with a program called <EM CLASS="emphasis">split</EM> whose purpose is to split large files into smaller files for tasks such as editing them in an editor that cannot handle large files, or mailing them if they are so big that some mailers will refuse to deal with them. For example, let's say you have a really big text file that you want to mail to someone:</P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">% <CODE CLASS="userinput"><B>ls -l bigfile</B></CODE>
-r--r--r-- 1 jik 139070 Oct 15 21:02 bigfile</PRE></BLOCKQUOTE></P>
<P CLASS="para">Running <EM CLASS="emphasis">split</EM> on that file will (by default, with most versions of <EM CLASS="emphasis">split</EM>) break it up into pieces that are each no more than 1000 lines long:<A CLASS="indexterm" NAME="UPT-ART-2880-IX-TEXT-PROCESSING-SPLITTING-FILES"></A></P>
<P CLASS="para"><TABLE CLASS="screen.co" BORDER="1"><TR><TH VALIGN="TOP"><PRE CLASS="calloutlist"> <A CLASS="co" HREF="ch29_06.htm" TITLE="29.6 Counting Lines, Words, and Characters: wc ">wc</A> </PRE></TH><TD VALIGN="TOP"><PRE CLASS="screen">% <CODE CLASS="userinput"><B>ls -l</B></CODE>
total 283
-r--r--r-- 1 jik 139070 Oct 15 21:02 bigfile
-rw-rw-r-- 1 jik  46444 Oct 15 21:04 xaa
-rw-rw-r-- 1 jik  51619 Oct 15 21:04 xab
-rw-rw-r-- 1 jik  41007 Oct 15 21:04 xac
% <CODE CLASS="userinput"><B>wc -l x*</B></CODE>
    1000 xaa
    1000 xab
     932 xac
    2932 total</PRE></TD></TR></TABLE></P>
<P CLASS="para">Note the default naming scheme, which is to append "aa," "ab," "ac," etc., to the letter "x" for each subsequent filename. It is possible to modify the default behavior.
For example, you can make it create files that are 1500 lines long instead of 1000:</P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">% <CODE CLASS="userinput"><B>rm x??</B></CODE>
% <CODE CLASS="userinput"><B>split -1500 bigfile</B></CODE>
% <CODE CLASS="userinput"><B>ls -l</B></CODE>
total 288
-r--r--r-- 1 jik 139070 Oct 15 21:02 bigfile
-rw-rw-r-- 1 jik  74016 Oct 15 21:06 xaa
-rw-rw-r-- 1 jik  65054 Oct 15 21:06 xab</PRE></BLOCKQUOTE></P>
<P CLASS="para">You can also get it to use a name prefix other than "x":</P>
<P CLASS="para"><BLOCKQUOTE CLASS="screen"><PRE CLASS="screen">% <CODE CLASS="userinput"><B>rm x??</B></CODE>
% <CODE CLASS="userinput"><B>split -1500 bigfile bigfile.split.</B></CODE>
% <CODE CLASS="userinput"><B>ls -l</B></CODE>
total 288
-r--r--r-- 1 jik 139070 Oct 15 21:02 bigfile
-rw-rw-r-- 1 jik  74016 Oct 15 21:07 bigfile.split.aa
-rw-rw-r-- 1 jik  65054 Oct 15 21:07 bigfile.split.ab</PRE></BLOCKQUOTE></P>
<P CLASS="para">Although the simple behavior described above tends to be relatively universal, there are differences in the functionality of <EM CLASS="emphasis">split</EM> on different UNIX systems.
There are four basic variants of <EM CLASS="emphasis">split</EM> as shipped with various implementations of UNIX:</P>
<OL CLASS="orderedlist">
<LI CLASS="listitem"><P CLASS="para">A <EM CLASS="emphasis">split</EM> that understands only how to deal with splitting text files into chunks of <EM CLASS="emphasis">n</EM> lines or less each.</P></LI>
<LI CLASS="listitem"><TABLE CLASS="para.programreference" BORDER="1"><TR><TH VALIGN="TOP"><A CLASS="programreference" HREF="examples/index.htm" TITLE="bsplit">bsplit</A><BR></TH><TD VALIGN="TOP">A <EM CLASS="emphasis">split</EM>, usually called <EM CLASS="emphasis">bsplit</EM>, that understands only how to deal with splitting non-text files into <EM CLASS="emphasis">n</EM>-character chunks. A public domain version of <EM CLASS="emphasis">bsplit</EM> is available on the Power Tools disc.</TD></TR></TABLE></LI>
<LI CLASS="listitem"><P CLASS="para">A <EM CLASS="emphasis">split</EM> that will split text files into <EM CLASS="emphasis">n</EM>-line chunks, or non-text files into <EM CLASS="emphasis">n</EM>-character chunks, and tries to figure out automatically whether it's working on a text file or a non-text file.<A CLASS="indexterm" NAME="AUTOID-40496"></A></P></LI>
<LI CLASS="listitem"><P CLASS="para">A <EM CLASS="emphasis">split</EM> that will do either text files or non-text files, but needs to be told explicitly when it is working on a non-text file.</P></LI>
</OL>
<P CLASS="para">The only way to tell which version you've got is to read the manual page for it on your system, which will also tell you the exact syntax for using it.</P>
<P CLASS="para">The problem with the third variant is that although it tries to be smart and automatically do the right thing with both text and non-text files, it sometimes guesses wrong and splits a text file as a non-text file or vice versa, with completely unsatisfactory results.
Therefore, if the variant on your system is (3), you probably want to get your hands on one of the many <EM CLASS="emphasis">split</EM> clones out there that is closer to one of the other variants (see below).</P>
<P CLASS="para">Variants (1) and (2) listed above are OK as far as they go, but they aren't adequate if your environment provides only one of them rather than both. If you find yourself needing to split a non-text file when you have only a text <EM CLASS="emphasis">split</EM>, or needing to split a text file when you have only <EM CLASS="emphasis">bsplit</EM>, you need to get one of the clones that will perform the function you need.</P>
<P CLASS="para">Variant (4) is the most reliable and versatile of the four listed, and is therefore what you should go with if you find it necessary to get a clone and install it on your system. There are several such clones in the various source archives, including the freely available BSD UNIX version. Alternatively, if you have installed <SPAN CLASS="link"><EM CLASS="emphasis">perl</EM> (<A CLASS="linkend" HREF="ch37_01.htm#UPT-ART-5560" TITLE="What We Do and Don't Tell You About Perl ">37.1</A>)</SPAN>, it is quite easy to write a simple <EM CLASS="emphasis">split</EM> clone in <EM CLASS="emphasis">perl</EM>, and you don't have to worry about compiling a C program to do it; this is an especially significant advantage if you need to run your <EM CLASS="emphasis">split</EM> on multiple architectures that would need separate binaries.</P>
<P CLASS="para">If you need to split a non-text file and don't feel like going to all of the trouble of finding a <EM CLASS="emphasis">split</EM> clone that handles them, one standard UNIX tool you can use to do the splitting is <SPAN CLASS="link"><EM CLASS="emphasis">dd</EM> (<A CLASS="linkend" HREF="ch35_06.htm" TITLE="Low-Level File Butchery with dd ">35.6</A>)</SPAN>. For example, if <EM CLASS="emphasis">bigfile</EM> above were a non-text file and you wanted to split it into 20,000-byte pieces, you could do something
like this:</P>
<P CLASS="para"><TABLE CLASS="screen.co" BORDER="1"><TR><TH VALIGN="TOP"><PRE CLASS="calloutlist"> <A CLASS="co" HREF="ch44_16.htm" TITLE="44.16 Handling Command-Line Arguments with a for Loop ">for</A> <A CLASS="co" HREF="ch09_13.htm" TITLE="9.13 Multiline Commands, Secondary Prompts ">&gt;</A> <A CLASS="co" HREF="ch45_23.htm" TITLE="45.23 The Ins and Outs of Redirected I/O Loops ">done &lt;</A> </PRE></TH><TD VALIGN="TOP"><PRE CLASS="screen">$ <CODE CLASS="userinput"><B>ls -l bigfile</B></CODE>
-r--r--r-- 1 jik 139070 Oct 23 08:58 bigfile
$ for<CODE CLASS="userinput"><B> i in 1 2 3 4 5 6 7</B></CODE> <EM CLASS="emphasis"># </EM>[2]
&gt; <CODE CLASS="userinput"><B>do</B></CODE>
&gt; <CODE CLASS="userinput"><B>dd of=x$i bs=20000 count=1 2&gt;/dev/null</B></CODE> <EM CLASS="emphasis"># </EM>[3]
&gt; <CODE CLASS="userinput"><B>done &lt; bigfile</B></CODE>
$ <CODE CLASS="userinput"><B>ls -l</B></CODE>
total 279
-r--r--r-- 1 jik 139070 Oct 23 08:58 bigfile
-rw-rw-r-- 1 jik  20000 Oct 23 09:00 x1
-rw-rw-r-- 1 jik  20000 Oct 23 09:00 x2
-rw-rw-r-- 1 jik  20000 Oct 23 09:00 x3
-rw-rw-r-- 1 jik  20000 Oct 23 09:00 x4
-rw-rw-r-- 1 jik  20000 Oct 23 09:00 x5
-rw-rw-r-- 1 jik  20000 Oct 23 09:00 x6
-rw-rw-r-- 1 jik  19070 Oct 23 09:00 x7</PRE></TD></TR></TABLE></P>
<A CLASS="indexterm" NAME="AUTOID-40542"></A><DIV CLASS="sect1info"><P CLASS="SECT1INFO">- <SPAN CLASS="authorinitials">JIK</SPAN></P></DIV></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="515" TITLE="footer">
<TABLE WIDTH="515" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_08.htm" TITLE="35.8 Centering Lines in a File "><IMG SRC="gifs/txtpreva.gif" ALT="Previous: 35.8 Centering Lines in a File " BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="book" HREF="index.htm" TITLE="UNIX Power Tools"><IMG SRC="gifs/txthome.gif" ALT="UNIX Power Tools" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172"><A CLASS="SECT1" HREF="ch35_10.htm" TITLE="35.10
Splitting Files by Context: csplit "><IMG SRC="gifs/txtnexta.gif" ALT="Next: 35.10 Splitting Files by Context: csplit " BORDER="0"></A></TD>
</TR><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="172">35.8 Centering Lines in a File </TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="171"><A CLASS="index" HREF="index/idx_0.htm" TITLE="Book Index"><IMG SRC="gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="172">35.10 Splitting Files by Context: csplit </TD>
</TR></TABLE>
<HR ALIGN="LEFT" WIDTH="515" TITLE="footer"><IMG SRC="gifs/smnavbar.gif" USEMAP="#map" BORDER="0" ALT="The UNIX CD Bookshelf Navigation">
<MAP NAME="map"><AREA SHAPE="RECT" COORDS="0,0,73,21" HREF="../index.htm" ALT="The UNIX CD Bookshelf"><AREA SHAPE="RECT" COORDS="74,0,163,21" HREF="index.htm" ALT="UNIX Power Tools"><AREA SHAPE="RECT" COORDS="164,0,257,21" HREF="../unixnut/index.htm" ALT="UNIX in a Nutshell"><AREA SHAPE="RECT" COORDS="258,0,321,21" HREF="../vi/index.htm" ALT="Learning the vi Editor"><AREA SHAPE="RECT" COORDS="322,0,378,21" HREF="../sedawk/index.htm" ALT="sed & awk"><AREA SHAPE="RECT" COORDS="379,0,438,21" HREF="../ksh/index.htm" ALT="Learning the Korn Shell"><AREA SHAPE="RECT" COORDS="439,0,514,21" HREF="../lrnunix/index.htm" ALT="Learning the UNIX Operating System"></MAP>
</DIV></BODY></HTML>