<!-- This HTML file has been created by texi2html 1.27     from library.texinfo on 3 March 1994 -->
<TITLE>The GNU C Library - String and Array Utilities</TITLE>
<P>Go to the <A HREF="library_4.html" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_4.html">previous</A>, <A HREF="library_6.html" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_6.html">next</A> section.<P>
<H1><A NAME="SEC57" HREF="library_toc.html#SEC57" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC57">String and Array Utilities</A></H1>
<P>Operations on strings (or arrays of characters) are an important part of many programs.  The GNU C library provides an extensive set of string utility functions, including functions for copying, concatenating, comparing, and searching strings.  Many of these functions can also operate on arbitrary regions of storage; for example, the <CODE>memcpy</CODE> function can be used to copy the contents of any kind of array.
<P>It's fairly common for beginning C programmers to "reinvent the wheel" by duplicating this functionality in their own code, but it pays to become familiar with the library functions and to make use of them, since this offers benefits in maintenance, efficiency, and portability.
<P>For instance, you could easily compare one string to another in two lines of C code, but if you use the built-in <CODE>strcmp</CODE> function, you're less likely to make a mistake.  And, since these library functions are typically highly optimized, your program may run faster too.
<P><A NAME="IDX267"></A>
<H2><A NAME="SEC58" HREF="library_toc.html#SEC58" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC58">Representation of Strings</A></H2>
<P>This section is a quick summary of string concepts for beginning C programmers.  It describes how character strings are represented in C and some common pitfalls.
If you are already familiar with this material, you can skip this section.<A NAME="IDX268"></A><A NAME="IDX269"></A>
<P>A <DFN>string</DFN> is an array of <CODE>char</CODE> objects.  But string-valued variables are usually declared to be pointers of type <CODE>char *</CODE>.  Such variables do not include space for the text of a string; that has to be stored somewhere else--in an array variable, a string constant, or dynamically allocated memory (see section <A HREF="library_3.html#SEC18" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_3.html#SEC18">Memory Allocation</A>).  It's up to you to store the address of the chosen memory space into the pointer variable.  Alternatively you can store a <DFN>null pointer</DFN> in the pointer variable.  The null pointer does not point anywhere, so attempting to reference the string it points to gets an error.
<P>By convention, a <DFN>null character</DFN>, <CODE>'\0'</CODE>, marks the end of a string.  For example, in testing to see whether the <CODE>char *</CODE> variable <VAR>p</VAR> points to a null character marking the end of a string, you can write <CODE>!*<VAR>p</VAR></CODE> or <CODE>*<VAR>p</VAR> == '\0'</CODE>.
<P>A null character is quite different conceptually from a null pointer, although both are represented by the integer <CODE>0</CODE>.<A NAME="IDX270"></A>
<P><DFN>String literals</DFN> appear in C program source as strings of characters between double-quote characters (<SAMP>`"'</SAMP>).  In ANSI C, string literals can also be formed by <DFN>string concatenation</DFN>: <CODE>"a" "b"</CODE> is the same as <CODE>"ab"</CODE>.  Modification of string literals is not allowed by the GNU C compiler, because literals are placed in read-only storage.
<P>Character arrays that are declared <CODE>const</CODE> cannot be modified either.
It's generally good style to declare non-modifiable string pointers to be of type <CODE>const char *</CODE>, since this often allows the C compiler to detect accidental modifications as well as providing some amount of documentation about what your program intends to do with the string.
<P>The amount of memory allocated for the character array may extend past the null character that normally marks the end of the string.  In this document, the term <DFN>allocation size</DFN> is always used to refer to the total amount of memory allocated for the string, while the term <DFN>length</DFN> refers to the number of characters up to (but not including) the terminating null character.<A NAME="IDX272"></A><A NAME="IDX273"></A><A NAME="IDX274"></A><A NAME="IDX275"></A><A NAME="IDX271"></A>
<P>A notorious source of program bugs is trying to put more characters in a string than fit in its allocated size.  When writing code that extends strings or moves characters into a pre-allocated array, you should be very careful to keep track of the length of the text and make explicit checks for overflowing the array.  Many of the library functions <EM>do not</EM> do this for you!  Remember also that you need to allocate an extra byte to hold the null character that marks the end of the string.
<P>
<H2><A NAME="SEC59" HREF="library_toc.html#SEC59" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC59">String/Array Conventions</A></H2>
<P>This chapter describes both functions that work on arbitrary arrays or blocks of memory, and functions that are specific to null-terminated arrays of characters.
<P>Functions that operate on arbitrary blocks of memory have names beginning with <SAMP>`mem'</SAMP> (such as <CODE>memcpy</CODE>) and invariably take an argument which specifies the size (in bytes) of the block of memory to operate on.
The array arguments and return values for these functions have type <CODE>void *</CODE>, and as a matter of style, the elements of these arrays are referred to as "bytes".  You can pass any kind of pointer to these functions, and the <CODE>sizeof</CODE> operator is useful in computing the value for the size argument.
<P>In contrast, functions that operate specifically on strings have names beginning with <SAMP>`str'</SAMP> (such as <CODE>strcpy</CODE>) and look for a null character to terminate the string instead of requiring an explicit size argument to be passed.  (Some of these functions accept a specified maximum length, but they also check for premature termination with a null character.)  The array arguments and return values for these functions have type <CODE>char *</CODE>, and the array elements are referred to as "characters".
<P>In many cases, there are both <SAMP>`mem'</SAMP> and <SAMP>`str'</SAMP> versions of a function.  The one that is more appropriate to use depends on the exact situation.  When your program is manipulating arbitrary arrays or blocks of storage, then you should always use the <SAMP>`mem'</SAMP> functions.  On the other hand, when you are manipulating null-terminated strings it is usually more convenient to use the <SAMP>`str'</SAMP> functions, unless you already know the length of the string in advance.
<P>
<H2><A NAME="SEC60" HREF="library_toc.html#SEC60" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC60">String Length</A></H2>
<P>You can get the length of a string using the <CODE>strlen</CODE> function.  This function is declared in the header file <TT>`string.h'</TT>.<A NAME="IDX276"></A>
<P><A NAME="IDX277"></A><U>Function:</U> size_t <B>strlen</B> <I>(const char *<VAR>s</VAR>)</I>
<P>The <CODE>strlen</CODE> function returns the length of the null-terminated string <VAR>s</VAR>.
(In other words, it returns the offset of the terminating null character within the array.)
<P>For example,
<PRE>strlen ("hello, world")
    => 12</PRE>
<P>When applied to a character array, the <CODE>strlen</CODE> function returns the length of the string stored there, not its allocation size.  You can get the allocation size of the character array that holds a string using the <CODE>sizeof</CODE> operator:
<P><PRE>char string[32] = "hello, world";
sizeof (string)
    => 32
strlen (string)
    => 12</PRE>
<P>
<H2><A NAME="SEC61" HREF="library_toc.html#SEC61" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC61">Copying and Concatenation</A></H2>
<P>You can use the functions described in this section to copy the contents of strings and arrays, or to append the contents of one string to another.  These functions are declared in the header file <TT>`string.h'</TT>.<A NAME="IDX279"></A><A NAME="IDX280"></A><A NAME="IDX281"></A><A NAME="IDX282"></A><A NAME="IDX283"></A><A NAME="IDX278"></A>
<P>A helpful way to remember the ordering of the arguments to the functions in this section is that it corresponds to an assignment expression, with the destination array specified to the left of the source array.  Most of these functions return the address of the destination array; the exceptions, such as <CODE>memccpy</CODE> and <CODE>stpcpy</CODE>, are noted in their descriptions.
<P>Most of these functions do not work properly if the source and destination arrays overlap.  For example, if the beginning of the destination array overlaps the end of the source array, the original contents of that part of the source array may get overwritten before it is copied.  Even worse, in the case of the string functions, the null character marking the end of the string may be lost, and the copy function might get stuck in a loop trashing all the memory allocated to your program.
<P>All functions that have problems copying between overlapping arrays are explicitly identified in this manual.
In addition to the functions in this section, a few others have the same problem, such as <CODE>sprintf</CODE> (see section <A HREF="library_11.html#SEC135" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_11.html#SEC135">Formatted Output Functions</A>) and <CODE>scanf</CODE> (see section <A HREF="library_11.html#SEC153" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_11.html#SEC153">Formatted Input Functions</A>).
<P><A NAME="IDX284"></A><U>Function:</U> void * <B>memcpy</B> <I>(void *<VAR>to</VAR>, const void *<VAR>from</VAR>, size_t <VAR>size</VAR>)</I>
<P>The <CODE>memcpy</CODE> function copies <VAR>size</VAR> bytes from the object beginning at <VAR>from</VAR> into the object beginning at <VAR>to</VAR>.  The behavior of this function is undefined if the two arrays <VAR>to</VAR> and <VAR>from</VAR> overlap; use <CODE>memmove</CODE> instead if overlapping is possible.
<P>The value returned by <CODE>memcpy</CODE> is the value of <VAR>to</VAR>.
<P>Here is an example of how you might use <CODE>memcpy</CODE> to copy the contents of a <CODE>struct</CODE>:
<P><PRE>struct foo *old, *new;
...
memcpy (new, old, sizeof(struct foo));</PRE>
<P><A NAME="IDX285"></A><U>Function:</U> void * <B>memmove</B> <I>(void *<VAR>to</VAR>, const void *<VAR>from</VAR>, size_t <VAR>size</VAR>)</I>
<P><CODE>memmove</CODE> copies the <VAR>size</VAR> bytes at <VAR>from</VAR> into the <VAR>size</VAR> bytes at <VAR>to</VAR>, even if those two blocks of space overlap.  In the case of overlap, <CODE>memmove</CODE> is careful to copy the original values of the bytes in the block at <VAR>from</VAR>, including those bytes which also belong to the block at <VAR>to</VAR>.
<P><A NAME="IDX286"></A><U>Function:</U> void * <B>memccpy</B> <I>(void *<VAR>to</VAR>, const void *<VAR>from</VAR>, int <VAR>c</VAR>, size_t <VAR>size</VAR>)</I>
<P>This function copies no more than <VAR>size</VAR> bytes from <VAR>from</VAR> to <VAR>to</VAR>, stopping if a byte matching <VAR>c</VAR> is found.
The return value is a pointer into <VAR>to</VAR> one byte past where <VAR>c</VAR> was copied, or a null pointer if no byte matching <VAR>c</VAR> appeared in the first <VAR>size</VAR> bytes of <VAR>from</VAR>.
<P><A NAME="IDX287"></A><U>Function:</U> void * <B>memset</B> <I>(void *<VAR>block</VAR>, int <VAR>c</VAR>, size_t <VAR>size</VAR>)</I>
<P>This function copies the value of <VAR>c</VAR> (converted to an <CODE>unsigned char</CODE>) into each of the first <VAR>size</VAR> bytes of the object beginning at <VAR>block</VAR>.  It returns the value of <VAR>block</VAR>.
<P><A NAME="IDX288"></A><U>Function:</U> char * <B>strcpy</B> <I>(char *<VAR>to</VAR>, const char *<VAR>from</VAR>)</I>
<P>This copies characters from the string <VAR>from</VAR> (up to and including the terminating null character) into the string <VAR>to</VAR>.  Like <CODE>memcpy</CODE>, this function has undefined results if the strings overlap.  The return value is the value of <VAR>to</VAR>.
<P><A NAME="IDX289"></A><U>Function:</U> char * <B>strncpy</B> <I>(char *<VAR>to</VAR>, const char *<VAR>from</VAR>, size_t <VAR>size</VAR>)</I>
<P>This function is similar to <CODE>strcpy</CODE> but always copies exactly <VAR>size</VAR> characters into <VAR>to</VAR>.
<P>If the length of <VAR>from</VAR> is <VAR>size</VAR> or more, then <CODE>strncpy</CODE> copies just the first <VAR>size</VAR> characters.  Note that in this case no null character is written into <VAR>to</VAR>, so the result is not null-terminated.
<P>If the length of <VAR>from</VAR> is less than <VAR>size</VAR>, then <CODE>strncpy</CODE> copies all of <VAR>from</VAR>, followed by enough null characters to add up to <VAR>size</VAR> characters in all.
This behavior is rarely useful, but it is specified by the ANSI C standard.
<P>The behavior of <CODE>strncpy</CODE> is undefined if the strings overlap.
<P>Using <CODE>strncpy</CODE> as opposed to <CODE>strcpy</CODE> is a way to avoid bugs relating to writing past the end of the allocated space for <VAR>to</VAR>.  However, it can also make your program much slower in one common case: copying a string which is probably small into a potentially large buffer.  In this case, <VAR>size</VAR> may be large, and when it is, <CODE>strncpy</CODE> will waste a considerable amount of time copying null characters.
<P><A NAME="IDX290"></A><U>Function:</U> char * <B>strdup</B> <I>(const char *<VAR>s</VAR>)</I>
<P>This function copies the null-terminated string <VAR>s</VAR> into a newly allocated string.  The string is allocated using <CODE>malloc</CODE>; see section <A HREF="library_3.html#SEC21" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_3.html#SEC21">Unconstrained Allocation</A>.  If <CODE>malloc</CODE> cannot allocate space for the new string, <CODE>strdup</CODE> returns a null pointer.  Otherwise it returns a pointer to the new string.
<P><A NAME="IDX291"></A><U>Function:</U> char * <B>stpcpy</B> <I>(char *<VAR>to</VAR>, const char *<VAR>from</VAR>)</I>
<P>This function is like <CODE>strcpy</CODE>, except that it returns a pointer to the end of the string <VAR>to</VAR> (that is, the address of the terminating null character) rather than the beginning.
<P>For example, this program uses <CODE>stpcpy</CODE> to concatenate <SAMP>`foo'</SAMP> and <SAMP>`bar'</SAMP> to produce <SAMP>`foobar'</SAMP>, which it then prints.
<P><PRE>#include &#60;string.h&#62;
#include &#60;stdio.h&#62;

int
main (void)
{
  char buffer[10];   /* room for "foobar" plus the terminating null */
  char *to = buffer;
  to = stpcpy (to, "foo");
  to = stpcpy (to, "bar");
  printf ("%s\n", buffer);
  return 0;
}</PRE>
<P>This function is not part of the ANSI or POSIX standards, and is not customary on Unix systems, but we did not invent it either.
Perhaps it comes from MS-DOS.
<P>Its behavior is undefined if the strings overlap.
<P><A NAME="IDX292"></A><U>Function:</U> char * <B>strcat</B> <I>(char *<VAR>to</VAR>, const char *<VAR>from</VAR>)</I><P>