Kernighan and Ritchie - The C Programming Language (Second Edition), often called the bible of C programming
   {
       int c;

       c = getchar();
       while (c != EOF) {
           putchar(c);
           c = getchar();
       }
   }
</pre>
The relational operator <tt>!=</tt> means ``not equal to''.
<p>
What appears to be a character on the keyboard or screen is of course, like
everything else, stored internally just as a bit pattern. The type <tt>char</tt>
is specifically meant for storing such character data, but any integer type
can be used. We used <tt>int</tt> for a subtle but important reason.
<p>
The problem is distinguishing the end of input from valid data. The solution
is that <tt>getchar</tt> returns a distinctive value when there is no more
input, a value that cannot be confused with any real character. This value
is called <tt>EOF</tt>, for ``end of file''. We must declare <tt>c</tt> to be
a type big enough to hold any value that <tt>getchar</tt> returns. We can't use
<tt>char</tt> since <tt>c</tt> must be big enough to hold <tt>EOF</tt> in
addition to any possible <tt>char</tt>. Therefore we use <tt>int</tt>.
<p>
<tt>EOF</tt> is an integer defined in <tt>&lt;stdio.h&gt;</tt>, but the specific numeric
value doesn't matter as long as it is not the same as any <tt>char</tt> value.
By using the symbolic constant, we are assured that nothing in the program
depends on the specific numeric value.
<p>
The program for copying would be written more concisely by experienced C
programmers. In C, any assignment, such as
<pre>
   c = getchar();
</pre>
is an expression and has a value, which is the value of the left hand side
after the assignment. This means that an assignment can appear as part of a
larger expression.
If the assignment of a character to <tt>c</tt> is put inside
the test part of a <tt>while</tt> loop, the copy program can be written this
way:
<pre>
   #include &lt;stdio.h&gt;

   /* copy input to output; 2nd version  */
   main()
   {
       int c;

       while ((c = getchar()) != EOF)
           putchar(c);
   }
</pre>
The <tt>while</tt> gets a character, assigns it to <tt>c</tt>, and then tests
whether the character was the end-of-file signal. If it was not, the body of
the <tt>while</tt> is executed, printing the character. The <tt>while</tt> then
repeats. When the end of the input is finally reached, the <tt>while</tt>
terminates and so does <tt>main</tt>.
<p>
This version centralizes the input - there is now only one reference to
<tt>getchar</tt> - and shrinks the program. The resulting program is more
compact, and, once the idiom is mastered, easier to read. You'll see this
style often. (It's possible to get carried away and create impenetrable code,
however, a tendency that we will try to curb.)
<p>
The parentheses around the assignment within the condition are necessary.
The <em>precedence</em> of <tt>!=</tt> is higher than that of <tt>=</tt>, which
means that in the absence of parentheses the relational test <tt>!=</tt> would
be done before the assignment <tt>=</tt>. So the statement
<pre>
   c = getchar() != EOF
</pre>
is equivalent to
<pre>
   c = (getchar() != EOF)
</pre>
This has the undesired effect of setting <tt>c</tt> to 0 or 1, depending on
whether or not the call of <tt>getchar</tt> returned end of file.
(More on this
in <a href="chapter2.html">Chapter 2</a>.)
<p>
<strong>Exercise 1-6.</strong> Verify that the expression
<tt>getchar() != EOF</tt> is 0 or 1.
<p>
<strong>Exercise 1-7.</strong> Write a program to print the value of
<tt>EOF</tt>.
<h3><a name="s1.5.2">1.5.2 Character Counting</a></h3>
The next program counts characters; it is similar to the copy program.
<pre>
   #include &lt;stdio.h&gt;

   /* count characters in input; 1st version */
   main()
   {
       long nc;

       nc = 0;
       while (getchar() != EOF)
           ++nc;
       printf("%ld\n", nc);
   }
</pre>
The statement
<pre>
   ++nc;
</pre>
presents a new operator, <tt>++</tt>, which means <em>increment by one</em>.
You could instead write <tt>nc = nc + 1</tt> but <tt>++nc</tt> is more
concise and often more efficient. There is a corresponding operator
<tt>--</tt> to decrement by 1. The operators <tt>++</tt> and <tt>--</tt> can
be either prefix operators (<tt>++nc</tt>) or postfix operators
(<tt>nc++</tt>); these two forms have different values in expressions, as
will be shown in <a href="chapter2.html">Chapter 2</a>, but <tt>++nc</tt> and
<tt>nc++</tt> both increment <tt>nc</tt>. For the moment we will stick
to the prefix form.
<p>
The character counting program accumulates its count in a <tt>long</tt>
variable instead of an <tt>int</tt>. <tt>long</tt> integers are at least 32 bits.
Although on some machines, <tt>int</tt> and <tt>long</tt> are the same size,
on others an <tt>int</tt> is 16 bits, with a maximum value of 32767, and it
would take relatively little input to overflow an <tt>int</tt> counter. The
conversion specification <tt>%ld</tt> tells <tt>printf</tt> that the
corresponding argument is a <tt>long</tt> integer.
<p>
It may be possible to cope with even bigger numbers by using a <tt>double</tt>
(double precision <tt>float</tt>).
We will also use a <tt>for</tt> statement
instead of a <tt>while</tt>, to illustrate another way to write the loop.
<pre>
   #include &lt;stdio.h&gt;

   /* count characters in input; 2nd version */
   main()
   {
       double nc;

       for (nc = 0; getchar() != EOF; ++nc)
           ;
       printf("%.0f\n", nc);
   }
</pre>
<tt>printf</tt> uses <tt>%f</tt> for both <tt>float</tt> and <tt>double</tt>;
<tt>%.0f</tt> suppresses the printing of the decimal point and the fraction
part, which is zero.
<p>
The body of this <tt>for</tt> loop is empty, because all the work is done in
the test and increment parts. But the grammatical rules of C require that a
<tt>for</tt> statement have a body. The isolated semicolon, called a <em>null
statement</em>, is there to satisfy that requirement. We put it on a separate
line to make it visible.
<p>
Before we leave the character counting program, observe that if the input
contains no characters, the <tt>while</tt> or <tt>for</tt> test fails on the very
first call to <tt>getchar</tt>, and the program produces zero, the right answer.
This is important. One of the nice things about <tt>while</tt> and <tt>for</tt>
is that they test at the top of the loop, before proceeding with the body. If
there is nothing to do, nothing is done, even if that means never going
through the loop body. Programs should act intelligently when given
zero-length input. The <tt>while</tt> and <tt>for</tt> statements help ensure
that programs do reasonable things with boundary conditions.
<h3><a name="s1.5.3">1.5.3 Line Counting</a></h3>
The next program counts input lines. As we mentioned above, the standard
library ensures that an input text stream appears as a sequence of lines,
each terminated by a newline.
Hence, counting lines is just counting
newlines:
<pre>
   #include &lt;stdio.h&gt;

   /* count lines in input */
   main()
   {
       int c, nl;

       nl = 0;
       while ((c = getchar()) != EOF)
           if (c == '\n')
               ++nl;
       printf("%d\n", nl);
   }
</pre>
The body of the <tt>while</tt> now consists of an <tt>if</tt>, which in turn
controls the increment <tt>++nl</tt>. The <tt>if</tt> statement tests the
parenthesized condition, and if the condition is true, executes the statement
(or group of statements in braces) that follows. We have again indented to
show what is controlled by what.
<p>
The double equals sign <tt>==</tt> is the C notation for ``is equal to'' (like
Pascal's single <tt>=</tt> or Fortran's <tt>.EQ.</tt>). This symbol is used to
distinguish the equality test from the single <tt>=</tt> that C uses for
assignment. A word of caution: newcomers to C occasionally write <tt>=</tt> when
they mean <tt>==</tt>. As we will see in <a href="chapter2.html">Chapter 2</a>,
the result is usually a legal expression, so you will get no warning.
<p>
A character written between single quotes represents an integer value equal
to the numerical value of the character in the machine's character set. This
is called a <em>character constant</em>, although it is just another way to
write a small integer. So, for example, <tt>'A'</tt> is a character constant; in
the ASCII character set its value is 65, the internal representation of the
character <tt>A</tt>. Of course, <tt>'A'</tt> is to be preferred over <tt>65</tt>: its
meaning is obvious, and it is independent of a particular character set.
<p>
The escape sequences used in string constants are also legal in character
constants, so <tt>'\n'</tt> stands for the value of the newline character,
which is 10 in ASCII.
You should note carefully that <tt>'\n'</tt> is a single
character, and in expressions is just an integer; on the other hand,
<tt>"\n"</tt> is a string constant that happens to contain only one character.
The topic of strings versus characters is discussed further in
<a href="chapter2.html">Chapter 2</a>.
<p>
<strong>Exercise 1-8.</strong> Write a program to count blanks, tabs, and newlines.
<p>
<strong>Exercise 1-9.</strong> Write a program to copy its input to its output,
replacing each string of one or more blanks by a single blank.
<p>
<strong>Exercise 1-10.</strong> Write a program to copy its input to its output,
replacing each tab by <tt>\t</tt>, each backspace by <tt>\b</tt>, and each
backslash by <tt>\\</tt>. This makes tabs and backspaces visible in an
unambiguous way.
<h3><a name="s1.5.4">1.5.4 Word Counting</a></h3>
The fourth in our series of useful programs counts lines, words, and
characters, with the loose definition that a word is any sequence of
characters that does not contain a blank, tab or newline. This is a
bare-bones version of the UNIX program <tt>wc</tt>.
<pre>
   #include &lt;stdio.h&gt;

   #define IN   1  /* inside a word */
   #define OUT  0  /* outside a word */

   /* count lines, words, and characters in input */
   main()
   {
       int c, nl, nw, nc, state;

       state = OUT;
       nl = nw = nc = 0;
       while ((c = getchar()) != EOF) {
           ++nc;
           if (c == '\n')
               ++nl;
           if (c == ' ' || c == '\n' || c == '\t')
               state = OUT;
           else if (state == OUT) {
               state = IN;
               ++nw;
           }
       }
       printf("%d %d %d\n", nl, nw, nc);
   }
</pre>
Every time the program encounters the first character of a word, it counts
one more word. The variable <tt>state</tt> records whether the program is
currently in a word or not; initially it is ``not in a word'', which is
assigned the value <tt>OUT</tt>.
We prefer the symbolic constants <tt>IN</tt>
and <tt>OUT</tt> to the literal values 1 and 0 because they make the program
more readable. In a program as tiny as this, it makes little difference, but
in larger programs, the increase in clarity is well worth the modest extra
effort to write it this way from the beginning. You'll also find that it's
easier to make extensive changes in programs where magic numbers appear only
as symbolic constants.
<p>
The line
<pre>
   nl = nw = nc = 0;
</pre>
sets all three variables to zero. This is not a special case, but a
consequence of the fact that an assignment is an expression with a value
and assignments associate from right to left. It's as if we had written
<pre>
   nl = (nw = (nc = 0));
</pre>
The operator <tt>||</tt> means OR, so the line
<pre>
   if (c == ' ' || c == '\n' || c == '\t')
</pre>
says ``if <tt>c</tt> is a blank <em>or</em> <tt>c</tt> is a newline
<em>or</em> <tt>c</tt> is a tab''. (Recall that the escape sequence
<tt>\t</tt> is a visible representation of the tab character.) There is a
corresponding operator <tt>&amp;&amp;</tt> for AND; its precedence is just
higher than <tt>||</tt>. Expressions connected by <tt>&amp;&amp;</tt> or
<tt>||</tt> are evaluated left to right, and it is guaranteed that evaluation
will stop as soon as the truth or falsehood is known. If <tt>c</tt> is a
blank, there is no need to test whether it is a newline or tab, so these
tests are not made. This isn't particularly important here, but is significant
in more complicated situations, as we will soon see.
<p>
The example also shows an <tt>else</tt>, which specifies an alternative
action if the condition part of an <tt>if</tt> statement is false. The
general form is
<pre>
   if (<em>expression</em>)
       <em>statement<sub>1</sub></em>
   else
       <em>statement<sub>2</sub></em>
</pre>
One and only one of the two statements associated with an <tt>if-else</tt> is
performed.
If the <em>expression</em> is true, <em>statement<sub>1</sub></em> is
executed; if not, <em>statement<sub>2</sub></em> is executed. Each
<em>statement</em> can be a single statement or several in braces. In the
word count program, the one after the <tt>else</tt> is an <tt>if</tt> that
controls two statements in braces.
<p>
<strong>Exercise 1-11.</strong> How would you test the word count program?
What kinds of input are most likely to uncover bugs if there are any?
<p>
<strong>Exercise 1-12.</strong> Write a program that prints its input one
