<html lang="en">

<head>

<title>The C Preprocessor</title>

<meta http-equiv="Content-Type" content="text/html">

<meta name="description" content="The C Preprocessor">

<meta name="generator" content="makeinfo 4.3">

<link href="http://www.gnu.org/software/texinfo/" rel="generator-home">

<!--

Copyright &copy; 1987, 1989, 1991, 1992, 1993, 1994, 1995, 1996,

1997, 1998, 1999, 2000, 2001, 2002, 2003

Free Software Foundation, Inc.



   <p>Permission is granted to copy, distribute and/or modify this document

under the terms of the GNU Free Documentation License, Version 1.1 or

any later version published by the Free Software Foundation.  A copy of

the license is included in the

section entitled "GNU Free Documentation License".



   <p>This manual contains no Invariant Sections.  The Front-Cover Texts are

(a) (see below), and the Back-Cover Texts are (b) (see below).



   <p>(a) The FSF's Front-Cover Text is:



   <p>A GNU Manual



   <p>(b) The FSF's Back-Cover Text is:



   <p>You have freedom to copy and modify this GNU Manual, like GNU

     software.  Copies published by the Free Software Foundation raise

     funds for GNU development. 

-->

</head>

<body>

<div class="node">

<p>

Node:<a name="Initial%20processing">Initial processing</a>,

Next:<a rel="next" accesskey="n" href="Tokenization.html#Tokenization">Tokenization</a>,

Up:<a rel="up" accesskey="u" href="Overview.html#Overview">Overview</a>

<hr><br>

</div>



<h3 class="section">Initial processing</h3>



   <p>The preprocessor performs a series of textual transformations on its

input.  These happen before all other processing.  Conceptually, they

happen in a rigid order, and the entire file is run through each

transformation before the next one begins.  CPP actually does them

all at once, for performance reasons.  These transformations correspond

roughly to the first three "phases of translation" described in the C

standard.



     <ol type=1 start=1>

<li>The input file is read into memory and broken into lines.



     <p>CPP expects its input to be a text file, that is, an unstructured

stream of ASCII characters, with some characters indicating the end of a

line of text.  Extended ASCII character sets, such as ISO Latin-1 or

Unicode encoded in UTF-8, are also acceptable.  Character sets that are

not strict supersets of seven-bit ASCII will not work.  We plan to add

complete support for international character sets in a future release.



     <p>Different systems use different conventions to indicate the end of a

line.  GCC accepts the ASCII control sequences <kbd>LF</kbd>, <kbd>CR&nbsp;LF</kbd>, <kbd>CR</kbd>, and <kbd>LF&nbsp;CR</kbd> as end-of-line markers.  The first

three are the canonical sequences used by Unix, DOS and VMS, and the

classic Mac OS (before OS X) respectively.  You may therefore safely copy

source code written on any of those systems to a different one and use

it without conversion.  (GCC may lose track of the current line number

if a file doesn't consistently use one convention, as sometimes happens

when it is edited on computers with different conventions that share a

network file system.)  <kbd>LF&nbsp;CR</kbd> is included because it has been

reported as an end-of-line marker under exotic conditions.



     <p>If the last line of any input file lacks an end-of-line marker, the end

of the file is considered to implicitly supply one.  The C standard says

that this condition provokes undefined behavior, so GCC will emit a

warning message.



     </p><li><a name="trigraphs"></a>If trigraphs are enabled, they are replaced by their

corresponding single characters.  By default GCC ignores trigraphs,

but if you request a strictly conforming mode with the <code>-std</code>

option, or you specify the <code>-trigraphs</code> option, then it

converts them.



     <p>Trigraphs are nine three-character sequences, all starting with <code>??</code>,

that are defined by ISO C to stand for single characters.  They permit

obsolete systems that lack some of C's punctuation to use C.  For

example, <code>??/</code> stands for <code>\</code>, so <tt>'??/n'</tt> is a character

constant for a newline.



     <p>Trigraphs are not popular and many compilers implement them incorrectly. 

Portable code should not rely on trigraphs being either converted or

ignored.  If you use the <code>-Wall</code> or <code>-Wtrigraphs</code> options,

GCC will warn you when a trigraph would change the meaning of your

program if it were converted.



     <p>In a string constant, you can prevent a sequence of question marks from

being confused with a trigraph by inserting a backslash between the

question marks.  <tt>"(??\?)"</tt> is the string <code>(???)</code>, not

<code>(?]</code>.  Traditional C compilers do not recognize this idiom.



     <p>The nine trigraphs and their replacements are



     <pre class="example">          Trigraph:       ??(  ??)  ??&lt;  ??&gt;  ??=  ??/  ??'  ??!  ??-

          Replacement:      [    ]    {    }    #    \    ^    |    ~

          </pre>



     </p><li>Continued lines are merged into one long line.



     <p>A continued line is a line which ends with a backslash, <code>\</code>.  The

backslash is removed and the following line is joined with the current

one.  No space is inserted, so you may split a line anywhere, even in

the middle of a word.  (It is generally more readable to split lines

only at white space.)



     <p>The trailing backslash on a continued line is commonly referred to as a

<dfn>backslash-newline</dfn>.



     <p>If there is white space between a backslash and the end of a line, that

is still a continued line.  However, as this is usually the result of an

editing mistake, and many compilers will not accept it as a continued

line, GCC will warn you about it.



     </p><li>All comments are replaced with single spaces.



     <p>There are two kinds of comments.  <dfn>Block comments</dfn> begin with

<code>/*</code> and continue until the next <code>*/</code>.  Block comments do not

nest:



     <pre class="example">          /* this is /* one comment */ text outside comment

          </pre>



     <p><dfn>Line comments</dfn> begin with <code>//</code> and continue to the end of the

current line.  Line comments do not nest either, but it does not matter,

because they would end in the same place anyway.



     <pre class="example">          // this is // one comment

          text outside comment

          </pre>

     </ol>
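   <p>The trigraph table in step 2 can be sketched as a small lookup plus an
in-place rewrite.  This is only an illustration and not how CPP is
implemented; the names <code>trigraph_replacement</code> and
<code>replace_trigraphs</code> are hypothetical, and the sketch converts
trigraphs unconditionally, as if <code>-trigraphs</code> were in effect.

```c
#include <stddef.h>

/* Return the character that the trigraph ending in `third` stands for,
   or 0 if two question marks followed by `third` are not a trigraph. */
char trigraph_replacement(char third)
{
    switch (third) {
    case '(':  return '[';
    case ')':  return ']';
    case '<':  return '{';
    case '>':  return '}';
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '!':  return '|';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Replace trigraphs in the NUL-terminated buffer `text` in place and
   return the new length.  The result is never longer than the input. */
size_t replace_trigraphs(char *text)
{
    size_t in = 0, out = 0;

    while (text[in] != '\0') {
        char repl = 0;

        /* text[in + 2] is read only when text[in + 1] is a question
           mark, so the index never passes the terminating NUL. */
        if (text[in] == '?' && text[in + 1] == '?')
            repl = trigraph_replacement(text[in + 2]);

        if (repl != 0) {
            text[out++] = repl;          /* three characters become one */
            in += 3;
        } else {
            text[out++] = text[in++];    /* ordinary character */
        }
    }
    text[out] = '\0';
    return out;
}
```

   <p>Applied to a buffer holding <code>(???)</code>, the sketch leaves the first
question mark alone and converts the remaining three characters, giving
<code>(?]</code>.  This is the result that the backslash idiom described in
step 2 is meant to avoid.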



   <p>It is safe to put line comments inside block comments, or vice versa.



<pre class="example">     /* block comment

        // contains line comment

        yet more comment

      */ outside comment

     

     // line comment /* contains block comment */

     </pre>



   <p>But beware of commenting out one end of a block comment with a line

comment.



<pre class="example">      // l.c.  /* block comment begins

         oops! this isn't a comment anymore */

     </pre>



   <p>Comments are not recognized within string literals.  <tt>"/*&nbsp;blah&nbsp;*/"</tt> is the string constant <code>/*&nbsp;blah&nbsp;*/</code>, not an empty string.
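   <p>The comment rules above can be sketched as a single scan that replaces
each comment with one space and copies string and character literals
untouched.  This is a minimal illustration under stated assumptions, not
CPP's actual implementation: <code>strip_comments</code> is a hypothetical
name, lines are assumed to end in <kbd>LF</kbd>, and backslash-newline
splicing is assumed to have been done already.

```c
#include <stddef.h>

/* Copy `src` to `dst`, replacing every comment with one space.
   `dst` must have room for as many bytes as `src` occupies, including
   its terminating NUL.  Returns the number of characters written, not
   counting the terminating NUL. */
size_t strip_comments(const char *src, char *dst)
{
    size_t i = 0, out = 0;

    while (src[i] != '\0') {
        if (src[i] == '"' || src[i] == '\'') {
            /* Literal: copy verbatim up to the matching quote, so
               comment markers inside it are left alone. */
            char quote = src[i];
            dst[out++] = src[i++];
            while (src[i] != '\0' && src[i] != quote) {
                if (src[i] == '\\' && src[i + 1] != '\0')
                    dst[out++] = src[i++];   /* keep the backslash */
                dst[out++] = src[i++];
            }
            if (src[i] == quote)
                dst[out++] = src[i++];
        } else if (src[i] == '/' && src[i + 1] == '*') {
            /* Block comment: ends at the first closing delimiter,
               which is why block comments do not nest. */
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] != '\0')
                i += 2;
            dst[out++] = ' ';
        } else if (src[i] == '/' && src[i + 1] == '/') {
            /* Line comment: skip up to, but not past, the newline. */
            while (src[i] != '\0' && src[i] != '\n')
                i++;
            dst[out++] = ' ';
        } else {
            dst[out++] = src[i++];
        }
    }
    dst[out] = '\0';
    return out;
}
```

   <p>Because the literal branch is tested first, the quoted text
<tt>"/*&nbsp;blah&nbsp;*/"</tt> passes through unchanged, while the same
characters outside quotes collapse to a single space.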



   <p>Line comments are not in the 1989 edition of the C standard, but they

are recognized by GCC as an extension.  In C++ and in the 1999 edition

of the C standard, they are an official part of the language.



   <p>Since these transformations happen before all other processing, you can

split a line mechanically with backslash-newline anywhere.  You can

comment out the end of a line.  You can continue a line comment onto the

next line with backslash-newline.  You can even split <code>/*</code>,

<code>*/</code>, and <code>//</code> onto multiple lines with backslash-newline. 

For example:



<pre class="example">     /\

     *

     */ # /*

     */ defi\

     ne FO\

     O 10\

     20

     </pre>



<p>is equivalent to <code>#define&nbsp;FOO&nbsp;1020</code>.  All these tricks are

extremely confusing and should not be used in code intended to be

readable.
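   <p>The line-joining step behind this example can be sketched as follows. 
This is an illustration only, not CPP's implementation:
<code>splice_lines</code> is a hypothetical name, lines are assumed to end in
<kbd>LF</kbd>, and the sketch does not handle white space between the
backslash and the end of the line.

```c
#include <stddef.h>

/* Remove every backslash-newline pair from the NUL-terminated buffer
   `text` in place and return the new length. */
size_t splice_lines(char *text)
{
    size_t in = 0, out = 0;

    while (text[in] != '\0') {
        if (text[in] == '\\' && text[in + 1] == '\n')
            in += 2;                     /* drop both; insert no space */
        else
            text[out++] = text[in++];
    }
    text[out] = '\0';
    return out;
}
```

   <p>Given a buffer holding the last four lines of the example, starting at
<code>defi</code>, the sketch produces <code>define&nbsp;FOO&nbsp;1020</code>: no
space is inserted at the joins, so the split words are reassembled
exactly.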



   <p>There is no way to prevent a backslash at the end of a line from being

interpreted as a backslash-newline.  This cannot affect any correct

program, however.



   </body></html>