@group
/* @r{block comment}
// @r{contains line comment}
@r{yet more comment}
*/ @r{outside comment}
// @r{line comment} /* @r{contains block comment} */
@end group
@end example
But beware of commenting out one end of a block comment with a line
comment.
@example
@group
// @r{l.c.} /* @r{block comment begins}
@r{oops! this isn't a comment anymore} */
@end group
@end example
Comments are not recognized within string literals. @t{@w{"/* blah
*/"}} is the string constant @samp{@w{/* blah */}}, not an empty string.
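As a small illustration of this rule, the following sketch checks that the comment delimiters survive inside a string constant. The function name @code{comment_in_string_length} is hypothetical and exists only for this example.

```c
#include <string.h>

/* Hypothetical helper: the delimiters inside the quotes are ordinary
   characters, so the string constant keeps all ten of them.  */
static size_t
comment_in_string_length (void)
{
  return strlen ("/* blah */");
}
```

If the comment were recognized, the constant would be empty and the function would return zero.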
Line comments are not in the 1989 edition of the C standard, but they
are recognized by GCC as an extension. In C++ and in the 1999 edition
of the C standard, they are an official part of the language.
Since these transformations happen before all other processing, you can
split a line mechanically with backslash-newline anywhere. You can
comment out the end of a line. You can continue a line comment onto the
next line with backslash-newline. You can even split @samp{/*},
@samp{*/}, and @samp{//} onto multiple lines with backslash-newline.
For example:
@example
@group
/\
*
*/ # /*
*/ defi\
ne FO\
O 10\
20
@end group
@end example
@noindent
is equivalent to @code{@w{#define FOO 1020}}. All these tricks are
extremely confusing and should not be used in code intended to be
readable.
There is no way to prevent a backslash at the end of a line from being
interpreted as a backslash-newline.
@example
"foo\\
bar"
@end example
@noindent
is equivalent to @code{"foo\bar"}, not to @code{"foo\\bar"}. To avoid
having to worry about this, do not use the deprecated GNU extension
which permits multi-line strings. Instead, use string literal
concatenation:
@example
"foo\\"
"bar"
@end example
@noindent
Your program will be more portable this way, too.
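The difference between the two spellings can be checked with a short sketch. The array names @code{spliced} and @code{concatenated} are hypothetical, and the comments describe what a conforming compiler is expected to produce.

```c
#include <string.h>

/* The first line ends in backslash-newline, so the two lines are
   spliced and the constant reads "foo\bar": three letters, a
   backspace (\b), and two more letters.  */
static const char spliced[] = "foo\\
bar";

/* Adjacent string constants are concatenated after escape sequences
   are interpreted, so the escaped backslash is kept intact.  */
static const char concatenated[] = "foo\\"
                                   "bar";
```

Here @code{spliced} holds six characters, the fourth being a backspace, while @code{concatenated} holds the seven characters of @code{"foo\\bar"}.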
@node Tokenization
@section Tokenization
@cindex tokens
@cindex preprocessing tokens
After the textual transformations are finished, the input file is
converted into a sequence of @dfn{preprocessing tokens}. These mostly
correspond to the syntactic tokens used by the C compiler, but there are
a few differences. White space separates tokens; it is not itself a
token of any kind. Tokens do not have to be separated by white space,
but it is often necessary to avoid ambiguities.
When faced with a sequence of characters that has more than one possible
tokenization, the preprocessor is greedy. It always makes each token,
starting from the left, as big as possible before moving on to the next
token. For instance, @code{a+++++b} is interpreted as
@code{@w{a ++ ++ + b}}, not as @code{@w{a ++ + ++ b}}, even though the
latter tokenization could be part of a valid C program and the former
could not.
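A shorter sequence that does compile shows the same greedy rule at work. This is only a sketch, and the helper name @code{greedy_demo} is hypothetical.

```c
/* Hypothetical helper showing how a+++b is tokenized.  */
static void
greedy_demo (int result[3])
{
  int a = 1, b = 2;
  int sum = a+++b;   /* tokens: a ++ + b, i.e. (a++) + b */

  result[0] = sum;   /* 1 + 2 */
  result[1] = a;     /* incremented to 2 */
  result[2] = b;     /* still 2 */
}
```

Had the tokenization been @code{@w{a + ++ b}}, the sum would be 4 and @code{b}, not @code{a}, would have been incremented.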
Once the input file is broken into tokens, the token boundaries never
change, except when the @samp{##} preprocessing operator is used to paste
tokens together. @xref{Concatenation}. For example,
@example
@group
#define foo() bar
foo()baz
@expansion{} bar baz
@emph{not}
@expansion{} barbaz
@end group
@end example
The compiler does not re-tokenize the preprocessor's output. Each
preprocessing token becomes one compiler token.
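The fixed token boundaries can be observed in compilable code. In this sketch the typedef @code{bar}, the macro @code{PASTE}, and both function names are hypothetical, chosen to mirror the example above.

```c
typedef int bar;          /* hypothetical type name for the example */
#define foo() bar
#define PASTE(a, b) a##b  /* the only way to merge two tokens */

static int
separate_tokens (void)
{
  foo()baz = 5;           /* expands to: bar baz = 5; */
  return baz;
}

static int
pasted_token (void)
{
  int barbaz = 7;
  return PASTE (bar, baz);  /* a single identifier: barbaz */
}
```

The first function compiles only because @code{bar} and @code{baz} remain two tokens, which makes the line a declaration of @code{baz}. The second refers to @code{barbaz} only because @samp{##} was used.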
@cindex identifiers
Preprocessing tokens fall into five broad classes: identifiers,
preprocessing numbers, string literals, punctuators, and other. An
@dfn{identifier} is the same as an identifier in C: any sequence of
letters, digits, or underscores, which begins with a letter or
underscore. Keywords of C have no significance to the preprocessor;
they are ordinary identifiers. You can define a macro whose name is a
keyword, for instance. The only identifier which can be considered a
preprocessing keyword is @code{defined}. @xref{Defined}.
This is mostly true of other languages which use the C preprocessor.
However, a few of the keywords of C++ are significant even in the
preprocessor. @xref{C++ Named Operators}.
In the 1999 C standard, identifiers may contain letters which are not
part of the ``basic source character set,'' at the implementation's
discretion (such as accented Latin letters, Greek letters, or Chinese
ideograms). This may be done with an extended character set, or the
@samp{\u} and @samp{\U} escape sequences. GCC does not presently
implement either feature in the preprocessor or the compiler.
As an extension, GCC treats @samp{$} as a letter. This is for
compatibility with some systems, such as VMS, where @samp{$} is commonly
used in system-defined function and object names. @samp{$} is not a
letter in strictly conforming mode, or if you specify the @option{-$}
option. @xref{Invocation}.
@cindex numbers
@cindex preprocessing numbers
A @dfn{preprocessing number} has a rather bizarre definition. The
category includes all the normal integer and floating point constants
one expects of C, but also a number of other things one might not
initially recognize as a number. Formally, preprocessing numbers begin
with an optional period, a required decimal digit, and then continue
with any sequence of letters, digits, underscores, periods, and
exponents. Exponents are the two-character sequences @samp{e+},
@samp{e-}, @samp{E+}, @samp{E-}, @samp{p+}, @samp{p-}, @samp{P+}, and
@samp{P-}. (The exponents that begin with @samp{p} or @samp{P} are new
to C99. They are used for hexadecimal floating-point constants.)
The purpose of this unusual definition is to isolate the preprocessor
from the full complexity of numeric constants. It does not have to
distinguish between lexically valid and invalid floating-point numbers,
which is complicated. The definition also permits you to split an
identifier at any position and get exactly two tokens, which can then be
pasted back together with the @samp{##} operator.
It's possible for preprocessing numbers to cause programs to be
misinterpreted. For example, @code{0xE+12} is a single preprocessing
number which does not translate to any valid numeric constant, and is
therefore a syntax error. It does not mean @code{@w{0xE + 12}}, which
is what you might have intended.
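The following sketch shows two ways to obtain the intended sum. The macro @code{HEX_E} and the function names are hypothetical.

```c
#define HEX_E 0xE

/* With white space, 0xE and 12 are separate tokens.  Writing
   0xE+12 instead would form one invalid preprocessing number.  */
static int
spaced_sum (void)
{
  return 0xE + 12;
}

/* HEX_E expands to the single token 0xE; the following + and 12
   remain separate tokens, so no white space is needed here.  */
static int
macro_sum (void)
{
  return HEX_E+12;
}
```

Both functions return 26. The second works because token boundaries are settled before macro expansion takes place.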
@cindex string literals
@cindex string constants
@cindex character constants
@cindex header file names
@c the @: prevents makeinfo from turning '' into ".
@dfn{String literals} are string constants, character constants, and
header file names (the argument of @samp{#include}).@footnote{The C
standard uses the term @dfn{string literal} to refer only to what we are
calling @dfn{string constants}.} String constants and character
constants are straightforward: @t{"@dots{}"} or @t{'@dots{}'}. In
either case the closing quote may be escaped with a backslash:
@t{'\'@:'} is the character constant for @samp{'}. There is no limit on
the length of a character constant, but the value of a character
constant that contains more than one character is
implementation-defined. @xref{Implementation Details}.
Header file names either look like string constants, @t{"@dots{}"}, or are
written with angle brackets instead, @t{<@dots{}>}. In either case,
backslash is an ordinary character. There is no way to escape the
closing quote or angle bracket. The preprocessor looks for the header
file in different places depending on which form you use. @xref{Include
Operation}.
In standard C, no string literal may extend past the end of a line. GNU
CPP accepts multi-line string constants, but not character constants or
header file names. This extension is deprecated and will be removed in
GCC 3.1. You may use continued lines instead, or string constant
concatenation. @xref{Differences from previous versions}.
@cindex punctuators
@dfn{Punctuators} are all the usual bits of punctuation which are
meaningful to C and C++. All but three of the punctuation characters in
ASCII are C punctuators. The exceptions are @samp{@@}, @samp{$}, and
@samp{`}. In addition, all the two- and three-character operators are
punctuators. There are also six @dfn{digraphs}, which are merely
alternate ways to spell other punctuators. Digraphs are a second
attempt to work around missing punctuation in obsolete systems. They
have no negative side effects, unlike trigraphs, but do not cover as
much ground. The digraphs and their corresponding normal punctuators
are:
@example
Digraph:        <%  %>  <:  :>  %:  %:%:
Punctuator:      @{   @}   [   ]   #    ##
@end example
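As a sketch of how the digraphs read in practice, the following fragment uses all six. It assumes a compiler that accepts digraphs in its default mode; the macro @code{CAT} and the function names are hypothetical.

```c
%:define CAT(a, b) a %:%: b      /* same as: #define CAT(a, b) a ## b */

static int
digraph_sum (void)
<%
  int v<:3:> = <% 1, 2, 3 %>;      /* int v[3] = { 1, 2, 3 }; */
  return v<:0:> + v<:1:> + v<:2:>;
%>

static int
digraph_sum_via_paste (void)
<%
  return CAT (digraph_, sum) ();   /* pastes to digraph_sum, then calls it */
%>
```

Both functions return 6. Code written this way is hard to read, so digraphs are best kept for systems that really lack the normal punctuators.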
@cindex other tokens
Any other single character is considered ``other.'' It is passed on to
the preprocessor's output unmolested. The C compiler will almost
certainly reject source code containing ``other'' tokens. In ASCII, the
only other characters are @samp{@@}, @samp{$}, @samp{`}, and control
characters other than NUL (all bits zero). (Note that @samp{$} is
normally considered a letter.) All characters with the high bit set
(numeric range 0x80--0xFF) are also ``other'' in the present
implementation. This will change when proper support for international
character sets is added to GCC.
NUL is a special case because of the high probability that its
appearance is accidental, and because it may be invisible to the user
(many terminals do not display NUL at all). Within comments, NULs are
silently ignored, just as any other character would be. In running
text, NUL is considered white space. For example, these two directives
have the same meaning.
@example
#define X^@@1
#define X 1
@end example
@noindent
(where @samp{^@@} is ASCII NUL). Within string or character constants,
NULs are preserved. In the latter two cases the preprocessor emits a
warning message.
@node The preprocessing language
@section The preprocessing language
@cindex directives
@cindex preprocessing directives
@cindex directive line
@cindex directive name
After tokenization, the stream of tokens may simply be passed straight
to the compiler's parser. However, if it contains any operations in the
@dfn{preprocessing language}, it will be transformed first. This stage
corresponds roughly to the standard's ``translation phase 4'' and is
what most people think of as the preprocessor's job.
The preprocessing language consists of @dfn{directives} to be executed
and @dfn{macros} to be expanded. Its primary capabilities are:
@itemize @bullet
@item
Inclusion of header files. These are files of declarations that can be
substituted into your program.
@item
Macro expansion. You can define @dfn{macros}, which are abbreviations
for arbitrary fragments of C code. The preprocessor will replace the
macros with their definitions throughout the program. Some macros are
automatically defined for you.
@item
Conditional compilation. You can include or exclude parts of the
program according to various conditions.
@item
Line control. If you use a program to combine or rearrange source files
into an intermediate file which is then compiled, you can use line
control to inform the compiler where each source line originally came
from.
@item
Diagnostics. You can detect problems at compile time and issue errors
or warnings.
@end itemize
There are a few more, less useful features.
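The capabilities listed above can be sketched together in one short fragment. The macro names, the function names, and the file name @file{generated.c} are hypothetical.

```c
#include <string.h>              /* inclusion of a header file */

#define GREETING "hello"         /* macro definition */

#if defined (GREETING)           /* conditional compilation */
# define HAVE_GREETING 1
#else
# define HAVE_GREETING 0
#endif

#if !HAVE_GREETING               /* diagnostics; not triggered here */
# error "GREETING must be defined"
#endif

static size_t
greeting_length (void)
{
  return strlen (GREETING);      /* macro expansion */
}

/* Line control: the line after the directive is treated as line 100
   of a file named generated.c (a hypothetical name).  */
#line 100 "generated.c"
static int line_after_directive (void) { return __LINE__; }
```

Note that the @samp{#line} directive also changes the file name and line numbers reported for everything that follows it in the same file.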
Except for expansion of predefined macros, all these operations are
triggered with @dfn{preprocessing directives}. Preprocessing directives
are lines in your program that start with @samp{#}. White space is
allowed before and after the @samp{#}. The @samp{#} is followed by an
identifier, the @dfn{directive name}. It specifies the operation to
perform. Directives are commonly referred to as @samp{#@var{name}}
where @var{name} is the directive name. For example, @samp{#define} is
the directive that defines a macro.
The @samp{#} which begins a directive cannot come from a macro
expansion. Also, the directive name is not macro expanded. Thus, if
@code{foo} is defined as a macro expanding to @code{define}, that does
not make @samp{#foo} a valid preprocessing directive.
The set of valid directive names is fixed. Programs cannot define new
preprocessing directives.
Some directives require arguments; these make up the rest of the
directive line and must be separated from the directive name by
white space. For example, @samp{#define} must be followed by a macro
name and the intended expansion of the macro.
A preprocessing directive cannot cover more than one line. The line
may, however, be continued with backslash-newline, or by a block comment
which extends past the end of the line. In either case, when the
directive is processed, the continuations have already been merged with
the first line to make one long line.
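Both forms of continuation are shown in this sketch. The macro names @code{LONG_SUM} and @code{ANSWER} and the function name are hypothetical.

```c
/* Continued with backslash-newline.  */
#define LONG_SUM(a, b) \
  ((a) + \
   (b))

/* Continued by a block comment that spans the line break; the
   comment is replaced by a single space before the directive is
   processed, leaving: #define ANSWER 42  */
#define ANSWER /* the comment starts here
   and ends here */ 42

static int
continued_total (void)
{
  return LONG_SUM (ANSWER, 1);
}
```

Of the two, backslash-newline is the more readable choice. A comment that happens to continue a directive is easy to overlook.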
@node Header Files
@chapter Header Files
@cindex header file
A header file is a file containing C declarations and macro definitions
(@pxref{Macros}) to be shared between several source files. You request
the use of a header file in your program by @dfn{including} it, with the
C preprocessing directive @samp{#include}.
Header files serve two purposes.
@itemize @bullet
@item
@cindex system header files
System header files declare the interfaces to parts of the operating
system. You include them in your program to supply the definitions and
declarations you need to invoke system calls and libraries.
@item
Your own header files contain declarations for interfaces between the
source files of your program. Each time you have a group of related
declarations and macro definitions all or most of which are needed in
several different source files, it is a good idea to create a header
file for them.
@end itemize
Including a header file produces the same results as copying the header
file into each source file that needs it. Such copying would be
time-consuming and error-prone. With a header file, the related
declarations appear in only one place. If they need to be changed, they
can be changed in one place, and programs that include the header file
will automatically use the new version when next recompiled. The header
file eliminates the labor of finding and changing all the copies as well
as the risk that a failure to find one copy will result in
inconsistencies within a program.
In C, the usual convention is to give header files names that end with