.B YY_BREAK
is inaccessible.
.SH VALUES AVAILABLE TO THE USER
This section summarizes the various values available to the user
in the rule actions.
.IP -
.B char *yytext
holds the text of the current token.  It may be modified but not lengthened
(you cannot append characters to the end).
.IP
If the special directive
.B %array
appears in the first section of the scanner description, then
.B yytext
is instead declared
.B char yytext[YYLMAX],
where
.B YYLMAX
is a macro definition that you can redefine in the first section
if you don't like the default value (generally 8KB).  Using
.B %array
results in somewhat slower scanners, but the value of
.B yytext
becomes immune to calls to
.I input()
and
.I unput(),
which potentially destroy its value when
.B yytext
is a character pointer.  The opposite of
.B %array
is
.B %pointer,
which is the default.
.IP
You cannot use
.B %array
when generating C++ scanner classes
(the
.B \-+
flag).
.IP -
.B int yyleng
holds the length of the current token.
.IP -
.B FILE *yyin
is the file which by default
.I flex
reads from.  It may be redefined but doing so only makes sense before
scanning begins or after an EOF has been encountered.  Changing it in
the midst of scanning will have unexpected results since
.I flex
buffers its input; use
.B yyrestart()
instead.
Once scanning terminates because an end-of-file
has been seen, you can point
.I yyin
at the new input file and then call the scanner again to continue scanning.
.IP -
.B void yyrestart( FILE *new_file )
may be called to point
.I yyin
at the new input file.  The switch-over to the new file is immediate
(any previously buffered-up input is lost).  Note that calling
.B yyrestart()
with
.I yyin
as an argument thus throws away the current input buffer and continues
scanning the same input file.
.IP -
.B FILE *yyout
is the file to which
.B ECHO
actions are done.  It can be reassigned by the user.
.IP -
.B YY_CURRENT_BUFFER
returns a
.B YY_BUFFER_STATE
handle to the current buffer.
.IP -
.B YY_START
returns an integer value corresponding to the current start
condition.  You can subsequently use this value with
.B BEGIN
to return to that start condition.
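.PP
As a minimal sketch of how these values fit together, the scanner below
counts words outside C-style comments in each file named on its command line.
The counters, the
.B COMMENT
start condition, and
.B saved_start
are illustrative names, and
.B %option noyywrap
is assumed to be supported by your version of
.I flex.
.nf

```lex
%option noyywrap
%x COMMENT
%{
#include <stdio.h>
static long words = 0, letters = 0;   /* illustrative counters */
static int saved_start;               /* start condition to return to */
%}

%%

"/*"              { saved_start = YY_START; BEGIN(COMMENT); }
<COMMENT>"*/"     { BEGIN(saved_start); }
<COMMENT>[^*]+    { /* skip comment text */ }
<COMMENT>"*"      { /* a lone star inside a comment */ }
[A-Za-z]+         { words++; letters += yyleng; }
[^A-Za-z/]+       { /* skip everything else, newlines included */ }
"/"               { /* a slash that does not open a comment */ }

%%

int main(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++) {
        FILE *f = fopen(argv[i], "r");

        if (f == NULL) {
            perror(argv[i]);
            continue;
        }
        yyrestart(f);   /* point yyin at f, dropping any buffered input */
        yylex();        /* returns 0 once end-of-file is reached */
        fclose(f);
    }
    printf("%ld words, %ld letters", words, letters);
    puts("");
    return 0;
}
```

.fi
Saving
.B YY_START
before entering
.B COMMENT
matters most when comments can begin in more than one start condition;
here it simply restores the initial one.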
.SH INTERFACING WITH YACC
One of the main uses of
.I flex
is as a companion to the
.I yacc
parser-generator.
.I yacc
parsers expect to call a routine named
.B yylex()
to find the next input token.  The routine is supposed to
return the type of the next token as well as putting any associated
value in the global
.B yylval.
To use
.I flex
with
.I yacc,
one specifies the
.B \-d
option to
.I yacc
to instruct it to generate the file
.B y.tab.h
containing definitions of all the
.B %tokens
appearing in the
.I yacc
input.  This file is then included in the
.I flex
scanner.  For example, if one of the tokens is "TOK_NUMBER",
part of the scanner might look like:
.nf

    %{
    #include <stdlib.h>   /* for atoi() */
    #include "y.tab.h"
    %}

    %%

    [0-9]+        yylval = atoi( yytext ); return TOK_NUMBER;

.fi
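.PP
To make the calling convention concrete, here is a hand-written stand-in for
.B yylex()
in plain C.
It is not
.I flex
output, only a sketch of the contract a
.I yacc
parser relies on: the return value is the token type, the value travels in
.B yylval,
and 0 signals the end of input.
The token code 258, the
.B set_input()
helper, and reading from a string rather than
.B yyin
are assumptions made for illustration.
.nf

```c
#include <ctype.h>

/* Hypothetical token code; yacc -d would normally define it in y.tab.h. */
enum { TOK_NUMBER = 258 };

/* The global from which a yacc parser reads the token's value. */
int yylval;

/* Hypothetical input source for this sketch: a string instead of yyin. */
static const char *cursor = "";

void set_input(const char *text)
{
    cursor = text;
}

/* Return the next token's type and leave its value in yylval.
   A return value of 0 means the input is exhausted. */
int yylex(void)
{
    while (*cursor != 0 && !isdigit((unsigned char) *cursor))
        cursor++;                   /* skip anything that is not a digit */
    if (*cursor == 0)
        return 0;
    yylval = 0;
    while (isdigit((unsigned char) *cursor))
        yylval = yylval * 10 + (*cursor++ - '0');
    return TOK_NUMBER;
}
```

.fi
A parser would call this routine repeatedly until it returns 0, reading
.B yylval
after each
.B TOK_NUMBER.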
.SH OPTIONS
.I flex
has the following options:
.TP
.B \-b
Generate backing-up information to
.I lex.backup.
This is a list of scanner states which require backing up
and the input characters on which they do so.  By adding rules one
can remove backing-up states.  If
.I all
backing-up states are eliminated and
.B \-Cf
or
.B \-CF
is used, the generated scanner will run faster (see the
.B \-p
flag).  Only users who wish to squeeze every last cycle out of their
scanners need worry about this option.  (See the section on Performance
Considerations below.)
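.IP
As a hedged illustration, the first two rules below force the scanner to
back up whenever it reads "fo", "foob", or "fooba" and the keyword does not
complete.
The three added patterns give each of those prefixes a rule of its own.
.B TOK_KEYWORD
and
.B TOK_ID
are hypothetical token names.
.nf

```lex
%%
foo        return TOK_KEYWORD;
foobar     return TOK_KEYWORD;

fooba      |
foob       |
fo         {
           /* a prefix that turned out not to be a keyword */
           return TOK_ID;
           }
```

.fi
A single catch-all rule such as an identifier pattern would typically serve
the same purpose when the language has one.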
.TP
.B \-c
is a do-nothing, deprecated option included for POSIX compliance.
.TP
.B \-d
makes the generated scanner run in
.I debug
mode.  Whenever a pattern is recognized and the global
.B yy_flex_debug
is non-zero (which is the default),
the scanner will write to
.I stderr
a line of the form:
.nf

    --accepting rule at line 53 ("the matched text")

.fi
The line number refers to the location of the rule in the file
defining the scanner (i.e., the file that was fed to flex).  Messages
are also generated when the scanner backs up, accepts the
default rule, reaches the end of its input buffer (or encounters
a NUL; at this point, the two look the same as far as the scanner's concerned),
or reaches an end-of-file.
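.IP
Because the messages depend on
.B yy_flex_debug
at run time, a scanner built with
.B \-d
can switch them on and off itself.
The sketch below assumes a made-up environment variable,
.B FLEX_TRACE,
and assumes that
.B %option noyywrap
is supported by your version of
.I flex.
.nf

```lex
%option noyywrap
%{
#include <stdlib.h>
%}

%%

[0-9]+     { /* number */ }
[^0-9]+    { /* anything else */ }

%%

int main(void)
{
    /* FLEX_TRACE is a hypothetical variable chosen for this example. */
    yy_flex_debug = (getenv("FLEX_TRACE") != NULL);
    yylex();
    return 0;
}
```

.fi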
.TP
.B \-f
specifies
.I fast scanner.
No table compression is done and stdio is bypassed.
The result is large but fast.  This option is equivalent to
.B \-Cfr
(see below).
.TP
.B \-h
generates a "help" summary of
.I flex's
options to
.I stdout
and then exits.
.B \-?
and
.B \-\-help
are synonyms for
.B \-h.
.TP
.B \-i
instructs
.I flex
to generate a
.I case-insensitive
scanner.  The case of letters given in the
.I flex
input patterns will
be ignored, and tokens in the input will be matched regardless of case.  The
matched text given in
.I yytext
will preserve the case of the input (i.e., it will not be folded).
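.IP
A small sketch, assuming the scanner is generated with
.B \-i
and that
.B %option noyywrap
is supported by your version of
.I flex:
the first rule matches "select", "SELECT", and "Select" alike, and the
action prints the text exactly as it appeared in the input.
.nf

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%

select        { printf("keyword as typed: %s", yytext); puts(""); }
[a-zA-Z]+     { /* any other word */ }
[^a-zA-Z]+    { /* everything else */ }

%%

int main(void)
{
    yylex();
    return 0;
}
```

.fi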
.TP
.B \-l
turns on maximum compatibility with the original AT&T
.I lex
implementation.  Note that this does not mean
.I full
compatibility.  Use of this option costs a considerable amount of
performance, and it cannot be used with the
.B \-+, \-f, \-F, \-Cf,
or
.B \-CF
options.  For details on the compatibilities it provides, see the section
"Incompatibilities With Lex And POSIX" below.  This option also results
in the name
.B YY_FLEX_LEX_COMPAT
being #define'd in the generated scanner.
.TP
.B \-n
is another do-nothing, deprecated option included only for
POSIX compliance.
.TP
.B \-p
generates a performance report to stderr.  The report
consists of comments regarding features of the
.I flex
input file which will cause a serious loss of performance in the resulting
scanner.  If you give the flag twice, you will also get comments regarding
features that lead to minor performance losses.
.IP
Note that the use of
.B REJECT,
.B %option yylineno,
and variable trailing context (see the Deficiencies / Bugs section below)
entails a substantial performance penalty; use of
.I yymore(),
the
.B ^
operator,
and the
.B \-I
flag entail minor performance penalties.
.TP
.B \-s
causes the
.I default rule
(that unmatched scanner input is echoed to
.I stdout)
to be suppressed.  If the scanner encounters input that does not
match any of its rules, it aborts with an error.  This option is
useful for finding holes in a scanner's rule set.
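.IP
For instance, in the sketch below the third rule is what makes the rule set
complete.
If it were removed, a scanner generated with
.B \-s
would abort on the first blank or punctuation character, whereas without
.B \-s
that character would be echoed silently.
The example assumes that
.B %option noyywrap
is supported by your version of
.I flex.
.nf

```lex
%option noyywrap

%%

[0-9]+          { /* number */ }
[A-Za-z]+       { /* word */ }
[^0-9A-Za-z]    { /* explicit catch-all: blanks, punctuation, newlines */ }

%%

int main(void)
{
    yylex();
    return 0;
}
```

.fi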
.TP
.B \-t
instructs
.I flex
to write the scanner it generates to standard output instead
of
.B lex.yy.c.
.TP
.B \-v
specifies that
.I flex
should write to
.I stderr
a summary of statistics regarding the scanner it generates.
Most of the statistics are meaningless to the casual
.I flex
user, but the first line identifies the version of
.I flex
(same as reported by
.B \-V),
and the next line the flags used when generating the scanner, including
those that are on by default.
.TP
.B \-w
suppresses warning messages.
.TP
.B \-B
instructs
.I flex
to generate a
.I batch
scanner, the opposite of
.I interactive
scanners generated by
.B \-I
(see below).  In general, you use
.B \-B
when you are
.I certain
that your scanner will never be used interactively, and you want to
squeeze a
.I little
more performance out of it.  If your goal is instead to squeeze out a
.I lot
more performance, you should be using the
.B \-Cf
or
.B \-CF
options (discussed below), which turn on
.B \-B
automatically anyway.
.TP
.B \-F
specifies that the
.I fast
scanner table representation should be used (and stdio
bypassed).  This representation is
about as fast as the full table representation
.B (\-f),
and for some sets of patterns will be considerably smaller (and for
others, larger).  In general, if the pattern set contains both "keywords"
and a catch-all, "identifier" rule, such as in the set:
.nf

    "case"    return TOK_CASE;
    "switch"  return TOK_SWITCH;
    ...
    "default" return TOK_DEFAULT;
    [a-z]+    return TOK_ID;

.fi
then you're better off using the full table representation.  If only
the "identifier" rule is present and you then use a hash table or some such
to detect the keywords, you're better off using
.B \-F.
.IP
This option is equivalent to
.B \-CFr
(see below).  It cannot be used with
.B \-+.
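.IP
The keyword lookup mentioned above could be sketched in C as follows.
This version uses a sorted table searched with
.I bsearch()
rather than a hash table, as one form of "some such".
The token codes and the name
.B classify_identifier()
are hypothetical.
.nf

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; a real scanner would take them from y.tab.h. */
enum { TOK_ID = 258, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name, because bsearch() requires sorted input. */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH },
};

static int compare_keyword(const void *key, const void *entry)
{
    return strcmp((const char *) key,
                  ((const struct keyword *) entry)->name);
}

/* Return the keyword's token code, or TOK_ID for any other identifier. */
int classify_identifier(const char *text)
{
    const struct keyword *hit = bsearch(text, keywords,
        sizeof keywords / sizeof keywords[0], sizeof keywords[0],
        compare_keyword);

    return hit ? hit->token : TOK_ID;
}
```

.fi
The scanner then needs only the identifier rule, whose action could end with
return classify_identifier( yytext ).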
.TP
.B \-I
instructs
.I flex
to generate an
.I interactive
scanner.  An interactive scanner is one that only looks ahead to decide
what token has been matched if it absolutely must.  It turns out that
always looking one extra character ahead, even if the scanner has already
seen enough text to disambiguate the current token, is a bit faster than
only looking ahead when necessary.  But scanners that always look ahead
give dreadful interactive performance; for example, when a user types
a newline, it is not recognized as a newline token until they enter
.I another
token, which often means typing in another whole line.
.IP
.I Flex
scanners default to
.I interactive
unless you use the
.B \-Cf
or
.B \-CF
table-compression options (see below).  That's because if you're looking
for high-performance you should be using one of these options, so if you
didn't,
.I flex
assumes you'd rather trade off a bit of run-time performance for intuitive
interactive behavior.  Note also that you
.I cannot
use
.B \-I
in conjunction with
.B \-Cf
or
.B \-CF.
Thus, this option is not really needed; it is on by default for all those
cases in which it is allowed.
.IP
You can force a scanner to
.I not
be interactive by using
.B \-B
(see above).
.TP
.B \-L
instructs
.I flex
not to generate
.B #line
directives.  Without this option,
.I flex
peppers the generated scanner
with #line directives so error messages in the actions will be correctly
located with respect to either the original
.I flex
input file (if the errors are due to code in the input file), or
.B lex.yy.c
(if the errors are
.I flex's
fault -- you should report these sorts of errors to the email address
given below).
.TP
.B \-T
makes
.I flex
run in
.I trace
mode.  It will generate a lot of messages to
.I stderr
concerning
the form of the input and the resultant non-deterministic and deterministic
finite automata.  This option is mostly for use in maintaining
.I flex.
.TP
.B \-V
prints the version number to
.I stdout
and exits.
.B \-\-version
is a synonym for
.B \-V.
.TP
.B \-7
instructs
.I flex
to generate a 7-bit scanner, i.e., one which can only recognize 7-bit
characters in its input.  The advantage of using
.B \-7
is that the scanner's tables can be up to half the size of those generated
using the
.B \-8
option (see below).  The disadvantage is that such scanners often hang
or crash if their input contains an 8-bit character.
.IP
Note, however, that unless you generate your scanner using the
.B \-Cf
or
.B \-CF
table compression options, use of
.B \-7
will save only a small amount of table space, and make your scanner
considerably less portable.
.I Flex's
default behavior is to generate an 8-bit scanner unless you use
.B \-Cf
or
.B \-CF,
in which case
.I flex
defaults to generating 7-bit scanners unless your site was always
configured to generate 8-bit scanners (as will often be the case
with non-USA sites).  You can tell whether flex generated a 7-bit
or an 8-bit scanner by inspecting the flag summary in the
.B \-v
output as described above.
.IP
Note that if you use
.B \-Cfe
or
.B \-CFe
(those table compression options, but also using equivalence classes as
discussed below), flex still defaults to generating an 8-bit
scanner, since usually with these compression options full 8-bit tables
are not much more expensive than 7-bit tables.
.TP
.B \-8
instructs
.I flex
to generate an 8-bit scanner, i.e., one which can recognize 8-bit
characters.  This flag is only needed for scanners generated using
.B \-Cf
or
.B \-CF,
as otherwise flex defaults to generating an 8-bit scanner anyway.
.IP
See the discussion of
.B \-7
above for flex's default behavior and the tradeoffs between 7-bit
and 8-bit scanners.
.TP
.B \-+
specifies that you want flex to generate a C++
scanner class.  See the section on Generating C++ Scanners below for
details.
.TP
.B \-C[aefFmr]
controls t