{
yyterminate();
}
else
yy_switch_to_buffer(
include_stack[include_stack_ptr] );
}
END-OF-FILE RULES
The special rule "<<EOF>>" indicates actions which are to be taken when
an end-of-file is encountered and yywrap() returns non-zero (i.e.,
indicates no further files to process). The action must finish by doing
one of four things:
26 May 1990 17
FLEX(1) Minix Programmer's Manual FLEX(1)
- the special YY_NEW_FILE action, if yyin has been pointed at a new
file to process;
- a return statement;
- the special yyterminate() action;
- or, switching to a new buffer using yy_switch_to_buffer() as shown
in the example above.
<<EOF>> rules may not be used with other patterns; they may only be
qualified with a list of start conditions. If an unqualified <<EOF>>
rule is given, it applies to all start conditions which do not already
have <<EOF>> actions. To specify an <<EOF>> rule for only the initial
start condition, use
    <INITIAL><<EOF>>
These rules are useful for catching things like unclosed comments. An
example:
    %x quote
    %%
    ...other rules for dealing with quotes...
    <quote><<EOF>>   {
             error( "unterminated quote" );
             yyterminate();
             }
    <<EOF>>  {
             if ( *++filelist )
                 {
                 yyin = fopen( *filelist, "r" );
                 YY_NEW_FILE;
                 }
             else
                 yyterminate();
             }
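The unqualified <<EOF>> action above assigns the result of fopen() to yyin without checking it. As a minimal sketch only, the file-advancing step could be moved into a plain C helper that skips names it cannot open. The helper name open_next_input() is hypothetical and not part of flex; filelist is assumed to be a NULL-terminated array of names, as in the example.

```c
#include <stdio.h>

/*
 * Advance through a NULL-terminated list of file names and return the
 * next one that can be opened, or NULL once the list is exhausted.
 * open_next_input() is a hypothetical helper, not a flex routine.
 */
FILE *open_next_input(char ***filelist)
{
    /* Stop at the terminator so repeated calls never run past the end. */
    while (**filelist != NULL && *++*filelist != NULL) {
        FILE *fp = fopen(**filelist, "r");
        if (fp != NULL)
            return fp;          /* caller assigns this to yyin */
        perror(**filelist);     /* report the failure, try the next name */
    }
    return NULL;                /* caller should invoke yyterminate() */
}
```

Inside the <<EOF>> action, one would assign a non-NULL result to yyin and invoke YY_NEW_FILE, and call yyterminate() when the helper returns NULL.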
MISCELLANEOUS MACROS
The macro YY_USER_ACTION can be redefined to provide an action which is
always executed prior to the matched rule's action. For example, it
could be #define'd to call a routine to convert yytext to lower-case.
The macro YY_USER_INIT may be redefined to provide an action which is
always executed before the first scan (and before the scanner's internal
initializations are done). For example, it could be used to call a
routine to read in a data table or open a logging file.
In the generated scanner, the actions are all gathered in one large
switch statement and separated using YY_BREAK, which may be redefined.
By default, it is simply a "break", to separate each rule's action from
the following rule's. Redefining YY_BREAK allows, for example, C++ users
to #define YY_BREAK to do nothing (while being very careful that every
rule ends with a "break" or a "return"!) to avoid suffering from
unreachable statement warnings where, because a rule's action ends with
"return", the YY_BREAK is inaccessible.
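The role of YY_BREAK can be illustrated with a small hand-written switch. This is a sketch under stated assumptions: run_action() and the token codes are hypothetical stand-ins for the code flex generates, and the default definition is taken to be a plain "break" as described above.

```c
/* Default separator: a plain "break". Define YY_BREAK before this
 * point to override it. */
#ifndef YY_BREAK
#define YY_BREAK break;
#endif

enum { TOK_NUMBER = 257, TOK_WORD = 258 };   /* hypothetical token codes */

/*
 * Hypothetical stand-in for the generated action switch.  Returns a
 * token code for rules whose action returns, or 0 for rules that
 * simply fall out of the switch.
 */
int run_action(int rule, int *work_done)
{
    switch (rule) {
    case 1:                     /* action ends with "return" */
        return TOK_NUMBER;
        YY_BREAK                /* unreachable: the source of the warning */
    case 2:                     /* action does some work and does not return */
        ++*work_done;
        YY_BREAK                /* required, or control falls into case 3 */
    case 3:
        return TOK_WORD;
        YY_BREAK
    }
    return 0;
}
```

If YY_BREAK were defined as empty, case 2 would need its own explicit "break" to keep it from falling through into case 3, which is the care the text above asks for.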
INTERFACING WITH YACC
One of the main uses of flex is as a companion to the yacc parser-
generator. yacc parsers expect to call a routine named yylex() to find
the next input token. The routine is supposed to return the type of the
next token as well as putting any associated value in the global yylval.
To use flex with yacc, one specifies the -d option to yacc to instruct it
to generate the file y.tab.h containing definitions of all the %tokens
appearing in the yacc input. This file is then included in the flex
scanner. For example, if one of the tokens is "TOK_NUMBER", part of the
scanner might look like:
    %{
    #include "y.tab.h"
    %}
    %%
    [0-9]+        yylval = atoi( yytext ); return TOK_NUMBER;
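To make the yylex()/yylval contract concrete, here is a hand-written stand-in for what a generated scanner provides to the parser. It is only a sketch: the value 257 for TOK_NUMBER is an assumption (the real value comes from y.tab.h), and set_input() and the string cursor are hypothetical substitutes for reading from yyin.

```c
#include <ctype.h>
#include <stdlib.h>

#define TOK_NUMBER 257          /* assumed; yacc -d writes the real value */

int yylval;                     /* value slot read by the parser */
static const char *cursor = ""; /* hypothetical input cursor */

/* Hypothetical helper: point the scanner at a string to scan. */
void set_input(const char *s) { cursor = s; }

/* Return the type of the next token, storing any value in yylval. */
int yylex(void)
{
    while (*cursor == ' ')
        ++cursor;               /* skip blanks between tokens */
    if (*cursor == '\0')
        return 0;               /* 0 tells the parser the input has ended */
    if (isdigit((unsigned char)*cursor)) {
        char *end;
        yylval = (int)strtol(cursor, &end, 10);
        cursor = end;
        return TOK_NUMBER;
    }
    return (unsigned char)*cursor++;  /* other characters stand for themselves */
}
```

Each call hands back one token type, and for TOK_NUMBER the parser finds the numeric value in yylval.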
TRANSLATION TABLE
In the name of POSIX compliance, flex supports a translation table for
mapping input characters into groups. The table is specified in the
first section, and its format looks like:
    %t
    1        abcd
    2        ABCDEFGHIJKLMNOPQRSTUVWXYZ
    52       0123456789
    6        \t\ \n
    %t
This example specifies that the characters 'a', 'b', 'c', and 'd' are to
all be lumped into group #1, upper-case letters in group #2, digits in
group #52, tabs, blanks, and newlines into group #6, and no other
characters will appear in the patterns. The group numbers are actually
disregarded by flex; %t serves, though, to lump characters together.
Given the above table, for example, the pattern "a(AA)*5" is equivalent
to "d(ZQ)*0". They both say, "match any character in group #1, followed
by zero-or-more pairs of characters from group #2, followed by a
character from group #52." Thus %t provides a crude way for introducing
equivalence classes into the scanner specification.
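The equivalence claimed above can be checked with a short C sketch. It models only the grouping, not flex itself: group_of() and same_pattern() are hypothetical names, the table is copied from the example, and pattern operators are compared literally with no handling of quoting or escapes.

```c
#include <string.h>

/* Group members from the %t example; the numbers are arbitrary labels. */
static const struct { int group; const char *members; } table[] = {
    { 1,  "abcd" },
    { 2,  "ABCDEFGHIJKLMNOPQRSTUVWXYZ" },
    { 52, "0123456789" },
    { 6,  "\t \n" },
};

/* Return the group number of c, or 0 if c is not listed in the table. */
int group_of(char c)
{
    size_t i;
    if (c == '\0')
        return 0;               /* strchr() would match the terminator */
    for (i = 0; i < sizeof table / sizeof table[0]; ++i)
        if (strchr(table[i].members, c) != NULL)
            return table[i].group;
    return 0;
}

/*
 * Two patterns are treated as equivalent when they have the same length
 * and each pair of corresponding characters is either in the same group
 * or is the same ungrouped character (an operator such as '(' or '*').
 */
int same_pattern(const char *p, const char *q)
{
    for (; *p != '\0' && *q != '\0'; ++p, ++q) {
        int gp = group_of(*p), gq = group_of(*q);
        if (gp != gq || (gp == 0 && *p != *q))
            return 0;
    }
    return *p == *q;            /* both patterns must end together */
}
```

With this table, same_pattern("a(AA)*5", "d(ZQ)*0") holds because every grouped character is replaced by another member of its own group.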
Note that the -i option (see below), coupled with the equivalence classes
which flex automatically generates, takes care of virtually all the
instances when one might consider using %t. But what the hell, it's there
if you want it.
OPTIONS
flex has the following options:
-b Generate backtracking information to lex.backtrack. This is a list
of scanner states which require backtracking and the input
characters on which they do so. By adding rules one can remove
backtracking states. If all backtracking states are eliminated and
-f or -F is used, the generated scanner will run faster (see the -p
flag). Only users who wish to squeeze every last cycle out of their
scanners need worry about this option. (See the section on
PERFORMANCE CONSIDERATIONS below.)
-c is a do-nothing, deprecated option included for POSIX compliance.
NOTE: in previous releases of flex -c specified table-compression
options. This functionality is now given by the -C flag. To ease
the impact of this change, when flex encounters -c, it currently
issues a warning message and assumes that -C was desired instead.
In the future this "promotion" of -c to -C will go away in the name
of full POSIX compliance (unless the POSIX meaning is removed
first).
-d makes the generated scanner run in debug mode. Whenever a pattern
is recognized and the global yy_flex_debug is non-zero (which is the
default), the scanner will write to stderr a line of the form:
    --accepting rule at line 53 ("the matched text")
The line number refers to the location of the rule in the file
defining the scanner (i.e., the file that was fed to flex).
Messages are also generated when the scanner backtracks, accepts the
default rule, reaches the end of its input buffer (or encounters a
NUL; at this point, the two look the same as far as the scanner's
concerned), or reaches an end-of-file.
-f specifies (take your pick) full table or fast scanner. No table
compression is done. The result is large but fast. This option is
equivalent to -Cf (see below).
-i instructs flex to generate a case-insensitive scanner. The case of
letters given in the flex input patterns will be ignored, and tokens
in the input will be matched regardless of case. The matched text
given in yytext will have the preserved case (i.e., it will not be
folded).
-n is another do-nothing, deprecated option included only for POSIX
compliance.
-p generates a performance report to stderr. The report consists of
comments regarding features of the flex input file which will cause
a loss of performance in the resulting scanner. Note that the use
of REJECT and variable trailing context (see the BUGS section in
flex(1)) entails a substantial performance penalty; use of yymore(),
the ^ operator, and the -I flag entail minor performance penalties.
-s causes the default rule (that unmatched scanner input is echoed to
stdout) to be suppressed. If the scanner encounters input that does
not match any of its rules, it aborts with an error. This option is
useful for finding holes in a scanner's rule set.
-t instructs flex to write the scanner it generates to standard output
instead of lex.yy.c.
-v specifies that flex should write to stderr a summary of statistics
regarding the scanner it generates. Most of the statistics are
meaningless to the casual flex user, but the first line identifies
the version of flex, which is useful for figuring out where you
stand with respect to patches and new releases, and the next two
lines give the date when the scanner was created and a summary of
the flags which were in effect.
-F specifies that the fast scanner table representation should be used.
This representation is about as fast as the full table
representation (-f), and for some sets of patterns will be
considerably smaller (and for others, larger). In general, if the
pattern set contains both "keywords" and a catch-all, "identifier"
rule, such as in the set:
    "case"    return TOK_CASE;
    "switch"  return TOK_SWITCH;
    ...
    "default" return TOK_DEFAULT;
    [a-z]+    return TOK_ID;
then you're better off using the full table representation. If only
the "identifier" rule is present and you then use a hash table or
some such to detect the keywords, you're better off using -F.
This option is equivalent to -CF (see below).
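For the second arrangement, where only the "identifier" rule is present, the keyword check moves into the action. The sketch below is one way this might look. It uses a binary search over a sorted table rather than a hash table (the text allows "a hash table or some such"), and the token codes and the name classify_identifier() are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; a real scanner would take them from y.tab.h. */
enum { TOK_ID = 258, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword { const char *name; int token; };

/* Must stay sorted by name, since bsearch() relies on the ordering. */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH },
};

/* bsearch() comparator: key is the matched text, elem a table entry. */
static int compare_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct keyword *)elem)->name);
}

/* Map the text matched by the identifier rule to a keyword token,
 * or to TOK_ID if it is not a keyword. */
int classify_identifier(const char *text)
{
    const struct keyword *kw = bsearch(text, keywords,
                                       sizeof keywords / sizeof keywords[0],
                                       sizeof keywords[0], compare_keyword);
    return kw != NULL ? kw->token : TOK_ID;
}
```

The action for the [a-z]+ rule would then be something like "return classify_identifier( yytext );".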
-I instructs flex to generate an interactive scanner. Normally,
scanners generated by flex always look ahead one character before
deciding that a rule has been matched. At the cost of some scanning
overhead, flex will generate a scanner which only looks ahead when
needed. Such scanners are called interactive because if you want to
write a scanner for an interactive system such as a command shell,
you will probably want the user's input to be terminated with a
newline, and without -I the user will have to type a character in
addition to the newline in order to have the newline recognized.
This leads to dreadful interactive performance.
If all this seems too confusing, here's the general rule: if a human
will be typing in input to your scanner, use -I, otherwise don't; if
you don't care about squeezing the utmost performance from your
scanner and you don't want to make any assumptions about the input
to your scanner, use -I.
Note, -I cannot be used in conjunction with full or fast tables,
i.e., the -f, -F, -Cf, or -CF flags.
-L instructs flex not to generate #line directives. Without this
option, flex peppers the generated scanner with #line directives so
error messages in the actions will be correctly located with respect
to the original flex input file, and not to the fairly meaningless
line numbers of lex.yy.c. (Unfortunately flex does not presently
generate the necessary directives to "retarget" the line numbers for
those parts of lex.yy.c which it generated. So if there is an error
in the generated code, a meaningless line number is reported.)
-T makes flex run in trace mode. It will generate a lot of messages to
stdout concerning the form of the input and the resultant non-
deterministic and deterministic finite automata. This option is
mostly for use in maintaining flex.
-8 instructs flex to generate an 8-bit scanner, i.e., one which can
recognize 8-bit characters. On some sites, flex is installed with
this option as the default. On others, the default is 7-bit
characters. To see which is the case, check the verbose (-v) output
for "equivalence classes created". If the denominator of the number
shown is 128, then by default flex is generating 7-bit characters.
If it is 256, then the default is 8-bit characters and the -8 flag
is not required (but may be a good idea to keep the scanner
specification portable). Feeding a 7-bit scanner 8-bit characters
will result in infinite loops, bus errors, or other such fireworks,
so when in doubt, use the flag. Note that if equivalence classes
are used, 8-bit scanners take only slightly more table space than 7-
bit scanners (128 bytes, to be exact); if equivalence classes are
not used, however, then the tables may grow up to twice their 7-bit
size.
-C[efmF]
controls the degree of table compression.