much text as the originally chosen rule but came later
in the flex input file, or one which matched less text.
For example, the following will both count the words in
the input and call the routine special() whenever
"frob" is seen:
        int word_count = 0;
    %%
    frob        special(); REJECT;
    [^ \t\n]+   ++word_count;
Without the REJECT, any occurrences of "frob" in the input
would not be counted as words, since the scanner normally
executes only one action per token. Multiple REJECTs are
allowed, each one finding the next best choice to the
currently active rule. For example, when the following
scanner scans the token "abcd", it will write "abcdabcaba"
to the output:
    %%
    a        |
    ab       |
    abc      |
    abcd     ECHO; REJECT;
    .|\n     /* eat up any unmatched character */
(The first three rules share the fourth's action since
they use the special '|' action.) REJECT is a
Version 2.5 Last change: April 1995 12
FLEX(1) USER COMMANDS FLEX(1)
particularly expensive feature in terms of scanner
performance; if it is used in any of the scanner's actions
it will slow down all of the scanner's matching.
Furthermore, REJECT cannot be used with the -Cf or -CF
options (see below).
Note also that unlike the other special actions, REJECT
is a branch; code immediately following it in the
action will not be executed.
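The order in which successive REJECTs fall through can be sketched in plain C. This is only a toy model of the four-rule example above, not how flex implements REJECT: the names simulate_scan and echo_with_reject are hypothetical, and the rule set is hard-coded.

```c
#include <string.h>

/* Toy model: at one input position, "ECHO; REJECT;" runs for every rule
 * that matches there, longest match first, as each REJECT falls through
 * to the next best choice. */
void echo_with_reject(const char *input, char *out)
{
    static const char *rules[] = { "a", "ab", "abc", "abcd" };
    size_t in_len = strlen(input);
    int i;

    for (i = 3; i >= 0; --i) {
        size_t len = strlen(rules[i]);
        if (len <= in_len && strncmp(input, rules[i], len) == 0)
            strncat(out, rules[i], len);    /* ECHO, then REJECT */
    }
}

/* Scans the whole input. After the last REJECT, the catch-all ".|\n"
 * rule silently consumes one character. The caller must supply an
 * output buffer large enough for everything echoed. */
void simulate_scan(const char *input, char *out)
{
    out[0] = '\0';
    while (*input != '\0') {
        echo_with_reject(input, out);
        ++input;                            /* .|\n eats one character */
    }
}
```

Calling simulate_scan("abcd", out) leaves "abcdabcaba" in out, matching the output described above.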
- yymore() tells the scanner that the next time it
matches a rule, the corresponding token should be
appended onto the current value of yytext rather than
replacing it. For example, given the input "mega-kludge"
the following will write "mega-mega-kludge" to
the output:
    %%
    mega-    ECHO; yymore();
    kludge   ECHO;
First "mega-" is matched and echoed to the output.
Then "kludge" is matched, but the previous "mega-" is
still hanging around at the beginning of yytext so the
ECHO for the "kludge" rule will actually write
"mega-kludge".
Two notes regarding use of yymore(). First, yymore() depends
on the value of yyleng correctly reflecting the size of the
current token, so you must not modify yyleng if you are
using yymore(). Second, the presence of yymore() in the
scanner's action entails a minor performance penalty in the
scanner's matching speed.
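Why yymore() depends on yyleng can be illustrated with a small model in C. This is a sketch under assumptions, not flex's internals: struct toy_scanner, toy_yymore() and toy_match() are hypothetical names, and the fixed buffer size is arbitrary.

```c
#include <string.h>

#define TEXT_MAX 64

/* Toy stand-in for the scanner's token state. */
struct toy_scanner {
    char   yytext[TEXT_MAX];
    size_t yyleng;
    int    more_pending;    /* set by toy_yymore(), cleared by a match */
};

/* Request that the next token be appended rather than replace yytext. */
void toy_yymore(struct toy_scanner *s)
{
    s->more_pending = 1;
}

/* Record a newly matched token. With a pending yymore request the token
 * is written at offset yyleng, which is why a wrong yyleng would corrupt
 * the result. Returns 0 on success, -1 if the token does not fit. */
int toy_match(struct toy_scanner *s, const char *token)
{
    size_t start = s->more_pending ? s->yyleng : 0;
    size_t len = strlen(token);

    if (start + len >= TEXT_MAX)
        return -1;
    memcpy(s->yytext + start, token, len + 1);
    s->yyleng = start + len;
    s->more_pending = 0;
    return 0;
}
```

Matching "mega-", calling toy_yymore(), and then matching "kludge" leaves "mega-kludge" in yytext with yyleng equal to 11.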
- yyless(n) returns all but the first n characters of the
current token back to the input stream, where they will
be rescanned when the scanner looks for the next match.
yytext and yyleng are adjusted appropriately (e.g.,
yyleng will now be equal to n). For example, on the
input "foobar" the following will write out
"foobarbar":
    %%
    foobar    ECHO; yyless(3);
    [a-z]+    ECHO;
An argument of 0 to yyless will cause the entire
current input string to be scanned again. Unless
you've changed how the scanner will subsequently process
its input (using BEGIN, for example), this will
result in an endless loop.
Note that yyless is a macro and can only be used in the flex
input file, not from other source files.
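The effect of yyless(n) on the read position can be modelled in a few lines of C. This is a simplified sketch, not the real macro: struct toy_input, match_word() and toy_yyless() are hypothetical names, and the model assumes n is no larger than the current token length.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy input stream: a string plus the current read position. */
struct toy_input {
    const char *text;
    size_t      pos;
};

/* Models the [a-z]+ rule: consumes the longest run of lowercase
 * letters at the current position and returns its length. */
size_t match_word(struct toy_input *in)
{
    size_t start = in->pos;

    while (islower((unsigned char)in->text[in->pos]))
        ++in->pos;
    return in->pos - start;
}

/* Models yyless(n): keeps the first n characters of a token of length
 * yyleng and returns the rest to the input. Returns the new yyleng.
 * Assumes n <= yyleng. */
size_t toy_yyless(struct toy_input *in, size_t yyleng, size_t n)
{
    in->pos -= yyleng - n;
    return n;
}
```

On "foobar", the first match consumes six characters, toy_yyless(&in, 6, 3) moves the position back to 3, and the next match sees "bar". With n equal to 0 the position returns to where the token began, which is the endless-loop case described above.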
- unput(c) puts the character c back onto the input
stream. It will be the next character scanned. The
following action will take the current token and cause
it to be rescanned enclosed in parentheses.
    {
    int i;
    /* Copy yytext because unput() trashes yytext */
    char *yycopy = strdup( yytext );
    unput( ')' );
    for ( i = yyleng - 1; i >= 0; --i )
        unput( yycopy[i] );
    unput( '(' );
    free( yycopy );
    }
Note that since each unput() puts the given character
back at the beginning of the input stream, pushing back
strings must be done back-to-front.
An important potential problem when using unput() is that if
you are using %pointer (the default), a call to unput()
destroys the contents of yytext, starting with its rightmost
character and devouring one character to the left with each
call. If you need the value of yytext preserved after a
call to unput() (as in the above example), you must either
first copy it elsewhere, or build your scanner using %array
instead (see How The Input Is Matched).
Finally, note that you cannot put back EOF to attempt to
mark the input stream with an end-of-file.
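The back-to-front rule follows from the pushback area behaving like a stack. The following C sketch models that with a plain array; toy_unput(), toy_input() and unput_parenthesized() are hypothetical stand-ins and do not reflect how flex stores pushed-back characters.

```c
#include <string.h>

#define PUSHBACK_MAX 128

/* Toy pushback area: the last character pushed is the first one read. */
char   pushback[PUSHBACK_MAX];
size_t pushback_len = 0;

/* Push one character back; returns -1 if the area is full. */
int toy_unput(char c)
{
    if (pushback_len == PUSHBACK_MAX)
        return -1;
    pushback[pushback_len++] = c;
    return 0;
}

/* Read the next pushed-back character, or -1 when none remain. */
int toy_input(void)
{
    if (pushback_len == 0)
        return -1;
    return (unsigned char)pushback[--pushback_len];
}

/* Push back "(token)" so it is read in forward order: the closing
 * parenthesis goes first, then the token from its last character to
 * its first, then the opening parenthesis. */
void unput_parenthesized(const char *token)
{
    size_t i = strlen(token);

    toy_unput(')');
    while (i > 0)
        toy_unput(token[--i]);
    toy_unput('(');
}
```

After unput_parenthesized("abc"), successive toy_input() calls yield '(', 'a', 'b', 'c', ')'.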
- input() reads the next character from the input stream.
For example, the following is one way to eat up C
comments:
    %%
    "/*"        {
                register int c;
                for ( ; ; )
                    {
                    while ( (c = input()) != '*' &&
                            c != EOF )
                        ;    /* eat up text of comment */
                    if ( c == '*' )
                        {
                        while ( (c = input()) == '*' )
                            ;
                        if ( c == '/' )
                            break;    /* found the end */
                        }
                    if ( c == EOF )
                        {
                        error( "EOF in comment" );
                        break;
                        }
                    }
                }
(Note that if the scanner is compiled using C++, then
input() is instead referred to as yyinput(), in order
to avoid a name clash with the C++ stream by the name
of input.)
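The same loop can be tried outside a scanner by running it over a string instead of calling input(). The sketch below assumes the text handed to it begins just after the opening "/*"; skip_comment() is a hypothetical helper, and the end of the string plays the role of EOF.

```c
#include <stddef.h>

/* Returns the index just past the closing delimiter of a C comment
 * whose body starts at body[0], or -1 if the string ends before the
 * comment is closed. */
long skip_comment(const char *body)
{
    size_t i = 0;

    for (;;) {
        /* eat up text of comment */
        while (body[i] != '\0' && body[i] != '*')
            ++i;
        if (body[i] == '\0')
            return -1;              /* end of text inside comment */

        /* skip a run of stars; a slash right after it ends the comment */
        while (body[i] == '*')
            ++i;
        if (body[i] == '/')
            return (long)(i + 1);   /* found the end */
        if (body[i] == '\0')
            return -1;
    }
}
```

For example, skip_comment(" a */x") returns 5, the index of the 'x' that follows the comment.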
- YY_FLUSH_BUFFER flushes the scanner's internal buffer
so that the next time the scanner attempts to match a
token, it will first refill the buffer using YY_INPUT
(see The Generated Scanner, below). This action is a
special case of the more general yy_flush_buffer()
function, described below in the section Multiple Input
Buffers.
- yyterminate() can be used in lieu of a return statement
in an action. It terminates the scanner and returns a
0 to the scanner's caller, indicating "all done". By
default, yyterminate() is also called when an end-of-file
is encountered. It is a macro and may be redefined.
THE GENERATED SCANNER
The output of flex is the file lex.yy.c, which contains the
scanning routine yylex(), a number of tables used by it for
matching tokens, and a number of auxiliary routines and
macros. By default, yylex() is declared as follows:
    int yylex()
        {
        ... various definitions and the actions in here ...
        }
(If your environment supports function prototypes, then it
will be "int yylex( void )".) This definition may be
changed by defining the "YY_DECL" macro. For example, you
could use:
#define YY_DECL float lexscan( a, b ) float a, b;
to give the scanning routine the name lexscan, returning a
float, and taking two floats as arguments. Note that if you
give arguments to the scanning routine using a
K&R-style/non-prototyped function declaration, you must
terminate the definition with a semi-colon (;).
Whenever yylex() is called, it scans tokens from the global
input file yyin (which defaults to stdin). It continues
until it either reaches an end-of-file (at which point it
returns the value 0) or one of its actions executes a return
statement.
If the scanner reaches an end-of-file, subsequent calls are
undefined unless either yyin is pointed at a new input file
(in which case scanning continues from that file), or
yyrestart() is called. yyrestart() takes one argument, a FILE *
pointer (which can be NULL, if you've set up YY_INPUT to scan
from a source other than yyin), and initializes yyin for
scanning from that file. Essentially there is no difference
between just assigning yyin to a new input file or using
yyrestart() to do so; the latter is available for
compatibility with previous versions of flex, and because it can be
used to switch input files in the middle of scanning. It
can also be used to throw away the current input buffer, by
calling it with an argument of yyin; but it is better to use
YY_FLUSH_BUFFER (see above). Note that yyrestart() does not
reset the start condition to INITIAL (see Start Conditions,
below).
If yylex() stops scanning due to executing a return
statement in one of the actions, the scanner may then be called
again and it will resume scanning where it left off.
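A caller typically relies on this by invoking the scanning routine in a loop until it returns 0. The sketch below shows that calling pattern only; toy_yylex() is a hypothetical stub that replays a fixed list of made-up token codes in place of a generated yylex(), and count_tokens() is likewise an illustrative name.

```c
#include <stddef.h>

/* Made-up token codes replayed by the stub; the trailing 0 plays the
 * role of the end-of-file return value. */
static const int toy_tokens[] = { 258, 259, 258, 0 };
static size_t toy_next = 0;

/* Stub standing in for a generated yylex(): each call resumes where
 * the previous one left off, and keeps returning 0 once exhausted. */
int toy_yylex(void)
{
    int token = toy_tokens[toy_next];

    if (token != 0)
        ++toy_next;
    return token;
}

/* Driver loop: call the scanner until it reports end-of-file. */
int count_tokens(int (*lex)(void))
{
    int count = 0;

    while (lex() != 0)
        ++count;
    return count;
}
```

With the stub above, count_tokens(toy_yylex) returns 3.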
By default (and for purposes of efficiency), the scanner
uses block-reads rather than simple getc() calls to read
characters from yyin. The nature of how it gets its input
can be controlled by defining the YY_INPUT macro.
YY_INPUT's calling sequence is
"YY_INPUT(buf,result,max_size)". Its action is to place up
to max_size characters in the character array buf and return
in the integer variable result either the number of
characters read or the constant YY_NULL (0 on Unix systems) to
indicate EOF. The default YY_INPUT reads from the global
file-pointer "yyin".
A sample definition of YY_INPUT (in the definitions section
of the input file):
    %{
    #define YY_INPUT(buf,result,max_size) \
        { \
        int c = getchar(); \
        result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
        }
    %}
This definition will change the input processing to occur
one character at a time.
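The same contract can be met by a block read from memory rather than a character at a time. The following is a sketch under assumptions: struct mem_source and mem_read() are hypothetical names, and the function returns 0 at end of data, which corresponds to YY_NULL on Unix systems as noted above.

```c
#include <string.h>

/* Hypothetical in-memory input source. */
struct mem_source {
    const char *data;   /* bytes to scan */
    size_t      len;    /* total number of bytes */
    size_t      pos;    /* how many bytes have been handed out */
};

/* Follows the YY_INPUT contract as a function: copies up to max_size
 * bytes into buf and returns the number copied, or 0 once the source
 * is exhausted. */
size_t mem_read(struct mem_source *src, char *buf, size_t max_size)
{
    size_t n = src->len - src->pos;

    if (n > max_size)
        n = max_size;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return n;
}

/* In a scanner this might be wired in roughly as follows, where
 * my_source is a hypothetical struct mem_source defined by the user:
 *
 *   #define YY_INPUT(buf,result,max_size) \
 *       { result = mem_read(&my_source, buf, max_size); }
 */
```

Reading "hello" through a four-byte buffer yields 4 bytes, then 1 byte, then 0.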
When the scanner receives an end-of-file indication from
YY_INPUT, it then checks the yywrap() function. If yywrap()
returns false (zero), then it is assumed that the function
has gone ahead and set up yyin to point to another input
file, and scanning continues. If it returns true
(non-zero), then the scanner terminates, returning 0 to its
caller. Note that in either case, the start condition
remains unchanged; it does not revert to INITIAL.
If you do not supply your own version of yywrap(), then you
must either use %option noyywrap (in which case the scanner
behaves as though yywrap() returned 1), or you must link
with -lfl to obtain the default version of the routine,
which always returns 1.
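A user-supplied yywrap() that moves on to the next file in a list might look like the sketch below. It is written to compile on its own, so yyin is declared locally as a stand-in for the scanner's global, the routine is named toy_yywrap() rather than yywrap(), and pending_files is a hypothetical NULL-terminated list of file names.

```c
#include <stdio.h>

/* Stand-in for the scanner's global; a real scanner already has yyin. */
FILE *yyin;

/* Hypothetical NULL-terminated list of files still to be scanned. */
const char **pending_files;

/* Returns 0 after pointing yyin at the next file that can be opened,
 * or 1 when no files remain, following the yywrap() convention. */
int toy_yywrap(void)
{
    if (yyin != NULL && yyin != stdin)
        fclose(yyin);
    yyin = NULL;

    while (pending_files != NULL && *pending_files != NULL) {
        yyin = fopen(*pending_files++, "r");
        if (yyin != NULL)
            return 0;       /* more input: scanning continues */
    }
    return 1;               /* all done: the scanner terminates */
}
```

Files that cannot be opened are skipped silently here; a real scanner would probably want to report them.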
Three routines are available for scanning from in-memory
buffers rather than files: yy_scan_string(),
yy_scan_bytes(), and yy_scan_buffer(). See the discussion of
them below in the section Multiple Input Buffers.
The scanner writes its ECHO output to the yyout global
(default, stdout), which may be redefined by the user simply
by assigning it to some other FILE pointer.
START CONDITIONS
flex provides a mechanism for conditionally activating
rules. Any rule whose pattern is prefixed with "<sc>" will
only be active when the scanner is in the start condition
named "sc". For example,
    <STRING>[^"]*        { /* eat up the string body ... */
                ...
                }
will be active only when the scanner is in the "STRING"
start condition, and
    <INITIAL,STRING,QUOTE>\.        { /* handle an escape ... */
                ...
                }
will be active only when the current start condition is
either "INITIAL", "STRING", or "QUOTE".
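The idea of rules that apply only in a given start condition can be pictured as a hand-written state machine. The C sketch below is an analogy, not generated scanner code: collect_strings() is a hypothetical function, it has only two states, and it ignores escapes.

```c
#include <stddef.h>

enum start_condition { INITIAL, STRING };

/* Copies the text found between double quotes into out and ignores
 * everything else. Each case plays the role of the rules that are
 * active in that start condition. The caller must supply an output
 * buffer at least as large as the input. */
void collect_strings(const char *input, char *out)
{
    enum start_condition state = INITIAL;
    size_t n = 0;

    for (; *input != '\0'; ++input) {
        switch (state) {
        case INITIAL:
            if (*input == '"')
                state = STRING;     /* comparable to BEGIN(STRING) */
            break;
        case STRING:
            if (*input == '"')
                state = INITIAL;    /* comparable to BEGIN(INITIAL) */
            else
                out[n++] = *input;  /* string body */
            break;
        }
    }
    out[n] = '\0';
}
```

Given the input a "bc" d "e", the function leaves "bce" in out.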
Start conditions are declared in the definitions (first)
section of the input using unindented lines beginning with
either %s or %x followed by a list of names. The former
declares inclusive start conditions, the latter exclusive
start conditions. A start condition is activated using the
BEGIN action. Until the next BEGIN action is executed,
rules with the given start condition will be active and