rules with other start conditions will be inactive. If the
start condition is inclusive, then rules with no start con-
ditions at all will also be active. If it is exclusive,
then only rules qualified with the start condition will be
active. A set of rules contingent on the same exclusive
start condition describes a scanner which is independent of
any of the other rules in the flex input. Because of this,
exclusive start conditions make it easy to specify "mini-
scanners" which scan portions of the input that are syntac-
tically different from the rest (e.g., comments).
If the distinction between inclusive and exclusive start
conditions is still a little vague, here's a simple example
illustrating the connection between the two. The set of
rules:
%s example
%%
<example>foo do_something();
bar something_else();
is equivalent to
%x example
%%
<example>foo do_something();
<INITIAL,example>bar something_else();
Without the <INITIAL,example> qualifier, the bar pattern in
the second example wouldn't be active (i.e., couldn't match)
when in start condition example. If we just used <example>
to qualify bar, though, then it would only be active in
example and not in INITIAL, while in the first example it's
active in both, because in the first example the example
start condition is an inclusive (%s) start condition.
Also note that the special start-condition specifier <*>
matches every start condition. Thus, the above example
could also have been written:
%x example
%%
Version 2.5 Last change: April 1995 18
FLEX(1) USER COMMANDS FLEX(1)
<example>foo do_something();
<*>bar something_else();
The default rule (to ECHO any unmatched character) remains
active in start conditions. It is equivalent to:
<*>.|\n ECHO;
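The activation rules described above can be summarized as a
small predicate. The following is only a sketch in plain C,
not flex's internal representation: struct rule, the SC_*
constants, and rule_is_active() are hypothetical names
introduced for illustration, and INITIAL is assumed to behave
like an inclusive start condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical numbering of the start conditions in the example. */
enum { SC_INITIAL = 0, SC_EXAMPLE = 1, SC_COUNT = 2 };

/* Hypothetical description of one rule's start condition prefix. */
struct rule {
    bool   star;            /* rule was written with <*>               */
    size_t nqual;           /* number of listed conditions, 0 if none  */
    int    qual[SC_COUNT];  /* the listed start conditions             */
};

/* exclusive[sc] is true when sc was declared with %x,
 * false when it was declared with %s (or is INITIAL). */
static bool rule_is_active(const struct rule *r, int current,
                           const bool exclusive[SC_COUNT])
{
    assert(r != NULL);
    assert(current >= 0 && current < SC_COUNT);

    if (r->star)
        return true;                 /* <*> matches every start condition */

    if (r->nqual == 0)
        return !exclusive[current];  /* unqualified rules are switched off
                                        only by exclusive conditions */

    for (size_t i = 0; i < r->nqual; i++)
        if (r->qual[i] == current)
            return true;             /* explicitly qualified */

    return false;
}
```

With example declared via %s, the unqualified bar rule is
reported active in example; with %x it is not, unless it is
written as <INITIAL,example>bar or <*>bar.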
BEGIN(0) returns to the original state where only the rules
with no start conditions are active. This state can also be
referred to as the start-condition "INITIAL", so
BEGIN(INITIAL) is equivalent to BEGIN(0). (The parentheses
around the start condition name are not required but are
considered good style.)
BEGIN actions can also be given as indented code at the
beginning of the rules section. For example, the following
will cause the scanner to enter the "SPECIAL" start condi-
tion whenever yylex() is called and the global variable
enter_special is true:
        int enter_special;

%x SPECIAL
%%
        if ( enter_special )
            BEGIN(SPECIAL);

<SPECIAL>blahblahblah
...more rules follow...
To illustrate the uses of start conditions, here is a
scanner which provides two different interpretations of a
string like "123.456". By default it will treat it as three
tokens, the integer "123", a dot ('.'), and the integer
"456". But if the string is preceded earlier in the line by
the string "expect-floats" it will treat it as a single
token, the floating-point number 123.456:
%{
#include <math.h>
%}
%s expect
%%
expect-floats BEGIN(expect);
<expect>[0-9]+"."[0-9]+ {
printf( "found a float, = %f\n",
atof( yytext ) );
}
<expect>\n {
/* that's the end of the line, so
* we need another "expect-floats"
* before we'll recognize any more
* floats
*/
BEGIN(INITIAL);
}
[0-9]+ {
printf( "found an integer, = %d\n",
atoi( yytext ) );
}
"." printf( "found a dot\n" );
Here is a scanner which recognizes (and discards) C comments
while maintaining a count of the current input line.
%x comment
%%
        int line_num = 1;
"/*" BEGIN(comment);
<comment>[^*\n]* /* eat anything that's not a '*' */
<comment>"*"+[^*/\n]* /* eat up '*'s not followed by '/'s */
<comment>\n ++line_num;
<comment>"*"+"/" BEGIN(INITIAL);
This scanner goes to a bit of trouble to match as much text
as possible with each rule. In general, when attempting to
write a high-speed scanner, try to match as much as possible
in each rule, as it's a big win.
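For comparison, the combined effect of the four <comment>
rules can be written as one hand-coded loop. This is a sketch
only: skip_comment() is a hypothetical helper, not something
flex generates, and it assumes the whole input is available as
a NUL-terminated string with p pointing just past the opening
delimiter.

```c
#include <assert.h>
#include <stddef.h>

/* Skips the remainder of a C comment.  Returns a pointer just past
 * the closing delimiter, or NULL if the input ends first.  Every
 * newline seen on the way increments *line_num, as in the
 * <comment>\n rule. */
static const char *skip_comment(const char *p, int *line_num)
{
    assert(p != NULL && line_num != NULL);

    for (; *p != '\0'; p++) {
        if (p[0] == '*' && p[1] == '/')
            return p + 2;       /* closing delimiter found */
        if (*p == '\n')
            ++*line_num;        /* keep the line count current */
    }
    return NULL;                /* unterminated comment */
}
```

The loop inspects one character at a time, whereas the flex
rules consume long runs of text per match.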
Note that start condition names are really integer values
and can be stored as such. Thus, the above could be
extended in the following fashion:
%x comment foo
%%
        int line_num = 1;
        int comment_caller;
"/*" {
comment_caller = INITIAL;
BEGIN(comment);
}
...
<foo>"/*" {
comment_caller = foo;
BEGIN(comment);
}
<comment>[^*\n]* /* eat anything that's not a '*' */
<comment>"*"+[^*/\n]* /* eat up '*'s not followed by '/'s */
<comment>\n ++line_num;
<comment>"*"+"/" BEGIN(comment_caller);
Furthermore, you can access the current start condition
using the integer-valued YY_START macro. For example, the
above assignments to comment_caller could instead be written
comment_caller = YY_START;
Flex provides YYSTATE as an alias for YY_START (since that
is what's used by AT&T lex).
Note that start conditions do not have their own name-space;
%s's and %x's declare names in the same fashion as
#define's.
Finally, here's an example of how to match C-style quoted
strings using exclusive start conditions, including expanded
escape sequences (but not including checking for a string
that's too long):
%x str
%%
        char string_buf[MAX_STR_CONST];
        char *string_buf_ptr;
\" string_buf_ptr = string_buf; BEGIN(str);
<str>\" { /* saw closing quote - all done */
BEGIN(INITIAL);
*string_buf_ptr = '\0';
/* return string constant token type and
* value to parser
*/
}
<str>\n {
/* error - unterminated string constant */
/* generate error message */
}
<str>\\[0-7]{1,3} {
        /* octal escape sequence */
        unsigned int result;
        (void) sscanf( yytext + 1, "%o", &result );
        if ( result > 0xff )
            {
            /* error, constant is out-of-bounds */
            }
        else
            *string_buf_ptr++ = result;
        }
<str>\\[0-9]+ {
/* generate error - bad escape sequence; something
* like '\48' or '\0777777'
*/
}
<str>\\n *string_buf_ptr++ = '\n';
<str>\\t *string_buf_ptr++ = '\t';
<str>\\r *string_buf_ptr++ = '\r';
<str>\\b *string_buf_ptr++ = '\b';
<str>\\f *string_buf_ptr++ = '\f';
<str>\\(.|\n) *string_buf_ptr++ = yytext[1];
<str>[^\\\n\"]+ {
char *yptr = yytext;
while ( *yptr )
*string_buf_ptr++ = *yptr++;
}
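The example above leaves out the check for a string that is
too long. As a sketch of how the same escape handling looks
with that check added, here is a plain C function that decodes
the text between the quotes. decode_string_body() and its
return convention are assumptions made for illustration; they
are not part of flex. The digit handling follows the two
<str> digit rules: one to three octal digits are accepted, and
any other run of digits is an error.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes the body of a C-style string constant (the text between
 * the quotes) from src into dst, which holds cap bytes including
 * the terminating NUL.  Returns the decoded length, or -1 on a bad
 * escape, an unescaped newline, or a result that does not fit. */
static long decode_string_body(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && cap > 0);

    while (*src != '\0') {
        unsigned int c;

        if (*src == '\n')
            return -1;                  /* unterminated string constant */

        if (*src != '\\') {
            c = (unsigned char) *src++; /* ordinary character */
        } else if (src[1] >= '0' && src[1] <= '9') {
            size_t len = 0;

            c = 0;
            src++;                      /* skip the backslash */
            while (src[len] >= '0' && src[len] <= '9') {
                if (src[len] > '7' || len == 3)
                    return -1;          /* like \48 or \0777777 */
                c = c * 8 + (unsigned int) (src[len] - '0');
                len++;
            }
            if (c > 0xff)
                return -1;              /* constant is out of bounds */
            src += len;
        } else {
            src++;                      /* skip the backslash */
            if (*src == '\0')
                return -1;              /* backslash at end of input */
            switch (*src) {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            case 'r': c = '\r'; break;
            case 'b': c = '\b'; break;
            case 'f': c = '\f'; break;
            default:  c = (unsigned char) *src; break;
            }
            src++;
        }

        if (n + 1 >= cap)
            return -1;                  /* string constant too long */
        dst[n++] = (char) c;
    }

    dst[n] = '\0';
    return (long) n;
}
```

In a scanner the same bound would be checked against
string_buf + MAX_STR_CONST before each store through
string_buf_ptr.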
Often, such as in some of the examples above, you wind up
writing a whole bunch of rules all preceded by the same
start condition(s). Flex makes this a little easier and
cleaner by introducing a notion of start condition scope. A
start condition scope is begun with:
<SCs>{
where SCs is a list of one or more start conditions. Inside
the start condition scope, every rule automatically has the
prefix <SCs> applied to it, until a '}' which matches the
initial '{'. So, for example,
<ESC>{
"\\n" return '\n';
"\\r" return '\r';
"\\f" return '\f';
"\\0" return '\0';
}
is equivalent to:
<ESC>"\\n" return '\n';
<ESC>"\\r" return '\r';
<ESC>"\\f" return '\f';
<ESC>"\\0" return '\0';
Start condition scopes may be nested.
Three routines are available for manipulating stacks of
start conditions:
void yy_push_state(int new_state)
pushes the current start condition onto the top of the
start condition stack and switches to new_state as
though you had used BEGIN new_state (recall that start
condition names are also integers).
void yy_pop_state()
pops the top of the stack and switches to it via BEGIN.
int yy_top_state()
returns the top of the stack without altering the
stack's contents.
The start condition stack grows dynamically and so has no
built-in size limitation. If memory is exhausted, program
execution aborts.
To use start condition stacks, your scanner must include a
%option stack directive (see Options below).
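The semantics of the three routines can be modeled in a few
lines of C. This is a sketch under stated assumptions, not the
code flex generates: current_state, sc_stack, and the
push_state(), pop_state(), and top_state() functions are
hypothetical stand-ins for the scanner's own state and for
yy_push_state(), yy_pop_state(), and yy_top_state().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the scanner's state. */
static int    current_state = 0;   /* 0 plays the role of INITIAL */
static int   *sc_stack = NULL;     /* saved start conditions      */
static size_t sc_depth = 0;        /* entries in use              */
static size_t sc_capacity = 0;     /* entries allocated           */

/* Saves the current start condition, then switches to new_state. */
static void push_state(int new_state)
{
    if (sc_depth == sc_capacity) {
        size_t new_cap = sc_capacity ? sc_capacity * 2 : 8;
        int *p = realloc(sc_stack, new_cap * sizeof *p);

        if (p == NULL)
            abort();               /* memory exhausted: give up */
        sc_stack = p;
        sc_capacity = new_cap;
    }
    sc_stack[sc_depth++] = current_state;
    current_state = new_state;     /* as though BEGIN new_state */
}

/* Removes the top entry and switches back to it. */
static void pop_state(void)
{
    assert(sc_depth > 0);
    current_state = sc_stack[--sc_depth];
}

/* Reports the top entry without changing the stack. */
static int top_state(void)
{
    assert(sc_depth > 0);
    return sc_stack[sc_depth - 1];
}
```

Note that the value on top of the stack is the start condition
that was current before the most recent push, not the one the
scanner is in now.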
MULTIPLE INPUT BUFFERS
Some scanners (such as those which support "include" files)
require reading from several input streams. As flex
scanners do a large amount of buffering, one cannot control
where the next input will be read from by simply writing a
YY_INPUT which is sensitive to the scanning context.
YY_INPUT is only called when the scanner reaches the end of
its buffer, which may be a long time after scanning a state-
ment such as an "include" which requires switching the input
source.
To negotiate these sorts of problems, flex provides a
mechanism for creating and switching between multiple input
buffers. An input buffer is created by using:
YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )
which takes a FILE pointer and a size and creates a buffer