By default (and for efficiency), the scanner uses block-reads rather
than simple getc() calls to read characters from yyin. How it gets its
input can be controlled by redefining the YY_INPUT macro.
YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)".
Its action is to place up to max_size characters in the character array
buf and return in the integer variable result either the number of
characters read or the constant YY_NULL (0 on Unix systems) to indicate
EOF. The default YY_INPUT reads from the global file-pointer "yyin".
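The contract described above can be illustrated with a small sketch that
feeds the scanner from a string in memory instead of from yyin. The
names mem_input, mem_remaining and mem_read are hypothetical and are not
part of flex; only the YY_INPUT calling sequence comes from the text
above.

```c
#include <string.h>

/* Hypothetical in-memory input source; these names are not part of flex. */
const char *mem_input = "";   /* text still to be handed to the scanner */
size_t mem_remaining = 0;     /* number of characters left in mem_input */

/* Copy up to max_size characters into buf and return how many were
 * copied, or 0 (the value of YY_NULL on Unix systems) at end of input.
 */
int mem_read(char *buf, int max_size)
{
    size_t n;

    if (max_size <= 0 || mem_remaining == 0)
        return 0;

    n = mem_remaining < (size_t) max_size ? mem_remaining
                                          : (size_t) max_size;
    memcpy(buf, mem_input, n);
    mem_input += n;
    mem_remaining -= n;
    return (int) n;
}

/* In the definitions section of the flex input one could then write:
 *
 *   #undef YY_INPUT
 *   #define YY_INPUT(buf,result,max_size) \
 *       { result = mem_read( buf, max_size ); }
 */
```

Because mem_read() fills as much of buf as it can on each call, this
keeps the efficiency of block-reads while changing the input source.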
A sample redefinition of YY_INPUT (in the definitions section of the
input file):
%{
#undef YY_INPUT
#define YY_INPUT(buf,result,max_size) \
{ \
int c = getchar(); \
result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
}
%}
This definition will change the input processing to occur one character
at a time.
You can also add features such as keeping track of the input line number
this way, but don't expect your scanner to go very fast.
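A minimal sketch of line counting inside a one-character-at-a-time
reader might look as follows. The names input_line_num and
counting_input are hypothetical, not part of flex. Note that the counter
reflects characters read by the scanner, which may be slightly ahead of
the token currently being matched.

```c
#include <stdio.h>

/* Hypothetical counter; not a flex variable. */
int input_line_num = 1;

/* Read one character from in into buf, counting newlines as they go by.
 * Returns the number of characters stored (always 1 here), or 0 (the
 * value of YY_NULL on Unix systems) at end of file.
 */
int counting_input(FILE *in, char *buf, int max_size)
{
    int c;

    if (max_size < 1)
        return 0;

    c = getc(in);
    if (c == EOF)
        return 0;

    if (c == '\n')
        ++input_line_num;

    buf[0] = (char) c;
    return 1;
}

/* In the definitions section of the flex input one could then write:
 *
 *   #undef YY_INPUT
 *   #define YY_INPUT(buf,result,max_size) \
 *       { result = counting_input( yyin, buf, max_size ); }
 */
```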
When the scanner receives an end-of-file indication from YY_INPUT, it
then consults yywrap(). If yywrap() returns false (zero), the scanner
assumes that yywrap() has set up yyin to point to another input file,
and scanning continues. If it returns true (non-zero), the scanner
terminates, returning 0 to its caller.
The default yywrap() always returns 1. Presently, to redefine it you
must first "#undef yywrap", as it is currently implemented as a macro.
As indicated by the hedging in the previous sentence, it may be changed
to a true function in the near future.
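As a sketch of the behavior just described, the following helper moves
on to the next file in a list when the current one is exhausted. The
names pending_inputs and open_next_input are hypothetical and not part
of flex. Files that cannot be opened are silently skipped here, which a
real scanner would probably want to report.

```c
#include <stdio.h>

/* Hypothetical NULL-terminated list of file names still to be scanned. */
const char **pending_inputs = NULL;

/* Point *in at the next file that can be opened.
 * Returns 0 if scanning should continue with the new file,
 * or 1 if there is no more input.
 */
int open_next_input(FILE **in)
{
    while (pending_inputs != NULL && *pending_inputs != NULL) {
        FILE *next = fopen(*pending_inputs++, "r");

        if (next != NULL) {
            /* the old file has been read to EOF, so it can be closed */
            if (*in != NULL && *in != stdin)
                fclose(*in);
            *in = next;
            return 0;
        }
    }
    return 1;
}

/* Since yywrap is presently a macro, one could then write in the
 * definitions section of the flex input:
 *
 *   #undef yywrap
 *   #define yywrap() open_next_input( &yyin )
 */
```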
The scanner writes its ECHO output to the yyout global (default, stdout),
which may be redefined by the user simply by assigning it to some other
FILE pointer.
START CONDITIONS
flex provides a mechanism for conditionally activating rules. Any rule
whose pattern is prefixed with "<sc>" will only be active when the
scanner is in the start condition named "sc". For example,
<STRING>[^"]* { /* eat up the string body ... */
...
}
26 May 1990 12
FLEX(1) Minix Programmer's Manual FLEX(1)
will be active only when the scanner is in the "STRING" start condition,
and
<INITIAL,STRING,QUOTE>\. { /* handle an escape ... */
...
}
will be active only when the current start condition is either "INITIAL",
"STRING", or "QUOTE".
Start conditions are declared in the definitions (first) section of the
input using unindented lines beginning with either %s or %x followed by a
list of names. The former declares inclusive start conditions, the
latter exclusive start conditions. A start condition is activated using
the BEGIN action. Until the next BEGIN action is executed, rules with
the given start condition will be active and rules with other start
conditions will be inactive. If the start condition is inclusive, then
rules with no start conditions at all will also be active. If it is
exclusive, then only rules qualified with the start condition will be
active. A set of rules contingent on the same exclusive start condition
describe a scanner which is independent of any of the other rules in the
flex input. Because of this, exclusive start conditions make it easy to
specify "mini-scanners" which scan portions of the input that are
syntactically different from the rest (e.g., comments).
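The activation rules above can be summarized as a small decision
function. This is only a model for illustration, not how flex is
implemented; the function name and the bit-mask encoding of a rule's
start conditions are assumptions. It covers user-written rules only.

```c
/* Hypothetical model of rule activation; not flex's implementation.
 *
 * rule_conditions       bit i is set if the rule is prefixed with start
 *                       condition number i; 0 means no <sc> prefix at all
 * current               number of the current start condition
 * current_is_exclusive  non-zero if current was declared with %x
 *
 * Returns 1 if the rule is active, 0 otherwise.
 */
int rule_is_active(unsigned rule_conditions, int current,
                   int current_is_exclusive)
{
    if (rule_conditions == 0)
        /* unprefixed rules are active unless the condition is exclusive */
        return !current_is_exclusive;

    /* prefixed rules are active only in the conditions they name */
    return (int) ((rule_conditions >> current) & 1u);
}
```

For instance, with a start condition numbered 1, an unprefixed rule gives
rule_is_active(0, 1, 0) == 1 if the condition is inclusive and
rule_is_active(0, 1, 1) == 0 if it is exclusive.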
If the distinction between inclusive and exclusive start conditions is
still a little vague, here's a simple example illustrating the connection
between the two. The set of rules:
%s example
%%
foo /* do something */
is equivalent to
%x example
%%
<INITIAL,example>foo /* do something */
The default rule (to ECHO any unmatched character) remains active in
start conditions.
BEGIN(0) returns to the original state where only the rules with no start
conditions are active. This state can also be referred to as the start-
condition "INITIAL", so BEGIN(INITIAL) is equivalent to BEGIN(0). (The
parentheses around the start condition name are not required but are
considered good style.)
BEGIN actions can also be given as indented code at the beginning of the
rules section. For example, the following will cause the scanner to
enter the "SPECIAL" start condition whenever yylex() is called and the
global variable enter_special is true:
        int enter_special;
%x SPECIAL
%%
        if ( enter_special )
            BEGIN(SPECIAL);
<SPECIAL>blahblahblah
...more rules follow...
To illustrate the uses of start conditions, here is a scanner which
provides two different interpretations of a string like "123.456". By
default it will treat it as three tokens, the integer "123", a dot
('.'), and the integer "456". But if the string is preceded earlier in
the line by the string "expect-floats" it will treat it as a single
token, the floating-point number 123.456:
%{
#include <math.h>
#include <stdlib.h> /* declares atof() and atoi() */
%}
%s expect
%%
expect-floats BEGIN(expect);
<expect>[0-9]+"."[0-9]+ {
printf( "found a float, = %f\n",
atof( yytext ) );
}
<expect>\n {
/* that's the end of the line, so
* we need another "expect-floats"
* before we'll recognize any more
* numbers
*/
BEGIN(INITIAL);
}
[0-9]+ {
printf( "found an integer, = %d\n",
atoi( yytext ) );
}
"." printf( "found a dot\n" );
Here is a scanner which recognizes (and discards) C comments while
maintaining a count of the current input line.
%x comment
%%
        int line_num = 1;
"/*"                    BEGIN(comment);
<comment>[^*\n]*        /* eat anything that's not a '*' */
<comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
<comment>\n             ++line_num;
<comment>"*"+"/"        BEGIN(INITIAL);
Note that start condition names are really integer values and can be
stored as such. Thus, the above could be extended in the following
fashion:
%x comment foo
%%
        int line_num = 1;
        int comment_caller;
"/*"                    {
                        comment_caller = INITIAL;
                        BEGIN(comment);
                        }
...
<foo>"/*"               {
                        comment_caller = foo;
                        BEGIN(comment);
                        }
<comment>[^*\n]*        /* eat anything that's not a '*' */
<comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
<comment>\n             ++line_num;
<comment>"*"+"/"        BEGIN(comment_caller);
One can then implement a "stack" of start conditions using an array of
integers. (It is likely that such stacks will become a full-fledged flex
feature in the future.) Note, though, that start conditions do not have
their own name-space; %s's and %x's declare names in the same fashion as
#define's.
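A minimal sketch of such a stack, using an array of integers as
suggested above, might look like this. The names SC_STACK_MAX, sc_push
and sc_pop are hypothetical and not part of flex. As with comment_caller
in the example above, each rule passes in the start condition it wants
to return to.

```c
#define SC_STACK_MAX 16        /* hypothetical depth limit */

int sc_stack[SC_STACK_MAX];    /* saved start conditions */
int sc_stack_ptr = 0;          /* number of entries in use */

/* Save a start condition to return to later.
 * Returns 0 on success, or -1 if the stack is full.
 */
int sc_push(int caller_sc)
{
    if (sc_stack_ptr >= SC_STACK_MAX)
        return -1;
    sc_stack[sc_stack_ptr++] = caller_sc;
    return 0;
}

/* Return the most recently saved start condition, or fallback_sc
 * if nothing has been saved.
 */
int sc_pop(int fallback_sc)
{
    if (sc_stack_ptr > 0)
        return sc_stack[--sc_stack_ptr];
    return fallback_sc;
}

/* Possible use in the rules section:
 *
 *   <foo>"/*"         { sc_push( foo ); BEGIN(comment); }
 *   <comment>"*"+"/"  BEGIN(sc_pop( INITIAL ));
 */
```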
MULTIPLE INPUT BUFFERS
Some scanners (such as those which support "include" files) require
reading from several input streams. As flex scanners do a large amount
of buffering, one cannot control where the next input will be read from
by simply writing a YY_INPUT which is sensitive to the scanning context.
YY_INPUT is only called when the scanner reaches the end of its buffer,
which may be a long time after scanning a statement such as an "include"
which requires switching the input source.
To get around these problems, flex provides a mechanism for
creating and switching between multiple input buffers. An input buffer
is created by using:
YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )
which takes a FILE pointer and a size and creates a buffer associated
with the given file and large enough to hold size characters (when in
doubt, use YY_BUF_SIZE for the size). It returns a YY_BUFFER_STATE
handle, which may then be passed to other routines:
void yy_switch_to_buffer( YY_BUFFER_STATE new_buffer )
switches the scanner's input buffer so subsequent tokens will come from
new_buffer. Note that yy_switch_to_buffer() may be used by yywrap() to
set things up for continued scanning, instead of opening a new file and
pointing yyin at it.
void yy_delete_buffer( YY_BUFFER_STATE buffer )
is used to reclaim the storage associated with a buffer.
yy_new_buffer() is an alias for yy_create_buffer(), provided for
compatibility with the C++ use of new and delete for creating and
destroying dynamic objects.
Finally, the YY_CURRENT_BUFFER macro returns a YY_BUFFER_STATE handle to
the current buffer.
Here is an example of using these features for writing a scanner which
expands include files (the <<EOF>> feature is discussed below):
/* the "incl" state is used for picking up the name
* of an include file
*/
%x incl
%{
#define MAX_INCLUDE_DEPTH 10
YY_BUFFER_STATE include_stack[MAX_INCLUDE_DEPTH];
int include_stack_ptr = 0;
%}
%%
include BEGIN(incl);
[a-z]+ ECHO;
[^a-z\n]*\n? ECHO;
<incl>[ \t]* /* eat the whitespace */
<incl>[^ \t\n]+ { /* got the include file name */
if ( include_stack_ptr >= MAX_INCLUDE_DEPTH )
{
fprintf( stderr, "Includes nested too deeply\n" );
exit( 1 );
}
include_stack[include_stack_ptr++] =
YY_CURRENT_BUFFER;
yyin = fopen( yytext, "r" );
if ( ! yyin )
error( ... );
yy_switch_to_buffer(
yy_create_buffer( yyin, YY_BUF_SIZE ) );
BEGIN(INITIAL);
}
<<EOF>> {
if ( --include_stack_ptr < 0 )