.\" flexdoc.1
describe a scanner which is independent of any of the other rules in the
.I lex
input.  Because of this,
exclusive start conditions make it easy to specify "mini-scanners"
which scan portions of the input that are syntactically different
from the rest (e.g., comments).
.PP
If the distinction between inclusive and exclusive start conditions
is still a little vague, here's a simple example illustrating the
connection between the two.  The set of rules:
.nf

    %s example
    %%
    <example>foo           /* do something */
    bar                    /* do something else */

.fi
is equivalent to
.nf

    %x example
    %%
    <example>foo           /* do something */
    <INITIAL,example>bar   /* do something else */

.fi
Without the
.B <INITIAL,example>
qualifier, the
.I bar
rule in the second example would not be active while in the exclusive
start condition "example".
.PP
Also note that the special start-condition specifier
.B <*>
matches every start condition.  Thus, the above example could also
have been written:
.nf

    %x example
    %%
    <example>foo   /* do something */
    <*>bar         /* do something else */

.fi
.PP
The default rule (to
.B ECHO
any unmatched character) remains active in start conditions.
.PP
.B BEGIN(0)
returns to the original state where only the rules with
no start conditions are active.  This state can also be
referred to as the start-condition "INITIAL", so
.B BEGIN(INITIAL)
is equivalent to
.B BEGIN(0).
(The parentheses around the start condition name are not required but
are considered good style.)
.PP
.B BEGIN
actions can also be given as indented code at the beginning
of the rules section.  For example, the following will cause
the scanner to enter the "SPECIAL" start condition whenever
.I yylex()
is called and the global variable
.I enter_special
is true:
.nf

            int enter_special;

    %x SPECIAL
    %%
            if ( enter_special )
                BEGIN(SPECIAL);

    <SPECIAL>blahblahblah
    ...more rules follow...

.fi
.PP
To illustrate the uses of start conditions,
here is a scanner which provides two different interpretations
of a string like "123.456".
By default it will treat it as three tokens: the integer "123",
a dot ('.'), and the integer "456".
But if the string is preceded earlier in the line by the string
"expect-floats"
it will treat it as a single token, the floating-point number
123.456:
.nf

    %{
    #include <math.h>
    #include <stdlib.h>   /* atof(), atoi() */
    %}
    %s expect

    %%
    expect-floats        BEGIN(expect);

    <expect>[0-9]+"."[0-9]+      {
                printf( "found a float, = %f\\n",
                        atof( yytext ) );
                }
    <expect>\\n           {
                /* that's the end of the line, so
                 * we need another "expect-floats"
                 * before we'll recognize any more
                 * numbers
                 */
                BEGIN(INITIAL);
                }

    [0-9]+      {
                printf( "found an integer, = %d\\n",
                        atoi( yytext ) );
                }

    "."         printf( "found a dot\\n" );

.fi
Here is a scanner which recognizes (and discards) C comments while
maintaining a count of the current input line.
.nf

    %x comment
    %%
            int line_num = 1;

    "/*"         BEGIN(comment);

    <comment>[^*\\n]*        /* eat anything that's not a '*' */
    <comment>"*"+[^*/\\n]*   /* eat up '*'s not followed by '/'s */
    <comment>\\n             ++line_num;
    <comment>"*"+"/"        BEGIN(INITIAL);

.fi
This scanner goes to a bit of trouble to match as much
text as possible with each rule.  In general, when attempting to write
a high-speed scanner, try to match as much as possible in each rule, as
it's a big win.
.PP
Note that start-condition names are really integer values and
can be stored as such.  Thus, the above could be extended in the
following fashion:
.nf

    %x comment foo
    %%
            int line_num = 1;
            int comment_caller;

    "/*"         {
                 comment_caller = INITIAL;
                 BEGIN(comment);
                 }

    ...

    <foo>"/*"    {
                 comment_caller = foo;
                 BEGIN(comment);
                 }

    <comment>[^*\\n]*        /* eat anything that's not a '*' */
    <comment>"*"+[^*/\\n]*   /* eat up '*'s not followed by '/'s */
    <comment>\\n             ++line_num;
    <comment>"*"+"/"        BEGIN(comment_caller);

.fi
Furthermore, you can access the current start condition using
the integer-valued
.B YY_START
macro.
For example, the above assignments to
.I comment_caller
could instead be written
.nf

    comment_caller = YY_START;

.fi
.PP
Note that start conditions do not have their own name-space; %s's and %x's
declare names in the same fashion as #define's.
.PP
Finally, here's an example of how to match C-style quoted strings using
exclusive start conditions, including expanded escape sequences (but
not including checking for a string that's too long):
.nf

    %x str

    %%
            char string_buf[MAX_STR_CONST];
            char *string_buf_ptr;

    \\"      string_buf_ptr = string_buf; BEGIN(str);

    <str>\\"        { /* saw closing quote - all done */
            BEGIN(INITIAL);
            *string_buf_ptr = '\\0';
            /* return string constant token type and
             * value to parser
             */
            }

    <str>\\n        {
            /* error - unterminated string constant */
            /* generate error message */
            }

    <str>\\\\[0-7]{1,3} {
            /* octal escape sequence */
            unsigned int result;   /* "%o" expects an unsigned int */

            (void) sscanf( yytext + 1, "%o", &result );

            if ( result > 0xff )
                    {
                    /* error, constant is out-of-bounds */
                    }

            *string_buf_ptr++ = result;
            }

    <str>\\\\[0-9]+ {
            /* generate error - bad escape sequence; something
             * like '\\48' or '\\0777777'
             */
            }

    <str>\\\\n  *string_buf_ptr++ = '\\n';
    <str>\\\\t  *string_buf_ptr++ = '\\t';
    <str>\\\\r  *string_buf_ptr++ = '\\r';
    <str>\\\\b  *string_buf_ptr++ = '\\b';
    <str>\\\\f  *string_buf_ptr++ = '\\f';

    <str>\\\\(.|\\n)  *string_buf_ptr++ = yytext[1];

    <str>[^\\\\\\n\\"]+        {
            char *yytext_ptr = yytext;

            while ( *yytext_ptr )
                    *string_buf_ptr++ = *yytext_ptr++;
            }

.fi
.SH MULTIPLE INPUT BUFFERS
Some scanners (such as those which support "include" files)
require reading from several input streams.
As
.I lex
scanners do a large amount of buffering, one cannot control
where the next input will be read from by simply writing a
.B YY_INPUT
which is sensitive to the scanning context.
.B YY_INPUT
is only called when the scanner reaches the end of its buffer, which
may be a long time after scanning a statement such as an "include"
which requires switching the input source.
.PP
To negotiate these sorts of problems,
.I lex
provides a mechanism for creating and switching between multiple
input buffers.  An input buffer is created by using:
.nf

    YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )

.fi
which takes a
.I FILE
pointer and a size and creates a buffer associated with the given
file and large enough to hold
.I size
characters (when in doubt, use
.B YY_BUF_SIZE
for the size).  It returns a
.B YY_BUFFER_STATE
handle, which may then be passed to other routines:
.nf

    void yy_switch_to_buffer( YY_BUFFER_STATE new_buffer )

.fi
switches the scanner's input buffer so subsequent tokens will
come from
.I new_buffer.
Note that
.B yy_switch_to_buffer()
may be used by yywrap() to set things up for continued scanning, instead
of opening a new file and pointing
.I yyin
at it.
.nf

    void yy_delete_buffer( YY_BUFFER_STATE buffer )

.fi
is used to reclaim the storage associated with a buffer.
.PP
.B yy_new_buffer()
is an alias for
.B yy_create_buffer(),
provided for compatibility with the C++ use of
.I new
and
.I delete
for creating and destroying dynamic objects.
.PP
Finally, the
.B YY_CURRENT_BUFFER
macro returns a
.B YY_BUFFER_STATE
handle to the current buffer.
.PP
Here is an example of using these features for writing a scanner
which expands include files (the
.B <<EOF>>
feature is discussed below):
.nf

    /* the "incl" state is used for picking up the name
     * of an include file
     */
    %x incl

    %{
    #define MAX_INCLUDE_DEPTH 10
    YY_BUFFER_STATE include_stack[MAX_INCLUDE_DEPTH];
    int include_stack_ptr = 0;
    %}

    %%
    include             BEGIN(incl);

    [a-z]+              ECHO;
    [^a-z\\n]*\\n?        ECHO;

    <incl>[ \\t]*      /* eat the whitespace */
    <incl>[^ \\t\\n]+   { /* got the include file name */
            if ( include_stack_ptr >= MAX_INCLUDE_DEPTH )
                {
                fprintf( stderr, "Includes nested too deeply" );
                exit( 1 );
                }

            include_stack[include_stack_ptr++] =
                YY_CURRENT_BUFFER;

            yyin = fopen( yytext, "r" );

            if ( ! yyin )
                error( ... );

            yy_switch_to_buffer(
                yy_create_buffer( yyin, YY_BUF_SIZE ) );

            BEGIN(INITIAL);
            }

    <<EOF>> {
            if ( --include_stack_ptr < 0 )
                {
                yyterminate();
                }

            else
                {
                yy_delete_buffer( YY_CURRENT_BUFFER );
                yy_switch_to_buffer(
                    include_stack[include_stack_ptr] );
                }
            }

.fi
.SH END-OF-FILE RULES
The special rule "<<EOF>>" indicates
actions which are to be taken when an end-of-file is
encountered and yywrap() returns non-zero (i.e., indicates
no further files to process).  The action must finish
by doing one of four things:
.IP -
assigning
.I yyin
to a new input file (in previous versions of lex, after doing the
assignment you had to call the special action
.B YY_NEW_FILE;
this is no longer necessary);
.IP -
executing a
.I return
statement;
.IP -
executing the special
.B yyterminate()
action;
.IP -
or, switching to a new buffer using
.B yy_switch_to_buffer()
as shown in the example above.
.PP
<<EOF>> rules may not be used with other
patterns; they may only be qualified with a list of start
conditions.  If an unqualified <<EOF>> rule is given, it
applies to
.I all
start conditions which do not already have <<EOF>> actions.  To
specify an <<EOF>> rule for only the initial start condition, use
.nf

    <INITIAL><<EOF>>

.fi
.PP
These rules are useful for catching things like unclosed comments.
An example:
.nf

    %x quote
    %%

    ...other rules for dealing with quotes...

    <quote><<EOF>>   {
             error( "unterminated quote" );
             yyterminate();
             }
    <<EOF>>  {
             if ( *++filelist )
                 yyin = fopen( *filelist, "r" );
             else
                 yyterminate();
             }

.fi
.SH MISCELLANEOUS MACROS
The macro
.B YY_USER_ACTION
can be defined to provide an action
which is always executed prior to the matched rule's action.
For example,
it could be #define'd to call a routine to convert yytext to lower-case.
.PP
The macro
.B YY_USER_INIT
may be defined to provide an action which is always executed before
the first scan (and before the scanner's internal initializations are done).
For example, it could be used to call a routine to read
in a data table or open a logging file.
.PP
In the generated scanner, the actions are all gathered in one large
switch statement and separated using
.B YY_BREAK,
which may be redefined.  By default, it is simply a "break", to separate
each rule's action from the following rule's.
Redefining
.B YY_BREAK
allows, for example, C++ users to
#define YY_BREAK to do nothing (while being very careful that every
rule ends with a "break" or a "return"!) to avoid
unreachable-statement warnings, which arise when a rule's action ends with
"return" and the
.B YY_BREAK
that follows it can never be reached.
.SH INTERFACING WITH YACC
One of the main uses of
.I lex
is as a companion to the
.I yacc
parser-generator.
.I yacc
parsers expect to call a routine named
.B yylex()
to find the next input token.  The routine is supposed to
return the type of the next token as well as putting any associated
value in the global
.B yylval.
To use
.I lex
with
.I yacc,
one specifies the
.B \-d
option to
.I yacc
to instruct it to generate the file
.B y.tab.h
containing definitions of all the
.B %tokens
appearing in the
.I yacc
input.  This file is then included in the
.I lex
scanner.  For example, if one of the tokens is "TOK_NUMBER",
part of the scanner might look like:
.nf

    %{
    #include "y.tab.h"
    %}

    %%

    [0-9]+        yylval = atoi( yytext ); return TOK_NUMBER;

.fi
.SH OPTIONS
.I lex
has the following options: