
📄 flex.texi

        BEGIN(INITIAL);
        @}

<<EOF>> @{
        if ( --include_stack_ptr < 0 )
            @{
            yyterminate();
            @}

        else
            @{
            yy_delete_buffer( YY_CURRENT_BUFFER );
            yy_switch_to_buffer(
                 include_stack[include_stack_ptr] );
            @}
        @}
@end example

Three routines are available for setting up input buffers
for scanning in-memory strings instead of files.  All of
them create a new input buffer for scanning the string,
and return a corresponding @code{YY_BUFFER_STATE} handle (which
you should delete with @samp{yy_delete_buffer()} when done with
it).  They also switch to the new buffer using
@samp{yy_switch_to_buffer()}, so the next call to @samp{yylex()} will
start scanning the string.

@table @samp
@item yy_scan_string(const char *str)
scans a NUL-terminated string.

@item yy_scan_bytes(const char *bytes, int len)
scans @code{len} bytes (including possibly NUL's) starting
at location @var{bytes}.
@end table

Note that both of these functions create and scan a @emph{copy}
of the string or bytes.  (This may be desirable, since
@samp{yylex()} modifies the contents of the buffer it is
scanning.)
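
When the text to be scanned should be handed over without a second
copy, the in-place routine @samp{yy_scan_buffer()} described next expects
a buffer whose last two bytes are NUL.  The following is only a sketch
of how such a buffer might be prepared; @code{make_scan_buffer} is a
hypothetical helper and is not part of flex.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not provided by flex): copy `len` bytes from
 * `src` into freshly allocated storage that ends in two NUL bytes.
 * On success, the total size (len + 2) is stored in *size_out and the
 * buffer is returned; the caller owns it and must free() it.
 * Returns NULL if allocation fails. */
char *make_scan_buffer(const char *src, size_t len, size_t *size_out)
{
    char *base = malloc(len + 2);

    if (base == NULL)
        return NULL;

    memcpy(base, src, len);
    base[len] = '\0';       /* first end-of-buffer byte */
    base[len + 1] = '\0';   /* second end-of-buffer byte */

    *size_out = len + 2;
    return base;
}
```

Inside a scanner, the returned pointer and size would then be passed
to @samp{yy_scan_buffer()}, and the resulting handle released with
@samp{yy_delete_buffer()} before the storage itself is freed.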
You can avoid the copy by using:

@table @samp
@item yy_scan_buffer(char *base, yy_size_t size)
which scans in place the buffer starting at @var{base},
consisting of @var{size} bytes, the last two bytes of
which @emph{must} be @code{YY_END_OF_BUFFER_CHAR} (ASCII NUL).
These last two bytes are not scanned; thus,
scanning consists of @samp{base[0]} through @samp{base[size-3]},
inclusive.

If you fail to set up @var{base} in this manner (i.e.,
forget the final two @code{YY_END_OF_BUFFER_CHAR} bytes),
then @samp{yy_scan_buffer()} returns a null pointer instead
of creating a new input buffer.

The type @code{yy_size_t} is an integral type to which you
can cast an integer expression reflecting the size
of the buffer.
@end table

@node End-of-file rules, Miscellaneous, Multiple buffers, Top
@section End-of-file rules

The special rule "<<EOF>>" indicates actions which are to
be taken when an end-of-file is encountered and yywrap()
returns non-zero (i.e., indicates no further files to
process).  The action must finish by doing one of four
things:

@itemize -
@item
assigning @code{yyin} to a new input file (in previous
versions of flex, after doing the assignment you
had to call the special action @code{YY_NEW_FILE}; this is
no longer necessary);

@item
executing a @code{return} statement;

@item
executing the special @samp{yyterminate()} action;

@item
or, switching to a new buffer using
@samp{yy_switch_to_buffer()} as shown in the example
above.
@end itemize

<<EOF>> rules may not be used with other patterns; they
may only be qualified with a list of start conditions.  If
an unqualified <<EOF>> rule is given, it applies to @emph{all}
start conditions which do not already have <<EOF>>
actions.  To specify an <<EOF>> rule for only the initial
start condition, use

@example
<INITIAL><<EOF>>
@end example

These rules are useful for catching things like unclosed
comments.
An example:

@example
%x quote
%%

@dots{}other rules for dealing with quotes@dots{}

<quote><<EOF>>   @{
         error( "unterminated quote" );
         yyterminate();
         @}
<<EOF>>  @{
         if ( *++filelist )
             yyin = fopen( *filelist, "r" );
         else
            yyterminate();
         @}
@end example

@node Miscellaneous, User variables, End-of-file rules, Top
@section Miscellaneous macros

The macro @code{YY_USER_ACTION} can be defined to provide an
action which is always executed prior to the matched
rule's action.  For example, it could be #define'd to call
a routine to convert yytext to lower-case.  When
@code{YY_USER_ACTION} is invoked, the variable @code{yy_act} gives the
number of the matched rule (rules are numbered starting
with 1).  Suppose you want to profile how often each of
your rules is matched.  The following would do the trick:

@example
#define YY_USER_ACTION ++ctr[yy_act]
@end example

where @code{ctr} is an array to hold the counts for the different
rules.  Note that the macro @code{YY_NUM_RULES} gives the total number
of rules (including the default rule, even if you use @samp{-s}), so
a correct declaration for @code{ctr} is:

@example
int ctr[YY_NUM_RULES];
@end example

The macro @code{YY_USER_INIT} may be defined to provide an action
which is always executed before the first scan (and before
the scanner's internal initializations are done).  For
example, it could be used to call a routine to read in a
data table or open a logging file.

The macro @samp{yy_set_interactive(is_interactive)} can be used
to control whether the current buffer is considered
@emph{interactive}.  An interactive buffer is processed more slowly,
but must be used when the scanner's input source is indeed
interactive to avoid problems due to waiting to fill
buffers (see the discussion of the @samp{-I} flag below).  A
non-zero value in the macro invocation marks the buffer as
interactive, a zero value as non-interactive.
Note that
use of this macro overrides @samp{%option always-interactive} or
@samp{%option never-interactive} (see Options below).
@samp{yy_set_interactive()} must be invoked prior to beginning to
scan the buffer that is (or is not) to be considered
interactive.

The macro @samp{yy_set_bol(at_bol)} can be used to control
whether the current buffer's scanning context for the next
token match is done as though at the beginning of a line.
A non-zero macro argument makes rules anchored with
'^' active, while a zero argument makes '^' rules inactive.

The macro @samp{YY_AT_BOL()} returns true if the next token
scanned from the current buffer will have '^' rules
active, false otherwise.

In the generated scanner, the actions are all gathered in
one large switch statement and separated using @code{YY_BREAK},
which may be redefined.  By default, it is simply a
"break", to separate each rule's action from the following
rule's.  Redefining @code{YY_BREAK} allows, for example, C++
users to #define YY_BREAK to do nothing (while being very
careful that every rule ends with a "break" or a
"return"!) to avoid unreachable-statement
warnings in cases where, because a rule's action ends with "return",
the @code{YY_BREAK} is inaccessible.

@node User variables, YACC interface, Miscellaneous, Top
@section Values available to the user

This section summarizes the various values available to
the user in the rule actions.

@itemize -
@item
@samp{char *yytext} holds the text of the current token.
It may be modified but not lengthened (you cannot
append characters to the end).

If the special directive @samp{%array} appears in the
first section of the scanner description, then
@code{yytext} is instead declared @samp{char yytext[YYLMAX]},
where @code{YYLMAX} is a macro definition that you can
redefine in the first section if you don't like the
default value (generally 8KB).
Using @samp{%array}
results in somewhat slower scanners, but the value
of @code{yytext} becomes immune to calls to @samp{input()} and
@samp{unput()}, which potentially destroy its value when
@code{yytext} is a character pointer.  The opposite of
@samp{%array} is @samp{%pointer}, which is the default.

You cannot use @samp{%array} when generating C++ scanner
classes (the @samp{-+} flag).

@item
@samp{int yyleng} holds the length of the current token.

@item
@samp{FILE *yyin} is the file which by default @code{flex} reads
from.  It may be redefined but doing so only makes
sense before scanning begins or after an EOF has
been encountered.  Changing it in the midst of
scanning will have unexpected results since @code{flex}
buffers its input; use @samp{yyrestart()} instead.  Once
scanning terminates because an end-of-file has been
seen, you can assign @code{yyin} to the new input file and
then call the scanner again to continue scanning.

@item
@samp{void yyrestart( FILE *new_file )} may be called to
point @code{yyin} at the new input file.  The switch-over
to the new file is immediate (any previously
buffered-up input is lost).  Note that calling
@samp{yyrestart()} with @code{yyin} as an argument thus throws
away the current input buffer and continues
scanning the same input file.

@item
@samp{FILE *yyout} is the file to which @samp{ECHO} actions are
done.  It can be reassigned by the user.

@item
@code{YY_CURRENT_BUFFER} returns a @code{YY_BUFFER_STATE} handle
to the current buffer.

@item
@code{YY_START} returns an integer value corresponding to
the current start condition.  You can subsequently
use this value with @code{BEGIN} to return to that start
condition.
@end itemize

@node YACC interface, Options, User variables, Top
@section Interfacing with @code{yacc}

One of the main uses of @code{flex} is as a companion to the @code{yacc}
parser-generator.  @code{yacc} parsers expect to call a routine
named @samp{yylex()} to find the next input token.
The routine
is supposed to return the type of the next token as well
as putting any associated value in the global @code{yylval}.  To
use @code{flex} with @code{yacc}, one specifies the @samp{-d} option to @code{yacc} to
instruct it to generate the file @file{y.tab.h} containing
definitions of all the @samp{%tokens} appearing in the @code{yacc} input.
This file is then included in the @code{flex} scanner.  For
example, if one of the tokens is "TOK_NUMBER", part of the
scanner might look like:

@example
%@{
#include "y.tab.h"
%@}

%%

[0-9]+        yylval = atoi( yytext ); return TOK_NUMBER;
@end example

@node Options, Performance, YACC interface, Top
@section Options

@code{flex} has the following options:

@table @samp
@item -b
Generate backing-up information to @file{lex.backup}.
This is a list of scanner states which require
backing up and the input characters on which they
do so.  By adding rules one can remove backing-up
states.  If @emph{all} backing-up states are eliminated
and @samp{-Cf} or @samp{-CF} is used, the generated scanner will
run faster (see the @samp{-p} flag).  Only users who wish
to squeeze every last cycle out of their scanners
need worry about this option.  (See the section on
Performance Considerations below.)

@item -c
is a do-nothing, deprecated option included for
POSIX compliance.

@item -d
makes the generated scanner run in @dfn{debug} mode.
Whenever a pattern is recognized and the global
@code{yy_flex_debug} is non-zero (which is the default),
the scanner will write to @code{stderr} a line of the
form:

@example
--accepting rule at line 53 ("the matched text")
@end example

The line number refers to the location of the rule
in the file defining the scanner (i.e., the file
that was fed to flex).  Messages are also generated
when the scanner backs up, accepts the default
rule, reaches the end of its input buffer (or
encounters a NUL; at this point, the two look the
same as far as the scanner's concerned), or reaches
an end-of-file.

@item -f
specifies @dfn{fast scanner}.
No table compression is
done and stdio is bypassed.  The result is large
but fast.  This option is equivalent to @samp{-Cfr} (see
below).

@item -h
generates a "help" summary of @code{flex}'s options to
@code{stdout} and then exits.  @samp{-?} and @samp{--help} are synonyms
for @samp{-h}.

@item -i
instructs @code{flex} to generate a @emph{case-insensitive}
scanner.  The case of letters given in the @code{flex} input
patterns will be ignored, and tokens in the input
will be matched regardless of case.  The matched
text given in @code{yytext} will have the preserved case
(i.e., it will not be folded).

@item -l
turns on maximum compatibility with the original
AT&T @code{lex} implementation.  Note that this does not
mean @emph{full} compatibility.  Use of this option costs
a considerable amount of performance, and it cannot
be used with the @samp{-+}, @samp{-f}, @samp{-F}, @samp{-Cf}, or @samp{-CF} options.
For details on the compatibilities it provides, see
the section "Incompatibilities With Lex And POSIX"
below.  This option also results in the name
@code{YY_FLEX_LEX_COMPAT} being #define'd in the generated
scanner.

@item -n
is another do-nothing, deprecated option included
only for POSIX compliance.

@item -p
generates a performance report to stderr.  The
report consists of comments regarding features of
the @code{flex} input file which will cause a serious loss
of performance in the resulting scanner.  If you
give the flag twice, you will also get comments
regarding features that lead to minor performance
losses.

Note that the use of @code{REJECT}, @samp{%option yylineno} and
variable trailing context (see the Deficiencies / Bugs section below)
entails a substantial performance penalty; use of @samp{yymore()},
the @samp{^} operator, and the @samp{-I} flag entail minor performance
penalties.

@item -s
causes the @dfn{default rule} (that unmatched scanner
input is echoed to @code{stdout}) to be suppressed.  If
the scanner encounters input that does not match
any of its rules, it aborts with an error.
This
option is useful for finding holes in a scanner's
rule set.

@item -t
instructs @code{flex} to write the scanner it generates to
standard output instead of @file{lex.yy.c}.

@item -v
specifies that @code{flex} should write to @code{stderr} a
summary of statistics regarding the scanner it
generates.  Most of the statistics are meaningless to
the casual @code{flex} user, but the first line identifies
the version of @code{flex} (same as reported by @samp{-V}), and
the next line the flags used when generating the
scanner, including those that are on by default.

@item -w
suppresses warning messages.

@item -B
instructs @code{flex} to generate a @emph{batch} scanner, the
opposite of @emph{interactive} scanners generated by @samp{-I}
(see below).  In general, you use @samp{-B} when you are
@emph{certain} that your scanner will never be used
interactively, and you want to squeeze a @emph{little} more
performance out of it.  If your goal is instead to
squeeze out a @emph{lot} more performance, you should be
using the @samp{-Cf} or @samp{-CF} options (discussed below),
which turn on @samp{-B} automatically anyway.

@item -F
specifies that the @dfn{fast} scanner table
representati
