.fi
Here is a scanner which recognizes (and discards) C comments while
maintaining a count of the current input line.
.nf

    %x comment
    %%
            int line_num = 1;

    "/*"         BEGIN(comment);

    <comment>[^*\\n]*        /* eat anything that's not a '*' */
    <comment>"*"+[^*/\\n]*   /* eat up '*'s not followed by '/'s */
    <comment>\\n             ++line_num;
    <comment>"*"+"/"        BEGIN(INITIAL);

.fi
Note that start condition names are really integer values and
can be stored as such.  Thus, the above could be extended in the
following fashion:
.nf

    %x comment foo
    %%
            int line_num = 1;
            int comment_caller;

    "/*"         {
                 comment_caller = INITIAL;
                 BEGIN(comment);
                 }

    ...

    <foo>"/*"    {
                 comment_caller = foo;
                 BEGIN(comment);
                 }

    <comment>[^*\\n]*        /* eat anything that's not a '*' */
    <comment>"*"+[^*/\\n]*   /* eat up '*'s not followed by '/'s */
    <comment>\\n             ++line_num;
    <comment>"*"+"/"        BEGIN(comment_caller);

.fi
One can then implement a "stack" of start conditions using an
array of integers.  (It is likely that such stacks will become
a full-fledged
.I flex
feature in the future.)  Note, though, that
start conditions do not have their own name-space; %s's and %x's
declare names in the same fashion as #define's.
.SH MULTIPLE INPUT BUFFERS
Some scanners (such as those which support "include" files)
require reading from several input streams.
As
.I flex
scanners do a large amount of buffering, one cannot control
where the next input will be read from by simply writing a
.B YY_INPUT
which is sensitive to the scanning context.
.B YY_INPUT
is only called when the scanner reaches the end of its buffer, which
may be a long time after scanning a statement such as an "include"
which requires switching the input source.
.LP
To negotiate these sorts of problems,
.I flex
provides a mechanism for creating and switching between multiple
input buffers.  An input buffer is created by using:
.nf

    YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )

.fi
which takes a
.I FILE
pointer and a size and creates a buffer associated with the given
file and large enough to hold
.I size
characters (when in doubt, use
.B YY_BUF_SIZE
for the size).  It returns a
.B YY_BUFFER_STATE
handle, which may then be passed to other routines:
.nf

    void yy_switch_to_buffer( YY_BUFFER_STATE new_buffer )

.fi
switches the scanner's input buffer so subsequent tokens will
come from
.I new_buffer.
Note that
.B yy_switch_to_buffer()
may be used by yywrap() to set things up for continued scanning, instead
of opening a new file and pointing
.I yyin
at it.
.nf

    void yy_delete_buffer( YY_BUFFER_STATE buffer )

.fi
is used to reclaim the storage associated with a buffer.
.LP
.B yy_new_buffer()
is an alias for
.B yy_create_buffer(),
provided for compatibility with the C++ use of
.I new
and
.I delete
for creating and destroying dynamic objects.
.LP
Finally, the
.B YY_CURRENT_BUFFER
macro returns a
.B YY_BUFFER_STATE
handle to the current buffer.
.LP
Here is an example of using these features for writing a scanner
which expands include files (the
.B <<EOF>>
feature is discussed below):
.nf

    /* the "incl" state is used for picking up the name
     * of an include file
     */
    %x incl

    %{
    #define MAX_INCLUDE_DEPTH 10
    YY_BUFFER_STATE include_stack[MAX_INCLUDE_DEPTH];
    int include_stack_ptr = 0;
    %}

    %%
    include             BEGIN(incl);

    [a-z]+              ECHO;
    [^a-z\\n]*\\n?        ECHO;

    <incl>[ \\t]*      /* eat the whitespace */
    <incl>[^ \\t\\n]+   { /* got the include file name */
            if ( include_stack_ptr >= MAX_INCLUDE_DEPTH )
                {
                fprintf( stderr, "Includes nested too deeply" );
                exit( 1 );
                }

            include_stack[include_stack_ptr++] =
                YY_CURRENT_BUFFER;

            yyin = fopen( yytext, "r" );

            if ( ! yyin )
                error( ... );

            yy_switch_to_buffer(
                yy_create_buffer( yyin, YY_BUF_SIZE ) );

            BEGIN(INITIAL);
            }

    <<EOF>> {
            if ( --include_stack_ptr < 0 )
                {
                yyterminate();
                }

            else
                yy_switch_to_buffer(
                     include_stack[include_stack_ptr] );
            }

.fi
.SH END-OF-FILE RULES
The special rule "<<EOF>>" indicates
actions which are to be taken when an end-of-file is
encountered and yywrap() returns non-zero (i.e., indicates
no further files to process).  The action must finish
by doing one of four things:
.IP -
the special
.B YY_NEW_FILE
action, if
.I yyin
has been pointed at a new file to process;
.IP -
a
.I return
statement;
.IP -
the special
.B yyterminate()
action;
.IP -
or, switching to a new buffer using
.B yy_switch_to_buffer()
as shown in the example above.
.LP
<<EOF>> rules may not be used with other
patterns; they may only be qualified with a list of start
conditions.  If an unqualified <<EOF>> rule is given, it
applies to
.I all
start conditions which do not already have <<EOF>> actions.  To
specify an <<EOF>> rule for only the initial start condition, use
.nf

    <INITIAL><<EOF>>

.fi
.LP
These rules are useful for catching things like unclosed comments.
An example:
.nf

    %x quote
    %%

    ...other rules for dealing with quotes...

    <quote><<EOF>>   {
             error( "unterminated quote" );
             yyterminate();
             }
    <<EOF>>  {
             if ( *++filelist )
                 {
                 yyin = fopen( *filelist, "r" );
                 YY_NEW_FILE;
                 }
             else
                yyterminate();
             }

.fi
.SH MISCELLANEOUS MACROS
The macro
.B YY_USER_ACTION
can be redefined to provide an action
which is always executed prior to the matched rule's action.  For example,
it could be #define'd to call a routine to convert yytext to lower-case.
.LP
The macro
.B YY_USER_INIT
may be redefined to provide an action which is always executed before
the first scan (and before the scanner's internal initializations are done).
For example, it could be used to call a routine to read
in a data table or open a logging file.
.LP
In the generated scanner, the actions are all gathered in one large
switch statement and separated using
.B YY_BREAK,
which may be redefined.  By default, it is simply a "break", to separate
each rule's action from the following rule's.
Redefining
.B YY_BREAK
allows, for example, C++ users to
#define YY_BREAK to do nothing (while being very careful that every
rule ends with a "break" or a "return"!) to avoid suffering from
unreachable statement warnings where, because a rule's action ends with
"return", the
.B YY_BREAK
is inaccessible.
.SH INTERFACING WITH YACC
One of the main uses of
.I flex
is as a companion to the
.I yacc
parser-generator.
.I yacc
parsers expect to call a routine named
.B yylex()
to find the next input token.  The routine is supposed to
return the type of the next token as well as putting any associated
value in the global
.B yylval.
To use
.I flex
with
.I yacc,
one specifies the
.B -d
option to
.I yacc
to instruct it to generate the file
.B y.tab.h
containing definitions of all the
.B %tokens
appearing in the
.I yacc
input.  This file is then included in the
.I flex
scanner.
For example, if one of the tokens is "TOK_NUMBER",
part of the scanner might look like:
.nf

    %{
    #include "y.tab.h"
    %}

    %%

    [0-9]+        yylval = atoi( yytext ); return TOK_NUMBER;

.fi
.SH TRANSLATION TABLE
In the name of POSIX compliance,
.I flex
supports a
.I translation table
for mapping input characters into groups.
The table is specified in the first section, and its format looks like:
.nf

    %t
    1        abcd
    2        ABCDEFGHIJKLMNOPQRSTUVWXYZ
    52       0123456789
    6        \\t\\ \\n
    %t

.fi
This example specifies that the characters 'a', 'b', 'c', and 'd'
are to all be lumped into group #1, upper-case letters
in group #2, digits in group #52, tabs, blanks, and newlines into
group #6, and
.I
no other characters will appear in the patterns.
The group numbers are actually disregarded by
.I flex;
.B %t
serves, though, to lump characters together.  Given the above
table, for example, the pattern "a(AA)*5" is equivalent to "d(ZQ)*0".
They both say, "match any character in group #1, followed by
zero-or-more pairs of characters
from group #2, followed by a character from group #52."  Thus
.B %t
provides a crude way for introducing equivalence classes into
the scanner specification.
.LP
Note that the
.B -i
option (see below) coupled with the equivalence classes which
.I flex
automatically generates take care of virtually all the instances
when one might consider using
.B %t.
But what the hell, it's there if you want it.
.SH OPTIONS
.I flex
has the following options:
.TP
.B -b
Generate backtracking information to
.I lex.backtrack.
This is a list of scanner states which require backtracking
and the input characters on which they do so.  By adding rules one
can remove backtracking states.  If all backtracking states
are eliminated and
.B -f
or
.B -F
is used, the generated scanner will run faster (see the
.B -p
flag).  Only users who wish to squeeze every last cycle out of their
scanners need worry about this option.
(See the section on PERFORMANCE CONSIDERATIONS below.)
.TP
.B -c
is a do-nothing, deprecated option included for POSIX compliance.
.IP
.B NOTE:
in previous releases of
.I flex
.B -c
specified table-compression options.  This functionality is
now given by the
.B -C
flag.  To ease the impact of this change, when
.I flex
encounters
.B -c,
it currently issues a warning message and assumes that
.B -C
was desired instead.  In the future this "promotion" of
.B -c
to
.B -C
will go away in the name of full POSIX compliance (unless
the POSIX meaning is removed first).
.TP
.B -d
makes the generated scanner run in
.I debug
mode.  Whenever a pattern is recognized and the global
.B yy_flex_debug
is non-zero (which is the default),
the scanner will write to
.I stderr
a line of the form:
.nf

    --accepting rule at line 53 ("the matched text")

.fi
The line number refers to the location of the rule in the file
defining the scanner (i.e., the file that was fed to flex).  Messages
are also generated when the scanner backtracks, accepts the
default rule, reaches the end of its input buffer (or encounters
a NUL; at this point, the two look the same as far as the scanner's concerned),
or reaches an end-of-file.
.TP
.B -f
specifies (take your pick)
.I full table
or
.I fast scanner.
No table compression is done.  The result is large but fast.
This option is equivalent to
.B -Cf
(see below).
.TP
.B -i
instructs
.I flex
to generate a
.I case-insensitive
scanner.  The case of letters given in the
.I flex
input patterns will
be ignored, and tokens in the input will be matched regardless of case.  The
matched text given in
.I yytext
will have the preserved case (i.e., it will not be folded).
.TP
.B -n
is another do-nothing, deprecated option included only for
POSIX compliance.
.TP
.B -p
generates a performance report to stderr.
The report
consists of comments regarding features of the
.I flex
input file which will cause a loss of performance in the resulting scanner.
Note that the use of
.I REJECT
and variable trailing context (see the BUGS section in flex(1))
entails a substantial performance penalty; use of
.I yymore(),
the
.B ^
operator,
and the
.B -I
flag entail minor performance penalties.
.TP
.B -s
causes the
.I default rule
(that unmatched scanner input is echoed to
.I stdout)
to be suppressed.  If the scanner encounters input that does not
match any of its rules, it aborts with an error.  This option is
useful for finding holes in a scanner's rule set.
.TP
.B -t
instructs
.I flex
to write the scanner it generates to standard output instead
of
.B lex.yy.c.
.TP
.B -v
specifies that
.I flex
should write to
.I stderr
a summary of statistics regarding the scanner it generates.
Most of the statistics are meaningless to the casual
.I flex
user, but the
first line identifies the version of
.I flex,
which is useful for figuring
out where you stand with respect to patches and new releases,