          much  text as the originally chosen rule but came later
          in the flex input file, or one which matched less text.
          For example, the following will both count the words in
          the input  and  call  the  routine  special()  whenever
          "frob" is seen:

                      int word_count = 0;
              %%

              frob        special(); REJECT;
              [^ \t\n]+   ++word_count;

          Without the REJECT, any occurrences of "frob" in the
          input would not be counted as words, since the scanner
          normally executes only one action per token.  Multiple
          REJECTs are allowed, each one finding the next best
          choice to the
          currently active rule.  For example, when the following
          scanner scans the token "abcd", it will write
          "abcdabcaba" to the output:

              %%
              a        |
              ab       |
              abc      |
              abcd     ECHO; REJECT;
              .|\n     /* eat up any unmatched character */

          (The first three rules share the fourth's action  since
          they   use   the  special  '|'  action.)  REJECT  is  a



Version 2.5          Last change: April 1995                   12






FLEX(1)                  USER COMMANDS                    FLEX(1)



          particularly expensive feature in terms of scanner per-
          formance; if it is used in any of the scanner's actions
          it will  slow  down  all  of  the  scanner's  matching.
          Furthermore,  REJECT cannot be used with the -Cf or -CF
          options (see below).

          Note also that unlike the other special actions, REJECT
          is  a  branch;  code  immediately  following  it in the
          action will not be executed.
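
     The order in which successive REJECTs fall through can be
     modeled in plain C.  The sketch below is not code that flex
     generates; simulate_reject() and append_bytes() are
     hypothetical names, and the rule list is hard-coded to the
     "abcd" example above.  At each input position it runs the
     ECHO of every literal rule that matches, longest first, and
     then lets the catch-all rule consume one character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append the first n bytes of s to the NUL-terminated string in out,
 * truncating silently if the buffer (capacity cap) is too small. */
static void append_bytes(char *out, size_t cap, const char *s, size_t n)
{
    size_t used = strlen(out);
    size_t room = (cap > used + 1) ? cap - used - 1 : 0;

    if (n > room)
        n = room;
    memcpy(out + used, s, n);
    out[used + n] = '\0';
}

/* Hypothetical model of the four-rule scanner, not flex output. */
void simulate_reject(const char *input, char *out, size_t cap)
{
    /* Literal rules in decreasing match length: the order in which
     * successive REJECTs reach them. */
    static const char *const rules[] = { "abcd", "abc", "ab", "a" };
    size_t i;

    assert(input != NULL && out != NULL && cap > 0);
    out[0] = '\0';

    while (*input != '\0') {
        for (i = 0; i < sizeof rules / sizeof rules[0]; ++i) {
            size_t len = strlen(rules[i]);

            if (strncmp(input, rules[i], len) == 0)
                append_bytes(out, cap, input, len);  /* ECHO; REJECT; */
        }
        ++input;  /* the catch-all rule finally consumes one character */
    }
}
```

     Calling simulate_reject("abcd", out, sizeof out) leaves
     "abcdabcaba" in out, matching the output described above.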

     -    yymore() tells  the  scanner  that  the  next  time  it
          matches  a  rule,  the  corresponding  token  should be
          appended onto the current value of yytext  rather  than
          replacing  it.   For  example,  given  the input "mega-
          kludge" the following will write "mega-mega-kludge"  to
          the output:

              %%
              mega-    ECHO; yymore();
              kludge   ECHO;

          First "mega-" is matched  and  echoed  to  the  output.
          Then  "kludge"  is matched, but the previous "mega-" is
          still hanging around at the beginning of yytext so  the
          ECHO  for  the "kludge" rule will actually write "mega-
          kludge".

     Two notes regarding use of yymore(). First, yymore() depends
     on  the value of yyleng correctly reflecting the size of the
     current token, so you must not  modify  yyleng  if  you  are
     using  yymore().  Second,  the  presence  of yymore() in the
     scanner's action entails a minor performance penalty in  the
     scanner's matching speed.

     -    yyless(n) returns all but the first n characters of the
          current token back to the input stream, where they will
          be rescanned when the scanner looks for the next match.
          yytext  and  yyleng  are  adjusted appropriately (e.g.,
          yyleng will now be equal to n).  For example, on the
          input "foobar" the following will write out
          "foobarbar":

              %%
              foobar    ECHO; yyless(3);
              [a-z]+    ECHO;

          An argument of  0  to  yyless  will  cause  the  entire
          current  input  string  to  be  scanned  again.  Unless
          you've changed how the scanner will  subsequently  pro-
          cess  its  input  (using BEGIN, for example), this will
          result in an endless loop.

     Note that yyless is a macro and can only be used in the flex
     input file, not from other source files.
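
     The effect of yyless(3) in the "foobar" example can be
     traced with a small model in plain C.  This is a sketch
     under stated assumptions, not flex internals:
     simulate_yyless() and lower_run() are hypothetical names,
     and only the two rules from the example plus flex's default
     rule (which echoes unmatched text) are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the run of lowercase ASCII letters at the start of s. */
static size_t lower_run(const char *s)
{
    size_t n = 0;

    while (s[n] >= 'a' && s[n] <= 'z')
        ++n;
    return n;
}

/* Hypothetical model of the two-rule scanner from the example. */
void simulate_yyless(const char *input, char *out, size_t cap)
{
    size_t used = 0;

    assert(input != NULL && out != NULL && cap > 0);

    while (*input != '\0') {
        size_t run = lower_run(input);
        size_t echo_len, consumed;

        if (run == 6 && strncmp(input, "foobar", 6) == 0) {
            /* Both rules match 6 characters; the earlier one wins. */
            echo_len = 6;
            consumed = 3;            /* ECHO; yyless(3); */
        } else if (run > 0) {
            echo_len = consumed = run;   /* [a-z]+  ECHO; */
        } else {
            echo_len = consumed = 1;     /* default rule echoes one char */
        }

        if (used + echo_len >= cap)
            break;                   /* output buffer full: stop early */
        memcpy(out + used, input, echo_len);
        used += echo_len;
        input += consumed;           /* the rest is rescanned next pass */
    }
    out[used] = '\0';
}
```

     In this model, a consumed value of 0 would correspond to
     yyless(0): the input position never advances, which is the
     endless loop warned about above.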

     -    unput(c) puts the  character  c  back  onto  the  input
          stream.   It  will  be the next character scanned.  The
          following action will take the current token and  cause
          it to be rescanned enclosed in parentheses.

              {
              int i;
              /* Copy yytext because unput() trashes yytext */
              char *yycopy = strdup( yytext );
              unput( ')' );
              for ( i = yyleng - 1; i >= 0; --i )
                  unput( yycopy[i] );
              unput( '(' );
              free( yycopy );
              }

          Note that since each unput() puts the  given  character
          back at the beginning of the input stream, pushing back
          strings must be done back-to-front.
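
          Why the order must be reversed can be seen with a
          minimal model that treats pushed-back input as a
          last-in, first-out stack.  The names model_unput(),
          model_input() and parenthesize() are hypothetical
          stand-ins, not flex functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PUSHBACK_MAX 64

/* Hypothetical stand-in for the scanner's pushed-back input. */
struct pushback {
    char data[PUSHBACK_MAX];
    size_t count;
};

/* Model of unput(c): c becomes the next character to be read. */
static void model_unput(struct pushback *pb, char c)
{
    assert(pb->count < PUSHBACK_MAX);
    pb->data[pb->count++] = c;
}

/* Model of input(): most recently pushed character, or -1 if none. */
static int model_input(struct pushback *pb)
{
    return pb->count > 0 ? (unsigned char)pb->data[--pb->count] : -1;
}

/* Push ")", the token reversed, then "(", and read the result back. */
void parenthesize(const char *token, char *out, size_t cap)
{
    struct pushback pb = { {0}, 0 };
    size_t len, i, n = 0;
    int c;

    assert(token != NULL && out != NULL && cap > 0);
    len = strlen(token);
    assert(len + 2 <= PUSHBACK_MAX);

    model_unput(&pb, ')');
    for (i = len; i > 0; --i)        /* last character first */
        model_unput(&pb, token[i - 1]);
    model_unput(&pb, '(');

    while (n + 1 < cap && (c = model_input(&pb)) != -1)
        out[n++] = (char)c;
    out[n] = '\0';
}
```

          With token "frob", the characters read back are
          "(frob)": the last character pushed is the first one
          scanned.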

     An important potential problem when using unput() is that if
     you are using %pointer (the default), a call to unput() des-
     troys the contents of yytext, starting  with  its  rightmost
     character  and devouring one character to the left with each
     call.  If you need the value of  yytext  preserved  after  a
     call  to  unput() (as in the above example), you must either
     first copy it elsewhere, or build your scanner using  %array
     instead (see How The Input Is Matched).

     Finally, note that you cannot put back  EOF  to  attempt  to
     mark the input stream with an end-of-file.

     -    input() reads the next character from the input stream.
          For  example, the following is one way to eat up C com-
          ments:

              %%
              "/*"        {
                          register int c;

                          for ( ; ; )
                              {
                              while ( (c = input()) != '*' &&
                                      c != EOF )
                                  ;    /* eat up text of comment */

                              if ( c == '*' )
                                  {
                                  while ( (c = input()) == '*' )
                                      ;
                                  if ( c == '/' )
                                      break;    /* found the end */
                                  }

                              if ( c == EOF )
                                  {
                                  error( "EOF in comment" );
                                  break;
                                  }
                              }
                          }

          (Note that if the scanner is compiled using  C++,  then
          input()  is  instead referred to as yyinput(), in order
          to avoid a name clash with the C++ stream by  the  name
          of input.)
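
          The same loop can be exercised outside a scanner by
          reading from a string instead of calling input().  In
          this sketch skip_comment_body() is a hypothetical
          name, the end of the string stands in for EOF, and a
          NULL return replaces the call to error().

```c
#include <assert.h>
#include <stddef.h>

/* text points just past the opening delimiter of a C comment.
 * Returns a pointer just past the closing delimiter, or NULL if
 * the string ends before the comment is closed. */
const char *skip_comment_body(const char *text)
{
    assert(text != NULL);

    for (;;) {
        while (*text != '*' && *text != '\0')
            ++text;              /* eat up text of comment */

        if (*text == '\0')
            return NULL;         /* end of input inside the comment */

        while (*text == '*')
            ++text;              /* skip a run of stars */

        if (*text == '/')
            return text + 1;     /* found the end */

        if (*text == '\0')
            return NULL;         /* end of input right after the stars */
        /* Otherwise the stars were part of the body: keep scanning. */
    }
}
```

          Given the text " hi **/x", the function returns a
          pointer to the "x".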

     -    YY_FLUSH_BUFFER flushes the scanner's  internal  buffer
          so  that  the next time the scanner attempts to match a
          token, it will first refill the buffer  using  YY_INPUT
          (see  The  Generated Scanner, below).  This action is a
          special case  of  the  more  general  yy_flush_buffer()
          function, described below in the section Multiple Input
          Buffers.

     -    yyterminate() can be used in lieu of a return statement
          in  an action.  It terminates the scanner and returns a
          0 to the scanner's caller, indicating "all  done".   By
          default,  yyterminate()  is also called when an end-of-
          file is encountered.  It is a macro and  may  be  rede-
          fined.

THE GENERATED SCANNER
     The output of flex is the file lex.yy.c, which contains  the
     scanning  routine yylex(), a number of tables used by it for
     matching tokens, and a number of auxiliary routines and mac-
     ros.  By default, yylex() is declared as follows:

         int yylex()
             {
             ... various definitions and the actions in here ...
             }

     (If your environment supports function prototypes,  then  it
     will  be  "int  yylex(  void  )".)   This  definition may be
     changed by defining the "YY_DECL" macro.  For  example,  you
     could use:

         #define YY_DECL float lexscan( a, b ) float a, b;

     to give the scanning routine the name lexscan,  returning  a
     float, and taking two floats as arguments.  Note that if you
     give  arguments  to  the  scanning  routine  using  a   K&R-
     style/non-prototyped  function  declaration,  you  must ter-
     minate the definition with a semi-colon (;).
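
     With a prototype-style declaration no trailing semi-colon
     is needed, because YY_DECL is followed directly by the
     function body.  The sketch below shows only how the macro
     expands; the body and its precondition are stand-ins
     invented for illustration, since in a real scanner flex
     supplies the body.

```c
#include <assert.h>

/* Prototype-style counterpart of the K&R definition:
 * no semi-colon after the parameter list. */
#define YY_DECL float lexscan(float a, float b)

/* In lex.yy.c the macro is written immediately before the body of
 * the scanning routine.  This body is a stand-in, not flex output. */
YY_DECL
{
    assert(a <= b);              /* hypothetical precondition */
    return (a + b) / 2.0f;
}
```

     The caller then invokes lexscan(a, b) wherever it would
     otherwise have called yylex().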

     Whenever yylex() is called, it scans tokens from the  global
     input  file  yyin  (which  defaults to stdin).  It continues
     until it either reaches an end-of-file (at  which  point  it
     returns the value 0) or one of its actions executes a return
     statement.

     If the scanner reaches an end-of-file, subsequent calls  are
     undefined  unless either yyin is pointed at a new input file
     (in which case scanning continues from that file), or yyres-
     tart()  is called.  yyrestart() takes one argument, a FILE *
     pointer (which can be NULL, if you've set up YY_INPUT to scan
     from  a  source  other  than yyin), and initializes yyin for
     scanning from that file.  Essentially there is no difference
     between  just  assigning  yyin  to a new input file or using
     yyrestart() to do so; the latter is available  for  compati-
     bility with previous versions of flex, and because it can be
     used to switch input files in the middle  of  scanning.   It
     can  also be used to throw away the current input buffer, by
     calling it with an argument of yyin; but better  is  to  use
     YY_FLUSH_BUFFER (see above).  Note that yyrestart() does not
     reset the start condition to INITIAL (see Start  Conditions,
     below).

     If yylex() stops scanning due to executing a  return  state-
     ment  in  one of the actions, the scanner may then be called
     again and it will resume scanning where it left off.

     By default (and for purposes  of  efficiency),  the  scanner
     uses  block-reads  rather  than  simple getc() calls to read
     characters from yyin. The nature of how it  gets  its  input
     can be controlled by defining the YY_INPUT macro.
     YY_INPUT's calling sequence is
     "YY_INPUT(buf,result,max_size)".  Its action is to place up
     to max_size characters in the character array buf and return
     in  the integer variable result either the number of charac-
     ters read or the constant YY_NULL (0  on  Unix  systems)  to
     indicate  EOF.   The  default YY_INPUT reads from the global
     file-pointer "yyin".

     A sample definition of YY_INPUT (in the definitions  section
     of the input file):

         %{
         #define YY_INPUT(buf,result,max_size) \
             { \
             int c = getchar(); \
             result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
             }
         %}

     This definition will change the input  processing  to  occur
     one character at a time.
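
     The same contract can be met by a block read from memory.
     The following is a sketch under stated assumptions:
     set_scan_source() and read_source() are hypothetical
     helpers, YY_NULL is defined locally only so that the
     fragment compiles outside lex.yy.c, and result is assumed
     to be an int as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef YY_NULL
#define YY_NULL 0   /* flex normally provides this definition */
#endif

/* Hypothetical in-memory source: the text not yet handed out. */
static const char *scan_source = "";

void set_scan_source(const char *text)
{
    assert(text != NULL);
    scan_source = text;
}

/* Copy up to max_size bytes of the remaining text into buf and
 * return how many were copied (0 once the text is exhausted). */
size_t read_source(char *buf, size_t max_size)
{
    size_t n = strlen(scan_source);

    assert(buf != NULL);
    if (n > max_size)
        n = max_size;
    memcpy(buf, scan_source, n);
    scan_source += n;
    return n;
}

/* Block-oriented YY_INPUT built on the helper above. */
#define YY_INPUT(buf, result, max_size) \
    { \
    size_t n_read_ = read_source((buf), (size_t)(max_size)); \
    (result) = (n_read_ == 0) ? YY_NULL : (int)n_read_; \
    }
```

     Unlike the getchar() version, this definition hands the
     scanner up to max_size characters per call and reports
     YY_NULL only once the text is exhausted.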

     When the scanner receives  an  end-of-file  indication  from
     YY_INPUT, it then checks the yywrap() function.  If yywrap()
     returns false (zero), then it is assumed that  the  function
     has  gone  ahead  and  set up yyin to point to another input
     file, and scanning continues.   If  it  returns  true  (non-
     zero),  then  the  scanner  terminates,  returning  0 to its
     caller.  Note that  in  either  case,  the  start  condition
     remains unchanged; it does not revert to INITIAL.

     If you do not supply your own version of yywrap(), then  you
     must  either use %option noyywrap (in which case the scanner
     behaves as though yywrap() returned 1),  or  you  must  link
     with  -lfl  to  obtain  the  default version of the routine,
     which always returns 1.
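
     A user-supplied yywrap() that moves on to the next file in
     a list might look like the sketch below.  The list and the
     set_pending_files() helper are hypothetical, and yyin is
     defined here only so that the fragment compiles on its
     own; in a real scanner that definition comes from
     lex.yy.c and must not be repeated.

```c
#include <assert.h>
#include <stdio.h>

FILE *yyin;   /* normally defined by the generated scanner */

/* Hypothetical NULL-terminated list of files still to be scanned. */
static const char *const *pending_files;

void set_pending_files(const char *const *names)
{
    assert(names != NULL);
    pending_files = names;
}

/* Return 0 after pointing yyin at the next readable file,
 * or 1 when no input is left. */
int yywrap(void)
{
    if (yyin != NULL && yyin != stdin) {
        fclose(yyin);            /* done with the previous file */
        yyin = NULL;
    }

    while (pending_files != NULL && *pending_files != NULL) {
        yyin = fopen(*pending_files++, "r");
        if (yyin != NULL)
            return 0;            /* more input: scanning continues */
        /* Unreadable file: silently try the next name. */
    }
    return 1;                    /* no more input: scanner terminates */
}
```

     Skipping unreadable files without a diagnostic is a
     simplification made for this sketch.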

     Three routines are available for scanning from in-memory
     buffers rather than files: yy_scan_string(),
     yy_scan_bytes(), and yy_scan_buffer(). See the discussion of
     them below in the section Multiple Input Buffers.

     The scanner writes its  ECHO  output  to  the  yyout  global
     (default, stdout), which may be redefined by the user simply
     by assigning it to some other FILE pointer.

START CONDITIONS
     flex  provides  a  mechanism  for  conditionally  activating
     rules.   Any rule whose pattern is prefixed with "<sc>" will
     only be active when the scanner is in  the  start  condition
     named "sc".  For example,

         <STRING>[^"]*        { /* eat up the string body ... */
                     ...
                     }

     will be active only when the  scanner  is  in  the  "STRING"
     start condition, and

         <INITIAL,STRING,QUOTE>\.        { /* handle an escape ... */
                     ...
                     }

     will be active only when  the  current  start  condition  is
     either "INITIAL", "STRING", or "QUOTE".

     Start conditions are declared  in  the  definitions  (first)
     section  of  the input using unindented lines beginning with
     either %s or %x followed by a list  of  names.   The  former
     declares  inclusive  start  conditions, the latter exclusive
     start conditions.  A start condition is activated using  the
     BEGIN  action.   Until  the  next  BEGIN action is executed,
     rules with the given start  condition  will  be  active  and
