FLEX(1)                   Minix Programmer's Manual                    FLEX(1)


NAME
     flex, lex - fast lexical analyzer generator

SYNOPSIS
     flex [-bcdfinpstvFILT8 -C[efmF] -Sskeleton] [filename ...]

DESCRIPTION
     flex is a tool for generating scanners: programs which recognize  lexical
     patterns  in  text.   flex  reads  the given input files, or its standard
     input if no file names are given, for  a  description  of  a  scanner  to
     generate.  The description is in the form of pairs of regular expressions
     and C code, called rules. flex generates  as  output  a  C  source  file,
     lex.yy.c,  which  defines  a  routine  yylex(). This file is compiled and
     linked with  the  -lfl  library  to  produce  an  executable.   When  the
     executable  is  run, it analyzes its input for occurrences of the regular
     expressions.  Whenever it finds one,  it  executes  the  corresponding  C
     code.
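
     As a rough illustration of the rule idea only, the hand-written C  sketch
     below pairs two fixed patterns, [0-9]+ and [a-zA-Z]+, with  actions  that
     bump a counter.  It is not flex output: the names counts, run_length, and
     scan are hypothetical, and a real scanner would come from lex.yy.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical counters updated by the rule actions. */
struct counts { int numbers; int words; int other; };

/* Length of the longest run at s whose characters satisfy pred. */
static size_t run_length(const char *s, int (*pred)(int))
{
    size_t n = 0;
    while (s[n] != '\0' && pred((unsigned char)s[n]))
        n++;
    return n;
}

/* Stand-in for a generated yylex(): try each pattern at the current
 * position and run the action paired with the one that matches. */
static struct counts scan(const char *input)
{
    struct counts c = {0, 0, 0};
    assert(input != NULL);
    while (*input != '\0') {
        size_t n;
        if ((n = run_length(input, isdigit)) > 0) {
            c.numbers++;        /* action for [0-9]+ */
        } else if ((n = run_length(input, isalpha)) > 0) {
            c.words++;          /* action for [a-zA-Z]+ */
        } else {
            n = 1;              /* no rule matched: consume one character */
            c.other++;
        }
        input += n;
    }
    return c;
}
```

     Calling scan("abc 123 x9") in this sketch yields two words, two  numbers,
     and two unmatched characters (the spaces).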

     For full documentation, see flexdoc(1). This manual entry is intended for
     use as a quick reference.

OPTIONS
     flex has the following options:

     -b   Generate backtracking information to lex.backtrack. This is  a  list
          of   scanner   states  which  require  backtracking  and  the  input
          characters on which they do so.  By  adding  rules  one  can  remove
          backtracking  states.  If all backtracking states are eliminated and
          -f or -F is used, the generated scanner will run faster.

     -c   is a do-nothing, deprecated option included for POSIX compliance.

          NOTE: in previous releases of flex  -c  specified  table-compression
          options.   This  functionality is now given by the -C flag.  To ease
          the impact of this change, when flex  encounters  -c,  it  currently
          issues  a  warning  message and assumes that -C was desired instead.
          In the future this "promotion" of -c to -C will go away in the  name
          of  full  POSIX  compliance  (unless  the  POSIX  meaning is removed
          first).

     -d   makes the generated scanner run in debug mode.  Whenever  a  pattern
          is recognized and the global yy_flex_debug is non-zero (which is the
          default), the scanner will write to stderr a line of the form:

              --accepting rule at line 53 ("the matched text")

          The line number refers to the location  of  the  rule  in  the  file
          defining  the  scanner  (i.e.,  the  file  that  was  fed  to flex).
          Messages are also generated when the scanner backtracks, accepts the
          default  rule,  reaches the end of its input buffer (or encounters a
          NUL; the two look the same as far as the  scanner's  concerned),  or
          reaches an end-of-file.

     -f   specifies (take your pick) full table  or  fast  scanner.  No  table
          compression  is done.  The result is large but fast.  This option is
          equivalent to -Cf (see below).

     -i   instructs flex to generate a case-insensitive scanner.  The case  of
          letters given in the flex input patterns will be ignored, and tokens
          in the input will be matched regardless of case.  The  matched  text
          given  in  yytext will have its case preserved (i.e., it will not be
          folded).
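
          The behaviour described for -i can be sketched in C as follows.  The
          helper match_nocase is hypothetical and  handles  only  a  literal
          keyword; it shows that comparison ignores case while the input text
          itself is left untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns the length of keyword if text starts with it, ignoring
 * case, and 0 otherwise.  text is never modified, so the caller
 * still sees the original capitalisation. */
static size_t match_nocase(const char *text, const char *keyword)
{
    size_t i;
    assert(text != NULL && keyword != NULL);
    for (i = 0; keyword[i] != '\0'; i++) {
        if (tolower((unsigned char)text[i]) !=
            tolower((unsigned char)keyword[i]))
            return 0;
    }
    return i;
}
```

          With this sketch, match_nocase("BeGiN x", "begin") returns 5 and the
          first five characters of the input still read "BeGiN".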

     -n   is another do-nothing, deprecated option  included  only  for  POSIX
          compliance.

     -p   generates a performance report to stderr.  The  report  consists  of
          comments  regarding features of the flex input file which will cause
          a loss of performance in the resulting scanner.

     -s   causes the default rule (that unmatched scanner input is  echoed  to
          stdout) to be suppressed.  If the scanner encounters input that does
          not match any of its rules, it aborts with an error.

     -t   instructs flex to write the scanner it generates to standard  output
          instead of lex.yy.c.

     -v   specifies that flex should write to stderr a summary  of  statistics
          regarding the scanner it generates.

     -F   specifies that the fast scanner table representation should be used.
          This   representation   is   about   as   fast  as  the  full  table
          representation  (-f),  and  for  some  sets  of  patterns  will   be
          considerably  smaller  (and for others, larger).  See flexdoc(1) for
          details.

          This option is equivalent to -CF (see below).

     -I   instructs flex to  generate  an  interactive  scanner,  that  is,  a
          scanner  which  stops  immediately  rather  than looking ahead if it
          knows that the currently scanned text cannot be  part  of  a  longer
          rule's match.  Again, see flexdoc(1) for details.

          Note, -I cannot be used in conjunction with  full  or  fast  tables,
          i.e., the -f, -F, -Cf, or -CF flags.

     -L   instructs flex not to generate #line  directives  in  lex.yy.c.  The
          default  is  to  generate  such  directives so error messages in the
          actions will be correctly located with respect to the original  flex
          input  file,  and  not  to  the  fairly  meaningless line numbers of
          lex.yy.c.

     -T   makes flex run in trace mode.  It will generate a lot of messages to
          stdout  concerning  the  form  of  the  input and the resultant non-
          deterministic and deterministic finite  automata.   This  option  is
          mostly for use in maintaining flex.

     -8   instructs flex to generate an 8-bit scanner.  On some sites, this is
          the  default.   On  others, the default is 7-bit characters.  To see
          which is the case, check the verbose (-v)  output  for  "equivalence
          classes  created".   If  the denominator of the number shown is 128,
          then by default flex is generating 7-bit characters.  If it is  256,
          then the default is 8-bit characters.

     -C[efmF]
          controls the degree of table compression.

          -Ce directs flex to construct equivalence  classes,  i.e.,  sets  of
          characters  which  have  identical  lexical properties.  Equivalence
          classes usually give dramatic reductions in the  final  table/object
          file  sizes  (typically  a  factor  of  2-5)  and  are  pretty cheap
          performance-wise (one array look-up per character scanned).

          -Cf specifies that the full scanner tables  should  be  generated  -
          flex  should not compress the tables by taking advantage of  similar
          transition functions for different states.

          -CF  specifies  that  the  alternate  fast  scanner   representation
          (described in flexdoc(1)) should be used.

          -Cm directs flex to construct meta-equivalence  classes,  which  are
          sets  of  equivalence classes (or characters, if equivalence classes
          are  not  being  used)  that  are  commonly  used  together.   Meta-
          equivalence  classes  are  often  a  big  win  when using compressed
          tables, but they have a moderate performance impact (one or two "if"
          tests and one array look-up per character scanned).

          A lone -C specifies that the scanner tables should be compressed but
          neither  equivalence  classes nor meta-equivalence classes should be
          used.

          The options -Cf or -CF and -Cm do not make sense together - there is
          no  opportunity  for  meta-equivalence  classes  if the table is not
          being compressed.  Otherwise the options may be freely mixed.

          The default setting  is  -Cem,  which  specifies  that  flex  should
          generate  equivalence  classes  and  meta-equivalence classes.  This
          setting provides the highest degree of table compression.   You  can
          obtain faster-executing scanners at the cost of larger tables,  with
          the following generally being true:


              slowest & smallest
                    -Cem
                    -Cm
                    -Ce
                    -C
                    -C{f,F}e
                    -C{f,F}
              fastest & largest


          -C options are not cumulative; whenever the flag is encountered, the
          previous -C settings are forgotten.
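
          The idea behind the equivalence classes  of  -Ce  can  be  sketched
          in C.  This is not the algorithm flex itself uses; build_ec  and
          add_range are hypothetical helpers, and the  quadratic  search  is
          chosen for clarity.  Characters that belong to exactly the same sets
          receive the same class number, so a scanner table can be indexed  by
          ec[c] (one array look-up) instead of by the character itself.

```c
#include <assert.h>
#include <string.h>

#define NCHARS 256

/* Mark characters lo..hi as members of set. */
static void add_range(unsigned char set[NCHARS], int lo, int hi)
{
    for (int c = lo; c <= hi; c++)
        set[c] = 1;
}

/* Give two characters the same class number when they are members
 * of exactly the same sets.  Returns the number of classes. */
static int build_ec(unsigned char sets[][NCHARS], int nsets, int ec[NCHARS])
{
    int nclasses = 0;
    assert(nsets >= 0);
    for (int c = 0; c < NCHARS; c++) {
        ec[c] = -1;
        for (int p = 0; p < c && ec[c] < 0; p++) {
            int same = 1;
            for (int s = 0; s < nsets; s++) {
                if (sets[s][c] != sets[s][p]) {
                    same = 0;
                    break;
                }
            }
            if (same)
                ec[c] = ec[p];      /* reuse the earlier character's class */
        }
        if (ec[c] < 0)
            ec[c] = nclasses++;     /* first character of a new class */
    }
    return nclasses;
}
```

          For two sets holding '0'-'9' and 'a'-'z', this sketch produces three
          classes: digits, lower-case letters, and everything else.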

     -Sskeleton_file
          overrides the default skeleton file from which flex  constructs  its
          scanners.   You'll  never need this option unless you are doing flex
          maintenance or development.

SUMMARY OF FLEX REGULAR EXPRESSIONS
     The patterns in the input are written using an extended  set  of  regular
     expressions.  These are:

         x          match the character 'x'
         .          any character except newline
         [xyz]      a "character class"; in this case, the pattern
                      matches either an 'x', a 'y', or a 'z'
         [abj-oZ]   a "character class" with a range in it; matches
                      an 'a', a 'b', any letter from 'j' through 'o',
                      or a 'Z'
         [^A-Z]     a "negated character class", i.e., any character
                      but those in the class.  In this case, any
                      character EXCEPT an uppercase letter.
         [^A-Z\n]   any character EXCEPT an uppercase letter or
                      a newline
         r*         zero or more r's, where r is any regular expression
         r+         one or more r's
         r?         zero or one r's (that is, "an optional r")
         r{2,5}     anywhere from two to five r's
         r{2,}      two or more r's
         r{4}       exactly 4 r's
         {name}     the expansion of the "name" definition
                    (see above)
         "[xyz]\"foo"
                    the literal string: [xyz]"foo
         \X         if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v',
                      then the ANSI-C interpretation of \x.
                      Otherwise, a literal 'X' (used to escape
                      operators such as '*')
         \123       the character with octal value 123
         \x2a       the character with hexadecimal value 2a
         (r)        match an r; parentheses are used to override
                      precedence (see below)


         rs         the regular expression r followed by the
                      regular expression s; called "concatenation"


         r|s        either an r or an s


         r/s        an r but only if it is followed by an s.  The
                      s is not part of the matched text.  This type
                      of pattern is called "trailing context".
         ^r         an r, but only at the beginning of a line
         r$         an r, but only at the end of a line.  Equivalent
                      to "r/\n".


         <s>r       an r, but only in start condition s (see
                    below for discussion of start conditions)
         <s1,s2,s3>r
                    same, but in any of start conditions s1,
                    s2, or s3


         <<EOF>>    an end-of-file
         <s1,s2><<EOF>>
                    an end-of-file when in start condition s1 or s2

     The regular expressions listed above are grouped according to precedence,
     from  highest  precedence  at  the  top  to  lowest at the bottom.  Those
     grouped together have equal precedence.
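
     The numeric escapes in the table (\123 and \x2a) can be worked through in
     C.  The helper decode_escape is hypothetical and handles only  these  two
     forms; it relies on the standard strtol() to do the base conversion.

```c
#include <assert.h>
#include <stdlib.h>

/* Decode "\NNN" (octal) or "\xHH" (hexadecimal) into a character
 * value.  esc must start with a backslash. */
static int decode_escape(const char *esc)
{
    assert(esc != NULL && esc[0] == '\\');
    if (esc[1] == 'x')
        return (int)strtol(esc + 2, NULL, 16);  /* hexadecimal digits */
    return (int)strtol(esc + 1, NULL, 8);       /* octal digits */
}
```

     Octal 123 is 1*64 + 2*8 + 3 = 83 and hexadecimal 2a is 2*16 + 10 = 42; on
     an ASCII system (an assumption here) these are 'S' and '*'.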

     Some notes on patterns:

     -    Negated  character  classes  match  newlines  unless  "\n"  (or   an
          equivalent  escape  sequence)  is  one  of the characters explicitly
          present in the negated character class (e.g., "[^A-Z\n]").

     -    A rule can have at most one instance of trailing  context  (the  '/'
          operator  or  the  '$'  operator).   The  start  condition, '^', and
          "<<EOF>>" patterns can only occur at the  beginning  of  a  pattern,
          and,  as  well  as  with  '/'  and  '$',  cannot  be  grouped inside
          parentheses.  The following are all illegal:

              foo/bar$
              foo|(bar$)
              foo|^bar
              <sc1>foo<sc2>bar
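
     The effect of trailing context (r/s) can be sketched in C for the  simple
     case where both r and s are literal strings.  The  helper  match_trailing
     is hypothetical; it reports only the length of  r,  mirroring  the  point
     that s is checked but is not part of the matched text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns strlen(r) when input starts with r immediately followed
 * by s, and 0 otherwise.  The caller advances by the returned
 * length only, so s remains in the input to be scanned again. */
static size_t match_trailing(const char *input, const char *r, const char *s)
{
    size_t rlen = strlen(r);
    size_t slen = strlen(s);
    assert(rlen > 0);
    if (strncmp(input, r, rlen) != 0)
        return 0;                   /* r itself does not match */
    if (strncmp(input + rlen, s, slen) != 0)
        return 0;                   /* r matched but s does not follow */
    return rlen;
}
```

     With this sketch, match_trailing("foobar", "foo", "bar") returns  3,  and
     the text left to scan after the match is still "bar".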


                                 26 May 1990                                 5