FLEX(1)                  USER COMMANDS                    FLEX(1)



NAME
     flex - fast lexical analyzer generator

SYNOPSIS
     flex [-bcdfhilnpstvwBFILTV78+? -C[aefFmr] -ooutput  -Pprefix
     -Sskeleton] [--help --version] [filename ...]

OVERVIEW
     This manual describes flex, a tool for  generating  programs
     that  perform pattern-matching on text.  The manual includes
     both tutorial and reference sections:

         Description
             a brief overview of the tool

         Some Simple Examples

         Format Of The Input File

         Patterns
             the extended regular expressions used by flex

         How The Input Is Matched
             the rules for determining what has been matched

         Actions
             how to specify what to do when a pattern is matched

         The Generated Scanner
             details regarding the scanner that flex produces;
             how to control the input source

         Start Conditions
             introducing context into your scanners, and
             managing "mini-scanners"

         Multiple Input Buffers
             how to manipulate multiple input sources; how to
             scan from strings instead of files

         End-of-file Rules
             special rules for matching the end of the input

         Miscellaneous Macros
             a summary of macros available to the actions

         Values Available To The User
             a summary of values available to the actions

         Interfacing With Yacc
             connecting flex scanners together with yacc parsers




Version 2.5          Last change: April 1995                    1

         Options
             flex command-line options, and the "%option"
             directive

         Performance Considerations
             how to make your scanner go as fast as possible

         Generating C++ Scanners
             the (experimental) facility for generating C++
             scanner classes

         Incompatibilities With Lex And POSIX
             how flex differs from AT&T lex and the POSIX lex
             standard

         Diagnostics
             those error messages produced by flex (or scanners
             it generates) whose meanings might not be apparent

         Files
             files used by flex

         Deficiencies / Bugs
             known problems with flex

         See Also
             other documentation, related tools

         Author
             includes contact information


DESCRIPTION
     flex is a  tool  for  generating  scanners:  programs  which
     recognize  lexical  patterns in text.  flex reads the given
     input files, or its standard input  if  no  file  names  are
     given,  for  a  description  of  a scanner to generate.  The
     description is in the form of pairs of  regular  expressions
     and  C  code,  called  rules.  flex  generates as output a C
     source file, lex.yy.c, which defines a routine yylex(). This
     file is compiled and linked with the -lfl library to produce
     an executable.  When the executable is run, it analyzes  its
     input  for occurrences of the regular expressions.  Whenever
     it finds one, it executes the corresponding C code.

SOME SIMPLE EXAMPLES
     First, some simple examples to give the flavor of how one
     uses flex.  The following flex input specifies a scanner
     which, whenever it encounters the string "username",
     replaces it with the user's login name:

         %%
         username    printf( "%s", getlogin() );

     By default, any text not matched by a flex scanner is copied
     to the output, so the net effect of this scanner is to copy
     its input file to its output with each occurrence of
     "username" expanded.  In this input, there is just one rule.
     "username" is the pattern and the "printf" is the action.
     The "%%" marks the beginning of the rules.
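
     The observable behavior of this one-rule scanner can be
     sketched by hand in plain C.  This is not the code flex
     generates; the function name expand_username() and its
     buffer-based interface are assumptions made for illustration,
     and the login name is passed in rather than obtained from
     getlogin().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hand-written stand-in for the one-rule scanner (hypothetical helper,
 * not flex output).  Copies `in` to `out`, replacing every occurrence
 * of "username" with `login`.  Returns 0 on success, or -1 if `out`
 * (of size `outsz`, including the terminating NUL) is too small.
 */
int expand_username(const char *in, const char *login,
                    char *out, size_t outsz)
{
    static const char pattern[] = "username";
    const size_t plen = sizeof pattern - 1;
    size_t llen;
    size_t used = 0;

    assert(in != NULL && login != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    llen = strlen(login);

    while (*in != '\0') {
        if (strncmp(in, pattern, plen) == 0) {
            /* the rule's action: emit the login name instead */
            if (used + llen >= outsz)
                return -1;
            memcpy(out + used, login, llen);
            used += llen;
            in += plen;
        } else {
            /* default behavior: unmatched text is copied through */
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *in++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

     Called with the input "hi username!" and the login "alice",
     the sketch leaves "hi alice!" in the output buffer.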

     Here's another simple example:

                 int num_lines = 0, num_chars = 0;

         %%
         \n      ++num_lines; ++num_chars;
         .       ++num_chars;

         %%
         int main( void )
                 {
                 yylex();
                 printf( "# of lines = %d, # of chars = %d\n",
                         num_lines, num_chars );
                 return 0;
                 }

     This scanner counts the number of characters and the  number
     of  lines in its input (it produces no output other than the
     final report on the counts).  The first  line  declares  two
     globals,  "num_lines"  and "num_chars", which are accessible
     both inside yylex() and in the main() routine declared after
     the  second  "%%".  There are two rules, one which matches a
     newline ("\n") and increments both the line  count  and  the
     character  count,  and one which matches any character other
     than a newline (indicated by the "." regular expression).
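
     For comparison, the effect of the two rules can be written
     out by hand.  The following is only a sketch of what the
     rules compute, not what flex emits; the struct counts type
     and the count_text() function are hypothetical names
     introduced here, and the text is taken from a string rather
     than from the scanner's input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type holding the two counters. */
struct counts {
    int num_lines;
    int num_chars;
};

/* Hand-written equivalent of the two rules; not flex output. */
struct counts count_text(const char *text)
{
    struct counts c = { 0, 0 };

    assert(text != NULL);

    for (; *text != '\0'; ++text) {
        if (*text == '\n') {
            /* first rule: a newline counts as a line and a character */
            ++c.num_lines;
            ++c.num_chars;
        } else {
            /* second rule: "." matches any other character */
            ++c.num_chars;
        }
    }
    return c;
}
```

     Because "." never matches a newline, every character is
     handled by exactly one of the two branches, just as every
     input character is matched by exactly one of the two rules.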

     A somewhat more complicated example:

         /* scanner for a toy Pascal-like language */

         %{
         /* need this for the calls to atoi() and atof() below */
         #include <stdlib.h>
         %}

         DIGIT    [0-9]
         ID       [a-z][a-z0-9]*

         %%

         {DIGIT}+    {
                     printf( "An integer: %s (%d)\n", yytext,
                             atoi( yytext ) );
                     }

         {DIGIT}+"."{DIGIT}*        {
                     printf( "A float: %s (%g)\n", yytext,
                             atof( yytext ) );
                     }

         if|then|begin|end|procedure|function        {
                     printf( "A keyword: %s\n", yytext );
                     }

         {ID}        printf( "An identifier: %s\n", yytext );

         "+"|"-"|"*"|"/"   printf( "An operator: %s\n", yytext );

         "{"[^}\n]*"}"     /* eat up one-line comments */

         [ \t\n]+          /* eat up whitespace */

         .           printf( "Unrecognized character: %s\n", yytext );

         %%

         int main( int argc, char **argv )
             {
             ++argv, --argc;  /* skip over program name */
             if ( argc > 0 )
                     yyin = fopen( argv[0], "r" );
             else
                     yyin = stdin;

             yylex();
             return 0;
             }

     This is the beginning of a simple scanner for a language
     like Pascal.  It identifies different types of tokens and
     reports on what it has seen.

     The details of this example will be explained in the
     following sections.

FORMAT OF THE INPUT FILE
     The flex input file consists of three sections, separated by
     a line with just %% in it:

         definitions
         %%
         rules
         %%
         user code

     The definitions section contains declarations of simple name
     definitions  to  simplify  the  scanner  specification,  and
     declarations of start conditions, which are explained  in  a
     later section.

     Name definitions have the form:

         name definition

     The "name" is a word beginning with a letter or an
     underscore ('_') followed by zero or more letters, digits,
     '_', or '-' (dash).  The definition is taken to begin at the
     first non-white-space character following the name and to
     continue to the end of the line.  The definition can
     subsequently be referred to using "{name}", which will
     expand to "(definition)".  For example,

         DIGIT    [0-9]
         ID       [a-z][a-z0-9]*

     defines "DIGIT" to be a regular expression which  matches  a
     single  digit,  and  "ID"  to  be a regular expression which
     matches a letter followed by zero-or-more letters-or-digits.
     A subsequent reference to

         {DIGIT}+"."{DIGIT}*

     is identical to

         ([0-9])+"."([0-9])*

     and matches one-or-more digits followed by a '.' followed by
     zero-or-more digits.
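
     The textual expansion of "{name}" into "(definition)" can be
     sketched in C as follows.  This is an illustration only, not
     flex's own implementation: the defs table and the
     expand_names() function are hypothetical, and the sketch
     ignores quoting and character classes, simply copying any
     brace group that does not name a known definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of name definitions, for illustration only. */
struct name_def {
    const char *name;
    const char *definition;
};

static const struct name_def defs[] = {
    { "DIGIT", "[0-9]" },
    { "ID",    "[a-z][a-z0-9]*" },
};

/* Look up a name of length `len`; returns NULL if it is not defined. */
static const char *lookup(const char *name, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof defs / sizeof defs[0]; ++i)
        if (strlen(defs[i].name) == len &&
            strncmp(defs[i].name, name, len) == 0)
            return defs[i].definition;
    return NULL;
}

/*
 * Copy `pattern` to `out`, replacing each "{name}" whose name is in
 * the table with "(definition)".  Returns 0 on success, -1 if `out`
 * (of size `outsz`, including the terminating NUL) is too small.
 */
int expand_names(const char *pattern, char *out, size_t outsz)
{
    size_t used = 0;

    assert(pattern != NULL && out != NULL);
    if (outsz == 0)
        return -1;

    while (*pattern != '\0') {
        const char *close = (*pattern == '{') ? strchr(pattern, '}') : NULL;
        const char *def = close
            ? lookup(pattern + 1, (size_t)(close - pattern - 1))
            : NULL;

        if (def != NULL) {
            /* known name: emit the definition wrapped in parentheses */
            size_t dlen = strlen(def);
            if (used + dlen + 2 >= outsz)
                return -1;
            out[used++] = '(';
            memcpy(out + used, def, dlen);
            used += dlen;
            out[used++] = ')';
            pattern = close + 1;
        } else {
            /* anything else (including r{2,5}) is copied unchanged */
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *pattern++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

     The added parentheses keep a following operator applied to
     the whole definition: if a name were defined as a|b, then
     {name}c would expand to (a|b)c rather than a|bc.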

     The rules section of the flex input  contains  a  series  of
     rules of the form:

         pattern   action

     where the pattern must be unindented  and  the  action  must
     begin on the same line.

     See below for a further description of patterns and actions.

     Finally, the user code section is simply copied to  lex.yy.c
     verbatim.   It  is used for companion routines which call or
     are called by the scanner.  The presence of this section  is
     optional;  if it is missing, the second %% in the input file
     may be skipped, too.

     In the definitions and rules sections, any indented text  or
     text  enclosed in %{ and %} is copied verbatim to the output
     (with the %{}'s removed).  The %{}'s must appear  unindented
     on lines by themselves.

     In the rules section, any indented or %{} text appearing
     before the first rule may be used to declare variables which
     are local to the scanning routine and (after the
     declarations) code which is to be executed whenever the
     scanning routine is entered.  Other indented or %{} text in
     the rules section is still copied to the output, but its
     meaning is not well-defined and it may well cause
     compile-time errors (this feature is present for POSIX
     compliance; see below for other such features).

     In the definitions section (but not in the  rules  section),
     an  unindented comment (i.e., a line beginning with "/*") is
     also copied verbatim to the output up to the next "*/".

PATTERNS
     The patterns in the input are written using an extended  set
     of regular expressions.  These are:

         x          match the character 'x'
         .          any character (byte) except newline
         [xyz]      a "character class"; in this case, the pattern
                      matches either an 'x', a 'y', or a 'z'
         [abj-oZ]   a "character class" with a range in it; matches
                      an 'a', a 'b', any letter from 'j' through 'o',
                      or a 'Z'
         [^A-Z]     a "negated character class", i.e., any character
                      but those in the class.  In this case, any
                      character EXCEPT an uppercase letter.
         [^A-Z\n]   any character EXCEPT an uppercase letter or
                      a newline
         r*         zero or more r's, where r is any regular expression
         r+         one or more r's
         r?         zero or one r's (that is, "an optional r")
         r{2,5}     anywhere from two to five r's
         r{2,}      two or more r's
         r{4}       exactly 4 r's
         {name}     the expansion of the "name" definition
                    (see above)