FILENAME: FREEGRM5.TXT
AUTHOR: Jim Roskind
        Independent Consultant
        516 Latania Palm Drive
        Indialantic FL 32903
        (407)729-4348
        jar@hq.ileaf.com
        or ...uunet!leafusa!jar

                                                        7/4/91

Dear C++ and C Grammar User,

I have written a YACC debugging tool, and a set of grammars for C  and 
C++ in order to use them within my own personal project development. I 
have  made  the  results  of  my  work in this area available to other 
developers at no charge with the hope that they would use my work.   I 
believe   the   entire   C++   community   can   benefit   from   such 
standardization.  If any of the  copyright  notices  on  the  grammars 
(which  are  VERY  liberal) prevent using my work, please notify me of 
the problem.

Note that the grammars can each be processed by  YACC,  but  they  are 
very clean, and make NO USE of the precedence setting (i.e.: %prec) or 
associativity setting (i.e.: %left, %right, %nonassoc) constructs  of 
YACC.  This feature should make them easily portable to the input 
formats of other parser generators.  This  "cleanliness"  also 
provides brutal exposure of all the complex constructs in C++, and the 
complexity of the grammar as a whole (the C++ grammar is 2 to 3  times 
as large as the C grammar) reflects this exposure.
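
As an illustration of what doing without %prec implies, precedence and
associativity can be written directly into the rules, with one
non-terminal per precedence level.  The sketch below is NOT taken from
CPP5.Y or C5.Y.  It is a hypothetical hand-written evaluator in C whose
functions mirror such a layered grammar; every name in it
(parse_additive, parse_multiplicative, parse_primary, eval_expression)
is invented for this example.

```c
#include <ctype.h>

/* Layered grammar mirrored by the functions below (illustrative only):
 *   additive       : multiplicative
 *                  | additive '+' multiplicative
 *                  | additive '-' multiplicative
 *   multiplicative : primary | multiplicative '*' primary
 *   primary        : NUMBER | '(' additive ')'
 * No precedence declarations are needed: '*' binds tighter than '+'
 * simply because multiplicative sits below additive. */
static const char *cursor;

static void skip_blanks(void)
{
    while (*cursor == ' ')
        cursor++;
}

static long parse_additive(void);

static long parse_primary(void)
{
    long value = 0;

    skip_blanks();
    if (*cursor == '(') {
        cursor++;
        value = parse_additive();
        skip_blanks();
        if (*cursor == ')')
            cursor++;
        return value;
    }
    while (isdigit((unsigned char)*cursor))
        value = value * 10 + (*cursor++ - '0');
    return value;
}

static long parse_multiplicative(void)
{
    long value = parse_primary();

    for (;;) {
        skip_blanks();
        if (*cursor != '*')
            return value;
        cursor++;
        value *= parse_primary();
    }
}

static long parse_additive(void)
{
    long value = parse_multiplicative();

    for (;;) {
        skip_blanks();
        if (*cursor == '+') {
            cursor++;
            value += parse_multiplicative();
        } else if (*cursor == '-') {
            cursor++;
            value -= parse_multiplicative();
        } else {
            return value;
        }
    }
}

/* Entry point: evaluate a small arithmetic expression held in text. */
long eval_expression(const char *text)
{
    cursor = text;
    return parse_additive();
}
```

The loops give left associativity, which corresponds to the
left-recursive rules in the comment.  The cost of this style is the
same one noted above: every precedence level needs its own
non-terminal, so the grammar grows.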

The files included in this set are:

    FREEGRM5.TXT    This introductory file
    GRAMMAR5.TXT    Parsing ambiguities in C++, and in my grammar
    CPP5.Y          My YACC compatible C++ grammar
    C5.Y            My YACC compatible, ANSI C conformant grammar
    CPP5.L          Flex input file defining a C++ lexical analyzer
    SKELGRPH.C      A hacked file from the Berkeley YACC distribution
    AUTODOC5.TXT    Documentation for my machine generated analysis
    Y.OUTPUT        Machine generated analysis of the C++ grammar.

Aside from the addition of several files, this release of  my  grammar 
corrects  a  few  problems  located  in my prior release.  I have also 
transitioned to using names in my grammar that are more acceptable  to 
a  wider  variety  of  parser  generators.  This release also includes 
support for nested types (at  least  grammatically,  as  there  is  no 
symbol  table  provided).  It does not support templates and exception 
handling, as the ANSI C++ Committee  is  still  discussing  variations 
(and  trying  to  deal  with a variety of ambiguities that the initial 
proposals, such as what is described in the ARM, would entail).

Since my first public release of my grammar, I have received a  number 
of  requests.   One  of  the  most  common  requests was for a lexical 
analyzer to  go  with  the  grammar.   This  release  of  the  grammar 
continues  to  provide  such  a  "bare bones"  lexical analyzer.   The 
analyzer does not support preprocessing, or even comment removal.   In 
addition,  since  I  have  not  included  a  symbol table, or semantic 
actions in the grammar  to  maintain  proper  context  (i.e.,  current 
scope),   typedef  names  and  struct/class/union/enum  tags  are  not 
*really* defined.  To  allow  users  to  experiment  with  my  grammar 
without  a  symbol table, my lexer assumes that if the first letter of 
the name is upper case, then the name is a type name.   This  hack  is 
far  from  sufficient  for parsing full blown programs, but it is more 
than sufficient for experimenting with the grammar  to  determine  the 
acceptability  of  a  token sequence, and to understand how my grammar 
parsed the sequence.
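
The first-letter rule can be pictured with a few lines of C.  This is
only a sketch of the idea as described above, not the code in CPP5.L;
the enum, its member names, and the function classify_name are
hypothetical.

```c
#include <ctype.h>

/* Hypothetical token kinds; CPP5.L and CPP5.Y use their own names. */
enum token_kind { TOKEN_IDENTIFIER, TOKEN_TYPE_NAME };

/* Classify a name the way the hack is described: a leading upper-case
 * letter means "type name", anything else means "ordinary identifier". */
enum token_kind classify_name(const char *name)
{
    if (isupper((unsigned char)name[0]))
        return TOKEN_TYPE_NAME;
    return TOKEN_IDENTIFIER;
}
```

Under such a rule, an input like "Widget * p;" would be expected to
parse as a declaration, while "widget * p;" would be treated as an
expression, regardless of what the names were actually declared to be.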

Since I did not  believe  that  a  lexical  analyzer  alone  would  be 
sufficient  to assist many people with playing with my grammar, I have 
also provided the basis for a tool to explain what a grammar is doing. 
Specifically, I have modified a file that is included in the  Berkeley 
YACC  distribution  so  that  parsers  generated  by such a YACC would 
automatically display a syntax tree in graphical-ASCII format during a 
parse.  The instructions for using and building  this  yacc  tool  are 
presented  in  the  next  section.  Note that there are no significant 
special hooks in my grammar or parser to excite this  yacc  tool,  and 
the  tool can be used equally well on any grammar that you are working 
with.  This graphical debugging tool  is  probably  one  of  the  most 
popular  aspects  of  my  releases, and its presence and usefulness to 
grammar developers should not be underestimated.

Significantly new to this  release  is  a  large  file  that  contains 
machine  generated  documentation (re: Y.OUTPUT).  This file goes well 
beyond what is provided in a typical verbose output, and provides both 
detailed conflict analysis, and a  number  of  cross-references  which 
make  it  **MUCH**  easier to read the associated grammar.  I have not 
yet decided whether  to  market,  shareware,  or  plain  give-away  my 
program,  so the best I can do at this point is to release the machine 
generated documentation.  Unfortunately, this file  is  *very*  large, 
and  I have decided (for the time being) to distribute it only via the 
ftp sites.  I  am  doing  this  to  lessen  the  global  bandwidth 
utilization  during my grammar posting to the network.  I will however 
post the file (AUTODOC5.TXT)  which  documents  the  contents  of  the 
Y.OUTPUT  file,  so that users can decide if they want to download the 
larger  file.   To  hint  at  what  is  included  in   the   automatic 
documentation, the following are the sections:

	Reference Grammar
	Alphabetized Grammar
	Sample Expansions for the Non-terminal Tokens
	Summary Descriptions of the Terminal Tokens
	Symbol and Grammar Cross Reference for the Tokens
	Sample Stack Context and Accessing Sentences for each State
	Concise list of Conflicts
	Canonical Description of Conflicts
	Verbose listing of state transitions
	Explanations for all reductions suggested in conflicts

Please see AUTODOC5.TXT for more details.

I  have  posted  7  of  the  8 files to comp.lang.c++ (I will not post 
Y.OUTPUT due to its size), to make this information  as  available  as 
possible  to users and developers.  I will also post this introductory 
note to comp.compilers, and comp.lang.c.  I am arranging for  archival 
support  via  several  ftp  sites, and updates will be posted to those 
sites.  I will also try to get the source to Berkeley YACC  posted  to 
these  ftp  sites,  although it is certainly available at more central 
sites.

Currently,  Doug  Lea  and  Doug  Schmidt  have graciously offered to 
provide anonymous ftp sites  for  all  8  files,  as  well  as  the 
Berkeley YACC source (if you need it).  The ftp addresses are:

ics.uci.edu (128.195.1.1) in the ftp/pub directory as:

	c++grammar2.0.tar.Z 
	byacc1.8.tar.Z

mach1.npac.syr.edu (128.230.7.14) in the ftp/pub/C++ directory as:

	c++grammar2.0.tar.Z
	byacc1.8.tar.z



HOW TO EXPERIMENT WITH THE C++ GRAMMAR

The following describes how to use the graphical debugging  extensions 
to Berkeley YACC to explore the grammar.

Note that the following instructions assume that you have the Berkeley 
YACC  source  on  hand.   You can actually use any YACC to process the 
grammar, but  running  the  resulting  demo  (which  has  no  semantic 
actions)  will  tend to be quite boring.  If you can't get hold of the 
Berkeley YACC, you should  at  least  try  to  enable  the  "debugging 
options"  in  your  parser  so  that  you  can  see  in  some way what 
reductions are taking place.  (Hint: search for the letters "debug" in 
the C file that your yacc produces...).

        1) Get the entire source for Berkeley YACC 1.8 1/2/91
        2) Verify that you can make the YACC
        3) Rename SKELETON.C to SKELOLD.C, and implant my SKELGRPH.C
                in that directory as SKELETON.C
        4) Make the yacc using this new SKELETON.C
        5) Using the above yacc, process my CPP5.Y file
                yacc -dvl cpp5.y
           The result should be the files y.tab.c and y.tab.h
        6) Use Flex (replacement for lex) to process my CPP5.L file
                flex cpp5.l
           The result should be lex.yy.c
        7) Compile the two files
                cc -o cpp5 y.tab.c lex.yy.c
           The result should be an executable called cpp5
        8) Set the environment variable YYDEBUG to 6
                setenv YYDEBUG 6
           If you don't do this, the graphical output will not appear!
        9) Run the program cpp5
                cpp5
        10) Try the input:
                int a;
        11) You should see a nice parse tree.  Enjoy.  Note that
            the lexer DOES NOT INCLUDE A SYMBOL TABLE, and does
            NOT KEEP TRACK OF CURRENT SCOPES.  The hack (see the
            CPP5.L file for details) is to assume that any identifier
            that begins with a capital letter is a typedef name.
            Send complaints about code that doesn't parse "correctly".
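
Step 8 matters because a generated parser typically consults the
YYDEBUG environment variable once, when parsing starts.  The following
is a minimal sketch of that pattern, assuming the level is taken from
the first character of the variable as a single decimal digit.  The
function names are hypothetical, and the actual logic lives in the
skeleton file (SKELETON.C / SKELGRPH.C), which may differ in detail.

```c
#include <stddef.h>
#include <stdlib.h>

/* Turn the text of a YYDEBUG-style setting into a level.
 * NULL (variable unset) or a non-digit first character yields 0. */
int debug_level_from_text(const char *text)
{
    if (text != NULL && text[0] >= '0' && text[0] <= '9')
        return text[0] - '0';
    return 0;
}

/* Read the level from the environment, as a parser might at startup. */
int debug_level_from_environment(void)
{
    return debug_level_from_text(getenv("YYDEBUG"));
}
```

With the variable unset the level stays at 0, which is consistent with
the warning in step 8 that no graphical output appears.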




HISTORICAL NOTES


Developing the C grammar (that is intended to be compatible  with  the 
ANSI  C  standard) was relatively straightforward (compared to the C++ 
grammar).  The one difficulty in this process was the desire to  avoid 
use of %prec and the associativity constructs (%left, %right, 
%nonassoc) in YACC, which would tend to obscure ambiguities.  Since I 
didn't know what ambiguities were lying 
in  wait  in  C++,  obscuring  ambiguities  was unacceptable.  It took 
several weeks to remove the conflicts that typically appear,  and  the 
tedious  process  exposed  several  ambiguities that are not currently 
disambiguated by the ANSI standard.  The quality of the C  grammar  is 
(IMHO)  dramatically  higher  than what has been made available within 
the  public  domain.   Specifically,  a   C   grammar's   support   of 
redefinition  of typedef names within inner scopes (the most difficult 
area of the grammar) is typically excluded from public domain grammars, 
and even excluded from most grammars that  are  supplied  commercially 
with  parser  generators!   I  expect  that  this grammar will be very 
useful in the development of C related tools.
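
A small example of the redefinition case mentioned above may help.
The fragment is legal C, with names (Length, shadowed_length_times)
invented for this illustration: a typedef name from an outer scope is
redeclared as an ordinary variable in an inner scope, so the same
spelling must be lexed as a type name in one place and as an
identifier in another.

```c
typedef int Length;          /* outer scope: Length names a type */

int shadowed_length_times(int factor)
{
    Length outer_value = 2;  /* Length is still a type name here */
    {
        int Length = 5;      /* inner scope: Length is now a variable */

        /* "Length * factor" is a multiplication here, not the start
         * of a pointer declaration. */
        return Length * factor + outer_value;
    }
}
```

A parser without scope tracking cannot tell the two uses of Length
apart, which is why this area tends to be left out of simpler grammars.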

The development of the C++ grammar (initially compatible with  version 
1.2,  but  enhanced to support version 2.0 specifications as they were 
made available) was anything but straightforward.   The  requirement 
that I set to NOT USE %prec or the associativity constructs proved 
both a blessing and a curse.  The blessing was that I could see what 
the problems were in the language; the curse was that there were A LOT 
of conflicts (I can