freegrm5.txt

C++ and C grammars for lex and yacc; usable for building a C++ compiler
recall  times  during  the  development  effort  when  the  number  of 
conflicts  was  well  in  excess of 200).  The most recent addition of 
nested types probably took about 2 weeks to implement.  On  the  other 
hand,  I  probably  spent  several  months  developing  the  automated 
documentation tools that allowed me to  debug  the  grammar  additions 
this quickly. 

Towards  the  end of the initial development of the C++ grammar, which 
took roughly 3 months of my time (circa summer 1989), I began  to  see 
the  folly in part of my quest.  I came to the conclusion that further 
attempts  to  modify  my  grammar,  so  as  to  defer  resolutions  of 
ambiguities,  would  lead  to an unreadable language. Specifically, my 
feeling was that I was  entering  into  a  battle  of  wits  with  the 
compiler,  and  the compiler was starting to win.  It was beginning to 
be the case that the parser COULD  figure  out  what  I  said,  but  I 
couldn't.  Indeed, even examples in a version of the C++ 2.0 reference 
manual (and published in the ARM) demonstrated this problem (my parser 
could  parse  some  examples  that  neither  I  nor the authors parsed 
correctly!).  At this point I decided to  stop  my  quest  to  FURTHER 
defer  resolutions  of  ambiguities, and let the grammar commit in one 
direction (always in favor of declarations), at the late point that is 
provided by my grammar.  If this direction proved "incorrect in  light 
of  the  context  that  followed", then I generated a syntax error.  I 
believe this strategy provides  ample  room  for  expressiveness.   In 
support  of  this expressiveness, I have (based on my discussions with 
language  experts)  deferred  disambiguation  far  longer  than  other 
attempts  at  producing an LR(1) grammar.  I would strongly argue that 
any code that my grammar identifies as having a "syntax error"  (based 
on  "premature"  disambiguation), but cfront allows, should ABSOLUTELY 
be rewritten in a less ambiguous (and hence more portable) form.
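
As a small illustration of the kind of declaration/expression ambiguity
discussed above (a sketch only, not taken from the grammar files), the
statement "T(a);" below could be read either as a function-style cast of
an existing "a" or as a declaration of a new "a".  C++ resolves it as a
declaration.  The type T and the function inner_value are hypothetical
names invented for this example.

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct T {
    int v;
    T() : v(0) {}
    explicit T(int x) : v(x) {}
};

// Returns the value reached through the name `a` in the inner block.
int inner_value() {
    int a = 5;
    (void)a;  // silence "unused variable" warnings for the outer a
    {
        // Read as an expression, this would convert the int `a` to a T
        // and discard the result.  The language instead treats it as a
        // declaration: a new, default-constructed T named `a` that
        // hides the int declared above.
        T(a);
        return a.v;  // compiles only because `a` is now a T
    }
}
```

Calling inner_value() yields the default-constructed member value, not
5, which shows that the declaration reading was chosen.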

It should be noted that my grammar cannot  be  in  constant  agreement 
with   such  implementations  as  cfront  because  a)  my  grammar  is 
internally consistent (mostly courtesy of its formal nature  and  YACC 
verification),  and b) YACC generated parsers don't dump core. (I will 
probably take a lot of flak for that last snipe, but....  every time I 
have had difficulty figuring out what was meant syntactically by  some 
construct that the ARM was vague about, and I fed it to cfront, cfront 
dumped core.)

One major motivation for using the C++ grammar that I have provided is 
that it is capable of supporting old style function definitions (e.g.: 
main(argc,  argv)  int  argc;  char*argv[];  {...}  ).  I believe this 
capability was removed from the C++ specification in order  to  reduce 
ambiguities  in  a  specific  implementation  (cfront).  As my grammar 
demonstrates, this action was  not  necessary.  Supporting  old  style 
function  definition  GREATLY  eases  the transition to the use of C++ 
from traditional C.  I expect that as some parsers  begin  to  support 
this  option,  other  parsing   engines   will   be  forced  in  this 
direction by a competitive marketplace.  Using  my  grammar,  and  the 
standards it implies, appears to be a very straightforward approach to 
this support.

A  second  motivation for using my grammar is that it can be processed 
by YACC.  The advantage in this fact lies with  YACC's  capability  to 
identify  ambiguities.   For  software  manufacturers that are heavily 
concerned with correctness,  this  is  an  INCREDIBLE  advantage.   My 
experience  with  hand  written  parsers  (which  usually  represent a 
translation by a fallible human from a grammar  to  parsing  code)  is 
that  they  evolve and become more correct with time.  Ambiguous cases 
are often misparsed, without the author ever  realizing  there  was  a 
conflict!   In  contrast,  either  a  YACC  grammar  supports  a given 
construct, or it doesn't.  If a YACC grammar supports a construct, the 
semantic  interpretation  is  usually  rather  straightforward.    The 
likelihood of internal errors in the parser is therefore SIGNIFICANTLY 
reduced.  The fact that the grammars I supplied are free of %assoc and 
%prec operators implies that the grammars are fairly portable, and the 
conflicts are open to peer code review (and not obscured).

Most  recently  I have joined the ANSI C++ committee (X3J16), and have 
tried to follow their progress with hopes of maintaining compliance in 
my grammar.  Unfortunately, political pressures within X3J16 appear to 
make it IMHO more desirable to quickly approve a standard that matches 
cfront's performance (when it is not dumping core), than to provide  a 
clean,  consistent  and formal syntax as part of the standard.  Rather 
than fixing inconsistent hackery within the syntax, there  is  IMHO  a 
tendency   to  want  to  "hack  further"  to  match  cfront's  current 
performance (or the ARM's prophecy).   As  an  example  of  this,  the 
fundamental  hack  in  all of C is the feedback from the parser to the 
lexer to identify typedefnames.  There is discussion afoot to (for  no 
reason  other than compliance with a *proposed* cfront feature) extend 
this another notch and require feedback to distinguish template names. 
This hackery was not required by the syntax, rather it was "desirable" 
to match the performance of beta-cfront (and the  ARM).   When  cfront 
changes,  and  old  code is obsoleted, the arguments abound that it is 
for the good of humanity.  When cfront is hacking inconsistently, then 
no change can be made, because of the thousands of lines of code  that 
depend on this pseudo-standard.  Perhaps my grammar will, in some small 
way,  help  Microsoft,  Zortech,  Borland,   and   dozens   of   other 
entrepreneurs  work toward building a standard for a language that has 
enough consistency to grow and flourish (note that none of  the  above 
vendors  use  my grammar in their products, but I think they would all 
share my desire for a cleaner syntax).  If I  am  successful  with  my 
grammar,  then  I  will be able to write C++ tools in a consistent and 
open marketplace.  From my perspective, the outcome is not clear.   If 
you have a channel to support the use of a cleaner syntax in the X3J16 
standard, I would heartily invite you to exercise that channel.
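
The parser-to-lexer feedback mentioned above can be sketched roughly as
follows.  This is a minimal illustration under simplifying assumptions
(a single flat scope, no hiding of typedef names by later declarations);
the names Token, TypedefTable, declare_typedef, and classify are
hypothetical and do not come from my grammar files.

```cpp
#include <cassert>
#include <set>
#include <string>

// The two token kinds the lexer must choose between for a name.
enum class Token { Identifier, TypedefName };

// Hypothetical table shared by the parser and the lexer.
class TypedefTable {
public:
    // Called by the parser when it reduces a typedef declaration.
    void declare_typedef(const std::string& name) { names_.insert(name); }

    // Called by the lexer for every identifier-like spelling it scans.
    Token classify(const std::string& spelling) const {
        return names_.count(spelling) ? Token::TypedefName
                                      : Token::Identifier;
    }

private:
    std::set<std::string> names_;  // flat scope only, for brevity
};
```

The same spelling is returned as a different token before and after the
parser has seen its typedef, which is why the lexer cannot run
independently of the parser.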

As  it  currently  stands,  my  grammar  teeters  on the edge of being 
unusable due to its size.  The size in turn is due to the  variety  of 
special cases that must be dealt with within C++ parsing.  With only a 
few more inconsistent additions to the "standard language", my grammar 
will surely become completely unusable.  I am trying to develop a yacc 
preprocessor  that will allow me to rein back in the complexity of the 
grammar.  If I can do this, then it will continue to  be  possible  to 
update  my  grammar  to  match  the emerging ANSI Standard. I can only 
promise to try.


FEEDBACK ABOUT THE GRAMMARS

If you find any errors in my grammars, I would be  DELIGHTED  to  hear 
mention  of  them!!!!   These  should  fall  into one of the following 
categories:

        1) The grammar left out the following features of C++...
        or
        2) The grammar mis-parses the following sequences...
        or
        3) The discussion of the following ambiguity is
        incorrect...
        or
        4) The grammar could be simplified as follows...

Please send  correspondence  of  this  form  to  jar@hq.ileaf.com.  My 
response  to  1's  will be to add the feature (if possible!); feel sad 
that I made a mistake; and feel glad that YOU found it.  I will have a 
similar response to 2's.  Responses of type 3 are GREAT, but I haven't 
found many folks that really get into YACC ambiguities, so I have  low 
expectations...  feel free to surprise me!!! :-) :-).  Items of type 4 
are interesting, but since simplicity is in the eye of  the  beholder, 
such  suggestions  are  subject  to  debate.  I would be interested in 
seeing suggestions in this area with the constraint that they  do  not 
increase  the  number of conflicts in the grammar!  Please use YACC to 
check your suggestions before submitting them. (It  is  often  AMAZING 
how  the  slightest  change  in  the  grammar can change the number of 
conflicts, and it took a great deal  of  work  to  reach  the  current 
state).  Distribution site(s) will be set up  to  distribute  updates 
and/or corrections.  Postings about the presence of corrections will be 
made on the net.

Since  the  two  grammars  (C and C++) were generated in parallel, you 
should be able to compare non-terminals directly.  This will hopefully 
make it easier to identify the  complexities  involved  with  the  C++ 
grammar, and "ignore" those that result from standard ANSI C.  In both 
cases  I  have  left the standard if-if-else ambiguity intact.  In the 
case of ANSI C grammar, this is the only shift-reduce conflict in  the 
grammar.  Although there are a number of conflicts in the C++ grammar, 
there  are  actually  very  few  classes  of  problems.  In  order  to 
disambiguate the C++ grammar enough that YACC can figure out  what  to 
do,  I  was  commonly forced to "inline expand" non-terminals found in 
the C grammar.  This expansion allowed YACC  to  defer  disambiguation 
until  it  was possible for an LR(1) parser to understand the context. 
The unfortunate consequence of this inline expansion is a large growth 
in the number of rules, and the presence of an effective  "multiplier" 
in  most  cases  where  conflicts do occur. As a result, any conflicts 
that arise are multiplied by a factor corresponding to the  number  of 
rules  I  had  to  list  separately.   I  have grouped the C++ grammar 
conflicts in the "Status" section of the GRAMMAR5.TXT paper,  but  you 
are welcome to explore my grammars using YACC directly (be warned that 
you  will  need  a  robust  version  of  YACC  to handle the C++ sized 
grammar).  PLEASE do not be put off by the number of conflicts in  the 
C++  grammar.  There are VERY FEW CONFLICTS, but my elaborated grammar 
confuses the count.
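
For readers unfamiliar with the if-if-else ambiguity mentioned above,
here is a small sketch (the function name dangling_else is invented for
this example).  The indentation-free reading is that the else attaches
to the nearest unmatched if, which is also what YACC's default
preference for shift over reduce produces.  Some compilers emit a
warning for this layout; that is expected here.

```cpp
#include <cassert>
#include <string>

// Reports which branch ran, to show where the else attaches.
std::string dangling_else(bool a, bool b) {
    std::string result = "untouched";
    if (a)
        if (b)
            result = "inner-then";
        else  // attaches to `if (b)`, the nearest unmatched if
            result = "inner-else";
    return result;
}
```

With a false, neither assignment runs, so the else clearly does not
belong to the outer if.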

The GRAMMAR5.TXT paper is FAR from a publishable quality paper, but it 
discusses many of the issues involved in ambiguities  in  my  grammar, 
and more generally in the C++ language.  I  hope  GRAMMAR5.TXT  is  a 
vast improvement over "nothing at all", but apologize in  advance  for 
the lack of polished structure, and the presence of many typos (which must 
surely be present).  I hope you find this almost-paper interesting. My 
attempts   at  documenting  conflicts  have  certainly  clarified  the 
problems in my mind. Based on my experience with the conflicts I  have 
identified,  most  current  compilers and translators fall prey to the 
situations that I have uncovered.  I hope that other  compilers,  that 
do  not  make  use of the grammar I have made available, will at least 
seek to standardize the resolution of the problems identified.


The AUTODOC5.TXT file provides interesting reading  both  for  readers 
interested  in  LR  and  LALR  parsing (and the subtle connections and 
distinctions between them) and for any user who wishes  to  fully 
comprehend  the  contents  of  the  Y.OUTPUT  file.   It  includes  an 
extensive  discussion  of  ambiguities,  how  they  are  removed,  how 
LALR-only ambiguities arise, and how they can be dealt with.

With  this  release  of the grammar I have begun to distribute machine 
generated documentation for my grammar.  As a result, if  my  analysis 
of  conflicts  is  questionable,  the  supporting data is  immediately 
present to confirm or deny my analysis.  If you wish to correct any of 
my analysis, please use and refer to the Y.OUTPUT  file  that  I  have 
provided. 

As  a  small  commercial message... I am a freelance consultant, and I 
travel far and wide to perform contracts.  If you like the work that I 
am presenting in this group of documents, and  would  like  to  see  a 
resume or at least talk, please feel free to contact me.

I  hope  that  the  grammars  that  I have provided will lead to  many 
successful C++ processing projects.

Jim Roskind 
Independent Consultant 
516 Latania Palm Drive  
Indialantic FL 32903 
(407)729-4348 
jar@hq.ileaf.com or ...!uunet!leafusa!jar