"(*(*c2)(wasType2(a1)))(int(a1))"
and avoid the constructor ambiguity, but it would only complicate
the discussion. Note that in this form, if "wasType2" is not a
type, then the quoted text cannot be a declaration.*/
/* Two parens are all a user would need to add to the cryptic
example to unambiguously specify that this statement is an
expression. Specifically: */
(Type1) (a2) = 3, wasType2 (4), (*c2)(wasType2(a1));
/* or ...*/
(Type1 (a2) = 3), wasType2 (4), (*c2)(wasType2(a1));
/* I would vote for a syntax error in such an ambiguous stream, with
an early decision that it was a declaration. After seeing this
example, I doubt that I could quickly assert that I could produce
a non-backtracking parser that disambiguates statements according
to the C++ 2.0 rule. I am sure I can forget about a simple
lex-YACC combination doing it. */
}
Most simply put, if a "smart lexer" understands these: a) I am
impressed, b) Why use a parser when a lexer can parse so well?
The bottom line is that disambiguation of declarations via "If it can
be a declaration, then it is one", seems to require a backtracking
parser. (Or some very fancy parsing approach). I am not even sure if
the above examples are as bad as it can get!
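The rule "If it can be a declaration, then it is one" can also be seen in
a much smaller case than the examples above. The following is only a
minimal sketch and is not taken from the grammar distribution; the names
Type1 (reused here as a small struct), as_declaration, and as_expression
are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical type; it records how each object was constructed.
struct Type1 {
    int source;                 // 0: default constructed, 1: built from an int
    Type1() : source(0) {}
    Type1(int) : source(1) {}
};

// "Type1 (a);" can be a declaration, so it is one: it declares a new,
// default-constructed local named a that hides the int parameter.
int as_declaration(int a) {
    {
        Type1 (a);
        return a.source;
    }
}

// Two added parentheses turn the same tokens into an expression: a cast
// of the int parameter a to Type1, which uses the converting constructor.
int as_expression(int a) {
    return ((Type1) (a)).source;
}
```

Calling as_declaration(3) should yield 0 and as_expression(3) should
yield 1, even though the two bodies differ only by a pair of parentheses.
A compiler may warn about the redundant parentheses in the declaration.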

CONCLUSION

I believe that the C++ grammar that I have made available represents a
viable machine readable standard for the syntax description of the C++
language. In cases where the ambiguities are still exposed by
conflicts (as noted by YACC), to further defer resolution would be
detrimental to a user. I see no benefit in describing a computer
language that must support human writers, but cannot be understood by
humans. Any code that exploits such deferral is inherently
non-portable, and deserves to be diagnosed as an error (my grammar
asserts a "syntax error"). Rather than dragging the C++ language into
support for ad-hoc parser implementations such as what cfront (and
the "smart lexer") have tried unsuccessfully to implement, I would
heavily suggest the use of my grammar. I do not believe that my
grammar would "break" much existing code, but in cases where it would,
the code would not be portable anyway (other than to a port of an
IDENTICAL parser).
I hope to see a great deal of use of my grammars, and I believe that
standardizing on the represented syntax will unify the C++ language
greatly.
Jim Roskind
Independent Consultant
516 Latania Palm Drive
Indialantic FL 32903
(407)729-4348
jar@hq.ileaf.com or ...uunet!leafusa!jar

APPENDIX A:

PROPOSED GRAMMAR MODIFICATIONS (fixing '*', and '&' conflicts)

Based on the other items described above, I have the following
suggestions for cleaning up the grammar definition. Unfortunately, they
introduce subtle variations from the "C++ 2.0" standard.
Current Grammar:
operator_function_name :
OPERATOR any_operator
| OPERATOR type_qualifier_list operator_function_ptr_opt
| OPERATOR non_elaborating_type_specifier operator_function_ptr_opt
;
operator_new_type:
type_qualifier_list operator_new_declarator_opt
operator_new_initializer_opt
| non_elaborating_type_specifier operator_new_declarator_opt
operator_new_initializer_opt
;
Proposed new grammar (which requires parens around complex types):
operator_function_name :
OPERATOR any_operator
| OPERATOR basic_type_name
| OPERATOR TYPEDEFname
| OPERATOR type_qualifier
| OPERATOR '(' type_name ')'
;
operator_new_type:
basic_type_name operator_new_initializer_opt
| TYPEDEFname operator_new_initializer_opt
| type_qualifier operator_new_initializer_opt
| '(' type_name ')' operator_new_initializer_opt
;
The impact of the above changes is that all complex type names (i.e.,
names that are not simply a typedef/class name or a basic type name
like char) must be enclosed in parentheses in both `new ...' and
`operator ...' expressions. Both of the above changes would clear up a
number of ambiguities. In some sense, the current "disambiguation"
(of trailing '*', and '&') is really a statement that whatever an
LR(1) parser cannot disambiguate is a syntax error. In contrast, the
above rules define an unambiguous grammar.
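As a rough illustration of what code written to the proposed rules might
look like, the sketch below uses only forms that existing C++ compilers
already accept. The names IntPtr, Holder, and the three functions are
hypothetical. The parenthesized `operator (type_name)' form is part of
the proposal only and is not valid in existing C++, so the operator case
is shown through the TYPEDEFname alternative instead.

```cpp
#include <cassert>

typedef int* IntPtr;   // hypothetical typedef that names a complex type

struct Holder {
    int value;
    // Writing "operator int *" is where a trailing '*' conflict arises;
    // naming the type through a typedef avoids the trailing '*'.
    operator IntPtr() { return &value; }
};

// A basic type name after new needs no parentheses.
int simple_type_new() {
    int* p = new int(7);
    int result = *p;
    delete p;
    return result;
}

// A complex type after new is enclosed in parentheses.
int parenthesized_type_new() {
    int x = 42;
    int** pp = new (int*);
    *pp = &x;
    int result = **pp;
    delete pp;
    return result;
}

// The conversion operator is selected when a Holder initializes an IntPtr.
int read_through_conversion() {
    Holder h = { 5 };
    IntPtr p = h;
    return *p;
}
```

In each function the extent of the type name is fixed by the typedef or
by the parentheses, so no trailing '*' or '&' is left for the parser to
attach one way or the other.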

APPENDIX B:

CANONICAL DESCRIPTION OF CONFLICTS and STATES

The following is directly extracted from the canonical list of
conflicts provided in the y.output file. For a more complete
discussion of the significance of the canonical sentence provided with
each state, see the autodoc5.txt file. I have also added annotation
to connect these sentences to the summary given earlier:
state 64: STRUCT IDENTIFIER . ':' (1 reduction, or a shift)
1 SR caused by member declaration of sub-structure, with trailing :
state 131: OPERATOR INT . '*' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 131: OPERATOR INT . '&' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 138: OPERATOR CONST . '*' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 138: OPERATOR CONST . '&' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 281: OPERATOR INT '*' CONST . '*' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 281: OPERATOR INT '*' CONST . '&' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 282: OPERATOR INT '*' . '*' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 282: OPERATOR INT '*' . '&' (1 reduction, or a shift)
8 SR caused by operator function name with trailing * or &
state 395: CLCL TYPEDEFname '(' TYPEDEFname . ')' (1 reduction, or a shift)
5 SR caused by redundant parened TYPEDEFname redeclaration vs old style cast
Make declaration rather than expression.
problem: Constructor with anonymous arg name at file scope
looks like redeclaration of typename.
A::B(C){} should be the same as A::B(C x){}
(but it isn't for an LR(1) grammar)
state 536: IDENTIFIER '(' '~' TYPEDEFname . '(' (1 reduction, or a shift)
1 SR caused by explicit call to destructor, without explicit scope
state 571: IDENTIFIER '(' NEW INT . '*' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 571: IDENTIFIER '(' NEW INT . '&' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 572: IDENTIFIER '(' NEW CONST . '*' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 572: IDENTIFIER '(' NEW CONST . '&' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 621: CLCL TYPEDEFname '(' TYPEDEFname '[' ']' . ')' (1 reduction, or a shift)
5 SR caused by redundant parened TYPEDEFname redeclaration vs old style cast
Make declaration rather than expression.
Don't form an old style cast.
state 738: IDENTIFIER '(' TYPEDEFname '(' ')' . ')' (2 reductions)
3 RR caused by function-like cast vs typedef redeclaration ambiguity
LALR-only can be ignored.
Make declaration rather than expression.
state 738: IDENTIFIER '(' TYPEDEFname '(' ')' . ',' (2 reductions)
3 RR caused by function-like cast vs typedef redeclaration ambiguity
Problem with LALR-only conflict.
Make declaration rather than expression.
state 738: IDENTIFIER '(' TYPEDEFname '(' ')' . '=' (2 reductions)
3 RR caused by function-like cast vs typedef redeclaration ambiguity
Problem with LALR-only conflict.
Make declaration rather than expression.
state 739: IDENTIFIER '(' TYPEDEFname '(' IDENTIFIER . '(' (2 reductions)
3 RR caused by function-like cast vs identifier declaration ambiguity
Make declaration rather than expression.
state 739: IDENTIFIER '(' TYPEDEFname '(' IDENTIFIER . ')' (2 reductions)
3 RR caused by function-like cast vs identifier declaration ambiguity
LALR-only can be ignored.
Make declaration rather than expression.
state 739: IDENTIFIER '(' TYPEDEFname '(' IDENTIFIER . '[' (2 reductions)
3 RR caused by function-like cast vs identifier declaration ambiguity
Make declaration rather than expression.
state 740: IDENTIFIER '(' TYPEDEFname '(' '~' TYPEDEFname . '(' (2 reductions)
3 RR caused by destructor declaration vs destructor call
Make declaration of destructor rather than expression.
state 740: IDENTIFIER '(' TYPEDEFname '(' '~' TYPEDEFname . ')' (2 reductions)
3 RR caused by destructor declaration vs destructor call
LALR-only can be ignored.
Make declaration of destructor rather than expression.
state 740: IDENTIFIER '(' TYPEDEFname '(' '~' TYPEDEFname . '[' (2 reductions)
3 RR caused by destructor declaration vs destructor call
Make declaration of destructor rather than expression.
state 758: IDENTIFIER '(' CLCL TYPEDEFname '(' ')' . ')' (2 reductions)
3 RR caused by parened initializer vs prototype/typename
Make declaration rather than expression.
state 758: IDENTIFIER '(' CLCL TYPEDEFname '(' ')' . ',' (2 reductions)
3 RR caused by parened initializer vs prototype/typename
Problem with LALR-only conflict.
Make declaration rather than expression.
state 758: IDENTIFIER '(' CLCL TYPEDEFname '(' ')' . '=' (2 reductions)
3 RR caused by parened initializer vs prototype/typename
Problem with LALR-only conflict.
Make declaration rather than expression.
state 778: IDENTIFIER '(' NEW INT '*' CONST . '*' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 778: IDENTIFIER '(' NEW INT '*' CONST . '&' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 779: IDENTIFIER '(' NEW INT '*' . '*' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 779: IDENTIFIER '(' NEW INT '*' . '&' (1 reduction, or a shift)
8 SR caused by freestore with trailing * or &
state 1038: IDENTIFIER '{' TYPEDEFname '(' '(' TYPEDEFname . ')' (1 reduction, or a shift)
5 SR caused by redundant parened TYPEDEFname redeclaration vs old style cast
Make declaration rather than expression.
state 1100: IDENTIFIER '{' IF '(' IDENTIFIER ')' ';' . ELSE (1 reduction, or a shift)
1 SR caused by dangling else and my laziness
state 1102: IDENTIFIER '{' TYPEDEFname '(' '(' TYPEDEFname '[' ']' . ')' (1 reduction, or a shift)
5 SR caused by redundant parened TYPEDEFname redeclaration vs old style cast
Make declaration rather than expression.
state 1103: IDENTIFIER '{' TYPEDEFname '(' '*' '(' TYPEDEFname . ')' (1 reduction, or a shift)
5 SR caused by redundant parened TYPEDEFname redeclaration vs old style cast
Make declaration rather than expression.
state 1105: STRUCT IDENTIFIER '{' TYPEDEFname '(' TYPEDEFname ')' . ';' (2 reductions)
6 RR caused by constructor declaration vs member declaration
state 1105: STRUCT IDENTIFIER '{' TYPEDEFname '(' TYPEDEFname ')' . '{' (2 reductions)
6 RR caused by constructor declaration vs member declaration
state 1152: STRUCT IDENTIFIER '{' TYPEDEFname '(' TYPEDEFname '[' ']' ')' . ';' (2 reductions)
6 RR caused by constructor declaration vs member declaration
state 1152: STRUCT IDENTIFIER '{' TYPEDEFname '(' TYPEDEFname '[' ']' ')' . '{' (2 reductions)
6 RR caused by constructor declaration vs member declaration
state 1175: STRUCT IDENTIFIER '{' EXTERN INT '(' TYPEDEFname '[' ']' ')' . ';' (2 reductions)
6 RR caused by constructor declaration vs member declaration
state 1175: STRUCT IDENTIFIER '{' EXTERN INT '(' TYPEDEFname '[' ']' ')' . '{' (2 reductions)
6 RR caused by constructor declaration vs member declaration
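Two of the canonical sentences above correspond to ordinary C++ that can
be compiled and checked directly. The sketch below is illustrative only;
the class names C, Unnamed, and Named, and the function dangling_else,
are assumptions and do not come from the y.output file.

```cpp
#include <cassert>

struct C { int n; };

struct Unnamed { int seen; Unnamed(C); };
struct Named   { int seen; Named(C x); };

// State 395: a constructor defined at file scope with an anonymous
// argument, in the style of A::B(C){}.
Unnamed::Unnamed(C) : seen(1) {}

// The same definition with the argument named, in the style of A::B(C x){}.
Named::Named(C x) : seen(1) { (void)x; }

// State 1100: choosing the shift binds the else to the nearest if.
int dangling_else(bool outer, bool inner) {
    int result = 0;
    if (outer)
        if (inner)
            result = 1;
        else
            result = 2;   // belongs to "if (inner)", not to "if (outer)"
    return result;
}
```

Both constructors behave identically once compiled, which is the
equivalence the state 395 annotation says an LR(1) grammar cannot see
without help. For the dangling else, dangling_else(true, false) should
return 2 and dangling_else(false, false) should return 0.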