changes_summary.txt
parser action
>>
Being confused with:
#token X "x" <<lexical action>>
#142. (Changed in MR11) class BufFileInput subclass of DLGInputStream
Alexey Demakov (demakov@kazbek.ispras.ru) has supplied class
BufFileInput derived from DLGInputStream which provides a
function lookahead(char *string) to test characters in the
input stream more than one character ahead.
The class is located in pccts/h/BufFileInput.* of the kit.
#140. #pred to define predicates
+---------------------------------------------------+
| Note: Assume "-prc on" for this entire discussion |
+---------------------------------------------------+
A problem with predicates is that each one is regarded as
unique and capable of disambiguating cases where two
alternatives have identical lookahead. For example:
    rule : <<pred(LATEXT(1))>>? A
         | <<pred(LATEXT(1))>>? A
         ;
will not cause any error messages or warnings to be issued
by earlier versions of pccts. Comparing the text of the
predicates is only a partial solution.
In 1.33MR11 I am introducing the #pred statement in order to
solve some problems with predicates. The #pred statement allows
one to give a symbolic name to a "predicate literal" or a
"predicate expression" in order to refer to it in other predicate
expressions or in the rules of the grammar.
The predicate literal associated with a predicate symbol is C
or C++ code which can be used to test the condition. A
predicate expression defines a predicate symbol in terms of other
predicate symbols using "!", "&&", and "||". A predicate symbol
can be defined in terms of a predicate literal, a predicate
expression, or *both*.
When a predicate symbol is defined with both a predicate literal
and a predicate expression, the predicate literal is used to generate
code, but the predicate expression is used to detect two
alternatives that carry identical predicates.
Here are some examples of #pred statements:
    #pred IsLabel      <<isLabel(LATEXT(1))>>?
    #pred IsLocalVar   <<isLocalVar(LATEXT(1))>>?
    #pred IsGlobalVar  <<isGlobalVar(LATEXT(1))>>?
    #pred IsVar        <<isVar(LATEXT(1))>>?       IsLocalVar || IsGlobalVar
    #pred IsScoped     <<isScoped(LATEXT(1))>>?    IsLabel || IsLocalVar
I hope that the use of EBNF notation to describe the syntax of the
#pred statement will not cause problems for my readers (joke).
    predStatement : "#pred"
                        CapitalizedName
                          (
                              "<<predicate_literal>>?"
                            | "<<predicate_literal>>?"  predOrExpr
                            | predOrExpr
                          )
                  ;

    predOrExpr    : predAndExpr ( "||" predAndExpr ) * ;

    predAndExpr   : predPrimary ( "&&" predPrimary ) * ;

    predPrimary   : CapitalizedName
                  | "!" predPrimary
                  | "(" predOrExpr ")"
                  ;
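
For readers who prefer code to EBNF, here is a minimal sketch of a
recursive-descent parser for the predOrExpr part of the grammar above.
It is written in Python purely for illustration and is not the code
used by antlr. The tuple tree layout, the function names, and the
assumption that a CapitalizedName is an upper-case letter followed by
letters, digits, or underscores are all mine.

```python
import re

# Assumption: a CapitalizedName starts with an upper-case letter.
TOKEN = re.compile(r"\s*(\|\||&&|!|\(|\)|[A-Z][A-Za-z0-9_]*)")

def tokenize(text):
    """Split a predicate expression into operator and name tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_pred_expr(text):
    """Parse predOrExpr and return a nested tuple tree."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {tok!r}")
        pos += 1
        return tok

    def or_expr():      # predOrExpr : predAndExpr ( "||" predAndExpr ) *
        node = and_expr()
        while peek() == "||":
            take("||")
            node = ("or", node, and_expr())
        return node

    def and_expr():     # predAndExpr : predPrimary ( "&&" predPrimary ) *
        node = primary()
        while peek() == "&&":
            take("&&")
            node = ("and", node, primary())
        return node

    def primary():      # predPrimary : Name | "!" predPrimary | "(" ... ")"
        tok = take()
        if tok == "!":
            return ("not", primary())
        if tok == "(":
            node = or_expr()
            take(")")
            return node
        if tok in ("||", "&&", ")"):
            raise SyntaxError(f"unexpected {tok!r}")
        return ("sym", tok)

    tree = or_expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input: {peek()!r}")
    return tree
```

With this sketch, parse_pred_expr("IsLocalVar || IsGlobalVar") yields
an "or" node over two "sym" nodes, and "!" binds more tightly than
"&&", which in turn binds more tightly than "||", as in the grammar.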
What is the purpose of this nonsense ?
To understand how predicate symbols help, you need to realize that
predicate symbols are used in two different ways with two different
goals.
a. Allow simplification of predicates which have been combined
during predicate hoisting.
b. Allow recognition of identical predicates which can't disambiguate
alternatives with common lookahead.
First we will discuss goal (a). Consider the following rule:
    rule0: rule1
         | ID
         | ...
         ;
    rule1: rule2
         | rule3
         ;
    rule2: <<isX(LATEXT(1))>>? ID ;
    rule3: <<!isX(LATEXT(1))>>? ID ;
When the predicates in rule2 and rule3 are combined by hoisting
to create a prediction expression for rule1 the result is:
    if ( LA(1)==ID
         && ( isX(LATEXT(1)) || !isX(LATEXT(1)) ) ) { rule1(); ...
This is inefficient, but more importantly, it can lead to the false
assumption that the predicate expression distinguishes the rule1
alternative from some other alternative with lookahead ID. In
MR11 one can write:
    #pred IsX <<isX(LATEXT(1))>>?
    ...
    rule2: <<IsX>>? ID ;
    rule3: <<!IsX>>? ID ;
During hoisting MR11 recognizes this as a special case and
eliminates the predicates. The result is a prediction
expression like the following:
    if ( LA(1)==ID ) { rule1(); ...
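
The special case can be pictured with a small sketch. This is not the
hoisting code in antlr, only an illustration in Python of the idea
that an OR of a predicate symbol and its "!" form is always true and
can be dropped. The (symbol, negated) pair representation and the
function names are assumptions of mine.

```python
def simplify_or(terms):
    """Simplify an OR of hoisted predicates.

    terms is a list of (symbol, negated) pairs, where negated is True
    when the predicate action was written as <<!Symbol>>?.
    Returns True when the OR is always true, else the sorted terms.
    """
    seen = set(terms)
    for name, negated in seen:
        if (name, not negated) in seen:
            return True          # IsX || !IsX: nothing left to test
    return sorted(seen)

def prediction(token, terms):
    """Render a prediction expression for one alternative."""
    simplified = simplify_or(terms)
    if simplified is True:
        return f"LA(1)=={token}"
    preds = " || ".join(("!" if neg else "") + name
                        for name, neg in simplified)
    return f"LA(1)=={token} && ( {preds} )"
```

For rule2 and rule3 above, prediction("ID", [("IsX", False),
("IsX", True)]) gives "LA(1)==ID". A separately named symbol such as
NotX is just another name to this sketch, so
prediction("ID", [("IsX", False), ("NotX", False)]) keeps both tests.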
Please note that the following cases which appear to be equivalent
*cannot* be simplified by MR11 during hoisting because the hoisting
logic only checks for a "!" in the predicate action, not in the
predicate expression for a predicate symbol.
*Not* equivalent and is not simplified during hoisting:
    #pred IsX  <<isX(LATEXT(1))>>?
    #pred NotX <<!isX(LATEXT(1))>>?
    ...
    rule2: <<IsX>>? ID ;
    rule3: <<NotX>>? ID ;
*Not* equivalent and is not simplified during hoisting:
    #pred IsX  <<isX(LATEXT(1))>>?
    #pred NotX !IsX
    ...
    rule2: <<IsX>>? ID ;
    rule3: <<NotX>>? ID ;
Now we will discuss goal (b).
When antlr discovers that there is a lookahead ambiguity between
two alternatives it attempts to resolve the ambiguity by searching
for predicates in both alternatives. In the past any predicate
would do, even if the same one appeared in both alternatives:
    rule: <<p(LATEXT(1))>>? X
        | <<p(LATEXT(1))>>? X
        ;
The #pred statement is a start towards solving this problem.
During ambiguity resolution (*not* predicate hoisting) the
predicates for the two alternatives are expanded and compared.
Consider the following example:
    #pred Upper  <<isUpper(LATEXT(1))>>?
    #pred Lower  <<isLower(LATEXT(1))>>?
    #pred Alpha  <<isAlpha(LATEXT(1))>>?  Upper || Lower

    rule0: rule1
         | <<Alpha>>? ID
         ;

    rule1:
         | rule2
         | rule3
         ...
         ;
    rule2: <<Upper>>? ID;
    rule3: <<Lower>>? ID;
The definition of #pred Alpha expresses:
a. to test the predicate, use the C code "isAlpha(LATEXT(1))"
b. to analyze the predicate, use the information that
Alpha is equivalent to the union of Upper and Lower.
During ambiguity resolution the definition of Alpha is expanded
into "Upper || Lower" and compared with the predicate in the other
alternative, which is also "Upper || Lower". Because they are
identical MR11 will report a problem.
-------------------------------------------------------------------------
t10.g, line 5: warning: the predicates used to disambiguate rule rule0
(file t10.g alt 1 line 5 and alt 2 line 6)
are identical when compared without context and may have no
resolving power for some lookahead sequences.
-------------------------------------------------------------------------
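
The expand-and-compare step can be sketched as follows. This is an
illustration in Python, not the routine in antlr. The DEFINITIONS
table, the tuple form of expressions, and the normalization (flatten
nested operators, drop duplicates, sort operands) are assumptions of
mine. The real routine may compare differently and, as noted elsewhere
in this item, may fail to identify identical predicates.

```python
# Hypothetical symbol table: predicate symbol -> predicate expression,
# or None when the symbol has only a predicate literal.
DEFINITIONS = {
    "Upper": None,
    "Lower": None,
    "Alpha": ("or", "Upper", "Lower"),
}

def expand(expr, defs):
    """Replace each symbol by its predicate expression, recursively."""
    if isinstance(expr, str):
        body = defs.get(expr)
        return expr if body is None else expand(body, defs)
    op, *args = expr
    return (op, *[expand(arg, defs) for arg in args])

def normalize(expr):
    """Flatten nested and/or nodes and put operands in a fixed order."""
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [normalize(arg) for arg in args]
    if op in ("and", "or"):
        flat = []
        for arg in args:
            if isinstance(arg, tuple) and arg[0] == op:
                flat.extend(arg[1:])
            else:
                flat.append(arg)
        args = sorted(set(flat), key=repr)
        if len(args) == 1:
            return args[0]
    return (op, *args)

def identical(pred_a, pred_b, defs):
    """Compare two predicates after expansion, without context."""
    return normalize(expand(pred_a, defs)) == normalize(expand(pred_b, defs))
```

For rule0, alt 1 hoists ("or", "Upper", "Lower") from rule2 and rule3
and alt 2 uses "Alpha", so identical(("or", "Upper", "Lower"), "Alpha",
DEFINITIONS) is true and a warning like the one above is in order.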
If you use the "-info p" option the output file will contain:
+----------------------------------------------------------------------+
|#if 0 |
| |
|The following predicates are identical when compared without |
| lookahead context information. For some ambiguous lookahead |
| sequences they may not have any power to resolve the ambiguity. |
| |
|Choice 1: rule0/1 alt 1 line 5 file t10.g |
| |
| The original predicate for choice 1 with available context |
| information: |
| |
| OR expr |
| |
| pred << Upper>>? |
| depth=k=1 rule rule2 line 14 t10.g |
| set context: |
| ID |
| |
| pred << Lower>>? |
| depth=k=1 rule rule3 line 15 t10.g |
| set context: |
| ID |
| |
| The predicate for choice 1 after expansion (but without context |
| information): |
| |
| OR expr |
| |
| pred << isUpper(LATEXT(1))>>? |
| depth=k=1 rule line 1 t10.g |
| |
| pred << isLower(LATEXT(1))>>? |
| depth=k=1 rule line 2 t10.g |
| |
| |
|Choice 2: rule0/2 alt 2 line 6 file t10.g |
| |
| The original predicate for choice 2 with available context |
| information: |
| |
| pred << Alpha>>? |
| depth=k=1 rule rule0 line 6 t10.g |
| set context: |
| ID |
| |
| The predicate for choice 2 after expansion (but without context |
| information): |
| |
| OR expr |
| |
| pred << isUpper(LATEXT(1))>>? |
| depth=k=1 rule line 1 t10.g |
| |
| pred << isLower(LATEXT(1))>>? |
| depth=k=1 rule line 2 t10.g |
| |
| |
|#endif |
+----------------------------------------------------------------------+
The comparison of the predicates for the two alternatives takes
place without context information, which means that in some cases
the predicates will be considered identical even though they operate
on disjoint lookahead sets. Consider:
    #pred Alpha
    rule1: <<Alpha>>? ID
         | <<Alpha>>? Label
         ;
Because the comparison of predicates takes place without context
these will be considered identical. The reason for comparing
without context is that otherwise it would be necessary to re-evaluate
the entire predicate expression for each possible lookahead sequence.
This would require more code to be written and more CPU time during
grammar analysis, and it is not yet clear whether anyone will even make
use of the new #pred facility.
A temporary workaround might be to use different #pred statements
for predicates you know have different context. This would avoid
extraneous warnings.
The above example might be termed a "false positive". Comparison
without context will also lead to "false negatives". Consider the
following example:
    #pred Alpha
    #pred Beta
    rule1: <<Alpha>>? A
         | rule2
         ;
    rule2: <<Alpha>>? A
         | <<Beta>>? B
         ;
The predicate used for alt 2 of rule1 is (Alpha || Beta). This
appears to be different from the predicate Alpha used for alt 1.
However, the context of Beta is B. Thus, when the lookahead is A,
Beta will have no resolving power and Alpha will be used for both
alternatives. Using the same predicate for both alternatives isn't
very helpful, but this will not be detected with 1.33MR11.
To properly handle this the predicate expression would have to be
evaluated for each distinct lookahead context.
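
To make the idea of evaluating per lookahead context concrete, here is
a sketch in Python. It is not something 1.33MR11 does. The model is an
assumption of mine: each alternative is a list of (predicate, context)
pairs, the context is a set of k=1 lookahead tokens, and a predicate
only counts for tokens in its context.

```python
def effective(preds, token):
    """Predicates of one alternative that apply for this lookahead."""
    return frozenset(name for name, context in preds if token in context)

def unresolved_lookahead(alt1, alt2):
    """Tokens for which both alternatives test the same predicates."""
    tokens = set().union(*(context for _, context in alt1 + alt2))
    return sorted(
        tok for tok in tokens
        if effective(alt1, tok) and effective(alt1, tok) == effective(alt2, tok)
    )

def same_without_context(alt1, alt2):
    """The context-free comparison, for contrast."""
    return {name for name, _ in alt1} == {name for name, _ in alt2}

# The "false negative" example above: alt 2 of rule1 hoists
# Alpha (context A) and Beta (context B) from rule2.
ALT1 = [("Alpha", {"A"})]
ALT2 = [("Alpha", {"A"}), ("Beta", {"B"})]
```

Here same_without_context(ALT1, ALT2) is false, so nothing is
reported, while unresolved_lookahead(ALT1, ALT2) returns ["A"], the
lookahead for which Alpha is the only predicate on both sides. For the
earlier "false positive" example, with Alpha in contexts ID and Label,
the context-free comparison says "identical" but the per-token check
returns an empty list.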
Determining whether two predicate expressions are identical is
difficult, and the routine may fail to identify identical predicates.
The #pred feature also compares predicates to see if a choice between
alternatives which is resolved by a predicate which makes the second