======================================================================
CHANGES_SUMMARY.TXT
A QUICK overview of changes from 1.33 in reverse order
A summary of additions rather than bug fixes and minor code changes.
Numbers refer to items in CHANGES_FROM_133*.TXT
which may contain additional information.
DISCLAIMER
The software and these notes are provided "as is". They may include
typographical or technical errors and their authors disclaim all
liability of any kind or nature for damages due to error, fault,
defect, or deficiency regardless of cause. All warranties of any
kind, either express or implied, including, but not limited to, the
implied warranties of merchantability and fitness for a particular
purpose are disclaimed.
======================================================================
#258. You can specify a user-defined base class for your parser
The base class constructor must have a signature similar to
that of ANTLRParser.
#253. Generation of block preamble (-preamble and -preamble_first)
The antlr option -preamble causes antlr to insert the code
BLOCK_PREAMBLE at the start of each rule and block.
The antlr option -preamble_first is similar, but inserts the
code BLOCK_PREAMBLE_FIRST(PreambleFirst_123) where the symbol
PreambleFirst_123 is equivalent to the first set defined by
the #FirstSetSymbol described in Item #248.
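A minimal sketch of how a user-supplied BLOCK_PREAMBLE definition might be used for tracing. The functions expr() and stat() are hand-written stand-ins for generated rule functions, g_trace is a hypothetical log, and the sketch assumes the macro expands to a complete statement; none of these names come from PCCTS itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace log used only for this illustration.
static std::vector<std::string> g_trace;

// User-supplied definition. With -preamble, antlr emits only the bare name
// BLOCK_PREAMBLE; what it expands to is up to the user (assumption: a
// complete statement, as here).
#define BLOCK_PREAMBLE g_trace.push_back(__func__);

// Hand-written stand-ins for generated rule functions.
void expr() { BLOCK_PREAMBLE }
void stat() { BLOCK_PREAMBLE expr(); }
```

Calling stat() records "stat" and then "expr" in g_trace, which is one way such a hook could be used to follow rule entry order.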
#248. Generate symbol for first set of an alternative
rr : #FirstSetSymbol(rr_FirstSet) ( Foo | Bar ) ;
#216. Defer token fetch for C++ mode
When the ANTLRParser class is built with the pre-processor option
ZZDEFER_FETCH defined, the fetch of new tokens by consume() is deferred
until LA(i) or LT(i) is called.
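The idea behind deferred fetch can be sketched as follows. This is a simplified model, not the ANTLRParser implementation: CountingSource and DeferredBuffer are hypothetical names, tokens are plain ints, and -1 stands in for end of input.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical token source that counts how many tokens were pulled from it.
struct CountingSource {
    std::vector<int> tokens;
    std::size_t next = 0;
    int fetches = 0;
    int getToken() {
        ++fetches;
        return next < tokens.size() ? tokens[next++] : -1;  // -1 models EOF
    }
};

// Simplified model of a lookahead buffer with deferred fetch.
class DeferredBuffer {
public:
    explicit DeferredBuffer(CountingSource& src) : src_(src) {}

    // consume() only records that a token should be dropped; nothing is fetched.
    void consume() { ++pendingConsumes_; }

    // LA(i) first settles pending consumes, then fetches until i tokens are buffered.
    int LA(std::size_t i) {
        while (pendingConsumes_ > 0) {
            if (buf_.empty()) buf_.push_back(src_.getToken());
            buf_.pop_front();
            --pendingConsumes_;
        }
        while (buf_.size() < i) buf_.push_back(src_.getToken());
        return buf_[i - 1];
    }

private:
    CountingSource& src_;
    std::deque<int> buf_;
    int pendingConsumes_ = 0;
};
```

In this model the fetch count does not change when consume() is called; it changes only on the next LA(i) call. That matters for interactive input, where an eager fetch would block waiting for a token the parser may not need yet.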
#215. Use reset() to reset DLGLexerBase
#188. Added pccts/h/DLG_stream_input.h
#180. Added ANTLRParser::getEofToken()
#173. -glms for Microsoft style filenames with -gl
#170. Suppression for predicates with lookahead depth >1
Consider the following grammar with -ck 2 and the predicate in rule
"a" with depth 2:
    r1 : (ab)* "@"
       ;

    ab : a
       | b
       ;

    a  : (A B)? => <<p(LATEXT(2))>>? A B C
       ;

    b  : A B C
       ;
Normally, the predicate would be hoisted into rule r1 in order to
determine whether to call rule "ab". However it should *not* be
hoisted because, even if p is false, there is a valid alternative
in rule b. With "-mrhoistk on" the predicate will be suppressed.
If the "-info p" command line option is present, the following information
will appear in the generated code:
while ( (LA(1)==A)
#if 0
Part (or all) of predicate with depth > 1 suppressed by alternative
without predicate
pred << p(LATEXT(2))>>?
depth=k=2 ("=>" guard) rule a line 8 t1.g
tree context:
(root = A
B
)
The token sequence which is suppressed: ( A B )
The sequence of references which generate that sequence of tokens:
1 to ab r1/1 line 1 t1.g
2 ab ab/1 line 4 t1.g
3 to b ab/2 line 5 t1.g
4 b b/1 line 11 t1.g
5 #token A b/1 line 11 t1.g
6 #token B b/1 line 11 t1.g
#endif
A slightly more complicated example:
    r1 : (ab)* "@"
       ;

    ab : a
       | b
       ;

    a  : (A B)? => <<p(LATEXT(2))>>? (A B | D E)
       ;

    b  : <<q(LATEXT(2))>>? D E
       ;
In this case, the sequence (D E) in rule "a" which lies behind
the guard is used to suppress the predicate with context (D E)
in rule b.
while ( (LA(1)==A || LA(1)==D)
#if 0
Part (or all) of predicate with depth > 1 suppressed by alternative
without predicate
pred << q(LATEXT(2))>>?
depth=k=2 rule b line 11 t2.g
tree context:
(root = D
E
)
The token sequence which is suppressed: ( D E )
The sequence of references which generate that sequence of tokens:
1 to ab r1/1 line 1 t2.g
2 ab ab/1 line 4 t2.g
3 to a ab/1 line 4 t2.g
4 a a/1 line 8 t2.g
5 #token D a/1 line 8 t2.g
6 #token E a/1 line 8 t2.g
#endif
&&
#if 0
pred << p(LATEXT(2))>>?
depth=k=2 ("=>" guard) rule a line 8 t2.g
tree context:
(root = A
B
)
#endif
(! ( LA(1)==A && LA(2)==B ) || p(LATEXT(2)) ) {
ab();
...
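The test that survives suppression in the second example of item #170 can be read as a plain boolean function. The sketch below mirrors its shape; the Token enum and the name enterAb are hypothetical, and the call p(LATEXT(2)) is replaced by a bool parameter so the logic can be checked in isolation.

```cpp
#include <cassert>

enum Token { A, B, C, D, E };

// Shape of the emitted loop test: the first set of rule ab is checked, and
// predicate p is consulted only when the lookahead matches the guard
// context (A B). For any other lookahead, such as (D E), p is ignored.
bool enterAb(Token la1, Token la2, bool p) {
    return (la1 == A || la1 == D) &&
           (!(la1 == A && la2 == B) || p);
}
```

With lookahead (D E) the function returns true whether p is true or false, which is the effect of suppressing predicate q and confining p to its guard context.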
#165. (Changed in MR13) option -newAST
To create ASTs from an ANTLRTokenPtr antlr usually calls
"new AST(ANTLRTokenPtr)". This option generates a call
to "newAST(ANTLRTokenPtr)" instead. This allows a user
to define a parser member function to create an AST object.
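A minimal sketch of the kind of member function this option makes possible. Token, AST and MyParser here are simplified, hypothetical stand-ins (the real AST and ANTLRTokenPtr types differ); the point is only that node creation goes through a parser member that the user controls.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal stand-ins for the token and AST types.
struct Token { std::string text; };

struct AST {
    std::string text;
    bool viaFactory;
    explicit AST(const Token& t, bool factory = false)
        : text(t.text), viaFactory(factory) {}
};

class MyParser {
public:
    // With -newAST, generated code would call this member instead of
    // "new AST(tok)", so the parser can count, pool, or subclass nodes.
    AST* newAST(const Token& tok) {
        ++created;
        return new AST(tok, /*factory=*/true);
    }
    int created = 0;
};
```

In this sketch the caller owns the returned node and must delete it; a real parser would follow whatever ownership rules its AST base class defines.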
#161. (Changed in MR13) Switch -gxt inhibits generation of tokens.h
#158. (Changed in MR13) #header causes problem for pre-processors
A user who runs the C pre-processor on antlr source suggested
that another syntax be allowed. With MR13, directives
such as #header, #pragma, etc. may be written as "\#header",
"\#pragma", etc. For escaping pre-processor directives inside
a #header use something like the following:
\#header
<<
\#include <stdio.h>
>>
#155. (Changed in MR13) Context behind predicates can suppress
With -mrhoist enabled the context behind a guarded predicate can
be used to suppress other predicates. Consider the following grammar:
    r0 : (r1)+;
    r1 : rp
       | rq
       ;
    rp : <<p LATEXT(1)>>? B ;
    rq : (A)? => <<q LATEXT(1)>>? (A|B);
In earlier versions both predicates "p" and "q" would be hoisted into
rule r0. With MR12c predicate p is suppressed because the context which
follows predicate q includes "B" which can "cover" predicate "p". In
other words, in trying to decide in r0 whether to call r1, it doesn't
really matter whether p is false or true because, either way, there is
a valid choice within r1.
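The notion of "cover" can be illustrated with a deliberately simplified model at lookahead depth 1. This is not antlr's actual algorithm; TokenSet and isCovered are hypothetical names, and contexts are reduced to sets of token names.

```cpp
#include <cassert>
#include <set>
#include <string>

using TokenSet = std::set<std::string>;

// Returns true when every token in the predicate's context can also be
// matched by some alternative that does not consult a predicate for it.
// In that case the predicate's value cannot change the decision.
bool isCovered(const TokenSet& predicateContext,
               const TokenSet& unguardedContext) {
    for (const auto& tok : predicateContext) {
        if (unguardedContext.count(tok) == 0) return false;
    }
    return true;
}
```

For the grammar above, predicate p has context {B}. Rule rq accepts (A|B) but its guard restricts q to A, so B is reachable without any predicate and isCovered({B}, {B}) is true. Predicate q has context {A}, and no alternative accepts A without a predicate, so q is not covered.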
#154. (Changed in MR13) Making hoist suppression explicit using <<nohoist>>
A common error, even among experienced pccts users, is to code
an init-action to inhibit hoisting rather than a leading action.
An init-action does not inhibit hoisting.
This was coded:
rule1 : <<;>> rule2
This is what was meant:
rule1 : <<;>> <<;>> rule2
With MR13, the user can code:
rule1 : <<;>> <<nohoist>> rule2
The following will give an error message:
rule1 : <<nohoist>> rule2
If the <<nohoist>> appears as an init-action rather than a leading
action an error message is issued. The meaning of an init-action
containing "nohoist" is unclear: does it apply to just one
alternative or to all alternatives?
#151a. Addition of ANTLRParser::getLexer(), ANTLRTokenStream::getLexer()
You must manually cast the ANTLRTokenStream to your program's
lexer class, because the name of the lexer's class is not fixed
and therefore cannot be incorporated into the DLGLexerBase
class.
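A sketch of the cast described in item #151a, using hypothetical stand-in classes rather than the real PCCTS headers. TokenStreamBase plays the role of ANTLRTokenStream, MyLexer the role of the generated lexer class, and ParserSketch the role of the parser.

```cpp
#include <cassert>

// Hypothetical stand-ins for the PCCTS classes named above.
struct TokenStreamBase {
    virtual ~TokenStreamBase() {}
};

struct MyLexer : TokenStreamBase {
    int lineCount = 7;  // some lexer-specific state
};

struct ParserSketch {
    TokenStreamBase* lexer;
    // Returns the base-class pointer; the concrete lexer type is unknown here.
    TokenStreamBase* getLexer() const { return lexer; }
};

// Only user code knows the concrete lexer class, so the cast lives here.
int currentLine(const ParserSketch& p) {
    MyLexer* lex = static_cast<MyLexer*>(p.getLexer());
    return lex->lineCount;
}
```

The static_cast is safe only if the stream really is a MyLexer; where that is uncertain, dynamic_cast with a null check is the more defensive choice.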
#151b.(Changed in MR12) ParserBlackBox member getLexer()
#150. (Changed in MR12) syntaxErrCount and lexErrCount now public
#149. (Changed in MR12) antlr option -info o (letter o for orphan)
If there is more than one rule which is not referenced by any
other rule then all such rules are listed. This is useful for
alerting one to rules which are not used, but which can still
contribute to ambiguity.
#148. (Changed in MR11) #token names appearing in zztokens,token_tbl
One can write:
#token Plus ("+") "\+"
#token RP ("(") "\("
#token COM ("comment begin") "/\*"
The string in parentheses will be used in syntax error messages.
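The effect on error reporting can be sketched with a small lookup table. The table name kTokenNames, the function missingTokenMessage and the message wording are all hypothetical; only the three display strings come from the #token lines above.

```cpp
#include <cassert>
#include <string>

enum TokenType { Plus, RP, COM, TokenCount };

// Hypothetical name table: each entry holds the display string given in
// parentheses in the #token statement, not the token's identifier.
static const char* const kTokenNames[TokenCount] = {
    "+", "(", "comment begin"
};

// Builds an illustrative message; real PCCTS messages are worded differently.
std::string missingTokenMessage(TokenType expected) {
    return std::string("syntax error: missing ") + kTokenNames[expected];
}
```

A user then sees "comment begin" rather than the internal name COM in the message.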
#146. (Changed in MR11) Option -treport for locating "difficult" alts
It can be difficult to determine which alternatives are causing
pccts to work hard to resolve an ambiguity. In some cases the
ambiguity is successfully resolved after much CPU time so there
is no message at all.
A rough measure of the amount of work being performed which is
independent of the CPU speed and system load is the number of
tnodes created. Using "-info t" gives information about the
total number of tnodes created and the peak number of tnodes.
Tree Nodes: peak 1300k created 1416k lost 0
It also puts in the generated C or C++ file the number of tnodes
created for a rule (at the end of the rule). However this
information is not sufficient to locate the alternatives within
a rule which are causing the creation of tnodes.
Using:
antlr -treport 100000 ....
causes antlr to list on stdout any alternatives which require the
creation of more than 100,000 tnodes, along with the lookahead sets
for those alternatives.
The following is a trivial case from the ansi.g grammar which shows
the format of the report. This report might be of more interest
in cases where 1,000,000 tuples were created to resolve the ambiguity.
-------------------------------------------------------------------------
There were 0 tuples whose ambiguity could not be resolved
by full lookahead
There were 157 tnodes created to resolve ambiguity between:
Choice 1: statement/2 line 475 file ansi.g
Choice 2: statement/3 line 476 file ansi.g
Intersection of lookahead[1] sets:
IDENTIFIER
Intersection of lookahead[2] sets:
LPARENTHESIS COLON AMPERSAND MINUS
STAR PLUSPLUS MINUSMINUS ONESCOMPLEMENT
NOT SIZEOF OCTALINT DECIMALINT
HEXADECIMALINT FLOATONE FLOATTWO IDENTIFIER
STRING CHARACTER
-------------------------------------------------------------------------
#143. (Changed in MR11) Optional ";" at end of #token statement
Fixes problem of:
#token X "x"
<<