changes_from_133.txt
=======================================================================
List of Implemented Fixes and Changes for Maintenance Releases of PCCTS
For a summary of the most significant changes see CHANGES_SUMMARY.TXT
=======================================================================
DISCLAIMER
The software and these notes are provided "as is". They may include
typographical or technical errors and their authors disclaim all
liability of any kind or nature for damages due to error, fault,
defect, or deficiency regardless of cause. All warranties of any
kind, either express or implied, including, but not limited to, the
implied warranties of merchantability and fitness for a particular
purpose are disclaimed.
-------------------------------------------------------
Note: Items #153 to #1 are now in a separate file named
CHANGES_FROM_133_BEFORE_MR13.txt
-------------------------------------------------------
#261. (Changed in MR19) Defer token fetch for C++ mode
Item #216 has been revised to indicate that use of the defer fetch
option (ZZDEFER_FETCH) requires dlg option -i.
#260. (MR22) Raise default lex buffer size from 8,000 to 32,000 bytes.
ZZLEXBUFSIZE is the size (in bytes) of the buffer used by dlg
generated lexers. The default value has been raised to 32,000 and
the value used by antlr, dlg, and sorcerer has also been raised to
32,000.
#259. (MR22) Default function arguments in C++ mode.
If a rule is declared:
rr [int i = 0] : ....
then the declaration generated by pccts resembles:
void rr(int i = 0);
however, the definition must omit the default argument:
void rr(int i) {...}
In the past the default value was not omitted. In MR22
the generated code resembles:
void rr(int i /* = 0 */ ) {...}
Implemented by Volker H. Simonis (simonis@informatik.uni-tuebingen.de)
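The C++ rule behind this change can be seen in a minimal sketch. The class
name Parser is hypothetical, and the rule returns int only so that the
default value can be observed; rules generated by pccts return void.

```cpp
#include <cassert>

// Hypothetical stand-in for a generated parser class (not pccts output).
class Parser {
public:
    int rr(int i = 0);              // the declaration carries the default
};

// The out-of-class definition must not repeat the default argument,
// so it survives only as a comment, as in the MR22 output.
int Parser::rr(int i /* = 0 */ ) {
    return i;
}
```

Calling rr() with no argument then uses the default supplied by the
declaration.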
#258. (MR22) Using a base class for your parser
In item #102 (MR10) the class statement was extended to allow one
to specify a base class other than ANTLRParser for the generated
parser. It turned out that this was less than useful because
the constructor still specified ANTLRParser as the base class.
The class statement now uses the first identifier appearing after
the ":" as the name of the base class. For example:
class MyParser : public FooParser {
Generates in MyParser.h:
class MyParser : public FooParser {
Generates in MyParser.cpp something that resembles:
MyParser::MyParser(ANTLRTokenBuffer *input) :
FooParser(input,1,0,0,4)
{
token_tbl = _token_tbl;
traceOptionValueDefault=1; // MR10 turn trace ON
}
The base class constructor must have a signature similar to
that of ANTLRParser.
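The signature requirement can be sketched with stand-in classes. TokenBuffer
and BaseParser below are hypothetical substitutes for ANTLRTokenBuffer and
ANTLRParser, and the parameter names are assumptions; only the shape of the
constructor call chain is meant to match the generated code.

```cpp
#include <cassert>

// Hypothetical stand-ins for ANTLRTokenBuffer and ANTLRParser.
struct TokenBuffer {};

class BaseParser {
public:
    BaseParser(TokenBuffer *in, int k, int use_inf_look,
               int demand_look, int bsetsize)
        : input(in), lookahead(k), set_size(bsetsize) {
        (void)use_inf_look;         // unused in this sketch
        (void)demand_look;          // unused in this sketch
    }
    TokenBuffer *input;
    int lookahead;
    int set_size;
};

// User-written base class: it accepts the same argument list and
// forwards it, which is what the generated constructor relies on.
class FooParser : public BaseParser {
public:
    FooParser(TokenBuffer *in, int k, int use_inf_look,
              int demand_look, int bsetsize)
        : BaseParser(in, k, use_inf_look, demand_look, bsetsize) {}
};

// Shape of the generated parser: it calls the FooParser constructor.
class MyParser : public FooParser {
public:
    explicit MyParser(TokenBuffer *in) : FooParser(in, 1, 0, 0, 4) {}
};
```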
#257. (MR21a) Removed dlg statement that -i has no effect in C++ mode.
This was incorrect.
#256. (MR21a) Malformed syntax graph causes crash after error message.
In the past, certain kinds of errors in the very first grammar
element could cause the construction of a malformed graph
representing the grammar. This would eventually result in a
fatal internal error. The code has been changed to be more
resistant to this particular error.
#255. (MR21a) ParserBlackBox(FILE* f)
This constructor set openByBlackBox to the wrong value.
Reported by Kees Bakker (kees_bakker@tasking.nl).
#254. (MR21a) Reporting syntax error at end-of-file
When there was a syntax error at the end-of-file the syntax
error routine would substitute "<eof>" for the programmer's
end-of-file symbol. This substitution is now done only when
the programmer does not define his own end-of-file symbol
or the symbol begins with the character "@".
Reported by Kees Bakker (kees_bakker@tasking.nl).
#253. (MR21) Generation of block preamble (-preamble and -preamble_first)
The antlr option -preamble causes antlr to insert the code
BLOCK_PREAMBLE at the start of each rule and block. It does
not insert code before rule references, token references, or
actions. By properly defining the macro BLOCK_PREAMBLE the
user can generate code which is specific to the start of blocks.
The antlr option -preamble_first is similar, but inserts the
code BLOCK_PREAMBLE_FIRST(PreambleFirst_123) where the symbol
PreambleFirst_123 is equivalent to the first set defined by
the #FirstSetSymbol described in Item #248.
I have not investigated how these options interact with guess
mode (syntactic predicates).
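A minimal sketch of a user-supplied BLOCK_PREAMBLE follows. The counter
blocks_entered and the function r_sketch are hypothetical; r_sketch is a
hand-written imitation of generated code for a rule containing a single
(...) block, and the exact placement chosen by antlr is an assumption.

```cpp
#include <cassert>

static int blocks_entered = 0;      // hypothetical counter, illustration only

// User-supplied definition; antlr -preamble only inserts the macro name.
#define BLOCK_PREAMBLE ++blocks_entered;

// Hand-written imitation of generated code for:  r : ( A | B ) ;
void r_sketch() {
    BLOCK_PREAMBLE                  // start of the rule
    {
        BLOCK_PREAMBLE              // start of the (...) block
        /* matching of A or B would go here */
    }
}
```

After one call to r_sketch() the counter is 2, one increment for the rule
and one for the enclosed block.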
#252. (MR21) Check for null pointer in trace routine
When some trace options are used with a parser that was generated
without the trace enabled, the current rule name may be a
NULL pointer. A guard was added to check for this in
restoreState.
Reported by Douglas E. Forester (dougf@projtech.com).
#251. (MR21) Changes to #define zzTRACE_RULES
The macro zzTRACE_RULES was being used to pass information to
AParser.h. If this preprocessor symbol was not properly
set the first time AParser.h was #included, the declaration
of zzTRACEdata would be omitted (it is used by the -gd option).
Subsequent #includes of AParser.h would be skipped because of
the #ifdef guard, so the declaration of zzTracePrevRuleName would
never be made. The result was that proper compilation was very
order dependent.
The declaration of zzTRACEdata was made unconditional and the
problem of removing unused declarations will be left to optimizers.
Diagnosed by Douglas E. Forester (dougf@projtech.com).
#250. (MR21) Option for EXPERIMENTAL change to error sets for blocks
The antlr option -mrblkerr turns on an experimental feature
which is supposed to provide more accurate syntax error messages
for k=1, ck=1 grammars. When used with k>1 or ck>1 grammars the
behavior should be no worse than the current behavior.
There is no problem with the matching of elements or the computation
of prediction expressions in pccts. The task is only one of listing
the most appropriate tokens in the error message. The error sets used
in pccts error messages are approximations of the exact error set when
optional elements in (...)* or (...)+ are involved. While entirely
correct, the error messages are sometimes not 100% accurate.
There is also a minor philosophical issue. For example, suppose the
grammar expects the token to be an optional A followed by Z, and it
is X. X, of course, is neither A nor Z, so an error message is appropriate.
Is it appropriate to say "Expected Z" ? It is correct, it is accurate,
but it is not complete.
When k>1 or ck>1 the problem of providing the exactly correct
list of tokens for the syntax error messages ends up becoming
equivalent to evaluating the prediction expression for the
alternatives twice. However, for k=1 ck=1 grammars the prediction
expression can be computed easily and evaluated cheaply, so I
decided to try implementing it to satisfy a particular application.
This application uses the error set in an interactive command language
to provide prompts which list the alternatives available at that
point in the parser. The user can then enter additional tokens to
complete the command line. To do this required more accurate error
sets than previously provided by pccts.
In some cases the default pccts behavior may lead to more robust error
recovery or clearer error messages than having the exact set of tokens.
This is because (a) features like -ge allow the use of symbolic names for
certain sets of tokens, so having extra tokens may simply obscure things
and (b) the error set is used to resynchronize the parser, so a good
choice is sometimes more important than having the exact set.
Consider the following example:
Note: All example code has been abbreviated
to the absolute minimum in order to make the
examples concise.
star1 : (A)* Z;
The generated code resembles:
old                     new (with -mrblkerr)
-------------           --------------------
for (;;) {              for (;;) {
  match(A);               match(A);
}                       }
match(Z);               if (! A and ! Z) {
                          FAIL(...{A,Z}...);
                        }
                        match(Z);
With input X
old message: Found X, expected Z
new message: Found X, expected A, Z
For the example:
star2 : (A|B)* Z;
old                          new (with -mrblkerr)
-------------                --------------------
for (;;) {                   for (;;) {
  if (!A and !B) break;        if (!A and !B) break;
  if (...) {                   if (...) {
    <same ...>                   <same ...>
  }                            }
  else {                       else {
    FAIL(...{A,B,Z}...)          FAIL(...{A,B}...);
  }                            }
}                            }
match(Z);                    if (! A and ! B and ! Z) {
                               FAIL(...{A,B,Z}...);
                             }
                             match(Z);
With input X
old message: Found X, expected Z
new message: Found X, expected A, B, Z
With input A X
old message: Found X, expected Z
new message: Found X, expected A, B, Z
This includes the choice of looping back to the
star block.
The code for plus blocks:
plus1 : (A)+ Z;
The generated code resembles:
old                     new (with -mrblkerr)
-------------           --------------------
do {                    do {
  match(A);               match(A);
} while (A)             } while (A)
match(Z);               if (! A and ! Z) {
                          FAIL(...{A,Z}...);
                        }
                        match(Z);
With input A X
old message: Found X, expected Z
new message: Found X, expected A, Z
This includes the choice of looping back to the
plus block.
For the example:
plus2 : (A|B)+ Z;
old                          new (with -mrblkerr)
-------------                --------------------
do {                         do {
  if (A) {                     <same>
    match(A);                  <same>
  } else if (B) {              <same>
    match(B);                  <same>
  } else {                     <same>
    if (cnt > 1) break;        <same>
    FAIL(...{A,B,Z}...)          FAIL(...{A,B}...);
  }                            }
  cnt++;                       <same>
}                            }
match(Z);                    if (! A and ! B and ! Z) {
                               FAIL(...{A,B,Z}...);
                             }
                             match(Z);
With input X
old message: Found X, expected A, B, Z
new message: Found X, expected A, B
With input A X
old message: Found X, expected Z
new message: Found X, expected A, B, Z
This includes the choice of looping back to the
plus block.
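For k=1, ck=1 the set reported on leaving a (...)* or (...)+ block in the
examples above is the union of the tokens that can restart the loop and the
tokens that can follow it. The sketch below illustrates that idea only; it
is not the pccts implementation, and the names TokenSet, exit_error_set,
and message are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

typedef std::set<std::string> TokenSet;

// Tokens acceptable at loop exit: anything that restarts the loop
// plus anything that may follow the loop.
TokenSet exit_error_set(const TokenSet &loop_first,
                        const TokenSet &follow_first) {
    TokenSet result = loop_first;
    result.insert(follow_first.begin(), follow_first.end());
    return result;
}

// Format a message in the style used by the examples above.
std::string message(const std::string &found, const TokenSet &expected) {
    std::string text = "Found " + found + ", expected ";
    bool first = true;
    for (TokenSet::const_iterator it = expected.begin();
         it != expected.end(); ++it) {
        if (!first) text += ", ";
        text += *it;
        first = false;
    }
    return text;
}
```

With a loop first set of {A, B} and a follow set of {Z}, the formatted text
for input X matches the new message shown for star2.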
#249. (MR21) Changes for DEC/VMS systems
Jean-François Piéronne (jfp@altavista.net) has updated some
VMS related command files and fixed some minor problems related
to building pccts under the DEC/VMS operating system. For DEC/VMS
users the most important differences are:
a. Revised makefile.vms
b. Revised genMMS for generating VMS style makefiles.
#248. (MR21) Generate symbol for first set of an alternative
pccts can generate a symbol which represents the tokens which may
appear at the start of a block:
rr : #FirstSetSymbol(rr_FirstSet) ( Foo | Bar ) ;
This will generate the symbol rr_FirstSet of type SetWordType with
elements Foo and Bar set. The bits can be tested using code similar
to the following:
if (set_el(Foo, &rr_FirstSet)) { ...
This can be combined with the C array zztokens[] or the C++ routine
tokenName() to get the print name of the token in the first set.
The size of the set is given by the newly added enum SET_SIZE, a
protected member of the generated parser's class. The number of
elements in the generated set will not be exactly equal to the
value of SET_SIZE because of synthetic tokens created by #tokclass,
#errclass, the -ge option, and meta-tokens such as epsilon and
end-of-file.
The #FirstSetSymbol must appear immediately before a block
such as (...)+, (...)*, {...}, or (...). It may not appear
immediately before a token, a rule reference, or an action. However,
a token or rule reference can be enclosed in a (...) in order to
make the use of #FirstSetSymbol legal.
rr_bad : #FirstSetSymbol(rr_bad_FirstSet) Foo; // Illegal
rr_ok : #FirstSetSymbol(rr_ok_FirstSet) (Foo); // Legal
Do not confuse FirstSetSymbol sets with the sets used for testing
lookahead. The sets used for FirstSetSymbol have one element per bit,
so the number of bytes is approximately the largest token number
divided by 8. The sets used for testing lookahead store 8 lookahead
sets per byte, so the length of the array is approximately the largest
token number.
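The one-bit-per-token layout can be sketched as follows. SetWord, set_bytes,
add_token, and has_token are hypothetical stand-ins written for this note;
they are not the pccts SetWordType or set_el, and the bit order within a
byte is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char SetWord;      // stand-in for the pccts set word type

// Bytes needed for tokens numbered 0..max_token at one bit per token,
// roughly the largest token number divided by 8.
std::size_t set_bytes(unsigned max_token) {
    return max_token / 8 + 1;
}

// Mark a token as a member of the set.
void add_token(std::vector<SetWord> &set, unsigned token) {
    set[token / 8] |= static_cast<SetWord>(1u << (token % 8));
}

// Test membership of a token.
bool has_token(const std::vector<SetWord> &set, unsigned token) {
    return ((set[token / 8] >> (token % 8)) & 1u) != 0;
}
```

A set covering token numbers up to 20 occupies 3 bytes in this layout.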