📄 changes_from_1.33
The problem is that both alpha and beta are stored in the syntax diagram, and that some analysis routines would fail to skip the alpha portion when it was not on the leading edge. Consider the following grammar with -ck 2:

    r : ( (A)? B )* C D

      | A B      /* forces -ck 2 computation for old antlr    */
                 /*   reports ambig for alts 1 & 2            */

      | B C      /* forces -ck 2 computation for new antlr    */
                 /*   reports ambig for alts 1 & 3            */
      ;

The prediction expression for the first alternative should be LA(1)={B C} LA(2)={B C D}, but previous versions of antlr would compute the prediction expression as LA(1)={A C} LA(2)={B D}.

Reported by Arpad Beszedes (beszedes@inf.u-szeged.hu), who provided a very clear example of the problem and identified the probable cause.

#177. (Changed in MR14) #tokdefs and #token with regular expression

In MR13 the change described by Item #162 caused an existing feature of antlr to fail. Prior to the change it was possible to give regular expression definitions and actions to tokens which were defined via the #tokdefs directive. This now works again.

Reported by Manfred Kogler (km@cast.uni-linz.ac.at).

#176. (Changed in MR14) Support for #line in antlr source code

Note: this was implemented by Arpad Beszedes (beszedes@inf.u-szeged.hu).

In 1.33MR14 it is possible for a pre-processor to generate #line directives in the antlr source and have those line numbers and file names used in antlr error messages and in the #line directives generated by antlr.

The #line directive may appear in the following forms:

    #line ll "sss" xx xx ...

where ll represents a line number, "sss" represents the name of a file enclosed in quotation marks, and xx are arbitrary integers.

The following form (without "line") is not supported at the moment:

    # ll "sss" xx xx ...
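As a rough illustration of the directive handling in Item #176, the following C sketch recognizes the supported form and rejects the form without "line". It is not antlr's own code: the names LineInfo and parse_line_directive are hypothetical, antlr itself records the values in zzline and FileStr[CurFile], and treating "\" as a path separator in addition to "/" is an assumption. The sketch also drops the directory path from the file name, which is the behaviour this item describes for the line-info.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type, used only for this sketch. */
typedef struct {
    int  line;        /* "ll" from the directive                      */
    char file[256];   /* "sss" without its directory path, "" if none */
} LineInfo;

/* Parse text of the form    #line ll "sss" xx xx ...
   Returns 1 and fills *out on success, 0 otherwise.
   The form without the word "line" is rejected. */
int parse_line_directive(const char *text, LineInfo *out)
{
    const char *p = text;
    char *end;
    long n;

    assert(text != NULL && out != NULL);

    while (*p == ' ' || *p == '\t') p++;          /* leading blanks      */
    if (strncmp(p, "#line", 5) != 0) return 0;    /* "# ll ..." fails    */
    p += 5;
    if (*p != ' ' && *p != '\t') return 0;        /* need a separator    */

    n = strtol(p, &end, 10);                      /* the line number ll  */
    if (end == p || n <= 0 || n > INT_MAX) return 0;
    out->line = (int)n;
    out->file[0] = '\0';

    p = end;
    while (*p == ' ' || *p == '\t') p++;
    if (*p == '"') {                              /* optional "sss"      */
        const char *start = p + 1;
        const char *close = strchr(start, '"');
        const char *q;
        size_t len;

        if (close == NULL) return 0;              /* unterminated string */
        /* Keep only the text after the last path separator
           (assumption: both '/' and '\\' count as separators). */
        for (q = start; q < close; q++)
            if (*q == '/' || *q == '\\') start = q + 1;
        len = (size_t)(close - start);
        if (len >= sizeof out->file) len = sizeof out->file - 1;
        memcpy(out->file, start, len);
        out->file[len] = '\0';
    }
    return 1;   /* any trailing integers xx xx ... are ignored */
}
```

Called on the text #line 2 "../grammar.g", the sketch stores line 2 and the file name grammar.g; called on # 2 "../grammar.g" it returns 0.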
The result:

    zzline is replaced with ll from the # or #line directive

    FileStr[CurFile] is updated with the contents of the string (if any) following the line number

Note
----
The file-name string following the line number can be a complete name with a directory-path. Antlr generates the output files from the input file name (by replacing the extension from the file-name with .c or .cpp).

If the input file (or the file-name from the line-info) contains a path:

    "../grammar.g"

the generated source code will be placed in "../grammar.cpp" (i.e. in the parent directory). This is inconvenient in some cases (even the -o switch can not be used) so the path information is removed from the #line directive. Thus, if the line-info was

    #line 2 "../grammar.g"

then the current file-name will become "grammar.g".

In this way, the source code generated from the grammar file will always be in the current directory, except when the -o switch is used.

#175. (Changed in MR14) Bug when guess block appears at start of (...)*

In 1.33 vanilla and all maintenance releases prior to 1.33MR14 there is a bug when a guess block appears at the start of a (...)*. Consider the following k=1 (ck=1) grammar:

    rule : ( (STAR)? ZIP )* ID ;

Prior to 1.33MR14, the generated code resembled:

    ...
    zzGUESS_BLOCK
    while ( 1 ) {
        if ( ! LA(1)==STAR) break;
        zzGUESS
        if ( !zzrv ) {
            zzmatch(STAR);
            zzCONSUME;
            zzGUESS_DONE
            zzmatch(ZIP);
            zzCONSUME;
    ...

Note that the routine uses STAR for the prediction expression rather than ZIP. With 1.33MR14 the generated code resembles:

    ...
    while ( 1 ) {
        if ( ! LA(1)==ZIP) break;
    ...

This problem existed only with (...)* blocks and was caused by the slightly more complicated graph which represents (...)* blocks. This caused the analysis routine to compute the first set for the alpha part of the "(alpha)? beta" rather than the beta part.

Both (...)+ and {...} blocks handled the guess block correctly.
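The idea behind the fix in Item #175 can be shown with a toy model in C. This is only a sketch under strong simplifying assumptions (k=1, each element matches a single token) and does not reflect antlr's internal syntax-diagram structures; Elem, predict_token and NO_TOKEN are hypothetical names. The prediction token is taken from the first element that is not a guess block, that is, from beta rather than alpha.

```c
#include <assert.h>
#include <stddef.h>

enum Token { NO_TOKEN, STAR, ZIP, ID };  /* token names from the grammar */

/* Hypothetical, simplified element of one alternative. */
typedef struct {
    int        is_guess;  /* 1 for the "(alpha)?" guess block       */
    enum Token token;     /* the single token the element matches   */
} Elem;

/* Return the token that should predict the alternative: the token of
   the first element that is NOT a guess block (the start of beta). */
enum Token predict_token(const Elem *alt, size_t n)
{
    size_t i;

    assert(alt != NULL);
    for (i = 0; i < n; i++)
        if (!alt[i].is_guess)
            return alt[i].token;
    return NO_TOKEN;      /* the alternative holds only guess blocks */
}
```

For the alternative (STAR)? ZIP, modelled as the two elements {1, STAR} and {0, ZIP}, predict_token returns ZIP, matching the test LA(1)==ZIP generated by 1.33MR14.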
Reported by Arpad Beszedes (beszedes@inf.u-szeged.hu), who provided a very clear example of the problem and identified the probable cause.

#174. (Changed in MR14) Bug when action precedes syntactic predicate

In 1.33 vanilla, and all maintenance releases prior to MR14, there was a bug when a syntactic predicate was immediately preceded by an action. Consider the following -ck 2 grammar:

    rule :
           <<int i;>>
           (alpha)? beta C
         | A B
         ;

    alpha : A ;
    beta  : A B;

Prior to MR14, the code generated for the first alternative resembled:

    ...
    zzGUESS
    if ( !zzrv && LA(1)==A && LA(2)==A) {
        alpha();
        zzGUESS_DONE
        beta();
        zzmatch(C);
        zzCONSUME;
    } else {
    ...

The prediction expression (i.e. LA(1)==A && LA(2)==A) is clearly wrong because LA(2) should be matched to B (first[2] of beta is {B}).

With 1.33MR14 the prediction expression is:

    ...
    if ( !zzrv && LA(1)==A && LA(2)==B) {
        alpha();
        zzGUESS_DONE
        beta();
        zzmatch(C);
        zzCONSUME;
    } else {
    ...

This only affects grammars in which alpha is shorter than max(k,ck) and there is an action immediately preceding the syntactic predicate.

This problem was reported by Arpad Beszedes (beszedes@inf.u-szeged.hu), who provided a very clear example of the problem and identified the presence of the init-action as the likely culprit.

#173. (Changed in MR13a) -glms for Microsoft style filenames with -gl

With the -gl option antlr generates #line directives using the exact name of the input files specified on the command line. An oddity of the Microsoft C and C++ compilers is that they don't accept file names in #line directives containing "\" even though these are names from the native file system.

With the -glms option, the "\" in file names appearing in #line directives is replaced with a "/" in order to conform to Microsoft compiler requirements.

Reported by Erwin Achermann (erwin.achermann@switzerland.org).

#172. (Changed in MR13) \r\n in antlr source counted as one line

Some MS software uses \r\n to indicate a new line.
Antlr now recognizes this in counting lines.

Reported by Edward L. Hepler (elh@ece.vill.edu).

#171. (Changed in MR13) #tokclass L..U now allowed

The following is now allowed:

    #tokclass ABC { A..B C }

Reported by Dave Watola (dwatola@amtsun.jpl.nasa.gov)

#170. (Changed in MR13) Suppression for predicates with lookahead depth >1

In MR12 the capability for suppression of predicates with lookahead depth=1 was introduced. With MR13 this has been extended to predicates with lookahead depth > 1 and released for use by users on an experimental basis.

Consider the following grammar with -ck 2 and the predicate in rule "a" with depth 2:

    r1 : (ab)* "@"
       ;

    ab : a
       | b
       ;

    a  : (A B)? => <<p(LATEXT(2))>>? A B C
       ;

    b  : A B C
       ;

Normally, the predicate would be hoisted into rule r1 in order to determine whether to call rule "ab". However it should *not* be hoisted because, even if p is false, there is a valid alternative in rule b. With "-mrhoistk on" the predicate will be suppressed.

If the "-info p" command line option is present the following information will appear in the generated code:

        while ( (LA(1)==A)
    #if 0

    Part (or all) of predicate with depth > 1 suppressed by alternative
        without predicate

    pred  <<  p(LATEXT(2))>>?
              depth=k=2  ("=>" guard)  rule a  line 8  t1.g
      tree context:
        (root = A
           B
        )

    The token sequence which is suppressed: ( A B )
    The sequence of references which generate that sequence of tokens:

       1 to ab          r1/1       line 1     t1.g
       2 ab             ab/1       line 4     t1.g
       3 to b           ab/2       line 5     t1.g
       4 b              b/1        line 11    t1.g
       5 #token A       b/1        line 11    t1.g
       6 #token B       b/1        line 11    t1.g

    #endif

A slightly more complicated example:

    r1 : (ab)* "@"
       ;

    ab : a
       | b
       ;

    a  : (A B)? => <<p(LATEXT(2))>>? (A B | D E)
       ;

    b  : <<q(LATEXT(2))>>? D E
       ;

In this case, the sequence (D E) in rule "a" which lies behind the guard is used to suppress the predicate with context (D E) in rule b.

        while ( (LA(1)==A || LA(1)==D)
    #if 0

    Part (or all) of predicate with depth > 1 suppressed by alternative
        without predicate

    pred  <<  q(LATEXT(2))>>?
              depth=k=2  rule b  line 11  t2.g
      tree context:
        (root = D
           E
        )

    The token sequence which is suppressed: ( D E )
    The sequence of references which generate that sequence of tokens:

       1 to ab          r1/1       line 1     t2.g
       2 ab             ab/1       line 4     t2.g
       3 to a           ab/1       line 4     t2.g
       4 a              a/1        line 8     t2.g
       5 #token D       a/1        line 8     t2.g
       6 #token E       a/1        line 8     t2.g

    #endif
        &&
    #if 0

    pred  <<  p(LATEXT(2))>>?
              depth=k=2  ("=>" guard)  rule a  line 8  t2.g
      tree context:
        (root = A
           B
        )

    #endif

        (! ( LA(1)==A && LA(2)==B ) || p(LATEXT(2)) )  {
            ab();
            ...

#169. (Changed in MR13) Predicate test optimization for depth=1 predicates

When MR12 generated a test of a predicate which had depth 1, it would use the depth >1 routines, resulting in correct but inefficient behavior. In MR13, a bit test is used.

#168. (Changed in MR13) Token expressions in context guards

The token expressions appearing in context guards such as: