<TD ALIGN=LEFT VALIGN=BASELINE>"LOOKAHEAD" "(" [ <EM>java_integer_literal</EM> ] [ "," ] [ <A HREF="#prod16">expansion_choices</A> ] [ "," ] [ "{" <EM>java_expression</EM> "}" ] ")"</TD></TR></TABLE>
<P>A local lookahead specification is used to influence the way the generated parser makes choices at the various <A HREF="lookahead.html">choice points</A> in the grammar. A local lookahead specification starts with the reserved word "LOOKAHEAD" followed by a set of lookahead constraints within parentheses. There are three different kinds of lookahead constraints: a lookahead limit (the integer literal), a syntactic lookahead (the expansion choices), and a semantic lookahead (the expression within braces). At least one lookahead constraint must be present. If more than one lookahead constraint is present, they must be separated by commas.
<P>For a detailed description of how lookahead works, please <A HREF="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</A>. A brief description of each kind of lookahead constraint is given below:
<P><UL>
<LI><STRONG>Lookahead Limit:</STRONG> This is the maximum number of tokens of lookahead that may be used for choice determination purposes. This overrides the default value which is specified by the <A HREF="#prod2">LOOKAHEAD option</A>. This lookahead limit applies only to the <A HREF="lookahead.html">choice point</A> at the location of the local lookahead specification. If the local lookahead specification is not at a choice point, the lookahead limit (if any) is ignored.
<P><LI><STRONG>Syntactic Lookahead:</STRONG> This is an expansion (or expansion choices) that is used for the purpose of determining whether or not the particular choice that this local lookahead specification applies to is to be taken.
If this is not provided, the parser uses the expansion to be selected during lookahead determination. If the local lookahead specification is not at a <A HREF="lookahead.html">choice point</A>, the syntactic lookahead (if any) is ignored.
<P><LI><STRONG>Semantic Lookahead:</STRONG> This is a boolean expression that is evaluated whenever the parser crosses this point during parsing. If the expression evaluates to true, the parsing continues normally. If the expression evaluates to false and the local lookahead specification is at a <A HREF="lookahead.html">choice point</A>, the current choice is not taken and the next choice is considered. If the expression evaluates to false and the local lookahead specification is <EM>not</EM> at a choice point, then parsing aborts with a parse error. Unlike the other two lookahead constraints that are ignored at non-choice points, semantic lookahead is always evaluated. In fact, semantic lookahead is even evaluated if it is encountered during the evaluation of some other syntactic lookahead check (for more details, <A HREF="lookahead.html">click here to visit the minitutorial on LOOKAHEAD</A>).
</UL>
<P><STRONG>Default values for lookahead constraints:</STRONG> If a local lookahead specification has been provided, but not all lookahead constraints have been included, then the missing ones are assigned default values as follows:
<P><UL>
<LI>If the lookahead limit is not provided and if the syntactic lookahead is provided, then the lookahead limit defaults to the largest integer value (2147483647). This essentially implements "infinite lookahead": look ahead as many tokens as necessary to match the syntactic lookahead that has been provided.
<P><LI>If neither the lookahead limit nor the syntactic lookahead has been provided (which means the semantic lookahead is provided), the lookahead limit defaults to 0.
This means that syntactic lookahead is not performed (it passes trivially), and only semantic lookahead is performed.
<P><LI>If the syntactic lookahead is not provided, it defaults to the choice to which the local lookahead specification applies. If the local lookahead specification is not at a choice point, then the syntactic lookahead is ignored, so a default value is not relevant.
<P><LI>If the semantic lookahead is not provided, it defaults to the boolean expression "true". That is, it trivially passes.
</UL>
<P><HR><P>
<TABLE>
<TR><TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod19">regular_expression</A></TD><TD ALIGN=CENTER VALIGN=BASELINE>::=</TD><TD ALIGN=LEFT VALIGN=BASELINE><EM>java_string_literal</EM></TD></TR>
<TR><TD ALIGN=RIGHT VALIGN=BASELINE></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD><TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" [ [ "#" ] <EM>java_identifier</EM> ":" ] <A HREF="#prod29">complex_regular_expression_choices</A> "&gt;"</TD></TR>
<TR><TD ALIGN=RIGHT VALIGN=BASELINE></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD><TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" <EM>java_identifier</EM> "&gt;"</TD></TR>
<TR><TD ALIGN=RIGHT VALIGN=BASELINE></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD><TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" "EOF" "&gt;"</TD></TR>
</TABLE>
<P>There are two places in a grammar file where regular expressions may be written:
<UL>
<LI>Within a <A HREF="#prod18">regular expression specification</A> (part of a <A HREF="#prod10">regular expression production</A>),
<P><LI>As an <A HREF="#prod22">expansion unit</A> within an <A HREF="#prod20">expansion</A>. When a regular expression is used in this manner, it is as if the regular expression were defined in the following manner at this location and then referred to by its label from the expansion unit:
<P><PRE>
&lt;DEFAULT&gt; TOKEN : { regular expression }
</PRE>
<P>That is, this usage of regular expression can be rewritten using the other kind of usage.
</UL>
<P>The complete details of regular expression matching by the token manager are available in <A
HREF="tokenmanager.html">the minitutorial on the token manager</A>. The description of the syntactic constructs follows.
<P>The first kind of regular expression is a string literal. The input being parsed matches this regular expression if the token manager is in a <A HREF="#prod10">lexical state</A> for which this regular expression applies and the next set of characters in the input stream is the same (possibly with case ignored) as this string literal.
<P>A regular expression may also be a more <A HREF="#prod29">complex regular expression</A>, with which more involved regular expressions (than string literals) can be defined. Such a regular expression is placed within angle brackets "&lt;...&gt;", and may optionally be labeled with an identifier. This label may be used to refer to this regular expression from <A HREF="#prod22">expansion units</A> or from within other regular expressions. If the label is preceded by a "#", then this regular expression may not be referred to from expansion units, but only from within other regular expressions. When the "#" is present, the regular expression is referred to as a "private regular expression".
<P>A regular expression may be a reference to some other labeled regular expression, in which case it is written as the label enclosed in angle brackets "&lt;...&gt;".
<P>Finally, a regular expression may be a reference to the predefined regular expression "&lt;EOF&gt;", which is matched by the end of file.
<P>Private regular expressions are not matched as tokens by the token manager. Their purpose is solely to facilitate the definition of other more complex regular expressions.
<P>Consider the following example defining Java floating point literals:
<P><PRE>
TOKEN :
{
  &lt; FLOATING_POINT_LITERAL:
        (["0"-"9"])+ "." (["0"-"9"])* (&lt;EXPONENT&gt;)? (["f","F","d","D"])?
      | "." (["0"-"9"])+ (&lt;EXPONENT&gt;)? (["f","F","d","D"])?
      | (["0"-"9"])+ &lt;EXPONENT&gt; (["f","F","d","D"])?
      | (["0"-"9"])+ (&lt;EXPONENT&gt;)? ["f","F","d","D"]
  &gt;
| &lt; #EXPONENT: ["e","E"] (["+","-"])?
(["0"-"9"])+ &gt;
}</PRE>
<P>In this example, the token FLOATING_POINT_LITERAL is defined using the definition of another token, namely, EXPONENT. The "#" before the label EXPONENT indicates that this exists solely for the purpose of defining other tokens (FLOATING_POINT_LITERAL in this case). The definition of FLOATING_POINT_LITERAL is not affected by the presence or absence of the "#". However, the token manager's behavior is. If the "#" is omitted, the token manager will erroneously recognize a string like E123 as a legal token of kind EXPONENT (instead of IDENTIFIER in the Java grammar).
<P><HR><P>
<TABLE><TR><TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod29">complex_regular_expression_choices</A></TD><TD ALIGN=CENTER VALIGN=BASELINE>::=</TD><TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod30">complex_regular_expression</A> ( "|" <A HREF="#prod30">complex_regular_expression</A> )*</TD></TR></TABLE>
<P>Complex regular expression choices are made up of a list of one or more <A HREF="#prod30">complex regular expressions</A> separated by "|"s. A match for a complex regular expression choice is a match of any of its constituent complex regular expressions.
<P><HR><P>
<TABLE><TR><TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod30">complex_regular_expression</A></TD><TD ALIGN=CENTER VALIGN=BASELINE>::=</TD><TD ALIGN=LEFT VALIGN=BASELINE>( <A HREF="#prod31">complex_regular_expression_unit</A> )*</TD></TR></TABLE>
<P>A complex regular expression is a sequence of complex regular expression units. A match for a complex regular expression is a concatenation of matches to the complex regular expression units.
<P><HR><P>
<TABLE>
<TR><TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod31">complex_regular_expression_unit</A></TD><TD ALIGN=CENTER VALIGN=BASELINE>::=</TD><TD ALIGN=LEFT VALIGN=BASELINE><EM>java_string_literal</EM></TD></TR>
<TR><TD ALIGN=RIGHT VALIGN=BASELINE></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD><TD ALIGN=LEFT VALIGN=BASELINE>"&lt;" <EM>java_identifier</EM> "&gt;"</TD></TR>
<TR><TD ALIGN=RIGHT
VALIGN=BASELINE></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD><TD ALIGN=LEFT VALIGN=BASELINE><A HREF="#prod32">character_list</A></TD></TR>
<TR><TD ALIGN=RIGHT VALIGN=BASELINE></TD><TD ALIGN=CENTER VALIGN=BASELINE>|</TD><TD ALIGN=LEFT VALIGN=BASELINE>"(" <A HREF="#prod29">complex_regular_expression_choices</A> ")" [ "+" | "*" | "?" ]</TD></TR>
</TABLE>
<P>A complex regular expression unit can be a string literal, in which case there is exactly one match for this unit, namely, the string literal itself.
<P>A complex regular expression unit can be a reference to another regular expression. The other regular expression has to be labeled so that it can be referenced. The matches of this unit are all the matches of this other regular expression. Such references in regular expressions cannot introduce loops in the dependency between tokens.
<P>A complex regular expression unit can be a <A HREF="#prod32">character list</A>. A character list is a way of defining a set of characters. A match for this kind of complex regular expression unit is any character that is allowed by the character list.
<P>A complex regular expression unit can be a parenthesized set of complex regular expression choices. In this case, a legal match of the unit is any legal match of the nested choices. The parenthesized set of choices can be suffixed (optionally) by:
<UL>
<LI><STRONG>"+":</STRONG> Then any legal match of the unit is one or more repetitions of a legal match of the parenthesized set of choices.
<LI><STRONG>"*":</STRONG> Then any legal match of the unit is zero or more repetitions of a legal match of the parenthesized set of choices.
<LI><STRONG>"?":</STRONG> Then a legal match of the unit is either the empty string or any legal match of the nested choices.
</UL>
Note that unlike the BNF <A HREF="#prod20">expansions</A>, the regular expression "[...]" is not equivalent to the regular expression "(...)?".
This is because the [...] construct is used to describe <A HREF="#prod32">character lists</A> in regular expressions.
<P><HR><P>
<TABLE><TR><TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod32">character_list</A></TD><TD ALIGN=CENTER VALIGN=BASELINE>::=</TD><TD ALIGN=LEFT VALIGN=BASELINE>[ "~" ] "[" [ <A HREF="#prod33">character_descriptor</A> ( "," <A HREF="#prod33">character_descriptor</A> )* ] "]"</TD></TR></TABLE>
<P>A character list describes a set of characters. A legal match for a character list is any character in this set. A character list is a list of character descriptors separated by commas within square brackets. Each character descriptor describes a single character or a range of characters (see <A HREF="#prod33">character descriptor</A> below), and this is added to the set of characters of the character list. If the character list is prefixed by the "~" symbol, the set of characters it represents is any UNICODE character not in the specified set.
<P><HR><P>
<TABLE><TR><TD ALIGN=RIGHT VALIGN=BASELINE><A NAME="prod33">character_descriptor</A></TD><TD ALIGN=CENTER VALIGN=BASELINE>::=</TD><TD ALIGN=LEFT VALIGN=BASELINE><EM>java_string_literal</EM> [ "-" <EM>java_string_literal</EM> ]</TD></TR></TABLE>
<P>A character descriptor can be a single character string literal, in which case it describes a singleton set containing that character; or it is two single character string literals separated by a "-", in which case, it describes the set of all characters in the range between and including these two characters.
<P></BODY></HTML>