⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 2.doc.html

📁 java语言规范
💻 HTML
字号:
<html>
<head>
<title>The Java Language Specification Grammars</title>
</head>
<body BGCOLOR=#eeeeff text=#000000 LINK=#0000ff VLINK=#000077 ALINK=#ff0000>
 
<a href="index.html">Contents</a> | <a href="1.doc.html">Prev</a> | <a href="3.doc.html">Next</a> | <a href="j.index.doc1.html">Index</a>
<hr><br>
 
<a name="14911"></a>
<p><strong>
CHAPTER 2 </strong></p>
<a name="44271"></a>
<h1>Grammars</h1>
<hr><p>
<a name="90172"></a>
This chapter describes the context-free grammars used in this specification to 
define the lexical and syntactic structure of a Java program.
<p><a name="40415"></a>
<h2>2.1    Context-Free Grammars</h2>
<a name="40485"></a>
A <i>context-free </i><i>grammar</i> consists of a number of <i>productions</i>. Each production has 
an abstract symbol called a <i>nonterminal</i> as its <i>left-hand side</i>, and a sequence of 
one or more nonterminal and <i>terminal</i> symbols as its <i>right-hand side</i>. For each 
grammar, the terminal symbols are drawn from a specified <i>alphabet</i>.
<p><a name="40491"></a>
Starting from a sentence consisting of a single distinguished nonterminal, called the <i>goal symbol</i>, a given context-free grammar specifies a <i>language</i>, namely, the infinite set of possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.<p>
<a name="142375"></a>
<h2>2.2    The Lexical Grammar</h2>
<a name="142382"></a>
A <i>lexical grammar</i> for Java is given in <a href="3.doc.html#48198">&#167;3</a>. This grammar has as its terminal symbols
the characters of the Unicode character set. It defines a set of productions, 
starting from the goal symbol <i>Input </i><a href="3.doc.html#25687">(&#167;3.5)</a>,<i> </i>that describe how sequences of Unicode
characters <a href="3.doc.html#95413">(&#167;3.1)</a> are translated into a sequence of input elements <a href="3.doc.html#25687">(&#167;3.5)</a>.
<p><a name="140274"></a>
These input elements, with white space <a href="3.doc.html#95710">(&#167;3.6)</a> and comments <a href="3.doc.html#48125">(&#167;3.7)</a> discarded, form the terminal symbols for the syntactic grammar for Java and are called Java <i>tokens</i> <a href="3.doc.html#25687">(&#167;3.5)</a>. These tokens are the identifiers <a href="3.doc.html#40625">(&#167;3.8)</a>, keywords <a href="3.doc.html#229308">(&#167;3.9)</a>, literals <a href="3.doc.html#48272">(&#167;3.10)</a>, separators <a href="3.doc.html#230752">(&#167;3.11)</a>, and operators <a href="3.doc.html#230752">(&#167;3.11)</a> of the Java language.<p>
<a name="140845"></a>
<h2>2.3    The Syntactic Grammar</h2>
<a name="142461"></a>
The <i>syntactic grammar</i> for Java is given in Chapters <a href="4.doc.html#95843">4</a>, <a href="6.doc.html#48086">6</a>-<a href="10.doc.html#27803">10</a>, <a href="14.doc.html#44383">14</a>, and <a href="15.doc.html#4709">15</a>. This 
grammar has Java tokens defined by the lexical grammar as its terminal symbols. 
It defines a set of productions, starting from the goal symbol <i>CompilationUnit</i> 
<a href="7.doc.html#40031">(&#167;7.3)</a>, that describe how sequences of tokens can form syntactically correct Java 
programs.
<p><a name="15629"></a>
A LALR(1) version of the syntactic grammar is presented in Chapter <a href="19.doc.html#52994">19</a>. The grammar in the body of this specification is very similar to the LALR(1) grammar but more readable.<p>
<a name="90767"></a>
<h2>2.4    Grammar Notation</h2>
<a name="48048"></a>
Terminal symbols are shown in <code>fixed</code> <code>width</code> font in the productions of the lexical 
and syntactic grammars, and throughout this specification whenever the text is 
directly referring to such a terminal symbol. These are to appear in a program 
exactly as written. 
<p><a name="139619"></a>
Nonterminal symbols are shown in <i>italic</i> type. The definition of a nonterminal is introduced by the name of the nonterminal being defined followed by a colon. One or more alternative right-hand sides for the nonterminal then follow on succeeding lines. For example, the syntactic definition:<p>
<ul><pre>
<i>IfThenStatement:<br>
</i>	<code>if ( </code><i>Expression</i><code> )&#32;</code><i>Statement
</i></pre></ul><a name="48054"></a>
states that the nonterminal <i>IfThenStatement </i>represents the token <code>if</code>, followed by a 
left parenthesis token, followed by an <i>Expression</i>, followed by a right parenthesis 
token, followed by a <i>Statement</i>. As another example, the syntactic definition:
<p><ul><pre>
<i>ArgumentList:<br>
</i>	<i>Argument<br>
</i>	<i>ArgumentList</i><code> , </code><i>Argument
</i></pre></ul><a name="139922"></a>
states that an <i>ArgumentList</i> may represent either a single <i>Argument</i> or an 
<i>ArgumentList</i>, &#32;followed by a comma, followed by an <i>Argument</i>. This definition of 
<i>ArgumentList</i> is <i>recursive</i>, that is to say, it is defined in terms of itself. The result 
is that an <i>ArgumentList</i> may contain any positive number of arguments. Such 
recursive definitions of nonterminals are common.
<p><a name="149388"></a>
The subscripted suffix "<i>opt</i>", which may appear after a terminal or nonterminal, indicates an <i>optional symbol</i>. The alternative containing the optional symbol actually specifies two right-hand sides, one that omits the optional element and one that includes it. This means that:<p>
<ul><pre>
<i>BreakStatement:<br>
</i>	<code>break </code><i>Identifier</i><sub><i>opt</i></sub><code> ;
</code></pre></ul><a name="48059"></a>
is a convenient abbreviation for:
<p><ul><pre>
<i>BreakStatement:<br>
</i>	<code>break ;<br>
</code>	<code>break </code><i>Identifier</i><code> ;
</code></pre></ul><a name="48061"></a>
and that:
<p><ul><pre>
<i>ForStatement</i>:<br>
	<code>for ( </code><i>ForInit</i><sub><i>opt</i></sub><code> ; </code><i>Expression</i><sub><i>opt</i></sub><code> ; </code><i>ForUpdate</i><sub><i>opt</i></sub><code> ) </code><i>Statement
</i></pre></ul><a name="44856"></a>
is a convenient abbreviation for:
<p><ul><pre>
<i>ForStatement</i>:<br>
<code>	for ( ; </code><i>Expression</i><sub><i>opt</i></sub><code> ; </code><i>ForUpdate</i><sub><i>opt</i></sub><code> ) </code><i>Statement<br>
</i><code>	for ( </code><i>ForInit</i><code> ; </code><i>Expression</i><sub><i>opt</i></sub><code> ; </code><i>ForUpdate</i><sub><i>opt</i></sub><code> ) </code><i>Statement
</i></pre></ul><a name="49500"></a>
which in turn is an abbreviation for:
<p><ul><pre>
<i>ForStatement</i>:<br>
<code>	for ( ; ; </code><i>ForUpdate</i><sub><i>opt</i></sub><code> ) </code><i>Statement<br>
</i><code>	for ( ; </code><i>Expression</i><code> ; </code><i>ForUpdate</i><sub><i>opt</i></sub><code> ) </code><i>Statement<br>
</i><code>	for ( </code><i>ForInit</i><code> ; ; </code><i>ForUpdate</i><sub><i>opt</i></sub><code> ) </code><i>Statement<br>
</i><code>	for ( </code><i>ForInit</i><code> ; </code><i>Expression</i><code> ; </code><i>ForUpdate</i><sub><i>opt</i></sub><code> ) </code><i>Statement
</i></pre></ul><a name="49529"></a>
which in turn is an abbreviation for:
<p><ul><pre>
<i>ForStatement</i>:<br>
<code>	for ( ; ; ) </code><i>Statement<br>
</i><code>	for ( ; ; </code><i>ForUpdate</i><code> ) </code><i>Statement<br>
</i><code>	for ( ; </code><i>Expression</i><code> ; ) </code><i>Statement<br>
</i><code>	for ( ; </code><i>Expression</i><code> ; </code><i>ForUpdate</i><code> ) </code><i>Statement<br>
</i>	<code>for ( </code><i>ForInit</i><code> ; ; ) </code><i>Statement<br>
</i><code>	for ( </code><i>ForInit</i><code> ; ; </code><i>ForUpdate</i><code> ) </code><i>Statement<br>
</i><code>	for ( </code><i>ForInit</i><code> ; </code><i>Expression</i><code> ; ) </code><i>Statement<br>
</i><code>	for ( </code><i>ForInit</i><code> ; </code><i>Expression</i><code> ; </code><i>ForUpdate</i><code> ) </code><i>Statement
</i></pre></ul><a name="48067"></a>
so the nonterminal <i>ForStatement</i> actually has eight alternative right-hand sides.
<p><a name="27547"></a>
A very long right-hand side may be continued on a second line by substantially indenting this second line, as in:<p>
<ul><pre>
<i>ConstructorDeclaration</i>:<br>
<i>	ConstructorModifiers</i><sub><i>opt</i></sub><code>&#32;</code><i>ConstructorDeclarator<br>
</i>		<i>Throws</i><sub><i>opt</i></sub><code>&#32;</code><i>ConstructorBody
</i></pre></ul><a name="27550"></a>
which defines one right-hand side for the nonterminal <i>ConstructorDeclaration</i>. 
(This right-hand side is an abbreviation for four alternative right-hand sides, 
because of the two occurrences of "<sub><i>opt</i></sub>".)
<p><a name="139949"></a>
When the words "one of" follow the colon in a grammar definition, they signify that each of the terminal symbols on the following line or lines is an alternative definition. For example, the lexical grammar for Java contains the production:<p>
<ul><pre>
<i>ZeroToThree: one of<br>
</i>	<code>0 1 2 3
</code></pre></ul><a name="46646"></a>
which is merely a convenient abbreviation for:
<p><ul><pre>
<i>ZeroToThree:<br>
</i>	<code>0<br>
</code>	<code>1<br>
</code>	<code>2<br>
</code>	<code>3
</code></pre></ul><a name="46654"></a>
When an alternative in a lexical production appears to be a token, it represents the sequence of characters that would make up such a token. Thus, the definition:<p>
<ul><pre>
<i>BooleanLiteral: one of<br>
</i><code>	true&#32;false
</code></pre></ul><a name="7071"></a>
in a lexical grammar production is shorthand for:
<p><ul><pre>
<i>BooleanLiteral:<br>
</i><code>	t r u e<br>
	f a l s e
</code></pre></ul><a name="19987"></a>
The right-hand side of a lexical production may specify that certain expansions are not permitted by using the phrase "but not" and then indicating the expansions to be excluded, as in the productions for <i>InputCharacter</i> <a href="3.doc.html#231571">(&#167;3.4)</a> and <i>Identifier</i> <a href="3.doc.html#40625">(&#167;3.8)</a>:<p>
<ul><pre>
<i>InputCharacter:<br>
</i>	<i>UnicodeInputCharacter</i> but not CR or LF

<i>Identifier:<br>
</i>	<i>IdentifierName</i> but not a <i>Keyword</i> or <i>BooleanLiteral</i> or <i>NullLiteral
</i></pre></ul><a name="23502"></a>
Finally, a few nonterminal symbols are described by a descriptive phrase in roman type in cases where it would be impractical to list all the alternatives:<p>
<ul><pre>
<i>RawInputCharacter</i>:<br>
	any Unicode character
</pre></ul>

<hr>
<!-- This inserts footnotes--><p>
<a href="index.html">Contents</a> | <a href="1.doc.html">Prev</a> | <a href="3.doc.html">Next</a> | <a href="j.index.doc1.html">Index</a>
<p>
<font size=-1>Java Language Specification (HTML generated by Suzette Pelouch on February 24, 1998)<br>
<i><a href="jcopyright.doc.html">Copyright &#169 1996 Sun Microsystems, Inc.</a>
All rights reserved</i>
<br>
Please send any comments or corrections to <a href="mailto:doug.kramer@sun.com">doug.kramer@sun.com</a>
</font>
</body></html>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -