<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>VM Spec Compiling for the Java Virtual Machine</title>
</head>
<body BGCOLOR=#eeeeff text=#000000 LINK=#0000ff VLINK=#000077 ALINK=#ff0000>
<table width=100%><tr>
<td><a href="VMSpecTOC.doc.html">Contents</a> | <a href="Instructions2.doc15.html">Prev</a> | <a href="Threads.doc.html">Next</a> | <a href="Lindholm.INDEX.html">Index</a></td><td align=right><i><i>The Java<sup><font size=-2>TM</font></sup> Virtual Machine Specification</i></i></td>
</tr></table>
<hr><br>
<a name="2839"></a>
<p><strong>CHAPTER 7 </strong></p>
<a name="2989"></a>
<h1>Compiling for the Java Virtual Machine</h1>
<hr><p>
<a name="6043"></a>
<p>
<a name="4028"></a>
The Java Virtual Machine is designed to support the Java programming language.
Sun's JDK 1.0.2 release of the Java programming language contains both a compiler
from Java source code to the Java Virtual Machine's instruction set (<code>javac</code>) and a
runtime system that implements the Java Virtual Machine itself (<code>java</code>). Understanding how one Java compiler utilizes the Java Virtual Machine is useful to the prospective Java compiler writer, as well as to one trying to understand the operation of the
Java Virtual Machine.
<p><a name="4054"></a>
Although this chapter concentrates on compiling Java code, the Java Virtual Machine does not assume that the instructions it executes were generated from Java source code. While there have been a number of efforts aimed at compiling other languages to the Java Virtual Machine, version 1.0.2 of the Java Virtual Machine was not designed to support a wide range of languages. Some languages may be hosted fairly directly by the Java Virtual Machine. Others may support constructs that can be implemented only inefficiently. <p>
<a name="9588"></a>
We are considering bounded extensions to future versions of the Java Virtual Machine to support a wider range of languages more directly. Please contact us at <code>jvm@javasoft.com</code> if you have interest in this effort.<p>
<a name="11072"></a>
Note that the term "compiler" is sometimes used when referring to a translator from the instruction set of a Java Virtual Machine to the instruction set of a specific CPU. One example of such a translator is a "Just In Time" (JIT) code generator, which generates platform-specific instructions only after Java Virtual Machine code has been loaded into the Java Virtual Machine. This chapter does not address issues associated with code generation, only those associated with compiling from Java source code to Java Virtual Machine instructions.<p>
<a name="4083"></a>
<hr><h2>7.1 Format of Examples</h2>
<a name="11107"></a>
This chapter consists mainly of examples of Java source code together with annotated listings of the Java Virtual Machine code that the <code>javac</code> compiler in Sun's JDK
1.0.2 release generates for the examples. The Java Virtual Machine code is written
in the informal "virtual machine assembly language" output by Sun's <code>javap</code> utility,
also distributed with the JDK. You can use <code>javap</code> to generate additional examples of
compiled Java methods.
<p><a name="10062"></a>
The format of the examples should be familiar to anyone who has read assembly code. Each instruction takes the form<p>
<pre><br><a name="4190"></a> <i>&lt;index&gt; &lt;opcode&gt;</i> [<i>&lt;operand1&gt; </i>[<i>&lt;operand2&gt;...</i>]]<i> </i>[<i>&lt;comment&gt;]
</i></pre><a name="6265"></a>
The <i>&lt;index&gt;</i> is the index of the opcode of the instruction in the array that contains
the bytes of Java Virtual Machine code for this method. Alternatively, the <i>&lt;index&gt;</i>
may be thought of as a byte offset from the beginning of the method. The <i>&lt;opcode&gt;</i>
is the mnemonic for the instruction's opcode, and the zero or more <i>&lt;operandN&gt;</i> are
the operands of the instruction. The optional <i>&lt;comment&gt;</i> is given in Java-style
end-of-line comment syntax:<p><Table Border="0">
<tr><td> <i>8
</i><br><td> <i>bipush 100
</i><br><td><i>// Push constant </i><code>100
</code>
</Table><br><br>
<p><a name="8694"></a>
Some of the material in the comments is emitted by <code>javap</code>; the rest is supplied by the
authors. The <i>&lt;index&gt;</i> prefacing each instruction may be used as the target of a control transfer instruction. For instance, a <i>goto 8</i> instruction transfers control to the
instruction at index 8. Note that the actual operands of Java Virtual Machine control
transfer instructions are offsets from the addresses of the opcodes of those instructions; these operands are displayed by <code>javap</code>, and are shown in this chapter, as more
easily read offsets into their methods.
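<p>
The relationship between the stored offset and the displayed index can be sketched as follows. This is a minimal illustration, not part of <code>javap</code>: the class and method names are hypothetical, and it assumes a branch instruction whose operand is a signed 16-bit big-endian offset, as is the case for <i>goto</i>.

```java
// Sketch: turning a stored branch offset into the method index shown in a listing.
// BranchTargets and target() are illustrative names, not part of any JDK tool.
final class BranchTargets {
    // Reads the signed 16-bit big-endian offset that follows the opcode at
    // opcodeIndex and returns the index of the instruction branched to.
    static int target(byte[] code, int opcodeIndex) {
        int offset = (short) (((code[opcodeIndex + 1] & 0xff) << 8)
                             | (code[opcodeIndex + 2] & 0xff));
        // The offset is relative to the address of the branch opcode itself.
        return opcodeIndex + offset;
    }

    public static void main(String[] args) {
        // Hypothetical fragment: two nop bytes, then goto (0xa7) with offset +6.
        byte[] code = {0x00, 0x00, (byte) 0xa7, 0x00, 0x06};
        System.out.println("goto " + target(code, 2)); // prints "goto 8"
    }
}
```

A backward branch works the same way with a negative offset, which is why the offset must be read as a signed value.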
<p><a name="14270"></a>
We preface an operand representing a constant pool index with a hash sign, and follow the instruction by a comment identifying the constant pool item referenced, as in<p><Table Border="0">
<tr><td> <i>10
</i><br><td> <i>ldc #1
</i><br><td><i>// Float </i><code>100.</code><i>000000
</i>
</Table><br><br><p>
<a name="14286"></a>
<p>
<a name="14330"></a>
or<p><Table Border="0">
<tr><td> <i>9
</i><br><td> <i>invokevirtual
#4
</i><br><td><i>// Method Example</i><code>.addTwo(II)I
</code>
</Table><br><br>
<p><a name="14332"></a>
For the purposes of this chapter, we do not worry about specifying details such as
operand sizes.
<p><a name="4182"></a>
<hr><h2>7.2 Use of Constants, Local Variables, and Control Constructs</h2>
<a name="7213"></a>
Java Virtual Machine code exhibits a set of general characteristics imposed by the
Java Virtual Machine's design and use of types. In the first example we encounter
many of these, and we consider them in some detail.
<p><a name="4154"></a>
The <code>spin</code> method simply spins around an empty <code>for</code> loop 100 times:<p>
<pre><br><a name="4115"></a> <code>void spin() {
</code></pre><pre> <code> int i;
</code> <code> for (i = 0; i < 100; i++) {
;</code><code> // Loop body is empty</code>
<code> }
</code><a name="4118"></a> <code>}
</code><br></pre><a name="4112"></a>
The Java compiler compiles <code>spin</code> to<p>
<a name="14050"></a>
<i>Method </i><code>void</code> <code>spin()<p><Table Border="0">
<tr><td> 0
<br><td> <code>iconst_0
</code><br><td><i>// Push </i><code>int</code><i> constant </i><code>0
</code>
<tr><td> <i> 1
</i><br><td> <code>istore_1
</code><br><td><i>// Store into local 1 (</i><code>i</code>=<code>0</code><i>)
</i>
<tr><td> <i> 2
</i><br><td> <code>goto 8
</code><br><td><i>// First time through don't increment
</i>
<tr><td> <i> 5
</i><br><td> <code>iinc 1 1
</code><br><td><i>// Increment local 1 by 1 (</i><code>i++</code><i>)
</i>
<tr><td> <i> 8
</i><br><td> <code>iload_1
</code><br><td><i>// Push local 1 (</i><code>i</code><i>)
</i>
<tr><td> <i> 9
</i><br><td> <code>bipush 100
</code><br><td><i>// Push </i><code>int</code><i> constant (</i><code>100</code><i>)
</i>
<tr><td> <i> 11
</i><br><td> <code>if_icmplt 5
</code><br><td><i>// Compare, loop if </i>&lt;<i> (</i><code>i </code>&lt; <code>100</code><i>)
</i>
<tr><td> <i> 14
</i><br><td> <code>return
</code><br><td><i>// Return </i><code>void</code><i> when done
</i>
</Table><br><br></code><p>
<a name="14059"></a>
The Java Virtual Machine is stack-oriented, with most operations taking one or more operands from the operand stack of the Java Virtual Machine's current frame, or pushing results back onto the operand stack. A new frame is created each time a Java method is invoked, and with it is created a new operand stack and set of local variables for use by that method (see <a href="Overview.doc.html#17257">Section 3.6, "Frames"</a>). At any one point of the computation, there are thus likely to be many frames and equally many operand stacks per thread of control, corresponding to many nested method invocations. Only the operand stack in the current frame is active. <p>
<a name="4169"></a>
The instruction set of the Java Virtual Machine distinguishes operand types by using distinct bytecodes for operations on its various data types. The method <code>spin</code> only operates on values of type <code>int</code>. The instructions in its compiled code chosen to operate on typed data (<i>iconst_0</i>, <i>istore_1</i>, <i>iinc</i>, <i>iload_1</i>, <i>if_icmplt</i>) are all specialized for type <code>int</code>.<p>
<a name="4172"></a>
The two constants in <code>spin</code>, <code>0</code> and <code>100</code>, are pushed onto the operand stack using two different instructions. The <code>0</code> is pushed using an <i>iconst_0</i> instruction, one of the family of <i>iconst_&lt;i&gt;</i> instructions. The <code>100</code> is pushed using a <i>bipush</i> instruction, which fetches the value it pushes as an immediate operand.<p>
<a name="4926"></a>
The Java Virtual Machine frequently takes advantage of the likelihood of certain operands (<code>int</code> constants -<i>1</i>, <i>0</i>, <i>1</i>, <i>2</i>, <i>3</i>, <i>4</i> and <i>5</i> in the case of the <i>iconst_&lt;i&gt;</i> instructions) by making those operands implicit in the opcode. Because the <i>iconst_0</i> instruction knows it is going to push an <code>int</code><i> </i><code>0</code>, <i>iconst_0</i> does not need to store an operand to tell it what value to push, nor does it need to fetch or decode an operand. Compiling the push of <code>0</code> as <i>bipush 0</i> would have been correct, but would have made the compiled code for <code>spin</code> one byte longer. A simple virtual machine would also have spent additional time fetching and decoding the explicit operand each time around the loop. Use of implicit operands makes compiled code more compact and efficient. <p>
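<p>
A compiler might choose among these instructions roughly as sketched below. This is an illustration under stated assumptions, not the logic of <code>javac</code>: the class and method names are hypothetical, the <i>sipush</i> case covers two-byte immediates, and the constant pool index that <i>ldc</i> would need is left out.

```java
// Sketch: choosing a compact instruction for pushing an int constant.
// PushConstant and select() are illustrative names only.
final class PushConstant {
    // Returns a javap-style mnemonic for pushing the given int value.
    static String select(int value) {
        if (value == -1) {
            return "iconst_m1";           // 1 byte, operand implicit in the opcode
        }
        if (value >= 0 && value <= 5) {
            return "iconst_" + value;     // 1 byte, operand implicit in the opcode
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush " + value;     // 2 bytes: opcode plus one immediate byte
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush " + value;     // 3 bytes: opcode plus two immediate bytes
        }
        return "ldc";                     // value lives in the constant pool; index omitted
    }

    public static void main(String[] args) {
        System.out.println(select(0));    // prints "iconst_0"
        System.out.println(select(100));  // prints "bipush 100"
    }
}
```

Checking the cheapest encodings first yields the shortest code for the two constants used in <code>spin</code>.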
<a name="4163"></a>
The <code>int</code><i> </i><code>i</code> in <code>spin</code> is stored as Java Virtual Machine local variable <i>1</i>. Because most Java Virtual Machine instructions operate on values popped from the operand stack rather than directly on local variables, instructions that transfer values between local variables and the operand stack are common in code compiled for the Java Virtual Machine. These operations also have special support in the instruction set. In <code>spin</code>, values are transferred to and from local variables using the <i>istore_1</i> and <i>iload_1</i> instructions, each of which implicitly operates on local variable <i>1</i>. The <i>istore_1</i> instruction pops an <code>int</code> from the operand stack and stores it in local variable <i>1</i>. The <i>iload_1</i> instruction pushes the value in local variable <i>1</i> onto the operand stack.<p>
<a name="4941"></a>
The use (and reuse) of local variables is the responsibility of the compiler writer. The specialized load and store instructions should encourage the compiler writer to reuse local variables as much as is feasible. The resulting code is faster, more compact, and uses less space in the Java frame.<p>
<a name="4179"></a>
Certain very frequent operations on local variables are catered to specially by the Java Virtual Machine. The <i>iinc</i> instruction increments the contents of a local variable by a one-byte signed value. The <i>iinc</i> instruction in <code>spin</code> increments local variable <i>1</i> (its first operand) by <i>1</i> (its second operand). The <i>iinc</i> instruction is very handy when implementing looping constructs. <p>
<a name="12225"></a>
The <code>for</code> loop of <code>spin</code> is accomplished mainly by these instructions:<p><Table Border="0">
<tr><td> <i> 5
</i><br><td> <i>iinc 1 1
</i><br><td><i>// Increment local 1 by 1 (i++)
</i>
<tr><td> <i> 8
</i><br><td> <i>iload_1
</i><br><td><i>// Push local 1 (i)
</i>
<tr><td> <i> 9
</i><br><td> <i>bipush 100
</i><br><td><i>// Push int constant (100)
</i>
<tr><td> <i> 11
</i><br><td> <i>if_icmplt 5
</i><br><td><i>// Compare, loop if < (i < 100)
</i>
</Table><br><br><p>
<a name="4207"></a>
The <i>bipush</i> instruction pushes the value <i>100</i> onto the operand stack as an <code>int</code>, then
the <i>if_icmplt</i> instruction pops both that value and the value of <code>i</code> pushed by <i>iload_1</i>
off the stack and compares them. If the comparison succeeds (the Java variable <code>i</code> is less
than <code>100</code>), control is transferred to index <i>5</i> and the next iteration of the <code>for</code> loop
begins. Otherwise, control passes to the instruction following the <i>if_icmplt</i>.
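<p>
The flow of control through the compiled <code>spin</code> method can be traced with a toy interpreter. The sketch below is an assumption-laden illustration, not how any real Java Virtual Machine is implemented: it handles only the eight opcodes in the listing, uses a fixed two-entry operand stack, and returns the final value of local variable 1 so that the result can be inspected. The class and method names are hypothetical.

```java
// Sketch: a toy interpreter for the bytes of the compiled spin method.
final class SpinInterpreter {
    // The spin listing as raw bytes, annotated with listing indices.
    static final byte[] SPIN = {
        0x03,                                   //  0 iconst_0
        0x3c,                                   //  1 istore_1
        (byte) 0xa7, 0x00, 0x06,                //  2 goto 8       (offset +6)
        (byte) 0x84, 0x01, 0x01,                //  5 iinc 1 1
        0x1b,                                   //  8 iload_1
        0x10, 100,                              //  9 bipush 100
        (byte) 0xa1, (byte) 0xff, (byte) 0xfa,  // 11 if_icmplt 5  (offset -6)
        (byte) 0xb1                             // 14 return
    };

    // Signed 16-bit offset stored after the branch opcode at pc.
    static int branchOffset(byte[] code, int pc) {
        return (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
    }

    // Runs the code and returns local variable 1 when "return" is reached.
    static int run(byte[] code) {
        int[] locals = new int[2];   // local 0 is unused here, local 1 holds i
        int[] stack = new int[2];    // operand stack of the single frame
        int sp = 0;
        int pc = 0;
        while (true) {
            int op = code[pc] & 0xff;
            switch (op) {
                case 0x03: stack[sp++] = 0; pc += 1; break;            // iconst_0
                case 0x3c: locals[1] = stack[--sp]; pc += 1; break;    // istore_1
                case 0xa7: pc += branchOffset(code, pc); break;        // goto
                case 0x84:                                             // iinc
                    locals[code[pc + 1] & 0xff] += code[pc + 2];
                    pc += 3;
                    break;
                case 0x1b: stack[sp++] = locals[1]; pc += 1; break;    // iload_1
                case 0x10: stack[sp++] = code[pc + 1]; pc += 2; break; // bipush
                case 0xa1: {                                           // if_icmplt
                    int value2 = stack[--sp];
                    int value1 = stack[--sp];
                    pc += (value1 < value2) ? branchOffset(code, pc) : 3;
                    break;
                }
                case 0xb1: return locals[1];                           // return
                default: throw new IllegalStateException("unsupported opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(SPIN)); // prints 100
    }
}
```

The loop ends when local variable 1 reaches 100, at which point <i>if_icmplt</i> falls through to the <i>return</i> at index 14.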
<p><a name="24512"></a>
If the <code>spin</code> example had used a data type other than <code>int</code> for the loop counter, the compiled code would necessarily change to reflect the different data type. For instance, if instead of an <code>int</code> the <code>spin</code> example uses a <code>double</code>:<p>
<pre><br><a name="24513"></a> <code>void dspin() {
</code></pre><pre> <code> double i;
</code> <code> for (i = 0.0; i < 100.0; i++) {
</code> <code>              ;      // Loop body is empty
</code> <code> }
</code><a name="24518"></a> <code>}
</code><br></pre><a name="24519"></a>
the compiled code is
<p><a name="6881"></a>
<i>Method </i><code>void</code> <code>dspin()<p><Table Border="0">
<tr><td> <i> 0
</i><br><td> <i>dconst_0
</i><br><td><i>// Push double constant 0.0
</i>
<tr><td> <i> 1
</i><br><td> <i>dstore_1
</i><br><td><i>// Store into locals 1 and 2 (i = 0.0)
</i>
<tr><td> <i> 2
</i><br><td> <i>goto 9
</i><br><td><i>// First time through don't increment
</i>
<tr><td> <i> 5
</i><br><td> <i>dload_1
</i><br><td><i>// Push double onto operand stack
</i>
<tr><td> <i> 6
</i><br><td> <i>dconst_1
</i><br><td><i>// Push double constant 1 onto stack
</i>
<tr><td> <i> 7
</i><br><td> <i>dadd
</i><br><td><i>// Add; there is no dinc instruction
</i>
<tr><td> <i> 8
</i><br><td> <i>dstore_1
</i><br><td><i>// Store result in locals 1 and 2
</i>
<tr><td> <i> 9
</i><br><td> <i>dload_1
</i><br><td><i>// Push local
</i>
<tr><td> <i> 10
</i><br><td> <i>ldc2_w #4
</i><br><td><i>// Double 100.000000
</i>
<tr><td> <i> 13
</i><br><td> <i>dcmpg
</i><br><td><i>// There is no if_dcmplt instruction
</i>
<tr><td> <i> 14
</i><br><td> <i>iflt 5
</i><br><td><i>// Compare, loop if < (i < 100.000000)
</i>
<tr><td> <i> 17
</i><br><td> <i>return
</i><br><td><i>// Return void when done
</i>
</Table><br><br></code><p>
<a name="4181"></a>
The instructions that operate on typed data are now specialized for type <code>double</code>. (The
<i>ldc2_w</i> instruction will be discussed later in this chapter.)
<p><a name="24112"></a>
Note that in <code>dspin</code>, <code>double</code> values use two words of storage, whether on the operand stack or in local variables. This is also the case for values of type <code>long</code>. As another example:<p>
<pre><br><a name="24113"></a> <code>double doubleLocals(double d1, double d2) {
</code></pre><pre> <code> return d1 + d2;
</code><a name="24115"></a> <code>}
</code><br></pre><a name="24116"></a>
becomes
<p><a name="10234"></a>
<i>Method </i><code>double</code> <code>doubleLocals(double,double)<p><Table Border="0">
<tr><td> <i> 0
</i><br><td> <i>dload_1
</i><br><td><i>// First argument in locals 1 and 2
</i>
<tr><td> <i> 1
</i><br><td> <i>dload_3
</i><br><td><i>// Second argument in locals 3 and 4
</i>
<tr><td> <i> 2
</i><br><td> <i>dadd
</i><br><td><i>// Each also uses two words on stack
</i>
<tr><td> <i> 3
</i><br><td> <i>dreturn
</i><br><td>
<br>
</Table><br><br></code><p>
<a name="10226"></a>
It is always necessary to access the words of a two-word type in pairs and in their original order. For instance, the words of the <code>double</code> values in <code>doubleLocals</code> must never be manipulated individually.<p>
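<p>
The local variable numbering seen in <code>doubleLocals</code> can be reproduced with a short calculation. The sketch below is illustrative only: the class and method names are hypothetical, parameter types are given as single descriptor characters (<code>D</code> for <code>double</code>, <code>J</code> for <code>long</code>, <code>I</code> for <code>int</code>), and it assumes an instance method, whose local variable 0 holds <code>this</code>.

```java
// Sketch: computing the first local variable index of each method parameter.
// LocalSlots and parameterSlots() are illustrative names only.
final class LocalSlots {
    // paramTypes holds one descriptor character per parameter, for example
    // 'D' (double), 'J' (long), 'I' (int). An instance method is assumed,
    // so numbering starts at 1 because local 0 holds `this`.
    static int[] parameterSlots(char[] paramTypes) {
        int[] slots = new int[paramTypes.length];
        int next = 1;
        for (int k = 0; k < paramTypes.length; k++) {
            slots[k] = next;
            // double and long values occupy two consecutive words.
            next += (paramTypes[k] == 'D' || paramTypes[k] == 'J') ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        int[] slots = parameterSlots(new char[] {'D', 'D'});
        System.out.println(java.util.Arrays.toString(slots)); // prints "[1, 3]"
    }
}
```

The result matches the listing, where <i>dload_1</i> and <i>dload_3</i> fetch the first and second arguments.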
<a name="7349"></a>
The Java Virtual Machine's opcode size of one byte results in its compiled code being very compact. However, one-byte opcodes also mean that the Java Virtual Machine's instruction set must stay small. As a compromise, the Java Virtual Machine does not provide equal support for all data types: it is not completely orthogonal (see <a href="Overview.doc.html#23711">Table 3.1, "Type support in the Java Virtual Machine instruction set"</a>). In the case of <code>dspin</code>, note that there is no <i>if_dcmplt</i> instruction in the Java Virtual Machine instruction set. Instead, the comparison must be performed using a <i>dcmpg</i> followed by an <i>iflt</i>, requiring one more Java Virtual Machine instruction than the <code>int</code> version of <code>spin</code>.<p>
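<p>
The two-step comparison can be modeled in Java source as a rough sketch. The class and method names are hypothetical. The treatment of NaN is not discussed in this chapter; it follows the definition of <i>dcmpg</i> in the instruction set, which pushes 1 when either operand is NaN.

```java
// Sketch: modeling the "dcmpg; iflt" pair used for the test i < 100.0.
// DoubleCompare and its methods are illustrative names only.
final class DoubleCompare {
    // Result that dcmpg leaves on the stack: 1 if a > b, 0 if a == b,
    // -1 if a < b, and 1 if either operand is NaN.
    static int dcmpg(double a, double b) {
        if (a > b) return 1;
        if (a == b) return 0;
        if (a < b) return -1;
        return 1; // at least one operand is NaN
    }

    // iflt branches when the int on top of the stack is less than zero.
    static boolean loopsAgain(double i, double limit) {
        return dcmpg(i, limit) < 0;
    }

    public static void main(String[] args) {
        System.out.println(loopsAgain(99.0, 100.0));       // prints "true"
        System.out.println(loopsAgain(100.0, 100.0));      // prints "false"
        System.out.println(loopsAgain(Double.NaN, 100.0)); // prints "false"
    }
}
```

Because the NaN case yields 1, the <i>iflt</i> branch is not taken and the loop exits, which agrees with the Java expression <code>i &lt; 100.0</code> being false for NaN.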
<a name="4595"></a>
The Java Virtual Machine provides the most direct support for data of type <code>int</code>. This is partly because the Java Virtual Machine's operand stack and local variables are one word wide, and a word is guaranteed to hold values of all integral types up to and including an <code>int</code> value. It is also motivated by the frequency of <code>int</code> data in typical Java programs.<p>
<a name="24533"></a>
Smaller integral types have less direct support. There are no <code>byte</code>, <code>char</code>, or <code>short</code> versions of the store, load, or add instructions, for instance. Here is the <code>spin</code> example written using a <code>short</code>:<p>
<pre><br><a name="24537"></a> <code>void sspin() {
</code></pre><pre> <code> short i;
</code> <code> for (i = 0; i < 100; i++) {
</code> <code> ; // Loop body is empty
</code> <code> }