<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"><html><head><title>VM Spec Compiling for the Java Virtual Machine</title></head><body BGCOLOR=#eeeeff text=#000000 LINK=#0000ff VLINK=#000077 ALINK=#ff0000><table width=100%><tr><td><a href="VMSpecTOC.doc.html">Contents</a> | <a href="Instructions2.doc.html">Prev</a> | <a href="Threads.doc.html">Next</a> | <a href="VMSpecIX.fm.html">Index</a></td><td align=right><i><i>The Java<sup><font size=-2>TM</font></sup> Virtual Machine Specification</i></i></td></tr></table><hr><br> <a name="2839"></a><p><strong>CHAPTER 7 </strong></p><a name="2989"></a><h1>Compiling for the Java Virtual Machine</h1><hr><p><a name="6043"></a>
<p>The Java virtual machine is designed to support the Java programming language. Sun's JDK releases and Java 2 SDK contain both a compiler from source code written in the Java programming language to the instruction set of the Java virtual machine, and a runtime system that implements the Java virtual machine itself. Understanding how one compiler utilizes the Java virtual machine is useful to the prospective compiler writer, as well as to one trying to understand the Java virtual machine itself.
<p><a name="4054"></a>Although this chapter concentrates on compiling source code written in the Java programming language, the Java virtual machine does not assume that the instructions it executes were generated from such code. While there have been a number of efforts aimed at compiling other languages to the Java virtual machine, the current version of the Java virtual machine was not designed to support a wide range of languages. Some languages may be hosted fairly directly by the Java virtual machine. Other languages may be implemented only inefficiently.
<p><a name="15152"></a>Note that the term "compiler" is sometimes used when referring to a translator from the instruction set of a Java virtual machine to the instruction set of a specific CPU.
One example of such a translator is a just-in-time (JIT) code generator, which generates platform-specific instructions only after Java virtual machine code has been loaded. This chapter does not address issues associated with code generation, only those associated with compiling source code written in the Java programming language to Java virtual machine instructions.
<p><a name="4083"></a><hr><h2>7.1 Format of Examples</h2>
This chapter consists mainly of examples of source code together with annotated listings of the Java virtual machine code that the <code>javac</code> compiler in Sun's JDK release 1.0.2 generates for the examples. The Java virtual machine code is written in the informal "virtual machine assembly language" output by Sun's <code>javap</code> utility, distributed with the JDK software and the Java 2 SDK. You can use <code>javap</code> to generate additional examples of compiled methods.
<p><a name="15144"></a>The format of the examples should be familiar to anyone who has read assembly code. Each instruction takes the form
<p><blockquote><pre>&lt;index&gt; &lt;opcode&gt; [&lt;operand1&gt; [&lt;operand2&gt;...]] [&lt;comment&gt;]</pre></blockquote>
The &lt;index&gt; is the index of the opcode of the instruction in the array that contains the bytes of Java virtual machine code for this method. Alternatively, the &lt;index&gt; may be thought of as a byte offset from the beginning of the method. The &lt;opcode&gt; is the mnemonic for the instruction's opcode, and the zero or more &lt;operandN&gt; are the operands of the instruction. The optional &lt;comment&gt; is given in end-of-line comment syntax:
<p><blockquote><pre>   8   bipush 100        // Push <code>int</code> constant <code>100</code></pre></blockquote>
Some of the material in the comments is emitted by <code>javap</code>; the rest is supplied by the authors. The &lt;index&gt; prefacing each instruction may be used as the target of a control transfer instruction. For instance, a goto 8 instruction transfers control to the instruction at index 8.
Note that the actual operands of Java virtual machine control transfer instructions are offsets from the addresses of the opcodes of those instructions; these operands are displayed by <code>javap</code> (and are shown in this chapter) as more easily read offsets into their methods.
<p><a name="8695"></a>We preface an operand representing a runtime constant pool index with a hash sign and follow the instruction by a comment identifying the runtime constant pool item referenced, as in
<p><blockquote><pre>  10   ldc #1            // Push <code>float</code> constant <code>100.0</code></pre></blockquote>
or
<p><blockquote><pre>   9   invokevirtual #4  // Method <code>Example.addTwo(II)I</code></pre></blockquote>
For the purposes of this chapter, we do not worry about specifying details such as operand sizes.
<p><a name="4182"></a><hr><h2>7.2 Use of Constants, Local Variables, and Control Constructs</h2>
Java virtual machine code exhibits a set of general characteristics imposed by the Java virtual machine's design and use of types.
In the first example we encounter many of these, and we consider them in some detail.
<p><a name="4154"></a>The <code>spin</code> method simply spins around an empty <code>for</code> loop 100 times:
<p><blockquote><pre><code>void spin() {</code>
<code>    int i;</code>
<code>    for (i = 0; i &lt; 100; i++) {</code>
<code>        ;    // Loop body is empty</code>
<code>    }</code>
<code>}</code></pre></blockquote>
<a name="4112"></a>A compiler might compile <code>spin</code> to
<p><blockquote><pre>Method <code>void</code> <code>spin()</code>
   0   iconst_0          // Push <code>int</code> constant <code>0</code>
   1   istore_1          // Store into local variable 1 (<code>i</code>=<code>0</code>)
   2   goto 8            // First time through don't increment
   5   iinc 1 1          // Increment local variable 1 by 1 (<code>i++</code>)
   8   iload_1           // Push local variable 1 (<code>i</code>)
   9   bipush 100        // Push <code>int</code> constant <code>100</code>
  11   if_icmplt 5       // Compare and loop if less than (<code>i</code> &lt; <code>100</code>)
  14   return            // Return <code>void</code> when done</pre></blockquote>
<a name="10105"></a>The Java virtual machine is stack-oriented, with most operations taking one or more operands from the operand stack of the Java virtual machine's current frame or pushing results back onto the operand stack. A new frame is created each time a method is invoked, and with it is created a new operand stack and set of local variables for use by that method (see <a href="Overview.doc.html#17257">Section 3.6, "Frames"</a>). At any one point of the computation, there are thus likely to be many frames and equally many operand stacks per thread of control, corresponding to many nested method invocations. Only the operand stack in the current frame is active.
<p><a name="4169"></a>The instruction set of the Java virtual machine distinguishes operand types by using distinct bytecodes for operations on its various data types. The method <code>spin</code> operates only on values of type <code>int</code>.
The instructions in its compiled code chosen to operate on typed data (iconst_0, istore_1, iinc, iload_1, if_icmplt) are all specialized for type <code>int</code>.
<p><a name="4172"></a>The two constants in <code>spin</code>, <code>0</code> and <code>100</code>, are pushed onto the operand stack using two different instructions. The <code>0</code> is pushed using an iconst_0 instruction, one of the family of iconst_&lt;i&gt; instructions. The <code>100</code> is pushed using a bipush instruction, which fetches the value it pushes as an immediate operand.
<p><a name="14767"></a>The Java virtual machine frequently takes advantage of the likelihood of certain operands (<code>int</code> constants -1, 0, 1, 2, 3, 4 and 5 in the case of the iconst_&lt;i&gt; instructions) by making those operands implicit in the opcode. Because the iconst_0 instruction knows it is going to push an <code>int</code> <code>0</code>, iconst_0 does not need to store an operand to tell it what value to push, nor does it need to fetch or decode an operand. Compiling the push of <code>0</code> as bipush 0 would have been correct, but would have made the compiled code for <code>spin</code> one byte longer. A simple virtual machine would have also spent additional time fetching and decoding the explicit operand each time around the loop. Use of implicit operands makes compiled code more compact and efficient.
<p><a name="15090"></a>The <code>int</code> <code>i</code> in <code>spin</code> is stored as Java virtual machine local variable 1. Because most Java virtual machine instructions operate on values popped from the operand stack rather than directly on local variables, instructions that transfer values between local variables and the operand stack are common in code compiled for the Java virtual machine. These operations also have special support in the instruction set.
In <code>spin</code>, values are transferred to and from local variables using the istore_1 and iload_1 instructions, each of which implicitly operates on local variable 1. The istore_1 instruction pops an <code>int</code> from the operand stack and stores it in local variable 1. The iload_1 instruction pushes the value in local variable 1 onto the operand stack.
<p><a name="4941"></a>The use (and reuse) of local variables is the responsibility of the compiler writer. The specialized load and store instructions should encourage the compiler writer to reuse local variables as much as is feasible. The resulting code is faster, more compact, and uses less space in the frame.
<p><a name="15074"></a>Certain very frequent operations on local variables are catered to specially by the Java virtual machine. The iinc instruction increments the contents of a local variable by a one-byte signed value. The iinc instruction in <code>spin</code> increments local variable 1 (its first operand) by 1 (its second operand). The iinc instruction is very handy when implementing looping constructs.
<p><a name="12225"></a>The <code>for</code> loop of <code>spin</code> is accomplished mainly by these instructions:
<p><blockquote><pre>   5   iinc 1 1          // Increment local 1 by 1 (<code>i++</code>)
   8   iload_1           // Push local variable 1 (<code>i</code>)
   9   bipush 100        // Push <code>int</code> constant <code>100</code>
  11   if_icmplt 5       // Compare and loop if less than (<code>i</code> &lt; <code>100</code>)</pre></blockquote>
The bipush instruction pushes the value 100 onto the operand stack as an <code>int</code>, then the if_icmplt instruction pops both that value and the value of <code>i</code> pushed by iload_1 off the operand stack and compares them. If the comparison succeeds (the variable <code>i</code> is less than <code>100</code>), control is transferred to index 5 and the next iteration of the <code>for</code> loop begins.
Otherwise, control passes to the instruction following the if_icmplt.
<p><a name="4229"></a>If the <code>spin</code> example had used a data type other than <code>int</code> for the loop counter, the compiled code would necessarily change to reflect the different data type. For instance, if instead of an <code>int</code> the <code>spin</code> example uses a <code>double</code>, as shown,
<p><blockquote><pre><code>void dspin() {</code>
<code>    double i;</code>
<code>    for (i = 0.0; i &lt; 100.0; i++) {</code>
<code>        ;    // Loop body is empty</code>
<code>    }</code>
<code>}</code></pre></blockquote>
the compiled code is
<p><blockquote><pre>Method <code>void</code> <code>dspin()</code>
   0   dconst_0          // Push <code>double</code> constant <code>0.0</code>
   1   dstore_1          // Store into local variables 1 and 2
   2   goto 9            // First time through don't increment
   5   dload_1           // Push local variables 1 and 2
   6   dconst_1          // Push <code>double</code> constant <code>1.0</code>
   7   dadd              // Add; there is no dinc instruction
   8   dstore_1          // Store result in local variables 1 and 2
   9   dload_1           // Push local variables 1 and 2
  10   ldc2_w #4         // Push <code>double</code> constant <code>100.0</code>
  13   dcmpg             // There is no if_dcmplt instruction
  14   iflt 5            // Compare and loop if less than (<code>i</code> &lt; <code>100.0</code>)
  17   return            // Return <code>void</code> when done</pre></blockquote>
The instructions that operate on typed data are now specialized for type <code>double</code>. (The ldc2_w instruction will be discussed later in this chapter.)
<p><a name="10228"></a>Recall that <code>double</code> values occupy two local variables, although they are only accessed using the lesser index of the two local variables. This is also the case for values of type <code>long</code>.
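<p>As a rough illustration of this numbering (a sketch, not one of the specification's own listings), the helper below computes the lowest local variable index assigned to each parameter of an instance method. The class and method names are hypothetical, types are given as plain strings rather than descriptors, and the sketch assumes that local variable 0 holds <code>this</code>, so that parameters start at index 1.

```java
import java.util.Arrays;

public class LocalSlots {
    // Local variables occupied by one value of the named type:
    // two for long and double, one for every other type.
    static int slotsFor(String type) {
        return (type.equals("long") || type.equals("double")) ? 2 : 1;
    }

    // Lowest local variable index assigned to each parameter of an
    // instance method. Assumption: local variable 0 holds `this`.
    static int[] parameterIndices(String... paramTypes) {
        int[] indices = new int[paramTypes.length];
        int next = 1;
        int k = 0;
        for (String type : paramTypes) {
            indices[k++] = next;
            next += slotsFor(type);
        }
        return indices;
    }

    public static void main(String[] args) {
        // (long, int): the long takes locals 1 and 2, the int takes local 3.
        System.out.println(Arrays.toString(parameterIndices("long", "int")));
        // prints [1, 3]
        System.out.println(Arrays.toString(parameterIndices("int", "long", "double")));
        // prints [1, 2, 4]
    }
}
```

A value of type <code>long</code> or <code>double</code> advances the index by two, which is why the parameter that follows it skips a number.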
Again for example,
<p><blockquote><pre><code>double doubleLocals(double d1, double d2) {</code>
<code>    return d1 + d2;</code>
<code>}</code></pre></blockquote>
becomes
<p><blockquote><pre>Method <code>double</code> <code>doubleLocals(double,double)</code>
   0   dload_1           // First argument in local variables 1 and 2
   1   dload_3           // Second argument in local variables 3 and 4
   2   dadd
   3   dreturn</pre></blockquote>
<a name="16197"></a>Note that the two local variables of each pair used to store a <code>double</code> value in <code>doubleLocals</code> must never be manipulated individually.
<p><a name="16186"></a>The Java virtual machine's opcode size of 1 byte results in its compiled code being very compact. However, 1-byte opcodes also mean that the Java virtual machine instruction set must stay small. As a compromise, the Java virtual machine does not provide equal support for all data types: it is not completely orthogonal (see <a href="Overview.doc.html#37356">Table 3.2, "Type support in the Java virtual machine instruction set"</a>).
<p><a name="17566"></a>For example, the comparison of values of type <code>int</code> in the <code>for</code> statement of example <code>spin</code> can be implemented using a single if_icmplt instruction; however, there is no single instruction in the Java virtual machine instruction set that performs a conditional branch on values of type <code>double</code>. Thus, <code>dspin</code> must implement its comparison of values of type <code>double</code> using a dcmpg instruction followed by an iflt instruction.
<p><a name="4595"></a>The Java virtual machine provides the most direct support for data of type <code>int</code>. This is partly in anticipation of efficient implementations of the Java virtual machine's operand stacks and local variable arrays. It is also motivated by the frequency of <code>int</code> data in typical programs. Other integral types have less direct support.
There are no <code>byte</code>, <code>char</code>, or <code>short</code> versions of the store, load, or add instructions, for instance. Here is the <code>spin</code> example written using a <code>short</code>:
<p><blockquote><pre><code>void sspin() {</code>
<code>    short i;</code>
<code>    for (i = 0; i &lt; 100; i++) {</code>
<code>        ;    // Loop body is empty</code>
<code>    }</code>
<code>}</code></pre></blockquote>
It must be compiled for the Java virtual machine, as follows, using instructions operating on another type, most likely <code>int</code>, converting between <code>short</code> and <code>int</code> values as necessary to ensure that the results of operations on <code>short</code> data stay within the appropriate range:
<p><blockquote><pre>Method <code>void</code> <code>sspin()</code>
   0   iconst_0
   1   istore_1
   2   goto 10
   5   iload_1           // The <code>short</code> is treated as though an <code>int</code>
   6   iconst_1
   7   iadd
   8   i2s               // Truncate <code>int</code> to <code>short</code>
   9   istore_1
  10   iload_1
  11   bipush 100
  13   if_icmplt 5
  16   return</pre></blockquote>
The lack of direct support for <code>byte</code>, <code>char</code>, and <code>short</code> types in the Java virtual machine is not particularly painful, because values of those types are internally promoted to <code>int</code> (<code>byte</code> and <code>short</code> are sign-extended to <code>int</code>, <code>char</code> is zero-extended). Operations on <code>byte</code>, <code>char</code>, and <code>short</code> data can thus be done using <code>int</code> instructions. The only additional cost is that of truncating the values of <code>int</code> operations to valid ranges.
<p><a name="5831"></a>The <code>long</code> and floating-point types have an intermediate level of support in the Java virtual machine, lacking only the full complement of conditional control transfer instructions.
<p><a name="4228"></a><hr><h2>7.3 Arithmetic</h2>
The Java virtual machine generally does arithmetic on its operand stack.
(The exception is the iinc instruction, which directly increments the value of a local variable.) For instance, the <code>align2grain</code> method aligns an <code>int</code> value to a given power