\item[Load and store operations] for local variables like
  \texttt{iload} and \texttt{istore}. There are also array operations
  like \texttt{iastore}, which stores an integer value into an array.

\item[Field access:] The value of an instance field may be retrieved
  with \texttt{getfield} and written with \texttt{putfield}. For static
  fields, there are \texttt{getstatic} and \texttt{putstatic}
  counterparts.

\item[Method invocation:] Methods may either be called via static
  references with \texttt{invokesta\-tic} or be bound virtually with the
  \texttt{invokevirtual} instruction. Super class methods and private
  methods are invoked with \texttt{invokespecial}.

\item[Object allocation:] Class instances are allocated with the
  \texttt{new} instruction, arrays of basic type like \texttt{int[]}
  with \texttt{newarray}, and arrays of references like \texttt{String[][]}
  with \texttt{anewarray} or \texttt{multianewarray}.

\item[Conversion and type checking:] For stack operands of basic type
  there exist casting operations like \texttt{f2i}, which converts a
  float value into an integer. The validity of a type cast may be
  checked with \texttt{checkcast}, and the \texttt{instanceof} operator
  can be directly mapped to the equally named instruction.
\end{description}

Most instructions have a fixed length, but there are also some
variable-length instructions, in particular the \texttt{lookupswitch}
and \texttt{tableswitch} instructions, which are used to implement
\texttt{switch()} statements. Since the number of \texttt{case}
clauses may vary, these instructions contain a variable number of
branch targets.

We will not list all byte code instructions here, since these are
explained in detail in the JVM specification.
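
To make the variable length of the two switch instructions concrete,
the following minimal sketch computes their encoded size. It assumes
the layout given in the JVM specification: the opcode, zero to three
padding bytes so that the operands start at a multiple of four,
and then 4-byte operands. The class \texttt{SwitchLengthSketch} and
its methods are hypothetical names chosen for illustration; they are
not part of \jc.

```java
public class SwitchLengthSketch {
    // Padding bytes that follow the opcode so that the first operand
    // starts at a byte offset that is a multiple of four.
    static int padding(int opcodeOffset) {
        return (4 - ((opcodeOffset + 1) % 4)) % 4;
    }

    // tableswitch: opcode, padding, default/low/high (3 x 4 bytes),
    // then one 4-byte branch offset per value in [low, high].
    static int tableswitchLength(int opcodeOffset, int low, int high) {
        return 1 + padding(opcodeOffset) + 12 + 4 * (high - low + 1);
    }

    // lookupswitch: opcode, padding, default and npairs (2 x 4 bytes),
    // then npairs (match, offset) pairs of 8 bytes each.
    static int lookupswitchLength(int opcodeOffset, int npairs) {
        return 1 + padding(opcodeOffset) + 8 + 8 * npairs;
    }

    public static void main(String[] argv) {
        // A tableswitch at offset 1 covering the cases 0..2
        System.out.println(tableswitchLength(1, 0, 2));  // prints 27
        // A lookupswitch at offset 4 with two case clauses
        System.out.println(lookupswitchLength(4, 2));    // prints 28
    }
}
```

Note that the length depends not only on the number of case clauses
but also on the position of the instruction within the method, because
of the alignment padding.
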
The opcode names are mostly self-explanatory, so understanding the
following code examples should be fairly intuitive.

\subsection{Method code}
\label{sec:code2}

Non-abstract methods contain an attribute (\texttt{Code}) that holds
the following data: the maximum size of the method's operand stack, the
number of local variables, and an array of byte code
instructions. Optionally, it may also contain information about the
names of local variables and source file line numbers that can be used
by a debugger.

Whenever an exception is thrown, the JVM performs exception handling
by looking into a table of exception handlers. The table marks
handlers, i.e., pieces of code, as responsible for exceptions of
certain types that are raised within a given area of the byte
code. When there is no appropriate handler, the exception is propagated
back to the caller of the method. The handler information is itself
stored in an attribute contained within the \texttt{Code} attribute.

\subsection{Byte code offsets}
\label{sec:offsets}

Targets of branch instructions like \texttt{goto} are encoded as
relative offsets in the array of byte codes. Exception handlers and
local variables refer to absolute addresses within the byte code. The
former contain references to the start and the end of the
\texttt{try} block, and to the first instruction of the handler code.
The latter mark the range in which a local variable is valid, i.e., its
scope. This makes it difficult to insert or delete code areas on this
level of abstraction, since one has to recompute the offsets every time
and update the referring objects. We will see in section
\ref{sec:cgapi} how \jc remedies this restriction.

\subsection{Type information}
\label{sec:types}

Java is a type-safe language, and the information about the types of
fields, local variables, and methods is stored in
\emph{signatures}. These are strings stored in the \cp and encoded in
a special format.
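
As a minimal sketch of this encoding, the following program builds
method signatures from \texttt{Class} objects, assuming the descriptor
grammar of the JVM specification (single characters for basic types,
\texttt{L...;} for classes, and a leading \texttt{[} per array
dimension). The class \texttt{SignatureSketch} and its helper methods
are hypothetical names used for illustration only; they are not part
of \jc.

```java
public class SignatureSketch {
    // Encode a single type as a JVM field descriptor.
    static String descriptor(Class<?> type) {
        if (type.isArray()) {
            // One '[' per dimension, followed by the component type
            return "[" + descriptor(type.getComponentType());
        }
        if (type.isPrimitive()) {
            if (type == void.class)    return "V";
            if (type == int.class)     return "I";
            if (type == long.class)    return "J";
            if (type == float.class)   return "F";
            if (type == double.class)  return "D";
            if (type == boolean.class) return "Z";
            if (type == byte.class)    return "B";
            if (type == char.class)    return "C";
            return "S"; // short is the only remaining primitive type
        }
        // Class names use '/' instead of '.' and are wrapped in L...;
        return "L" + type.getName().replace('.', '/') + ";";
    }

    // Encode argument types and return type as a method signature.
    static String methodSignature(Class<?> returnType, Class<?>... args) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> arg : args) {
            sb.append(descriptor(arg));
        }
        return sb.append(')').append(descriptor(returnType)).toString();
    }

    public static void main(String[] argv) {
        // void main(String[] argv)
        System.out.println(methodSignature(void.class, String[].class));
        // prints ([Ljava/lang/String;)V

        // int fac(int n)
        System.out.println(methodSignature(int.class, int.class));
        // prints (I)I
    }
}
```
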
For example, the argument and return types of the
\texttt{main} method

\begin{verbatim}
  public static void main(String[] argv)
\end{verbatim}

are represented by the signature

\begin{verbatim}
  ([Ljava/lang/String;)V
\end{verbatim}

Classes and arrays are internally represented by strings like
\texttt{"java/lang/String"}, basic types like \texttt{float} by an
integer number. Within signatures they are represented by single
characters, e.g., \texttt{"I"} for integer.

\subsection{Code example}
\label{sec:fac}

The following example program prompts for a number and prints its
factorial. The \texttt{readLine()} method reading from the
standard input may raise an \texttt{IOException}, and if a malformed
number is passed to \texttt{parseInt()}, it throws a
\texttt{NumberFormatException}. Thus, the critical area of code must be
encapsulated in a \texttt{try-catch} block.

{\small \begin{verbatim}
import java.io.*;

public class Faculty {
  private static BufferedReader in = new BufferedReader(new
                                InputStreamReader(System.in));

  // Recursively computes the factorial of n
  public static final int fac(int n) {
    return (n == 0)? 1 : n * fac(n - 1);
  }

  // Reads a number from standard input; returns 4711 on error
  public static final int readInt() {
    int n = 4711;
    try {
      System.out.print("Please enter a number> ");
      n = Integer.parseInt(in.readLine());
    } catch(IOException e1) { System.err.println(e1); }
    catch(NumberFormatException e2) { System.err.println(e2); }
    return n;
  }

  public static void main(String[] argv) {
    int n = readInt();
    System.out.println("Faculty of " + n + " is " + fac(n));
  }
}
\end{verbatim}}

This code example typically compiles to the following chunks of byte
code:

\subsubsection{Method fac}

{\small \begin{verbatim}
0:  iload_0
1:  ifne            #8
4:  iconst_1
5:  goto            #16
8:  iload_0
9:  iload_0
10: iconst_1
11: isub
12: invokestatic    Faculty.fac (I)I (12)
15: imul
16: ireturn

LocalVariable(start_pc = 0, length = 16, index = 0:int n)
\end{verbatim}}

The method \texttt{fac} has only one local variable, the argument
\texttt{n}, stored in slot 0. This variable's scope ranges from the
start of the byte code sequence to the very end.
If the value of
\texttt{n} (stored in local variable 0, i.e., the value fetched with
\texttt{iload\_0}) is not equal to 0, the \texttt{ifne} instruction
branches to the byte code at offset 8; otherwise a 1 is pushed onto
the operand stack and the control flow branches to the final return.
For ease of reading, the offsets of the branch instructions, which are
actually relative, are displayed as absolute addresses in these
examples.

If recursion has to continue, the arguments for the multiplication
(\texttt{n} and \texttt{fac(n - 1)}) are evaluated and the results
pushed onto the operand stack. After the multiplication operation has
been performed, the function returns the computed value from the top of
the stack.

\subsubsection{Method readInt}

{\small \begin{verbatim}
0:  sipush        4711
3:  istore_0
4:  getstatic     java.lang.System.out Ljava/io/PrintStream;
7:  ldc           "Please enter a number> "
9:  invokevirtual java.io.PrintStream.print (Ljava/lang/String;)V
12: getstatic     Faculty.in Ljava/io/BufferedReader;
15: invokevirtual java.io.BufferedReader.readLine ()Ljava/lang/String;
18: invokestatic  java.lang.Integer.parseInt (Ljava/lang/String;)I
21: istore_0
22: goto          #44
25: astore_1
26: getstatic     java.lang.System.err Ljava/io/PrintStream;
29: aload_1
30: invokevirtual java.io.PrintStream.println (Ljava/lang/Object;)V
33: goto          #44
36: astore_1
37: getstatic     java.lang.System.err Ljava/io/PrintStream;
40: aload_1
41: invokevirtual java.io.PrintStream.println (Ljava/lang/Object;)V
44: iload_0
45: ireturn

Exception handler(s) =
From    To      Handler Type
4       22      25      java.io.IOException(6)
4       22      36      NumberFormatException(10)
\end{verbatim}}

First the local variable \texttt{n} (in slot 0) is initialized to the
value 4711. The next instruction, \texttt{getstatic}, loads the
static \texttt{System.out} field onto the stack.
Then a string is
loaded and printed, and a number is read from the standard input and
assigned to \texttt{n}.

If one of the called methods (\texttt{readLine()} and
\texttt{parseInt()}) throws an exception, the \jvm calls one of the
declared exception handlers, depending on the type of the exception.
The \texttt{try}-clause itself does not produce any code; it merely
defines the range in which the following handlers are active. In the
example, the specified source code area maps to a byte code area
ranging from offset 4 (inclusive) to 22 (exclusive). If no exception
has occurred (``normal'' execution flow), the \texttt{goto}
instructions branch past the handler code. There the value of
\texttt{n} is loaded and returned.

For example, the handler for \texttt{java.io.IOException} starts at
offset 25. It simply prints the error and branches back to the normal
execution flow, i.e., as if no exception had occurred.

\section{The BCEL API}
\label{sec:api}

The \jc API abstracts from the concrete circumstances of the \jvm and
from how to read and write binary Java class files. The API mainly
consists of three parts:

\begin{enumerate}

\item A package that contains classes that describe ``static''
  constraints of class files, i.e., it reflects the class file format
  and is not intended for byte code modifications. The classes may be
  used to read and write class files from or to a file. This is
  especially useful for analyzing Java classes without having the
  source files at hand. The main data structure is called
  \texttt{JavaClass}, which contains methods, fields, etc.

\item A package to dynamically generate or modify \texttt{JavaClass}
objects. It may be used, e.g.,
to insert analysis code, to strip
unnecessary information from class files, or to implement the code
generator back-end of a Java compiler.

\item Various code examples and utilities like a class file viewer, a
tool to convert class files into HTML, and a converter from class
files to the Jasmin assembly language \cite{jasmin}.

\end{enumerate}

\subsection{JavaClass}
\label{sec:javaclass}

The ``static'' component of the \jc API resides in the package
\path{de.fub.bytecode.classfile} and represents class files. All of the
binary components and data structures declared in the JVM
specification \cite{jvm} and described in section \ref{sec:jvm} are
mapped to classes. Figure \ref{fig:umljc} shows a UML diagram of the
hierarchy of classes of the \jc API. Figure \ref{fig:umlcp} in the
appendix also shows a detailed diagram of the \texttt{ConstantPool}
components.

\begin{figure}[htbp]
  \begin{center}
    \leavevmode
    \epsfysize0.93\textheight
    \epsfbox{eps/javaclass.eps}
    \caption{UML diagram for the \jc API}\label{fig:umljc}
  \end{center}
\end{figure}

The top-level data structure is \texttt{JavaClass}, which in most
cases is created by a \texttt{Class\-Par\-ser} object that is capable
of parsing binary class files. A \texttt{JavaClass} object basically
consists of fields, methods, and symbolic references to the super class
and to the implemented interfaces.

The \cp serves as a kind of central repository and is thus of
outstanding importance for all components. \texttt{ConstantPool}
objects contain an array of fixed size of \texttt{Constant} entries,
which may be retrieved via the \texttt{getConstant()} method taking an