\documentclass[12pt,twoside]{article}
\usepackage{epsf,a4wide,moreverb,url}
\usepackage{palatino}

\newcommand\jc{{\sffamily BCEL }}
\newcommand\cp{{constant pool }}
\newcommand\cpe{constant pool}
\newcommand\jvm{{Java Virtual Machine }}
\newcommand\jvme{{Java Virtual Machine}}
\newcommand\vm{{Virtual Machine }}
\newcommand\href[2]{#2}

\begin{document}

\title{Byte Code Engineering Library (BCEL)\\
       Description and usage manual\\
       {\small \textbf{Version 1.0}}}

\author{{\Large Markus Dahm}\\\\
        \href{mailto:markus.dahm@inf.fu-berlin.de}{\texttt{markus.dahm@berlin.de}}}

\maketitle
%\tableofcontents

\begin{abstract}
Extensions and improvements of the programming language Java and its
related execution environment (Java Virtual Machine, JVM) are the
subject of a large number of research projects and proposals. There
are projects, for instance, to add parameterized types to Java, to
implement ``Aspect-Oriented Programming'', to perform sophisticated
static analysis, and to improve the run-time performance.

Since Java classes are compiled into portable binary class files
(called \emph{byte code}), the most convenient and
platform-independent way to implement these improvements is not to
write a new compiler or change the JVM, but to transform the
byte code. These transformations can be performed either after
compile-time or at load-time. Many programmers do this by
implementing their own specialized byte code manipulation tools, which
are, however, restricted in the range of their re-usability.

To deal with the necessary class file transformations, we introduce an
API that helps developers to conveniently implement their
transformations.
\end{abstract}

\section{Introduction}
\label{sec:intro}

The Java language \cite{gosling} has become very popular and many
research projects deal with further improvements of the language or
its run-time behavior.
The possibility to extend a language with new
concepts is surely a desirable feature, but implementation issues
should be hidden from the user. Fortunately, the concepts of the \jvm
permit the user-transparent implementation of such extensions with
relatively little effort.

Because the target language of Java is an interpreted language with a
small and easy-to-understand set of instructions (the \emph{byte
code}), developers can implement and test their concepts in a very
elegant way. One can write a plug-in replacement for the system's
class loader, which is responsible for dynamically loading class files
at run-time and passing the byte code to the \vm (see section
\ref{sec:classloaders}). Class loaders may thus be used to intercept
the loading process and transform classes before they are actually
executed by the JVM \cite{classloader}. While the original class
files always remain unaltered, the behavior of the class loader may be
reconfigured for every execution or instrumented dynamically.

The \jc API (Byte Code Engineering Library), formerly known as
JavaClass, is a toolkit for the static analysis and dynamic creation
or transformation of Java class files. It enables developers to
implement the desired features on a high level of abstraction without
handling all the internal details of the Java class file format and
thus re-inventing the wheel every time. \jc is written entirely in
Java and freely available under the terms of the Apache Software
License.\footnote{The distribution is available at
  \url{http://jakarta.apache.org/bcel/}, including several code
  examples and javadoc manuals.}

This paper is structured as follows: We give a brief description of
the \jvm and the class file format in section \ref{sec:jvm}. Section
\ref{sec:api} introduces the \jc API. Section \ref{sec:application}
describes some typical application areas and example projects.
The
appendix contains code examples that are too long to be presented in
the main part of this paper. All examples are included in the
downloadable distribution.

\subsection{Related work}

There are a number of proposals and class libraries that have some
similarities with \textsc{BCEL}: The JOIE \cite{joie} toolkit can
be used to instrument class loaders with dynamic behavior. Similarly,
``Binary Component Adaptation'' \cite{bca} allows components to be
adapted and evolved on-the-fly. Han Lee's ``Byte-code Instrumenting
Tool'' \cite{bit} allows the user to insert calls to analysis methods
anywhere in the byte code. The Jasmin language \cite{jasmin} can be
used to hand-write or generate pseudo-assembler code. D-Java
\cite{classfile} and JCF \cite{inside} are class viewing tools.

In contrast to these projects, \jc is intended to be a general purpose
tool for ``byte code engineering''. It gives full control to the
developer on a high level of abstraction and is not restricted to any
particular application area.

\section{The Java Virtual Machine}
\label{sec:jvm}

Readers already familiar with the \jvm and the Java class file format
may want to skip this section and proceed with section \ref{sec:api}.

Programs written in the Java language are compiled into a portable
binary format called \emph{byte code}. Every class is represented by
a single class file containing class related data and byte code
instructions. These files are loaded dynamically into an interpreter
(\jvme, JVM) and executed.

Figure \ref{fig:jvm} illustrates the procedure of compiling and
executing a Java class: The source file (\texttt{HelloWorld.java}) is
compiled into a Java class file (\texttt{HelloWorld.class}), loaded by
the byte code interpreter and executed. In order to implement
additional features, researchers may want to transform class files
(drawn with bold lines) before they are actually executed.
This
application area is one of the main issues of this article.

\begin{figure}[htbp]
  \begin{center}
    \leavevmode
    \epsfxsize\textwidth
    \epsfbox{eps/jvm.eps}
    \caption{Compilation and execution of Java classes}
    \label{fig:jvm}
  \end{center}
\end{figure}

Note that the general term ``Java'' has two meanings:
on the one hand, it denotes Java as a programming language; on the other
hand, the Java Virtual Machine, which is not necessarily targeted by
the Java language exclusively, but may be used by other languages as
well (e.g. Eiffel \cite{eiffel}, or Ada \cite{ada}). We assume the
reader to be familiar with the Java language and to have a general
understanding of the Virtual Machine.

\subsection{Java class file format}
\label{sec:format}

Giving a full overview of the design issues of the Java class file
format and the associated byte code instructions is beyond the scope
of this paper. We will just give a brief introduction covering the
details that are necessary for understanding the rest of this
paper. The format of class files and the byte code instruction set are
described in more detail in the ``\jvm Specification'' \cite{jvm}%
\footnote{Also available online at
\url{http://www.javasoft.com/docs/books/vmspec/index.html}}, and in
\cite{jasmin}. In particular, we will not deal with the security
constraints that the \jvm has to check at run-time, i.e. the byte code
verifier.

Figure \ref{fig:classfile} shows a simplified example of the contents
of a Java class file: It starts with a header containing a ``magic
number'' (\texttt{0xCAFEBABE}) and the version number, followed by the
\emph{\cpe}, which can be roughly thought of as the text segment of an
executable, the \emph{access rights} of the class encoded by a bit
mask, a list of interfaces implemented by the class, lists containing
the fields and methods of the class, and finally the \emph{class
attributes}, e.g.
the \texttt{SourceFile} attribute telling the name
of the source file. Attributes are a way of putting additional,
e.g. user-defined, information into class file data structures. For
example, a custom class loader may evaluate such attribute data in
order to perform its transformations. The JVM specification declares
that unknown, i.e. user-defined, attributes must be ignored by any \vm
implementation.

\begin{figure}[htbp]
  \begin{center}
    \leavevmode
    \epsfxsize\textwidth
    \epsfbox{eps/classfile.eps}
    \caption{Java class file format}
    \label{fig:classfile}
  \end{center}
\end{figure}

Because all of the information needed to dynamically resolve the
symbolic references to classes, fields and methods at run-time is
coded with string constants, the \cp in fact makes up the largest
portion of an average class file, approximately 60\% \cite{statistic}.
The byte code instructions themselves just make up 12\%.

The upper right box shows a ``zoomed'' excerpt of the \cpe, while the
rounded box below depicts some instructions that are contained within
a method of the example class. These instructions represent the
straightforward translation of the well-known statement:

\begin{verbatim}
   System.out.println("Hello, world");
\end{verbatim}

The first instruction loads the contents of the field \texttt{out} of
class \texttt{java.lang.System} onto the operand stack. This is an
instance of the class \texttt{java.io.PrintStream}. The \texttt{ldc}
(``Load constant'') instruction pushes a reference to the string
\texttt{"Hello, world"} onto the stack. The next instruction invokes
the instance method \texttt{println}, which takes both values as
parameters (instance methods always implicitly take an instance
reference as their first argument).

Instructions, other data structures within the class file and
constants themselves may refer to constants in the \cpe.
Such
references are implemented via fixed indexes encoded directly into the
instructions. This is illustrated for some items of the figure
emphasized with a surrounding box.

For example, the \texttt{invokevirtual} instruction refers to a
\texttt{MethodRef} constant that contains information about the name
of the called method, the signature (i.e. the encoded argument and
return types), and the class to which the method belongs. In fact, as
emphasized by the boxed value, the \texttt{MethodRef} constant itself
just refers to other entries holding the real data, e.g. it refers to
a \texttt{ConstantClass} entry containing a symbolic reference to the
class \texttt{java.io.PrintStream}. To keep the class file compact,
such constants are typically shared by different instructions.
Similarly, a field is represented by a \texttt{Fieldref} constant that
includes information about the name, the type and the containing class
of the field.

The \cp basically holds the following types of constants: References
to methods, fields and classes, strings, integers, floats, longs, and
doubles.

\subsection{Byte code instruction set}
\label{sec:code}

The JVM is a stack-oriented interpreter that creates a local stack
frame of fixed size for every method invocation. The size of the local
stack has to be computed by the compiler. Values may also be stored
intermediately in a frame area containing \emph{local variables} which
can be used like a set of registers. These local variables are
numbered from 0 to 65535, i.e. you have a maximum of 65536 local
variables. The stack frames of caller and callee method are
overlapping, i.e.
the caller pushes arguments onto the operand stack
and the called method receives them in local variables.

The byte code instruction set currently consists of 212 instructions;
44 opcodes are marked as reserved and may be used for future
extensions or intermediate optimizations within the Virtual
Machine. The instruction set can be roughly grouped as follows:

\begin{description}
\item[Stack operations:] Constants can be pushed onto the stack either
by loading them from the \cp with the \texttt{ldc} instruction or with
special ``short-cut'' instructions where the operand is encoded into
the instructions, e.g. \texttt{iconst\_0} or \texttt{bipush} (push
byte value).

\item[Arithmetic operations:] The instruction set of the \jvm
distinguishes its operand types using different instructions to
operate on values of specific type. Arithmetic operations starting
with \texttt{i}, for example, denote an integer operation, e.g.
\texttt{iadd}, which adds two integers and pushes the result back onto
the stack. The Java types \texttt{boolean}, \texttt{byte},
\texttt{short}, and \texttt{char} are handled as integers by the JVM.

\item[Control flow:] There are branch instructions like \texttt{goto}
and \texttt{if\_icmpeq}, which compares two integers for
equality. There is also a \texttt{jsr} (jump sub-routine) and
\texttt{ret} pair of instructions that is used to implement the
\texttt{finally} clause of \texttt{try-catch} blocks. Exceptions may
be thrown with the \texttt{athrow} instruction.

Branch targets are coded as offsets from the current byte code
position, i.e. with an integer number.
