<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0063)http://topaz.cs.byu.edu/text/html/Textbook/AppendixA/index.html -->
<HTML><HEAD><TITLE>Appendix A: Debugging and Virtual Machine Architecture</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1458" name=GENERATOR></HEAD>
<BODY>
<CENTER>
<H2>Appendix A<BR>Debugging and Virtual Machine Architecture</H2></CENTER>
<H2>Introduction</H2><!----------------------------------------------------------------------------->This
chapter will serve as a buffer against the shock that might otherwise follow. In
short, code generation in a compiler is difficult, time consuming, and can be
frustrating. In this chapter we will introduce some of the basic elements of
making a SAL program run. We will also introduce the code-generation API that is
presented in <TT>SAL-CodeGen.CPP</TT>. As we dive into this, we will go over the
architecture of the virtual machine and present this in a way that is
interactive and understandable. We will also introduce the debugger and teach
how to use it.
<P>
<H2>A.1 Debugging a SAL Program</H2><!----------------------------------------------------------------------------->Let's
start by learning a little about the architecture of the SAL Virtual Machine
(VM). There is a great deal of information to be absorbed in a short time, so we
will try to make our explanations clear. Start by getting the compiler and VM,
and putting them into the same directory. Then, type in the following program
and compile it. <PRE> program Test;
var
s: int;
begin
write "Enter a number: ";
read s;
s:= s + 5;
write "\ns + 5 = ", s, nl;
end program.
</PRE>Let's debug this program. Type the following line: <PRE> runsal -d -p test
</PRE>You should see something that looks like this: <PRE> C:\SALTest&gt;<B>runsal -d -p test</B>
SAL Interpreter
Version 1.0 11-1997
Brigham Young University
Department of Computer Science
Compiler Research Laboratory
Build date: (Mon Jan 11 18:57:34 1999)
00CF0D70: ENTR 00000000
&lt;Enter&gt;:StpIn &lt;Space&gt;:StpOvr &lt;R&gt;:StpOut &lt;?&gt;:Help &lt;Esc&gt;:Quit
</PRE>After the startup banner and the build date we can see a line that has the
instruction <TT>ENTR</TT> followed by a 4-byte dword that is zero. This is the
first instruction of the program. Below that is a prompt listing the most common
commands. Commands are given to the debugger in the form of a single
keystroke. Hit the '?' key. You will see a summary of all the commands that can
be given. <PRE> Help commands:
----------------------------------------------
E: View the EES
G: View global memory
L: View local stack memory
S: View string data
M: View memory
D: Decode procedure.
Enter Step into a procedure
Space Step over a procedure
R Step out of (return from) a procedure
X Continue execution
[esc] Quit
</PRE>Let's see what our program looks like. Hit 'D'. When it asks for the
procedure number, type 0 and hit enter. <PRE> Decode procedure num (in hex): <B>0</B>
Decoding procedure 0000:
00CF0D70: ENTR 00000000
00CF0D75: LGA 00000004
00CF0D7A: TS
00CF0D7B: SJPZ 00CF0D7F
00CF0D7E: RTN
00CF0D7F: LSTA 00000000
00CF0D84: SYS 00
00CF0D86: LGA 0000000D
00CF0D8B: LIB 14
00CF0D8D: SYS 06
00CF0D8F: SSD 00000000
00CF0D94: LGA 0000000D
00CF0D99: LGD 0000000D
00CF0D9E: LID 00000005
00CF0DA3: ADD _s32 _s32
00CF0DA6: SSD 00000000
00CF0DAB: LSTA 00000011
00CF0DB0: SYS 00
00CF0DB2: LGD 0000000D
00CF0DB7: LID 00000000
00CF0DBC: LIB 14
00CF0DBE: SYS 02
00CF0DC0: LIB 0A
00CF0DC2: SYS 01
00CF0DC4: LEXA
00CF0DC5: LSD 00000000
00CF0DCA: LIB 00
00CF0DCC: UCMP _u32 _u8
00CF0DCF: EQL
00CF0DD0: JPNZ 00CF0E01
00CF0DD5: LEXA
00CF0DD6: LSD 00000000
00CF0DDB: COPT _u32
00CF0DDD: LIB 00
00CF0DDF: UCMP _u32 _u8
00CF0DE2: EQL
00CF0DE3: JPNZ 00CF0DFA
00CF0DE8: COPT _u32
00CF0DEA: LSA 00000008
00CF0DEF: SWAP 0004 0004
00CF0DF4: MFREE
00CF0DF5: JMP 00CF0DDB
00CF0DFA: POP 0004
00CF0DFD: LIW 000E
00CF0E00: TRAP
00CF0E01: RTN
End of procedure.
</PRE>The debugger prints about 21 lines at a time and then asks whether you want to view more. If
you hit any key other than 'N' it continues printing. Let's go over
what this code does and step through it piece by piece. On the first line we see
this: <PRE> 00CF0D70: ENTR 00000000
</PRE>This instruction starts every procedure in SAL. In fact, the VM will halt
if this is not the first instruction that it encounters in a procedure. All it
does is allocate space for local variables on the procedure stack. Since this is
the main procedure, all of its variables are global, so no space is needed. Hit
the space bar once and let's step to the next instruction.
<P>The next section of code looks like this: <PRE> 00CF0D75: LGA 00000004
00CF0D7A: TS
00CF0D7B: SJPZ 00CF0D7F
00CF0D7E: RTN
</PRE>The first instruction calculates an address by taking the value of the G
register and adding 4 to it. The instruction is called Load Global Address
(LGA). The G register is a pointer that keeps track of the start of global
memory for the current module. The value that we want is a single-byte flag at
offset 4 from the start of global memory.
<P>SAL programs are linked at runtime. Although this link is still static, it is
possible that two or more modules of a program rely on a common
library. If so, the VM needs to make sure that the module gets
initialized only once. This bit of code checks a flag to see whether it
has already been set. Let's take a look at that byte. Hit the 'G'
key. <PRE> Memory dump at G = 00CF0C90
Length: 13
Address: +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F | ASCII text:
------------------------------------------------------------+------------------
00CF0C90 D0 0C CF 00 <B>00</B> 00 0D CF - 00 00 00 00 00 | ........ .....
</PRE>We can see that (in this case) G is pointing to the address 00CF0C90h. At
offset 4 (highlighted in bold) we see that this flag is set to zero, indicating
that our module has not yet been initialized. Initialization consists of setting
up pointers to global arrays and records, calling constructors of global
classes, and executing the main procedure of libraries (or in our case
programs).
<P><!----------------------------------------------------------------------------->Consider
the memory dump for a moment. On the left you can see the starting address for
the current row. It reads 00CF0C90h. Along the top of the dump is a sequence of
numbers labeling the columns, from +0, +1, +2, and so on up to +F. We can
look up the byte at any hex address by using the least-significant digit to
find the column and the remaining digits to find the row.
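<P>The row and column lookup can be sketched in a few lines of Python. This is only an illustration: <TT>locate</TT> is a hypothetical helper, not part of the SAL tools, and it assumes a dump that is 16 bytes wide with rows starting on 16-byte boundaries, as in the listing above.

```python
# Locate a byte in a 16-column hex dump such as the debugger's 'G' view.
# Hypothetical helper for illustration; not part of the SAL tools.

def locate(address):
    """Return (row_address, column) for a byte address in a 16-wide dump."""
    column = address & 0xF          # least-significant hex digit selects the column
    row_address = address - column  # remaining digits select the row
    return row_address, column

row, col = locate(0x00CF0C94)       # the flag at G + 4 in the dump above
print(f"row {row:08X}, column +{col:X}")   # prints: row 00CF0C90, column +4
```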
<P>To the right we can see the actual ASCII characters for each byte. This is
very useful in viewing string memory, as we shall see later. Right now none of
the bytes are printable characters so the debugger merely prints periods.
<P>A word about SAL VM memory. In the dumps shown here, anything longer than a byte is arranged in
"little endian" order. In other words, the little end (i.e., the least
significant byte) is listed first and the most significant byte is listed last.
In the SAL VM, byte order is determined by the host architecture, in our case
Intel x86. If we were running on an HP or Sun workstation the byte order would be
"big endian".
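<P>To see what byte order means in practice, the sketch below reads the first four bytes of the global memory dump (D0 0C CF 00) as a 32-bit value in both orders. It uses only Python's built-in <TT>int.from_bytes</TT>; what this particular dword is used for is not stated here, so the example only shows the arithmetic.

```python
# First four bytes of the global memory dump shown earlier, in memory order.
raw = bytes([0xD0, 0x0C, 0xCF, 0x00])

little = int.from_bytes(raw, byteorder="little")  # how an x86 host reads them
big = int.from_bytes(raw, byteorder="big")        # how a big-endian host would

print(f"little endian: {little:08X}")   # prints: little endian: 00CF0CD0
print(f"big endian:    {big:08X}")      # prints: big endian:    D00CCF00
```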
<P>The <TT>LGA</TT> instruction has not yet executed, so let's hit the space bar
one more time and step through it. <TT>LGA</TT> leaves its result on the
Expression Evaluation Stack (EES). Let's look at it now. Hit 'E'.
<P><PRE> EES:
-----------
00 CF 0C 94
</PRE>The only exception to the endian rule is the EES, which is always the
reverse of the host architecture. The reason is that the EES is a byte stack;
items are stored on the EES one byte at a time regardless of the original data
size. Because the EES is a LIFO structure, the ordering ends up reversed.
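<P>The reversal can be reproduced with a small model of a byte stack. This is a sketch under two assumptions that are ours, not the VM's documented behavior: bytes are pushed in host (little-endian) memory order, and the debugger lists the top of the stack first. The names <TT>push_dword</TT> and <TT>dump_top_first</TT> are hypothetical.

```python
# Toy model of a byte-oriented evaluation stack (assumed behavior, see lead-in).

def push_dword(ees, value):
    """Push a 32-bit value one byte at a time, in little-endian memory order."""
    for b in value.to_bytes(4, byteorder="little"):
        ees.append(b)

def dump_top_first(ees):
    """Format the stack contents starting from the most recently pushed byte."""
    return " ".join(f"{b:02X}" for b in reversed(ees))

ees = []
push_dword(ees, 0x00CF0C90 + 4)   # the address LGA computes: G + 4
print(dump_top_first(ees))        # prints: 00 CF 0C 94
```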
<P>The next instruction is <TT>TS</TT>. This stands for Test and Set. This
instruction does three things.
<OL>
<LI>It eats an address from the EES.
<LI>It gets the byte at that address and stores it back on the EES.
<LI>It writes the value of 1 to that address. </LI></OL>This all happens in one
instruction cycle. If you have taken a class in operating systems you will
understand how this instruction is significant for controlling threads and
processes.
<P>The next instruction, <TT>SJPZ</TT>, is a short jump to address 00CF0D7Fh that is taken if the byte at the top of the EES is
zero. The instruction after it is a <TT>RTN</TT>. What these
four instructions effectively accomplish is to check whether this module's main
procedure has already been executed. If so, the VM returns immediately. In our
case, it keeps going.
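<P>Taken together, the four instructions behave like a "run once" guard. The sketch below expresses that guard in Python; <TT>run_main_once</TT> and its arguments are hypothetical, and global memory is modeled as a 13-byte <TT>bytearray</TT> to match the length shown in the dump.

```python
# Model of the LGA 4 / TS / SJPZ / RTN sequence as a run-once guard.

def run_main_once(globals_mem, body):
    """Run body() only if the flag at offset 4 was still clear."""
    flag = globals_mem[4]    # TS: read the old flag value...
    globals_mem[4] = 1       # ...and set the flag
    if flag != 0:            # SJPZ not taken: fall through to RTN
        return False
    body()                   # SJPZ taken: continue into the main procedure
    return True

globals_mem = bytearray(13)  # all zero, so the module is not yet initialized
calls = []
first = run_main_once(globals_mem, lambda: calls.append("main"))
second = run_main_once(globals_mem, lambda: calls.append("main"))
print(first, second, calls)  # prints: True False ['main']
```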
<P>Here is the next group of instructions. <PRE> 00CF0D7F: LSTA 00000000
00CF0D84: SYS 00
</PRE>Hit the space bar a couple of times until the current instruction is
<TT>LSTA</TT>. The <TT>LSTA</TT> instruction stands for Load STring Address. The
address is loaded onto the EES. Tap the space bar one more time. If we take
another look at the EES, we can see four bytes. We can view the string area for
this module by hitting 'S'. This is what we get. <PRE> String memory at 00CF0D00
Length: 27
Address: +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F | ASCII text:
------------------------------------------------------------+------------------
00CF0D00 <B>45 6E 74 65 72 20 61 20 - 6E 75 6D 62 65 72 3A 20</B> | <B>Enter a number: </B>
00CF0D10 <B>00</B> 0A 73 20 2B 20 35 20 - 3D 20 00 | <B>.</B>.s + 5 = .
</PRE>This gives us a dump of the contents of string memory for this module.
Starting at address 00CF0D00 we can see that the string being pointed to is