📄 chapter 7 code generation for control statements.htm
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0062)http://topaz.cs.byu.edu/text/html/Textbook/Chapter7/index.html -->
<HTML><HEAD><TITLE>Chapter 7: Code Generation for Control Statements</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1458" name=GENERATOR></HEAD>
<BODY>
<CENTER>
<H2>Chapter 7<BR>Flow-Control Statements</H2></CENTER>Central to to building a
control statement such as <TT>for</TT>, <TT>if</TT>, or any of the loops is
understanding how the branch instructions in the SAL VM work. In figure {AJUMP}
we can see the disassembled portion of a code fragment for a short function. The
code has been presented as it appears in memory (in little-endian order). We can
see how each byte is laid out starting from the first address.
<MENU><IMG
src="Chapter 7 Code Generation for Control Statements.files/AJUMP.gif">
<P><FONT face=arial size=-1><B>Figure {AJUMP}.</B> A fragment of the machine
code in memory that is generated for function <TT>foo()</TT>. The data for
each instruction is in little-endian order.</FONT> </P></MENU>Instructions in
the VM are logically processed in a continuous cycle by first reading the entire
instruction, and then processing it. By "logically", we mean that reguardless of
how the virtual machine implements a particular instruction, it is assumed to be
read in one piece from the instruction stream, i.e., the instruction with all of
its immediate parameters. Once the instruction is read, the program counter is
incremented to the byte that immediately follows the instruction. After these
two steps, the instruction is finally processed.
<P>Let us suppose that the program counter is currently at address 0008FB0C as
shown. On the next instruction cycle, the five bytes (shown in color) from
address 0008FB0C througn 0008FB10 will be read into the VM, and the PC will be
moved to address 0008FB11. The particular branch instruction that is used in
this example is a <TT>JPZ</TT>, or "JuMp if Zero". This instruction will take a
single byte from the EES. If the byte is zero, the VM will add the offset from
the immediate parameter to the current value of the PC. The offset can be either
positive or negative (2's compliment).
<P>There are three branch instructions that we need to be immediately concerned
about.
<MENU><TT><B>JMP</B></TT> This is an unconditional branch. It will not read
anything from the EES, and will immediately branch to the offset that is given
in its immediate parameter.
<P><TT><B>JPZ</B></TT> This instruction will read a byte from the top of the
EES. If the byte is zero, the offset within the immediate parameter will be
added to the current value of the PC.
<P><TT><B>JPNZ</B></TT> This does the opposite of the <TT>JPZ</TT>
instruction. It will read a byte from the EES, and if it is anything but zero
it will add the offset within the immediate parameter to the current value of
the PC.
<P></P></MENU>Basically, there are two types of jumps in compilers. There are
forward jumps, and backward jumps. A forward jump will take the PC to some
higher address, and the backward jump will take the PC to a lower address. The
easiest for a compiler to do are backward jumps. The reason is that the compiler
can easily remember the position in the code array where it wants to have the
computer jump to. Here is what we mean. Suppose for a tiny instant that there
actually exists a <TT>goto</TT> in SAL. We want to perform a backward jump like
this: <PRE> jumphere:
write "I'm in an infinite loop!\n";
goto jumphere;
</PRE>The compiler would compile this sequence of statements by remembering the
PC where the write statement started, like this: <PRE> DWORD backpc = PC;
</PRE>Once the goto was reached, the compiler would emit a <TT>JMP</TT>
instruction with the offset of backpc - (PC + 4). The code would look like this:
<PRE> DWORD backpc = PC;
RuleStatementSequence (...);
Emit1(JMP);
Emit4( backpc - (PC + 4) );
</PRE>
<P>Forward jumps are harder, since we need to know what offset to emit for the
jump, and it is not known. <PRE> goto jumphere;
write "I will never be executed...\n";
jumphere:
...
</PRE>We compensate by saving the PC of the jump offset (not of the jump
instruction), and then we emit four bytes of zeros. After processing a block of
statements, we then go back to where we emit the zeros and poke in the right
values. To make this a little easier, we have two functions:
<MENU><TT><B>DWORD ForwardJump()</B></TT> This function emits four bytes of
zeros, and returns the address of the first byte. Its use is to reserve space
for a forward jump offset.
<P><TT><B>void JumpFix(DWORD jPC)</B></TT> This function takes a value
returned by <TT>ForwardJump()</TT> and sets the offset at that address to jump
to the current spot in the code array.
<P></P></MENU>The process looks generally like this: <PRE> Emit1(JMP);
DWORD fwdjmp = ForwardJump();
RuleStatementSequence(...);
JumpFix(fwdjmp);
</PRE><!----------------------------------------------------------------------------->
<H2>7.1 If Statement</H2><!----------------------------------------------------------------------------->The
basic layout for the <TT>if</TT> statement can be seen in figure {RULE36}. As we
can see, there are only three steps involved. The real key to simplicity here is
to keep a list (the implementation of which is left to the student) of jump
fixups to exit the if statement.
<P>
<MENU><IMG
src="Chapter 7 Code Generation for Control Statements.files/RULE36.gif">
<P><FONT face=arial size=-1><B>Figure {RULE36}.</B> Code generation for rule
IfStatement.</FONT> </P></MENU>To begin with, we parse an <TT>if</TT> keyword,
and then call <TT>RuleExpression()</TT>. We initialize the input type to
<TT>booltyp</TT>. Once <TT>RuleExpression()</TT> returns, we make sure that the
type of the expression was Boolean. If not, we give an error.
<P>After parsing the <TT>then</TT>, we call <TT>RuleStatementSequence()</TT>. If
the following token is an <TT>end</TT>, then we know that we are at the end of
the <TT>if</TT> statement. Otherwise, we need to insert a jump leading out,
otherwise the VM will execute the following condition, which we do not want to
have happen.
<P>Whether or not there is an <TT>end</TT>, or an <TT>elseif</TT>, or simply an
<TT>else</TT>, we want to call <TT>JumpFix()</TT> right after this, so if the
runtime condition was false, the VM will jump to the point that we are currently
at.
<P>Since all the jumps leading out are forward, and since they all land at the
same address, it makes sense to keep them in a list, where they can be fixed up
after we are done processing the whole <TT>if</TT> statement. The general
pattern that we want to achieve is shown in figure <TT>{IFMEM}</TT>.
<MENU><IMG
src="Chapter 7 Code Generation for Control Statements.files/IFMEM.gif">
<P><FONT face=arial size=-1><B>Figure {IFMEM}.</B> Pattern of code for an
if-statement.</FONT> </P></MENU>Here is a sample of code, and the pattern of
assembly instructions that should be generated. <PRE>
program TestIf;
var
x: int;
begin
if x<1 or x>10 then
write "I said between 1 and 10, dude!\n";
elseif x<7 then
write "Too low\n";
elseif x>8 then
write "Too high\n";
else
write "You got it!!\n";
end if;
end program;
*** Module table ************************************
Module: TestIf 15:34.30 3-12-99
Size: 4 Parent: 0x00000000
var 1:b int s32 fl: 0 Offset: 4 pass by val
*** DISASSEMBLY *************************************
ENTR 00000000
LGA 00000004
TS
SJPZ begin
RTN
begin:
LGD 00000004 if x<1 or x>10 then
LID 00000001
LESD
LGD 00000004
LID 0000000A
GTRD
ORB
JPZ NxtCnd01
LSTA 0000001E write "I said between 1 and 10, dude!\n";
SYS 00
JMP EndIf
NxtCnd01:
LGD 00000004 elseif x<7 then
LID 00000007
LESD
JPZ NxtCnd02
LSTA 0000003E write "Too low\n";
SYS 00
JMP EndIf
NxtCnd02:
LGD 00000004 elseif x>8 then
LID 00000008
GTRD
JPZ NxtCnd03
LSTA 00000047 write "Too high\n";
SYS 00
JMP EndIf
NxtCnd03:
else
LSTA 00000051 write "You got it!!\n";
SYS 00
end if;
EndIf:
RTN end program.
</PRE></TD></TR></TABLE>
<P>
<P>
<P><!----------------------------------------------------------------------------->
<H2>7.2 Pre-Test Loop</H2><!----------------------------------------------------------------------------->The
pre-test loop is much simpler than the <TT>if</TT> statement. It makes use of a
forward jump to make the VM skip over the body if the condition fails, and a
backward jump to take the VM back to the conditional part after each iteration
of the body.
<P>The only tricky part is to remember to make sure we test for true conditions
for <TT>while</TT> loops and false conditions for <TT>until</TT> loops. This can
be done by using a variable to store the appropriate op-code, as shown in figure
{RULE37}.
<P>
<MENU><IMG
src="Chapter 7 Code Generation for Control Statements.files/RULE37.gif">
<P><FONT face=arial size=-1><B>Figure {RULE37}.</B> Code generation for rule
PreTestLoop.</FONT> </P></MENU>In the first step, we want to check for a
<TT>while</TT> or an <TT>until</TT>. If we are doing an <TT>until</TT> loop then
we want to test for a false condition, and we emit a <TT>JPNZ</TT>. Otherwise we
are testing for a true condition, and we emit a <TT>JPZ</TT>We use a variable
called <TT>JumpOp</TT> to keep track of this. Then we save the PC for a backward
jump.
<P>Next, we call <TT>RuleExpression()</TT>. Just like with the <TT>if</TT>
statement, we pre-initialize the input type to a Boolean. If the type returned
is not Boolean, then we give an error.
<P>Next comes the crucial part, and do a forward jump. After emitting the
op-code that we previously saved, we call <TT>ForwardJump()</TT> and save the
address returned.
<P>After calling <TT>RuleStatementSequence()</TT>, we make an unconditional jump
that takes us back to the start of the condition for the loop. After emitting
the branch, the code for the loop is finished, and we can go back and fix up the
forward jump so that when the condition fails, we can jump out of the loop to
the current spot in the code array. The pattern that we would like to achieve
looks like figure {PTMEM}. The patterns for both types of pre-test loops are
shown.
<MENU><IMG
src="Chapter 7 Code Generation for Control Statements.files/PTMEM.gif">
<P><FONT face=arial size=-1><B>Figure {PTMEM}.</B> Patterns of code for both
types of pre-test loop.</FONT> </P></MENU>Here is a sample of code, and the
pattern of assembly instructions that should be generated. <PRE>
program TestWhile;
const A = 10;
var
b: int;
begin
b := 0;
while b < A do
b := b+1;
loop;
end program.
*** Module table ************************************
Module: TestWhile 15:34.30 3-12-99
Size: 4 Parent: 0x00000000
cnst 1:A int s32 fl: 010
var 1:b int s32 fl: 0 Offset: 4 pass by val
*** DISASSEMBLY *************************************
ENTR 00000000
LGA 00000004
TS
SJPZ begin
RTN
begin:
LGA 00000009 b := 0
LID 00000000
SSD 00000000
StartLoop:
LGD 00000009 while b < A do
LID 0000000A
LESD
JPZ EndLoop
LGA 00000009 b := b + 1
LGD 00000009
LID 00000001
ADDD
SSD 00000000
JMP StartLoop: loop;
EndLoop:
RTN end program.
</PRE>
<P>
<P>
<P><!----------------------------------------------------------------------------->
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -