📄 chapter 7 code generation for control statements.htm

📁 英文版编译器设计：里面详细介绍啦C编译器的设计
💻 HTM
📖 第 1 页 / 共 5 页
字号:
<H2>7.2 Post-Test Loop</H2><!----------------------------------------------------------------------------->The 
post-test loop is the simplest of all, since it involves a single backward jump. 
We start by storing the value of the current PC, and then calling 
<TT>RuleStatementSequence()</TT>. 
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/RULE38.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE38}.</B> Code generation for rule 
  PostTestLoop.</FONT> </P></MENU>We then detect whether we are doing a 
<TT>while</TT> loop or a <TT>until</TT> loop. We keep track of this the same way 
as with a pre-test loop, by saving the proper op-code in a variable. If we are 
in a <TT>while</TT> loop, then we need to test for a true condition, so we save 
a <TT>JPNZ</TT>. Otherwise, we are in an <TT>until</TT> loop, and we are trying 
to detect a false condition; thus we save a <TT>JPZ</TT>. Notice that this 
pattern is somewhat reversed from the pre-test loop.
<P>Next we call <TT>RuleExpression()</TT>. Just like before, we initialize the 
input type to Boolean, and verify that the output type is also Boolean. If not, 
we give an error. Finally, we emit the backward jump using the op-code that we 
previoiusly stored.
<P>There is an alternate form of post-test loop, which has no condition. This is 
an infinite <TT>do</TT> loop. If this is the case, then we emit an unconditional 
branch, i.e., a <TT>JMP</TT> instruction.
<P>
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/PSTMEM.gif">
  <P><FONT face=arial size=-1><B>Figure {PTMEM}.</B> Patterns of code for both 
  types of post-test loop.</FONT> </P></MENU>Here is a sample of code, and the 
pattern of assembly instructions that should be generated. <PRE>      program TestRepeat;

        const A = 5;

        var
          b: int;

      begin
        b:= 0;

        do
          b:= b+1;
        loop until b &gt; A;

      end program.


      *** Module table ************************************

      Module: TestRepeat  15:42.23  3-12-99
      Size: 4 Parent: 0x00000000
        cnst   1:A              int      s32    fl:   05
         var   1:b              int      s32    fl:   0	Offset: 4 pass by val


      *** DISASSEMBLY *************************************

        ENTR  00000000
        LGA   00000004  
        TS    
        SJPZ  begin  
        RTN

      begin:
        LGA   00000009          b := 0
        LID   00000000  
        SSD   00000000  

      LoopStart:                do
        LGA   00000009            b := b + 1
        LGD   00000009  
        LID   00000001  
        ADDD
        SSD   00000000

        LGD   00000009          loop until b &gt; A
        LID   00000005  
        GTRD
        JPZ   LoopStart
        
        RTN                   end program.
</PRE>&nbsp;
<P>&nbsp;
<P>&nbsp;
<P><!----------------------------------------------------------------------------->
<H2>7.3 For Loop</H2><!----------------------------------------------------------------------------->The 
<TT>for</TT> loop in SAL is a little more involved than any the previous 
grammatical constructs. This is true for any language. Before we dive right in 
to how it is implemented, it is important to discuss a little background 
philisophy and theory--there are many ways to build a for-loop. Not only do 
for-loops differ in syntax from language to language, but they also differ in 
behavior.
<P>More than likely, the for-loop evolved out of the common practice of using a 
counter variable with a loop: <PRE>      i:= 1;                      // Initialize i
      do
        write "i = ", i, nl;      // Do something with i
        i:= i + 1;                // Increment i at the end of the loop
      loop until i &gt; 10;          // Check to see if i is out of bounds
</PRE>Programmers quickly discovered the common pattern, and invented a 
shorthand form: <PRE>      for i:= 1 to 10 do
        write "i = ", i, nl;
      loop;
</PRE>There are two general types of for-loop among procedural languages. The 
most widely-known is due to the popularity of C and C++, and is basically a 
glorified while-loop. This is a class of loop that terminates by evaluating an 
expression. Each iteration (including the first) is performed while a condition 
is true. Commonly, a loop in C looks like this: <PRE>      for (i=0; i&lt;10; i++)
        /* do something here... */
</PRE>There are three parts in the patentheses separated by semicolons. The 
first part is an expression list, where each expression is separated by commas. 
Thus, we can initialize any number of variables. <PRE>      for (i=0, j=0, k=0; ...; ...)
</PRE>The next portion is the stopping condition. It is a single expression that 
is evaluated prior to each time the body of the loop is executed. The condition 
is arbitrary, and doesn't even have to have anything to do with the counter 
variable(s). <PRE>      for (i=0; WhileISaySo(); i++)
</PRE>The loop will execute while the condition evaluates to some non-zero value
<P>The third part is for incrementing one or more counters. It is executed after 
the body of the loop on each successful iteration. Again, this is an expression 
list, and we can modify multiple variables. <PRE>      for (i=0, j=0; ...; i++, j++)
</PRE>All parts of the C for-loop are optional, and we can even do an infinite 
loop that has nothing: <PRE>      for (;;)
        /* Do this infinitely!!! */
</PRE>In reality, we don't even have to use C's for-loop to increment. We can 
use it to traverse a linked list: <PRE>      for (current = liststart; current!=NULL; current = current-&gt;next)
</PRE>C's for-loop is a teriffic example of this particular class of for-loop. 
Its distinguishing characteristic is that the terminating condition is an 
expression that is evaluated prior to each time the body is executed, and must 
be true for the loop to keep iterating. These two loops are equivalent in C, and 
function identically: <PRE>      /* Loop 1 */
      for (i=0; i&lt;10; i++)
        ...
        
      /* Loop 2 */
      i = 0;
      while (i&lt;10) {
        ...

        i++;
      }
</PRE>SAL for-loops are different, and they follow a different convention. Most 
significantly, the starting value, the terminating value, and the step value are 
evaluated <I>once</I>, and are stored internally (on the procedure stack). In 
other words, the loop remembers the values of the expressions for the starting 
value, stopping value, and the step value (if there is one), and they remain 
constant throughout execution of the loop. This is significantlly different from 
the C counterpart. This can confusing if it is not understood. This loop will 
count evenly from 1 to 10, even though the variables in the stop expression and 
step expression are changed in the body of the loop.
<P><PRE>      j:= 1;
      k:= 1;

      for i:= 1 to j by k do
        write "i = ", i, nl;

        j:= j + 1;
        k:= k + j;
      loop
</PRE>In short, the expressions for the starting value, stopping value, and step 
value are evaluated only once at the start of the loop, and remain untouched 
throughout the execution of the loop.
<P>The <TT>for</TT> loop in SAL is performed by two instructions in the VM, 
called <TT>FOR</TT> and <TT>NEXT</TT>. The <TT>FOR</TT> instruction takes two 
immediate parameters. The first is a byte that designates the type of data that 
the instruction will work with. It can be any of the type bytes, e.g., _u8, 
_s16, _f32, etc. The second parameter is a jump offset. This is used if the 
starting conditions of the loop are such that the body should never be executed. 
The way this works depends on the sign of the step value. Assuming that the step 
value is positive, the instruction will make sure that the value of the starting 
expression is less than the value of the stopping expression. If the step value 
is negative, the instruction makes sure that the starting expression is greater 
than the stopping expression. If this test fails, then the jump is taken, and 
the body of the loop is never executed. This instruction is diagramed in figure 
{IFOR}.
<P>
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/IFOR.gif">
  <P><FONT face=arial size=-1><B>Figure {IFOR}.</B> Description of the 
  parameters for the FOR instruction.</FONT> </P></MENU>The <TT>NEXT</TT> 
instruction works in a similar manner. It takes two immediate parameters: a type 
parameter and a jump offset to the beginning of the loop body. This instruction 
reads the step value from the EES, adds it to the value at the address of the 
index variable. If the value of the index variable has moved beyond the stop 
value, then this instruction will clean up the EES, and end the loop. 
<P>
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/INEXT.gif">
  <P><FONT face=arial size=-1><B>Figure {INEXT}.</B> Description of the 
  parameters for the NEXT instruction.</FONT> </P></MENU>These two instructions 
bracket the code for the body of the loop. In spite of its inherent complexity, 
setting up a <TT>for</TT> loop for the SAL compiler is not that bad. We can see 
the process in figure {RULE39}.
<P>
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/RULE39.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE39}.</B> Code generation for rule 
  ForLoop.</FONT> </P></MENU>The first thing that we do is generate the address to 
the index variable. Notice that this can be any expression that is an lvalue. We 
tell <TT>RuleIdentExpr()</TT> to generate an lvalue, and we store the type that 
it returns in a temporary variable called <TT>LType</TT>.
<P>The next thing to do is to generate the starting and stopping values. We call 
<TT>RuleExpression()</TT> twice. Each time we do, we prime <TT>RType</TT> with 
the value in <TT>LType</TT>, and make sure that the value returned is still the 
same. If not, we generate an error.
<P>Next, we need to generate the step value. If one is present, i.e., there is a 
<TT>by</TT> keyword, then we go ahead and call <TT>RuleExpression()</TT> one 
more time. Otherwise, the return type defaults to +1, and we emit it as a 
constant.
<P>At this point, we have loaded the parameters for the <TT>FOR</TT> and 
<TT>NEXT</TT> instructions, and we are ready to begin the loop. Emitting the 
<TT>FOR</TT> instruction involves a forward jump. We call <TT>Emit1()</TT> to 
emit the <TT>FOR</TT>, and then once again to emit the byte for 
<TT>LType.subtype</TT>. Then we do the forward jump.
<P>Before calling <TT>RuleStatementSequence()</TT>, we save the PC for a 
backward jump, which will be done by the <TT>NEXT</TT> instruction. Once 
<TT>RuleStatementSequence()</TT> returns, we call <TT>Emit1()</TT> to emit the 
<TT>NEXT</TT> instruction, and then once again to emit the byte for 
<TT>LType.subtype</TT> (Just like with the <TT>FOR</TT> instruction). After 
that, we emit the offset for the backward jump by a call to <TT>Emit4()</TT>. 
Finally, we fixup the forward jump that belonged to the <TT>FOR</TT> instruction 
at the beginning of the body.
<P>We can see the pattern in memory that the <TT>for</TT> loop takes on in 
figure <TT>FORMEM</TT>. 
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/FORMEM.gif">
  <P><FONT face=arial size=-1><B>Figure {FORMEM}.</B> </FONT></P></MENU>Here is an 
example in SAL, and a dissasembly of the code that is generated. <PRE>
      program TestFor;

        const A = 0;
        const B = 10;
        const C = 2;

        var
          x, y: int;

      begin
        y:= A;

        for x := A to B by 2 do
          y := y + x;
        loop;
      end program.


      *** Module table ************************************

      Module: TestFor  15:34.33  3-12-99
      Size: 8 Parent: 0x00000000
        cnst   1:A              int      s32    fl:   00
        cnst   1:B              int      s32    fl:   010
        cnst   1:C              int      s32    fl:   02
         var   1:x              int      s32    fl:   0	Offset: 4 pass by val
         var   1:y              int      s32    fl:   0	Offset: 8 pass by val


      *** DISASSEMBLY *************************************

        ENTR  00000000
        LGA   00000004  
        TS    
        SJPZ  begin  
        RTN

      begin:
        LGA   0000000D          y : = A
        LID   00000000  
        SSD   00000000  

        LGA   00000009          for x:= A to B by 2 do
        LID   00000000  
        LID   0000000A  
        LID   00000002  
        FOR   _s32  SkipOver  
      
      BodyStart:
        LGA   00000008            y := y + x
        LGD   00000008  
        LGD   00000004  
        ADDD
        SSD   00000000  

        NEXT  _s32  BodyStart   loop
      SkipOver:
        
        RTN
</PRE>&nbsp;
<P>&nbsp;
<P>&nbsp;
<P><!----------------------------------------------------------------------------->
<H2>7.4 Select Statement</H2><!----------------------------------------------------------------------------->This 
rule is perhaps the most involved of all the rules. It requires the construction 
of a case table within the code array. Let's take a look at the <TT>CASE</TT>
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -