📄 chapter 7 code generation for control statements.htm

📁 英文版编译器设计：里面详细介绍啦C编译器的设计
💻 HTM
📖 第 1 页 / 共 5 页
字号:
instruction for a moment. 
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/CASEINS.gif">
  <P><FONT face=arial size=-1><B>Figure {CASEINS}.</B> A closer look at a 
  <TT>CASE</TT> instruction.</FONT> </P></MENU>It is a complex instruction with 
many parts. <TT>CASE</TT> takes two immediate parameters from the EES. The first 
is a byte containing the type of the switch value. This is important, since this 
instruction works with many different types. The second is word that tells how 
many elements are in the case table, not counting the trivial accept/reject 
entry (which comes first). After this line, we have the case table Each item in 
the case table consists of three fields. The first two are an upper and lower 
boundary in a range. The third is a jump offset. We can see a closer look at a 
single case table item in figure {CASEITM}
<P>
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/CASEITM.gif">
  <P><FONT face=arial size=-1><B>Figure {CASEITM}.</B> A single item in a case 
  table. As we can see, once this table item has been looked at by the VM, the 
  PC will be pointing to the byte that immediately follows the entry. If the 
  switch value is within the upper and lower boundaries, the offset will be 
  added to the current PC.</FONT> </P></MENU>Here is how this instruction works. 
The first line of the case table contains the extreme upper and lower boundaries 
of the case. The <TT>CASE</TT> instruction first compares the switch value on 
the EES to the upper and lower boundaries in the first entry. If the value is 
outside these boundaries then the VM will jump to the offset that is contained 
in this entry. This offset may be either an <TT>else</TT> clause, or a jump to 
outside the <TT>select</TT> statement if no <TT>else</TT> clause exists.
<P>If the switch value is within the upper and lower boundaries, then the VM 
will remember the offset of the <TT>else</TT> clause, and will begin comparing 
the switch value to each range in the table. If the switch value is within a 
range in the table then the VM will jump to the corresponding offset. If the 
switch value does not match any of the ranges in the table, then the VM will 
jump to the offset of the <TT>else</TT> clause.
<P>All jump offsets in the case table are relative to the byte immediately 
following the jump offset. In figure {CASEITM} we can see the lower and upper 
boundaries followed by the jump offset for the block that corresponds to a 
particular condition. The offset for the jump is relative to the address of the 
byte immediately following the entry. As the VM considers an entry in the table, 
it reads the entire entry starting where the PC is currently located. If the 
condition matches, then the offset that was read is added directly to the PC. 
This will take the VM to where the clause for that particular case begins.
<P>So, this is our goal: to emit a <TT>CASE</TT> instruction, and a proper case 
table. Let's see how this can be done. There are actually three rules involved 
in making this statement work. We can start by looking at rule SelectStatement.
<P>
<H3>7.4.1 Rule SelectStatent</H3><!----------------------------------------------------------------------------->In 
figure {RULE42} we have a diagram of the select statement as it is implemented 
in the SAL compiler. 
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/RULE42.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE42}.</B> Code generation for rule 
  SelectStatement.</FONT> </P></MENU>We start by allocating an array of 
<TT>ctabType</TT> records. The constant <TT>MAXCASES</TT> is currently defined 
to 100, but that can be changed. Here is their definition, for reference: <PRE>      #define MAXCASES 100
      typedef struct {
         QWORD lo, hi;       // lo and hi are range in case statement
         DWORD StartPC;      // pc at begining of code segment for case
         DWORD JumpFix;      // pc of jumpfix at the end of the case
      } ctabType;
</PRE>Once we have set up the case table, we call <TT>RuleExpression()</TT>, and 
pass in a type variable. The type returned by <TT>RuleExpression()</TT> is used 
to construct the case table at the end of this rule.
<P>Next, because of the way the <TT>CASE</TT> instruction works, we need to put 
it after all of the cases. The <TT>CASE</TT> instruction and the case table are 
placed together in memory. Since we do not have data for the case table yet (as 
it has not been compiled), we have to emit the <TT>CASE</TT> instruction at some 
later time. We will do this after we are done compiling our select statement. At 
this time, all we do is a forward jump. Later, we will fix this up to take the 
VM to the <TT>CASE</TT> instruction.
<P>The algorithm that we present here uses the zeroth element of our table as 
the extreme upper and lower boundaries of our case array. This is consistant 
with how the <TT>CASE</TT> instruction works. Thus, when we begin making entries 
in the case table, we start at entry one. We have a count, called 
<TT>cindx</TT>, that we initialize to 1, and an additional flag called 
<TT>first</TT> that we initialize to zero.
<P>We then call rule Case, and pass in <TT>cindx</TT>, <TT>first</TT>, the type 
of the switch value (returned from <TT>RuleExpression()</TT>), and our case 
table that we allocated earlier.
<P>We loop at this point and process cases.
<P>Once we are done processing cases, we are at a decision point in the rule. 
There can be an else part, or we can encounter the end of the select statement. 
If there is no else part, then we set the <TT>StartPC</TT> and <TT>JumpFix</TT> 
fields of element zero in our case table to -1. We will see why we do this in a 
bit. Otherwise, we set the <TT>StartPC</TT> field of the zeroth element to the 
current PC, and call rule StatementSequence for the last time. When we exit, 
then we emit a <TT>JMP</TT> instruction, and take a forward-jump reference, and 
save it in <TT>ctab[0].JumpFix</TT>
<P>At this point, we have finished compiling the select statement, and we are 
now ready to emit the <TT>CASE</TT> instruction with its extensive table. Before 
doing so, we do one last step: we go back and fix up the jump that we did at the 
very beginning. We call <TT>JumpFix()</TT>, and pass in <TT>jPC</TT>.
<P>Finally, we emit our <TT>CASE</TT> instruction and its corresponding table 
that we have so nicely built, and as a final measure, we go back and perform 
fixups for all our cases. <PRE>      //*** Jump to CASE ********************
      JumpFix (jPC);   // case table starts here
      Emit (CASE, CType.subtype, cindex-1);

      //*** Emit the table ******************
      for i:= 0 to cindex-1 do
        select CType.subtype from     //*** Emit a range
          case _s8:
          case _u8:
            Emit1 (card8(ctab[i].lo));
            Emit1 (card8(ctab[i].hi));

          case _s16:
          case _u16:
            Emit2 (card16(ctab[i].lo));
            Emit2 (card16(ctab[i].hi));

          case _s32:
          case _u32:
          case _f32:
            Emit4 (card32(ctab[i].lo));
            Emit4 (card32(ctab[i].hi));
            break;

          case _s64:
          case _u64:
          case _f64:
            Emit8 (card64(ctab[i].lo));
            Emit8 (card64(ctab[i].hi));

          else
            write "Select for this type not implemented\n";
        end select;

        if i&gt;0 then                   //*** Emit the range's jump offset
          Emit4(ctab[i].StartPC - PC - 4);
        else
          if ctab[i].StartPC = 0FFFFFFFFh then   // If there is no else, do a forward jump
            ctab[i].JumpFix:= ForwardJump();     //   out of the select statement
          else                                   // Else emit a jump to the else clause
            Emit4(ctab[i].StartPC - PC - 4);
          end if;
        end if;
      loop;

      //*** Fix up all jumps to exit *********
      for i:= 0 to cindex-1 do
        if ctab[i].JumpFix &lt;&gt; 0FFFFFFFFh then
          JumpFix (ctab[i].JumpFix);
        end if;
      loop;
</PRE>Above is the code to emit each item in the case table, and perform the 
fixups, too. Notice that the upper and lower boundaries can be different sizes, 
depending on the type of the switching value. Notice also that we can select 
from real values, too.
<P>If we recall back to the decision point in this rule, where we had the 
optional else part, we initialized the zeroth element's <TT>StartPC</TT> to -1. 
We did this for a reason. We can see in the above listing that if we are 
emitting item zero and <TT>StartPC</TT> is set to -1, we can take that as a flag 
to emit a forward jump. This is why we did it this way.
<P>After we are done emitting the case table, we can go back and perform a fixup 
for each clause. The value -1 in the <TT>JumpFix</TT> field is used to signify 
that there is nothing to fixup. This is the case when we have a list like: <PRE>      case 1 to 5, 6, 10 to 15:
</PRE>This case label list will generate three entries: 
<MENU>1 to 5<BR>6 to 6<BR>10 to 15<BR></MENU>If we have three entries all 
pointing to the same clause, we only need to store one fixup. In fact, the 
<TT>ForwardJump()</TT> function will not return the same PC if it is called 
three times in a row. For this reason, we only use the <TT>JumpFix</TT> field of 
the range for 10 to 15, and set the <TT>JumpFix</TT> field of the other ranges 
to -1.
<P>
<H3>7.4.2 Rule Case</H3><!----------------------------------------------------------------------------->This 
is the rule that actually handles each clause of the select statement. As we can 
see by figure {RULE41}, there is relatively little to do. 
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/RULE41.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE41}.</B> Code generation for rule 
  Case.</FONT> </P></MENU>The prototype for <TT>RuleCase()</TT> is as follows: <PRE>      void RuleCase (Set Follow, ctabType *ctab, int &amp;cindx, Type CType, BOOL &amp;first)
</PRE>The first parameter is the set, FOLLOW. The second parameter is a pointer 
to the case table array that we created at the beginning of 
<TT>RuleSelectStatement()</TT>. The next parameter is an index into this table. 
We need to make this variable a reference, because it will eventually be 
incremented by <TT>RuleCaseLabelList()</TT>. The next variable is the type 
returned by the call to <TT>RuleExpression()</TT> that we made in 
<TT>RuleSelectStatement()</TT> to get our switch value. It is used by 
<TT>RuleCaseLabelList()</TT>, as well.
<P>For the most part, this rule takes its parameters and passes them through to 
<TT>RuleCaseLabelList()</TT>. Once that rule returns, it calls 
<TT>RuleStatementSequence()</TT>. Finally, before returning it emits a forward 
jump, and stores the return value from <TT>ForwardJump()</TT> in 
<TT>ctab[cindx-1].JumpFix</TT>. The reason that we use <TT>cindx-1</TT> is 
because <TT>cindx</TT> stores the position where the next entry will be stored. 
There is really nothing tricky to this rule.
<P>
<H3>7.4.3 Rule CaseLabelList</H3><!----------------------------------------------------------------------------->This 
rule is where the rest of the magic occurrs. This rule only handles the ranges 
within the case, i.e: <PRE>      case <B>1, 3, 7 to 5, 12 to 15</B>:
       //  |&lt;--- This part ----&gt;|
</PRE>The prototype for this rule is identical to that of <TT>RuleCase()</TT>, 
except for the name, <PRE>      void RuleCaseLabelList (Set Follow, ctabType *ctab, int &amp;cindx, Type CType, BOOL &amp;first)
</PRE>For the most part, these parameters were simply passed through from 
<TT>RuleCase()</TT>. The job of this rule is to make all of the entries in the 
case table. Again, each entry consists of an upper boundary and a lower 
boundary. If the numbers are in reverse order, like <PRE>      case 10 to 1:
</PRE>then they are switched to be in their proper order. If only one number 
appears, then the range starts and ends on that number. Let's have a look at the 
syntax diagram of rule CaseLabelList. 
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/RULE40.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE40}.</B> Code generation for rule 
  CaseLabelList.</FONT> </P></MENU>The first constant read is placed into 
<TT>loval</TT>, and likewise the second constant is placed into 
<TT>highval</TT>. Now, if the boolean flag <TT>first</TT> is true, then we need 
to use these values to prime our extreme upper and lower boundaries. We do this 
right away. If this is not the first condition, then we need to at least check 
to see that our extreme upper and lower boundaries contain the values within 
<TT>loval</TT> and <TT>highval</TT>.
<P>Each time we iterate through this table, we set <TT>first</TT> to 
<TT>false</TT>. At first glance it might appear more appropriate to set this 
variable back in <TT>RuleSelectStatement()</TT>, once <TT>RuleCase()</TT> 
returns. However, the problem is that <TT>RuleCaseLabelList()</TT> processes 
multiple ranges. If we did not set the value of <TT>first</TT> here, then our 
algorithm could fail to find the proper extreme upper and lower boundaries in 
some situations.
<P>After checking that our extreme boundaries are current, we make a new entry 
in the case table. At this time we set the <TT>JumpFix</TT> field to -1: our 
flag value. As we saw in the preceeding section, <TT>RuleCase()</TT> sets this 
field for us once it has called <TT>RuleStatementSequence()</TT>. Finally, we 
increment our index, and loop if there is a comma token. If not, we are done. 
<H3>7.4.4 Summary: Rule SelectStatement</H3><!----------------------------------------------------------------------------->Rule 
SelectStatement generates a pattern in memory as shown in figure {SELMEM}. On 
the right, we can see an example of the pattern of code generated for a select 
statement that has an else clause. On the left is a pattern of code generated 
for a select statement that has no else clause. They are ever so slightly 
different. The only real difference is that the pattern for the select with no 
else clause has a pointer that jumps out of the case statement. 
<MENU><IMG 
  src="Chapter 7 Code Generation for Control Statements.files/SELMEM.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE40}.</B> Pattern of code generation 
  for rule SelectStatement. We have two examples here. On the left is a select 
  statement with an else clause. On the right is a select statement without an 
  else clause.</FONT> </P></MENU>Here is a look at a sample program. There are two 
<TT>select</TT> statements, one that has an <TT>else</TT> clause, and one that 
does not.
<P><PRE>      *** Test 1 ******************************************
      program TestCase1;

        var
          i, j: int;

      begin
        i:= 4;

        select i from
          case 1 to 3, 5:
            j:= 1;

          case 9 to 10:
            j:= 2;

          else
            j:= 3;
        end select;
      end program.


      *** DISASSEMBLY *************************************

        ENTR  00000000
        LGA   00000004
        TS
        SJPZ  begin
        RTN

      begin:
        LGA   00000004            <B>i:= 4;</B>
        LID   00000004
        SSD   00000000

        LGD   00000004            <B>select i from</B>
        JMP   SelectStart

      Case01:                       <B>case 1 to 3, 5:</B>
        LGA   00000008                <B>j:= 1;</B>
        LID   00000001
        SSD   00000000
        JMP   SelectEnd

      Case02:                       <B>case 9 to 10:</B>
        LGA   00000008                <B>j:= 2;</B>
        LID   00000002
        SSD   00000000
        JMP   SelectEnd

      CaseElse:                     <B>else</B>
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -