      end program.
</PRE>We have two examples here. The first is where procedure <TT>foo()</TT> 
calls <TT>Nested2()</TT>. Because we don't have to back up at all, we use <TT>GB 
00</TT>. This effectively sets the static link to the dynamic link for 
<TT>Nested2()</TT>.
<P>The other example is when <TT>Nested2()</TT> calls <TT>Nested1()</TT>. 
Obviously, if no static link were preserved, by the time <TT>Nested1()</TT> was 
called, <TT>x</TT> would be lost two levels deep in the stack. In the general 
case, a nested procedure does not know <I>when</I> it will be called, or <I>by 
whom</I>. Conceivably, the variables of its parent procedure's scope could be 
buried anywhere in the stack. The static link makes sure that this scope is 
accessible at any level of nesting.
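<P>As a rough illustration, the second example can be modelled with a toy stack of frames. This is only a sketch: the <TT>Frame</TT> class, its field names, and the value of <TT>x</TT> are hypothetical, and it assumes that <TT>x</TT> is declared in <TT>foo()</TT> and that both nested procedures are declared directly inside <TT>foo()</TT>.

```python
class Frame:
    """One activation record; the field names are illustrative."""

    def __init__(self, name, static_link, dynamic_link, local_vars=None):
        self.name = name
        self.static_link = static_link      # frame of the enclosing scope
        self.dynamic_link = dynamic_link    # frame of the caller
        self.local_vars = local_vars or {}


def lookup(frame, var_name):
    # Follow static links outward until a scope declares the variable.
    while frame is not None:
        if var_name in frame.local_vars:
            return frame.local_vars[var_name]
        frame = frame.static_link
    raise NameError(var_name)


# foo() calls Nested2(), which calls Nested1() (assumed nesting, see above).
foo = Frame("foo", None, None, {"x": 42})
nested2 = Frame("Nested2", static_link=foo, dynamic_link=foo)
nested1 = Frame("Nested1", static_link=foo, dynamic_link=nested2)

print(lookup(nested1, "x"))  # prints 42
```

<P>In this model the caller of <TT>Nested1()</TT> is <TT>Nested2()</TT>, which has no <TT>x</TT> of its own, so only the static link leads back to the scope that declares it.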
<P><!----------------------------------------------------------------------------->
<H2>8.3 Rule ReturnStatement</H2><!----------------------------------------------------------------------------->The 
<TT>return</TT> statement works differently in procedures and functions. In 
functions it returns a value, and in procedures it does not. So, we check to see 
if we are currently compiling a function. If so, then we call rule Expression, 
and then make sure that the return type for the function matches the type 
returned by rule Expression. Finally, in all cases we emit a <TT>RTN</TT> 
instruction.
<P>
<MENU><IMG src="Chapter 8 Procedures and Functions.files/RULE45.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE45}.</B></FONT> </P></MENU>For this 
class, we will not worry about functions returning anything other than an 
intrinsic type.
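<P>The logic of this rule might look roughly like the following sketch. All of the names are assumptions made for illustration: <TT>parse_expression</TT> stands in for rule Expression, <TT>emit</TT> stands in for the code emitter, and types are represented as plain strings.

```python
class CompileError(Exception):
    pass


def rule_return_statement(current_function_type, parse_expression, emit):
    """Sketch of rule ReturnStatement.

    current_function_type is None while a procedure is being compiled.
    parse_expression emits code for the expression and returns its type.
    """
    if current_function_type is not None:
        expr_type = parse_expression()
        if expr_type != current_function_type:
            raise CompileError(
                "return type mismatch: expected %s, got %s"
                % (current_function_type, expr_type))
    emit("RTN")  # emitted in all cases


code = []
rule_return_statement("integer", lambda: "integer", code.append)  # function
rule_return_statement(None, lambda: "integer", code.append)       # procedure
print(code)  # prints ['RTN', 'RTN']
```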
<P><!----------------------------------------------------------------------------->
<H2>8.4 Rule CompilationUnit</H2><!----------------------------------------------------------------------------->In 
some ways this rule is similar to rule ProcFuncBlock(). However, there are many 
other things that also need to be done. This rule is by far the most complex, 
since it handles all of the initialization--not only of records and arrays, but 
of all imported modules as well. This rule also manages the creation of the 
object file. In all, it has nine steps. Here is a diagram of rule 
CompilationUnit showing all of the necessary steps involved. 
<MENU><IMG src="Chapter 8 Procedures and Functions.files/RULE63.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE63}.</B> Rule CompilationUnit, and 
  all the steps involved for initializing global modules and libraries.</FONT> 
  </P></MENU>The nine steps are as follows. 
<MENU><B>Step 1, Make a boolean flag.</B> This boolean flag keeps track of 
  whether the module is a program or a library. It is used in steps 2, 3, and 4.
  <P><B>Step 2, Make a call to <TT>PrepareObjFile()</TT>.</B> This function 
  opens an object file on disk (if one already exists by the name specified then 
  it is re-created). Nothing is written to the file until step 3.
  <P><B>Step 3, Make a call to <TT>ModuleInit()</TT>.</B> This function writes 
  the object file header for programs and libraries. The object file format 
  is complex and would require another chapter to explain, so the work of 
  initializing the module has been encapsulated into this one function.
  <P><B>Step 4, Begin the main procedure.</B> Every module has a main procedure, 
  whether it is a program or a library. We emit an <TT>ENTR</TT> and then, if 
  the module is a library, call <TT>MainProcInit()</TT>.
  <P><B>Step 5, Initialize all imported libraries.</B> Before we execute a 
  single line of source code, each library needs to make sure that all libraries 
  that are imported have been properly initialized.
  <P><B>Step 6, Allocate local arrays and records.</B> This operation is very 
  similar to step three for procedures and functions.
  <P><B>Step 7, <I>(Optional for libraries).</I> Call rule 
  StatementSequence.</B> This is the body of the main procedure.
  <P><B>Step 8, Emit a <TT>RTN</TT> and call <TT>PutCodeText()</TT>.</B> This is 
  similar to step 5 for rule ProcFuncBlock.
  <P><B>Step 9, Call <TT>CloseObjFile()</TT>.</B> This function performs several 
  fixups, and basically finalizes the object file. Once this function has been 
  called, the object file will be complete and ready for execution. 
</P></MENU>Now, let's go over each of these steps in greater detail. 
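<P>Before the details, the order of the steps can be summarized in a short sketch. It is not the compiler's actual code: the <TT>backend</TT> object and its method names are hypothetical stand-ins for the functions named above, and the <TT>Recorder</TT> class only logs the calls so that the order can be inspected.

```python
def compile_unit(is_program, imports, has_body, backend):
    # Step 1: the boolean flag is simply the is_program argument.
    backend.prepare_obj_file(is_program)         # step 2
    backend.module_init()                        # step 3
    backend.emit("ENTR")                         # step 4
    if not is_program:
        backend.main_proc_init()                 # libraries only
    for number in range(1, len(imports) + 1):    # step 5
        backend.emit("CX", number, 0)
    backend.allocate_globals()                   # step 6
    if has_body:                                 # step 7
        backend.statement_sequence()
    backend.emit("RTN")                          # step 8
    backend.put_code_text(0)
    backend.close_obj_file()                     # step 9


class Recorder:
    """Hypothetical backend that records every call made on it."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(*args):
            self.calls.append((name,) + args)
        return record


rec = Recorder()
compile_unit(False, ["A", "B"], False, rec)   # a library with two imports
for call in rec.calls:
    print(call)
```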
<H3>8.4.1 Boolean Flag</H3><!----------------------------------------------------------------------------->This 
is the first thing that we do, because some things are done only for programs 
and others only for libraries. The boolean flag keeps track of which kind of 
module we are compiling.
<P>
<H3>8.4.2 Calling <TT>PrepareObjFile()</TT></H3><!----------------------------------------------------------------------------->This 
function opens the object file for writing. If an object file of that name 
already exists, it is overwritten and truncated to zero length. This function 
takes a boolean flag, which is true if the module is a program, and false if the 
module is a library. This function also fixes the extension for the module. If 
the module is a program, it will be given a ".PRG" extension. Otherwise, it will 
be given a ".RLL" extension. The file is opened as write-only, binary.
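<P>A minimal sketch of this behavior follows. The function names are hypothetical, and the sketch assumes that "fixing" the extension means replacing whatever extension the module name already carries.

```python
import os


def object_file_name(module_name, is_program):
    # Assumed behavior: drop any existing extension, then add .PRG or .RLL.
    base, _old_extension = os.path.splitext(module_name)
    return base + (".PRG" if is_program else ".RLL")


def prepare_obj_file(module_name, is_program):
    # Mode "wb" is write-only binary; it creates the file if needed and
    # truncates an existing file to zero length.
    return open(object_file_name(module_name, is_program), "wb")


print(object_file_name("demo.src", True))   # prints demo.PRG
print(object_file_name("mylib", False))     # prints mylib.RLL
```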
<P>
<H3>8.4.3 Calling <TT>ModuleInit()</TT></H3><!-----------------------------------------------------------------------------><TT>ModuleInit()</TT> 
does several things. Its overall objective is to initialize the object file. 
First it writes the header with the module's time stamp. Then it writes the list 
of imports. For this reason, we cannot call this function until after all the 
<TT>import</TT> statements have been compiled. Finally, this function begins a 
code segment for storing procedures.
<P>
<H3>8.4.4 Generate Code For the Main Procedure</H3><!----------------------------------------------------------------------------->Every 
module has a main procedure, regardless of whether it is a program module or a 
library module. Even if the module is a library and it doesn't require any 
initialization, there still must be a main procedure for an importing module to 
call. This point is closely related to the next step.
<P>The first thing we do is emit an <TT>ENTR</TT> instruction. This is 
elementary. Next, if the module is a library, we call 
<TT>MainProcInit()</TT>. This function generates the following sequence of code: 
<PRE>        LGA   00000000h
        TS
        SJPZ  begin
        RTN

      begin:
</PRE>The first instruction loads the address of the init-flag onto the EES. The 
linker-loader will have initialized this to zero by the time this piece of code 
is executed. The next instruction is an atomic test-and-set: the byte at 
the address is read and then set to one. The value that was read serves as a 
boolean flag for the next instruction, a short jump (taken if the byte was zero) 
to the label <TT>begin</TT>. If the byte was non-zero, execution falls through 
to the <TT>RTN</TT>. This piece of code makes sure that the body of a library's 
main routine is executed only once.
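<P>The effect of this sequence can be imitated in a few lines. This is a simulation rather than VM code: the helper names are hypothetical, a one-element list stands in for the init-flag byte, and nothing here is actually atomic.

```python
def make_library_main(body):
    init_flag = [0]  # the linker-loader zeroes this byte before any code runs

    def test_and_set():
        # TS: read the old byte, then store one.
        old = init_flag[0]
        init_flag[0] = 1
        return old

    def main():
        if test_and_set() != 0:
            return        # jump not taken: fall through to RTN
        body()            # label begin: the real initialization

    return main


calls = []
library_main = make_library_main(lambda: calls.append("init"))
library_main()
library_main()
print(calls)  # prints ['init']
```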
<P>
<H3>8.4.5 Initialize Imported Libraries</H3><!----------------------------------------------------------------------------->The 
previous section is tied closely to this section. In this step we call the main 
routine of all library modules that we import. Notice that we take extra care to 
make sure that we do not call the main procedure for our own library. That would 
put us into infinite recursion.
<P>As an example, let us suppose that we have a program module that imports 
three libraries. The libraries would be numbered one, two, and three, in the 
order that they were imported (i.e., the order they were read by the compiler). 
The program would need to emit a <TT>CX</TT> for each of these libraries, like 
this: <PRE>      begin:
        CX    1,    0
        CX    2,    0
        CX    3,    0
</PRE>Let's take a closer look at how this step and the previous step tie 
together. Let's suppose we have a program module P that imports two libraries, A 
and B. Library A also imports library B. This leads to an import hierarchy like 
so: 
<MENU><IMG src="Chapter 8 Procedures and Functions.files/IMPRT.gif">
  <P><FONT face=arial size=-1><B>Figure {IMPRT}.</B> </FONT></P></MENU>The goal is 
to call the main routine for each module. The linker-loader will call the main 
procedure for the topmost module (i.e., the program module) only. All other 
modules must be initialized out of the goodness and courtesy of the module that 
imports them. This is why we perform this step in the compiler. Now, notice that 
since module B is imported twice, its main procedure will be called twice. B 
must not be initialized twice, and the init-flag test from the previous step 
is what prevents it.
<P>Things will happen in this order: module P will call the main procedure of 
module A and then that of B. But when module A is initialized, it will 
initialize B first. Then, when P tries to initialize B, B's main procedure 
finds its init-flag already set and returns immediately.
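<P>The same order can be reproduced with a small simulation. The helper below is hypothetical: it models each main procedure as a closure, uses an ordinary boolean in place of the init-flag, and gives the program module no guard because nothing imports it.

```python
def make_module(name, imports, log, is_library=True):
    state = {"initialized": False}

    def main():
        if is_library:
            if state["initialized"]:
                return                  # already initialized: return at once
            state["initialized"] = True
        for imported_main in imports:   # one CX per imported library
            imported_main()
        log.append(name)                # the module's own body runs last

    return main


log = []
b_main = make_module("B", [], log)
a_main = make_module("A", [b_main], log)
p_main = make_module("P", [a_main, b_main], log, is_library=False)
p_main()
print(log)  # prints ['B', 'A', 'P']
```

<P>B's body runs once even though both A and P call its main procedure.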
<P>
<H3>8.4.6 Allocate Local Arrays and Records</H3><!----------------------------------------------------------------------------->This 
is a very similar procedure to the one in section 8.1.3 for procedures and 
functions. The difference is that the arrays are initialized to point to an area 
in global memory. Global memory is set to be the size of all global variables, 
including pointers, plus data for arrays and records. If there are one or more 
global arrays or records, global memory will be laid out like this: 
<MENU><IMG src="Chapter 8 Procedures and Functions.files/GLOBMEM.gif">
  <P><FONT face=arial size=-1><B>Figure {GLOBMEM}.</B> </FONT></P></MENU>On the 
right of figure {GLOBMEM} we have an example of an array/record pointer that 
addresses data elsewhere in the global block. The algorithm to set these 
pointers is very simple: <PRE>01    ident:= table.getFirst();
02
03    while ident &lt;&gt; null do
04      if ident-&gt;getObj() == varobj and ident-&gt;getType()-&gt;isComplexType() then
05        Emit(LGA, currGlobalSize);
06        Emit (SGD, ident-&gt;getOffset());
07        currGlobalSize:= currGlobalSize + ident-&gt;getExtra()-&gt;getSize();
08      end if;
09
10      ident:= table.next();
11    loop;
</PRE>In line 1 we get the first identifier and then begin looping while the 
identifier is not null in line 3. If the identifier is a variable and it is an 
array or record, we initialize a pointer. We get the pointer's value by using 
our current global offset, which should have been incremented for each global 
variable that was declared. An <TT>LGA</TT> instruction with the current global 
offset will compute this address (line 5). We then have the VM store that 
address where the pointer is located (line 6) and increment the current global 
offset by the size of the array/record's data (line 7).
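<P>A worked example may make the offsets concrete. The sketch below restates the loop in Python with a made-up symbol table: the <TT>Ident</TT> fields, the 4-byte variable and pointer slots, and the data sizes are all assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical symbol-table entry. offset is where the variable (or the
# pointer to an array/record) lives; data_size is the size of its data area.
Ident = namedtuple("Ident", "name is_variable is_complex offset data_size")


def emit_global_pointer_init(table, curr_global_size):
    code = []
    for ident in table:
        if ident.is_variable and ident.is_complex:
            code.append(("LGA", curr_global_size))  # address of the data area
            code.append(("SGD", ident.offset))      # store it in the pointer
            curr_global_size += ident.data_size
    return code, curr_global_size


table = [
    Ident("count", True, False, 0, 0),   # simple variable, no pointer needed
    Ident("buf", True, True, 4, 40),     # array with 40 bytes of data
    Ident("point", True, True, 8, 16),   # record with 16 bytes of data
]

# Three 4-byte slots were declared, so the data area starts at offset 12.
code, total = emit_global_pointer_init(table, 12)
print(code)   # prints [('LGA', 12), ('SGD', 4), ('LGA', 52), ('SGD', 8)]
print(total)  # prints 68
```

<P>Under these assumptions the array data occupies offsets 12 through 51, the record data occupies 52 through 67, and global memory is 68 bytes in total.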
<P>
<H3>8.4.7 Call <TT>RuleStatementSequence()</TT></H3><!----------------------------------------------------------------------------->In 
this step we process the statements for the body of the module, if there is one. 
This step is also elementary. Remember that libraries do not need to have a 
formal body.
<P>
<H3>8.4.8 Finishing the Module's Main Procedure</H3><!----------------------------------------------------------------------------->This 
is accomplished in much the same fashion as for procedures and functions. The 
only difference is that the procedure number is always zero. <PRE>      Emit (RTN);
      PutCodeText (0);
</PRE>
<H3>8.4.9 Close the Object File</H3><!----------------------------------------------------------------------------->Many 
things have to be done prior to closing the object file. The function 
<TT>CloseObjFile()</TT> makes sure that these things always get done. This 
function takes four parameters: 
<MENU><B><TT>BOOLEAN ok</TT></B> This flag indicates whether or not an error 
  occurred during compilation. If an error did occur, the object file is 
  closed immediately and then deleted.
  <P><B><TT>BOOLEAN HasExports</TT></B> This flag should be true if the module 
  is a library and exports any of its declarations. This causes the function to 
  tell the symbol table to export its items.
  <P><B><TT>WORD ProcNum</TT></B> This is a count of all procedures and 
  functions, not including the main procedure.
  <P><B><TT>BOOLEAN AllSyms</TT></B> This flag is for the benefit of integrated 
  development environments and debuggers, which require all symbol 
  information. It merely causes <I>all</I> symbol information to be written out 
  to disk.
  <P></P></MENU>
<P>
<P></P></BODY></HTML>