additional parameters, which have to do with managing memory. Such parameters 
would be meaningless to a standard member function pointer. Based on this 
information, one plausible conclusion is that C++ compilers typically embed 
the meta-construction code inline within each constructor for a class. Each 
constructor could then take an additional parameter, which acts as a flag that 
either activates or bypasses the code. To date, the authors of this text are 
unaware of any other text on this subject that distinguishes between 
meta-constructing memory and calling a class constructor. It is our 
philosophy that these are two subtly different things.
<P>
<H3>9.2.4 Member and Method Access in Multiple Inheritance</H3><!-------------------------------------------------------------------------------->Member 
and method access is obtained in much the same way as was discussed in section 
9.1. However, in order to achieve some of the functionality of base classes, a 
few additional facilities are required: 
<OL>
  <LI><B>External Access.</B> The rule for designators must be modified to make 
  use of two additional facilities: 
  <OL type=A>
    <LI>A mechanism to search a class hierarchy for members and methods. 
    <LI>A mechanism to generate code for computing the offset to the base class 
    where a member or method was found. </LI></OL>
  <LI><B>Internal Access.</B> The facility that matches identifiers found by the 
  parser to declared items in the symbol table must additionally search a 
  class's base classes. This facility must also notify its caller whether or not 
  the identifier was matched to a class member. </LI></OL>These facilities merely 
expand the capabilities of the four items listed in section 9.1. 
<P>
<H4>9.2.4.1 External Access</H4><!-------------------------------------------------------------------------------->As 
we discussed in sections 9.1.1 and 9.1.2, all external access is handled through 
the rule for designators. After encountering the base identifier for a class, 
the parser eats the following dot, and then searches the tables for the 
following identifier. This process must be expanded so that while searching a 
class, the contents of all base classes are searched as well. This is a 
recursive process, and the search proceeds in a depth-first pre-order fashion. 
The members of the current class are searched, and then all base classes are 
searched in the order that they are inherited. The first match is returned.
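<P>The search described above can be sketched in a few lines of Python. This is 
only an illustration, not the compiler's actual code: <TT>ClassInfo</TT> and 
<TT>find_member</TT> are hypothetical names, and the hierarchy (D inherits B and 
then C, B inherits A) is an assumption based on the examples later in this 
section. The function returns the path to the owning class, because the offset 
computation needs that path afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class ClassInfo:
    """Hypothetical symbol-table record for one class."""
    name: str
    members: set = field(default_factory=set)
    bases: list = field(default_factory=list)  # in the order they are inherited


def find_member(cls, ident):
    """Depth-first pre-order search.

    Returns the list of classes from cls down to the class that declares
    ident, or None if no class in the hierarchy declares it.
    """
    if ident in cls.members:          # the current class is searched first
        return [cls]
    for base in cls.bases:            # then each base, in inheritance order
        path = find_member(base, ident)
        if path is not None:          # the first match is returned
            return [cls] + path
    return None


# Assumed hierarchy: D inherits B then C, and B inherits A.
A = ClassInfo("A", {"x"})
B = ClassInfo("B", set(), [A])
C = ClassInfo("C", {"q"})
D = ClassInfo("D", set(), [B, C])

path_q = [c.name for c in find_member(D, "q")]   # ['D', 'C']
path_x = [c.name for c in find_member(D, "x")]   # ['D', 'B', 'A']
```

Because the bases are visited in declaration order and the first hit ends the 
search, a name declared in two base classes always resolves to the one inherited 
first.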
<P>In SAL there is no question of ambiguity as there may be in other languages. 
In C++, some compilers will give an error if two base classes have a member or 
method bearing the same name and scope designation is not used. In SAL, the 
language specification resolves these ambiguities by inspecting the order in 
which the base classes are inherited: the base class inherited first takes 
precedence.
<P>Recall that every reference to a complex type starts by generating the 
address of its first byte of data. This is known as the base address. If a 
class's member or method is found in a superclass, the base address must be 
modified so that it points to the start of that superclass's data. At compile 
time we can use the following algorithm: 
<OL>
  <LI>A temporary value for holding the current offset is initialized to zero. 
  <LI>Get a pointer to the most derived class's info from the symbol table. This 
  pointer becomes the current class. 
  <LI>While the current class is not the owner of the desired member or method, 
  iterate along the path in the inheritance hierarchy until the class that owns 
  the member or method has been reached. On each iteration do: 
  <OL>
    <LI>Get the offset of the next base class along the path and add it to the 
    current offset. 
    <LI>If the next class is a shared instance we need to modify the base 
    address so that it points to the start of the next group. Do: 
    <OL>
      <LI>Generate code to retrieve the delta at the address of [the current 
      offset plus the base address]. 
      <LI>Generate code to add the delta to the current offset. 
      <LI>Generate code that adds that result to the base address. 
      <LI>Reset the current offset to zero. </LI></OL>
    <LI>Get the next base class along the path from the symbol table, and set it 
    to be the current class. </LI></OL>
  <LI><I>Final step:</I> If the current offset contains data (is nonzero) 
  generate code to add it to the base address. </LI></OL>This algorithm creates 
code that is optimal at runtime for paths that do not cross group boundaries. 
The only real catch comes when a jump must be made across a group boundary. At 
that point, the base address needs to be modified so that it points to the start 
of the next group, and then the current offset is reset to zero. The delta is 
read from the address formed by adding the current offset to the base address. 
The address of the next group is then computed by adding the delta plus the 
current offset to the base address. Notice that if the path to the member or 
method does not cross a 
shared group boundary, the code will merely compute a single offset that is 
added to the base address once the algorithm completes.
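<P>The compile-time side of this algorithm can be sketched as follows. This is 
a minimal illustration under stated assumptions, not the compiler's real code 
generator: <TT>Step</TT>, <TT>gen_base_adjust</TT>, and the pseudo-instructions 
<TT>ADD_CONST</TT> and <TT>ADD_DELTA</TT> are invented for the example, and the 
delta is assumed to be stored at, and measured from, the address reached by 
adding the current offset to the base address. Adding the current offset first 
and the delta second gives the same result as the numbered steps. The small 
<TT>run</TT> function exists only to check the generated code.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One hop along the inheritance path (hypothetical symbol-table data)."""
    offset: int           # offset of the next base class within the current class
    shared: bool = False  # True if the next base class is a shared instance


def gen_base_adjust(path):
    """Return pseudo-instructions that move the base address along path."""
    code = []
    current = 0                                    # running compile-time offset
    for step in path:
        current += step.offset
        if step.shared:
            if current:
                code.append(("ADD_CONST", current))  # base += current offset
            code.append(("ADD_DELTA",))              # base += delta stored at base
            current = 0                              # start again in the new group
    if current:                                    # final step: flush what is left
        code.append(("ADD_CONST", current))
    return code


def run(code, base, memory):
    """Tiny interpreter for the two pseudo-instructions, for checking only."""
    for op, *args in code:
        if op == "ADD_CONST":
            base += args[0]
        elif op == "ADD_DELTA":
            base += memory[base]
    return base


plain = gen_base_adjust([Step(8), Step(4)])                  # no group boundary
crossing = gen_base_adjust([Step(4), Step(8, shared=True)])  # one group boundary
```

With these made-up offsets, <TT>plain</TT> collapses to a single constant 
addition, while <TT>crossing</TT> needs one constant addition followed by one 
runtime load of the delta.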
<P>Given this method, the question might arise, "Why does the delta start 
from the beginning of the current class, instead of the beginning of the most 
derived class?" This would certainly be more efficient in our case here. 
However, this procedure would not work for virtual functions, which require an 
offset from <TT>self</TT> (i.e., the current class), and not from the most 
derived class, which could very likely be something completely different (as 
this is the very purpose for having virtual functions, in the first place).
<P>Let us go over two examples. For these, we shall refer to listing {SMI} and 
figure {SMIMEM}, and suppose the compiler encounters the following fragment of 
code: <PRE>      var
        d: D;

      begin
        <B>d.q</B>:= ...;   // Example 1
        <B>d.x</B>:= ...;   // Example 2
</PRE>In both examples, the compiler determines that <TT>d</TT> is an instance 
of class D.
<P>In the first example, the next step would be to search for a member called 
<TT>q</TT>. The search would work in pre-order, and would proceed from D to B 
and then to A. After exhausting that branch, the search would restart at class 
C, where the member <TT>q</TT> would be found.
<P>The next step would be to modify the base address so that it points to class 
C and not class A. Since the path to <TT>q</TT> does not cross a shared 
boundary, this amounts to a single offset. Once this process is complete, the 
statement can be processed as if it referenced a field of a simple record. We can now rely upon 
functionality that already exists within the compiler.
<P>For the second example, the next step would be to search for a member called 
<TT>x</TT>. Walking along the inheritance graph in pre-order, we come through B 
to A, where the member <TT>x</TT> is found.
<P>In this case, the path to the member crosses a shared boundary. In order to 
move the base pointer to the start of the instance of A, we first move it to the 
edge of the boundary, i.e., to the class containing the shared link. Notice how 
the algorithm listed above refers to a running total of the current offset. 
That means that if there had been one or more classes between D and B, the 
offset would take those into account, too. The compiler can add the offset to 
the base address (which will already be on the EES) by emitting a <PRE>      LSA offset
</PRE>Once we add the offset to the base address, we need to get the delta that 
points to the start of the next shared group (ignoring the init flag for the 
moment). The delta will be stored at some offset (which we will call 
deltaoffset) from the start of the class. Before we try to retrieve it, we need 
to make a duplicate of the current base address. Otherwise, it will be lost. We 
then use the copy of the base address on the EES, and load the delta from 
deltaoffset. <PRE>      COPT 4
      LSD deltaoffset
</PRE>Finally, we can add the delta to the current base address: <PRE>      ADDD
</PRE>This takes the base address over the group boundary and sets it at the 
start of the data for class A. At this point, we can proceed as if the statement 
referenced the field of a record, and rely on the compiler's existing 
functionality.
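<P>To see the four emitted instructions work together, here is a small 
simulation. It is only a sketch: the behaviour given to <TT>LSA</TT>, 
<TT>COPT</TT>, <TT>LSD</TT>, and <TT>ADDD</TT> is inferred from the description 
above rather than taken from the instruction set definition, the EES is modelled 
as a Python list, and the addresses and offsets are made-up values.

```python
def run(program, ees, memory):
    """Execute the instructions against a stack (ees) and a memory dict."""
    for op, *args in program:
        if op == "LSA":      # assumed: add a constant offset to the address on top
            ees.append(ees.pop() + args[0])
        elif op == "COPT":   # assumed: duplicate the top 4 bytes (one address)
            ees.append(ees[-1])
        elif op == "LSD":    # assumed: replace the address on top with the
            ees.append(memory[ees.pop() + args[0]])  # value stored at address + offset
        elif op == "ADDD":   # assumed: pop two values and push their sum
            rhs = ees.pop()
            lhs = ees.pop()
            ees.append(lhs + rhs)
        else:
            raise ValueError(op)
    return ees


BASE = 1000        # made-up address of d
OFFSET = 8         # made-up offset to the class that holds the shared link
DELTAOFFSET = 4    # made-up position of the delta within that class

program = [("LSA", OFFSET), ("COPT", 4), ("LSD", DELTAOFFSET), ("ADDD",)]
memory = {BASE + OFFSET + DELTAOFFSET: 40}   # the delta stored in the object
result = run(program, [BASE], memory)        # leaves the adjusted base address
```

Under these assumptions the stack ends with a single entry: the address of the 
class holding the shared link plus the delta read from it.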
<P>In both cases, we can see that the real work involved is in moving the base 
address to the start of the class containing the member variable. We could look 
at another example that makes use of scope designation, but scope designation 
only directs the initial search to the desired member or method. The algorithm 
explained here assumes that the path to the member has already been found. 
Finding the member in the first place is a simple matter of performing a 
graph-search.
<P>
<H4>9.2.4.2 Internal Access</H4><!-------------------------------------------------------------------------------->In 
section 9.1.3 we discussed certain semantic translations that could be used to 
change from the internal point of view to the external point of view. These 
items still apply. The only real necessity at this point is to determine whether 
or not an identifier is a member of a base class. Figure {IESCP} showed the 
levels of scope within a class. Inheritance inserts additional levels between a 
class and the scope where it was declared. For instance, consider Figure 
{INSCP}. 
<CENTER><IMG height=440 src="Chapter 9 Designing Classes.files/INSCP.gif" 
width=640></CENTER>Line 18 makes use of three variables. One is local, one is a 
member of class Y, and the last one is a member of base class X. When 
processing the method <TT>doit()</TT>, the compiler has to keep the locations of 
all three of these variables straight.
<P>The standard search begins with the local scope of <TT>doit()</TT>. That is 
where <TT>i</TT> is found. In order to find <TT>p</TT>, the compiler has to look 
back one level. In order to look back through base classes, a slightly different 
technique is used. Symbolically, all classes (like all records) have a block 
listing that contains the identifiers for their members and methods. In 
addition, their block listings contain references to their immediate base 
classes. This is because although the symbol table is truly a tree, inheritance 
requires some branches to join up.
<P>At compile time of the method <TT>doit()</TT>, the compiler keeps track of the 
scope of <TT>doit()</TT>, its parent class <TT>Y</TT>, and the global scope of 
program <TT>ClassScope</TT>. There is no general way to gracefully insert the 
additional scopes of a class's base classes into the chain of scope without 
destroying the integrity of the symbol table. What has been done to compensate 
for this is to have each class include an entry for each of its direct base 
classes in its block of identifiers. In other words, in addition to a list 
of data members and methods, each class's symbol block also maintains a list of 
base classes. A general searching mechanism can then be implemented as follows: 
<P>
<OL>
  <LI>Get the topmost scope from the scope stack. This will become the current 
  scope. 
  <LI>While the current scope is not null, do: 
  <OL>
    <LI>Search the current scope for a matching identifier. If found, return the 
    identifier. 
    <LI>If the current scope is a class: while the current class has un-searched 
    base classes, recurse to step A (depth-first pre-order) for each un-searched 
    base class. 
    <LI>Go to the next prior scope in the scope stack. If no levels remain, set 
    current scope to null. </LI></OL>
  <LI>Return an error that the identifier was not found. </LI></OL>Once the 
identifier has been found, if it was a member of the parent class or one of the 
parent's superclasses there needs to be some notification. This can be in the 
form of a boolean parameter (passed by reference) for the search routine. If an 
identifier is determined by the search routine to be a member of the parent 
class, the next step is to modify the token stream by inserting a <TT>self</TT> 
identifier and a dot. The rest of the compiler's systems will then process the 
expression normally.
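<P>The search and the token rewrite can be sketched together. The names 
<TT>Scope</TT>, <TT>lookup</TT>, and <TT>rewrite</TT> are hypothetical, the 
boolean that the text passes by reference is returned as the second element of 
a tuple, and the member name <TT>n</TT> and the statement being rewritten are 
invented because figure {INSCP} is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Hypothetical block of identifiers; class scopes also list their bases."""
    name: str
    idents: set = field(default_factory=set)
    bases: list = field(default_factory=list)
    is_class: bool = False


def search_class(cls, ident):
    """Depth-first pre-order search of a class and its base classes."""
    if ident in cls.idents:
        return cls
    for base in cls.bases:
        owner = search_class(base, ident)
        if owner is not None:
            return owner
    return None


def lookup(scope_stack, ident):
    """Return (owning scope, is_member), searching from the innermost scope out."""
    for scope in reversed(scope_stack):
        if scope.is_class:
            owner = search_class(scope, ident)
            if owner is not None:
                return owner, True
        elif ident in scope.idents:
            return scope, False
    raise NameError(f"identifier not found: {ident}")


def rewrite(tokens, scope_stack):
    """Insert 'self' and '.' in front of every identifier that is a class member."""
    out = []
    for tok in tokens:
        if tok.isidentifier() and lookup(scope_stack, tok)[1]:
            out += ["self", "."]
        out.append(tok)
    return out


X = Scope("X", {"n"}, is_class=True)             # base class; member name assumed
Y = Scope("Y", {"p", "doit"}, [X], is_class=True)
scope_stack = [Scope("ClassScope", {"X", "Y"}), Y, Scope("doit", {"i"})]

tokens = rewrite(["i", ":=", "p", "+", "n"], scope_stack)
```

The local <TT>i</TT> is left alone, while <TT>p</TT> and the inherited 
<TT>n</TT> each gain a <TT>self</TT> and a dot, so the rest of the pipeline sees 
ordinary external accesses.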
<P>The compiler's ability to correctly determine the scope in which an 
identifier was declared is critical in this case, and the symbol table's search 
algorithm greatly facilitates this task. If we can determine that a given 
identifier is a member of the class to which the current method belongs, or of 
one of its base classes (either directly or through inheritance), then inserting a 
"<TT>self.</TT>" into the token stream makes our internal reference 
external. In other words, they are really the same problem.
<P>
<H3>9.2.5 Parsing SAL Inheritance</H3><!-------------------------------------------------------------------------------->Our 
first job is to flesh out <TT>RuleExtends()</TT>. This rule should already be 
called by <TT>RuleClassType()</TT>, as was indicated back in section 9.1.4.1. In 
figure {RULE12} we have the parser diagram for <TT>RuleExtends()</TT>. 
<MENU><IMG src="Chapter 9 Designing Classes.files/RULE12.gif">
  <P><FONT face=arial size=-1><B>Figure {RULE12}</B> </FONT></P></MENU>In figure 
{RULE20} we can see where the scope designation operator fits into the grammar. 
However, this operator is never handled directly by the parser. It is handled 
by a helper function called <TT>TraceInheritance()</TT>, which both parses and 
performs identifier lookup. This function has more to do with the symbol table 
than with parsing. 
<H3>9.2.6 Symbol Tables for SAL Inheritance</H3><!--------------------------------------------------------------------------------><TT>RuleExtends()</TT> 
is where a lot of the work for inheritance occurs. Each time we call 
<TT>RulePreDeclaredType()</TT> we make sure that the identifier returned is a 
type: <PR