<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0062)http://topaz.cs.byu.edu/text/html/Textbook/Chapter9/index.html -->
<HTML><HEAD><TITLE>Chapter 9: Designing Classes</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1458" name=GENERATOR></HEAD>
<BODY>
<CENTER>
<H1>Chapter 9<BR>Designing Classes</H1></CENTER>
<HR>
<!-------------------------------------------------------------------------------->
<H2>9.1 Lightweight Classes</H2><!-------------------------------------------------------------------------------->Lightweight 
classes are the simplest of all object-oriented structures. They lack most of 
the complex features that are common to classes, namely inheritance, 
constructors, and destructors. They do have methods that can act on the internal 
state of the class.
<P>Fortunately, much of the design of lightweight classes involves only slight 
modifications to the compiler. We will also make extensive use of another 
facility which will simplify our job considerably: the token queue.
<P>When adding class capability to a language, it is important to consider four 
different aspects: 
<OL>
  <LI>External member access, 
  <LI>External method access, 
  <LI>Internal member access, 
  <LI>Internal method access. </LI></OL>These are listed in order from 
conceptually easiest to hardest. In this book, class members are analogous to 
record fields, and methods are the procedures and functions that 
act on the members of the class. Classes will be looked at from two 
perspectives. The internal view refers to accessing the class's members and 
methods from inside a method of the same class. The external view refers to 
accessing members and methods of a class from outside that class.
<P><PRE> 1    program IntExt;
 2
 3    type 
 4      X is class
 5        A: int;
 6
 7        proc foo;
 8        begin
 9          A:= 5;     //***  INTERNAL MEMBER ACCESS
10        end proc;
11
12        proc bar;
13        begin
14          foo();     //***  INTERNAL METHOD CALL
15        end proc;
16      end class;
17
18    var
19      x: X;
20
21    begin
22      x.A:= 0;       //***  EXTERNAL MEMBER ACCESS
23
24      x.bar();       //***  EXTERNAL METHOD CALL
25    end program.
 
      <B>Listing {INTEXT}.</B>  A demonstration of internal and external access
      to members and methods.
</PRE>A class in SAL is basically a record with functions and procedures. 
Listing {INTEXT} has a class called <TT>X</TT>, which contains a single member 
and two methods. The only other real syntactic difference between a class and a 
record is the use of the <TT>class</TT> keyword instead of <TT>record</TT>. Let's 
take a look at some of the implementation issues that must be faced.
<P>
<H3>9.1.1 External Member Access</H3><!-------------------------------------------------------------------------------->This 
is the part where the compiler requires the least amount of modification. In the 
case of external member access, a class behaves exactly like a record. To 
review, 
<OL>
  <LI>The parser first encounters the base identifier, which is always a 
  pointer. The data about the base identifier is retrieved from the symbol 
  table, and its address is loaded onto the EES. At this point the pointer 
  refers to the beginning of the structure; the remaining steps resolve it 
  so that it points to the member in question. 
  <LI>Eat a dot. 
  <LI>The parser finds the name of the member and retrieves its information 
  from the symbol table. 
  <LI>After finding the member's offset, code is generated to add the offset to 
  the pointer. </LI></OL>These steps are discussed in more detail in section 6.5 
of chapter 6. Their result is an lvalue: a pointer to the member in question. 
This can be further resolved into an rvalue by loading the data at this address 
onto the EES. The only real change the compiler needs is in the symbol table, 
which must allow for a new type of structured entry: a class.
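<P>The steps above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: <TT>StructType</TT> and <TT>member_address()</TT> are hypothetical names, not part of the SAL compiler, and a plain integer stands in for the address that would sit on the EES.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical symbol-table entry for a record or class type:
// each member name maps to its byte offset from the start of the structure.
struct StructType {
    std::map<std::string, std::size_t> member_offsets;
};

// Resolve "base.member" to an lvalue (an address). The base address is
// assumed to be known already; the member is looked up by name and its
// offset is added to the base pointer.
// Returns false if the member is not declared in the type.
bool member_address(const StructType& type, std::size_t base_address,
                    const std::string& member, std::size_t& address_out) {
    auto it = type.member_offsets.find(member);
    if (it == type.member_offsets.end()) return false;
    address_out = base_address + it->second;
    return true;
}
```

<P>Nothing in this lookup depends on whether the type was declared as a record or as a class, which is why external member access needs so little new work.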
<P>
<H3>9.1.2 External Method Access</H3><!-------------------------------------------------------------------------------->Making 
this aspect work within a compiler involves a subtle piece of trickery. To 
understand it, let us go back to the early days of OOP, when OOP 
compilers were actually <I>preprocessors</I>. Historically, a C++ preprocessor 
translated a C++ program into straight C, which could then be compiled by a C 
compiler. A call to <PRE>      x.bar();
</PRE>was deftly translated into <PRE>      bar(x);
</PRE>
<P>This is one of two ways of calling class methods. More importantly, knowing 
this implicit translation gives some insight into the way classes actually 
work. In C++, every non-static class method takes an implicit first parameter 
called <TT>this</TT>. In SAL, <TT>this</TT> is called <TT>self</TT>, as in the 
language Smalltalk. In either case, it refers to the instance for 
which the method was invoked. In our example, <TT>self</TT> is a pointer to <TT>x</TT>.
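<P>To make the translation concrete, here is a hand-written sketch of what such a lowering could look like for the class in listing {INTEXT}. The function names <TT>X_foo</TT> and <TT>X_bar</TT> are hypothetical; a real preprocessor would choose its own naming scheme.

```cpp
// The class from the listing, lowered by hand: the data stays in a plain
// struct and each method becomes a free function taking an explicit
// `self` pointer as its first parameter.
struct X {
    int A;
};

// proc foo: the member access "A := 5" becomes "self->A = 5".
void X_foo(X* self) {
    self->A = 5;
}

// proc bar: the call "foo()" becomes "X_foo(self)".
void X_bar(X* self) {
    X_foo(self);
}
```

<P>With this lowering, the external call on line 24 of the listing would be written <TT>X_bar(&amp;x)</TT>.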
<P>The downside of this approach is 
that it requires a preprocessor, or a complicated look-ahead and 
token substitution procedure.
<P>A simpler process is to make use of the way the parser works with records. 
Section 9.1.1 reviews this procedure. Step one is key: parsing the base 
identifier, and generating code to fetch a pointer to it and put it on the EES. 
This step is equivalent to looking up <TT>x</TT> as a parameter, and loading a 
pointer to it.
<P>Essentially, we can make use of the same procedure from section 9.1.1, with only 
a slight modification: 
<OL>
  <LI>The parser first encounters the base identifier, which is always a 
  pointer. The data about the base identifier is retrieved from the symbol 
  table, and its address is loaded onto the EES. 
  <LI>Eat a dot. 
  <LI>The next step is for the parser and the code generator to resolve the 
  pointer (that currently points to the beginning of the structure) so that it 
  points to the member in question. The parser will find the name of the member, 
  and its information is retrieved from the symbol table. 
  <OL>
    <LI>If the item is found to be a data member, its offset is looked up and 
    code is generated to add the offset to the pointer. 
    <LI>Otherwise, if the item found is a method, the base identifier (already loaded 
    onto the EES) becomes the reference to <TT>self</TT>, and a call to the rule 
    for procedure and function calls is made. </LI></OL></LI></OL>For this procedure 
to work, only one more change is needed, and it is made to the rule for 
procedure and function calls. If the function or 
procedure being called is actually a class method, the rule should skip processing of 
the <TT>self</TT> parameter, since this will effectively have already been 
performed.
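<P>The branch in step 3 can be sketched as follows. This is a toy model, not SAL's code generator: <TT>Member</TT>, <TT>ClassType</TT>, <TT>access_member()</TT>, and the instruction strings are all invented for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

enum class MemberKind { Data, Method };

struct Member {
    MemberKind kind;
    int offset;   // meaningful only for data members
};

// Hypothetical class entry: member name -> description.
using ClassType = std::map<std::string, Member>;

// Emit pseudo-instructions for "base.name", assuming the base address has
// already been pushed. The instruction names are illustrative only.
// Returns false if the name is not a member of the class.
bool access_member(const ClassType& type, const std::string& name,
                   std::vector<std::string>& code) {
    auto it = type.find(name);
    if (it == type.end()) return false;
    if (it->second.kind == MemberKind::Data) {
        // Data member: turn the base pointer into a pointer to the member.
        code.push_back("ADD_OFFSET " + std::to_string(it->second.offset));
    } else {
        // Method: the base address already pushed doubles as `self`,
        // so the call rule must not process that parameter again.
        code.push_back("CALL " + name + " ; self already pushed");
    }
    return true;
}
```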
<P>
<H3>9.1.3 Internal Member and Method Access</H3><!-------------------------------------------------------------------------------->Making 
any sort of internal access involves another implicit translation. Refer to 
lines 9 and 14 of listing {INTEXT}. <PRE> 9    A:= 5;

14    foo();
</PRE>The implicit translation required here involves changing these lines so 
that they appear like so: <PRE> 9    self.A:= 5;

14    self.foo();
</PRE>It is easy to see that once this translation is made, the compiler can 
rely upon the existing mechanism that handles external access.
<P>In this case, making this translation through either a preprocessor or token 
substitution would be equally feasible. SAL uses token substitution. The token 
queue allows the compiler to push tokens onto the front of the queue. Given this, 
processing an internal access is a matter of detecting 
whether an identifier is a local variable, a member variable, or a global 
variable. Precedence is defined in that order. Since object-oriented languages are 
typically block-structured as well, we can easily determine the scope of all our identifiers. A 
class then becomes another enclosing block of scope. See figure {IESCP}. 
<CENTER><IMG src="Chapter 9 Designing Classes.files/IESCP.gif"></CENTER>This 
diagram shows blocks that enclose the different levels and areas of scope within 
the program in listing {INTEXT}. We can see that a class is a block of scope, 
just like procedures and functions. The compiler can tell whether it is 
working on a class method by examining the current scope to see whether 
its parent scope belongs to a class. Any identifier that is not 
found locally can then be compared to the items within the class scope. If it matches one, 
the compiler can perform a token substitution, pushing a <TT>self</TT> and then 
a dot, and then processing the result as if it were an external access.
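<P>The precedence rule (local, then member, then global) amounts to searching the enclosing scopes from the inside out. A minimal sketch, assuming each scope is simply a set of names; <TT>Binding</TT> and <TT>classify()</TT> are hypothetical and do not appear in the SAL compiler:

```cpp
#include <set>
#include <string>

enum class Binding { Local, Member, Global, Undefined };

// Classify an identifier seen inside a method body by searching the
// enclosing scopes from innermost to outermost. The first scope that
// declares the name wins, which gives the precedence described above.
Binding classify(const std::string& name,
                 const std::set<std::string>& locals,
                 const std::set<std::string>& class_scope,
                 const std::set<std::string>& globals) {
    if (locals.count(name))      return Binding::Local;
    if (class_scope.count(name)) return Binding::Member;   // needs "self."
    if (globals.count(name))     return Binding::Global;
    return Binding::Undefined;
}
```

<P>Only the <TT>Binding::Member</TT> case calls for the token substitution; locals and globals are handled exactly as before.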
<P>This implicit translation takes place in the SAL compiler at the time that an 
identifier is first encountered and looked up in the symbol table. Encountering 
a statement like <PRE>      A:= 5;
</PRE>at line 9, the compiler looks up A in the symbol table. The code in the 
compiler looks <EM>something</EM> like this: <PRE>      Ident *ident;

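      // IsMember is set when the name belongs to the enclosing class
      ident = table.Find_Backward (Token.data.id, IsMember);
      if (!ident)
         Error (ER_UNDEFINED);
      if (IsMember)
         scanner-&gt;EnqueueToken (periodsy);   // fabricate the "." that follows self
      else
         AcceptStd (identsy, Current+Follow, ER_NOIDENT);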
</PRE>Readers who find some of these functions unfamiliar should 
review the material in chapter 5. The magic of our translation 
occurs in the call to <TT>table.Find_Backward()</TT>. This function does two 
things. Under most circumstances, it takes a string (in this case, the one 
contained in <TT>Token.data.id</TT>) and returns the matching <TT>Ident</TT> from the 
symbol table. <TT>ident</TT> will contain all the necessary 
information for the compiler to proceed. If, however, 
<TT>table.Find_Backward()</TT> determines that the requested identifier belongs 
to a class, it will <I>not</I> return a pointer to that identifier, but a 
pointer to <TT>self</TT>. <I>This is the crucial step.</I> The boolean variable 
<TT>IsMember</TT> is used to detect this subtle switch. At this point, if the 
switch is made, a token containing a period is fabricated and enqueued (the 
<TT>EnqueueToken(periodsy)</TT> call in the code above). We don't need to push an additional token for the identifier, since the 
identifier is still waiting to be <I>eaten</I>: the rule has not consumed it. We have 
effectively fooled the parser into thinking that our <PRE>      A:= 5;
</PRE>is really a <PRE>      self.A:= 5;
</PRE>And that is really the crux of the matter. The rest of the parser as it 
stands can proceed without modification. This technique works equally well 
for methods and for members.
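<P>The effect of the substitution on the token stream can be shown with a small stand-alone sketch. It is deliberately simpler than the SAL code: tokens are plain strings, and both <TT>self</TT> and the period are pushed explicitly, whereas SAL obtains <TT>self</TT> from the symbol-table lookup and enqueues only the period. The name <TT>qualify_member()</TT> is hypothetical.

```cpp
#include <deque>
#include <set>
#include <string>

// If the token at the head of the queue names a class member, rewrite the
// queue so that the member is preceded by "self" and ".".
// Tokens are plain strings here for simplicity.
void qualify_member(std::deque<std::string>& tokens,
                    const std::set<std::string>& class_members) {
    if (tokens.empty() || class_members.count(tokens.front()) == 0) return;
    // push_front reverses the order, so push "." first and "self" second.
    tokens.push_front(".");
    tokens.push_front("self");
}
```

<P>Applied to the tokens of <TT>A:= 5;</TT>, the queue afterwards reads <TT>self . A := 5 ;</TT>, which the existing external-access rule can parse unchanged.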
<P>
<H3>9.1.4 Implementing Lightweight Classes</H3><!-------------------------------------------------------------------------------->Before 
proceeding, the reader should be familiar with section 6.5 of 
chapter 6, which explains the implementation of records and arrays. 
Understanding the implementation of these two items, especially records, is 
crucial to understanding the way that classes are implemented.
<P>At this stage of object-oriented functionality, the implementation work is 
light. We need two new rules: <PRE>      RuleClassType()
      RuleExtends()
</PRE>We will not talk about the implementation of <TT>RuleExtends()</TT> until 
section 9.2.5, when we discuss inheritance. For now, <TT>RuleExtends()</TT> should do 
nothing. However, we will need to define <TT>RuleClassType()</TT>. In addition 
to these two new rules, we also need to modify several other rules: <PRE>      RuleDeclareType()
      RuleMemberField()
</PRE>For now, ignore rule CtorDeclaration and rule DtorDeclaration. We will 
discuss constructors and destructors in greater detail in chapter 11.
<P>Some work with the symbol table needs to be done, namely creating instances 
of class types and variables, and adding members and methods appropriately. At 
the back end of the compiler, only a small amount of code generation work is 
required, since so much of this relies on existing functionality.
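<P>As a rough picture of the symbol-table work involved, the sketch below models a class entry that records data members with their offsets and methods by name. It is an assumption-laden illustration: <TT>ClassEntry</TT> and its functions are hypothetical, members are laid out back to back with no alignment padding, and SAL's actual symbol table may be organized quite differently.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical symbol-table entry for a class type.
class ClassEntry {
public:
    // Add a data member of the given size in bytes; returns its offset
    // from the start of the instance.
    std::size_t add_member(const std::string& name, std::size_t size) {
        std::size_t offset = total_size_;
        member_offsets_[name] = offset;
        total_size_ += size;
        return offset;
    }

    // Record a method. Methods take up no space in the instance.
    void add_method(const std::string& name) { methods_.insert(name); }

    bool has_method(const std::string& name) const {
        return methods_.count(name) != 0;
    }

    // Size of one instance, i.e. the sum of the data member sizes.
    std::size_t size() const { return total_size_; }

private:
    std::map<std::string, std::size_t> member_offsets_;
    std::set<std::string> methods_;
    std::size_t total_size_ = 0;
};
```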
<P>