<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0063)http://topaz.cs.byu.edu/text/html/Textbook/Chapter14/index.html -->
<HTML><HEAD><TITLE>Chapter 14: Pointers</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1458" name=GENERATOR></HEAD>
<BODY>
<CENTER>
<H1>Chapter 14<BR>Pointers </H1></CENTER>
<HR>

<H2>14.1 Overview </H2>
<P>This is not yet a full chapter on pointers. It explains 
how to implement pointers in SAL, and nothing else. </P>
<H2>14.2 Pointers in SAL </H2>
<P>Pointers are managed in SAL in the Type class. Class Type has a field called 
<FONT face="Courier New">plev </FONT>of type BYTE. <FONT 
face="Courier New">plev</FONT> is used to keep track of the current pointer 
level of the type. For example: <PRE>   int  // plev is 0
  ^int  // plev is 1
 ^^int  // plev is 2 ... etc. 
        
</PRE>So in the declarations <PRE>type A is ^int;  // the type identifier A has type int with a plev of 1

var 
     x: int;     // the variable ident x has type int with a plev of 0
     y: ^int;    // the variable ident y has type int with a plev of 1
     z: ^A;      // the variable ident z has type A   with a plev of 2
</PRE>
<P></P>
<P>SAL currently does not have a void type. </P>
<P>This pointer level is accessed through the following methods:<BR><PRE>BYTE Type::getPlev(); // returns the pointer level of the type
VOID Type::setPlev(BYTE lvl); // sets the pointer level of the type to lvl
</PRE><BR>You can also use these methods of an Ident to access or alter the pointer 
level of the identifier's type: <PRE>BYTE Ident::getPlev();                // returns the pointer level of the type of the identifier
VOID Ident::setPlev(BYTE lvl);        // sets the pointer level of the type of the identifier to lvl
VOID Ident::addPlev(BYTE inc); // increments the pointer level of the type of the identifier by inc
</PRE>
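<P>As a rough illustration of how these accessors fit together, here is a minimal sketch. It is not the actual SAL source: the class bodies, the private member names, and the overflow check in <TT>addPlev</TT> are assumptions, and <TT>BYTE</TT> is assumed to be an unsigned 8-bit integer.</P>

```cpp
#include <cassert>
#include <cstdint>

using BYTE = std::uint8_t;  // assumption: BYTE is an unsigned 8-bit integer

// Hypothetical, stripped-down Type: only the pointer level is modeled.
class Type {
public:
    BYTE getPlev() const { return plev_; }
    void setPlev(BYTE lvl) { plev_ = lvl; }
private:
    BYTE plev_ = 0;  // a plain (non-pointer) type has pointer level 0
};

// Hypothetical Ident that forwards pointer-level calls to its Type.
class Ident {
public:
    BYTE getPlev() const { return type_.getPlev(); }
    void setPlev(BYTE lvl) { type_.setPlev(lvl); }
    void addPlev(BYTE inc) {
        // Guard against wrapping past 255 (an assumption; the real class may not check).
        assert(type_.getPlev() + inc <= 255);
        type_.setPlev(static_cast<BYTE>(type_.getPlev() + inc));
    }
private:
    Type type_;
};
```

<P>With this sketch, a freshly created identifier starts at pointer level 0, and calling <TT>addPlev(1)</TT> once models a declaration such as <TT>y: ^int</TT>.</P>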
<H2>14.3 Declarations</H2>
<P>First, we must be able to declare pointers.&nbsp;&nbsp;We do this by making a 
simple alteration to RulePreDeclaredType: </P>
<P class=MsoNormal><SPAN 
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Times New Roman'; mso-fareast-font-family: 'Times New Roman'; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"><SPAN 
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Times New Roman'; mso-fareast-font-family: 'Times New Roman'; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"><STRONG>PreDeclaredType</STRONG><SUP>6</SUP></SPAN> 
</SPAN></P>
<P class=MsoNormal>int plev := 0;</P>
<P><IMG alt="" src="Chapter 14 Pointers.files/image001.gif"></P>
<P>Essentially, we count how many pointersy symbols appear in front of the type and 
increment the pointer level accordingly.</P>
<P>That's it.</P>
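<P>The counting step can be sketched outside the parser as follows. This is only an illustration under stated assumptions: it works on a plain string such as <TT>"^^int"</TT> rather than on the compiler's token stream, and the names <TT>DeclaredType</TT> and <TT>parseDeclaredType</TT> are hypothetical.</P>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result of parsing a declared type such as "^^int".
struct DeclaredType {
    std::string base;  // name of the underlying type, e.g. "int"
    int plev;          // number of leading '^' symbols
};

// Count the leading '^' characters, then treat the rest as the base type name.
DeclaredType parseDeclaredType(const std::string& text) {
    std::size_t i = 0;
    int plev = 0;
    while (i < text.size() && text[i] == '^') {
        ++plev;
        ++i;
    }
    assert(i < text.size());  // a base type name must follow the '^' symbols
    return DeclaredType{text.substr(i), plev};
}
```

<P>In the real rule the loop would consume pointer tokens from the scanner instead of characters, but the bookkeeping is the same: start at zero and add one per symbol.</P>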
<H2>14.4 Assignment and Type Checking</H2>
<P>With the addition of pointers, assignment and parameter passing add a level of 
complexity to type checking. Both must be not only type compatible but also 
pointer-level compatible.&nbsp; 
Before we explore this further, let's look at a special problem in 
checking type and pointer-level compatibility: the null constant.</P>
<H3>14.4.1 The <EM>null</EM> Constant</H3>
<P>When using pointers, we need a way to initialize a pointer so that it 
points to nothing. For this, we use the null constant. The value of the 
null constant is easy to determine: as in C/C++, we set it to 
0. How do we set its pointer level? It must be compatible with every 
pointer level. Because null is a special exception, we give it a special 
pointer level: we define a specific null pointer-level 
constant. This value must be unique to the null pointer, so we pick a 
number greater than the maximum pointer level, for 
example NULL_PLEV := 100. Then, when we do type and pointer-level 
checking, we make a special case for any type whose pointer level 
is NULL_PLEV, that is, the null constant. For assignment, if the rvalue pointer level 
is NULL_PLEV and the lvalue pointer level is greater than 0, we 
skip both type checking and pointer-level checking. Similarly, for 
parameter passing, if the actual parameter pointer level 
is NULL_PLEV and the formal parameter pointer level is greater than 
0, we skip both type checking and pointer-level checking. The 
constant null is defined for you in the symbol table, but you have to initialize 
its plev. You can do that in table.h. </P>
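<P>The special case for null in assignment can be sketched as a single compatibility check. This is a hedged illustration, not the compiler's code: <TT>TypeInfo</TT> and <TT>isCompatible</TT> are hypothetical names, base types are compared by name only, and the assertion that a destination is never the null constant is an assumption.</P>

```cpp
#include <cassert>
#include <string>

const int NULL_PLEV = 100;  // value taken from the example in the text

// Hypothetical pairing of a base type name with a pointer level.
struct TypeInfo {
    std::string base;
    int plev;
};

// Returns true when a value described by src may be assigned to dst.
bool isCompatible(const TypeInfo& dst, const TypeInfo& src) {
    assert(dst.plev != NULL_PLEV);  // assumption: only a source can be null
    if (src.plev == NULL_PLEV) {
        // null fits any pointer, whatever its base type or level
        return dst.plev > 0;
    }
    // otherwise both the base type and the pointer level must match
    return dst.base == src.base && dst.plev == src.plev;
}
```

<P>Because NULL_PLEV is larger than any real pointer level, the test <TT>src.plev == NULL_PLEV</TT> can never be triggered by an ordinary pointer.</P>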
<P>&nbsp;</P>
<H3>14.4.2 Rule Assignment </H3>
<P>After retrieving the lvalue and rvalue in rule Assignment,&nbsp;we check 
their types and pointer levels. If they are not compatible, we output an error message. If 
they are, we make the assignment.&nbsp; We can do this as follows 
(the code changes are in bold): </P>
<P><B style="mso-bidi-font-weight: normal"><SPAN 
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-fareast-font-family: 'Times New Roman'; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA; mso-bidi-font-family: 'Times New Roman'">Assignment<SUP>35</SUP></SPAN></B></P>
<P><IMG alt="" src="Chapter 14 Pointers.files/image007.gif"></P>
<H3>14.4.3&nbsp;Parameter Passing</H3>
<P>This is left as an exercise for the reader. One hint: it is very similar to 
the change in RuleAssignment, and&nbsp;it is made in&nbsp;VerifyParameter 
and&nbsp;VerifyTypes(Ident* ident .....).</P>
<H2>14.5 GetPointerLevel</H2>
<P>Now that we are able to declare pointers&nbsp;to variables, we&nbsp;must also 
be able to dereference them.</P>
<P>We first must implement the grammar rule GetPointerLevel.&nbsp; This function 
essentially counts and returns the number of pointersy symbols found.</P>
<P><STRONG><B style="mso-bidi-font-weight: normal"><SPAN 
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-fareast-font-family: 'Times New Roman'; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA; mso-bidi-font-family: 'Times New Roman'">GetPointerLevel<SUP>18</SUP></SPAN></B></STRONG> 
</P>
<P class=MsoNormal>int plev := 0;</P>
<P><IMG alt="" src="Chapter 14 Pointers.files/image002.gif">&nbsp;</P>
<P>&nbsp;</P>
<P>&nbsp;</P><!----------------------------------------------------------------------------->
<H2>14.6 Rule IdentExpr: For Pointers</H2>
<P><!----------------------------------------------------------------------------->This 
is one of the most important rules in the compiler. We will go over how 
this rule was implemented in chapter 6, making changes for pointers along the 
way. The changes will be italicized and colored in... let's say... <FONT 
color=green>green</FONT>.</P>Based on the type of identifier returned by 
<TT>RuleQualIdent()</TT>, we will process a variable, a constant, a type 
conversion, a function/procedure call, or an array or record. The prototype for 
this function should look like this: <PRE>      void IdentExpr ( Set Follow, BOOLEAN MakeRValue, Type &amp;RType )
</PRE>The first parameter is self-explanatory. The second parameter is a 
command, telling <TT>RuleIdentExpr()</TT> to make an rvalue or an lvalue. The 
third parameter is a reference to a type, which is filled in by this function.
<P>The first thing that we want to do in this function is get the current token 
(which we already know to be an identifier), and look it up in the symbol table. 
This is done through a call to <TT>QualIdent()</TT>. If <TT>QualIdent()</TT> is 
successful, it will return a pointer to a valid <TT>Ident</TT>. At this point, 
if the return value is <TT>NULL</TT>, we set <TT>RType</TT> to a no-type value. 
This is done by calling <TT>Type::Init()</TT>, and passing in <TT>notyp</TT> and 
<TT>nosubtyp</TT>, like so: <PRE>      RType.Init(notyp, nosubtyp);
</PRE>We then return. This should effectively be a signal to the caller that the 
identifier was not recognized. If <TT>QualIdent()</TT> returns a valid pointer, 
we initialize <TT>RType</TT> to the type of the identifier returned. This can 
also be done by a call to <TT>Type::Init()</TT>. <PRE>      RType.Init(ident);
</PRE>Once we have a pointer to an identifier in the symbol table, there are 
several levels at which we can analyze it. The topmost level tells us what the 
identifier is, whether it is a constant, a type, a variable, a procedure, or a 
function. In rule IdentExpr we split off based on this information. We can use 
the <TT>Ident::getObj()</TT> method to get the identifier's object type. Procedure and 
function calls are discussed in chapter 8. We will discuss the rest of these in 
detail, beginning with symbolic constants.
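<P>The lookup-then-dispatch structure described above can be sketched like this. Everything here is a simplified stand-in: the <TT>Obj</TT> enumeration, the map-based symbol table, and the <TT>lookup</TT> and <TT>classify</TT> functions are hypothetical, and the returned strings only name the branch that the real rule would take.</P>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical object kinds, mirroring the categories listed in the text.
enum class Obj { Const, Type, Var, Proc, Func };

struct Ident {
    std::string name;
    Obj obj;
};

using SymbolTable = std::map<std::string, Ident>;

// Stand-in for QualIdent(): returns nullptr when the name is unknown.
const Ident* lookup(const SymbolTable& table, const std::string& name) {
    auto it = table.find(name);
    return it == table.end() ? nullptr : &it->second;
}

// Names the branch that IdentExpr would take for an identifier.
std::string classify(const SymbolTable& table, const std::string& name) {
    const Ident* ident = lookup(table, name);
    if (ident == nullptr) return "notyp";  // caller sees "not recognized"
    switch (ident->obj) {
        case Obj::Const: return "constant";
        case Obj::Type:  return "type conversion";
        case Obj::Var:   return "variable";
        case Obj::Proc:
        case Obj::Func:  return "call";
    }
    assert(false && "unreachable: every Obj value is handled above");
    return "";
}
```

<P>The early return for an unknown name plays the role of initializing <TT>RType</TT> with <TT>notyp</TT> and returning to the caller.</P>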
<P><FONT color=green>
<H3>14.6.0.1 The Ampersand '&amp;' </H3>
<P>If you encounter an ampersand, treat the identifier as an lvalue. You do 
this by setting MakeRValue to false. You also need to increment RType's plev by 
one. Don't do this right away at the beginning, because it will mess 
up your plev checks within rule IdentExpr. Instead, if there is an ampersand, 
set a boolean flag, and at the very end of RuleIdentExpr, if that flag is set, 
increment RType's plev. </FONT>
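<P>The deferred increment can be sketched as below. This is an illustration only: <TT>ExprResult</TT> and <TT>analyzeIdent</TT> are hypothetical names, pointer levels are plain integers, and the actual pointer-level checks are represented by a comment.</P>

```cpp
#include <cassert>

// Hypothetical result of analyzing "&ident" or "ident".
struct ExprResult {
    int plev;         // pointer level of the resulting expression
    bool makeRValue;  // false means the identifier is treated as an lvalue
};

// identPlev: declared pointer level of the identifier.
// sawAmpersand: true when '&' preceded the identifier.
ExprResult analyzeIdent(int identPlev, bool sawAmpersand, bool makeRValue) {
    assert(identPlev >= 0);
    ExprResult result{identPlev, makeRValue};
    if (sawAmpersand) {
        result.makeRValue = false;  // taking an address needs an lvalue
    }

    // ... pointer-level checks would run here against the unmodified plev ...

    if (sawAmpersand) {
        ++result.plev;  // applied last so the checks above see the original level
    }
    return result;
}
```

<P>Keeping the flag separate from the pointer level means the checks in the middle of the rule never have to know whether an ampersand was present.</P>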
<H3>14.6.1 Symbolic Constants</H3><!----------------------------------------------------------------------------->A 
symbolic constant is one that is defined in a <TT>const</TT> declaration. 
Calling <TT>Ident::getObj()</TT> will return the value <TT>constobj</TT>.&nbsp; 
<FONT color=green><EM>We have included the textbook section below on symbolic 
constants, but&nbsp;we have only one simple change for pointers. Essentially, if 
the pointer level of the constant is the null pointer level, we want to make 
sure that RType's (or rval's) pointer level is initialized to the null pointer 
level. We must do this explicitly, as a type's pointer level is set by default to 
zero.</EM></FONT>&nbsp; Here is an example of a constant's declaration. <PRE>      program AConstant;

        const
          x = 100;

      begin
        write x;
      end program.
</PRE>All symbolic constants are rvalues; you can't take the address of a 
constant, nor can you assign anything to it. For the most part, dealing with a 
constant symbolically is the same as dealing with a constant literally. The only 
difference is that when a constant is literal we extract its value from a token. 
Here our constant is symbolic. We can cast our <TT>Ident</TT> pointer to a 
<TT>ConstantIdent</TT> pointer by calling <TT>Ident::toConstantIdent()</TT>. 
This will return a pointer that is a <TT>ConstantIdent</TT>. We can then 
retrieve a <TT>ConstRec</TT> that has stored the value by calling 
<TT>ConstantIdent::getValue()</TT>. When it comes to symbolic constants, we want 
to give ourselves as much latitude as possible in order to allow the language to 
be flexible. 
<H3>14.6.2 Variables</H3><!----------------------------------------------------------------------------->We 
know if our identifier is a variable when the call to <TT>Ident::getObj()</TT> 
yields a <TT>varobj</TT>. A variable can be either an lvalue or an rvalue. This 
does not necessarily complicate things. It just means that we use a bifurcated 
method when dealing with any type of variable, including records and arrays (we 
will talk about those later). However, as we shall see with functions, it is 
also possible to have a variable that is a parameter that is passed by 
reference. <EM><FONT color=green>We must also take into account what pointer 
level a variable is and if it is being dereferenced or not. </FONT></EM>A 
pointer <I>is</I> used for variables that are passed by reference.
<P>
<H4>14.6.2.1 A Variable's Storage Location</H4>Variables can be either passed by 
value or passed by reference. Variables that are passed by value can be stored 
in one of four different areas. Once an identifier has been found in the symbol 
table and determined to be a variable, we can query its properties to find out 
where in memory its value will be stored at runtime. A variable can be stored in 
an external module, in global memory, in local memory, or in a parent function's 
local memory (previous scope). We test the variable in this order to determine 
its location: 
<OL>
  <LI><B>In an external module.</B> If <TT>Ident::getMod()</TT> returns a value 
  that is not equal to <TT>table.ModNum</TT>, we know that the variable is in a 
  different module. When a module imports a variable from an external module, 
  the compiler has to tell the VM to look in the other module's GDA. The 
  <TT>LGx</TT> instructions perform this task.
  <P></P>
  <LI><B>In global memory.</B> The method <TT>Ident::getFuncLev()</TT> returns 
  the scope level of an identifier. If the value returned is zero, then we know 
  that the variable is global, that is, one of the global variables of the 
  current module. These variables are found at some offset from the G register 
  (the GDA), and the <TT>LGx</TT> instructions are used.
  <P></P>
  <LI><B>In local memory.</B> These variables are found on the local stack at 
  some offset from the L register. Local variables always belong to the current 
  function, and die when the function exits. A variable is local if 
  <TT>table.curFuncLevel()</TT> returns the same value as 
  <TT>Ident::getFuncLev()</TT>.
  <P></P>
  <LI><B>In a previous scope.</B> If none of the other conditions apply then the 
  variable is assumed to reside in a parent function's scope. A nested procedure 
  can see all the variables of the parent procedure. If a procedure called 
  <TT>foobar()</TT> has two nested procedures called <TT>foo()</TT> and 
  <TT>bar()</TT>, and <TT>foo()</TT> calls <TT>bar()</TT>, <TT>bar()</TT> still 
  needs to be able to see the local variables of <TT>foobar()</TT>. However, by 
  the time <TT>bar()</TT> is called, <TT>foobar()</TT>'s local variables are 
  lost deep in the stack. The variables are accessed through a "static link" 
  (explained in chapter 8). Basically, when <TT>foo()</TT> calls <TT>bar()</TT> 
  it leaves a pointer in the call frame to the parent procedure (i.e., 
  <TT>foobar()</TT>).
  <P>Variables are retrieved from previous scopes using the <TT>GB</TT> (which 
  stands for Get Base) instruction. This instruction takes a single byte 
  parameter that tells the VM how many jumps back it needs to go. We use the 
  <TT>Ident::getFuncLev()</TT> method to get the value of this parameter. The 
  <TT>GB</TT> instruction basically returns the value of L for the parent 
  function where the variable lives. This address is placed on the EES, and we 
  can calculate an offset based on this to get to the variable.
  <P></P></LI></OL>
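<P>The four tests in the list above can be sketched as one function that applies them in order. This is a simplified model, not the compiler's code: <TT>VarInfo</TT>, <TT>Storage</TT> and <TT>locate</TT> are hypothetical names, and the module number and function level are passed in as plain integers instead of being read from the symbol table.</P>

```cpp
#include <cassert>

enum class Storage { External, Global, Local, PreviousScope };

// Hypothetical snapshot of the identifier properties the text queries.
struct VarInfo {
    int mod;      // module number, as Ident::getMod() would return
    int funcLev;  // scope level, as Ident::getFuncLev() would return
};

// Applies the four tests in the order given in the list above.
Storage locate(const VarInfo& v, int currentMod, int currentFuncLev) {
    assert(v.funcLev >= 0 && currentFuncLev >= 0);
    if (v.mod != currentMod) return Storage::External;        // other module's GDA
    if (v.funcLev == 0) return Storage::Global;               // offset from G
    if (v.funcLev == currentFuncLev) return Storage::Local;   // offset from L
    return Storage::PreviousScope;                            // reached via static link
}
```

<P>The order matters: an imported variable also has scope level zero, so the module test has to come before the global test.</P>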
<P>Variables that are passed by reference (or simply, references) are 
much simpler, and are stored in only one of two ways. They are either local or 
within a previous scope. By definition, a reference can only be a function 
parameter, and as such, a reference must be either local or within a parent 
procedure's scope. In all cases, a reference is a single pointer--in other 
words, there is no such thing as a reference to a reference to a reference... If 
a reference is passed into a procedure that also takes its parameter as a 
reference, then we give the reference that we already have. We would not give a 
reference to the reference.</P>
<P><FONT color=green>Pointers, on the other hand, are not restricted to 
being function parameters,&nbsp;so they can be dereferenced at any 
time. Before we can execute any operations on a variable we must know whether or 
not we are dereferencing it. We can do this by a call to RuleGetPointerLevel(). 
We use two local variables in our compiler to keep track of important 
information:</P><PRE>   deref : BYTE // how many levels we are dereferencing the variable
   currPtrLevel : BYTE // the calculated pointer level of the variable after dereferencing
</PRE>We can make use of RuleGetPointerLevel() in the following manner: <PRE>		//*** GET POINTER LEVEL ************************************
		deref := RuleGetPointerLevel(follow, last);
		if (deref &gt; rval-&gt;getPlev())
			SemanticErrorMsg("Invalid dereference for ident %s, max is %d",
				ident-&gt;getName(), rval-&gt;getPlev());
		// pointer level that remains after dereferencing
		currPtrLevel := ident-&gt;getPlev() - deref;
		rval-&gt;setPlev(rval-&gt;getPlev() - deref);
</PRE></FONT>We can now determine which operations we can use on the variable.
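<P>The dereference check can be isolated into a small function, sketched here under stated assumptions: <TT>applyDeref</TT> and <TT>INVALID_DEREF</TT> are hypothetical names, pointer levels are plain integers rather than <TT>BYTE</TT>, and an invalid request is reported through a sentinel return value instead of a call to the compiler's error routine.</P>

```cpp
#include <cassert>

const int INVALID_DEREF = -1;  // hypothetical sentinel for "too many dereferences"

// Returns the pointer level left after dereferencing `deref` times,
// or INVALID_DEREF when the request exceeds the declared level.
int applyDeref(int declaredPlev, int deref) {
    assert(declaredPlev >= 0 && deref >= 0);
    if (deref > declaredPlev) {
        return INVALID_DEREF;  // the compiler would report a semantic error here
    }
    return declaredPlev - deref;
}
```

<P>For a variable declared as <TT>^^int</TT>, one dereference leaves a pointer level of 1, two leave a plain <TT>int</TT>, and a third is rejected.</P>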
<P>
<H4>14.6.2.2 Operations on a Variable</H4>In all, we need to be concerned with 
four different operations on a variable: 
<OL>
  <LI>make an rvalue from a variable, 
  <LI>make an rvalue from a reference, 
  <LI>make an lvalue from a variable, 
  <LI>make an lvalue from a reference.</LI></OL>
<P>Variables are assumed to include procedure/function parameters passed by 
value, and all references are procedure/function parameters that are passed by 
reference. We can call <TT>Ident::getByVal()</TT> to see whether or not our 
identifier is a variable passed by reference. Let's go over these four cases one 
by one.</P>
<P>
<MENU><B>Making an rvalue from a variable.</B> Essentially, we want to get a 
  value from a variable and store it on the EES. This can be something as simple 
  as: <PRE>      var
        x: int;

      begin
        write x;
              ^
              |______ Get the value of x and put it on the EES
</PRE>Our variable will be at some specific offset from one of the four areas 
  that were previously mentioned. We can use the <TT>Ident::getOffset()</TT> method 
  to get the offset. We will also need to know the size of the variable. This