📄 chapter 6 assignment and expressions.htm
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0062)http://topaz.cs.byu.edu/text/html/Textbook/Chapter6/index.html -->
<HTML><HEAD><TITLE>Chapter 6: Assignment and Expressions</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1458" name=GENERATOR></HEAD>
<BODY>
<CENTER>
<H2>Chapter 6<BR>Assignment and Expressions</H2></CENTER><!----------------------------------------------------------------------------->
<H2>6.1 Assignment and Expressions</H2><!----------------------------------------------------------------------------->This
chapter will serve as a buffer to the shock that might otherwise follow. In
short, code generation in a compiler is difficult, time consuming, and can be
frustrating. In this chapter we will introduce the most basic aspect to any
programming language: expression evaluation. We will also introduce the
code-generation API that is presented in <TT>SAL-CodeGen.CPP</TT>. For an
explanation of the virtual machine's architecture, the reader should referr to
appendix A. Appendix A goes over the architecture of the virtual machine and
presents it in a way that is interactive and understandable. It will also
introduce the debugger and teaches how to use it. For this chapter, we will
assume that the reader is familiar with all of these things.
<P>In order to get a good overview of what we are going to do in this chapter,
let's consider a very simple example. Suppose we wish to generate code for a
statement that assigns the value of five to a 32-bit integer: <PRE> x:= 5;
</PRE>To start out, we consider the rule for assignment:
<MENU><IMG src="Chapter 6 Assignment and Expressions.files/RULE35.gif">
<P><FONT face=arial size=-1><B>Figure {RULE35}.</B> Rule Assignment, number
35.</FONT></P></MENU>We can literally follow the path that the compiler takes as
it generates the code for this statement, from start to finish. Beginning at the
left in rule Assignment, we call rule IdentExpr.
<MENU><IMG src="Chapter 6 Assignment and Expressions.files/RULE35a.gif">
<P></P></MENU>In this case, we are going to have <TT>RuleIdentExpr()</TT>
generate an address to <TT>x</TT> (We will use a flag of some sort in order to
indicate this). We want the address of <TT>x</TT> placed onto the EES. This
causes a single instruction to be generated: <PRE> LGA 00000004
</PRE>Next we eat an assignment token:
<MENU><IMG src="Chapter 6 Assignment and Expressions.files/RULE35b.gif">
<P></P></MENU>After this, we call rule Expression. Rule Expression calls many
other rules, and eventually calls rule Factor.
<MENU><IMG src="Chapter 6 Assignment and Expressions.files/RULE35c.gif">
<P></P></MENU>Rule Factor handles all immediate constants, so it will be rule
Factor's job to make sure the value of 5 gets placed onto the EES. It emits
another instruction: <PRE> LID 00000005
</PRE>At this point, the runtime EES will have two items on the stack, the value
5 (as 32 bits) on the top, with the address to <TT>x</TT> underneath. <PRE> __________
| . . . |
|----------|
| x's addr |
|----------|
| 00000005 | <--- EES Top
|----------|
| |
|
</PRE>At this point, we are basically finished parsing all that needs to be
parsed for this simple statement.
<MENU><IMG src="Chapter 6 Assignment and Expressions.files/RULE35d.gif">
<P></P></MENU>All that remains is for us to store the value that has been
generated at the address that we have computed. We can do this with another
single instruction: <PRE> SSD 00000000
</PRE>The entire piece of code that stores the value 5 in <TT>x</TT> is thus: <PRE> LGA 00000004 ; Loads an address, G+4 onto the EES
LID 00000005 ; Loads the value of 5 onto the EES as a 32-bit quantity
SSD 00000000 ; Store the four bytes on the top of the EES at the
; address underneath
</PRE>This is a really simple example. In this chapter we are going to learn how
to do this and many other things. By the end of this chapter our compiler should
be able to perform assignments using expressions of arbitrary complexity. We
will also get a strong grasp of records and arrays, and our compiler should be
able to handle those, as well.
<P>The SAL compiler generates the program image byte by byte as it parses
through the source program. A byte-array (or code array) stores the VM op-codes
and their immediate parameters as they are generated. Functions and procedures
are compiled and written out to disk along with string data and external fixups.
At any given time, the code array will be partially filled, with a program
counter holding the index to the byte where the next instruction will be
written. <PRE> PC
|
V
-+---+----+----+----+----+---+----+----+----+----+---+----+----+----+----+-------
|LGA| 04 | 00 | 00 | 00 |LID| 00 | 00 | 00 | 00 |SSD| 00 | 00 | 00 | 00 | ...
-+---+----+----+----+----+---+----+----+----+----+---+----+----+----+----+-------
</PRE>
<MENU><FONT face=arial size=-1><B>Figure {CODARR}.</B> The code array and the
compiler's program counter (PC). This particular sequence of instructions
shows our previous example once it is complete.</FONT> </MENU>We have several
functions within our code-generation library that properly pack byte-codes into
the code array. They all begin with <TT>Emit</TT>. For instance,
<TT>Emit1()</TT> emits a single byte. Here is an example of the code array
before and after a call to <TT>Emit1()</TT>: <PRE> Before a call to Emit1(): PC
|
V
-+---+----+----+----+----+---+----+----+----+---+-
| ... Other instructions ...| Garbage ...
-+---+----+----+----+----+---+----+----+----+---+-
After a call to Emit1(0FFh): PC
|
V
-+---+----+----+----+----+---+----+----+----+---+-
| ... Other instructions ...|0FFh| Garbage ...
-+---+----+----+----+----+---+----+----+----+---+-
</PRE>As we can see, a single byte is emitted, and the program counter is
incremented one place. An assignment can be broken up into a hierarchy of rules.
We will talk about each of these rules in this chapter. The hierarchy of rules
looks like so:
<MENU><IMG src="Chapter 6 Assignment and Expressions.files/ASSNTX.gif">
<P><FONT face=arial size=-1><B>Figure {ASSNTX}.</B> Hierarchy of various parts
of an assignment. This graph is crucial to properly understanding the
relationship between various parts of an assignment.</FONT></P></MENU>Again,
performing an assignment is essentially loading the address of some variable
onto the EES, loading the value of some expression onto the EES, and then
storing the result of the expression into the address. The prototype for
<TT>RuleAssignment()</TT> is unremarkable, and appears like the prototypes for
most of the other rules <PRE> void RuleAssignment( Set Follow );
</PRE>Let's take a closer look at what we need to do. In figure {GENASSN} we can
see rule Assignment with several callouts showing various tasks that need to be
done, and where in the rule they need to be done.
<MENU><IMG src="Chapter 6 Assignment and Expressions.files/GENASSN.gif">
<P><FONT face=arial size=-1><B>Figure {GENASSN}.</B> A diagram of how rule
Assignment is implemented. Also shown is a section of the code array that
shows the various parts of the assignment in machine code.</FONT></P></MENU>
<H3>6.1.1 Setting Up and Calling Rule IdentExpression</H3><!----------------------------------------------------------------------------->We
begin by declaring two type variables. These are needed to keep track of the
right-hand and left-hand types. Next, we call <TT>RuleIdentExpr()</TT>. This
will be explained shortly. The important thing about this rule is that it is
used to handle identifiers on the left side of the assignment as well as the
right. Remember that on the left side we want an identifier's address, and on
the right side we want its value. Notice that we pass in the value
<TT>false</TT> as one of <TT>RuleIdentExpr()</TT>'s parameters. This tells it to
make an lvalue (pronounced L-value, it is a value on the left-hand side of an
assignment. There is also an rvalue for the right-hand side). We also pass in
<TT>LType</TT>. <TT>RuleIdentExpr()</TT> will make an lvalue and set LType to
the type of the variable.
<P>
<H3>6.1.2 Handling Procedure Calls</H3><!----------------------------------------------------------------------------->Take
a look at rule Assignment figure {GENASSN} again. Notice how there is a path
that bypases the assignment token and the following expression--basically a
valid assignment can be simply one identifier expression. How can this be?
<P>This is always the case with a procedure call. The statement, <PRE> foo();
</PRE>is called an empty expression in SAL; it returns nothing. There are two
ways to check for a procedure call. The easiest is to just see if the current
token is a semicolon. However, that does not assure us that the user didn't just
type something like: <PRE> x;
</PRE>In order to make sure that this does not happen, we have a better way. A
procedure's return type will be <TT>notyp</TT>, whereas the return type for the
above expression will be whatever <TT>x</TT> was defined as. The best way is to
merge the two: <PRE> //*** Checking for procedure call<BR>
if LType.type = notyp and Token.sy = ';' then
return;
end if;
</PRE>
<H3>6.1.3 Evaluating the Expression and Storing the Result</H3><!----------------------------------------------------------------------------->Before
calling <TT>RuleExpression()</TT> we want to set RType equal to LType. We then
call <TT>RuleExpression()</TT> and pass it in. This rule will use the value in
RType to resolve ambiguities in type (for constants, especially), and will set
the value of RType to whatever the type of the expression ends up being. After
this rule returns we check to see that <TT>LType</TT> is the same as
<TT>RType</TT>. We can only assign integers to integers, cardinals to cardinals,
reals to reals, etc. If the two types do not match then we generate an error.
<P>Once we know that two types are compatible, we are faced with two situations.
We are doing either an assignment of complex data types (like a record or an
array), or we are doing an assignment of scalar data types (variables).
<P><B>6.1.3.1 Assigning Complex data types</B><BR>There are two forms of complex
assignment. One is a record or an array copy, and the other is a string copy.
<P>
<MENU><B>Record and Array copy.</B> This case is true when
<TT>LType.IsComplexType()</TT> returns <TT>true</TT>. Here is an example of a
record copy: <PRE> type
RecType is record
a: int;
b: char;
c: real;
end record;
var
x, y: RecType; //*** Declare two records of the same type
begin
x.a:= 1;
x.b:= 'A';
x.c:= 1.5;
y:= x; //*** Do a record copy
</PRE>In this example, we declare a type that is a record. We fill <TT>x</TT>
with data, and then assign it to <TT>y</TT>. The assignment performed is a
memory-copy and not a pointer-copy. In other words, the data is duplicated,
not the pointer to the data. The difference is that with a pointer copy, a
change to <TT>x</TT>'s data appears immediately in <TT>y</TT>. This is because
they are actually pointing to the same block of memory. The type of assignment
that we want is a memory copy.
<P>Before performing a memory copy, we need to confirm that the two are
actually of identical type. Simply veryifing that <TT>LType.type</TT> is the
same as <TT>RType.type</TT> will tell us if they are both records or both
arrays, but are they the same record or array? This is done by comparing the
block pointers of <TT>LType</TT> and <TT>RType</TT>. If they are pointing to
the same block in the symbol table, then we know that they are of the same
declared type.
<P>Performing a memory copy in the SAL VM is done by using the <TT>SYS</TT>
instruction. <TT>SYS</TT> is a way for SAL programs to get at some low-level
routines that are implemented in fast native code. The particular system call
that we are going to use in order to copy an array or record is service 04h
(memory manipulation), function 00 (memory copy). This function takes a single
16-bit size value, and two 32-bit pointers from the EES. The source address is
the top one, and the destination address is the one underneath. The EES should
look like this: <PRE> +------+
| ... |
| Dest |
| Src |
| Size | <--- EES top
| |
</PRE>The needed size can be retrieved from <TT>RType.block->size()</TT>.
The code to do the mem-copy looks like this:
<P><PRE> Emit (LIW,RType.getExtra()->size());
Emit (SYS, 04h, 00h );
</PRE>
<P><B>String Copy.</B> A string copy in SAL is equivalent to a
<TT>strncpy()</TT> in C. In SAL we want to make sure that <TT>LType</TT> is an
array of characters. This is easily verified by checkig its block information.
The <TT>RType</TT> for a string will have its type as arraytyp, its subtype as
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -