| | | | | [Data] |
| | | | | |
| | |__________| |__________|
| | External
[PC] | | Global EES
| ___|___ ______|__________ Data Area
V / \ / \
-----+-----+----+----+----+----+----+----+-----
... | LEx | modnum | dword offset | ...
-----+-----+----+----+----+----+----+----+-----
(LEx or SEx)
<B>Figure {LSEXTF}</B> A representation of the way the LEx and SEx instructions
operate.
</PRE>The functionality of these instructions is a little obscure, and it takes a
certain amount of abstract thought to visualize. Of all the load/store
op-codes, these are among the more complex.
<P>
<H3>B.1.5 Using a pointer on the EES</H3><!-------------------------------------------------------------------------------->The
VM may also work with addresses directly. All of the segments in memory reside
in the same address space. That means that even though some data is referenced
by G and other data is referenced by L, and still other data is referenced by a
lookup in the MAT, it is all still contained within the same address space.
<P>In addition to working with data indirectly through G or L, the VM may also
work with a pure address, and access data directly. How addresses can be
computed is discussed in the next section. Typically, when loading or storing,
the pointer is consumed, i.e., popped off of the EES. The best way to
maintain a pointer while it is being worked with is to make a copy.
<P>The op-codes to load and store via a pointer take on the same familiar
general format: <PRE> Num Instruction
=========================================
1C LSB <I><u32 offset></I>
1D LSW <I><u32 offset></I>
1E LSD <I><u32 offset></I>
1F LSQ <I><u32 offset></I>
2C SSB <I><u32 offset></I>
2D SSW <I><u32 offset></I>
2E SSD <I><u32 offset></I>
2F SSQ <I><u32 offset></I>
<B>Table {LSEEST}</B> The instructions to load and store data using a pointer
on the EES.
</PRE>Realize that the S in these op-codes refers to the EES, and not to the S
register. The S register is only used as a placeholder, marking the end of
the local data area. There are no instructions to load and store relative to the
S register. All of the LSx and SSx op-codes refer to the EES. This must be kept
in mind, or it can become very confusing.
<P><TT>LSx</TT> works by popping a 32-bit value from the EES, which is used as a
base pointer. The immediate offset is then added to the pointer, and the data
from that address in memory is fetched and pushed onto the EES. The <TT>SSx</TT>
instruction does the opposite: it stores data from the EES at the computed
address. Notice that the data to be stored is <I>at the
top</I> of the EES, and the pointer to where it gets stored is underneath. This,
also, must be remembered, or things can get ugly.
<P><PRE> | ... |
|----------|
[Base Address]--> | |
(popped from EES)| | |
| | |
| | |
| | |
V |----------|
,-------- Offset --> | [Data] --------------.
| |----------| |
[PC] | | | |
| _________|________ | | | V |
V / \ | | | [Data] |
-----+-----+----+----+----+----+----- | | | |
... | LSx | dword offset | ... | ... | |__________|
-----+-----+----+----+----+----+----- Memory EES
<B>Figure {LSEESF}</B> A representation of the way the LSx instructions
operate.
</PRE><!-------------------------------------------------------------------------------->
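<P>The pop, add-offset, and access sequence described above can be sketched in Python. This is only an illustration, not the SALVM's implementation: the names <TT>mem</TT>, <TT>ees</TT>, <TT>lsd</TT>, and <TT>ssd</TT> are hypothetical, and the little-endian byte order and the one-entry-per-value EES are assumptions that the text does not confirm.

```python
import struct

mem = bytearray(64)   # hypothetical flat address space
ees = []              # hypothetical EES; one list entry per pushed value

def lsd(offset):
    """LSD sketch: pop a base pointer, push the dword found at base + offset."""
    base = ees.pop()
    (value,) = struct.unpack_from("<I", mem, base + offset)  # byte order is an assumption
    ees.append(value)

def ssd(offset):
    """SSD sketch: pop the data (top of the EES), then the pointer underneath it."""
    value = ees.pop()
    base = ees.pop()
    struct.pack_into("<I", mem, base + offset, value & 0xFFFFFFFF)

# Store a dword at address 16 + 4, then load it back.
ees.append(16)            # the pointer goes on first ...
ees.append(0xDEADBEEF)    # ... and the data goes on top of it
ssd(4)
ees.append(16)
lsd(4)
result = ees.pop()
```

Both operations consume the pointer, which is why a copy has to be made if the pointer is needed again.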
<H3>B.1.6 Computing Pointers</H3>There is a group of instructions specifically
for computing pointers. Again, the only registers within the SALVM are base
pointers to various segments in memory. While the values of these registers may
be read, they may never be set, at least not directly. The instructions are
listed in table {COMPPTR}. <PRE> Num Instruction
======================================
08 LLA <I><u32 offset></I>
09 LGA <I><u32 offset></I>
0A LSA <I><u32 offset></I>
0B LEA <I><u16 modnum></I> <I><u32 offset></I>
<B>Table {COMPPTR}</B> A description of the op-codes and their machine codes
</PRE>All of these instructions take a 32-bit unsigned offset as an immediate
parameter. The <TT>LEA</TT> instruction also takes an immediate unsigned 16-bit
parameter, designating an entry in the MAT from which to extract an external G.
Each of these instructions will compute an address, and leave it on the stack.
Let's say a programmer wants to get a pointer to a variable in global memory.
The programmer will know the address of the variable only as an offset from G.
In order to get a pure pointer to the variable, the programmer will need to add
that offset to G, and that will get the variable's pointer. The <TT>LGA</TT>
instruction would be used. If the variable was stored at offset 2Ah from G, then
the programmer would use the instruction, <PRE> LGA 0000002A
</PRE>If the programmer wants to know the value of one of these registers, and
nothing more, it can be done by using an offset of zero. This example will
retrieve the value of L. <PRE> LLA 00000000
</PRE>As stated before, all of these instructions will compute an address and
leave it on the stack. The <TT>LSA</TT> instruction is particularly useful, in
that it is used to add a signed 32-bit quantity to another 32-bit quantity (a
pointer) on the top of the EES. This can be very useful for walking down through
several nested levels of records. Either an <TT>LGA</TT> or an <TT>LLA</TT> will
get the record's base address onto the EES, and then one or more successive
<TT>LSA</TT> instructions will take the pointer to the appropriate offset.
<P>The <TT>LEA</TT> instruction is used to get a pointer to a global variable in
another module. The first immediate parameter is a 16-bit number designating an
external module. The instruction will then extract the G for that module (using
the MAT), and add the 32-bit offset, in order to get the global variable's
pointer.
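<P>A minimal Python sketch of these address computations follows. The register values, the contents of <TT>MAT</TT>, and the record offsets are made up for illustration, the function names are hypothetical, and wrapping results to 32 bits is an assumption.

```python
G = 0x1000            # hypothetical base of this module's global segment
L = 0x8000            # hypothetical base of the current local data area
MAT = {3: 0x4000}     # hypothetical MAT: module number -> that module's G
ees = []              # hypothetical EES holding 32-bit addresses

def lga(offset):
    """LGA sketch: push G + offset."""
    ees.append((G + offset) & 0xFFFFFFFF)

def lla(offset):
    """LLA sketch: push L + offset."""
    ees.append((L + offset) & 0xFFFFFFFF)

def lsa(offset):
    """LSA sketch: add an offset to the pointer on top of the EES."""
    ees.append((ees.pop() + offset) & 0xFFFFFFFF)

def lea(modnum, offset):
    """LEA sketch: look up the external module's G in the MAT, then add the offset."""
    ees.append((MAT[modnum] + offset) & 0xFFFFFFFF)

lga(0x2A)             # pointer to the variable at offset 2Ah from G
var_ptr = ees.pop()

lga(0x40)             # base of a hypothetical record in global memory
lsa(0x08)             # step to a nested record at offset 8
lsa(0x04)             # step to a field at offset 4 within it
field_ptr = ees.pop()

lea(3, 0x10)          # global at offset 10h in external module 3
ext_ptr = ees.pop()

lla(0)                # an offset of zero yields the register's own value
l_value = ees.pop()
```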
<H3>B.1.7 Accessing Other Registers and Memory Segments</H3><!-------------------------------------------------------------------------------->There
is a series of additional instructions for retrieving various other addresses.
These are listed in table {AUXADDR}. <PRE> Num Instruction
==================================
20 LPCA <I><s32 offset></I>
21 LSTA <I><s32 offset></I>
22 LVTA <I><u16 modnum></I> <I><u16 procnum></I>
23 LEXA (no offset)
</PRE><TT><B>LPCA</B></TT>. The <TT>LPCA</TT> instruction is used to get the
current value of the PC. It will also add a 32-bit offset to the value. By
definition, <PRE> LPCA 00000000
</PRE>returns the address of the next instruction in the stream. This
instruction is useful for computing a jump address. In the SAL compiler, it is
used for unwinding the stack during exception handling. It has other uses, too.
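<P>As a rough sketch of that definition, the value pushed by <TT>LPCA</TT> can be modeled as the address of the following instruction plus the signed offset. The function name <TT>lpca</TT> is hypothetical, and the five-byte instruction length (a one-byte op-code plus a four-byte offset) is an assumption.

```python
INSTR_LEN = 5   # assumption: 1-byte op-code followed by a 4-byte offset

def lpca(instr_addr, offset):
    """Sketch of the value LPCA pushes: next instruction's address plus a signed offset."""
    next_instr = instr_addr + INSTR_LEN
    return (next_instr + offset) & 0xFFFFFFFF

here = lpca(0x200, 0)        # LPCA 00000000: address of the next instruction
back = lpca(0x200, -0x10)    # a negative offset points backwards in the stream
```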
<P><TT><B>LSTA</B></TT>. This instruction is used for fetching the address of a
string constant. Only the string segment for the local module may be accessed;
strings belonging to external modules may not. The string segment was
designed exclusively for the storage of string constants. String constants are
never referred to symbolically, other than to copy their contents into some
other area of memory that <I>is</I> symbolically accessed. Additionally, notice
that there are no <TT>LSTx</TT> or <TT>SSTx</TT> instructions. Again, the only
data that is stored in the string segment is string <I>constants</I>.
<P>As an example of this instruction's usage, suppose the programmer wants to print
the string <TT>"Hello, world!"</TT> to the screen. The string will be stored in
the local string segment at an offset that is known at compile time. For
example, if the offset were at 1A4h, then the programmer would use, <PRE> LSTA 000001A4
</PRE>The address of the string would be loaded onto the stack, and the
programmer would then pass that address to a routine that handles strings.
<P><TT><B>LVTA</B></TT>. This instruction computes the address of a virtual
table. Its use is related to classes, and is discussed at great length in
chapter {VIRTUAL FUNCTIONS}.
<P><TT><B>LEXA</B></TT>. This instruction is used to
load the starting address of stack memory (i.e., the start of the segment into
which L and S point). By definition, the first 32 bits of this segment store a
pointer to the exception stack. The exception stack is not truly a part of the
virtual machine's architecture; however, a pointer to the current thread's
exception stack needs to be stored at a consistent address. The <TT>LEXA</TT>
instruction exists so that the pointer may be properly set, or its value
retrieved.
<P>
<H3>B.1.8 Using These Instructions Together</H3><!-------------------------------------------------------------------------------->Let's
have a few examples of using these instructions together. First, we will cover
assignment to a global variable, since it is the easiest. Suppose we have a
variable called <TT>X</TT>, and we want to initialize it to zero. There are some
things that we have to consider. First, since the architecture that we are
working with is stack-based, we need to consider in which order the items need
to appear on the EES. The second thing that we need to consider is the address
at which the variable is stored. This is usually known at compile time. Third,
we need to know the size of the variable. This is also known at compile time.
<P>If we want to assign a value to a variable, there are two ways to do it. Each
way is dependent upon the architecture and the way the instructions work. The
easiest is to simply store the value at the address of the variable. Since
there is no instruction to store an immediate value, we need to load it onto the
EES first. We can accomplish this through a load-immediate instruction. Then we
tell the VM to store the proper quantity of bytes at the proper offset from the
proper base register. We always know beforehand the offsets of all variables, and
the segments where they reside. Supposing the variable is at offset 14h from
G (it's in global memory), it is a word, and we want to set it to zero, we
would issue an instruction sequence like this: <PRE> LIW 0000 ; Put the word containing zero on the EES
SGW 00000014 ; Store the word at the top of the EES at offset 14h from G
</PRE>Very simple. The second way involves getting a pointer to the variable,
and uses the <TT>SSW</TT> instruction. We know that the <TT>SSx</TT> instructions take the
data from the top of the EES and the address from underneath it, so we need to make
sure that our items are on the stack in the correct order. Remember, the EES does
not retain <I>any</I> type information whatsoever. It merely works with
quantities of information. We would issue a sequence of instructions like this:
<P><PRE> LGA 00000014 ; Put the address of the variable on the EES
LIW 0000 ; Put a word containing the value zero on the EES
SSW 00000000 ; Store the word at the top of the EES at the address
; underneath (no offset).
</PRE>This method is a little less straightforward. However, it lends itself
very nicely to code generation in the compiler. In fact, this is the method
discussed in this text.
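<P>The two sequences can be compared with a small Python model. This is a sketch under stated assumptions rather than a description of the real VM: the value of <TT>G</TT>, the little-endian byte order, and the helper names <TT>liw</TT>, <TT>lga</TT>, <TT>sgw</TT>, and <TT>ssw</TT> are hypothetical, and the EES is simplified to one list entry per pushed item regardless of its size.

```python
import struct

G = 0x100                  # hypothetical base of the global segment
mem = bytearray(0x200)     # hypothetical flat address space
ees = []                   # simplified EES: one entry per pushed item
VAR = slice(G + 0x14, G + 0x16)   # the word variable at offset 14h from G

def liw(value):
    """LIW sketch: push an immediate word."""
    ees.append(value & 0xFFFF)

def lga(offset):
    """LGA sketch: push G + offset."""
    ees.append(G + offset)

def sgw(offset):
    """SGW sketch: pop a word and store it at G + offset."""
    struct.pack_into("<H", mem, G + offset, ees.pop())

def ssw(offset):
    """SSW sketch: pop the data, then the pointer underneath, and store the word."""
    value = ees.pop()
    base = ees.pop()
    struct.pack_into("<H", mem, base + offset, value)

# First way: store directly relative to G.
mem[VAR] = b"\xff\xff"     # make the variable non-zero so the store is visible
liw(0x0000)
sgw(0x14)
direct = bytes(mem[VAR])

# Second way: compute a pointer, then store through it.
mem[VAR] = b"\xff\xff"
lga(0x14)
liw(0x0000)
ssw(0)
via_pointer = bytes(mem[VAR])
```

Both paths leave the same two zero bytes in memory and an empty EES; the pointer form simply costs one extra instruction.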
<P>
<H2>B.2 Integer and Floating Point Arithmetic Instructions</H2><!-------------------------------------------------------------------------------->The
Arithmetic Instructions are all fairly straightforward. They take no immediate
parameters, they consume two equal quantities of bytes from the EES, perform an
operation, and deposit the result on the EES in the same quantity of bytes. The
VM has instructions for performing arithmetic on both integer and real data.
They are listed in table {ARITHT}. <PRE>
 Data Type(s)      bits  add    sub    mul    div    mod    trunc   neg    abs
====================================================================================
 (un)signed byte      8  ADDB   SUBB   MULB   DIVB   MODB           NEGB   ABSB
 (un)signed word     16  ADDW   SUBW   MULW   DIVW   MODW           NEGW   ABSW
 (un)signed dword    32  ADDD   SUBD   MULD   DIVD   MODD           NEGD   ABSD
 (un)signed qword    64  ADDQ   SUBQ   MULQ   DIVQ   MODQ           NEGQ   ABSQ
 single precision    32  FADDS  FSUBS  FMULS  FDIVS  FMODS  FTRNCS  FNEGS  FABSS
 double precision    64  FADDD  FSUBD  FMULD  FDIVD  FMODD  FTRNCD  FNEGD  FABSD
 tenbyte precision   80  FADDT  FSUBT  FMULT  FDIVT  FMODT  FTRNCT  FNEGT  FABST
 quad precision     128  FADDQ  FSUBQ  FMULQ  FDIVQ  FMODQ  FTRNCQ  FNEGQ  FABSQ
<B>Table {ARITHT}</B> This table shows all of the instructions that are used to
perform arithmetic operations.
</PRE>Integers within the SALVM may be either signed 2's complement or
unsigned, and can range in size from 8 bits to 64 bits. Floating point numbers
can range from single precision (32 bits) to quad precision (128 bits).
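<P>As an illustration of fixed-width integer arithmetic on the EES, the Python sketch below pops two operands, applies an operation, and pushes a result of the same width. The helper names are hypothetical. Treating the operand pushed last as the second operand, and wrapping the result to the operand width on overflow, are assumptions made for the sketch.

```python
def to_signed(value, bits):
    """Reinterpret a bit pattern as a 2's complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def binary_op(ees, bits, fn):
    """Pop two operands, apply fn(first, second), push the result truncated to `bits`."""
    second = ees.pop()     # the operand pushed last is on top
    first = ees.pop()
    ees.append(fn(first, second) & ((1 << bits) - 1))

ees = []

# SUBB-like: 7 - 10 on 8-bit quantities.
ees.append(7)              # first operand
ees.append(10)             # second operand
binary_op(ees, 8, lambda a, b: a - b)
raw = ees.pop()            # the bit pattern left on the EES
as_signed = to_signed(raw, 8)

# MULD-like: a product that does not fit in 32 bits wraps (an assumption).
ees.append(0x10000)
ees.append(0x10000)
binary_op(ees, 32, lambda a, b: a * b)
wrapped = ees.pop()
```

The same 8-bit pattern reads as 253 when taken as unsigned and as -3 when taken as signed, since the EES itself carries no type information.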
<P>The integral instructions assume a 2's complement host architecture. Most of
the instructions take two operands. <TT>FTRNCx</TT>, <TT>NEGx</TT> and
<TT>FNEGx</TT>, and <TT>ABSx</TT> and <TT>FABSx</TT> take one operand. All of the
floating point instructions start with an F, and none of the integral
instructions do. The letter at the end of the instruction tells what size of
data it works with. For instance, <B>MULD</B> is an integral instruction that
works on 32-bit quantities of data. It requires that two dwords be on the EES. It
will remove them, multiply them together, and then push the result as a dword
back onto the EES.
<P>All of these instructions work in a three-step fashion. In
step one, the appropriate amount of data is removed from the EES (either one or
two operands). In step two, the operation is performed, and in step three the
result is pushed back onto the EES. All two-operand instructions require that
the second operand be at the top of the EES, and not the first. Thus, if we want
to add two single precision floats together, say 3.5 + 2.6, the numbers need to
be on the EES in reverse order. This is easy to remember, as long as we push the
operands onto the EES in the order that they appear in the equation. Thus, we
would first push 3.5, and then 2.6. See figure {EESORD}. <PRE> 3.5 + 2.6 3.5 + 2.6
^ ^
| 3.5 | | 2.6 |
| | | 3.5 |
| ... | | ... |
|_______| |_______|
EES EES