📄 http:^^www.cs.wisc.edu^~cs354-2^cs354^lec.notes^mal.html

📁 This data set contains WWW-pages collected from computer science departments of various universities
Date: Tue, 05 Nov 1996 00:32:34 GMT
Server: NCSA/1.5
Content-type: text/html
Last-modified: Wed, 30 Aug 1995 21:21:33 GMT
Content-length: 20490

<html>
<head>
<title> Lecture notes - Chapter 8 - MAL and Registers</title>
</head>

<BODY>
<h1> Chapter 8 -- MAL and registers</h1>

<pre>


REGISTERS and MAL
-----------------

An introduction to the subject of registers -- from a motivational
point of view.

This lecture is an attempt to explain a bit about why computers
are designed (currently) the way they are.  Try to remember that
speed of program execution is an important goal.  Desire for increased
speed drives the design of computer hardware.

The impediment to speed (currently):  transferring data to and from
memory.

look at a SAL instruction:

    add  x, y, z

    -x, y, and z must all be addresses of data in memory.
    -each address is 32 bits.
    -so, this instruction requires more than 96 bits.

    if each read from memory delivers 32 bits of data,
    then it takes a lot of reads before this instruction can
    be completed.
       at least 3 for instruction fetch
       1 to load y
       1 to load z
       1 to store x

       that's 6 transactions with memory for 1 instruction!


How bad is the problem?
  Assume that a 32-bit 2's complement addition takes 1 time unit.
  A read/write from/to memory takes about 10 time units.

  So we get
     fetch instruction:  30 time units
     decode               1 time unit
     load y              10 time units
     load z              10 time units
     add                  1 time unit
     store x             10 time units
     ---------------------------------
       total time:       62 time units

     60/62 = 96.8 % of the time is spent doing memory operations.


what do we do to reduce this number?

  1. transfer more data at one time
     if we transfer 2 words at one time, then it only takes 2 reads
     to get the instruction.  There is no savings in loading/storing
     the operands.
     And, an extra word's worth of data is transferred
     for each load, a waste of resources.
     So, this idea would give a saving of 1 memory transaction.

  2. modify instructions such that they are smaller.
     This was common on machines from more than a decade ago.

     Here's how it works:

     SAL implies what is called a 3-address machine.  Each
     arithmetic type instruction contains 3 operands, 2 for sources
     and 1 for the destination of the result.

     To reduce the number of operands (and thereby reduce the number
     of reads for the instruction fetch), develop an instruction set
     that uses 2 operands for arithmetic type instructions.
     (Called a 2-address machine.)

     Now, instead of
       add  x, y, z

     we will have
       load x, z      (puts the value of z into x)
       add  x, y      ( x <- x + y )

     so, arithmetic type instructions always use one of the operands
     as both a source and a destination.

     There are a couple of problems with this approach:
       - where 1 instruction was executed before, 2 are now executed.
         It actually takes more memory transactions to execute this sequence!
            at least 2 to fetch each instruction
            1 for each of the load/storing of the operands themselves.

            that is 8 reads/writes for the same sequence.

  So, allow only 1 operand -- called a 1-address format.

     now, the instruction     add  x, y, z   will be accomplished
     by something like

     load  z
     add   y
     store x

     to facilitate this, there is an implied word of storage
     associated with the ALU.  All results of instructions
     are placed into this word -- called an ACCUMULATOR.

     the operation of the sequence:
         load z --  place the contents of address z into the accumulator
                    (sort of like if you did  move accumulator, z  in SAL)
         add  y --  implied operation is to add the contents of the
                    accumulator with the operand, and place the result
                    back into the accumulator.
         store x--  place the contents of the accumulator into the location
                    specified by the operand.

     Notice that this 1-address instruction format implies the use
     of a variable (the accumulator).

     How many memory transactions does it take?
        2 -- (load) at least 1 for instruction fetch, 1 for read of z
        2 -- (add) at least 1 for instruction fetch, 1 for read of y
        2 -- (store) at least 1 for instruction fetch, 1 for write of x
       ---
        6   the same as for the 3 address machine -- no savings.

  BUT, what if the operation following the add was something like
         div x, x, 3
  then, the value for x is already in the accumulator, and the
  code on the 1 address machine could be
    load z
    add  y
    div  3
    store x
  there is only 1 extra instruction (2 memory transactions) for this
  whole sequence!
       On the 3-address machine:   12 transactions (6 for each instr.)
       On the 1-address machine:    8 transactions (2 for each instr.)

REMEMBER this:  the 1 address machine uses an extra word of storage
                that is located in the CPU.

                the example shows a savings in memory transactions
                when a value is re-used.

  3.  shorten addresses.  This restricts where variables can be placed.
      First, make each address be 16 bits (instead of 32).  Then
         add  x, y, z
      requires 2 words for instruction fetch.

      Shorten addresses even more . . . make them each 5 bits long.
      Problem:  that leaves only 32 words of data for operand storage.
      So, use extra move instructions that allow moving data from
      a 32 bit address to one of these special 32 words.

      Then, the add can fit into 1 instruction.


NOW, put a couple of these ideas together.

Use of storage in CPU (accumulator) allowed re-use of data.
It's easy to design -- put a bunch of storage in the CPU --
call them REGISTERS.  How about 32 of them?  Then, restrict
arithmetic instructions to only use registers as operands.
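
The transaction counts worked out so far can be tallied with a short
sketch.  This is only an illustration in Python (not SAL or MAL), and
it assumes the simplified cost model used in these notes: every
instruction word fetched and every operand read or written is one
memory transaction, a memory transaction takes 10 time units, and a
decode or an add takes 1 time unit.  The names MEM_TIME, CPU_STEP_TIME
and transactions are made up for the example.

```python
# Sketch of the memory-transaction tallies from the notes.
# Assumed cost model (from the notes): one transaction per instruction
# word fetched and per operand read or written.

MEM_TIME = 10       # time units per memory transaction (assumed)
CPU_STEP_TIME = 1   # time units for a decode or for an add (assumed)


def transactions(fetch_words, operand_accesses):
    """Memory transactions needed by one instruction."""
    return fetch_words + operand_accesses


# 3-address machine: at least 3 words of instruction fetch,
# 2 operand reads and 1 operand write per arithmetic instruction.
three_addr_add = transactions(3, 3)                      # add x, y, z
three_addr_seq = three_addr_add + transactions(3, 3)     # then div x, x, 3

# 1-address machine: 1 word of instruction fetch and 1 operand access
# per instruction; the accumulator lives in the CPU and costs nothing.
one_addr_add = sum(transactions(1, 1)
                   for _ in ("load z", "add y", "store x"))
one_addr_seq = sum(transactions(1, 1)
                   for _ in ("load z", "add y", "div 3", "store x"))

# Time for the single 3-address add: memory time plus decode plus add.
memory_time = three_addr_add * MEM_TIME
total_time = memory_time + 2 * CPU_STEP_TIME
memory_fraction = memory_time / total_time

print("3-address, add only:   ", three_addr_add, "transactions")
print("3-address, add and div:", three_addr_seq, "transactions")
print("1-address, add only:   ", one_addr_add, "transactions")
print("1-address, add and div:", one_addr_seq, "transactions")
print("3-address add time:    ", total_time, "time units")
print("share spent on memory:  {:.1%}".format(memory_fraction))
```

The tally makes the point of the accumulator example visible: the
saving only shows up once a value is re-used, as in the add followed
by the div.
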

   add  x, y, z

   becomes something more like

   load  reg10, y
   load  reg11, z
   add   reg12, reg11, reg10
   store x, reg12

presuming that the values for x, y, and z can/will be used again,
the load operations take relatively less time.


The MIPS R2000 architecture does this.  It has
  1. 32  32-bit registers.
  2. Arithmetic/logical instructions use register values as operands.

A setup like this where arith/logical instr. use only registers
for operands is called a LOAD/STORE architecture.

A computer that allows operands to come from main memory is often
called a MEMORY TO MEMORY architecture, although that term is not
universal.

Load/store architectures are common today.  They have the advantages
  1.  instructions can be fixed length (and short)
  2.  their design allows (easily permits) pipelining, making load/store
      architectures faster
      (More about pipelining at the end of the semester)


MAL
---
discussing some of the details of the MIPS architecture, and how
to write assembly language.

MIPS assembly language (or at least MAL) looks a lot like SAL,
except that operands are now in registers.

To reference a register as an operand, use the syntax
         $x,      where x is the number of the register you want.

  There are some limitations on the use of registers.
  Due to conventions set by the simulator, certain registers are used
  for special purposes.  It is wise to avoid the use of those registers.

           $0     is    0    (use as needed)
           $1     is used by the assembler (the simulator in our case)
                  -- don't use it.
           $2-7   are used by the simulator -- don't use them until
                  you know what they are for and how they are used.
           $26-27 are used to implement the mechanism for calling special
                  procedures that do I/O and take care of other
                  error conditions (like overflow)
           $29    is a stack pointer -- you are automatically allocated
                  a stack (of words), and the $sp is initialized to
                  contain the address of the empty word at the top of
                  the stack at the start of any program.


On to some MAL instructions.  Here are descriptions and samples
of only some instructions.  There are far too many to be able to
go over each one in detail.

Some sample info for all the examples:

   hex address      hex contents    (opt) assembly lang.
   00002000         0011aaee        c1:  .word  0x0011aaee
   00002004         ????????        c2:  .space 12
   00002008         ????????
   0000200c         ????????
   00002010         00000016        c4:  .word  22
   00002014         000000f3        c5:  .word  0x000000f3


Load/Store
----------

   la rt, label         # load address
        place the address assigned to label into the register rt.

        example:             la  $9, c1
              $9 gets the value 0x00002000

   lw rt, label         # load word
        place the word at address       label    into the register rt.
   lw rt,  (rb)         # load word
        place the word at address        (rb)    into the register rt.
   lw rt, x(rb)         # load word
        place the word at address    X + (rb)    into the register rt.

        example:             lw  $10, c1
              $10 gets the value 0x0011aaee

   lb rt, label         # load byte
        place the byte at address      label     into the least
        significant byte of register rt, and sign extend the value
        to the rest of the register.
   lb rt,  (rb)         # load byte
        place the byte at address        (rb)    into the least
        significant byte of register rt, and sign extend the value
        to the rest of the register.
   lb rt, x(rb)         # load byte
        place the byte at address    X + (rb)    into the least
        significant byte of register rt, and sign extend the value
        to the rest of the register.

        example:             lb  $10, c1
              on a little endian machine:
              presuming $9 contains the value 0x00002000,
              $10 gets the value 0xffffffee

   sw rt, label         # store word
        write the contents of register rt to address       label
   sw rt,  (rb)         # store word
        write the contents of register rt to address        (rb)
   sw rt, x(rb)         # store word
        write the contents of register rt to address    X + (rb)

        example:           sw  $10, c2
              the value 0xffffffee is placed into the word of memory
              at address 0x00002004


Branch
------
all the branch instructions for MAL look just like the ones from
SAL! (on purpose).  Just be sure that you use one that exists!
The only difference worth mentioning is that the operands are
required to be in registers.

        example:       beq     $20, $23, branchtarget

        Compare the values in registers 20 and 23.  If the values are
        the same, then load the PC with the address branchtarget
        If not, then do nothing and fetch the next instruction.

   j target       # jump target

        identical in effect to    b target, but the implementation
        and execution are different (wrt the machine code).
        A branch specifies an offset to be added to the current value
        of the PC.  A jump gives as many bits of address as possible,
        and the remaining ones come from the PC (no addition).


Arithmetic/Logical
------------------