Transcript CS305_03

CS.305
Computer Architecture
<local.cis.strath.ac.uk/teaching/ug/classes/CS.305>
Computer Abstractions and
Technology
Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005,
and from slides kindly made available by Dr Mary Jane Irwin, Penn State University.
Instruction Sets
 Language of the Machine
 We’ll be working with the
MIPS instruction set
architecture
 Similar to other
architectures
developed since the 1980's
 Almost 100 million MIPS
processors manufactured
in 2002
 Used by NEC, Nintendo,
Cisco, Silicon Graphics,
Sony, …
Instructions: Language of the Computer
MIPS is a RISC
 RISC - Reduced Instruction Set Computer
 RISC philosophy
   fixed instruction lengths
   load-store instruction sets
   limited addressing modes
   limited operations
 MIPS, Sun SPARC, HP PA-RISC, IBM PowerPC,
Intel (Compaq) Alpha, …
 Instruction sets are measured by how well
compilers use them as opposed to how well
assembly language programmers use them
Design goals: speed, cost (design, fabrication, test,
packaging), size, power consumption, reliability,
memory space (embedded systems)
MIPS R3000 Instruction Set Architecture (ISA)
 Instruction Formats
   R - Register
   I - Immediate
   J - Jump
 Instruction Categories
   Computational
   Load/Store
   Jump and Branch
   Floating Point (coprocessor)
   Memory Management
   Special
 Registers
   R0 - R31
   PC
   HI
   LO
MIPS Instruction Formats
 Three basic formats:
R-format   op | rs | rt | rd | shamt | funct
I-format   op | rs | rt | 16-bit address/number
J-format   op | 26-bit address
 Simple instructions - all 32 bits wide
 Very structured, no unnecessary baggage
 Rely on compiler to achieve performance
— what are the compiler's goals?
 [Suggests another version of the acronym RISC ;-)]
 Q: Why only three basic formats?
 A: Design Principle #1…
Design Principle #1
Simplicity favours regularity
 The fixed width and limited number of instruction
formats keep the hardware simple
 One example of this first underlying principle of
hardware design in action
Registers vs. Memory
 Arithmetic instruction operands must be registers,
— only 32 registers provided
 Compiler associates variables with registers
 What about programs with lots of variables?
[Figure: the five components of a computer: Processor (Control and Datapath), Memory, and I/O (Input and Output)]
Memory Organization
 Viewed as a large, single-dimension array, with an
address.
 A memory address is an index into the array
 "Byte addressing" means that the index points to a
byte of memory.
Address   Contents
0         8 bits of data
1         8 bits of data
2         8 bits of data
3         8 bits of data
4         8 bits of data
5         8 bits of data
6         8 bits of data
...
Memory Organization
 Bytes are nice, but most data items use larger
"words"
 For MIPS, a word is 32 bits or 4 bytes.
Address   Contents
0         32 bits of data
4         32 bits of data
8         32 bits of data
12        32 bits of data
...
Registers hold 32 bits of data
 2^32 bytes with byte addresses from 0 to 2^32-1
 2^30 words with byte addresses 0, 4, 8, ... 2^32-4
 Words are aligned
What are the 2 least significant bits of a word
address?
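The alignment question can be explored with a short Python sketch. It is an illustration added here, not part of the original slides, and the helper names `word_address` and `is_word_aligned` are assumptions chosen for the example.

```python
WORD_BYTES = 4  # a MIPS word is 32 bits, i.e. 4 bytes


def word_address(index):
    """Byte address of the word with the given index (hypothetical helper)."""
    return index * WORD_BYTES


def is_word_aligned(byte_address):
    """A byte address is word aligned when its two low-order bits are 00."""
    return (byte_address & 0b11) == 0


# Print the first few word addresses in decimal and binary.
for i in range(4):
    addr = word_address(i)
    print(f"word {i}: byte address {addr:2d} = {addr:06b}")
```

Every address printed is a multiple of 4, so its two least significant bits are always 00.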
Machine Language
 Instructions, like registers and words of data, are
also 32 bits long
 Example: add $t1,$s1,$s2
 Registers have numbers, $t1=9,$s1=17,$s2=18
 Above add's machine language instruction
encoding:
op      rs     rt     rd     shamt  funct
000000  10001  10010  01001  00000  100000
Can you guess what the field names, such as
'op', stand for?
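As a rough check of the encoding above, the following Python sketch packs the six R-format fields into one 32-bit word. The function name `encode_r_format` is an assumption made for this illustration; the field widths (6, 5, 5, 5, 5, 6 bits) and the values for add $t1,$s1,$s2 come from the slide.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t1,$s1,$s2 with $t1=9, $s1=17, $s2=18; op=0 and funct=32 select add.
word = encode_r_format(op=0, rs=17, rt=18, rd=9, shamt=0, funct=32)
print(f"{word:032b}")   # the six fields run together as one 32-bit pattern
print(f"0x{word:08X}")  # 0x02324820
```

Note that the destination register, written first in assembly language, ends up in the fourth field (rd) of the machine instruction.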
MIPS Computational Operations
 Computational (arithmetic and logical) instructions
have 3 operands.
Example:
C code:
a = b + c
MIPS ‘code’:
add a, b, c
(we’ll talk about registers in a bit)
“The natural number of operands for an operation like addition is
three…requiring every instruction to have exactly three
operands, no more and no less, conforms to the philosophy of
keeping the hardware simple”
MIPS Arithmetic Instructions
 MIPS assembly language arithmetic statement
examples:
add $t0,$s1,$s2
sub $t0,$s1,$s2
 Each arithmetic instruction performs only one
operation
 Each arithmetic instruction fits in 32 bits and
specifies exactly three operands
destination ← source1 op source2
 Those operands are all contained in the datapath’s
register file ($t0,$s1,$s2) – indicated by $
 Operand order is fixed (destination first in the
assembly language statement)
MIPS Arithmetic
 Remember "Simplicity favors regularity"
 Of course this complicates some things...
C code:
a = b + c + d;
MIPS 'code':
add a, b, c
add a, a, d
 Each register contains 32 bits
 Operands must be registers, but only 32 registers
available
 Q: Why only 32 registers?
 A: Design Principle #2…
Design Principle #2
Smaller is Faster
 Operands of arithmetic instructions cannot be
arbitrary (program) variables; they must come from a
limited number of special operands called registers.
 One major difference between program variables and
registers is the limited number of registers - 32 in
MIPS.
 A very large number of registers would increase the
clock cycle time as electronic signals take longer the
further they have to travel.
 This is one illustration of this second underlying
principle of hardware design
Aside: MIPS Register Conventions
Name      Register Number   Usage                                          Preserved on call?
$zero     0                 The constant value 0                           n.a.
$at       1                 Reserved for the assembler                     n.a.
$v0-$v1   2-3               Values for results and expression evaluation   no
$a0-$a3   4-7               Arguments                                      no
$t0-$t7   8-15              Temporaries                                    no
$s0-$s7   16-23             Saved                                          yes
$t8-$t9   24-25             More temporaries                               no
$k0-$k1   26-27             Reserved for the operating system              n.a.
$gp       28                Global pointer                                 yes
$sp       29                Stack pointer                                  yes
$fp       30                Frame pointer                                  yes
$ra       31                Return address                                 yes
Aside: MIPS Register File
Register File
 Holds thirty-two 32-bit registers
   Two read ports and
   One write port
 Registers are
   Faster than main memory
    • But register files with more locations are slower (e.g., a 64 word file could
      be as much as 50% slower than a 32 word file)
    • Read/write port increase impacts speed quadratically
   Easier for a compiler to use
   Convenient places to hold variables
    • code density improves (since registers are named with fewer
      bits than a memory location)
[Figure: register file with 32 locations of 32 bits each; inputs are src1 addr (5 bits), src2 addr (5 bits), dst addr (5 bits), write data (32 bits) and write control; outputs are src1 data (32 bits) and src2 data (32 bits)]
Recap: R-format Instructions
op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

op      opcode that specifies the operation
rs      register file address of the first source operand
rt      register file address of the second source operand
rd      register file address of the result’s destination
shamt   shift amount (for shift instructions)
funct   function code augmenting the opcode
Register Addressing Mode
 The register address fields are rs, rt, and rd. Each
field is 5-bits wide
op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
[Figure: register addressing. The rs, rt and rd fields of the instruction each select one of the Registers: Register ($rs), Register ($rt) and Register ($rd)]
Load and Store Instructions
 Example:
C code:    A[12] = h + A[8];
MIPS code: lw  $t0,32($s3)
           add $t0,$s2,$t0
           sw  $t0,48($s3)
 Destination is last in the store word assembly language statement
 Remember arithmetic operands are registers, not
memory!
Can’t write: add 48($s3),$s2,32($s3)
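The offsets 32 and 48 in the example are the array indices 8 and 12 multiplied by the 4-byte word size. The Python sketch below mimics the three instructions on a toy memory; the base address, the value of h and the array contents are made-up values chosen only for illustration.

```python
WORD_BYTES = 4  # each array element is one 32-bit word


def byte_offset(index):
    """Byte offset of A[index] from the start of the array."""
    return index * WORD_BYTES


# Toy model: memory maps byte addresses to word values (all values assumed).
base = 1000   # assumed contents of $s3, the address of A[0]
h = 5         # assumed contents of $s2
memory = {base + byte_offset(i): 10 * i for i in range(16)}

t0 = memory[base + byte_offset(8)]     # lw  $t0,32($s3)
t0 = h + t0                            # add $t0,$s2,$t0
memory[base + byte_offset(12)] = t0    # sw  $t0,48($s3)

print(byte_offset(8), byte_offset(12), memory[base + 48])
```

The arithmetic happens only on the register value `t0`; memory is touched solely by the load and the store.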
Our First Example
 Can we figure out the code?
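void swap(int v[], int k)
{ int temp;
  temp = v[k];
  v[k] = v[k+1];
  v[k+1] = temp;
}

swap:
  muli $2,$5,4     # $2 = k * 4 (byte offset of v[k])
  add  $2,$4,$2    # $2 = v + (k * 4), the address of v[k]
  lw   $15,0($2)   # $15 = v[k]
  lw   $16,4($2)   # $16 = v[k+1]
  sw   $16,0($2)   # v[k] = $16
  sw   $15,4($2)   # v[k+1] = $15
  jr   $31         # return to the caller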
So far we’ve learned:
 MIPS
— loading words but addressing bytes
— arithmetic on registers only
 Instruction          Meaning
add $s1,$s2,$s3       $s1 = $s2 + $s3
sub $s1,$s2,$s3       $s1 = $s2 - $s3
lw $s1,100($s2)       $s1 = Memory[$s2+100]
sw $s1,100($s2)       Memory[$s2+100] = $s1
MIPS Load/Store Instruction Format
 Consider load-word and store-word instructions and
the design principle Simplicity Favours Regularity…
 …so use another (existing) type of (32-bit) instruction
format other than R-type:
 I-type for data transfer instructions
 Example: lw $t0,32($s2)
op   rs   rt   16-bit number/offset
35   18   8    32
 Q: Why only a 16-bit number/offset?
 A: Design Principle #3…
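The I-format fields can be packed into a word in the same way as the R-format fields. The Python sketch below is an illustration only; `encode_i_format` is a name assumed for this example, and the register numbers follow the register convention table ($t0 is register 8, $s2 is register 18).

```python
def encode_i_format(op, rs, rt, immediate):
    """Pack I-format fields (6, 5, 5, 16 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0,32($s2): op=35 (lw), rs=18 (base $s2), rt=8 (target $t0), offset=32.
word = encode_i_format(op=35, rs=18, rt=8, immediate=32)
print(f"0x{word:08X}")  # 0x8E480020
```

Masking the immediate with 0xFFFF keeps only 16 bits, which is why the offset cannot be a full 32-bit address.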
Design Principle #3
Good Design Demands Good Compromises
 A single (R-type) instruction format is not well suited
to instructions - like lw and sw - that specify an address
as well as register operands. If the address were
to be allocated to one of the 5-bit fields, say, then
such instructions could only address 32 (2^5) words!
 The conflict between having instructions all the same
length and the desire to have a single format leads to
this third underlying principle of hardware design.
 One compromise in MIPS is to have a small number of
different fixed-width instruction formats rather than
instructions of varying length. Multiple formats do
complicate the hardware, but the complexity can be
minimised by keeping them similar.
MIPS Load/Store Memory Addressing
 MIPS has two basic data transfer instructions for
accessing memory:
lw $t0, 4($s3) # load word from memory
sw $t0, 8($s3) # store word to memory
 Data is loaded into (lw) or stored from (sw) a register
in the register file – a 5 bit address
 The memory address – a 32 bit address – is formed
by adding the contents of a base address register to
an offset value
 A 16-bit field means access is limited to memory locations
within a region of ±2^13 or 8,192 words (±2^15 or 32,768 bytes)
of the address in the base register
 Note that the offset can be positive or negative
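The address calculation described above can be sketched in Python as follows. This is a simplified model added for illustration: the helper names and the sample base address are assumptions, and the final mask simply keeps the result within 32 bits.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, offset16):
    """Base register contents plus the sign-extended 16-bit offset."""
    return (base + sign_extend_16(offset16)) & 0xFFFFFFFF


base = 0x10010000  # assumed contents of the base register
print(hex(effective_address(base, 8)))        # 8 bytes above the base
print(hex(effective_address(base, 0xFFFC)))   # 0xFFFC encodes -4: 4 bytes below
```

Because the offset is sign-extended, the reachable region extends both above and below the base address.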
Base (displacement) Addressing Mode
 Base (displacement) addressing – operand is at the
memory location whose address is the sum of a
register and a 16-bit constant contained within the
instruction
op | rs | rt | offset
[Figure: base addressing. The offset field is added to Register ($rs) to give the address of a Byte/Halfword/Word in Memory]
Stored Program Concept
 Instructions are bits
 Programs are stored in memory
 to be read or written just like data
 Fetch & Execute Cycle
 Instructions are fetched and put into
a special register
 Bits in the register "control" the
subsequent actions
 Fetch the “next” instruction and
continue
Control
 Decision making instructions
 alter the control flow,
 i.e., change the "next" instruction to be executed
 MIPS conditional branch instructions:
bne $t0,$t1,Label
beq $t0,$t1,Label
 Example:
C:
if (i==j)
  h = i + j;
MIPS:
       bne $s0,$s1,Label
       add $s3,$s0,$s1
Label: ....
Control
 MIPS unconditional branch instructions:
j label
 Example:
C:
if (i!=j)
  h=i+j;
else
  h=i-j;
MIPS:
      beq $s4,$s5,Lab1
      add $s3,$s4,$s5
      j   Lab2
Lab1: sub $s3,$s4,$s5
Lab2: ...
Can you build a simple for loop?
Recap:
 Instruction          Meaning
add $s1,$s2,$s3       $s1 = $s2 + $s3
sub $s1,$s2,$s3       $s1 = $s2 - $s3
lw $s1,100($s2)       $s1 = Memory[$s2+100]
sw $s1,100($s2)       Memory[$s2+100] = $s1
bne $s4,$s5,Label     Next ins. at Label if $s4≠$s5
beq $s4,$s5,Label     Next ins. at Label if $s4=$s5
j Label               Next ins. at Label
 Formats:
R-format   op | rs | rt | rd | shamt | funct
I-format   op | rs | rt | 16-bit address/number
J-format   op | 26-bit address
More Control 'Instructions'
 We have: beq, bne, what about blt (branch-if-less-than)?
 New instruction:
slt $t0,$s1,$s2
  if $s1 < $s2 then
    $t0 = 1
  else
    $t0 = 0
 Can use slt to synthesise "blt $s1,$s2,Label"
   can now build general control structures
 Note that the assembler needs a register to do this: $at
Constants
 Constant (immediate) operands are frequently used in
programs
e.g.,
A = A + 4;
B = B + 1;
C = C - 16;
 Possible approaches?
 put 'typical constants' in memory and load them.
 create hard-wired registers (like $zero) for constants like 1.
 have special instructions that contain constants !
 Note: small constants are very common (>50% of
operands)
 Q: Which instruction format(s) to use?
 A: See Design Principle #4…
Design Principle #4
Make the Common Case Fast
 Analysis of a large variety of compiled programs
reveals that the vast majority of constants used are
quite small numbers: >90% within the range of a 16-bit two's complement integer.
 Obvious choice is to use the I-type format for
instructions that have as an operand this most
common case of constant.
 Hence, these typical MIPS 'Immediate' instructions:
addi $sp,$sp,4     # $sp = $sp + 4
slti $t0,$s2,15    # $t0 = 1 if $s2 < 15
What about larger constants?
 There must be a way to 'load' a 32-bit constant into a
register. Compromise by using two instructions:
 "Load Upper Immediate" (lui) instruction:
lui $t0,1010101010101010b
$t0      1010101010101010 0000000000000000   (lower half zero filled)
 Followed by a "logical or" (ori) instruction:
ori $t0,$t0,1010101010101010b
$t0      1010101010101010 0000000000000000
ori      0000000000000000 1010101010101010
$t0      1010101010101010 1010101010101010
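The two-instruction sequence can be modelled in a few lines of Python. This is a sketch for illustration; the functions `lui` and `ori` are stand-ins written for this example that mimic only the effect of the instructions on a register value.

```python
def lui(imm16):
    """Place the 16-bit immediate in the upper half; the lower half is zero."""
    return (imm16 & 0xFFFF) << 16


def ori(reg, imm16):
    """OR the zero-extended 16-bit immediate into the register value."""
    return reg | (imm16 & 0xFFFF)


t0 = lui(0b1010101010101010)       # lui $t0,1010101010101010b
print(f"{t0:032b}")
t0 = ori(t0, 0b1010101010101010)   # ori $t0,$t0,1010101010101010b
print(f"{t0:032b}")
```

For an arbitrary 32-bit constant `c`, the two immediates would be `c >> 16` for lui and `c & 0xFFFF` for ori.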
Assembly Language vs. Machine Language
 Assembly provides convenient symbolic
representation
 much easier than writing down numbers
 e.g., destination first
 Machine language is the underlying reality
 e.g., destination is no longer first
 Assembly can provide 'pseudoinstructions'
 e.g., “move $t0,$t1” exists only in Assembly
 would be implemented by “add $t0,$t1,$zero”
 When considering performance you should count real
instructions
Addresses in Branches and Jumps
 Instructions:
bne $t4,$t5,Label     Next instruction is at Label if $t4≠$t5
beq $t4,$t5,Label     Next instruction is at Label if $t4=$t5
j Label               Next instruction is at Label
 Formats:
I-format   op | rs | rt | 16-bit address
J-format   op | 26-bit address
 Addresses are not 32 bits
 How do we handle this with load and store
instructions?
Addresses in Branches
 Instructions:
bne $t4,$t5,Label     Next instruction is at Label if $t4≠$t5
beq $t4,$t5,Label     Next instruction is at Label if $t4=$t5
 Format:
I-format   op | rs | rt | 16-bit address (offset)
 Could specify a register (like lw and sw did) and add it
to address (offset). Q: Which register?
 A: Instruction Address Register (aka Program Counter - PC)
PC-Relative addressing
op | rs | rt | offset
[Figure: PC-relative addressing. The offset field is combined with the Program Counter (PC); the remaining boxes are shown as question marks]
Specifying Branch Destinations
 Why PC?
 its use is automatically implied by instruction
• PC gets updated (PC+4) during the fetch cycle so that it holds
the address of the next instruction
 limits the branch distance to -2^15 to +2^15-1 instructions from
the (instruction after the) branch instruction, but most
branches are local anyway. (Principle of Locality).
[Figure: branch address datapath. The offset from the low order 16 bits of the branch instruction is sign-extended to 32 bits with 00 appended; one adder forms PC + 4 and a second adder adds the extended offset to give the 32-bit branch dst address]
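The branch address calculation can be sketched in Python as below. This is an illustration added here under the assumptions stated on the slide (the offset counts instructions and is relative to the instruction after the branch); the function names and sample addresses are made up for the example.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_pc, offset16):
    """(PC + 4) plus the sign-extended offset shifted left by 2 (00 appended)."""
    return (branch_pc + 4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF


def branch_offset(branch_pc, target):
    """16-bit offset field an assembler would store to reach target."""
    return ((target - (branch_pc + 4)) >> 2) & 0xFFFF


pc = 0x00400000  # assumed address of the branch instruction
print(hex(branch_target(pc, 3)))             # three instructions past PC + 4
print(hex(branch_offset(pc, 0x00400010)))    # offset field needed for that target
```

Appending 00 works because instructions are word aligned, so the 16-bit field can count instructions rather than bytes.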
Addresses in Jumps
 Instruction:
j Label               Next instruction is at Label
 Format:
J-format   op | 26-bit address
 Jump instructions just use the high order bits of the PC
 A compromise: such jumps are limited by address
boundaries of 256 MB, i.e., within blocks of 2^26 instructions.
[Figure: jump address formation. The low order 26 bits of the jump instruction have 00 appended and are concatenated with the upper 4 bits of the PC to give the 32-bit jump address]
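A minimal Python sketch of the jump address formation follows. It is an illustration only: `jump_target` is a name assumed here, and taking the upper 4 bits from PC + 4 (the already incremented PC) is an assumption of this sketch.

```python
def jump_target(jump_pc, addr26):
    """Upper 4 bits of the incremented PC, then the 26-bit field, then 00."""
    upper = (jump_pc + 4) & 0xF0000000
    return upper | ((addr26 & 0x03FFFFFF) << 2)


# A 26-bit field of 2500 names byte address 2500 * 4 = 10000 in the current block.
print(jump_target(0x00400000, 2500))
print(hex(jump_target(0x10000000, 2500)))
```

The 26-bit field plus the two appended zero bits give 28 address bits, which is where the 256 MB boundary comes from.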
MIPS ISA So Far
Category          Instr                        Op Code    Example             Meaning
Arithmetic        add                          0 and 32   add $s1, $s2, $s3   $s1 = $s2 + $s3
(R & I format)    subtract                     0 and 34   sub $s1, $s2, $s3   $s1 = $s2 - $s3
                  add immediate                8          addi $s1, $s2, 6    $s1 = $s2 + 6
                  or immediate                 13         ori $s1, $s2, 6     $s1 = $s2 | 6
Data Transfer     load word                    35         lw $s1, 24($s2)     $s1 = Memory($s2+24)
(I format)        store word                   43         sw $s1, 24($s2)     Memory($s2+24) = $s1
                  load byte                    32         lb $s1, 25($s2)     $s1 = Memory($s2+25)
                  store byte                   40         sb $s1, 25($s2)     Memory($s2+25) = $s1
                  load upper imm               15         lui $s1, 6          $s1 = 6 * 2^16
Cond. Branch      br on equal                  4          beq $s1, $s2, L     if ($s1==$s2) go to L
(I & R format)    br on not equal              5          bne $s1, $s2, L     if ($s1 !=$s2) go to L
                  set on less than             0 and 42   slt $s1, $s2, $s3   if ($s2<$s3) $s1=1 else $s1=0
                  set on less than immediate   10         slti $s1, $s2, 6    if ($s2<6) $s1=1 else $s1=0
Uncond. Jump      jump                         2          j 2500              go to 10000
(J & R format)    jump register                0 and 8    jr $t1              go to $t1
                  jump and link                3          jal 2500            go to 10000; $ra=PC+4
Review of MIPS Operand Addressing Modes
 Register addressing – operand is in a register
op | rs | rt | rd | funct
[Figure: a register field selects the Register holding the word operand]
 Base (displacement) addressing – operand is at the
memory location whose address is the sum of a register
and a 16-bit constant contained within the instruction
op | rs | rt | offset
[Figure: the offset is added to the base register to address a word or byte operand in Memory]
   Register relative (indirect) with 0($a0)
   Pseudo-direct with addr($zero)
 Immediate addressing – operand is a 16-bit constant
contained within the instruction
op | rs | rt | operand
Review of MIPS Instruction Addressing Modes
 PC-relative addressing –instruction address is the sum
of the PC and a 16-bit constant contained within the
instruction
op | rs | rt | offset
[Figure: the offset is added to the Program Counter (PC) to address the branch destination instruction in Memory]
 Pseudo-direct addressing – instruction address is the
26-bit constant contained within the instruction
concatenated with the upper 4 bits of the PC
op | jump address
[Figure: the jump address is concatenated (||) with the upper bits of the Program Counter (PC) to address the jump destination instruction in Memory]
Instructions for Accessing Procedures
 MIPS 'procedure call' instruction:
jal ProcedureAddress   # jump and link
 Saves PC+4 in register $ra ($31) to have a link to
the next instruction for the procedure return
 Instruction format (J-format):
       op       26-bit address
jal    000011   26-bit address
 Then can do procedure 'return' with a
jr $ra                 # return
 Instruction format (R-format):
       op       rs   rt   rd   shamt   funct
jr     000000   rs                     001000
Aside: Spilling Registers
 What if the callee needs more registers and/or the
procedure is recursive?
 use a stack – a last-in-first-out queue – in memory
for passing additional values or saving (recursive)
return address(es)
 One of the general registers, $sp, is used to address the
stack (which “grows” from high address to low address)
 Push (a register onto the stack):
addi $sp,$sp,-4
sw   $ra,0($sp)
 Pop (a register off the stack):
lw   $ra,0($sp)
addi $sp,$sp,4
[Figure: the stack in memory between high addr and low addr, with $sp pointing at the top of stack]
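The push and pop sequences can be mimicked with a small Python model. This is a sketch for illustration only: the class name `Stack`, the dictionary used as memory and the sample addresses are all assumptions, not part of the slides.

```python
class Stack:
    """Toy model of a downward-growing stack addressed by $sp."""

    def __init__(self, sp):
        self.sp = sp       # models the $sp register
        self.memory = {}   # maps byte addresses to stored words

    def push(self, value):
        self.sp -= 4                    # make room: $sp = $sp - 4
        self.memory[self.sp] = value    # sw value,0($sp)

    def pop(self):
        value = self.memory[self.sp]    # lw value,0($sp)
        self.sp += 4                    # release the slot: $sp = $sp + 4
        return value


stack = Stack(sp=0x7FFFFFFC)   # assumed initial stack pointer
stack.push(0x00400024)         # e.g. save a return address
print(hex(stack.sp))           # $sp has moved to a lower address
print(hex(stack.pop()))        # the saved value comes back
```

Pushing lowers the stack pointer and popping raises it again, so values come back in last-in-first-out order.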
Example: Nested Procedure Calls - MIPS code
A:  ...
    ...
    jal B             # Call B, save return addr in $ra
    ...
B:  ...               # Get ready to call C
    addi $sp,$sp,-4   # Adjust ToS to make room to...
    sw $ra,0($sp)     # ...'push' the old return addr
    jal C             # Call C, save return addr in $31
    lw $ra,0($sp)     # Restore B's return address...
    addi $sp,$sp,4    # ...and re-adjust ToS ('pop')
    ...
    jr $ra            # Return to proc that called B
C:  ...
    ...
    jr $ra            # Return to proc that called C
Passing Parameters to Procedures
 Conventions for passing parameters (arguments) may vary from
machine to machine, language to language, and even compiler to compiler.
 MIPS uses $4 to $7 ($a0-$a3) as arguments.
 There must also be a convention for preserving
registers across procedure calls. The two usual
conventions are:
   Caller save. The calling procedure (caller) has the
    responsibility for preserving affected registers. The called
    procedure (callee) can then modify any registers without
    constraint.
   Callee save. The callee has the responsibility for saving and
    restoring any registers that it might use. The calling
    procedure (caller) uses registers without worrying about their
    preservation.
MIPS Pseudoinstructions
 In keeping with design principles the MIPS ISA does
not contain complex instructions as these could
compromise the performance of all instructions.
 However, a MIPS compiler/assembler can synthesise
'pseudoinstructions' from common variations of real
instructions. Such pseudoinstructions simplify
translation and programming.
 Pseudoinstructions give MIPS a richer set of
assembly language instructions than those
implemented by hardware.
 The assembler reserves one register, $at, that is
used in the synthesis of many pseudoinstructions.
 For example…
Example MIPS Pseudoinstructions
 Pseudoinstruction       Real MIPS
 move $t0,$t1            add $t0,$t1,$zero
 clear $s0               add $s0,$zero,$zero
 blt $s1,$s2,label       slt $at,$s1,$s2
                          bne $at,$zero,label
 bge $s1,$s2,label       slt $at,$s1,$s2
                          beq $at,$zero,label
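The expansions above are mechanical, which a small Python sketch can make explicit. This is not how any particular assembler is implemented; `expand` is a hypothetical helper written for this illustration, and it handles only the four pseudoinstructions listed.

```python
def expand(pseudo):
    """Expand one pseudoinstruction string into a list of real instructions."""
    mnemonic, _, rest = pseudo.partition(" ")
    ops = [operand.strip() for operand in rest.split(",")]
    if mnemonic == "move":
        return [f"add {ops[0]},{ops[1]},$zero"]
    if mnemonic == "clear":
        return [f"add {ops[0]},$zero,$zero"]
    if mnemonic == "blt":
        return [f"slt $at,{ops[0]},{ops[1]}", f"bne $at,$zero,{ops[2]}"]
    if mnemonic == "bge":
        return [f"slt $at,{ops[0]},{ops[1]}", f"beq $at,$zero,{ops[2]}"]
    return [pseudo]  # anything else is passed through unchanged


for line in ["move $t0,$t1", "blt $s1,$s2,label"]:
    print(line, "->", expand(line))
```

The length of each returned list is the number of real instructions, which is the figure to count when considering performance.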
Summary of MIPS (RISC) Design Principles
 Simplicity favors regularity
 fixed size instructions – 32-bits
 small number of instruction formats
 opcode always the first 6 bits
 Good design demands good compromises
 three instruction formats
 Smaller is faster
 limited instruction set
 limited number of registers in register file
 limited number of addressing modes
 Make the common case fast
 arithmetic operands from the register file (load-store
machine)
 allow instructions to contain immediate operands
Fallacies and Pitfalls
 Fallacy: More powerful instructions mean higher
performance.
 Such instructions often do more work than is required in the
frequent case or don't match the requirements of the
language.
 Pitfall: To obtain the highest performance, write in
assembly language.
 The increasing sophistication of modern compilers means
that the gap between compiled code and 'hand-crafted' code
is closing fast.
 Even if the gap isn't closed completely, the drawbacks of
writing in assembly language are longer time spent coding
and debugging, the loss in portability, and difficulty of
maintenance.
 Pitfall: Forgetting that sequential word addresses in
memory differ by 4, not by 1.