
ECE 15B Computer Organization
Spring 2010
Dmitri Strukov
Lecture 4: Arithmetic / Data Transfer Instructions
Partially adapted from Computer Organization and Design, 4th edition, Patterson and Hennessy, and classes taught by
Ryan Kastner at UCSB
Agenda
• Review of last lecture
• Load/store operations
• Multiply and divide instructions
ECE 15B Spring 2010
Last Lecture
Assembly Language
• Basic job of a CPU: execute lots of instructions
• Instructions are the primitive operations that the CPU may
execute
• Different CPUs implement different sets of
instructions
• Instruction Set Architecture (ISA) is a set of instructions a
particular CPU implements
• Examples: Intel 80x86 (Pentium 4), IBM/Motorola PowerPC
(Macintosh), MIPS, Intel IA64, ARM
Assembly Variables: Registers
• Unlike HLL like C or Java, assembly cannot use
variables
– Why not? Keep hardware simple
• Assembly Operands are registers
– Limited number of special locations built directly
into the hardware
– Operations can only be performed on these
– Benefit: Since the register file is small, it is very fast
Assembly Variables: Registers
• By convention, each register also has a name to make it
easier to code
• For now:
– $16 - $23 → $s0 - $s7 (correspond to C variables)
– $8 - $15 → $t0 - $t7 (correspond to temporary variables)
– Will explain the other 16 register names later
• In general, use names to make your code more readable
MIPS Syntax
• Instruction Syntax:
[Label:]  Op-code  [oper. 1],  [oper. 2],  [oper. 3]  [#comment]
  (0)       (1)       (2)         (3)         (4)        (5)
– Where
1) operation name
2,3,4) operands
5) comments
0) label field is optional, will discuss later
– For arithmetic and logic instructions
2) operand getting result (“destination”)
3) 1st operand for operation (“source 1”)
4) 2nd operand for operation (“source 2”)
• Syntax is rigid
– 1 operator, 3 operands
– Why? Keep hardware simple via regularity
Addition and Subtraction of Integers
• Addition in assembly
– Example:
add $s0, $s1, $s2 (in MIPS)
• Equivalent to: a = b + c (in C)
• Where MIPS registers $s0, $s1, $s2 are associated with C
variables a, b, c
• Subtraction in Assembly
– Example:
sub $s3, $s4, $s5 (in MIPS)
• Equivalent to: d = e - f (in C)
• Where MIPS registers $s3, $s4, $s5 are associated with C
variables d, e, f
Addition and Subtraction of Integers
• How do we do this?
f = (g + h) – (i + j)
• Use intermediate temporary registers
– Here f is in $s0 and g, h, i, j are in $s1, $s2, $s3, $s4
add $t0, $s1, $s2 # temp = g + h
add $t1, $s3, $s4 # temp = i + j
sub $s0, $t0, $t1 # f = (g+h)-(i+j)
Immediates
• Immediates are numerical constants
• They appear often in code, so there are
special instructions for them
• Add immediate:
addi $s0, $s1, 10 # f = g + 10 (in C)
– Where MIPS registers $s0 and $s1 are associated
with C variables f and g
– Syntax similar to add instruction, except that last
argument is a number instead of register
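The three-operand instructions reviewed above can be modeled in a few lines of Python. This is only a sketch for checking small examples by hand: the dictionary register file and the helper `compute_f` are hypothetical names, and results simply wrap to 32 bits (the sketch does not model any overflow behavior of the real instructions).

```python
MASK32 = 0xFFFFFFFF  # keep every result to 32 bits (assumption: wrap, no trap)

def add(regs, rd, rs, rt):
    """add rd, rs, rt: rd = rs + rt"""
    regs[rd] = (regs[rs] + regs[rt]) & MASK32

def sub(regs, rd, rs, rt):
    """sub rd, rs, rt: rd = rs - rt"""
    regs[rd] = (regs[rs] - regs[rt]) & MASK32

def addi(regs, rt, rs, imm):
    """addi rt, rs, imm: rt = rs + constant"""
    regs[rt] = (regs[rs] + imm) & MASK32

def compute_f(g, h, i, j):
    """f = (g + h) - (i + j), using the same registers as the slide."""
    regs = {"$s1": g, "$s2": h, "$s3": i, "$s4": j}
    add(regs, "$t0", "$s1", "$s2")   # temp = g + h
    add(regs, "$t1", "$s3", "$s4")   # temp = i + j
    sub(regs, "$s0", "$t0", "$t1")   # f = (g+h)-(i+j)
    return regs["$s0"]
```

Registers hold raw 32-bit patterns here, so a negative result such as `compute_f(1, 1, 2, 2)` comes back as `0xFFFFFFFE`.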
Load and Store Instructions
CPU Overview
[Figure: CPU overview]
… with muxes
[Figure: CPU overview with multiplexers. Can’t just join wires together; use multiplexers]
Memory Operands
• Main memory used for composite data
– Arrays, structures, dynamic data
• To apply arithmetic operations
– Load values from memory into registers
– Store result from register to memory
• Memory is byte addressed
– Each address identifies an 8-bit byte
• Words are aligned in memory
– Address must be a multiple of 4
• MIPS is Big Endian
– Most-significant byte at the lowest address of a word
– cf. Little Endian: least-significant byte at the lowest address
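To make byte order and alignment concrete, the sketch below lists the four bytes of a word in increasing address order for both conventions. The function names are made up for illustration; only Python's built-in `int.to_bytes` is used.

```python
def word_to_bytes(value, big_endian=True):
    """Return the 4 bytes of a 32-bit word, lowest address first."""
    order = "big" if big_endian else "little"
    return list(value.to_bytes(4, order))

def is_word_aligned(address):
    """A word address must be a multiple of 4."""
    return address % 4 == 0

# Big Endian: the most-significant byte 0x12 sits at the lowest address.
print([hex(b) for b in word_to_bytes(0x12345678)])
# Little Endian: the least-significant byte 0x78 sits at the lowest address.
print([hex(b) for b in word_to_bytes(0x12345678, big_endian=False)])
```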
Data Transfer: Memory to Register
• MIPS load Instruction Syntax
lw   register#, offset(register#)
(1)  (2)        (3)    (4)
– Where
1) operation name
2) register that will receive value
3) numerical offset in bytes
4) register containing pointer to memory
• lw – meaning Load Word
– 32 bits, or one word, are loaded at a time
Data Transfer: Register to Memory
• MIPS store Instruction Syntax
sw   register#, offset(register#)
(1)  (2)        (3)    (4)
– Where
1) operation name
2) register that will be written to memory
3) numerical offset in bytes
4) register containing pointer to memory
• sw – meaning Store Word
– 32 bits, or one word, are stored at a time
Memory Operand Example 1
• C code:
g = h + A[8];
– g in $s1, h in $s2, base address of A in $s3
• Compiled MIPS code:
– Index 8 requires offset of 32
• 4 bytes per word
lw $t0, 32($s3) # load word: offset 32, base register $s3
add $s1, $s2, $t0
Memory Operand Example 2
• C code:
A[12] = h + A[8];
– h in $s2, base address of A in $s3
• Compiled MIPS code:
– Index 8 requires offset of 32; index 12 requires offset of 48
lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
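The two memory operand examples can be traced with a small model of byte-addressed memory. This is a sketch under stated assumptions: memory is a Python `bytearray`, words are stored Big Endian, and `lw`, `sw`, and `example2` are illustrative helper names, not part of any real tool.

```python
def lw(memory, offset, base):
    """lw rt, offset(base): read the 32-bit word at address base + offset."""
    addr = base + offset
    if addr % 4 != 0:
        raise ValueError("word address must be a multiple of 4")
    return int.from_bytes(memory[addr:addr + 4], "big")

def sw(memory, value, offset, base):
    """sw rt, offset(base): write a 32-bit word at address base + offset."""
    addr = base + offset
    if addr % 4 != 0:
        raise ValueError("word address must be a multiple of 4")
    memory[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")

def example2(memory, h, base_a):
    """A[12] = h + A[8], with 4 bytes per array element."""
    t0 = lw(memory, 32, base_a)       # lw  $t0, 32($s3)
    t0 = (h + t0) & 0xFFFFFFFF        # add $t0, $s2, $t0
    sw(memory, t0, 48, base_a)        # sw  $t0, 48($s3)
```

For instance, with `memory = bytearray(64)` and `A[8]` preset to 100 by `sw(memory, 100, 32, 0)`, calling `example2(memory, 5, 0)` leaves 105 in `A[12]`.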
Registers vs. Memory
• Registers are faster to access than memory
• Operating on memory data requires loads and
stores
– More instructions to be executed
• Compiler must use registers for variables as
much as possible
– Only spill to memory for less frequently used
variables
– Register optimization is important!
Byte/Halfword Operations
• MIPS byte/halfword load/store
– String processing is a common case
lb rt, offset(rs)
lh rt, offset(rs)
– Sign extend to 32 bits in rt
lbu rt, offset(rs)
lhu rt, offset(rs)
– Zero extend to 32 bits in rt
sb rt, offset(rs)
sh rt, offset(rs)
– Store just rightmost byte/halfword
• Why do we need them?
– Characters and multimedia data are expressed in fewer than 32
bits; dedicated 8-bit and 16-bit load and store instructions
make operating on them faster
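The difference between lb and lbu only shows up when the top bit of the loaded byte is 1. The sketch below models the byte versions on a `bytearray`; the function names mirror the mnemonics but are hypothetical helpers, and the halfword versions would follow the same pattern with 16 bits.

```python
def lb(memory, addr):
    """Load byte, sign extended to a 32-bit pattern."""
    byte = memory[addr]
    return (byte | 0xFFFFFF00) if (byte & 0x80) else byte

def lbu(memory, addr):
    """Load byte unsigned, zero extended to a 32-bit pattern."""
    return memory[addr]

def sb(memory, value, addr):
    """Store just the rightmost byte of a 32-bit register value."""
    memory[addr] = value & 0xFF

mem = bytearray([0x41, 0xFE])
print(hex(lb(mem, 1)))    # sign extended: 0xfffffffe
print(hex(lbu(mem, 1)))   # zero extended: 0xfe
```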
Two’s Complement Representation
Multiply and Divide
Unsigned Binary Integers
• Given an n-bit number
$x = x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0$
• Range: $0$ to $+2^n - 1$
• Example
$0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 1011_2$
$= 0 + \cdots + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$
$= 0 + \cdots + 8 + 0 + 2 + 1 = 11_{10}$
• Using 32 bits
– 0 to +4,294,967,295
2s-Complement Signed Integers
• Given an n-bit number
$x = -x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0$
• Range: $-2^{n-1}$ to $+2^{n-1} - 1$
• Example
$1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1100_2$
$= -1 \times 2^{31} + 1 \times 2^{30} + \cdots + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$
$= -2{,}147{,}483{,}648 + 2{,}147{,}483{,}644 = -4_{10}$
• Using 32 bits
– –2,147,483,648 to +2,147,483,647
2s-Complement Signed Integers
• Bit 31 is sign bit
– 1 for negative numbers
– 0 for non-negative numbers
• $-(-2^{n-1})$ can’t be represented
• Non-negative numbers have the same unsigned and
2s-complement representation
• Some specific numbers
– 0: 0000 0000 … 0000
– –1: 1111 1111 … 1111
– Most-negative: 1000 0000 … 0000
– Most-positive: 0111 1111 … 1111
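The unsigned and 2s-complement readings of the same bit pattern can be compared directly. In this sketch the helper names `as_unsigned` and `as_signed` are made up; the signed version applies the rule that the top bit carries weight –2^(n–1).

```python
def as_unsigned(bits):
    """Value of a bit string read as an unsigned integer."""
    return int(bits.replace(" ", ""), 2)

def as_signed(bits):
    """Value of a bit string read as an n-bit 2s-complement integer."""
    digits = bits.replace(" ", "")
    n = len(digits)
    value = int(digits, 2)
    # If the sign bit is 1, its weight is -2^(n-1) instead of +2^(n-1),
    # which is the unsigned value minus 2^n.
    return value - (1 << n) if digits[0] == "1" else value

pattern = "1111 1111 1111 1111 1111 1111 1111 1100"
print(as_unsigned(pattern), as_signed(pattern))
```

Non-negative patterns give the same answer from both functions, as stated on the slide.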
Signed Negation
• Complement and add 1
– Complement means 1 → 0, 0 → 1
x  x  1111...1112  1
x  1  x

Example: negate +2


+2 = 0000 0000 … 00102
–2 = 1111 1111 … 11012 + 1
= 1111 1111 … 11102
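Complement-and-add-1 is short enough to check by machine. The sketch below works on n-bit patterns held in ordinary Python integers; `negate` is an illustrative name.

```python
def negate(x, n=32):
    """Negate an n-bit 2s-complement pattern: complement, then add 1."""
    mask = (1 << n) - 1
    complement = x ^ mask          # 1 -> 0, 0 -> 1
    return (complement + 1) & mask

print(hex(negate(2)))              # pattern for -2: 0xfffffffe
print(hex(negate(0x80000000)))     # most-negative value maps to itself
```

The second call shows why the negation of the most-negative number can't be represented: the result is the same bit pattern.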
Sign Extension
• Representing a number using more bits
– Preserve the numeric value
• In MIPS instruction set
– addi: extend immediate value
– lb, lh: extend loaded byte/halfword
– beq, bne: extend the displacement
• Replicate the sign bit to the left
– c.f. unsigned values: extend with 0s
• Examples: 8-bit to 16-bit
– +2: 0000 0010 => 0000 0000 0000 0010
– –2: 1111 1110 => 1111 1111 1111 1110
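Replicating the sign bit can be written as a mask operation. This is a minimal sketch with hypothetical helper names; it reproduces the 8-bit to 16-bit examples above.

```python
def sign_extend(value, from_bits, to_bits):
    """Widen a from_bits pattern to to_bits by replicating the sign bit."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):                      # sign bit is 1
        upper = ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
        value |= upper                                # fill new bits with 1s
    return value

def zero_extend(value, from_bits):
    """Widen an unsigned pattern: the new upper bits are all 0."""
    return value & ((1 << from_bits) - 1)

print(format(sign_extend(0b00000010, 8, 16), "016b"))  # +2
print(format(sign_extend(0b11111110, 8, 16), "016b"))  # -2
```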
Integer Addition
• Example: 7 + 6
+7: 0000 0000 … 0000 0111
+6: 0000 0000 … 0000 0110
+13: 0000 0000 … 0000 1101
Integer Subtraction
• Add negation of second operand
• Example: 7 – 6 = 7 + (–6)
+7: 0000 0000 … 0000 0111
–6: 1111 1111 … 1111 1010
+1: 0000 0000 … 0000 0001
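Subtraction as "add the negation" can be checked on 32-bit patterns. The helpers `add32` and `sub32` below are illustrative names, and the sketch simply discards any carry out of bit 31.

```python
MASK32 = 0xFFFFFFFF

def add32(a, b):
    """32-bit addition; any carry out of bit 31 is discarded."""
    return (a + b) & MASK32

def sub32(a, b):
    """a - b computed as a + (-b), where -b is complement-and-add-1."""
    neg_b = ((b ^ MASK32) + 1) & MASK32
    return add32(a, neg_b)

print(add32(7, 6))                         # 13
print(format(sub32(7, 6), "032b"))         # pattern for +1
```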
Multiplication
• Start with long-multiplication approach
multiplicand       1000
multiplier       × 1001
                   1000
                  0000
                 0000
                1000
product         1001000
• Length of product is the sum of operand lengths
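The long-multiplication approach translates into a shift-and-add loop: for each multiplier bit, add the (shifted) multiplicand if the bit is 1. This is a sketch of the idea for unsigned operands, not a description of the hardware figure; `shift_add_multiply` is a made-up name.

```python
def shift_add_multiply(multiplicand, multiplier, n=4):
    """Multiply two unsigned n-bit numbers by shift and add."""
    product = 0
    for _ in range(n):
        if multiplier & 1:            # current multiplier bit is 1
            product += multiplicand   # add this partial product
        multiplicand <<= 1            # next partial product is shifted left
        multiplier >>= 1              # move on to the next multiplier bit
    return product

print(format(shift_add_multiply(0b1000, 0b1001), "b"))  # 1001000
```

Two 4-bit operands never need more than 8 product bits, matching the rule that the product length is the sum of the operand lengths.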
Multiplication Hardware
[Figure: multiplication hardware, with one label reading “Initially 0”]
Stopped here… will start next lecture from here
Optimized Multiplier
• Perform steps in parallel: add/shift
• One cycle per partial-product addition
– That’s ok, if frequency of multiplications is low
Faster Multiplier
• Uses multiple adders
– Cost/performance tradeoff
• Can be pipelined
– Several multiplications performed in parallel
MIPS Multiplication
• Two 32-bit registers for product
– HI: most-significant 32 bits
– LO: least-significant 32 bits
• Instructions
– mult rs, rt / multu rs, rt
• 64-bit product in HI/LO
– mfhi rd / mflo rd
• Move from HI/LO to rd
• Can test HI value to see if product overflows 32 bits
– mul rd, rs, rt
• Least-significant 32 bits of product –> rd
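How the 64-bit product is split across HI and LO can be modeled as follows. This is a sketch: the functions return a `(HI, LO)` tuple instead of writing real registers, and `to_signed32` is a hypothetical helper for reading a pattern as a signed value.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Read a 32-bit pattern as a 2s-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def multu(rs, rt):
    """Unsigned multiply: returns (HI, LO) of the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    return product >> 32, product & MASK32

def mult(rs, rt):
    """Signed multiply: returns (HI, LO) of the 64-bit 2s-complement product."""
    product = (to_signed32(rs) * to_signed32(rt)) & ((1 << 64) - 1)
    return product >> 32, product & MASK32

hi, lo = multu(0xFFFFFFFF, 2)
print(hex(hi), hex(lo))   # HI is nonzero: the unsigned product needs more than 32 bits
```

The same operand patterns give different HI values under mult, because 0xFFFFFFFF is read as –1 there.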
Division
• Check for 0 divisor
• Long division approach
– If divisor ≤ dividend bits
• 1 bit in quotient, subtract
– Otherwise
• 0 bit in quotient, bring down next dividend bit
• Restoring division
– Do the subtract, and if remainder goes < 0, add divisor back
• Signed division
– Divide using absolute values
– Adjust sign of quotient and remainder as required
• Example
                  1001      quotient
divisor   1000 ) 1001010    dividend
                -1000
                    10
                    101
                    1010
                   -1000
                      10    remainder
• n-bit operands yield n-bit quotient and remainder
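Restoring division can be sketched as a loop over the dividend bits, most-significant first. This is a simplified variant for unsigned operands, written to mirror the long-division steps rather than any particular hardware layout; `restoring_divide` and its parameter `n` (the number of dividend bits) are illustrative.

```python
def restoring_divide(dividend, divisor, n=7):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("check for 0 divisor before dividing")
    quotient = 0
    remainder = 0
    for i in range(n - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor              # do the subtract
        if remainder < 0:
            remainder += divisor          # went negative: add divisor back
            quotient = quotient << 1      # 0 bit in quotient
        else:
            quotient = (quotient << 1) | 1  # 1 bit in quotient
    return quotient, remainder

q, r = restoring_divide(0b1001010, 0b1000)
print(format(q, "b"), format(r, "b"))     # 1001 10
```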
Division Hardware
[Figure: division hardware, with labels “Initially divisor in left half” and “Initially dividend”]
Optimized Divider
• One cycle per partial-remainder subtraction
• Looks a lot like a multiplier!
– Same hardware can be used for both
Faster Division
• Can’t use parallel hardware as in multiplier
– Subtraction is conditional on sign of remainder
• Faster dividers (e.g. SRT division) generate
multiple quotient bits per step
– Still require multiple steps
MIPS Division
• Use HI/LO registers for result
– HI: 32-bit remainder
– LO: 32-bit quotient
• Instructions
– div rs, rt / divu rs, rt
– No overflow or divide-by-0 checking
• Software must perform checks if required
– Use mfhi, mflo to access result
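A small model of div and divu shows where the remainder and quotient end up. This is a sketch under stated assumptions: the functions return `(HI, LO)` as a tuple, signed division is done on absolute values with the signs adjusted afterwards (quotient truncated toward zero, remainder taking the sign of the dividend), and the zero-divisor check stands in for the software check the slide calls for.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Read a 32-bit pattern as a 2s-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def divu(rs, rt):
    """Unsigned divide: returns (HI, LO) = (remainder, quotient)."""
    if rt & MASK32 == 0:
        raise ZeroDivisionError("software must check for a 0 divisor")
    a, b = rs & MASK32, rt & MASK32
    return a % b, a // b

def div(rs, rt):
    """Signed divide: returns (HI, LO) = (remainder, quotient) as patterns."""
    a, b = to_signed32(rs), to_signed32(rt)
    if b == 0:
        raise ZeroDivisionError("software must check for a 0 divisor")
    quotient = abs(a) // abs(b)           # divide using absolute values
    if (a < 0) != (b < 0):
        quotient = -quotient              # adjust sign of quotient
    remainder = a - quotient * b          # remainder follows the dividend
    return remainder & MASK32, quotient & MASK32

hi, lo = div(-7 & MASK32, 2)
print(to_signed32(hi), to_signed32(lo))   # -1 -3
```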
Conclusions
• In MIPS assembly language
– Registers replace C variables
– One instruction (simple operation) per line
– Simpler is faster
• Memory is byte-addressable, but lw and sw
access one word at a time
• A pointer (used by lw and sw) is just a memory
address, so we can add to it or subtract from it
(using offset)
Review
• Instructions so far:
add, addi, sub
mult, div, mfhi, mflo, lw, sw,
lb, lbu, lh, lhu
• Registers so far
C variables: $s0 - $s7
Temporary variables: $t0 - $t9
Zero: $zero