Computer Architecture

Download Report

Transcript Computer Architecture

Chapter 3
Arithmetic for
Computers
Multiplication

More complicated than addition



accomplished via shifting and addition
More time and more area
Let's look at 3 versions based on grade
school algorithm
0010
__x_1011
(multiplicand)
(multiplier)
Multiplication: Implementation
Start
Multiplicand
Multiplier0 = 1
Shift left
1. Test
Multiplier0
Multiplier0 = 0
64 bits
Multiplier
Shift right
64-bit ALU
1a. Add multiplicand to product and
place the result in Product register
32 bits
2. Shift the Multiplicand register left 1 bit
Product
Write
Control test
64 bits
Product
0000 0000
 0000 0010
 0000 0110
 0000 0110

3. Shift the Multiplier register right 1 bit
Multiplier
0011
0001
0000
Multiplicand
0000 0010
0000 0100
0000 1000
32nd repetition?
No: < 32 repetitions
Yes: 32 repetitions
Done
Second Version
Start
Multiplicand
32 bits
Multiplier0 = 1
1. Test
Multiplier0
Multiplier0 = 0
Multiplier
Shift right
32-bit ALU
32 bits
Product
64 bits
1:
2:
3:
1:
2:
3:
1:
2:
3:
1:
2:
3:
Shift right
Write
1a. Add multiplicand to the left half of
the product and place the result in
the left half of the Product register
Control test
Product Multiplier Multiplicand
0000 0000 0011
0010
0010 0000 0011
0010
0001 0000 0011
0010
0001 0000 0001
0010
0011 0000 0001
0010
0001 1000 0001
0010
0001 1000 0000
0010
0001 1000 0000
0010
0000 1100 0000
0010
0000 1100 0000
0010
0000 1100 0000
0010
0000 0110 0000
0010
0000 0110 0000
0010
0000 0110
0000
0010
2. Shift the Product register right 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
No: < 32 repetitions
Yes: 32 repetitions
Done
Final Version
Start
Product0 = 1
1. Test
Product0
Product0 = 0
Multiplicand
1a. Add multiplicand to the left half of
the product and place the result in
the left half of the Product register
32 bits
32-bit ALU
2. Shift the Product register right 1 bit
Product
Shift right
Write
Control
test
32nd repetition?
No: < 32 repetitions
Yes: 32 repetitions
64 bits
Done
Signed Multiplication



Convert the multiplier and multiplicand to
positive numbers and remember the original
signs.
Negate the product if the original signs
disagree.
A more elegant method: Booth’s algorithm
Booth’s Algorithm

Classifying groups of bits into the beginning,
the middle or the end of a run of 1s
Middle of Run
End of Run

0
1
1
1
0
Beginning of Run
Take a look at 2-bit groups
Current Bit
1
1
0
0
Bit to the right
0
1
1
0
Explanation
Beginning of a run of 1s
Middle of a run of 1s
End of a run of 1s
Middle of a run of 0s
Example
00001111000
00001111000
00001111000
00001111000
Booth’s Algorithm (Cont’d)

Depending on the current and previous bits,
do one of the following:





00: no arithmetic operation
01: End of a string of 1s, so add multiplicand to
the left half of the product
10: Beginning of a strings of 1s, so subtract the
multiplicand from the left half of the product
11: no arithmetic operation
Shift the Product register right 1 bit.
Example
 2x-3 = 6 or 0010x1101 = 1111 1010
Iteration
Step
Multiplicand
Product
0
Initial values
0010
0000 1101 0
1
1c:10=>P-M
0010
1110 1101 0
2: Shift right P
0010
1111 0110 1
2
1b:01=>P=P+M
0010
0001 0110 1
2: Shift right P
0010
0000 1011 0
3
1c:10=>P-M
0010
1110 1011 0
2: Shift right P
0010
1111 0101 1
4
1d:11=> no operation 0010
1111 0101 1
2: Shift right P
0010
1111 1010 1
Multiply by 2^i via shift



Shift left by one bit --> multiply by 2
Shift left by n bit --> multiply by 2^n
Proof of Booth’s Algorithm
Why does it work for 2’s
complement signed number?
ai
0
0
1
1


ai-1
0
1
0
1
Operation
Do nothing
Add b
Subtract b
Do nothing
In other words, if (ai-1 - ai)
 = 0 do nothing
 = 1 add b
 = -1 subtract b
Booth’s algorithm can be written as:
(a-1 - a0) x b x 2^0
+ (a0- a1) x b x 2^1
+…
+(a30-a31)x b x 2^31 = … = b x a
Faster Multiplication
• Use 32 adders instead of using a
single 32-bit adder on at a time.
Multiply in MIPS

Use a pair of 32-bit registers to contain the 64-bit product,
called Hi and Lo





mult: signed multiplication
multu: unsigned multiplication
mflo: move from lo to fetch the integer 32-bit product
mfhi
Both mult and multu ignore overflow! You must take care
of this yourself in the software.



How? Check the value of Hi
Hi must be 0 for multu
Hi must be the replicate sign of Lo for mult
Division



Some definitions:
Dividend, Divisor, Quotient, Remainder
Dividend = Quotient X Divisor + Remainder
Example: 1001010 divided by 1000
1001
Divisor 1000 1001010
–1000
10
101
1010
–1000
10
Quotient
Dividend
Remainder (or Modulo result)
First Version of the division
hardware
Start
Divisor
1. Subtract the Divisor register from the
Remainder register and place the
result in the Remainder register
Shift right
64 bits
Quotient
Shift left
64-bit ALU
Remainder > 0
Test Remainder
Remainder < 0
32 bits
Remainder
Write
64 bits
Control
test
2a. Shift the Quotient register to the left,
setting the new rightmost bit to 1
2b. Restore the original value by adding
the Divisor register to the Remainder
register and place the sum in the
Remainder register. Also shift the
Quotient register to the left, setting the
new least significant bit to 0
3. Shift the Divisor register right 1 bit
33rd repetition?
No: < 33 repetitions
Yes: 33 repetitions
Done
Observations on the first version of the
division hardware

1/2 bits in divisor always 0




1/2 of 64-bit adder is wasted
1/2 of divisor is wasted
Instead of shifting divisor to right, shift
remainder to left?
1st step cannot produce a 1 in quotient bit
(otherwise too big)

switch order to shift first and then subtract, can
save 1 iteration
Second Version of the division
hardware
Divisor
32 bits
Quotient
Shift left
32-bit ALU
32 bits
Remainder
64 bits
Shift left
Write
Control
test
Final Version of the division
hardware
Start
1. Shift the Remainder register left 1 bit
2. Subtract the Divisor register from the
left half of the Remainder register and
place the result in the left half of the
Remainder register
Divisor
32 bits
Remainder > 0
Test Remainder
Remainder < 0
32-bit ALU
3a. Shift the Remainder register to the
left, setting the new rightmost bit to 1
Shift right
Remainder Shift left
Write
Control
test
3b. Restore the original value by adding
the Divisor register to the left half of the
Remainder register and place the sum
in the left half of the Remainder register.
Also shift the Remainder register to the
left, setting the new rightmost bit to 0
64 bits
32nd repetition?
No: < 32 repetitions
Yes: 32 repetitions
Done. Shift left half of Remainder right 1 bit
Signed Division



Remember signs of the divisor and dividend,
negate the quotient if the signs disagree.
What about the sign of the remainder?
Rule: Dividend and remainder must have the
same sign.
Division in MIPS




Same hardware can be used for both multiply
and divide so long as we have a 64-bit
register that can shift left or right and a 32-bit
ALU that adds or subtracts.
Hi: contains remainder
Lo: contains quotient
div vs divu
Representing Real Numbers

We need a way to represent

numbers with fractions, e.g., 3.1416

very small numbers, e.g., .000000001

very large numbers, e.g., 3.15576 x 109

Use normalized scientific notation

Called floating point number because the binary point is
not fixed.

Advantages:

simplifies data exchange

simplifies floating point arithmetic algorithms

increases accuracy
Floating Point Representation



Representation:
(–1)sign x significand x 2exponent

sign, exponent, significand:

more bits for significand gives more accuracy

more bits for exponent increases range
IEEE 754 floating point standard:

single precision: 8 bit exponent, 23 bit significand

double precision: 11 bit exponent, 52 bit significand
Need to worry about underflow, i.e., a number too small
to be represented by floating point expression.
IEEE 754 floating-point
standard


Leading “1” bit of significand is implicit
Exponent is “biased” to make sorting easier

all 0s is smallest exponent all 1s is largest
bias of 127 for single precision and 1023 for double precision

summary: (–1)sign x (1+significand) x 2exponent – bias


Example:




decimal: -.75 = -3/4 = -3/22
binary: -.11 = -1.1 x 2-1
floating point: exponent = -1+127 = 126 = 01111110
IEEE single precision: 10111111010000000000000000000000
Converting Binary to Decimal
Floating Point





1 1000001 01 0000…00
Sign bit =1
Exponent = 129
Significand = 1x2^-2 = 0.25
Decimal value = (-1)^1(1+0.25)x2^(129-127)
=-5.0
Representing 0 in IEEE 754
standard



Smallest single precision normalized number
= 1.000000…00 x 2^(-126)
Leading “1” bit of significand is implicit except for the value 0.
Reserve exponent value 0 for zero so that the hardware won’t attach
a leading 1 to it.
Exponent
Significand
Object represented
0
0
0
0
nonzero
+- denormalized number
1-254
anything
Floating point number
255
0
+- infinity
255
nonzero
Not a Number (NaN)
Floating Point Addition





Decimal example: 9.999x10^1 + 1.610x 10^1
Step 1: Align the decimal point
Step 2: Add significands
Step 3: Convert to normalized form, check for
underflow or overflow
Step 4: Round the resulting significand
Floating-Point
Addition
Start
1. Compare the exponents of the two numbers.
Shift the smaller number to the right until its
exponent would match the larger exponent
2. Add the significands
3. Normalize the sum, either shifting right and
incrementing the exponent or shifting left
and decrementing the exponent
Overflow or
underflow?
Yes
No
4. Round the significand to the appropriate
number of bits
No
Still normalized?
Yes
Done
Exception
Block Diagram of Floating-Point
Addition Unit
Sign
Exponent
Significand
Sign
Exponent
Significand
Compare
exponents
Small ALU
Exponent
difference
0
1
0
Control
1
0
Shift smaller
number right
Shift right
Big ALU
0
1
0
Increment or
decrement
Exponent
Add
1
Shift left or right
Rounding hardware
Sign
1
Significand
Normalize
Round
Floating-Point Multiplication






Example: 1.110x10^10x9.200x10^-5
Step 1: Add the exponent (then -127 since
bias is counted twice)
Step 2: Multiply the two significands
Step 3: Normalized the result, check for
underflow or overflow
Step 4: Round the resulting significand
Step 5: Determine the sign
Floating-Point
Multiplication
Start
1. Add the biased exponents of the two
numbers, subtracting the bias from the sum
to get the new biased exponent
2. Multiply the significands
3. Normalize the product if necessary, shifting
it right and incrementing the exponent
Overflow or
underflow?
Yes
No
4. Round the significand to the appropriate
number of bits
No
Still normalized?
Yes
5. Set the sign of the product to positive if the
signs of the original operands are the same;
if they differ make the sign negative
Done
Exception
Floating-Point Instructions in
MIPS







Addition: add.s, add,d
Subtraction: sub.s, sub.d
Multiplication: mul.s ,mul.d
Division: div.s, div.d
Comparison c.x.s or c.x.d where x may be eq,
neq, lt, le, gt,ge
Branch: bclt (branch true), bclf (branch false)
Separate registers for floating-point
operations
Floating Point Complexities



Operations are somewhat more complicated
In addition to overflow we can have “underflow”
Accuracy can be a big problem







IEEE 754 keeps two extra bits, guard and round
four rounding modes
positive divided by zero yields “infinity”
zero divide by zero yields “not a number”
other complexities
Implementing the standard can be tricky
Not using the standard can be even worse

see text for description of 80x86 and Pentium bug!
Rounding with Guard Digits



Add 2.56x10^0 to 2.34x10^2, assuming we
have 3 significant decimal digits
With guard and round bits: (2.3400 + 0.0256)
Without guard and round bits
Guard
Round
Fallacies and Pitfalls


Floating point addition is associative, i.e,
x+(y+z) = (x+y)+z
Counter example: -1.5x10^38+ (1.5
x10^38+1)
Right shit is the same as an integer division
by a power of 2: true for unsigned integers.
Summary


Computer arithmetic is constrained by limited precision
Bit patterns have no inherent meaning but standards do
exist




two’s complement
IEEE 754 floating point
Computer instructions determine “meaning” of the bit
patterns
Performance and accuracy are important so there are
many complexities in real machines (i.e., algorithms and
implementation).