CA03-Arithmetic for Computers

Download Report

Transcript CA03-Arithmetic for Computers

Computer Architecture
Chapter 3
Instructions: Arithmetic for Computer
Yu-Lun Kuo 郭育倫
Department of Computer Science and
Information Engineering
Tunghai University, Taichung, Taiwan R.O.C.
[email protected]
http://www.csie.ntu.edu.tw/~d95037/
Review: MIPS Organization
Processor
src1 addr
5
src2 addr
5
1…1100
src1
Register
File
32 data
32
registers
dst addr
write
data
Memory
read/write
src2
5 ($zero - $ra)
32 data
32
32 bits
addr
32
branch offset
32
32 Add
PC
32 Add
4
Fetch
32
Decode
32
write data
32
32
PC = PC+4
Exec
32
word
s
read data
32
32 ALU
230
32
4
0
5
1
6
2
32 bits
byte address
(big Endian)
7
3
0…1100
0…1000
0…0100
0…0000
word address
(binary)
2
Memory Address Binding
• High level language 與 Machine language
在memory address轉換的對應方法
– 編譯期鏈結
– 載入期鏈結
» Relocate
– 執行期鏈結
» Dynamic linking & Dynamic loading
3
Review: MIPS Addressing Modes
1. Operand: Register addressing
op
rs
rt
rd
Register
funct
word operand
2. Operand: Base/Displacement addressing
op
rs
rt
R-Type
Memory
offset
word or byte operand
I-Type
base register
3. Operand: Immediate addressing
op
rs
rt
addi
operand
4. Instruction: PC-relative addressing
op
rs
rt
Memory
offset
branch destination instruction
Branch
Program Counter (PC)
5. Instruction: Pseudo-direct addressing
op
Memory
jump address
||
Program Counter (PC)
jump destination instruction
J-Type
4
Overview
• Data type includes representation and
operations
• let’s look at some arithmetic operations:
– Addition
– Subtraction
– Sign Extension
• Also look at overflow conditions for addition.
• Multiplication, division, etc.
• Logical operations are also useful:
– AND
– OR
– NOT
5
Signed and Unsigned Numbers
• Humans  base 10, Computers  base 2
• Bits are just bits (no inherent meaning)
– Conventions define relationship between bits
and numbers
• Binary numbers (base 2)
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001...
decimal: 0...2n-1
• Of course it gets more complicated:
– Numbers are finite (overflow)
– Negative numbers
» e.g., no MIPS subi instruction; addi can add a
negative number
6
Signed and Unsigned Numbers
• Computer program calculate both positive
and negative numbers
– Distinguishes the positive from the negative
– Obvious solution
» Add a separate sign
» Conveniently can be represented in a sign
bit
» Name of this representation is sign and
magnitude
7
Signed and Unsigned Numbers
• Sign and magnitude representation has
several shortcomings
– Not obvious where to put the sign bit
» Right or left?
– Adders may need an extra step to set the sign
– Has both a positive and negative zero
» Lead to problems for inattentive
programmers
8
Signed and Unsigned Numbers
• How do we represent negative numbers?
– i.e., which bit patterns will represent which
numbers?
» No use sign and magnitude
» Use two’s complement representation
• Leading 0s mean positive
• Leading 1s mean negative
• Two’s complement advantage
– All negative numbers have a 1 in the most
significant bit
» Hardware needs to test only this bit (+ or -)
9
Signed and Unsigned Numbers
• 32-bit signed numbers (2’s complement):
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = + 1ten
...
0111
0111
1000
1000
...
MSB
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1110two
1111two
0000two
0001two
=
=
=
=
+
+
–
–
maxint
2,147,483,646ten
2,147,483,647ten
2,147,483,648ten
2,147,483,647ten
1111 1111 1111 1111 1111 1111 1111 1110two = – 2ten
1111 1111 1111 1111 1111 1111 1111 1111two = – 1ten
minint
LSB
(2)10 = (0000 0000 0000 0000 0000 0000 0000 0010)2
(1111 1111 1111 1111 1111 1111 1111 1101) 2
(-2)10 = (1111 1111 1111 1111 1111 1111 1111 1110) 2
10
Signed and Unsigned Numbers
• Represent positive and negative numbers
(x31 x -231)+(x30 x 230)+(x29 x 229)+…+ (x1 x 21)+(x0 x 20)

Converting < 32-bit values into 32-bit values

Copy the most significant bit (the sign bit) into the “empty” bits
0010
1010
-> 0000 0010
-> 1111 1010
versus
(sign extend)
zero extend (lb vs. lbu)

sign extend

slt vs. slti (set on less than immediate)

sltu (set on less than unsigned) vs. sltiu
11
Sign Extension
• To add two numbers, we must represent
them with the same number of bits
• If we just pad with zeroes on the left:
4-bit
8-bit
0100 (4)
00000100 (still 4)
1100 (-4)
00001100
(12, not -4)
• Instead, replicate the MS bit -- the sign bit:
4-bit
8-bit
0100 (4)
00000100 (still 4)
1100 (-4)
11111100
(still -4)
12
Signed versus Unsigned Comparison
• Suppose register $s0 has binary number
1111 1111 1111 1111 1111 1111 1111 1111
• Register $s1 has the binary number
0000 0000 0000 0000 0000 0000 0000 0001
What are the value of registers $t0 and $t1
•
slt $t0, $s0, $s1
#signed
•
sltu $t1, $s0, $s1
#unsigned
Ans: $t0 = 1 and $t1 = 0
13
MIPS Arithmetic Logic Unit (ALU)
zeroovf
• Must support the Arithmetic/Logic
operations of the ISA
A
add, addi, addiu, addu
sub, subu,
mult, multu, div, divu
B
sqrt
and, andi, nor, or, ori, xor, xori
beq, bne, slt, slti, sltiu, sltu

1
1
32
ALU
result
32
32
4
m (operation)
With special handling for

sign extend – addi, addiu andi, ori, xori, slti, sltiu

zero extend – lbu, addiu, sltiu

no overflow detected – addu, addiu, subu, multu, divu, sltiu, sltu
14
Review: 2’s Complement Representation
-23 =

Negate
-(23 - 1) =
1011
and add a 1
1010
complement all the bits
23 - 1 =
2’sc binary
1000
1001
1010
1011
1100
1101
1110
1111
0000
0001
0010
0011
0100
0101
0110
0111
decimal
-8
-7
-6
-5
-4
-3
-2
-1
0
1
2
3
4
5
6
7
15
A 32-bit Ripple Carry Adder/Subtractor

Remember 2’s
complement is just

add/sub
complement all the bits
c0=carry_in
A0
B0
control
(0=add,1=sub)
B0
A
0111
B - 0110
0001

0111
 + 1001
1
1 0001
S1
c2
1-bit
FA
B2
S2
c3
...
add a 1 in the least
significant bit
1-bit
FA
B1
A2

S0
c1
A1
B0 if control = 0,
!B0 if control = 1
1-bit
FA
c31
A31
B31
1-bit
FA
S31
c32=carry_out
16
Addition & Subtraction
• Just like in grade school (carry/borrow 1s)
0111
0111
+ 0110
- 0110
0110
- 0101
• Two's complement operations easy
– subtraction using addition of negative numbers
0111
+ 1010
• Overflow (result too large for finite computer word):
– e.g., adding two n-bit numbers does not yield an n-bit number
0111
+ 0001
note that overflow term is somewhat misleading,
1000
it does not mean a carry “overflowed”
17
Overflow
• If operands are too big, then sum cannot
be represented as an n-bit 2’s comp
number
01000
(8)
11000
(-8)
+ 01001
(9)
+ 10111
(-9)
10001
(-15)
01111
(+15)
18
Overflow Detection
• Overflow
– the result is too large to represent in 32 bits
• Overflow occurs when
– adding two positives yields a negative
– adding two negatives gives a positive
– subtract a negative from a positive gives a negative
– subtract a positive from a negative gives a positive
0
+
1
1
1
1
0
1
1
1
7
0
0
1
1
3
1
0
1
0
–6
+
0
1
1
0
0
–4
1
0
1
1
–3
0
1
1
1
7
19
Tailoring the ALU to the MIPS ISA
•
•
•
•
Need to support the logic operation (and,nor,or,xor)
–
Bit wise operations (no carry operation involved)
–
Need a logic gate for each function, mux to choose the output
Need to support the set-on-less-than instruction (slt)
–
Use subtraction to determine if (a – b) < 0 (implies a < b)
–
Copy the sign bit into the low order bit of the result, set remaining
result bits to 0
Need to support test for equality (bne, beq)
–
Again use subtraction: (a - b) = 0 implies a = b
–
Additional logic to “nor” all result bits together
Immediates are sign extended outside the ALU with
wiring (i.e., no logic needed)
20
Logic Operation (*)
• Shift Operations
– Shifts move all the bits in a word left or right
sll
$t2, $s0, 8
#$t2 = $s0 << 8 bits
srl
$t2, $s0, 8
#$t2 = $s0 >> 8 bits
op
rs
rt
rd
shamt
funct
• The shift operation is implemented by
hardware separate from the ALU
21
Logic Operation (*)
• sll (R-format)
• Ex. If register $s0 is
0000 0000 0000 0000 0000 0000 0000 1101
execute sll $s2, $s0, 8
What is the value of $s2?
0000 0000 0000 0000 0000 1101 0000 0000
sll $s2, $s0, 8
0
unused 16
10
8
0
22
Arithmetic Logic Unit (ALU)
• Using 4 kinds hardware components
23
Logical Operations
• Operations on logical TRUE or FALSE
– two states -- takes one bit to represent:
TRUE=1, FALSE=0
A
0
0
1
1
B
0
1
0
1
A AND B
0
0
0
1
A
0
0
1
1
B A OR B
0
0
1
1
0
1
1
1
A
0
1
NOT A
1
0
24
Logic Operation (*)
• And (and) instruction
• Or (or) instruction
• Ex. If register $t2 is
0000 0000 0000 0000 0000 1101 0000 0000
If register $t1 is
0000 0000 0000 0000 0011 1100 0000 0000
– Then execute and $t0, $t1, $t2
The value of $t0 is
0000 0000 0000 0000 0000 1100 0000 0000
25
Examples of Logical Operations
• AND
– useful for clearing bits
» AND with zero = 0
» AND with one = no change
11000101
AND
00000101
• OR
– useful for setting bits
» OR with zero = no change
» OR with one = 1
• NOT
– unary operation -- one argument
– flips every bit
00001111
11000101
OR
00001111
11001111
NOT
11000101
00111010
26
Adder (1-bit)
• 1-bit full adder
Input
Output
a
b CarryIn CarryOut Sum
0
0
0
0
0
0
0
1
0
1
0
1
0
0
1
0
1
1
1
0
1
0
0
0
1
1
0
1
1
0
1
1
0
1
0
1
1
1
1
1
CarryIn
a
+
b
Sum
CarryOut
27
Adder (4-bit ripple carry adder)
• Each full adder inputs a Cin, which is the
Cout of the previous adder
28
Adder (32 bit ALU)
• 32-bit adder requires 31 carry computations
Operatio
CarryIn n
a
Result
b
CarryOu
t
29
ALU
• ALU notation
ALU-運算
a
ALU
Zero
結果
溢位
b
Overflow
進位輸出
Carry out
30
3.4 Multiplication (1/4)
• Binary multiplication is just a bunch of
right shifts and adds
n
multiplicand
multiplier
partial
n
product
can be formed in parallel
and added in parallel for
faster multiplication
array
double precision product
2n
31
Multiplication (2/4)
• More complicated than addition
– accomplished via shifting and addition
• More time and more area
• Ex. Unsigned Multiplication
(1000)2 x (1011)2
multiplicand
multiplier
32
Multiplication (3/4)
• The length of the multiplication
– n-bit multiplicand
– m-bit multiplier
– Product is n + m bits long
» The n + m bits are required to represent all
possible products
• We must cope with overflow
– Because we frequently want a 32-bit product
as the result of multiplying two 32-bit numbers
33
Multiplication (4/4)
• The design mimics the algorithm
– We learned in grammar school
• Assume
– Multiplier: in the 32-bit Multiplier register
– 64-bit Product register is initialized to 0
» Write new values into the Product register
– 64-bit Multiplicand register
» Need to move the multiplicand left one digit
each step
» Over 32 steps a 32-bit multiplicand would
move 32 bits to the left
34
The First Multiplication Algorithm
• Product register is initialized to 0
1.
• Each step took a clock cycle
Multiplier0 = 1
– Require almost 100 clock cycle T
F
1a.
將被乘數與乘積相加,然後把結
果放入乘積暫存器內
2.
將被乘數暫存器左移一位元
3.
將乘數暫存器右移一位元
重複32次否?
T
F
35
First Multiply Algorithm (ex. p.180)
• Using 4-bit number to save space
– Multiply 2ten X 3ten (0010 X 0011)
36
The Second Multiplication Algorithm
• Multiplicand register, ALU, and Multiplier
register are all 32 bits wide
• Only Product register is 64 bits (initial = 0)
– The multiplier is placed instead in the right
half on the Product register
37
The Second Multiplication Algorithm
1.
T
Multiplier0 = 1
F
1a.
把被乘數加到乘積的 左半邊 ,
然後把結果放到乘積暫存器的左半邊
2.
將乘積暫存器右移一位元
3.
將乘數暫存器右移一位元
重複32次否?
T
F
38
Multiply in MIPS
• MIPS has two instructions
– Multiply: mult
– Multiply unsigned: multu
• MIPS multiply instructions ignore overflow
– Up to the software to check to see if the
product is too big to fit in 32 bits
39
3.5 Division
• Division is just a bunch of quotient digit
guesses and left shifts and subtracts
n
quotient
n
0 0 0
dividend
divisor
0
partial
0
remainder
0
array
remainder
n
40
Division
• MIPS has two instructions
– Divide: div
– Divide unsigned: divu
• As with multiply, divide ignores overflow
– Software must determine if the quotient is too
large
– Software must also check the divisor to avoid
division by 0
41
3.6 Floating Point (p.189)
• We need a way to represent
– numbers with fractions, e.g., 3.14159265 (π)
– very small numbers, e.g., .000000001
– very large numbers, e.g., 3.15576 X 109
42
Floating Point
• Representation
– sign, exponent, significand: (–1)sign X significand X 2exponent
– more bits for significand gives more accuracy
– more bits for exponent increases range
• IEEE 754 floating point standard:
– single precision: 8 bit exponent, 23 bit significand
– double precision: 11 bit exponent, 52 bit significand
43
Representing Big (and Small) Numbers
•
What if we want to encode the approx. age of the earth?
4,600,000,000
or
4.6 x 109
or the weight in kg of one a.m.u. (atomic mass unit)
0.0000000000000000000000000166

Floating point representation

or 1.6 x 10-27
(-1)sign x F x 2E
Still have to fit everything in 32 bits (single precision)
s E (exponent)
1 bit
8 bits
F (fraction)
23 bits
44
Floating Point Form
• Generally of the form
(-1)S * F* 2E
• Single precision
S Exponent
1-bit
Significand
8-bit
23-bit
• Double precision
S
1-bit
Exponent
Significand
11-bit
20-bit
Significand (continue)
32-bit
45
Floating Point Form
• Generally of the form
(-1)S * F* 2E
• Single precision
S Exponent
1-bit
8-bit
Significand
23-bit
– S: the sign of the floating-point number
– Exponent: the value of the 8-bit exponent field
– Fraction(significand): the 23-bit number
» Sign and magnitude (符號與大小)
• The sign has a separate bit from the rest of the number
46
Overflow & Underflow
• Overflow
– Means that the positive exponent is too large
to be represented in the exponent field
• Underflow
– Means that the negative exponent is too large
to be represented in the exponent field
47
48
IEEE 754 floating-point standard (1/2)
• These format go beyond MIPS
– They are part of the IEEE 754 floating-point
standard
• To pack even more bits into the significand
– IEEE754 makes the leading 1 bit of normalized
binary numbers implicit
» The number is 24 bits long in single precision
• Implied 1 and a 23-bit fraction
» The number is 53 bits long in double precision
• Implied 1 and a 52-bit fraction
49
IEEE 754 floating-point standard (2/2)
• 0 has no leading 1
– It is given the reserved exponent value 0
– 000…00two represents 0
• The representation of the rest of the
numbers
(-1)S * (1+Fraction)* 2E
@F is stored in normalized form where the msb in the fraction is
1 (so there is no need to store it!) – called the hidden bit
@ E specifies the value in the exponent field
• If number the bits of the fraction from left
to right s1, s2, s3,…
(-1)S*(1+(s1*2-1) +(s2*2-2) +(s3*2-3)+…)*2E
50
Signed and Unsigned Numbers
• 32-bit signed numbers (2’s complement):
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = + 1ten
...
0111
0111
1000
1000
...
MSB
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1111
1111
0000
0000
1110two
1111two
0000two
0001two
=
=
=
=
+
+
–
–
maxint
2,147,483,646ten
2,147,483,647ten
2,147,483,648ten
2,147,483,647ten
1111 1111 1111 1111 1111 1111 1111 1110two = – 2ten
1111 1111 1111 1111 1111 1111 1111 1111two = – 1ten
minint
LSB
(2)10 = (0000 0000 0000 0000 0000 0000 0000 0010)2
(1111 1111 1111 1111 1111 1111 1111 1101) 2
(-2)10 = (1111 1111 1111 1111 1111 1111 1111 1110) 2
51
IEEE 754
• The designers of IEEE 754 wanted a
floating –point
– The sign is in the most significant bit
– Could be easily processed by integer
comparisons, especially for sorting
» Quick test of less than, greater than, or
equal to 0
• Negative exponents pose a challenge to
simplified sorting
– If we use two’s complement in which negative
exponents have a 1 in the MSB
– Negative exponent will look like a big number
52
IEEE 754 Biased Notation (1/2) (p.194)
• Biased notation
– The most negative exponent as 00…00two
– The most positive exponent as 11…11two
• Bias of 127 for single precision and
1023 for double precision
指數 = E + Biased (偏差值)
E
指數
128
255
127
254
…
0
Biased
127
…
127
-1
126
…
…
-127
0
53
IEEE 754 Biased Notation (2/2) (p.194)
• IEEE 754 floating point standard
(-1)sign x (1+F) x 2E-bias
– Formats for both single and double precision
» 127 for single and 1023 for double precision
– F is stored in normalized form where the msb in
the fraction is 1 (so there is no need to store it!)
called the hidden bit
54
Floating Point Representation (ex.195)
• Show the IEEE 754 binary representation
-0.75 in single precision
-3/4ten  -3/22ten = - ( ½ + ¼ )  -0.11two
-0.11two X 20  - 1.1two X 2-1
» Sign is 1 – number is negative
» Exponent field is 01111110 = 126 (decimal)
» Fraction is 0.100000000000… = 0.5 (decimal)
• Value = -1.5 x 2(126-127) = -1.5 x 2-1 = -0.75
10111111010000000000000000000000
sign exponent
fraction
55
Converting Binary to Decimal FP
• 1 10000001 01000000000000000000000
The sign bit: 1
The exponent field: 129
The fraction field: 1*2-2 (=0.25)
(-1)sign x (1+F) x 2E-bias
= (-1)1 x (1+0.25) x 2129-127
= -1 x 1.25 x 22
= -1.25 x 4
= -5.0
56
Floating-Point Addition (p.199)

Addition (and subtraction) Algorithm on p.200
(F1  2E1) + (F2  2E2) = F3  2E3

Step 1: Restore the hidden bit in F1 and in F2

Step 1: Align fractions by right shifting F2 by E1 - E2 positions
(assuming E1  E2) keeping track of (three of) the bits shifted out in
a round bit, a guard bit, and a sticky bit

Step 2: Add the resulting F2 to F1 to form F3

Step 3: Normalize F3 (so it is in the form 1.XXXXX …)
- If F1 and F2 have the same sign  F3 [1,4)  1 bit right shift F3 and
increment E3
- If F1 and F2 have different signs  F3 may require many left shifts each
time decrementing E3

Step 4: Round the sum F3 and possibly normalize F3 again

Step 5: Rehide the most significant bit of F3 before storing the result
57
Floating-Point Addition (ex.)
• Adding 0.5ten and -0.4375ten in binary
0.5 = (0.1)2 = (1.000)2 * 2-1
-0.4375 = - (0.0111)2 = - (1.110)2 * 2-2
• Step 1. Exponent matches the larger number
- (1.110)2 * 2-2 = - (0.111)2 * 2-1
• Step 2. Add the significands
(1.000)2 * 2-1 - (0.111)2 * 2-1 = (0.001)2 * 2-1
• Step3. Normalize the sum
 (0.001)2 * 2-1 = (1.000)2 * 2-4
• Step4. Round the sum
(1.000)2 * 2-4 = (0.0001000)2 = (1/24)10 = (1/16)10 =
(0.0625)10 …ANS
58
•Q & A
59