Addition, Subtraction

Download Report

Transcript Addition, Subtraction

MIPS Architecture
Arithmetic/Logic Instructions
ALU Design – Chapter 4
By N. Guydosh
2/17/04
Integer Representation
• 32 bit representation.. every bit is contributes to the
magnitude of the number
• Range for unsigned integers is 0 through 232 -1 or
0 through 4,294,967,294 = 0xFFFFFFFF
• But what about negative numbers?
- The universal standard is signed two’s Complement.
-
Automatically results in the most significant bit (bit 31,
little endian notation) being a “1” for negative numbers and a
“0” for positive numbers ... thus we have a sign bit.
-
Results in simple ALU design
Two’s Complement - 1
•
Two’s Complement Representation:
– 0 (zero) is 0x00000000
– Positive numbers (231 - 1 of them) are:
0x00000001,
0x00000002,
...
0x7FFFFFFF = 231 -1 = 2,147,483,647
– Negative numbers (231 of them) are:
-1 = 0xFFFFFFFF,
-2 = 0xFFFFFFFE ,
...
-2,147,483,647 = 0x80000001 (2's complement) = -(231 - 1)
-2,147,483,648 = 0x80000000 (2's complement) = -231 , see note below & p. 213 text.
– There are more negative numbers than positive
but we’ll live with it
– NOTE: the 2’s complement of 0x80000000 is also 0x80000000 because of a net
carry-out when forming the 2’s complement – an anomaly. This number has no
positive counterpart in a 32 bit architecture. We will see later that when we use
2’s complement for the exponent of a floating point number, we may treat values
such as 1000…0 as “-0” and 000…0 as “+0”
Two’s Complement - 2
• Three formulas for doing 2's complement:
1. Complement all the bits (1's complement) and add 1
2. Starting with the least significant bit, retain all low order zeros, retain the
first nonzero digit, complement the remaining digits.
3. A real slick method:
If the digits in the 32 bit number are:
B31B30B29 . . .B1B0
Then the “sign/magnitude” decimal equivalent is:
B31*-231 + B30*230 + B29*229 + . . .+ B1*21 + B0*20
Note that the sign bit position is multiplied by:
-231 and not 231 !!! A generalization of converting from binary to decimal.
which also works for 2's complement.
... see P. 213-214
Uses of 2’s complement
• How 2's complement is used:
– For “algebraic” addition of signed numbers represent negative
numbers by their 2's complement of the magnitude
– For the operation of subtraction add the 2's complement of the
minuend to the subtrahend ... Take the 2's complement of the
minuend even if it is a negative number. ... We converted the
operation of subtraction to addition - All we need is an adder.
– Note there we distinguished between the sign of a number and the
operation (add or subtract) being done. In either case we take the
2's complement.
2’s Complement Sign Extension
• To convert an integer shorter than 32 bits to a 32 bit
integer we simply extend the sign bit (0 or 1) to the left
until we have 32 bits - for example we convert a 16 bit
version of 2 and -2 (base 10) as follows:
2 = 0x0002 ==> 0x00000002 (sign bit is 0)
-2 = 0xFFFE ==> 0xFFFFFFFE (sign bit is 1)
2’s Complement and the Instruction Set
• How this affects the MIPS instruction set . . . see p. 215
– The “less than” test gets more complicated ... the slt and slti
instructions interpret integers as signed
-2d = 0xfffffffe (2's comp) is less than 5280d = 0x000014a0
– unsigned version of these instructions:
sltu and sltiu assume numbers are unsigned:
0xfffffffe is now greater than 0x000014a0
Data Types And Signs
• Fundamental Principle:
– Unlike c language which intrinsically types data, data represented at the
assembly language level is untyped - data types are interpreted by the
operation being done on them:
– Example in C:
int num;
/* represents a signed 32 bit integer
*/
/* num carries a label saying “I am signed” */
unsigned int num; /* represents an unsigned 32 bit integer */
/* num carries a label saying “I am unsigned”*/
– In assembly language the operation or instruction type determines whether
num is signed or unsigned.
– The unsigned counterparts of slt and slti are:
Assuming that $16 = 0xFFFFFFFE
$17 = 0x000014A0
Then:
slt $8, $16, $17
#Signed comparison ==> $8 = 1
sltu $9 $16, $17
#Signed comparison ==> $9 = 0
– As an unsigned number 0xFFFFFFFE is a very large positive integer
(4,294,967,294), but as a signed number it is -2.
Addition, Subtraction And Overflow
• The most straight forward method of doing binary
addition is the way we do it on paper - starting from
low order we add corresponding pairs of bits and the
carry-in generating a sum and carry-out. the carry-out
is then propagated to the next stage.
– Carry-ripple approach
• Do subtraction by adding the 2's complement of the
minuend to the subtrahend.
• What if the answer cannot be represented in 32 bits?
Overflow
• Example; consider a 5 bit word for convenience. Assume the high
(most significant) bit is the sign bit for 2’s complement:
-9 = 10111
-5 = 11011
-14 = 1 10010 ==> there is a net carryout, but answer is correct because sign
bit is correct:
10010 ==>2’s compliment = -01110 = -14 )d
-9 = 10111
-7 = 11001
-16 = 1 10000 ==> there is a net carryout, but answer is still correct because
sign bit is correct:
10000 ==> 2’s complement = -10000 = -16 )d (the unique number that has no
positive counterpart in a 5 bit scheme)
- 9 = 10111
-10 = 10110
-19 = 1 01101 ==> overflow because sign bit wrong 01101 = +13)d
Detecting overflow
•
Rule (see p. 222, fig 4.4):
Addition:
Overflow can occur only if the signs are the same and
the sign of the answer is different from the sign of the operands.
Subtraction:
Overflow (ie., underflow) results only if the operands have opposite signs
and the sign of the answer is not the same as the sign of the 1st operand
(or the sign of the answer is the same as the sign of the 2nd operand).
• Conditions for overflow (fig. 4.4)
Operation Operand A
Operand B
Result
A+B
0
0
<0
A+B
<0
<0
0
A–B
0
<0
<0
A-B
<0
0
0
• Simple hardware test for overflow in 2’s complement Add/Subtract
– Overflow occurs if and only if the carry into the sign bit is not the same as
the carry out from the sign bit (thus explaining why net carry-outs are not
always overflows). The condition being tested for is necessary &
sufficient for overflow. See Exercise 4.42. Now review previous examples.
Exception Handling And Overflows
• MIPS will generate an overflow exception (interrupt) only for
signed arithmetic.
• The address of the offending instruction is saved in the exception
program counter (EPC) and a branch is made to an
error/interrupt handler for corrective action.
• MIPS instruction mfc0 is used to move the EPC to MIPS
general purpose register. On returning from the exception
handler, the software now has the option of returning back to the
offending instruction via a jr instruction.
UNSIGNED ARITHMETIC
• Data interpreted as unsigned ( by means of unsigned
instructions) addresses and other data for which negative
has no meaning.
• The unsigned counterparts of add, addi, and sub are: addu,
addiu, and subu with syntax the same as the signed
counterparts.
• The main differences are the unsigned version do not cause
exceptions in case of overflow
Logical Operations ... See P. 225
• Shift Logical Left And Shift Logical Right
sll $rd, $rt, shamt
srl $rd, $rt, shamt
#reg $rd = reg $16 << shamt (in bits)
#reg $rd = reg $16 >> shamt (in bits)
• Good for efficient multiplication and division by powers (shamt) of two
• “and” and “or” instructions ... see P. 226-227
Good for bit manipulation - testing and setting bits in a word field
• Type R instructions:
and $rd, $rs, $rt
or $rd, $rs, $rt
# $rd = $rs & $rt
# $rd = $rs | $rt , Note: “|” is the“or” operator
Logical Operations (cont.)
• Type I instructions:
andi $rt, $rs, imm
ori $rt, $rs, imm
# $rt = $rs & imm
# $rt = $rs | imm
imm treated as unsigned ... expanded to 32 bits by padding with zeros
rather than extending the sign bit
Application: we can create a 32 bit constant using lui and ori in much
the same way we did using lui and addi.
Designing An ALU
• We now look at hardware design which will support the
arithmetic instructions just described
• This is the number cruncher of the machine
• Start by building a 1 bit ALU and replicating it 32 times
• Function of the ALU:
– add, subtract, and, or, test for less than,
overflow detection
• Simple example of an single stage adder design
Single Stage One Bit Adder
From Mano,
“Digital Design”
2nd ed. p. 121
One Bit ALU
Binvert
Each bit has a full adder, “and”,
and “or” function.
Operation
CarryIn
a
0
1
Unit for bits 0 - 30
Result
b
0
2
1
Less
3
a.
CarryOut
Binvert
Operation
CarryIn
a
0
1
Result
b
0
2
1
Less
3
Set
Overflow
detection
b.
Overflow
Unit for bit 31.
Has overflow detect and
slt” capability using sign bit
which will fed to “less” input
of bit 0:
For slt $s1, $s2, $s3
do $s2 - $s3 and put sign bit
in bit 0 & zero out remaining
bits.
Final 32 Bit ALU
Bnegate
Operation
a0
b0
CarryIn
ALU0
Less
CarryOut
Result0
a1
b1
0
CarryIn
ALU1
Less
CarryOut
Result1
a2
b2
0
CarryIn
ALU2
Less
CarryOut
Result2
a31
b31
0
CarryIn
ALU31
Less
For slt, zero out bits 1-32
via “Less” input and feed sign bit
into “Less” input of bit 0.
Test for zero if all bits 0 – use
“or” gate.
Zero
If Bnegate = 1, and Operation = 3,
for set on less than (slt),
then subtraction is done “under
the covers”, but not transmitted
outside. Set is the sign bit of the
subtract result.
Result31
Set
Overflow
<== Note “wrap-around” connection
Carry Look Ahead
• Avoids the slow carry-ripple affect of the above circuit
• Carry ripple is very intuitive, but inefficient – must
sequence through all low order 31 bit positions before
getting the carry-in for bit position 32 (bit 31).
• A less intuitive, but more efficient method calculates the
carries in advance: carry lookahead
Carry Look Ahead (cont)
From Mano,
“Digital Design”
2nd ed. p. 158
Reading the diagram directly:
Pi is “carry propagate” for bit position i … propagates through bit position i even
if position i does not generate its own carry.
Gi is “carry generate” for bit position i … generated directly by this bit position
independently of any carry in
Pi = ai  bi, Gi = aibi
Si = Pi  Ci, Ci+1 = Gi +PiCi … a recursive formula
Carry Look Ahead (cont)
•
•
By reading the full adder circuit diagram directly, we see that: Pi = Pixor = ai  bi.
The book uses: Pi = Pior = ai + bi for this function – a much cheaper and simpler circuit.
How is this done? See below.
If the latter (“OR”) is used, it must be used only for the carry out function. The sum
function still requires the XOR (former version). Since the XOR version is already
needed, why not also use it for the carry out? - maybe there are fan-out considerations.
– Thus we also may use:
Si = Pixor  Ci, where Pixor = ai  bi
Ci+1 = Gi +Pior Ci where Pior = ai + bi , and Gi = aibi
remember Pixor will also work for Ci is also but is a more complicated circuit.
• Proof of the equivalence of Pixor and Pior for carry out calculations:
From the circuit diagram on previous slide we have the Boolean equations:
Ci+1 = Gi + Pixor Ci
= aibi + (ai  bi)Ci = aibi+ (aib’i + a’ibi) Ci …prime means complement
= aibi+ aibi+ aib’i Ci + a’ibi Ci = ai(bi+b’i Ci ) + bi(ai+a’i Ci )
= ai(bi+ Ci ) + bi(ai+ Ci ) = aibi + aibi + aiCi + biCi
= aibi + (ai + bi)Ci
= Gi + Pior Ci
Note: we used boolean identities: x = x+x, and x+x’y = x+y
Carry Look Ahead (cont)
• Formulas for carries – showing increasing fan-in.
Recursively expand out the carry-out formulas:
From Mano,
“Digital Design”
2nd ed. p. 159
Carry Look Ahead (cont)
Note that “fan-in” increases
as the bit position increases.
From Mano,
“Digital Design”
2nd ed. p. 159
Carry Look Ahead (cont)
4 bit carry look-ahead adder
From Mano,
“Digital Design”
2nd ed. p. 160
See also fig. 4.24, p. 426
P&H
Carry Look-Ahead – Reality sets in!
• Note that the fan-in of the “and” gates and the “or” gate for the carryout any bit position increases with the bit position. At Some point it
becomes impossible to implement due to fan-in/fan-out constraints of
the technology.
• We can decompose a 32 bit adder to 4 groups of 8 bit carry look-ahead
adders, we can limit the fan-in of the carry look-ahead circuitry. The 4
8 bit adders, in turn, can also utilize carry-look-ahead from group to
group.
This is an example of using a second level of abstraction of the carrylook-ahead concept.
• See the 16 bit example on pp. 242-248 for an illustration.