Transcript Letcture 3

CSE 246: Computer
Arithmetic Algorithms and
Hardware Design
Prof Chung-Kuan Cheng
Lecture 3
How to compare two RNS numbers

We can approximate the magnitude of a RNS
number by the following formula
x ( xk 1 | xk  2 | ... | x0 ) RNS ( pk 1 | pk  2 | ... | p0 )

M
M
k 1
 (
i 0
 i xi
Pi
) MOD 1
where
M   Pj
j
M 1
 i  [(
) ] Pi
Pi
An Example
Suppose,
x = (6|3|0) RNS (7|5|3)
y = (3|0|1) RNS (7|5|3)
Then we have
x/105 = [6(1/7) + 3(1/5) + 0(2/3)] mod 1 ≈ 0.457
y/105 = [3(1/7) + 0(1/5) + 1(2/3)] mod 1 ≈ 0.095
Clearly, x (48) is greater than y (10).
Double Base Number System (DBNS)


DBNS is a new kind of number system,
where there are two bases, p and q.
Any number x is represented by the equation
x   d ij p i q j
, p  q; 0  d ij  p
i, j
Also, this number system could be redundant, e.g.
54 = 2030+2130+2131+2031+2032+2232
= 2133
Double Base Number System (DBNS)

We can represent DBNS numbers in a two-dimensional
table. For example we can express 54 by this tabular
representation.
1
1
2
x
x
3
9
4
x
x
x
For each entry in the table, we multiply the corresponding row-value and
column-value. Then we add up all such entries to get the value of the
number represented by the table.
Double Base Number System (DBNS)

DBMS can be of practical use too in some
scenarios.



In binary number representation, each bit has
approximately 0.5 probability of being 1.
But in DBNS, the number of bits that are logic 1 in
the tabular representation could be much less.
Effectively, we can reduce the number of 01 and
10 transitions, thus saving power.
Double Base Number System (DBNS)
A greedy approach to minimize the number of TRUE bits in
the tabular representation of any integer :
GREEDY (x)
{
if (x > 0) then do{
find the largest 2-integer w such that w ≤ x;
write(w);
x = x-w;
greedy(x);
}
}
Double Base Number System (DBNS)
• It can be shown that expected number of bits that are
‘turned on’ in a DBNS representation of integer is
•O[lg x/(lg lg x)], which is significantly lower than the
corresponding number in the positional binary system,
O(lg x).
• As an example, consider the integer 2215
• In binary system, number of ‘1’s ≈ 100
• In DBNS, number of ‘1’s ≈ 30
• In the next few slides we shall discuss how we can
implement ADDITION operation on two DBNS numbers.
DBNS Numbers: Addition
Consider the integers 14 and 20. In DBNS system,
14 = 2231 + 2130 [We represent this number by a green cross]
20 = 2132 + 2130 [We represent this number by a red cross]
The addition operation is performed by representation the numbers in tabular
form, and then ‘merging’ the tables.
1
1
2
xx
3
9
4
x
x
DBNS System: Addition
The final merged table is :
1
2
4
1
+
3
+
9
+
And the sum of 14 and 20 is
2230 + 2231 + 2132 = 34, which is indeed correct
DBNS System: Addition

Few rules for ‘shifting’ values in the merged
table

We can always use algebraic manipulations to
minimize number of entries in a DBNS table, e.g.



2i3j + 2i3j+1 = 2i+23j
2i3j + 2i+13j = 2i3j+1
A variant of 2-integers are represented by
using only single digit. They are of the form
2s3t, and might be useful in logarithmic
operations.
Chapter 2: ADDERS

Half Adders


Half adders can add two 1-bit binary numbers when there
is no carry in.
If the inputs are xi and yi, the sum and carry-out is given by
the formula



si = xi ^ yi
ci+1 = xi . yi
We use the following notations throughout the slides




. means logical AND
+ means logical OR
^ means logical XOR
‘ means complementation
Full Adder



The inputs are x[i], y[i] (operand bits) and c[i]
(carry in)
The outputs are s[i] (result bit) and c[i+1]
(carry out)
Inputs and outputs are related by these
relations


s[i] = x[i] ^ y[i] ^ c[i]
c[i+1] = x[i].y[i] + c[i].(x[i] + y[i])
= x[i].y[i] + c[i].(x[i] ^ y[i])
Full Adder


If carry-in bit is zero, then full adder becomes
half adder
If carry-in bit is one, then



s[i] = (x[i] ^ y[i])’
c[i+1] = x[i] + y[i]
To add two n-bit numbers, we can chain n full
adders to build a ripple carry adder
Ripple Carry Adder
x[0] y[0] cin/c[0]
x[n-1]
y[n-1]
x[1]
y[1]
c[n-1]
...
c[1]
c[2]
cout
s[n-1]
s[1]
s[0]
Overflow happen when operands are of same sign, and the result is of
different sign. If we use 2’s complement to represent negative numbers,
overflow occurs when (cout ^ c[n-1]) is 1
Ripple Carry Adder



For sake of brevity, we use the following notations:
 g[i] = x[i].y[i]
 p[i] = x[i] + y[i]
In terms of these notations, we can rewrite carry equations as
 c[1] = g[0] + p[0].c[0]
 c[2] = g[1] + p[1].c[1]
 and so on…
 We shall use these notations afterwards while discussing the
design of other kind of adders
It has been observed that expected length of carry chain is 2, while
expected maximal length of carry chain is lg n. Hence, ripple carry
adders are in general fast.
Ripple Carry Adder

How do know that an adder has completed the operation?
 Worst case scenario: Wait for the longest chain in the carry propagation
network
 We might inspect c[i+1] and its complement b[i+1] to determine the
status of the adder
c[i+1]
b[i+1]
Remark
0
0
Not
complete
1
0
Complete
0
1
Complete
1
1
Don’t care
Improvement to Ripple Carry Adder:
Manchester Adders





By intelligently using our device properties, we can reduce the
complexity of the circuit used to compute carries in a ripple carry
adder.
Define: a[i] = (x[i])’.(y[i])’
Next we observe that c[i+1] is 1 in exactly these scenarios:
 g[i] is 1, i.e. both x[i] & y[i] are 1
 c[i] is 1 and it is propagated because p[i] is 1
c[i+1] is ‘pulled down’ to logic 0 irrespective of the value of c[i],
when a[i] is 1, i.e. both x[i] and y[i] are 0
From these conditions, and keeping in mind the general
characteristics of transistor devices we can design simplified
circuits for computing carries – as shown in the next slide
Improvement to Ripple Carry Adder:
Manchester Adders
VDD
g[i]
c[i+1]
p[i]
a[i]
c[i]
Implementation of Manchester Adder
using MOS transistors
(g[i])'
c[i+1]
c[i]
p[i]
a[i]
This is essentially the same circuit for computing carry, but implemented with
MOS devices
Manchester Adder: Alternate design


We divide the computation cycle into two
distinct half-cycle : ‘precharge’ and ‘evaluate’.
In the precharge half-cycle, g[i] and c[i+1] are
assigned a tentative value of logic 1. This is
evaluated in the next half-cycle with actual
value of a[i].
The actual circuit for computing carries is
shown in the next slide.
Manchester Adder: Alternate design
precharge q
c[i+1]
c[i]
p[i]
evaluation
precharge
a[i]
Q
q
Time 
Carry Look-ahead Adder



In a ripple-carry adder m-full adders are grouped together (m is
usually equal to 4). Once the carry-in to the group is known, all the
internal carries and the output carry is calculated simultaneously.
We can use some algebraic manipulations to minimize hardware
complexity.
Consider the carry out of the group
 c[i] = g[i-1] + p[i-1].c[i-1]
 Putting the value of c[i-1], we can rewrite as
c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].c[i-2]
 Proceeding in this manner we get
c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i-2].p[i-3].g[i-4] + p[i1].p[i-2].p[i-3].p[i-4].c[i-4]

To further simplify the equation, we note that g[i-1] = g[i-1].p[i-1],
and p[i-1] can be factored out
Ling’s Adder
c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i2].p[i-3].g[i-4] + p[i-1].p[i-2].p[i-3].p[i-4].c[i-4]
We replace p[i]=x[i]^y[i] with t[i]=x[i]+y[i].
Because g[i]=g[i]t[i], we have
c[i] = g[i-1]t[i-1] + t[i-1]g[i-2] + t[i-1].t[i-2].g[i-3] + t[i-1].t[i2].t[i-3].g[i-4] + t[i-1].t[i-2].t[i-3].t[i-4].c[i-4]
Let
h[i] = g[i-1] + g[i-2] + t[i-2].g[i-3] + t[i-2].t[i-3].g[i-4] + t[i2].t[i-3].t[i-4].t[i-5] h[i-4]
C[i]= h[i]t[i-1]
Generalized Design for Adders: Prefix
Adder

Prefix computation
Given n inputs x1, x2, x3…xn and an associative
operator ×. We want to compute
yi = xi × xi-1 × xi-2 …× x2 × x1 for all i, 1≤ i ≤n
 x can be a scalar/vector/matrix
 For design of adders, we define the operator × in
the following manner




(g, p) = (g’, p’) × (g’’, p’’)
g = g’’ + p’’.g’
p = p’.p’’
Alternate modeling of Prefix Computer:
Finite State Machine

A finite state machine has a set of states, and
it ‘moves’ from one state to another according
to input. Mathematically,



sk = f (sk-1, ak-1)
The problem is to determine final state sn in
O(lg n) operations, given initial state s0 and
sequence of inputs (a0, a1, …an-1)
This problem can be formulated in terms of
prefix computation
Alternate modeling of Prefix Computer:
Finite State Machine



We assume that number of states are small
and finite.
Let sk = fak-1(sk-1), fak-1 can be represented by
matrix Mak-1
Now we are ready to represent our problem
in terms of prefix computation.
Alternate modeling of Prefix Computer:
Finite State Machine
The algorithm
Compute Mai in parallel
Compute

1.
2.




3.
N1 = Ma1
N2 = Ma2.Ma1
…
Nn = Man.Man-1…Ma1
Compute Si+1= Ni(S0)