Transcript Document

Multiply-Accumulator (MAC)
• Compute Sum of Product (SOP)
• Linear convolution
L-1
y[n] = f[n]*x[n] = Σ f[k] x[n-k]
k=0
requires L consecutive multiplications and L–1
additions.
• For a NN multiplier, the product is 2N bits wide.
• The Accumulator should be designed with an
extra K bits in width depending on L.
Digital Kommunikationselektronik
TNE027 Lecture 3
1
Distributed Arithmetic (DA)
• A method for computing Sum of Product
N-1
y = <c, x> = n=0
Σ c[n] x[n]
N-1
B-1
n=0
b=0
= Σ c[n]  Σ xb[n]  2b
B-1
N-1
b=0
n=0
= Σ 2b  Σ c[n]  xb[n]
B-1
= Σ 2b  Fc(xb)
b=0
Digital Kommunikationselektronik
TNE027 Lecture 3
2
N-1
Fc(xb) = Σ
c[n] xb[n]
n=0
where xb = (xb[0], xb[1], …, xb[N–1])
Fc(xb) is a function of an N-bit input vector and
•
•
can be pre-computed and saved in a look-up
table.
During computation, the look-up table accepts
one bit from each of the N words as its input.
The output of the look-up table is weighted by the
approximate power-of-two factor and
accumulated.
Note that, in Fig.2.31, the content of the accumulator is
shifted instead of shifting the output of the look-up table.
Digital Kommunikationselektronik
TNE027 Lecture 3
3
• Signed DA systems
B-1
x[n] = – 2B  xB[n] + Σ xb[n]  2b
b=0
where x[n] has (B+1) bits.
B-1
y = – 2B  Fc(xB) + Σ 2b  Fc(xb)
b=0
where xB is the N-bit vector of sign bit.
– The signed DA system can be implemented by
• An accumulator with add/subtract control (See Fig.
2.31)
• Using a ROM with one additional input bit to
indicate Fc(xb) or – Fc(xb) and an accumulator
without add/subtract control.
Digital Kommunikationselektronik
TNE027 Lecture 3
4
• Modified DA Solutions
– Use partial LUTs to reduce the size of ROM
Suppose the length LN inner product is
computed
LN –1
y = <c, x> = n=0
Σ c[n] x[n]
L –1 N –1
=Σ
Σ c[Nl+n] x[Nl+n]
l=0 n=0
See Fig. 2.32.
The ROM size is reduced from 2LN  B to
L  2N  B.
Digital Kommunikationselektronik
TNE027 Lecture 3
5
Modified DA Solutions (cont.)
• Use additional LUTs to increase speed
– A basic DA architecture accepts one bit from
each of the N words as input to its LUT. If two
bits per words are accepted, then the
computational speed can be essentially doubled.
– The maximum speed can be achieved with the
fully pipelined word-parallel architecture.
See Fig. 2.33.
Digital Kommunikationselektronik
TNE027 Lecture 3
6
Computation of Special Functions
Using CORDIC
• Digital signal processing algorithms might
use certain trigonometric, hyperbolic, linear
and logarithmic functions.
• Taylor series can be computed to
approximate the function and a sequence of
multiply and add operations should be
performed.
Digital Kommunikationselektronik
TNE027 Lecture 3
7
Coordinate Rotation Digital
Computer (CORDIC)
• CORDIC algorithm is a more efficient
alternative approach.
• CORDIC algorithm:
At each iteration, compute
[ ][
Xk+1
Yk+1
=
1
δk2–k
–mδk2–k
1
][]
Xk
Yk
Zk+1 = Zk – δk θk
Digital Kommunikationselektronik
TNE027 Lecture 3
8
CORDIC Theory
The following text is based on the article
”A survey of CORDIC algorithms for FPGA based
computers” by Ray Andraka
• Vector rotation transform
x´ = x cosφ – y sinφ
y´ = y cosφ + x sinφ
which rotate a vector in a Cartesian plane
by the angle φ.
x´ = cosφ (x – y tanφ)
y´ = cosφ (y + x tanφ)
Digital Kommunikationselektronik
TNE027 Lecture 3
9
If the rotation angles are restricted so that
tanφ = ±2–i, the multiplication by the tangent
term is reduced to a simple shift operation.
Arbitrary angles of rotation are obtainable by
performing a series of successively smaller
elementary rotations.
If the decision at each iteration, i, is which
direction to rotate, then the cos(δi) term becomes
a constant, because cos(δi) = cos(– δi).
xi + 1 = Ki (xi – yi di •2–i)
yi + 1 = Ki (yi + xi di •2–i)
where:
Ki = cos(tan–1 2–i) = 1/ (1 + 2–2i)1/2
di = ±1
Digital Kommunikationselektronik
TNE027 Lecture 3
10
• Removing the scale constant from the iterative
equations yields a shift-add algorithm for vector
rotation.
• The product of the Ki’s can be applied elsewhere
in the system or treated as part of a system
processing gain.
• That product approaches 0.6073 as the number of
iterations goes to infinity.
• Therefore, the rotation algorithm has a gain, An, of
approximately 1.647. The exact gain depends on
the number of iterations, and obeys the relation
An = Пn (1 + 2–2i)1/2
Digital Kommunikationselektronik
TNE027 Lecture 3
11
• The angle of a composite rotation is uniquely
defined by the sequence of the directions of the
elementary rotations. That sequence can be
represented by a decision vector.
• Conversion from this vector representation to a
convensional angle representation can be
accomplished by using a look-up table.
• A better conversion method uses an additional
adder-subtractor that accumulates the elementary
rotation angles at each iteration.
• The angular values are supplied by a small lookup table (one entry per iteration) or are hardwired.
Digital Kommunikationselektronik
TNE027 Lecture 3
12
• The angle accumulator adds a third
difference equation to the CORDIC
algorithm:
zi + 1 = zi – di • tan–1(2–i)
• CORDIC rotator is normally operated in
one of two modes:
– Rotation mode
– Vectoring mode
Digital Kommunikationselektronik
TNE027 Lecture 3
13
• The rotation mode rotates the input vector by a
specified angle.
• In the rotation mode, the CORDIC equations are:
xi + 1 = xi – yi di •2–i
yi + 1 = yi + xi di •2–i
zi + 1 = zi – di • tan–1(2–i)
where di = –1 if zi < 0, +1 otherwise.
The angle acummulator zi is initialized with the
desired rotation angle z0. xi and yi are initialized
with the input vector x0 and y0.
Digital Kommunikationselektronik
TNE027 Lecture 3
14
• The following results of the rotation mode
are obtained when n is reasonably large:
xn = An (x0 cosz0 – y0 sinz0)
yn = An (y0 cosz0 + x0 sinz0)
zn = 0
An = П (1 + 2–2i)1/2
n
where x0 and y0 define the initial input vector
and z0 defines the desired rotation angle.
The result xn and yn should be scaled by
1/An to get the actual rotated vector.
Digital Kommunikationselektronik
TNE027 Lecture 3
15
• The vectoring mode rotates the input vector
to the x axis while recoding the angle
required to make that rotation.
• In the vectoring mode, the CORDIC
equations are:
xi + 1 = xi – yi di •2–i
yi + 1 = yi + xi di •2–i
zi + 1 = zi – di • tan–1(2–i)
where di = +1 if yi < 0, –1 otherwise.
Digital Kommunikationselektronik
TNE027 Lecture 3
16
In the vectoring mode, after n iterations, the
result is:
xn = An (x0 2 + y0 2) 1/2
yn = 0
zn = z0 + tan–1(y0 / x0 )
–2i)1/2
An = П
(1
+
2
n
zn is the rotated angle, i.e., the angle
between the input vector (x0 , y0) and x axis.
xn/An is the magnitude of the input vector
(x0 , y0).
Digital Kommunikationselektronik
TNE027 Lecture 3
17
• The CORDIC rotation and vectoring
algorithms are limited to rotation angles
between –π/2 and π/2. The limitation is due
to the use of 20 for the tangent in the first
iteration.
• For composite rotation angles larger than
π/2, an additional rotation of π/2 or π is
required.
Digital Kommunikationselektronik
TNE027 Lecture 3
18
• Computing sin(θ) and cos(θ)
– Use the rotation mode
– Setting y0 = 0, x0 = 1/ An and z0 = θ
– sin(θ) and cos(θ) can be obtained at the end of
rotation:
xn = An ((1/ An)cos θ – 0 · sin θ)
yn = An (0 · cos θ + ((1/ An) sin θ)
– Note that multiplication of 1/ An is performed as
part of rotation operation. This eliminates the
need for another multiplier for modulation.
Digital Kommunikationselektronik
TNE027 Lecture 3
19
• Extension to Linear Functions
– A simple modification to the CORDIC equation permits
the computation of linear functions:
xi + 1 = xi – 0 • yi di •2–i = xi
yi + 1 = yi + xi di •2–i
zi + 1 = zi – di • (2–i)
For rotation mode, di = – 1 if zi < 0, +1 otherwise,
the linear rotation produces:
xn = x0
yn = y0 + x0 z0
zn = 0
The operation is similar to that of a shift-add
multiplier.
Digital Kommunikationselektronik
TNE027 Lecture 3
20
For the vectoring mode, di = +1 if yi < 0, –1
otherwise, it provides a method for evaluating
ratios:
xn = x 0
yn = 0
zn = z0 + y0 /x0
Note that for linear functions, no scaling corrections
are required, since the gain is a unity gain.
Digital Kommunikationselektronik
TNE027 Lecture 3
21
• Extension to Hyperbolic Functions
The close relationship between the trigonometric
and hyperbolic functions suggests the same
architecture can be used to compute hyperbolic
functions. The CORDIC equations are:
xi + 1 = xi + yi di •2–i
yi + 1 = yi + xi di •2–i
zi + 1 = zi – di • tanh–1(2–i)
For rotation mode, di = –1 if zi < 0, +1 otherwise.
For vectoring mode, di = +1 if yi < 0, –1 otherwise.
Note that An = П (1 – 2–2i)1/2  0.80 .
n
Digital Kommunikationselektronik
TNE027 Lecture 3
22
CORDIC Architectures
• CORDIC state machine
See Fig. 2.38 and Fig. 2.39.
• Fast CORDIC pipeline
See Fig. 2.40.
Digital Kommunikationselektronik
TNE027 Lecture 3
23