Transcript FFT Survey

FFT Survey
Eric Jackowski
ECE 734
Fourier Transforms
• The Discrete Fourier Transform (DFT) converts time
domain data to frequency domain data.
N 1
X (k )   x(n)e
j
2 n k
N
N 1
  x(n)WNnk
k  0,1, N  1
n 0
n 0
where the Twiddle Factor is
nk
N
W
•
e
j
2 n k
N
 cos(
2 n k
2 n k
)  j sin(
)
N
N
The Inverse DFT (IDFT) coverts frequency domain data to the
time domain
1
x(n) 
N
N 1
 X (k )e
n 0
j
2nk
N
1

N
N 1
 nk
X
(
k
)
W

N
n  0,1, N  1
n 0
• The DFT/IDFT requires N2 complex multiplies and N(N-1)
complex additions
Fast Fourier Transform
• The FFT provides a fast method for
computing the DFT.
– Break an N-point DFT into two
N/2-Point FFTs plus N/2 butterfly
operations
• This process continues by
recursively dividing the DFT
• Reduces complexity of O(N2) to
O(NlogrN)
Radix-2 FFT
• An N-point radix-2 FFT requires log2(N) x (N/2) butterfly
operations
• With the DIT FFT, the inputs are in bit-reversed order and
outputs are in sequential order
– Sequential order: 000, 001, 010, 011, 100, 101, 110, 111
0 1
2 3
4 5
6 7
– Bit-reversed order: 000, 100, 010, 110, 001, 101, 011, 111
0 4
2 6
1 5
3 7
• With the decimation in frequency (DIT) FFT, the inputs are in
sequential order and outputs are in bit-reversed order
Butterfly Operations
• A radix-2 DIT butterfly operations is represented as:
• where A, B, W, X, and Y are all complex numbers.
• The complex operations can be performed as:
(a  bj )  (c  dj )  (ac  bd )  (ad  bc) j
(a  bj )  (c  dj )  (a  c)  (b  d ) j
(a  bj )  (c  dj )  (a  c)  (b  d ) j
DIT FFT
Radix-2 Decimation-in-time (DIT) FFT with N = 8
W80  1 W82   j
DIF FFT
•
•
•
•
•
Coefficients change
Data accesses change
Butterflies change
Outputs bit reversed
Number of butterflies and
amount of memory same
Higher-Radix FFTs
• Radix-r FFT divides an Npoint DFT into (N/r)*logr(N) rpoint FFTs
• Reduces the number of
butterflies, but each butterfly
is more complex
– Reduces total number of
complex multiplies
– Can improve latency and
throughput
– Complicates overall
design
Radix-4 DIT butterfly
FFT Implementations
• Sequential FFT Processor - only has one butterfly unit
– Requires (N/r)*logr(N) butterfly
– Relatively low area, but long latency and low
throughput
• Pipelined (Cascaded) FFT Processor – one butterfly unit
per stage or column (logrN butterfly units total)
– One butterfly does not need complex multiplier
– Higher area, but lower latency and better throughput
– May be easier to access memory and support multiple
FFT sizes
• Parallel-iterative FFT Processor – one butterfly unit per
row (N/r butterfly units total)
• Array FFT Processor – fully parallel implementation
((N/r)*logr(N) butterfly units total)
Memory Address
Generation
• Handling bit reversed input or output
– Sequential FFT Processor – Handle by
how you read from or write to memory
– Pipelined FFT Processor – May need
an extra memory buffer that is written
to and then read from in a different
order
• Multiple Banks for low power
• Minimize twiddle memory accesses
• Reorder butterflies
Other Research
• Additions and multiplications can increase the
size of the data
– May want to use larger internal datapath
– May want to saturate results of additions
– May want to round results of multiplies
• Supporting different FFT sizes
– Sequential FFT processor – change address
generation based on FFT size
– Pipelined FFT processor – skip stages that
aren’t needed
• Decreasing Twiddle Rom size
IFFT
• Implementing the IFFT using the FFT
– Swap the real and imaginary inputs
– Perform the FFT
– Swap the real and imaginary outputs
– Scale output by 1/N