Document 13545574

advertisement
Fast Fourier Transform:
VLSI Architectures
Lecture 10
Vladimir Stojanović
6.973 Communication System Design – Spring 2006
Massachusetts Institute of Technology
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Pipelined FFT architectures
‰
Examples
8
4
BF2
C2
2
BF2
C2
4
C2
1
BF2
C2
j
2
BF2
1
(1) . R2MDC(N-16)
8
‰
Radix-2
„
„
‰
BF2
multi-path delay commutator
single-path delay feedback
„
„
2
BF2
1
j
BF2
BF2
(2) . R25DF(N-16)
2B4E8F
Radix-4
„
4
3X64
3X16
3X4
3X1
BF4
BF4
BF4
BF4
(3) . R4SDF(N-256)
single-path delay feedback
multi-path delay commutator
single-path delay commutator
C4
192
128
64
BF4
16
32
C4
48
32
16
BF4
48
4
8
C4
12
8
4
BF4
12
1
2
C4
3
2
1
BF4
3
(4) . R4MDC(N-256)
DC6X64
BF4
DC6X16
BF4
DC6X4
BF4
DC6X1
BF4
(5) . R4SDC(N-256)
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
2
Radix-2 Multi-path Delay Commutator
8
C2
4
BF2
C2
4
2
BF2
C2
1
BF2
2
C2
j
BF2
1
Figure by MIT OpenCourseWare.
The most classical approach for pipeline
implementation of radix-2 FFT
‰ Input sequence broken into two parallel data
streams flowing forward with correct “distance”
between data elements entering the butterfly
scheduled by proper delays
‰ Both butterflies and multipliers are in 50%
utilization
‰
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
3
Radix-2 Single-path Delay Feedback
8
4
2
BF2
BF2
BF2
1
j
BF2
Figure by MIT OpenCourseWare.
[Wold & Despain ‘84]
‰
Uses registers more efficiently
„
‰
‰
Both as input and the output of the butterfly
A single data stream goes through the
multiplier at every stage
Multiplier utilization is also 50%
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
4
Radix-4 Single-path Delay Feedback
[Despain74]
x0
x4
3 X4
3X1
DFT
4
x8
BF4
X12
X1
X13
X2
X14
X3
DFT
4
DFT
4
x12
BF4
X0
DFT
4
DFT
4
x15
X15
Figure by MIT OpenCourseWare.
Figure by MIT OpenCourseWare.
0
WN
x(n)
‰
Utilization of multipliers 75%
„
‰
By storing 3 BF4 outputs
x(n+
N
)
4
x(n+
N
)
2
3N
x(n+ )
4
Radix-4 butterfly utilization only 25%
„
y(n)
n
WN
-j
-1
y(n+
N
)
4
y(n+
N
)
2
2n
WN
-1
-1
j
-1
-j
3n
WN
3N
y(n+ )
4
Figure by MIT OpenCourseWare.
Butterfly fairly complicated
„
At least 8 complex adders
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
Figure by MIT OpenCourseWare.
5
Radix-4 Multi-path Delay Commutator
[Swartzlander84]
x0
12
4
BF4
+++
C4
8
1
2
C4
3
x4
2
x8
1
BF4
3
„
„
DFT
4
x12
DFT
4
x15
What is the utilization of
Butterflies?
Multipliers?
X12
X1
X13
X2
X14
X3
DFT
4
DFT
4
Figure by MIT OpenCourseWare.
‰
X0
DFT
4
X15
Figure by MIT OpenCourseWare.
0
WN
x(n)
x(n+
N
)
4
x(n+
N
)
2
3N
x(n+ )
4
y(n)
n
WN
-j
-1
-1
-j
N
)
4
y(n+
N
)
2
2n
WN
-1
j
-1
y(n+
3n
WN
3N
y(n+ )
4
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
6
Radix-4 Single-path Delay Commutator
[Bi & Jones ‘89]
x0
stage 1
input
commutator
butterfly
element
stage 2
commutator
butterfly
element
DFT
4
x4
DFT
4
DFT
4
x8
DFT
4
x12
DFT
4
x15
c1 c2 c3
X0
X12
X1
X13
X2
X14
X3
X15
Figure by MIT OpenCourseWare.
c4 c5 c6
coefficient
Figure by MIT OpenCourseWare.
0
WN
x(n)
Modified radix-4 algorithm
‰ Programmable ¼ radix-4 BF
‰ 75% utilization
‰ Used to build one of the largest single-chip
FFTs (8192pts) [Bidet’95]
‰
x(n+
N
)
4
x(n+
N
)
2
3N
x(n+ )
4
-j
-1
-1
n
WN
-j
y(n+
N
)
4
y(n+
N
)
2
2n
WN
-1
j
-1
y(n)
3n
WN
3N
y(n+ )
4
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
7
R4SDC commutator and butterfly details
input
Nt
Nt
Nt
1
T
0
Nt
2 1 0 15 14 13 12 11 10 9 8 7 6
Time
x(n)
5 4 3 2 1 0 15 14 13
t'+16T
input
t'
Nt
1
0
1
0
1
0
Nt
mt
0
1
2
3
c1
1
0
0
0
c2 c3
1 1
1 1
0 1
0 0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
0
2 1 0 15 14 13 12 11 10 9 8 7 6
5 4 3 2 1 0 15 14 13
14 13 12 11 10 9 8
7 6 5 4 3 2
1 0 15 14 13 12 11 10 9
10 9 8 7 6 5 4
3 2 1 0 15 14 13 12 11 10 9 8 7
Time
Figure by MIT OpenCourseWare.
6 5 4 3 2 1 0 15 14 13 12 11 10 9
0
0
1
0
2
0
3
0
4
1
5
1
6
1
7
1
8
2
9
2
10
2
11
2
12
3
13
3
14
3
15
3
stage 1
0
0
0
0
0
1
2
3
0
2
4
6
0
3
6
9
0
0
1
4
2
8
3
12
0
1
1
5
2
9
3
13
0
2
1
6
2
10
3
14
0
3
1
7
2
11
3
15
2:1 multiplexers
8 7
6 5 4
t'+28T
m =3
1
Outputs from
commutator at
stage 1
6 5
3 2 1
t'+12T
m1= 2
m1= 1
m1= 0
Figure by MIT OpenCourseWare.
re (0)
im (0)
re (1)
im (1)
add/sub
add/sub
Re
add/sub
D
re (2)
im (2)
re (3)
im (3)
add/sub
add/sub
add/sub
mt
0
1
2
3
stage 2
Figure by MIT OpenCourseWare.
Im
c4
0
1
0
1
c5
0
0
1
1
c6 (0 = addition, 1 = subtraction)
0
1
1
0
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
8
Some conclusions
‰
Delay feedback approaches are always more
efficient than corresponding delay-commutator
approaches
„
In terms of memory utilization
„
‰
Since butterfly outputs share same storage with its inputs
Pipeline architectures require FFT algorithms to
be formulated in a “hardware-oriented” form
„
„
Where spatial regularity is preserved in a signal-flow
graph (SFG)
So that arithmetic operations can be tightly
scheduled for efficient hardware utilization
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
9
Decomposition – a review
‰
Twiddle factor is Nth primitive root of unity
„
‰
With exponent evaluated modulo N
Most fast algorithms share same general
strategy
„
Map one-dimensional transform int a two or multidimensional representation
„
‰
Exploit congruence property of coefficients to simplify
computation
Unlike traditional step-by-step decomposition
of twiddle factors
„
Cascading the twiddle factor decomposition leads
to new forms of FFT with high-spatial regularity
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
10
Radix 22 approach
‰
Start by classical divide-and-conquer radix-2 DIF
indexing
„
But, consider the first two steps of decomposition together
[Shouseng and Torkelson 1996]
Compute directly in standard radix-2 approach
New idea is to proceed to shorter DFTs cascading
the twiddle factor WN(N/4n2+n3)k1
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
11
A 16pt example
‰
Get radix-4-like mulitplier complexity with
radix-2 butterfly structures (radix-22)
x(0)
x(1)
N/4 DFT
x(2)
(k1=0, k2=0)
x(3)
x(4)
x(5)
x(6)
x(7)
x(8)
x(9)
x(10)
x(11)
x(12)
x(13)
W0
W2
W4
W6
W0
W1
W2
W3
W0
W3
-j
-j
-j
-j
x(14)
x(15)
BF I
6
W
W9
BF II
N/4 DFT
(k1=0, k2=1)
N/4 DFT
(k1=1, k2=0)
N/4 DFT
(k1=1, k2=1)
X(0)
X(8)
X(4)
X(12)
X(2)
X(10)
X(6)
X(14)
X(1)
X(9)
X(5)
X(13)
X(3)
X(11)
X(7)
X(15)
x(0)
x(1)
x(2)
x(3)
x(4)
x(5)
x(6)
x(7)
x(8)
x(9)
x(10)
x(11)
x(12)
x(13)
X(0)
X(8)
X(4)
X(12)
X(2)
X(10)
-j
W2
W4
-j
-j
x(15)
BF I
-j
W1
W2
W3
-j
X(6)
X(14)
X(1)
X(9)
X(5)
X(13)
X(3)
X(11)
W3
W6
-j
-j
x(14)
W6
W9
BF II
X(7)
-j
BF III
X(15)
BF IV
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
12
A 64pt radix-22 example
Image removed due to copyright restrictions.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
13
Radix-22 (R22SDF) architecture
N=256
32
16
8
4
2
1
BF2I
BF2II
BF2I
BF2II
BF2I
BF2II
BF2I
BF2II
clk
X
t
X
W1(n)
7
t
X
X
W2(n)
6
5
4
+
t
X
X
+
64
+
x(n)
128
t
X
X(k)
X
W3(n)
3
2
1
0
Figure by MIT OpenCourseWare.
xr(n)
+
xi(n)
+
-+
-+
xr(n+N/2)
xi(n+N/2)
‰
„
‰
Reduced number of multipliers
0
xr(n)
„
+
1
0
+
1
-
+
+-
+
1
0
1
0
xi(n)
1
0
1
0
+-
xr(n+N/2)
xi(n+N/2)
x
Figure by MIT OpenCourseWare.
zr(n)
zi(n)
Figure by MIT OpenCourseWare.
One identical to that in R2SDF
The other contains the logic for trivial twiddle factor multiplication (with j)
Just a log2N binary counter
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
zi(n+N/2)
t x
Synchronization control very simple due to spatial regularity
„
zr(n+N/2)
(ii). BF2II
Need two types of butterflies
„
‰
(i). BF2I
Similar to R2SDF
0
1
0
1
14
Radix-22 architecture – Sync control
‰
log2N-bit binary counter
„
„
‰
On first N/2 cycles, 2-to-1 mux in BF1 switch to 0
„
‰
Synchronization controller
Address counter for twiddle factor reading in each stage
Butterfly is idle (input data directed to shift registers)
On next N/2 cycles, muxes in BF1 switch to 1
„
„
„
Butterfly computes a 2pt DFT with incoming data and data stored in the shift registers
Output Z1(n) sent to twiddle multiplier
Output Z1(n+N/2) sent back to the shift register to be “multiplied” in next N/2 cycles,
when the first half of the next frame is loaded in
32
16
8
4
BF2I
BF2II
BF2I
BF2II
BF2I
BF2II
clk
X
t
X
X
W1(n)
7
6
t
X
X
W2(n)
5
4
+
t
X
+
64
+
x(n)
128
2
1
BF2I
BF2II
t
X
X
X(k)
W3(n)
3
2
1
0
Figure by MIT OpenCourseWare.
‰
‰
‰
‰
Operation of BF2 is similar, except the “distance” of butterfly input sequence is
just N/4 and the trivial multiply logic
Utilization of the multiplier is 75%
Next frame can be computed w/o pausing due to the pipelined processing in
each stage
Pipeline register can be inserted between each multiplier and BF stage to
improve the performance
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
15
Arithmetic complexity
R2MDC
R2SDF
R4SDF
R4MDC
R4SDC
R22SDF
multiplier #
adder #
memory size
control
2(log4 N - 1)
2(log4 N - 1)
log4 N - 1
3(log4 N - 1)
log4 N - 1
log4 N - 1
4 log4 N
4 log4 N
8 log4 N
8 log4 N
3 log4 N
4 log4 N
3N/2 - 2
N-1
N-1
5N/2 - 4
2N - 2
N-1
simple
simple
medium
simple
complex
simple
Figure by MIT OpenCourseWare.
R22SDF has reached minimum requirement
for both multiplier and storage
‰ Only R4SDC better in terms of adder usage
‰ R22SDF well suited for VLSI implementations
of pipeline FFT processors
‰
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
16
Memory issues
‰
The area/power consumption in the pipeline architectures
dominated by the
„
„
‰
FIFO register files at each stage
Complex multipliers at each (or every other stage)
To diminish the unnecessary data moving in the FIFO need
to reconstruct the storage
„
A known approach is to use FIFO with 2-port RAM
„
„
„
With read and write addresses displaced by a constant
2-port RAM cells 33% more area of the 1-port RAM cell
Use two N/2 1-port RAMs
„
„
b
D(n)
a
E
E
Read and write interleaved
Each active every other cycle
E
c
d
D(n-N)
E
D(n)
lxN 2-port RAM
D(n-N)
N/2-1 RAM
N/2-1 RAM
R/W
Addr.
Figure by MIT OpenCourseWare.
W-addr.
R-addr.
W R
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
17
Single stage hardware example x[3]
1
W
-1
-1
x[4]
2
W
x[5]
x[6]
W
-1
W
3
W
0
2
W
0
2
W
0
-1
-1
X[1]
X[5]
-1
W
X[2]
X[6]
-1
-1
-1
x[7]
0
X[3]
X[7]
N
TFFT = r . logrN .Tr,PE
Where,
N/r = No. of butterfly per stage
logrN = No. of stage
Tr,PE = Time to calculate one butterfly
Figure by MIT OpenCourseWare.
‰
Fold stages onto each other
„
„
P/S
W
X[4]
-1
-1
Counter
x[2]
W
X[0]
N/r Butterflies
-1
W
0
Coeff
ROM
x[1]
0
S/P & Bit reverse
0
W
Control
Circuits
x[0]
Figure by MIT OpenCourseWare.
[Sadat2001]
Need constant geometry signal flow graph
Big price in area for parallelism (within each stage)
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
18
Radix-8 Pipelined/Parallel implementation
‰
A 64pt FFT example for 802.11a
‰
Two dimensional structure of 8pt FFTs
„
The number of nontrivial complex multiplications is 49 (7x7)
„
„
„
‰
[Excerpted from Maharatna et al 2004]
Since the first twiddle is always 1
The number of nontrivial complex multiplications for radix-2
FFT is 66
Radix-4 (or 22) FFTs need only 52 multiplies
Important to note that for 8pt FFT (DIT) no need for
multiplies
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
19
8pt DIT FFT
Figure from Maharatna, K., E. Grass, and U. Jagdhold. "A 64-point Fourier Transform Chip for High-speed Wireless
LAN Application Using OFDM." Solid-State Circuits 39 (2004): 484-493. Copyright 2004 IEEE. Used with permission.
‰
The only nontrivial multiply is with 1/sqrt(2)
„
Easily realize using hardwired shift-and-add
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
20
Block diagram of the FFT unit
Figure from Maharatna, K., E. Grass, and U. Jagdhold. "A 64-point Fourier Transform Chip for High-speed Wireless
LAN Application Using OFDM." Solid-State Circuits 39 (2004): 484-493. Copyright 2004 IEEE. Used with permission.
‰
Two-stages are pipelined
„
‰
Fully parallel in each stage (radix-2 8pt FFT, single clk cycle)
Two performance bottlenecks
„
„
Large number of global wires resulting from the multiplexing of
complex data to the 8-point FFTs
Construction of the multiplier unit to attain the required speed with
minimal silicon are is not trivial
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
21
Input unit
‰
Hard wired outputs and data shifting
„
„
„
‰
To the 8pt FFT
Reduce de-muxing
Reduce global wires
Cannot shift every clk
„
„
Multiplier cannot finish
Extend latency
„
Temporary registers 1,2,3
Figure from Maharatna, K., E. Grass, and U. Jagdhold. "A 64-point Fourier Transform Chip for High-speed Wireless
LAN Application Using OFDM." Solid-State Circuits 39 (2004): 484-493. Copyright 2004 IEEE. Used with permission.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
22
Multiplier unit
‰
49 multiplies
Only nine sets unique (cos,sin)
„
hard-wired constant
„
Significantly less storage space
„
‰
for coefficients
Turn multiplies into shift&add
Figure from Maharatna, K., E. Grass, and U. Jagdhold. "A 64-point Fourier Transform Chip for High-speed Wireless
LAN Application Using OFDM." Solid-State Circuits 39 (2004): 484-493. Copyright 2004 IEEE. Used with permission.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
23
Multiplier unit and scheduling
Figures from Maharatna, K., E. Grass, and U. Jagdhold. "A 64-point Fourier Transform Chip for High-speed Wireless
LAN Application Using OFDM." Solid-State Circuits 39 (2004): 484-493. Copyright 2004 IEEE. Used with permission.
‰
Some of the coefficients requested concurrently by different FFT
outputs
„
‰
‰
Solve by adding temp registers in the input unit
~50% less power and area than 8 standard complex multipliers
Buffer unit similar to input unit, just w/o temporary registers
„
Outputs also hardwired with distance of 8
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
24
Output unit
‰
A mirror of input unit
„
‰
Just w/o temporary registers
Control/sync is simple
„
5-bit counter
„
„
Starts counting when in put full
Local counters control
„ Input
„ Intermediate
„ Output units
Figure from Maharatna, K., E. Grass, and U. Jagdhold. "A 64-point Fourier Transform Chip for High-speed Wireless
LAN Application Using OFDM." Solid-State Circuits 39 (2004): 484-493. Copyright 2004 IEEE. Used with permission.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 C ommunication System Design
25
Readings
‰
[1] H.e. Shousheng and M. Torkelson "A new approach to pipeline FFT processor," Parallel
Processing Symposium, 1996., Proceedings of IPPS '96, The 10th International no. SN -,
pp. 766-770, 1996.
„
‰
‰
‰
‰
[2] E. Wold and Alvin M. Despain "Pipeline and Parallel-Pipeline FFT Processors for VLSI
Implementations," IEEE Trans. Computers vol. 33, no. 5, pp. 414-426, 1984.
[3] G. Bi and E.V. Jones "A pipelined FFT processor for word-sequential data," Acoustics,
Speech, and Signal Processing [see also IEEE Transactions on Signal Processing], IEEE
Transactions on vol. 37, no. 12 SN - 0096-3518, pp. 1982-1985, 1989.
[4] K. Maharatna, E. Grass and U. Jagdhold "A 64-point Fourier transform chip for highspeed wireless LAN application using OFDM," Solid-State Circuits, IEEE Journal of vol. 39,
no. 3 SN - 0018-9200, pp. 484-493, 2004.
Interesting DIT&F algorithm
„
‰
[3] H.e. Shousheng and M. Torkelson "Designing pipeline FFT processor for OFDM (de)modulation,"
Signals, Systems, and Electronics, 1998. ISSSE 98. 1998 URSI International Symposium on no. SN
-, pp. 257-262, 1998.
[4] C. Chiu, W. Hui, T.J. Ding and J.V. McCanny "A 64-point Fourier transform chip for video motion
compensation using phase correlation," Solid-State Circuits, IEEE Journal of vol. 31, no. 11 SN 0018-9200, pp. 1751-1761, 1996.
Power-performance estimation
„
„
[2] S. Hong, S. Kim, M.C. Papaefthymiou and W.E. Stark "Power-complexity analysis of pipelined
VLSI FFT architectures for low energy wireless communication applications," Circuits and Systems,
1999. 42nd Midwest Symposium on vol. 1, no. SN -, pp. 313-316 vol. 1, 1999.
[3] K. Pagiamtzis and P.G. Gulak "Empirical performance prediction for IFFT/FFT cores for OFDM
systems-on-a-chip," Circuits and Systems, 2002. MWSCAS-2002. The 2002 45th Midwest
Symposium on vol. 1, no. SN -, pp. I-583-6 vol.1, 2002.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design
26
Download