D IRECT C OMPUTATION OF T WO -D IMENSIONAL DFT S
If the twiddle factors (W
N
terms) are pre-computed and stored in a table, then we require
N
1
2 N
2
2 complex multiplies and adds
10 12 complex multiplies and adds (CMADs) if N
1
= N
2
=1024:
@1
s/CMAD
11.5 days (low-end programmable processor)
@1 ns/CMAD
16.7 minutes (high-end dedicated processor)
Since bus speeds are significantly slower than processor speeds, we can only process the data in real time if we can feed data to the processor quickly enough. Note that a 200 MHz Texas Instruments TMS320C62x is a fixed-point programmable DSP processor that can compute one CMAD in 10 ns. A CMAD consists of four real multiplications plus four real additions, and the 200 MHz C62x can perform 400 million real multiplications per second. For more information about the C62x, see http://www.ece.utexas.edu/~bevans/courses/ee345s/lectures/01_Architecture
R
OW
-C
OLUMN
D
ECOMPOSITION
We can regroup the computations
X ( k
1
, k
2
)
N
1
1
n
1
0
N n
2
2
1
0 x ( n
1
, n
2
) n
W
N
2
2 k
2
W n
1 k
1
N
1
G ( n
1
, k
2
)
N n
2
2
1
0 x
n
1
,n
2
n
W
N
2
2 k
2
X ( k
1
, k
2
)
n
1
N
1
1
0
G(n
1
,k
2
) W n
1 k
1
N
1
0
0
k
1 k
2
N
1
N
2
1
1
0
0
n
1 k
2
N
1
N
2
1
1
0
0
k
1 k
2
N
1
N
2
1
1
Interpretation
To compute G ( n , k
2
), compute a 1-D DFT of the n th column of x . Do this for all values of n .
Computation can be done using N
1
N
2
-point DFTs +N
2
N
1
point DFT’s
With direct computation of the 1-D DFTs, the requires
N
1
N
2
2
N
2
N
1
2
N
1
N
2
N
1
N
2
CMADS
2
10 9 CMADs if N
1
= N
2
= 1024 for direct computation
@1
s/CMAD
33.3 minutes
@1 ns/CMAD
2 seconds
If N
1
= 2 u ; N
2
= 2 v and a 2-D Fast Fourier Transform (FFT) is used
N
1
N
2
2 log
2
N
2
N
1
N
2 log
2
N
1
2
N
1
N
2
2 log
2
N
2
N
1
2x10 7 CMADs if N
1
= N
2
= 1024
@1
s/CMAD
10 seconds
@1 ns/CMAD
0.01 seconds
This execution time is now fast enough to perform a frame-based DFT for a video sequence at 30 frames/s.
Note: Must be able to access data rowwise and columnwise. This needs memory!
M ULTIDIMENSIONAL
DFT’
S
Things become more complicated as the dimensionality goes up.
X ( k
1
, k
2
,..., k
M
)
n
1 n
2
n
M x ( n
1
, n
2
,..., n
M
) W n
N
M
M k
M ...W
N n
1 k
1
1
1. Compute a 1-D DFT with regard to n
1
for each value of n
2
, n
3
, …, n
M
.
2.
Compute a 1-D DFT with regard to n
2
for each value of k
1
, n
3
, …, n
M
…
.
M. Compute a 1-D DFT with regard to n
M
for each value of k
1
, k
2
, … k
( M -1)
Let N
P
= N
1
N
2
…N
M
, then
# CMADS
i
M
1
j j
M
1
i
N j
N i log
2
N i
2
i
M
1
j
M
1
N j
N p
1
2 log
2
N i
N
P
1
2 log
2
N
P
D EALING WITH S TORAGE
If the machine on which an FFT is to be computed has limited memory, then it may be difficult to access the data both rowwise and columnwise.
This is particularly true for 3-D and higher.
In such cases the data is typically stored rowwise in secondary storage and matrix transposition is performed.
1.
Take row DFTs
2.
Transpose
3.
Take row DFTs
4.
Transpose etc.
E
KLUNDH
’
S
T
RANSPOSITION
A
LGORITHM
If N
1
= N
2
= 2 u , then transposition of a 2-D sequence (note that the origin is in the lower left-hand corner) can be performed recursively in u stages using only 2 rows of memory. The algorithm requires u passes through the matrix.
X
X t
X
X
X
X
1
0
2
0 t t
X
X
1
X
X
3
3
2 t
t
M ORE G ENERAL DFT A LGORITHMS
The rectangular DFT relates a finite extent rectangular array to rectangular samples of its Fourier transform.
Is there a DFT for hexagonal arrays?
Is there a DFT connecting a rectangular array to hexagonal samples of its Fourier transform?
Definitions
( n ) : periodic sequence with periodicity matrix N .
I
N
: set of |det( N )| samples that makes up exactly one period of x ( n ) .
Hypothesis
Let’s hypothesize that
( n ) can be expressed as a finite linear combination of harmonically related complex sinusoids.
( n )
k
J
N a ( k ) e j k
T
R
T n where k is an integer vector and J
N
is a set of multidimensional indices.
(*)
If this representation is valid, it must be consistent with the periodicity of x .
( n )
( n
Nr )
k
J
N k
J
N a ( k a ( k
) e
) e j k
T
R
T
n
Nr
j k
T
R
T
Nr e j k
T
R
T n
(**)
Equating the right sides of (*) and (**) and making use of the orthogonality property of the complex exponentials, for any harmonic that is not zero, we must have e j k
T
R
T
Nr
1
R
T
N
2
I
R
T
2
N
1
If we define
~
X
det N a ( k ) , then x ( n )
1 det N
k
J
N
~
X ( k ) e j k
T
2
N
1
n
The complex exponentials in this sum are periodic in n with periodicity matrix N and in k with periodicity matrix N t .
J
N contains det N
T det N members
~
X ( k )
( n ) e
j k
T
( 2
N
1
) n n
I
N
If we replace the periodic arrays x
~ and X with support on I
N
and J
N,
~
and X with finite support arrays
respectively, we obtain the x generalized DFT. x ( n )
X ( k )
1 det N k
J
N
X
n
I
N x ( n ) e
j k
T
( k ) e j k
T
2
N
1
n
2
N
1
n
, n
I
N
, k
J
N
S AMPLING THE F OURIER T RANSFORM
Define
X (
ω
)
n x ( n ) e j ω T n
Then
X ( k )
X ( ω )
ω
( 2
N
T
) k
Recall
Note: If V is given, we have limited control over the locations of the
Ω
V
T ω
ω
V
T Ω
Ω k
V
T
( 2
N
T
) k
2
( VN )
T k frequency samples because of the constraint that N must be an integer matrix.
Q
UINCUNX
DFT
(Figure 3.10 in notes)
V
T
2
T
2
T
2
T
2
If we use
N
N
0
0
N
Then
k
1
T
1
T
1
T
1
T
2
N
0
2
0
k k
1
2
2
2
NT
2
NT
2
NT
k k
1
2
2
NT
1
1
1
1
k k
1
2
N NT
This matrix [1 1; 1
–1] is called a quincunx sampling matrix (a nonrectangular sampling matrix of determinant +2 or –2). This is quincunx sampling.
E VALUATING GENERAL
DFT’
S
Fact: If N is diagonal, the general DFT can be evaluated using a rowcolumn algorithm
X ( k
1
, k
2
)
2
n
1 n
2 x ( n
1
, n
2
) e
j
k
1 k
2
N
1
0
n
1 n
2
2
0
N
2
n
1 n
2
x ( n
1
, n
2
) e
j
2
n
1 k
1
N
1 e
j
2
n
2 k
2
N
2
Smith Form Decomposition:
Any non-singular integer matrix can be expressed as
N = U D V where
U , V : unimodular integer matrices (determinant of +1 or –1)
D : diagonal integer matrix
Substituting a Smith form of a periodicity matrix N ,
X ( k )
n
I
N
n
I
N
n
I
N x ( n ) e
j k
T x ( n ) e
j k
T
2
N
1
n
2
V
1
D
1
U
1
n x ( n ) e
j 2
V
T k
We would like to express the DFT of a signal with non-rectangular periodicity in terms of a separable DFT. We would like the term in the
j 2
k
T
D
1 n exponential to be of the form
We work towards this goal by making the following change of variables:
Algorithm
Let
V
T k k
V
T
U
1 n n
U
ˆ
X ( V
T k )
U
ˆ
I
N x ( U
ˆ
) e
j 2
k
T
D
1 ˆ
1.
Compute x (
ˆ
) by permuting x ( n ) according to
U
1 n . The permuted values should fall (or be made to fall using circular shifting) into the fundamental parallelopiped formed by D , i.e. a rectangular volume of samples.
2.
Calculate a separable DFT according to entries on the diagonals of D
3.
Permute DFT X (
ˆ
) using k
V
T k .