18.336 Pset 1 Solutions Problem 1: Trig. interp. poly.

advertisement
18.336 Pset 1 Solutions
0
10
−1
10
Problem 1: Trig. interp. poly.
−2
10
When aliasing is included, the general form
|ρ(x) − d2φ100/dx2|
(a)
for the interpolated function is:
f (x) =
N
−1
X
ck ei(k+`k N )x ,
k=0
−3
10
−4
10
−5
10
−6
10
−7
10
and thus the mean-square slope is:
−8
10
1
2π
Z
2π
0
2
|f (x)| dx =
0
N
−1
X
2
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
x / 2π
2
(k + `k N ) |ck |
k=0
Figure 1: The absolute error
since the integration kills all of the cross terms
in the squared summation, by the usual orthog-
x/2π ,
onality.
the derivative
Each of these terms is non-negative,
|ρ(x) − φ00N (x)|
vs.
using a nite-dierence approximation for
φ00N .
and so the mean-square slope is minimized when
(k +`k N )2 is minimized indepen- Problem 2: Solving Poisson's eq.
N = 2M + 1 (odd), then for k ≤ M
via
nite
dif(a) To evaluate φ00
this is minimized for `k = 0, and for k > M
N (x)
in
Matlab,
we
simply
use
it is minimized for `k = −1 (since in that case ferences
This is
|k −N | ≤ M < k ). Q.E.D. (since M = (N −1)/2 diff(diff(phi))/(2*pi/100)^2.
equivalent to:
and M + 1 = (N + 1)/2).
each coecient
dently. If
(b)
The symmetric form from (a) can equiva-
φN (x + ∆x) − 2φN (x) + φN (x − ∆x)
φ00N (x) ∼
=
∆x2
lently be written as
M
X
f (x) =
which (as we shall explore in more detail in
ck eikx ,
lecture) is a second-order approximation with
k=−M
where
c−k ≡ cN −k
O(∆x2 ) error.
versus x using
according to aliasing. Since
that
discontinuity, but in this case the error is exactly
zero there, and the error elsewhere is lower than
we might expect from pessimistic upper bounds.
(Similarly, in Fourier space the
and thus
f (x) = c0 +
M
X
due to Gibb's phenomena.
ikx
(ck e
+
(b)
L2
ρ(x). First,
the original discontinuous sawtooth ρ1 (x). Second, ρ2 (x) = | sin x| − 2/π which is continuous
c∗k e−ikx ).
c0 = c∗0
2<[ck eikx ].
This is manifestly real, since
each term in the sum is
not
a
unique
nomial, however.
any
c0 = 0 coecient
O(1)
is computed exactly.) Still, the peak error is
error
k=1
This is
this approximation. As we noted
often, the peak error would be at the point of
fn (fn = fn∗ ) it immediately follows
c−k = c∗k ,
|ρ(x) − φ00N (x)|
in class, there is a happy accident in this case
N −1
1 X
nk
ck =
fn ωN
,
N n=0
then for real
In gure 1 we plot
is real and
In gure 2, we show the approximate
∆N
vs.
N
for three functions
with discontinuous slope.
Third, we consider
ρ3 (x) = 2x/π − 1 + cos x for x < π and ρ3 (x) =
2x/π − 3 − cos x for x > π , which is continuous
real interpolating poly-
In particular, we could also
as long as `−k ≡ `N −k + 1 =
−`k , as this would preserve the property that we
with continuous rst derivative but discontinu-
have a sum of terms and their complex conju-
ness (see below), we see that
gates.
O(1/N 2 )
choose
shifts
`k
ous second derivative. After some initial bumpi-
1
both ρ1
and
ρ2
give
convergence. The simple upper bound
0
0
10
10
ρ = sawtooth
ρ = |sin x|
ρ = 2x/π ± cos x ...
−2
10
−2
10
1/N2
1/N
approx. L2 error ∆N
approx. L2 error ∆N
−4
10
4
−4
10
−6
10
−8
10
−10
10
−6
10
−8
10
−10
10
−12
10
−12
10
−14
10
−14
10
−16
10
−16
10
−18
1
2
10
3
10
10
4
10
10
0
10
20
30
40
50
Figure 2: Approximate
three
ρ(x)
L2
1/N 4
and
N
vs.
for
Figure 3: Approximate
1
ρ(x) = f (x)− 2π
1/(2 + cos x).
with delta-function rst, second, and
third derivatives respectively.
1/N 2
∆N
error
60
70
80
90
100
N
N
smooth
Also shown are
L
R 2 error ∆N vs. N for
f (x)dx, where f (x) =
lines, for reference.
ric convergence. We therefore plot on a semilog
from class would be
2
O(1/N )
and
O(1/N )
for discontinuous
for discontinuous-slope
explained in class why this
ρ1
ρ,
scale, and indeed obtain the expected straight-
ρ
line
but we
e−αN
behavior. At least, we get a straight
line until the error reaches
happens to do bet-
10−16
or so, at which
the
point it saturatesthis is nothing more than the
general upper bound for discontinuoussecond-
limit imposed by the nite precision of double-
ter.
ρ3
The
convergence goes as
4
O(1/N );
but this function hap-
precision oating-point arithmetic (numbers on
pens to do better because of a similar accident
the computer are represented with only about 16
ρ
derivative
as
ρ1
is
3
O(1/N ),
(we have constructed
ρ3
so that its
is summed exactly, whereas this is
ρ2 )
not
signicant digits of precision).
c0 = 0
true for
.
Problem 3: FFTs
The initial bumpiness of the
ρ1
convergence
For radix-2 DIT Cooley-Tukey, the outputs are
is not random, howeverit follows a straight
line with
puted
1/N
∆N
for more
N
Θ(1/N 2 ).
N,
where
∆N
is
N1 = 2
and
N2 = N/2):
values we would see that
there are a whole sequence of
ing to large
computed via (from class, taking
convergence. In fact, if we com-


N points, extendN
1
2 −1
X
X
Θ(1/N ) instead of y N
n2 k 2  n1 k 1
ω n1 k2
ω2
xn1 +2n2 ωN/2
k1 +k2 =
N
What's going on? The answer is that
there is a bug in the code: at the point
2
n1 =0
x = π,
n2 =0
ρ1 (π) should be zero, but because of rounding er- where the inner sum is a size-N/2 DFT comlinspace function the middle x value puted recursively, and the outer sum is a size-2
is sometimes slightly greater or less than pi, and
DFT.
for these values of N we get sum(rho) 6= 0. Then
(a) Let T (N ) be the number of real-arithmetic
the c0 Fourier coecient is not computed exactly
operations. Then T (N ) = 2T (N/2) + Θ(N ) +
and the error returns to the 1/N upper bound.
(N/2)T (2), where the Θ(N ) represents the mul2
n1 k 2
To x this and get O(1/N ) convegence everytiplications by the twiddle factors ωN
. To get
where we could simply set rho(N/2+1)=0, howT (N ) more explicitly, we need to count the numever.
ber of operations in T (2) and the twiddle multiFinally, in gure 3, we show ∆N for ρ(x) = plications.
R 2π
1
dx
1/(1+2 cos x)− 2π
, which is smooth
T (2) is easy since ω2 = e−πi = −1: a DFT
0 1+2 cos x
on the real-x axis and should thus give geometof length 2 is just an addition y0 = x0 + x1 and
ror in the
2
a subtraction
y1 = x0 − x1 ,
and since these are
the usual conjugate symmetry, and similarly for
T (2) = 4.
n1 k 2
the ωN
mul-
ok2 .
complex-number additions we get
Naively, we might think that
tiplications require
N
complex multiplications,
but many of these are multiplications by
hence have no cost.
N/2
1
1
k
yk = ek + ωN
ok
and
n1 = 0
In particular, for
they are multiplications by
to consider the
In order to combine the outputs of these
two sub-DFTs, we must do:
for
so we only need
multiplications for
Some of these also simplify: for
k2 = 0
0 ≤ k < N/2,
and
k
yN/2+k = ek − ωN
ok
for
the other halfthis is the DFT of size 2 (the
n1 = 1 .
n1
it is 1,
yN −k = yk∗ , we need
only compute half of these outputs (0 ≤ k ≤
N/2). Right away, this saves us half of the 4N/2
sum).
However, since
k2 = N/4 it is −i which is also cost-free,
really care about counting operations
additions that were previously required for the
you also must note that for k2 = N/4 ± N/8
√
T (2) transforms. However, it still seems like we
you get −(i ± 1)/ 2 which requires only 2 real
k
are multiplying by N/2 twiddle factors ωN we
multiplications and 2 real additions. However,
and for
and if you
need to save half of these somehow.
the important point is that the number of these
We can save half of the twiddle multiplies by
special cases is bounded (at most four of them),
using the conjugate symmetry of
and so we know that the twiddle multiplications
require
6N/2 − O(1)
get
real-arithmetic operations
(since each general complex multiply takes 6 op-
yN/2−k
for
k < N/4.
N/2−k
yN/2−k = eN/2−k +ωN
erations). Thus, we now know that
ek
and
ok ,
to
In particular:
k
oN/2−k = e∗k −(ωN
ok )∗ ,
N/2−k
T (N ) = 2T (N/2) + 5N − O(1)
where we have used the fact that ωN
=
−k
= 2[2T (N/4) + 5N/2 − O(1)] + 5N − O(1) −ωN . Therefore, we only need to compute
k
ωN
ok for k ≤ N/4 and obtain the N/4 <
= ···
k < N/2 values by conjugation (which is only
Now we are almost done. If we repeat this recur-
a sign ip and requires no multiplies or addi-
sively down to
tions). Thus, we have saved roughly half of the
From each
N = 2, there are log2 N −1 stages.
stage we pick up 5N operations (N
5N
operations that were previously required to
combine the sub-transforms:
halves at each stage but the number of trans-
5N (log2 N − 1)
O(1)
operations multiplied by 1, 2, 4, · · · but the sum
log2 N
of this geometric series is ∼ 2
= N and thus
this term is at most O(N ). Thus,
forms doubles), which gives us
5
Tr (N ) = 2Tr (N/2) + N + O(1)
2
operations. From each stage we also pick up
where
#=
(b)
and
Tr (N )
and
denote the number of real-
into two DFTs of length
Tr (N ) = 2Tr (N/2)+??.
real
N/2,
inputs and thus
Denote the results of
these two real-input sub-DFTs as
ek2
and
ok2
(the transforms of the even- and odd-indexed
xn ,
respectively).
1 In
1968,
Note that
Yavne's split-radix
eN/2−k2 = e∗k2
the
same
arguments
by
algorithm achieved
T (N ) = 4N log2 N + Θ(N ), which was the best count for
N = 2m for 36 years. The current best count for N = 2m ,
34
achieved in 2004, is T (N ) =
N log2 N + Θ(N ).
9
3
as
above,
and we asymptoti-
cally save half the operations. Q.E.D.
Then the Cooley-Tukey algorithm breaks the
N
by
Tr (N ) = 52 N log2 N + O(N )
gorithm specialized to the case of real inputs.
both of which again have
bounded value that
k = N/4
k = N/8, the k1 = 1, k2 = 0 term for yN/2 ,
the k1 = k2 = 0 term which is purely real.
Then,
arithmetic operations required for the same al-
DFT of length
±
factors that simplify as before such as
5.1
Let
is again some
optimizations and special cases: the few twiddle
T (N ) = 5N log2 N − O(N )
and
O(1)
takes into account a bounded number of extra
Download