Fast Algorithms for the Computation of Oscillatory Integrals

Emmanuel Candès
California Institute of Technology
EPSRC Symposium Capstone Conference
Warwick Mathematics Institute, July 2009
Collaborators
Lexing Ying
Laurent Demanet
Our problem
Evaluate numerically a Fourier integral operator (FIO)
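$(Tf)(x) = \int_{\mathbb{R}^d} a(x,k)\, e^{2\pi i \Phi(x,k)}\, \hat{f}(k)\, dk$
at points x given on a Cartesian grid
$k \in \mathbb{R}^d$: frequency variable, with $\hat{f}(k) = \int_{\mathbb{R}^d} e^{-2\pi i x \cdot k} f(x)\, dx$
a(x, k): (smooth) amplitude
Φ(x, k): (smooth) phase function, homogeneous of degree 1 in k and hence as large as |k|
$\Phi(x, \lambda k) = \lambda\, \Phi(x, k)$ for $\lambda > 0$, e.g. $\Phi(x,k) = g(x)|k|$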
A motivating example: wave propagation
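$\frac{\partial^2 u}{\partial t^2}(x,t) = c^2 \Delta u(x,t)$, with $u(x,0) = u_0(x)$ and $\frac{\partial u}{\partial t}(x,0) = 0$
Solution operator is
$u(x,t) = \frac{1}{2}\left( \int_{\mathbb{R}^2} e^{2\pi i (x\cdot k + c|k|t)}\, \hat{u}_0(k)\, dk + \int_{\mathbb{R}^2} e^{2\pi i (x\cdot k - c|k|t)}\, \hat{u}_0(k)\, dk \right)$
Two FIOs with phase functions
$\Phi_\pm(x,k) = x\cdot k \pm c|k|t$
Inhomogeneous medium c(x) → solution operator = sum of two FIOs (small times)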
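The two-FIO formula can be checked numerically in the constant-speed case. The sketch below is an illustration under assumptions not in the slides: the domain is the periodic unit square rather than R^2, the integrals are replaced by FFTs, and the function name `propagate` and the test wave are hypothetical. It applies the two multipliers exp(±2πi c|k|t) and averages them.

```python
import numpy as np

def propagate(u0, c, t):
    """Constant-speed wave on the periodic unit square, started at rest.

    Applies the two half-wave multipliers exp(+-2*pi*i*c*|k|*t) in Fourier
    space and averages them, mirroring the two-FIO solution formula.
    """
    n = u0.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)              # integer frequencies
    k1, k2 = np.meshgrid(k, k, indexing="ij")
    absk = np.hypot(k1, k2)
    u0_hat = np.fft.fft2(u0)
    plus = np.fft.ifft2(np.exp(+2j * np.pi * c * absk * t) * u0_hat)
    minus = np.fft.ifft2(np.exp(-2j * np.pi * c * absk * t) * u0_hat)
    return 0.5 * (plus + minus).real

# Plane-wave check: u0 = cos(2 pi m x1) evolves to u0 * cos(2 pi c m t).
n, c, t, m = 64, 1.0, 0.3, 3
x1 = np.arange(n)[:, None] / n * np.ones((1, n))
u0 = np.cos(2 * np.pi * m * x1)
u = propagate(u0, c, t)
exact = u0 * np.cos(2 * np.pi * c * m * t)
err_wave = np.max(np.abs(u - exact))
```

With a variable speed c(x) the multipliers no longer diagonalize in Fourier space, which is where a general FIO evaluation algorithm becomes necessary.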
Importance of FIOs
Arise in many (inverse) problems
Applying FIOs is often the computational bottleneck
Example in seismics
Marine survey
Kirchhoff migration
Wave measurements $f_s(t, x_r)$ parametrized by
time t
receiver location $x_r$
source coordinate $x_s$
Forward map: $F\, \delta c = \delta p$
Imaging operator is $F^*$: FIO under general assumptions
Approximations by generalized Radon transform (GRT): integration of $f_s$ over a fixed set of curves parametrized by travel times
$g_s(x) = \int \delta(t - \tau(x, x_r) - \tau(x, x_s))\, f_s(t, x_r)\, dt\, dx_r$
Followed by a stack operation over the s index
Other examples
Transmission electron microscopy
Radar imaging
Ultrasound imaging
Discrete computational problem
Discrete grids
$X = \{(i_1/N, i_2/N) : 0 \le i_1, i_2 < N\} \subset [0,1]^2$
$\Omega = \{(k_1/N, k_2/N) : -N/2 \le k_1, k_2 < N/2\} \subset [-1/2, 1/2]^2$
Given input $\{f(k)\}_{k \in \Omega}$, evaluate
$(Tf)(x) := \frac{1}{N} \sum_{k \in \Omega} a(x,k)\, e^{2\pi i N \Phi(x,k)} f(k)$
at all x ∈ X, with Φ smooth and homogeneous in k
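For reference, the discrete sum can be evaluated directly. The sketch below is a naive baseline, not the fast algorithm of the talk: each of the N^2 outputs sums over N^2 inputs. The function name `direct_fio` and the callback signatures `amp(x, K1, K2)` and `phase(x, K1, K2)` are hypothetical choices made for this illustration.

```python
import numpy as np

def direct_fio(f, amp, phase, N):
    """Naive evaluation of (Tf)(x) = (1/N) sum_k a(x,k) exp(2 pi i N Phi(x,k)) f(k).

    f is an N x N array indexed by k in Omega; the result is indexed by x in X.
    """
    xs = np.arange(N) / N                        # grid X along one axis
    ks = np.arange(-N // 2, N // 2) / N          # grid Omega along one axis
    K1, K2 = np.meshgrid(ks, ks, indexing="ij")
    out = np.empty((N, N), dtype=complex)
    for a in range(N):
        for b in range(N):
            x = (xs[a], xs[b])
            kernel = amp(x, K1, K2) * np.exp(2j * np.pi * N * phase(x, K1, K2))
            out[a, b] = np.sum(kernel * f) / N
    return out

# Sanity check: with a = 1 and Phi(x, k) = x.k the operator is an inverse DFT.
N = 8
rng = np.random.default_rng(0)
f = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
out = direct_fio(f, lambda x, k1, k2: 1.0,
                 lambda x, k1, k2: x[0] * k1 + x[1] * k2, N)
ref = N * np.fft.ifft2(np.fft.ifftshift(f))
```

For a general phase Φ no FFT is available, so this quadratic-in-grid-size loop is the baseline that a fast summation has to beat.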
Peek at the results
$(Tf)(x) := \frac{1}{N} \sum_{k \in \Omega} a(x,k)\, e^{2\pi i N \Phi(x,k)} f(k)$, x ∈ X
Kernel is not analytic and is highly oscillatory
Naive evaluation: $O(N^4)$ ($O(N^2)$ inputs/outputs)
Algorithm for fast summation: $O(N^{2.5} \log N)$ (C., Demanet and Ying, ’06)
Today:
Novel algorithm with optimal complexity for accurate summation
$O(N^2 \log N)$ flops
$O(N^2)$ storage
Agenda
The butterfly structure
Fast butterfly algorithm for the evaluation of FIOs
The Butterfly Structure
Butterfly algorithm
General algorithmic structure for evaluating certain types of integrals
$u_i = \sum_j K(x_i, p_j)\, f_j$
Introduced by Michielssen and Boag (’96)
Generalized by O’Neil and Rokhlin (’07)
Example: $K(x,p) = e^{2\pi i N x p}$
$\{x_i\}$: N points in [0, 1]
$\{p_j\}$: N points in [0, 1]
$\{f_j\}$: sources at $\{p_j\}$
Applications
FFT
nonuniform FFTs
many others
Kernel is dense and oscillatory
Low-rank approximation ($K(x,p) = \exp(2\pi i N x p)$)
A interval in x
B interval in p
obeying
$\mathrm{length}(A) \times \mathrm{length}(B) \le 1/N$
The submatrix $\{K(x_i, p_j) : x_i \in A,\ p_j \in B\}$ has approximately low rank:
$\left| K(x,p) - \sum_{t=1}^{r} a_t(x)\, b_t(p) \right| \le \varepsilon$ with $r = O(\log(1/\varepsilon))$
Example
$10^6 \times 10^6$ DFT
Top left $10^3 \times 10^3$ block
[Figure: top singular values of the block on a logarithmic scale; vertical axis from $10^0$ down to $10^{-20}$, horizontal axis from 0 to 100]
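The decay of the singular values is easy to reproduce at a smaller scale. The sketch below uses assumed sizes (N = 2^16 with a 2^8 × 2^8 top-left block, so that length(A) × length(B) = 1/N as in the example) and an assumed relative threshold of 1e-10 for the numerical rank.

```python
import numpy as np

N = 2 ** 16                     # size of the full DFT matrix (never formed)
m = 2 ** 8                      # block size, chosen so that (m/N) * (m/N) = 1/N
idx = np.arange(m)
# K(x_i, p_j) = exp(2 pi i N x_i p_j) with x_i = i/N and p_j = j/N
block = np.exp(2j * np.pi * np.outer(idx, idx) / N)
s = np.linalg.svd(block, compute_uv=False)
num_rank = int(np.sum(s > 1e-10 * s[0]))   # numerical rank at relative 1e-10
print(m, num_rank)
```

The numerical rank comes out far below the block size m, which is the property the butterfly algorithm exploits.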
Interpolative decompositions
O’Neil and Rokhlin suggest using interpolative decompositions
$a_t(x) := K(x, p_t)$, $p_t \in B$
Rank-revealing QR decomposition: Gu and Eisenstat (’96)
Interpolative decomp.: Cheng, Gimbutas, Martinsson and Rokhlin (’05)
Interpolative representation
$K(x,p) \approx \sum_{t=1}^{r} K(x, p_t)\, b_t(p)$
Cost for an m × n matrix is $O(mn^2)$
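An interpolative representation expresses every column of the block through r selected columns. The sketch below is a simplified stand-in, not the rank-revealing QR of Gu and Eisenstat or the algorithm of Cheng et al.: it picks columns by greedy pivoted Gram-Schmidt and computes the weights by least squares. The function name `interp_decomp` and the sizes are assumptions.

```python
import numpy as np

def interp_decomp(K, r):
    """Pick r columns of K greedily and return (cols, B) with K ~ K[:, cols] @ B."""
    R = np.array(K, dtype=complex)               # residual after projections
    cols = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))   # largest residual column
        cols.append(j)
        q = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(q, q.conj() @ R)           # project out the chosen direction
    # B[t, j] plays the role of b_t(p_j): weights on the selected columns
    B = np.linalg.lstsq(K[:, cols], K, rcond=None)[0]
    return cols, B

N, m, r = 2 ** 16, 2 ** 8, 16
idx = np.arange(m)
K = np.exp(2j * np.pi * np.outer(idx, idx) / N)  # block with length(A)*length(B) = 1/N
cols, B = interp_decomp(K, r)
id_err = np.max(np.abs(K - K[:, cols] @ B))
```

Here `K[:, cols]` holds the functions K(x, p_t) sampled on A, so the factorization has exactly the form of the interpolative representation above.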
Multiscale decompositions
Compute low-rank approximations of all submatrices obeying
length(A) × length(B) = 1/N
Use two-scale relations for efficiency
Definition of partial sums and equivalent sources
Partial sums
$u^B(x) = \sum_{p_j \in B} K(x, p_j)\, f_j$
Approximation for x ∈ A and length(A) × length(B) ≤ 1/N
$u^B(x) \approx \sum_{t=1}^{r} K(x, p_t^{AB}) \left( \sum_{p_j \in B} b_t^{AB}(p_j)\, f_j \right)$, x ∈ A
Equivalent sources for (A, B): $\{f_t^{AB}\}_{1 \le t \le r}$
$f_t^{AB} = \sum_{p_j \in B} b_t^{AB}(p_j)\, f_j$
Compact representation of $K^{AB}$: B → A
$u^B(x) = \sum_{t=1}^{r} K(x, p_t^{AB})\, f_t^{AB}$
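The role of equivalent sources can be illustrated on a single pair (A, B). The sketch below makes several assumptions for illustration: the kernel is the Fourier kernel exp(2πi N x p), the points $p_t^{AB}$ are taken to be r Chebyshev points in B rather than selected columns, and the weights $b_t^{AB}(p_j)$ are obtained by least squares.

```python
import numpy as np

N = 2 ** 12
A = np.arange(0, 64) / N             # x_i in A, length(A) ~ 64/N
Bpts = np.arange(128, 192) / N       # p_j in B, length(B) ~ 64/N, product = 1/N
r = 16

def kern(x, p):
    """Kernel matrix K(x_i, p_j) = exp(2 pi i N x_i p_j)."""
    return np.exp(2j * np.pi * N * np.outer(x, p))

# r Chebyshev points in B stand in for the points p_t^{AB} (an assumption)
z = np.cos((2 * np.arange(r) + 1) * np.pi / (2 * r))
lo, hi = Bpts[0], Bpts[-1]
p_t = 0.5 * (lo + hi) + 0.5 * (hi - lo) * z

# b[t, j] = b_t(p_j), fitted so that K(x, p_j) ~ sum_t K(x, p_t) b_t(p_j) on A
b = np.linalg.lstsq(kern(A, p_t), kern(A, Bpts), rcond=None)[0]

rng = np.random.default_rng(0)
f = rng.standard_normal(len(Bpts))
f_eq = b @ f                          # equivalent sources f_t^{AB}
u_direct = kern(A, Bpts) @ f          # partial sum u^B(x) on A
u_compact = kern(A, p_t) @ f_eq       # compact representation
eq_err = np.max(np.abs(u_direct - u_compact)) / np.max(np.abs(u_direct))
```

Sixteen equivalent sources reproduce the effect of 64 true sources on A, which is what makes the compact representation cheaper to carry through the recursion.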
Butterfly structure
Recursive computation of $\{p_t^{AB}\}$, $\{f_t^{AB}\}$ for length(A) × length(B) = 1/N
Initialization
For all (A, B) with ℓ(A) = 1 and ℓ(B) = 1/N, construct $\{p_t^{AB}\}_{1 \le t \le r}$ and $\{f_t^{AB}\}_{1 \le t \le r}$
$f_t^{AB} = \sum_{p_j \in B} b_t^{AB}(p_j)\, f_j$
cost of constructing $\{p_t^{AB}\}$ is $O(r^2 N)$/pair
cost of constructing $\{f_t^{AB}\}$ is $O(r^2)$/pair
⇒ tot. cost is $O(r^2 N^2)$
Interpretation: $\{p_t^{AB}\}$ selected columns, $\{f_t^{AB}\}$ column weights
Recursion: “merge, split, compress”
For all (A, B) with ℓ(A) = 1/2 and ℓ(B) = 2/N, get $\{p_t^{AB}\}_{1 \le t \le r}$ and $\{f_t^{AB}\}_{1 \le t \le r}$
$u^B(x) \approx \sum_{i=1}^{2} \sum_{t=1}^{r} K(x, p_t^{A_p B_i})\, f_t^{A_p B_i}$, $x \in A \subset A_p$
Can reduce the rank of $\{K(x, p_t^{A_p B_i}) : x \in A\}$, $\{p_t^{A_p B_i}\}_{t,i}$ → $K(x, p_t^{AB})$
Treat $\{f_t^{A_p B_i}\}_{t,i}$ as sources at $\{p_t^{A_p B_i}\}_{t,i} \subset B$
$f_t^{AB} = \sum_{i=1}^{2} \sum_{s=1}^{r} b_t^{AB}(p_s^{A_p B_i})\, f_s^{A_p B_i}$
cost of constructing $\{p_t^{AB}\}$ is $O(r^2 N)$/pair
cost of constructing $\{f_t^{AB}\}$ is $O(r^2)$/pair
⇒ tot. cost is $O(r^2 N^2)$
Schematic representation
$u^{B_i}(x) = \sum_{t=1}^{r} K(x, p_t^{A_p B_i})\, f_t^{A_p B_i}$
$u^B(x) = \sum_{t=1}^{r} K(x, p_t^{AB})\, f_t^{AB}$
Next step
... until ℓ(B) = 1 and ℓ(A) = 1/N
Termination
In the end, ℓ(A) = 1/N and ℓ(B) = 1, and
$u^B(x) \approx \sum_{t=1}^{r} K(x, p_t^{AB})\, f_t^{AB}$
cost of evaluating $u^B(x)$ is $O(r)$/pair
⇒ tot. cost is $O(rN)$
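The initialization, recursion and termination steps can be put together in a compact 1D sketch for the Fourier kernel K(x, p) = exp(2πi N x p). This is an illustration under stated assumptions, not the interpolative-decomposition variant described above: the low-rank factors come from Lagrange interpolation in p on q Chebyshev points per box, the points are x_i = i/N and p_j = j/N, and all function names are hypothetical.

```python
import numpy as np

def cheb(q, lo, hi):
    """q Chebyshev points (roots of T_q) mapped to the interval [lo, hi]."""
    z = np.cos((2 * np.arange(q) + 1) * np.pi / (2 * q))
    return 0.5 * (lo + hi) + 0.5 * (hi - lo) * z

def lagrange(nodes, pts):
    """Matrix M[t, m] = L_t(pts[m]) for the Lagrange basis on `nodes`."""
    M = np.ones((len(nodes), len(pts)))
    for t, nt in enumerate(nodes):
        for s, ns in enumerate(nodes):
            if s != t:
                M[t] *= (pts - ns) / (nt - ns)
    return M

def butterfly_fourier(f, q=14):
    """Approximate u_i = sum_j exp(2 pi i N x_i p_j) f_j with x_i = i/N, p_j = j/N."""
    N = len(f)
    levels = N.bit_length() - 1
    assert 2 ** levels == N, "N must be a power of two"
    osc = lambda x, p: np.exp(2j * np.pi * N * x * p)
    pts = np.arange(N) / N
    # Initialization: A = [0, 1] with center 1/2, one leaf box B per source.
    src = {}
    for b in range(N):
        nodes = cheb(q, b / N, (b + 1) / N)
        w = lagrange(nodes, pts[b:b + 1])[:, 0] * osc(0.5, pts[b]) * f[b]
        src[(0, b)] = np.conj(osc(0.5, nodes)) * w
    # Recursion: halve A, double B, re-interpolate the children's sources.
    for lev in range(1, levels + 1):
        nA, nB = 2 ** lev, N // 2 ** lev
        wA, wB = 1.0 / nA, 1.0 / nB
        new = {}
        for b in range(nB):
            nodes = cheb(q, b * wB, (b + 1) * wB)
            kids = [(c, cheb(q, c * wB / 2, (c + 1) * wB / 2)) for c in (2 * b, 2 * b + 1)]
            mats = [lagrange(nodes, cn) for _, cn in kids]
            for a in range(nA):
                x0 = (a + 0.5) * wA                      # center of A
                acc = np.zeros(q, dtype=complex)
                for (c, cn), Lm in zip(kids, mats):
                    acc += Lm @ (osc(x0, cn) * src[(a // 2, c)])
                new[(a, b)] = np.conj(osc(x0, nodes)) * acc
        src = new
    # Termination: each A holds one target x_a, and B = [0, 1].
    nodes = cheb(q, 0.0, 1.0)
    return np.array([osc(pts[a], nodes) @ src[(a, 0)] for a in range(N)])

rng = np.random.default_rng(1)
f = rng.standard_normal(64) + 1j * rng.standard_normal(64)
u = butterfly_fourier(f)
ref = len(f) * np.fft.ifft(f)           # exact sum, available here via the FFT
bf_err = np.max(np.abs(u - ref)) / np.max(np.abs(ref))
```

Each level handles N pairs (A, B) at O(q^2) cost per pair, so the sketch runs in O(q^2 N log N) operations. For this particular kernel interpolation in p stays accurate at every level; a general phase is the case where a switch of representation becomes relevant.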
Multiscale recursion
Summary: complexity analysis
If low-rank expansions available: $O(r^2 N \log N)$ evaluation time, $O(r^2 N \log N)$ storage
If not: $O(r^2 N^2)$ evaluation time, $O(r^2 N \log N)$ storage
Powerful architecture but
Precomputation time may be prohibitive
Storage may be prohibitive
Our contribution
$O(r^2 N \log N)$ evaluation time and storage complexity is $O(r^2 N)$
Fast Evaluation of FIOs
Recall oscillatory integral
$u(x) = \sum_{k \in \Omega} e^{2\pi i N \Phi(x,k)} f_k$
[Figure: two panels on the unit square, the grid x ∈ X (left) and the grid k ∈ Ω (right)]
Polar coordinates for frequency variable k
Phase may be singular at k = 0 because of homogeneity (e.g. Φ(x, k) = |k|)
Polar coordinates $p = (p_1, p_2)$
$k_1 = p_1 \cos(2\pi p_2)$, $k_2 = p_1 \sin(2\pi p_2)$
[Figure: grids in Cartesian and polar coordinates (unit-square panels), and the resulting partitioning in k (axes from −30 to 30)]
Slight abuse of notations
$u(x) = \sum_{p \in \Omega} e^{2\pi i N \Psi(x,p)} f_p$ ⇒ smooth phase Ψ
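The effect of the polar change of variables can be seen on a concrete phase. The sketch below uses the ellipse phase Φ+ from the numerical results as an example and assumes Ψ(x, p) := Φ(x, k(p)) without any additional rescaling; the helper names are hypothetical. Homogeneity in k turns into linearity in the radial variable p_1, which removes the singularity at k = 0.

```python
import numpy as np

def c1(x):
    return (2 + np.sin(2 * np.pi * x[0]) * np.sin(2 * np.pi * x[1])) / 3

def c2(x):
    return (2 + np.cos(2 * np.pi * x[0]) * np.cos(2 * np.pi * x[1])) / 3

def phi(x, k):
    """Phase Phi_+(x, k) = x.k + sqrt(c1(x)^2 k1^2 + c2(x)^2 k2^2)."""
    return x[0] * k[0] + x[1] * k[1] + np.sqrt((c1(x) * k[0]) ** 2 + (c2(x) * k[1]) ** 2)

def to_cartesian(p):
    """k1 = p1 cos(2 pi p2), k2 = p1 sin(2 pi p2)."""
    return np.array([p[0] * np.cos(2 * np.pi * p[1]), p[0] * np.sin(2 * np.pi * p[1])])

def psi(x, p):
    """Phase expressed in polar frequency coordinates."""
    return phi(x, to_cartesian(p))

x = np.array([0.3, 0.7])
k = np.array([0.2, -0.4])
lam = 3.5
homog_gap = abs(phi(x, lam * k) - lam * phi(x, k))        # homogeneity in k
p = (0.25, 0.1)
radial_gap = abs(psi(x, p) - p[0] * psi(x, (1.0, p[1])))  # linearity in p1
```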
Hierarchical structure
ℓ(A) × ℓ(B) = 1/N
Low-rank interactions
If ℓ(A) ℓ(B) ≤ 1/N, kernel
$\{ e^{2\pi i N \Psi(x,p)} : x \in A,\ p \in B \}$
has approx. low rank
Assume wlog A and B centered at 0
Residual phase in 1D (for simplicity)
$R^{AB}(x,p) = \Psi(x,p) - \Psi(0,p) - \Psi(x,0) + \Psi(0,0) = \partial_x \partial_p \Psi(x^*, p^*)\, xp = O(1/N)$
Same calculation in higher dimensions
Shows that $e^{2\pi i N R^{AB}(x,p)}$ has approx. low rank
Factorization
$e^{2\pi i N \Psi(x,p)} = e^{-2\pi i N \Psi(0,0)}\, e^{2\pi i N \Psi(x,0)} \left[ e^{2\pi i N R^{AB}(x,p)} \right] e^{2\pi i N \Psi(0,p)}$
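The residual-phase argument can be checked numerically on one pair of boxes. The sketch below assumes a made-up smooth 1D phase `psi`, boxes of width 1/√N centered at arbitrary points (x0, p0) instead of 0, and a relative threshold of 1e-8 for the numerical rank; none of these choices come from the slides.

```python
import numpy as np

def psi(x, p):
    """Hypothetical smooth phase, used only for illustration."""
    return x * p + 0.05 * np.sin(2 * np.pi * x) * np.cos(2 * np.pi * p)

N = 4096
w = 1.0 / np.sqrt(N)                      # box width, so l(A) * l(B) = 1/N
x0, p0 = 0.30, 0.60                       # box centers (play the role of 0)
x = x0 + w * (np.arange(64) / 63 - 0.5)   # 64 samples in A
p = p0 + w * (np.arange(64) / 63 - 0.5)   # 64 samples in B
X, P = np.meshgrid(x, p, indexing="ij")

# Residual phase R^{AB} relative to the box centers
R = psi(X, P) - psi(x0, P) - psi(X, p0) + psi(x0, p0)
max_NR = N * np.max(np.abs(R))            # stays O(1) although N * psi is large

kernel = np.exp(2j * np.pi * N * psi(X, P))
factored = (np.exp(-2j * np.pi * N * psi(x0, p0))
            * np.exp(2j * np.pi * N * psi(X, p0))
            * np.exp(2j * np.pi * N * R)
            * np.exp(2j * np.pi * N * psi(x0, P)))

s = np.linalg.svd(kernel, compute_uv=False)
num_rank = int(np.sum(s > 1e-8 * s[0]))
```

The two outer factors depend on x only or on p only, so they act as diagonal scalings and the rank of the block equals the rank of the residual factor.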
Oscillatory Chebyshev interpolation
Chebyshev interpolation of $e^{2\pi i N R^{AB}(x,p)}$ in
x when $\ell(A) \le 1/\sqrt{N}$
p when $\ell(B) \le 1/\sqrt{N}$
E.g. interpolation in p
$\{p_t^B\}$: tensor-Chebyshev grid in B
$L_t^B(p)$: Lagrange interpolant with inputs at $\{p_t^B\}$, $L_t^B(p_{t'}^B) = \delta(t = t')$
With grid of logarithmic size in $1/\varepsilon$
$\left| e^{2\pi i N R^{AB}(x,p)} - \sum_t e^{2\pi i N R^{AB}(x, p_t^B)}\, L_t^B(p) \right| \le \varepsilon$
When interpolation in x, low-rank approximation is $\sum_t e^{2\pi i N R^{AB}(x_t^A, p)}\, L_t^A(x)$ on A × B
Demodulation/Interpolation/Remodulation
From $R^{AB}(x,p) = \Psi(x,p) - \Psi(0,p) - \Psi(x,0) + \Psi(0,0)$
$\left| e^{2\pi i N \Psi(x,p)} - \sum_t \underbrace{e^{2\pi i N \Psi(x, p_t^B)}}_{a_t(x)}\ \underbrace{e^{-2\pi i N \Psi(0, p_t^B)}\, L_t^B(p)\, e^{2\pi i N \Psi(0,p)}}_{b_t(p)} \right| \le \varepsilon$
Similar structure (with different interpretation) when interpolating kernel in x
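The demodulation/interpolation/remodulation bound can be observed numerically. The sketch below assumes a made-up smooth 1D phase `psi`, boxes of width 1/√N centered at (x0, p0) in place of 0, and one-dimensional Chebyshev grids of a few sizes q; the helper names are hypothetical.

```python
import numpy as np

def psi(x, p):
    """Hypothetical smooth phase, used only for illustration."""
    return x * p + 0.05 * np.sin(2 * np.pi * x) * np.cos(2 * np.pi * p)

def cheb(q, lo, hi):
    """q Chebyshev points mapped to [lo, hi]."""
    z = np.cos((2 * np.arange(q) + 1) * np.pi / (2 * q))
    return 0.5 * (lo + hi) + 0.5 * (hi - lo) * z

def lagrange(nodes, pts):
    """Matrix M[t, m] = L_t(pts[m]) for the Lagrange basis on `nodes`."""
    M = np.ones((len(nodes), len(pts)))
    for t, nt in enumerate(nodes):
        for s, ns in enumerate(nodes):
            if s != t:
                M[t] *= (pts - ns) / (nt - ns)
    return M

N = 4096
w = 1.0 / np.sqrt(N)
x0, p0 = 0.30, 0.60
x = x0 + w * np.linspace(-0.5, 0.5, 33)
p = p0 + w * np.linspace(-0.5, 0.5, 33)
kernel = np.exp(2j * np.pi * N * psi(x[:, None], p[None, :]))

errors = {}
for q in (5, 9, 13, 17):
    pt = cheb(q, p0 - w / 2, p0 + w / 2)
    a = np.exp(2j * np.pi * N * psi(x[:, None], pt[None, :]))    # a[i, t] = a_t(x_i)
    b = (np.exp(-2j * np.pi * N * psi(x0, pt))[:, None]          # demodulate at p_t
         * lagrange(pt, p)                                       # interpolate
         * np.exp(2j * np.pi * N * psi(x0, p))[None, :])         # remodulate at p
    errors[q] = np.max(np.abs(kernel - a @ b))
```

Interpolating the kernel directly would require resolving oscillations of size N; after demodulation only the residual factor is interpolated, so a small grid suffices.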
Overall structure and two-scale relation
1. Initialize equivalent sources
2. Propagate equivalent sources (interpolation in p) until mid-level
3. Switch representation at mid-level
4. Propagate equivalent sources (interpolation in x)
5. Terminate by evaluating output
Step 2: propagation of equivalent sources
$f_t^{AB} = \sum_{c, t'} b_t^{B}(p_{t'}^{B_c})\, f_{t'}^{A_p B_c} = e^{-2\pi i N \Psi(x_0(A), p_t^B)} \sum_c \sum_{t'} L_t^B(p_{t'}^{B_c})\, e^{2\pi i N \Psi(x_0(A), p_{t'}^{B_c})}\, f_{t'}^{A_p B_c}$
Step 4: similar
Finer points and summary
$\{p_t^B\}$ and polynomials $L_t(p)$ are computed all at once
Only need to store equivalent sources at pairs of consecutive scales
Separation rank higher than that of the interpolative decomposition
Complexity is $O(\mathrm{polylog}(1/\varepsilon)\, N^2 \log N)$
Storage is $O(\mathrm{polylog}(1/\varepsilon)\, N^2)$ for ε-accurate computation
Easy extensions to varying amplitudes → $a(x,k)\, e^{2\pi i N \Phi(x,k)}$
Numerical results
Generalized Radon transform integrating f along ellipses
centered at x
and with axes of length $c_1(x)$ and $c_2(x)$
is the sum of two FIOs with phases
$\Phi_\pm(x,k) = x \cdot k \pm \sqrt{c_1^2(x)\, k_1^2 + c_2^2(x)\, k_2^2}$
First example (constant amplitude)
$u(x) = \sum_{k \in \Omega} e^{2\pi i N \Phi_+(x,k)}\, \hat{f}(k)$
with
$c_1(x) = \frac{1}{3}(2 + \sin(2\pi x_1)\sin(2\pi x_2))$, $c_2(x) = \frac{1}{3}(2 + \cos(2\pi x_1)\cos(2\pi x_2))$
(N, Chebyshev grid)   Alg. Time   Dir. Time   Speedup   Error
(1024, 5)             1.48e+3     9.44e+4     6.37e+1   1.26e-2
(2048, 5)             6.57e+3     1.53e+6     2.32e+2   1.75e-2
(4096, 5)             3.13e+4     2.43e+7     7.74e+2   1.75e-2
(1024, 7)             2.76e+3     9.48e+4     3.44e+1   6.45e-4
(2048, 7)             1.23e+4     1.46e+6     1.19e+2   8.39e-4
(4096, 7)             5.80e+4     2.31e+7     3.99e+2   8.18e-4
(1024, 9)             4.95e+3     9.44e+4     1.91e+1   3.45e-5
(2048, 9)             2.21e+4     1.48e+6     6.71e+1   4.01e-5
(4096, 9)             1.02e+5     2.23e+7     2.18e+2   4.21e-5
(1024, 11)            8.33e+3     9.50e+4     1.14e+1   5.23e-7
(2048, 11)            3.48e+4     1.49e+6     4.27e+1   5.26e-7
Conclusion: scales like $O(\log(1/\varepsilon)) \times O(N^2 \log N)$
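The stated scaling can be read off the table by comparing consecutive values of N. The short check below uses the three rows with Chebyshev grid size 5; the dictionary layout and the helper name `predicted_alg` are assumptions made for this illustration.

```python
import math

# (Alg. Time, Dir. Time) from the rows with Chebyshev grid size 5
rows = {1024: (1.48e3, 9.44e4), 2048: (6.57e3, 1.53e6), 4096: (3.13e4, 2.43e7)}

def predicted_alg(n_small, n_large):
    """Ratio of N^2 log N between two grid sizes."""
    return (n_large ** 2 * math.log(n_large)) / (n_small ** 2 * math.log(n_small))

ratios = []
for n in (1024, 2048):
    alg = rows[2 * n][0] / rows[n][0]         # measured growth of the fast algorithm
    direct = rows[2 * n][1] / rows[n][1]      # measured growth of direct summation
    ratios.append((n, alg, predicted_alg(n, 2 * n), direct, 2.0 ** 4))
    print(n, round(alg, 2), round(predicted_alg(n, 2 * n), 2), round(direct, 2))
```

Doubling N multiplies the algorithm's time by roughly 4.4 to 4.8, close to the N^2 log N prediction, while the direct time grows by about 16, as expected for O(N^4).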
Numerical results II
Second example (variable amplitude): exact integration
$u(x) = \sum_{k \in \Omega} a_+(x,k)\, e^{2\pi i N \Phi_+(x,k)}\, \hat{f}(k) + \sum_{k \in \Omega} a_-(x,k)\, e^{2\pi i N \Phi_-(x,k)}\, \hat{f}(k)$
$a_\pm(x,k) = \left( J_0(2\pi c(x)|k|) \pm i Y_0(2\pi c(x)|k|) \right) e^{\mp 2\pi i c(x)|k|}$
$\Phi_\pm(x,k) = x \cdot k \pm c(x)|k|$
$J_0$ and $Y_0$ are Bessel functions, and the spheres’ radii are
$c(x) = \frac{1}{4}(3 + \sin(2\pi x_1)\sin(2\pi x_2))$
(N, Chebyshev grid)   Alg. Time   Dir. Time   Speedup   Error
(256, 5)              1.39e+2     3.20e+3     2.31e+1   1.48e-2
(512, 5)              7.25e+2     5.20e+4     7.17e+1   1.62e-2
(1024, 5)             3.45e+3     8.34e+5     2.42e+2   1.90e-2
(256, 7)              2.69e+2     3.21e+3     1.19e+1   4.71e-4
(512, 7)              1.38e+3     5.20e+4     3.78e+1   7.30e-4
(1024, 7)             6.43e+3     8.35e+5     1.30e+2   6.35e-4
(256, 9)              5.23e+2     3.20e+3     6.12e+0   1.59e-5
(512, 9)              2.49e+3     5.17e+4     2.08e+1   2.97e-5
(1024, 9)             1.15e+4     8.32e+5     7.25e+1   1.75e-5
(256, 11)             1.04e+3     3.18e+3     3.06e+0   8.03e-7
(512, 11)             4.10e+3     5.11e+4     1.24e+1   9.38e-7
(1024, 11)            1.84e+4     8.38e+5     4.57e+1   8.01e-7
Conclusion: scales like $O(\log(1/\varepsilon)) \times O(N^2 \log N)$
Summary
Accurate and near-optimal numerical evaluation of FIOs
Operating characteristics
Butterfly structure
Residual phase
Oscillatory Chebyshev interpolation
Many applications/extensions
Other types of oscillatory kernels K(x, p)
Other types of problems: e.g. sparse Fourier transform (Ying, ’08)
Reference: E. J. Candès, L. Demanet and L. Ying (2008). “A Fast Butterfly Algorithm
for the Computation of Fourier Integral Operators,” to appear in Mult. Model. Sim