A fast and well-conditioned spectral method: The US method Alex Townsend

advertisement
A fast and well-conditioned spectral method:
The US method
Alex Townsend
University of Oxford
Leslie Fox Prize, 24th of June 2013
Partially based on: S. Olver & T., A fast and well-conditioned spectral method, to
appear in SIAM Review, (2013).
Alex Townsend, University of Oxford
24th of June 2013
0/26
The problem
Problem: Solve linear ordinary differential equations (ODEs)
aN (x)
dNu
du
+ a0 (x)u = f (x),
+ · · · + a1 (x)
dx N
dx
x ∈ [−1, 1]
with K linear boundary conditions (Dirichlet, Neumann,
Assumptions:
R1
−1
u = c, etc.)
a0 , . . . , aN , f are continuous with bounded variation.
The leading term aN (x) is non-zero on the interval.
The solution exists and is unique.
Main philosophy: Replace a0 , . . . , aN , f by polynomial approximations,
and solve for a polynomial approximation for u. The spectral method
finds the coefficients in a Chebyshev series in O(m2 n) operations:
u(x) ≈
n−1
X
αk Tk (x),
Tk (x) = cos(k cos−1 x).
k=0
Alex Townsend, University of Oxford
24th of June 2013
1/26
Overview
6
4
degree(u) = 51511
u(x)
2
0
−2
−4
−1
−0.5
0
x
0.5
1
1
0.5
y
Conventional wisdom
A comparison
First-order ODEs
Higher-order ODEs
Stability and convergence
Fast linear algebra
PDEs
Conclusion
0
−0.5
−1
Alex Townsend, University of Oxford
−1
−0.5
24th of June 2013
0
x
0.5
1
2/26
Quotes about spectral methods
“It is known that matrices generated by spectral methods can
be dense and ill-conditioned.” [Chen 2005]
“The idea behind spectral methods is to take this process to
the limit, at least in principle, and work with a differentiation
formula of infinite order and infinite bandwidth, i.e., a dense
matrix.” [Trefethen 2000]
“The best advice is still this: use pseudospectral (collocation)
methods instead of spectral (coefficient), and use Fourier series
and Chebyshev polynomials in preference to more exotic
functions.” [Boyd 2003]
The Ultraspherical spectral method (US method) can result in almost
banded, well-conditioned linear systems.
Alex Townsend, University of Oxford
24th of June 2013
3/26
Chebyshev spectral methods
u 00 (x) + cos(x)u(x) = 0,
Collocation
Coefficients
10
10
Condition
number
u(±1) = 0.
US method
10
4
5
3
10
5
10
10
0
0
10
10
0
50
100
150
200
0
50
100
150
200
2
0
50
100
150
200
Density
Matrix structure
Condition number
Problem sizes
Operation count
Collocation
Dense
O(n2N )
n ≤ 5,000
O(n3 )
Alex Townsend, University of Oxford
Coefficient
Banded from below
O(n2N )
n ≤ 5,000
O(mn2 )
24th of June 2013
US method
Banded + rank-K
O(n2(D−1) )
n ≤ 70,000
O(m2 n)
4/26
First-order differential equations
Differentiation
Operator
Multiplication
Operator
u′(x) + a(x)u(x) = f(x)
(D + SM[a]) u = Sf
Jacobi with w(x)=(1−x)α(1+x)β
−0.5
S : T → C(1)
n
en
tia
tio
er
s
io
n
L
on
v
0
(1)
C
T
C
β
M[a] : T → T
iff
er
0.5
D
1
Hypergeometric functions
D:T→C
1.5
(1)
Jacobi
Ultraspherical
−1
Hypergeometric functions
−1.5
−1.5
−1
−0.5
0
0.5
1
1.5
α
Alex Townsend, University of Oxford
24th of June 2013
5/26
First-order operators
Differentiation operator: D : T → C(1) , [DLMF]

0 1
(
(1)

dTk
kCk−1 , k ≥ 1,

=
D=
dx

0,
k = 0,

2


.

3
..
.
Conversion operator: S : T → C(1) , [DLMF]
Tk =

 (1)
(1)
1

C
−
C
2
k
k−2 , k ≥ 2,

1 (1)
2 C1 ,


 (1)
C0 ,


S=


k = 1,
k = 0,
1
0
1
2
− 12
0
1
2

− 12
0
..
.


.. 
.
.

..
.
Multiplication operator: M[a] : T → T, [DLMF]
Tj Tk =
1
T|j−k| + Tj+k ,
2
Alex Townsend, University of Oxford
j, k ≥ 0.
24th of June 2013
6/26
First-order multiplication operator
1
T|j−k|
2
Tj Tk =

2a0

 a1

1
a2
M[a] = 
2

 a3

..
.
|
1
Tj+k
2
+
a1
a2
a3
2a0
a1
a2
a1
2a0
a1
a2
..
.
a1
..
.
{z
2a0
..
.
Toeplitz


...
0

.. 
a1
.



..  1 
. + 
a2
2

.. 
a3
.


..
..
.
.
}
|
0
0
a2
a3
a3
a4
a4
.
..
a5
.
..
{z

...
.
a4 . . 

.
a5 . . 

.
.
a6 . 

. ..
..
.
}
0
Hankel + rank-1
Multiplication is not a dense operator in finite precision. It is m-banded:
a(x) =
∞
X
ak Tk (x) =
k=0
m
X
ãk Tk (x) + O(),
k=0
where ãk are aliased Chebyshev coefficients.
Alex Townsend, University of Oxford
24th of June 2013
7/26
Imposing boundary conditions
Dirichlet: u(−1) = c:
B = (1, −1, 1, −1, . . . , ) ,
Tk (−1) = (−1)k .
Neumann: u 0 (1) = c:
B = (0, 1, 4, 9, . . . , ) ,
Other conditions:
R1
−1
Tk (1) = k 2 .
u = c, u(0) = c, or u(0) + u 0 (π/6) = c.
Imposing boundary conditions by boundary bordering:
u 0 (x)+a(x)u(x) = f ,
B
c
=
.
D + SM[a]
f
Bu = c,
The finite section can be done exactly by projection Pn = (In , 0):
!
BPnT
T
Pn−1 DPnT + Pn−1 SPn+m
Pn+m M[a]PnT
Alex Townsend, University of Oxford
24th of June 2013
=
c
Pn−1 f
!
.
8/26
First-order examples
Example 1
u 0 (x) + x 3 u(x) = 100 sin(20,000x 2 ),
u(−1) = 0.
The exact solution is
x4
u(x) = e − 4
Z
x
100e
t4
4
sin(20,000t 2 )dt .
−1
1.2
N = ultraop(@(x,u) · · · );
N.lbc = 0; u = N \ f;
1
u(x)
0.8
degree(u) = 20391
Adaptively selects the
discretisation size.
Forms a chebfun object
[Chebfun V4.2].
0.6
0.4
time = 15.5s
0.2
0
−0.2
−1
−0.5
0
x
0.5
1
Alex Townsend, University of Oxford
kũ − uk∞ = 1.5 × 10−15 .
24th of June 2013
9/26
First-order examples
Example 2
u 0 (x) +
1
u(x) = 0,
1 + 50,000x 2
u(−1) = 1.
The exact solution with a = 50,000 is
√
√ tan−1 ( ax) + tan−1 ( a)
√
u(x) = exp −
.
a
−5
10
1.005
1
−10
Absolute error
u(x)
degree(u) = 5093
0.995
10
−15
10
0.99
0.985
−1
−20
−0.5
0
x
0.5
1
Alex Townsend, University of Oxford
10
Old chebop
Preconditioned collocation
US method
−1
−0.5
0
0.5
1
x
24th of June 2013
10/26
Higher-order linear ODEs
Higher-order diff operators: Dλ : T → C(λ) , [DLMF]

(λ)
dCk
dx
(
=
(λ+1)
2λCk−1 ,
0,
z
0
λ−1
Dλ = 2
(λ−1)! 


k ≥ 1,
k = 0,

λ times
}| {
··· 0


.


λ
λ+1
..
.
Higher-order conversion operators: Sλ : C(λ) → C(λ+1) , [DLMF]
(λ)
Ck

(λ+1)
(λ+1)
λ

− Ck−2
,

 λ+k Ck
(λ+1)
λ
=
C
,
λ+1 1


 (λ+1)
C0
,
k ≥ 2,
k = 1,

1

Sλ = 


λ
− λ+2
λ
λ+1
..
k = 0,
λ
− λ+3
..
.
α
.

.

β
Jacobi with w(x)=(1−x) (1+x)
2.5
Multiplication operators:
M1 [a] = Toeplitz + Hankel.
er
en
t
0
T
−0.5
−0.5
io
n
iff
C(1)
0.5
on
ve
rs
,
C(2)
C
→C
1
D
β
M1 [a] : C
(1)
ia
1.5
Mλ [a] : Cλ → Cλ ,
(1)
tio
n
2
L
Jacobi
Ultraspherical
0
0.5
1
1.5
2
2.5
α
Alex Townsend, University of Oxford
24th of June 2013
11/26
Construction of ultraspherical multiplication operators
Mλ [a] : C(λ) → C(λ) represents multiplication with a(x) in C (λ) . If
a(x) =
m
X
(λ)
aj Cj
(x),
j=0
then it can be shown that Mλ [a] is m-banded with
k
X
(Mλ [a])j,k =
a2s+j−k csλ (k, 2s + j − k),
j, k ≥ 0,
s=max(0,k−j)
where (to prevent overflow issues) we have
csλ (j, k) =
×
j−s−1
s−1
Y λ+t
j + k + λ − 2s Y λ + t
×
×
j +k +λ−s
1+t
1+t
t=0
t=0
s−1
Y
t=0
j−s−1
Y k −s +1+t
2λ + j + k − 2s + t
×
.
λ + j + k − 2s + t
k −s +λ+t
t=0
For efficiency, we use a recurence relation to generate csλ (j, k).
Alex Townsend, University of Oxford
24th of June 2013
12/26
A high-order example
aN (x)
dNu
d N−1 u
+ aN−1 (x) N−1 + · · · + a0 (x)u = f (x),
N
dx
dx
Bu = c,
B
MN [aN ]DN
+ SN−1 MN−1
[aN−1 ]D
N−1
+ · · · + SN−1 · · · S0 M0
[a0 ]
=
c
SN−1 · · · S0 f
Example 3: High-order ODE
u (10) (x) + cosh(x)u (8) (x) + cos(x)u (2) (x) + x 2 u(x) = 0
u(±1) = 0, u 0 (±1) = 1, u (k) (±1) = 0, k = 2, 3, 4.
0
0.5
10
0.4
−5
0.3
10
−u ||
0
−0.1
Z
−10
10
10
n+1
u(x)
0.1
||u
n 2
0.2
(ũ(x) + ũ(−x))
−15
2
21
−1
−0.2
= 1.3 × 10−14 .
−20
−0.3
1
10
−0.4
−0.5
−1
−25
−0.5
0
0.5
1
10
20
40
60
x
Alex Townsend, University of Oxford
80
100
120
140
n
24th of June 2013
13/26
Airy equation and regularity preserving
Example 4
Consider Airy’s equation for > 0,
p u(−1) = Ai − 3 1/ ,
u 00 (x) − xu(x) = 0,
u(1) = Ai
p 3
1/ .
Relative error in derivative at x=1/2
5
10
10
Ai(1)
10
= 10−9
Ai(20)
10
6
Relative error
Condition Number
Ai(100)
0
8
10
= 10−4
10
4
10
−5
10
−10
10
2
10
=1
−15
10
0
10
0.5
1
1.5
n
2
2.5
3
0
50
4
x 10
100
n
150
200
The US method can accurately compute high derivatives of functions.
For a different approach, see [Bornemann 2011].
Alex Townsend, University of Oxford
24th of June 2013
14/26
Stability when K = N
Assumption: Number of boundary conditions = Differential order.
Definition: Higher-order `2λ -norms
For λ ∈ N, `2λ ⊂ C∞ is the Banach space with norm
kuk2`2 =
∞
X
λ
|uk |2 (1 + k)2λ < ∞.
k=0
Right diagonal preconditioner:

R=
1
2N−1 (N



− 1)! 
Alex Townsend, University of Oxford

IN
1
N
1
N+1
24th of June 2013
..
.


.

15/26
Stability when K = N
Recall the notation:
Solve Lu
=
f,
with Bu
=
c,
B : `2D → CK , bounded,
D ≥ 1.
Theorem: Condition number is bounded with n
B
Let
: `2λ+1 → `2λ be an invertible operator for some λ ≥ D − 1.
L
Then, as n → ∞,
kAn Rn k`2λ = O(1),
k(An Rn )−1 k`2λ = O(1),
where
Rn =
PnT RPn ,
An =
PnT
B
Pn .
L
Proof uses integral reformulation ideas: “R integrates the ODE to
form a Fredholm operator”, i.e.,
B
R = I + K,
K = compact.
L
Alex Townsend, University of Oxford
24th of June 2013
16/26
Convergence with K = N
Theorem: Computed solution converges in high-order norms
B
2
Suppose f ∈ `λ−N+1 for some λ ≥ D − 1, and that
: `2λ+1 → `2λ is
L
an invertible operator. Define
c
−1
un = An Pn
,
SN−1 · · · S0 f
then
ku − Pn> un k`2λ+1 ≤ C ku − Pn> Pn uk`2λ+1 .
Relative error in derivative at x=1/2
5
10
Ai(1)
Ai(20)
You don’t always need the
complex plane to compute
high derivatives.
Ai(100)
0
10
Relative error
The constant, C , does not
depend on n.
−5
10
−10
10
−15
10
0
Alex Townsend, University of Oxford
50
24th of June 2013
100
n
150
200
17/26
Fast linear algebra: The QTP∗ factorisation
Original matrix:
After left Givens:
After right Givens:
Apply a partial factorisation on the left, followed by a partial
factorisation on the right.
Factorisation: An = QTP ∗ in O(m2 n) operations.
No fill-in: The matrix T contains no more non-zeros than An .
For solving An x = b: The matrix Q can be immediately applied to b.
The matrix P ∗ can be stored using Demmel’s trick [Demmel 1997].
UMFPACK by Davis is faster for n ≤ 50000, but has observed
complexity of, roughly, O(n1.25 ) with a very small constant.
Alex Townsend, University of Oxford
24th of June 2013
18/26
Constant coefficient PDEs
Consider the constant coefficient PDE
N N−j
X
X
j=0 i=0
aij
∂ i+j u
= f (x, y ),
∂ i x∂ j y
aij ∈ R
defined on [a, b] × [c, d] with linear boundary conditions satisfying
continuity conditions.
We discretise as a generalised Sylvester matrix equation
k
X
σj Aj XBjT = F ,
Aj ∈ Rn1 ×n1 ,
Bj ∈ Rn2 ×n2 ,
j=1
where Aj and Bj are US discretisations, for example,
!
Aj =
Alex Townsend, University of Oxford
24th of June 2013
19/26
Determining the rank of a PDE operator
Given a PDE with constant coefficients of the form,
L=
N N−j
X
X
aij
j=0 i=0
∂ i+j
,
∂ i x∂ j y
A = (aij ) .
We can rewrite this as

∂ 0 /∂x 0


..
D(x) = 
.
.

L = D(y )T AD(x),
∂ N /∂x N
Hence, if A = UΣV T is the truncated SVD of A with Σ ∈ Rk×k then
L = D(y )T UΣV T D(x) = D(y )T U Σ V T D(x) .
This low rank representation for L tells us how to discretize and solve the
PDE.
Alex Townsend, University of Oxford
24th of June 2013
20/26
Matrix equation solvers
Rank-1: A1 XB1T = F . Solve A1 Y = F , then B1 X T = Y T .
Rank-2: A1 XB1T + A2 XB2T = F . Generalised Sylvester solver
[Jonsson & Kågström, 1992].
Rank-k, k ≥ 3: Solve n1 n2 × n1 n2 system with QTP ∗ factorization.
Boundary conditions are
imposed by carefully
O(n13 + n23 )
O(n13 n2 )
removing degrees of
freedom in the matrix
equation.
O(n1 n2 )
n1 & n2
Alex Townsend, University of Oxford
Automatically forms and
solves subproblems, if
possible.
Corner singularities can be
partly resolved by domain
subdivision.
24th of June 2013
21/26
Examples
Example 5: Helmholtz equation
uxx + uyy + 2ω 2 u = 0,
u(±1, y ) = f (±1, y ),
u(x, ±1) = f (x, ±1),
where f (x, y ) = cos(ωx) cos(ωy ).
ω = 50
N = chebop2(@(u) ... );
N.lbc=f(-1,:); u=N\0;
1
y
0.5
Adaptively selects the
discretisation size.
0
Forms a chebfun2 object
[T. & Trefethen 2013].
−0.5
−1
−1
−0.5
0
x
0.5
1
For ω = 2000,
kũ − uk2 = 2.43 × 10−9 .
For ω = 50 solve takes 0.27s for n1 = n2 = 126,
for ω = 2000 solve takes 32.1s for n1 = n2 = 2124.
Alex Townsend, University of Oxford
24th of June 2013
22/26
Conclusions
The US method results in almost banded well-conditioned matrices.
It requires O(m2 n) operations to solve an ODE.
It can be used as a preconditioner for the collocation method.
The US method converges and is numerically stable.
It can be extended to a spectral method for solving PDEs on
rectangular domains.
Alex Townsend, University of Oxford
24th of June 2013
23/26
Historical example [Orszag 1971]
Orr-Sommerfeld equation
1
(u 0000 − 2u 00 + u) − 2iu − i(1 − x 2 )(u 00 − u) = λ(u 00 − u),
5772.2
u(±1) = u 0 (±1) = 0.
Eigenvalues of the Orr−Sommerfeld operator
−0.3
−0.4
imag
−0.5
−0.6
−0.7
−0.8
−0.9
−1
−1
−0.8
−0.6
−0.4
−0.2
0
real
Alex Townsend, University of Oxford
24th of June 2013
24/26
The end
Thank you
More information:
S. Olver & T., A fast and well-conditioned spectral method, to appear in
SIAM Review, (2013).
Alex Townsend, University of Oxford
24th of June 2013
25/26
References
F. Bornemann, Accuracy and stability of computing high-order derivatives of
analytic functions by Cauchy integrals, Found. Comput. Math., 11 (2011),
pp. 1–63.
I. Jonsson & Kågström, Recursive blocked algorithms for solving triangular
matrix equations — Part II: Two-sided and generalized Sylvester and Lyapunov
equations, ACM Trans. Math. Softw., 28 (2002), pp. 416–435.
S. Olver & T., A fast and well-conditioned spectral method, to appear in SIAM
Review, (2013).
S. Orszag, Accurate solution of the Orr-Sommerfeld stability equation, J. Fluid
Mech., 50 (1971), pp. 689–703.
T. & L. N. Trefethen, An extension of Chebfun to two dimensions, likely to
appear in SISC, (2013).
L. N. Trefethen et al., The Chebfun software system, version 4.2 , 2011.
Alex Townsend, University of Oxford
24th of June 2013
26/26
Chebyshev polynomials
An underappreciated fact: Every Chebyshev series can be rewritten as
a palindromic Laurent series.
The degree k Chebyshev polynomial (of the first kind) is defined by
1 k
Tk (x) = cos k cos−1 x =
z + z −k ,
2
x ∈ [−1, 1],
where z = e ikθ and cos θ = x.
T25(θ) = cos(25θ)
T25(x)
1
1
0.5
0
0
−0.5
−1
−1
−0.5
0
x
0.5
1
−1
0
1.5708
θ
3.1416
Every Chebyshev series can be written as a palindromic Laurent series:
N
X
k=0
N
αk Tk (x) =
N
1X
1X
αk z −k + 1α0 +
αk z k ,
2
2
k=1
Alex Townsend, University of Oxford
z = e ikθ .
k=1
24th of June 2013
27/26
Demmel’s trick for storing Givens rotations
A Givens rotation zeros out one entry of a matrix and we can store the
rotation information there.

×
×

×
×
×
×
×
×
×
×
×

×
×

×
×
Givens
z}|{
−→

×
×

×
×
×
×
p
×
×
×
×
×

×
×

×
×
Let s = sin θ and c = cos θ, where θ is rotation angle, then
(
s · sign(c), |s| < |c|,
p = sign(s)
,
|s| ≥ |c|.
c
To restore
(up to a 180 degrees): If |p| < √
1, then s = p and
√
c = 1 − s 2 , otherwise c = 1/p and s = 1 − c 2 .
Alex Townsend, University of Oxford
24th of June 2013
28/26
Download