A fast and well-conditioned spectral method: The US method
Alex Townsend, University of Oxford
Leslie Fox Prize, 24th of June 2013
Partially based on: S. Olver & T., A fast and well-conditioned spectral method, to appear in SIAM Review, (2013).

The problem

Problem: Solve linear ordinary differential equations (ODEs)
\[ a_N(x)\frac{d^N u}{dx^N} + \cdots + a_1(x)\frac{du}{dx} + a_0(x)u = f(x), \qquad x \in [-1, 1], \]
with $K$ linear boundary conditions (Dirichlet, Neumann, $\int_{-1}^{1} u = c$, etc.).

Assumptions:
  $a_0, \ldots, a_N, f$ are continuous with bounded variation.
  The leading term $a_N(x)$ is non-zero on the interval.
  The solution exists and is unique.

Main philosophy: Replace $a_0, \ldots, a_N, f$ by polynomial approximations, and solve for a polynomial approximation for $u$.

The spectral method finds the coefficients in a Chebyshev series in $O(m^2 n)$ operations:
\[ u(x) \approx \sum_{k=0}^{n-1} \alpha_k T_k(x), \qquad T_k(x) = \cos(k \cos^{-1} x). \]

Overview

  Conventional wisdom
  A comparison
  First-order ODEs
  Higher-order ODEs
  Stability and convergence
  Fast linear algebra
  PDEs
  Conclusion

[Figure: a solution $u(x)$ on $[-1, 1]$ with degree(u) = 51511, and a second plot on $[-1, 1] \times [-1, 1]$ with axes $x$ and $y$.]

Quotes about spectral methods

“It is known that matrices generated by spectral methods can be dense and ill-conditioned.” [Chen 2005]

“The idea behind spectral methods is to take this process to the limit, at least in principle, and work with a differentiation formula of infinite order and infinite bandwidth, i.e., a dense matrix.” [Trefethen 2000]

“The best advice is still this: use pseudospectral (collocation) methods instead of spectral (coefficient), and use Fourier series and Chebyshev polynomials in preference to more exotic functions.” [Boyd 2003]

The ultraspherical spectral method (US method) can result in almost banded, well-conditioned linear systems.
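The Chebyshev series $u(x) \approx \sum_k \alpha_k T_k(x)$ from the problem statement can be evaluated either from the cosine definition of $T_k$ or with Clenshaw's recurrence. The sketch below is a minimal Python illustration, not code from the talk; the function names and the example coefficients are hypothetical.

```python
import math


def cheb_eval_direct(alpha, x):
    """Evaluate sum_k alpha[k] * T_k(x) using T_k(x) = cos(k * arccos(x))."""
    theta = math.acos(x)
    return sum(a * math.cos(k * theta) for k, a in enumerate(alpha))


def cheb_eval_clenshaw(alpha, x):
    """Evaluate the same series with Clenshaw's recurrence (no trig calls)."""
    b1 = b2 = 0.0
    for a in reversed(alpha[1:]):
        b1, b2 = a + 2.0 * x * b1 - b2, b1
    return alpha[0] + x * b1 - b2


alpha = [1.0, -0.5, 0.25, 0.125]   # arbitrary example coefficients (assumption)
x = 0.3
print(cheb_eval_direct(alpha, x), cheb_eval_clenshaw(alpha, x))
```

Both routines assume a non-empty coefficient list and $x \in [-1, 1]$; the recurrence form is the one usually preferred for long series.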
Chebyshev spectral methods

\[ u''(x) + \cos(x)u(x) = 0, \qquad u(\pm 1) = 0. \]

[Figure: three panels titled Collocation, Coefficients and US method, for discretisation sizes 0 to 200; labels: Condition number, Density.]

                    Collocation   Coefficient         US method
Matrix structure    Dense         Banded from below   Banded + rank-K
Condition number    O(n^{2N})     O(n^{2N})           O(n^{2(D-1)})
Problem sizes       n ≤ 5,000     n ≤ 5,000           n ≤ 70,000
Operation count     O(n^3)        O(mn^2)             O(m^2 n)

First-order differential equations

\[ u'(x) + a(x)u(x) = f(x) \qquad\longrightarrow\qquad (\mathcal{D} + \mathcal{S}\mathcal{M}[a])\,\mathbf{u} = \mathcal{S}\mathbf{f} \]

Differentiation operator: $\mathcal{D} : T \to C^{(1)}$
Conversion operator: $\mathcal{S} : T \to C^{(1)}$
Multiplication operator: $\mathcal{M}[a] : T \to T$

[Figure: Jacobi parameter plane $(\alpha, \beta)$ for the weight $w(x) = (1-x)^\alpha(1+x)^\beta$, with points labelled T, L and $C^{(1)}$, regions labelled Jacobi, Ultraspherical and Hypergeometric functions, and arrows labelled Differentiation and Conversion.]

First-order operators

Differentiation operator: $\mathcal{D} : T \to C^{(1)}$, [DLMF]
\[ \frac{dT_k}{dx} = \begin{cases} k\,C^{(1)}_{k-1}, & k \ge 1, \\ 0, & k = 0, \end{cases} \qquad \mathcal{D} = \begin{pmatrix} 0 & 1 & & & \\ & & 2 & & \\ & & & 3 & \\ & & & & \ddots \end{pmatrix}. \]

Conversion operator: $\mathcal{S} : T \to C^{(1)}$, [DLMF]
\[ T_k = \begin{cases} \tfrac12\left(C^{(1)}_k - C^{(1)}_{k-2}\right), & k \ge 2, \\ \tfrac12 C^{(1)}_1, & k = 1, \\ C^{(1)}_0, & k = 0, \end{cases} \qquad \mathcal{S} = \begin{pmatrix} 1 & 0 & -\tfrac12 & & \\ & \tfrac12 & 0 & -\tfrac12 & \\ & & \tfrac12 & 0 & \ddots \\ & & & \ddots & \ddots \end{pmatrix}. \]

Multiplication operator: $\mathcal{M}[a] : T \to T$, [DLMF]
\[ T_j T_k = \tfrac12\left(T_{|j-k|} + T_{j+k}\right), \qquad j, k \ge 0. \]

First-order multiplication operator

\[ T_j T_k = \tfrac12 T_{|j-k|} + \tfrac12 T_{j+k} \]
\[ \mathcal{M}[a] = \frac12 \underbrace{\begin{pmatrix} 2a_0 & a_1 & a_2 & a_3 & \cdots \\ a_1 & 2a_0 & a_1 & a_2 & \ddots \\ a_2 & a_1 & 2a_0 & a_1 & \ddots \\ a_3 & a_2 & a_1 & 2a_0 & \ddots \\ \vdots & \ddots & \ddots & \ddots & \ddots \end{pmatrix}}_{\text{Toeplitz}} + \frac12 \underbrace{\begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ a_1 & a_2 & a_3 & a_4 & \cdots \\ a_2 & a_3 & a_4 & a_5 & \cdots \\ a_3 & a_4 & a_5 & a_6 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}}_{\text{Hankel + rank-1}} \]

Multiplication is not a dense operator in finite precision. It is $m$-banded:
\[ a(x) = \sum_{k=0}^{\infty} a_k T_k(x) = \sum_{k=0}^{m} \tilde{a}_k T_k(x) + O(\epsilon), \]
where $\tilde{a}_k$ are aliased Chebyshev coefficients.
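The Toeplitz plus Hankel structure of $\mathcal{M}[a]$ can be checked numerically. The following is a minimal Python sketch, not taken from the talk: `mult_matrix` builds an $n \times n$ section directly from the two matrices above, and the coefficient vectors `a` and `u` are arbitrary, hypothetical test data.

```python
import math


def mult_matrix(a, n):
    """n x n section of M[a] = (Toeplitz + Hankel-type part) / 2, a = sum_j a[j] T_j."""
    def coef(j):
        return a[j] if j < len(a) else 0.0

    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            toeplitz = 2.0 * coef(0) if i == k else coef(abs(i - k))
            hankel = coef(i + k) if i >= 1 else 0.0   # first row of this part is zero
            M[i][k] = 0.5 * (toeplitz + hankel)
    return M


def cheb_eval(alpha, x):
    """Evaluate sum_k alpha[k] T_k(x) from the cosine definition."""
    theta = math.acos(x)
    return sum(c * math.cos(k * theta) for k, c in enumerate(alpha))


a = [1.0, 0.5, 0.25]            # a(x) = T_0 + T_1/2 + T_2/4 (hypothetical)
u = [0.3, -1.0, 0.0, 2.0]       # u(x), degree 3 (hypothetical)
n = len(a) + len(u) - 1         # enough coefficients to hold a(x) u(x) exactly
u_padded = u + [0.0] * (n - len(u))
M = mult_matrix(a, n)
au = [sum(M[i][k] * u_padded[k] for k in range(n)) for i in range(n)]

x = 0.37
print(cheb_eval(au, x), cheb_eval(a, x) * cheb_eval(u, x))
```

The two printed values should agree to rounding error, since the product of a degree-2 and a degree-3 series is represented exactly by six Chebyshev coefficients.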
Imposing boundary conditions

Dirichlet: $u(-1) = c$: $\mathcal{B} = (1, -1, 1, -1, \ldots)$, since $T_k(-1) = (-1)^k$.
Neumann: $u'(1) = c$: $\mathcal{B} = (0, 1, 4, 9, \ldots)$, since $T_k'(1) = k^2$.
Other conditions: $\int_{-1}^{1} u = c$, $u(0) = c$, or $u(0) + u'(\pi/6) = c$.

Imposing boundary conditions by boundary bordering:
\[ u'(x) + a(x)u(x) = f, \quad \mathcal{B}u = c, \qquad \begin{pmatrix} \mathcal{B} \\ \mathcal{D} + \mathcal{S}\mathcal{M}[a] \end{pmatrix} \mathbf{u} = \begin{pmatrix} c \\ \mathcal{S}\mathbf{f} \end{pmatrix}. \]

The finite section can be done exactly by projection $P_n = (I_n, 0)$:
\[ \begin{pmatrix} \mathcal{B}P_n^T \\ P_{n-1}\mathcal{D}P_n^T + P_{n-1}\mathcal{S}P_{n+m}^T P_{n+m}\mathcal{M}[a]P_n^T \end{pmatrix} P_n\mathbf{u} = \begin{pmatrix} c \\ P_{n-1}\mathcal{S}\mathbf{f} \end{pmatrix}. \]

First-order examples

Example 1
\[ u'(x) + x^3 u(x) = 100 \sin(20{,}000x^2), \qquad u(-1) = 0. \]
The exact solution is
\[ u(x) = e^{-x^4/4} \int_{-1}^{x} 100\, e^{t^4/4} \sin(20{,}000 t^2)\, dt. \]

  N = ultraop(@(x,u) ··· );
  N.lbc = 0; u = N \ f;

Adaptively selects the discretisation size.
Forms a chebfun object [Chebfun V4.2].

[Figure: computed solution $u(x)$ on $[-1, 1]$; degree(u) = 20391, time = 15.5s.]
\[ \|\tilde{u} - u\|_\infty = 1.5 \times 10^{-15}. \]

Example 2
\[ u'(x) + \frac{1}{1 + 50{,}000x^2}\, u(x) = 0, \qquad u(-1) = 1. \]
The exact solution with $a = 50{,}000$ is
\[ u(x) = \exp\left( -\frac{\tan^{-1}(\sqrt{a}\,x) + \tan^{-1}(\sqrt{a})}{\sqrt{a}} \right). \]

[Figure: left, computed solution $u(x)$ with degree(u) = 5093; right, absolute error against $x$ (log scale) for Old chebop, Preconditioned collocation and US method.]

Higher-order linear ODEs

Higher-order differentiation operators: $\mathcal{D}_\lambda : T \to C^{(\lambda)}$, [DLMF]
\[ \frac{dC^{(\lambda)}_k}{dx} = \begin{cases} 2\lambda\, C^{(\lambda+1)}_{k-1}, & k \ge 1, \\ 0, & k = 0, \end{cases} \qquad \mathcal{D}_\lambda = 2^{\lambda-1}(\lambda-1)! \begin{pmatrix} \overbrace{0 \;\cdots\; 0}^{\lambda \text{ times}} & \lambda & & \\ & & \lambda+1 & \\ & & & \ddots \end{pmatrix}. \]

Higher-order conversion operators: $\mathcal{S}_\lambda : C^{(\lambda)} \to C^{(\lambda+1)}$, [DLMF]
\[ C^{(\lambda)}_k = \begin{cases} \frac{\lambda}{\lambda+k}\left(C^{(\lambda+1)}_k - C^{(\lambda+1)}_{k-2}\right), & k \ge 2, \\ \frac{\lambda}{\lambda+1} C^{(\lambda+1)}_1, & k = 1, \\ C^{(\lambda+1)}_0, & k = 0, \end{cases} \qquad \mathcal{S}_\lambda = \begin{pmatrix} 1 & 0 & -\frac{\lambda}{\lambda+2} & & \\ & \frac{\lambda}{\lambda+1} & 0 & -\frac{\lambda}{\lambda+3} & \\ & & \ddots & \ddots & \ddots \end{pmatrix}. \]

Multiplication operators: $\mathcal{M}_1[a]$ = Toeplitz + Hankel.
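The first-order discretisation with boundary bordering described above can be sketched in a few lines. This is an illustrative Python version under stated assumptions, not the Chebfun implementation: it uses dense lists and plain Gaussian elimination, so it does not achieve the $O(m^2 n)$ cost, and the test problem $u' + xu = 0$, $u(-1) = 1$ (exact solution $e^{(1-x^2)/2}$) is chosen here only because it has a closed form. All function names are hypothetical.

```python
import math


def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (kept simple on purpose)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            fac = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= fac * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def diff_op(N):
    """D : T -> C^(1), from dT_k/dx = k C^(1)_{k-1}."""
    return [[float(k) if k == i + 1 else 0.0 for k in range(N)] for i in range(N)]


def conv_op(N):
    """S : T -> C^(1), from T_k = (C^(1)_k - C^(1)_{k-2}) / 2."""
    S = [[0.0] * N for _ in range(N)]
    for k in range(N):
        S[k][k] = 1.0 if k == 0 else 0.5
        if k >= 2:
            S[k - 2][k] = -0.5
    return S


def mult_op(a, N):
    """M[a] : T -> T as (Toeplitz + Hankel-type part) / 2."""
    def coef(j):
        return a[j] if j < len(a) else 0.0
    return [[0.5 * ((2.0 * coef(0) if i == k else coef(abs(i - k)))
                    + (coef(i + k) if i >= 1 else 0.0))
             for k in range(N)] for i in range(N)]


def us_solve_first_order(a, f, c, n):
    """Solve u' + a u = f, u(-1) = c for n Chebyshev coefficients of u."""
    N = n + len(a)                       # working size: the n-1 kept rows are exact
    D, S = diff_op(N), conv_op(N)
    SM = matmul(S, mult_op(a, N))
    f_pad = f + [0.0] * (N - len(f))
    Sf = [sum(S[i][k] * f_pad[k] for k in range(N)) for i in range(N)]
    A = [[(-1.0) ** k for k in range(n)]]                # Dirichlet row: T_k(-1)
    A += [[D[i][k] + SM[i][k] for k in range(n)] for i in range(n - 1)]
    return solve(A, [c] + Sf[:n - 1])


# Illustrative problem (assumption, not from the slides): u' + x u = 0, u(-1) = 1.
coeffs = us_solve_first_order(a=[0.0, 1.0], f=[0.0], c=1.0, n=30)
x = 0.3
u_x = sum(ck * math.cos(k * math.acos(x)) for k, ck in enumerate(coeffs))
print(u_x, math.exp((1 - x * x) / 2))
```

A production solver would store the operators in banded form and replace `solve` with a structured factorisation; the dense version is only meant to make the block layout of the bordered system visible.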
$\mathcal{M}_\lambda[a] : C^{(\lambda)} \to C^{(\lambda)}$, $\mathcal{M}_1[a] : C^{(1)} \to C^{(1)}$.

[Figure: Jacobi parameter plane $(\alpha, \beta)$ for the weight $w(x) = (1-x)^\alpha(1+x)^\beta$, with points labelled T, L, $C^{(1)}$ and $C^{(2)}$, the Jacobi and Ultraspherical families marked, and arrows labelled Differentiation and Conversion.]

Construction of ultraspherical multiplication operators

$\mathcal{M}_\lambda[a] : C^{(\lambda)} \to C^{(\lambda)}$ represents multiplication with $a(x)$ in $C^{(\lambda)}$. If
\[ a(x) = \sum_{j=0}^{m} a_j C^{(\lambda)}_j(x), \]
then it can be shown that $\mathcal{M}_\lambda[a]$ is $m$-banded with
\[ (\mathcal{M}_\lambda[a])_{j,k} = \sum_{s=\max(0,k-j)}^{k} a_{2s+j-k}\, c^\lambda_s(k, 2s+j-k), \qquad j, k \ge 0, \]
where (to prevent overflow issues) we have
\[ c^\lambda_s(j,k) = \frac{j+k+\lambda-2s}{j+k+\lambda-s} \times \prod_{t=0}^{s-1} \frac{\lambda+t}{1+t} \times \prod_{t=0}^{j-s-1} \frac{\lambda+t}{1+t} \times \prod_{t=0}^{s-1} \frac{2\lambda+j+k-2s+t}{\lambda+j+k-2s+t} \times \prod_{t=0}^{j-s-1} \frac{k-s+1+t}{k-s+\lambda+t}. \]
For efficiency, we use a recurrence relation to generate $c^\lambda_s(j,k)$.

A high-order example

\[ a_N(x)\frac{d^N u}{dx^N} + a_{N-1}(x)\frac{d^{N-1} u}{dx^{N-1}} + \cdots + a_0(x)u = f(x), \qquad \mathcal{B}u = c, \]
\[ \begin{pmatrix} \mathcal{B} \\ \mathcal{M}_N[a_N]\mathcal{D}_N + \mathcal{S}_{N-1}\mathcal{M}_{N-1}[a_{N-1}]\mathcal{D}_{N-1} + \cdots + \mathcal{S}_{N-1}\cdots\mathcal{S}_0\mathcal{M}_0[a_0] \end{pmatrix} \mathbf{u} = \begin{pmatrix} c \\ \mathcal{S}_{N-1}\cdots\mathcal{S}_0\mathbf{f} \end{pmatrix}. \]

Example 3: High-order ODE
\[ u^{(10)}(x) + \cosh(x)u^{(8)}(x) + \cos(x)u^{(2)}(x) + x^2 u(x) = 0, \]
\[ u(\pm 1) = 0, \quad u'(\pm 1) = 1, \quad u^{(k)}(\pm 1) = 0, \quad k = 2, 3, 4. \]

[Figure: left, computed solution $u(x)$ on $[-1, 1]$; right, $\|u_{n+1} - u_n\|_2$ against $n$ (log scale, $n$ up to 140).]

The computed solution is odd to high accuracy:
\[ \left( \int_{-1}^{1} \left(\tilde{u}(x) + \tilde{u}(-x)\right)^2 dx \right)^{1/2} = 1.3 \times 10^{-14}. \]

Airy equation and regularity preserving

Example 4
Consider Airy's equation for $\epsilon > 0$,
\[ \epsilon u''(x) - xu(x) = 0, \qquad u(-1) = \mathrm{Ai}\left(-\sqrt[3]{1/\epsilon}\right), \qquad u(1) = \mathrm{Ai}\left(\sqrt[3]{1/\epsilon}\right). \]

[Figure: left, condition number against $n$ (up to about $3 \times 10^4$) for $\epsilon = 1$, $10^{-4}$, $10^{-9}$; right, "Relative error in derivative at x=1/2", relative error against $n$ (0 to 200) for curves labelled Ai(1), Ai(20), Ai(100).]

The US method can accurately compute high derivatives of functions.
For a different approach, see [Bornemann 2011].
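The second-order case of the block system above, $\mathcal{D}_2 + \mathcal{S}_1\mathcal{S}_0\mathcal{M}_0[a_0]$ bordered by two boundary rows, can be sketched as follows. This is a hedged Python illustration with hypothetical function names and dense linear algebra. To keep the check self-contained it does not solve the Airy example; it uses the stand-in problem $u'' + (1 - x^2)u = 0$, $u(\pm 1) = e^{-1/2}$, whose exact solution is $e^{-x^2/2}$ (an assumption made here for testing, not an example from the talk).

```python
import math


def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (kept simple on purpose)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            fac = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= fac * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def diff_op(lam, N):
    """D_lam : T -> C^(lam); entry (k - lam, k) is 2^(lam-1) (lam-1)! k."""
    scale = 2.0 ** (lam - 1) * math.factorial(lam - 1)
    return [[scale * k if k == i + lam else 0.0 for k in range(N)] for i in range(N)]


def conv_op(lam, N):
    """S_lam : C^(lam) -> C^(lam+1); lam = 0 is the Chebyshev T -> C^(1) case."""
    S = [[0.0] * N for _ in range(N)]
    for k in range(N):
        w = (0.5 if k else 1.0) if lam == 0 else lam / (lam + k)
        S[k][k] = w
        if k >= 2:
            S[k - 2][k] = -w
    return S


def mult_op(a, N):
    """M_0[a] : T -> T as (Toeplitz + Hankel-type part) / 2."""
    def coef(j):
        return a[j] if j < len(a) else 0.0
    return [[0.5 * ((2.0 * coef(0) if i == k else coef(abs(i - k)))
                    + (coef(i + k) if i >= 1 else 0.0))
             for k in range(N)] for i in range(N)]


def us_solve_second_order(a0, bc_left, bc_right, n):
    """Solve u'' + a0 u = 0 with u(-1) = bc_left, u(1) = bc_right."""
    N = n + len(a0) + 1                  # working size: the n-2 kept rows are exact
    L0 = matmul(conv_op(1, N), matmul(conv_op(0, N), mult_op(a0, N)))
    D2 = diff_op(2, N)
    A = [[(-1.0) ** k for k in range(n)], [1.0] * n]     # T_k(-1) and T_k(1)
    A += [[D2[i][k] + L0[i][k] for k in range(n)] for i in range(n - 2)]
    return solve(A, [bc_left, bc_right] + [0.0] * (n - 2))


# Stand-in problem: a0(x) = 1 - x^2 = T_0/2 - T_2/2, exact u = exp(-x^2/2).
bc = math.exp(-0.5)
coeffs = us_solve_second_order([0.5, 0.0, -0.5], bc, bc, n=30)
x = 0.3
u_x = sum(ck * math.cos(k * math.acos(x)) for k, ck in enumerate(coeffs))
print(u_x, math.exp(-x * x / 2))
```

Variable coefficients on the higher derivatives would need $\mathcal{M}_\lambda[a]$ for $\lambda \ge 1$, built from the $c^\lambda_s(j,k)$ formula; that part is deliberately left out of this sketch.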
Stability when K = N

Assumption: Number of boundary conditions = Differential order.

Definition: Higher-order $\ell^2_\lambda$-norms
For $\lambda \in \mathbb{N}$, $\ell^2_\lambda \subset \mathbb{C}^\infty$ is the Banach space with norm
\[ \|u\|^2_{\ell^2_\lambda} = \sum_{k=0}^{\infty} |u_k|^2 (1+k)^{2\lambda} < \infty. \]

Right diagonal preconditioner:
\[ \mathcal{R} = \frac{1}{2^{N-1}(N-1)!} \begin{pmatrix} I_N & & & \\ & \frac{1}{N} & & \\ & & \frac{1}{N+1} & \\ & & & \ddots \end{pmatrix}. \]

Stability when K = N

Recall the notation: Solve $\mathcal{L}u = f$, with $\mathcal{B}u = c$, $\mathcal{B} : \ell^2_D \to \mathbb{C}^K$, bounded, $D \ge 1$.

Theorem: Condition number is bounded with n
Let $\begin{pmatrix} \mathcal{B} \\ \mathcal{L} \end{pmatrix} : \ell^2_{\lambda+1} \to \ell^2_\lambda$ be an invertible operator for some $\lambda \ge D - 1$. Then, as $n \to \infty$,
\[ \|A_n R_n\|_{\ell^2_\lambda} = O(1), \qquad \|(A_n R_n)^{-1}\|_{\ell^2_\lambda} = O(1), \]
where, with $P_n = (I_n, 0)$ as before,
\[ R_n = P_n \mathcal{R} P_n^T, \qquad A_n = P_n \begin{pmatrix} \mathcal{B} \\ \mathcal{L} \end{pmatrix} P_n^T. \]

Proof uses integral reformulation ideas: “$\mathcal{R}$ integrates the ODE to form a Fredholm operator”, i.e.,
\[ \begin{pmatrix} \mathcal{B} \\ \mathcal{L} \end{pmatrix} \mathcal{R} = I + \mathcal{K}, \qquad \mathcal{K} = \text{compact}. \]

Convergence with K = N

Theorem: Computed solution converges in high-order norms
Suppose $f \in \ell^2_{\lambda-N+1}$ for some $\lambda \ge D - 1$, and that $\begin{pmatrix} \mathcal{B} \\ \mathcal{L} \end{pmatrix} : \ell^2_{\lambda+1} \to \ell^2_\lambda$ is an invertible operator. Define
\[ u_n = A_n^{-1} P_n \begin{pmatrix} c \\ \mathcal{S}_{N-1}\cdots\mathcal{S}_0 f \end{pmatrix}, \]
then
\[ \|u - P_n^\top u_n\|_{\ell^2_{\lambda+1}} \le C\, \|u - P_n^\top P_n u\|_{\ell^2_{\lambda+1}}. \]
The constant, $C$, does not depend on $n$.

You don't always need the complex plane to compute high derivatives.

[Figure: "Relative error in derivative at x=1/2", relative error against $n$ (0 to 200) for curves labelled Ai(1), Ai(20), Ai(100).]

Fast linear algebra: The QTP∗ factorisation

[Figure: three matrix sparsity patterns: Original matrix, After left Givens, After right Givens.]

Apply a partial factorisation on the left, followed by a partial factorisation on the right.

Factorisation: $A_n = QTP^*$ in $O(m^2 n)$ operations.
No fill-in: The matrix $T$ contains no more non-zeros than $A_n$.
For solving $A_n x = b$: The matrix $Q$ can be immediately applied to $b$. The matrix $P^*$ can be stored using Demmel's trick [Demmel 1997].
UMFPACK by Davis is faster for $n \le 50000$, but has observed complexity of, roughly, $O(n^{1.25})$ with a very small constant.

Constant coefficient PDEs

Consider the constant coefficient PDE
\[ \sum_{j=0}^{N} \sum_{i=0}^{N-j} a_{ij} \frac{\partial^{i+j} u}{\partial x^i \partial y^j} = f(x, y), \qquad a_{ij} \in \mathbb{R}, \]
defined on $[a, b] \times [c, d]$ with linear boundary conditions satisfying continuity conditions.

We discretise as a generalised Sylvester matrix equation
\[ \sum_{j=1}^{k} \sigma_j A_j X B_j^T = F, \qquad A_j \in \mathbb{R}^{n_1 \times n_1}, \quad B_j \in \mathbb{R}^{n_2 \times n_2}, \]
where $A_j$ and $B_j$ are US discretisations.

[Figure: matrix structure of an example $A_j$.]

Determining the rank of a PDE operator

Given a PDE with constant coefficients of the form
\[ \mathcal{L} = \sum_{j=0}^{N} \sum_{i=0}^{N-j} a_{ij} \frac{\partial^{i+j}}{\partial x^i \partial y^j}, \qquad A = (a_{ij}), \]
we can rewrite this as
\[ \mathcal{L} = D(y)^T A\, D(x), \qquad D(x) = \begin{pmatrix} \partial^0/\partial x^0 \\ \vdots \\ \partial^N/\partial x^N \end{pmatrix}. \]
Hence, if $A = U\Sigma V^T$ is the truncated SVD of $A$ with $\Sigma \in \mathbb{R}^{k \times k}$, then
\[ \mathcal{L} = D(y)^T U \Sigma V^T D(x) = \left(D(y)^T U\right) \Sigma \left(V^T D(x)\right). \]
This low rank representation for $\mathcal{L}$ tells us how to discretise and solve the PDE.

Matrix equation solvers

Rank-1: $A_1 X B_1^T = F$. Solve $A_1 Y = F$, then $B_1 X^T = Y^T$.
Rank-2: $A_1 X B_1^T + A_2 X B_2^T = F$. Generalised Sylvester solver [Jonsson & Kågström, 2002].
Rank-k, $k \ge 3$: Solve the $n_1 n_2 \times n_1 n_2$ system with the QTP∗ factorisation.

[Figure: solver cost against $n_1$ & $n_2$, with curves labelled $O(n_1 n_2)$, $O(n_1^3 + n_2^3)$ and $O(n_1^3 n_2)$.]

Boundary conditions are imposed by carefully removing degrees of freedom in the matrix equation.
Automatically forms and solves subproblems, if possible.
Corner singularities can be partly resolved by domain subdivision.

Examples

Example 5: Helmholtz equation
\[ u_{xx} + u_{yy} + 2\omega^2 u = 0, \qquad u(\pm 1, y) = f(\pm 1, y), \qquad u(x, \pm 1) = f(x, \pm 1), \]
where $f(x, y) = \cos(\omega x)\cos(\omega y)$.

  N = chebop2(@(u) ... );
  N.lbc=f(-1,:); u=N\0;

[Figure: computed solution for $\omega = 50$ on $[-1, 1] \times [-1, 1]$, axes $x$ and $y$.]

Adaptively selects the discretisation size.
Forms a chebfun2 object [T. & Trefethen 2013].
For $\omega = 2000$, $\|\tilde{u} - u\|_2 = 2.43 \times 10^{-9}$.
For $\omega = 50$ the solve takes 0.27s for $n_1 = n_2 = 126$; for $\omega = 2000$ the solve takes 32.1s for $n_1 = n_2 = 2124$.

Conclusions

The US method results in almost banded well-conditioned matrices.
It requires $O(m^2 n)$ operations to solve an ODE.
It can be used as a preconditioner for the collocation method.
The US method converges and is numerically stable.
It can be extended to a spectral method for solving PDEs on rectangular domains.

Historical example [Orszag 1971]

Orr-Sommerfeld equation
\[ \frac{1}{5772.2}\left(u'''' - 2u'' + u\right) - 2iu - i(1 - x^2)(u'' - u) = \lambda(u'' - u), \qquad u(\pm 1) = u'(\pm 1) = 0. \]

[Figure: "Eigenvalues of the Orr-Sommerfeld operator", plotted in the complex plane with axes real and imag.]

The end

Thank you

More information: S. Olver & T., A fast and well-conditioned spectral method, to appear in SIAM Review, (2013).

References

F. Bornemann, Accuracy and stability of computing high-order derivatives of analytic functions by Cauchy integrals, Found. Comput. Math., 11 (2011), pp. 1–63.
I. Jonsson & B. Kågström, Recursive blocked algorithms for solving triangular matrix equations, Part II: Two-sided and generalized Sylvester and Lyapunov equations, ACM Trans. Math. Softw., 28 (2002), pp. 416–435.
S. Olver & T., A fast and well-conditioned spectral method, to appear in SIAM Review, (2013).
S. Orszag, Accurate solution of the Orr-Sommerfeld stability equation, J. Fluid Mech., 50 (1971), pp. 689–703.
T. & L. N. Trefethen, An extension of Chebfun to two dimensions, likely to appear in SISC, (2013).
L. N. Trefethen et al., The Chebfun software system, version 4.2, 2011.
Chebyshev polynomials

An underappreciated fact: Every Chebyshev series can be rewritten as a palindromic Laurent series.

The degree $k$ Chebyshev polynomial (of the first kind) is defined by
\[ T_k(x) = \cos\left(k \cos^{-1} x\right) = \frac12\left(z^k + z^{-k}\right), \qquad x \in [-1, 1], \]
where $z = e^{i\theta}$ and $\cos\theta = x$.

[Figure: left, $T_{25}(x)$ against $x \in [-1, 1]$; right, $T_{25}(\theta) = \cos(25\theta)$ against $\theta \in [0, \pi]$.]

Every Chebyshev series can be written as a palindromic Laurent series:
\[ \sum_{k=0}^{N} \alpha_k T_k(x) = \frac12 \sum_{k=1}^{N} \alpha_k z^{-k} + \alpha_0 + \frac12 \sum_{k=1}^{N} \alpha_k z^{k}, \qquad z = e^{i\theta}. \]

Demmel's trick for storing Givens rotations

A Givens rotation zeros out one entry of a matrix and we can store the rotation information there.

[Diagram: a matrix of × entries before and after a Givens rotation; the zeroed entry is overwritten with $p$.]

Let $s = \sin\theta$ and $c = \cos\theta$, where $\theta$ is the rotation angle, then
\[ p = \begin{cases} s \cdot \mathrm{sign}(c), & |s| < |c|, \\ \mathrm{sign}(s)/c, & |s| \ge |c|. \end{cases} \]

To restore (up to a rotation by 180 degrees): If $|p| < 1$, then $s = p$ and $c = \sqrt{1 - s^2}$, otherwise $c = 1/p$ and $s = \sqrt{1 - c^2}$.
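The encode and restore rules above translate directly into code. The sketch below is a minimal Python rendering with hypothetical function names; the handling of $c = 0$ (storing an infinite $p$ so that $1/p$ restores $c = 0$) is an assumption added here to avoid a division by zero and is not stated on the slide.

```python
import math


def encode(c, s):
    """Pack a rotation (c, s) with c^2 + s^2 = 1 into a single number p."""
    if abs(s) < abs(c):
        return s * math.copysign(1.0, c)
    if c == 0.0:
        return math.inf              # assumption: 1/inf restores c = 0 on decoding
    return math.copysign(1.0, s) / c


def decode(p):
    """Recover (c, s) up to a common sign, i.e. up to a rotation by 180 degrees."""
    if abs(p) < 1.0:
        s = p
        return math.sqrt(1.0 - s * s), s
    c = 1.0 / p
    return c, math.sqrt(1.0 - c * c)


for theta in (0.1, 1.0, 2.0, 3.0, 4.0, 5.5):
    c, s = math.cos(theta), math.sin(theta)
    c2, s2 = decode(encode(c, s))
    # (c2, s2) should equal either (c, s) or (-c, -s)
    print(theta, (c, s), (c2, s2))
```

Storing whichever of $s$ or $1/c$ is the better conditioned quantity is what lets the square root in `decode` be taken of a number bounded away from zero.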