An implicit method for two-electron time-dependent fairly large-scale quantum simulations

Raymond Nepstad
University of Bergen

Time dependent quantum mechanics – analysis and numerics
Oslo, 30 April 2010

Motivation
[Figures: Free-electron laser (Hasylab, DESY); Mauritsson et al., PRL (2008)]
- Short pulses
- Attosecond electron motion
- Ultra strong field
- Electron interactions

The crew
Morten Førre, Raymond Nepstad, Sigurd Askeland, Tore Birkeland, Stian Sørngård, Ingjald Pilskog

Outline
- Implicit method for two-electron systems
- Pyprop and the Helium package
- Results
- Summary

Simulating two-electron systems

TDSE
Solve the (6D) two-electron time-dependent Schrödinger equation,
$$ i \frac{\partial}{\partial t} \Psi(\mathbf{r}_1, \mathbf{r}_2, t) = H \Psi(\mathbf{r}_1, \mathbf{r}_2, t). $$
Semiclassical light-atom interaction Hamiltonian,
$$ H = \frac{p_1^2}{2} - \frac{2}{r_1} + H_{f,1} + \frac{p_2^2}{2} - \frac{2}{r_2} + H_{f,2} + \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}. $$
Field term (long-wavelength approximation),
$$ H^{v}_{f,i} = \mathbf{A}(t) \cdot \mathbf{p}_i \quad \text{(velocity gauge)}, \qquad H^{l}_{f,i} = \mathbf{E}(t) \cdot \mathbf{r}_i \quad \text{(length gauge)}. $$
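The two field terms are tied together by E(t) = -dA/dt. The sketch below builds a vector potential and the matching electric field and checks the relation numerically. The sin² envelope, the function names, and the values of a0, omega and duration are assumptions made for this example only; the slides do not specify a pulse shape.

```python
import numpy as np

def vector_potential(t, a0, omega, duration):
    """A(t) with an assumed sin^2 envelope (hypothetical pulse shape)."""
    envelope = np.sin(np.pi * t / duration) ** 2
    return a0 * envelope * np.sin(omega * t)

def electric_field(t, a0, omega, duration):
    """E(t) = -dA/dt, differentiated analytically."""
    envelope = np.sin(np.pi * t / duration) ** 2
    d_envelope = (np.pi / duration) * np.sin(2.0 * np.pi * t / duration)
    return -a0 * (d_envelope * np.sin(omega * t)
                  + omega * envelope * np.cos(omega * t))

# Illustrative parameters in atomic units (assumed, not taken from the talk).
a0, omega = 1.0, 1.0
duration = 6 * 2.0 * np.pi / omega          # six optical cycles
t = np.linspace(0.0, duration, 20001)

a_t = vector_potential(t, a0, omega, duration)
e_t = electric_field(t, a0, omega, duration)

# Cross-check the analytic derivative against finite differences.
e_numeric = -np.gradient(a_t, t, edge_order=2)
max_dev = np.max(np.abs(e_t - e_numeric))
print(f"max |E_analytic - E_numeric| = {max_dev:.2e}")
```

Either a_t (velocity form) or e_t (length form) would then enter the field term of the Hamiltonian.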
Discretization
Continuous PDE → set of ODEs via
$$ \psi(\mathbf{r}_1, \mathbf{r}_2, t) = \sum_{i,j,k} c_{ijk}(t)\, \frac{B_i(r_1)}{r_1}\, \frac{B_j(r_2)}{r_2}\, \mathcal{Y}^{LM}_{l_1,l_2}(\Omega_1, \Omega_2). $$
Coupled spherical harmonics ($k = \{L, M, l_1, l_2\}$),
$$ \mathcal{Y}^{LM}_{l_1,l_2}(\Omega_1, \Omega_2) = \sum_{m} \langle l_1 l_2\, m\, M-m \,|\, L M \rangle\, Y_{l_1}^{m}(\Omega_1)\, Y_{l_2}^{M-m}(\Omega_2). $$
B-splines are non-orthogonal (overlap matrix),
$$ \int_0^\infty dr\, B_i(r) B_j(r) = S_{ij}, \qquad S = I_k \otimes S_1 \otimes S_2. $$

B-splines
- Piecewise polynomials (order k)
- Can handle non-smooth functions
- Compact support, sparse matrices
[Figure: Radial box and B-splines]

Matrix elements
- Single-particle operators
- Gauss-Legendre quadrature for radial integrals
- Analytical solution of angular integrals
- Multipole expansion of $1/r_{12}$,
$$ \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \approx \sum_{l=0}^{l_{\max}} \sum_{m=-l}^{l} \frac{4\pi}{2l+1}\, \frac{r_{<}^{l}}{r_{>}^{l+1}}\, Y^{*}_{l,m}(\Omega_1)\, Y_{l,m}(\Omega_2). $$
- Matrix-form TDSE,
$$ i S \dot{c}(t) = H(t) c(t) $$
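To make the overlap matrix concrete, here is a minimal sketch that assembles S_ij for a one-dimensional B-spline basis with Gauss-Legendre quadrature, using SciPy's BSpline class. The box size, the number of breakpoints, the clamped knot vector and the function name bspline_overlap are assumptions for illustration; they are not the grid or the code used in the talk.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_overlap(breakpoints, order):
    """Return S_ij = integral of B_i(r) B_j(r) dr over the radial box."""
    breakpoints = np.asarray(breakpoints, dtype=float)
    degree = order - 1
    # Clamped knot vector: end points repeated so the basis fills the box.
    knots = np.concatenate((
        np.full(degree, breakpoints[0]),
        breakpoints,
        np.full(degree, breakpoints[-1]),
    ))
    n_basis = len(knots) - order
    # Identity coefficients evaluate every basis function in one call.
    basis = BSpline(knots, np.eye(n_basis), degree)
    # 'order' Gauss-Legendre nodes integrate the degree 2*(order-1)
    # products B_i * B_j exactly on each knot interval.
    nodes, weights = np.polynomial.legendre.leggauss(order)
    S = np.zeros((n_basis, n_basis))
    for a, b in zip(breakpoints[:-1], breakpoints[1:]):
        r = 0.5 * (b - a) * nodes + 0.5 * (a + b)    # map [-1, 1] -> [a, b]
        w = 0.5 * (b - a) * weights
        values = basis(r)                            # shape (n_quad, n_basis)
        S += values.T @ (w[:, None] * values)
    return S

order = 6                                   # assumed spline order
breakpoints = np.linspace(0.0, 20.0, 11)    # assumed radial box of 20 a.u.
S = bspline_overlap(breakpoints, order)

# Compact support: entries with |i - j| >= order vanish, so S is banded.
i, j = np.indices(S.shape)
outside_band = S[np.abs(i - j) >= order]
print(S.shape, np.abs(outside_band).max())
```

The band structure is what keeps the radial matrices sparse, while the non-zero off-diagonal entries are the reason S appears on the left-hand side of the matrix-form TDSE.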
Matrix structure
[Figure: 2D radial B-spline matrix]
[Figure: Full matrix]
- Per 2D radial block: ≈ $(2 N_B \cdot k)^2$. Ex: $(2 \cdot 280 \cdot 6)^2$ → 170 MB
- Total matrix (500 non-zero angular blocks): typically 80 GB
- Memory-intensive approach, parallelization required

Distributing the wavefunction
[Figure: Distributing the wavefunction]

Distributing matrices
[Figure: distributed matrix-vector multiplication across processes P0, P1, P2; legend: Calculate, receive/send]

Time discretization
Exponential (approximate) propagator,
$$ c(t + \Delta t) = \exp\left(-i \Delta t\, S^{-1} H\right) c(t) + O(\Delta t^2) $$
Numerical schemes:
1. Approximate exponential in Krylov subspace (Arnoldi/Lanczos). No go - the equations are too stiff!
2. Padé approximation: Cayley-Hamilton (Crank-Nicolson)
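A small numerical check can illustrate why the Padé (Crank-Nicolson) form is a reasonable stand-in for the exponential. The sketch below compares one implicit step with the exact exponential for a tiny dense system. The matrices H and S are random toy matrices (symmetric, and symmetric positive definite, respectively) and all names are assumptions for this example; nothing here is the helium operator.

```python
import numpy as np
from scipy.linalg import expm, solve

rng = np.random.default_rng(0)
n = 6
X = rng.standard_normal((n, n))
H = 0.5 * (X + X.T)                  # toy real symmetric "Hamiltonian"
Y = rng.standard_normal((n, n))
S = Y @ Y.T + n * np.eye(n)          # toy symmetric positive definite overlap

def cayley_step(c, dt):
    """Solve (S + i dt/2 H) c_new = (S - i dt/2 H) c."""
    A = S + 0.5j * dt * H
    b = (S - 0.5j * dt * H) @ c
    return solve(A, b)

def exact_step(c, dt):
    """Reference step c_new = exp(-i dt S^-1 H) c."""
    return expm(-1j * dt * solve(S, H)) @ c

c0 = rng.standard_normal(n) + 0j

def one_step_error(dt):
    return np.linalg.norm(cayley_step(c0, dt) - exact_step(c0, dt))

# Halving dt should shrink the one-step error by about 2**3 = 8.
ratio = one_step_error(0.02) / one_step_error(0.01)

# The step conserves the norm in the S inner product, c^H S c.
c1 = cayley_step(c0, 0.02)
norm_before = np.vdot(c0, S @ c0).real
norm_after = np.vdot(c1, S @ c1).real
print(f"error ratio = {ratio:.2f}, norm drift = {abs(norm_after - norm_before):.1e}")
```

A one-step error that scales as Δt³ corresponds to a second-order scheme over a fixed propagation time, and the conserved S-norm is the discrete counterpart of unitarity.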
Implicit Cayley-Hamilton
Cayley-Hamilton propagator,
$$ \underbrace{\left(S + \frac{i \Delta t}{2} H\right)}_{A}\; \underbrace{c(t + \Delta t)}_{x} = \underbrace{\left(S - \frac{i \Delta t}{2} H\right) c(t)}_{b} $$
- A can be very large, but sparse; direct methods not feasible
- However, since A is sparse, an iterative method could work
- We use GMRES

GMRES iterative solver
Construct Krylov subspace of A (sparse, m × m),
$$ V_n \leftarrow \mathrm{span}\{b, Ab, A^2 b, \ldots, A^{n-1} b\} \qquad (n \ll m) $$
Solve the $((n+1) \times n)$ least squares problem
$$ \min \|A V_n y - b\| = \min \|H_n y - V_{n+1}^{*} b\| $$
where $x_n = V_n y$ and $V_{n+1} H_n = A V_n$ (here $H_n$ is the upper Hessenberg matrix of the Krylov basis, not the Hamiltonian).
- Problem: converges extremely slowly for typical A
- Solution: use preconditioner
Preconditioner
Preconditioner M for system of equations,
$$ A x = b \;\rightarrow\; M^{-1} A x = M^{-1} b $$
Desired properties
- $B = M^{-1} A$ should have clustered eigenvalues (GMRES)
- $M^{-1} c$ should be easier to compute than $A^{-1} b$
- M somewhere between I and A

Choosing a preconditioner
Rewrite Hamiltonian,
$$ H = H_r + H_f(t) + H_{\mathrm{multipole}} $$
“Radial” Hamiltonian,
$$ H_r = \frac{p_1^2}{2} + \frac{p_2^2}{2} - \frac{2}{r_1} - \frac{2}{r_2} + \frac{1}{r_{>}} $$
Use as preconditioner; its operations require no communication,
$$ M = H_r $$
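The effect of preconditioning with the dominant part of the operator can be tried on a toy problem. In the sketch below the Hamiltonian is split into a stiff finite-difference kinetic part H0 and a weak diagonal coupling V; the preconditioner is built from H0 alone, factorized once with SciPy's splu (a SuperLU wrapper) and applied by back-substitution in every iteration. The 1D model, S = I, the splitting and all names are assumptions for illustration, not the radial Hamiltonian or the ILU setup of the talk.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres, splu

n, dx, dt = 200, 0.1, 0.2
x = dx * np.arange(1, n + 1)

# Toy splitting H = H0 + V: stiff kinetic part plus a weak coupling.
H0 = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / (2 * dx**2)
V = sp.diags(0.5 * np.cos(x))
S = sp.identity(n)

A = (S + 0.5j * dt * (H0 + V)).tocsc()
M = (S + 0.5j * dt * H0).tocsc()        # preconditioner from the stiff part only

rng = np.random.default_rng(0)
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = (S - 0.5j * dt * (H0 + V)) @ c

lu = splu(M)                             # factorization, done once
M_inv = LinearOperator((n, n), matvec=lu.solve, dtype=complex)

def run_gmres(preconditioner=None):
    """Return solution, convergence flag and number of inner iterations."""
    residuals = []
    sol, info = gmres(A, b, M=preconditioner, restart=n,
                      callback=residuals.append, callback_type="pr_norm")
    return sol, info, len(residuals)

sol_plain, info_plain, it_plain = run_gmres()
sol_prec, info_prec, it_prec = run_gmres(M_inv)
print(f"Krylov vectors: {it_plain} without, {it_prec} with preconditioner")
```

The preconditioned operator is the identity plus a small perturbation, so its eigenvalues cluster near 1 and only a handful of Krylov vectors are needed.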
Applying the preconditioner
Solve My = c
- Calculate factorization M = LU (only once)
- Backsubstitution $y = U^{-1} L^{-1} c$ (each time)
Possible methods
- Sparse “exact” factorization: SuperLU
  - Fill-in problem
  - Don’t need “exact” solution
- Structure-preserving factorization: incomplete LU (ILU)
  - IFPACK / Trilinos

How well does it work?
[Figure: Error estimate (10^-12 to 10^-1, log scale) versus number of Krylov vectors (0 to 20); curves: No preconditioner, Preconditioner; N_bsplines = 40, l_max = 3, L = [0, 1, 2]]

Drawbacks
- Max procs given by (l1, l2, L) (50 - 300)
- Load balancing not optimal (mem. usage)
- Possible improvement: Distribute one/both radial ranks
[Figure: histogram of % of procs (0 to 20) versus Memory (MB), 3500 to 7000]

Distributing two ranks: wavefunction
[Figure]

Distributing two ranks: matrix
[Figure]
Non-radial preconditioner?
- Radial preconditioner blocks no longer proc-local
- Use only local blocks with new distribution scheme?
- ILU(n), n > 0, some communication, performance penalty
- Work in progress

Pyprop

What is Pyprop?
Toolkit for solving the Time Dependent Schrödinger Equation
Goals
- Flexibility
- Performance
Research tool, not QM@Home
- Common tasks automated
- Difficult tasks possible
T. Birkeland, PyProp - a Python Framework for Propagating the Time Dependent Schrödinger Equation, Ph.D. thesis (2009)
Free Software (GPL): http://pyprop.googlecode.com
Development branch on Github (http://github.com/kvantetore/PyProp)
Flexibility
- Choose dimensionality and discretization
  - Several discretization schemes built in
  - Calculate inner products, load/save wavefunctions
- Supply potentials (Hamiltonian)
  - PyProp takes care of a lot of repetitive code
  - Operator-wavefunction multiplications
- Choose propagator
  - Several propagators built in
- Perform analysis and data exploration
  - High level code is written in Python
  - All the propagation tools can be used interactively
Pyprop Framework Design
[Figure: Core Routines, Independent Modules, User Code]

What can Pyprop do
- Two-electron quantum dot molecules
- Ion-molecule collisions (Kr34+ + H2)
- Parallel redistribution of multidimensional data
- Dynamics in two-electron atoms
[Figure: Life on Proc(0,0); axes x, y, z; indices i, j, k; panels S={0,1}, S={0,2}, S={1,2}]

Helium package layout
[Diagram, boxes in source order]
- Pyprop: CayleyPropagator, GMRESSolver, Bsplines, CoupledSphericalHarmonics
- Pyprop.Helium: Core, Eigenstates, Potentials, Preconditioner, Project-specific wrappers, Find eigenstates, Basic eigenstate operations, Propagation, Propagation flow, Propagation tasks, Postprocessing, Analysis, Projectors, Observables, Differential prob., Namegenerator

Results

Workhorse: Cray XT4 @Bergen
[Figure: Cray XT4. The XT4 at Parallab, Bergen]

Two-photon double ionization of Helium
[Figure: energy level diagram, Energy (a.u.): He++ at 0, He+ (2p) at −0.5, He+ (1s) at −2.0, He at −2.9; labels DI, SI]
[Figure: TPDI cross section (10^-52 cm^4 s) versus Photon energy (eV), 38 to 56; curves: Pyprop (4 fs), Feist et al. (4 fs), Foumouo et al. (NC), Foumouo et al. (WC), Nikolopoulos et al.]

Ultra-strong laser fields
[Figure: Ionization Probability (10^-12 to 10^0, log scale) versus Field Strength (a.u.); curves: Total, Double, Single, 1-photon, 2-photon, >2-photon]
[Figure: Ionization Probability (0 to 1) versus E0/ω² (a.u.), two panels]
Summary
1. Two-electron method
   - 2 x Bspline + coupled spherical harmonics
   - Implicit propagator (Cayley-Hamilton)
   - GMRES
   - Precondition with ILU
   - Implemented as a Python package for Pyprop (C++ used where needed)
   - Improvements possible (in progress)
2. Pyprop
   - Flexible toolkit for TDSE-problems
3. Future: Circular polarization, H2, ...
Thank you for your attention