Solving large-scale inverse electromagnetic scattering problems: A parallel AD-based approach Andrea Walther Institut für Mathematik Universität Paderborn Outline Motivation: The Rosetta-Project The Direct Problem: A FDTD-Discretization of Maxwell’s equations The Inverse Problem: An AD-based Solution Approach Conclusion and Outlook Current cooperation with F. Hoffeins, U. Markwardt, W. Nagel (ZIH, TU Dresden) D. Plettemeier (Electrical Engineering , TU Dresden) at Uni Paderborn: Maria Brune Andrea Walther 1 / 28 Solving large-scale inverse problems May 25th, 2011 Motivation: The Rosetta-Project Identification of Material Parameters Andrea Walther 2 / 28 Solving large-scale inverse problems May 25th, 2011 Motivation: The Rosetta-Project The Consert-Mission In 2004: Launch of spacecraft Rosetta In 2014: Arrival at the comet 67P/Churyumov-Gerasimenko On board: Lander Philae with 10 instruments, one is CONSERT Goal: Determine internal structure (permittivity!) of the comet Andrea Walther 3 / 28 Solving large-scale inverse problems May 25th, 2011 Motivation: The Rosetta-Project The Consert-Mission In 2004: Launch of spacecraft Rosetta In 2014: Arrival at the comet 67P/Churyumov-Gerasimenko On board: Lander Philae with 10 instruments, one is CONSERT Goal: Determine internal structure (permittivity!) of the comet For that: I Send electromagnetic waves through the comet nucleus I Measure the outcome using the orbiter Rosetta Andrea Walther 3 / 28 Solving large-scale inverse problems May 25th, 2011 Motivation: The Rosetta-Project The Consert-Mission In 2004: Launch of spacecraft Rosetta In 2014: Arrival at the comet 67P/Churyumov-Gerasimenko On board: Lander Philae with 10 instruments, one is CONSERT Goal: Determine internal structure (permittivity!) of the comet For that: I Send electromagnetic waves through the comet nucleus I Measure the outcome using the orbiter Rosetta One challenge: Size of the comet is estimated by 2 × 2 × 2 km resulting in a desired resolution of at least 7003 grid cells Andrea Walther 3 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Maxwell Equations for Electromagnetic Field ∂B = −∇ × E ∂t ∇·B =0 ∂D = ∇ × H − J̃ ∂t ∇·D =ρ B and D = magnetic respectively electric flux density, H and E = magnetic and electric field strength, Andrea Walther 4 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Maxwell Equations for Electromagnetic Field ∂B = −∇ × E ∂t ∇·B =0 ∂D = ∇ × H − J̃ ∂t ∇·D =ρ B and D = magnetic respectively electric flux density, H and E = magnetic and electric field strength, Can be substituted by −µ Andrea Walther ∂H =∇×E ∂t ε 4 / 28 ∂E = ∇ × H − σE + J ∂t Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Maxwell Equations for Electromagnetic Field ∂B = −∇ × E ∂t ∇·B =0 ∂D = ∇ × H − J̃ ∂t ∇·D =ρ B and D = magnetic respectively electric flux density, H and E = magnetic and electric field strength, Can be substituted by −µ ∂H =∇×E ∂t ε ∂E = ∇ × H − σE + J ∂t Goal: Want to estimate ε from measurements at the boundary Andrea Walther 4 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Theoretical Aspects Formulation as time-dependent problem? I Radon transform based reconstruction methods (Benna, Barriot, Kofman ’02) requires statistical a priori information Andrea Walther 5 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Theoretical Aspects Formulation as time-dependent problem? I Radon transform based reconstruction methods (Benna, Barriot, Kofman ’02) requires statistical a priori information ⇒ keep time-dependence!! Andrea Walther 5 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Theoretical Aspects Formulation as time-dependent problem? I Radon transform based reconstruction methods (Benna, Barriot, Kofman ’02) requires statistical a priori information ⇒ keep time-dependence!! I Then: Maxwell’s equations do have a unique solution for one source and observations at a closed surface that comprises the reconstruction domain (Kirsch 1998) Andrea Walther 5 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Theoretical Aspects Formulation as time-dependent problem? I Radon transform based reconstruction methods (Benna, Barriot, Kofman ’02) requires statistical a priori information ⇒ keep time-dependence!! I Then: Maxwell’s equations do have a unique solution for one source and observations at a closed surface that comprises the reconstruction domain (Kirsch 1998) Discretization? Andrea Walther 5 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Discretization: FDTD The Finite Differences in Time Domain method I proposed by Yee in 1966 I discretization of the curl equations by centered finite in space I discretization of the curl equations by leapfrog scheme in time I widely-used for simulations I E and H components are staggered in time and space Andrea Walther 6 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Discretization: FDTD The Finite Differences in Time Domain method I proposed by Yee in 1966 I discretization of the curl equations by centered finite in space I discretization of the curl equations by leapfrog scheme in time I widely-used for simulations I E and H components are staggered in time and space Ez z x Andrea Walther y Hy (i, j, k ) Ex Hx Hz Ey Layout of a Yee-cell 6 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations The Staggered Computation E E E E t = ∆t H E H E H E H t = 0.5∆t E t =0 x =0 x = ∆x x = 2∆x x = 3∆x Leapfrog scheme in 1D Andrea Walther 7 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations and the Formulas in 3D n+ 1 n− 1 Hx |i,j+21 ,k + 1 = Hx |i,j+21 ,k + 1 2 2 2 ∆t + µ 2 Ey |ni,j+ 1 ,k +1 − Ey |ni,j+ 1 ,k 2 2 ∆z n − n Ez |i,j+1,k + 1 − Ez |i,j,k + 1 2 ! 2 ∆y Hy and Hz similar. Andrea Walther 8 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations and the Formulas in 3D n+ 1 n− 1 Hx |i,j+21 ,k + 1 = Hx |i,j+21 ,k + 1 2 2 2 2 Ey |ni,j+ 1 ,k +1 − Ey |ni,j+ 1 ,k ∆t + µ 2 2 ∆z n − n Ez |i,j+1,k + 1 − Ez |i,j,k + 1 2 ! 2 ∆y Hy and Hz similar. n Ex |i+ 1 ,j,k = 2 + 1 ∆t ε + σ∆t 2ε 1− σ∆t 2ε σ∆t 2ε n−1 Ex |i+ 1 ,j,k − ∆t ε + σ∆t 2ε n− 1 Jsourcex |i+ 12,j,k 2 2 1+ 1 n− 1 n− 1 n− 1 n− 1 Hz |i+ 12,j+ 1 ,k − Hz |i+ 12,j− 1 ,k Hy |i+ 12,j,k + 1 − Hy |i+ 12,j,k − 1 2 2 2 2 2 2 2 2 − ∆y ∆z Ey and Ez similar, Andrea Walther 8 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations and the Formulas in 3D n+ 1 n− 1 Hx |i,j+21 ,k + 1 = Hx |i,j+21 ,k + 1 2 2 2 2 Ey |ni,j+ 1 ,k +1 − Ey |ni,j+ 1 ,k ∆t + µ 2 2 ∆z n − n Ez |i,j+1,k + 1 − Ez |i,j,k + 1 2 ! 2 ∆y Hy and Hz similar. n Ex |i+ 1 ,j,k = 2 + 1 ∆t ε + σ∆t 2ε 1− σ∆t 2ε σ∆t 2ε ∆t ε + σ∆t 2ε n− 1 Jsourcex |i+ 12,j,k 2 2 1+ 1 n− 1 n− 1 n− 1 n− 1 Hz |i+ 12,j+ 1 ,k − Hz |i+ 12,j− 1 ,k Hy |i+ 12,j,k + 1 − Hy |i+ 12,j,k − 1 2 2 2 2 2 2 2 2 − ∆y ∆z Ey and Ez similar, Andrea Walther n−1 Ex |i+ 1 ,j,k − material parameter ε enters nonlinearily!! 8 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Handling of Boundary No boundary in spacebut simulation performed on bounded domain. Therefore: Waves have to be damped at the boundary We use perfectly matched layers (PML) in a variant proposed by Gedney in 1996 consisting of uniaxial absorbing material. Andrea Walther 9 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Handling of Boundary No boundary in space but simulation performed on bounded domain. Therefore: Waves have to be damped at the boundary We use perfectly matched layers (PML) in a variant proposed by Gedney in 1996 consisting of uniaxial absorbing material. Andrea Walther 9 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Handling of Boundary No boundary in space but simulation performed on bounded domain. Therefore: Waves have to be damped at the boundary We use perfectly matched layers (PML) in a variant proposed by Gedney in 1996 consisting of uniaxial absorbing material. Andrea Walther 9 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Handling of Boundary No boundary in space but simulation performed on bounded domain. Therefore: Waves have to be damped at the boundary We use perfectly matched layers (PML) in a variant proposed by Gedney in 1996 consisting of uniaxial absorbing material. Andrea Walther 9 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Computational Domain in 2D Due to size of the problem: Parallelization is indispensable! inner FDTD cells ··· PML ··· ··· ··· ··· ··· ··· ··· ··· ··· (a) 2D domain decomposition Andrea Walther 10 / 28 Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Computational Domain in 2D Due to size of the problem: Parallelization is indispensable! inner FDTD cells ··· PML ··· ··· ··· ··· ··· ··· ghost cells MPI ··· ··· ··· (a) 2D domain decomposition Andrea Walther 10 / 28 (b) Synchronization of ghost cells Solving large-scale inverse problems May 25th, 2011 FDTD-Discretization of Maxwell’s equations Runtime: Function Evaluation 200 27 cores 64 cores 125 cores 216 cores 343 cores 512 cores 729 cores 1000 cores overall runtime [s] 150 100 50 0 1003 Andrea Walther 5003 6003 7003 8003 # inner grid cells 11 / 28 9003 Solving large-scale inverse problems 10003 May 25th, 2011 An AD-based Solution Approach The Inverse Problem J(ε) = X N X 1 (i,j,k )∈M n=0 2 ku(ε)|ni,j,k − u obs |ni,j,k k2 + βkεk2∗ with M . . . set of indices of observed cells N . . . number of observed time steps u(ε) . . . simulated state u obs . . . observed state β . . . Regularization parameter Andrea Walther 12 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach The Inverse Problem J(ε) = X N X 1 (i,j,k )∈M n=0 2 ku(ε)|ni,j,k − u obs |ni,j,k k2 + βkεk2∗ with M . . . set of indices of observed cells N . . . number of observed time steps u(ε) . . . simulated state u obs . . . observed state β . . . Regularization parameter All solution approaches need derivatives. Andrea Walther 12 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach The Inverse Problem J(ε) = X N X 1 (i,j,k )∈M n=0 2 ku(ε)|ni,j,k − u obs |ni,j,k k2 + βkεk2∗ with M . . . set of indices of observed cells N . . . number of observed time steps u(ε) . . . simulated state u obs . . . observed state β . . . Regularization parameter All solution approaches need derivatives. Consistent derivatives for discrete version? (Abenius ’04) provided by algorithmic differentiation (AD). Andrea Walther 12 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Forward mode AD = Tangents/Sensitivities F x y Andrea Walther 13 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Forward mode AD = Tangents/Sensitivities F x y Andrea Walther 13 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Forward mode AD = Tangents/Sensitivities ẋ F x y Andrea Walther 13 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Forward mode AD = Tangents/Sensitivities ẋ F ẏ x y Ḟ Andrea Walther 13 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Forward mode AD = Tangents/Sensitivities ẋ F ẏ x y Ḟ ẏ (t) = Andrea Walther ∂ F (x(t)) = F 0 (x(t)) ẋ(t) ≡ Ḟ (x, ẋ) ∂t 13 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Reverse Mode AD = Discrete Adjoints F x y Andrea Walther 14 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Reverse Mode AD = Discrete Adjoints F ȳ x ȳ > y Andrea Walther 14 / 28 y = Solving large-scale inverse problems c May 25th, 2011 An AD-based Solution Approach Reverse Mode AD = Discrete Adjoints x̄ F ȳ > ȳ x x) F( = F̄ ȳ > y y = c Andrea Walther 14 / 28 Solving large-scale inverse problems c May 25th, 2011 An AD-based Solution Approach Reverse Mode AD = Discrete Adjoints x̄ F ȳ > ȳ x x) F( = F̄ ȳ > y y = c c x̄ ≡ ȳ > F 0 (x) = ∇x h ȳ > F (x) i ≡ F̄ (x, ȳ ) Andrea Walther 14 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Algorithmic Differentiation (AD) I Differentiation of “computer programs” within machine precision I Evaluation of derivatives with working accuracy I Forward mode: Reverse mode: Combination: OPS(F 0 (x)ẋ) OPS(ȳ > F 0 (x)) MEM(ȳ > F 0 (x)) OPS(ȳ > F 00 (x)ẋ) ≤ ≤ ∼ ≤ I Tools: ADOL-C, CppAD, Tapenade, . . . I www.autodiff.org, (Griewank, Walther 08) Andrea Walther 15 / 28 c OPS(F ), c OPS(F ), OPS(F ), c OPS(F ), Solving large-scale inverse problems c ∈ [2, 5/2] c ∈ [3, 4] c ∈ [7, 10] May 25th, 2011 An AD-based Solution Approach Algorithmic Differentiation (AD) I Differentiation of “computer programs” within machine precision I Evaluation of derivatives with working accuracy I Forward mode: Reverse mode: Combination: OPS(F 0 (x)ẋ) OPS(ȳ > F 0 (x)) MEM(ȳ > F 0 (x)) OPS(ȳ > F 00 (x)ẋ) ≤ ≤ ∼ ≤ I Tools: ADOL-C, CppAD, Tapenade, . . . I www.autodiff.org, (Griewank, Walther 08) c OPS(F ), c OPS(F ), OPS(F ), c OPS(F ), c ∈ [2, 5/2] c ∈ [3, 4] c ∈ [7, 10] Remarks: I Cost for gradient calculation independent of # variables I Memory requirement may cause problem! ⇒ Checkpointing Andrea Walther 15 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Store-Everything Approach Example: 12 time steps 1 10 20 t 1 5 10 l Andrea Walther 16 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Store-Everything Approach Example: 12 time steps 1 10 20 t 1 5 10 l MEM = O(l), TIME = 2l, might cause problems, even if it fits in memory Andrea Walther 16 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Binomial Checkpointing Example: 12 time steps, 4 checkpoints, reusage of all checkpoints! 1 10 20 30 40 t 1 5 10 l Andrea Walther 17 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Binomial Checkpointing Example: 12 time steps, 4 checkpoints, reusage of all checkpoints! 1 10 20 30 40 t 1 5 10 l MEM = c, TIME = ? Andrea Walther 17 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Checkpointing Theory Goal: Minimal number of recomputations for c checkpoints Available results: I l known, constant step costs (Griewank ’92), (Griewank, Walther ’00), (Kowarz, Walther ’07) I l known, variable step costs (Walther ’00), (Hinze, Sternberg ’05) I l unknown, constant step costs (Hinze, Walther, Sternberg ’06), (Stumm, Walther ’10), (Moin, Wang ’10) l known, variable access cost (Stumm, Walther ’09) I Andrea Walther 18 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Checkpointing Theory Goal: Minimal number of recomputations for c checkpoints Available results: I l known, constant step costs (Griewank ’92), (Griewank, Walther ’00), (Kowarz, Walther ’07) I l known, variable step costs (Walther ’00), (Hinze, Sternberg ’05) I l unknown, constant step costs (Hinze, Walther, Sternberg ’06), (Stumm, Walther ’10), (Moin, Wang ’10) l known, variable access cost (Stumm, Walther ’09) I Also applicable for continuous adjoints! Implemented in software driver revolve Andrea Walther 18 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Software Components for Inverse Problem I own simulation code in C++ using MPI I ADOL-C for the computation of adjoints for one time step I revolve for checkpointing of time loop I coupling with L-BFGS I coupling with Ipopt (goal: also optimizer in parallel) Andrea Walther 19 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Scaling of Gradient Computation weak scaling: same amount of work per processor Andrea Walther 20 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Runtime Gradient I 700 27 cores 64 cores 125 cores 216 cores 343 cores 512 cores 729 cores 1000 cores 600 overall runtime [s] 500 400 300 200 100 0 Andrea Walther 1003 2003 2503 3003 # inner grid cells 21 / 28 3503 Solving large-scale inverse problems 4003 May 25th, 2011 An AD-based Solution Approach Runtime Quotient I factor between forward and gradient computation 40 30 25 20 15 10 5 0 Andrea Walther 27 cores 64 cores 125 cores 216 cores 343 cores 512 cores 729 cores 1000 cores 35 103 403 # grid cells per core 22 / 28 503 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Runtime Quotient II 50 27 cores 64 cores 125 cores factor between forward and gradient computation 45 40 35 30 25 20 15 10 5 40 Andrea Walther 60 80 100 120 140 160 # grid cells per core 23 / 28 180 200 220 Solving large-scale inverse problems 240 May 25th, 2011 An AD-based Solution Approach Test Problem Andrea Walther 24 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Test Problem 3 ’data/test14t01_eps_r_referenceXYat50.txt’ matrix 3 ’data/test14t01_eps_r_startXYat50.txt’ matrix 90 90 80 80 50 2 ¡r 60 40 50 2 40 30 30 1.5 1.5 20 20 10 10 0 ¡r 2.5 70 60 Cells Cells 2.5 70 0 10 20 30 40 50 Cells 60 70 80 90 1 0 0 10 20 30 40 50 Cells 60 70 80 90 1 reference and starting point Andrea Walther 25 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Test Problem 3 ’data/test14t01_eps_r_referenceXYat50.txt’ matrix 3 ’data/test14t04_eps_r_optXYat50.txt’ matrix 90 90 80 80 50 2 ¡r 60 40 50 2 40 30 30 1.5 1.5 20 20 10 10 0 εr 2.5 70 60 Cells Cells 2.5 70 0 10 20 30 40 50 Cells 60 70 80 90 1 0 1 0 10 20 30 40 50 Cells 60 70 80 90 reference and final point after 60 iterations for 729000 unknowns Andrea Walther 26 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach Test Problem T R 2.4 2.2 εr 2 1.8 1.6 1.4 1.2 1 0 20 40 60 80 100 120 140 Cells target and reconstructed permittivity Andrea Walther 26 / 28 Solving large-scale inverse problems May 25th, 2011 An AD-based Solution Approach First Tests of Regularizations So far: Treated as PDE constrained optimization problem Now: Add appropriate regularisation Implemented: k.k∗ = k.k2 and k.k∗ = k.kTV no reg L2 TV Andrea Walther iter 70 70 70 function value 7.5084540e-03 3.9867834e-03 3.9868059e-03 27 / 28 β 0.0 2.0 8.0 Solving large-scale inverse problems May 25th, 2011 Conclusion and Outlook Conclusions I Simulation in parallel for 3D ⇒ large-scale discretizations I Gradient in parallel for 3D ⇒ large-scale discretizations I Coupling with L-BFGS and recently with Ipopt I First tests with respect to regularisations I Next steps: I infinite dimensional setting ? I appropriate regularization techniques ? I globalization strategies ? Andrea Walther 28 / 28 Solving large-scale inverse problems May 25th, 2011