Exploiting parallel computing in Discrete Fracture Network simulations: an inherently parallel optimization approach

Stefano Berrone
stefano.berrone@polito.it

Team: Matías Benedetto, Andrea Borio, Claudio Canuto, Corrado Fidelibus, Sandra Pieraccini, Stefano Scialò, Marco Tamburini, Fabio Vicini

Dipartimento di Scienze Matematiche
Workshop HPC@POLITO - 12 March 2014

Stefano Berrone (DISMA) - Parallel Computing and DFN simulations - HPC@POLITO - 12 March 2014

Outline

1. Introduction to DFNs
2. Scalability
   - Parallel environment
   - Scalability results
3. Uncertainty Quantification in Discrete Fracture Network Models
   - Stochastic fracture transmissivity
   - Stochastic geometry

Discrete Fracture Networks (DFN)

DFN models can be used to simulate underground flows in fractured media.

Possible applications: simulation of the underground displacement of pollutants, water in aquifers, supercritical carbon dioxide...
Main features of the DFN models addressed herein:
- 3D network of intersecting fractures
- Fractures represented as planar polygons
- The quantity of interest is the hydraulic head, evaluated with Darcy's law
- Flux balance and hydraulic head continuity imposed across the traces (fracture intersections)

Challenges:
- Very large domain: high computational cost and memory requirements
- Complex domain: good-quality meshes are difficult to generate

The problem of mesh generation

Typical approach:
- Discretization with finite elements on totally conforming meshes
- Geometry modifications to conform fracture intersections to the mesh
- Resolution of a large linear system
- Mortar methods to allow partial non-conformity of the mesh at fracture intersections

Our approach [B., Pieraccini, Scialò (SISC2013a-b, JCP2014)]: reformulate the problem as a PDE-constrained optimization problem
- Independent mesh on each fracture
- Coupling given by the minimization of a cost functional
- Independent resolution of small linear systems on the fractures

Problem formulation: a coupled system of PDEs

Local problems

For all $i \in I$, find $h_i = h_i^0 + \mathcal{R} h_{iD}$, with $h_i^0 \in V_i := H^1_{D,0}(F_i)$, such that

$$(K_i \nabla h_i^0, \nabla v) = (q_i, v) + \langle u_i, v|_{S_i} \rangle_{U^{S_i}, (U^{S_i})'} + \langle h_{iN}, v|_{\Gamma_{iN}} \rangle_{H^{-1/2}(\Gamma_{iN}), H^{1/2}(\Gamma_{iN})} - (K_i \nabla \mathcal{R} h_{iD}, \nabla v) \quad \forall v \in V_i,$$

where $q_i \in L^2(F_i)$ is a source term and $\partial F_i = \Gamma_{iN} \cup \Gamma_{iD}$, with $\Gamma_{iN} \cap \Gamma_{iD} = \emptyset$ and $\Gamma_{iD} \neq \emptyset$.
Coupling (matching) conditions

$$h_i|_{S_m} - h_j|_{S_m} = 0, \quad \text{for } i, j \in I_{S_m}, \ \forall m \in M,$$
$$u_i^{S_m} + u_j^{S_m} = 0, \quad \text{for } i, j \in I_{S_m}, \ \forall m \in M.$$

PDE-constrained optimization approach [BPS 2013-2014]

Let $u$ be the tuple of all control variables $u_i$, $i \in I$, and define $J : U \to \mathbb{R}$ as

$$J(u) = \sum_{S \in \mathcal{S}} \left( \| h_i(u_i)|_S - h_j(u_j)|_S \|^2_{H^S} + \| u_i^S + u_j^S \|^2_{U^S} \right),$$

where $\mathcal{S}$ is the set of all traces.

- $U^S = H^{-1/2}(S)$, $H^S = H^{1/2}(S) = (U^S)'$
- The discrete problem is written as a constrained minimization problem
- A gradient-like method is applied
- $U^S = L^2(S)$, $H^S = L^2(S) = (U^S)'$

Remark: the core of the solution process is the (repeated) solution of local and independent problems on each fracture. This makes the approach nearly inherently parallel.

The space discretization: XFEM

Extended Finite Elements have been used in [BPS 2013-2014] to capture the nonsmooth behavior of the solution across the traces without losing accuracy, while preserving a fully independent meshing process on the fractures. Suitable shape functions are added, which mimic the nonsmooth behavior of the solution.

[Figure: XFEM enrichment near a trace; legend: interface S, enriched DoF, reproducing elements, blending elements]

The space discretization: VEM

In [BBPS 2014] the flexibility of VEM has been used to capture the behavior of the solution across the traces, allowing for a partially conforming mesh while still maintaining an independent meshing process on each fracture.
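As a rough illustration of the optimization formulation above, the following Python sketch minimizes a two-fracture analogue of J with a plain steepest-descent iteration. Everything in it is an assumption made for illustration and is not the authors' implementation: each fracture is reduced to a 1D interval discretized with finite differences, the trace is a single grid node, the control u_i enters as a point source at that node, the data are made up, and the names local_solve, FRACTURES, residual and minimize_J are hypothetical. What it is meant to show is that every evaluation of the functional requires only independent local solves, one per fracture.

```python
import numpy as np

def local_solve(K, q, u, j, h_left, h_right, n=49):
    """Toy local problem: -K h'' = q + u * delta(x - x_j) on (0, 1),
    Dirichlet data at both ends, n interior finite-difference nodes."""
    dx = 1.0 / (n + 1)
    A = (K / dx**2) * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
    b = np.full(n, float(q))
    b[j] += u / dx                    # control acting as a point source on the trace
    b[0] += K * h_left / dx**2        # Dirichlet lifting
    b[-1] += K * h_right / dx**2
    return np.linalg.solve(A, b)

# Hypothetical data for two "fractures" sharing one trace (node j on each).
FRACTURES = [
    dict(K=1.0, q=0.0, j=10, h_left=1.0, h_right=0.0),
    dict(K=0.1, q=1.0, j=30, h_left=0.0, h_right=0.0),
]

def residual(u):
    """Head mismatch on the trace and flux imbalance; J(u) = residual(u) @ residual(u).
    The two local solves are independent and could run on different processes."""
    heads = [local_solve(u=u_i, **f)[f["j"]] for u_i, f in zip(u, FRACTURES)]
    return np.array([heads[0] - heads[1], u[0] + u[1]])

def minimize_J(tol=1e-10, max_iter=1000):
    u = np.zeros(2)
    r0 = residual(u)
    # The residual is affine in u, so its Jacobian follows from two extra evaluations.
    M = np.column_stack([residual(e) - r0 for e in np.eye(2)])
    for _ in range(max_iter):
        r = residual(u)
        g = 2.0 * M.T @ r             # gradient of J
        if np.linalg.norm(g) < tol:
            break
        Mg = M @ g
        u = u - (g @ g) / (2.0 * (Mg @ Mg)) * g   # exact line search for a quadratic J
    return u, residual(u)

if __name__ == "__main__":
    u_opt, r_opt = minimize_J()
    print("controls:", u_opt)
    print("head mismatch, flux imbalance:", r_opt)
```

With these made-up data the iteration drives both the head mismatch at the trace and the flux imbalance u_1 + u_2 towards zero. Steepest descent is used here only because it is short to write; it stands in for the gradient-like method mentioned on the slide.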
Part I - Scalability

Parallel Approach [BSTV]

Serial algorithm
1. Build the triangulation on the fractures
2. Evaluate discrete operators on the fractures
3. Solve with a conjugate gradient (CG) iterative method
4. Post-processing - graphical representation

Parallel implementation with MPI (Octave + mpi/openmpi_ext package)
1. Partitioning the DFN
2. Organize communications
3. Preliminary results

Parallel implementation - MPI

1. Partitioning the DFN
- Let Np be the number of available processes
- The DFN is seen as a graph in which fractures are the nodes and traces are the edges
- Balanced graph partitioning: split the DFN into subsets of fractures Pk, k = 1, ..., Np − 1, in such a way that the workload of the Np − 1 processes is balanced and the number of cut edges (i.e.
communications) is minimized

Figure: Balanced graph partitioning (legend: fracture, trace)

Parallel implementation - MPI

2. Organize communications
- Hierarchical (modular) Master/Slave structure

Figure: The Master/Slave architecture

Scalability results 1

3. Preliminary results
Scalability tests were performed on a machine with two 6-core processors (12 physical, 24 virtual cores), before the implementation on the cluster.

Figure: Time reduction factor vs number of Slave processes - CG algorithm
Figure: Processes independence test

Scalability results 2

Observations
- The full algorithm scales worse than the CG resolution phase:
  - it is more inherently parallel, but has a higher memory occupation
  - conflicts in accessing memory have a more relevant effect

Future developments
- Non-blocking communication routines
- Implementation in C
- Implementation on the cluster

Figure: Time reduction factor vs number of Slave processes - full algorithm

Part II - Uncertainty Quantification

Motivation

DFN simulations are of particular interest in situations where the discrete nature of the fractures strongly affects the directionality of the flow. DFNs are usually applied to simulate the underground displacement of pollutants, water or supercritical carbon dioxide. The simulations mainly aim at estimating the magnitude of the flux, its resulting directionality, and the characteristic time. In this context we take as the characteristic quantity the flux reaching a fixed boundary of the DFN (fractures F2 and F3).
A possible scenario is as follows:
- Assume information is available about the probability distribution of certain fracture features, such as their density, orientation, size, aspect ratio and aperture (these data may affect transmissivity).
- Assume a borehole is pumping some fluid underground (e.g., carbon dioxide).
- We are interested in evaluating the probability that the flux of carbon dioxide reaches a certain region of the underground basin, where a large outcropping fault can be a carrier for a dangerous leakage.

Toy networks for our numerical experiments [BCPS 2014]

Figure: An example of toy network. Left: 3D view of the network (traces in red); right: projection on the x1-x2 plane

General setting

We consider several toy networks with:
1. A horizontal fracture F1
2. Two vertical orthogonal fractures F2, F3 (with reference to the previous 2D figure: F2 on the right of F1, F3 above F1)
3. A set of additional fractures connecting the network, orthogonal to F1, and with arbitrary orientation.

Boundary conditions:
1. The east edge of F1 acts as a source (Neumann boundary condition set to 10)
2. Constant non-homogeneous Dirichlet boundary conditions are set on the south edge of F2 and on the west edge of F3
3. All other edges are assumed to be insulated: homogeneous Neumann conditions.

We consider the problem of measuring the overall flux entering fractures F2 and F3, respectively, through their traces.

Strategy for UQ

- Monte Carlo methods
- Stochastic collocation approach (Smolyak-type sparse grids): moments (mean values, variance) are computed with properly chosen weighted quadrature formulas, the weight functions corresponding to the underlying probability law.
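The two strategies listed above can be compared on a one-parameter stand-in problem. The sketch below is only an illustration under stated assumptions: the quantity of interest g(y) = exp(y) with y uniform on (0, 1) is a hypothetical placeholder for the flux returned by a full network simulation, a one-dimensional Gauss-Legendre rule plays the role of the collocation grid (no Smolyak construction is needed for a single parameter), and the function names are invented for this example.

```python
import numpy as np

def g(y):
    """Hypothetical scalar quantity of interest; in the DFN setting each
    evaluation would be one simulation on the whole network."""
    return np.exp(y)

def monte_carlo_moments(n_samples, seed=0):
    """Mean and variance estimated from uniform random samples on (0, 1)."""
    rng = np.random.default_rng(seed)
    vals = g(rng.random(n_samples))
    return vals.mean(), vals.var()

def collocation_moments(n_nodes):
    """Mean and variance from a Gauss-Legendre rule weighted by the uniform density."""
    x, w = np.polynomial.legendre.leggauss(n_nodes)  # nodes and weights on [-1, 1]
    y = 0.5 * (x + 1.0)                              # map the nodes to [0, 1]
    w = 0.5 * w                                      # weights now sum to 1
    vals = g(y)
    mean = w @ vals
    var = w @ (vals - mean) ** 2
    return mean, var

# Exact moments of exp(Y), Y ~ U(0, 1), for comparison.
exact_mean = np.e - 1.0
exact_var = 0.5 * (np.e**2 - 1.0) - exact_mean**2

if __name__ == "__main__":
    for n in (2, 4, 8):
        m, v = collocation_moments(n)
        print(f"collocation, {n} nodes: errors {abs(m - exact_mean):.1e}, {abs(v - exact_var):.1e}")
    for n in (100, 10000):
        m, v = monte_carlo_moments(n)
        print(f"Monte Carlo, {n} samples: errors {abs(m - exact_mean):.1e}, {abs(v - exact_var):.1e}")
```

For a smooth dependence on the parameter, a handful of collocation nodes is typically enough, whereas the Monte Carlo error decays only with the square root of the number of samples. This matters when every evaluation of g is an expensive simulation.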
In both cases, each outcome contributing to the computation of the moments corresponds to a simulation on the whole network.

Two different frameworks:
- Deterministic geometry, stochastic transmissivity
- Deterministic transmissivity, stochastic geometry

We consider #I = 7 fractures. We set

$$K_1 = K_2 = K_3 = 10^{\bar{L}}, \qquad K_i = 10^{L_{\min} + (L_{\max} - L_{\min}) Y_i}, \quad Y_i \sim U(0, 1),$$

for $i = 4, \dots, N$, with $L_{\min} = -4$, $L_{\max} = 0$, $\bar{L} = -2$.

Figure: DFN configuration

Figure: Flux entering F2 (left) and F3 (right) versus one selected stochastic variable y_i, the one associated with the fracture closest to the intersection F2 ∩ F3 (all others set to 0.5)

Stochastic geometry

We consider a geometry in which the orientation of the fractures is non-deterministic. K is fixed for all fractures. Fracture F_i, for i ≥ 4, forms an angle α_i with the x1 axis given by

$$\alpha_i = \bar{\alpha}_i + \Delta\alpha_i (2y - 1), \quad y \sim U(0, 1).$$

Figure: Non-deterministic configuration.

Figure: Flux entering F2 (left) and F3 (right) versus the stochastic variable y

Another test geometry

Figure: Non-deterministic configuration: increasing some fractures' length.

Figure: Flux entering F2 (left) and F3 (right) versus the stochastic variable y

Figure: Errors on mean value (left) and variance (right) versus number of grid points.
Stochastic collocation (solid lines) vs Monte Carlo (dotted lines)

Stochastic collocation on sparse grids

Number of outcomes to be considered for different numbers d of stochastic parameters:

Level    d=1    d=2    d=3    d=4     d=6
0          1      1      1      1       1
1          3      5      7      9      13
2          7     17     31     49      97
3         15     49    111    209     545
4         31    129    351    769    2561
5         63    321   1023   2561   10625
6        127    769   2815   7937   40193
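The counts in the table can be reproduced with a short Python sketch. It assumes a Smolyak construction built on a nested one-dimensional rule with 2^(l+1) − 1 points at level l, which matches the d=1 column. The slides do not name the one-dimensional rule, so this is an inference from the numbers, and the function names are hypothetical. For nested rules, the sparse grid at a given level is the disjoint union, over all multi-indices whose entries sum to at most that level, of the tensor products of the points newly added at each one-dimensional level.

```python
from itertools import product
from math import comb

def new_points_1d(l):
    """Points added at 1D level l by the assumed nested rule with
    2**(l + 1) - 1 points: 1 point at level 0, then 2**l new points."""
    return 2**l

def sparse_grid_size(level, d):
    """Count sparse-grid points by enumerating multi-indices with |l|_1 <= level."""
    total = 0
    for idx in product(range(level + 1), repeat=d):
        if sum(idx) <= level:
            n = 1
            for l in idx:
                n *= new_points_1d(l)
            total += n
    return total

def sparse_grid_size_closed_form(level, d):
    """Same count: there are comb(k + d - 1, d - 1) multi-indices with |l|_1 = k,
    each contributing 2**k points."""
    return sum(comb(k + d - 1, d - 1) * 2**k for k in range(level + 1))

if __name__ == "__main__":
    dims = (1, 2, 3, 4, 6)
    print("Level " + " ".join(f"{'d=' + str(d):>7}" for d in dims))
    for level in range(7):
        print(f"{level:<5} " + " ".join(f"{sparse_grid_size(level, d):>7}" for d in dims))
```

For comparison, a full tensor grid at level 6 with d=6 would need 127^6 (about 4.2e12) simulations under the same one-dimensional rule, against 40193 for the sparse grid.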