Workshop Fast BEM & BETI
June 18-19, 2012
IT4Innovations, Department of Applied Mathematics, VŠB-TU Ostrava

This project is co-financed by the ESF and the state budget of the Czech Republic.

Foreword

The aim of this workshop and of the project that funds it is to bring foreign specialists on boundary integral equations (BIE) and boundary element methods (BEM) to Ostrava, to exchange experience, and to initiate new research cooperation. We are glad that our invitation was accepted by Dr. Guenther Of from University of Technology in Graz, Dipl.-Ing. Michael Karkulik from Technical University of Vienna, and Dr. Lehel Banjai from Max-Planck Institute in Leipzig. They will give presentations on important topics such as Fast Multipole BEM, BETI, adaptive BEM, and BIE for time-domain problems. The programme will be completed by a contributed talk by Professor Kompiš from the Academy of Armed Forces at Liptovský Mikuláš, Slovakia, and by contributed talks by domestic researchers.

We hope that the workshop will motivate new students and researchers to join the growing BEM community. We also hope that the discussions will bring new ideas and lead to new scientific cooperation and results. We hope you enjoy this event.

On behalf of the organizers
Dalibor Lukáš and Tomáš Kozubek

Programme

Monday, June 18
 8:00  Registration
 9:00  Opening
 9:10  G. Of – FMM BEM
10:10  coffee break
10:40  M. Karkulik – Adaptive BEM
11:40  lunch
13:00  V. Kompiš – MFS
13:30  J. Zapletal – BEM for 3D Helmholtz
14:00  coffee break
14:30  D. Lukáš – Parallel BEM
15:00  P. Kovář – Decomposition of Complete Graphs

Tuesday, June 19
 9:00  G. Of – BETI
10:00  coffee break
10:30  T. Kozubek – Generalized Inverses
11:00  O. Vlach – FETI with Mortar Elements
11:30  lunch
13:00  M. Jarošová – Hybrid TFETI
13:20  M. Merta – Massively Parallel TFETI for Medical Image Registration
13:40  M. Čermák – TFETI in Elastoplasticity
14:00  coffee break
14:30  L. Banjai – Time-Domain BEM
Abstracts of invited talks

G. Of
An Introduction to Fast Multipole Boundary Element Methods

While boundary element methods reduce a boundary value problem to the boundary and therefore decrease the number of unknowns of the discretized problem, the resulting matrices are dense and the computational complexity of standard boundary element methods is at least quadratic in the number of unknowns. Data-sparse methods reduce the memory requirements and the computational effort to almost linear complexity. A hierarchical clustering of the boundary element mesh induces a hierarchical partitioning of the matrix. Geometric criteria, the so-called admissibility conditions, are used to select the blocks of the matrix that can be approximated by low-rank matrices. After some motivating examples of such approximations, the clustering of the boundary element mesh and the hierarchical partitioning are discussed. The Fast Multipole Method uses a kernel expansion to construct low-rank approximations of admissible blocks. In addition, it exploits the hierarchical cluster structure in an advanced manner to reduce the asymptotic complexity further. We illustrate the related algorithm of the Fast Multipole Method. Finally, we present numerical examples with several hundred thousand boundary elements to show the efficiency of data-sparse boundary element methods.

M. Feischl, M. Karkulik, J.M. Melenk, G. Of, T. Fuehrer, and D. Praetorius
A Survey on Adaptivity in Boundary Element Methods

Adaptivity has become a fundamental instrument for the efficient numerical solution of partial differential equations. In traditional FEM/BEM analysis, meshes are often designed based on a priori information and experience.
In contrast to that, adaptive procedures try to automatically refine or coarsen a mesh or adjust the underlying basis functions to achieve a good solution: Starting with a given initial mesh $\tau_0$ and based on certain refinement indicators, usual adaptive algorithms of the type

solve – estimate – mark – refine    (1)

mark a set of elements $M_l \subset \tau_l$ for refinement, use a refinement rule to generate a mesh $\tau_{l+1} := \mathrm{refine}(\tau_l, M_l)$, where at least the marked elements $T \in M_l$ are refined, and iterate. In the presence of singularities of the unknown exact solution or of the given data, uniform meshes are suboptimal. However, adaptive methods regain the optimal rate of convergence in numerical experiments. For finite element methods, adaptive algorithms and their different components (1) have been studied since the early 1980s, starting with the pioneering works of Babuška on a posteriori error estimates. The last decade saw a major breakthrough in the mathematical understanding of quasi-optimality of adaptive FEM with the works [1,3]. In BEM, the situation is less developed. The underlying integral operators and the natural norms are non-local, which induces additional difficulties in the design of a posteriori error estimates. In this talk, we give a short overview of the state of the art in adaptive BEM. We present certain a posteriori error estimates, see [2], and discuss the design of the different components of (1). We will comment on the quasi-optimality [4] and give some numerical examples to illustrate the performance of adaptive BEM.

[1] Peter Binev, Wolfgang Dahmen, and Ron DeVore. Adaptive finite element methods with convergence rates. Numer. Math., 97(2):219-268, 2004.
[2] Carsten Carstensen and Birgit Faermann. Mathematical foundation of a posteriori error estimates and adaptive mesh-refining algorithms for boundary integral equations of the first kind. Eng. Anal. Bound.
Elem., 25:497-509, 2001.
[3] J. Manuel Cascon, Christian Kreuzer, Ricardo H. Nochetto, and Kunibert G. Siebert. Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal., 46(5):2524-2550, 2008.
[4] M. Feischl, M. Karkulik, J. M. Melenk, and D. Praetorius. Quasi-optimal convergence rate for an adaptive boundary element method. Submitted.

G. Of
The Boundary Element Tearing and Interconnecting Method

The Boundary Element Tearing and Interconnecting (BETI) method is the boundary element counterpart of the well-known Finite Element Tearing and Interconnecting (FETI) method. BETI and FETI methods are domain decomposition methods which solve large-scale problems based on solvers of local subproblems and provide efficient global preconditioners built from such local solvers. The main idea of these methods is the tearing of the global trial functions into local ones and the subsequent interconnecting of the local degrees of freedom across the interfaces by means of constraints and Lagrange multipliers to enforce continuity. After a derivation of the standard BETI formulation, we discuss the treatment of so-called floating subdomains. These are subdomains which have no Dirichlet boundary conditions, so that the local subproblems are not uniquely solvable. This can be handled by appropriate projection operators. In addition, we consider the Total BETI or All-floating BETI formulation, which treats all subdomains as floating and incorporates the Dirichlet boundary conditions by additional Lagrange multipliers. We discuss the advantages of this formulation and present numerical examples on the efficiency of the BETI method.

L. Banjai
Time-Domain Boundary Integral Equations for Wave Propagation

We will give an introduction to the application and numerical analysis of time-domain boundary integral equations. The emphasis will be on acoustic wave scattering applications.
Both time-space Galerkin and convolution quadrature discretization methods will be introduced. If time allows, some more specialized topics such as fast implementation and more advanced applications will be discussed.

Abstracts of contributed talks

V. Kompiš
MFS for Modelling of Inhomogeneous Materials with Large Aspect Ratio Reinforcing Elements

1D continuous source functions, fundamental solutions and their derivatives located along fibre axes, are used to simulate the interactions of the matrix and the reinforcing elements in composite materials when the primary field is a scalar function (e.g. temperature in heat conduction) or a vector function (e.g. displacement in elasticity). The inter-domain continuity is specified at discrete points on the fibre boundaries. Intensities of the source functions are defined by 1D NURBS and computed in the least-squares (LS) sense in the fibres. The inter-domain continuity equations have to be completed by balance equations (energy, equilibrium, etc.) in order to obtain the temperature, displacement, etc. at the centre of each fibre. Gradients of displacements (strains) and of temperature are assumed to be constant in the cross-sections of the fibres and are computed iteratively, taking them to be linear along the fibres in the first step. Material properties of both matrix and fibres are assumed to be homogeneous and isotropic.

J. Zapletal
Boundary Element Method for the Helmholtz Equation in 3D

In the talk we present the application of the boundary element method to the solution of the Helmholtz equation in 3D. In contrast to the finite element method, one does not need to discretize the whole domain, and thus the problem dimension is reduced. This advantage is most pronounced when solving an exterior problem, i.e., a problem on an unbounded domain. We concentrate on the Galerkin approach known, e.g., from the finite element method and present a combination of analytic and numerical computation of the matrices generated by boundary integral operators.

D. Lukáš, P. Kovář, and T.
Kovářová
Parallel Fast BEM for Distributed Memory Systems

We consider the Galerkin boundary element method (BEM) accelerated by means of hierarchical matrices (H-matrices) and adaptive cross approximation (ACA). This leads to almost linear complexity $O(n \log n)$ of a serial code, where $n$ denotes the number of boundary nodes or elements. Once the setup of an H-matrix is done, parallel assembling is straightforward via a load-balanced distribution of the admissible (far-field) and nonadmissible (near-field) parts of the matrix to $N$ concurrent processes. With this traditional approach the computational complexity scales as $O((n \log n)/N)$. However, the boundary mesh is shared by all processes. We propose a novel method which leads to memory scalability $O((n \log n)/\sqrt{N})$, which is optimal due to the dense nature of BEM matrices. The method relies on our recent results on cyclic decompositions of undirected graphs. Each process is assigned a subgraph and the related boundary submesh. The parallel scalability of the Laplace single-layer matrix is documented on a distributed memory computer with up to 133 cores and millions of boundary triangles.

P. Kovář, T. Kovářová, M. Kravčenko, and D. Lukáš
Decomposition of Complete Graphs into Dense Subgraphs

When solving large systems of equations it is natural to decompose the corresponding large matrices into smaller (sub)matrices and to parallelize the computation on a cluster with many nodes. If the parallel machine uses distributed memory, further requirements on the decomposition arise. For simplicity, let $A$ be a large full matrix with $n \times n$ blocks $B_{ij}$. We want to choose $n$ sets $C_1, C_2, \dots, C_n$ of blocks so that each block $B_{ij}$ of $A$ belongs to some set $C_k$ and the maximum number of different block subscripts in each $C_k$ is as small as possible.
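The covering problem stated above can be illustrated on a tiny case. The following is a minimal sketch, not the authors' construction: it assumes $n = 7$ and builds each $C_k$ by shifting the index set {0, 1, 3} modulo 7 (a standard perfect difference set); the function name `cyclic_block_sets` and all variable names are hypothetical.

```python
from itertools import product

def cyclic_block_sets(n, base):
    """Build n sets of block indices (i, j) by shifting `base` cyclically mod n."""
    sets = []
    for k in range(n):
        idx = [(b + k) % n for b in base]
        # C_k contains every block whose row and column subscripts both lie in idx.
        sets.append(set(product(idx, repeat=2)))
    return sets

n = 7
base = (0, 1, 3)  # assumption for illustration: perfect difference set modulo 7
sets = cyclic_block_sets(n, base)

all_blocks = set(product(range(n), repeat=2))
covered = set().union(*sets)

# Largest number of distinct block subscripts touched by any single C_k.
max_subscripts = max(len({i for i, _ in c} | {j for _, j in c}) for c in sets)

# How many sets C_k contain each off-diagonal block B_ij.
off_diag_multiplicity = {
    (i, j): sum((i, j) in c for c in sets)
    for (i, j) in all_blocks
    if i != j
}

print(covered == all_blocks, max_subscripts, set(off_diag_multiplicity.values()))
```

In this sketch every set touches only three subscripts, and no smaller number can work: a set with $m$ subscripts holds at most $m^2$ blocks, so seven sets covering all 49 blocks need $m \ge 3$. Diagonal blocks appear in several sets and would be assigned to one of them in practice.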
We rephrase the task in the language of graph decompositions and, for certain values of $n$, also as a number-theoretic problem of perfect difference sets. We present some constructions of decompositions of complete graphs $K_n$ into small dense graphs that can be used to solve the problem above. The decompositions have been implemented and successfully tested for fast BEM matrices of size up to millions, distributed to hundreds of nodes.

T. Kozubek
An Efficient Regularization of Generalized Inverses Arising in Engineering Problems

Applying FETI to linearized problems of solid mechanics leads to a block diagonal stiffness matrix, where each block corresponds to a subdomain. All blocks are sparse positive semidefinite matrices with known kernel bases formed by the rigid body modes. This is a great advantage because all blocks can be effectively regularized without extra fill-in and then decomposed using any standard sparse Cholesky-type factorization method for nonsingular matrices. Using this approach we completely avoid problems with zero pivots.

O. Vlach, Z. Dostál, and T. Brzobohatý
On Effective Implementation of the Non-Penetration Condition for Non-Matching Grids Preserving Scalability of FETI Based Algorithms

The point of this paper is to extend our results obtained for elastic contact problems to contact problems with non-matching grids, which necessarily emerge, e.g., in the solution of transient contact problems or in contact shape optimization. We want to get a good approximation and a constraint matrix $B$ with nearly orthogonal rows. We consider both standard engineering approaches, such as node-to-segment, and mortar elements. We give simple bounds on the singular values of the resulting matrix $B$ and results of numerical experiments, including both academic examples and some problems of practical interest.
We conclude that the normalized orthogonal mortars proposed by Wohlmuth can be used to approximate the non-penetration conditions in a way that complies with the requirements of the FETI methods.

M. Jarošová, M. Menšík, A. Markopoulos, and T. Brzobohatý
Hybrid Total FETI

We propose a hybrid FETI method based on our variant of the FETI-type domain decomposition method called Total FETI. Our hybrid method was developed in an effort to overcome the bottleneck of classical FETI methods, namely the bound on the dimension of the coarse space due to memory requirements. We first decompose the domain into relatively large clusters that are completely separated, and then we decompose each cluster into smaller subdomains that are joined partly by Lagrange multipliers $\lambda_0$ in selected interface variables, or in averages if a transformation of basis is applied. The continuity in the rest of the interface variables and also the Dirichlet condition are enforced by Lagrange multipliers $\lambda_1$. This decomposition leads to an algorithm in which TFETI is used on two levels. The results of numerical experiments on a linear elasticity benchmark will conclude the talk.

M. Merta, A. Vašatová, V. Hapla, and D. Horák
Massively Parallel Implementation of Total-FETI DDM with Applications to Medical Image Registration

The FETI (Finite Element Tearing and Interconnecting) method has turned out to be one of the most successful methods for the parallel solution of elliptic partial differential equations. FETI-1 is based on the decomposition of the spatial domain into non-overlapping subdomains that are "glued" by Lagrange multipliers. Total-FETI (TFETI) by Dostál et al. simplifies the inversion of the stiffness matrices of subdomains by using Lagrange multipliers not only for gluing the subdomains along the auxiliary interfaces, but also to enforce the Dirichlet boundary conditions.
Thus the bases of the kernels of all subdomain stiffness matrices are known a priori and can be assembled directly from mesh data. In this work we compare two parallel implementations of the TFETI method, based on the PETSc and Trilinos software frameworks. Both libraries are widely used for the development of scientific codes. While PETSc is written almost entirely in C, Trilinos uses features of modern C++, including templates and object-oriented design. We focus on the parallel efficiency of both codes, mainly on the solution of the coarse problem and the action of orthogonal projectors, which seem to be the main bottlenecks of parallel TFETI implementations. Although the usual applications of the TFETI method lie in the fields of material sciences and structural mechanics, we demonstrate the applicability of our codes to the registration of computer tomography and magnetic resonance imaging data using an elastic registration method. The numerical benchmarks were run on the HECToR supercomputer at EPCC in the UK, which is part of the PRACE HPC ecosystem.

M. Čermák
T-FETI Domain Decomposition Method for Solving Elasto-Plastic Problems

In this paper, we present an algorithm for the efficient parallel solution of elastoplastic problems with hardening, based on the T-FETI (Total Finite Element Tearing and Interconnecting) domain decomposition method. We consider an associated elastoplastic model with the von Mises plastic criterion and the linear isotropic hardening law. The model is discretized in time by the implicit Euler method, and the resulting one-time-step elastoplastic problem is discretized in space by the finite element method. The latter results in a system of nonlinear equations with a strongly semismooth and strongly monotone operator. The semismooth Newton method is applied to solve this nonlinear system.
The corresponding linearized problems arising in the Newton iterations are solved in parallel by the above-mentioned T-FETI domain decomposition method. We present benchmarks for an elastoplastic problem without contact and for an elastoplastic problem with contact.
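To make the semismooth Newton step concrete, the following toy sketch may help. It is an assumption-laden one-dimensional analogue, not the T-FETI algorithm of the talk: a single strain value is sought for which a one-step stress response with linear isotropic hardening (loaded from a virgin state) equals a prescribed stress. The material constants and the names `stress_and_tangent` and `semismooth_newton` are hypothetical.

```python
def stress_and_tangent(strain, E=200.0, H=50.0, sigma_y=1.0):
    """1D one-step response from a virgin state; returns (stress, tangent).

    E, H and sigma_y are assumed values of the elastic modulus, the hardening
    modulus and the initial yield stress.
    """
    trial = E * strain
    if abs(trial) <= sigma_y:
        # Elastic branch: the tangent is the elastic modulus.
        return trial, E
    sign = 1.0 if trial > 0 else -1.0
    # Plastic branch: return mapping with linear isotropic hardening.
    stress = sign * (H * abs(trial) + E * sigma_y) / (E + H)
    return stress, E * H / (E + H)

def semismooth_newton(target, strain=0.0, tol=1e-12, max_iter=50):
    """Solve stress(strain) = target using an element of the generalized derivative."""
    for iteration in range(max_iter):
        stress, tangent = stress_and_tangent(strain)
        residual = stress - target
        if abs(residual) <= tol:
            return strain, iteration
        # Newton update; the tangent is well defined on each smooth branch.
        strain -= residual / tangent
    raise RuntimeError("semismooth Newton did not converge")

strain, iterations = semismooth_newton(target=2.0)
print(strain, iterations)
```

Because the response in this sketch is piecewise linear and strictly increasing, the iteration terminates after a few steps once the iterate reaches the correct branch. In the finite element setting each update would instead require solving a linearized system, which is where a domain decomposition solver comes in.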