1 Computational Fluid Dynamics Reading Group: Finite Element Methods for Linear Elliptic Equations (Laplace) Fall 2009. 02/11/09 Dharshi Devendran, Sandra May & Eduardo Corona One of the most attractive features of …nite element methods is the rich mathematical theory behind them. Moreover, a deep understanding of this theory is germane not just to obtain error estimates, but in every aspect of its implementation. Our main objective is to present an introduction to Finite Element Methods applied to the study of ‡uids (in particular, to solve di¤erent instances of the Navier-Stokes equation). To achieve this objetive, we will …rst get acquainted with the key concepts of Finite Element Methods, in the context of solving the Laplace equation. By studying the weak (variational) formulation for this equation, we will see the Galerkin method arise as a natural discretization procedure, and Finite Element as an especially appealing instance of this procedure. 1.1 Variational Formulation So, let be a bounded domain in Rd with a smooth boundary. Let u 2 C 2 ( ) be a classical solution of the Laplace equation with Dirichlet boundary conditions: u = f in u=0 in @ Our goal is to obtain the weak formulation of this PDE. To do this, we multiply it on both sides by a test function, that is, 2 C01 ( ) and integrate over . Using Green’s identity (integration by parts) on the left side yields: Z Z u = f 8 2 C01 ( ) Z Z @u rur = @ @ Z Z rur = f Z f 8 2 C01 ( ) 8 2 C01 ( ) Where the boundary term vanishes, since is supported inside . Further, we can use the fact that C01 is dense in H01 ( ) (limits of test functions converge to H 1 functions with 0 trace in the H 1 norm) to obtain: Z Z rurv = f v 8v 2 H01 ( ) We notice that in order to make sense of the formulation above, we do not need u to be C 2 ( ): In fact, all we need is to ask is that ru 2 L2 and uj@ = 0 1 (in the sense of trace). That is, it su¢ ces to take u 2 H01 ( ). In a sense, this problem is much more "natural" than the classical formulation: one can show using the Lax-Milgram lemma that there exists a unique solution to the weak problem, and that if it is regular enough, it is also the unique classical solution. A similar thing can be done to most well-behaved linear elliptic equations, and it is known as the study of Dirichlet forms and weak formulation of elliptic PDEs. In the end, what we obtain is of the form: a(u; v) = f (v) 8v 2 V Where V is a Hilbert space, f 2 V (is a continuous functional), and a(u; v) is a continuous elliptic bilinear form. That is, De…nition 1 Let a : V V ! R be a bilinear form. Then, we say a is continuous i¤ 9M > 0 such that: a(u; v) M kukV kvkV We say a is elliptic (or coercive) if 9 8u; v 2 V > 0 such that: 2 a(u; u) kukV Although these conditions are the most general framework in which we can study this problem, we often also have that a is symmetric. Furthermore, R we>can see that all these hold true for the Laplace equation, taking a(u; v) = ru rv (continuity is trivial, and ellipticity comes from a Poincaré inequality). The key result is the following: Theorem 2 If a : V V ! R is a continuous, elliptic and symmetricpbilinear form, then a( ; ) is an inner product for V; and its induced norm k ka = a(u; u) is equivalent to k kV . That is, we can either work with a( ; ) or with the usual inner product, and use all the power of Hilbert spaces (e.g. Pythagoras, Riesz representation, etc). Additionally, a being symmetric implies that the weak problem is equivalent to the minimization problem: min fa(u; u) u2H01 ( ) f (u)g That is, the weak form (linear equation) is equivalent to a quadratic minimization problem. 2 1.2 Discretizing the weak formulation: Galerkin Method The question now is, how can we compute the weak solution? The idea is to solve a "discrete" problem on an adequately chosen …nite dimensional subspace Vh V : a(uh ; vh ) = f (vh ) 8vh 2 Vh We are now in the realm of linear algebra. Given of dimension P a subspace Vh P n; we choose a basis f k gnk=1 ; and write uh = k k k and vh = j j j . By bilinearity of a and linearity of f , X X j a( j ; k ) k = jf( j) j j;k Constructing the "sti¤ness" matrix A(j; k) = a( f ( j ); we reveal a linear system of equations: > A = > b j; k) and "load" vector b(j) = 8 2 Rn Since this is true for all , we have the linear system of equations: A =b A inherits the properties of a, and hence is a symmetric positive de…nite matrix (these properties can be checked by writing the properties of a for Vh ). One could say at this point, all that remains is to solve the system A = b using an adroit method such as conjugate gradient or cholesky. However, that would be cheating, since important questions remain: 1. How do I choose Vh ? What kind of error am I incurring by approximating with functions in this space? Can I have convergence of some sort? 2. Once I have chosen Vh , what basis should I pick? The answer to this questions will determine a particular instance of a Galerkin method, and in our case will lead us to de…ne what Finite Element is. The idea behind …nite element is to satisfy the following demands: The matrix A and vector f should be easy to assemble The matrix A should be as sparse as possible One should be able to compute / generate the basis quickly and cheaply. We note that the contributions to the entries of A and f are integrals over the support of the j s. Thus, a natural choice is to pick basis functions which are localized. As a result, for a given j the entry a( j ; k ) will be 0 for all k except those for which supports overlap. 3 1.3 So, what is Finite Element? We can now see a Finite Element method is nothing more than a "cleverly chosen" Galerkin scheme. The key idea behind this method comes from engineering / mechanics, where the responses to loads and forces were computed by dividing the domain into small pieces, or "elements". We will see that this in turn provides a way to de…ne the subspaces Vh spanned by locally supported basis functions. 1.3.1 Triangulation The …rst step is to "divide" our domain into little pieces, or "elements". This procedure is called triangulation, and it can be de…ned as follows: De…nition 3 Given a domain if (i) K = K and K 6= ? 8K 2 (ii) Ki \ Kj = ? 8i 6= j (iii) @K is[ smooth (iv) = K K2 , the collection h is called a triangulation of h h It is known as a triangulation for historical reasons: the …rst elements used by Hrenniko¤ and Courant in the 1940’s were triangles. Although triangles (and simplices, which are their n-dimensional analogues) are widely used, other shapes such as quadrilaterals and prisms are often the shape of choice. Remark 4 Of course, not all domains can be perfectly partitioned into elements, especially if we restrict ourselves to only triangular or quadrilateral shapes. However, one can show that if the boundary of is su¢ ciently smooth, the error incurred in doing this approximately is small. Remark 5 Triangulation is not trivial to implement, and quite a bit of research has gone towards generating "good quality" triangulations. We will talk a bit about this in the numerical session tomorrow. 1.3.2 De…ning a Finite Element Space The second step is the following: having chopped our domain into pieces, it is reasonable to de…ne our spaces Vh element-wise. That is, we consider spaces of the form: Vh Vh;0 = fv 2 V : vjK 2 PK g = fv 2 Vh : vj@ = 0 g 4 Where PK is a …nite dimensional function space (not necessarily the same for each K; although this is almost always the case). In particular, we say Vh is conformal if Vh V and is a C k …nite element space if Vh C k ( ). The two most "popular" choices for PK are Pg (K), the space of polynomials g of degree g in K (dim(Pg (K)) = d + ) and Qg (K), the space of polynomials d such that the maximum power is g (dim(Qg (K)) = (g + 1)d ). 1.3.3 Choosing the basis The …nal step is then to come up with a basis for each PK (the basis for Vh or Vh;0 is then constructed using these bases). There is plenty of motivation (from the physical interpretation, as well as from the study of convergence and error estimates) to generate a basis of PK through an interpolation process. That is, we start with a unisolvent given a series of "measurements" fNi gni=1 (which usually consist of evaluation of the function or its derivatives, or averages along edges, faces, etc). Unisolvent means that the problem: Ni [p] = 2 Rn 8i; i Has a unique solution in PK . If we think of K as an interval, and Ni [p] = p(xi ) for some chosen points xi , this immediately suggests a basis: the Lagrange polynomials. In general, we construct the basis such that it solves the following problem: Ni ('K j )= ij We then have that, for all p 2 PK ; p= n X Ni (p)'i i=1 This basis is often called Nodal basis, since it is generated by evaluations of some quantity at nodes. From the functional analysis point of view, Ni can be seen as a basis for the dual of PK ; and this nodal basis is often referred to in that context as a biorthogonal basis. Using it enables us to have a local interpolation theory for …nite element spaces. Remark 6 If we want things to be well-de…ned on all ; we should make sure these measurements "match" on the faces of adjacent elements. To ensure this, we can enforce that all faces of an element K are either on the boundary, or are faces of an adjacent element as well. Remark 7 Once things are well-de…ned, we have a set of measurements N for our triangularization, and a nodal basis can be similarly constructed by adding the pieces on each element sharing that measurement / node. The way to visualize this is as a "tent" function, supported on all elements sharing a vertex / edge / face. 5 1.3.4 Some examples So, we have seen that a triangulation fK 2 h g, a choice of space PK and a set of measurements NK completely determine a Finite Element method. Hence, in the literature it is often the case that the trio (K; PK ; NK ) is called a …nite element as well. Lagrange Finite Element: If all measurements in NK are point-wise evaluations, the …nite element is known as a Lagrange element. That is, if n = dim(PK ); we simply choose n points in our element K to evaluate p 2 PK . There is no unique way of doing this, but the following is desireable: All points should be distinct, and should coincide on shared faces. If our space is to consist of continuous functions, then evaluations on a face should uniquely determine the polynomial on that face. When possible, they should be assigned in a coordinate-free fashion In general, Lagrange elements yield …nite element spaces of continuous functions. When K is a simplex (triangular), it is often the case that Pg (K) is the space of choice. For g = 1 (piecewise linear), the canonical way to choose points is to take the d + 1 vertices of the simplex as our nodes: Lagrange Type 1 (PK = P1 ) The nodal basis for this element is also very important: Since we demand that 'i (vj ) = ij , we have that '(K) = '1 (K); ::; 'd+1 (K) [ 1; 1]d+1 and P 'i 1 (by interpolation of the constant function). Hence, our basis is none other than the baricentric coordinates for our simplex, and it is a basis if and only if it is non-degenerate (if it has positive volume). The simplicial Lagrange elements for Pg can be constructed in a similar fashion, and it turns out that we can de…ne their basis elements (and indeed 6 perform all other calculations) using baricentric coordinates. The points chosen in this case include vertices as well as points inside edges and faces of the simplex in what is called a "principal lattice of order g". For example, g = 3 looks like: Quadrilateral elements: Qg In the case of Lagrange quadrilateral elements, the most basic Lagrange element is analogous to the ones presented above. However, quadrilaterals don’t have the same number of vertices (or edges, or faces) than simplices. In this case it is natural to use PK = Qg (K). Other …nite element spaces For now, we won’t go into alternative …nite elements (we will save this for later, especially when we study elements designed for the Stokes and Navier-Stokes equations). However it is important to mention that: Elements that employ …rst derivatives (directional along the edges) and second derivatives as well as normal derivatives are widely used, especially when we need them for higher regularity or for convergence. Examples 7 of this are the Hermite elements (up to …rst derivatives), and Argyris elements (up to second derivatives). In practice, it might not be desireable to include nodes inside the elements (like in our Lagrange type 3 example). It is common to restrict the polynomial space to eliminate the need for these degrees of freedom. The resulting elements are widely known as Serendipity elements. As we have mentioned, other kinds of "measurements" can be made, such as averages over faces. 1.4 1.4.1 Some error estimates Cea’s Lemma: back to Galerkin and the meaning of convergence Now that we know exactly what we mean by a …nite element space, we can go back to our Galerkin discretization scheme and ask the million dollar question: Is it convergent? Let h be a set of triangulations of , and Xh V corresponding …nite element spaces, both depending on the parameter h (where h = maxfdiam(K) : K 2 h g tells us how "big" our elements are). Let u be the solution to our variational problem: a(u; v) = F (v) 8v 2 V And fuh g be the sequence of solutions of the discrete problems: a(uh ; vh ) = F (vh ) 8vh 2 Xh We would expect kuh ukV ! 0 as h ! 0 (convergence), and furthermore, we would like to know what the order of that convergence is (in terms of a power of h). Now, there is a general error estimate for conformal Galerkin methods (that is, when Xh V ) known as Cea’s lemma. This result only uses the fact that a is an inner product for the space V . Why is it so important that Xh V ? Because then, u being the solution of the variational problem means, in particular, that: a(u; vh ) = F (vh ) 8vh 2 Xh But then, subtracting the result for uh ; we see that: a(u uh ; vh ) = 0 8vh 2 Xh If we stare at this for a bit, we will recognize it as an orthogonality result. That is, uh is the a-orthogonal projection of u onto the space Xh . This means: ku uh ka = inf fku vh 2Xh 8 vh ka g However, we know that the norms k ka and k kV are equivalent, and so: ku uh kV M p inf fku vh 2Xh vh kV g In the case of the Laplacian, V = H 1 , of course. The meaning of this statement is that, if we can show: lim inf fku h!0 vh 2Xh vh kV g = 0 Then we will have proven convergence. In fact, since we have an inf, we can obtain upper bounds for the error simply by picking some vh . 1.4.2 Error estimates using interpolation In fact, most of the standard estimates that hinge on this result use a very particular vh , that is, we choose an interpolant of u, I (u) 2 Xh . For simplicity, lets assume we are working with the Laplacian example (V = H01 ), and piecewise polynomial elements. In this case, our interpolant is none other than the node interpolant, that is: n X I (u)jK = NiK (u)'K i i=1 Or writing this in terms of the triangulation nodes and basis functions for Xh : X I (u) = N (u) Also, we need to assume two more things about our scheme: The key property that all the elements we have shown share is that all the operations we need to do can be performed on a "reference element" (e.g. the unit simplex / unit square) via a simple map (usually, an a¢ ne map, such as the barycentric coordinates). This will also be useful from an implementation point of view. When we estimate the error on each element K, a factor of the form hgK Km will emerge, where hK is the diameter and K is the radius of the biggest circle we can …t inside K. If we allow our elements to grow arbitrarily "thin", K ! 0, and convergence may be lost (in fact this is the case in practice). Hence, we need to ask that the "aspect ratio" of our elements is not too big (for triangles, this is equivalent to bound the smallest angle below). If these two properties are ful…lled, we have the following result: 9 Theorem 8 Lets assume that there exist g 2 N [ f0g, 1 b we have the embeddings: reference element K b Pg (K) PKb g such that, for the b H 1 (K) b ,! C s (K) b H g+1 (K) Then, assuming that the exact solution u belongs to H g+1 ( ); 9C > 0 independent of h such that: ku uh k1; C jujg+1; hg Now, how can I apply this theorem? Lets say R2 and I suspect my 2 2 solution is only on H ( ). Then H ( ) is continuously embedded in C 0 ( ) (by Sobolev embedding), and using the previous result, I can conclude that using P1 (piecewise linear) Lagrange elements I will achieve the maximum convergence rate posible, that is O(h). I can use more complicated elements or higher degree polynomials, and yet I will not be able to improve this rate. Finally, we note that it might seem suspicious to use pointwise evaluation or derivative evaluation to perform interpolation in H 1 . This is in fact true, and what is "swept under the rug" is that, if u is not in H 2 ( ), we need a more general interpolation theory (such as Clement interpolation). However, in most applications one knows u should be regular, and nodal interpolation is good enough. 1.5 Conclusions This has only been a snapshot of the main results concerning conforming …nite elements, as they are usually applied to the solution of linear elliptic PDEs. As you can see, the "story" behind the …nite element method is quite rich even when we restrict ourselves to the simplest case in 2D or 3D domains. This tells us that this method is quite powerful, and lends itself to a very vast array of applications in the solution of PDEs and integral equations as well. To get a ‡avor of what was omitted from this introduction, we note that: In the cases in which is not "polygonal", it cannot be exactly triangularized. Also, we may want to use "non-conforming" spaces Xh ( V . These amongst others are referred to in the literature as "variational crimes", and new error estimates (Strang’s Lemmas) have to be derived. These tell us that, as long as these crimes are "not too bad" (go to 0 at a good enough rate), we can keep the same convergence results as in the conforming case. Elements may have curved boundaries. Also, even in the case when one wants to use general quadrilaterals (and not just parallellograms, say), problems arise when we want to map them onto the reference element. This requires the theory of isoparametric elements. 10 Specialized elements may be used depending on the properties we may want for uh . References [1] Braess, Dietrich. Finite Elements: Theory, fast solvers and applications in solid mechanics. 3rd edition, Cambridge University Press [2] Ciarlet, Philippe G. The Finite Element Method for Elliptic Problems. SIAM Classics in Applied Mathematics. [3] Brenner, Susanne C., Scott, Ridgway L. The Mathematical Theory of Finite Element Methods. 2nd edition, Springer Verlag. [4] Kesavan, S. Nonlinear Functional Analysis. 11