HIGH-ORDER AFEM FOR THE LAPLACE-BELTRAMI OPERATOR: CONVERGENCE RATES

ANDREA BONITO ∗, J. MANUEL CASCÓN †, KHAMRON MEKCHAY ‡, PEDRO MORIN §, AND RICARDO H. NOCHETTO ¶

Abstract. We present a new AFEM for the Laplace-Beltrami operator with arbitrary polynomial degree on parametric surfaces, which are globally $W^1_\infty$ and piecewise in a suitable Besov class embedded in $C^{1,\alpha}$ with $\alpha \in (0,1]$. The idea is to have the surface sufficiently well resolved in $W^1_\infty$ relative to the current resolution of the PDE in $H^1$. This gives rise to a conditional contraction property of the PDE module. We present a suitable approximation class and discuss its relation to Besov regularity of the surface, solution, and forcing. We prove optimal convergence rates for AFEM which are dictated by the worst decay rate of the surface error in $W^1_\infty$ and PDE error in $H^1$.

1. Introduction. Let $\gamma$ be a $d$-dimensional surface in $\mathbb{R}^{d+1}$ ($d \ge 1$), either with or without boundary, which is globally Lipschitz and piecewise in a suitable Besov class embedded in $C^{1,\alpha}$ with $\alpha \in (0,1]$. We design and study a quasi-optimal adaptive finite element method (AFEM) to approximate the solution of
\[ -\Delta_\gamma u = f \quad \text{on } \gamma, \tag{1.1} \]
where $f \in L_2(\gamma)$ and $-\Delta_\gamma$ is the Laplace-Beltrami operator (or surface Laplacian) on $\gamma$. In addition, we impose that $u = 0$ on $\partial\gamma$ or require that $\int_\gamma u = 0$ if $\partial\gamma = \emptyset$ (with $\int_\gamma f = 0$ for compatibility). To represent $\Delta_\gamma$, one needs to describe $\gamma$ mathematically using, for example, parametric representations on charts, level sets, distance functions, graphs of functions, etc. Moreover, one usually obtains approximate solutions (finite element solutions) by solving the problem on approximate polyhedral surfaces rather than the surface $\gamma$ itself. Exploiting the variational structure of the Laplace-Beltrami operator, [19] gives an a priori error analysis whereas [16, 17, 24, 6] provide a posteriori counterparts.
∗ Department of Mathematics, Texas A&M University, 3368 TAMU, College Station, TX 77843-3368, USA (bonito@math.tamu.edu). Partially supported by NSF Grant DMS-1254618.
† Departamento de Economía e Historia Económica, Universidad de Salamanca, Salamanca 37008, Spain (casbar@usal.es). Partially supported by Secretaría de Estado de Investigación, Desarrollo e Innovación and Centro para el Desarrollo Tecnológico Industrial of the Ministerio de Economía y Competitividad (Spain), grant: CGL2011-29396-C03-02 and by Consejería de Educación of the Junta de Castilla y León, grant: SA266A12-2.
‡ Department of Mathematics, Chulalongkorn University, Thailand (k.mekchay@gmail.com). Partially supported by NSF Grants DMS-0204670, DMS-0505454, and INT-0126272.
§ Instituto de Matemática Aplicada del Litoral (IMAL), UNL-CONICET, Facultad de Ingeniería Química, UNL, CCT-CONICET, Colectora Ruta Nac. 168, 3000 Santa Fe, Argentina (pmorin@santafe-conicet.gov.ar). Partially supported by CONICET through grant PIP 112-20110100742, by Universidad Nacional del Litoral through grant CAI+D PI 501 201101 00476 LI, and by Agencia Nacional de Promoción Científica y Tecnológica, through grants PICT-2012-2590, PICT-2013-3293 (Argentina).
¶ Department of Mathematics and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, USA (rhn@math.umd.edu). Partially supported by NSF Grants DMS-1109325, DMS-1411808, and a RASA semester research award of the University of Maryland.

Our present objective is to continue our research on AFEM for elliptic PDEs on surfaces initiated in [24] for graphs and extended in [6] to parametric surfaces, the latter with polynomial degree $n = 1$. We design herein an AFEM for parametric surfaces using $C^0$ finite elements of degree $n \ge 1$, prove optimal convergence rates and workload estimates, and study suitable approximation classes for the triple $(u, f, \gamma)$. High-order finite elements are superior to linears for geometric problems: they provide better approximation of important geometric quantities such as curvature, and they are less sensitive to mesh tangling due to tangential node motion for time dependent problems; we refer to [6] for a discussion of several applications.

The advantage of high-order methods is even more pronounced when they are combined with adaptivity. AFEMs are known to exploit the nonlinear Besov regularity scale, instead of the linear Sobolev scale, and be able to deliver optimal convergence rates $N^{-n/d}$ in terms of degrees of freedom $N$ for singular elliptic problems on flat domains with limited Sobolev regularity [29, 12], [11, 13, 20, 21, 27]. The study of AFEM for the Laplace-Beltrami operator on parametric surfaces is, however, restricted to $n = 1$ because the first fundamental form, area element, and normal vector to the discrete surface as well as the surface gradient of discrete functions are piecewise constant, which greatly simplifies the analysis [6]. This paper bridges this gap and provides a comprehensive approach to high-order AFEM on parametric surfaces.

It is standard practice to pose the discrete problem on a piecewise polynomial approximation $\Gamma$ of the exact surface $\gamma$. This is unavoidable when dealing with evolving surfaces, such as time dependent free boundary problems, for which $\gamma$ is unknown [6]. This surface discrepancy is responsible for a geometric consistency error not present in the flat case, which makes this setting mathematically challenging and intriguing. In fact, there is a nonlinear interplay between the approximate surface $\Gamma$ and the approximate solution $U$ defined on $\Gamma$. To elucidate this issue, one might think of the Laplace-Beltrami operator as a linear elliptic operator with variable coefficients in a flat parametric domain, except that the approximate coefficients are not piecewise polynomials as in [7] but rather some rational functions when $n > 1$.
The multiplicative structure of the solution-coefficient interaction is an essential new difficulty we must cope with to develop high-order AFEM and study their performance.

To handle this nonlinear interaction, we propose an AFEM which successively applies two different modules: ADAPT SURFACE approximates the surface $\gamma$ in $W^1_\infty$ and ADAPT PDE approximates the solution $u$ in $H^1$. The former is a greedy algorithm which monitors the geometric estimator whereas the latter deals with a residual estimator. If $\{\mathcal{T}_k, U_k\}_{k=0}^\infty$ denotes the sequence of meshes (defining a sequence of piecewise polynomial approximate surfaces) and Galerkin solutions generated by AFEM in step $k$, using a discrete forcing function $F_k$, the method reads as follows:

AFEM: Given $\Gamma_0$, $\mathcal{T}_0$, and parameters $\varepsilon_0 > 0$, $0 < \rho < 1$, and $\omega > 0$, set $k = 0$.
  1. $[\mathcal{T}_k^+]$ = ADAPT SURFACE$(\mathcal{T}_k, \Gamma_k, \omega\varepsilon_k)$
  2. $[U_{k+1}, \mathcal{T}_{k+1}]$ = ADAPT PDE$(\mathcal{T}_k^+, \varepsilon_k)$
  3. $\varepsilon_{k+1} = \rho\,\varepsilon_k$; $k = k + 1$
  4. go to 1.

This strategy bears similarities with the algorithms proposed in [7] targeting diffusion problems with partial information on the coefficients. This concept relates directly to surface approximation in the present context but it is intrinsically different from piecewise polynomial approximation of coefficients. We develop herein new techniques to handle such differences upon insisting on the geometric nature of the approximation.

We now describe AFEM. For the purpose of this introduction, we assume that $\gamma$ can be parametrized by a single map $\chi : \Omega \to \gamma$, where $\Omega \subset \mathbb{R}^d$ denotes the corresponding parametric domain, and refer to § 2.1 for the more general case. If $\widehat{\mathcal{T}} := \widehat{\mathcal{T}}(\Omega)$ is a generic triangulation in $\Omega$, then $\widehat{\mathbb{V}}(\widehat{\mathcal{T}})$ denotes the space of continuous piecewise polynomial functions of degree $\le n$ subordinate to $\widehat{\mathcal{T}}$.
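The Lagrange interpolant of the parametrization $\chi$, $X_{\widehat{\mathcal{T}}} := I_{\widehat{\mathcal{T}}}\chi \in \widehat{\mathbb{V}}(\widehat{\mathcal{T}})$, induces the discrete (piecewise polynomial) surface $\Gamma := X_{\widehat{\mathcal{T}}}(\Omega)$ and a curved mesh $\mathcal{T} := \{T = X_{\widehat{\mathcal{T}}}(\widehat{T}) \mid \widehat{T} \in \widehat{\mathcal{T}}\}$; note the correspondence between a flat element $\widehat{T} \in \widehat{\mathcal{T}}$ and a curved element $T \in \mathcal{T}$. We then define the geometric estimator $\lambda_{\widehat{\mathcal{T}}} := \max_{T \in \mathcal{T}} \lambda_{\widehat{\mathcal{T}}}(T)$ in terms of the geometric element indicator
\[ \lambda_{\widehat{\mathcal{T}}}(T) := \|\widehat\nabla(\chi - X_{\widehat{\mathcal{T}}})\|_{L_\infty(\widehat{T})} \quad \forall\, T \in \mathcal{T}; \tag{1.2} \]
we observe that $\lambda_{\widehat{\mathcal{T}}}(T)$ is evaluated in the flat element $\widehat{T} \in \widehat{\mathcal{T}}$, which explains the subscript $\widehat{\mathcal{T}}$ of $\lambda_{\widehat{\mathcal{T}}}(T)$. Given a tolerance $\tau > 0$ and a mesh $\mathcal{T}$, the procedure
\[ \mathcal{T}^+ = \text{ADAPT SURFACE}(\mathcal{T}, \tau) \]
finds adaptively a refinement $\mathcal{T}^+$ of $\mathcal{T}$, denoted $\mathcal{T}^+ \ge \mathcal{T}$, and its corresponding piecewise polynomial approximation $\Gamma^+$ of $\gamma$, such that
\[ \lambda_{\widehat{\mathcal{T}}^+} \le \tau. \tag{1.3} \]
We assume that this module is $t$-optimal, i.e., the number of marked elements $\#\mathcal{M}$ to achieve (1.3) satisfies
\[ \#\mathcal{M} \lesssim C(\gamma)\, \tau^{-1/t}. \tag{1.4} \]
The largest value of $t \le n/d$ depends on the dimension $d$ and the polynomial degree $n \ge 1$. In § 7.3, we show that (1.4) holds if $\chi$ belongs to a suitable Besov space.

Since the exact and approximate solutions $u$ and $U$ are defined on different surfaces $\gamma$ and $\Gamma$, we have to decide how to compare them. We lift $U$ to $\gamma$ via the map $X_{\widehat{\mathcal{T}}} \circ \chi^{-1}$ and define the energy error to be
\[ e_{\mathcal{T}}(U) := \|\nabla_\gamma(u - U)\|_{L_2(\gamma)}, \tag{1.5} \]
where $\nabla_\gamma$ denotes the surface gradient on $\gamma$ defined below in § 3.1. We further denote by $\eta_{\mathcal{T}}(U, F)$ the residual estimator of $e_{\mathcal{T}}(U)$, defined later in § 4.3. If $\varepsilon$ stands for a tolerance, the procedure
\[ [U_*, \mathcal{T}_*] = \text{ADAPT PDE}(\mathcal{T}, \varepsilon) \]
finds adaptively a refinement $\mathcal{T}_*$ of $\mathcal{T}$ such that the Galerkin solution $U_* \in \mathbb{V}(\mathcal{T}_*)$ on $\Gamma_*$, the approximate surface corresponding to $\mathcal{T}_*$, satisfies the prescribed bound $\eta_{\mathcal{T}_*}(U_*, F_*) \le \varepsilon$. This is the usual loop for linear elliptic PDE [18, 25], [6, 8, 11, 12, 13, 21, 24, 27]
\[ \text{SOLVE} \to \text{ESTIMATE} \to \text{MARK} \to \text{REFINE}, \tag{1.6} \]
except that the approximate surface $\Gamma$ is updated after each REFINE call and therefore changes within ADAPT PDE.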
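The outer loop of AFEM stated above can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions, not the authors' implementation: a mesh is represented by a single integer (a hypothetical uniform refinement level), both estimators are modeled by the made-up decay `2**(-level)`, and the names `afem`, `toy_adapt_surface`, and `toy_adapt_pde` are introduced here for illustration. The sketch only shows how the tolerances $\omega\varepsilon_k$ and $\varepsilon_k$ are passed to the two modules and reduced geometrically by $\rho$.

```python
def afem(mesh0, eps0, rho, omega, adapt_surface, adapt_pde, steps):
    """Outer AFEM loop: alternate ADAPT SURFACE and ADAPT PDE, then shrink eps."""
    assert eps0 > 0 and 0 < rho < 1 and omega > 0
    mesh, eps = mesh0, eps0
    history = []
    for _ in range(steps):
        mesh_plus = adapt_surface(mesh, omega * eps)  # step 1: resolve the surface
        solution, mesh = adapt_pde(mesh_plus, eps)    # step 2: resolve the PDE
        history.append((eps, mesh, solution))
        eps = rho * eps                               # step 3: reduce the tolerance
    return history


def toy_estimator(level):
    # Hypothetical model: the estimator halves with each uniform refinement.
    return 2.0 ** (-level)


def toy_adapt_surface(level, tol):
    # Refine until the (toy) geometric estimator is below tol.
    while toy_estimator(level) > tol:
        level += 1
    return level


def toy_adapt_pde(level, tol):
    # Refine until the (toy) residual estimator is below tol.
    # The returned "solution" is just a placeholder (the estimator value).
    while toy_estimator(level) > tol:
        level += 1
    return toy_estimator(level), level


history = afem(0, eps0=1.0, rho=0.5, omega=0.25,
               adapt_surface=toy_adapt_surface, adapt_pde=toy_adapt_pde, steps=3)
for eps, level, _ in history:
    print(f"eps = {eps:.3f}, refinement level = {level}")
```

In this toy model the surface module does all the refinement because $\omega < 1$ forces the geometric tolerance below the PDE tolerance; in the actual method both estimators evolve independently and the geometric estimator may even increase inside ADAPT PDE.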
Note that there is a tolerance $\varepsilon_k$ being reduced geometrically in every outer loop of AFEM, and a small parameter $\omega$ (to be determined explicitly) that relates the tolerances for both procedures. The role of $\omega$ is critical to derive convergence rates for AFEM, and is explored computationally in [6, Section 2] for dimension $d = 2$ and polynomial degree $n = 1$. The presence of $\omega$ is in the spirit of the inner loop of [29] to handle data $f \in H^{-1}$ and of [7] to deal with discontinuous coefficients in the flat case. It means that the surface must be resolved slightly better than the solution for $\eta_{\mathcal{T}}(U, F)$ to provide reliable information about $e_{\mathcal{T}}(U)$.

Our first main result is a conditional contraction property of ADAPT PDE, which reads as follows and is shown in § 6:

If the parameter $\omega > 0$ is small enough, there exist constants $0 < \alpha < 1$ and $\beta > 0$ such that, if the geometric estimator $\lambda_{\widehat{\mathcal{T}}_k} \le \omega\varepsilon_k$ and the error estimator $\eta_k = \eta_{\mathcal{T}_k}(U_k, F_k) \ge \varepsilon_k$, then the inner iterates $\{\mathcal{T}_j, \Gamma_j, U_j, \eta_j\}_{j=0}^J$ of ADAPT PDE$(\mathcal{T}_k^+, \varepsilon_k)$ satisfy
\[ e_{j+1}^2 + \beta\,\eta_{j+1}^2 \le \alpha^2 \big( e_j^2 + \beta\,\eta_j^2 \big) \quad \forall\, 0 \le j \le J, \]
where $e_j := e_{\mathcal{T}_j}(U_j)$ and $J$ is uniformly bounded with respect to $k$.

To derive convergence rates we need to seek a suitable error quantity and associated approximation class; this is fully discussed in § 7. Since all decisions of the AFEM are based on the estimators $\{\eta_{\mathcal{T}}, \lambda_{\widehat{\mathcal{T}}}\}$, the convergence rate of AFEM is dictated by these quantities. We will show in Lemma 4.8 that for all the inner iterates $(\mathcal{T}, U)$ within ADAPT PDE
\[ \|\nabla_\gamma(u - U)\|_{L_2(\gamma)}^2 + \mathrm{osc}_{\widehat{\mathcal{T}}}(U, f)^2 \approx \eta_{\mathcal{T}}(U, F)^2, \tag{1.7} \]
where the oscillation $\mathrm{osc}_{\widehat{\mathcal{T}}}(U, f)$, a quantity evaluated in the parametric domain, can be bounded as follows:
\[ \mathrm{osc}_{\widehat{\mathcal{T}}}(U, f)^2 \le \mathrm{osc}_{\widehat{\mathcal{T}}}(U)^2 + \mathrm{osc}_{\widehat{\mathcal{T}}}(f)^2. \tag{1.8} \]
The presence of the first term is a feature inherent to polynomial degree $n > 1$ which is absent in [6].
This justifies the following notion of total error
\[ E_{\mathcal{T}}(U; u, f, \gamma) := \|\nabla_\gamma(u - U)\|_{L_2(\gamma)} + \mathrm{osc}_{\widehat{\mathcal{T}}}(U, f) + \omega^{-1}\lambda_{\widehat{\mathcal{T}}}, \tag{1.9} \]
where the scaling $\omega^{-1}$ brings the estimator $\lambda_{\widehat{\mathcal{T}}}$ to a size comparable with $\eta_{\mathcal{T}}(U, F)$. Then the quality of the best approximation of $(u, f, \gamma)$ with $N$ degrees of freedom can be assessed in terms of the following best approximation error:
\[ \sigma(N; u, f, \gamma) := \inf_{\mathcal{T} \in \mathbb{T}_N}\ \inf_{V \in \mathbb{V}(\mathcal{T})} E_{\mathcal{T}}(V; u, f, \gamma), \]
where $\mathbb{V}(\mathcal{T})$ denotes the approximation space on the discrete surface $\Gamma$, and $\mathbb{T}_N$ is the set of conforming triangulations obtained after $N$ bisections from $\mathcal{T}_0$. We say that the triple $(u, f, \gamma)$ belongs to the approximation class $\mathbb{A}_s$, with $0 < s \le n/d$, if
\[ \sigma(N; u, f, \gamma) \lesssim N^{-s}; \tag{1.10} \]
equivalently, for any natural number $N \ge 1$, there is a conforming mesh refinement $\mathcal{T}_N$ of the initial mesh $\mathcal{T}_0$ satisfying $\#\mathcal{T}_N - \#\mathcal{T}_0 \le N$ and such that $E_{\mathcal{T}_N}(V; u, f, \gamma) \le C N^{-s}$ for some $V \in \mathbb{V}(\mathcal{T}_N)$. Throughout this paper we use the notation $A \lesssim B$ to denote $A \le CB$ with a constant $C$ independent of $A$ and $B$. We shall indicate, when appropriate, the quantities on which the constant $C$ depends.

The algebraic error decay (1.10) relates to Besov regularity for flat domains [4, 22, 20]. The situation for surfaces is much more intricate due to the nonlinear surface-PDE interaction. We wonder whether regularity of $\gamma$ enabling an error decay $N^{-s}$ in $W^1_\infty$ is compatible with a similar decay rate for $e_{\mathcal{T}}(U)$ and $\mathrm{osc}_{\widehat{\mathcal{T}}}(U, f)$, which depend on the approximate surface $\Gamma$. Exploring this question is a fundamental contribution of this paper and entails the study of Besov regularity of products and composition of functions, which we carry out in § 9 and is of independent interest. We apply our findings in § 7 to quantify the effect of surface approximation in the decay of both $e_{\mathcal{T}}(U)$ and $\mathrm{osc}_{\widehat{\mathcal{T}}}(U, f)$. This leads to our second main contribution:

Let $0 < p, q \le \infty$ and $0 < s \le n/d$ be such that $s > \frac{1}{p} - \frac{1}{2}$ and $s > \frac{1}{q}$. If the triple $(u, f, \gamma)$ satisfies
\[ u \in B_p^{1+sd}(L_p(\Omega)), \qquad f \in B_p^{sd}(L_p(\Omega)), \qquad \chi \in B_q^{1+sd}(L_q(\Omega)), \]
then (1.10) holds, i.e., $(u, f, \gamma) \in \mathbb{A}_s$. Moreover, $\mathrm{osc}_{\widehat{\mathcal{T}}}(f)$ exhibits a faster decay $s + 1/d$.

We observe that $s > \frac{1}{p} - \frac{1}{2}$ and $s > \frac{1}{q}$ guarantee that $B_p^{1+sd}(L_p(\Omega)) \subset H^1(\Omega)$ and $B_q^{1+sd}(L_q(\Omega)) \subset W^1_\infty(\Omega)$, whence the additional regularity is just above the nonlinear Sobolev scale for both $u$ and $\gamma$. This shows that the two scales are indeed compatible, and that if $s = \frac{n}{d}$, then $p > \frac{2d}{2n+d}$ and $q > \frac{d}{n}$ may be smaller than 1 for $n > \frac{d}{2}$. The latter does not happen for $n = 1$ and represents a striking difference with [6].

Our third main contribution is a quasi-optimal decay rate for the AFEM in terms of degrees of freedom under natural restrictions on the initial triangulation $\mathcal{T}_0$, marking parameter $\theta$ of MARK and parameter $\omega$ of AFEM. This is developed in § 8 and reads:

Let the initial triangulation $\mathcal{T}_0$ satisfy condition (b) in [30, Section 4], and $\theta \in (0, \theta_*)$, $\omega \in (0, \omega_*)$ for $\theta_*$, $\omega_*$ sufficiently small. If $(u, f, \gamma) \in \mathbb{A}_s$ and the module ADAPT SURFACE is $s$-optimal in the sense of (1.4), then the sequence $\{\Gamma_k, \mathcal{T}_k, U_k\}$ generated by AFEM verifies
\[ \|\nabla_\gamma(u - U_k)\|_{L_2(\gamma)} + \mathrm{osc}_{\widehat{\mathcal{T}}_k}(U_k, f) + \omega^{-1}\lambda_{\widehat{\mathcal{T}}_k} \lesssim (\#\mathcal{T}_k - \#\mathcal{T}_0)^{-s}. \]
Moreover, the workload up to step $k$ of AFEM is proportional to $\varepsilon_k^{-1/s}$ provided each inner loop of ADAPT PDE has linear complexity.

The rest of the paper is organized as follows. We discuss the representation and interpolation of $\gamma$ in § 2, and basic differential geometry leading to the Laplace-Beltrami operator in § 3. In § 4 we obtain a posteriori error estimates for the energy error and derive several properties of the estimator and oscillation. In § 5 we examine the various modules of AFEM and establish the conditional contraction property of ADAPT PDE in § 6 (first main result). In § 7 we show that suitable Besov regularity of the triple $(u, f, \gamma)$ implies $(u, f, \gamma) \in \mathbb{A}_s$, which is our second main result. We next prove quasi-optimal convergence rates in § 8, our third main result.
After recalling a definition of Besov spaces, we establish in § 9 instrumental results on the products and compositions of functions in Besov spaces.

2. Parametric Surfaces. In this section we discuss how to represent a parametric surface by interpolation, which is instrumental for the design, analysis, and implementation of our AFEM.

2.1. Representation of Parametric Surfaces. We assume that the surface $\gamma$ is described as the deformation of a $d$-dimensional polyhedral surface $\overline{\Gamma}_0$ by a globally Lipschitz homeomorphism $P_0 : \overline{\Gamma}_0 \to \gamma \subset \mathbb{R}^{d+1}$. The overline notation is to emphasize that $\overline{\Gamma}_0$ is piecewise affine. Moreover, if $\overline{\Gamma}_0 = \bigcup_{i=1}^M \overline{\Gamma}_0^i$ is made up of $M$ (closed) faces $\overline{\Gamma}_0^i$, $i = 1, \dots, M$, we denote by $P_0^i : \overline{\Gamma}_0^i \to \mathbb{R}^{d+1}$ the restriction of $P_0$ to $\overline{\Gamma}_0^i$. We refer to $\overline{\Gamma}_0^i$ as a macro-element which induces the partition $\{\gamma^i\}_{i=1}^M$ of $\gamma$ upon setting
\[ \gamma^i := P_0^i(\overline{\Gamma}_0^i). \]
Note that this non-overlapping parametrization allows for piecewise smooth surfaces $\gamma$ with possible kinks matched by the decomposition $\{\gamma^i\}_{i=1}^M$.

In order to avoid technicalities, we assume that all the macro-elements are simplices, i.e. there is a (closed) reference simplex $\Omega \subset \mathbb{R}^d$, from now on called the local parametric domain, and an affine map $\overline{X}_0^i : \mathbb{R}^d \to \mathbb{R}^{d+1}$ such that $\overline{\Gamma}_0^i = \overline{X}_0^i(\Omega)$; Figure 2.1 sketches the situation when $d = 2$. We thus let $\chi^i := P_0^i \circ \overline{X}_0^i : \Omega \to \gamma^i$ be a local parametrization of $\gamma$ which is bi-Lipschitz, namely there exists a universal constant $L \ge 1$ such that for all $1 \le i \le M$
\[ L^{-1}|\hat{x} - \hat{y}| \le |\chi^i(\hat{x}) - \chi^i(\hat{y})| \le L|\hat{x} - \hat{y}| \quad \forall\, \hat{x}, \hat{y} \in \Omega. \tag{2.1} \]
This minimal regularity of $\gamma$, soon to be strengthened locally in each macro-element, implies the more familiar condition, valid for a.e. $\hat{x} \in \Omega$,
\[ L^{-1}|w| \le |\widehat\nabla \chi^i(\hat{x})\, w| \le L|w| \quad \forall\, w \in \mathbb{R}^d. \tag{2.2} \]
The collection of these parametrizations is denoted $\chi$, i.e. $\chi := \{\chi^i\}_{i=1}^M$. We further assume that $P_0^i(v) = v$ for all vertices $v$ of $\overline{\Gamma}_0^i$, so that $\overline{X}_0^i$ is the nodal interpolant of $\chi^i$ into linears.

Fig. 2.1. Representation of each component $\gamma^i$ when $d = 2$ as a parametrization from a flat triangle $\overline{\Gamma}_0^i \subset \mathbb{R}^3$ as well as from the reference simplex $\Omega \subset \mathbb{R}^2$. The map $\overline{X}_0^i : \Omega \to \overline{\Gamma}_0^i$ is affine.

The structure of the map $P_0$ depends on the application. For instance, if $\gamma^i$ is described on $\overline{\Gamma}_0^i$ via the distance function $\mathrm{dist}(x)$ to $\gamma$, then
\[ \gamma^i \ni \tilde{x} = x - \mathrm{dist}(x)\,\nabla \mathrm{dist}(x) = P_0(x) \quad \forall\, x \in \overline{\Gamma}_0^i, \]
provided $\mathrm{dist}(x)$ is sufficiently small so that the distance is uniquely defined. If, instead, $\gamma^i$ is the zero level set $\phi(x) = 0$ of a function $\phi$, then
\[ \overline{\Gamma}_0^i \ni x = \tilde{x} + \frac{\nabla\phi(\tilde{x})}{|\nabla\phi(\tilde{x})|}\,|x - \tilde{x}| = P_0^{-1}(\tilde{x}) \quad \forall\, \tilde{x} \in \gamma^i \]
is the inverse map of $P_0^i$. In both cases, $\mathrm{dist}$ and $\phi$ must be $C^2$ for $P_0^i$ to be $C^1(\overline{\Gamma}_0^i)$. Yet another option is to view $\gamma^i$ as a graph on $\overline{\Gamma}_0^i$, in which case $P_0^i$ is a lift in the normal direction to $\overline{\Gamma}_0^i$ and $P_0^i$ is $C^1(\overline{\Gamma}_0^i)$ if and only if $\gamma^i$ is; we refer to [24]. Notice that the inverse mapping theorem implies $(P_0^i)^{-1} \in C^1(\gamma^i)$.

The regularity of $\gamma$ is expressed in terms of the regularity of the maps $\chi^i$. If $s > 0$, $0 < p, q \le \infty$, we say that $\gamma$ is piecewise $B_q^{1+s}(L_p(\Omega))$ whenever $\chi^i \in [B_q^{1+s}(L_p(\Omega))]^{d+1}$, $i = 1, \dots, M$; or shortly $\chi \in [B_q^{1+s}(L_p(\Omega))]^{(d+1)\times M}$. We refer to § 9 for the definition of Besov norms and spaces.

We observe that a function $v^i : \gamma^i \to \mathbb{R}$ defines uniquely two functions $\hat{v}^i : \Omega \to \mathbb{R}$ and $\bar{v}^i : \overline{\Gamma}_0^i \to \mathbb{R}$ via the maps $\chi^i$ and $P_0^i$, namely
\[ \hat{v}^i(\hat{x}) := v^i(\chi^i(\hat{x})) \quad \forall\, \hat{x} \in \Omega, \qquad \bar{v}^i(\bar{x}) := v^i(P_0^i(\bar{x})) \quad \forall\, \bar{x} \in \overline{\Gamma}_0^i. \tag{2.3} \]
Conversely, a function $\hat{v}^i : \Omega \to \mathbb{R}$ (respectively, $\bar{v}^i : \overline{\Gamma}_0^i \to \mathbb{R}$) defines uniquely the two functions $v^i : \gamma^i \to \mathbb{R}$ and $\bar{v}^i : \overline{\Gamma}_0^i \to \mathbb{R}$ (respectively, $v^i : \gamma^i \to \mathbb{R}$ and $\hat{v}^i : \Omega \to \mathbb{R}$). When no confusion is possible, we will denote by $v^i$ the three functions $v^i$, $\bar{v}^i$ and $\hat{v}^i$, and set $\tilde{x} := \chi(\hat{x})$ for all $\hat{x} \in \Omega$. Moreover, we will use vector notation
\[ \mathbf{v} := \{v^i\}_{i=1}^M, \tag{2.4} \]
along with the convention
\[ \|\mathbf{v}\|_{B(\Omega)} := \max_{i=1,\dots,M} \|v^i\|_{B(\Omega)}, \qquad |\mathbf{v}|_{B(\Omega)} := \max_{i=1,\dots,M} |v^i|_{B(\Omega)} \tag{2.5} \]
for (quasi) norms and semi-norms defined on a quasi-normed linear space $B(\Omega)$; typically $B(\Omega)$ will be a Lebesgue, Sobolev, or Besov space. Moreover, we will write
\[ \|\mathbf{v}\|_{B(\widehat{T})}, \quad |\mathbf{v}|_{B(\widehat{T})} \quad \forall\, \widehat{T} \in \widehat{\mathcal{T}} \tag{2.6} \]
to indicate the local (quasi) norms and semi-norms over a generic element $\widehat{T} \in \widehat{\mathcal{T}}^i$ without specifying the superscript $i$ in either function $\mathbf{v}$ or mesh $\widehat{\mathcal{T}}$.

Before proceeding further, we note that as a general rule, we use hat symbols to denote quantities related to $\Omega$, an overline to refer to quantities on $\overline{\Gamma}_0$, tilde to characterize quantities in $\gamma$, and bold to indicate vector quantities.

2.2. Interpolation of Parametric Surfaces and Finite Element Spaces. The partition of the initial polyhedral surface $\overline{\Gamma}_0$ in macro-elements (or faces) induces a conforming triangulation of $\overline{\Gamma}_0$; we call this set $\overline{\mathcal{T}}_0$. We only discuss the class of conforming meshes $\overline{\mathbb{T}} := \mathbb{T}(\overline{\mathcal{T}}_0)$ created by successive bisections of this initial mesh $\overline{\mathcal{T}}_0$. However, our results remain valid for any refinement strategy satisfying Conditions 3, 4 and 6 in [8]. In particular, successive bisections, quad-refinement and red-refinement, all with hanging nodes, are admissible refinement strategies. For more details, we refer to [8, Section 6]. A triangulation $\overline{\mathcal{T}} \in \overline{\mathbb{T}}$ yields triangulations of $M$ copies of $\Omega$ and a piecewise polynomial approximation $\Gamma$ of $\gamma$ defined below.

2.2.1. Finite Element Spaces and Surface Approximations. Any number of conforming graded bisections of each macro-element $\overline{\Gamma}_0^i$ generates via $(\overline{X}_0^i)^{-1}$ a conforming partition of the local parametric domain $\Omega \subset \mathbb{R}^d$ denoted $\widehat{\mathcal{T}}^i(\Omega)$ or simply $\widehat{\mathcal{T}}^i$. For $n \ge 1$, let $\widehat{\mathbb{V}}^i := \widehat{\mathbb{V}}(\widehat{\mathcal{T}}^i)$ be the finite element space of globally continuous piecewise polynomials of degree $\le n$ on $\Omega$ subordinate to the partition $\widehat{\mathcal{T}}^i$, and let $I_{\widehat{\mathcal{T}}^i} : C^0(\Omega) \to \widehat{\mathbb{V}}^i$ (resp. $\mathbf{I}_{\widehat{\mathcal{T}}^i} : C^0(\Omega)^d \to (\widehat{\mathbb{V}}^i)^d$) be the Lagrange interpolation operator of scalar functions (resp. of vector-valued functions).
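We next define
\[ X^i_{\widehat{\mathcal{T}}^i} := \mathbf{I}_{\widehat{\mathcal{T}}^i}\chi^i, \qquad \Gamma^i := X^i_{\widehat{\mathcal{T}}^i}(\Omega), \qquad \mathcal{T}^i := \big\{ T := X^i_{\widehat{\mathcal{T}}^i}(\widehat{T}) : \widehat{T} \in \widehat{\mathcal{T}}^i \big\} \]
to be the piecewise polynomial interpolation of $\chi^i$ and $\gamma^i$, and their associated mesh.

We now define the corresponding global quantities. The global parametric space $\Omega^M$ consists of $M$ identical copies of the local parametric space $\Omega$. Its subdivision is denoted $\widehat{\mathcal{T}}$ and is defined as
\[ \widehat{\mathcal{T}} := \bigcup_{i=1}^M \widehat{\mathcal{T}}^i. \]
Each triangulation $\overline{\mathcal{T}} \in \overline{\mathbb{T}}$ uniquely determines $\widehat{\mathcal{T}}$, so we can define the forest
\[ \widehat{\mathbb{T}} := \mathbb{T}(\widehat{\mathcal{T}}_0) := \{ \widehat{\mathcal{T}} : \overline{\mathcal{T}} \in \overline{\mathbb{T}} \}. \]
Notice that $\widehat{\mathbb{T}}$ does not correspond necessarily to $M$ copies of the same forest; it is rather a set of $M$ different forests. Indeed, the bisection rule is governed by the topology of $\overline{\mathcal{T}}_0$ and dictates which initial bisection of each separate $\Omega$ is performed. Similarly, the global subdivision $\mathcal{T}$ is given by
\[ \mathcal{T} := \bigcup_{i=1}^M \mathcal{T}^i \quad \text{and} \quad \mathbb{T} := \mathbb{T}(\mathcal{T}_0) := \{ \mathcal{T} : \overline{\mathcal{T}} \in \overline{\mathbb{T}} \}; \]
note that $\mathcal{T}_0 = \overline{\mathcal{T}}_0$. The global piecewise polynomial surface $\Gamma$, the corresponding Lagrange interpolation operator, and parametrization $X_{\widehat{\mathcal{T}}}$ of $\Gamma$ are then given by
\[ \Gamma := \Gamma_{\mathcal{T}} := \bigcup_{i=1}^M \Gamma^i, \qquad I_{\widehat{\mathcal{T}}} := \big\{ I_{\widehat{\mathcal{T}}^i} \big\}_{i=1}^M, \qquad X_{\widehat{\mathcal{T}}} := \big\{ X^i_{\widehat{\mathcal{T}}^i} \big\}_{i=1}^M. \]
Moreover, we say that $(\mathcal{T}, \Gamma)$ is a pair of mesh-surface approximation when $\mathcal{T} \in \mathbb{T}$ and $\Gamma := \Gamma_{\mathcal{T}}$. Also, for a subdivision $\mathcal{T} \in \mathbb{T}$, we denote by $\mathcal{S}_{\mathcal{T}}$ the set of interior faces (edges if $d = 2$). Finally, we define the finite element space over $\mathcal{T}$
\[ \mathbb{V}(\mathcal{T}) := \Big\{ V \in C^0(\Gamma) : V|_{\Gamma^i} \text{ is the lift of some } \widehat{V}^i \in \widehat{\mathbb{V}}^i \text{ via } X^i_{\widehat{\mathcal{T}}^i}, \text{ with } V = 0 \text{ on } \partial\Gamma \text{ or } \int_\Gamma V = 0 \text{ if } \partial\Gamma = \emptyset \Big\}, \tag{2.7} \]
and observe that functions in $\mathbb{V}(\mathcal{T})$ are not piecewise polynomials.

The refinement procedure consists of bisecting elements in $\overline{\mathcal{T}}_0$ and propagating its effects on $\widehat{\mathcal{T}}$ and $\mathcal{T}$ via the mappings $\overline{X}_0^{-1}$ and $X_{\widehat{\mathcal{T}}} \circ \overline{X}_0^{-1}$, respectively. Figure 2.2 depicts one bisection refinement for $d = 2$. For $\overline{\mathcal{T}}, \overline{\mathcal{T}}_* \in \overline{\mathbb{T}}$, we use the notation $\overline{\mathcal{T}}_* \ge \overline{\mathcal{T}}$ to indicate that $\overline{\mathcal{T}}_*$ is a conforming refinement of $\overline{\mathcal{T}}$. In addition, slightly abusing the notation, given two subdivisions $\mathcal{T}, \mathcal{T}_* \in \mathbb{T}$, we write $\mathcal{T}_* \ge \mathcal{T}$ to indicate that $\overline{\mathcal{T}}_* \ge \overline{\mathcal{T}}$.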
Notice that given $\mathcal{T}, \mathcal{T}_* \in \mathbb{T}$, with $\mathcal{T}_* \ge \mathcal{T}$, the finite element space $\mathbb{V}(\mathcal{T})$ is not a subspace of $\mathbb{V}(\mathcal{T}_*)$ since the associated surface approximations $\Gamma$ and $\Gamma_*$ do not match. This lack of consistency is accounted for in the discussion below taking advantage of the nested property $\widehat{\mathbb{V}}(\widehat{\mathcal{T}}) \subset \widehat{\mathbb{V}}(\widehat{\mathcal{T}}_*)$ over $\Omega^M$.

At this point, the three different subdivisions $\overline{\mathcal{T}}$, $\widehat{\mathcal{T}}$ and $\mathcal{T}$ are defined. Notice that any of these three subdivisions uniquely determines the other two, which is repeatedly used in this work. In practice only $\overline{\mathcal{T}}$ is required and the other two are recovered using the mappings $\overline{X}_0^{-1}$ and $X_{\widehat{\mathcal{T}}} \circ \overline{X}_0^{-1}$. However, they serve different theoretical purposes: the subdivision $\overline{\mathcal{T}}$ is made of flat faces obtained as refinement of the initial polyhedral surface and drives the refinement procedure; $\widehat{\mathcal{T}}$ is the triangulation on the parametric space and it is used to evaluate some quantities (geometric estimator, oscillation, etc.) associated to the AFEM in a nested framework; $\mathcal{T}$ is made of curved faces and is the subdivision defining $\Gamma = \Gamma_{\mathcal{T}}$ where the approximate PDE is solved.

Fig. 2.2. Effect of one bisection of the macro-element $\overline{X}_0(\Omega)$ when $d = 2$ and $n = 1$; the superscript $i$ is omitted for simplicity. (Left) A triangle $\overline{T} \in \overline{\mathcal{T}}_0$ is split into two triangles $\overline{T}_1, \overline{T}_2 \subset \mathbb{R}^3$. (Bottom) Equivalently, via the affine map $\overline{X}_0^{-1}$, the corresponding triangle $\widehat{T} \in \widehat{\mathcal{T}}$ is split into two triangles $\widehat{T}_1, \widehat{T}_2 \subset \mathbb{R}^2$, whereas (Right) $\gamma$ is interpolated by a new piecewise linear surface $\Gamma := X_{\widehat{\mathcal{T}}}(\Omega)$, with $X_{\widehat{\mathcal{T}}} = I_{\widehat{\mathcal{T}}}\chi$ the piecewise linear interpolant of the parametrization $\chi$ defined in $\Omega$ and subordinate to the new triangulation $\widehat{\mathcal{T}}$. The images via $X_{\widehat{\mathcal{T}}}$ of $\widehat{T}_1$ and $\widehat{T}_2$ are denoted $T_1$ and $T_2$ respectively; they are affine when $n = 1$.

2.2.2. Stability of the Lagrange Interpolation Operator. The Lagrange interpolation operator is instrumental to define the approximate surface and will be central in the definition of the geometric estimator in § 2.2.3.
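The following lemma discusses its local stability in Besov space $B_p^s(L_q)$ and Sobolev space $W^1_\infty$. We refer to § 9 for a definition of the Besov seminorms.

Lemma 2.1 (Local stability of Lagrange interpolation). Let $\widehat{\mathcal{T}}$ be any conforming refinement of $\widehat{\mathcal{T}}_0$. If $s > 0$ and $0 < p, q \le \infty$ satisfy $s > d/q$ and $s \le n + 1$, then the Lagrange interpolation operator $I_{\widehat{\mathcal{T}}}$ (with polynomial degree $n \ge 1$) is stable in $B_p^s(L_q(\widehat{T}))$, namely there exists a constant $C$ depending on $s$, $d$, $p$, $q$ and $n$ such that
\[ |I_{\widehat{\mathcal{T}}} v|_{B_p^s(L_q(\widehat{T}))} \le C\, |v|_{B_p^s(L_q(\widehat{T}))} \quad \forall\, v \in B_p^s(L_q(\widehat{T})), \tag{2.8} \]
for any $\widehat{T} \in \widehat{\mathcal{T}}$ and $\widehat{\mathcal{T}} \in \widehat{\mathbb{T}}$. The same bound is valid in $W^1_\infty(\widehat{T})$, i.e.
\[ \|\widehat\nabla I_{\widehat{\mathcal{T}}} v\|_{L_\infty(\widehat{T})} \le C\, \|\widehat\nabla v\|_{L_\infty(\widehat{T})} \quad \forall\, v \in W^1_\infty(\widehat{T}). \tag{2.9} \]

Proof. We consider an arbitrary element $\widehat{T} \in \widehat{\mathcal{T}}$. If $m < s \le m + 1$, with $m \le n$, then for any $P \in \mathbb{P}_m(\widehat{T})$ (the space of polynomials of degree $\le m$ over $\widehat{T}$) we have $P = I_{\widehat{\mathcal{T}}} P$ and the following holds
\[ |I_{\widehat{\mathcal{T}}} v|_{B_p^s(L_q(\widehat{T}))} = |I_{\widehat{\mathcal{T}}}(v - P)|_{B_p^s(L_q(\widehat{T}))} \lesssim h_{\widehat{T}}^{-s}\, \|I_{\widehat{\mathcal{T}}}(v - P)\|_{L_q(\widehat{T})}, \]
where in the last inequality we use an inverse estimate for Besov semi-norms, which is obtained by scaling. We now introduce $w := v - P$ and estimate $\|I_{\widehat{\mathcal{T}}} w\|_{L_q(\widehat{T})}$. First, scaling to the reference simplex $\widehat{T}_R$ we get
\[ \|I_{\widehat{\mathcal{T}}} w\|_{L_q(\widehat{T})} \lesssim h_{\widehat{T}}^{d/q}\, \|w\|_{L_\infty(\widehat{T}_R)} \lesssim h_{\widehat{T}}^{d/q}\, \|w\|_{B_p^s(L_q(\widehat{T}_R))} \]
because $B_p^s(L_q(\widehat{T}_R))$ is embedded in $L_\infty(\widehat{T}_R)$ in view of $sq > d$. Notice that we do not distinguish between the function $w$ defined on $\widehat{T}$ and the corresponding one defined on the reference element $\widehat{T}_R$. We recall that
\[ \|w\|_{B_p^s(L_q(\widehat{T}_R))} = \|w\|_{L_q(\widehat{T}_R)} + |w|_{B_p^s(L_q(\widehat{T}_R))}, \]
according to the definitions of § 9, and scale back to $\widehat{T}$:
\[ h_{\widehat{T}}^{d/q}\, \|w\|_{B_p^s(L_q(\widehat{T}_R))} \lesssim \|w\|_{L_q(\widehat{T})} + h_{\widehat{T}}^{s}\, |w|_{B_p^s(L_q(\widehat{T}))}. \]
Combining previous estimates with the immediate generalization of [22, Lemma 4.15]
\[ \inf_{P \in \mathbb{P}_m} \|v - P\|_{L_q(\widehat{T})} \lesssim |\widehat{T}|^{s/d}\, |v|_{B_p^s(L_q(\widehat{T}))}, \]
and the property $|P|_{B_p^s(L_q(\widehat{T}))} = 0$, which gives $|w|_{B_p^s(L_q(\widehat{T}))} = |v|_{B_p^s(L_q(\widehat{T}))}$, we conclude the desired result (2.8)
\[ |I_{\widehat{\mathcal{T}}} v|_{B_p^s(L_q(\widehat{T}))} \lesssim \inf_{P \in \mathbb{P}_m} h_{\widehat{T}}^{-s}\, \|v - P\|_{L_q(\widehat{T})} + |v|_{B_p^s(L_q(\widehat{T}))} \lesssim |v|_{B_p^s(L_q(\widehat{T}))}. \]
To prove the stability bound (2.9), we take advantage of the representation $I_{\widehat{\mathcal{T}}} v = \sum_{j=1}^{d+1} v(z_j)\,\phi_{z_j}$ in terms of the canonical basis functions $\phi_{z_j} \in \mathbb{P}_n(\widehat{T})$. Then
\[ \widehat\nabla I_{\widehat{\mathcal{T}}} v = \sum_{j=1}^{d+1} \big( v(z_j) - v(z_l) \big)\, \widehat\nabla \phi_{z_j}, \qquad 1 \le l \le d + 1, \]
where we exploit that $\{\phi_{z_j}\}_j$ is a partition of unity over $\widehat{T}$, i.e. $\sum_{j=1}^{d+1} \phi_{z_j} = 1$. Since any sequence of meshes in the flat parametric domain $\Omega$ obtained by successive bisections is shape regular, using inverse estimates in $\widehat{T}$ and interpolation in $L_\infty(\widehat{T})$ yields
\[ \|\widehat\nabla I_{\widehat{\mathcal{T}}} v\|_{L_\infty(\widehat{T})} \le \sum_{j=1}^{d+1} \|\widehat\nabla \phi_{z_j}\|_{L_\infty(\widehat{T})} \max_{1 \le j \le d+1} \big| v(z_j) - v(z_l) \big| \le C\, \|\widehat\nabla v\|_{L_\infty(\widehat{T})}, \]
where $C > 0$ is a geometric constant independent of $\gamma$ and proportional to the sum $\sum_{j=1}^{d+1} \|\widehat\nabla \phi_{z_j}\|_{L_\infty(\widehat{T})}$, which depends only on $n$ and $d$. This concludes the proof.

2.2.3. Shape Regularity and Geometric Estimators. The proof of Lemma 2.1 utilizes direct and inverse estimates that rely on the shape regularity of elements $\widehat{T} \in \widehat{\mathcal{T}} \in \widehat{\mathbb{T}}$. A discussion about shape regularity of the forests $\overline{\mathbb{T}}$, $\widehat{\mathbb{T}}$ and $\mathbb{T}$ is in order. The forest $\overline{\mathbb{T}}$ induced by bisection on the flat faces of the initial subdivision $\overline{\mathcal{T}}_0$ is shape regular [3, 27, 30] and so is its counterpart $\widehat{\mathbb{T}}$ on the parametric domain. Regarding the forest $\mathbb{T}$, the question is more subtle and we start with a definition.

Definition 2.2 (Shape regularity). We say that the class of conforming meshes $\mathbb{T}$ is shape regular if there is a constant $C_0$ only depending on $\mathcal{T}_0$, such that for all $\widehat{\mathcal{T}} \in \widehat{\mathbb{T}}$ and all $i = 1, \dots, M$,
\[ C_0^{-1}|\hat{x} - \hat{y}| \le |X^i_{\widehat{\mathcal{T}}^i}(\hat{x}) - X^i_{\widehat{\mathcal{T}}^i}(\hat{y})| \le C_0|\hat{x} - \hat{y}| \quad \forall\, \hat{x}, \hat{y} \in \widehat{T},\ \forall\, \widehat{T} \in \widehat{\mathcal{T}}^i. \tag{2.10} \]

We have already noted that $\widehat{\mathbb{T}}$ is shape regular and observe that (2.10) states that the deformation of $\widehat{T} \in \widehat{\mathcal{T}}^i$ leading to $T \in \mathcal{T}^i$ does not degenerate. We also point out that (2.10) implies the usual condition on the Jacobian $\widehat\nabla X^i_{\widehat{\mathcal{T}}^i}$, valid for a.e. $\hat{x} \in \Omega$
\[ C_0^{-1}|w| \le |\widehat\nabla X^i_{\widehat{\mathcal{T}}^i}(\hat{x})\, w| \le C_0|w| \quad \forall\, w \in \mathbb{R}^d, \tag{2.11} \]
and that $\widehat\nabla X^i_{\widehat{\mathcal{T}}^i}$ happens to be constant on $\widehat{T}$ for an affine map $X^i_{\widehat{\mathcal{T}}^i}$ [14].

We stress that a bi-Lipschitz parametrization satisfying (2.1) does not guarantee that $\mathbb{T}$ is shape regular. This issue has been tackled in [9] assuming that the surface $\gamma$ is $W^2_\infty$ and $\overline{\mathcal{T}}_0$ is sufficiently fine. We present a similar result in Lemma 2.4, invoking piecewise $C^1$-regularity of $\gamma$, which hinges on the quasi-monotonicity of the geometric estimator $\lambda_{\widehat{\mathcal{T}}}$, which we prove first in Lemma 2.3.

We start with the definition of $\lambda_{\widehat{\mathcal{T}}}$. Since there is a one-to-one correspondence between subdivisions $\mathcal{T} \in \mathbb{T}$ defining the surface interpolant $\Gamma = \Gamma_{\mathcal{T}}$ and subdivisions $\widehat{\mathcal{T}} \in \widehat{\mathbb{T}}$ of $M$ copies of the parametric domain $\Omega$, we define $\lambda_{\widehat{\mathcal{T}}}$ over $\Omega$ and use the subscript $\widehat{\mathcal{T}}$. For $1 \le i \le M$ and $\widehat{T} \in \widehat{\mathcal{T}}^i$, let the geometric element indicator be
\[ \lambda_{\widehat{\mathcal{T}}^i}(\widehat{T}) := \|\widehat\nabla(\chi^i - X^i_{\widehat{\mathcal{T}}^i})\|_{L_\infty(\widehat{T})} = \|\widehat\nabla(\chi^i - I_{\widehat{\mathcal{T}}^i}\chi^i)\|_{L_\infty(\widehat{T})}, \tag{2.12} \]
and the corresponding geometric estimator be
\[ \lambda_{\widehat{\mathcal{T}}} := \max_{i=1,\dots,M}\ \max_{\widehat{T} \in \widehat{\mathcal{T}}^i} \lambda_{\widehat{\mathcal{T}}^i}(\widehat{T}). \tag{2.13} \]
The geometric estimator may not decrease upon refinement, especially in the preasymptotic regime, but the following quasi-monotonicity property is valid instead.

Lemma 2.3 (Quasi-monotonicity of the geometric estimator). There exists a constant $\Lambda_0 > 1$, solely depending on $\overline{\mathcal{T}}_0$, the polynomial degree $n$, and dimension $d$, such that
\[ \lambda_{\widehat{\mathcal{T}}_*} \le \Lambda_0\, \lambda_{\widehat{\mathcal{T}}} \tag{2.14} \]
for any $\widehat{\mathcal{T}}, \widehat{\mathcal{T}}_* \in \widehat{\mathbb{T}}$ with $\widehat{\mathcal{T}}_* \ge \widehat{\mathcal{T}}$. This bound holds elementwise as well.

Proof. We consider an arbitrary element $\widehat{T} \in \widehat{\mathcal{T}}$, i.e. $\widehat{T} \in \widehat{\mathcal{T}}^i$ for some $i$, but do not write explicitly the superscript $i$. We further observe that the Lagrange interpolation operator $I_{\widehat{\mathcal{T}}_*}$ is invariant on polynomials of degree $\le n$ over $\widehat{T}$, so that $I_{\widehat{\mathcal{T}}_*} I_{\widehat{\mathcal{T}}}\chi = I_{\widehat{\mathcal{T}}}\chi$ on $\widehat{T}$, whence
\[ \|\widehat\nabla(\chi - I_{\widehat{\mathcal{T}}_*}\chi)\|_{L_\infty(\widehat{T})} = \|\widehat\nabla(\chi - I_{\widehat{\mathcal{T}}}\chi) - \widehat\nabla I_{\widehat{\mathcal{T}}_*}(\chi - I_{\widehat{\mathcal{T}}}\chi)\|_{L_\infty(\widehat{T})}. \]
From the local $W^1_\infty$ stability bound (2.9) of $I_{\widehat{\mathcal{T}}_*}$, we deduce the existence of $C > 0$, solely depending on $\overline{\mathcal{T}}_0$, $d$ and $n$, such that
\[ \|\widehat\nabla I_{\widehat{\mathcal{T}}_*}(\chi - I_{\widehat{\mathcal{T}}}\chi)\|_{L_\infty(\widehat{T})} \le C\, \|\widehat\nabla(\chi - I_{\widehat{\mathcal{T}}}\chi)\|_{L_\infty(\widehat{T})}. \]
The desired estimate (2.14) thus follows with $\Lambda_0 = 1 + C$.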
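The element indicator (2.12), the estimator (2.13), and a greedy refinement in the spirit of ADAPT SURFACE can be illustrated in the simplest possible setting. The sketch below is an assumption-laden toy, not the module analyzed in the paper: it takes $d = 1$, $n = 1$, a single patch, and the hypothetical parametrization $\chi(t) = (\cos t, \sin t)$ on $\Omega = [0, \pi/2]$; the $L_\infty$ norm is approximated by sampling, and every element whose indicator exceeds the tolerance is bisected. All function names are introduced here for illustration.

```python
import math


def chi(t):
    # Assumed parametrization: arc of the unit circle.
    return (math.cos(t), math.sin(t))


def dchi(t):
    # Derivative of chi with respect to the parameter t.
    return (-math.sin(t), math.cos(t))


def element_indicator(a, b, samples=9):
    """Sampled version of (2.12) on the element [a, b] for n = 1.

    The gradient of the linear interpolant is the constant chord slope.
    Sampling (endpoints included) approximates the L_infinity norm from below.
    """
    slope = [(cb - ca) / (b - a) for ca, cb in zip(chi(a), chi(b))]
    worst = 0.0
    for k in range(samples):
        t = a + (b - a) * k / (samples - 1)
        diff = [g - s for g, s in zip(dchi(t), slope)]
        worst = max(worst, math.hypot(diff[0], diff[1]))
    return worst


def geometric_estimator(mesh):
    # Analogue of (2.13): maximum of the element indicators.
    return max(element_indicator(a, b) for a, b in mesh)


def adapt_surface(mesh, tau):
    """Greedy loop: bisect every element with indicator > tau until (1.3) holds."""
    mesh = list(mesh)
    while geometric_estimator(mesh) > tau:
        refined = []
        for a, b in mesh:
            if element_indicator(a, b) > tau:
                mid = 0.5 * (a + b)
                refined += [(a, mid), (mid, b)]
            else:
                refined.append((a, b))
        mesh = refined
    return mesh


coarse = [(0.0, math.pi / 2)]
fine = adapt_surface(coarse, tau=0.1)
print(len(fine), geometric_estimator(coarse), geometric_estimator(fine))
```

For this smooth curve the indicator behaves like half the element size, so it decreases under bisection; the quasi-monotonicity constant $\Lambda_0$ of Lemma 2.3 matters for less regular parametrizations and in the preasymptotic regime.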
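This result turns out to be critical not only for Lemma 2.4 below, which guarantees the shape regularity of $\mathbb{T}$, but also to control the possible increase of the geometric estimator due to the ADAPT_PDE calls. We reiterate that bi-Lipschitz parametrizations satisfying (2.1) do not guarantee that $\mathbb{T}$ is shape regular.

Lemma 2.4 (Shape regularity). The forest $\mathbb{T} = \mathbb{T}(\mathcal{T}_0)$ is shape-regular with constant $C_0 = 2L$ provided
$$ \lambda_{\widehat{\mathcal T}_0} \le \frac{1}{2\Lambda_0 L}, \tag{2.15} $$
with $L \ge 1$ the non-degeneracy constant in (2.1) and $\Lambda_0 > 1$ the constant in (2.14).

Proof. Let $\mathcal{T} \in \mathbb{T}$ be an arbitrary mesh. For any $T \in \mathcal{T}$, we recall that $T$ belongs to a mesh patch $\mathcal{T}^i$ for some $i$, which we do not write explicitly. Let $\widehat T$ be the corresponding element in $\widehat{\mathcal T}$. Since for $\hat x, \hat y \in \widehat T$
$$ |(\chi - X_{\widehat{\mathcal T}})(\hat x) - (\chi - X_{\widehat{\mathcal T}})(\hat y)| \le |\hat x - \hat y|\, \|\widehat\nabla(\chi - X_{\widehat{\mathcal T}})\|_{L_\infty(\widehat T)} = |\hat x - \hat y|\, \lambda_{\widehat{\mathcal T}}(\widehat T), $$
and (2.14) together with (2.15) gives $\lambda_{\widehat{\mathcal T}}(\widehat T) \le \Lambda_0 \lambda_{\widehat{\mathcal T}_0} \le (2L)^{-1}$, the shape-regularity assertion is a consequence of (2.1) and (2.14).

We refer to [6, Figure 11] for an intermediate degenerate situation in which $\lambda_{\widehat{\mathcal T}_1} > (2\Lambda_0 L)^{-1}$ and (2.15) is violated for polynomial degree $n = 1$.

3. The Laplace-Beltrami Operator. In this section, we start the discussion with basic differential geometry properties leading to the definition of the Laplace-Beltrami operator $\Delta_\gamma$ together with other relevant geometric operators. We then derive a weak formulation of $-\Delta_\gamma u = f$ as well as its finite element counterpart. We assume $\gamma$ to be piecewise $C^1$, i.e., $\chi^i \in C^1(\Omega)^{d+1}$ for all $1 \le i \le M$, and let $\Gamma$ denote its piecewise polynomial approximation. In the discussion below we remove the superscript $i$, because no confusion is possible.

3.1. Basic Differential Geometry. In this subsection we recall a matrix formulation of some basic differential geometry facts and refer to [6] for details. Our first task is to relate the gradient $\widehat\nabla$ in the parametric domain $\Omega$ with the tangential gradient $\nabla_\gamma$ on $\gamma$. To this end, let $\mathbf{T} \in \mathbb{R}^{(d+1)\times d}$ be the matrix
$$ \mathbf{T} := \mathbf{T}_\gamma := [\widehat\partial_1\chi, \dots, \widehat\partial_d\chi], $$
whose $j$-th column $\widehat\partial_j\chi \in \mathbb{R}^{d+1}$ is the vector of partial derivatives of $\chi$ with respect to the $j$-th coordinate of $\Omega$. Since $\chi$ is a diffeomorphism, the set $\{\widehat\partial_j\chi\}_{j=1}^d$ of tangent vectors to $\gamma$ is well defined, linearly independent, and spans the tangent hyperplane to each $\gamma^i$ at interior points for all $1 \le i \le M$. The first fundamental form of $\gamma$ is the symmetric and positive definite matrix $\mathbf{G} \in \mathbb{R}^{d\times d}$ defined by
$$ \mathbf{G} = \big(g_{\gamma,ij}\big)_{1\le i,j\le d} := \big(\widehat\partial_i\chi^T\, \widehat\partial_j\chi\big)_{1\le i,j\le d} = \mathbf{T}^T\mathbf{T}. \tag{3.1} $$
Given $v : \gamma \to \mathbb{R}$, the tangential gradient $\nabla_\gamma v(\tilde x) = \sum_{i=1}^d \alpha_i(\hat x)\,\widehat\partial_i\chi(\hat x)$ satisfies the relation
$$ \widehat\nabla \hat v = \nabla_\gamma v\, \mathbf{T}. \tag{3.2} $$
To get the reverse relation, we augment $\mathbf{T}$ to the matrix $\widetilde{\mathbf T} \in \mathbb{R}^{(d+1)\times(d+1)}$ by adding the (outer) unit normal $\nu = (\nu_1, \dots, \nu_{d+1}) \in \mathbb{R}^{d+1}$ to the tangent hyperplane $\operatorname{span}\{\widehat\partial_i\chi\}_{i=1}^d$ to $\gamma$ as the last column, namely
$$ \widetilde{\mathbf T} := [\mathbf{T}, \nu^T] = [\widehat\partial_1\chi, \dots, \widehat\partial_d\chi, \nu^T]. $$
Since $\widetilde{\mathbf T}$ is invertible, we let $\widetilde{\mathbf D} = \widetilde{\mathbf T}^{-1}$ and realize that
$$ \nabla_\gamma v = \nabla_\gamma v\, \widetilde{\mathbf T}\widetilde{\mathbf D} = [\widehat\nabla\hat v, 0]\, \widetilde{\mathbf D} = \widehat\nabla\hat v\, \mathbf{D}, \tag{3.3} $$
where $\mathbf{D} \in \mathbb{R}^{d\times(d+1)}$ results from $\widetilde{\mathbf D}$ by suppressing its last row. Moreover, the first fundamental form $\mathbf{G}$ has inverse $\mathbf{G}^{-1} = \mathbf{D}\mathbf{D}^T$. We let
$$ q := \sqrt{\det \mathbf{G}} \tag{3.4} $$
be the area element of $\gamma$ and point out the change of variables formula
$$ \int_\omega \hat v\, q = \int_{\chi(\omega)} v. \tag{3.5} $$
When $\chi$ is $C^2$ and $v \in H^2(\gamma)$ ($\hat v \in H^2(\Omega)$), we have the compact expression for the Laplace-Beltrami operator on $\gamma$
$$ \Delta_\gamma v = \frac{1}{q}\, \widehat{\operatorname{div}}\big(q\, \widehat\nabla\hat v\, \mathbf{G}^{-1}\big). $$
The above representation is instrumental to derive the following integration by parts formula on surfaces
$$ \int_\gamma \nabla_\gamma w\, \nabla_\gamma^T v = -\int_\gamma \Delta_\gamma w\, v + \int_{\partial\gamma} \nabla_\gamma w\, n^T v \qquad \forall\, v, w \in H^2(\gamma), \tag{3.6} $$
where $n$ is the unit co-normal on $\partial\gamma$ pointing outside $\gamma$.

The discussion above applies as well to the piecewise polynomial surface $\Gamma$ (recall that we dropped the index specifying the considered patch). We denote the corresponding matrices $\mathbf{T}_\Gamma = \widehat\nabla X_{\widehat{\mathcal T}}$ and $\mathbf{D}_\Gamma$ associated with $X_{\widehat{\mathcal T}} : \Omega \to \Gamma$, and get
$$ \nabla_\Gamma v = \widehat\nabla\hat v\, \mathbf{D}_\Gamma. \tag{3.7} $$
The first fundamental form $\mathbf{G}_\Gamma$ of $\Gamma$ and its elementary area $q_\Gamma$ are defined by
$$ \mathbf{G}_\Gamma := \mathbf{T}_\Gamma^T\mathbf{T}_\Gamma, \qquad q_\Gamma := \sqrt{\det \mathbf{G}_\Gamma}. \tag{3.8} $$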
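The matrix relations (3.1)-(3.4) can be checked numerically on a single chart. The sketch below is only an illustration under stated assumptions: the chart is the hypothetical graph `chi(x, y) = (x, y, x² + y²)`, the helper names are made up for this example, and the matrix $\mathbf D$ is computed as $\mathbf G^{-1}\mathbf T^T$, which coincides with the first $d$ rows of the inverse of $[\mathbf T, \nu^T]$ because $\mathbf T^T\nu^T = 0$.

```python
import math

def tangent_matrix(x, y):
    """T = [d_1 chi, d_2 chi] for the hypothetical graph chi(x, y) = (x, y, x^2 + y^2)."""
    return [[1.0, 0.0],
            [0.0, 1.0],
            [2.0 * x, 2.0 * y]]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(G):
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [[G[1][1] / det, -G[0][1] / det],
            [-G[1][0] / det, G[0][0] / det]]

x, y = 0.5, 0.25
T = tangent_matrix(x, y)
G = matmul(transpose(T), T)                              # first fundamental form (3.1)
q = math.sqrt(G[0][0] * G[1][1] - G[0][1] * G[1][0])     # area element (3.4)
G_inv = inverse_2x2(G)
D = matmul(G_inv, transpose(T))                          # d x (d+1) matrix with D T = I

grad_hat = [[2.0 * x, 2.0 * y]]      # row vector: parametric gradient of v_hat(x, y) = x^2 + y^2
grad_gamma = matmul(grad_hat, D)     # tangential gradient via (3.3)
back = matmul(grad_gamma, T)         # (3.2) should recover grad_hat

print(q, grad_gamma, back)
```

Gradients are stored as row vectors to match the convention of (3.2) and (3.3).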
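The corresponding expression of the Laplace-Beltrami operator is
$$ \Delta_\Gamma v = \frac{1}{q_\Gamma}\, \widehat{\operatorname{div}}\big(q_\Gamma\, \widehat\nabla\hat v\, \mathbf{G}_\Gamma^{-1}\big), \tag{3.9} $$
and only makes sense elementwise. In addition, we recall that for $T \in \mathcal{T}$ and $S$ a side of $T$, the unit co-normal $n_\Gamma$ on $S$ pointing outside $T$ satisfies
$$ \hat n = \frac{r_\Gamma}{q_\Gamma}\, n_\Gamma \mathbf{T}_\Gamma, \qquad n_\Gamma = \frac{q_\Gamma}{r_\Gamma}\, \hat n\, \mathbf{D}_\Gamma, \tag{3.10} $$
where $r_\Gamma$ is the area element associated with the subsimplex $\widehat S := X_{\mathcal T}^{-1}(S)$ (see [6] for a detailed expression). Hence, (3.10) and (3.7) give the following local expression for the tangential derivative of $v$ in the direction $n_\Gamma$ on $S$
$$ \nabla_\Gamma v \cdot n_\Gamma = \frac{q_\Gamma}{r_\Gamma}\, \widehat\nabla\hat v\, \mathbf{G}_\Gamma^{-1}\big|_{\widehat S} \cdot \hat n. \tag{3.11} $$
This is of particular importance when considering residual type estimators as in the present work; see § 4.

3.2. Variational Formulation and Galerkin Method. We start by introducing relevant Lebesgue and Sobolev spaces. Let
$$ L_{2,\#}(\gamma) := \Big\{ v \in L_2(\gamma) : \int_\gamma v = 0 \ \text{if } \partial\gamma = \emptyset \Big\} $$
be the subspace of $L_2(\gamma)$ of functions with vanishing mean value whenever the surface $\gamma$ is closed, and let $H^1_\#(\gamma)$ be the subspace of $H^1(\gamma)$ given by
$$ H^1_\#(\gamma) := \Big\{ v \in L_{2,\#}(\gamma) : \nabla_\gamma v|_{\gamma^i} \in [L_2(\gamma^i)]^{d+1},\ v|_{\gamma^i} = v|_{\gamma^j} \text{ on } \gamma^i \cap \gamma^j,\ 1 \le i, j \le M,\ v = 0 \text{ on } \partial\gamma \Big\}, $$
where $\nabla_\gamma$ and traces are well defined in each component $\gamma^i$ due to (3.3). Let the weak form of the Laplace-Beltrami operator $\Delta_\gamma v$ for any function $v \in H^1_\#(\gamma)$ be
$$ \langle -\Delta_\gamma v, \varphi \rangle := \sum_{i=1}^M \int_{\gamma^i} \nabla_\gamma v \cdot \nabla_\gamma^T \varphi \qquad \forall\, \varphi \in H^1_\#(\gamma), \tag{3.12} $$
where $\langle\cdot,\cdot\rangle$ denotes the $(H^1_\#(\gamma))^*$-$H^1_\#(\gamma)$ duality product.

We now build on (3.12) and write the weak formulation of $-\Delta_\gamma u = f$ as follows: given $f \in L_{2,\#}(\gamma)$, we seek $u \in H^1_\#(\gamma)$ satisfying
$$ \int_\gamma \nabla_\gamma u \cdot \nabla_\gamma^T \varphi = \int_\gamma f \varphi \qquad \forall\, \varphi \in H^1_\#(\gamma), \tag{3.13} $$
where we have written $\int_\gamma \nabla_\gamma u \cdot \nabla_\gamma^T\varphi$ to denote $\sum_{i=1}^M \int_{\gamma^i} \nabla_\gamma u \cdot \nabla_\gamma^T\varphi$. Existence and uniqueness of a solution $u \in H^1_\#(\gamma)$ is a consequence of the Lax-Milgram theorem provided $\gamma$ is Lipschitz.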
When $\chi^i$ is $C^2$ and $u \in H^2(\gamma^i)$ for each $1 \le i \le M$, we showed in [6] that on each component $\gamma^i$ we have
$$ -\Delta_\gamma u = f \quad \text{in } \operatorname{int}(\gamma^i) := \chi^i(\operatorname{int}(\Omega)), \qquad 1 \le i \le M, \tag{3.14} $$
together with vanishing jump conditions at the interfaces $\gamma^i \cap \gamma^j$
$$ \mathcal{J}(u)|_{\gamma^i\cap\gamma^j} := \nabla_{\gamma^i} u \cdot \tilde n^i + \nabla_{\gamma^j} u \cdot \tilde n^j = 0 \qquad \forall\, 1 \le i, j \le M, \tag{3.15} $$
where $\tilde n^i$ is the unit outer normal to $\partial\gamma^i$ in the tangent plane to $\gamma^i$.

Given $\mathcal{T} \in \mathbb{T}$, we next formulate an approximation to the Laplace-Beltrami operator on the piecewise polynomial interpolant $\Gamma = \Gamma_{\mathcal T}$ of $\gamma$ as follows. If $F_\Gamma \in L_{2,\#}(\Gamma)$ is a suitable approximation of $f$, then the finite element solution $U : \Gamma \to \mathbb{R}$ solves
$$ U \in \mathbb{V}(\mathcal{T}) : \qquad \int_\Gamma \nabla_\Gamma U\, \nabla_\Gamma^T V = \int_\Gamma F_\Gamma V \qquad \forall\, V \in \mathbb{V}(\mathcal{T}), \tag{3.16} $$
where again $\int_\Gamma g = \sum_{i=1}^M \int_{\Gamma^i} g^i$. To this end we choose $F_\Gamma$ to be
$$ F_\Gamma := f\, \frac{q}{q_\Gamma}, \tag{3.17} $$
because this specific choice of $F_\Gamma$ satisfies the compatibility property
$$ \int_\Gamma F_\Gamma = \int_\gamma f = 0 \tag{3.18} $$
whenever $\gamma$ is closed, and allows us to handle separately the approximation of surface $\gamma$ and forcing $f$. In particular, (3.16) admits a unique solution $U$ as a consequence of the Lax-Milgram theorem.

4. A Posteriori Error Analysis. In order to study the discrepancy between $u$ and $U$ we need to agree on comparing them in a common domain, say $\gamma$. Our goal is thus to obtain a posteriori error estimates for the energy error $\|\nabla_\gamma(u - U)\|_{L_2(\gamma)}$. This requires developing an a priori error analysis for the interpolation error committed in replacing $\gamma$ by $\Gamma$ in (3.16), which is a sort of consistency error, and its impact on the PDE error. We are concerned with these issues in this section and refer to [17, 24] where they are addressed in a different framework as well as [6] that discusses the case $n = 1$. We again drop the superscript $i$ that identifies the surface patch.

4.1. Geometric Error and Estimator. We now quantify the error arising from interpolating $\gamma$, the so-called geometric error.
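To this end we resort to the matrix formulation of § 3.1 to relate the geometric error with the geometric estimator $\lambda_{\widehat{\mathcal T}}$ of (1.2). Given $T \in \mathcal{T}$, we will deal with the regions $\widehat T \in \widehat{\mathcal T}$ and $\widetilde T \subset \gamma$ given by
$$ \widehat T := X_{\mathcal T}^{-1}(T), \qquad \widetilde T := \chi(\widehat T). \tag{4.1} $$
On mapping back and forth to $\widehat T$, and using (3.5), we easily see that
$$ \int_T v = \int_{\widetilde T} v\, \frac{q_\Gamma}{q}. \tag{4.2} $$
The consistency error stems from the different bilinear forms of the continuous and discrete equations and reads [6, Lemma 5.1]:
$$ \int_\Gamma \nabla_\Gamma v\, \nabla_\Gamma^T w - \int_\gamma \nabla_\gamma v\, \nabla_\gamma^T w = \int_\gamma \nabla_\gamma v\, \mathbf{E}_\Gamma \nabla_\gamma^T w \qquad \forall\, v, w \in H^1(\gamma), \tag{4.3} $$
where $\mathbf{E}_\Gamma \in \mathbb{R}^{(d+1)\times(d+1)}$ stands for the following error matrix
$$ \mathbf{E}_\Gamma := \frac{1}{q}\, \mathbf{T}\big(q_\Gamma \mathbf{G}_\Gamma^{-1} - q\, \mathbf{G}^{-1}\big)\mathbf{T}^T. \tag{4.4} $$
Corollary 5.3 in [6] provides the following conditional estimate on the consistency error: if $\lambda_{\widehat{\mathcal T}_0}$ satisfies
$$ \lambda_{\widehat{\mathcal T}_0} \le \frac{1}{6\Lambda_0 L^3}, \tag{4.5} $$
then we have, for $\mathcal{T} \in \mathbb{T}$,
$$ \|\mathbf{E}_\Gamma\|_{L_\infty(\widehat T)} \preccurlyeq \lambda_{\widehat{\mathcal T}}(\widehat T) \qquad \forall\, T \in \mathcal{T}, \tag{4.6} $$
where the hidden constant depends on $\widehat{\mathcal T}_0$ and the Lipschitz constant $L$ of $\gamma$ appearing in (2.1). The consistency error estimate (4.6) relies on the following properties for $q_\Gamma$, $r_\Gamma$, $\mathbf{G}_\Gamma$, $\mathbf{D}_\Gamma$ and $\nu_\Gamma$, which will be used again later. Their proofs can be found in [6, Lemmas 5.2 and 5.4] except that of $r_\Gamma$.

Lemma 4.1 (Properties of $q_\Gamma$, $r_\Gamma$, $\mathbf{G}_\Gamma$, $\mathbf{D}_\Gamma$ and $\nu_\Gamma$). If $\lambda_{\widehat{\mathcal T}_0}$ satisfies (4.5), then the matrices $\mathbf{G}$ and $\mathbf{G}_\Gamma$ have eigenvalues in the intervals $[L^{-2}, L^2]$ and $[\tfrac12 L^{-2}, \tfrac32 L^2]$, respectively. Moreover, the forest $\mathbb{T}$ is shape regular, $L^{-d} \preccurlyeq q, q_\Gamma \preccurlyeq L^d$, and for $\mathcal{T} \in \mathbb{T}$
$$ \|q - q_\Gamma\|_{L_\infty(\widehat T)} + \|r - r_\Gamma\|_{L_\infty(\partial\widehat T)} + \|\mathbf{G} - \mathbf{G}_\Gamma\|_{L_\infty(\widehat T)} + \|\nu - \nu_\Gamma\|_{L_\infty(\widehat T)} + \|\mathbf{D} - \mathbf{D}_\Gamma\|_{L_\infty(\widehat T)} \preccurlyeq \lambda_{\widehat{\mathcal T}} \qquad \forall\, \widehat T \in \widehat{\mathcal T}, \tag{4.7} $$
where we recall that $\Gamma = \Gamma_{\mathcal T}$.

We stress that if $\widehat{\mathcal T}_0$ does not satisfy (4.5), then the algorithm AFEM of § 5 will first refine $\widehat{\mathcal T}_0$ to make it comply with (4.5) without ever solving the discretized PDE. In this sense, (4.5) is not a serious restriction for AFEM, although it is necessary for the subsequent theory. We also note that (4.5) implies (2.15) because $L \ge 1$ in (2.1). We finally point out the equivalence of norms on $\gamma$ and $\Gamma$ provided (4.5) is valid.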
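Lemma 4.2 (Equivalence of norms). If $\lambda_{\widehat{\mathcal T}_0}$ satisfies (4.5), then the following equivalence of norms holds for all $\mathcal{T} \in \mathbb{T}$ with constants depending on $\mathcal{T}_0$ and $L$
$$ \|v\|_{L_2(\widetilde T)} \approx \|v\|_{L_2(T)} \approx \|v\|_{L_2(\widehat T)}, \qquad |v|_{H^1(\widetilde T)} \approx |v|_{H^1(T)} \approx |v|_{H^1(\widehat T)} \qquad \forall\, T \in \mathcal{T}, \tag{4.8} $$
where $\widehat T = X_{\widehat{\mathcal T}}^{-1}(T)$ and $\widetilde T = \chi(\widehat T)$.

Proof. The first assertion follows directly from (4.2) and Lemma 4.1, which implies $L^{-2d} \preccurlyeq q/q_\Gamma \preccurlyeq L^{2d}$. For the second equivalence, we note that (3.5), (3.7) and (3.2) readily imply that
$$ \int_T \nabla_\Gamma v\, \nabla_\Gamma^T w = \int_{\widetilde T} \frac{q_\Gamma}{q}\, \nabla_\gamma v\, \mathbf{T}\mathbf{D}_\Gamma\mathbf{D}_\Gamma^T\mathbf{T}^T \nabla_\gamma^T w \qquad \forall\, v, w \in H^1_\#(\gamma). $$
This, combined with the spectral estimate given in Lemma 4.1 for $\mathbf{G}_\Gamma^{-1} = \mathbf{D}_\Gamma\mathbf{D}_\Gamma^T$ and (2.2) for $\mathbf{T} = \widehat\nabla\chi$, yields the second equivalence. Similar reasoning applies to $\widehat T$.

4.2. Inverse Estimates for Discrete Geometric Quantities. We now establish some inverse estimates for the discrete quantities $q_\Gamma$ and $\mathbf{G}_\Gamma$ that are instrumental to derive Lemma 4.9 (reduction of residual estimator) and Lemma 7.7 (local decay of oscillation). These estimates are only required when the polynomial degree $n > 1$, which is a key distinction between this work and [6].

In the following, for $T \in \mathcal{T}$, we use the notation $h_T := |\widehat T|^{1/d}$ as a measure of the diameter of $T$, where $\widehat T = X_{\widehat{\mathcal T}}^{-1}(T) \in \widehat{\mathcal T}$ is the preimage of $T \in \mathcal{T}$ in $\Omega$. This choice is motivated by the resulting reduction property after $b \ge 1$ bisections of $\widehat T$:
$$ h_{T'} \le 2^{-b/d}\, h_T, \tag{4.9} $$
where $T'$ is the curvilinear element corresponding to any element $\widehat T' \subset \widehat T$.

Lemma 4.3 (Inverse inequalities in $L_p$). If $\lambda_{\widehat{\mathcal T}_0}$ satisfies (4.5), then the following estimates hold for all $1 \le p \le \infty$, $\mathcal{T}, \mathcal{T}_* \in \mathbb{T}$ and $\mathcal{T} \le \mathcal{T}_*$, with constants depending on $\mathcal{T}_0$ and $L$
$$ \|D q_\Gamma\|_{L_p(\widehat T)} \preccurlyeq h_T^{\frac dp - 1}, \qquad \|D\mathbf{G}_\Gamma^{-1}\|_{L_p(\widehat T)} \preccurlyeq h_T^{\frac dp - 1}, \tag{4.10} $$
$$ \|D(q_{\Gamma_*} - q_\Gamma)\|_{L_p(\widehat T_*)} \preccurlyeq h_T^{\frac dp}\, h_{T_*}^{-1}\, \lambda_{\widehat{\mathcal T}}, \qquad \|D(\mathbf{G}_{\Gamma_*}^{-1} - \mathbf{G}_\Gamma^{-1})\|_{L_p(\widehat T_*)} \preccurlyeq h_T^{\frac dp}\, h_{T_*}^{-1}\, \lambda_{\widehat{\mathcal T}}, \tag{4.11} $$
whenever $\widehat T \in \widehat{\mathcal T}$, $\widehat T_* \in \widehat{\mathcal T}_*$ satisfy $\widehat T_* \subset \widehat T$.

Proof. We start with $q_\Gamma = \sqrt{\det \mathbf{G}_\Gamma}$ and observe that $\partial_j q_\Gamma = \frac{1}{2 q_\Gamma}\,\partial_j \det \mathbf{G}_\Gamma$ and $\det \mathbf{G}_\Gamma$ is polynomial.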
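Using an inverse inequality for $\det \mathbf{G}_\Gamma$, along with the fact that $q_\Gamma$ is bounded from above and below (see Lemma 4.1), we obtain
$$ \|\partial_j q_\Gamma\|_{L_p(\widehat T)} \le \frac12 \|q_\Gamma^{-1}\|_{L_\infty(\widehat T)} \|\partial_j \det \mathbf{G}_\Gamma\|_{L_p(\widehat T)} \preccurlyeq \frac{1}{h_T}\|\det \mathbf{G}_\Gamma\|_{L_p(\widehat T)} \preccurlyeq \frac{1}{h_T}\|q_\Gamma^2\|_{L_p(\widehat T)} \preccurlyeq h_T^{\frac dp - 1}. $$
We now deal with $q_{\Gamma_*} - q_\Gamma$ as follows. We first write
$$ \partial_j(q_{\Gamma_*} - q_\Gamma) = \frac12\Big(\frac{1}{q_{\Gamma_*}} - \frac{1}{q_\Gamma}\Big)\partial_j \det \mathbf{G}_{\Gamma_*} + \frac{1}{2 q_\Gamma}\,\partial_j\big(\det \mathbf{G}_{\Gamma_*} - \det \mathbf{G}_\Gamma\big), $$
whence, for $\widehat T \in \widehat{\mathcal T}$, $\widehat T_* \in \widehat{\mathcal T}_*$ with $\widehat T_* \subset \widehat T$,
$$ \|\partial_j(q_{\Gamma_*} - q_\Gamma)\|_{L_p(\widehat T_*)} \preccurlyeq \|q_\Gamma - q_{\Gamma_*}\|_{L_p(\widehat T)}\, \|\partial_j \det \mathbf{G}_{\Gamma_*}\|_{L_\infty(\widehat T_*)} + \|\partial_j(\det \mathbf{G}_{\Gamma_*} - \det \mathbf{G}_\Gamma)\|_{L_p(\widehat T_*)}. $$
Using an inverse inequality for $\det \mathbf{G}_{\Gamma_*} - \det \mathbf{G}_\Gamma = q_{\Gamma_*}^2 - q_\Gamma^2$, the bounds (4.7) on $q_\Gamma$ and $q_{\Gamma_*}$ in terms of $\lambda_{\widehat{\mathcal T}}$ and $\lambda_{\widehat{\mathcal T}_*}$, and the quasi-monotonicity (2.14) of $\lambda_{\widehat{\mathcal T}}$, we get
$$ \|\partial_j(q_{\Gamma_*} - q_\Gamma)\|_{L_p(\widehat T_*)} \preccurlyeq h_{T_*}^{-1}\|q_{\Gamma_*} - q_\Gamma\|_{L_p(\widehat T)} \preccurlyeq h_T^{d/p}\, h_{T_*}^{-1}\, \lambda_{\widehat{\mathcal T}}. $$
To estimate $D\mathbf{G}_\Gamma^{-1}$ we see that $\partial_j(\mathbf{G}_\Gamma^{-1}\mathbf{G}_\Gamma) = \partial_j\mathbf{G}_\Gamma^{-1}\,\mathbf{G}_\Gamma + \mathbf{G}_\Gamma^{-1}\,\partial_j\mathbf{G}_\Gamma = 0$, whence $\partial_j\mathbf{G}_\Gamma^{-1} = -\mathbf{G}_\Gamma^{-1}\,\partial_j\mathbf{G}_\Gamma\,\mathbf{G}_\Gamma^{-1}$. This, an inverse inequality and the lower bound of the eigenvalues of $\mathbf{G}_\Gamma$ in Lemma 4.1 imply
$$ \|\partial_j\mathbf{G}_\Gamma^{-1}\|_{L_p(\widehat T)} \preccurlyeq \|\mathbf{G}_\Gamma^{-1}\|^2_{L_\infty(\widehat T)}\, \|\partial_j\mathbf{G}_\Gamma\|_{L_p(\widehat T)} \preccurlyeq h_T^{-1}\|\mathbf{G}_\Gamma\|_{L_p(\widehat T)} \preccurlyeq h_T^{\frac dp - 1}. $$
Finally, $\mathbf{G}_{\Gamma_*}^{-1} - \mathbf{G}_\Gamma^{-1} = \mathbf{G}_{\Gamma_*}^{-1}(\mathbf{G}_\Gamma - \mathbf{G}_{\Gamma_*})\mathbf{G}_\Gamma^{-1}$, so that the partial derivatives can be computed with the product rule, always keeping the $L_p$ norm in the middle term and the $L_\infty$ norm in the other two. Then, making use of some inverse inequalities together with (4.7) and (2.14), we arrive at
$$ \|\partial_j(\mathbf{G}_{\Gamma_*}^{-1} - \mathbf{G}_\Gamma^{-1})\|_{L_p(\widehat T_*)} \preccurlyeq h_{T_*}^{-1}\|\mathbf{G}_{\Gamma_*} - \mathbf{G}_\Gamma\|_{L_p(\widehat T)} \preccurlyeq h_T^{d/p}\, h_{T_*}^{-1}\, \lambda_{\widehat{\mathcal T}}, $$
as asserted.

We now establish an inverse estimate in Besov spaces. We refer to § 9 for the definition (9.2) of the Besov seminorm $|V|_{B^s_\infty(L_p(\widehat T))}$ in terms of the modulus of smoothness of order $k = \lfloor s \rfloor + 1$
$$ \omega_k(V, t)_p = \sup_{|h| \le t} \|\Delta_h^k(V)\|_{L_p(\widehat T)}, $$
where $\Delta_h^k$ are the $k$-th order differences defined in (9.1).

Lemma 4.4 (Inverse estimate in Besov space). Let $\mathcal{T} \in \mathbb{T}$ and $s > 0$, $0 < p \le \infty$.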
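Then, the following inequality holds
$$ |\partial_i V|_{B^s_\infty(L_p(\widehat T))} \preccurlyeq \frac{1}{h_{\widehat T}}\, |V|_{B^s_\infty(L_p(\widehat T))}, $$
for any $\widehat T \in \widehat{\mathcal T}$, and function $V \in \mathbb{P}_n(\widehat T)$ or $V = q_\Gamma \mathbf{G}_\Gamma^{-T}$ (with $\Gamma = \Gamma_{\mathcal T}$).

Proof. We prove the estimate for $V \in \mathbb{P}_n(\widehat T)$ because dealing with $q_\Gamma \mathbf{G}_\Gamma^{-T}$ reduces to repeating the steps in the proof of Lemma 4.3 and applying the inverse inequality for polynomials. Since the $k$-th order differences satisfy $\Delta_h^k(\partial_i V)(x) = \partial_i(\Delta_h^k V(x))$, and $\Delta_h^k V \in \mathbb{P}_n(\widehat T_h)$, the usual inverse inequality gives
$$ \|\Delta_h^k \partial_i V\|_{L_q(\widehat T)} = \|\Delta_h^k \partial_i V\|_{L_q(\widehat T_h)} = \|\partial_i \Delta_h^k V\|_{L_q(\widehat T_h)} \preccurlyeq \frac{1}{h_{\widehat T}}\, \|\Delta_h^k V\|_{L_q(\widehat T)}. $$
Invoking the definition (9.2) yields the desired estimate.

4.3. Upper and Lower Bounds for the Energy Error. We now derive an error representation formula leading to lower and upper bounds for the energy error. Given $\mathcal{T} \in \mathbb{T}$, we recall the notation $\Gamma = \Gamma_{\mathcal T}$ and introduce the usual interior and jump residuals for $V \in \mathbb{V}(\mathcal{T})$ arbitrary
$$ R_T(V, F_\Gamma) := F_\Gamma|_T + \Delta_\Gamma V|_T \qquad \forall\, T \in \mathcal{T}, $$
$$ J_S(V) := \nabla_\Gamma V^+|_S \cdot n_S^+ + \nabla_\Gamma V^-|_S \cdot n_S^- \qquad \forall\, S \in \mathcal{S}_{\mathcal T}, $$
$$ J_{\partial T}(V) := \{J_S(V)\}_{S \subset \partial T}, $$
where, for each $x \in S$, $n_S^\pm(x)$ denotes the outward unit normal to $S$ and tangent to $T^\pm$ at $x$, and $T^+$, $T^-$ are curvilinear elements in $\mathcal{T}$ that share the side $S \in \mathcal{S}_{\mathcal T}$; recall that $\mathcal{S}_{\mathcal T}$ denotes the set of interior faces of $T \in \mathcal{T}$. We emphasize that, in contrast to flat domains, $n_S^+ \ne -n_S^-$. Similarly, if $\mathbf{D}_\Gamma^\pm$ denote the matrices associated to $T^\pm$, then $\nabla_\Gamma V^\pm|_S = \widehat\nabla\widehat V^\pm\, \mathbf{D}_\Gamma^\pm|_{\widehat S}$ are tangential gradients of $V$ on $T^\pm$ restricted to $S$. Moreover, we see from (3.9) that
$$ \Delta_\Gamma V|_T = q_\Gamma^{-1}\, \widehat{\operatorname{div}}\big(q_\Gamma\, \widehat\nabla\widehat V\, \mathbf{G}_\Gamma^{-1}\big)\big|_{\widehat T} \ne 0 $$
in general for $T \in \mathcal{T}$ when the polynomial degree $n > 1$. This is a major difference relative to [6], which deals with $n = 1$, where $\widehat V|_{\widehat T} \in \mathbb{P}_1(\widehat T)$, $q_\Gamma \in \mathbb{P}_0(\widehat T)$ and $\mathbf{G}_\Gamma \in \mathbb{P}_0(\widehat T)^{d\times d}$ imply $\Delta_\Gamma V|_T = 0$.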
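Subtracting the weak formulations (3.13) and (3.16), and employing (3.6) to integrate by parts elementwise, we obtain for all $v \in H^1(\gamma)$ and $V \in \mathbb{V}(\mathcal{T})$:
$$ \int_\gamma \nabla_\gamma(u - U) \cdot \nabla_\gamma v = I_1 + I_2 + I_3, \tag{4.12} $$
with
$$ I_1 := \sum_{T \in \mathcal{T}} \int_T R_T(U, F_\Gamma)(v - V) - \sum_{S \in \mathcal{S}_{\mathcal T}} \int_S J_S(U)(v - V), $$
$$ I_2 := \int_\Gamma \nabla_\Gamma U \cdot \nabla_\Gamma v - \int_\gamma \nabla_\gamma U \cdot \nabla_\gamma v = \int_\gamma \nabla_\gamma U\, \mathbf{E}_\Gamma \nabla_\gamma^T v, $$
$$ I_3 := \int_\gamma f v - \int_\Gamma F_\Gamma v. $$
The choice $F_\Gamma = \frac{q}{q_\Gamma} f$ of (3.17) implies $I_3 = 0$, so that only $I_1$ and $I_2$ need to be estimated. Observe that $I_1$ is the usual residual term, whereas $I_2$ is the geometry consistency term (4.3) and accounts for the discrepancy between $\gamma$ and $\Gamma$. An estimate for the error matrix $\mathbf{E}_\Gamma$ is given in (4.6).

The PDE error indicator stems from $I_1$ and is defined as follows for any $V \in \mathbb{V}(\mathcal{T})$
$$ \eta_{\mathcal T}(V, F_\Gamma, T)^2 := h_T^2\, \|R_T(V, F_\Gamma)\|^2_{L_2(T)} + h_T\, \|J_{\partial T}(V)\|^2_{L_2(\partial T)} \qquad \forall\, T \in \mathcal{T}. $$
We recall that $h_T = |\widehat T|^{1/d}$ and $\widehat T = X_{\widehat{\mathcal T}}^{-1}(T)$ is the preimage of $T$ in the mesh $\widehat{\mathcal T}$, which guarantees the strict reduction property (4.9). We also introduce the oscillation for any $V \in \mathbb{V}(\mathcal{T})$ and $\widehat T \in \widehat{\mathcal T}$
$$ \begin{aligned} \operatorname{osc}^2_{\widehat{\mathcal T}}(V, f, \widehat T) := {} & h_T^2\, \big\|(\mathrm{id} - \Pi^2_{2n-2})\big(f q + \widehat{\operatorname{div}}(q_\Gamma \widehat\nabla V\, \mathbf{G}_\Gamma^{-1})\big)\big\|^2_{L_2(\widehat T)} \\ & + h_T\, \big\|(\mathrm{id} - \Pi^2_{2n-1})\big(q_\Gamma^+ \widehat\nabla V^+ (\mathbf{G}_\Gamma^+)^{-1}\hat n^+ + q_\Gamma^- \widehat\nabla V^- (\mathbf{G}_\Gamma^-)^{-1}\hat n^-\big)\big\|^2_{L_2(\partial\widehat T)}, \end{aligned} \tag{4.13} $$
where $\hat n^\pm$ is defined according to (3.10), $\mathbf{G}_\Gamma^\pm$ and $q_\Gamma^\pm = \sqrt{\det \mathbf{G}_\Gamma^\pm}$ are the first fundamental form and area element associated to $T^\pm$, and $\Pi^p_m$ denotes the best $L_p$-approximation operator onto the space $\mathbb{P}_m$ of polynomials of degree $\le m$; the domain is implicit from the context. Notice that we used scaled local versions of the residual $q(f + \Delta_\Gamma V)$ (see (3.9)) and co-normal derivatives $r_\Gamma \nabla_\Gamma V \cdot n$ (see (3.11)) to define the oscillation. We refer to Remark 4.7 for an alternative definition of oscillation. Finally, for any subset $\tau \subset \mathcal{T}$ and corresponding $\hat\tau \subset \widehat{\mathcal T}$ we set
$$ \eta^2_{\mathcal T}(V, F_\Gamma, \tau) := \sum_{T \in \tau} \eta^2_{\mathcal T}(V, F_\Gamma, T), \qquad \operatorname{osc}^2_{\widehat{\mathcal T}}(V, f, \hat\tau) := \sum_{T \in \tau} \operatorname{osc}^2_{\widehat{\mathcal T}}(V, f, \widehat T), $$
and simply write $\eta_{\mathcal T}(V, F_\Gamma)$ and $\operatorname{osc}_{\widehat{\mathcal T}}(V, f)$ whenever $\tau = \mathcal{T}$.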
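Standard arguments [2, 33] to derive upper and lower bounds for the energy error on flat domains can be extended to this case; see [17, 24, 6].

Lemma 4.5 (A posteriori upper and lower bounds). Let $\lambda_{\widehat{\mathcal T}_0}$ satisfy (4.5). Let $u \in H^1_\#(\gamma)$ be the solution of (3.13), $(\mathcal{T}, \Gamma)$ be a pair of mesh-surface approximations and $U \in \mathbb{V}(\mathcal{T})$ be the Galerkin solution of (3.16). Then there exist constants $C_1$, $C_2$ and $\Lambda_1$ depending only on $\mathcal{T}_0$, the Lipschitz constant of $\gamma$, and $\|f\|_{L_2(\gamma)}$, such that
$$ \|\nabla_\gamma(u - U)\|^2_{L_2(\gamma)} \le C_1\, \eta_{\mathcal T}(U, F_\Gamma)^2 + \Lambda_1 \lambda^2_{\widehat{\mathcal T}}, \tag{4.14} $$
$$ C_2\, \eta_{\mathcal T}(U, F_\Gamma)^2 \le \|\nabla_\gamma(u - U)\|^2_{L_2(\gamma)} + \operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 + \Lambda_1 \lambda^2_{\widehat{\mathcal T}}. \tag{4.15} $$
Proof. Our departing point is (4.12) with $v \in H^1_\#(\gamma)$ arbitrary and $V \in \mathbb{V}(\mathcal{T})$ being its Scott-Zhang interpolant built over the partition $\overline{\mathcal T}$ of $\overline\Gamma$ and lifted to $\Gamma$ using $X_{\widehat{\mathcal T}} \circ (X^0)^{-1}$. Using interpolation estimates and (4.8) yields
$$ |I_1| \preccurlyeq \eta_{\mathcal T}(U, F_\Gamma)\, \|\nabla_\gamma v\|_{L_2(\gamma)}. $$
Since $\|\nabla_\Gamma U\|_{L_2(\gamma)} \preccurlyeq \|f\|_{L_2(\gamma)}$, the estimate (4.6) on the error matrix $\mathbf{E}_\Gamma$ gives
$$ |I_2| \preccurlyeq \lambda_{\widehat{\mathcal T}}\, \|\nabla_\gamma v\|_{L_2(\gamma)}. $$
The upper bound (4.14) follows from $I_3 = 0$. The lower bound (4.15) can be proved locally over an element $\widehat T \in \widehat{\mathcal T}$ in $\Omega$ using standard arguments for flat domains.

To prove optimality of AFEM we need a localized upper bound for the distance between two discrete solutions $U$ and $U_*$. This bound measures $\|\nabla_\gamma(U_* - U)\|_{L_2(\gamma)}$ in terms of the PDE estimator restricted to the refined set and geometric estimator; we refer to [6, Lemma 4.13] for a similar estimate for $n = 1$.

Lemma 4.6 (Localized upper bound). Let $\lambda_{\widehat{\mathcal T}_0}$ satisfy (4.5). For $(\mathcal{T}, \Gamma)$, $(\mathcal{T}_*, \Gamma_*)$ pairs of mesh-surface approximations with $\mathcal{T} \le \mathcal{T}_*$, let $\mathcal{R} := \mathcal{R}_{\mathcal{T} \to \mathcal{T}_*} \subset \mathcal{T}$ be the set of elements refined in $\mathcal{T}$ to obtain $\mathcal{T}_*$, i.e., $\mathcal{R} = \mathcal{T} \setminus \mathcal{T}_*$. Let $U \in \mathbb{V}(\mathcal{T})$ and $U_* \in \mathbb{V}(\mathcal{T}_*)$ be the corresponding discrete solutions of (3.16) on $\Gamma$ and $\Gamma_*$, respectively. Then the following localized upper bound is valid
$$ \|\nabla_\gamma(U_* - U)\|^2_{L_2(\gamma)} \le C_1\, \eta_{\mathcal T}(U, F_\Gamma, \mathcal{R})^2 + \Lambda_1 \lambda^2_{\widehat{\mathcal T}}, \tag{4.16} $$
with constants $C_1$, $\Lambda_1$ as in Lemma 4.5.

Proof.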
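We start from the error representation formula (4.12) by replacing $\gamma$ by $\Gamma_*$, $u$ by $U_*$, and taking as a test function $v = E_* := U - U_* \in H^1_\#(\gamma)$
$$ \int_{\Gamma_*} \nabla_{\Gamma_*}(U_* - U) \cdot \nabla_{\Gamma_*} E_* = I_1 + I_2 + I_3. $$
To estimate $I_1$, we proceed as in the flat case [12, 27, 29]. We first construct an approximation $V \in \mathbb{V}(\mathcal{T})$ of $E_* \in \mathbb{V}(\mathcal{T}_*)$. Let $\omega$ be the union of elements of $\mathcal{R} = \mathcal{T} \setminus \mathcal{T}_*$ and let $\overline\omega$ be the corresponding union in $\overline{\mathcal T}$. Let $\omega_j$ (resp. $\overline\omega_j$) denote one of the connected components of the interior of $\omega$ (resp. $\overline\omega$) for $1 \le j \le J$. We stress that $\omega_j$ may intersect several patches $\Gamma^i$ and likewise $\overline\omega_j$ may intersect several copies of $\Omega$. Let $\overline{\mathcal T}_j$ be the subset of elements in $\overline{\mathcal T}$ contained in $\overline\omega_j$ and let $\mathbb{V}(\overline{\mathcal T}_j)$ be the restriction of $\mathbb{V}(\overline{\mathcal T})$ to $\overline\omega_j$. We now construct the Scott-Zhang operator [28] on $\overline\omega_j$ and use the map $X_{\widehat{\mathcal T}} \circ (X^0)^{-1}$ to lift it to $\Gamma$. We denote this lift by $\pi_j : H^1(\omega_j) \to \mathbb{V}(\mathcal{T}_j)$, with
$$ \mathcal{T}_j := \big\{ T = X_{\widehat{\mathcal T}} \circ (X^0)^{-1}(\overline T) : \overline T \in \overline{\mathcal T}_j \big\} \subset \mathcal{T}. $$
Let $V \in \mathbb{V}(\mathcal{T})$ be the following approximation of the error $E_* \in \mathbb{V}(\mathcal{T}_*)$:
$$ V := \pi_j E_* \ \text{in } \omega_j, \qquad V := E_* \ \text{elsewhere}. $$
By construction, $V$ has conforming boundary values on $\partial\omega_j$, $V \in \mathbb{V}(\mathcal{T})$, and is an $H^1$-stable approximation to $E_*$. Since $V = E_*$ in $\Gamma \setminus \omega$, by the same standard argument for flat domains, we obtain
$$ |I_1| \le C_1\, \eta_{\mathcal T}(U, F_\Gamma, \mathcal{R})\, \|\nabla_\Gamma E_*\|_{L_2(\Gamma)}. $$
To estimate $I_2$, we note that $\Gamma$ and $\Gamma_*$ coincide in the unrefined region $\Gamma \setminus \omega$, so that
$$ I_2 = \sum_{j=1}^J \int_{\widetilde\omega_j} \Big( \nabla_\gamma U\, \mathbf{E}_\Gamma \nabla_\gamma^T E_* - \nabla_\gamma U\, \mathbf{E}_{\Gamma_*} \nabla_\gamma^T E_* \Big) $$
with $\widetilde\omega_j := \chi \circ X_{\mathcal T}^{-1}(\omega_j)$. Combining the estimate (4.6) on the error matrices $\mathbf{E}_\Gamma$ and $\mathbf{E}_{\Gamma_*}$ with (4.8) and (2.14), in its elementwise form, we obtain
$$ |I_2| \preccurlyeq \big(\lambda_{\widehat{\mathcal T}} + \lambda_{\widehat{\mathcal T}_*}\big)\|\nabla_\Gamma E_*\|_{L_2(\gamma)} \preccurlyeq (1 + \Lambda_0)\, \|f\|_{L_2(\gamma)}\, \lambda_{\widehat{\mathcal T}}. $$
Since $I_3 = 0$ in view of the choice (3.17) of $F_{\Gamma_*}$ and $F_\Gamma$, collecting the preceding estimates we finally conclude (4.16).

4.4. Properties of the PDE Estimator and Oscillation. As indicated in (4.14)–(4.15), we have access to the energy error $\|\nabla_\gamma(u - U)\|_{L_2(\gamma)}$ only through the PDE estimator $\eta_{\mathcal T}(U, F_\Gamma)$, the geometric estimator $\lambda_{\widehat{\mathcal T}}$, and the oscillation quantity $\operatorname{osc}_{\widehat{\mathcal T}}(U, f)$.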
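As is customary for flat domains, the definition (4.13) of oscillation guarantees that $\operatorname{osc}_{\widehat{\mathcal T}}(U, f)$ is dominated by $\eta_{\mathcal T}(U, F_\Gamma)$, namely
$$ \operatorname{osc}_{\widehat{\mathcal T}}(U, f, \widehat T)^2 \le C_3\, \eta_{\mathcal T}(U, F_\Gamma, T)^2 \qquad \forall\, T \in \mathcal{T}, \tag{4.17} $$
where the constant $C_3$ depends on the surface $\gamma$.

Remark 4.7 (Alternative definition of oscillation). The alternative definition for the local oscillation $\operatorname{osc}_{\widehat{\mathcal T}}(V, f, \widehat T)$
$$ \begin{aligned} \operatorname{osc}_{\widehat{\mathcal T}}(V, f, \widehat T)^2 = {} & h_T^2\, \big\|(\mathrm{id} - \Pi^2_{2n-2})\big(f q\, q_\Gamma^{-1/2} + q_\Gamma^{-1/2}\, \widehat{\operatorname{div}}(q_\Gamma \widehat\nabla V\, \mathbf{G}_\Gamma^{-1})\big)\big\|^2_{L_2(\widehat T)} \\ & + h_T\, \big\|(\mathrm{id} - \Pi^2_{2n-1})\big(r_\Gamma^{-1/2}\big(q_\Gamma^+ \widehat\nabla V^+ (\mathbf{G}_\Gamma^+)^{-1}\hat n^+ + q_\Gamma^- \widehat\nabla V^- (\mathbf{G}_\Gamma^-)^{-1}\hat n^-\big)\big)\big\|^2_{L_2(\partial\widehat T)} \end{aligned} $$
would imply (4.17) with an optimal constant $C_3 = 1$. However, this would be at the expense of a more intricate proof of Proposition 7.7 (local decay of oscillation). We opted to use definition (4.13) to simplify the presentation.

The main novelty in (4.14)–(4.16) relative to flat domains, which is also the chief challenge of the present analysis, is the presence of $\lambda_{\widehat{\mathcal T}}$. In this respect, we show now the equivalence of $\eta_{\mathcal T}(U, F_\Gamma)$ and the PDE error
$$ \mathcal{E}_{\mathcal T}(U, f) := \Big( \|\nabla_\gamma(u - U)\|^2_{L_2(\gamma)} + \operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 \Big)^{1/2} \tag{4.18} $$
provided $\lambda_{\widehat{\mathcal T}}$ is small relative to $\eta_{\mathcal T}(U, F_\Gamma)$. We refer to [12, 27] for a similar result for flat domains, and to [6] for parametric surfaces and $n = 1$.

Lemma 4.8 (Equivalence of estimator). Let $C_1$, $C_2$, $\Lambda_1$ be given in Lemma 4.5. If
$$ \lambda^2_{\widehat{\mathcal T}} \le \frac{C_2}{2\Lambda_1}\, \eta_{\mathcal T}(U, F_\Gamma)^2, \tag{4.19} $$
then there exist explicit constants $C_4 \ge C_5 > 0$, depending on $C_1$, $C_2$, such that
$$ C_5\, \eta_{\mathcal T}(U, F_\Gamma) \le \mathcal{E}_{\mathcal T}(U, f) \le C_4\, \eta_{\mathcal T}(U, F_\Gamma). \tag{4.20} $$
Proof. Combining (4.14) with (4.19), we infer that
$$ \|\nabla_\gamma(u - U)\|^2_{L_2(\gamma)} \le \Big(C_1 + \frac{C_2}{2}\Big)\, \eta_{\mathcal T}(U, F_\Gamma)^2. \tag{4.21} $$
This, together with (4.17), gives the upper bound in (4.20). We next resort to (4.15) and (4.19) to obtain
$$ C_2\, \eta_{\mathcal T}(U, F_\Gamma)^2 \le \|\nabla_\gamma(u - U)\|^2_{L_2(\gamma)} + \operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 + \frac{C_2}{2}\, \eta_{\mathcal T}(U, F_\Gamma)^2, $$
which implies the lower bound in (4.20) and concludes the proof.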
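The constants in (4.20) can be tracked explicitly. The derivation below is a sketch rather than a statement from the source: it writes $\eta$ for the PDE estimator, $\lambda$ for the geometric estimator, $\mathcal{E}$ for the total error of (4.18), and it assumes that (4.17) is summed over all elements, so that $C_3$ also enters the upper constant.

```latex
\begin{align*}
\mathcal{E}^2
  &= \|\nabla_\gamma(u-U)\|_{L_2(\gamma)}^2 + \operatorname{osc}^2
   \le \big(C_1\eta^2 + \Lambda_1\lambda^2\big) + C_3\,\eta^2
   && \text{by (4.14) and (4.17)} \\
  &\le \Big(C_1 + \tfrac{C_2}{2} + C_3\Big)\eta^2
   && \text{by (4.19)}, \\[4pt]
C_2\,\eta^2
  &\le \mathcal{E}^2 + \Lambda_1\lambda^2
   \le \mathcal{E}^2 + \tfrac{C_2}{2}\,\eta^2
   && \text{by (4.15) and (4.19)}, \\
\tfrac{C_2}{2}\,\eta^2 &\le \mathcal{E}^2 .
\end{align*}
```

Under these assumptions one admissible choice is $C_4 = (C_1 + C_2/2 + C_3)^{1/2}$ and $C_5 = (C_2/2)^{1/2}$, which indeed satisfies $C_4 \ge C_5 > 0$.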
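It turns out that the usual reduction property of $\eta_{\mathcal T}(U, F_\Gamma)$ [12, Corollary 3.4], [27], which is instrumental to prove a contraction property of AFEM, is also polluted by the presence of $\lambda_{\widehat{\mathcal T}}$ as stated below.

Lemma 4.9 (Reduction of residual error estimator). Let $\lambda_{\widehat{\mathcal T}_0}$ satisfy (4.5). Given a mesh-surface pair $(\mathcal{T}, \Gamma)$, let $\mathcal{M} \subset \mathcal{T}$ be the subset of elements bisected at least $b \ge 1$ times in refining $\mathcal{T}$ to obtain $\mathcal{T}_* \ge \mathcal{T}$. If $\xi := 1 - 2^{-b/d}$, then there exist constants $\Lambda_2$ and $\Lambda_3$, solely depending on the shape regularity of $\mathbb{T}$, the Lipschitz constant $L$ of $\gamma$, and $\|f\|_{L_2(\gamma)}$, such that for any $\delta > 0$
$$ \eta_{\mathcal{T}_*}(U_*, F_{\Gamma_*})^2 \le (1 + \delta)\Big( \eta_{\mathcal T}(U, F_\Gamma)^2 - \xi\, \eta_{\mathcal T}(U, \mathcal{M})^2 \Big) + (1 + \delta^{-1})\Big( \Lambda_3 \|\nabla_\gamma(U_* - U)\|^2_{L_2(\gamma)} + \Lambda_2 \lambda^2_{\widehat{\mathcal T}} \Big). \tag{4.22} $$
Proof. We first examine the residual $R_T(U, F_\Gamma)$. If $T_* \in \mathcal{T}_*$ and $T \in \mathcal{T}$ satisfy $\widehat T_* \subset \widehat T$, and $T' := X_{\widehat{\mathcal T}} \circ X_{\widehat{\mathcal T}_*}^{-1}(T_*) \subset T$, then the bound on $q_{\Gamma_*}$ given in Lemma 4.1 yields
$$ \begin{aligned} \|R_{T_*}(U_*, F_{\Gamma_*})\|_{L_2(T_*)} &= \|q_{\Gamma_*}^{1/2} R_{T_*}(U_*, F_{\Gamma_*})\|_{L_2(\widehat T_*)} \\ &\preccurlyeq \|F_{\Gamma_*} - F_\Gamma\|_{L_2(\widehat T_*)} + \|\Delta_{\Gamma_*}(U_* - U)\|_{L_2(\widehat T_*)} + \|(\Delta_{\Gamma_*} - \Delta_\Gamma)U\|_{L_2(\widehat T_*)} \\ &\quad + \|(q_{\Gamma_*}^{1/2} - q_\Gamma^{1/2}) R_T(U, F_\Gamma)\|_{L_2(\widehat T_*)} + \|q_\Gamma^{1/2} R_T(U, F_\Gamma)\|_{L_2(\widehat T_*)}. \end{aligned} $$
Now, from (4.7) and the local form of (2.14) we bound the first term
$$ \|F_{\Gamma_*} - F_\Gamma\|_{L_2(\widehat T_*)} \le \big\|\big(q_{\Gamma_*}^{-1} - q_\Gamma^{-1}\big) q f\big\|_{L_2(\widehat T_*)} \preccurlyeq \lambda_{\widehat{\mathcal T}^i}(\widehat T')\, \|f\|_{L_2(\widehat T')}. $$
Recalling the expression (3.9) for the Laplace-Beltrami operator and taking $V = U_* - U$, we can write
$$ \begin{aligned} \Delta_{\Gamma_*} V &= q_{\Gamma_*}^{-1}\, \widehat{\operatorname{div}}\big(q_{\Gamma_*} \widehat\nabla V\, \mathbf{G}_{\Gamma_*}^{-1}\big) \\ &= q_{\Gamma_*}^{-1}\Big( \widehat\nabla q_{\Gamma_*} \cdot \widehat\nabla V\, \mathbf{G}_{\Gamma_*}^{-1} + q_{\Gamma_*} \widehat D^2 V : \mathbf{G}_{\Gamma_*}^{-1} + q_{\Gamma_*} \widehat\nabla V \cdot \widehat{\operatorname{div}}\, \mathbf{G}_{\Gamma_*}^{-1} \Big), \end{aligned} $$
and using bounds for $\|q_{\Gamma_*}\|_{L_\infty(\widehat T_*)}$ and $\|\mathbf{G}_{\Gamma_*}^{-1}\|_{L_\infty(\widehat T_*)}$ from Lemma 4.1, the inverse inequalities (4.10), (4.11) and a third one for $\widehat D^2 V$, we get
$$ \|\Delta_{\Gamma_*}(U_* - U)\|_{L_2(\widehat T_*)} \preccurlyeq \frac{1}{h_{T_*}}\, \|\nabla_\gamma(U_* - U)\|_{L_2(T')}. $$
Again by virtue of (3.9) we rewrite the third term above
$$ \begin{aligned} \|(\Delta_{\Gamma_*} - \Delta_\Gamma)U\|_{L_2(\widehat T_*)} &\le \big\|(q_{\Gamma_*}^{-1} - q_\Gamma^{-1})\, \widehat{\operatorname{div}}\big(q_{\Gamma_*} \widehat\nabla U\, \mathbf{G}_{\Gamma_*}^{-1}\big)\big\|_{L_2(\widehat T_*)} \\ &\quad + \big\|q_\Gamma^{-1}\, \widehat{\operatorname{div}}\big((q_{\Gamma_*} - q_\Gamma)\widehat\nabla U\, \mathbf{G}_{\Gamma_*}^{-1}\big)\big\|_{L_2(\widehat T_*)} \\ &\quad + \big\|q_\Gamma^{-1}\, \widehat{\operatorname{div}}\big(q_\Gamma \widehat\nabla U\, (\mathbf{G}_{\Gamma_*}^{-1} - \mathbf{G}_\Gamma^{-1})\big)\big\|_{L_2(\widehat T_*)} \preccurlyeq \frac{1}{h_{T_*}}\, \lambda_{\widehat{\mathcal T}}(\widehat T')\, \|\nabla U\|_{L_2(T')} \end{aligned} $$
due to an inverse inequality for $\widehat D^2 U$, (4.10), (4.11) and Lemma 4.1. Finally, using the same arguments for the fourth term we obtain
$$ \begin{aligned} \|(q_{\Gamma_*}^{1/2} - q_\Gamma^{1/2}) R_T(U, F_\Gamma)\|_{L_2(\widehat T_*)} &= \|(q_{\Gamma_*} - q_\Gamma)(q_{\Gamma_*}^{1/2} + q_\Gamma^{1/2})^{-1} R_T(U, F_\Gamma)\|_{L_2(\widehat T_*)} \\ &\preccurlyeq \lambda_{\widehat{\mathcal T}}(\widehat T')\, \|R_T(U, F_\Gamma)\|_{L_2(\widehat T')} \preccurlyeq \lambda_{\widehat{\mathcal T}}(\widehat T')\Big( \frac{1}{h_{T_*}}\|\nabla U\|_{L_2(T')} + \|f\|_{L_2(T')} \Big). \end{aligned} $$
As a consequence, the interior residuals on $\Gamma_*$ and $\Gamma$ are related through the estimate
$$ h_{T_*}\|R_T(U_*, F_{\Gamma_*})\|_{L_2(T')} \le h_{T_*}\|R_T(U, F_\Gamma)\|_{L_2(T')} + C\, \|\nabla_\gamma(U_* - U)\|_{L_2(T')} + C\, \lambda_{\widehat{\mathcal T}}(\widehat T)\Big( \|\nabla U\|_{L_2(T')} + h_{T_*}\|f\|_{L_2(T')} \Big), \tag{4.23} $$
for some constant $C$ only depending on the shape regularity of $\mathbb{T}$ and the Lipschitz constant $L$ of $\gamma$.

We now examine the jump residual $J_{\partial T}(U)$. Let $S_* \in \mathcal{S}_{\Gamma_*}$ and $S' := X_{\widehat{\mathcal T}} \circ X_{\widehat{\mathcal T}_*}^{-1}(S_*) \subset \Gamma$. We denote by $T_*^\pm$ the two elements of $\mathcal{T}_*$ sharing $S_*$ and recall that the corresponding outward pointing co-normals $n_{S_*}^\pm$ are not necessarily co-linear; moreover, $T_*^\pm$ may belong to different surface patches, i.e. $T_*^+ \in \mathcal{T}_*^i$ and $T_*^- \in \mathcal{T}_*^j$ for some $1 \le i, j \le M$. Still, observe that the jump $J_{S_*}(U_*)$ can be rewritten as follows
$$ J_{S_*}(U_*) = J_S(U)|_{S_*} + \Big( \nabla_{\Gamma_*} U_*^+|_{S_*} \cdot n_{S_*}^+ - \nabla_\Gamma U^+|_{S_*} \cdot n_S^+|_{S_*} \Big) + \Big( \nabla_{\Gamma_*} U_*^-|_{S_*} \cdot n_{S_*}^- - \nabla_\Gamma U^-|_{S_*} \cdot n_S^-|_{S_*} \Big), $$
regardless of $\Gamma^i$ and $\Gamma^j$. Therefore, the last two terms in the right hand side can now be estimated using the geometric error estimates (4.7). Note that on $S_*$
$$ \nabla_{\Gamma_*} U_*^\pm \cdot n_{S_*}^\pm - \nabla_\Gamma U^\pm \cdot n_S^\pm = \nabla_{\Gamma_*}(U_*^\pm - U^\pm) \cdot n_{S_*}^\pm + (\nabla_{\Gamma_*} - \nabla_\Gamma) U^\pm \cdot n_{S_*}^\pm + \nabla_\Gamma U^\pm \cdot \big(n_{S_*}^\pm - n_S^\pm\big) = I + II + III. $$
We bound each term using their parametric representation on $\widehat S_* := X_{\widehat{\mathcal T}_*}^{-1}(S_*)$.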
For the first term, we use the expression (3.11) of the tangential derivative in the co-normal direction, the spectral bounds on GΓ∗ and qΓ∗ given in Lemma 4.1, and a scaled trace estimate to deduce 1 b U b∗± − U b ± )k b . b± − 2d k∇( k∇Γ∗ (U∗± − U ± ) · n± S∗ kL2 (S∗ ) 4 |T∗ | L2 (T∗ ) ± Recalling that hdT ± = |T ∗ | 4 |Tb∗± |, we see that ∗ −1/2 kIkL2 (S∗ ) 4 hT ± k∇Γ∗ (U∗± − U ± )kL2 (T∗± ) . ∗ Similarly, in view of (3.7) and (3.11), we obtain −1/2 kIIkL2 (S∗ ) + kIIIkL2 (S∗ ) 4 hT ± k∇Γ U kL2 (Tb∗± ) kDΓ∗ − DΓ )kL∞ (Tb∗± ) ∗ + kqΓ∗ − qΓ kL∞ (Tb∗± ) + krΓ∗ − rΓ kL∞ (∂ Tb∗± ) . Utilizing the geometry error estimate (4.7), we further get −1/2 kIIkL2 (S∗ ) + kIIIkL2 (S∗ ) 4 hT + λTb k∇Γ U kL2 (Tb∗± ) . ∗ Hence, combining the previous two estimates, we get ± ± k∇Γ∗ U∗± · n± S∗ − ∇Γ U · nS kL2 (S∗ ) −1/2 4 hT ± k∇γ (U∗± − U∗± )kL2 (T∗± ) + λTb k∇Γ U kL2 (Tb∗± ) . ∗ Finally, if we denote by rΓ∗ and rΓ the area elements associated with S∗ and S 0 := XTb ◦ XT−1 b (S∗ ) respectively, then we have ∗ 1/2 1/2 1/2 kJS (U )kL2 (S∗ ) = krΓ∗ JS (U )kL2 (Sb∗ ) ≤ k(rΓ∗ − rΓ )JS (U )kL2 (Sb∗ ) + kJS (U )kL2 (S 0 ) . Invoking Lemma 4.1, we realize that krΓ∗ − rΓ kL∞ (S) b ≤ λTb . Combining this with a scaled trace theorem, we deduce that 1/2 −1/2 1/2 k(rΓ∗ − rΓ )JS (U )kL2 (Sb∗ ) 4 hT ± λTb k∇Γ U kL2 (T∗± ) ∗ whence −1/2 kJS (U )kL2 (S∗ ) ≤ kJS (U )kL2 (S 0 ) + ChT ± λTb k∇Γ U kL2 (T∗± ) . ∗ The above three estimates guarantee the existence of a constant C only depending on the shape regularity of T and the Lipschitz constant L of γ such that 1/2 1/2 hT ± kJS∗ (U∗ )kL2 (S∗ ) ≤ hT ± kJS 0 (U )kL2 (S 0 ) ∗ ∗ + C k∇γ (U∗ − U )kL2 (T ± ) + λTb k∇Γ U kL2 (T∗± ) . (4.24) To conclude the proof we proceed as for graphs [24, Lemma 4.2], basically squaring (4.23) and (4.24) via Young’s inequality, adding over all elements T∗ ∈ T∗ and sides 23 S∗ ∈ S∗ , and using the strict reduction (4.9) of meshsize hT for all refined elements. 
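In addition, we employ the global bound $\|\nabla_\Gamma U\|_{L_2(\Gamma)} \preccurlyeq \|f\|_{L_2(\gamma)}$.

Another difference with the theory of adaptivity for flat domains is the behavior of data oscillation under refinement. The usual situation is that $\operatorname{osc}_{\widehat{\mathcal T}}(U, f)$ does not increase upon refinement from $\mathcal{T}$ to $\mathcal{T}_*$ [26]. This is no longer true because $\operatorname{osc}_{\widehat{\mathcal T}}(U, f)$ and $\operatorname{osc}_{\widehat{\mathcal T}_*}(U_*, f)$ correspond to different domains $\Gamma$ and $\Gamma_*$. We state a quasi-monotonicity property in Lemma 4.10 but omit its proof because it is similar to, and somewhat simpler than, that of Lemma 4.9.

Lemma 4.10 (Quasi-monotonicity of data oscillation). Let $\lambda_{\widehat{\mathcal T}_0}$ satisfy (4.5). Let $(\mathcal{T}, \Gamma)$, $(\mathcal{T}_*, \Gamma_*)$ be mesh-surface pairs with $\mathcal{T} \le \mathcal{T}_*$. Then, there exist constants $C_6$, $\Lambda_2$ and $\Lambda_3$ depending only on $\mathcal{T}_0$, the Lipschitz constant $L$ of $\gamma$, and $\|f\|_{L_2(\gamma)}$, such that
$$ \operatorname{osc}^2_{\widehat{\mathcal T}_*}(V_*, f) \le C_6\, \operatorname{osc}^2_{\widehat{\mathcal T}}(V, f) + \Lambda_3 \|\nabla_\gamma(V_* - V)\|^2_{L_2(\gamma)} + \Lambda_2 \lambda^2_{\widehat{\mathcal T}}. \tag{4.25} $$
Remark 4.11 (Local perturbation of data oscillation). The previous result is also valid locally, that is for any subset $\tau \subset \mathcal{T}_*$. In fact, if $\tau = \mathcal{T} \cap \mathcal{T}_*$ the same proof provides (4.25) with $C_6 = 2$:
$$ \operatorname{osc}^2_{\widehat{\mathcal T}_*}(V, f, \widehat{\mathcal T}_* \cap \widehat{\mathcal T}) \le 2\, \operatorname{osc}^2_{\widehat{\mathcal T}}(W, f, \widehat{\mathcal T}_* \cap \widehat{\mathcal T}) + \Lambda_3 \|\nabla_\gamma(V - W)\|^2_{L_2(\gamma)} + \Lambda_2 \lambda^2_{\widehat{\mathcal T}}, \tag{4.26} $$
for any piecewise polynomials $V$, $W$ subordinate to $\mathcal{T} \cap \mathcal{T}_*$. Although the elements in $\widehat{\mathcal T} \cap \widehat{\mathcal T}_*$ describe (part of) the common surface $\Gamma \cap \Gamma_*$, whence there is no geometric discrepancy, the presence of the geometric estimator $\lambda_{\widehat{\mathcal T}}$ in (4.26) is due to the boundary of this common region. Note that the contribution to the oscillation associated to a side on the boundary of $\widehat{\mathcal T} \cap \widehat{\mathcal T}_*$ involves the terms $q_\Gamma^\pm$ according to (4.13).

5. AFEM: Design and Properties. Since $\lambda_{\widehat{\mathcal T}}$ and $\eta_{\mathcal T}(U, F_\Gamma)$ account for quite different effects, following [7], the algorithm AFEM is designed to handle them separately via the modules ADAPT_SURFACE and ADAPT_PDE.

AFEM: Given $\mathcal{T}_0$, maps $\{P_0^i\}_{i=1}^L$ parametrizing the surface $\gamma$ from $\mathcal{T}_0$, and parameters $\varepsilon_0 > 0$, $0 < \rho < 1$, and $\omega > 0$, set $k = 0$.
  1. $\mathcal{T}_k^+$ = ADAPT_SURFACE($\mathcal{T}_k$, $\omega\varepsilon_k$)
  2. $[\mathcal{T}_{k+1}, U_{k+1}]$ = ADAPT_PDE($\mathcal{T}_k^+$, $\varepsilon_k$)
  3. $\varepsilon_{k+1} = \rho\,\varepsilon_k$; $k = k + 1$
  4. go to 1.

We notice the presence of the factor $\omega$, which is employed to make the geometric error small relative to the current tolerance $\varepsilon_k$, thereby controlling the interactions between the geometry and the PDE. This turns out to be essential for both contraction and optimality of AFEM, even for polynomial degree $n = 1$ as discussed in [6].

5.1. Module ADAPT_SURFACE. Given a tolerance $\tau > 0$ and an admissible subdivision $\mathcal{T}$, $\mathcal{T}^+$ = ADAPT_SURFACE($\mathcal{T}$, $\tau$) improves the surface resolution until the new subdivision $\mathcal{T}^+ \ge \mathcal{T}$ satisfies
$$ \lambda_{\widehat{\mathcal T}^+} \le \tau, \tag{5.1} $$
where $\lambda_{\widehat{\mathcal T}}$ is the geometric estimator introduced in (1.2). This module is based on a greedy algorithm and acts on a generic mesh $\mathcal{T} = \cup_{i=1}^M \mathcal{T}^i \in \mathbb{T}$:

$\mathcal{T}^+$ := ADAPT_SURFACE($\mathcal{T}$, $\tau$)
  1. if $\mathcal{M} := \{T \in \mathcal{T} : \lambda_{\widehat{\mathcal T}}(\widehat T) > \tau\} = \emptyset$ return($\mathcal{T}$) and exit
  2. $\mathcal{T}$ := REFINE($\mathcal{T}$, $\mathcal{M}$)
  3. go to 1.

where REFINE($\mathcal{T}$, $\mathcal{M}$) refines all elements in the marked set $\mathcal{M}$ and keeps conformity; more details are given in § 5.2. To derive convergence rates for AFEM, we require that ADAPT_SURFACE is $t$-optimal, i.e. there exists a constant $C$ such that the set $\mathcal{M}$ of all the elements marked for refinement in a call to ADAPT_SURFACE($\mathcal{T}$, $\tau$) satisfies
$$ \#\mathcal{M} \le C(\gamma)\, \tau^{-1/t}. \tag{5.2} $$
In § 7.3 we show that this assumption is satisfied by a greedy algorithm provided that $\chi^i \in B^{1+td}_q(L_q(\Omega))$ with $tq > 1$, $0 < q \le \infty$ and $td \le n$ for all $1 \le i \le M$.

5.2. Module ADAPT_PDE. Given a tolerance $\varepsilon > 0$ and an admissible subdivision $\mathcal{T}^+ \in \mathbb{T}$, $[\mathcal{T}, U]$ = ADAPT_PDE($\mathcal{T}^+$, $\varepsilon$) outputs a refinement $\mathcal{T} \ge \mathcal{T}^+$ and the associated finite element solution $U \in \mathbb{V}(\mathcal{T})$ such that
$$ \eta_{\mathcal T}(U, F_\Gamma) \le \varepsilon. \tag{5.3} $$
This module is based on the standard adaptive sequence:

$[\mathcal{T}, U]$ := ADAPT_PDE($\mathcal{T}$, $\varepsilon$)
  1. $U$ := SOLVE($\mathcal{T}$)
  2. $\{\eta_{\mathcal T}(U, F_\Gamma, T)\}_{T \in \mathcal{T}}$ := ESTIMATE($\mathcal{T}$, $U$)
  3. if $\eta_{\mathcal T}(U, F_\Gamma) < \varepsilon$ return($\mathcal{T}$, $U$) and exit
  4. $\mathcal{M}$ := MARK($\mathcal{T}$, $\{\eta_{\mathcal T}(U, F_\Gamma, T)\}_{T \in \mathcal{T}}$)
  5. $\mathcal{T}$ := REFINE($\mathcal{T}$, $\mathcal{M}$)
  6. go to 1

We describe below the modules SOLVE, ESTIMATE, MARK and REFINE separately.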
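The control flow of the three loops above can be sketched in a few lines. What follows is an illustrative skeleton under stated assumptions, not the authors' code: the modules SOLVE, ESTIMATE and REFINE and the geometric indicator are passed in as callables, the outer loop is capped by a hypothetical parameter `n_outer`, MARK is realized by a bulk (Dörfler-type) criterion of the form (5.5) with a full sort, and the `toy_*` functions are made-up stand-ins on intervals of $[0,1]$ that do not model any PDE.

```python
import math

def dorfler_mark(indicators, theta):
    """Bulk marking: smallest set M (largest indicators first) with eta(M) >= theta * eta."""
    total = sum(v * v for v in indicators.values())
    marked, acc = [], 0.0
    # A full sort is used for brevity; it costs O(N log N) rather than linear time.
    for elem in sorted(indicators, key=indicators.get, reverse=True):
        if acc >= theta * theta * total:
            break
        marked.append(elem)
        acc += indicators[elem] ** 2
    return marked

def adapt_surface(mesh, tol, geo_indicator, refine):
    """Greedy loop: refine every element whose geometric indicator exceeds tol."""
    while True:
        marked = [T for T in mesh if geo_indicator(T) > tol]
        if not marked:
            return mesh
        mesh = refine(mesh, marked)

def adapt_pde(mesh, eps, solve, estimate, refine, theta):
    """SOLVE -> ESTIMATE -> MARK -> REFINE until the estimator drops below eps."""
    while True:
        U = solve(mesh)
        indicators = estimate(mesh, U)
        eta = math.sqrt(sum(v * v for v in indicators.values()))
        if eta < eps:
            return mesh, U
        mesh = refine(mesh, dorfler_mark(indicators, theta))

def afem(mesh, eps0, rho, omega, theta, n_outer, geo_indicator, solve, estimate, refine):
    """Outer loop; n_outer caps the otherwise unbounded iteration (an assumption of this sketch)."""
    eps, U = eps0, None
    for _ in range(n_outer):
        mesh = adapt_surface(mesh, omega * eps, geo_indicator, refine)
        mesh, U = adapt_pde(mesh, eps, solve, estimate, refine, theta)
        eps *= rho
    return mesh, U

# Toy stand-ins on Omega = [0, 1]; elements are intervals (a, b).
def toy_geo(T):
    return T[1] - T[0]

def toy_solve(mesh):
    return None

def toy_estimate(mesh, U):
    return {T: 4.0 * (T[1] - T[0]) ** 1.5 for T in mesh}

def toy_refine(mesh, marked):
    chosen, out = set(marked), []
    for a, b in mesh:
        mid = 0.5 * (a + b)
        out += [(a, mid), (mid, b)] if (a, b) in chosen else [(a, b)]
    return out

final_mesh, _ = afem([(0.0, 1.0)], eps0=0.5, rho=0.5, omega=0.5, theta=0.5, n_outer=3,
                     geo_indicator=toy_geo, solve=toy_solve, estimate=toy_estimate,
                     refine=toy_refine)
print(len(final_mesh))
```

Keeping the geometric tolerance at `omega * eps` inside the outer loop mirrors step 1 of AFEM; the separation into two inner loops is the design choice emphasized in the text, whereas the specific marking routine is only one possible realization.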
Procedure SOLVE. This procedure solves the SPD linear system resulting from (3.16), where we recall that $\Gamma = \Gamma_{\mathcal T}$. For simplicity we assume that (3.16) is solved exactly with linear complexity. We refer to [23] for a hierarchical basis multigrid preconditioner and to [9] for standard variational and non-variational multigrid algorithms.

Procedure ESTIMATE. Given the Galerkin solution $U \in \mathbb{V}(\mathcal{T})$ of (3.16), the procedure ESTIMATE computes the PDE error indicators $\{\eta_{\mathcal T}(U, F_\Gamma, T)\}_{T \in \mathcal{T}}$. We emphasize that this procedure does not compute the oscillation terms, which are only needed to carry out the analysis.

The equivalence stated in Lemma 4.8 is critical to deduce that the ADAPT_PDE strategy based on the reduction of the error indicators $\eta_{\mathcal T}(U, F_\Gamma)$ is successful in reducing the PDE error $\mathcal{E}_{\mathcal T}(U, f)$ of (4.18) provided the parameter $\omega$ satisfies
$$ \omega \le \omega_1 := \sqrt{\frac{C_2}{2\Lambda_0^2 \Lambda_1}}. \tag{5.4} $$
In fact, given a tolerance $\varepsilon > 0$ to be reached by ADAPT_PDE starting from the input subdivision $\mathcal{T}^+$ satisfying $\lambda_{\widehat{\mathcal T}^+} \le \omega\varepsilon$, we observe that (2.14) guarantees that $\mathcal{T}^+$ as well as all subdivisions $\mathcal{T}$ constructed within the inner iterates of ADAPT_PDE satisfy
$$ \lambda^2_{\widehat{\mathcal T}} \le \Lambda_0^2\, \lambda^2_{\widehat{\mathcal T}^+} \le \frac{C_2}{2\Lambda_1}\, \varepsilon^2. $$
Within the while loop of ADAPT_PDE we have $\eta_{\mathcal T}(U, F_\Gamma) > \varepsilon$, so we deduce the validity of (4.19), whence that of (4.20), within that loop.

Procedure MARK. We rely on an optimal Dörfler's marking strategy for the selection of elements. Given the set of indicators $\{\eta_{\mathcal T}(U, F_\Gamma, T)\}_{T \in \mathcal{T}}$ and a marking parameter $\theta \in (0, 1]$, MARK outputs a subset of marked elements $\mathcal{M} \subset \mathcal{T}$ such that
$$ \eta_{\mathcal T}(U, F_\Gamma, \mathcal{M}) \ge \theta\, \eta_{\mathcal T}(U, F_\Gamma). \tag{5.5} $$
In contrast to [24], MARK only employs the error indicators and does not use either the oscillation or surface indicators. We will see that quasi-optimal cardinality requires that $\mathcal{M}$ is minimal and quasi-optimal workload that the sorting scales linearly.

Procedure REFINE.
Given a subdivision $\mathcal T$ and a set $\mathcal M \subset \mathcal T$ of marked elements, the call REFINE($\mathcal T$, $\mathcal M$) bisects all elements in $\mathcal M$ at least $b \ge 1$ times and performs the additional refinement necessary to maintain conformity. The resulting subdivision is denoted by $\mathcal T_*$. Recall that the bisection procedure is first executed on faces of the corresponding flat subdivision $\overline{\mathcal T}$ and its effect is transferred to the actual subdivision $\mathcal T$ via the interpolation maps $X^i_{\widehat{\mathcal T}^i} \circ (X^i_0)^{-1}$ for $i = 1, \dots, M$. Since the refinement procedure is performed on $\overline{\mathcal T}$, or similarly on $\widehat{\mathcal T}$, the complexity results of the overall refinement algorithm proved by Binev, Dahmen, and DeVore for $d = 2$ [3] and Stevenson [30] for $d > 2$ hold in our setting. The precise statement of this result is in the following lemma, proved in [3, 30, 27].

Lemma 5.1 (Complexity of REFINE). Assume that the initial triangulation $\mathcal T_0$ is suitably labeled (condition (b) of §4 in [30]). Let $\{\mathcal T_k\}_{k \ge 0}$ be a sequence of triangulations produced by successive calls to $\mathcal T_{k+1}$ = REFINE($\mathcal T_k$, $\mathcal M_k$), where $\mathcal M_k$ is any subset of $\mathcal T_k$, $k \ge 0$. Then, there exists a constant $C_7$ solely depending on $\mathcal T_0$, its labeling, and the refinement depth $b$ such that
\[
\#\mathcal T_k - \#\mathcal T_0 \le C_7 \sum_{j=0}^{k-1} \#\mathcal M_j, \qquad \forall k \ge 1. \tag{5.6}
\]
It is worth noticing that the user parameter $b \ge 1$ can be chosen equal to one, which only implies a minimal refinement, and does not force an interior node property [26, 25] or an extra refinement to improve the surface approximation [24].

Remark 5.2 (Alternative subdivision strategies). For simplicity we only discuss the refinement strategy based on simplex bisection. However, all the results obtained can be extended to any strategy satisfying Conditions 3, 4 and 6 in [8], such as quadrilaterals with hanging nodes or red refinements.

6. Conditional Contraction Property. The procedure ADAPT_PDE is known to yield a contraction property in the flat case.
In the present context, however, the surface approximation is responsible for lack of consistency in that the sequence of finite element spaces is no longer nested. This in turn leads to failure of a key orthogonality property between discrete solutions, the Pythagoras property. We have, instead, a perturbation result referred to as quasi-orthogonality below. Its proof follows the steps of that for graphs [24, Lemma 4.4]. In this section, we use the notation
\[
e^j := \|\nabla_\gamma (u - U^j)\|_{L_2(\gamma)}, \qquad E^j := \|\nabla_\gamma (U^{j+1} - U^j)\|_{L_2(\gamma)},
\]
\[
\eta^j := \eta_{\mathcal T^j}(U^j, F^j), \qquad \eta^j(\mathcal M^j) := \eta_{\mathcal T^j}(U^j, F^j, \mathcal M^j), \qquad \lambda^j := \lambda_{\widehat{\mathcal T}^j},
\]
where $\mathcal T^j$ are meshes obtained after each inner iteration of ADAPT_PDE, starting with $\mathcal T^0 = \mathcal T^+$, $\mathcal M^j \subset \mathcal T^j$ are the subsets of elements selected by the marking procedure, $F^j$ are the scaled right hand sides defined in (3.17) with $\Gamma$ replaced by $\Gamma^j$, the surface associated to $\mathcal T^j$, and $U^j \in \mathbb V(\mathcal T^j)$ are the corresponding Galerkin solutions.

Lemma 6.1 (Quasi-orthogonality). There exists a constant $\Lambda_2 > 0$ solely depending on the Lipschitz constant $L$ of $\gamma$ and $\|f\|_{L_2(\gamma)}$ such that for $i = j, j+1$ with $j \ge 0$, we have
\[
(e^j)^2 - \tfrac{3}{2} (E^j)^2 - \Lambda_2 (\lambda^i)^2 \le (e^{j+1})^2 \le (e^j)^2 - \tfrac{1}{2} (E^j)^2 + \Lambda_2 (\lambda^i)^2. \tag{6.1}
\]
Before proceeding with the proof of the above lemma, we point out that the constant $\Lambda_2$ was already defined in Lemma 4.9. This is to simplify the notation below and is without loss of generality (upon redefining $\Lambda_2$ as the maximum of the two constants).

Proof. Since the symmetry of the Dirichlet form implies
\[
(e^j)^2 = (e^{j+1})^2 + (E^j)^2 + 2 \int_\gamma \nabla_\gamma (u - U^{j+1}) \nabla_\gamma^T (U^{j+1} - U^j),
\]
we just have to examine the last term. Combining (3.13), (3.16), and (3.17) with the consistency error representation (4.3) gives
\[
\int_\gamma \nabla_\gamma (u - U^{j+1}) \nabla_\gamma^T (U^{j+1} - U^j) = - \int_\gamma \nabla_\gamma U^{j+1}\, E_{\Gamma^{j+1}}\, \nabla_\gamma^T (U^{j+1} - U^j).
\]
Invoking (4.6) yields
\[
\int_\gamma \nabla_\gamma (u - U^{j+1}) \nabla_\gamma^T (U^{j+1} - U^j) \preccurlyeq \|f\|_{L_2(\gamma)}\, \lambda^{j+1} E^j.
\]
This leads to (6.1) after applying Young's inequality and using (2.14).
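For completeness, the final step of this proof can be spelled out as follows. This is a sketch under two assumptions that are not stated in this form above: the bound obtained from (4.6) is taken to hold in absolute value, and $C$ denotes its hidden constant (a symbol introduced here only for illustration). Young's inequality $2ab \le \tfrac12 a^2 + 2 b^2$ with $a = E^j$ and $b = C \|f\|_{L_2(\gamma)} \lambda^{j+1}$ then gives

```latex
\[
2 \Big| \int_\gamma \nabla_\gamma (u - U^{j+1}) \nabla_\gamma^T (U^{j+1} - U^j) \Big|
\le 2 C \|f\|_{L_2(\gamma)} \lambda^{j+1} E^j
\le \tfrac12 (E^j)^2 + 2 C^2 \|f\|_{L_2(\gamma)}^2 (\lambda^{j+1})^2 .
\]
% Insert this into (e^{j+1})^2 = (e^j)^2 - (E^j)^2 - 2 \int_\gamma ... :
\[
(e^j)^2 - \tfrac32 (E^j)^2 - 2 C^2 \|f\|_{L_2(\gamma)}^2 (\lambda^{j+1})^2
\le (e^{j+1})^2
\le (e^j)^2 - \tfrac12 (E^j)^2 + 2 C^2 \|f\|_{L_2(\gamma)}^2 (\lambda^{j+1})^2 .
\]
```

This is (6.1) for $i = j+1$ with $\Lambda_2 = 2 C^2 \|f\|_{L_2(\gamma)}^2$. The case $i = j$ follows from the bound $\lambda^{j+1} \le \Lambda_0 \lambda^j$ provided by (2.14), at the cost of replacing $\Lambda_2$ by $\Lambda_0^2 \Lambda_2$.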
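Notice that relation (6.1) also holds when (i) $\mathcal T^j$, $\mathcal T^{j+1}$ are replaced with $\mathcal T$, $\mathcal T^*$ satisfying $\mathcal T^* \ge \mathcal T$; (ii) $U^{j+1}$ is replaced by $U^* \in \mathbb V(\mathcal T^*)$; and (iii) $U^j$ is replaced by any $V \in \mathbb V(\mathcal T)$, because $V$ need not be the Galerkin solution over $\mathcal T$.

The parameter
\[
\omega_2 := \frac{\xi \theta^2}{\Lambda_0 \sqrt{32 \Lambda_2 (2\Lambda_3 + 1)}}, \tag{6.2}
\]
where $\xi := 1 - 2^{-b/d}$ is defined in Lemma 4.9, is used subsequently as a threshold for the AFEM parameter $\omega$.

Theorem 6.2 (Conditional contraction property). Let $\theta \in (0, 1]$ be the marking parameter of MARK and let $\{\mathcal T^j, U^j\}_{j=0}^J$ be a sequence of meshes and discrete solutions produced by one call to procedure ADAPT_PDE($\mathcal T^0$, $\varepsilon$) inside AFEM, i.e., $\lambda^0 := \lambda_{\widehat{\mathcal T}^0} \le \omega\varepsilon$. Assume that the AFEM parameter $\omega$ satisfies
\[
\omega \le \min\{\omega_1, \omega_2\},
\]
where $\omega_1$ and $\omega_2$ are given in (5.4) and (6.2), respectively. Then there exist constants $0 < \alpha < 1$ and $\beta > 0$ such that
\[
(e^{j+1})^2 + \beta (\eta^{j+1})^2 \le \alpha^2 \big( (e^j)^2 + \beta (\eta^j)^2 \big) \qquad \forall\, 0 \le j < J. \tag{6.3}
\]
Moreover, the number of inner iterates $J$ of ADAPT_PDE is uniformly bounded.

Proof. We proceed in four steps. Note that $\eta^j \ge \varepsilon$ for $0 \le j < J$.

Step 1. Let $\beta > 0$ be a scaling parameter to be found later. We combine (6.1) and (4.22) to write
\[
(e^{j+1})^2 + \beta (\eta^{j+1})^2 \le (e^j)^2 + \Big( -\tfrac12 + \beta (1 + \delta^{-1}) \Lambda_3 \Big) (E^j)^2 + \Lambda_2 \big( 1 + \beta (1 + \delta^{-1}) \big) (\lambda^j)^2 + \beta (1 + \delta) \big( (\eta^j)^2 - \xi\, \eta^j(\mathcal M^j)^2 \big),
\]
where $\mathcal M^j$ is the set of elements marked for refinement at the $j$-th subiteration. To remove the factor of $E^j$ we now choose $\beta$, dependent on $\delta$, to be
\[
\beta (1 + \delta^{-1}) \Lambda_3 = \tfrac12 \quad \Longrightarrow \quad \beta (1 + \delta) = \frac{\delta}{2\Lambda_3}, \tag{6.4}
\]
and thereby obtain
\[
(e^{j+1})^2 + \beta (\eta^{j+1})^2 \le (e^j)^2 + \Lambda_2 \Big( 1 + \frac{1}{2\Lambda_3} \Big) (\lambda^j)^2 + \beta (1 + \delta) \big( (\eta^j)^2 - \xi\, \eta^j(\mathcal M^j)^2 \big).
\]

Step 2. Invoking Dörfler marking (5.5), we deduce
\[
(\eta^j)^2 - \xi\, \eta^j(\mathcal M^j)^2 \le (1 - \xi\theta^2) (\eta^j)^2.
\]
Since the initial mesh $\mathcal T^0$ comes from ADAPT_SURFACE we know that $\lambda^0 \le \omega\varepsilon \le \omega\eta^j$ for $0 \le j < J$. Using (2.14) yields $\lambda^j \le \Lambda_0 \omega \eta^j$, whence
\[
(e^{j+1})^2 + \beta (\eta^{j+1})^2 \le (e^j)^2 - \beta (1 + \delta) \frac{\xi\theta^2}{2} (\eta^j)^2 + \beta \Big( (1 + \delta) \Big( 1 - \frac{\xi\theta^2}{2} \Big) + \Lambda_2 \Big( 1 + \frac{1}{2\Lambda_3} \Big) \frac{\Lambda_0^2 \omega^2}{\beta} \Big) (\eta^j)^2.
\]
Moreover, $\omega \le \omega_1$ implies $(\lambda^j)^2 \le \frac{C_2}{2\Lambda_1} (\eta^j)^2$, which turns out to be (4.19). Therefore, applying the bound (4.21) and replacing $\beta$ according to (6.4), we obtain
\[
(e^{j+1})^2 + \beta (\eta^{j+1})^2 \le \alpha_1(\delta)^2 (e^j)^2 + \alpha_2(\delta)^2 \beta (\eta^j)^2
\]
with
\[
\alpha_1(\delta)^2 := 1 - \delta\, \frac{\xi\theta^2}{4\Lambda_3 \big( C_1 + \frac{C_2}{2} \big)}, \qquad
\alpha_2(\delta)^2 := (1 + \delta) \Big( 1 - \frac{\xi\theta^2}{2} \Big) + \Lambda_2 \Big( 1 + \frac{1}{2\Lambda_3} \Big) \frac{\Lambda_0^2 \omega^2}{\beta}.
\]

Step 3. It remains to prove that $\delta$ can be chosen so that $\alpha_2(\delta)^2 < 1$. We fix the parameter $\delta$ so that
\[
(1 + \delta) \Big( 1 - \frac{\xi\theta^2}{2} \Big) = 1 - \frac{\xi\theta^2}{4} \quad \Longrightarrow \quad \delta = \frac{\xi\theta^2}{4 - 2\xi\theta^2}.
\]
We now realize that (6.4) gives $\beta = \frac{\xi\theta^2}{2\Lambda_3 (4 - \xi\theta^2)} \ge \frac{\xi\theta^2}{8\Lambda_3}$ and, since $\omega \le \omega_2$, we infer that
\[
\Lambda_2 \Big( 1 + \frac{1}{2\Lambda_3} \Big) \frac{\Lambda_0^2 \omega^2}{\beta} \le \frac{4 \Lambda_2 (2\Lambda_3 + 1)}{\xi\theta^2}\, \Lambda_0^2 \omega^2 \le \frac{\xi\theta^2}{8}.
\]
Hence $\alpha_2^2 \le 1 - \frac{\xi\theta^2}{8} < 1$, and choosing $\alpha := \max\{\alpha_1, \alpha_2\} < 1$ yields the desired (6.3).

Step 4. The contraction property (6.3) guarantees that ADAPT_PDE stops in a finite number of iterations $J$. To show that $J$ is independent of the outer iteration counter $k$, take $k \ge 1$ and note that before the call ADAPT_PDE($\mathcal T_k^+$, $\varepsilon_k$) we have
\[
\eta_k := \eta_{\mathcal T_k}(U_k, F_k) \le \varepsilon_{k-1} = \frac{\varepsilon_k}{\rho}, \qquad
\lambda_k := \lambda_{\widehat{\mathcal T}_k} \le \Lambda_0\, \lambda_{\widehat{\mathcal T}^+_{k-1}} \le \frac{\Lambda_0 \omega}{\rho}\, \varepsilon_k.
\]
We next combine (4.22), with $\delta = 1$, and the estimate $\|\nabla_\gamma (U_k^+ - U_k)\|^2_{L_2(\gamma)} \preccurlyeq \|\nabla_\gamma (u - U_k)\|^2_{L_2(\gamma)} + \lambda_k^2$ arising from (6.1), to get
\[
\eta_{\mathcal T_k^+}(U_k^+, F_k^+)^2 \preccurlyeq \eta_k^2 + \lambda_k^2 + \|\nabla_\gamma (U_k^+ - U_k)\|^2_{L_2(\gamma)} \preccurlyeq \eta_k^2 + \lambda_k^2 + \|\nabla_\gamma (u - U_k)\|^2_{L_2(\gamma)},
\]
where $F_k^+$ is the right hand side associated to $\Gamma_{\mathcal T_k^+}$ defined in (3.17) and the hidden constants depend on $\Lambda_2$, $\Lambda_3$. The bounds on $\eta_k$, $\lambda_k$, together with (4.14), yield
\[
(\eta^0)^2 = \eta_{\mathcal T_k^+}(U_k^+, F_k^+)^2 \preccurlyeq \eta_k^2 + \lambda_k^2 \preccurlyeq \varepsilon_k^2.
\]
Since the stopping condition of ADAPT_PDE is $\eta^J \le \varepsilon_k$, (6.3) implies that $J$ is bounded independently of $k$, as asserted.

The fact that $J$ is uniformly bounded controls the complexity of ADAPT_PDE because the most expensive module SOLVE is run just $J$ times. However, this property is not required for the study of cardinality of §8.

7. Approximation Class. In this section we discuss the approximation classes $\mathbb A_s$ and their connection with Besov regularity. We start with the notion of total error in §7.1 leading to the definition of $\mathbb A_s$.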
We then introduce and discuss a greedy algorithm in §7.2, that we use repeatedly in the rest of the section. We study the best approximation error achievable with piecewise polynomials of degree $n \ge 1$ for the surface $\gamma$ in §7.3 and for the solution $u$ in §7.4. We analyze the decay rate of oscillation in §7.5. Finally in §7.6 we conclude with our second main result: the membership $(u, f, \gamma) \in \mathbb A_s$ in terms of Besov regularity of $u$, $f$, $\gamma$.

7.1. The Total Error. Let $\mathbb T_N \subset \mathbb T := \mathbb T(\mathcal T_0)$ be the set of all possible conforming triangulations, generated on $\gamma$ with at most $N$ elements more than $\mathcal T_0$ by successive bisection of $\mathcal T_0$:
\[
\mathbb T_N := \big\{ \mathcal T \in \mathbb T \;|\; \#\mathcal T - \#\mathcal T_0 \le N \big\}.
\]
Given $v \in H^1_\#(\gamma)$, $f \in L_2(\gamma)$ and $V \in \mathbb V(\mathcal T)$, we introduce the notion of total error
\[
E_{\mathcal T}(V; v, f, \gamma)^2 := \|\nabla_\gamma (v - V)\|^2_{L_2(\gamma)} + \mathrm{osc}_{\widehat{\mathcal T}}(V, f)^2 + \lambda_{\widehat{\mathcal T}}^2.
\]
Owing to the equivalence of norms (4.8) we rewrite the first term in the parametric domain $\Omega$ and obtain the following equivalent notion of total error provided $\lambda_{\widehat{\mathcal T}_0}$ satisfies (4.5):
\[
\widehat E_{\widehat{\mathcal T}}(V; v, f, \gamma)^2 := \sum_{\widehat T \in \widehat{\mathcal T}} \|\widehat\nabla (\widehat v - \widehat V)\|^2_{L_2(\widehat T)} + \mathrm{osc}_{\widehat{\mathcal T}}(V, f)^2 + \lambda_{\widehat{\mathcal T}}^2. \tag{7.1}
\]
Note that the last two terms are already evaluated in $\Omega$ according to definitions (2.13), (4.13). Yet, there is a nonlinear interaction between the approximations of $\gamma$ and of $u$, $f$ defined on $\gamma$. At this point, we recall the convention of dropping the patch index when no confusion arises; for example, $\widehat v|_{\widehat T}$ for $\widehat T \in \widehat{\mathcal T}^i$ in (7.1) stands for $\widehat v^i|_{\widehat T}$.

Assuming that the parameter $\omega$ satisfies $\omega \le \omega_1$ in (5.4) to guarantee the validity of Lemma 4.8 and the fact that AFEM is driven by $\eta_{\mathcal T}(U, F_\Gamma)$ and $\lambda_{\widehat{\mathcal T}}$, we assess the quality of the best approximation of $(v, f, \gamma)$ with $N$ degrees of freedom in terms of the following modulus of smoothness:
\[
\sigma(N; v, f, \gamma) := \inf_{\mathcal T \in \mathbb T_N}\ \inf_{V \in \mathbb V(\mathcal T)} \widehat E_{\widehat{\mathcal T}}(V; v, f, \gamma).
\]
This is thus consistent with the approach taken for flat domains in [12, 27].
For $s > 0$, we define the nonlinear (algebraic) approximation class $\mathbb A_s$ to be
\[
\mathbb A_s := \Big\{ (v, f, \gamma) : \ |v, f, \gamma|_{\mathbb A_s} := \sup_{N \ge 1} N^s \sigma(N; v, f, \gamma) < \infty \Big\}. \tag{7.2}
\]
The generic range of $s$ is dictated by the polynomial degree, namely $0 < s \le n/d$. An alternative and useful characterization of $(u, f, \gamma) \in \mathbb A_s$ is as follows: given $\varepsilon > 0$, there exists a subdivision $\mathcal T_\varepsilon$ with $\mathcal T_\varepsilon \ge \mathcal T_0$ and a discrete function $V_\varepsilon \in \mathbb V(\mathcal T_\varepsilon)$ such that
\[
\widehat E_{\widehat{\mathcal T}_\varepsilon}(V_\varepsilon; u, f, \gamma)^2 \le \varepsilon^2 \qquad \text{and} \qquad \#\mathcal T_\varepsilon - \#\mathcal T_0 \le |u, f, \gamma|_{\mathbb A_s}^{1/s}\, \varepsilon^{-1/s}. \tag{7.3}
\]
The characterization of $\mathbb A_s$ in terms of Besov regularity is an open issue, but we give sufficient conditions for membership in $\mathbb A_s$ in §7.3 and §7.4. Before that, we discuss in §7.2 greedy algorithms suited to our particular framework, where the subdivisions consist of a collection of compatible subdivisions.

7.2. Greedy Algorithm. In this section we present and discuss a greedy algorithm to construct a near best piecewise polynomial approximation of a vector-valued function $g := \{g^i\}_{i=1}^M : \Omega \to \mathbb R^M$ in a suitable semi-norm. Given a mesh $\widehat{\mathcal T} := \cup_{i=1}^M \widehat{\mathcal T}^i \in \widehat{\mathbb T}$, the algorithm requires a local error estimator $\zeta_{\widehat{\mathcal T}^i}(g^i, \widehat T)$ for $\widehat T \in \widehat{\mathcal T}^i$, $1 \le i \le M$. To simplify the notation, we set
\[
\zeta_{\widehat{\mathcal T}}(g, \widehat T) := \zeta_{\widehat{\mathcal T}^i}(g^i, \widehat T), \qquad \widehat T \in \widehat{\mathcal T}^i, \quad 1 \le i \le M.
\]
We emphasize at this point that the approximations of the $g^i$'s are not independent and, in fact, require compatibility conditions on $\partial\Omega$. Given a conforming refinement $\mathcal T$ of an initial triangulation $\mathcal T_0$ and a prescribed tolerance $\delta$, the algorithm reads:

   $[\mathcal T^+]$ := GREEDY($g$, $\mathcal T$, $\delta$)
   1. if $\mathcal M := \{T \in \mathcal T : \zeta_{\widehat{\mathcal T}}(g, \widehat T) > \delta\} = \emptyset$
         return($\mathcal T$) and exit
   2. $\mathcal T$ := REFINE($\mathcal T$, $\mathcal M$)
   3. go to 1.

where the module REFINE bisects all elements in the marked set $\mathcal M$ and keeps conformity as described in §5.1. Note that ADAPT_SURFACE is a particular instance of GREEDY that uses $\zeta_{\widehat{\mathcal T}^i}(\chi^i, \widehat T) := \lambda_{\widehat{\mathcal T}^i}(\widehat T)$ as local error estimator to approximate the patch $\gamma^i$ of the surface parametrized by $\chi^i : \Omega \to \mathbb R^{d+1}$. We now discuss some properties of the greedy algorithm following [27].
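Results of this type started with Birman and Solomyak for Sobolev spaces [5], and continued with [4] for Besov spaces and [15] for wavelet tree approximation. We do not refer to any specific norm below.

Proposition 7.1 (Performance of GREEDY). Let $\mathcal T := \cup_{i=1}^M \mathcal T^i$ be created by successive bisections of $\mathcal T_0$, which in turn is assumed to satisfy condition (b) in [30, Section 4]. Let $0 < p \le \infty$ and let $g := \{g^i\}_{i=1}^M : \Omega \to \mathbb R^M$ be a family of vector-valued functions and $\zeta_{\widehat{\mathcal T}}(g, \widehat T)$ be a corresponding local error estimator that satisfies
\[
\zeta_{\widehat{\mathcal T}}(g, \widehat T) \preccurlyeq h_T^r\, |g|_{\widehat T}, \qquad \widehat T \in \widehat{\mathcal T}, \tag{7.4}
\]
where $h_T := |\widehat T|^{1/d}$, $r > 0$, and $|g|_\Omega$ is a given semi-(quasi) norm with
\[
\max_{i=1,\dots,M} \Big( \sum_{\widehat T \in \widehat{\mathcal T}^i} |g^i|^p_{\widehat T} \Big)^{1/p} \le |g|_\Omega
\]
(with the convention that $\max_{i=1,\dots,M} \max_{\widehat T \in \widehat{\mathcal T}^i} |g^i|_{\widehat T} \le |g|_\Omega$ if $p = \infty$). If $|g|_\Omega < \infty$, then the module GREEDY($g$, $\mathcal T$, $\delta$) terminates in a finite number of steps and the set $\mathcal M$ of elements marked within GREEDY satisfies
\[
\#\mathcal M \preccurlyeq |g|_\Omega^{\frac{dp}{d+rp}}\, \delta^{-\frac{dp}{d+rp}}. \tag{7.5}
\]

Proof. The algorithm stops in a finite number of steps because the local estimator $\zeta_{\widehat{\mathcal T}}(g, \widehat T)$ is bounded by a positive power of $h_T$ according to (7.4), and the assumption on the initial triangulation ensures that a finite number of refinements are required to guarantee conformity. To prove (7.5) we organize the elements in $\mathcal M$ by size, which allows for a counting argument. Let $\mathcal P_j$ be the set of elements $T$ of $\mathcal M$ with size (in the parametric domain) satisfying $2^{-(j+1)} \le |\widehat T| < 2^{-j}$, so that
\[
T \in \mathcal P_j \quad \Longleftrightarrow \quad 2^{-(j+1)} \le |\widehat T| < 2^{-j} \quad \Longleftrightarrow \quad 2^{-(j+1)/d} \le h_T < 2^{-j/d}.
\]
We proceed in several steps, starting with $\mathcal T = \mathcal T_0$.

Step 1. We first observe that all $T$'s in $\mathcal P_j$ are disjoint. In fact, if $T_1, T_2 \in \mathcal P_j$ and they overlap (their interiors have a nonempty intersection), then one of them is contained in the other, say $T_1 \subset T_2$, due to the bisection procedure, thus $|\widehat T_1| \le \tfrac12 |\widehat T_2|$, contradicting the definition of $\mathcal P_j$. Then, recalling that we have $M$ copies of $\Omega$, we deduce
\[
2^{-(j+1)}\, \#\mathcal P_j \le M |\Omega| \quad \Longrightarrow \quad \#\mathcal P_j \le M |\Omega|\, 2^{j+1}. \tag{7.6}
\]

Step 2. We note that $\mathcal P_j = \cup_{i=1}^M \mathcal P_j^i$, where $\mathcal P_j^i$ contains the elements of $\mathcal P_j$ which are refinements of elements in $\mathcal T_0^i$. Each element $T \in \mathcal P_j$ belongs to a subdivision $\mathcal T$ that appears within GREEDY, so that in light of (7.4) and the fact that $\mathcal P_j \subset \mathcal M$ we have
\[
T \in \mathcal P_j^i \quad \Longrightarrow \quad \delta \le \zeta_{\widehat{\mathcal T}}(g, \widehat T) \preccurlyeq 2^{-(j/d) r}\, |g^i|_{\widehat T}.
\]
Therefore we have
\[
\delta^p\, \#\mathcal P_j \preccurlyeq 2^{-(j/d) rp} \sum_{i=1}^M \sum_{T \in \mathcal P_j^i} |g^i|^p_{\widehat T} \qquad \text{and} \qquad \#\mathcal P_j \preccurlyeq \delta^{-p}\, 2^{-(j/d) rp}\, |g|^p_\Omega. \tag{7.7}
\]

Step 3. The two bounds for $\#\mathcal P_j$ in (7.6) and (7.7) are complementary. The first one is good for $j$ small whereas the second one is suitable for $j$ large (think of $\delta \ll 1$). The crossover takes place for $j_0$ such that
\[
2^{j_0+1} M |\Omega| \approx \delta^{-p}\, 2^{-j_0 rp/d}\, |g|^p_\Omega \quad \Longleftrightarrow \quad 2^{j_0} \approx \big( M^{-1} |\Omega|^{-1} \delta^{-p} |g|^p_\Omega \big)^{\frac{d}{d+rp}}.
\]

Step 4. We now compute
\[
\#\mathcal M = \sum_j \#\mathcal P_j \preccurlyeq \sum_{j \le j_0} 2^j + \delta^{-p} |g|^p_\Omega \sum_{j > j_0} \big( 2^{-rp/d} \big)^j.
\]
Since $\sum_{j \le j_0} 2^j \approx 2^{j_0}$ and $\sum_{j > j_0} (2^{-rp/d})^j \preccurlyeq 2^{-(rp/d) j_0}$, we can write
\[
\#\mathcal M \preccurlyeq \big( |g|_\Omega\, \delta^{-1} \big)^{\frac{dp}{d+rp}},
\]
which is the desired estimate.

Step 5. It remains to remove the assumption $\mathcal T = \mathcal T_0$. Since $\mathcal T$ is a conforming refinement of $\mathcal T_0$, [7, Proposition 2] shows that the number of elements marked by GREEDY($g$, $\mathcal T$, $\delta$) does not exceed the number marked by GREEDY($g$, $\mathcal T_0$, $\delta$) and estimated in Step 4. This concludes the proof.

We consider the estimator $\zeta_{\widehat{\mathcal T}}(g) := \{\zeta_{\widehat{\mathcal T}}(g, \widehat T)\}_{\widehat T \in \widehat{\mathcal T}}$ and its accumulation in $\ell_q$ with $0 < p < q \le \infty$. Its decay rate is assessed next.

Corollary 7.2 (Estimate in $\ell_q$). Let $\zeta_{\widehat{\mathcal T}}(g)$ satisfy (7.4) with $r := d(s - 1/p + 1/q) > 0$. Let the initial subdivision $\mathcal T_0$ satisfy condition (b) in [30, Section 4]. Given $\delta > 0$ there exists a conforming mesh refinement $\mathcal T \in \mathbb T$ such that
\[
\|\zeta_{\widehat{\mathcal T}}(g)\|_{\ell_q} \preccurlyeq \delta, \qquad \#\mathcal T - \#\mathcal T_0 \preccurlyeq \#\mathcal M \preccurlyeq |g|_\Omega^{1/s}\, \delta^{-1/s}.
\]

Proof. Since $\frac{dp}{d+rp} = \frac{q}{1+qs}$, the output of the call $[\mathcal T]$ = GREEDY($g$, $\mathcal T_0$, $\epsilon$) satisfies
\[
\zeta_{\widehat{\mathcal T}}(g, \widehat T) \preccurlyeq \epsilon \quad \forall\, T \in \mathcal T, \qquad \#\mathcal M \preccurlyeq |g|_\Omega^{\frac{q}{1+qs}}\, \epsilon^{-\frac{q}{1+qs}}, \tag{7.8}
\]
for any $\epsilon > 0$ according to Proposition 7.1. Combining this with the complexity estimate (5.6) readily implies
\[
\|\zeta_{\widehat{\mathcal T}}(g)\|_{\ell_q} \preccurlyeq \epsilon\, (\#\mathcal T)^{1/q} \preccurlyeq \epsilon\, (\#\mathcal M)^{1/q} \preccurlyeq \epsilon^{\frac{qs}{1+qs}}\, |g|_\Omega^{\frac{1}{1+qs}}.
\]
If $\epsilon$ satisfies $\delta = \epsilon^{\frac{qs}{1+qs}} |g|_\Omega^{\frac{1}{1+qs}}$, then it is easy to see that (7.8) is valid.

7.3. Constructive Approximation of $\gamma$. We now analyze a constructive approximation of $\gamma$ by piecewise polynomials based on the GREEDY algorithm. We also show that this algorithm, and hence ADAPT_SURFACE, is $t$-optimal, i.e. the set of marked elements satisfies (1.4), provided that $\gamma$ belongs to a suitable Besov space. The case of polynomial degree $n = 1$ with regularity of $\gamma$ in terms of Sobolev scales is discussed in [6]. We establish here a result for higher order degree $n \ge 1$, for which the regularity of $\gamma$ must be measured in Besov scales. We recall the following compact notation (2.4):
\[
\chi := \{\chi^i\}_{i=1}^M, \qquad |\chi|_{B_q^{1+td}(L_q(\Omega))} := \max_{i=1,\dots,M} |\chi^i|_{B_q^{1+td}(L_q(\Omega))}.
\]

Corollary 7.3 (Constructive approximation of $\gamma$). Let $\gamma$ be piecewise of class $B_q^{1+td}(L_q(\Omega))$, with $tq > 1$, $0 < q \le \infty$, $td \le n$, and globally of class $W^1_\infty$. Let $\mathcal T_0$ satisfy condition (b) in [30, Section 4]. Then ADAPT_SURFACE($\mathcal T$, $\tau$) is $t$-optimal, i.e.
\[
\lambda_{\widehat{\mathcal T}^+} \le \tau, \qquad \#\mathcal M \preccurlyeq C_1(\gamma)\, \tau^{-1/t},
\]
where $\mathcal M$ denotes the set of all marked elements in a call to ADAPT_SURFACE and $C_1(\gamma) \le |\chi|^{1/t}_{B_q^{1+td}(L_q(\Omega))}$ is the constant in (1.4).

Proof. Observe that $B_q^{1+td}(L_q(\widehat T))$, with $tq > 1$ and $0 < q \le \infty$, is just above the nonlinear Sobolev scale of $W^1_\infty$ in dimension $d$ [31, p. 482], [22, Lemma 4.12], so that
\[
B_q^{1+td}(L_q(\widehat T)) \subset B^1_\infty(L_\infty(\widehat T)) \subset W^1_\infty(\widehat T).
\]
Therefore, a scaling argument and local interpolation estimates give the following bound with $r = dt - d/q > 0$ and $\widehat T \in \widehat{\mathcal T}$:
\[
\lambda_{\widehat{\mathcal T}}(\widehat T) = \|\widehat\nabla (\chi - X_{\widehat{\mathcal T}})\|_{L_\infty(\widehat T)} \preccurlyeq h_{\widehat T}^r\, |\chi|_{B_q^{1+td}(L_q(\widehat T))}. \tag{7.9}
\]
We can then apply Proposition 7.1 with $p = q$, $g = \chi$, $|\chi|_\Omega = |\chi|_{B_q^{1+td}(L_q(\Omega))}$, $r = dt - d/q$ and $\zeta_{\widehat{\mathcal T}}(g, \widehat T) = \lambda_{\widehat{\mathcal T}}(\widehat T)$. Upon termination of $[\mathcal T^+]$ = ADAPT_SURFACE($\mathcal T$, $\tau$), we obtain $\lambda_{\widehat{\mathcal T}^+} \le \tau$ along with the asserted estimate on $\#\mathcal M$ because $\frac{dq}{d+rq} = \frac1t$.

7.4. Constructive Approximation of $u$.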
We use the vector notation (2.4) u := {ui }M i=1 , |u|Bp1+sd (Lp (Ω)) := max |ui |Bp1+sd (Lp (Ω)) , i=1,...,M b = {Vb i }M and where ui := u|γ i ◦ χi and γ i is the i-th surface patch, along with V i=1 b 22 b − V)k k∇(u L (Ω) := M X b i − Vb i )k2 2 . k∇(u L (Ω) i=1 Corollary 7.4 (Constructive approximation of u). Let u ∈ H 1 (γ) be piecewise of class Bp1+sd (Lp (Ω)) with s − 1/p + 1/2 > 0, 0 < p ≤ ∞ and 0 < sd ≤ n. Let T 0 be a subdivision of Γ0 satisfying the condition (b) in [30, Section 4]. Then, given δ > 0 there exists a triangulation T ∈ T such that inf V∈V(T ) b L2 (Ω) 4 δ, b − V)k k∇(u #M 4 C(u)δ −1/s , (7.10) 1/s where M is the set of marked elements to create T and C(u) = |u|B 1+sd (L p p (Ω)) . b ∈ Bpsd (Lp (Ω)) and applying Corollary 7.2 with q := 2 we Proof. Taking g := ∇u obtain the desired estimate provided we employ discontinuous piecewise polynomials of degree ≤ n over Tb . We finally resort to [32], which shows that the error decay is in fact the same regardless of continuity for approximation of functions in H 1 (Ω). 7.5. Decay Rate of Oscillation. In order to study the decay rate of the oscillation osc2Tb (U, f ), we split it into two terms that we analyze separately: osc2Tb (U, f ) ≤ osc2Tb (U ) + osc2Tb (f ), (7.11) where osc2Tb (U ) := X osc2Tb (U, Tb), osc2Tb (f ) := Tb∈Tb X Tb∈Tb 33 osc2Tb (f, Tb), (7.12) and for T ∈ T and V ∈ V(T ) c (qΓ ∇ b Vb G−1 ) k2 osc2Tb (V, Tb) := h2T k(id − Π22n−2 )div (7.13) Γ L2 (Tb) 2 b Vb i )+ (G+ )−1 n b Vb i )− (G− )−1 n b + − qΓ− ∇( b− + hT (id − Π22n−1 ) qΓ+ ∇( Γ Γ L2 (∂ Tb) 2 oscTb (f, Tb) := h2T k(id − Π22n−2 )(f q)kL2 (Tb) . (7.14) To assess their decay rate, we resort to the following bound [12, Lemma 3.2]: k(id −Π2m )(vV )kL2 (ω) ≤ k(id −Π∞ m−n )vkL∞ (ω) kV kL2 (ω) , (7.15) which is valid for 0 ≤ n ≤ m, any domain ω of Rd or Rd−1 , V ∈ Pn (ω) and v ∈ L∞ (ω). Since Π2m is invariant over Pm , we see that (id −Π2m )(Π∞ m−n vV ) = 0 whence (id −Π2m )(vV ) = (id −Π2m )[(id −Π∞ m−n )vV ]. 
2 2 This yields (7.15) for any interpolant Π∞ m−n v via the L -stability of Πm . We now embark on the study of the decay rate of oscillation: we investigate oscTb (f ) in § 7.5.1 and oscTb (U ) in § 7.5.2. 7.5.1. Decay Rate of oscTb (f ). We employ the vector notation (2.4) f := {f i }M i=1 , |f |Bpsd (Lp (Ω)) := max |f i |Bpsd (Lp (Ω)) , i=1,...,M where f i := f |γ i ◦ χi and f |γ i is the restriction of f to the surface patch γ i for 1 ≤ i ≤ M . We also recall that the superscript i indicating the patch label is dropped when no confusion arises. Corollary 7.5 (Constructive approximation of f ). Let γ be piecewise of class 1 Bq1+td (Lq (Ω)) with tq > 1, 0 < q ≤ ∞, td ≤ n, and globally of class W∞ , and let sd k = b1 + tdc + 1. Let f ∈ L2 (γ) be picewise of class Bp (Lp (Ω)) with s − 1/p + 1/2 > 0, 0 < p ≤ ∞ and sd ≤ n. Let T 0 be a subdivision of Γ0 satisfying the condition (b) in [30, Section 4]. Then, given δ > 0 there exists a triangulation T ∈ T such that oscTb (f ) 4 δ, 1 #T − #T0 4 C(f , γ) δ − s∧t+1/d , (7.16) where s ∧ t := min{s, t} and n C(f , γ) := kf kBpsd (Lp (Ω)) kχkBq1+td (Lq (Ω)) + kχkkB 1+td (L q 1 o s∧t+1/d q (Ω)) . Proof. Since Π22n−2 (f q) ∈ P2n−2 is the best L2 -approximation of f q, we see that kf q − Π22n−2 (f q)kL2 (Tb) ≤ kf q − Π22n−2 (V q)kL2 (Tb) ≤ k(f − V )qkL2 (Tb) + kV q − Π22n−2 (V q)kL2 (Tb) ≤ kf − V kL2 (Tb) kqkL∞ (Tb) + kq − Π∞ n−1 qkL∞ (Tb) kV kL2 (Tb) , for all V ∈ Pn−1 due to (7.15). Taking V = Π2n−1 f we have kV kL2 (Tb) ≤ kf kL2 (Tb) and oscTb (f, Tb) ≤ hTb kf − Π2n−1 f kL2 (Tb) kqkL∞ (Tb) + hTb kq − Π∞ n−1 qkL∞ (Tb) kf kL2 (Tb) . 34 We now introduce for each Tb ∈ Tb E2,Tb (f , Tb) := hTb kf − Π2n−1 f kL2 (Tb) , E∞,Tb (q, Tb) := hTb kq − Π∞ n−1 qkL∞ (Tb) (7.17) and notice that an immediate generalization of [22, Lemma 4.15] implies the local error estimates E∞,Tb (q, Tb) 4 hrTb∞ |q|B td (Lq (Tb)) E2,Tb (f , Tb) 4 hrTb2 |f |B sd (Lp (Tb)) , p q with r2 = (s + 1/d)d − d/p + d/2 > 1 and r∞ = (t + 1/d)d − d/q > 1. 
Moreover, td M thanks to Corollary 9.5, we have q := {q i }M and i=1 ∈ Bq (Lq (Ω)) n kqkL∞ (Ω) 4 kqkBqtd (Lq (Ω)) 4 max kχkBq1+td (Lq (Ω)) , kχkkB 1+td (L q o q (Ω)) , (7.18) with k = b1 + tdc + 1. Given δ > 0, we resort to Corollary 7.2 for the local indicator E2,Tb (f , Tb) with q = 2 and s replaced by s + 1/d to obtain a mesh T2 ∈ T that satisfies 1 1 − s+1/d . #T2 − #T0 4 |f |Bs+1/d sd (L (Ω)) (C2 δ) p E2,Tb2 (f ) 4 C2 δ, (7.19) p On the other hand, invoking Proposition 7.1 with tolerance δ and local indicator dq 1 = t+1/d , we find a mesh T∞ ∈ T such that E∞,Tb (q, Tb), and observing that d+r ∞q 1 E∞,Tb∞ (q) 4 C∞ δ, 1 − t+1/d #T∞ − #T0 4 |q|Bt+1/d . td (L (Ω)) (C∞ δ) q (7.20) q If T = T2 ⊕ T∞ is the overlay of the meshes T2 and T∞ , then it remains to show that T satisfies (7.16). Since the local indicators (7.17) are monotone, i.e they do not increase with refinement, we deduce from (7.19) and (7.20) 2 2 oscTb (f )2 4 E2, (f ) kqk2L∞ (Ω) + E∞, (q) kf k2L2 (Ω) Tb Tb 2 4 δ 2 C22 kqk2L∞ (Ω) + C∞ kf k2L2 (Ω) . We now choose the constants C2 and C∞ as follows: C2 = kqk−1 , B td (Lq (Ω)) C∞ = kf k−1 . B sd (Lp (Ω)) q p This implies oscTb (f ) 4 δ in view of (7.18) and kf kL2 (Ω) 4 kf kBpsd (Lp (Ω)) . Finally, since #T ≤ #T2 + #T∞ − #T0 according to [12, Lemma 3.7], we obtain 1 #T − #T0 ≤ (#T2 − #T0 ) + (#T∞ − #T0 ) 4 C(f , γ) δ − s∧t+1/d . This follows from (7.19) and (7.20) upon replacing the exponents s and t by s ∧ t and noticing that their left-hand sides are always larger than or equal to 1. This concludes the proof. Remark 7.6 (Besov regularity of f ). If ui ∈ Bp1+sd (Lp (Ω)) and γ is smooth, c q ∇u b i G−1 ∈ Bpsd−1 (Lp (Ω)) is the natural Besov regularity for f i . then f i = −q −1 div However, we require that f i ∈ Bpsd (Lp (Ω)) because the data oscillation is evaluated in L2 (Ω) rather than H −1 (Ω). This additional degree of regularity of f is responsible for the faster decay of oscTb (f ) reported in Corollary 7.5. 35 7.5.2. Decay Rate of oscT (U ). 
In this section we study the decay rates of oscT (U ) defined in (7.13). We again use the GREEDY algorithm where now the local indicator will be oscTb (V, Tb) for V ∈ V(T ). We start with an estimate for oscTb (V, Tb) in terms of a positive power of hTb . In view of expression (7.13) for oscTb (V, Tb), the major non standard obstruction is the presence of the surface dependent and non-polynomial term qΓ G−1 Γ . This requires two auxiliary results about Besov spaces, namely Corollary 9.2 (scale-invariant Besov semi-norm of products of functions) and Lemma 9.4 (scale-invariant Besov norm of composition), which we prove later in Section 9 not to interrupt the flow. We are now in position to show that oscT (V, Tb) is bounded by a positive power of hTb . The proof of Proposition 7.7 is a consequence of the following three lemmas. Proposition 7.7 (Local decay of oscillation). Let the surface γ be piecewise of class Bq1+td (Lq (Ω)) with tq > 1, 0 < q ≤ ∞, td ≤ n, and let γ be globally of class 1 W∞ . For all T ∈ T and all V ∈ V(T ) kk X 1/`j b Vb k k∇ oscTb (V, Tb) 4 hrTb |χ| 1+td` (7.21) j L2 (N b (Tb)) , b b j=1,`j =j/kk Bq (Lq/`j (NTb (T ));T ) T with r = td−d/q > 0 and k = btdc+1, NTb (Tb) is the set containing Tb and its adjacent 1+td`j (Lq/`j (NTb (Tb)); Tb ) indicates the broken Besov space. elements, and Bq Lemma 7.8 (Element oscillation). Under the assumptions of Proposition 7.7, for all V ∈ V(T ) and all Tb ∈ Tb i , we have c Γ∇ b Vb G−1 )k b 4 hr |qΓ G−1 | td bb hT k(id − Π22n−2 )(div(q Γ Γ B (Lq (Tb)) k∇V kL2 (Tb) , L2 (T ) Tb q with r = td − d/q > 0. Proof. We first observe that c Γ∇ c qΓ G−1 · ∇ b Vb G−1 ) = div b Vb + qΓ G−1 : D b 2 Vb , div(q Γ Γ Γ and by (7.15) −1 c Γ∇ c b Vb G−1 )k b 4 k(id − Π∞ b Vb k b k(id − Π22n−2 )div(q kL∞ (Tb) k∇ n−1 )div qΓ GΓ Γ L2 (T ) L2 (T ) −1 b2 b + k(id − Π∞ n−1 )(qΓ GΓ )kL∞ (Tb) kD V kL2 (Tb) . 
Using interpolation estimates in Besov norms of an immediate generalization of [22, Lemma 4.15] we have −1 c c qΓ G−1 | td k(id − Π∞ kL∞ (Tb) . hrTb |div n−1 )div qΓ GΓ Γ B (Lq (Tb)) , q with 0 < r ≤ n. By the inverse inequality of Lemma 4.4, we readily get 1 −1 c qΓ G−1 | td |div Γ Bq (Lq (Tb)) . h |qΓ GΓ |Bqtd (Lq (Tb)) . Tb A similar argument applied to the second term gives −1 −1 r b2 b k(id − Π∞ n−1 )(qΓ GΓ )kL∞ (Tb) kD V kL2 (Tb) . hTb |qΓ GΓ |B td (Lq (Tb)) q 36 b Vb k b k∇ L2 (T ) hTb . This proves the asserted estimate. Lemma 7.9 (Jump oscillation). Let the assumptions of Proposition 7.7 be valid. For all Tb ∈ T i and all V ∈ V(T ), there holds 2 1/2 −b − − −1 − + b + (G+ )−1 n b b hT (id − Π22n−1 ) qΓ+ ∇V − q ∇V (G ) n Γ Γ Γ L2 (∂ Tb) 4 b bi hrTb |qΓ G−1 Γ |Bqtd (Lq (NTb (Tb));Tb ) k∇V kL2 (NTb (Tb)) , with r = td − d/q > 0. Proof. Let Sb = Tb+ ∩ Tb− be any side of Tb+ := Tb. Since 1/2 b + (G+ )−1 n b − (G− )−1 n b + − qΓ− ∇V b− hTb (id − Π22n−1 ) qΓ+ ∇V Γ Γ b L2 (S) 1/2 +b + + −1 + 2 b 4 hTb (id − Π2n−1 ) qΓ ∇V (GΓ ) n b L2 (S) 1/2 −b − − −1 − 2 b , + hTb (id − Π2n−1 ) qΓ ∇V (GΓ ) n b L2 (S) we estimate each term separately, dropping the ± superscript. We invoke (7.15) to deduce that b S b G−1 n b b, ∇V b L (S) ≤ (id − Π∞, ) qΓ G−1 n (id − Π22n−1 ) qΓ ∇V b n−1 Γ b Γ L2 (S) ∞ b L2 (S) b S Π∞, n−1 b projector onto Pn−1 (S). b Since the unit normal n b is constant where is the L∞ (S) b on S, the interpolation estimate from an immediate generalization of [22, Lemma 4.15] reveals that b −1 r b k k(id − Π∞,S ) qΓ G−1 n b h qΓ G td b± , n−1 Γ L∞ (S) Γ Tb Bq (Lq (T )) where we have used the assumption r ≤ n. This together with a scaled trace estimate b k b h−1/2 k∇V b k b± yields the desired estimate. k∇V L2 (S) L2 (T ) Tb We see from Lemmas 7.8 and 7.9 that the discrete surface Γ enters the estimates via |qΓ G−1 Γ |B td (Lq (Tb)) . The next lemma provides control of this term. q Lemma 7.10 (Besov semi-norm of qΓ G−1 Γ ). 
Let the assumptions of Proposition 7.7 hold and k = btdc + 1. We then have for all Tb ∈ Tb k |qΓ G−1 Γ |Bqtd (Lq (Tb)) k X 4 1/`j |χ| 1+td`j Bq j=1,`j =j/kk (Lq/`j (Tb)) . Proof. We invoke Lemma 9.1 (scale-invariant Besov semi-norm of the product of two functions) along with the Hölder inequality to write −1 |qΓ G−1 Γ |B td (Lq (Tb)) 4 qΓ L∞ (Tb) GΓ B td (Lq (Tb)) q q + k−1 X 1/`m qΓ td`m Bq m=1 `m :=m/k + qΓ B td (L q 4 k X q (T )) b m=1 `m :=m/k −1 G Γ L 1/`m qΓ td`m Bq (Lq/`m (Tb)) q q/(1−`m ) (T )) b ∞ (T ) b (Lq/`m (Tb)) 37 1/(1−`m ) + G−1 Γ B td(1−`m ) (L 1/`m + G−1 Γ B td`m (L q q/`m (T )) b , because qΓ L∞ (Tb) , G−1 4 1 for γ being globally Lipschitz. We denote Γ L∞ (Tb) s∗ = td`m , q ∗ = q/`m , 1 ≤ m ≤ k, 1/2 : and bound |qΓ |B s∗ (Lq∗ (Tb)) using Lemma 9.4 for qΓ = det(∇X T ∇X) q |qΓ |B s∗ (Lq∗ (Tb)) 4 k X ` X q `=1 i=1 k∇Xk`−i L (Tb) i Y X ∞ Pi ∗ j=1 `j =1 |∇X| s∗ ` ∗ j Bq j=1 (Lq∗ /`∗ (Tb)) ; j a similar bound is valid for G−1 Γ . To get a simpler expression, we observe that Pi ∗ ∗ i−1 ≤ 1 and ij ∈ N, whence there cannot be more j=1 `j = 1 with 0 ≤ `j = ij /k Qi ∗ than i − 1 vanishing `j ’s in each product j=1 |∇X| s∗ `∗j . Therefore b Bq i Y j=1 |∇X| s∗ ` ∗ Bq j (Lq∗ /`∗ (Tb)) j because |∇X|B 0 ∞ (L∞ (T )) b ≤ max 1, k∇Xki−1 L∞ (Tb) (Lq∗ /`∗ (T )) j i Y |∇X| j=1 `∗ j 6=0 s∗ ` ∗ j Bq (Lq∗ /`∗ (Tb)) , j = k∇XkL∞ (Tb) . Now, from i ≤ ` ≤ k we obtain k X ` X |qΓ |B s∗ (Lq∗ (Tb)) 4 max 1, k∇Xkk−1 L (Tb) q ∞ `=1 i=1 i Y X Pi ∗ j=1 `j =1 |∇X| s ∗ `∗ j Bq j=1 `∗ j 6=0 (Lq∗ /`∗ (Tb)) , j and remove the first factor in light of (2.9) and χ being globally Lipschitz. Since Pi ∗ ∗ j=1 `j = 1 and `j > 0, we employ Hölder’s inequality to estimate each product as i Y |∇X| j=1 `∗ j 6=0 s∗ ` ∗ Bq j (Lq∗ /`∗ (Tb)) j 4 i X |∇X| 1/`∗ j s∗ ` ∗ j Bq j=1 `∗ j 6=0 . 
(Lq∗ /`∗ (Tb)) j Combining this with the preceding expression, and taking into account that the numbers of appearances of each |∇X| s∗ `∗j only depends of k, we have b Bq 1/` |qΓ |B s∗m(L q b q ∗ (T )) 4 (Lq∗ /`∗ (T )) j k−1 kX |∇X| j=1 k−1 `∗ j =j/k 1/`∗ j `m s∗ ` ∗ j Bq , (Lq∗ /`∗ (Tb)) j where `∗j = j/k k−1 has been redefined to fit all possible cases. It suffices now to realize that `∗j `m can be written as `n = `∗j `m = n/k k for 1 ≤ n ≤ k k and s∗ `∗j = s`n , q ∗ /`∗j = q/`n . Finally, applying Lemma 2.1 (local stability of Lagrange interpolation) we may replace X by χ in Tb and thereby obtain the asserted estimate. Proposition 7.11 (Uniform decay rate of oscTb (V )). Let γ be piecewise of class 1 Bq1+td (Lq (Ω)), with tq > 1, td ≤ n, and globally of class W∞ . Let T 0 satisfy condition 38 (b) in [30, Section 4] and let T ≥ T 0 be a refinement of T 0 . Then, for any tolerance δ > 0 there exists a subdivision Tδ ∈ T such that Tδ ≥ T and oscTbδ (V ) 4 δ, b L (Ω) b Vk V ∈V(Tδ ) k∇ #M 4 C2 (γ) δ −1/t , max 2 where M is the set of elements marked to create Tδ from T and the constant C2 (γ) depends on γ and is given explicitly by n o1/t k C2 (γ) := max kχkBq1+td (Lq (Ω)) , kχkkB 1+td (L (Ω)) . q q Proof. We make use of the GREEDY algorithm upon taking p = q, g = χ, and k ζTb (χ, Tb) = hrTb |χ|Tb , k X |χ|Tb = |χ| j=1,`j =j/kk 1/`j 1+td`j Bq (Lq/`j (Tb)) , (7.22) with r = d(t − 1q ) > 0. The assumptions of Proposition 7.1 hold because k |χ|qΩ = X |χ|qTb = Tb∈Tb X k X |χ| j=1 `j =j/kk Tb∈Tb q 1/`j 1+td`j Bq k k 4 X Tb∈Tb (Lq/`j (Tb)) k X |χ| j=1 `j =j/kk q/`j 1+td`j Bq (Lq/`j (Tb)) k X 4 j=1 `j =j/kk q/`j |χ| 1+td`j Bq (Lq/`j (Ω)) , and together with Bq1+td (Lq (Ω)) ⊂ Bq1+td`j (Lq/`j (Ω)) implies |χ| 1+td`j Bq (Lq/`j (Ω)) for all 0 < `j ≤ 1 4 kχkBq1+td (Lq (Ω)) and |χ|Ω 4 C2 (γ)t . Then, Proposition 7.1 guarantees that the call T = GREEDY(χ, T 0 , δ) stops in a finite number of steps and the resulting subdivision T satisfies ζTb (χ, Tb) ≤ δ ∀ Tb ∈ Tb . 
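This, in conjunction with Proposition 7.7 and the finite overlapping property of the sets $N_{\widehat{\mathcal T}}(\widehat T)$, implies that $\widehat{\mathcal T}$ satisfies
\[
\operatorname{osc}_{\widehat{\mathcal T}}(V)^2 \preccurlyeq \sum_{\widehat T\in\widehat{\mathcal T}} \zeta_{\widehat{\mathcal T}}\big(\chi, N_{\widehat{\mathcal T}}(\widehat T)\big)^2\, \|\widehat\nabla\widehat V\|^2_{L_2(N_{\widehat{\mathcal T}}(\widehat T))} \le \delta^2\, \|\widehat\nabla\widehat V\|^2_{L_2(\Omega)} \qquad \forall\, V \in \mathbb V(\mathcal T).
\]
This proves the first assertion. In order to bound the cardinality of $\mathcal M$ (or equivalently of its counterpart $\widehat{\mathcal M}$ in the parametric domain), we rely on the estimate (7.5) on the elements marked by GREEDY,
\[
\#\mathcal M \preccurlyeq |\chi|_\Omega^{\frac{dq}{d+rq}}\, \delta^{-\frac{dq}{d+rq}}.
\]
The proof concludes upon realizing that $\frac{dq}{d+rq} = \frac1t$.

Remark 7.12 (Approximation of $\gamma$). Since the local error estimator in (7.22) satisfies $\zeta_{\widehat{\mathcal T}}(\chi,\widehat T) \succcurlyeq \lambda_{\widehat{\mathcal T}}(\widehat T)$, we deduce $\lambda_{\widehat{\mathcal T}} \preccurlyeq \delta$ for the mesh $\mathcal T_\delta$ of Proposition 7.11.

7.6. Membership in $\mathbb A_s$. We now collect the estimates derived earlier in this section and prove our second main result.

Theorem 7.13 (Membership in $\mathbb A_s$). Let $\gamma$ be piecewise of class $B_q^{1+td}(L_q(\Omega))$ with $tq > 1$, $0 < q \le \infty$ and $td \le n$, and globally of class $W^1_\infty$, and let $k := \lfloor 1+td \rfloor + 1$. Let $u \in H^1_\#(\gamma)$ and $f \in L_2(\gamma)$ be piecewise of class $B_p^{1+sd}(L_p(\Omega))$ and $B_p^{sd}(L_p(\Omega))$ respectively, with $s - 1/p + 1/2 > 0$, $0 < p \le \infty$ and $0 < sd \le n$. Let $\mathcal T_0$ satisfy condition (b) in [30, Section 4] and $\lambda_{\widehat{\mathcal T}_0}$ satisfy (4.5). Then $(u, f, \gamma) \in \mathbb A_{s\wedge t}$, i.e., given $\delta > 0$ there exists a conforming refinement $\mathcal T$ such that
\[
\inf_{V\in\mathbb V(\mathcal T)} E_{\widehat{\mathcal T}}(V; u, f, \gamma) \preccurlyeq \delta, \qquad \#\mathcal T - \#\mathcal T_0 \preccurlyeq |u, f, \gamma|_{\mathbb A_{s\wedge t}}^{\frac{1}{s\wedge t}}\, \delta^{-\frac{1}{s\wedge t}}, \tag{7.23}
\]
and
\[
|u, f, \gamma|_{\mathbb A_{s\wedge t}} \le |u|_{B_p^{1+sd}(L_p(\Omega))} + \|\chi\|_{B_q^{1+td}(L_q(\Omega))} + \|\chi\|^k_{B_q^{1+td}(L_q(\Omega))} \big( 1 + \|f\|_{B_p^{sd}(L_p(\Omega))} \big). \tag{7.24}
\]

Proof. Since $\lambda_{\widehat{\mathcal T}_0}$ satisfies (4.5), instead of dealing with $E_{\widehat{\mathcal T}_N}(V; u, f, \gamma)$, we argue with the equivalent quantity $\widehat E_{\widehat{\mathcal T}_N}(V; u, f, \gamma)$ from (7.1), which is evaluated in $\Omega$. We invoke Corollaries 7.4 and 7.5 to obtain meshes $\mathcal T_u, \mathcal T_f \in \mathbb T$ satisfying
\[
\inf_{V\in\mathbb V(\mathcal T_u)} \|\widehat\nabla(u - V)\|^2_{L_2(\Omega)} \preccurlyeq \delta, \qquad \#\mathcal M_u \preccurlyeq C(u)\, \delta^{-1/s},
\]
\[
\operatorname{osc}_{\widehat{\mathcal T}_f}(f) \preccurlyeq \delta, \qquad \#\mathcal T_f - \#\mathcal T_0 \preccurlyeq C(f, \gamma)\, \delta^{-1/(s\wedge t + 1/d)}.
\]
We next apply Proposition 7.11 and Remark 7.12, starting from $\mathcal T_u$, to obtain a refinement $\mathcal T_\gamma \in \mathbb T$ such that
\[
\lambda_{\widehat{\mathcal T}_\gamma} \preccurlyeq \delta, \qquad \max_{V\in\mathbb V(\mathcal T_\gamma)} \frac{\operatorname{osc}_{\widehat{\mathcal T}_\gamma}(V)}{\|\widehat\nabla\widehat V\|_{L_2(\Omega)}} \preccurlyeq \delta, \qquad \#\mathcal M_\gamma \preccurlyeq C_2(\gamma)\, \delta^{-1/t}.
\]
The cardinality of $\mathcal T_\gamma$ can be estimated via Lemma 5.1 (complexity of REFINE):
\[
\#\mathcal T_\gamma - \#\mathcal T_0 \preccurlyeq \#\mathcal M_u + \#\mathcal M_\gamma \preccurlyeq C(u)\, \delta^{-1/s} + C_2(\gamma)\, \delta^{-1/t}.
\]
Since the cardinalities can be assumed to be at least 1, we can replace the exponents $s$, $t$ and $s\wedge t + 1/d$ of $\delta$ and of the constants $C_2(\gamma)$, $C(u)$ and $C(f, \gamma)$ by $s\wedge t$. Let $\mathcal T = \mathcal T_\gamma \oplus \mathcal T_f$ be the overlay of the two meshes $\mathcal T_\gamma$ and $\mathcal T_f$. According to [12, Lemma 3.7], the quantity $\#\mathcal T - \#\mathcal T_0$ is bounded by $\#\mathcal T_\gamma + \#\mathcal T_f - 2\#\mathcal T_0$, whence
\[
\#\mathcal T - \#\mathcal T_0 \preccurlyeq |u, f, \gamma|_{\mathbb A_{s\wedge t}}^{1/(s\wedge t)}\, \delta^{-1/(s\wedge t)},
\]
with the nonlinear quantity $|u, f, \gamma|_{\mathbb A_{s\wedge t}}$ satisfying (7.24).

It remains to show the first estimate in (7.23). We first observe that
\[
\inf_{V\in\mathbb V(\mathcal T)} \|\widehat\nabla(u - V)\|_{L_2(\Omega)} \le \inf_{V\in\mathbb V(\mathcal T_u)} \|\widehat\nabla(u - V)\|_{L_2(\Omega)}
\]
because $\mathcal T \ge \mathcal T_u$. We choose $V_u \in \mathbb V(\mathcal T_u)$ to be the function that realizes the minimum. Since the definition (7.13) of $\operatorname{osc}_{\widehat{\mathcal T}}(V)$ involves the best $L_2$-approximation, we can argue as in Lemma 4.9 to deduce, for $\mathcal T \ge \mathcal T_\gamma$,
\[
\operatorname{osc}_{\widehat{\mathcal T}}(V_u) \preccurlyeq \operatorname{osc}_{\widehat{\mathcal T}_\gamma}(V_u) + \lambda_{\widehat{\mathcal T}_\gamma} \le \delta\, \|\widehat\nabla\widehat V_u\|_{L_2(\Omega)} + \delta,
\]
because $V_u \in \mathbb V(\mathcal T_u) \subset \mathbb V(\mathcal T_\gamma)$. Upon adding and subtracting $u$, we readily see that
\[
\operatorname{osc}_{\widehat{\mathcal T}}(V_u) \preccurlyeq \delta\, \|\widehat\nabla u\|_{L_2(\Omega)} + \delta\, \|\widehat\nabla(u - \widehat V_u)\|_{L_2(\Omega)} + \delta \preccurlyeq \delta.
\]
Since the definition (7.14) utilizes the $L_2$-projection and $\mathcal T \ge \mathcal T_f$, we infer that
\[
\operatorname{osc}_{\widehat{\mathcal T}}(f) \le \operatorname{osc}_{\widehat{\mathcal T}_f}(f) \preccurlyeq \delta.
\]
Collecting the preceding estimates and using the definition (7.1) of total error gives the desired estimate
\[
E_{\widehat{\mathcal T}}(V_u; u, f, \gamma) \preccurlyeq \delta,
\]
and finishes the proof.

8. Convergence rates. In this section we study the cardinality of the meshes generated by AFEM, which is dictated by the regularity of $u$, $f$ and $\gamma$.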
We now prove that AFEM achieves the asymptotic decay rate $s$ dictated by the class $\mathbb A_s$. We establish the link between the performance of AFEM and the best possible error by adapting a clever idea of Stevenson [29] for the Laplace operator, further extended by Cascón et al. [12] to general elliptic PDEs, in flat domains. We refer to the survey [27] for a thorough discussion and to [11]. The insight is the following:

    Any marking strategy that reduces the total error relative to its current value must contain a substantial portion of the error estimator, and so it can be related to Dörfler marking.    (8.1)

Exploiting next the minimality of Dörfler marking, we can compare meshes generated by AFEM with the best meshes within $\mathbb T$. The approach of [12, 27, 29] does not apply directly in the present context because of the consistency error due to surface interpolation. We account for this discrepancy below upon making the parameter $\omega$ of ADAPT SURFACE sufficiently small. Let
\[
\omega_3 := \frac{C_5}{\Lambda_0 \sqrt{3\Lambda_1 + 4\Lambda_2 + 2\Lambda_1\Lambda_3}} \tag{8.2}
\]
be a threshold for $\omega$ to be used next, and let $\theta_*$ be a threshold for the Dörfler parameter $\theta$,
\[
\theta_* := \frac{C_5}{\sqrt{2C_3 + C_1(3 + 2\Lambda_3)}}; \tag{8.3}
\]
since $C_5 = \sqrt{C_2/2}$ and $C_2 \le C_1$, we see that $\theta_* < 1$.

Lemma 8.1 (Dörfler marking). Let $\lambda_{\widehat{\mathcal T}_0}$ satisfy (4.5), and the parameters $\theta$ and $\omega$ satisfy
\[
0 < \theta < \theta_*, \qquad 0 < \omega \le \min\{\omega_1, \omega_3\}, \tag{8.4}
\]
where $\theta_*$ and $\omega_3$ are defined in (8.3) and (8.2), respectively, and $\omega_1$ in (5.4). Let $\mu := \frac12 \sqrt{1 - \theta^2/\theta_*^2}$ and let $(\Gamma, \mathcal T, U)$ be the approximate surface, mesh and discrete solution produced by an inner iterate of ADAPT PDE. If $(\Gamma_*, \mathcal T_*, U_*)$ is a surface-mesh-solution triple with $\mathcal T_* \ge \mathcal T$ such that the PDE error satisfies
\[
E_{\mathcal T_*}(U_*, f) \le \mu\, E_{\mathcal T}(U, f), \tag{8.5}
\]
then the refined set $\mathcal R := \mathcal R_{\mathcal T\to\mathcal T_*}$ satisfies the Dörfler property with parameter $\theta$, namely
\[
\eta_{\mathcal T}(U, F_\Gamma, \mathcal R) \ge \theta\, \eta_{\mathcal T}(U, F_\Gamma). \tag{8.6}
\]

Proof. We proceed as in [12, Lemma 5.9] using the notation $e(U) := \|\nabla_\gamma(u - U)\|_{L_2(\gamma)}$.
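Since $\omega \le \omega_1$, we combine the lower bound of (4.20) with (8.5) to write
\[
(1 - 2\mu^2)\, C_5^2\, \eta_{\mathcal T}(U, F_\Gamma)^2 \le (1 - 2\mu^2) \big( e(U)^2 + \operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 \big) \le e(U)^2 - e(U_*)^2 + \operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 - 2 \operatorname{osc}_{\widehat{\mathcal T}_*}(U_*, f)^2.
\]
We now estimate separately the error and oscillation terms. According to (6.1) and (4.16), we obtain
\[
e(U)^2 - e(U_*)^2 \le \tfrac32 \|\nabla_\gamma(U_* - U)\|^2_{L_2(\gamma)} + \Lambda_2 \lambda^2_{\widehat{\mathcal T}} \le \tfrac32 C_1\, \eta_{\mathcal T}(U, F_\Gamma, \mathcal R)^2 + \big( \tfrac32 \Lambda_1 + \Lambda_2 \big) \lambda^2_{\widehat{\mathcal T}}.
\]
For the oscillation terms we argue according to whether an element $T \in \mathcal T$ belongs to the set of refined elements $\mathcal R$ or not. We define the corresponding elements in $\widehat{\mathcal T}$ by $\widehat{\mathcal R} := \{ \widehat T \in \widehat{\mathcal T} : X^i_{\mathcal T}(\widehat T) \in \mathcal R \text{ for some } 1 \le i \le M \}$ and use the dominance bound (4.17) to arrive at
\[
\operatorname{osc}_{\widehat{\mathcal T}}(U, f, \widehat{\mathcal R})^2 \le C_3\, \eta_{\mathcal T}(U, F_\Gamma, \mathcal R)^2.
\]
On the other hand, using (4.26) for $\widehat{\mathcal T}_* \cap \widehat{\mathcal T}$ with $V = U$ and $W = U_*$ yields
\[
\operatorname{osc}_{\widehat{\mathcal T}}(U, f, \widehat{\mathcal T}_* \cap \widehat{\mathcal T})^2 - 2 \operatorname{osc}_{\widehat{\mathcal T}_*}(U_*, f, \widehat{\mathcal T}_* \cap \widehat{\mathcal T})^2 \le \Lambda_3 \|\nabla_\gamma(U_* - U)\|^2_{L_2(\gamma)} + \Lambda_2 \lambda^2_{\widehat{\mathcal T}}.
\]
By combining these two estimates with (4.16) we infer that
\[
\operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 - 2 \operatorname{osc}_{\widehat{\mathcal T}_*}(U_*, f)^2 \le (C_3 + C_1\Lambda_3)\, \eta_{\mathcal T}(U, F_\Gamma, \mathcal R)^2 + (\Lambda_1\Lambda_3 + \Lambda_2)\, \lambda^2_{\widehat{\mathcal T}}.
\]
Since $\mathcal T$ is produced within ADAPT PDE, we have $\eta_{\mathcal T}(U, F_\Gamma) \ge \varepsilon$ and $\lambda_{\widehat{\mathcal T}^+} \le \omega\varepsilon$, whence
\[
\lambda_{\widehat{\mathcal T}} \le \Lambda_0 \lambda_{\widehat{\mathcal T}^+} \le \Lambda_0\, \omega\, \varepsilon \le \Lambda_0\, \omega\, \eta_{\mathcal T}(U, F_\Gamma).
\]
Collecting these three estimates, we deduce
\[
(1 - 2\mu^2)\, C_5^2\, \eta_{\mathcal T}(U, F_\Gamma)^2 \le \tfrac12 \big( 2C_3 + C_1(3 + 2\Lambda_3) \big)\, \eta_{\mathcal T}(U, F_\Gamma, \mathcal R)^2 + \tfrac12 \big( 3\Lambda_1 + 4\Lambda_2 + 2\Lambda_1\Lambda_3 \big)\, \Lambda_0^2\, \omega^2\, \eta_{\mathcal T}(U, F_\Gamma)^2.
\]
Finally, using that $\omega \le \omega_3$ along with (8.2) and (8.3), the last term is bounded by $\tfrac12 C_5^2\, \eta_{\mathcal T}(U, F_\Gamma)^2$, and we infer that
\[
(1 - 4\mu^2)\, \eta_{\mathcal T}(U, F_\Gamma)^2 \le \theta_*^{-2}\, \eta_{\mathcal T}(U, F_\Gamma, \mathcal R)^2.
\]
The choice of $\mu$, for which $1 - 4\mu^2 = \theta^2/\theta_*^2$, implies the asserted estimate (8.6).

Lemma 8.2 (Cardinality of $\mathcal M$). Let $\lambda_{\widehat{\mathcal T}_0}$ satisfy (4.5) and the procedure MARK select a set $\mathcal M$ with minimal cardinality. Let the parameters $\theta$ and $\omega$ satisfy
\[
0 < \theta < \theta_*, \qquad 0 < \omega \le \min\{\omega_1, \omega_3\}, \tag{8.7}
\]
with $\theta_*$, $\omega_1$ and $\omega_3$ given in (8.3), (5.4), and (8.2), respectively. Let $u$ be the solution of (3.13), and let $(\Gamma, \mathcal T, U)$ be produced within ADAPT PDE. If $(u, f, \gamma) \in \mathbb A_s$, then
\[
\#\mathcal M \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s}\, E_{\mathcal T}(U, f)^{-\frac1s}.
\]

Proof.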
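We set
\[
\delta^2 = \hat\mu^2 E_{\mathcal T}(U, f)^2 = \hat\mu^2 \big( e(U)^2 + \operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 \big),
\]
for $0 < \hat\mu < \mu = \frac12 \sqrt{1 - \theta^2/\theta_*^2} < 1$ sufficiently small, to be determined later. Since $(u, f, \gamma) \in \mathbb A_s$, there exist a subdivision $\mathcal T_\delta \in \mathbb T$ and $V_\delta \in \mathbb V(\mathcal T_\delta)$ such that
\[
\#\mathcal T_\delta - \#\mathcal T_0 \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s}\, \delta^{-\frac1s}, \qquad e(V_\delta)^2 + \operatorname{osc}_{\widehat{\mathcal T}_\delta}(V_\delta, f)^2 + \lambda^2_{\widehat{\mathcal T}_\delta} \le \delta^2. \tag{8.8}
\]
Let $\mathcal T_* = \mathcal T \oplus \mathcal T_\delta$ be the overlay of $\mathcal T$ and $\mathcal T_\delta$, which satisfies [12, Lemma 3.7], [27],
\[
\#\mathcal T_* \le \#\mathcal T + \#\mathcal T_\delta - \#\mathcal T_0. \tag{8.9}
\]
Let $U_* \in \mathbb V(\mathcal T_*)$ be the corresponding Galerkin solution. We observe that $\mathcal T_* \ge \mathcal T_\delta, \mathcal T$, and invoke the upper bound of (6.1) in conjunction with (4.25) to write
\[
e(U_*)^2 + \operatorname{osc}_{\widehat{\mathcal T}_*}(U_*, f)^2 \le e(V_\delta)^2 + \Lambda_2 \lambda^2_{\widehat{\mathcal T}_\delta} + C_6 \operatorname{osc}_{\widehat{\mathcal T}_\delta}(V_\delta, f)^2 + \Lambda_3 \|\nabla_\gamma(U_* - V_\delta)\|^2_{L_2(\gamma)} + \Lambda_2 \lambda^2_{\widehat{\mathcal T}_\delta}.
\]
Applying (6.1) again gives
\[
\|\nabla_\gamma(U_* - V_\delta)\|^2_{L_2(\gamma)} \le e(V_\delta)^2 + \Lambda_2 \lambda^2_{\widehat{\mathcal T}_\delta},
\]
whence
\[
e(U_*)^2 + \operatorname{osc}_{\widehat{\mathcal T}_*}(U_*, f)^2 \le (1 + 2\Lambda_3)\, e(V_\delta)^2 + C_6 \operatorname{osc}_{\widehat{\mathcal T}_\delta}(V_\delta, f)^2 + 2\Lambda_2 (1 + \Lambda_3)\, \lambda^2_{\widehat{\mathcal T}_\delta}.
\]
We now choose
\[
\hat\mu = \frac{\mu}{\sqrt{\max\{C_6,\ 1 + 2\Lambda_3,\ 2\Lambda_2(1 + \Lambda_3)\}}}
\]
to end up with
\[
e(U_*)^2 + \operatorname{osc}_{\widehat{\mathcal T}_*}(U_*, f)^2 \le \max\{C_6,\ 1 + 2\Lambda_3,\ 2\Lambda_2(1 + \Lambda_3)\}\, \delta^2 = \mu^2 \big( e(U)^2 + \operatorname{osc}_{\widehat{\mathcal T}}(U, f)^2 \big).
\]
We thus deduce from Lemma 8.1 that the subset $\mathcal R = \mathcal R_{\mathcal T\to\mathcal T_*} \subset \mathcal T$ satisfies the Dörfler property (8.6). Since the set $\mathcal M \subset \mathcal T$ also satisfies this property, but with minimal cardinality, we infer from (8.8)–(8.9)
\[
\#\mathcal M \le \#\mathcal R \le \#\mathcal T_* - \#\mathcal T \le \#\mathcal T_\delta - \#\mathcal T_0 \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s}\, \delta^{-\frac1s}.
\]
The asserted estimate finally follows upon using the definition of $\delta$.

The quasi-optimal cardinality of AFEM is a direct consequence of Lemma 8.2 and Theorem 6.2. This is our third main result and we prove it next.

Theorem 8.3 (Convergence rate of AFEM). Let $(u, f, \gamma) \in \mathbb A_s$ for some $0 < s \le n/d$. Let $\varepsilon_0 \le (6\omega\Lambda_0 L^3)^{-1}$ be the initial tolerance, and let the parameters $\theta$, $\omega$ satisfy
\[
0 < \theta \le \theta_*, \qquad 0 < \omega \le \omega_* := \min\{\omega_1, \omega_2, \omega_3\}, \tag{8.10}
\]
where $\theta_*$, $\omega_1$, $\omega_2$, $\omega_3$ are given in (8.3), (5.4), (6.2), and (8.2), respectively. Let the initial triangulation $\mathcal T_0$ satisfy the condition (b) in [30, Section 4].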
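Let the procedure MARK select sets with minimal cardinality, and ADAPT SURFACE be $s$-optimal on the surface $\gamma$. Let $u$ be the solution of (3.13) and $\{\Gamma_k, \mathcal T_k, U_k\}_{k\ge 0}$ be a sequence of approximate surfaces, meshes and discrete solutions generated by AFEM. Then there exists a constant $C$, depending on the Lipschitz constant $L$ of $\gamma$, $\|f\|_{L_2(\gamma)}$, the refinement depth $b$, the initial triangulation $\mathcal T_0$, and the AFEM parameters $(\theta, \omega, \rho)$, such that
\[
e(U_k) + \operatorname{osc}_{\widehat{\mathcal T}_k}(U_k, f) + \omega^{-1} \lambda_{\widehat{\mathcal T}_k} \le C\, |u, f, \gamma|_{\mathbb A_s} \big( \#\mathcal T_k - \#\mathcal T_0 \big)^{-s}, \tag{8.11}
\]
where $|u, f, \gamma|_{\mathbb A_s}$ is defined in (7.2).

Proof. We start by noting that, since $\omega\varepsilon_0 \le \frac{1}{6\Lambda_0 L^3}$, the first output of the procedure ADAPT SURFACE fulfills $\lambda_{\Gamma_0^+}(\mathcal T_0^+) \le \frac{1}{6\Lambda_0 L^3}$, which is (4.5) and implies that $\mathbb T(\mathcal T_0^+)$ is shape regular. There are two instances where elements are added: inside ADAPT SURFACE and inside ADAPT PDE. Making the assumption (5.2) of $s$-optimality, the set of all the elements marked for refinement in the $k$-th call to ADAPT SURFACE satisfies
\[
\#\mathcal M_k \preccurlyeq C(\gamma)\, \omega^{-\frac1s}\, \varepsilon_k^{-\frac1s}.
\]
For ADAPT PDE, Lemma 8.2 (cardinality of $\mathcal M$) yields
\[
\#\mathcal M_k^j \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s} \Big( e(U_k^j) + \operatorname{osc}_{\widehat{\mathcal T}_k^j}(U_k^j, f) \Big)^{-\frac1s}, \qquad 0 \le j < J,
\]
where $\mathcal M_k^j$ denotes the subset of elements selected by the marking procedure at the $j$-th subiteration of the $k$-th step of ADAPT PDE. Since the inner iterates of ADAPT PDE satisfy Theorem 6.2 (conditional contraction property) and
\[
e(U_k^j) + \operatorname{osc}_{\widehat{\mathcal T}_k^j}(U_k^j, f) \approx e(U_k^j) + \eta_{\mathcal T_k^j}(U_k^j, F_k^j),
\]
we deduce that
\[
\Big( e(U_k^j) + \operatorname{osc}_{\widehat{\mathcal T}_k^j}(U_k^j, f) \Big)^{-\frac1s} \preccurlyeq \alpha^{\frac{J-j-1}{s}} \Big( e(U_k^{J-1}) + \eta_{\mathcal T_k^{J-1}}(U_k^{J-1}, F_k^{J-1}) \Big)^{-\frac1s} \le \alpha^{\frac{J-j-1}{s}}\, \varepsilon_k^{-\frac1s}.
\]
This implies
\[
\sum_{j=0}^{J-1} \#\mathcal M_k^j \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s}\, \varepsilon_k^{-\frac1s} \sum_{j=0}^{J-1} \alpha^{\frac{J-j-1}{s}} \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s}\, \varepsilon_k^{-\frac1s}.
\]
To do a full counting argument, we resort to the crucial estimate (5.6), which combined with the estimates above gives
\[
\#\mathcal T_k - \#\mathcal T_0 \le C_7 \sum_{i=0}^{k-1} \Big( \#\mathcal M_i + \sum_{j=0}^{J-1} \#\mathcal M_i^j \Big) \preccurlyeq \Big( \omega^{-\frac1s} C(\gamma) + |u, f, \gamma|_{\mathbb A_s}^{\frac1s} \Big) \sum_{i=0}^{k-1} \varepsilon_i^{-\frac1s}.
\]
We observe that $C(\gamma) \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s}$.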
We now use the relation $\varepsilon_{k+1} = \rho\,\varepsilon_k$ of step 3 of AFEM, and since $\rho < 1$, we obtain $\sum_{i=0}^{k-1} \varepsilon_i^{-\frac1s} = \varepsilon_{k-1}^{-\frac1s} \sum_{i=0}^{k-1} \rho^{\frac{i}{s}} \preccurlyeq \varepsilon_k^{-\frac1s}$, whence
\[
\#\mathcal T_k - \#\mathcal T_0 \preccurlyeq |u, f, \gamma|_{\mathbb A_s}^{\frac1s}\, \varepsilon_k^{-\frac1s}. \tag{8.12}
\]
Moreover, the stopping criteria (5.1) and (5.3) guarantee that
\[
e(U_k) + \operatorname{osc}_{\widehat{\mathcal T}_k}(U_k, f) + \omega^{-1} \lambda_{\widehat{\mathcal T}_k} \preccurlyeq \varepsilon_k, \tag{8.13}
\]
which implies the desired estimate (8.11).

The precise constant on the right-hand side of (8.11) is $\omega^{-1} C(\gamma)^s + |u, f, \gamma|_{\mathbb A_s}$. This and the condition $\omega \le \omega_*$ in (8.10) suggest that $\omega$ should not be too small to optimize (8.11). An optimal choice of $\omega$, which unfortunately is not computable, appears to be
\[
\omega = \min\big\{ \omega_*,\ |u, f, \gamma|_{\mathbb A_s}^{-1}\, C(\gamma)^s \big\}.
\]

We also provide an estimate on the workload in the following corollary. We assume that the adaptive loop (1.6) on a subdivision $\mathcal T \in \mathbb T$ requires $O(\#\mathcal T)$ computations and in particular that (i) the linear algebra solver scales like $\#\mathcal T$ and (ii) an approximate sort requiring $O(\#\mathcal T)$ arithmetic operations is used to select the local estimators $\eta_{\mathcal T}(U, F_\Gamma, T)$ for all $T \in \mathcal T$ (see e.g. [8, Remark 5.3]).

Corollary 8.4 (Workload estimate). In addition to the assumptions of Theorem 8.3, suppose that each inner loop of ADAPT PDE on a subdivision $\mathcal T \in \mathbb T$ requires $O(\#\mathcal T)$ arithmetic operations. If $\varepsilon \le \varepsilon_0$, then the number of arithmetic operations $W$ for AFEM to construct a triple $(\Gamma, \mathcal T, U)$ such that
\[
e(U) + \operatorname{osc}_{\widehat{\mathcal T}}(U, f) + \omega^{-1} \lambda_{\widehat{\mathcal T}} \le \varepsilon \tag{8.14}
\]
satisfies $W \lesssim \varepsilon^{-1/s}$.

Proof. Let $C \ge 1$ be the hidden constant in (8.13) and set $L$ to be the integer such that $\varepsilon_{L+1} := \rho^{L+1} \varepsilon_0 \le \varepsilon/C \le \varepsilon_L$. Moreover, we define $W_j^+$ to be the number of arithmetic operations performed within the call $[\mathcal T_j^+] = \textsf{ADAPT SURFACE}(\mathcal T_j, \Gamma_j, \omega\varepsilon_j)$ and $W_{j+1}$ those within the call $[U_{j+1}, \mathcal T_{j+1}] = \textsf{ADAPT PDE}(\mathcal T_j^+, \varepsilon_j)$. With this notation, the total number of operations to achieve (8.14) satisfies
\[
W \lesssim \sum_{j=0}^{L} \big( W_j^+ + W_{j+1} \big).
\]
We now bound each term separately, starting with $W_j^+$.
The computation of each local geometric estimator requires $O(1)$ arithmetic operations and is performed $\#\mathcal T_j + \#\mathcal M_j \le \#\mathcal T_j^+ \le \#\mathcal T_{j+1}$ times. Since ADAPT SURFACE does not involve sorting the local geometric estimators, we readily deduce that
\[
W_j^+ \lesssim \#\mathcal T_{j+1}.
\]
Regarding $W_{j+1}$, we recall that Theorem 6.2 guarantees that the number of inner iterations $J$ within ADAPT PDE is uniformly bounded. This, together with the complexity assumption on the inner loops of ADAPT PDE, yields
\[
W_{j+1} \lesssim \#\mathcal T_{j+1}.
\]
Now combining the above two estimates and invoking (8.12), we deduce that
\[
W_j^+ + W_{j+1} \lesssim \varepsilon_{j+1}^{-1/s}.
\]
Going back to the total number of operations $W$, we find
\[
W \lesssim \rho^{-1/s}\, \varepsilon_L^{-1/s} \sum_{j=0}^{L} \rho^{j/s} \lesssim \varepsilon_L^{-1/s},
\]
where we used the relations $\varepsilon_{j+1} = \rho^{-(L-j-1)} \varepsilon_L$ for $j = 0, \dots, L$. The desired estimate follows from the definition of $\varepsilon_L$.

9. Products and Compositions in Besov Spaces. We recall the definition of Besov spaces via the modulus of smoothness; a thorough discussion can be found in [22]. Let $\Omega$ be a Lipschitz domain in $\mathbb R^d$, $0 < p \le \infty$ and $u \in L_p(\Omega)$. For $h \in \mathbb R^d$ and $k \in \mathbb N$ we define the differences as follows: $\Delta_h u : \Omega \to \mathbb R$, with
\[
\Delta_h u(x) = \begin{cases} u(x + h) - u(x), & \text{if } x \in \Omega_h = \{x \in \Omega : x + th \in \Omega,\ \forall t \in [0, 1]\}, \\ 0, & \text{otherwise,} \end{cases}
\]
and $\Delta_h^{k+1} u : \Omega \to \mathbb R$ as $\Delta_h^{k+1} u(x) = \Delta_h \Delta_h^k u(x)$ for $k \in \mathbb N$ and $x \in \Omega_{(k+1)h}$, and $0$ otherwise. Therefore,
\[
\Delta_h^k u(x) = \begin{cases} \sum_{j=0}^{k} (-1)^{k+j} \binom{k}{j} u(x + jh), & \text{if } x \in \Omega_{kh}, \\ 0, & \text{otherwise.} \end{cases} \tag{9.1}
\]
Using these difference operators we define the modulus of smoothness of order $k$ in $L_p(\Omega)$ as
\[
\omega_k(u, t)_p = \sup_{|h| \le t} \|\Delta_h^k u\|_{L_p(\Omega)}, \qquad t > 0.
\]
Given $s > 0$ and $0 < q, p \le \infty$, the Besov space $B_q^s(L_p(\Omega))$ is the set of all functions $f \in L_p(\Omega)$ such that the semi-(quasi)norm $|f|_{B_q^s(L_p(\Omega))}$ is finite, with
\[
|f|_{B_q^s(L_p(\Omega))} := \begin{cases} \Big( \int_0^\infty \big[ t^{-s} \omega_k(f, t)_p \big]^q \frac{dt}{t} \Big)^{\frac1q}, & \text{if } 0 < q < \infty, \\ \sup_{t > 0} t^{-s} \omega_k(f, t)_p, & \text{if } q = \infty, \end{cases} \tag{9.2}
\]
where $k \in \mathbb N$ is such that $s < k - 1 + \max\{1, \frac1p\}$ (e.g. $k = \lfloor s \rfloor + 1$). The (quasi)norm of $B_q^s(L_p(\Omega))$ is defined by
\[
\|f\|_{B_q^s(L_p(\Omega))} = \|f\|_{L_p(\Omega)} + |f|_{B_q^s(L_p(\Omega))}. \tag{9.3}
\]
Two important properties that we will exploit in what follows are the embedding of $B_q^s(L_p(\Omega))$ in $L_\infty(\Omega)$ whenever $sp > d$, and the embedding of $B_{q_1}^{s_1}(L_{p_1}(\Omega))$ in $B_{q_2}^{s_2}(L_{p_2}(\Omega))$ whenever $s_1 - d/p_1 > s_2 - d/p_2$ [22]. The following result, essential for our discussion, is analogous to Theorem 4.39 in [1] for Besov spaces and is scale-invariant.

Lemma 9.1 (Scale-invariant Besov semi-norm of the product of two functions). If $s > 0$ and $0 < p, q \le \infty$ satisfy $s > d/p$ (i.e. $B_q^s(L_p(\Omega)) \subset L_\infty(\Omega)$) and $k = \lfloor s \rfloor + 1$, then
\[
|uv|_{B_q^s(L_p(\Omega))} \preccurlyeq \sum_{j=0}^{k} |u|_{B_q^{sj/k}(L_{pk/j}(\Omega))}\, |v|_{B_q^{s(k-j)/k}(L_{pk/(k-j)}(\Omega))}, \tag{9.4}
\]
with the convention that $B_q^0(L_{pk/0}(\Omega)) = L_\infty(\Omega)$ and $|\cdot|_{B_q^0(L_{pk/0}(\Omega))} = \|\cdot\|_{L_\infty(\Omega)}$.

Proof. Recall that $\omega_k(uv, t)_p = \sup_{|h| \le t} \|\Delta_h^k(uv)\|_{L_p(\Omega)}$ and that the $k$-th differences obey a rule similar to the Leibniz rule. This translates into the expression
\[
\|\Delta_h^k(uv)\|_{L_p(\Omega)} \preccurlyeq \sum_{j=0}^{k} \|\Delta_h^j u\, \Delta_h^{k-j} v\|_{L_p(\Omega)} \le \sum_{j=0}^{k} \|\Delta_h^j u\|_{L_{pk/j}(\Omega)}\, \|\Delta_h^{k-j} v\|_{L_{pk/(k-j)}(\Omega)},
\]
where we have used the Hölder inequality in the last step and the conventions $\Delta_h^0 = \mathrm{Id}$ and $L_{pk/0}(\Omega) = L_\infty(\Omega)$. Therefore,
\[
\omega_k(uv, t)_p \preccurlyeq \sum_{j=0}^{k} \omega_j(u, t)_{pk/j}\, \omega_{k-j}(v, t)_{pk/(k-j)},
\]
where we use the convention $\omega_0(v, t)_{pk/0} = \|v\|_{L_\infty(\Omega)}$. We now consider the cases $q = \infty$ and $q < \infty$ separately. Observe that
\[
|uv|_{B_\infty^s(L_p(\Omega))} = \sup_{t > 0} t^{-s} \omega_k(uv, t)_p \preccurlyeq \sum_{j=0}^{k} \sup_{t > 0} t^{-js/k} \omega_j(u, t)_{pk/j}\ \sup_{t > 0} t^{-(k-j)s/k} \omega_{k-j}(v, t)_{pk/(k-j)}.
\]
Utilizing that $|u|_{B_\infty^{js/k}(L_{pk/j}(\Omega))} \simeq \sup_{t > 0} t^{-js/k} \omega_j(u, t)_{pk/j}$, because $k > s$ so that $js/k < j \le j - 1 + \max\{1, j/(pk)\}$ for $0 \le j \le k$, we have
\[
|uv|_{B_\infty^s(L_p(\Omega))} \preccurlyeq \sum_{j=0}^{k} |u|_{B_\infty^{js/k}(L_{pk/j}(\Omega))}\, |v|_{B_\infty^{(k-j)s/k}(L_{pk/(k-j)}(\Omega))}.
\]
If $0 < q < \infty$, we define $q^* = \max\{1, q\}$ and notice that by the triangle and Hölder inequalities
\begin{align*}
|uv|^{q/q^*}_{B_q^s(L_p(\Omega))} &= \Big( \int_0^\infty t^{-sq}\, \omega_k(uv, t)_p^q\, \frac{dt}{t} \Big)^{1/q^*} \\
&\preccurlyeq \sum_{j=0}^{k} \Big( \int_0^\infty t^{-sq}\, \omega_j(u, t)^q_{pk/j}\, \omega_{k-j}(v, t)^q_{pk/(k-j)}\, \frac{dt}{t} \Big)^{1/q^*} \\
&\le \sum_{j=0}^{k} \Big( \int_0^\infty t^{-sq}\, \omega_j(u, t)^{qk/j}_{pk/j}\, \frac{dt}{t} \Big)^{\frac{j}{kq^*}} \Big( \int_0^\infty t^{-sq}\, \omega_{k-j}(v, t)^{qk/(k-j)}_{pk/(k-j)}\, \frac{dt}{t} \Big)^{\frac{k-j}{kq^*}} \\
&\le \sum_{j=0}^{k} |u|^{q/q^*}_{B_{qk/j}^{sj/k}(L_{pk/j}(\Omega))}\, |v|^{q/q^*}_{B_{qk/(k-j)}^{s(k-j)/k}(L_{pk/(k-j)}(\Omega))}.
\end{align*}
Upon raising both sides to the power $q^*/q \ge 1$, this inequality implies (9.4).

We make the important observation that (9.4) is scale-invariant: simply scale $\Omega$ by a constant $h$ and realize that both sides of (9.4) scale by the same factor $h^{s - d/p}$. This implies that (9.4) can be used at the element level. Upon iterating (9.4) we obtain the following simple, but technical, generalization of (9.4).

Corollary 9.2 (Scale-invariant Besov semi-norm of products of functions). If $s > 0$ and $0 < p, q \le \infty$ satisfy $s > d/p$ (i.e. $B_q^s(L_p(\Omega)) \subset L_\infty(\Omega)$), $k = \lfloor s \rfloor + 1$, and $0 \le \ell_i = m_i/k^{m-1} \le 1$ with $m_i \in \mathbb N_0$, then
\[
\Big| \prod_{i=1}^{m} u_i \Big|_{B_q^s(L_p(\Omega))} \preccurlyeq \sum_{\sum_{i=1}^{m} \ell_i = 1}\ \prod_{i=1}^{m} |u_i|_{B_q^{s\ell_i}(L_{p/\ell_i}(\Omega))}, \tag{9.5}
\]
where the sum ranges over all choices of $m_i \in \mathbb N_0$ such that $\sum_{i=1}^{m} m_i = k^{m-1}$.

Using embedding theorems for Besov spaces, (9.5) can be further simplified by replacing the semi-norms by norms. However, this is at the expense of having a constant depending on $|\Omega|$.

Corollary 9.3 (Besov norm of products of functions). If $s > 0$ and $0 < p, q \le \infty$ satisfy $s > d/p$ (i.e. $B_q^s(L_p(\Omega)) \subset L_\infty(\Omega)$), then there exists a constant $C = C(\Omega)$ such that
\[
\Big| \prod_{i=1}^{k} u_i \Big|_{B_q^s(L_p(\Omega))} \le C(\Omega) \prod_{i=1}^{k} \|u_i\|_{B_q^s(L_p(\Omega))}.
\]

Proof. Since $s\ell_i - d/(p/\ell_i) = \ell_i (s - d/p) < s - d/p$ for $0 < \ell_i < 1$, we deduce $B_q^s(L_p(\Omega)) \subset B_{q'}^{s\ell_i}(L_{p/\ell_i}(\Omega))$ for any $0 < q, q' \le \infty$. This, together with $B_q^s(L_p(\Omega)) \subset L_\infty(\Omega)$, enables us to replace the semi-norms in (9.5) by the full Besov norms, absorbing the scaling into the constant $C(\Omega)$.
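The difference calculus behind Lemma 9.1 can be checked numerically. The following Python sketch is an illustration only and not part of the analysis: it compares the closed form (9.1) with the recursive definition of $\Delta_h^k$, and tests the standard Leibniz-type identity $\Delta_h^k(uv)(x) = \sum_{j=0}^{k} \binom{k}{j} \Delta_h^j u(x)\, \Delta_h^{k-j} v(x + jh)$, which is one concrete form of the "rule similar to the Leibniz rule" invoked in the proof. It assumes $d = 1$ and functions defined on the whole real line, so the restriction $x \in \Omega_{kh}$ is ignored; the function names are hypothetical.

```python
from math import comb, sin


def kth_difference(u, x, h, k):
    """k-th forward difference of u at x via the closed form (9.1).

    Assumes u is defined on all of R, so the domain restriction is dropped.
    """
    return sum((-1) ** (k + j) * comb(k, j) * u(x + j * h) for j in range(k + 1))


def kth_difference_recursive(u, x, h, k):
    """Same quantity via the recursion Delta^{k} u(x) = Delta^{k-1} u(x+h) - Delta^{k-1} u(x)."""
    if k == 0:
        return u(x)
    return (kth_difference_recursive(u, x + h, h, k - 1)
            - kth_difference_recursive(u, x, h, k - 1))


def leibniz_rhs(u, v, x, h, k):
    """Right-hand side of the discrete Leibniz identity for Delta_h^k (u v)(x)."""
    return sum(
        comb(k, j) * kth_difference(u, x, h, j) * kth_difference(v, x + j * h, h, k - j)
        for j in range(k + 1)
    )


if __name__ == "__main__":
    u = sin

    def v(x):
        return x * x + 1.0

    x0, h = 0.3, 0.1
    for k in range(5):
        closed = kth_difference(u, x0, h, k)
        recursive = kth_difference_recursive(u, x0, h, k)
        lhs = kth_difference(lambda y: u(y) * v(y), x0, h, k)
        rhs = leibniz_rhs(u, v, x0, h, k)
        print(k, abs(closed - recursive), abs(lhs - rhs))
```

Both printed discrepancies should be at the level of floating-point roundoff. Taking $L_p$ norms of the identity and applying the Hölder inequality termwise gives the product bound for $\|\Delta_h^k(uv)\|_{L_p(\Omega)}$ used in the proof of Lemma 9.1, up to the shift in the argument of $v$.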
The following result is a scale-invariant generalization of a result in [10] related to the Besov regularity of the composition of functions.

Lemma 9.4 (Scale-invariant Besov semi-norm of composition). Let $u : \Omega \to \mathbb R$ be of class $B_q^s(L_p(\Omega))$ with $0 < p, q \le \infty$ and $s > d/p$, and let $R$ be a closed interval in $\mathbb R$ that contains the range of $u$. If $f \in C^k(R)$, with $k = \lfloor s \rfloor + 1$, then the composite function $f \circ u \in B_q^s(L_p(\Omega))$ and there exists a constant $C(f)$ depending on $\max_{1 \le j \le k} \|f^{(j)}\|_{L_\infty(R)}$ such that
\[
|f \circ u|_{B_q^s(L_p(\Omega))} \preccurlyeq C(f) \sum_{\ell=1}^{k} \sum_{i=1}^{\ell} \|u\|^{\ell-i}_{L_\infty(\Omega)} \sum_{\sum_{j=1}^{i} \ell_j = 1}\ \prod_{j=1}^{i} |u|_{B_q^{s\ell_j}(L_{p/\ell_j}(\Omega))}, \tag{9.6}
\]
where the inner sum ranges over all choices of $0 \le \ell_j = m_j/k^i \le 1$ with $m_j \in \mathbb N_0$ such that $\sum_{j=1}^{i} m_j = k^i$.

Proof. Recall formula (9.1) and notice that $\Delta_h^k 1 = 0$. Then for $x \in \Omega_{kh}$,
\[
\Delta_h^k (f \circ u)(x) = \sum_{j=1}^{k} (-1)^{k+j} \binom{k}{j} \Big( f(u(x + jh)) - f(u(x)) \Big).
\]
Using Taylor's formula,
\[
f(u(x + jh)) - f(u(x)) = \sum_{\ell=1}^{k-1} \frac{f^{(\ell)}(u(x))}{\ell!} \big( \Delta^1_{jh} u(x) \big)^\ell + \int_0^1 \frac{f^{(k)}\big( u(x) + t \Delta^1_{jh} u(x) \big)}{(k-1)!} (1 - t)^{k-1}\, dt\ \big( \Delta^1_{jh} u(x) \big)^k.
\]
Therefore, $\|\Delta_h^k (f \circ u)\|_{L_p(\Omega)} \le \big\| \sum_{\ell=1}^{k} I_\ell \big\|_{L_p(\Omega)}$, where
\[
I_\ell(x) := \sum_{j=1}^{k} (-1)^{k+j} \binom{k}{j} \frac{f^{(\ell)}(u(x))}{\ell!} \big( \Delta^1_{jh} u(x) \big)^\ell, \qquad 1 \le \ell < k,
\]
\[
I_k(x) := \sum_{j=1}^{k} (-1)^{k+j} \binom{k}{j} \int_0^1 \frac{f^{(k)}\big( u(x) + t \Delta^1_{jh} u(x) \big)}{(k-1)!} (1 - t)^{k-1}\, dt\ \big( \Delta^1_{jh} u(x) \big)^k.
\]
In order to bound the terms corresponding to $\ell < k$ we use Newton's binomial formula:
\begin{align*}
I_\ell &= \sum_{j=1}^{k} \binom{k}{j} (-1)^{k+j} \frac{f^{(\ell)}(u(x))}{\ell!} \sum_{i=0}^{\ell} \binom{\ell}{i} (-1)^{\ell-i}\, u(x + jh)^i\, u(x)^{\ell-i} \\
&= \sum_{i=0}^{\ell} \binom{\ell}{i} (-1)^{\ell-i} \frac{f^{(\ell)}(u(x))}{\ell!} \sum_{j=1}^{k} \binom{k}{j} (-1)^{k+j}\, u(x + jh)^i\, u(x)^{\ell-i} \\
&= \sum_{i=1}^{\ell} \binom{\ell}{i} (-1)^{\ell-i} \frac{f^{(\ell)}(u(x))}{\ell!}\, \Delta_h^k (u^i)(x)\, u(x)^{\ell-i},
\end{align*}
because $\Delta_h^k u^0 = 0$. Consequently,
\[
\Big\| \sum_{\ell=1}^{k-1} I_\ell \Big\|_{L_p(\Omega)} \le C(f) \sum_{\ell=1}^{k-1} \sum_{i=1}^{\ell} \|\Delta_h^k u^i\|_{L_p(\Omega)}\, \|u\|^{\ell-i}_{L_\infty(\Omega)}.
\]
A similar formula is valid for $I_k$, whence
\[
\|\Delta_h^k (f \circ u)\|_{L_p(\Omega)} \le C(f) \sum_{\ell=1}^{k} \sum_{i=1}^{\ell} \|\Delta_h^k u^i\|_{L_p(\Omega)}\, \|u\|^{\ell-i}_{L_\infty(\Omega)}.
\]
The modulus of smoothness $\omega_k(f \circ u, t)_p$ in turn satisfies
\[
\omega_k(f \circ u, t)_p = \sup_{|h| \le t} \|\Delta_h^k (f \circ u)\|_{L_p(\Omega)} \le C(f) \sum_{\ell=1}^{k} \sum_{i=1}^{\ell} \omega_k(u^i, t)_p\, \|u\|^{\ell-i}_{L_\infty(\Omega)}.
\]
Hence, the Besov seminorm satisfies
\begin{align*}
|f \circ u|_{B_q^s(L_p(\Omega))} &:= \Big( \int_0^\infty t^{-sq}\, \omega_k(f \circ u, t)_p^q\, \frac{dt}{t} \Big)^{1/q} \\
&\preccurlyeq C(f) \sum_{\ell=1}^{k} \sum_{i=1}^{\ell} \Big( \int_0^\infty t^{-sq}\, \omega_k(u^i, t)_p^q\, \frac{dt}{t} \Big)^{1/q} \|u\|^{\ell-i}_{L_\infty(\Omega)} \\
&\preccurlyeq C(f) \sum_{\ell=1}^{k} \sum_{i=1}^{\ell} |u^i|_{B_q^s(L_p(\Omega))}\, \|u\|^{\ell-i}_{L_\infty(\Omega)}.
\end{align*}
Employing Corollary 9.2, we eventually infer the desired bound (9.6).

The inequality (9.6) is scale-invariant and, as such, can be used at the element level. However, it can be further simplified at the expense of having a constant depending on $|\Omega|$.

Corollary 9.5 (Besov norm of composition). Under the assumptions of Lemma 9.4, there exists a constant $C(f, \Omega)$ such that
\[
|f \circ u|_{B_q^s(L_p(\Omega))} \preccurlyeq C(f, \Omega) \sum_{\ell=1}^{k} \|u\|^\ell_{B_q^s(L_p(\Omega))} \le C(f, \Omega)\, k \max\Big\{ \|u\|_{B_q^s(L_p(\Omega))},\ \|u\|^k_{B_q^s(L_p(\Omega))} \Big\}. \tag{9.7}
\]

Proof. It suffices to use the embeddings $B_q^s(L_p(\Omega)) \subset B_q^{s\ell_j}(L_{p/\ell_j}(\Omega))$ for $0 < \ell_j < 1$ as well as $B_q^{s\ell_j}(L_{p/\ell_j}(\Omega)) \subset L_\infty(\Omega)$, which are valid because $s > d/p$, to convert (9.6) into (9.7).

REFERENCES

[1] R. A. Adams and J. J. F. Fournier. Sobolev spaces, volume 140 of Pure and Applied Mathematics (Amsterdam). Elsevier/Academic Press, Amsterdam, second edition, 2003.
[2] M. Ainsworth and J. T. Oden. A posteriori error estimation in finite element analysis. Pure and Applied Mathematics (New York). Wiley-Interscience [John Wiley & Sons], New York, 2000.
[3] P. Binev, W. Dahmen, and R. DeVore. Adaptive finite element methods with convergence rates. Numer. Math., 97(2):219–268, 2004.
[4] P. Binev, W. Dahmen, R. DeVore, and P. Petrushev. Approximation classes for adaptive methods. Serdica Math. J., 28(4):391–416, 2002. Dedicated to the memory of Vassil Popov on the occasion of his 60th birthday.
[5] M. Sh. Birman and M. Z. Solomyak. Piecewise-polynomial approximations of functions of the classes $W_p^\alpha$. Mat. Sb. (N.S.), 73(115)(3):331–355, 1967.
[6] A. Bonito, J. M. Cascón, P. Morin, and R. H. Nochetto. AFEM for geometric PDE: the Laplace-Beltrami operator. In Analysis and numerics of partial differential equations, volume 4 of Springer INdAM Ser., pages 257–306.
Springer, Milan, 2013.
[7] A. Bonito, R. A. DeVore, and R. H. Nochetto. Adaptive finite element methods for elliptic problems with discontinuous coefficients. SIAM J. Numer. Anal., 51(6):3106–3134, 2013.
[8] A. Bonito and R. H. Nochetto. Quasi-optimal convergence rate of an adaptive discontinuous Galerkin method. SIAM J. Numer. Anal., 48(2):734–771, 2010.
[9] A. Bonito and J. E. Pasciak. Convergence analysis of variational and non-variational multigrid algorithm for the Laplace-Beltrami operator. Math. Comp., 81:1263–1288, 2012.
[10] G. Bourdaud and W. Sickel. Composition operators on function spaces with fractional order of smoothness. In Harmonic analysis and nonlinear partial differential equations, RIMS Kôkyûroku Bessatsu, B26, pages 93–132. Res. Inst. Math. Sci. (RIMS), Kyoto, 2011.
[11] C. Carstensen, M. Feischl, M. Page, and D. Praetorius. Axioms of adaptivity. Comput. Math. Appl., 67(6):1195–1253, 2014.
[12] J. M. Cascón, C. Kreuzer, R. H. Nochetto, and K. G. Siebert. Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal., 46(5):2524–2550, 2008.
[13] J. M. Cascón and R. H. Nochetto. Quasioptimal cardinality of AFEM driven by nonresidual estimators. IMA J. Numer. Anal., 32(1):1–29, 2012.
[14] P. G. Ciarlet. The finite element method for elliptic problems. North-Holland Publishing Co., Amsterdam, 1978. Studies in Mathematics and its Applications, Vol. 4.
[15] A. Cohen, W. Dahmen, I. Daubechies, and R. DeVore. Tree approximation and optimal encoding. Appl. Comput. Harmon. Anal., 11(2):192–226, 2001.
[16] A. Demlow. Higher-order finite element methods and pointwise error estimates for elliptic problems on surfaces. SIAM J. Numer. Anal., 47(2):805–827, 2009.
[17] A. Demlow and G. Dziuk. An adaptive finite element method for the Laplace-Beltrami operator on implicitly defined surfaces. SIAM J. Numer. Anal., 45(1):421–442 (electronic), 2007.
[18] W. Dörfler. A convergent adaptive algorithm for Poisson's equation. SIAM J.
Numer. Anal., 33(3):1106–1124, 1996.
[19] G. Dziuk. Finite elements for the Beltrami operator on arbitrary surfaces. In Partial differential equations and calculus of variations, volume 1357 of Lecture Notes in Math., pages 142–155. Springer, Berlin, 1988.
[20] T. Gantumur. Convergence rates of adaptive methods, Besov spaces, and multilevel approximation. arXiv:1408.3889, 2015.
[21] E. M. Garau and P. Morin. Convergence and quasi-optimality of adaptive FEM for Steklov eigenvalue problems. IMA J. Numer. Anal., 31(3):914–946, 2011.
[22] F. D. Gaspoz and P. Morin. Approximation classes for adaptive higher order finite element approximation. Math. Comp., 83(289):2127–2160, 2014.
[23] R. Kornhuber and H. Yserentant. Multigrid methods for discrete elliptic problems on triangular surfaces. Comput. Vis. Sci., 11(4-6):251–257, 2008.
[24] K. Mekchay, P. Morin, and R. H. Nochetto. AFEM for the Laplace-Beltrami operator on graphs: design and conditional contraction property. Math. Comp., 80(274):625–648, 2011.
[25] P. Morin, R. H. Nochetto, and K. G. Siebert. Convergence of adaptive finite element methods. SIAM Rev., 44(4):631–658 (electronic) (2003), 2002. Revised reprint of "Data oscillation and convergence of adaptive FEM" [SIAM J. Numer. Anal. 38 (2000), no. 2, 466–488 (electronic); MR1770058 (2001g:65157)].
[26] P. Morin, R. H. Nochetto, and K. G. Siebert. Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal., 38(2):466–488 (electronic), 2000.
[27] R. H. Nochetto, K. G. Siebert, and A. Veeser. Theory of adaptive finite element methods: an introduction. In Multiscale, nonlinear and adaptive approximation, pages 409–542. Springer, Berlin, 2009.
[28] L. R. Scott and S. Zhang. Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comp., 54(190):483–493, 1990.
[29] R. Stevenson. Optimality of a standard adaptive finite element method. Found. Comput. Math., 7(2):245–269, 2007.
[30] R. Stevenson.
The completion of locally refined simplicial partitions created by bisection. Math. Comp., 77(261):227–241 (electronic), 2008.
[31] H. Triebel. Function spaces in Lipschitz domains and on Lipschitz manifolds. Characteristic functions as pointwise multipliers. Rev. Mat. Complut., 15(2):475–524, 2002.
[32] A. Veeser. Approximating gradients with continuous piecewise polynomial functions. Found. Comput. Math., pages 1–28, 2015.
[33] R. Verfürth. A Review of A Posteriori Error Estimation and Adaptive Mesh-Refinement Technique. Wiley-Teubner, Chichester, 1996.