Chapter 3

A posteriori error estimation and adaptivity

3.1 Introduction

3.1.1 Review: The main questions
In this chapter we return to finite element approximation of solutions to Poisson's problem $-\Delta u = f$ in $\Omega$, where $\Omega$ is a polygonal or polyhedral domain. To set the stage, we briefly recall some computational observations we made in our computational prologue in Chapter 1. Let $u_h \in V_h$ be a Lagrange finite element approximation to $u$ on a shape-regular grid. Then:
1. Assume that $u \in H^s(\Omega)$. If we use a finite element space of degree $r$ and quasi-uniform grids of size $h$, then $\|u - u_h\|_{H^1(\Omega)} \le C h^{\gamma} \|u\|_{H^{\gamma+1}(\Omega)}$, where $\gamma = \min(r, s-1)$. The computational rates we observed in Chapter 1 align precisely with those that we predict from the $H^s$ regularity results obtained for polygonal and polyhedral domains in Chapter 2.
2. When solving the problem $-\Delta u = 1$ in two space dimensions, we were always able to recover optimal convergence rates $\mathrm{DOF}^{-r/2}$ by employing adaptivity.
3. When solving the problem $-\Delta u = 1$ in three space dimensions, adaptivity generally led to improvement in convergence rates, but we were not always able to recover optimal convergence rates $\mathrm{DOF}^{-r/3}$.
In summary, the results of Chapter 2 now allow us to make rigorously grounded theoretical predictions that align with experimental results when using quasi-uniform mesh refinement. However,
even though we now understand the singularities produced on polyhedral domains, we still haven’t
rigorously explained how adaptive mesh refinement works to improve convergence rates in many
cases, and why its ability to do so is sometimes limited. Note that there are two separate questions
we may ask in this regard:
1. Given a function $u$ with a given singularity structure, is it in theory possible to construct a (possibly locally refined) mesh such that $\|u - u_h\|_{H^1(\Omega)} \lesssim \mathrm{DOF}^{-r/d}$?
2. Can we construct an algorithm such that, given the right-hand side $f$ and the boundary value problem $-\Delta u = f$, $u = 0$ on $\partial\Omega$, the algorithm returns a sequence of meshes that recovers the rate $\|u - u_h\|_{H^1(\Omega)} \lesssim \mathrm{DOF}^{-r/d}$? Or more generally, does the algorithm return a sequence of meshes which is in some sense the best possible sequence for approximating $u$?
The first question above is essentially a question in a priori error estimation, and there have been
many papers over the years that have addressed it. In the interests of time we will mainly leave it
alone for now, but may return to it later. The short answer is that using weighted Sobolev spaces, one can prove that it is always possible to construct meshes that recover the optimal rate of convergence $\mathrm{DOF}^{-r/d}$, provided that one allows for anisotropic mesh refinement in three space dimensions. The latter point is critical, as anisotropic meshing can be much trickier than shape-regular meshing.
Our main focus for now will be on the second question: Can we construct an algorithm that will
automatically produce a sequence of meshes that optimally reduces the finite element error?
3.1.2 Vocabulary and a little history
We now recall some basic error estimation concepts and vocabulary. First, a (quasi-)optimality result states that a given numerical approximation is in some sense the best possible. Céa's Lemma is an example of an optimality result:
$$\|u - u_h\|_{H_0^1(\Omega)} = \inf_{\chi \in V_h} \|u - \chi\|_{H_0^1(\Omega)}.$$
An estimate is quasi-optimal if it expresses optimality up to a nonessential constant, e.g.,
$$\|u - u_h\|_{H^1(\Omega)} \le C \inf_{\chi \in V_h} \|u - \chi\|_{H^1(\Omega)},$$
where $C$ does not depend on $u$ or the mesh size parameter $h$. Such optimality results do not even guarantee that a given method converges, much less that it converges with a given rate. They do however reduce the problem of showing such convergence to proving things about the finite element space; that is, they remove the PDE from the task of error estimation.
An a priori error estimate is one which estimates the finite element error in terms of a mesh parameter and the generally unknown continuous solution $u$. A standard example is the bound
$$\|u - u_h\|_{H^1(\Omega)} \le C h^r |u|_{H^{r+1}(\Omega)}.$$
A priori estimates are quite useful in understanding whether a given method is effective, but are much
less useful in assessing whether the output of a given computation is accurate. We have already seen
a major reason why this is so: Very often the solution u is simply not smooth enough to make use of
the estimate above. However, proof of such estimates of course ensures that the method will make
full use of the polynomials at hand if the regularity of u allows it. A priori estimates also are an
important tool for testing codes, as one may insert a manufactured solution with known regularity
and test whether the expected rate of convergence is observed in practice.
A priori error estimation has been studied intensively since the inception of mathematical study of
finite element methods in the late 1960’s and early 1970’s. By the late 1970’s, basic error estimates
had been established for a wide range of problems and methods, while quite detailed estimates
(estimates in Lp norms, local estimates, etc.) had been proved for relatively simple model problems.
An a posteriori error bound is one which estimates the error in a given finite element computation by means of some computable functional of the problem data, finite element solution, mesh, and finite element space:
$$\|u - u_h\|_{H^1(\Omega)} \le E(f, u_h, \mathcal{T}_h, V_h), \qquad (3.1)$$
where $\mathcal{T}_h$ is the mesh. We call such a functional $E$ an a posteriori error estimator. In ideal cases there is no unknown constant in the error estimate, although one has to work harder to obtain such estimates. There are many different options in the literature for obtaining useful functionals $E$ having the above form. We say that the estimator $E$ is reliable if (3.1) holds at least up to a nonessential constant, and it is efficient if it also provides a lower bound for the error:
$$E(f, u_h, \mathcal{T}_h, V_h) \le C \|u - u_h\|_{H^1(\Omega)}. \qquad (3.2)$$
Reliability guarantees that if the quantity $E$ is small, then so is the actual error. That is, reliability ensures that we do not underestimate the error. Efficiency, on the other hand, guarantees that $E$ does not overestimate the true error. The effectivity index
$$\mathrm{eff} = \frac{\text{estimated error}}{\text{true error}}$$
is often used to assess the quality of a given error estimator $E$. Ideally we have $\mathrm{eff} = 1$. In practice, some estimators guarantee that $\mathrm{eff} \to 1$ as $h \to 0$; this property is called asymptotic exactness. Other estimators guarantee that $c_1 \le \mathrm{eff} \le c_2$ on any mesh (that is, they are efficient and reliable on any mesh). Much research has been directed toward finding estimators that are both asymptotically exact and unconditionally reliable.
An adaptive finite element method is an iterative feedback loop of the form
$$\text{solve} \to \text{estimate} \to \text{mark} \to \text{refine}. \qquad (3.3)$$
A rough definition of the modules is as follows. One first solves for the finite element solution $u_h$. In the estimate step, an estimator $E$ is used to determine whether the error has reached a given user-defined tolerance. If not, one proceeds to the mark step. In this step a posteriori error indicators $\eta(T)$ are used to determine which elements in the mesh are "most responsible" for causing the error. A posteriori error indicators $\eta(T)$ are, roughly speaking, mesh functions which assign a nonnegative number to each element $T \in \mathcal{T}_h$. If $\eta(T)$ is relatively large, then $T$ is judged to be more responsible for the error on the given mesh. Thus in mark, some subset of the elements in $\mathcal{T}_h$ is "marked". In refine, the marked elements are refined (e.g., bisected) in order to produce a new mesh, and the feedback procedure is repeated.
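The feedback loop can be summarized in pseudocode. The following is a minimal sketch in Python; the four module functions (`solve_poisson`, `estimate_indicators`, `dorfler_mark`, `bisect`) are hypothetical placeholders for the steps described above rather than calls to any particular library.

```python
def adaptive_fem(mesh, f, tol, theta=0.5, max_iter=50):
    """Minimal sketch of the solve -> estimate -> mark -> refine loop.
    All four module functions are assumed to be supplied by the user."""
    for it in range(max_iter):
        uh = solve_poisson(mesh, f)              # solve: Galerkin solution on current mesh
        eta = estimate_indicators(mesh, uh, f)   # estimate: elementwise indicators eta(T)
        total = sum(e**2 for e in eta) ** 0.5    # E = (sum_T eta(T)^2)^(1/2)
        if total <= tol:                         # stop once the estimator meets the tolerance
            break
        marked = dorfler_mark(eta, theta)        # mark: elements "most responsible" for the error
        mesh = bisect(mesh, marked)              # refine: bisect marked elements, keep conformity
    return uh, mesh
```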
Adaptive FEM are related to a posteriori error estimation, although the two tasks can also be separated. One might be interested in assessing the solution quality a posteriori on a fixed mesh. Alternatively, it is quite possible to adaptively refine the mesh without being too concerned about accurately assessing the error. However, in typical situations the local error indicators $\eta(T)$ are used to construct an error estimator. Very often we have $E = \big(\sum_{T \in \mathcal{T}_h} \eta(T)^2\big)^{1/2}$.
A posteriori error estimation and adaptivity, although now fairly well understood, developed as subjects on the whole somewhat later than the a priori theory. Early study of a posteriori error estimation was carried out for example in a 1978 paper of Babuška and Rheinboldt [5]. The topic continued to be studied in the 80's and up to the present, with many improvements and innovations occurring over the years. The concept of adaptivity developed around the same time, and a 1984 paper of Babuška and Vogelius [6] gave a convergence analysis of an adaptive FEM for 1D problems. After this the next paper to appear on the subject was the landmark 1996 paper of Dörfler [24], which provided foundational ideas for the rigorous study of adaptive FEM that has blossomed over the two decades since. Binev, Dahmen, and DeVore in 2004 used ideas from nonlinear approximation theory to establish a notion of optimality for AFEM [12]. Stevenson's 2007 paper [38] established optimality of a standard (practical) AFEM. Finally, a mature theory for linear elliptic scalar problems was given by Cascón, Kreuzer, Nochetto, and Siebert in the 2008 paper [16], and many papers have since appeared which adapt that work's general framework to other situations. From a historical perspective it is interesting to note that adaptive finite element methods were used for many years before any in-depth understanding of their mathematical properties was gained. This is probably due in part to the fact that their convergence can be observed by use of a posteriori error estimators. Another reason is that, as we shall see below, the ideas needed to analyze AFEM are quite different in many ways from those used to understand standard FEM via a priori error estimates.
3.2 Residual a posteriori error estimates and AFEM convergence
There are quite a number of types of a posteriori error estimates available in the literature. We first
will prove residual-type a posteriori error estimates and look at their properties in some detail. We
will then survey other types of a posteriori error estimators and discuss their relative advantages
and drawbacks. Note that our purpose in studying residual estimators in depth is twofold: They
will provide the foundation for our study of AFEM convergence theory, and they also provide the
basis for understanding the properties of other types of estimators.
3.2.1 Preliminaries and technical tools
Let $\Omega \subset \mathbb{R}^d$ be a polyhedral domain. We generally think of $d = 2, 3$, but this is not necessary for our considerations. Let $\mathcal{T}_h$ be a simplicial decomposition of $\Omega$. As in Chapter 1, we assume that $\mathcal{T}_h$ is conforming in the sense that $T_1 \cap T_2$ is either empty or a shared subsimplex (edge, face, vertex, etc.) of both elements. In addition, we assume that $\mathcal{T}_h$ is shape-regular. That is, there are constants $c, C$ such that for each $T \in \mathcal{T}_h$, there exist balls having diameters $c\,\mathrm{diam}(T)$ and $C\,\mathrm{diam}(T)$ inscribed in and circumscribing $T$, respectively. Recall that shape regularity places some restrictions on mesh geometry but allows for substantial local grading (refinement). Let $h_T = |T|^{1/d}$ be the local mesh size. This definition of $h_T$ is perhaps unfamiliar, as typically in the finite element literature $h_T$ is defined as the diameter of $T$. These two definitions are equivalent for shape-regular grids, but the one we use is more convenient for technical purposes. Next we define a patch of elements about a given $T \in \mathcal{T}_h$:
$$\omega_T = \bigcup_{T' \in \mathcal{T}_h : T' \cap T \ne \emptyset} T'.$$
Finally, we shall consider standard Lagrange finite element spaces of degree $r$ on $\mathcal{T}_h$:
$$V_h = \{u \in C(\bar{\Omega}) : u|_T \in \mathcal{P}_r\}.$$
Note that we have not incorporated any boundary conditions into our definition of Vh as of yet.
Trace inequalities play an important role below. Note that on a reference element $\hat{T}$ (i.e., a simplex with vertices at the origin and the canonical unit directions $(0, \ldots, 0, 1, 0, \ldots, 0)$), we have for $1 \le p < \infty$
$$\|v\|_{L^p(\partial\hat{T})} \lesssim \|v\|_{W^{p,1}(\hat{T})}, \quad v \in W^{p,1}(\hat{T}).$$
Scaling this inequality to an arbitrary $T \in \mathcal{T}_h$ yields
$$\|v\|_{L^p(\partial T)} \lesssim h_T^{-1/p}\|v\|_{L^p(T)} + h_T^{1-1/p}\|\nabla v\|_{L^p(T)}. \qquad (3.4)$$
Implicitly used in the above inequalities is the fact that if $v \in H^1(T)$, then the trace of $v$ lies in $L^2(\partial T)$.
We will also need the Bramble-Hilbert Lemma below. The Bramble-Hilbert Lemma on a single mesh element is quite standard in finite element texts (cf. [15]). The final form needed below (for element patches) can be found for example in [25, Theorem 7.1].

Lemma 3.2.1 Suppose that $\omega$ is either an element $T \in \mathcal{T}_h$ or an element patch $\omega_T$ corresponding to $T \in \mathcal{T}_h$. Then for $0 \le k \le r$, $k \le m \le r+1$, $1 \le p \le \infty$, and $u \in W^{p,m}(\omega)$,
$$\inf_{\chi \in \mathcal{P}^r} |u - \chi|_{W^{p,k}(\omega)} \le C h_T^{m-k} |u|_{W^{p,m}(\omega)}. \qquad (3.5)$$
Here $C$ depends on the shape regularity properties of $\mathcal{T}_h$ but not on $h_T$ or $u$.
We do not prove the Bramble-Hilbert Lemma here, but make a couple of notes on the proof. Note first that when $k = 0$ and $m = 1$ the Bramble-Hilbert Lemma reduces to a Poincaré inequality. One standard technique for proving the Bramble-Hilbert Lemma (used in the references given above) is essentially to generalize standard techniques used to prove Poincaré inequalities, namely the use of averaged Taylor polynomials and potential theory. Another option is to assume the standard Poincaré inequality and then iterate it to "knock out" higher-derivative terms.
Finally, we mention that shape regularity implies that
$$h_T \sim h_{T'} \sim \mathrm{diam}(\omega_T), \quad T' \subset \omega_T.$$
That is, shape-regular meshes are locally quasi-uniform. In addition, shape-regular meshes possess a finite overlap property:
$$\#\{T' \in \mathcal{T}_h : T' \subset \omega_T\} \lesssim 1. \qquad (3.6)$$
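As a small illustration of the patch definition, the sketch below computes $\omega_T$ for a mesh stored as tuples of vertex indices; for a conforming simplicial mesh, $T' \cap T \ne \emptyset$ is equivalent to $T'$ and $T$ sharing at least one vertex. The data layout is an assumption made only for this example.

```python
from collections import defaultdict

def element_patches(elements):
    """For each element T (a tuple of vertex indices), return the set of indices
    of elements T' sharing at least one vertex with T, i.e. the patch omega_T."""
    vertex_to_elements = defaultdict(set)
    for i, elem in enumerate(elements):
        for v in elem:
            vertex_to_elements[v].add(i)
    patches = []
    for elem in elements:
        patch = set()
        for v in elem:
            patch |= vertex_to_elements[v]
        patches.append(patch)
    return patches

# Example: two triangles sharing an edge; each element's patch contains both.
tris = [(0, 1, 2), (1, 2, 3)]
print(element_patches(tris))   # [{0, 1}, {0, 1}]
```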
We next recall the Scott-Zhang interpolation operator [37]. Let $\mathcal{N}$ be the Lagrange nodes of $\mathcal{T}_h$. For a node $z \in \mathcal{N}$, let $\lambda_z$ be the corresponding Lagrange basis function. That is, $\lambda_z \in V_h$, $\lambda_z(z) = 1$, and $\lambda_z(z') = 0$ for $z' \in \mathcal{N}$ with $z' \ne z$. We next attach to each $z \in \mathcal{N}$ a "control simplex" $T_z$. $T_z$ may either be an element in $\mathcal{T}_h$, or a $(d-1)$-dimensional subsimplex (face) of an element in $\mathcal{T}_h$. In either case, we require that $z \in T_z$. In particular, we use the following rules:
1. If $z \in \mathrm{int}(\Omega)$, then $T_z \in \mathcal{T}_h$ is any mesh element containing $z$.
2. If $z \in \partial\Omega$, then we choose $T_z \subset \partial\Omega$ to be any element face ($(d-1)$-dimensional subsimplex) lying on $\partial\Omega$ and containing $z$.
Finally, we let $\varphi_z \in \mathcal{P}_r(T_z)$ be dual to $\{\lambda_z\}_{z \in \mathcal{N}}$ in the sense that for $z, z' \in \mathcal{N}$,
$$\int_{T_z} \varphi_z \lambda_{z'} = \begin{cases} 1, & z = z', \\ 0, & z \ne z'. \end{cases} \qquad (3.7)$$
Given $v \in H^1(\Omega)$, we now define the Scott-Zhang quasi-interpolant as
$$I_h v(x) = \sum_{z \in \mathcal{N}} \lambda_z(x) \int_{T_z} v \varphi_z. \qquad (3.8)$$
($I_h$ is referred to as a quasi-interpolant because it does not actually interpolate, or exactly reproduce, $v$ at a set of control points.)
Theorem 3.2.2 The Scott-Zhang interpolant $I_h$ acts as the identity on $V_h$. In addition, given $v \in W^{p,1}(\Omega)$ ($1 \le p \le \infty$), we have
$$\|I_h v\|_{L^p(T)} \lesssim \|v\|_{L^p(\omega_T)} + h_T \|\nabla v\|_{L^p(\omega_T)}, \qquad \|\nabla I_h v\|_{L^p(T)} \lesssim \|\nabla v\|_{L^p(\omega_T)}. \qquad (3.9)$$
If we additionally assume that $v \in W_0^{p,1}(\Omega)$, then $I_h v \in W_0^{p,1}(\Omega) \cap V_h$, and
$$\|I_h v\|_{L^p(T)} \lesssim \|v\|_{L^p(\omega_T)}, \qquad \|\nabla I_h v\|_{L^p(T)} \lesssim \|\nabla v\|_{L^p(\omega_T)}. \qquad (3.10)$$
Proof. We first show that $I_h$ is a projection (acts as the identity on $V_h$). Given $T \in \mathcal{T}_h$ or a subsimplex $T_z$, let $\mathcal{N}_T = \{z \in \mathcal{N} : z \in T\}$. We use (3.7) to compute that for $v \in V_h$,
$$I_h v = \sum_{z \in \mathcal{N}} \lambda_z \int_{T_z} v \varphi_z = \sum_{z \in \mathcal{N}} \lambda_z \int_{T_z} \varphi_z \sum_{z' \in \mathcal{N}_{T_z}} v(z') \lambda_{z'} = \sum_{z \in \mathcal{N}} \lambda_z v(z) = v.$$
We next need to bound norms of $\varphi_z$ and $\lambda_z$. Clearly $\|\lambda_z\|_{L^\infty(T_z)} \lesssim 1$, so employing inverse inequalities yields
$$\|\lambda_z\|_{W^{p,k}(T)} \lesssim h_T^{-k + d/p}, \quad k = 0, 1. \qquad (3.11)$$
Also,
$$\|\varphi_z\|_{L^1(T_z)} = \sup_{v \in \mathcal{P}_r(T_z),\ \|v\|_{L^\infty(T_z)} = 1} \int_{T_z} \varphi_z v = \sup_{v \in \mathcal{P}_r(T_z),\ \|v\|_{L^\infty(T_z)} = 1} \int_{T_z} \varphi_z \sum_{z' \in \mathcal{N},\ z' \in T_z} v(z') \lambda_{z'} = \sup v(z) \le 1.$$
Standard inverse inequalities yield
$$\|\varphi_z\|_{W^{p,k}(T_z)} \lesssim h_{T_z}^{-k - \dim(T_z)(1 - 1/p)}, \quad k = 0, 1. \qquad (3.12)$$
We may then use (3.11) and (3.12) to compute that for $T \in \mathcal{T}_h$,
$$\|I_h v\|_{L^p(T)} = \Big\|\sum_{z \in \mathcal{N}_T} \lambda_z \int_{T_z} v \varphi_z\Big\|_{L^p(T)} \le \sum_{z \in \mathcal{N}_T} \|\lambda_z\|_{L^p(T)} \|v\|_{L^p(T_z)} \|\varphi_z\|_{L^q(T_z)} \lesssim \sum_{z \in \mathcal{N}_T} h_T^{d/p} \|v\|_{L^p(T_z)}\, h_T^{-\dim(T_z)/p}. \qquad (3.13)$$
Here $\frac{1}{p} + \frac{1}{q} = 1$. If $T_z \in \mathcal{T}_h$, then $\dim(T_z) = d$ and $\|v\|_{L^p(T_z)} \le \|v\|_{L^p(\omega_T)}$. If $T_z$ is a face, then $\dim(T_z) = d-1$, and we also use (3.4) to find that $\|v\|_{L^p(T_z)} \lesssim h_T^{-1/p}\|v\|_{L^p(\omega_T)} + h_T^{1-1/p}\|\nabla v\|_{L^p(\omega_T)}$. In either case,
$$h_T^{d/p}\|v\|_{L^p(T_z)}\, h_T^{-\dim(T_z)/p} \lesssim \|v\|_{L^p(\omega_T)} + h_T\|\nabla v\|_{L^p(\omega_T)},$$
which yields the first line of (3.9). If in addition $v \in W_0^{p,1}(\Omega)$, then all boundary terms fall out, leaving only those $T_z$ for which $\dim(T_z) = d$. This then yields the first line of (3.10), since we no longer need a trace inequality to bound $\|v\|_{L^p(T_z)}$ for any $z$.

To obtain the second line of (3.9) and (3.10), note that for any constant $A$, $\nabla I_h v = \nabla(I_h v - A) = \nabla(I_h(v - A))$, since $I_h$ is the identity on $V_h$ and $A \in V_h$. Using an inverse inequality, the first line of (3.9), and the Bramble-Hilbert Lemma (3.5) with $k = 0$ and $m = 1$ yields, for appropriately chosen $A$,
$$\|\nabla I_h v\|_{L^p(T)} \lesssim h_T^{-1}\|I_h(v - A)\|_{L^p(T)} \lesssim h_T^{-1}\|v - A\|_{L^p(\omega_T)} + \|\nabla(v - A)\|_{L^p(\omega_T)} \lesssim \|\nabla v\|_{L^p(\omega_T)}. \qquad \Box$$
We now combine the trace estimate (3.4), the Bramble-Hilbert Lemma (3.5), and (3.9) and (3.10) in order to obtain approximation estimates of the form we shall need below.

Theorem 3.2.3 Assume that $u \in W^{p,m}(\Omega)$ with $0 \le k \le m \le r+1$ and $1 \le p \le \infty$. Then for $T \in \mathcal{T}_h$,
$$\|u - I_h u\|_{W^{p,k}(T)} \lesssim h_T^{m-k}|u|_{W^{p,m}(\omega_T)}. \qquad (3.14)$$
If in addition $m \ge 1$, then for $T \in \mathcal{T}_h$
$$\|u - I_h u\|_{L^p(\partial T)} \lesssim h_T^{m - 1/p}|u|_{W^{p,m}(\omega_T)}. \qquad (3.15)$$

3.2.2 Residual-type a posteriori error estimates
Assume that we approximate the solution $u$ of a differential equation $Lu = f$ by $u_h$. The basic approach of residual-type estimates is to bound the residual $Lu_h - f$ in an appropriate norm (typically a norm dual to the one in which we wish to measure the error).

We begin by defining elementwise error indicators. First recall that a finite element function $u_h \in V_h$ is continuous, but its gradient $\nabla u_h$ is discontinuous across element boundaries. More precisely, $\nabla u_h \cdot \vec{n}$ is discontinuous on $\partial T$, with $\vec{n}$ the outward-pointing unit normal on $\partial T$. The tangential component of $\nabla u_h$ is continuous. Given $T, T' \in \mathcal{T}_h$ sharing a face $e$, we define
$$[\![\nabla u_h]\!] = \nabla u_h|_T \cdot \vec{n}_T + \nabla u_h|_{T'} \cdot \vec{n}_{T'}. \qquad (3.16)$$
Note that adding rather than subtracting is correct here because $\vec{n}_T = -\vec{n}_{T'}$. Also, for $x \in \partial T$ we more precisely define $\nabla u_h|_T(x) = \lim_{T \ni x' \to x} \nabla u_h(x')$. Also, note that because the tangential component of $\nabla u_h$ is continuous across $\partial T$ we have $|[\![\nabla u_h]\!]| = |\nabla u_h|_T - \nabla u_h|_{T'}|$.

To fix thoughts, we solve $-\Delta u = f$ in $\Omega$, $u = 0$ on $\partial\Omega$, as before. For element faces lying on the boundary, we define $[\![\nabla u_h]\!] = 0$. The reasons for this will become clear later. We will also briefly discuss modifications for other boundary conditions later on. For $T \in \mathcal{T}_h$, let
$$\eta(T)^2 = h_T^2\|\Delta u_h + f\|_{L^2(T)}^2 + h_T\|[\![\nabla u_h]\!]\|_{L^2(\partial T)}^2. \qquad (3.17)$$
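To make the indicator concrete, here is a small numerical sketch of a one-dimensional analogue of (3.17) for $-u'' = f$ with piecewise linear elements; there the jump term reduces to jumps of $u_h'$ at interior nodes, and $\Delta u_h = 0$ elementwise. The function name and data layout (nodal values on a grid, a callable $f$) are assumptions made for this illustration only.

```python
import numpy as np

def residual_indicators_1d(x, uh, f, nquad=4):
    """1D analogue of (3.17) for -u'' = f with continuous piecewise-linear uh
    given by nodal values on the grid x. Jumps on the boundary are set to zero,
    as in the text."""
    h = np.diff(x)                       # element lengths h_T
    slopes = np.diff(uh) / h             # uh' is piecewise constant
    jumps = np.zeros(len(x))             # derivative jumps at the nodes
    jumps[1:-1] = slopes[1:] - slopes[:-1]
    q, w = np.polynomial.legendre.leggauss(nquad)   # quadrature on [-1, 1]
    eta2 = np.zeros(len(h))
    for k in range(len(h)):
        mid, half = 0.5 * (x[k] + x[k + 1]), 0.5 * h[k]
        vol = half * np.sum(w * f(mid + half * q) ** 2)   # ||f + uh''||^2_{L2(T)}, uh'' = 0
        eta2[k] = h[k] ** 2 * vol + h[k] * (jumps[k] ** 2 + jumps[k + 1] ** 2)
    return np.sqrt(eta2)

# Example: uh = 0 on a uniform grid, so only the volumetric term contributes.
x = np.linspace(0.0, 1.0, 11)
print(residual_indicators_1d(x, np.zeros_like(x), lambda t: np.ones_like(t)))
```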
We then have the following basic a posteriori error estimate.

Theorem 3.2.4 Assume that $u_h \in V_h^0 := V_h \cap H_0^1(\Omega)$ is the standard finite element approximation to $u$. Then there exists a constant $C_{rel}$ depending on the shape regularity properties of $\mathcal{T}_h$, but not on other essential quantities, such that
$$\|u - u_h\|_{H_0^1(\Omega)} \le C_{rel}\Big(\sum_{T \in \mathcal{T}_h}\eta(T)^2\Big)^{1/2}. \qquad (3.18)$$
Proof. Our basic strategy is to apply Galerkin orthogonality, integrate by parts elementwise, do some careful bookkeeping, and apply the approximation properties proved in the previous subsection. In particular, let $\epsilon = u - u_h$, and note that $I_h\epsilon \in V_h^0$. Applying Galerkin orthogonality, applying the global weak form $\int_\Omega \nabla u\cdot\nabla v = \int_\Omega f v$, and integrating by parts elementwise for the remaining terms yields
$$\|u - u_h\|_{H_0^1(\Omega)}^2 = \int_\Omega \nabla(u - u_h)\cdot\nabla\epsilon = \int_\Omega \nabla(u - u_h)\cdot\nabla(\epsilon - I_h\epsilon) = \int_\Omega f(\epsilon - I_h\epsilon) + \sum_{T\in\mathcal{T}_h}\Big(\int_T \Delta u_h(\epsilon - I_h\epsilon) - \int_{\partial T}\nabla u_h\cdot\vec{n}_T(\epsilon - I_h\epsilon)\Big). \qquad (3.19)$$
Note that on element faces lying in $\partial\Omega$ we have $\epsilon - I_h\epsilon = 0$. Otherwise each face is shared by two elements, and we may compute that
$$\sum_{T\in\mathcal{T}_h}\int_{\partial T}\nabla u_h\cdot\vec{n}_T(\epsilon - I_h\epsilon) = \frac12\sum_{T\in\mathcal{T}_h}\int_{\partial T}[\![\nabla u_h]\!](\epsilon - I_h\epsilon). \qquad (3.20)$$
Combining the previous two equalities while recalling that $-\Delta u = f$, applying Cauchy-Schwarz, using (3.14) and (3.15), and finally applying the $\ell^2$ Cauchy-Schwarz inequality over $\mathcal{T}_h$ yields
$$\begin{aligned}
\|u - u_h\|^2_{H_0^1(\Omega)} &\le \sum_{T\in\mathcal{T}_h}\Big(\|f + \Delta u_h\|_{L^2(T)}\|\epsilon - I_h\epsilon\|_{L^2(T)} + \tfrac12\|[\![\nabla u_h]\!]\|_{L^2(\partial T)}\|\epsilon - I_h\epsilon\|_{L^2(\partial T)}\Big) \\
&\lesssim \sum_{T\in\mathcal{T}_h}\Big(\|f + \Delta u_h\|_{L^2(T)}\, h_T|\epsilon|_{H^1(\omega_T)} + \|[\![\nabla u_h]\!]\|_{L^2(\partial T)}\, h_T^{1/2}|\epsilon|_{H^1(\omega_T)}\Big) \\
&\lesssim \Big(\sum_{T\in\mathcal{T}_h}\eta(T)^2\Big)^{1/2}\Big(\sum_{T\in\mathcal{T}_h}|\epsilon|^2_{H^1(\omega_T)}\Big)^{1/2}. \qquad (3.21)
\end{aligned}$$
The finite overlap property (3.6) implies that
$$\sum_{T\in\mathcal{T}_h}|\epsilon|^2_{H^1(\omega_T)} \lesssim |\epsilon|^2_{H^1(\Omega)} = \|u - u_h\|^2_{H_0^1(\Omega)}. \qquad (3.22)$$
Combining the last two inequalities and dividing through by $\|u - u_h\|_{H_0^1(\Omega)}$ completes the proof. $\Box$
We also prove a local a posteriori upper bound for the difference between finite element solutions on two nested grids. Let $\mathcal{T}$ be a shape-regular grid having the same properties as $\mathcal{T}_h$ above, and let $\mathcal{T}'$ be a refinement of $\mathcal{T}$. That is, for each $T \in \mathcal{T}$, we either have $T \in \mathcal{T}'$, or $T$ is the union of some subset of elements in $\mathcal{T}'$. In this case we write $\mathcal{T} \subset \mathcal{T}'$. Denote also by $\mathcal{R}_{\mathcal{T}\to\mathcal{T}'}$ the set of elements that are refined in passing from $\mathcal{T}$ to $\mathcal{T}'$. Let $V_{\mathcal{T}}$ and $V_{\mathcal{T}'}$ be Lagrange finite element spaces on $\mathcal{T}$ and $\mathcal{T}'$ as above. Finally, denote by $\eta_{\mathcal{T}}(T)$ and $\eta_{\mathcal{T}'}(T)$ the elementwise error indicators (as above) computed on $\mathcal{T}$ and $\mathcal{T}'$, respectively.

Corollary 3.2.5 Let $\mathcal{T} \subset \mathcal{T}'$ as above. Let $u_{\mathcal{T}}$ and $u_{\mathcal{T}'}$ be the Galerkin solutions on $\mathcal{T}$ and $\mathcal{T}'$. Then
$$\|u_{\mathcal{T}} - u_{\mathcal{T}'}\|_{H_0^1(\Omega)} \le C_{rel}\Big(\sum_{T\in\mathcal{R}_{\mathcal{T}\to\mathcal{T}'}}\eta_{\mathcal{T}}(T)^2\Big)^{1/2}. \qquad (3.23)$$

Proof. Following the proof of (3.18), let $\epsilon = u_{\mathcal{T}} - u_{\mathcal{T}'}$. We consider the interpolation error $\epsilon - I_h\epsilon$. Let $\mathcal{N}_{nr}$ be the set of nodes lying in elements that are not refined in passing from $\mathcal{T}$ to $\mathcal{T}'$, that is, lying in $\mathcal{T}\setminus\mathcal{R}_{\mathcal{T}\to\mathcal{T}'}$. We define $I_h$ so that the control simplices for all $z \in \mathcal{N}_{nr}$ also lie in $\mathcal{T}\setminus\mathcal{R}_{\mathcal{T}\to\mathcal{T}'}$. In this case $\mathrm{supp}(\epsilon - I_h\epsilon) \subset \bigcup_{T\in\mathcal{R}_{\mathcal{T}\to\mathcal{T}'}} T$. We may then follow the proof of (3.18) nearly verbatim, noting that $\epsilon - I_h\epsilon = 0$ on elements lying outside of $\mathcal{R}_{\mathcal{T}\to\mathcal{T}'}$ and thus residual terms from those elements may be omitted. $\Box$
The inequality (3.18) establishes that the estimator $C_{rel}\big(\sum_{T\in\mathcal{T}_h}\eta(T)^2\big)^{1/2}$ is a reliable estimator for the energy error $\|u - u_h\|_{H_0^1(\Omega)}$. We next wish to establish that it is also efficient. In order to do so, we first introduce the concept of data oscillation. Data oscillation measures the degree to which problem data (right-hand side, coefficients, etc.) differ from piecewise polynomials. In our case, data oscillation only depends on the right-hand side $f$. We define
$$\mathrm{osc}(T) = \inf_{f_T\in\mathcal{P}^{r-1}} h_T\|f - f_T\|_{L^2(T)}, \qquad \mathrm{osc}(\mathcal{T}_h) = \Big(\sum_{T\in\mathcal{T}_h}\mathrm{osc}(T)^2\Big)^{1/2}. \qquad (3.24)$$
We similarly define $\mathrm{osc}(\mathcal{T}')$ and $\mathrm{osc}(\omega)$ for any subset $\mathcal{T}'\subset\mathcal{T}_h$ and $\omega\subset\Omega$.
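For intuition, the following sketch evaluates a one-dimensional analogue of (3.24) for $r = 1$, where the elementwise best $L^2$ approximation of $f$ by constants is simply its mean; the function name and data layout are assumptions made for this sketch.

```python
import numpy as np

def oscillation_1d(x, f, nquad=4):
    """1D data oscillation with r = 1: on each element, f_T is the mean of f and
    osc(T) = h_T * ||f - f_T||_{L2(T)}; returns (per-element osc, total osc)."""
    q, w = np.polynomial.legendre.leggauss(nquad)
    h = np.diff(x)
    osc = np.zeros(len(h))
    for k in range(len(h)):
        mid, half = 0.5 * (x[k] + x[k + 1]), 0.5 * h[k]
        fvals = f(mid + half * q)
        mean = half * np.sum(w * fvals) / h[k]     # L2 projection onto constants
        osc[k] = h[k] * np.sqrt(half * np.sum(w * (fvals - mean) ** 2))
    return osc, np.sqrt(np.sum(osc ** 2))

# Example: for smooth f the total oscillation decays like O(h^2) under uniform refinement.
x = np.linspace(0.0, 1.0, 11)
print(oscillation_1d(x, lambda t: np.sin(np.pi * t))[1])
```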
Theorem 3.2.6 Assume that $T \in \mathcal{T}_h$. Then
$$\eta(T)^2 \le \tilde{C}_{eff}^2\big(\|u - u_h\|^2_{H_0^1(\omega_T)} + \mathrm{osc}(\omega_T)^2\big). \qquad (3.25)$$
Thus
$$\Big(\sum_{T\in\mathcal{T}_h}\eta(T)^2\Big)^{1/2} \le C_{eff}\big(\|u - u_h\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_h)\big). \qquad (3.26)$$
Here $\tilde{C}_{eff}$ and $C_{eff}$ depend on shape regularity properties of $\mathcal{T}_h$ but are independent of other essential quantities.
Proof. Our proof follows a technique due to Verfürth [41]. We begin by defining an element bubble function $b_T(x) := \prod_{i=1}^{d+1}\lambda_i(x)$, where $\lambda_i$, $i = 1, \ldots, d+1$, are the barycentric coordinates on $T$. Note that $b_T = 0$ on $\partial T$, since on each face one of the barycentric coordinates is identically zero. In addition, $b_T > 0$ in $\mathrm{int}(T)$. Also, $\|b_T\|_{L^\infty(T)} \simeq 1$ (in fact it equals $(d+1)^{-(d+1)}$, attained at the barycenter).

Let now $f_T \in \mathcal{P}^{r-1}$ be as in (3.24). Then
$$h_T\|f + \Delta u_h\|_{L^2(T)} \le h_T\|f - f_T\|_{L^2(T)} + h_T\|f_T + \Delta u_h\|_{L^2(T)} = \mathrm{osc}(T) + h_T\|f_T + \Delta u_h\|_{L^2(T)}. \qquad (3.27)$$
Note next that $f_T + \Delta u_h \in \mathcal{P}^{r-1}(T)$, and that $\|v\|_{b_T} := \big(\int_T b_T v^2\big)^{1/2}$ defines a norm on $\mathcal{P}^{r-1}(T)$. By transformation to a reference element and using the fact that $\mathcal{P}^{r-1}$ is a finite dimensional space, we in fact have that $\|\cdot\|_{L^2(T)}$ and $\|\cdot\|_{b_T}$ are equivalent norms on $\mathcal{P}^{r-1}$ with constants independent of $h_T$ and other essential quantities. Thus
$$\begin{aligned}
h_T^2\|f_T + \Delta u_h\|^2_{L^2(T)} &\simeq h_T^2\|f_T + \Delta u_h\|^2_{b_T} = h_T^2\int_T b_T(f_T + \Delta u_h)(f_T - f + f + \Delta u_h) \\
&\le h_T^2\int_T b_T(f_T + \Delta u_h)[-\Delta(u - u_h)] + h_T^2\|f_T + \Delta u_h\|_{L^2(T)}\|f_T - f\|_{L^2(T)} \\
&\le h_T^2\int_T b_T(f_T + \Delta u_h)[-\Delta(u - u_h)] + \mathrm{osc}(T)^2 + \frac14 h_T^2\|f_T + \Delta u_h\|^2_{L^2(T)}.
\end{aligned} \qquad (3.28)$$
We then integrate by parts while recalling that $b_T = 0$ on $\partial T$ and then apply an inverse inequality to obtain
$$\begin{aligned}
h_T^2\int_T b_T(f_T + \Delta u_h)[-\Delta(u - u_h)] &= h_T^2\int_T \nabla[b_T(f_T + \Delta u_h)]\cdot\nabla(u - u_h) \\
&\le h_T^2\|\nabla[b_T(f_T + \Delta u_h)]\|_{L^2(T)}\|\nabla(u - u_h)\|_{L^2(T)} \\
&\le Ch_T\|b_T(f_T + \Delta u_h)\|_{L^2(T)}\|\nabla(u - u_h)\|_{L^2(T)} \\
&\le C\|\nabla(u - u_h)\|^2_{L^2(T)} + \frac14 h_T^2\|f_T + \Delta u_h\|^2_{L^2(T)}.
\end{aligned} \qquad (3.29)$$
Combining the last two estimates and reabsorbing the resulting final term $\frac12 h_T^2\|f_T + \Delta u_h\|^2_{L^2(T)}$ then yields
$$h_T^2\|f_T + \Delta u_h\|^2_{L^2(T)} \le C\big(\|\nabla(u - u_h)\|^2_{L^2(T)} + \mathrm{osc}(T)^2\big).$$
Inserting into (3.27) then yields
$$h_T\|f + \Delta u_h\|_{L^2(T)} \lesssim \|\nabla(u - u_h)\|_{L^2(T)} + \mathrm{osc}(T). \qquad (3.30)$$

We now consider the edge term $h_T^{1/2}\|[\![\nabla u_h]\!]\|_{L^2(\partial T)}$. Let $e$ be one of the faces of $T$, and assume that $e$ is shared by $T$ and $T'$; recall that $[\![\nabla u_h]\!]|_e = 0$ if $e \subset \partial\Omega$. We first define an edge bubble $b_e$ as follows. By (possibly) renumbering we can assume that $\lambda_1, \ldots, \lambda_d$ are the barycentric coordinates on $T$ that are nonzero on $e$, and similarly for $T'$. We then define $b_e|_T = \prod_{i=1}^d\lambda_i$, and similarly on $T'$. It is easy to compute that $b_e = 0$ on $\partial(T\cup T')$ and that $b_e$ is continuous on $T\cup T'$. Also,
$\|b_e\|_{L^\infty(T\cup T')} \simeq 1$. In addition, we have as above that for $v \in \mathcal{P}^{r-1}$, $\|v\|_{L^2(e)} \simeq \|b_e^{1/2}v\|_{L^2(e)}$. Finally, we let $\psi$ be the polynomial on $T\cup T'$ obtained by extending $[\![\nabla u_h]\!]$ as a constant in the direction normal to $e$. Using the fact that $[\![\nabla u_h]\!]$ is a polynomial, we can compute that
$$\|\psi\|_{L^2(T\cup T')} \lesssim h_T^{1/2}\|\psi\|_{L^2(e)} = h_T^{1/2}\|[\![\nabla u_h]\!]\|_{L^2(e)}. \qquad (3.31)$$
We next note that $0 = \int_{\partial(T\cup T')}(b_e\psi)\nabla u\cdot\vec{n} = \int_{T\cup T'}\nabla(b_e\psi)\cdot\nabla u + b_e\psi\,\Delta u$. Thus
$$\begin{aligned}
\|[\![\nabla u_h]\!]\|^2_{L^2(e)} &\simeq \int_e b_e[\![\nabla u_h]\!]^2 = \int_e b_e\psi\,[\![\nabla u_h]\!] = \int_{\partial T}b_e\psi\,\nabla u_h\cdot\vec{n} + \int_{\partial T'}b_e\psi\,\nabla u_h\cdot\vec{n} \\
&= \int_T\nabla(b_e\psi)\cdot\nabla u_h + b_e\psi\,\Delta u_h + \int_{T'}\nabla(b_e\psi)\cdot\nabla u_h + b_e\psi\,\Delta u_h \\
&= \int_T\nabla(b_e\psi)\cdot\nabla(u_h - u) + b_e\psi\,\Delta_h(u_h - u) + \int_{T'}\nabla(b_e\psi)\cdot\nabla(u_h - u) + b_e\psi\,\Delta_h(u_h - u) \\
&\le \|\nabla(b_e\psi)\|_{L^2(T\cup T')}\|\nabla(u - u_h)\|_{L^2(T\cup T')} + \|b_e\psi\|_{L^2(T\cup T')}\|f + \Delta_h u_h\|_{L^2(T\cup T')},
\end{aligned} \qquad (3.32)$$
where by $\Delta_h$ we mean $\Delta$ computed elementwise. Using (3.31) and $\|b_e\|_{L^\infty(T\cup T')} \lesssim 1$, we now compute that
$$\|\nabla(b_e\psi)\|_{L^2(T\cup T')} \lesssim h_T^{-1/2}\|[\![\nabla u_h]\!]\|_{L^2(e)}, \qquad \|b_e\psi\|_{L^2(T\cup T')} \lesssim h_T^{1/2}\|[\![\nabla u_h]\!]\|_{L^2(e)}. \qquad (3.33)$$
Inserting these relationships and then (3.30) into (3.32) then yields
$$\begin{aligned}
h_T\|[\![\nabla u_h]\!]\|^2_{L^2(e)} &\le C\big(h_T^{1/2}\|\nabla(u - u_h)\|_{L^2(T\cup T')}\|[\![\nabla u_h]\!]\|_{L^2(e)} + h_T^{3/2}\|f + \Delta u_h\|_{L^2(T)}\|[\![\nabla u_h]\!]\|_{L^2(e)}\big) \\
&\le C\big(\|\nabla(u - u_h)\|^2_{L^2(T\cup T')} + \mathrm{osc}(T\cup T')^2\big) + \frac12 h_T\|[\![\nabla u_h]\!]\|^2_{L^2(e)}.
\end{aligned} \qquad (3.34)$$
Reabsorbing the last term and combining the result with (3.30) completes the proof. $\Box$
We finally make a philosophical remark about the role that data oscillation plays in a posteriori error estimation and adaptive finite element methods. First note that $\mathrm{osc}(T) \le \eta(T)$, since $\Delta u_h \in \mathcal{P}^{r-1}$ and so $\inf_{f_T\in\mathcal{P}^{r-1}}\|f - f_T\|_{L^2(T)} \le \|f + \Delta u_h\|_{L^2(T)}$. Combining this observation with the above reliability and efficiency estimates, we obtain
$$\|u - u_h\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_h) \simeq \Big(\sum_{T\in\mathcal{T}_h}\eta(T)^2\Big)^{1/2}.$$
The quantity $\|u - u_h\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_h)$ (error plus oscillation) is often referred to as the total error. Data oscillation is typically a higher-order ($O(h^{r+1})$) term if $f$ is smooth enough, but it may dominate the total error on coarse meshes. An important philosophical step in establishing a robust convergence theory for adaptive FEM was the realization that a posteriori error estimates (and corresponding adaptive FEM) really control the total error, not just the plain energy error as is usually considered in a priori error estimates.
3.2.3 Mesh refinement: Bisection
Adaptive finite element methods produce a sequence $\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2, \ldots$ of grids. Here $\mathcal{T}_0$ can be thought of as an algorithm input, i.e., a user-supplied grid, and $\mathcal{T}_1, \mathcal{T}_2, \ldots$ are generated automatically by the algorithm. Establishing convergence of adaptive FEM relies critically on the construction of mesh refinement routines ("refine" in (3.3)) with appropriate properties. In particular, we will eventually need the following (or closely related) properties to hold:
1. Given $\mathcal{T}_\ell$, refine produces $\mathcal{T}_{\ell+1}$ with
$$\mathcal{T}_\ell \subset \mathcal{T}_{\ell+1}. \qquad (3.35)$$
2. The sequence $\{\mathcal{T}_\ell\}_{\ell\ge0}$ is uniformly shape regular. That is, there exist constants $c, C$ independent of $\ell$ such that for any $T \in \mathcal{T}_\ell$ ($\ell \ge 0$), there are balls of radius $c h_T$ and $C h_T$ inscribed in and circumscribing $T$, respectively. Here $c, C$ may be smaller and larger, respectively, than the corresponding constants $c_0, C_0$ describing the shape regularity of $\mathcal{T}_0$, but not by more than a fixed factor.
3. If we instruct refine to subdivide (at least) the subset $\mathcal{M}_\ell$ of $\mathcal{T}_\ell$ in order to produce $\mathcal{T}_{\ell+1}$, the algorithm will produce a conforming mesh $\mathcal{T}_{\ell+1}$ such that the number of newly produced triangles is no more than a fixed multiple of the number of triangles in $\mathcal{M}_\ell$:
$$\#\mathcal{T}_{\ell+1} - \#\mathcal{T}_\ell \lesssim \#\mathcal{M}_\ell. \qquad (3.36)$$
The above properties are (almost) fulfilled by the newest vertex bisection algorithm and its higher-dimensional generalizations, as we describe below. (We say "almost" here because (3.36) does not hold with a constant uniform with respect to the mesh level; it only holds in a cumulative fashion described below.) However, other algorithms exist which violate some of the above conditions, and good algorithmic performance may still be achieved even if proof of this fact is more difficult. The essential requirements are that shape regularity is maintained, the algorithm does not inflate the number of refined elements by too much more than the number of marked elements, and that the depth of nonconformity is bounded. Nestedness is sometimes violated in practice, as is the conformity requirement.
We briefly describe two refinement algorithms that are widely known but lack some of the above properties. The first is "red-green" refinement [9]. We give a brief description of the two-dimensional version; a 3D version is also available ("red-green-blue" refinement and variants). Given a marked set $\mathcal{M}_\ell \subset \mathcal{T}_\ell$, the algorithm produces $\mathcal{T}_{\ell+1}$ by a combination of red and green refinements. "Red" refinement of $T \in \mathcal{T}_\ell$ involves connecting the midpoints of all edges in $T$ so that four children are produced, i.e., $T$ is subdivided into four similar triangles. "Green" refinement of $T \in \mathcal{T}_\ell$ is a bisection, that is, a new edge is created by connecting a vertex of $T$ with the midpoint of its opposite edge. $T \in \mathcal{T}_\ell$ is colored red if either $T \in \mathcal{T}_0$ or it is the child of some previous $T' \in \mathcal{T}_k$ by red refinement. $T \in \mathcal{T}_\ell$ is green if it is the child by green refinement of some $T' \in \mathcal{T}_k$, $k < \ell$. The steps in red-green refinement are:
1. Divide $\mathcal{M}_\ell$ into $\mathcal{M}_\ell^r$ (red elements) and $\mathcal{M}_\ell^g$ (green elements).
2. Refine all elements in $\mathcal{M}_\ell^r$ by red refinement.
3. Coarsen all $T \in \mathcal{M}_\ell^g$ by removing their fathers' midlines. Refine their fathers by red refinement and repaint the children red.
4. Remove possible hanging nodes by iterating the following:
(a) Refine all red elements with at least two hanging nodes by red refinement and paint the children red.
(b) Coarsen green elements with at least one hanging node and refine their fathers by red refinement.
(c) Refine red elements with only one hanging node by green refinement and paint the children green.
Note that this strategy clearly preserves shape regularity: Children of red-refined triangles are similar to the parent. Green-refined triangles have angles at most half those of their parent, and because no triangle is ever green-refined more than once, the shapes cannot degenerate.
A second option is longest-edge bisection. Longest-edge bisection will produce uniformly shape-regular meshes, since always bisecting the longest edge produces children which do the least damage to the shape regularity properties of the new mesh. However, proving combinatorial properties of the created sequence of meshes (in particular, that the conforming closures do not damage the cardinality of the produced meshes) is problematic.
Conceptually closely related to longest-edge bisection, but having provably good combinatorial properties, is newest vertex bisection and its generalizations to higher space dimensions. We refer to [39] for a thorough exploration of these properties. The basic algorithm in two dimensions first inputs an initial mesh $\mathcal{T}_0$ with one vertex in each element labeled as the "newest". Then for $\ell \ge 0$:
1. Bisect each element in $\mathcal{M}_\ell$, with the refinement edge being the one opposite the newest vertex. Relabel the newly created vertex as the newest.
2. Recursively apply this step to the elements containing hanging nodes until the algorithm terminates, i.e., until no hanging nodes remain.
The last step should of course raise warning flags, as it is not clear that the algorithm should terminate at all. It can however be proved that the algorithm does in fact terminate, provided that the initial labeling of "newest" vertices in $\mathcal{T}_0$ satisfies certain properties. Such a labeling can always be supplied for two-dimensional meshes, and for higher-dimensional meshes after possibly carrying out a finite number of uniform refinements of $\mathcal{T}_0$. In addition, it turns out that (3.36) does not hold with a uniform constant. A cumulative version does however hold.
Lemma 3.2.7 Given a polyhedral domain $\Omega$, an initial conforming shape-regular simplicial decomposition $\mathcal{T}_0$, and for each $\ell \ge 0$ a subset $\mathcal{M}_\ell \subset \mathcal{T}_\ell$, the newest-vertex bisection algorithm or its generalization to higher space dimensions will produce a sequence of meshes $\{\mathcal{T}_\ell\}_{\ell\ge0}$ which is conforming, uniformly shape-regular, nested in the sense of (3.35), and ensures that all elements $T \in \mathcal{M}_\ell$ are bisected at least once in passing from $\mathcal{T}_\ell$ to $\mathcal{T}_{\ell+1}$. In addition, there is a constant $C$ depending on $\mathcal{T}_0$ such that for any $\ell \ge 0$,
$$\#\mathcal{T}_\ell - \#\mathcal{T}_0 \le C\sum_{k=0}^{\ell-1}\#\mathcal{M}_k. \qquad (3.37)$$

3.2.4 Contraction
We now more precisely define the steps in (3.3). Let an initial mesh $\mathcal{T}_0$ be given. We denote by $V_\ell$ the finite element space corresponding to $\mathcal{T}_\ell$, etc. Then for $\ell \ge 0$:
1. We solve for $u_\ell \in V_\ell$, assuming exact linear algebra. (This is not the typical case in practice; there are papers available which analyze the effect of inexact system solution on AFEM convergence.)
2. estimate using the residual estimator $\big(\sum_{T\in\mathcal{T}_\ell}\eta(T)^2\big)^{1/2}$.
3. Let $0 < \theta \le 1$ be given. In mark, we choose a set $\mathcal{M}_\ell$ of minimal cardinality so that
$$\sum_{T\in\mathcal{M}_\ell}\eta(T)^2 \ge \theta\sum_{T\in\mathcal{T}_\ell}\eta(T)^2. \qquad (3.38)$$
This is called Dörfler or bulk marking (see the sketch following this list). Note for the sake of intuition that $\theta = 1$ corresponds to uniform refinement, unless there are elements on which the indicators are 0.
4. Use newest-vertex bisection or its generalization to higher space dimensions in order to refine (bisect) the elements in $\mathcal{M}_\ell$ $b \ge 1$ times and then bisect additional elements in order to produce a new conforming mesh $\mathcal{T}_{\ell+1}$ with $\mathcal{T}_\ell \subset \mathcal{T}_{\ell+1}$.
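Here is a minimal sketch of the Dörfler marking step: sorting the indicators in decreasing order and accumulating until the bulk criterion (3.38) is met yields a marked set of minimal cardinality. The indicators are assumed to be supplied as a plain array; this layout is an assumption of the sketch.

```python
import numpy as np

def dorfler_mark(eta, theta):
    """Return indices of a minimal-cardinality set M with
    sum_{T in M} eta(T)^2 >= theta * sum_T eta(T)^2."""
    eta2 = np.asarray(eta) ** 2
    order = np.argsort(eta2)[::-1]            # largest indicators first
    cumulative = np.cumsum(eta2[order])
    target = theta * eta2.sum()
    n_marked = int(np.searchsorted(cumulative, target) + 1)
    return order[:n_marked]

# Example: with theta = 0.5 the single largest indicator already carries half the total.
print(dorfler_mark([0.1, 0.7, 0.2, 0.6], 0.5))   # [1]
```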
The work [16] identified a sequence of four steps, or ingredients, for proving that AFEM converges to the true solution: an a posteriori upper bound, orthogonality, an estimator reduction property, and contraction.

Ingredient 1: A posteriori upper bound. We have from above that
$$\|u - u_\ell\|_{H_0^1(\Omega)} \le C_{rel}\Big(\sum_{T\in\mathcal{T}_\ell}\eta(T)^2\Big)^{1/2}. \qquad (3.39)$$

Ingredient 2: Orthogonality. Nestedness of the finite element spaces and Galerkin orthogonality produce the Pythagorean identity
$$\|u - u_{\ell+1}\|^2_{H_0^1(\Omega)} = \|u - u_\ell\|^2_{H_0^1(\Omega)} - \|u_\ell - u_{\ell+1}\|^2_{H_0^1(\Omega)}. \qquad (3.40)$$
To prove this, we write $a(u, v) := \int_\Omega\nabla u\cdot\nabla v$ and $\|u\|^2_{H_0^1(\Omega)} = a(u, u)$. Then using the fact that $V_\ell \subset V_{\ell+1}$ and so $a(u - u_{\ell+1}, v_{\ell+1}) = 0$ for $v_{\ell+1} \in V_{\ell+1}$, we have
$$\begin{aligned}
\|u - u_\ell\|^2_{H_0^1(\Omega)} &= a(u - u_\ell, u - u_\ell) = a(u - u_{\ell+1}, u - u_\ell) + a(u_{\ell+1} - u_\ell, u - u_\ell) \\
&= a(u - u_{\ell+1}, u - u_{\ell+1}) + a(u - u_{\ell+1}, u_{\ell+1} - u_\ell) + a(u_{\ell+1} - u_\ell, u - u_\ell) \\
&= a(u - u_{\ell+1}, u - u_{\ell+1}) + a(u_{\ell+1} - u_\ell, u - u_{\ell+1}) + a(u_{\ell+1} - u_\ell, u_{\ell+1} - u_\ell) \\
&= \|u - u_{\ell+1}\|^2_{H_0^1(\Omega)} + \|u_{\ell+1} - u_\ell\|^2_{H_0^1(\Omega)},
\end{aligned}$$
where Galerkin orthogonality was used to drop the terms $a(u - u_{\ell+1}, u_{\ell+1} - u_\ell)$ and $a(u_{\ell+1} - u_\ell, u - u_{\ell+1})$.
Our strategy is to show that $\|u - u_{\ell+1}\|_{H_0^1(\Omega)} \le \rho\|u - u_\ell\|_{H_0^1(\Omega)}$ for some $\rho < 1$. The identity (3.40) in essence tells us that we must show that $\|u_\ell - u_{\ell+1}\|_{H_0^1(\Omega)}$ is sufficiently large with respect to $\|u - u_\ell\|_{H_0^1(\Omega)}$.

Ingredient 3: Estimator reduction. We first expand our notation. Given $v_\ell \in V_\ell$, we write
$$\eta_\ell(v_\ell; T)^2 = h_T^2\|f + \Delta v_\ell\|^2_{L^2(T)} + h_T\|[\![\nabla v_\ell]\!]\|^2_{L^2(\partial T)}, \quad T \in \mathcal{T}_\ell.$$
Note that if $v_\ell \in V_\ell$, then also $v_\ell \in V_{\ell+1}$ and so we can consider both $\eta_\ell(v_\ell; T)$ ($T \in \mathcal{T}_\ell$) and $\eta_{\ell+1}(v_\ell; T)$ ($T \in \mathcal{T}_{\ell+1}$). In this case $[\![\nabla v_\ell]\!]|_e = 0$ on edges $e$ which are interfaces of elements in $\mathcal{T}_{\ell+1}$ but not of elements in $\mathcal{T}_\ell$.
We begin with an auxiliary proposition establishing what might be termed an "indicator continuity" property.

Proposition 3.2.8 Assume that $v, w \in V_\ell$. Then there exists $C > 0$ such that for any $0 < \alpha < 1$ and $T \in \mathcal{T}_\ell$,
$$\eta_\ell(T; v)^2 \le (1 + \alpha)\eta_\ell(T; w)^2 + C(1 + \alpha^{-1})\|v - w\|^2_{H_0^1(\omega_T)}. \qquad (3.41)$$

Proof. First consider the volumetric residual $h_T\|f + \Delta v\|_{L^2(T)}$. The triangle inequality followed by an inverse inequality yields
$$h_T\|f + \Delta v\|_{L^2(T)} \le h_T\|f + \Delta w\|_{L^2(T)} + h_T\|\Delta(w - v)\|_{L^2(T)} \le h_T\|f + \Delta w\|_{L^2(T)} + C\|w - v\|_{H_0^1(T)}.$$
Squaring the above and applying Young's inequality yields
$$h_T^2\|f + \Delta v\|^2_{L^2(T)} \le h_T^2\|f + \Delta w\|^2_{L^2(T)} + 2Ch_T\|f + \Delta w\|_{L^2(T)}\|w - v\|_{H_0^1(T)} + C^2\|w - v\|^2_{H_0^1(T)} \le (1 + \alpha)h_T^2\|f + \Delta w\|^2_{L^2(T)} + C(1 + \alpha^{-1})\|w - v\|^2_{H_0^1(T)}. \qquad (3.42)$$
Similarly, let $e$ be a face in $\partial T$ which is shared by $T, T' \in \mathcal{T}_\ell$. Using the scaled trace inequality (3.4) and an inverse inequality yields
$$\begin{aligned}
h_T^{1/2}\|[\![\nabla v]\!]\|_{L^2(e)} &\le h_T^{1/2}\|[\![\nabla w]\!]\|_{L^2(e)} + h_T^{1/2}\|[\![\nabla(v - w)]\!]\|_{L^2(e)} \\
&\le h_T^{1/2}\|[\![\nabla w]\!]\|_{L^2(e)} + Ch_T^{1/2}\big(h_T^{-1/2}\|v - w\|_{H_0^1(T\cup T')} + h_T^{1/2}|v - w|_{H^2(T)} + h_T^{1/2}|v - w|_{H^2(T')}\big) \\
&\le h_T^{1/2}\|[\![\nabla w]\!]\|_{L^2(e)} + C\|v - w\|_{H_0^1(T\cup T')}.
\end{aligned}$$
Squaring the above, adding the result over the faces of $T$, and applying Young's inequality as above yields
$$h_T\|[\![\nabla v]\!]\|^2_{L^2(\partial T)} \le (1 + \alpha)h_T\|[\![\nabla w]\!]\|^2_{L^2(\partial T)} + C(1 + \alpha^{-1})\|v - w\|^2_{H_0^1(\omega_T)}. \qquad (3.43)$$
Adding together (3.42) and (3.43) completes the proof. $\Box$
Lemma 3.2.9 Assume that the marked set $\mathcal{M}_\ell$ satisfies the bulk marking criterion (3.38). Then there are a constant $C_{red} > 0$ independent of essential quantities and a constant $0 < \lambda < 1$ depending on the number of bisections $b$ applied to each $T \in \mathcal{M}_\ell$ but otherwise independent of essential quantities such that for any $0 < \alpha < 1$,
$$\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_{\ell+1})^2 \le (1 + \alpha)(1 - \lambda\theta)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2 + C_{red}(1 + \alpha^{-1})\|u_\ell - u_{\ell+1}\|^2_{H_0^1(\Omega)}. \qquad (3.44)$$

Proof. Let $T \in \mathcal{T}_{\ell+1}$. Because $u_\ell, u_{\ell+1} \in V_{\ell+1}$, (3.41) yields
$$\eta_{\ell+1}(T; u_{\ell+1})^2 \le (1 + \alpha)\eta_{\ell+1}(T; u_\ell)^2 + C(1 + \alpha^{-1})\|u_\ell - u_{\ell+1}\|^2_{H_0^1(\omega_T)}. \qquad (3.45)$$
Applying finite overlap of the patches $\omega_T$ while summing over $T \in \mathcal{T}_{\ell+1}$ yields
$$\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_{\ell+1})^2 \le (1 + \alpha)\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_\ell)^2 + C(1 + \alpha^{-1})\|u_\ell - u_{\ell+1}\|^2_{H_0^1(\Omega)}.$$
Assume now that $T \in \mathcal{M}_\ell$. Then there exist $2^b$ elements $T_1, \ldots, T_{2^b}$ in $\mathcal{T}_{\ell+1}$ such that $T = \bigcup_{i=1}^{2^b}T_i$. Note also that $|T_i| = 2^{-b}|T|$ and so $h_{T_i} = 2^{-b/d}h_T$, $i = 1, \ldots, 2^b$. Let $\tilde\lambda = 2^{-b/d}$. Then
$$\sum_{i=1}^{2^b}h_{T_i}^2\|f + \Delta u_\ell\|^2_{L^2(T_i)} = \tilde\lambda^2 h_T^2\|f + \Delta u_\ell\|^2_{L^2(T)}.$$
Recalling that $[\![\nabla u_\ell]\!]|_e = 0$ for faces $e$ which are faces of elements in $\mathcal{T}_{\ell+1}$ but not faces of elements in $\mathcal{T}_\ell$, we similarly have
$$\sum_{i=1}^{2^b}h_{T_i}\|[\![\nabla u_\ell]\!]\|^2_{L^2(\partial T_i)} = \tilde\lambda\, h_T\|[\![\nabla u_\ell]\!]\|^2_{L^2(\partial T)}.$$
Thus, since $\tilde\lambda < 1$,
$$\eta_{\ell+1}(T; u_\ell)^2 \le \tilde\lambda\,\eta_\ell(T; u_\ell)^2, \quad T \in \mathcal{M}_\ell, \qquad (3.46)$$
where for $T \in \mathcal{M}_\ell$ the left-hand side denotes the sum of the indicators over the children of $T$. If $T \in \mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}_{\ell+1}}$ we may similarly prove
$$\eta_{\ell+1}(T; u_\ell)^2 \le \eta_\ell(T; u_\ell)^2, \qquad (3.47)$$
and clearly
$$\eta_{\ell+1}(T; u_\ell)^2 = \eta_\ell(T; u_\ell)^2, \quad T \in \mathcal{T}_\ell\setminus\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}_{\ell+1}}. \qquad (3.48)$$
Combining the above, summing over $\mathcal{T}_{\ell+1}$, and then using the fact that $\eta_\ell(\mathcal{M}_\ell; u_\ell)^2 \ge \theta\,\eta_\ell(\mathcal{T}_\ell; u_\ell)^2$, we obtain
$$\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_\ell)^2 \le \eta_\ell(\mathcal{T}_\ell\setminus\mathcal{M}_\ell; u_\ell)^2 + \tilde\lambda\,\eta_\ell(\mathcal{M}_\ell; u_\ell)^2 = \eta_\ell(\mathcal{T}_\ell; u_\ell)^2 - (1 - \tilde\lambda)\eta_\ell(\mathcal{M}_\ell; u_\ell)^2 \le (1 - (1 - \tilde\lambda)\theta)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2. \qquad (3.49)$$
Inserting (3.49) into (3.45) while defining $\lambda = 1 - \tilde\lambda$ completes the proof. $\Box$
Ingredient 4: Contraction. Properly mixing the first three ingredients yields a reduction in the total error at each step of the AFEM algorithm.
Theorem 3.2.10 Assume that AFEM as defined above is employed. Then there exist constants $\delta > 0$ and $0 < \rho < 1$ depending on the shape regularity of $\mathcal{T}_0$, $b$, and $\theta$ but not on other essential quantities such that
$$\|u - u_{\ell+1}\|^2_{H_0^1(\Omega)} + \delta\,\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_{\ell+1})^2 \le \rho\big(\|u - u_\ell\|^2_{H_0^1(\Omega)} + \delta\,\eta_\ell(\mathcal{T}_\ell; u_\ell)^2\big).$$

Proof. Let $\delta = \frac{1}{C_{red}(1 + \alpha^{-1})}$, with $C_{red}$ and $\alpha$ as in (3.44). We then rewrite (3.44) as
$$\delta\,\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_{\ell+1})^2 \le \delta(1 + \alpha)(1 - \lambda\theta)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2 + \|u_\ell - u_{\ell+1}\|^2_{H_0^1(\Omega)}.$$
Adding the above (Ingredient 3) together with the orthogonality relationship (3.40) (Ingredient 2) then yields
$$\|u - u_{\ell+1}\|^2_{H_0^1(\Omega)} + \delta\,\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_{\ell+1})^2 \le \|u - u_\ell\|^2_{H_0^1(\Omega)} + \delta(1 + \alpha)(1 - \lambda\theta)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2.$$
We next use the a posteriori upper bound (3.39) (Ingredient 1) to find
$$\delta(1 + \alpha)(1 - \lambda\theta)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2 = \big(\delta(1 + \alpha)(1 - \lambda\theta/2) - \delta(1 + \alpha)\lambda\theta/2\big)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2 \le \delta(1 + \alpha)(1 - \lambda\theta/2)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2 - C_{rel}^{-2}\delta(1 + \alpha)\lambda\theta/2\,\|u - u_\ell\|^2_{H_0^1(\Omega)}.$$
Combining the last two inequalities then yields
$$\|u - u_{\ell+1}\|^2_{H_0^1(\Omega)} + \delta\,\eta_{\ell+1}(\mathcal{T}_{\ell+1}; u_{\ell+1})^2 \le \big(1 - C_{rel}^{-2}\delta(1 + \alpha)\lambda\theta/2\big)\|u - u_\ell\|^2_{H_0^1(\Omega)} + \delta(1 + \alpha)(1 - \lambda\theta/2)\eta_\ell(\mathcal{T}_\ell; u_\ell)^2.$$
Recall that $0 < \alpha < 1$ is arbitrary. We thus may take $\alpha$ small enough so that $(1 + \alpha)(1 - \lambda\theta/2) < 1$, since $1 - \lambda\theta/2 < 1$. We do so and then set $\rho = \max\big(1 - C_{rel}^{-2}\delta(1 + \alpha)\lambda\theta/2,\ (1 + \alpha)(1 - \lambda\theta/2)\big) < 1$. This completes the proof. $\Box$
In the next subsection we will need the following variation of Theorem 3.2.10. Recalling that $\eta_\ell(\mathcal{T}_\ell) \simeq \|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)$, the proof is immediate.

Corollary 3.2.11 Under the assumptions of Theorem 3.2.10, we have for $m \ge \ell$
$$\|u - u_m\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_m) \lesssim \rho^{m-\ell}\big[\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)\big]. \qquad (3.50)$$

3.2.5 Quasi-Optimality of AFEM
Theorem 3.2.10 guarantees convergence of AFEM, and it even gives a convergence rate in that it establishes that the total error is reduced by a fixed fraction at each step of the algorithm. On the other hand, it leaves important questions open. Recall the standard a priori estimate $\|u - u_h\|_{H_0^1(\Omega)} \le Ch^r|u|_{H^{r+1}(\Omega)}$. This estimate shows that a better convergence rate can be achieved with a higher polynomial degree, if $u$ is sufficiently smooth. Theorem 3.2.10 however references neither the polynomial degree $r$ nor the regularity of $u$. In this subsection we remedy this shortcoming by establishing a rate optimality result for AFEM.

Let $\mathbb{T}$ be the set of all conforming meshes that can be derived by newest-vertex bisection (or its generalization to higher space dimensions) from the initial mesh $\mathcal{T}_0$. The goal of AFEM is to optimize approximation to $u$ over $\mathbb{T}$. That is, AFEM should pick out a sequence of meshes through $\mathbb{T}$ that
is optimal in some reasonable sense. We shall show that if $u$ can be approximated with rate $s$ by a sequence of meshes in $\mathbb{T}$, then AFEM in fact picks out a sequence of meshes which also approximates $u$ with rate $s$. More formally, we define nonlinear approximation classes. Let $s > 0$ be given; $s$ may be thought of as an approximation rate. We define
$$|u|_{\mathcal{A}_s} := \sup_{N\ge0} N^s \inf_{\mathcal{T}\in\mathbb{T},\,\#\mathcal{T}-\#\mathcal{T}_0=N}\ \inf_{u_{\mathcal{T}}\in V_{\mathcal{T}}}\big(\|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T})\big) \qquad (3.51)$$
and write $u \in \mathcal{A}_s$ if $|u|_{\mathcal{A}_s} < \infty$. Our goal will be to prove that $\|u - u_\ell\|_{H_0^1(\Omega)} \lesssim (\#\mathcal{T}_\ell - \#\mathcal{T}_0)^{-s}|u|_{\mathcal{A}_s}$ whenever $u \in \mathcal{A}_s$ for $0 < s \le \frac{r}{d}$, under appropriate assumptions. Comparing with the classical a priori estimate $\|u - u_h\|_{H_0^1(\Omega)} \le Ch^r|u|_{H^{r+1}(\Omega)}$, we see that roughly speaking $(\#\mathcal{T} - \#\mathcal{T}_0)^{-s}$ takes the place of the convergence rate $h^r$ and the nonlinear approximation measure $|u|_{\mathcal{A}_s}$ takes the place of the regularity measure $|u|_{H^{r+1}(\Omega)}$. At this point, it may appear that employing the approximation measure $|u|_{\mathcal{A}_s}$ to measure the regularity of $u$ is somewhat circular or unsatisfying, as we do not refer here to intrinsic regularity properties of $u$. It is in fact possible to make some statements about the relationship of $\mathcal{A}_s$ with certain Besov spaces, but the relationship is not entirely straightforward. This relationship will be discussed more below.

We first prove some important technical lemmas. The first establishes that, in essence, any refinement achieving sufficient error reduction must result from a Dörfler marking strategy. This lemma will be used to bound the number of marked elements in the Dörfler marking.
Lemma 3.2.12 Let $C_{rel}$ be the constant from the local upper bound (3.23) and $C_{eff}$ the constant from the efficiency estimate (3.26). Assume that the Dörfler marking parameter $\theta$ satisfies $\theta < \frac{1}{C_{eff}(C_{rel}+1)}$. Then for $\mathcal{T}_\ell \subset \mathcal{T} \in \mathbb{T}$ satisfying
$$\|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}) \le [1 - \theta C_{eff}(C_{rel}+1)]\big(\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)\big), \qquad (3.52)$$
there holds
$$\eta_\ell(\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}}) \ge \theta\,\eta_\ell(\mathcal{T}_\ell). \qquad (3.53)$$

Proof. We add the inequalities
$$\|u - u_\ell\|_{H_0^1(\Omega)} \le \|u_\ell - u_{\mathcal{T}}\|_{H_0^1(\Omega)} + \|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)}, \qquad (3.54)$$
$$\mathrm{osc}(\mathcal{T}_\ell) \le \mathrm{osc}(\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}}) + \mathrm{osc}(\mathcal{T}) \qquad (3.55)$$
to obtain
$$\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell) \le \|u_\ell - u_{\mathcal{T}}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}}) + \|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}).$$
Employing (3.26), a rearranged version of (3.52), a rearranged version of (3.55), and (3.23) then yields
$$\begin{aligned}
\theta(C_{rel}+1)\eta_\ell(\mathcal{T}_\ell) &\le \theta C_{eff}(C_{rel}+1)\big(\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)\big) \\
&\le \|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell) - \|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)} - \mathrm{osc}(\mathcal{T}) \\
&\le \|u_\ell - u_{\mathcal{T}}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}}) \\
&\le (C_{rel}+1)\eta_\ell(\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}}). \qquad (3.56)
\end{aligned}$$
Here we have also employed the fact that $\mathrm{osc}(T) \le \eta_\ell(T)$, $T \in \mathcal{T}_\ell$. Dividing through by $C_{rel}+1$ completes the proof. $\Box$
Lemma 3.2.13 Let $u \in \mathcal{A}_s$ for some $s > 0$. Under the assumptions of Lemma 3.2.12, the collection of marked elements defined by the Dörfler marking strategy (3.38) satisfies
$$\#\mathcal{M}_\ell \lesssim |u|_{\mathcal{A}_s}^{1/s}\big[\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)\big]^{-1/s}. \qquad (3.57)$$

Proof. We first assert that there exists a mesh $\mathcal{T}' \in \mathbb{T}$ such that
$$\#\mathcal{T}' - \#\mathcal{T}_0 \lesssim |u|_{\mathcal{A}_s}^{1/s}\Big([1 - \theta C_{eff}(C_{rel}+1)]\big[\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)\big]\Big)^{-1/s} \qquad (3.58)$$
and $u_{\mathcal{T}'} \in V_{\mathcal{T}'}$ such that
$$\|u - u_{\mathcal{T}'}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}') \le [1 - \theta C_{eff}(C_{rel}+1)]\big[\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)\big]. \qquad (3.59)$$
To see this, choose $\mathcal{T}'$ of minimal cardinality so that (3.59) holds; (3.58) then also holds for a mesh of this cardinality by definition of the approximation class $\mathcal{A}_s$. Let now $\mathcal{T} \in \mathbb{T}$ be the smallest common refinement of $\mathcal{T}_\ell$ and $\mathcal{T}'$. It can be shown that $\#\mathcal{T} - \#\mathcal{T}_\ell \le \#\mathcal{T}' - \#\mathcal{T}_0$. Note that oscillation is monotone under refinement; that is, $\mathcal{T} \subset \mathcal{T}'$ implies $\mathrm{osc}(\mathcal{T}') \le \mathrm{osc}(\mathcal{T})$. Because $V_{\mathcal{T}'} \subset V_{\mathcal{T}}$, Céa's Lemma then yields
$$\|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}) \le \|u - u_{\mathcal{T}'}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}') \le [1 - \theta C_{eff}(C_{rel}+1)]\big[\|u - u_\ell\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_\ell)\big]. \qquad (3.60)$$
By Lemma 3.2.12, we thus have $\eta_\ell(\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}}) \ge \theta\,\eta_\ell(\mathcal{T}_\ell)$. Because $\mathcal{M}_\ell$ is the smallest subset of $\mathcal{T}_\ell$ satisfying the latter inequality, we have
$$\#\mathcal{M}_\ell \le \#\mathcal{R}_{\mathcal{T}_\ell\to\mathcal{T}} \le \#\mathcal{T} - \#\mathcal{T}_\ell \le \#\mathcal{T}' - \#\mathcal{T}_0. \qquad (3.61)$$
The result then follows from (3.58). $\Box$
We finally prove that AFEM converges with the best possible rate $s$, in the sense that if $u \in \mathcal{A}_s$, then AFEM chooses a sequence of meshes such that the error decreases with rate $s$.
Theorem 3.2.14 Let $u \in \mathcal{A}_s$ for some $s > 0$. Under the previous assumptions, it holds that
$$\|u - u_{\ell-1}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_{\ell-1}) \lesssim (\#\mathcal{T}_\ell - \#\mathcal{T}_0)^{-s}|u|_{\mathcal{A}_s}. \qquad (3.62)$$

Proof. Combining (3.37) and (3.57) yields
$$\#\mathcal{T}_\ell - \#\mathcal{T}_0 \lesssim \sum_{i=0}^{\ell-1}\#\mathcal{M}_i \lesssim |u|_{\mathcal{A}_s}^{1/s}\sum_{i=0}^{\ell-1}\big[\|u - u_i\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_i)\big]^{-1/s}. \qquad (3.63)$$
From (3.50) we have
$$\|u - u_{\ell-1}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_{\ell-1}) \lesssim \rho^{\ell-1-i}\big[\|u - u_i\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_i)\big], \qquad (3.64)$$
so that for $0 \le i \le \ell - 1$
$$\big[\|u - u_i\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_i)\big]^{-1/s} \lesssim \rho^{(\ell-1-i)/s}\big[\|u - u_{\ell-1}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_{\ell-1})\big]^{-1/s}. \qquad (3.65)$$
Combining (3.65) and (3.63) then yields
$$\#\mathcal{T}_\ell - \#\mathcal{T}_0 \lesssim |u|_{\mathcal{A}_s}^{1/s}\big[\|u - u_{\ell-1}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_{\ell-1})\big]^{-1/s}\sum_{i=0}^{\ell-1}\rho^{(\ell-1-i)/s} \lesssim |u|_{\mathcal{A}_s}^{1/s}\big[\|u - u_{\ell-1}\|_{H_0^1(\Omega)} + \mathrm{osc}(\mathcal{T}_{\ell-1})\big]^{-1/s}, \qquad (3.66)$$
with the last step following from the fact that $\rho^{1/s} < 1$ and so the last sum is a partial sum of a geometric series. Rearranging completes the proof. $\Box$
3.2.6 Approximation classes and Besov spaces
In this subsection we shall try to understand better which functions lie in the approximation classes $\mathcal{A}_s$. The answer we give is only a partial answer in two senses. First, it does not take into account the role that data oscillation plays in membership of $u$ in a given approximation class. Secondly, we do not quite obtain an "if-and-only-if" statement telling us that $u \in \mathcal{A}_s$ only when $u$ lies in some particular function space. However, the answer we obtain nonetheless gives us meaningful information about the smoothness needed in order for AFEM to converge with a given rate. The answer to this question was originally given in [13] for space dimension $d = 2$ and polynomial degree $r = 1$. Some of the definitions we use below and a broader overview of nonlinear approximation theory may be found in [23]; this reference does not treat finite element approximation but is nonetheless a useful reference. Finally, [30, 29] prove equivalences in the context of higher space dimensions and higher polynomial degree. The latter reference additionally includes consideration of data oscillation, and the full results in the latter two references require use of a type of generalized Besov space.
As a first step we define Besov spaces. The Besov space $B_q^\alpha(L^p(\Omega))$ has three indices: a smoothness index $\alpha$, an integrability index $p$, and a "fine tuning" index $q$. Here $\alpha \ge 0$ and $0 < p, q \le \infty$. Note carefully that the integrability index here may be less than 1, in contrast to classical $L^p$ spaces. This fact plays an important role in understanding AFEM convergence. Also, the smoothness and integrability indices should be quite intuitive to those familiar with Sobolev spaces, while the third index $q$ is rather more mysterious. To give at least a small amount of intuition about it, though, we think briefly of the case $L^\infty(\Omega)$. Here the space $L^\infty$ of essentially bounded functions and the space $C(\Omega)$ of continuous and bounded functions both have the same smoothness and integrability, but are not the same spaces. The spaces $BMO$ (functions of bounded mean oscillation) and $VMO$ (vanishing mean oscillation) also lie in the same place in the smoothness-integrability scale, giving a total of four spaces corresponding to $\alpha = 0$, $p = \infty$. These are all different spaces with different properties and uses. Thus we see that the usual Sobolev scale does not always allow us to distinguish properly between different types of functions.

There are various definitions of Besov spaces available (most of them equivalent under reasonable assumptions); we now give such a definition. We begin by defining a difference operator. Let $h \in \mathbb{R}^d$
and $k \in \mathbb{N}$. For $x \in G \subset \mathbb{R}^d$, the $h$-difference of order $k$ is given by
$$\Delta_h^k(f, x, G) := \begin{cases}\displaystyle\sum_{j=0}^k(-1)^{k+j}\binom{k}{j}f(x + jh), & [x, x + kh] \subset G, \\ 0, & \text{otherwise}.\end{cases} \qquad (3.67)$$
Here $[x, x + kh]$ is the prism $(x + [0, kh_1]) \times \cdots \times (x + [0, kh_d])$. We also define the modulus of smoothness of order $k$ in $L^p(G)$ as
$$\omega_k(f, t)_p = \omega_k(f, t, G)_p := \sup_{|h|\le t}\|\Delta_h^k(f, \cdot, G)\|_{L^p(G)}, \quad t > 0. \qquad (3.68)$$
We next define the Besov semi-(quasi)norm $|\cdot|_{B_q^\alpha(L^p(\Omega))}$. Given $\alpha > 0$ and $0 < p, q \le \infty$, and any $r \in \mathbb{N}$ such that $\alpha < r + \max\{1, 1/p\} = r + 1/p^*$ for $p^* = \min\{1, p\}$, the Besov space $B_q^\alpha(L^p(\Omega))$ is the set of all $f \in L^p(\Omega)$ such that the semi-(quasi)norm $|f|_{B_q^\alpha(L^p(\Omega))}$ is finite, with
$$|f|_{B_q^\alpha(L^p(\Omega))} := \begin{cases}\displaystyle\Big(\int_0^\infty\big[t^{-\alpha}\omega_{r+1}(f, t)_p\big]^q\,\frac{dt}{t}\Big)^{1/q}, & 0 < q < \infty, \\[1ex] \displaystyle\sup_{t>0}\,t^{-\alpha}\omega_{r+1}(f, t)_p, & q = \infty.\end{cases} \qquad (3.69)$$
The (quasi)norm of $B_q^\alpha(L^p(\Omega))$ is then
$$\|f\|_{B_q^\alpha(L^p(\Omega))} = \|f\|_{L^p(\Omega)} + |f|_{B_q^\alpha(L^p(\Omega))}. \qquad (3.70)$$
This is only a quasinorm when $p < 1$, because in that case the triangle inequality is only satisfied up to a constant.
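The following sketch numerically approximates the modulus of smoothness (3.68) on an interval, using the difference operator (3.67); the sampling parameters and function names are assumptions made purely for illustration.

```python
import numpy as np
from math import comb

def modulus_of_smoothness(f, k, t, p, a=-1.0, b=1.0, nx=2000, nh=50):
    """Approximate omega_k(f, t)_p on G = [a, b]: the sup over 0 < h <= t of the
    L^p(G) norm of Delta_h^k(f, x, G), set to zero whenever x + k*h leaves G."""
    xs = np.linspace(a, b, nx)
    best = 0.0
    for h in np.linspace(t / nh, t, nh):
        diff = np.zeros_like(xs)
        inside = xs + k * h <= b            # keep only points with [x, x + kh] in G
        for j in range(k + 1):
            diff[inside] += (-1) ** (k + j) * comb(k, j) * f(xs[inside] + j * h)
        best = max(best, np.trapz(np.abs(diff) ** p, xs) ** (1.0 / p))
    return best

# Example: a jump function has omega_1(f, t)_2 ~ t^(1/2), reflecting that sign(x)
# lies in B^{1/2}_infinity(L^2) on this smoothness scale.
for t in [0.1, 0.01]:
    print(t, modulus_of_smoothness(np.sign, 1, t, 2))
```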
Other definitions of Besov spaces exist and are generally equivalent, at least if $\partial\Omega$ is sufficiently regular. We briefly discuss the real interpolation method by way of example, or more precisely the J-method of real interpolation. We follow [2, Chapter 7]. Assume that we are given Banach spaces $X_0, X_1$ with nontrivial intersection, both lying in a larger Hausdorff topological vector space $X$. Note that $X_0 + X_1$ and $X_0 \cap X_1$ are then also both Banach spaces. Also, we assume that $X_0 \cap X_1$ is nontrivial. We define for $t > 0$ the J-norm
$$J(t; u) = \max\{\|u\|_{X_0}, t\|u\|_{X_1}\}. \qquad (3.71)$$
If $0 \le \theta \le 1$ and $1 \le q \le \infty$, we denote by $(X_0, X_1)_{\theta,q;J}$ the space of all $u \in X_0 + X_1$ with
$$u = \int_0^\infty f(t)\,\frac{dt}{t} \qquad (3.72)$$
for some $f \in L^1(0, \infty; dt/t, X_0 + X_1)$ having values in $X_0 \cap X_1$ and such that the real-valued function $t \mapsto t^{-\theta}J(t; f(t))$ belongs to $L^q(0, \infty; dt/t, \mathbb{R})$. (Here $g \in L^q(0, \infty; dt/t, \mathbb{R})$ if $\int_0^\infty|g(t)|^q\,\frac{dt}{t} < \infty$ for $1 \le q < \infty$, and similarly for $q = \infty$. We also denote this space by $L^q_*$.) We then have the following ([2, Theorem 7.13]).
Theorem 3.2.15 If either $1 < q \le \infty$ and $0 < \theta < 1$, or $q = 1$ and $0 \le \theta \le 1$, then $(X_0, X_1)_{\theta,q;J}$ is a nontrivial Banach space with norm
$$\|u\|_{\theta,q;J} = \inf_{f\in S(u)}\|t^{-\theta}J(t; f(t)); L^q_*\| = \inf_{f\in S(u)}\Big(\int_0^\infty\big[t^{-\theta}J(t; f(t))\big]^q\,\frac{dt}{t}\Big)^{1/q}, \quad q < \infty, \qquad (3.73)$$
where
$$S(u) = \Big\{f \in L^1(0, \infty; dt/t, X_0 + X_1) : u = \int_0^\infty f(t)\,\frac{dt}{t}\Big\}. \qquad (3.74)$$
Furthermore,
$$\|u\|_{X_0 + X_1} \le \|t^{-\theta}\min\{1, t\}; L^{q'}_*\|\,\|u\|_{\theta,q;J} \le \max\{\|u\|_{X_0}, \|u\|_{X_1}\}. \qquad (3.75)$$
Thus
$$X_0 \cap X_1 \hookrightarrow (X_0, X_1)_{\theta,q;J} \hookrightarrow X_0 + X_1, \qquad (3.76)$$
that is, $(X_0, X_1)_{\theta,q;J}$ is an intermediate space between $X_0$ and $X_1$.

One can then define Besov spaces by real interpolation. Let $0 < \alpha < \infty$, $1 \le p < \infty$, and $1 \le q \le \infty$. Also, let $m$ be the smallest integer greater than $\alpha$. Then we have
$$B_q^\alpha(L^p(\Omega)) = (L^p(\Omega), W^{p,m}(\Omega))_{\alpha/m,q;J}. \qquad (3.77)$$
Now we return to our goal of gaining intuition about the relationship between the intrinsic smoothness of $u$ and its membership in a given approximation class. Here a technical issue arises. Our original definition of approximation classes given in (3.51) includes data oscillation. It is possible to characterize approximation classes which incorporate data oscillation in a manner similar to our development below, but the imbedding spaces that are used are not as easily characterized by referring to classical spaces; cf. [29]. We thus omit data oscillation, or in essence assume that $f$ is piecewise polynomial on $\mathcal{T}_0$. We are still able to gain meaningful intuition about approximation classes by doing so. Accordingly, we now redefine our approximation class in order to omit data oscillation. Let
$$|u|_{\mathcal{A}_s} := \sup_{N\ge0} N^s \inf_{\mathcal{T}\in\mathbb{T},\,\#\mathcal{T}-\#\mathcal{T}_0=N}\ \inf_{u_{\mathcal{T}}\in V_{\mathcal{T}}}\|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)}. \qquad (3.78)$$
Then $u \in \mathcal{A}_s$ if $|u|_{\mathcal{A}_s} < \infty$.

From [30, Theorem 2.2], we have the following. Recall that $r$ is the polynomial degree of the finite element space and $d$ is the space dimension.
Theorem 3.2.16 Assume that $u \in B_\tau^{s+1}(L^\tau(\Omega))$ with $\frac{1}{\tau} < \frac{s}{d} + \frac12$ and $0 \le s \le r$. Then $u \in \mathcal{A}_{s/d}$, and
$$|u|_{\mathcal{A}_{s/d}} \lesssim \|u\|_{B_\tau^{1+s}(L^\tau(\Omega))}. \qquad (3.79)$$
More precisely, given $N \ge 1$,
$$\inf_{\mathcal{T}\in\mathbb{T},\,\#\mathcal{T}-\#\mathcal{T}_0=N}\ \inf_{u_{\mathcal{T}}\in V_{\mathcal{T}}}\|u - u_{\mathcal{T}}\|_{H_0^1(\Omega)} \lesssim N^{-s/d}|u|_{B_\tau^{s+1}(L^\tau(\Omega))}. \qquad (3.80)$$
The indices in the theorem may appear somewhat daunting. A function space diagram (DeVore diagram) can be helpful in interpreting them; see Figure 3.1. We may roughly restate Theorem 3.2.16 by saying that $u \in \mathcal{A}_{s/d}$ if $u$ lies in a Besov space $B_\tau^{s+1}(L^\tau(\Omega))$ which compactly embeds into $H^1(\Omega)$ (this is always true when the assumptions of Theorem 3.2.16 are satisfied).

[Figure 3.1: DeVore diagram illustrating Theorem 3.2.16. Order $s$ convergence is obtained if $u \in B_\tau^{s+1}(L^\tau(\Omega))$ with $(1/\tau, s+1)$ lying in the shaded region, but not on the line $\frac{1}{\tau} = \frac{s}{d} + \frac12$.]

A DeVore diagram has horizontal axis
$1/p$ (the inverse of the integrability index) and vertical axis $\alpha$ (the smoothness index). The "fine-tuning" index $q$ plays little role for the direct example given here and could be thought of as a third dimension on the diagram. Another way of stating this is that various spaces (Besov, Sobolev, etc.) may occupy the same point on such a diagram. In order to specify the spaces which compactly embed into $H^1(\Omega)$, we draw a line of slope $d$ through the point $(1/p, \alpha) = (1/2, 1)$ on the diagram. Besov spaces $B_\tau^{s+1}(L^\tau(\Omega))$ with $(1/\tau, s+1)$ lying strictly to the left of this line and with $s + 1 \ge 1$ then embed compactly into $H^1$. Solving this condition and adding in the natural convergence barrier $s \le r$ then gives the conditions of Theorem 3.2.16.

If $(1/\tau, s+1)$ lies on the line $1/\tau = s/d + 1/2$, then $B_\tau^{s+1}(L^\tau(\Omega))$ embeds continuously but not compactly into $H^1$, and we have no guarantee that AFEM converges with rate $s$; but we also have no indication that it does not converge with rate $s$. The following theorem (cf. [30, Theorem 2.5]) however indicates that if $u$ only lies in Besov spaces below the line $1/\tau = s/d + 1/2$, then we cannot achieve rate-$s$ convergence.
Theorem 3.2.17 Let $s > 0$ and $\frac{1}{\tau} = \frac{s}{d} + \frac{1}{2}$. Then
\mathcal{A}^{s/d} \subset B^{1+s}_\tau(L^\tau(\Omega)).     (3.81)
Note carefully that there is a slight gap between the results of Theorem 3.2.16 and those of Theorem 3.2.17. To see this, let us seek to determine what Besov regularity is required of $u$ in order to guarantee $u \in \mathcal{A}^{r/d}$ (which in turn gives us hope that our AFEM can approximate $u$ with the generally best possible rate $r/d$). We take $s = r$ in Theorem 3.2.16 and then find that we need $\frac{1}{\tau} < \frac{r}{d} + \frac{1}{2}$. Solving for $\tau$ yields the condition $\tau > \frac{2d}{2r+d}$, that is, $u \in B^{r+1}_{\frac{2d}{2r+d}+\epsilon}(L^{\frac{2d}{2r+d}+\epsilon}(\Omega))$ will guarantee that $u \in \mathcal{A}^{r/d}$. On the other hand, from Theorem 3.2.17, we have that $u \in \mathcal{A}^{r/d}$ implies only that $u \in B^{r+1}_{\frac{2d}{2r+d}}(L^{\frac{2d}{2r+d}}(\Omega))$. Thus membership in $\mathcal{A}^s$ is not precisely equivalent to membership in some specific Besov space; the Besov regularity required to guarantee $u \in \mathcal{A}^s$ is slightly stronger than that guaranteed by $u \in \mathcal{A}^s$.
         r = 1                                        r = 2                                          r = 3                                          r = 4
d = 2    $B^2_{1+\epsilon}(L^{1+\epsilon}(\Omega))$       $B^3_{2/3+\epsilon}(L^{2/3+\epsilon}(\Omega))$     $B^4_{1/2+\epsilon}(L^{1/2+\epsilon}(\Omega))$     $B^5_{2/5+\epsilon}(L^{2/5+\epsilon}(\Omega))$
d = 3    $B^2_{6/5+\epsilon}(L^{6/5+\epsilon}(\Omega))$    $B^3_{6/7+\epsilon}(L^{6/7+\epsilon}(\Omega))$     $B^4_{6/9+\epsilon}(L^{6/9+\epsilon}(\Omega))$     $B^5_{6/11+\epsilon}(L^{6/11+\epsilon}(\Omega))$

Table 3.1: Regularity needed to achieve convergence rate $r/d$ for various $r$, $d$.
In Table 3.1 we reproduce [30, Table 1], which gives the Besov regularity needed to achieve the generally best possible adaptive convergence rate of $r/d$ for $d = 2, 3$ and $r = 1, 2, 3, 4$. These convergence rates may be observed quite precisely in practice in simple situations, for example when solving $-\Delta u = 1$ on polyhedral domains.
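The integrability indices in Table 3.1 follow directly from the threshold $\tau > 2d/(2r+d)$ derived above; the short Python check below (purely illustrative) regenerates them:

from fractions import Fraction

# Critical integrability index tau = 2d/(2r+d): Besov smoothness r+1 with any
# integrability strictly above this value guarantees membership in A^{r/d}.
for d in (2, 3):
    row = [Fraction(2 * d, 2 * r + d) for r in range(1, 5)]
    print(f"d = {d}:", ", ".join(str(tau) for tau in row))

# d = 2: 1, 2/3, 1/2, 2/5
# d = 3: 6/5, 6/7, 2/3, 6/11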
We now return to some examples to understand the adaptive convergence rates observed computationally in Chapter 1. First consider the two-dimensional case $d = 2$, and assume that $u(x, y)$ solves $-\Delta u = 1$ on some polygonal domain $\Omega$. Then $u \sim \rho^\gamma$ for some $\gamma \ge 1/4$, depending on the situation ($\gamma = 1/4$ corresponds to mixed boundary conditions which change type at a crack vertex; cf. (2.6)). We wish to check whether we can achieve the "optimal" convergence rate $r/2$ given polynomial degree $r$. As already discussed, we require $u \in B^{r+1}_\tau(L^\tau(\Omega))$ with $\tau > \frac{4}{2r+2} = \frac{2}{r+1}$. Recall the heuristic that each derivative of $\rho^\gamma$ subtracts one power from the exponent. Taking $\gamma = 1/4$, we have $D^{r+1}\rho^{1/4} \sim \rho^{1/4-r-1} = \rho^{-r-3/4}$. Raising to the power $\tau$ and integrating in polar coordinates, we require $\int_0^1 \rho^{(-r-3/4)\tau+1} \, d\rho < \infty$, or $\tau(-r-3/4)+1 > -1$, which gives $\tau < \frac{2}{r+3/4}$. That is, $D^{r+1}\rho^{1/4}$ is $\tau$-integrable for all $\tau < \frac{2}{r+3/4}$. We however only need $\rho^{1/4} \in B^{r+1}_\tau(L^\tau(\Omega))$ for some $\tau > \frac{2}{r+1}$. Because $\frac{2}{r+1} < \frac{2}{r+3/4}$, we can always find $\tau$ satisfying both conditions, and it is possible to achieve an $O(\mathrm{DOF}^{-r/2})$ convergence rate.
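The same bookkeeping can be automated. The sketch below (our own illustration; symbols follow the discussion above) computes, for a corner singularity $\rho^\gamma$ in $d = 2$ and polynomial degree $r$, the interval of admissible integrability indices $\tau$ and reports whether the optimal rate $r/2$ is attainable:

def corner_tau_interval(gamma, r):
    """Admissible integrability indices for a 2D corner singularity rho^gamma.

    Need tau > 2/(r+1) (Besov scale giving rate r/2) and
    tau < 2/(r+1-gamma) (tau-integrability of D^{r+1} rho^gamma).
    """
    lower = 2.0 / (r + 1)
    upper = 2.0 / (r + 1 - gamma)
    return lower, upper, lower < upper

# gamma = 1/4 (crack with mixed boundary conditions), for several degrees r:
for r in (1, 2, 3, 4):
    lo, hi, ok = corner_tau_interval(0.25, r)
    print(f"r = {r}: {lo:.3f} < tau < {hi:.3f}, optimal rate attainable: {ok}")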
We can extend this observation to problems with stronger point singularities due to discontinuous coefficients, although these situations are not directly covered by our theory. Consider the problem $-\mathrm{div}(a\nabla u) = f$ with $a$ a (potentially discontinuous) scalar coefficient. Let $\Omega = (-1,1) \times (-1,1)$, and
a(x, y) = \begin{cases} b, & xy > 0, \\ 1, & \text{otherwise}. \end{cases}     (3.82)
Here we assume $b > 0$. If $b \ne 1$, then the coefficient $a$ is discontinuous in a way that forms a "checkerboard" pattern. Such problems were analyzed in a well-known paper of Kellogg [34]. We take our example from [14], where a slight variation of this problem is discussed and a theoretical analysis of the effects of coefficient regularity on AFEM construction and convergence rates is given. Choosing $f = 0$ and fixing appropriate boundary conditions for $u$, we find that the solution $u$ is given in polar coordinates by $u(\rho, \theta) = \rho^\alpha \mu(\theta)$ with $0 < \alpha < 2$ and
\mu(\theta) = \begin{cases}
\cos((\tfrac{\pi}{2} - \sigma)\alpha)\,\cos((\theta - \tfrac{\pi}{4})\alpha), & 0 \le \theta < \tfrac{\pi}{2}, \\
\cos(\tfrac{\pi}{4}\alpha)\,\cos((\theta - \pi + \sigma)\alpha), & \tfrac{\pi}{2} \le \theta < \pi, \\
\cos(\sigma\alpha)\,\cos((\theta - \tfrac{5\pi}{4})\alpha), & \pi \le \theta < \tfrac{3\pi}{2}, \\
\cos(\tfrac{\pi}{4}\alpha)\,\cos((\theta - \tfrac{3\pi}{2} - \sigma)\alpha), & \tfrac{3\pi}{2} \le \theta < 2\pi.
\end{cases}     (3.83)
Here $b$, $\alpha$, and $\sigma$ satisfy
b = -\tan\big((\tfrac{\pi}{2} - \sigma)\alpha\big)\,\cot\big(\tfrac{\pi}{4}\alpha\big), \qquad \frac{1}{b} = -\tan\big(\tfrac{\pi}{4}\alpha\big)\,\cot(\sigma\alpha), \qquad b = -\tan(\alpha\sigma)\,\cot\big(\tfrac{\pi}{4}\alpha\big),     (3.84)
and
\max(0, \pi(\alpha - 1)) < \tfrac{\pi}{2}\alpha < \min(\pi\alpha, \pi), \qquad \max(0, \pi(1 - \alpha)) < -2\alpha\sigma < \min(\pi, \pi(2 - \alpha)).     (3.85)
Setting $\alpha = 1/8$, Mathematica returns an approximate solution $b = 103.087$, $\sigma = -11.781$. Note that the coefficient $a$ has rather high contrast. The solution thus produced has principal singularity $\rho^{1/8}$, which is nastier than the $\rho^{1/4}$ singularity obtained from Poisson's problem on a crack domain with mixed boundary conditions. In similar fashion we may produce $u$ with $u \sim \rho^\gamma$ with $\gamma$ arbitrarily close to 0, and thus only in $H^{1+\epsilon}(\Omega)$ with $\epsilon$ arbitrarily close to 0. Using the same reasoning as in the case of corner singularities, even in these cases it is possible to recover the "optimal" convergence rate $\mathrm{DOF}^{-r/2}$ for $\|u - u_\ell\|_{H^1_0(\Omega)}$ by using adapted shape-regular conforming meshes. The question of whether AFEM will also recover this rate is more complicated because the algorithm must also approximate the coefficient $a$ suitably; see [14] for an in-depth discussion.
We now move to three-dimensional problems and consider the regularity of vertex and edge singularities. We first consider vertex singularities of the form $\rho^\alpha$ with $\rho$ the distance to a point. Given polynomial degree $r$, we have as above that $D^{r+1}\rho^\alpha \sim \rho^{\alpha-r-1}$. In order to test whether $D^{r+1}\rho^\alpha$ is $\tau$-integrable for $\tau > \frac{2d}{2r+d} = \frac{6}{2r+3}$, we use polar integration to write
\int_\Omega |D^{r+1}\rho^\alpha|^\tau \, dx \sim \int_0^1 |\rho^{\alpha-r-1}|^\tau \rho^2 \, d\rho \sim \int_0^1 \rho^{\tau(\alpha-r-1)+2} \, d\rho.     (3.86)
This integral is finite if $\tau(\alpha-r-1)+2 > -1$, or $\tau < \frac{3}{r+1-\alpha}$ (assuming that $\alpha-r-1 < 0$, which holds in interesting cases). Combined with the requirement $\tau > \frac{6}{2r+3}$, we thus require
\frac{6}{2r+3} < \tau < \frac{3}{r+1-\alpha}.
There are solutions $\tau$ to this relationship if
2(r+1-\alpha) < 2r+3,
or $\alpha > -1/2$. This condition is precisely the one required to guarantee that $\rho^\alpha \in H^1(\Omega)$, so as in the 2D case any such singularity lying in $H^1$ can always be approximated with the best possible rate $\mathrm{DOF}^{-r/3}$ using AFEM on shape-regular meshes. We thus may formulate the following heuristic: Vertex singularities lying in $H^1$ have infinite smoothness in the scale of Besov spaces used to measure approximability by classes of shape-regular simplicial adaptive meshes.
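A quick numerical sanity check of this heuristic (again a throwaway sketch with our own function names) confirms that the $\tau$-interval is nonempty exactly when $\alpha > -1/2$:

def vertex_tau_interval(alpha, r):
    """tau-interval for a 3D vertex singularity rho^alpha and degree r.

    Need tau > 6/(2r+3) and tau < 3/(r+1-alpha); the interval is nonempty
    precisely when alpha > -1/2, i.e. when rho^alpha lies in H^1.
    """
    lower = 6.0 / (2 * r + 3)
    upper = 3.0 / (r + 1 - alpha)
    return lower, upper, lower < upper

for alpha in (-0.75, -0.5, 0.25, 0.6):
    lo, hi, ok = vertex_tau_interval(alpha, r=2)
    print(f"alpha = {alpha:+.2f}: {lo:.3f} < tau < {hi:.3f}? nonempty: {ok}")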
We now turn to edge singularities. Here the situation is markedly different, as we see below. We first return to the case of a polyhedral domain having maximum edge opening angle $\omega_e = \frac{7\pi}{8}$, as in Figure 1.12. Recall that the edge singularity at this edge is of the form $\rho_e^{\pi/\omega_e} = \rho_e^{8/7}$, where $\rho_e$ is the distance to the given edge $e$. In Figure 1.13 we observed an adaptive convergence rate of $\mathrm{DOF}^{-1.12}$ in $H^1_0(\Omega)$ when using polynomials of degree $r = 4$. We now explain this rate. We require
$u \in B^{s+1}_\tau(L^\tau(\Omega))$ with $\frac{1}{\tau} < \frac{s}{d} + \frac{1}{2}$, or $\tau > \frac{2d}{2s+d}$. We now test the $\tau$-integrability of $D^{s+1}\rho_e^{8/7}$. As for vertex singularities, we have $D^{s+1}\rho_e^{8/7} \sim \rho_e^{8/7-s-1}$. Also, using now cylindrical integration, we have
\int_\Omega |D^{s+1}\rho_e^{8/7}|^\tau \, dx \sim \int_0^1 \int_0^{7\pi/8} \int_0^1 |\rho_e^{8/7-s-1}|^\tau \rho_e \, d\rho_e \, d\theta \, dz \sim \int_0^1 |\rho_e^{8/7-s-1}|^\tau \rho_e \, d\rho_e.
Assuming that $8/7-s-1 < 0$ (which again holds in the cases of interest to us), this expression is finite if $\tau(8/7-s-1)+1 > -1$, or $\tau < \frac{2}{s+1-8/7}$. Thus in order to achieve order $s/d$ convergence, we require
\frac{6}{2s+3} < \tau < \frac{2}{s+1-8/7}.
Solving for $s$, we find that there exists such a $\tau$ only if $\frac{6}{2s+3} < \frac{2}{s+1-8/7}$, or $s < \frac{24}{7}$. This yields a convergence rate of $s/3 < \frac{8}{7} \approx 1.142$, which is quite close to our observed adaptive convergence rate of 1.12 in Figure 1.13. Note that using polynomial degree $r = 3$ will lead to the generally best possible rate of $\mathrm{DOF}^{-1}$, while increasing the polynomial degree above 4 will never yield a convergence rate better than $\mathrm{DOF}^{-8/7+\epsilon}$.
This situation reminds us to be careful when stating that a method is "optimal", as the above method is optimal in one sense but suboptimal in another. We have said that an AFEM is rate-optimal if it achieves the best possible convergence rate over all conforming meshes obtainable by systematic bisection from $\mathcal{T}_0$. Our AFEM is indeed optimal in this sense, and the above calculations and numerical computations confirm that this is so. On the other hand, we also think of our method as being optimal if it achieves the best possible convergence rate with respect to the polynomial degree, that is, $\mathrm{DOF}^{-r/d}$. In this sense the method we use above, AFEM with polynomial degree $r = 4$, is not optimal as it does not achieve a rate $\mathrm{DOF}^{-4/3}$. Anisotropic elements would be needed to recover an optimal convergence rate in this sense; cf. [7].
We now generalize the calculation. Assume an edge opening angle of $\omega_e$, which yields an edge singularity for Poisson's problem with homogeneous Dirichlet boundary conditions of the form $\rho_e^{\pi/\omega_e}$. Then $D^{s+1}\rho_e^{\pi/\omega_e} \sim \rho_e^{\pi/\omega_e - s - 1}$, and
\int_\Omega |D^{s+1}\rho_e^{\pi/\omega_e}|^\tau \, dx \sim \int_0^1 \rho_e^{\tau(\pi/\omega_e - s - 1)+1} \, d\rho_e,
which is finite when $\tau(\pi/\omega_e - s - 1) + 1 > -1$. That is, $\tau < \frac{2}{s+1-\pi/\omega_e}$. Combining with the condition $\tau > \frac{2d}{2s+d}$ yields
\frac{6}{2s+3} < \tau < \frac{2}{s+1-\pi/\omega_e},
which when solved for $s$ yields $s < \frac{3\pi}{\omega_e}$. Thus the best possible rate that can generally be achieved is $\frac{s}{d} < \frac{\pi}{\omega_e}$. More precisely, an AFEM for solving $-\Delta u = 1$ with homogeneous Dirichlet boundary conditions on a polyhedral domain with maximum edge opening angle $\omega_e$ will achieve a convergence rate of
\|u - u_\ell\|_{H^1_0(\Omega)} \lesssim \mathrm{DOF}^{-s/3}, \qquad \frac{s}{3} = \min\Big( \frac{r}{3}, \frac{\pi}{\omega_e} - \epsilon \Big).
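Written as code, the predicted rate is a one-liner; the sketch below (again just our illustration) reproduces the $\omega_e = 7\pi/8$, $r = 4$ example from above:

import math

def predicted_edge_rate(r, omega_e):
    """Predicted adaptive H^1_0 rate s/3 = min(r/3, pi/omega_e), ignoring the
    epsilon loss, for shape-regular AFEM on a polyhedral domain in 3D."""
    return min(r / 3.0, math.pi / omega_e)

# Edge opening angle 7*pi/8 with quartics: rate ~ 8/7 ~ 1.14 (cf. Figure 1.13).
print(predicted_edge_rate(4, 7 * math.pi / 8))
# A crack-type edge (opening angle 2*pi): no degree beyond quadratics helps.
for r in (1, 2, 3, 4):
    print(r, predicted_edge_rate(r, 2 * math.pi))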
Recalling that the maximum edge opening angle is $2\pi$, there are cases where the best possible convergence rate even with high-degree polynomials is $\frac{\pi}{2\pi} = \frac{1}{2}$. This rate is already achievable with quadratics, and using shape-regular finite elements of any higher degree would simply waste degrees of freedom. If we were to take mixed instead of Dirichlet boundary conditions, the singularity strength could be so severe that even using quadratic elements would not be helpful.

Figure 3.2: Two-brick domain.
We finally briefly discuss the situation when we wish to approximate $u$ in $L^p(\Omega)$. Here rigorous proof of AFEM optimality is only available when $p = 2$ (and with the restriction that $\Omega$ is convex), but the same principles can generally be used to guess optimal convergence rates. We consider an example from [21] in which $u$ is approximated in $L^\infty(\Omega)$. Our $L^\infty$ AFEM is structured as follows. Let
\eta_{\infty;\ell}(T) = h_T^2 \|f + \Delta u_\ell\|_{L^\infty(T)} + h_T \|\llbracket \nabla u_\ell \rrbracket\|_{L^\infty(\partial T)}.
Then [21, 22]
\|u - u_\ell\|_{L^\infty(\Omega)} \lesssim \big(1 + |\ln \min_{T \in \mathcal{T}_\ell} h_T|\big) \max_{T \in \mathcal{T}_\ell} \eta_{\infty;\ell}(T).
We now use a maximum strategy to mark elements for refinement. That is, for marking parameter $0 < \theta \le 1$,
\mathcal{M}_\ell = \{ T \in \mathcal{T}_\ell : \eta_{\infty;\ell}(T) \ge \theta \max_{T' \in \mathcal{T}_\ell} \eta_{\infty;\ell}(T') \}.
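The maximum marking strategy is simple to implement; here is a minimal sketch (our own code, assuming the elementwise indicators have already been computed and stored in an array):

import numpy as np

def maximum_marking(eta, theta=0.5):
    """Return indices of elements T with eta(T) >= theta * max_T' eta(T').

    eta   : array of elementwise indicators eta_{infty;l}(T)
    theta : marking parameter in (0, 1]
    """
    return np.flatnonzero(eta >= theta * eta.max())

# Example: mark all elements within a factor 2 of the largest indicator.
eta = np.array([0.1, 0.9, 0.55, 0.05, 1.0])
print(maximum_marking(eta, theta=0.5))   # [1 2 4]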
We run AFEM on the two-brick domain that we considered in Figure 2.2 (cf. Figure 2.3). We reproduce the domain in Figure 3.2. We used polynomial degree $r = 2, 3$, which produced the sample adaptive mesh in Figure 3.3 and the observed convergence rates seen in Figure 3.4. In $L^p$ norms, the best convergence rate that we can hope for a priori is $\|u - u_h\|_{L^p(\Omega)} \le C h^{r+1}$. Similarly, in an adaptive setting we aim for $\|u - u_\ell\|_{L^p(\Omega)} \lesssim \mathrm{DOF}^{-(r+1)/d}$. Thus measuring the error in $L^\infty$ in a 3D calculation, we would like to see a convergence rate of $\mathrm{DOF}^{-2/3}$ when using linears ($r = 1$), $\mathrm{DOF}^{-1}$ when using quadratics ($r = 2$), etc. In Figure 3.4 we however only see a convergence rate of $\mathrm{DOF}^{-2/3}$, already achievable using piecewise linears, even if we use quadratics or cubics.
Figure 3.3: Two-brick domain; adaptive mesh.

Figure 3.4: Two-brick domain: Convergence rates ($L^\infty$ error and $P_2$, $P_3$ estimators versus DOF; reference slope $-2/3$).

To understand these convergence rates, recall that $u$ is approximable with rate $s/d$ in $H^1_0(\Omega)$ if $u$ also lies in a Besov space with smoothness index $s+1$ (and equal integrability and fine-tuning indices) which compactly imbeds into $H^1_0(\Omega)$. We can analogously guess that $u$ can be approximated with
rate $(s+1)/d$ in $L^\infty$ if $u$ also lies in a Besov space $B^{s+1}_\tau(L^\tau(\Omega))$ which compactly imbeds into $L^\infty(\Omega)$. A DeVore diagram tells us that the relevant condition is $s+1 > \frac{3}{\tau}$, that is, we need integrability $\tau > \frac{3}{s+1}$. On the other hand, the edge singularity at the L-shaped edges is of strength $\rho_e^{\pi/(3\pi/2)} = \rho_e^{2/3}$. As above, $\int_\Omega |D^{s+1}\rho_e^{2/3}|^\tau \, dx$ is finite when $\tau < \frac{2}{s+1/3}$, so overall we require
\frac{3}{s+1} < \tau < \frac{2}{s+1/3}.
These conditions are solvable when $s < 1$, yielding a convergence rate of $(s+1)/d < \frac{2}{3}$. This is precisely what is observed in Figure 3.4. We emphasize again that in this case, using any element degree above $r = 1$ simply wastes computational effort, as we see in Figure 3.4 (the error line for $r = 3$ is above that for $r = 2$, so when we use cubics more degrees of freedom are required to achieve a given error level).
3.3 Other a posteriori error estimation techniques
Above we have focused on residual-type a posteriori error estimates. Such estimates continue to be widely studied and employed, and they are the easiest and most natural to work with in the context of AFEM convergence theory. However, they also have some drawbacks, and there are quite a number of other a posteriori error estimation techniques with varying applications and properties. As above, we assume that our goal is to approximate a solution to an elliptic boundary value problem, say $-\Delta u = f$ or $-\mathrm{div}(A\nabla u) = f$ with appropriate boundary conditions prescribed. Some of the issues one might consider in choosing an estimator are:
1. What information about $u$ are you trying to obtain from the calculation? Do you want to control the error $u - u_h$ in some norm, or only in some derived quantity of interest?
2. How much are you willing to pay computationally in order to achieve more accurate error estimation?
3. How accurately must your estimator track the actual error?
4. Are you mainly interested in driving adaptivity, or in estimating computational errors?
5. How flexible do you want your estimator to be with respect to coding and differential operator? That is, are you willing to accept an estimator that must be substantially modified when you change the differential operator, or do you have a strong preference for one that is more portable?
6. Do you insist on rigorous and provable a posteriori upper bounds (reliability), or are you willing to accept an estimator with guaranteed properties only on sufficiently fine meshes?
7. Are you using only low-order finite element methods, or higher-degree (or $p$ or $hp$) methods for which polynomial-degree robustness matters?
We illustrate varying approaches to balancing the demands posed by these questions by giving a few examples of other a posteriori error estimation techniques. Our set of examples is decidedly non-exhaustive and should not be viewed as a survey of current techniques.
3.3.1 Goal-oriented adaptivity
The essence of goal-oriented adaptivity is that one should first identify the desired output from a given calculation, then design an adaptive algorithm that will efficiently reduce the error in producing that output. This idea was popularized in the late 90's and early 2000's and continues to be a viable and important part of the a posteriori error estimation and adaptivity landscape. An important reference is the book [8] of Bangerth and Rannacher; cf. the survey article [11] of Becker and Rannacher. Also, one often hears the term "dual-weighted residual method" (DWR) in the context of goal-oriented adaptivity, and the two terms are sometimes used almost interchangeably. Goal-oriented adaptivity is more properly seen as the general approach to a posteriori error estimation, while dual-weighted residual methods are the means used to implement goal-oriented adaptivity.
The core assumption of goal-oriented adaptivity is that one wishes to compute some functional $J(u)$ of the PDE solution. The functional $J$ may not be closely related to the energy norm or other $L^p$-type norms that are typically used in finite element error analysis. An important feature of $J$ is that it most often is locally supported within the overall computational domain $\Omega$. We also mention that analysis of such methods is much simpler when $J$ is linear and continuous in the $H^1$ norm, but the methodology may still be quite effective even if one or both of these assumptions is violated. We give an illustrative example from [8, Chapter 1]. Consider momentarily the classical incompressible Navier-Stokes equations
\partial_t v - \nu \Delta v + v \cdot \nabla v + \nabla p = f, \qquad \nabla \cdot v = 0.
Here $v$ is the fluid velocity and $p$ the pressure. We assume the problem is posed on a domain $\Omega$ consisting of a rectangle with a small ball removed; we denote by $S$ the boundary of the ball. Solving the Navier-Stokes equations with appropriate boundary conditions will allow us to compute the flow around this obstacle. Our goal here is to compute the drag coefficient of the obstacle, given by
J(v, p) := c_{\mathrm{drag}} := \frac{2}{U^2 D} \int_S n^T (2\nu\tau - pI) \, d \, ds,
where $D$ is the diameter of the ball, $U$ is the maximal inflow velocity, $\tau = \frac{1}{2}(\nabla v + \nabla v^T)$ is the strain tensor, and $d = (0, 1)^T$ is the main flow direction. (Some parameters may be a little unclear here; we are only after the general structure of $J$.) $J$ is then linear in $v$ and $p$, but it is not bounded in $H^1 \times L^2$ (a natural energy space here) because it involves surface integrals of $\nabla v$ and $p$. Thus controlling the energy error here will not guarantee control of the error in computing $J$. On the other hand, $J$ is locally supported on the surface $S$, which leads us to ask how accurately we must compute $(v, p)$ in regions of $\Omega$ removed from $S$ in order to guarantee accurate computation of $J(v, p)$. The dual-weighted-residual methodology gives us the ability to approach these questions in a meaningful and computationally efficient manner.
We return to the model problem $-\Delta u = f$, $u = 0$ on $\partial\Omega$ in order to give more details (cf. [8, Chapter 3] for much of this discussion). Our goal is to control the error $J(u) - J(u_h)$ in approximating the goal functional $J(u)$. The DWR approach involves computing a dual solution that may be viewed as a response function or generalized Green's function for the functional $J$ of interest. In particular, we let $a(u, v) := (\nabla u, \nabla v)$, and let $z$ solve
a(\varphi, z) = J(\varphi), \qquad \varphi \in H^1_0(\Omega).
The finite element approximation $z_h \in S_h$ to $z$ is given by
a(\varphi_h, z_h) = J(\varphi_h), \qquad \varphi_h \in S_h.
Assuming now that $J$ is linear, we have for any $\psi_h \in S_h$ that
J(u) - J(u_h) = J(u - u_h) = a(u - u_h, z) = a(u - u_h, z - \psi_h).     (3.87)
Computationally we have access to $u_h$ and $z_h$, but not to $u$ and $z$, and so we can attempt to build an a posteriori error estimate out of $u_h$ and $z_h$. There are multiple options for approaching this task. The most primitive is to set $\psi_h = z_h$ and use standard residual (or other) estimators for measuring the product $\|u - u_h\|_{H^1(\Omega)} \|z - z_h\|_{H^1(\Omega)}$. This approach assumes that $z \in H^1(\Omega)$ and costs us the ability to take advantage of local interactions between $u - u_h$ and $z - z_h$, but it is simple and straightforward, basically allows use of just about any a posteriori error estimator off the shelf, and still may lead to significantly faster error decrease than considering the error $\|u - u_h\|$ in isolation. A second approach based on the parallelogram identity is also possible; cf. [3, Chapter 8].
Returning to (3.87) and following again [8], we now take $\psi_h \in S_h$ to be arbitrary again. We have from (3.87) that
J(u) - J(u_h) = (f, z - \psi_h) - a(u_h, z - \psi_h) = \rho(u_h)(z - \psi_h),
where $\rho(u_h)$ is the residual. Standard elementwise integration by parts yields
\rho(u_h)(z - \psi_h) = \sum_{T \in \mathcal{T}_h} \Big( \int_T (f + \Delta u_h)(z - \psi_h) + \frac{1}{2} \int_{\partial T} \llbracket \nabla u_h \rrbracket (z - \psi_h) \Big)     (3.88)
\le \sum_{T \in \mathcal{T}_h} \rho_T \, \omega_T,     (3.89)
where $\rho_T = \big( \|f + \Delta u_h\|^2_{L^2(T)} + h_T^{-1} \|\llbracket \nabla u_h \rrbracket\|^2_{L^2(\partial T)} \big)^{1/2}$ may be termed smoothness indicators and $\omega_T = \big( \|z - \psi_h\|^2_{L^2(T)} + h_T \|z - \psi_h\|^2_{L^2(\partial T)} \big)^{1/2}$ may be termed influence factors.
The trick now is to meaningfully bound $z - \psi_h$, using $z_h$ and/or other information about $z$. We illustrate with a specific example. Let $J(u) = u(x_0)$ for a given point $x_0 \in \Omega$. The dual solution $z$ is then the Green's function, satisfying $a(\varphi, z) = \varphi(x_0)$. Typically the functional $J$ and therefore also the dual solution $z$ are regularized by setting $J(u) = \frac{1}{|B_\epsilon|} \int_{B_\epsilon} u(x) \, dx$, where $B_\epsilon$ is the ball of radius $\epsilon$ centered at $x_0$. Thus averaging $u$ over a small ball around $x_0$ generally gives a close approximation to $u(x_0)$, with error depending on the smoothness of $u$ and $\epsilon$. Let $r(x) = \sqrt{|x - x_0|^2 + \epsilon^2}$ be a regularized distance to $x_0$. It is known that in 2D,
z(x) \sim \log(r(x))
and for $|\alpha| > 0$,
D^\alpha z(x) \sim r(x)^{-|\alpha|}
(at least assuming that $\partial\Omega$ is sufficiently regular). By choosing $\psi_h$ suitably, we can then compute that $\omega_T \approx h_T^2 \|D^2 z\|_{L^2(T)} \approx h_T^2 |T|^{1/2} \|r\|^{-2}_{L^\infty(T)} \approx h_T^3 \|r\|^{-2}_{L^\infty(T)}$. Thus
|J(u) - J(u_h)| \approx \sum_{T \in \mathcal{T}_h} h_T^3 \|r\|^{-2}_{L^\infty(T)} \rho_T.
This sketch is very incomplete. For example, in general information about $D^2 z$ must be estimated from $z_h$, which can become computationally involved. However, in general DWR estimators do a very effective job at directing adaptive refinement to compute $u(x_0)$ and other locally supported quantities of interest efficiently, and with proper adjustments also to estimating the resulting error. There are also drawbacks. First, computation of the dual solution $z_h$ will be viewed by many as relatively expensive. It involves solution of a global, dual problem that in the current situation is just as costly as solving the original finite element equations. It is however often pointed out that in the case of nonlinear problems, the dual problem is still linear and thus the added expense may be relatively minor. In addition the use of DWR methods becomes less clear as the functional $J$ becomes more nonlinear. If one wants for example to control $\|u - u_h\|_{L^\infty(\Omega)}$ instead of $(u - u_h)(x_0)$, then it is not so clear how to choose $J$ and compute the associated dual solution $z$. If error control in norms is desired, it may make sense to instead resort to other techniques. Finally, the most effective techniques for estimating $D^2 z$ (which are derived from difference quotients) do not lead to provably reliable estimators, even though they are typically effective in practice.
3.3.2 Recovery estimators
We next discuss recovery-type a posteriori error estimators. Such estimators broadly speaking work by "recovering" an approximation $G(u_h)$ to $\nabla u$ from the discrete solution $u_h$ which is superconvergent or otherwise a better approximation to $\nabla u$ than is $\nabla u_h$. By superconvergent, we mean that
\frac{\|G(u_h) - \nabla u\|}{\|\nabla(u - u_h)\|} \to 0 \quad \text{as } h \to 0.     (3.90)
is the Zienkiewicz-Zhu superconvergent patch recovery estimator; we take our discussion from [3, Chapter 4]. Let N be the set of nodes in a mesh Th .
Given z 2 N , we denote by !z the patch of elements sharing the vertex z. Assume also that
Sh is a standard piecewise linear finite element space. The recovered gradient G(uh ) will be a
finite element function polynomial of degree 1 over !z in each component. We begin by defining an auxiliary function gz that is linear in each component. gz is taken to have x-component
↵1,x + ↵2,x x + ↵3,x y and similarly for the y-component. Given T ⇢ !Z , let bT be the barycenter of
T . gz (x, y) = (↵1,x + ↵2,x x + ↵3,x y, ↵1,y + ↵2,y x + ↵3,y y) is taken to be the vector function which
uniquely minimizes
◆2
X ✓ @uh
(bT ) ↵1,x + ↵2,x x + ↵3,x y .
@x
T ⇢!z
and similarly for the y-component. The recovered gradient G(uh ) is the piecewise linear function
(in each component)superconvergent
with value gz (z) at each node z.
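For concreteness, here is a minimal sketch of the local least-squares fit (our own code; the patch data are assumed given as arrays of barycenters and of $\partial u_h/\partial x$ values there):

import numpy as np

def recover_x_component(barycenters, duh_dx, node_xy):
    """ZZ-type recovery of the x-component of the gradient at a patch node.

    barycenters : (n, 2) array of barycenters b_T of elements in the patch
    duh_dx      : (n,) array of values (d u_h / d x)(b_T)
    node_xy     : coordinates of the patch node z
    Fits a linear polynomial a1 + a2*x + a3*y to the barycenter values in the
    least-squares sense and evaluates it at the node.
    """
    A = np.column_stack([np.ones(len(duh_dx)),
                         barycenters[:, 0], barycenters[:, 1]])
    coeffs, *_ = np.linalg.lstsq(A, duh_dx, rcond=None)
    return coeffs[0] + coeffs[1] * node_xy[0] + coeffs[2] * node_xy[1]

The same fit applied to $\partial u_h/\partial y$ gives the other component; collecting the nodal values yields the piecewise linear recovered gradient $G(u_h)$.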
Assuming that (3.90) holds, we then have
\frac{\|G(u_h) - \nabla u_h\|_{L^2(\Omega)}}{\|\nabla(u - u_h)\|_{L^2(\Omega)}} \le \frac{\|G(u_h) - \nabla u\|_{L^2(\Omega)} + \|\nabla(u - u_h)\|_{L^2(\Omega)}}{\|\nabla(u - u_h)\|_{L^2(\Omega)}} \to 1 \quad \text{as } h \to 0.
A similar lower bound can be obtained by using the triangle inequality in the other direction. We thus have $\|G(u_h) - \nabla u_h\|_{L^2(\Omega)} / \|\nabla(u - u_h)\|_{L^2(\Omega)} \to 1$ as $h \to 0$. We say that the error estimator $\|G(u_h) - \nabla u_h\|_{L^2(\Omega)}$ is asymptotically exact in this case.
The property of asymptotic exactness is highly desirable, and because the ZZ estimator only depends on the solution $u_h$ directly and not on properties of the underlying PDE it is also very
easy to program into a highly portable code. There are two main drawbacks to the ZZ estimator, however. First, while the superconvergence property (3.90) is often observed in practice, it is much harder to establish theoretically in much generality. On highly structured (uniform) grids and for linear elements it is observed, and can be proved, that $\nabla u_h(b_T)$ is a higher-order approximation to $\nabla u(b_T)$ than is generally the case at other points. That is, the finite element method using linear elements on uniform grids is superconvergent at barycenters for gradients. However, superconvergence is observed computationally in much more general circumstances. The paper [10] explains this observation by showing that if the grid satisfies a much weaker regularity property involving "near-parallelograms" formed by adjacent mesh elements, then sufficient superconvergence will occur to guarantee (3.90) for suitably defined recovery operators.
A second issue involves reliability of ZZ-type (recovery) estimators. Assume that we solve $-\Delta u = f$, and it happens that $0 \ne f \perp S_h$. Although this situation may seem artificial, it is not hard to construct such examples. Then we find $u_h = 0$, and because $G(u_h)$ depends only on $u_h$ and not on $f$, we also have $G(u_h) = 0$. Similarly, the error estimate $\|G(u_h) - \nabla u_h\| = 0$. Unlike the residual estimators we studied previously, recovery estimators thus are not reliable. If implemented in an AFEM, the AFEM will stop at this point. Clearly this situation is a disaster if it occurs, although it may not happen so often in practice. There are also techniques for "safeguarding" the ZZ estimator by adding on additional terms which guarantee coarse-mesh reliability while preserving asymptotic exactness; cf. [28].
3.3.3 Equilibrated flux estimators
Another option for developing a posteriori error estimators uses somewhat different techniques. Our entry point is the Prager-Synge identity. We present a variation more directly useful for a posteriori error estimation. While we take our immediate presentation from [26], we note that a large number of papers over the past couple of decades have exploited similar ideas in constructing a posteriori error estimates.
Lemma 3.3.1 Let $u \in H^1_0(\Omega)$ weakly solve $-\Delta u = f$. Let $u_h \in S_h$ be the finite element approximation to $u$ over the mesh $\mathcal{T}_h$, and let $\sigma_h \in H(\mathrm{div}; \Omega)$ satisfy $\int_T \mathrm{div}\, \sigma_h \, dx = \int_T f \, dx$ for each $T \in \mathcal{T}_h$. Let also now $h_T = \mathrm{diam}(T)$. Then
\|\nabla(u - u_h)\|^2_{L^2(\Omega)} \le \sum_{T \in \mathcal{T}_h} \Big( \|\nabla u_h + \sigma_h\|_{L^2(T)} + \frac{h_T}{\pi} \|f - \mathrm{div}\, \sigma_h\|_{L^2(T)} \Big)^2.     (3.91)

Proof. Let $v \in H^1_0(\Omega)$. Then integrating by parts and employing $-\Delta u = f$, we have
\int_\Omega \nabla(u - u_h) \cdot \nabla v = \int_\Omega (f - \mathrm{div}\, \sigma_h) v - \int_\Omega (\nabla u_h + \sigma_h) \cdot \nabla v.
Using $\int_T \mathrm{div}\, \sigma_h = \int_T f$, we have $\int_T (f - \mathrm{div}\, \sigma_h) v = \int_T (f - \mathrm{div}\, \sigma_h)(v - v_T)$ for any $v_T \in \mathbb{R}$. We now recall the Poincaré inequality
\inf_{w_T \in \mathbb{R}} \|w - w_T\|_{L^2(T)} \le C_T h_T \|\nabla w\|_{L^2(T)}, \qquad w \in H^1(T).
It is known that we can in fact take $C_T = 1/\pi$ [36]. Thus breaking integrals over $\Omega$ into sums of integrals over mesh elements, we have for any $v \in H^1_0(\Omega)$
\int_\Omega \nabla(u - u_h) \cdot \nabla v = \sum_{T \in \mathcal{T}_h} \Big( \int_T (f - \mathrm{div}\, \sigma_h)(v - v_T) - \int_T (\nabla u_h + \sigma_h) \cdot \nabla v \Big)
\le \sum_{T \in \mathcal{T}_h} \Big( \frac{h_T}{\pi} \|f - \mathrm{div}\, \sigma_h\|_{L^2(T)} + \|\nabla u_h + \sigma_h\|_{L^2(T)} \Big) \|\nabla v\|_{L^2(T)}
\le \Big( \sum_{T \in \mathcal{T}_h} \Big( \frac{h_T}{\pi} \|f - \mathrm{div}\, \sigma_h\|_{L^2(T)} + \|\nabla u_h + \sigma_h\|_{L^2(T)} \Big)^2 \Big)^{1/2} \|\nabla v\|_{L^2(\Omega)}.
Taking $v = u - u_h$ and dividing through by $\|\nabla(u - u_h)\|_{L^2(\Omega)}$ completes the proof. $\Box$
Let us reflect on what we have shown. First, note that (3.91) is both rigorous (it gives a reliable upper bound) and free of unknown constants. These two facts together constitute the good news. The bad news is that we have not come up with an effective way to compute $\sigma_h$; it is thus far completely unknown. One option is to solve the standard mixed finite element problem. Let $RT_k \times PC_k$ be a standard mixed finite element space consisting of degree-$k$ ($k \ge 0$) Raviart-Thomas elements and piecewise polynomials. We then seek $(\sigma_h, \tilde u_h) \in RT_k \times PC_k$ such that
\int_\Omega \sigma_h \cdot \tau_h = \int_\Omega \tilde u_h \, \mathrm{div}\, \tau_h, \quad \tau_h \in RT_k, \qquad \int_\Omega (\mathrm{div}\, \sigma_h) v_h = \int_\Omega f v_h, \quad v_h \in PC_k.     (3.92)
$\sigma_h$ then satisfies the given conditions, including the approximate equilibrium condition $\int_T \mathrm{div}\, \sigma_h = \int_T f$, and we could in theory insert it into our identity. This however raises an important cost-benefit question. Solving the mixed finite element problem above requires again solving a global system, and it is a saddle point system to boot. It would be more desirable to instead find an appropriate flux approximation $\sigma_h$ which is locally instead of globally defined. We thus employ a local version of (3.92). Given a vertex $z \in \mathcal{N}_h$, let $\omega_z$ be the patch of elements sharing $z$. Also, let $RT_{k,z}$ be the set of functions which are 0 outside of $\omega_z$ and lie in $RT_k$; the latter condition is equivalent to requiring $\tau_h \cdot n = 0$ on $\partial\omega_z \setminus \partial\Omega$, where $n$ is the unit normal on $\partial\omega_z$. We similarly define $PC_{k,z}$ to be the piecewise polynomials of degree $k$ on $\omega_z$ with the constraint $\int_{\omega_z} v_h = 0$ when $z$ is an interior node. Finally, let $\psi_z$ be the standard continuous piecewise linear hat function corresponding to $z$. We then let $(\sigma_{h,z}, r_{h,z}) \in RT_{k,z} \times PC_{k,z}$ solve
\int_{\omega_z} \sigma_{h,z} \cdot \tau_{h,z} - \int_{\omega_z} r_{h,z} \, \mathrm{div}\, \tau_{h,z} = -\int_{\omega_z} \psi_z \nabla u_h \cdot \tau_{h,z}, \quad \tau_{h,z} \in RT_{k,z},
\int_{\omega_z} \mathrm{div}\, \sigma_{h,z} \, v_{h,z} = \int_{\omega_z} (\psi_z f - \nabla\psi_z \cdot \nabla u_h) v_{h,z}, \quad v_{h,z} \in PC_{k,z}.     (3.93)
We then set $\sigma_h = \sum_{z \in \mathcal{N}_h} \sigma_{h,z}$, which is in $H(\mathrm{div}; \Omega)$. We now discuss the properties of $\sigma_h$. First, for any interior node $z$ we have $\int_{\omega_z} \nabla u_h \cdot \nabla\psi_z = \int_{\omega_z} \psi_z f$ (since $u_h$ is the Galerkin solution), so $\int_{\omega_z} (\psi_z f - \nabla\psi_z \cdot \nabla u_h) = 0$. This in turn implies that $\int_{\omega_z} \mathrm{div}\, \sigma_{h,z} \, v_h = \int_{\omega_z} (\psi_z f - \nabla\psi_z \cdot \nabla u_h) v_h$ for all $v_h \in PC_k$. Summing over $z$ and using the fact that $\{\psi_z\}$ is a partition of unity yields
\int_\Omega \mathrm{div}\, \sigma_h \, v_h = \sum_{z \in \mathcal{N}_h} \int_\Omega (\psi_z f - \nabla\psi_z \cdot \nabla u_h) v_h = \int_\Omega \Big( f - \nabla\Big( \sum_{z \in \mathcal{N}_h} \psi_z \Big) \cdot \nabla u_h \Big) v_h = \int_\Omega f v_h, \qquad v_h \in PC_k.     (3.94)
Thus the equilibrium condition $\int_T \mathrm{div}\, \sigma_h = \int_T f$ is satisfied.
Next note that (3.94) holds elementwise (since $PC_k$ is a broken space), so that $\|f - \mathrm{div}\, \sigma_h\|_{L^2(T)} = \|f - P_{\mathcal{P}_k(T)} f\|_{L^2(T)}$ with $P_{\mathcal{P}_k(T)}$ the $L^2(T)$ projection onto $\mathcal{P}_k$. Thus (3.91) reduces to
\|\nabla(u - u_h)\|^2_{L^2(\Omega)} \le \sum_{T \in \mathcal{T}_h} \Big( \|\nabla u_h + \sigma_h\|_{L^2(T)} + \frac{h_T}{\pi} \|f - P_{\mathcal{P}_k(T)} f\|_{L^2(T)} \Big)^2.     (3.95)
The latter term is a data oscillation term and is generally of higher order. If we take $k = r$, then $\|\nabla(u - u_h)\|_{L^2(\Omega)} = O(h^r)$ while $h_T \|f - P_{\mathcal{P}_k(T)} f\|_{L^2(T)} = O(h^{r+2})$, so that we expect $\|\nabla u_h + \sigma_h\|$ to dominate the estimator asymptotically. Numerical results (cf. [26]) indicate that very good correlation between errors and estimators may be obtained with this method, with efficiency indices quite close to 1 and robust with respect to polynomial degree. Because the upper bound is also guaranteed with no unknown constants, we see that these estimators have some attractive features.
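Once $\sigma_h$ has been computed, evaluating the estimator (3.95) is a purely elementwise operation; a minimal sketch (our own illustration, with the elementwise norms assumed precomputed) looks as follows:

import numpy as np

def equilibrated_flux_estimator(h, flux_misfit, osc):
    """Guaranteed upper bound (3.95) for the energy error.

    h           : array of element diameters h_T
    flux_misfit : array of ||grad u_h + sigma_h||_{L2(T)}
    osc         : array of ||f - P_k f||_{L2(T)} (data oscillation terms)
    """
    local = flux_misfit + (h / np.pi) * osc
    return np.sqrt(np.sum(local**2))

# Example with three elements; the oscillation contribution is higher order.
print(equilibrated_flux_estimator(np.array([0.1, 0.1, 0.2]),
                                  np.array([1e-2, 2e-2, 5e-3]),
                                  np.array([1e-3, 1e-3, 2e-3])))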