13: Brezzi-Fortin stability of saddle point problems. Math 639 (updated: January 2, 2012)

In this section, we consider a saddle point problem. This is defined from a matrix (with real entries) given in the following block form:
\[
\mathcal{A} = \begin{pmatrix} A & B^t \\ B & 0 \end{pmatrix}.
\]
Here $A$ has dimension $n \times n$, $B$ has dimension $m \times n$ and $B^t$ (the transpose of $B$) has dimension $n \times m$. We seek to iteratively solve the problem: Given $b \in R^{n+m}$, find $x \in R^{n+m}$ solving

(13.1) \[ \mathcal{A} x = b. \]

This is the simplest of saddle point problems. The techniques of this section, as well as their application to more general saddle point problems, can be found in Brezzi-Fortin, ...

Remark 1. We temporarily consider the case when $A$ is symmetric and positive definite and the rank of $B$ is $m$. Recall the min-max characterization of the eigenvalues of $\mathcal{A}$, i.e.,
\[
\lambda_j = \max_{S_j} \min_{z \in S_j} \frac{(\mathcal{A}z, z)}{(z,z)}.
\]
Here the maximum is taken over all subspaces $S_j$ of dimension $j$ and $\lambda_1 \ge \lambda_2 \ge \cdots$ are the eigenvalues of $\mathcal{A}$ in decreasing order. The positive definiteness of $A$ immediately implies that $\mathcal{A}$ has at least $n$ positive eigenvalues since for $j \le n$, taking $S_j = [V, 0]$, where $V$ is a $j$-dimensional subspace of $R^n$, gives rise to a positive minimum. Similarly, the eigenvalues of $\mathcal{A}$, given in increasing order, satisfy
\[
\mu_j = \min_{S_j} \max_{z \in S_j} \frac{(\mathcal{A}z, z)}{(z,z)}.
\]
For $q \in R^m$, we set $z = [-A^{-1}B^t q, q]$ and find that $(\mathcal{A}z, z) = -(A^{-1}B^t q, B^t q)$. Since the rank of $B^t$ is $m$, $B^t q \ne 0$ when $q \ne 0$ and so this quantity is negative; taking $S_m = \{[-A^{-1}B^t q, q] : q \in R^m\}$ above shows $\mu_m < 0$. It follows that $\mathcal{A}$ has at least $m$ negative eigenvalues, i.e., $\mathcal{A}$ has exactly $n$ positive and $m$ negative eigenvalues and (13.1) is a saddle point problem.

We shall recast this problem in a slightly different notation by defining $u = (x_1, \ldots, x_n)^t$, $p = (x_{n+1}, \ldots, x_{n+m})^t$, $F = (b_1, \ldots, b_n)^t$ and $G = (b_{n+1}, \ldots, b_{n+m})^t$. Then (13.1) is equivalent to the system of equations

(13.2) \[ Au + B^t p = F, \qquad Bu = G. \]

As we shall often be splitting vectors in $R^{n+m}$, we introduce the notation $x = [u, p]$ and $b = [F, G]$ for this splitting.
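The eigenvalue count in Remark 1 can be checked numerically. The sketch below (the matrices are arbitrary illustrative choices, not from the text) assembles the block matrix $\mathcal{A}$ from a symmetric positive definite $A$ and a full-rank $B$ and counts positive and negative eigenvalues.

```python
import numpy as np

# Hypothetical small example with n = 3, m = 2: A is symmetric positive
# definite and B has full row rank, so the theory predicts that the
# block matrix has exactly n positive and m negative eigenvalues.
n, m = 3, 2
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # symmetric positive definite
B = rng.standard_normal((m, n))      # full row rank (generic choice)

# Assemble the (n+m) x (n+m) saddle point matrix [[A, B^t], [B, 0]].
cal_A = np.block([[A, B.T], [B, np.zeros((m, m))]])

eigs = np.linalg.eigvalsh(cal_A)     # cal_A is symmetric
n_pos = int(np.sum(eigs > 0))
n_neg = int(np.sum(eigs < 0))
print(n_pos, n_neg)                  # expect: 3 2
```

The count is independent of the particular (full-rank) choices above: the Schur complement $-BA^{-1}B^t$ is negative definite whenever $A$ is positive definite and $B$ has rank $m$.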
Before investigating the iterative solution of (13.1), we consider some conditions which guarantee the existence of solutions to (13.1) for any vector $b \in R^{n+m}$. (Of course, $\mathcal{A}$ is a square matrix and so uniqueness immediately implies existence.) We let $(x, y) = x \cdot y$ denote the Euclidean inner product on $R^n$, $R^m$ or $R^{n+m}$, depending on where the vectors $x$ and $y$ reside. We also introduce auxiliary norms on $R^n$ and $R^m$ which we denote by $\|x\|$ (again, the specific norm is identified from where $x$ resides). These norms are induced by inner products on $R^n$ and $R^m$, respectively. For $z = [w, q] \in R^{n+m}$, we set
\[ \|z\| = \sqrt{\|w\|^2 + \|q\|^2}. \]
We define the following (dual) matrix norms:
\[
\|A\| = \sup_{x \in R^n} \sup_{y \in R^n} \frac{(Ax, y)}{\|x\|\,\|y\|}, \qquad
\|B\| = \sup_{x \in R^n} \sup_{y \in R^m} \frac{(Bx, y)}{\|x\|\,\|y\|}.
\]
It is immediate that
\[
\|B^t\| = \sup_{y \in R^m} \sup_{x \in R^n} \frac{(B^t y, x)}{\|x\|\,\|y\|}
        = \sup_{y \in R^m} \sup_{x \in R^n} \frac{(Bx, y)}{\|x\|\,\|y\|} = \|B\|,
\]
where we reversed the order of the supremums to derive the last equality above. We shall use additional dual norms for vectors, i.e., for $w \in R^n$,
\[ \|w\|_* = \sup_{x \in R^n} \frac{(w, x)}{\|x\|}. \]
The dual norms $\|q\|_*$ and $\|y\|_*$ for $q \in R^m$ and $y \in R^{n+m}$ are defined analogously. It is not hard to show that the dual norms and the matrix norms do indeed satisfy the norm axioms. There are obvious inequalities which are immediate from the definition of these norms, e.g.,
\[ |(p, q)| \le \|p\|_* \|q\| \quad\text{and}\quad \|Au\|_* \le \|A\|\,\|u\|. \]

Exercise 1. For all $z = [w, q] \in R^{n+m}$,
\[ \|z\|_* \le \|w\|_* + \|q\|_* \le 2\|z\|_*. \]

Conditions on the matrix $\mathcal{A}$:

(M.1) The matrix $A$ is symmetric and positive semi-definite.

(M.2) We denote the null space of $B$ by $\ker B = \{x \in R^n : Bx = 0\}$ and assume that $A$ is coercive on $\ker B$. More specifically, we assume that there is a positive constant $\alpha$ satisfying
\[ \alpha \|x\|^2 \le (Ax, x) \quad\text{for all } x \in \ker B. \]

(M.3) We assume the following inf-sup condition, i.e., there is a positive constant $\beta$ satisfying

(13.3) \[ \beta \le \inf_{p \in R^m} \sup_{u \in R^n} \frac{(Bu, p)}{\|u\|\,\|p\|}. \]

Remark 2. We note that (13.3) implies that

(13.4) \[ \|p\| \le c \sup_{u \in R^n} \frac{(Bu, p)}{\|u\|} \quad\text{for all } p \in R^m \]

with constant $c = 1/\beta$.
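When the auxiliary norms are taken to be the Euclidean ones, the inf-sup condition (13.3) becomes concrete: $\sup_u (Bu, p)/\|u\| = \|B^t p\|$, so the best constant $\beta$ is the smallest singular value of $B$. A small numerical check (the matrix is an arbitrary illustrative choice):

```python
import numpy as np

# With Euclidean norms, sup_u (Bu,p)/|u| = |B^t p|, so the sharp
# inf-sup constant is beta = inf_p |B^t p|/|p| = sigma_min(B).
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

beta = np.linalg.svd(B, compute_uv=False).min()

# Spot-check: every sampled quotient |B^t p|/|p| must lie above beta
# (up to roundoff), consistent with beta being the infimum.
rng = np.random.default_rng(1)
quotients = []
for _ in range(1000):
    p = rng.standard_normal(2)
    quotients.append(np.linalg.norm(B.T @ p) / np.linalg.norm(p))

print(beta > 0, min(quotients) >= beta - 1e-12)   # expect: True True
```

That $\beta > 0$ here reflects the full row rank of $B$; a rank-deficient $B$ would give $\beta = 0$ and (M.3) would fail.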
In fact, $c = 1/\beta$ is the smallest constant $c$ for which (13.4) holds.

These systems often come from the discretization of partial differential equations with constraints enforced by the introduction of Lagrange multipliers (discussed below). In this case, natural problem dependent norms (which we have denoted $\|u\|$ for $u \in R^n$ and $\|p\|$ for $p \in R^m$) often give rise to constants $\alpha^{-1}$, $\beta^{-1}$, $\|A\|$, $\|B\|$ which are uniformly bounded independently of the discretization or mesh size. These norms are central to the stability analysis of the original PDE as well as of its discretizations (i.e., the solution of (13.1)).

It is a fact from linear algebra that
\[ R^n = \ker B \oplus \operatorname{Range} B^t \]
is an orthogonal decomposition in the Euclidean inner product, and we denote $(\ker B)^\perp = \operatorname{Range} B^t$. Now, (13.4) implies that for each nonzero $p \in R^m$, there is at least one (nonzero) $u \in R^n$ with $(Bu, p) = (u, B^t p) \ne 0$. This means that $B^t$ is a one to one map of $R^m$ onto $(\ker B)^\perp$. Thus it follows from (13.4) that $B^{-t}$ (the inverse of $B^t$ on $(\ker B)^\perp$) satisfies, for $u \in (\ker B)^\perp$,

(13.5) \[
\|B^{-t}u\| \le \beta^{-1} \sup_{w \in R^n} \frac{(Bw, B^{-t}u)}{\|w\|}
 = \beta^{-1} \sup_{w \in R^n} \frac{(w, B^t B^{-t} u)}{\|w\|}
 = \beta^{-1} \sup_{w \in R^n} \frac{(w, u)}{\|w\|}
 = \beta^{-1} \|u\|_*.
\]

Now there is a complementary inf-sup condition, namely,

(13.6) \[ \beta_1 \le \inf_{u \in (\ker B)^\perp} \sup_{p \in R^m} \frac{(Bu, p)}{\|u\|\,\|p\|}, \]

or

(13.7) \[ \|u\| \le \beta_1^{-1} \sup_{p \in R^m} \frac{(Bu, p)}{\|p\|} \quad\text{for all } u \in (\ker B)^\perp. \]

It turns out that (13.3) and (13.6) are equivalent and hold with the same $\beta$. We illustrate the proof in one direction below. Suppose that (13.3) holds and that $u$ is in $(\ker B)^\perp$. Then for $p = B^{-t}u$ (so that $B^t p = u$ and, by (13.5), $\|p\| \le \beta^{-1}\|u\|_*$),
\[
\|u\| = \frac{(u, u)}{\|u\|} = \frac{(u, B^t p)}{\|u\|} = \frac{(Bu, p)}{\|u\|}
\le \beta^{-1} \sup_{q \in R^m} \frac{(Bu, q)}{\|q\|}.
\]
This implies that (13.6) holds with $\beta_1^{-1} \le \beta^{-1}$. A similar argument shows that if (13.6) holds then so does (13.3) with $\beta^{-1} \le \beta_1^{-1}$, i.e., the inf-sup conditions are equivalent and hold with the same constant. Now $B$ is one to one on $(\ker B)^\perp$ and maps onto $R^m$. Its inverse $B^{-1}$ maps $R^m$ bijectively onto $(\ker B)^\perp$.
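The orthogonal decomposition $R^n = \ker B \oplus \operatorname{Range} B^t$ used above can be made concrete with a singular value decomposition. A sketch (the matrix $B$ is an arbitrary illustrative choice):

```python
import numpy as np

# Split R^n into ker B and Range(B^t) using the SVD of B: the first m
# right singular vectors span Range(B^t), the remaining n - m span
# ker B, and the two subspaces are Euclidean-orthogonal.
n, m = 4, 2
rng = np.random.default_rng(3)
B = rng.standard_normal((m, n))            # full row rank (generic)

_, _, Vt = np.linalg.svd(B)
range_Bt = Vt[:m].T                        # orthonormal basis of Range(B^t)
ker_B = Vt[m:].T                           # orthonormal basis of ker B

# Any x in R^n splits uniquely as x = x_ker + x_range.
x = rng.standard_normal(n)
x_ker = ker_B @ (ker_B.T @ x)
x_range = range_Bt @ (range_Bt.T @ x)

print(np.allclose(B @ x_ker, 0),
      np.allclose(x, x_ker + x_range),
      np.allclose(x_ker @ x_range, 0.0))   # expect: True True True
```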
The inf-sup condition (13.7) implies

(13.8) \[
\|B^{-1}p\| \le \beta^{-1} \sup_{q \in R^m} \frac{(BB^{-1}p, q)}{\|q\|}
 = \beta^{-1} \sup_{q \in R^m} \frac{(p, q)}{\|q\|}
 = \beta^{-1} \|p\|_*.
\]

Theorem 1. Let $b$ be in $R^{n+m}$. Then there is a unique solution $x \in R^{n+m}$ satisfying (13.1). Moreover, there is a constant $c = c(\alpha, \beta, \|A\|, \|B\|)$ such that
\[ \|x\| \le c \|b\|_*. \]

Proof. We shall construct the solution $[u, p]$ of (13.2). We first define $u_0 = B^{-1}G$. By (13.8),
\[ \|u_0\| \le \beta^{-1} \|G\|_*. \]
Next, we set $u_1 \in \ker B$ to be the solution of

(13.9) \[ (Au_1, \theta) = (F, \theta) - (Au_0, \theta) \quad\text{for all } \theta \in \ker B. \]

That this problem has a unique solution follows from the Riesz Representation Theorem applied on $\ker B$ with inner product $\langle u, v \rangle = (Au, v)$, $u, v \in \ker B$. That $\langle \cdot, \cdot \rangle$ is an inner product on $\ker B$ follows from the coercivity assumption (M.2). We can then bound $u_1$ by
\[ \alpha \|u_1\|^2 \le (Au_1, u_1) = (F, u_1) - (Au_0, u_1) \le \|F\|_* \|u_1\| + \|A\|\,\|u_0\|\,\|u_1\| \]
which immediately implies

(13.10) \[ \|u_1\| \le \alpha^{-1} (\|F\|_* + \|A\| \beta^{-1} \|G\|_*). \]

We set $u = u_0 + u_1$ and note that

(13.11) \[ \|u\| \le \alpha^{-1} \|F\|_* + (1 + \alpha^{-1}\|A\|) \beta^{-1} \|G\|_*. \]

Set

(13.12) \[ p = B^{-t}(F - Au). \]

Note that (13.9) implies that $F - Au$ is in $(\ker B)^\perp$ and hence $p$ is well defined and, by (13.5), satisfies

(13.13) \[
\|p\| \le \beta^{-1}(\|F\|_* + \|A\|\,\|u\|)
 \le \beta^{-1}\bigl(\|F\|_* + \|A\|(\alpha^{-1}\|F\|_* + (1 + \alpha^{-1}\|A\|)\beta^{-1}\|G\|_*)\bigr).
\]

Note that (13.12) implies the first equation in (13.2) while
\[ Bu = Bu_0 + Bu_1 = Bu_0 = G. \]
This implies that $[u, p]$ is the solution of (13.2). The bound in the theorem follows from (13.11), (13.13) and Exercise 1.

Remark 3. The assumption of symmetry on $A$ can be relaxed by replacing the Riesz Representation Theorem by the Lax-Milgram Theorem.

Corollary 1. There are constants $c_0$ and $c_1$ depending only on $\alpha$, $\beta$, $\|A\|$, and $\|B\|$ satisfying

(13.14) \[ c_0 \|x\| \le \|\mathcal{A}x\|_* \le c_1 \|x\| \quad\text{for all } x \in R^{n+m}. \]

Proof. For any $x \in R^{n+m}$, $x$ is the unique solution to (13.1) with $b = \mathcal{A}x$; thus the first inequality of (13.14) is the inequality of the theorem. For the second, let $x = [u, p]$.
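The proof of Theorem 1 is constructive and translates directly into a solution algorithm. A sketch in Python/NumPy, with the Euclidean norm playing the role of the auxiliary norms (the matrices and right hand sides are arbitrary illustrative choices, not from the text):

```python
import numpy as np

# Build the solution [u, p] of  A u + B^t p = F,  B u = G  in three
# steps, mirroring the proof: u0 = B^{-1} G, u1 in ker B from (13.9),
# and p = B^{-t}(F - A u).
rng = np.random.default_rng(2)
n, m = 4, 2
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # SPD, so (M.1)-(M.2) hold
B = rng.standard_normal((m, n))      # full row rank, so (M.3) holds
F = rng.standard_normal(n)
G = rng.standard_normal(m)

# Step 1: u0 = B^{-1} G, the inverse of B on (ker B)^perp; this is the
# least-norm solution of B u0 = G (Moore-Penrose pseudoinverse).
u0 = np.linalg.pinv(B) @ G

# Step 2: u1 in ker B solving (A u1, theta) = (F - A u0, theta) for
# all theta in ker B; Z holds an orthonormal basis of ker B.
_, _, Vt = np.linalg.svd(B)
Z = Vt[m:].T
c = np.linalg.solve(Z.T @ A @ Z, Z.T @ (F - A @ u0))
u1 = Z @ c
u = u0 + u1

# Step 3: F - A u lies in Range(B^t) by construction; invert B^t.
p = np.linalg.pinv(B.T) @ (F - A @ u)

print(np.allclose(A @ u + B.T @ p, F),
      np.allclose(B @ u, G))          # expect: True True
```

The reduced matrix $Z^t A Z$ in step 2 is symmetric positive definite precisely because of the coercivity assumption (M.2), which is what makes that solve well posed.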
By Exercise 1,
\[
\|\mathcal{A}x\|_* \le \|Au + B^t p\|_* + \|Bu\|_*
\le (\|A\| + \|B\|)\|u\| + \|B\|\,\|p\|
\le (\|A\| + \|B\|)(\|u\| + \|p\|)
\le \sqrt{2}\,(\|A\| + \|B\|)\|x\|.
\]
We used the arithmetic-geometric mean inequality, $(a+b)^2 \le 2(a^2 + b^2)$, for the last inequality above.