ETNA Electronic Transactions on Numerical Analysis. Volume 40, pp. 82-93, 2013. Copyright 2013, Kent State University. ISSN 1068-9613. Kent State University http://etna.math.kent.edu ON SYLVESTER’S LAW OF INERTIA FOR NONLINEAR EIGENVALUE PROBLEMS∗ ALEKSANDRA KOSTIƆ AND HEINRICH VOSS‡ Dedicated to Lothar Reichel on the occasion of his 60th birthday Abstract. For Hermitian matrices and generalized definite eigenproblems, the LDLH factorization provides an easy tool to slice the spectrum into two disjoint intervals. In this note we generalize this method to nonlinear eigenvalue problems allowing for a minmax characterization of (some of) their real eigenvalues. In particular we apply this approach to several classes of quadratic pencils. Key words. eigenvalue, variational characterization, minmax principle, Sylvester’s law of inertia AMS subject classification. 15A18, 65F15 1. Introduction. The inertia of a Hermitian matrix A is the triplet of nonnegative integers In(A) := (np , nn , nz ), where np , nn , and nz are the number of positive, negative, and zero eigenvalues of A (counting multiplicities). Sylvester’s classical law of inertia states that two Hermitian matrices A, B ∈ Cn×n are congruent (i.e., A = S H BS for some nonsingular matrix S) if and only if they have the same inertia In(A) = In(B). An obvious consequence of the law of inertia is the following corollary: if A has an LDLH factorization A = LDLH , then np and nn equal the number of positive and negative entries of D, and if a block LDLH factorization exists where D is a block diagonal matrix with 1 × 1 and indefinite 2 × 2 blocks on its diagonal, then one has to increase the number of positive and negative 1 × 1 blocks of D by the number of 2 × 2 blocks to get np and nn , respectively. Hence, the inertia of A can be computed easily. This is particularly advantageous if the matrix is banded. If B ∈ Cn×n is positive definite, and A − σB = LDLH is the block diagonal LDLH factorization of A − σB for some σ ∈ R, we get the inertia In(A − σB) = (np , nn , nz ) as described in the previous paragraph. Then, the generalized eigenvalue problem Ax = λBx has nn eigenvalues smaller than σ. Hence, the law of inertia yields a tool to locate eigenvalues of Hermitian matrices or definite matrix pencils. Combining it with bisection or the secant method, one can determine all eigenvalues in a given interval or determine initial approximations for fast eigensolvers, and it can be used to check whether a method has found all eigenvalues in an interval of interest or not. The law of inertia was first proved in 1858 by J. J. Sylvester [19], and several different proofs can be found in textbooks [3, 6, 11, 13, 15], one of which is based on the minmax characterization of eigenvalues of Hermitian matrices. In this note we discuss generalizations of the law of inertia to nonlinear eigenvalue problems allowing for a minmax characterization of their eigenvalues. 2. Minmax characterization. Our main tools in this paper are variational characterizations of eigenvalues of nonlinear eigenvalue problems generalizing the well known minmax characterization of Poincaré [16] or Courant [2] and Fischer [5] for linear eigenvalue problems. ∗ Received October 17, 2012. Accepted January 29, 2013. Published online on April 22, 2013. Recommended by D. Kressner. † Mas̆inski fakultet Sarajevo, Univerzitet Sarajevo, Bosnia and Herzegovina, (kostic@mef.unsa.ba). ‡ Institute of Mathematics, Hamburg University of Technology, D-21071 Hamburg, Germany (voss@tu-harburg.de). 82 ETNA Kent State University http://etna.math.kent.edu SYLVESTER’S LAW OF INERTIA FOR NONLINEAR EIGENPROBLEMS 83 We consider the nonlinear eigenvalue problem T (λ)x = 0, (2.1) where T (λ) ∈ Cn×n , λ ∈ J, is a family of Hermitian matrices depending continuously on the parameter λ ∈ J, and J is a real open interval which may be unbounded. To generalize the variational characterization of eigenvalues, we need a generalization of the Rayleigh quotient. To this end we assume that (A1 ) for every fixed x ∈ Cn , x 6= 0, the scalar real equation (2.2) f (λ; x) := xH T (λ)x = 0 has at most one solution λ =: p(x) ∈ J. Then the equation f (λ; x) = 0 implicitly defines a functional p on some subset D ⊂ Cn , which is called the Rayleigh functional of (2.1), and which is exactly the Rayleigh quotient in case of a monic linear matrix function T (λ) = λI − A. Generalizing the definiteness requirement for linear pencils T (λ) = λB − A, we further assume that (A2 ) for every x ∈ D and every λ ∈ J with λ 6= p(x) it holds that (λ − p(x))f (λ; x) > 0. If p is defined on D = Cn \ {0}, then the problem T (λ)x = 0 is called overdamped. This notion is motivated by the finite dimensional quadratic eigenvalue problem T (λ)x = λ2 M x + λCx + Kx = 0, where M , C, and K are Hermitian and positive definite matrices. If C is large enough such that d(x) := (xH Cx)2 − 4(xH Kx)(xH M x) > 0 for every x 6= 0, then T (·) is overdamped. Generalizations of the minmax and maxmin characterizations of eigenvalues were proved by Duffin [4] for the quadratic case and by Rogers [17] for general overdamped problems. For nonoverdamped eigenproblems, the natural ordering to call the smallest eigenvalue the first one, the second smallest the second one, etc., is not appropriate. This is obvious if we make a linear eigenvalue problem T (λ)x := (λI − A)x = 0 nonlinear by restricting it to an interval J which does not contain the smallest eigenvalue of A. Then the conditions (A1 ) and (A2 ) are satisfied, p is the restriction of the Rayleigh quotient RA to D := {x 6= 0 : RA (x) ∈ J}, and inf x∈D p(x) will in general not be an eigenvalue. If λ ∈ J is an eigenvalue of T (·), then µ = 0 is an eigenvalue of the linear problem T (λ)y = µy, and therefore there exists an ℓ ∈ N such that 0 = max min V ∈Hℓ v∈V \{0} v H T (λ)v , kvk2 where Hℓ denotes the set of all ℓ–dimensional subspaces of Cn . In this case, λ is called an ℓth eigenvalue of T (·). With this enumeration the following minmax characterization for eigenvalues was proved in [20, 21]. T HEOREM 2.1. Let J be an open interval in R, and let T (λ) ∈ Cn×n , λ ∈ J, be a family of Hermitian matrices depending continuously on the parameter λ ∈ J such that the conditions (A1 ) and (A2 ) are satisfied. Then the following statements hold. ETNA Kent State University http://etna.math.kent.edu 84 ALEKSANDRA KOSTIĆ AND HEINRICH VOSS (i) For every ℓ ∈ N there is at most one ℓth eigenvalue of T (·) which can be characterized by λℓ = (2.3) min sup V ∈Hℓ , V ∩D6=∅ v∈V ∩D p(v). (ii) If λℓ := inf sup V ∈Hℓ , V ∩D6=∅ v∈V ∩D p(v) ∈ J for some ℓ ∈ N, then λℓ is the ℓth eigenvalue of T (·) in J, and (2.3) holds. (iii) If there exist the kth and the ℓth eigenvalue λk and λℓ in J (k < ℓ), then J contains the jth eigenvalue λj (k ≤ j ≤ ℓ) as well with λk ≤ λj ≤ λℓ . (iv) Let λ1 = inf x∈D p(x) ∈ J and λℓ ∈ J. If the minimum in (2.3) is attained for an ℓ-dimensional subspace V , then V ⊂ D ∪ {0}, and (2.3) can be replaced by λℓ = min sup V ∈Hℓ , V ⊂D∪{0} v∈V, v6=0 p(v). (v) λ̃ is an ℓth eigenvalue if and only if µ = 0 is the ℓth largest eigenvalue of the linear eigenproblem T (λ̃)x = µx. (vi) The minimum in (2.3) is attained for the invariant subspace of T (λℓ ) corresponding to its ℓ largest eigenvalues. 3. Sylvester’s law for nonlinear eigenvalue problems. We first consider the overdamped case. Then T (·) has exactly n eigenvalues λ1 ≤ · · · ≤ λn in J [17]. T HEOREM 3.1. Assume that T : J → Cn×n satisfies the conditions of the minmax characterization in Theorem 2.1, and assume that the nonlinear eigenvalue problem (2.1) is overdamped, i.e., for every x 6= 0 Equation (2.2) has a unique solution p(x) ∈ J. For σ ∈ J, let (np , nn , nz ) be the inertia of T (σ). Then the nonlinear eigenproblem T (λ)x = 0 has n eigenvalues in J, np of which are less than σ, nn exceed σ, and for nz > 0, σ is an eigenvalue of geometric multiplicity nz . Proof. The invariant subspace W of T (σ) corresponding to its positive eigenvalues has dimension np , and it holds that f (σ; x) = xH T (σ)x > 0 for every x ∈ W , x 6= 0. Hence p(x) < σ by (A2 ), and therefore the np th smallest eigenvalue of T (·) satisfies λnp = min max dim V =np x∈V,x6=0 p(x) ≤ max x∈W,x6=0 p(x) < σ. On the other hand for every subspace V of Cn of dimension np + nz + 1, there exists a vector x ∈ V such that f (σ; x) < 0. Thus p(x) > σ, and it holds that λnp +nz +1 = min max dim V =np +nz +1 x∈V,x6=0 p(x) > σ, which completes the proof. Next we consider the case that an extreme eigenvalue, i.e., either λ1 := inf x∈D p(x) or λn := supx∈D p(x), is contained in J. T HEOREM 3.2. Assume that T : J → Cn×n satisfies the conditions of the minmax characterization, and let (np , nn , nz ) be the inertia of T (σ) for some σ ∈ J. (i) If λ1 := inf x∈D p(x) ∈ J, then the nonlinear eigenproblem T (λ)x = 0 has exactly np eigenvalues λ1 ≤ · · · ≤ λnp in J which are smaller than σ. (ii) If supx∈D p(x) ∈ J, then the nonlinear eigenproblem T (λ)x = 0 has exactly nn eigenvalues λn−nn +1 ≤ · · · ≤ λn in J exceeding σ. ETNA Kent State University http://etna.math.kent.edu SYLVESTER’S LAW OF INERTIA FOR NONLINEAR EIGENPROBLEMS 85 Proof. (i): We first show that f (λ; x) < 0 for every λ ∈ J with λ < λ1 and for every vector x 6= 0. Assume that f (λ; x) ≥ 0 for some λ < λ1 and x 6= 0, let x̂ be an eigenvector of (2.1) corresponding to λ1 , and let w(t) := tx̂ + (1 − t)x, 0 ≤ t ≤ 1. Then φ(t) := f (λ; w(t)) is continuous in [0, 1], φ(0) = f (λ; x) ≥ 0, and φ(1) = f (λ; x̂) < 0 by (A2 ). Hence, there exists a value t̂ ∈ [0, 1) such that f (λ; w(t̂)) = 0, i.e., w(t̂) ∈ D and p(w(t̂)) = λ < λ1 contradicting λ1 := inf x∈D p(x). For np = 0, the matrix T (σ) is negative semidefinit, i.e., xH T (σ)x ≤ 0 for x 6= 0, and it follows from (A2 ) that p(x) ≥ σ holds true for every x ∈ D. Hence, there is no eigenvalue less than σ. For np > 0, let W denote the invariant subspace of T (σ) corresponding to its positive eigenvalues. Then f (σ; x) = xH T (σ)x > 0 for x ∈ W , x 6= 0, and from f (λ; x) < 0 for λ < λ1 , it follows that x ∈ D and p(x) < σ. Hence, W ⊂ D ∪ {0} and as in the proof of Theorem 3.1 we obtain λnp = min max p(x) ≤ dim V =np , V ∩D6=∅ x∈V ∩D max p(x) < σ, x∈W,x6=0 i.e., T (·) has at least np eigenvalues less than σ. Assume that there exists an (np + 1)th eigenvalue λnp +1 < σ of T (·), and let W be the invariant subspace of T (λnp +1 ) corresponding to its nonnegative eigenvalues. Then this implies that dim W ≥ np + 1, W \ {0} ⊂ D, and p(x) ≤ λnp +1 < σ for every x ∈ W with x 6= 0, contradicting the fact that for every subspace V with dim V = np + 1 there exists a vector x ∈ V with xH T (σ)x ≤ 0, i.e., either x 6∈ D or p(x) ≥ σ. (ii) S(λ) := −T (−λ) satisfies the conditions of Theorem 2.1 in the interval −J, and −J contains the smallest eigenvalue −λn of S. For the general case the law of inertia has the following form: T HEOREM 3.3. Let T : J → Cn×n satisfy the conditions of the minmax characterization, and let σ, τ ∈ J, σ < τ . Let (npσ , nnσ , nzσ ) and (npτ , nnτ , nzτ ) be the inertias of T (σ) and T (τ ), respectively. Then the inequality npσ ≤ npτ holds, and the eigenvalue problem (2.1) has exactly npτ −npσ eigenvalues λnpσ +1 ≤ · · · ≤ λnpτ in (σ, τ ). Proof. Let W be the invariant subspace of T (σ) corresponding to its positive eigenvalues. Then by the positive definiteness, xH T (σ)x > 0, for every x ∈ W , x 6= 0, it follows from (A2 ) that xH T (τ )x > 0 holds as well, hence, npσ ≤ npτ . Let V be a subspace of Cn with V ∩ D = 6 ∅ and dim V = npσ + 1. We first show that there exits a x ∈ V ∩ D with p(x) > σ, from which we then obtain λnpσ +1 := inf sup dim V =npσ +1, V ∩D6=∅ x∈V ∩D p(x) > σ. From the hypothesis dim V > npσ , it follow that there exists a vector x ∈ V , x 6= 0 such that xH T (σ)x < 0. If x ∈ D, then it follows from (A2 ) that we are done. Otherwise, we choose y ∈ V ∩ D and ω > min(p(y), σ). Then xH T (ω)x < 0 < y H T (ω)y, and with w(t) := tx + (1 − t)y it follows in the same way as in the proof of Theorem 3.2 that there exist a value t̂ ∈ [0, 1] such that w(t̂) ∈ V ∩ D and p(w(t̂)) = ω > σ. If U denotes the invariant subspace of T (τ ) corresponding to its positive eigenvalues, then xH T (τ )x > 0 holds for every x ∈ U , x 6= 0. If U ∩ D = ∅, then xH T (λ)x > 0 holds for every λ ∈ J and x ∈ U , x 6= 0 and in particular for λ = σ. Hence, U ⊂ W and from σ < τ the equality npσ = npτ follows. In this case, T (·) has no eigenvalue in (σ, τ ) because otherwise a corresponding eigenvector x would satisfy xH T (σ)x < 0 < xH T (τ )x, i.e., x 6∈ W and x ∈ U contradicting U ⊂ W . ETNA Kent State University http://etna.math.kent.edu 86 ALEKSANDRA KOSTIĆ AND HEINRICH VOSS If U ∩ D 6= ∅, then p(x) < τ holds for every x ∈ U ∩ D and therefore λnpτ = inf sup dim V =npτ , V ∩D6=∅ x∈V ∩D p(x) ≤ sup p(x) < τ. x∈U ∩D Hence, λnpσ +1 and λnpτ are both contained in (σ, τ ) and so are the eigenvalues λj for j = npσ + 1, . . . , npτ . R EMARK 3.4. Without using the minmax characterization of eigenvalues, Neumaier [13] proved Theorem 3.3 for matrices T : J → Cn×n which are Hermitian and (elementwise) differentiable in J with positive definite derivative T ′ (λ), λ ∈ J. Obviously, such T (·) satisfy the conditions of the minmax characterization. E XAMPLE 3.5. Consider the rational eigenvalue problem T (λ) := −K + λM + p X j=1 λ Cj CjT , σj − λ where K, M ∈ Rn×n are symmetric and positive definite, the matrix Cj ∈ Rn×kj has rank kj , and 0 < σ1 < · · · < σp . This problem models the free vibrations of certain fluid– solid structures; cf. [1]. In each interval Jℓ := (σℓ , σℓ+1 ), ℓ = 0, . . . , p, σ0 = 0, σp+1 = ∞, the function fℓ (λ, x) := xH T (λ)x is strictly monotonically increasing, and therefore all eigenvalues in Jℓ are minmax values of the Rayleigh functional pℓ . For the first interval J0 , Theorem 3.2 applies. Hence, if τ ∈ J0 and (np , nn , nz ) is the inertia of T (τ ), then there are exactly np eigenvalues in J0 which are less than τ . Moreover, if τ1 < τ2 are contained in one interval Jj , then the number of eigenvalues in the interval (τ1 , τ2 ) can be obtained from the inertias of T (τ1 ) and T (τ2 ) according to Theorem 3.3. 4. Quadratic eigenvalue problems. We consider quadratic matrix pencils (4.1) Q(λ) := λ2 A + λB + C, with Hermitian matrices A, B, C ∈ Cn×n under several conditions that guarantee that (some of) the real eigenvalues allow for a variational characterization and hence for slicing of the spectrum using the inertia. 4.1. C < 0 and A ≥ 0. Let C be negative definite and A positive semidefinite. Multiplying Q(λ)x = 0 by λ−1 , one gets the equivalent nonlinear eigenvalue problem Q̃(λ)x := λAx + Bx + λ−1 Cx = 0. Differentiating f (λ; x) := xH Q̃(λ)x with respect to λ yields ∂ f (λ; x) = xH Ax − λ−2 xH Cx > 0 for every x 6= 0 and every λ 6= 0. ∂λ Hence, Q̃ satisfies the conditions of the minmax characterization for both intervals J− := (−∞, 0) and J+ := (0, ∞). For the corresponding Rayleigh functional p± with domain D± , it holds − that λ+ 1 = inf x∈D+ p+ (x) ∈ J+ and λn = supx∈D− p− (x) ∈ J− , and therefore the following statement follows from Theorem 3.2. T HEOREM 4.1. Let C be negative definite and A positive semidefinite. (i) For σ > 0 let In(Q̃(σ)) = (np , nn , nz ) be the inertia of Q̃(σ). Then the quadratic pencil (4.1) has np positive eigenvalues less than σ. ETNA Kent State University http://etna.math.kent.edu SYLVESTER’S LAW OF INERTIA FOR NONLINEAR EIGENPROBLEMS 87 (ii) For σ < 0 let In(Q̃(σ)) = (np , nn , nz ) be the inertia of Q̃(σ). Then (4.1) has nn negative eigenvalues exceeding σ. If A is positive definite, then Q̃ is overdamped with respect to J+ and J− , and therefore there exist exactly n positive and n negative eigenvalues. If A 6= 0 is positive semidefinite and r = rank(A), then ∞ is an infinite eigenvalue of multiplicity n−r, and there are only n+r finite eigenvalues. If B is positive definite, then the Rayleigh functional p+ (x) = −2 xH Cx xH Bx + p (xH Bx)2 − 4(xH Ax)(xH Cx) is defined on Cn \ {0}. Hence, (Q̃, J+ ) is overdamped, and there exist n positive and r negative eigenvalues. Theorem 4.1 can be strengthened according to the following result. T HEOREM 4.2. Assume that A is positive semidefinite, B is positive definite, and C is negative definite. (i) For σ > 0 let In(Q̃(σ)) = (np , nn , nz ) be the inertia of Q̃(σ). Then the quadratic pencil (4.1) has np positive eigenvalues less than σ, nn eigenvalues exceeding σ, and if nz 6= 0, then σ is an eigenvalue of Q(·) with multiplicity nz . (ii) For σ < 0 let In(Q̃(σ)) = (np , nn , nz ) be the inertia of Q̃(σ). Then (4.1) has nn negative eigenvalues exceeding σ, np −r finite eigenvalues less than σ, and if nz 6= 0, then σ is an eigenvalue of Q(·) with multiplicity nz . 4.2. Hyperbolic problems. The quadratic pencil Q(·) defined by the Hermitian matrices A, B, C ∈ Cn×n is called hyperbolic if A is positive definite and for every x ∈ Cn , x 6= 0, the quadratic polynomial f (λ; x) := λ2 xH Ax + λxH Bx + xH Cx = 0 has two distinct real roots xH Bx p± (x) = − H ± 2x Ax (4.2) s xH Bx 2xH Ax 2 − xH Cx . xH Ax A hyperbolic quadratic matrix polynomial Q(·) has the following properties (cf. [12]): the ranges J˜± := p± (Cn \ {0}) are disjoint real closed intervals with max J˜− < min J˜+ , moreover Q(λ) is positive definite for λ < min J˜− and λ > max J˜+ , and it is negative definite for λ ∈ (max J˜− , min J˜+ ). Let J+ be an open interval with J˜+ ⊂ J+ and J˜− ∩ J+ = ∅, and let J− be an open interval with J˜− ⊂ J− and J˜+ ∩ J− = ∅. Then (Q, J+ ) and (−Q, J− ) satisfy the conditions of the variational characterization of eigenvalues and they are both overdamped. Hence, there exist 2n eigenvalues λ1 ≤ · · · ≤ λn < λn+1 ≤ · · · ≤ λ2n and λj = min max dim V =j x∈V,x6=0 p− (x) and λn+j = min max dim V =j x∈V,x6=0 p+ (x), j = 1, . . . , n. If In(Q(σ)) = (np , nn , nz ) is the inertia of Q(σ) and nn = n, then Q(σ) is negative definite and there are n eigenvalues smaller than σ and n eigenvalues exceeding σ. If np = n holds, then Q(σ) is positive definite, f (σ; x) > 0 is valid for every x 6= 0, and ∂ f (σ; x) < 0 holds, then it follows that σ < λ1 , and σ > λ2n otherwise. if ∂λ ETNA Kent State University http://etna.math.kent.edu 88 ALEKSANDRA KOSTIĆ AND HEINRICH VOSS If np 6= n and nn 6= n, then σ ∈ J− ∪ J+ and Theorem 3.1 applies. We only have to find out which of these intervals σ is located in. To this end we determine x 6= 0 such that f (σ; x) := xH Q(σ)x > 0 (this can be done by a few steps of the Lanczos method which ∂ f (σ; x) = 2σxH Ax + xH Bx < 0, is known to converge first to extreme eigenvalues). If ∂λ then it follows that p− (x) > σ, and therefore σ < λn = maxx6=0 p− (x). Similarly, the inequalites f (σ; x) > 0 and 2σxH Ax + xH Bx > 0 imply σ > λn+1 = minx6=0 p+ (x). Hence we obtain the following slicing of the spectrum of Q(·). T HEOREM 4.3. Let Q(λ) := λ2 A + λB + C be hyperbolic, and let (np , nn , nz ) be the inertia of Q(σ) for σ ∈ R. (i) If nn = n then there are n eigenvalues smaller than σ and n eigenvalues greater than σ. (ii) Let np = n. If 2σxH Ax + xH Bx < 0 for an arbitrary x 6= 0, then there are 2n eigenvalues exceeding σ. If 2σxH Ax + xH Bx > 0 for an arbitrary x 6= 0, then all 2n eigenvalues are less than σ. (iii) For np = 0 and nz > 0 let x 6= 0 be an element of the null space of Q(σ). If 2σxH Ax + xH Bx < 0, then Q(λ)x = 0 has n − nz eigenvalues in (−∞, σ), n eigenvalues in (σ, ∞) and σ = λn with multiplicity nz . If 2σxH Ax + xH Bx > 0, then Q(·) has n eigenvalues in (−∞, σ), n − nz eigenvalues in (σ, ∞), and σ = λn+1 with multiplicity nz . (iv) For np > 0 and nz = 0 let x 6= 0 be such that f (σ; x) > 0. If 2σxH Ax + xH Bx < 0, then Q(·) has n − np eigenvalues in (−∞, σ) and n + np eigenvalues in (σ, ∞). If 2σxH Ax + xH Bx > 0, then Q(·) has n + np eigenvalues in (−∞, σ) and n − np eigenvalues in (σ, ∞). (v) For np > 0 and nz > 0 let x 6= 0 be such that f (σ; x) > 0. If 2σxH Ax + xH Bx < 0, then Q(·) has n − np − nz eigenvalues in (−∞, σ) and n + np eigenvalues in (σ, ∞). If 2σxH Ax + xH Bx > 0, then Q(·) has n + np eigenvalues in (−∞, σ) and n − np − nz eigenvalues in (σ, ∞). In either case σ is an eigenvalue with multiplicity nz . R EMARK 4.4. These results on quadratic hyperbolic pencils can be generalized to a hyperbolic matrix polynomial of higher degree P (λ) = k X λj Aj , Aj = AH j , j = 0, . . . , k, Ak > 0, j=0 which is hyperbolic if Ak is positive definite and for every x 6= 0 the corresponding polynomial f (λ; x) := xH P (λ)x has k real and distinct roots. In this case there exist k disjoint open intervals Jj ⊂ R, j = 1, . . . , k such that P (·) has exactly n eigenvalues in each Jj , and these eigenvalues allow for a minmax characterization; cf. [12, 14]. To fix the numeration let sup Jj+1 < inf Jj for j = 1, . . . , k − 1. For σ ∈ R, let (np , nn , nz ) be the inertia of P (σ), and let x ∈ Cn be a vector such that xH P (σ)x > 0. If f (·; x) has exactly j roots which exceed σ, then it holds that σ ∈ Jj+1 or σ ∈ [sup Jj+1 , inf Jj ] or σ ∈ Jj . ETNA Kent State University http://etna.math.kent.edu SYLVESTER’S LAW OF INERTIA FOR NONLINEAR EIGENPROBLEMS 89 Which one of these situations occur can be deduced from the inertia (nn = n or np = n) and ∂ f (σ; x) as for the quadratic case. the derivative ∂λ 4.3. Definite quadratic pencils. In a recent paper, Higham, Mackey, and Tisseur [10] generalized the concept of hyperbolic quadratic polynomials waiving the positive definiteness of the leading matrix A. A quadratic pencil (4.1) is definite if A, B, and C are Hermitian, there exists a real value µ ∈ R ∪ {∞} such that Q(µ) is positive definite, and for every x ∈ Cn , x 6= 0 the scalar quadratic polynomial f (λ; x) := λ2 xH Ax + λxH Bx + xH Cx = 0 has two distinct roots in R ∪ {∞}. The following theorem was proved in [10]. T HEOREM 4.5. The Hermitian matrix polynomial Q(λ) is definite if and only if any two (and hence all) of the following properties hold: • d(x) := (xH Bx)2 − 4(xH Ax)(xH Cx) > 0 for every x ∈ Cn \ {0}, • Q(η) > 0 for some η ∈ R ∪ {∞}, • Q(ξ) < 0 for some ξ ∈ R ∪ {∞}. For ξ < η (otherwise consider Q(−λ)) it was shown in [14] that there are n eigenvalues in (ξ, η) which are minmax values of a Rayleigh functional, and the remaining n eigenvalues in [−∞, ξ) and (η, ∞] are maxmin and minmax values of a second Rayleigh functional. Hence, if ξ and η are known, then the slicing of the spectrum using the LDLH factorization follows similarly to the hyperbolic case. However, given σ and the LDLH factorization of Q(σ), we are not aware of an easy way to decide in which of the intervals [−∞, ξ), (ξ, η), or (η, ∞] the parameter σ is located. The articles [7, 8, 14] contain methods to detect whether a quadratic pencil is definite and to compute the parameters ξ and η, however they are much more costly than computing an LDLH factorization of a matrix. For the Examples 4.6 and 4.8 at least one of these parameters are known and the slicing can be given explicitly. E XAMPLE 4.6. Duffin [4] called a quadratic eigenproblem (4.1) an overdamped network, if A, B, and C are positive semidefinite and the so called overdamping condition d(x) = (xH Bx)2 − 4(xH Ax)(xH Cx) > 0 for every x 6= 0 is satisfied. So, actually B has to be positive definite, and therefore Q(µ) is positive definite for every µ > 0, and Q(·) is definite. If r denotes the rank of A, then (4.1) has n + r finite real eigenvalues, the largest n ones of which (called primary eigenvalues by Duffin) are minmax values of p+ (x) := −2 xH Cx p , xH Bx + d(x) and the smallest r ones (called secondary eigenvalues) are maxmin values λn+1−j = max min dim V =j, V ∩D− 6=∅ x∈V ∩D− p− (x), p where D− := {x ∈ Cn : xH Ax 6= 0} and p− (x) = −xH Bx − d(x) /(2xH Ax) for x ∈ D− . Hence, the following slicing of the spectrum can be derived. T HEOREM 4.7. Let A, B, C ∈ Cn×n be positive semidefinite and assume that d(x) > 0 for x 6= 0. Let r be the rank of A and In(Q(σ)) = (np , nn , nz ) be the inertia of Q(σ) for some σ ∈ R. Then the following holds. ETNA Kent State University http://etna.math.kent.edu 90 ALEKSANDRA KOSTIĆ AND HEINRICH VOSS (i) If nn = n, then there are r eigenvalues smaller than σ and n eigenvalues greater than σ. (ii) For np = 0 and nz > 0 let x 6= 0 be an element of the null space of Q(σ). If 2σxH Ax + xH Bx < 0, then Q(·) has r − nz eigenvalues in (−∞, σ) and n eigenvalues in (σ, 0]. If 2σxH Ax + xH Bx > 0, then Q(·) has r eigenvalues in (−∞, σ), and n − nz eigenvalues in (σ, 0]. In either case, σ is an eigenvalue of Q(·) with multiplicity nz . (iii) For np > 0 let x 6= 0 be such that f (σ; x) > 0. If 2σxH Ax + xH Bx < 0, then Q(λ)x = 0 has r − np eigenvalues in (−∞, σ) and n + np − nz eigenvalues in (σ, 0]. If 2σxH Ax + xH Bx > 0, then Q(λ)x = 0 has r + np − nz eigenvalues in (−∞, σ) and n − np eigenvalues in (σ, 0]. E XAMPLE 4.8. Free vibrations of fluid–solid structures are governed by the nonsymmetric eigenvalue problem [9, 18] Ms 0 Ks C xs xs (4.3) =λ 0 Kf x f −C T Mf xf where Ks ∈ Rs×s , Kf ∈ Rf ×f are the stiffness matrices, and Ms ∈ Rs×s , Mf ∈ Rf ×f are the mass matrices of the structure and the fluid, respectively, and C ∈ Rs×f describes the coupling of structure and fluid. The vector xs is the structure displacement vector, and xf is the fluid pressure vector. Ks , Ms , Kf , and Mf are symmetric and positive definite. Multiplying the first line of (4.3) by λ, one obtains the quadratic pencil −Ks −C 0 0 0 2 Ms + Q(λ) := λ +λ . 0 0 0 −Kf −C T Mf x It can easily be seen that for xs 6= 0 the quadratic equation [xTs , xTf ]Q(λ) s = 0 has xf one positive solution p+ (xs , xf ) and one negative solution p− (xs , xf ), and for xs = 0 it has one positive solution p+ (xs , xf ) := xTf Kf xf /(xTf Mf xf ) and the solution p− (xs , xf ) := ∞. Moreover the positive eigenvalues of (4.3) are minmax values of the Rayleigh functional p+ . Hence, one gets the following result for the (physically meaningful) positive eigenvalues: if In(Q(σ)) = (np , nn , nz ) for σ > 0, then there are exactly np eigenvalues in (0, σ), nn eigenvalues in (σ, ∞), and if nz 6= 0, then σ is an eigenvalue of multiplicity nz . 4.4. Nonoverdamped quadratic pencils. We consider the quadratic pencil (4.1) where the matrices A, B, and C are positive definite. Then for x 6= 0, the two complex roots of f (λ; x) := xH Q(λ)x are given in (4.2). Let us define δ− := sup{p− (x) : p− (x) ∈ R}, J− := (−∞, δ+ ), D± := {x ∈ Cn : p± (x) ∈ J± }. δ+ := inf{p+ (x) : p+ (x) ∈ R}, J+ = (δ− , 0), and If f (λ, x) > 0 for x 6= 0 and λ ∈ R, then it follows that δ− = −∞ and δ+ = ∞. The eigenvalue problem Q(λ)x = 0 has no real eigenvalues, but this does not have to be known in advance. Theorem 4.9 applies to this case as well. It is obvious that −Q and Q satisfy the conditions of the minmax characterization of its eigenvalues in J− and J+ , respectively. Hence, all eigenvalues in J− are minmax-values of p− : λ− j = min max dim V =j, V ∩D− 6=∅ x∈V ∩D− p− (x), j = 1, 2, . . . . ETNA Kent State University http://etna.math.kent.edu SYLVESTER’S LAW OF INERTIA FOR NONLINEAR EIGENPROBLEMS 91 Taking advantage of the minmax characterization of the eigenvalues of Q̃(λ) := −Q(−λ) in J˜ := J+ with the Rayleigh functional p̃ := −p+ , we obtain the following maxmin characterization λ+ 2n+1−j = max min dim V =j, V ∩D+ 6=∅ x∈V ∩D+ p+ (x), j = 1, 2, . . . . of all eigenvalues of Q in J+ . Hence, for σ < δ+ and for σ > δ− , we obtain slicing results for the spectrum of Q(·) from Theorem 3.2. If In(Q(σ)) = (np , nn , nz ) and σ < δ+ , then there exist nn eigenvalues of Q(·) in (−∞, σ), and if σ ∈ (δ− , 0), then there are nn eigenvalues in (σ, 0). However, δ+ and δ− are usually not known. The following theorem contains upper bounds of δ− and lower bounds of δ+ , thus yielding subintervals of (−∞, δ+ ) and (δ− , 0) where the above slicing applies. T HEOREM 4.9. Let A, B, C ∈ Cn×n be positive definite, and let p+ and p− be defined in (4.2). Then it holds that (i) s xH Cx ≤ δ+ = inf{p+ (x) : p+ (x) ∈ R} δ̃+ := − max H x6=0 x Ax and δ− = sup{p− (x) : p− (x) ∈ R} ≤ − s min x6=0 xH Cx =: δ̃− . xH Ax (ii) δ̂+ := −2 max x6=0 xH Cx ≤ δ+ xH Bx and δ− ≤ −2 min x6=0 xH Cx =: δ̂− . xH Bx Proof: (i): The value δ̃+ is a lower bound of δ+ if for every x 6= 0 such that p+ (x) ∈ R it holds that p+ (x) ≥ δ̃+ . The following proof takes advantage of the facts that p+ (x) < 0 ∂ f (p+ (x); x) ≥ 0. and ∂λ The equation f (p+ (x); x) = xH Q(p+ (x))x = 0 is satisfied if and only if xH Bx = −p+ (x)xH Ax − 1 xH Cx. p+ (x) Hence, ∂ 1 f (p+ (x); x) = 2p+ (x)xH Ax + xH Bx = p+ (x)xH Ax − xH Cx ≥ 0 ∂λ p+ (x) if and only if xH Cx , p+ (x)2 ≤ H x Ax i.e., δ+ ≥ − s max x6=0 and analogously we obtain δ− ≤ − s min x6=0 xH Cx = δ̃− . xH Ax xH Cx = δ̃+ , xH Ax ETNA Kent State University http://etna.math.kent.edu 92 ALEKSANDRA KOSTIĆ AND HEINRICH VOSS (ii): Solving f (p+ (x); x) = 0 for xH Ax, one gets from p+ (x) ≥ −2 ∂ ∂λ f (p+ (x); x) ≥ 0 that xH Cx xH Cx , i.e., δ+ ≥ −2 max H = δ̂+ , H x6=0 x Bx x Bx and analogously δ− ≤ −2 min x6=0 xH Cx = δ̂− . xH Bx From Theorem 3.2 we obtain the following slicing of the spectrum of Q(·). T HEOREM 4.10. Let A, B, and C be positive definite matrices, and for some σ ∈ R let In(Q(σ)) = (np , nn , nz ). (i) If ) ( s xH Cx xH Cx , , −2 max H σ ≤ max − max H x6=0 x Ax x6=0 x Bx then there exist nn eigenvalues of Q(λ)x = 0 in (−∞, σ). (ii) If ) ( s xH Cx xH Cx , , −2 min H σ ≥ min − min H x6=0 x Bx x6=0 x Ax then there exist nn eigenvalues of Q(λ)x = 0 in (σ, 0). E XAMPLE 4.11. In a numerical experiment, the matrices A, B, and C were generated by the following MATLAB statements: randn(’state’,0); A=eye(20); B=randn(20);B=B’*B; C=randn(20);C=C’*C;. It was found that Q(λ)x = 0 has 26 real eigenvalues, 13 in the domain of p− and 13 in the domainp of p+ . So, Sylvester’s theorem can be appliedpto all of them. 12 eigenvalues are less than − max(λ(C, A)) and 6 eigenvalues exceed − min(λ(C, A)). Conclusions. We have considered a given family of Hermitian matrices T (λ) depending continuously on a parameter λ in an open real interval J which allows for a variational characterization of its eigenvalues. We proved slicing results for the spectrum of T (·), where at first general nonlinear eigenvalue problems are considered, which are then specialized to various types of quadratic eigenproblems. REFERENCES [1] C. C ONCA , J. P LANCHARD , AND M. VANNINATHAN, Fluid and Periodic Structures, vol. 38 of Research in Applied Mathematics, Masson, Paris, 1995. [2] R. C OURANT, Über die Eigenwerte bei den Differentialgleichungen der mathematischen Physik, Math. Z., 7 (1920), pp. 1–57. [3] J. D EMMEL, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997. [4] R. D UFFIN, A minimax theory for overdamped networks, J. Rational Mech. Anal., 4 (1955), pp. 221–233. [5] E. F ISCHER, Über quadratische Formen mit reellen Koeffizienten, Monatsh. Math. Phys., 16 (1905), pp. 234– 249. [6] G. G OLUB AND C. VAN L OAN, Matrix Computations, 3rd ed., John Hopkins University Press, Baltimore, 1996. [7] C.-H. G UO , N. H IGHAM , AND F. T ISSEUR, Detecting and solving hyperbolic quadratic eigenvalue problems, SIAM J. Matrix Anal. Appl., 30 (2009), pp. 1593–1613. ETNA Kent State University http://etna.math.kent.edu SYLVESTER’S LAW OF INERTIA FOR NONLINEAR EIGENPROBLEMS [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] 93 , An improved arc algorithm for detecting definite Hermitian pairs, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 1131–1151. C. H ARRIS AND A. P IERSOL, Harris’ Shock and Vibration Handbook, 5th ed., McGraw-Hill, New York, 2002. N. H IGHAM , D. M ACKEY, AND F. T ISSEUR, Definite matrix polynomials and their linearization by definite pencils, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 478–502. R. H ORN AND C. J OHNSON, Matrix Analysis, Cambridge University Press, Cambridge, 1985. A. M ARKUS, Introduction to the Spectral Theory of Polynomial Operator Pencils, vol. 71 of Translations of Mathematical Monographs, AMS, Providence, 1988. A. N EUMAIER, Introduction to Numerical Analysis, Cambridge University Press, Cambridge, 2001. V. N IENDORF AND H. VOSS, Detecting hyperbolic and definite matrix polynomials, Linear Algebra Appl., 432 (2010), pp. 1017–1035. B. PARLETT, The Symmetric Eigenvalue Problem, SIAM, Philadelphia, 1998. H. P OINCAR É, Sur les equations aux dérivées partielles de la physique mathématique, Amer. J. Math., 12 (1890), pp. 211–294. E. ROGERS, A minimax theory for overdamped systems, Arch. Rational Mech. Anal., 16 (1964), pp. 89–96. M. S TAMMBERGER AND H. VOSS, On an unsymmetric eigenvalue problem governing free vibrations of fluid-solid structures, Electr. Trans. Numer. Anal., 36 (2010), pp. 113–125. http://etna.mcs.kent.edu/vol.36.2009-2010/pp113-125.dir J. S YLVESTER, A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by orthogonal substitutions to the form of a sum of positive and negative squares, Philosophical Magazin, 4 (1852), pp. 138–142. H. VOSS, A minmax principle for nonlinear eigenproblems depending continuously on the eigenparameter, Numer. Linear Algebra Appl., 16 (2009), pp. 899–913. H. VOSS AND B. W ERNER, A minimax principle for nonlinear eigenvalue problems with applications to nonoverdamped systems, Math. Methods Appl. Sci., 4 (1982), pp. 415–424.