Applications of Quadratic Differential Forms

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

by

Ishan K. Pendharkar (Roll number: 01407004)

Supervisor: Prof. Harish K. Pillai

Department of Electrical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, 400076

2005

Dissertation Approval Sheet

The dissertation "Applications of Quadratic Differential Forms" by Ishan Pendharkar is approved for the degree of Doctor of Philosophy.

Examiners: .................... .................... ....................
Supervisor: ....................
Chairman: ....................
Place: ..............  Date: ..............

Indian Institute of Technology Bombay
Certificate of Course Work

This is to certify that Ishan Pendharkar was admitted to the candidacy of the Ph.D. degree in January 2002 after successfully completing all the courses required for the Ph.D. degree programme. The details of the course work done are given below:

Sr.  Course No.  Course name                          Credits
1    EE 678      Wavelets                             6.00
2    EE 698      Special topics in Electrical Engg.   6.00
3    EES 801     Ph.D seminar                         4.00

IIT Bombay
Date: ........                                        Deputy Registrar

Acknowledgment

The work presented in this thesis is the outcome of about three years (May 2002-July 2005) of research that I have carried out in the Department of Electrical Engineering, IIT Bombay under the supervision of Dr. Harish K. Pillai. I am grateful to Dr. Pillai for giving me an opportunity of working under his supervision. Dr. Pillai has been a source of constant support and motivation during the course of my research. All along my association with him, Dr. Pillai has been extremely patient with me. I thank him for all the help and support.

I also thank Dr. Paolo Rapisarda, who is currently with the Department of Electrical and Computer Engineering, University of Southampton, UK, for suggesting interesting problems. I have enjoyed working with Dr. Rapisarda.

I gratefully acknowledge receiving help and guidance from several people at IIT Bombay. I thank Dr. Madhu Belur for many helpful discussions and suggestions. I am grateful to my masters' adviser Prof. V.R. Sule, and to my instructors at IIT Bombay: Prof. S.D. Agashe, Prof. Shiva Shankar and Prof. M.C. Srisailam, for their efforts. The several friends I made during the course of my Ph.D have been a great help from time to time. I thank Amit Kalele, Dr. Mashuq-un-Nabi and Priyadarshanam for their help and support. I also thank all masters' students and project staff at the Control and Computing laboratory for providing great company and making my workplace lively.

I would never have managed to do research and write this thesis but for the unflinching support and encouragement of my parents, and my wife Mitra. I thank them for their patience, and for the many big and small sacrifices they have made for my sake.

IIT Bombay, Mumbai. Ganesh Chaturthi, Saka 1927 (7 September, 2005).

Abstract

Quadratic functionals are commonly encountered in systems and control theory, both as a means of describing and as a means of analysing dynamical systems. Quadratic filters and bilinear time series models can be thought of as examples of the former. The latter use of quadratic functionals is more popular and more widely understood; it is based upon our deep and intuitive association of energy- or power-like quantities with quadratic functionals. The thesis titled "Applications of Quadratic Differential Forms" is a study of quadratic functionals and their applications in systems and control theory.
The quadratic functionals in question are "Quadratic Differential Forms", or QDFs. Using QDFs I have investigated different areas in systems and control theory in search of a unifying theme that binds these areas together. Specifically, I have addressed problems in the following areas:

Dissipative systems: A parametrization of systems that are dissipative with respect to supply functions defined by quadratic differential forms has been obtained.

KYP lemma: A generalization of the Kalman-Yakubovich-Popov (KYP) lemma has been obtained for systems that are dissipative with respect to supply functions defined by QDFs.

Absolute stability criteria: Using QDFs, absolute stability criteria have been obtained for a large class of nonlinearities.

Polynomial J-spectral factorization: Using the algebra of QDFs, a new algorithm has been developed for J-spectral factorization of polynomial matrices.

H∞ control: A new characterization of all solutions to the H∞ problem has been obtained. The characterization is in terms of LTI dynamical systems that are dissipative with respect to a "special" supply function defined by a QDF.

Modelling of data with bilinear and quadratic differential forms: An iterative algorithm has been developed for computing all bilinear differential forms that model a given set of data.

Nevanlinna-Pick interpolation: A characterization has been obtained of all rational functions that satisfy, along with given interpolation conditions, a "frequency dependent norm" condition.

Contents

1 Introduction 1
  1.1 Preview of the thesis 2
    1.1.1 Notation 3
  1.2 Quadratic Differential Forms 4
    1.2.1 Quadratic and Bilinear forms 4
    1.2.2 Representations of Quadratic Differential Forms 5
    1.2.3 Factorization of Quadratic Differential Forms 8
    1.2.4 Point-wise non-negative Quadratic Differential Forms 8
  1.3 Conclusion 9

2 Behavioral theory of dynamical systems 11
  2.1 Dynamical systems 11
  2.2 Linear differential systems 12
  2.3 The space of trajectories 14
  2.4 Latent variables and their elimination 15
  2.5 Equivalent representations 18
  2.6 Controllability and Observability 20
  2.7 Autonomous systems 23
  2.8 State representation 25
  2.9 Inputs and Outputs 27

3 A parametrization for dissipative systems 31
  3.1 Introduction 31
  3.2 Dissipativity in the Behavioral setting 32
  3.3 An equivalence relation on supply functions 35
  3.4 SISO dissipative systems 36
  3.5 MIMO dissipative systems: the constant inertia case 43
    3.5.1 Supply functions defined by constant matrices 43
    3.5.2 Supply functions defined by polynomial matrices 45
  3.6 MIMO dissipative systems: the general inertia case 46
    3.6.1 Parametrizing a set of Φ-dissipative behaviors using split sums 51
  3.7 Conclusion 54

4 KYP lemma and its extensions 55
  4.1 Introduction 55
  4.2 Storage functions for dissipative systems 56
  4.3 Classical KYP lemma in terms of storage functions 60
  4.4 Generalization of KYP lemma 61
    4.4.1 Generalization with respect to QDFs 63
  4.5 Strict versions of the KYP lemma 67
  4.6 Special case: KYP lemma for SISO systems 70
  4.7 Conclusion 74

5 Designing linear controllers for nonlinearities 75
  5.1 Introduction 75
  5.2 Preliminaries 76
  5.3 Control as an interconnection 80
  5.4 Nonlinear systems: problem formulation 80
  5.5 Constructing stabilizing controllers 82
    5.5.1 A recipe to obtain stabilizing behaviors for all nonlinearities in a given family 82
    5.5.2 Stability results 83
    5.5.3 A characterization of stabilizing controllers 84
  5.6 The Circle criterion 85
  5.7 Classical Popov Criterion 86
  5.8 Slope restricted nonlinearities 89
  5.9 Nonlinearities with memory 90
  5.10 Conclusion 92

6 Polynomial J-spectral factorization 93
  6.1 Introduction 93
  6.2 Σ-unitary modeling of dualized data 94
    6.2.1 Modeling vector-exponential time series with behaviors 94
    6.2.2 Data dualization, semi-simplicity, and the Pick matrix 96
    6.2.3 A procedure for Σ-unitary modeling 96
  6.3 J-spectral factorization via Σ-unitary modeling 98
  6.4 Numerical Aspects of the Algorithm 106
    6.4.1 Symmetric Canonical Factorization 106
    6.4.2 Computing singularities 108
    6.4.3 Implementation of iterations 109
    6.4.4 Computer implementation of polynomial J-spectral factorization 109
  6.5 Examples 110
  6.6 Conclusion 116
7 Synthesis of dissipative systems 117
  7.1 Introduction 117
  7.2 Problem formulation 118
  7.3 A Solution to the synthesis problem 123
  7.4 A characterization of all solutions of the synthesis problem 126
  7.5 Conclusion 127

8 Modeling of data with bilinear differential forms 129
  8.1 Introduction 129
  8.2 The problem statement 130
  8.3 A recursive algorithm for interpolating with BDFs 131
  8.4 Examples and applications 134
    8.4.1 Interpolation with BDFs 134
    8.4.2 Application 1: Interpolation with scalar bivariate polynomials 135
    8.4.3 Application 2: Storage functions 136
  8.5 Conclusion 140

9 Nevanlinna-Pick interpolation 141
  9.1 Introduction 141
  9.2 Nevanlinna Pick interpolation – the standard case 142
    9.2.1 Dualizing of the data 144
  9.3 System theoretic implications of dualizing the data 145
  9.4 Nevanlinna-Pick problem with frequency dependent norms 148
  9.5 Conclusion 150

10 Conclusion and future work 151
  10.1 Summary of results 151
  10.2 Directions for further work 153

References 155

A Notation 163

Chapter 1
Introduction

In this thesis we study quadratic forms and their relationships and applications vis-a-vis systems and control. Quadratic forms have been studied at length in Physics, Mathematics and Engineering. Consider the following well known examples:

1. The kinetic energy of a mass m moving with a speed v is (1/2)mv², a quadratic in v.

2. The power supplied to an electrical circuit is voltage × current, a bilinear expression in voltage and current.

3. The energy stored in a capacitor with capacitance C and voltage V across it is (1/2)CV². Likewise, the energy stored in an inductor with inductance L and current I flowing through it is (1/2)LI².

4. The power supplied to a particle that is acted upon by a force F and has a speed v is given by F·v.

5. The energy stored in a mechanical spring with a spring constant K and a displacement x from the equilibrium is given by (1/2)Kx².

The idea of associating quadratic functionals with quantities representing power or energy has been deeply ingrained in the human mind.
Is this deep-rooted association merely a habit of the human mind? Or is there something deeply subtle in nature owing to which "energy-like" natural phenomena reveal themselves to us in such a way that a quadratic approximation is often the best one? Whatever be the case, a study of quadratic functionals and their relationship with dynamical systems is a very important and interesting area of research. The action of quadratic forms on dynamical systems can be studied from two viewpoints:

1. Quadratic forms that are useful in describing a dynamical system.

2. Quadratic forms that are helpful in the design and analysis of a dynamical system.

We first consider the use of quadratic forms in describing systems. Describing a dynamical system means searching for a law that adequately explains its behavior. All laws (or models) that have been found by physicists and engineers over the past several centuries are approximations of observed natural phenomena. We generally want laws that describe a system "sufficiently" accurately and are at the same time not too difficult to handle analytically. The study of linear models has received great attention because of their simplicity. However, not all systems can be adequately explained by linear laws. The next best, in terms of simplicity, to a linear law is a quadratic law. A large class of systems that cannot be adequately explained by a linear law can be explained by a quadratic law. Thus, quadratic forms can serve as models for dynamical systems that cannot be adequately described by linear laws.

The use of quadratic forms in the design and analysis of dynamical systems is more popular and better understood than their use in modeling. This use stems from the deep and intuitive association of a quadratic form with energy and power, as we have seen in the several examples given above. A quadratic form can be associated with a dynamical system by studying how the quadratic form changes along trajectories of the dynamical system. This association has led to the development of an interesting area of research called "dissipative systems", which concerns systems in which the net "generalized power" supplied is non-negative.

Due to our intuitive association of quadratic forms with energy and power, quadratic forms have been used to formulate an energy-based theory for dynamical systems. We know from the law of conservation of energy that an isolated system that loses energy with time must eventually come to rest. Systems theorists, starting with A.M. Lyapunov, have generalized this idea to construct energy-like functionals, now called Lyapunov functions, to examine the stability of isolated systems. Quadratic forms are often good candidates for Lyapunov functions. Hence, quadratic forms can be used to examine stability.

Another important use of quadratic forms is in optimization problems. In many control problems, physically meaningful cost functionals are quadratic. This leads to the subject of "Linear Quadratic" optimal control, where one wants to constrain a linear system in such a way that a cost specified by a quadratic form is minimized along trajectories of the system.

Thus, we see that a quadratic form is an immensely useful tool for studying dynamical systems. While opinions may differ as to why a quadratic form is almost omnipresent in systems theory, no one can deny its importance across several diverse disciplines in systems and control theory. This thesis is a study of quadratic forms, albeit in a slightly different setting.
We study what are called Quadratic Differential Forms (QDFs). We show the importance and versatility of QDFs by considering their applications across a broad spectrum of systems and control theory. These include: dissipative systems, absolute stability criteria for nonlinear systems, quadratic modeling, Nevanlinna-Pick interpolation, polynomial matrix factorization and robust controller design.

1.1 Preview of the thesis

This thesis is organized as follows: in the remaining pages of this chapter, we introduce Quadratic Differential Forms (QDFs), including notation and some elementary properties. We use QDFs in tandem with a recent development in mathematical systems theory called the behavioral theory of dynamical systems. In Chapter 2, we introduce some basic concepts from behavioral systems theory. Using QDFs along with behavioral theoretic ideas, we investigate problems in several different areas in systems and control theory and try to show the existence of a common thread that binds all these areas together. We now give a summary of the results presented in Chapters 3 to 9 of this thesis.

1. Chapter 3 is titled "A parametrization for dissipative systems". We address the problem of how to construct all linear, time-invariant (LTI) dynamical systems that are dissipative with respect to a "generalized power" defined using a QDF. We examine different cases in increasing order of complexity and show that, under reasonable assumptions, one can obtain a complete parametrization of all LTI dissipative systems.

2. Chapter 4, titled "KYP lemma and its extensions", is a continuation of Chapter 3. Dissipative systems have certain associated functionals called storage functions. In this chapter, we address the question of obtaining conditions for the existence of positive definite storage functions for dissipative systems. The results presented in this chapter are a generalization of the well known Kalman-Yakubovich-Popov lemma, which gives conditions for the existence of positive definite storage functions for passive systems. The results that we obtain here are representation free.

3. In Chapter 5 we consider the absolute stability problem: given a family of nonlinearities, obtain a class of linear time-invariant systems such that any system from this class, when interconnected with any nonlinearity from the given family, yields a stable system. We use results obtained in Chapters 3 and 4 to construct Lyapunov functions for nonlinear systems.

4. Chapter 6, titled "Polynomial J-spectral factorization", is about a novel algorithm for obtaining a factorization of polynomial matrices called "polynomial J-spectral factorization". The algorithm is guaranteed to yield a factorization (if one exists) in finitely many steps. The algorithm has been found to have good numerical properties.

5. Chapter 7 is titled "Synthesis of dissipative systems". Here, we address what is commonly known as the "H∞ control problem". We obtain a novel characterization of all solutions to the H∞ problem using the idea of parametrization discussed in Chapters 3 and 4.

6. In Chapter 8 we consider the problem of exact modeling of data with bilinear and quadratic differential forms. We obtain a recursive algorithm for modeling. We address two applications of the modeling scheme: computation of storage functions for autonomous systems, and scalar interpolation with bivariate polynomials.

7. Chapter 9 is about a behavioral view of the Nevanlinna-Pick interpolation problem.
We use QDFs to address this classical problem. We also address a generalization of the Nevanlinna-Pick problem, where we obtain interpolating rational functions that in addition satisfy a frequency-weighted norm constraint, in contrast to the classical problem, where this norm is frequency independent.

1.1.1 Notation

The following notation is used throughout the thesis. The fields of real and complex numbers are denoted by R and C respectively. Integers are denoted by Z. Abstract vector spaces are denoted by V. Rm (respectively Cm) denotes the set of m-dimensional column vectors over R (respectively C). Rm×p (respectively Cm×p) denotes the set of m × p matrices over R (respectively C). The ring of polynomials in ξ is denoted by R[ξ] or C[ξ], depending on the field. Polynomial matrices with m rows and p columns are denoted by Rm×p[ξ] or Cm×p[ξ], depending on the field. Rw×• and Rw×•[ξ] denote the sets of (real constant or polynomial) matrices having w rows. Likewise, R•×p and R•×p[ξ] denote (real constant or polynomial) matrices with p columns; analogously C•×p and C•×p[ξ]. Given a linear operator K : V1 → V2, where V1, V2 are vector spaces, the kernel of K is Ker K := {v ∈ V1 such that Kv = 0}. Likewise, the image of K is Im K := {w ∈ V2 such that ∃v ∈ V1 satisfying w = Kv}. Given w ∈ Z, by w! we mean the factorial of w. By nCw we mean n!/(w!·(n−w)!).

1.2 Quadratic Differential Forms

1.2.1 Quadratic and Bilinear forms

We start with an introduction to quadratic and bilinear forms, which are special cases of what we define as "quadratic differential forms". All material presented in this section is standard and can be found in numerous textbooks on matrices, [22] for instance.

Definition 1.2.1 A bilinear form on a pair of vector spaces (V1, V2) over the field F is a map ℓ : V1 × V2 → F which is linear in both of its arguments.

Being linear in both arguments, ℓ defines, for every v2 ∈ V2, a linear map ℓ_v2 on V1 given by ℓ_v2(v1) = ℓ(v1, v2). Notice that ℓ_v2 is an element of V1*, the dual space of V1. The following properties of ℓ are a consequence of bilinearity:

1. ℓ(0, v2) = 0 for all v2 ∈ V2.

2. ℓ(v1 + v1′, v2) = ℓ(v1, v2) + ℓ(v1′, v2) for v1, v1′ ∈ V1 and v2 ∈ V2.

3. ℓ(kv1, v2) = k·ℓ(v1, v2) with k ∈ F.

A special case of interest is when V1 = V2 = Rn and F = R. In this case, ℓ is said to define a quadratic form. A quadratic form is a homogeneous polynomial of second degree in n variables x1, x2, ..., xn and has a representation Σ_{i,k=1}^n a_ik x_i x_k with a_ik = a_ki, i = 1, ..., n (Gantmacher, [22] page 294). Another important special case is when V1 = V2 = Cn. The forms arising in this case (together with a notion of symmetry) are called hermitian. A hermitian form is an expression of the form A(x, x) = Σ_{i,j=1}^n a_ij x_i* x_j with a_ij = a_ji*.

A quadratic or hermitian form can be studied by studying the matrix A = [a_ik]_{i,k=1}^n. Thus, a quadratic form defined by a matrix A = Aᵀ can be written in a compact way as xᵀAx, where x = [x1 x2 ... xn]ᵀ. Similarly, a hermitian form defined by a matrix A = A* can be written in a compact form as x*Ax.

One of the most important properties of quadratic and hermitian forms is diagonalization: consider the hermitian form x*Ax defined by the hermitian matrix A. Then there exists a nonsingular transformation of the variables x_i, given by x_i = Σ_{j=1}^n t_ij y_j, such that

x*Ax = Σ_{i=1}^{σ+(A)} y_i* y_i − Σ_{i=σ+(A)+1}^{σ+(A)+σ−(A)} y_i* y_i
This diagonalization can be expressed conveniently in terms of the matrix A and the matrix T = [t_ij]_{i,j=1}^n that defines the transformation: consider the matrix Ã = T*AT. Then y*Ãy = x*Ax with Ã = diag[I_{σ+(A)}, −I_{σ−(A)}, 0_{n−σ+(A)−σ−(A)}]. Sylvester's law of inertia ([22] page 296) states that the numbers σ+(A) and σ−(A) do not depend on the congruence transformation used and are intrinsic to the hermitian form defined by A. It can be shown that σ+(A) coincides with the number of positive eigenvalues of A, and σ−(A) coincides with the number of negative eigenvalues of A.

In many problems, hermitian forms that are positive definite are important. A hermitian form defined by a matrix A = A* is said to be positive definite (respectively positive semidefinite) if x*Ax > 0 for all nonzero x (respectively x*Ax ≥ 0 for all x). The definitions of positive definiteness and positive semidefiniteness for quadratic forms are analogous. The quadratic or hermitian form defined by A is positive semidefinite if and only if σ−(A) = 0, and positive definite if and only if σ−(A) = 0 and A is a nonsingular matrix. A compact notation for denoting that A defines a positive definite (respectively positive semidefinite) quadratic or hermitian form is A > 0 (respectively A ≥ 0).

We now define some well known terms concerning the spectra of matrices [22]:

Definition 1.2.2 Let A = A* ∈ Cw×w. Let σ+(A), σ−(A) and σ0(A) denote respectively the number of positive, negative and zero eigenvalues of A. Then,

1. The non-negative integer |σ+(A) − σ−(A)| is called the signature of A.

2. The three-tuple of non-negative integers (σ+(A), σ−(A), σ0(A)) is called the inertia of A.
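Since the inertia is determined by the eigenvalues, it can be computed directly from them. The following small sketch (our own illustration, not part of the development above, assuming NumPy is available; the helper name inertia is ours) computes the inertia and signature of a Hermitian matrix as in Definition 1.2.2:

    import numpy as np

    def inertia(A, tol=1e-12):
        """Inertia of a Hermitian matrix A (Definition 1.2.2): the numbers of
        positive, negative and zero eigenvalues, respectively."""
        eigs = np.linalg.eigvalsh(A)          # real eigenvalues, ascending order
        plus = int(np.sum(eigs > tol))
        minus = int(np.sum(eigs < -tol))
        return plus, minus, len(eigs) - plus - minus

    A = np.array([[0.0, 1.0],
                  [1.0, 0.0]])                 # eigenvalues -1 and +1
    print(inertia(A))                          # (1, 1, 0); signature |1 - 1| = 0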
1.2.2 Representations of Quadratic Differential Forms

In Section 1.2.1 we reviewed quadratic and bilinear forms. These are functions on real or complex Euclidean vector spaces. However, in many system theoretic problems one commonly encounters bilinear and quadratic forms that are functions of the system variables and also of their derivatives. For example, a Lagrangian is a function of generalized positions q and velocities q̇; the power supplied to a mechanical system is F·q̇, where F is the force and q the position. Such bilinear and quadratic functionals that also contain expressions involving derivatives can be defined using a bilinear/quadratic differential form (B/QDF). The concepts that we present in the remaining parts of this chapter are standard introductory material [103].

Consider a finite number of matrices Φ_kl ∈ Rw1×w2, k, l = 0, 1, ..., n. Let C∞(R, R•) denote the space of infinitely many times differentiable functions from R to R•. Let w1 ∈ C∞(R, Rw1) and w2 ∈ C∞(R, Rw2). Consider the following expression involving w1, w2 and their derivatives:

Σ_{k,l=0}^n ((d^k/dt^k) w1)ᵀ Φ_kl ((d^l/dt^l) w2)    (1.1)

The expression in (1.1) is bilinear in w1 and w2 and is a map from C∞(R, Rw1) × C∞(R, Rw2) to C∞(R, R). Expressions like (1.1) can be conveniently represented by using the following notation: define a bivariate polynomial matrix

Φ(ζ, η) = Σ_{k,l=0}^n Φ_kl ζ^k η^l    (1.2)

Then Φ(ζ, η) can be associated with the bilinear expression (1.1) by associating the monomial Φ_kl ζ^k η^l in (1.2) with the term ((d^k/dt^k) w1)ᵀ Φ_kl ((d^l/dt^l) w2) in (1.1). In the sequel, the set of all w1 × w2 bivariate polynomial matrices in ζ, η with real coefficients will be denoted by Rw1×w2[ζ, η].

Definition 1.2.3 The matrix Φ(ζ, η) ∈ Rw1×w2[ζ, η] is said to define a bilinear differential form (BDF) L_Φ, which is the map

L_Φ : C∞(R, Rw1) × C∞(R, Rw2) → C∞(R, R)

defined by L_Φ(w1, w2) = Σ_{k,l} ((d^k/dt^k) w1)ᵀ Φ_kl ((d^l/dt^l) w2) for w1 ∈ C∞(R, Rw1) and w2 ∈ C∞(R, Rw2).

The special case when the dimensions w1 = w2 = w and the trajectories w1 = w2 = w is more interesting. Under these conditions, the bilinear differential form L_Φ(w, w) is called a quadratic differential form:

Definition 1.2.4 The matrix Φ(ζ, η) ∈ Rw×w[ζ, η] is said to define a quadratic differential form (QDF) Q_Φ, which is the map

Q_Φ : C∞(R, Rw) → C∞(R, R)

defined by Q_Φ(w) = L_Φ(w, w) = Σ_{k,l} ((d^k/dt^k) w)ᵀ Φ_kl ((d^l/dt^l) w) for w ∈ C∞(R, Rw).

Example 1.2.5 Let Φ1(ζ, η) = ζ + η + ζη. Consider ℓ1, ℓ2 ∈ C∞(R, R). Then L_Φ1(ℓ1, ℓ2) = (dℓ1/dt)·ℓ2 + ℓ1·(dℓ2/dt) + (dℓ1/dt)·(dℓ2/dt). When ℓ1 = ℓ2 = ℓ, Q_Φ1(ℓ) = 2ℓ·(dℓ/dt) + (dℓ/dt)².

Example 1.2.6 Let

Φ2(ζ, η) = [ζ η; 1 0]

Let ℓ = (ℓ1, ℓ2)ᵀ ∈ C∞(R, R²). Then

Q_Φ2(ℓ) = (dℓ1/dt)·ℓ1 + ℓ1·(dℓ2/dt) + ℓ2·ℓ1

Using algebraic properties of bivariate polynomial matrices, a calculus can be built for QDFs/BDFs. Using this calculus, properties of QDFs/BDFs can be translated into equivalent properties of bivariate polynomial matrices. An instance of this calculus is the asterisk operator ⋆, defined as follows:

⋆ : Rw1×w2[ζ, η] → Rw2×w1[ζ, η];  Φ⋆(ζ, η) = Φᵀ(η, ζ)

where ᵀ denotes the (usual) matrix transposition. Clearly, L_Φ(w, v) = L_{Φ⋆}(v, w). Bivariate polynomial matrices that satisfy Φ(ζ, η) = Φ⋆(ζ, η) are called symmetric. Clearly, a necessary condition for Φ(ζ, η) to be symmetric is that it be square.

Notation 1.2.7 The set of symmetric w × w bivariate polynomial matrices in ζ, η will be denoted throughout this thesis by Rw×w_s[ζ, η].

Example 1.2.8 Consider Φ1(ζ, η) = ζ + η + ζη from Example 1.2.5: Φ1(ζ, η) is symmetric. Consider Φ2(ζ, η) = [ζ η; 1 0] from Example 1.2.6: Φ2(ζ, η) is not symmetric.

Since L_Φ(w, v) = L_{Φ⋆}(v, w), it follows that Q_Φ(w) = Q_{Φ⋆}(w) = Q_{(Φ+Φ⋆)/2}(w).

Example 1.2.9 Consider Q_Φ2(ℓ) as defined in Example 1.2.6: (dℓ1/dt)·ℓ1 + ℓ1·(dℓ2/dt) + ℓ2·ℓ1. Compute

Φ(ζ, η) = (Φ2(ζ, η) + Φ2ᵀ(η, ζ))/2 = (1/2) [ζ+η 1+η; 1+ζ 0]

Then Q_Φ(ℓ) = (1/2)(2ℓ1·(dℓ1/dt) + ℓ1·ℓ2 + ℓ1·(dℓ2/dt) + ℓ2·ℓ1 + (dℓ2/dt)·ℓ1), which is precisely Q_Φ2(ℓ).

Therefore, for the purpose of studying QDFs, one may assume without loss of generality that the QDF is defined by a symmetric bivariate polynomial matrix. Henceforth, we assume that the QDF Q_Φ is defined by Φ(ζ, η) = Φᵀ(η, ζ) unless otherwise mentioned.

Another instance of the calculus for QDFs/BDFs is differentiation. Clearly, if L_Φ is a BDF, then so is (d/dt)L_Φ. The result of this differentiation can be elegantly represented in terms of two-variable polynomial matrices and leads to the • operator, defined as

• : Rw1×w2[ζ, η] → Rw1×w2[ζ, η];  Φ̇(ζ, η) := (ζ + η)Φ(ζ, η)

It can be seen that

(d/dt) L_Φ(w1, w2) = L_{Φ̇}(w1, w2)  and  (d/dt) Q_Φ(w) = Q_{Φ̇}(w)

Example 1.2.10 Consider Q_Φ1(ℓ) from Example 1.2.5: 2ℓ·(dℓ/dt) + (dℓ/dt)². Then

(d/dt) Q_Φ1(ℓ) = 2ℓ·(d²ℓ/dt²) + 2(dℓ/dt)² + 2(dℓ/dt)·(d²ℓ/dt²)

Now consider the two-variable polynomial matrix

Ψ1(ζ, η) = (ζ + η)Φ1(ζ, η) = ζ² + 2ζη + η² + ζ²η + ζη²

It is easy to see that

(d/dt) Q_Φ1(ℓ) = Q_Ψ1(ℓ) = 2[ℓ·(d²ℓ/dt²) + (dℓ/dt)² + (dℓ/dt)·(d²ℓ/dt²)]
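The correspondence between the two-variable polynomial calculus and operations on QDFs is easy to verify symbolically. The sketch below (our own scalar-case illustration, assuming SymPy is available; the helper qdf is our name, not standard notation) checks the derivative rule (d/dt)Q_Φ = Q_Φ̇ on the data of Example 1.2.10:

    import sympy as sp

    t, ze, et = sp.symbols('t zeta eta')

    def qdf(Phi, w):
        """Evaluate the scalar QDF Q_Phi(w): each monomial c*zeta**k*eta**l of
        Phi(zeta, eta) contributes c * (d^k w/dt^k) * (d^l w/dt^l)."""
        return sum(c * sp.diff(w, t, k) * sp.diff(w, t, l)
                   for (k, l), c in sp.Poly(Phi, ze, et).terms())

    w = sp.Function('w')(t)
    Phi1 = ze + et + ze*et                  # Example 1.2.5
    Psi1 = sp.expand((ze + et)*Phi1)        # the "dot" operator applied to Phi1

    # d/dt Q_Phi1(w) agrees identically with Q_Psi1(w):
    print(sp.simplify(sp.diff(qdf(Phi1, w), t) - qdf(Psi1, w)))   # 0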
Given Φ(ζ, η) ∈ Rw×w_s[ζ, η], consider the matrix Φ(−ξ, ξ) obtained from Φ(ζ, η) by substituting ζ = −ξ and η = ξ. Then Φ(−ξ, ξ) is what is called a "para-Hermitian" matrix:

Definition 1.2.11 A matrix Z(ξ) ∈ Rw×w[ξ] is called para-Hermitian if Z(ξ) = Zᵀ(−ξ).

Para-Hermitian matrices arise in many problems in systems and control theory and in signal processing. We will encounter para-Hermitian matrices again in Chapter 6, where we study factorizations of these matrices.

1.2.3 Factorization of Quadratic Differential Forms

Analogous to the reduction of a quadratic form to a sum of squares, which we reviewed in Section 1.2.1, we would like to express a QDF in some "canonical" form so that it is easier to manipulate. Consider a two-variable polynomial matrix Φ(ζ, η) ∈ Rw×w[ζ, η]. We can associate the following (infinite) coefficient matrix with Φ(ζ, η):

Φ̃ = [Φ00 Φ01 ...; Φ10 Φ11 ...; ... Φ_kl ...]    (1.3)

It is easy to verify that

Φ(ζ, η) = [I Iζ ... Iζ^k ...] Φ̃ [I; Iη; ...; Iη^l; ...]    (1.4)

Note that only finitely many rows and columns of Φ̃ are non-zero. Since Q_Φ is symmetric, Φ̃ is a symmetric matrix, i.e. Φ_kl = Φ_lkᵀ for all k, l. We have reviewed Sylvester's law of inertia in Section 1.2.1, which says that the quadratic form defined by Φ̃ can be expressed as a sum and difference of squares. Hence there exist matrices T, Λ such that Φ̃ = TᵀΛT, where Λ = diag[I_{σ+(Φ̃)}, −I_{σ−(Φ̃)}]. Consequently, Φ(ζ, η) admits the factorization

Φ(ζ, η) = Mᵀ(ζ) Λ M(η)

If T is chosen to be surjective, then Λ is unique modulo a congruence transformation. The QDF Q_Φ can now be written as

Q_Φ(w) = [P(d/dt)w]ᵀ [P(d/dt)w] − [N(d/dt)w]ᵀ [N(d/dt)w]

where P(ξ), N(ξ) are obtained by partitioning the rows of M(ξ) conformally with the blocks I_{σ+(Φ̃)} and −I_{σ−(Φ̃)} in Λ. Such a factorization of Q_Φ is known as a symmetric canonical factorization. We now define what we mean by observability of a QDF:

Definition 1.2.12 A QDF Q_Φ is called observable if the matrix M in any symmetric canonical factorization Φ(ζ, η) = Mᵀ(ζ) Λ M(η) is such that M(λ) has full column rank for all λ ∈ C.

1.2.4 Point-wise non-negative Quadratic Differential Forms

Sign definite QDFs are of interest in a large number of problems: synthesis of passive systems, construction of Lyapunov functions, solution of interpolation problems, etc. A QDF Q_Φ is called positive semidefinite if Q_Φ(w)(t) ≥ 0 for all w ∈ C∞(R, Rw) and all t ∈ R. A positive semidefinite QDF is called positive definite if, in addition, Q_Φ(w) = 0 if and only if w = 0.

Example 1.2.13 Let Φ3(ζ, η) = 1 + ζη. Then Q_Φ3(ℓ) = ℓ² + (dℓ/dt)². Q_Φ3(ℓ) is non-negative for all ℓ, and is zero if and only if ℓ = 0. Therefore, Q_Φ3 is a positive definite QDF. Let Φ4(ζ, η) = 1 + ζ + η + ζη. Then Q_Φ4(ℓ) = ℓ² + 2ℓ·(dℓ/dt) + (dℓ/dt)². We may simplify this expression to obtain Q_Φ4(ℓ) = (ℓ + dℓ/dt)². Clearly, Q_Φ4(ℓ) is non-negative for all ℓ and is zero if and only if ℓ = c·e^{−t}, c ∈ R. Hence, the QDF Q_Φ4 is positive semidefinite.

One can easily check whether a QDF Q_Φ is positive semidefinite by means of a symmetric canonical factorization. Proceed by first writing Φ(ζ, η) as in equation (1.4). If Φ̃ is positive (semi)definite, Q_Φ is clearly positive (semi)definite. We now prove the converse: if Φ̃ is not positive semidefinite, there exists a vector v ∈ R• such that vᵀΦ̃v < 0. One can find w ∈ C∞(R, Rw) and t ∈ R such that the stacked vector of derivatives [(d^i/dt^i)w]_{i=0,1,...}(t) equals v. Then clearly Q_Φ(w)(t) < 0. Therefore, Q_Φ cannot be positive semidefinite if Φ̃ is not positive semidefinite. Hence:

Corollary 1.2.14 For a QDF Q_Φ to be positive semidefinite, it is necessary and sufficient that the truncated coefficient matrix Φ̃′, obtained by retaining only the maximum rank sub-matrix of Φ̃ defined in (1.3), be positive semidefinite.

Example 1.2.15 In Example 1.2.13, consider the matrix Φ4(ζ, η) = 1 + ζ + η + ζη. It is easy to see that in this case

Φ̃4 = [1 1; 1 1]

which is clearly positive semidefinite. Hence Q_Φ4 is positive semidefinite.
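Numerically, the test of Corollary 1.2.14 is an eigenvalue check on the truncated coefficient matrix, and an eigendecomposition of that matrix also yields a symmetric canonical factorization. A small sketch (our own illustration, assuming NumPy is available) for the two scalar QDFs of Example 1.2.13:

    import numpy as np

    # Truncated coefficient matrices (1.3); row/column i corresponds to d^i/dt^i:
    Phi3 = np.array([[1.0, 0.0],   # Phi3(zeta, eta) = 1 + zeta*eta
                     [0.0, 1.0]])
    Phi4 = np.array([[1.0, 1.0],   # Phi4(zeta, eta) = 1 + zeta + eta + zeta*eta
                     [1.0, 1.0]])

    for name, M in (("Phi3", Phi3), ("Phi4", Phi4)):
        eigs = np.linalg.eigvalsh(M)
        print(name, eigs, "positive semidefinite:", bool(eigs.min() >= -1e-12))

    # Phi4 has eigenvalues (0, 2): Phi4 = v v^T with v = [1, 1], so M(xi) = 1 + xi,
    # Lambda = [1], and the canonical factorization reads Q_Phi4(l) = (l + dl/dt)^2.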
1.3 Conclusion

In this chapter we have reviewed elementary material on quadratic and hermitian forms. We have defined quadratic and bilinear differential forms and have obtained representations for them using two-variable polynomial matrices. The use of two-variable polynomial matrices lets us define a calculus for QDFs, with the help of which we have obtained an expression for the derivative of a QDF. We have addressed a "symmetric canonical factorization" of a QDF, which will be useful in many problems. We have also defined the notions of sign-definiteness and semidefiniteness of QDFs and have obtained conditions under which a given QDF is positive definite or positive semidefinite.

Chapter 2
Behavioral theory of dynamical systems

In this chapter, we introduce basic concepts that form the backbone of the behavioral approach to dynamical systems. We try to show that the strength of the behavioral approach comes from its formal setting, which makes the treatment of dynamical systems fairly general. Properties of a dynamical system (for example linearity and time/shift invariance) can be defined in this formal setting without recourse to a definition involving a specific model of the dynamical system. We consider the so-called differential systems, i.e. those dynamical systems that can be modeled by differential equations. Though in this thesis we also consider nonlinear differential systems, in the present chapter we only consider linear differential systems, since the chapter is intended to be of an introductory nature. We review various kinds of models (representations) for linear differential systems and show how so-called latent variables arise naturally. The important aspect of elimination is also considered. We review important properties of a linear differential system, such as controllability and observability. We then consider autonomous dynamical systems and define stability for these systems. Further, we define the notion of inputs and outputs, and some invariants associated with a dynamical system.

2.1 Dynamical systems

The starting point of our study is the notion of a dynamical system. When modeling a system, we are trying to describe the way in which the variables of the system evolve. Let w denote a vector whose components are the system variables. We stipulate that w takes values in W, the signal space. Usually w itself is a function of an independent variable, usually called time, which, we stipulate, takes values in a set called the time axis T. Let W^T denote the set of maps from T to W; then w ∈ W^T. Not every map w is allowed by the laws that govern the system's evolution. The set of maps that are allowed by the system is precisely the object of our study, and is called the behavior of the system. The laws that govern the system bring about a restriction of W^T to the behavior of the system.
This leads to the following definition of a dynamical system, given by Willems [101]:

Definition 2.1.1 A dynamical system Σ is a triple Σ = (T, W, B) with T an indexing set (the time axis), W the signal space and B ⊆ W^T called the behavior of Σ.

When T and W are clear from the context, or have been explicitly specified, we will, for ease of notation, use the terms "behavior" and "dynamical system" interchangeably, since there is little scope for ambiguity. Thus, when we say that T = R, W = Rw and W^T is the set of infinitely many times differentiable functions from R to Rw (denoted C∞(R, Rw)), by a behavior B we mean B ⊆ C∞(R, Rw). In order to define Σ completely, we need to specify the behavior B. This is usually done with a representation of B. A representation is typically obtained from equations defining the laws obeyed by Σ. We now address some properties of dynamical systems. In this chapter we only consider dynamical systems that are

1. linear,
2. time-invariant,
3. described by ordinary differential equations.

In the context of continuous time dynamical systems, T can be taken to be R and W to be Rw.

Definition 2.1.2 A dynamical system Σ = (R, Rw, B) is called linear if for all w1, w2 ∈ B and α1, α2 ∈ R, α1w1 + α2w2 ∈ B.

Linearity of a behavior B ⊂ C∞(R, Rw) amounts to B being a vector space over R and is equivalent to Σ obeying the superposition principle. The property of time-invariance can be formulated by stipulating that the laws that describe a behavior are independent of time (time invariant):

Definition 2.1.3 A dynamical system Σ = (R, Rw, B) is called time-invariant if for all w(t) ∈ B and τ ∈ R, w(t + τ) ∈ B.

2.2 Linear differential systems

We now consider linear systems that are defined by constant-coefficient ordinary differential equations (ODEs). Assume that we have m constant-coefficient ODEs in the Rw-valued variable w, and that we are interested in solutions (in some suitable function space) w : R → Rw of the m equations:

R0 w + R1 (d/dt)w + ... + Rn (d^n/dt^n)w = 0    (2.1)

where Ri ∈ Rm×w, i = 0, ..., n. We introduce the polynomial matrix R(ξ) := R0 + R1ξ + ... + Rnξ^n. A concise way of writing the m equations in (2.1) is R(d/dt)w = 0. Suppose we are interested in C∞ solutions of (2.1). Then we may define the solution set (the behavior B) as:

B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0}    (2.2)

Thus, R(d/dt) is a differential operator from C∞(R, Rw) to C∞(R, Rm), and B is the kernel of R(d/dt). Hence, the representation of B in equation (2.2) is called a kernel representation. Since R(d/dt) is a linear operator, B is linear. Since the coefficients of R(ξ) (i.e. the matrices Ri, i = 0, ..., n in (2.1)) do not explicitly depend on time, B is time-invariant. We have assumed that every trajectory in B is infinitely many times differentiable. This is more for the sake of mathematical convenience. However, with appropriate modifications, different function spaces can also be used. For example in Chapter 5, where we also address nonlinear systems, we will consider behaviors in the space of locally integrable functions. Throughout this thesis we denote the set of linear differential systems with w variables (with solutions in a pre-defined function space, say C∞) by Lw (the "L" stands for linear).
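The action of the operator R(d/dt) is straightforward to realize symbolically. The following sketch (our own illustration, assuming SymPy is available; apply_operator is our helper name) applies a toy kernel representation to a trajectory and confirms that the trajectory belongs to the behavior:

    import sympy as sp

    t, xi = sp.symbols('t xi')

    def apply_operator(R, w):
        """Apply R(d/dt) to the vector trajectory w(t): a monomial c*xi**k in an
        entry of the polynomial matrix R acts as c times the k-th derivative."""
        out = sp.zeros(R.rows, 1)
        for i in range(R.rows):
            for j in range(R.cols):
                for (k,), c in sp.Poly(R[i, j], xi).terms():
                    out[i] += c * sp.diff(w[j], t, k)
        return sp.simplify(out)

    # Toy kernel representation R(xi) = [xi + 1, -1]: the behavior consists of
    # all pairs (w1, w2) with w2 = dw1/dt + w1.
    R = sp.Matrix([[xi + 1, -1]])
    w = sp.Matrix([sp.sin(t), sp.cos(t) + sp.sin(t)])
    print(apply_operator(R, w))   # Matrix([[0]]): w lies in the behavior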
While a kernel representation defines the behavior uniquely, the converse is not true: the same behavior can be defined by a number of different kernel representations. Therefore, in the behavioral approach, the behavior of a system takes precedence over any specific representation of the system. Hence, we make a distinction between the behavior, defined as the solution set of a system of equations, and the system of equations itself. We consider the question of when two representations describe the same behavior in Section 2.5 below.

Figure 2.1: A simple RC circuit

Example 2.2.1 Consider the RC circuit shown in Figure 2.1. Assume that we are interested in the admissible voltages V and currents I, i.e. the pairs (V, I) that respect the laws defined by the circuit. It is clear that the time axis T in this case is R, and the signal space W is R². The use of Kirchhoff's voltage and current laws tells us that only those V and I are admissible that satisfy the ODE

C (dV/dt) + V/R2 = R1C (dI/dt) + (1 + R1/R2) I

Define the matrix

R(ξ) = [Cξ + 1/R2  −R1Cξ − (1 + R1/R2)]

The behavior B of the RC circuit can then be defined as

B = {(V, I) ∈ C∞(R, R²) such that R(d/dt) [V; I] = 0}

2.3 The space of trajectories

In the previous section we considered behaviors that have smooth (C∞) trajectories. However, in this thesis behaviors in several other function spaces will be encountered. We now summarize the notation for the function spaces encountered in this thesis:

• C∞(R, Rw): the space of infinitely many times differentiable functions from R to Rw.

• D(R, Rw): the space of compactly supported C∞ functions from R to Rw.

• D′(R, Rw): the dual of the space of compactly supported C∞ functions, whose elements are called Rw-valued distributions on R.

• L1^loc(R, Rw): the space of all locally integrable functions from R to Rw, i.e. all w : R → Rw such that ∫_a^b |w(t)| dt < ∞ for all a, b ∈ R.

Consider a behavior B defined as {w | R(d/dt)w = 0} with R(ξ) = R0 + R1ξ + ... + Rnξ^n. For all the derivatives appearing in this equation to make sense classically, it is enough for w to be n times differentiable. A solution w of R(d/dt)w = 0 is called a strong solution if it is at least n times differentiable. In particular, all C∞ solutions of R(d/dt)w = 0 are strong solutions.

The disadvantage of working with the function space C∞ is that it excludes many important trajectories that are commonly used in systems and control theory, for example steps, impulses and so on. Also, in nonlinear systems (Chapter 5), trajectories are in general not smooth, due to the presence of, for example, relays. This motivates us to consider the space L1^loc. We assume that differentiation of L1^loc functions is in the sense of distributions, as follows.

Let φ ∈ D(R, Rw) and w ∈ C∞(R, Rw). We define ⟨w, φ⟩ = ∫_{−∞}^{∞} wᵀ(t)φ(t) dt. Using integration by parts:

⟨(d/dt)w, φ⟩ = ∫_{−∞}^{∞} ((d/dt)w)ᵀ φ dt = −∫_{−∞}^{∞} wᵀ (d/dt)φ dt = ⟨w, −(d/dt)φ⟩    (2.3)

By using integration by parts repeatedly, one obtains ⟨(d^i/dt^i)w, φ⟩ = ⟨w, (−1)^i (d^i/dt^i)φ⟩.
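A quick numerical sanity check of (2.3) (our own illustration, assuming NumPy is available), with a smooth bump as the test function φ:

    import numpy as np

    t = np.linspace(-1.0, 1.0, 200001)
    s = np.clip(t, -0.999999, 0.999999)
    phi = np.exp(-1.0/(1.0 - s**2))   # smooth bump, numerically compactly supported
    w = np.sin(3.0*t)

    def integral(f):                  # trapezoidal rule on the grid t
        return float(np.sum(0.5*(f[1:] + f[:-1])*np.diff(t)))

    # <dw/dt, phi> equals <w, -dphi/dt>, as integration by parts (2.3) predicts:
    print(integral(np.gradient(w, t)*phi), integral(-w*np.gradient(phi, t)))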
Now assume that w satisfies the system of equations R(d/dt)w = 0 with R(ξ) = R0 + R1ξ + ... + Rnξ^n, Ri ∈ Rm×w, i = 0, ..., n. Then

R(−ξ) = R0 − R1ξ + ... + (−1)^i Riξ^i + ... + (−1)^n Rnξ^n

Consider the differential operator Rᵀ(−d/dt) and a trajectory ψ ∈ D(R, Rm). Along the lines of equation (2.3), it can be seen that ⟨R(d/dt)w, ψ⟩ = ⟨w, Rᵀ(−d/dt)ψ⟩ for all ψ ∈ D(R, Rm). Hence,

R(d/dt)w = 0  ⟺  ⟨w, Rᵀ(−d/dt)ψ⟩ = 0 for all ψ ∈ D(R, Rm)    (2.4)

We had assumed w ∈ C∞(R, Rw); however, the same argument holds for any strong solution (i.e. a solution that is at least n times differentiable). Property (2.4) of strong solutions is used to define weak solutions:

Definition 2.3.1 We call w ∈ L1^loc(R, Rw) a weak solution of R(d/dt)w = 0, R(ξ) ∈ Rm×w[ξ], if for all ψ ∈ D(R, Rm) we have ⟨w, Rᵀ(−d/dt)ψ⟩ = 0.

Thus, when considering functions that are locally integrable, the equation R(d/dt)w = 0 is understood to hold in a distributional sense. With this brief review of weak solutions of differential equations, we return to various modeling issues for dynamical systems and show how auxiliary ("latent") variables are naturally introduced in any systematic modeling procedure.

2.4 Latent variables and their elimination

Most systems that we encounter during modeling are made up of smaller, simpler subsystems that are interconnected to each other via their terminals. A systematic procedure for modeling a system is to first model every subsystem and then use the interconnection relations to build a model of the entire system. This procedure is called modeling by tearing and zooming. As a result of this systematic procedure, we invariably obtain a set of equations with additional variables, called latent variables. The latent variables are different from the variables we are really interested in, which we call the manifest variables. While we have defined a dynamical system in Definition 2.1.1 using just manifest variables, it is also possible to define the dynamical system using both latent and manifest variables. Such a definition gives rise to what is called a "full behavior":

Definition 2.4.1 A dynamical system with latent variables is a quadruple ΣL = (T, W, L, Bfull), where T is the time axis, W is the space of manifest variables, L is the space of latent variables and Bfull ⊆ (W × L)^T, i.e. Bfull is a set of maps from T to W × L, called the "full behavior".

A representation of a behavior that involves both manifest as well as latent variables is called a latent variable representation, or a hybrid representation. A latent variable representation is very simple to obtain and generally does not require any "post-processing", unlike the kernel representation we have seen above, or the image representation which we will encounter later. A latent variable representation of a behavior with manifest variables w has the form

R(d/dt)w = M(d/dt)ℓ

where ℓ is a free (possibly vector-valued) latent variable, and R(ξ), M(ξ) are polynomial matrices of the appropriate dimensions. Consider the projection operator Π : (W × L)^T → W^T defined by Π(w, ℓ) := w. Then the behavior B := ΠBfull is called the manifest behavior induced by Bfull. Assuming Bfull to be linear and time-invariant, we address the following questions about B:

1. Can the manifest behavior B, associated with Bfull, be described as the solution set of a system of linear differential equations?

2. If B can be described as the solution set of a system of differential equations, does it inherit properties of Bfull such as linearity and time-invariance?

Whether B is a linear differential system depends on the function space under consideration.
In the important case when Bfull is a C ∞ behavior, the manifest behavior B 16 2 Behavioral theory of dynamical systems can be expressed as the solution set of a system of linear differential equations. This is the consequence of the all important elimination theorem: Theorem 2.4.2 Let Bfull := {(w, `) ⊆ C ∞ (R, Rw )×C ∞ (R, Rl )} ∈ Lw+l . Consider the behavior B := {w ∈ C ∞ (R, Rw ) such that ∃` ∈ C ∞ (R, Rl ) with (w, `) ∈ Bfull } Then, B ∈ Lw . Thus, if a C ∞ -behavior Bfull ∈ Lw+l =⇒ B := ΠBfull ∈ Lw . The elimination theorem has important consequences in the context of modeling. During the process of modeling we need to introduce additional variables that come up naturally. As a consequence of the elimination theorem, latent variables are not a problem since they can always be eliminated, provided we restrict to C ∞ trajectories. Since we will be considering Lloc trajectories in Chapter 5, it is important to elaborate 1 loc on elimination in L1 . The elimination theorem may not hold in Lloc 1 . If it does, the latent variables ` are called properly eliminable [74]. We use the following method ([75], Section 6.2) for elimination in Lloc 1 . Consider w+l Bfull = {(w, `) ∈ Lloc )|R( 1 (R, R d d )w = M ( )`} dt dt Define B = ΠBfull : w B = {w ∈ Lloc 1 (R, R )| ∃` such that (w, `) ∈ Bfull } In general there may not exist a polynomial matrix R0 (ξ) ∈ R•×w [ξ] that induces a kernel representation for B; however, the closure of B does admit a kernel representation. Define B0full = {(w 0 , `) ∈ C ∞ (R, Rw+l )|R( d d 0 )w = M ( )`} dt dt Clearly, B0full ⊆ Bfull and in fact contains all C ∞ trajectories in Bfull . By elimination theorem, there exists a matrix R0 (ξ) such that R0 ( dtd )w 0 (t) = 0. We use R0 (ξ) to define a behavior B00 as follows: w 0 d B00 = {w ∈ Lloc )w = 0} 1 (R, R )|R ( dt It has been discussed in [75], section 6.2 that B00 is the closure of B in the topology of Lloc 1 . We now explain the concepts behind elimination using the circuit in Example 2.2.1. Example 2.4.3 We refer to the circuit in Figure 2.1, which has been redrawn in Figure 2.2 to emphasize the “tearing and zooming” approach to modeling. As before, R1 , R2 and C denote the values of the two resistors and the capacitance of the capacitor. We are interested in the “behavior” of V and I– the port voltage and the port current respectively. Proceeding from first principles we introduce some latent variables to model the circuit. These could be 1. Currents i1 and i2 in the capacitor branch and the resistor branch respectively. 2. Potentials v1 and v2 at the terminals of the resistance R1 . 2.4. Latent variables and their elimination 17 R1 v1 v2 v4 v3 I V i1 R2 C g i2 g=0 g Figure 2.2: The “tearing and zooming approach” to modeling 3. Potential v3 at one terminal of the capacitance C (in order to reduce the number of variables, we assume that the other terminal of the capacitor is at ground potential 0, but such an assumption is not necessary). 4. Potential v4 at one terminal of the resistance R2 . Again, we assume that the other terminal of R2 is at ground potential. The subsystems R1 , R2 and C satisfy the following equations: 1. v2 − v1 = IR1 . 2. i2 = v4 /R2 . 3. d v dt 3 = i1 /C. The subsystems are interconnected in such a way that the following constraints are imposed: v2 = v 3 = v 4 , I = i 1 + i 2 , v1 = V . The seven equations that we have just written define the “full behavior” which includes our variables of interest V, I and variables that we introduced during the course of modeling. 
Remark 2.4.4 The simple example that we have considered illustrates the general ideas behind the tearing and zooming approach to modeling. Of course, in this simple example, defining just the two latent variables i1 and i2 would probably be enough for someone familiar with how to identify equipotential terminals in the circuit. One particularly clever way of choosing latent variables in this case is to define a single latent variable as the voltage across the capacitor. We will see in Example 2.8.3 that the equations take a particularly simple and appealing form using this latent variable. However, such simplifications are the result of insight into the nature of the problem and are therefore not part of a systematic method for modeling (done, for example, with the help of a computer). Therefore, the number of latent variables introduced in the course of modeling may vary depending on, among other things, the experience of the modeler. Consequently, given Bfull (which presumes a particular choice of latent variables) it makes sense to ask what B is; the converse question, however, is meaningless in general since, as we have seen, Bfull is highly non-unique and depends on the number of latent variables a modeler may choose to add in the course of modeling. Having said that Bfull is highly non-unique, we must nevertheless add that there are some special "full behaviors" associated with a given behavior B that are of immense practical and theoretical significance: these are the state space representations of B, which we review in Section 2.8.

2.5 Equivalent representations

We have discussed kernel and latent variable representations for behaviors. We have also addressed when a latent variable representation can be converted into a kernel representation. A given behavior may be represented by more than one representation. An example is the following:

Example 2.5.1 The behavior (V, I) of port voltage and port current in Example 2.2.1 may also be given by

V/C + R2 (d/dt)V = R1R2 (d/dt)I + ((R1 + R2)/C) I

Arguably, the representations obtained in Example 2.5.1 and Example 2.2.1 are easy to relate, since they only involve a scaling (by a constant). However, technically speaking, these are different representations of the same behavior. In more complicated situations, however, it may not be so easy to relate two representations of a behavior. The following result ([75], Section 3.6) can be used to address the issue of equivalence of two representations:
Proposition 2.5.2 Consider two C∞ behaviors B1, B2 ∈ Lw represented as the kernels of the differential operators R1(d/dt) and R2(d/dt) respectively. Then B1 ⊆ B2 if and only if there exists a matrix F(ξ) ∈ R•×•[ξ] such that F(ξ)R1(ξ) = R2(ξ).

The above proposition is quite intuitive: consider the behaviors B1, B2 defined as Ker R1(d/dt) and Ker R2(d/dt) respectively. Clearly, if F(ξ)R1(ξ) = R2(ξ), then B1 ⊆ B2, since for every w ∈ B1 we have R1(d/dt)w = 0 by definition, and therefore R2(d/dt)w = 0. Using Proposition 2.5.2 we can easily obtain conditions for two representations to describe the same behavior: two behaviors B1 and B2 are equivalent (i.e. B1 = B2) if and only if B1 ⊆ B2 and B2 ⊆ B1. Therefore, in terms of representations, B1 = B2 if and only if there exist matrices F1(ξ), F2(ξ) ∈ R•×•[ξ] such that F1(ξ)R1(ξ) = R2(ξ) and R1(ξ) = F2(ξ)R2(ξ).

Consider a behavior B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0}. The system of equations R(d/dt)w = 0 may have redundancies of the following nature:

1. Some equations may be identically zero.

2. Some equations may be obtained by an algebraic combination of other equations.

3. A subset of the equations may not be independent.

Such redundant equations can be removed without affecting the behavior. Kernel representations of a behavior that do not contain any redundant equations are called minimal kernel representations. In the sequel, we need the notions of a minor and of the rank of a polynomial matrix, which we define below:

Definition 2.5.3 Let R(ξ) ∈ Rw1×w2[ξ]. Denote the elements of R(ξ) by r_{p,q}, where p = 1, ..., w1 and q = 1, ..., w2. Let {i1, ..., ik} ⊆ {1, ..., w1} and {j1, ..., jk} ⊆ {1, ..., w2}. Define mk(R) = det[r_{ip,jq}]_{p,q=1}^k. mk(R) is called a k-th order minor of R(ξ). Clearly, R(ξ) ∈ Rw1×w2[ξ] has w1Ck · w2Ck minors of order k, for k ≤ w1, w2.

Definition 2.5.4 Let R(ξ) ∈ Rw1×w2[ξ]. The rank of R(ξ), denoted by Rank R(ξ), is the largest k such that some k-th order minor of R(ξ) is not the zero polynomial.

Example 2.5.5 Consider the following polynomial matrices:

R1(ξ) = [1 ξ 1+ξ; 0 0 1],  R2(ξ) = [1 ξ; ξ ξ²; ξ+1 ξ(ξ+1)],  R3(ξ) = [1 0 0; 0 1 0; 0 1 0]

Then Rank R1(ξ) = 2, since det [ξ 1+ξ; 0 1] is nonzero. Rank R2(ξ) = 1, since every 2 × 2 minor of R2(ξ) is zero. Similarly, Rank R3(ξ) = 2, since the 3 × 3 minor is zero while there exist nonzero 2 × 2 minors, for example det [1 0; 0 1].

We now return to the question of when a kernel representation is minimal. The following proposition ([75], Theorem 3.6.4) addresses this issue:

Proposition 2.5.6 Let B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0}. Then R(ξ) induces a minimal kernel representation of B if and only if R(ξ) has full row rank.

Example 2.5.7 For the RC circuit in Examples 2.2.1 and 2.5.1, the representations of the behavior of port voltage and port current (V, I) defined by

[C (d/dt) + 1/R2  −R1C (d/dt) − (1 + R1/R2)] [V; I] = 0    (2.8)

and

[1/C + R2 (d/dt)  −R1R2 (d/dt) − (R1 + R2)/C] [V; I] = 0    (2.9)

are minimal kernel representations. However, the kernel representation

[C (d/dt) + 1/R2    −R1C (d/dt) − (1 + R1/R2)] [V]
[1/C + R2 (d/dt)  −R1R2 (d/dt) − (R1 + R2)/C ] [I] = 0

is not a minimal representation, since the second equation is obtained by scaling the first by R2/C.

We now address two important concepts in systems theory: controllability and observability.
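Definition 2.5.4 translates directly into a (naive) computation over minors. The sketch below (our own helper, assuming SymPy is available; polynomial_rank is our name) checks the rank claimed for R2(ξ) in Example 2.5.5:

    import sympy as sp
    from itertools import combinations

    xi = sp.symbols('xi')

    def polynomial_rank(R):
        """Rank of a polynomial matrix (Definition 2.5.4): the largest k for
        which some k x k minor is not the zero polynomial."""
        m, n = R.shape
        for k in range(min(m, n), 0, -1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), k):
                    if sp.expand(R[list(rows), list(cols)].det()) != 0:
                        return k
        return 0

    R2 = sp.Matrix([[1, xi], [xi, xi**2], [xi + 1, xi*(xi + 1)]])
    print(polynomial_rank(R2))   # 1: every 2 x 2 minor vanishes identically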
We now address two important concepts in systems theory: controllability and observability.

2.6 Controllability and Observability

Controllability plays a central role in systems and control. This intuitive notion was given a strong foundation when it was introduced and formalized for state space systems by Kalman in 1960. Consider the state space system
\[
\frac{d}{dt}x = Ax + Bu
\]
with $A \in R^{n\times n}$ and $B \in R^{n\times m}$. The $R^n$-valued variables $x$ are called states. This system is called state controllable if for every $x_0, x_1 \in R^n$ there exist some $\tau > 0$ and some $u_1 : R \to R^m$ such that the solution to the above differential equation with $u = u_1$ and $x(0) = x_0$ satisfies $x(\tau) = x_1$. This definition of controllability has been the starting point for many important developments in systems theory.

We now provide the behavioral definition of controllability. In the behavioral approach, controllability is an intrinsic property of the behavior, i.e., controllability is a property of the trajectories allowed by a system. Controllability of a behavior is akin to the ability to steer from any one trajectory in the behavior to every other, using some trajectory in the behavior.

Definition 2.6.1 The system $\Sigma = (R, R^w, B)$ is called controllable if for all $w_1, w_2 \in B$ there exist $\tau \ge 0$ and $w \in B$ such that $w(t) = w_1(t)$ for $t < 0$ and $w(t + \tau) = w_2(t)$ for $t \ge 0$.

A characterization of representations of controllable systems is important. Controllable behaviors turn out to be exactly those that admit a special representation called an image representation, defined as follows: the latent variable representation $w = M(\frac{d}{dt})\ell$ with $\ell$ free (for example $\ell \in C^\infty(R, R^l)$) is called an image representation of the behavior $B := \{w \in C^\infty(R, R^w) \mid \exists\, \ell \in C^\infty(R, R^l) \text{ such that } w = M(\frac{d}{dt})\ell\}$. The following important result from [75], Theorem 5.2.10, gives a characterization of representations of a controllable behavior:

Proposition 2.6.2 Let $B = \{w \in C^\infty(R, R^w) \mid R(\frac{d}{dt})w = 0\}$. Then the following statements are equivalent:
1. $B$ is controllable.
2. $\mathrm{Rank}\, R(\lambda) = \mathrm{Rank}\, R(\xi)$ for all $\lambda \in C$.
3. There exists $M(\xi) \in R^{w\times\bullet}[\xi]$ such that $B = \{w \mid w = M(\frac{d}{dt})\ell,\ \ell \in C^\infty(R, R^\bullet)\}$.

Notation 2.6.3 The family of controllable linear differential systems with $w$ manifest variables will be denoted by $L^w_{con}$.

Another crucial property of systems is "observability". Observability is not an intrinsic property of a behavior, since it depends on a chosen partition of the system variables. It relates to whether one can infer some (specified) variables in a given system uniquely by measuring some other (specified) variables. Let us consider a behavior $B$ with trajectories $w = [w_1\ w_2]^T$ with $w_1 \in W_1^T$ and $w_2 \in W_2^T$. We define when the variable $w_2$ is observable from $w_1$ as follows:

Definition 2.6.4 Let $\Sigma = (T, W_1\times W_2, B)$ be a dynamical system with trajectories $w = [w_1\ w_2]^T$, where $w_1$ (respectively $w_2$) takes values in $W_1$ (respectively $W_2$). Then $w_2$ is said to be observable from $w_1$ if
\[
(w_1, w_2') \in B \ \text{ and } \ (w_1, w_2'') \in B \implies w_2' = w_2''
\]
Consider a controllable behavior $B$ defined by an image representation $w = M(\frac{d}{dt})\ell$, and consider the full behavior $B_{\mathrm{full}} := \{(w, \ell) \mid w = M(\frac{d}{dt})\ell\}$. One can then ask whether the latent variables are observable from the manifest variables:
\[
(w, \ell_1) \in B_{\mathrm{full}} \ \text{ and } \ (w, \ell_2) \in B_{\mathrm{full}} \overset{?}{\implies} \ell_1 = \ell_2
\]
Since we are only considering linear behaviors, one may infer the above implication by only considering observability with $w = 0$:
\[
(0, \ell) \in B_{\mathrm{full}} \overset{?}{\implies} \ell = 0
\]
If the two (equivalent) properties hold, we say that the controllable behavior is defined by an observable image representation.
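Condition 2 of Proposition 2.6.2 (and, in the same way, the right-primeness test for observability discussed next) amounts to asking whether a polynomial matrix keeps full rank at every complex number. A minimal sketch of this check, ours and not from the thesis, under the assumption that $R(\xi)$ already has full row rank:

```python
# Sketch: R(lambda) has full row rank for every complex lambda iff the
# maximal (row-sized) minors of R(xi) have no common root, i.e. their
# greatest common divisor is a nonzero constant.
import itertools
import functools
import sympy as sp

xi = sp.symbols('xi')

def full_rank_for_all_lambda(R):
    rows, cols = R.shape
    minors = [R[:, list(c)].det()
              for c in itertools.combinations(range(cols), rows)]
    g = functools.reduce(sp.gcd, minors)
    return sp.Poly(g, xi).degree() <= 0     # constant gcd: no common root

# [xi + 1, -(xi + 2)]: entries coprime, so the kernel behavior is controllable
print(full_rank_for_all_lambda(sp.Matrix([[xi + 1, -(xi + 2)]])))   # True
# [xi + 1, -(xi + 1)]: rank drops at lambda = -1
print(full_rank_for_all_lambda(sp.Matrix([[xi + 1, -(xi + 1)]])))   # False
```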
Notice that since latent variables are introduced by us, the modelers, and are not intrinsic to the system, an image representation can be assumed to be observable without loss of generality. Given $w = M(\frac{d}{dt})\ell$, the differential operator $M(\frac{d}{dt})$ defines an observable image representation if and only if $M(\xi)$ is a right-prime matrix, i.e. if $M(\xi) = M'(\xi)U(\xi)$ with $U(\xi)$ square, then $\det U(\xi) = \text{constant} \ne 0$. Further, by defining appropriate latent variables, an image representation can be assumed to be induced by a full column rank polynomial matrix without loss of generality. We demonstrate the concepts of controllability and observability with the help of some examples:

Example 2.6.5 With reference to the RC circuit in Figure 2.1, consider the kernel representation of the behavior in Example 2.2.1 with variables $V$, the port voltage, and $I$, the port current:
\[
\begin{bmatrix} C\frac{d}{dt} + \frac{1}{R_2} & \ -R_1C\frac{d}{dt} - (1 + \frac{R_1}{R_2}) \end{bmatrix}
\begin{bmatrix} V \\ I \end{bmatrix} = 0
\]
Let $p(\xi) = C\xi + 1/R_2$ and $q(\xi) = R_1C\xi + 1 + R_1/R_2$. Let us investigate when $p(\xi)$ and $q(\xi)$ have a common root. There is a common root if and only if
\[
\xi = -\frac{1}{R_2C} = -\frac{R_1+R_2}{R_1R_2C}
\]
which is not satisfied for any nonzero $R_1, R_2, C$. Therefore the port behavior $(V, I)$ is controllable, since the matrix $[p(\xi)\ \ {-q(\xi)}]$ that induces the kernel representation has full row rank for every $\xi \in C$, which is also equal to the rank of $[p(\xi)\ \ {-q(\xi)}]$. Observe that
\[
[p(\xi)\ \ {-q(\xi)}]\begin{bmatrix} q(\xi) \\ p(\xi) \end{bmatrix} = 0
\]
Therefore, an image representation for the port behavior is
\[
\begin{bmatrix} V \\ I \end{bmatrix} = \begin{bmatrix} R_1C\frac{d}{dt} + 1 + R_1/R_2 \\ C\frac{d}{dt} + 1/R_2 \end{bmatrix}\ell
\]
Since we have shown that $p(\xi), q(\xi)$ are coprime for all nonzero $R_1, R_2, C$, the image representation is observable: $V = 0$ and $I = 0$ if and only if $\ell = 0$.

We now demonstrate uncontrollability of dynamical systems with an example:

Example 2.6.6 Consider the AC bridge circuit shown in Figure 2.3. It can easily be shown that the bridge is "balanced" (i.e. nodes "A" and "B" are equipotential) if and only if
\[
R_1R_2 = \frac{L_1}{C_1}
\]
We introduce as latent variables the currents $I_1$ and $I_2$. We then get the following equations:
\[
C_1R_1\frac{d}{dt}V = R_1I_1 + L_1\frac{d}{dt}I_1 \tag{2.10}
\]
\[
V = R_1I_2 + C_1R_1R_2\frac{d}{dt}I_2 \tag{2.11}
\]
Adding the two equations and substituting $C_1R_1R_2 = L_1$, we get
\[
\begin{bmatrix} C_1R_1\frac{d}{dt} + 1 & \ -R_1 - L_1\frac{d}{dt} \end{bmatrix}
\begin{bmatrix} V \\ I \end{bmatrix} = 0
\]

[Figure 2.3: An AC bridge circuit]

Let $p(\xi) = C_1R_1\xi + 1$ and $q(\xi) = R_1 + L_1\xi$. The polynomials $p(\xi), q(\xi)$ are not coprime when
\[
\frac{1}{R_1C_1} = \frac{R_1}{L_1},
\]
i.e., when $R_1 = \sqrt{L_1/C_1}$. Under this condition the behavior $(V, I)$ of the port variables is not controllable, since the matrix $[p(\xi)\ \ {-q(\xi)}]$ is zero when $\xi = -1/\sqrt{L_1C_1}$. Assume for the sake of simplicity that $R_1 = C_1 = R_2 = L_1 = 1$. Then the port behavior is given by the kernel representation
\[
\Big(\frac{d}{dt} + 1\Big)\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} V \\ I \end{bmatrix} = 0
\]
The controllable part of the behavior acts like a pure unit resistance: $V = I$. If one of these is taken to be free, the other gets determined. Now consider the case when $V = c_1e^{-t}$, $I = c_2e^{-t}$, where $c_1, c_2$ are determined by the initial conditions. It is easy to see that these $V, I$ are admissible trajectories in the port behavior. However, $c_1$ and $c_2$ may now be arbitrary. In particular, the trajectory with $c_1 = 0$, $c_2 \ne 0$ cannot be "patched" with a trajectory with, say, $V = I = 1$.
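The loss of coprimeness in Example 2.6.6 can also be detected without computing roots, via the resultant of $p$ and $q$. The following sketch is our illustration, assuming the polynomials above:

```python
# Sketch: the resultant of p(xi) = C1*R1*xi + 1 and q(xi) = R1 + L1*xi
# vanishes exactly when p and q share a root, recovering the condition
# R1 = sqrt(L1/C1) under which the port behavior is uncontrollable.
import sympy as sp

xi = sp.symbols('xi')
R1, L1, C1 = sp.symbols('R1 L1 C1', positive=True)

p = C1*R1*xi + 1
q = R1 + L1*xi
res = sp.resultant(p, q, xi)
print(sp.expand(res))                 # C1*R1**2 - L1
print(sp.solve(sp.Eq(res, 0), R1))    # [sqrt(L1)/sqrt(C1)], i.e. R1 = sqrt(L1/C1)
```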
We have considered controllable systems, in which there is complete freedom to steer from one trajectory to any other trajectory. The other end of the spectrum consists of systems with no freedom at all to steer among trajectories. Such systems are called autonomous.

2.7 Autonomous systems

The evolution of an autonomous system depends only on the initial conditions:

Definition 2.7.1 A time-invariant dynamical system $\Sigma = (R, W, B)$ is called autonomous if for all $w_1, w_2 \in B$
\[
w_1(t) = w_2(t) \ \text{ for } t \le 0 \implies w_1 = w_2
\]
The above definition says that in autonomous systems, the future of every trajectory is entirely determined by its past. Most natural systems are autonomous, e.g. the motion of the planets around the sun, the rotation of the earth on its axis, etc. Physicists study autonomous systems in order to obtain the laws governing their evolution. Engineers are interested in autonomous systems because their evolution is so predictable: once a given system is made autonomous (for example by attaching a device called a controller), it shows a very predictable response that can be matched with the desired response. In the context of linear differential systems, the following result from [75], Section 3.2, gives a characterization of representations of autonomous systems.

Proposition 2.7.2 Assume $B = \{w \in C^\infty(R, R^w) \mid R(\frac{d}{dt})w = 0\}$ is defined by a minimal kernel representation. Then the following statements are equivalent:
1. $B$ is the behavior of an autonomous system.
2. $\mathrm{Rank}\, R(\xi) = w$.
3. $B$ is a finite dimensional subspace of $C^\infty(R, R^w)$.

Since $B$ in Proposition 2.7.2 is defined by a minimal kernel representation, $B$ is the behavior of an autonomous system if and only if the matrix $R(\xi)$ is square and nonsingular. The roots of $\det R(\xi) = 0$ are called the characteristic values of the system. We consider a simple autonomous system in the following example.

Example 2.7.3 Consider $B_1$ defined as the set of all $w_1 \in C^\infty(R, R)$ such that $(\frac{d}{dt} - 1)w_1 = 0$. Then $B_1$ is autonomous since the rank of $\xi - 1$ is 1. Further, $B_1$ is finite (here one) dimensional, since every trajectory in $B_1$ is of the form $ce^t$, $c \in R$. Now consider $B_2$ defined as the set of all $w_2 \in C^\infty(R, R)$ such that $(\frac{d}{dt} + 1)^2w_2 = 0$. Again, $B_2$ is autonomous and finite dimensional (two dimensional in this case). Every trajectory in $B_2$ is of the form $(c_1 + tc_2)e^{-t}$, $c_1, c_2 \in R$.

Since the evolution of an autonomous system is "fixed" by its past, it is natural to ask what happens if the system is allowed to evolve for a long enough time. The consideration of how the system behaves as $t \to \infty$ leads to the question of stability.

Definition 2.7.4 Consider an autonomous linear time-invariant system $\Sigma = (R, R^w, B)$. Then $\Sigma$ is called stable if for all trajectories $w \in B$, $\|w(t)\| \to 0$ as $t \to \infty$.

Since we have only considered $R^w$-valued systems, which norm is used for $w(t)$ is immaterial, since all norms on $R^w$ are equivalent. An autonomous behavior $\mathrm{Ker}\, R(\frac{d}{dt})$ is stable if and only if $R(\xi)$ has all its characteristic values in the open left half complex plane: $\det R(\lambda) = 0 \implies \mathrm{Re}\,\lambda < 0$.

Example 2.7.5 In Example 2.7.3, the behavior $B_1$ corresponds to an unstable system, since $|w_1(t)| = |ce^t| \to \infty$ as $t \to \infty$ (for $c \ne 0$). On the other hand, the behavior $B_2$ corresponds to a stable system, since $|w_2(t)| = |(c_1 + tc_2)e^{-t}| \to 0$ as $t \to \infty$.
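A minimal sketch (ours, using the two behaviors of Example 2.7.3) of the stability test via characteristic values:

```python
# Sketch: an autonomous behavior Ker R(d/dt) is stable iff every root of
# det R(xi) = 0 (every characteristic value) has negative real part.
import sympy as sp

xi = sp.symbols('xi')
systems = {'B1': sp.Matrix([[xi - 1]]),        # (d/dt - 1) w = 0
           'B2': sp.Matrix([[(xi + 1)**2]])}   # (d/dt + 1)^2 w = 0

for name, R in systems.items():
    char_values = sp.roots(sp.Poly(R.det(), xi))   # {root: multiplicity}
    stable = all(sp.re(lam) < 0 for lam in char_values)
    print(name, char_values, 'stable' if stable else 'unstable')
```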
We now review a particularly important form of latent variable representation, called a state representation.

2.8 State representation

States are intuitively related to the "memory" of a dynamical system. We formalize this association with the "axiom of state": variables that are state variables obey the axiom of state. In the behavioral framework, the state $x$ of a system is a latent variable with the special property that if the values of the state corresponding to two manifest variable trajectories are equal at a certain time $t$, then the two manifest variable trajectories can be concatenated at time $t$. Roughly speaking, this means that while going from the "past" into the "future", one only needs to check that the states match. Hence, the value of the state at time $t$ can be thought of as capturing the entire history of the evolution of the system from rest up to time $t$. In the sequel, by $w_1 \wedge_\tau w_2$ we mean the concatenation of $w_1(t)$ and $w_2(t)$ at $t = \tau$.

Definition 2.8.1 (Axiom of state) Let $\Sigma_X = (R, R^w, R^x, B_{\mathrm{full}})$ be a linear dynamical system with latent variables $x$ taking values in $R^x$. The latent variable $x$ is said to have the property of state if and only if
\[
\{(w, x) \in B_{\mathrm{full}}\} \ \text{and}\ \{x(0) = 0\} \ \text{and}\ \{x \text{ continuous at } t = 0\} \implies \{(0 \wedge_0 w,\ 0 \wedge_0 x) \in B_{\mathrm{full}}\}
\]
Systems with latent variables that are also state variables will be called state systems. Issues concerning the axiom of state were addressed by Rapisarda and Willems [83]. The relationship between behaviors and state representations was also addressed by Sule [94]. The following result, Proposition 3.1 of [83], characterizes representations of state systems:

Proposition 2.8.2 Let $\Sigma_X = (R, R^w, R^x, B_{\mathrm{full}})$ be a linear dynamical system with latent variables $x$ taking values in $R^x$. Then $\Sigma_X$ is a state system if and only if there exist constant matrices $E, F, G \in R^{\bullet\times\bullet}$ such that
\[
B_{\mathrm{full}} = \{(w, x) \in C^\infty(R, R^{w+x}) \mid E\tfrac{d}{dt}x + Fx + Gw = 0\}
\]
Note that if $B$ is a $C^\infty$-behavior, then $B = \Pi(B_{\mathrm{full}})$, i.e. $B$ can be obtained by eliminating the states from $B_{\mathrm{full}}$ using the elimination theorem. By a representation of $B$ in terms of states we mean a state system with behavior $B_{\mathrm{full}}$ such that $B = \Pi(B_{\mathrm{full}})$. A representation of $B$ in terms of states will also be called a state representation of $B$. We demonstrate state representations with an example.

Example 2.8.3 Consider the RC circuit in Figure 2.2. Let us re-do Example 2.4.3 with a latent variable $v_c$, the voltage across the capacitor $C$. Then we get the following equations relating $V$, $I$ and $v_c$:
\[
V = IR_1 + v_c \tag{2.12}
\]
\[
I = C\frac{d}{dt}v_c + v_c/R_2 \tag{2.13}
\]
These equations can be written as
\[
\begin{bmatrix} C \\ 0 \end{bmatrix}\frac{d}{dt}v_c + \begin{bmatrix} 1/R_2 \\ 1 \end{bmatrix}v_c + \begin{bmatrix} -1 & 0 \\ R_1 & -1 \end{bmatrix}\begin{bmatrix} I \\ V \end{bmatrix} = 0
\]
which is in the form specified in Proposition 2.8.2. Hence $v_c$ is a state variable. The current $i_2$ is another state variable, since $i_2 = v_c/R_2$, and hence the circuit defines the following full behavior:
\[
\begin{bmatrix} R_2C \\ 0 \end{bmatrix}\frac{d}{dt}i_2 + \begin{bmatrix} 1 \\ R_2 \end{bmatrix}i_2 + \begin{bmatrix} -1 & 0 \\ R_1 & -1 \end{bmatrix}\begin{bmatrix} I \\ V \end{bmatrix} = 0
\]
Example 2.8.3 shows that several state representations are possible for a given behavior $B$.
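Before moving on, here is a small symbolic sketch (ours; the component values are kept symbolic) confirming the claim of Example 2.8.3 that eliminating the state $v_c$ from equations (2.12) and (2.13) recovers the kernel description of Example 2.2.1:

```python
# Sketch: solve (2.12) for the state vc and substitute into (2.13); the
# residual is the kernel equation C*dV/dt + V/R2 - R1*C*dI/dt - (1 + R1/R2)*I.
import sympy as sp

t = sp.symbols('t')
R1, R2, C = sp.symbols('R1 R2 C', positive=True)
V = sp.Function('V')(t)
I = sp.Function('I')(t)

vc = V - R1*I                              # from (2.12): V = I*R1 + vc
residual = C*sp.diff(vc, t) + vc/R2 - I    # (2.13): I = C*dvc/dt + vc/R2
print(sp.expand(residual))                 # C*V' - C*R1*I' + V/R2 - R1*I/R2 - I
```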
This brings us to the consideration of a minimal state representation:

Definition 2.8.4 A state system $\Sigma_{X_1} = (R, R^w, R^{x_1}, B^1_{\mathrm{full}})$ with behavior $B$ and $R^{x_1}$-valued states $x_1$ is said to be a minimal state representation of $B$ if, whenever $\Sigma_{X_2} = (R, R^w, R^{x_2}, B^2_{\mathrm{full}})$ is another state system for $B$ with $R^{x_2}$-valued states $x_2$, then $x_1 \le x_2$. This minimal number of states is called the McMillan degree of $B$ and is denoted by $n(B)$.

In order to deduce when a given state representation of $B$ with behavior $B_{\mathrm{full}}$ is state minimal, we need the notion of trimness of $B_{\mathrm{full}}$.

Definition 2.8.5 Consider a behavior $B \in L^w$ with trajectories $w$, and a state representation of $B$ with behavior $B_{\mathrm{full}} \in L^{w+x}$ with trajectories $(w, x)$. Then $B_{\mathrm{full}}$ is called state trim if for every $a \in R^x$ there exists a trajectory $(w, x) \in B_{\mathrm{full}}$ such that $x(0) = a$.

State trimness means that there are no algebraic constraints among the states. It is intuitive that state trim behaviors are minimal:

Proposition 2.8.6 Let $B_{\mathrm{full}} \in L^{w+x}$ be a state system with trajectories $(w, x)$. Then $B_{\mathrm{full}}$ is state minimal for $B$ if and only if $B_{\mathrm{full}}$ is state trim and $x$ is observable from $w$.

It is easy to see that the state representation obtained in Example 2.8.3 is a minimal state representation of $B$. We now briefly consider the construction of states. Given a linear differential behavior, it is in general not easy to identify latent variables that qualify as state variables. Hence a systematic procedure for the construction of states, starting from manifest or latent variable trajectories, is important. Given a behavior and its representation in kernel, hybrid or image form, Rapisarda [82] discusses ways to construct a set of states for the behavior. Let $B = \{w \in C^\infty(R, R^w) \mid R(\frac{d}{dt})w = 0\}$ be defined by a minimal kernel representation. There exists a differential operator $X(\frac{d}{dt})$, with $X(\xi) \in R^{n(B)\times w}[\xi]$, such that $x = X(\frac{d}{dt})w$ is a set of minimal states for $B$. The map $X(\frac{d}{dt})$ is called a state map for $B$. There is a constructive procedure for building $X(\xi)$ starting from the rows of $R(\xi)$. See [82], Proposition 3.4.4, Theorem 3.5.1 and Proposition 3.5.4 for ways to construct states starting from kernel, hybrid and image representations of a behavior respectively.

States are related to the memory of a dynamical system. The history of trajectories in $B$ that correspond to $x = 0$ cannot be captured by states. Hence the set of such trajectories is called the "memoryless part" of a behavior:

Definition 2.8.7 Let $B \in L^w$ be a linear differential behavior with manifest variables $w$. Let $x = X(\frac{d}{dt})w$ be a minimal state map for $B$. Then the set
\[
\{w \in B \cap \mathrm{Ker}\, X(\tfrac{d}{dt})\}
\]
is called the memoryless part of $B$.

The memoryless part is thus the projection onto the manifest variables of the trajectories $(w, x) \in B_{\mathrm{full}}$ with $x = 0$. We will see that the memoryless part of a behavior plays a crucial role in the results presented in Chapter 4.

We now come to the concluding part of this introductory chapter on behavioral theory, where we introduce the concepts of inputs and outputs. Notice that until now we have not imposed any cause-effect relationship among the system variables. Even the notion of controllability was defined without defining "inputs", and observability was defined without defining "outputs". The behavioral notion of inputs and outputs is different from the classical notion, and therefore needs special mention.

2.9 Inputs and Outputs

The concept of a "free variable" (i.e. a variable that is not constrained by the laws defining a system) plays an important role in defining an input. The idea underlying the definition is that an input is unconstrained by the system and can therefore be fixed by the environment. The existence of "free" variables in a behavior is related to the fact that, in general, a behavior is described by an under-determined system of equations. This leaves some components of a solution unconstrained; these components are free to assume arbitrary $C^\infty$-functions.

Definition 2.9.1 Let $B \in L^w$ be a $C^\infty$-behavior. Let $w = (w_1, w_2, \dots, w_w)$ denote the manifest variables of $B$. Let $I = \{i_1, i_2, \dots, i_k\} \subseteq \{1, 2, \dots, w\}$ be an index set. We denote by $\Pi(B)$ the system obtained by eliminating the variables $w_j$, $j \notin I$. Let the set of variables obtained after elimination be denoted by $\{w_{i_1}, w_{i_2}, \dots, w_{i_k}\}$.
The variables $\{w_{i_1}, w_{i_2}, \dots, w_{i_k}\}$ are called free in $B$ if
\[
\Pi(B) = C^\infty(R, R^k)
\]
Recall that $k = |I|$, the cardinality of the index set.

Using the concept of free variables in a behavior, one may define a set of maximally free variables as a set which contains the maximum possible number of free variables. Once a set of maximally free variables has been found, no more free variables remain outside the corresponding index set $I$. We use the concept of maximally free variables to define an "input-output" partition for a behavior $B$:

Definition 2.9.2 Consider a $C^\infty$-behavior $B \in L^w$. Let $w$ denote the manifest variables of $B$. Partition $w$ (possibly after a permutation of its components) as $w = (w_i, w_o)$ with $w_i = (w_1, w_2, \dots, w_m)$ and $w_o = (w_{m+1}, \dots, w_w)$. The partition $w = (w_i, w_o)$ is said to be an input-output partition of $w$ if the variables $w_i$ are maximally free in $B$.

Given an input-output partition of $B$, we say that $w_i$ are inputs and $w_o$ are outputs of $B$. Following customary notation, $w_i$ is denoted by $u$ and $w_o$ by $y$. Because $u$ is maximally free, $y$ does not have any free components. The following result gives conditions under which a kernel representation of a behavior corresponds to an input-output partition. Consider a $C^\infty$-behavior $B$ with manifest variables $w$ partitioned arbitrarily as $w = (u, y)$. Let $R(\frac{d}{dt})$ induce a minimal kernel representation of $B$, and let $R(\xi) = [P(\xi)\ \ {-Q(\xi)}]$ be the partition of the columns of $R(\xi)$ conformal with $u$ and $y$. Then $w = (u, y)$ is an input-output partition if and only if $Q(\xi)$ is square and nonsingular. The rational function $Q^{-1}(\xi)P(\xi)$ is called a transfer function for $B$. Different input-output partitions of the manifest variables of $B$ give rise to different transfer functions for $B$.

A transfer function of a behavior $B$ is a matrix of rational functions in $\xi$, i.e. every element $ij$ of $Q^{-1}(\xi)P(\xi)$ is of the form $p_{ij}(\xi)/q_{ij}(\xi)$, where $p_{ij}(\xi), q_{ij}(\xi) \in R[\xi]$ and $q_{ij}(\xi) \ne 0$. A rational function is called proper (respectively strictly proper) if $\deg q_{ij}(\xi) \ge \deg p_{ij}(\xi)$ (respectively $\deg q_{ij}(\xi) > \deg p_{ij}(\xi)$) for all $i, j$. Obviously, a strictly proper rational function is proper. For $C^\infty$-behaviors, an input-output partition need not correspond to a proper transfer function: a transfer function that is not proper may still correspond to a valid input-output partition of the system variables. However, in the case of $L^{\mathrm{loc}}_1$ behaviors, properness is crucial: the partition $(u, y)$ for an $L^{\mathrm{loc}}_1$ behavior $B$ is an input-output partition if and only if the corresponding transfer function is proper [74].

We now briefly address three "invariants" associated with a linear differential behavior $B$ with manifest variables $w$.

1. Input cardinality: Let $w = (u, y)$ be an input-output partition of $B$. Clearly, several input-output partitions are possible for a behavior. However, it turns out that the cardinality of every set of maximally free variables in $B$ is the same. The cardinality of the set $u$ of inputs in a given input-output partition of $B$ is called the input cardinality of $B$, denoted by $m(B)$. $m(B)$ is intrinsic to a behavior and does not depend on a particular representation; therefore we say it is an invariant associated with $B$. If $B$ is controllable, $m(B)$ is also the minimal number of latent variables in an image representation of $B$.

2. Output cardinality: If $B$ has $w$ manifest variables and input cardinality $m(B)$, then $p(B) := w - m(B)$ is called the output cardinality of $B$.
The output cardinality is completely determined by the number of manifest variables and the input cardinality, both of which are intrinsic to a behavior. Therefore we say that $p(B)$ is an invariant associated with $B$.

3. McMillan degree: Given a behavior $B$, the number of states in a minimal state representation of $B$ is defined as the McMillan degree of $B$ (see Definition 2.8.4), and is denoted by $n(B)$. The McMillan degree of a behavior is an invariant associated with the behavior. Given a behavior $B := \mathrm{Ker}\, R(\frac{d}{dt})$, the McMillan degree of $B$ is equal to the maximal degree occurring among the minors of $R(\xi)$.

Example 2.9.3 Consider Figure 2.1 and the corresponding behavior $B$ in terms of port voltage $V$ and port current $I$ obtained in Example 2.2.1. Recall that
\[
\begin{bmatrix} V \\ I \end{bmatrix} = \begin{bmatrix} R_1C\frac{d}{dt} + 1 + R_1/R_2 \\ C\frac{d}{dt} + 1/R_2 \end{bmatrix}\ell
\]
is an observable image representation for $(V, I)$. Then $B$ has the following invariants:
1. Input cardinality: $m(B) = 1$, since either $V$ or $I$ can be an input (i.e. a maximally free set of variables in $B$). Note that $m(B)$ is also the number of latent variables, which is one in this case.
2. Output cardinality: Since $B$ has 2 manifest variables and $m(B) = 1$, the output cardinality is $p(B) = 1$.
3. McMillan degree: In Example 2.8.3 we showed that $B$ admits a state representation with the capacitor voltage as the state. It turns out that this state representation is minimal. Therefore the McMillan degree of $B$ is $n(B) = 1$.
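The invariants above are computable directly from a minimal kernel representation. The following sketch is ours (it assumes $R_1 = R_2 = C = 1$ for brevity) and computes the McMillan degree of the RC-circuit behavior as the maximal degree among the row-sized minors of $R(\xi)$:

```python
# Sketch: for B = Ker R(d/dt) with R of full row rank, the McMillan degree
# n(B) equals the largest degree occurring among the row-sized minors of R.
import itertools
import sympy as sp

xi = sp.symbols('xi')

def mcmillan_degree(R):
    rows, cols = R.shape
    degs = [sp.degree(R[:, list(c)].det(), xi)
            for c in itertools.combinations(range(cols), rows)]
    return max(d for d in degs if d >= 0)   # skip identically zero minors

# RC circuit of Example 2.9.3 with R1 = R2 = C = 1: R = [xi + 1, -(xi + 2)]
R = sp.Matrix([[xi + 1, -(xi + 2)]])
print(mcmillan_degree(R))   # 1, in agreement with n(B) = 1
```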
This completes a brief review of the background material on behavioral systems theory. We have included only the bare minimum required in order to read the forthcoming chapters. What is included in this chapter will be used repeatedly throughout the thesis; we will therefore not refer back to the relevant results time and again. However, we shall need some additional concepts (for example the notion of dissipativity, which uses concepts from Quadratic Differential Forms, Chapter 1, and also from the current chapter), which we introduce as and when they become necessary.

Chapter 3

A parametrization for dissipative systems

3.1 Introduction

In the context of electrical circuits, the power supplied is defined as the product of voltage and current. Circuits in which the net power supplied is non-negative are called passive: if $v$ and $i$ are the port voltage and current respectively, then $\int_0^t v^Ti\,dt \ge 0$ for all $(v, i)$ permitted by the circuit. Passivity has been an area of active research for many decades; see [18] for an early account. Meixner [54] examines general passive linear systems in an abstract framework and derives several properties of these systems. Before electronic operational amplifiers (op-amps) became widely available, passive circuits were preferred as a rule, since they can be realized with passive electrical components, i.e. resistors, inductors, capacitors and gyrators. The classic book [56] examines various realizability issues for passive systems. Even today passive circuits have not lost their importance, though realizability is no longer the central issue.

Note that power in electrical circuits is a quadratic form in voltage and current. More generally, quadratic forms and quadratic differential forms can be used to define a "generalized power". Systems in which the net generalized power supplied is non-negative are called dissipative. The abstract theory of dissipative systems was introduced by Jan Willems, who in 1972 wrote two seminal papers on the subject [98, 99]. The ideas in these papers have been singularly successful in tying together ideas from network theory, mechanical systems, thermodynamics, and feedback control theory. The dissipation hypothesis, which distinguishes dissipative dynamical systems from general dynamical systems, results in a fundamental constraint on their dynamic behavior. Typical examples of dissipative systems include:
1. Electrical networks: electrical energy is dissipated as heat in resistors.
2. Viscoelastic systems: mechanical energy is dissipated as heat due to viscous friction.
3. Thermodynamic systems: the second law of thermodynamics postulates a form of dissipativity leading to an increase in entropy.

System theorists are interested in dissipative systems as one of the methods for stabilizing interconnected systems. Moylan and Hill treat this aspect of dissipative systems in several papers [34, 35, 55]. Dissipative systems are currently an area of active research in several diverse areas of engineering: Bernstein, Bhat, Haddad and others [13, 32] have proposed a "network flow" model based on dissipativity ideas for thermodynamic systems; work on "absolute stability" criteria for nonlinear systems using dissipativity ideas is ongoing [31, 52]; and problems like disturbance attenuation have led to research in "$H_\infty$ control" [19], which can also be interpreted using dissipativity ideas.

In this chapter we study dissipative systems in the behavioral framework. The premise of the current chapter is the following: while it is easy to check whether a given dynamical system is dissipative, the converse problem of constructing all systems dissipative with respect to a given generalized power is more difficult. We address the following problem: given a generalized power defined by a QDF, parametrize the set of all dynamical systems (behaviors) that are dissipative with respect to this generalized power. In this chapter we restrict ourselves to $C^\infty$-behaviors. When considering more general function spaces (e.g. $L^{\mathrm{loc}}_1$), the theory presented here can still be used with appropriate technical modifications; these aspects will be elaborated upon in later chapters. The scalar version of the results presented in this chapter has been published [60, 61]. The full results presented here will shortly be submitted for possible publication.

Thinking of the parametrization problem, one is immediately reminded of the Kalman-Yakubovich-Popov (KYP) lemma [2, 40, 78, 109, 106], which is in fact a result of this type. This connection will be explored more fully in the next chapter. Constructing dissipative behaviors is important for the same reason that the KYP lemma is important: knowing a characterization of dissipative systems often yields a simple solution to problems in systems theory; the passivity theorem [107] is an example. The results in this chapter build a thread that runs through most chapters of this thesis.

This chapter is organized as follows. In Section 3.2 we define dissipativity and give tests for checking dissipativity. In Section 3.3 we define an equivalence relation on the set of QDFs which lets us identify QDFs that are similar from the viewpoint of generalized power. Section 3.4 gives a parametrization of dissipative systems with two manifest variables, i.e. single-input single-output (SISO) systems. In Section 3.5 we extend the parametrization results to multi-input multi-output (MIMO) dissipative systems. The parametrization results depend on a factorization which may not always exist.
In Section 3.5, several cases are investigated in increasing order of complexity of the generalized power, and a parametrization of dissipative systems is proposed in each case.

3.2 Dissipativity in the behavioral setting

Consider a QDF $Q_\Phi$ induced by $\Phi \in R^{w\times w}_s[\zeta, \eta]$, and consider the action of $Q_\Phi$ on trajectories $w$ of a controllable behavior $B \in L^w_{con}$. Then $Q_\Phi(w)$ can be interpreted as a measure of the "generalized power" supplied to the dynamical system (the behavior $B$). $Q_\Phi(w)$ is called the supply function [103].

Definition 3.2.1 A behavior $B \in L^w_{con}$ is said to be dissipative with respect to the QDF $Q_\Phi$ (or, in short, $\Phi$-dissipative) if
\[
\int_{-\infty}^{\infty} Q_\Phi(w)\,dt \ge 0 \quad \forall\, w \in D(R, R^w) \cap B \tag{3.1}
\]
where $D(R, R^w)$ denotes the space of compactly supported $C^\infty$ functions from $R$ to $R^w$.

We emphasize that the above definition is valid only for controllable systems. For uncontrollable systems, the very definition of dissipativity is still an open problem; see [12, 72] for a discussion of dissipativity of uncontrollable systems. Since $B \in L^w_{con}$, we can use an observable image representation $B = \mathrm{Im}\, M(\frac{d}{dt})$ to define a new two-variable polynomial matrix $\Phi'(\zeta, \eta) = M^T(\zeta)\Phi(\zeta, \eta)M(\eta)$. The condition for $\Phi$-dissipativity given by equation (3.1) can then be rewritten as
\[
\int_{-\infty}^{\infty} Q_{\Phi'}(\ell)\,dt \ge 0 \quad \forall\, \ell \in D(R, R^\bullet) \tag{3.2}
\]
Given a QDF $Q_\Phi$ with $\Phi \in R^{w\times w}_s[\zeta, \eta]$, one expects a subset of the behaviors in $L^w_{con}$ to be $\Phi$-dissipative.

Notation 3.2.2 We denote the set of all $\Phi$-dissipative controllable behaviors by $L_\Phi$. Clearly $L_\Phi \subseteq L^w_{con}$.

A characterization of $\Phi$-dissipative controllable behaviors is given in the following theorem ([103], Proposition 5.2, page 1719):

Theorem 3.2.3 Consider a QDF $Q_\Phi$ induced by $\Phi \in R^{w\times w}_s[\zeta, \eta]$. Then the following statements are equivalent:
1. $B \in L^w_{con}$ is dissipative with respect to the QDF $Q_\Phi$.
2. $\int_{-\infty}^{\infty} Q_\Phi(w)\,dt \ge 0$ for all $w \in D(R, R^w) \cap B$.
3. For an (observable) image representation of $B$ given by
\[
B = \{w \mid \exists\, \ell \in C^\infty(R, R^l) \text{ such that } w = M(\tfrac{d}{dt})\ell\},
\]
it holds that
\[
\Phi'(-i\omega, i\omega) := M^T(-i\omega)\Phi(-i\omega, i\omega)M(i\omega) \ge 0 \quad \forall\, \omega \in R
\]

Example 3.2.4 Let $J_{1\,1} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ and $B = \mathrm{Im}\, M(\frac{d}{dt})$ where $M(\xi) = \begin{bmatrix} \xi + 2 \\ 1 \end{bmatrix}$. We see that
\[
M^T(-i\omega)J_{1\,1}M(i\omega) = 3 + \omega^2
\]
which is clearly positive for all $\omega \in R$. Hence $B$ is $J_{1\,1}$-dissipative.

The following lemma is immediate from Theorem 3.2.3:

Lemma 3.2.5 Consider a supply function $Q_\Phi$ with $\Phi \in R^{w\times w}_s[\zeta, \eta]$. If there exists an $\omega_0 \in R$ such that $\Phi(-i\omega_0, i\omega_0)$ is negative definite, then there exist no non-trivial behaviors that are $\Phi$-dissipative.

Proof: Suppose there exists a behavior $B \in L^w_{con}$ that is $\Phi$-dissipative, and let $w = M(\frac{d}{dt})\ell$ be an observable image representation for $B$. Since $B$ is assumed to be $\Phi$-dissipative, $M^T(-i\omega_0)\Phi(-i\omega_0, i\omega_0)M(i\omega_0) \ge 0$. Since $\Phi(-i\omega_0, i\omega_0)$ is negative definite, it must be that $M(i\omega_0) = 0$. We now arrive at a contradiction, since by assumption $\mathrm{Im}\, M(\frac{d}{dt})$ is an observable image representation, and hence $M(\lambda)$ has full column rank for all $\lambda \in C$.
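A quick symbolic sketch (ours) of the test in condition 3 of Theorem 3.2.3, applied to Example 3.2.4:

```python
# Sketch: evaluate M^T(-i*omega) J11 M(i*omega) for M(xi) = [xi + 2; 1]
# and J11 = diag(1, -1); positivity for all real omega gives dissipativity.
import sympy as sp

omega = sp.symbols('omega', real=True)
J11 = sp.diag(1, -1)
M = lambda s: sp.Matrix([[s + 2], [1]])

expr = sp.expand((M(-sp.I*omega).T * J11 * M(sp.I*omega))[0, 0])
print(expr)   # omega**2 + 3, positive for every real omega
```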
Note that condition 3 of Theorem 3.2.3 gives a test for checking whether a given behavior $B$ is $\Phi$-dissipative. The matrix $\Phi'(-i\omega, i\omega) \in R^{l\times l}[i\omega]$ is Hermitian for every $\omega \in R$. Let $\Phi'(-i\omega, i\omega) = [\phi_{ij}]_{i,j=1}^{l}$. We denote by $D_k(\omega^2)$ the $k$th successive principal minor of $\Phi'(-i\omega, i\omega)$:
\[
D_k(\omega^2) = \det[\phi_{ij}]_{i,j=1}^{k}, \quad k = 1, \dots, l
\]
We emphasize that these successive principal minors are even polynomials in $\omega$. The following result is easy to show:

Proposition 3.2.6 If $D_k(\omega^2)$ is the $k$th successive principal minor of $\Phi'(-i\omega, i\omega)$, then $\Phi'(-i\omega, i\omega) > 0$ for every $\omega \in R$ if and only if $D_k(\omega^2) > 0$ for $k = 1, 2, \dots, l$.

Proof: See for instance Gantmacher [22].

The case where $\Phi'(-i\omega, i\omega)$ is positive semidefinite is more difficult to check: $\Phi'(-i\omega, i\omega) \ge 0$ for every $\omega \in R$ if and only if every (and not just every successive) principal minor of $\Phi'(-i\omega, i\omega)$ is non-negative for all $\omega \in R$. In either case, the computation boils down to checking when an even polynomial $\pi(\omega) \in R[\omega]$ takes non-negative values for every $\omega \in R$.

Proposition 3.2.7 An even non-zero polynomial $\pi(\omega) \in R[\omega]$ takes non-negative values for every $\omega \in R$ if and only if the following conditions hold:
1. All real roots of $\pi(\omega)$ have even multiplicities.
2. $\pi(\omega_0) > 0$ for some $\omega_0 \in R$.

Proof: Assume that the even polynomial $\pi(\omega)$ is greater than zero at $\omega = \omega_0$ and that all its real roots have even multiplicities. Since all the coefficients of the polynomial are real, it can change sign only at its real roots. Since these real roots have even multiplicities, the sign of $\pi$ at $\omega^-$ and at $\omega^+$ is the same for every real root $\omega$ of the equation $\pi(\omega) = 0$. Consequently, $\pi(\omega) \ge 0$ for every $\omega \in R$. Conversely, if $\pi(\omega)$ takes non-negative values at every $\omega \in R$, then $\pi$ does not change sign at any real root of $\pi(\omega) = 0$; hence the multiplicity of every real root must be even.

Tests are available in the literature that enable us to check the non-negativity of even polynomials without explicit root computation. The approach based on Sturm chains is classical, and can be found, for example, in [22, 86]. A Routh-array type test for checking non-negativity of even polynomials can be found in [43]. Arguably, positive definiteness of $\Phi(-i\omega, i\omega)$ is far easier to check using Proposition 3.2.7 than positive semidefiniteness. In case the real spectrum of $\Phi(-i\omega, i\omega)$ is known, or can be computed easily, the following procedure provides a simple check for positive semidefiniteness. It relies on the fact that the eigenvalues of $\Phi(-i\omega, i\omega)$ are continuous functions of $\omega$:

1. We denote by $R_+$ the semi-open interval $[0, \infty) \subset R$. Given nonsingular $\Phi(-i\omega, i\omega) = \Phi^T(i\omega, -i\omega) \in R^{w\times w}[i\omega]$, compute the spectrum $S_\Phi := \{\omega_i \in R_+ \mid \det\Phi(-i\omega_i, i\omega_i) = 0\}$.
2. Define the set of distinct spectral points arranged in ascending order: $S := \{\omega_i \in S_\Phi \mid \omega_1 < \omega_2 < \dots < \omega_k, \text{ with } \omega_i,\ i = 1, \dots, k, \text{ distinct}\}$.
3. If $\omega_1 = 0$: let $\omega_{k+1}$ be an arbitrary finite real number larger than $\omega_k$. Determine the inertia of $\Phi(-i\bar\omega_i, i\bar\omega_i)$ with $\bar\omega_i := (\omega_i + \omega_{i+1})/2$, $i = 1, \dots, k$. $\Phi(-i\omega, i\omega)$ is positive semidefinite if and only if the matrices $\Phi(-i\bar\omega_i, i\bar\omega_i)$ are positive definite for $i = 1, 2, \dots, k$.
4. If $\omega_1 > 0$: let $\omega_0 = 0$ and let $\omega_{k+1}$ be an arbitrary finite real number larger than $\omega_k$. Determine the inertia of $\Phi(-i\bar\omega_i, i\bar\omega_i)$ with $\bar\omega_i := (\omega_{i-1} + \omega_i)/2$, $i = 1, \dots, k+1$. $\Phi(-i\omega, i\omega)$ is positive semidefinite if and only if the matrices $\Phi(-i\bar\omega_i, i\bar\omega_i)$ are positive definite for $i = 1, \dots, k+1$.

Note that determining inertia is computationally far easier than first computing eigenvalues and then counting them. See [59] for iterative algorithms for inertia computation using the Schur complement [33].
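A sketch (ours) of the root-multiplicity test of Proposition 3.2.7. For an even-degree polynomial, positivity at some point is equivalent to a positive leading coefficient, which is what the second check below uses:

```python
# Sketch of Proposition 3.2.7: an even polynomial pi(omega) is nonnegative
# on R iff all its real roots have even multiplicity and pi > 0 somewhere
# (for even degree, the latter reduces to a positive leading coefficient).
import sympy as sp

omega = sp.symbols('omega', real=True)

def nonneg_even_poly(pi):
    poly = sp.Poly(pi, omega)
    rts = sp.roots(poly)                       # {root: multiplicity}
    even_mult = all(m % 2 == 0 for r, m in rts.items() if r.is_real)
    return even_mult and poly.LC() > 0

print(nonneg_even_poly(omega**4 + 4*omega**2))   # True: real root 0 has mult. 2
print(nonneg_even_poly(omega**4 - 4*omega**2))   # False: simple roots at +-2
```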
We can now ask the following question: given a $\Phi$ such that $\Phi(-i\omega, i\omega)$ is indefinite for $\omega \in R$, find behaviors that are $\Phi$-dissipative. Finding a $\Phi$-dissipative behavior can be viewed as a polynomial interpolation problem in the following sense: there exists a complex cone $C_\omega$ in $C^w$ such that $v^*\Phi(-i\omega, i\omega)v > 0$ for all $v \in C_\omega$. In order to show the existence of a non-trivial behavior $B$ in $L_\Phi$, we need to find a polynomial matrix $M(\xi)$ of the appropriate size such that the vector $M(i\omega)$ lies in $C_\omega$ for all $\omega \in R$.

From Theorem 3.2.3 it is clear that a behavior can be dissipative with respect to several different supply functions. From the set of all supply functions, one can identify families of supply functions such that the set of behaviors dissipative with respect to every supply function in a family is the same. We formalize this notion by first defining an equivalence relation on the set of all QDFs. This equivalence relation, as we shall see, is crucial for the parametrization obtained in this chapter.

3.3 An equivalence relation on supply functions

Definition 3.3.1 Two QDFs $Q_{\Phi_1}$ and $Q_{\Phi_2}$, with $\Phi_1, \Phi_2 \in R^{w\times w}_s[\zeta, \eta]$, are equivalent (denoted $Q_{\Phi_1} \sim Q_{\Phi_2}$) if the set of behaviors dissipative with respect to $Q_{\Phi_1}$ is precisely the same as the set of behaviors dissipative with respect to $Q_{\Phi_2}$, i.e. $L_{\Phi_1}$ is the same as $L_{\Phi_2}$.

Using the above definition, we have the following proposition:

Proposition 3.3.2 If $\Phi_1(-i\omega, i\omega) = \pi(\omega)\Phi_2(-i\omega, i\omega)$, where $\pi(\omega)$ is a nonzero scalar polynomial in $\omega$ such that $\pi(\omega) \ge 0$ for all $\omega \in R$, then $Q_{\Phi_1} \sim Q_{\Phi_2}$.

Proof: Consider a behavior $B \in L_{\Phi_2}$ defined by an observable image representation $w = M(\frac{d}{dt})\ell$. Then, from Theorem 3.2.3,
\[
M^T(-i\omega)\Phi_2(-i\omega, i\omega)M(i\omega) \ge 0 \quad \forall\, \omega \in R
\]
Multiplying both sides of the above inequality by $\pi(\omega)$ does not change the inequality, since $\pi(\omega)$ takes non-negative values for every $\omega \in R$. Consequently,
\[
M^T(-i\omega)\Phi_1(-i\omega, i\omega)M(i\omega) \ge 0 \quad \forall\, \omega \in R
\]
which shows that every behavior in $L_{\Phi_2}$ is also in $L_{\Phi_1}$. The converse inclusion follows from a continuity argument: if $B \in L_{\Phi_1}$, then $\pi(\omega)M^T(-i\omega)\Phi_2(-i\omega, i\omega)M(i\omega) \ge 0$, so $M^T(-i\omega)\Phi_2(-i\omega, i\omega)M(i\omega) \ge 0$ wherever $\pi(\omega) > 0$; since $\pi$ has only isolated real zeros, continuity in $\omega$ gives the inequality for all $\omega \in R$.

In particular, the above proposition holds for $\pi(\omega) = 1$ for all $\omega \in R$. Though obvious, we need this in the sequel, and hence it is convenient to record the following corollary:

Corollary 3.3.3 If $\Phi_1(-i\omega, i\omega) = \Phi_2(-i\omega, i\omega)$ then $Q_{\Phi_1} \sim Q_{\Phi_2}$.

Example 3.3.4 Consider the following matrices:
\[
\Phi_1(\zeta, \eta) = \begin{bmatrix} \zeta + \eta & \zeta \\ \eta & \zeta\eta \end{bmatrix}; \qquad
\Phi_2(\zeta, \eta) = \begin{bmatrix} 0 & 2\zeta + \eta \\ 2\eta + \zeta & -(\zeta^2 + \eta^2)/2 \end{bmatrix}
\]
Let us consider the action of $Q_{\Phi_1}$ and $Q_{\Phi_2}$ on a trajectory $(u, y)^T \in C^\infty(R, R^2)$:
\[
Q_{\Phi_1}(u, y) = 2u\frac{du}{dt} + 2\frac{du}{dt}y + \Big(\frac{dy}{dt}\Big)^2 \tag{3.3}
\]
\[
Q_{\Phi_2}(u, y) = 4\frac{du}{dt}y + 2u\frac{dy}{dt} - y\frac{d^2y}{dt^2} \tag{3.4}
\]
It is clear that $Q_{\Phi_1}$ and $Q_{\Phi_2}$ are different QDFs. However, notice that
\[
\Phi_1(-i\omega, i\omega) = \Phi_2(-i\omega, i\omega) = \begin{bmatrix} 0 & -i\omega \\ i\omega & \omega^2 \end{bmatrix}
\]
Therefore, though $Q_{\Phi_1}$ and $Q_{\Phi_2}$ are different QDFs, $Q_{\Phi_1} \sim Q_{\Phi_2}$: by Corollary 3.3.3, every behavior that is $\Phi_1$-dissipative is also $\Phi_2$-dissipative, and vice versa.
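The equality of the two matrices on the diagonal $(\zeta, \eta) = (-i\omega, i\omega)$ can be confirmed with a short symbolic computation (our sketch, using the matrices as reconstructed in Example 3.3.4):

```python
# Sketch: Phi1 and Phi2 of Example 3.3.4 are different two-variable
# polynomial matrices, yet agree at (zeta, eta) = (-i*omega, i*omega),
# so Corollary 3.3.3 gives the equivalence of the two QDFs.
import sympy as sp

zeta, eta, omega = sp.symbols('zeta eta omega')
Phi1 = sp.Matrix([[zeta + eta, zeta], [eta, zeta*eta]])
Phi2 = sp.Matrix([[0, 2*zeta + eta], [2*eta + zeta, -(zeta**2 + eta**2)/2]])

sub = {zeta: -sp.I*omega, eta: sp.I*omega}
print(sp.simplify(Phi1.subs(sub)))                   # [[0, -I*omega], [I*omega, omega**2]]
print(sp.simplify(Phi1.subs(sub) - Phi2.subs(sub)))  # zero matrix
```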
In the following section we first concentrate on dissipative systems with two manifest variables. In Section 3.5 we generalize the results obtained below in several directions.

3.4 SISO dissipative systems

Consider a behavior $B \in L^2_{con}$ defined by an observable image representation
\[
\begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix}\ell \tag{3.5}
\]
with the scalar $\ell$ a free latent variable. Note that since equation (3.5) is an observable image representation of the behavior $B$, $p(\xi)$ and $q(\xi)$ are coprime polynomials. Define the transfer function $G(\xi) = p(\xi)/q(\xi)$. In the following sections we identify the behavior $B$ with its transfer function $G(\xi)$.

Consider the QDF $Q_J$ with $J = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix} \in R^{2\times 2}_s[\zeta, \eta]$. Then $B$ is $J$-dissipative if (by definition)
\[
\int_{-\infty}^{\infty} u(t)y(t)\,dt \ge 0 \tag{3.6}
\]
for all $(u(t), y(t)) \in D(R, R^2) \cap B$, i.e. compactly supported $C^\infty$-trajectories in $B$. Equation (3.6) is precisely the condition satisfied by the inputs and outputs of passive linear systems. Thus $J$-dissipativity in the behavioral setting is equivalent to passivity in the classical (input-output) setting. Passive systems are of interest in circuit theory because they can be synthesized using only passive components. Transfer functions of passive systems have a very well known characterization, namely that they are positive real (see [56] for details). A rational function $G(\xi)$ is said to be positive real if:
1. $G(\xi)$ is analytic in $C_+$, where $C_+$ denotes the open right half of the complex plane.
2. $G(i\omega) + G^*(i\omega) \ge 0$ for almost all $\omega \in R$.
3. All poles of $G(\xi)$ on the imaginary axis are simple.

The behavior $B$ defined by equation (3.5) is $J$-dissipative if and only if
\[
p(-i\omega)q(i\omega) + q(-i\omega)p(i\omega) \ge 0 \quad \forall\, \omega \in R \tag{3.7}
\]
Note that equation (3.7) divided by the polynomial $q(-i\omega)q(i\omega)$ gives us $G(i\omega) + G^*(i\omega) \ge 0$ for almost all $\omega \in R$. This implies that the associated scalar transfer function $G(\xi) = p(\xi)/q(\xi)$ satisfies $\mathrm{Re}(G(i\omega)) \ge 0$ for all $\omega \in R$. Therefore, if we view the scalar transfer function $G(\xi)$ corresponding to $B$ as a function from $C$ to $C$, it maps every point on the imaginary axis to a point in the closed right half complex plane. A geometric way of visualizing this map is through the Nyquist plot of $G(\xi)$: the Nyquist plot of $G(\xi)$ lies entirely in the closed right half complex plane. This is precisely condition 2 in the definition of positive realness of scalar transfer functions given above. Note that for controllable behaviors there is no notion of stability. A behavioral counterpart to the analyticity condition in the definition of positive real transfer functions (condition 1) will be obtained in the next chapter (Chapter 4), where we will show that analyticity of $G(\xi)$ is equivalent to the existence of positive definite storage functions. It follows that all positive real transfer functions can be identified with behaviors that are $J$-dissipative, but the converse is not true. Let us consider an example that demonstrates the difference between passivity and $J$-dissipativity:

Example 3.4.1 Consider the behavior associated with $G(\xi) = (\xi - 2)/(\xi - 1)$. Let us check that this behavior is $J$-dissipative:
\[
\begin{bmatrix} -i\omega - 2 & \ -i\omega - 1 \end{bmatrix}
\begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix}
\begin{bmatrix} i\omega - 2 \\ i\omega - 1 \end{bmatrix} = \omega^2 + 2 > 0 \quad \forall\, \omega \in R
\]
However, the rational function $(\xi - 2)/(\xi - 1)$ is not positive real, since both the numerator and the denominator have roots in the right half complex plane.

It is clear from equation (3.7) that if the behavior represented by equation (3.5) is $J$-dissipative, then so is the behavior represented by
\[
\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} p(\frac{d}{dt}) \\ q(\frac{d}{dt}) \end{bmatrix}\ell \tag{3.8}
\]
In the language of transfer functions, this means that if the behavior identified with $G(\xi)$ is $J$-dissipative, then so is the behavior identified with $1/G(\xi)$. Since every $B \in L_J$ can be identified with a scalar transfer function whose Nyquist plot lies entirely in the closed right half of the complex plane, we use this property to identify behaviors that are $J$-dissipative. In other words, the behaviors in $L_J$ are precisely those behaviors whose associated transfer functions have Nyquist plots lying entirely in the closed right half plane.
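Condition (3.7) is a one-line symbolic computation; the sketch below (ours) redoes Example 3.4.1:

```python
# Sketch: the J-dissipativity test (3.7) for G(xi) = (xi - 2)/(xi - 1):
# p(-iw)q(iw) + q(-iw)p(iw) must be nonnegative for every real w.
import sympy as sp

omega = sp.symbols('omega', real=True)
p = lambda s: s - 2     # numerator of G
q = lambda s: s - 1     # denominator of G

lhs = sp.expand(p(-sp.I*omega)*q(sp.I*omega) + q(-sp.I*omega)*p(sp.I*omega))
print(lhs)   # 2*omega**2 + 4: positive, so the behavior is J-dissipative,
             # even though G is not positive real (pole and zero in C+)
```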
We now characterize $\Phi$-dissipative behaviors for a general $\Phi \in R^{2\times 2}_s[\zeta, \eta]$. Since the matrix $\Phi(-i\omega, i\omega)$ is Hermitian for every $\omega \in R$, it has real eigenvalues. It is easily shown that the eigenvalues of this matrix at every $\omega$ are precisely the roots of the second degree polynomial equation
\[
s^2 - \mathrm{trace}(\Phi(-i\omega, i\omega))\,s + \det(\Phi(-i\omega, i\omega)) = 0
\]
Using elementary facts about the signs of the roots of quadratic equations, we have the following proposition.

Proposition 3.4.2 Given $\Phi \in R^{2\times 2}_s[\zeta, \eta]$, the following hold:
1. If $\det(\Phi(-i\omega, i\omega)) \ge 0$ and $\mathrm{trace}(\Phi(-i\omega, i\omega)) \ge 0$ for all $\omega \in R$, then every behavior in $L^2_{con}$ is $\Phi$-dissipative.
2. If there exists some $\omega = \omega_0$ such that $\det(\Phi(-i\omega_0, i\omega_0)) > 0$ and $\mathrm{trace}(\Phi(-i\omega_0, i\omega_0)) < 0$, then there exist no non-trivial behaviors in $L^2_{con}$ that are $\Phi$-dissipative.
3. If $\det(\Phi(-i\omega, i\omega)) < 0$ for all $\omega \in R$ outside the real spectrum of $\Phi(-i\omega, i\omega)$, then $L_\Phi \subsetneq L^2_{con}$.

Proof: When $\mathrm{trace}(\Phi(-i\omega_0, i\omega_0))$ and $\det(\Phi(-i\omega_0, i\omega_0))$ are both positive for some $\omega_0 \in R$, we can conclude that $\Phi(-i\omega_0, i\omega_0)$ is a positive definite matrix at that $\omega_0$. If $\Phi(-i\omega_0, i\omega_0)$ is positive definite for every $\omega_0 \in R$, it is easy to see that every controllable behavior $B \in L^2_{con}$ is $\Phi$-dissipative. By a similar argument, it is easy to see that when $\det(\Phi(-i\omega, i\omega)) \ge 0$ and $\mathrm{trace}(\Phi(-i\omega, i\omega)) \ge 0$ for all $\omega \in R$, every behavior in $L^2_{con}$ is $\Phi$-dissipative. When $\mathrm{trace}(\Phi(-i\omega_0, i\omega_0)) < 0$ and $\det(\Phi(-i\omega_0, i\omega_0)) > 0$ for some $\omega_0 \in R$, both roots of the characteristic polynomial are negative. Therefore $\Phi(-i\omega_0, i\omega_0)$ is a negative definite matrix, and consequently there exists no non-trivial behavior that is $\Phi$-dissipative. When $\det(\Phi(-i\omega, i\omega)) < 0$, the eigenvalues of $\Phi(-i\omega, i\omega)$ have opposite signs. Hence, for some $\omega_0 \in R$ outside the real spectrum of $\Phi(-i\omega, i\omega)$, we can find a vector $v \in C^2$ such that
\[
v^*\Phi(-i\omega_0, i\omega_0)v < 0
\]
We can construct two coprime polynomials $r(\xi)$ and $s(\xi)$ of arbitrary degrees such that
\[
\begin{bmatrix} r(i\omega_0) \\ s(i\omega_0) \end{bmatrix} = v
\]
Note that the behavior given by $\mathrm{Im}\begin{bmatrix} r(\frac{d}{dt}) \\ s(\frac{d}{dt}) \end{bmatrix}$ is in $L^2_{con}$. But by its very construction this behavior is not in $L_\Phi$. Hence $L_\Phi$ is not all of $L^2_{con}$.

We now demonstrate the use of Proposition 3.4.2 with some examples.

Example 3.4.3 With $\Phi(\zeta, \eta) = I_2$, it is easy to see (using Theorem 3.2.3) that every behavior $B \in L^2_{con}$ is $\Phi$-dissipative. This can also be seen from part 1 of Proposition 3.4.2, since for $\Phi = I_2$ both the trace and the determinant are positive for all real values of $\omega$.

Example 3.4.4 Consider the QDF $Q_\Phi$ induced by
\[
\Phi(\zeta, \eta) = \begin{bmatrix} \zeta\eta - 2 & 0 \\ 0 & \zeta\eta - 3 \end{bmatrix}
\]
Let $w = (w_1, w_2)$ be a typical trajectory in some behavior $B \in L^2_{con}$. Then
\[
Q_\Phi(w) = \Big(\frac{dw_1}{dt}\Big)^2 - 2w_1^2 + \Big(\frac{dw_2}{dt}\Big)^2 - 3w_2^2
\]
It is not difficult to show that every behavior $B \in L^2_{con}$ contains some trajectory in which $w_1, w_2$ are combinations of the sinusoidal functions $\sin t$ and $\cos t$. Clearly, the above QDF yields negative values when we integrate it over some finite time interval (we might have to make some adjustments to how long a time period we consider). By taking a chopped version of the sinusoidal trajectories (this is always possible, since the behavior is controllable), we obtain a compactly supported trajectory in the behavior such that $\int_{-\infty}^{\infty}Q_\Phi(w)\,dt < 0$. Thus no behavior $B \in L^2_{con}$ is $\Phi$-dissipative. We arrive at the same conclusion using condition 2 of Proposition 3.4.2, since at $\omega = \pm 1$, $\det(\Phi(-i\omega, i\omega)) = 2$ and $\mathrm{trace}(\Phi(-i\omega, i\omega)) = -3$.
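The trace and determinant conditions of Proposition 3.4.2 are immediate to evaluate; the sketch below (ours) reproduces the conclusion of Example 3.4.4:

```python
# Sketch: for Phi(zeta, eta) = diag(zeta*eta - 2, zeta*eta - 3) we get
# Phi(-i*w, i*w) = diag(w^2 - 2, w^2 - 3); at w = 1 the determinant is
# positive and the trace negative, so no nontrivial behavior is dissipative.
import sympy as sp

omega = sp.symbols('omega', real=True)
Phi = sp.diag(omega**2 - 2, omega**2 - 3)   # Phi(-i*omega, i*omega)

at1 = Phi.subs(omega, 1)
print(at1.det(), at1.trace())   # 2 and -3: condition 2 of Proposition 3.4.2
```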
Henceforth we assume that $\Phi(-i\omega, i\omega) \in R^{2\times 2}[i\omega]$ is nonsingular and has one positive and one negative eigenvalue for almost all $\omega \in R$, i.e. $\Phi(-i\omega, i\omega)$ has constant zero signature for almost all $\omega \in R$. Hence, from Proposition 3.4.2, it follows that $\det\Phi(-i\omega, i\omega) < 0$ for almost all $\omega \in R$. We shall now parametrize $\Phi$-dissipative behaviors in terms of $J$-dissipative behaviors. Our base set for the parametrization is $L_J$, which we characterized as those behaviors whose associated transfer functions have Nyquist plots lying entirely in the right half of the complex plane. We now show by explicit construction that every $\Phi \in R^{2\times 2}_s[\zeta, \eta]$ such that $\Phi(-i\omega, i\omega)$ has constant zero signature for almost all $\omega \in R$ admits an interesting factorization:

Proposition 3.4.5 Every matrix $\Phi \in R^{2\times 2}_s[\zeta, \eta]$ such that $\Phi(-i\omega, i\omega)$ has constant zero signature for almost all $\omega \in R$ can be factorized as
\[
\pi(\omega)\Phi(-i\omega, i\omega) = K^T(-i\omega)JK(i\omega)
\]
with $\pi(\omega)$ a scalar polynomial in $\omega$ such that $\pi(\omega) \ge 0$ for all $\omega \in R$, and $K(i\omega)$ a $2\times 2$ matrix whose entries are polynomials in $i\omega$.

Proof: We prove the proposition by considering the various cases and constructing the matrix $K(i\omega)$ in each case.

Case 1: $\Phi$ with nonzero diagonal elements. Consider $\Phi$ given by
\[
\Phi(-i\omega, i\omega) = \begin{bmatrix} \phi_{11}(i\omega) & \phi_{12}^*(i\omega) \\ \phi_{12}(i\omega) & \phi_{22}(i\omega) \end{bmatrix}
\]
Since $\det(\Phi(-i\omega, i\omega)) \le 0$ for all $\omega \in R$, we factorize the determinant as
\[
d(i\omega)d(-i\omega) = -\det(\Phi(-i\omega, i\omega))
\]
Define the matrix $K(i\omega)$ by
\[
K(i\omega) = \begin{bmatrix} \phi_{11}(i\omega) & \phi_{12}^*(i\omega) + d(i\omega) \\ \phi_{11}^2(i\omega) & \phi_{11}(i\omega)\big(\phi_{12}^*(i\omega) - d(i\omega)\big) \end{bmatrix} \tag{3.9}
\]
Then direct multiplication shows that
\[
\phi_{11}(i\omega)^2\,\Phi(-i\omega, i\omega) = K^T(-i\omega)JK(i\omega)
\]
Note that $\phi_{11}(i\omega)$ is an even polynomial in $\omega$, so $\phi_{11}(i\omega) = \phi_{11}(-i\omega)$. As a result, $\phi_{11}(i\omega)^2 = |\phi_{11}(i\omega)|^2$ is non-negative for all real $\omega$. Thus $\pi(\omega) = |\phi_{11}(i\omega)|^2$.

Case 2: $\Phi$ with at least one diagonal element uniformly zero. Without loss of generality we assume the $(1,1)$ element to be zero. Consider the $\Phi$ defined by
\[
\Phi(-i\omega, i\omega) = \begin{bmatrix} 0 & \phi_{12}^*(i\omega) \\ \phi_{12}(i\omega) & \phi_{22}(i\omega) \end{bmatrix} \tag{3.10}
\]
Consider $K(i\omega)$ defined by
\[
K(i\omega) = \begin{bmatrix} \phi_{12}(i\omega) & \phi_{22}(i\omega)/2 \\ 0 & 1 \end{bmatrix} \tag{3.11}
\]
It can be verified by direct multiplication that
\[
\frac{1}{2}\Phi(-i\omega, i\omega) = K^T(-i\omega)JK(i\omega) \tag{3.12}
\]
Thus we have obtained the required factorization when at least one of the diagonal elements of $\Phi$ is uniformly zero. Note that in this factorization $\pi(\omega) = 1/2$, which is positive for all $\omega \in R$. This exhausts all possible cases for $\Phi$ with constant zero signature, and hence the proof is complete.

Remark 3.4.6 The factorization given in Proposition 3.4.5 is a rather special case of rational $J$-spectral factorization. The interesting feature of this factorization, however, is that it is stated explicitly in terms of the coefficients of the QDF. This makes it possible to study the parametric dependence of the matrix $K(\xi)$ obtained in the factorization on the coefficients of the QDF.
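The Case 1 construction can be verified mechanically. The following sketch is ours; it instantiates (3.9) for the constant matrix $\Phi = \mathrm{diag}(1, -1)$ used later in Example 3.4.8, checks the factorization, and computes the resulting $L(\xi) = \mathrm{adj}\, K(\xi)$:

```python
# Sketch: Case 1 of Proposition 3.4.5 for Phi(-iw, iw) = diag(1, -1).
# Here phi11 = 1, phi12 = 0, phi22 = -1, and d(iw)d(-iw) = -det Phi = 1,
# so we may take d = 1.
import sympy as sp

J = sp.Matrix([[0, sp.Rational(1, 2)], [sp.Rational(1, 2), 0]])

phi11, phi12, d = 1, 0, 1
K = sp.Matrix([[phi11, phi12 + d],
               [phi11**2, phi11*(phi12 - d)]])     # equation (3.9)

print(sp.simplify(K.T * J * K))   # diag(1, -1) = phi11^2 * Phi
print(K.adjugate())               # [[-1, -1], [-1, 1]]: the L(xi) of (3.15)
```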
We now show that the factorization in Proposition 3.4.5 naturally defines a differential operator that maps the set of all $\Phi$-dissipative behaviors to the set of all $J$-dissipative behaviors.

Theorem 3.4.7 Given $\Phi(\zeta, \eta) \in R^{2\times 2}_s[\zeta, \eta]$ such that $\Phi(-i\omega, i\omega)$ is nonsingular and has constant zero signature, factorize $\Phi(-i\omega, i\omega)$ as in Proposition 3.4.5. Let $K(\xi)$ be the matrix obtained by substituting $\xi$ for $i\omega$ in the matrix $K(i\omega)$. Then the following hold:
1. $K(\frac{d}{dt})$ is a map from the set $L_\Phi$ to $L_J$.
2. Define $L(\xi) = \mathrm{adj}\, K(\xi)$, i.e. $L(\xi)K(\xi) = \det(K(\xi))I_2$. Then $L(\frac{d}{dt})$ parametrizes the set $L_\Phi$ through the set $L_J$, i.e. $L(\xi)$ maps every $J$-dissipative behavior to a $\Phi$-dissipative behavior.

Proof: Given $\Phi \in R^{2\times 2}_s[\zeta, \eta]$ such that $\Phi(-i\omega, i\omega)$ has constant zero signature, Proposition 3.4.5 gives
\[
\pi(\omega)\Phi(-i\omega, i\omega) = K^T(-i\omega)JK(i\omega)
\]
where $\pi(\omega) \ge 0$ for all $\omega \in R$. We note from Proposition 3.3.2 that multiplication of $\Phi(-i\omega, i\omega)$ by a non-negative polynomial $\pi(\omega)$ does not affect the set $L_\Phi$. Let $B = \mathrm{Im}\begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix} \in L^2_{con}$, and consider the behavior $\tilde B$ defined by $\mathrm{Im}\begin{bmatrix} \tilde q(\frac{d}{dt}) \\ \tilde p(\frac{d}{dt}) \end{bmatrix}$, where
\[
\begin{bmatrix} \tilde q(\xi) \\ \tilde p(\xi) \end{bmatrix} = K(\xi)\begin{bmatrix} q(\xi) \\ p(\xi) \end{bmatrix} \tag{3.13}
\]
Then $\tilde B$ is a $J$-dissipative behavior if and only if $B$ is a $\Phi$-dissipative behavior. Thus $K(\frac{d}{dt})$ is a map from $L_\Phi$ to $L_J$. Note that $\mathrm{Im}\begin{bmatrix} r(\frac{d}{dt})q(\frac{d}{dt}) \\ r(\frac{d}{dt})p(\frac{d}{dt}) \end{bmatrix}$ defines the same controllable behavior in $L^2_{con}$ as $\mathrm{Im}\begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix}$ for any nonzero scalar polynomial $r(\xi)$. Therefore the map defined by $K(\frac{d}{dt})L(\frac{d}{dt})$ is the identity map from $L^2_{con}$ to $L^2_{con}$, and so is its restriction to $L_J$. It follows that $L(\frac{d}{dt})$ defines the inverse map of $K(\frac{d}{dt})$, and every behavior $\hat B = \mathrm{Im}\begin{bmatrix} \hat q(\frac{d}{dt}) \\ \hat p(\frac{d}{dt}) \end{bmatrix}$, where
\[
\begin{bmatrix} \hat q(\xi) \\ \hat p(\xi) \end{bmatrix} = L(\xi)\begin{bmatrix} \tilde q(\xi) \\ \tilde p(\xi) \end{bmatrix} \tag{3.14}
\]
is $\Phi$-dissipative if and only if $\tilde B = \mathrm{Im}\begin{bmatrix} \tilde q(\frac{d}{dt}) \\ \tilde p(\frac{d}{dt}) \end{bmatrix}$ is a $J$-dissipative behavior. Thus $L(\xi)$ parametrizes the set $L_\Phi$: $L_\Phi$ is precisely the image of $L_J$ under the map $L(\xi)$.

We now demonstrate the use of Theorem 3.4.7 with the help of some examples.

Example 3.4.8 We obtain a characterization of all behaviors dissipative with respect to the QDF $Q_\Phi$ with
\[
\Phi(\zeta, \eta) = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\]
$\Phi$ is a constant matrix with zero signature, and its determinant is negative. Note that this $\Phi$ corresponds to Case 1 in the proof of Proposition 3.4.5. From Proposition 3.4.5 and Theorem 3.4.7 we obtain
\[
L(\xi) = \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \tag{3.15}
\]
This $L(\xi)$ is a map that parametrizes all $\Phi$-dissipative behaviors, for the action of $L(\xi)$ on the set $L_J$ yields the set $L_\Phi$. Let us now check this claim on an arbitrary $J$-dissipative behavior. Take a behavior $B$ in $L_J$, say $\mathrm{Im}\begin{bmatrix} \frac{d}{dt} + 1 \\ \frac{d}{dt} + 2 \end{bmatrix}$. We can check that this behavior $B$ is in $L_J$ by checking the Nyquist plot of the associated transfer function $\frac{\xi+1}{\xi+2}$. The action of $L(\frac{d}{dt})$ on the image representation of the $J$-dissipative behavior $B$ gives
\[
L(\xi)\begin{bmatrix} \xi + 1 \\ \xi + 2 \end{bmatrix} = \begin{bmatrix} -2\xi - 3 \\ 1 \end{bmatrix}
\]
We now see that
\[
\begin{bmatrix} -2(-i\omega) - 3 & \ 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} -2(i\omega) - 3 \\ 1 \end{bmatrix} = 4\omega^2 + 8 \tag{3.16}
\]
is positive for all real values of $\omega$. Hence the image of $B \in L_J$ under $L(\xi)$ indeed gives a $\Phi$-dissipative behavior.

We now consider an example with a more complicated QDF.

Example 3.4.9 Consider the QDF $Q_\Theta$ with $\Theta$ given by
\[
\Theta(\zeta, \eta) = \begin{bmatrix} 0 & \zeta \\ \eta & \zeta\eta \end{bmatrix}
\]
Note that $\Theta(-i\omega, i\omega)$ corresponds to Case 2 in the proof of Proposition 3.4.5. For this example we find
\[
L(\xi) = \begin{bmatrix} 1 & \xi^2/2 \\ 0 & \xi \end{bmatrix} \tag{3.17}
\]
The map $L(\xi)$ acting on $L_J$ parametrizes the set $L_\Theta$ of $\Theta$-dissipative behaviors. Let us check this claim on an arbitrary $J$-dissipative behavior defined, as in Example 3.4.8, by $B = \mathrm{Im}\begin{bmatrix} \frac{d}{dt} + 1 \\ \frac{d}{dt} + 2 \end{bmatrix}$. The action of $L(\frac{d}{dt})$ on this behavior $B$ gives the behavior $B'$ defined as the image of
\[
\begin{bmatrix} 1 & \xi^2/2 \\ 0 & \xi \end{bmatrix}\begin{bmatrix} \xi + 1 \\ \xi + 2 \end{bmatrix} = \begin{bmatrix} \xi^3/2 + \xi^2 + \xi + 1 \\ \xi^2 + 2\xi \end{bmatrix}
\]
We see that
\[
\begin{bmatrix} \tfrac{1}{2}(-i\omega)^3 + (-i\omega)^2 + (-i\omega) + 1 & \ (-i\omega)^2 + 2(-i\omega) \end{bmatrix}
\begin{bmatrix} 0 & -i\omega \\ i\omega & \omega^2 \end{bmatrix}
\begin{bmatrix} \tfrac{1}{2}(i\omega)^3 + (i\omega)^2 + (i\omega) + 1 \\ (i\omega)^2 + 2(i\omega) \end{bmatrix} \tag{3.18}
\]
gives the polynomial $2\omega^4 + 4\omega^2$. This polynomial is non-negative for all real values of $\omega$, and hence the behavior $B'$ is $\Theta$-dissipative.
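The computation in Example 3.4.9 can be re-checked symbolically (our sketch):

```python
# Sketch: apply L(xi) = [[1, xi^2/2], [0, xi]] to the J-dissipative behavior
# Im[xi + 1; xi + 2] and test Theta-dissipativity via Theorem 3.2.3.
import sympy as sp

xi = sp.symbols('xi')
omega = sp.symbols('omega', real=True)

L = sp.Matrix([[1, xi**2/2], [0, xi]])
M = sp.expand(L * sp.Matrix([[xi + 1], [xi + 2]]))   # new image representation

Theta = sp.Matrix([[0, -sp.I*omega], [sp.I*omega, omega**2]])  # Theta(-iw, iw)
val = (M.subs(xi, -sp.I*omega).T * Theta * M.subs(xi, sp.I*omega))[0, 0]
print(sp.expand(val))   # 2*omega**4 + 4*omega**2 >= 0 for all real omega
```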
In the next section we generalize the results obtained so far. We investigate behaviors with more than two manifest variables that are dissipative with respect to a supply function defined by a QDF. We first consider QDFs $Q_\Phi$ where $\Phi(-i\omega, i\omega)$ has constant inertia for almost all $\omega \in R$. In this case the results for the SISO and the MIMO case do not differ much, though the techniques used are different. We then consider the important case in which the inertia of $\Phi(-i\omega, i\omega)$ varies with $\omega$.

3.5 MIMO dissipative systems: the constant inertia case

In Section 3.4 we considered the QDF $Q_J$, where $J$ was the $2\times 2$ matrix
\[
J = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix}
\]
Notice that in the MIMO case such a "$J$" can be defined only if the number of inputs and the number of outputs of a system are the same. In order to overcome this limitation, we look at a more general $Q_J$.

3.5.1 Supply functions defined by constant matrices

Consider the inertia matrix
\[
J_{mn} = \begin{bmatrix} I_m & 0 \\ 0 & -I_n \end{bmatrix} \tag{3.19}
\]
where $m, n \in Z_+$ and $I_m$ (respectively $I_n$) denotes the $m\times m$ (respectively $n\times n$) identity matrix. Consider the corresponding QDF $Q_{J_{mn}}$. We show that $m$ is the maximum possible input cardinality of a $J_{mn}$-dissipative behavior:

Lemma 3.5.1 Let $B$ be $J_{mn}$-dissipative. Then the input cardinality of $B$ satisfies $m(B) \le m$.

Proof: We prove the lemma by contradiction. Suppose there exists a $J_{mn}$-dissipative behavior $B'$, defined by an observable image representation $w = M(\frac{d}{dt})\ell$, such that $m(B') = \alpha$ with $\alpha > m$. Since the minimum number of latent variables equals the input cardinality, $\ell$ is $R^\alpha$-valued. Further, the column rank of $M(\xi)$ is $\alpha$. Partition the rows of $M$ as $M_m$ and $M_n$ conformally with $J_{mn}$. Since $m < \alpha$, the system of equations $M_m(\frac{d}{dt})\ell = 0$ is under-determined. Hence one can find $\ell \in D(R, R^\alpha)\setminus\{0\}$ such that $M_m(\frac{d}{dt})\ell = 0$. Since $\mathrm{Im}\, M(\frac{d}{dt})$ is observable, the trajectory $w = M(\frac{d}{dt})\ell$ with $\ell \in (D(R, R^\alpha)\setminus\{0\}) \cap \mathrm{Ker}\, M_m(\frac{d}{dt})$ is non-zero. Integrating $Q_{J_{mn}}$ along this $w$ yields a negative quantity, which contradicts the assumption that $B'$ is $J_{mn}$-dissipative. Hence the input cardinality of a $J_{mn}$-dissipative behavior cannot exceed $m$.

Consider a $J_{mn}$-dissipative behavior $B$ defined by an observable image representation
\[
\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} Q(\frac{d}{dt}) \\ P(\frac{d}{dt}) \end{bmatrix}\ell
\]
with $\ell \in C^\infty(R, R^l)$, $Q(\xi) \in R^{m\times l}[\xi]$, $P(\xi) \in R^{n\times l}[\xi]$. Consider the special case $l = m$. Theorem 3.2.3 shows that $B$ is $J_{mn}$-dissipative if and only if the following inequality holds:
\[
Q^T(-i\omega)Q(i\omega) - P^T(-i\omega)P(i\omega) \ge 0 \quad \forall\, \omega \in R
\]
Notice that since the image representation is observable by assumption, $Q(\xi)$ has no singularities on $iR$. Further, since $Q(\xi)$ is a polynomial matrix, it follows that $\det Q(\xi) \ne 0$. Define $G(\xi) = P(\xi)Q^{-1}(\xi)$. Then the above inequality is equivalent to
\[
G^T(-i\omega)G(i\omega) \le I_m \quad \text{for almost all } \omega \in R \tag{3.20}
\]
Rational functions satisfying the above inequality have been well studied in systems theory. Note that $G^T(-i\omega)G(i\omega)$ represents the square of the gain of the transfer function $G(\xi)$ for sinusoidal inputs of frequency $\omega$. Thus inequality (3.20) gives an upper bound on the gain of the transfer function $G(\xi)$ in the frequency domain. The existence of such an upper bound is particularly significant in disturbance attenuation; see Chapter 7. Another important application of rational functions satisfying inequality (3.20) is in interpolation theory; see [9] and Chapter 9. Inequality (3.20) implies that the $L_\infty$ norm of $G(\xi)$ is at most unity.
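For scalar systems, inequality (3.20) is simply a bound on $|G(i\omega)|$. A one-line sketch (ours, with an assumed $G$ that is not taken from the thesis):

```python
# Sketch: the gain bound (3.20) for the (assumed) scalar G(xi) = 1/(xi + 1):
# G(-iw)G(iw) = 1/(1 + w^2) <= 1 for all real w, so the L-infinity norm of G
# is at most unity and the associated behavior is J11-dissipative.
import sympy as sp

omega = sp.symbols('omega', real=True)
G = lambda s: 1/(s + 1)

gain_sq = sp.simplify(G(-sp.I*omega)*G(sp.I*omega))
print(gain_sq)                     # 1/(omega**2 + 1)
print(sp.simplify(1 - gain_sq))    # omega**2/(omega**2 + 1), nonnegative
```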
Because rational functions satisfying inequality (3.20) are so well understood, we shall take the set of all $J_{mn}$-dissipative behaviors as the "base set" and parametrize more general dissipative behaviors in terms of $J_{mn}$-dissipative behaviors. Section 3.4 dealt with the case $m = n = 1$. We now consider the special case $m = n$. The matrix $J_{mm}$ is then congruent to the matrix $J = J^T \in R^{2m\times 2m}$ defined by
\[
J = \frac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix} \tag{3.21}
\]
since
\[
2\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix} = \begin{bmatrix} I_m & I_m \\ I_m & -I_m \end{bmatrix}\begin{bmatrix} I_m & 0 \\ 0 & -I_m \end{bmatrix}\begin{bmatrix} I_m & I_m \\ I_m & -I_m \end{bmatrix}
\]
We have seen that the supply function $Q_J$ has important implications, especially in the context of electrical circuits. It is equivalent to study systems that are $J_{mn}$-dissipative or $J$-dissipative, and we may prefer one over the other depending on the situation at hand. In the sequel we consider $J_{mn}$ instead of $J$, because $Q_{J_{mn}}$, unlike $Q_J$, does not presume that the number of inputs and the number of outputs are the same. We are now in a position to address the parametrization problem for supply functions defined by non-constant polynomial matrices.

3.5.2 Supply functions defined by polynomial matrices

Consider a supply function defined by $Q_\Phi$ with $\Phi \in R^{w\times w}_s[\zeta, \eta]$. Since $\Phi(-i\omega, i\omega)$ is a Hermitian matrix, it has real eigenvalues for every $\omega \in R$. Clearly, if $\Phi(-i\omega, i\omega)$ is positive semidefinite for every $\omega \in R$, every controllable behavior in $L^w_{con}$ is $\Phi$-dissipative. At the other extreme is the case when $\Phi(-i\omega_1, i\omega_1)$ is negative definite for some $\omega_1 \in R$; in this case there exist no non-trivial $\Phi$-dissipative behaviors (Lemma 3.2.5). In this section we assume that $\Phi(-i\omega, i\omega)$ has constant inertia for almost all $\omega \in R$, i.e. the numbers of positive, negative and zero eigenvalues of the Hermitian matrix $\Phi(-i\omega, i\omega)$ remain the same for almost all $\omega \in R$. We distinguish this case from the case in which the inertia of $\Phi(-i\omega, i\omega)$ changes with $\omega$; we address the latter case in the next section.

It is a difficult and non-trivial result that every nonsingular $\Phi(-i\omega, i\omega) = \Phi^T(i\omega, -i\omega) \in R^{w\times w}[i\omega]$ having constant inertia for almost all $\omega \in R$ admits a polynomial $J_{mn}$-spectral factorization (see [17, 38, 49, 80] for details), i.e. there exist a nonsingular matrix $J_{mn} = J_{mn}^T \in R^{w\times w}$ and $K(i\omega) \in R^{w\times w}[i\omega]$ such that
\[
\Phi(-i\omega, i\omega) = K^T(-i\omega)J_{mn}K(i\omega) \tag{3.22}
\]
with $m + n = w$. Consider the differential operator $K(\frac{d}{dt})$ induced by the polynomial matrix $K(i\omega)$. We define the action of $K(\frac{d}{dt})$ on a controllable behavior $B \in L^w_{con}$ as
\[
K(\tfrac{d}{dt})(B) := \{v \mid v(t) = K(\tfrac{d}{dt})w(t),\ w(t) \in B\}
\]
Clearly $K(\frac{d}{dt})(B)$ is controllable and has the same input cardinality as $B$, since $K(\xi)$ is nonsingular. The following proposition is an easy consequence of Definition 3.3.1:

Proposition 3.5.2 Consider $\Phi_1 \in R^{w\times w}_s[\zeta, \eta]$ such that $\Phi_1(-i\omega, i\omega)$ is nonsingular and has constant inertia for almost all $\omega \in R$. Obtain a polynomial $J_{mn}$-spectral factorization of $\Phi_1(-i\omega, i\omega)$ as $K^T(-i\omega)J_{mn}K(i\omega)$, and define $\Phi_2$ by
\[
\Phi_2(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)
\]
Then $Q_{\Phi_1} \sim Q_{\Phi_2}$.

The following theorem gives a parametrization of $\Phi$-dissipative behaviors.

Theorem 3.5.3 Given a QDF $Q_\Phi$ with $\Phi(-i\omega, i\omega) \in R^{w\times w}[i\omega]$ nonsingular and of constant inertia for almost all $\omega \in R$, obtain a polynomial $J_{mn}$-spectral factorization of $\Phi(-i\omega, i\omega)$ as in equation (3.22). Let $K(\xi)$ be the matrix obtained by substituting $\xi$ for $i\omega$ in the matrix $K(i\omega)$. Then the following hold:
1. $K(\tfrac{d}{dt})$ is a map from the set $\mathcal{L}_\Phi$ to $\mathcal{L}_{J_{mn}}$, i.e., given $\mathcal{B} \in \mathcal{L}^w_{con}$,
\[ K(\tfrac{d}{dt})(\mathcal{B}) \in \mathcal{L}_{J_{mn}} \iff \mathcal{B} \in \mathcal{L}_\Phi \]
2. Define $L(\xi)$ as the adjugate matrix of $K(\xi)$: $L(\xi) = \operatorname{adj} K(\xi)$, i.e., $L(\xi)K(\xi) = \det(K(\xi))I_w$. Then the differential operator $L(\tfrac{d}{dt})$ parametrizes the set $\mathcal{L}_\Phi$ through the set $\mathcal{L}_{J_{mn}}$, i.e., $L(\tfrac{d}{dt})$ maps every $J_{mn}$-dissipative behavior to a $\Phi$-dissipative behavior.

Proof: Let $\Theta(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$. Clearly $\Theta(-i\omega, i\omega) = \Phi(-i\omega, i\omega)$, and therefore from Proposition 3.5.2, $Q_\Theta \sim Q_\Phi$. Let $w = M(\tfrac{d}{dt})\ell$ be an observable image representation of a behavior $\mathcal{B} \in \mathcal{L}^w_{con}$. Notice that
\[ M^T(-i\omega)\Phi(-i\omega, i\omega)M(i\omega) = [K(-i\omega)M(-i\omega)]^T J_{mn} [K(i\omega)M(i\omega)] \]
Define the behavior $\mathcal{B}'$ as the image of $M'(\tfrac{d}{dt}) := K(\tfrac{d}{dt})M(\tfrac{d}{dt})$. Then
\[ M^T(-i\omega)\Phi(-i\omega, i\omega)M(i\omega) \geq 0 \iff M'^T(-i\omega)J_{mn}M'(i\omega) \geq 0 \]
which shows that $\mathcal{B}$ is $\Phi$-dissipative if and only if $\mathcal{B}'$ is $J_{mn}$-dissipative. Thus $K(\tfrac{d}{dt})(\mathcal{B})$ is $J_{mn}$-dissipative if and only if $\mathcal{B}$ is $\Phi$-dissipative.

Consider an arbitrary nonzero polynomial $r(\xi) \in \mathbb{R}[\xi]$. Note that $\operatorname{Im} M(\tfrac{d}{dt})r(\tfrac{d}{dt})$ defines the same controllable behavior in $\mathcal{L}^w_{con}$ as $\operatorname{Im} M(\tfrac{d}{dt})$. Since $L(\xi)K(\xi) = \det(K(\xi))I_w$, the map defined by $L(\tfrac{d}{dt})K(\tfrac{d}{dt})$ is therefore the identity map on $\mathcal{L}^w_{con}$, i.e.,
\[ L(\tfrac{d}{dt})K(\tfrac{d}{dt})(\mathcal{B}) = \mathcal{B} \quad \forall\ \mathcal{B} \in \mathcal{L}^w_{con} \]
and so is the restriction of this map to $\mathcal{L}_\Phi$. It therefore follows that $L(\tfrac{d}{dt})$ defines the inverse map of $K(\tfrac{d}{dt})$:
\[ L(\tfrac{d}{dt})(\mathcal{B}') \in \mathcal{L}_\Phi \iff \mathcal{B}' \in \mathcal{L}_{J_{mn}} \]
Thus the set $\mathcal{L}_\Phi$ is precisely the image of $\mathcal{L}_{J_{mn}}$ under the map $L(\tfrac{d}{dt})$.

Remark 3.5.4 Closely associated with the concept of behavioral dissipativity is that of losslessness ($\mathcal{B}$ is said to be $\Phi$-lossless if $\int_{-\infty}^{\infty} Q_\Phi(w)\,dt = 0$ along all compactly supported trajectories $w \in \mathcal{B}$). It is clear that the parametrization obtained in this chapter can also be used for the parametrization of lossless behaviors. In particular, it is easy to see that $\Phi$-lossless behaviors can be parametrized as the image of $J_{mn}$-lossless behaviors under $L(\tfrac{d}{dt})$.
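Part (2) of Theorem 3.5.3 turns on the adjugate identity $L(\xi)K(\xi) = \det(K(\xi))I_w$. As a minimal illustration (ours, with a hypothetical nonsingular $K(\xi)$ chosen only for demonstration), sympy's built-in adjugate confirms the identity:

```python
import sympy as sp

xi = sp.symbols('xi')
K = sp.Matrix([[1, xi], [0, xi + 2]])   # a hypothetical nonsingular K(xi)
L = K.adjugate()                        # L(xi) = adj K(xi)
# L(xi) K(xi) should equal det(K(xi)) times the identity
print(sp.simplify(L * K - K.det() * sp.eye(2)))   # zero matrix
```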
Thus we have obtained from Theorem 3.5.3 a complete characterization of $\Phi$-dissipative behaviors under the assumption that $\Phi(-i\omega, i\omega)$ is nonsingular and has constant inertia for almost all $\omega \in \mathbb{R}$. We now consider the most general case, when the inertia of $\Phi(-i\omega, i\omega)$ changes as a function of $\omega$. Such supply functions will be useful in solving the problem of synthesis of a dissipative behavior in Chapter 8.

3.6 MIMO dissipative systems: the general inertia case

In this section we address the problem of parametrizing $\Phi$-dissipative behaviors when the inertia of $\Phi(-i\omega, i\omega)$ changes with $\omega$. Given $\Phi(-\xi, \xi) = \Phi^T(\xi, -\xi) \in \mathbb{R}^{w \times w}[\xi]$, let $n = \max_{\xi \in i\mathbb{R}} \sigma_-(\Phi(-\xi, \xi))$, i.e., $n$ is the maximum number of negative eigenvalues of $\Phi(-i\omega, i\omega)$, $\omega \in \mathbb{R}$. Recall that if $\Phi(-i\omega_0, i\omega_0)$ is negative definite for some $\omega_0 \in \mathbb{R}$, there can exist no non-trivial behaviors that are $\Phi$-dissipative. A necessary condition for the existence of non-trivial dissipative behaviors is therefore clearly $n < w$. Thus the number $n$ has a direct influence on the "size" of the set of $\Phi$-dissipative behaviors. We now define what we call a "worst inertia" matrix associated with $\Phi(-\xi, \xi)$; this matrix plays a central role in the parametrization results obtained later in this chapter.

Definition 3.6.1 Given nonsingular $\Phi(-\xi, \xi) = \Phi^T(\xi, -\xi) \in \mathbb{R}^{w \times w}[\xi]$, let $n := \max_{\xi \in i\mathbb{R}} \sigma_-(\Phi(-\xi, \xi))$. The $w \times w$ inertia matrix
\[ J_{worst} := \begin{bmatrix} I_{w-n} & 0 \\ 0 & -I_n \end{bmatrix} \]
is called the "worst inertia" matrix of $\Phi(-\xi, \xi)$. The integer three-tuple $(w - n, n, 0)$ is called the "worst inertia" of $\Phi(-\xi, \xi)$ and is denoted by $\sigma_{worst}(\Phi)$.

Given $Q_\Phi$, $\sigma_{worst}(\Phi)$ is unique. In Definition 3.6.1, $\Phi(-\xi, \xi)$ has been assumed to be nonsingular mainly for convenience; the theory presented in this section can still be used when $\Phi(-\xi, \xi)$ is singular, by appropriately modifying the "worst inertia" matrix with zero submatrices.

Since $\Phi(-i\omega, i\omega)$ is a polynomial matrix in $i\omega$, its eigenvalues change continuously with $\omega$. Let $\omega_1 < \omega_2$ be such that $\Phi(-i\omega_i, i\omega_i)$ is singular for $i = 1, 2$. The inertia of $\Phi(-i\omega, i\omega)$ remains constant on the interval $(\omega_1, \omega_2)$ if there exists no $\omega_3 \in (\omega_1, \omega_2)$ such that $\Phi(-i\omega_3, i\omega_3)$ is singular. Thus, in order to determine the "worst inertia" of $\Phi(-i\omega, i\omega)$, it is enough to determine the inertia at any real number between two consecutive real singularities of $\Phi(-i\omega, i\omega)$. Using this fact, the following algorithm can be used to determine the "worst inertia" (a computational sketch follows below):

1. Given nonsingular $\Phi(-i\omega, i\omega) = \Phi^T(i\omega, -i\omega) \in \mathbb{R}^{w \times w}[i\omega]$, determine all real, non-negative, distinct roots of $\det \Phi(-i\omega, i\omega) = 0$ and arrange them in ascending order, i.e., determine $\omega_1 < \ldots < \omega_k$ such that $\det \Phi(-i\omega_i, i\omega_i) = 0$ and $\omega_i \geq 0$, $i = 1, \ldots, k$.

2. If $\omega_1 = 0$: let $\omega_{k+1}$ be an arbitrary finite real number larger than $\omega_k$. Determine the inertia of the $k$ matrices $\Phi(-i\bar\omega_i, i\bar\omega_i)$ with $\bar\omega_i := (\omega_i + \omega_{i+1})/2$, $i = 1, \ldots, k$. Denote the inertia of the $i$-th matrix by $\sigma^i(\Phi)$, and let $\sigma^i_-(\Phi)$ denote its number of negative eigenvalues. Then the "worst inertia" $\sigma_{worst}(\Phi)$ is defined as $\sigma^p(\Phi)$, where $p$ is such that $\sigma^p_-(\Phi)$ is the maximum among $\sigma^i_-(\Phi)$, $i = 1, \ldots, k$.

3. If $\omega_1 > 0$: let $\omega_0 = 0$ and let $\omega_{k+1}$ be an arbitrary real number larger than $\omega_k$. Determine the inertia of the $k + 1$ matrices $\Phi(-i\bar\omega_i, i\bar\omega_i)$ with $\bar\omega_i := (\omega_{i-1} + \omega_i)/2$, $i = 1, \ldots, k+1$. Denote the inertia of the $i$-th matrix by $\sigma^i(\Phi)$, and let $\sigma^i_-(\Phi)$ denote its number of negative eigenvalues. Then the "worst inertia" $\sigma_{worst}(\Phi)$ is defined as $\sigma^p(\Phi)$, where $p$ is such that $\sigma^p_-(\Phi)$ is the maximum among $\sigma^i_-(\Phi)$, $i = 1, \ldots, k+1$.

Figure 3.1 illustrates the concept of the "worst inertia" of a matrix $\Phi(-i\omega, i\omega)$.

[Figure 3.1: Worst inertia of $\Phi(-i\omega, i\omega)$. The figure plots the eigenvalue curves of $\Phi(-i\omega, i\omega)$ against $\omega$; the real non-negative roots $\omega_1, \omega_2, \omega_3$ of $\det \Phi(-i\omega, i\omega) = 0$ mark where the inertia can change.]

Let $\omega_i$, $i = 1, 2, 3$, be the real non-negative roots of $\det \Phi(-i\omega, i\omega) = 0$. For $\omega \in (\omega_1, \omega_2)$, $\Phi(-i\omega, i\omega)$ has two positive and one negative eigenvalues. For $\omega \in (\omega_2, \omega_3)$, it has one positive and two negative eigenvalues. For $\omega \in (\omega_3, \infty)$, it has three positive eigenvalues. Thus the "worst inertia" of $\Phi(-i\omega, i\omega)$ is attained in the interval $(\omega_2, \omega_3)$, and is found to be $(1, 2, 0)$. The "worst inertia" matrix is therefore
\[ J_{worst} = J_{1\,2} = \begin{bmatrix} 1 & 0 \\ 0 & -I_2 \end{bmatrix} \]
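The algorithm above is straightforward to implement. The following Python sketch is our own illustration (the function name `worst_inertia`, the numerical tolerance, and the hypothetical example with $\Phi(-i\omega, i\omega) = \operatorname{diag}(1, 1 - \omega^2)$ are assumptions for demonstration only):

```python
import numpy as np

def worst_inertia(Phi, roots, tol=1e-9):
    """Sketch of the worst-inertia algorithm. Phi is a callable returning
    the Hermitian matrix Phi(-i w, i w); roots are the real non-negative
    roots of det Phi(-i w, i w) = 0. The inertia is sampled at the midpoint
    of each interval between consecutive roots, and the sample with the
    most negative eigenvalues is returned."""
    pts = sorted(r for r in roots if r >= 0)
    if not pts or pts[0] > 0:
        pts = [0.0] + pts              # add the interval (0, w_1), step 3
    pts = pts + [pts[-1] + 1.0]        # an arbitrary point beyond the last root
    worst = None
    for a, b in zip(pts, pts[1:]):
        eigs = np.linalg.eigvalsh(Phi((a + b) / 2.0))
        pos = int((eigs > tol).sum())
        neg = int((eigs < -tol).sum())
        sigma = (pos, neg, len(eigs) - pos - neg)
        if worst is None or sigma[1] > worst[1]:
            worst = sigma
    return worst

# Hypothetical example: Phi(-i w, i w) = diag(1, 1 - w^2); det vanishes at w = 1
Phi = lambda w: np.array([[1.0, 0.0], [0.0, 1.0 - w**2]])
print(worst_inertia(Phi, roots=[1.0]))   # (1, 1, 0)
```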
We begin by quoting the following important theorem from [80] (IMA preprint no. 992, Theorem 5.1), which concerns a "minimal size" factorization of a para-Hermitian matrix.

Theorem 3.6.2 The polynomial matrix $\Phi(-\xi, \xi) = \Phi^T(\xi, -\xi) \in \mathbb{R}^{w \times w}[\xi]$ admits a factorization
\[ \Phi(-\xi, \xi) = K^T(-\xi)RK(\xi) \tag{3.23} \]
with $R = R^T \in \mathbb{R}^{m \times m}$ and $K(\xi) \in \mathbb{R}^{m \times w}[\xi]$ if and only if
\[ m \geq m_0 := \max_{\xi \in i\mathbb{R}} \sigma_+(\Phi) + \max_{\xi \in i\mathbb{R}} \sigma_-(\Phi) \]
Moreover, if $R$ is of the minimal size (i.e., $m_0 \times m_0$) then it is uniquely determined up to congruence: $R$ has exactly $\max_{\xi \in i\mathbb{R}} \sigma_+(\Phi)$ positive and $\max_{\xi \in i\mathbb{R}} \sigma_-(\Phi)$ negative eigenvalues, and hence can without loss of generality be taken to be the inertia matrix with these eigenvalues. Obviously, $m_0 \leq 2w$.

The case that interests us is when $\max_{\xi \in i\mathbb{R}} \sigma_-(\Phi) < w$ (i.e., $\Phi(-i\omega, i\omega)$ is not negative definite for any $\omega \in \mathbb{R}$). Consider Figure 3.1: there $\max_{\xi \in i\mathbb{R}} \sigma_-(\Phi) = 2$ and $\max_{\xi \in i\mathbb{R}} \sigma_+(\Phi) = 3$. Therefore, for the matrix considered in Figure 3.1, $m_0 = 2 + 3 = 5$, and $R$ can be taken to be the $5 \times 5$ inertia matrix $\operatorname{diag}[I_3, -I_2]$.

Remark 3.6.3 Consider $\Phi(-\xi, \xi) = K^T(-\xi)RK(\xi)$ (Theorem 3.6.2), and consider polynomial matrices $X(\xi)$ such that $X^T(-\xi)RX(\xi) = R$. Then $K'(\xi) := X(\xi)K(\xi)$ is also a possible factor for $\Phi(-\xi, \xi)$: $\Phi(-\xi, \xi) = K'^T(-\xi)RK'(\xi)$. Matrices $X(\xi)$ that satisfy $X^T(-\xi)RX(\xi) = R$ are called $R$-unitary; see [9] for a discussion of $R$-unitary matrices.

If $\Phi(-\xi, \xi)$ is nonsingular then $K(\xi)$ has full column rank. Partition $K(\xi)$ as $\begin{bmatrix} K_1(\xi) \\ K_2(\xi) \end{bmatrix}$, with $K_1(\xi)$ having $\sigma_+(R)$ rows (the number of positive eigenvalues of $R$) and $K_2(\xi)$ having $\sigma_-(R)$ rows. Then:

Theorem 3.6.4 The matrices $K_1(\xi)$ and $K_2(\xi)$, obtained by partitioning the rows of $K(\xi)$ conformally with the numbers of positive and negative eigenvalues of $R$ respectively, have full row rank as polynomial matrices.

Proof: Suppose not. We first consider the case when $K_1(\xi)$ does not have full row rank. Denote the rows of $K_1(\xi)$ by $r_1, \ldots, r_l$, where $l = \sigma_+(R)$. Then there exist polynomials $p_1, \ldots, p_l \in \mathbb{R}[\xi]$, not all zero, such that $p_1 r_1 = \sum_{i=2}^{l} p_i r_i$. We assume without loss of generality that $p_1 \neq 0$. Define
\[ C(\xi) = \begin{bmatrix} p_2 & p_3 & \ldots & p_l \end{bmatrix}, \qquad L(\xi) = \begin{bmatrix} r_2 \\ r_3 \\ \vdots \\ r_l \end{bmatrix} \tag{3.24} \]
Consider the polynomial matrix $Z(\xi) = p_1(\xi)p_1(-\xi)\Phi(-\xi, \xi)$. Then
\[ Z(\xi) = \begin{bmatrix} L^T(-\xi) & K_2^T(-\xi) \end{bmatrix} \begin{bmatrix} p_1(\xi)p_1(-\xi)I_a + C^T(-\xi)C(\xi) & 0 \\ 0 & -p_1(\xi)p_1(-\xi)I_b \end{bmatrix} \begin{bmatrix} L(\xi) \\ K_2(\xi) \end{bmatrix} \]
where $I_a$ denotes the identity matrix of size $\max_{\xi \in i\mathbb{R}} \sigma_+(\Phi) - 1$ and $I_b$ denotes the identity matrix of size $\max_{\xi \in i\mathbb{R}} \sigma_-(\Phi)$. The matrices $p_1(\xi)p_1(-\xi)I_a + C^T(-\xi)C(\xi)$ and $p_1(\xi)p_1(-\xi)I_b$ are positive definite for almost all $\xi \in i\mathbb{R}$. Hence the block diagonal matrix has constant inertia for almost all $\xi \in i\mathbb{R}$. Consequently, it can be factorized as $L_1^T(-\xi)J_{ab}L_1(\xi)$ with $L_1(\xi)$ square and $J_{ab}$ an inertia matrix having $\max_{\xi \in i\mathbb{R}} \sigma_+(\Phi) - 1$ positive ones; this follows from the standard polynomial $J_{mn}$-spectral factorization (3.22). Notice that the eigenvalues of $Z(i\omega)$ and those of $\Phi(-i\omega, i\omega)$ are related by a positive scaling factor for almost all $\omega \in \mathbb{R}$, and consequently
\[ \max_{\xi \in i\mathbb{R}} \sigma_+(Z(\xi)) = \max_{\xi \in i\mathbb{R}} \sigma_+(\Phi(-\xi, \xi)) \]
We have thus obtained a factorization of $Z(\xi)$ of size less than $\max_{\xi \in i\mathbb{R}} \sigma_+(Z(\xi)) + \max_{\xi \in i\mathbb{R}} \sigma_-(Z(\xi))$, which contradicts Theorem 3.6.2. Hence $K_1(\xi)$ must have full row rank. The proof that $K_2(\xi)$ has full row rank is analogous.

We use Theorem 3.6.4 to prove the following important result:

Theorem 3.6.5 Every nonsingular matrix $\Phi(-\xi, \xi) = \Phi^T(\xi, -\xi) \in \mathbb{R}^{w \times w}[\xi]$ can be written as
\[ \Phi(-\xi, \xi) = N^T(-\xi)J_{worst}N(\xi) + D^T(-\xi)D(\xi) \]
with $J_{worst} \in \mathbb{R}^{w \times w}$ the "worst inertia" matrix associated with $\Phi(-\xi, \xi)$ and $N(\xi)$ square and nonsingular.

Proof: Obtain a minimal factorization of $\Phi(-\xi, \xi)$ as in Theorem 3.6.2: $\Phi(-\xi, \xi) = K^T(-\xi)RK(\xi)$ with $R = \operatorname{diag}(I_{m_0 - n}, -I_n) \in \mathbb{R}^{m_0 \times m_0}$. We emphasize that $n$ denotes the maximum number of negative eigenvalues of $\Phi(-i\omega, i\omega)$ for $\omega \in \mathbb{R}$. Partition $K(\xi) = \begin{bmatrix} K_1(\xi) \\ K_2(\xi) \end{bmatrix}$ conformally with the partition of $R$. Note that $K(\xi)$ has full column rank, and by Theorem 3.6.4, $K_1(\xi)$ and $K_2(\xi)$ have full row rank. Find a $\xi_0 \in \mathbb{C}$ such that the rows of $K_2(\xi_0)$ are linearly independent (over $\mathbb{C}$) and $K(\xi_0)$ has full column rank.
The rows of $K_2(\xi_0)$ form a basis for an $n$-dimensional subspace of $\mathbb{C}^w$, and this basis can be extended to a basis of $\mathbb{C}^w$ using $k := w - n$ rows of $K_1(\xi_0)$. Denote the corresponding rows of $K_1(\xi)$ by $r_1(\xi), \ldots, r_k(\xi)$. Let $N(\xi)$ be the matrix obtained by stacking $r_1(\xi), \ldots, r_k(\xi)$ over the matrix $K_2(\xi)$, i.e.,
\[ N(\xi) = \begin{bmatrix} r_1(\xi) \\ \vdots \\ r_k(\xi) \\ K_2(\xi) \end{bmatrix} \]
Then $N(\xi)$ is a square and nonsingular polynomial matrix. Let $D(\xi)$ be the matrix obtained by stacking the remaining $m_0 - n - k$ rows of $K_1(\xi)$:
\[ D(\xi) = \begin{bmatrix} r_{k+1}(\xi) \\ \vdots \\ r_{m_0 - n}(\xi) \end{bmatrix} \]
It is now easy to see that $\Phi(-\xi, \xi)$ can be written as $N^T(-\xi)J_{worst}N(\xi) + D^T(-\xi)D(\xi)$, and, as shown above, $N(\xi)$ is nonsingular.

Remark 3.6.6 Given $\Phi(-\xi, \xi) = K^T(-\xi)RK(\xi)$ (Theorem 3.6.2, Theorem 3.6.5), it is clear that $N(\xi)$ and $D(\xi)$ are not unique. In fact, there exist $\binom{m_0 - n}{w - n}$ "sums" of the form $N^T(-\xi)J_{worst}N(\xi) + D^T(-\xi)D(\xi)$ that can be obtained from $\Phi(-\xi, \xi)$, with $N(\xi), D(\xi)$ unique up to a permutation of rows. It is guaranteed that $N(\xi)$ has rank at least $n$, since by construction we have retained all rows of $K_2(\xi)$, which has full row rank. However, not all such sums will yield a nonsingular $N(\xi)$.

We now define the notion of a "split sum" of $\Phi(-\xi, \xi)$:

Definition 3.6.7 Given a nonsingular para-Hermitian matrix $\Phi(-\xi, \xi)$, let $J_{worst}$ be the "worst inertia" matrix associated with $\Phi(-\xi, \xi)$. Consider matrices $N(\xi), D(\xi)$ such that $N^T(-\xi)J_{worst}N(\xi) + D^T(-\xi)D(\xi) = \Phi(-\xi, \xi)$. The triple $(J_{worst}, N(\xi), D(\xi))$ is said to define a split sum of $\Phi(-\xi, \xi)$ if $N(\xi)$ is nonsingular.

A split sum of $\Phi(-\xi, \xi)$ can be thought of as a decomposition of $\Phi(-i\omega, i\omega)$ into the sum of a sign-indefinite and a positive semidefinite matrix; the definition of a split sum requires that the indefinite part be nonsingular. A split sum of $\Phi(-\xi, \xi)$ is not unique: from Remarks 3.6.3 and 3.6.6, one can modify a split sum using either an $R$-unitary transformation (Remark 3.6.3) or a different combination of rows to form the split sum (Remark 3.6.6). A split sum of $\Phi(-\xi, \xi)$ is useful for parametrizing a set of $\Phi$-dissipative behaviors, as the following section shows.

3.6.1 Parametrizing a set of Φ-dissipative behaviors using split sums

Given a nonsingular $\Phi(-\xi, \xi) = \Phi^T(\xi, -\xi)$, let $(J_{worst}, N(\xi), D(\xi))$ define a split sum of $\Phi(-\xi, \xi)$. Define the two-variable polynomial matrix $\Theta(\zeta, \eta)$ as follows:
\[ \Theta(\zeta, \eta) = N^T(\zeta)J_{worst}N(\eta) \tag{3.25} \]
We can see that $\Theta(-i\omega, i\omega)$ is precisely the indefinite part of the split sum on the imaginary axis. Then:

Theorem 3.6.8 The set of $\Theta$-dissipative behaviors $\mathcal{L}_\Theta$ is a subset of the set of $\Phi$-dissipative behaviors $\mathcal{L}_\Phi$: $\mathcal{L}_\Theta \subseteq \mathcal{L}_\Phi \subset \mathcal{L}^w_{con}$.

Proof: Let $w = M(\tfrac{d}{dt})\ell$ define a $\Theta$-dissipative behavior $\mathcal{B}$. Then
\[ M^T(-i\omega)\Theta(-i\omega, i\omega)M(i\omega) \geq 0 \quad \forall\ \omega \in \mathbb{R} \]
The matrix $M^T(-i\omega)D^T(-i\omega)D(i\omega)M(i\omega)$ is positive semidefinite for all $\omega \in \mathbb{R}$ and $M(\xi) \in \mathbb{R}^{w \times \bullet}[\xi]$. Hence adding it to the above inequality does not affect the nature of the inequality:
\[ M^T(-i\omega)\left[D^T(-i\omega)D(i\omega) + \Theta(-i\omega, i\omega)\right]M(i\omega) \geq 0 \quad \forall\ \omega \in \mathbb{R} \]
which shows that $\mathcal{B}$ is $\Phi$-dissipative.

Let $(J_{worst}, N(\xi), D(\xi))$ define a split sum of $\Phi(-\xi, \xi)$, and consider the map $N(\tfrac{d}{dt})$. From the results presented in Section 3.5, we can think of $N(\tfrac{d}{dt})$ as a map from the set of $\Theta$-dissipative behaviors to the set of $J_{worst}$-dissipative behaviors, i.e.,
\[ N(\tfrac{d}{dt}) : \mathcal{L}_\Theta \to \mathcal{L}_{J_{worst}} \]
Further, the map $N(\tfrac{d}{dt})$ admits an inverse map in the following sense: denote by $L(\xi)$ the adjugate matrix of $N(\xi)$, i.e., $L(\xi)N(\xi) = d(\xi)I_w$ with $d(\xi) = \det N(\xi)$. Then
\[ L(\tfrac{d}{dt}) : \mathcal{L}_{J_{worst}} \to \mathcal{L}_\Theta \]
and hence into $\mathcal{L}_\Phi$.
The following theorem can be used to parametrize a set of $\Phi$-dissipative behaviors.

Theorem 3.6.9 Consider a QDF $Q_\Phi$ with $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ such that $\Phi(-\xi, \xi)$ is nonsingular. Let $(J_{worst}, N_i(\xi), D_i(\xi))$, $i = 1, 2, \ldots$ define split sums of $\Phi(-\xi, \xi)$. Consider the set of $J_{worst}$-dissipative behaviors $\mathcal{L}_{J_{worst}}$, and consider the behaviors
\[ \mathcal{B}_i := \operatorname{adj} N_i(\tfrac{d}{dt})(\mathcal{B}_{J_{worst}}), \qquad \mathcal{B}_{J_{worst}} \in \mathcal{L}_{J_{worst}} \]
Then the behaviors $\mathcal{B}_i$ are $\Phi$-dissipative.

Proof: Define the QDFs $Q_{\Theta_i}$, where $\Theta_i(\zeta, \eta) = N_i^T(\zeta)J_{worst}N_i(\eta)$. The map $\operatorname{adj} N_i(\tfrac{d}{dt})$ acts on any $J_{worst}$-dissipative behavior to yield a $\Theta_i$-dissipative behavior, and every $\Theta_i$-dissipative behavior is also $\Phi$-dissipative.

The process of parametrizing $\Phi$-dissipative behaviors using split sums is shown in Figure 3.2.

[Figure 3.2: Parametrization using split sums yields a proper subset of $\mathcal{L}_\Phi$. Different maps $\operatorname{adj} N_1(\tfrac{d}{dt}), \operatorname{adj} N_2(\tfrac{d}{dt}), \ldots$, obtained from different split sums of $\Phi(-\xi, \xi)$, carry the base set $\mathcal{L}_{J_{worst}}$ into $\mathcal{L}_\Phi$.]

We demonstrate the parametrization of dissipative behaviors using split sums with an example:

Example 3.6.10 Let $\Phi(\zeta, \eta) = \begin{bmatrix} 1 & 0 \\ 0 & 1 - \zeta\eta \end{bmatrix}$. Then $Q_\Phi(w)$ with $w = [w_1\ w_2]^T$ is $w_1^2 + w_2^2 - (\tfrac{d}{dt}w_2)^2$. Note that $\Phi(-i\omega, i\omega) = \begin{bmatrix} 1 & 0 \\ 0 & 1 - \omega^2 \end{bmatrix}$. Thus $\Phi(-i\omega, i\omega)$ has an inertia that varies with $\omega$: for $|\omega| \in [0, 1)$, $\Phi(-i\omega, i\omega)$ is positive definite, while for $|\omega| \in (1, \infty)$ it has one positive and one negative eigenvalue. It is easy to see that:
1. The maximum number of positive eigenvalues of $\Phi(-i\omega, i\omega)$ is 2, attained for $|\omega| \in [0, 1)$.
2. The maximum number of negative eigenvalues of $\Phi(-i\omega, i\omega)$, $n$, is 1, attained for $|\omega| \in (1, \infty)$.
3. The "worst inertia" of $\Phi(-i\omega, i\omega)$ is thus $(1, 1, 0)$.
4. The "worst inertia" matrix $J_{worst}$ of $\Phi(-i\omega, i\omega)$ is thus
\[ J_{worst} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \]
Notice that $\Phi(-\xi, \xi)$ can be written as
\[ \Phi(-\xi, \xi) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -\xi \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & \xi \end{bmatrix} \]
It is not difficult to see, using Theorem 3.6.2, that the above factorization is minimal. Define $N(\xi) = \begin{bmatrix} 1 & 0 \\ 0 & \xi \end{bmatrix}$ and $D(\xi) = \begin{bmatrix} 0 & 1 \end{bmatrix}$. Then one can see that
\[ \Phi(-\xi, \xi) = N^T(-\xi)J_{worst}N(\xi) + D^T(-\xi)D(\xi) \]
The matrix $N(\xi)$ is nonsingular. Thus $(J_{worst}, N(\xi), D(\xi))$ defines a split sum of $\Phi(-\xi, \xi)$. Let $L(\xi)$ be the adjugate of $N(\xi)$, which is found to be $\begin{bmatrix} \xi & 0 \\ 0 & 1 \end{bmatrix}$. We claim that
\[ L(\tfrac{d}{dt}) : \mathcal{L}_{J_{worst}} \to \mathcal{L}_\Phi \]
i.e., $L(\tfrac{d}{dt})(\mathcal{B})$ is a $\Phi$-dissipative behavior if $\mathcal{B}$ is $J_{worst}$-dissipative. Let us verify this claim on a particular $J_{worst}$-dissipative behavior. Let $\mathcal{B}$ be defined as $\operatorname{Im} \begin{bmatrix} \tfrac{d}{dt} + 1 \\ 1 \end{bmatrix}$. Then $\mathcal{B}$ is $J_{worst}$-dissipative. Also, $L(\tfrac{d}{dt})(\mathcal{B})$ has an image representation defined by $\operatorname{Im} \begin{bmatrix} \tfrac{d}{dt}(\tfrac{d}{dt} + 1) \\ 1 \end{bmatrix}$. In order to check whether $L(\tfrac{d}{dt})(\mathcal{B})$ is $\Phi$-dissipative we compute
\[ \begin{bmatrix} -i\omega - \omega^2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 - \omega^2 \end{bmatrix} \begin{bmatrix} i\omega - \omega^2 \\ 1 \end{bmatrix} \]
which is found to be $1 + \omega^4$, clearly positive for all $\omega \in \mathbb{R}$. Hence $L(\tfrac{d}{dt})(\mathcal{B})$ is indeed $\Phi$-dissipative.
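The computations in Example 3.6.10 are easy to reproduce symbolically. Below is a minimal sympy sketch (our illustration, not part of the thesis) that checks the split-sum identity and the final dissipativity inequality:

```python
import sympy as sp

xi, omega = sp.symbols('xi omega', real=True)
Jworst = sp.diag(1, -1)
N = lambda x: sp.diag(1, x)
D = lambda x: sp.Matrix([[0, 1]])
Phi = lambda z, e: sp.diag(1, 1 - z*e)
# Split-sum identity of Example 3.6.10
lhs = N(-xi).T * Jworst * N(xi) + D(-xi).T * D(xi)
print(sp.simplify(lhs - Phi(-xi, xi)))        # zero matrix
# Phi-dissipativity of L(d/dt)(B), with image M(xi) = [xi*(xi+1); 1]
M = lambda x: sp.Matrix([x*(x + 1), 1])
expr = (M(-sp.I*omega).T * Phi(-sp.I*omega, sp.I*omega) * M(sp.I*omega))[0, 0]
print(sp.simplify(expr))                      # omega**4 + 1
```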
Note that the parametrization of $\mathcal{L}_\Phi$ using a split sum of $\Phi(-\xi, \xi)$ will in general yield only a proper subset of $\mathcal{L}_\Phi$. Different split sums define different maps $\operatorname{adj} N_i(\tfrac{d}{dt})$, $i = 1, 2, \ldots$, all of which can be used to parametrize subsets of $\mathcal{L}_\Phi$ from the same base set $\mathcal{L}_{J_{worst}}$. With reference to Example 3.6.10, we show that the set of behaviors parametrized using split sums is a proper subset of $\mathcal{L}_\Phi$ by exhibiting a $\Phi$-dissipative behavior which is not $\Theta$-dissipative.

Example 3.6.11 Let $\mathcal{B} = \operatorname{Im} \begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix}$ with $q(\xi) = \xi^2$ and $p(\xi) = 1 + \xi$. Then $\mathcal{B}$ is $\Phi$-dissipative:
\[ \begin{bmatrix} -\omega^2 & 1 - i\omega \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 - \omega^2 \end{bmatrix} \begin{bmatrix} -\omega^2 \\ 1 + i\omega \end{bmatrix} = 1 > 0 \]
However, $\mathcal{B}$ is not $\Theta$-dissipative: with $\Theta(\zeta, \eta) = N^T(\zeta)J_{worst}N(\eta)$ as in Example 3.6.10, $\Theta(-i\omega, i\omega) = \operatorname{diag}(1, -\omega^2)$, and
\[ \begin{bmatrix} -\omega^2 & 1 - i\omega \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -\omega^2 \end{bmatrix} \begin{bmatrix} -\omega^2 \\ 1 + i\omega \end{bmatrix} = \omega^4 - \omega^2(1 + \omega^2) = -\omega^2 \]
which is negative for all $\omega \neq 0$.

Thus we have parametrized a proper subset of $\mathcal{L}_\Phi$ using split sums. The problem of determining all of $\mathcal{L}_\Phi$ from $\mathcal{L}_{J_{worst}}$ using split sums is, however, still open.

3.7 Conclusion

In this chapter we have addressed the problem of parametrizing dissipative systems. We have constructed a differential operator that maps passive dynamical systems into dissipative dynamical systems. First, we examined SISO systems: under certain assumptions on the supply function, we obtained explicit formulae for the parametrization, in terms of $J$-dissipative dynamical systems. In the MIMO case we showed that one can parametrize all dissipative systems when the supply function satisfies the condition of constant inertia on the imaginary axis. When this assumption does not hold, we defined the notion of a split sum, which we then used to construct a proper subset of the set of all dissipative systems. As part of future work, it will be interesting to extend the idea of split sums to parametrize all dissipative behaviors, rather than a proper subset. It will also be interesting to investigate computational aspects of the minimal factorization proposed by Ran and Rodman [80], which has been used crucially in this chapter.

Chapter 4 KYP lemma and its extensions

4.1 Introduction

The Kalman-Yakubovich-Popov (KYP) lemma is one of the key results in linear systems theory. Though originally formulated by Yakubovich [109] and Kalman [40], and later by Popov [78], subsequent research has added to it. The classical KYP lemma can be thought of as giving conditions for a transfer function matrix to be positive real. What makes the formulation attractive is that these conditions are in terms of state-space (real constant) matrices. This has enabled, in recent times, the use of fast computational tools such as Linear Matrix Inequalities (LMIs). The KYP lemma is also called the positive real lemma. We now give a statement of the lemma as in [2]:

Theorem 4.1.1 Consider the system
\[ \dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t) \tag{4.1} \]
with $x(t) \in \mathbb{R}^n$ and $y(t), u(t) \in \mathbb{R}^m$. Suppose that (i) no eigenvalues of $A$ lie in the open right half plane and all its purely imaginary eigenvalues are simple, (ii) $(A, B)$ is controllable, (iii) $(C, A)$ is observable, and (iv) the transfer function matrix $G(s) := C(sI - A)^{-1}B + D$ satisfies
\[ G(i\omega) + G^*(i\omega) \geq 0 \quad \text{for almost all } \omega \in \mathbb{R} \]
Then, under these conditions, there exist matrices $K = K^T \in \mathbb{R}^{n \times n}$ with $K > 0$, $Q \in \mathbb{R}^{m \times n}$ and $W = W^T \in \mathbb{R}^{m \times m}$ such that
\[ A^TK + KA = -Q^TQ, \qquad B^TK + W^TQ = C, \qquad W^TW = D + D^T \tag{4.2} \]
Note that the KYP lemma can only be formulated for systems having an equal number of inputs and outputs.
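Equations (4.2) are easy to verify on a concrete system. The following scalar numpy sketch is our own illustration (the example $G(s) = (s+2)/(s+1)$, which is positive real, and the candidate $K = 3 - 2\sqrt{2}$ are assumptions chosen for demonstration; they are not taken from the thesis):

```python
import numpy as np

# Scalar illustration: G(s) = (s+2)/(s+1) = 1/(s+1) + 1 is positive real;
# A is Hurwitz, (A,B) controllable, (C,A) observable.
A, B, C, D = -1.0, 1.0, 1.0, 1.0
W = np.sqrt(D + D)             # W^T W = D + D^T
K = 3.0 - 2.0*np.sqrt(2.0)     # candidate storage matrix, K > 0
Q = np.sqrt(2.0*K)             # so that A^T K + K A = -Q^T Q reads -2K = -Q^2
print(np.isclose(A*K + K*A, -Q*Q))   # True
print(np.isclose(B*K + W*Q, C))      # True: B^T K + W^T Q = C
print(K > 0)                         # True
```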
In this chapter, we address some system-theoretic (rather than computational) questions that arise in a natural manner from the KYP lemma. Many important results can be proved in a simple manner by invoking the lemma; in particular, it provides a simple proof of the well-known passivity theorem in nonlinear systems theory (see for instance [107]). It is therefore important to investigate what lies beneath the conditions and equations that make the lemma so powerful. The scalar versions of the results presented in this chapter have been published [62, 63, 64]. A behavioral formulation of the KYP lemma has already been given in [106], where a similar lemma is formulated for systems described by high-order differential equations. This formulation, like the classical formulation, leads to an LMI.

Another well-known interpretation of the KYP lemma is in terms of certain functionals associated with a dynamical system, called "storage functions". In Chapter 3 we parametrized LTI systems that are dissipative with respect to a supply function defined by a QDF. In this chapter we investigate LTI systems that, in addition to being dissipative with respect to a supply function defined by a QDF, also have positive definite storage functions. Storage functions have a deep connection with Lyapunov theory, and hence they can be used to investigate the stability of equilibria of dynamical systems.

Attempts to generalize the KYP lemma were motivated by the absolute stability problem, in particular by the construction of Lyapunov functions for "sector bound" nonlinearities. Thus the so-called Meyer-Kalman-Yakubovich (MKY) lemma [79, 87, 89] received attention as a means of obtaining explicit state-space inequalities for systems interconnected with sector-bound nonlinearities. In this chapter we show that the KYP lemma can be generalized to a great extent, in a representation-free manner, with the help of behavioral systems theory and QDFs. If representations are invoked, the conditions that we obtain are in terms of frequency-domain (rational function) inequalities rather than state-space inequalities.

This chapter is organized as follows: in Section 4.2 we review introductory material on storage functions, and highlight the similarities and differences between storage functions on the manifest variables and storage functions on the states of a behavior. In Section 4.3 we build a connection between the KYP lemma and storage functions; we subsequently use this connection to generalize the KYP lemma in Section 4.4. There is a certain "strict version" of the KYP lemma available in the literature; we generalize the strict version in Section 4.5.

4.2 Storage functions for dissipative systems

In Chapter 3 we obtained results about the parametrization of $\Phi$-dissipative behaviors. Closely associated with a $\Phi$-dissipative behavior $\mathcal{B}$ are generalizations of the concepts of "stored energy" and "dissipated power". A QDF $Q_\Delta$ with $\Delta \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ is called a "dissipation function" associated to a $\Phi$-dissipative behavior $\mathcal{B}$ if $Q_\Delta(w) \geq 0$ and
\[ \int_{-\infty}^{\infty} Q_\Phi(w)\,dt = \int_{-\infty}^{\infty} Q_\Delta(w)\,dt \quad \forall w \in \mathcal{D}(\mathbb{R}, \mathbb{R}^w) \cap \mathcal{B} \tag{4.3} \]
Here $Q_\Delta(w) \geq 0$ means that the QDF $Q_\Delta$ is point-wise non-negative on trajectories $w(t)$. A QDF $Q_\Psi$ is said to be a "storage function" associated to a $\Phi$-dissipative behavior $\mathcal{B}$ if $\tfrac{d}{dt}Q_\Psi(w) \leq Q_\Phi(w)$ for all $w \in \mathcal{B}$. Moreover, given a supply function $Q_\Phi$, there is a one-to-one relation between the storage and dissipation functions, given by [103]:
\[ \frac{d}{dt}Q_\Psi(w) = Q_\Phi(w) - Q_\Delta(w) \quad \forall w \in \mathcal{B} \]
The above equation is also known as the dissipation equality. A storage function $Q_\Psi$ of $\mathcal{B}$ with respect to $Q_\Phi$ is called positive semidefinite on the manifest variables of $\mathcal{B}$ if for all $w \in \mathcal{B}$, $Q_\Psi(w)(t) \geq 0$ for all $t$. $Q_\Psi$ is called positive definite on the manifest variables of $\mathcal{B}$ if it is positive semidefinite on the manifest variables of $\mathcal{B}$ and, in addition, $Q_\Psi(w) = 0 \iff w$ is the zero trajectory in $\mathcal{B}$.

Associated with a behavior $\mathcal{B}$ are certain special latent variables called states. We reviewed the properties of state variables and state representations of $\mathcal{B}$ in Section 2.8. Recall from Proposition 2.8.4 that the McMillan degree of a behavior $\mathcal{B}$, $n(\mathcal{B})$, is an invariant of $\mathcal{B}$. We now define what we mean by storage functions that are functions of the state.
Let $\Phi \in \mathbb{R}^{w \times w}$ be a constant matrix, and consider a $\Phi$-dissipative behavior $\mathcal{B}$. Let $x$ denote the states corresponding to a minimal state representation $\mathcal{B}_{full}$ of $\mathcal{B}$. If $\mathcal{B}$ is $\Phi$-dissipative, $\mathcal{B}_{full}$ is dissipative with respect to the supply function defined by $\operatorname{diag}[\Phi, 0_{n(\mathcal{B})}]$. Hence we can define dissipativity of $\mathcal{B}_{full}$ analogously to dissipativity of $\mathcal{B}$. Let $\mathcal{B}_a^-$ denote the subset of $\mathcal{B}$ consisting of trajectories that have state $0$ at $t = -\infty$ and state $a$ at time $t = t_0$. Then $\int_{-\infty}^{t_0} Q_\Phi(w(\tau))\,d\tau$ with $w \in \mathcal{B}_a^-$ denotes the total generalized energy supplied in reaching state $a$ from state $0$ along $w$. We consider those trajectories in $\mathcal{B}_a^-$ for which this energy supply is the least. This is known as the "minimum required supply", since it is the least amount of energy that must be supplied to reach state $a$. The minimum required supply is denoted by $Q_{\Psi^+}(a)$ and is defined as follows:
\[ Q_{\Psi^+}(a) = \inf_{w \in \mathcal{B}_a^-} \int_{-\infty}^{t_0} Q_\Phi(w(\tau))\,d\tau \tag{4.4} \]
We can evaluate $Q_{\Psi^+}(x)$ for every state $x(t_0) \in \mathbb{R}^{n(\mathcal{B})}$, $t_0 \in \mathbb{R}$. It is easily verified that $Q_{\Psi^+}$ is a storage function of $\mathcal{B}$ with respect to $Q_\Phi$, since
\[ \frac{d}{dt}Q_{\Psi^+}(x) \leq Q_\Phi(w) \quad \forall (w, x) \in \mathcal{B}_{full} \]
Now define the set $\mathcal{B}_a^+$ of trajectories $w \in \mathcal{B}$ which have state $a$ at time $t_0$ and state $0$ at $t = \infty$. The quantity $-\int_{t_0}^{\infty} Q_\Phi(w(\tau))\,d\tau$, with $w \in \mathcal{B}_a^+$, denotes the generalized energy extracted while reaching state $0$ from state $a$ along $w$. We can look at trajectories for which this extracted energy is maximal. This quantity is known as the "available storage", since it is the maximum energy that can be extracted from state $a$ while reaching state $0$. The available storage is denoted by $Q_{\Psi^-}(a)$ and is defined as follows:
\[ Q_{\Psi^-}(a) = \sup_{w \in \mathcal{B}_a^+} \left( -\int_{t_0}^{\infty} Q_\Phi(w(\tau))\,d\tau \right) \]
The available storage $Q_{\Psi^-}$, evaluated at every state $x(t) \in \mathbb{R}^{n(\mathcal{B})}$, can also be shown to be a storage function of $\mathcal{B}$ with respect to $Q_\Phi$. For a more detailed discussion of required supply and available storage, see [98]. In [98], Theorem 3, page 331, it is shown that the set of all possible storage functions is bounded and forms a convex set. Any other storage function $Q_\Psi$ satisfies the inequality
\[ Q_{\Psi^-}(x) \leq Q_\Psi(x) \leq Q_{\Psi^+}(x) \quad \forall x \text{ such that } (w, x) \in \mathcal{B}_{full} \tag{4.5} \]
Hence $Q_{\Psi^-}$ (available storage) and $Q_{\Psi^+}$ (minimum required supply) can be thought of as the "minimum" and "maximum" storage functions, respectively, of $\mathcal{B}$ with respect to $Q_\Phi$. Note that the representation of $Q_\Psi(x)$ depends on the choice of states $x$. A storage function on states can be rewritten as $x^TKx$ with $x \in \mathbb{R}^{n(\mathcal{B})}$, $K = K^T \in \mathbb{R}^{n(\mathcal{B}) \times n(\mathcal{B})}$. $Q_\Psi(x)$ is called a positive definite state function of $\mathcal{B}$ if $x \in \mathbb{R}^{n(\mathcal{B})}$ represents a minimal set of states of $\mathcal{B}$ and $Q_\Psi(x) > 0$ for all $x \neq 0$. If $Q_\Psi(x)$ is a positive definite state function of $\mathcal{B}$, it can be represented by a symmetric positive definite matrix. Using a state map, storage functions on states can be defined as functions of the manifest variables $w(t)$. Let $X(\tfrac{d}{dt})$ be a minimal state map for $\mathcal{B}$:
\[ x = X(\tfrac{d}{dt})w \]
Then $\Psi(\zeta, \eta) = X^T(\zeta)KX(\eta)$ is a storage function on the manifest variables of $\mathcal{B}$ if $K$ represents a storage function on the states of $\mathcal{B}$. Note that the discussion of required supply and available storage was under the tacit assumption that the supply function is defined by a constant matrix. For supply functions defined by polynomial matrices, Willems and Trentelman showed in [95] that storage functions for a behavior are state functions of an associated behavior.
However, if one considers storage functions on manifest variables rather than on states, one can still define "maximum" and "minimum" storage functions. It is shown in [103] that for any $\Phi$-dissipative behavior $\mathcal{B}$ there exist storage functions $Q_{\Psi^-}$ and $Q_{\Psi^+}$ on manifest variables such that every other storage function $Q_\Psi$ for $\mathcal{B}$ satisfies
\[ Q_{\Psi^-}(w) \leq Q_\Psi(w) \leq Q_{\Psi^+}(w) \quad \forall w \in \mathcal{B} \]
A procedure for computing these storage functions is given in [103]; we summarize it below.

1. Assume $\int_{-\infty}^{\infty} Q_{\Phi'}(\ell)\,dt \geq 0$ for all compactly supported $\ell$. Then $\Phi'(-i\omega, i\omega) \geq 0$ for all $\omega \in \mathbb{R}$ (Theorem 3.2.3).

2. Using a non-trivial result called polynomial spectral factorization ([17, 38] and Chapter 6 of this thesis), it can be shown that if $\Phi'(-i\omega, i\omega) \geq 0$ then $\Phi'(-i\omega, i\omega) = A^T(-i\omega)A(i\omega) = H^T(-i\omega)H(i\omega)$. Here $A(\xi)$ is a square matrix having all its singularities in the closed right half plane ("A" for anti-Hurwitz) and $H(\xi)$ is a square matrix having all its singularities in the closed left half plane ("H" for Hurwitz).

3. Since $\Phi'(-i\omega, i\omega) - A^T(-i\omega)A(i\omega) = 0$, define
\[ \Psi'_+(\zeta, \eta) = \frac{\Phi'(\zeta, \eta) - A^T(\zeta)A(\eta)}{\zeta + \eta} \]
Then $Q_{\Psi'_+}$ defines the "maximum" storage function.

4. Since $\Phi'(-i\omega, i\omega) - H^T(-i\omega)H(i\omega) = 0$, define
\[ \Psi'_-(\zeta, \eta) = \frac{\Phi'(\zeta, \eta) - H^T(\zeta)H(\eta)}{\zeta + \eta} \]
Then $Q_{\Psi'_-}$ defines the "minimum" storage function.

Notice that in this recipe we have considered positivity of $Q_{\Phi'}$ on all trajectories in the ambient space (for example $C^\infty$). In practice, this procedure yields storage functions on latent variables, which can then be converted into storage functions on manifest variables provided the latent variables are observable from the manifest variables. Let us consider an example which demonstrates the essential ideas:

Example 4.2.1 Consider the simple RC circuit which we examined in Chapter 2, Example 2.2.1. Assume for simplicity that $R_1 = R_2 = C = 1$. Then the behavior $\mathcal{B}$ of the port voltage $V$ and the port current $I$ is given by
\[ \begin{bmatrix} V \\ I \end{bmatrix} = \begin{bmatrix} \tfrac{d}{dt} + 2 \\ \tfrac{d}{dt} + 1 \end{bmatrix} \ell \]
Define a supply function $Q_J(V, I)$, where $J = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix}$, and define $J'(\zeta, \eta)$ as
\[ J'(\zeta, \eta) = \begin{bmatrix} \zeta + 2 & \zeta + 1 \end{bmatrix} \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix} \begin{bmatrix} \eta + 2 \\ \eta + 1 \end{bmatrix} = \frac{1}{2}\left(2\zeta\eta + 3(\zeta + \eta) + 4\right) \]
Notice that $J'(-i\omega, i\omega) = \omega^2 + 2 > 0$ for all $\omega \in \mathbb{R}$. Therefore $\mathcal{B}$ is $J$-dissipative. We compute the Hurwitz and anti-Hurwitz spectral factorizations of $J'(-\xi, \xi)$:
\[ J'(-i\omega, i\omega) = (\sqrt{2} + i\omega)(\sqrt{2} - i\omega) \]
Define $H(\xi) = \xi + \sqrt{2}$ and $A(\xi) = \xi - \sqrt{2}$, and define
\[ \Psi'_+(\zeta, \eta) = \frac{J'(\zeta, \eta) - A(\zeta)A(\eta)}{\zeta + \eta} = \frac{3}{2} + \sqrt{2}, \qquad \Psi'_-(\zeta, \eta) = \frac{J'(\zeta, \eta) - H(\zeta)H(\eta)}{\zeta + \eta} = \frac{3}{2} - \sqrt{2} \]
Then $Q_{\Psi'_+}(\ell) = (\tfrac{3}{2} + \sqrt{2})\ell^2$ and $Q_{\Psi'_-}(\ell) = (\tfrac{3}{2} - \sqrt{2})\ell^2$. Clearly $Q_{\Psi'_+}(\ell) > Q_{\Psi'_-}(\ell)$, as expected. Using the observability of $\ell$ from $(V, I)$, one can substitute
\[ \ell = \begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} V \\ I \end{bmatrix} \]
to write $Q_{\Psi'_+}, Q_{\Psi'_-}$ in terms of $(V, I)$. Define
\[ \Psi_+ = \left(\tfrac{3}{2} + \sqrt{2}\right)\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}; \qquad \Psi_- = \left(\tfrac{3}{2} - \sqrt{2}\right)\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \]
$Q_{\Psi_+}$ and $Q_{\Psi_-}$ are storage functions on $(V, I)$.
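A short sympy sketch (our own illustration, not part of the thesis) reproduces the storage-function computation of Example 4.2.1 directly from the stated $M(\xi)$ and $J$:

```python
import sympy as sp

zeta, eta, omega = sp.symbols('zeta eta omega')
M = lambda x: sp.Matrix([x + 2, x + 1])          # [V; I] = M(d/dt) ell
J = sp.Matrix([[0, sp.Rational(1, 2)], [sp.Rational(1, 2), 0]])
Jp = sp.expand((M(zeta).T * J * M(eta))[0, 0])
print(Jp)                             # zeta*eta + 3*zeta/2 + 3*eta/2 + 2
H = lambda x: x + sp.sqrt(2)          # Hurwitz spectral factor
A = lambda x: x - sp.sqrt(2)          # anti-Hurwitz spectral factor
# Check the factorization on the imaginary axis
print(sp.simplify(Jp.subs({zeta: -sp.I*omega, eta: sp.I*omega})
                  - H(-sp.I*omega)*H(sp.I*omega)))        # 0
# Maximum and minimum storage functions (on the latent variable)
print(sp.cancel((Jp - A(zeta)*A(eta)) / (zeta + eta)))    # 3/2 + sqrt(2)
print(sp.cancel((Jp - H(zeta)*H(eta)) / (zeta + eta)))    # 3/2 - sqrt(2)
```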
4.3 Classical KYP lemma in terms of storage functions

The connection between the KYP lemma and storage functions is well known in the literature (see [90] for instance). Note that the $u, y$ satisfying equations (4.1) are $\mathbb{R}^m$-valued. We consider the $2m \times 2m$ symmetric matrix
\[ J = \frac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix} \]
Then $Q_J(u, y) = u^Ty$. Note that since $G(i\omega) + G^*(i\omega) \geq 0$ in Theorem 4.1.1, the $u$ and $y$ satisfying equations (4.1) define a behavior $\mathcal{B}$ that is $J$-dissipative. Suppose a state-space representation of $\mathcal{B}$ satisfies the conditions that $(A, B)$ is controllable and $(C, A)$ is observable. Then $(A, B, C, D)$ is a minimal state representation of $\mathcal{B}$, and the states $x$ are observable from the manifest variables $(u, y)$. Then:

Corollary 4.3.1 The $n \times n$ symmetric matrix $(1/2)K$ in equations (4.2) defines a storage function which is a positive definite state function for the system (4.1), since $K > 0$ and
\[ \frac{1}{2}\frac{d}{dt}(x^TKx) \leq u^Ty \]
for all $x, y, u$ such that equations (4.1) are satisfied.

Proof: Using the equations in the KYP lemma, it follows immediately that
\[ \frac{d}{dt}(x^TKx) = -(Qx)^T(Qx) - (Wu)^T(Qx) - (Qx)^T(Wu) + u^TCx + x^TC^Tu \tag{4.6} \]
and from the state-space equations it follows that
\[ 2u^Ty = u^Ty + y^Tu = u^TCx + x^TC^Tu + (Wu)^T(Wu) \]
Therefore
\[ \frac{d}{dt}(x^TKx) - 2u^Ty = -(Qx + Wu)^T(Qx + Wu) \]
Since $(Qx + Wu)^T(Qx + Wu) \geq 0$, it follows that
\[ \frac{d}{dt}\left(x^T(1/2)Kx\right) - u^Ty \leq 0 \]
Hence $(1/2)x^TKx$ is a storage function.

Note that $G(s)$ in Theorem 4.1.1 is a positive real (PR) matrix; see [3] for a detailed treatment and applications of positive real matrices. Thus the classical KYP lemma can be interpreted as follows: every storage function of the $J$-dissipative behavior $\mathcal{B}$ is a positive definite state function if and only if $G(s)$ is positive real.
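The identity at the heart of this proof is easy to verify symbolically. The following scalar sympy sketch (our illustration, under the assumption that all quantities are scalars) eliminates $C$, $D$ and $A$ using equations (4.2) and confirms that the defect $\frac{d}{dt}(x^TKx) - 2u^Ty$ is the perfect square $-(Qx + Wu)^2$:

```python
import sympy as sp

A, B, K, Q, W, x, u = sp.symbols('A B K Q W x u', real=True)
C = B*K + W*Q          # from B^T K + W^T Q = C (scalar case)
D = W**2 / 2           # from W^T W = D + D^T
xdot = A*x + B*u
y = C*x + D*u
defect = 2*x*K*xdot - 2*u*y           # d/dt(x K x) - 2 u y along (4.1)
defect = defect.subs(A, -Q**2/(2*K))  # from A^T K + K A = -Q^T Q
print(sp.simplify(defect + (Q*x + W*u)**2))   # 0
```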
Remark 4.3.2 A variant of Theorem 4.1.1 is available in the literature that relates strictly positive real (SPR) matrices with storage functions; see for instance [107], page 223: $G(s)$ is SPR if and only if, for any minimal state representation $(A, B, C, D)$ of $G(s)$, $A$ is a Hurwitz matrix and there exist matrices $K$, $Q$ and $W$ of appropriate dimensions, and an $\epsilon > 0$, such that $A^TK + KA = -\epsilon K - Q^TQ$, $B^TK + W^TQ = C$, and $W^TW = D + D^T$. A result similar to Corollary 4.3.1 can also be proved for this variant. Note, however, that Corollary 4.3.1 allows certain non-zero state trajectories along which the rate of change of storage exactly equals the supply. The $\epsilon$ in the above-mentioned variant of the KYP lemma ensures that the rate of change of storage along every nonzero state trajectory is strictly less than the supply. This distinction between the PR and SPR versions of the lemma is of crucial importance, especially in stability analysis.

4.4 Generalization of KYP lemma

The behavioral formulation of the classical KYP lemma lends itself to generalizations in several directions. The classical KYP lemma considers passive dynamical systems. These dynamical systems are dissipative with respect to a quadratic functional of the form $u^Ty$, where $u(t)$ and $y(t)$ are obtained from certain permutations of the manifest variables. Note that such a supply function forces $u(t)$ and $y(t)$ to have the same dimension. While this is perfectly logical for passive systems, it need not always be the case. Therefore we work with a more "general" supply function than $Q_J$. Consider
\[ J_{mn} = \begin{bmatrix} I_m & 0 \\ 0 & -I_n \end{bmatrix} \]
and the associated QDF $Q_{J_{mn}}$. Clearly, behaviors dissipative with respect to the supply function defined by $Q_{J_{mn}}$ need not have an equal number of inputs and outputs. Moreover, dissipativity and the existence of storage functions are fundamental to the system and do not depend on the cardinality of the system variables. The question of when $J_{mn}$-dissipative behaviors with input cardinality $m$ (the number of $+1$s in $J_{mn}$, which by Lemma 3.5.1 is the maximum possible input cardinality of a $J_{mn}$-dissipative behavior) have positive definite storage functions on states has been addressed in [103], Theorem 6.4, page 1726. We reproduce that result below:

Theorem 4.4.1 Let $J_{mn} = \operatorname{diag}[I_m, -I_n]$. Consider a $J_{mn}$-dissipative behavior $\mathcal{B}$ defined by an observable image representation
\[ \operatorname{Im} \begin{bmatrix} R(\tfrac{d}{dt}) \\ S(\tfrac{d}{dt}) \end{bmatrix} \]
with $R(\xi) \in \mathbb{R}^{m \times m}[\xi]$, $S(\xi) \in \mathbb{R}^{n \times m}[\xi]$. Then the following are equivalent:
1. There exists a positive definite storage function on the states of $\mathcal{B}$.
2. $R(\xi)$ is a Hurwitz matrix, i.e., every singularity of $R(\xi)$ lies in the open left half complex plane. In particular, $\det R(\xi) \neq 0$.
3. Every storage function of $\mathcal{B}$ with respect to $Q_{J_{mn}}$ is positive definite on the states of $\mathcal{B}$.
4. Every storage function on the manifest variables of $\mathcal{B}$ is positive semidefinite.

Theorem 4.4.1 gives an elegant characterization of all dissipative behaviors that have positive definite storage functions on states. Thus a behavior $\mathcal{B}$ defined as in Theorem 4.4.1 by $\operatorname{Im} \begin{bmatrix} R(\tfrac{d}{dt}) \\ S(\tfrac{d}{dt}) \end{bmatrix}$ has positive definite storage functions on states with respect to $Q_{J_{mn}}$ if and only if $\|S(\xi)R^{-1}(\xi)\|_{H_\infty} \leq 1$, or in other words, if and only if the rational matrix $S(\xi)R^{-1}(\xi)$ is bounded real [3].

Consider a $J_{mn}$-dissipative behavior $\mathcal{B} := \{w \mid w = M(\tfrac{d}{dt})\ell\}$. Let $X(\tfrac{d}{dt})$ be a minimal state map for $\mathcal{B}$ defining states $x$. Assume $\mathcal{B}$ has a positive definite storage function on states defined by a matrix $K = K^T \in \mathbb{R}^{n(\mathcal{B}) \times n(\mathcal{B})}$. Then $K > 0$ and
\[ \frac{d}{dt}x^TKx \leq Q_{J_{mn}}(w) \]
for all $(w, x) \in \mathcal{B}_{full}$, the full behavior. Using the state map $X(\tfrac{d}{dt})$ and the observable image representation $M(\tfrac{d}{dt})$, a storage function on the states of $\mathcal{B}$ can be "converted" into a storage function on the manifest variables of $\mathcal{B}$. Define $\Psi(\zeta, \eta)$ as
\[ \Psi(\zeta, \eta) = X^T(\zeta)KX(\eta) \]
Then $Q_\Psi$ is a storage function on the manifest variables $w$ of $\mathcal{B}$. Clearly, since $K > 0$, $Q_\Psi(w) \geq 0$ for all $w \in \mathcal{B}$. We now ask the following question: when is a storage function positive definite on the manifest variables of a $J_{mn}$-dissipative behavior? The answer is rather intuitive. If some part of the behavior is "memoryless", i.e., its evolution is not governed by any past history, there cannot be any storage along such trajectories. See Definition 2.8.7 for the memoryless part of a behavior. The memoryless part of a behavior plays an important role in obtaining storage functions on manifest variables, as the following proposition shows:

Proposition 4.4.2 Let $\mathcal{B}$ be defined by an observable image representation $\operatorname{Im} \begin{bmatrix} R(\tfrac{d}{dt}) \\ S(\tfrac{d}{dt}) \end{bmatrix}$ with $R(\xi) \in \mathbb{R}^{m \times m}[\xi]$, $S(\xi) \in \mathbb{R}^{n \times m}[\xi]$ and $m + n := w$. Let $J_{mn} = \operatorname{diag}[I_m, -I_n]$. Then $\mathcal{B}$ is $J_{mn}$-dissipative and every storage function on the manifest variables of $\mathcal{B}$ is positive definite if and only if the following conditions hold:
1. $\|S(\xi)R^{-1}(\xi)\|_{H_\infty} \leq 1$, i.e., $R(\xi)$ is a Hurwitz matrix and $R^{-T}(-i\omega)S^T(-i\omega)S(i\omega)R^{-1}(i\omega) \leq I_m$ for all $\omega \in \mathbb{R}$.
2. $\mathcal{B}$ has no non-trivial memoryless part, i.e., if $X(\tfrac{d}{dt})$ is a minimal state map for $\mathcal{B}$ then $\ker X(\tfrac{d}{dt}) \cap \mathcal{B} = \{0\}$.

Proof: From Theorem 4.4.1 it follows that every storage function of $\mathcal{B}$ is positive definite on states, and positive semidefinite on manifest variables, if and only if condition (1) holds. If $K = K^T \in \mathbb{R}^{n(\mathcal{B}) \times n(\mathcal{B})}$ defines a storage function on states, then $Q_\Psi$ defined by $\Psi(\zeta, \eta) = X^T(\zeta)KX(\eta)$ is a storage function on manifest variables, and $Q_\Psi$ is positive definite iff, in addition, $\ker X(\tfrac{d}{dt}) \cap \mathcal{B} = \{0\}$.

In keeping with the behavioral philosophy, we try, as far as possible, not to invoke a state representation. This has the advantage that all our results are representation-free. Hence we work exclusively with manifest variables, which is more natural from the behavioral viewpoint.

The role of the QDF defining the supply function needs special mention. The supply function contributes dynamics to the problem, in addition to the dynamics of the behavior.
As a result, a storage function could depend on fewer states than the McMillan degree (the phenomenon of "degree drop") and yet be positive definite on manifest variables. However, such phenomena occur only when the supply function $Q_\Phi$ is defined by a polynomial matrix $\Phi(\zeta, \eta)$; see Example 4.6.9 for a demonstration of this phenomenon. In the remaining sections of the chapter we address the problem of positive definite storage functions for behaviors dissipative with respect to a QDF induced by a polynomial matrix $\Phi(\zeta, \eta)$.

4.4.1 Generalization with respect to QDFs

Consider a matrix $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$. The matrix obtained by substituting $\zeta = -i\omega_0$, $\eta = i\omega_0$, i.e., $\Phi(-i\omega_0, i\omega_0)$, is a $w \times w$ Hermitian matrix for any $\omega_0 \in \mathbb{R}$; therefore it has $w$ real eigenvalues. It was shown in Chapter 3 that when $\Phi(-i\omega, i\omega)$ is nonsingular and has constant inertia for almost all $\omega \in \mathbb{R}$, one can parametrize the set of $\Phi$-dissipative behaviors. In this chapter we go a step further: we parametrize $\Phi$-dissipative behaviors with positive definite storage functions on manifest variables. Let us now consider matrices $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ such that
\[ \Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta) \tag{4.7} \]
with $K(\xi)$ and $J_{mn}$ square and nonsingular. Clearly, if $\Phi(\zeta, \eta)$ can be written in the form (4.7) then $\Phi(-i\omega, i\omega)$ has constant inertia for almost all $\omega \in \mathbb{R}$. Note, however, that the converse does not necessarily hold: a two-variable polynomial matrix $\Phi(\zeta, \eta)$ such that $\Phi(-i\omega, i\omega)$ has constant inertia for almost all $\omega \in \mathbb{R}$ need not admit a factorization as in (4.7). We now obtain necessary and sufficient conditions for a given $\Phi(\zeta, \eta)$ to be expressible in the form (4.7). From equation (1.4), every matrix $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ can be written as
\[ \Phi(\zeta, \eta) = \begin{bmatrix} I & I\zeta & \ldots & I\zeta^k \end{bmatrix} \begin{bmatrix} \Phi_{00} & \Phi_{01} & \ldots & \Phi_{0k} \\ \Phi_{10} & & & \vdots \\ \vdots & & & \vdots \\ \Phi_{k0} & \ldots & \ldots & \Phi_{kk} \end{bmatrix} \begin{bmatrix} I \\ I\eta \\ \vdots \\ I\eta^k \end{bmatrix} \tag{4.8} \]
where the $\Phi_{ij} = \Phi_{ji}^T$ denote constant $w \times w$ matrices, $I$ denotes the $w \times w$ identity matrix, and $k$ denotes the maximum degree of $\zeta$ (and hence also of $\eta$) that occurs in $\Phi(\zeta, \eta)$. As in equation (1.3), we denote the $w(k+1) \times w(k+1)$ symmetric matrix $[\Phi_{ij}]_{i,j=0}^{k}$ by $\tilde\Phi$ and call it the coefficient matrix of $\Phi(\zeta, \eta)$. Note that $\tilde\Phi$ has all real eigenvalues. Recall that the inertia of $\tilde\Phi$, $\sigma(\tilde\Phi)$, was defined in Definition 1.2.2 as the non-negative integer three-tuple $(\sigma_+, \sigma_-, \sigma_0)$, with $\sigma_+, \sigma_-, \sigma_0$ denoting, respectively, the numbers of positive, negative and zero eigenvalues of $\tilde\Phi$. Then:

Theorem 4.4.3 Consider a nonsingular polynomial matrix $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ (i.e., $\det \Phi(\zeta, \eta) \neq 0 \in \mathbb{R}[\zeta, \eta]$), and let $\tilde\Phi \in \mathbb{R}^{w(k+1) \times w(k+1)}$ be its coefficient matrix. $\Phi(\zeta, \eta)$ admits a factorization of the form $K^T(\zeta)J_{mn}K(\eta)$, with $J_{mn} = \operatorname{diag}[I_m, -I_n]$, $w = m + n$, and $K(\xi) \in \mathbb{R}^{w \times w}[\xi]$ nonsingular, if and only if $\sigma(\tilde\Phi) = (m, n, w(k+1) - m - n)$.

Proof: If $\sigma(\tilde\Phi) = (m, n, w(k+1) - m - n)$, then $\tilde\Phi$ can be factorized as $\tilde\Phi = R^TJ_{mn}R$ with $R \in \mathbb{R}^{w \times w(k+1)}$ of full row rank. Define $K(\xi)$ as follows:
\[ K(\xi) = R\begin{bmatrix} I \\ I\xi \\ \vdots \\ I\xi^k \end{bmatrix} \tag{4.9} \]
Then clearly $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$. Since $(-1)^n \det K^T(\zeta)\det K(\eta) = \det \Phi(\zeta, \eta) \neq 0$, it follows that $\det K(\xi) \neq 0$.

Conversely, suppose $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$ with $K(\xi)$ nonsingular. Then $K(\xi)$ can be rewritten as $K(\xi) = \sum_{i=0}^{k} K_i\xi^i$, and we define
\[ R = \begin{bmatrix} K_0 & K_1 & \ldots & K_k \end{bmatrix} \tag{4.10} \]
Since $K(\xi)$ is a nonsingular polynomial matrix, $K(\lambda)$ is nonsingular for most $\lambda \in \mathbb{C}$; hence $R$ has full row rank. Therefore the rank of the matrix $R^TJ_{mn}R$ is precisely $w$. Also, this matrix clearly has $m$ positive and $n$ negative eigenvalues. Hence $\tilde\Phi \in \mathbb{R}^{w(k+1) \times w(k+1)}$, defined by $R^TJ_{mn}R$, has inertia $(m, n, w(k+1) - m - n)$.

We now demonstrate Theorem 4.4.3 with the help of an example:

Example 4.4.4 Let $\Phi_1(\zeta, \eta) = \begin{bmatrix} 1 & 1 + \eta \\ 1 + \zeta & 0 \end{bmatrix}$ and $\Phi_2(\zeta, \eta) = \begin{bmatrix} 1 & 1 - \zeta \\ 1 - \eta & 0 \end{bmatrix}$. Notice that $\Phi_1(-i\omega, i\omega) = \Phi_2(-i\omega, i\omega)$. We see that
\[ \tilde\Phi_1 = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}; \qquad \tilde\Phi_2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Notice that $\tilde\Phi_1$ has one positive and one negative eigenvalue ($2$ and $-1$ respectively), the remaining two eigenvalues being zero. Hence $\Phi_1(\zeta, \eta)$ can be factorized as
\[ \Phi_1(\zeta, \eta) = \underbrace{\begin{bmatrix} 1 & 0 \\ 1 + \zeta & 1 + \zeta \end{bmatrix}}_{K^T(\zeta)} \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}}_{J_{1\,1}} \underbrace{\begin{bmatrix} 1 & 1 + \eta \\ 0 & 1 + \eta \end{bmatrix}}_{K(\eta)} \]
Now consider $\tilde\Phi_2$. The nonzero eigenvalues of $\tilde\Phi_2$ are found to be approximately $-1.2469796$, $0.4450419$ and $1.8019377$. Since $\tilde\Phi_2$ has rank greater than $2$, it cannot be diagonalized as $R^TJ_{1\,1}R$, and consequently $\Phi_2(\zeta, \eta)$ cannot be written as $K^T(\zeta)J_{1\,1}K(\eta)$. We come to the same conclusion from Theorem 4.4.3.
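The eigenvalue computations of Example 4.4.4 can be checked numerically. The following numpy sketch (ours) simply encodes the two coefficient matrices, with blocks ordered as $[[\Phi_{00}, \Phi_{01}], [\Phi_{10}, \Phi_{11}]]$, and prints their spectra:

```python
import numpy as np

# Coefficient matrices (4.8) of Phi_1 and Phi_2 from Example 4.4.4
Pt1 = np.array([[1, 1, 0, 1],
                [1, 0, 0, 0],
                [0, 0, 0, 0],
                [1, 0, 0, 0]], dtype=float)
Pt2 = np.array([[1, 1, 0, 0],
                [1, 0, -1, 0],
                [0, -1, 0, 0],
                [0, 0, 0, 0]], dtype=float)
print(np.linalg.eigvalsh(Pt1))  # {-1, 0, 0, 2}: inertia (1,1,2), so (4.7) holds
print(np.linalg.eigvalsh(Pt2))  # approx {-1.247, 0, 0.445, 1.802}: inertia (2,1,1)
```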
If a two-variable polynomial matrix $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ can be written as $K^T(\zeta)J_{mn}K(\eta)$ with $K(\xi)$ and $J_{mn}$ square and nonsingular, then $K(\tfrac{d}{dt})$ maps any $\Phi$-dissipative behavior into a $J_{mn}$-dissipative behavior, as shown in Chapter 3. In this case there is, moreover, a simple relationship between the storage functions of a $\Phi$-dissipative behavior and those of the corresponding $J_{mn}$-dissipative behavior:

Proposition 4.4.5 Consider $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ with $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$. Let $\mathcal{B}$ be a $\Phi$-dissipative behavior and let $\mathcal{B}_{J_{mn}}$ be the corresponding $J_{mn}$-dissipative behavior defined as $K(\tfrac{d}{dt})(\mathcal{B}) := \{v(t) \mid v(t) = K(\tfrac{d}{dt})w(t),\ w(t) \in \mathcal{B}\}$. Let $Q_{\Psi_{J_{mn}}}$ be a storage function for $\mathcal{B}_{J_{mn}}$ with respect to $Q_{J_{mn}}$. Then $\Psi(\zeta, \eta) = K^T(\zeta)\Psi_{J_{mn}}(\zeta, \eta)K(\eta)$ defines the QDF $Q_\Psi$, which is a storage function for $\mathcal{B}$ with respect to $Q_\Phi$.

Proof: Since $Q_{\Psi_{J_{mn}}}$ is a storage function on the manifest variables $v$ of the $J_{mn}$-dissipative behavior $\mathcal{B}_{J_{mn}}$:
\[ \frac{d}{dt}Q_{\Psi_{J_{mn}}}(v) \leq Q_{J_{mn}}(v) \quad \forall v \in \mathcal{B}_{J_{mn}} \]
Substituting $v = K(\tfrac{d}{dt})w$, we see that
\[ \frac{d}{dt}Q_{\Psi_{J_{mn}}}(K(\tfrac{d}{dt})w) \leq Q_{J_{mn}}(K(\tfrac{d}{dt})w) \quad \forall w \in \mathcal{B} \]
Further, $Q_{\Psi_{J_{mn}}}(K(\tfrac{d}{dt})w) = Q_\Psi(w)$ with $\Psi(\zeta, \eta) = K^T(\zeta)\Psi_{J_{mn}}(\zeta, \eta)K(\eta)$, and $Q_{J_{mn}}(K(\tfrac{d}{dt})w) = Q_\Phi(w)$ since $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$. Hence
\[ \frac{d}{dt}Q_\Psi(w) \leq Q_\Phi(w) \quad \forall\ w \in \mathcal{B} \]
which shows that $Q_\Psi$ is a storage function for $\mathcal{B}$ with respect to $Q_\Phi$.

We now obtain a characterization of all $\Phi$-dissipative behaviors that have positive definite storage functions on manifest variables. Assume that $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ can be written as $K^T(\zeta)J_{mn}K(\eta)$ with $K(\xi)$ and $J_{mn}$ square and nonsingular, and $J_{mn} = \operatorname{diag}[I_m, -I_n]$. Let $\mathcal{B}$ be a $\Phi$-dissipative behavior with manifest variables $w$, defined by an observable image representation
\[ w = \begin{bmatrix} Q(\tfrac{d}{dt}) \\ P(\tfrac{d}{dt}) \end{bmatrix}\ell \]
with $Q(\xi) \in \mathbb{R}^{m \times m}[\xi]$ and $P(\xi) \in \mathbb{R}^{n \times m}[\xi]$. Define
\[ \begin{bmatrix} R(\xi) \\ S(\xi) \end{bmatrix} = K(\xi)\begin{bmatrix} Q(\xi) \\ P(\xi) \end{bmatrix} \tag{4.11} \]
with $R(\xi) \in \mathbb{R}^{m \times m}[\xi]$ and $S(\xi) \in \mathbb{R}^{n \times m}[\xi]$, and define $\mathcal{B}_{J_{mn}} := \operatorname{Im} \begin{bmatrix} R(\tfrac{d}{dt}) \\ S(\tfrac{d}{dt}) \end{bmatrix}$. Then, by Theorem 3.5.3, $\mathcal{B}_{J_{mn}}$ is $J_{mn}$-dissipative if and only if $\mathcal{B}$ is $\Phi$-dissipative. Using an observable image representation of $\mathcal{B}$ and an image representation of $\mathcal{B}_{J_{mn}}$, we now obtain a characterization of all $\Phi$-dissipative behaviors having positive definite storage functions on manifest variables:

Theorem 4.4.6 Consider $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ such that $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$ with $J_{mn} = \operatorname{diag}[I_m, -I_n]$, $w = m + n$, and $K(\xi) \in \mathbb{R}^{w \times w}[\xi]$ nonsingular. Let $\mathcal{B}$ be a $\Phi$-dissipative behavior defined by an observable image representation $w = \begin{bmatrix} Q(\tfrac{d}{dt}) \\ P(\tfrac{d}{dt}) \end{bmatrix}\ell$ with $Q(\xi) \in \mathbb{R}^{m \times m}[\xi]$.
Let $\mathcal{B}_{J_{mn}}$ be the corresponding $J_{mn}$-dissipative behavior defined by $v = \begin{bmatrix} R(\tfrac{d}{dt}) \\ S(\tfrac{d}{dt}) \end{bmatrix}\ell$, with $R(\xi), S(\xi)$ as given in equation (4.11). Then every storage function of $\mathcal{B}$ with respect to $Q_\Phi$ is positive definite on the manifest variables of $\mathcal{B}$ if and only if the following conditions hold:
1. The matrices $R(\xi), S(\xi)$ are right coprime, i.e., if $R(\xi) = R_1(\xi)U(\xi)$ and $S(\xi) = S_1(\xi)U(\xi)$ then $U(\xi)$ is unimodular.
2. The behavior $\mathcal{B}_{J_{mn}}$ has no non-trivial memoryless part.
3. The matrix $R(\xi)$ is Hurwitz, i.e., every singularity of $R(\xi)$ lies in the open left half complex plane.

Proof: Assume that $\mathcal{B}_{J_{mn}}$ has no memoryless part, $R(\xi)$ is Hurwitz, and $R(\xi), S(\xi)$ are right coprime. Under these conditions, $\mathcal{B}_{J_{mn}}$ is defined by an observable image representation $\begin{bmatrix} R(\tfrac{d}{dt}) \\ S(\tfrac{d}{dt}) \end{bmatrix}\ell$. Notice that, because $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$, every storage function of $\mathcal{B}$ with respect to $Q_\Phi$, defined on the latent variables $\ell$, is also a storage function for $\mathcal{B}_{J_{mn}}$ with respect to $Q_{J_{mn}}$, defined on the latent variables $\ell$. Every storage function on $\ell$ can be expressed in terms of the manifest variables of $\mathcal{B}_{J_{mn}}$, since by assumption $\mathcal{B}_{J_{mn}}$ is defined by an observable image representation. If $v \in \mathcal{B}_{J_{mn}}$ then there exists $w \in \mathcal{B}$ such that $v = K(\tfrac{d}{dt})w$. Substituting the manifest variables of $\mathcal{B}_{J_{mn}}$ by $K(\tfrac{d}{dt})w$, a storage function on the manifest variables of $\mathcal{B}_{J_{mn}}$ can be defined in terms of the manifest variables of $\mathcal{B}$. Hence every storage function $Q_\Psi$ of $\mathcal{B}$ with respect to $Q_\Phi$ can be written with $\Psi(\zeta, \eta) = K^T(\zeta)\Psi_{J_{mn}}(\zeta, \eta)K(\eta)$, where $Q_{\Psi_{J_{mn}}}$ is a storage function for $\mathcal{B}_{J_{mn}}$ with respect to $Q_{J_{mn}}$. Since $\mathcal{B}_{J_{mn}}$ is $J_{mn}$-dissipative, has no memoryless part, and $R(\xi)$ is Hurwitz, it follows from Theorem 4.4.1 and Proposition 4.4.2 that every storage function of $\mathcal{B}_{J_{mn}}$ is positive definite on the manifest variables of $\mathcal{B}_{J_{mn}}$. We now show that, because $R(\xi), S(\xi)$ are right coprime, every storage function on the manifest variables of $\mathcal{B}$ is also positive definite: notice that $Q_\Psi(w) = 0$ if and only if $w \in \ker K(\tfrac{d}{dt}) \cap \mathcal{B}$, and since by assumption $R(\xi), S(\xi)$ are right coprime, $\ker K(\tfrac{d}{dt}) \cap \mathcal{B} = \{0\}$. Hence every storage function on the manifest variables of $\mathcal{B}$ is positive definite.

Conversely, assume that every storage function on the manifest variables of $\mathcal{B}$ is positive definite. One arrives at easy contradictions if any one of the three conditions listed in the theorem fails:

1. Assume $R(\xi), S(\xi)$ are not right coprime. Then there exists an $\ell_0 \in C^\infty(\mathbb{R}, \mathbb{R}^m)$ which is not observable from $\mathcal{B}_{J_{mn}}$. One can easily construct a storage function for $\mathcal{B}_{J_{mn}}$ (expressed in terms of the latent variables $\ell$) which is zero along $\ell_0$. We can convert this storage function into a storage function on the manifest variables of $\mathcal{B}$, since $\mathcal{B}$ is given by an observable image representation. Note that, because of observability, the image of $\ell_0$ yields nonzero trajectories in $\mathcal{B}$. Hence there exists a storage function that is zero along the trajectories in $\mathcal{B}$ obtained as the image of $\ell_0$.
2. When $\mathcal{B}_{J_{mn}}$ has a non-trivial memoryless part, a storage function $Q_{\Psi_{J_{mn}}}$ for $\mathcal{B}_{J_{mn}}$ cannot be positive definite on the manifest variables of $\mathcal{B}_{J_{mn}}$, and consequently, because $K(\xi)$ is nonsingular, $Q_\Psi$ cannot be positive definite on the manifest variables of $\mathcal{B}$.
3. If $R(\xi)$ is not Hurwitz, a storage function $Q_{\Psi_{J_{mn}}}$ for $\mathcal{B}_{J_{mn}}$ is not positive definite on the manifest variables of $\mathcal{B}_{J_{mn}}$ (Proposition 4.4.2). If $Q_{\Psi_{J_{mn}}}$ is not positive definite then, because $K(\xi)$ is nonsingular, $Q_\Psi$ cannot be positive definite.
The essential ideas in the proof of Theorem 4.4.6 are illustrated in Figure 4.1.

[Figure 4.1: Mapping among storage functions. A storage function $\Psi_\ell$ on the latent variables serves both $\mathcal{B}$ and $\mathcal{B}_{J_{mn}}$; via the observable image representations and the map $K(\tfrac{d}{dt})$, it can be expressed as $K^T(\zeta)\Psi K(\eta)$ on the manifest variables of $\mathcal{B}$.]

Consider $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$ and a $\Phi$-dissipative behavior $\mathcal{B}$. The associated $J_{mn}$-dissipative behavior $\mathcal{B}_{J_{mn}}$ is defined as $K(\tfrac{d}{dt})(\mathcal{B})$. $Q_{\Psi_\ell}$ is a storage function (on latent variables) for $\mathcal{B}$ with respect to $Q_\Phi$, and also a storage function for $\mathcal{B}_{J_{mn}}$ with respect to $Q_{J_{mn}}$. Using the observability of the image representation of $\mathcal{B}$, $Q_{\Psi_\ell}$ can be expressed in terms of the manifest variables of $\mathcal{B}$. Using the observability of the image representation of $\mathcal{B}_{J_{mn}}$, $Q_{\Psi_\ell}$ can be expressed in terms of the manifest variables of $\mathcal{B}_{J_{mn}}$, which in turn can be expressed in terms of the manifest variables of $\mathcal{B}$ using the map $K(\tfrac{d}{dt})$.

Given a $\Phi$-dissipative behavior $\mathcal{B}$, there may in general exist trajectories in $\mathcal{B}$ along which the dissipation is zero. All such trajectories are $\Phi$-lossless, i.e., along these trajectories the rate of change of the storage function is exactly equal to the supply. Such a situation is undesirable in certain contexts, especially in the construction of Lyapunov functions. Hence we would like a characterization of behaviors such that along every nonzero trajectory the rate of change of storage is strictly less than the supply; in other words, behaviors in which no nonzero trajectory is lossless. We show that this "strict" problem corresponds to a certain strict version of the KYP lemma.

4.5 Strict versions of the KYP lemma

We have summarized the strict version of the KYP lemma in Remark 4.3.2. Recall that $J_{mn} = J_{mn}^T \in \mathbb{R}^{w \times w}$ has been defined as the inertia matrix $\operatorname{diag}[I_m, -I_n]$. We address the "strict" version of the KYP lemma by first defining the matrix
\[ J_{mn}^\epsilon = J_{mn} - \epsilon I_w, \qquad \epsilon \in (0, 1) \tag{4.12} \]
The supply function $Q_{J_{mn}^\epsilon}$ exhibits some interesting properties:

Lemma 4.5.1 Consider the matrix $J_{mn}^\epsilon$ in (4.12). Let $\mathcal{B}$ be a $J_{mn}^\epsilon$-dissipative behavior with manifest variables $w$. Then:
1. $\mathcal{B}$ is $J_{mn}$-dissipative.
2. If $Q_\Psi$ is any storage function for $\mathcal{B}$ with respect to $Q_{J_{mn}^\epsilon}$, then $\tfrac{d}{dt}Q_\Psi(w) < Q_{J_{mn}}(w)$ for all $w \in \mathcal{B} \setminus \{0\}$.

Proof: Note that $Q_{J_{mn}^\epsilon}(w) = Q_{J_{mn}}(w) - \epsilon w^Tw$. Therefore $Q_{J_{mn}^\epsilon}(w) < Q_{J_{mn}}(w)$ for all nonzero $w \in C^\infty(\mathbb{R}, \mathbb{R}^w)$, and in particular along all nonzero $w \in \mathcal{B}$. If $Q_\Psi$ is any storage function for $\mathcal{B}$ with respect to $Q_{J_{mn}^\epsilon}$, then
\[ \frac{d}{dt}Q_\Psi(w) \leq Q_{J_{mn}^\epsilon}(w) < Q_{J_{mn}}(w) \quad \forall w \in \mathcal{B} \setminus \{0\} \]
Hence $\mathcal{B}$ is $J_{mn}$-dissipative and $\tfrac{d}{dt}Q_\Psi(w) < Q_{J_{mn}}(w)$ along all nonzero $w \in \mathcal{B}$.

Analogous to the generalization of the KYP lemma in Theorem 4.4.6, we now propose a generalization of the strict version of the KYP lemma:

Theorem 4.5.2 Consider $\Phi(\zeta, \eta) \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$ such that $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$ with $J_{mn} = \operatorname{diag}[I_m, -I_n]$, $w = m + n$, and $K(\xi) \in \mathbb{R}^{w \times w}[\xi]$ nonsingular. Let $\mathcal{B}$ be a $\Phi$-dissipative behavior defined by an observable image representation $w = \begin{bmatrix} Q(\tfrac{d}{dt}) \\ P(\tfrac{d}{dt}) \end{bmatrix}\ell$ with $Q(\xi) \in \mathbb{R}^{m \times m}[\xi]$, and let $\mathcal{B}_{J_{mn}}$ be the corresponding $J_{mn}$-dissipative behavior defined by $v = \begin{bmatrix} R(\tfrac{d}{dt}) \\ S(\tfrac{d}{dt}) \end{bmatrix}\ell$ with $R(\xi), S(\xi)$ as given in equation (4.11). Then every storage function of $\mathcal{B}$ with respect to $Q_\Phi$ is positive definite on the manifest variables of $\mathcal{B}$, and the rate of change of the storage function is strictly less than $Q_\Phi$ along every nonzero manifest variable trajectory in $\mathcal{B}$, if the following conditions hold:
1. $\mathcal{B}_{J_{mn}}$ is $J_{mn}^\epsilon$-dissipative for some $\epsilon \in (0, 1)$.
2. The matrices $R(\xi), S(\xi)$ are right coprime, i.e., if $R(\xi) = R_1(\xi)U(\xi)$ and $S(\xi) = S_1(\xi)U(\xi)$ then $U(\xi)$ is unimodular.
3. The behavior $\mathcal{B}_{J_{mn}}$ has no non-trivial memoryless part.
4. The matrix $R(\xi)$ is Hurwitz, i.e., every singularity of $R(\xi)$ lies in the open left half complex plane.

Proof: Since the proof goes along lines similar to the proof of Theorem 4.4.6, we give only a brief sketch. If $\mathcal{B}$ is $\Phi$-dissipative and $\mathcal{B}_{J_{mn}}$ is $J_{mn}^\epsilon$-dissipative for some $\epsilon \in (0, 1)$, then the rate of change of every storage function on the manifest variables of $\mathcal{B}$ is strictly less than $Q_\Phi$. The remaining three conditions are the same as those in Theorem 4.4.6, and relate to the existence of positive definite storage functions on the manifest variables of $\mathcal{B}$.

Remark 4.5.3 Note that the converse of Theorem 4.5.2, unlike that of its milder counterpart Theorem 4.4.6, does not necessarily hold: a $\Phi$-dissipative behavior could be such that the rate of change of the storage function along trajectories in the behavior is strictly less than $Q_\Phi$, while the corresponding $J_{mn}$-dissipative behavior $\mathcal{B}_{J_{mn}}$ is not $J_{mn}^\epsilon$-dissipative. This is because a behavior that is $J_{mn}$-dissipative but not $J_{mn}^\epsilon$-dissipative may still have positive definite storage functions such that the rate of change of the storage function along every nonzero trajectory in the behavior is strictly less than $Q_{J_{mn}}$. Such a situation arises when a dissipation function corresponding to $Q_{J_{mn}}$, defined by $\|D(\tfrac{d}{dt})w\|^2$, $w \in \mathcal{B}$, is such that $D(\xi)$ is unimodular. The claim in this remark is illustrated with an example.

Example 4.5.4 The aim of this example is to show that $J_{mn}^\epsilon$-dissipativity for some $\epsilon \in (0, 1)$ is in general sufficient for the rate of change of the storage function to be strictly less than the supply function, but not necessary. Let $J_{1\,1} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. Consider the RC circuit of Example 2.2.1 with $R_1 = R_2 = C = 1$. Then the corresponding behavior of the port voltage $V$ and the port current $I$ is given by $\begin{bmatrix} V \\ I \end{bmatrix} = \begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix}\ell$ with $q(\xi) = \xi + 2$ and $p(\xi) = \xi + 1$. Let us check that $\mathcal{B}$ is $J_{1\,1}$-dissipative:
\[ J'_{1\,1}(-i\omega, i\omega) = \begin{bmatrix} q(-i\omega) & p(-i\omega) \end{bmatrix} J_{1\,1} \begin{bmatrix} q(i\omega) \\ p(i\omega) \end{bmatrix} = 3 > 0 \]
Let us check whether $\mathcal{B}$ is $J_{1\,1}^\epsilon$-dissipative for some $\epsilon \in (0, 1)$:
\[ \begin{bmatrix} q(-i\omega) & p(-i\omega) \end{bmatrix} J_{1\,1}^\epsilon \begin{bmatrix} q(i\omega) \\ p(i\omega) \end{bmatrix} = 3 - \epsilon(5 + 2\omega^2) \]
which is not non-negative for all $\omega$ for any $\epsilon \in (0, 1)$. Therefore $\mathcal{B}$ is not $J_{1\,1}^\epsilon$-dissipative for any $\epsilon \in (0, 1)$. Theorem 4.4.6 tells us that $\mathcal{B}$ has positive definite storage functions. In this case there is a unique storage function for $\mathcal{B}$ with respect to $Q_{J_{1\,1}}$, which can be computed (on the latent variables) to be $Q_\Psi(\ell) = \ell^2$, clearly a positive definite QDF. Since the corresponding dissipation function $Q_\Delta(\ell) = 3\ell^2$ is also positive definite, $\tfrac{d}{dt}Q_\Psi(\ell) = Q_{J'_{1\,1}}(\ell)$ if and only if $Q_\Delta(\ell) = 0$, which holds if and only if $\ell = 0$. Since $\tfrac{d}{dt}Q_\Psi(\ell) \leq Q_{J'_{1\,1}}(\ell)$ and the corresponding dissipation function is positive definite, we see that $\tfrac{d}{dt}Q_\Psi(\ell) < Q_{J'_{1\,1}}(\ell)$ along all nonzero $\ell$.
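The two frequency-domain checks in Example 4.5.4 are easily reproduced with sympy (a minimal sketch of ours, not part of the thesis):

```python
import sympy as sp

omega, eps = sp.symbols('omega epsilon', real=True)
iw = sp.I * omega
q = lambda x: x + 2
p = lambda x: x + 1
# J_11-dissipativity: |q(i w)|^2 - |p(i w)|^2
print(sp.simplify(q(-iw)*q(iw) - p(-iw)*p(iw)))    # 3
# epsilon-strict supply subtracts eps*(|q|^2 + |p|^2):
strict = q(-iw)*q(iw) - p(-iw)*p(iw) - eps*(q(-iw)*q(iw) + p(-iw)*p(iw))
print(sp.expand(strict))   # 3 - eps*(5 + 2*omega**2), negative for large omega
```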
Remark 4.5.5 It is easy to see that Theorems 4.4.6 and 4.5.2 can be used to parametrize the set of $\Phi$-dissipative behaviors with positive definite storage functions. If $\Phi(\zeta, \eta) = K^T(\zeta)J_{mn}K(\eta)$, define $L(\xi) = \operatorname{adj} K(\xi)$, i.e., $L(\xi)K(\xi) = \det K(\xi)I_w$. We have seen in Chapter 3 that the differential operator $L(\tfrac{d}{dt})$ is a map from the set of all $J_{mn}$-dissipative behaviors to the set of all $\Phi$-dissipative behaviors. Suppose a $J_{mn}$-dissipative behavior $\mathcal{B}_{J_{mn}}$ having input cardinality $m$ (the number of $+1$s in $J_{mn}$) is defined by an observable image representation $\operatorname{Im} M(\tfrac{d}{dt})$. Then $\mathcal{B}$, defined by the image representation $L(\xi)M(\xi)$ (converted into an observable image representation $M'(\xi)$), has positive definite storage functions if and only if $K(\xi)M'(\xi)$ defines an observable image representation, has no non-trivial memoryless part, and the corresponding rational function is bounded real.

We now, as in the previous chapter, concentrate on the important and interesting special case of SISO dissipative behaviors. We shall see that in this case the results are more explicit.

4.6 Special case: KYP lemma for SISO systems

As in the previous chapter, in the SISO case we prefer working with passive systems rather than $J_{mn}$-dissipative systems. This is, however, solely a matter of choice; working with passive systems helps us relate more closely to the KYP lemma as stated in Theorem 4.1.1. Recall that the matrix $J$ was defined in the previous chapter as
\[ J = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix} \tag{4.13} \]
Consider a behavior $\mathcal{B}$ defined by the observable image representation
\[ \begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix}\ell \tag{4.14} \]
Recall that $\mathcal{B}$ is $J$-dissipative if and only if the associated rational function of $\mathcal{B}$, which we have defined as $G(\xi) := p(\xi)/q(\xi)$, has its Nyquist plot entirely in the closed right half plane. We begin by defining the two-variable polynomial $J'(\zeta, \eta)$:
\[ J'(\zeta, \eta) = \begin{bmatrix} q(\zeta) & p(\zeta) \end{bmatrix} J \begin{bmatrix} q(\eta) \\ p(\eta) \end{bmatrix} \tag{4.15} \]
$J$-dissipativity of $\mathcal{B}$ implies that $J'(-i\omega, i\omega) \geq 0$ for all $\omega \in \mathbb{R}$, or, written explicitly,
\[ p(-i\omega)q(i\omega) + p(i\omega)q(-i\omega) \geq 0 \quad \forall \omega \in \mathbb{R} \]
Notice that $J$ can be written in the following manner:
\[ J = \frac{1}{4}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{4.16} \]
Using the polynomials $p(\xi), q(\xi)$ that define the image representation of $\mathcal{B}$, we now define two new polynomials: $r(\xi) = q(\xi) + p(\xi)$ and $s(\xi) = q(\xi) - p(\xi)$. Then we have the following lemma:

Lemma 4.6.1 $r(\xi)$ and $s(\xi)$ are coprime if and only if $p(\xi)$ and $q(\xi)$ are coprime.

Lemma 4.6.1 implies that the image representation $\operatorname{Im}\begin{bmatrix} r(\tfrac{d}{dt}) \\ s(\tfrac{d}{dt}) \end{bmatrix}$ is observable if and only if the image representation $\operatorname{Im}\begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix}$ is observable. We now state the following result, which relates the roots of $r(\xi)$ to those of $q(\xi)$ and $p(\xi)$. This result is not new; however, we believe that the proof presented here is simpler and more elementary than those found elsewhere [105], using only elementary complex analysis and the celebrated Nyquist stability criterion.

Proposition 4.6.2 Consider polynomials $p(\xi)$ and $q(\xi)$ such that the behavior defined by the observable image representation $\operatorname{Im}\begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix}$ is $J$-dissipative. Then $r(\xi) := q(\xi) + p(\xi)$ is Hurwitz if and only if the following hold:
1. No roots of $q(\xi)$ and $p(\xi)$ lie in the open right half of the complex plane.
2. Every purely imaginary root of $q(\xi)$ and $p(\xi)$ is a simple root.

Proof: Define the rational function $G(\xi) = \frac{p(\xi)}{q(\xi)}$. Since, by $J$-dissipativity,
\[ J'(-i\omega, i\omega) = \tfrac{1}{2}\left[p(-i\omega)q(i\omega) + p(i\omega)q(-i\omega)\right] \geq 0 \quad \forall \omega \in \mathbb{R} \]
we have $G(i\omega) + G(-i\omega) \geq 0$ for almost all $\omega \in \mathbb{R}$, i.e., $\operatorname{Re}\,G(i\omega) \geq 0$ for almost all $\omega \in \mathbb{R}$. This implies that the Nyquist plot of $G(\xi)$ lies entirely in the closed right half complex plane. Note that the polynomial $J'(\zeta, \eta)$ is symmetric in $p$ and $q$; therefore the behavior associated with $G(\xi)$ and the behavior associated with $1/G(\xi)$ are both $J$-dissipative. Thus, without loss of generality, it is enough to prove the proposition for $q(\xi)$. Suppose $q(\xi)$ has no roots in the open right half plane and all its purely imaginary roots are simple.
Since the Nyquist plot of G(ξ) lies in the closed right half plane, Nyquist stability arguments show that the rational function G(ξ)/(1 + G(ξ)) has all its poles in the open left half of the complex plane. Hence p(ξ) + q(ξ) is a Hurwitz polynomial.

Conversely, suppose p(ξ) + q(ξ) is Hurwitz. Then G(ξ)/(1 + G(ξ)) has all its poles in the open left half of the complex plane. Observe that if q(ξ) had non-simple roots on the imaginary axis, the Nyquist plot of G(ξ) would contain circle(s) of infinite radius; the Nyquist plot of G(ξ) would then not lie in the closed right half plane, contradicting J-dissipativity. Therefore q(ξ) can have at most simple roots on the imaginary axis. If q(ξ) had k roots in the open right half complex plane, then for p(ξ) + q(ξ) to be Hurwitz the Nyquist plot of G(ξ) would have to encircle the point −1 k times in the counterclockwise sense, which is impossible for a plot confined to the closed right half plane. This again contradicts our assumption that B is J-dissipative. Hence q(ξ) has no roots in the open right half plane and all its purely imaginary roots are simple.

Using Theorem 4.4.6, Lemma 4.6.1 and Proposition 4.6.2, the following result follows as a corollary:

Corollary 4.6.3 Let B be a J-dissipative behavior associated with the rational function p(ξ)/q(ξ). Then the following are equivalent:

1. p(ξ) and q(ξ) are not both constant, have no roots in the open right half plane, and all their purely imaginary roots are simple.
2. There exists a positive definite storage function (with respect to Q_J) on the manifest variables of B.
3. Every storage function on the manifest variables of B is positive definite.

Corollary 4.6.3 states that every storage function for B is positive definite if and only if the associated rational function p(s)/q(s) is (non-constant) positive real, with p(s), q(s) coprime. The strict version of the KYP lemma (Theorem 4.5.2) for J-dissipative SISO systems reduces to:

Corollary 4.6.4 Let B be a J-dissipative behavior associated with the rational function p(ξ)/q(ξ). Every storage function of B with respect to Q_J is positive definite on manifest variables, and the rate of change of the storage function is strictly less than Q_J, if p(ξ)/q(ξ) is non-constant and strictly positive real, i.e. p(ξ), q(ξ) are Hurwitz and Re p(iω)/q(iω) ≥ ε > 0 for some ε and every ω ∈ R.

We shall now consider SISO systems dissipative with respect to a supply Q_Φ with Φ(ζ, η) ∈ R²ˣ²_s[ζ, η] such that Φ(ζ, η) can be written as Kᵀ(ζ)JK(η). Necessary and sufficient conditions for such a splitting of Φ(ζ, η) are given in Theorem 4.4.3.

Corollary 4.6.5 Let Φ(ζ, η) ∈ R²ˣ²_s[ζ, η] be such that Φ(ζ, η) = Kᵀ(ζ)JK(η) with K(ξ) ∈ R²ˣ²[ξ] nonsingular. Consider a Φ-dissipative controllable behavior B given by an observable image representation

$$\begin{bmatrix}u\\y\end{bmatrix}=\begin{bmatrix}q(\tfrac{d}{dt})\\p(\tfrac{d}{dt})\end{bmatrix}\ell$$

and the associated J-dissipative behavior defined by the image of

$$\begin{bmatrix}\tilde q(\xi)\\\tilde p(\xi)\end{bmatrix}=K(\xi)\begin{bmatrix}q(\xi)\\p(\xi)\end{bmatrix}$$

Then every storage function on the manifest variables of B is positive definite if and only if the following holds: q̃(ξ) and p̃(ξ) are coprime and the rational function p̃(ξ)/q̃(ξ) is non-constant and positive real. Further, the rate of change of the storage function is strictly less than the supply if q̃(ξ) and p̃(ξ) are coprime and p̃(ξ)/q̃(ξ) is non-constant and strictly positive real.

We now demonstrate the ideas behind the results presented in this chapter with the help of a few examples; a small computational aside on the positive realness test used throughout comes first.
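The strict positive realness condition appearing in Corollaries 4.6.4 and 4.6.5 can be screened numerically. The following is a minimal sketch (the helper name and the frequency grid are assumptions of this sketch, not part of the text):

```python
import numpy as np

def spr_screen(p, q, wmax=1e4, n=100001):
    """Coarse numerical screen for strict positive realness of p(xi)/q(xi):
    p, q Hurwitz and Re p(iw)/q(iw) bounded away from zero on a grid.
    Indicative only; a grid check is no substitute for the algebraic test."""
    hurwitz = all((np.real(np.roots(c)) < 0).all() for c in (p, q))
    w = np.linspace(0.0, wmax, n)
    g = np.polyval(p, 1j * w) / np.polyval(q, 1j * w)
    return hurwitz, float(np.min(np.real(g)))

# p(xi) = xi + 1, q(xi) = xi + 2 (coefficients, highest degree first):
print(spr_screen([1.0, 1.0], [1.0, 2.0]))   # (True, 0.5): Re G(iw) in [1/2, 1)
```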
Example 4.6.6 Consider a capacitor of capacitance 1 unit. Let v be the voltage across the capacitor and i the corresponding current. The behavior B corresponding to the voltage-current relationship of the capacitor is given by

$$\begin{bmatrix}1&-\tfrac{d}{dt}\end{bmatrix}\begin{bmatrix}i\\v\end{bmatrix}=0$$

This behavior is clearly controllable. An observable image representation of B is

$$\begin{bmatrix}i\\v\end{bmatrix}=\begin{bmatrix}\tfrac{d}{dt}\\1\end{bmatrix}\ell$$

Consider the supply function Q_J. Then Q_J(i, v) = vi. It is easy to see that B is lossless with respect to Q_J. Hence there exists a unique storage function for B with respect to Q_J, which (on the latent variable) is found to be ℓ²/2, clearly positive definite in ℓ. Since v = ℓ, the corresponding storage function on manifest variables is v²/2, which is also clearly positive definite.

One may reach the same conclusion by a quick examination of the associated transfer function G(ξ) := 1/ξ. This transfer function is positive real and non-constant. Therefore, from Corollary 4.6.3, every storage function on manifest variables of B is positive definite.

We now examine what happens when the supply function is a more general QDF:

Example 4.6.7 Consider the supply function defined by

$$\Phi(\zeta,\eta)=\frac{1}{2}\begin{bmatrix}0&\eta\\\zeta&0\end{bmatrix}=\begin{bmatrix}1&0\\0&\zeta\end{bmatrix}J\begin{bmatrix}1&0\\0&\eta\end{bmatrix} \qquad (4.17)$$

Consider the behavior B of Example 4.6.6, corresponding to a unit capacitance:

$$\begin{bmatrix}i\\v\end{bmatrix}=\begin{bmatrix}\tfrac{d}{dt}\\1\end{bmatrix}\ell$$

The action of Q_Φ on trajectories in B is found to be Q_Φ(i, v) = i², which is pointwise positive. Taking i² itself as a dissipation function, one sees that the storage function corresponding to this dissipation function is zero.

We come to the same conclusion using Corollary 4.6.3. Let K(ξ) = diag(1, ξ). The image representation corresponding to K(d/dt)(B) is found to be

$$\begin{bmatrix}\tfrac{d}{dt}\\\tfrac{d}{dt}\end{bmatrix}\ell$$

which is not observable, since the image of every constant ℓ is zero. Therefore, from Corollary 4.6.3, it follows that B does not have positive definite storage functions with respect to Q_Φ.

Remark 4.6.8 Examples 4.6.6 and 4.6.7, though extremely simple, demonstrate some interesting concepts and give some insight into the sign definiteness of storage functions. In many problems, what matters is checking whether or not a given dynamical system has positive definite storage functions, rather than actually computing these storage functions. In Example 4.6.6, checking positive realness of the rational function 1/ξ is clearly easier than computing the storage function (which in general requires a symbolic computational package, and also spectral factorization). In Example 4.6.7, the supply function Q_Φ is pointwise positive on (all) trajectories. Since the supply function is then indistinguishable from a dissipation function, one can compute a storage function using Q_Φ as the dissipation function; such a storage function is clearly zero.

The final example serves to highlight the difference between storage functions on manifest variables and storage functions on states:

Example 4.6.9 Consider the behavior B given by

$$\begin{bmatrix}u\\y\end{bmatrix}=\begin{bmatrix}\tfrac{d}{dt}+2\\\tfrac{d^{2}}{dt^{2}}+\tfrac{d}{dt}-4\end{bmatrix}\ell$$

and consider Φ(ζ, η) with

$$\Phi(\zeta,\eta)=\frac{1}{4}\begin{bmatrix}\zeta+\eta+3\zeta\eta-1 & -3\zeta-1\\ -3\eta-1 & 3\end{bmatrix}$$

Define the two-variable polynomial Φ′(ζ, η) by

$$\Phi'(\zeta,\eta)=\begin{bmatrix}\zeta+2 & \zeta^{2}+\zeta-4\end{bmatrix}\Phi(\zeta,\eta)\begin{bmatrix}\eta+2\\\eta^{2}+\eta-4\end{bmatrix}$$

which is equal to ζη + 4(ζ + η) + 15. B is dissipative with respect to the supply function induced by Φ(ζ, η), since Φ′(−iω, iω) = ω² + 15 > 0 for all ω ∈ R.
The minimum storage function on the latent variable ℓ of B is found to be

Q_K(ℓ) = (4 − √15)ℓ²

which is positive definite in ℓ and hence, by observability of ℓ, also positive definite on all (u, y) ∈ B. That every storage function on the manifest variables of B is positive definite also follows from Theorem 4.4.6, since

Φ′(ζ, η) = (ζ + 4)(η + 4) − 1

and ξ + 4 is a Hurwitz polynomial.

The McMillan degree of B is 2; hence a minimal state representation for B has two state variables (for example ℓ and dℓ/dt). However, the storage function obtained above depends only on ℓ, i.e. on only one state. Therefore this storage function is only positive semidefinite on a set of minimal states of B, although it is positive definite on the manifest variables of B.

4.7 Conclusion

The KYP lemma relates passive systems to positive definite storage functions. Using this connection, we generalized the KYP lemma in a number of directions. We obtained a characterization of dissipative systems having positive definite storage functions. Unlike the KYP lemma, which only considers supply functions that are quadratic forms, we also considered supply functions that are quadratic differential forms; the KYP lemma is thus a special case of the characterization proposed here. We also formulated variants of the KYP lemma available in the literature in the behavioral framework, and generalized them.

We have not invoked a state space representation while formulating the KYP lemma and its generalizations. Our approach is therefore representation free; frequency domain representations can nevertheless be readily obtained if desired. As part of future work, we plan to investigate state space formulations of our generalization, not so much for conceptual benefits as for the sake of computations. It is known that the state space version of the KYP lemma can be formulated as an LMI (Linear Matrix Inequality); developments in convex optimization have in recent times led to a spurt of research in this area. It would be interesting to investigate LMI-based formulations of the generalizations obtained in this chapter.

Chapter 5

Designing linear controllers for nonlinearities

5.1 Introduction

Consider the following problem: given a nonlinear system, characterize a class of linear differential systems (called controllers) that can be interconnected with the nonlinear system through given interconnection constraints to yield a stable or an asymptotically stable system. This problem is a variant of a classical problem that has received wide attention in the literature: the problem of "absolute stability" due to Popov [76, 77, 78], Yakubovich and many others.

The absolute stability problem has its origin in the well known conjecture of Aizerman. In 1949, Aizerman put forward the following conjecture: assume that a system is obtained by an interconnection of a linear system and a memoryless, time-invariant, single-valued nonlinearity whose characteristic is bounded inside a sector in R² defined by a pair of straight lines with slopes k1, k2. The conjecture asserts that the (interconnected) system is stable if, were the nonlinearity replaced by a linear gain k, the resulting linear system would be stable for every k1 < k < k2. Pliss [73] demonstrated in 1958 that Aizerman's conjecture is not true in general. Subsequently, many researchers have demonstrated counterexamples to Aizerman's conjecture, see [21]. A big impetus to the development of absolute stability theory came with the work of the Romanian scientist V.M.
Popov, who in a series of seminal papers in the 1960s [76, 77] obtained a very general "frequency criterion" for predicting the stability of systems obtained by interconnecting a linear system with a nonlinearity.

The absolute stability problem is rich: though it is now more than forty years since Popov's fundamental work in this direction, the stability problem continues to be an area of active research. Broadly speaking, absolute stability criteria can be stated in the frequency domain (using certain "multipliers") or in state space (using a Linear Matrix Inequality). The debate on which of these is "better" is ongoing [20]. There are still numerous open issues, the greatest of them being that, to date, no necessary conditions for absolute stability seem to be known. In the 1960s and 70s the emphasis was mostly on SISO systems; see for instance the insightful papers by Brockett and Willems [15, 16]. Focus was also on investigating different "families" of nonlinearities, for example the family of all monotone nonlinearities [111]. In the 1990s, research in absolute stability was revived by developments in Linear Matrix Inequalities and convex optimization, see [81, 52, 42] for example. MIMO versions of stability theorems also received attention [31, 37, 48]. Research is also focused on nonlinearities with memory [30, 39]. See [5] for a collection of current problems.

From a behavioral viewpoint, we are interested in finding stabilizing linear controllers for given nonlinearities. Formulating the absolute stability problem in a behavior-theoretic framework has advantages: firstly, there are fewer a priori assumptions on the linear system that can be connected with a nonlinearity as compared to transfer function based methods. Classical theories depend essentially on a "negative feedback" interconnection of the linear and nonlinear systems; in the behavioral framework, fairly general interconnection constraints can also be handled. Further, the so-called "loop transformations" can be shown to be a special case of results obtained in the behavioral framework.

Lyapunov theory is a well known tool for the analysis of dynamical systems. An important part of Lyapunov theory deals with the construction of Lyapunov functions. In this chapter we use behavior-theoretic ideas to construct Lyapunov functions for nonlinear systems. These Lyapunov functions are constructed using storage functions of dissipative LTI systems. For the storage functions to qualify as Lyapunov functions, they need to satisfy conditions on positivity and radial unboundedness. We explore these issues in the next section. The scalar version of the results obtained in this chapter has been published [65, 69]. A journal version of the results presented in this chapter is under preparation.

This chapter is organized as follows: in Section 5.2 we develop the necessary background. Section 5.3 is about the "control as interconnection" viewpoint; here we formulate the stabilization problem as an interconnection. In Section 5.4 we give precise definitions of a nonlinearity, together with a precise problem statement. Section 5.5 is the main section of this chapter: here we give a recipe for constructing a set of stabilizing controllers for a given nonlinearity. This is followed by applications: Section 5.6 treats the circle criterion, Section 5.7 Popov's stability criterion, and Section 5.8 slope restricted nonlinearities.
In Section 5.9 we discuss applications of the theory to nonlinearities with memory.

5.2 Preliminaries

In this section we build on the ideas presented in Chapter 4 to construct Lyapunov functions using storage functions of linear systems. Consider a QDF Q_Φ with Φ(ζ, η) = Kᵀ(ζ)JK(η), K(ξ) ∈ R^{w×w}[ξ] nonsingular, and

$$J=\frac{1}{2}\begin{bmatrix}0&I_m\\I_m&0\end{bmatrix}$$

where w = 2m. Let B be a Φ-dissipative behavior with input cardinality w/2, corresponding to the rational function P(ξ)Q⁻¹(ξ). We now define

$$\begin{bmatrix}R(\xi)\\S(\xi)\end{bmatrix}=K(\xi)\begin{bmatrix}Q(\xi)\\P(\xi)\end{bmatrix} \qquad (5.1)$$

The following corollary follows as a consequence of the KYP lemma, Theorem 4.4.6:

Corollary 5.2.1 Consider a Φ-dissipative behavior B = {(u, y)} associated with the rational function P(ξ)Q⁻¹(ξ), with input cardinality w/2. Consider the corresponding J-dissipative behavior B_J := {(u′, y′)} given by Im[R(d/dt); S(d/dt)] with R(ξ), S(ξ) as in equation (5.1). Then the following statements are equivalent:

1. Every storage function of B with respect to Q_Φ is positive definite on manifest variables.
2. The matrices defining B_J satisfy the following conditions:
(a) R(ξ) is nonsingular.
(b) R(ξ), S(ξ) are right coprime, i.e. if R(ξ) = R1(ξ)U(ξ) and S(ξ) = S1(ξ)U(ξ) then U(ξ) is unimodular.
(c) B_J has no non-trivial memoryless part.
(d) The rational function S(ξ)R⁻¹(ξ) is positive real.

Further, the rate of change of every storage function of B with respect to Q_Φ is strictly less than Q_Φ if S(ξ)R⁻¹(ξ) is strictly positive real.

Proof: We use Theorem 4.4.6 to prove the corollary. The only difference between Theorem 4.4.6 and this corollary is that the former concerns the supply function Q_{J_mm}, while here we consider the supply function Q_J. Notice that

$$\underbrace{\begin{bmatrix}I_m/2 & I_m/2\\ I_m/2 & -I_m/2\end{bmatrix}}_{C^{T}}\begin{bmatrix}I_m&0\\0&-I_m\end{bmatrix}\underbrace{\begin{bmatrix}I_m/2 & I_m/2\\ I_m/2 & -I_m/2\end{bmatrix}}_{C}=\begin{bmatrix}0&I_m/2\\I_m/2&0\end{bmatrix}=J$$

Thus we see that Φ(ζ, η) = Kᵀ(ζ)CᵀJ_mm CK(η). Let B1 = K(d/dt)(B) and B2 = CK(d/dt)(B). Assume that

$$B_2=\operatorname{Im}\begin{bmatrix}N(\tfrac{d}{dt})\\M(\tfrac{d}{dt})\end{bmatrix}\ell$$

Notice that N = (R + S)/2 and M = (R − S)/2. Since B is Φ-dissipative, B1 is J-dissipative and B2 is J_mm-dissipative. From Theorem 4.4.6, B has positive definite storage functions on manifest variables if and only if M, N are right coprime, N(ξ) is a Hurwitz matrix, and the behavior associated with the rational function MN⁻¹ has no non-trivial memoryless part. We now prove that these conditions are equivalent to those given in the statement of the corollary.

If w ∈ B1 then v := Cw ∈ B2. Let X_J be a minimal state map for B1:

x = X_J(d/dt)w, w ∈ B1

Then X_J(ξ)C⁻¹ is a state map for B2:

x = X_J(d/dt)C⁻¹v, v ∈ B2

Therefore B1 has no non-trivial memoryless part if and only if B2 has no non-trivial memoryless part. The claim that B1 is given by an observable image representation if and only if B2 is given by an observable image representation also follows from the invertibility of C. Finally, since ||(R − S)(R + S)⁻¹||_{H∞} ≤ 1, it follows that SR⁻¹ is positive real [3, 56]; this is a well known property of bounded real rational functions. The strict version of this corollary can be proved similarly from Theorem 4.5.2.

We have elaborated on the difference between storage functions on states and storage functions on manifest variables in Chapter 4. Trentelman and Willems [95] have shown that a storage function for a behavior B is a state function of an associated behavior. Under additional assumptions, however, a storage function for a behavior can be guaranteed also to be a state function.
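The matrix identity at the heart of the proof above is easy to verify numerically. A minimal sketch (the choice m = 2 is arbitrary):

```python
import numpy as np

m = 2
I, Z = np.eye(m), np.zeros((m, m))
J   = 0.5 * np.block([[Z, I], [I, Z]])
Jmm = np.block([[I, Z], [Z, -I]])
C   = 0.5 * np.block([[I, I], [I, -I]])      # note that C = C^T here

assert np.allclose(C.T @ Jmm @ C, J)         # the identity J = C^T Jmm C

# The same C sends [R; S] to [N; M] = [(R+S)/2; (R-S)/2], as used in the proof.
```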
We now investigate when storage functions for a Φ-dissipative behavior are functions of the state.

Theorem 5.2.2 Given Φ(ζ, η) = Kᵀ(ζ)JK(η) with K(ξ) ∈ R^{w×w}[ξ] nonsingular, let

$$K(\xi)=\begin{bmatrix}K_{11}&K_{12}\\K_{21}&K_{22}\end{bmatrix}$$

where K_ij ∈ R^{m×m}[ξ], i, j = 1, 2. Let B, defined by the observable image representation

$$\begin{bmatrix}u\\y\end{bmatrix}=\begin{bmatrix}Q(\tfrac{d}{dt})\\P(\tfrac{d}{dt})\end{bmatrix}\ell$$

be Φ-dissipative, and assume further that P(ξ)Q⁻¹(ξ) is proper. If G_B := K11 + K12PQ⁻¹ is a proper rational function, then every storage function of B with respect to Q_Φ is a state function.

Proof: Consider the associated J-dissipative behavior of B:

$$\bar{B}=\left\{\begin{bmatrix}\bar u\\\bar y\end{bmatrix}=\begin{bmatrix}K_{11}(\tfrac{d}{dt})Q(\tfrac{d}{dt})+K_{12}(\tfrac{d}{dt})P(\tfrac{d}{dt})\\K_{21}(\tfrac{d}{dt})Q(\tfrac{d}{dt})+K_{22}(\tfrac{d}{dt})P(\tfrac{d}{dt})\end{bmatrix}\ell\right\}$$

From [95], page 254, Theorem 6.1 it follows that every storage function of B̄ with respect to Q_J is a state function of B̄, i.e. every storage function can be written as x̄ᵀPx̄ with (d/dt)Q_P(x̄) ≤ Q_J(ū, ȳ), where x̄ denotes a minimal set of states of B̄.

Let us investigate under what conditions there exists a constant matrix C such that x̄ = Cx, where x denotes the states of B. Let X̄(d/dt) and X(d/dt) denote state maps acting on the latent variable ℓ for B̄ and B respectively:

x̄ = X̄(d/dt)ℓ;  x = X(d/dt)ℓ

Then x̄ = Cx if and only if X̄(ξ) = CX(ξ). We know that the row span of X̄(ξ) (over R) is precisely the span of the rows r_i(ξ) such that r_i(K11Q + K12P)⁻¹ is strictly proper (Corollary 3.5.2, [82], page 64). Since r_i(K11Q + K12P)⁻¹ is strictly proper, G_s := r_iQ⁻¹(K11 + K12PQ⁻¹)⁻¹ is strictly proper. Since by assumption K11 + K12PQ⁻¹ is proper,

r_iQ⁻¹ = G_s(K11 + K12PQ⁻¹)

is strictly proper because G_s is strictly proper. This shows that r_i also lies in the row span of X(ξ). We have proved that every vector in the row span of X̄(ξ) lies in the row span of X(ξ) (over R) when K11 + K12PQ⁻¹ is proper. Hence there exists a C such that x̄ = Cx.

The above result is, however, not enough to guarantee positive definiteness of storage functions of B, since the McMillan degree of B̄ could be less than that of B (see Example 4.6.9 for a demonstration). In order to preserve positive definiteness we want C to have full column rank, so that Cx = 0 implies x = 0. A necessary condition for the existence of such a C is that

McMillan degree of B̄ ≥ McMillan degree of B

As a special case we have the following result:

Theorem 5.2.3 With reference to Theorem 5.2.2, if G_B := K11 + K12PQ⁻¹ is biproper (i.e. proper with a proper inverse), then every positive definite state function of B̄ is a positive definite state function of B.

Proof: We show that if K11 + K12PQ⁻¹ is biproper then there exist matrices C and C′ such that

x̄ = Cx;  x = C′x̄

The first part of the claim follows from Theorem 5.2.2. If r_iQ⁻¹ is strictly proper, then r_iQ⁻¹[K11 + K12PQ⁻¹]⁻¹ is strictly proper, since K11 + K12PQ⁻¹ has a proper inverse by assumption. Therefore r_i[K11Q + K12P]⁻¹ is strictly proper. Hence x = C′x̄.

Thus Theorem 5.2.3 shows that if K11 + K12PQ⁻¹ is biproper then the state spaces of B̄ and B are the same. Let xᵀKx be a positive definite storage function for B with respect to Q_Φ. Note that such a storage function is radially unbounded, i.e. xᵀKx → ∞ as ||x|| → ∞. Positive definiteness and radial unboundedness of storage functions will be used crucially while constructing Lyapunov functions.
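In the scalar case the biproperness condition of Theorem 5.2.3 reduces to a degree comparison, since G_B = (K11·Q + K12·P)/Q. A small sketch (the polynomials below are illustrative choices, not taken from the text):

```python
import numpy as np

def degree(c):
    """Degree of a polynomial given by coefficients (highest first)."""
    c = np.trim_zeros(np.atleast_1d(c), 'f')
    return len(c) - 1 if len(c) else -np.inf

K11, K12 = np.array([1.0]), np.array([1.0, 0.0])   # K11 = 1, K12 = xi
Q = np.array([1.0, 1.0, 2.0])                      # Q = xi^2 + xi + 2
P = np.array([1.0, 1.0])                           # P = xi + 1 (P/Q strictly proper)

num = np.polyadd(np.polymul(K11, Q), np.polymul(K12, P))
proper   = degree(num) <= degree(Q)
biproper = degree(num) == degree(Q)
print(proper, biproper)    # here num = 2 xi^2 + 2 xi + 2, so G_B is biproper
```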
Since we consider nonlinear systems in this chapter, we have to work with a larger space than C∞, the space of smooth functions. In this chapter we take the function space to be L1^loc(R, R^w), the space of locally integrable functions from R to R^w. Differentiation of L1^loc functions is of course understood in the distributional sense; see Section 2.3 for an introduction to differentiation in L1^loc and the concept of "weak solutions".

A note about the action of QDFs on L1^loc functions is in order. The space L1^loc is not stable under differentiation, i.e. the derivative of an L1^loc function need not be another L1^loc function. Hence, given a behavior B = {(u, y)} ⊂ L1^loc(R, R^w), a QDF Q_Φ(u, y) may not be an L1^loc function in general and, worse, may not even be well defined. However, by restricting the Φs to a suitable set, one can guarantee that Q_Φ(u, y) is an L1^loc function whenever (u, y) ∈ B ⊂ L1^loc(R, R^w).

We reviewed state space representations of a behavior in Section 2.8. By the axiom of state (Definition 2.8.1), states are continuous (C⁰) functions of t. It can be shown that Q_Φ(u, y) ∈ L1^loc(R, R) for (u, y) ∈ B ⊂ L1^loc(R, R^w) if Q_Φ(u, y) can be written as a quadratic form in u, x and ẋ, where u denotes an input (Section 2.9) and x denotes a minimal set of states for B. In the sequel, given a Q_Φ, we only associate with it behaviors satisfying Q_Φ(u, y) ∈ L1^loc(R, R) for (u, y) ∈ B ⊂ L1^loc(R, R^w).

5.3 Control as an interconnection

A system that does not satisfy the superposition principle (Definition 2.1.2) is called a nonlinear system or, in short, a nonlinearity; we reserve the precise definition for the next section. At an abstract level, a nonlinearity consists of the set of trajectories allowed by the law that defines it. This set may contain trajectories that are desirable and trajectories that are not. By attaching a controller to the nonlinearity, we want to restrict the nonlinearity in such a way that the undesirable trajectories are eliminated. In the context of this chapter, the "desirable" trajectories are those that ensure stability of the equilibrium (a precise definition will be given later).

The idea behind control is thus a restriction brought about by the interconnection of systems. This notion of control is both physically appealing and practically important (see [101, 102] for details about the "control as an interconnection" viewpoint). Therefore, in this chapter we also adopt the view that control of a system is a restriction brought about by interconnection. In Chapter 7 we will again encounter the "control as an interconnection" philosophy when we address problems in the synthesis of dissipative systems.

We intend to design linear stabilizing controllers for given nonlinearities. Hence, in the sequel, we only study autonomous systems obtained by the interconnection of a linear and a nonlinear system. In Section 2.7 we defined stability for linear autonomous systems; there, the notion of stability matches our intuitive notion that "trajectories in a stable system do not blow up". In nonlinear systems, however, one has to define stability more carefully. We now define what we mean by stability of an equilibrium of a dynamical system:

Definition 5.3.1 Consider the system ẋ = f(x) with f(0) = 0, x(t) ∈ R^n being the state. Assume that this equation has a unique solution for every x(0) ∈ R^n. The equilibrium state x = 0 is called stable if for every ε > 0 there exists a δ > 0 such that ||x(0)|| < δ implies ||x(t)|| < ε for all t ∈ R₊.
Further, the equilibrium x = 0 is called globally asymptotically stable if it is stable and, for every x(0) ∈ R^n, ||x(t)|| → 0 as t → ∞.

These definitions are standard [107]. Since we are considering real valued systems, the definitions are independent of the norm used: all norms on Euclidean spaces are equivalent. Global asymptotic stability means that, whatever the initial condition, the trajectory x eventually tends to 0. For an equilibrium to be globally asymptotically stable it must be the only equilibrium of the system. Since the state x = 0 corresponds to the zero manifest variable trajectory, by stability about the zero trajectory we mean stability of x = 0. When considering systems with a single equilibrium we refer to asymptotic stability about the zero trajectory, although strictly speaking we mean global asymptotic stability of the equilibrium state x = 0.

5.4 Nonlinear systems: problem formulation

In this chapter we only consider systems obtained by an interconnection of a linear system and a nonlinearity. Let v = [v_i]_{i=1}^m and w = [w_i]_{i=1}^m be R^m-valued columns. Consider maps v_i = f_i(w_i), defined almost everywhere, i.e. f_i(w_i) is an L1^loc function with respect to w_i, i = 1, ..., m, and assume further that f_i(0) = 0. We represent these m maps notationally by v = f(w), and say that v = f(w) defines a memoryless nonlinearity. If the f_i do not depend on time, they are called time-invariant. In this chapter we only consider memoryless, time-invariant nonlinearities. Define

N = {(w, v) ∈ L1^loc(R, R^w) such that v_i = f_i(w_i), i = 1, ..., m}, w = 2m

Thus N is the set of trajectories consistent with v = f(w). Note that because f_i(0) = 0, i = 1, ..., m by assumption, (0, 0) ∈ N. Analogous to Chapter 2, where we treated a linear system Σ and its behavior B interchangeably, in this chapter we treat the nonlinearity defined by f and its "behavior" N interchangeably.

Consider a QDF Q_Θ with Θ ∈ R^{w×w}_s[ζ, η]. Define

N_Θ := {(w, v) ∈ L1^loc(R, R^w) such that Q_Θ(w, v) ∈ L1^loc(R, R), Q_Θ(w, v) ≥ 0}

If v = f(w) is such that N ⊂ N_Θ, then N is called a nonlinearity of the N_Θ-kind. The set of all nonlinearities N of the N_Θ-kind defines a family of nonlinearities F_{N_Θ}.

Consider any nonlinearity N ∈ F_{N_Θ}, and a linear differential behavior B = {(u, y)} associated with the strictly proper rational function P(ξ)Q⁻¹(ξ). Let the nonlinearity N and the behavior B be interconnected as follows:

$$\begin{bmatrix}u(t)\\y(t)\end{bmatrix}=\mathcal{X}\!\left(\tfrac{d}{dt}\right)\begin{bmatrix}w(t)\\v(t)\end{bmatrix} \qquad (5.2)$$

Here 𝒳(ξ) ∈ R^{w×w}[ξ] is called the interconnection matrix, and is assumed to be nonsingular. When B and N are interconnected using "negative feedback", 𝒳 can be seen to be

$$\mathcal{X}=\begin{bmatrix}0&I_m\\-I_m&0\end{bmatrix}$$

For the problems addressed in this chapter, only the "negative feedback" interconnection will be considered; however, other types of interconnection do not affect the essence of the theory presented here. Using negative feedback, B and N are interconnected to obtain the autonomous nonlinear system B_N. The process of interconnection is shown diagrammatically in Figure 5.1. Since N ∈ F_{N_Θ} is memoryless, a state space representation of B_N (corresponding to a negative feedback interconnection) can be obtained using the states of the linear differential behavior B.
In particular, B_N admits a representation of the type

ẋ = Ax − Bf(Cx)  (5.3)

where (A, B, C) is a minimal state representation of B with state x, and v = f(w) is the law defining the nonlinearity N. Since both B and N admit the zero trajectory, B_N is nonempty and contains at least the zero trajectory (u, y) = (0, 0). Recall that by asymptotic stability of (0, 0) ∈ B_N we mean global asymptotic stability of the state x = 0.

Problem formulation: Given a family of nonlinearities F_N, determine a class C_B of LTI behaviors B such that (0, 0) ∈ B_N is asymptotically stable for every B ∈ C_B and every N ∈ F_N.

The next section is devoted to addressing this problem.

[Figure 5.1: Interconnection of B and N]

5.5 Constructing stabilizing controllers

For the sake of clarity, we first present a recipe for obtaining C_B in Section 5.5.1; the remaining parts of this section are devoted to proving the claims made in the recipe.

5.5.1 A recipe to obtain stabilizing behaviors for all nonlinearities in a given family

1. Consider a QDF Q_{Θ1}. Define:

N_{Θ1} := {(w, v) ∈ L1^loc(R, R^w) | Q_{Θ1}(w, v) ∈ L1^loc(R, R), Q_{Θ1}(w, v) ≥ 0}  (5.4)

Consider sets N = {(w, v) ∈ L1^loc(R, R^w) | v = f(w)}, where f defines a memoryless, time-invariant nonlinearity. Then F_{N_{Θ1}} := {N | N ⊂ N_{Θ1}} is a family of nonlinearities of the N_{Θ1}-kind.

2. Find a continuous function G(w, v) such that:
(a) G(w, v) is positive semidefinite on all (w, v) ∈ N_{Θ1};
(b) (d/dt)G(w, v) is a QDF, say Q_{Θ2}(w, v). Note that (d/dt)G is then guaranteed to be an L1^loc function.

3. Let Θ(ζ, η) = Θ1(ζ, η) + Θ2(ζ, η). Note that for all (w, v) ∈ N_{Θ1}:

Q_Θ(w, v) ≥ (d/dt)G(w, v)  (5.5)

4. Compute the matrix Φ(ζ, η) as follows:

Φ(ζ, η) = −𝒳ᵀΘ(ζ, η)𝒳  (5.6)

Here 𝒳 is the interconnection matrix; as stated above, in this chapter only the negative feedback interconnection will be considered.

5. Check whether Φ(ζ, η) can be split as Φ(ζ, η) = Kᵀ(ζ)JK(η) with K(ξ) ∈ R^{w×w}[ξ] nonsingular and

$$J=\frac{1}{2}\begin{bmatrix}0&I_m\\I_m&0\end{bmatrix},\qquad m=\tt{w}/2$$

Consider a Φ-dissipative behavior B associated with the rational function P(ξ)Q⁻¹(ξ). If B has positive definite storage functions on manifest variables, and if the rational function G_B defined in Theorem 5.2.3 is biproper, then (0, 0) ∈ B_N is asymptotically stable.

5.5.2 Stability results

Notice that using the interconnection relations, one can express G(w, v) in terms of (u, y) as G(u, y). The following lemma is useful in finding a Lyapunov function candidate for B_N:

Lemma 5.5.1 Given F_{N_{Θ1}}, construct a suitable Q_Φ. Interconnect any nonlinearity N ∈ F_{N_{Θ1}} with a Φ-dissipative behavior B, resulting in the autonomous behavior B_N. Let Q_Ψ be any storage function of B with respect to Q_Φ such that (d/dt)Q_Ψ(u, y) < Q_Φ(u, y) for all nonzero (u, y) ∈ B. Then

(d/dt)(Q_Ψ(u, y) + G(u, y)) < 0 for all (u, y) ∈ B_N − {0}  (5.7)

Proof: Note that

Q_Φ(u, y) > (d/dt)Q_Ψ(u, y) for all (u, y) ∈ B − {0}  (5.8)

and that for all (u, y) ∈ B_N, Q_Θ(w, v) = −Q_Φ(u, y) (equation (5.6)). Adding inequality (5.5) to inequality (5.8) gives

(d/dt)(Q_Ψ(u, y) + G(w, v)) < 0 for all nonzero (u, y) ∈ B_N  (5.9)

Substituting G(w, v) = G(u, y) in this inequality completes the proof.
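Before the main theorem, a quick numerical illustration of the decreasing Lyapunov candidate of Lemma 5.5.1 on a closed loop of the form (5.3). Everything here is an illustrative assumption: the plant (A, B, C), the saturation nonlinearity, and the quadratic candidate V = xᵀPx, with P obtained from the linear part alone as a stand-in for the storage function construction; the monotone decrease observed below is specific to this example.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_lyapunov

# Closed loop (5.3): xdot = A x - B f(Cx), saturation in the sector [0, 1].
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([0.0, 1.0])
C = np.array([1.0, 0.0])
f = lambda y: np.clip(y, -1.0, 1.0)             # memoryless, f(0) = 0

sol = solve_ivp(lambda t, x: A @ x - B * f(C @ x), (0.0, 20.0), [2.0, -1.0])

P = solve_continuous_lyapunov(A.T, -np.eye(2))  # A^T P + P A = -I
V = np.einsum('ti,ij,tj->t', sol.y.T, P, sol.y.T)
print(np.all(np.diff(V) < 1e-9), sol.y[:, -1])  # V decreases; state decays to 0
```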
We now prove the central result of this chapter, using Theorem 5.2.3 to "convert" the Lyapunov function candidate of Lemma 5.5.1 into a function of the states.

Theorem 5.5.2 Given F_{N_{Θ1}}, construct an appropriate QDF Q_Φ. Assume Φ(ζ, η) = Kᵀ(ζ)JK(η) with

$$K(\xi)=\begin{bmatrix}K_{11}&K_{12}\\K_{21}&K_{22}\end{bmatrix}$$

nonsingular, K_ij(ξ) ∈ R^{m×m}[ξ], i, j = 1, 2. Let B be a Φ-dissipative behavior associated with the strictly proper rational function P(ξ)Q⁻¹(ξ) such that G_B = K11 + K12PQ⁻¹ is biproper. Consider the Lyapunov function candidate

V(u, y) = Q_Ψ(u, y) + G(u, y)

which satisfies:

1. Q_Ψ is a positive definite storage function on the manifest variables of B;
2. (d/dt)Q_Ψ(u, y) is strictly less than Q_Φ(u, y) for all nonzero (u, y) ∈ B;
3. G(u, y) is a continuous positive semidefinite function of (u, y) ∈ B_N and (d/dt)G(u, y) is a QDF.

Then the zero trajectory in B_N is asymptotically stable.

Proof: If Q_Ψ(u, y) > 0, then from Theorem 4.4.6 and Corollary 5.2.1 it can be shown that Q_Ψ is a positive definite state function of K(d/dt)(B). Since G_B is biproper, Theorem 5.2.3 shows that Q_Ψ is also a positive definite state function of B. Let (A, B, C) be a minimal state space representation of B with state x. Then Q_Ψ(u, y) can be written as xᵀDx with D > 0. Note that xᵀDx is radially unbounded. Using the substitutions y = Cx and u = −f(Cx), V(u, y) is a positive definite state function of B_N, V(x). Further, since G(u, y) is positive semidefinite and xᵀDx is radially unbounded, V(x) is radially unbounded. Using Lemma 5.5.1 we conclude that V(u, y) is a Lyapunov function for B_N.

5.5.3 A characterization of stabilizing controllers

Theorem 5.5.2 gives a recipe for constructing Lyapunov functions for B_N. Note that this stability result holds for any Φ-dissipative behavior with positive definite storage functions such that the rate of change of the storage function is strictly less than the supply. It follows immediately that if the class of Φ-dissipative behaviors with positive definite storage functions is known, a class C_B of stabilizing controllers can be parametrized. In Chapter 4 we addressed the Kalman-Yakubovich-Popov lemma in the behavioral setting and obtained conditions for a behavior to have positive definite storage functions on manifest variables.

Let Φ(ζ, η) = Kᵀ(ζ)JK(η) and let B, associated with a strictly proper rational function P(ξ)Q⁻¹(ξ), be Φ-dissipative. Define B̃ = K(d/dt)(B) and assume that B̃ is given by an image representation

$$\operatorname{Im}\begin{bmatrix}\tilde Q(\xi)\\\tilde P(\xi)\end{bmatrix}$$

Recall from Corollary 5.2.1 that B has positive definite storage functions on manifest variables, with the rate of change of the storage function strictly less than the supply Q_Φ, if:

1. P̃(ξ), Q̃(ξ) are right coprime matrices;
2. B̃ has no memoryless part;
3. the rational function P̃(ξ)Q̃⁻¹(ξ) is strictly positive real.

Using this characterization of behaviors with positive definite storage functions, we can address absolute stability criteria for several important classes of nonlinearities.

Remark 5.5.3 The stability theory presented here depends in an essential way on the fact that the nonlinearity and the behavior are interconnected using negative feedback. Consider a behavior B interconnected with a nonlinearity N ∈ F_{N_{Θ1}} using a general interconnection (specified by a nonsingular polynomial matrix 𝒳(d/dt)). Lyapunov functions for such systems can be constructed using storage functions that are state functions of the behavior 𝒳(d/dt)(B) (and not of B).

Remark 5.5.4 In Section 5.5.1, assume that G(w, v) is positive definite. Then, to ensure asymptotic stability, the rate of change of the storage function along trajectories in B need not be strictly less than the supply function.
Condition 3 above can then be relaxed: P̃(ξ)Q̃⁻¹(ξ) need only be positive real, as opposed to strictly positive real.

Remark 5.5.5 Condition 1 above states that Q̃(ξ) and P̃(ξ) must be right coprime in order to conclude asymptotic stability. However, if P̃(ξ) and Q̃(ξ) are not right coprime, one can still conclude asymptotic stability if the states of B can be constructed from the states of the behavior associated with the reduced rational function P̃(ξ)Q̃⁻¹(ξ).

Remark 5.5.6 Let (u, y) be an input-output partition of a linear differential behavior B, and let x denote a set of minimal states for B. Assume that Q_Ψ, a storage function for B, is a positive definite state function of ẋ, and not of x. With a memoryless nonlinearity interconnected with B through negative feedback, G(w, v) can be defined as a function of y: v = f(y), w = y. Assuming G(w, v) > 0 and (f(y) = 0 if and only if y = 0), V(u, y) in Theorem 5.5.2 can be defined as a positive definite function of ẋ and y: V(u, y) = ẋᵀDẋ + G(y). If (d/dt)V(u, y) is negative along all (u, y) ∈ B_N, we see that ẋ → 0 and y → 0 as t → ∞. Since x is a set of minimal states for B and B_N has only one equilibrium by assumption, we can still conclude global asymptotic stability of x = 0. See [31] for details of this argument.

In the following sections, a representative list of applications of the general theory stated above is presented. This leads to some well known classical results along with some new results.

5.6 The Circle criterion

One of the simplest classes of nonlinearities is the family of sector bound nonlinearities. We consider systems with w := 2m manifest variables. Define the m×m matrices A = diag[a1, a2, ..., am] and B = diag[b1, b2, ..., bm] with 0 ≤ a_i < b_i, i = 1, 2, ..., m. Define a w×w constant matrix Θ1 as

$$\Theta_1=\begin{bmatrix}-AB & \frac{A+B}{2}\\ \frac{A+B}{2} & -I_m\end{bmatrix} \qquad (5.10)$$

Define, as before, the set N_{Θ1} as follows:

N_{Θ1} = {(w, v) ∈ L1^loc(R, R^w) | Q_{Θ1}(w, v) ∈ L1^loc(R, R), Q_{Θ1}(w, v) ≥ 0}  (5.11)

Consider any nonlinearity defined by f = diag[f_i], i = 1, ..., m, where the v_i = f_i(w_i), f_i(0) = 0, are L1^loc maps. Let N be the set of trajectories compatible with f, i.e. N = {(w, v) ∈ L1^loc(R, R^w) | v_i = f_i(w_i), i = 1, ..., m}, and assume that N ⊂ N_{Θ1}. The family of all N satisfying these criteria forms a family of sector bound nonlinearities in [A, B], which we denote by F_{AB}.

Consider a linear differential behavior B corresponding to the strictly proper transfer function P(ξ)Q⁻¹(ξ) with Q(ξ), P(ξ) ∈ R^{m×m}[ξ]. We interconnect a nonlinearity N ∈ F_{AB} with the behavior B using "negative feedback":

$$\begin{bmatrix}w\\v\end{bmatrix}=\begin{bmatrix}0&I_m\\-I_m&0\end{bmatrix}\begin{bmatrix}u\\y\end{bmatrix} \qquad (5.12)$$

Following the notation of Section 5.5.1, let G = 0. Computing Φ(ζ, η) corresponding to "negative feedback", we observe that

$$\Phi(\zeta,\eta)=\begin{bmatrix}I_m & \frac{A+B}{2}\\ \frac{A+B}{2} & AB\end{bmatrix}=\underbrace{\begin{bmatrix}I_m & A\\ I_m & B\end{bmatrix}^{T}}_{K^{T}}\,J\,\underbrace{\begin{bmatrix}I_m & A\\ I_m & B\end{bmatrix}}_{K} \qquad (5.13)$$

where J = (1/2)[0, I_m; I_m, 0]. From Corollary 5.2.1 and Theorem 5.5.2 we obtain:

Corollary 5.6.1 Consider the family F_{AB} of sector bound nonlinearities in [A, B] with 0 ≤ a_i < b_i < ∞. Interconnect a linear differential behavior B, associated with the strictly proper rational function P(ξ)Q⁻¹(ξ), with any nonlinearity N ∈ F_{AB}, using negative feedback, to get B_N. Define the rational function

H(ξ) = [Q(ξ) + BP(ξ)][Q(ξ) + AP(ξ)]⁻¹  (5.14)

The equilibrium (0, 0) ∈ B_N is asymptotically stable if H(ξ) is strictly positive real (SPR). This is precisely the condition one obtains after a "loop transformation" on the nonlinearity [107].
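A scalar numerical illustration of the test in Corollary 5.6.1 (the sector [a, b] and the plant p/q below are illustrative choices, not taken from the text, and the grid check is a screen rather than a proof):

```python
import numpy as np

# Circle-criterion check of (5.14): H = (q + b p)/(q + a p) should be SPR.
a, b = 0.0, 2.0
p = np.array([1.0])                    # p(xi) = 1
q = np.array([1.0, 2.0, 2.0])          # q(xi) = xi^2 + 2 xi + 2

num, den = np.polyadd(q, b * p), np.polyadd(q, a * p)
hurwitz = all((np.real(np.roots(c)) < 0).all() for c in (num, den))
w = np.linspace(0.0, 1e3, 100001)
reH = np.real(np.polyval(num, 1j * w) / np.polyval(den, 1j * w))
print(hurwitz, reH.min())              # True, and a strictly positive minimum
```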
5.7 Classical Popov criterion

Since the circle criterion is only a sufficiency result, it is worthwhile investigating whether there exist more stabilizing controllers for a given nonlinearity N. This question will now be explored in more detail. We consider the family F_{0K} of sector bound nonlinearities in [0, K], where K = diag(k1, ..., km), 0 < k_i < ∞, i = 1, ..., m. Let

$$\Theta_1=\begin{bmatrix}0 & K/2\\ K/2 & -I_m\end{bmatrix}$$

Let N_{Θ1} be the set of all (w, v) ∈ L1^loc(R, R^w) such that Q_{Θ1}(w, v) is non-negative. Then N ⊂ N_{Θ1} for all N ∈ F_{0K}. Let

G(w, v) = Σ_{i=1}^m k_i α_i ∫₀^{w_i} v_i dw_i := ∫₀^w vᵀKΛ dw

with Λ = diag[α1, ..., αm], α_i > 0. Note that G(w, v) ≥ 0 for all (w, v) ∈ N_{Θ1}. Define a QDF Q_{Θ2} as

Q_{Θ2}(w, v) = (d/dt)G(w, v) = vᵀKΛ(dw/dt)  (5.15)

Define Θ(ζ, η) = Θ1(ζ, η) + Θ2(ζ, η). Using the 𝒳 corresponding to negative feedback, compute the two-variable polynomial matrix Φ(ζ, η) = −𝒳ᵀΘ(ζ, η)𝒳. Thus:

$$\Phi(\zeta,\eta)=\begin{bmatrix}I_m & K(I_m+\Lambda\eta)/2\\ K(I_m+\Lambda\zeta)/2 & 0\end{bmatrix} \qquad (5.16)$$

Note that Φ(ζ, η) can be factorized as

$$\Phi(\zeta,\eta)=\underbrace{\begin{bmatrix}I_m & 0\\ I_m & K(I_m+\Lambda\zeta)\end{bmatrix}^{T}}_{N^{T}(\zeta)}\,J\,\underbrace{\begin{bmatrix}I_m & 0\\ I_m & K(I_m+\Lambda\eta)\end{bmatrix}}_{N(\eta)} \qquad (5.17)$$

Consider a Φ-dissipative behavior associated with the strictly proper rational function P(ξ)Q⁻¹(ξ). Interconnect a nonlinearity N ∈ F_{0K} with B, using negative feedback, to obtain B_N. Let Q_Ψ be a positive definite storage function for B with respect to Q_Φ such that:

(d/dt)Q_Ψ(u, y) < Q_Φ(u, y) for all nonzero (u, y) ∈ B  (5.18)

From Corollary 5.2.1, all storage functions for B are positive definite and the rate of change of storage is less than the supply if the following conditions are satisfied:

1. Y(ξ) := [Q(ξ) + K(I_m + Λξ)P(ξ)]Q⁻¹(ξ), the transfer function associated with N(d/dt)(B), is strictly positive real;
2. the behavior N(d/dt)(B) has no nontrivial memoryless part;
3. K(I_m + Λξ)P(ξ) + Q(ξ) and Q(ξ) are right coprime.

Further, Theorem 5.2.2 shows that the storage functions of B are state functions. We interconnect any N ∈ F_{0K} with B to get B_N. From Theorem 5.5.2, (0, 0) ∈ B_N is asymptotically stable for every nonlinearity N ∈ F_{0K} and every Φ-dissipative behavior that satisfies the above conditions. This is a multivariable generalization of the celebrated result due to Popov, often called "Popov's stability criterion".

Remark 5.7.1 Following Remark 5.5.4, consider the case where the nonlinearity is restricted to the open sector (0, K). The procedure outlined above remains valid in this case; however, the requirement that the rate of change of the storage function along trajectories of B be strictly less than the supply is no longer crucial.
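In the scalar case, condition 1 above amounts to searching for a multiplier α ≥ 0 making Y(ξ) = [q + k(1 + αξ)p]/q strictly positive real. A minimal numerical sketch with illustrative data (not from the text); note how α = 0, i.e. the circle criterion, fails here while α > 0 succeeds:

```python
import numpy as np

k = 4.0
p = np.array([1.0])                      # p(xi) = 1
q = np.array([1.0, 1.0, 1.0])            # q(xi) = xi^2 + xi + 1
w = np.linspace(0.0, 1e3, 100001)

def popov_margin(alpha):
    num = np.polyadd(q, k * np.polymul([alpha, 1.0], p))
    if not all((np.real(np.roots(c)) < 0).all() for c in (num, q)):
        return -np.inf
    return np.min(np.real(np.polyval(num, 1j * w) / np.polyval(q, 1j * w)))

for alpha in (0.0, 0.5, 1.0, 2.0):
    print(alpha, popov_margin(alpha))    # negative for alpha = 0, positive otherwise
```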
Notice that in the result we have just proved, Λ ≥ 0. However, in the literature concerning the Popov criterion (e.g. Hsu and Meyer [36], p. 377, [16]), Λ is often allowed to be negative. This result will now be derived in the framework of this chapter. Let

$$\Theta_1=\begin{bmatrix}0 & K/2\\ K/2 & -I_m\end{bmatrix} \qquad (5.19)$$

Consider F_{0K}, the family of sector bound nonlinearities in [0, K]. Now define

G(w, v) = Σ_{i=1}^m k_i α_i ∫₀^{w_i} (k_i w_i − v_i) dw_i := ∫₀^w (Kw − v)ᵀKΓ dw,  Γ = diag[α1, ..., αm] ≥ 0

Notice that G(w, v) ≥ 0 for all (w, v) ∈ N_{Θ1}. Let

Q_{Θ2}(w, v) = (d/dt)G(w, v) = (Kw − v)ᵀKΓ(dw/dt)  (5.20)

Let Θ(ζ, η) = Θ1 + Θ2(ζ, η). Compute, as before, Φ(ζ, η) = −𝒳ᵀΘ𝒳, where 𝒳 denotes the interconnection matrix corresponding to negative feedback. Thus:

$$\Phi(\zeta,\eta)=\begin{bmatrix}I_m & K(I_m-\Gamma\eta)/2\\ K(I_m-\Gamma\zeta)/2 & -\Gamma K^{2}(\zeta+\eta)/2\end{bmatrix} \qquad (5.21)$$

Using negative feedback, interconnect any nonlinearity N ∈ F_{0K} with a linear differential Φ-dissipative behavior B associated with the strictly proper rational function P(ξ)Q⁻¹(ξ). Let Q_Ψ be a storage function of B with respect to Q_Φ such that the rate of change of Q_Ψ along nonzero trajectories of B is strictly less than Q_Φ. Notice that Φ(ζ, η) can be factorized in the following manner:

$$\begin{bmatrix}I_m & K(I_m-\Gamma\eta)/2\\ K(I_m-\Gamma\zeta)/2 & -\Gamma K^{2}(\zeta+\eta)/2\end{bmatrix}=\underbrace{\begin{bmatrix}I_m & K\\ I_m & -K\Gamma\zeta\end{bmatrix}^{T}}_{M^{T}(\zeta)}\,J\,\underbrace{\begin{bmatrix}I_m & K\\ I_m & -K\Gamma\eta\end{bmatrix}}_{M(\eta)} \qquad (5.22)$$

Denote by Z(ξ) := P̃(ξ)Q̃⁻¹(ξ) the rational function associated with M(d/dt)(B):

Z(ξ) := [Q(ξ) − KΓξP(ξ)][Q(ξ) + KP(ξ)]⁻¹  (5.23)

If Z(ξ) is reduced and SPR, and the behavior M(d/dt)(B) has no non-trivial memoryless part, then (0, 0) ∈ B_N is asymptotically stable. This is a result in itself; however, in order to simplify its application, we modify it using transfer function manipulations.

Lemma 5.7.2 Z(ξ) in equation (5.23) is SPR if and only if Z̄(ξ) := [Q(ξ) + K(I_m − Γξ)P(ξ)]Q⁻¹(ξ) is SPR.

Proof: Consider the polynomial matrices

$$\Phi(\zeta,\eta)=\begin{bmatrix}I_m & K(I_m-\Gamma\eta)/2\\ K(I_m-\Gamma\zeta)/2 & -\Gamma K^{2}(\zeta+\eta)/2\end{bmatrix};\qquad \bar\Phi(\zeta,\eta)=\begin{bmatrix}I_m & K(I_m-\Gamma\eta)/2\\ K(I_m-\Gamma\zeta)/2 & 0\end{bmatrix}$$

Since Φ(−iω, iω) = Φ̄(−iω, iω), it follows that Q_Φ ∼ Q_{Φ̄}; see Section 3.3 for details about equivalent supply functions. Therefore B, associated with the strictly proper rational function P(ξ)Q⁻¹(ξ), is Φ-dissipative if and only if it is also Φ̄-dissipative. Notice that

$$\begin{bmatrix}I_m & K(I_m-\Gamma\eta)/2\\ K(I_m-\Gamma\zeta)/2 & 0\end{bmatrix}=\begin{bmatrix}I_m & 0\\ I_m & K(I_m-\Gamma\zeta)\end{bmatrix}^{T}J\underbrace{\begin{bmatrix}I_m & 0\\ I_m & K(I_m-\Gamma\eta)\end{bmatrix}}_{N(\eta)}$$

Since Q_Φ ∼ Q_{Φ̄}, B is Φ-dissipative if and only if B̄, associated with the rational function Z̄(ξ) := [Q(ξ) + K(I_m − Γξ)P(ξ)]Q⁻¹(ξ), is J-dissipative. Hence Z(iω) + Zᵀ(−iω) ≥ εI_m > 0 if and only if Z̄(iω) + Z̄ᵀ(−iω) ≥ εI_m > 0, ω ∈ R. Further, Z(ξ) is SPR if and only if Z(iω) + Zᵀ(−iω) ≥ εI_m > 0 and 2Q(ξ) + K(I_m − Γξ)P(ξ) is Hurwitz. Note that the sums of the "numerator" and "denominator" of Z(ξ) and of Z̄(ξ) are the same. Therefore Z(ξ) is SPR if and only if Z̄(ξ) is SPR.

Thus, Popov's result in all its generality can now be stated as:

Theorem 5.7.3 Consider the family F_{0K} of memoryless, time-invariant, single-valued, sector-bound nonlinearities in [0, K], K = diag[k1, ..., km], 0 < k_i < ∞. Consider linear differential behaviors B associated with a strictly proper rational function P(ξ)Q⁻¹(ξ). Interconnect a nonlinearity N ∈ F_{0K} with a behavior B using "negative feedback" to obtain the autonomous nonlinear behavior B_N. Then the equilibrium (0, 0) in B_N is asymptotically stable if there exists Γ = diag[γ1, ..., γm] ≥ 0 such that Z(ξ) := [Q(ξ) + K(I_m ± Γξ)P(ξ)]Q⁻¹(ξ) is reduced and SPR, and further the behavior associated with Z(ξ) has no memoryless part. Further, if the nonlinearities N ∈ F_{0K} are restricted to lie in the open sector (0, K), the equilibrium (0, 0) is asymptotically stable if Z(ξ) is reduced and PR (when Γ is nonsingular) and SPR (when Γ is singular).

An interesting corollary follows from Theorem 5.7.3:

Corollary 5.7.4 Consider behaviors B that satisfy the conditions given in Theorem 5.7.3. Then no root of Q(ξ) lies in the open right half plane, and all its roots on the imaginary axis are simple.
Proof: With Q(ξ) + K(I_m ± Γξ)P(ξ) and Q(ξ) right coprime, the rational function Z(ξ) := [Q(ξ) + K(I_m ± Γξ)P(ξ)]Q⁻¹(ξ) is positive real only if Z(ξ) is analytic in the open right half plane and all singularities of Z(ξ) on ξ ∈ iR are simple. Since the singularities of Z(ξ) are the roots of det Q(ξ) = 0, the corollary is proved.

In other words, the corresponding linear transfer function P(ξ)Q⁻¹(ξ) is stable. It is interesting to note that in most references on Popov-like stability criteria in the literature ([15, 36, 52, 107]), the stability of the linear element is assumed a priori. From the results obtained in this chapter, however, it can be seen that stability of the linear element follows as a consequence of the theory, and not as a starting point.

5.8 Slope restricted nonlinearities

In this section we investigate memoryless nonlinearities that, in addition to being sector bound, also have restrictions on their slopes. Let

$$\Theta_1(\zeta,\eta)=\begin{bmatrix}0 & \frac{K}{2}\zeta\eta\\ \frac{K}{2}\zeta\eta & -\zeta\eta I_m\end{bmatrix} \qquad (5.24)$$

with K = diag[k1, ..., km], 0 < k_i < ∞. Let N_{Θ1} be the set of (w, v) ∈ L1^loc(R, R^w) such that Q_{Θ1}(w, v) > 0 for all nonzero (w, v). Consider nonlinearities N ∈ F_{0K} that also satisfy N ⊂ N_{Θ1}. The set of all N satisfying the above conditions forms a family of sector bound nonlinearities in (0, K) with the additional restriction that the slope of the w-v characteristic of every N in this family lies in (0, K). This family of nonlinearities is denoted by F_mon (the subscript "mon" stands for monotone). Clearly F_mon ⊂ F_{0K}. Define a function G(w, v) as follows:

G(w, v) = Σ_{i=1}^m k_i α_i (w_i v_i − ∫₀^{w_i} v_i dw_i) := wᵀKΓv − ∫₀^w vᵀKΓ dw  (5.25)

with Γ = diag[α1, ..., αm], α_i positive. Note that ∫₀^{w_i} v_i dw_i is the area under the w_i-v_i characteristic of the i-th component of the nonlinearity N, and w_i v_i is the area of the rectangle in R² with vertices (0, 0), (0, v_i), (w_i, 0) and (w_i, v_i). Since the slope of N is restricted to (0, K), the area of this rectangle is always greater than the corresponding area under the w_i-v_i curve; thus ∫₀^{w_i} v_i dw_i < w_i v_i, and every N = {(w, v)} ∈ F_mon satisfies the inequality G(w, v) > 0.

Define the QDF Q_Θ as the sum of Q_{Θ1} and the QDF associated with (d/dt)G(w, v). Now compute the matrix Φ(ζ, η) from Θ(ζ, η) corresponding to "negative feedback". Thus:

$$\Phi(\zeta,\eta)=\begin{bmatrix}\zeta\eta I_m & \frac{K}{2}(\zeta\eta+\Gamma\zeta)\\ \frac{K}{2}(\zeta\eta+\Gamma\eta) & 0\end{bmatrix} \qquad (5.26)$$

Notice that Φ(ζ, η) admits the following factorization:

$$\Phi(\zeta,\eta)=\underbrace{\begin{bmatrix}\zeta I_m & 0\\ \zeta I_m & K(\Gamma+\zeta I_m)\end{bmatrix}^{T}}_{M^{T}(\zeta)}\,J\,\underbrace{\begin{bmatrix}\eta I_m & 0\\ \eta I_m & K(\Gamma+\eta I_m)\end{bmatrix}}_{M(\eta)} \qquad (5.27)$$

Consider a Φ-dissipative behavior B associated with the strictly proper rational function P(ξ)Q⁻¹(ξ). The transfer function associated with M(d/dt)(B) is

H(ξ) = [K(Γ + ξI_m)P(ξ) + ξQ(ξ)][ξQ(ξ)]⁻¹  (5.28)

Following the notation in Theorem 5.5.2, we see that G_B is not biproper. Note that every storage function for B is a state function of the behavior associated with H(ξ). When H(ξ) is reduced, the McMillan degree of M(d/dt)(B) exceeds that of B by one. Further, every storage function of B can be defined as a state function of ẋ, where x is a set of minimal states for B. We conclude from Remark 5.5.6 that x = 0 is still globally asymptotically stable. Thus:

Corollary 5.8.1 Let F_mon be the family of nonlinearities that are sector bound in [0, K] and have slopes sector bound in (0, K). Consider a behavior B associated with the strictly proper rational function P(ξ)Q⁻¹(ξ). Interconnect B with any N ∈ F_mon to get B_N.
Then (0, 0) ∈ B_N is asymptotically stable for every N ∈ F_mon if there exists Γ = diag[α1, ..., αm], α_i > 0, i = 1, ..., m, such that:

1. the polynomial matrices K(Γ + ξI_m)P(ξ) + ξQ(ξ) and ξQ(ξ) are right coprime;
2. H(ξ) := [K(Γ + ξI_m)P(ξ) + ξQ(ξ)][ξQ(ξ)]⁻¹ is positive real;
3. the behavior associated with H(ξ) has no memoryless part.

The scalar version of this result was first proved by Vimal Singh [88], while the matrix version was addressed by Haddad and Kapila [31] and Park et al. [57]. Consider the scalar case of Corollary 5.8.1. If P(0) = 0, then the numerator and denominator of H(ξ) are not right coprime. However, one can still conclude asymptotic stability of B_N in the light of Remark 5.5.5. While Singh assumes that P(0) ≠ 0 [88], we see that this assumption can be relaxed.

5.9 Nonlinearities with memory

So far in this chapter we have considered only memoryless nonlinearities. We now discuss how the theory presented in this chapter can be used to handle nonlinearities with memory (e.g. systems with hysteresis). Figure 5.2 shows some examples of nonlinearities with memory.

[Figure 5.2: Nonlinearities with memory: (a) ideal relay with hysteresis; (b) saturation with hysteresis]

Let N denote the set of (w, v) ∈ L1^loc(R, R²) consistent with the laws defining a nonlinearity shown in Figure 5.2. Notice that the characteristics shown in Figure 5.2 satisfy the inequality v̇w ≥ 0. This QDF can be represented as Q_Θ(w, v) ≥ 0 where

$$\Theta(\zeta,\eta)=\begin{bmatrix}0 & \eta/2\\ \zeta/2 & 0\end{bmatrix}$$

Let N_Θ denote the set of (w, v) ∈ L1^loc(R, R²) satisfying Q_Θ(w, v) ≥ 0; then N ⊂ N_Θ. Let F_{N_Θ} denote the family of all N such that N ⊂ N_Θ. With reference to Section 5.5.1, we choose G = 0 and compute Φ corresponding to negative feedback:

$$\Phi(\zeta,\eta)=\begin{bmatrix}0 & \zeta/2\\ \eta/2 & 0\end{bmatrix}$$

Interconnect N with a Φ-dissipative behavior B associated with the rational function p(ξ)/q(ξ) to obtain B_N. Notice that

$$\begin{bmatrix}0 & \zeta/2\\ \eta/2 & 0\end{bmatrix}=\begin{bmatrix}\zeta & 0\\ 0 & 1\end{bmatrix}J\underbrace{\begin{bmatrix}\eta & 0\\ 0 & 1\end{bmatrix}}_{K(\eta)}$$

The rational function associated with K(d/dt)(B) is seen to be H(ξ) = p(ξ)/(ξq(ξ)). If

p(ξ) and ξq(ξ) are coprime, not both constant, and H(ξ) is PR,  (5.29)

every storage function of B is positive definite on manifest variables. Further,

(d/dt)Q_Ψ(u, y) ≤ 0 for all (u, y) ∈ B_N

We have "transformed" the nonlinearity with memory into the [0, ∞] sector using the transformation defined by K(d/dt). Hence we conjecture that if H(ξ) is positive real and p(0) ≠ 0, then (0, 0) ∈ B_N is stable in the sense of Lyapunov.

Let us check this claim using other methods of analysis, such as describing functions [8]. Consider the behavior B associated with the rational function G(ξ) = p(ξ)/q(ξ) with p(ξ) = ξ + 2 and q(ξ) = ξ + 3. The Nyquist plot of G(ξ) lies entirely in the first quadrant for ω ∈ [0, ∞]. Let N be an ideal relay with hysteresis (Figure 5.2(a)), and let N(a) be the describing function of N (see [8]). Superimposing the plot of −1/N(a) on the Nyquist plot of G(ξ) (Figure 5.3), we see that the two plots do not intersect. Hence we conclude that the interconnected system has a stable equilibrium.

[Figure 5.3: Describing function analysis: Nyquist plot of (jω + 2)/(jω + 3) and the locus −1/N(a) for a typical ideal relay with hysteresis]

We now apply the results obtained in this section for analyzing stability. Notice that

H(ξ) = (ξ + 2)/(ξ(ξ + 3))

satisfies the conditions given in (5.29), since it is positive real. Thus the theory is in agreement with the analysis using describing functions.
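The describing-function comparison above can be reproduced numerically. For an ideal relay with hysteresis (output levels ±h, switching thresholds ±Δ), the standard describing function is N(a) = (4h/(πa))·e^{−i arcsin(Δ/a)} for a ≥ Δ; the values of h and Δ below are illustrative assumptions:

```python
import numpy as np

h, Delta = 1.0, 0.5

a = np.linspace(Delta, 50.0, 500)
N = (4*h/(np.pi*a)) * np.exp(-1j*np.arcsin(Delta/a))
locus = -1.0/N                            # lies entirely in Re <= 0

w = np.linspace(0.0, 1e3, 1001)
G = (1j*w + 2)/(1j*w + 3)                 # Nyquist plot: Re G >= 2/3

# Minimum distance between the two curves is bounded away from zero, so the
# plots do not intersect and no limit cycle is predicted.
print(np.min(np.abs(G[None, :] - locus[:, None])))
```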
We have also carried out a number of simulations, and it appears that p(ξ)/(ξq(ξ)) being positive real with p(0) ≠ 0 guarantees stability of the equilibrium. Note that condition (5.29) allows p(ξ)/q(ξ) to be biproper, i.e. there could be a feedthrough term. We conjecture that the algebraic methods used here will help in analysing nonlinearities with memory. We have found remarkable agreement between our results and those obtained from other analytic methods such as describing functions; moreover, our results are in agreement with simulations. However, some technical difficulties concerning the differentiability of trajectories in B have so far been an obstruction to obtaining a rigorous proof.

5.10 Conclusion

In this chapter, some stability issues of nonlinear systems have been addressed in the behavioral framework. Using QDFs, we have proposed an algebraic method for the stability analysis of nonlinear dynamical systems obtained by the interconnection of a linear system and a sector-bound memoryless nonlinearity. The method lets us construct Lyapunov functions on the manifest variables of a system without explicitly invoking states. As applications, we demonstrated how the circle criterion and a more general version of Popov's stability criterion than those usually seen in the literature follow as special cases. We also investigated systems with slope restricted nonlinearities. The algebraic nature of the proposed method also lets us investigate nonlinearities with memory; we have shown that the stability results obtained using this method match other independent methods of analysis, though we have not been able to formulate a rigorous theory for such nonlinearities.

Chapter 6

Polynomial J-spectral factorization

6.1 Introduction

The problem of polynomial J-spectral factorization is the following: a real para-Hermitian w×w polynomial matrix Z in the indeterminate ξ (i.e. Z(ξ) = Z(−ξ)ᵀ) is given, together with two integers m and n such that m + n = w. It is required to find a w×w polynomial matrix F (if one exists) such that

Z(ξ) = Fᵀ(−ξ)J_mn F(ξ)  (6.1)

where

$$J_{mn}=\begin{bmatrix}I_m & 0\\ 0 & -I_n\end{bmatrix}$$

and F is Hurwitz, i.e. has no singularities in the closed right half of the complex plane. Strictly speaking, equation (6.1) is really a "J_mn-spectral" factorization; however, for ease of notation, and to conform with the available literature in this area, we still call it a "J-spectral" factorization.

Polynomial J-spectral factorization arises in different areas of systems and control, for example in Wiener filtering, LQG theory, and the polynomial approach to H∞ control and filtering (see [49, 53]). Many algorithms have been suggested for the solution of this problem, especially in the case n = 0, i.e. J_mn = I_w (see [4, 11, 24, 27, 38, 49, 97]). In this chapter we propose an algorithm for J-spectral factorization based on the special kernel representations of solutions to the subspace Nevanlinna interpolation problem (see [84]), and on the calculus of quadratic differential forms introduced in Chapter 1.
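The defining relation (6.1) is easy to exercise numerically: building Z from any chosen F and J_mn produces a para-Hermitian matrix by construction. A minimal sketch (F below is an arbitrary illustrative Hurwitz choice, not from the text):

```python
import numpy as np

Jmn = np.diag([1.0, -1.0])                      # m = n = 1

def F(xi):
    return np.array([[xi + 1.0, 1.0],
                     [0.0, xi + 2.0]])          # singularities at -1, -2 only

def Z(xi):
    return F(-xi).T @ Jmn @ F(xi)               # equation (6.1)

for xi in (0.3, -1.7, 2.0 + 1.0j):
    assert np.allclose(Z(xi), Z(-xi).T)         # para-Hermitian: Z(xi) = Z(-xi)^T
print(Z(0.0))
```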
The theory of metric interpolation has been used in [24, 27] to solve the problem of rational spectral factorization (i.e. the case in which J_mn = I_w and the entries of the matrix Z are rational functions). Apart from the fact that our focus here is on polynomial J-spectral factorization, the approach proposed in this chapter differs from that of [24, 27] in several respects:

Our approach arises from the theory of QDFs and uses two-variable polynomial matrix algebra; this point of view allows new insights into the nature of the problem. For example, an important consequence of our reliance on the theory of QDFs is that we are able to formulate necessary and sufficient conditions for the existence of a J-spectral factorization, thus providing an original and effective test alternative to the ones already known.

Our approach also covers the case J_mn ≠ I_w, which is of special interest in H∞ control and filtering. This generalization to the indefinite case is based on the results on Σ-unitary vector-exponential modeling developed in the behavioral framework (see [84]).

Finally, the functioning of the algorithm does not depend on the assumptions underlying the algorithm proposed in [24, 27], and it can be applied to a general para-Hermitian matrix Z. Moreover, one is not required to know a priori the inertia matrix J_mn of the spectral factorization, since it is determined in the course of the computations carried out in the algorithm we suggest.

The results reported in this chapter were obtained in collaboration with Dr. Paolo Rapisarda, who is currently with the Department of Electrical and Computer Engineering, University of Southampton, UK.

This chapter is organized as follows: in Section 6.2 we illustrate the basic features of modeling vector-exponential time series, with special emphasis on modeling with Σ-unitary models. Section 6.3 is the main section of this chapter, where we give an algorithm for polynomial J-spectral factorization; this section also contains other results of independent interest. In Section 6.4 we examine numerical aspects of the algorithm. This is followed by examples in Section 6.5.

6.2 Σ-unitary modeling of dualized data

In this section we illustrate the problem of modeling dualized data sets, a concept introduced in [84] in the context of the subspace Nevanlinna interpolation problem (SNIP in the following). Σ-unitary kernel representations play a central role in the algorithm for Σ-spectral factorization illustrated in Section 6.3.

6.2.1 Modeling vector-exponential time series with behaviors

Assume that a set D of data consisting of vector-exponential time series is given, i.e.

D := {v_i e^{λ_i t}}_{i=1,...,N}  (6.2)

where v_i ∈ C^w, λ_i ∈ C, i = 1, ..., N. We pick our model from a model class M, whose choice embodies the a priori assumptions on the nature of the phenomenon producing the data, for example linearity, time-invariance, etc. For the purposes of this chapter we choose the model class consisting of finite-dimensional linear differential systems with w external (manifest) variables, denoted, as in the earlier chapters, by L^w. We say that a model B ∈ L^w is an unfalsified model for D, or equivalently, that the model B explains the data D, if D ⊆ B. Of course, in general more than one model explains the data. Clearly, the model B = C∞(R, C^w) is unfalsified by every trajectory. This extreme case leads us to deduce that the strength of a model lies in its prohibitive power: the more a model forbids, the better it is.
In this sense C ∞ (R, Cw ) is a trivial model because no restrictions are being imposed on the outcomes of the model: according to C ∞ (R, Cw ), every outcome is possible. 6.2. Σ-unitary modeling of dualized data 95 Taking the point of view that the strength of a model lies in its prohibitive power (see [100, 101]) leads to a natural partial ordering among unfalsified models: if B1 and B2 are unfalsified by a set of data D, then we call B1 more powerful than B2 if B1 ⊆ B2 . We call a model B∗ the most powerful unfalsified model (abbreviated MPUM ) for D if B∗ ⊇ D and any other unfalsified model B for D satisfies B∗ ⊆ B. That is, the MPUM is the most restrictive model among those not refuted by the data. It can be shown that given a set of vector-exponential time series D as in (6.2), there exists a unique behavior B∗ ⊂ C ∞ (R, Cw ) which explains the data D and as little else as possible. Since B∗ ∈ Lw by assumption, and since D ⊆ B∗ : B∗ = lin span {vi eλi t }i=1,...,N (6.3) Observe that the MPUM of a finite set of vector-exponential time series is autonomous, i.e. it is a finite dimensional subspace of C ∞ (R, Cw ); equivalently, it can be represented as the kernel of a matrix polynomial differential operator RN ( dtd ), such that RN is square and nonsingular as a polynomial matrix (see Section 2.7 of this thesis). Note that the data D is, by assumption, in general complex valued. Hence there exists a kernel representation RN ( dtd ) for B∗ with RN (ξ) ∈ Cw×w [ξ], i.e. RN (ξ) is a polynomial matrix with complex coefficients. The following remark addresses when we can find RN ∈ Rw×w [ξ]: Remark 6.2.1 Let vi ∈ Cw = [vi1 , vi2 , . . . , viw ]T , vij ∈ C, j = 1 . . . w and λi ∈ C. Define v̄i ∈ Cw = [v̄i1 , v̄i2 , . . . , v̄iw ]T . If the data set D is self-conjugate, i.e. for every trajectory vi eλi t ∈ D, the trajectory v̄i eλ̄i t ∈ D, then there exists RN (ξ) ∈ Rw×w [ξ] such that RN ( dtd )w = 0 for all w ∈ B∗ . We now present an iterative algorithm to compute a kernel representation of B∗ . Define: R0 := Iw and proceed iteratively as follows for k = 1, . . . , N . At step k, define the k-th error trajectory d )vk eλk t = Rk−1 (λk )vk eλk t (6.4) dt Observe that the error-trajectory is also a vector-exponential time-series associated with the frequency λk and the vector εk := Rk−1 (λk )vk . A kernel representation of the MPUM for εk eλk is εk ε∗k d d Ek ( ) := Iw λk − dt kεk k2 dt Now define Rk := Ek Rk−1 Rk−1 ( After N steps such algorithm produces a w×w polynomial matrix RN such that RN ( dtd )vi eλi t = 0 for 1 ≤ i ≤ N ; then d B∗ = ker RN ( ) dt Especially in the Subspace Nevanlinna Interpolation Problem, the issue is to model not one, but an entire subspace of vector-exponential trajectories associated with the same frequency of the exponential, i.e. the data consists of the exponential trajectories in Vi eλi t := {veλi t | v ∈ Vi , Vi linear subspace of Cw }, i = 1, . . . , N 96 6 Polynomial J-spectral factorization Observe that this problem can also be interpreted as that of modeling the data lin span N [ i=1 {vij eλi t }j=1,...,dim(Vi ) where {vij }j=1,...,dim(Vi ) is a basis for Vi , i = 1, . . . , N . 6.2.2 Data dualization, semi-simplicity, and the Pick matrix In order to state the main result of this section, we need some preliminaries. Let Σ ∈ Rw×w be a symmetric, nonsingular matrix, and consider λi ∈ C+ , i = 1, . . . , N . We first introduce data dualization. 
Consider the set of trajectories Vi eλi t := {veλi t |v ∈ Vi } and we define its dual set as Vi ⊥Σ e−λ̄i t := {we−λ̄i t |w ∗ Σv = 0 for all v ∈ Vi } We call the set λi t ∪N ∪ Vi⊥Σ e−λ̄i t } i=1 {Vi e (6.5) the dualized data set. In the following it will be shown that by modeling the dualized dataset, a unfalsified model exhibiting a special (“Σ-unitary”) structure can be obtained. This special structure has special importance in considering the solution to the Subspace Nevanlinna Interpolation Problem (see [84]). Next, we define the concept of semi-simplicity of a one-variable polynomial matrix. Let Z ∈ Rw×w [ξ]; Z is semi-simple if for all λ ∈ C the dimension of the kernel of Z(λ) is equal to the multiplicity of λ as a root of det(Z). Note that if det(Z) has distinct roots then Z is semi-simple. Finally, we introduce the notion of Pick matrix associated with the data {(λi , Vi )}i=1,...,N . Let Vi ∈ Rw×dim(Vi ) be a full column rank matrix such that Im(Vi ) = Vi , i = 1, . . . , N . The P PN ( N i=1 dim(Vi )) Hermitian matrix i=1 dim(Vi )) × ( T{Vi }i=1,...,N := h Vi∗ ΣVj λ̄i +λj i i,j=1,...,N (6.6) is called a Pick matrix associated with {(λi , Vi )}i=1,...,N . Of course T{Vi }i=1,...,N depends on the particular basis matrices Vi chosen for Vi , but it is easy to see that the inertia of all these Pick matrices is the same. 6.2.3 A procedure for Σ-unitary modeling Consider a nonsingular matrix Σ = ΣT ∈ Rw×w . A polynomial matrix R ∈ Cw×w [ξ] is said to be Σ-unitary if there exists p(ξ) ∈ C[ξ], p 6= 0, such that RΣR∼ = R∼ ΣR = pp∼ Σ 6.2. Σ-unitary modeling of dualized data 97 where R∼ := R? (−ξ). Assume now that a set consisting of vector-exponential data {vi eλi t }i=1,...,N is to be modeled, and that the characteristic frequencies λi are all distinct; then det(R) = ΠN i=1 (ξ − λi ) (see [7] for the case when the characteristic frequencies are repeated). Recall that the Pick matrix for Vi eλi t , i = 1, . . . , N is defined as T{Vi }i=1,...,N = [Vi∗ ΣVj /(λ̄i + λj )]N i,j=1 , where Vi is a basis for Vi . We call the matrix: T{Vi }i=1,...,k := [Vi∗ ΣVj /(λ̄i + λj )]ki,j=1 , k ≤ N the k-th order principal block submatrix of T{Vi }i=1,...,N , and det T{Vi }i=1,...,k the k-th order principal block minor of T{Vi }i=1,...,N . The following result gives a sufficient condition for the existence of a Σ-unitary model of a dualized data set (6.5). Theorem 6.2.2 Assume that the Hermitian matrices T{Vi }i=1,...,k , k = 1, . . . , N are nonsingular, i.e. every principal block minor of T{Vi }i=1,...,N = [Vi∗ ΣVj /(λ̄i + λj )]N i,j=1 is nonzero. Then the ⊥Σ −λ̄i t N λi t MPUM for the dualized data set ∪i=1 {Vi e ∪ Vi e } has a Σ-unitary kernel representation, d w×w λi t ∪ Vi⊥Σ e−λ̄i t } and i.e. there exists R̂ ∈ C [ξ] such that R̂( dt )w = 0 for all w ∈ ∪N i=1 {Vi e R̂? (−ξ)ΣR̂(ξ) = p? (−ξ)p(ξ)Σ, p 6= 0. Proof: Let V1 ∈ Rw×dim(V1 ) be a full-column rank matrix such that Im(V1 ) = V1 . By assumption, V1∗ ΣV1 /(λ̄1 + λ1 ) is nonsingular. Consider the w × w matrix −1 R̂1 (ξ) := (ξ + λ̄1 )Iw − V1 T{V V ∗Σ 1} 1 (6.7) It is easily verified that ker(R̂1 ( dtd )) ⊇ V1 eλ1 t ∪ V1⊥Σ e−λ̄1 t . Now observe that since T{V1 } is ¯ nonsingular, it holds that dim(lin span(V1 eλ1 t ∪ V1⊥Σ e−λ1 t )) = w. Since deg(det(R̂1 )) = w, it follows that ker(R̂1 ( dtd )) is the MPUM for the dualized data associated with the subspace V1 . It is a matter of straightforward verification to see that R̂1 is a Σ-unitary matrix. Then, R̂ = R̂1 . 
The error subspaces of R̂1 ( dtd ) are generated by the matrices −1 Vi0 := R̂1 (λi )Vi = (λi + λ̄1 )Vi − V1 T{V V ∗ ΣVi , i = 2, . . . , N 1} 1 We first verify that the (i − 1, j − 1)-th block of T{Vi0 }2≤i≤N is Vi0∗ ΣVj0 (λ1 + λ̄i )(λj + λ¯1 ) T −1 = Vi ΣVj − ViT ΣV1 T{V V ∗ ΣVj , i, j = 2, . . . , N 1} 1 λj + λ̄i λj + λ̄i Now partition the principal block submatrices of T{Vi }i=1,...,N as follows: " # T{V1 } b∗k T{Vi }i=1,...,k = k = 2, . . . , N bk T{Vi }i=2,...,k with bk := col " Vi∗ ΣV1 λ̄i +λ1 Idim(V1 ) 0 −1 −∆k bk T{V1 } ∆k # , i = 2, . . . , k. Let ∆k := diag(λ̄i + λ1 )i=2,...,k . Observe that T{Vi }i=1,...,k # −1 ∗ ∆ Idim(V1 ) −T{V b k 1} k 0 ∆k # " T{V1 } 0 , k = 2, . . . , N = −1 ∗ b ∆k 0 ∆k T{Vi }i=2,...,k ∆k − ∆k bk T{V 1} k " 98 6 Polynomial J-spectral factorization −1 ∗ It is not difficult to see that the (i, j)-th block of ∆k T{Vi }i=2,...,k ∆k − ∆k bk T{V b ∆k equals the 1} k (i, j)-th block of T{Vi0 }2≤i≤k , i, j = 1, . . . , k and k = 2, . . . , N . Since the above equality also holds for k = 2, we have shown that T{V20 } is nonsingular. Hence we can define a representation R̂2 of the MPUM for V20 eλ2 t and V20⊥Σ e−λ̄2 t , similar to that in equation (6.7). Define R̂ = R̂2 R̂1 . Thus, R̂ is a Σ-unitary representation of the MPUM for Vi eλi t ∪ Vi e−λ̄i t , i = 1, 2 and further, every principal block minor of T{Vi0 }2≤i≤N is nonzero. An inductive argument completes the proof. The proof of Theorem 6.2.2 implies the following result. Corollary 6.2.3 If every principal block minor of the Pick matrix (6.6) is nonsingular, then the Pick matrix associated with the k-th error subspace of the modeling procedure in the proof of Theorem 6.2.2 is also nonsingular. Remark 6.2.4 We show with a counterexample that the converse implication of Theorem 6.2.2, i.e. that the existence of a Σ-unitary model implies that every principal block minor of the Pick matrix is nonzero, does not hold true. Let w = 2 and " # 1 0 Σ= 0 −1 h iT and associated with λ ∈ R+ . Observe Consider the subspace V generated by v := 1 1 that such subspace is associated with a singular Pick matrix, since v is self-orthogonal in the indefinite inner product induced by Σ. A model for Veλt + V ⊥Σ e−λt is " # 1 2 1 2 2 ξ − λ ξ 2 2 1 2 1 2 ξ ξ − λ2 2 2 which is easily seen is Σ-unitary. 6.3 J-spectral factorization via Σ-unitary modeling In this section we illustrate the application of the ideas of the previous section to the problem of computing a J-spectral factor of a para-Hermitian matrix Z ∈ Rw×w [ξ]. The main result of this section is an algorithm for J-spectral factorization based on Σ-unitary modeling. In the process of deriving this procedure we also prove some results of independent interest. In this section a crucial role will be played by the association of a two-variable polynomial matrix to a one-variable polynomial matrix, a technique introduced as lifting in [97] (see [58, 97] for examples of applications of this idea). In the following, we associate with a para-Hermitian matrix Z ∈ Rw×w [ξ] a matrix Φ ∈ Rw×w [ζ, η] such that Φ(−ξ, ξ) = Z(ξ). The following result (see Lemma 3.1 of [97]) characterizes the matrices Φ(ζ, η) satisfying this condition. P Proposition 6.3.1 A symmetric matrix Φ(ζ, η) = Li,j=0 Φi,j ζ i η j satisfies Φ(−ξ, ξ) = Z(ξ) =: M X Zk ξ k k=0 if and only if Zk = Φ0,k − Φ1,k−1 + Φ2,k−2 − . . . + (−1)k Φk,0 for all k = 0, . . . , M . 6.3. 
J-spectral factorization via Σ-unitary modeling 99 Observe that for example the matrix Φ(ζ, η) := 12 (Z(ζ)T + Z(η)) satisfies the condition of Proposition 6.3.1; see also formula (3.3) in [97] for another example. We now proceed to prove some important results concerning the properties of the twovariable matrix Φ that can be associated to a para-Hermitian Z. Theorem 6.3.2 Let Z ∈ Rw×w [ξ] be para-Hermitian. Then there exists Φ ∈ Rw×w s [ζ, η] such that 1. Φ(−ξ, ξ) = Z(ξ); 2. Φ admits a canonical factorization " #" # h i Σ Σ D(η) 1 2 Φ(ζ, η) = D(ζ)T N (ζ)T T Σ2 Σ3 N (η) # " Σ1 Σ2 with Σ = ∈ Rw+q and Σ1 ∈ Rw×w , D ∈ Rw×w [ξ] nonsingular, N ∈ Rq×w [ξ], such ΣT2 Σ3 that N D −1 is strictly proper. If a symmetric canonical factorization of Φ(ζ, η) satisfies condition (2) in the Theorem statement, the factorization is called a “strictly proper symmetric canonical factorization” of Φ. Proof: Using Proposition 6.3.1 we first find a Ω(ζ, η) = K(ζ)T ΣΩ K(η) such that Ω(−ξ, ξ) = Z(ξ). From Theorem 3.3.22 p. 90 of [75] we can conclude the existence of an input-output partition of the external variables of Im K( dtd ) with an associated proper (but not necessarily strictly proper) transfer function. Consequently, without loss of generality we can assume that " # D K= N with D ∈ Rw×w [ξ] nonsingular, N ∈ Rq×w [ξ], such that N D −1 is proper, but not necessarily strictly proper. Now compute a unimodular matrix U ∈ Rw×w [ξ] such that DU is column proper. Observe that K 0 := KU = col(D 0 , N 0 ) is also associated with a proper transfer function, since N 0 D 0−1 = (N U )(DU )−1 = N D −1 . Now observe that since N 0 D 0−1 is proper and since D 0 is column proper, it follows that each column of N 0 has degree less than or equal to that of the corresponding column of D 0 . We can conclude that there exists a constant matrix X ∈ Rq×w such that N 0 = XD 0 + N 00 with N 00 D 0−1 strictly proper. Now observe that Z(ξ) = U (−ξ)−T K 00 (−ξ)T Σ0 K 00 (ξ)U (ξ)−1 where K 00 Σ0 # " # Iw 0w×q D0 := KU = −X Iq N 00 " #−T " #−1 Iw 0w×q Iw 0w×q := ΣΩ −X Iq −X Iq " # " # Iw X T Iw 0w×q = ΣΩ 0q×w Iq X Iq " 100 6 Polynomial J-spectral factorization Observe that K 00 (λ) is full column rank for all λ ∈ C. The existence of the matrix Φ of the claim follows taking Σ = Σ0 , D = D 0 U (ξ)−1 , N = N 00 U (ξ)−1 . We proceed to show an important application of the result of Theorem 6.3.2, namely that if the inertia of Z(iω) is constant for all ω ∈ R, then it equals the inertia of the sub-matrix of Σ corresponding to the input variables. Theorem 6.3.3 Let Z ∈ Rw×w [ξ] be para-Hermitian, and assume that σ(Z(iω)) is constant for all ω ∈ R. Consider Φ(ζ, η) such that Φ(−ξ, ξ) = Z(ξ). Let Φ(ζ, η) = K T (ζ)ΣK(η) be a strictly proper symmetric canonical factorization of Φ (Theorem 6.3.2). Then: σ(Z(iω)) = σ(Σ1 ) for all ω ∈ R, where Σ1 ∈ Rw×w is the (1, 1)-block of Σ. Proof: Consider the strictly proper symmetric canonical factorization of Φ obtained as in Theorem 6.3.2. 
Now observe that " #" # i Σ Σ h D(iω) 1 2 Z(iω) = D(−iω)T N (−iω)T ΣT2 Σ3 N (iω) = D(−iω)T Σ1 D(iω) + D(−iω)T Σ2 N (iω) +N (−iω)T ΣT2 D(iω) + N (−iω)T Σ3 N (iω) Now multiply both sides of this equality by D(−iω)−T on the left and D(iω)−1 on the right, obtaining D(−iω)−T Z(iω)D(iω)−1 = Σ1 + Σ2 N (iω)D(iω)−1 + D(−iω)−T N (−iω)T ΣT2 +D(−iω)−T N (−iω)T Σ3 N (iω)D(iω)−1 Taking the limit for ω to infinity, we conclude from the strict properness of N D −1 that lim D(−iω)−T Z(iω)D(iω)−1 = Σ1 ω→∞ On the other hand, by assumption σ(Z(iω)) is constant, and consequently the claim of the Proposition is proved. The result of Theorem 6.3.3 implies that if Z(iω) has constant inertia for all ω ∈ R, one may infer this inertia from the inertia of Σ1 . Of course, Σ1 is not unique. However, the claim of Theorem 6.3.3 is that any input-output “partition” that satisfies properties listed in Proposition 6.3.2 can be used to infer the inertia of Z(ξ). Observe that this test involves only standard polynomial- and numerical matrix computations and as a consequence is easier to carry out than that on the inertia of Z(iω) for a general ω ∈ R. We now prove that the Σ-unitary model (6.7) used in the proof of Theorem 6.2.2 maps “strictly proper” image representations in “strictly proper” ones, in the following sense. Lemma 6.3.4 Let col(G, F ) ∈ R(w+q)×w [ξ] be such that F G−1 is strictly proper. Let R̂ be a Σ-unitary model for Veλt , as in (6.7). Then R̂ · col(G, F ) also represents a strictly proper transfer function, i.e. " # " # G Ĝ R̂ = F F̂ is such that F̂ Ĝ−1 is strictly proper. 6.3. J-spectral factorization via Σ-unitary modeling 101 Proof: Recall from equation (6.7) that R̂ is: −1 ∗ R̂(ξ) := (ξ + λ̄)I(w+q) − V T{V} V Σ where V ∈ R(w+q)×dim(V) is a full-column rank matrix such that Im(V ) = V. Now partition " # " # V1 Σ1 Σ2 V = and Σ = V2 ΣT2 Σ3 compatibly with w and q, and write " # −1 −1 (ξ + λ̄)Iw − V1 T{V} (V1∗ Σ1 + V2∗ ΣT2 ) −V1 T{V} (V1∗ Σ2 + V2∗ Σ3 ) R̂(ξ) = −1 −1 −V2 T{V} (V1∗ Σ1 + V2∗ ΣT2 ) (ξ + λ̄)Iq − V2 T{V} (V1∗ Σ2 + V2∗ Σ3 ) # " D∼ N ∼ =: Q −P Observe that D and P are nonsingular matrices, since their determinant has degree w, respectively q. Observe that the transfer function associated with R̂ · col(G, F ) is (QG − P F )(D ∼ G + N ∼ F )−1 , which can be rewritten as P (P −1 Q − F G−1 )GG−1 (I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1 = P (P −1 Q − F G−1 )(I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1 −1 Since D ∼ = (ξ + λ̄)Iw − V1 T{V} (V1∗ Σ1 + V2∗ ΣT2 ) is column proper, it is easy to see that (D ∼ )−1 is strictly proper. Observe also that the matrix (I + (N D −1 )∼ F G−1 ) is bi-proper because as ξ → ∞, (I + (N D −1 )∼ F G−1 ) → I. Therefore (I + (N D −1 )∼ F G−1 )−1 is also bi-proper. Hence we conclude that every entry of (I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1 is a rational function with the denominator having degree at least equal to the degree of the numerator plus one. Now observe that P −1 Q is strictly proper, and consequently P −1 Q−F G−1 is a matrix of strictly proper rational functions. Conclude that (P −1 Q − F G−1 )(I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1 is also a matrix of strictly proper rational functions. −1 (V1∗ Σ2 + V2∗ Σ3 ) is a polynomial of Now observe that every entry of P = −(ξ + λ̄)Iw + V2 T{V} degree at most one. Conclude that P (P −1 Q + F G−1 )(I + (N D −1 )F G−1 )−1 (D ∼ )−1 is a matrix of strictly proper rational functions, as was to be proved. Having proved these important preliminary results, we can now state the main result of this chapter. 
The statement makes use of the notion of observability of a QDF (Definition 1.2.12) and of the McMillan degree of a behavior (Section 2.8). Recall that McMillan degree of a behavior B is denoted with n(B). 102 6 Polynomial J-spectral factorization Theorem 6.3.5 Consider Z(ξ) ∈ Rw×w having constant inertia on the imaginary axis. Let Φ(ζ, η) be a w × w matrix such that Φ(−ξ, ξ) = Z(ξ). Let Φ(ζ, η) = K T (ζ)ΣK(η) be a strictly proper symmetric canonical factorization of Φ(ζ, η) (Theorem 6.3.2). Assume that Q Φ is observable and that Φ(−ξ, ξ) is semi-simple. Let λi , i = 1, . . . , N be the singularities of det(Z) in C+ , and assume that n(Im(K( dtd ))) = N . Assume that every principal block minor of the matrix [K(λi )∗ ΣK(λj )/(λ̄i + λj )]N i,j=1 is nonzero. Define K1 := K, and consider the recursion for i = 1, . . . , N : 1. Vi := full column-rank matrix such that Im(Vi ) = Im(Ki (λi )); −1 2. R̂i (ξ) := (ξ + λ̄i )I(w+q) − Vi T{V V ∗ Σ; i} i 3. Ki+1 (ξ) := R̂i (ξ)Ki (ξ) ; ξ−λi Then the matrix KN +1 is such that KN +1 = col(GN +1 , 0), with GN +1 ∈ Rw×w [ξ] a Hurwitz Σ1 -spectral factor of Z(ξ), i.e. Z(ξ) = Φ(−ξ, ξ) = GN +1 (−ξ)T Σ1 GN +1 (ξ) Proof: It has been stated in Corollary 6.2.3 that since every principal block minor of the matrix [K(λi )∗ ΣK(λj )/(λ̄i + λj )]N i,j=1 is nonzero, the Pick matrix associated with the error subspace of the model R̂i at the i-th iteration is nonsingular. This implies that the one-step model R̂i can be defined at each iteration of step 2 of the above recursion. Now observe that since at the i-th step the matrix R̂i is a kernel representation of Im(Ki (λi )), in other words R̂i (λi )Ki (λi ) = 0, it must necessarily hold that (ξ − λi ) is a factor of R̂i (ξ)Ki (ξ). i (ξ) = Ki+1 (ξ) is polynomial. This implies that the matrix R̂i (ξ)K ξ−λi It follows from the result of Lemma 6.3.4 that the model R̂i also preserves the strict properi (ξ) ness of Ki+1 in the step Ki (ξ) → R̂i (ξ)K = Ki+1 (ξ) of the algorithm. ξ−λi We now prove that the Σ-unitariness of R̂i also implies that Ki∼ ΣKi = Φ(−ξ, ξ) = Z(ξ) for all i = 1, . . . , N +1. The claim is true by assumption for i = 1. Note that for i = 2, . . . , N +1: ∼ ∼ Ki−1 R̂i−1 R̂i−1 Ki−1 Σ −ξ − λ̄i−1 ξ − λi−1 ∼ Ki−1 Ki−1 = (−ξ − λ̄i−1 )(ξ − λi−1 )Σ ξ − λi−1 −ξ − λ̄i−1 ∼ = Ki−1 ΣKi−1 Ki∼ ΣKi = = Z(ξ) because of inductive assumption Now denote Ki := col(Gi , Fi ), i = 1, . . . , N + 1, with Gi ∈ Cw×w [ξ] and Fi ∈ Cq×w [ξ]. We prove by induction that the “denominator” Gi of Ki is nonsingular for i = 1, . . . , N + 1. The statement is true by assumption for i = 1. Assume now that the claim holds true for i < j, and partition the (j − 1)-th model R̂j−1 as " # ∼ ∼ Dj−1 Nj−1 R̂j−1 (ξ) = Qj−1 −Pj−1 6.3. J-spectral factorization via Σ-unitary modeling 103 with Dj−1 ∈ Rw×w [ξ], Nj−1 ∈ Rq×w [ξ], Qj−1 ∈ Rq×w [ξ], Pj−1 ∈ Rq×q [ξ] defined as in the proof of Lemma 6.3.4. We prove the claim for i = j. Observe that ∼ ∼ Dj−1 Gj−1 + Nj−1 Fj−1 ξ − λj−1 −1 ∼ ∼ Dj−1 [I + (Nj−1 Dj−1 ) Fj−1 G−1 j−1 ]Gj−1 = (ξ − λj−1 ) Gj = (6.8) ∼ Observe that Dj−1 is nonsingular by construction; that Gj−1 is nonsingular by inductive as−1 ∼ sumption; and that [I + (Nj−1 Dj−1 ) Fj−1 G−1 j−1 ] is also nonsingular due to the strict-properness −1 −1 of Nj−1 Dj−1 and Fj−1 Gj−1 . Conclude that Gj is also nonsingular as was to be proved. We now prove that deg(det(Gi )) = deg(det(Gi−1 )), i = 2 . . . , N + 1. 
From (6.8) it follows that ∼ det(Di−1 ) det(Gi ) −1 ∼ = det([I + (Ni−1 Di−1 ) Fi−1 G−1 i−1 ]) det(Gi−1 ) det((ξ − λi−1 )Iw ) ∼ ∼ Observe that since Di−1 = (ξ+λ̄)Iw +constant, it follows that deg(det(Di−1 )) = w = deg(det((ξ− ∼ det(Di−1 ) λi−1 )Iw ), and consequently that det((ξ−λi−1 )Im ) is a proper, but not strictly-proper, transfer −1 ∼ function. Moreover, it follows from the strict-properness of Fi−1 G−1 i−1 and of (Ni−1 Di−1 ) −1 ∼ that I + (Ni−1 Di−1 ) Fi−1 G−1 i−1 is a matrix of bi-proper rational functions, and consequently that its determinant is a proper, but not strictly-proper, rational function. Conclude that det(Gi ) is a proper, but not strictly-proper, rational function. This concludes the proof of det(Gi−1 ) deg(det(Gi )) = deg(det(Gi−1 )), i = 2, . . . , N + 1. For each i, i = 1, . . . N + 1, let Ki = Ki0 Ui , with Ki0 ∈ C(w+q)×w [ξ] right prime, and Ui ∈ Cw×w [ξ] a greatest common right divisor of Ki . Partition Ki0 compatibly with w, q as Ki0 = col(G0i , Fi0 ). We now show that deg(det(G0i+1 )) < deg(det(G0i )), i = 1, . . . , N , i.e. that the degree of the determinant of the “denominator” G0i associated with the transfer function of Ki0 decreases with i, i = 1, . . . , N . Observe that since we have already proved above that deg(det(Gi )) = deg(det(Gi−1 )),i = 2, . . . , N + 1, it follows that deg(det(G0i Ui )) = deg(det(G0i ) det(Ui )) = deg(det(G0i )) + deg(det(Ui )) = deg(det(G0i−1 )) + deg(det(Ui−1 )), Hence, to prove the claim, it is equivalent to prove that deg(det(Ui )) > deg(det(Ui−1 )), i = 2, . . . N + 1. In order to prove the above statement, observe first that since det(Φ(−λ̄i , λi )) = 0, it follows that there exist vij ∈ Cw , j = 1, . . . , dim(ker(Φ(−λ̄i , λi ))), such that vij∗ Φ(−λ̄i , λi ) = vij∗ K ? (−λ̄i )ΣK(λi ) = (K(−λ̄i )vij )∗ ΣK(λi ) = 0 Given that Φ(−ξ, ξ) is semi-simple by assumption, it follows that Im(K(−λ̄i )) contains exactly dim(ker(Φ(−λ̄i , λi ))) vectors which are Σ-orthogonal to Im(K(λi )). Since R̂i also models ¯ (Im(Ki (λi )))⊥Σ e−λi t , it follows that R̂i (−λ̄i )Ki (−λ̄i )vij = 0, j = 1, . . . , dim(ker(Φ(−λ̄i , λi ))). Hence, every greatest common right divisor of Ki has exactly dim(ker(Φ(−λ̄i , λi )) singularities in −λ̄i . Hence it follows that deg(det(Ui )) = deg(det(Ui−1 ))+dim(ker(Φ(−λ̄i , λi ))) which shows that deg(det(G0i )) < deg(det(G0i−1 )), i = 2, . . . , N + 1. The claim just proved also shows that deg(det(G0N +1 )) = 0, i.e. that G0N +1 is unimodular, since by the semi-simplicity assumption 104 6 Polynomial J-spectral factorization and the fact that N equals the McMillan degree of Im(K( dtd )) = deg(det(G1 )), the number of P PN singularities of Φ(−ξ, ξ) is exactly N i=1 dim(ker(Φ(−λ̄i , λi ))) = i=1 deg(det(Ui )). −1 0 0 0 Since deg(det(GN +1 )) = 0 and GN +1 FN +1 is strictly proper, it must be that FN0 +1 = 0. Hence we conclude that FN +1 = 0. This proves the first statement of the Theorem. Moreover, because Ki∼ ΣKi = Φ(−ξ, ξ) = Z(ξ) for i = 1, . . . , N + 1: " # h i GN +1 (ξ) KN∼+1 ΣKN +1 = G∼ N +1 0 Σ 0 = G∼ N +1 Σ1 GN +1 = Z(ξ) Since GN +1 = G0N +1 UN +1 with G0N +1 unimodular and UN +1 Hurwitz, it follows that GN +1 is a spectral factor, as was to be proved. We now examine the case when the matrix K coming from the symmetric canonical factorization of Φ in Theorem 6.3.5 is unobservable, i.e. K = K 0 U with U ∈ Rw×w [ξ] such that dim(ker(U (λ))) > 0 for some λ ∈ C, and K 0 ∈ R(q+w)×w [ξ] right prime. 
Proposition 6.3.6 Let Φ(ζ, η) = K T (ζ)ΣK(η) be a strictly proper symmetric canonical factorization, and assume that Φ(−ξ, ξ) is semi-simple. Assume that K ∈ R(w+q)×w [ξ] is such that there exists λ ∈ C not purely imaginary, such that dim(ker(K(λ))) > 0. Let R̂ be a Σ-unitary kernel representation of the model for the dualized data induced by Im(K(λ)). Then K 0 (ξ) := R̂(ξ)K(ξ) ξ−λ is such that dim(ker(K 0 (−λ̄))) = dim(ker(K(λ))). Moreover, Φ0 (ζ, η) := K 0 (ζ)T ΣK 0 (η) is such that Φ0 (−ξ, ξ) = Φ(−ξ, ξ). Proof: It has been argued in the proof of Theorem 6.3.5 that if R̂ is a Σ-unitary kernel representation of the model for the dualized data induced by Im(K(λ)), it holds that K 0 (ξ) = R̂(ξ)K(ξ) is polynomial. Moreover, it follows from the Σ-unitariness of R̂ that ξ−λ K 0∼ ΣK 0 = K ∼ R̂∼ R̂K Σ = K ∼ ΣK −ξ − λ̄ ξ − λ We now prove the claim dim(ker(K 0 (−λ̄))) = dim(ker(K(λ))). In order to prove this claim, observe that if v ∈ Cw is such that K(λ)v = 0 then also Φ(−λ̄, λ)v = 0. Since Φ(−ξ, ξ) is para-Hermitian, we conclude that v ∗ Φ(−λ̄, λ) = v ∗ K ? (−λ̄)ΣK(λ) = (K(−λ̄)v)∗ ΣK(λ) = 0 which implies that K(−λ̄)v ∈ Im(K(λ))⊥Σ . Recall that ker(R̂( dtd )) contains Im(K(λ))⊥Σ e−λ̄t , hence R̂(−λ̄)K(−λ̄)v = 0. It follows from this and from R̂(ξ)K(ξ) = K 0 (ξ)(ξ − λ) that (−λ̄ − λ)K 0 (−λ̄)v = R̂(−λ̄)K(−λ̄)v = 0 6.3. J-spectral factorization via Σ-unitary modeling 105 Since λ is not purely imaginary, it follows that (−λ̄ − λ) 6= 0, and consequently that K 0 (−λ̄)v = 0. This holds for every v ∈ ker(K(λ)); the claim is proved. The result of Proposition 6.3.6 shows that if we are given a canonical symmetric factorization of Φ with a K which has a singularity in λ ∈ C+ , we can compute from it a new two-variable polynomial matrix Φ0 such that Φ0 (−ξ, ξ) = Z(ξ), and whose canonical factor K 0 has a singularity in −λ̄ ∈ C− . Observe that this is particularly relevant in the application of the recursions (1) − (3) of Theorem 6.3.5, since the spectral factor produced at the end of the algorithm is ensured to be Hurwitz in this case. The result of Theorem 6.3.2 and Proposition 6.3.6, and the recursion of Theorem 6.3.5 suggest the following algorithm in order to compute a J-spectral factor of a para-Hermitian matrix Z. Algorithm Input: Semi-simple para-Hermitian matrix Z ∈ Rw×w [ξ] with constant inertia on the imaginary axis. Output: A Σ1 -spectral factorization Z = D ∼ Σ1 D. Compute Φ ∈ Rw×w [ζ, η] s.t. Φ(−ξ, ξ) = Z(ξ) (* Use Proposition 6.3.1 *) Compute a strictly proper canonical factorization: Φ(ζ, η) = K(ζ) T ΣK(η); Apply Proposition 6.3.6 in order to remove any singularities of K in the right half-plane; (* Comment: Φ resulting from previous step satisfies *) (* the assumptions of Theorem 6.3.5 *) Compute the roots λi , i = 1, . . . , N of det(Z) in C+ ; Define K1 := K; For i = 1, . . . , N do Compute full column rank matrix Vi ∈ C(w+q)ו such that Im(Vi ) = Im(Ki (λi )); ∗ Compute R̂i (ξ) := (ξ + λ̄i )I(w+q) − Vi T{−1 Im(K (λ ))} Vi Σ i Define Ki+1 (ξ) := R̂i (ξ)Ki (ξ) ; ξ−λi end; Return the first w rows of KN +1 ; i 106 6.4 6 Polynomial J-spectral factorization Numerical Aspects of the Algorithm Several issues need be investigated for a good practical implementation of the algorithm presented above. The treatment here is far from exhaustive, and is more a summary of the problems we faced during an implementation, and their possible solutions. Our algorithm has the three basic operations: 1. Compute a symmetric canonical factorization that meets conditions in Proposition 6.3.2. 2. 
Compute the spectrum of the given para-Hermitian matrix. 3. Implement the iterations and division by (ξ − λi ). We now examine each of these in some detail and discuss the manner in which we implemented the algorithm. 6.4.1 Symmetric Canonical Factorization The starting step in the Algorithm is a “pre-factorization” of a given para-Hermitian matrix, usP ing a symmetric canonical factorization of a bivariate polynomial matrix. Let Z(ξ) = dk=0 Zk ξ k with Zk ∈ Rw×w and d the least integer such that Zd+1 = Zd+2 = . . . = 0. We call d the degree of Z(ξ). A necessary condition for the existence of a Σ-spectral factorization is that d is even. Note that Z(ξ) can be written as follows: Z(ξ) = h (−ξ) d/2 I (−ξ) d/2−1 I ... I i × (−1)d/2 Zd (−1)d/2 Zd−1 /2 0... d/2−1 d/2−1 d/2−1 Zd−1 /2 (−1) Zd−2 (−1) Zd−3 /2 . . . (−1) d/2−2 d/2−2 0 (−1) Zd−3 /2 (−1) Zd−4 ... .. 0 . ··· Z1 /2 0 ... −Z1 /2 Z0 | {z S:= (6.9) ξ d/2 I d/2−1 ξ I .. . I {z | } K:= Define Σ = (S + S T )/2. With Φ(ζ, η) = K T (ζ)ΣK(η) it is easy to verify that: } 1. Z(ξ) = Φ(−ξ, ξ) = K T (−ξ)ΣK(ξ). 2. Denote the first w rows of K(ξ) as Q(ξ), and the rest as P (ξ). Then P Q−1 is a rational matrix of strictly proper rational functions. 3. If Zd is nonsingular, and Z(ξ) has constant inertia on the imaginary axis, inertia of Z(iω) is the same as that of Zd . (Proposition 6.3.3). We now prove an important implication of the pre-factorization (6.10): Lemma 6.4.1 If Zd is nonsingular, the number of singularities of Z(ξ) in the open right half plane is exactly equal to wd/2, which is also the McMillan degree of ImK( dtd ) 6.4. Numerical Aspects of the Algorithm 107 P Proof: Let Z(ξ) = dk=0 Zk ξ k . If Zd is nonsingular, we can define L(ξ) : Zd−1 Z(ξ) = ξ d Iw + Ld−1 ξ d−1 . . . + L0 with Li = Zd−1 Zi , i = 0 . . . d − 1. Clearly, roots of det Z(ξ) = 0 are the same as roots of det L(ξ) = 0. We define the matrix C ∈ Rwd×wd : 0 I 0 0 0 0 I ... C= . . 0 0 I . −L0 −L1 . . . −Ld−1 It is shown in [29] that det L(ξ) = det(ξIdw − C). Hence, roots of det L(ξ) (and consequently those of det Z(ξ)) are precisely the eigenvalues of C, and therefore dw in number. Since Z(ξ) is para-Hermitian with no singularities on the imaginary axis, the number of singularities in the open right half plane is wd/2. Since P (ξ)Q−1 (ξ) is a matrix of strictly proper rational functions, the McMillan degree of ImK( dtd ) = deg det ξ d/2 Iw , which is precisely wd/2. Thus, many “good” things happen if Zd is nonsingular. Such polynomial matrices are called regular: Definition 6.4.2 A polynomial matrix Z(ξ) = regular polynomial matrix. Pd k=0 Zk ξ k with Zd nonsingular is called a Using the prefactorization of Z in the “triple banded form” as shown in equation (6.10), we can start the algorithm in the regular case. When Z is not regular, we say that Z has singularities at infinity. An important special case is when Z is unimodular, i.e.,det Z = constant, 6= 0, when all singularities of Z are said to be at infinity. We use a method due to Aliev and Larin [1] to convert a polynomial matrix that is not regular, into one that is. The regular polynomial matrix so obtained can then be pre-factored as in equation (6.10) and the algorithm started. Algorithm for converting a non-regular para-Hermitian matrix into a regular para-Hermitian matrix. Input Z(ξ) ∈ Rw×w [ξ] which is not regular. Output Y (ξ) ∈ Rw×w [ξ] which is regular. Initialize i = 1 and Y (ξ) = Z(ξ) 1. While Y (ξ) ∈ Rw×w [ξ] is not regular, repeat steps 2-5: P 2. Write Y (ξ) = di=0 Yi ξ i . 
Since Yd is symmetric and singular, there exist Ui , Λi ∈ Rw×w such that Yd = Ui Λi UiT with Λi = diag [λ1 , . . . , λk , 0 . . . , 0], k < w and Ui UiT = Iw . 3. Define Y 0 (ξ) = UiT Y Ui = [yij ] i, j = 1, . . . , w with yij polynomials in ξ having degree δij . Consider the k + 1-th row of Y 0 . Let 2δ0 = δk+1 k+1 be the degree of the polynomial at the diagonal position in this row. Let δm be maxi6=k+1 δk+1,i be the maximum degree of the off-diagonal polynomial in this row. Define δ1 = min(d − δ0 , 2d − δm ). 4. Define Ti (ξ) = diag [1, . . . 1, (ξ + 1)δ1 , 1 . . . 1] with (s + 1)δ1 being at the k + 1-th position. 108 6 Polynomial J-spectral factorization 5. Update: Y (ξ) ← TiT (−ξ)UiT Y Ui Ti (ξ). i ← i + 1. Since at every step in the above procedure, the degree of Y remains the same (i.e. remains d), the recursion in step 5 above is guaranteed to terminate after a finite number of steps. One can obtain a Σ-spectral factorization of Y (ξ) using our algorithm: Y (ξ) = F T (−ξ)ΣF (ξ). A (Hurwitz) spectral factor of Z can then be easily computed from a (Hurwitz) spectral factor of Q Y , and is seen to be F (ξ)X −1 (ξ) where X(ξ) = i Ui Ti taken in the sense of a left multiplication. Notice that explicit inversion of X(ξ) is not necessary since: X −1 (ξ) = adjX(ξ) det X(ξ) The adjoint matrix is easily calculated by noting that with Ti (ξ) = diag [1, . . . 1, (ξ +1)δ1 , 1 . . . 1], adjoint Ti (ξ) is simply diag [(s + 1)δ1 , . . . (s + 1)δ1 , 1, (s + 1)δ1 . . . (s + 1)δ1 ]. Further, det X(ξ) is of the form (s + 1)δ , for some δ since det Ti = (s + 1)δ1 , and det Ui = 1. 6.4.2 Computing singularities Singularities of the para-Hermitian matrix Z(ξ) are by definition roots of det Z(ξ) = 0. However, this naive approach is not preferred due to the following numerical reasons: 1. The complexity of determinant computation of a w × w matrix is O(w!) and therefore, it is prohibitive for large w. 2. Especially in the context of polynomial matrices, determinant computation results in a large number of spurious terms. These terms are floating point errors accumulated due to the phenomenally large number of computations. 3. Even if the determinant could be computed as a polynomial, computing spectra is not straightforward: root computation programs that come with most general purpose computational packages (for example “Scilab”) are unable to compute roots of a polynomial of degree greater than 100. These considerations motivate the use of some other “equivalent” methods for computation of singularities that are better from a numerical viewpoint. It is known that the roots of a (scalar) polynomial are the eigenvalues of a certain matrix known as a “companion matrix”[22]. A generalization of this well-known result is that the singularities of a polynomial matrix are the generalized eigenvalues of a “block companion matrix”. P Proposition 6.4.3 Let Z(ξ) = dk=0 Zk ξ k ∈ Rw×w [ξ]. Then the finite singularities of Z(ξ) (i.e. the roots of det Z(ξ) = 0) are the finite generalized eigenvalues of the matrix pencil ξA − B where A ∈ Rdw×dw , B ∈ Rdw×dw with I 0 0 0 0 I 0 0 0 I ... 0 0 0 I ... A= . and B = .. . 0 I 0 0 0 I . . 0 ... 0 Zd −Z0 −Z1 . . . −Zd−1 6.4. Numerical Aspects of the Algorithm Proof: This result is well known. See [49] for a proof. 109 Computation of eigenvalues is a highly developed and specialized field. Some methods of computing eigenvalues have been summarized in [14] which gives “templates” for computing eigenvalues depending on the structure of the problem in hand. 
Note that the matrices A, B are sparse and we need only half (i.e positive) generalized eigenvalues of the matrix pencil. We used the package ARPACK (http:// www.caam.rice.edu/software/ARPACK/) to compute the positive singularities of Z. 6.4.3 Implementation of iterations There are two main numerical issues in this step 1. Computing the Σ-unitary model 2. Polynomial division by (ξ − λ). The first step involves a computation of the following nature R̂(ξ) = (ξ + λ̄i )I − Vi T −1 Vi∗ Σ where T = (Vi∗ ΣVi )/(λi + λ̄i ) is the i-th stage Pick matrix. Define T 0 = Vi∗ ΣVi . Then, T −1 = (λi + λ̄i )T 0−1 . Explicit inversion of the Pick matrix is not preferred because it could lead to a very sensitive solution if T is badly conditioned, and secondly, it is heavy in terms of the number of computations and the memory required. Therefore, we do the following: we solve a linear system of equations T 0 x = b where b = Vi∗ Σ. Then, R̂(ξ) = (ξ + λ̄i )I − (λi + λ̄i )Vi x is the Σ-unitary model. Implementing the Σ-unitary model by solving an associated linear equation is computationally more robust and efficient than an explicit implementation. We now come to the last issue namely division by (ξ−λ). In every iteration in the Algorithm, we get matrices Ki (ξ) in which (ξ − λi ) is a factor. This division is implemented as follows: let P Ki (ξ) = di=0 Ci ξ i with Ci “tall” complex matrices. Ki (ξ)/(ξ − λi ) can be computed by doing a long division with the blocks Ci directly, rather than doing element-by-element long division. The first term in the quotient is easily seen to be Cd ξ d−1 . The second term is (Cd−1 + λCd )ξ d−2 . By induction, the i-th term is (Cd−i+1 + λCd−i+2 + λ2 Cd−i+3 + . . . + λi−1 Cd )ξ d−i . The quotient is explicit and can be easily generated by a recursive loop. This division is by far the most crucial element in the algorithm. In-exact division can be a source of problems. We implemented the division in double precision and found the division to be satisfactory. In the next section, we report some facts about an implementation of the (entire) algorithm. 6.4.4 Computer implementation of polynomial J-spectral factorization Scilab is a computer algebra package developed by researchers at INRIA, France. It is distributed under a license similar to the GNU general public license and is freely available through 110 6 Polynomial J-spectral factorization the Internet (http://www.scilabsoft.inria.fr). Scilab has good support for polynomial computations. We implemented the algorithm presented in this Chapter in Scilab. We need the concept of “element wise error” in order to evaluate the performance of the algorithm. Suppose Z(ξ) ∈ Rw×w [ξ] admits a Σ-spectral factorization given by F T (−ξ)ΣF (ξ). We define: E(ξ) = Z(ξ) − F T (−ξ)ΣF (ξ) Due to numerical errors, E(ξ) will in general be nonzero. One can express E(ξ) as E(ξ) = 1 × 10 −n d X Ei ξ i i=0 with Ei ∈ Rw×w and n is the least integer such that all coefficients of every polynomial in Pd i i=0 Ei ξ have absolute values less than 10, i.e., every element in E i is less than ±a · bcd... with a an integer between 1 to 9. Then, by the “element wise error” we mean 1 × 10−n . Below, we summarize summarize some numerical results. The matrices considered have a fixed degree and randomly selected coefficients. The performance results are for a Pentium-IV PC (256 MB RAM) running Scilab-3.1.1 on a Fedora Core-2 linux. 
Sr size of Z 1 10 × 10 2 20 × 20 3 40 × 40 4 20 × 20 5 50 × 50 degree 4 4 4 8 6 time taken (sec) 0.261 2.193 21.807 15.19 28.1 Element wise error 1 × 10−8 1 × 10−7 1 × 10−5 1 × 10−4 1 × 10−3 Thus we see that while the algorithm works quite well for a matrices of reasonable complexity (having 80-100 singularities), the performance deteriorates with the number of singularities. The possible causes for this deterioration can be: 1. The algorithm requires exact computation of singularities– an error in computing the singularities reflects as an error in the final spectral factor. This however is a fundamental problem in many algorithms that depend on computation of singularities of polynomial matrices. A through study of available eigenvalue computation routines is being undertaken to evaluate the “best” routine for computing singularities of a para-Hermitian matrix. 2. Division by a (ξ − λ) factor in every step in the algorithm could lead to errors, because this division may not be exact. One way to get around this problem can be the following: if all singularities are distinct, carry out the algorithm without dividing by (ξ − λi ). After Q the last iteration, divide by N i=1 (ξ − λi ). The accumulation of all factors till the end reduces the numerical errors to some extent. 6.5 Examples In this section we provide four examples of the application of our results to J-spectral factorization. 6.5. Examples 111 Example 6.5.1 Consider " 1 − ξ2 ξ Z(ξ) = −ξ 1 − ξ2 # Note that Z(iω) has inertia (2, 0, 0) for all ω ∈ R. Pre-factor Z(ξ) as T ξ 0 1 0 0 −0.5 −ξ 0 0 −ξ 0 1 0.5 0 0 ξ Z(ξ) = 1 0.5 1 0 1 0 0 0 0 1 −0.5 0 0 1 0 1 Observe that the transfer function associated with the choice of the first two variables of Im(K( dtd )) as inputs, is strictly proper. Observe also that Σ1 = I2 , and that n(Im(K1 ( dtd ))) = 1 deg(det(Z)) = 2. Observe also that the singularities of Z(ξ) in C+ are 0.86 ± i0.5, and that 2 since the inertia of Σ1 , the (1, 1)-block of the matrix Σ, is (2, 0, 0), there exists a I2 -spectral factorization of Z. It can be easily verified that every principal block minor of the Pick matrix associated with the data is nonzero. Consequently the Σ-unitary model (6.7) can be constructed at every step. We initialize ξ 0 0 ξ K1 (ξ) = , 1 0 0 1 and we compute K1 (0.86 + i0.5). It is not difficult to see that K1 (0.86 + i0.5) has full column rank, and consequently we can define: 0.86 + i0.5 0 0 0.86 + i0.5 V1 = K1 (0.86 + i0.5) = 1 0 0 1 In step i = 1, we proceed to compute a model R̂1 (ξ) as in (6.7) for the data pair (0.86 + i0.5, Im(V1 )): −i0.6 + ξ −0.4 −0.8 − i0.34 0.34 + i0.2 0.4 −i0.6 + ξ −0.34 − i0.2 −0.8 − i0.34 −0.8 + i0.34 −0.34 + i0.2 −i0.4 + ξ 0.4 0.34 − i0.2 −0.8 + i0.34 −0.4 −i0.4 + ξ This model yields K2 (ξ) = R̂1 (ξ)K1 (ξ)/(ξ − 0.86 − i0.5). It is easy to verify that K2 (0.86 − i0.5) is also full column rank, and consequently we can define V2 = K2 (0.86 − i0.5). In step i = 2, we compute a model R̂2 (ξ) as in (6.7) for the data pair (0.86 − i0.5, Im(V2 )): −0.86 + i0.6 + ξ 0.1 −0.2 − i0.34 −0.34 + i0.2 −0.1 −0.86 + i0.6 + ξ 0.34 − i0.2 −0.2 − i0.34 −0.2 + i0.34 −0.51 + i0.2 0.86 + i0.4 + ξ −0.1 0.51 − i0.2 −0.2 + i0.34 0.1 0.86 + i0.4 + ξ 112 6 Polynomial J-spectral factorization We then define 0.86 + ξ −0.5 0.5 R̂2 (ξ)K2 (ξ) 0.86 + ξ KN +1 (ξ) = K3 (ξ) = = ξ − 0.86 + i0.5 0 0 0 0 As stated in Theorem 6.3.5, the matrix KN +1 has the last q rows equal to zero. Observe that " # 0.86 + ξ −0.5 GN +1 (ξ) = 0.5 0.86 + ξ has singularities in −0.86 ± i0.5, i.e. GN +1 is Hurwitz. 
Moreover, as stated in the last part of Theorem 6.3.5, GN +1 (−ξ)T GN +1 (ξ) = Z(ξ), i.e. GN +1 is a spectral factor of Z. Example 6.5.2 The purpose of this example is to show how the preprocessing steps sketched in the proof of Theorem 6.3.2 are carried out. We consider an example of a mixed-sensitivity problem from [53], Example 4.4.3, with parameters r = 0, c = 1, and γ = 2. Consider the matrix T 1 −1 + ξ 1 0 0 1 −1 − ξ Z(ξ) = −1 0 0 1 0 −1 0 2 2ξ 0 0 −1 2 −2ξ # " −2 −1 + 3ξ = −1 − 3ξ 1 + 3ξ 2 It can be shown that Z(iω) has inertia (1, 0, 1) for all ω ∈ R. However, it can be readily verified that no transfer function associated with the factorization above is strictly proper. In order to perform the spectral factorization, we consider first the i/o partition with w3 the output, and (w1 , w2 ) inputs. We write " # i −1 −1 − ξ h i N (ξ) = 2 −2ξ = 2 0 + 0 2 −1 0 h and define i h 1 −1 − ξ 1 0 0 M 0 (ξ) := 0 1 0M (ξ) = −1 0 0 2 −2 0 1 {z } | T := Note that the transfer function corresponding to M 0 is strictly proper. Note also that Z(ξ) = M 0 (ξ)T Σ0 M 0 (ξ) where −3 0 −2 Σ0 = T −T ΣT −1 = 0 1 0 −2 0 −1 6.5. Examples 113 We now apply the algorithm to the matrix M 0 . It is easy to verify that M 0 (1) has full column rank. We can choose V1 = M 0 (1) and proceed constructing the Σ0 -unitary model, obtaining −3 + ξ −2 −2 R̂(ξ) = 2 1+ξ 2 2 2 1+ξ and 1 1−ξ R̂(ξ)M (ξ) = −1 −2 ξ−1 0 0 0 Now define " −3 0 Σ1 := 0 1 # the sub-matrix of Σ0 corresponding to the input variables (w1 , w2 ), and " # 1 1−ξ F (ξ) = −1 −2 " # −3 0 Observe that F is Hurwitz, and moreover that Z(ξ) = F (−ξ)T F (ξ). Now assume 0 1 that we choose the i/o partition associated with the variables w1 (output) and (w2 , w3 ) inputs. The corresponding transfer function is not strictly proper, but since it follows that h i h i i 1h 2 −2ξ + 0 −1 1 −1 − ξ = 2 0 −1 1 0 − 21 00 0 1 0 M (ξ) = −1 0 =: M (ξ) 0 0 1 2 −2ξ {z } | T 0 := corresponds to a strictly proper transfer function for the i/o partition chosen. Observe that Z(ξ) = M 00 (−ξ)T Σ00 M 00 (ξ) with Σ00 := T 0−T ΣT 0−1 −T −1 1 0 − 21 1 0 12 1 0 − 21 = 0 1 0 Σ 0 1 0 = 0 1 0 1 0 − 34 0 0 1 0 0 1 2 Now verify that M 00 (λ) is of full column rank construct a Σ00 -unitary model for M 00 (1): ξ 0 R̂ (ξ) = −2 2 for all λ ∈ C. We can consequently proceed to −1 − 12 1+ξ 1 −2 −2 + ξ 114 6 Polynomial J-spectral factorization Observe that 0 0 R̂ (ξ)M (ξ) = −1 −2 ξ−1 2 2 − 2ξ 0 Now define 00 # 1 0 , Σ002 := 0 − 34 " the sub-matrix of Σ00 corresponding to the input variables (w2 , w3 ), and let " −1 −2 F 0 (ξ) := 2 2 − 2ξ # It is a matter of straightforward verification to check that Z(ξ) = F 0 (−ξ)T Σ002 F 0 (ξ) and that F 0 is a Hurwitz matrix. Example 6.5.3 We now consider a case in which the result of Proposition 6.3.6 must be applied, i.e. in which the matrix Z(ξ) is factored in a non-observable form. Let " 2 − 3ξ 2 + ξ 4 1 + ξ Z(ξ) = 1−ξ 1 − ξ2 # Z(ξ) admits the factorization Z(ξ) = " −ξ − ξ 2 0 # 1 0 1+ξ 0 0 −ξ 0 1 0 0 | 0 0 1 0 0 2 0 1 {z Σ 0 ξ − ξ2 0 0 1 1 − ξ 0 1 }| {z K(ξ) 0 ξ 0 1 } Note that the matrix K(ξ) is not right prime, since it loses rank at λ = 1. The roots of det(Z(ξ)) in the right half plane are 1.618034, 0.6180340, 1. We proceed first by modeling the root 1 as in the proof of Proposition 6.3.6. It is easily seen that 0 1 Im(K(1)) =< > 0 1 and that consequently the Pick matrix is 1 and the one-step model for K(1) is 1+ξ 0 0 0 0 ξ −1 −1 R̂1 (ξ) = 0 0 1+ξ 0 0 −1 −1 ξ 6.5. 
Examples 115 Observe that −ξ 2 − ξ 0 R̂1 (ξ)K(ξ) 1 + ξ 1 K 0 (ξ) := = , ξ−1 −1 − ξ 0 1 0 and that K 0 (−ξ)T ΣK 0 (ξ) = Z(ξ). The only singularity of K 0 is −1. This corresponds to a Hurwitz greatest common right divisor and consequently does not impair the final result of the application of our algorithm. Modeling Im(K 0 (1.618034)) yields 0 −2.11803 −0.809017 ξ − 21 0 −1.61803 + ξ 0 0 R̂2 (ξ) = 1 −1.30902 0 0.309017 + ξ −2 1 1 0 1.80902 + ξ 2 2 and −(ξ + 21 )(ξ + 1.618034) 0 R̂2 (ξ)K 0 (ξ) 1 1 + ξ = K2 (ξ) = ξ − 1.618034 0.3090(ξ + 1.618034) 0 0 − 12 (ξ + 1.618034) We now model Im(K2 (0.618034)) with 0 0.118034 −0.190983 ξ − 12 0 −0.618034 + ξ 0 0 0.309017 0 0.58541 + ξ 0.0527864 − 12 0 0.0527864 0.532624 + ξ and define Observe that −(0.618034 + ξ)(1.61803 + ξ) 0 1 1 + ξ K3 (ξ) = 0 0 0 0 " −(0.618034 + ξ)(1.61803 + ξ) 0 F (ξ) := 1 1+ξ is Hurwitz, and that F (−ξ)T I2 F (ξ) = Z(ξ). # Example 6.5.4 The last example that we consider in this chapter demonstrates that we can also compute factorization of unimodular matrices using the algorithm presented here. Let ! 1 − ξ2 ξ Z(ξ) = −ξ 1 It is easy to see that Z(ξ) is unimodular since det Z(ξ) = 1. Since Z(ξ) has no finite spectral points, the algorithm cannot be used as such. Notice that Z(ξ) is not a regular polynomial 116 6 Polynomial J-spectral factorization matrix (Definition 6.4.2). We therefore convert Z(ξ) into a regular polynomial matrix using the procedure due to Aliev and Larin outlined in Section 6.4.1. Using this procedure, we compute ! −0.7071068 0.7071068 + 0.7071068ξ X(ξ) = 0.7071068 + 0.7071068ξ 0.7071068 + 1.4142136ξ + 0.7071068ξ 2 Define Z 0 (ξ) = X T (−ξ)Z(ξ)X(ξ) which is found to be Z 0 (ξ) = 2 1 − 2ξ ξ − ξ2 −ξ − ξ 1 − ξ2 2 ! Now Z 0 (ξ) can be factorized in the “triple-banded” form as in Section 6.4.1: 2 1 0 0.5 ξ 0 ! 1 −ξ 0 1 0 1 −0.5 0 0 ξ 0 Z (ξ) = 0 −ξ 0 1 0 −0.5 1 0 1 0 0.5 0 0 1 0 1 We can now apply the algorithm and compute a spectral factorization of Z 0 (ξ): ! ! ! 1 − ξ −1 2 1 1 + ξ 0 Z 0 (ξ) = 0 1−ξ 1 1 −1 1 + ξ {z } | F 0 (ξ) Define F (ξ) = F 0 (ξ)X −1 (ξ) which is found to be F (ξ) = It can be easily seen that −0.7071068 − 0.7071068ξ 0.7071068 1.4142136 0 ! 2 1 Z(ξ) = F T (−ξ) F (ξ) 1 1 which gives a factorization of Z(ξ). 6.6 ! Conclusion In this chapter, we have obtained an algorithm for J-spectral factorization of para-Hermitian matrices having constant inertia on the imaginary axis. The algorithm is closely connected with the so called “Σ-unitary” modeling of a finite dimensional vector space. We have obtained a number of results of independent interest for Σ-unitary models. We have shown that the algorithm computes a (Hurwitz) spectral factor in a finite number of iterations. We have studied some numerical aspects associated with the algorithm. Finally, we have demonstrated the use of the algorithm using a number of examples. We have shown that we can also compute factorizations of unimodular matrices using the method described in this chapter. Chapter 7 Synthesis of dissipative systems 7.1 Introduction Synthesis of dissipative systems, better known as the “H∞ control problem” in systems theory is an important area of research. H∞ theory provides a powerful method for designing controllers. In simple terms, a “H∞ controller” aims to modify the sensitivity of a given system to userdefined values. Use of H∞ theory received a huge impetus after the publication of the well known paper [19] which gave explicit state space formulae for the solution of the H∞ control problem. 
Since the 90’s a great deal of research is being carried out in this area. In the same period, Meinsma [53], Willems and Trentelman [104, 96], and later, Belur [11] addressed the H∞ control problem using a polynomial approach. In this chapter we obtain a novel characterization of all solutions to the H∞ problem. We show we can associate a QDF QΦ with a given H∞ problem in a “natural” way. All solutions to the H∞ problem are precisely Φ-dissipative behaviors with positive (semi)definite storage functions on manifest variables. Recall that a characterization of the latter was already obtained in Chapter 4 under some special assumptions. In this way, ideas presented in Chapters 3 and 4 are relevant to a large area of systems theory, including, as we shall show in this chapter, H ∞ controller design. Willems and Trentelman have formulated and solved the H∞ problem in a behavioral framework in [96]. We build upon their formulation in this chapter. However our approach differs from that in [96] in a substantial manner: The H∞ problem has been formulated in [96] based on kernel representations (Chapter 2, Section 2.5). However, kernel representations suffer from the disadvantage that controllability is not in-built in their definition. In this chapter, we formulate the H∞ problem in terms of image representations. Since we only consider image representations, controllability is ensured automatically. An image representation approach also yields itself to dissipativity based arguments more readily than a kernel representation approach. The solution in [96] proceeds by constructing behaviors that are “orthogonal” with respect to a indefinite matrix. This orthogonalization runs into difficulties when some associated matrix is non-constant. Therefore, it is advantageous to have a solution that does not depend on orthogonalization. We believe that the approach presented here can be used to address the H∞ problem with frequency dependent weights. 118 7 Synthesis of dissipative systems Most results in [96] are existential. In this chapter, we attempt to give concrete algorithms. This chapter is organized as follows: in Section 7.2 we formulate the H∞ control problem. The problem is first formulated as in [96]. Using ideas developed earlier in the thesis, we simplify the problem formulation in several steps. The problem is subsequently solved in Section 7.3. Section 7.4 is about a characterization of all solutions using ideas in Chapters 3 and 4. 7.2 Problem formulation We have briefly elaborated on the “control-as-an-interconnection” philosophy of the behavioral approach in Chapter 5, Section 5.3. We call any dynamical system that we wish to control as the “plant”. Control in a behavioral context means a restriction of trajectories in the plant to a desired subset, and can be accomplished by interconnection of the plant with another dynamical system called the controller. Since trajectories in the interconnected system satisfy laws of the plant as well as laws of the controller, the controller can be looked at as excluding undesirable trajectories from the plant. We call the dynamical system obtained by interconnection of the plant and the controller the controlled system. In Chapter 5 we addressed the interconnection of a linear system with a nonlinear system so that the (autonomous) interconnected system has desired properties (the property of stability). 
In this chapter, we address the interconnection of a linear system with another linear system so that the interconnected system is controllable, and is dissipative with respect to a given supply function. A formal definition of terms related to control in the context of the present chapter are given below. In most practical situations, not all variables in the plant are available for interconnection with the controller. We call variables that are available for interconnection the control variables and those that are not, as the to-be-controlled variables. By the “full plant behavior” we mean the behavior corresponding to the control variables and the to-be-controlled variables in the plant. Consider a plant that is controllable. Let c denote variables in the plant that are available for control and v denote the to-be-controlled variables. Then, the full plant behavior is the set of all trajectories (v, c) that satisfy the laws of the plant: P full := {(v, c) ∈ C ∞ (R, Rv+c )|(v, c) obey the plant equations} (7.1) We are generally not interested in the evolution of all variables in P full but only of those variables that are to-be-controlled. We call the projection of P full on to the to-be-controlled behavior as the plant behavior P: P := {v ∈ C ∞ (R, Rv )|(v, c) ∈ P full } (7.2) We attach a controller to the plant through the control variables c. One can look at a controller as another dynamical system having c variables which obey certain laws: C := {c ∈ C ∞ (R, Rc ) such that c satisfies controller equations} (7.3) When the plant and the controller are interconnected through c, the variables c in P full are now restricted since these obey the laws defining both the plant and the controller. The corresponding trajectories v ∈ P represent the controlled behavior, K. K := {v ∈ P|∃c ∈ C with (v, c) ∈ P full } (7.4) 7.2. Problem formulation 119 By a proper design, the controller C ensures that the controlled behavior K meets given specifications. What these specifications are depends on the problem at hand, however in the context of H∞ control, one of the specifications is that K should be dissipative with respect to a given supply function. We give a complete problem specification later. A fundamental question in designing controllers is: can an arbitrary controlled behavior K be achieved by applying control through the control variables? If K can be achieved by applying control through the control variables c, then K is called implementable through c. Clearly, since a controller restricts trajectories in P, it is necessary that K ⊆ P. To address which controlled behaviors can be achieved, we also need the concept of a hidden behavior N : N := {v ∈ P|(v, 0) ∈ P full } (7.5) That is, the hidden behavior is the set of all trajectories in P that correspond to the control variables being zero. Clearly, the controller cannot influence trajectories in N since these are unobservable (“hidden”) from the control variables c. The following result in [96] (Theorem 1, page 55) addresses the question of what controlled behaviors K can be achieved by interconnection of P and C: v Proposition 7.2.1 Let P full ∈ Lv+c con be the full plant behavior, P ∈ Lcon be the manifest plant behavior and N ∈ Lvcon be the hidden behavior. Then, a controlled behavior K ∈ Lvcon is implementable by a controller C ∈ Lccon acting on the control variables if and only if N ⊂ K ⊂ P. 
Thus, a controlled behavior K is implementable by a controller acting through the control variables if and only if it contains the hidden behavior N and K is in turn contained in the manifest plant behavior P. Proposition 7.2.1 is very intuitive, and of fundamental importance in designing a controller, for it gives bounds on what controlled behaviors are possible. We define a nonsingular matrix Σ = ΣT ∈ Rv×v . Let σ+ (Σ) denote the number of positive eigenvalues of Σ. In the sequel, we shall denote by m(B) the input cardinality (Chapter 2, Section 2.9) of a behavior B. Problem statement 1 Given the manifest plant behavior P ∈ Lvcon , the hidden behavior N ∈ Lvcon and a nonsingular weighting matrix Σ = ΣT ∈ Rv×v , find all behaviors K ∈ Lvcon , called the controlled behaviors such that 1. K is implementable, i.e. N ⊂ K ⊂ P. 2. m(K) = σ+ (Σ), i.e. K has σ+ (Σ) inputs. 3. K is Σ-dissipative on R− , i.e., K is dissipative with respect to QΣ and every storage function on manifest variables is positive semidefinite. It is clear that since N ⊂ K and K should be Σ-dissipative on R− , it is necessary that N must be Σ-dissipative on R− . We now explain why the above conditions are important. Condition 1 is about implementability and must be satisfied for any control system design. Note that the implementability condition follows from Proposition 7.2.1. We have seen that the idea 120 7 Synthesis of dissipative systems of control is a restriction brought about by interconnection of the plant and the controller. The controller is chosen such that the interconnected system exhibits some desirable properties (specifications). Conditions 2 and 3 represent the specifications on the controlled system. The dissipativity condition (Condition 3) is fundamental; it could imply, for example, disturbance attenuation, or passivation depending on the problem under consideration. The condition on input cardinality (Condition 2) ensures that the controlled system has sufficient “freedom” available to it. Recall from Lemma 3.5.1 that σ+ (Σ) is the maximum possible input cardinality of a Σdissipative behavior. Since N ⊂ K, m(K) > m(N ). Therefore, it is necessary that input cardinality of N be less than σ+ (Σ): Lemma 7.2.2 There exists a K that is a solution to “Problem statement 1” only if m(N ) < σ+ (Σ). The description of the H∞ problem that we have just given is representation free. However, in practice, we have to work with concrete representations for behaviors. We begin by translating the problem in terms of representations for behaviors. We have assumed that the behaviors P and N are controllable in a behavioral sense. Further, we also want that the controlled behavior K be controllable. We denote the behaviors N , P and K by their image representations Im N ( dtd ), Im P ( dtd ) and Im K( dtd ) respectively and assume that these image representations are observable. We assume that the input cardinality of the hidden behavior is n and that of the manifest plant behavior is p. Thus, N (ξ) ∈ Rv×n [ξ] and P (ξ) ∈ Rv×p [ξ]. We want to find a matrix K(ξ) ∈ Rv×k [ξ], k = σ+ (Σ), such that in addition to the dissipativity conditions, N ⊂ K ⊂ P. Since N ⊂ P, by a suitable transformation, the manifest plant behavior can be represented by Im[N ( dtd ) M ( dtd )]. Hence, we assume that P given by ImP ( dtd ) is such that P (ξ) = [N (ξ) M (ξ)], i.e. the first n columns of P (ξ) are an image representation for the hidden behavior N and further P = N ⊕ Im M ( dtd ). 
Since the controlled behavior $K$ must contain $N$, we assume that $K(\xi) = [N(\xi)\ B(\xi)]$, so that $K = N \oplus \operatorname{Im} B(\frac{d}{dt})$ with $B(\xi) \in \mathbb{R}^{v\times b}[\xi]$, $b = \sigma_+(\Sigma) - n$. Clearly, the behavior $\operatorname{Im} B(\frac{d}{dt}) \subset \operatorname{Im} M(\frac{d}{dt})$. Using these preliminaries, we reformulate the problem statement:

Problem statement 2 Given the hidden behavior $N = \operatorname{Im} N(\frac{d}{dt})$ and the manifest plant behavior $P = \operatorname{Im} P(\frac{d}{dt})$, with $N(\xi) \in \mathbb{R}^{v\times n}[\xi]$, $P(\xi) \in \mathbb{R}^{v\times p}[\xi]$ and $P(\xi) = [N(\xi)\ M(\xi)]$, find a controllable behavior $K = \operatorname{Im} K(\frac{d}{dt})$ with $K(\xi) \in \mathbb{R}^{v\times k}[\xi]$, $k = \sigma_+(\Sigma)$, $K(\xi) = [N(\xi)\ B(\xi)]$ having full column rank, such that

1. $K = N \oplus \operatorname{Im} B(\frac{d}{dt})$.
2. $\operatorname{Im} B(\frac{d}{dt}) \subset \operatorname{Im} M(\frac{d}{dt})$.
3. $K$ is $\Sigma$-dissipative on $\mathbb{R}_-$.

We address the problems defined in statements 1 and 2 above by considering the QDF $Q_{\Sigma_d}$ associated with $Q_\Sigma$, defined by the polynomial matrix $\Sigma_d = d(\zeta)d(\eta)\Sigma$, with $d(\xi) \in \mathbb{R}[\xi]$, $d(\xi) \neq 0$, having no roots in the open right half plane. Notice that a behavior is $\Sigma$-dissipative if and only if it is also $\Sigma_d$-dissipative. In other words, the QDFs $Q_\Sigma$ and $Q_{\Sigma_d}$ are equivalent ($Q_\Sigma \sim Q_{\Sigma_d}$) in the sense of the equivalence relation defined in Chapter 3, Section 3.3. Using the procedure for computing storage functions (Section 4.2), we see that there exists a storage function for a $\Sigma_d$-dissipative behavior having the structure $d(\zeta)d(\eta)\Psi(\zeta,\eta)$, with $Q_\Psi$ a storage function for $B$ with respect to $Q_\Sigma$. The following proposition relates storage functions of $B$ with respect to $Q_\Sigma$ and $Q_{\Sigma_d}$:

Proposition 7.2.3 Let $\Sigma = J_{mn} = \operatorname{diag}[I_m, -I_n]$. Let $B$ be $\Sigma_d$-dissipative, with $d(\xi)$ such that no root of $d(\xi)$ lies in the open right half plane. Assume $m(B) = \sigma_+(\Sigma)$. Let $(u, y)$ be an input-output partition of the manifest variables $w$ of $B$, defined by an observable image representation:

$$\begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} Q(\frac{d}{dt}) \\ P(\frac{d}{dt}) \end{bmatrix} \ell$$

Every storage function for $B$ with respect to $Q_{\Sigma_d}$ is positive semidefinite on manifest variables if and only if $Q(\xi)$ is Hurwitz, i.e. $\det Q(\xi)$ has all its zeros in the open left half plane.

Proof: Let $x$ denote a set of minimal states of $B$. Since $Q_\Sigma \sim Q_{\Sigma_d}$, $B$ is $\Sigma$-dissipative if and only if it is $\Sigma_d$-dissipative. Note that $\sigma_+(\Sigma) = \sigma_+(d(i\omega)d(-i\omega)\Sigma)$ for almost all $\omega \in \mathbb{R}$. Since $m(B) = \sigma_+(\Sigma)$ and $B$ is $\Sigma$-dissipative, $PQ^{-1}$ is a proper rational function and $Q(\xi)$ is nonsingular everywhere on the imaginary axis. The behavior $L_0 := \{\ell \mid Q(\frac{d}{dt})\ell = 0\}$ has the same McMillan degree as $B$; further, corresponding to every state trajectory in $B$ there exists a corresponding state trajectory in $L_0$. Assume that $Q(\xi)$ is Hurwitz. Let $Q_\Psi$ be an arbitrary storage function on states for $B$ with respect to $Q_{\Sigma_d}$. Then

$$\frac{d}{dt} Q_\Psi(x) = Q_{\Sigma_d}(w) - Q_\Delta(w, x)$$

We integrate this equality from $0$ to $\infty$ after setting $u = 0$; the integration is well defined since $Q(\xi)$ is a Hurwitz matrix:

$$Q_\Psi(x(0)) = \int_0^\infty Q_\Delta(w, x)\,dt - \int_0^\infty Q_{\Sigma_d}(w)\,dt$$

Since $-\int_0^\infty Q_{\Sigma_d}(w)\,dt = \int_0^\infty \left( \|d(\tfrac{d}{dt})y\|^2 - \|d(\tfrac{d}{dt})u\|^2 \right) dt$ and $u = 0$, it follows that $-\int_0^\infty Q_{\Sigma_d}(w)\,dt \geq 0$ for all $y = P(\frac{d}{dt})\ell$ with $\ell \in L_0$. Since $Q_\Delta(w, x)$ is positive semidefinite, we conclude that $Q_\Psi(x(0)) \geq 0$. We have thus shown that every storage function (on states) for $B$ with respect to $Q_{\Sigma_d}$ is positive semidefinite. Using a state map $x = X(\frac{d}{dt})w$, we conclude that every storage function for $B$ with respect to $Q_{\Sigma_d}$ is positive semidefinite on manifest variables; see Section 4.2 for the details of this argument.

Conversely, assume that $Q(\xi)$ is not Hurwitz and yet every storage function for $B$ on manifest variables is positive semidefinite. Because of observability of $\ell$ from $w$, $y = P(\frac{d}{dt})\ell$ is nonzero for nonzero $\ell \in L_0$. Consider $\ell_1 \in L_0 \setminus \{0\}$ such that $\ell_1(-\infty) = 0$.
Note that such an $\ell_1$ exists because $Q(\xi)$ is not Hurwitz. Define $y_1 = P(\frac{d}{dt})\ell_1$ and let $w_1 = (0, y_1)^T$ be the trajectory defined by $\ell_1$. Let the states corresponding to $w_1$ at $t = -\infty$ and $t = 0$ be $0$ and $a$ respectively. Consider

$$Q_\Psi(a) := \inf_{w \in B_a} \int_{-\infty}^{0} Q_{\Sigma_d}(w)\,dt$$

Recall from Section 4.2 that this $Q_\Psi$ is the maximum storage function for $B$. We evaluate the integral $\int_{-\infty}^{0} Q_{\Sigma_d}(w_1)\,dt$. Notice that

$$\int_{-\infty}^{0} Q_{\Sigma_d}(w_1)\,dt = \int_{-\infty}^{0} -\|d(\tfrac{d}{dt})y_1\|^2\,dt$$

Since $d(\xi)$ has no roots in the open right half plane, $\int_{-\infty}^{0} -\|d(\tfrac{d}{dt})y_1\|^2\,dt$ is negative. Therefore, the infimum of this integral over all $w \in B_a$ is negative. Thus, the maximum storage function for $B$ along the state $a$ is a negative quantity; hence every storage function for $B$ along the state $a$ is negative. This contradicts our assumption that every storage function for $B$ with respect to $Q_{\Sigma_d}$ is positive semidefinite. □

Corollary 7.2.4 Proposition 7.2.3 also shows that for a $\Sigma_d$-dissipative behavior with input cardinality $\sigma_+(\Sigma)$, if there exists one positive semidefinite storage function for $B$ with respect to $Q_{\Sigma_d}$, then every storage function for $B$ with respect to $Q_{\Sigma_d}$ is positive semidefinite.

Due to the similarities between the supply functions $Q_\Sigma$ and $Q_{\Sigma_d}$, we expect that solving the $H_\infty$ problem with $\Sigma$ or with $\Sigma_d$ as the weight should be equivalent:

Problem statement 1' Given the manifest plant behavior $P \in L^v_{\mathrm{con}}$, the hidden behavior $N \in L^v_{\mathrm{con}}$, a nonsingular weighting matrix $\Sigma = \Sigma^T \in \mathbb{R}^{v\times v}$, and a nonzero polynomial $d(\xi)$ having no roots in the open right half plane, find all behaviors $K \in L^v_{\mathrm{con}}$, called the controlled behaviors, such that:

1. $K$ is implementable, i.e. $N \subset K \subset P$.
2. $m(K) = \sigma_+(\Sigma)$, i.e. $K$ has $\sigma_+(\Sigma)$ inputs.
3. $K$ is dissipative with respect to $Q_{\Sigma_d}$ and every storage function of $K$ with respect to $Q_{\Sigma_d}$ is positive semidefinite on manifest variables.

Proposition 7.2.5 Problem statement 1 and Problem statement 1' are equivalent.

Proof: Note that the only difference between Problem statements 1 and 1' lies in Condition 3. Since $Q_\Sigma \sim Q_{\Sigma_d}$ (Chapter 3, Section 3.3), $K$ is $\Sigma$-dissipative if and only if it is also $\Sigma_d$-dissipative. Note that the minimum storage function of $K$ with respect to $Q_{\Sigma_d}$, $Q_{\Psi^-_{\Sigma_d}}$, and the minimum storage function of $K$ with respect to $Q_\Sigma$, $Q_{\Psi^-_{\Sigma}}$, are related by $\Psi^-_{\Sigma_d}(\zeta,\eta) = d(\zeta)d(\eta)\Psi^-_{\Sigma}(\zeta,\eta)$, since $d(\xi)$ has no roots in the open right half plane by assumption. This shows that Condition 3 in Problem statements 1 and 1' are equivalent. □

Using Proposition 7.2.5, we transform the problem from dissipativity with respect to $Q_\Sigma$ to dissipativity with respect to $Q_{\Sigma_d}$, and then invoke the fact that the two are equivalent. In the sequel we make a simplifying assumption: we assume that $N$ is average positive on $Q_\Sigma$, i.e.

$$\int_{-\infty}^{\infty} Q_\Sigma(v)\,dt \geq 0 \ \text{ for all } v \in N \cap \mathcal{D}(\mathbb{R},\mathbb{R}^v), \quad\text{and}\quad \int_{-\infty}^{\infty} Q_\Sigma(v)\,dt = 0 \text{ for such a } v \iff v = 0$$

With this simplifying assumption, $\det N^T(-\xi)\Sigma N(\xi) \neq 0$ [103]. Building upon the equivalence of Problem statements 1, 2 and 1', we have the following equivalent problem statement:

Problem statement 2' Given the hidden behavior $N = \operatorname{Im} N(\frac{d}{dt})$, determine necessary and sufficient conditions for the existence of a behavior $B = \operatorname{Im} B(\frac{d}{dt})$ such that $K := N \oplus \operatorname{Im} B(\frac{d}{dt})$ satisfies $K \subset P$, $K$ is $\Sigma_d$-dissipative, and every storage function on manifest variables of $K$ with respect to $Q_{\Sigma_d}$ is positive semidefinite.
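The simplifying assumption above is easy to test on a concrete image representation. The following sympy sketch, with a hypothetical $N(\xi)$ and $\Sigma$ (not taken from the thesis), forms the para-Hermitian product $N^T(-\xi)\Sigma N(\xi)$ and checks both that its determinant is not identically zero and that it is positive on the imaginary axis.

```python
import sympy as sp

xi, omega = sp.symbols('xi omega')
# Hypothetical data, for illustration only: v = 3 manifest variables, n = 1,
# hidden behavior N = Im N(d/dt), weight Sigma = diag(1, 1, -1).
N = sp.Matrix([[xi + 2], [1], [1]])
Sigma = sp.diag(1, 1, -1)

G = (N.subs(xi, -xi)).T * Sigma * N            # N^T(-xi) Sigma N(xi), here 1x1
detG = sp.expand(G[0, 0])                      # 4 - xi**2: not identically zero
print(detG)
# Average positivity of N shows up as positivity on the imaginary axis:
print(sp.expand(detG.subs(xi, sp.I * omega)))  # omega**2 + 4 > 0 for all omega
```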
7.3 A Solution to the synthesis problem

We now show that the problem of constructing a behavior $B := \operatorname{Im} B(\frac{d}{dt})$ such that $N \oplus B$ is $\Sigma_d$-dissipative (see Problem statement 2') can be cast elegantly into the framework of the parametrization results presented in Chapters 3 and 4. We state and prove a central result that deals with this parametrization; we first require a few concepts from linear algebra. Consider a Hermitian matrix $H = H^* \in \mathbb{C}^{h\times h}$, block partitioned as

$$H = \begin{bmatrix} A & B \\ B^* & C \end{bmatrix}$$

where $A = A^* \in \mathbb{C}^{a\times a}$. If $A$ is nonsingular, then the Schur complement of $A$ in $H$, $S(H/A)$, is defined as [33]:

$$S(H/A) = C - B^* A^{-1} B$$

We denote by $\sigma(H)$ the inertia of $H$ (Definition 1.2.2). One of the better known results is that $\sigma(H) = \sigma(A) + \sigma(S(H/A))$ [33]. Therefore, if $H \geq 0$ and $A > 0$, then $S(H/A) \geq 0$. Generalizations of this result are known [92].

Theorem 7.3.1 Given nonsingular $\Sigma = \Sigma^T \in \mathbb{R}^{v\times v}$ and behaviors $N := \operatorname{Im} N(\frac{d}{dt})$ and $B := \operatorname{Im} B(\frac{d}{dt})$ such that $N^T(-\xi)\Sigma N(\xi)$ is nonsingular, define $K := N \oplus B$. Then $K$ is $\Sigma$-dissipative (on $\mathbb{R}$) if and only if $N$ is $\Sigma$-dissipative and $B$ is $\Phi$-dissipative (on $\mathbb{R}$), with $\Phi(\zeta,\eta) = d(\zeta)d(\eta)\Sigma - D^T(\zeta)D(\eta)$ for some nonzero polynomial $d(\xi) \in \mathbb{R}[\xi]$ having no roots in the open right half plane and some $D(\xi) \in \mathbb{R}^{v\times v}[\xi]$, and $N \cap B = 0$.

Proof: Let $N(\xi) \in \mathbb{R}^{v\times n}[\xi]$, $B(\xi) \in \mathbb{R}^{v\times b}[\xi]$. $K$ is $\Sigma$-dissipative if and only if the following inequality holds (Theorem 3.2.3):

$$\begin{bmatrix} N^T(-i\omega)\Sigma N(i\omega) & N^T(-i\omega)\Sigma B(i\omega) \\ B^T(-i\omega)\Sigma N(i\omega) & B^T(-i\omega)\Sigma B(i\omega) \end{bmatrix} \geq 0 \quad \forall\,\omega \in \mathbb{R} \qquad (7.6)$$

By a Schur complement argument, inequality (7.6) holds if and only if $N^T(-i\omega)\Sigma N(i\omega) > 0$ for almost all $\omega \in \mathbb{R}$ and

$$B^T(-i\omega)\left( \Sigma - \Sigma N(i\omega)[N^T(-i\omega)\Sigma N(i\omega)]^{-1} N^T(-i\omega)\Sigma \right) B(i\omega) \geq 0 \quad \forall\,\omega \in \mathbb{R} \qquad (7.7)$$

Since the matrix $N^T(-i\omega)\Sigma N(i\omega)$ is invertible for almost all $\omega$, let $Z(\xi)$ be the adjugate of $N^T(-\xi)\Sigma N(\xi)$, and let $d(-\xi)d(\xi)$ be the determinant of $N^T(-\xi)\Sigma N(\xi)$, with $d(\xi)$ chosen so that it does not have any root in the open right half plane. Then inequality (7.7) can be rewritten as

$$B^T(-i\omega)\left( d(-i\omega)d(i\omega)\Sigma - \Sigma N(i\omega)Z(i\omega)N^T(-i\omega)\Sigma \right) B(i\omega) \geq 0 \quad \forall\,\omega \in \mathbb{R} \qquad (7.8)$$

Since $Z(i\omega)$ is non-negative definite for every $\omega \in \mathbb{R}$, there exists a matrix $F(\xi) \in \mathbb{R}^{n\times n}[\xi]$ such that $Z(i\omega) = F^T(-i\omega)F(i\omega)$. Define the polynomial matrix $D(\xi) := F(\xi)N^T(-\xi)\Sigma$. Substituting $D(i\omega)$ in inequality (7.8) yields

$$B^T(-i\omega)\left( d(-i\omega)d(i\omega)\Sigma - D^T(-i\omega)D(i\omega) \right) B(i\omega) \geq 0 \quad \forall\,\omega \in \mathbb{R} \qquad (7.9)$$

We now define the two-variable polynomial matrix $\Phi(\zeta,\eta) := d(\zeta)d(\eta)\Sigma - D^T(\zeta)D(\eta)$. Notice that there exists a behavior $B$ such that $K := N \oplus B$ is $\Sigma$-dissipative if and only if $B$ is $\Phi$-dissipative and $N \cap B = 0$. □
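The Schur complement and inertia facts invoked in the preliminaries and in the proof above are easy to check numerically. Below is a minimal numpy sketch; the matrix `H` and the tolerance are illustrative choices.

```python
import numpy as np

def inertia(H):
    """(n_+, n_-, n_0): counts of positive, negative and zero eigenvalues."""
    ev = np.linalg.eigvalsh(H)
    tol = 1e-10
    return (int(np.sum(ev > tol)), int(np.sum(ev < -tol)),
            int(np.sum(np.abs(ev) <= tol)))

def schur_complement(H, a):
    """S(H/A) = C - B* A^{-1} B for the leading a x a block A of Hermitian H."""
    A, B, C = H[:a, :a], H[:a, a:], H[a:, a:]
    return C - B.conj().T @ np.linalg.solve(A, B)

H = np.array([[2.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, -1.0]])
S = schur_complement(H, 2)
pA, mA, zA = inertia(H[:2, :2])
pS, mS, zS = inertia(S)
# Inertia additivity: sigma(H) = sigma(A) + sigma(S(H/A))
print(inertia(H), (pA + pS, mA + mS, zA + zS))   # both (2, 1, 0)
```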
The QDF $Q_\Phi$ that we have constructed from $N$ and $Q_\Sigma$ has some interesting properties: $Q_\Phi$ has "absorbed" into itself the information associated with the hidden behavior. This fact is reflected in a very appealing manner in the spectrum of $\Phi(-i\omega, i\omega)$:

Theorem 7.3.2 Let $N := \operatorname{Im} N(\frac{d}{dt})$ be the hidden behavior, with $N(\xi) \in \mathbb{R}^{v\times n}[\xi]$ having full column rank. Given the QDF $Q_\Sigma$, $\Sigma = \Sigma^T$ nonsingular, assume $N$ is $\Sigma$-average positive. Construct the QDF $Q_\Phi$ associated with $N$ and $\Sigma$ as detailed in Theorem 7.3.1. Then, for almost all $\omega \in \mathbb{R}$:

1. $\Phi(-i\omega, i\omega)$ has $n$ eigenvalues at zero ($n$ being the input cardinality of $N$).
2. $\Phi(-i\omega, i\omega)$ has $\sigma_+(\Sigma) - n$ positive eigenvalues.
3. $\Phi(-i\omega, i\omega)$ has $\sigma_-(\Sigma)$ negative eigenvalues.

Proof: We prove the theorem by a simple counting argument, which shows that $\Phi(-i\omega, i\omega)$ has at least $n$ zero, $\sigma_+(\Sigma)-n$ positive and $\sigma_-(\Sigma)$ negative eigenvalues for almost all $\omega \in \mathbb{R}$. Since $\Sigma = \Sigma^T$ is nonsingular, we can assume without loss of generality that $\Sigma = \operatorname{diag}[I_{\sigma_+(\Sigma)}, -I_{\sigma_-(\Sigma)}]$. Recall from Theorem 7.3.1 that

$$\Phi(-i\omega, i\omega) = d(-i\omega)d(i\omega)\Sigma - \Sigma N(i\omega)[\operatorname{adj} N^T(-i\omega)\Sigma N(i\omega)]N^T(-i\omega)\Sigma$$

Let $\operatorname{adj} N^T(-i\omega)\Sigma N(i\omega) = F^T(-i\omega)F(i\omega)$, $F(\xi) \in \mathbb{R}^{n\times n}[\xi]$, and define

$$\begin{bmatrix} B_1(\xi) & B_2(\xi) \end{bmatrix} = F(\xi)N^T(-\xi)\Sigma$$

with $B_1(\xi) \in \mathbb{R}^{n\times\sigma_+(\Sigma)}[\xi]$ and $B_2(\xi) \in \mathbb{R}^{n\times\sigma_-(\Sigma)}[\xi]$. Thus the term $\Sigma N(i\omega)[\operatorname{adj} N^T(-i\omega)\Sigma N(i\omega)]N^T(-i\omega)\Sigma$ can be factorized as

$$\Sigma N(i\omega)[\operatorname{adj} N^T(-i\omega)\Sigma N(i\omega)]N^T(-i\omega)\Sigma = \begin{bmatrix} B_1^T(-i\omega) \\ B_2^T(-i\omega) \end{bmatrix} \begin{bmatrix} B_1(i\omega) & B_2(i\omega) \end{bmatrix}$$

Consequently, $\Phi(-i\omega, i\omega)$ can be rewritten as

$$\Phi(-i\omega, i\omega) = \begin{bmatrix} d(i\omega)d(-i\omega)I_{\sigma_+(\Sigma)} - B_1^T(-i\omega)B_1(i\omega) & -B_1^T(-i\omega)B_2(i\omega) \\ -B_2^T(-i\omega)B_1(i\omega) & -d(i\omega)d(-i\omega)I_{\sigma_-(\Sigma)} - B_2^T(-i\omega)B_2(i\omega) \end{bmatrix}$$

The block $-d(i\omega)d(-i\omega)I_{\sigma_-(\Sigma)} - B_2^T(-i\omega)B_2(i\omega)$ is negative definite, and consequently $\Phi(-i\omega, i\omega)$ has at least $\sigma_-(\Sigma)$ negative eigenvalues. Notice that $\Phi(-i\omega, i\omega)N(i\omega) = 0$; since $N(\xi)$ has full column rank, $\Phi(-i\omega, i\omega)$ has at least $n$ zero eigenvalues. From Lemma 7.2.2, $n < \sigma_+(\Sigma)$. The matrix $B_1^T(-i\omega)B_1(i\omega)$ is a $\sigma_+(\Sigma)\times\sigma_+(\Sigma)$ matrix of rank at most $n$, and thus has at least $\sigma_+(\Sigma)-n$ eigenvalues at $0$. Hence $d(-i\omega)d(i\omega)I_{\sigma_+(\Sigma)} - B_1^T(-i\omega)B_1(i\omega)$ has at least $\sigma_+(\Sigma)-n$ positive eigenvalues for almost all $\omega \in \mathbb{R}$. We have thus shown that a principal submatrix of $\Phi(-i\omega, i\omega)$ has at least $\sigma_+(\Sigma)-n$ positive eigenvalues. Using suitable congruence transformations, we obtain a matrix $\bar\Phi(-i\omega, i\omega) = X^*\Phi(-i\omega, i\omega)X$ whose top left block has exactly $\sigma_+(\Sigma)-n$ positive eigenvalues and is, in particular, nonsingular. A Schur complement argument then shows that $\bar\Phi(-i\omega, i\omega)$ has at least $\sigma_+(\Sigma)-n$ positive eigenvalues, and therefore so does $\Phi(-i\omega, i\omega)$; for further details of this argument, see [33], particularly Section 4, page 79. Thus we have shown that $\Phi(-i\omega, i\omega)$ has at least $\sigma_-(\Sigma)$ negative, $n$ zero and $\sigma_+(\Sigma)-n$ positive eigenvalues. Since these counts must add up to $v = \sigma_+(\Sigma) + \sigma_-(\Sigma)$, the total number of eigenvalues, $\Phi(-i\omega, i\omega)$ has exactly $\sigma_-(\Sigma)$ negative, $n$ zero and $\sigma_+(\Sigma)-n$ positive eigenvalues for almost all $\omega$. □
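The eigenvalue counts of Theorem 7.3.2 can be observed numerically. The sketch below continues the hypothetical example introduced after Problem statement 2' ($v = 3$, $\Sigma = \operatorname{diag}(1,1,-1)$, $N(\xi) = [\xi+2\ \ 1\ \ 1]^T$, so $n = 1$, $\sigma_+ = 2$, $\sigma_- = 1$), for which the construction of Theorem 7.3.1 gives $d(\xi) = \xi + 2$ and $D(\xi) = N^T(-\xi)\Sigma$; none of this data comes from the thesis itself.

```python
import numpy as np

def Phi_at(omega: float) -> np.ndarray:
    """Phi(-i w, i w) = d(-i w) d(i w) Sigma - D^T(-i w) D(i w) for the toy data."""
    s = 1j * omega
    Sigma = np.diag([1.0, 1.0, -1.0])
    D_pos = np.array([2 - s, 1.0, -1.0])      # D(i w)
    D_neg = np.array([2 + s, 1.0, -1.0])      # D(-i w), its transpose as a column
    return (2 - s) * (2 + s) * Sigma - np.outer(D_neg, D_pos)

ev = np.linalg.eigvalsh(Phi_at(1.0))          # Hermitian, so real spectrum
print(np.round(ev, 6))                        # approx [-7, 0, 5]:
# sigma_-(Sigma) = 1 negative, n = 1 zero, sigma_+(Sigma) - n = 1 positive
```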
Another interesting property of $Q_\Phi$ is that the hidden behavior $N$ is $\Phi$-lossless:

Corollary 7.3.3 For every compactly supported trajectory $v \in N$, $\int_{-\infty}^{\infty} Q_\Phi(v)\,dt = 0$.

The elegance of working with $Q_\Phi$ is brought out in the following theorem, where we show that the hidden behavior is automatically contained in any $\Phi$-dissipative behavior with full input cardinality:

Theorem 7.3.4 Let $Q_\Phi$ be a QDF with $\Phi(\zeta,\eta)$ as defined in Theorem 7.3.1. Then the controlled behavior $K$ is $\Phi$-dissipative. Moreover, every $\Phi$-dissipative behavior $K'$ having input cardinality $\sigma_+(\Sigma)$ enjoys the property that $N \subset K'$. Further, $K'$ is $\Sigma_d$-dissipative and $N$ is $\Sigma_d$-dissipative.

Proof: Assume $K = \operatorname{Im}[N(\frac{d}{dt})\ B(\frac{d}{dt})]$ with $N = \operatorname{Im} N(\frac{d}{dt})$, $B = \operatorname{Im} B(\frac{d}{dt})$ and $N \cap B = 0$. From Theorem 7.3.1, such a $B$ exists if and only if $B$ is $\Phi$-dissipative. We see that

$$\begin{bmatrix} N^T(-i\omega) \\ B^T(-i\omega) \end{bmatrix} \Phi(-i\omega, i\omega) \begin{bmatrix} N(i\omega) & B(i\omega) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & B^T(-i\omega)\Phi(-i\omega, i\omega)B(i\omega) \end{bmatrix}$$

which is clearly positive semidefinite, since $B$ is $\Phi$-dissipative. We now show by a contradiction argument that every behavior $K'$ having input cardinality $\sigma_+(\Sigma)$ that is $\Phi$-dissipative satisfies $N \subset K'$. Assume that there exists a $\Phi$-dissipative behavior $K'$ with input cardinality $\sigma_+(\Sigma)$ such that $N$ is not contained in $K'$. Consider the behavior $K'' = K' + N$. Since $N$ is not contained in $K'$, the input cardinality of $K''$ is strictly larger than $\sigma_+(\Sigma)$. Moreover, since $\int_{-\infty}^{\infty} L_\Phi(w, v)\,dt = 0$ for $v \in N$, $w \in \mathcal{D}(\mathbb{R},\mathbb{R}^v)$, it follows that $K''$ is $\Phi$-dissipative with input cardinality strictly larger than $\sigma_+(\Sigma)$. However, from Theorem 7.3.2, $\Phi(-i\omega, i\omega)$ has only $\sigma_+(\Sigma)$ non-negative eigenvalues for almost all $\omega$; therefore, the input cardinality of any $\Phi$-dissipative behavior is at most $\sigma_+(\Sigma)$ (Lemma 3.5.1). Hence a $K'$ that is $\Phi$-dissipative with input cardinality $\sigma_+(\Sigma)$ and does not contain $N$ cannot exist. Since $\Phi(\zeta,\eta) = d(\zeta)d(\eta)\Sigma - D^T(\zeta)D(\eta)$, Theorem 3.2.3 shows that every $\Phi$-dissipative behavior is also $\Sigma_d$-dissipative. Since every $\Phi$-dissipative behavior with input cardinality $\sigma_+(\Sigma)$ contains $N$, it follows that $N$ is $\Sigma_d$-dissipative. □

Note that in Theorem 7.3.4 we have characterized all behaviors that are $\Sigma_d$-dissipative and contain $N$. We have not yet stipulated that these behaviors be contained in $P$, the manifest plant behavior. We consider this implementability issue in the next proposition:

Proposition 7.3.5 Consider $Q_\Phi$ with $\Phi(\zeta,\eta)$ as in Theorem 7.3.1. Let $P = \operatorname{Im} P(\frac{d}{dt})$ be the manifest plant behavior, defined by an observable image representation. Define $\Theta(\zeta,\eta) = P^T(\zeta)\Phi(\zeta,\eta)P(\eta)$. Then $N \subset K \subset P$, $K$ has input cardinality $\sigma_+(\Sigma)$ and $K$ is $\Sigma_d$-dissipative if and only if there exists a $\Theta$-dissipative behavior $G = \operatorname{Im} G(\frac{d}{dt})$ with input cardinality $\sigma_+(\Sigma)$ such that $K = P(\frac{d}{dt})(G)$.

Proof: Suppose there exists a $G$ satisfying the conditions in the proposition. Then $K := P(\frac{d}{dt})(G) \subset P$ and has input cardinality $\sigma_+(\Sigma)$. Since $G$ is $\Theta$-dissipative, $K$ is $\Phi$-dissipative. Therefore, from Theorem 7.3.4, $N \subset K$ and $K$ is $\Sigma_d$-dissipative; hence $N \subset K \subset P$. Conversely, every $K$ such that $N \subset K$, $K$ $\Sigma_d$-dissipative with input cardinality $\sigma_+(\Sigma)$, is $\Phi$-dissipative (Theorem 7.3.4). Since $K \subset P$, there exists $G(\xi) \in \mathbb{R}^{p\times\sigma_+(\Sigma)}[\xi]$ such that $K(\xi) = P(\xi)G(\xi)$, where $K = \operatorname{Im} K(\frac{d}{dt})$, $P = \operatorname{Im} P(\frac{d}{dt})$. Since $K$ is $\Phi$-dissipative, $G := \operatorname{Im} G(\frac{d}{dt})$ is $\Theta$-dissipative. Further, the input cardinalities of $K$ and $G$ are the same because $P(\xi)$ has full column rank; hence the input cardinality of $G$ is $\sigma_+(\Sigma)$. □

The essence of Proposition 7.3.5 is that all controllable behaviors $K$ that are $\Sigma_d$-dissipative, have input cardinality $\sigma_+(\Sigma)$ and satisfy $N \subset K \subset P$ are characterized by the set of all $\Theta$-dissipative behaviors with input cardinality $\sigma_+(\Sigma)$, where $Q_\Theta$ is the QDF constructed from $N$ and $P$ as given in Theorem 7.3.1 and Proposition 7.3.5.
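Proposition 7.3.5 reduces the synthesis problem to $\Theta$-dissipativity. The sketch below assembles $\Theta(\zeta,\eta) = P^T(\zeta)\Phi(\zeta,\eta)P(\eta)$ symbolically for the same hypothetical data as above, with an arbitrarily chosen complement $M(\xi)$; it illustrates the construction only, and is not the thesis's algorithm. The final line checks the losslessness of the hidden behavior (Corollary 7.3.3) along $\zeta = -\eta$.

```python
import sympy as sp

xi, ze, et = sp.symbols('xi zeta eta')
Sigma = sp.diag(1, 1, -1)
N = sp.Matrix([[xi + 2], [1], [1]])          # hidden behavior, as above
M = sp.Matrix([[0], [1], [0]])               # hypothetical complement, P = [N M]
P = N.row_join(M)

d = xi + 2                                   # since det N^T(-xi) Sigma N(xi) = (2-xi)(2+xi)
D = (N.subs(xi, -xi)).T * Sigma              # D(xi) = F(xi) N^T(-xi) Sigma, with F = 1 here

# Phi(zeta, eta) = d(zeta) d(eta) Sigma - D^T(zeta) D(eta)   (Theorem 7.3.1)
Phi = d.subs(xi, ze) * d.subs(xi, et) * Sigma - (D.subs(xi, ze)).T * D.subs(xi, et)

# Theta(zeta, eta) = P^T(zeta) Phi(zeta, eta) P(eta)          (Proposition 7.3.5)
Theta = ((P.subs(xi, ze)).T * Phi * P.subs(xi, et)).applyfunc(sp.expand)
print(Theta)
print(sp.simplify(Theta[0, 0].subs(ze, -et)))   # 0: N is Phi-lossless (Corollary 7.3.3)
```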
7.4 A characterization of all solutions of the synthesis problem

Problem statement 2' requires $K$ to be $\Sigma_d$-dissipative with positive semidefinite storage functions on manifest variables. We now see how this can be achieved using the set of $\Theta$-dissipative behaviors of Proposition 7.3.5.

Theorem 7.4.1 Consider a behavior $K \in L^v_{\mathrm{con}}$ that satisfies

1. $N \subset K \subset P$.
2. $K$ is dissipative with respect to $Q_{\Sigma_d}$ and every storage function on manifest variables of $K$ is positive semidefinite.
3. $K$ has input cardinality $\sigma_+(\Sigma)$.

Such a $K$ is characterized as $K = P(\frac{d}{dt})(G)$, where $P = \operatorname{Im} P(\frac{d}{dt})$ is the manifest plant behavior defined by an observable image representation and

1. $G$ is a $\Theta$-dissipative behavior, with $Q_\Theta$ as in Proposition 7.3.5.
2. $G$ has input cardinality $\sigma_+(\Sigma)$.
3. $G$ has a positive semidefinite storage function on manifest variables (with respect to $Q_\Theta$).

Proof: From Proposition 7.3.5, $N \subset K \subset P$, $K$ has input cardinality $\sigma_+(\Sigma)$, and $K$ is $\Sigma_d$-dissipative if and only if $K = P(\frac{d}{dt})(G)$, where $G$ is $\Theta$-dissipative and has input cardinality $\sigma_+(\Sigma)$. Since $\Theta(\zeta,\eta) = P^T(\zeta)\Phi(\zeta,\eta)P(\eta)$, every storage function on latent variables of $G$ with respect to $Q_\Theta$ is also a storage function on latent variables of $K$ with respect to $Q_\Phi$. Hence, there exists a positive semidefinite storage function on manifest variables of $K$ with respect to $Q_\Phi$ if and only if there exists a positive semidefinite storage function on manifest variables of $G$ with respect to $Q_\Theta$. Since $\Phi(\zeta,\eta) = d(\zeta)d(\eta)\Sigma - D^T(\zeta)D(\eta)$, it follows that $Q_\Phi(v) \leq Q_{\Sigma_d}(v)$ for all $v \in C^\infty(\mathbb{R},\mathbb{R}^v)$, and in particular for all $v \in K$. Let $Q_\Psi$ be a positive semidefinite storage function for $K$ with respect to $Q_\Phi$:

$$\frac{d}{dt}Q_\Psi(v) \leq Q_\Phi(v) \leq Q_{\Sigma_d}(v) \quad \forall\, v \in K$$

Thus, starting from a positive semidefinite storage function for $G$ with respect to $Q_\Theta$, we have constructed one positive semidefinite storage function for $K$ with respect to $Q_{\Sigma_d}$; Corollary 7.2.4 then shows that every storage function for $K$ with respect to $Q_{\Sigma_d}$ is positive semidefinite. The converse statement, i.e. that existence of a positive semidefinite storage function on manifest variables of $K$ with respect to $Q_{\Sigma_d}$ implies existence of a positive semidefinite storage function on manifest variables of $K$ with respect to $Q_\Phi$, is shown by a contradiction argument: suppose there exists no storage function for $K$ with respect to $Q_\Phi$ that is positive semidefinite. Then there exists a storage function for $K$ with respect to $Q_{\Sigma_d}$ that is not positive semidefinite, and we conclude from Condition 3 of Problem statement 1' that $K$ cannot be a solution. □

We now invoke the problem equivalence of Proposition 7.2.5: $K$ has positive semidefinite storage functions with respect to $Q_{\Sigma_d}$ if and only if it also has positive semidefinite storage functions with respect to $Q_\Sigma$. Hence, Theorem 7.4.1 gives a characterization of all solutions of the standard $H_\infty$ problem with $\Sigma$ as the weighting matrix. Thus, in summary, $K$ is a solution to the $H_\infty$ problem, i.e. $K$ is $\Sigma$-dissipative on $\mathbb{R}_-$, $m(K) = \sigma_+(\Sigma)$ and $N \subset K \subset P$, if and only if there exists a behavior $G$ with input cardinality $\sigma_+(\Sigma)$ that is $\Theta$-dissipative and has a positive semidefinite storage function on manifest variables. Every solution $K$ is then given by $K = P(\frac{d}{dt})(G)$, where $P = \operatorname{Im} P(\frac{d}{dt})$ is defined by an observable image representation.

7.5 Conclusion

In this chapter, we have obtained a novel characterization of all solutions to the $H_\infty$ problem. We have shown that, given the "plant behavior" and the "hidden behavior", one can construct a QDF $Q_\Theta$ from these behaviors. Every solution to the $H_\infty$ problem can be obtained from the $\Theta$-dissipative behaviors having the "right" input cardinality and positive semidefinite storage functions. We have assumed that the hidden behavior is average positive with respect to the given weighting matrix $\Sigma$. While this assumption is crucial for the treatment as presented here, we recognize that it is not central to the theme of this chapter; work on doing away with it is in progress. We also believe that much of the theory presented in this chapter generalizes in a neat manner when the weighting matrix is "frequency dependent", rather than constant as considered here. Investigation in this direction is in progress.
Chapter 8

Modeling of data with bilinear differential forms

8.1 Introduction

Modeling a system from observed time series data is a problem arising in several important areas of engineering, for example time series analysis, signal processing, system identification and automatic control. The usual modeling approach proceeds by fixing the type of laws that we desire, or believe the system is likely to obey, and then searching for a model of this type that explains the data. In the setting in which we will be working, a model explains given data if the data are exactly compatible with the laws describing the model itself; the procedure for finding such a model is called exact identification. Exact modeling has been covered in this thesis in Chapter 6, Section 6.2.1. We refer the interested reader to [100] for a deeper exposition of the issues regarding exact identification. One of the most reasonable and easy a priori assumptions on the model is that of linearity, primarily because of a good understanding of linear systems theory and also because of the availability of efficient computational algorithms for modeling. The linear exact modeling problem has been well studied in systems theory; see for example [100, 7]. The assumption of linearity, with its simplicity, ease and mathematical tractability, may not always be a good choice, and indeed there are several systems (for example econometric systems) and important applications (signal filtering) where going beyond a linear model has been shown to have advantages. Bilinear models are of paramount importance in nonlinear modeling and have found applications in signal processing [85] and coding theory [51, 47]. The intertwining of quadratic and bilinear models with optimization problems is well known; see [50] for a survey. The central question in interpolation with bilinear forms is to determine a bilinear form that takes prescribed values along certain prescribed directions specified by the data. It is worth emphasizing that, though a bilinear model is obviously nonlinear, determining its parameters from the data is still a linear problem. A brute-force solution of the linear system may not be desirable, because modifications in the data may necessitate re-computation of the whole solution. This motivates the search for an algorithm that models data iteratively, i.e. the bilinear form should be modeled depending on the current and past available data; the current model should be updated to explain future data as and when such data become available.

In this chapter we address bilinear modeling by considering a general bilinear interpolation problem, and we show its relevance to a number of problems in systems and control theory. The most interesting aspect of the interpolation scheme is that it is recursive with respect to the data. Further, it only uses standard matrix manipulations; hence it can be implemented in general purpose computational packages. The scheme is set up in a formal framework, due to which it can be extended to various domains of application with no additional effort. The work presented here has interesting connections with several problems in mathematics and engineering. The most immediate connection is with bivariate polynomial interpolation problems [10, 23], among others. Reed-Solomon decoding [51, 47] is a bilinear interpolation problem, and discrete time bilinear interpolation is useful in quadratic filtering [85].
Bilinear interpolation is also a well known technique in image processing [91]. This chapter is organized as follows. A precise problem statement is given in Section 8.2. Section 8.3 is the main section of this chapter; here we propose a scheme to address the bilinear interpolation problem. Section 8.4 is about examples and applications; there we show, among other things, the use of the interpolation scheme for computing Lyapunov functions. These are followed by concluding remarks in Section 8.5.

8.2 The problem statement

Consider $w \in C^\infty(\mathbb{R},\mathbb{C}^w)$, $v \in C^\infty(\mathbb{R},\mathbb{C}^w)$ and $\gamma \in C^\infty(\mathbb{R},\mathbb{C})$. We call $(w, v, \gamma)$ the data. The bilinear models that we wish to construct for the data are elements of the model class $\mathcal{M}_{\mathrm{BDF}}$:

$$\mathcal{M}_{\mathrm{BDF}} = \{(v, w, \gamma) \in C^\infty(\mathbb{R},\mathbb{C}^w)\times C^\infty(\mathbb{R},\mathbb{C}^w)\times C^\infty(\mathbb{R},\mathbb{C}) \mid \exists\,\Phi \in \mathbb{C}^{w\times w}[\zeta,\eta] \text{ satisfying } L_\Phi(w, v) = \gamma\} \qquad (8.1)$$

The model class $\mathcal{M}_{\mathrm{BDF}}$ thus explicitly assumes that the data is amenable to exact bilinear modeling. Since we do not exclude the possibility of complex $\Phi$s, a word on "symmetry" in this context is in order. Let $\Phi(\zeta,\eta) = \sum_{k,l=0}^{n}\Phi_{kl}\zeta^k\eta^l$. By $\Phi^\star(\zeta,\eta)$ we mean the matrix $\sum_{k,l=0}^{n}\Phi_{kl}^*\eta^k\zeta^l$, where $\Phi_{kl}^*$ denotes the Hermitian transpose of $\Phi_{kl}$. If $\Phi(\zeta,\eta) = \Phi^\star(\zeta,\eta)$, we call $\Phi$ symmetric. Further, for one-variable polynomial matrices $R(\xi) \in \mathbb{C}^{\bullet\times\bullet}[\xi]$ given by $\sum_{i=0}^{n}R_i\xi^i$, by $R^\star(\xi)$ we mean the matrix $\sum_{i=0}^{n}R_i^*\xi^i$, where $R_i^*$ denotes the Hermitian transpose of $R_i$. We consider the following problem: given $N$ distinct trajectories $c_ie^{\lambda_it}$, with $c_i \in \mathbb{C}^c$ and the $\lambda_i \in \mathbb{C}$ distinct, and $q_{ij} \in \mathbb{C}$, $i,j = 1,\dots,N$, determine a bilinear differential form $L_\Phi$, $\Phi(\zeta,\eta) = \Phi^\star(\zeta,\eta)$, such that

$$L_\Phi(c_ie^{\lambda_it}, c_je^{\lambda_jt}) = q_{ij}e^{(\bar\lambda_i+\lambda_j)t}, \quad i,j = 1,\dots,N \qquad (8.2)$$

Observe that the requirement of symmetry automatically fixes the values of $L_\Phi(c_je^{\lambda_jt}, c_ie^{\lambda_it})$ to $q_{ij}^*e^{(\lambda_i+\bar\lambda_j)t}$. Using the two-variable polynomial notation for BDFs, the problem statement can be translated into the equivalent algebraic statement

$$c_i^*\Phi(\bar\lambda_i,\lambda_j)c_j = q_{ij}$$

In general, problems involving polynomial interpolation have non-unique solutions. Deciding which solution is the "best" from among infinitely many others is not always straightforward; indeed, which solution is the "best" is often a question that can only be answered depending on the application at hand and the required computational effort. In the context of this chapter, the choice is further complicated by the fact that there is no natural ordering on the ring of polynomials in two variables. Several criteria for choosing the "best" solution to problems involving two-variable polynomial matrices are available in the literature. See [97] for a discussion of the "effective size" (the effective size of $\sum_{k,l}\Phi_{kl}\zeta^k\eta^l$ is defined as $\max\{k \mid \Phi_{kl}\neq 0\}$). In some applications (like error correcting codes [47]), the criterion has been the least weighted degree (the $(w_\zeta,w_\eta)$-weighted degree of $\sum_{k,l}\Phi_{kl}\zeta^k\eta^l$ is defined as $\max\{kw_\zeta + lw_\eta \mid \Phi_{kl}\neq 0\}$, where $w_\zeta, w_\eta$ are fixed "weights" for $\zeta$ and $\eta$ respectively). The criterion that we adopt in this chapter is ease of computation. The most interesting feature of our scheme is that it is recursive, and the required computational effort is very modest; indeed, in the simpler cases one can write down the solution by hand. In the next section we develop the algorithm for interpolation with BDFs.

8.3 A recursive algorithm for interpolating with BDFs

We begin by recalling the problem of interpolating with BDFs: given $N$ distinct trajectories $c_ie^{\lambda_it}$, with $c_i \in \mathbb{C}^c$ and the $\lambda_i \in \mathbb{C}$ distinct, and $q_{ij} \in \mathbb{C}$, $i,j = 1,\dots,N$, determine a BDF $L_\Phi$ with $\Phi(\zeta,\eta) = \Phi^\star(\zeta,\eta)$ such that

$$L_\Phi(c_ie^{\lambda_it}, c_je^{\lambda_jt}) = q_{ij}e^{(\bar\lambda_i+\lambda_j)t}, \quad i,j = 1,\dots,N \qquad (8.3)$$
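Before developing the algorithm, it is instructive to verify symbolically that a BDF acts on exponentials through the two-variable polynomial evaluated at $(\bar\lambda_i, \lambda_j)$, which is exactly what reduces (8.2) to the algebraic statement above. A minimal sympy sketch, with a hypothetical scalar $\Phi$ chosen only for illustration:

```python
import sympy as sp

t = sp.symbols('t', real=True)
la1, la2 = sp.symbols('lambda1 lambda2')
ze, et = sp.symbols('zeta eta')

Phi = 1 + ze * et                           # a hypothetical symmetric scalar BDF
coeffs = sp.Poly(Phi, ze, et).terms()       # [((k, l), Phi_kl), ...]

w = sp.exp(la1 * t)                         # w = e^{lambda1 t}, v = e^{lambda2 t}
v = sp.exp(la2 * t)
# L_Phi(w, v) = sum_{k,l} Phi_kl * conj(d^k w / dt^k) * d^l v / dt^l
L = sum(c * sp.conjugate(sp.diff(w, t, k)) * sp.diff(v, t, l)
        for (k, l), c in coeffs)

# The BDF acts on exponentials via Phi evaluated at (conj(lambda1), lambda2):
rhs = Phi.subs({ze: sp.conjugate(la1), et: la2}) \
      * sp.exp((sp.conjugate(la1) + la2) * t)
print(sp.simplify(L - rhs))                 # 0
```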
For the sake of simplicity, we make the following assumption:

$$c_i^*c_j \neq 0, \quad i,j = 1,\dots,N \qquad (8.4)$$

A discussion of how to relax this assumption can be found in Remark 8.3.2 below. In order to develop the interpolation scheme, we need the concept of the Most Powerful Unfalsified Model (MPUM) for a finite set of vector exponential trajectories (Section 6.2.1). Recall that the MPUM (chosen from a model class) is the most restrictive model that does not refute the data. We now prove a theorem that is the basis of the interpolation scheme:

Theorem 8.3.1 Consider $N$ trajectories $c_ie^{\lambda_it}$, with $c_i \in \mathbb{C}^c$, $c_i^*c_j \neq 0$ for $i,j = 1,\dots,N$, and $\lambda_i \in \mathbb{C}$, $i = 1,\dots,N$, distinct, together with $N^2$ complex numbers $q_{ij}$ satisfying $q_{ij} = q_{ji}^*$, $i,j = 1,\dots,N$. Consider the following iterations:

$$\Phi_1 = \frac{q_{11}}{c_1^*c_1}\,I$$

for $l = 1$ to $N-1$ do
$$\Phi_{l+1}(\zeta,\eta) = \Phi_l(\zeta,\eta) + E_{l+1}^\star(\zeta)R_l(\eta) + R_l^\star(\zeta)E_{l+1}(\eta)$$
end for

where $R_l$ is a representation of the MPUM for $c_1e^{\lambda_1t},\dots,c_le^{\lambda_lt}$ and $E_{l+1}$ is the "error removing matrix", which satisfies

$$E_{l+1}(\lambda_i) = [R_l^{-1}]^\star(\bar\lambda_{l+1})\,\frac{q_{(l+1)i} - c_{l+1}^*\Phi_l(\bar\lambda_{l+1},\lambda_i)c_i}{\alpha\,c_{l+1}^*c_i}, \quad i = 1,\dots,l+1 \qquad (8.5)$$

where $\alpha = 2$ when $i = l+1$, else $\alpha = 1$. Then $L_{\Phi_N}$ is a solution to the bilinear interpolation problem.

Proof: We prove the theorem by induction on the number of trajectories. Consider $\Phi_1(\zeta,\eta)$. Clearly, $c_1^*\Phi_1c_1 = q_{11}$; hence $L_{\Phi_1}$ is a solution of the problem for $N = 1$. Assume that $L_{\Phi_l}$ is a solution to the bilinear interpolation problem with data $c_ie^{\lambda_it}$, $i = 1,\dots,l$, i.e. assume that $\Phi_l(\zeta,\eta)$ is known such that

$$L_{\Phi_l}(c_ie^{\lambda_it}, c_je^{\lambda_jt}) = q_{ij}e^{(\bar\lambda_i+\lambda_j)t}, \quad i,j = 1,\dots,l \qquad (8.6)$$

The $(l+1)$-th update is carried out with the formula

$$\Phi_{l+1}(\zeta,\eta) = \Phi_l(\zeta,\eta) + E_{l+1}^\star(\zeta)R_l(\eta) + R_l^\star(\zeta)E_{l+1}(\eta) \qquad (8.7)$$

where $\operatorname{Ker} R_l(\frac{d}{dt})$ is the MPUM for $c_ie^{\lambda_it}$, $i = 1,\dots,l$. Since $R_l$ is a representation of the MPUM,

$$L_{\Phi_{l+1}}(c_ie^{\lambda_it}, c_je^{\lambda_jt}) = L_{\Phi_l}(c_ie^{\lambda_it}, c_je^{\lambda_jt}), \quad i,j = 1,\dots,l \qquad (8.8)$$

Thus the update of $\Phi_l(\zeta,\eta)$ is achieved without disturbing the interpolation conditions already satisfied by $\Phi_l$. A solution to the $(l+1)$-th step interpolation problem is obtained by constructing a univariate polynomial matrix $E_{l+1}$ that satisfies

$$c_{l+1}^*\left[\Phi_l(\bar\lambda_{l+1},\lambda_i) + E_{l+1}^\star(\bar\lambda_{l+1})R_l(\lambda_i) + R_l^\star(\bar\lambda_{l+1})E_{l+1}(\lambda_i)\right]c_i = q_{(l+1)i}, \quad i = 1,\dots,l+1 \qquad (8.9)$$

We now show that these conditions can be satisfied, and we provide a method to construct a polynomial matrix $E_{l+1}$ meeting these requirements. Since $\operatorname{Ker} R_l(\frac{d}{dt})$ is the MPUM for the first $l$ trajectories, and since all the $\lambda_i$ are assumed distinct, $R_l(\lambda_{l+1})$ is a nonsingular constant matrix. We verify by substitution that

$$E_{l+1}(\lambda_{l+1}) = [R_l^{-1}]^\star(\bar\lambda_{l+1})\,\frac{q_{(l+1)(l+1)} - c_{l+1}^*\Phi_l(\bar\lambda_{l+1},\lambda_{l+1})c_{l+1}}{2c_{l+1}^*c_{l+1}} \qquad (8.10)$$

satisfies conditions (8.9) when $i = l+1$. In addition, $E_{l+1}$ must satisfy $l$ cross-coupling conditions associated with the pairs $(\bar\lambda_{l+1},\lambda_i)$, $i = 1,\dots,l$. We verify by substitution that

$$E_{l+1}(\lambda_i) = [R_l^{-1}]^\star(\bar\lambda_{l+1})\,\frac{q_{(l+1)i} - c_{l+1}^*\Phi_l(\bar\lambda_{l+1},\lambda_i)c_i}{c_{l+1}^*c_i}, \quad i = 1,\dots,l \qquad (8.11)$$

satisfies conditions (8.9) when $i = 1,\dots,l$. These interpolation conditions are well defined because we have assumed that $c_i^*c_j \neq 0$ for all $i,j$.
In order to construct an $E_{l+1}$ satisfying the interpolation conditions, consider the following scheme. Define

$$A_i = [R_l^{-1}]^\star(\bar\lambda_{l+1})\,\frac{q_{(l+1)i} - c_{l+1}^*\Phi_l(\bar\lambda_{l+1},\lambda_i)c_i}{\alpha\,c_{l+1}^*c_i}, \quad i = 1,\dots,l+1 \qquad (8.12)$$

with $\alpha = 2$ if $i = l+1$ and $\alpha = 1$ otherwise. Determining $E_{l+1}$ from the $l+1$ interpolation conditions is a straightforward problem in Lagrange interpolation:

$$E_{l+1}(\xi) = \sum_{j=1}^{l+1} \frac{\prod_{i=1, i\neq j}^{l+1}(\xi-\lambda_i)}{\prod_{i=1, i\neq j}^{l+1}(\lambda_j-\lambda_i)}\,A_j \;+\; F\prod_{i=1}^{l+1}(\xi-\lambda_i) \qquad (8.13)$$

with $F \in \mathbb{C}^{c\times c}[\xi]$. □

This completes the development of the recursive interpolation algorithm, which is summarized below:

Input: $c_ie^{\lambda_it}$, $c_i \in \mathbb{C}^c$, $\lambda_i \in \mathbb{C}$ distinct, $c_i^*c_j \neq 0$ for all $i,j$, and complex constants $q_{ij} = q_{ji}^*$, $i,j = 1,\dots,N$.
Output: $\Phi(\zeta,\eta) = \Phi^\star(\zeta,\eta)$ such that $L_\Phi(c_ie^{\lambda_it}, c_je^{\lambda_jt}) = q_{ij}e^{(\bar\lambda_i+\lambda_j)t}$.

$\Phi_1(\zeta,\eta) = \dfrac{q_{11}}{c_1^*c_1}\,I$
For $i = 1, 2, \dots, N-1$ do
  compute the MPUM $\operatorname{Ker} R_i(\frac{d}{dt})$ for $c_1e^{\lambda_1t}, c_2e^{\lambda_2t}, \dots, c_ie^{\lambda_it}$;
  compute the $i$-th stage error matrix $E_{i+1}$ from $E_{i+1}(\lambda_j) = A_j$, $j = 1,\dots,i+1$, as in equation (8.13);
  $\Phi_{i+1}(\zeta,\eta) = \Phi_i(\zeta,\eta) + E_{i+1}^\star(\zeta)R_i(\eta) + R_i^\star(\zeta)E_{i+1}(\eta)$
end
Return $\Phi(\zeta,\eta) = \Phi_N(\zeta,\eta)$

With reference to Theorem 8.3.1, let $F \in \mathbb{C}^{c\times c}[\zeta,\eta]$ and consider the two-variable polynomial matrix

$$\Upsilon(\zeta,\eta) = F(\zeta,\eta)R_N(\eta) + R_N^\star(\zeta)F^\star(\eta,\zeta) \qquad (8.14)$$

where $R_N$ is such that $\operatorname{Ker} R_N(\frac{d}{dt})$ is a MPUM for $c_ie^{\lambda_it}$, $i = 1,\dots,N$. Consider the action of the bilinear differential form $L_\Upsilon$ on the finite dimensional vector space $S$ spanned by $c_ie^{\lambda_it}$, $i = 1,\dots,N$. Clearly, $L_\Upsilon(v) = 0$ for all $v \in S$. Conversely, if $L_\Upsilon(v) = 0$ for all $v \in S$, then there exists an $F$ such that (8.14) holds ([103], Proposition 3.2, page 1712). We call BDFs $L_\Upsilon$ satisfying $L_\Upsilon(S) = 0$ zero along $S$. Since $L_\Upsilon$ is zero along $S$ if and only if it has the structure in equation (8.14), we see that every solution to the interpolation problem can be generated from a known solution $L_\Phi$ by adding such an $L_\Upsilon$ to it; i.e., $L_\Psi$ is a solution to the interpolation problem if and only if there exists $L_\Upsilon$ zero along $S$ such that

$$L_\Psi = L_\Phi + L_\Upsilon$$

We call the symmetric polynomial in equation (8.14) the "tail polynomial". What the "best" solution to the bilinear interpolation problem is will often depend on the choice of an appropriate tail polynomial.
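The scalar case of the recursion (used again in Application 8.4.2 below, where the $c_i$ are unity and $R_l = \prod_{i=1}^{l}(\xi-\lambda_i)$) is short enough to implement directly. The following sympy sketch is one possible realization of Theorem 8.3.1 for real scalar data, with the free term $F$ in (8.13) set to zero; the function and variable names are our own, and the input values are the scalar analogue of the data used in Example 8.4.1 below.

```python
import sympy as sp

ze, et, xi = sp.symbols('zeta eta xi')

def interpolate_bdf(lams, q):
    """Scalar, real-data version of the recursion in Theorem 8.3.1
    (c_i = 1, conjugates drop out). lams: distinct reals; q: symmetric values."""
    Phi = sp.sympify(q[0][0])                            # Phi_1 = q_11
    for l in range(1, len(lams)):
        R = sp.prod([xi - lams[i] for i in range(l)])    # MPUM for first l trajectories
        Rval = R.subs(xi, lams[l])                       # nonsingular since lams distinct
        pts = []
        for i in range(l + 1):                           # Lagrange data E(lam_i) = A_i
            alpha = 2 if i == l else 1
            Phi_val = Phi.subs({ze: lams[l], et: lams[i]})
            pts.append((lams[i], (q[l][i] - Phi_val) / (alpha * Rval)))
        E = sp.interpolate(pts, xi)                      # equation (8.13) with F = 0
        Phi = sp.expand(Phi + E.subs(xi, ze) * R.subs(xi, et)
                            + R.subs(xi, ze) * E.subs(xi, et))
    return Phi

Phi = interpolate_bdf([-1, -2], [[1, 2], [2, 3]])
print(Phi)                              # -zeta - eta - 1, a symmetric solution
print(Phi.subs({ze: -1, et: -2}))       # 2, as required by q_12
```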
Remark 8.3.2 Notice that our assumption $c_i^*c_j \neq 0$ for all $i,j$ results in simple interpolation conditions for the error matrix $E_{l+1}$. However, this condition is not a central assumption. Clearly, if $c_l = 0$ for some $l$, then the given scalars $q_{l1}, q_{l2},\dots,q_{ll}$ must have been $0$; the error matrix can in this case be assigned arbitrary values, and therefore no interpolation conditions need be imposed on it. Representations of the MPUM are not unique: if $\operatorname{Ker} R_l(\frac{d}{dt})$ is the MPUM for the first $l$ trajectories, then $\operatorname{Ker} U_l(\frac{d}{dt})R_l(\frac{d}{dt})$ is also the MPUM for the same set of trajectories if and only if $U_l$ is a unimodular matrix (i.e. $\det U_l$ is a nonzero constant). In order to have well defined interpolation conditions for the error matrix $E_{l+1}$, it is enough to ensure that the matrix $R_l$ represents the MPUM and satisfies

$$c_{l+1}^*R_l^\star(\bar\lambda_{l+1})c_i \neq 0, \quad i = 1,\dots,l+1 \qquad (8.15)$$

Under the above conditions, well defined interpolation conditions may be set up without inverting $R_l^\star(\bar\lambda_{l+1})$:

$$E_{l+1}(\lambda_i) = \frac{q_{(l+1)i} - c_{l+1}^*\Phi_l(\bar\lambda_{l+1},\lambda_i)c_i}{\alpha\,c_{l+1}^*R_l^\star(\bar\lambda_{l+1})c_i}\,I_c, \quad i = 1,\dots,l+1; \quad \alpha = 2 \text{ when } i = l+1, \text{ otherwise } \alpha = 1 \qquad (8.16)$$

Formulating the interpolation problem as in (8.16) may in fact be advantageous, since it does not involve explicit inversion of $R_l^\star(\bar\lambda_{l+1})$. The actual procedure to compute $R_l$, however, depends on the case at hand.

In the next section we demonstrate the use of the interpolation scheme of Theorem 8.3.1 on specific problems.

8.4 Examples and applications

8.4.1 Interpolation with BDFs

Example 8.4.1 Consider two trajectories $c_ie^{\lambda_it}$, $i = 1, 2$, with

$$\lambda_1 = -1,\ c_1 = \begin{bmatrix}1\\0\end{bmatrix}; \qquad \lambda_2 = -2,\ c_2 = \begin{bmatrix}1\\1\end{bmatrix}$$

Let $q_{11} = 1$, $q_{12} = q_{21} = 2$, $q_{22} = 3$. Since the data are real, we can compute a real $\Phi(\zeta,\eta)$ such that $L_\Phi(c_ie^{\lambda_it}, c_je^{\lambda_jt}) = q_{ij}e^{(\lambda_i+\lambda_j)t}$, $i,j = 1, 2$. We proceed to define

$$\Phi_1(\zeta,\eta) = \frac{q_{11}}{c_1^Tc_1}\,I = \begin{bmatrix}1&0\\0&1\end{bmatrix}$$

We update $\Phi_1$ as indicated below:

$$\Phi_2(\zeta,\eta) = \Phi_1(\zeta,\eta) + E_2^T(\zeta)R_1(\eta) + R_1^T(\zeta)E_2(\eta)$$

where $R_1(\xi)$ is such that $\operatorname{Ker} R_1(\frac{d}{dt})$ is the MPUM for $c_1e^{-t}$. Using the procedure outlined in Section 6.2.1, one possible choice for $R_1$ is

$$R_1(\xi) = \lambda_1 I - \frac{c_1c_1^T}{c_1^Tc_1}\,\xi = \begin{bmatrix}-1-\xi & 0\\0 & -1\end{bmatrix}$$

The matrix $E_2$ is chosen so as to satisfy the interpolation conditions

$$E_2(\lambda_2) = R_1^{-T}(\lambda_2)\,\frac{q_{22} - c_2^T\Phi_1(\lambda_2,\lambda_2)c_2}{2c_2^Tc_2}, \qquad E_2(\lambda_1) = R_1^{-T}(\lambda_2)\,\frac{q_{21} - c_2^T\Phi_1(\lambda_2,\lambda_1)c_1}{c_2^Tc_1}$$

Substituting the appropriate values, we see that

$$E_2(-2) = \frac{1}{4}\begin{bmatrix}1&0\\0&-1\end{bmatrix}, \qquad E_2(-1) = \begin{bmatrix}1&0\\0&-1\end{bmatrix} \qquad (8.17)$$

Solving a Lagrange interpolation problem, one possible choice for $E_2$ is

$$E_2(\xi) = \begin{bmatrix}\tfrac{3}{4}\xi + \tfrac{7}{4} & 0\\0 & -\tfrac{3}{4}\xi - \tfrac{7}{4}\end{bmatrix}$$

Thus, $\Phi_2(\zeta,\eta)$ is found to be

$$\Phi_2(\zeta,\eta) = \begin{bmatrix}1&0\\0&1\end{bmatrix} + \begin{bmatrix}\tfrac{3}{4}\zeta+\tfrac{7}{4}&0\\0&-\tfrac{3}{4}\zeta-\tfrac{7}{4}\end{bmatrix}\begin{bmatrix}-1-\eta&0\\0&-1\end{bmatrix} + \begin{bmatrix}-1-\zeta&0\\0&-1\end{bmatrix}\begin{bmatrix}\tfrac{3}{4}\eta+\tfrac{7}{4}&0\\0&-\tfrac{3}{4}\eta-\tfrac{7}{4}\end{bmatrix} = \begin{bmatrix}-\tfrac{5}{2}(1+\zeta+\eta)-\tfrac{3}{2}\zeta\eta & 0\\0 & \tfrac{9}{2}+\tfrac{3}{4}(\zeta+\eta)\end{bmatrix} \qquad (8.18)$$

That $\Phi_2(\zeta,\eta)$ is a solution can easily be checked by substitution.
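The check "by substitution" just mentioned can be carried out mechanically. The following sympy sketch verifies all four interpolation conditions of Example 8.4.1 for the $\Phi_2$ computed in (8.18).

```python
import sympy as sp

ze, et = sp.symbols('zeta eta')
# Phi_2 from equation (8.18)
Phi2 = sp.Matrix([
    [-sp.Rational(5, 2) * (1 + ze + et) - sp.Rational(3, 2) * ze * et, 0],
    [0, sp.Rational(9, 2) + sp.Rational(3, 4) * (ze + et)]])

data = [(-1, sp.Matrix([1, 0])), (-2, sp.Matrix([1, 1]))]
q = {(0, 0): 1, (0, 1): 2, (1, 0): 2, (1, 1): 3}
for i, (li, ci) in enumerate(data):
    for j, (lj, cj) in enumerate(data):
        val = (ci.T * Phi2.subs({ze: li, et: lj}) * cj)[0, 0]
        assert sp.simplify(val - q[(i, j)]) == 0   # c_i^T Phi_2(l_i, l_j) c_j = q_ij
print("all interpolation conditions hold")
```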
8.4.2 Application 1: Interpolation with scalar bivariate polynomials

Interpolation with scalar bivariate polynomials (often with additional conditions) has been researched at length, in [10, 23] among others. An immediate application of the algorithm presented in this chapter is a general recursive scheme for interpolation by symmetric scalar bivariate polynomials. Consider the interpolation problem defined in Theorem 8.3.1 in the scalar case, i.e. $c = 1$. Without loss of generality, the $c_i$ can now be assumed to be unity. We consider the following interpolation problem: given $N$ distinct real numbers $\lambda_1,\dots,\lambda_N$, together with real numbers $q_{ij}$, $i,j = 1,\dots,N$, determine a scalar symmetric bivariate polynomial $\Phi(\zeta,\eta)$ such that

$$\Phi(\lambda_i,\lambda_j) = q_{ij} \qquad (8.19)$$

As in the previous case, the symmetry conditions dictate that $q_{ij} = q_{ji}$, so that only $N(N+1)/2$ of the $q_{ij}$ can be specified independently. The fact that the $c_i$ are unity makes the computation of the MPUM trivial: the MPUM $\operatorname{Ker} R_l(\frac{d}{dt})$ for the $l$ trajectories $e^{\lambda_1t},\dots,e^{\lambda_lt}$ can be represented by $R_l = \prod_{i=1}^{l}(\xi-\lambda_i)$. Using this $R_l$ as a representation of the MPUM, Theorem 8.3.1 can be applied to the data $(c_ie^{\lambda_it}, q_{ij})$, $j = 1,\dots,i$, $i = 1,\dots,N$, to compute a $\Phi \in \mathbb{R}_s[\zeta,\eta]$ such that $\Phi(\lambda_i,\lambda_j) = q_{ij}$.

8.4.3 Application 2: Storage functions

We considered autonomous systems in Section 2.7. Recall that in autonomous systems, the future is entirely governed by the initial conditions. The problem considered in this section is the following: given an autonomous system with behavior $B$ and a QDF $Q_\Phi$, determine a QDF $Q_\Psi$ (if one exists) such that

$$\frac{d}{dt}Q_\Psi(w) \leq Q_\Phi(w) \quad \forall\, w \in B \qquad (8.20)$$

$Q_\Psi$ is called a storage function. Notice that this problem is in contrast to the computation of storage functions in Chapter 4, where the behaviors were assumed to be controllable. We begin by quoting the following important result ([103], Theorem 4.3, page 1713):

Theorem 8.4.2 Assume that $B = \{w \mid R(\frac{d}{dt})w = 0\}$ is autonomous and stable. Given any QDF $Q_\Phi$, $\Phi(\zeta,\eta) \in \mathbb{R}^{w\times w}[\zeta,\eta]$, there exists a QDF $Q_\Psi$ such that $\frac{d}{dt}Q_\Psi(w) = Q_\Phi(w)$ for all $w \in B$. Further, if $Q_\Phi(w) \leq 0$ for all $w \in B$, then $Q_\Psi(w) \geq 0$.

Theorem 8.4.2 is, essentially, Lyapunov theory for higher order systems. Computing a storage function for $B$ has been addressed in [103], Theorem 4.8, page 1715. The suggested solution scheme was through the solution of a certain Polynomial Lyapunov Equation (PLE), a solution scheme for which was subsequently reported in [58]. In this chapter, we present an alternative method for computing storage functions which does not involve solving the PLE. We first prove a result that relates the existence of storage functions to the sign-definiteness of a Hermitian matrix:

Lemma 8.4.3 Let $B := \{w \in C^\infty(\mathbb{R},\mathbb{R}^w) \mid R(\frac{d}{dt})w = 0\}$ be a finite dimensional behavior with basis $\{w_i = c_ie^{\lambda_it}\}$, $c_i \in \mathbb{C}^w$, $\lambda_i \in \mathbb{C}$ distinct, $i = 1,\dots,N$. Let $Q_\Phi$ be an arbitrary QDF with $\Phi(\zeta,\eta) \in \mathbb{R}^{w\times w}_s[\zeta,\eta]$. Then $\Psi(\zeta,\eta)$ satisfies

$$\frac{d}{dt}Q_\Psi(w) \leq Q_\Phi(w)$$

for all $w \in B$ if and only if the $N\times N$ Hermitian matrix $D = [d_{ij}]_{i,j=1}^N$ is negative semidefinite, where

$$d_{ij} = c_i^*\left[(\bar\lambda_i+\lambda_j)\Psi(\bar\lambda_i,\lambda_j) - \Phi(\bar\lambda_i,\lambda_j)\right]c_j \qquad (8.21)$$

Further, $\frac{d}{dt}Q_\Psi(w) = Q_\Phi(w)$ if and only if all $d_{ij} = 0$.

Proof: Consider an arbitrary trajectory $w \in B$. Then there exist $\alpha_1,\dots,\alpha_N \in \mathbb{C}$ such that $w = \sum_{i=1}^{N}\alpha_iw_i$. A given $\Psi(\zeta,\eta)$ satisfies the conditions stated in the lemma if and only if

$$\begin{bmatrix}\alpha_1^* & \cdots & \alpha_N^*\end{bmatrix}\,[d_{ij}]_{i,j=1}^N\,\begin{bmatrix}\alpha_1\\ \vdots\\ \alpha_N\end{bmatrix} \leq 0$$

Since $\alpha_1,\dots,\alpha_N$ are arbitrary, this holds if and only if the matrix $[d_{ij}]_{i,j=1}^N$ is negative semidefinite. □
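Lemma 8.4.3 turns the storage function question into a semidefiniteness test on a constant matrix. The sketch below builds the matrix $D$ of (8.21) for given $\Psi$, $\Phi$ and basis data; the one-variable data at the end is hypothetical, chosen only to exercise the test (with supply $Q_\Phi(w) = \dot w^2$, the zero QDF trivially stores, while $Q_\Psi(w) = w^2$ does not).

```python
import sympy as sp

ze, et = sp.symbols('zeta eta')

def storage_defect(Psi, Phi, data):
    """The Hermitian matrix D = [d_ij] of (8.21); Q_Psi is a storage function
    on B = span{c_i e^{lambda_i t}} iff D is negative semidefinite."""
    n = len(data)
    D = sp.zeros(n, n)
    for i, (li, ci) in enumerate(data):
        for j, (lj, cj) in enumerate(data):
            lb = sp.conjugate(li)
            term = (lb + lj) * Psi.subs({ze: lb, et: lj}) \
                   - Phi.subs({ze: lb, et: lj})
            D[i, j] = (ci.conjugate().T * term * cj)[0, 0]
    return sp.simplify(D)

data = [(-1, sp.Matrix([1])), (-2, sp.Matrix([1]))]   # B = span{e^{-t}, e^{-2t}}
Phi = sp.Matrix([[ze * et]])                          # Q_Phi(w) = (dw/dt)^2
print(storage_defect(sp.Matrix([[0]]), Phi, data).is_negative_semidefinite)  # True
print(storage_defect(sp.Matrix([[1]]), Phi, data).is_negative_semidefinite)  # False
```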
It is now shown that the interpolation approach developed in this chapter can be used to compute a storage function in a simple manner. Some computations are relatively trivial, and all of them are operations with constant matrices.

Theorem 8.4.4 Let $B := \{w \in C^\infty(\mathbb{R},\mathbb{R}^w) \mid R(\frac{d}{dt})w = 0\}$ be a finite dimensional behavior with basis $\{w_i = c_ie^{\lambda_it}\}$, $c_i \in \mathbb{C}^w$, $\lambda_i \in \mathbb{C}$ distinct, $c_i^*c_j \neq 0$, $i,j = 1,\dots,N$. Let $Q_\Phi$, $\Phi(\zeta,\eta) \in \mathbb{R}^{w\times w}_s[\zeta,\eta]$, be an arbitrary QDF. If $\lambda_i \notin i\mathbb{R}$, $i = 1,\dots,N$, there exists a QDF $Q_\Psi$ such that

$$\frac{d}{dt}Q_\Psi(w) \leq Q_\Phi(w) \quad \forall\, w \in B$$

Proof: We prove the theorem by considering two cases.

Case 1: $\bar\lambda_i + \lambda_j \neq 0$ for $i,j = 1,\dots,N$. A QDF $Q_\Psi$ qualifies as a storage function for $B$ if and only if the Hermitian matrix $[d_{ij}]_{i,j=1}^N$ of (8.21) is negative semidefinite. We show that there exists a $\Psi(\zeta,\eta)$ such that $[d_{ij}]_{i,j=1}^N$ can actually be made zero. Let us compute a $\Psi(\zeta,\eta)$ that satisfies

$$c_i^*\Psi(\bar\lambda_i,\lambda_j)c_j = c_i^*\,\frac{\Phi(\bar\lambda_i,\lambda_j)}{\bar\lambda_i+\lambda_j}\,c_j =: q_{ij} \qquad (8.22)$$

This is the standard interpolation problem considered in Section 8.3, and it has already been shown that a solution can be obtained recursively in $N$ iterations. Every solution to this interpolation problem yields a QDF $Q_\Psi$ such that $\frac{d}{dt}Q_\Psi = Q_\Phi$. Notice that we have not assumed $B$ to be stable.

Case 2: $\bar\lambda_a + \lambda_b = 0$ for some $1 \leq a, b \leq N$, $a \neq b$. The matrix $D$ defined by equation (8.21) now takes the form

$$D = \left[\,c_i^*\left[(\bar\lambda_i+\lambda_j)\Psi(\bar\lambda_i,\lambda_j) - \Phi(\bar\lambda_i,\lambda_j)\right]c_j\,\right]_{i,j=1}^{N} \qquad (8.23)$$

in which the entries at the positions $(a,b)$ and $(b,a)$ reduce to $-c_a^*\Phi(\bar\lambda_a,\lambda_b)c_b$ and $-c_b^*\Phi(\bar\lambda_b,\lambda_a)c_a$ respectively, since there the $\Psi$ term is multiplied by $\bar\lambda_a+\lambda_b = 0$. It is now shown that one can always find $\Psi(\zeta,\eta)$ such that $D$ is negative semidefinite. We use the following scheme: first assign interpolation conditions to the off-diagonal entries $(i,j)$, $i \neq j$, with $\bar\lambda_i+\lambda_j \neq 0$, so as to make them zero; then choose the diagonal entries suitably so as to make $D$ negative semidefinite. Thus, to start with, let

$$c_i^*\Psi(\bar\lambda_i,\lambda_j)c_j = c_i^*\,\frac{\Phi(\bar\lambda_i,\lambda_j)}{\bar\lambda_i+\lambda_j}\,c_j, \quad i \neq j \text{ and } \bar\lambda_i+\lambda_j \neq 0 \qquad (8.24)$$

$$c_i^*\Psi(\bar\lambda_i,\lambda_j)c_j = \text{arbitrary}, \quad i \neq j \text{ and } \bar\lambda_i+\lambda_j = 0 \qquad (8.25)$$

Note that with this assignment, $D$ has zeros at all off-diagonal positions $(i,j)$ with $\bar\lambda_i+\lambda_j \neq 0$. Further, since $c_i^*\Psi(\bar\lambda_i,\lambda_j)c_j$ can take arbitrary values when $\bar\lambda_i+\lambda_j = 0$, no interpolation conditions need be specified for such $\lambda_i, \lambda_j$. It is not difficult to see that $D$ can now be written as the sum of two matrices $\Lambda$ and $D'$:

$$D = \Lambda + D' \qquad (8.26)$$

with $\Lambda = \operatorname{diag}\left[c_i^*\left((\bar\lambda_i+\lambda_i)\Psi(\bar\lambda_i,\lambda_i) - \Phi(\bar\lambda_i,\lambda_i)\right)c_i\right]_{i=1}^{N}$. Here $D'$ is an $N\times N$ matrix which has zeros everywhere except at the positions $(a,b)$, $(b,a)$ for all $a \neq b$ such that $\bar\lambda_a+\lambda_b = 0$. Note that $D'$ is a Hermitian matrix and has all real eigenvalues. Denote by $\gamma$ the largest eigenvalue of $D'$. Clearly, $\Lambda$ can always be chosen such that $\Lambda + D'$ is negative semidefinite; one possible solution is $\Lambda = \alpha I_N$ with $\alpha \leq -\gamma$. Thus, along with the conditions given in (8.24) and (8.25), set

$$c_i^*\Psi(\bar\lambda_i,\lambda_i)c_i = \frac{c_i^*\Phi(\bar\lambda_i,\lambda_i)c_i + \alpha}{\bar\lambda_i+\lambda_i}, \quad i = 1,\dots,N \qquad (8.27)$$

where $\alpha \leq -\max\operatorname{spec} D'$. The matrix $\Psi(\zeta,\eta)$ can now be determined from these interpolation conditions using the recursive algorithm of Theorem 8.3.1. The QDF $Q_\Psi$ is then a storage function for $B$, since the matrix $D$ of (8.21) is now negative semidefinite. □

Remark 8.4.5 Notice that if in Theorem 8.4.4 some $\lambda_i \in i\mathbb{R}$, it is necessary that $c_i^*\Phi(\bar\lambda_i,\lambda_i)c_i = 0$ for the existence of a $Q_\Psi$ such that $\frac{d}{dt}Q_\Psi(w) = Q_\Phi(w)$, and that $c_i^*\Phi(\bar\lambda_i,\lambda_i)c_i \geq 0$ for the existence of a $Q_\Psi$ such that $\frac{d}{dt}Q_\Psi(w) \leq Q_\Phi(w)$ along all trajectories $w$ in $B$. In other words, if for some $i$, $1 \leq i \leq N$, $\lambda_i \in i\mathbb{R}$ and $c_i^*\Phi(\bar\lambda_i,\lambda_i)c_i < 0$, there cannot exist a $Q_\Psi$ such that $\frac{d}{dt}Q_\Psi(w) \leq Q_\Phi(w)$ along all $w \in B$. Moreover, notice that if the above necessary condition is satisfied, $c_i^*\Psi(\bar\lambda_i,\lambda_i)c_i$ can be assigned any arbitrary value.

Remark 8.4.6 In the light of Remark 8.4.5 and Theorem 8.4.4, it follows that given $B = \operatorname{span}\{c_ie^{\lambda_it}\}$, $\lambda_i$ distinct, $c_i^*c_j \neq 0$, $i,j = 1,\dots,N$, and a QDF $Q_\Phi$, there exists a QDF $Q_\Psi$ such that $\frac{d}{dt}Q_\Psi(w) = Q_\Phi(w)$ for all $w \in B$ if and only if $c_i^*\Phi(\bar\lambda_i,\lambda_j)c_j = 0$ whenever $\bar\lambda_i+\lambda_j = 0$. If this holds, the corresponding $c_i^*\Psi(\bar\lambda_i,\lambda_j)c_j$ can be assigned arbitrary values, and hence no interpolation conditions need be imposed there.
Example 8.4.7 Consider the behavior $B$ defined as the set of solutions $w = [w_1\ w_2]^T$ of the system of equations $R(\frac{d}{dt})w = 0$, with

$$R\!\left(\frac{d}{dt}\right) = \begin{bmatrix} 4 + 5\frac{d}{dt} + \frac{d^2}{dt^2} & -\frac{d}{dt} \\[2pt] -\frac{d}{dt} - \frac{d^2}{dt^2} & 4 + \frac{d}{dt} \end{bmatrix}$$

It is easy to see that $B$ is finite dimensional, since $\det R(\xi) = 8(\xi+1)(\xi+2)$. A basis for $B$ is found to be $c_ie^{\lambda_it}$, $i = 1, 2$, where

$$\lambda_1 = -1,\ c_1 = \begin{bmatrix}1\\0\end{bmatrix} \quad\text{and}\quad \lambda_2 = -2,\ c_2 = \begin{bmatrix}1\\1\end{bmatrix}$$

Let us compute a storage function for $B$ with respect to the supply function

$$Q_\Phi(w) = -(2w_1^2 + 8w_1w_2 + 2w_2^2)$$

Then $\Phi = -\begin{bmatrix}2&4\\4&2\end{bmatrix}$. Let $Q_\Psi$ be a storage function for $B$ with respect to $Q_\Phi$. For it to be a storage function, $\Psi(\zeta,\eta)$ must satisfy the following interpolation conditions (cf. (8.22)):

$$c_1^T\Psi(-1,-1)c_1 = \frac{-c_1^T\begin{bmatrix}2&4\\4&2\end{bmatrix}c_1}{-1-1} = 1$$

$$c_1^T\Psi(-1,-2)c_2 = \frac{-c_1^T\begin{bmatrix}2&4\\4&2\end{bmatrix}c_2}{-1-2} = 2$$

$$c_2^T\Psi(-2,-2)c_2 = \frac{-c_2^T\begin{bmatrix}2&4\\4&2\end{bmatrix}c_2}{-2-2} = 3$$

We can now solve the above interpolation problem. Notice that these are precisely the interpolation conditions considered in Example 8.4.1; therefore, from Example 8.4.1 we know that $\Psi(\zeta,\eta)$ can be taken to be

$$\Psi(\zeta,\eta) = \begin{bmatrix} -\tfrac{5}{2}(1+\zeta+\eta) - \tfrac{3}{2}\zeta\eta & 0 \\ 0 & \tfrac{9}{2} + \tfrac{3}{4}(\zeta+\eta) \end{bmatrix}$$

Then $Q_\Psi$ is such that $\frac{d}{dt}Q_\Psi(w) = Q_\Phi(w)$ for all $w \in B$.
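Example 8.4.7 can also be confirmed in the time domain by expanding the QDFs along a generic trajectory of $B$. The sympy sketch below does this; the helper `qdf` is our own construction for evaluating a QDF along a trajectory, not notation from the thesis.

```python
import sympy as sp

t, a, b = sp.symbols('t alpha beta', real=True)
ze, et = sp.symbols('zeta eta')

Psi = sp.Matrix([[-sp.Rational(5, 2) * (1 + ze + et) - sp.Rational(3, 2) * ze * et, 0],
                 [0, sp.Rational(9, 2) + sp.Rational(3, 4) * (ze + et)]])
Phi = -sp.Matrix([[2, 4], [4, 2]])

def qdf(P, w):
    """Evaluate Q_P(w)(t) for a two-variable polynomial matrix P(zeta, eta):
    sum over entries and powers of (d^k w_r/dt^k) P_kl[r,c] (d^l w_c/dt^l)."""
    val = sp.S.Zero
    for r in range(P.rows):
        for c in range(P.cols):
            for (k, l), coeff in sp.Poly(P[r, c], ze, et).terms():
                val += coeff * sp.diff(w[r], t, k) * sp.diff(w[c], t, l)
    return val

# Generic trajectory in B = span{[1,0]^T e^{-t}, [1,1]^T e^{-2t}}
w = sp.Matrix([a * sp.exp(-t) + b * sp.exp(-2 * t), b * sp.exp(-2 * t)])
print(sp.simplify(sp.diff(qdf(Psi, w), t) - qdf(Phi, w)))   # 0 for all alpha, beta
```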
Remark 8.4.8 The aim of this remark is to show that the algorithm in Theorem 8.3.1 may not give an "optimal" solution. Indeed, in Examples 8.4.1 and 8.4.7 we have $2\times 2$ matrices with three interpolation conditions. Without much difficulty, it can be seen that the interpolation conditions in these examples can be met with a constant (rather than bivariate polynomial) matrix: if we define

$$K = \begin{bmatrix}1&1\\1&0\end{bmatrix}$$

then $L_K$ is a simpler solution to both examples. However, this is not the case in general: had we imposed one more interpolation condition in Example 8.4.1, for instance, it is easy to see that the problem would no longer admit a "constant" solution. Every solution to the interpolation problem can be generated from a known solution by adding a tail polynomial (see equation (8.14)). Thus, how "simple" or "complex" a solution is obtained depends, among other things, on the choice of the tail polynomial.

Remark 8.4.9 Remark 8.4.8 leads us to the very reasonable question of what an "optimal" solution to the interpolation problem is. A quantification of complexity for bilinear forms is an interesting problem; the notion of complexity could depend, for instance, on the least "total weighted degree", the least "effective size", the least "number of parameters", and so on.

Remark 8.4.10 It can be shown that the interpolation scheme developed in this chapter can easily be used for interpolation with quadratic difference forms [41]. Consider as data finitely many discrete time sequences $\{v_ia_i^n\}_{n\in\mathbb{Z}}$ and constants $q_{ij}$. One can formulate the problem of constructing a bilinear difference form that interpolates the data in a way analogous to the continuous time case, and Section 8.3 can be suitably modified to address the discrete time interpolation case.

8.5 Conclusion

In this chapter we have developed a recursive algorithm for interpolation with bilinear forms using the algebra of two-variable polynomial matrices. The investigation is primarily motivated by the exact modeling problem; however, we have shown that it also has other applications. An interesting application of this algorithm is the computation of storage functions for finite dimensional dynamical systems, which is a generalization of Lyapunov theory. We have shown how to compute a storage function even when the Lyapunov operator is singular, in which case conventional methods of solution generally fail. We believe that the results presented in this chapter are only a starting point for a systematic investigation into a number of interesting problems, notably a quantification of the "complexity" of bilinear (differential) forms. We have shown that the interpolation scheme may not yield an "optimal" solution to the interpolation problem. Indeed, we believe that what is optimal may depend on the application at hand; however, an investigation into issues regarding complexity, optimality and simplicity of solutions should yield good insights and applications.

Chapter 9

Nevanlinna-Pick interpolation

9.1 Introduction

In Chapter 8 we addressed interpolation with BDFs. Interpolation with rational functions is another important problem, finding applications in realization theory, sensitivity minimization, model reduction, robust stabilization, etc. A detailed treatment of the theory of interpolation with rational functions, along with some applications, can be found in [9] and the references therein. A class of interpolation problems in Hardy spaces consists of computing an analytic function that satisfies given interpolation conditions along with a norm constraint. The Nevanlinna-Pick (NP) interpolation problem is one of the most important interpolation problems in this class. It has found numerous applications in model approximation, robust stabilization, the model matching problem in $H_\infty$ control [44], and circuit theory [110], among others. Very recently, Antoulas [6] and Sorensen [93] applied concepts from Nevanlinna-Pick interpolation to the problem of "passivity preserving model reduction". We now state the classical NP problem: given $N$ pairs of complex numbers $(\lambda_i, b_i)$, with $\lambda_i \in \mathbb{C}_+$, the open right half complex plane, and $|b_i| < 1$, $i = 1,\dots,N$, compute a scalar rational function $G(s)$ such that

1. $G(\lambda_i) = b_i$, $i = 1,\dots,N$;
2. $G(s)$ is analytic in $\mathbb{C}_+$;
3. $\sup_{\omega\in\mathbb{R}}|G(i\omega)| < 1$.

Several variants of the above problem have been studied, with various assumptions on the data [9]. The scalar and matrix versions, the tangential NP problem with (simple) multiplicities, the two-sided Nudelman problem and the Subspace Nevanlinna Pick Interpolation Problem (SNIP) [84] are some variants and generalizations of the classical NP problem. Basic to all these variants is the assumption of a "frequency independent norm" that the interpolating (rational) function must satisfy; in other words, it is assumed that the norm inequality satisfied by the interpolating rational function is the same everywhere on the imaginary axis. In this chapter, we re-examine various aspects of the NP interpolation problem in the behavioral framework. We show that concepts from behavioral theory can be conveniently married to the concepts behind NP-like problems to yield generalizations in several directions. We show that the classical NP interpolation problem can be recast as a problem concerning dissipative systems, which we considered in considerable detail in Chapter 3 of this thesis. In such a setting, the analyticity and norm conditions on the interpolant can be examined separately. This formulation also allows us to consider a generalized interpolation problem wherein the required interpolant satisfies a "frequency-weighted" norm condition. The results reported in this chapter were obtained in collaboration with Dr. Paolo Rapisarda, who is currently with the Department of Electronics and Computer Science, University of Southampton, UK; these results will appear in [70]. In Section 6.2.1 we considered the behavioral view of linear, time-invariant models.
Recall that the most powerful unfalsified LTI model (MPUM) for a given set of trajectories (called the data) is a model that explains every trajectory in the set, and as little else as possible. The MPUM for a finite number of vector exponential trajectories is an autonomous behavior. However, there also exist other models for the data that are less restrictive. We now define models that are controllable behaviors:

Definition 9.1.1 Consider the data set $D = \{V_ie^{\lambda_it},\ i = 1,\dots,N\}$. We call a matrix $F(\xi) \in \mathbb{C}^{v\times v}[\xi]$ a representation of a model for $D$ in the generative sense if

1. $\operatorname{Im} F(\frac{d}{dt})e^{\lambda_it} = V_ie^{\lambda_it}$, $i = 1,\dots,N$;
2. $\operatorname{Im} F(\frac{d}{dt})e^{\mu t} = We^{\mu t}$, where $\operatorname{Im} W = \mathbb{C}^v$, if $\mu \neq \lambda_i$.

We will consider the NP interpolation problem in two steps. First we address the case where the interpolant satisfies a fixed norm condition at all frequencies; later, we address the problem of an interpolant satisfying a frequency-weighted norm condition.

9.2 Nevanlinna Pick interpolation – the standard case

We begin by translating the classical Nevanlinna Pick interpolation problem, as given in Section 9.1, into the language of behavioral systems theory, and we address some issues that arise out of such a formulation. In this chapter we will only consider the scalar interpolation problem, and hence we consider $B \in L^2_{\mathrm{con}}$. A behavioral formulation of Nevanlinna-Pick interpolation has been reported in [84], where a characterization of the solutions of a "Subspace Nevanlinna Interpolation Problem" was obtained in terms of kernel representations. In this chapter we obtain a characterization of all solutions in terms of image representations. Such a characterization has an advantage over [84]: controllability of all solutions obtained as image representations is guaranteed, unlike in a characterization in terms of kernel representations. Consider a controllable behavior $B$ defined by an observable image representation:

$$\begin{bmatrix}u\\y\end{bmatrix} = \begin{bmatrix}q(\frac{d}{dt})\\p(\frac{d}{dt})\end{bmatrix}\ell \qquad (9.1)$$

Define the $2\times 2$ constant matrix $J_{11}$ as

$$J_{11} = \begin{bmatrix}1&0\\0&-1\end{bmatrix} \qquad (9.2)$$

and let

$$J^{\mathrm{strict}}_\epsilon = \begin{bmatrix}1-\epsilon&0\\0&-1\end{bmatrix}, \quad \epsilon \in (0,1)$$

Then we have the following lemma:

Lemma 9.2.1 $\sup_{\omega\in\mathbb{R}}\left|\frac{p(i\omega)}{q(i\omega)}\right| < 1$ if and only if the behavior $B$ associated with the rational function $p(\xi)/q(\xi)$ is $J^{\mathrm{strict}}_\epsilon$-dissipative for some $\epsilon \in (0,1)$.

Proof: Follows by applying Theorem 3.2.3. □

Lemma 9.2.1 serves as the connection between the classical (rational function based) formulation and a behavioral formulation of the NP problem.

Behavioral Nevanlinna Pick Interpolation (Problem statement): Consider $N$ trajectories $\{V_ie^{\lambda_it}\}$, $i = 1,\dots,N$, in $C^\infty(\mathbb{R},\mathbb{C}^2)$ (which we name the data set $D$). Assume that

1. $\lambda_i \in \mathbb{C}_+$, $i = 1,\dots,N$, are distinct;
2. $V_i \in \mathbb{C}^2$ are contractive, i.e. $V_i = [x\ y]^T$ with $|x| > |y|$.

Under the above assumptions, determine all $J^{\mathrm{strict}}_\epsilon$-dissipative controllable behaviors $B$ (i.e., all behaviors $B$ that are $J^{\mathrm{strict}}_\epsilon$-dissipative for some $\epsilon \in (0,1)$) defined by the kernel representation

$$\begin{bmatrix}p(\tfrac{d}{dt}) & -q(\tfrac{d}{dt})\end{bmatrix}\begin{bmatrix}u\\y\end{bmatrix} = 0$$

such that:

1. $q(\xi)$ is a Hurwitz polynomial.
2. $B$ contains $D$, i.e. $\begin{bmatrix}p(\frac{d}{dt}) & -q(\frac{d}{dt})\end{bmatrix}V_ie^{\lambda_it} = 0$, $i = 1,\dots,N$.

It is evident from condition (2) that $\operatorname{Im}\begin{bmatrix}q(\lambda_i)\\p(\lambda_i)\end{bmatrix} \supseteq V_i$, $i = 1,\dots,N$. Thus, the problem is actually that of interpolating $N$ distinct subspaces $V_ie^{\lambda_it}$, $i = 1,\dots,N$; we say that $B$ "interpolates the data" if it contains the trajectories $V_ie^{\lambda_it}$. Any behavior $B$ that satisfies conditions (1) and (2) will be called a solution to the "Subspace Nevanlinna Pick Interpolation Problem (SNIP)".
Clearly, the interesting cases arise only when the $V_i$ are distinct, in which case solutions to the SNIP will in general be non-constant. This fact, together with the condition that $q(\xi)$ be Hurwitz, implies that a $J^{\mathrm{strict}}_\epsilon$-dissipative behavior $B$ which is a solution to the SNIP has positive definite storage functions on manifest variables (see Chapter 4). Therefore, the problem statement given above may be stated in an equivalent fashion:

Equivalent problem statement: Consider $N$ trajectories $D = \{V_ie^{\lambda_it},\ i = 1,\dots,N\} \subset C^\infty(\mathbb{R},\mathbb{C}^2)$. Assume that

1. $\lambda_i \in \mathbb{C}_+$, $i = 1,\dots,N$, and are distinct;
2. $V_i \in \mathbb{C}^2$ are contractive, i.e. $V_i = [x\ y]^T$ with $|x| > |y|$.

Under these assumptions, determine all $J^{\mathrm{strict}}_\epsilon$-dissipative behaviors $B$ associated with a rational function $p(\xi)/q(\xi)$ such that:

1. $B$ has positive definite storage functions on manifest variables.
2. $B$ contains $D$, i.e. $\operatorname{Im}\begin{bmatrix}q(\frac{d}{dt})\\p(\frac{d}{dt})\end{bmatrix}e^{\lambda_it} \supseteq V_ie^{\lambda_it}$, $i = 1,\dots,N$.

We now recall the definition of the "Pick matrix" as given in Section 6.2.2. A Pick matrix of $Q_{J_{11}}$ and $V_ie^{\lambda_it}$, $i = 1,\dots,N$, is defined as the $N\times N$ Hermitian matrix

$$T_{\{V_i\}_{i=1,\dots,N}} = \left[\frac{V_i^*J_{11}V_j}{\bar\lambda_i+\lambda_j}\right]_{i,j=1}^{N} \qquad (9.3)$$

where $V_i \in \mathbb{C}^2$ is a basis vector for the subspace $V_i$. A Pick matrix is obviously not unique, since a different basis for $V_i$ gives a different Pick matrix. However, it can be shown that properties of $T_{\{V_i\}_{i=1,\dots,N}}$ such as its signature, sign-definiteness and (non)singularity remain invariant under a change of basis of the subspaces $V_i$. While describing these properties, we shall refer to $T_{\{V_i\}_{i=1,\dots,N}}$ as the Pick matrix, with a slight abuse of notation. We now use the notion of the "dual" set of the data, previously defined in Section 6.2.2, to simplify the solution of the NP problem.

9.2.1 Dualizing of the data

Given subspaces $V_ie^{\lambda_it}$, $i = 1,\dots,N$, with $V_i \in \mathbb{C}^2$ contractive, define a "data set" $D$ as

$$D = \{V_1e^{\lambda_1t}, V_2e^{\lambda_2t}, \dots, V_Ne^{\lambda_Nt}\}$$

Since we are solving the SNIP for real-valued rational functions, we want the data to respect this condition; henceforth, we assume that the data set $D$ is self-conjugate. Recall from Section 6.2.2 that the "dual subspace" of $V_ie^{\lambda_it}$ is defined as $V_i^{\perp_{J_{11}}}e^{-\bar\lambda_it}$, where

$$V_i^{\perp_{J_{11}}} = \{v \in \mathbb{C}^2 \mid v^*J_{11}w = 0\ \ \forall\, w \in V_i\}$$

Since $V_i$ is contractive, $V_i^{\perp_{J_{11}}}$ is uniquely defined, and in fact the two are complements of one another in $\mathbb{C}^2$. The dual of the data set $D$ is denoted $D^{\perp_{J_{11}}}$ and is defined in the obvious way:

$$D^{\perp_{J_{11}}} = \{V_1^{\perp_{J_{11}}}e^{-\bar\lambda_1t}, V_2^{\perp_{J_{11}}}e^{-\bar\lambda_2t}, \dots, V_N^{\perp_{J_{11}}}e^{-\bar\lambda_Nt}\}$$

Dualizing the subspaces $V_ie^{\lambda_it}$ is instrumental in determining a characterization of all solutions to the SNIP. We quote the following result from [82] (Theorem 8.3.1, page 148):

Proposition 9.2.2 Consider the data $D$. The following statements are equivalent:

1. The Pick matrix $T_{\{V_i\}_{i=1,\dots,N}}$ is positive definite.
2. There exists a solution to the SNIP.

Further, there exists a matrix $R(\xi)$ such that $\operatorname{Ker} R(\frac{d}{dt})$ is the MPUM for $D \cup D^{\perp_{J_{11}}}$, and $R(\xi)$ is $J_{11}$-unitary: $R^T(-\xi)J_{11}R(\xi) = r(\xi)r(-\xi)J_{11}$. The behavior $B := \operatorname{Ker}[p(\frac{d}{dt})\ -q(\frac{d}{dt})]$ is a solution to the SNIP if and only if $p(\xi), q(\xi)$ are coprime and there exists a Hurwitz $f(\xi) \in \mathbb{R}[\xi]$ such that

$$f(\xi)\begin{bmatrix}p(\xi) & -q(\xi)\end{bmatrix} = \begin{bmatrix}\pi(\xi) & -\phi(\xi)\end{bmatrix}R(\xi)$$

with $\|\pi/\phi\|_{H_\infty} < 1$.

Note that Proposition 9.2.2 gives a characterization of all solutions to the SNIP in terms of kernel representations.
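Proposition 9.2.2 makes the Pick matrix test (9.3) completely constructive. A minimal numpy sketch follows; the data (real $\lambda_i$, real contractive $V_i$) is hypothetical and self-conjugate by construction.

```python
import numpy as np

J = np.diag([1.0, -1.0])                       # J_11

def pick_matrix(lams, Vs):
    """T[i, j] = V_i^* J V_j / (conj(lambda_i) + lambda_j), equation (9.3)."""
    N = len(lams)
    T = np.empty((N, N), dtype=complex)
    for i in range(N):
        for j in range(N):
            T[i, j] = (Vs[i].conj() @ J @ Vs[j]) / (np.conj(lams[i]) + lams[j])
    return T

lams = [1.0, 3.0]                              # lambda_i in the open right half plane
Vs = [np.array([1.0, 0.2]),                    # contractive: |x| > |y|
      np.array([1.0, -0.2])]
T = pick_matrix(lams, Vs)
print(np.linalg.eigvalsh(T) > 0)               # all True <=> the SNIP is solvable
```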
We now re-formulate the characterization of all solutions to the SNIP given in Proposition 9.2.2 using image representations:

Lemma 9.2.3 Consider $R(\tfrac{d}{dt})$ as in Proposition 9.2.2. Define $F(\xi) = \mathrm{adj}\, R(\xi)$, i.e., $F(\xi) R(\xi) = \det R(\xi)\, I_2$. Then $\mathrm{Im}\, F(\lambda_i) = \mathcal{V}_i$, $\mathrm{Im}\, F(-\bar{\lambda}_i) = \mathcal{V}_i^{\perp J_{11}}$, and $\mathrm{Im}\, F(\mu) = \mathbb{C}^2$ if $\mu \neq \lambda_i, -\bar{\lambda}_i$, $i = 1, \ldots, N$.

Proof: Note that $\det R(\xi)$ has roots at $\lambda_i$ and $-\bar{\lambda}_i$. Therefore, $R(\lambda_i) F(\lambda_i) = 0$ and $R(-\bar{\lambda}_i) F(-\bar{\lambda}_i) = 0$. Since $\ker R(\tfrac{d}{dt})$ is the MPUM, we have $\mathrm{Im}\, F(\lambda_i) = \mathcal{V}_i$ and $\mathrm{Im}\, F(-\bar{\lambda}_i) = \mathcal{V}_i^{\perp J_{11}}$.

Following the above lemma, we restate Proposition 9.2.2:

Proposition 9.2.4 Let $B := \mathrm{Im} \begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix}$. A solution $B$ to the SNIP with data $D$ exists if and only if $T_{\{\mathcal{V}_i\}_{i=1,\ldots,N}}$ is positive definite. In that case, every solution to the SNIP is characterized by
$$f(\xi) \begin{bmatrix} q(\xi) \\ p(\xi) \end{bmatrix} = F(\xi) \begin{bmatrix} \phi(\xi) \\ \pi(\xi) \end{bmatrix}$$
where $\|\pi/\phi\|_{H_\infty} < 1$, $f(\xi)$ is Hurwitz, $\mathrm{Im}\, F(\lambda_i) = \mathcal{V}_i$ and $\mathrm{Im}\, F(-\bar{\lambda}_i) = \mathcal{V}_i^{\perp J_{11}}$.

Remark 9.2.5 Proposition 9.2.4 is only a restatement of Proposition 9.2.2 in terms of image representations. We will see that Proposition 9.2.4 allows us to generalize NP interpolation problems to a considerable extent. An important advantage of converting the SNIP from a kernel representation to an image representation is that controllability is thereby guaranteed. Such a re-formulation also explains why dualizing the data is crucial in solving the Nevanlinna-Pick interpolation problem. Notice that in the solution suggested above, we have considered the data set $D \cup D^{\perp J_{11}}$. This is in apparent disagreement with the problem statement of the SNIP, which asked for all interpolating behaviors for $D$ alone. The following section addresses the necessity of considering $D \cup D^{\perp J_{11}}$ rather than $D$ alone, and explains how this consideration still yields all solutions to the SNIP.

9.3 System theoretic implications of dualizing the data

In this section, we give a justification for dualizing the data. We start by considering a hypothetical situation in which the data has not been dualized. Let, as before, $D = \{V_i e^{\lambda_i t}\}$ be the data set and $\mathrm{Im}\, F(\tfrac{d}{dt})$ be a model for $D$, i.e. $\mathrm{Im}\, F(\lambda_i) = \mathcal{V}_i$ and $\mathrm{Im}\, F(\mu) = \mathbb{C}^2$, $\mu \neq \lambda_i$. We emphasize that $D$ respects the existence of a real valued solution to the SNIP, i.e., the data is self-conjugate. With this assumption, $D$ admits a model (in the generative sense) that is real valued. Then, the following lemma is easily proved:

Lemma 9.3.1 Let $D = \{V_i e^{\lambda_i t},\ i = 1, \ldots, N\}$. Let $F(\tfrac{d}{dt})$ be a model for $D$ in the generative sense. Consider polynomials $p(\xi), q(\xi) \in \mathbb{R}[\xi]$. Then
$$\mathrm{Im} \begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix} e^{\lambda_i t} = V_i e^{\lambda_i t}$$
if and only if there exist coprime $r(\xi), s(\xi) \in \mathbb{R}[\xi]$ such that:
$$\begin{bmatrix} q(\xi) \\ p(\xi) \end{bmatrix} = F(\xi) \begin{bmatrix} r(\xi) \\ s(\xi) \end{bmatrix}$$

Proof: Follows from the fact that $\mathrm{Im}\, F(\lambda_i) = \mathcal{V}_i$, which in the scalar case are one dimensional.

Lemma 9.3.1 gives a characterization of all controllable behaviors that interpolate given subspaces. At this juncture, when all possible interpolants have been characterized, we bring in the additional condition of dissipativity. Consider the two variable polynomial matrix $\Phi(\zeta, \eta)$ defined as
$$\Phi(\zeta, \eta) = F^T(\zeta) J_{11} F(\eta)$$
The following proposition relates $\Phi$-dissipativity with solutions to the SNIP:

Proposition 9.3.2 Consider the set $\mathcal{L}^{\Phi}$ and $B_0 \in \mathcal{L}^{\Phi}$. Then $B := F(\tfrac{d}{dt})(B_0)$ is a $J_{11}$-dissipative behavior that interpolates $D := \{V_i e^{\lambda_i t},\ i = 1, \ldots, N\}$. Moreover, for every $J_{11}$-dissipative behavior $B$ that interpolates $D$ there exists a corresponding $\Phi$-dissipative behavior $B_0$.

Proof: We have seen in Chapters 3 and 4 of this thesis that $F(\tfrac{d}{dt})$ can be thought of as a differential operator that maps every $\Phi$-dissipative behavior into a $J_{11}$-dissipative behavior. Moreover, it is "invertible", i.e. for every $J_{11}$-dissipative behavior there exists a corresponding $\Phi$-dissipative behavior, and vice versa. $B$ interpolates $D$ since $\mathrm{Im}\, F(\lambda_i) = \mathcal{V}_i$. Further, such a $B$ is $J_{11}$-dissipative if and only if it is obtained as the image of a $\Phi$-dissipative behavior under the map $F(\tfrac{d}{dt})$.
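To make Lemma 9.3.1 and the construction of $\Phi$ concrete, here is a small sympy sketch; the generative model $F$, the interpolation data and the pair $(r, s)$ below are toy choices of ours, not a worked example from the thesis.

    # Lemma 9.3.1 on a toy example: every interpolant [q; p] arises as F * [r; s].
    import sympy as sp

    xi = sp.symbols('xi')
    lam = 1                                  # toy interpolation frequency in C_+
    V = sp.Matrix([1, sp.Rational(1, 2)])    # contractive direction spanning V_1

    # A generative model for the (un-dualized) data: Im F(lam) = span{V},
    # and Im F(mu) = C^2 for mu != lam (det F = xi - lam).
    F = sp.Matrix([[1, 0],
                   [sp.Rational(1, 2), xi - lam]])

    # Pick any coprime pair (r, s); the corresponding interpolant is F * [r; s]:
    r, s = xi + 2, sp.Integer(1)
    qp = (F * sp.Matrix([r, s])).expand()
    q, p = qp[0], qp[1]

    # Interpolation check: [q(lam); p(lam)] is proportional to V.
    assert sp.simplify(q.subs(xi, lam) * V[1] - p.subs(xi, lam) * V[0]) == 0

    # The induced two-variable matrix Phi(zeta, eta) = F(zeta)^T J11 F(eta):
    zeta, eta = sp.symbols('zeta eta')
    J11 = sp.diag(1, -1)
    Phi = (F.subs(xi, zeta).T * J11 * F.subs(xi, eta)).expand()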
Thus, in principle, given any representation of a model for $D$ (in the generative sense), one may construct a QDF $Q_\Phi$ from this representation. If the set of $\Phi$-dissipative behaviors is "known", the set of all $J_{11}$-dissipative behaviors that interpolate $D$ can be determined. One sees immediately that $Q_\Phi$ is a fairly general supply function, and no easy characterization is available for the set of $\Phi$-dissipative behaviors. Thus, determining the set of $\Phi$-dissipative behaviors is arguably a difficult task in general. We therefore try to make $Q_\Phi$ "as simple as possible" so that the set of $\Phi$-dissipative behaviors is "known". Dualizing the data does just this, as we show below. We will try to make $Q_\Phi$ "like $Q_{J_{11}}$". The following result is a consequence of trying to make $Q_\Phi$ "simple":

Theorem 9.3.3 The matrix $F(\xi)$ satisfies
$$F^T(-\xi) J_{11} F(\xi) = r(\xi) r(-\xi)\, U^T(-\xi) J_{11} U(\xi)$$
with $U(\xi)$ unimodular if and only if the columns of $F(-\bar{\lambda}_i)$ are $J_{11}$-orthogonal to the columns of $F(\lambda_i)$, i.e. $F^T(-\bar{\lambda}_i) J_{11} F(\lambda_i) = 0$ for all $\lambda_i$ that satisfy $\det F(\lambda_i) = 0$.

Proof: Since $F(\xi) \in \mathbb{R}^{2 \times 2}[\xi]$, it follows that if $\lambda_i \in \mathbb{C}$ is a root of $\det F(\xi) = 0$, then $\bar{\lambda}_i$ is also a root of $\det F(\xi) = 0$. Assume $F^T(-\bar{\lambda}_i) J_{11} F(\lambda_i) = 0$. Since $F(\xi)$ is nonsingular, $F^T(-\xi) J_{11} F(\xi)$ is a nonzero polynomial matrix. Hence, there exists a polynomial $r(\xi) \in \mathbb{R}[\xi]$ (having roots at $\lambda_i$) such that $r(\xi) r(-\xi)$ divides $F^T(-\xi) J_{11} F(\xi)$. Define $Z(\xi) = F^T(-\xi) J_{11} F(\xi) / r(\xi) r(-\xi)$. This is a unimodular matrix having the same inertia as $J_{11}$ for almost all $\xi \in i\mathbb{R}$. We compute a $J_{11}$-spectral factorization of $Z(\xi)$ and obtain a unimodular matrix $U(\xi)$ such that $Z(\xi) = U^T(-\xi) J_{11} U(\xi)$. Hence $F^T(-\xi) J_{11} F(\xi) = r(\xi) r(-\xi)\, U^T(-\xi) J_{11} U(\xi)$. The converse of the statement is trivial.

Theorem 9.3.3 shows that such an $F(\xi)$ is a model for a dualized data set in the generative sense.

Remark 9.3.4 The interesting feature of the Nevanlinna-Pick algorithm is that there exists a representation of a model for $D \cup D^{\perp J_{11}}$ with $U(\xi)$ (see Theorem 9.3.3) equal to the identity matrix. Indeed, if $\mathrm{Im}\, F(\tfrac{d}{dt})$ models $D \cup D^{\perp J_{11}}$, then $\mathrm{Im}\, F(\tfrac{d}{dt}) V(\tfrac{d}{dt})$ also models $D \cup D^{\perp J_{11}}$ if and only if $V(\xi)$ is unimodular. This follows from the fact that the columns of $F(\xi)$ and those of $F(\xi) V(\xi)$ generate the same module. Thus, given any matrix $F(\xi)$ such that the columns of $F(\lambda_i)$ are $J_{11}$-orthogonal to the columns of $F(-\bar{\lambda}_i)$, there exists a unimodular matrix $V(\xi)$ such that $F_1(\xi) = F(\xi) V(\xi)$ satisfies $F_1^T(-\xi) J_{11} F_1(\xi) = r(\xi) r(-\xi) J_{11}$. Therefore, while it is hardly surprising that a representation with $U(\xi) = I_2$ exists, it is interesting to note the simple way in which this representation can be constructed (see Section 6.2.3).

Thus Theorem 9.3.3 and Remark 9.3.4 imply that if $\mathrm{Im}\, F(-\bar{\lambda}_i)$ models $D^{\perp J_{11}}$, then $F(\xi)$ can be suitably modified so that it enjoys the $J_{11}$-unitary property.
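The $J_{11}$-unitary property can be verified symbolically. In the sketch below, $F$ is an illustrative dualized model that we constructed for the toy data $\lambda = 1$, $\mathcal{V}_1 = \mathrm{span}\{[1,\ 1/2]\}$ (so that $\mathcal{V}_1^{\perp J_{11}} = \mathrm{span}\{[1/2,\ 1]\}$ is interpolated at $-\bar{\lambda} = -1$); it is not an example taken from the thesis.

    # Symbolic check of Theorem 9.3.3 / Remark 9.3.4 for a toy J11-unitary model.
    import sympy as sp

    xi = sp.symbols('xi')
    J11 = sp.diag(1, -1)

    F = sp.Matrix([[xi + sp.Rational(5, 3), -sp.Rational(4, 3)],
                   [sp.Rational(4, 3), xi - sp.Rational(5, 3)]])

    # Interpolation of the data and of its dual:
    # Im F(1) = span{[1, 1/2]}  (annihilated on the left by [-1, 2]), and
    # Im F(-1) = span{[1/2, 1]} (annihilated on the left by [-2, 1]).
    assert sp.Matrix([[-1, 2]]) * F.subs(xi, 1) == sp.zeros(1, 2)
    assert sp.Matrix([[-2, 1]]) * F.subs(xi, -1) == sp.zeros(1, 2)

    # J11-unitariness: F^T(-xi) J11 F(xi) = r(xi) r(-xi) J11 with r(xi) = xi + 1.
    lhs = (F.subs(xi, -xi).T * J11 * F).expand()
    assert lhs == ((1 - xi**2) * J11).expand()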
Consider a behavior $B$ associated with the rational function $p(\xi)/q(\xi)$ that is a solution to the SNIP. Therefore, it follows that
$$\mathrm{Im} \begin{bmatrix} q(\lambda_i) \\ p(\lambda_i) \end{bmatrix} = \mathcal{V}_i$$
While characterizing the set of all solutions to the SNIP in Proposition 9.2.4, we considered the data set $D \cup D^{\perp J_{11}}$. This is apparently more restrictive than the original problem statement, which required $B$ to interpolate only $D$. However, the characterization obtained in Proposition 9.2.4 shows that a solution to the SNIP may be obtained with a common factor that is a Hurwitz polynomial (say $f(\xi)$) having roots at $-\bar{\lambda}_i$. Thus, the vector
$$f(-\bar{\lambda}_i) \begin{bmatrix} q(-\bar{\lambda}_i) \\ p(-\bar{\lambda}_i) \end{bmatrix}$$
is such that $f(-\bar{\lambda}_i) = 0$. Therefore, $q(-\bar{\lambda}_i), p(-\bar{\lambda}_i)$ are, in a sense, "free" and need not obey the constraints of interpolating $D^{\perp J_{11}}$.

Thus, in summary, dualizing the data has the following system theoretic implications:

• It is necessary for, and guarantees the existence of, a $J_{11}$-unitary model $\mathrm{Im}\, F(\tfrac{d}{dt})$.
• $J_{11}$-unitariness of the model $F(\tfrac{d}{dt})$ implies that the QDF $Q_\Phi$ defined by $\Phi(\zeta, \eta) = F^T(\zeta) J_{11} F(\eta)$ is "like $Q_{J_{11}}$", i.e., $Q_{J_{11}} \sim Q_\Phi$. Thus, the set of $\Phi$-dissipative behaviors in this case is "known", which enables an easy characterization of the solutions of the SNIP.

We now address a generalization of the SNIP using QDFs. We obtain a characterization of interpolants that satisfy a frequency-dependent norm condition, a result which is new and important.

9.4 Nevanlinna-Pick problem with frequency dependent norms

In this section, we address the problem of constructing $\Phi$-dissipative behaviors that interpolate certain given subspaces. The matrix $\Phi$ that induces the QDF $Q_\Phi$ need not be a constant matrix. Hence, interpolating behaviors that are $\Phi$-dissipative are required to satisfy a "frequency dependent norm" condition along with the given interpolation conditions. This leads us to define a generalized SNIP along the lines stated below. We assume that the QDF $Q_\Phi$ is such that $\Phi(\zeta, \eta)$ admits the factorization $\Phi(\zeta, \eta) = K^T(\zeta) J_{strict} K(\eta)$ with $K(\xi)$ square and nonsingular. Necessary and sufficient conditions for such a factorization (and an algorithm to compute the factorization when it exists) have already been given in Chapter 4, Theorem 4.4.3. We now state a "generalized subspace Nevanlinna interpolation problem" (GSNIP) and address it using QDFs.

Problem (GSNIP): Given a QDF $Q_\Phi$ with $\Phi(\zeta, \eta) = K^T(\zeta) J_{strict} K(\eta)$, where $K(\xi) \in \mathbb{R}^{2 \times 2}[\xi]$ is nonsingular, together with $N$ distinct subspaces $\mathcal{V}_i e^{\lambda_i t}$, $i = 1, \ldots, N$, determine necessary and sufficient conditions for the existence of $\Phi$-dissipative behaviors $B := \mathrm{Im} \begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix}$ such that

1. $B$ has positive definite storage functions (with respect to $Q_\Phi$) on manifest variables, and
2. $\mathrm{Im} \begin{bmatrix} q(\lambda_i) \\ p(\lambda_i) \end{bmatrix} = \mathcal{V}_i$.

Assumption: We assume that $\{\lambda_i\}_{i=1,\ldots,N} \cap \mathrm{spec}\, K(\xi) = \emptyset$, i.e. no $\lambda_i$ is a singularity of $K(\xi)$. We also assume that the spaces $K(\lambda_i) \mathcal{V}_i$, $i = 1, \ldots, N$, are contractive.

The following theorem gives necessary and sufficient conditions for the solvability of Problem (GSNIP):

Theorem 9.4.1 Given

1. a QDF $Q_\Phi$ with $\Phi(\zeta, \eta) = K^T(\zeta) J_{strict} K(\eta)$,
2. a $\Phi$-dissipative $C^\infty$-behavior $B$ defined by an observable image representation
$$\begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} q(\tfrac{d}{dt}) \\ p(\tfrac{d}{dt}) \end{bmatrix} \ell,$$
3. subspaces $\mathcal{V}_i e^{\lambda_i t}$ with $V_i \in \mathbb{C}^2$, $\lambda_i \in \mathbb{C}_+$, $i = 1, \ldots, N$,

Problem (GSNIP) is solvable if and only if

1. the polynomials $r(\xi), s(\xi)$ defined by
$$\begin{bmatrix} r(\xi) \\ s(\xi) \end{bmatrix} = K(\xi) \begin{bmatrix} q(\xi) \\ p(\xi) \end{bmatrix}$$
are coprime, and
2. the modified Pick matrix $T_{\{\mathcal{V}_i\}_{i=1,\ldots,N}}$ is positive definite, where
$$T_{\{\mathcal{V}_i\}_{i=1,\ldots,N}} = \left[ \frac{V_i^* K^T(\bar{\lambda}_i) J_{11} K(\lambda_j) V_j}{\bar{\lambda}_i + \lambda_j} \right]_{i,j=1}^{N}$$
and $V_i$ is a basis for $\mathcal{V}_i$.

Proof: Define a behavior $B_0$ as $\mathrm{Im} \begin{bmatrix} r(\tfrac{d}{dt}) \\ s(\tfrac{d}{dt}) \end{bmatrix} \ell$. Since $B$ is $\Phi$-dissipative, it follows from Theorem 3.5.3 that $B_0$ is $J_{strict}$-dissipative. From Theorem 4.4.6, the behavior $B$ has positive definite storage functions if and only if $r(\xi)$ and $s(\xi)$ are coprime and $r(\xi)$ is Hurwitz. Define $W_i = K(\lambda_i) V_i$. By assumption, the spaces spanned by $W_i$ are contractive. Thus, there exists a solution to the GSNIP if and only if there exists a solution to the SNIP. Finally, there exists a solution to the SNIP if and only if the corresponding Pick matrix is positive definite:
$$T_{\{\mathcal{V}_i\}_{i=1,\ldots,N}} = \left[ \frac{W_i^* J_{11} W_j}{\bar{\lambda}_i + \lambda_j} \right]_{i,j=1}^{N} > 0$$
This argument shows that there exists a $\Phi$-dissipative behavior that interpolates $\mathcal{V}_i e^{\lambda_i t}$ if the coprimeness condition holds and if, in addition, the modified Pick matrix is positive definite. Conversely, suppose that there exists a $\Phi$-dissipative behavior $B$ with positive definite storage functions that interpolates given subspaces $\mathcal{V}_i e^{\lambda_i t}$ for which the modified Pick matrix $T_{\{\mathcal{V}_i\}_{i=1,\ldots,N}}$ is not positive definite. This implies that there exists a $J_{strict}$-dissipative behavior $B_0 := K(\tfrac{d}{dt})(B)$ with positive definite storage functions that interpolates the modified data $(\lambda_i, W_i) := (\lambda_i, K(\lambda_i) V_i)$ (where $V_i$ is a basis for $\mathcal{V}_i$), and for which the corresponding Pick matrix is not positive definite. This is a contradiction (Proposition 9.2.4).

The essential idea in the above proof is that the matrix $K(\xi)$ can be used to convert the problem into an SNIP with $Q_{J_{strict}}$. Thus, a solution to Problem (GSNIP) can be obtained as follows (a numerical sketch of step 2 is given after the list):

1. Given subspaces $\mathcal{V}_i e^{\lambda_i t}$, $i = 1, \ldots, N$, choose a basis $V_i$ for each $\mathcal{V}_i$ and define $W_i = K(\lambda_i) V_i$.
2. Compute the Pick matrix $[W_i^* J_{11} W_j / (\bar{\lambda}_i + \lambda_j)]_{i,j=1}^N$. If this matrix is positive definite, proceed; otherwise stop: there is no solution.
3. Compute all $J_{strict}$-dissipative behaviors that interpolate the subspaces $W_i e^{\lambda_i t}$ and have positive definite storage functions. Let $B_0$ be such a (controllable) behavior.
4. Every behavior $B$ related to such a $B_0$ (with an observable image representation) by $B_0 = K(\tfrac{d}{dt})(B)$ is a solution to Problem (GSNIP).
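Below is a numerical sketch of step 2 with a made-up polynomial weight $K$; the helper names and data are hypothetical, for illustration only.

    # Modified Pick matrix test for the GSNIP: W_i = K(lambda_i) V_i, then
    # T_ij = W_i^* J11 W_j / (conj(lambda_i) + lambda_j) must be positive definite.
    import numpy as np

    J11 = np.diag([1.0, -1.0])

    def eval_K(lam):
        # Toy frequency weight K(xi) = [[xi + 2, 0], [1, 1]]; det K = xi + 2, so
        # no lambda_i in C_+ is a singularity of K.
        return np.array([[lam + 2, 0], [1, 1]], dtype=complex)

    def modified_pick(lambdas, vectors):
        lam = [complex(l) for l in lambdas]
        W = [eval_K(l) @ np.asarray(v, dtype=complex) for l, v in zip(lam, vectors)]
        n = len(lam)
        return np.array([[W[i].conj() @ J11 @ W[j] / (lam[i].conjugate() + lam[j])
                          for j in range(n)] for i in range(n)])

    T = modified_pick([1 + 1j, 1 - 1j], [(1, 0.1), (1, 0.1)])
    print(np.linalg.eigvalsh(T))  # GSNIP solvable iff all eigenvalues are positive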
9.5 Conclusion

In this chapter we have provided a behavior-theoretic characterization of all solutions to the Nevanlinna-Pick interpolation problem. The characterization presented here guarantees controllability of the solutions. We have also provided an explanation of the need for the so-called "mirror images" in interpolation problems. In all classical formulations of the Nevanlinna-Pick problem, the interpolant is required to satisfy a "frequency independent" norm condition. We have generalized the Nevanlinna-Pick problem to cases where the interpolant is required to satisfy a "frequency dependent" norm condition. This requirement is shown to be intimately related to dissipativity with respect to a supply function defined by a Quadratic Differential Form. We have obtained necessary and sufficient conditions for the solvability of a class of Nevanlinna-Pick interpolation problems with "frequency dependent" norm conditions.

Chapter 10

Conclusion and future work

We now take a bird's-eye view of the problems addressed in this thesis. We list the contributions of this study, emphasize the connections between different aspects of systems and control theory that have been developed in this thesis, and finally address possible directions for future work.
10.1 Summary of results

Chapter 3 dealt with a parametrization for dissipative behaviors. While it is not difficult to check whether a given linear, time-invariant dynamical system is dissipative with respect to a supply function defined by a QDF, the question of constructing all dynamical systems that are dissipative with respect to a given supply function is not straightforward. We first considered single-input single-output (SISO) dissipative systems and obtained an explicit parametrization for the set of all dissipative systems under an assumption on the supply function (constant signature on the imaginary axis). We generalized the results for the SISO case to the multi-input multi-output (MIMO) case, again under the assumption of constant signature on the imaginary axis. We then considered the case when the assumption of constant signature on the imaginary axis does not hold. Given a general supply function defined by a QDF, we showed that one can always parametrize a subset of the set of all dissipative behaviors even when this assumption fails. The parametrization in this case was obtained using what we defined as "split sums".

The problem addressed in Chapter 3 can be examined with additional conditions, for example: when do there exist positive definite storage functions for dissipative systems? We examined precisely this question in Chapter 4, "KYP lemma and its extensions". We first noted the well known result that the classical Kalman-Yakubovich-Popov (KYP) lemma gives an equivalence between passivity and the existence of positive definite storage functions. We examined the KYP lemma in a representation-free manner and derived it using behavior-theoretic arguments. We then generalized the lemma to systems that are dissipative with respect to certain "special" supply functions. We showed that a given dissipative dynamical system has positive definite storage functions on manifest variables if and only if an associated dynamical system is "passive" and has certain properties (observability of an image representation, and the absence of a non-trivial memoryless part).

Chapters 3 and 4 provide the basic tools to investigate problems in different areas of systems and control theory. A majority of the problems addressed in the later chapters can be thought of as applications of the results obtained in Chapters 3 and 4. The results in Chapters 3 and 4 thus provide a unifying thread that runs across most chapters of this thesis.

One of the most interesting applications of storage functions is their use as Lyapunov functions. We investigated this aspect in Chapter 5, which dealt with the "absolute stability" problem in nonlinear systems. We considered autonomous systems obtained by interconnection of LTI systems with nonlinearities, through given interconnection constraints. The treatment unifies stability analysis for a large class of systems. We obtained MIMO versions of classical results as special cases.

In Chapters 3 and 4, a certain factorization of polynomial matrices called "polynomial J-spectral factorization" was invoked. This factorization is an important and well-studied problem in computations with polynomial matrices. We addressed the polynomial J-spectral factorization problem in Chapter 6. Using behavior-theoretic ideas and QDFs, we obtained a new algorithm for polynomial J-spectral factorization.
This algorithm builds on results in interpolation theory, specifically the Nevanlinna-Pick interpolation problem. We implemented the algorithm and reported on its numerical aspects. The algorithm reported in Chapter 6 is one of the simplest among those available in the literature. It also has good numerical properties for problems of reasonable size.

In Chapter 7 we returned to the theme of dissipative systems that we considered in Chapters 3, 4 and 5. An interesting application of the results obtained in Chapters 3 and 4 is in the well known "H∞" problem. The H∞ problem is one of the most important areas in control theory and finds applications in, for instance, disturbance attenuation, passivation, etc. In Chapter 7, we obtained a novel characterization of all solutions to the H∞ problem using the parametrization results obtained in Chapters 3 and 4.

Many real-life systems cannot be adequately modeled by linear laws. A model defined by a quadratic form is better suited in such cases. This motivated Chapter 8, which represents a new and interesting line of research: modeling of data with bilinear differential forms. We first translated the modeling problem into an interpolation problem with two variable polynomial matrices. We gave an iterative algorithm to solve the interpolation problem. The algorithm relies on standard polynomial matrix computations and is suitable for implementation in a symbolic computation package. As a special case, we considered the problem of interpolating with scalar bivariate polynomials. We also addressed the problem of constructing storage functions for autonomous dissipative systems. The approach presented in Chapter 8 for computing storage functions is an interesting alternative to methods available in the literature.

Continuing with the interpolation ideas of Chapter 8, we considered the problem of interpolation with rational functions in Chapter 9. We addressed a generalization of the well known Nevanlinna-Pick interpolation problem. This interpolation problem is about finding a rational function that satisfies a norm constraint on the imaginary axis and interpolates given data. The classical norm constraint is "frequency independent". In Chapter 9, we generalized the Nevanlinna-Pick interpolation problem to rational functions that satisfy a "frequency dependent" norm condition. We used results obtained in Chapters 3, 4 and 6 in order to address this generalization.

With this brief overview of the research presented in this thesis, we now consider possible directions for future work.

10.2 Directions for further work

We view the results presented in this thesis as only the starting point of a systematic study of quadratic differential forms and their applications vis-à-vis dynamical systems. In our opinion, the ideas presented in this thesis can be generalized in many directions. We present a summary of some interesting problems for research.

Parametrization of dissipative systems: In Chapter 3 we obtained a complete parametrization of dissipative systems under an assumption on the supply function (constant inertia on the imaginary axis). When this assumption does not hold, our results give only sufficient conditions. Investigation is required into how to obtain necessary and sufficient conditions for a general supply function.

Kalman-Yakubovich-Popov lemma: The KYP lemma relies on certain assumptions on the supply functions. Relaxing these assumptions is an immediate problem.
Research into state space formulations of our results, for the sake of efficient computations with Linear Matrix Inequalities (LMIs), should also yield interesting results.

Nonlinear systems: The approach presented in Chapter 5 is promising. Investigations can be taken up into nonlinearities with memory, like relays with hysteresis. Research into state space based formulations, and integration with other theories like "integral quadratic constraints", will also be interesting and insightful.

Interpolation problems: Applications to concrete real-life systems (for example econometric systems, multidimensional signal processing, etc.) can be addressed using the recursive algorithm for interpolation with bilinear and quadratic differential forms. A discrete-time version of the algorithm can be used for quadratic filter design in digital signal processing. The generalization of Nevanlinna-Pick interpolation can be used to address a "data driven" theory of control that does not involve models.

Synthesis of dissipative systems: We have only considered systems dissipative with respect to a metric defined by a "constant matrix". Investigation along the lines of Chapter 7 can be used to generalize the synthesis problem to what is usually called "frequency weighted H∞ control". Also, the treatment in Chapter 7 was based on certain assumptions about the hidden behavior; relaxing these is an immediate problem. Investigation is required to determine the feasibility of the approach taken in Chapter 7 for obtaining efficient computational algorithms.

References

[1] F.A. Aliev and V.B. Larin, "Algorithm of J-spectral factorization of polynomial matrices", Automatica, 33 (1997), pp 2179-2182.

[2] B.D.O. Anderson, "A system theory criterion for positive real matrices", SIAM Journal of Control, 5 (1967), pp 171-182.

[3] B.D.O. Anderson and S. Vongpanitlerd, "Network Analysis and Synthesis: a Modern Systems Theory Approach", Prentice Hall, 1973.

[4] B.D.O. Anderson, K.L. Hitz and N.D. Diem, "Recursive algorithm for spectral factorization", IEEE Transactions on Circuits and Systems, 21 (1976), pp 453-464.

[5] B.D.O. Anderson, P. Kokotovic, I.D. Landau and J.C. Willems, "Dissipativity of dynamical systems: applications in control – dedicated to Vasile Mihai Popov", Special issue: European Journal of Control, 8 (2002).

[6] A.C. Antoulas, "A new result on passivity preserving model reduction", Systems and Control Letters, 54 (2005), pp 361-374.

[7] A.C. Antoulas and J.C. Willems, "A behavioral approach to linear exact modeling", IEEE Transactions on Automatic Control, 38 (1993), pp 1776-1802.

[8] D.P. Atherton, "Nonlinear Control Engineering", Van Nostrand, 1975.

[9] J.A. Ball, I. Gohberg and L. Rodman, "Interpolation of Rational Matrix Functions", Birkhäuser-Verlag, 1990.

[10] B. Bojanov and Y. Xu, "On a Hermite interpolation by polynomials of two variables", SIAM Journal of Numerical Analysis, 39 (2002), pp 1780-1793.

[11] M.N. Belur, "Control in a behavioral context", Doctoral dissertation, University of Groningen, The Netherlands, 2003.

[12] M.K. Camlibel, J.C. Willems and M.N. Belur, "On the dissipativity of uncontrollable systems", Proceedings of the 42nd IEEE Conference on Decision and Control, 2003, pp 1645-1650.

[13] D.S. Bernstein and S.P. Bhat, "Energy equipartition and the emergence of damping in lossless systems", Proceedings of the 41st IEEE Conference on Decision and Control, 2002, pp 2913-2918.
[14] Z. Bai, J. Demmel, et al., "Templates for the Solution of Algebraic Eigenvalue Problems: a Practical Guide", SIAM, 2000.

[15] R.W. Brockett and J.L. Willems, "Frequency domain stability criteria – Part I", IEEE Transactions on Automatic Control, 10 (1965), pp 255-261.

[16] R.W. Brockett and J.L. Willems, "Frequency domain stability criteria – Part II", IEEE Transactions on Automatic Control, 10 (1965), pp 407-413.

[17] F.M. Callier, "Spectral factorization by symmetric factor extraction", IEEE Transactions on Automatic Control, 30 (1985), pp 453-465.

[18] H.J. Carlin, "The scattering matrix in network theory", IRE Transactions on Circuit Theory, 3 (1956), pp 88-97.

[19] J.C. Doyle, K. Glover, P. Khargonekar and B.A. Francis, "State space solutions to standard H2 and H∞ problems", IEEE Transactions on Automatic Control, 34 (1989), pp 831-847.

[20] M. Fu, S. Dasgupta and Y.C. Soh, "Integral quadratic constraint approach vs. multiplier approach", Proceedings of the IEEE Conference on Control, Automation, Robotics and Vision (ICARCV), vol. 1, 2002, pp 144-149.

[21] R. Fitts, "Two counter-examples to Aizerman's conjecture", IEEE Transactions on Automatic Control, 11 (1966), pp 553-556.

[22] F.R. Gantmacher, "The Theory of Matrices, Vol. 1", Chelsea Publishing Company, New York, 1960.

[23] M. Gasca and T. Sauer, "On bivariate Hermite interpolation with minimal degree polynomials", SIAM Journal of Numerical Analysis, 37 (2000), pp 772-798.

[24] T.T. Georgiou and P.P. Khargonekar, "Spectral factorization and Nevanlinna-Pick interpolation", SIAM Journal on Control and Optimization, 25 (1987), pp 754-766.

[25] T.T. Georgiou, "On a Schur-algorithm based approach to spectral factorization: State-space formulae", Systems and Control Letters, 10 (1998), pp 123-129.

[26] T.T. Georgiou, "Computational aspects of spectral factorization and the tangential Schur algorithm", IEEE Transactions on Circuits and Systems, 36 (1989), pp 103-108.

[27] T.T. Georgiou and P.P. Khargonekar, "Spectral factorization of matrix valued functions using interpolation theory", IEEE Transactions on Circuits and Systems, 36 (1989), pp 568-574.

[28] I. Gohberg, P. Lancaster and L. Rodman, "Factorization of selfadjoint matrix polynomials with constant signature", Linear and Multilinear Algebra, 11 (1982), pp 209-224.

[29] I. Gohberg, P. Lancaster and L. Rodman, "Matrix Polynomials", Academic Press, 1982.

[30] J.M. Goncalves, A. Megretski and M.A. Dahleh, "Global stability of relay feedback systems", IEEE Transactions on Automatic Control, 46 (2001), pp 550-562.

[31] W.M. Haddad and V. Kapila, "Absolute stability criteria for multiple slope restricted nonlinearities", IEEE Transactions on Automatic Control, 40 (1995), pp 361-365.

[32] W.M. Haddad, V. Chellaboina and S.G. Nersesov, "A system-theoretic foundation for thermodynamics: energy flow, energy balance, energy equipartition, entropy and ectropy", Proceedings of the American Control Conference, 2004, pp 396-417.

[33] E.V. Haynsworth, "Determination of the inertia of a partitioned hermitian matrix", Linear Algebra and its Applications, 1 (1968), pp 73-81.

[34] D.J. Hill and P.J. Moylan, "Dissipative dynamical systems: basic input, output and state properties", Journal of the Franklin Institute, 309 (1980), pp 327-357.

[35] D.J. Hill and P.J. Moylan, "The stability of nonlinear dissipative systems", IEEE Transactions on Automatic Control, 21 (1976), pp 708-711.

[36] J.C. Hsu and A.U. Meyer, "Modern Control Principles and Applications", McGraw Hill, 1968.
[37] T. Hu, B. Huang and Z. Lin, "Absolute stability with a generalized sector condition", IEEE Transactions on Automatic Control, 49 (2004), pp 535-548.

[38] J. Ježek and V. Kučera, "Efficient algorithm for matrix spectral factorization", Automatica, 21 (1985), pp 663-669.

[39] K.H. Johansson, A.E. Barabanov and K.J. Åström, "Limit cycles with chattering in relay feedback systems", IEEE Transactions on Automatic Control, 47 (2002), pp 1414-1423.

[40] R.E. Kalman, "Lyapunov functions for the problem of Lur'e in automatic control", Proc. Nat. Acad. Sci. USA, 49 (1963), pp 201-205.

[41] O. Kaneko and T. Fujii, "Discrete-time average positivity and spectral factorization in a behavioral framework", Systems and Control Letters, 39 (2000), pp 31-44.

[42] C.-Y. Kao, A. Megretski and U. Jönsson, "Specialized fast algorithms for IQC feasibility and optimization problems", Automatica, 40 (2004), pp 239-252.

[43] J.S. Karmarkar, "On Siljak's absolute stability test", Proceedings of the IEEE, 58 (1970), pp 817-819.

[44] H. Kimura, "Directional interpolation approach to H∞-optimization and robust stabilization", IEEE Transactions on Automatic Control, 32 (1987), pp 1085-1093.

[45] H. Kimura, "Directional interpolation in the state space", Systems and Control Letters, 10 (1988), pp 317-324.

[46] H. Kimura, "Conjugation, interpolation, and model-matching in H∞", International Journal of Control, 49 (1989), pp 269-307.

[47] M. Kuijper and J.W. Polderman, "Behavioral models for list decoding", Mathematical and Computer Modelling of Dynamical Systems, 8 (2002), pp 429-443.

[48] V.V. Kulkarni and M.G. Safonov, "All multipliers for repeated monotone nonlinearities", IEEE Transactions on Automatic Control, 47 (2004), pp 1209-1212.

[49] H. Kwakernaak and M. Šebek, "Polynomial J-spectral factorization", IEEE Transactions on Automatic Control, 39 (1994), pp 315-328.

[50] D.P. O'Leary, "Symbiosis between linear algebra and optimization", http://citeseer.ist.psu.edu/223627.html, July 2005.

[51] M. Sudan, "Efficient checking of polynomials and proofs and the hardness of approximation problems", Lecture Notes in Computer Science, Springer-Verlag, 1995.

[52] A. Megretski and A. Rantzer, "System analysis via integral quadratic constraints", IEEE Transactions on Automatic Control, 42 (1997), pp 819-830.

[53] G. Meinsma, "Frequency-domain methods in H∞-control", Doctoral dissertation, University of Twente, 1993.

[54] J. Meixner, "On the theory of linear passive systems", Archive for Rational Mechanics and Analysis, 17 (1964), pp 278-296.

[55] P.J. Moylan, "Implications of passivity in a class of nonlinear systems", IEEE Transactions on Automatic Control, 19 (1974), pp 373-381.

[56] R.W. Newcomb, "Linear Multiport Synthesis", McGraw Hill, New York, 1966.

[57] G. Park, D. Banjerdpongchai and T. Kailath, "The asymptotic stability of nonlinear Lur'e systems with multiple slope restrictions", IEEE Transactions on Automatic Control, 43 (1998), pp 979-982.

[58] R. Peeters and P. Rapisarda, "A two-variable approach to solve the polynomial Lyapunov equation", Systems and Control Letters, 42 (2001), pp 117-126.

[59] I. Pendharkar, "Model reduction and associated problems in linear systems theory", Masters dissertation, Department of Electrical Engineering, Indian Institute of Technology Bombay, 2001.

[60] I. Pendharkar and H.K. Pillai, "A parametrization for dissipative behaviors", Systems and Control Letters, 51 (2004), pp 123-132.

[61] I. Pendharkar and H.K. Pillai, "On dissipative SISO systems: a behavioral approach", Proceedings of the 42nd IEEE Conference on Decision and Control, 2002, Hawaii, USA, pp 1616-1620.
[62] I. Pendharkar and H.K. Pillai, "The Kalman-Yakubovich lemma in the behavioral setting", to appear in International Journal of Control.

[63] I. Pendharkar and H.K. Pillai, "A parametrization for behaviors with a non-negative storage function", Proceedings of the 27th National Systems Conference, Kharagpur, India, 2003, pp 59-63.

[64] I. Pendharkar and H.K. Pillai, "Kalman-Yakubovich lemma in the behavioral setting", Proceedings of the IFAC Symposium on Large Scale Systems, Osaka, Japan, 2004.

[65] I. Pendharkar and H.K. Pillai, "On a theory for nonlinear behaviors", Proceedings of the 16th International Symposium on Mathematical Theory of Networks and Systems (MTNS), 2004.

[66] I. Pendharkar and H.K. Pillai, "A behavioral approach to Popov-like stability criteria", Proceedings of the National Systems Conference, Vellore, 2004.

[67] I. Pendharkar and H.K. Pillai, "A rootlocus based approach for stabilization of nonlinear systems", Proceedings of the National Conference on Control and Dynamical Systems, 2005.

[68] I. Pendharkar and H.K. Pillai, "On stability of systems with monotone nonlinearities", Proceedings of the National Conference on Control and Dynamical Systems, 2005.

[69] I. Pendharkar and H.K. Pillai, "Guaranteed closed loop stability with sensor uncertainties", Proceedings of the International Conference on Instrumentation, December 2004.

[70] I. Pendharkar, H.K. Pillai and P. Rapisarda, "A behavioral view of Nevanlinna-Pick interpolation", 44th IEEE Conference on Decision and Control (CDC), 2005, accepted.

[71] P. Penfield, "Passivity conditions", IEEE Transactions on Circuit Theory, 12 (1965), pp 446-448.

[72] H.K. Pillai and E. Rogers, "On quadratic differential forms for n-D systems", Proceedings of the 39th IEEE Conference on Decision and Control (CDC), Sydney, 2000, pp 5010-5013.

[73] V.A. Pliss, "Necessary and sufficient conditions for the global stability of a certain system of three differential equations", Dokl. Akad. Nauk SSSR, 120 (1958), pp 4.

[74] J.W. Polderman, "Proper elimination of latent variables", Systems and Control Letters, 32 (1997), pp 261-269.

[75] J.W. Polderman and J.C. Willems, "Introduction to Mathematical Systems Theory: A Behavioral Approach", Springer-Verlag, 1997.

[76] V.M. Popov, "Absolute stability of nonlinear systems of automatic control", Avtomatika i Telemekhanika, 22 (1961), pp 961-979. For an English translation, see A.G.J. MacFarlane, Ed., "Frequency Response Methods in Control Systems", IEEE Press, 1979.

[77] V.M. Popov and A. Halanay, "On the stability of nonlinear automatic control systems with lagging argument", Avtomatika i Telemekhanika, 23 (1962), pp 849-851. For an English translation, see A.G.J. MacFarlane, Ed., "Frequency Response Methods in Control Systems", IEEE Press, 1979.

[78] V.M. Popov, "Hyperstability of Control Systems", Springer-Verlag, 1973.

[79] S. Purkayastha and A. Mahalanabis, "An extended MKY lemma and its application", IEEE Transactions on Automatic Control, 16 (1971), pp 366-367.

[80] A.C.M. Ran and L. Rodman, "Factorization of matrix polynomials with symmetries", SIAM Journal on Matrix Analysis and Applications, 15 (1994), pp 845-864. Also Preprint 993, Institute for Mathematics and its Applications, July 1992, http://ima.umn.edu/preprints/July92/0993.ps (as in August 2005).

[81] A. Rantzer, "On the Kalman-Yakubovich-Popov lemma", Systems and Control Letters, 28 (1996), pp 7-10.

[82] P. Rapisarda, "Linear Differential Systems", Ph.D. thesis, University of Groningen, The Netherlands, 1998.
[83] P. Rapisarda and J.C. Willems, "State maps for linear systems", SIAM Journal of Control and Optimization, 35 (1997), pp 1053-1091.

[84] P. Rapisarda and J.C. Willems, "The subspace Nevanlinna interpolation problem and the most powerful unfalsified model", Systems and Control Letters, 32 (1997), pp 291-300.

[85] G.L. Sicuranza, "Quadratic filters for signal processing", Proceedings of the IEEE, 80 (1992), pp 1263-1285.

[86] D.D. Siljak, "New algebraic criteria for positive realness", Journal of the Franklin Institute, 291 (1971), pp 109-120.

[87] V. Singh, "A note on the extended MKY lemma", IEEE Transactions on Automatic Control, 27 (1982), pp 1264.

[88] V. Singh, "Absolute stability criterion for a class of nonlinear systems with slope restricted nonlinearity", Proceedings of the IEEE, 70 (1982), pp 1232-1233.

[89] V. Singh, "An extended MKY lemma – what shall it be?", IEEE Transactions on Automatic Control, 28 (1983), pp 627-628.

[90] J.E. Slotine and W. Li, "Applied Nonlinear Control", Prentice Hall, 1991.

[91] P.R. Smith, "Bilinear interpolation of digital images", Ultramicroscopy, 6 (1981), pp 201-204.

[92] R.L. Smith, "Some interlacing properties of the Schur complement of a hermitian matrix", Linear Algebra and its Applications, 177 (1992), pp 137-144.

[93] D.C. Sorensen, "Passivity preserving model reduction via interpolation of spectral zeros", Systems and Control Letters, 54 (2005), pp 347-360.

[94] V.R. Sule, "State space approach to behavioral systems theory: the Dirac-Bergmann algorithm", Systems and Control Letters, 50 (2003), pp 149-162.

[95] H.L. Trentelman and J.C. Willems, "Every storage function is a state function", Systems and Control Letters, 32 (1997), pp 249-259.

[96] H.L. Trentelman and J.C. Willems, "Synthesis of dissipative systems using quadratic differential forms, Parts I and II", IEEE Transactions on Automatic Control, 47 (2002), pp 53-69 and 70-86.

[97] H.L. Trentelman and P. Rapisarda, "New algorithms for polynomial J-spectral factorization", Mathematics of Control, Signals and Systems, 12 (1999), pp 24-61.

[98] J.C. Willems, "Dissipative dynamical systems, Part 1: General theory", Archive for Rational Mechanics and Analysis, 45 (1972), pp 321-351.

[99] J.C. Willems, "Dissipative dynamical systems, Part 2: Linear systems with quadratic supply rates", Archive for Rational Mechanics and Analysis, 45 (1972), pp 352-393.

[100] J.C. Willems, "From time series to linear system, Part II: Exact modeling", Automatica, 22 (1986), pp 675-694.

[101] J.C. Willems, "Paradigms and puzzles in the theory of dynamical systems", IEEE Transactions on Automatic Control, 36 (1991), pp 259-294.

[102] J.C. Willems, "On interconnections, control and feedback", IEEE Transactions on Automatic Control, 42 (1997), pp 326-339.

[103] J.C. Willems and H.L. Trentelman, "On quadratic differential forms", SIAM Journal of Control and Optimization, 36 (1998), pp 1703-1749.

[104] J.C. Willems and H.L. Trentelman, "H∞-control in a behavioral context: The full information case", IEEE Transactions on Automatic Control, 44 (1999), pp 521-536.

[105] J.L. Willems, "Stability Theory of Dynamical Systems", Nelson, 1970.

[106] R. van der Geest and H.L. Trentelman, "The KYP lemma in a behavioral framework", Systems and Control Letters, 32 (1997), pp 283-290.

[107] M. Vidyasagar, "Nonlinear Systems Analysis, 2nd ed.", Prentice Hall, 1993.

[108] A.A. Voronov, "Basic Principles of Automatic Control Theory: Special Linear and Nonlinear Systems", Mir Publishers, Moscow, 1985.
[109] V.A. Yakubovich, "Solutions of some matrix inequalities occurring in the theory of automatic control", Dokl. Akad. Nauk SSSR, 143 (1962), pp 1304-1367 (in Russian).

[110] D.C. Youla and M. Saito, "Interpolation with positive real functions", Journal of the Franklin Institute, 284 (1967), pp 77-108.

[111] G. Zames and P.L. Falb, "Stability conditions for systems with monotone and slope restricted nonlinearities", SIAM Journal on Control, 6 (1968), pp 89-108.

Appendix A

Notation

$\mathbb{R}$ : The set of real numbers.
$\mathbb{C}$ : The set of complex numbers.
$\mathbb{W}^{\mathbb{T}}$ : The set of maps from $\mathbb{T}$ to $\mathbb{W}$.
w, q, etc. (teletype fonts) : The number of components of the vectors $w, q$ respectively.
$C^\infty(\mathbb{R}, \mathbb{R}^w)$ : The space of infinitely many times differentiable functions from $\mathbb{R}$ to $\mathbb{R}^w$.
$\mathcal{D}(\mathbb{R}, \mathbb{R}^w)$ : The space of compactly supported $C^\infty$ functions from $\mathbb{R}$ to $\mathbb{R}^w$.
$L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$ : The space of locally integrable functions from $\mathbb{R}$ to $\mathbb{R}^w$.
$\det$ : Determinant of a matrix.
$\deg$ : Degree of a one variable polynomial matrix.
$\mathrm{Im}$ : Image of a linear map.
$\mathrm{Ker}$ : Kernel of a linear map.
$\mathrm{Re}(\lambda)$ : Real part of $\lambda$.
$\bar{\lambda}$ : Complex conjugate of $\lambda$.
$\mathcal{L}^w$ : The set of linear differential systems with $w$ variables.
$L_\Phi$ : Bilinear differential form defined by $\Phi(\zeta, \eta)$.
$Q_\Phi$ : Quadratic differential form defined by $\Phi(\zeta, \eta)$.
$\mathcal{L}^w_{con}$ : The set of controllable linear differential systems with $w$ variables.
$\mathcal{L}^\Phi$ : The set of all controllable $\Phi$-dissipative linear differential systems.
$B$ : A behavior.
$m(B)$ : Input cardinality of $B$.
$p(B)$ : Output cardinality of $B$.
$n(B)$ : McMillan degree of $B$.
$\mathbb{R}[\xi]$ : The ring of polynomials in $\xi$ over $\mathbb{R}$.
$\mathbb{R}[\zeta, \eta]$ : The ring of polynomials in $\zeta, \eta$ over $\mathbb{R}$.
$\mathbb{R}^{n_1 \times n_2}[\xi]$ : The set of all $n_1 \times n_2$ polynomial matrices in $\xi$.
$\mathbb{R}^{\bullet \times n}[\xi]$ : The set of polynomial matrices in $\xi$ with $n$ columns.
$\mathbb{R}^{q_1 \times q_2}[\zeta, \eta]$ : The set of $q_1 \times q_2$ matrices over $\mathbb{R}[\zeta, \eta]$.
$\mathbb{R}^{w \times w}_s[\zeta, \eta]$ : The set of $w \times w$ symmetric matrices over $\mathbb{R}[\zeta, \eta]$.
$A^T$ : The transpose of a constant matrix $A$.
$A^*$ : The hermitian transpose of a constant matrix $A$.
$R^\sim$ : The transpose of a polynomial matrix: $R(\xi)^\sim := R^T(-\xi)$.
$A \geq 0$ : The quadratic form induced by $A$ is positive semidefinite.
$\Pi(i\omega) \geq 0$ : The hermitian form induced by $\Pi(i\omega)$ is positive semidefinite at almost all $\omega \in \mathbb{R}$.
$A > 0$ : The quadratic form induced by $A$ is positive definite.
$\Pi(i\omega) > 0$ : The hermitian form induced by $\Pi(i\omega)$ is positive definite at almost all $\omega \in \mathbb{R}$.
$\sigma(A)$ : Inertia of the hermitian matrix $A$.
$\sigma_+(A)$ : Number of positive eigenvalues of $A$.
$\sigma_-(A)$ : Number of negative eigenvalues of $A$.
$\sigma_0(A)$ : Number of zero eigenvalues of $A$.
$\sigma_{worst}(Z)$ : The "worst inertia" of $Z(\xi)$.
$I_w$ : The $w \times w$ identity matrix.
$J_{mn}$ : The inertia matrix $\begin{bmatrix} I_m & 0 \\ 0 & -I_n \end{bmatrix} \in \mathbb{R}^{(m+n) \times (m+n)}$.
$J$ : The $2m \times 2m$ matrix $\tfrac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix}$.
$J_{worst}$ : The "worst inertia matrix".
$X(\tfrac{d}{dt})$ : The state map associated with a behavior.
$\mathcal{X}(\tfrac{d}{dt})$ : The matrix defining interconnection constraints.
$N$ : A nonlinearity.
$\mathcal{N}_\Theta$ : The positive cone associated with $Q_\Theta$.
$\mathcal{F}_N$ : The family of all nonlinearities that are positive along $Q_\Theta$.
$\mathcal{F}_0^K$ : The family of sector bound nonlinearities.
$\mathcal{F}_{mon}$ : The family of monotone nonlinearities.
$\mathcal{B}_N$ : The nonlinear behavior.
$\mathcal{P}_{full}$ : The full plant behavior.
$\mathcal{P}$ : The manifest plant behavior.
$\mathcal{K}$ : The controlled behavior.
$\mathcal{C}$ : The controller behavior.
$\mathcal{N}$ : The hidden behavior.
Index

absolute stability, 75
Aizerman's conjecture, 75
available storage, 57
Axiom of state, 24
BDF, 5
  modeling with, 129
  zero along a behavior, 133
behavior, 11
  autonomous, 23
  controllable, 20
  controlled, 119
  implementable, 119
  controller, 118
  dissipative, 32
  full, 15
  full plant, 118
  hidden, 119
  manifest, 118
  McMillan degree, 26
  memoryless part of, 26
  most powerful unfalsified, see MPUM
  state space representation, 24
  view of modeling, 94
bilinear difference form, 140
Bilinear differential form, see BDF
Bilinear form, 4
Bivariate polynomial matrix, 5
  symmetric, 6
bounded real rational function, 62
Circle criterion, 85
control as interconnection, 80, 118
control variables, 118
controllability, 20
controlled behavior, 119
controller, 118
describing function, 92
dissipation equality, 57
dissipation function, 56
dissipative systems, 32
  and analyticity, 37
  characterization using Nyquist plot, 37
  MIMO, 43
  SISO, 36
  storage functions for, 56
  synthesis of, 117
  uncontrollable, 33
dualization, 96, 144
  system theoretic implications of, 145
dynamical systems, 11
  autonomous, 23
    stable, 24
  dissipative, 31
  linear, 12
  linear differential, 12
  models for, 94
  time-invariant, 12
effective size, 131
Elimination theorem, 16
equivalent supply functions, 35
  characterization of, 45
exact identification, 129
Factorization à la Ran, Rodman, 47
free variables, 27
function spaces, 13
Future work, 153
GNU-GPL, 109
H∞ problem, 117
  and dissipativity, 32
  characterization of all solutions of, 126
  formulation, 119, 120
hidden behavior, 119
hybrid representation, 15
image representation, 20
implementable behavior, 119
inertia, 5
  worst, 47
input cardinality, 28, 119
inputs, 27
interconnection view of control, 118
interpolation
  for computing storage functions, 135
  Lagrange, 132
  Nevanlinna-Pick, 141
    standard case, 142
    with frequency dependent norm, 147
  with BDFs, notion of complexity, 139
  with bilinear difference form, 140
  with bivariate polynomials, 135
  with two variable polynomial matrices, 131
invariants of a behavior, 28
kernel representation, 12
KYP lemma
  as an LMI, 55
  for SISO systems, 69
  generalization of, 61
  storage functions and, 60
  strict versions of, 67
L∞ norm, 44
latent variable, 15
  representation using, 15
latent variable representation, 15
loop transformation, 86
lossless system, 46, 66, 125
Lyapunov function, 56, 76
  computing for linear systems, 135
manifest plant behavior, 118
manifest variable, 15
McMillan degree, 26, 28, 57, 79
memoryless nonlinearity, 81
memoryless part of a behavior, 26
Meyer-Kalman-Yakubovich lemma, 56
minimum required supply, 57
Minor, 19
model in the generative sense, 142
Modeling with BDFs, 129
models for dynamical systems, 94
MPUM, 95, 142
  computing a representation for, 95
  for dualized data, 96
negative feedback, 81
nonlinearity, 80
  memoryless, 81
  sector bound, 85, 86
  slope restricted, 89
  stabilizing controllers for, 82
  with memory, 90
observability, 21
  of a QDF, 8
output cardinality, 28
outputs, 27
para-Hermitian matrix, 7, 93
  computing singularities of, 108
  J-spectral factorization of, 93
  minimal factorizations of, 47
  split sums for, 50
  worst inertia of, 47
passive system, 31, 61
Pick matrix, 96, 144
polynomial J-spectral factorization, 45, 93
  algorithm for, 105
  computer implementation, 109
  numerical aspects of the algorithm, 106
Popov, 75
  stability criterion, 86
positive real rational function, 37, 71
  strictly, 71
preview of the thesis, 2
principal block minor, 97
proper rational function, 27
QDF, 6
  acting on $L_1^{loc}$ functions, 79
  sign definite, 8
  symmetric canonical factorization, 8, 106
Quadratic differential form, see QDF
quadratic form, 4
  diagonalization, 4
Rank, 19
Rational J-spectral factorization, 41
regular polynomial matrix, 107
Relay with hysteresis, 90
Representation, 11
  Σ-unitary, 94
  equivalent, 18
  image, 20
  kernel, 12
  latent variable, 15
  observable image, 21
  state, 24
right coprime matrices, 65
Schur complement, 35, 122
Scilab, 109
semi-simple, 96
signature, 5
slope restricted nonlinearities, 89
spectral factorization, 58
split sum, 50
  parametrization using, 50
stability
  global asymptotic, 80
  of equilibria, 80
  of linear systems, 24
state, 24
  axiom of, 25
  McMillan degree, 26
  storage functions on, 57, 76
state map, 26, 58, 62, 77
storage functions, 56
  a procedure to compute, 58
  and KYP lemma, 60
  maximum and minimum, 59
  positive definite, 66, 68, 72
  state functions, 57, 76
strictly positive real rational function, 71
strictly proper rational function, 27
strong solution, 14
Subspace Nevanlinna Interpolation Problem (SNIP), 143
summary of results, 151
supply function
  constant inertia, 44
  constant matrices, 43
  equivalent, 35
  variable inertia, 46
Symmetric canonical factorization, 8
  strictly proper, 99
Synthesis of dissipative systems, 117
Thermodynamics and dissipativity, 32
to-be-controlled variables, 118
unitary matrix, 48, 96
weak solution, 14
weighted degree, 131
worst inertia, 47
  how to compute, 47