Applications of Quadratic Differential Forms
Submitted in partial fulfillment of the requirements
For the degree of
Doctor of Philosophy
by
Ishan K. Pendharkar
(Roll number: 01407004)
Supervisor:
Prof. Harish K. Pillai
Department of Electrical Engineering,
Indian Institute of Technology Bombay, Powai,
Mumbai, 400076
2005
Dissertation Approval Sheet
The dissertation “Applications of Quadratic Differential Forms” by
Ishan Pendharkar is approved for the degree of Doctor of Philosophy.
Examiners
....................
....................
....................
Supervisor
....................
Chairman
....................
Place :..............
Date :..............
Indian Institute of Technology Bombay
Certificate of Course Work
This is to certify that Ishan Pendharkar was admitted to the candidacy of the Ph.D. degree
in January 2002 after successfully completing all the courses required for the Ph.D. degree
programme. The details of the course work done are given below:
Sr.  Course No.  Course name                           Credits
1    EE 678      Wavelets                              6.00
2    EE 698      Special topics in Electrical Engg.    6.00
3    EES 801     Ph.D. seminar                         4.00
IIT Bombay:
Date:........
Deputy Registrar
Acknowledgment
The work presented in this thesis is the outcome of about three years (May 2002-July 2005)
of research that I have carried out in the Department of Electrical Engineering, IIT Bombay
under the supervision of Dr. Harish K. Pillai. I am grateful to Dr. Pillai for giving me the
opportunity to work under his supervision. Dr. Pillai has been a source of constant support
and motivation during the course of my research. Throughout my association with him, Dr. Pillai
has been extremely patient with me. I thank him for all the help and support.
I also thank Dr. Paolo Rapisarda, who is currently with the Department of Electrical and
Computer Engineering, University of Southampton, UK, for suggesting interesting problems. I
have enjoyed working with Dr. Rapisarda.
I gratefully acknowledge receiving help and guidance from several people at IIT Bombay.
I thank Dr. Madhu Belur for many helpful discussions and suggestions. I am grateful to my
masters’ adviser Prof. V.R. Sule, and my instructors at IIT Bombay: Prof. S.D. Agashe, Prof.
Shiva Shankar and Prof. M.C. Srisailam for their efforts.
The several friends I made during the course of my Ph.D. have been a great help from time
to time. I thank Amit Kalele, Dr. Mashuq-un-Nabi and Priyadarshanam for their help and
support. I also thank all masters’ students and project staff at the Control and Computing
laboratory for providing great company and making my workplace lively.
I would have never managed to do research and write this thesis but for the unflinching
support and encouragement of my parents, and my wife Mitra. I thank them for their patience,
and for the many big and small sacrifices they have made for my sake.
IIT Bombay,
Mumbai.
Ganesh Chaturthi, Saka 1927
(7 September, 2005).
Abstract
Quadratic functionals are commonly encountered in systems and control theory as a means of
describing as well as analysing dynamical systems. Quadratic filters and bilinear time series
models can be thought of as examples of the former. The latter use of quadratic functionals
is more popular and widely understood. It is based upon our deep and intuitive association of
energy- or power-like quantities with quadratic functionals.
The thesis titled “Applications of Quadratic Differential Forms” is a study of quadratic
functionals and their applications in systems and control theory. The quadratic functionals
in question are “Quadratic Differential Forms”, or QDFs. Using QDFs I have investigated
different areas in systems and control theory in search of a unifying theme that binds these
areas together. Specifically, I have addressed problems in the following areas:
Dissipative systems: A parametrization of systems that are dissipative with respect to supply
functions defined by quadratic differential forms has been obtained.
KYP lemma: A generalization of the Kalman-Yakubovich-Popov (KYP) lemma has been obtained for systems that are dissipative with respect to supply functions defined by QDFs.
Absolute stability criteria: Using QDFs, absolute stability criteria have been obtained for a
large class of nonlinearities.
Polynomial J-spectral factorization: Using the algebra of QDFs, a new algorithm has been
developed for J-spectral factorization of polynomial matrices.
H∞ control: A new characterization of all solutions to the H∞ problem has been obtained. The
characterization is in terms of LTI dynamical systems that are dissipative with respect
to a “special” supply function defined by a QDF.
Modeling of data with bilinear and quadratic differential forms: An iterative algorithm has
been developed for computing all bilinear differential forms that model a given set of
data.
Nevanlinna-Pick interpolation: A characterization of all rational functions that satisfy, along
with given interpolation conditions, a “frequency dependent norm” condition has been
obtained.
Contents

1 Introduction
  1.1 Preview of the thesis
    1.1.1 Notation
  1.2 Quadratic Differential Forms
    1.2.1 Quadratic and Bilinear forms
    1.2.2 Representations of Quadratic Differential Forms
    1.2.3 Factorization of Quadratic Differential Forms
    1.2.4 Point-wise non-negative Quadratic Differential Forms
  1.3 Conclusion

2 Behavioral theory of dynamical systems
  2.1 Dynamical systems
  2.2 Linear differential systems
  2.3 The space of trajectories
  2.4 Latent variables and their elimination
  2.5 Equivalent representations
  2.6 Controllability and Observability
  2.7 Autonomous systems
  2.8 State representation
  2.9 Inputs and Outputs

3 A parametrization for dissipative systems
  3.1 Introduction
  3.2 Dissipativity in the Behavioral setting
  3.3 An equivalence relation on supply functions
  3.4 SISO dissipative systems
  3.5 MIMO dissipative systems: the constant inertia case
    3.5.1 Supply functions defined by constant matrices
    3.5.2 Supply functions defined by polynomial matrices
  3.6 MIMO dissipative systems: the general inertia case
    3.6.1 Parametrizing a set of Φ-dissipative behaviors using split sums
  3.7 Conclusion

4 KYP lemma and its extensions
  4.1 Introduction
  4.2 Storage functions for dissipative systems
  4.3 Classical KYP lemma in terms of storage functions
  4.4 Generalization of KYP lemma
    4.4.1 Generalization with respect to QDFs
  4.5 Strict versions of the KYP lemma
  4.6 Special case: KYP lemma for SISO systems
  4.7 Conclusion

5 Designing linear controllers for nonlinearities
  5.1 Introduction
  5.2 Preliminaries
  5.3 Control as an interconnection
  5.4 Nonlinear systems: problem formulation
  5.5 Constructing stabilizing controllers
    5.5.1 A recipe to obtain stabilizing behaviors for all nonlinearities in a given family
    5.5.2 Stability results
    5.5.3 A characterization of stabilizing controllers
  5.6 The Circle criterion
  5.7 Classical Popov Criterion
  5.8 Slope restricted nonlinearities
  5.9 Nonlinearities with memory
  5.10 Conclusion

6 Polynomial J-spectral factorization
  6.1 Introduction
  6.2 Σ-unitary modeling of dualized data
    6.2.1 Modeling vector-exponential time series with behaviors
    6.2.2 Data dualization, semi-simplicity, and the Pick matrix
    6.2.3 A procedure for Σ-unitary modeling
  6.3 J-spectral factorization via Σ-unitary modeling
  6.4 Numerical Aspects of the Algorithm
    6.4.1 Symmetric Canonical Factorization
    6.4.2 Computing singularities
    6.4.3 Implementation of iterations
    6.4.4 Computer implementation of polynomial J-spectral factorization
  6.5 Examples
  6.6 Conclusion

7 Synthesis of dissipative systems
  7.1 Introduction
  7.2 Problem formulation
  7.3 A Solution to the synthesis problem
  7.4 A characterization of all solutions of the synthesis problem
  7.5 Conclusion

8 Modeling of data with bilinear differential forms
  8.1 Introduction
  8.2 The problem statement
  8.3 A recursive algorithm for interpolating with BDFs
  8.4 Examples and applications
    8.4.1 Interpolation with BDFs
    8.4.2 Application 1: Interpolation with scalar bivariate polynomials
    8.4.3 Application 2: Storage functions
  8.5 Conclusion

9 Nevanlinna-Pick interpolation
  9.1 Introduction
  9.2 Nevanlinna-Pick interpolation – the standard case
    9.2.1 Dualizing of the data
  9.3 System theoretic implications of dualizing the data
  9.4 Nevanlinna-Pick problem with frequency dependent norms
  9.5 Conclusion

10 Conclusion and future work
  10.1 Summary of results
  10.2 Directions for further work

References

A Notation
Chapter 1
Introduction
In this thesis we study quadratic forms and their relationships and applications vis-a-vis systems and control. Quadratic forms have been studied at length in Physics, Mathematics and
Engineering. Consider the following well known examples:
1. The kinetic energy of a mass m moving with a speed v is $\frac{1}{2}mv^2$, a quadratic in v.
2. The power supplied to an electrical circuit is voltage × current, a bilinear expression in
voltage and current.
3. The energy stored in a capacitor with capacitance C and voltage V across it is $\frac{1}{2}CV^2$. Likewise, the energy stored in an inductor with inductance L and current I flowing through it is $\frac{1}{2}LI^2$.
4. Power supplied to a particle that is acted upon by a force F and has a speed v is given by $Fv$.
5. The energy stored in a mechanical spring with a spring constant K and a displacement x from the equilibrium is given by $\frac{1}{2}Kx^2$.
Is the idea of associating quadratic functionals with quantities representing power or energy simply one that is deeply ingrained in the human mind? Or is there something deeply subtle in nature due to which “energy-like” natural phenomena reveal themselves to us in such a way that a quadratic approximation is often the best one? Whatever the case, a study of quadratic functionals and their relationship with dynamical systems is a very important and interesting area of research.
The action of quadratic forms on dynamical systems can be studied from two viewpoints:
1. Quadratic forms that are useful in describing a dynamical system.
2. Quadratic forms that are helpful in design and analysis of a dynamical system.
We first consider the use of quadratic forms in describing systems. Describing a dynamical system means searching for a law that adequately explains its behavior.
All laws (or models) that have been found by physicists and engineers over the past several
centuries are approximations of observed natural phenomena. We generally want laws that
describe a system “sufficiently” accurately and are at the same time not too difficult to handle
analytically. The study of linear models has received great attention because of their simplicity.
However, not all systems can be adequately explained by linear laws. The next best in terms of
simplicity to a linear law is a quadratic law. A large class of systems that cannot be adequately
explained by a linear law can be explained by a quadratic law. Thus, quadratic forms can serve
as models for dynamical systems that cannot be adequately described by linear laws.
The use of quadratic forms in design and analysis of dynamical systems is more popular and
well-understood than the use of quadratic forms in modeling. The use stems from the deep and
intuitive association of a quadratic form with energy and power, as we have seen in the several
examples given above. A quadratic form can be associated with a dynamical system by studying
how the quadratic form changes along trajectories of the dynamical system. This association
has led to the development of an interesting area of research called “dissipative systems” which
concerns systems in which the net “generalized power” supplied is non-negative.
Due to our intuitive association of quadratic forms with energy and power, quadratic forms
have been used to formulate an energy-based theory for dynamical systems. We know from
the law of conservation of energy that an isolated system that loses energy with time must
eventually come to rest. Systems theorists, starting with A.M. Lyapunov, have generalized this
idea to construct energy like functionals, now called Lyapunov functions, to examine stability
of isolated systems. Quadratic forms are often good candidates for Lyapunov functions. Hence,
quadratic forms can be used to examine stability.
Another important use of quadratic forms is in optimization problems. In many control
problems, physically meaningful cost functionals are quadratic. This leads to the subject of
“Linear Quadratic” optimal control where one wants to constrain a linear system in such a way
that a cost specified by a quadratic form is minimized along trajectories of this system.
Thus, we see that a quadratic form is an immensely useful tool for studying dynamical
systems. While opinions may differ as to why a quadratic form is almost omnipresent in systems
theory, no one can deny its importance across several diverse disciplines in systems and control
theory. This thesis is a study of quadratic forms, albeit in a slightly different setting. We
study what are called Quadratic Differential Forms (QDFs). We show the importance and
versatility of QDFs by considering their applications across a broad spectrum of systems and
control theory. These include: dissipative systems, absolute stability criteria for nonlinear
systems, quadratic modeling, Nevanlinna-Pick interpolation, polynomial matrix factorization
and robust controller design.
1.1 Preview of the thesis
This thesis is organized as follows: in the remaining pages of this chapter, we introduce
Quadratic Differential Forms (QDFs), including notation and some elementary properties. We
use QDFs in tandem with a recent development in mathematical systems theory called the
behavioral theory of dynamical systems. In Chapter 2, we introduce some basic concepts from
behavioral systems theory. Using QDFs along with behavioral theoretic ideas, we investigate
problems in several different areas in systems and control theory and try to show the existence
of a common thread that binds all these areas together. We now give a summary of results
presented in Chapters 3 to 9 in this thesis.
1. Chapter 3 is titled “A parametrization for dissipative systems”: We address the problem
of how to construct all linear, time-invariant (LTI) dynamical systems that are dissipative
with respect to a “generalized power” defined using a QDF. We examine different cases
in an increasing order of complexity and show that under reasonable assumptions one can
obtain a complete parametrization of all LTI dissipative systems.
2. Chapter 4 titled “KYP lemma and its extensions” is a continuation of Chapter 3. Dissipative systems have certain associated functionals called storage functions. In this chapter,
we address the question of obtaining conditions for the existence of positive definite storage functions for dissipative systems. Results that we have presented in this chapter are
a generalization of the well known Kalman-Yakubovich-Popov lemma, which gives conditions for the existence of positive definite storage functions for passive systems. The
results that we obtain here are representation free.
3. In Chapter 5 we consider the absolute stability problem: given a family of nonlinearities,
obtain a class of linear time-invariant systems such that any system from this class, when
interconnected with any nonlinearity from the given family yields a stable system. We
use results obtained in Chapters 3 and 4 to construct Lyapunov functions for nonlinear
systems.
4. Chapter 6 titled “Polynomial J-spectral factorization” is about a novel algorithm for
obtaining a factorization of polynomial matrices called “polynomial J-spectral factorization”. The algorithm is guaranteed to yield a factorization (if it exists) in finitely many
steps. The algorithm has been found to have good numerical properties.
5. Chapter 7 is titled “Synthesis of dissipative systems”. Here, we address what is commonly
known as the “H∞ control problem”. We obtain a novel characterization of all solutions
to the H∞ problem using the idea of parametrization discussed in Chapters 3 and 4.
6. In Chapter 8 we consider the problem of exact modeling of data with bilinear and
quadratic differential forms. We obtain a recursive algorithm for modeling. We address two applications of the modeling scheme: computation of storage functions for
autonomous systems, and scalar interpolation with bivariate polynomials.
7. Chapter 9 is about a behavioral view of Nevanlinna-Pick interpolation problem. We
use QDFs to address this classical problem. We also address a generalization of the
Nevanlinna-Pick problem, where we obtain interpolating rational functions that also satisfy a frequency-weighted norm constraint, in contrast to the classical problem where this norm is frequency-independent.
1.1.1 Notation
The following notation is used throughout the thesis. The fields of real and complex numbers are denoted by R and C respectively. Integers are denoted by Z. Abstract vector spaces are
denoted by V. Rm (respectively Cm ) denotes the set of m-dimensional column vectors (over R
or C). Rm×p , respectively Cm×p , denotes the set of m × p matrices over R or C. The ring of
polynomials in ξ is denoted by R[ξ] or C[ξ] depending on the field. Polynomial matrices with
m rows and p columns are denoted by Rm×p[ξ] or Cm×p[ξ] depending on the field. Rw×• and Rw×•[ξ] denote the set of (real constant or polynomial) matrices having w rows. Likewise, R•×p and R•×p[ξ] denote (real constant or polynomial) matrices with p columns; analogously, C•×p and C•×p[ξ].
Given a linear operator K : V1 → V2 , where V1 , V2 are vector spaces, the kernel of K,
denoted by Ker K := {v ∈ V1 such that Kv = 0}. Likewise, the image of K, denoted by Im
K := {w ∈ V2 such that ∃v ∈ V1 satisfying w = Kv}.
Given w ∈ Z, by w! we mean the factorial of w. By ${}^{n}C_{w}$ we mean $\frac{n!}{w!\,(n-w)!}$.
1.2 Quadratic Differential Forms
1.2.1 Quadratic and Bilinear forms
We start with an introduction to quadratic and bilinear forms, which are special cases of what
we define as “quadratic differential forms”. All material presented in this section is standard
and can be found in numerous textbooks on matrices, [22] for instance.
Definition 1.2.1 A bilinear form on vector spaces (V1, V2) over the field F is a map ℓ : V1 × V2 → F which is linear in both of its arguments.
Being linear in both arguments, ℓ defines, for every v2 ∈ V2, a linear map ℓv2 on V1 given by ℓv2(v1) = ℓ(v1, v2). Notice that ℓv2 is an element of V1∗, the dual space of V1. The following properties of ℓ are a consequence of bilinearity:
1. ℓ(0, v2) = 0 for all v2 ∈ V2.
2. ℓ(v1 + v1′, v2) = ℓ(v1, v2) + ℓ(v1′, v2) for v1, v1′ ∈ V1 and v2 ∈ V2.
3. ℓ(kv1, v2) = kℓ(v1, v2) with k ∈ F.
A special case of interest is when V1 = V2 = Rn and F = R. In this case, ℓ is said to define a quadratic form. A quadratic form is a homogeneous polynomial of second degree in n variables x1, x2, ..., xn and has a representation $\sum_{i,k=1}^{n} a_{ik}\, x_i x_k$ with aik = aki, i = 1, ..., n (Gantmacher [22], page 294).
Another important special case is when V1 = V2 = Cn and F = R. These quadratic forms (together with a notion of symmetry) are called hermitian. A hermitian form is an expression of the form $A(x, x) = \sum_{i,j=1}^{n} a_{ij}\, x_i^{*} x_j$ with $a_{ij} = a_{ji}^{*}$. A quadratic or hermitian form can be studied by studying the matrix A = [aik]ni,k=1. Thus, a quadratic form defined by a matrix A = AT can be written in a compact way as xTAx, where x = [x1 x2 ... xn]T. Similarly, a hermitian form defined by a matrix A = A∗ can be written in a compact form as x∗Ax.
One of the most important properties of quadratic and hermitian forms is diagonalization: consider the hermitian form x∗Ax defined by the hermitian matrix A. Then, there exists a nonsingular transformation of the variables xi, given by $x_i = \sum_{j=1}^{n} t_{ij}\, y_j$, such that

$$x^{*}Ax = \sum_{i=1}^{\sigma_{+}(A)} y_i^{*} y_i \;-\; \sum_{i=\sigma_{+}(A)+1}^{\sigma_{+}(A)+\sigma_{-}(A)} y_i^{*} y_i$$

This diagonalization can be expressed conveniently in terms of the matrix A and the matrix T = [tij]ni,j=1 that defines the congruence transformation: consider the matrix Ã = T∗AT. Then, y∗Ãy = x∗Ax with Ã = diag[Iσ+(A), −Iσ−(A), 0n−σ+(A)−σ−(A)]. Sylvester's law of inertia ([22], page 296) states that the numbers σ+(A) and σ−(A) do not depend on the congruence transformation used and are intrinsic to the hermitian form defined by A. It can be shown that σ+(A) coincides with the number of positive eigenvalues of A and σ−(A) with the number of negative eigenvalues of A.
In many problems, hermitian forms that are positive definite are important. A hermitian
form defined by a matrix A = A∗ is said to be positive definite (respectively semidefinite)
if x∗ Ax > 0 for all nonzero x (respectively x∗ Ax ≥ 0 for all x). The definition for positive
definiteness and positive semidefiniteness for quadratic forms is analogous. The quadratic or
hermitian form defined by A is positive semidefinite if and only if σ− (A) = 0, and positive
definite if and only if σ− (A) = 0 and A is a nonsingular matrix. A compact notation for
denoting that A defines a positive definite (respectively positive semidefinite) quadratic or
hermitian form is A > 0 (respectively A ≥ 0).
We now define some well known terms about spectra of matrices [22]:
Definition 1.2.2 Let A = A∗ ∈ Cw×w . Let σ+ (A), σ− (A) and σ0 (A) denote respectively the
number of positive, negative and zero eigenvalues of A. Then,
1. The non-negative integer |σ+ (A) − σ− (A)| is called the signature of A.
2. The triple of non-negative integers (σ+(A), σ−(A), σ0(A)) is called the inertia of A.
1.2.2 Representations of Quadratic Differential Forms
In Section 1.2.1 we reviewed quadratic and bilinear forms. These are functions on real or
complex Euclidean vector spaces. However, in many system theoretic problems, bilinear and
quadratic forms that are a function of system variables and also their derivatives are commonly
encountered. For example, a Lagrangian is a function of generalized positions q and velocities q̇.
Power supplied to a mechanical system is defined as F q̇ where F is the force and q the position.
Such bilinear and quadratic functionals that also contain expressions involving derivatives can
be defined using a bilinear/quadratic differential form (B/QDF).
The concepts that we present in the remaining parts of this chapter are standard introductory material [103]. Consider a finite number of matrices Φkl ∈ Rw1×w2, k, l = 0, 1, ..., n. Let C∞(R, R•) denote the space of infinitely many times differentiable functions from R to R•. Let w1 ∈ C∞(R, Rw1) and w2 ∈ C∞(R, Rw2). Consider the following expression involving w1, w2 and their derivatives:

$$\sum_{k,l=0}^{n} \Bigl(\frac{d^k}{dt^k}w_1\Bigr)^{T} \Phi_{kl}\, \frac{d^l}{dt^l}w_2 \qquad (1.1)$$
The expression in (1.1) is bilinear in w1 and w2 and is a map from C∞(R, Rw1) × C∞(R, Rw2) to C∞(R, R). Expressions like (1.1) can be conveniently represented by using the following notation: define a bivariate polynomial matrix

$$\Phi(\zeta, \eta) = \sum_{k,l=0}^{n} \Phi_{kl}\, \zeta^{k}\eta^{l} \qquad (1.2)$$

Then Φ(ζ, η) can be associated with the bilinear expression (1.1) by associating the monomial Φklζkηl in (1.2) with the differential operator $\bigl(\frac{d^k}{dt^k}\bigr)^{T}\Phi_{kl}\frac{d^l}{dt^l}$ in (1.1).
In the sequel, the set of all w1 × w2 bivariate polynomial matrices in ζ, η with real coefficients
will be denoted by Rw1 ×w2 [ζ, η].
Definition 1.2.3 The matrix Φ(ζ, η) ∈ Rw1×w2[ζ, η] is said to define a bilinear differential form (BDF) LΦ, which is a map

$$L_\Phi : C^\infty(\mathbb{R}, \mathbb{R}^{w_1}) \times C^\infty(\mathbb{R}, \mathbb{R}^{w_2}) \to C^\infty(\mathbb{R}, \mathbb{R})$$

defined by $L_\Phi(w_1, w_2) = \sum_{k,l} \bigl(\frac{d^k}{dt^k}w_1\bigr)^{T} \Phi_{kl}\, \frac{d^l}{dt^l}w_2$ for w1 ∈ C∞(R, Rw1) and w2 ∈ C∞(R, Rw2).
The special case in which both arguments have the same dimension (w1 = w2 = w) and are the same trajectory (w1 = w2 = w) is more interesting. Under these conditions, the bilinear differential form LΦ(w, w) is called a quadratic differential form:

Definition 1.2.4 The matrix Φ(ζ, η) ∈ Rw×w[ζ, η] is said to define a quadratic differential form (QDF) QΦ, which is a map

$$Q_\Phi : C^\infty(\mathbb{R}, \mathbb{R}^{w}) \to C^\infty(\mathbb{R}, \mathbb{R})$$

defined by $Q_\Phi(w) = L_\Phi(w, w) = \sum_{k,l} \bigl(\frac{d^k}{dt^k}w\bigr)^{T} \Phi_{kl}\, \frac{d^l}{dt^l}w$ for w ∈ C∞(R, Rw).
Example 1.2.5 Let Φ1(ζ, η) = ζ + η + ζη. Consider ℓ1, ℓ2 ∈ C∞(R, R). Then, $L_{\Phi_1}(\ell_1, \ell_2) = \frac{d\ell_1}{dt}\cdot\ell_2 + \ell_1\cdot\frac{d\ell_2}{dt} + \frac{d\ell_1}{dt}\cdot\frac{d\ell_2}{dt}$. When ℓ1 = ℓ2 = ℓ, $Q_{\Phi_1}(\ell) = 2\ell\cdot\frac{d\ell}{dt} + \bigl(\frac{d\ell}{dt}\bigr)^2$.
Example 1.2.6 Let $\Phi_2(\zeta, \eta) = \begin{bmatrix} \zeta & \eta \\ 1 & 0 \end{bmatrix}$. Let ℓ = (ℓ1, ℓ2)T ∈ C∞(R, R2). Then,

$$Q_{\Phi_2}(\ell) = \frac{d\ell_1}{dt}\cdot\ell_1 + \ell_1\cdot\frac{d\ell_2}{dt} + \ell_2\ell_1$$
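To make the action of a QDF concrete, here is a small symbolic sketch in Python/sympy (an illustration we add; the helper qdf is hypothetical and handles only the scalar case) that applies the QDF of Example 1.2.5 to a sample trajectory.

```python
# Illustration only: evaluating Q_Phi(w) for a scalar w from Phi(zeta, eta).
import sympy as sp

t, zeta, eta = sp.symbols('t zeta eta')

def qdf(Phi, w):
    """Each monomial zeta**k * eta**l of Phi acts as (d^k w/dt^k)*(d^l w/dt^l)."""
    P = sp.Poly(Phi, zeta, eta)
    return sp.expand(sum(c * sp.diff(w, t, k) * sp.diff(w, t, l)
                         for (k, l), c in P.terms()))

Phi1 = zeta + eta + zeta * eta          # Phi_1 of Example 1.2.5
w = sp.exp(t)                           # a sample C^infinity trajectory
print(qdf(Phi1, w))                     # 3*exp(2*t), i.e. 2*w*w' + (w')**2 at w = e^t
```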
Using algebraic properties of bivariate polynomial matrices, a calculus can be built for QDFs/BDFs. Using this calculus, properties of QDFs/BDFs can be translated into equivalent properties of bivariate polynomial matrices. An instance of this calculus is the asterisk operator ⋆ defined as follows:

$$\star : \mathbb{R}^{w_1 \times w_2}[\zeta, \eta] \to \mathbb{R}^{w_2 \times w_1}[\zeta, \eta]; \qquad \Phi^{\star}(\zeta, \eta) = \Phi^{T}(\eta, \zeta)$$

where T denotes the (usual) matrix transposition. Clearly, LΦ(w, v) = LΦ⋆(v, w). Bivariate polynomial matrices that satisfy Φ(ζ, η) = Φ⋆(ζ, η) are called symmetric. Clearly, a necessary condition for Φ(ζ, η) to be symmetric is that it be square.
Notation 1.2.7 The set of w × w symmetric bivariate polynomial matrices in ζ, η will be denoted throughout this thesis by R^{w×w}_s[ζ, η].
Example 1.2.8 Consider Φ1(ζ, η) in Example 1.2.5: Φ1(ζ, η) = ζ + η + ζη; Φ1(ζ, η) is symmetric. Consider Φ2(ζ, η) in Example 1.2.6: $\Phi_2(\zeta, \eta) = \begin{bmatrix} \zeta & \eta \\ 1 & 0 \end{bmatrix}$; Φ2(ζ, η) is not symmetric.
Since LΦ(w, v) = LΦ⋆(v, w) it follows that

$$Q_\Phi(w) = Q_{\Phi^{\star}}(w) = Q_{\frac{\Phi + \Phi^{\star}}{2}}(w)$$
Example 1.2.9 Consider QΦ2(ℓ) as defined in Example 1.2.6: $\frac{d\ell_1}{dt}\cdot\ell_1 + \ell_1\cdot\frac{d\ell_2}{dt} + \ell_2\ell_1$. Compute

$$\Phi(\zeta, \eta) = \frac{\Phi_2(\zeta, \eta) + \Phi_2^{T}(\eta, \zeta)}{2} = \frac{1}{2}\begin{bmatrix} \zeta + \eta & 1 + \eta \\ 1 + \zeta & 0 \end{bmatrix}$$

Then, $Q_\Phi(\ell) = \frac{1}{2}\bigl(2\ell_1\cdot\frac{d\ell_1}{dt} + \ell_1\ell_2 + \ell_1\cdot\frac{d\ell_2}{dt} + \ell_2\ell_1 + \frac{d\ell_2}{dt}\cdot\ell_1\bigr)$, which is precisely QΦ2(ℓ).
Therefore, for the purpose of studying QDFs, one may assume that the QDF is defined by a
symmetric bivariate polynomial matrix without loss of generality. Henceforth, we assume that
the QDF QΦ is defined by Φ(ζ, η) = ΦT (η, ζ) unless otherwise mentioned.
Another instance of the calculus for QDFs/BDFs is differentiation. Clearly, if LΦ is a BDF, then so is $\frac{d}{dt}L_\Phi$. The result of this differentiation can be elegantly represented in terms of two-variable polynomial matrices and leads to the • operator defined as

$$\bullet : \mathbb{R}^{w_1 \times w_2}[\zeta, \eta] \to \mathbb{R}^{w_1 \times w_2}[\zeta, \eta]; \qquad \Phi^{\bullet}(\zeta, \eta) := (\zeta + \eta)\,\Phi(\zeta, \eta)$$

It can be seen that

$$\frac{d}{dt}L_\Phi(w_1, w_2) = L_{\Phi^{\bullet}}(w_1, w_2) \quad \text{and} \quad \frac{d}{dt}Q_\Phi(w) = Q_{\Phi^{\bullet}}(w)$$
Example 1.2.10 Consider QΦ1(ℓ) in Example 1.2.5: $2\ell\cdot\frac{d\ell}{dt} + \bigl(\frac{d\ell}{dt}\bigr)^2$. Then,

$$\frac{d}{dt}Q_{\Phi_1}(\ell) = 2\ell\cdot\frac{d^2\ell}{dt^2} + 2\Bigl(\frac{d\ell}{dt}\Bigr)^2 + 2\,\frac{d\ell}{dt}\cdot\frac{d^2\ell}{dt^2}$$

Now consider the two-variable polynomial matrix

$$\Psi_1(\zeta, \eta) = (\zeta + \eta)\Phi_1(\zeta, \eta) = \zeta^2 + 2\zeta\eta + \eta^2 + \zeta^2\eta + \eta^2\zeta$$

It is easy to see that

$$\frac{d}{dt}Q_{\Phi_1}(\ell) = Q_{\Psi_1}(\ell) = 2\Bigl[\ell\cdot\frac{d^2\ell}{dt^2} + \Bigl(\frac{d\ell}{dt}\Bigr)^2 + \frac{d\ell}{dt}\cdot\frac{d^2\ell}{dt^2}\Bigr]$$
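The differentiation rule is also easy to verify symbolically; the following sketch (ours, with the same hypothetical scalar qdf helper as above) checks d/dt QΦ1 = QΨ1 on an arbitrary smooth trajectory.

```python
# Illustration only: check d/dt Q_Phi(w) = Q_Psi(w) with Psi = (zeta + eta)*Phi.
import sympy as sp

t, zeta, eta = sp.symbols('t zeta eta')

def qdf(Phi, w):
    P = sp.Poly(Phi, zeta, eta)
    return sp.expand(sum(c * sp.diff(w, t, k) * sp.diff(w, t, l)
                         for (k, l), c in P.terms()))

Phi1 = zeta + eta + zeta * eta              # Phi_1 of Example 1.2.5
Psi1 = sp.expand((zeta + eta) * Phi1)       # Psi_1 of Example 1.2.10
w = sp.sin(t) + t**2                        # an arbitrary smooth trajectory
assert sp.simplify(sp.diff(qdf(Phi1, w), t) - qdf(Psi1, w)) == 0
```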
Given Φ(ζ, η) ∈ R^{w×w}_s[ζ, η], consider the matrix Φ(−ξ, ξ) obtained from Φ(ζ, η) by substituting ζ = −ξ and η = ξ. Then, Φ(−ξ, ξ) is what is called a “para-Hermitian” matrix:
Definition 1.2.11 A matrix Z(ξ) ∈ Rw×w [ξ] is called para-Hermitian if Z(ξ) = Z T (−ξ).
Para-Hermitian matrices are interesting in many problems in systems and control theory and
signal processing. We will again encounter para-Hermitian matrices in Chapter 6 where we
study factorizations of these matrices.
1.2.3 Factorization of Quadratic Differential Forms
Analogous to the reduction of a quadratic form to a sum of squares, which we reviewed in
Section 1.2.1, we would like to express a QDF in some “canonical” form so that it is easier to
manipulate.
Consider a two-variable polynomial matrix Φ(ζ, η) ∈ Rw×w[ζ, η]. We can associate the following matrix with Φ(ζ, η):

$$\tilde{\Phi} = \begin{bmatrix} \Phi_{00} & \Phi_{01} & \cdots \\ \Phi_{10} & \Phi_{11} & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix} \qquad (1.3)$$

It is easy to verify that

$$\Phi(\zeta, \eta) = \begin{bmatrix} I & I\zeta & \cdots & I\zeta^{k} & \cdots \end{bmatrix} \tilde{\Phi} \begin{bmatrix} I \\ I\eta \\ \vdots \\ I\eta^{l} \\ \vdots \end{bmatrix} \qquad (1.4)$$

Note that only finitely many rows and columns of Φ̃ are non-zero. Since Φ(ζ, η) may be assumed symmetric, Φ̃ is a symmetric matrix, i.e. Φkl = ΦlkT for all k, l.
We have reviewed Sylvester's law of inertia in Section 1.2.1, which says that the quadratic form defined by Φ̃ can be expressed as a sum and difference of squares. Hence, there exist matrices T, Λ such that Φ̃ = TTΛT, where Λ = diag[Iσ+(Φ̃), −Iσ−(Φ̃)]. Consequently, Φ(ζ, η) admits the factorization

$$\Phi(\zeta, \eta) = M^{T}(\zeta)\, \Lambda\, M(\eta)$$

If T is chosen to be surjective, then Λ is unique modulo a congruence transformation. The QDF QΦ can now be written as

$$Q_\Phi(w) = \Bigl[P\bigl(\tfrac{d}{dt}\bigr)w\Bigr]^{T} P\bigl(\tfrac{d}{dt}\bigr)w \;-\; \Bigl[N\bigl(\tfrac{d}{dt}\bigr)w\Bigr]^{T} N\bigl(\tfrac{d}{dt}\bigr)w$$

where P(ξ), N(ξ) are obtained by partitioning the rows of M(ξ) conformably with the blocks Iσ+(Φ̃) and −Iσ−(Φ̃) in Λ. Such a factorization of QΦ is known as a symmetric canonical factorization.
We now define what we mean by observability of a QDF:
Definition 1.2.12 A QDF QΦ is called observable if the matrix M in any symmetric canonical factorization Φ(ζ, η) = M(ζ)TΛM(η) is such that M(λ) has full column rank for all λ ∈ C.
1.2.4 Point-wise non-negative Quadratic Differential Forms
Sign-definite QDFs are interesting in a large number of problems: synthesis of passive systems, construction of Lyapunov functions, solution of interpolation problems, etc. A QDF QΦ is called positive semidefinite if QΦ(w)(t) ≥ 0 for all w ∈ C∞(R, Rw) and all t ∈ R. A positive semidefinite QDF is called positive definite if, in addition, QΦ(w) = 0 if and only if w = 0.
Example 1.2.13 Let Φ3(ζ, η) = 1 + ζη. Then, $Q_{\Phi_3}(\ell) = \ell^2 + \bigl(\frac{d\ell}{dt}\bigr)^2$. QΦ3(ℓ) is non-negative for all ℓ, and is zero if and only if ℓ = 0. Therefore, QΦ3 is a positive definite QDF.
Let Φ4(ζ, η) = 1 + ζ + η + ζη. Then, $Q_{\Phi_4}(\ell) = \ell^2 + 2\ell\cdot\frac{d\ell}{dt} + \bigl(\frac{d\ell}{dt}\bigr)^2$. We may simplify this expression to obtain $Q_{\Phi_4}(\ell) = \bigl(\ell + \frac{d\ell}{dt}\bigr)^2$. Clearly, QΦ4(ℓ) is non-negative for all ℓ and is zero if and only if ℓ = ce−t, c ∈ R. Hence, the QDF QΦ4 is positive semidefinite.
One can easily check whether a QDF QΦ is positive semidefinite by means of a symmetric canonical factorization. Proceed by first writing Φ(ζ, η) as in equation (1.4). If Φ̃ is positive (semi)definite, QΦ is clearly positive (semi)definite. We now prove the converse: if Φ̃ is not positive semidefinite, there exists a vector v ∈ R• such that vTΦ̃v < 0. One can find w ∈ C∞(R, Rw) whose stacked derivatives col(w, dw/dt, d²w/dt², ...) at some instant t equal v. Then, clearly, QΦ(w)(t) < 0. Therefore, QΦ cannot be positive semidefinite if Φ̃ is not positive semidefinite. Hence:
Corollary 1.2.14 For a QDF QΦ to be positive semidefinite, it is necessary and sufficient that the truncated coefficient matrix Φ̃′, obtained by retaining only the maximum rank sub-matrix of the matrix Φ̃ defined in (1.3), be positive semidefinite.
Example 1.2.15 In Example 1.2.13, consider Φ4(ζ, η) = 1 + ζ + η + ζη. It is easy to see that in this case

$$\tilde{\Phi}_4 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$$

which is clearly positive semidefinite. Hence QΦ4 is positive semidefinite.
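Since Corollary 1.2.14 reduces positivity of a QDF to ordinary matrix semidefiniteness, the test is a one-liner numerically; a minimal sketch (ours, assuming the truncated coefficient matrix has already been formed):

```python
# Illustration only: positivity test for a QDF via its coefficient matrix.
import numpy as np

def qdf_is_psd(Phi_tilde, tol=1e-9):
    eig = np.linalg.eigvalsh(np.asarray(Phi_tilde, dtype=float))
    return bool(np.all(eig >= -tol))

print(qdf_is_psd([[1, 1], [1, 1]]))     # True:  Phi_4 of Example 1.2.15
print(qdf_is_psd([[1, 0], [0, -1]]))    # False: Q(w) = w^2 - (dw/dt)^2 is indefinite
```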
1.3 Conclusion
In this chapter we have reviewed elementary material on quadratic and hermitian forms. We
have defined quadratic and bilinear differential forms and have obtained representations for
them using two-variable polynomial matrices. The use of two-variable polynomial matrices
lets us define a calculus for QDFs, with the help of which we have obtained an expression for
the derivative of a QDF. We have addressed a “symmetric canonical factorization” of a QDF
which will be useful in many problems. We have also defined the notion of sign-definiteness and
semidefiniteness of QDFs and have obtained conditions under which a given QDF is positive
definite or positive semidefinite.
Chapter 2
Behavioral theory of dynamical systems
In this chapter, we introduce basic concepts that form the backbone of the behavioral approach
to dynamical systems. We try to show that the strength of the behavioral approach comes from
its formal setting which makes the treatment of dynamical systems fairly general. Properties of
a dynamical system (for example linearity, time/shift invariance) can be defined in this formal
setting without recourse to a definition involving a specific model of the dynamical system. We
consider the so-called differential systems, i.e. those dynamical systems that can be modeled
by differential equations. Though in this thesis we also consider nonlinear differential systems,
in the present chapter we only consider linear differential systems, since the chapter is intended
to be of a introductory nature. We review various kinds of models (representations) for linear
differential systems and show how so-called latent variables arise naturally. The important
aspect of elimination is also considered. We review important properties of a linear differential
system like controllability and observability. We then consider autonomous dynamical systems
and define stability for these systems. Further, we define the notion of inputs and outputs, and
some invariants associated with a dynamical system.
2.1 Dynamical systems
The starting point of our study is the notion of a dynamical system. When modeling a system,
we are trying to describe the way in which the variables of the system evolve. Let w denote
a vector of variables whose components consist of system variables. We stipulate that w takes
values in W, the signal space. Usually w itself is a function of an independent variable, called time, which, we stipulate, takes values in a set called the time axis T. Let WT denote the set of maps from T to W; then w ∈ WT. Not every map w is allowed by the laws that govern
the system evolution. The set of maps that are allowed by the system is precisely the object
of our study, and is called the behavior of the system. The laws that govern the system bring
about a restriction of WT to the behavior of the system. This leads to the following definition
of a dynamical system, given by Willems [101]:
Definition 2.1.1 A dynamical system Σ is a triple Σ = (T, W, B) with T an indexing set
(time axis), W the signal space and B ⊆ WT called the behavior of Σ.
When T and W are clear from the context, or have been explicitly specified, we will, for ease of
notation, use the terms “behavior” and “dynamical system” interchangeably since there is little
scope for ambiguity. Thus, when we say that T = R and W = Rw and WT is the set of infinitely
many times differentiable functions from R to Rw (denoted as C ∞ (R, Rw )), by a behavior B we
mean B ⊆ C ∞ (R, Rw ). In order to define Σ completely, we need to specify the behavior B.
This is usually done with a representation of B. A representation is typically obtained from
equations defining the laws obeyed by Σ.
We now address some properties of dynamical systems. In this chapter we only consider
dynamical systems that are
1. linear
2. time-invariant
3. described by ordinary differential equations
In the context of continuous time dynamical systems, T can be taken to be R and W to be Rw .
Definition 2.1.2 A dynamical system Σ = (R, Rw , B) is called linear if for all w1 , w2 ∈ B and
α1 , α2 ∈ R, α1 w1 + α2 w2 ∈ B.
Linearity of a behavior B ⊂ C ∞ (R, Rw ) is related to B being a vector space over R and is
equivalent to Σ obeying the superposition principle.
The property of time-invariance can be formulated by stipulating that the laws that describe
a behavior are independent of time (time invariant):
Definition 2.1.3 A dynamical system Σ = (R, Rw , B) is called time-invariant if for all w(t) ∈
B and τ ∈ R, w(t + τ ) ∈ B.
2.2 Linear differential systems
We now consider linear systems that are defined by constant-coefficient ordinary differential equations (ODEs). Assume that we have m constant-coefficient ODEs in the Rw-valued variable w and we are interested in solutions (in some suitable function space) w : R → Rw to the m equations:

$$R_0 w + R_1 \frac{d}{dt}w + \ldots + R_n \frac{d^n}{dt^n}w = 0 \qquad (2.1)$$

where Ri ∈ Rm×w, i = 0, ..., n.
We introduce the polynomial matrix R(ξ) := R0 + R1ξ + ... + Rnξn. A concise way of writing the m equations in (2.1) is R(d/dt)w = 0. Suppose we are interested in C∞-solutions to (2.1). Then we may define the solution set (the behavior B) as:

$$B = \{w \in C^{\infty}(\mathbb{R}, \mathbb{R}^{w}) \mid R\bigl(\tfrac{d}{dt}\bigr)w = 0\} \qquad (2.2)$$
Thus, R(d/dt) is a differential operator from C∞(R, Rw) to C∞(R, Rm), and B is the kernel of R(d/dt). Hence, the representation of B in equation (2.2) is called a kernel representation. Since R(d/dt) is a linear operator, B is linear. Since the coefficients of R(ξ) (i.e. the matrices Ri, i = 0, ..., n
in (2.1)) do not explicitly depend on time, B is time-invariant. We have assumed that every
trajectory in B is infinitely many times differentiable. This is more for the sake of mathematical
convenience. However, with appropriate modifications, different function spaces can also be
used. For example in Chapter 5, where we also address nonlinear systems, we will consider
behaviors in the space of locally integrable functions.
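As a computational illustration of a kernel representation (our sketch, with a made-up R(ξ)): every vector-exponential trajectory w(t) = v e^{λt} with R(λ)v = 0 lies in the behavior B = Ker R(d/dt), which the following sympy code verifies.

```python
# Illustration only: exponential trajectories in a kernel representation.
import sympy as sp

t, xi = sp.symbols('t xi')
R = sp.Matrix([[xi + 2, -1],            # a made-up R(xi), chosen for illustration
               [0, xi + 1]])

lam = -1                                # a root of det R(xi) = (xi + 2)(xi + 1)
v = R.subs(xi, lam).nullspace()[0]      # v with R(lam) v = 0
w = v * sp.exp(lam * t)                 # candidate trajectory w(t) = v e^{lam t}

# apply the differential operator R(d/dt) row by row
Rw = sp.Matrix([
    sum(c * sp.diff(w[j], t, k)
        for j in range(R.cols)
        for (k,), c in sp.Poly(R[i, j], xi).terms())
    for i in range(R.rows)])
assert Rw.applyfunc(sp.simplify) == sp.zeros(R.rows, 1)   # w is in the behavior
```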
Throughout this thesis we denote the set of linear differential systems with w variables (with
solutions in a pre-defined function space, say C ∞ ) as Lw (the “L” stands for linear).
While a kernel representation defines the behavior uniquely, the converse is not true, namely,
the same behavior could be defined by a number of kernel representations. Therefore, in the
behavioral approach, the behavior of a system takes precedence over a specific representation
for the system. Hence, we make a distinction between the behavior, defined as the solution set
of a system of equations, and the system of equations itself. We consider the question of when
two representations describe the same behavior in section 2.5 below.
[Figure 2.1: A simple RC circuit]

Example 2.2.1 Consider the RC circuit shown in Figure 2.1. Assume that we are interested in the voltage V and current I which are admissible, i.e. the pairs (V, I) that respect the law defined by the circuit. It is clear that the time axis T in this case is R, and the signal space W is R2. Kirchhoff's voltage and current laws tell us that only those V and I are admissible that satisfy the ODE

$$C\frac{d}{dt}V + \frac{V}{R_2} = R_1 C\frac{d}{dt}I + \Bigl(1 + \frac{R_1}{R_2}\Bigr) I$$

If we define the matrix

$$R(\xi) = \begin{bmatrix} C\xi + \frac{1}{R_2} & \; -R_1 C\xi - \bigl(1 + \frac{R_1}{R_2}\bigr) \end{bmatrix}$$

the behavior B of the RC circuit can then be defined as:

$$B = \Bigl\{(V, I) \in C^{\infty}(\mathbb{R}, \mathbb{R}^2) \text{ such that } R\bigl(\tfrac{d}{dt}\bigr)\begin{bmatrix} V \\ I \end{bmatrix} = 0\Bigr\}$$
2.3 The space of trajectories
In the previous section we considered behaviors that have smooth (C ∞ ) trajectories. However,
in this thesis behaviors in several other function spaces will be encountered. We now summarize
the notation for function spaces encountered in this thesis:
• C∞(R, Rw): the space of infinitely many times differentiable functions from R to Rw.
• D(R, Rw): the space of compactly supported C∞ functions from R to Rw.
• D′(R, Rw): the dual space of D(R, Rw), also called the space of Rw-valued distributions on R.
• L1loc(R, Rw): the space of all locally integrable functions from R to Rw, i.e. all w : R → Rw such that $\int_a^b |w(t)|\,dt < \infty$ for all a, b ∈ R.

Consider a behavior B defined as {w | R(d/dt)w = 0} with R(ξ) = R0 + R1ξ + ... + Rnξn. To assure differentiability, it is enough for w to be n times differentiable. A solution w of R(d/dt)w = 0 is called a strong solution if it is at least n times differentiable. In particular, all C∞ solutions of R(d/dt)w = 0 are strong solutions.
The disadvantage of working with the function space C∞ is that it excludes many important trajectories that are commonly used in systems and control theory, for example steps, impulses and so on. Also, in nonlinear systems (Chapter 5), trajectories are in general not smooth, due to the presence of, for example, relays. This motivates us to consider the space L1loc. We assume that differentiation of L1loc functions is in the sense of distributions, as follows.

Let φ ∈ D(R, Rw) and w ∈ C∞(R, Rw). We define $\langle w, \varphi \rangle := \int_{-\infty}^{\infty} w^{T}(t)\varphi(t)\,dt$. Using integration by parts:

$$\Bigl\langle \frac{d}{dt}w, \varphi \Bigr\rangle = \int_{-\infty}^{\infty} \Bigl(\frac{d}{dt}w\Bigr)^{T} \varphi\, dt = -\int_{-\infty}^{\infty} w^{T}\, \frac{d}{dt}\varphi\, dt = \Bigl\langle w, -\frac{d}{dt}\varphi \Bigr\rangle \qquad (2.3)$$

By using integration by parts repeatedly one obtains $\langle \frac{d^i}{dt^i}w, \varphi \rangle = \langle w, (-1)^i \frac{d^i}{dt^i}\varphi \rangle$. Now assume that w satisfies the system of equations R(d/dt)w = 0 with R(ξ) = R0 + R1ξ + ... + Rnξn, Ri ∈ Rm×w, i = 0, ..., n. Then,

$$R(-\xi) = R_0 - R_1\xi + \ldots + (-1)^i R_i \xi^i + \ldots + (-1)^n R_n \xi^n$$

Consider the differential operator RT(−d/dt) and a trajectory ψ ∈ D(R, Rm). Along the lines of equation (2.3), it can be seen that $\langle R(\tfrac{d}{dt})w, \psi \rangle = \langle w, R^{T}(-\tfrac{d}{dt})\psi \rangle$ for all ψ ∈ D(R, Rm). Hence,

$$R\bigl(\tfrac{d}{dt}\bigr)w = 0 \iff \bigl\langle w, R^{T}\bigl(-\tfrac{d}{dt}\bigr)\psi \bigr\rangle = 0 \quad \forall\, \psi \in \mathcal{D}(\mathbb{R}, \mathbb{R}^{m}) \qquad (2.4)$$
We had assumed w ∈ C∞(R, Rw); however, the same argument holds for any strong solution (i.e. a solution that is at least n times differentiable). The property (2.4) of strong solutions is used to define weak solutions:

Definition 2.3.1 We call w ∈ L1loc(R, Rw) a weak solution of R(d/dt)w = 0, R(ξ) ∈ Rm×w[ξ], if for all ψ ∈ D(R, Rm) we have ⟨w, RT(−d/dt)ψ⟩ = 0.

Thus, when considering functions that are locally integrable, the equation R(d/dt)w = 0 is understood to hold in a distributional sense.
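The defining pairing can also be checked numerically. The following sketch (our illustration; the test function is the standard smooth bump supported on (−1, 1)) verifies the integration-by-parts identity (2.3) with scipy.

```python
# Illustration only: <dw/dt, phi> = <w, -dphi/dt> for a compactly supported phi.
import numpy as np
from scipy.integrate import quad

def phi(t):       # a C^infinity bump function supported on (-1, 1)
    return np.exp(-1.0 / (1.0 - t**2)) if abs(t) < 1 else 0.0

def dphi(t):      # its derivative (chain rule), also supported on (-1, 1)
    return phi(t) * (-2.0 * t / (1.0 - t**2)**2) if abs(t) < 1 else 0.0

w, dw = np.sin, np.cos                   # w(t) = sin t and its derivative

lhs, _ = quad(lambda s: dw(s) * phi(s), -1, 1)
rhs, _ = quad(lambda s: -w(s) * dphi(s), -1, 1)
print(lhs, rhs)                          # the two pairings agree numerically
```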
With this brief review of weak solutions of differential equations, we return to various
modeling issues for dynamical systems and show how auxiliary (“latent”) variables are naturally
introduced in any systematic modeling procedure.
2.4 Latent variables and their elimination
Most systems that we encounter during modeling are made up of smaller, simpler subsystems
that are interconnected to each other via their terminals. A systematic procedure for modeling
a system can be to first model every subsystem and then use the interconnection relations to
build a model for the entire system. This procedure is called modeling by tearing and zooming.
As a result of this systematic procedure, we invariably obtain a set of equations with additional
variables called latent variables. The latent variables are different from the variables we are
really interested in – which we call the manifest variables. We have defined a dynamical system
in Definition 2.1.1 using just manifest variables, it is also possible to define the dynamical
system using both latent and manifest variables. Such a definition gives rise to what is called
a “full behavior”:
Definition 2.4.1 A dynamical system with latent variables is a quadruple ΣL = (T, W, L, Bfull) where T is the time axis, W is the space of manifest variables, L is the space of latent variables and Bfull ⊆ (W × L)T, i.e. Bfull is a set of maps from T to W × L and is called the “full behavior”.
Representation of a behavior that has both manifest as well as latent variables is called a latent variable representation, or a hybrid representation. A latent variable representation is very simple to obtain and generally does not require any “post-processing”, unlike the kernel representation we have seen above, or the image representation which we will encounter later. A latent variable representation for a behavior with manifest variables w has the form:

$$R\bigl(\tfrac{d}{dt}\bigr)w = M\bigl(\tfrac{d}{dt}\bigr)\ell$$

where ℓ is an auxiliary (possibly vector-valued) latent variable, and R(ξ), M(ξ) are polynomial matrices of appropriate dimensions.
Consider w ∈ WT and ℓ ∈ LT. We consider the projection operator Π : (W × L)T → WT defined as Π(w, ℓ) := w. Then, the behavior B := ΠBfull is called the manifest behavior induced by Bfull. Assuming Bfull to be linear and time-invariant, we address the following issues about B:
1. Can the manifest behavior B, associated to Bfull, be described as the solution set of a system of linear differential equations?
2. If B can be described as the solution set of a system of differential equations, does it inherit properties of Bfull like linearity and time-invariance?
The question whether B is a linear differential system depends on the function space under
consideration. In the important case when Bfull is a C ∞ behavior, the manifest behavior B
can be expressed as the solution set of a system of linear differential equations. This is a consequence of the all-important elimination theorem:

Theorem 2.4.2 Let Bfull := {(w, ℓ) ∈ C∞(R, Rw) × C∞(R, Rl)} ∈ Lw+l. Consider the behavior
B := {w ∈ C∞(R, Rw) such that ∃ ℓ ∈ C∞(R, Rl) with (w, ℓ) ∈ Bfull}
Then, B ∈ Lw.
Thus, if Bfull ∈ Lw+l is a C∞ behavior, then B := ΠBfull ∈ Lw. The elimination theorem has
important consequences in the context of modeling. During the process of modeling we need
to introduce additional variables that come up naturally. As a consequence of the elimination
theorem, latent variables are not a problem since they can always be eliminated, provided we
restrict to C ∞ trajectories.
Since we will be considering L1loc trajectories in Chapter 5, it is important to elaborate on elimination in L1loc. The elimination theorem may not hold in L1loc. If it does, the latent variables ℓ are called properly eliminable [74]. We use the following method ([75], Section 6.2) for elimination in L1loc. Consider

$$B_{\mathrm{full}} = \{(w, \ell) \in L_1^{\mathrm{loc}}(\mathbb{R}, \mathbb{R}^{w+l}) \mid R\bigl(\tfrac{d}{dt}\bigr)w = M\bigl(\tfrac{d}{dt}\bigr)\ell\}$$

Define B = ΠBfull:

$$B = \{w \in L_1^{\mathrm{loc}}(\mathbb{R}, \mathbb{R}^{w}) \mid \exists\, \ell \text{ such that } (w, \ell) \in B_{\mathrm{full}}\}$$
In general there may not exist a polynomial matrix R′(ξ) ∈ R•×w[ξ] that induces a kernel representation for B; however, the closure of B does admit a kernel representation. Define

$$B'_{\mathrm{full}} = \{(w', \ell) \in C^{\infty}(\mathbb{R}, \mathbb{R}^{w+l}) \mid R\bigl(\tfrac{d}{dt}\bigr)w' = M\bigl(\tfrac{d}{dt}\bigr)\ell\}$$

Clearly, B′full ⊆ Bfull and in fact contains all C∞ trajectories in Bfull. By the elimination theorem, there exists a matrix R′(ξ) such that R′(d/dt)w′ = 0. We use R′(ξ) to define a behavior B″ as follows:

$$B'' = \{w \in L_1^{\mathrm{loc}}(\mathbb{R}, \mathbb{R}^{w}) \mid R'\bigl(\tfrac{d}{dt}\bigr)w = 0\}$$

It has been shown in [75], Section 6.2, that B″ is the closure of B in the topology of L1loc.
We now explain the concepts behind elimination using the circuit in Example 2.2.1.
Example 2.4.3 We refer to the circuit in Figure 2.1, which has been redrawn in Figure 2.2 to
emphasize the “tearing and zooming” approach to modeling. As before, R1 , R2 and C denote
the values of the two resistors and the capacitance of the capacitor. We are interested in the
“behavior” of V and I – the port voltage and the port current respectively. Proceeding from
first principles we introduce some latent variables to model the circuit. These could be
1. Currents i1 and i2 in the capacitor branch and the resistor branch respectively.
2. Potentials v1 and v2 at the terminals of the resistance R1 .
[Figure 2.2: The “tearing and zooming” approach to modeling]
3. Potential v3 at one terminal of the capacitance C (in order to reduce the number of
variables, we assume that the other terminal of the capacitor is at ground potential 0,
but such an assumption is not necessary).
4. Potential v4 at one terminal of the resistance R2 . Again, we assume that the other
terminal of R2 is at ground potential.
The subsystems R1, R2 and C satisfy the following equations:
1. v2 − v1 = IR1.
2. i2 = v4/R2.
3. $\frac{d}{dt}v_3 = i_1/C$.
The subsystems are interconnected in such a way that the following constraints are imposed: v2 = v3 = v4, I = i1 + i2, v1 = V.
The seven equations that we have just written define the “full behavior” which includes
our variables of interest V, I and variables that we introduced during the course of modeling.
We eliminate the extra variables that have been introduced and obtain the manifest behavior
that describes the evolution of V and I. Using fairly straightforward calculations, the following equations in terms of i1 and i2 can be obtained:

$$\frac{d}{dt}V = R_1\frac{d}{dt}I + i_1/C, \qquad V = IR_1 + i_2 R_2 \qquad (2.5)$$
One can get the following latent variable representation for the port behavior:

$$\begin{bmatrix} \frac{d}{dt} & -R_1\frac{d}{dt} \\ 1 & -R_1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} V \\ I \end{bmatrix} = \begin{bmatrix} 1/C & 0 \\ 0 & R_2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} i_1 \\ i_2 \end{bmatrix} \qquad (2.6)$$
Notice that we have obtained the above latent variable representation from the device laws
without any computation, or “post-processing” as it is generally known. Let us now obtain a
kernel representation for the port behavior. Multiplying the first equation by C and the second by 1/R2, and adding the two equations, we get

$$C\frac{d}{dt}V + \frac{V}{R_2} = R_1 C\frac{d}{dt}I + \frac{R_1 I}{R_2} + i_1 + i_2 \qquad (2.7)$$
Substitute I = i1 + i2 . We see that the behavior so obtained and the behavior in Example 2.2.1
are the same.
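The elimination in this example can also be carried out mechanically: any polynomial row n(ξ) with n(ξ)M(ξ) = 0 turns the latent variable representation (2.6) into a kernel equation in the manifest variables (V, I). A sympy sketch of this (our illustration; it recovers, up to a constant factor, the kernel representation of Example 2.2.1):

```python
# Illustration only: eliminating (i1, i2) from (2.6) via a left annihilator of M.
import sympy as sp

xi, R1, R2, C = sp.symbols('xi R1 R2 C', positive=True)

R = sp.Matrix([[xi, -R1 * xi],          # acts on the manifest variables (V, I)
               [1,  -R1],
               [0,   1]])
M = sp.Matrix([[1 / C, 0],              # acts on the latent variables (i1, i2)
               [0,     R2],
               [1,     1]])

n = M.T.nullspace()[0].T                # a left annihilator: n(xi) * M(xi) = 0
print((n * M).applyfunc(sp.simplify))   # Matrix([[0, 0]])
print((n * R).applyfunc(sp.expand))     # proportional to
                                        # [C*xi + 1/R2, -R1*C*xi - (1 + R1/R2)]
```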
Remark 2.4.4 The simple example that we have considered illustrates the general ideas behind the tearing and zooming approach to modeling. Of course, in this simple example, defining
just two latent variables i1 and i2 would probably be enough for someone familiar with how to
identify equipotential terminals in the circuit. One particularly clever way of choosing latent
variables in this case is to define a single latent variable as the voltage across the capacitor.
We will see in Example 2.8.3 that equations take a particularly simple and appealing form
using this latent variable. However, such simplifications are the result of insight into the nature
of the problem and are therefore not included in a systematic method for modeling (done for
example with the help of a computer). Thus, the number of latent variables that could be introduced in the course of modeling may vary depending on, among other things, the experience of the modeler. Hence, given Bfull (which presumes a particular choice of latent variables) it makes sense to ask what B is; the converse question, however, is meaningless in general, since, as we have seen, Bfull is highly non-unique and depends on the number of latent variables a modeler may choose to add in the course of modeling. Having said that Bfull is highly non-unique, we must however add that there are some special “full behaviors” associated with a given behavior B that are of immense practical and theoretical significance – these are state
space representations of B, which we review in Section 2.8.
2.5 Equivalent representations
We have discussed kernel and latent variable representations for behaviors. We have also
addressed when a latent variable representation can be converted into a kernel representation.
A given behavior may be represented by more than one representation. An example is the
following:
Example 2.5.1 The behavior (V, I) of port voltage and port current in Example 2.2.1 may also be given by

$$\frac{V}{C} + R_2\frac{d}{dt}V = R_1 R_2\frac{d}{dt}I + \frac{R_1 + R_2}{C}\, I$$
Arguably the representations obtained in Example 2.5.1 and Example 2.2.1 are easy to
relate, since they only involve a scaling (by a constant). However, technically speaking, these
are different representations for the same behavior. In more complicated situations however,
it may not be so easy to relate two representations of a behavior. The following result ([75], Section 3.6) can be used to address the issue of equivalence of two representations:
Proposition 2.5.2 Consider two C∞ behaviors B1, B2 ∈ Lw represented as the kernels of the differential operators R1(d/dt) and R2(d/dt) respectively. Then, B1 ⊆ B2 if and only if there exists a matrix F(ξ) ∈ R•×•[ξ] such that F(ξ)R1(ξ) = R2(ξ).
The above proposition is quite intuitive: consider the behaviors B1, B2 defined as Ker R1(d/dt) and Ker R2(d/dt) respectively. Clearly, if F(ξ)R1(ξ) = R2(ξ) then B1 ⊆ B2, since for every w ∈ B1 by definition R1(d/dt)w = 0, and therefore R2(d/dt)w = 0. Using Proposition 2.5.2 we can
easily obtain conditions for two representations to describe the same behavior: two behaviors
B1 and B2 are equivalent (i.e. B1 = B2 ) if and only if B1 ⊆ B2 and B2 ⊆ B1 . Therefore,
in terms of representations, B1 = B2 if and only if there exist matrices F1 (ξ), F2 (ξ) ∈ R•×• [ξ]
such that F1 (ξ)R1 (ξ) = R2 (ξ) and R1 (ξ) = F2 (ξ)R2 (ξ).
Consider a behavior B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0}. The system of equations R(d/dt)w = 0 may have redundancies of the following nature:
1. Some equations may be identically zero.
2. Some equations may be obtained by an algebraic combination of other equations.
3. A subset of the equations may not be independent
Therefore, such redundant equations can be removed without affecting the behavior. Kernel
representations of a behavior that donot contain any redundant equation are called minimal
kernel representations.
In the sequel, we need the notions of a minor and the rank of a polynomial matrix, which
we define below:

Definition 2.5.3 Let R(ξ) ∈ Rw1×w2[ξ]. Denote the elements of R(ξ) by r_{p,q}, where
p = 1, ..., w1 and q = 1, ..., w2. Let {i1, ..., ik} ⊆ {1, ..., w1} and {j1, ..., jk} ⊆ {1, ..., w2}.
Define mk(R) = det [r_{ip,jq}]_{p,q=1,...,k}. mk(R) is called a k-th order minor of R(ξ).
Clearly, R(ξ) ∈ Rw1×w2[ξ] has w1Ck · w2Ck minors of order k, for k ≤ min(w1, w2).
Definition 2.5.4 Let R(ξ) ∈ Rw1×w2[ξ]. By the rank of R(ξ), denoted Rank R(ξ), we mean
the highest order of a non-zero minor of R(ξ).
Example 2.5.5 Consider the following polynomial matrices:

    R1(ξ) = [ 1  ξ  1+ξ ]
            [ 0  0   1  ]

    R2(ξ) = [  1        ξ      ]
            [  ξ        ξ²     ]
            [ ξ+1    ξ(ξ+1)    ]

    R3(ξ) = [ 1  ξ  0 ]
            [ 1  0  0 ]
            [ 0  1  0 ]

Then, Rank R1(ξ) = 2 since det [ ξ  1+ξ ; 0  1 ] = ξ is nonzero. Rank R2(ξ) = 1 since every
2 × 2 minor of R2(ξ) is zero. Similarly, Rank R3(ξ) = 2 since the 3 × 3 minor is zero, while
there exist nonzero 2 × 2 minors, for example det [ 1  0 ; 0  1 ].
We now return to the question of when a kernel representation is minimal. The following
proposition ([75], Theorem 3.6.4) addresses this issue:

Proposition 2.5.6 Let B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0}. Then, R(ξ) induces a minimal
kernel representation of B if and only if R(ξ) has full row rank.
Example 2.5.7 For the RC circuit in Examples 2.2.1 and 2.5.1, the behavior of port voltage
and port current (V, I) defined by

    [ C(d/dt) + 1/R2    −R1C(d/dt) − (1 + R1/R2) ] [ V ]  =  0         (2.8)
                                                   [ I ]

and

    [ 1/C + R2(d/dt)    −(R1+R2)/C − R1R2(d/dt) ] [ V ]  =  0          (2.9)
                                                  [ I ]

are minimal kernel representations. However, the kernel representation

    [ C(d/dt) + 1/R2     −R1C(d/dt) − (1 + R1/R2)  ] [ V ]  =  0
    [ 1/C + R2(d/dt)     −(R1+R2)/C − R1R2(d/dt)   ] [ I ]

is not a minimal representation, since the second equation is obtained by scaling the first by
R2/C.
We now address two important concepts in systems theory: controllability and observability.

2.6 Controllability and Observability
Controllability plays a central role in systems and control. This intuitive notion was given a
strong foundation when Kalman introduced and formalized it for state space systems in 1960.
Consider the state space system

    (d/dt) x = Ax + Bu

with A ∈ Rn×n and B ∈ Rn×m. The Rn-valued variables x are called states. This system is
called state controllable if for every x0, x1 ∈ Rn there exist some τ > 0 and some u1 : R → Rm
such that the solution to the above differential equation with u = u1 and x(0) = x0 satisfies
x(τ) = x1. This definition of controllability has been the starting point for many important
developments in systems theory.
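As an aside, the classical criterion for state controllability is the Kalman rank test (standard,
though not restated in the text): the pair (A, B) is state controllable if and only if the matrix
[B AB ... A^{n−1}B] has rank n. A minimal numerical sketch in Python, assuming numpy is
available (the function name is ours, purely illustrative):

    import numpy as np

    def is_state_controllable(A, B, tol=1e-9):
        # Kalman rank test: (A, B) is state controllable iff the
        # controllability matrix [B, AB, ..., A^(n-1)B] has rank n.
        n = A.shape[0]
        blocks = [B]
        for _ in range(n - 1):
            blocks.append(A @ blocks[-1])
        return np.linalg.matrix_rank(np.hstack(blocks), tol=tol) == n

    # A double integrator driven through its second state variable:
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    print(is_state_controllable(A, B))   # True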
We now provide the behavioral definition of controllability. In the behavioral approach,
controllability is an intrinsic property of the behavior, i.e., controllability is the property of
trajectories allowed by a system. The controllability of a behavior is akin to the ability to steer
from one trajectory in the behavior to every other, using some trajectory in the behavior.
Definition 2.6.1 The system Σ = (R, Rw, B) is called controllable if for all w1, w2 ∈ B there
exist τ ≥ 0 and w ∈ B such that

    w(t) = w1(t) for t < 0    and    w(t + τ) = w2(t) for t ≥ 0.
A characterization of representations of controllable systems is important. Controllable
behaviors turn out to be exactly those that admit a special representation called an image
representation, which is defined as follows: the latent variable representation w = M(d/dt)ℓ
with ℓ free (for example ℓ ∈ C∞(R, Rl)) is called an image representation of the behavior
B := {w ∈ C∞(R, Rw) | ∃ℓ ∈ C∞(R, Rl) such that w = M(d/dt)ℓ}. The following important
result ([75], Theorem 5.2.10) gives a characterization of representations of a controllable
behavior:
Proposition 2.6.2 Let B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0}. Then, the following statements
are equivalent:

1. B is controllable.

2. Rank R(λ) = Rank R(ξ) for all λ ∈ C.

3. There exists M(ξ) ∈ Rwו[ξ] such that B = {w | w = M(d/dt)ℓ, ℓ ∈ C∞(R, R•)}.
Notation 2.6.3 The family of controllable linear differential systems with w manifest variables
will be denoted by Lwcon .
Another crucial property of systems is "observability". Observability is not an intrinsic
property of a behavior, since it depends on a chosen partition of the system variables. It
relates to whether some (specified) variables in a given system can be inferred uniquely by
measuring some other (specified) variables. Let us consider a behavior B with trajectories
w = [w1 w2]T, where w1 takes values in W1 and w2 in W2. We define when the variable w2 is
observable from w1 as follows:
Definition 2.6.4 Let Σ = (T, W1 × W2, B) be a dynamical system with trajectories
w = [w1 w2]T, where w1 (respectively w2) takes values in W1 (respectively W2). Then, w2 is
said to be observable from w1 if

    (w1, w2′) ∈ B and (w1, w2″) ∈ B  =⇒  w2′ = w2″
Consider a controllable behavior B defined by an image representation w = M(d/dt)ℓ. Consider
the full behavior Bfull := {(w, ℓ) | w = M(d/dt)ℓ}. One can then ask whether the latent
variables are observable from the manifest variables:

    (w, ℓ1) ∈ Bfull and (w, ℓ2) ∈ Bfull  =⇒  ℓ1 = ℓ2 ?

Since we are only considering linear behaviors, one may infer the above implication by
considering observability with w = 0 alone:

    (0, ℓ) ∈ Bfull  =⇒  ℓ = 0 ?
If the two (equivalent) properties hold, then we say that the controllable behavior is defined by
an observable image representation. Notice that since latent variables are introduced by us,
the modelers, and are not intrinsic to the system, an image representation can be assumed to
be observable without loss of generality. Given w = M(d/dt)ℓ, the differential operator M(d/dt)
defines an observable image representation if and only if M(ξ) is a right-prime matrix, i.e.
if M(ξ) = M′(ξ)U(ξ) with U(ξ) square, then det U(ξ) = constant ≠ 0. Further, by defining
appropriate latent variables, an image representation can be assumed to be induced by a full
column rank polynomial matrix without loss of generality.
We demonstrate the concepts of controllability and observability with the help of some
examples:
Example 2.6.5 With reference to the RC circuit in Figure 2.1, consider the kernel
representation of the behavior in Example 2.2.1 with variables V, the port voltage, and I, the
port current:

    [ C(d/dt) + 1/R2    −R1C(d/dt) − (1 + R1/R2) ] [ V ]  =  0
                                                   [ I ]

Let p(ξ) = Cξ + 1/R2 and q(ξ) = R1Cξ + 1 + R1/R2. Let us investigate when p(ξ) and q(ξ)
have a common root. There is a common root if and only if

    ξ = −1/(R2C) = −(R1 + R2)/(R1R2C)

which is not satisfied for any nonzero R1, R2, C. Therefore, the port behavior (V, I) is
controllable, since the 1 × 2 matrix [p(ξ) −q(ξ)] that induces the kernel representation has full
row rank for every ξ ∈ C, which equals the rank of the polynomial matrix [p(ξ) −q(ξ)].

Observe that

    [ p(ξ)  −q(ξ) ] [ q(ξ) ]  =  0
                    [ p(ξ) ]

Therefore, an image representation for the port behavior is

    [ V ]     [ R1C(d/dt) + 1 + R1/R2 ]
    [ I ]  =  [ C(d/dt) + 1/R2        ] ℓ

Notice that since p(ξ) and q(ξ) are coprime for all nonzero R1, R2, C, the image representation
is observable: V = 0 and I = 0 if and only if ℓ = 0.
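The coprimeness argument above can also be carried out mechanically. The following Python
(sympy) sketch is purely illustrative: it checks that p(ξ) and q(ξ) of Example 2.6.5 are coprime
by computing their resultant, which for two degree-one polynomials a1ξ + a0 and b1ξ + b0 is
the Sylvester determinant a1b0 − a0b1:

    import sympy as sp

    xi, R1, R2, C = sp.symbols('xi R1 R2 C', positive=True)
    p = sp.Poly(C*xi + 1/R2, xi)             # I = p(d/dt) ell
    q = sp.Poly(R1*C*xi + 1 + R1/R2, xi)     # V = q(d/dt) ell

    # The resultant is nonzero iff p and q have no common root,
    # i.e. iff the port behavior (V, I) is controllable.
    res = p.nth(1)*q.nth(0) - p.nth(0)*q.nth(1)
    print(sp.simplify(res))   # C, nonzero for C != 0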
We now demonstrate uncontrollability of dynamical systems with an example:
Example 2.6.6 Consider the AC bridge circuit shown in Figure 2.3. It can easily be shown
that the bridge is "balanced" (i.e. nodes "A" and "B" are equipotential) if and only if

    R1R2 = L1/C1

We introduce as latent variables the currents I1 and I2. We then get the following equations:

    C1R1 (d/dt)V = R1 I2 + C1R1R2 (d/dt)I2                             (2.10)

    V = R1 I1 + L1 (d/dt)I1                                            (2.11)

Adding the two equations and substituting C1R1R2 = L1, we get (with the port current
I = I1 + I2)

    [ C1R1(d/dt) + 1    −R1 − L1(d/dt) ] [ V ]  =  0
                                         [ I ]
Figure 2.3: An AC bridge circuit (elements R1, L1, R2, C1; branch currents I1, I2; bridge
nodes A and B; port variables V, I)
Let p(ξ) = C1R1ξ + 1 and q(ξ) = R1 + L1ξ. The polynomials p(ξ), q(ξ) are not coprime when

    1/(R1C1) = R1/L1

i.e. when R1 = √(L1/C1). Under this condition, the behavior (V, I) of the port variables is
not controllable, since the matrix [p(ξ) −q(ξ)] is zero at ξ = −1/√(L1C1).
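The loss of coprimeness can be located in the same mechanical way; a small sympy sketch
(illustrative only), again using the degree-one Sylvester-determinant resultant a1b0 − a0b1:

    import sympy as sp

    R1, L1, C1 = sp.symbols('R1 L1 C1', positive=True)
    # resultant of p = C1*R1*xi + 1 and q = L1*xi + R1:
    res = (C1*R1)*R1 - 1*L1
    # res vanishes exactly when p and q share a root:
    print(sp.solve(sp.Eq(res, 0), R1))   # [sqrt(L1)/sqrt(C1)], i.e. R1 = sqrt(L1/C1)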
Assume for the sake of simplicity that R1 = C1 = R2 = L1 = 1. Then, the port behavior
can be given by a kernel representation:

    ((d/dt) + 1) [ 1  −1 ] [ V ]  =  0
                           [ I ]

The controllable part of the behavior acts like a pure unit resistance: V = I. If one of these is
taken to be free, the other gets determined. Now consider the case when V = c1 e^{−t},
I = c2 e^{−t}, where c1, c2 are determined by the initial conditions. It is easy to see that these
V, I are admissible trajectories in the port behavior. However, c1 and c2 could now be
arbitrary. In particular, the trajectory with c1 = 0, c2 ≠ 0 cannot be "patched" with a
trajectory with, say, V = I = 1.
We have considered controllable systems, where there is complete freedom to steer from one
trajectory to any other trajectory. The other end of the spectrum consists of systems with no
freedom to steer among their trajectories. Such systems are called autonomous.
2.7 Autonomous systems

The evolution of an autonomous system depends only on its initial conditions:
Definition 2.7.1 A time-invariant dynamical system Σ = (R, W, B) is called autonomous if
for all w1 , w2 ∈ B
w1 (t) = w2 (t) for t ≤ 0 =⇒ w1 = w2
The above definition says that in autonomous systems, the future of every trajectory is entirely
determined by its past. Many natural systems are autonomous, e.g. the motion of the planets
around the sun, the rotation of the Earth about its axis, etc. Physicists study autonomous
systems in order to obtain the laws governing their evolution. Engineers are interested in
autonomous systems because their evolution is predictable: once a given system is made
autonomous (for example by attaching a device called a controller), it shows a predictable
response that can be matched with the desired response.
In the context of linear differential systems, the following result from [75], Section 3.2 gives
a characterization of representations of autonomous systems.
Proposition 2.7.2 Assume B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0} is defined by a minimal kernel
representation. Then, the following statements are equivalent:
1. B is the behavior of an autonomous system.
2. Rank R(ξ) = w.
3. B is a finite dimensional subspace of C ∞ (R, Rw ).
Since B in Proposition 2.7.2 is defined by a minimal kernel representation, B is the behavior
of an autonomous system if and only if the matrix R(ξ) is square and nonsingular. The roots
of det R(ξ) are called the characteristic values of the system. We consider a simple autonomous
system in the following example.
Example 2.7.3 Consider B1 defined as the set of all w1 ∈ C∞(R, R) such that
((d/dt) − 1)w1 = 0. Then, B1 is autonomous since the rank of ξ − 1 is 1. Further, B1 is finite
dimensional (one dimensional in this case), since every trajectory in B1 is of the form c e^t,
c ∈ R.

Now consider B2 defined as the set of all w2 ∈ C∞(R, R) such that ((d/dt) + 1)² w2 = 0. Again,
B2 is autonomous and finite dimensional (two dimensional in this case). Every trajectory in
B2 is of the form (c1 + t c2) e^{−t}, c1, c2 ∈ R.
Since the evolution of an autonomous system is "fixed" by its past, it is natural to ask
what happens if the system is allowed to evolve for a long enough time. The consideration of
how the system behaves as t → ∞ leads to the question of stability.
Definition 2.7.4 Consider an autonomous linear time-invariant system Σ = (R, Rw , B). Then
Σ is called stable if for all trajectories w ∈ B, ||w(t)|| → 0 as t → ∞.
Since we have only considered Rw-valued systems, which norm is used for w(t) is immaterial,
since all norms on Rw are equivalent. An autonomous behavior Ker R(d/dt) is stable if and only
if R(ξ) has all its characteristic values in the open left half complex plane (det R(λ) = 0 =⇒
Re λ < 0).
Example 2.7.5 In Example 2.7.3, the behavior B1 corresponds to an unstable system, since
|w1(t)| = |c e^t| → ∞ as t → ∞ for c ≠ 0. On the other hand, the behavior B2 corresponds to
a stable system, since |w2(t)| = |(c1 + t c2) e^{−t}| → 0 as t → ∞.
We now review a particularly important form of latent variable representations called state
representations.
2.8 State representation
States are intuitively related to the "memory" of a dynamical system. We formalize this
association with the "axiom of state": variables that are state variables obey the axiom of
state. In the behavioral framework, the state x of a system is a latent variable with the special
property that if the values of the state corresponding to two manifest variable trajectories are
equal at a certain time t, then the two manifest variable trajectories can be concatenated at
time t. Roughly speaking, this means that while going from the "past" into the "future", one
only needs to check that the states match. Hence, the value of the state at time t can be
thought of as capturing the entire history of the evolution of the system from rest till time t.
In the sequel, by w1 ∧τ w2 we mean the concatenation of w1(t) and w2(t) at t = τ.
Definition 2.8.1 (Axiom of state) Let ΣX = (R, Rw, Rx, Bfull) be a linear dynamical system
with latent variables x taking values in Rx. The latent variable x is said to have the property
of state if and only if

    {(w, x) ∈ Bfull} and {x(0) = 0} and {x continuous at t = 0}  =⇒  {(0 ∧0 w, 0 ∧0 x) ∈ Bfull}
Systems with latent variables that are also state variables will be called state systems. Issues
about the axiom of state were addressed by Rapisarda and Willems [83]. The relationship
between behaviors and state representations was also addressed by Sule [94]. The following
result from [83], Proposition 3.1 characterizes representations of state systems:
Proposition 2.8.2 Let ΣX = (R, Rw, Rx, Bfull) be a linear dynamical system with latent
variables x taking values in Rx. Then, ΣX is a state system if and only if there exist matrices
E, F, G ∈ R•×• such that

    Bfull = {(w, x) ∈ C∞(R, Rw+x) | E (d/dt)x + F x + G w = 0}
Note that if B is a C ∞ -behavior, B = Π(Bfull ), i.e. B can be obtained by eliminating states
from Bfull using the elimination theorem. By a representation of B in terms of states we mean
a state system with behavior Bfull such that B = Π(Bfull ). A representation of B in terms
of states will also be called a state representation of B. We demonstrate state representations
with an example.
Example 2.8.3 Consider the RC circuit in Figure 2.2. Let us re-do Example 2.4.3 with a
latent variable vc, the voltage across the capacitor C. Then, we get the following equations
relating V, I and vc:

    V = I R1 + vc                                                      (2.12)

    I = C (d/dt)vc + vc/R2                                             (2.13)

These equations can be written as

    [ C ] (d/dt)vc + [ 1/R2 ] vc + [ −1   0 ] [ I ]  =  0
    [ 0 ]            [  1   ]      [ R1  −1 ] [ V ]

which is of the form specified in Proposition 2.8.2. Hence vc is a state variable.
The current i2 through R2 is another state variable, since i2 = vc/R2; the circuit then defines
the following full behavior:

    [ R2C ] (d/dt)i2 + [ 1  ] i2 + [ −1   0 ] [ I ]  =  0
    [  0  ]            [ R2 ]      [ R1  −1 ] [ V ]
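One can verify mechanically that eliminating the latent variable vc from equations
(2.12)-(2.13) recovers the kernel representation (2.8) in the port variables alone. A small
sympy sketch (illustrative only):

    import sympy as sp

    t, R1, R2, C = sp.symbols('t R1 R2 C', positive=True)
    V = sp.Function('V')(t)
    I = sp.Function('I')(t)

    vc = V - R1*I                     # from (2.12): V = I*R1 + vc
    # Substituting vc into (2.13), I = C*vc' + vc/R2, leaves a relation
    # in (V, I) only -- the kernel representation (2.8):
    residual = sp.expand(C*sp.diff(vc, t) + vc/R2 - I)
    print(residual)
    # C*V' + V/R2 - R1*C*I' - (1 + R1/R2)*I, which vanishes on the behavior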
Example 2.8.3 shows that several state representations are possible for a given behavior B.
This brings us to the consideration of a minimal state representation:
Definition 2.8.4 A state system ΣX1 = (R, Rw, Rx1, B1full) with behavior B and Rx1-valued
states x1 is said to be a minimal state representation of B if, whenever ΣX2 = (R, Rw, Rx2, B2full)
is another state system for B with Rx2-valued states x2, then x1 ≤ x2. This minimal number
of states is called the McMillan degree of B and is denoted by n(B).
In order to deduce when a given state representation of B with full behavior Bfull is state
minimal, we need the notion of trimness of Bfull.

Definition 2.8.5 Consider a behavior B ∈ Lw with trajectories w, and a state representation
of B with full behavior Bfull ∈ Lw+x with trajectories (w, x). Then, Bfull is called state trim
if for every a ∈ Rx there exists a trajectory (w, x) ∈ Bfull such that x(0) = a.
State trimness means that there are no algebraic constraints among the states. Together with
observability of the state from the manifest variables, it characterizes state minimality:

Proposition 2.8.6 Let Bfull ∈ Lw+x be a state system with trajectories (w, x). Then, Bfull is
state minimal for B if and only if Bfull is state trim and x is observable from w.
It is easy to see that the state representation obtained in Example 2.8.3 is a minimal state
representation of B.
We now briefly consider the construction of states. Given a linear differential behavior, it
is in general not easy to identify latent variables that qualify as state variables. Hence, a
systematic procedure for constructing states starting from manifest or latent variable
trajectories is important. Given a behavior and its representation in kernel, hybrid or image
form, Rapisarda [82] discusses ways to construct a set of states for the behavior. Let
B = {w ∈ C∞(R, Rw) | R(d/dt)w = 0} be defined by a minimal kernel representation. There
exists a differential operator X(d/dt) with X(ξ) ∈ Rn(B)×w[ξ] such that x = X(d/dt)w is a set
of minimal states for B. The map X(d/dt) is called a state map for B. There is a constructive
procedure for building X(ξ) starting from the rows of R(ξ). See [82], Proposition 3.4.4,
Theorem 3.5.1 and Proposition 3.5.4 for ways to construct states starting from kernel, hybrid
and image representations of a behavior respectively.
States are related to the memory of a dynamical system. The history of trajectories in B that
correspond to x = 0 cannot be captured by the states. Hence the set of such trajectories is
called the "memoryless part" of a behavior:
Definition 2.8.7 Let B ∈ Lw be a linear differential behavior with manifest variables w. Let
x = X(d/dt)w be a minimal state map for B. Then, the set

    B ∩ Ker X(d/dt)

is called the memoryless part of B.
The memoryless part is thus the projection onto the manifest variables of the trajectories
(w, x) ∈ Bfull with x = 0. We will see that the memoryless part of a behavior plays a crucial
role in the results presented in Chapter 4.
We now come to the concluding part of this introductory chapter on behavioral theory, where
we introduce the concept of inputs and outputs. Notice that until now we have not imposed
any cause-effect relationship among the system variables. Even the notion of controllability
was defined without defining "inputs", and observability was defined without defining
"outputs". The behavioral notion of inputs and outputs is different from the classical notion,
and therefore needs special mention.
2.9 Inputs and Outputs
The concept of a “free variable” (i.e. a variable that is not constrained by laws defining a
system) plays an important role in defining an input. The idea underlying the definition is
that an input is unconstrained by the system and therefore can be fixed by the environment.
The existence of “free” variables in a behavior is related to the fact that in general a behavior
is described by an under-determined system of equations. This leaves some components of a
solution unconstrained. These components are free to assume arbitrary C ∞ -functions.
Definition 2.9.1 Let B ∈ Lw be a C∞-behavior. Let w = (w1, w2, ..., ww) denote the manifest
variables of B. Let I = {i1, i2, ..., ik} ⊆ {1, 2, ..., w} be an index set. We denote by Π(B) the
system obtained by eliminating the variables wj, j ∉ I. Let the set of variables obtained after
elimination be denoted by {wi1, wi2, ..., wik}. The variables {wi1, wi2, ..., wik} are called free
in B if

    Π(B) = C∞(R, Rk)

Recall that k has been defined to be |I|, the cardinality of the index set.
Using the concept of free variables in a behavior, one may define a set of maximally free
variables as a set which contains the maximum possible number of free variables. Once a set
of maximally free variables is found, no more free variables remain outside the corresponding
index set I. We use the concept of maximally free variables to define an "input-output"
partition for a behavior B:

Definition 2.9.2 Consider a C∞-behavior B ∈ Lw. Let w denote the manifest variables of B.
Partition w (after possibly a permutation of its components) as w = (wi, wo), with
wi = (w1, w2, ..., wm) and wo = (wm+1, ..., ww). The partition w = (wi, wo) is said to be an
input-output partition of w if the variables wi are maximally free in B.

Once we have an input-output partition of B, we say that wi are inputs and wo are outputs of
B. Using the customary notation, wi is denoted by u and wo by y. Because u is maximally
free, we know that y does not have any free components.
The following result gives conditions under which a kernel representation of a behavior
corresponds to an input-output partition. Consider a C∞-behavior B with manifest variables
w partitioned arbitrarily as w = (u, y). Let R(d/dt) induce a minimal kernel representation of
B. Let R(ξ) = [P(ξ), −Q(ξ)] be the partition of the columns of R(ξ) conformally with u and y.
Then, w = (u, y) is an input-output partition if and only if Q(ξ) is square and nonsingular.
The rational function Q−1(ξ)P(ξ) is called a transfer function for B. Different input-output
partitions of the manifest variables of B give rise to different transfer functions for B.

A transfer function of a behavior B is a rational function in ξ, i.e. every entry of Q−1(ξ)P(ξ)
is of the form pij(ξ)/qij(ξ), where pij(ξ), qij(ξ) ∈ R[ξ] and qij(ξ) ≠ 0. A rational function is
called proper (respectively strictly proper) if deg qij(ξ) ≥ deg pij(ξ) (respectively
deg qij(ξ) > deg pij(ξ)) for all i, j. Obviously, a strictly proper rational function is proper. For
C∞-behaviors, an input-output partition need not correspond to a proper rational function: a
transfer function that is not proper may still correspond to a valid input-output partition of
the system variables. However, in the case of L1loc behaviors, properness is crucial: the
partition (u, y) of an L1loc behavior B is an input-output partition if and only if the
corresponding transfer function is proper [74].
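For instance, for the RC circuit of Example 2.6.5 we have p(d/dt)V = q(d/dt)I, so taking
u = V and y = I yields the transfer function G(ξ) = p(ξ)/q(ξ). A sympy sketch (illustrative
only) confirming that this G is proper, so (V, I) remains a valid input-output partition even
for L1loc behaviors:

    import sympy as sp

    xi, R1, R2, C = sp.symbols('xi R1 R2 C', positive=True)
    p = C*xi + 1/R2
    q = R1*C*xi + 1 + R1/R2

    G = sp.cancel(p/q)                # transfer function from u = V to y = I
    num, den = sp.fraction(G)
    print(sp.degree(num, xi) <= sp.degree(den, xi))   # True: G is proper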
We now briefly address three "invariants" associated with a linear differential behavior B
with manifest variables w.

1. Input cardinality: Let w = (u, y) be an input-output partition of B. Clearly, several
input-output partitions are possible for a behavior. However, it turns out that the cardinality
of every set of maximally free variables in B is the same. The cardinality of the set u of
inputs in a given input-output partition of B is called the input cardinality of B, denoted by
m(B). m(B) is intrinsic to a behavior and does not depend on a particular representation;
therefore, we say it is an invariant associated with B. If B is controllable, m(B) is also the
minimal number of latent variables in an image representation of B.

2. Output cardinality: If B has w manifest variables and input cardinality m(B), then
p(B) := w − m(B) is called the output cardinality of B. The output cardinality is completely
determined by the number of manifest variables and the input cardinality, both of which are
intrinsic to a behavior. Therefore, we say that p(B) is an invariant associated with B.
3. McMillan degree: Given a behavior B, the number of states in a minimal state
representation of B is defined as the McMillan degree of B (see Definition 2.8.4) and is
denoted by n(B). The McMillan degree of a behavior is an invariant associated with the
behavior. Given a behavior B := Ker R(d/dt), the McMillan degree of B equals the maximal
degree occurring among the minors of R(ξ).
Example 2.9.3 Consider Figure 2.1 and the corresponding behavior B in terms of port voltage
V and port current I obtained in Example 2.2.1. Recall that

    [ V ]     [ R1C(d/dt) + 1 + R1/R2 ]
    [ I ]  =  [ C(d/dt) + 1/R2        ] ℓ

is an observable image representation for (V, I). Then, B has the following invariants:

1. Input cardinality: m(B) = 1 since either V or I can be an input (i.e. a maximally free
set of variables in B). Note that m(B) is also the number of latent variables in the above
image representation, which is one in this case.

2. Output cardinality: Since B has 2 manifest variables and m(B) = 1, the output cardinality
is p(B) = 1.

3. McMillan degree: In Example 2.8.3 we showed that B admits a state representation
with the capacitor voltage as the state. It turns out that this state representation is minimal.
Therefore, the McMillan degree of B is n(B) = 1.
This completes a brief review of background material on behavioral systems theory. We have
included only the bare minimum required in order to read the forthcoming chapters. What is
included in this chapter will be used repeatedly throughout the thesis; we will therefore not
refer back to the relevant results time and again. However, we shall need some additional
concepts (for example the notion of dissipativity, which uses concepts from Quadratic
Differential Forms, Chapter 1, and also from the current chapter), which we introduce as and
when they become necessary.
Chapter 3

A parametrization for dissipative systems
3.1 Introduction
In the context of electrical circuits, the power supplied is defined as the product of voltage and
current. Circuits in which the net power supplied is non-negative are called passive: if v and
i are the port voltage and current respectively, then ∫_0^t v^T i dt ≥ 0 for all (v, i) permitted
by the circuit. Passivity has been an area of active research for many decades; see [18] for an
early account. Meixner [54] examines general passive linear systems in an abstract framework
and derives several properties of these systems. Before electronic operational amplifiers
(op-amps) became widely available, passive circuits were preferred as a rule since they can be
realized with passive electrical components, i.e. resistors, inductors, capacitors and gyrators.
The classic book [56] examines various realizability issues for passive systems. Even today,
passive circuits have not lost their importance, though realizability is no longer the central
issue.
Note that power in electrical circuits is a quadratic form in voltage and current. More
generally, quadratic forms and quadratic differential forms can be used to define "generalized
power". Systems in which the net generalized power supplied is non-negative are called
dissipative. The abstract theory of dissipative systems was introduced by Jan Willems, who
in 1972 wrote two seminal papers on the subject [98, 99]. The ideas in these papers have
been singularly successful in tying together ideas from network theory, mechanical systems,
thermodynamics, as well as feedback control theory. The dissipation hypothesis which
distinguishes dissipative dynamical systems from general dynamical systems results in a
fundamental constraint on their dynamic behavior. Typical examples of dissipative systems
include:
1. Electrical networks: Electrical energy is dissipated as heat in resistors.
2. Viscoelastic systems: Mechanical energy is dissipated as heat due to viscous friction.
3. Thermodynamic systems: The second law of thermodynamics postulates a form of dissipativity leading to an increase in entropy.
System theorists are interested in dissipative systems as one of the methods for stabilizing
interconnected systems. Moylan and Hill treat this aspect of dissipative systems in several
papers [34, 35, 55]. Dissipative systems are currently an area of active research in several
diverse areas of engineering: Bernstein, Bhat, Haddad, and others [13, 32] have proposed a
"network flow" model based on dissipativity ideas for thermodynamic systems. Work on
"absolute stability" criteria for nonlinear systems using dissipativity ideas is ongoing [31, 52].
Problems like disturbance attenuation have led to research in "H∞ control" [19], which can
also be interpreted using dissipativity ideas.
In this chapter, we study dissipative systems in the behavioral framework. The premise of
the current chapter is the following: while it is easy to check whether a given dynamical
system is dissipative, the converse problem of constructing all systems dissipative with respect
to a given generalized power is more difficult. We address the following problem: given a
generalized power defined by a QDF, parametrize the set of all dynamical systems (behaviors)
that are dissipative with respect to the generalized power. In this chapter we restrict ourselves
to C∞-behaviors. When considering more general function spaces (e.g. L1loc), the theory
presented here can still be used with appropriate technical modifications. These aspects will
be elaborated upon in later chapters. The scalar version of the results presented in this
chapter has been published [60, 61]. The full results presented here will shortly be submitted
for possible publication.
Thinking of the parametrization problem, one is immediately reminded of the
Kalman-Yakubovich-Popov (KYP) lemma [2, 40, 78, 109, 106], which is in fact a result of this
type. This connection will be explored more fully in the next chapter. Constructing dissipative
behaviors is important for the same reason as the KYP lemma is important: knowing a
characterization of dissipative systems often yields a simple solution to problems in systems
theory; the passivity theorem [107] is an example. Results in this chapter build a thread that
runs through most chapters of this thesis.
This chapter is organized as follows: in Section 3.2, we define dissipativity and give tests
to check for dissipativity. In Section 3.3, we define an equivalence relation on the set of QDFs
which lets us identify QDFs that are similar from the viewpoint of generalized power. Section
3.4 concerns a parametrization of dissipative systems with two manifest variables, i.e.,
single-input-single-output (SISO) systems. In Section 3.5 we extend the parametrization
results to multi-input-multi-output (MIMO) dissipative systems. The parametrization results
depend on a factorization which may not always exist. In Section 3.5, several cases are
investigated in increasing order of complexity of the generalized power, and a parametrization
of dissipative systems is proposed in each case.
3.2 Dissipativity in the Behavioral setting
Consider a QDF QΦ induced by Φ ∈ R_s^{w×w}[ζ, η], and consider the action of QΦ on
trajectories w of a controllable behavior B ∈ Lwcon. Then QΦ(w) can be interpreted as a
measure of the "generalized power" supplied to the dynamical system (the behavior B). QΦ(w)
is called the supply function [103].
Definition 3.2.1 A behavior B ∈ Lwcon is said to be dissipative with respect to the QDF QΦ
(or, in short, Φ-dissipative) if

    ∫_{−∞}^{∞} QΦ(w) dt ≥ 0  for all w ∈ D(R, Rw) ∩ B                  (3.1)

where D(R, Rw) denotes the space of compactly supported C∞ functions from R to Rw.
We emphasize that the above definition is valid only for controllable systems. For uncontrollable
systems, the definition of dissipativity is still an open problem. See [12, 72] for discussion on
dissipativity of uncontrollable systems.
Since B ∈ Lwcon, we can use an observable image representation B = Im M(d/dt) to define a
new two-variable polynomial matrix Φ′(ζ, η) = M^T(ζ) Φ(ζ, η) M(η). Then the condition for
Φ-dissipativity given by equation (3.1) can be rewritten as

    ∫_{−∞}^{∞} QΦ′(ℓ) dt ≥ 0  for all ℓ ∈ D(R, R•)                     (3.2)
Given a QDF QΦ with Φ ∈ R_s^{w×w}[ζ, η], one expects only a subset of the behaviors in
Lwcon to be Φ-dissipative.
Notation 3.2.2 We denote the set of all Φ-dissipative controllable behaviors by LΦ .
Clearly LΦ ⊆ Lwcon. A characterization of Φ-dissipative controllable behaviors is given in the
following theorem ([103], Proposition 5.2, page 1719):

Theorem 3.2.3 Consider a QDF QΦ induced by Φ ∈ R_s^{w×w}[ζ, η]. Then the following
statements are equivalent:

1. B ∈ Lwcon is dissipative with respect to the QDF QΦ.

2. ∫_{−∞}^{∞} QΦ(w) dt ≥ 0 for all w ∈ D(R, Rw) ∩ B.

3. For an (observable) image representation of B given by

       B = {w | ∃ ℓ ∈ C∞(R, Rl) such that w = M(d/dt)ℓ}

   it holds that

       Φ′(−iω, iω) := M^T(−iω) Φ(−iω, iω) M(iω) ≥ 0  for all ω ∈ R
Example 3.2.4 Let

    J11 = [ 1   0 ]
          [ 0  −1 ]

and B = Im M(d/dt), where M(ξ) = [ ξ+2 ; 1 ]. We see that

    M^T(−iω) J11 M(iω) = 3 + ω²

which is clearly positive for all ω ∈ R. Hence, B is J11-dissipative.
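The computation in Example 3.2.4 is easy to reproduce symbolically; a sympy sketch
(illustrative only):

    import sympy as sp

    w = sp.symbols('omega', real=True)
    J11 = sp.Matrix([[1, 0], [0, -1]])
    M = lambda s: sp.Matrix([[s + 2], [1]])

    # Condition 3 of Theorem 3.2.3 for B = Im M(d/dt):
    Phi_prime = (M(-sp.I*w).T * J11 * M(sp.I*w))[0, 0]
    print(sp.simplify(sp.expand(Phi_prime)))   # omega**2 + 3, positive for all omega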
The following lemma is immediate from Theorem 3.2.3:

Lemma 3.2.5 Consider a supply function QΦ with Φ ∈ R_s^{w×w}[ζ, η]. If there exists an
ω0 ∈ R such that Φ(−iω0, iω0) is negative definite, then there exist no non-trivial behaviors
that are Φ-dissipative.
Proof: Suppose there exists a non-trivial behavior B ∈ Lwcon that is Φ-dissipative. Let
w = M(d/dt)ℓ be an observable image representation of B. Since B is assumed to be
Φ-dissipative, M^T(−iω0) Φ(−iω0, iω0) M(iω0) ≥ 0. Since Φ(−iω0, iω0) is negative definite, it
must be that M(iω0) = 0. We now arrive at a contradiction, since by assumption Im M(d/dt)
is an observable image representation, and hence M(λ) has full column rank for all λ ∈ C.
Note that condition 3 of Theorem 3.2.3 gives a test for checking whether a given behavior B
is Φ-dissipative. Note that the matrix Φ′(−iω, iω) ∈ Rl×l[iω] is Hermitian for every ω ∈ R.
Let Φ′(−iω, iω) = [φij], i, j = 1, ..., l. We denote by Dk(ω²) the k-th successive principal
minor of Φ′(−iω, iω):

    Dk(ω²) = det [φij]_{i,j=1,...,k},   k = 1, ..., l
We emphasize that these successive principal minors are even polynomials in ω. The following
result is easy to show:

Proposition 3.2.6 If Dk(ω²) is the k-th successive principal minor of Φ′(−iω, iω), then
Φ′(−iω, iω) > 0 for every ω ∈ R if and only if Dk(ω²) > 0 for every ω ∈ R, k = 1, 2, ..., l.
Proof: See for instance Gantmacher [22].
The case where Φ′(−iω, iω) is positive semidefinite is more difficult to check: Φ′(−iω, iω) ≥ 0
for every ω ∈ R if and only if every (and not just every successive) principal minor of
Φ′(−iω, iω) is non-negative for all ω ∈ R. In either case, the computation boils down to
checking when an even polynomial π(ω) ∈ R[ω] takes non-negative values for every ω ∈ R.
Proposition 3.2.7 An even non-zero polynomial π(ω) ∈ R[ω] takes non-negative values for
every ω ∈ R if and only if the following conditions hold:
1. All real roots of π(ω) have even multiplicities.
2. π(ω0 ) > 0 for some ω0 ∈ R.
Proof: Assume that the even polynomial π(ω) is positive at ω = ω0 and that all its real roots
have even multiplicities. Since π is real-valued and continuous, it can change sign only at its
real roots. Since these real roots have even multiplicities, the sign of π at ω− and ω+ is the
same for every real root ω of π(ω) = 0. Consequently, π(ω) ≥ 0 for every ω ∈ R.

Conversely, if π(ω) takes non-negative values at every ω ∈ R, then π does not change sign at
any real root ω of π(ω) = 0, and hence the multiplicity of every real root must be even. Since
π is non-zero and non-negative, π(ω0) > 0 for some ω0 ∈ R.
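A direct, root-based implementation of this test is straightforward. The following sympy
sketch is illustrative only; the generic test point 1/7 stands in for "some ω0" on the
assumption that it is not a root:

    import sympy as sp

    def even_poly_nonneg(poly, w):
        # Proposition 3.2.7: a nonzero even polynomial is nonnegative on R
        # iff all its real roots have even multiplicity and it is positive
        # at some real point.
        mults = sp.roots(sp.Poly(poly, w))
        real_even = all(m % 2 == 0 for r, m in mults.items() if r.is_real)
        return real_even and poly.subs(w, sp.Rational(1, 7)) > 0

    w = sp.symbols('omega', real=True)
    print(even_poly_nonneg((w**2 - 1)**2, w))   # True: double roots at +-1
    print(even_poly_nonneg(w**2 - 1, w))        # False: simple real roots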
Tests are available in the literature that enable us to check the non-negativity of even
polynomials without explicit root computation. The approach based on Sturm chains is
classical and can be found, for example, in [22, 86]. A Routh-array type test for checking
non-negativity of even polynomials can be found in [43]. Arguably, positive definiteness of
Φ(−iω, iω) is far easier to check using Proposition 3.2.7 than positive semi-definiteness. In
case the real spectrum of Φ(−iω, iω) (the set of real ω at which det Φ(−iω, iω) = 0) is known,
or can be computed easily, the following procedure provides a simple check for positive
semi-definiteness. It relies on the fact that the eigenvalues of Φ(−iω, iω) are continuous
functions of ω:
1. We denote by R+ the semi-open interval [0, ∞) ⊂ R. Given a nonsingular
Φ(−iω, iω) = Φ^T(iω, −iω) ∈ Rw×w[iω], compute the spectrum
SΦ := {ωi ∈ R+ | det Φ(−iωi, iωi) = 0}.

2. Arrange the distinct spectral points in ascending order: S := {ω1 < ω2 < ... < ωk},
ωi ∈ SΦ.

3. If ω1 = 0: let ωk+1 be an arbitrary finite real number larger than ωk. Determine the
inertia of Φ(−iω̄i, iω̄i) with ω̄i := (ωi + ωi+1)/2, i = 1, ..., k. Φ(−iω, iω) is positive
semidefinite for all ω ∈ R if and only if the matrices Φ(−iω̄i, iω̄i) are positive definite for
i = 1, 2, ..., k.

4. If ω1 > 0: let ω0 = 0 and let ωk+1 be an arbitrary finite real number larger than ωk.
Determine the inertia of Φ(−iω̄i, iω̄i) with ω̄i := (ωi−1 + ωi)/2, i = 1, ..., k + 1. Φ(−iω, iω)
is positive semidefinite for all ω ∈ R if and only if the matrices Φ(−iω̄i, iω̄i) are positive
definite for i = 1, ..., k + 1.
Note that determining inertia is computationally far easier than first computing eigenvalues
and then counting them. See [59] for iterative algorithms for inertia computation using Schur
complements [33].
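A numerical sketch of the above procedure follows (illustrative only; eigenvalues stand in for a
dedicated inertia routine, and det Φ(−iω, iω) is assumed to simplify to a real polynomial in ω):

    import numpy as np
    import sympy as sp

    def psd_on_R(Phi, w):
        # Phi is the Hermitian matrix Phi(-i w, i w) as a sympy Matrix in w.
        # Between consecutive nonnegative real spectral points (roots of
        # det Phi) the inertia is constant, so one interior sample per
        # interval suffices (steps 1-4 above).
        det = sp.simplify(Phi.det())
        spec = sorted({float(r) for r in sp.roots(sp.Poly(det, w))
                       if r.is_real and r >= 0})
        pts = [0.0] + spec + [spec[-1] + 1.0 if spec else 1.0]
        for a, b in zip(pts, pts[1:]):
            if a == b:
                continue
            H = np.array(Phi.subs(w, (a + b)/2).tolist(), dtype=complex)
            if np.linalg.eigvalsh(H).min() < 0:
                return False
        return True

    om = sp.symbols('omega', real=True)
    print(psd_on_R(sp.Matrix([[om**2 + 1, 0], [0, 1]]), om))   # True
    print(psd_on_R(sp.Matrix([[om**2 - 1, 0], [0, 1]]), om))   # False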
We can ask the question: given Φ such that Φ(−iω, iω) is indefinite for ω ∈ R, find behaviors
that are Φ-dissipative. Finding a Φ-dissipative behavior can be viewed as a polynomial
interpolation problem in the following sense: for each ω ∈ R there exists a complex cone Cω
in Cw such that v*Φ(−iω, iω)v > 0 for all v ∈ Cω. In order to show the existence of a
non-trivial behavior B in LΦ, we need to find a polynomial matrix M(ξ) of the appropriate
size such that the vectors M(iω) lie in Cω for all ω ∈ R.
From Theorem 3.2.3 it is clear that a behavior could be dissipative with respect to several
different supply functions. From the set of all supply functions, one can identify families of
supply functions such that the set of behaviors dissipative with respect to every supply function
in a family is the same. We formalize this notion by first defining an equivalence relation on the
set of all QDFs. This equivalence relation, as we shall see, will be crucial for the parametrization
obtained in this chapter.
3.3 An equivalence relation on supply functions
Definition 3.3.1 Two QDFs QΦ1 and QΦ2, with Φ1, Φ2 ∈ R_s^{w×w}[ζ, η], are equivalent
(denoted QΦ1 ∼ QΦ2) if the set of behaviors dissipative with respect to QΦ1 is precisely the
same as the set of behaviors dissipative with respect to QΦ2, i.e. LΦ1 = LΦ2.
Using the above definition, we have the following proposition:

Proposition 3.3.2 If Φ1(−iω, iω) = π(ω)Φ2(−iω, iω), where π(ω) is a nonzero scalar
polynomial in ω such that π(ω) ≥ 0 for all ω ∈ R, then QΦ1 ∼ QΦ2.
Proof: Consider a behavior B ∈ LΦ2 defined by an observable image representation
w = M(d/dt)ℓ. Then, from Theorem 3.2.3,

    M^T(−iω) Φ2(−iω, iω) M(iω) ≥ 0  for all ω ∈ R

Multiplying both sides of the above inequality by π(ω) does not change the inequality, since
π(ω) takes non-negative values for every ω ∈ R. Consequently,

    M^T(−iω) Φ1(−iω, iω) M(iω) ≥ 0  for all ω ∈ R

which shows that every behavior in LΦ2 is also in LΦ1. The converse inclusion follows by
dividing by π(ω) at the (all but finitely many) points where π(ω) > 0 and invoking a
continuity argument at the isolated real zeros of π.
In particular, the above proposition holds for π(ω) = 1 for all ω ∈ R. Though obvious, we
need this in the sequel, and hence it is convenient to record the following corollary:
Corollary 3.3.3 If Φ1 (−iω, iω) = Φ2 (−iω, iω) then QΦ1 ∼ QΦ2 .
Example 3.3.4 Consider the following matrices:

    Φ1(ζ, η) = [ ζ+η   ζ  ]        Φ2(ζ, η) = [   0          2ζ+η      ]
               [  η    ζη ]                   [ 2η+ζ    −(ζ²+η²)/2     ]

Let us consider the action of QΦ1 and QΦ2 on a trajectory (u, y)^T ∈ C∞(R, R²):

    QΦ1(u, y) = 2u·(d/dt)u + 2(d/dt)u·y + ((d/dt)y)²                   (3.3)

    QΦ2(u, y) = 4(d/dt)u·y + 2u·(d/dt)y − y·(d²/dt²)y                  (3.4)

It is clear that QΦ1 and QΦ2 are different QDFs. However, notice that

    Φ1(−iω, iω) = Φ2(−iω, iω) = [ 0   −iω ]
                                [ iω   ω² ]

Therefore, though QΦ1 and QΦ2 are different QDFs, QΦ1 ∼ QΦ2: by Corollary 3.3.3, every
behavior that is Φ1-dissipative is also Φ2-dissipative, and vice versa.
In the following section, we first concentrate on dissipative systems with two manifest
variables. In Section 3.5, we generalize the results obtained below in several directions.
3.4 SISO dissipative systems
Consider a behavior B ∈ L2con defined by an observable image representation

    [ u ]     [ q(d/dt) ]
    [ y ]  =  [ p(d/dt) ] ℓ                                            (3.5)

with the scalar ℓ a free latent variable. Note that since equation (3.5) is an observable image
representation of the behavior B, p(ξ) and q(ξ) are coprime polynomials. Define the transfer
function G(ξ) = p(ξ)/q(ξ). In the following sections, we identify the behavior B with its
transfer function G(ξ).
Consider the QDF QJ with

    J = [  0   1/2 ]  ∈ R_s^{2×2}[ζ, η]
        [ 1/2   0  ]

Then, B is J-dissipative if (by definition)

    ∫_{−∞}^{∞} u(t) y(t) dt ≥ 0                                        (3.6)

with (u(t), y(t)) ∈ D(R, R²) ∩ B, i.e., for compactly supported C∞-trajectories in B. Equation
(3.6) is precisely the condition satisfied by the inputs and outputs of passive linear systems.
Thus, J-dissipativity in the behavioral setting is equivalent to passivity in the classical
(input-output) setting. Passive systems are of interest in circuit theory because they can be
synthesized using passive components alone.
Transfer functions of passive systems have a very well known characterization, namely that
they are positive real (see [56] for details). A rational function G(ξ) is said to be positive real
if:
1. G(ξ) is analytic in C+ where C+ denotes the open right half of the complex plane.
2. G(iω) + G∗ (iω) ≥ 0 for almost all ω ∈ R.
3. All poles of G(ξ) on the imaginary axis are simple.
The behavior B defined by equation (3.5) is J-dissipative if and only if
p(−iω)q(iω) + q(−iω)p(iω) ≥ 0 ∀ω ∈ R
(3.7)
Note that equation (3.7) divided by the positive quantity q(−iω)q(iω) gives
G(iω) + G*(iω) ≥ 0 for almost all ω ∈ R. This implies that the associated scalar transfer
function G(ξ) = p(ξ)/q(ξ) satisfies Re(G(iω)) ≥ 0 for almost all ω ∈ R. Therefore, if we view
the scalar transfer function G(ξ) corresponding to B as a function from C to C, then it maps
every point on the imaginary axis to some point in the closed right half complex plane. A
geometric way of visualizing this map is through the Nyquist plot of G(ξ): the Nyquist plot of
G(ξ) lies entirely in the closed right half complex plane. This is precisely condition (2) in the
definition of positive realness of scalar transfer functions given above.

Note that in controllable behaviors, there is no notion of stability. A behavioral counterpart
to the analyticity condition in the definition of positive real transfer functions (condition 1)
will be obtained in the next chapter (Chapter 4), where we show that analyticity of G(ξ) is
equivalent to the existence of positive definite storage functions. It follows that all positive
real transfer functions can be identified with behaviors that are J-dissipative, but the converse
is not true. Let us consider an example that demonstrates the difference between passivity
and J-dissipativity:
Example 3.4.1 Consider the behavior associated with G(ξ) = (ξ − 2)/(ξ − 1). Let us check
that this behavior is J-dissipative:

    [ −iω−2   −iω−1 ] [  0   1/2 ] [ iω−2 ]  =  ω² + 2 > 0  for all ω ∈ R
                      [ 1/2   0  ] [ iω−1 ]

However, the rational function (ξ − 2)/(ξ − 1) is not positive real, since both the numerator
and the denominator have roots in the right half complex plane.
It is clear from equation (3.7) that if the behavior represented by equation (3.5) is
J-dissipative, then so is the behavior represented by

    [ w1 ]     [ p(d/dt) ]
    [ w2 ]  =  [ q(d/dt) ] ℓ                                           (3.8)

In the language of transfer functions, the above implies that if the behavior identified with
G(ξ) is J-dissipative, then so is the behavior identified with 1/G(ξ).
Since every B ∈ LJ can be identified with a scalar transfer function whose Nyquist plot lies
entirely in the closed right half of the complex plane, we use this property to identify
behaviors that are J-dissipative. In other words, behaviors in LJ are precisely those behaviors
whose associated transfer functions have their Nyquist plots entirely in the closed right half
plane.
We now characterize Φ-dissipative behaviors for a general Φ ∈ R_s^{2×2}[ζ, η]. Since the
matrix Φ(−iω, iω) is Hermitian for every ω ∈ R, it has real eigenvalues. It can easily be shown
that the eigenvalues of this matrix at each ω are precisely the roots of the second degree
polynomial equation

    s² − trace(Φ(−iω, iω)) s + det(Φ(−iω, iω)) = 0

Using elementary facts about the signs of the roots of quadratic equations, we have the
following proposition.
Proposition 3.4.2 Given Φ ∈ R_s^{2×2}[ζ, η], the following hold:

1. If det(Φ(−iω, iω)) ≥ 0 and trace(Φ(−iω, iω)) ≥ 0 for all ω ∈ R, then every behavior in
L2con is Φ-dissipative.

2. If there exists some ω = ω0 such that det(Φ(−iω0, iω0)) > 0 and trace(Φ(−iω0, iω0)) < 0,
then there exist no non-trivial behaviors in L2con that are Φ-dissipative.

3. If det(Φ(−iω, iω)) < 0 for all ω ∈ R outside the spectrum of Φ (i.e. wherever
det Φ(−iω, iω) ≠ 0), then LΦ ⊊ L2con.
Proof: When trace(Φ(−iω0, iω0)) and det(Φ(−iω0, iω0)) are both positive for some ω0 ∈ R,
we can conclude that Φ(−iω0, iω0) is a positive definite matrix at that ω0. If Φ(−iω0, iω0)
is positive definite for every ω0 ∈ R, it is easy to see that every controllable behavior
B ∈ L2con is Φ-dissipative. By a similar argument, it is easy to see that when
det(Φ(−iω, iω)) ≥ 0 and trace(Φ(−iω, iω)) ≥ 0 for all ω ∈ R, every behavior in L2con is
Φ-dissipative.

When trace(Φ(−iω0, iω0)) < 0 and det(Φ(−iω0, iω0)) > 0 for some ω0 ∈ R, both roots of the
characteristic polynomial are negative. Therefore Φ(−iω0, iω0) is a negative definite matrix,
and consequently there exists no non-trivial behavior that is Φ-dissipative.

When det(Φ(−iω, iω)) < 0, the eigenvalues of Φ(−iω, iω) have opposite signs. Hence, for
some ω0 ∈ R outside the spectrum of Φ, we can find a vector v ∈ C² such that

    v* Φ(−iω0, iω0) v < 0
We can construct two coprime polynomials r(ξ) and s(ξ) of arbitrary degrees such that

    [ r(iω0) ]
    [ s(iω0) ]  =  v

Note that the behavior given by Im [r(d/dt); s(d/dt)] is in L2con. But by its very construction,
this behavior is not in LΦ. Hence LΦ is not all of L2con.

We now demonstrate the use of Proposition 3.4.2 with some examples.
Example 3.4.3 With Φ(ζ, η) = I2, it is easy to see (using Theorem 3.2.3) that every behavior
B ∈ L2con is Φ-dissipative. This can also be seen from part 1 of Proposition 3.4.2, since for
Φ = I2 both the trace and the determinant are positive for all real values of ω.
Example 3.4.4 Consider the QDF QΦ induced by

    Φ(ζ, η) = [ ζη − 2     0    ]
              [   0     ζη − 3  ]

Let w = (w1, w2) be a typical trajectory in some behavior B ∈ L2con. Then

    QΦ(w) = ((d/dt)w1)² − 2w1² + ((d/dt)w2)² − 3w2²

It is not difficult to show that every behavior B ∈ L2con contains some trajectory in which
w1, w2 are combinations of the sinusoidal functions sin t and cos t. Clearly, the above QDF
yields negative values when we integrate it over some finite time interval (we might have to
make some adjustments about how long a time interval we consider). By taking a chopped
version of the sinusoidal trajectories (this is always possible, since the behavior is
controllable), we obtain a compactly supported trajectory in the behavior such that
∫_{−∞}^{∞} QΦ(w) dt < 0. Thus no non-trivial behavior B ∈ L2con is Φ-dissipative. We
arrive at the same conclusion using condition 2 of Proposition 3.4.2, since at ω = ±1,
det(Φ(−iω, iω)) = 2 and trace(Φ(−iω, iω)) = −3.
Henceforth we assume that Φ(−iω, iω) ∈ R2×2[iω] is nonsingular and has one positive and one
negative eigenvalue for almost all ω ∈ R, i.e., Φ(−iω, iω) has constant zero signature for
almost all ω ∈ R. From Proposition 3.4.2 it then follows that

    det Φ(−iω, iω) < 0

for almost all ω ∈ R.
We shall now parametrize Φ-dissipative behaviors in terms of J-dissipative behaviors. Our
base set for the parametrization is LJ, which we characterized as those behaviors whose
associated transfer functions have their Nyquist plots entirely in the closed right half of the
complex plane.
We now show by explicit construction that every Φ ∈ R_s^{2×2}[ζ, η] such that Φ(−iω, iω) has
constant zero signature for almost all ω ∈ R admits an interesting factorization:

Proposition 3.4.5 Every matrix Φ ∈ R_s^{2×2}[ζ, η] such that Φ(−iω, iω) has constant zero
signature for almost all ω ∈ R can be factorized in the following manner:

    π(ω) Φ(−iω, iω) = K^T(−iω) J K(iω)

with π(ω) a scalar polynomial in ω such that π(ω) ≥ 0 for all ω ∈ R, and K(iω) a 2 × 2
matrix whose entries are polynomials in iω.
Proof: We prove the proposition by considering various cases and constructing the matrix
K(iω) in each case.

Case 1: Φ with nonzero diagonal elements. Consider Φ given by

    Φ(−iω, iω) = [ φ11(iω)    φ12*(iω) ]
                 [ φ12(iω)    φ22(iω)  ]

Since det(Φ(−iω, iω)) ≤ 0 for all ω ∈ R, we factorize the determinant as

    d(iω) d(−iω) = −det(Φ(−iω, iω))

Define the matrix K(iω) by

    K(iω) = [ φ11(iω)       φ12*(iω) + d(iω)              ]
            [ φ11(iω)²      φ11(iω)(φ12*(iω) − d(iω))     ]            (3.9)

Then, direct multiplication shows that

    φ11(iω)² Φ(−iω, iω) = K^T(−iω) J K(iω)

Note that φ11(iω) is an even polynomial in ω, so φ11(iω) = φ11(−iω). As a result,
φ11(iω)² = |φ11(iω)|² is non-negative for all real ω. Thus, π(ω) = |φ11(iω)|².
Case 2: Φ with at least one diagonal element uniformly zero. Without loss of generality we
assume the (1, 1) element to be zero. Consider the Φ defined by

    Φ(−iω, iω) = [     0        φ12*(iω) ]
                 [ φ12(iω)      φ22(iω)  ]                             (3.10)

Consider K(iω) defined by

    K(iω) = [ φ12(iω)    φ22(iω)/2 ]
            [    0           1     ]                                   (3.11)

It can be verified by direct multiplication that

    (1/2) Φ(−iω, iω) = K^T(−iω) J K(iω)                                (3.12)

Thus we have obtained the required factorization when at least one of the diagonal elements
of Φ is uniformly zero; here π(ω) = 1/2, which is positive for all ω ∈ R. This exhausts all
possible cases for Φ with constant zero signature, and hence the proof is complete.
Remark 3.4.6 The factorization given in Proposition 3.4.5 is a rather special case of rational
J-spectral factorization. The interesting feature of this factorization, however, is that it is
stated explicitly in terms of the coefficients of the QDF. This makes it possible to study how
the matrix K(ξ) obtained in the factorization depends parametrically on the coefficients of
the QDF.
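Both cases of the factorization can be verified by direct symbolic multiplication. A sympy
sketch (illustrative only), using the constant Φ of Example 3.4.8 below for Case 1 and the Θ
of Example 3.4.9 for Case 2:

    import sympy as sp

    w = sp.symbols('omega', real=True)
    J = sp.Rational(1, 2) * sp.Matrix([[0, 1], [1, 0]])

    # Case 1 with the constant Phi = diag(1, -1): det Phi = -1, so we may
    # take d = 1 in equation (3.9); here pi(omega) = 1.
    Phi = sp.Matrix([[1, 0], [0, -1]])
    K = sp.Matrix([[1, 0 + 1], [1, 1*(0 - 1)]])
    assert K.T * J * K - Phi == sp.zeros(2)

    # Case 2 with Theta(-iw, iw) = [[0, -iw], [iw, w^2]], using K from
    # equation (3.11); here pi(omega) = 1/2.
    phi12, phi22 = sp.I*w, w**2
    Theta = sp.Matrix([[0, sp.conjugate(phi12)], [phi12, phi22]])
    K2 = sp.Matrix([[phi12, phi22/2], [0, 1]])
    lhs = (K2.subs(w, -w)).T * J * K2        # K^T(-i omega) J K(i omega)
    assert lhs - Theta/2 == sp.zeros(2)
    print("both factorizations verified")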
We now show that the factorization in Proposition 3.4.5 can be used to define, in a natural
way, a differential operator that maps the set of all Φ-dissipative behaviors to the set of all
J-dissipative behaviors.
Theorem 3.4.7 Given Φ(ζ, η) ∈ R_s^{2×2}[ζ, η] such that Φ(−iω, iω) is nonsingular and has
constant zero signature, factorize Φ(−iω, iω) as in Proposition 3.4.5. Let K(ξ) be the matrix
obtained by substituting ξ for iω in the matrix K(iω). Then the following hold:

1. K(d/dt) is a map from the set LΦ to LJ.

2. Define L(ξ) = adj K(ξ), i.e., L(ξ)K(ξ) = det(K(ξ)) I2. Then L(d/dt) parametrizes the set
LΦ through the set LJ, i.e., L(ξ) maps every J-dissipative behavior to a Φ-dissipative
behavior.
Proof: Given Φ ∈ R_s^{2×2}[ζ, η] such that Φ(−iω, iω) has constant zero signature, from
Proposition 3.4.5:

    π(ω) Φ(−iω, iω) = K^T(−iω) J K(iω)

where π(ω) ≥ 0 for all ω ∈ R. We note from Proposition 3.3.2 that multiplication of
Φ(−iω, iω) by the non-negative polynomial π(ω) does not affect the set LΦ. Let
B = Im [q(d/dt); p(d/dt)] ∈ L2con, and consider the behavior B̃ defined by
Im [q̃(d/dt); p̃(d/dt)], where

    [ q̃(ξ) ]           [ q(ξ) ]
    [ p̃(ξ) ]  =  K(ξ)  [ p(ξ) ]                                        (3.13)

Writing M(ξ) := [q(ξ); p(ξ)] and M̃(ξ) := K(ξ)M(ξ), the factorization gives
M̃^T(−iω) J M̃(iω) = π(ω) M^T(−iω) Φ(−iω, iω) M(iω); hence B̃ is a J-dissipative behavior
if and only if B is a Φ-dissipative behavior. Thus K(d/dt) is a map from LΦ to LJ.

Note that Im [r(d/dt)q(d/dt); r(d/dt)p(d/dt)] defines the same controllable behavior in L2con
as Im [q(d/dt); p(d/dt)] for any nonzero scalar polynomial r(ξ). Therefore the map defined by
K(d/dt)L(d/dt), which multiplies an image representation by the scalar det(K(d/dt)), is the
identity map from L2con to L2con, and so is its restriction to LJ. It follows that L(d/dt)
defines the inverse map of K(d/dt), and every behavior B̂ = Im [q̂(d/dt); p̂(d/dt)], where

    [ q̂(ξ) ]           [ q̃(ξ) ]
    [ p̂(ξ) ]  =  L(ξ)  [ p̃(ξ) ]                                        (3.14)

is Φ-dissipative if and only if B̃ = Im [q̃(d/dt); p̃(d/dt)] defines a J-dissipative behavior.
Thus, L(ξ) parametrizes the set LΦ: LΦ is precisely the image of LJ under the map L(ξ).

We now demonstrate the use of Theorem 3.4.7 with the help of some examples:
Example 3.4.8 We obtain a characterization of all behaviors dissipative with respect to the
QDF QΦ with

    Φ(ζ, η) = [ 1   0 ]
              [ 0  −1 ]

Φ is a constant matrix with zero signature, and its determinant is negative. Note that this Φ
corresponds to Case 1 in the proof of Proposition 3.4.5. From Proposition 3.4.5 and Theorem
3.4.7 we obtain

    L(ξ) = [ −1  −1 ]
           [ −1   1 ]                                                  (3.15)

This L(ξ) is a map that parametrizes all Φ-dissipative behaviors, for the action of L(ξ) on
the set LJ yields the set LΦ. Let us now check the above claim for an arbitrary J-dissipative
behavior. Take a behavior B in LJ, say Im [(d/dt) + 1; (d/dt) + 2]. We can check that this
behavior B is in LJ by checking the Nyquist plot of the associated transfer function
(ξ + 1)/(ξ + 2). Then the action of L(d/dt) on the image representation of the J-dissipative
behavior B gives

    L(ξ) [ ξ+1 ]     [ −2ξ − 3 ]
         [ ξ+2 ]  =  [    1    ]

We now see that

    [ −2(−iω) − 3   1 ] [ 1   0 ] [ −2(iω) − 3 ]  =  4ω² + 8           (3.16)
                        [ 0  −1 ] [     1      ]

is positive for all real values of ω. Hence the image of B ∈ LJ under L(ξ) indeed gives a
Φ-dissipative behavior.
We now consider the example of a more complicated QDF.
Example 3.4.9 Consider the QDF QΘ with Θ given by

    Θ(ζ, η) = [ 0   ζ  ]
              [ η   ζη ]

Note that Θ(−iω, iω) corresponds to Case 2 in the proof of Proposition 3.4.5. For this
example we see that

    L(ξ) = [ 1   ξ²/2 ]
           [ 0    ξ   ]                                                (3.17)

The map L(ξ) acting on LJ parametrizes the set of Θ-dissipative behaviors LΘ. Let us check
the above claim for an arbitrary J-dissipative behavior defined, as in Example 3.4.8, by
B = Im [(d/dt) + 1; (d/dt) + 2]. The action of L(d/dt) on this behavior B gives the behavior
B′ defined as the image of

    [ 1   ξ²/2 ] [ ξ+1 ]     [ ξ³/2 + ξ² + ξ + 1 ]
    [ 0    ξ   ] [ ξ+2 ]  =  [      ξ² + 2ξ      ]

We see that

    [ 0.5(−iω)³ + (−iω)² + (−iω) + 1    (−iω)² + 2(−iω) ]
        × [  0   −iω ] [ 0.5(iω)³ + (iω)² + (iω) + 1 ]                 (3.18)
          [ iω    ω² ] [       (iω)² + 2(iω)         ]

gives the polynomial 2ω⁴ + 4ω². This polynomial is non-negative for all real values of ω, and
hence the behavior B′ is Θ-dissipative.
In the next section, we generalize the results obtained so far. We investigate behaviors with
more than two manifest variables that are dissipative with respect to a supply function defined
by a QDF. We first consider QDFs QΦ where Φ(−iω, iω) has constant inertia for almost all
ω ∈ R. In this case, the results for the SISO and the MIMO cases do not differ much, though
the techniques used are different. We then consider the important case when the inertia of
Φ(−iω, iω) varies with ω.
3.5 MIMO dissipative systems: the constant inertia case
In Section 3.4 we considered the QDF QJ, where J was the 2 × 2 matrix

    J = [  0   1/2 ]
        [ 1/2   0  ]

Notice that in the MIMO case, such a "J" can be defined only if the number of inputs and
outputs of a system are the same. In order to overcome this limitation, we would like to look
at a more general QJ.
3.5.1 Supply functions defined by constant matrices
Consider the following inertia matrix

    Jmn = [ Im    0  ]
          [  0   −In ]                                                 (3.19)

where m, n ∈ Z+ and Im (respectively In) denotes the m × m (respectively n × n) identity
matrix. Consider the corresponding QDF QJmn. We show that m is the maximum possible
input cardinality of a Jmn-dissipative behavior:
Lemma 3.5.1 Let B be Jmn-dissipative. Then the input cardinality of B satisfies m(B) ≤ m.

Proof: We prove the lemma by a contradiction argument. Suppose there exists a
Jmn-dissipative behavior B′ defined by an observable image representation w = M(d/dt)ℓ such
that m(B′) = α with α > m. Since the minimum number of latent variables equals the input
cardinality, ℓ is Rα-valued. Further, the column rank of M(ξ) is α. Partition the rows of M
as Mm and Mn conformally with Jmn. Since m < α, the system of equations Mm(d/dt)ℓ = 0
is underdetermined. Hence, one can find ℓ ∈ D(R, Rα) − {0} such that Mm(d/dt)ℓ = 0. Since
Im M(d/dt) is observable, the trajectory w = M(d/dt)ℓ corresponding to any
ℓ ∈ (D(R, Rα) − {0}) ∩ Ker Mm(d/dt) is non-zero. Integrating QJmn along such a w yields a
negative quantity, which contradicts the assumption that B′ is Jmn-dissipative. Hence, the
input cardinality of a Jmn-dissipative behavior cannot exceed m.
Consider a Jmn-dissipative behavior B defined by an observable image representation
$$\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} Q(\frac{d}{dt}) \\ P(\frac{d}{dt}) \end{bmatrix} \ell$$
with ℓ ∈ C∞(R, Rl), Q(ξ) ∈ R^{m×l}[ξ], P(ξ) ∈ R^{n×l}[ξ]. Consider the special case when l = m. Theorem 3.2.3 shows that B is Jmn-dissipative if and only if the following inequality holds:
$$Q^T(-i\omega)Q(i\omega) - P^T(-i\omega)P(i\omega) \geq 0 \quad \forall \omega \in \mathbb{R}$$
Notice that since the image representation is observable by assumption, Q(ξ) has no singularities on iR. Further, since Q(ξ) is a polynomial matrix, it follows that det Q(ξ) ≠ 0. Define G(ξ) = P(ξ)Q⁻¹(ξ). Then, the above inequality is equivalent to
$$G^T(-i\omega)G(i\omega) \leq I_m \text{ for almost all } \omega \in \mathbb{R} \qquad (3.20)$$
Rational functions that satisfy the above inequality have been well studied in systems theory. Note that GT(−iω)G(iω) represents the square of the gain of the transfer function G(ξ) for sinusoidal inputs of frequency ω. Thus, inequality (3.20) gives an upper bound on the gain of the transfer function G(ξ) in the frequency domain. The existence of an upper bound is particularly significant in disturbance attenuation, see Chapter 7. Another important application of rational functions satisfying the inequality (3.20) is in interpolation theory, see [9] and Chapter 9. Inequality (3.20) implies that the L∞ norm of G(ξ) is at most unity.

Due to our familiarity with rational functions satisfying the inequality (3.20), we shall consider the set of all Jmn-dissipative behaviors as the "base set" and parametrize more general dissipative behaviors in terms of Jmn-dissipative behaviors. Section 3.4 dealt with the case m = n = 1.
We now consider the special case when m = n. The matrix Jmm is then congruent to the matrix J = JT ∈ R^{2m×2m} defined by
$$J = \frac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix} \qquad (3.21)$$
since
$$2\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix} = \begin{bmatrix} I_m & I_m \\ I_m & -I_m \end{bmatrix} \begin{bmatrix} I_m & 0 \\ 0 & -I_m \end{bmatrix} \begin{bmatrix} I_m & I_m \\ I_m & -I_m \end{bmatrix}$$
We have seen that the supply function QJ has important implications, especially in the context of electrical circuits. It is equivalent to study systems that are Jmn-dissipative or J-dissipative, and we may prefer one over the other depending on the situation at hand. In the sequel, we consider Jmn instead of J because QJmn, unlike QJ, does not presume that the numbers of inputs and outputs are the same. We are now in a position to address the parametrization problem for supply functions defined by non-constant polynomial matrices.
3.5.2 Supply functions defined by polynomial matrices
Consider a supply function defined by QΦ with Φ ∈ R_s^{w×w}[ζ, η]. Since Φ(−iω, iω) is a Hermitian matrix, it has real eigenvalues for every ω ∈ R. Clearly, if Φ(−iω, iω) is positive semi-definite for every ω ∈ R, every controllable behavior in L^w_con is Φ-dissipative. At the other extreme is the case when Φ(−iω1, iω1) is negative definite for some ω1 ∈ R. In this case, there exist no non-trivial Φ-dissipative behaviors (Lemma 3.2.5).
In this section, we assume that Φ(−iω, iω) has constant inertia for almost all ω ∈ R, i.e., the numbers of positive, negative and zero eigenvalues of the Hermitian matrix Φ(−iω, iω) remain the same for almost all ω ∈ R. We distinguish this case from the case when the inertia of Φ(−iω, iω) changes with ω; we address the latter case in the next section.
It is a deep and non-trivial result that every nonsingular Φ(−iω, iω) = ΦT(iω, −iω) ∈ R^{w×w}[iω] having constant inertia for almost all ω ∈ R admits a polynomial Jmn-spectral factorization (see [17, 38, 49, 80] for details), i.e., there exist nonsingular matrices Jmn = Jmn^T ∈ R^{w×w} and K(iω) ∈ R^{w×w}[iω] such that
$$\Phi(-i\omega, i\omega) = K^T(-i\omega) J_{mn} K(i\omega) \qquad (3.22)$$
with m + n = w.
Consider the differential operator K(d/dt) induced by the polynomial matrix K(iω). We define the action of K(d/dt) on a controllable behavior B ∈ L^w_con in the following manner:
$$K(\tfrac{d}{dt})(B) := \{v \mid v(t) = K(\tfrac{d}{dt})w(t),\; w(t) \in B\}$$
Clearly, K(d/dt)(B) is controllable and has the same input cardinality as B, since K(ξ) is nonsingular. The following proposition is an easy consequence of Definition 3.3.1:
Proposition 3.5.2 Consider Φ1 ∈ R_s^{w×w}[ζ, η] such that Φ1(−iω, iω) is nonsingular and has constant inertia for almost all ω ∈ R. Obtain a polynomial Jmn-spectral factorization of Φ1(−iω, iω) as K^T(−iω)JmnK(iω). Define Φ2 as
$$\Phi_2(\zeta, \eta) = K^T(\zeta) J_{mn} K(\eta)$$
Then, QΦ1 ∼ QΦ2.
The following theorem gives a parametrization for Φ-dissipative behaviors.
Theorem 3.5.3 Given a QDF QΦ with Φ(−iω, iω) ∈ R^{w×w}[iω] nonsingular and having constant inertia for almost all ω ∈ R, obtain a polynomial Jmn-spectral factorization of Φ(−iω, iω) as in equation (3.22). Let K(ξ) be the matrix one obtains by substituting ξ for iω in the matrix K(iω). Then the following hold:

1. K(d/dt) is a map from the set LΦ to LJmn, i.e., given B ∈ L^w_con,
$$K(\tfrac{d}{dt})(B) \in L_{J_{mn}} \iff B \in L_{\Phi}$$

2. Define L(ξ) as the adjugate matrix of K(ξ): L(ξ) = adj K(ξ), i.e., L(ξ)K(ξ) = det(K(ξ)) I_w. Then the differential operator L(d/dt) parametrizes the set LΦ through the set LJmn, i.e., L(d/dt) maps every Jmn-dissipative behavior to a Φ-dissipative behavior.
Proof: Let Θ(ζ, η) = K^T(ζ)JmnK(η). Clearly, Θ(−iω, iω) = Φ(−iω, iω) and therefore, from Proposition 3.5.2, QΘ ∼ QΦ.

Let w = M(d/dt)ℓ be an observable image representation of a behavior B ∈ L^w_con. Notice that
$$M^T(-i\omega)\Phi(-i\omega, i\omega)M(i\omega) = [K(-i\omega)M(-i\omega)]^T J_{mn} [K(i\omega)M(i\omega)]$$
Define the behavior B′ as the image of M′(d/dt) := K(d/dt)M(d/dt). Then,
$$M^T(-i\omega)\Phi(-i\omega, i\omega)M(i\omega) \geq 0 \iff M'^T(-i\omega) J_{mn} M'(i\omega) \geq 0$$
which shows that B is Φ-dissipative if and only if B′ is Jmn-dissipative. Thus, K(d/dt)(B) is Jmn-dissipative if and only if B is Φ-dissipative.

Consider an arbitrary nonzero polynomial r(ξ) ∈ R[ξ]. Note that Im M(d/dt)r(d/dt) defines the same controllable behavior in L^w_con as Im M(d/dt). Therefore the map defined by L(d/dt)K(d/dt) is the identity map on L^w_con, i.e.,
$$L(\tfrac{d}{dt})K(\tfrac{d}{dt})(B) = B \quad \forall B \in L^w_{con}$$
and so is the restriction of this map to LΦ. Therefore it follows that L(d/dt) defines the inverse map of K(d/dt):
$$L(\tfrac{d}{dt})(B') \in L_{\Phi} \iff B' \in L_{J_{mn}}$$
Thus the set LΦ is precisely the image of LJmn under the map L(ξ).
Remark 3.5.4 Closely associated with the concept of behavioral dissipativity is that of losslessness (B is said to be Φ-lossless if ∫_{−∞}^{∞} QΦ(w) dt = 0 along all compactly supported trajectories w ∈ B). It is clear that the parametrization obtained in this chapter can also be used for the parametrization of lossless behaviors. In particular, it is easy to see that Φ-lossless behaviors can be parametrized as the image of Jmn-lossless behaviors under L(ξ).
Thus, we have obtained from Theorem 3.5.3 a complete characterization of Φ-dissipative behaviors, assuming that Φ(−iω, iω) is nonsingular and has constant inertia for almost all ω ∈ R.
We now consider the most general case, when the inertia of Φ(−iω, iω) changes as a function of ω. These supply functions will be useful in solving the problem of synthesis of a dissipative behavior in Chapter 8.
3.6 MIMO dissipative systems: the general inertia case
In this section we address the problem of parametrizing Φ-dissipative behaviors with Φ(−iω, iω) having inertia that changes with ω. Given Φ(−ξ, ξ) = ΦT(ξ, −ξ) ∈ R^{w×w}[ξ], let n = max_{ξ∈iR} σ−(Φ(−ξ, ξ)), i.e., n is the maximum number of negative eigenvalues of Φ(−iω, iω), ω ∈ R.
Recall that if Φ(−iω0, iω0) is negative definite for some ω0 ∈ R, there can exist no non-trivial behaviors that are Φ-dissipative. Therefore, a necessary condition for the existence of non-trivial dissipative behaviors is clearly that n < w. Thus, the number n seems to have a direct influence on the "size" of the set of Φ-dissipative behaviors.

We now define what we call the "worst inertia" matrix associated with Φ(−ξ, ξ). This matrix plays a central role in the parametrization results obtained later in this chapter.
Definition 3.6.1 Given nonsingular Φ(−ξ, ξ) = ΦT(ξ, −ξ) ∈ R^{w×w}[ξ], let n := max_{ξ∈iR} σ−(Φ(−ξ, ξ)). The w × w inertia matrix
$$J_{worst} := \begin{bmatrix} I_{w-n} & 0 \\ 0 & -I_n \end{bmatrix}$$
is called the "worst inertia" matrix of Φ(−ξ, ξ). The non-zero integer three-tuple (w − n, n, 0) is called the "worst inertia" of Φ(−ξ, ξ) and is denoted by σworst(Φ).
Given QΦ, σworst(Φ) is unique. In Definition 3.6.1, Φ(−ξ, ξ) has been assumed to be nonsingular mainly for the sake of convenience. The theory presented in this section can still be used when Φ(−ξ, ξ) is singular, by appropriately modifying the "worst inertia" matrix with zero submatrices.
Since Φ(−iω, iω) is a polynomial matrix in iω, its eigenvalues change continuously with ω. Let ω1 < ω2 be such that Φ(−iωi, iωi) is singular for i = 1, 2. The inertia of Φ(−iω, iω) remains constant on the interval (ω1, ω2) if there exists no ω3 ∈ (ω1, ω2) such that Φ(−iω3, iω3) is singular. Thus, in order to determine the "worst inertia" of Φ(−iω, iω) it is enough to determine the inertia at any real number between two consecutive real singularities of Φ(−iω, iω). Using this fact, the following algorithm (a numeric sketch follows the list) can be used to determine the "worst inertia":
1. Given nonsingular Φ(−iω, iω) = ΦT(iω, −iω) ∈ R^{w×w}[iω], determine all real, non-negative and distinct roots of det Φ(−iω, iω) = 0, and arrange them in ascending order, i.e., determine ω1, . . . , ωk such that det Φ(−iωi, iωi) = 0, ωi ≥ 0 and distinct, i = 1, . . . , k, with ω1 < . . . < ωk.

2. If ω1 = 0: Let ωk+1 be an arbitrary finite real number larger than ωk. Determine the inertia of the k matrices Φ(−iω̄i, iω̄i) with ω̄i := (ωi + ωi+1)/2, i = 1, . . . , k. Denote the inertia of the i-th matrix by σ^i(Φ), and let σ^i_−(Φ) denote the number of negative eigenvalues of the i-th matrix. Then the "worst inertia" σworst(Φ) is defined as σ^p(Φ), where p is such that σ^p_−(Φ) is the maximum among σ^i_−(Φ), i = 1, . . . , k.

3. If ω1 > 0: Let ω0 = 0 and let ωk+1 be an arbitrary real number larger than ωk. Determine the inertia of the k + 1 matrices Φ(−iω̄i, iω̄i) with ω̄i := (ωi−1 + ωi)/2, i = 1, . . . , k + 1. Denote the inertia of the i-th matrix by σ^i(Φ), and let σ^i_−(Φ) denote the number of negative eigenvalues of the i-th matrix. Then the "worst inertia" σworst(Φ) is defined as σ^p(Φ), where p is such that σ^p_−(Φ) is the maximum among σ^i_−(Φ), i = 1, . . . , k + 1.
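A numerical rendering of this algorithm is sketched below. It is our illustration, not part of the original text, applied to the matrix Φ(−iω, iω) = diag(1, 1 − ω²) that reappears in Example 3.6.10, whose determinant has the single non-negative real root ω = 1.

```python
# Sketch of the worst-inertia algorithm: sample the inertia of Phi(-i w, i w)
# at midpoints between consecutive non-negative real roots of det Phi = 0
# (and once beyond the largest root), keeping the sample with most negatives.
import numpy as np

def Phi(w):
    return np.array([[1.0, 0.0], [0.0, 1.0 - w**2]])   # Phi(-i w, i w)

def inertia(H, tol=1e-9):
    eigs = np.linalg.eigvalsh(H)
    return (int((eigs > tol).sum()), int((eigs < -tol).sum()),
            int((np.abs(eigs) <= tol).sum()))

def worst_inertia(Phi, det_roots):
    roots = sorted(r for r in det_roots if r >= 0)
    grid = ([0.0] if roots[0] > 0 else []) + roots + [roots[-1] + 1.0]
    samples = [(a + b) / 2.0 for a, b in zip(grid, grid[1:])]
    # keep the inertia with the largest number of negative eigenvalues
    return max((inertia(Phi(w)) for w in samples), key=lambda s: s[1])

print(worst_inertia(Phi, det_roots=[1.0]))   # -> (1, 1, 0)
```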
Figure 3.1 illustrates the concept of the "worst inertia" of a matrix Φ(−iω, iω). Let ωi, i = 1, 2, 3, be the real non-negative roots of det Φ(−iω, iω) = 0. For ω ∈ (ω1, ω2), Φ(−iω, iω) has two positive and one negative eigenvalue. For ω ∈ (ω2, ω3), Φ(−iω, iω) has one positive and two negative eigenvalues. For ω ∈ (ω3, ∞), Φ(−iω, iω) has three positive eigenvalues. Thus, the "worst inertia" of Φ(−iω, iω) is attained in the interval (ω1, ω2), and is found to be (1, 2, 0). Thus the "worst inertia" matrix is
$$J_{worst} = J_{1\,2} = \begin{bmatrix} 1 & 0 \\ 0 & -I_2 \end{bmatrix}.$$

[Figure 3.1: Worst inertia of Φ(−iω, iω). The figure plots the eigenvalues of Φ(−iω, iω) against ω, with the real roots of det Φ(−iω, iω) = 0 marked on the ω-axis.]
We begin by quoting the following important theorem from [80] (IMA preprint no. 992, Theorem 5.1), which concerns a "minimal size" factorization of a para-Hermitian matrix.
Theorem 3.6.2 The polynomial matrix Φ(−ξ, ξ) = ΦT(ξ, −ξ) ∈ R^{w×w}[ξ] admits a factorization
$$\Phi(-\xi, \xi) = K^T(-\xi) R K(\xi) \qquad (3.23)$$
with R = RT ∈ R^{m×m} and K(ξ) ∈ R^{m×w}[ξ] if and only if
$$m \geq m_0 := \max_{\xi \in i\mathbb{R}} \sigma_+(\Phi) + \max_{\xi \in i\mathbb{R}} \sigma_-(\Phi)$$
Moreover, if R is of the minimal size (i.e., m0 × m0), then it is uniquely determined up to congruence: R has exactly max_{ξ∈iR} σ+(Φ) positive and max_{ξ∈iR} σ−(Φ) negative eigenvalues, and hence can be taken to be the inertia matrix with these eigenvalues without loss of generality.
Obviously, m0 ≤ 2w. The case that interests us is when max_{ξ∈iR} σ−(Φ) < w (i.e., Φ(−iω, iω) is not negative definite for any ω ∈ R). Consider Figure 3.1. Notice that max_{ξ∈iR} σ−(Φ) = 2 and max_{ξ∈iR} σ+(Φ) = 3. Therefore, for the matrix considered in Figure 3.1, m0 = 2 + 3 = 5. Further, R can be taken to be the 5 × 5 inertia matrix diag[I3, −I2].
Remark 3.6.3 Consider Φ(−ξ, ξ) = K^T(−ξ)RK(ξ) (Theorem 3.6.2). Consider polynomial matrices X(ξ) such that X^T(−ξ)RX(ξ) = R. Then, K′(ξ) := X(ξ)K(ξ) is also a possible factor for Φ(−ξ, ξ): Φ(−ξ, ξ) = K′^T(−ξ)RK′(ξ). Matrices X(ξ) that satisfy X^T(−ξ)RX(ξ) = R are called R-unitary. See [9] for a discussion of R-unitary matrices.
"
#
K1 (ξ)
If Φ(−ξ, ξ) is nonsingular then K(ξ) is full column rank. Partition K(ξ) as
with
K2 (ξ)
K1 (ξ) having rows equal to σ+ (R), i.e., the number of positive eigenvalues of R and K2 (ξ)
having rows equal to σ− (R). Then,
Theorem 3.6.4 The matrices K1 (ξ) and K2 (ξ), obtained from partitioning rows of K(ξ) conformally with the number of positive, and negative eigenvalues of R respectively, have full row
rank as polynomial matrices.
Proof: Suppose not. We first consider the case when K1(ξ) does not have full row rank. Denote the rows of K1(ξ) by r1, . . . , rl where l = σ+(R). Then there exist polynomials p1, . . . , pl ∈ R[ξ], not all zero, such that p1 r1 = Σ_{i=2}^{l} pi ri. We assume without loss of generality that p1 ≠ 0. Define
$$C(\xi) = \begin{bmatrix} p_2 & p_3 & \ldots & p_l \end{bmatrix}, \qquad L(\xi) = \begin{bmatrix} r_2 \\ r_3 \\ \vdots \\ r_l \end{bmatrix} \qquad (3.24)$$
Consider the polynomial matrix Z(ξ) = p1(ξ)p1(−ξ)Φ(−ξ, ξ). Then,
$$Z(\xi) = \begin{bmatrix} L^T(-\xi) & K_2^T(-\xi) \end{bmatrix} \begin{bmatrix} p_1(\xi)p_1(-\xi)I_a + C^T(-\xi)C(\xi) & 0 \\ 0 & -p_1(\xi)p_1(-\xi)I_b \end{bmatrix} \begin{bmatrix} L(\xi) \\ K_2(\xi) \end{bmatrix}$$
where Ia denotes the identity matrix of size max_{ξ∈iR} σ+(Φ) − 1 and Ib denotes the identity matrix of size max_{ξ∈iR} σ−(Φ). The matrices p1(ξ)p1(−ξ)Ia + C^T(−ξ)C(ξ) and p1(ξ)p1(−ξ)Ib are positive definite for almost all ξ ∈ iR. Hence, the block diagonal matrix has constant inertia for almost all ξ ∈ iR. Consequently, it can be factorized as L1^T(−ξ)Jab L1(ξ) with L1(ξ) square and Jab an inertia matrix having (σ+(Φ) − 1) positive ones. This follows from the standard polynomial Jmn-spectral factorization (3.22).
Notice that the eigenvalues of Z(iω) and those of Φ(−iω, iω) are related by a positive scaling factor for almost all ω ∈ R, and consequently
$$\max_{\xi \in i\mathbb{R}} \sigma_+(Z(\xi)) = \max_{\xi \in i\mathbb{R}} \sigma_+(\Phi(-\xi, \xi))$$
We have thus obtained a factorization of Z(ξ) of size less than max_{ξ∈iR} σ+(Z(ξ)) + max_{ξ∈iR} σ−(Z(ξ)), which contradicts Theorem 3.6.2. Hence, K1(ξ) must have full row rank. The proof that K2(ξ) has full row rank is analogous.
We use Theorem 3.6.4 to prove the following important result:
Theorem 3.6.5 Every nonsingular matrix Φ(−ξ, ξ) = ΦT(ξ, −ξ) ∈ R^{w×w}[ξ] can be written as
$$\Phi(-\xi, \xi) = N^T(-\xi) J_{worst} N(\xi) + D^T(-\xi) D(\xi)$$
with Jworst ∈ R^{w×w} the "worst inertia" matrix associated with Φ(−ξ, ξ) and N(ξ) square and nonsingular.
Proof: Obtain a minimal factorization of Φ(−ξ, ξ) as in Theorem 3.6.2:
$$\Phi(-\xi, \xi) = K^T(-\xi) R K(\xi)$$
with R = diag(I_{m0−n}, −In) ∈ R^{m0×m0}. We emphasize that n denotes the maximum number of negative eigenvalues of Φ(−iω, iω) for ω ∈ R. Partition
$$K(\xi) = \begin{bmatrix} K_1(\xi) \\ K_2(\xi) \end{bmatrix}$$
conformally with the partition of R. Note that K(ξ) has full column rank. By Theorem 3.6.4, K1(ξ) and K2(ξ) have full row rank. Find a ξ0 ∈ C such that the rows of K2(ξ0) are linearly independent (over C) and K(ξ0) has full column rank. The rows of K2(ξ0) form a basis for an n-dimensional subspace of C^w, and this basis can be extended to one for C^w using k := w − n rows of K1(ξ0). Denote the corresponding rows of K1(ξ) by r1(ξ), . . . , rk(ξ). Let N(ξ) be the matrix obtained by stacking r1(ξ), . . . , rk(ξ) over the matrix K2(ξ), i.e.,
$$N(\xi) = \begin{bmatrix} r_1 \\ \vdots \\ r_k \\ K_2(\xi) \end{bmatrix}$$
Then, N(ξ) is a square and nonsingular polynomial matrix. Let D(ξ) be the matrix obtained by stacking the remaining m0 − n − k rows r_{k+1}(ξ), . . . , r_{m0−n}(ξ) of K1(ξ):
$$D(\xi) = \begin{bmatrix} r_{k+1}(\xi) \\ \vdots \\ r_{m_0-n}(\xi) \end{bmatrix}$$
It is now easy to see that Φ(−ξ, ξ) can be written as N^T(−ξ)Jworst N(ξ) + D^T(−ξ)D(ξ), and as we have shown already, N(ξ) is nonsingular.
Remark 3.6.6 Given Φ(−ξ, ξ) = K^T(−ξ)RK(ξ) (Theorem 3.6.2, Theorem 3.6.5), it is clear that N(ξ) and D(ξ) are not unique. In fact, there exist $\binom{m_0-n}{w-n}$ "sums" of the form N^T(−ξ)Jworst N(ξ) + D^T(−ξ)D(ξ) that can be obtained from Φ(−ξ, ξ), with N(ξ), D(ξ) unique up to a permutation of rows. It is guaranteed that N(ξ) has rank at least n, since by construction we have retained all rows of K2(ξ), which is a full row rank matrix. However, not all such sums will yield a nonsingular N(ξ).
We now define the notion of a “split sum” of Φ(−ξ, ξ):
Definition 3.6.7 Given a nonsingular para-Hermitian matrix Φ(−ξ, ξ), let Jworst be the "worst inertia" matrix associated with Φ(−ξ, ξ). Consider matrices N(ξ), D(ξ) such that N^T(−ξ)Jworst N(ξ) + D^T(−ξ)D(ξ) = Φ(−ξ, ξ). The triple (Jworst, N(ξ), D(ξ)) is said to define a split sum for Φ(−ξ, ξ) if N(ξ) is nonsingular.
A split sum of Φ(−ξ, ξ) can be thought of as a decomposition of Φ(−iω, iω) into the sum of a sign-indefinite and a positive semidefinite matrix. The definition of a split sum requires that the indefinite matrix be nonsingular. A split sum of Φ(−ξ, ξ) is not unique: from Remarks 3.6.3 and 3.6.6, one can modify a split sum using either an R-unitary transformation (Remark 3.6.3), or by taking a different combination of rows that forms the split sum (Remark 3.6.6). A split sum of Φ(−ξ, ξ) is useful for parametrizing a set of Φ-dissipative behaviors, as the following section shows:
3.6.1 Parametrizing a set of Φ-dissipative behaviors using split sums
Given a nonsingular Φ(−ξ, ξ) = ΦT(ξ, −ξ), let (Jworst, N(ξ), D(ξ)) define a split sum for Φ(−ξ, ξ). Define the two-variable polynomial matrix Θ(ζ, η) as follows:
$$\Theta(\zeta, \eta) = N^T(\zeta) J_{worst} N(\eta) \qquad (3.25)$$
We can see that Θ(−iω, iω) is precisely the indefinite part of the split sum on the imaginary axis. Then:
Theorem 3.6.8 The set of Θ-dissipative behaviors, LΘ, is a subset of the set of Φ-dissipative behaviors LΦ: LΘ ⊆ LΦ ⊂ L^w_con.

Proof: Let w = M(d/dt)ℓ be an observable image representation of a Θ-dissipative behavior B. Therefore,
$$M^T(-i\omega)\Theta(-i\omega, i\omega)M(i\omega) \geq 0 \quad \forall \omega \in \mathbb{R}$$
The matrix M^T(−iω)D^T(−iω)D(iω)M(iω) is positive semidefinite for all ω ∈ R and all M(ξ) ∈ R^{w×•}[ξ]. Hence, adding it to the above inequality does not affect the nature of the inequality. Therefore,
$$M^T(-i\omega)\left[D^T(-i\omega)D(i\omega) + \Theta(-i\omega, i\omega)\right]M(i\omega) \geq 0 \quad \forall \omega \in \mathbb{R}$$
which shows that B is Φ-dissipative.
Let (Jworst, N(ξ), D(ξ)) define a split sum for Φ(−ξ, ξ). Consider the map N(d/dt). From the results presented in Section 3.5, we can think of N(d/dt) as a map from the set of Θ-dissipative behaviors to the set of Jworst-dissipative behaviors, i.e.,
$$N(\tfrac{d}{dt}) : L_{\Theta} \to L_{J_{worst}}$$
Further, the map N(d/dt) admits an inverse map in the following sense: denote by L(ξ) the adjugate matrix of N(ξ), i.e., L(ξ)N(ξ) = d(ξ)I_w where d(ξ) = det N(ξ). Then,
$$L(\tfrac{d}{dt}) : L_{J_{worst}} \to L_{\Theta}$$
and hence into LΦ. The following theorem can be used to parametrize a set of Φ-dissipative behaviors.
Theorem 3.6.9 Consider a QDF QΦ with Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] such that Φ(−ξ, ξ) is nonsingular. Let (Jworst, Ni(ξ), Di(ξ)), i = 1, 2, . . ., define split sums of Φ(−ξ, ξ). Consider the set of Jworst-dissipative behaviors LJworst, and consider the behaviors
$$B_i := \mathrm{adj}\, N_i(\tfrac{d}{dt})(B_{J_{worst}}), \qquad B_{J_{worst}} \in L_{J_{worst}}.$$
Then, the behaviors Bi are Φ-dissipative.

Proof: Define QDFs QΘi where
$$\Theta_i(\zeta, \eta) = N_i^T(\zeta) J_{worst} N_i(\eta)$$
[Figure 3.2: Parametrization using split sums yields a proper subset of LΦ. The maps adj Ni(d/dt), i = 1, . . . , 6, obtained from different split sums of Φ(−ξ, ξ), carry the base set LJworst into LΦ.]
The map adj Ni(d/dt) acts on any Jworst-dissipative behavior to yield a Θi-dissipative behavior. Note that every Θi-dissipative behavior is also Φ-dissipative.

The process of parametrizing Φ-dissipative behaviors using split sums is shown in Figure 3.2. We demonstrate the parametrization of dissipative behaviors using split sums with an example:
"
#
1
0
Example 3.6.10 Let Φ(ζ, η) =
. Then QΦ (w) with w = [w1 w2 ]T is w12 + w22 −
0 1 − ζη
"
#
1
0
( dtd w2 )2 . Note that Φ(−iω, iω) =
. Thus, Φ(−iω, iω) has an inertia that varies
0 1 − ω2
with ω: for |ω| ∈ [0, 1), Φ(−iω, iω) is positive definite, while for |ω| ∈ (1, ∞), Φ(−iω, iω) has
one positive, and one negative eigenvalue. It is easy to see that
1. The maximum number of positive eigenvalues of Φ(−iω, iω) is 2, when |ω| ∈ [0, 1).
2. The maximum number of negative eigenvalues of Φ(−iω, iω), n is 1, when |ω| ∈ (1, ∞).
3. The “worst inertia” of Φ(−iω, iω) is thus (1, 1).
4. The “worst inertia” matrix Jworst of Φ(−iω, iω) is thus:
"
#
1 0
Jworst =
0 −1
Notice that Φ(−ξ, ξ) can be written as
$$\Phi(-\xi, \xi) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -\xi \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & \xi \end{bmatrix}$$
It is not"difficult
# to see using Theorem 3.6.2 that the above factorization is minimal. Define
h
i
1 0
N (ξ) =
and D(ξ) = 0 1 . Then, one can see that
0 ξ
Φ(−ξ, ξ) = N T (−ξ)Jworst N (ξ) + D T (−ξ)D(ξ)
The matrix N (ξ) is nonsingular. Thus (Jworst , N (ξ),"D(ξ)) defines
a split sum of Φ(−ξ, ξ). Let
#
ξ 0
L(ξ) be the adjugate of N (ξ), which is found to be
. We claim that
0 1
L(
d
) : LJworst → LΦ
dt
i.e, L( dtd )(B) is a Φ-dissipative behavior if B is Jworst -dissipative. Let us verify this claim for
an arbitrary Jworst -dissipative"behavior.#
d
+1
dt
. Then, B is Jworst -dissipative. Also, L( dtd )(B) has
Let B be defined as Im
1
"
#
d d
(
+
1)
an image representation defined by Im dt dt
. In order to check whether L( dtd )(B) is
1
Φ-dissipative we compute:
h
−iω − ω 2 1
i
"
1
0
0 1 − ω2
#"
iω − ω 2
1
#
which is found to be 1 + ω 4 which is clearly positive for all ω ∈ R. Hence, L( dtd )(B) is indeed
Φ-dissipative.
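The split-sum identity and the dissipativity check of Example 3.6.10 can be verified symbolically; the following sympy sketch is our verification, not part of the original text.

```python
# Verify the split sum Phi(-xi, xi) = N^T(-xi) Jworst N(xi) + D^T(-xi) D(xi)
# and the Phi-dissipativity of L(d/dt)(B) for B = Im[d/dt + 1; 1].
import sympy as sp

xi, w = sp.symbols('xi omega', real=True)
N = sp.Matrix([[1, 0], [0, xi]])
D = sp.Matrix([[0, 1]])
Jworst = sp.diag(1, -1)

split_sum = N.subs(xi, -xi).T * Jworst * N + D.subs(xi, -xi).T * D
assert sp.expand(split_sum - sp.Matrix([[1, 0], [0, 1 + xi**2]])) == sp.zeros(2, 2)

L = N.adjugate()                                   # [[xi, 0], [0, 1]]
M = sp.expand(L * sp.Matrix([xi + 1, 1]))          # image rep of L(d/dt)(B)
Phi_iw = sp.Matrix([[1, 0], [0, 1 - w**2]])        # Phi(-i w, i w)
val = (M.subs(xi, -sp.I*w).T * Phi_iw * M.subs(xi, sp.I*w))[0]
print(sp.expand(val))                               # -> omega**4 + 1 > 0
```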
Note that the parametrization of LΦ using a split sum of Φ(−ξ, ξ) will in general only yield a proper subset of LΦ. Different split sums define different maps adj Ni(d/dt), i = 1, 2, . . ., all of which can be used to parametrize subsets of LΦ using the same base set LJworst. With reference to Example 3.6.10, we now show that the set of behaviors parametrized using split sums is a proper subset of LΦ by exhibiting a Φ-dissipative behavior which is not Θ-dissipative.
Example 3.6.11 Let
$$B = \mathrm{Im} \begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix}$$
with q(ξ) = ξ² and p(ξ) = 1 + ξ. Then, B is Φ-dissipative:
$$\begin{bmatrix} -\omega^2 & 1 - i\omega \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 - \omega^2 \end{bmatrix} \begin{bmatrix} -\omega^2 \\ 1 + i\omega \end{bmatrix} = 1 > 0.$$
However, B is not Θ-dissipative, since Θ(−iω, iω) = diag(1, −ω²) and
$$\begin{bmatrix} -\omega^2 & 1 - i\omega \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -\omega^2 \end{bmatrix} \begin{bmatrix} -\omega^2 \\ 1 + i\omega \end{bmatrix} = -\omega^2 < 0 \text{ for } \omega \neq 0.$$
Thus, we have parametrized a proper subset of LΦ using split sums. The problem of determining all of LΦ from LJworst using split sums is, however, still open.
3.7 Conclusion
In this chapter we have addressed the problem of parametrizing dissipative systems. We have
constructed a differential operator that maps passive dynamical systems into dissipative dynamical systems. First, we have examined SISO systems. Under certain assumptions on the supply
function, we have obtained explicit formulae for the parametrization. This parametrization is
in terms of J-dissipative dynamical systems.
In the MIMO case we have shown that one can parametrize all dissipative systems when
the supply function satisfies the condition of constant inertia on the imaginary axis. When this
assumption does not hold, we have defined the notion of a split sum which we have then used
to construct a proper subset of the set of all dissipative systems.
As part of future work, it will be interesting to extend the idea of split sums to parametrize all dissipative behaviors, rather than a proper subset. It will also be interesting to investigate computational aspects of the minimal factorization proposed by Ran and Rodman [80], which has been used crucially in this chapter.
Chapter 4
KYP lemma and its extensions
4.1 Introduction
The Kalman-Yakubovich-Popov (KYP) lemma is one of the key results in linear systems theory. Though originally formulated by Yakubovich [109] and Kalman [40], and later by Popov [78], subsequent research has added to it. The classical KYP lemma can be thought of as giving conditions for a transfer function matrix to be positive real. What makes the formulation attractive is that these conditions are in terms of state space (real constant) matrices. This has enabled, in recent times, the use of fast computational tools like Linear Matrix Inequalities (LMIs).
The KYP lemma is also called the positive real lemma. We now give a statement of the lemma as in [2]:
Theorem 4.1.1 Consider the system
$$\begin{aligned} \dot{x}(t) &= Ax(t) + Bu(t) \\ y(t) &= Cx(t) + Du(t) \end{aligned} \qquad (4.1)$$
with x(t) ∈ R^n and y(t), u(t) ∈ R^m. Suppose that (i) no eigenvalues of A lie in the open right half plane and all its purely imaginary eigenvalues are simple, (ii) (A, B) is controllable, (iii) (C, A) is observable, and (iv) the transfer function matrix G(s) := C(sI − A)⁻¹B + D satisfies
$$G(i\omega) + G^*(i\omega) \geq 0 \text{ for almost all } \omega \in \mathbb{R}.$$
Then, under these conditions, there exist matrices K = K^T ∈ R^{n×n}, K > 0, Q ∈ R^{m×n} and W = W^T ∈ R^{m×m} such that
$$\begin{aligned} A^T K + KA &= -Q^T Q \\ B^T K + W^T Q &= C \\ W^T W &= D + D^T \end{aligned} \qquad (4.2)$$
Note that the KYP lemma can only be formulated for systems having an equal number of inputs and outputs. In this chapter, we address some system theoretic (rather than computational) questions that arise in a natural manner from the KYP lemma. Many important results can be proved in a simple manner by invoking the lemma. In particular, it provides a simple proof of the well known passivity theorem in nonlinear systems theory (see for instance [107]). It is therefore important to investigate what lies beneath the conditions and equations that make the lemma so powerful. Scalar versions of the results presented in this chapter have been published in [62, 63, 64].
A behavioral formulation of the KYP lemma has already been given in [106], where a similar lemma is formulated for systems described by high-order differential equations. This formulation, like the classical formulation, leads to an LMI. Another well known interpretation of the KYP lemma is in terms of certain functionals associated with a dynamical system, called "storage functions". In Chapter 3 we parametrized LTI systems that are dissipative with respect to a supply function defined by a QDF. In this chapter we investigate LTI systems that, in addition to being dissipative with respect to a supply function defined by a QDF, also have positive definite storage functions. Storage functions have a deep connection with Lyapunov theory and hence they can be used to investigate the stability of equilibria of dynamical systems.

Attempts to generalize the KYP lemma were motivated by the absolute stability problem, in particular, constructing Lyapunov functions for "sector bound" nonlinearities. Thus, the so-called "Meyer-Kalman-Yakubovich" (MKY) lemma [79, 87, 89] received attention as a means of obtaining explicit state-space inequalities for systems interconnected with sector-bound nonlinearities. In this chapter we show that the KYP lemma can be generalized to a great extent in a representation free manner with the help of behavioral systems theory and QDFs. If representations are invoked, the conditions that we obtain will be in terms of frequency domain (rational function) inequalities rather than state space inequalities.
This chapter is organized as follows: in Section 4.2 we review introductory material on storage functions. We highlight the similarities and differences between storage functions on manifest variables and storage functions on states of a behavior. In Section 4.3 we build a connection between the KYP lemma and storage functions. We subsequently use this connection to generalize the KYP lemma in Section 4.4. A certain "strict version" of the KYP lemma is available in the literature; we generalize this strict version in Section 4.5.
4.2 Storage functions for dissipative systems
In Chapter 3, we obtained results about the parametrization of Φ-dissipative behaviors. Closely associated with a Φ-dissipative behavior B are generalizations of the concepts of "stored energy" and "dissipated power". A QDF Q∆ with ∆ ∈ R_s^{w×w}[ζ, η] is called a "dissipation function" associated to a Φ-dissipative behavior B if Q∆(w) ≥ 0 and
$$\int_{-\infty}^{\infty} Q_{\Phi}(w)\,dt = \int_{-\infty}^{\infty} Q_{\Delta}(w)\,dt \quad \forall w \in \mathcal{D}(\mathbb{R}, \mathbb{R}^w) \cap B \qquad (4.3)$$
Here Q∆(w) ≥ 0 means that the QDF Q∆ is point-wise non-negative on trajectories w(t). A QDF QΨ is said to be a "storage function" associated to a Φ-dissipative behavior B if (d/dt)QΨ(w) ≤ QΦ(w) ∀w ∈ B. Moreover, given a supply function QΦ, there is a one-to-one
relation between the storage and dissipation functions, given by [103]:
$$\frac{d}{dt} Q_{\Psi}(w) = Q_{\Phi}(w) - Q_{\Delta}(w) \quad \forall w \in B$$
The above equation is also known as the dissipation equality. A storage function QΨ of B with respect to QΦ is called positive semidefinite on the manifest variables of B if QΨ(w)(t) ≥ 0 ∀t, ∀w ∈ B. QΨ is called positive definite on the manifest variables of B if it is positive semidefinite on the manifest variables of B and if, in addition, QΨ(w) = 0 ⟺ w is the zero trajectory in B.
Associated with a behavior B are certain special latent variables called states. We reviewed properties of state variables and state representations of B in Section 2.8. Recall from Proposition 2.8.4 that the McMillan degree of a behavior B, n(B), is an invariant of B. We now define what we mean by storage functions that are functions of the state.
Let Φ ∈ R^{w×w} be a constant matrix. Consider a Φ-dissipative behavior B. Let x denote the states corresponding to a minimal state representation Bfull of B. If B is Φ-dissipative, Bfull is dissipative with respect to the supply function defined by diag[Φ, 0_{n(B)}]. Hence, we can define dissipativity of Bfull analogously to dissipativity of B. Let B_a^- denote the subset of B of trajectories which have state 0 at time t = −∞ and state a at time t = t0. Then ∫_{−∞}^{t0} QΦ(w(τ))dτ with w(t) ∈ B_a^- denotes the total generalized energy supplied in reaching state a from state 0 along w(t). We consider those trajectories in B_a^- for which this energy supply is the least. This is known as the "minimum required supply", since this is the least amount of energy that must be supplied to reach state a. The minimum required supply is denoted by QΨ+(a) and is defined as follows:
$$Q_{\Psi^+}(a) = \inf_{w \in B_a^-} \int_{-\infty}^{t_0} Q_{\Phi}(w(\tau))\,d\tau \qquad (4.4)$$
We can evaluate QΨ+(x) for every state x(t0) ∈ R^{n(B)}, t0 ∈ R. It can be easily verified that QΨ+ is a storage function of B with respect to QΦ, since
$$\frac{d}{dt} Q_{\Psi^+}(x) \leq Q_{\Phi}(w) \quad \forall (w, x) \in B_{full}$$
Now define the set B_a^+ of trajectories w(t) ∈ B which have state a at time t0 and state 0 at time t = ∞. The quantity −∫_{t0}^{∞} QΦ(w(τ))dτ, with w(t) ∈ B_a^+, denotes the generalized energy extracted while reaching state 0 from state a along w(t). We can look at trajectories for which this energy extraction is the maximum. This quantity is known as the "available storage", since it is the maximum energy that can be extracted from state a while reaching the state 0. The available storage is denoted by QΨ−(a) and is defined as follows:
$$Q_{\Psi^-}(a) = \sup_{w \in B_a^+} \left(-\int_{t_0}^{\infty} Q_{\Phi}(w(\tau))\,d\tau\right)$$
The available storage QΨ− corresponding to every state x(t) ∈ R^{n(B)} can also be shown to be a storage function of B with respect to QΦ.

For a more detailed discussion of required supply and available storage, see [98]. In [98], Theorem 3, page 331, it is shown that the set of all possible storage functions is bounded and forms a convex set. Any other storage function QΨ satisfies the inequality
$$Q_{\Psi^-}(x) \leq Q_{\Psi}(x) \leq Q_{\Psi^+}(x) \quad \forall x \text{ such that } (w, x) \in B_{full} \qquad (4.5)$$
Hence, QΨ− (available storage) and QΨ+ (minimum required supply) can be thought of as the "minimum" and the "maximum" storage functions, respectively, of B with respect to QΦ. Note that the representation of QΨ(x) depends on the choice of states x. A storage function on states can be rewritten as x^T K x with x ∈ R^{n(B)}, K = K^T ∈ R^{n(B)×n(B)}. QΨ(x) is called a positive definite state function of B if x ∈ R^{n(B)} represents a minimal set of states of B and QΨ(x) > 0 for all x ≠ 0. If QΨ(x) is a positive definite state function of B, it can be represented by a symmetric positive definite matrix.
Using a state map, storage functions on states can be defined as functions of the manifest variables w(t). Let X(d/dt) be a minimal state map for B:
$$x = X(\tfrac{d}{dt})w$$
Then, Ψ(ζ, η) = X^T(ζ)KX(η) is a storage function on the manifest variables of B if K represents a storage function on the states of B.
Note that the discussion of required supply and available storage was under the tacit assumption that the supply function is defined by a constant matrix. When the supply function is defined by a polynomial matrix, Willems and Trentelman showed in [95] that storage functions for a behavior are state functions of an associated behavior. However, if one considers storage functions on manifest variables rather than states, one can still define "maximum" and "minimum" storage functions. It is shown in [103] that for any Φ-dissipative behavior B, there exist storage functions QΨ− and QΨ+ on manifest variables such that every other storage function QΨ for B satisfies:
$$Q_{\Psi^-}(w) \leq Q_{\Psi}(w) \leq Q_{\Psi^+}(w) \quad \forall w \in B$$
A procedure has been given in [103] to compute these storage functions. We summarize the procedure below.
1. Assume ∫_{−∞}^{∞} QΦ′(ℓ) ≥ 0 for all compactly supported ℓ. Then, Φ′(−iω, iω) ≥ 0 for all ω ∈ R (Theorem 3.2.3).

2. Using a non-trivial result called polynomial spectral factorization ([17, 38] and Chapter 6 of this thesis), it can be shown that if Φ′(−iω, iω) ≥ 0 then Φ′(−iω, iω) = A^T(−iω)A(iω) = H^T(−iω)H(iω). Here, A(ξ) is a square matrix having all its singularities in the closed right half plane ("A" for anti-Hurwitz) and H(ξ) is a square matrix having all its singularities in the closed left half plane ("H" for Hurwitz).

3. Since Φ′(−iω, iω) − A^T(−iω)A(iω) = 0, define
$$\Psi'_+(\zeta, \eta) = \frac{\Phi'(\zeta, \eta) - A^T(\zeta)A(\eta)}{\zeta + \eta}$$
Then, QΨ′+ defines the "maximum" storage function.

4. Since Φ′(−iω, iω) − H^T(−iω)H(iω) = 0, define
$$\Psi'_-(\zeta, \eta) = \frac{\Phi'(\zeta, \eta) - H^T(\zeta)H(\eta)}{\zeta + \eta}$$
Then, QΨ′− defines the "minimum" storage function.
Notice that in this recipe, we have considered positivity of QΦ′ on all trajectories in the ambient space (for example C∞). In practice, this procedure will yield storage functions on latent variables, which can then be converted into storage functions on manifest variables provided the latent variables are observable from the manifest variables. Let us consider an example which demonstrates the essential ideas:
Example 4.2.1 Consider the simple RC circuit which we examined in Chapter 2, Example 2.2.1. Assume for the sake of simplicity that R1 = R2 = C = 1. Then, the behavior B of the port voltage V and the port current I is given by
$$\begin{bmatrix} V \\ I \end{bmatrix} = \begin{bmatrix} \frac{d}{dt} + 2 \\ \frac{d}{dt} + 1 \end{bmatrix} \ell$$
Define a supply function QJ(V, I) where
$$J = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix}.$$
Define J′(ζ, η) as
$$J'(\zeta, \eta) = \begin{bmatrix} \zeta + 2 & \zeta + 1 \end{bmatrix} \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix} \begin{bmatrix} \eta + 2 \\ \eta + 1 \end{bmatrix} = \frac{1}{2}(\zeta\eta + 3(\zeta + \eta) + 4)$$
Notice that J′(−iω, iω) = ω²/2 + 2 > 0 for all ω ∈ R. Therefore B is J-dissipative. We compute Hurwitz and anti-Hurwitz spectral factorizations of J′(−ξ, ξ):
$$J'(-i\omega, i\omega) = \frac{(2 + i\omega)}{\sqrt{2}} \cdot \frac{(2 - i\omega)}{\sqrt{2}}$$
Define H(ξ) = (ξ + 2)/√2 and A(ξ) = (ξ − 2)/√2. Define
$$\Psi'_+(\zeta, \eta) = \frac{J'(\zeta, \eta) - A(\zeta)A(\eta)}{\zeta + \eta} = 5/2, \qquad \Psi'_-(\zeta, \eta) = \frac{J'(\zeta, \eta) - H(\zeta)H(\eta)}{\zeta + \eta} = 1/2$$
Then, QΨ′+(ℓ) = 5ℓ²/2 and QΨ′−(ℓ) = ℓ²/2. Clearly, QΨ′+(ℓ) > QΨ′−(ℓ), as was expected. Using observability of ℓ from (V, I) one can substitute
$$\ell = \begin{bmatrix} 1 & -1 \end{bmatrix} \begin{bmatrix} V \\ I \end{bmatrix}$$
to write QΨ′+, QΨ′− in terms of (V, I). Define
$$\Psi_+ = \frac{5}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}; \qquad \Psi_- = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$$
QΨ+ and QΨ− are storage functions on (V, I).
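The storage-function computations of Example 4.2.1 follow the four-step recipe above and are easy to verify with a computer algebra system. The sketch below is our check, not part of the original text, and takes the expression for J′(ζ, η) as given in the example.

```python
# sympy check of Example 4.2.1: spectral factors and the storage functions.
import sympy as sp

z, e, w = sp.symbols('zeta eta omega', real=True)
Jp = sp.Rational(1, 2) * (z*e + 3*(z + e) + 4)       # J'(zeta, eta) as in the text
H = lambda s: (s + 2) / sp.sqrt(2)                   # Hurwitz factor
A = lambda s: (s - 2) / sp.sqrt(2)                   # anti-Hurwitz factor

# Spectral factorization on the imaginary axis: J'(-i w, i w) = H(-i w) H(i w)
assert sp.simplify(Jp.subs({z: -sp.I*w, e: sp.I*w}) - H(-sp.I*w)*H(sp.I*w)) == 0

Psi_plus  = sp.cancel((Jp - A(z)*A(e)) / (z + e))    # maximum storage function
Psi_minus = sp.cancel((Jp - H(z)*H(e)) / (z + e))    # minimum storage function
print(Psi_plus, Psi_minus)                            # -> 5/2 1/2
```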
4.3 Classical KYP lemma in terms of storage functions
The connection between the KYP lemma and storage functions is well known in the literature (see [90] for instance). Note that u, y satisfying equations (4.1) are R^m-valued. We consider the 2m × 2m symmetric matrix J:
$$J = \frac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix}$$
Then, QJ(u, y) = u^T y. Note that since G(iω) + G*(iω) ≥ 0 in Theorem 4.1.1, the u and y that satisfy equations (4.1) define a behavior B that is J-dissipative. Suppose a state space representation for B satisfies the conditions that (A, B) is controllable and (C, A) is observable. Then, (A, B, C, D) is a minimal state representation of B and the states x are observable from the manifest variables (u, y). Then:

Corollary 4.3.1 The n × n symmetric matrix (1/2)K in equations (4.2) defines a storage function which is a positive definite state function for the system (4.1), since K > 0 and
$$\frac{1}{2}\frac{d}{dt}(x^T K x) \leq u^T y$$
for all x, y, u such that the equations (4.1) are satisfied.
Proof: Using the equations in the KYP lemma, it follows immediately that
$$\frac{d}{dt}(x^T K x) = -(Qx)^T(Qx) - (Wu)^T(Qx) - (Qx)^T(Wu) + u^T C x + x^T C^T u \qquad (4.6)$$
and from the state space equations it follows that
$$2u^T y = u^T y + y^T u = u^T C x + x^T C^T u + (Wu)^T(Wu)$$
Therefore
$$\frac{d}{dt}(x^T K x) - 2u^T y = -(Qx + Wu)^T(Qx + Wu)$$
Since (Qx + Wu)^T(Qx + Wu) ≥ 0, it follows that
$$\frac{d}{dt}\left(x^T \tfrac{1}{2}K x\right) - u^T y \leq 0$$
Hence (1/2)x^T K x is a storage function.
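As a concrete numeric illustration of Corollary 4.3.1, consider the hypothetical first-order system A = −1, B = 1, C = 1, D = 1 (so G(s) = 1/(s + 1) + 1 is positive real); this example and the values of K, Q, W are ours, computed by hand, and are not taken from the thesis.

```python
# Numeric check of the KYP equations (4.2) and the storage inequality
# (1/2) d/dt (x^T K x) <= u^T y for a first-order SISO example.
import numpy as np

A, B, C, D = -1.0, 1.0, 1.0, 1.0
W = np.sqrt(2.0)                 # W^T W = D + D^T = 2
Q = 2.0 - np.sqrt(2.0)           # solves Q^2/2 + sqrt(2) Q - 1 = 0
K = Q**2 / 2.0                   # = 3 - 2 sqrt(2) > 0, from A^T K + K A = -Q^T Q

assert np.isclose(A*K + K*A, -Q*Q)
assert np.isclose(B*K + W*Q, C)
assert np.isclose(W*W, D + D)

# Pointwise check of the dissipation inequality for random (x, u) pairs:
# d/dt (x^T K x) - 2 u^T y = -(Q x + W u)^2 <= 0 for every x and u.
rng = np.random.default_rng(0)
for _ in range(1000):
    x, u = rng.standard_normal(2)
    y = C*x + D*u
    xdot = A*x + B*u
    assert 2*K*x*xdot - 2*u*y <= 1e-9
```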
Note that G(s) in Theorem 4.1.1 is a positive real (PR) matrix; see [3] for a detailed treatment and applications of positive real matrices. Thus the classical KYP lemma can be interpreted as follows: every storage function of the J-dissipative behavior B is a positive definite state function if and only if G(s) is positive real.
Remark 4.3.2 A variant of Theorem 4.1.1 is available in the literature that relates strict positive real (SPR) matrices to storage functions; see for instance [107], page 223: G(s) is SPR if and only if for any minimal state representation (A, B, C, D) of G(s), A is a Hurwitz matrix and there exist matrices K, Q and W of appropriate dimensions, and an ε > 0, such that A^T K + KA = −εK − Q^T Q, B^T K + W^T Q = C, and W^T W = D + D^T. A result similar to Corollary 4.3.1 can also be proved for this variant. However, note that Corollary 4.3.1 implies that there may be certain non-zero state trajectories along which the rate of change of storage exactly equals the supply. The ε in the above-mentioned variant of the KYP lemma ensures that the rate of change of storage along every nonzero state trajectory is strictly less than the supply. This distinction between the PR and SPR versions of the lemma is of crucial importance, especially in stability analysis.
4.4 Generalization of KYP lemma
The behavioral formulation of the classical KYP lemma lends itself to generalizations in several directions. The classical KYP lemma considers passive dynamical systems. These dynamical systems are dissipative with respect to a quadratic functional of the form u^T y, where u(t) and y(t) are obtained from certain permutations of the manifest variables. Note that such a supply function forces u(t) and y(t) to have the same dimension. While this is perfectly logical for passive systems, it need not always be the case. Therefore, we work with a more "general" supply function than QJ. Consider
$$J_{mn} = \begin{bmatrix} I_m & 0 \\ 0 & -I_n \end{bmatrix}$$
and the associated QDF QJmn. Clearly, behaviors dissipative with respect to the supply function defined by QJmn need not have an equal number of inputs and outputs. Moreover, dissipativity and the existence of storage functions are fundamental to the system and do not depend on the cardinality of the system variables.

The question of when Jmn-dissipative behaviors with input cardinality m (i.e. the number of +1s in Jmn, which is the maximum possible input cardinality for a Jmn-dissipative behavior, Lemma 3.5.1) have positive definite storage functions on states has been addressed in [103], Theorem 6.4, page 1726. We reproduce that result below:
Theorem 4.4.1 Let Jmn = diag[Im, −In]. Consider a Jmn-dissipative behavior B defined by an observable image representation
$$\mathrm{Im}\begin{bmatrix} R(\frac{d}{dt}) \\ S(\frac{d}{dt}) \end{bmatrix}$$
with R(ξ) ∈ R^{m×m}[ξ], S(ξ) ∈ R^{n×m}[ξ]. Then, the following are equivalent:

1. There exists a positive definite storage function on the states of B.
2. R(ξ) is a Hurwitz matrix, i.e. every singularity of R(ξ) lies in the open left half complex plane. In particular, det R(ξ) ≠ 0.
3. Every storage function of B with respect to QJmn is positive definite on the states of B.
4. Every storage function on the manifest variables of B is positive semidefinite.
Theorem 4.4.1 gives an elegant characterization of all dissipative behaviors that have positive definite storage functions on states. Thus, a behavior B defined as in Theorem 4.4.1 by Im[R(d/dt); S(d/dt)] has positive definite storage functions on states with respect to QJmn if and only if ||S(ξ)R⁻¹(ξ)||_{H∞} ≤ 1, or in other words, the rational matrix S(ξ)R⁻¹(ξ) is bounded real [3].
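For instance, with the scalar choices R(ξ) = ξ + 2 and S(ξ) = ξ + 1 (our illustration, anticipating the RC circuit of Example 4.5.4, not an example from this section), the bounded-real condition can be checked numerically:

```python
# R is Hurwitz (root at -2) and the gain |S(i w)/R(i w)| stays below 1,
# so S R^{-1} is bounded real and Theorem 4.4.1 applies.
import numpy as np

R = np.poly1d([1.0, 2.0])                      # xi + 2
S = np.poly1d([1.0, 1.0])                      # xi + 1
assert all(np.real(r) < 0 for r in R.roots)    # Hurwitz test

omega = np.linspace(-1e3, 1e3, 200001)
gain = np.abs(S(1j*omega)) / np.abs(R(1j*omega))
assert gain.max() < 1.0                        # H-infinity norm at most 1
print(gain.max())
```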
Consider a Jmn-dissipative behavior B := {w | w = M(d/dt)ℓ}. Let X(d/dt) be a minimal state map for B defining states x. Assume B has a positive definite storage function on states defined by a matrix K = K^T ∈ R^{n(B)×n(B)}. Then, K > 0 and
$$\frac{d}{dt}\, x^T K x \leq Q_{J_{mn}}(w)$$
for all (w, x) ∈ Bfull, the full behavior. Using the state map X(d/dt) and the observable image representation M(d/dt), a storage function on the states of B can be "converted" into a storage function on the manifest variables of B. Define Ψ(ζ, η) as
$$\Psi(\zeta, \eta) = X^T(\zeta) K X(\eta)$$
Then, QΨ is a storage function on the manifest variables w of B. Clearly, since K > 0, QΨ(w) ≥ 0 for all w ∈ B. We ask the following question: when is a storage function positive definite on the manifest variables of a Jmn-dissipative behavior? The answer to this question is rather intuitive. If some part of the behavior is "memoryless", i.e. its evolution is not governed by any past history, there cannot be any storage along such trajectories. See Definition 2.8.7 for the memoryless part of a behavior. The memoryless part of a behavior plays an important role in obtaining storage functions on manifest variables, as the following proposition shows:
Proposition 4.4.2 Let B be defined by an observable image representation Im[R(d/dt); S(d/dt)] with R(ξ) ∈ R^{m×m}[ξ], S(ξ) ∈ R^{n×m}[ξ] and m + n := w. Let Jmn = diag[Im, −In]. Then, B is Jmn-dissipative and every storage function on the manifest variables of B is positive definite if and only if the following conditions hold:

1. ||S(ξ)R⁻¹(ξ)||_{H∞} ≤ 1, i.e. R(ξ) is a Hurwitz matrix and
$$R^{-T}(-i\omega) S^T(-i\omega) S(i\omega) R^{-1}(i\omega) \leq I_m, \quad \omega \in \mathbb{R}$$
2. B has no non-trivial memoryless part, i.e., if X(d/dt) is a minimal state map for B then Ker X(d/dt) ∩ B = {0}.
Proof: From Theorem 4.4.1 it follows that every storage function of B is positive definite on states, and positive semidefinite on manifest variables, if and only if condition (1) holds. If K = K^T ∈ R^{n(B)×n(B)} defines a storage function on states, QΨ defined by Ψ(ζ, η) = X^T(ζ)KX(η) is a storage function on manifest variables. QΨ is positive definite if and only if, in addition, Ker X(d/dt) ∩ B = {0}.
In keeping with the behavioral philosophy, we try, as far as possible, not to invoke a state representation. This has the advantage that all our results are representation free. Hence, we work exclusively with manifest variables, which is more natural from the behavioral viewpoint.

The role of the QDF defining the supply function needs a special mention. The supply function contributes dynamics to the problem, in addition to the dynamics of the behavior. As a result, a storage function could depend on fewer states than the McMillan degree (the phenomenon of "degree drop") and yet be positive definite on manifest variables. However, such phenomena occur only when the supply function QΦ is defined by a polynomial matrix Φ(ζ, η). See Example 4.6.9 for a demonstration of this phenomenon. In the remaining sections of the chapter, we address the problem of positive definite storage functions for behaviors dissipative with respect to a QDF induced by a polynomial matrix Φ(ζ, η).
4.4.1 Generalization with respect to QDFs
Consider a matrix Φ(ζ, η) ∈ R_s^{w×w}[ζ, η]. The matrix obtained by substituting ζ = −iω0, η = iω0, i.e. Φ(−iω0, iω0), is a w × w Hermitian matrix for any ω0 ∈ R. Therefore it has w real eigenvalues. It was shown in Chapter 3 that when Φ(−iω, iω) is nonsingular and has constant inertia for almost all ω ∈ R, one can parametrize the set of Φ-dissipative behaviors. In this chapter, we go a step further: we parametrize Φ-dissipative behaviors with positive definite storage functions on manifest variables.

Let us now consider matrices Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] such that
$$\Phi(\zeta, \eta) = K^T(\zeta) J_{mn} K(\eta) \qquad (4.7)$$
with K(ξ), Jmn square and nonsingular. Clearly, if Φ(ζ, η) can be written in the form (4.7) then Φ(−iω, iω) has constant inertia for almost all ω ∈ R. However, note that the converse does not necessarily hold, i.e. a two-variable polynomial matrix Φ(ζ, η) such that Φ(−iω, iω) has constant inertia for almost all ω ∈ R need not admit a factorization as in (4.7). We now obtain necessary and sufficient conditions for a given Φ(ζ, η) to be expressible in the form (4.7).
From equation (1.4), every matrix Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] can be written as
$$\Phi(\zeta, \eta) = \begin{bmatrix} I & I\zeta & \ldots & I\zeta^k \end{bmatrix} \begin{bmatrix} \Phi_{00} & \Phi_{01} & \ldots & \Phi_{0k} \\ \Phi_{10} & \ldots & \ldots & \vdots \\ \Phi_{k0} & \ldots & \ldots & \Phi_{kk} \end{bmatrix} \begin{bmatrix} I \\ I\eta \\ \vdots \\ I\eta^k \end{bmatrix} \qquad (4.8)$$
where Φij = Φji^T denote constant w × w matrices, I denotes the w × w identity matrix, and k denotes the maximum degree of ζ (and hence also of η) that occurs in Φ(ζ, η). As in equation (1.3), we will denote the w(k+1) × w(k+1) symmetric matrix obtained as [Φij]_{i,j=0}^{k} by Φ̃ and call it the coefficient matrix of Φ(ζ, η). Note that Φ̃ has all real eigenvalues. Recall that the inertia of Φ̃, σ(Φ̃), has been defined in Definition 1.2.2 as the non-negative integer three-tuple (σ+, σ−, σ0) with σ+, σ−, σ0 denoting, respectively, the numbers of positive, negative and zero eigenvalues of Φ̃. Then:
Theorem 4.4.3 Consider a nonsingular polynomial matrix Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] (i.e. det Φ(ζ, η) ≠ 0 ∈ R[ζ, η]). Let Φ̃ ∈ R^{w(k+1)×w(k+1)} be the coefficient matrix of Φ(ζ, η). Φ(ζ, η) admits a factorization of the form K^T(ζ)JmnK(η) with Jmn = diag[Im, −In], w = m + n, K(ξ) ∈ R^{w×w}[ξ] and nonsingular if and only if σ(Φ̃) is (m, n, w(k+1) − m − n).
Proof: If σ(Φ̃) = (m, n, w(k+1) − m − n), Φ̃ can be factorized as Φ̃ = R^T Jmn R with R ∈ R^{w×w(k+1)} of full row rank. Define K(ξ) as follows:
$$K(\xi) = R \begin{bmatrix} I \\ I\xi \\ \vdots \\ I\xi^k \end{bmatrix} \qquad (4.9)$$
Then, clearly, Φ(ζ, η) = K^T(ζ)JmnK(η). Since (−1)^n det K^T(ζ) det K(η) = det Φ(ζ, η) ≠ 0, it follows that det K(ξ) ≠ 0.

Conversely, suppose Φ(ζ, η) = K^T(ζ)JmnK(η) with K(ξ) nonsingular. Then, K(ξ) can be rewritten as K(ξ) = Σ_{i=0}^{k} Ki ξ^i with Ki constant matrices for i = 0, . . . , k. Define
$$R = \begin{bmatrix} K_0 & K_1 & \ldots & K_k \end{bmatrix} \qquad (4.10)$$
Since K(ξ) is a nonsingular polynomial matrix, K(λ) is nonsingular for all but finitely many λ ∈ C. Hence, R has full row rank. Therefore, the rank of the matrix R^T Jmn R is precisely w. Also, clearly, this matrix has m positive and n negative eigenvalues. Hence, Φ̃ ∈ R^{w(k+1)×w(k+1)}, defined by R^T Jmn R, has inertia (m, n, w(k+1) − m − n).
We now demonstrate Theorem 4.4.3 with the help of an example:

Example 4.4.4 Let
$$\Phi_1(\zeta, \eta) = \begin{bmatrix} 1 & 1+\eta \\ 1+\zeta & 0 \end{bmatrix} \quad \text{and} \quad \Phi_2(\zeta, \eta) = \begin{bmatrix} 1 & 1-\zeta \\ 1-\eta & 0 \end{bmatrix}.$$
Notice that Φ1(−iω, iω) = Φ2(−iω, iω). We see that
$$\tilde{\Phi}_1 = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}; \qquad \tilde{\Phi}_2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
Notice that Φ̃1 has one positive and one negative eigenvalue (2 and −1 respectively). Hence, Φ1(ζ, η) can be factorized as
$$\Phi_1(\zeta, \eta) = \underbrace{\begin{bmatrix} 1 & 0 \\ 1+\zeta & 1+\zeta \end{bmatrix}}_{K^T(\zeta)} \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}}_{J_{1\,1}} \underbrace{\begin{bmatrix} 1 & 1+\eta \\ 0 & 1+\eta \end{bmatrix}}_{K(\eta)}$$
Now consider Φ̃2. The (nonzero) eigenvalues of Φ̃2 are found to be (−1.2469796, 0.4450419, 1.8019377). Since Φ̃2 has rank greater than 2, it cannot be diagonalized as R^T J_{1 1} R and consequently, Φ2(ζ, η) cannot be written as K^T(ζ)J_{1 1}K(η). We come to the same conclusion from Theorem 4.4.3.
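The inertia test of Theorem 4.4.3 is easy to mechanize. The sketch below is our code, not from the thesis (in particular the helper coefficient_matrix is ours), and reproduces the conclusions of Example 4.4.4.

```python
# Build the coefficient matrix of a two-variable polynomial matrix (k = 1,
# w = 2) and read off its inertia numerically.
import numpy as np
import sympy as sp

z, e = sp.symbols('zeta eta')

def coefficient_matrix(Phi, k=1, w=2):
    """Stack the constant blocks Phi_ij, where Phi(z, e) = sum Phi_ij z^i e^j."""
    tilde = sp.zeros(w*(k + 1), w*(k + 1))
    for i in range(k + 1):
        for j in range(k + 1):
            for r in range(w):
                for c in range(w):
                    entry = sp.expand(Phi[r, c])
                    tilde[w*i + r, w*j + c] = entry.coeff(z, i).coeff(e, j)
    return np.array(tilde, dtype=float)

Phi1 = sp.Matrix([[1, 1 + e], [1 + z, 0]])
Phi2 = sp.Matrix([[1, 1 - z], [1 - e, 0]])

for name, Phi in (("Phi1", Phi1), ("Phi2", Phi2)):
    eigs = np.linalg.eigvalsh(coefficient_matrix(Phi))
    sig = (int((eigs > 1e-9).sum()), int((eigs < -1e-9).sum()))
    print(name, np.round(eigs, 7), "inertia (+,-):", sig)
# Phi1 gives (1, 1): factorizable as K^T(zeta) J_11 K(eta).
# Phi2 gives (2, 1): no such factorization exists, by Theorem 4.4.3.
```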
If a two-variable polynomial matrix Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] can be written as K^T(ζ)JmnK(η) with K(ξ), Jmn square and nonsingular, K(d/dt) maps any Φ-dissipative behavior into a Jmn-dissipative behavior, as shown in Chapter 3. However, in this case there is also a simple relationship between storage functions of a Φ-dissipative behavior and those of the corresponding Jmn-dissipative behavior:
Proposition 4.4.5 Consider Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] with Φ(ζ, η) = K^T(ζ)JmnK(η). Let B be a Φ-dissipative behavior and let BJmn be the corresponding Jmn-dissipative behavior defined as K(d/dt)(B) := {v(t) | v(t) = K(d/dt)w(t), w(t) ∈ B}. Let QΨJmn be a storage function for BJmn with respect to QJmn. Then, Ψ(ζ, η) = K^T(ζ)ΨJmn(ζ, η)K(η) defines the QDF QΨ, which is a storage function for B with respect to QΦ.

Proof: Since QΨJmn is a storage function on the manifest variables v of the Jmn-dissipative behavior BJmn:
$$\frac{d}{dt} Q_{\Psi_{J_{mn}}}(v) \leq Q_{J_{mn}}(v) \quad \forall v \in B_{J_{mn}}$$
Substituting v = K(d/dt)w, we see that
$$\frac{d}{dt} Q_{\Psi_{J_{mn}}}\!\left(K(\tfrac{d}{dt})w\right) \leq Q_{J_{mn}}\!\left(K(\tfrac{d}{dt})w\right) \quad \forall w \in B$$
Further, QΨJmn(K(d/dt)w) = QΨ(w) with Ψ(ζ, η) = K^T(ζ)ΨJmn(ζ, η)K(η), and QJmn(K(d/dt)w) = QΦ(w) since Φ(ζ, η) = K^T(ζ)JmnK(η). Hence,
$$\frac{d}{dt} Q_{\Psi}(w) \leq Q_{\Phi}(w) \quad \forall w \in B$$
which shows that QΨ is a storage function for B with respect to QΦ.
We now obtain a characterization of all Φ-dissipative behaviors that have positive definite storage functions on manifest variables. Assume that Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] can be written as K^T(ζ)JmnK(η) with K(ξ), Jmn square and nonsingular, and Jmn = diag[Im, −In]. Let B be a Φ-dissipative behavior with manifest variables w, defined by an observable image representation:
$$w = \begin{bmatrix} Q(\frac{d}{dt}) \\ P(\frac{d}{dt}) \end{bmatrix} \ell$$
with Q(ξ) ∈ R^{m×m}[ξ] and P(ξ) ∈ R^{n×m}[ξ]. Define
$$\begin{bmatrix} R(\xi) \\ S(\xi) \end{bmatrix} = K(\xi) \begin{bmatrix} Q(\xi) \\ P(\xi) \end{bmatrix} \qquad (4.11)$$
with R(ξ) ∈ R^{m×m}[ξ] and S(ξ) ∈ R^{n×m}[ξ]. Define BJmn = Im[R(d/dt); S(d/dt)]. Then, by Theorem 3.5.3, BJmn is Jmn-dissipative if and only if B is Φ-dissipative. Using an observable image representation of B and an image representation of BJmn we now obtain a characterization of all Φ-dissipative behaviors having positive definite storage functions on manifest variables:
"
T
Theorem 4.4.6 Consider Φ(ζ, η) ∈ Rw×w
s [ζ, η] such that Φ(ζ, η) = K (ζ)Jmn K(η) with Jmn =
diag[Im , −In ], w = m + n and K(ξ) ∈ Rw×w [ξ] nonsingular.
Let
"
# B be a Φ-dissipative behavior
d
Q( dt )
defined by an observable image representation w =
` with Q(ξ) ∈ Rm×m [ξ]. Let BJmn
P ( dtd )
#
"
R( dtd )
` with R(ξ), S(ξ) as
be the corresponding Jmn -dissipative behavior defined by v =
S( dtd )
given in equation (4.11). Then, every storage function of B with respect to QΦ is positive
definite on manifest variables of B if and only if the following conditions hold:
66
4 KYP lemma and its extensions
1. The matrices R(ξ), S(ξ) are right coprime, i.e. if R(ξ) = R1 (ξ)U (ξ) and S(ξ) = S1 (ξ)U (ξ)
then U (ξ) is unimodular.
2. The behavior BJmn has no non-trivial memoryless part.
3. The matrix R(ξ) is Hurwitz, i.e. every singularity of R(ξ) lies in the open left half complex
plane.
Proof: Assume that BJmn has no memoryless part, R(ξ) is Hurwitz and R(ξ), S(ξ) are right coprime. Under these conditions, BJmn is defined by the observable image representation [R(d/dt); S(d/dt)]ℓ. Notice that because Φ(ζ, η) = K^T(ζ)JmnK(η), every storage function of B with respect to QΦ, defined on the latent variables ℓ, is also a storage function for BJmn with respect to QJmn, defined on the latent variables ℓ. Every storage function on ℓ can be expressed in terms of the manifest variables of BJmn, since by assumption BJmn is defined by an observable image representation. If v ∈ BJmn then there exists w ∈ B such that v = K(d/dt)w. Substituting the manifest variables of BJmn with K(d/dt)w, a storage function on the manifest variables of BJmn can be defined in terms of the manifest variables of B. Hence, every storage function QΨ of B with respect to QΦ can be written with Ψ(ζ, η) = K^T(ζ)ΨJmn(ζ, η)K(η), where QΨJmn is a storage function for BJmn with respect to QJmn. Since BJmn is Jmn-dissipative, has no memoryless part and R(ξ) is Hurwitz, it follows from Theorem 4.4.1 and Proposition 4.4.2 that every storage function of BJmn is positive definite on the manifest variables of BJmn. We now prove that because R(ξ), S(ξ) are right coprime, every storage function on the manifest variables of B is also positive definite: notice that QΨ(w) = 0 if and only if w ∈ Ker K(d/dt) ∩ B. Since by assumption R(ξ), S(ξ) are right coprime, Ker K(d/dt) ∩ B = {0}. Hence, every storage function on the manifest variables of B is positive definite.

Conversely, assume that every storage function on the manifest variables of B is positive definite. One arrives at easy contradictions if any one of the three conditions listed in the theorem fails:

1. Assume that R(ξ), S(ξ) are not right coprime. Then, there exists an ℓ0 ∈ C∞(R, R^m) which is not observable from BJmn. One can easily construct a storage function for BJmn (expressed in terms of the latent variables ℓ) which is zero along ℓ0. We can convert this storage function into a storage function on the manifest variables of B, since B is given by an observable image representation. Note that because of observability, the image of ℓ0 yields non-zero trajectories in B. Hence, there exists a storage function that is zero along trajectories in B that are obtained as the image of ℓ0.

2. When BJmn has a non-trivial memoryless part, a storage function QΨJmn for BJmn cannot be positive definite on the manifest variables of BJmn, and consequently, because K(ξ) is nonsingular, QΨ cannot be positive definite on the manifest variables of B.

3. If R(ξ) is not Hurwitz, a storage function QΨJmn for BJmn is not positive definite on the manifest variables of BJmn (Proposition 4.4.2). If QΨJmn is not positive definite then, because K(ξ) is nonsingular, QΨ cannot be positive definite.
[Figure 4.1: Mapping among storage functions. A storage function QΨℓ on the latent variables ℓ serves both B and BJmn; via the observable image representations and the map K(d/dt), it is converted into storage functions on the manifest variables of BJmn and, through K^T(ζ)ΨK(η), of B.]
The essential ideas in the proof of Theorem 4.4.6 are illustrated in Figure 4.1. Consider Φ(ζ, η) = K^T(ζ)JmnK(η) and a Φ-dissipative behavior B. The associated Jmn-dissipative behavior BJmn is defined as K(d/dt)(B). QΨℓ is a storage function (on latent variables) for B with respect to QΦ, and also a storage function for BJmn with respect to QJmn. Using the observability of the image representation of B, QΨℓ can be expressed in terms of the manifest variables of B. Using the observability of the image representation of BJmn, QΨℓ can be expressed in terms of the manifest variables of BJmn, which in turn can be expressed in terms of the manifest variables of B using the map K(d/dt).

Given a Φ-dissipative behavior B, there may, in general, exist trajectories in B along which the dissipation is zero. All such trajectories are Φ-lossless, i.e. along these trajectories the rate of change of the storage function is exactly equal to the supply. Such a situation is undesirable in some contexts, especially in the construction of Lyapunov functions. Hence, we would like a characterization of behaviors such that along every nonzero trajectory the rate of change of storage is strictly less than the supply; in other words, behaviors in which no nonzero trajectories are lossless. We show that such a "strict" problem corresponds to a certain strict version of the KYP lemma.
4.5 Strict versions of the KYP lemma
We have summarized the strict version of the KYP lemma in Remark 4.3.2. Recall that Jmn = Jmn^T ∈ R^{w×w} has been defined as the inertia matrix diag[Im, −In]. We address the "strict" version of the KYP lemma by first defining the matrix
$$J_{mn}^{\epsilon} = J_{mn} - \epsilon I_w, \quad \epsilon \in (0, 1) \qquad (4.12)$$
The supply function Q_{Jmn^ε} exhibits some interesting properties:

Lemma 4.5.1 Consider the matrix Jmn^ε in (4.12). Let B be a Jmn^ε-dissipative behavior with manifest variables w. Then,

1. B is Jmn-dissipative.
2. If QΨ is any storage function for B with respect to Q_{Jmn^ε}, then (d/dt)QΨ(w) < Q_{Jmn}(w) ∀w ∈ B − {0}.

Proof: Note that Q_{Jmn^ε}(w) = Q_{Jmn}(w) − εw^T w. Therefore, Q_{Jmn^ε}(w) < Q_{Jmn}(w) for all nonzero w ∈ C∞(R, R^w), and in particular along all nonzero w ∈ B. If QΨ is any storage function for B with respect to Q_{Jmn^ε} then
$$\frac{d}{dt} Q_{\Psi}(w) \leq Q_{J_{mn}^{\epsilon}}(w) < Q_{J_{mn}}(w) \quad \forall w \in B - \{0\}$$
Hence, B is Jmn-dissipative and (d/dt)QΨ(w) < Q_{Jmn}(w) along all nonzero w ∈ B.
Analogous to the generalization of the KYP lemma in Theorem 4.4.6, we now propose a generalization of the strict version of the KYP lemma:

Theorem 4.5.2 Consider Φ(ζ, η) ∈ R_s^{w×w}[ζ, η] such that Φ(ζ, η) = K^T(ζ)JmnK(η) with Jmn = diag[Im, −In], w = m + n and K(ξ) ∈ R^{w×w}[ξ] nonsingular. Let B be a Φ-dissipative behavior defined by an observable image representation w = [Q(d/dt); P(d/dt)]ℓ with Q(ξ) ∈ R^{m×m}[ξ]. Let BJmn be the corresponding Jmn-dissipative behavior defined by v = [R(d/dt); S(d/dt)]ℓ with R(ξ), S(ξ) as given in equation (4.11). Then, every storage function of B with respect to QΦ is positive definite on the manifest variables of B, and the rate of change of the storage function is strictly less than QΦ along every nonzero manifest variable trajectory in B, if the following conditions hold:

1. BJmn is Jmn^ε-dissipative for some ε ∈ (0, 1).
2. The matrices R(ξ), S(ξ) are right coprime, i.e. if R(ξ) = R1(ξ)U(ξ) and S(ξ) = S1(ξ)U(ξ) then U(ξ) is unimodular.
3. The behavior BJmn has no non-trivial memoryless part.
4. The matrix R(ξ) is Hurwitz, i.e. every singularity of R(ξ) lies in the open left half complex plane.
Proof: Since the proof goes along similar lines to the proof of Theorem 4.4.6, we only give a brief sketch. If B is Φ-dissipative and $B_{J_{mn}}$ is $J_{mn}^\epsilon$-dissipative for some ε ∈ (0, 1), then the rate of change of every storage function on the manifest variables of B is strictly less than $Q_\Phi$. The remaining three conditions are the same as those in Theorem 4.4.6, and relate to the existence of positive definite storage functions on the manifest variables of B.
Remark 4.5.3 Note that the converse of Theorem 4.5.2, unlike its milder counterpart Theorem 4.4.6, does not necessarily hold: a Φ-dissipative behavior could be such that the rate of change of the storage function along trajectories in the behavior is strictly less than $Q_\Phi$, but the corresponding $J_{mn}$-dissipative behavior $B_{J_{mn}}$ is not $J_{mn}^\epsilon$-dissipative. This is because a behavior that is $J_{mn}$-dissipative but not $J_{mn}^\epsilon$-dissipative may still have positive definite storage functions such that the rate of change of the storage function along every nonzero trajectory in the behavior is strictly less than $Q_{J_{mn}}$. Such a situation arises when a dissipation function corresponding to $Q_{J_{mn}}$, defined by $\|D(\frac{d}{dt})w\|^2$, w ∈ B, is such that D(ξ) is unimodular. The claim in this remark is illustrated with an example.
Example 4.5.4 The aim of this example is to show that $J_{mn}^\epsilon$-dissipativity, for some ε ∈ (0, 1), is in general sufficient for the rate of change of the storage function to be strictly less than the supply function, but not necessary.

Let $J_{11} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. Consider the RC circuit in Example 2.2.1 with $R_1 = R_2 = C = 1$. Then, the corresponding behavior of the port voltage V and the port current I is given by
$$\begin{bmatrix} V \\ I \end{bmatrix} = \begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix} \ell$$
with q(ξ) = ξ + 2 and p(ξ) = ξ + 1. Let us check that B is $J_{11}$-dissipative:
$$J_{11}'(-i\omega, i\omega) = \begin{bmatrix} q(-i\omega) & p(-i\omega) \end{bmatrix} J_{11} \begin{bmatrix} q(i\omega) \\ p(i\omega) \end{bmatrix} = 3 > 0$$
Let us check whether B is $J_{11}^\epsilon$-dissipative for some ε ∈ (0, 1):
$$\begin{bmatrix} q(-i\omega) & p(-i\omega) \end{bmatrix} J_{11}^\epsilon \begin{bmatrix} q(i\omega) \\ p(i\omega) \end{bmatrix} = 3 - \epsilon(5 + 2\omega^2)$$
which, for every ε ∈ (0, 1), fails to be positive for large ω. Therefore, B is not $J_{11}^\epsilon$-dissipative for any ε ∈ (0, 1). Theorem 4.4.6 tells us that B has positive definite storage functions. In this case there is a unique storage function for B with respect to $Q_{J_{11}}$, which can be computed (on latent variables) to be $Q_\Psi(\ell) = \ell^2$, clearly a positive definite QDF. Since the corresponding dissipation function $Q_\Delta(\ell) = 3\ell^2$ is also positive definite, $\frac{d}{dt} Q_\Psi(\ell) = Q_{J_{11}'}(\ell)$ if and only if $Q_\Delta(\ell) = 0$, which holds if and only if ℓ = 0. Since $\frac{d}{dt} Q_\Psi(\ell) \le Q_{J_{11}'}(\ell)$ and the corresponding dissipation function is positive definite, we see that $\frac{d}{dt} Q_\Psi(\ell) < Q_{J_{11}'}(\ell)$ along all nonzero trajectories.
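The polynomial computations in this example are easily checked symbolically. The following sketch (ours, not part of the original development; it assumes the sympy package) verifies the two supply evaluations above:

```python
import sympy as sp

# omega is the frequency variable, eps the perturbation parameter in (0, 1)
omega, eps = sp.symbols('omega epsilon', real=True)
q = sp.I*omega + 2   # q(i*omega), with q(xi) = xi + 2
p = sp.I*omega + 1   # p(i*omega), with p(xi) = xi + 1

# J_{11}-supply on the imaginary axis: q(-iw)q(iw) - p(-iw)p(iw)
supply = sp.expand(sp.conjugate(q)*q - sp.conjugate(p)*p)
print(supply)  # 3, so B is J_{11}-dissipative

# J_{11}^eps-supply: subtract eps*(|q(iw)|^2 + |p(iw)|^2)
supply_eps = sp.expand(supply - eps*(sp.conjugate(q)*q + sp.conjugate(p)*p))
print(sp.collect(supply_eps, eps))  # 3 - eps*(2*omega**2 + 5): negative for large omega
```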
Remark 4.5.5 It is easy to see that Theorems 4.4.6 and 4.5.2 can be used to parametrize the set of Φ-dissipative behaviors with positive definite storage functions. If $\Phi(\zeta, \eta) = K^T(\zeta) J_{mn} K(\eta)$, define $L(\xi) = \mathrm{adj}\, K(\xi)$, i.e. $L(\xi)K(\xi) = \det K(\xi)\, I_w$. We have seen in Chapter 3 that the differential operator L(d/dt) is a map from the set of all $J_{mn}$-dissipative behaviors to the set of all Φ-dissipative behaviors. Suppose a $J_{mn}$-dissipative behavior $B_{J_{mn}}$ having input cardinality m (the number of +1 entries in $J_{mn}$) is defined by an observable image representation Im M(d/dt). Then B, defined by the image representation L(ξ)M(ξ) (converted into an observable image representation M′(ξ)), has positive definite storage functions if and only if K(ξ)M′(ξ) defines an observable image representation, has no non-trivial memoryless part, and the corresponding rational function is bounded real.
We now, as in the previous chapter, concentrate on the important and interesting special case
of SISO dissipative behaviors. We see that in this case, the results are more explicit.
4.6
Special case: KYP lemma for SISO systems
As in the previous chapter, in the SISO case we prefer working with passive systems, rather
than Jmn -dissipative systems. This is, however, solely a matter of choice. Working with passive
systems helps us relate more closely to the KYP lemma as stated in Theorem 4.1.1.
Recall that the matrix J has been defined in the previous chapter as
$$J = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix} \tag{4.13}$$
Consider a behavior B defined by the observable image representation
$$\begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix} \ell \tag{4.14}$$
Recall that B is J-dissipative if and only if the associated rational function of B, which we have defined to be G(ξ) := p(ξ)/q(ξ), has its Nyquist plot entirely in the closed right half plane. We begin by defining the two variable polynomial J′(ζ, η):
$$J'(\zeta, \eta) = \begin{bmatrix} q(\zeta) & p(\zeta) \end{bmatrix} J \begin{bmatrix} q(\eta) \\ p(\eta) \end{bmatrix} \tag{4.15}$$
J-dissipativity of B implies that $J'(-i\omega, i\omega) \ge 0$ for all ω ∈ R, or, written explicitly,
$$p(-i\omega)q(i\omega) + p(i\omega)q(-i\omega) \ge 0 \qquad \forall\, \omega \in \mathbb{R}$$
Notice that J can be written in the following manner:
$$J = \frac{1}{4}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{4.16}$$
Using the polynomials p(ξ), q(ξ) that define the image representation of B, we now define two
new polynomials: r(ξ) = q(ξ) + p(ξ) and s(ξ) = q(ξ) − p(ξ). Then, we have the following
lemma:
Lemma 4.6.1 r(ξ) and s(ξ) are coprime if and only if p(ξ) and q(ξ) are coprime.
"
#
r( dtd )
Lemma 4.6.1 implies that the image representation Im
is observable if and only if the
s( dtd )
#
"
q( dtd )
is also observable. We now state the following result that
image representation Im
p( dtd )
relates the roots of r(ξ) with those of q(ξ) and p(ξ). This result is not new, however, we believe
that the proof presented here is simpler and elementary than those found elsewhere [105]. It
only uses elementary complex analysis and the celebrated Nyquist stability criterion.
Proposition 4.6.2 Consider polynomials p(ξ) and q(ξ) such that the behavior defined by the observable image representation $\mathrm{Im}\begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix}$ is J-dissipative. Then, r(ξ) := q(ξ) + p(ξ) is Hurwitz if and only if the following hold:
1. No roots of q(ξ) and p(ξ) lie in the open right half of the complex plane.
2. Any purely imaginary root of q(ξ) and p(ξ) is a simple root.
Proof: Define a rational function G(ξ) = p(ξ)/q(ξ). Since, by J-dissipativity,
$$J'(-i\omega, i\omega) = p(-i\omega)q(i\omega) + p(i\omega)q(-i\omega) \ge 0 \qquad \forall\, \omega \in \mathbb{R}$$
we have
$$G(i\omega) + G(-i\omega) \ge 0 \ \text{for almost all}\ \omega \in \mathbb{R}$$
or Re G(iω) ≥ 0 for almost all ω ∈ R. This implies that the Nyquist plot of G(ξ) lies entirely in the closed right half complex plane.
Note that the polynomial J′(ζ, η) is symmetric in p and q. Therefore the behavior
associated with G(ξ) and the behavior associated with 1/G(ξ) are both J-dissipative. Thus
without loss of generality, it is enough to prove the Proposition for q(ξ).
Suppose q(ξ) has no roots in the open right half plane and all its purely imaginary roots are simple. Since the Nyquist plot of G(ξ) lies in the closed right half plane, by using Nyquist stability arguments it is seen that the rational function defined by $\frac{G(\xi)}{1+G(\xi)}$ has all its poles in the open left half of the complex plane. Hence, p(ξ) + q(ξ) is a Hurwitz polynomial.
Conversely, suppose p(ξ) + q(ξ) is Hurwitz. Then $\frac{G(\xi)}{1+G(\xi)}$ has all its poles in the open left half of the complex plane. Observe that if q(ξ) has non-simple roots on the imaginary axis, the Nyquist plot of G(ξ) has circle(s) of infinite radius. This implies that the Nyquist plot of G(ξ) does not lie in the closed right half plane, contradicting J-dissipativity. Therefore, q(ξ) can have at most simple roots on the imaginary axis.
If q(ξ) has k roots in the open right half complex plane then, for p(ξ) + q(ξ) to be Hurwitz, the Nyquist plot of G(ξ) would have to encircle the point −1 k times in the counterclockwise sense. This again contradicts the assumption that B is J-dissipative, since the Nyquist plot of G(ξ) is confined to the closed right half plane. Hence q(ξ) has no roots in the open right half plane and all its purely imaginary roots are simple.
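Proposition 4.6.2 is easy to exercise numerically. The sketch below (ours, assuming numpy) takes the J-dissipative pair q(ξ) = ξ + 2, p(ξ) = ξ + 1 from Example 4.5.4 and confirms the two conditions together with the Hurwitz property of r = q + p:

```python
import numpy as np

q = np.array([1.0, 2.0])   # q(xi) = xi + 2
p = np.array([1.0, 1.0])   # p(xi) = xi + 1

# Condition 1: no roots of q (or p) in the open right half plane
print(np.roots(q), np.roots(p))                      # [-2.] and [-1.]

# r(xi) = q(xi) + p(xi) = 2*xi + 3 should then be Hurwitz
r = np.polyadd(q, p)
print(all(root.real < 0 for root in np.roots(r)))    # True: r is Hurwitz
```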
Using results obtained in Theorem 4.4.6, Lemma 4.6.1 and Proposition 4.6.2 the following
result follows as a corollary:
Corollary 4.6.3 Let B be a J-dissipative behavior associated with the rational function
p(ξ)/q(ξ). Then, the following are equivalent
1. p(ξ) and q(ξ) are not both constant, have no roots in the open right half plane, and all
purely imaginary roots are simple.
2. There exists a positive definite storage function (with respect to QJ ) on the manifest
variables of B. Every storage function on the manifest variables of B is positive definite.
Corollary 4.6.3 states that every storage function for B is positive definite if and only if the
associated rational function p(s)/q(s) is (non-constant) positive real with p(s), q(s) coprime.
The strict version of the KYP lemma (Theorem 4.5.2) for J-dissipative SISO systems reduces
to:
Corollary 4.6.4 Let B be a J-dissipative behavior associated with the rational function p(ξ)/q(ξ). Every storage function of B with respect to $Q_J$ is positive definite on manifest variables, and the rate of change of the storage function is strictly less than $Q_J$, if p(ξ)/q(ξ) is non-constant and strictly positive real, i.e. p(ξ), q(ξ) are strictly Hurwitz and Re p(iω)/q(iω) ≥ ε > 0 for some ε and every ω ∈ R.
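The strict positive realness test in Corollary 4.6.4 can be sanity-checked on a frequency grid. A minimal sketch (ours, assuming numpy; the pair p(ξ) = ξ + 1, q(ξ) = ξ + 2 is only a representative example, and a grid check is of course not a proof):

```python
import numpy as np

p = np.array([1.0, 1.0])   # p(xi) = xi + 1
q = np.array([1.0, 2.0])   # q(xi) = xi + 2

# Both polynomials must be Hurwitz ...
hurwitz = all(r.real < 0 for r in np.concatenate([np.roots(p), np.roots(q)]))

# ... and Re p(iw)/q(iw) must stay bounded away from zero on the imaginary axis
w = np.logspace(-3, 3, 2000)
G = np.polyval(p, 1j*w) / np.polyval(q, 1j*w)
print(hurwitz, G.real.min())   # True, and the minimum (about 0.5) is strictly positive
```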
We shall now consider SISO systems dissipative with respect to a supply $Q_\Phi$ with $\Phi(\zeta, \eta) \in \mathbb{R}^{2 \times 2}_s[\zeta, \eta]$ such that Φ(ζ, η) can be written as $K^T(\zeta) J K(\eta)$. Necessary and sufficient conditions for such a splitting of Φ(ζ, η) are given in Theorem 4.4.3.

Corollary 4.6.5 Let $\Phi(\zeta, \eta) \in \mathbb{R}^{2 \times 2}_s[\zeta, \eta]$ be such that $\Phi(\zeta, \eta) = K^T(\zeta) J K(\eta)$ with $K(\xi) \in \mathbb{R}^{2 \times 2}[\xi]$ nonsingular. Consider a Φ-dissipative controllable behavior B given by an observable image representation:
$$\begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} q(\frac{d}{dt}) \\ p(\frac{d}{dt}) \end{bmatrix} \ell$$
Consider the associated J-dissipative behavior defined by the image of
$$\begin{bmatrix} \tilde{q}(\xi) \\ \tilde{p}(\xi) \end{bmatrix} = K(\xi) \begin{bmatrix} q(\xi) \\ p(\xi) \end{bmatrix}$$
Then, every storage function on the manifest variables of B is positive definite if and only if the
following holds: q̃(ξ) and p̃(ξ) are coprime and the rational function p̃(ξ)/q̃(ξ) is non-constant
and positive real. Further, the rate of change of the storage function is strictly less than the
supply if the following holds: q̃(ξ) and p̃(ξ) are coprime and the rational function p̃(ξ)/q̃(ξ) is
non-constant and strictly positive real.
We now demonstrate the ideas behind results presented in this chapter with the help of a
few examples.
Example 4.6.6 Consider a capacitor of capacitance 1 unit. Let v be the voltage across the capacitor and i be the corresponding current. Then, the behavior B corresponding to the voltage-current relationship of the capacitor is given by
$$\begin{bmatrix} 1 & -\frac{d}{dt} \end{bmatrix} \begin{bmatrix} i \\ v \end{bmatrix} = 0$$
This behavior is clearly controllable. An observable image representation of B is:
$$\begin{bmatrix} i \\ v \end{bmatrix} = \begin{bmatrix} \frac{d}{dt} \\ 1 \end{bmatrix} \ell$$
Consider the supply function $Q_J$. Then, $Q_J(i, v) = vi$. It is easy to see that B is lossless with respect to $Q_J$. Hence, there exists a unique storage function for B with respect to $Q_J$, which (on latent variables) is found to be ℓ²/2. Clearly, the storage function is positive definite on all ℓ. Since v = ℓ, the corresponding storage function on manifest variables is v²/2, which is also clearly positive definite.
One may reach the same conclusion by a quick examination of the associated transfer function G(ξ) := 1/ξ. This transfer function is positive real, and non-constant. Therefore, from
Corollary 4.6.3, every storage function on manifest variables of B is positive definite.
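The losslessness claim can also be verified directly: along B we have i = dℓ/dt and v = ℓ, so the candidate storage ℓ²/2 must satisfy d/dt(ℓ²/2) = vi exactly. A small symbolic check (ours, assuming sympy):

```python
import sympy as sp

t = sp.symbols('t')
ell = sp.Function('ell')(t)        # latent variable trajectory
i, v = sp.diff(ell, t), ell        # image representation: i = d(ell)/dt, v = ell

storage = ell**2 / 2
residual = sp.simplify(sp.diff(storage, t) - v*i)
print(residual)                    # 0: the rate of change of storage equals the supply vi
```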
We now examine what happens when the supply function is a more general QDF:
Example 4.6.7 Consider the supply function defined by
$$\Phi(\zeta, \eta) = \frac{1}{2}\begin{bmatrix} 0 & \eta \\ \zeta & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & \zeta \end{bmatrix} J \begin{bmatrix} 1 & 0 \\ 0 & \eta \end{bmatrix} \tag{4.17}$$
Consider the behavior B in Example 4.6.6, corresponding to a unit capacitance:
$$\begin{bmatrix} i \\ v \end{bmatrix} = \begin{bmatrix} \frac{d}{dt} \\ 1 \end{bmatrix} \ell$$
The action of $Q_\Phi$ on trajectories in B can be found to be $Q_\Phi(i, v) = i^2$, which is pointwise non-negative. Defining a dissipation function to be i², one sees that the storage function corresponding to this dissipation function is zero.
"
#
1 0
We come to the same conclusion using Corollary 4.6.3. Let K(ξ) =
. The image
0 ξ
"
#
representation corresponding to K( dtd )(B) is found to be
d
dt
d
dt
` which is not observable, since
the image of every constant ` is zero. Therefore, from Corollary 4.6.3, it follows that B does
not have positive definite storage functions with respect to QΦ .
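Both the splitting (4.17) and the loss of observability are easy to confirm symbolically. A sketch (ours, assuming sympy):

```python
import sympy as sp

zeta, eta, xi = sp.symbols('zeta eta xi')
J = sp.Matrix([[0, sp.Rational(1, 2)], [sp.Rational(1, 2), 0]])
K = lambda s: sp.Matrix([[1, 0], [0, s]])

# The factorization in (4.17): K(zeta)^T * J * K(eta) = (1/2)[[0, eta], [zeta, 0]]
Phi = K(zeta).T * J * K(eta)
print(Phi)

# Applying K(d/dt) to the capacitor image representation [xi; 1]:
col = K(xi) * sp.Matrix([xi, 1])
print(col.T, sp.gcd(col[0], col[1]))  # [xi, xi], common factor xi: not observable
```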
Remark 4.6.8 Examples 4.6.6 and 4.6.7, though extremely simple, demonstrate some interesting concepts and give some insight into the sign definiteness of storage functions. In many problems, checking whether or not a given dynamical system has positive definite storage functions is important, rather than actually computing these storage functions. In Example 4.6.6, checking for positive realness of the rational function 1/ξ is clearly easier than computing the storage function (which in general requires a symbolic computational package, and also spectral factorization). In Example 4.6.7, the supply function $Q_\Phi$ is pointwise non-negative on (all) trajectories. Since the supply function is then indistinguishable from a dissipation function, one can compute a storage function using $Q_\Phi$ as a dissipation function. Such a storage function will clearly be zero.
The final example serves to highlight the difference between storage functions on manifest
variables and storage functions on states:
Example 4.6.9 Consider the behavior B given by:
$$\begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} \frac{d}{dt} + 2 \\ \frac{d^2}{dt^2} + \frac{d}{dt} - 4 \end{bmatrix} \ell$$
Consider a Φ(ζ, η) where:
$$\Phi(\zeta, \eta) = \frac{1}{4}\begin{bmatrix} \zeta + \eta + 3\zeta\eta - 1 & -3\zeta - 1 \\ -3\eta - 1 & 3 \end{bmatrix}$$
Define the two variable polynomial Φ′(ζ, η) by:
$$\Phi'(\zeta, \eta) = \begin{bmatrix} \zeta + 2 & \zeta^2 + \zeta - 4 \end{bmatrix} \Phi(\zeta, \eta) \begin{bmatrix} \eta + 2 \\ \eta^2 + \eta - 4 \end{bmatrix}$$
which is equal to ζη + 4(ζ + η) + 15. B is dissipative with respect to the supply function induced by Φ(ζ, η) since Φ′(−iω, iω) = ω² + 15 > 0 for all ω ∈ R. The minimum storage function on the latent variable ℓ of B is found to be
$$Q_K(\ell) = (4 - \sqrt{15})\,\ell^2$$
which is positive definite on all ℓ and hence also positive definite on all (u, y) ∈ B, due to observability of ℓ. That every storage function on the manifest variables of B is positive definite is also obtained from Theorem 4.4.6 since
$$\Phi'(\zeta, \eta) = (\zeta + 4)(\eta + 4) - 1$$
Note that ξ + 4 is a Hurwitz polynomial.
The McMillan degree of B is 2. Hence, a minimal state representation for B has 2 state variables (for example ℓ and dℓ/dt). However, the storage function obtained above depends only on ℓ (i.e., only one state). Therefore, this storage function is only positive semidefinite on a set of minimal states of B but is positive definite on the manifest variables of B.
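The computation of Φ′(ζ, η) above can be verified symbolically; a sketch (ours, assuming sympy):

```python
import sympy as sp

zeta, eta = sp.symbols('zeta eta')
Phi = sp.Rational(1, 4) * sp.Matrix([
    [zeta + eta + 3*zeta*eta - 1, -3*zeta - 1],
    [-3*eta - 1, 3],
])
row = sp.Matrix([[zeta + 2, zeta**2 + zeta - 4]])
col = sp.Matrix([eta + 2, eta**2 + eta - 4])

Phi_prime = sp.expand((row * Phi * col)[0, 0])
print(Phi_prime)                            # eta*zeta + 4*eta + 4*zeta + 15
print(sp.expand((zeta + 4)*(eta + 4) - 1))  # the same polynomial
```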
4.7
Conclusion
The KYP lemma relates passive systems to positive definite storage functions. Using this connection, we generalized the KYP lemma in a number of directions. We obtained a characterization of dissipative systems having positive definite storage functions. Unlike the KYP lemma, which only considers supply functions that are quadratic forms, we also considered supply functions that are quadratic differential forms. Hence, the KYP lemma is a special case of the characterization proposed here. We also formulated variants of the KYP lemma available in the literature in the behavioral framework, and generalized them. We have not invoked a state space representation while formulating the KYP lemma and its generalizations. Therefore, our approach is representation free; however, frequency domain representations can be readily obtained if these are desired.
As a part of future work, we plan to investigate state space formulations of our generalization, not so much for conceptual benefits but for the sake of computations. It is known that the state space version of the KYP lemma can be formulated as an LMI (Linear Matrix Inequality). In recent times, this has led to a spurt of research in this area due to developments in convex optimization. It would be interesting to investigate LMI based formulations for the generalizations obtained in this chapter.
Chapter 5
Designing linear controllers for
nonlinearities
5.1
Introduction
Consider the following problem: given a nonlinear system, characterize a class of linear differential systems (called controllers), that can be interconnected with the nonlinear system
through given interconnection constraints to yield a stable or an asymptotically stable system.
This problem is a variant of a classical problem that has received wide attention in literature
– the problem of “absolute stability” due to Popov [76, 77, 78], Yakubovich and many others.
The absolute stability problem has its origin in the well known conjecture of Aizerman. In
1949, Aizerman put forward the following conjecture: assume that a system is obtained by an
interconnection of a linear system and a memoryless time-invariant, single-valued nonlinearity
whose characteristic is bounded inside a sector in R2 defined by a pair of straight lines with
slopes k1, k2. The (interconnected) system is stable if, when the nonlinearity is replaced by a linear gain k, the resulting linear system is stable for every k with k1 < k < k2. Pliss [73] demonstrated in 1958 that Aizerman's conjecture is not true in general. Subsequently, many researchers have
demonstrated counterexamples to Aizerman’s conjecture, see [21]. A big impetus to the development of absolute stability theory came with the work of the Romanian scientist V.M. Popov,
who in a series of seminal papers in the 1960s [76, 77] obtained a very general “frequency criterion” for predicting stability of systems obtained by interconnecting a linear system with a
nonlinearity.
The absolute stability problem is rich: though it is now more than forty years since Popov's fundamental work in this direction, the stability problem continues to be an area of active research. Broadly speaking, absolute stability criteria can be stated in the frequency domain (using certain "multipliers"), or in state space (using a Linear Matrix Inequality). The debate on which of these is "better" is ongoing [20]. There are still numerous open issues, the greatest of them being that, to date, no necessary conditions for absolute stability seem to be known. In the 1960s and 70s, the emphasis was mostly on SISO systems; see for instance the insightful papers by Brockett and Willems [15, 16]. Focus was also on investigating different
“families” of nonlinearities, for example the family of all monotone nonlinearities [111]. In
the 1990s, research in absolute stability was revived due to developments in Linear Matrix
Inequalities and convex optimization, see [81, 52, 42] for example. MIMO versions of stability
theorems also received attention [31, 37, 48]. Research is also focused on nonlinearities with
memory [30, 39]. See [5] for a collection of current problems.
From a behavioral viewpoint, we are interested in finding stabilizing linear controllers for
given nonlinearities. Formulating the absolute stability problem in a behavioral theoretic framework has advantages: firstly, there are fewer a priori assumptions on the linear system that can
be connected with a nonlinearity as compared to transfer function based methods. Classical
theories are essentially dependent on a “negative feedback” interconnection of the linear and
nonlinear systems. In the behavioral framework, fairly general interconnection constraints can also be handled. Further, the so-called "loop transformations" can be shown to be a special
case of results obtained in the behavioral framework.
Lyapunov theory is a well known tool for analysis of dynamical systems. An important
part of Lyapunov theory deals with the construction of Lyapunov functions. In this chapter
we use behavior-theoretic ideas to construct Lyapunov functions for nonlinear systems. These
Lyapunov functions will be constructed using storage functions for dissipative LTI systems. In
order for the storage functions to qualify as Lyapunov functions, they need to satisfy conditions
on positivity and radial unboundedness. We explore these issues in the next section. The scalar versions of the results obtained in this chapter have been published [65, 69]. A journal version of
the results presented in this chapter is under preparation.
This chapter is organized as follows: in Section 5.2, we develop the necessary background.
Section 5.3 is about the “control as interconnection” viewpoint. Here, we formulate the stabilization problem as an interconnection. In Section 5.4 we provide precise definitions of a
nonlinearity. We also give a precise problem statement. Section 5.5 is the main section of this chapter. Here, we give a recipe for constructing a set of stabilizing controllers for a given nonlinearity. This is followed by applications: Section 5.6 is about the circle criterion, Section 5.7 about Popov's stability criterion, and Section 5.8 about slope restricted nonlinearities. In Section 5.9, we discuss applications of the theory to nonlinearities with memory.
5.2
Preliminaries
In this section, we continue with the ideas presented in Chapter 4 to construct Lyapunov
functions using storage functions of linear systems.
Consider a QDF $Q_\Phi$ with $\Phi(\zeta, \eta) = K^T(\zeta) J K(\eta)$, $K(\xi) \in \mathbb{R}^{w \times w}[\xi]$ nonsingular, and
$$J = \frac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix}$$
where w = 2m. Let B be a Φ-dissipative behavior with input cardinality w/2, corresponding to the rational function $P(\xi)Q^{-1}(\xi)$. We now define:
"
#
"
#
R(ξ)
Q(ξ)
= K(ξ)
(5.1)
S(ξ)
P (ξ)
The following corollary follows as a consequence of the KYP lemma, Theorem 4.4.6:
Corollary 5.2.1 Consider a Φ-dissipative behavior B = {(u, y)} associated with the rational function $P(\xi)Q^{-1}(\xi)$, with input cardinality w/2. Consider the corresponding J-dissipative behavior $B_J := \{(u', y')\}$ given by $\mathrm{Im}\begin{bmatrix} R(\frac{d}{dt}) \\ S(\frac{d}{dt}) \end{bmatrix}$ with R(ξ), S(ξ) as in equation (5.1). Then, the following statements are equivalent:
1. Every storage function of B with respect to QΦ is positive definite on manifest variables.
2. Matrices defining BJ satisfy the following conditions:
(a) R(ξ) is nonsingular.
(b) R(ξ), S(ξ) are right coprime, i.e. if R(ξ) = R1 (ξ)U (ξ) and S(ξ) = S1 (ξ)U (ξ) then
U (ξ) is unimodular.
(c) BJ has no non-trivial memoryless part.
(d) The rational function S(ξ)R−1 (ξ) is positive real.
Further, the rate of change of every storage function of B with respect to QΦ is strictly less
than QΦ if S(ξ)R−1 (ξ) is strictly positive real.
Proof: We use Theorem 4.4.6 to prove the Corollary. The only difference between Theorem 4.4.6 and this Corollary is that while the former concerns the supply function $Q_{J_{mn}}$, here we consider the supply function $Q_J$.
Notice that
$$\begin{bmatrix} 0 & I_m/2 \\ I_m/2 & 0 \end{bmatrix} = \underbrace{\begin{bmatrix} I_m/2 & I_m/2 \\ I_m/2 & -I_m/2 \end{bmatrix}}_{C^T} \begin{bmatrix} I_m & 0 \\ 0 & -I_m \end{bmatrix} \underbrace{\begin{bmatrix} I_m/2 & I_m/2 \\ I_m/2 & -I_m/2 \end{bmatrix}}_{C}$$
Thus we see that $\Phi(\zeta, \eta) = K^T(\zeta) C^T J_{mm} C K(\eta)$. Let $B_1 = K(\frac{d}{dt})(B)$ and $B_2 = CK(\frac{d}{dt})(B)$. Assume that
$$B_2 = \mathrm{Im}\begin{bmatrix} N(\frac{d}{dt}) \\ M(\frac{d}{dt}) \end{bmatrix}$$
Notice that N = (R + S)/2 and M = (R − S)/2. Since B is Φ-dissipative, B1 is J-dissipative
and B2 is Jmm -dissipative. From Theorem 4.4.6, B has positive definite storage functions on
manifest variables if and only if M, N are right coprime, N (ξ) is a Hurwitz matrix and the
behavior associated with the rational function M N −1 has no non-trivial memoryless part. We
now prove that these conditions are equivalent to those given in the statement of the Corollary.
If $w \in B_1$ then $v := Cw \in B_2$. Let $X_J$ be a minimal state map for $B_1$:
$$x = X_J(\tfrac{d}{dt})\,w, \qquad w \in B_1$$
Then, $X_J(\xi)C^{-1}$ is a state map for $B_2$:
$$x = X_J(\tfrac{d}{dt})\,C^{-1}v, \qquad v \in B_2$$
Therefore, B1 has no non-trivial memoryless part if and only if B2 has no non-trivial memoryless part.
The claim that $B_1$ is given by an observable image representation if and only if $B_2$ is given by an observable image representation also follows from the invertibility of C. Finally, since $\|(R-S)(R+S)^{-1}\|_{H_\infty} \le 1$, it follows that $SR^{-1}$ is positive real [3, 56]. This is a well known property of bounded real rational functions. The strict version of this corollary can be proved similarly from Theorem 4.5.2.
We have elaborated on the difference between storage functions on states and storage functions on manifest variables in Chapter 4. Trentelman and Willems [95] have shown that a storage function for a behavior B is a state function of an associated behavior. Under additional assumptions, however, a storage function for a behavior can be guaranteed to be a state function of B itself. We now investigate when storage functions for a Φ-dissipative behavior are functions of the state.
Theorem 5.2.2 Given $\Phi(\zeta, \eta) = K^T(\zeta) J K(\eta)$ with $K(\xi) \in \mathbb{R}^{w \times w}[\xi]$ nonsingular, let
$$K(\xi) = \begin{bmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{bmatrix}$$
where $K_{ij} \in \mathbb{R}^{m \times m}[\xi]$, i, j = 1, 2. Let B, defined by an observable image representation
$$\begin{bmatrix} u \\ y \end{bmatrix} = \begin{bmatrix} Q(\frac{d}{dt}) \\ P(\frac{d}{dt}) \end{bmatrix} \ell$$
be Φ-dissipative. Further assume that $P(\xi)Q^{-1}(\xi)$ is proper. If $G_B := K_{11} + K_{12}PQ^{-1}$ is a proper rational function then every storage function of B with respect to $Q_\Phi$ is a state function.
Proof: Consider the associated J-dissipative behavior of B:
$$\bar{B} = \left\{ \begin{bmatrix} \bar{u} \\ \bar{y} \end{bmatrix} = \begin{bmatrix} K_{11}(\frac{d}{dt})Q(\frac{d}{dt}) + K_{12}(\frac{d}{dt})P(\frac{d}{dt}) \\ K_{21}(\frac{d}{dt})Q(\frac{d}{dt}) + K_{22}(\frac{d}{dt})P(\frac{d}{dt}) \end{bmatrix} \ell \right\}$$
From [95], page 254, Theorem 6.1 it follows that every storage function of B̄ with respect to $Q_J$ is a state function of B̄, i.e. every storage function can be written as $\bar{x}^T P \bar{x}$ such that $\frac{d}{dt} Q_P(\bar{x}) \le Q_J(\bar{u}, \bar{y})$, where x̄ denotes a minimal set of states of B̄.
Let us investigate under what conditions there exists a constant matrix C such that x̄ = Cx, where x denotes the states of B. Let X̄(d/dt) and X(d/dt) denote state maps acting on the latent variable ℓ for B̄ and B respectively:
$$\bar{x} = \bar{X}(\tfrac{d}{dt})\,\ell; \qquad x = X(\tfrac{d}{dt})\,\ell$$
Then, x̄ = Cx if and only if $\bar{X}(\xi) = CX(\xi)$.
We know that the row span of X̄(ξ) (over R) is precisely the span of the rows $r_i(\xi)$ such that $r_i(K_{11}Q + K_{12}P)^{-1}$ is strictly proper (Corollary 3.5.2, [82], page 64). Since $r_i(K_{11}Q + K_{12}P)^{-1}$ is strictly proper,
$$G_s = r_i Q^{-1}(K_{11} + K_{12}PQ^{-1})^{-1}$$
is strictly proper. Since by assumption $K_{11} + K_{12}PQ^{-1}$ is proper,
$$r_i Q^{-1} = G_s(K_{11} + K_{12}PQ^{-1})$$
is strictly proper because Gs is strictly proper. This shows that ri also lies in the row span of
X(ξ). We have proved that every vector in the row span of X̄(ξ) lies in the row span of X(ξ)
(over R) when K11 + K12 P Q−1 is proper. Hence, there exists a C such that x̄ = Cx.
The above result is however not enough to guarantee positive definiteness of storage functions
of B since the McMillan degree of B̄ could be less than that of B (see Example 4.6.9 for a
demonstration). In order to preserve positive definiteness we want C to be full column rank so
that Cx = 0 =⇒ x = 0. A necessary condition for the existence of such a C is that
McMillan degree of B̄ ≥ McMillan degree of B
As a special case we have the following result:
Theorem 5.2.3 With reference to Theorem 5.2.2, if GB := K11 + K12 P Q−1 is bi-proper (i.e.
proper with a proper inverse) then every positive definite state function of B̄ is a positive
definite state function of B.
Proof: We show that if $K_{11} + K_{12}PQ^{-1}$ is biproper then there exist matrices C and C′ such that
$$\bar{x} = Cx; \qquad x = C'\bar{x}$$
The first part of the claim follows from Theorem 5.2.2. If $r_iQ^{-1}$ is strictly proper, then $r_iQ^{-1}[K_{11} + K_{12}PQ^{-1}]^{-1}$ is strictly proper, since $K_{11} + K_{12}PQ^{-1}$ has a proper inverse by assumption. Therefore, $r_i[K_{11}Q + K_{12}P]^{-1}$ is strictly proper. Hence, x = C′x̄.
Thus, Theorem 5.2.3 shows that if $K_{11} + K_{12}PQ^{-1}$ is biproper then the state spaces of B̄ and B are the same. Let $x^T K x$ be a positive definite storage function for B with respect to $Q_\Phi$. Note that the storage function is radially unbounded, i.e. $x^T K x \to \infty$ as $\|x\| \to \infty$. Positive definiteness and radial unboundedness of storage functions will be crucially used while constructing Lyapunov functions.
Since we are considering nonlinear systems in this chapter, we have to consider a larger space than $C^\infty$, the space of smooth functions. In this chapter, we assume the function space to be $L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$, the space of locally integrable functions from R to $\mathbb{R}^w$. Differentiation of $L_1^{loc}$ functions is of course assumed in the distributional sense. See Section 2.3 for an introduction to differentiation in $L_1^{loc}$, and the concept of "weak solutions".

A note about the action of QDFs on $L_1^{loc}$ functions is in order. Note that the space $L_1^{loc}$ is not stable under differentiation, i.e. the derivative of an $L_1^{loc}$ function may not be another $L_1^{loc}$ function. Hence, given a behavior $B = \{(u, y)\} \subset L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$, a QDF $Q_\Phi(u, y)$ may not be an $L_1^{loc}$ function in general and, worse, may not even be well defined. However, by restricting Φ to a suitable set, one can guarantee $Q_\Phi(u, y)$ to be an $L_1^{loc}$ function when $(u, y) \in B \subset L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$.

We have reviewed state space representations for a behavior in Section 2.8. Due to the axiom of state (Definition 2.8.1), states are continuous ($C^0$) functions of t. It can be shown that $Q_\Phi(u, y) \in L_1^{loc}(\mathbb{R}, \mathbb{R})$ when $(u, y) \in B \subset L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$, if $Q_\Phi(u, y)$ can be written as a quadratic form in terms of u, x and ẋ, where u denotes an input (Section 2.9) and x denotes a minimal set of states for B. In the sequel, given a $Q_\Phi$, we only associate with it those behaviors which satisfy $Q_\Phi(u, y) \in L_1^{loc}(\mathbb{R}, \mathbb{R})$ when $(u, y) \in B \subset L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$.
5.3
Control as an interconnection
A system that does not satisfy the superposition principle (Definition 2.1.2) is called a nonlinear
system, or in short, a nonlinearity. We reserve the precise definition till the next section. At an
abstract level, the nonlinearity consists of a set of trajectories allowed by the law that defines
the nonlinearity. This set may contain trajectories that are desirable and those that are not.
By attaching a controller to the nonlinearity, we want to restrict the nonlinearity in such a way
that the undesirable trajectories are eliminated. In the context of this chapter, the “desirable”
trajectories are those that ensure stability of the equilibrium (a precise definition will be given
later).
The idea behind control is a restriction brought about by interconnection of systems. This
notion of control is both physically appealing and practically important (see [101, 102] for
details about the “control-as-an-interconnection” viewpoint). Therefore, in this chapter, we
also adopt the view that control of a system is a restriction brought about by interconnection.
In Chapter 7, we will again encounter the “control-as-an-interconnection” philosophy when we
address problems in synthesis of dissipative systems.
We intend to design linear stabilizing controllers for given nonlinearities. Hence, in the
sequel, we only study autonomous systems that are obtained by interconnection of a linear
and a nonlinear system. In Section 2.7, we defined the stability of linear autonomous systems.
There, the notion of stability matches with our intuitive notion that “trajectories in a stable
system do not blow up”. However, in nonlinear systems, one has to define stability more
carefully. We now define what we mean by stability of an equilibrium of a dynamical system:
Definition 5.3.1 Consider the system ẋ = f(x) with f(0) = 0, x(t) ∈ $\mathbb{R}^n$ being the state. Assume that there is a unique solution to this equation for every x(0) ∈ $\mathbb{R}^n$. The equilibrium state x = 0 is called stable if for every ε > 0 there exists a δ > 0 such that if ∥x(0)∥ < δ then ∥x(t)∥ < ε for all t ∈ $\mathbb{R}_+$. Further, the equilibrium x = 0 is called globally asymptotically stable if it is stable and, for all x(0) ∈ $\mathbb{R}^n$, ∥x(t)∥ → 0 as t → ∞.
These definitions are standard [107]. Since we are considering real valued systems, the definition is independent of the norm used, since all norms on Euclidean spaces are equivalent. Further, global asymptotic stability means that no matter what the initial condition is, the trajectory x will eventually tend to 0. For an equilibrium to be globally asymptotically stable, it should be the only equilibrium of the system. Since the state x = 0 corresponds to the zero manifest variable trajectory, by stability about the zero trajectory we mean stability of x = 0. When considering systems with only a single equilibrium, we refer to asymptotic stability about the zero trajectory; strictly speaking, however, we are really considering the global asymptotic stability of the equilibrium state x = 0.
5.4
Nonlinear systems: problem formulation
In this chapter, we only consider systems that are obtained by an interconnection of a linear system and a nonlinearity. Let $v = [v_i]_{i=1}^m$ and $w = [w_i]_{i=1}^m$ be $\mathbb{R}^m$-valued columns. Consider the maps $v_i = f_i(w_i)$ defined almost everywhere, i.e. $f_i(w_i)$ is an $L_1^{loc}$ function with respect to $w_i$, i = 1, ..., m. Further assume that $f_i(0) = 0$. We represent these m maps notationally by v = f(w). We say that v = f(w) defines a memoryless nonlinearity. If the $f_i$ do not depend on time, they will be called time-invariant. In this chapter we only consider memoryless and time-invariant nonlinearities. Define
$$N = \{(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w) \ \text{such that}\ v_i = f_i(w_i),\ i = 1, \dots, m\}, \qquad w = 2m$$
Thus, N is the set of trajectories consistent with v = f(w). Note that because $f_i(0) = 0$, i = 1, ..., m by assumption, it follows that (0, 0) ∈ N. Analogous to Chapter 2, where we considered a linear system Σ and its behavior B interchangeably, in this chapter we consider the nonlinearity defined by f and its "behavior" N interchangeably. Consider a QDF $Q_\Theta$ with $\Theta \in \mathbb{R}^{w \times w}_s[\zeta, \eta]$. Define
$$\mathcal{N} := \{(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w) \ \text{such that}\ Q_\Theta(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}),\ Q_\Theta(w, v) \ge 0\}$$
If v = f(w) is such that $N \subset \mathcal{N}$ then N will be called a nonlinearity of the $\mathcal{N}$-kind. The set of all nonlinearities N of the $\mathcal{N}$-kind defines a family of nonlinearities $F_{\mathcal{N}}$.
Consider any nonlinearity $N \in F_{\mathcal{N}}$. Also, consider a linear differential behavior B = {(u, y)} associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$. Let the nonlinearity N and the behavior B be interconnected as follows:
$$\begin{bmatrix} u(t) \\ y(t) \end{bmatrix} = \mathcal{X}(\tfrac{d}{dt})\begin{bmatrix} w(t) \\ v(t) \end{bmatrix} \tag{5.2}$$
Here, $\mathcal{X}(\xi) \in \mathbb{R}^{w \times w}[\xi]$ is called the interconnection matrix, which is assumed to be nonsingular. When B and N are interconnected using "negative feedback", $\mathcal{X}$ can be seen to be $\begin{bmatrix} 0 & I_m \\ -I_m & 0 \end{bmatrix}$. For the problems addressed in this chapter, only the "negative feedback" interconnection will be considered. However, any other type of interconnection does not affect the essence of the theory presented here. Using negative feedback, B and N are interconnected to obtain the autonomous nonlinear system $B_N$. The process of interconnection is shown diagrammatically in Figure 5.1.
Since $N \in F_{\mathcal{N}}$ is memoryless, a state space representation of $B_N$ (corresponding to a negative feedback interconnection) can be obtained using the states of the linear differential behavior B. In particular, $B_N$ admits a representation of the type:
$$\dot{x} = Ax - Bf(Cx) \tag{5.3}$$
where (A, B, C) is a minimal state representation of B with states x, and v = f(w) is the law defining the nonlinearity N. Since both B and N admit the zero trajectory, $B_N$ is nonempty and contains at least the zero trajectory (u, y) = (0, 0). Recall that by asymptotic stability of $(0, 0) \in B_N$, we mean global asymptotic stability of the state x = 0.
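A representation of the form (5.3) is straightforward to simulate. The sketch below (ours; it assumes numpy, an illustrative stable triple (A, B, C) and a saturation nonlinearity, none of which come from the thesis) integrates the interconnection with a simple forward Euler scheme:

```python
import numpy as np

# Illustrative data: a stable linear part and a sector-bound nonlinearity with f(0) = 0
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([0.0, 1.0])
C = np.array([1.0, 0.0])
f = lambda w: np.clip(w, -1.0, 1.0)      # memoryless, time-invariant saturation

x, dt = np.array([1.0, 0.0]), 1e-3       # initial state and step size
for _ in range(20000):
    x = x + dt * (A @ x - B * f(C @ x))  # Euler step of xdot = Ax - B f(Cx)
print(x)                                 # close to the equilibrium x = 0
```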
Problem formulation: Given a family of nonlinearities $F_{\mathcal{N}}$, determine a class $C_B$ of LTI behaviors B such that $(0, 0) \in B_N$ is asymptotically stable for every $B \in C_B$, $N \in F_{\mathcal{N}}$.

The next section is devoted to addressing the problem formulated above.
[Figure 5.1: Interconnection of B and N. The LTI dynamical system (behavior, with port variables u(t), y(t)) is connected to the nonlinearity (with variables w(t), v(t)) through the interconnection.]
5.5
Constructing stabilizing controllers
For the sake of clarity, we first present a recipe for obtaining $C_B$ in Section 5.5.1; the remaining parts of this section are devoted to proving the claims made in the recipe.
5.5.1
A recipe to obtain stabilizing behaviors for all nonlinearities
in a given family
1. Consider a QDF $Q_{\Theta_1}$. Define:
$$\mathcal{N}_{\Theta_1} := \{(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w) \mid Q_{\Theta_1}(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}),\ Q_{\Theta_1}(w, v) \ge 0\} \tag{5.4}$$
Consider sets $N = \{(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w) \mid v = f(w)\}$ where f defines a memoryless time-invariant nonlinearity. Then, $F_{\mathcal{N}_{\Theta_1}} := \{N \mid N \subset \mathcal{N}_{\Theta_1}\}$ is the family of nonlinearities of the $\mathcal{N}_{\Theta_1}$-kind.
2. Find a continuous function G(w, v) such that:
(a) G(w, v) is positive semidefinite on all $(w, v) \in \mathcal{N}_{\Theta_1}$.
(b) $\frac{d}{dt}G(w, v)$ is a QDF, say $Q_{\Theta_2}(w, v)$.
Note that $\frac{d}{dt}G$ is then guaranteed to be an $L_1^{loc}$ function.
3. Let $\Theta(\zeta, \eta) = \Theta_1(\zeta, \eta) + \Theta_2(\zeta, \eta)$. Note that for all $(w, v) \in \mathcal{N}_{\Theta_1}$:
$$Q_\Theta(w, v) \ge \frac{d}{dt}G(w, v) \tag{5.5}$$
4. Compute the matrix Φ(ζ, η) as follows:
$$\Phi(\zeta, \eta) = -\mathcal{X}^T \Theta(\zeta, \eta)\, \mathcal{X} \tag{5.6}$$
Here $\mathcal{X}$ is the interconnection matrix. We have already stated above that in this chapter only a negative feedback interconnection will be considered.
5. Check whether Φ(ζ, η) can be split as $\Phi(\zeta, \eta) = K^T(\zeta) J K(\eta)$, with $K(\xi) \in \mathbb{R}^{w \times w}[\xi]$ nonsingular and
$$J = \frac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix}, \qquad m = w/2$$
Consider a Φ-dissipative behavior B associated with the rational function $P(\xi)Q^{-1}(\xi)$. If B has positive definite storage functions on manifest variables, and if the rational function $G_B$ defined in Theorem 5.2.3 is biproper, then $(0, 0) \in B_N$ is asymptotically stable.
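Steps 3-5 of the recipe are mechanical symbolic computations. As an illustration, the following sketch (ours, assuming sympy) carries out step 4 for the scalar sector QDF used later in Section 5.6, and checks the splitting of step 5:

```python
import sympy as sp

zeta, eta, a, b = sp.symbols('zeta eta a b')
# Scalar version of the sector QDF (5.10), taking G = 0 so that Theta = Theta_1:
Theta = sp.Matrix([[-a*b, (a + b)/2], [(a + b)/2, -1]])
X = sp.Matrix([[0, 1], [-1, 0]])             # negative feedback interconnection

Phi = (-X.T * Theta * X).expand()
print(Phi)   # Matrix([[1, (a+b)/2], [(a+b)/2, a*b]]), as in (5.13)

# Step 5: Phi = K^T J K with J = (1/2)[[0,1],[1,0]] and K = [[1, a], [1, b]]
J = sp.Matrix([[0, sp.Rational(1, 2)], [sp.Rational(1, 2), 0]])
K = sp.Matrix([[1, a], [1, b]])
print((K.T * J * K - Phi).expand())          # zero matrix: the splitting holds
```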
5.5.2
Stability results
Notice that using the interconnection relations, one can express G(w, v) in terms of (u, y) as
G(u, y). The following lemma is useful in finding a Lyapunov function candidate for BN :
Lemma 5.5.1 Given $F_{\mathcal{N}_{\Theta_1}}$, construct a suitable $Q_\Phi$. Interconnect any nonlinearity $N \in F_{\mathcal{N}_{\Theta_1}}$ with a Φ-dissipative behavior B, resulting in the autonomous behavior $B_N$. Let $Q_\Psi$ be any storage function of B with respect to $Q_\Phi$ such that $\frac{d}{dt}Q_\Psi(u, y) < Q_\Phi(u, y)$ for all nonzero (u, y) ∈ B. Then,
$$\frac{d}{dt}\big(Q_\Psi(u, y) + G(u, y)\big) < 0 \qquad \forall\ (u, y) \in B_N \setminus \{0\} \tag{5.7}$$
Proof: Note that
$$Q_\Phi(u, y) > \frac{d}{dt}Q_\Psi(u, y) \qquad \forall\ (u, y) \in B \setminus \{0\} \tag{5.8}$$
Note that for all $(u, y) \in B_N$, $Q_\Theta(w, v) = -Q_\Phi(u, y)$ (Equation (5.6)). Adding the inequality (5.5) to the inequality (5.8):
$$\frac{d}{dt}\big(Q_\Psi(u, y) + G(w, v)\big) < 0 \qquad \forall\ \text{nonzero}\ (u, y) \in B_N \tag{5.9}$$
Substituting G(w, v) = G(u, y) in the above inequality completes the proof.
We now prove the central result of this chapter. We use Theorem 5.2.3 to "convert" the Lyapunov function candidate of Lemma 5.5.1 into a function of the states.
Theorem 5.5.2 Given $F_{\mathcal{N}_{\Theta_1}}$, construct an appropriate QDF $Q_\Phi$. Assume $\Phi(\zeta, \eta) = K^T(\zeta) J K(\eta)$ with $K(\xi) = \begin{bmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{bmatrix}$ nonsingular, $K_{ij}(\xi) \in \mathbb{R}^{m \times m}[\xi]$, i, j = 1, 2. Let B be a Φ-dissipative behavior associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$ such that $G_B = K_{11} + K_{12}PQ^{-1}$ is biproper. Consider the Lyapunov function candidate $V(u, y) = Q_\Psi(u, y) + G(u, y)$, which satisfies:
1. $Q_\Psi$ is a positive definite storage function on the manifest variables of B.
2. $\frac{d}{dt}Q_\Psi(u, y)$ is strictly less than $Q_\Phi(u, y)$ for all nonzero (u, y) ∈ B.
3. G(u, y) is a continuous positive semidefinite function of $(u, y) \in B_N$ and $\frac{d}{dt}G(u, y)$ is a QDF.
Then, the zero trajectory in $B_N$ is asymptotically stable.
Proof: If $Q_\Psi(u, y) > 0$ then from Theorem 4.4.6 and Corollary 5.2.1 it can be shown that $Q_\Psi$ is a positive definite state function of K(d/dt)(B). Since $G_B$ is biproper, Theorem 5.2.3 shows that $Q_\Psi$ is also a positive definite state function of B. Let (A, B, C) be a minimal state space representation for B with states x. Then, $Q_\Psi(u, y)$ can be written as $x^T D x$ where D > 0. Note that $x^T D x$ is radially unbounded. Using the substitutions y = Cx and u = −f(Cx), V(u, y) becomes a positive definite state function of $B_N$, denoted V(x). Further, since G(u, y) is positive semidefinite and $x^T D x$ is radially unbounded, V(x) is radially unbounded. Using Lemma 5.5.1 we conclude that V(u, y) is a Lyapunov function for $B_N$.
5.5.3
A characterization of stabilizing controllers
Theorem 5.5.2 gives a recipe for constructing Lyapunov functions for $B_N$. Note that this stability result holds for any Φ-dissipative behavior with positive definite storage functions such that the rate of change of the storage function is strictly less than the supply. It can immediately be seen that if the class of Φ-dissipative behaviors with positive definite storage functions is known, a class $C_B$ of stabilizing controllers can be parametrized.
In Chapter 4 we addressed the Kalman-Yakubovich-Popov lemma in the behavioral setting and obtained conditions for a behavior to have positive definite storage functions on manifest variables. Let $\Phi(\zeta, \eta) = K^T(\zeta) J K(\eta)$ and let B, associated with a strictly proper rational function $P(\xi)Q^{-1}(\xi)$, be Φ-dissipative. Define B̃ = K(d/dt)(B). Assume that B̃ is given by an image representation $\mathrm{Im}\begin{bmatrix} \tilde{Q}(\frac{d}{dt}) \\ \tilde{P}(\frac{d}{dt}) \end{bmatrix}$. Recall from Corollary 5.2.1 that B has positive definite storage functions on manifest variables, and the rate of change of the storage function is strictly less than the supply $Q_\Phi$, if:
1. P̃ (ξ), Q̃(ξ) are right coprime matrices.
2. B̃ has no memoryless part.
3. The rational function P̃ (ξ)Q̃−1 (ξ) is strictly positive real.
Using the characterization of behaviors with positive definite storage functions, we can address
absolute stability criteria for several important classes of nonlinearities.
Remark 5.5.3 The stability theory presented here depends in an essential way on the fact that the nonlinearity and the behavior are interconnected using negative feedback. Consider a behavior B that is interconnected with a nonlinearity $N \in F_{\mathcal{N}_{\Theta_1}}$ using a general interconnection (specified by a nonsingular polynomial matrix $\mathcal{X}(\frac{d}{dt})$). Lyapunov functions for such systems can be constructed using storage functions that are state functions of the behavior $\mathcal{X}(\frac{d}{dt})(B)$ (and not of B).
Remark 5.5.4 In Section 5.5.1, assume that G(w, v) is positive definite. To ensure asymptotic stability, the rate of change of the storage function along trajectories in B then need not be strictly less than the supply function. Condition 3 mentioned above can now be relaxed to: $\tilde{P}(\xi)\tilde{Q}^{-1}(\xi)$ is positive real, as opposed to strictly positive real.
Remark 5.5.5 Condition 1 above states that Q̃(ξ) and P̃ (ξ) must be right coprime in order
to conclude asymptotic stability. However, if P̃ (ξ) and Q̃(ξ) are not right coprime, one can still
conclude asymptotic stability if the states of B can be constructed from states of the behavior
associated with the reduced rational function P̃ (ξ)Q̃−1 (ξ).
Remark 5.5.6 Let (u, y) be an input-output partition of a linear differential behavior B and x denote a set of minimal states for B. Assume that $Q_\Psi$, a storage function for B, is a positive definite state function of ẋ, and not of x. With a memoryless nonlinearity interconnected with B through negative feedback, G(w, v) can be defined as a function of y: v = f(y), w = y. Assuming G(w, v) > 0 and (f(y) = 0 ⟺ y = 0), V(u, y) in Theorem 5.5.2 can be defined as a positive definite function of ẋ and y: $V(u, y) = \dot{x}^T D \dot{x} + G(y)$. If $\frac{d}{dt}V(u, y)$ is negative along all $(u, y) \in B_N$, we see that ẋ → 0 and y → 0 as t → ∞. Since x is a set of minimal states for B, and $B_N$ has only one equilibrium by assumption, we can still conclude global asymptotic stability of x = 0. See [31] for details about this argument.
In the following sections, a representative list of applications of the general theory stated above
is presented. This leads to some well known classical results along with some new results.
5.6
The Circle criterion
One of the simplest classes of nonlinearities is the family of sector bound nonlinearities. We consider systems with w := 2m manifest variables. We define the m × m matrices $A = \mathrm{diag}[a_1, a_2, \dots, a_m]$ and $B = \mathrm{diag}[b_1, b_2, \dots, b_m]$ with $0 \le a_i < b_i$, i = 1, 2, ..., m. Define a w × w constant matrix $\Theta_1$ as:
$$\Theta_1 = \begin{bmatrix} -AB & \frac{A+B}{2} \\ \frac{A+B}{2} & -I_m \end{bmatrix} \tag{5.10}$$
Define, as before, the set $\mathcal{N}_{\Theta_1}$ as follows:
$$\mathcal{N}_{\Theta_1} = \{(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w) \mid Q_{\Theta_1}(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}),\ Q_{\Theta_1}(w, v) \ge 0\} \tag{5.11}$$
Consider any nonlinearity defined by $f = \mathrm{diag}[f_i]$, i = 1, ..., m, where $v_i = f_i(w_i)$, $f_i(0) = 0$ are $L_1^{loc}$ maps. Let N be the set of trajectories that are compatible with f, i.e. $N = \{(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w) \mid v_i = f_i(w_i),\ i = 1, \dots, m\}$, and assume that $N \subset \mathcal{N}_{\Theta_1}$. The family of all N that satisfy these criteria forms the family of sector bound nonlinearities in [A, B], which we denote by $F_{AB}$.
Consider a linear differential behavior B corresponding to the strictly proper transfer function $P(\xi)Q^{-1}(\xi)$ with $Q(\xi), P(\xi) \in \mathbb{R}^{m \times m}[\xi]$. We interconnect a nonlinearity $N \in F_{AB}$ with the linear differential behavior B using "negative feedback":
$$\begin{bmatrix} w \\ v \end{bmatrix} = \begin{bmatrix} 0 & I_m \\ -I_m & 0 \end{bmatrix}\begin{bmatrix} u \\ y \end{bmatrix} \tag{5.12}$$
Following the notation in Section 5.5.1, let G = 0. We now compute Φ(ζ, η) corresponding to "negative feedback" and observe that:
$$\Phi(\zeta, \eta) = \begin{bmatrix} I_m & \frac{A+B}{2} \\ \frac{A+B}{2} & AB \end{bmatrix} = \begin{bmatrix} I_m & I_m \\ A & B \end{bmatrix} J \underbrace{\begin{bmatrix} I_m & A \\ I_m & B \end{bmatrix}}_{K} \tag{5.13}$$
where $J = \frac{1}{2}\begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix}$. From Corollary 5.2.1 and Theorem 5.5.2:
Corollary 5.6.1 Consider the family $F_{AB}$ of sector bound nonlinearities in [A, B] with $0 \le a_i < b_i < \infty$. Interconnect a linear differential behavior B associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$ with any nonlinearity $N \in F_{AB}$, using negative feedback, to get $B_N$. Define the rational function
$$H(\xi) = [Q(\xi) + BP(\xi)][Q(\xi) + AP(\xi)]^{-1} \tag{5.14}$$
The equilibrium $(0, 0) \in B_N$ is asymptotically stable if H(ξ) is strictly positive real (SPR).
This is precisely the condition one obtains after a “loop transformation” on the nonlinearity
[107].
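In the scalar case the corollary is straightforward to apply numerically: for a sector [a, b] one forms H(ξ) = (q + bp)/(q + ap) and tests strict positive realness. A sketch (ours, assuming numpy, with an illustrative plant; a grid test is a sanity check, not a proof):

```python
import numpy as np

# Illustrative plant p/q = 1/(xi^2 + xi + 1) and sector [a, b] = [0, 2]
q = np.array([1.0, 1.0, 1.0])
p = np.array([0.0, 0.0, 1.0])
a, b = 0.0, 2.0

num = np.polyadd(q, b * p)     # q + b p
den = np.polyadd(q, a * p)     # q + a p
hurwitz = all(r.real < 0 for r in np.concatenate([np.roots(num), np.roots(den)]))

w = np.logspace(-3, 3, 2000)
H = np.polyval(num, 1j*w) / np.polyval(den, 1j*w)
print(hurwitz, H.real.min() > 0)   # True True: H passes the SPR test on the grid
```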
5.7
Classical Popov Criterion
Since the circle criterion is only a sufficiency result, it is worthwhile investigating whether there exist more stabilizing controllers for a given nonlinearity N. This question will now be explored in more detail. We consider the family $F_{0K}$ of sector bound nonlinearities in [0, K] where $K = \mathrm{diag}(k_1, \dots, k_m)$, $0 < k_i < \infty$, i = 1, ..., m. Let
$$\Theta_1 = \begin{bmatrix} 0 & K/2 \\ K/2 & -I_m \end{bmatrix}$$
Let $\mathcal{N}_{\Theta_1}$ be the set of all $(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$ such that $Q_{\Theta_1}(w, v)$ is non-negative. Then, for all $N \in F_{0K}$, $N \subset \mathcal{N}_{\Theta_1}$. Let
$$G(w, v) = \sum_{i=1}^{m} k_i \alpha_i \int_0^{w_i} v_i\, dw_i := \int_0^w v^T K\Lambda\, dw$$
with $\Lambda = \mathrm{diag}[\alpha_1, \dots, \alpha_m]$, $\alpha_i > 0$. Note that G(w, v) ≥ 0 for all $(w, v) \in \mathcal{N}_{\Theta_1}$. Define a QDF $Q_{\Theta_2}$ as:
$$Q_{\Theta_2}(w, v) = \frac{d}{dt}G(w, v) = v^T K\Lambda\, \frac{dw}{dt} \tag{5.15}$$
Define Θ(ζ, η) = Θ1 (ζ, η) + Θ2 (ζ, η).
Using $\mathcal{X}$ corresponding to negative feedback, compute the two variable polynomial matrix $\Phi(\zeta, \eta) = -\mathcal{X}^T \Theta(\zeta, \eta) \mathcal{X}$. Thus:
$$\Phi(\zeta, \eta) = -\mathcal{X}^T \Theta(\zeta, \eta)\, \mathcal{X} = \begin{bmatrix} I_m & K(I_m + \Lambda\eta)/2 \\ K(I_m + \Lambda\zeta)/2 & 0 \end{bmatrix} \tag{5.16}$$
Note that Φ(ζ, η) can be factorized as:
$$\Phi(\zeta, \eta) = \underbrace{\begin{bmatrix} I_m & I_m \\ 0 & K(I_m + \Lambda\zeta) \end{bmatrix}}_{N^T(\zeta)} J \underbrace{\begin{bmatrix} I_m & 0 \\ I_m & K(I_m + \Lambda\eta) \end{bmatrix}}_{N(\eta)} \tag{5.17}$$
Consider a Φ-dissipative behavior associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$. Interconnect a nonlinearity $N \in F_{0K}$ with B, using negative feedback, to obtain $B_N$. Let $Q_\Psi$ be a positive definite storage function for B with respect to $Q_\Phi$ such that:
$$\frac{d}{dt}Q_\Psi(u, y) < Q_\Phi(u, y) \qquad \forall\ \text{nonzero}\ (u, y) \in B \tag{5.18}$$
From Corollary 5.2.1, all storage functions for B are positive definite and the rate of change of storage is less than the supply if the following conditions are satisfied:
1. $Y(\xi) := [Q(\xi) + K(I_m + \Lambda\xi)P(\xi)]Q^{-1}(\xi)$, which is the transfer function associated with N(d/dt)(B), is strictly positive real.
2. The behavior N(d/dt)(B) has no nontrivial memoryless part.
3. $K(I_m + \Lambda\xi)P(\xi) + Q(\xi)$ and Q(ξ) are right coprime.
Further, Theorem 5.2.2 shows that storage functions of B are state functions.
We interconnect any N ∈ F0K with B to get BN . From Theorem 5.5.2, (0, 0) ∈ BN is
asymptotically stable for every nonlinearity N ∈ F0K and every Φ-dissipative behavior that
satisfies the above conditions. This is a multivariable generalization of the celebrated result
due to Popov, often called “Popov’s stability criterion”.
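In the scalar case, the conditions above reduce to the familiar Popov test: find α ≥ 0 such that (q + k(1 + αξ)p)/q is SPR. A numerical sketch (ours, assuming numpy; the plant and the value of α are illustrative, and the grid test is only a sanity check):

```python
import numpy as np

q = np.array([1.0, 1.0, 1.0])    # q(xi) = xi^2 + xi + 1
p = np.array([1.0])              # p(xi) = 1, so p/q is strictly proper
k, alpha = 2.0, 1.0              # sector [0, k] and Popov parameter alpha

# Numerator of Y(xi) = (q(xi) + k*(1 + alpha*xi)*p(xi)) / q(xi)
num = np.polyadd(q, k * np.polymul([alpha, 1.0], p))
hurwitz = all(r.real < 0 for r in np.roots(num))

w = np.logspace(-3, 3, 2000)
Y = np.polyval(num, 1j*w) / np.polyval(q, 1j*w)
print(hurwitz, Y.real.min() > 0)  # True True: the Popov test passes for this alpha
```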
Remark 5.7.1 Following Remark 5.5.4, consider the case where the nonlinearity is restricted
to the open sector (0, K). The procedure outlined above is still valid for this case. However,
the assumption that the rate of change of storage function along trajectories of B be strictly
less than the supply is no longer crucial.
Notice that in the result we have just proved, Λ ≥ 0. However, in the literature concerning the Popov criterion (e.g. Hsu and Meyer [36], p. 377, and [16]), Λ is often allowed to be negative. This result will now be derived in the framework of this chapter. Let
$$\Theta_1 = \begin{bmatrix} 0 & K/2 \\ K/2 & -I_m \end{bmatrix} \tag{5.19}$$
Consider $F_{0K}$, the family of sector bound nonlinearities in [0, K]. Now, define
$$G(w, v) = \sum_{i=1}^{m} k_i \alpha_i \int_0^{w_i} (k_i w_i - v_i)\, dw_i := \int_0^w (Kw - v)^T K\Gamma\, dw, \qquad \Gamma = \mathrm{diag}[\alpha_1, \dots, \alpha_m] \ge 0$$
Notice that G(w, v) ≥ 0 for all $(w, v) \in \mathcal{N}_{\Theta_1}$. Let
$$Q_{\Theta_2}(w, v) = \frac{d}{dt}G(w, v) = (Kw - v)^T K\Gamma\, \frac{dw}{dt} \tag{5.20}$$
Let $\Theta(\zeta, \eta) = \Theta_1 + \Theta_2(\zeta, \eta)$. Compute, as before, $\Phi(\zeta, \eta) = -\mathcal{X}^T \Theta \mathcal{X}$ where $\mathcal{X}$ denotes the interconnection matrix corresponding to negative feedback. Thus:
$$\Phi(\zeta, \eta) = \begin{bmatrix} I_m & K(I_m - \Gamma\eta)/2 \\ K(I_m - \Gamma\zeta)/2 & -\Gamma K^2(\zeta + \eta)/2 \end{bmatrix} \tag{5.21}$$
Using negative feedback, interconnect any nonlinearity $N \in F_{0K}$ with linear differential Φ-dissipative behaviors B associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$. Let $Q_\Psi$ be a storage function of B with respect to $Q_\Phi$ such that the rate of change of $Q_\Psi$ along nonzero trajectories of B is strictly less than $Q_\Phi$. Notice that Φ(ζ, η) can be factorized in the following manner:
$$\begin{bmatrix} I_m & K(I_m - \Gamma\eta)/2 \\ K(I_m - \Gamma\zeta)/2 & -\Gamma K^2(\zeta + \eta)/2 \end{bmatrix} = \underbrace{\begin{bmatrix} I_m & I_m \\ K & -K\Gamma\zeta \end{bmatrix}}_{M^T(\zeta)} J \underbrace{\begin{bmatrix} I_m & K \\ I_m & -K\Gamma\eta \end{bmatrix}}_{M(\eta)} \tag{5.22}$$
Denote by $Z(\xi) := \tilde{P}(\xi)\tilde{Q}^{-1}(\xi)$ the rational function associated with M(d/dt)(B):
$$Z(\xi) := [Q(\xi) - K\Gamma\xi P(\xi)][Q(\xi) + KP(\xi)]^{-1} \tag{5.23}$$
If Z(ξ) is reduced and SPR, and the behavior M(d/dt)(B) has no non-trivial memoryless part, then $(0, 0) \in B_N$ is asymptotically stable. This is a result in itself; however, in order to simplify its application, we modify it using transfer function manipulations.
Lemma 5.7.2 Z(ξ) in equation (5.23) is SPR if and only if $\bar{Z}(\xi) := [Q(\xi) + K(I_m - \Gamma\xi)P(\xi)]Q^{-1}(\xi)$ is SPR.
Proof: Consider the polynomial matrices
$$\Phi(\zeta, \eta) = \begin{bmatrix} I_m & K(I_m - \Gamma\eta)/2 \\ K(I_m - \Gamma\zeta)/2 & -\Gamma K^2(\zeta + \eta)/2 \end{bmatrix}; \qquad \bar{\Phi}(\zeta, \eta) = \begin{bmatrix} I_m & K(I_m - \Gamma\eta)/2 \\ K(I_m - \Gamma\zeta)/2 & 0 \end{bmatrix}$$
Since $\Phi(-i\omega, i\omega) = \bar{\Phi}(-i\omega, i\omega)$, it follows that $Q_\Phi \sim Q_{\bar{\Phi}}$; see Section 3.3 for details about equivalent supply functions. Therefore B, associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$, is Φ-dissipative if and only if it is also Φ̄-dissipative. Notice that
$$\begin{bmatrix} I_m & K(I_m - \Gamma\eta)/2 \\ K(I_m - \Gamma\zeta)/2 & 0 \end{bmatrix} = \begin{bmatrix} I_m & I_m \\ 0 & K(I_m - \Gamma\zeta) \end{bmatrix} J \underbrace{\begin{bmatrix} I_m & 0 \\ I_m & K(I_m - \Gamma\eta) \end{bmatrix}}_{N(\eta)}$$
Since $Q_\Phi \sim Q_{\bar{\Phi}}$, B is Φ-dissipative if and only if B̄, associated with the rational function $\bar{Z}(\xi) := [Q(\xi) + K(I_m - \Gamma\xi)P(\xi)]Q^{-1}(\xi)$, is J-dissipative. Hence, $Z(i\omega) + Z^T(-i\omega) \ge \epsilon I_m > 0$ if and only if $\bar{Z}(i\omega) + \bar{Z}^T(-i\omega) \ge \epsilon I_m > 0$, ω ∈ R.

Further, Z(ξ) is SPR if and only if $Z(i\omega) + Z^T(-i\omega) \ge \epsilon I_m > 0$ and $2Q(\xi) + K(I_m - \Gamma\xi)P(\xi)$ is Hurwitz. Note that the sums of the "numerator" and "denominator" of Z(ξ) and of Z̄(ξ) are the same. Therefore, Z(ξ) is SPR if and only if Z̄(ξ) is SPR.
Thus, Popov’s result in all its generality can now be stated as:
Theorem 5.7.3 Consider the family $F_{0K}$ of memoryless, time-invariant, single-valued, sector bound nonlinearities in [0, K], $K = \mathrm{diag}[k_1, \dots, k_m]$, $0 < k_i < \infty$. Consider linear differential behaviors B associated with a strictly proper rational function $P(\xi)Q^{-1}(\xi)$. Interconnect a nonlinearity $N \in F_{0K}$ with a behavior B using "negative feedback" to obtain the autonomous nonlinear behavior $B_N$. Then, the equilibrium (0, 0) in $B_N$ is asymptotically stable if there exists $\Gamma = \mathrm{diag}[\gamma_1, \dots, \gamma_m] \ge 0$ such that $Z(\xi) := [Q(\xi) + K(I_m \pm \Gamma\xi)P(\xi)]Q^{-1}(\xi)$ is reduced and SPR, and further the behavior associated with Z(ξ) has no memoryless part. Further, if the nonlinearities $N \in F_{0K}$ are restricted to lie in the open sector (0, K), the equilibrium (0, 0) is asymptotically stable if Z(ξ) is reduced and is PR when Γ is nonsingular, and SPR when Γ is singular.
An interesting corollary follows from Theorem 5.7.3:
Corollary 5.7.4 Consider behaviors B that satisfy conditions given in Theorem 5.7.3. Then,
no root of Q(ξ) is in the open right half plane and all its roots on the imaginary axis are simple.
Proof: With $Q(\xi) + K(I_m \pm \Gamma\xi)P(\xi)$ and Q(ξ) right coprime, the rational function $Z(\xi) := [Q(\xi) + K(I_m \pm \Gamma\xi)P(\xi)]Q^{-1}(\xi)$ is positive real only if Z(ξ) is analytic in the open right half plane and all singularities of Z(ξ) on the imaginary axis are simple. Since the singularities of Z(ξ) are roots of det Q(ξ) = 0, the Corollary is proved.
In other words, the corresponding linear transfer function $P(\xi)Q^{-1}(\xi)$ is stable. It is interesting to note that in most references on Popov-like stability criteria in the literature ([15, 36, 52, 107]), the stability of the linear element is assumed a priori. However, from the results obtained in this chapter it can be seen that stability of the linear element follows as a consequence of the theory, and not as a starting point.
5.8
Slope restricted nonlinearities
In this section we investigate memoryless nonlinearities that, in addition to being sector bound, also have restrictions on their slopes. Let
$$\Theta_1(\zeta, \eta) = \begin{bmatrix} 0 & \frac{K}{2}\zeta\eta \\ \frac{K}{2}\zeta\eta & -\zeta\eta I_m \end{bmatrix} \tag{5.24}$$
with $K = \mathrm{diag}[k_1, \dots, k_m]$, $0 < k_i < \infty$. Let $\mathcal{N}_{\Theta_1}$ be the set of $(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^w)$ such that $Q_{\Theta_1}(w, v) > 0$ for all nonzero (w, v). Consider nonlinearities $N \in F_{0K}$ that also satisfy $N \subset \mathcal{N}_{\Theta_1}$. The set of all N that satisfy the above conditions forms a family of sector bound nonlinearities in (0, K) that also have the restriction that the slope of the w-v characteristic of every N in this family lies in (0, K). This family of nonlinearities is denoted by $F_{mon}$ (the subscript "mon" stands for monotone). Clearly, $F_{mon} \subset F_{0K}$.
Define a function G(w, v) as follows:
$$G(w, v) = \sum_{i=1}^{m} k_i \alpha_i \left( w_i v_i - \int_0^{w_i} v_i\, dw_i \right) := \left( w^T K\Gamma v - \int_0^w v^T K\Gamma\, dw \right) \tag{5.25}$$
with $\Gamma = \mathrm{diag}[\alpha_1, \dots, \alpha_m]$, $\alpha_i$ positive. Note that $\int_0^{w_i} v_i\, dw_i$ denotes the area under the $w_i$-$v_i$ characteristic of the i-th component of the nonlinearity N, and $w_i v_i$ denotes the area of the rectangle in $\mathbb{R}^2$ with vertices (0, 0), (0, $v_i$), ($w_i$, 0) and ($w_i$, $v_i$). Since the slope of N is restricted to (0, K), it can be seen that the area of this rectangle will always be greater than the corresponding area under the $w_i$-$v_i$ curve. Thus, $\int_0^{w_i} v_i\, dw_i < w_i v_i$. Then, every $N = \{(w, v)\} \in F_{mon}$ satisfies the inequality G(w, v) > 0.
Define the QDF $Q_\Theta$ as the sum of $Q_{\Theta_1}$ and the QDF associated with $\frac{d}{dt}G(w, v)$. Now compute the matrix Φ(ζ, η) from Θ(ζ, η) corresponding to "negative feedback". Thus:
$$\Phi(\zeta, \eta) = \begin{bmatrix} \zeta\eta I_m & \frac{K}{2}(\zeta\eta + \Gamma\zeta) \\ \frac{K}{2}(\zeta\eta + \Gamma\eta) & 0 \end{bmatrix} \tag{5.26}$$
Notice that Φ(ζ, η) admits the following factorization:
$$\Phi(\zeta, \eta) = \underbrace{\begin{bmatrix} \zeta I_m & \zeta I_m \\ 0 & K(\Gamma + \zeta I_m) \end{bmatrix}}_{M^T(\zeta)} J \underbrace{\begin{bmatrix} \eta I_m & 0 \\ \eta I_m & K(\Gamma + \eta I_m) \end{bmatrix}}_{M(\eta)} \tag{5.27}$$
Consider a Φ-dissipative behavior B associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$. The transfer function associated with M(d/dt)(B) is:
$$H(\xi) = [K(\Gamma + \xi I_m)P(\xi) + \xi Q(\xi)][\xi Q(\xi)]^{-1} \tag{5.28}$$
Following the notation in Theorem 5.5.2, we see that $G_B$ is not biproper. Note that every storage function for B is a state function of the behavior associated with H(ξ). When H(ξ) is reduced, m(M(d/dt)(B)) = m(B) + 1. Further, every storage function of B can be defined as a state function of ẋ, where x is a set of minimal states for B. We conclude from Remark 5.5.6 that x = 0 is still globally asymptotically stable. Thus:
Corollary 5.8.1 Let $F_{mon}$ be the family of nonlinearities that are sector bound in [0, K] and have slopes that are sector bound in (0, K). Consider a behavior B associated with the strictly proper rational function $P(\xi)Q^{-1}(\xi)$. Interconnect B with any $N \in F_{mon}$ to get $B_N$. Then, $(0, 0) \in B_N$ is asymptotically stable for every $N \in F_{mon}$ if there exists $\Gamma = \mathrm{diag}[\alpha_1, \dots, \alpha_m]$, $0 < \alpha_i$, i = 1, ..., m, such that
1. The polynomial matrices $K(\Gamma + \xi I_m)P(\xi) + \xi Q(\xi)$ and ξQ(ξ) are right coprime.
2. $H(\xi) := [K(\Gamma + \xi I_m)P(\xi) + \xi Q(\xi)][\xi Q(\xi)]^{-1}$ is positive real.
3. The behavior associated with H(ξ) has no memoryless part.
The scalar version of this result was first proved by Vimal Singh [88], while the matrix version was addressed by Haddad and Kapila [31] and Park et al. [57].
Consider the scalar case of Corollary 5.8.1. If P(0) = 0 then the numerator and denominator of H(ξ) are not right coprime. However, one can still conclude asymptotic stability of $B_N$ in the light of Remark 5.5.5. While Singh assumes that P(0) ≠ 0 [88], we see that this assumption can be relaxed.
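The conditions of Corollary 5.8.1 are again easy to sanity-check numerically in the scalar case. A sketch (ours, assuming numpy; p(ξ) = 1, q(ξ) = ξ + 1, k = 1, α = 2 are illustrative choices):

```python
import numpy as np

p, q = np.array([1.0]), np.array([1.0, 1.0])   # p(xi) = 1, q(xi) = xi + 1
k, alpha = 1.0, 2.0

num = np.polyadd(k * np.polymul([1.0, alpha], p), np.polymul([1.0, 0.0], q))
den = np.polymul([1.0, 0.0], q)                 # xi * q(xi)
print(np.roots(num), np.roots(den))             # no common roots: coprime

w = np.logspace(-2, 2, 1000)
H = np.polyval(num, 1j*w) / np.polyval(den, 1j*w)
print(H.real.min() >= 0)                        # Re H(i w) >= 0 on the grid: PR test
```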
5.9
Nonlinearities with memory
Till now in this chapter, we have only considered nonlinearities that are memoryless. We now discuss how the theory presented in this chapter can be used to handle nonlinearities with memory (e.g. systems with hysteresis). Figure 5.2 shows some examples of nonlinearities with memory. Let N denote the set of $(w, v) \in L_1^{loc}(\mathbb{R}, \mathbb{R}^2)$ that is consistent with the laws defining a nonlinearity shown in Figure 5.2. Notice that the characteristics shown in Figure 5.2 satisfy the inequality v̇w ≥ 0. This QDF can be represented as $Q_\Theta(w, v) \ge 0$ where:
$$\Theta(\zeta, \eta) = \begin{bmatrix} 0 & \eta/2 \\ \zeta/2 & 0 \end{bmatrix}$$
Let NΘ denote (w, v) ∈ Lloc
1 (R, R ) that satisfy QΘ (w, v) ≥ 0. Then, N ⊂ NΘ . Let FNΘ denote
the family of all N such that N ⊂ NΘ . With reference to Section 5.5.1, we choose G = 0 and
compute Φ corresponding to negative feedback:
"
#
0 ζ/2
Φ(ζ, η) =
η/2 0
5.9. Nonlinearities with memory
91
v
v
h
h
−∆
0
∆
λ
∆
−∆
w
0
−h
w
−h
(a) Ideal relay with hysteresis
(b)Saturation with hysteresis
Figure 5.2: Nonlinearities with memory
Imaginary part
(j ω +2)/(j ω +3)
0
ω=
ω=0
ΟΟ
Real Part
−1/N(a) for ideal relay with hysteresis
(typical)
Figure 5.3: Describing function analysis
Interconnect N with a Φ-dissipative behavior B associated with the rational function p(ξ)/q(ξ)
to obtain BN . Notice that
"
# "
# "
#
0 ζ/2
ζ 0
η 0
=
J
η/2 0
0 1
0 1
| {z }
K(η)
The rational function associated with K( dtd )(B) is seen to be H(ξ) =
coprime, not both constant, and
H(ξ) is PR,
p(ξ)
.
ξq(ξ)
If p(ξ), ξq(ξ) are
(5.29)
every storage function of B is positive definite on manifest variables. Further,
d
QΨ (u, y) ≤ 0 ∀(u, y) ∈ BN
dt
We have “transformed” the nonlinearity with memory into the [0, ∞] sector using the transformation defined by K( dtd ). Hence we conjecture that if H(ξ) is positive real and p(0) 6= 0,
(0, 0) ∈ BN is stable in the sense of Lyapunov.
Let us check this claim using other methods of analysis like describing functions [8]. Consider
the behavior B associated with the rational function G(ξ) = p(ξ)/q(ξ) with p(ξ) = ξ + 2 and
q(ξ) = ξ + 3. The Nyquist plot of G(ξ) lies entirely in the first quadrant for ω ∈ [0, ∞]. Let
92
5 Designing linear controllers for nonlinearities
N be an ideal relay with hysteresis (Figure 5.2(a)). Let N (a) be the describing function of N
(see [8]). We superimpose the plot of −1/N (a) on the Nyquist plot of G(ξ) in Figure 5.3. We
see that the two plots do not intersect. Hence we conclude that the interconnected system has
a stable equilibrium.
We now apply the results obtained in this section for analyzing stability. Notice that
H(ξ) =
ξ+2
ξ(ξ + 3)
satisfies the conditions given in (5.29) since it is positive real. Thus the theory is in agreement
with analysis using describing functions. We have also carried out a number of simulations
and it seems that p(ξ)/(ξq(ξ)) being positive real with p(0) 6= 0 guarantees stability of the
equilibrium.
Note that Condition (5.29) allows for p(ξ)/q(ξ) to be bi-proper, i.e., there could be a
feedthrough. Note that we conjecture that the algebraic methods used here will help in
analysing nonlinearities with memory. We have found remarkable agreement between our results and those obtained from other analytic methods like describing functions. Moreover our
results are in agreement with simulations. However, some technical difficulties dealing with
differentiablilty of trajectories in B have been an obstruction to obtaining a rigourous proof.
5.10
Conclusion
In this chapter, some stability issues of nonlinear systems have been addressed in the behavioral
theoretic framework. Using QDFs, we have proposed an algebraic method for stability analysis
of nonlinear dynamical systems that can be obtained by interconnection of a linear system and
a sector-bound memoryless nonlinearity. The method helps us construct Lyapunov functions
on manifest variables of a system without explicitly invoking states.
As applications, we demonstrated how the circle criterion and a more general version of
Popov’s stability criterion than those usually seen in literature follow as special cases. We also
investigated systems with slope restricted nonlinearities.
The algebraic nature of the method proposed here also lets us investigate nonlinearities with
memory. We have shown that stability results obtained using this method match with other
independent methods of analysis, though we have not been able to formulate a rigorous theory
for such nonlinearities.
Chapter 6
Polynomial J-spectral factorization
6.1
Introduction
The problem of polynomial J-spectral factorization is the following: a real para-Hermitian w×w
polynomial matrix Z in the indeterminate ξ (i.e. Z(ξ) = Z(−ξ)T ) is given, together with two
integers m and n such that m + n = w. It is required to find a w × w polynomial matrix F (if
there exists one) such that
Z(ξ) = F T (−ξ)Jmn F (ξ)
where
Jmn
"
Im 0
=
0 −In
(6.1)
#
and F is Hurwitz, i.e. has no singularities in the closed complex right half-plane. Strictly
speaking, equation (6.1) is really a “Jmn -spectral” factorization. However, for ease of notation,
and for conforming with available literature in this area, we still denote it as “J-spectral”
factorization.
Polynomial J-spectral factorization arises in different areas of systems and control, for
example in the case of Wiener filtering, LQG theory, in the polynomial approach to H∞ control and filtering (see [49, 53]). Many algorithms have been suggested for the solution of
such problem, especially in the case when n = 0, i.e. Jmn = Iw (see [4, 11, 24, 27, 38, 49, 97]).
In this chapter we propose an algorithm for J-spectral factorization based on the special
kernel representations of solutions to the subspace Nevanlinna interpolation problem (see [84]),
and on the calculus of quadratic differential forms introduced in Chapter 1. The theory of metric
interpolation has been used in [24, 27] for solving the problem of rational spectral factorization
(i.e., the case in which Jmn = Iw and the entries of the matrix Z consist of rational functions);
apart from the fact that our focus here is on polynomial J-spectral factorization, the approach
proposed in this chapter differs from that of [24, 27] in many aspects:
Our approach arises in the theory of QDFs, and uses two-variable polynomial matrix algebra;
this point of view allows new insights in the nature of the problem. For example, an
important consequence of our reliance on the theory of QDFs is that we are able to
formulate necessary and sufficient conditions for the existence of a J-spectral factorization,
thus providing an original and effective test alternative to the ones already known;
94
6 Polynomial J-spectral factorization
Our approach covers also the case when Jmn 6= Iw , which is of special interest in H∞ -control and
filtering. This generalization to the indefinite case is based on the results on Σ-unitary
vector-exponential modeling developed in the behavioral framework (see [84]);
Finally, the functioning of the algorithm does not depend on the assumptions underlying the
algorithm proposed in [24, 27], and can be applied to a general para-Hermitian matrix
Z. Moreover, one is not required to know a priori the inertia matrix Jmn of the spectral
factorization, since it is determined in the course of the computations carried out in the
algorithm we suggest.
The results reported in this Chapter were obtained in collaboration with Dr. Paolo Rapisarda who is currently with the Department of Electrical and Computer Engineering, University
of Southampton, UK.
This chapter is organized as follows: in Section 6.2 we illustrate the basic features of modeling vector-exponential time series, with special emphasis on modeling with Σ-unitary models.
Section 6.3 is the main section of this chapter, where we give an algorithm for polynomial
J-spectral factorization. This section also has other results of independent interest. In Section
6.4, we examine numerical aspects of the algorithm. This is followed by examples in Section
6.5.
6.2
Σ-unitary modeling of dualized data
In this section we illustrate the problem of modeling dualized data sets, a concept introduced in
[84] in the context of the subspace Nevanlinna interpolation problem (SNIP in the following).
Σ-unitary kernel representations play a central role in the algorithm for Σ-spectral factorization
illustrated in section 6.3.
6.2.1
Modeling vector-exponential time series with behaviors
Assume that a set D of data consisting of vector-exponential time series is given, i.e.
D := {vi eλi t }i=1,...,N
(6.2)
where vi ∈ Cw , λi ∈ C, i = 1, . . . , N . We pick our model from a model class M, whose
choice embodies the a priori assumptions on the nature of the phenomenon producing the
data, for example linearity, time-invariance, etc. For the purposes of this chapter we choose the
model class consisting of finite-dimensional linear differential systems with w external (manifest)
variables, denoted, as in the earlier chapters by Lw . We say that a model B ∈ Lw is an unfalsified
model for D, or equivalently, that the model B explains the data D, if D ⊆B.
Of course, in general more than one model explains the data. Clearly, the model B =
∞
C (R, Cw ) is unfalsified by every trajectory. This extreme case leads us to deduce that the
strength of a model lies in its prohibitive power : the more a model forbids, the better it is.
In this sense C ∞ (R, Cw ) is a trivial model because no restrictions are being imposed on the
outcomes of the model: according to C ∞ (R, Cw ), every outcome is possible.
6.2. Σ-unitary modeling of dualized data
95
Taking the point of view that the strength of a model lies in its prohibitive power (see
[100, 101]) leads to a natural partial ordering among unfalsified models: if B1 and B2 are
unfalsified by a set of data D, then we call B1 more powerful than B2 if B1 ⊆ B2 . We call a
model B∗ the most powerful unfalsified model (abbreviated MPUM ) for D if B∗ ⊇ D and any
other unfalsified model B for D satisfies B∗ ⊆ B. That is, the MPUM is the most restrictive
model among those not refuted by the data.
It can be shown that given a set of vector-exponential time series D as in (6.2), there exists
a unique behavior B∗ ⊂ C ∞ (R, Cw ) which explains the data D and as little else as possible.
Since B∗ ∈ Lw by assumption, and since D ⊆ B∗ :
B∗ = lin span {vi eλi t }i=1,...,N
(6.3)
Observe that the MPUM of a finite set of vector-exponential time series is autonomous, i.e. it
is a finite dimensional subspace of C ∞ (R, Cw ); equivalently, it can be represented as the kernel
of a matrix polynomial differential operator RN ( dtd ), such that RN is square and nonsingular as
a polynomial matrix (see Section 2.7 of this thesis). Note that the data D is, by assumption,
in general complex valued. Hence there exists a kernel representation RN ( dtd ) for B∗ with
RN (ξ) ∈ Cw×w [ξ], i.e. RN (ξ) is a polynomial matrix with complex coefficients. The following
remark addresses when we can find RN ∈ Rw×w [ξ]:
Remark 6.2.1 Let vi ∈ Cw = [vi1 , vi2 , . . . , viw ]T , vij ∈ C, j = 1 . . . w and λi ∈ C. Define
v̄i ∈ Cw = [v̄i1 , v̄i2 , . . . , v̄iw ]T . If the data set D is self-conjugate, i.e. for every trajectory
vi eλi t ∈ D, the trajectory v̄i eλ̄i t ∈ D, then there exists RN (ξ) ∈ Rw×w [ξ] such that RN ( dtd )w = 0
for all w ∈ B∗ .
We now present an iterative algorithm to compute a kernel representation of B∗ . Define:
R0 := Iw
and proceed iteratively as follows for k = 1, . . . , N . At step k, define the k-th error trajectory
d
)vk eλk t = Rk−1 (λk )vk eλk t
(6.4)
dt
Observe that the error-trajectory is also a vector-exponential time-series associated with the
frequency λk and the vector εk := Rk−1 (λk )vk . A kernel representation of the MPUM for εk eλk
is
εk ε∗k d
d
Ek ( ) := Iw λk −
dt
kεk k2 dt
Now define
Rk := Ek Rk−1
Rk−1 (
After N steps such algorithm produces a w×w polynomial matrix RN such that RN ( dtd )vi eλi t = 0
for 1 ≤ i ≤ N ; then
d
B∗ = ker RN ( )
dt
Especially in the Subspace Nevanlinna Interpolation Problem, the issue is to model not one,
but an entire subspace of vector-exponential trajectories associated with the same frequency of
the exponential, i.e. the data consists of the exponential trajectories in
Vi eλi t := {veλi t | v ∈ Vi , Vi linear subspace of Cw },
i = 1, . . . , N
96
6 Polynomial J-spectral factorization
Observe that this problem can also be interpreted as that of modeling the data
lin span
N
[
i=1
{vij eλi t }j=1,...,dim(Vi )
where {vij }j=1,...,dim(Vi ) is a basis for Vi , i = 1, . . . , N .
6.2.2
Data dualization, semi-simplicity, and the Pick matrix
In order to state the main result of this section, we need some preliminaries. Let Σ ∈ Rw×w be
a symmetric, nonsingular matrix, and consider λi ∈ C+ , i = 1, . . . , N . We first introduce data
dualization. Consider the set of trajectories
Vi eλi t := {veλi t |v ∈ Vi }
and we define its dual set as
Vi ⊥Σ e−λ̄i t := {we−λ̄i t |w ∗ Σv = 0 for all v ∈ Vi }
We call the set
λi t
∪N
∪ Vi⊥Σ e−λ̄i t }
i=1 {Vi e
(6.5)
the dualized data set. In the following it will be shown that by modeling the dualized dataset, a unfalsified model exhibiting a special (“Σ-unitary”) structure can be obtained. This
special structure has special importance in considering the solution to the Subspace Nevanlinna
Interpolation Problem (see [84]).
Next, we define the concept of semi-simplicity of a one-variable polynomial matrix. Let
Z ∈ Rw×w [ξ]; Z is semi-simple if for all λ ∈ C the dimension of the kernel of Z(λ) is equal
to the multiplicity of λ as a root of det(Z). Note that if det(Z) has distinct roots then Z is
semi-simple.
Finally, we introduce the notion of Pick matrix associated with the data {(λi , Vi )}i=1,...,N .
Let Vi ∈ Rw×dim(Vi ) be a full column rank matrix such that Im(Vi ) = Vi , i = 1, . . . , N . The
P
PN
( N
i=1 dim(Vi )) Hermitian matrix
i=1 dim(Vi )) × (
T{Vi }i=1,...,N :=
h
Vi∗ ΣVj
λ̄i +λj
i
i,j=1,...,N
(6.6)
is called a Pick matrix associated with {(λi , Vi )}i=1,...,N . Of course T{Vi }i=1,...,N depends on the
particular basis matrices Vi chosen for Vi , but it is easy to see that the inertia of all these Pick
matrices is the same.
6.2.3
A procedure for Σ-unitary modeling
Consider a nonsingular matrix Σ = ΣT ∈ Rw×w . A polynomial matrix R ∈ Cw×w [ξ] is said to be
Σ-unitary if there exists p(ξ) ∈ C[ξ], p 6= 0, such that
RΣR∼ = R∼ ΣR = pp∼ Σ
6.2. Σ-unitary modeling of dualized data
97
where R∼ := R? (−ξ). Assume now that a set consisting of vector-exponential data {vi eλi t }i=1,...,N
is to be modeled, and that the characteristic frequencies λi are all distinct; then det(R) =
ΠN
i=1 (ξ − λi ) (see [7] for the case when the characteristic frequencies are repeated).
Recall that the Pick matrix for Vi eλi t , i = 1, . . . , N is defined as T{Vi }i=1,...,N = [Vi∗ ΣVj /(λ̄i +
λj )]N
i,j=1 , where Vi is a basis for Vi . We call the matrix:
T{Vi }i=1,...,k := [Vi∗ ΣVj /(λ̄i + λj )]ki,j=1 , k ≤ N
the k-th order principal block submatrix of T{Vi }i=1,...,N , and det T{Vi }i=1,...,k the k-th order principal block minor of T{Vi }i=1,...,N .
The following result gives a sufficient condition for the existence of a Σ-unitary model of a
dualized data set (6.5).
Theorem 6.2.2 Assume that the Hermitian matrices T{Vi }i=1,...,k , k = 1, . . . , N are nonsingular,
i.e. every principal block minor of T{Vi }i=1,...,N = [Vi∗ ΣVj /(λ̄i + λj )]N
i,j=1 is nonzero. Then the
⊥Σ −λ̄i t
N
λi t
MPUM for the dualized data set ∪i=1 {Vi e ∪ Vi e
} has a Σ-unitary kernel representation,
d
w×w
λi t
∪ Vi⊥Σ e−λ̄i t } and
i.e. there exists R̂ ∈ C [ξ] such that R̂( dt )w = 0 for all w ∈ ∪N
i=1 {Vi e
R̂? (−ξ)ΣR̂(ξ) = p? (−ξ)p(ξ)Σ, p 6= 0.
Proof: Let V1 ∈ Rw×dim(V1 ) be a full-column rank matrix such that Im(V1 ) = V1 . By assumption, V1∗ ΣV1 /(λ̄1 + λ1 ) is nonsingular. Consider the w × w matrix
−1
R̂1 (ξ) := (ξ + λ̄1 )Iw − V1 T{V
V ∗Σ
1} 1
(6.7)
It is easily verified that ker(R̂1 ( dtd )) ⊇ V1 eλ1 t ∪ V1⊥Σ e−λ̄1 t . Now observe that since T{V1 } is
¯
nonsingular, it holds that dim(lin span(V1 eλ1 t ∪ V1⊥Σ e−λ1 t )) = w. Since deg(det(R̂1 )) = w, it
follows that ker(R̂1 ( dtd )) is the MPUM for the dualized data associated with the subspace V1 . It
is a matter of straightforward verification to see that R̂1 is a Σ-unitary matrix. Then, R̂ = R̂1 .
The error subspaces of R̂1 ( dtd ) are generated by the matrices
−1
Vi0 := R̂1 (λi )Vi = (λi + λ̄1 )Vi − V1 T{V
V ∗ ΣVi , i = 2, . . . , N
1} 1
We first verify that the (i − 1, j − 1)-th block of T{Vi0 }2≤i≤N is
Vi0∗ ΣVj0
(λ1 + λ̄i )(λj + λ¯1 ) T
−1
=
Vi ΣVj − ViT ΣV1 T{V
V ∗ ΣVj , i, j = 2, . . . , N
1} 1
λj + λ̄i
λj + λ̄i
Now partition the principal block submatrices of T{Vi }i=1,...,N as follows:
"
#
T{V1 }
b∗k
T{Vi }i=1,...,k =
k = 2, . . . , N
bk
T{Vi }i=2,...,k
with bk := col
"
Vi∗ ΣV1
λ̄i +λ1
Idim(V1 )
0
−1
−∆k bk T{V1 } ∆k
#
, i = 2, . . . , k. Let ∆k := diag(λ̄i + λ1 )i=2,...,k . Observe that
T{Vi }i=1,...,k
#
−1 ∗
∆
Idim(V1 ) −T{V
b
k
1} k
0
∆k
#
"
T{V1 }
0
, k = 2, . . . , N
=
−1 ∗
b ∆k
0
∆k T{Vi }i=2,...,k ∆k − ∆k bk T{V
1} k
"
98
6 Polynomial J-spectral factorization
−1 ∗
It is not difficult to see that the (i, j)-th block of ∆k T{Vi }i=2,...,k ∆k − ∆k bk T{V
b ∆k equals the
1} k
(i, j)-th block of T{Vi0 }2≤i≤k , i, j = 1, . . . , k and k = 2, . . . , N . Since the above equality also holds
for k = 2, we have shown that T{V20 } is nonsingular. Hence we can define a representation R̂2 of
the MPUM for V20 eλ2 t and V20⊥Σ e−λ̄2 t , similar to that in equation (6.7). Define R̂ = R̂2 R̂1 . Thus,
R̂ is a Σ-unitary representation of the MPUM for Vi eλi t ∪ Vi e−λ̄i t , i = 1, 2 and further, every
principal block minor of T{Vi0 }2≤i≤N is nonzero. An inductive argument completes the proof. The proof of Theorem 6.2.2 implies the following result.
Corollary 6.2.3 If every principal block minor of the Pick matrix (6.6) is nonsingular, then
the Pick matrix associated with the k-th error subspace of the modeling procedure in the proof
of Theorem 6.2.2 is also nonsingular.
Remark 6.2.4 We show with a counterexample that the converse implication of Theorem
6.2.2, i.e. that the existence of a Σ-unitary model implies that every principal block minor of
the Pick matrix is nonzero, does not hold true. Let w = 2 and
"
#
1 0
Σ=
0 −1
h
iT
and associated with λ ∈ R+ . Observe
Consider the subspace V generated by v := 1 1
that such subspace is associated with a singular Pick matrix, since v is self-orthogonal in the
indefinite inner product induced by Σ. A model for Veλt + V ⊥Σ e−λt is
"
#
1 2
1 2
2
ξ
−
λ
ξ
2
2
1 2
1 2
ξ
ξ
− λ2
2
2
which is easily seen is Σ-unitary.
6.3
J-spectral factorization via Σ-unitary modeling
In this section we illustrate the application of the ideas of the previous section to the problem
of computing a J-spectral factor of a para-Hermitian matrix Z ∈ Rw×w [ξ]. The main result of
this section is an algorithm for J-spectral factorization based on Σ-unitary modeling. In the
process of deriving this procedure we also prove some results of independent interest.
In this section a crucial role will be played by the association of a two-variable polynomial
matrix to a one-variable polynomial matrix, a technique introduced as lifting in [97] (see [58, 97]
for examples of applications of this idea). In the following, we associate with a para-Hermitian
matrix Z ∈ Rw×w [ξ] a matrix Φ ∈ Rw×w [ζ, η] such that Φ(−ξ, ξ) = Z(ξ). The following result
(see Lemma 3.1 of [97]) characterizes the matrices Φ(ζ, η) satisfying this condition.
P
Proposition 6.3.1 A symmetric matrix Φ(ζ, η) = Li,j=0 Φi,j ζ i η j satisfies
Φ(−ξ, ξ) = Z(ξ) =:
M
X
Zk ξ k
k=0
if and only if
Zk = Φ0,k − Φ1,k−1 + Φ2,k−2 − . . . + (−1)k Φk,0
for all k = 0, . . . , M .
6.3. J-spectral factorization via Σ-unitary modeling
99
Observe that for example the matrix Φ(ζ, η) := 12 (Z(ζ)T + Z(η)) satisfies the condition of
Proposition 6.3.1; see also formula (3.3) in [97] for another example.
We now proceed to prove some important results concerning the properties of the twovariable matrix Φ that can be associated to a para-Hermitian Z.
Theorem 6.3.2 Let Z ∈ Rw×w [ξ] be para-Hermitian. Then there exists Φ ∈ Rw×w
s [ζ, η] such
that
1. Φ(−ξ, ξ) = Z(ξ);
2. Φ admits a canonical factorization
"
#"
#
h
i Σ Σ
D(η)
1
2
Φ(ζ, η) = D(ζ)T N (ζ)T
T
Σ2 Σ3 N (η)
#
"
Σ1 Σ2
with Σ =
∈ Rw+q and Σ1 ∈ Rw×w , D ∈ Rw×w [ξ] nonsingular, N ∈ Rq×w [ξ], such
ΣT2 Σ3
that N D −1 is strictly proper.
If a symmetric canonical factorization of Φ(ζ, η) satisfies condition (2) in the Theorem statement, the factorization is called a “strictly proper symmetric canonical factorization” of Φ.
Proof: Using Proposition 6.3.1 we first find a Ω(ζ, η) = K(ζ)T ΣΩ K(η) such that Ω(−ξ, ξ) =
Z(ξ). From Theorem 3.3.22 p. 90 of [75] we can conclude the existence of an input-output
partition of the external variables of Im K( dtd ) with an associated proper (but not necessarily
strictly proper) transfer function. Consequently, without loss of generality we can assume that
" #
D
K=
N
with D ∈ Rw×w [ξ] nonsingular, N ∈ Rq×w [ξ], such that N D −1 is proper, but not necessarily
strictly proper.
Now compute a unimodular matrix U ∈ Rw×w [ξ] such that DU is column proper. Observe
that K 0 := KU = col(D 0 , N 0 ) is also associated with a proper transfer function, since N 0 D 0−1 =
(N U )(DU )−1 = N D −1 . Now observe that since N 0 D 0−1 is proper and since D 0 is column proper,
it follows that each column of N 0 has degree less than or equal to that of the corresponding
column of D 0 .
We can conclude that there exists a constant matrix X ∈ Rq×w such that N 0 = XD 0 + N 00
with N 00 D 0−1 strictly proper. Now observe that
Z(ξ) = U (−ξ)−T K 00 (−ξ)T Σ0 K 00 (ξ)U (ξ)−1
where
K
00
Σ0
#
" #
Iw 0w×q
D0
:=
KU =
−X Iq
N 00
"
#−T
"
#−1
Iw 0w×q
Iw 0w×q
:=
ΣΩ
−X Iq
−X Iq
"
#
"
#
Iw X T
Iw 0w×q
=
ΣΩ
0q×w Iq
X Iq
"
100
6 Polynomial J-spectral factorization
Observe that K 00 (λ) is full column rank for all λ ∈ C. The existence of the matrix Φ of the
claim follows taking Σ = Σ0 , D = D 0 U (ξ)−1 , N = N 00 U (ξ)−1 .
We proceed to show an important application of the result of Theorem 6.3.2, namely that
if the inertia of Z(iω) is constant for all ω ∈ R, then it equals the inertia of the sub-matrix of
Σ corresponding to the input variables.
Theorem 6.3.3 Let Z ∈ Rw×w [ξ] be para-Hermitian, and assume that σ(Z(iω)) is constant
for all ω ∈ R. Consider Φ(ζ, η) such that Φ(−ξ, ξ) = Z(ξ). Let Φ(ζ, η) = K T (ζ)ΣK(η) be a
strictly proper symmetric canonical factorization of Φ (Theorem 6.3.2). Then:
σ(Z(iω)) = σ(Σ1 )
for all ω ∈ R, where Σ1 ∈ Rw×w is the (1, 1)-block of Σ.
Proof: Consider the strictly proper symmetric canonical factorization of Φ obtained as in
Theorem 6.3.2. Now observe that
"
#"
#
i Σ Σ
h
D(iω)
1
2
Z(iω) = D(−iω)T N (−iω)T
ΣT2 Σ3 N (iω)
= D(−iω)T Σ1 D(iω) + D(−iω)T Σ2 N (iω)
+N (−iω)T ΣT2 D(iω) + N (−iω)T Σ3 N (iω)
Now multiply both sides of this equality by D(−iω)−T on the left and D(iω)−1 on the right,
obtaining
D(−iω)−T Z(iω)D(iω)−1 = Σ1 + Σ2 N (iω)D(iω)−1 + D(−iω)−T N (−iω)T ΣT2
+D(−iω)−T N (−iω)T Σ3 N (iω)D(iω)−1
Taking the limit for ω to infinity, we conclude from the strict properness of N D −1 that
lim D(−iω)−T Z(iω)D(iω)−1 = Σ1
ω→∞
On the other hand, by assumption σ(Z(iω)) is constant, and consequently the claim of the
Proposition is proved.
The result of Theorem 6.3.3 implies that if Z(iω) has constant inertia for all ω ∈ R, one may
infer this inertia from the inertia of Σ1 . Of course, Σ1 is not unique. However, the claim of
Theorem 6.3.3 is that any input-output “partition” that satisfies properties listed in Proposition
6.3.2 can be used to infer the inertia of Z(ξ). Observe that this test involves only standard
polynomial- and numerical matrix computations and as a consequence is easier to carry out
than that on the inertia of Z(iω) for a general ω ∈ R.
We now prove that the Σ-unitary model (6.7) used in the proof of Theorem 6.2.2 maps
“strictly proper” image representations in “strictly proper” ones, in the following sense.
Lemma 6.3.4 Let col(G, F ) ∈ R(w+q)×w [ξ] be such that F G−1 is strictly proper. Let R̂ be
a Σ-unitary model for Veλt , as in (6.7). Then R̂ · col(G, F ) also represents a strictly proper
transfer function, i.e.
" # " #
G
Ĝ
R̂
=
F
F̂
is such that F̂ Ĝ−1 is strictly proper.
6.3. J-spectral factorization via Σ-unitary modeling
101
Proof: Recall from equation (6.7) that R̂ is:
−1 ∗
R̂(ξ) := (ξ + λ̄)I(w+q) − V T{V}
V Σ
where V ∈ R(w+q)×dim(V) is a full-column rank matrix such that Im(V ) = V. Now partition
" #
"
#
V1
Σ1 Σ2
V =
and Σ =
V2
ΣT2 Σ3
compatibly with w and q, and write
"
#
−1
−1
(ξ + λ̄)Iw − V1 T{V}
(V1∗ Σ1 + V2∗ ΣT2 )
−V1 T{V}
(V1∗ Σ2 + V2∗ Σ3 )
R̂(ξ) =
−1
−1
−V2 T{V}
(V1∗ Σ1 + V2∗ ΣT2 )
(ξ + λ̄)Iq − V2 T{V}
(V1∗ Σ2 + V2∗ Σ3 )
#
"
D∼ N ∼
=:
Q −P
Observe that D and P are nonsingular matrices, since their determinant has degree w, respectively q.
Observe that the transfer function associated with R̂ · col(G, F ) is (QG − P F )(D ∼ G +
N ∼ F )−1 , which can be rewritten as
P (P −1 Q − F G−1 )GG−1 (I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1
= P (P −1 Q − F G−1 )(I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1
−1
Since D ∼ = (ξ + λ̄)Iw − V1 T{V}
(V1∗ Σ1 + V2∗ ΣT2 ) is column proper, it is easy to see that (D ∼ )−1
is strictly proper. Observe also that the matrix (I + (N D −1 )∼ F G−1 ) is bi-proper because as
ξ → ∞, (I + (N D −1 )∼ F G−1 ) → I. Therefore (I + (N D −1 )∼ F G−1 )−1 is also bi-proper. Hence
we conclude that every entry of
(I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1
is a rational function with the denominator having degree at least equal to the degree of the
numerator plus one. Now observe that P −1 Q is strictly proper, and consequently P −1 Q−F G−1
is a matrix of strictly proper rational functions. Conclude that
(P −1 Q − F G−1 )(I + (N D −1 )∼ F G−1 )−1 (D ∼ )−1
is also a matrix of strictly proper rational functions.
−1
(V1∗ Σ2 + V2∗ Σ3 ) is a polynomial of
Now observe that every entry of P = −(ξ + λ̄)Iw + V2 T{V}
degree at most one. Conclude that P (P −1 Q + F G−1 )(I + (N D −1 )F G−1 )−1 (D ∼ )−1 is a matrix
of strictly proper rational functions, as was to be proved.
Having proved these important preliminary results, we can now state the main result of
this chapter. The statement makes use of the notion of observability of a QDF (Definition
1.2.12) and of the McMillan degree of a behavior (Section 2.8). Recall that McMillan degree of
a behavior B is denoted with n(B).
102
6 Polynomial J-spectral factorization
Theorem 6.3.5 Consider Z(ξ) ∈ Rw×w having constant inertia on the imaginary axis. Let
Φ(ζ, η) be a w × w matrix such that Φ(−ξ, ξ) = Z(ξ). Let Φ(ζ, η) = K T (ζ)ΣK(η) be a
strictly proper symmetric canonical factorization of Φ(ζ, η) (Theorem 6.3.2). Assume that Q Φ
is observable and that Φ(−ξ, ξ) is semi-simple. Let λi , i = 1, . . . , N be the singularities of
det(Z) in C+ , and assume that n(Im(K( dtd ))) = N . Assume that every principal block minor
of the matrix [K(λi )∗ ΣK(λj )/(λ̄i + λj )]N
i,j=1 is nonzero. Define K1 := K, and consider the
recursion for i = 1, . . . , N :
1. Vi := full column-rank matrix such that Im(Vi ) = Im(Ki (λi ));
−1
2. R̂i (ξ) := (ξ + λ̄i )I(w+q) − Vi T{V
V ∗ Σ;
i} i
3. Ki+1 (ξ) :=
R̂i (ξ)Ki (ξ)
;
ξ−λi
Then the matrix KN +1 is such that KN +1 = col(GN +1 , 0), with GN +1 ∈ Rw×w [ξ] a Hurwitz
Σ1 -spectral factor of Z(ξ), i.e.
Z(ξ) = Φ(−ξ, ξ) = GN +1 (−ξ)T Σ1 GN +1 (ξ)
Proof: It has been stated in Corollary 6.2.3 that since every principal block minor of the matrix
[K(λi )∗ ΣK(λj )/(λ̄i + λj )]N
i,j=1 is nonzero, the Pick matrix associated with the error subspace
of the model R̂i at the i-th iteration is nonsingular. This implies that the one-step model R̂i
can be defined at each iteration of step 2 of the above recursion.
Now observe that since at the i-th step the matrix R̂i is a kernel representation of Im(Ki (λi )),
in other words R̂i (λi )Ki (λi ) = 0, it must necessarily hold that (ξ − λi ) is a factor of R̂i (ξ)Ki (ξ).
i (ξ)
= Ki+1 (ξ) is polynomial.
This implies that the matrix R̂i (ξ)K
ξ−λi
It follows from the result of Lemma 6.3.4 that the model R̂i also preserves the strict properi (ξ)
ness of Ki+1 in the step Ki (ξ) → R̂i (ξ)K
= Ki+1 (ξ) of the algorithm.
ξ−λi
We now prove that the Σ-unitariness of R̂i also implies that
Ki∼ ΣKi = Φ(−ξ, ξ) = Z(ξ)
for all i = 1, . . . , N +1. The claim is true by assumption for i = 1. Note that for i = 2, . . . , N +1:
∼
∼
Ki−1
R̂i−1
R̂i−1 Ki−1
Σ
−ξ − λ̄i−1 ξ − λi−1
∼
Ki−1
Ki−1
=
(−ξ − λ̄i−1 )(ξ − λi−1 )Σ
ξ − λi−1
−ξ − λ̄i−1
∼
= Ki−1 ΣKi−1
Ki∼ ΣKi =
= Z(ξ) because of inductive assumption
Now denote Ki := col(Gi , Fi ), i = 1, . . . , N + 1, with Gi ∈ Cw×w [ξ] and Fi ∈ Cq×w [ξ]. We
prove by induction that the “denominator” Gi of Ki is nonsingular for i = 1, . . . , N + 1. The
statement is true by assumption for i = 1. Assume now that the claim holds true for i < j,
and partition the (j − 1)-th model R̂j−1 as
"
#
∼
∼
Dj−1
Nj−1
R̂j−1 (ξ) =
Qj−1 −Pj−1
6.3. J-spectral factorization via Σ-unitary modeling
103
with Dj−1 ∈ Rw×w [ξ], Nj−1 ∈ Rq×w [ξ], Qj−1 ∈ Rq×w [ξ], Pj−1 ∈ Rq×q [ξ] defined as in the proof of
Lemma 6.3.4. We prove the claim for i = j. Observe that
∼
∼
Dj−1
Gj−1 + Nj−1
Fj−1
ξ − λj−1
−1 ∼
∼
Dj−1 [I + (Nj−1 Dj−1
) Fj−1 G−1
j−1 ]Gj−1
=
(ξ − λj−1 )
Gj =
(6.8)
∼
Observe that Dj−1
is nonsingular by construction; that Gj−1 is nonsingular by inductive as−1 ∼
sumption; and that [I + (Nj−1 Dj−1
) Fj−1 G−1
j−1 ] is also nonsingular due to the strict-properness
−1
−1
of Nj−1 Dj−1 and Fj−1 Gj−1 . Conclude that Gj is also nonsingular as was to be proved.
We now prove that deg(det(Gi )) = deg(det(Gi−1 )), i = 2 . . . , N + 1. From (6.8) it follows
that
∼
det(Di−1
)
det(Gi )
−1 ∼
=
det([I + (Ni−1 Di−1
) Fi−1 G−1
i−1 ])
det(Gi−1 )
det((ξ − λi−1 )Iw )
∼
∼
Observe that since Di−1
= (ξ+λ̄)Iw +constant, it follows that deg(det(Di−1
)) = w = deg(det((ξ−
∼
det(Di−1 )
λi−1 )Iw ), and consequently that det((ξ−λi−1 )Im ) is a proper, but not strictly-proper, transfer
−1 ∼
function. Moreover, it follows from the strict-properness of Fi−1 G−1
i−1 and of (Ni−1 Di−1 )
−1 ∼
that I + (Ni−1 Di−1
) Fi−1 G−1
i−1 is a matrix of bi-proper rational functions, and consequently
that its determinant is a proper, but not strictly-proper, rational function. Conclude that
det(Gi )
is a proper, but not strictly-proper, rational function. This concludes the proof of
det(Gi−1 )
deg(det(Gi )) = deg(det(Gi−1 )), i = 2, . . . , N + 1.
For each i, i = 1, . . . N + 1, let Ki = Ki0 Ui , with Ki0 ∈ C(w+q)×w [ξ] right prime, and Ui ∈
Cw×w [ξ] a greatest common right divisor of Ki . Partition Ki0 compatibly with w, q as Ki0 =
col(G0i , Fi0 ). We now show that deg(det(G0i+1 )) < deg(det(G0i )), i = 1, . . . , N , i.e. that the
degree of the determinant of the “denominator” G0i associated with the transfer function of
Ki0 decreases with i, i = 1, . . . , N . Observe that since we have already proved above that
deg(det(Gi )) = deg(det(Gi−1 )),i = 2, . . . , N + 1, it follows that
deg(det(G0i Ui )) = deg(det(G0i ) det(Ui )) = deg(det(G0i )) + deg(det(Ui ))
= deg(det(G0i−1 )) + deg(det(Ui−1 )),
Hence, to prove the claim, it is equivalent to prove that deg(det(Ui )) > deg(det(Ui−1 )), i =
2, . . . N + 1.
In order to prove the above statement, observe first that since det(Φ(−λ̄i , λi )) = 0, it follows
that there exist vij ∈ Cw , j = 1, . . . , dim(ker(Φ(−λ̄i , λi ))), such that
vij∗ Φ(−λ̄i , λi ) = vij∗ K ? (−λ̄i )ΣK(λi ) = (K(−λ̄i )vij )∗ ΣK(λi ) = 0
Given that Φ(−ξ, ξ) is semi-simple by assumption, it follows that Im(K(−λ̄i )) contains exactly dim(ker(Φ(−λ̄i , λi ))) vectors which are Σ-orthogonal to Im(K(λi )). Since R̂i also models
¯
(Im(Ki (λi )))⊥Σ e−λi t , it follows that R̂i (−λ̄i )Ki (−λ̄i )vij = 0, j = 1, . . . , dim(ker(Φ(−λ̄i , λi ))).
Hence, every greatest common right divisor of Ki has exactly dim(ker(Φ(−λ̄i , λi )) singularities
in −λ̄i . Hence it follows that deg(det(Ui )) = deg(det(Ui−1 ))+dim(ker(Φ(−λ̄i , λi ))) which shows
that deg(det(G0i )) < deg(det(G0i−1 )), i = 2, . . . , N + 1. The claim just proved also shows that
deg(det(G0N +1 )) = 0, i.e. that G0N +1 is unimodular, since by the semi-simplicity assumption
104
6 Polynomial J-spectral factorization
and the fact that N equals the McMillan degree of Im(K( dtd )) = deg(det(G1 )), the number of
P
PN
singularities of Φ(−ξ, ξ) is exactly N
i=1 dim(ker(Φ(−λ̄i , λi ))) =
i=1 deg(det(Ui )).
−1
0
0
0
Since deg(det(GN +1 )) = 0 and GN +1 FN +1 is strictly proper, it must be that FN0 +1 = 0.
Hence we conclude that FN +1 = 0. This proves the first statement of the Theorem. Moreover,
because Ki∼ ΣKi = Φ(−ξ, ξ) = Z(ξ) for i = 1, . . . , N + 1:
"
#
h
i
GN +1 (ξ)
KN∼+1 ΣKN +1 = G∼
N +1 0 Σ
0
= G∼
N +1 Σ1 GN +1 = Z(ξ)
Since GN +1 = G0N +1 UN +1 with G0N +1 unimodular and UN +1 Hurwitz, it follows that GN +1 is a
spectral factor, as was to be proved.
We now examine the case when the matrix K coming from the symmetric canonical factorization of Φ in Theorem 6.3.5 is unobservable, i.e. K = K 0 U with U ∈ Rw×w [ξ] such that
dim(ker(U (λ))) > 0 for some λ ∈ C, and K 0 ∈ R(q+w)×w [ξ] right prime.
Proposition 6.3.6 Let Φ(ζ, η) = K T (ζ)ΣK(η) be a strictly proper symmetric canonical factorization, and assume that Φ(−ξ, ξ) is semi-simple. Assume that K ∈ R(w+q)×w [ξ] is such that
there exists λ ∈ C not purely imaginary, such that dim(ker(K(λ))) > 0. Let R̂ be a Σ-unitary
kernel representation of the model for the dualized data induced by Im(K(λ)). Then
K 0 (ξ) :=
R̂(ξ)K(ξ)
ξ−λ
is such that dim(ker(K 0 (−λ̄))) = dim(ker(K(λ))). Moreover,
Φ0 (ζ, η) := K 0 (ζ)T ΣK 0 (η)
is such that Φ0 (−ξ, ξ) = Φ(−ξ, ξ).
Proof: It has been argued in the proof of Theorem 6.3.5 that if R̂ is a Σ-unitary kernel
representation of the model for the dualized data induced by Im(K(λ)), it holds that K 0 (ξ) =
R̂(ξ)K(ξ)
is polynomial. Moreover, it follows from the Σ-unitariness of R̂ that
ξ−λ
K 0∼ ΣK 0 =
K ∼ R̂∼ R̂K
Σ
= K ∼ ΣK
−ξ − λ̄ ξ − λ
We now prove the claim dim(ker(K 0 (−λ̄))) = dim(ker(K(λ))). In order to prove this claim,
observe that if v ∈ Cw is such that K(λ)v = 0 then also Φ(−λ̄, λ)v = 0. Since Φ(−ξ, ξ) is
para-Hermitian, we conclude that
v ∗ Φ(−λ̄, λ) = v ∗ K ? (−λ̄)ΣK(λ) = (K(−λ̄)v)∗ ΣK(λ) = 0
which implies that K(−λ̄)v ∈ Im(K(λ))⊥Σ . Recall that ker(R̂( dtd )) contains Im(K(λ))⊥Σ e−λ̄t ,
hence R̂(−λ̄)K(−λ̄)v = 0. It follows from this and from R̂(ξ)K(ξ) = K 0 (ξ)(ξ − λ) that
(−λ̄ − λ)K 0 (−λ̄)v = R̂(−λ̄)K(−λ̄)v = 0
6.3. J-spectral factorization via Σ-unitary modeling
105
Since λ is not purely imaginary, it follows that (−λ̄ − λ) 6= 0, and consequently that K 0 (−λ̄)v =
0. This holds for every v ∈ ker(K(λ)); the claim is proved.
The result of Proposition 6.3.6 shows that if we are given a canonical symmetric factorization of Φ with a K which has a singularity in λ ∈ C+ , we can compute from it a new
two-variable polynomial matrix Φ0 such that Φ0 (−ξ, ξ) = Z(ξ), and whose canonical factor K 0
has a singularity in −λ̄ ∈ C− . Observe that this is particularly relevant in the application of
the recursions (1) − (3) of Theorem 6.3.5, since the spectral factor produced at the end of the
algorithm is ensured to be Hurwitz in this case.
The result of Theorem 6.3.2 and Proposition 6.3.6, and the recursion of Theorem 6.3.5 suggest the following algorithm in order to compute a J-spectral factor of a para-Hermitian matrix
Z.
Algorithm
Input: Semi-simple para-Hermitian matrix Z ∈ Rw×w [ξ] with constant inertia on the imaginary axis.
Output: A Σ1 -spectral factorization Z = D ∼ Σ1 D.
Compute Φ ∈ Rw×w [ζ, η] s.t. Φ(−ξ, ξ) = Z(ξ)
(* Use Proposition 6.3.1
*)
Compute a strictly proper canonical factorization: Φ(ζ, η) = K(ζ) T ΣK(η);
Apply Proposition 6.3.6 in order to remove any singularities of K in the right
half-plane;
(* Comment: Φ resulting from previous step satisfies *)
(* the assumptions of Theorem 6.3.5
*)
Compute the roots λi , i = 1, . . . , N of det(Z) in C+ ;
Define K1 := K;
For i = 1, . . . , N do
Compute full column rank matrix Vi ∈ C(w+q)ו such that Im(Vi ) = Im(Ki (λi ));
∗
Compute R̂i (ξ) := (ξ + λ̄i )I(w+q) − Vi T{−1
Im(K (λ ))} Vi Σ
i
Define Ki+1 (ξ) :=
R̂i (ξ)Ki (ξ)
;
ξ−λi
end;
Return the first w rows of KN +1 ;
i
106
6.4
6 Polynomial J-spectral factorization
Numerical Aspects of the Algorithm
Several issues need be investigated for a good practical implementation of the algorithm presented above. The treatment here is far from exhaustive, and is more a summary of the problems
we faced during an implementation, and their possible solutions. Our algorithm has the three
basic operations:
1. Compute a symmetric canonical factorization that meets conditions in Proposition 6.3.2.
2. Compute the spectrum of the given para-Hermitian matrix.
3. Implement the iterations and division by (ξ − λi ).
We now examine each of these in some detail and discuss the manner in which we implemented
the algorithm.
6.4.1
Symmetric Canonical Factorization
The starting step in the Algorithm is a “pre-factorization” of a given para-Hermitian matrix, usP
ing a symmetric canonical factorization of a bivariate polynomial matrix. Let Z(ξ) = dk=0 Zk ξ k
with Zk ∈ Rw×w and d the least integer such that Zd+1 = Zd+2 = . . . = 0. We call d the degree
of Z(ξ). A necessary condition for the existence of a Σ-spectral factorization is that d is even.
Note that Z(ξ) can be written as follows:
Z(ξ) =
h
(−ξ)
d/2
I (−ξ)
d/2−1
I ... I

i
×
(−1)d/2 Zd
(−1)d/2 Zd−1 /2
0...

d/2−1
d/2−1
d/2−1
Zd−1 /2 (−1)
Zd−2 (−1)
Zd−3 /2 . . .
 (−1)

d/2−2
d/2−2

0
(−1)
Zd−3 /2 (−1)
Zd−4
...

..

0
.
···
Z1 /2

0
...
−Z1 /2
Z0
|
{z
S:=
(6.9)


ξ d/2 I

 d/2−1
 ξ
I

..

.


I
{z
|
}
K:=
Define Σ = (S + S T )/2. With Φ(ζ, η) = K T (ζ)ΣK(η) it is easy to verify that:






}
1. Z(ξ) = Φ(−ξ, ξ) = K T (−ξ)ΣK(ξ).
2. Denote the first w rows of K(ξ) as Q(ξ), and the rest as P (ξ). Then P Q−1 is a rational
matrix of strictly proper rational functions.
3. If Zd is nonsingular, and Z(ξ) has constant inertia on the imaginary axis, inertia of Z(iω)
is the same as that of Zd . (Proposition 6.3.3).
We now prove an important implication of the pre-factorization (6.10):
Lemma 6.4.1 If Zd is nonsingular, the number of singularities of Z(ξ) in the open right half
plane is exactly equal to wd/2, which is also the McMillan degree of ImK( dtd )
6.4. Numerical Aspects of the Algorithm
107
P
Proof: Let Z(ξ) = dk=0 Zk ξ k . If Zd is nonsingular, we can define L(ξ) : Zd−1 Z(ξ) = ξ d Iw +
Ld−1 ξ d−1 . . . + L0 with Li = Zd−1 Zi , i = 0 . . . d − 1. Clearly, roots of det Z(ξ) = 0 are the same
as roots of det L(ξ) = 0. We define the matrix C ∈ Rwd×wd :


0
I
0
0


 0
0
I
... 


C= .

.
0
0
I

 .
−L0 −L1 . . . −Ld−1
It is shown in [29] that det L(ξ) = det(ξIdw − C). Hence, roots of det L(ξ) (and consequently
those of det Z(ξ)) are precisely the eigenvalues of C, and therefore dw in number. Since Z(ξ) is
para-Hermitian with no singularities on the imaginary axis, the number of singularities in the
open right half plane is wd/2.
Since P (ξ)Q−1 (ξ) is a matrix of strictly proper rational functions, the McMillan degree of
ImK( dtd ) = deg det ξ d/2 Iw , which is precisely wd/2.
Thus, many “good” things happen if Zd is nonsingular. Such polynomial matrices are called
regular:
Definition 6.4.2 A polynomial matrix Z(ξ) =
regular polynomial matrix.
Pd
k=0
Zk ξ k with Zd nonsingular is called a
Using the prefactorization of Z in the “triple banded form” as shown in equation (6.10), we can
start the algorithm in the regular case. When Z is not regular, we say that Z has singularities
at infinity. An important special case is when Z is unimodular, i.e.,det Z = constant, 6= 0,
when all singularities of Z are said to be at infinity. We use a method due to Aliev and Larin
[1] to convert a polynomial matrix that is not regular, into one that is. The regular polynomial
matrix so obtained can then be pre-factored as in equation (6.10) and the algorithm started.
Algorithm for converting a non-regular para-Hermitian matrix into a regular
para-Hermitian matrix.
Input Z(ξ) ∈ Rw×w [ξ] which is not regular.
Output Y (ξ) ∈ Rw×w [ξ] which is regular.
Initialize i = 1 and Y (ξ) = Z(ξ)
1. While Y (ξ) ∈ Rw×w [ξ] is not regular, repeat steps 2-5:
P
2. Write Y (ξ) = di=0 Yi ξ i . Since Yd is symmetric and singular, there exist Ui , Λi ∈ Rw×w
such that Yd = Ui Λi UiT with Λi = diag [λ1 , . . . , λk , 0 . . . , 0], k < w and Ui UiT = Iw .
3. Define Y 0 (ξ) = UiT Y Ui = [yij ] i, j = 1, . . . , w with yij polynomials in ξ having degree δij .
Consider the k + 1-th row of Y 0 . Let 2δ0 = δk+1 k+1 be the degree of the polynomial at
the diagonal position in this row. Let δm be maxi6=k+1 δk+1,i be the maximum degree of
the off-diagonal polynomial in this row. Define δ1 = min(d − δ0 , 2d − δm ).
4. Define Ti (ξ) = diag [1, . . . 1, (ξ + 1)δ1 , 1 . . . 1] with (s + 1)δ1 being at the k + 1-th position.
108
6 Polynomial J-spectral factorization
5. Update: Y (ξ) ← TiT (−ξ)UiT Y Ui Ti (ξ).
i ← i + 1.
Since at every step in the above procedure, the degree of Y remains the same (i.e. remains d),
the recursion in step 5 above is guaranteed to terminate after a finite number of steps. One
can obtain a Σ-spectral factorization of Y (ξ) using our algorithm: Y (ξ) = F T (−ξ)ΣF (ξ). A
(Hurwitz) spectral factor of Z can then be easily computed from a (Hurwitz) spectral factor of
Q
Y , and is seen to be F (ξ)X −1 (ξ) where X(ξ) = i Ui Ti taken in the sense of a left multiplication.
Notice that explicit inversion of X(ξ) is not necessary since:
X −1 (ξ) =
adjX(ξ)
det X(ξ)
The adjoint matrix is easily calculated by noting that with Ti (ξ) = diag [1, . . . 1, (ξ +1)δ1 , 1 . . . 1],
adjoint Ti (ξ) is simply diag [(s + 1)δ1 , . . . (s + 1)δ1 , 1, (s + 1)δ1 . . . (s + 1)δ1 ]. Further, det X(ξ) is
of the form (s + 1)δ , for some δ since det Ti = (s + 1)δ1 , and det Ui = 1.
6.4.2
Computing singularities
Singularities of the para-Hermitian matrix Z(ξ) are by definition roots of det Z(ξ) = 0. However, this naive approach is not preferred due to the following numerical reasons:
1. The complexity of determinant computation of a w × w matrix is O(w!) and therefore, it
is prohibitive for large w.
2. Especially in the context of polynomial matrices, determinant computation results in a
large number of spurious terms. These terms are floating point errors accumulated due
to the phenomenally large number of computations.
3. Even if the determinant could be computed as a polynomial, computing spectra is not
straightforward: root computation programs that come with most general purpose computational packages (for example “Scilab”) are unable to compute roots of a polynomial
of degree greater than 100.
These considerations motivate the use of some other “equivalent” methods for computation of
singularities that are better from a numerical viewpoint. It is known that the roots of a (scalar)
polynomial are the eigenvalues of a certain matrix known as a “companion matrix”[22]. A
generalization of this well-known result is that the singularities of a polynomial matrix are the
generalized eigenvalues of a “block companion matrix”.
P
Proposition 6.4.3 Let Z(ξ) = dk=0 Zk ξ k ∈ Rw×w [ξ]. Then the finite singularities of Z(ξ) (i.e.
the roots of det Z(ξ) = 0) are the finite generalized eigenvalues of the matrix pencil ξA − B
where A ∈ Rdw×dw , B ∈ Rdw×dw with




I
0
0 0
0
I
0
0




 0 I ... 0

 0
0
I
... 




A= .
 and B =  ..

.
0 I 0 
0
0
I
 .

 .
0
...
0 Zd
−Z0 −Z1 . . . −Zd−1
6.4. Numerical Aspects of the Algorithm
Proof: This result is well known. See [49] for a proof.
109
Computation of eigenvalues is a highly developed and specialized field. Some methods of
computing eigenvalues have been summarized in [14] which gives “templates” for computing
eigenvalues depending on the structure of the problem in hand. Note that the matrices A, B
are sparse and we need only half (i.e positive) generalized eigenvalues of the matrix pencil. We
used the package ARPACK (http:// www.caam.rice.edu/software/ARPACK/) to compute
the positive singularities of Z.
6.4.3
Implementation of iterations
There are two main numerical issues in this step
1. Computing the Σ-unitary model
2. Polynomial division by (ξ − λ).
The first step involves a computation of the following nature
R̂(ξ) = (ξ + λ̄i )I − Vi T −1 Vi∗ Σ
where T = (Vi∗ ΣVi )/(λi + λ̄i ) is the i-th stage Pick matrix. Define T 0 = Vi∗ ΣVi . Then,
T −1 = (λi + λ̄i )T 0−1 . Explicit inversion of the Pick matrix is not preferred because it could lead
to a very sensitive solution if T is badly conditioned, and secondly, it is heavy in terms of the
number of computations and the memory required. Therefore, we do the following: we solve a
linear system of equations T 0 x = b where b = Vi∗ Σ. Then,
R̂(ξ) = (ξ + λ̄i )I − (λi + λ̄i )Vi x
is the Σ-unitary model. Implementing the Σ-unitary model by solving an associated linear
equation is computationally more robust and efficient than an explicit implementation.
We now come to the last issue namely division by (ξ−λ). In every iteration in the Algorithm,
we get matrices Ki (ξ) in which (ξ − λi ) is a factor. This division is implemented as follows: let
P
Ki (ξ) = di=0 Ci ξ i with Ci “tall” complex matrices. Ki (ξ)/(ξ − λi ) can be computed by doing
a long division with the blocks Ci directly, rather than doing element-by-element long division.
The first term in the quotient is easily seen to be Cd ξ d−1 . The second term is (Cd−1 +
λCd )ξ d−2 . By induction, the i-th term is (Cd−i+1 + λCd−i+2 + λ2 Cd−i+3 + . . . + λi−1 Cd )ξ d−i .
The quotient is explicit and can be easily generated by a recursive loop. This division is by far
the most crucial element in the algorithm. In-exact division can be a source of problems. We
implemented the division in double precision and found the division to be satisfactory.
In the next section, we report some facts about an implementation of the (entire) algorithm.
6.4.4
Computer implementation of polynomial J-spectral factorization
Scilab is a computer algebra package developed by researchers at INRIA, France. It is distributed under a license similar to the GNU general public license and is freely available through
110
6 Polynomial J-spectral factorization
the Internet (http://www.scilabsoft.inria.fr). Scilab has good support for polynomial computations. We implemented the algorithm presented in this Chapter in Scilab. We need the
concept of “element wise error” in order to evaluate the performance of the algorithm. Suppose
Z(ξ) ∈ Rw×w [ξ] admits a Σ-spectral factorization given by F T (−ξ)ΣF (ξ). We define:
E(ξ) = Z(ξ) − F T (−ξ)ΣF (ξ)
Due to numerical errors, E(ξ) will in general be nonzero. One can express E(ξ) as
E(ξ) = 1 × 10
−n
d
X
Ei ξ i
i=0
with Ei ∈ Rw×w and n is the least integer such that all coefficients of every polynomial in
Pd
i
i=0 Ei ξ have absolute values less than 10, i.e., every element in E i is less than ±a · bcd...
with a an integer between 1 to 9. Then, by the “element wise error” we mean 1 × 10−n . Below,
we summarize summarize some numerical results. The matrices considered have a fixed degree
and randomly selected coefficients. The performance results are for a Pentium-IV PC (256 MB
RAM) running Scilab-3.1.1 on a Fedora Core-2 linux.
Sr size of Z
1 10 × 10
2 20 × 20
3 40 × 40
4 20 × 20
5 50 × 50
degree
4
4
4
8
6
time taken (sec)
0.261
2.193
21.807
15.19
28.1
Element wise error
1 × 10−8
1 × 10−7
1 × 10−5
1 × 10−4
1 × 10−3
Thus we see that while the algorithm works quite well for a matrices of reasonable complexity
(having 80-100 singularities), the performance deteriorates with the number of singularities.
The possible causes for this deterioration can be:
1. The algorithm requires exact computation of singularities– an error in computing the singularities reflects as an error in the final spectral factor. This however is a fundamental
problem in many algorithms that depend on computation of singularities of polynomial
matrices. A through study of available eigenvalue computation routines is being undertaken to evaluate the “best” routine for computing singularities of a para-Hermitian
matrix.
2. Division by a (ξ − λ) factor in every step in the algorithm could lead to errors, because
this division may not be exact. One way to get around this problem can be the following:
if all singularities are distinct, carry out the algorithm without dividing by (ξ − λi ). After
Q
the last iteration, divide by N
i=1 (ξ − λi ). The accumulation of all factors till the end
reduces the numerical errors to some extent.
6.5
Examples
In this section we provide four examples of the application of our results to J-spectral factorization.
6.5. Examples
111
Example 6.5.1 Consider
"
1 − ξ2
ξ
Z(ξ) =
−ξ
1 − ξ2
#
Note that Z(iω) has inertia (2, 0, 0) for all ω ∈ R. Pre-factor Z(ξ) as


T 
ξ 0
1
0
0 −0.5
−ξ 0


 0 −ξ   0
1 0.5
0 
 0 ξ 
 

Z(ξ) = 


 
1
0.5 1
0  1 0 
0  0
0 1
−0.5 0
0
1
0
1

Observe that the transfer function associated with the choice of the first two variables of
Im(K( dtd )) as inputs, is strictly proper. Observe also that Σ1 = I2 , and that n(Im(K1 ( dtd ))) =
1
deg(det(Z)) = 2. Observe also that the singularities of Z(ξ) in C+ are 0.86 ± i0.5, and that
2
since the inertia of Σ1 , the (1, 1)-block of the matrix Σ, is (2, 0, 0), there exists a I2 -spectral
factorization of Z.
It can be easily verified that every principal block minor of the Pick matrix associated with
the data is nonzero. Consequently the Σ-unitary model (6.7) can be constructed at every step.
We initialize


ξ 0
0 ξ 


K1 (ξ) = 
,
1 0
0 1
and we compute K1 (0.86 + i0.5). It is not difficult to see that K1 (0.86 + i0.5) has full column
rank, and consequently we can define:


0.86 + i0.5
0

0
0.86 + i0.5


V1 = K1 (0.86 + i0.5) = 



1
0
0
1
In step i = 1, we proceed to compute a model R̂1 (ξ) as in (6.7) for the data pair (0.86 +
i0.5, Im(V1 )):


−i0.6 + ξ
−0.4
−0.8 − i0.34 0.34 + i0.2
0.4
−i0.6 + ξ
−0.34 − i0.2 −0.8 − i0.34

−0.8 + i0.34 −0.34 + i0.2 −i0.4 + ξ

0.4
0.34 − i0.2 −0.8 + i0.34
−0.4
−i0.4 + ξ
This model yields K2 (ξ) = R̂1 (ξ)K1 (ξ)/(ξ − 0.86 − i0.5). It is easy to verify that K2 (0.86 − i0.5)
is also full column rank, and consequently we can define V2 = K2 (0.86 − i0.5).
In step i = 2, we compute a model R̂2 (ξ) as in (6.7) for the data pair (0.86 − i0.5, Im(V2 )):


−0.86 + i0.6 + ξ
0.1

 −0.2 − i0.34
−0.34 + i0.2
−0.1
−0.86 + i0.6 + ξ
0.34 − i0.2
−0.2 − i0.34
−0.2 + i0.34
−0.51 + i0.2
0.86 + i0.4 + ξ
−0.1
0.51 − i0.2
−0.2 + i0.34 

0.1
0.86 + i0.4 + ξ
112
6 Polynomial J-spectral factorization
We then define


0.86 + ξ
−0.5
 0.5
R̂2 (ξ)K2 (ξ)
0.86 + ξ 


KN +1 (ξ) = K3 (ξ) =
=

ξ − 0.86 + i0.5  0
0 
0
0
As stated in Theorem 6.3.5, the matrix KN +1 has the last q rows equal to zero.
Observe that
"
#
0.86 + ξ
−0.5
GN +1 (ξ) =
0.5
0.86 + ξ
has singularities in −0.86 ± i0.5, i.e. GN +1 is Hurwitz. Moreover, as stated in the last part of
Theorem 6.3.5, GN +1 (−ξ)T GN +1 (ξ) = Z(ξ), i.e. GN +1 is a spectral factor of Z.
Example 6.5.2 The purpose of this example is to show how the preprocessing steps sketched
in the proof of Theorem 6.3.2 are carried out. We consider an example of a mixed-sensitivity
problem from [53], Example 4.4.3, with parameters r = 0, c = 1, and γ = 2.
Consider the matrix

T 


1 −1 + ξ
1 0 0
1 −1 − ξ

 


Z(ξ) = −1
0  0 1 0  −1
0 
2
2ξ
0 0 −1
2
−2ξ
#
"
−2
−1 + 3ξ
=
−1 − 3ξ 1 + 3ξ 2
It can be shown that Z(iω) has inertia (1, 0, 1) for all ω ∈ R. However, it can be readily verified
that no transfer function associated with the factorization above is strictly proper.
In order to perform the spectral factorization, we consider first the i/o partition with w3
the output, and (w1 , w2 ) inputs. We write
"
#
i −1 −1 − ξ
h
i
N (ξ) = 2 −2ξ = 2 0
+ 0 2
−1
0
h
and define
i
h



1 −1 − ξ
1 0 0




M 0 (ξ) :=  0 1 0M (ξ) = −1
0 
0
2
−2 0 1
{z
}
|

T :=
Note that the transfer function corresponding to M 0 is strictly proper. Note also that Z(ξ) =
M 0 (ξ)T Σ0 M 0 (ξ) where


−3 0 −2


Σ0 = T −T ΣT −1 =  0 1 0 
−2 0 −1
6.5. Examples
113
We now apply the algorithm to the matrix M 0 . It is easy to verify that M 0 (1) has full column
rank. We can choose V1 = M 0 (1) and proceed constructing the Σ0 -unitary model, obtaining


−3 + ξ −2
−2


R̂(ξ) =  2
1+ξ
2 
2
2
1+ξ
and


1 1−ξ
R̂(ξ)M (ξ) 

= −1 −2 
ξ−1
0
0
0
Now define
"
−3 0
Σ1 :=
0 1
#
the sub-matrix of Σ0 corresponding to the input variables (w1 , w2 ), and
"
#
1 1−ξ
F (ξ) =
−1 −2
"
#
−3
0
Observe that F is Hurwitz, and moreover that Z(ξ) = F (−ξ)T
F (ξ). Now assume
0 1
that we choose the i/o partition associated with the variables w1 (output) and (w2 , w3 ) inputs.
The corresponding transfer function is not strictly proper, but since
it follows that
h
i h
i
i 1h
2 −2ξ + 0 −1
1 −1 − ξ =
2



0 −1
1 0 − 21




00
0 1 0 M (ξ) = −1 0  =: M (ξ)
0 0 1
2 −2ξ
{z
}
|

T 0 :=
corresponds to a strictly proper transfer function for the i/o partition chosen. Observe that
Z(ξ) = M 00 (−ξ)T Σ00 M 00 (ξ)
with
Σ00 := T 0−T ΣT 0−1
−T 
−1 

1 0 − 21
1 0 12
1 0 − 21






= 0 1 0  Σ 0 1 0  =  0 1 0 
1
0 − 34
0 0 1
0 0 1
2

Now verify that M 00 (λ) is of full column rank
construct a Σ00 -unitary model for M 00 (1):

ξ

0
R̂ (ξ) = −2
2
for all λ ∈ C. We can consequently proceed to

−1
− 12

1+ξ
1 
−2 −2 + ξ
114
6 Polynomial J-spectral factorization
Observe that


0
0
R̂ (ξ)M (ξ) 

= −1
−2 
ξ−1
2 2 − 2ξ
0
Now define
00
#
1
0
,
Σ002 :=
0 − 34
"
the sub-matrix of Σ00 corresponding to the input variables (w2 , w3 ), and let
"
−1
−2
F 0 (ξ) :=
2 2 − 2ξ
#
It is a matter of straightforward verification to check that Z(ξ) = F 0 (−ξ)T Σ002 F 0 (ξ) and that
F 0 is a Hurwitz matrix.
Example 6.5.3 We now consider a case in which the result of Proposition 6.3.6 must be
applied, i.e. in which the matrix Z(ξ) is factored in a non-observable form. Let
"
2 − 3ξ 2 + ξ 4 1 + ξ
Z(ξ) =
1−ξ
1 − ξ2
#
Z(ξ) admits the factorization
Z(ξ) =
"
−ξ − ξ 2
0

# 1
0 1+ξ 0 
0

−ξ
0
1 0
0
|
0 0
1 0
0 2
0 1
{z
Σ

0 ξ − ξ2

0
 0

1  1 − ξ
0
1
}|
{z
K(ξ)

0
ξ


0
1
}
Note that the matrix K(ξ) is not right prime, since it loses rank at λ = 1. The roots of
det(Z(ξ)) in the right half plane are 1.618034, 0.6180340, 1. We proceed first by modeling the
root 1 as in the proof of Proposition 6.3.6. It is easily seen that
 
0
1 
 
Im(K(1)) =<   >
0 
1
and that consequently the Pick matrix is 1 and the one-step model for K(1) is


1+ξ 0
0
0
 0
ξ
−1 −1


R̂1 (ξ) = 

 0
0 1+ξ 0 
0
−1 −1
ξ
6.5. Examples
115
Observe that


−ξ 2 − ξ
0
R̂1 (ξ)K(ξ) 
1 + ξ
 1

K 0 (ξ) :=
=
,
ξ−1
 −1 − ξ
0 
1
0
and that K 0 (−ξ)T ΣK 0 (ξ) = Z(ξ). The only singularity of K 0 is −1. This corresponds to a
Hurwitz greatest common right divisor and consequently does not impair the final result of the
application of our algorithm. Modeling Im(K 0 (1.618034)) yields

0
−2.11803
−0.809017
ξ − 21


0
−1.61803 + ξ
0
0


R̂2 (ξ) = 

1

−1.30902
0
0.309017 + ξ
−2
1
1
0
1.80902 + ξ
2
2

and


−(ξ + 21 )(ξ + 1.618034)
0

R̂2 (ξ)K 0 (ξ)
1
1 + ξ


=
K2 (ξ) =

ξ − 1.618034  0.3090(ξ + 1.618034)
0 
0
− 12 (ξ + 1.618034)
We now model Im(K2 (0.618034)) with

0
0.118034
−0.190983
ξ − 12


0
−0.618034 + ξ
0
0




0.309017
0
0.58541 + ξ 0.0527864 
− 12
0
0.0527864 0.532624 + ξ

and define
Observe that


−(0.618034 + ξ)(1.61803 + ξ)
0

1
1 + ξ


K3 (ξ) = 


0
0 
0
0
"
−(0.618034 + ξ)(1.61803 + ξ)
0
F (ξ) :=
1
1+ξ
is Hurwitz, and that F (−ξ)T I2 F (ξ) = Z(ξ).
#
Example 6.5.4 The last example that we consider in this chapter demonstrates that we can
also compute factorization of unimodular matrices using the algorithm presented here. Let
!
1 − ξ2 ξ
Z(ξ) =
−ξ
1
It is easy to see that Z(ξ) is unimodular since det Z(ξ) = 1. Since Z(ξ) has no finite spectral
points, the algorithm cannot be used as such. Notice that Z(ξ) is not a regular polynomial
116
6 Polynomial J-spectral factorization
matrix (Definition 6.4.2). We therefore convert Z(ξ) into a regular polynomial matrix using the
procedure due to Aliev and Larin outlined in Section 6.4.1. Using this procedure, we compute
!
−0.7071068
0.7071068 + 0.7071068ξ
X(ξ) =
0.7071068 + 0.7071068ξ 0.7071068 + 1.4142136ξ + 0.7071068ξ 2
Define Z 0 (ξ) = X T (−ξ)Z(ξ)X(ξ) which is found to be
Z 0 (ξ) =
2
1 − 2ξ
ξ − ξ2
−ξ − ξ
1 − ξ2
2
!
Now Z 0 (ξ) can be factorized in the “triple-banded” form as in Section 6.4.1:



2
1
0
0.5
ξ
0
!
1


−ξ
0
1
0
1
−0.5 0 

0 ξ 
0
Z (ξ) =



0 −ξ 0 1  0 −0.5
1
0 1 0 
0.5
0
0
1
0 1
We can now apply the algorithm and compute a spectral factorization of Z 0 (ξ):
!
!
!
1
−
ξ
−1
2
1
1
+
ξ
0
Z 0 (ξ) =
0
1−ξ
1 1
−1 1 + ξ
{z
}
|
F 0 (ξ)
Define F (ξ) = F 0 (ξ)X −1 (ξ) which is found to be
F (ξ) =
It can be easily seen that
−0.7071068 − 0.7071068ξ 0.7071068
1.4142136
0
!
2
1
Z(ξ) = F T (−ξ)
F (ξ)
1 1
which gives a factorization of Z(ξ).
6.6
!
Conclusion
In this chapter, we have obtained an algorithm for J-spectral factorization of para-Hermitian
matrices having constant inertia on the imaginary axis. The algorithm is closely connected
with the so called “Σ-unitary” modeling of a finite dimensional vector space. We have obtained
a number of results of independent interest for Σ-unitary models. We have shown that the
algorithm computes a (Hurwitz) spectral factor in a finite number of iterations. We have
studied some numerical aspects associated with the algorithm. Finally, we have demonstrated
the use of the algorithm using a number of examples. We have shown that we can also compute
factorizations of unimodular matrices using the method described in this chapter.
Chapter 7
Synthesis of dissipative systems
7.1
Introduction
Synthesis of dissipative systems, better known as the “H∞ control problem” in systems theory is
an important area of research. H∞ theory provides a powerful method for designing controllers.
In simple terms, a “H∞ controller” aims to modify the sensitivity of a given system to userdefined values. Use of H∞ theory received a huge impetus after the publication of the well
known paper [19] which gave explicit state space formulae for the solution of the H∞ control
problem. Since the 90’s a great deal of research is being carried out in this area. In the same
period, Meinsma [53], Willems and Trentelman [104, 96], and later, Belur [11] addressed the
H∞ control problem using a polynomial approach.
In this chapter we obtain a novel characterization of all solutions to the H∞ problem. We
show we can associate a QDF QΦ with a given H∞ problem in a “natural” way. All solutions
to the H∞ problem are precisely Φ-dissipative behaviors with positive (semi)definite storage
functions on manifest variables. Recall that a characterization of the latter was already obtained
in Chapter 4 under some special assumptions. In this way, ideas presented in Chapters 3 and 4
are relevant to a large area of systems theory, including, as we shall show in this chapter, H ∞
controller design.
Willems and Trentelman have formulated and solved the H∞ problem in a behavioral framework in [96]. We build upon their formulation in this chapter. However our approach differs
from that in [96] in a substantial manner:
The H∞ problem has been formulated in [96] based on kernel representations (Chapter 2,
Section 2.5). However, kernel representations suffer from the disadvantage that controllability is not in-built in their definition. In this chapter, we formulate the H∞ problem
in terms of image representations. Since we only consider image representations, controllability is ensured automatically. An image representation approach also yields itself to
dissipativity based arguments more readily than a kernel representation approach.
The solution in [96] proceeds by constructing behaviors that are “orthogonal” with respect
to a indefinite matrix. This orthogonalization runs into difficulties when some associated
matrix is non-constant. Therefore, it is advantageous to have a solution that does not
depend on orthogonalization. We believe that the approach presented here can be used
to address the H∞ problem with frequency dependent weights.
118
7 Synthesis of dissipative systems
Most results in [96] are existential. In this chapter, we attempt to give concrete algorithms.
This chapter is organized as follows: in Section 7.2 we formulate the H∞ control problem. The
problem is first formulated as in [96]. Using ideas developed earlier in the thesis, we simplify
the problem formulation in several steps. The problem is subsequently solved in Section 7.3.
Section 7.4 is about a characterization of all solutions using ideas in Chapters 3 and 4.
7.2
Problem formulation
We have briefly elaborated on the “control-as-an-interconnection” philosophy of the behavioral
approach in Chapter 5, Section 5.3. We call any dynamical system that we wish to control as
the “plant”. Control in a behavioral context means a restriction of trajectories in the plant to a
desired subset, and can be accomplished by interconnection of the plant with another dynamical
system called the controller. Since trajectories in the interconnected system satisfy laws of the
plant as well as laws of the controller, the controller can be looked at as excluding undesirable
trajectories from the plant. We call the dynamical system obtained by interconnection of the
plant and the controller the controlled system. In Chapter 5 we addressed the interconnection
of a linear system with a nonlinear system so that the (autonomous) interconnected system has
desired properties (the property of stability). In this chapter, we address the interconnection
of a linear system with another linear system so that the interconnected system is controllable,
and is dissipative with respect to a given supply function. A formal definition of terms related
to control in the context of the present chapter are given below.
In most practical situations, not all variables in the plant are available for interconnection
with the controller. We call variables that are available for interconnection the control variables
and those that are not, as the to-be-controlled variables. By the “full plant behavior” we mean
the behavior corresponding to the control variables and the to-be-controlled variables in the
plant. Consider a plant that is controllable. Let c denote variables in the plant that are available
for control and v denote the to-be-controlled variables. Then, the full plant behavior is the set
of all trajectories (v, c) that satisfy the laws of the plant:
P full := {(v, c) ∈ C ∞ (R, Rv+c )|(v, c) obey the plant equations}
(7.1)
We are generally not interested in the evolution of all variables in P full but only of those
variables that are to-be-controlled. We call the projection of P full on to the to-be-controlled
behavior as the plant behavior P:
P := {v ∈ C ∞ (R, Rv )|(v, c) ∈ P full }
(7.2)
We attach a controller to the plant through the control variables c. One can look at a controller
as another dynamical system having c variables which obey certain laws:
C := {c ∈ C ∞ (R, Rc ) such that c satisfies controller equations}
(7.3)
When the plant and the controller are interconnected through c, the variables c in P full are
now restricted since these obey the laws defining both the plant and the controller. The
corresponding trajectories v ∈ P represent the controlled behavior, K.
K := {v ∈ P|∃c ∈ C with (v, c) ∈ P full }
(7.4)
7.2. Problem formulation
119
By a proper design, the controller C ensures that the controlled behavior K meets given
specifications. What these specifications are depends on the problem at hand, however in the
context of H∞ control, one of the specifications is that K should be dissipative with respect to
a given supply function. We give a complete problem specification later.
A fundamental question in designing controllers is: can an arbitrary controlled behavior
K be achieved by applying control through the control variables? If K can be achieved by
applying control through the control variables c, then K is called implementable through c.
Clearly, since a controller restricts trajectories in P, it is necessary that K ⊆ P. To address
which controlled behaviors can be achieved, we also need the concept of a hidden behavior N :
N := {v ∈ P|(v, 0) ∈ P full }
(7.5)
That is, the hidden behavior is the set of all trajectories in P that correspond to the control variables being zero. Clearly, the controller cannot influence trajectories in N since these
are unobservable (“hidden”) from the control variables c. The following result in [96] (Theorem 1, page 55) addresses the question of what controlled behaviors K can be achieved by
interconnection of P and C:
v
Proposition 7.2.1 Let P full ∈ Lv+c
con be the full plant behavior, P ∈ Lcon be the manifest plant
behavior and N ∈ Lvcon be the hidden behavior. Then, a controlled behavior K ∈ Lvcon is implementable by a controller C ∈ Lccon acting on the control variables if and only if N ⊂ K ⊂ P. Thus, a controlled behavior K is implementable by a controller acting through the control
variables if and only if it contains the hidden behavior N and K is in turn contained in the
manifest plant behavior P. Proposition 7.2.1 is very intuitive, and of fundamental importance
in designing a controller, for it gives bounds on what controlled behaviors are possible.
We define a nonsingular matrix Σ = ΣT ∈ Rv×v . Let σ+ (Σ) denote the number of positive
eigenvalues of Σ. In the sequel, we shall denote by m(B) the input cardinality (Chapter 2,
Section 2.9) of a behavior B.
Problem statement 1 Given the manifest plant behavior P ∈ Lvcon , the hidden behavior
N ∈ Lvcon and a nonsingular weighting matrix Σ = ΣT ∈ Rv×v , find all behaviors K ∈ Lvcon ,
called the controlled behaviors such that
1. K is implementable, i.e. N ⊂ K ⊂ P.
2. m(K) = σ+ (Σ), i.e. K has σ+ (Σ) inputs.
3. K is Σ-dissipative on R− , i.e., K is dissipative with respect to QΣ and every storage function on manifest variables is positive semidefinite.
It is clear that since N ⊂ K and K should be Σ-dissipative on R− , it is necessary that N must
be Σ-dissipative on R− . We now explain why the above conditions are important. Condition
1 is about implementability and must be satisfied for any control system design. Note that
the implementability condition follows from Proposition 7.2.1. We have seen that the idea
120
7 Synthesis of dissipative systems
of control is a restriction brought about by interconnection of the plant and the controller.
The controller is chosen such that the interconnected system exhibits some desirable properties
(specifications). Conditions 2 and 3 represent the specifications on the controlled system. The
dissipativity condition (Condition 3) is fundamental; it could imply, for example, disturbance
attenuation, or passivation depending on the problem under consideration. The condition on
input cardinality (Condition 2) ensures that the controlled system has sufficient “freedom”
available to it.
Recall from Lemma 3.5.1 that σ+ (Σ) is the maximum possible input cardinality of a Σdissipative behavior. Since N ⊂ K, m(K) > m(N ). Therefore, it is necessary that input
cardinality of N be less than σ+ (Σ):
Lemma 7.2.2 There exists a K that is a solution to “Problem statement 1” only if m(N ) <
σ+ (Σ).
The description of the H∞ problem that we have just given is representation free. However, in practice, we have to work with concrete representations for behaviors. We begin by
translating the problem in terms of representations for behaviors. We have assumed that the
behaviors P and N are controllable in a behavioral sense. Further, we also want that the
controlled behavior K be controllable. We denote the behaviors N , P and K by their image
representations Im N ( dtd ), Im P ( dtd ) and Im K( dtd ) respectively and assume that these image
representations are observable. We assume that the input cardinality of the hidden behavior
is n and that of the manifest plant behavior is p. Thus, N (ξ) ∈ Rv×n [ξ] and P (ξ) ∈ Rv×p [ξ].
We want to find a matrix K(ξ) ∈ Rv×k [ξ], k = σ+ (Σ), such that in addition to the dissipativity
conditions, N ⊂ K ⊂ P.
Since N ⊂ P, by a suitable transformation, the manifest plant behavior can be represented by Im[N ( dtd ) M ( dtd )]. Hence, we assume that P given by ImP ( dtd ) is such that P (ξ) =
[N (ξ) M (ξ)], i.e. the first n columns of P (ξ) are an image representation for the hidden behavior N and further P = N ⊕ Im M ( dtd ). Since the controlled behavior K must contain N ,
we assume that K(ξ) = [N (ξ) B(ξ)] such that K = N ⊕ Im B( dtd ) with B(ξ) ∈ Rv×b [ξ] with
b = σ+ (Σ) − n. Clearly, the behavior Im B( dtd ) ⊂ Im M ( dtd ). Using these preliminaries, we
reformulate the problem statement:
Problem statement 2 Given the hidden behavior N = Im N ( dtd ) and the manifest plant
behavior P = ImP ( dtd ) with N (ξ) ∈ Rv×n [ξ] and P (ξ) ∈ Rv×p [ξ], P (ξ) = [N (ξ) M (ξ)], find
a controllable behavior K = Im K( dtd ) with K(ξ) ∈ Rv×k [ξ], k = σ+ (Σ), K(ξ) = [N (ξ) B(ξ)]
having full column rank such that
1. K = N ⊕ Im B( dtd ).
2. Im B( dtd ) ⊂ Im M ( dtd ).
3. K is Σ-dissipative on R− .
We address Problems defined in statements (1) and (2) above by considering the QDF QΣd
associated with QΣ , defined by the polynomial matrix Σd = d(ζ)d(η)Σ with d(ξ) ∈ R[ξ], d(ξ) 6=
0, having no roots in the open right half plane. Notice that a behavior is Σ-dissipative if
7.2. Problem formulation
121
and only if it is also Σd -dissipative. In other words, the QDFs QΣ and QΣd are equivalent
(QΣ ∼ QΣd ) in the sense of the equivalence relation defined in Chapter 3, Section 3.3. Using
the procedure for computing storage functions (Section 4.2) we see that there exists a storage
function for a Σd -dissipative behavior that has the structure d(ζ)d(η)Ψ(ζ, η) with QΨ a storage
function for B with respect to QΣ . The following proposition relates storage functions of B
with respect to QΣ and QΣd :
Proposition 7.2.3 Let Σ = Jmn = diag[Im , −In ]. Let B be Σd -dissipative with d(ξ) such that
no root of d(ξ) lies in the open right half plane. Assume m(B) = σ+ (Σ). Let (u, y) be an inputoutput partition of manifest variables w in B, defined by an observable image representation:
" # "
#
u
Q( dtd )
=
`
y
P ( dtd )
Every storage function for B with respect to QΣd is positive semidefinite on manifest variables
if and only if Q(ξ) is Hurwitz, i.e. det Q(ξ) has all its zeros in the open left half plane.
Proof: Let x denote a set of minimal states of B. Since QΣ ∼ QΣd , B is Σ-dissipative if and
only if it is Σd -dissipative. Note that σ+ (Σ) = σ+ (d(iω)d(−iω)Σ) for almost all ω ∈ R. Since
m(B) = σ+ (Σ) and B is Σ-dissipative, P Q−1 is a proper rational function and Q(ξ) has no
singularities on ξ = iR. The behavior L0 := {`|Q( dtd )` = 0} has the same McMillan degree as
B. Further, corresponding to every state trajectory in B, there exists a corresponding state
trajectory in L0 .
Assume that Q(ξ) is Hurwitz. Let QΨ be an arbitrary storage function on states for B with
respect to QΣd . Then,
d
QΨ (x) = QΣd (w) − Q∆ (w, x)
dt
We now integrate the above equality from 0 to ∞ by setting u = 0. Note that the integration
is well defined since Q(ξ) is a Hurwitz matrix.
Z ∞
Z ∞
QΨ (x(0)) =
Q∆ (w, x)dt −
QΣd (w)dt
0
0
R∞
Since − 0 QΣd (w)dt = 0 ||d( dtd )y||2 − ||d( dtd )u||2 dt, it follows that − 0 QΣd (w) ≥ 0 for all
y = P ( dtd )` such that ` ∈ L0 . Since Q∆ (w, x) is positive semidefinite, we have QΨ (x(0)) ≥ 0.
We have just shown that every storage function (on states) for B with respect to Q Σd is positive
semidefinite. Using a state map x = X( dtd )w we conclude that every storage function for B
with respect to QΣd is positive semidefinite on manifest variables, see Section 4.2 for details of
this argument.
Conversely, assume that Q(ξ) is not Hurwitz and yet every storage function for B on
manifest variables is positive semidefinite. Because of observability of ` from w, y = P ( dtd )`,
` ∈ L0 is nonzero. Consider `1 ∈ L0 − {0} such that `1 (−∞) = 0. Note that such `1 exists because Q(ξ) is not Hurwitz. Define y1 = P ( dtd )`1 and let w1 = (0, y1 )T , the image
defined by `1 . Let states corresponding to w1 at t = −∞ and t = 0 be 0 and a respecR0
tively. Consider QΨ (a) := inf w∈Ba −∞ QΣd (w)dt. Recall from Section 4.2 that QΨ is the
R0
maximum storage function for B. We evaluate the integral −∞ QΣd (w1 )dt. Notice that
R0
R0
Q (w1 )dt = −∞ −||d( dtd )y1 ||2 dt. Since d(ξ) has no singularities in the open right half
−∞ Σd
R∞
R∞
122
7 Synthesis of dissipative systems
R0
plane, −∞ −||d( dtd )y1 ||2 dt is negative. Therefore, the infimum of this integral over all w ∈ Ba
is negative. Thus, the maximum storage function for B along state a is a negative quantity.
Hence, every storage function for B along state a is negative. This contradicts our assumption
that every storage function for B with respect to QΣd is positive semidefinite.
Corollary 7.2.4 Proposition 7.2.3 also shows that for a Σd -dissipative behavior with input
cardinality σ+ (Σ), if there exists one positive semidefinite storage function for B with respect
to QΣd then every storage function for B with respect to QΣd is positive semidefinite.
Due to similarities in the supply functions QΣ and QΣd , we expect that solving the H∞ problem
with Σ, or with Σd as the weight should be equivalent:
Problem statement 1’ Given the manifest plant behavior P ∈ Lvcon , the hidden behavior
N ∈ Lvcon , a nonsingular weighting matrix Σ = ΣT ∈ Rv×v , and a nonzero polynomial d(ξ)
having no roots in the open right half plane, find all behaviors K ∈ Lvcon , called the controlled
behaviors such that:
1. K is implementable, i.e. N ⊂ K ⊂ P.
2. m(K) = σ+ (Σ), i.e. K has σ+ (Σ) inputs.
3. K is dissipative with respect to QΣd and every storage function of K with respect to QΣd
is positive semidefinite on manifest variables.
Proposition 7.2.5 Problem statement 1 and Problem statement 1’ are equivalent.
Proof: Note that the only difference in Problem statement 1 and 1’ is in Condition 3. Since
QΣ ∼ QΣd (Chapter 3, Section 3.3) it follows that K is Σ-dissipative if and only if it is also
Σd -dissipative. Note that the minimum storage function for K with respect to QΣd , QΨΣd and
−
Σ
d
the minimum storage function for K with respect to QΣ , QΨΣ− are related by ΨΣ
− = d(ζ)Ψ− d(η)
since d(ξ) has no roots in the open right half plane by assumption. This shows that Condition
3 in Problem statement 1 and 1’ are equivalent.
Using Proposition 7.2.5 we transform the problem from dissipativity with respect to QΣ to
dissipativity with respect to QΣd , and then invoke the fact that the two are equivalent. In the
sequel we make a simplifying assumption: we assume that N is average positive on QΣ , i.e.
R∞
R∞
Q (v) ≥ 0 for all v ∈ N ∩ D(R, Rv ) and ( −∞ QΣ (v) = 0 for some v ∈ N ⇐⇒ v = 0).
−∞ Σ
With this simplifying assumption, det N T (−ξ)ΣN (ξ) 6= 0 [103]. Building upon the equivalence
of Problem statements 1, 2 and 1’, we have the following equivalent problem statement:
Problem statement 2’ Given the hidden behavior N = Im N ( dtd ), determine necessary and
sufficient conditions for the existence of a behavior B( dtd ) such that K := N ⊕ Im B( dtd ) satisfies
K ⊂ P, K is Σd -dissipative and every storage function on manifest variables of K with respect
to QΣd is positive semidefinite.
7.3. A Solution to the synthesis problem
7.3
123
A Solution to the synthesis problem
We now show that the problem of constructing an associated behavior B := Im B( dtd ) such that
N ⊕ B is Σd -dissipative (see Problem statement 2’) can be elegantly cast into the framework of
parametrization results presented in Chapters 3 and 4. We now state and prove a central result
that deals with this parametrization. We however require a few concepts from linear algebra
before we proceed. Consider a Hermitian matrix H = H ∗ ∈ Ch×h block partitioned as
"
#
A B
H=
B∗ C
where A = A∗ ∈ Ca×a . If A is nonsingular then the Schur complement of A in H, S(H/A) is
defined as [33]:
S(H/A) = C − B ∗ A−1 B
We denote by σ(H) the inertia of H (Definition 1.2.2). One of the better known results is
that σ(H) = σ(A) + σ(S(H/A)) [33]. Therefore, if H ≥ 0 and A > 0 then S(H/A) ≥ 0.
Generalizations of this result are known [92].
Theorem 7.3.1 Given nonsingular Σ = ΣT ∈ Rv×v and behaviors N := Im N ( dtd ) and B :=
ImB( dtd ) such that N T (−ξ)ΣN (ξ) is nonsingular, define K := N ⊕ B. Then, K is Σ-dissipative
(on R) if and only if N is Σ-dissipative and B is Φ-dissipative (on R) with Φ(ζ, η) = d(ζ)d(η)Σ−
D T (ζ)D(η) for some non-zero polynomial d(ξ) ∈ R[ξ] having no roots in the open right half
plane, D(ξ) ∈ Rv×v [ξ], and N ∩ B = 0.
Proof: Let N (ξ) ∈ Rv×n [ξ], B(ξ) ∈ Rv×b [ξ]. K is Σ-dissipative if and only if the following
inequality holds (Theorem 3.2.3):
"
#
T
T
N (−iω)ΣN (iω) N (−iω)ΣB(iω)
≥ 0 ∀ω ∈ R
(7.6)
B T (−iω)ΣN (iω) B T (−iω)ΣB(iω)
By a Schur complement argument it follows that inequality (7.6) holds if and only if
N T (−iω)ΣN (iω) > 0 for almost all ω ∈ R and
B T (−iω) Σ − ΣN (iω)[N T (−iω)ΣN (iω)]−1 N T (−iω)Σ B(iω) ≥ 0 ∀ω ∈ R
(7.7)
B T (−iω) d(−iω)d(iω)Σ − ΣN (iω)Z(iω)N T (−iω)Σ B(iω) ≥ 0 ∀ω ∈ R
(7.8)
B T (−iω) d(−iω)d(iω)Σ − D T (−iω)D(iω) B(iω) ≥ 0 ∀ω ∈ R.
(7.9)
Since the matrix N T (−iω)ΣN (iω) is invertible for almost all ω, let Z(ξ) be the adjugate of the
matrix N T (−ξ)ΣN (ξ), and let d(−ξ)d(ξ) be the determinant of N T (−ξ)ΣN (ξ). Here d(ξ) is
chosen so that it does not have any root in the open right half plane. Then, inequality (7.7)
can be re-written as
Since Z(iω) is non-negative definite for every ω ∈ R, there exists a matrix F (iω) ∈ Rn×n [iω]
such that Z(iω) = F T (−iω)F (iω). Define the polynomial matrix D(ξ) := F (ξ)N T (−ξ)Σ.
Notice that substituting D(iω) in inequality (7.8):
124
7 Synthesis of dissipative systems
We now define a two variable polynomial matrix Φ(ζ, η) := d(ζ)d(η)Σ − D T (ζ)D(η). Notice
that there exists a behavior B such that K := N ⊕ B with K Σ-dissipative if and only if B is
Φ-dissipative and N ∩ B = 0 .
The QDF QΦ that we have constructed from N and QΣ has some interesting properties. The
QDF QΦ has “absorbed” into itself the information associated with the hidden behavior. This
fact is reflected in a very appealing manner in the spectrum of Φ(−iω, iω):
Theorem 7.3.2 Let N := Im N ( dtd ) be the hidden behavior with N (ξ) ∈ Rv×n [ξ] having full
column rank. Given the QDF QΣ , Σ = ΣT nonsingular, assume N is Σ-average positive.
Construct the QDF QΦ associated with N and Σ as detailed in Theorem 7.3.1. Then for
almost all ω ∈ R:
1. Φ(−iω, iω) has n eigenvalues at zero (n being the input cardinality of N ).
2. Φ(−iω, iω) has (σ+ (Σ) − n) positive eigenvalues.
3. Φ(−iω, iω) has (σ− (Σ)) negative eigenvalues
Proof: We prove the Theorem by using a simple counting argument which shows that Φ(−iω, iω)
has at least n zero, (σ+ (Σ) − n) positive and (σ− (Σ)) negative eigenvalues for almost all ω ∈ R.
Since Σ = ΣT is nonsingular, we can assume without loss of generality that Σ = diag[Ip , −Iw ].
Recall from Theorem 7.3.1 that
Φ(−iω, iω) = d(−iω)d(iω)Σ − ΣN (iω)[adj N T (−iω)ΣN (iω)]N T (−iω)Σ
Let adj N T (−iω)ΣN (iω) = F T (−iω)F (iω), F (ξ) ∈ Rn×n [ξ]. Define
h
i
B1 (ξ) B2 (ξ) = F (ξ)N T (−ξ)Σ
with B1 (ξ) ∈ Rn×σ+ (Σ) [ξ] and B2 (ξ) ∈ Rn×σ− (Σ) [ξ]. Thus, the term ΣN (iω)[adj N T (−iω)ΣN (iω)]
N T (−iω)Σ can be factorized as:
"
#
h
i
T
B
(−iω)
1
ΣN (iω)[adj N T (−iω)ΣN (iω)]N T (−iω)Σ =
B
(iω)
B
(iω)
1
2
B2T (−iω)
Consequently, Φ(−iω, iω) can now be re-written as
#
"
d(iω)d(−iω)Ip − B1T (−iω)B1 (iω)
−B1T (−iω)B2 (iω)
Φ(−iω, iω) =
−d(iω)d(−iω)Iw − B2T (−iω)B2 (iω)
−B2T (−iω)B1 (iω)
The block matrix −d(iω)d(−iω)Iw − B2T (−iω)B2 (iω) is negative definite and consequently
Φ(−iω, iω) has at least σ− (Σ) negative eigenvalues.
Notice that Φ(−iω, iω)N (iω) = 0. Since N (ξ) is full column rank matrix, Φ(−iω, iω) has
at least n zero eigenvalues.
From Lemma 7.2.2, n < σ+ (Σ). Therefore, B1T (−iω)B1 (iω) is a σ+ (Σ) × σ+ (Σ) matrix
with rank at most n, thus having at least σ+ (Σ) − n eigenvalues at 0. Hence, d(−iω)d(iω)Ip −
B1T (−iω)B1 (iω) has at least σ+ (Σ) − n positive eigenvalues for almost all ω ∈ R. We have thus
shown that a principle submatrix of Φ(−iω, iω) has at least σ+ (Σ) − n positive eigenvalues.
7.3. A Solution to the synthesis problem
125
Using suitable congruence transformations we obtain a matrix Φ̄(−iω, iω) = X ∗ Φ(−iω, iω)X
such that the top left block of Φ̄(−iω, iω) has exactly σ+ (Σ) − n positive eigenvalues, and in
particular, is nonsingular. A Schur complement argument then shows that Φ̄(−iω, iω) has at
least σ+ (Σ) − n positive eigenvalues, and therefore so does Φ(−iω, iω). For further details of
this argument, see [33], particularly Section 4, page 79 thereof.
Thus, we have shown that Φ(−iω, iω) has at least σ− (Σ) negative, n zero and σ+ (Σ) − n
positive eigenvalues. Since the sum of these numbers must be v := σ− (Σ) + σ+ (Σ), (and not
more), Φ(−iω, iω) has exactly σ− (Σ) negative, n zero and σ+ (Σ) − n positive eigenvalues for
almost all ω.
Another interesting property of QΦ is that the hidden behavior N is Φ-lossless:
Corollary 7.3.3 For every compactly supported trajectory v ∈ N ,
R∞
−∞
QΦ (v)dt = 0.
The elegance of working with QΦ will be brought out in the following theorem, where we show
that the hidden behavior will be automatically contained in any Φ-dissipative behavior with
full input cardinality:
Theorem 7.3.4 Let QΦ be a QDF with Φ(ζ, η) as defined in Theorem 7.3.1. Then, the
controlled behavior K is Φ-dissipative. Moreover, every Φ-dissipative behavior K 0 having input
cardinality σ+ (Σ) enjoys the property that N ⊂ K0 . Further, K0 is Σd -dissipative and N is
Σd -dissipative.
Proof: Assume K = Im [N ( dtd )B( dtd )] with N = Im N ( dtd ), B = B( dtd ) and N ∩ B = 0. From
Theorem 7.3.1, B exists if and only if B is Φ-dissipative. We see that
"
#
"
#
h
i
N T (−iω)
0
0
Φ(−iω, iω) N (iω) B(iω) =
B T (−iω)
0 B T (−iω)Φ(−iω, iω)B(iω)
which is clearly positive semidefinite since B is Φ-dissipative.
We now show by a contradiction argument that every behavior K 0 having input cardinality
σ+ (Σ) that is Φ-dissipative satisfies N ⊂ K0 . Assume that there exists a Φ-dissipative behavior
K0 with input cardinality σ+ (Σ) such that N is not contained in K0 . Consider the behavior
K00 = K0 + N . Since N is not contained in K0 , the input cardinality of K00 is strictly larger
R∞
than σ+ (Σ). Moreover, since −∞ LΦ (w, v) = 0 for v ∈ N , w ∈ D(R, Rv ) it follows that K00 is
Φ-dissipative with input cardinality strictly larger than σ+ (Σ). However, from Theorem 7.3.2,
Φ(−iω, iω) has only σ+ (Σ) non-negative eigenvalues for almost all ω. Therefore the input
cardinality of any Φ-dissipative behavior can at most be σ+ (Σ) (Lemma 3.5.1). Hence, a K0
that is Φ-dissipative with input cardinality σ+ (Σ) and not containing N cannot exist.
Since Φ(ζ, η) = d(ζ)d(η)Σ − D T (ζ)D(η), Theorem 3.2.3 shows that every Φ-dissipative behavior is also Σd -dissipative. Since every Φ-dissipative behavior with input cardinality σ+ (Σ)
contains N , it follows that N is Σd -dissipative.
Note that from Theorem 7.3.4 we have characterized all behaviors that are Σ d -dissipative
and that contain N . We have not yet stipulated that these behaviors be contained in P, the
manifest plant behavior. We consider this implementability issue in the next proposition:
126
7 Synthesis of dissipative systems
Proposition 7.3.5 Consider QΦ with Φ(ζ, η) as in Theorem 7.3.1. Let P = Im P ( dtd ) be
the manifest plant behavior defined by an observable image representation. Define Θ(ζ, η) =
P T (ζ)Φ(ζ, η)P (η). Then N ⊂ K ⊂ P, K has input cardinality σ+ (Σ) and K is Σd -dissipative
if and only if ∃ G = Im G( dtd ) a Θ- dissipative behavior with input cardinality σ+ (Σ) such that
K = P ( dtd )(G).
Proof: Suppose ∃ G satisfying the conditions in the proposition. Then K := P ( dtd )(G) ⊂ P
and has input cardinality σ+ (Σ). Since G is Θ-dissipative, K is Φ-dissipative. Therefore, from
Proposition 7.3.4, N ⊂ K and K is Σd -dissipative. Hence, N ⊂ K ⊂ P.
Conversely, every K such that N ⊂ K, K Σd -dissipative, with input cardinality σ+ (Σ) is
Φ-dissipative (Theorem 7.3.4). Since K ⊂ P, ∃ G(ξ) ∈ Rv×σ+ (Σ) [ξ] such that K(ξ) = P (ξ)G(ξ)
where K = Im K( dtd ), P = Im P ( dtd ). Since K is Φ-dissipative, G := ImG( dtd ) is Θ-dissipative.
Further, input cardinality of K and G are the same because P (ξ) is full column rank. Hence,
input cardinality of G is σ+ (Σ).
The essence of Proposition 7.3.5 is that all controllable behaviors K that are Σd -dissipative,
having input cardinality σ+ (Σ) and that satisfy N ⊂ K ⊂ P are characterized by the set of all
Θ-dissipative behaviors with input cardinality σ+ (Σ) where QΘ is a QDF constructed from N
and P as given in Theorem 7.3.1 and Proposition 7.3.5.
7.4
A characterization of all solutions of the synthesis
problem
Problem statement 2’ required K to be Σd -dissipative with positive semidefinite storage functions on manifest variables. We now see how this can be achieved using the set of Θ-dissipative
behaviors in Proposition 7.3.5.
Theorem 7.4.1 Consider a behavior K ∈ Lvcon that satisfies
1. N ⊂ K ⊂ P.
2. K is dissipative with respect to QΣd and every storage function on manifest variables of
K is positive semidefinite.
3. K has input cardinality σ+ (Σ).
Such a K is characterized as K = P ( dtd )G where P = Im P ( dtd ) is the manifest plant behavior
defined by an observable image representation and
1. G is a Θ-dissipative behavior with QΘ as in Proposition 7.3.5.
2. G has input cardinality σ+ (Σ).
3. G has a positive semidefinite storage function on manifest variables (with respect to Q Θ ).
7.5. Conclusion
127
Proof: From Proposition 7.3.5, N ⊂ K ⊂ P, K has input cardinality σ+ (Σ), and K is Σd dissipative if and only if K = P ( dtd )(G) where G is Θ-dissipative and has input cardinality
σ+ (Σ). Since Θ(ζ, η) = P T (ζ)Φ(ζ, η)P (η), every storage function on latent variables of G with
respect to QΘ is also a storage function on latent variables of K with respect to QΦ . Hence,
there exists a positive semi-definite storage function on manifest variables of K with respect to
QΦ if and only if there exists a positive semidefinite storage function on manifest variables of
G with respect to QΘ .
Since Φ(ζ, η) = d(ζ)d(η)Σ − D T (ζ)D(η) it follows that QΦ (v) ≤ QΣd (v) for all v ∈
C ∞ (R, Rv ), and in particular for all v ∈ K. Let QΨ be a positive semidefinite storage function
for K with respect to QΦ :
d
QΨ (v) ≤ QΦ (v) ≤ QΣd (v) ∀v ∈ K
dt
Thus, starting from a positive semidefinite storage function for G with respect to QΘ we have
constructed one positive semidefinite storage function for K with respect to QΣd . Corollary
7.2.4 shows that every storage function for K with respect to QΣd is positive semidefinite.
The converse statement, i.e. existence of a positive semidefinite storage function on manifest variables of K with respect to QΣd implies the existence of a positive semidefinite storage
function on manifest variables of K with respect to QΦ is shown by a contradiction argument:
suppose there exists no storage function for K with respect to QΦ that is positive semidefinite.
Then, there exists one storage function for K with respect to QΣd that is not positive semidefinite. We conclude from Condition 3 of Problem statement 1’ that K cannot be a solution. We now invoke the problem equivalence as in Proposition 7.2.5. Note that K has positive
semidefinite storage functions with respect to QΣd if and only if it also has positive semidefinite
storage functions with respect to QΣ . Hence, Theorem 7.4.1 gives characterization of all solutions for the standard H∞ problem with Σ as the weighting matrix. Thus, in summary, K is a
solution to the H∞ problem, i.e., K is Σ-dissipative on R− , m(K) = σ+ (Σ) and N ⊂ K ⊂ P if
and only if there exists a behavior G with input cardinality σ+ (Σ) that is Θ-dissipative and has
a positive semidefinite storage function on manifest variables. Every solution K is then given
by K = P ( dtd )(G) where P = ImP ( dtd ) is defined by an observable image representation.
7.5
Conclusion
In this chapter, we have obtained a novel characterization of all solutions to the H∞ problem.
We have shown that given the “plant behavior” and the “hidden behavior”, one can construct
a QDF QΘ from these behaviors. Every solution to the H∞ problem can be obtained from
Θ-dissipative behaviors having the “right” input cardinality, and positive semidefinite storage
functions.
We have assumed that the hidden behavior is average non-negative on the given weighting
matrix Σ. While this is crucial for the treatment as presented in this chapter, we recognize
the fact that this assumption is not central to the theme of this chapter. Work on doing away
with this assumption is in progress. Also, we believe that much of the theory presented in
this chapter generalizes in a neat manner when the weighting matrix is “frequency dependent”,
128
7 Synthesis of dissipative systems
rather than constant as was considered in this chapter. Investigation in this direction is in
progress.
Chapter 8
Modeling of data with bilinear
differential forms
8.1
Introduction
Modeling a system from observed time series data is a problem arising in several important
areas of engineering, for example time series analysis, signal processing, system identification,
automatic control, etc. The usual modeling approach proceeds by fixing the type of laws that
we desire, or we believe the system is likely to obey and then searching for a model of this type
that explains the data. In the setting which we will be working on, a model explains given
data if the data are exactly compatible with the laws describing the model itself; the procedure
for finding such a model is called exact identification. Exact modeling has been covered in this
thesis in Chapter 6, Section 6.2.1. We refer the interested reader to [100] for a deeper exposition
of the issues regarding exact identification.
One of the most reasonable and easy a priori assumptions on the model is that of linearity,
primarily because of a good understanding of linear systems theory and also because of the
availability of efficient computational algorithms for modeling. The linear exact modeling
problem has been well studied in systems theory, see for example [100, 7].
The assumption of linearity, with its simplicity, ease and mathematical tractability may not
always be a good choice and indeed there are several systems (for example econometric systems)
and important applications (signal filtering) where it has been shown that going beyond a linear
model has advantages. Bilinear models are of paramount importance in nonlinear modeling and
have found applications in signal processing [85] and coding theory [51, 47]. The intertwining
of quadratic and bilinear models and optimization problems is well known, see [50] for a survey.
The central question in interpolation with bilinear forms is to determine a bilinear form that
takes prescribed values along certain prescribed directions specified by the data. It is worth
emphasizing that though a bilinear model is obviously nonlinear, determining its parameters
from the data is still a linear problem. A brute-force solution of the linear system may not
be desirable because modifications in the data may necessitate re-computation of the whole
solution. This motivates the search for an algorithm that models data iteratively, i.e. the
bilinear form should be modeled depending on the current and past available data– the current
model should be updated to explain future data as and when such data are available.
130
8 Modeling of data with bilinear differential forms
In this chapter we address bilinear modeling by considering a general bilinear interpolation
problem. We show its relevance to a number of problems in systems and control theory. The
most interesting aspect of the interpolation scheme is that it is recursive with respect to data.
Further, it only uses standard matrix manipulations. Hence, it can be implemented on general
purpose computational packages. The scheme is based in a formal setting due to which it can
be extended to various domains of application with no additional effort.
The work presented here has interesting connections with several problems in mathematics
and engineering. The most immediate connection is with bivariate polynomial interpolation
problems [10, 23] among others. Reed-Solomon decoding [51, 47] is a bilinear interpolation
problem. Discrete time bilinear interpolation is useful in quadratic filtering [85]. Bilinear
interpolation is a well known technique in image processing [91].
This chapter is organized as follows. A precise problem statement is given in Section 8.2.
Section 8.3 is the main section in this chapter. Here we propose a scheme to address the bilinear
interpolation problem. Section 8.4 is about examples and applications. Here we show among
other things, the use of the interpolation scheme for computing Lyapunov functions. These will
be followed by concluding remarks in Section 8.5.
8.2
The problem statement
Consider w ∈ C ∞ (R, Cw ), v ∈ C ∞ (R, Cw ) and γ ∈ C ∞ (R, C). We call (w, v, γ) the data. Bilinear
models that we wish to construct for the data will be elements of the model class MBDF :
MBDF
= {(v, w, γ) ∈ C ∞ (R, Cw ) × C ∞ (R, Cw ) × C ∞ (R, C)
|∃ Φ ∈ Cw×w [ζ, η] satisfying LΦ (w, v) = γ}
(8.1)
The model class MBDF thus explicitly assumes that the data is amenable to exact bilinear
modeling. Since we do not exclude the possiblity of complex Φs, a word on “symmetry”
Pn
k l
?
in this context is in order. Let Φ(ζ, η) =
k,l=0 Φkl ζ η . By Φ (ζ, η) we mean the matrix
Pn
?
∗
∗ k l
k,l=0 Φkl η ζ where Φkl denotes the hermitian transpose of Φkl . If Φ(ζ, η) = Φ (ζ, η) we
call Φ symmetric. Further, while considering one variable polynomial matrices R(ξ) ∈ C•×• [ξ]
P
P
defined by ni=0 Ri ξ i , by R? (ξ) we mean the matrix ni=0 Ri∗ ξ i , where Ri∗ denotes the hermitian
transpose of Ri .
We consider the following problem:
Given N distinct trajectories ci eλi t with ci ∈ Cc ; λi ∈ C and distinct, and qij ∈ C, i, j =
1, 2, . . . N , determine a bilinear differential form LΦ , Φ(ζ, η) = Φ? (ζ, η), such that:
LΦ (ci eλi t , cj eλj t ) = qij e(λ̄i +λj )t i, j = 1, 2, . . . N
(8.2)
Observe that the requirement of symmetry automatically fixes the values of LΦ (cj eλj t , ci eλi t ) to
qij∗ e(λi +λ̄j )t . Using the two variable polynomial notation for BDFs the problem statement can
be translated into an equivalent algebraic statement:
c∗i Φ(λ̄i , λj )cj = qij
8.3. A recursive algorithm for interpolating with BDFs
131
In general, problems involving polynomial interpolation have non-unique solutions. Deciding
which solution is the “best” from among infinitely many others is not always straightforward.
Indeed, often, what is the “best” solution is a question that can only be answered depending
on the application in hand, and the required computational effort. In the context of this
chapter, the choice is made more complicated by the fact that there is no natural ordering on
the ring of polynomials in two variables. Several criteria for choosing the “best” solution to
problems involving two variable polynomial matrices are available in literature. See [97] for a
discussion on the “effective size”.1 In some applications (like error correcting codes [47]), the
criterion has been the least weighted degree. 2 The criterion that we address in this chapter
is ease of computation. The most interesting feature of our scheme is that it is recursive. The
required computational effort is very modest– indeed, one can write down the solution by hand
in the simpler cases. In the next section we develop the algorithm to address the problem of
interpolation with BDFs.
8.3
A recursive algorithm for interpolating with BDFs
We begin by recalling that the problem of interpolating with BDFs is the following: given N
distinct trajectories ci eλi t with ci ∈ Cc ; λi ∈ C and distinct, and qij ∈ C, i, j = 1, 2, . . . N ,
determine a BDF LΦ with Φ(ζ, η) = Φ? (ζ, η) such that
LΦ (ci eλi t , cj eλj t ) = qij e(λ̄i +λj )t i, j = 1, . . . , N
(8.3)
For the sake of simplicity we make the following assumption:
c∗i cj 6= 0 i, j = 1 . . . N
(8.4)
Discussion on how to relax this assumption can be found in Remark 8.3.2 below. In order
to develop the interpolation scheme, we need the concept of the Most Powerful Unfalsified
Model (MPUM) for a finite set of vector exponential trajectories (Section 6.2.1). Recall that
the MPUM (chosen from a model class) is the most restrictive model for data which does not
refute the data. We now prove a Theorem that is the basis of the interpolation scheme:
Theorem 8.3.1 Consider N trajectories ci eλi t with ci ∈ Cc ; c∗i cj 6= 0, i, j = 1 . . . N ; λi ∈
∗
C, i = 1, . . . , N and distinct, together with N 2 complex numbers qij satisfying qij = qji
, i, j =
1, . . . N . Consider the following iterations:
Φ1 =
q11
I.
c∗1 c1
for l= 1 to N − 1 do,
?
(ζ)Rl (η) + Rl? (ζ)El+1 (ζ)
Φl+1 (ζ, η) = Φl (ζ, η) + El+1
end for
P
The effective size of k,l Φkl ζ k η l is defined as max{k|Φkl 6= 0}.
P
2
The (wζ , wη )-weighted degree of k,l Φkl ζ k η l is defined as max{kwζ + lwη |Φkl 6= 0} where wζ , wη are fixed
“weights” for ζ and η respectively.
1
132
8 Modeling of data with bilinear differential forms
where Rl is a representation of the MPUM for c1 eλ1 t , . . . cl eλl t and El+1 is the “error removing
matrix” which satisfies
El+1 (λi ) = [Rl−1 ]? (λ̄l+1 )
q(l+1)i − c∗l+1 Φl (λ̄l+1 , λi )ci
i = 1, . . . , l + 1,
αc∗l+1 ci
(8.5)
where α = 2 when i = l + 1; else α = 1. Then, LΦN is a solution to the bilinear interpolation
problem.
Proof: We prove the theorem by induction on the number of trajectories. Consider Φ1 (ζ, η).
Then, clearly, c∗1 Φ1 c1 = q11 . Hence, LΦ1 is a solution of the problem for N = 1. Assume that
LΦl is a solution to the bilinear interpolation with data ci eλi t , i = 1, . . . , l i.e., we assume that
Φl (ζ, η) is known such that
LΦl (ci eλi t , cj eλj t ) = qij e(λ̄i +λj )t i, j = 1, 2 . . . l,
(8.6)
The (l + 1) th update is carried with the following formula
?
Φl+1 (ζ, η) = Φl (ζ, η) + El+1
(ζ)Rl (η) + Rl? (ζ)El+1 (η)
(8.7)
where KerRl ( dtd ) is the MPUM for ci eλi t , i = 1, . . . , l. Since Rl is a representation of the MPUM:
LΦl+1 (ci eλi t , cj eλj t ) = LΦl (ci eλi t , cj eλj t ) i, j = 1, 2 . . . l
(8.8)
Thus, the updating of Φl (ζ, η) has been achieved without disturbing the interpolation conditions
satisfied by Φl . A solution to the (l+1)th step interpolating problem is obtained by constructing
the univariate polynomial matrix El+1 that satisfies:
?
c∗l+1 [Φl (λ̄l+1 , λi ) + El+1
(λ̄l+1 )Rl (λi ) + Rl? (λ̄l+1 )El+1 (λi )]ci = q(l+1),i i = 1, . . . l + 1
(8.9)
We now show that these conditions are satisfied and we provide a method to construct a
polynomial matrix E meeting these requirements. Since Ker Rl ( dtd ) is the MPUM for the first
l trajectories, and since all λi s are assumed distinct, Rl (λl+1 ) is a constant nonsingular matrix.
We verify by substitution that
El+1 (λl+1 ) = [Rl−1 ]? (λ̄l+1 )
q(l+1)(l+1) − c∗l+1 Φl (λ̄l+1 , λl+1 )cl+1
2c∗l+1 cl+1
(8.10)
satisfies conditions (8.9) when i = l + 1.
In addition, El+1 must satisfy l cross coupling conditions associated with the pairs (λ̄l+1 , λi ),
i = 1, . . . l. We verify by substitution that
El+1 (λi ) = [Rl−1 ]? (λ̄l+1 )
q(l+1)i − c∗l+1 Φl (λ̄l+1 , λi )ci
i = 1, 2 . . . l
c∗l+1 ci
(8.11)
satisfies conditions (8.9) when i = 1, . . . , l. These interpolation conditions are well defined
because we have assumed that c∗i cj 6= 0 ∀ i, j. In order to construct an El+1 satisfying the
interpolation conditions, consider the following scheme. Define:
Ai = [Rl−1 ]? (λ̄l+1 )
q(l+1)i − c∗l+1 Φl (λ̄l+1 , λi )ci
i = 1, . . . , l + 1,
αc∗l+1 ci
(8.12)
8.3. A recursive algorithm for interpolating with BDFs
133
with α = 2 if i = l + 1, and α = 1 otherwise. Determining El+1 from the l + 1 interpolation
conditions is a straightforward problem in Lagrange interpolation:
El+1 (ξ) =
l+1
X
Ql+1
i=1,i6=j (ξ − λi )
Aj
Ql+1
(λ
−
λ
)
j
i
i=1,i6=j
j=1
+F
l+1
Y
i=1
(ξ − λi )
(8.13)
with F ∈ Cc×c [ξ].
We have completed the development of the recursive interpolation algorithm. The complete
algorithm is summarized below:
Input ci eλi t , ci ∈ Cc , λi ∈ C and distinct, c∗i cj 6= 0 ∀ i, j and complex constants qij =
∗
qji
, i, j = 1, 2 . . . N .
Output Φ(ζ, η) = Φ? (ζ, η) such that LΦ (ci eλi t , cj eλj t ) = qij e(λ̄i +λj )t .
Φ1 (ζ, η) =
q11
I
c∗1 c1
For i = 1,2 . . . N − 1 do,
Compute the MPUM, Ker Ri ( dtd ) for c1 eλ1 t , c2 eλ2 t , . . . cλi i t
Compute the ith stage error matrix Ei from Ei (λj ) = Aj , j = 1, 2 . . . , i as in
equation (8.13)
?
Φi+1 (ζ, η) = Φi (ζ, η) + Ei+1
(ζ)Ri (η) + Ri? (ζ)Ei+1 (ζ)
end
Return Φ(ζ, η) = ΦN (ζ, η)
With reference to Theorem 8.3.1, let F ∈ Cc×c [ζ, η]. Consider the following two variable
polynomial matrix:
?
Υ(ζ, η) = F (ζ, η)RN (η) + RN
(ζ)F ? (η, ζ)
(8.14)
where RN is such that Ker RN ( dtd ) a MPUM for ci eλi t , i = 1, 2 . . . N . Consider the action of the
BLDF LΥ on the finite dimensional vector space S given by the span of ci eλi t , i = 1, 2 . . . N .
Clearly LΥ (v) = 0∀v ∈ S. Conversely, if LΥ (v) = 0∀ v ∈ S then there exists F such that
(8.14) holds ([103], Proposition 3.2, page 1712). We call BDFs LΥ satisfying LΥ (S) = 0 zero
along S. Since LΥ is zero along S if and only if it has the structure in equation (8.14) we see
that every solution to the interpolation problem can be generated from a known solution L Φ
by adding LΥ to it, i.e. LΨ is a solution to the interpolation problem if and only if there exists
LΥ zero along S such that
LΨ = L Φ + L Υ
We call the symmetric polynomial in equation (8.14) the “tail polynomial”. What is the “best”
solution to the bilinear interpolation problem will often depend on the choice of an appropriate
tail polynomial.
134
8 Modeling of data with bilinear differential forms
Remark 8.3.2 Notice that our assumption c∗i cj 6= 0 ∀i, j results in simple interpolation conditions for the error matrix El+1 . However, this condition is not a central assumption. Clearly
if cl = 0 for some l then the given scalars ql1 , ql2 , . . . qll must have been 0. The error matrix
can in this case be assigned arbitrary values and therefore no interpolation conditions need be
imposed on it. Representations for the MPUM are not unique. If Ker Rl ( dtd ) is the MPUM for
the first l trajectories then Ker Ul ( dtd )R1 ( dtd ) is also the MPUM for the same set of trajectories if
and only if Ul is a unimodular matrix (i.e. det Ul = Const, 6= 0). In order to have well defined
interpolation conditions for the error matrix El+1 it is enough to ensure that the matrix Rl
represents the MPUM and satisfies:
c∗l+1 Rl? (λ̄l+1 )ci 6= 0 i = 1 . . . l + 1
(8.15)
Under the above conditions, well defined interpolation conditions may be set up with out
inverting Rl? (λ̄l+1 ):
q(l+1) i − c∗l+1 Φl (λ̄l+1 , λi )ci
Ic i = 1 . . . l + 1. α = 2, when i = l + 1 otherwise α = 1
αc∗l+1 Rl? (λ̄l+1 )ci
(8.16)
Formulating the interpolation problem as in (8.16) may in fact be advantageous since it does not
involve explicit inversion of Rl? (λ̄l+1 ). However, the actual procedure to compute Rl depends
on the case in hand.
El+1 (λi ) =
In the next Section we demonstrate the use of the interpolation scheme in Theorem 8.3.1 for
addressing specific problems.
8.4
8.4.1
Examples and applications
Interpolation with BDFs
Example 8.4.1 Consider two trajectories ci eλi t , i = 1, 2 with
" #
1
λ1 = −1; c1 =
0
" #
1
λ2 = −2; c2 =
1
Let q11 = 1; q12 = q21 = 2; q22 = 3
Since the data are real, we can compute a real Φ(ζ, η) such that LΦ (ci eλi t , cj eλj t ) = qij e(λi +λj )t , i =
1, 2. We proceed to define:
"
#
1
1 0
Φ1 (ζ, η) = I =
1
0 1
We update Φ1 as indicated below:
Φ2 (ζ, η) = Φ1 (ζ, η) + E2T (ζ)R1 (η) + R1T (ζ)E2 (η)
8.4. Examples and applications
135
where R1 (ξ) is such that Ker R1 ( dtd ) is the MPUM for c1 e−t . Using the procedure outlined in
Section 6.2.1 we see that one possible choice for R1 is:
"
#
T
c1 c
−1 − ξ 0
R1 (ξ) = λ1 I − T 1 ξ =
c1 c1
0
−1
The matrix E2 is chosen such that it satisfies the following interpolation conditions:
q22 − cT2 Φ1 (λ2 , λ2 )c2
2cT2 c2
q21 − cT2 Φ(λ2 , λ1 )c1
E2 (λ1 ) = R1−T (λ2 )
cT2 c1
E2 (λ2 ) = R1−T (λ2 )
Substituting the appropriate values we see that
E2 (−2) = (1/4)
E2 (−1) =
"
"
1 0
0 −1
#
#
1 0
0 −1
(8.17)
Solving a Lagrange interpolation problem we see that one possible choice for E2 is
"
#
3/4ξ + 7/4
0
0
−3/4ξ − 7/4
Thus, Φ2 (ζ, η) is found to be
"
#"
#
3/4ζ + 7/4
0
−1 − η 0
Φ2 (ζ, η) =
+
0
−3/4ζ − 7/4
0
−1
"
#"
#
−1 − ζ 0
3/4η + 7/4
0
+
0
−1
0
−3/4η − 7/4
"
#
−5/2(1 + ζ + η) − 3/2ζη
0
=
0
9/2 + 3/4(ζ + η)
1 0
0 1
#
"
That Φ2 (ζ, η) is a solution can be easily checked by substitution.
8.4.2
(8.18)
Application 1: Interpolation with scalar bivariate polynomials
Interpolation with scalar bivariate polynomials (often with additional conditions) has been
researched at length [10, 23] among others. An immediate application of the algorithm presented
in this chapter is a general recursive scheme for interpolation by symmetric scalar bivariate
polynomials. Consider the interpolation problem defined in Theorem 8.3.1 for the scalar case,
i.e. c = 1. Without loss of generality, the ci s can now be assumed to be unity. We consider the
following interpolation problem: given N distinct real numbers λ1 , . . . λN , together with the
real numbers qij , i, j = 1, 2 . . . N , determine a scalar symmetric bivariate polynomial Φ(ζ, η)
such that
Φ(λi , λj ) = qij
(8.19)
136
8 Modeling of data with bilinear differential forms
As in the previous case, the symmetry conditions dictate that qij = qji so that only N (N + 1)/2
qij s can be specified independently. The fact that ci s are unity makes the computation of
the MPUM trivial. Indeed, it can be seen that the MPUM KerRl ( dtd ) for the l trajectories
Q
eλ1 t , . . . eλl t can be represented by Rl = li=1 (ξ − λi ). Using Rl as a representation of the
MPUM, Theorem 8.3.1 can be applied to the data (ci eλi t , qij ), j = 1, . . . , i, i = 1, . . . , N to
compute a Φ ∈ Rs [ζ, η] such that Φ(λi , λj ) = qij .
8.4.3
Application 2: Storage functions
We have considered autonomous systems in Section 2.7. Recall that in autonomous systems,
the future is entirely governed by the initial conditions. The problem considered in this section
is the following: given an autonomous system with behavior B, and a QDF QΦ , determine a
QDF QΨ (if there exists one) such that
d
QΨ (w) ≤ QΦ (w) ∀w ∈ B
dt
(8.20)
QΨ is called a storage function. Notice that this problem is in constrast to computation of
storage functions in Chapter 4 where the behaviors were assumed to be controllable. We begin
by quoting the following important result ([103], Theorem 4.3, page 1713):
d
)w = 0} is autonomous and stable. Given any
Theorem 8.4.2 Assume that B = {w|R( dt
w×w
QDF QΦ , Φ(ζ, η) ∈ R [ζ, η], there exists a QDF QΨ such that dtd QΨ (w) = QΦ (w) for all
w ∈ B. Further, if QΦ (w) ≤ 0 for all w ∈ B then QΨ (w) ≥ 0.
Theorem 8.4.2 is essentially, Lyapunov theory for higher order systems. Computing a storage
function for B has been addressed in [103], Theorem 4.8 page 1715. The suggested solution
scheme has been through the solution of a certain Polynomial Lyapunov Equation (PLE), a
solution scheme for which was subsequently reported in [58]. In this chapter, we present an
alternate method for computing storage functions which does not involve solving the PLE. We
first prove a result that relates the existence of storage functions with the sign-definiteness of
a hermitian matrix:
Lemma 8.4.3 Let B := {w ∈ C ∞ (R, Rw )|R( dtd )w = 0} be a finite dimensional behavior with
basis {wi = ci eλi t }, ci ∈ Cw , λi ∈ C and distinct, i = 1, 2 . . . N . Let QΦ be an arbitrary QDF
with Φ(ζ, η) ∈ Rw×w
s [ζ, η] Then a Ψ(ζ, η) satisfies:
d
QΨ (w) ≤ QΦ (w)
dt
for all w ∈ B if and only if the N × N hermitian matrix D = [dij ]N
i,j=1 is negative semidefinite.
Here:
dij = c∗i [(λ̄i + λj )Ψ(λ̄i , λj ) − Φ(λ̄i , λj )]cj
(8.21)
Further,
d
Q (w)
dt Ψ
= QΦ (w) if and only if dij = 0.
Proof: Consider an arbitrary trajectory w ∈ B. Then, there exist α1 , . . . , αN ∈ C such that
P
w= N
i=1 αi wi . There exists a Ψ(ζ, η) satisfying the conditions stated in the lemma if and only
8.4. Examples and applications
137
if
h
∗
α1∗ . . . αN


α
1
i
 .. 
[dij ]N
i,j=1  .  ≤ 0
αN
Since α1 , . . . αN are arbitrary, there exists a Ψ if and only if the matrix [dij ]N
ij=1 is negative
semidefinite.
It is now shown that the interpolation approach developed in this chapter can be used to
compute a storage function in a simple manner. Some computations are relatively trivial and
all of them are operations with constant matrices.
Theorem 8.4.4 Let B := {w ∈ C ∞ (R, Rw )|R( dtd )w = 0} be a finite dimensional behavior
with basis {wi = ci eλi t }, ci ∈ Cw , λi ∈ C and distinct, c∗i cj 6= 0, i, j = 1, 2 . . . N . Let QΦ ,
Φ(ζ, η) ∈ Rw×w
/ iR, i = 1, . . . , N , there exists a QDF QΨ such
s [ζ, η] be an arbitrary QDF. If λi ∈
that
d
QΨ (w) ≤ QΦ (w) ∀w ∈ B
dt
Proof: We prove the theorem by considering two cases.
case 1: λ̄i + λj 6= 0, i, j = 1, . . . N : A QDF QΨ qualifies to be a storage function for B if and
only if the hermitian matrix [dij ]N
i,j=1 is negative semidefinite, where,
dij = c∗i [(λ̄i + λj )Ψ(λ̄i , λj ) − Φ(λ̄i , λj )]cj
We show that there exists a Ψ(ζ, η) such that [dij ]N
i,j=1 can actually be made zero. Let us
compute a Ψ(ζ, η) that satisfies
Φ(λ̄i , λj )
cj
c∗i Ψ(λ̄i , λj )cj = c∗i
(λ̄i + λj )
|
{z
}
(8.22)
qij
This is the standard interpolation problem considered in Section 8.3. It has already been shown
that a solution can be obtained recursively in N iterations. Every solution to the interpolation
problem yields a QDF QΨ such that dtd QΨ = QΦ . Notice that we have not assumed B to be
stable.
case 2: λ̄j + λi = 0; i 6= j: Assume that λ̄a + λb = 0 for some 1 ≤ a, b ≤ N, a 6= b. The
matrix D defined by equations (8.21) now takes the following form:
2
6 ∗
6 c1 [(λ̄1 + λ1 )Ψ(λ̄1 , λ1 ) − Φ(λ̄1 , λ1 )]c1
6
6
.
6
.
D=6
.
6
6
6
.
6
.
4
.
···
.
.
.
···
c∗
1 [(λ̄1 + λN )Ψ(λ̄1 , λN ) − Φ(λ̄1 , λN )]cN
···
−c∗
a Φ(λ̄a , λb )cb
···
−c∗
b Φ(λ̄b , λa )ca
···
···
···
···
c∗
N [(λ̄N + λN )Ψ(λ̄N , λN ) − Φ(λ̄N , λN )]cN
3
7
7
7
7
7
7
7
7
7
7
5
(8.23)
It is now shown that one can always find Ψ(ζ, η) such that D is negative definite. We use
the following scheme: first assign interpolation conditions to off-diagonal terms i, j (i.e., with
i 6= j), λ̄i + λj 6= 0 so as to make them zero. Then choose the diagonal terms suitably so as to
make D negative semidefinite. Thus, to start with, let
Φ(λ̄i , λj )
cj i 6= j and λ̄i + λj 6= 0
λ̄i + λj
c∗i Ψ(λ̄i , λj )cj = arbitrary i =
6 j and λ̄i + λj = 0
c∗i Ψ(λ̄i , λj )cj = c∗i
(8.24)
(8.25)
138
8 Modeling of data with bilinear differential forms
Note that with this assignment, D now has zeros at all off diagonal positions i, j such that
λ̄i + λj 6= 0. Further, since c∗i Ψ(λ̄i , λj )cj can take arbitrary values when λ̄i + λj = 0, no
interpolation conditions need be specified for such λi , λj .
It is not difficult to see that D can now be written as the sum of two matrices Λ and D 0 :
D = Λ + D0
(8.26)
0
with Λ = diag[c∗i ((λ̄i + λi )Ψ(λ̄i , λi ) − Φ(λ̄i , λi ))ci ]N
i=1 . D is a w × w matrix which has zeros
everywhere on the off-diagonal terms, except the positions (a, b), (b, a) for all a, b, a 6= b such
that λ̄a + λb = 0. Note that D 0 is a hermitian matrix and has all real eigenvalues. Denote with
γ the largest eigenvalue of D 0 . Clearly, Λ can always be chosen such that Λ + D 0 is negative
semidefinite. One possible solution is Λ = αIw with α ≤ −γ. Thus, along with the conditions
given in (8.25),
c∗i Ψ(λ̄i , λi )ci = (c∗i Φ(λ̄i , λi )ci + α)/(λ̄i + λi ) i = 1 . . . N
(8.27)
where α ≤ −(max spec D 0 ). The matrix Ψ(ζ, η) can now be determined from these interpolation conditions using the recursive algorithm in Theorem 8.3.1. The QDF QΨ is then a storage
function for B since the matrix D (8.21) is now negative semidefinite.
Remark 8.4.5 Notice that if in Theorem 8.4.4 λi ∈ iR, it is necessary that c∗i Φ(λ̄i , λi )ci = 0
for the existence of a QΨ such that dtd QΨ (w) = QΦ (w), and c∗i Φ(λ̄i , λi )ci ≥ 0 for the existence
of a QΨ such that dtd QΨ (w) ≤ QΦ (w) along all trajectories w in B. In other words, if for some
i, 1 ≤ i ≤ N , λi ∈ iR and c∗i Φ(λ̄i , λi )ci < 0, there cannot exist a QΨ such that dtd QΨ (w) ≤ QΦ (w)
along all w ∈ B. Moreover, notice that if the above necessary condition is satisfied, c ∗i Ψ(λ̄i , λi )ci
can be assigned any arbitrary value.
Remark 8.4.6 In the light of Remark 8.4.5 and Theorem 8.4.4 it follows that given B =
lin span ci eλi t , λi distinct, c∗i cj 6= 0, i, j = 1, . . . N , and a QDF QΦ , there exists a QDF QΨ such
that dtd QΨ (w) = QΦ (w) ∀w ∈ B if and only if whenever λ̄i + λj = 0, c∗i Φ(λ̄i , λj )cj = 0. If
this holds, c∗i Ψ(λ̄i , λj )cj can be assigned arbitrary values, and hence no interpolation conditions
need be imposed.
Example 8.4.7 Consider the behavior B defined as the set of solutions w = [w1 w2 ]T to the
system of equations:
"
#
2
4 + 5 dtd + dtd 2 − dtd
w=0
2
− dtd − dtd 2
4 + dtd
|
{z
}
d
R( dt
)
It is easy to see that B is finite dimensional. A basis for B is found to be ci eλi t , i = 1, 2 where
!
!
1
1
λ1 = −1; c1 =
and λ2 = −2; c2 =
0
1
Let us compute a storage function for B with respect to the supply function
QΦ (w) = −(2w12 + 8w1 w2 + 2w22 )
8.4. Examples and applications
139
!
2 4
Then, Φ = −
. Let QΨ be a storage function for B with respect to QΦ . For it to be a
4 2
storage function, Ψ(ζ, η) must satisfy the following interpolation conditions:
! !
2 4
1
!
− 1 0
4 2
0
1
=
=1
1 0 Ψ(−1, −1)
−1 − 1
0
! !
2 4
1
!
− 1 0
4 2
1
1
=
=2
1 0 Ψ(−1, −2)
−1 − 2
1
! !
2 4
1
!
− 1 1
4 2
1
1
=3
=
1 1 Ψ(−2, −2)
−2 − 2
1
We can now solve the above interpolation problem. Notice that the above interpolation conditions are precisely the interpolation conditions considered in Example 8.4.1. Therefore, from
Example 8.4.1 we know that Ψ(ζ, η) can be taken to be

    Ψ(ζ, η) = [ −5(1 + ζ + η)/2 − 3ζη/2    0                ]
              [ 0                          9/2 + 3(ζ + η)/4 ]

Then, QΨ is such that (d/dt) QΨ(w) = QΦ(w) for all w ∈ B.
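As a quick numerical sanity check (ours, not part of the thesis), the following snippet evaluates the Ψ(ζ, η) above at the three interpolation points and confirms the values 1, 2 and 3:

    import numpy as np

    # Evaluate the Psi(zeta, eta) found above at the interpolation points.
    def Psi(z, e):
        return np.array([[-5*(1 + z + e)/2 - 3*z*e/2, 0.0],
                         [0.0, 9/2 + 3*(z + e)/4]])

    c1, c2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
    print(c1 @ Psi(-1, -1) @ c1)   # 1.0
    print(c1 @ Psi(-1, -2) @ c2)   # 2.0
    print(c2 @ Psi(-2, -2) @ c2)   # 3.0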
Remark 8.4.8 The aim of this remark is to show that the algorithm in Theorem 8.3.1 may
not give an “optimal” solution. Indeed, in Examples 8.4.1 and 8.4.7 we have 2 × 2 matrices with
three interpolation conditions. Without much difficulty, it can be seen that the interpolation
conditions in these examples can be met with a constant (rather than bivariate polynomial)
matrix. If we define

    K = [ 1  1 ]
        [ 1  0 ]

then L_K is a simpler solution to these examples. However, this is not the case in general: if we
had imposed one more interpolation condition in Example 8.4.1, for instance, it is easy to see
that the problem would no longer admit a “constant” solution. Every solution to the interpolation
problem can be generated from a known solution by adding the tail polynomial (see equation
(8.14)). Thus, how “simple” or “complex” a solution is depends, among other things, on
the choice of a tail polynomial.
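The same check as above (again ours, not from the thesis) confirms that the constant K indeed meets all three interpolation conditions of Example 8.4.7:

    import numpy as np

    # The constant matrix K of Remark 8.4.8 also satisfies the three conditions.
    K = np.array([[1.0, 1.0],
                  [1.0, 0.0]])
    c1, c2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
    print(c1 @ K @ c1, c1 @ K @ c2, c2 @ K @ c2)   # 1.0 2.0 3.0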
Remark 8.4.9 Remark 8.4.8 leads us to a very reasonable question: what is an “optimal”
solution to the interpolation problem? A quantification of complexity in bilinear forms
is an interesting problem. The notion of complexity could depend, for instance, on least “total
weighted degree”, least “effective size”, least “number of parameters” and so on.
Remark 8.4.10 It can be shown that the interpolation scheme developed in this chapter
can be easily used for interpolation with quadratic difference forms [41]. Consider as the
data finitely many discrete time sequences {v_i a_i^n}_{n∈Z}, and constants q_ij. One can formulate the
problem of constructing a bilinear difference form that interpolates the data in a way analogous
to the continuous time case. Section 8.3 can be suitably modified to address the discrete time
interpolation case.
8.5 Conclusion
In this chapter we have developed a recursive algorithm for interpolation with bilinear forms
using the algebra of two variable polynomial matrices. The investigation is primarily motivated
by the exact modeling problem. However, we have shown that it also has other applications.
An interesting application of this algorithm is in the computation of storage functions for finite
dimensional dynamical systems, which is a generalization of Lyapunov theory. We show how
to compute a storage function even when the Lyapunov operator is singular, in which case
conventional methods of solution generally fail.

We believe that the results presented in this chapter are only a starting point for a systematic
investigation into a number of interesting problems, notably a quantification of the “complexity”
of bilinear (differential) forms. We have shown that the interpolation scheme may not yield
an “optimal” solution to the interpolation problem. Indeed, we believe that what is optimal may
depend on the application at hand; however, an investigation into issues regarding complexity,
optimality and simplicity of solutions will yield good insights and applications.
Chapter 9
Nevanlinna-Pick interpolation
9.1 Introduction
In Chapter 8 we addressed interpolation with BDFs. Interpolation with rational functions is another important problem and finds applications in realization theory, sensitivity minimization,
model reduction, robust stabilization, etc. A detailed treatment of the theory of interpolation
with rational functions, along with some applications can be found in [9] and references therein.
A class of interpolation problems in Hardy spaces consists of computing an analytic function
that satisfies some interpolation conditions, along with a norm constraint. The Nevanlinna-Pick (NP) interpolation problem is one of the most important interpolation problems in this
class. The NP interpolation problem has found numerous applications in model approximation,
robust stabilization, the model matching problem in H∞ control [44], and circuit theory [110], among
others. Very recently, Antoulas [6] and Sorensen [93] applied concepts from Nevanlinna-Pick
interpolation to address the problem of “passivity preserving model reduction”. We now
state the classical NP problem:
Given N pairs of complex numbers (λi , bi ) with λi ∈ C+ , the open right half complex plane,
and |bi | < 1, i = 1, 2 . . . N , compute a scalar rational function G(s) such that
1. G(λi ) = bi , i = 1, 2 . . . N
2. G(s) is analytic in C+ .
3. sup_{ω∈R} |G(iω)| < 1.
Several variants of the above problem have been studied with various assumptions on the data
[9]. The scalar and matrix versions, the tangential NP problem with (simple) multiplicities,
the two sided Nudelman problem, the Subspace Nevanlinna Pick Interpolation (SNIP) [84] are
some variants and generalizations of the classical NP problem. Basic to all these variants of
the NP problem is an assumption of a “frequency independent norm” that the interpolating
(rational) function must satisfy. In other words, it is assumed that the norm inequality satisfied
by the interpolating rational function is the same everywhere on the imaginary axis.
In this chapter, we re-examine various aspects of the NP interpolation problem in the
behavior theoretic framework. We show that concepts in behavioral theory can be conveniently
married to the concepts behind NP–like problems to yield generalizations in several directions.
We show that the classical NP interpolation problem can be re-cast into a problem in dissipative
systems, which we considered in considerable detail in Chapter 3 of this thesis. In such a
setting, the aspects of analyticity and the norm conditions on the interpolant can be examined
separately. This formulation also allows us to consider a generalized interpolation problem
wherein the required interpolant satisfies a “frequency-weighted” norm condition.
The results reported in this chapter were obtained in collaboration with Dr. Paolo Rapisarda, who is currently with the Department of Electronics and Computer Science, University of
Southampton, UK. These results will appear in [70]. In Section 6.2.1 we considered the behavioral view of linear, time-invariant models. Recall that the most powerful unfalsified LTI model
(MPUM) for a given set of trajectories (called data) is a model that explains every trajectory
in the set, and as little else as possible. The MPUM for a finite number of vector exponential
trajectories is an autonomous behavior. However, there also exist other models for data that
are less restrictive. We now define models that are controllable behaviors:
Definition 9.1.1 Consider the data set D = {𝒱_i e^{λ_i t} with i = 1, . . . N}. We call a matrix
F(ξ) ∈ C^{v×v}[ξ] a representation of a model for D in the generative sense if

1. Im F(d/dt) e^{λ_i t} = 𝒱_i e^{λ_i t}, i = 1, . . . N.
2. Im F(d/dt) e^{µt} = W e^{µt}, where Im W = C^v, if µ ≠ λ_i.
We will consider the NP interpolation problem in two steps. First we will address the case
when the interpolant satisfies a fixed norm condition at all frequencies. Later, we will address
the problem of an interpolant satisfying a frequency weighted norm condition.
9.2 Nevanlinna Pick interpolation – the standard case
We begin by translating the classical Nevanlinna Pick interpolation problem as given in Section
9.1 into the language of behavioral systems theory and address some issues that arise out of
such a formulation. In this chapter, we will only consider the scalar interpolation problem, and
hence we consider B ∈ L²_con. A behavioral formulation of Nevanlinna-Pick interpolation has
been reported in [84], where a characterization of solutions of a “Subspace Nevanlinna Interpolation Problem” was obtained in terms of kernel representations. In this chapter we obtain
a characterization of all solutions in terms of image representations. Such a characterization
has an advantage over [84]: controllability of all solutions obtained as image representations is
guaranteed, unlike in a characterization in terms of kernel representations.
Consider a controllable behavior B defined by an observable image representation:

    [u]   [q(d/dt)]
    [y] = [p(d/dt)] ℓ        (9.1)

Define the 2 × 2 constant matrix J_{11} as

    J_{11} = [ 1   0 ]
             [ 0  −1 ]        (9.2)
Let

    J_strict^ε = [ 1 − ε   0  ]
                 [ 0      −1 ],   ε ∈ (0, 1)

Then, we have the following lemma:

Lemma 9.2.1 sup_{ω∈R} |p(iω)/q(iω)| < 1 if and only if the behavior B associated with the rational
function p(ξ)/q(ξ) is J_strict^ε-dissipative for some ε ∈ (0, 1).
Proof: Follows by applying Theorem 3.2.3.
Lemma 9.2.1 serves as a connection between the classical (rational function based) formulation
and a behavioral formulation of the NP problem.
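For a behavior given by a concrete rational function, the norm condition of Lemma 9.2.1 can be checked numerically. The snippet below is our own illustration with placeholder polynomials p, q (q Hurwitz); it approximates the supremum on a frequency grid:

    import numpy as np

    # Sample |p(i*omega)/q(i*omega)| on a grid; p and q are placeholder data.
    p = np.poly1d([0.5, 0.1])        # p(xi) = 0.5*xi + 0.1
    q = np.poly1d([1.0, 2.0, 2.0])   # q(xi) = xi^2 + 2*xi + 2, Hurwitz
    omega = np.linspace(-100.0, 100.0, 20001)
    gain = np.abs(p(1j * omega) / q(1j * omega))
    print(gain.max() < 1)            # True: the norm condition of Lemma 9.2.1 holds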
Behavioral Nevanlinna Pick Interpolation (Problem statement): Consider N trajectories {V_i e^{λ_i t}}, i = 1 . . . N, ∈ C∞(R, C²) (which we name the data set D). Assume that

1. λ_i ∈ C+, i = 1, 2 . . . N, are distinct,
2. V_i ∈ C² are contractive, i.e. V_i = [x y]^T with |x| > |y|.

Under the above assumptions, determine all J_strict^ε-dissipative controllable behaviors B (i.e.,
all behaviors B that are J_strict^ε-dissipative for some ε ∈ (0, 1)) defined by the following kernel
representation:

    [p(d/dt)  −q(d/dt)] [u; y] = 0

such that:

1. q(ξ) is a Hurwitz polynomial.
2. B contains D, i.e. [p(d/dt) −q(d/dt)] V_i e^{λ_i t} = 0, i = 1, 2 . . . N.

It is evident from condition (2) above that

    Im [q(λ_i); p(λ_i)] ⊇ 𝒱_i,   i = 1, 2 . . . N.

Thus, the problem is actually that of interpolating N distinct subspaces 𝒱_i e^{λ_i t}, i = 1, . . . N. We
say that B “interpolates the data” if it contains the trajectories 𝒱_i e^{λ_i t}. Any behavior B that
satisfies conditions (1) and (2) will be called a solution to the “Subspace Nevanlinna Pick Problem
(SNIP)”.

Clearly, the interesting cases are only when the 𝒱_i are distinct, when solutions to the SNIP will
in general be non-constant. This fact, together with the condition that q(ξ) is Hurwitz, implies
that the J_strict^ε-dissipative behavior B which is a solution to the SNIP has positive definite
storage functions on manifest variables (see Chapter 4). Therefore, the problem statement
given above may be stated in an equivalent fashion as:

Equivalent problem statement: Consider N trajectories D = {V_i e^{λ_i t}, i = 1, . . . , N} ∈
C∞(R, C²). Assume that

1. λ_i ∈ C+, i = 1, 2 . . . N, are distinct,
2. V_i ∈ C² are contractive, i.e. V_i = [x y]^T with |x| > |y|.
Under these assumptions, determine all J_strict^ε-dissipative behaviors B associated with the
rational function p(ξ)/q(ξ) such that:

1. B has positive definite storage functions on manifest variables.
2. B contains D, i.e. Im [q(d/dt); p(d/dt)] e^{λ_i t} ⊇ V_i e^{λ_i t}, i = 1, 2 . . . N.
We now recall the definition of “Pick matrix” as given in Section 6.2.2. A Pick matrix of Q_{J_{11}}
and 𝒱_i e^{λ_i t}, i = 1, . . . , N is defined as the N × N hermitian matrix:

    T_{𝒱_i, i=1,...,N} = [ V_i^* J_{11} V_j / (λ̄_i + λ_j) ]_{i,j=1}^N        (9.3)

where V_i ∈ C² is a basis for 𝒱_i. A Pick matrix is obviously not unique, since a different basis
for 𝒱_i gives a different Pick matrix. However, it can be shown that properties of T_{𝒱_i, i=1,...,N} like
its signature, sign-definiteness and (non)singularity remain invariant under a change of basis
of the subspaces 𝒱_i. While describing these properties, we shall refer to T_{𝒱_i, i=1,...,N} as the Pick
matrix with a slight abuse of notation.
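Numerically, the Pick matrix (9.3) and its definiteness test are straightforward. The helper below is our own sketch (names illustrative, data arbitrary), representing each subspace 𝒱_i by a basis vector V_i; by Proposition 9.2.2 below, positive definiteness of this matrix is equivalent to solvability of the SNIP:

    import numpy as np

    J11 = np.diag([1.0, -1.0])

    def pick_matrix(lams, Vs, J=J11):
        """Build the N x N hermitian Pick matrix of (9.3)."""
        N = len(lams)
        T = np.empty((N, N), dtype=complex)
        for i in range(N):
            for j in range(N):
                T[i, j] = (Vs[i].conj() @ J @ Vs[j]) / (np.conj(lams[i]) + lams[j])
        return T

    # Illustrative data: contractive directions ([x, y] with |x| > |y|), lam in C+.
    # (For real-valued interpolants the data would also have to be self-conjugate.)
    lams = [1.0 + 1.0j, 2.0]
    Vs = [np.array([1.0, 0.3 - 0.2j]), np.array([1.0, 0.5])]
    T = pick_matrix(lams, Vs)
    print(np.linalg.eigvalsh(T).min() > 0)   # True: the Pick matrix is positive definite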
We now use the notion of the “dual” set of the data, previously defined in Section 6.2.2, to
simplify the solution of the NP problem.
9.2.1 Dualizing of the data
Given subspaces 𝒱_i e^{λ_i t}, i = 1 . . . N with V_i ∈ C² and contractive, define a “data set” D as

    D = {𝒱_1 e^{λ_1 t}, 𝒱_2 e^{λ_2 t}, . . . 𝒱_N e^{λ_N t}}

Since we are solving SNIP for real valued rational functions, we want the data to respect this
condition. Therefore, henceforth, we assume that the data set D is self-conjugate. Recall from
Section 6.2.2 that we have defined the “dual subspace” of 𝒱_i e^{λ_i t} as 𝒱_i^{⊥J_{11}} e^{−λ̄_i t}, where 𝒱_i^{⊥J_{11}} is
defined by:

    𝒱_i^{⊥J_{11}} = {v ∈ C² | v^* J_{11} w = 0 ∀ w ∈ 𝒱_i}

Since 𝒱_i is contractive, 𝒱_i^{⊥J_{11}} is uniquely defined and in fact the two are complements of one
another in C². The dual of the data set D is denoted D^{⊥J_{11}}, and is defined in the obvious way:

    D^{⊥J_{11}} = {𝒱_1^{⊥J_{11}} e^{−λ̄_1 t}, 𝒱_2^{⊥J_{11}} e^{−λ̄_2 t}, . . . 𝒱_N^{⊥J_{11}} e^{−λ̄_N t}}
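For J_{11} = diag(1, −1) and a contractive V = [x y]^T, a spanning vector of the dual subspace can be written down explicitly: [conj(y), conj(x)]^T is J_{11}-orthogonal to V. The sketch below is our own construction (the helper name is hypothetical) and verifies the orthogonality:

    import numpy as np

    def dual_direction(V):
        """Spanning vector of the J11-orthogonal complement of span(V)."""
        x, y = V
        v = np.array([np.conj(y), np.conj(x)])
        # v^* J11 V = conj(conj(y))*x - conj(conj(x))*y = y*x - x*y = 0
        assert abs(v.conj() @ np.diag([1.0, -1.0]) @ V) < 1e-12
        return v

    V1 = np.array([1.0, 0.3 - 0.2j])
    print(dual_direction(V1))   # direction spanning the dual subspace of span(V1)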
Dualizing of the subspaces 𝒱_i e^{λ_i t} is instrumental in determining a characterization of all solutions to the SNIP. We quote the following result from [82] (Theorem 8.3.1, page 148):

Proposition 9.2.2 Consider the data D. The following statements are equivalent:

1. The Pick matrix T_{𝒱_i, i=1,...,N} is positive definite.
2. There exists a solution to the SNIP.
Further, there exists a matrix R(ξ) such that ker R(d/dt) is the MPUM for D ∪ D^{⊥J_{11}}. R(ξ)
is J_{11}-unitary: R^T(−ξ) J_{11} R(ξ) = r(ξ) r(−ξ) J_{11}. The behavior B := ker [p(d/dt) −q(d/dt)] is a
solution to the SNIP if and only if p(ξ), q(ξ) are coprime and there exists f(ξ) ∈ R[ξ] Hurwitz
such that

    f(ξ) [p(ξ)  −q(ξ)] = R(d/dt) [π(ξ)  −φ(ξ)]

with ||π/φ||_{H∞} < 1.
Note that Proposition 9.2.2 gives a characterization of all solutions to the SNIP in terms of
kernel representations. We now re-formulate the characterization of all solutions to SNIP given
in Proposition 9.2.2 using image representations:

Lemma 9.2.3 Consider R(d/dt) as in Proposition 9.2.2. Define F(ξ) = adj R(ξ), i.e., F(ξ)R(ξ) =
det R(ξ) I_2. Then, Im F(λ_i) = 𝒱_i, Im F(−λ̄_i) = 𝒱_i^{⊥J_{11}} and Im F(µ) = C² if µ ≠ λ_i, −λ̄_i,
i = 1, . . . , N.

Proof: Note that det R(ξ) has roots at λ_i and −λ̄_i. Therefore, R(λ_i)F(λ_i) = 0 and
R(−λ̄_i)F(−λ̄_i) = 0. Since ker R(d/dt) is the MPUM, we have Im F(λ_i) = 𝒱_i and Im F(−λ̄_i) = 𝒱_i^{⊥J_{11}}.

Following the above lemma, we restate Proposition 9.2.2:
Proposition 9.2.4 Let B := Im [q(d/dt); p(d/dt)]. B is a solution to the SNIP with data D if and
only if T_{𝒱_i, i=1,...,N} is positive definite. Then, every solution to the SNIP is characterized by

    f(ξ) [q(ξ); p(ξ)] = F(d/dt) [φ(ξ); π(ξ)]

where ||π/φ||_{H∞} < 1, f(ξ) is Hurwitz, Im F(λ_i) = 𝒱_i and Im F(−λ̄_i) = 𝒱_i^{⊥J_{11}}.
Remark 9.2.5 Proposition 9.2.4 is only a restatement of Proposition 9.2.2 in terms of image
representations. We will see that we can generalize NP interpolation problems to a considerable
extent using Proposition 9.2.4. An important advantage of converting the SNIP from a kernel
representation to an image representation is that by doing so, controllability is guaranteed. Such
a re-formulation also explains why dualizing the data is crucial in solving the Nevanlinna-Pick
interpolation problem.
Notice that in the solution suggested above, we have considered the data set D ∪ D^{⊥J_{11}}. This
is in apparent disagreement with the problem statement of SNIP, which relates to finding all
interpolating behaviors for D alone. The following section addresses the necessity of considering
D ∪ D^{⊥J_{11}} rather than D alone, and how this consideration still yields all solutions to SNIP.
9.3 System theoretic implications of dualizing the data
In this section, we give a justification for dualizing the data. We start by considering a hypothetical situation in which the data has not been dualized. Let, as before, D = {V_i e^{λ_i t}} be the
data set and Im F(d/dt) be a model for D, i.e. Im F(λ_i) = 𝒱_i and Im F(µ) = C², µ ≠ λ_i. We
emphasize that D respects the existence of a real valued solution to SNIP, i.e., the data
is self-conjugate. With this assumption, D admits a model (in a generative sense) that is real
valued. Then, the following lemma is easily proved:
Lemma 9.3.1 Let D = {V_i e^{λ_i t}, i = 1, . . . , N}. Let F(d/dt) be a model for D in a generative
sense. Consider polynomials p(ξ), q(ξ) ∈ R[ξ]. Then,

    Im [q(d/dt); p(d/dt)] e^{λ_i t} = V_i e^{λ_i t}

if and only if there exist coprime r(ξ), s(ξ) ∈ R[ξ] such that:

    [q(ξ); p(ξ)] = F(ξ) [r(ξ); s(ξ)]

Proof: Follows from the fact that Im F(λ_i) = 𝒱_i, which in the scalar case are one dimensional.
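The parametrization of Lemma 9.3.1 amounts to a polynomial matrix-vector product. The sketch below is our own illustration (names and data illustrative): F is given entrywise as coefficient arrays (lowest degree first), and every interpolant [q; p] is obtained from a free coprime pair r, s:

    import numpy as np
    from numpy.polynomial import polynomial as P

    def apply_F(F, r, s):
        """Compute [q; p] = F(xi) [r; s] with polynomial coefficient arrays."""
        q = P.polyadd(P.polymul(F[0][0], r), P.polymul(F[0][1], s))
        p = P.polyadd(P.polymul(F[1][0], r), P.polymul(F[1][1], s))
        return q, p

    # e.g. F(xi) = [[1, xi], [xi, 1]] and the free pair r = 1, s = 2:
    F = [[[1.0], [0.0, 1.0]],
         [[0.0, 1.0], [1.0]]]
    q, p = apply_F(F, [1.0], [2.0])
    print(q, p)   # q(xi) = 1 + 2*xi, p(xi) = 2 + xi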
Lemma 9.3.1 gives a characterization of all controllable behaviors that interpolate given subspaces. At this juncture, when all possible interpolants have been characterized, we bring in
the additional condition of dissipativity. Consider the two variable polynomial matrix Φ(ζ, η)
defined as

    Φ(ζ, η) = F^T(ζ) J_{11} F(η)

The following proposition relates Φ-dissipativity with solutions to SNIP:
Proposition 9.3.2 Consider the set L^Φ and B0 ∈ L^Φ. Then, B := F(d/dt)(B0) is a J_{11}-dissipative behavior that interpolates D := {V_i e^{λ_i t}, i = 1, . . . , N}. Moreover, for every J_{11}-dissipative behavior B that interpolates D there exists a corresponding Φ-dissipative behavior
B0.

Proof: We have seen in Chapters 3 and 4 of this thesis that F(d/dt) can be thought of as a
differential operator that maps every Φ-dissipative behavior into a J_{11}-dissipative behavior.
Moreover, it is “invertible”, i.e. for every J_{11}-dissipative behavior, there exists a corresponding
Φ-dissipative behavior and vice versa. B interpolates D since Im F(λ_i) = 𝒱_i. Further, such a
B is J_{11}-dissipative if and only if it is obtained as the image of a Φ-dissipative behavior under
the map F(d/dt).
Thus, in principle, given any representation of a model for D (in a generative sense),
one may construct a QDF QΦ from this representation. If the set of Φ-dissipative
behaviors is “known”, the set of all J_{11}-dissipative behaviors that interpolate D can
be determined.
One sees immediately that QΦ is a fairly general supply function and no easy characterization
is available for the set of Φ-dissipative behaviors. Thus, determining the set of Φ-dissipative
behaviors is arguably a difficult task in general. We try to make QΦ “as simple as possible”
so that the set of Φ-dissipative behaviors is “known”. Dualizing the data does just this, as
we show below. We will try to make QΦ “like Q_{J_{11}}”. The following result is a consequence of
trying to make QΦ “simple”:
Theorem 9.3.3 The matrix F(ξ) satisfies

    F^T(−ξ) J_{11} F(ξ) = r(ξ) r(−ξ) U^T(−ξ) J_{11} U(ξ)

with U(ξ) unimodular if and only if columns of F(−λ̄_i) are J_{11}-orthogonal to columns of F(λ_i),
i.e. F^T(−λ̄_i) J_{11} F(λ_i) = 0 for all λ_i that satisfy det F(λ_i) = 0.

Proof: Since F(ξ) ∈ R^{2×2}[ξ], it follows that if λ_i ∈ C is a root of det F(ξ) = 0, then λ̄_i is also a
root of det F(ξ) = 0. Assume F^T(−λ̄_i) J_{11} F(λ_i) = 0. Since F(ξ) is nonsingular, F^T(−ξ) J_{11} F(ξ)
is a nonzero polynomial matrix. Hence, there exists a polynomial r(ξ) ∈ R[ξ] (having roots
at λ_i) such that r(ξ) r(−ξ) divides F^T(−ξ) J_{11} F(ξ). Define Z(ξ) = F^T(−ξ) J_{11} F(ξ)/r(ξ)r(−ξ).
This is a unimodular matrix having the same inertia as J_{11} for almost all ξ ∈ iR. We compute
a J_{11}-spectral factorization of Z(ξ) and obtain a unimodular matrix U(ξ) such that Z(ξ) =
U^T(−ξ) J_{11} U(ξ). Hence F^T(−ξ) J_{11} F(ξ) = r(ξ) r(−ξ) U^T(−ξ) J_{11} U(ξ). The converse of the
statement is trivial.
Theorem 9.3.3 shows that F (ξ) is a model for a dualized data set in the generative sense.
Remark 9.3.4 The interesting feature of the Nevanlinna-Pick algorithm is that there exists a
representation of a model for D ∪ D^{⊥J_{11}} with U(ξ) (see Theorem 9.3.3) equal to the identity
matrix. Indeed, if Im F(d/dt) models D ∪ D^{⊥J_{11}} then Im F(d/dt)V(d/dt) also models D ∪ D^{⊥J_{11}} if
and only if V(ξ) is unimodular. This follows from the fact that the columns of F(ξ) and those
of F(ξ)V(ξ) generate the same module. Thus, given any matrix F(ξ) such that columns of
F(λ_i) are J_{11}-orthogonal to columns of F(−λ̄_i), there exists a unimodular matrix V(ξ) such
that F_1(ξ) = F(ξ)V(ξ) satisfies F_1^T(−ξ) J_{11} F_1(ξ) = r(ξ) r(−ξ) J_{11}. Therefore, while it is hardly
surprising that a representation with U(ξ) = I_2 exists, it is interesting to note the simple way
to construct this representation (see Section 6.2.3).
Thus Theorem 9.3.3 and Remark 9.3.4 imply that if Im F(−λ̄_i) models D^{⊥J_{11}} then F(ξ) can be
suitably modified so that it enjoys the J_{11}-unitary property. Consider a behavior B associated
with the rational function p(ξ)/q(ξ) that is a solution to the SNIP. Therefore, it follows that

    Im [q(λ_i); p(λ_i)] = 𝒱_i

While characterizing the set of all solutions to the SNIP in Proposition 9.2.4, we have considered
the data set D ∪ D^{⊥J_{11}}. This is apparently more restrictive than the original problem statement
which required B to interpolate only D. However, the characterization obtained in Proposition
9.2.4 shows that a solution to the SNIP may be obtained with a common factor that is a
Hurwitz polynomial (say f(ξ)), a polynomial having roots at −λ̄_i. Thus, the vector

    f(−λ̄_i) [q(−λ̄_i); p(−λ̄_i)]

is such that f(−λ̄_i) = 0. Therefore, q(−λ̄_i), p(−λ̄_i) are, in a sense, “free” and need not obey
constraints of interpolating D^{⊥J_{11}}.
Thus, in summary, dualizing the data has the following system theoretic implications:

• It is necessary for, and guarantees the existence of, a J_{11}-unitary model Im F(d/dt).
• J_{11}-unitariness of the model F(d/dt) implies that the QDF QΦ defined by Φ(ζ, η) =
F^T(ζ) J_{11} F(η) is “like Q_{J_{11}}”, i.e., Q_{J_{11}} ∼ QΦ. Thus, the set of Φ-dissipative behaviors in this case is “known”, which enables an easy characterization of the solutions of the
SNIP.
We now address a generalization of SNIP using QDFs. We obtain a characterization of
interpolants that satisfy a frequency-dependent norm, a result which is new and important.
9.4 Nevanlinna-Pick problem with frequency dependent norms
In this section, we address the problem of constructing Φ-dissipative behaviors that interpolate
certain given subspaces. The matrix Φ that induces the QDF QΦ may not necessarily be a
constant matrix. Hence, interpolating behaviors that are Φ-dissipative are required to satisfy
a “frequency dependent norm” along with the given interpolation conditions. This leads us to
define a generalized SNIP along the lines stated below.
We assume that the QDF QΦ is such that Φ(ζ, η) admits the factorization

    Φ(ζ, η) = K^T(ζ) J_strict^ε K(η)

with K(ξ) square and nonsingular. Necessary and sufficient conditions for such a factorization
(and an algorithm to compute the factorization when it exists) have already been given in
Chapter 4, Theorem 4.4.3. We now state a “generalized subspace Nevanlinna interpolation
problem” (GSNIP) and address it using QDFs.
Problem (GSNIP): Given a QDF QΦ with Φ(ζ, η) = K^T(ζ) J_strict^ε K(η) with K(ξ) ∈ R^{2×2}[ξ]
and nonsingular, together with N distinct subspaces 𝒱_i e^{λ_i t}, i = 1, . . . , N, determine necessary
and sufficient conditions for the existence of Φ-dissipative behaviors B := Im [q(d/dt); p(d/dt)]
such that

1. B has positive definite storage functions (with respect to QΦ) on manifest variables, and
2. Im [q(λ_i); p(λ_i)] = 𝒱_i.

Assumption: We assume that {λ_i}_{i=1,...,N} ∩ spec K(ξ) = ∅, i.e. λ_i is not a singularity of
K(ξ). We also assume that the spaces K(λ_i)𝒱_i, i = 1, . . . , N, are contractive.
The following theorem gives necessary and sufficient conditions for the solvability of Problem
(GSNIP):

Theorem 9.4.1 Given

1. a QDF QΦ with Φ(ζ, η) = K^T(ζ) J_strict^ε K(η),
2. a Φ-dissipative C∞-behavior B defined by an observable image representation:

       [u]   [q(d/dt)]
       [y] = [p(d/dt)] ℓ

3. subspaces 𝒱_i e^{λ_i t} with V_i ∈ C², λ_i ∈ C+, i = 1, . . . , N,

the Problem (GSNIP) is solvable if and only if

1. [r(ξ); s(ξ)] = K(ξ) [q(ξ); p(ξ)] is such that r(ξ), s(ξ) are coprime.
2. The modified Pick matrix T_{𝒱_i, i=1,...,N} is positive definite, where

       T_{𝒱_i, i=1,...,N} = [ V_i^* K^T(λ̄_i) J_{11} K(λ_j) V_j / (λ_j + λ̄_i) ]_{i,j=1}^N

   and V_i is a basis for 𝒱_i.

Proof: Define a behavior B0 as Im [r(d/dt); s(d/dt)] ℓ. Since B is Φ-dissipative, it follows from
Theorem 3.5.3 that B0 is J_strict^ε-dissipative. From Theorem 4.4.6, the behavior B has positive
definite storage functions if and only if r(ξ) and s(ξ) are coprime, and r(ξ) is Hurwitz. Define
W_i = K(λ_i)V_i. By assumption, the spaces Im W_i are contractive. Thus, there exists a solution
to GSNIP if and only if there exists a solution to the SNIP.

Finally, there exists a solution to the SNIP if and only if the corresponding Pick matrix is
positive definite:

    T_{𝒱_i, i=1,...,N} = [ W_i^* J_{11} W_j / (λ̄_i + λ_j) ]_{i,j=1}^N > 0

This argument shows that there exists a Φ-dissipative behavior that interpolates 𝒱_i e^{λ_i t} if
and only if the co-primeness conditions hold, and if in addition the modified Pick matrix is
positive definite.

Conversely, suppose that there exists a Φ-dissipative behavior B with positive definite
storage functions that interpolates given subspaces 𝒱_i e^{λ_i t} for which the modified Pick matrix
T_{𝒱_i, i=1,...,N} is not positive definite. This implies that there exists a J_strict^ε-dissipative behavior
B0 := K(d/dt)(B) with positive definite storage functions that interpolates the modified data
(λ_i, W_i) := (λ_i, K(λ_i)V_i) (where V_i is a basis for 𝒱_i), and for which the corresponding Pick
matrix is not positive definite. This is a contradiction (Proposition 9.2.4).
The essential idea in the above proof is that the matrix K(ξ) can be used to convert the problem
into a problem of SNIP with Q_{J_strict^ε}. Thus a solution to the Problem (GSNIP) can be obtained
as follows (a numerical sketch of the first two steps appears after this list):

1. Given subspaces 𝒱_i e^{λ_i t}, i = 1, . . . , N, choose a basis V_i for 𝒱_i. Define W_i = K(λ_i)V_i.
2. Compute the Pick matrix [W_i^* J_{11} W_j /(λ̄_i + λ_j)]_{i,j=1}^N. If this matrix is positive definite
then proceed; else stop, there is no solution.
3. Compute all J_strict^ε-dissipative behaviors that interpolate the subspaces W_i e^{λ_i t}, and which
have positive definite storage functions. Let B0 be such a (controllable) behavior.
4. Every behavior B that satisfies the condition that there exists a B0 (with an observable
image representation) satisfying B = K(d/dt)(B0) is a solution to Problem (GSNIP).
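The following sketch (ours, not from the thesis) implements steps 1 and 2 above; K is assumed to be a callable evaluating the polynomial matrix K(ξ) at a point, and the weight and data are illustrative:

    import numpy as np

    J11 = np.diag([1.0, -1.0])

    def gsnip_feasible(K, lams, Vs):
        Ws = [K(lam) @ V for lam, V in zip(lams, Vs)]   # step 1: W_i = K(lam_i) V_i
        N = len(lams)
        T = np.empty((N, N), dtype=complex)             # step 2: modified Pick matrix
        for i in range(N):
            for j in range(N):
                T[i, j] = (Ws[i].conj() @ J11 @ Ws[j]) / (np.conj(lams[i]) + lams[j])
        return np.linalg.eigvalsh(T).min() > 0          # positive definite?

    # An illustrative polynomial frequency weight K(xi) and data:
    K = lambda xi: np.array([[1.0 + 0.1*xi, 0.0], [0.0, 1.0]])
    lams = [1.0, 2.0]
    Vs = [np.array([1.0, 0.4]), np.array([1.0, 0.5])]
    print(gsnip_feasible(K, lams, Vs))   # True: a solution to GSNIP exists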
9.5 Conclusion
In this chapter we have provided a behavior-theoretic characterization of all solutions to the
Nevanlinna-Pick interpolation problem. The characterization presented here guarantees controllability. We have provided an explanation of the need for the so-called “mirror images” in
interpolation problems. In all classical formulations of the Nevanlinna-Pick problem, the interpolant is required to satisfy a “frequency independent” norm condition. We have generalized
the Nevanlinna-Pick problem to cases where an interpolant is required to satisfy a “frequency
dependent” norm condition. This is shown to be intimately related to dissipativity with respect
to a supply function defined using a Quadratic Differential Form. We have obtained necessary
and sufficient conditions for the solvability of a class of Nevanlinna-Pick interpolation problems
with “frequency dependent” norm conditions.
Chapter 10
Conclusion and future work
We now take a bird’s-eye view of the problems addressed in this thesis. We list contributions
from this study. We also emphasize the connections between different aspects of systems and
control theory that have been developed in this thesis. Finally, we address possible directions
for future work.
10.1 Summary of results
Chapter 3 dealt with a parametrization for dissipative behaviors. While it is not difficult to
check if a given linear, time-invariant dynamical system is dissipative with respect to a supply function defined by a QDF, the question of constructing all dynamical systems that are
dissipative with respect to a given supply function is not straightforward. We first considered
single-input-single-output (SISO) dissipative systems and obtained an explicit parametrization
for the set of all dissipative systems under some assumptions on the supply function (the assumption of constant signature on the imaginary axis). We generalized the results for the SISO
case to the multi-input-multi-output (MIMO) case, again under the assumption of constant
signature on the imaginary axis. We then considered the case when the assumption of constant signature on the imaginary axis does not hold. Given a general supply function defined
by a QDF, we showed that one can always parametrize a subset of the set of all dissipative
behaviors if the assumption of constant signature on the imaginary axis does not hold. The
parametrization in this case was obtained using what we defined as “split sums”.
The problem addressed in Chapter 3 can be examined with some additional conditions, for
example: when do there exist positive definite storage functions for dissipative systems? We
have examined precisely this question in Chapter 4, “KYP lemma and its extensions”. We
first noted the well known result that the classical Kalman-Yakubovich-Popov (KYP) lemma
gives an equivalence between passivity and the existence of positive definite storage functions.
We examined the KYP lemma in a representation-free manner and derived it using behavior-theoretic arguments. We then generalized the lemma for systems that are dissipative with
respect to certain “special” supply functions. We showed that a given dissipative dynamical
system has positive definite storage functions on manifest variables if and only if an associated
dynamical system is “passive” and has certain properties (the properties of observability of an
image representation, and the property of there being no non-trivial memoryless part).
Chapters 3 and 4 provide the basic tools to investigate problems in different areas in systems
and control theory. A majority of problems addressed in the later chapters can be thought of
as applications of results obtained in Chapters 3 and 4. The results in Chapters 3 and 4 thus
provide a unifying thread that runs across most chapters in this thesis.
One of the most interesting applications of storage functions is their use as Lyapunov functions. We investigated this aspect in Chapter 5, which dealt with the “absolute stability”
problem in nonlinear systems. We considered autonomous systems obtained by interconnection
of LTI systems with nonlinearities, through given interconnection constraints. The treatment
unifies stability analysis for a large class of systems. We obtained MIMO versions of classical
results as special cases.
In Chapters 3 and 4, a certain factorization of polynomial matrices called “polynomial
J-spectral factorization” was invoked. This factorization is an important and well-studied
problem in computations with polynomial matrices. We addressed the polynomial J-spectral
factorization problem in Chapter 6. Using behavior-theoretic ideas and QDFs, we obtained
a new algorithm for polynomial J-spectral factorization. This algorithm builds on results in
interpolation theory, specifically the Nevanlinna-Pick interpolation problem. We implemented
the algorithm and reported the numerical aspects. The algorithm reported in Chapter 6 is one
of the simplest among those available in literature. It also has good numerical properties for
problems of reasonable size.
In Chapter 7 we returned to the theme of dissipative systems, which we considered in Chapters 3, 4 and 5. An interesting application of results obtained in Chapters 3 and 4 is in the
well known “H∞ ” problem. The H∞ problem is one of the most important areas in control
theory and finds applications in, for instance, disturbance attenuation, passivation, etc. In
Chapter 7, we obtained a novel characterization of all solutions to the H∞ problem using the
parametrization results obtained in Chapters 3 and 4.
Many real-life systems cannot be adequately modeled by linear laws. A model defined by
a quadratic form is better suited in such cases. This motivated Chapter 8, which represents
a new and interesting line of research: modeling of data with bilinear differential forms. We
first translated the modeling problem to an interpolation problem with two variable polynomial
matrices. We gave an iterative algorithm to solve the interpolation problem. The algorithm
relies on standard polynomial matrix computations and is suitable for implementation in a
symbolic computational package. As a special case, we considered the problem of interpolating
with scalar bivariate polynomials. We also addressed the problem of constructing storage functions for autonomous dissipative systems. The approach presented in Chapter 8 for computing
storage functions presents an interesting alternative to methods available in literature.
Continuing with the interpolation ideas in Chapter 8, we considered the problem of interpolation with rational functions in Chapter 9. We addressed a generalization of the well
known Nevanlinna-Pick interpolation problem. This interpolation problem is about finding a
rational function that satisfies some norm constraints on the imaginary axis, and interpolates
given data. The norm constraint is “frequency independent”. In Chapter 9, we generalized the
Nevanlinna-Pick interpolation problem to rational functions that satisfy a “frequency dependent” norm condition. We used results obtained in Chapters 3, 4, and 6 in order to address this
generalization.
With this brief overview of the research presented in this thesis, we will now consider possible
directions for future work.
10.2 Directions for further work
We view results presented in this thesis as only the starting point of a systematic study of
quadratic differential forms and their applications vis-a-vis dynamical systems. In our opinion,
the ideas presented in this thesis can be generalized in many directions. We present a
summary of some interesting problems for research.
Parametrization of dissipative systems: In Chapter 3 we obtained a complete parametrization of dissipative systems under some assumptions on the supply function (the assumption of constant inertia on the imaginary axis). When this assumption does not hold,
our results give only sufficient conditions. Investigation is required into how to obtain
necessary and sufficient conditions for a general supply function.
Kalman-Yakubovich-Popov lemma: The KYP lemma relies on certain assumptions on
the supply functions. Relaxing these assumptions is an immediate problem. Research into
state space formulations of our results for the sake of efficient computations with Linear
Matrix Inequalities (LMIs) should yield interesting results.
Nonlinear systems: The approach presented in Chapter 5 is promising. Investigations
can be taken up into examining nonlinearities with memory, like relays with hysteresis. Research into state space based formulations, integration with other theories like
“integral-quadratic-constraints” will also be interesting and insightful.
Interpolation problems: Applications to concrete real-life systems (for example econometric
systems, multidimensional signal processing, etc) can be addressed using the recursive
algorithm for interpolation with bilinear and quadratic differential forms. A discrete-time
version of the algorithm can be used for quadratic filter design in digital signal processing.
Generalization of Nevanlinna-Pick interpolation can be used in order to address a “data
driven” theory for control which does not involve models.
Synthesis of dissipative systems: We have only considered systems dissipative with respect
to a metric defined by a “constant matrix”. Investigation along the lines of Chapter 7
can be used to generalize the synthesis problem to what is usually called the “frequency
weighted H∞ control”. Also, the treatment in Chapter 7 was based on certain assumptions
about the hidden behavior; relaxing these is an immediate problem. Investigation is
required to determine the feasibility of the approach taken in Chapter 7 for obtaining
efficient computational algorithms.
References
[1] F.A. Aliev and V.B. Larin, “Algorithm of J-spectral factorization of polynomial matrices”,
Automatica, 33 (1997), pp 2179-2182.
[2] B.D.O. Anderson, “A system theory criterion for positive real matrices” SIAM Journal of
Control, 5 (1967) , pp 171-182.
[3] B.D.O. Anderson and S. Vongpanitlerd, “Network Analysis and Synthesis: a Modern Systems Theory Approach” , Prentice Hall, 1973.
[4] B.D.O. Anderson, K.L. Hitz, and N.D. Diem, “Recursive algorithm for spectral factorization”, IEEE Transactions on Circuits and Systems, 21 (1976), pp. 453-464.
[5] B.D.O. Anderson, P. Kokotovic, I.D. Landau and J.C. Willems, “Dissipativity of dynamical systems: applications in control – dedicated to Vasile Mihai Popov”, Special issue:
European Journal of Control, 8(2002).
[6] A.C. Antoulas, “A new result on passivity preserving model reduction”, Systems and Control Letters, 54(2005), pp 361-374.
[7] A.C. Antoulas, and J.C. Willems, “A behavioral approach to linear exact modeling”, IEEE
Transactions on Automatic Control, 38 (1993), pp. 1776-1802.
[8] D.P. Atherton, “Nonlinear Control Engineering” , Van Nostrand, 1975.
[9] J.A. Ball, I. Gohberg, L. Rodman, “Interpolation of Rational Matrix Functions”, Birkhauser-Verlag, 1990.
[10] B. Bojanov, Y. Xu, “On a Hermite interpolation by polynomials of two variables”, SIAM
Journal of Numerical Analysis, 39(2002), pp 1780-1793.
[11] M.N. Belur,“Control in a behavioral context”, Doctoral dissertation, University of Groningen, The Netherlands, 2003.
[12] M.K. Camlibel, J.C. Willems, M.N. Belur, “On the dissipativity of uncontrollable systems”, Proceedings of 42nd IEEE Conference on Decision and Control, 2003, pp 1645-1650.
[13] D.S. Bernstein and S.P. Bhat “Energy equipartition and the emergence of damping in
lossless systems”, Proc 41st IEEE Conf Decision and Control (2002), pp 2913-2918.
[14] Z. Bai, J. Demmel et al., “Templates for the Solution of Algebraic Eigenvalue Problems: a
Practical Guide”, SIAM publication, 2000.
[15] R.W. Brockett, J.L. Willems, “Frequency domain stability criteria–Part I” IEEE Transactions on Automatic Control, 10 (1965), pp 255-261.
[16] R.W. Brockett, J.L. Willems, “Frequency domain stability criteria–Part II” IEEE Transactions on Automatic Control, 10 (1965), pp 407-413.
[17] F.M. Callier, “Spectral factorization by symmetric factor extraction”, IEEE Transactions
on Automatic Control, 30(1985), pp 453-465.
[18] H.J. Carlin, “The scattering matrix in network theory”, IRE Transactions on Circuit
Theory, 3 (1956), pp 88- 97.
[19] J.C. Doyle, K. Glover, P. Khargonekar and B.A. Francis “State space solutions to standard
H2 and H∞ problems”, IEEE Transactions on Automatic Control, 34 (1989), pp 831-847.
[20] M. Fu, S. Dasgupta, and Y.C. Soh, “Integral quadratic constraint approach vs. multiplier
approach”, Proc. IEEE conference on Control, Automation, Robotics and Vision, ICARCV
vol. 1, pp. 144 - 149, 2002.
[21] R. Fitts, “Two counter-examples to Aizerman’s conjecture”, IEEE Transactions on Automatic Control, 11(1966), pp 553-556.
[22] F.R. Gantmacher “The Theory of Matrices, Vol. 1” Chelsea Publishing Company, New
York, 1960.
[23] M. Gasca and T. Sauer “On Bivariate Hermite Interpolation with Minimal Degree Polynomials”, SIAM Journal of Numerical Analysis, 37(2000), pp 772-798.
[24] T.T. Georgiou and P.P. Khargonekar, “Spectral factorization and Nevanlinna-Pick interpolation”, SIAM Journal on Control and Optimization, 25(1987), pp 754-766.
[25] T.T. Georgiou, “On a Schur-algorithm based approach to spectral factorization: Statespace formulae”, Systems and Control Letters, 10(1998), pp 123-129.
[26] T.T. Georgiou, “Computational aspects of spectral factorization and the tangential Schur
algorithm”, IEEE Transactions on Circuits and Systems, 36 (1989), pp 103-108.
[27] T.T. Georgiou and P.P. Khargonekar, “Spectral factorization of matrix valued functions
using interpolation theory”, IEEE Transactions on Circuits and Systems, 36(1989), pp
568-574.
[28] I. Gohberg, P. Lancaster, L. Rodman “Factorization of Selfadjoint Matrix Polynomials
with Constant Signature” Linear and Multilinear Algebra, 11 (1982), pp 209-224.
[29] I. Gohberg, P. Lancaster, L. Rodman “Matrix Polynomials”, Academic Press, 1982.
[30] J.M. Goncalves, A. Megretski, M.A. Dahleh, “Global stability of relay feedback systems”,
IEEE Transactions on Automatic Control, 46(2001), pp 550–562.
[31] W.M. Haddad, V. Kapila, “Absolute stability criteria for multiple slope restricted nonlinearities”, IEEE Transactions on Automatic Control, 40 (1995),pp 361-365.
[32] W.M. Haddad, V. Chellaboina, and S.G. Nersesov, “A system-theoretic foundation for
thermodynamics: energy flow, energy balance, energy equipartition, entropy and ectropy”,
Proceedings of American Control Conference, 2004, pp 396-417.
[33] E.V. Haynsworth, “Determination of inertia of a partitioned hermitian matrix”, Linear
algebra and its applications, 1 (1968), pp 73-81.
[34] D.J. Hill and P.J. Moylan “Dissipative dynamical systems: basic input, output and state
properties”, Journal of the Franklin Institute, 309(1980), pp 327-357.
[35] D.J. Hill and P.J. Moylan “The stability of nonlinear dissipative systems”, IEEE Transactions on Automatic Control, 21(1976), pp 708-711.
[36] J.C. Hsu and A.U. Meyer, “Modern control principles and applications”, McGraw Hill,
1968.
[37] T. Hu, B. Huang, and Z. Lin, “Absolute Stability With a Generalized Sector Condition”, IEEE
Transactions on Automatic Control, 49(2004), pp 535–548.
[38] J. Ježek and V. Kučera, “Efficient algorithm for matrix spectral factorization”, Automatica,
21 (1985), pp 663-669.
[39] K.H. Johansson, A.E. Barabanov, and K.J. Astrom, “Limit cycles with chattering in relay
feedback systems”, IEEE Transactions on Automatic Control, 47(2002) pp 1414–1423.
[40] R.E. Kalman, “Lyapunov functions for the problem of Lur’e in automatic control”,
Proc. Nat. Acad. Sci. USA, 49 (1963), pp 201-205.
[41] O. Kaneko and T. Fujii, “Discrete-time average positivity and spectral factorization in a
behavioral framework”, Systems and Control Letters, 39(2000), pp 31-44.
[42] C.-Y. Kao, A. Megretski and U. Jönsson, “Specialized fast algorithms for IQC feasibility
and optimization problems”, Automatica, 40(2004), pp 239-252.
[43] J.S. Karmarkar, “On Siljak’s absolute stability test” Proceedings of the IEEE, 58 (1970),
pp 817-819.
[44] H. Kimura, “Directional interpolation approach to H∞ -Optimization and robust stabilization”, IEEE Transactions on Automatic Control, 32 (1987), pp 1085-1093.
[45] H. Kimura, “Directional interpolation in the state space”, Systems and Control Letters, 10
(1988), pp 317-324.
[46] H. Kimura, “Conjugation, interpolation, and model-matching in H∞ ”, International Journal of Control, 49 (1989), pp 269-307.
[47] M. Kuijper, J.W. Polderman, “Behavioral models for list decoding”, Mathematical and
computer modelling of dynamical systems, 8 (2002), pp 429-443.
[48] V.V. Kulkarni and M.G. Safonov, “All multipliers for repeated monotone nonlinearities”, IEEE
Transactions on Automatic Control, 47(2004), pp 1209-1212.
[49] H. Kwakernaak and M. Šebek, “Polynomial J-spectral factorization”, IEEE Transactions
on Automatic Control, 39 (1994), pp 315-328.
[50] D.P. O’Leary, “Symbiosis between linear algebra and optimization”, http://citeseer.ist.psu.edu/223627.html, July 2005.
[51] M. Sudan, “Efficient checking of polynomials and proofs and the hardness of approximation problems”, Lecture notes in computer science, Springer-Verlag, 1995.
[52] A. Megretski, A. Rantzer, “System analysis via integral quadratic constraints”, IEEE
Transactions on Automatic Control, 42 (1997), pp 819-830.
[53] G. Meinsma, “Frequency-domain methods in H∞ -control”, Doctoral Dissertation, University of Twente, 1993.
[54] J. Meixner, “On the theory of linear passive systems”, Archives for Rational Mechanics
and Analysis, 17 (1964), pp 278-296.
[55] P.J. Moylan, “Implications of passivity in a class of nonlinear systems”, IEEE Transactions
on Automatic Control, 19(1974), pp 373-381.
[56] R.W. Newcomb, “Linear Multiport Synthesis” McGraw Hill, New York, 1966.
[57] G. Park, D. Banjerdpongchai, T. Kailath, “The asymptotic stability of nonlinear lure systems with multiple slope restrictions”, IEEE Transactions on Automatic Control, 43(1998),
pp 979-982.
[58] R. Peeters and P. Rapisarda, “A two-variable approach to solve the polynomial Lyapunov
equation”, System and Control Letters, 42(2001), pp. 117-126.
[59] I. Pendharkar, “Model reduction and associated problems in linear systems theory”, Masters dissertation, Department of Electrical Engineering, Indian Institute of Technology
Bombay, 2001.
[60] I. Pendharkar and H.K. Pillai, “ A parametrization for dissipative behaviors”, Systems
and control letters, 51(2004), pp 123-132.
[61] I. Pendharkar and H.K. Pillai “On dissipative SISO systems: a behavioral approach”,
Proceedings of 42nd IEEE Conference on Decision and Control, 2002, Hawaii, USA, pp 1616-1620.
[62] I. Pendharkar, H.K. Pillai, “The Kalman-Yakubovich lemma in the behavioral setting”,
to appear in International Journal on Control.
[63] I. Pendharkar and H.K. Pillai, “A parametrization for behaviors with a non-negative storage function”, Proceedings of twenty seventh National Systems Conference, 2003, Kharagpur, India, pp 59-63.
[64] I. Pendharkar and H.K. Pillai, “Kalman-Yakubovich lemma in the behavioral setting”, in
Proc IFAC symposium on large scale systems, 2004, Osaka, Japan.
[65] I. Pendharkar, H.K. Pillai, “On a theory for nonlinear behaviors”, Proc 16th International
Symposium on Mathematical Theory of Networks and Systems (MTNS 2004).
[66] I. Pendharkar and H.K. Pillai, “A behavioral approach to Popov-like stability criteria”
Proceedings of National Systems Conference, Vellore , 2004.
[67] I. Pendharkar and H.K. Pillai, “A rootlocus based approach for stabilization of nonlinear
systems” Proceedings of National Conference on Control and Dynamical systems, 2005.
[68] I. Pendharkar and H.K. Pillai, “On stability of systems with monotone nonlinearities”
Proceedings of National Conference on Control and Dynamical systems, 2005.
[69] I. Pendharkar and H.K. Pillai, “Guaranteed closed loop stability with sensor uncertainties”
Proceedings of International Conference on Instrumentation , December, 2004.
[70] I. Pendharkar, H.K. Pillai and P. Rapisarda, “A behavioral view of Nevanlinna-Pick interpolation”, 44th IEEE conference on decision and control (CDC), 2005, accepted.
[71] P. Penfield, “ Passivity Conditions”, IEEE Transactions on Circuit Theory, 12(1965), pp
446-448.
[72] H.K. Pillai and E. Rogers, “On quadratic differential forms for n-D systems”, Proceedings
of 39th IEEE Conference on Decision and Control (CDC), Sydney, 2000, pp 5010-5013.
[73] V.A. Pliss, “Necessary and sufficient conditions for the global stability of a certain system
of three differential equations”, Dokl. Akad. Nauk SSSR, 120 (1958), pp 4.
[74] J.W. Polderman, “Proper elimination of latent variables”, Systems and control letters,
32(1997) 261-269.
[75] J.W. Polderman, J.C. Willems, “Introduction to mathematical systems theory: A behavioral approach” Springer-Verlag, 1997.
[76] V.M. Popov, “Absolute stability of nonlinear systems of automatic control”, Avtomatika
i Telemekhanika, 22 (1961), pp 961-979. For an English translation, see A.G.J. McFarlane
Ed “Frequency response methods in control systems”, IEEE press, 1979.
[77] V.M. Popov, A. Halanay, “On the stability of nonlinear automatic control systems with
lagging argument”, Avtomatika i Telemekhanika, 23 (1962), pp 849-851. For an English
translation, see A.G.J. McFarlane Ed “Frequency response methods in control systems”,
IEEE press, 1979.
[78] V.M. Popov, “Hyperstability of Control Systems”, Springer-Verlag, 1973.
[79] S. Purkayastha and A. Mahalanabis, “An extended MKY lemma and its application”
IEEE Transactions on Automatic Control, 16(1971) 366-367.
[80] A.C.M. Ran, L. Rodman, “Factorization of Matrix Polynomials with Symmetries”, SIAM Journal on Matrix Analysis and Applications, 15 (1994), pp 845-864. Also Preprint 993, Institute for Mathematics and its Applications, July 1992. http://ima.umn.edu/preprints/July92/0993.ps (as in August 2005).
[81] A. Rantzer, “On the Kalman-Yakubovich-Popov lemma” Systems and Control Letters, 28
(1996), pp 7-10.
[82] P. Rapisarda, Linear differential systems, Ph.D thesis, University of Groningen, The
Netherlands, 1998.
[83] P. Rapisarda and J.C. Willems, “State maps for linear systems”, SIAM Journal of Control
and Optimization, 35 (1997), pp 1053-1091.
[84] P. Rapisarda and J.C. Willems, “The subspace Nevanlinna interpolation problem and the
most powerful unfalsified model”, Systems and Control Letters, 32 (1997), pp. 291-300.
[85] G.L. Sicuranza, “Quadratic filters for signal processing”, Proceedings of IEEE, 80(1992),
pp 1263-1285.
[86] D.D. Siljak, “New Algebraic Criteria for Positive Realness” Journal of The Franklin
Institute, 291 (1971), pp 109-120.
[87] V. Singh, “A note on the extended MKY lemma”, IEEE Transactions on Automatic
Control, 27 (1982) pp 1264.
[88] V. Singh, “Absolute stability criterion for a class of nonlinear systems with slope restricted
nonlinearity”, Proceedings of the IEEE, 70 (1982), pp 1232-1233.
[89] V. Singh, “An extended MKY lemma– what shall it be?”, IEEE Transactions on Automatic Control , 28 (1983), pp 627-628.
[90] J.E. Slotine and W. Li, “Applied Nonlinear Control” Prentice Hall, 1991.
[91] P.R. Smith, “Bilinear interpolation of digital images”, Ultramicroscopy, 6(1981), pp 201-204.
[92] R.L. Smith, “Some interlacing properties of the schur complement of a hermitian matrix”,
Linear algebra and its applications, 177 (1992) pp 137-144.
[93] D.C. Sorensen, “Passivity preserving model reduction via interpolation of spectral zeros”,
Systems and Control Letters, 54(2005), pp 347-360.
[94] V.R. Sule, “State space approach to behavioral systems theory: the Dirac-Bergmann
algorithm” Systems and Control letters, 50 (2003), pp 149-162.
[95] H.L. Trentelman and J.C. Willems, “Every storage function is a state function” Systems
and Control letters, 32 (1997), pp 249-259.
[96] H.L. Trentelman and J.C. Willems, “Synthesis of Dissipative Systems Using Quadratic
Differential Forms, Parts I and II”, IEEE Transactions on Automatic Control, 47 (2002),
pp 53–69 and 70–86.
[97] H.L. Trentelman and P. Rapisarda, “New algorithms for polynomial J-spectral factorization”, Mathematics of Control, Signals and Systems, 12 (1999), pp 24-61.
[98] J.C. Willems, “Dissipative dynamical systems, Part 1: General theory” Archives for
Rational Mechanics and Analysis, 45 (1972), pp 321-351.
[99] J.C. Willems, “Dissipative dynamical systems, Part 2: Linear systems with quadratic
supply rates” Archives for Rational Mechanics and Analysis, 45 (1972), pp 352-393.
[100] J.C. Willems, “From time series to linear system, part II: Exact modeling”, Automatica,
22 (1986), pp 675-694.
[101] J.C. Willems, “Paradigms and puzzles in the theory of dynamical systems”, IEEE Transactions on Automatic Control, 36 (1991), pp 259-294.
[102] J.C. Willems, “On Interconnections, Control and Feedback”, IEEE Transactions on
Automatic Control , 42 (1997), pp 326-339.
[103] J.C. Willems and H.L. Trentelman, “On Quadratic Differential Forms”, SIAM
Journal of Control and Optimization, 36 (1998), pp 1703-1749.
[104] J.C. Willems and H.L. Trentelman, “H∞ -control in a behavioral context: The full information case”, IEEE Transactions on Automatic Control, 44 (1999), pp 521-536.
[105] J.L. Willems, “Stability theory of dynamical systems”, Nelson, 1970.
[106] R. Van Der Geest and H.L. Trentelman, “The KYP lemma in a behavioral framework”
Systems and Control letters, 32 (1997) , pp 283-290.
[107] M. Vidyasagar, “Nonlinear Systems Analysis, 2nd ed.” Prentice Hall, 1993.
[108] A.A. Voronov, “Basic principles of automatic control theory– special linear and nonlinear
systems”, Mir Publishers, Moscow, 1985.
[109] V.A. Yakubovich, “Solutions of some matrix inequalities occurring in the theory of
automatic control” Dokl. Akad. Nauk SSSR, 143 (1962), pp 1304-1367 (in Russian).
[110] D.C. Youla and M. Saito, “Interpolation with positive real functions”, Journal of the
Franklin Institute, 284 (1967) pp 77-108.
[111] G. Zames, P.L. Falb, “Stability conditions for systems with monotone and slope restricted
nonlinearities”, SIAM Journal on Control, 6 (1968), pp 89-108.
Appendix A
Notation
R : The set of real numbers.
C : The set of complex numbers.
W^T : The set of maps from T → W.
teletype fonts w, q, etc. : The number of components of vectors w, q respectively.
C∞(R, R^w) : The space of infinitely many times differentiable functions from R to R^w.
D(R, R^w) : The space of compactly supported C∞ functions from R to R^w.
L^loc_1(R, R^w) : The space of locally integrable functions from R to R^w.
det : Determinant of a matrix.
deg : Degree of a one variable polynomial matrix.
Im : Image of a linear map.
Ker : Kernel of a linear map.
Re(λ) : Real part of λ.
λ̄ : Complex conjugate of λ.
L^w : The set of linear differential systems with “w” variables.
L_Φ : Bilinear differential form defined by Φ(ζ, η).
Q_Φ : Quadratic differential form defined by Φ(ζ, η).
L^w_con : The set of controllable linear differentiable systems with “w” variables.
L^Φ : The set of all controllable Φ-dissipative linear differentiable systems.
B : A behavior.
m(B) : Input cardinality of B.
p(B) : Output cardinality of B.
n(B) : McMillan degree of B.
R[ξ] : The ring of polynomials in ξ over R.
R[ζ, η] : The ring of polynomials in ζ, η over R.
R^{n1×n2}[ξ] : The set of all n1 × n2 polynomial matrices in ξ.
R^{•×n}[ξ] : The set of polynomial matrices in ξ with n columns.
R^{q1×q2}[ζ, η] : The set of q1 × q2 matrices over R[ζ, η].
R^{w×w}_s[ζ, η] : The set of w × w symmetric matrices over R[ζ, η].
A^T : The transpose of a constant matrix A.
A^* : The hermitian transpose of a constant matrix A.
R∼ : The transpose of a polynomial matrix: R(ξ)∼ := R^T(−ξ).
A ≥ 0 : The quadratic form induced by A is positive semidefinite.
Π(iω) ≥ 0 : The hermitian form induced by Π(iω) is positive semidefinite at almost all ω ∈ R.
A > 0 : The quadratic form induced by A is positive definite.
Π(iω) > 0 : The hermitian form induced by Π(iω) is positive definite at almost all ω ∈ R.
σ(A) : Inertia of the hermitian matrix A.
σ+(A) : Number of positive eigenvalues of A.
σ−(A) : Number of negative eigenvalues of A.
σ0(A) : Number of zero eigenvalues of A.
σworst(Z) : The “worst inertia” of Z(ξ).
I_w : The w × w identity matrix.
J_mn : The inertia matrix [I_m 0; 0 −I_n].
J_mn : J_mn − I_{(m+n)×(m+n)}.
J : The 2m × 2m matrix (1/2)[0 I_m; I_m 0].
J_worst : The “worst inertia matrix”.
X(d/dt) : The state map associated with a behavior.
X(d/dt) : The matrix defining interconnection constraints.
N : A nonlinearity.
N_Θ : The positive cone associated with Q_Θ.
F_N : The family of all nonlinearities that are positive along Q_Θ.
F_{0K} : The family of sector bound nonlinearities.
F_mon : The family of monotone nonlinearities.
B_N : The nonlinear behavior.
P_full : The full plant behavior.
P : The manifest plant behavior.
K : The controlled behavior.
C : The controller behavior.
N : The hidden behavior.
Index
absolute stability, 75
Aizerman’s conjecture, 75
available storage, 57
Axiom of state, 24
controlled behavior, 119
describing function, 92
dissipation equality, 57
dissipation function, 56
dissipative systems, 32
and analyticity, 37
characterization using Nyquist plot, 37
MIMO, 43
SISO, 36
storage functions for, 56
synthesis of, 117
uncontrollable, 33
dualization, 96, 144
system theoretic implications of, 145
dynamical systems, 11
autonomous, 23
stable, 24
dissipative, 31
linear, 12
linear differential, 12
models for, 94
time-invariant, 12
BDF, 5
modeling with, 129
zero along a behavior, 133
behavior, 11
autonomous, 23
controllable, 20
controlled, 119
implementable, 119
controller, 118
dissipative, 32
full, 15
full plant, 118
hidden, 119
implementable, 119
manifest, 118
McMillan degree, 26
memoryless part of, 26
most powerful unfalsified, see MPUM
state space representation, 24
view of modeling, 94
bilinear difference form, 140
Bilinear differential form, see BDF
Bilinear form, 4
Bivariate polynomial matrix, 5
symmetric, 6
bounded real rational function, 62
effective size, 131
Elimination theorem, 16
equivalent supply functions, 35
characterization of, 45
exact identification, 129
Factorization ala Ran, Rodman, 47
free variables, 27
function spaces, 13
Future work, 153
Circle criterion, 85
control
as interconnection, 80, 118
controller, 118
control variables, 118
controllability, 20
GNU-GPL, 109
H∞ problem, 117
and dissipativity, 32
characterization of all solutions of, 126
formulation, 119, 120
hidden behavior, 119
hybrid representation, 15
image representation, 20
implementable behavior, 119
inertia, 5
worst, 47
input cardinality, 28, 119
inputs, 27
interconnection
view of control, 118
interpolation
for computing storage functions, 135
Lagrange, 132
Nevanlinna-Pick, 141
standard case, 142
with frequency dependent norm, 147
with BDFs
notion of complexity, 139
with bilinear difference form, 140
with bivariate polynomials, 135
with two variable polynomial matrices,
131
invariants of a behavior, 28
kernel representation, 12
KYP lemma
as an LMI, 55
for SISO systems, 69
generalization of, 61
storage functions and, 60
strict versions of, 67
L∞ norm, 44
latent variable, 15
representation using, 15
latent variable representation, 15
loop transformation, 86
lossless system, 46, 66, 125
Lyapunov function, 56, 76
computing for linear systems, 135
manifest plant behavior, 118
manifest variable, 15
McMillan degree, 26, 28, 57, 79
memoryless nonlinearity, 81
memoryless part of a behavior, 26
Meyer-Kalman-Yakubovich lemma, 56
minimum required supply, 57
Minor, 19
model
in the generative sense, 142
Modeling with BDFs, 129
models for dynamical systems, 94
MPUM, 95, 142
computing a representation for, 95
for dualized data, 96
negative feedback, 81
nonlinearity, 80
memoryless, 81
sector bound, 85, 86
slope restricted, 89
stabilizing controllers for, 82
with memory, 90
observability, 21
of a QDF, 8
output cardinality, 28
outputs, 27
para-Hermitian matrix, 7, 93
J-spectral factorization of, 93
computing singularities of, 108
minimal factorizations of, 47
split sums for, 50
worst inertia of, 47
passive system, 31, 61
Pick matrix, 96, 144
polynomial J-spectral factorization, 45, 93
algorithm for, 105
computer implementation, 109
numerical aspects of the algorithm, 106
Popov, 75
stability criterion, 86
positive real rational function, 37, 71
strictly, 71
preview of the thesis, 2
principal block minor, 97
proper rational function, 27
QDF, 6
acting on L^loc_1 functions, 79
observable, 8
sign definite, 8
symmetric canonical factorization, 8, 106
Quadratic differential form, see QDF
quadratic form, 4
diagonalization, 4
Rank, 19
Rational J-spectral factorization, 41
regular polynomial matrix, 107
Relay with hysteresis, 90
Representation, 11
Σ-unitary, 94
equivalent, 18
image, 20
kernel, 12
latent variable, 15
observable image, 21
state, 24
right coprime matrices, 65
Schur complement, 35, 122
Scilab, 109
semi-simple, 96
signature, 5
slope restricted nonlinearities, 89
spectral factorization, 58
split sum, 50
parametrization using, 50
stability
global asymptotic, 80
of equilibria, 80
of linear systems, 24
state, 24
axiom of, 25
McMillan degree, 26
storage functions on, 57, 76
state map, 26, 58, 62, 77
storage functions, 56
a procedure to compute, 58
and KYP lemma, 60
maximum and minimum, 59
positive definite, 66, 68, 72
state functions, 57, 76
strictly positive real rational function, 71
strictly proper rational function, 27
strong solution, 14
Subspace Nevanlinna Interpolation Problem
(SNIP), 143
summary of results, 151
supply function
constant inertia, 44
constant matrices, 43
equivalent, 35
variable inertia, 46
Symmetric canonical factorization, 8
strictly proper, 99
Synthesis of dissipative systems, 117
Thermodynamics and dissipativity, 32
to-be-controlled variables, 118
unitary matrix, 48, 96
weak solution, 14
weighted degree, 131
worst inertia, 47
how to compute, 47