EOLSS Contribution 6.43.13.4

Frequency domain representation and singular value decomposition

A.C. Antoulas
Department of Electrical and Computer Engineering
Rice University, Houston, Texas 77251-1892, USA
e-mail: aca@rice.edu - fax: +1-713-348-5686 - URL: http://www.ece.rice.edu/~aca

June 12, 2001

Abstract

This contribution reviews the external and the internal representations of linear time-invariant systems. This is done both in the time and the frequency domains. The realization problem is then discussed. Given the importance of norms in robust control and model reduction, the final part of this contribution is dedicated to the definition and computation of various norms. Again, the interplay between time and frequency norms is emphasized.

Key words: linear systems, internal representation, external representation, Laplace transform, Z-transform, vector norms, matrix norms, Singular Value Decomposition, convolution operator, Hankel operator, reachability and observability gramians.

This work was supported in part by the NSF through Grants DMS-9972591 and CCR-9988393.

Contents

1 Introduction
2 Preliminaries
  2.1 Norms of vectors, matrices and the SVD
    2.1.1 Norms of finite-dimensional vectors and matrices
    2.1.2 The singular value decomposition
    2.1.3 The Lebesgue spaces $\ell_p$ and $L_p$
    2.1.4 The Hardy spaces $h_p$ and $H_p$
    2.1.5 The Hilbert spaces $\ell_2$ and $L_2$
  2.2 The Laplace transform and the Z-transform
    2.2.1 Some properties of the Laplace transform
    2.2.2 Some properties of the Z-transform
3 The external and the internal representation of linear systems
  3.1 External representation
  3.2 Internal representation
    3.2.1 Solution in the time domain
    3.2.2 Solution in the frequency domain
    3.2.3 The concepts of reachability and observability
    3.2.4 The infinite gramians
  3.3 The realization problem
    3.3.1 The solution of the realization problem
    3.3.2 Realization of proper rational matrix functions
    3.3.3 The partial realization problem
4 Time and frequency domain interpretation of various norms
  4.1 The convolution operator and the Hankel operator
  4.2 Computation of the singular values of S
  4.3 Computation of the singular values of H
  4.4 Computation of various norms
    4.4.1 The $H_2$ norm
    4.4.2 The $H_\infty$ norm
    4.4.3 The Hilbert-Schmidt norm
    4.4.4 Summary of norms
Appendix: Glossary

List of Tables

1 Basic Laplace transform properties
2 Basic Z-transform properties
3 I/O and I/S/O representation of continuous-time linear systems
4 I/O and I/S/O representation of discrete-time linear systems
5 Norms of linear systems and their relationships

1 Introduction

One of the most powerful tools in the analysis and synthesis of linear time-invariant systems is the equivalence between the time domain and the frequency domain. Additional insight into problems in this area is thus obtained by viewing them both in time and in frequency. This dual nature accounts for the presence and great success of linear systems both in engineering theory and in applications.

In this contribution we provide an overview of certain results concerning the analysis of linear dynamical systems. The time and frequency domain frameworks are inextricably connected; together with the frequency domain considerations in the sequel, a good deal of time domain material is therefore unavoidably included as well.

Our goals are as follows. First, basic system representations are introduced, both in time and in frequency. Then the ensuing realization problem is formulated and solved. Roughly speaking, the realization problem entails the construction of a state space model from frequency response data. The second goal is to introduce various norms for linear systems. This is of great importance both in robust control and in system approximation/model reduction. For details see e.g. [14, 31, 7, 24, 6, 4]. First it is shown that, besides the convolution operator, we need to attach a second operator to every linear system, namely the Hankel operator. The main attribute of this operator is that it has a discrete set of singular values, known as the Hankel singular values. These singular values are main ingredients of numerous computations involving robust control and model reduction of linear systems. Besides the Hankel norm, we discuss various p-norms, where $p = 1, 2, \infty$. It turns out that the norms obtained for $p = 2$ have both a time domain and a frequency domain interpretation; the rest have an interpretation in the time domain only.

The contribution is organized as follows. The next section is dedicated to a collection of useful results on two topics: norms and the SVD on the one hand, and the Laplace and discrete-Laplace transforms on the other.
Tables 1 and 2 summarize the salient properties of these two transforms. Section 3 develops the external and internal representations of linear systems. This is done both in the time and frequency domains, with the results summarized in two further tables, 3 and 4. This discussion is followed by the formulation and solution of the realization problem. The final section 4 is dedicated to the introduction of various norms for linear systems. The basic features of these norms are summarized in the fifth and last table, 5.

2 Preliminaries

2.1 Norms of vectors, matrices and the SVD

In this section we first review some material from linear algebra pertaining to norms of vectors and norms of operators (matrices), both in finite and in infinite dimensions. The latter are of importance because a linear system can be viewed as a map between infinite-dimensional spaces. The Singular Value Decomposition (SVD) will also be introduced and its properties briefly discussed. Textbooks pertaining to the material discussed in this section are [16, 18, 19, 21, 27].

2.1.1 Norms of finite-dimensional vectors and matrices

Let $X$ be a linear space over the field $K$, which is either the field of reals $\mathbb{R}$ or that of complex numbers $\mathbb{C}$. A norm on $X$ is a function $\nu: X \to \mathbb{R}$ such that the following three properties are satisfied. Strict positiveness: $\nu(x) \ge 0$ for all $x \in X$, with equality if and only if $x = 0$; triangle inequality: $\nu(x+y) \le \nu(x) + \nu(y)$ for all $x, y \in X$; positive homogeneity: $\nu(\alpha x) = |\alpha|\,\nu(x)$ for all $\alpha \in K$, $x \in X$.

For vectors $x \in \mathbb{R}^n$ or $x \in \mathbb{C}^n$ the Hölder or p-norms are defined as follows:

$$\|x\|_p := \begin{cases} \left(\sum_{i \in \mathbf{n}} |x_i|^p\right)^{1/p}, & 1 \le p < \infty \\ \max_{i \in \mathbf{n}} |x_i|, & p = \infty \end{cases} \qquad x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

where $\mathbf{n} := \{1, 2, \ldots, n\}$, $n \in \mathbb{N}$. The 2-norm satisfies the Cauchy-Schwarz inequality

$$|x^* y| \le \|x\|_2\, \|y\|_2 \qquad (2.1)$$

with equality holding iff $y = \alpha x$, $\alpha \in K$. An important property of the 2-norm is that it is invariant under unitary (orthogonal) transformations. Let $U$ be $n \times n$ with $U^*U = I_n$; it follows that $\|Ux\|_2^2 = x^* U^* U x = x^* x = \|x\|_2^2$. The following relationship between the Hölder norms for $p = 1, 2, \infty$ holds:

$$\|x\|_\infty \le \|x\|_2 \le \|x\|_1$$

[Figure 1: $A$ maps the unit sphere into an ellipsoid; the singular values are the lengths of the semi-axes of the ellipsoid.]

One type of matrix norms are those induced by the vector p-norms defined above. More precisely, for $A \in \mathbb{C}^{m \times n}$,

$$\|A\|_{p,q} := \sup_{x \neq 0} \frac{\|Ax\|_q}{\|x\|_p}$$

is the induced $p,q$-norm of $A$. In particular, for $p = q = 1, 2, \infty$ the following expressions hold:

$$\|A\|_1 = \max_{j} \sum_{i} |A_{ij}|, \qquad \|A\|_\infty = \max_{i} \sum_{j} |A_{ij}|, \qquad \|A\|_2 = \left[\lambda_{\max}(AA^*)\right]^{1/2} \qquad (2.2)$$

Besides the induced matrix norms there exist other norms. One such class is the Schatten p-norms of matrices; these non-induced norms are unitarily invariant. Let $\sigma_i(A)$, $1 \le i \le \min(m, n)$, be the singular values of $A$, i.e. the square roots of the eigenvalues of $AA^*$. Then

$$\|A\|_p := \left(\sum_i \sigma_i(A)^p\right)^{1/p}, \quad 1 \le p < \infty \qquad (2.3)$$

It follows that the Schatten norm for $p = \infty$ is $\|A\|_\infty = \sigma_{\max}(A)$, which is the same as the 2-induced norm of $A$. For $p = 1$ we obtain the trace norm

$$\|A\|_1 = \sum_i \sigma_i(A)$$

For $p = 2$ the resulting norm is also known as the Frobenius norm, the Schatten 2-norm, or the Hilbert-Schmidt norm of $A$:

$$\|A\|_F = \left(\sum_i \sigma_i(A)^2\right)^{1/2} = \left(\mathrm{trace}\,(A^*A)\right)^{1/2} = \left(\mathrm{trace}\,(AA^*)\right)^{1/2} \qquad (2.4)$$

where $\mathrm{trace}\,(\cdot)$ denotes the trace of a matrix.
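The norm relations above are easy to check numerically. The following MATLAB sketch (all variable names are illustrative, not part of the original text) verifies the ordering of the Hölder norms and the induced-norm expressions (2.2) on a small example:

```matlab
% Check the ordering of the Holder norms and the induced norms (2.2).
x = [3; -4; 1];
assert(norm(x, Inf) <= norm(x, 2) && norm(x, 2) <= norm(x, 1))

A = [1 -2; 3 0];
% 1-induced norm: largest absolute column sum
assert(abs(norm(A, 1)   - max(sum(abs(A), 1))) < 1e-12)
% inf-induced norm: largest absolute row sum
assert(abs(norm(A, Inf) - max(sum(abs(A), 2))) < 1e-12)
% 2-induced norm: square root of the largest eigenvalue of A'*A
assert(abs(norm(A, 2)   - sqrt(max(eig(A'*A)))) < 1e-12)
```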
2.1.2 The singular value decomposition

Given a matrix $A \in K^{n \times m}$, $n \le m$, let the nonnegative numbers $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$ be the positive square roots of the eigenvalues of $AA^*$. There exist unitary matrices $U \in K^{n \times n}$, $UU^* = I_n$, and $V \in K^{m \times m}$, $VV^* = I_m$, such that

$$A = U \Sigma V^*, \quad \text{where } \Sigma = (\Sigma_1 \ \ 0) \in \mathbb{R}^{n \times m} \text{ and } \Sigma_1 := \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{pmatrix} \in \mathbb{R}^{n \times n} \qquad (2.5)$$

The decomposition (2.5) is called the singular value decomposition (SVD) of the matrix $A$; the $\sigma_i$'s are called the singular values of $A$, while the columns of $U$ and $V$,

$$U = (u_1\ u_2\ \cdots\ u_n), \qquad V = (v_1\ v_2\ \cdots\ v_m)$$

are called the left and right singular vectors of $A$, respectively. These singular vectors are the eigenvectors of $AA^*$ and $A^*A$, respectively. Thus

$$A v_i = \sigma_i u_i, \quad i = 1, \ldots, n$$

Example 2.1 Consider the matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & \sqrt{2} \end{pmatrix} \quad \Rightarrow \quad AA^* = \begin{pmatrix} 2 & \sqrt{2} \\ \sqrt{2} & 2 \end{pmatrix}, \quad A^*A = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}$$

The eigenvalue decompositions of these matrices are $AA^* = U\Sigma^2 U^*$, $A^*A = V\Sigma^2 V^*$, where

$$U = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix}, \quad \sigma_1 = \sqrt{2+\sqrt{2}},\ \sigma_2 = \sqrt{2-\sqrt{2}}, \quad V = \begin{pmatrix} \frac{1}{\sqrt{2(2+\sqrt{2})}} & \frac{1}{\sqrt{2(2-\sqrt{2})}} \\[1mm] \frac{1+\sqrt{2}}{\sqrt{2(2+\sqrt{2})}} & \frac{1-\sqrt{2}}{\sqrt{2(2-\sqrt{2})}} \end{pmatrix}$$

Notice that $A$ maps the unit disc in the plane to the ellipse with semi-axes $\sigma_1$ and $\sigma_2$; more precisely $v_1 \mapsto \sigma_1 u_1$ and $v_2 \mapsto \sigma_2 u_2$ (see figure 1). It follows that $X = \sigma_2 u_2 v_2^*$ is a perturbation of smallest 2-norm (equal to $\sigma_2$) such that $A - X$ is singular:

$$X = \frac{1}{2}\begin{pmatrix} 1 & 1-\sqrt{2} \\ -1 & \sqrt{2}-1 \end{pmatrix} \quad \Rightarrow \quad A - X = \frac{1}{2}\begin{pmatrix} 1 & 1+\sqrt{2} \\ 1 & 1+\sqrt{2} \end{pmatrix}$$

The singular values of $A$ are unique. The left and right singular vectors corresponding to singular values of multiplicity one are also uniquely determined (up to a sign). Thus the SVD is unique in case the matrix $A$ is square and the singular values have multiplicity one.

Lemma 2.1 The 2-induced norm of $A$ is equal to its largest singular value: $\sigma_1 = \|A\|_{2-\mathrm{ind}}$.

Proof. By definition

$$\|A\|_2^2 = \sup_{x \neq 0} \frac{\|Ax\|_2^2}{\|x\|_2^2} = \sup_{x \neq 0} \frac{x^* A^* A x}{x^* x}$$

Let $y := V^* x$, where $V$ is the matrix containing the eigenvectors of $A^*A$, i.e. $A^*A = V\Sigma^2 V^*$. Substituting in the above expression we obtain

$$\frac{x^* A^* A x}{x^* x} = \frac{\sigma_1^2 y_1^2 + \cdots + \sigma_n^2 y_n^2}{y_1^2 + \cdots + y_n^2}$$

This expression is maximized, and equals $\sigma_1^2$, for $y = e_1$, i.e. $x = v_1$, where $v_1$ is the first column of $V$.

Theorem 2.1 Every matrix $A$ with entries in $K$ has a singular value decomposition.

Proof. The proof is based on the lemma above. Let $\sigma_1$ be the 2-norm of $A$; there exist unit-length vectors $x_1 \in K^m$, $x_1^* x_1 = 1$, and $y_1 \in K^n$, $y_1^* y_1 = 1$, such that $Ax_1 = \sigma_1 y_1$. Define unitary matrices $V_1$, $U_1$ whose first columns are $x_1$, $y_1$, respectively: $V_1 := [x_1\ \tilde{V}_1]$, $U_1 := [y_1\ \tilde{U}_1]$. It follows that

$$U_1^* A V_1 = \begin{pmatrix} \sigma_1 & w^* \\ 0 & B \end{pmatrix} =: A_1, \quad w \in K^{m-1}$$

and consequently

$$U_1^* A A^* U_1 = A_1 A_1^* = \begin{pmatrix} \sigma_1^2 + w^*w & w^*B^* \\ Bw & BB^* \end{pmatrix}$$

Since the 2-norm of every matrix is bigger than or equal to the norm of any of its submatrices, we conclude that

$$\sigma_1^2 + w^*w \le \|A_1 A_1^*\|_2 = \sigma_1^2$$

The implication is that $w$ must be the zero vector, $w = 0$. Thus

$$U_1^* A V_1 = \begin{pmatrix} \sigma_1 & 0 \\ 0 & B \end{pmatrix}$$

The procedure is now repeated for $B$, which has size $(n-1) \times (m-1)$.

Assume that in (2.5) $\sigma_r > 0$ while $\sigma_{r+1} = \cdots = \sigma_n = 0$; partition the matrices $U$, $\Sigma$, $V$ in two blocks, the first having $r$ columns:

$$U = [U_1\ U_2], \quad V = [V_1\ V_2], \quad \Sigma = \begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix}, \quad \Sigma_1 = \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_r \end{pmatrix} > 0, \ \ \Sigma_2 = 0 \in \mathbb{R}^{(n-r) \times (n-r)} \qquad (2.6)$$
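Example 2.1 and Lemma 2.1 can be verified numerically; a minimal MATLAB sketch using the matrix of the example:

```matlab
% SVD of the matrix of Example 2.1 and the rank-one perturbation X.
A = [1 1; 0 sqrt(2)];
[U, S, V] = svd(A);
sigma = diag(S);
% singular values: sqrt(2 + sqrt(2)) and sqrt(2 - sqrt(2))
assert(norm(sigma - [sqrt(2+sqrt(2)); sqrt(2-sqrt(2))]) < 1e-12)
% Lemma 2.1: the 2-induced norm of A equals sigma_1
assert(abs(norm(A, 2) - sigma(1)) < 1e-12)
% smallest 2-norm perturbation making A singular: X = sigma_2 * u2 * v2'
X = sigma(2) * U(:,2) * V(:,2)';
assert(abs(det(A - X)) < 1e-12)              % A - X is singular
assert(abs(norm(X, 2) - sigma(2)) < 1e-12)
```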
Corollary 2.1 Given (2.5) and (2.6), the following statements hold:
- rank $A = r$;
- span col $A$ = span col $U_1$;
- ker $A$ = span col $V_2$;
- Dyadic decomposition: $A$ has a decomposition as a sum of $r$ matrices of rank one:
$$A = \sigma_1 u_1 v_1^* + \sigma_2 u_2 v_2^* + \cdots + \sigma_r u_r v_r^* \qquad (2.7)$$
- The orthogonal projection onto the span of the columns of $A$ is $U_1 U_1^*$.
- The orthogonal projection onto the kernel of $A$ is $V_2 V_2^*$.
- The orthogonal projection onto the orthogonal complement of the span of the columns of $A$ is $U_2 U_2^*$.
- The orthogonal projection onto the orthogonal complement of the kernel of $A$ is $V_1 V_1^*$.
- The Frobenius norm of $A$ is $\|A\|_F = \sqrt{\sigma_1^2 + \cdots + \sigma_n^2}$.

For symmetric matrices the SVD can be readily obtained from the EVD (Eigenvalue Decomposition). Let the latter be $A = V\Lambda V^*$. Define $S := \mathrm{diag}(\mathrm{sgn}\,\lambda_1, \ldots, \mathrm{sgn}\,\lambda_n)$, where sgn is the signum function; it equals $+1$ if $\lambda > 0$, $-1$ if $\lambda < 0$ and $0$ if $\lambda = 0$. Then $A = U\Sigma V^*$, where $U := VS$ and $\Sigma := \mathrm{diag}(|\lambda_1|, \ldots, |\lambda_n|)$.

2.1.3 The Lebesgue spaces $\ell_p$ and $L_p$

In this section we define the p-norms of infinite sequences and of functions. These are functions of one real variable, which in the context of system theory is taken to be time; consequently, these are time-domain spaces and norms.

Let $\ell^n(I) := \{f: I \to K^n\}$, $I \subseteq \mathbb{Z}$, denote the set of sequences of vectors in $K^n$, where $K$ is either $\mathbb{R}$ or $\mathbb{C}$. Frequent choices of $I$ are $I = \mathbb{Z}$, $I = \mathbb{Z}_+$ or $I = \mathbb{Z}_-$. The p-norms of the elements of this space are defined as

$$\|f\|_p := \begin{cases} \left(\sum_{t \in I} \|f(t)\|_p^p\right)^{1/p}, & 1 \le p < \infty \\ \sup_{t \in I} \|f(t)\|_p, & p = \infty \end{cases} \qquad f \in \ell^n(I) \qquad (2.8)$$

The corresponding $\ell_p$ spaces are

$$\ell_p^n(I) := \{f \in \ell^n(I): \|f\|_p < \infty\}, \quad 1 \le p \le \infty$$

For functions of a continuous variable, let $L^n(I) := \{f: I \to K^n\}$, $I \subseteq \mathbb{R}$; frequent choices of $I$ are $I = \mathbb{R}$, $I = \mathbb{R}_+$ or $I = \mathbb{R}_-$. The p-norms are

$$\|f\|_p := \begin{cases} \left(\int_I \|f(t)\|_p^p\, dt\right)^{1/p}, & 1 \le p < \infty \\ \sup_{t \in I} \|f(t)\|_p, & p = \infty \end{cases} \qquad f \in L^n(I) \qquad (2.9)$$

The corresponding $L_p$ spaces are

$$L_p^n(I) := \{f \in L^n(I): \|f\|_p < \infty\}, \quad 1 \le p \le \infty$$

2.1.4 The Hardy spaces $h_p$ and $H_p$

In this section we consider norms of functions of one complex variable. In the system theoretic context this variable is taken to be complex frequency, and the resulting spaces and norms are frequency-domain ones.

Let $D \subset \mathbb{C}$ denote the (open) unit disc, and let $F: \mathbb{C} \to \mathbb{C}^{q \times r}$ be a matrix-valued function, analytic in $D$. Its p-norm is defined as follows:

$$\|F\|_{h_p} := \sup_{|r|<1} \left(\frac{1}{2\pi}\int_0^{2\pi} \|F(re^{j\theta})\|_p^p\, d\theta\right)^{1/p}, \quad 1 \le p < \infty; \qquad \|F\|_{h_\infty} := \sup_{z \in D} \|F(z)\|_p, \quad p = \infty$$

We choose $\|F(z_0)\|_p$ to be the Schatten p-norm of $F$ evaluated at $z = z_0$; however, there are other possible choices. The resulting $h_p$ spaces are defined as follows:

$$h_p^{q \times r} := h_p^{q \times r}(D) := \{F \text{ as above with } \|F\|_{h_p} < \infty\}$$

The following special cases are worth noting:

$$\|F\|_{h_2} = \sup_{|r|<1}\left(\frac{1}{2\pi}\int_0^{2\pi} \mathrm{trace}\left[F^*(re^{j\theta})\,F(re^{j\theta})\right] d\theta\right)^{1/2} \qquad (2.10)$$

where $\mathrm{trace}\,(\cdot)$ denotes the trace and $(\cdot)^*$ denotes complex conjugation and transposition; furthermore

$$\|F\|_{h_\infty} = \sup_{z \in D} \sigma_{\max}(F(z)) \qquad (2.11)$$

Let $\mathbb{C}_- \subset \mathbb{C}$ denote the (open) left half of the complex plane: $s = x + jy \in \mathbb{C}_-$, $x < 0$. Consider $q \times r$ complex-valued functions $F$ as defined above which are analytic in $\mathbb{C}_-$. Then

$$\|F\|_{H_p} := \sup_{x<0}\left(\frac{1}{2\pi}\int_{-\infty}^{\infty} \|F(x+jy)\|_p^p\, dy\right)^{1/p}, \quad 1 \le p < \infty; \qquad \|F\|_{H_\infty} := \sup_{s \in \mathbb{C}_-} \|F(s)\|_p, \quad p = \infty$$

Again $\|F(s_0)\|_p$ is chosen to be the Schatten p-norm of $F$ evaluated at $s = s_0$.
The resulting $H_p$ spaces are defined analogously to the $h_p$ spaces:

$$H_p^{q \times r} := H_p^{q \times r}(\mathbb{C}_-) := \{F \text{ as above with } \|F\|_{H_p} < \infty\}$$

As before, the following special cases are worth noting:

$$\|F\|_{H_2} = \sup_{x<0}\left(\frac{1}{2\pi}\int_{-\infty}^{\infty} \mathrm{trace}\left[F^*(x+jy)\,F(x+jy)\right] dy\right)^{1/2} \qquad (2.12)$$

where $\mathrm{trace}\,(\cdot)$ denotes the trace and $(\cdot)^*$ denotes complex conjugation and transposition; furthermore

$$\|F\|_{H_\infty} = \sup_{s \in \mathbb{C}_-} \sigma_{\max}(F(s)) \qquad (2.13)$$

The suprema in the formulae above can be computed by means of the maximum modulus theorem, which states that a function $f$, continuous inside a domain $D \subset \mathbb{C}$ as well as on its boundary $\partial D$ and analytic inside $D$, attains its maximum on the boundary $\partial D$ of $D$. Thus (2.10), (2.11), (2.12), (2.13) become:

$$\|F\|_{h_2} := \left(\frac{1}{2\pi}\int_0^{2\pi} \mathrm{trace}\left[F^*(e^{j\theta})\,F(e^{j\theta})\right] d\theta\right)^{1/2} \qquad (2.14)$$

$$\|F\|_{h_\infty} := \sup_{\theta \in [0, 2\pi]} \sigma_{\max}(F(e^{j\theta})) \qquad (2.15)$$

$$\|F\|_{H_2} := \left(\frac{1}{2\pi}\int_{-\infty}^{\infty} \mathrm{trace}\left[F^*(jy)\,F(jy)\right] dy\right)^{1/2} \qquad (2.16)$$

$$\|F\|_{H_\infty} := \sup_{y \in \mathbb{R}} \sigma_{\max}(F(jy)) \qquad (2.17)$$

If $F$ has no poles on the unit circle or the $j\omega$-axis, but is not necessarily analytic in the corresponding domains, the $h_\infty$, $H_\infty$ norms are not defined. Instead the $\ell_\infty$, $L_\infty$ norms of $F$ are defined, respectively, as follows:

$$\|F\|_{\ell_\infty} := \sup_{\theta} \sigma_{\max}(F(e^{j\theta})), \qquad \|F\|_{L_\infty} := \sup_{y} \sigma_{\max}(F(jy))$$

where in the first expression the supremum is taken over $\theta \in [0, 2\pi]$, while in the second it is taken over $y \in (-\infty, \infty)$.

2.1.5 The Hilbert spaces $\ell_2$ and $L_2$

The spaces $\ell_2(I)$ and $L_2(I)$ are Hilbert spaces, that is, linear spaces where not only a norm but an inner product is defined as well. (The spaces $\ell_p(I)$ and $L_p(I)$, $p \neq 2$, do not share this property; they are Banach spaces. For details see [12, 18].) For $I = \mathbb{Z}$ and $I = \mathbb{R}$ respectively, the inner product is defined as follows:

$$\langle x, y \rangle_{\ell_2} := \sum_{t \in I} x^*(t)\, y(t) \qquad (2.18)$$

$$\langle x, y \rangle_{L_2} := \int_I x^*(t)\, y(t)\, dt \qquad (2.19)$$

where, as before, $(\cdot)^*$ denotes complex conjugation and transposition. For $I = \mathbb{Z}$ and $I = \mathbb{R}$ respectively, elements (vectors or matrices) with entries in $\ell_2(\mathbb{Z})$ and $L_2(\mathbb{R})$ have a transform defined as follows:

$$f \mapsto F(\xi) := \begin{cases} \sum_{t=-\infty}^{\infty} f(t)\, \xi^{-t} \\[1mm] \int_{-\infty}^{\infty} f(t)\, e^{-\xi t}\, dt \end{cases} \qquad (2.20)$$

It follows that if the domain of $f$ is discrete, $F(e^{j\theta}) =: \mathcal{F}(f)(\theta)$ is the Fourier transform of $f$ and belongs to $L_2[0, 2\pi]$; analogously, if the domain of $f$ is continuous, $F(j\omega) =: \mathcal{F}(f)(\omega)$ is the Fourier transform of $f$ and belongs to the space denoted by $L_2(j\mathbb{R})$ and defined as follows:

$$L_2(j\mathbb{R}) := \{F: \mathbb{C} \to \mathbb{C}^{p \times m} \text{ such that } (2.16) < \infty\}$$

Furthermore the following bijective correspondences hold:

$$\ell_2(\mathbb{Z}) = \ell_2(\mathbb{Z}_-) \oplus \ell_2(\mathbb{Z}_+) \ \xrightarrow{\ \mathcal{Z}\ } \ L_2[0, 2\pi] = h_2(D) \oplus h_2(D^c)$$

and

$$L_2(\mathbb{R}) = L_2(\mathbb{R}_-) \oplus L_2(\mathbb{R}_+) \ \xrightarrow{\ \mathcal{L}\ } \ L_2(j\mathbb{R}) = H_2(\mathbb{C}_-) \oplus H_2(\mathbb{C}_+)$$

For simplicity the above correspondences are shown for spaces containing scalars; they are, however, equally valid for the corresponding spaces containing matrices of arbitrary dimension.

There are two results connecting the spaces introduced above; we state only the continuous-time versions. The first has the names of Parseval, Plancherel and Paley-Wiener attached to it.

Proposition 2.1 The Fourier transform $\mathcal{F}$ is a Hilbert space isometric isomorphism between $L_2(\mathbb{R})$ and $L_2(j\mathbb{R})$. It maps $L_2(\mathbb{R}_+)$, $L_2(\mathbb{R}_-)$ onto $H_2(\mathbb{C}_+)$, $H_2(\mathbb{C}_-)$ respectively.

The second shows that the $L_\infty$ and $H_\infty$ norms can be viewed as induced norms.
Recall that if $(X, \mu)$ and $(Y, \nu)$ are two normed spaces with norms $\mu$, $\nu$ respectively, then, just as in the finite-dimensional case, the $\mu,\nu$-induced norm of an operator $T$ with domain $X$ and range $Y$ is:

$$\|T\|_{\mu,\nu} := \sup_{x \neq 0} \frac{\nu(Tx)}{\mu(x)} \qquad (2.21)$$

Proposition 2.2 Let $F \in L_\infty$; then $F \cdot L_2(j\mathbb{R}) \subset L_2(j\mathbb{R})$, and the $L_\infty$ norm can be viewed as an induced norm in the frequency domain space $L_2(j\mathbb{R})$:

$$\|F\|_{L_\infty} = \|F\|_{L_2-\mathrm{ind}} = \sup_{X \neq 0} \frac{\|FX\|_{L_2}}{\|X\|_{L_2}}$$

In this last expression, $X$ can be restricted to lie in $H_2$. Let $F \in H_\infty$; then $F \cdot H_2(\mathbb{C}_+) \subset H_2(\mathbb{C}_+)$, and the $H_\infty$ norm can be viewed as an induced norm both in the frequency domain space $H_2$ and in the time domain space $L_2$:

$$\|F\|_{H_\infty} = \|F\|_{H_2-\mathrm{ind}} = \sup_{X \neq 0} \frac{\|FX\|_{H_2}}{\|X\|_{H_2}} = \sup_{x \neq 0} \frac{\|Fx\|_{L_2}}{\|x\|_{L_2}} = \|F\|_{L_2-\mathrm{ind}}$$

2.2 The Laplace transform and the Z-transform

The logarithm can be considered an elementary transform. It assigns a real number to any positive real number; it was invented in the early seventeenth century, and its purpose was to convert the multiplication of multi-digit numbers into addition. In the case of linear, time-invariant systems the operation one wishes to simplify is the derivative with respect to time in the continuous-time case, or the shift in the discrete-time case. As a consequence, one also wishes to simplify the operation of convolution, both in discrete and in continuous time. Thus an operation is sought which transforms differentiation into simple multiplication in the transform domain. To achieve this, the transform needs to operate on functions of time; the resulting function is one of complex frequency. This establishes two equivalent ways of dealing with linear, time-invariant systems, namely in the time domain and in the frequency domain. In the next two sections we briefly review some basic properties of this transform, which is called the Laplace transform in continuous time and the discrete-Laplace or Z-transform in discrete time. For further details we refer to any introductory book on signals and systems, e.g. [9].

2.2.1 Some properties of the Laplace transform

Consider a function of time $f(t)$. The unilateral Laplace transform of $f$ is a function $F(s)$ of the complex variable $s = \sigma + j\omega$. The definition of $F$ is as follows:

$$f(t) \ \xrightarrow{\ \mathcal{L}\ } \ F(s) := \int_0^{\infty} f(t)\, e^{-st}\, dt \qquad (2.22)$$

Therefore the values of $f$ for negative time are ignored by this transform. Instead, in order to capture the influence of the past, initial conditions at time zero are required (see differentiation in time below).

Table 1: Basic Laplace transform properties

  Property                    Time signal                               L-transform
  Linearity                   $af_1(t) + bf_2(t)$                       $aF_1(s) + bF_2(s)$
  Shifting in the s-domain    $e^{s_0 t} f(t)$                          $F(s - s_0)$
  Time scaling                $f(at)$, $a > 0$                          $\frac{1}{a} F\!\left(\frac{s}{a}\right)$
  Convolution                 $f_1(t) * f_2(t)$, $f_1 = f_2 = 0$, $t<0$ $F_1(s)\, F_2(s)$
  Differentiation in time     $\frac{d}{dt} f(t)$                       $sF(s) - f(0^-)$
  Differentiation in freq.    $t f(t)$                                  $-\frac{d}{ds} F(s)$
  Integration in time         $\int_0^t f(\tau)\, d\tau$                $\frac{1}{s} F(s)$
  Impulse                     $\delta(t)$                               $1$
  Exponential                 $e^{at}\, 1(t)$                           $\frac{1}{s-a}$

  Initial value theorem: $f(0^+) = \lim_{s \to \infty} sF(s)$
  Final value theorem: $\lim_{t \to \infty} f(t) = \lim_{s \to 0} sF(s)$

The last two properties hold provided that $f(t)$ contains no impulses or higher-order singularities at $t = 0$; the final value theorem additionally requires the limit of $f(t)$ to exist.
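The table entries are easy to check numerically against the defining integral (2.22). The MATLAB sketch below (a plausibility check, not part of the original text) evaluates the transform of $f(t) = e^{-t}$ at a sample point and tests the final value theorem for $f(t) = 1 - e^{-t}$:

```matlab
% Laplace transform of f(t) = exp(-t): F(s) = 1/(s+1)  (exponential entry, a = -1)
s = 2.5;                                           % sample point with Re(s) > 0
F = integral(@(t) exp(-t).*exp(-s*t), 0, Inf);
assert(abs(F - 1/(s+1)) < 1e-9)

% Final value theorem for f(t) = 1 - exp(-t): lim_{t->inf} f(t) = lim_{s->0} s*F(s) = 1
s0 = 0.01;                                         % small s approximates the limit
F0 = integral(@(t) (1-exp(-t)).*exp(-s0*t), 0, Inf);
assert(abs(s0*F0 - 1) < 0.02)
```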
2.2.2 Some properties of the Z-transform

Consider a function of time $f(t)$, where time is discrete: $t \in \mathbb{Z}$. The unilateral Z-transform of $f$ is a function $F(z)$ of the complex variable $z = re^{j\theta}$. The definition of $F$ is as follows:

$$f(t) \ \xrightarrow{\ \mathcal{Z}\ } \ F(z) := \sum_{t=0}^{\infty} f(t)\, z^{-t} \qquad (2.23)$$

Table 2: Basic Z-transform properties

  Property                    Time signal                               Z-transform
  Linearity                   $af_1(t) + bf_2(t)$                       $aF_1(z) + bF_2(z)$
  Forward shift               $f(t-1)$                                  $z^{-1}F(z) + f(-1)$
  Backward shift              $f(t+1)$                                  $zF(z) - zf(0)$
  Scaling in freq.            $a^t f(t)$                                $F\!\left(\frac{z}{a}\right)$
  Conjugation                 $f^*(t)$                                  $F^*(z^*)$
  Convolution                 $f_1(t) * f_2(t)$, $f_1 = f_2 = 0$, $t<0$ $F_1(z)\, F_2(z)$
  Differentiation in freq.    $t f(t)$                                  $-z\, \frac{dF(z)}{dz}$
  Impulse                     $\delta(t)$                               $1$
  Exponential                 $a^t\, 1(t)$                              $\frac{z}{z-a}$
  First difference            $f(t) - f(t-1)$                           $(1 - z^{-1})F(z) - f(-1)$
  Accumulation                $\sum_{k=0}^{t} f(k)$                     $\frac{1}{1 - z^{-1}} F(z)$

  Initial value theorem: $f(0) = \lim_{z \to \infty} F(z)$

3 The external and the internal representation of linear systems

In this section we review some basic results concerning linear dynamical systems. General references for the material in this chapter are [31], [28], [29], [9], [7], [15]. For an introduction to linear systems from first principles the reader may consult the book by Polderman and Willems [26]. Here we assume that the external variables have been partitioned into input variables $u$ and output variables $y$, and we will be concerned with convolution systems, i.e. systems where the relation between $u$ and $y$ is given by a convolution sum or integral

$$y = h * u \qquad (3.1)$$

where $h$ is an appropriate weighting pattern. This will be called the external representation. We will also be concerned with systems where, besides the input and output variables, the state $x$ has been declared as well. The relationship between $x$ and $u$ is given by a set of first order difference or differential equations with constant coefficients, while that of $y$ with $x$ and $u$ is given by a set of linear algebraic equations. It will also be assumed that $x$ lives in a finite-dimensional space:

$$\sigma x = Ax + Bu, \qquad y = Cx + Du \qquad (3.2)$$

where $\sigma$ is the derivative or shift operator and $A$, $B$, $C$, $D$ are constant linear maps. This will be called the internal representation. We will also consider an alternative external representation, in terms of two polynomial matrices $Q \in \mathbb{R}^{p \times p}[\sigma]$, $P \in \mathbb{R}^{p \times m}[\sigma]$:

$$Q(\sigma)\, y = P(\sigma)\, u \qquad (3.3)$$

where, as above, $\sigma$ is the derivative or the backwards shift operator. It is usually assumed that $\det Q \neq 0$. This representation is given in terms of differential or difference equations linking the input and the output.

The first subsection is devoted to the discussion of systems governed by (3.1), (3.3), while the following subsection investigates some structural properties of systems represented by (3.2). These equations are solved both in the time and the frequency domains. The third subsection discusses the equivalence of the external and the internal representation. As it turns out, going from the latter to the former involves the elimination of $x$ and is thus straightforward. The converse, however, is far from trivial, as it involves the construction of state; it is called the realization problem. This problem can be interpreted as deriving a time domain representation from frequency domain data.

3.1 External representation

A discrete-time linear system $\Sigma$ with $m$ input and $p$ output channels can be viewed as a linear operator $S: \ell^m(\mathbb{Z}) \to \ell^p(\mathbb{Z})$. There exists a sequence of matrices $S(i,j) \in K^{p \times m}$ (recall that $K$ is either $\mathbb{R}$ or $\mathbb{C}$) such that

$$S: u \mapsto y := S(u), \qquad y(i) = \sum_{j \in \mathbb{Z}} S(i,j)\, u(j), \quad i \in \mathbb{Z} \qquad (3.4)$$

This relationship can be written in matrix form as follows:
$$\begin{pmatrix} \vdots \\ y(-1) \\ y(0) \\ y(1) \\ \vdots \end{pmatrix} = \begin{pmatrix} \ddots & \vdots & \vdots & \vdots & \\ \cdots & S(-1,-1) & S(-1,0) & S(-1,1) & \cdots \\ \cdots & S(0,-1) & S(0,0) & S(0,1) & \cdots \\ \cdots & S(1,-1) & S(1,0) & S(1,1) & \cdots \\ & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} \vdots \\ u(-1) \\ u(0) \\ u(1) \\ \vdots \end{pmatrix} \qquad (3.5)$$

The system described by $S$ is called causal iff

$$S(i,j) = 0, \quad i < j$$

and time-invariant iff

$$S(i,j) =: S_{i-j} \in K^{p \times m}$$

For a time-invariant system $\Sigma$ we can define the sequence of $p \times m$ constant matrices

$$h = (\ldots, S_{-2}, S_{-1}, S_0, S_1, S_2, \ldots) \qquad (3.6)$$

It will be called the impulse response of $\Sigma$, because it is the output obtained in response to a unit pulse

$$u(t) = \delta(t) = \begin{cases} 1, & t = 0 \\ 0, & t \neq 0 \end{cases}$$

Operation (3.4) can now be represented as a convolution sum:

$$S: u \mapsto y = S(u) = h * u, \quad \text{where} \quad (h * u)(t) = \sum_{k=-\infty}^{\infty} S_{t-k}\, u(k), \quad t \in \mathbb{Z} \qquad (3.7)$$

Moreover, the matrix representation of $S$ in this case is a Toeplitz matrix:

$$\begin{pmatrix} \vdots \\ y(-1) \\ y(0) \\ y(1) \\ \vdots \end{pmatrix} = \begin{pmatrix} \ddots & \vdots & \vdots & \vdots & \\ \cdots & S_0 & S_{-1} & S_{-2} & \cdots \\ \cdots & S_1 & S_0 & S_{-1} & \cdots \\ \cdots & S_2 & S_1 & S_0 & \cdots \\ & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} \vdots \\ u(-1) \\ u(0) \\ u(1) \\ \vdots \end{pmatrix} \qquad (3.8)$$

In the sequel we restrict our attention to causal and time-invariant linear systems. The matrix representation of $S$ in this case is lower triangular and Toeplitz ($S_k = 0$, $k < 0$).

In analogy to the discrete-time case, a continuous-time linear system $\Sigma$ with $m$ input and $p$ output channels can be viewed as a linear operator $S$ mapping $L^m(\mathbb{R})$ into $L^p(\mathbb{R})$. In particular we will be concerned with systems which can be expressed by means of an integral:

$$S: u \mapsto y, \qquad y(t) := \int_{-\infty}^{\infty} h(t, \tau)\, u(\tau)\, d\tau, \quad t \in \mathbb{R} \qquad (3.9)$$

where $h(t, \tau)$ is a matrix-valued function called the kernel or weighting pattern of $S$. The system just defined is causal iff

$$h(t, \tau) = 0, \quad t < \tau$$

and time-invariant iff $h$ depends only on the difference of the two arguments: $h(t, \tau) = h(t - \tau)$. In this case $S$ is a convolution operator:

$$S: u \mapsto y = S(u) = h * u, \quad \text{where} \quad (h * u)(t) = \int_{-\infty}^{\infty} h(t - \tau)\, u(\tau)\, d\tau, \quad t \in \mathbb{R} \qquad (3.10)$$

In the sequel we assume that $S$ is both causal and time-invariant, which means that the upper limit of integration can be replaced by $t$. In addition, we assume that $h$ can be expressed as

$$h(t) = S_0\, \delta(t) + h_a(t), \quad S_0 \in K^{p \times m}, \ t \ge 0 \qquad (3.11)$$

where $\delta$ denotes the $\delta$-distribution and $h_a$ is analytic. Hence $h_a$ is uniquely determined by the coefficients of its Taylor series expansion at $t = 0^+$:

$$h_a(t) = S_1 + S_2 \frac{t}{1!} + S_3 \frac{t^2}{2!} + \cdots + S_{k+1} \frac{t^k}{k!} + \cdots, \quad S_k \in K^{p \times m}$$

It follows that if (3.11) is satisfied, the output $y$ is at least as smooth as the input $u$; $\Sigma$ is consequently called a smooth system. Hence, just as in the case of discrete-time systems, a smooth continuous-time linear system can be described by means of the infinite sequence of $p \times m$ matrices $S_i$, $i \ge 0$. We formalize this conclusion next.

Definition 3.1 The external representation of a time-invariant, causal and smooth continuous-time system, and that of a time-invariant, causal discrete-time linear system, with $m$ inputs and $p$ outputs, is given by an infinite sequence of $p \times m$ matrices

$$h := (S_0, S_1, S_2, \ldots, S_k, \ldots), \quad S_k \in \mathbb{R}^{p \times m} \qquad (3.12)$$

The matrices $S_k$ are often referred to as the Markov parameters of the system $S$.
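For a causal, time-invariant, scalar system, the convolution sum (3.7) and its lower triangular Toeplitz representation (3.8) coincide; a minimal MATLAB sketch illustrating this, with an arbitrarily chosen (hypothetical) impulse response sequence:

```matlab
% Causal convolution y = h * u realized by a lower triangular Toeplitz matrix (3.8).
N = 8;
h = 0.5.^(0:N-1);                              % impulse response samples S_0, S_1, ...
u = randn(N, 1);                               % arbitrary input samples u(0), ..., u(N-1)
T = toeplitz(h(:), [h(1), zeros(1, N-1)]);     % lower triangular Toeplitz matrix
y_matrix = T * u;                              % matrix form of the convolution sum (3.7)
y_conv   = conv(h(:), u);                      % direct convolution
assert(norm(y_matrix - y_conv(1:N)) < 1e-12)
```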
The (continuous- or discrete-time) Laplace transform of the impulse response yields the transfer function of the system:

$$H(\xi) := (\mathcal{L}h)(\xi) \qquad (3.13)$$

The Laplace transform is denoted for simplicity by $\mathcal{L}$ for both discrete and continuous time, and the Laplace variable is denoted by $\xi$ in both cases. It readily follows that $H$ can be expanded in a formal power series in $\xi^{-1}$:

$$H(\xi) = S_0 + S_1 \xi^{-1} + S_2 \xi^{-2} + \cdots + S_k \xi^{-k} + \cdots \qquad (3.14)$$

This can also be regarded as a Laurent expansion of $H$ around infinity. Consequently (3.7) and (3.10) can be written as

$$Y(\xi) = H(\xi)\, U(\xi)$$

An alternative way of describing linear systems externally is by specifying differential or difference equations which relate the input and the output channels, the input having $m$ and the output $p$ channels. This representation assumes the existence of polynomials $q_{i,j}(\sigma)$, $i, j = 1, \ldots, p$, and $p_{i,j}(\sigma)$, $i = 1, \ldots, p$, $j = 1, \ldots, m$, such that

$$\sum_{j=1}^{p} q_{i,j}(\sigma)\, y_j(t) = \sum_{j=1}^{m} p_{i,j}(\sigma)\, u_j(t), \quad i = 1, \ldots, p \quad \Rightarrow \quad Q(\sigma)\, y(t) = P(\sigma)\, u(t) \qquad (3.15)$$

where $P$, $Q$ are polynomial matrices, $Q \in \mathbb{R}^{p \times p}[\sigma]$, $P \in \mathbb{R}^{p \times m}[\sigma]$. If we make the assumption that $Q$ is nonsingular, that is, its determinant is not identically zero, $\det Q \neq 0$, the transfer function of this system is the rational matrix $H = Q^{-1}P$. If in addition $H$ is proper rational, that is, the degree of the numerator of each entry does not exceed the degree of the corresponding denominator, we can expand it as follows:

$$H(\xi) = Q^{-1}(\xi)\, P(\xi) = S_0 + S_1 \xi^{-1} + \cdots + S_k \xi^{-k} + \cdots \qquad (3.16)$$

Recall that the variable $\xi$ is used to denote the transform variable $s$ or $z$, depending on whether we are dealing with continuous- or discrete-time systems. We will not dwell further on this polynomial representation of linear systems, since it is the subject of the following contribution in this volume, namely EOLSS Contribution 6.43.13.5.

3.2 Internal representation

An alternative description of linear systems is the internal representation, which uses, in addition to the input $u$ and the output $y$, the state $x$. For a first-principles treatment of the concept of state we refer to the book by Polderman and Willems [26]. For our purposes, given are three linear finite-dimensional spaces: the state space $X = K^n$, the input space $U = K^m$, and the output space $Y = K^p$ (recall that $K$ denotes the field of real numbers $\mathbb{R}$ or that of complex numbers $\mathbb{C}$). The notation $X = K^n$ means that $X$ is a linear space isomorphic to the $n$-dimensional space $K^n$; as an example, the space of all polynomials of degree less than $n$ is isomorphic to $\mathbb{R}^n$, since there is a one-to-one correspondence between each polynomial and the $n$-vector consisting of its coefficients.
In the sequel the term linear system in internal representation will be used to denote a linear, timeinvariant, continuous- or discrete-time system which is finite-dimensional. Linear means: U , X , Y are linear spaces, and A, B , C , D are linear maps; finite-dimensional means: U , X , Y are all finite dimensional; timeinvariant means: A, B , C , D do not depend on time; their matrix representations are constant n n, n m, p n, p m matrices. In the sequel (by slight abuse of notation) we will denote the linear maps A, B , C , D as well as their matrix representations (in some appropriate basis) with the same symbols. We are now ready to give the Definition 3.2 (a) A linear system in internal or state space representation is a quadruple of linear maps (matrices) ! B A ; A 2 K nn ; B 2 K nm ; C 2 K pn; D 2 K pm := (3.21) C D The dimension of the system is defined as the dimension of the associated state space: dim = n (3.22) (b) is called stable if the eigenvalues of A have negative real parts or lie inside the unit disc, depending on whether is a continuous-time or a discrete-time system. X X n The notation = K n means that is a linear space which is isomorphic to the -dimensional space K n ; as an example the space of all polynomials of degree less than is isomorphic to R n , since there is a one-to-one correspondence between each polynomial and an -vector consisting of its coefficients. X 2 n n 16 The external and the internal representation of linear systems 3.2.1 EOLSS 6.43.13.4 Solution in the time domain Let (u; x0 ; t) denote the solution of the state equations (3.19), i.e., the state of the system at time t attained from the initial state x0 at time t0 ; under the influence of the input u. In particular, for the continuous-time state equations (3.17) Z t A (t t0 ) (u; x0 ; t) = e x0 + (3.23) eA(t ) Bu( )d; t t0 ; t0 while for the discrete-time state equations (3.18) (u; x0 ; t) = At t0 x0 + t 1 X j =t0 At 1 j Bu(j ); t t0 : (3.24) In the above formulae we may assume without loss of generality, that t0 = 0, since the systems we are dealing with are time-invariant. The first summand in the above expressions is called zero input and the second zero state part of the solution. The nomenclature comes from the fact that the zero input part is obtained when the system is excited exclusively by means of initial conditions and the zero state part is the result of excitation by some input u and zero initial conditions. In the tables that follow these parts are denoted with the subscripts ”zi” and ”zs”. For both discrete- and continuous-time systems it follows that the output is given by: y(t) = C(u; x(0); t) + Du(t) = C(0; x(0); t) + C(u; 0; t) + Du(t) (3.25) Again the same remark concerning the zero-input and the zero state parts of the output holds. If we compare the above expressions for t0 = 1 and x0 = 0, with (3.7) and (3.10) it follows that the impulse response h has the form below. For continuous-time systems: ( h(t) := CeAt B + Æ(t)D; t 0 0; t < 0 (3.26) where Æ denotes the Æ -distribution. For discrete-time systems 8 > < h(t) := > : CAt 1 B; t > 0 D; t = 0 0; t < 0 (3.27) The corresponding external representation given by means of the Markov parameters (3.12), is: h = (D; CB; CAB; CA2 B; ; CAk 1B; ) (3.28) By transforming the state the matrices which describe the system will change. Thus, if the new state is x~ := T x, det T = 6 0, (3.19) and (3.20) in the new state x~, will be become x~ = T| AT {z A~ ~ + |{z} T B u; }x 1 B~ 1 y = CT ~ + Du | {z } x C~ where D remains unchanged. 
Let $\Sigma$ and $\tilde{\Sigma}$ be equivalent with equivalence transformation $T$. It readily follows that

$$H(\xi) = D + C(\xi I - A)^{-1}B = D + CT^{-1}\left(\xi I - TAT^{-1}\right)^{-1} TB = \tilde{D} + \tilde{C}(\xi I - \tilde{A})^{-1}\tilde{B} = \tilde{H}(\xi)$$

This immediately implies $S_k = \tilde{S}_k$, $k \in \mathbb{N}$. We have thus proved:

Proposition 3.1 Equivalent triples have the same transfer function and therefore the same Markov parameters.

3.2.2 Solution in the frequency domain

In this section we assume that the initial time is $t_0 = 0$. Let $\Phi(\xi) = \mathcal{L}(\phi)(\xi)$, where $\phi$ is defined by (3.23), (3.24); there holds

$$\Phi(\xi) = (\xi I - A)^{-1} x_0 + (\xi I - A)^{-1} B\, U(\xi) \quad \Rightarrow \quad Y(\xi) = C\,\Phi(\xi) + D\, U(\xi) \qquad (3.30)$$

Thus, by (3.13), (3.26), (3.27), the transfer function of $\Sigma$ is

$$H(\xi) = D + C(\xi I - A)^{-1} B \qquad (3.31)$$

A summary of these relationships is provided in tables 3 and 4 below.

3.2.3 The concepts of reachability and observability

The concept of reachability provides the tool for answering questions related to the extent to which the state $x$ of the system can be manipulated through the input $u$. The related concept of controllability will be discussed subsequently. Both concepts involve only the state equations. For additional information on these issues we refer to [5].

Definition 3.3 Given is $\Sigma = (A\ \ B)$, $A \in K^{n \times n}$, $B \in K^{n \times m}$. A state $x \in X$ is reachable from the zero state iff there exist an input function $u(t)$ and a time $T < \infty$ such that

$$x = \phi(u; 0; T)$$

The reachable subspace $X^{\mathrm{reach}} \subseteq X$ of $\Sigma$ is the set containing all reachable states of the system. We will call $\Sigma$ (completely) reachable iff $X^{\mathrm{reach}} = X$. Furthermore,

$$\mathcal{R}_n(A, B) := [B\ \ AB\ \ A^2B\ \ \cdots\ \ A^{n-1}B] \qquad (3.32)$$

will be called the reachability matrix of $\Sigma$. The finite reachability gramians at time $t < \infty$ are defined as follows. For continuous-time systems,

$$\mathcal{P}(t) := \int_0^t e^{A\tau} BB^* e^{A^*\tau}\, d\tau, \quad t \in \mathbb{R}_+ \qquad (3.33)$$

while for discrete-time systems,

$$\mathcal{P}(t) := \mathcal{R}_t(A, B)\, \mathcal{R}_t(A, B)^* = \sum_{k=0}^{t-1} A^k BB^* (A^*)^k, \quad t \in \mathbb{Z}_+ \qquad (3.34)$$

Theorem 3.1 Consider the pair $(A, B)$ as defined above.
(a) $X^{\mathrm{reach}} = \mathrm{span\ col}\,\mathcal{R}_n = \mathrm{span\ col}\,\mathcal{P}(t)$, where $t > 0$, $t \ge n-1$, for continuous-, discrete-time systems, respectively.
(b) Reachability conditions. The following are equivalent:
1. The pair $(A, B)$, $A \in K^{n \times n}$, $B \in K^{n \times m}$, is completely reachable.
2. The rank of the reachability matrix is full: rank $\mathcal{R}(A, B) = n$.
3. The reachability gramian is positive definite: $\mathcal{P}(t) > 0$ for some $t > 0$.
4. No left eigenvector $v$ of $A$ is in the left kernel of $B$: $v^*A = \lambda v^* \Rightarrow v^*B \neq 0$.
5. rank $(\lambda I_n - A \ \ B) = n$ for all $\lambda \in \mathbb{C}$.
6. The polynomial matrices $\lambda I - A$ and $B$ are left coprime.

The fourth and fifth conditions in the theorem above are known as the PHB or Popov-Hautus-Belevich tests for reachability.
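These equivalent tests are easy to try out directly. The MATLAB sketch below checks conditions 2, 3 and 5 of Theorem 3.1 for a small stable pair (the same matrices reappear in Example 3.2 later); note that `lyap` requires the Control System Toolbox:

```matlab
% Reachability tests of Theorem 3.1 for A = [0 1; -2 -3], B = [0; 1].
A = [0 1; -2 -3];  B = [0; 1];  n = size(A, 1);
R = [B, A*B];                          % reachability matrix R_2(A, B)
assert(rank(R) == n)                   % condition 2: full rank
P = lyap(A, B*B');                     % infinite gramian (A is stable)
assert(min(eig(P)) > 0)                % condition 3: positive definite gramian
% condition 5 (PHB): rank [lambda*I - A, B] = n at every eigenvalue of A
for lambda = eig(A).'
    assert(rank([lambda*eye(n) - A, B]) == n)
end
```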
Table 3: I/O and I/S/O representation of continuous-time linear systems

I/O representation — variables $(u, y)$:
$$Q\!\left(\tfrac{d}{dt}\right) y(t) = P\!\left(\tfrac{d}{dt}\right) u(t), \quad u(t) \in \mathbb{R}^m,\ y(t) \in \mathbb{R}^p$$

I/S/O representation — variables $(u, x, y)$:
$$\tfrac{d}{dt}x(t) = Ax(t) + Bu(t), \quad y(t) = Cx(t) + Du(t), \quad x(t) \in \mathbb{R}^n, \quad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathbb{R}^{(n+p) \times (n+m)}$$

Impulse response.
I/O: $Q\!\left(\tfrac{d}{dt}\right) h(t) = P\!\left(\tfrac{d}{dt}\right) \delta(t)$, so that $H(s) = \mathcal{L}(h(t)) = Q^{-1}(s)P(s)$.
I/S/O: $h(t) = D\delta(t) + Ce^{At}B$, $t \ge 0$; $H(s) = D + C(sI - A)^{-1}B$.

Poles (characteristic roots): $\lambda_i \in \mathbb{C}$ such that $\det Q(\lambda_i) = 0$, $i = 1, \ldots, n$.

Zeros.
I/O: $z_i \in \mathbb{C}$ for which there exists $v_i \in \mathbb{C}^m$ satisfying $H(z_i)v_i = 0$.
I/S/O: $z_i \in \mathbb{C}$ for which there exist $w_i \in \mathbb{C}^n$, $v_i \in \mathbb{C}^m$, not both zero, such that
$$\begin{pmatrix} z_iI - A & B \\ C & D \end{pmatrix} \begin{pmatrix} w_i \\ v_i \end{pmatrix} = 0$$

Matrix exponential: $e^{At} = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!}$, $\ \frac{d}{dt}e^{At} = Ae^{At}$, $\ \mathcal{L}(e^{At}) = (sI - A)^{-1}$.

Solution in the time domain.
I/O: $y(t) = h(t) * u(t)$, so $y(t) = y_{zi}(t) + y_{zs}(t)$, where $y_{zi}(t) = \sum_{i=1}^{n} c_i e^{\lambda_i t}$ and $y_{zs}(t) = \int_0^t h(t-\tau)\, u(\tau)\, d\tau$.
I/S/O: $x(t) = x_{zi}(t) + x_{zs}(t)$, with $x(t) = e^{At}x(0^-) + \int_0^t e^{A(t-\tau)}Bu(\tau)\, d\tau$ and
$$y(t) = Ce^{At}x(0^-) + \int_0^t \left(D\delta(t-\tau) + Ce^{A(t-\tau)}B\right) u(\tau)\, d\tau = Ce^{At}x(0^-) + \int_0^t h(t-\tau)\, u(\tau)\, d\tau$$

Solution in the frequency domain.
I/O: $Y(s) = Q^{-1}(s)R(s) + H(s)U(s)$, where $R$ collects the initial conditions.
I/S/O: $X(s) = (sI-A)^{-1}x(0^-) + (sI-A)^{-1}BU(s)$, hence
$$Y(s) = C(sI-A)^{-1}x(0^-) + \underbrace{\left(D + C(sI-A)^{-1}B\right)}_{H(s)} U(s) = C(sI-A)^{-1}x(0^-) + H(s)U(s)$$

Table 4: I/O and I/S/O representation of discrete-time linear systems

I/O representation — variables $(u, y)$:
$$Q(\sigma)\, y(t) = P(\sigma)\, u(t), \quad u(t) \in \mathbb{R}^m,\ y(t) \in \mathbb{R}^p$$

I/S/O representation — variables $(u, x, y)$:
$$x(t+1) = Ax(t) + Bu(t), \quad y(t) = Cx(t) + Du(t), \quad x(t) \in \mathbb{R}^n, \quad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathbb{R}^{(n+p) \times (n+m)}$$

Impulse response.
I/O: $Q(\sigma)\, h(t) = P(\sigma)\, \delta(t)$, so that $H(z) = \mathcal{Z}(h(t)) = Q^{-1}(z)P(z)$.
I/S/O: $h(0) = D$, $h(t) = CA^{t-1}B$, $t > 0$; $H(z) = D + C(zI - A)^{-1}B$.

Poles and zeros: same as in the continuous-time case.

Powers of a matrix: $\mathcal{Z}(A^{t-1}) = (zI - A)^{-1}$ (with $A^{t-1}$ taken for $t \ge 1$).

Solution in the time domain.
I/O: $y(t) = h(t) * u(t)$, so $y(t) = y_{zi}(t) + y_{zs}(t)$, where $y_{zi}(t) = \sum_{i=1}^{n} c_i \lambda_i^t$ and $y_{zs}(t) = \sum_{\tau=0}^{t} h(t-\tau)\, u(\tau)$.
I/S/O: $x(t) = x_{zi}(t) + x_{zs}(t)$, with $x(t) = A^t x(0) + \sum_{\tau=0}^{t-1} A^{t-1-\tau}Bu(\tau)$ and
$$y(t) = CA^t x(0) + \sum_{\tau=0}^{t} h(t-\tau)\, u(\tau)$$

Solution in the frequency domain.
I/O: $Y(z) = Q^{-1}(z)R(z) + H(z)U(z)$.
I/S/O: $X(z) = (zI-A)^{-1}x(0) + (zI-A)^{-1}BU(z)$, hence
$$Y(z) = C(zI-A)^{-1}x(0) + \underbrace{\left(D + C(zI-A)^{-1}B\right)}_{H(z)} U(z) = C(zI-A)^{-1}x(0) + H(z)U(z)$$

We now turn our attention to the concept of observability. In order to be able to modify the dynamical behavior of a system, very often the state $x$ needs to be available. Typically, however, the state variables are inaccessible and only certain linear combinations $y$ thereof, given by the output equations (3.20), are known. Thus we need to discuss the problem of reconstructing the state $x(T)$ from observations $y(\tau)$, where $\tau$ lies in some appropriate interval. If $\tau \in [T, T+t]$ we have the state observation problem, while if $\tau \in [T-t, T]$ we have the state reconstruction problem. We first discuss the observation problem. Without loss of generality we assume $T = 0$. Recall (3.23), (3.24) and (3.25). Since the input $u$ is known, the latter two terms in (3.25) are also known for $t \ge 0$. Therefore, in determining $x(0)$ we may assume without loss of generality that $u(\cdot) = 0$. Thus the observation problem reduces to the following: given $C\phi(0; x(0); t)$ for $t \ge 0$, find $x(0)$. Since $B$ and $D$ are irrelevant, for this subsection

$$\Sigma = \begin{pmatrix} A \\ C \end{pmatrix}, \quad A \in K^{n \times n},\ C \in K^{p \times n}$$

Definition 3.4 A state $x \in X$ is unobservable iff $y(t) = C\phi(0; x; t) = 0$ for all $t \ge 0$, i.e. iff $x$ is indistinguishable from the zero state for all $t \ge 0$. The unobservable subspace $X^{\mathrm{unobs}}$ of $X$ is the set of all unobservable states of $\Sigma$. $\Sigma$ is (completely) observable iff $X^{\mathrm{unobs}} = 0$. The observability matrix of $\Sigma$ is

$$\mathcal{O}_n(C, A) := \begin{pmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{pmatrix} \qquad (3.35)$$
The observability matrix of is On(C; A) = (C A C (A )n 1C ) 20 (3.35) The external and the internal representation of linear systems EOLSS 6.43.13.4 The finite observability gramians at time t < 1 are: Q(t) := Theorem 3.2 Given = Z t 0 (t ) A(t ) C Ce d; eA t 2 R+ Q(t) := Ot (C; A)Ot (C; A); t 2 Z+ A C , for both t 2 Z and t 2 R , X unobs is a linear subspace of X given by where t > 0, t n 1, depending on whether the system is continuous-, or discrete-time. Thus, observable if, and only if, rank O (C; A) = n. Y0 := (y (0) y (1) where D := (3.37) X unobs = ker On (C; A) = ker Q(t) = fx 2 X : CAi 1 x = 0; i > 0g Remark 3.1 (a) Given y (t); (3.36) (3.38) is completely t 0, let Y0 denote the following np 1 vector: y(n 1)) t 2 Z; Y0 := (y (0) Dy(0) Dn 1y(0)) t 2 R ; d dt . The observation problem reduces to the solution of the linear set of equations On(C; A)x(0) = Y0 This set of equations is solvable for all initial conditions x(0), i.e. it has a unique solution if and only if is observable. Otherwise x(0) can only be determined modulo X unobs , i.e. up to an arbitrary linear combination of unobservable states. (b) If x1 ; x2 , are not reachable, there is a trajectory passing through the two points if, and only if, x2 f (A; T )x1 2 X reah , for some T , where f (A; T ) = eAT for continuous-time systems and f (A; T ) = AT for discrete-time systems. This shows that if we start from a reachable state x1 6= 0 the states that can be attained are also within the reachable subspace. A concept which is closely related to reachability is that of controllability. Here, instead of driving the 2 X is zero state to a desired state, a given non-zero state is steered to the zero state. Furthermore, a state x unreconstructible iff y (t) = C(0; x; t) = 0, for all t 0, i.e. iff x is indistinguishable from the zero state for all t 0. The next result shows that for continuous-time systems the concepts of reachability and controllability are equivalent while for discrete-time systems the latter is weaker. Similarly, while for continuous-time systems the concepts of observability and reconstructibility are equivalent, for discrete-time systems the latter is weaker. For this reason, only the concepts of reachability and observability are used in the sequel. Proposition 3.2 Given is the triple (C; A; B ). (a) For continuous-time systems X ontr = X reah and X unre = X unobs . (b) For discrete-time systems X reah X ontr and X unre X unobs ; in particular X ontr = X reah + ker An and X unre = X unobs \ im An . 3.2.4 The infinite gramians Consider a continuous-time linear system = A B C D which is stable, i.e. all eigenvalues of A have negative real parts. In this case both (3.33) as well as (3.36) are defined for t = 1. In addition because of Plancherel’s formula, the gramians can be expressed also in the frequency domain (expressions on the righthand side): Z 1 Z 1 (j! A) 1 BB ( j! A ) 1 d! P := eA BB eA d = 1 (3.39) 0 2 1 21 The external and the internal representation of linear systems Q EOLSS 6.43.13.4 1 A A 1 Z1 ( j! := e C Ce d = 2 1 0 Z A ) 1 C C (j! A) 1 d! (3.40) P , Q are the infinite reachability and infinite observability gramians associated with . These gramians satisfy the following linear matrix equations, called Lyapunov equations; see also [21, 8]. 
Proposition 3.3 Given the stable, continuous-time system $\Sigma$ as above, the associated infinite reachability gramian satisfies the continuous-time Lyapunov equation

$$A\mathcal{P} + \mathcal{P}A^* + BB^* = 0 \qquad (3.41)$$

while the associated infinite observability gramian satisfies

$$A^*\mathcal{Q} + \mathcal{Q}A + C^*C = 0 \qquad (3.42)$$

Proof. Due to stability,

$$-BB^* = \int_0^{\infty} d\left(e^{A\tau} BB^* e^{A^*\tau}\right) = \int_0^{\infty} \left(A e^{A\tau} BB^* e^{A^*\tau} + e^{A\tau} BB^* e^{A^*\tau} A^*\right) d\tau = A\mathcal{P} + \mathcal{P}A^*$$

This proves (3.41); (3.42) is proved similarly.

If the discrete-time system $\Sigma_d = \begin{pmatrix} F & G \\ H & J \end{pmatrix}$ is stable, i.e. all eigenvalues of $F$ are inside the unit disc, the gramians (3.34) and (3.37) are defined for $t = \infty$:

$$\mathcal{P} := \mathcal{R}(F, G)\, \mathcal{R}(F, G)^* = \sum_{t>0} F^{t-1} GG^* (F^*)^{t-1} = \frac{1}{2\pi}\int_0^{2\pi} (e^{j\theta}I - F)^{-1} GG^* (e^{-j\theta}I - F^*)^{-1}\, d\theta \qquad (3.43)$$

$$\mathcal{Q} := \mathcal{O}(H, F)^*\, \mathcal{O}(H, F) = \sum_{t>0} (F^*)^{t-1} H^*H F^{t-1} = \frac{1}{2\pi}\int_0^{2\pi} (e^{-j\theta}I - F^*)^{-1} H^*H\, (e^{j\theta}I - F)^{-1}\, d\theta \qquad (3.44)$$

Notice that $\mathcal{P}$ can be written as $\mathcal{P} = GG^* + F\mathcal{P}F^*$; moreover $\mathcal{Q} = H^*H + F^*\mathcal{Q}F$. These are the so-called discrete Lyapunov or Stein equations.

Proposition 3.4 Given the stable, discrete-time system $\Sigma_d$ as above, the associated infinite reachability gramian $\mathcal{P}$ satisfies the discrete-time Lyapunov equation, and the associated infinite observability gramian $\mathcal{Q}$ its dual:

$$F\mathcal{P}F^* + GG^* = \mathcal{P}, \qquad F^*\mathcal{Q}F + H^*H = \mathcal{Q} \qquad (3.45)$$

We conclude this section by summarizing some properties of the system gramians. For details see, e.g., [23, 14, 7].

Lemma 3.1 Let $\mathcal{P}$ and $\mathcal{Q}$ denote the infinite gramians of a linear stable system.
(a) The minimal energy required to steer the state of the system from 0 to $x_r$ is $x_r^* \mathcal{P}^{-1} x_r$.
(b) The maximal energy produced by observing the output of the system whose initial state is $x_o$ is $x_o^* \mathcal{Q}\, x_o$.
(c) The states which are difficult to reach, i.e. require large amounts of energy, lie in the span of those eigenvectors of $\mathcal{P}$ which correspond to small eigenvalues. Furthermore, the states which are difficult to observe, i.e. produce small observation energy, lie in the span of those eigenvectors of $\mathcal{Q}$ which correspond to small eigenvalues.

Remark 3.2 Computation of the reachability gramian. Given the pair $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, the reachability gramian is defined by (3.33). We assume that the eigenvalues of $A$ are distinct. Then $A$ is diagonalizable; let its EVD (Eigenvalue Decomposition) be

$$A = V\Lambda V^{-1}, \quad V = [v_1\ v_2\ \cdots\ v_n], \quad \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$$

where $v_i$ denotes the eigenvector corresponding to the eigenvalue $\lambda_i$. Notice that if the $i$-th eigenvalue is complex, the corresponding eigenvector will also be complex. Let $W = V^{-1}B \in \mathbb{C}^{n \times m}$ and denote by $W_i \in \mathbb{C}^{1 \times m}$ the $i$-th row of $W$. With the notation introduced above the following formula holds:

$$\mathcal{P}(T) = V R(T) V^*, \quad \text{where} \quad [R(T)]_{ij} = W_i W_j^*\, \frac{\exp\left[(\lambda_i + \lambda_j^*)T\right] - 1}{\lambda_i + \lambda_j^*} \in \mathbb{C} \qquad (3.46)$$

Furthermore, if $\lambda_i + \lambda_j^* = 0$, then $[R(T)]_{ij} = (W_i W_j^*)\, T$. If in addition $A$ is stable, the infinite gramian (3.41) is given by $\mathcal{P} = VRV^*$, where $R_{ij} = -\frac{W_i W_j^*}{\lambda_i + \lambda_j^*}$. This formula accomplishes both the computation of the matrix exponential and the integration explicitly, in terms of the EVD of $A$.

Example 3.1 Consider the parallel connection of two branches, the first consisting of the series connection of an inductor $L$ with a resistor $R_L$, and the other consisting of the series connection of a capacitor $C$ with a resistor $R_C$. Assume that the values of these elements are $L = 1$, $R_L = 1$, $C = 1$, $R_C = \frac{1}{2}$; then

$$A = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix}, \quad B = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \quad \Rightarrow \quad e^{At}B = \begin{pmatrix} e^{-t} \\ 2e^{-2t} \end{pmatrix}$$

The gramian $\mathcal{P}(T)$ and the infinite gramian $\mathcal{P}$ are

$$\mathcal{P}(T) = \begin{pmatrix} -\tfrac{1}{2}e^{-2T} + \tfrac{1}{2} & -\tfrac{2}{3}e^{-3T} + \tfrac{2}{3} \\[1mm] -\tfrac{2}{3}e^{-3T} + \tfrac{2}{3} & -e^{-4T} + 1 \end{pmatrix}, \qquad \mathcal{P} = \lim_{T \to \infty} \mathcal{P}(T) = \begin{pmatrix} \tfrac{1}{2} & \tfrac{2}{3} \\[1mm] \tfrac{2}{3} & 1 \end{pmatrix}$$
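For the diagonal $A$ of Example 3.1 the eigenvalue formula of Remark 3.2 can be checked directly (here $V = I$, so the check is almost trivial, but the code applies to any diagonalizable stable $A$); a MATLAB sketch:

```matlab
% Infinite gramian of Example 3.1 via the eigenvalue formula of Remark 3.2.
A = [-1 0; 0 -2];  B = [1; 2];
[V, L] = eig(A);  lambda = diag(L);
W = V \ B;                               % W = V^{-1} B; row W(i,:) pairs with lambda(i)
R = zeros(2, 2);
for i = 1:2
    for j = 1:2
        R(i, j) = -(W(i,:) * W(j,:)') / (lambda(i) + conj(lambda(j)));
    end
end
P = V * R * V';
assert(norm(P - [1/2 2/3; 2/3 1]) < 1e-12)   % matches the closed form above
```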
If the system is asymptotically stable, i.e. $\mathrm{Re}(\lambda_i(A)) < 0$, the reachability gramian is defined for $T = \infty$ and satisfies (3.41). Hence the infinite gramian can be computed as the solution of the above linear matrix equation; no explicit calculation of matrix exponentials, multiplication and subsequent integration is required. In matlab, if in addition the pair $(A, B)$ is controllable, we have:

$$\mathcal{P} = \mathrm{lyap}\,(A,\ B*B')$$

For the matrices defined earlier, using the 'lyap' command in the format 'long e', we get:

$$\mathcal{P} = \begin{pmatrix} 5.000000000000000\mathrm{e}{-01} & 6.666666666666666\mathrm{e}{-01} \\ 6.666666666666666\mathrm{e}{-01} & 1.000000000000000\mathrm{e}{+00} \end{pmatrix}$$

Example 3.2 A second simple example is the following:

$$A = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad \Rightarrow \quad e^{At} = \begin{pmatrix} 2e^{-t} - e^{-2t} & e^{-t} - e^{-2t} \\ -2e^{-t} + 2e^{-2t} & -e^{-t} + 2e^{-2t} \end{pmatrix}$$

This implies

$$\mathcal{P}(T) = \frac{1}{12}\begin{pmatrix} -6e^{-2T} + 8e^{-3T} - 3e^{-4T} + 1 & 6e^{-2T} - 12e^{-3T} + 6e^{-4T} \\ 6e^{-2T} - 12e^{-3T} + 6e^{-4T} & -6e^{-2T} + 16e^{-3T} - 12e^{-4T} + 2 \end{pmatrix}$$

And finally

$$\mathcal{P} = \mathrm{lyap}\,(A,\ B*B') = \begin{pmatrix} \tfrac{1}{12} & 0 \\ 0 & \tfrac{1}{6} \end{pmatrix}$$

A transformation between continuous- and discrete-time systems

One transformation between continuous- and discrete-time systems is given by the bilinear transformation of the complex plane onto itself, $z = \frac{1+s}{1-s}$. In particular, the transfer function $H(s)$ of a continuous-time system is obtained from that of a discrete-time one $H_d(z)$ as follows:

$$H(s) = H_d\!\left(\frac{1+s}{1-s}\right)$$

This transformation maps the left half of the complex plane onto the unit disc and vice-versa. The matrices

$$\Sigma := \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \qquad \Sigma_d := \begin{pmatrix} F & G \\ H & J \end{pmatrix}$$

of these two systems are related as given in the following table.

  Continuous-time to discrete-time ($z = \frac{1+s}{1-s}$):     Discrete-time to continuous-time ($s = \frac{z-1}{z+1}$):
  $F = (I + A)(I - A)^{-1}$                                     $A = (F + I)^{-1}(F - I)$
  $G = \sqrt{2}\,(I - A)^{-1}B$                                 $B = \sqrt{2}\,(F + I)^{-1}G$
  $H = \sqrt{2}\,C(I - A)^{-1}$                                 $C = \sqrt{2}\,H(F + I)^{-1}$
  $J = D + C(I - A)^{-1}B$                                      $D = J - H(F + I)^{-1}G$

Proposition 3.5 Given the stable continuous-time system $\Sigma = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ with infinite gramians $\mathcal{P}_c$, $\mathcal{Q}_c$, let $\Sigma_d := \begin{pmatrix} F & G \\ H & J \end{pmatrix}$, with infinite gramians $\mathcal{P}_d$, $\mathcal{Q}_d$, be the discrete-time system obtained by means of the transformation given above. The bilinear transformation introduced above preserves the gramians:

$$\mathcal{P}_c = \mathcal{P}_d \quad \text{and} \quad \mathcal{Q}_c = \mathcal{Q}_d$$

Furthermore, this transformation preserves the infinity norms (see section 4.3).
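Proposition 3.5 can be verified numerically. The sketch below maps the system of Example 3.2 to discrete time via the bilinear transformation and compares the two reachability gramians (`lyap` and `dlyap` are in the Control System Toolbox; the variable names are illustrative):

```matlab
% Bilinear transformation preserves the reachability gramian (Proposition 3.5).
A = [0 1; -2 -3];  B = [0; 1];  n = size(A, 1);
F = (eye(n) + A) / (eye(n) - A);        % F = (I + A)(I - A)^{-1}
G = sqrt(2) * ((eye(n) - A) \ B);       % G = sqrt(2) (I - A)^{-1} B
Pc = lyap(A, B*B');                     % continuous-time: A P + P A' + B B' = 0
Pd = dlyap(F, G*G');                    % discrete-time:   F P F' - P + G G' = 0
assert(norm(Pc - Pd) < 1e-10)
```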
Hence the following problem results: It readily follows Definition 3.5 Given the sequence of p m matrices Sk , k 2 N , the realization problem consists in finding a positive integer n and constant matrices (C; A; B ) such that Sk = CAk 1B; C; A; B 2 R pn R nn R nm ; k 2 N The triple sequence. A B C (3.48) ! is then called a realization of the sequence Sk , and the latter is called a realizable The realization problem is sometimes referred to as the problem of construction of state for linear systems described by convolution relationships. Remark 3.3 Realization can also be considered as the problem of converting frequency domain data into time domain data. The reason is that measurement of the Markov parameters is closely related to measurement of the frequency response. Example 3.3 Consider the following (scalar) sequences: (Sk )1 k=0 (Sk )1 k=0 (Sk )1 k=0 (Sk )1 k=0 (Sk )1 k=0 f1; 1; 1; 1; 1; 1; 1; 1; 1; g f1; 2; 3; 4; 5; 6; 7; 8; 9; g natural numbers f1; 2; 3; 5; 8; 13; 21; 34; 55; g Fibonai numbers f1; 2; 3; 5; 7; 11; 13; 17; 19; g primes 1 1 1 1 1 1 = f1; ; ; ; ; ; ; g inverse fatorials 1! 2! 3! 4! 5! 6! = = = = Which sequences are realizable? Problem 3.1 The following problems arise: (a) Existence: given a sequence Sk , k > 0, determine whether there exist a positive integer n and a triple of matrices A; B; C such that (3.48) holds. (b) Uniqueness: in case such an integer and triple exist, are they unique in some sense? (c) Construction: in case of existence, find n and give an algorithm to construct such a triple. 25 The external and the internal representation of linear systems EOLSS 6.43.13.4 The main tool for answering the above questions is the matrix H of Markov parameters: 0 H := B B B B B B B B B B B B B B B B B B S1 S2 S2 S3 .. . .. . Sk Sk+1 Sk+1 Sk+2 .. . Sk Sk+1 Sk+1 Sk+2 .. . .. . S2k 1 S2k S2k S2k+1 .. . .. . .. . 1 C C C C C C C C C C C C C C C C C C A (3.49) This is the Hankel matrix; it has infinitely many rows, infinitely many columns, and block Hankel structure, i.e. (H)i;j = Si+j 1 , for i; j > 0. We start by listing conditions related to the realization problem. Lemma 3.2 Each statement below implies the one which follows: (a) The sequence Sk , k 2 N , is realizable. P (b) The formal power series k>0 Sk k is rational. (c) The sequence Sk , k 2 N , satisfies a recursion with constant coefficients, i.e. there exist a positive integer r and constants i , 0 i < r , such that 0 Sk + 1 Sk+1 + 2 Sk+2 + + r 2Sr+k 2 + r 1 Sr+k 1 + Sr+k = 0; k > 0 (3.50) (d) The rank of H is finite. Proof. (a) ) (b) Realizability implies (3.48). Hence X k>0 Sk This proves (b). (b) ) (c) Let det(I implies k = X k>0 0 CAk 1B k =C A) =: 0 + 1 + 0 A ( ) X k>0 X k>0 1 Ak 1 kA B = C (I A) 1 B + r 1r 1 + r =: A(). The previous relationship 1 Sk k A = C [adj (I A)℄ B where adj (M ) denotes the adjoint of the matrix M . On the left-hand side there are terms having both positive and negative powers of , while on the right-hand side there are only terms having positive powers of . Hence the coefficients of the negative powers of on the left-hand side must be identically zero; this implies precisely (3.50). (c) ) (d) Relationships (3.50) imply that the (r + 1)-st block column of H is a linear combination of the previous r block columns. Furthermore, because of the block Hankel structure, every block column of H is a sub-column of the previous one; this implies that all block columns after the r -th are linearly dependent on the first r , which in turn implies the finiteness of the rank of H. 
The following lemma describes a fundamental property of $\mathcal{H}$; it also provides a direct proof of the implication (a) $\Rightarrow$ (d).

Lemma 3.3 (Factorization of $\mathcal{H}$) If the sequence of Markov parameters is realizable by means of the triple $(C, A, B)$, then $\mathcal{H}$ can be factored as

$$ \mathcal{H} = \mathcal{O}(C, A)\,\mathcal{R}(A, B) \qquad (3.51) $$

Consequently, if the sequence of Markov parameters is realizable, the rank of $\mathcal{H}$ is finite.

Proof. If $S_n$, $n \in \mathbb{N}$, is realizable, the relationships $S_n = CA^{n-1}B$, $n \in \mathbb{N}$, hold true. Hence

$$ \mathcal{H} = \begin{pmatrix} CB & CAB & \cdots \\ CAB & CA^2B & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} = \mathcal{O}(C, A)\,\mathcal{R}(A, B) $$

It follows that $\mathrm{rank}\,\mathcal{H} \le \min\{\mathrm{rank}\,\mathcal{O},\, \mathrm{rank}\,\mathcal{R}\} \le \mathrm{size}(A)$.

In order to discuss the uniqueness of realizations, we need to recall the concept of equivalent systems defined by (3.29). In particular, proposition 3.1 asserts that equivalent triples have the same Markov parameters. Hence the best one can hope for in connection with the uniqueness question is that realizations be equivalent. Indeed, as shown in the next section, this holds for realizations of the smallest possible dimension.

3.3.1 The solution of the realization problem

We are now ready to answer the three questions posed in problem 3.1. This also proves the implication (d) $\Rightarrow$ (a), and hence the equivalence of the statements of lemma 3.2.

Theorem 3.3 (Main result) (1) The sequence $S_k$, $k \in \mathbb{N}$, is realizable if, and only if, $\mathrm{rank}\,\mathcal{H} =: n < \infty$. (2) The state space dimension of any realization is at least $n$. All minimal realizations are both reachable and observable; conversely, every realization which is reachable and observable is minimal. (3) All minimal realizations are equivalent.

Lemma 3.3 proves part (1) of the main theorem in one direction. To prove (1) in the other direction we will actually construct a realization, assuming that the rank of $\mathcal{H}$ is finite.

Lemma 3.4 (Silverman realization algorithm) Let $\mathrm{rank}\,\mathcal{H} = n$. Find an $n \times n$ submatrix $\Gamma$ of $\mathcal{H}$ which has full rank. Construct the following matrices: (i) $\bar\Gamma \in \mathbb{R}^{n\times n}$, composed of the same rows as $\Gamma$; its columns are obtained by shifting those of $\Gamma$ by one block column (i.e. by $m$ columns). (ii) $\check\Gamma \in \mathbb{R}^{n\times m}$, composed of the same rows as $\Gamma$; its columns are the first $m$ columns of $\mathcal{H}$. (iii) $\tilde\Gamma \in \mathbb{R}^{p\times n}$, composed of the same columns as $\Gamma$; its rows are the first $p$ rows of $\mathcal{H}$. The triple $(C, A, B)$, where $C := \tilde\Gamma$, $A := \Gamma^{-1}\bar\Gamma$, and $B := \Gamma^{-1}\check\Gamma$, is a realization of dimension $n$ of the given sequence of Markov parameters.

Proof. By assumption there exist $n := \mathrm{rank}\,\mathcal{H}$ columns of $\mathcal{H}$ which span its column space. Denote these columns by $\Gamma_1$; note that the columns making up $\Gamma_1$ need not be consecutive columns of $\mathcal{H}$. Let $\bar\Gamma_1$ denote the $n$ columns of $\mathcal{H}$ obtained by shifting those of $\Gamma_1$ by one block column, i.e. by $m$ individual columns; let $\check\Gamma_1$ denote the first $m$ columns of $\mathcal{H}$. There exist unique matrices $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times m}$, such that

$$ \bar\Gamma_1 = \Gamma_1 A \qquad (3.52) $$

$$ \check\Gamma_1 = \Gamma_1 B \qquad (3.53) $$

Finally, define as $C$ the first block row, i.e. the first $p$ individual rows, of $\Gamma_1$:

$$ C := (\Gamma_1)_1 \qquad (3.54) $$

For this proof, $(M)_k$, $k \in \mathbb{N}$, denotes the $k$-th block row of the matrix $M$. Recall that the first block element of $\check\Gamma_1$ is $S_1$, i.e. $(\check\Gamma_1)_1 = S_1$. Thus from (3.53) together with (3.54) follows

$$ S_1 = (\check\Gamma_1)_1 = (\Gamma_1 B)_1 = (\Gamma_1)_1 B = CB $$

For the next Markov parameter, notice that $S_2 = (\check\Gamma_1)_2$, the second block row of $\check\Gamma_1$.
By the block Hankel structure of $\mathcal{H}$, deleting the first block row of $\Gamma_1$ has the same effect as shifting its columns by one block column, so $(\Gamma_1)_2 = (\bar\Gamma_1)_1$. Thus, making use of (3.52),

$$ S_2 = (\check\Gamma_1)_2 = (\Gamma_1)_2 B = (\bar\Gamma_1)_1 B = (\Gamma_1 A)_1 B = (\Gamma_1)_1 AB = CAB $$

For the $k$-th Markov parameter, combining (3.53), (3.52) and (3.54) in the same fashion, we obtain

$$ S_k = (\check\Gamma_1)_k = (\Gamma_1)_k B = (\Gamma_1)_1 A^{k-1}B = CA^{k-1}B $$

Thus $(C, A, B)$ is indeed a realization of dimension $n$.

The state dimension of a realization cannot be less than $n$; indeed, if such a realization existed, the rank of $\mathcal{H}$ would be less than $n$, contradicting the assumption that the rank of $\mathcal{H}$ equals $n$. A realization of the sequence $(S_k)$ of dimension equal to $\mathrm{rank}\,\mathcal{H}$ is therefore called a minimal realization; notice that the Silverman algorithm constructs minimal realizations. In this context the following holds true.

Lemma 3.5 A realization of the sequence $(S_k)$ is minimal if, and only if, it is reachable and observable.

Proof. Let $(C, A, B)$ be some realization of $S_n$, $n \in \mathbb{N}$. Since $\mathcal{H} = \mathcal{O}\mathcal{R}$,

$$ \mathrm{rank}\,\mathcal{H} \le \min\{\mathrm{rank}\,\mathcal{O},\, \mathrm{rank}\,\mathcal{R}\} \le \mathrm{size}(A) $$

Let $(\hat C, \hat A, \hat B)$ be a reachable and observable realization. Since $\mathcal{H} = \hat{\mathcal{O}}\hat{\mathcal{R}}$, and each of the matrices $\hat{\mathcal{O}}$, $\hat{\mathcal{R}}$ contains a nonsingular submatrix of size equal to the size of $\hat A$, we conclude that $\mathrm{size}(\hat A) \le \mathrm{rank}\,\mathcal{H}$. This concludes the proof.

We are now left with the proof of part (3) of the main theorem, namely that minimal realizations are equivalent. We provide the proof only for a special case; the proof of the general case follows along similar lines.

Outline of proof: single-input, single-output case ($p = m = 1$). Let $(C_i, A_i, B_i)$, $i = 1, 2$, be minimal realizations of $S$. We will show the existence of a transformation $T$, $\det T \neq 0$, such that (3.29) holds. From lemma 3.3 we conclude that

$$ \mathcal{H}_{n,n} = \mathcal{O}_n^1\mathcal{R}_n^1 = \mathcal{O}_n^2\mathcal{R}_n^2 \qquad (3.55) $$

where the superscript is used to distinguish between the two different realizations. Furthermore, the same lemma also implies $\mathcal{H}_{n,n+1} = \mathcal{O}_n^1\,[B_1 \;\; A_1\mathcal{R}_n^1] = \mathcal{O}_n^2\,[B_2 \;\; A_2\mathcal{R}_n^2]$, which in turn yields

$$ \mathcal{O}_n^1 A_1 \mathcal{R}_n^1 = \mathcal{O}_n^2 A_2 \mathcal{R}_n^2 \qquad (3.56) $$

Because of minimality, the following determinants are nonzero: $\det\mathcal{O}_n^i \neq 0$, $\det\mathcal{R}_n^i \neq 0$, $i = 1, 2$. We now define

$$ T := (\mathcal{O}_n^1)^{-1}\mathcal{O}_n^2 = \mathcal{R}_n^1(\mathcal{R}_n^2)^{-1} $$

Equation (3.55) implies $C_1 = C_2 T^{-1}$ and $B_1 = TB_2$, while (3.56) implies $A_1 = TA_2T^{-1}$.

3.3.2 Realization of proper rational matrix functions

Given is a $p \times m$ matrix $H(s)$ with proper rational entries, i.e. entries whose numerator degree is no larger than the denominator degree. Consider first the scalar case, $p = m = 1$. We can write

$$ H(s) = D + \frac{p(s)}{q(s)} $$

where $D$ is a constant in $K$ and $p$, $q$ are polynomials in $s$:

$$ p(s) = p_0 + p_1 s + \cdots + p_{\nu-1}s^{\nu-1}, \quad p_i \in K; \qquad q(s) = q_0 + q_1 s + \cdots + q_{\nu-1}s^{\nu-1} + s^\nu, \quad q_i \in K $$

In terms of these coefficients $p_i$ and $q_i$ we can write down a realization of $H(s)$ as follows:

$$ \Sigma_H := \begin{pmatrix} A & B \\ C & D \end{pmatrix} := \left(\begin{array}{ccccc|c} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & 1 & 0 \\ -q_0 & -q_1 & -q_2 & \cdots & -q_{\nu-1} & 1 \\ \hline p_0 & p_1 & p_2 & \cdots & p_{\nu-1} & D \end{array}\right) \in K^{(\nu+1)\times(\nu+1)} \qquad (3.57) $$

It can be shown that $\Sigma_H$ is indeed a realization of $H$, i.e. $H(s) = D + C(sI - A)^{-1}B$. This realization is reachable but not necessarily observable; this means that the rank of the associated Hankel matrix is at most $\nu$. The realization is, in addition, observable if the polynomials $p$ and $q$ are coprime. Thus (3.57) is minimal if and only if $p$, $q$ are coprime; in this case the rank of the associated Hankel matrix $\mathcal{H}$ is precisely $\nu$.
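As an illustration of (3.57), the sketch below builds the companion-form realization for a particular, purely illustrative pair $p$, $q$ and verifies $H(s) = D + C(sI - A)^{-1}B$ at a test point:

    % Realization (3.57) for H(s) = D + p(s)/q(s); here p(s) = 1 + 2s,
    % q(s) = 2 + 3s + s^2 (monic), D = 1 -- a hypothetical choice
    p = [1 2];  q = [2 3];  D = 1;            % ascending coefficients p0,p1 / q0,q1
    nu = numel(q);
    A = [zeros(nu-1,1) eye(nu-1); -q];        % superdiagonal ones, last row -q0 ... -q_{nu-1}
    B = [zeros(nu-1,1); 1];
    C = p;
    s0 = 1 + 2i;                              % arbitrary test point
    H_ss  = D + C*((s0*eye(nu) - A)\B);
    H_rat = D + polyval(fliplr(p), s0)/polyval([1 fliplr(q)], s0);
    disp(abs(H_ss - H_rat))                   % ~0

Here $p$ and $q$ are coprime, so by the discussion above this realization is minimal and the associated Hankel matrix has rank $\nu = 2$.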
In the general case we can write

$$ H(s) = D + \frac{1}{q(s)}P(s) $$

where $q$ is the scalar polynomial given by the least common multiple of the denominators of the entries of $H$, and $P$ is a polynomial matrix of size $p \times m$:

$$ P(s) = P_0 + P_1 s + \cdots + P_{\nu-1}s^{\nu-1}, \quad P_i \in K^{p\times m}; \qquad q(s) = q_0 + q_1 s + \cdots + q_{\nu-1}s^{\nu-1} + s^\nu, \quad q_i \in K $$

The construction given above provides a realization:

$$ \Sigma_H := \begin{pmatrix} A & B \\ C & D \end{pmatrix} := \left(\begin{array}{ccccc|c} 0_m & I_m & 0_m & \cdots & 0_m & 0_m \\ 0_m & 0_m & I_m & \cdots & 0_m & 0_m \\ \vdots & & & \ddots & & \vdots \\ 0_m & 0_m & \cdots & 0_m & I_m & 0_m \\ -q_0 I_m & -q_1 I_m & -q_2 I_m & \cdots & -q_{\nu-1}I_m & I_m \\ \hline P_0 & P_1 & P_2 & \cdots & P_{\nu-1} & D \end{array}\right) \in K^{(\nu m + p)\times(\nu m + m)} \qquad (3.58) $$

where $0_m$ is a square zero matrix of size $m$ and $I_m$ is the identity matrix of the same size. Unlike the scalar case, however, the realization $\Sigma_H$ need not be minimal. One way to obtain a minimal realization is to apply the Silverman algorithm; in this case $H$ has to be expanded into a formal power series,

$$ H(s) = S_0 + S_1 s^{-1} + S_2 s^{-2} + \cdots + S_t s^{-t} + \cdots $$

The Markov parameters are computed using the following relationship. Given the polynomial $q$ as above, let

$$ q^{(k)}(s) := s^{\nu-k} + q_{\nu-1}s^{\nu-k-1} + \cdots + q_{k+1}s + q_k, \qquad k = 1, \ldots, \nu \qquad (3.59) $$

denote its pseudo-derivative polynomials. It follows that the numerator polynomial matrix $P(s)$ is related to the Markov parameters $S_k$ and the denominator polynomial $q$ as follows:

$$ P(s) = S_1 q^{(1)}(s) + S_2 q^{(2)}(s) + \cdots + S_{\nu-1}q^{(\nu-1)}(s) + S_\nu q^{(\nu)}(s) \qquad (3.60) $$

This can be verified by direct calculation. Alternatively, assume that $H(s) = C(sI - A)^{-1}B$, and let $q(s)$ denote the characteristic polynomial of $A$. Then

$$ \mathrm{adj}(sI - A) = q^{(\nu)}(s)A^{\nu-1} + q^{(\nu-1)}(s)A^{\nu-2} + \cdots + q^{(2)}(s)A + q^{(1)}(s)I \qquad (3.61) $$

The result (3.60) follows by noting that $P(s) = C\,\mathrm{adj}(sI - A)\,B$.

Since $H$ is rational, the Hankel matrix associated with the sequence of Markov parameters $S_n$, $n \in \mathbb{N}$, is guaranteed to have finite rank. In particular the following upper bound holds:

$$ \mathrm{rank}\,\mathcal{H} \le \nu\,\min\{m, p\} $$

A concept often used is that of the McMillan degree of a rational matrix function. For proper rational matrix functions $H$ the McMillan degree turns out to be equal to the rank of the associated Hankel matrix $\mathcal{H}$; in other words, the McMillan degree is in this case equal to the dimension of any minimal realization of $H$.

Example 3.4 We will now investigate the realization problem for the Fibonacci sequence

$$ (S_k)_{k=1}^{\infty} = (1, 2, 3, 5, 8, 13, 21, 34, 55, \ldots) $$

which is constructed according to the rule $S_1 = 1$, $S_2 = 2$, and $S_{k+2} = S_{k+1} + S_k$, $k > 0$. The Hankel matrix (3.49) becomes

$$ \mathcal{H} = \begin{pmatrix} 1 & 2 & 3 & 5 & 8 & 13 & \cdots \\ 2 & 3 & 5 & 8 & 13 & 21 & \cdots \\ 3 & 5 & 8 & 13 & 21 & 34 & \cdots \\ 5 & 8 & 13 & 21 & 34 & 55 & \cdots \\ 8 & 13 & 21 & 34 & 55 & 89 & \cdots \\ 13 & 21 & 34 & 55 & 89 & 144 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} $$

It readily follows from the law of construction of the sequence that the rank of the Hankel matrix is two. $\Gamma$ will be chosen so that it contains rows 2, 4 and columns 2, 5 of $\mathcal{H}$:

$$ \Gamma = \begin{pmatrix} 3 & 13 \\ 8 & 34 \end{pmatrix} \;\Rightarrow\; \Gamma^{-1} = \begin{pmatrix} -17 & \tfrac{13}{2} \\ 4 & -\tfrac{3}{2} \end{pmatrix} $$

The remaining matrices are

$$ \bar\Gamma = \begin{pmatrix} 5 & 21 \\ 13 & 55 \end{pmatrix}, \qquad \check\Gamma = \begin{pmatrix} 2 \\ 5 \end{pmatrix}, \qquad \tilde\Gamma = (2 \;\; 8) $$

It follows that

$$ A = \Gamma^{-1}\bar\Gamma = \begin{pmatrix} -\tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{3}{2} \end{pmatrix}, \qquad B = \Gamma^{-1}\check\Gamma = \begin{pmatrix} -\tfrac{3}{2} \\ \tfrac{1}{2} \end{pmatrix}, \qquad C = \tilde\Gamma = (2 \;\; 8) $$

Furthermore,

$$ \sum_{k>0} S_k s^{-k} = \frac{s+1}{s^2 - s - 1} $$
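Example 3.4 can be retraced numerically with the Silverman algorithm of lemma 3.4 (the rows and columns are those chosen in the example):

    % Example 3.4: Silverman's algorithm on the Fibonacci sequence
    S  = [1 2 3 5 8 13 21 34 55 89];          % Markov parameters S_1, S_2, ...
    Hk = hankel(S(1:5), S(5:10));             % finite section, Hk(i,j) = S(i+j-1)
    r  = [2 4];  c = [2 5];                   % rows and columns defining Gamma
    Gam  = Hk(r, c);                          % [3 13; 8 34]
    GamS = Hk(r, c+1);                        % columns shifted by one (block) column
    GamB = Hk(r, 1);                          % first m = 1 column of H, same rows
    GamC = Hk(1, c);                          % first p = 1 row of H, same columns
    A = Gam\GamS;   B = Gam\GamB;   C = GamC; % A = Gamma^{-1}*GammaBar, etc.
    Sk = arrayfun(@(k) C*A^(k-1)*B, 1:8)      % reproduces 1 2 3 5 8 13 21 34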
3.3.3 The partial realization problem

Recall section 3.3 and in particular definition 3.5. The realization problem with partial data is defined as follows.

Definition 3.6 Given the finite sequence of $p \times m$ matrices $S_k$, $k = 1, \ldots, r$, the partial realization problem consists in finding a positive integer $n$ and constant matrices $(C, A, B)$ such that

$$ S_k = CA^{k-1}B, \qquad (C, A, B) \in \mathbb{R}^{p\times n}\times\mathbb{R}^{n\times n}\times\mathbb{R}^{n\times m}, \quad k = 1, 2, \ldots, r \qquad (3.62) $$

The triple $\begin{pmatrix} A & B \\ C & \end{pmatrix}$ is then called a partial realization of the sequence $S_k$.

Because of lemma 3.2, a finite sequence of matrices is always realizable. Consequently the problems which arise are: (a) minimality: given the sequence $S_k$, $k = 1, \ldots, r$, find the smallest positive integer $n$ for which the partial realization problem is solvable; (b) parametrization of solutions: parametrize all minimal, as well as the other, solutions; (c) recursive construction of solutions.

Similarly to the realization problem, the partial realization problem can be studied by means of the partially defined Hankel matrix

$$ \mathcal{H}_r := \begin{pmatrix} S_1 & S_2 & \cdots & S_r \\ S_2 & S_3 & \cdots & ? \\ \vdots & \vdots & & \vdots \\ S_r & ? & \cdots & ? \end{pmatrix} \in \mathbb{R}^{rp\times rm} \qquad (3.63) $$

where the entries marked ? denote unknown matrices defining the continuation of the sequence $S_k$. The rank of the partially defined Hankel matrix $\mathcal{H}_r$ is defined as the size of the largest submatrix of $\mathcal{H}_r$ which is nonsingular independently of the unknown entries. It then follows that the dimension of any partial realization satisfies

$$ \dim\Sigma \ge \mathrm{rank}\,\mathcal{H}_r =: n $$

Furthermore, there always exists a partial realization of dimension $n$, that is, a minimal partial realization. Once the rank of $\mathcal{H}_r$ is determined, Silverman's algorithm (see lemma 3.4) can be used to construct such a realization. For details on this problem see [20, 1, 2, 3]. We illustrate this procedure by means of a simple example.

Example 3.5 Consider the scalar ($m = p = 1$) sequence $(1, 1, 1, 2)$. Denoting the unknown continuation by $a$, $b$, $c$, the corresponding Hankel matrix, together with its first three submatrices, is

$$ \mathcal{H}_4 = \begin{pmatrix} 1 & 1 & 1 & 2 \\ 1 & 1 & 2 & a \\ 1 & 2 & a & b \\ 2 & a & b & c \end{pmatrix}, \quad \mathcal{H}_3 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & a \end{pmatrix}, \quad \mathcal{H}_2 = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \quad \mathcal{H}_1 = (1) $$

The determinants of these matrices are

$$ \det\mathcal{H}_4 = -a^3 + 5a^2 - 12a + 2ab - 4b - c + 16, \qquad \det\mathcal{H}_3 = -1, \qquad \det\mathcal{H}_2 = 0, \qquad \det\mathcal{H}_1 = 1 $$

It follows from the above definition of rank that $\mathrm{rank}\,\mathcal{H}_4 = 3$. Following lemma 3.4, we choose $\Gamma = \mathcal{H}_3$ and $\tilde\Gamma = (1 \; 1 \; 1)$, which implies

$$ A = \begin{pmatrix} 0 & 0 & a^2 - 4a + 8 - b \\ 1 & 0 & -a^2 + 3a - 4 + b \\ 0 & 1 & a - 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad C = (1 \;\; 1 \;\; 1) $$

Hence there is more than one minimal partial realization of $S$; as a matter of fact, the above expressions provide a parametrization of all minimal solutions, the parameters being $a, b \in \mathbb{R}$. Finally, we note that the value of $c$ is in this case uniquely determined by $a$ and $b$:

$$ c = -a^3 + 5a^2 - 12a + 2ab - 4b + 16 $$
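The parametrization of Example 3.5 can be spot-checked numerically; the choice $a = b = 0$ below is arbitrary:

    % Example 3.5: a minimal partial realization of (1, 1, 1, 2) with a = b = 0
    a = 0;  b = 0;
    A = [0 0  a^2 - 4*a + 8 - b;
         1 0 -a^2 + 3*a - 4 + b;
         0 1  a - 2];
    B = [1; 0; 0];   C = [1 1 1];
    Sk = arrayfun(@(k) C*A^(k-1)*B, 1:6)      % 1 1 1 2, then S_5 = a and S_6 = b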
4 Time and frequency domain interpretation of various norms

Recall section 3.1 and in particular the definition of the convolution operator $\mathcal{S}$, (3.4), (3.9). First a new operator will be defined, obtained by restricting the domain and the range of $\mathcal{S}$: the Hankel operator $\mathcal{H}$. The significance of this operator lies in the fact that, in contrast to $\mathcal{S}$, it has a discrete set of singular values. Next we will compute various norms as well as the spectra of $\mathcal{S}$ and $\mathcal{H}$. The calculations will be performed for the continuous-time case; the results for the discrete-time case follow similarly. For additional information on the material presented in the sections below, we refer to [12], [10], [31], and in addition to [13], [18], [22], [27], [30], [25].

4.1 The convolution operator and the Hankel operator

Given a linear, time-invariant, not necessarily causal system $\Sigma$, its convolution operator $\mathcal{S}$ induces an operator which is of interest in the theory of linear systems: the Hankel operator $\mathcal{H}$, which is defined as follows:

$$ \mathcal{H}: \ell^m(\mathbb{Z}_-) \to \ell^p(\mathbb{Z}_+), \quad u_- \mapsto y_+ := \mathcal{H}(u_-), \quad \text{where}\;\; y_+(t) := \sum_{k=-\infty}^{-1} S_{t-k}\,u_-(k), \;\; t \ge 0 \qquad (4.1) $$

Thus the Hankel operator $\mathcal{H}$ is obtained from the convolution operator $\mathcal{S}$ by restricting its domain and range. The matrix representation of $\mathcal{H}$ is given by the lower-left block of the matrix representation of $\mathcal{S}$; rearranging the entries of $u_-$, we obtain

$$ \underbrace{\begin{pmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \end{pmatrix}}_{y_+} = \underbrace{\begin{pmatrix} S_1 & S_2 & S_3 & \cdots \\ S_2 & S_3 & S_4 & \cdots \\ S_3 & S_4 & S_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}}_{\mathcal{H}} \underbrace{\begin{pmatrix} u(-1) \\ u(-2) \\ u(-3) \\ \vdots \end{pmatrix}}_{u_-} $$

The $\ell_2$-induced norm of $\mathcal{H}$ is by definition its largest singular value,

$$ \|\mathcal{H}\|_{\ell_2\text{-}\mathrm{ind}} = \sigma_{\max}(\mathcal{H}) =: \|\Sigma\|_H $$

The quantity $\|\Sigma\|_H$ is called the Hankel norm of the system $\Sigma$ described by the convolution operator $\mathcal{S}$. If in addition the system is stable, by combining the discrete-time versions of theorems 2.1 and 2.2, it follows that the $\ell_2$-induced norm of $h$ is equal to the $h_\infty$-Schatten norm of its transform $H$:

$$ \|h\|_{\ell_2\text{-}\mathrm{ind}} = \|H\|_{h_\infty} $$

For short, this relationship is written $\|h\|_2 = \|H\|_\infty$; we often refer to this quantity as the $h_\infty$-norm of $\Sigma$.

Given a linear, time-invariant, continuous-time, not necessarily causal system, similarly to the discrete-time case the convolution operator $\mathcal{S}$ induces a Hankel operator $\mathcal{H}$, which is defined as follows:

$$ \mathcal{H}: L^m(\mathbb{R}_-) \to L^p(\mathbb{R}_+), \quad u_- \mapsto y_+ := \mathcal{H}(u_-), \quad \text{where}\;\; y_+(t) := \int_{-\infty}^{0} h(t-\tau)\,u_-(\tau)\,d\tau, \;\; t \ge 0 \qquad (4.2) $$

The $L_2$-induced norm of $\mathcal{H}$ is

$$ \|\mathcal{H}\|_{L_2\text{-}\mathrm{ind}} = \sigma_{\max}(\mathcal{H}) =: \|\Sigma\|_H \qquad (4.3) $$

The quantity $\|\Sigma\|_H$ is called the Hankel norm of the system described by the convolution operator $\mathcal{S}$. As in the discrete-time case, if the system is stable, combining theorems 2.1 and 2.2 shows that the $L_2$-induced norm of the system defined by the kernel $h$ is equal to the $H_\infty$-Schatten norm of its transform $H$:

$$ \|h\|_{L_2\text{-}\mathrm{ind}} = \|H\|_{H_\infty} $$

For short, this relationship is written $\|h\|_2 = \|H\|_\infty$, and we refer to this quantity as the $H_\infty$-norm of $\Sigma$.

Remark 4.1 (Significance of $\mathcal{H}$) As we will see in the two sections that follow, the Hankel operator has a discrete set of singular values; this is not the case with the convolution operator. Therefore the singular values of $\mathcal{H}$ play a fundamental role in robust control and system approximation.

4.2 Computation of the singular values of $\mathcal{S}$

We will now compute the various norms assuming that the linear system is continuous-time and given in state space form, $\Sigma = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$; it follows from (3.26) that its impulse response is $h(t) = Ce^{At}B + D\,\delta(t)$, $t \ge 0$. Analogous results hold for discrete-time systems.

First, we will compute the adjoint $\mathcal{S}^*$ of the operator $\mathcal{S}$. By definition, given the inner product $\langle\cdot,\cdot\rangle$, $\mathcal{S}^*$ is the unique operator satisfying

$$ \langle y, \mathcal{S}u \rangle = \langle \mathcal{S}^*y, u \rangle \qquad (4.4) $$

for all $y$, $u$ in the appropriate spaces. By (3.10), $\mathcal{S}$ is an integral operator with kernel $h(\cdot)$; a relatively straightforward calculation using (4.4) shows that $\mathcal{S}^*$ is also an integral operator, with kernel $h^*$ and with time running backwards:

$$ \mathcal{S}^*: L^p(\mathbb{R}) \to L^m(\mathbb{R}), \quad y \mapsto u := \mathcal{S}^*(y), \quad \text{where}\;\; u(t) = \int_{-\infty}^{\infty} h^*(-t+\tau)\,y(\tau)\,d\tau, \;\; t \in \mathbb{R} \qquad (4.5) $$

Consider the case of square systems, $m = p$. Let $u_0(t) = v_0 e^{j\omega_0 t}$, $v_0 \in \mathbb{R}^m$, $t \in \mathbb{R}$, be a periodic input. It readily follows that

$$ \mathcal{S}(u_0)(t) = H(j\omega_0)\,v_0\,e^{j\omega_0 t} \qquad (4.6) $$

and

$$ \mathcal{S}^*(\mathcal{S}(u_0))(t) = H^*(-j\omega_0)H(j\omega_0)\,v_0\,e^{j\omega_0 t} \qquad (4.7) $$

Equation (4.6) shows that $u_0$ is an eigenfunction of $\mathcal{S}$ provided that $v_0$ is an eigenvector of $H(j\omega_0)$. Furthermore, (4.7) shows that $u_0$ is a right singular function of $\mathcal{S}$ provided that $v_0$ is a right singular vector of the hermitian matrix $H^*(-j\omega_0)H(j\omega_0)$.
We conclude that the spectrum of $\mathcal{S}$ is composed of the eigenvalues of $H(j\omega)$ for all $\omega \in \mathbb{R}$, while the spectrum of $\mathcal{S}^*\mathcal{S}$ consists of every point in the interval $(\lambda_-, \lambda_+)$, where

$$ \lambda_- := \inf_{\omega}\lambda_{\min}\big(H^*(-j\omega)H(j\omega)\big), \qquad \lambda_+ := \sup_{\omega}\lambda_{\max}\big(H^*(-j\omega)H(j\omega)\big) $$

Notice that the eigenfunctions and singular functions of $\mathcal{S}$ are signals of finite power (root-mean-square energy) but not of finite energy.

Example 4.1 We illustrate the preceding discussion by computing the singular values of the convolution operator $\mathcal{S}$ associated with the discrete-time system

$$ y(k+1) - a\,y(k) = b\,u(k) $$

The impulse response of this system is $S_0 = 0$, $S_k = ba^{k-1}$, $k > 0$. Hence from (3.8) follows

$$ \mathcal{S} = b \begin{pmatrix} \ddots & & & & & \\ \cdots & 1 & 0 & 0 & 0 & \cdots \\ \cdots & a & 1 & 0 & 0 & \cdots \\ \cdots & a^2 & a & 1 & 0 & \cdots \\ & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} $$

To compute the singular values of $\mathcal{S}$, we first consider the finite Toeplitz submatrices $\mathcal{S}_n$ of $\mathcal{S}$:

$$ \mathcal{S}_n = b \begin{pmatrix} 1 & & & \\ a & 1 & & \\ \vdots & \ddots & \ddots & \\ a^{n-1} & \cdots & a & 1 \end{pmatrix} \in \mathbb{R}^{n\times n} $$

It follows that the singular values of $\mathcal{S}_n$ satisfy

$$ \frac{|b|}{1+|a|} \le \sigma_i(\mathcal{S}_n) \le \frac{|b|}{1-|a|}, \qquad i = 1, 2, \ldots, n $$

This relationship is valid for all $n$; hence in the limit $n \to \infty$ the singular values of $\mathcal{S}$ fill the interval $\left[\frac{|b|}{1+|a|}, \frac{|b|}{1-|a|}\right]$. Let us now compare the singular values of $\mathcal{S}$ with those of $\mathcal{S}_{100}$, for $a = \frac{1}{2}$ and $b = 3$. The largest and smallest singular values of this matrix are $\sigma_1(\mathcal{S}_{100}) = 5.9944$ and $\sigma_{100}(\mathcal{S}_{100}) = 2.002$, respectively; the corresponding extreme singular values of $\mathcal{S}$ are 6 and 2.

4.3 Computation of the singular values of $\mathcal{H}$

We begin with a definition.

Definition 4.1 The Hankel singular values of the stable system $\Sigma$ are the singular values of $\mathcal{H}$, defined by (4.1), (4.2):

$$ \sigma_1(\Sigma) > \cdots > \sigma_q(\Sigma), \quad \text{with multiplicities}\; r_i, \; i = 1, \ldots, q, \quad \sum_{i=1}^{q} r_i = n \qquad (4.8) $$

The Hankel norm of $\Sigma$ is the largest Hankel singular value, $\|\Sigma\|_H := \sigma_1(\Sigma)$. The Hankel operator of a not necessarily stable system $\Sigma$ is defined as the Hankel operator of its stable and causal part $\Sigma_+$: $\mathcal{H}_\Sigma := \mathcal{H}_{\Sigma_+}$.

In order to compute the singular values of $\mathcal{H}$ we need its adjoint $\mathcal{H}^*$. Recall (4.2). For continuous-time systems the adjoint is defined as follows:

$$ \mathcal{H}^*: L^p(\mathbb{R}_+) \to L^m(\mathbb{R}_-), \quad y_+ \mapsto u_- := \mathcal{H}^*(y_+), \quad \text{where}\;\; u_-(t) := \int_0^\infty h^*(-t+\tau)\,y_+(\tau)\,d\tau, \;\; t \le 0 \qquad (4.9) $$

In the sequel we assume that the underlying system is finite-dimensional, i.e. the rank of the Hankel matrix derived from the corresponding Markov parameters $S_t$ of $h$ is finite. Consequently, by section 3.3, there exists a triple $(C, A, B)$ such that (3.26) holds: $h(t) = Ce^{At}B$, $t > 0$. It follows that for a given $u_-$,

$$ (\mathcal{H}u_-)(t) = \int_{-\infty}^0 h(t-\tau)\,u_-(\tau)\,d\tau = Ce^{At}\underbrace{\int_{-\infty}^0 e^{-A\tau}B\,u_-(\tau)\,d\tau}_{=:\,x_i} = Ce^{At}x_i, \qquad x_i \in \mathbb{R}^n $$

Moreover,

$$ (\mathcal{H}^*\mathcal{H}u_-)(t) = B^*e^{-A^*t}\left(\int_0^\infty e^{A^*\tau}C^*Ce^{A\tau}\,d\tau\right)x_i = B^*e^{-A^*t}\,\mathcal{Q}\,x_i $$

where the expression in parentheses is $\mathcal{Q}$, the infinite observability gramian defined by (3.40). The requirement that $u_-$ be an eigenfunction of $\mathcal{H}^*\mathcal{H}$ is that this last expression equal $\sigma_i^2\,u_-(t)$. This implies

$$ u_-(t) = \frac{1}{\sigma_i^2}\,B^*e^{-A^*t}\,\mathcal{Q}\,x_i, \qquad t \le 0 $$

Substituting $u_-$ into the expression for $x_i$ we obtain

$$ \frac{1}{\sigma_i^2}\left(\int_{-\infty}^0 e^{-A\tau}BB^*e^{-A^*\tau}\,d\tau\right)\mathcal{Q}\,x_i = x_i $$

Recalling the definition (3.39) of the infinite reachability gramian, this equation becomes

$$ \mathcal{P}\mathcal{Q}\,x_i = \sigma_i^2\,x_i \qquad (4.10) $$

We conclude that the (non-zero) singular values of the Hankel operator $\mathcal{H}$ are the square roots of the eigenvalues of the product of the infinite gramians $\mathcal{P}$ and $\mathcal{Q}$ of the system. Therefore $\mathcal{H}$, in contrast to $\mathcal{S}$, has a discrete set of singular values.
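Numerically, (4.10) reduces the computation of the Hankel singular values to two Lyapunov equations and one eigenvalue problem. A sketch (the $A$, $B$ are those of Example 3.2; $C$ is a hypothetical output map added for illustration):

    % Hankel singular values from the infinite gramians (continuous time)
    A = [0 1; -2 -3];  B = [0; 1];  C = [1 0];
    P = lyap(A,  B*B');                       % reachability gramian
    Q = lyap(A', C'*C);                       % observability gramian
    sigma = sort(sqrt(eig(P*Q)), 'descend')   % Hankel singular values, cf. (4.10)
    % with the Control System Toolbox, hsvd(ss(A,B,C,0)) returns the same values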
It can be shown that (4.10) holds for discrete-time systems as well, where $\mathcal{P}$, $\mathcal{Q}$ are the infinite gramians obtained by solving the discrete Lyapunov (Stein) equations (3.45). In summary we have:

Lemma 4.1 Given the reachable, observable and stable discrete- or continuous-time system $\Sigma$ of dimension $n$, the positive square roots of the eigenvalues of $\mathcal{P}\mathcal{Q}$ are the Hankel singular values of $\Sigma$:

$$ \sigma_k(\Sigma) = \sqrt{\lambda_k(\mathcal{P}\mathcal{Q})}, \qquad k = 1, \ldots, n \qquad (4.11) $$

Furthermore, $\sigma_1$ is the Hankel norm of $\Sigma$.

Recall the definition of equivalent systems (3.29). Under equivalence, the gramians are transformed as follows:

$$ \tilde{\mathcal{P}} = T\mathcal{P}T^*, \quad \tilde{\mathcal{Q}} = T^{-*}\mathcal{Q}T^{-1} \;\;\Rightarrow\;\; \tilde{\mathcal{P}}\tilde{\mathcal{Q}} = T(\mathcal{P}\mathcal{Q})T^{-1} $$

Therefore the products of the two gramians of equivalent systems are related by similarity transformation, and hence have the same eigenvalues.

Corollary 4.1 The Hankel singular values are input-output invariants of $\Sigma$.

Remark 4.2 (a) For discrete-time systems, $\sigma_k(\Sigma) = \sigma_k(\mathcal{H})$, i.e. the singular values of the system are the singular values of the (block) Hankel matrix defined by (3.49). For continuous-time systems, however, $\sigma_k(\Sigma)$ are the singular values of a continuous-time Hankel operator; they are not equal to the singular values of the associated matrix of Markov parameters. (b) It should be noticed that, following proposition 3.5, the Hankel singular values of a continuous-time stable system and those of the discrete-time stable system related to it by means of the bilinear transformation $s = \frac{z-1}{z+1}$ are the same.

4.4 Computation of various norms

4.4.1 The $H_2$ norm

This norm is defined as the $L_2$ norm of the impulse response (in the time domain):

$$ \|\Sigma\|_{H_2} := \|h(t)\|_{L_2(\mathbb{R}_+)} = \|H(s)\|_{H_2(\mathbb{C}_+)} \qquad (4.12) $$

Therefore it exists only if $D = 0$ and $\Sigma$ is stable, i.e. the eigenvalues of $A$ are in the left half of the complex plane; in this case there holds (2.16):

$$ \|\Sigma\|_{H_2}^2 = \int_0^\infty \mathrm{trace}\,[h^*(t)h(t)]\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm{trace}\,[H^*(-j\omega)H(j\omega)]\,d\omega $$

where the second equality is a consequence of Parseval's theorem. Thus, using (3.40), we obtain

$$ \|\Sigma\|_{H_2}^2 = \int_0^\infty \mathrm{trace}\,[B^*e^{A^*t}C^*Ce^{At}B]\,dt = \mathrm{trace}\,[B^*\mathcal{Q}B] $$

Furthermore, since $\mathrm{trace}\,[h^*(t)h(t)] = \mathrm{trace}\,[h(t)h^*(t)]$, using (3.39) this last expression is also equal to $\mathrm{trace}\,[C\mathcal{P}C^*]$; therefore

$$ \|\Sigma\|_{H_2} = \sqrt{\mathrm{trace}\,[B^*\mathcal{Q}B]} = \sqrt{\mathrm{trace}\,[C\mathcal{P}C^*]} \qquad (4.13) $$

An interesting question is whether the $H_2$ norm is induced. In [11] it is shown that the $(2,\infty)$-induced norm of the convolution operator is

$$ \|\mathcal{S}\|_{2,\infty} = \sup_{u\neq 0}\frac{\|y\|_\infty}{\|u\|_2} = \sqrt{\lambda_{\max}(C\mathcal{P}C^*)} $$

Consequently, in the single-input single-output ($m = p = 1$) case, the $H_2$ norm is an induced norm. This norm can be interpreted as the maximum amplitude of the output which results from finite energy input signals.

4.4.2 The $H_\infty$ norm

According to (2.17), if $\Sigma$ is stable, i.e. the eigenvalues of $A$ have negative real parts,

$$ \|\Sigma\|_{H_\infty} = \sup_{\omega}\sigma_{\max}(H(j\omega)) $$

Consider the rational matrix function

$$ K(j\omega) = \gamma^2 I_m - H^*(-j\omega)H(j\omega) $$

If $\gamma$ is bigger than the $H_\infty$ norm of $\Sigma$, there is no real $\omega$ such that $K(j\omega)$ is singular. Thus, if we define $K(s) = \gamma^2 I_m - H^*(-s)H(s)$, where $H^*(-s) = B^*(-sI - A^*)^{-1}C^* + D^*$ is the adjoint system, the $H_\infty$ norm of $\Sigma$ is less than $\gamma$ if, and only if, $K^{-1}(s)$ has no pure imaginary poles. As shown in [10], the $A$-matrix of $K^{-1}$ is

$$ A_K(\gamma) = \begin{pmatrix} A & \frac{1}{\gamma}BB^* \\ -\frac{1}{\gamma}C^*C & -A^* \end{pmatrix} $$

Hence we have the result:

Proposition 4.1 $\|\Sigma\|_{H_\infty} < \gamma$ if, and only if, the matrix $A_K(\gamma)$ has no pure imaginary eigenvalues.

For $\gamma$ sufficiently large, $A_K(\gamma)$ has no pure imaginary eigenvalues, while for $\gamma$ sufficiently small it does. The algorithm used to find an approximation of the $H_\infty$ norm consists in bisecting an interval $[\gamma_-, \gamma_+]$ known to contain the norm: let $\tilde\gamma = \frac{\gamma_- + \gamma_+}{2}$. If $A_K(\tilde\gamma)$ has imaginary eigenvalues, the interval is replaced by $[\tilde\gamma, \gamma_+]$, otherwise by $[\gamma_-, \tilde\gamma]$; either way, the new interval has half the length of the previous one. The procedure continues until the difference $\gamma_+ - \gamma_-$ is sufficiently small. The condition that $A_K(\gamma)$ have no pure imaginary eigenvalues is equivalent to the condition that the Riccati equation

$$ A^*X + XA + \frac{1}{\gamma}XBB^*X + \frac{1}{\gamma}C^*C = 0 $$

have a positive definite solution $X$.
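A minimal sketch of this bisection (the initial bracket and the tolerances are ad hoc; the data are those used above, for which $H(s) = 1/(s^2 + 3s + 2)$ and the $H_\infty$ norm is $1/2$, attained at $\omega = 0$):

    % Bisection for the Hinf norm via the eigenvalues of A_K(gamma); assumes D = 0
    A = [0 1; -2 -3];  B = [0; 1];  C = [1 0];
    lo = 0;  hi = 10;                          % bracket assumed to contain the norm
    while hi - lo > 1e-8
        g  = (lo + hi)/2;
        AK = [A, (1/g)*(B*B'); -(1/g)*(C'*C), -A'];
        if any(abs(real(eig(AK))) < 1e-9)      % pure imaginary eigenvalue present
            lo = g;                            % gamma is at or below the norm
        else
            hi = g;                            % gamma is above the norm
        end
    end
    gamma = hi                                 % ~0.5; compare norm(ss(A,B,C,0), Inf)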
4.4.3 The Hilbert-Schmidt norm

An operator $\mathcal{T}: X \to Y$, where $X$, $Y$ are Hilbert spaces, is Hilbert-Schmidt if there exists a complete orthonormal sequence $\{x_n\} \subset X$ such that

$$ \sum_{n>0}\|\mathcal{T}(x_n)\|^2 < \infty $$

This property can be readily checked for integral operators. The integral operator

$$ (\mathcal{T}w)(x) = \int_a^b k(x, y)\,w(y)\,dy, \qquad x \in [c, d] $$

is a Hilbert-Schmidt operator if its kernel $k$ is square integrable in both variables, i.e. if

$$ \mu^2 = \mathrm{trace}\left[\int_a^b\!\!\int_c^d k^*(x, y)\,k(x, y)\,dx\,dy\right] $$

is finite; in this case $\mu$ is the Hilbert-Schmidt norm of $\mathcal{T}$. According to [30], such operators are compact and hence have a discrete spectrum, in which each eigenvalue has finite multiplicity and the only possible accumulation point is zero. It readily follows that the convolution operator $\mathcal{S}$ associated with $\Sigma$ is not Hilbert-Schmidt, while the Hankel operator $\mathcal{H}$ is. In particular,

$$ \mu^2 = \mathrm{trace}\int_0^\infty\!\!\int_0^\infty h^*(t+\tau)\,h(t+\tau)\,d\tau\,dt $$

Assuming that the system is stable, this expression equals

$$ \mu^2 = \mathrm{trace}\int_0^\infty\!\!\int_0^\infty B^*e^{A^*(t+\tau)}C^*Ce^{A(t+\tau)}B\,d\tau\,dt = \mathrm{trace}\int_0^\infty B^*e^{A^*t}\,\mathcal{Q}\,e^{At}B\,dt = \mathrm{trace}\left[\left(\int_0^\infty e^{At}BB^*e^{A^*t}\,dt\right)\mathcal{Q}\right] = \mathrm{trace}\,[\mathcal{P}\mathcal{Q}] = \sigma_1^2 + \cdots + \sigma_n^2 $$

where the $\sigma_i$ are the Hankel singular values of the system $\Sigma$.

Example 4.2 For the discrete-time system $y(k+1) - a\,y(k) = b\,u(k)$, discussed earlier, the Hankel operator is

$$ \mathcal{H} = b \begin{pmatrix} 1 & a & a^2 & \cdots \\ a & a^2 & a^3 & \cdots \\ a^2 & a^3 & a^4 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} $$

This operator has a single non-zero singular value, which turns out to be

$$ \sigma_1(\mathcal{H}) = \frac{|b|}{1 - a^2} $$

In this case, since $\mathcal{H}$ is symmetric and of rank one, the Hilbert-Schmidt norm equals $\sigma_1(\mathcal{H})$; it is also equal, up to sign, to the trace of $\mathcal{H}$, namely $\frac{b}{1-a^2}$.

Hankel singular values and the Nyquist diagram. Consider the singular values $\sigma_i(\mathcal{H})$. In the SISO case the following result, due to [17], holds; it provides a frequency domain interpretation of the Frobenius or Hilbert-Schmidt norm of the Hankel operator:

$$ \frac{1}{\pi}\,(\text{area of the Nyquist diagram, multiplicities included}) = \sigma_1^2 + \cdots + \sigma_n^2 = \|\mathcal{H}\|_{HS}^2 $$

Example 4.3 We consider a 16th order continuous-time Butterworth filter. In the left-hand plot of figure 2 its Nyquist diagram is shown; in the middle are the 16 Hankel singular values; on the right-hand side is the impulse response of the system. The Nyquist diagram winds three times around the origin in almost perfect circles of radius 1, and then winds once more. The area of this diagram, multiplicities included, is exactly $4\pi$; correspondingly, it can be verified that the sum of the squares of the 16 Hankel singular values is exactly 4. Furthermore, we compute the following norms:

$$ \|\Sigma\|_{2,\infty} = 0.5464, \quad \|\Sigma\|_H = 0.9996, \quad \|\Sigma\|_{H_\infty} = 1, \quad \|\Sigma\|_{\infty,\infty} = 2.1314, \quad 2\sum_i \sigma_i = 9.5197 $$

Following the relationships in table 5 below, we have $0.9996 < 1$: the largest Hankel singular value is very close to 1, but still smaller.
Also, $0.5464 < 1 < 2.1314 < 9.5197$: the $(2,\infty)$-induced norm, which is equal to the $H_2$ norm, is less than the $(2,2)$-induced norm, which is the same as the $H_\infty$ norm; in turn, this is smaller than the $(\infty,\infty)$-induced norm, and all of these numbers are upper-bounded by twice the sum of the Hankel singular values.

[Figure 2: 16th order Butterworth filter: Nyquist diagram, Hankel singular values, impulse response. The sixteen values shown in the middle panel are 9.9963e-01, 9.9163e-01, 9.2471e-01, 6.8230e-01, 3.1336e-01, 7.7116e-02, 1.0359e-02, 8.4789e-04, 4.5242e-05, 1.5903e-06, 3.6103e-08, 5.0738e-10, 4.1111e-12, 1.6892e-14, 7.5197e-17, 3.2466e-17.]

4.4.4 Summary of norms

There are three quantities associated with a linear system $\Sigma$ which can be used to define different norms: the impulse response $h$, the convolution operator $\mathcal{S}$, and the Hankel operator $\mathcal{H}$. One can define (i) the $L_1$ norm of $h$ and (ii) the $L_2$ norm of the same quantity $h$; by means of the Hardy space $H_2$, the latter is equal to the $H_2$ norm of its transform $H$ (which is the transfer function of $\Sigma$). One can also define induced norms of $\mathcal{S}$ and $\mathcal{H}$. Recall the definition (2.21). First we can define the $(2,2)$ norms, obtained for $p = q = 2$; according to lemma 2.1 these equal the largest singular value of $\mathcal{S}$ and of $\mathcal{H}$, respectively. Again because of the equivalence with the frequency domain, the former is the $H_\infty$ norm of the transfer function $H$, while the latter is the Hankel norm of $\Sigma$. Other induced norms of $\mathcal{S}$ are the $(2,\infty)$ norm ($p = 2$, $q = \infty$) and the $(\infty,\infty)$ or peak-to-peak norm ($p = q = \infty$). It turns out that the former is closely related to the $L_2$ norm of $h$, while the latter is the $L_1$ norm of $h$.

The interpretation of the induced norms of the convolution operator is as follows. The $(2,\infty)$ norm gives the largest magnitude of the output $y$ (i.e. $\|y\|_\infty$) which is achievable with unit energy inputs $u$ (i.e. $\|u\|_2 = 1$). The $(2,2)$-induced norm is the largest energy of the output $y$ (i.e. $\|y\|_2$) given inputs $u$ of unit energy. Finally, the $(\infty,\infty)$ norm, also known as the peak-to-peak norm, is the largest magnitude of $y$ achieved with inputs $u$ of unit largest magnitude (i.e. $\|u\|_\infty = 1$). As for the Hankel norm, that is, the $(2,2)$-induced norm of the Hankel operator $\mathcal{H}$, it is the largest energy of future outputs $y_+$ caused by unit energy past inputs $u_-$. An interpretation of the singular values of this operator in the frequency domain is given by the fact that the squared Frobenius or Hilbert-Schmidt norm of $\mathcal{H}$ is equal to $\frac{1}{\pi}$ times the area of the Nyquist diagram. Table 5 summarizes these facts and provides some comparisons between these norms.

Table 5: Norms of linear systems and their relationships

| Time domain | Frequency domain | Expression |
| $L_1$ norm of $h$: $\int_0^\infty \|h(t)\|_1\,dt$ | | $\|\mathcal{S}\|_{\infty,\infty}$ (peak-to-peak norm) |
| $L_2$ norm of $h$: $\big(\int_0^\infty \mathrm{trace}[h^*(t)h(t)]\,dt\big)^{1/2}$ | $H_2$ norm of $H$: $\big(\frac{1}{2\pi}\int \mathrm{trace}[H^*(-j\omega)H(j\omega)]\,d\omega\big)^{1/2}$ | $\|\Sigma\|^2_{H_2} = \mathrm{trace}[C\mathcal{P}C^*] = \mathrm{trace}[B^*\mathcal{Q}B]$ |
| $(2,2)$-induced norm of $\mathcal{S}$: $\sup_{u\neq 0}\|\mathcal{S}u\|_2/\|u\|_2$ | $H_\infty$ norm of $H$: $\sup_\omega \sigma_{\max}(H(j\omega))$ | $\|\Sigma\|_{H_\infty}$ |
| $(2,2)$-induced norm of $\mathcal{H}$ (Hankel norm): $\sup_{u_-\neq 0}\|\mathcal{H}u_-\|_2/\|u_-\|_2$ | | $\|\Sigma\|^2_H = \lambda_{\max}(\mathcal{P}\mathcal{Q})$ |
| Hankel singular values $\sigma_i(\mathcal{H})$ | | $\sigma_i^2(\Sigma) = \lambda_i(\mathcal{P}\mathcal{Q})$ |
| Hilbert-Schmidt norm of $\mathcal{H}$: $\sum_i \sigma_i^2(\mathcal{H})$ | $\frac{1}{\pi}$ (area of Nyquist diagram) | $\mathrm{trace}[\mathcal{P}\mathcal{Q}]$ |

Relationships among norms:

$$ \|\mathcal{S}\|_{2,\infty} = \sqrt{\lambda_{\max}(C\mathcal{P}C^*)} \;\le\; \|h\|_{L_2} = \sqrt{\mathrm{trace}(C\mathcal{P}C^*)}, \quad \text{with equality for } m = p = 1 $$

$$ \|\mathcal{H}\|_{2,2} = \sigma_1 = \sqrt{\lambda_{\max}(\mathcal{P}\mathcal{Q})} \;\le\; \|\mathcal{S}\|_{2,2} = \sup_\omega \sigma_{\max}(H(j\omega)) \;\le\; \|\mathcal{S}\|_{\infty,\infty} = \|h\|_{L_1} \;\le\; 2\,(\sigma_1 + \cdots + \sigma_n) $$
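With the Control System Toolbox, most of the quantities appearing in Table 5 are available directly; a sketch, for an arbitrary system:

    % Norms of Table 5 for a concrete system
    sys  = ss([0 1; -2 -3], [0; 1], [1 0], 0);
    h2   = norm(sys, 2);        % H2 norm = sqrt(trace(C*P*C')) = sqrt(trace(B'*Q*B))
    hinf = norm(sys, Inf);      % Hinf norm = sup_w sigma_max(H(jw))
    sig  = hsvd(sys);           % Hankel singular values, sigma_i = sqrt(lambda_i(P*Q))
    hank = max(sig);            % Hankel norm
    hs   = norm(sig);           % Hilbert-Schmidt norm of the Hankel operator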
References

[1] Antoulas A.C. (1986), On recursiveness and related topics in linear systems, IEEE Transactions on Automatic Control, AC-31: 1121-1135.

[2] Antoulas A.C. (1993), Recursive modeling of discrete-time series, in P. Van Dooren and B.F. Wyman (editors), IMA volume on Linear Algebra for Control, pp. 1-22.

[3] Antoulas A.C. and Willems J.C. (1993), A behavioral approach to linear exact modeling, IEEE Transactions on Automatic Control, AC-38: 1776-1802.

[4] Antoulas A.C. (1998), Approximation of linear operators in the 2-norm, Linear Algebra and its Applications, Special Issue on Challenges in Matrix Theory.

[5] Antoulas A.C., Sontag E.D. and Yamamoto Y. (1999), Controllability and observability, in the Wiley Encyclopedia of Electrical and Electronics Engineering, edited by J.G. Webster, volume 4: 264-281.

[6] Antoulas A.C. (1999), Approximation of linear dynamical systems, in the Wiley Encyclopedia of Electrical and Electronics Engineering, edited by J.G. Webster, volume 11: 403-422.

[7] Antoulas A.C. (2001), Lectures on the approximation of large-scale dynamical systems, to appear, SIAM Press.

[8] Antoulas A.C. and Sorensen D.C. (2001), Lyapunov, Lanczos and inertia, Linear Algebra and its Applications.

[9] Brogan W.L. (1991), Modern control theory, Prentice Hall.

[10] Boyd S.P. and Barratt C.H. (1991), Linear controller design: Limits of performance, Prentice Hall.

[11] Chellaboina V.S., Haddad W.M., Bernstein D.S., and Wilson D.A. (2000), Induced convolution norms of linear dynamical systems, Mathematics of Control, Signals and Systems, 13: 216-239.

[12] Francis B.A. (1987), A course in H-infinity control theory, Springer Lecture Notes in Control and Information Sciences, 88.

[13] Fuhrmann P.A. (1981), Linear spaces and operators in Hilbert space, McGraw-Hill.

[14] Glover K. (1984), All optimal Hankel-norm approximations of linear multivariable systems and their L-infinity error bounds, International Journal of Control, 39: 1115-1193.

[15] Green M. and Limebeer D.J.N. (1995), Linear robust control, Prentice Hall.

[16] Golub G.H. and Van Loan C.F. (1989), Matrix computations, The Johns Hopkins University Press.

[17] Hanzon B. (1992), The area enclosed by the oriented Nyquist diagram and the Hilbert-Schmidt-Hankel norm of a linear system, IEEE Transactions on Automatic Control, AC-37: 835-839.

[18] Hoffman K. (1962), Banach spaces of analytic functions, Prentice Hall.

[19] Horn R.A. and Johnson C.R. (1985), Matrix analysis, Cambridge University Press.

[20] Kalman R.E. (1979), On partial realizations, transfer functions, and canonical forms, Acta Polytechnica Scandinavica, Ma. 31: 9-32.

[21] Lancaster P. and Tismenetsky M. (1985), The theory of matrices, Academic Press.

[22] Luenberger D.G. (1969), Optimization by vector space methods, John Wiley and Sons.

[23] Moore B.C. (1981), Principal component analysis in linear systems: controllability, observability and model reduction, IEEE Transactions on Automatic Control, AC-26: 17-32.

[24] Obinata G. and Anderson B.D.O. (2000), Model reduction for control system design, Springer Verlag.

[25] Partington J.R. (1988), An introduction to Hankel operators, London Mathematical Society Student Texts, 13.
[26] Polderman J.W. and Willems J.C. (1998), Introduction to mathematical systems and control: A behavioral approach, Texts in Applied Mathematics, 26, Springer Verlag.

[27] Rosenblum M. and Rovnyak J. (1985), Hardy classes and operator theory, Oxford University Press.

[28] Rugh W.J. (1996), Linear system theory, second edition, Prentice Hall.

[29] Sontag E.D. (1990), Mathematical control theory, Springer Verlag.

[30] Young N. (1988), An introduction to Hilbert space, Cambridge University Press.

[31] Zhou K., Doyle J.C., and Glover K. (1996), Robust and optimal control, Prentice Hall.

5 Appendix: Glossary

Notation: meaning (first appearing)

$\mathbb{Z}$, $\mathbb{R}$, $\mathbb{C}$: integers, real or complex numbers (section 2.1.1)
$K$: either $\mathbb{R}$ or $\mathbb{C}$ (section 2.1.1)
$(\cdot)^*$: transposition and complex conjugation of a matrix (section 2.1.1)
$\|\cdot\|_p$: Hölder $p$-norms (section 2.1.1)
$\|\cdot\|_{p,q}$: matrix or operator induced norm (section 2.1.1)
$\|\cdot\|_F$: Frobenius norm of a matrix or operator (section 2.1.1)
$\sigma_i$: singular values (section 2.1.2)
$\ell_p^n$: infinite sequences of vectors in $K^n$ with finite $p$-norm (section 2.1.3)
$L_p^n$: functions with values in $K^n$ with finite $p$-norm (section 2.1.3)
$\mathbb{D}$: (open) unit disc (section 2.1.4)
$h_p$: vectors/matrices, analytic in $\mathbb{D}$, finite $p$-norm (section 2.1.4)
$H_p$: vectors/matrices, analytic in $\mathbb{C}_+$, finite $p$-norm (section 2.1.4)
$\ell_\infty$: vectors/matrices with no poles on the unit circle, finite $\infty$-norm (section 2.1.4)
$L_\infty$: vectors/matrices with no poles on the imaginary axis, finite $\infty$-norm (section 2.1.4)
$\mathcal{L}$: Laplace transform (section 2.2.1)
$\mathcal{Z}$: discrete-Laplace or $Z$-transform (section 2.2.2)
$\mathbb{1}$: unit step function, discrete or continuous (section 2.2.2)
$u$, $x$, $y$: input, state, output (section 3)
$h$: impulse response (section 3)
$H$: transfer function (section 3)
$\Sigma$: linear, time-invariant system (section 3)
$h * u$: convolution, discrete or continuous (section 3)
$\mathbb{R}[s]$: polynomials in $s$ with real coefficients (section 3)
$\mathbb{R}^{p\times q}[s]$: $p \times q$ matrices with entries in $\mathbb{R}[s]$ (section 3)
$S_k$: Markov parameters (section 3.1)
$\delta(t)$: Kronecker delta if $t \in \mathbb{Z}$; delta distribution if $t \in \mathbb{R}$ (section 3.2.1)
$\phi(u, x_0, t)$: state at time $t$, initial condition $x_0$, with input $u$ (section 3.2.1)
$\mathcal{R}_n$, $\mathcal{O}_n$: reachability, observability matrices (section 3.2.3)
$\mathcal{P}$, $\mathcal{Q}$: reachability, observability gramians (section 3.2.3)
$\mathcal{S}$: convolution operator (section 4.1)
$\mathcal{H}$: Hankel operator (section 4.1)
$\mathcal{S}^*$: adjoint of the convolution operator (section 4.2)
$\mathcal{H}^*$: adjoint of the Hankel operator (section 4.3)
$\|\Sigma\|_{2,\infty} = \|\mathcal{S}\|_{2,\infty}$: $(2,\infty)$-induced norm of $\Sigma$ (section 4.4)
$\|\Sigma\|_{2,2} = \|\mathcal{S}\|_{2,2}$: $(2,2)$-induced norm of $\Sigma$ (section 4.4)
$\|\Sigma\|_{\infty,\infty} = \|\mathcal{S}\|_{\infty,\infty}$: $(\infty,\infty)$-induced norm of $\Sigma$ (section 4.4)
$\|\Sigma\|_H$: Hankel norm of $\Sigma$ (section 4.4)
$\|\Sigma\|_{H_2} = \|H\|_{H_2}$: $H_2$-norm of $\Sigma$ (section 4.4)
$\|\Sigma\|_{L_2} = \|h\|_{L_2}$: $L_2$-norm of $\Sigma$ (section 4.4)