Bibliographical Citation: IEEE Proc

Bibliographical Citation: IEEE Proc.-Control Theory Appl., Vol. 141, No. 4, July 1994, pp. 249-254

Online System Identification Using Laguerre Series

P. D. Olivier

Abstract: An online system identification scheme is proposed based on a Fourier-Laguerre series representation of the unknown impulse response. The unknown parameters are determined using a gradient estimator. Noise effects are considered. The proposed identification scheme is applied to a system with time delay.

1. Introduction

The problem addressed is the online identification of parameters in Laguerre models of high-order, perhaps infinite-order, systems. Several recent papers have used Laguerre series to approximate complex systems (see Olivier [6 and 7]; Zervos et al. [13 and 14]; Gu et al. [2]; Wahlberg [12]; Makila [4 and 5]; Glover et al. [1]; Partington [9]). With the exception of Reference 14, these contributions have been offline in nature and primarily in the frequency domain. Series of Laguerre functions have been found useful in the model order reduction setting in References 1, 6, and 9, with special attention to convergence rates to be found in Reference 1. These studies determined the expansion coefficients by calculating the inner product of the target function with the Laguerre functions or by applying the residue theorem to the frequency domain inner product. The algebraic expressions that result are useful for theoretical investigation, but do not lend themselves to the numerical implementations necessary in adaptive schemes. References 6, 13 and 4 suggest orthonormal series identification ‘as a general framework for identification of infinite dimensional systems.’ Dumont and Zervos in Reference 14 present a least-squares approach to adaptive modeling, and the results due to Wahlberg [12] could also lead to a least-squares adaptive identification scheme.
This paper describes a time-domain online identification procedure based on the gradient estimation technique that can be used as part of a self-tuning controller. Motivating the further study of Laguerre models are observations made during the course of this prior research. For example, Makila [4] observes that approximations based on Laguerre models ‘typically provide near best solutions to related L approximation problems’; Partington [9] points out that approximation techniques based on Fourier-Laguerre series ‘…have advantages through being easier to calculate than more rapidly converging approximations…’; and Zervos et al. [13] suggest that the closed-loop plant should be ‘modeled by a Laguerre series expansion, rather than by a fixed-structure transfer function, because of the requirements of robustness with minimal prior information.’

Central to the parameter identification process is the adoption of a general model that is capable of describing the dynamical behaviour of a wide class of systems and which linearly relates model parameters, measured inputs to the system, and measured outputs from the system. Following Zervos et al. in Reference 13, a model is chosen based on the Laguerre series expansion of the impulse response (or of the transfer function). Since the Laplace transforms of the Laguerre functions are rational functions of the Laplace variable s, it is possible to use a truncated version of the convergent expansion to construct a finite dimensional approximation to the potentially infinite dimensional unknown system. Once a general model is chosen, the next step is to choose a parameter identification scheme. In this paper, a gradient estimator is utilized.

Even though much of the background material and terminology is highly mathematical, relying on functional analytic terminology and concepts, the less mathematically inclined reader should not fear.
The basic techniques presented herein are developed using only an elementary knowledge of differential equations and Laplace transform theory. The mathematical background is presented for completeness and for the convenience of the more mathematically oriented reader.

2. Laguerre models

The Laguerre functions are a set of orthonormal functions that span the function space $L_2(0,\infty)$, i.e. the space of square (Lebesgue) integrable functions on the time interval $(0,\infty)$. The classical Laguerre polynomials are

$$\ell_k(t) = \frac{e^{t}}{k!}\,\frac{d^k}{dt^k}\left(t^k e^{-t}\right) \tag{1}$$

and these can be used to represent concisely the orthonormal Laguerre functions

$$L_k(t) = \sqrt{2p}\,e^{-pt}\,\ell_k(2pt) \tag{2}$$

where p must be positive. The inner product with respect to which these functions are orthonormal is the standard time domain $L_2$ inner product, i.e.

$$(f,g)_t = \int_0^\infty f(t)\,g(t)\,dt \tag{3}$$

The Laplace transforms of the $L_k(t)$ are rational functions of the Laplace variable s; for $k \ge 0$

$$L_k(s) = \sqrt{2p}\,\frac{(s-p)^k}{(s+p)^{k+1}} = \frac{\sqrt{2p}}{s+p}\left(\frac{s-p}{s+p}\right)^k \tag{4}$$

Parseval's theorem relates the time domain inner product to the standard frequency domain inner product:

$$(f,g)_t = (F,G)_s = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(j\omega)\,\overline{G(j\omega)}\,d\omega \tag{5}$$

where F and G are the Laplace transforms of f and g respectively, and the subscripts t and s indicate a time domain or a frequency domain inner product; hence, the Laguerre functions are orthonormal in both the time and frequency domains.

There is a rich body of knowledge concerning Laguerre functions in the classical literature. The reader is referred to Lebedev [3] and Szego [11] for treatments. In particular, observe the following classical result, which shows that the Laguerre functions are an orthonormal basis of $L_2(0,\infty)$.

Property 1 (Szego [11]): The Laguerre functions form an orthonormal basis of $L_2(0,\infty)$. Furthermore, the Laguerre functions are dense in $L_1(0,\infty)$.

The denseness of the Laguerre functions in $L_1(0,\infty)$ (i.e. the set of absolutely Lebesgue integrable functions) is particularly important because stable impulse responses must belong to $L_1(0,\infty)$. Makila [4] proves the following result, which provides conditions that guarantee the uniform convergence of linear combinations of the (frequency-domain) Laguerre functions to transfer functions. Such a result is suggested by the well-known isomorphism between $L_2(0,\infty)$ and the Hardy space $H_2(\mathrm{Re}(s) > 0)$ of functions analytic in the right half-plane.

Property 2 (Makila [4]): Let $G(s) \in H_2$ be uniformly continuous on the imaginary axis $j\omega$. Then, for any $\varepsilon > 0$, there exist an integer $N(\varepsilon) > 0$ and a linear combination of Laguerre functions $L_{N(\varepsilon)}$ such that $|G(s) - L_{N(\varepsilon)}(s)| < \varepsilon$ for all s in the closed right-half complex plane.

Stable linear time-invariant systems are described by impulse responses that are in the function space $L_1(0,\infty)$, i.e. functions that are absolutely (Lebesgue) integrable on the time interval $(0,\infty)$. The Laguerre functions are complete in $L_2(0,\infty)$; therefore, the procedure described herein is applicable to systems whose impulse response is in the intersection of these two function spaces, i.e. in $L_1(0,\infty) \cap L_2(0,\infty)$.

Approximating a given impulse response (or its associated transfer function) can follow standard procedures from Fourier analysis, i.e.

$$h(t) = \sum_{i=0}^{\infty} a_i L_i(t), \qquad H(s) = \sum_{i=0}^{\infty} a_i L_i(s), \qquad \text{with } a_i = (h, L_i)_t = (H, L_i)_s \tag{6}$$

These expressions for $a_i$ are not very useful for online identification because they require integrations over all time or all frequencies; therefore, a more useful form must be found. Consider a system with impulse response $h(t)$, transfer function $H(s)$, and input $u(t)$ (or $U(s)$). The output is

$$y(t) = h(t) * u(t) = \sum_{i=0}^{\infty} L_i(t) * u(t)\,a_i = \sum_{i=0}^{\infty} W_i(t)\,a_i = \sum_{i=0}^{\infty} y_i(t) \tag{7}$$

where $*$ indicates the time-domain convolution and $W_i(t)$ is implicitly defined.
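As a numerical aside on Eqns. 1 to 3, the following minimal sketch evaluates the Laguerre functions and checks their orthonormality by quadrature. It is an illustration only: the three-term recurrence for the Laguerre polynomials is a standard identity used here in place of the derivative formula of Eqn. 1, and the function names, the truncation time `t_end`, and the grid size `n` are assumptions chosen for the example.

```python
import math

def laguerre_poly(k, x):
    """Classical Laguerre polynomial l_k(x), using the standard recurrence
    (n + 1) l_{n+1}(x) = (2n + 1 - x) l_n(x) - n l_{n-1}(x)."""
    if k == 0:
        return 1.0
    prev, curr = 1.0, 1.0 - x
    for n in range(1, k):
        prev, curr = curr, ((2 * n + 1 - x) * curr - n * prev) / (n + 1)
    return curr

def laguerre_fn(k, t, p=1.0):
    """Orthonormal Laguerre function L_k(t) = sqrt(2p) e^{-pt} l_k(2pt) (Eqn. 2)."""
    return math.sqrt(2.0 * p) * math.exp(-p * t) * laguerre_poly(k, 2.0 * p * t)

def inner_product(f, g, t_end=40.0, n=20000):
    """Trapezoidal approximation of Eqn. 3, truncated at t_end (an assumption
    that is harmless here because the integrands decay exponentially)."""
    h = t_end / n
    total = 0.5 * (f(0.0) * g(0.0) + f(t_end) * g(t_end))
    for j in range(1, n):
        t = j * h
        total += f(t) * g(t)
    return total * h

def gram(n_funcs=3, p=1.0):
    """Matrix of inner products (L_i, L_j)_t; it should be close to the identity."""
    fs = [lambda t, k=k: laguerre_fn(k, t, p) for k in range(n_funcs)]
    return [[inner_product(fi, fj) for fj in fs] for fi in fs]

G = gram(3)
```

The same `inner_product` helper could be used to approximate the coefficients $a_i = (h, L_i)_t$ of Eqn. 6 for a known impulse response, which is exactly the offline calculation that the online scheme is meant to avoid.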
In the frequency domain, this becomes

$$Y(s) = H(s)U(s) = \sum_{i=0}^{\infty} L_i(s)U(s)\,a_i = \sum_{i=0}^{\infty} W_i(s)\,a_i = \sum_{i=0}^{\infty} Y_i(s) \tag{8}$$

The output associated with the ith Laguerre function is

$$Y_i(s) = L_i(s)U(s)\,a_i = W_i(s)\,a_i \tag{9}$$

and the quantities $L_i(s)U(s) = W_i(s)$ depend only on the input function $U(s)$, not on the parameter values. Using the fact that, for $i \ge 1$,

$$L_i(s) = L_{i-1}(s)\,\frac{s-p}{s+p} \tag{10}$$

the quantities $W_i(s) = L_i(s)U(s)$ can be calculated in a cascaded manner, which allows for the efficient construction of the truncated series

$$Y(s) \approx \sum_{i=0}^{N} Y_i(s) = \sum_{i=0}^{N} L_i(s)U(s)\,a_i = \sum_{i=0}^{N} W_i(s)\,a_i = W(s)\,a \tag{11}$$

In the time domain, $W(s)$ becomes $W(t) = [W_0(t), \ldots, W_N(t)]$, with

$$W_i(t) = L_i(t) * u(t) \tag{12}$$

As this shows, the model parameters, which are contained in the column vector a, are linearly related to the measurable output y via the row vector $W(t)$. The 0th stage of this system can be realized by the differential equation

$$\frac{d}{dt} z_0(t) = -p\,z_0(t) + \sqrt{2p}\,u(t), \qquad W_0(t) = z_0(t) \tag{13}$$

whereas for $i \ge 1$, the ith stage can be realized by the differential equation

$$\frac{d}{dt} z_i(t) = -p\,z_i(t) - 2p\,W_{i-1}(t), \qquad W_i(t) = z_i(t) + W_{i-1}(t) \tag{14}$$

In state-equation form, $\dot{z} = Az + Bu$, with the A and B matrices defined in Eqn. 15. This differential equation realization is similar to the one in Makila [4].

$$A = \begin{bmatrix} -p & 0 & 0 & \cdots & 0 \\ -2p & -p & 0 & \cdots & 0 \\ -2p & -2p & -p & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -2p & -2p & -2p & \cdots & -p \end{bmatrix}, \qquad B = \begin{bmatrix} \sqrt{2p} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{15}$$

Discrete-time versions of these differential equations are required because virtually all parameter estimation procedures are implemented numerically on a digital computer. Consider piecewise-constant inputs, i.e. $u(t) = u(t_k)$ for $t_k \le t < t_{k+1}$.
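The realization of Eqns. 13 to 15 and its discrete-time counterpart for piecewise-constant inputs can be sketched as follows. This is a minimal illustration, not the paper's implementation: the matrix exponential and its integral are obtained from a truncated power series (an assumption made to keep the example dependency-free), and the helper names, the step size `dt`, and the number of series terms are hypothetical choices.

```python
import math

def laguerre_state_matrices(n, p):
    """A and B of Eqn. 15 for n stages: -p on the diagonal, -2p below it."""
    A = [[-p if j == i else (-2.0 * p if j < i else 0.0) for j in range(n)]
         for i in range(n)]
    B = [math.sqrt(2.0 * p)] + [0.0] * (n - 1)
    return A, B

def mat_mul(X, Y):
    """Plain-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def zoh_discretise(A, B, dt, terms=30):
    """Return Phi = e^{A dt} and Gamma = (integral of e^{A s} ds over 0..dt) B,
    both summed from the power series of the exponential."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]  # (A dt)^k / k!
    phi = [row[:] for row in term]
    integ = [[dt * v for v in row] for row in term]  # sum of A^k dt^(k+1)/(k+1)!
    for k in range(1, terms):
        term = [[v * dt / k for v in row] for row in mat_mul(term, A)]
        for i in range(n):
            for j in range(n):
                phi[i][j] += term[i][j]
                integ[i][j] += term[i][j] * dt / (k + 1)
    gamma = [sum(integ[i][j] * B[j] for j in range(n)) for i in range(n)]
    return phi, gamma

def filter_outputs(z):
    """W_0 = z_0 and W_i = z_i + W_{i-1} (Eqns. 13 and 14): a running sum."""
    W, acc = [], 0.0
    for zi in z:
        acc += zi
        W.append(acc)
    return W

def simulate_step(n=4, p=1.0, dt=0.1, steps=400):
    """Drive the filter bank with a unit step and return W at the final time."""
    A, B = laguerre_state_matrices(n, p)
    phi, gamma = zoh_discretise(A, B, dt)
    z = [0.0] * n
    for _ in range(steps):
        z = [sum(phi[i][j] * z[j] for j in range(n)) + gamma[i] * 1.0
             for i in range(n)]
    return filter_outputs(z)

W_final = simulate_step()
```

For a unit step the outputs settle at $W_i = L_i(s)\big|_{s=0} = (-1)^i \sqrt{2p}/p$, which follows from Eqn. 4 and gives a quick sanity check on the cascade.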
The solution to this equation is

$$z(t_k + \Delta) = e^{A\Delta} z(t_k) + \int_{t_k}^{t_k+\Delta} e^{A(t_k + \Delta - \tau)} B\,u(\tau)\,d\tau = e^{A\Delta} z(t_k) + \int_0^{\Delta} e^{A\sigma} B\,d\sigma\; u_k \tag{16}$$

or

$$z_{k+1} = e^{A\Delta} z_k + \Gamma u_k \tag{17}$$

These equations need computationally efficient techniques to calculate $e^{A\Delta}$ and $\Gamma$. The following theorem provides the required matrices.

Theorem 1:

$$e^{At} = \begin{bmatrix} m_0 & 0 & 0 & \cdots & 0 \\ m_1 & m_0 & 0 & \cdots & 0 \\ m_2 & m_1 & m_0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ m_{n-1} & m_{n-2} & m_{n-3} & \cdots & m_0 \end{bmatrix}, \qquad \Gamma = \begin{bmatrix} \gamma_0 \\ \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_{n-1} \end{bmatrix} \tag{18}$$

with

$$m_0(t) = \frac{L_0(t)}{\sqrt{2p}}, \qquad m_i(t) = \frac{L_i(t) - L_{i-1}(t)}{\sqrt{2p}} \tag{19}$$

$$\gamma_0 = \sqrt{2p}\int_0^{\Delta} m_0(\sigma)\,d\sigma = \sqrt{2p}\,\frac{1 - e^{-p\Delta}}{p}, \qquad \gamma_i = \sqrt{2p}\int_0^{\Delta} m_i(\sigma)\,d\sigma \approx 0.5\,\Delta\,\sqrt{2p}\,m_i(\Delta) \tag{20}$$

Proof: To verify the form of $e^{At}$, consider $N = (sI - A)$, which is lower triangular with elements $n_{i,j} = (s + p)$ for $j = i$, $n_{i,j} = 2p$ for $j < i$, and $n_{i,j} = 0$ for $j > i$. Consequently, $N^{-1}$ is lower triangular with elements

$$\left(N^{-1}\right)_{i,j} = \frac{L_0(s)}{\sqrt{2p}} = M_0(s) \quad \text{for } i = j \tag{21}$$

and

$$\left(N^{-1}\right)_{i,j} = M_{i-j}(s) = \frac{L_{i-j}(s) - L_{i-j-1}(s)}{\sqrt{2p}} \quad \text{for } i > j \tag{22}$$

The inverse Laplace transform of $N^{-1}(s)$ is $e^{At}$. The specified forms of $m_i(t)$ are the inverse Laplace transforms of the $M_i(s)$. The exact expressions for the components of $\Gamma$ follow immediately, whereas the approximate expression for $\gamma_i$, $i > 0$, comes from trapezoidal integration combined with the fact that $m_i(0) = 0$ for $i > 0$, which is consistent with the fact that these are off-diagonal elements of a state transition matrix.

Now approximate a system with impulse response $h(t) \in L_1(0,\infty) \cap L_2(0,\infty)$ by a linear model of the form $y(t) = W(t)a$. This is precisely the form needed to apply prediction-error parameter-estimation techniques. The specific version used is the gradient estimator.

3. Gradient estimation

Consider the problem of estimating the parameters associated with a Laguerre model (with a real pole) of a stable system. This limitation results in minimal loss of generality.
Hansen, Franklin, and Kosut [15] discuss experimental determination of the stable coprime factor representation of an unstable system. The Laguerre functions can be generalized in a variety of ways to allow for complex poles; one such generalization has come to be known as the Kautz functions [16].

The standard gradient estimator for a single-input single-output system can be formulated as follows (see Slotine and Li [10]): let the output of a system y be related to the unknown parameter (column) vector a via the equation

$$y(t) = W(t)\,a \tag{23}$$

where $W(t)$ is a known, measurable (or calculable from measurements) row vector. The following notation will be used: let $\hat{a}(t)$ represent the best estimate of the parameter vector at time t; let $y_c(t)$ be the output calculated based on the best estimate of the parameters, i.e.

$$y_c(t) = W(t)\,\hat{a} \tag{24}$$

let $y_m(t)$ be the measured output at time t, which is related to the parameter vector by

$$y_m(t) = W(t)\,a \tag{25}$$

further, let $e_1(t)$ be the so-called prediction error, i.e.

$$e_1(t) = y_c(t) - y_m(t) = W(t)\,\hat{a}(t) - W(t)\,a \tag{26}$$

and finally, let $\tilde{a}(t)$ be the parameter estimation error (column) vector, i.e.

$$\tilde{a}(t) = \hat{a}(t) - a \tag{27}$$

Gradient estimation is heuristically based on updating the parameter estimate $\hat{a}$ to reduce the prediction error, i.e. the parameters should be updated according to

$$\frac{d}{dt}\hat{a}(t) = -p_0\,\frac{\partial\left(e_1^T e_1\right)}{\partial \hat{a}^T} = -p_0\,W^T(t)\,e_1 \tag{28}$$

where $p_0 > 0$ is the estimator gain that can be chosen to control the convergence rate (the factor of 2 from the differentiation is absorbed into $p_0$) and the superscript T denotes matrix transposition. Gradient estimators are easily shown to converge provided the input is persistently exciting (see Slotine and Li [10]).

Online estimation schemes must be tolerant of noise. Example 3 presents some simulation results with and without noise. The effect of noise on the estimation scheme is now briefly (and heuristically) discussed. The input to the plant and the output from the plant will both be corrupted by noise.
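Setting noise aside for a moment, the noise-free update of Eqns. 23 to 28 can be sketched numerically. The example below is a minimal sketch under stated assumptions: the regressor is a hypothetical two-component rotating vector rather than the Laguerre filter outputs, the true parameter values and the gain `p0` are illustrative and not taken from the paper, and forward-Euler integration stands in for whatever integration method the author used.

```python
import math

def gradient_estimator(regressor, a_true, p0=2.0, t_end=30.0, dt=1e-3):
    """Forward-Euler integration of d(a_hat)/dt = -p0 * W^T * e1 (Eqn. 28)."""
    a_hat = [0.0] * len(a_true)
    steps = int(round(t_end / dt))
    for k in range(steps):
        W = regressor(k * dt)                          # row vector W(t)
        y_m = sum(w * a for w, a in zip(W, a_true))    # noise-free measurement, Eqn. 25
        y_c = sum(w * a for w, a in zip(W, a_hat))     # model output, Eqn. 24
        e1 = y_c - y_m                                 # prediction error, Eqn. 26
        a_hat = [a - dt * p0 * w * e1 for a, w in zip(a_hat, W)]
    return a_hat

def rotating_regressor(t):
    """Hypothetical persistently exciting regressor (not the Laguerre filter bank)."""
    return [math.cos(t), math.sin(t)]

a_true = [0.7, -0.3]   # illustrative values only, not results from the paper
a_hat = gradient_estimator(rotating_regressor, a_true)
```

With this particular regressor the estimation error obeys a constant-coefficient equation in the rotating frame, and for `p0 = 2` it decays roughly like $(1 + ct)e^{-t}$, so the estimate is essentially exact after 30 s. In the scheme of this paper, `regressor` would instead return the filter outputs $W_0(t), \ldots, W_N(t)$ driven by the plant input.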
Let the input noise signal be denoted by $n(t)$ and the output noise be denoted by $v(t)$. Eqn. 25 (for $y_m(t)$) must be modified by replacing $W(t)$ with $W(t) + N(t)$, where $N(t)$ is a row vector whose ith component is the convolution of $L_i(t)$ with $n(t)$, and by adding $v(t)$, i.e.

$$y_m(t) = \left[W(t) + N(t)\right] a + v(t) \tag{29}$$

Eqn. 26 (for $e_1$) becomes

$$e_1(t) = y_c(t) - y_m(t) = W(t)\,\hat{a}(t) - \left[W(t) + N(t)\right] a - v(t) \tag{30}$$

and, using $a = \hat{a} - \tilde{a}$, the differential equation for $\hat{a}$ becomes

$$\frac{d}{dt}\hat{a}(t) = -p_0\,\frac{\partial\left(e_1^T e_1\right)}{\partial \hat{a}^T} = -p_0\left[W^T(t)W(t)\,\tilde{a}(t) - W^T(t)N(t)\left(\hat{a}(t) - \tilde{a}(t)\right) - W^T(t)\,v(t)\right] \tag{31}$$

where $\tilde{a}$ is the difference between $\hat{a}$ and a. Eqn. 31 clearly demonstrates that the estimate is governed by a nonlinear differential equation, which means that the analysis is heuristic at best (i.e. the theory of nonlinear stochastic differential equations must be used in any theoretical investigation of the effect of noise on this scheme). Such a theoretical investigation is beyond the scope of this paper.

4. Examples

The parameters of a Laguerre model can now be identified for a well-studied stable time-delay system (see Reference 1 and, for a very similar system, Reference 6). Before beginning with the examples, a comment on the choice of p in the Laguerre functions is in order. The Laguerre functions are complete in $L_2(0,\infty)$ so long as p is positive; therefore, the choice of p is not critical. However, a good choice of p can potentially reduce the number of terms needed for accurate approximation.

4.1 Example 1

Choose the plant to be a time-delay system with transfer function

G (s)  1 s  1  e 2 s (32)

The input is chosen to be $u(t) = \sin t + \sin 2t$, and a final time of 50 s is used. A 4th order model was sought, and $\hat{a} = (0.7215, 0.0494, 0.0322, 0.0126)$ was produced with a prediction error of -0.0016. To see if the approximation is truly useful, the author investigated the frequency response errors for the approximations in Table 1.
As the plots of the error functions demonstrate (Fig. 1), the maximum error decreases except for the $G - G_3$ case. In this case, the maximum error is larger but the squared error is smaller.

Table 1: 2-norm and ∞-norm errors for four approximations in Example 1

Approximation                                                        e_2      e_∞
$G_0(s) = a_0 L_0(s)$                                                0.004    0.136
$G_1(s) = a_0 L_0(s) + a_1 L_1(s)$                                   0.012    0.066
$G_2(s) = a_0 L_0(s) + a_1 L_1(s) + a_2 L_2(s)$                      0.011    0.021
$G_3(s) = a_0 L_0(s) + a_1 L_1(s) + a_2 L_2(s) + a_3 L_3(s)$         0.01     0.039

Fig. 1 Approximation errors for p=1

4.2 Example 2

This example investigates the dependence of the approximation on the pole p. Approximations using p=0.722 (the pole of the actual plant) and p=4 are compared to the approximation based on p=1. Fig. 2 shows the error functions for the p=0.722 case, with corresponding parameter vector $\hat{a} = (0.7248, 0.0413, 0.0275, 0.0044)$. Fig. 3 shows the error functions for the p=4 case, with corresponding parameter vector $\hat{a} = (0.5796, 0.3307, 0.0515, 0.2964)$. Comparing the error functions, one sees clearly that the choice of p has a significant impact on the accuracy of the approximation. Choosing the pole location to coincide with the dominant pole of the given transfer function gives the most accurate approximation. The case p=4 (p equal to the order of approximation) was investigated because this pole location was suggested by Glover et al. [1] for a unit impulse response. Of course, knowledge of the dominant pole (or existence of a dominant pole) was not assumed in Reference 1. Further, this example is consistent with the result of Reference 1 in that it demonstrates that the convergence rate for the p=4 case is the fastest even though the initial error is larger, whereas the convergence rate for the p=0.722 case is slower even though its initial error is smaller.

4.3 Example 3

This example investigates the effect of input and output noise on the identification scheme. Five values of the parameter vector are compared for p=0.722.
(i) optimal parameter vector obtained using the inner product in eqn. 5
(ii) ideal parameter vector obtained using the proposed identification scheme without noise (from the previous example)
(iii) parameter vector obtained when the input is corrupted by noise uniformly distributed on the interval [-0.1, 0.1]
(iv) parameter vector obtained when the output is corrupted by noise uniformly distributed on the interval [-0.1, 0.1]
(v) parameter vector obtained when both the input and the output are corrupted by noise uniformly distributed on the interval [-0.05, 0.05].

The results are summarized in Table 2. This table demonstrates that the no-noise estimate is very close to optimal; noise at the input has less effect than noise at the output owing to the ‘momentum’ internal to the system. It is interesting to note that the ∞-norm errors correlate with the two-norm errors.

Fig. 2 Approximation errors for p=0.722
Fig. 3 Approximation errors for p=4

Table 2: Two-norm, ∞-norm, and prediction errors with and without noise

Case                       e_2              e_∞      e_p
Optimal                    4.597 x 10^-4    0.019    NA
No noise                   5.519 x 10^-4    0.02     0.002
Input and output noise     5.434 x 10^-4    0.021    0.028
Input noise                6.271 x 10^-4    0.027    0.006
Output noise               7.327 x 10^-4    0.039    0.032

5. Conclusions

This paper has demonstrated that Laguerre models can be used to identify parameters in an online setting; therefore, parameter identification based on Laguerre models appears suitable for use in self-tuning adaptive controllers. Laguerre models are useful because they are generic: knowledge of the pole p is not necessary, although it is helpful. In addition, the Laguerre models have a ‘state’ that does not depend on the parameters to be identified; this is a property shared by other parameter identification schemes.

6. References

1. GLOVER, K., LAM, J., and PARTINGTON, J. R.: ‘Rational approximation of a class of infinite-dimensional systems: the L2 case’, in NEVAI, P., and PINKUS, A. (Eds.): ‘Progress in approximation theory’ (Academic Press, 1991), pp. 405-440
2. GU, G., KHARGONEKAR, R. P., and LEE, E. B.: ‘Approximation of infinite dimensional systems’, IEEE Trans., 1989, AC-34, pp. 610-618
3. LEBEDEV, N. N.: ‘Special functions and their applications’ (Dover, New York, 1972)
4. MAKILA, P. M.: ‘Approximation of stable systems by Laguerre filters’, Automatica, 1990, 26, pp. 333-345
5. MAKILA, P. M.: ‘Laguerre series approximation of infinite dimensional systems’, Automatica, 1990, 26, pp. 985-995
6. OLIVIER, P. D.: ‘Reduced-order models using optimal Laguerre approximations’, Electron. Lett., 1987, 23, (6), pp. 257-259
7. OLIVIER, P. D., and SCHMALZEL, J. L.: ‘Empirical modeling of long catheters’. Presented at IEEE international symposium on Biomedical Engineering
8. OLIVIER, P. D.: ‘Approximating irrational functions using Lagrange interpolation formula’, IEEE Proc. D, 1992, 139, (1), pp. 9-12
9. PARTINGTON, J. R.: ‘Approximation of delay systems by Fourier-Laguerre series’, Automatica, 1991, 27, (3), pp. 569-572
10. SLOTINE, J.-J. E., and LI, W.: ‘Applied nonlinear control’ (Prentice-Hall, Englewood Cliffs, 1991)
11. SZEGO, G.: ‘Orthogonal polynomials’ (Amer. Math. Soc., 1939), Vol. 23
12. WAHLBERG, B.: ‘System identification using Laguerre models’, IEEE Trans., 1991, AC-36, (5)
13. ZERVOS, C., BELANGER, P. R., and DUMONT, G. A.: ‘On PID controller tuning using orthonormal series identification’, Automatica, 1988, 24, (2), pp. 165-175
14. DUMONT, G. A., and ZERVOS, C. C.: ‘Adaptive control based on orthonormal series representation’. Proceedings of second IFAC workshop on Adaptive systems in control and signal processing, Lund, Sweden, 1988, pp. 371-376
15. HANSEN, F. R., FRANKLIN, G. F., and KOSUT, R.: ‘Closed-loop identification via fractional representation: experiment design’. Proceedings of American Control Conference, Pittsburgh, PA, 1989, pp. 1422-1427
16. KAUTZ, W. H.: ‘Transient synthesis in time domain’, IRE Trans., 1954, CT-1, (3), pp. 29-39
