Bibliographical Citation: IEEE Proc.-Control Theory Appl., Vol. 141, No. 4, July 1994, pp. 249-254
Online System Identification Using Laguerre Series
P. D. Olivier
Indexing Terms
Abstract: An online system identification scheme is proposed based on a Fourier-Laguerre series representation of the unknown impulse response. The unknown
parameters are determined using a gradient estimator. Noise effects are considered.
The proposed identification scheme is applied to a system with time delay.
1. Introduction
The problem addressed is the online identification of parameters in Laguerre
models of high-order, perhaps infinite, systems. Several recent papers have used
Laguerre series to approximate complex systems (see Olivier [6 and 7]; Zervos et al. [13
and 14]; Gu et al. [2]; Wahlberg [12]; Makila [4 and 5]; Glover et al. [1]; Partington [9]).
With the exception of Reference 14, these contributions have been offline in nature and
primarily in the frequency domain. Series of Laguerre functions have been found useful
in the model order reduction setting in References 1, 6, and 9, with special attention to
convergence rates to be found in Reference 1. These studies determined the expansion
coefficients by calculating the inner product of the target function with the Laguerre
functions or by applying the residue theorem to the frequency domain inner product. The
algebraic expressions that result are useful for theoretical investigation, but do not lend
themselves to numerical implementations necessary in adaptive schemes.
References 6, 13 and 4 suggest orthonormal series identification ‘as a general
framework for identification of infinite dimensional systems.’ Dumont and Zervos in
Reference 14 present a least-squares approach to adaptive modeling and the results due to
Wahlberg [12] could also lead to a least-squares adaptive identification scheme. This
paper describes a time-domain online identification procedure based on the gradient
estimation technique that can be used as part of a self-tuning controller.
Motivating the further study of Laguerre models are observations made during the
course of this prior research. For example, Makila [4] observes that approximations
based on Laguerre models ‘typically provide near best solutions to related L
approximation problems’; Partington [9] points out that approximation techniques based
on Fourier-Laguerre series ‘…have advantages through being easier to calculate than
more rapidly converging approximations…’; and Zervos et al. [13] suggest that the
closed-loop plant should be ‘modeled by a Laguerre series expansion, rather than by a
fixed-structure transfer function, because of the requirements of robustness with minimal
prior information.’
Central to the parameter identification process is the adoption of a general model
that is capable of describing the dynamical behaviour of a wide class of systems and
which linearly relates model parameters, measured inputs to the system, and measured
outputs from the system. Following Zervos et al. in Reference 13, a model is chosen
based on the Laguerre series expansion of the impulse response (or of the transfer
function). Since the Laplace transforms of the Laguerre functions are rational functions of
the Laplace variable s, it is possible to use a truncated version of the convergent
expansion to construct a finite dimensional approximation to the potentially infinite
dimensional unknown system.
Once a general model is chosen, the next step is to choose a parameter
identification scheme. In this paper, a gradient estimator is utilized.
Even though much of the background material and terminology is highly
mathematical, relying on functional analytic terminology and concepts, the less
mathematically inclined reader should not fear. The basic techniques presented herein
are developed using only an elementary knowledge of differential equations and Laplace
transform theory. The mathematical background is presented for completeness and for
the convenience of the more mathematically oriented reader.
2. Laguerre models
The Laguerre functions are a set of orthonormal functions that span the function
space $L_2(0,\infty)$, i.e. the space of square (Lebesgue) integrable functions on the time
interval $(0,\infty)$. The classical Laguerre polynomials are
$$\phi_k(t) = \frac{e^{t}}{k!}\,\frac{d^k}{dt^k}\left(t^k e^{-t}\right) \qquad (1)$$
and these can be used to represent concisely the orthonormal Laguerre functions
$$L_k(t) = \sqrt{2p}\,e^{-pt}\,\phi_k(2pt) \qquad (2)$$
where p must be positive. The inner product with respect to which these functions are
orthonormal is the standard time domain $L_2$ inner product, i.e.
$$(f, g)_t = \int_0^{\infty} f(t)\,g(t)\,dt \qquad (3)$$
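The orthonormality implied by eqns. 1-3 can be checked numerically. The following is a minimal sketch, assuming p = 1, a finite integration horizon of 40 s, and trapezoidal integration; the function names are illustrative and the polynomials are evaluated with the standard three-term recurrence rather than by differentiating eqn. 1.

```python
import math

def laguerre_poly(k, x):
    """Classical Laguerre polynomial of degree k via the three-term recurrence."""
    prev, curr = 1.0, 1.0 - x  # degrees 0 and 1
    if k == 0:
        return prev
    for n in range(1, k):
        prev, curr = curr, ((2 * n + 1 - x) * curr - n * prev) / (n + 1)
    return curr

def laguerre_fn(k, t, p):
    """Orthonormal Laguerre function L_k(t) of eqn. 2."""
    return math.sqrt(2.0 * p) * math.exp(-p * t) * laguerre_poly(k, 2.0 * p * t)

def inner_product(f, g, t_end=40.0, steps=40000):
    """Trapezoidal approximation of the inner product of eqn. 3 on [0, t_end]."""
    h = t_end / steps
    total = 0.5 * (f(0.0) * g(0.0) + f(t_end) * g(t_end))
    for n in range(1, steps):
        t = n * h
        total += f(t) * g(t)
    return total * h

p = 1.0  # assumed pole location
# Gram matrix of the first three Laguerre functions; it should be close to the identity.
gram = [[inner_product(lambda t: laguerre_fn(i, t, p),
                       lambda t: laguerre_fn(j, t, p))
         for j in range(3)] for i in range(3)]
for row in gram:
    print([round(value, 6) for value in row])
```

The truncated horizon is harmless here because the integrands decay like exp(-2pt); a smaller p would call for a longer horizon.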
The Laplace transforms of the $L_k(t)$ are rational functions of the Laplace variable s for
$k \ge 0$:
$$L_k(s) = \sqrt{2p}\,\frac{(s-p)^k}{(s+p)^{k+1}} = \frac{\sqrt{2p}}{s+p}\left(\frac{s-p}{s+p}\right)^k \qquad (4)$$
Parseval’s theorem relates the time domain inner product to the standard frequency
domain inner product.
$$(f, g)_t = (F, G)_s = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(j\omega)\,\overline{G(j\omega)}\,d\omega \qquad (5)$$
where F and G are the Laplace transforms of f and g respectively, with the subscript t and
s indicating a time domain or a frequency domain inner product; hence, the Laguerre
functions are orthonormal in both the time and frequency domains.
There is a rich body of knowledge concerning Laguerre functions in the classical
literature. The reader is referred to Lebedev [3] and Szego [11] for treatments. In
particular, observe the following classical result that shows that the Laguerre functions
are an orthonormal basis of $L_2(0,\infty)$.
Property 1: (Szego [11]). The Laguerre functions form an orthonormal basis of $L_2(0,\infty)$.
Furthermore, the Laguerre functions are dense in $L_1(0,\infty)$.
The denseness of the Laguerre functions in $L_1(0,\infty)$ (i.e. the set of absolutely
Lebesgue integrable functions) is particularly important because stable impulse responses
must belong to $L_1(0,\infty)$.
Makila [4] proves the following result, which provides conditions that guarantee the
uniform convergence of linear combinations of the (frequency-domain) Laguerre
functions to transfer functions. Such a result is suggested by the well-known isomorphism
between $L_2(0,\infty)$ and the Hardy space $H_2(\mathrm{Re}(s) > 0)$ of functions analytic in the right half-plane.
Property 2: (Makila [4]). Let $G(s) \in H_2$ be uniformly continuous on the imaginary axis
$j\omega$. Then, for any $\epsilon > 0$, there exists an integer $N(\epsilon) \ge 0$ and a linear combination of
Laguerre functions $L_{N(\epsilon)}$ such that $|G(s) - L_{N(\epsilon)}(s)| < \epsilon$ for all s in the closed right-half
complex plane.
Stable linear time-invariant systems are described by impulse responses that are in
the function space $L_1(0,\infty)$, i.e. functions that are absolutely (Lebesgue) integrable on
the time interval $(0,\infty)$. The Laguerre functions are complete in the set $L_2(0,\infty)$;
therefore, the procedure that is described herein is applicable to systems whose impulse
response is in the intersection of these two function spaces, i.e. in $L_1(0,\infty) \cap L_2(0,\infty)$.
Approximating a given impulse response (or its associated transfer function) can follow
standard procedures from Fourier analysis, i.e.
$$h(t) = \sum_{i=0}^{\infty} a_i L_i(t), \qquad H(s) = \sum_{i=0}^{\infty} a_i L_i(s)$$
with
$$a_i = (h, L_i)_t = (H, L_i)_s \qquad (6)$$
These expressions for $a_i$ are not very useful for online identification because they require
integrations over all time or all frequencies; therefore, a more useful form must be found.
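To make the offline character of eqn. 6 concrete, the following sketch computes the first few coefficients by numerical integration for a hypothetical impulse response h(t) = exp(-2t), which is not a system from this paper. For this choice the time-domain inner product equals the Laplace transform of eqn. 4 evaluated at s = 2, which provides an independent check. The horizon, step count, and function names are assumptions.

```python
import math

def laguerre_poly(k, x):
    """Classical Laguerre polynomial of degree k via the three-term recurrence."""
    prev, curr = 1.0, 1.0 - x
    if k == 0:
        return prev
    for n in range(1, k):
        prev, curr = curr, ((2 * n + 1 - x) * curr - n * prev) / (n + 1)
    return curr

def laguerre_fn(k, t, p):
    """Orthonormal Laguerre function L_k(t) of eqn. 2."""
    return math.sqrt(2.0 * p) * math.exp(-p * t) * laguerre_poly(k, 2.0 * p * t)

def laguerre_coefficient(h, k, p, t_end=40.0, steps=40000):
    """a_k = (h, L_k)_t of eqn. 6, by trapezoidal integration over [0, t_end]."""
    dt = t_end / steps
    total = 0.5 * (h(0.0) * laguerre_fn(k, 0.0, p)
                   + h(t_end) * laguerre_fn(k, t_end, p))
    for n in range(1, steps):
        t = n * dt
        total += h(t) * laguerre_fn(k, t, p)
    return total * dt

p = 1.0                                   # assumed Laguerre pole
h = lambda t: math.exp(-2.0 * t)          # hypothetical impulse response, H(s) = 1/(s + 2)
coeffs = [laguerre_coefficient(h, k, p) for k in range(4)]
# Closed form for this h: L_k(s) of eqn. 4 evaluated at s = 2.
expected = [math.sqrt(2.0 * p) * (2.0 - p) ** k / (2.0 + p) ** (k + 1) for k in range(4)]
print(coeffs)
print(expected)
```

Each coefficient requires the whole impulse response in advance, which is exactly why this route does not suit an online scheme.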
Consider a system with impulse response $h(t)$, transfer function $H(s)$, and input $u(t)$
(or $U(s)$). The output is
$$y(t) = h(t) * u(t) = \sum_{i=0}^{\infty} L_i(t) * u(t)\,a_i = \sum_{i=0}^{\infty} W_i(t)\,a_i = \sum_{i=0}^{\infty} y_i(t) \qquad (7)$$
where $*$ indicates the time-domain convolution and $W_i(t)$ is implicitly defined. In the
frequency domain, this becomes
$$Y(s) = H(s)U(s) = \sum_{i=0}^{\infty} L_i(s)U(s)\,a_i = \sum_{i=0}^{\infty} W_i(s)\,a_i = \sum_{i=0}^{\infty} Y_i(s) \qquad (8)$$
The output associated with the ith Laguerre function is
$$Y_i(s) = L_i(s)U(s)\,a_i = W_i(s)\,a_i \qquad (9)$$
and the quantities $L_i(s)U(s) = W_i(s)$ depend only on the input function U(s), not on the
parameter values. Using the fact that, for $i \ge 1$,
$$L_i(s) = L_{i-1}(s)\,\frac{s-p}{s+p} \qquad (10)$$
each $W_i(s) = L_i(s)U(s)$ can be calculated in a cascaded manner, which allows for the
efficient cascade construction of the truncated series
$$\sum_{i=0}^{N} Y_i(s) = \sum_{i=0}^{N} L_i(s)U(s)\,a_i = \sum_{i=0}^{N} W_i(s)\,a_i = W(s)\,a \qquad (11)$$
In the time domain, W(s) becomes $W(t) = [W_0(t), \ldots, W_N(t)]$, with
$$W_i(t) = L_i(t) * u(t) \qquad (12)$$
As this shows, the model parameters, which are contained in the column vector a, are
linearly related to the measurable output y via the row vector W(t). The 0th stage of this
system can be realized by the differential equation
$$\frac{d}{dt}z_0(t) = -p\,z_0(t) + \sqrt{2p}\,u(t), \qquad W_0(t) = z_0(t) \qquad (13)$$
whereas for $i \ge 1$, the ith stage can be realized by the differential equation
$$\frac{d}{dt}z_i(t) = -p\,z_i(t) - 2p\,W_{i-1}(t), \qquad W_i(t) = z_i(t) + W_{i-1}(t) \qquad (14)$$
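That eqns. 13 and 14 realize the cascade of eqn. 10 can be confirmed by taking Laplace transforms. The short derivation below is a sketch that assumes zero initial conditions; $Z_i(s)$ denotes the transform of $z_i(t)$.

```latex
\begin{align*}
sZ_0(s) &= -pZ_0(s) + \sqrt{2p}\,U(s)
  &&\Rightarrow\quad W_0(s) = Z_0(s) = \frac{\sqrt{2p}}{s+p}\,U(s) = L_0(s)\,U(s),\\
sZ_i(s) &= -pZ_i(s) - 2p\,W_{i-1}(s)
  &&\Rightarrow\quad Z_i(s) = -\frac{2p}{s+p}\,W_{i-1}(s),\\
W_i(s) &= Z_i(s) + W_{i-1}(s)
  = \left(1 - \frac{2p}{s+p}\right)W_{i-1}(s)
  = \frac{s-p}{s+p}\,W_{i-1}(s).
\end{align*}
```

The first stage therefore reproduces $L_0(s)$ from eqn. 4, and every later stage multiplies by the all-pass factor $(s-p)/(s+p)$, in agreement with eqn. 10.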
In state-equation form, $\dot z = Az + Bu$, with the A and B matrices defined in equation 15.
This differential equation realization is similar to the one in Makila [4].
$$A = \begin{bmatrix} -p & 0 & 0 & \cdots & 0 \\ -2p & -p & 0 & \cdots & 0 \\ -2p & -2p & -p & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -2p & -2p & -2p & \cdots & -p \end{bmatrix}, \qquad B = \begin{bmatrix} \sqrt{2p} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (15)$$
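The structure of eqn. 15 is simple enough to build directly. The sketch below, with illustrative function names and an assumed pole p = 1, constructs A and B, recovers the outputs from the states through $W_i = z_0 + \cdots + z_i$ (which follows from eqns. 13 and 14), and checks the steady-state response to a unit step against the DC values of eqn. 4, namely $L_i(0) = (-1)^i\sqrt{2/p}$.

```python
import math

def laguerre_state_space(p, n):
    """Return (A, B) of eqn. 15 for an n-state Laguerre network with pole p."""
    A = [[-p if j == i else (-2.0 * p if j < i else 0.0) for j in range(n)]
         for i in range(n)]
    B = [math.sqrt(2.0 * p)] + [0.0] * (n - 1)
    return A, B

def outputs_from_states(z):
    """W_i = z_0 + ... + z_i, from W_0 = z_0 and W_i = z_i + W_{i-1}."""
    W, running = [], 0.0
    for zi in z:
        running += zi
        W.append(running)
    return W

def steady_state(A, B, u=1.0):
    """Solve A z = -B u by forward substitution (A is lower triangular)."""
    z = []
    for i, row in enumerate(A):
        rhs = -B[i] * u - sum(row[j] * z[j] for j in range(i))
        z.append(rhs / row[i])
    return z

p, n = 1.0, 4            # assumed pole and model order
A, B = laguerre_state_space(p, n)
W_dc = outputs_from_states(steady_state(A, B))
print(W_dc)              # compare with (-1)**i * sqrt(2/p)
```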
Discrete-time versions of these differential equations are required because virtually all
parameter estimation procedures are implemented numerically via digital computer. For
piecewise constant inputs (i.e. $u(t) = u(t_k)$ for $t_k \le t < t_{k+1}$), the solution to this equation
is
$$z(t_k + \tau) = e^{A\tau}z(t_k) + \int_{t_k}^{t_k+\tau} e^{A(t_k+\tau-\sigma)}Bu(\sigma)\,d\sigma = e^{A\tau}z(t_k) + \int_0^{\tau} e^{A\sigma}B\,d\sigma\;u_k \qquad (16)$$
or
$$z_{k+1} = e^{A\tau}z_k + \Gamma u_k \qquad (17)$$
These equations need computationally efficient techniques to calculate $e^{A\tau}$ and $\Gamma$. The
following theorem provides the required matrices.
Theorem 1:
$$e^{At} = \begin{bmatrix} m_0 & 0 & 0 & \cdots & 0 \\ m_1 & m_0 & 0 & \cdots & 0 \\ m_2 & m_1 & m_0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ m_{n-1} & m_{n-2} & m_{n-3} & \cdots & m_0 \end{bmatrix}, \qquad \Gamma = \begin{bmatrix} \gamma_0 \\ \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_{n-1} \end{bmatrix} \qquad (18)$$
with
$$m_0(t) = \frac{L_0(t)}{\sqrt{2p}}, \qquad m_i(t) = \frac{L_i(t) - L_{i-1}(t)}{\sqrt{2p}} \qquad (19)$$
$$\gamma_0 = \sqrt{2p}\int_0^{\tau} m_0(\sigma)\,d\sigma = \frac{\sqrt{2p}}{p}\left(1 - e^{-p\tau}\right), \qquad \gamma_i = \sqrt{2p}\int_0^{\tau} m_i(\sigma)\,d\sigma \approx 0.5\sqrt{2p}\,\tau\,m_i(\tau) \qquad (20)$$
Proof: To verify the form of $e^{At}$, consider $N = (sI - A)$, which is lower triangular with
elements $n_{i,j} = (s + p)$ for $j = i$, $n_{i,j} = 2p$ for $j < i$, and $n_{i,j} = 0$ for $j > i$.
Consequently, $N^{-1}$ is lower triangular with elements
$$\left[N^{-1}\right]_{i,j} = \frac{L_0(s)}{\sqrt{2p}} = M_0(s) \quad \text{for } i = j \qquad (21)$$
and
$$\left[N^{-1}\right]_{i,j} = M_{i-j}(s) = \frac{L_{i-j}(s) - L_{i-j-1}(s)}{\sqrt{2p}} \quad \text{for } i > j \qquad (22)$$
The inverse Laplace transform of $N^{-1}(s)$ is $e^{At}$. The specified forms of $m_i(t)$ are the
inverse Laplace transforms of the $M_i(s)$. The exact expressions for the components of $\Gamma$
follow immediately, whereas the approximate expression for $\gamma_i$, $i > 0$, comes from
trapezoidal integration combined with the fact that $m_i(0) = 0$ for $i > 0$, which is
consistent with the fact that these are off-diagonal elements of a state transition matrix.
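Theorem 1 can be checked numerically. The sketch below builds the transition matrix from the $m_i$ of eqn. 19 and compares it with a truncated Taylor series for the matrix exponential, which serves here only as an independent reference. The values p = 1, n = 4, and a sampling interval of 0.1 s are assumptions, as are the function names.

```python
import math

def laguerre_poly(k, x):
    """Classical Laguerre polynomial of degree k via the three-term recurrence."""
    prev, curr = 1.0, 1.0 - x
    if k == 0:
        return prev
    for n in range(1, k):
        prev, curr = curr, ((2 * n + 1 - x) * curr - n * prev) / (n + 1)
    return curr

def laguerre_fn(k, t, p):
    """Orthonormal Laguerre function L_k(t) of eqn. 2."""
    return math.sqrt(2.0 * p) * math.exp(-p * t) * laguerre_poly(k, 2.0 * p * t)

def m(i, t, p):
    """m_i(t) of eqn. 19."""
    if i == 0:
        return laguerre_fn(0, t, p) / math.sqrt(2.0 * p)
    return (laguerre_fn(i, t, p) - laguerre_fn(i - 1, t, p)) / math.sqrt(2.0 * p)

def transition_matrix(p, n, tau):
    """e^{A tau} of eqn. 18: lower-triangular Toeplitz matrix built from m_i(tau)."""
    return [[m(i - j, tau, p) if i >= j else 0.0 for j in range(n)] for i in range(n)]

def gamma_vector(p, n, tau):
    """Gamma of eqn. 20: exact gamma_0, trapezoidal approximation for i > 0."""
    g0 = math.sqrt(2.0 * p) * (1.0 - math.exp(-p * tau)) / p
    return [g0] + [0.5 * math.sqrt(2.0 * p) * tau * m(i, tau, p) for i in range(1, n)]

def expm_series(A, tau, terms=30):
    """Reference e^{A tau} by truncated Taylor series (adequate for small A*tau)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][l] * A[l][j] for l in range(n)) * tau / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

p, n, tau = 1.0, 4, 0.1   # assumed pole, order, and sampling interval
A = [[-p if j == i else (-2.0 * p if j < i else 0.0) for j in range(n)] for i in range(n)]
Phi = transition_matrix(p, n, tau)
Phi_ref = expm_series(A, tau)
max_diff = max(abs(Phi[i][j] - Phi_ref[i][j]) for i in range(n) for j in range(n))
Gamma = gamma_vector(p, n, tau)
print(max_diff, Gamma)
```

With these two quantities the update of eqn. 17 costs one matrix-vector product per sample. The trapezoidal entries of Gamma are approximations whose accuracy degrades as the sampling interval grows.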
Now approximate a system with impulse response $h(t) \in L_1(0,\infty) \cap L_2(0,\infty)$ by a
linear model of the form $y(t) = W(t)a$. This is precisely the form needed to apply
prediction-error parameter-estimation techniques. The specific version used is the
gradient estimator.
3. Gradient estimation
Consider the problem of estimating the parameters associated with a Laguerre
model (with a real pole) of a stable system. This limitation results in minimal loss of
generality. Hansen, Franklin, and Kosut [15] discuss experimental determination of the
stable coprime factor representation of an unstable system. The Laguerre functions can
be generalized in a variety of ways to allow for complex poles; one such generalization
has come to be known as the Kautz functions [16].
The standard gradient estimator for a single-input single-output system can be
formulated as follows (see Slotine and Li [10]): let the output of a system y be related to
the unknown parameter (column) vector a via the equation
$$y(t) = W(t)\,a \qquad (23)$$
where W(t) is a known, measurable (or calculable from measurements) row vector. The
following notation will be used: let $\hat a(t)$ represent the best estimate of the parameter
vector at time t; let $y_c(t)$ be the output calculated based on the best estimate of the
parameters, i.e.
$$y_c(t) = W(t)\,\hat a \qquad (24)$$
let $y_m(t)$ be the measured output at time t, which is related to the parameter vector by
$$y_m(t) = W(t)\,a \qquad (25)$$
further, let $e_1(t)$ be the so-called prediction error, i.e.
$$e_1(t) = y_c(t) - y_m(t) = W(t)\hat a(t) - W(t)a \qquad (26)$$
and finally, let $\tilde a(t)$ be the parameter estimation error (column) vector, i.e.
$$\tilde a(t) = \hat a(t) - a \qquad (27)$$
Gradient estimation is heuristically based on updating the parameter estimate $\hat a$ to reduce
the prediction error, i.e. the parameters should be updated according to
$$\frac{d}{dt}\hat a(t) = -p_0\,\frac{\partial}{\partial \hat a^T}\left[e_1^T e_1\right] = -p_0\,W^T(t)\,e_1 \qquad (28)$$
where $p_0 > 0$ is the estimator gain that can be chosen to control the convergence rate
(the factor of 2 produced by the differentiation is absorbed into $p_0$) and
the superscript T denotes matrix transposition.
Gradient estimators are easily shown to converge provided the input is
persistently exciting (see Slotine and Li [10]).
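The estimator of eqn. 28 can be illustrated with a small simulation. The sketch below is not the implementation used for the examples in this paper: it assumes a plant that is exactly a two-term Laguerre model with hypothetical coefficients, integrates eqns. 13, 14 and 28 with forward-Euler steps instead of the exact discretization of Theorem 1, and uses assumed values for the gain, step size, and final time.

```python
import math

def simulate_gradient_estimator(a_true, p=1.0, p0=1.0, dt=0.01, t_end=100.0):
    """Euler simulation of eqns. 13, 14 and 28 for a plant that is a Laguerre model."""
    n = len(a_true)
    z = [0.0] * n          # Laguerre filter states z_i
    a_hat = [0.0] * n      # parameter estimate, started at zero
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        u = math.sin(t) + math.sin(2.0 * t)   # two sinusoids, as in Example 1
        # Regressor W_i = z_0 + ... + z_i (eqns. 13 and 14).
        W, running = [], 0.0
        for zi in z:
            running += zi
            W.append(running)
        y_m = sum(Wi * ai for Wi, ai in zip(W, a_true))   # eqn. 25
        y_c = sum(Wi * ai for Wi, ai in zip(W, a_hat))    # eqn. 24
        e1 = y_c - y_m                                    # eqn. 26
        # Gradient update of eqn. 28, one forward-Euler step.
        a_hat = [ai - dt * p0 * Wi * e1 for ai, Wi in zip(a_hat, W)]
        # Advance the filter states (eqns. 13 and 14), one forward-Euler step.
        dz = [-p * z[0] + math.sqrt(2.0 * p) * u]
        dz += [-p * z[i] - 2.0 * p * W[i - 1] for i in range(1, n)]
        z = [zi + dt * dzi for zi, dzi in zip(z, dz)]
    return a_hat

a_true = [0.7, 0.05]     # hypothetical coefficients, not taken from the paper
a_hat = simulate_gradient_estimator(a_true)
print(a_hat)
```

Two sinusoids are enough to excite two parameters persistently; a higher-order model would generally need a richer input for the estimate to converge to the true vector.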
Online estimation schemes must be tolerant of noise. Example 3 presents some
simulation results with and without noise. The effect of noise on the estimation scheme
is now briefly (and heuristically) discussed. The input to the plant and the output from
the plant will both be corrupted by noise. Let the input noise signal be denoted by n(t )
and the output noise be denoted by $v(t)$. Eqn. 25 (for $y_m(t)$) must be modified by
replacing W(t) with $W(t) + N(t)$, where N(t) is a row vector with the ith component
equal to the convolution of $L_i(t)$ with n(t), and adding v(t), i.e.
$$y_m(t) = [W(t) + N(t)]\,a + v(t) \qquad (29)$$
Eqn. 26 (for $e_1$) becomes
$$e_1(t) = y_c(t) - y_m(t) = W(t)\hat a(t) - [W(t) + N(t)]\,a - v(t) \qquad (30)$$
and the differential equation for $\hat a$ becomes
$$\frac{d}{dt}\hat a(t) = -p_0\,\frac{\partial}{\partial \hat a^T}\left[e_1^T e_1\right] = -p_0\left[W^T(t)W(t)\tilde a(t) - W^T(t)N(t)\left(\hat a(t) - \tilde a(t)\right) - W^T(t)v(t)\right] \qquad (31)$$
where $\tilde a$ is the difference between $\hat a$ and a. Eqn. 31 clearly demonstrates that the
estimate is governed by a nonlinear differential equation which means that the analysis is
heuristic at best (i.e. the theory of nonlinear stochastic differential equations must be used
in any theoretical investigation of the effect of noise on this scheme). Such a theoretical
investigation is beyond the scope of this paper.
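The noise model of eqns. 29 and 30 can be explored in simulation even without a theoretical analysis. The following sketch rests on several assumptions: the plant is a hypothetical two-term Laguerre model, the noise is an independent uniform sample at every Euler step (a crude stand-in for continuous-time noise), and the random seed, gain, and step size are arbitrary. Two filter banks are run, one driven by the known input to form W(t) and one driven by the corrupted input to form W(t) + N(t).

```python
import math
import random

def laguerre_outputs(z):
    """W_i = z_0 + ... + z_i, from eqns. 13 and 14."""
    W, running = [], 0.0
    for zi in z:
        running += zi
        W.append(running)
    return W

def step_filters(z, W, u, p, dt):
    """One forward-Euler step of the Laguerre filter states (eqns. 13 and 14)."""
    dz = [-p * z[0] + math.sqrt(2.0 * p) * u]
    dz += [-p * z[i] - 2.0 * p * W[i - 1] for i in range(1, len(z))]
    return [zi + dt * dzi for zi, dzi in zip(z, dz)]

def estimate_with_noise(a_true, noise_amp=0.1, p=1.0, p0=1.0,
                        dt=0.01, t_end=100.0, seed=0):
    """Gradient estimator with input noise n(t) and output noise v(t)."""
    rng = random.Random(seed)
    order = len(a_true)
    z_est = [0.0] * order      # driven by the known input u  -> W(t)
    z_plant = [0.0] * order    # driven by u + n(t)           -> W(t) + N(t)
    a_hat = [0.0] * order
    for k in range(int(round(t_end / dt))):
        t = k * dt
        u = math.sin(t) + math.sin(2.0 * t)
        n_in = rng.uniform(-noise_amp, noise_amp)     # input noise sample
        v_out = rng.uniform(-noise_amp, noise_amp)    # output noise sample
        W = laguerre_outputs(z_est)
        W_plus_N = laguerre_outputs(z_plant)
        y_m = sum(w * a for w, a in zip(W_plus_N, a_true)) + v_out   # eqn. 29
        e1 = sum(w * a for w, a in zip(W, a_hat)) - y_m              # eqn. 30
        a_hat = [a - dt * p0 * w * e1 for a, w in zip(a_hat, W)]     # eqn. 28
        z_est = step_filters(z_est, W, u, p, dt)
        z_plant = step_filters(z_plant, W_plus_N, u + n_in, p, dt)
    return a_hat

a_true = [0.7, 0.05]     # hypothetical coefficients, not taken from the paper
a_hat = estimate_with_noise(a_true)
print(a_hat)
```

In this setup the estimate no longer settles exactly; it fluctuates around the true vector, with a spread that depends on the noise amplitude and on the gain.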
4. Examples
The parameters of a Laguerre model can now be identified for a well-studied
stable time-delay system (see Reference 1 and for a very similar system Reference 6).
Before beginning with the examples, a comment on the choice of p in the Laguerre
function is in order. The Laguerre functions are complete in $L_2(0,\infty)$ so long as p is
positive; therefore, the choice of p is not critical. However, a good choice of p can
potentially reduce the number of terms needed for accurate approximation.
4.1 Example 1
Choose the plant to be a time-delay system with transfer function
$$G(s) = \frac{1}{s + 1 + e^{-2s}} \qquad (32)$$
The input is chosen to be $u(t) = \sin t + \sin 2t$, and a final time of 50 s is used. A fourth-order
model was sought, and $\hat a = [0.7215, 0.0494, 0.0322, 0.0126]$ was produced with a
prediction error of -0.0016. To see if the approximation is truly useful, the author
investigated the frequency response errors for the approximations in Table 1. As the
plots of the error functions demonstrate (Fig. 1), the maximum error decreases except for
the $G - G_3$ case.
Table 1: 2-norm and $\infty$-norm errors for four approximations in Example 1.

| Approximation | $\lVert e \rVert_2$ | $\lVert e \rVert_\infty$ |
|---|---|---|
| $G_0(s) = a_0L_0(s)$ | 0.004 | 0.136 |
| $G_1(s) = a_0L_0(s) + a_1L_1(s)$ | 0.012 | 0.066 |
| $G_2(s) = a_0L_0(s) + a_1L_1(s) + a_2L_2(s)$ | 0.011 | 0.021 |
| $G_3(s) = a_0L_0(s) + a_1L_1(s) + a_2L_2(s) + a_3L_3(s)$ | 0.01 | 0.039 |

In this case, the maximum error is larger but the squared error is smaller.
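Frequency-response errors of this kind can be estimated on a frequency grid. The sketch below does so for a hypothetical plant 1/(s + 2) rather than for the delay system of Example 1, because the exact Laguerre coefficients of that simple plant are known in closed form from eqn. 4. The grid, the pole p = 1, and the function names are assumptions, and a grid maximum is only an estimate of the true supremum.

```python
import math

def laguerre_tf(k, s, p):
    """L_k(s) of eqn. 4 evaluated at a complex frequency s."""
    return math.sqrt(2.0 * p) * (s - p) ** k / (s + p) ** (k + 1)

def truncated_model(coeffs, s, p):
    """G_N(s) = sum of a_i L_i(s) over the supplied coefficients."""
    return sum(a * laguerre_tf(i, s, p) for i, a in enumerate(coeffs))

def sup_error(plant, coeffs, p, omegas):
    """Largest |G(jw) - G_N(jw)| over the grid: an estimate of the infinity-norm error."""
    return max(abs(plant(1j * w) - truncated_model(coeffs, 1j * w, p)) for w in omegas)

p = 1.0                                       # assumed Laguerre pole
plant = lambda s: 1.0 / (s + 2.0)             # hypothetical plant, not eqn. 32
# Exact coefficients of this plant for p = 1: eqn. 4 evaluated at s = 2.
a = [math.sqrt(2.0) / 3.0 ** (k + 1) for k in range(4)]
omegas = [0.01 * n for n in range(10001)]     # 0 to 100 rad/s
errors = [sup_error(plant, a[:N + 1], p, omegas) for N in range(4)]
print(errors)
```

For this plant the error falls by a factor of three with each added term. Identified coefficients, as opposed to optimal ones, need not give such a monotone decrease, which is what the $G - G_3$ row of Table 1 illustrates.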
Fig. 1 Approximation errors for p=1

4.2 Example 2
This example investigates the dependence of the approximation on the pole p.
Approximations using p=0.722 (the pole of the actual plant) and p=4 are compared to
the approximation based on p=1. Fig. 2 shows the error functions for the p=0.722 case,
with corresponding parameter vector $\hat a = [0.7248, 0.0413, 0.0275, 0.0044]$. Fig. 3 shows
the error functions for the p=4 case, with corresponding parameter vector
$\hat a = [0.5796, 0.3307, 0.0515, 0.2964]$. Comparing the error functions, one sees clearly
that the choice of p has a significant impact on the accuracy of the approximation.
Choosing the pole location to coincide with the dominant pole of the given transfer function gives
the most accurate approximation. The case p=4 (equal to the order of the approximation) was
investigated because this pole location was suggested by Glover et al. [1] for a unit
impulse response. Of course, knowledge of the dominant pole (or existence of a
dominant pole) was not assumed in Reference 1. Further, this example is consistent with
the result of Reference 1 in that it demonstrates that the convergence rate for the p=4 case
is the fastest even though the initial error is larger, whereas the convergence rate for the
p=0.722 case is slower even though its initial error is smaller.
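The influence of p on the truncation error can be seen in closed form for a first-order system. The sketch below assumes a hypothetical impulse response h(t) = exp(-qt) with q = 2, whose Laguerre coefficients follow from eqn. 4 evaluated at s = q, and uses orthonormality to obtain the squared 2-norm of the truncation error as the energy of h minus the energy captured by the retained terms. It illustrates the trend only and does not reproduce the delay system of this example.

```python
import math

def coefficient(k, p, q):
    """Laguerre coefficient a_k of h(t) = exp(-q t): eqn. 4 evaluated at s = q."""
    return math.sqrt(2.0 * p) * (q - p) ** k / (q + p) ** (k + 1)

def truncation_energy(n_terms, p, q):
    """Squared 2-norm error after n_terms terms, using orthonormality."""
    h_energy = 1.0 / (2.0 * q)   # integral of exp(-2 q t) over (0, inf)
    return h_energy - sum(coefficient(k, p, q) ** 2 for k in range(n_terms))

q = 2.0   # hypothetical plant pole at s = -q
for p in (0.5, 1.0, 2.0, 4.0):
    print(p, [round(truncation_energy(n, p, q), 6) for n in (1, 2, 4)])
```

When p equals q a single term is exact, whereas other choices leave a remainder that shrinks geometrically with the number of terms at a rate set by $((q-p)/(q+p))^2$.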
4.3 Example 3
This example investigates the effect of input and output noise on the identification
scheme. Five parameter vectors are compared for p=0.722:
(i) optimal parameter vector obtained using the inner product in eqn. 5
(ii) ideal parameter vector obtained using the proposed identification scheme without noise (from the previous example)
(iii) parameter vector obtained when the input is corrupted by noise uniformly distributed on the interval [-0.1, 0.1]
(iv) parameter vector obtained when the output is corrupted by noise uniformly distributed on the interval [-0.1, 0.1]
(v) parameter vector obtained when both the input and the output are corrupted by noise uniformly distributed on the interval [-0.05, 0.05].
The results are summarized in Table 2. This table demonstrates that the no-noise
estimation is very close to optimal; noise at the input has less effect than noise at the
output owing to the ‘momentum’ internal to the system. It is interesting to note that the
$\infty$-norm errors correlate with the two-norm errors.

Fig. 2 Approximation errors for p=0.722
Fig. 3 Approximation errors for p=4

Table 2: Two-norm, $\infty$-norm, and prediction errors with and without noise

| Case | $\lVert e \rVert_2$ | $\lVert e \rVert_\infty$ | $e_p$ |
|---|---|---|---|
| Optimal | 4.597 x 10^-4 | 0.019 | NA |
| No noise | 5.519 x 10^-4 | 0.02 | 0.002 |
| Input and output noise | 5.434 x 10^-4 | 0.021 | 0.028 |
| Input noise | 6.271 x 10^-4 | 0.027 | 0.006 |
| Output noise | 7.327 x 10^-4 | 0.039 | 0.032 |
5. Conclusions
This paper has demonstrated that Laguerre models can be used to identify
parameters in an online setting; therefore, parameter identification based on the Laguerre
models appears suitable for use in self-tuning adaptive controllers. Laguerre models are
useful because they are generic, in that knowledge of the pole p is helpful but not
necessary. In addition, the Laguerre models have a ‘state’ that
does not depend on the parameters to be identified; this is a property shared by other
parameter identification schemes.
6. References
1. GLOVER, K., LAM, J., and PARTINGTON, J. R.: ‘Rational approximation of a class of infinite-dimensional systems: the L2 case’, in NEVAI, P., and PINKUS, A. (Eds.): ‘Progress in approximation theory’ (Academic Press, 1991), pp. 405-440
2. GU, G., KHARGONEKAR, R. P., and LEE, E. B.: ‘Approximation of infinite dimensional systems’, IEEE Trans., 1989, AC-34, pp. 610-618
3. LEBEDEV, N. N.: ‘Special functions and their applications’ (Dover, New York, 1972)
4. MAKILA, P. M.: ‘Approximation of stable systems by Laguerre filters’, Automatica, 1990, 26, pp. 333-345
5. MAKILA, P. M.: ‘Laguerre series approximation of infinite dimensional systems’, Automatica, 1990, 26, pp. 985-995
6. OLIVIER, P. D.: ‘Reduced-order models using optimal Laguerre approximations’, Electron. Lett., 1987, 23, (6), pp. 257-259
7. OLIVIER, P. D., and SCHMALZEL, J. L.: ‘Empirical modeling of long catheters’. Presented at IEEE international symposium on Biomedical Engineering
8. OLIVIER, P. D.: ‘Approximating irrational functions using Lagrange interpolation formula’, IEEE Proc. D, 1992, 139, (1), pp. 9-12
9. PARTINGTON, J. R.: ‘Approximation of delay systems by Fourier-Laguerre series’, Automatica, 1991, 27, (3), pp. 569-572
10. SLOTINE, J.-J. E., and LI, W.: ‘Applied nonlinear control’ (Prentice-Hall, Englewood Cliffs, 1991)
11. SZEGO, G.: ‘Orthogonal polynomials’ (Amer. Math. Soc., 1939), Vol. 23
12. WAHLBERG, B.: ‘System identification using Laguerre models’, IEEE Trans., 1991, AC-36, (5)
13. ZERVOS, C., BELANGER, P. R., and DUMONT, G. A.: ‘On PID controller tuning using orthonormal series identification’, Automatica, 1988, 24, (2), pp. 165-175
14. DUMONT, G. A., and ZERVOS, C. C.: ‘Adaptive control based on orthonormal series representation’. Proceedings of second IFAC workshop on Adaptive systems in control and signal processing, Lund, Sweden, 1988, pp. 371-376
15. HANSEN, F. R., FRANKLIN, G. F., and KOSUT, R.: ‘Closed-loop identification via fractional representation: experiment design’. Proceedings of American Control Conference, Pittsburgh, PA, 1989, pp. 1422-1427
16. KAUTZ, W. H.: ‘Transient synthesis in time domain’, IRE Trans., 1954, CT-1, (3), pp. 29-39