( ) 19 Lecture Notes No.

advertisement
2.160 System Identification, Estimation, and Learning
Lecture Notes No. 19
May 1, 2006
14 Experiment Design
14.1 Review of System ID Theories for Experiment Design
Key Requirements for System ID
• Consistent (unbiased) estimate* θˆN → θ 0
• Small covariance cov θˆ
( )
N
*) Unbiased estimate and consistent estimate are equivalent if the system is ergodic.
[ ]
(1) E θˆN = θ 0
The ensemble mean of θˆN is equal to the true value θ 0 …..unbiased.
(2) lim θˆN → θ 0 The limit of θˆN as N tends to infinity is the true value θ 0 ..consistent
N →∞
Major Theoretical Results
i) Informative data sets
Is data set Z ∞ informative enough to distinguish any two different models W1 ( q)
and W2 ( q) in the same model set M (θ ) ?
(W (e
1
iω
) − W2 ( e iω ) )Z (t ) ≡ 0 → W1 ( e iω ) ≡ W2 ( e iω )
(3)
The data set is Informative (for any linear time-invariant model), if the Spectrum Matrix:
⎡ Φ u (ω ) Φ uy (ω )⎤
Φ z (ω ) = ⎢
⎥ : Positive definite for almost all ω . (4)
⎣Φ yu ( −ω ) Φ y (ω ) ⎦
u (t )
M (θ )
y (t )
ii) Persistence of excitation
Any (n-1)-st order MA filter cannot filter the input to zero.
u (t )
M n (q)
v(t )
M n ( q ) = m1q −1 + " + mn q − n
1
Ru (1) " Ru ( n − 1) ⎤
⎡ Ru (0)
⎢ R (1)
Ru (0) " Ru ( n − 2)⎥
u
⎥
⎢
Persistently Exciting of order n ↔ Rn =
#
⎥
⎢
⎥
⎢
"
Ru (0) ⎦
⎣ Ru ( n − 1)
is Non-singular
In particular for
G ( q, θ ) =
(
q − nk b1 + b2 q −1 + " + bnb q nb +1
1 + f1q −1 + " + f n f q
)
(5)
(6)
−n f
An open-loop experiment with an input sequence u(t) that is persistently exciting of order
nb + n f is informative enough to distinguish any two transfer functions of the above form.
1
0
-1
0
1
2
3
4
5
6
0
1
2
3
4
5
6
0
1
2
3
4
5
6
1
0
-1
n sinusoids
w/different
frequencies
persistently
exciting of
order 2n
Informative
enough w.r.t.
G ( q,θ ) having
at most n poles
and n zeros
1
0
-1
iii) Signal to Noise Ratio
The Prediction-Error Method (PEM) determines parameter vector θ in such a way that a
frequency-weighted squared error norm be minimized. For a fixed noise model:
H (q, θ ) = H fixed ( q )
the following signal-to-noise ratio is the weight of the norm:
2
π
Φ u (ω )
(7)
θˆ = arg min ∫ G0 ( e iω ) − G ( e iω , θ ) ⋅
dω
2
−π
θ
H fixed ( e iω )
Error of transfer
function
Signal to Noise
Ratio
2
iv) Asymptotic Variance
1 N 1 2
∑ ε (t,θ ) . If the true model is involved in the
N t =1 2
model set θ 0 ∈ DM , the variance of parameter estimation error approaches the
following Asymptotic Variance, as the number of data tends to infinity,
Consider PEM with VN (θ , Z N ) =
CovθˆN ~
λ
1
Pθ = 0 E ψ (t , θ 0 )ψ T (t , θ 0 )
N
N
[ (
)]
−1
(8)
which depends on:
• Number of data; N
• Noise level: λ0
•
Sensitivity: ψ (t , θ 0 ) =
d
yˆ (t θ ) θ0
dθ
(9)
The Punch Line
Using many parameters θ ∈ R d (large d) may provide an accurate model with a
small value of bias ( θ 0 ∈ DM ), but a large d tends to worsen the variance and
convergence speed, if some parameters in θ have small sensitivity values,
d
yˆ (t θ ) .
dθ
The above asymptotic variance can be expressed in the frequency-domain as
⎤
1 ⎡ 1 π
1
CovθˆN ~ ⎢ ∫
T ' (eiω , θ 0 )Φ x0 (ω )T 'T (eiω , θ 0 )dω ⎥
−
π
Φ v (ω )
N ⎣ 2π
⎦
d
where T ' (q, θ ) =
T (q, θ ) = [G ' (q, θ ), H ' (q, θ )]
dθ
Φ v (ω ) = λ0 H (e iω , θ 0 )
−1
(10)
2
λ0 = Φ e (ω )
0
14.2 Design Space of System ID Experiments
1) Data Acquisition
• Sensors: What to measure and how to measure
Output variables
Disturbances: If feasible, it is better to measure disturbances
• Actuators: What to manipulate
Inputs physical limit and constraints must be taken into account
• Sampling rate… 2.161 Signal Processing
3
High sampling rate means
ƒ More data, but
ƒ Capturing noise rather than signals
Anti-aliasing
Pre-filter design
2) Model Structure Selection
•
Linear/Non-linear
a) Linear parameterization of nonlinear systems
b) Hammerstein and Wiener Models
u(t)
u(t)
•
•
Nonlinear
Map f (u)
Linear
Dynamic
Model
f [u(t)]
Linear
Dynamic
Model
Nonlinear
Map f (z)
y(t)
y(t)
Hammerstein Model
Wiener Model
Linear Time-Invariant Models
FIR, ARX, ARMA, OE, J-B, State Space
Trade-off between bias (accuracy) and variance (reliability)
System order
3) Input Signal Design
How do we design an input sequence (waveform) to be given to the system so that
data are informative and that the covariance of estimate may be small?
The Punch Line
Both bias and variance conditions are provided in terms of the second-order
properties of input signals. See eqs. (4), (5),(7), (8) and (10). They hinge on signal
power spectra, but signal waveforms are free to choose. This gives a great degree of
freedom in designing an input sequence.
14.3 Input Design for Open-Loop Experiments
In an open-loop system, u(t) and eo(t) are independent. Therefore, Φ ue0 = 0 ,
Φ e0u = 0 in eq. (10).
1 ⎡ 1
CovθˆN ~ ⎢
N ⎣ 2π
π
∫π
−
⎤
1
T ' ( e iω , θ 0 )Φ x0 (ω )T 'T ( e iω , θ 0 )dω ⎥
Φ v (ω )
⎦
−1
=
1
Pθ
N
4
This asymptotic variance Pθ can be broken down to the term comprising the input
spectrum Φ u (ω ) and the one having no input spectrum. Taking inverse of Pθ ,
Pθ
−1
π
= ∫ A(ω )Φ u (ω )dω + B
(12)
−π
where
1
G ' ( e iω , θ 0 )G 'T ( e iω , θ 0 )
2πΦ v (ω )
λ π
1
B= 0 ∫
H ' ( e iω , θ 0 ) H 'T ( e iω , θ 0 )dω
2π −π Φ v (ω )
A(ω ) =
(13)
(14)
The asymptotic variance Pθ must be small for fast convergence. This can be
achieved by allocating the input spectrum such that the first term in (12) becomes
large. That is, the input spectrum should be large at frequencies where the squared
parameter sensitivity of the transfer function, G '( eiω ,θ0 ) 2 , is large. See the figure
below.
G ( e iω , θ + ∆θ )
True Transfer Function G0 ( e iω )
G ( e iω )
Bode plots of G ( e iω ) for
perturbed parameter vector θ + ∆θ
High sensitivity
Large G ' ( e iω )
Low sensitivity
Small
dG ( e iω , θ )
dθ
To achieve fast convergence, choose the input spectrum Φ u (ω ) such that Φ u (ω )
has a large magnitude at frequencies where A(ω ) in (13) has a large magnitude. Note
A(ω ) is proportional to G’G’T; the sensitivity of the transfer function G ( e iω ) to
parameter changes θ . Plotting G ( e iω , θ + ∆θ ) for perturbed parameter vector θ
depicts where to allocate large Φ u (ω ) .
This guide line for input design is only for the spectrum; the actual waveform can
be determined freely by considering other requirements.
5
14.4 Practical Requirements for Input Design
Actuator
Physical
Limit
Plant
Sensor
Physical
Constraints
Dynamic
Range
When conducting experiments, various constraints due to physical limits of the process
must be satisfied. Among others, the input amplitude limit is a common constraint
u ≤ u (t ) ≤ u
(15)
where u and u are lower and upper limits.
In general, the larger the input magnitude u (t ) becomes, the smaller the
asymptotic variance PC becomes. Therefore the input sequence should have large
amplitude most of the time.
Max u (t )
u (t )
Max u (t )
u (t )
Undesirable
time
Desirable
time
To evaluate this aspect of input signal, the following crest factor is often used for zeromean signals:
max u 2 (t )
2
1≤ t ≤ N
,
(16)
Cr ≡
1 N 2
∑ u (t )
N t =1
Note Cr ≥ 1 ; Smaller is better.
14.5 System ID Using Random Signals
The best crest factor (1, the lowest value) is achieved with zero mean binary signals:
6
u ( t ) = ± u . The signal shown below is an example of random binary sequence with an
auto-correlation:
Ru (τ ) = E[u(t )u(t − τ )] = 1 ⋅ δ (τ )
(17)
time
In system identification, random signals are often used for input sequences because of
their superb noise reduction properties. The following is to show this property.
Consider the model description based on impulse response given by
∞
y (t ) = ∑ g ( k )u(t − k ) + v(t )
(18)
k =1
where v(t ) is noise, which is uncorrelated with a random input u(t ) ,
E[ u(t )v(t − τ )] ≡ 0 , for all τ .
(19)
Evaluating the correlation between the input and the output yields
∞
R yu (τ ) = E[ y (t )u(t − τ )] = E[{∑ g ( k )u(t − k ) + v(t )}u(t − τ )]
k =1
∞
= ∑ g ( k ) E[u(t − k )u(t − τ )] + E[ v(t )u(t − τ )]
k =1
0
(20)
∞
= ∑ g ( k ) Ru ( k − τ ) = g (τ )
k =1
Note that the noise term is averaged out when the input-output correlation is
computed. This is particularly useful for noisy systems. If the input is a random sequence
with unit variance, the input-output correlation {R yu ( k ), k = 1,2,"} gives the impulse
response {g ( k ), k = 1,2,"} .
The advantage of the random input signal is clear when it is compared with a
standard impulse response test using the following input:
⎧α , t = 0
u(t ) = ⎨
⎩ 0, t ≠ 0
(21)
The observed output is given by
7
y (t ) = α ⋅ g (t ) + v (t )
If one determines the impulse response from this output:
gˆ (t ) =
y (t )
α
= g (t ) +
v(t )
(22)
α
Note that noise v(t ) shows up in the output y (t ) . To eliminate the effect of v(t ) , the
magnitude of the input α must be very large, which is prohibited in most applications.
Thus, the input-output correlation method using a random input sequence is a powerful
system identification method particularly for noisy systems.
14.6 Pseudo-Random Binary Signal (PRBS)
Now that random binary signals are useful for system identification, how can we
generate them? A Pseudo-Random Binary Signal (PRBS) is a periodic, deterministic
signal with white noise like properties. It has been widely used for system identification
as well as for spread spectrum wireless communication and GPS.
PRBS can be generated easily with a shift register that circulates its output to the
input gate and thereby generates a periodic, long-sequence binary signal. Mathematically,
the signal is written as the following difference equation:
u (t ) = rem ( A( q )u(t ),2)
(23)
= rem ( a1u (t − 1) + " + a n u (t − n ),2)
where rem(x,2) is the remainder as x is divided by 2, i.e. module 2 of x. The output is
therefore binary.
Example: A 5 bit shift register
+
1
1
1
Initial state
+
1
1
0
0
Output u(t )
0
1
u(t + 1)
1
Output u(t )
Binary shift register with feedback
Initial
state
1 1 1 1 1 00011011101010000100101100 1111100011011101010000100101100
31 bits = 25 −1
Repeat the same 31 bits
8
•
•
•
The sequence repeats itself after every sequence of 31 bits.
This sequence is uniquely determined by the feedback law A(q)
M = 2n-1 is the maximum-length sequence.
All the maximum-length Pseudo-Random Binary Signals (PRBS) have the following
mean, covariance, and power spectrum.
Mean:
1
M
M
u
∑ u(t ) = M
(24)
t =1
⎧u 2
k = 0,± M ,±2 M , "
⎪
2
+
=
u
t
u
t
k
(
)
(
)
⎨ u
∑
elsewhere
t =1
⎪⎩ − M
2πu 2 M −1 ⎛
2πk ⎞
Power Spectrum: Φ u (ω ) =
δ ⎜ω −
0 ≤ ω ≤ 2π
⎟
∑
M k =1 ⎝
M ⎠
1
Covariance:
M
M
(25)
(26)
Image removed for copyright reasons.
Features of PRBS
1) The power spectrum of n-th order PRBS with maximum length M = 2n-1 has M-1
frequency peaks between − π ≤ ω ≤ π , and is persistently exciting of order M-1;
high-density, white noise.
2) Crest factor of 1; Maximum strength input sequence
3) Periodic
• Quasi-stationary
• Noise reduction by taking average of samples y(t), y(t+T), y(t+2T),…,
which are outputs to the same input, u(t), u(t+T),…
9
An alternative to PRBS is to generate white, zero-mean Gaussian noise, and take the sign
of the noise for generating a binary sequence. Compared with this regular random binary
signal, PRBS has the following features
4) The regular random signal needs a long-sequence (n>>1) to secure the second
order properties, while PRBS needs a much shorter bit length.
5) PRBS’s spectrum has a special pattern, which facilitates analytic calculations.
PRBS has been used for
™ Global Positioning System
™ Spread-Spectrum, Code-Division Multiple Access (CDMA)
…Cellular Phones, Bluetooth Wireless Networking etc.
14.7 Sinusoidal Inputs.
Considering the requirements for input sequence design, a natural choice of input
is to form it as a sum of sinusoids with different frequencies:
p
u (t ) = ∑ ak cos(ωk t + φk )
(27)
k =1
At steady state ( t : large) this multi-sine function has the following power spectrum:
p
ak2
Φ u (ω ) = 2π ∑ [δ (ω − ωk ) + δ (ω + ωk )]
k =1 4
(28)
Φ u (ω )
π
2
− ωk
− ω1
These should be placed
at frequencies
where A(ω ) in (13) is
large
ak2
ω1 ω2 " ωk
ωp
ω
For each frequency, a pair of non-zero components is generated at ± ωk .
If G (q,θ ) has at most n poles and n zeros, p ≥ n frequencies satisfy the persistently
exciting condition. The problem with the multi-sin input is that, depending on phase
angle φk , the crest factor often becomes vary large. If all the sinusoids are in phase,
φk = 0 ,
10
p
2
Cr2 =
max u (t )
Amplitude of u (t )
=
=
1
signal power
2
(
)
u
t
∑
N t =1
1≤t ≤ N
N
∑a
k =1
k
ak2
∑
k =1 2
p
⇒ 2p
for
ak = a ,
k = 1," p
(29)
If all the sinusoids have the same amplitude, the crest factor is 2 p , which becomes
large as p increases.
Figure 13-5 from Ljung, Lennart. System Identification: Theory for the User. 2nd ed.
Upper Saddle River, NJ: Prentice-Hall, 1999, chapter 13. ISBN: 0136566952.
Image removed for copyright purposes.��
Figure 1 Multi-Sine inputs
As shown in Figure 1 (Figure 13.5 in Ljung’s book), the magnitude becomes very large
9a) when 10 sinusoids with zero phase angles φk = 0 are superimposed; Figure(a). To
11
overcome this problem, the phase angles φk ( k = 1" p ) should be randomized (See Figure
(c) ), or placed at particular angles that are as much out of phase as possible:
k (k − 1)
φk = φ1 −
π
2≤k ≤ p
(30)
p
φ1 = arbitrary
This is called the Schroeder phase choice.
Swept Sinusoids (Chirp Signals)
Continuously varying the frequency between ω1 and ω2 over a time period of
0 ≤ t ≤ T creates a Chirp Signal:
⎛
t2 ⎞
u (t ) = A cos⎜⎜ ω1t + (ω21 − ω1 ) ⎟⎟
2T ⎠
⎝
The instantaneous frequency, ωi , is obtained by differentiating u (t )
t
ωi = ω1 + (ω2 − ω1 )
T
This Chirp signal has the same crest factor as that of a single sinusoid, that is
Figure 2 below.
(31)
(32)
2 . See
Figure 13-6 from Ljung, Lennart. System Identification: Theory for the User.
2nd ed. Upper Saddle River, NJ: Prentice-Hall, 1999, chapter 13. ISBN: 0136566952.
Image removed for copyright purposes.
Figure 2 Chirp Signal and its spectrum
12
Download