Learning Wiener models – starting from the model Division of Automatic Control

advertisement
Learning Wiener models
– starting from the model
Thomas B. Schön
Division of Automatic Control
Linköping University
Sweden
Joint work with (alphabetical order): Michael I. Jordan (UC Berkeley), Fredrik Lindsten
(Linköping University), Lennart Ljung (Linköping University), Brett Ninness (University of
Newcastle, Australia) and Adrian Wills (University of Newcastle, Australia).
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Block-oriented nonlinear models
2(23)
Block-oriented nonlinear models are dynamic models consisting
of interactions of linear dynamic models and static nonlinear models.
v1
ut
Σ
e1
L1
h1 (·)
Σ
e2
v3
Σ
L3
y1
v2
L2
h2 (·)
Learning Wiener Models
Thomas B. Schön
y2
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
The Wiener model
3(23)
We will study one notable member of the class of block-oriented
nonlinear models, the Wiener model.
vt
ut
L
et
zt
h(·)
Σ
yt
A Wiener model is a linear dynamical model (L) followed by a static
nonlinearity (h(·)).
Learning problem: Find L and h(·) based on {u1:T , y1:T }.
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Wiener model – explicit state space equations
Linear Gaussian state space (LGSS) model:
xt + 1
xt
= A B
+ vt ,
ut
vt ∼ N (0, Q),
zt = Cxt .
Static nonlinearity:
• Parametric: yt = h(zt , β) + et , et ∼ N (0, R).
• Non-parametric: yt = h(zt ) + et , et ∼ N (0, R).
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
4(23)
Example of simplifying assumptions
5(23)
Most of the existing work deals with special cases of the general
problem. Typical restrictions imposed are:
•
•
•
•
The nonlinearity h(·) is assumed to be invertible.
The measurement noise et is absent.
The LGSS model is deterministic (vt is absent).
The LGSS model is stochastic, but vt is assumed white.
In the models and solutions provided here we do not have to make
any of these assumptions.
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Outline
6(23)
1. Parametric model (yt = h(zt , β) + et )
2. Semiparametric models (yt = h(zt ) + et )
• The LGSS model is parametric, but the nonlinearity is modelled
using a Bayesian nonparametric model (Gaussian process).
• Data driven model: Same as above, but a sparseness prior
(ARD) is placed on the dynamics.
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Parametric model
7(23)
Linear Gaussian state space (LGSS) model:
xt
xt + 1 = A B
+ vt ,
ut
vt ∼ N (0, Q),
zt = Cxt ,
yt = h(zt , β) + et ,
et ∼ N (0, R).
Maximum Likelihood (ML) amounts to solving,
ML
b
θ = arg max log pθ (y1:T )
θ
where the log-likelihood function is given by
T
log pθ (y1:T ) =
∑ log pθ (yt | y1:t−1 )
t=1
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Where is the challenge?
8(23)
Two challenges:
1. The one-step prediction pdf pθ (yt | y1:t−1 ) has to be computed,
2. In solving the optimisation problem
θbML = arg max log pθ (y1:T )
θ
the derivatives
∂
∂θ pθ (yt
| y1:t−1 ) are useful.
The Expectation Maximisation (EM) algorithm together with a
Particle Smoother (PS) provides a systematic way of dealing with
both of hese challenges.
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
EM-PS (I/II)
9(23)
Rather than studying
arg max log pθ (y1:T ),
θ
the EM algorithm is concerned with
arg max log pθ (y1:T , x1:T ).
θ
The key idea underlying EM is to consider the joint likelihood
function of the observed variables y1:T and the latent variables x1:T .
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
EM-PS (II/II)
10(23)
EM splits the original problem
arg max pθ (y1:N )
θ
into two manageable (and closely linked) subproblems:
1. Compute a conditional expectation
Q(θ, θk ) = Eθk {log pθ (x1:T , y1:T ) | y1:T }
=
Z
log pθ (x1:T , y1:T )pθk (x1:T | y1:T )dx1:T
R. Douc, A. Garivier, E. Moulines, and J. Olsson. Sequential Monte Carlo smoothing for general state space
hidden Markov models. Annals of Applied Probability, 21(6):2109 2145, 2011.
2. Solve a maximisation problem
θk+1 = arg max Q(θ, θk )
θ
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Parametric model – blind Wiener learning (I/IV)
e1,t
h1 (zt , β)
ut
L
zt
Σ
y1,t
e2,t
h2 (zt , β)
Σ
y2,t
Learning problem: Find L and β, r1 , r2 based on {y1,1:T , y2,1:T }.
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
11(23)
Parametric model – blind Wiener learning (III/IV)
• Second order LGSS model with
30
complex poles.
• EM-PS was terminated after just
100 iterations.
25
20
15
Gain (dB)
• Employ the EM-PS with
N = 100 particles.
12(23)
10
5
0
−5
• Results obtained using
T = 1000 samples.
−10
−15
0
0.5
1
1.5
2
Normalised frequency
2.5
3
• The plots are based on 100
realisations of data.
• Nonlinearities (dead zone and
saturation) shown on next slide.
Learning Wiener Models
Thomas B. Schön
Bode plot of estimated mean (black),
true system (red) and the result for all
100 realisations (gray).
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Parametric model – blind Wiener learning (IV/IV)
13(23)
1
0.5
0.8
0.6
Output of nonlinearity 2
Output of nonlinearity 1
0
−0.5
−1
0.4
0.2
0
−0.2
−0.4
−0.6
−1.5
−2
−1.5
−1
−0.5
0
Input to nonlinearity 1
0.5
1
−0.8
−1
−0.5
0
0.5
Input to nonlinearity 2
1
1.5
Estimated mean (black), true static nonlinearity (red) and the result
for all 100 realisations (gray).
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Semiparametric model 1 – known model order
14(23)
First step towards a fully data driven model, the order of the LGSS
model is assumed known.
Parameters: θ = {Γ, Q, r, h(·)}.
Bayesian model specified by priors
• Conjugate priors for Γ = [A B], Q and r,
• p(Γ, Q) = Matrix-normal inverse-Wishart
• p(r) = inverse-Wishart
u1:T
Γ
Q
x1:T
• Gaussian process prior on h(·),
y1:T
h(·) ∼ GP (z, k(z, z0 )).
h(·)
Learning Wiener Models
Thomas B. Schön
r
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Semiparametric model 2 – fully data driven
15(23)
Everything is learned from the data, by introducing the possibility to
switch specific model components on and off.
Parameters: θ = {Γ, Q, δ̄, r, h(·)}.
δ̄
Bayesian model specified by priors
• Sparseness prior (ARD) on Γ = [A B],
nu
• p(Γ | δ̄) = ∏nj=x +
N (γj | 0, δj−1 Inx +nu )
1
nu
• δ̄ = {δj }jn=x +
1 ,
u1:T
Γ
Q
p(δj ) = Gam(δj ; a, b)
x1:T
• Inverse-Wishart prior on Q and r
• Gaussian process prior on h(·),
y1:T
0
h(·) ∼ GP (z, k(z, z )).
h(·)
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
r
Learning algorithm
16(23)
Aim: Devise a Gibbs sampler targeting p(θ, x1:T | y1:T ).
PG-BS for Wiener system identification:
• PG-BS
• Run a conditional PF, targeting p(x1:T | θ, y1:T );
? ;
• Run a backward simulator to sample x1:T
• Draw
? , y ), if MNIW prior, or;
• {Γ? , Q? , r? } ∼ p(Γ, Q, r | h, x1:T
1:T
? , y ), if ARD prior;
• {Γ? , δ̄? , Q? , r? } ∼ p(Γ, δ̄, Q, r | h, x1:T
1:T
? , y ).
• Draw h? ∼ p(h | r? , x1:T
1:T
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Snapshot – where are we?
17(23)
We have introduced two Bayesian semiparametric models:
• Assume the model order of the LGSS model to be known and
employ conjugate priors (MNIW).
• Fully data driven model where everything is learned from data,
made possible via the ARD prior.
We have a learning algorithm (PG-BS) for each model.
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Example – known model order (I/II)
18(23)
with conjugate prior (MNIW).
• 6th order LGSS model and a
saturation.
• Using T = 1000 measurements.
• Employ the PG-BS sampler with
N = 15 particles.
• Run 15000 MCMC iterations,
discard 5000 as burn-in.
Learning Wiener Models
Thomas B. Schön
True
Posterior mean
99 % credibility
0
−5
−10
−15
200
Phase (deg)
• Bayesian semiparametric model
Magnitude (dB)
5
150
100
50
0
0
0.5
1
1.5
2
2.5
3
Frequency (rad/s)
True Bode diagram of the linear
system (black), estimated Bode
diagram (dashed black) and 99%
credibility interval (blue).
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Example – known model order (II/II)
1
19(23)
True
Posterior mean
99 % credibility
0.8
0.6
h(z)
0.4
0.2
0
−0.2
−0.4
−0.6
−0.8
−1
−1.5
−1
−0.5
0
z
0.5
1
1.5
True static nonlinearity (black), estimated posterior mean (dashed
black) and 99% credibility interval (blue).
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Data driven learning (I/II)
20(23)
3000
2500
• Bayesian semiparametric model
2000
with ARD prior.
1500
• 4th order LGSS model.
• Using T = 1000 measurements.
• Employ the PG-BS sampler with
N = 15 particles.
• Run 15000 MCMC iterations,
discard 5000 as burn-in.
1000
500
0
1
2
3
4
5
6
7
8
9
10
11
ARD precision parameters
δj = 1, . . . , 11. The rightmost box
plot corresponds to the input signal.
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
1
True
Posterior mean
99 % credibility
20
True
Posterior mean
99 % credibility
0.8
0.6
10
0.4
0
−10
100
Phase (deg)
21(23)
h(z)
Magnitude (dB)
Data driven learning (I/II)
0.2
0
−0.2
−0.4
0
−0.6
−100
−0.8
−1
−200
−1.5
0
0.5
1
1.5
2
2.5
3
−1
−0.5
0
z
0.5
1
1.5
Frequency (rad/s)
Bode diagram of the 4th-order linear
system. Estimated mean, true system
and 99% credibility intervals (blue).
Learning Wiener Models
Thomas B. Schön
Estimated mean (thick black), 99%
credibility intervals (blue) and the true
static nonlinearity (non-monotonic)
(black).
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Conclusions
22(23)
vt
ut
L
et
zt
h(·)
Σ
yt
• The Wiener model offeres a “controlled” venture into the realm
of nonlinear dynamical models.
• Note the “characterization of the uncertainty”.
• Three cases studied
• Fully parametric (ML – EM-PS)
• Semiparametric (Bayes – PG-BS)
• Data driven semiparametric (Bayes – PG-BS)
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Learning Wiener models – references
23(23)
• Maximum Likelihood approach (EM-PS)
Adrian Wills, Thomas B. Schön, Lennart Ljung and Brett Ninness. Identification of Hammerstein-Wiener
Models. Automatica, 2012. (Accepted for publication)
Thomas B. Schön, Adrian Wills and Brett Ninness. System Identification of Nonlinear State-Space
Models. Automatica, 47(1):39-49, January 2011.
• Bayesian approach (PG-BS)
Fredrik Lindsten, Thomas B. Schön and Michael I. Jordan. Data driven Wiener system identification.
Automatica, 2012 (submitted).
Christophe Andrieu, Arnaud Doucet and Roman Holenstein, Particle Markov chain Monte Carlo methods,
Journal of the Royal Statistical Society: Series B, 72:269-342, 2010.
M ATLAB code is available from www.control.isy.liu.se/˜lindsten/code.html
Learning Wiener Models
Thomas B. Schön
AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET
Download