Gaussian process inference in differential equations

Magnus Rattray
Machine Learning and Optimization Group, University of Manchester
June 15th, 2010

Joint work with Neil Lawrence
Differential equation models

Differential equations are a very popular way to model transcriptional regulation and other cellular processes:

    dx/dt = F(x, θ),    x(t = 0) = x_0

Model-based inference and learning are useful in a number of contexts, e.g.
(1) Inference: can we infer the action of unobserved chemical species?
(2) Learning: can we learn the model parameters θ from data?
(3) Model selection: can we identify the best hypothesis F(x, θ)?
Differential equation model of activation

Linear Activation Model (Barenco et al. Genome Biology 2006)

    dx_j(t)/dt = B_j + S_j f(t) − D_j x_j(t)

x_j(t) – concentration of gene j’s mRNA
f(t) – concentration of active transcription factor (TF)
Model parameters: baseline B_j, sensitivity S_j and decay D_j

Problem 1: how do we fit the model when f(t) is not observed?
Problem 2: how do we deal with the fact that f(t) does not appear on the left-hand side? (the system is “open”)
Why use a model-based approach?

Co-regulated genes can differ greatly in their expression profiles

[Figure: expression time-courses for two co-regulated genes, probes 210764_s_at (CYR61) and 204748_at (PTGS2)]

Simple clustering cannot be relied on to identify co-regulated genes
Concentration of phosphorylated TF in the nucleus is difficult to measure
A model-based inference approach is useful
Representing f(t) as a Gaussian Process

We need a way to represent the TF concentration f(t)

A Gaussian Process (GP) is a distribution over functions

    f(t) ∼ GP(m(t), k(t, t′))

where

    m(t) = E[f(t)]
    k(t, t′) = E[(f(t) − m(t))(f(t′) − m(t′))]

Functional analogue of the Gaussian distribution
Like the Gaussian it has some useful properties for inference
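As a minimal illustrative sketch (not from the slides), a GP prior over f(t) can be sampled on a time grid with NumPy; the squared-exponential covariance introduced later, k(t, t′) = exp(−(t − t′)²/l²), is assumed here, and all function names are mine:

```python
import numpy as np

def sq_exp_cov(t, t_prime, l=1.0):
    """Squared-exponential covariance k(t, t') = exp(-(t - t')^2 / l^2)."""
    diff = t[:, None] - t_prime[None, :]
    return np.exp(-diff**2 / l**2)

def sample_gp_prior(t, l=1.0, n_samples=3, jitter=1e-8, seed=0):
    """Draw samples f ~ GP(0, k) evaluated on the grid t."""
    K = sq_exp_cov(t, t, l) + jitter * np.eye(len(t))  # jitter for numerical stability
    L = np.linalg.cholesky(K)                          # K = L L^T
    rng = np.random.default_rng(seed)
    return L @ rng.standard_normal((len(t), n_samples))  # each column is one sample

t = np.linspace(0.0, 10.0, 50)
samples = sample_gp_prior(t, l=2.0)  # shape (50, 3)
```

Each column is a draw from the prior; increasing l produces smoother sample paths.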
From a Gaussian distribution to a Gaussian Process

Samples from a 25-dimensional Gaussian distribution

[Figure: sample vectors f_n plotted against index n = 1, …, 25, alongside the 25 × 25 covariance matrix]

By making nearby points correlated the samples seem to follow a line
Samples from a Gaussian Process

[Figure: sample paths drawn from a GP with squared-exponential covariance, for two different length-scales]

    k_{f,f}(t, t′) = exp( −(t − t′)² / l² )

The parameter l determines the time-scale for changes in f(t)
Linear activation model

Recall the linear model

    dx_j(t)/dt = B_j + S_j f(t) − D_j x_j(t).

This differential equation can be solved for x_j(t) as

    x_j(t) = B_j/D_j + S_j ∫_0^t e^{−D_j(t−u)} f(u) du.

Note: this is a linear operation on f(t)
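The integral solution above can be evaluated numerically for any sampled f(t); a sketch using the trapezoid rule (illustrative only, not the implementation behind the slides):

```python
import numpy as np

def predicted_expression(t_grid, f_vals, B, S, D):
    """x(t) = B/D + S * integral_0^t exp(-D (t - u)) f(u) du.

    t_grid: increasing times starting at 0; f_vals: f evaluated on t_grid.
    The integral is approximated with the trapezoid rule.
    """
    x = np.empty_like(t_grid, dtype=float)
    for i, t in enumerate(t_grid):
        u = t_grid[: i + 1]
        w = np.exp(-D * (t - u)) * f_vals[: i + 1]  # integrand on [0, t]
        if i == 0:
            integral = 0.0
        else:
            du = np.diff(u)
            integral = np.sum(0.5 * (w[1:] + w[:-1]) * du)
        x[i] = B / D + S * integral
    return x
```

With f held constant at f_0, this converges to the closed form B/D + S f_0 (1 − e^{−Dt})/D, which makes a handy sanity check.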
Covariance function

Any linear operation on a GP =⇒ related GP

    f(t) ∼ GP(0, k_{f,f}(t, t′))  =⇒  x_j(t) ∼ GP(B_j/D_j, k_{x_j,x_j}(t, t′))

with covariance (i = j) and cross-covariances (i ≠ j) between genes:

    k_{x_i,x_j}(t, t′) = S_i S_j ∫_0^t ∫_0^{t′} e^{−D_i(t−u) − D_j(t′−u′)} k_{f,f}(u, u′) du du′

and cross-covariance between x_j(t) and f(t):

    k_{x_j,f}(t, t′) = S_j ∫_0^t e^{−D_j(t−u)} k_{f,f}(u, t′) du.
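These covariances can be checked numerically; a sketch of the x–f cross-covariance via one-dimensional quadrature, assuming the squared-exponential k_{f,f} (illustrative; names are mine):

```python
import numpy as np

def cross_cov_xf(t, t_prime, S, D, l, n=2000):
    """k_{x,f}(t, t') = S * integral_0^t exp(-D (t - u)) k_ff(u, t') du."""
    u = np.linspace(0.0, t, n)
    kff = np.exp(-(u - t_prime) ** 2 / l**2)        # squared-exponential k_ff
    w = np.exp(-D * (t - u)) * kff
    du = np.diff(u)
    return S * np.sum(0.5 * (w[1:] + w[:-1]) * du)  # trapezoid rule
```

In the limit of a very long length-scale, k_ff → 1 on the integration range and the integral reduces to S (1 − e^{−Dt})/D, which gives a simple check.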
Inferring the transcription factor concentration f(t)

Under the linear model, we have the joint process

    [f, x] ∼ GP( [0, B/D], [[K_ff, K_fx], [K_xf, K_xx]] )

Bayesian GP regression gives the predicted process as p(f | x) = GP(⟨f⟩_post, K_ff^post) where

    ⟨f⟩_post = K_fx K_xx^{−1} (x − B/D)
    K_ff^post = K_ff − K_fx K_xx^{−1} K_xf

Note: in practice x is only observed, with noise, at a small number of times

    y_jt = x_j(t) + η_jt
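The posterior formulas above are the standard GP regression equations; a minimal NumPy sketch (illustrative, all names mine; `noise_var` accounts for the observation noise η mentioned in the note):

```python
import numpy as np

def gp_posterior(Kff, Kfx, Kxx, x_obs, mean_x, noise_var=0.0):
    """Posterior mean and covariance of f given observations of x.

    mean = Kfx Kxx^{-1} (x - mean_x),  cov = Kff - Kfx Kxx^{-1} Kxf,
    where mean_x plays the role of B/D in the slides.
    """
    Kxx_noisy = Kxx + noise_var * np.eye(len(x_obs))  # add observation noise
    solve = np.linalg.solve
    mean = Kfx @ solve(Kxx_noisy, x_obs - mean_x)
    cov = Kff - Kfx @ solve(Kxx_noisy, Kfx.T)         # Kxf = Kfx^T by symmetry
    return mean, cov
```

A quick sanity check: predicting x at its own observation points with no noise (K_ff = K_fx = K_xx) reproduces the data exactly with zero posterior variance.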
Artificial Example – inferring f(t) from noiseless data

6 data points from 3 genes
Zero-noise observations
Known kinetic parameters:

    j     B_j          S_j    D_j
    1     0.0          1.0    1.0
    2     7.5 × 10−2   0.4    5 × 10−2
    3     2.5 × 10−3   0.4    1 × 10−3
Artificial Example – inferring f(t) from noiseless data

[Figure: inferred f(t) with credible intervals over t ∈ [0, 20], alongside the fitted expression profiles for the three genes with the kinetic parameters above]
Parameter learning

A likelihood function for the model parameters θ = {B_j, S_j, D_j}, j = 1, …, d, and time-scale l is obtained by integrating out the latent function f:

    L(θ, l) = ∫ ( ∏_{t=1}^T p(y_t | f(t), θ) ) p(f | l) df

Parameters are obtained by maximum likelihood or Bayesian MCMC
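Because the linear model is jointly Gaussian, this integral has the closed form of a Gaussian marginal likelihood; a sketch of its log via a Cholesky factorisation (illustrative; in the slides' setting y would collect the observed expression values and K would be the K_xx induced by θ and l, after subtracting the B/D mean):

```python
import numpy as np

def gp_log_marginal_likelihood(y, K, noise_var):
    """log N(y | 0, K + noise_var * I), computed via Cholesky for stability."""
    n = len(y)
    Ky = K + noise_var * np.eye(n)
    L = np.linalg.cholesky(Ky)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # Ky^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))                 # 0.5 * log det Ky
            - 0.5 * n * np.log(2.0 * np.pi))
```

Maximum-likelihood learning then amounts to maximising this quantity over the kinetic parameters and l.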
Artificial Example – inferring f(t) and learning parameters

[Figure: inferred f(t) with credible intervals, and the learned baseline B, sensitivity S and decay D for genes 1–5]
p53, data from Barenco et al. Genome Biology 2006

[Figure: inferred TF concentration f(t) and fitted expression profiles x(t) over t ∈ [0, 12]]

Learning parameters and inferring f(t) from training genes
Good correspondence with protein data from westerns
Elk-1, data from Amit et al. Nature Genetics 2007

[Figure: inferred transcription factor concentration over time, with fits to Training Genes 1–5 over 0–8 h]

Learning parameters and inferring f(t) from training genes
Elk-1 target ranking

[Figure: model fits for a predicted target gene (good fit) and a predicted non-target gene (poor fit) over 0–8 h]

Likelihood can be used for ranking putative targets
Examples of a good and a bad fit are shown above
Nonlinear response models

Consider the following modification to the model,

    dx_j(t)/dt = B_j + S_j g_j(f(t)) − D_j x_j(t),

where g_j(·) is a non-linear function.

The differential equation can still be solved,

    x_j(t) = B_j/D_j + S_j ∫_0^t e^{−D_j(t−u)} g_j(f(u)) du,

but is no longer linear in f(t).
Approximate inference

1. Laplace approximation to the required integral:

    p(f | x) ≈ GP(f̂, A^{−1}) ∝ exp( −(1/2) (f − f̂)^T A (f − f̂) )

   where f̂ = argmax_f p(f | x) and A = −∇∇ log p(f | x) |_{f = f̂}

2. MCMC: sampling f(t) on a dense grid requires well-designed Metropolis–Hastings moves (Titsias et al. NIPS 2009)
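For intuition, the Laplace recipe can be sketched in one dimension on a grid: locate the mode f̂ and use the negative curvature there as the precision A (illustrative only; the slides apply it to the whole function f, not a scalar):

```python
import numpy as np

def laplace_approx_1d(log_post, grid):
    """Gaussian approximation N(f_hat, 1/A) to exp(log_post) on a uniform grid."""
    vals = log_post(grid)
    i = int(np.argmax(vals))           # mode f_hat (assumed to lie in the interior)
    h = grid[1] - grid[0]
    A = -(vals[i + 1] - 2 * vals[i] + vals[i - 1]) / h**2  # -d^2/df^2 at the mode
    return grid[i], 1.0 / A            # (mode, variance)
```

On an exactly Gaussian log-posterior the approximation recovers the true mean and variance, since the finite-difference curvature is exact for a quadratic.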
Michaelis-Menten kinetics

The Michaelis-Menten activation model uses the following non-linearity

    g_j(f(t)) = e^{f(t)} / (γ_j + e^{f(t)}),

where the GP f(t) now models the log of the TF activity.

[Figure: inferred log TF activity f(t) under the nonlinear model, over t ∈ [0, 12]]
Repression Model

We can use an analogous model of repression,

    g_j(f(t)) = 1 / (γ_j + e^{f(t)})

Recall the solution to the ODE (with an additional transient term)

    x_j(t) = α_j e^{−D_j t} + B_j/D_j + S_j ∫_0^t e^{−D_j(t−u)} g_j(f(u)) du
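The two response non-linearities are simple to write down; a sketch (assuming γ_j > 0 and f on the log-activity scale, per the slides):

```python
import numpy as np

def g_activation(f, gamma):
    """Michaelis-Menten activation: exp(f) / (gamma + exp(f))."""
    ef = np.exp(f)
    return ef / (gamma + ef)

def g_repression(f, gamma):
    """Analogous repression response: 1 / (gamma + exp(f))."""
    return 1.0 / (gamma + np.exp(f))
```

Both map the unconstrained GP value f into a bounded, positive transcription rate: activation increases and saturates at 1, while repression decays towards 0 as TF activity grows.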
Results for the repressor LexA

    gene id   B_j       D_j    S_j    α_j     γ_j
    umuC      3 × 106   0.01   0.78   1.72    1.06
    dinI      0.12      0.16   0.39   0.05    0.81
    recN      5.94      1.00   3.68   −5.74   1.10

[Figure: fitted expression x(t) and inferred f(t) for the LexA targets over t ∈ [0, 60]]
Summary

We can use a model to infer the concentration of quantities that are difficult to measure: hidden variables
Application to ranking putative targets of a phosphorylated TF
General principle can be applied to any “open” biochemical system
Learning requires integrating out the hidden variable to derive the likelihood function
Model selection requires further integrating out the model parameters to derive the marginal likelihood

• Honkela et al. Proc. Natl. Acad. Sci. USA 107(17), 7793–7798 (2010).
• Gao et al. Bioinformatics 24(16), i70–i75 (2008).
• Lawrence, Rattray, Gao, Titsias. “Gaussian processes for missing species in biochemical systems” in Learning and Inference in Computational Systems Biology (MIT Press, Cambridge, MA, 2010).
Advertisements

Two post-doc positions available to work on gene regulatory network inference – see my homepage http://www.cs.man.ac.uk/~magnus