Switching Regulatory Models of Cellular Stress Reaction

Guido Sanguinetti
Joint work with M. Opper, A. Ruttor and C. Archambeau
Computer Science / ChELSI, The University of Sheffield
StoMP, Jul 2009

Outline of the talk
1. Stochastic models and inference
2. Basic problem
3. Single Input Motif (Model, Results)
4. Conclusions and future work

Where does stochasticity come from?
My personal view: three types of stochasticity
- Intrinsic stochasticity: a deterministic description of the system is not appropriate regardless of the amount of information available (e.g. quantum mechanics, perhaps single cell protein production)
- Induced stochasticity: the system is truly deterministic but we have no information about some parts of the system, hence for all practical purposes it should be modelled as a stochastic system
- Noise: the system is deterministic but observations are noisy, hence stochasticity

Bayesian inference
- A stochastic model defines a probability distribution over its variables x
- We can therefore use observations \hat{x} to update the model using Bayes’ theorem
  \[ p(x \mid \hat{x}) = \frac{p(\hat{x} \mid x)\, p(x)}{p(\hat{x})} \]
- p(x) is our prior model
- p(\hat{x} \mid x) is the likelihood, connecting observations and model
- The updated belief, or posterior, is the prior re-weighted by the likelihood
- The bottleneck is computing the marginal
  \[ p(\hat{x}) = \sum_x p(\hat{x} \mid x)\, p(x) \]
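To make the update rule concrete, here is a minimal sketch of Bayes' theorem on a small discrete model, where the marginal is a plain sum and therefore cheap. The coin-bias setting, the grid of values and the names `posterior`, `grid`, `heads` and `tosses` are illustrative assumptions, not part of the talk.

```python
from math import comb

def posterior(prior, likelihood):
    """Discrete Bayes update: returns (posterior dict, marginal p(x_hat))."""
    # Joint p(x_hat | x) p(x) for every value of x
    joint = {x: likelihood(x) * p for x, p in prior.items()}
    # Marginal p(x_hat) = sum_x p(x_hat | x) p(x)
    marginal = sum(joint.values())
    return {x: j / marginal for x, j in joint.items()}, marginal

# Hypothetical example: x is the bias of a coin on a coarse grid, uniform prior
grid = [i / 10 for i in range(1, 10)]
prior = {x: 1 / len(grid) for x in grid}

# Hypothetical observation x_hat: 7 heads in 10 tosses (binomial likelihood)
heads, tosses = 7, 10

def likelihood(x):
    return comb(tosses, heads) * x**heads * (1 - x) ** (tosses - heads)

post, evidence = posterior(prior, likelihood)
best = max(post, key=post.get)  # posterior mode on the grid
```

With nine grid points the sum is trivial; the bottleneck mentioned above appears when x is high-dimensional or, as in the rest of the talk, a whole trajectory.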
Basic problem
- We wish to predict the dynamics of transcription factors (TFs) during adaptation to stress, based on time-course mRNA profiles
- Example: E. coli transition between aerobic and anaerobic states (see previous talk)
- We are not just interested in steady states
- Because TF activity is difficult to measure, we treat this as an inference problem with TFs as latent variables (functions)
- We are not the first to think of it: see Liao et al., Sabatti and James, Khanin et al., Barenco et al., Rogers et al., Lawrence et al., ...

Single Input Motif
- In general, most genes have complex promoter structures with several TFs interacting
- The single input motif (SIM) is a specific network motif where several genes are controlled by a single TF
- The TFs at the input of a SIM are generally master regulators: TFs that control hundreds of genes and are generally associated with large shifts in cellular behaviour

Basic problem
- Consider an ODE model of SIM dynamics
  \[ \frac{dx_i(t)}{dt} = g(f(t), \theta_i) + b_i - \lambda_i x_i(t) \]
- Given time-course observations of the expression levels of the target genes x_i, infer the profile of the transcription factor f and the model parameters \theta_i, b_i and \lambda_i
- Problem originally considered by Barenco et al. (linear dependence on the TF), and then Lawrence et al., Khanin et al., Rogers et al., ...
- We wish to hardwire into the model the fast dynamics of stress response

Switching latent process
- We assume TF activity jumps quickly from zero to saturation level
  \[ \frac{dx_i(t)}{dt} = A_i \mu(t) + b_i - \lambda_i x_i(t) \qquad (1) \]
  where \mu(t) \in \{0, 1\}
- The driving process \mu(t) is modelled as a two-state Markov jump process, also known as a telegraph process. NB: this is not a logical approximation to Michaelis-Menten.
- Given transition rates f_{0,1}(t) for the process, the probability p_1(t) that \mu(t) = 1 at a given time is given by the following Master equation
  \[ \frac{dp_1(t)}{dt} = -(f_1 + f_0)\, p_1(t) + f_1(t) \qquad (2) \]

Why?
Figure: Left: Michaelis-Menten activation as a function of f. Right: Michaelis-Menten activation as a function of t, with f starting to change exponentially at t = 200.
- Biologically meaningful?
- More identifiable?

Inference
- Exact inference is theoretically possible for this model
- It relies on a forward-backward procedure, which involves iteratively solving PDEs numerically
- An alternative is to use a variational formulation and find an optimal approximate solution
- We compute the Kullback-Leibler (KL) divergence between the posterior process and an approximating telegraph (Markov) process q(\mu \mid g_\pm)
  \[ \mathrm{KL}[q \,\|\, p_{\mathrm{post}}] = \ln Z + \mathrm{KL}[q \,\|\, p_{\mathrm{prior}}] - \sum_{j=1}^{N} \mathbb{E}_q\left[ \ln p(\hat{x}_j \mid x(t_j)) \right] \]

Toy example
Figure: Left: variational inference, A = (2.3 ± 0.5) × 10^-3, b = (1.0 ± 0.2) × 10^-3, λ = (4 ± 0.3) × 10^-3. Right: exact inference, A = (3.2 ± 1.1) × 10^-3, b = (0.08 ± 0.6) × 10^-3, λ = (3.1 ± 1.3) × 10^-3. True values: A = 3.7 × 10^-3, b = 0.8 × 10^-3, λ = 5 × 10^-3.

FNR regulation
- As a real example on which to test our approach, we considered transcriptomic measurements of the reaction of E. coli to sudden oxygen starvation
- When oxygen is removed, Fe-S clusters are generated which dimerise and activate the master regulator FNR
- FNR activation is thought to be the main channel used by the bacterium to switch between aerobic and nitrate metabolism

FNR regulation
Figure: Results on E. coli data: (a) posterior mean FNR profile; (b) half-lives of targets (in minutes) with uncertainty, inferred (triangles on the right) versus experimentally measured, for ompW, yjiD, hypB, moaA and aspA. No measurement of the half-life of yjiD or moaA is available.
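As an illustration of the forward model behind equations (1) and (2), the following sketch samples a telegraph process, drives the expression ODE with it, and integrates the Master equation. It is a simulation only, not the inference procedure of the talk. It assumes constant switching rates and forward Euler steps, reads f1 as the 0 → 1 rate and f0 as the 1 → 0 rate (consistent with the gain term in equation (2)), and uses hypothetical values for `f0` and `f1`. A, b and λ are the true values quoted for the toy example.

```python
import random

def sample_telegraph(f0, f1, t_end, dt, rng, mu0=0):
    """Sample mu(t) in {0, 1} on a time grid with constant rates (assumption)."""
    mu, path = mu0, []
    for _ in range(int(t_end / dt)):
        path.append(mu)
        rate = f1 if mu == 0 else f0   # f1: 0 -> 1 rate, f0: 1 -> 0 rate
        if rng.random() < rate * dt:   # first-order switching probability in dt
            mu = 1 - mu
    return path

def integrate_expression(path, A, b, lam, dt, x0):
    """Forward Euler for equation (1): dx/dt = A*mu(t) + b - lam*x."""
    x, xs = x0, []
    for mu in path:
        xs.append(x)
        x += dt * (A * mu + b - lam * x)
    return xs

def integrate_master(f0, f1, t_end, dt, p0=0.0):
    """Forward Euler for equation (2): dp1/dt = -(f1 + f0)*p1 + f1."""
    p, ps = p0, []
    for _ in range(int(t_end / dt)):
        ps.append(p)
        p += dt * (-(f1 + f0) * p + f1)
    return ps

A, b, lam = 3.7e-3, 0.8e-3, 5e-3   # true values from the toy example
f0, f1 = 0.002, 0.01               # hypothetical constant switching rates
dt, t_end = 1.0, 1000.0
rng = random.Random(0)             # fixed seed for reproducibility

path = sample_telegraph(f0, f1, t_end, dt, rng)
xs = integrate_expression(path, A, b, lam, dt, x0=b / lam)
ps = integrate_master(f0, f1, t_end, dt)
```

With constant rates, p1 relaxes to f1 / (f1 + f0), and the expression level stays between the "off" steady state b/λ and the "on" steady state (A + b)/λ.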
Conclusions
- We have proposed a novel TF inference framework which could arguably describe some biological conditions better
- It is of interest in its own right as an example of a hybrid discrete-continuous (and stochastic/deterministic) model
- It can identify both the time profile of TF activity and the model parameters, including non-trivial interaction terms

Future work
- What type of data do we need for a large-scale application?
- Hierarchical models of transcriptional regulation (e.g. FFL)
- Try to model the dynamics of the signal too (e.g. oxygen metabolism)
- Consider SDEs driven by telegraph noise, look at single-cell data