Integrating my interests in stochastic chemical kinetics and metabolic flux balance analysis.

Andrzej M. Kierzek
Reader in Systems Biology,
Faculty of Health and Medical Sciences,
University of Surrey,
Guildford,
GU2 7XH, UK
Thursday, 7 April 2011
Stochastic kinetic models and FBA - extremes
of biochemical network modelling spectrum.
• Stochastic chemical kinetics: Most detailed
dynamic models of best studied systems
where kinetic parameters are known. The time
evolution of the distribution of the number of
molecules in individual cells is studied. (Mol.
BioSyst 2010, Biophys J 2009, Biophys J
2004, Bioinformatics 2002, J. Biol. Chem
2001)
• Flux Balance Analysis: Least detailed steady
state models are studied. The reaction fluxes
are calculated and no information is obtained
about molecular amounts/activities. Genome
scale networks are reconstructed and
analysed (PLoS Comp. Biol. 2011,
Bioinformatics 2011, Metabolic Engineering
2008, Genome Biology 2007).
I. Stochastic computer simulations of the
biochemical reaction network dynamics.
Acknowledgements
The work on stochastic dynamics of two component
system signalling has been done in collaboration with the
group of Barry Wanner, Purdue University, USA ….
… which was established during informal
“Warsaw Pact” consortium meeting in
Warsaw and Zelazowa Wola 2004
Flow cytometry of a two-component signalling system.
[Figure: schematic of a two-component system. Labels: Signal, Kinase, Regulator, P, GFP reporter.]
Phosphotransfer reactions involving formation of regulator-kinase complexes.
Zhou et al (Barry L. Wanner’s group, Purdue) Yearbook Bioinformatics 2004, R. Hofestädt (ed.), pp.
25-38. Magdeburg: IMBio, Informationsmanagement in der Biotechnologie e.V., 2005.
Stochastic kinetic model of TCS signalling.
Two component system signalling in Escherichia coli.
Kierzek, Zhou, Wanner, Molecular Biosystems 2010.
Propensity: c9 · #mRNAKin, where c9 = 10^-4 1/s
Exact stochastic simulation with
Gillespie algorithm.
1. Set the initial state of the system:
   x0 = (X1(t0), ..., XN(t0))
2. For every reaction Rµ compute the propensity aµ(x).
3. Compute the sum of the propensity functions:
   a0 = Σ_{j=1..M} aj(x)
4. Randomly select reaction Rµ and waiting time τ such that:
   P(τ,µ) dτ = aµ exp(-a0 τ) dτ
   P(µ) = aµ / a0
   P(τ) dτ = a0 exp(-a0 τ) dτ
5. Update the system: x(t+τ) = x(t) + νµ
6. Set the simulation time to t + τ.
7. Go to 2.
S+E->ES c = 1 1/s
ES->E+S c = 10 1/s
ES->P+E c = 1 1/s
Initial conditions: #S = 100; #E = 20
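The algorithm can be sketched in a few lines of Python for the enzyme example above. This is a minimal illustration, not the simulator used in the cited papers: the function names, the state ordering (S, E, ES, P) and the mass-action form of the propensities (c · #S · #E for the bimolecular step, c · #ES for the others) are assumptions made here.

```python
import random

# Reactions from the slide: S+E->ES (c=1), ES->E+S (c=10), ES->P+E (c=1).
# State order is (S, E, ES, P); this ordering is an illustrative choice.
RATE_CONSTANTS = (1.0, 10.0, 1.0)
STATE_CHANGES = (
    (-1, -1, +1, 0),  # S + E -> ES
    (+1, +1, -1, 0),  # ES -> E + S
    (0, +1, -1, +1),  # ES -> P + E
)


def propensities(state):
    """Assumed mass-action propensities a_mu(x) for the three reactions."""
    s, e, es, _p = state
    return (RATE_CONSTANTS[0] * s * e,
            RATE_CONSTANTS[1] * es,
            RATE_CONSTANTS[2] * es)


def gillespie(state, t_end, rng):
    """Return a list of (time, state) pairs up to time t_end."""
    state = tuple(state)
    t = 0.0
    trajectory = [(t, state)]
    while True:
        a = propensities(state)
        a0 = sum(a)
        if a0 == 0.0:
            break  # no reaction can fire any more
        tau = rng.expovariate(a0)  # waiting time with density a0*exp(-a0*tau)
        if t + tau > t_end:
            break
        # Select reaction mu with probability a_mu / a0.
        threshold = rng.random() * a0
        mu = max(i for i, a_i in enumerate(a) if a_i > 0.0)  # rounding fallback
        cumulative = 0.0
        for i, a_i in enumerate(a):
            cumulative += a_i
            if threshold < cumulative:
                mu = i
                break
        state = tuple(x + d for x, d in zip(state, STATE_CHANGES[mu]))
        t += tau
        trajectory.append((t, state))
    return trajectory


# Initial conditions from the slide: #S = 100, #E = 20.
trajectory = gillespie((100, 20, 0, 0), t_end=2.0, rng=random.Random(1))
print("events:", len(trajectory) - 1, "final state:", trajectory[-1][1])
```

Each run gives one trajectory; the distribution of molecule numbers is obtained by repeating the simulation with different random seeds.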
Validation of the model.
The effect of transcription and translation initiation
frequencies on the stochastic fluctuations in
prokaryotic gene expression.
If two genes produce the same mean number of protein molecules the one with
higher transcription and lower translation exhibits smaller variance of the number
of protein molecules.
Rigney, D. R. & Schieve, W. C. J. Theor. Biol. 69, 761–766 (1977).
Berg, O. G. J. Theor. Biol. 173, 307–320 (1978).
Kierzek A.M, et al. J. Biol. Chem. 276, 8165-8172 (2001)
M. Thattai and A. van Oudenaarden, Proc. Natl. Acad. Sci. U. S. A., 2001, 98, 8614–8619.
E. M. Ozbudak, et al., Nat. Genet., 2002, 31, 69–73.
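The statement above can be checked with the steady-state formulas of the standard two-stage (mRNA, protein) model analysed by Thattai and van Oudenaarden: mean = kR·kP/(γR·γP) and variance/mean = 1 + kP/(γR + γP). The sketch below is only an illustration; the function name and all rate values are hypothetical and are not taken from the TCS model.

```python
def protein_mean_and_fano(k_transcription, k_translation, mrna_decay, protein_decay):
    """Steady-state mean and Fano factor (variance/mean) of the protein number
    in the two-stage model: mRNA made at k_transcription, each mRNA translated
    at k_translation, first-order decay of mRNA and protein."""
    mean = k_transcription * k_translation / (mrna_decay * protein_decay)
    fano = 1.0 + k_translation / (mrna_decay + protein_decay)
    return mean, fano


# Hypothetical rates (1/s); both genes give the same mean protein number.
MRNA_DECAY, PROTEIN_DECAY = 0.01, 0.001
slow_tx_fast_tl = protein_mean_and_fano(0.01, 1.0, MRNA_DECAY, PROTEIN_DECAY)
fast_tx_slow_tl = protein_mean_and_fano(0.1, 0.1, MRNA_DECAY, PROTEIN_DECAY)

for label, (mean, fano) in (("slow transcription, fast translation", slow_tx_fast_tl),
                            ("fast transcription, slow translation", fast_tx_slow_tl)):
    print(f"{label}: mean = {mean:.0f}, variance = {fano * mean:.0f}")
```

With equal means, the gene with the higher transcription and lower translation rate has the smaller Fano factor and therefore the smaller variance.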
Stochastic fluctuations in expression of TCS
genes.
[Figure panels: slow transcription, fast translation; fast transcription, slow translation.]
Dependence of stochastic switching behaviour of
TCS on the HK gene expression noise and RR
autoregulation.
[Figure panel labels: Autoregulated RR, Constitutive RR; Low HK noise, High HK noise.]
Mixed mode response.
II. Flux Balance Analysis of genome scale
metabolic networks.
Should I wait until enough quantitative parameters are available to run
dynamic simulations, and resist the temptation of working on projects like
TB-HOST-NET, or should I use approximate methods to study gene function now?
TB-HOST-NET: Integration of computational modelling with transcription and gene essentiality profiling
of both MTB bacillus and infected human dendritic cells and macrophages to understand molecular
interaction networks involved in host-pathogen cross-talk.
Stewart GR, Robertson BD, Young Nat Rev Microbiol. 2003
EraSysBio+ TB-HOST-NET Partners
Andrzej M. Kierzek (coordinator, Surrey)
Graham Stewart (University of Surrey)
Johnjoe McFadden (University of Surrey)
Steffen Klamt (Max Planck Institute)
Olivier Neyrolles (CNRS)
Ludovic Tailleux (Institut Pasteur)
Maria Foti (University of Milan-Bicocca)
Flux Balance Analysis – a constraint based
approach
[Figure: example network with fluxes F1,..,F8 and external metabolites Axt, Bxt, Cxt, Dxt. Adapted from FluxAnalyzer software (Steffen Klamt, MPI Magdeburg).]
Figure labels:
• growth (dX/dt): value to be maximised (objective function).
• Axt, Bxt, Cxt, Dxt: transport of extracellular (external, unbalanced) metabolites.
Find maximal dX/dt if the following constraints are satisfied:
• Minimal and maximal reaction capacities (bounds). R4 is the only reversible reaction in the system.
• Steady state (flux balance) assumption for intracellular (internal) metabolites.
The linear programming algorithm finds the largest possible value of dX/dt. However, there are many possible values of fluxes (F1,..,F8) that result in the same maximal value of the objective function.
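The optimisation described on this slide is commonly written as a linear programme. The block below is a sketch in standard notation; the symbols S (stoichiometric matrix restricted to internal metabolites), v (the flux vector, F1,..,F8 on the slide) and the bound vectors are notation assumed here, not taken from the slide.

```latex
\begin{aligned}
\max_{v}\quad & v_{\mathrm{growth}} \\
\text{subject to}\quad & S\,v = 0
  && \text{(steady state for internal metabolites)} \\
& v_i^{\min} \le v_i \le v_i^{\max}, \quad i = 1,\dots,8
  && \text{(reaction capacities)}
\end{aligned}
```

In this notation an irreversible reaction has a lower bound of zero, while a reversible reaction such as R4 has a negative lower bound. The maximum of the objective is unique, but the flux vector v that attains it generally is not.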
GSMN-TB: a web-based genome-scale network model of Mycobacterium tuberculosis metabolism.
Dany JV Beste*, Tracy Hooper*, Graham Stewart, Bhushan Bonde, Claudio Avignone-Rossa, Michael E Bushell, Paul Wheeler, Steffen Klamt, Andrzej M Kierzek#, Johnjoe McFadden#
* Joint first authors
# Joint senior authors
Genome Biology 2007, 8(5):R89
Web software available at: http://sysbio3.fms.surrey.ac.uk/
GSMN-TB model and SurreyFBA
Reaction class                Number
Enzymatic conversions         723
Transport reactions           126
Total number of reactions     849
Orphan reactions              210
Genes                         726
Internal metabolites          638
External metabolites          101
Total number of metabolites   739
Screening for essential genes by Transposon Site
Hybridisation (TraSH)
Comparison of gene essentiality prediction with
TraSH data.
For each gene in the model {
Constrain all fluxes that require this gene to 0.
Run FBA.
If the flux towards BIOMASS is less than cut-off {
the gene is essential.
} else {
the gene is non-essential
}
}
Compare list of predicted essential genes with the list of genes with TraSH
microarray signal ratio (insertion probe/genomic probe) lower than cut-off.
Classify genes into the following categories:
TP: essential in the model and essential in experiment.
FP: essential in the model, non-essential in experiment.
TN: non-essential in the model, non-essential in experiment.
FN: non-essential in the model, essential in experiment.
Predictions of GSMN-TB model were compared with data set of Sassetti et al,
Molecular Microbiology, 2003
Receiver Operating Characteristics (ROC) of
gene essentiality prediction.
Each ROC curve shows 100 points corresponding to
sensitivity and specificity of the model predictions
obtained for growth rate thresholds varying in the
range from 0.0 to 0.1 (increment 0.001). The growth
rate threshold has no effect on prediction accuracy.
The LP optimisation is effectively used as a
qualitative test of BIOMASS producibility and it is
irrelevant whether TB bacillus grows with maximal
rate or not.
Different curves correspond to TraSH ratio
thresholds of 0.05, 0.1, 0.2, 0.6, 1. The TraSH ratio
cutoff has considerable influence on prediction
accuracy.
Sensitivity = TP/(TP + FN)
Specificity = TN/(TN+FP)
The best ROC curve corresponds to the following
prediction scores: Sensitivity 71%, Specificity
80%, Correct predictions 78%.
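The classification and the two scores can be sketched as follows. The gene names, TraSH ratios and predicted set below are invented for illustration only; the TraSH cutoffs are the ones listed on the slide, and the function names are hypothetical.

```python
def classify_genes(predicted_essential, trash_ratio, ratio_cutoff):
    """Count TP, FP, TN, FN. A gene is essential in the experiment when its
    TraSH ratio (insertion probe / genomic probe) is below ratio_cutoff."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for gene, ratio in trash_ratio.items():
        essential_in_model = gene in predicted_essential
        essential_in_experiment = ratio < ratio_cutoff
        if essential_in_model and essential_in_experiment:
            counts["TP"] += 1
        elif essential_in_model:
            counts["FP"] += 1
        elif essential_in_experiment:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def sensitivity(counts):
    return counts["TP"] / (counts["TP"] + counts["FN"])


def specificity(counts):
    return counts["TN"] / (counts["TN"] + counts["FP"])


# Hypothetical toy data, NOT taken from the GSMN-TB model or Sassetti et al.
predicted_essential = {"g1", "g2", "g3"}
trash_ratio = {"g1": 0.02, "g2": 0.15, "g3": 0.8,
               "g4": 0.05, "g5": 0.5, "g6": 1.2}

for cutoff in (0.05, 0.1, 0.2, 0.6, 1):
    c = classify_genes(predicted_essential, trash_ratio, cutoff)
    print(cutoff, c, round(sensitivity(c), 2), round(specificity(c), 2))
```

Each cutoff yields one (sensitivity, specificity) pair; sweeping the growth rate threshold inside the model would add the remaining points of each ROC curve.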
Microarray signal distributions for essential and
non-essential genes.
Distributions of the raw TraSH signal
ratios for genes present in the model.
Blue line shows distribution for genes that
were predicted by the model to be
essential for growth. Red line shows
distribution of TraSH ratio among genes
predicted to be nonessential for
BIOMASS production.
The medians of the two distributions differ significantly
according to the non-parametric U-test
(p-value < 2e-16).
Extension to expression data: Perform
this analysis for every metabolite in
the network, not only for BIOMASS.
For each metabolite in the network
identify genes that influence and do not
influence its producibility. Compare
microarray signal for both groups of
genes and decide whether metabolite is
differentially affected by gene
expression program.
Changes in metabolite producibility in Streptomyces
coelicolor during the onset of antibiotic production.
All three antibiotics
included in the GSMN
model are on the list of
differentially affected
metabolites!
Collaboration with the group of Colin Smith; BBSRC grant BBD0115821
Differential Producibility Analysis (DPA)
Producibility plot: For each metabolite find all genes whose
inactivation affects the maximal flux towards the metabolite.
For each metabolite calculate the metabolite signal as the median
log microarray signal ratio of the genes influencing producibility of
the given metabolite.
For each metabolite compile a profile:
pj = (r1,..,rn)
where ri is the rank of the signal of metabolite j in the ith dataset.
Use Rank Product Analysis to calculate metabolite p-values.
Metabolites that rank consistently high have “significant” p-values
and are declared differentially producible.
Bhushan Bonde, Dany Beste, Emma Laing, Andrzej M. Kierzek ,
Johnjoe McFadden, PLoS Computational Biology, in press
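The signal, profile and rank-product steps of DPA can be sketched as below. This is a simplified illustration under stated assumptions, not the published implementation: gene-to-metabolite assignments and log ratios are invented, every metabolite is assumed to have at least one influencing gene, the rank product is taken as the geometric mean of the ranks, and the p-value calculation is omitted.

```python
import math
from statistics import median


def metabolite_signals(influencing_genes, log_ratios):
    """Median log signal ratio over the genes influencing each metabolite."""
    return {metabolite: median(log_ratios[g] for g in genes)
            for metabolite, genes in influencing_genes.items()}


def ranks(signals):
    """Rank 1 goes to the metabolite with the highest signal."""
    ordered = sorted(signals, key=signals.get, reverse=True)
    return {metabolite: i + 1 for i, metabolite in enumerate(ordered)}


def rank_profiles(influencing_genes, datasets):
    """Profile p_j = (r_1, .., r_n): rank of metabolite j in each dataset."""
    profiles = {metabolite: [] for metabolite in influencing_genes}
    for log_ratios in datasets:
        r = ranks(metabolite_signals(influencing_genes, log_ratios))
        for metabolite in profiles:
            profiles[metabolite].append(r[metabolite])
    return profiles


def rank_products(profiles):
    """Geometric mean of the ranks; small values mean consistently high rank."""
    return {m: math.prod(p) ** (1.0 / len(p)) for m, p in profiles.items()}


# Hypothetical toy input: genes whose inactivation affects each metabolite,
# and two datasets of log microarray signal ratios.
influencing_genes = {"M1": ["a", "b"], "M2": ["b", "c"], "M3": ["d"]}
datasets = [
    {"a": 2.0, "b": 1.0, "c": -1.0, "d": 0.5},
    {"a": 1.5, "b": 1.5, "c": 0.0, "d": -2.0},
]

profiles = rank_profiles(influencing_genes, datasets)
print(profiles)
print(rank_products(profiles))
```

In a full analysis the rank products would be compared with those obtained from permuted data to assign the p-values mentioned above.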
Maximisation of congruence between transcriptome
data and FBA model of global MTB metabolism.
Modified approach of Shlomi et al, Nature Biotechnology, 26, 1003, 2008
Prediction and community annotation of gene
function.
http://sysbio3.fhms.surrey.ac.uk
III. Integration: Using stochastic simulation
to sample sequences of feasible events
linking inputs and outcomes.
What about models more general than metabolic ones,
where analysis of the quasi steady-state flux distribution
is not useful?
Raza et al. BMC Systems Biology 2010, 4:63
http://www.biomedcentral.com/1752-0509/4/63
Requirements of the approach
1. Makes useful qualitative predictions
about effects of gene inactivation even
if nothing but network connectivity is
known.
2. Allows accommodation of rates and
rate equations, if these data are
available for some of the reactions.
3. Is equivalent to stochastic dynamics, if
all rates are known.
4. Analyses genome scale networks
involving all classes of interactions
Continuously changing approximation level
Formal Methods in Molecular Biology
Dagstuhl Seminar 09091
Breitling, Gilbert, Heiner, Priami, February 2009
Petri net token game based approach
1. Represent the system as a Petri net.
2. Create two copies of the model: a WT model
with all genes active and a mutant model with
the gene of interest inactive.
3. Run a large number of token games for both
the WT and mutant models and count the
number of times these trajectories exhibit
the property of interest.
4. The mutant/WT ratio of the numbers of
behaviour occurrences measures the involvement
of the gene in that particular behaviour (gene
function).
This approach is similar to the Signalling Petri Net
method of Ruths et al. (PLoS Computational
Biology, 2008). I used different stochastic
dynamics to sample the network behaviours and
I used Continuous Stochastic Logic to express
the properties I was testing for.
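The four steps can be illustrated with a toy net. Everything in this sketch is hypothetical: the three-transition net, the gene names and the property are invented, and the next transition is chosen uniformly among the enabled ones, which is a simplification rather than the CTMC dynamics and model checking used in the actual work.

```python
import random

# Hypothetical toy net: (transition, required gene, input places, output places).
TRANSITIONS = [
    ("t1", "g1", ("A",), ("B",)),
    ("t2", "g2", ("B",), ("C",)),
    ("t3", "g3", ("A",), ("D",)),  # competing branch that never reaches C
]
INITIAL_MARKING = {"A": 1, "B": 0, "C": 0, "D": 0}


def token_game(inactive_genes, rng, max_steps=50):
    """Play one token game and return the final marking."""
    marking = dict(INITIAL_MARKING)
    for _ in range(max_steps):
        enabled = [t for t in TRANSITIONS
                   if t[1] not in inactive_genes
                   and all(marking[place] > 0 for place in t[2])]
        if not enabled:
            break
        _, _, inputs, outputs = rng.choice(enabled)  # uniform choice (assumption)
        for place in inputs:
            marking[place] -= 1
        for place in outputs:
            marking[place] += 1
    return marking


def count_behaviour(inactive_genes, has_property, n_games, seed):
    """Number of token games whose final marking has the property."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_games)
               if has_property(token_game(inactive_genes, rng)))


def gene_involvement(gene, has_property, n_games=1000, seed=0):
    """Mutant/WT ratio of behaviour occurrences (WT count must be non-zero)."""
    wild_type = count_behaviour(frozenset(), has_property, n_games, seed)
    mutant = count_behaviour(frozenset([gene]), has_property, n_games, seed)
    return mutant / wild_type


def reaches_c(marking):
    return marking["C"] >= 1


for gene in ("g1", "g2", "g3"):
    print(gene, gene_involvement(gene, reaches_c))
```

A ratio near 0 suggests the gene is required for the behaviour, a ratio near 1 suggests no involvement, and a ratio above 1 suggests the gene normally suppresses it.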
Benchmark System
Yeast cell cycle model of Chen et al. Molecular Biology of the Cell 2004 in
original and SBGN notations.
Stochastic Petri Net has been implemented in
Prism.
Properties of Interest
// G2 arrest
P=? [ G[200,300] nSPN<4 & nORI>4 & nBUD>4 & ndiv=0 ]
// G1 arrest
P=? [ G[200,300] nORI<4 & nBUD<4 & ndiv=0 ]
// M-phase arrest
P=? [ G[200,300] nSPN>4 & nORI>4 & nBUD>4 & ndiv=0 ]
// Division happened twice
P=? [ F<=200 ndiv=2 ]
Statistical model checking with CTMC dynamics was used to calculate
probabilities in the WT and mutant models. In the initial state 10 tokens
were placed at the nodes. Reaction rates were set to 1.
Correct results for three mutants
Gene inactivation            Divided twice   G1 arrest   G2 arrest   M-phase arrest
Wild type                    0.5685          0.0         0.006       0.0181
YGR108W and YPR119W (Clb2)   0.0             0.0         1.0         0.0
YAL040c (Cln3)               0.0             1.0         0.0         0.0
YGL116W (Cdc20)              0.026           0.0         0.0         0.7762
Acknowledgements
Computational systems biology group @ Surrey.
Andrea Rocco
Albert Gevorgyan
Ahmad Mannan
Huihai Wu
Collaborating academics @ Surrey.
Claudio Avignone-Rossa
Michael Bushell
Rebecca Hoyle
Andrzej M. Kierzek
Emma Laing
Roberto La Ragione
Johnjoe McFadden
Colin P. Smith
Graham Stewart
International Collaborations.
Barry Wanner (Purdue)
Hirotada Mori (Nara Institute of Science and Technology)
Kazuyuki Shimizu (Kyushu Institute of Technology)
EraSysBio+ TB-HOST-NET Partners
Steffen Klamt (Max Planck Institute)
Olivier Neyrolles (CNRS)
Ludovic Tailleux (Institut Pasteur)
Maria Foti (University of Milan-Bicocca)
The work presented here and its
continuation are generously
funded by three BBSRC grants,
a BBSRC/EraSysBio+ grant,
a Wellcome Trust grant and an MRC
PhD fellowship.