PROM - National e

Nathan Price
Department of Chemical & Biomolecular Engineering
Center for Biophysics & Computational Biology
Institute for Genomic Biology
University of Illinois, Urbana-Champaign
Metabolic Pathways Workshop
Edinburgh, Scotland
April 7, 2011
INSTITUTE for
GENOMIC
BIOLOGY
Interactions between metabolic and regulatory networks
Milne, Eddy, Kim, Price, Biotechnology Journal, 2009
[Figure: data sources feeding two kinds of network model]
Panel labels: Biochemical Reaction Networks (more detail: biochemistry, etc.); Statistical Inference Networks (less detail)
Data sources: Phylogenetic Data; Physiological Data; Genome Annotation; Literature; Interactomics; Integrated Network Data (Transcriptomics, Proteomics, Metabolomics)
Biochemical route: Reaction Stoichiometry -> Application of Constraints -> Constraint-Based Model (S·v = 0, v ≤ vmax; flux axes v1, v2, v3)
Statistical route: Interaction Networks (Protein-Metabolite, Protein-Protein, DNA-Protein, DNA-DNA) -> Network Inference -> Statistical Inference Network (Mathematical Model: C = f(A,B,D); edge types: Activation, Inhibition, Indirect)
Reaction stoichiometry example:
            Rxn. 1   Rxn. 2   Rxn. 3
Enz           -1        0       +1
A             -1        0        0
Enz-A         +1       -1        0
B              0       -1        0
Enz-A-B        0       +1       -1
C              0        0       +1
D              0        0       +1
Eddy and Price, Encyclopedia of complexity and systems science (2009)
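As a small illustration of the reaction stoichiometry panel, the sketch below rebuilds the 7 x 3 matrix for the three-step enzyme mechanism implied by the slide's coefficients (Enz + A -> Enz-A; Enz-A + B -> Enz-A-B; Enz-A-B -> Enz + C + D) and evaluates S·v for one flux vector. The helper names and the unit flux vector are assumptions made for illustration only.

```python
# Stoichiometric coefficients per reaction, read off the slide's matrix.
reactions = {
    "Rxn1": {"Enz": -1, "A": -1, "Enz-A": +1},
    "Rxn2": {"Enz-A": -1, "B": -1, "Enz-A-B": +1},
    "Rxn3": {"Enz-A-B": -1, "Enz": +1, "C": +1, "D": +1},
}
species = ["Enz", "A", "Enz-A", "B", "Enz-A-B", "C", "D"]


def stoichiometric_matrix(species, reactions):
    """Rows are species, columns are reactions (in dictionary order)."""
    return [[rxn.get(s, 0) for rxn in reactions.values()] for s in species]


def rate_of_change(S, v):
    """Compute S·v, the net production rate of every species."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


S = stoichiometric_matrix(species, reactions)

# Hypothetical flux vector: every reaction runs at unit rate.
dxdt = rate_of_change(S, [1.0, 1.0, 1.0])

for name, rate in zip(species, dxdt):
    print(f"{name:8s} {rate:+.1f}")
```

With equal fluxes the enzyme-bound species balance to zero while A and B are consumed and C and D are produced, which is the situation the steady-state constraint S·v = 0 imposes on internal metabolites.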
Need for automated reconstruction methods
[Figure: # completed Genomes vs. GEMs per Year, 1995 to 2009; y-axis from 1 to 1000 in powers of ten]
C Milne, JA Eddy, PJ Kim, ND Price, Biotechnology Journal, 2009
Automated reconstruction of metabolic networks
• Automated reconstruction of computable metabolic network models
• Demonstrated on 130 genomes
• Provides an advanced starting point for virtually any organism
• Accuracy from genomics: 65%
• With Biolog and optimization: 87%
Henry, C, DeJongh, M, Best, AA, Frybarger, PM, and Stevens, RL, Nature Biotechnology, 2010
Integrated automated reconstructions
Integration of automatically learned statistics-based regulatory
networks and biochemistry-based metabolic networks
Sriram Chandrasekaran, Bozena Sawicka, Amit Ghosh
Example of Current State-of-the-Art: rFBA
• Motivated by data limitations
• Regulatory network represented by Boolean rules
• Rules taken from literature curation
• Only a subset of the network is available under different environmental conditions
• Metabolic flux analysis performed with the available reactions
Covert, MW et al., Nature, 2004
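The Boolean gating step described on this slide can be sketched as follows. This is a minimal illustration, not the rFBA implementation: the rule set, reaction names, environment, and bounds are all hypothetical. Each rule decides whether a reaction is available in a given environment, and unavailable reactions have their upper bound set to zero before flux analysis is run.

```python
# Hypothetical Boolean rules: a reaction is available only if its rule is True.
RULES = {
    "lactose_uptake": lambda env: env["lactose"] and not env["glucose"],
    "glucose_uptake": lambda env: env["glucose"],
    "aerobic_respiration": lambda env: env["oxygen"],
}


def available_reactions(rules, environment):
    """Evaluate every Boolean rule against the environment."""
    return {name: bool(rule(environment)) for name, rule in rules.items()}


def gate_bounds(upper_bounds, availability):
    """Zero the upper bound of reactions switched OFF; unregulated ones pass through."""
    return {
        name: (ub if availability.get(name, True) else 0.0)
        for name, ub in upper_bounds.items()
    }


# Hypothetical condition: glucose and lactose present, no oxygen.
environment = {"glucose": True, "lactose": True, "oxygen": False}
upper_bounds = {
    "lactose_uptake": 10.0,
    "glucose_uptake": 10.0,
    "aerobic_respiration": 20.0,
    "glycolysis": 50.0,  # no rule, so always available
}

availability = available_reactions(RULES, environment)
gated = gate_bounds(upper_bounds, availability)
print(gated)
```

The on/off character of this gating is what the probabilistic treatment in PROM relaxes.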
PROM models integrating TRN and metabolic network
• Automated
• Comprehensive
• Probabilistic Boolean vs Boolean
• Higher accuracy
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
PROM MODEL - PROBABILITIES
PROM's novelty lies in the introduction of probabilities to represent gene states and gene-transcription factor (TF) interactions.
Boolean rule: IF (B) THEN A
Probabilistic form: P(A|B) = 0.95
P(A = 1|B = 0): the probability of gene A being ON when its transcription factor B is OFF
P(A = 1|B = 1): the probability of gene A being ON when B is ON
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
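One simple way to obtain conditional probabilities such as P(A = 1|B = 0) is to count co-occurrences across expression samples. The sketch below assumes the microarray data have already been binarized into ON (1) and OFF (0) states; the binarization step, the function name, and the eight toy samples are assumptions for illustration, not data from the talk.

```python
def conditional_on_probability(target_states, tf_states, tf_value):
    """Estimate P(target = 1 | TF = tf_value) by counting across samples.

    Returns None when the TF never takes the requested state.
    """
    matching = [a for a, b in zip(target_states, tf_states) if b == tf_value]
    if not matching:
        return None
    return sum(matching) / len(matching)


# Hypothetical binarized expression of gene A and its TF B over 8 arrays.
gene_a = [1, 1, 0, 1, 0, 0, 1, 1]
tf_b = [1, 1, 0, 1, 0, 0, 0, 1]

p_on_tf_off = conditional_on_probability(gene_a, tf_b, 0)
p_on_tf_on = conditional_on_probability(gene_a, tf_b, 1)
print(p_on_tf_off, p_on_tf_on)
```

In this toy data A is ON in one of the four samples where B is OFF and in all four samples where B is ON, so a knockout of B would leave A ON only a quarter of the time.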
CONSTRAINING FLUXES USING PROBABILITIES
TF -> p(mRNA|TF) -> Flux Bound (p*Vmax) -> Optimal Flux State
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
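The p*Vmax step can be written out in a few lines. This sketch assumes, for simplicity, a one-to-one mapping between a regulated gene and its reaction; the reaction names, Vmax values, and probabilities are hypothetical.

```python
def regulated_bounds(vmax, p_on):
    """Scale each reaction's Vmax by the ON-probability of its gene.

    Reactions without an entry in p_on are treated as unregulated (p = 1).
    """
    return {rxn: p_on.get(rxn, 1.0) * v for rxn, v in vmax.items()}


# Hypothetical maximum fluxes and P(gene ON | TF knocked out).
vmax = {"R1": 10.0, "R2": 8.0, "R3": 5.0}
p_on_given_tf_off = {"R1": 0.25, "R2": 0.9}

bounds = regulated_bounds(vmax, p_on_given_tf_off)
print(bounds)
```

A reaction whose gene is rarely ON without its TF (R1 here) is tightly capped, while a weakly dependent one (R2) keeps most of its capacity, so the bound degrades gradually rather than switching the reaction off.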
PROM: Basis is a constraint-based metabolic model
Constraint-based analysis involves solving the linear optimization problem:
max w^T v
subject to constraints
S·v = 0
lb ≤ v ≤ ub
where S is the stoichiometric matrix, v is a flux vector representing a particular flux configuration, w^T v is the linear objective function, and lb, ub are vectors containing the minimum and maximum fluxes through each reaction.
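A minimal sketch of this linear program, assuming NumPy and SciPy are available and using a made-up four-reaction network (uptake into A, two parallel routes from A to B, and a growth reaction draining B). The network, bounds, and function name are illustrative assumptions; `linprog` minimizes, so the objective is negated.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns: uptake, path1, path2, growth.
# Rows: metabolites A and B.
S = np.array([
    [1, -1, -1, 0],   # A: made by uptake, used by path1 and path2
    [0, 1, 1, -1],    # B: made by path1 and path2, used by growth
], dtype=float)
lb = np.array([0, 0, 0, 0], dtype=float)
ub = np.array([10, 6, 3, 1000], dtype=float)
w = np.array([0, 0, 0, 1], dtype=float)  # objective: maximize growth flux


def fba(S, lb, ub, w):
    """Solve max w^T v subject to S·v = 0 and lb <= v <= ub."""
    res = linprog(
        c=-w,                          # negate because linprog minimizes
        A_eq=S,
        b_eq=np.zeros(S.shape[0]),
        bounds=list(zip(lb, ub)),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, float(w @ res.x)


v_opt, growth = fba(S, lb, ub, w)
print("fluxes:", v_opt, "growth:", growth)
```

In this toy case growth is limited by the combined capacity of the two internal routes (6 + 3) rather than by uptake (10), which shows how individual flux bounds shape the optimum.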
PROM Approach
PROM finds a flux distribution that satisfies the same constraints as FBA plus additional constraints due to transcriptional regulation:
min (κ·α + κ·β)
subject to constraints
lb' - α ≤ v ≤ ub' + β
α, β ≥ 0
where lb', ub' are constraints based on transcriptional regulation (the flux bound cues), α, β are nonnegative variables that represent deviation from those constraints, and κ represents the penalty for such deviations.
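To make the soft constraints concrete, the problem can be rewritten in the standard inequality form a linear-programming solver expects. This is only an algebraic rearrangement of the formulation on the slide, with the FBA constraints carried over; treating v, α, and β jointly as decision variables is an assumption about how the problem would be passed to a solver.

```latex
\begin{aligned}
\min_{v,\,\alpha,\,\beta}\quad & \kappa\cdot\alpha + \kappa\cdot\beta \\
\text{subject to}\quad & S\,v = 0, \qquad lb \le v \le ub, \\
& v - \beta \le ub', \\
& -v - \alpha \le -lb', \\
& \alpha \ge 0,\quad \beta \ge 0 .
\end{aligned}
```

As a hypothetical single-reaction example: with Vmax = 10 and p = 0.25 the regulatory bound is ub' = p·Vmax = 2.5. If the other constraints require v = 4, the smallest feasible slack is β = 4 − 2.5 = 1.5, contributing a penalty of 1.5·κ. A larger κ therefore pushes the solution to respect the regulatory bounds more strictly.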
Data used for the E. coli PROM model
Organism                   E. coli
Metabolic model            iAF1260
Metabolic reactions        2382
Regulatory data            RegulonDB
Regulatory interactions    1773
Microarrays                907
Total genes in the model   1400
Validation data set        1875 growth phenotypes
Feist A et al, Molecular Systems Biology, 2007
Chandrasekaran, S., and Price, N.D., PNAS, 2010
Automated PROM model has similar accuracy to RFBA
[Figure: growth predictions for TF knockouts (tdcR, crp, malT, glpR, gntR, xylR, asnC, rbsR, ilvY, glnG, rhaS, cpxR, cytR, soxR, melR) across the growth substrates listed below]
1,2Propanediol
2Deoxy Adenosine
aDGlucose
aDLactose
aKetoGlutaric Acid
Acetic Acid
Acetoacetic Acid
Adenosine
Citric Acid
D,LMalic Acid
DAlanine
DFructose
DGalactose
DGalacturonic Acid
DGluconic Acid
DGlucose6Phosphate
DGlucuronic Acid
DMannitol
DMannose
DMelibiose
DRibose
DSerine
DSorbitol
DTrehalose
DXylose
Formic Acid
Fumaric Acid
Glycerol
Glycolic Acid
Inosine
LAlanine
LArabinose
LAsparagine
LAspartic Acid
LFucose
LGlutamic Acid
LGlutamine
LLactic Acid
LMalic Acid
LProline
LRhamnose
LSerine
LThreonine
Maltose
Maltotriose
NAcetylbDMannosamine
NAcetylDGlucosamine
Pyruvic Acid
Succinic Acid
Sucrose
Thymidine
Uridine
Butyric Acid
D,LCarnitine
Dihydroxy Acetone
gAmino Butyric Acid
Glycine
LArginine
LHistidine
LIsoleucine
LLeucine
LLysine
LMethionine
LOrnithine
LPhenylalanine
LTartaric Acid
LValine
NAcetylNeuraminic Acid
Putrescine
Adenine
Adenosine
AlaAsp
AlaGln
AlaGlu
AlaGly
AlaHis
AlaLeu
AlaThr
Allantoin
Ammonia
Cytidine
Cytosine
DAlanine
DGlucosamine
DSerine
GlyAsn
GlyGln
GlyGlu
GlyMet
Glycine
Guanine
Guanosine
Inosine
LAlanine
LArginine
LAsparagine
LAspartic Acid
LCysteine
LGlutamic Acid
LGlutamine
LHistidine
LIsoleucine
LLeucine
LLysine
LMethionine
LOrnithine
LPhenylalanine
LProline
LSerine
LThreonine
LTryptophan
LTyrosine
LValine
MetAla
NAcetylDGlucosamine
NAcetylDMannosamine
Nitrate
Nitrite
Putrescine
Thymidine
Uracil
Urea
Uridine
Xanthine
Xanthosine
COMPARISON WITH RFBA: AUTOMATED (PROM) vs MANUAL (RFBA)
Legend: Non-lethal, both PROM and RFBA right; Lethal, both PROM and RFBA right; PROM wrong, RFBA right; PROM right, RFBA wrong; Lethal, both wrong; Non-lethal, both wrong
Accuracy: PROM 85%, RFBA 81%
Covert MW et al, Nature, 2004
Chandrasekaran S, and Price ND, PNAS, 2010
Increased comprehensiveness compared with previous RFBA model
[Figure: bar chart comparing the PROM E. coli model with E. coli iMC1010 by number of Interactions, Regulated metabolic genes, and Transcription Factors; axis from 0 to 2000]
Automated learning from high-throughput data improves comprehensiveness
Covert MW, Nature, 2004
Chandrasekaran, S, and Price, ND, In review, 2010
Growth rate prediction by PROM
Culture             Actual   PROM
WT + O2             0.71     0.7382
WT - O2             0.49     0.385
ΔarcA + O2          0.69     0.7651
ΔarcA - O2          0.38     0.3224
Δfnr + O2           0.63     0.5635
Δfnr - O2           0.41     0.2181
Δfnr/ΔarcA + O2     0.65     0.6596
Δfnr/ΔarcA - O2     0.3      0.204
ΔappY + O2          0.64     0.7152
ΔappY - O2          0.48     0.3287
ΔoxyR + O2          0.64     0.7876
ΔoxyR - O2          0.48     0.3287
ΔsoxS + O2          0.72     0.7687
ΔsoxS - O2          0.46     0.379
Experimental data taken from MW Covert et al, Nature, 2004
Chandrasekaran, S., and Price, N.D., PNAS, 2010
Results: Quantitative Growth Prediction
[Figure: scatter plot of Predicted growth rate (0 to 0.9) against Experimental growth rate (0.2 to 0.8)]
Overall correlation with experimental data: R = 0.95
Function of both oxygen switch (dominant) and regulation
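The quoted correlation can be checked against the 14 E. coli cultures in the Actual vs PROM growth-rate table. The sketch below assumes R denotes the Pearson correlation coefficient (the slide does not say which statistic was used); the values are copied from that table and the helper is written out by hand.

```python
from math import sqrt

# Actual and PROM-predicted growth rates, in table order (WT +O2 ... ΔsoxS -O2).
actual = [0.71, 0.49, 0.69, 0.38, 0.63, 0.41, 0.65,
          0.30, 0.64, 0.48, 0.64, 0.48, 0.72, 0.46]
prom = [0.7382, 0.385, 0.7651, 0.3224, 0.5635, 0.2181, 0.6596,
        0.204, 0.7152, 0.3287, 0.7876, 0.3287, 0.7687, 0.379]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(actual, prom)
print(f"R = {r:.2f}")
```

Under that assumption the result should land close to the R = 0.95 quoted on the slide. Most of the spread comes from the aerobic versus anaerobic split, consistent with the note that the oxygen switch is the dominant effect.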
PROM Model Inputs for M. tuberculosis
Organism                   M. tuberculosis
Metabolic model            iNJ661
Metabolic reactions        1028
Regulatory data            Balazsi et al
Regulatory interactions    218
Microarrays                437
Total genes in the model   691
Validation data set        30 TF knockout
Jamshidi NJ, and Palsson, BO, BMC Systems Biology, 2007
Balazsi G et al, Molecular Systems Biology, 2008; Boshoff HI et al, JBC, 2004
Accuracy in predicting essentiality of TF for optimal growth
TF         Predicted growth rate
dnaA       0.03
Rv0485     0.042
crp        0.03
sigD       0.05
kdpE       0.052
ideR       0.038
Rv1395     0.028
argR       0.047
sigC       0.024
sigH       0.05
lrpA       0.032
Rv3575c    0.026
oxyS       0.052
nadR       0.052
hspR       0.052
regX3      0.052
Rv0586     0.052
narL       0.052
sigE       0.052
furA       0.052
Rv1931c    0.052
furB       0.052
lexA       0.052
pknK       0.052
dosR       0.052
birA       0.052
sigF       0.052
kstR       0.052
cyp143     0.052
embR       0.052
Accuracy: 95%   Sensitivity: 83%   Specificity: 100%
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
Legend
Correct Prediction: Essential gene; Non essential gene; Candidate essential
Incorrect Prediction: Non essential gene; Essential gene
PROM Model Inputs for S. cerevisiae
Organism                   S. cerevisiae
Metabolic model            iMM904
Metabolic reactions        1577
Regulatory data            YEASTRACT
Regulatory interactions    4200
Microarrays                904, M3D
Total genes in the model   904
Validation data set        136 TF knockout
Duarte NC et al BMC Genomics 2004
Steinmetz LM et al. Nature Genetics 2002
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Increased comprehensiveness compared with previous RFBA model
                            RFBA model iMH805/775   PROM model
Transcription Factors       55                      136
Regulated Metabolic Genes   348                     904
Interactions                775                     4200
[Figure: bar chart of the same counts; axis from 0 to 5000]
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Herrgard et al., Genome Res, 2006
Accuracy in predicting essentiality of TF for optimal growth
• Predicts correctly 135/136 of lethal/nonlethal calls
• Identifies 8 lethal TF KOs, with only 1 false positive
• Lone miss (Gcn4) is a very slow grower (multiple days)
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Validation: Quantitative Growth Prediction
Growth rate prediction by PROM
           Glucose                 Galactose               Fructose
           SUR = 6.3; OUR = 2.5    SUR = 2.1, OUR = 3.9    SUR = 2.6, OUR = 6.2
Culture    Actual   PROM           Actual   PROM           Actual   PROM
WT         0.21     0.22           0.13     0.15           0.2      0.23
adr1       0.21     0.22           0.13     0.15           0.2      0.23
cat8       0.21     0.22           0.13     0.15           0.2      0.23
mig2       0.21     0.22           0.13     0.15           0.2      0.23
sip4       0.21     0.22           0.13     0.12           0.2      0.19
gal4       0.21     0.22           0.03     0.01           0.2      0.23
rtg1       0.21     0.22           0.08     0.07           0.2      0.23
mth1       0.21     0.21           0.11     0.15           0.2      0.23
nrg1       0.21     0.20           0.13     0.14           0.2      0.22
mig1       0.21     0.21           0.13     0.15           0.2      0.21
gcr2       0.17     0.15           0.13     0.15           0.16     0.17
[Figure: scatter plot of Predicted growth rate against Experimental growth rate, both axes 0 to 0.25]
Overall correlation with experimental data: R = 0.96
Driven by both substrate (dominant) and regulation
Experimental data taken from MJ Herrgard et al, Genome Res 2006
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Quantitative Growth Prediction for 77 TF knockout Phenotypes with Galactose
[Figure: scatter plot of Predicted Growth Rate against Experimental Growth Rate, both axes 0 to 0.25]
Overall correlation with experimental data: R = 0.90 (based only on regulation; the metabolic model alone would give a flat line)
Experimental data taken from SM Fendt et al, Molecular Systems Biology 2010
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Prediction of Metabolic flux for ∆Gcn4 mutant strain
                               flux (expt)           flux (model)
Reaction                       WT         GCN4       WT         GCN4
G6P <=> F6P (net)              492.89     362.99     500        500
PEP ->                         1280.46    1290.54    17.72      18.94
P5P <=> EC2 + G3P (net)        127.82     213.63     0.263      0.217
F6P <=> EC2 + E4P (net)        -60.99     -103.85    -0.182     -0.19
S7P <=> EC3 + E4P (net)        66.82      109.79     8.661      9.19
PYR -> ACA + CO2               1162.09    1035.51    17.78      15.94
ETH -> ETHOUT                  662.32     666.92     15.82      17.72
ACE -> ACCOA                   504.22     371.79     0.147      0.067
OAAMIT + ACCOAMIT -> CITMIT    515.08     496.43     0.529      0.811
OAAMIT <=> OAA (net)           -243.97    -156.85    -498.08    -464.36
CITMIT <=> CIT (net)           51.48      72.45      0.035      0.076
SER -> CYS                     2.84       2.07       0.230      0.0019
SER <=> GLY + METTHF (net)     8.92       14.02      0.09709    0.0448
OAA -> ASP                     27.41      21.02      500        474.91
PYR ->                         4.32       3.32       0.67622    0.1471
AKG -> GLU                     27.48      25.89      0.0285     0.0131
GLU -> ORN                     5.81       3.72       0.04626    0.0258
CHOR -> PPHN                   5.83       5.94       0.03111    0.0679
Experimental data taken from SM Fendt et al, Moxley et al, PNAS 2009
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
PROM Highlights
• PROM is a new approach for integrating the transcriptional network with metabolism
    • Automated and comprehensive
• We compared it with state-of-the-art metabolic-regulatory models of E. coli
    • Comparable accuracy
    • More comprehensive (automated from HT data)
• We constructed the first genome-scale integrated regulatory-metabolic model for M. tuberculosis
• We compared it with state-of-the-art metabolic-regulatory models of S. cerevisiae
    • Much more accurate
    • Much more comprehensive (automated from HT data)
• PROM can accurately predict the effect of perturbations to transcriptional regulators and subsequently be used to predict microbial growth phenotypes quantitatively
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
Constraint-based Reconstruction and Analysis Conference
Confirmed Speakers
Eivind Almaas
Ronan Fleming
Vassily Hatzimanikatis
Christopher Henry
Hermann-Georg Holzhütter
Costas Maranas
Jens Nielsen
Bernhard Palsson
Jason Papin
Balázs Papp
Nathan Price
Eytan Ruppin
Uwe Sauer
Stefan Schuster
Daniel Segre
Ines Thiele
Key Dates
April 7, 2011 - Abstract deadline for oral & poster presentations (WILL EXTEND)
June 24-26, 2011 - COBRA conference
Acknowledgments
Nathan D. Price Lab @ the University of Illinois, Urbana-Champaign
Postdocs
Nick Chia
Cory Funk
Amit Ghosh
Pan-Jun Kim
Charu Gupta Kumar
Younhee Ko
Vineet Sangar
Graduate Students
Daniel Baker
Matthew Benedict
Sriram Chandrasekaran
John Earls
Piyush Labhsetwar
Bozena Sawicka
Shuyi Ma
Jaeyun Sung
James Eddy
Andrew Magis
Chunjing Wang
Funding Sources
NIH / National Cancer Institute
Howard Temin Pathway to Independence Award
NSF CAREER
Department of Defense – TATRC
Department of Energy
Energy Biosciences Institute (BP)
Luxembourg-ISB Systems Medicine Program
Roy J. Carver Charitable Trust Young Investigator Award
Matthew Gonnerman
Seyfullah Kotil
Caroline Milne
Matthew Richards
Yuliang Wang