Nathan Price
Department of Chemical & Biomolecular Engineering
Center for Biophysics & Computational Biology
Institute for Genomic Biology
University of Illinois, Urbana-Champaign

Metabolic Pathways Workshop
Edinburgh, Scotland
April 7, 2011

INSTITUTE for GENOMIC BIOLOGY

Interactions between metabolic and regulatory networks

Milne, Eddy, Kim, Price, Biotechnology Journal, 2009

Biochemical Reaction Networks (more detail: biochemistry, etc.)
• Data sources: phylogenetic data, physiological data, genome annotation, literature
• Reaction stoichiometry:

              Rxn. 1   Rxn. 2   Rxn. 3
  Enz           -1        0       +1
  A             -1        0        0
  Enz-A         +1       -1        0
  B              0       -1        0
  Enz-A-B        0       +1       -1
  C              0        0       +1
  D              0        0       +1

• Application of constraints
• Constraint-based model: S·v = 0, v ≤ vmax

Statistical Inference Networks (less detail)
• Data sources: interactomics, transcriptomics, proteomics, metabolomics
• Integrated network data
• Interaction networks: protein-metabolite, protein-protein, DNA-protein, DNA-DNA
• Network inference
• Statistical inference network (edge types: activation, inhibition, indirect)
• Mathematical model: C = f(A,B,D)

Eddy and Price, Encyclopedia of Complexity and Systems Science (2009)

Need for automated reconstruction methods

[Figure: number of completed genomes and of genome-scale models (GEMs) per year, 1995 to 2009, on a log scale from 1 to 1000.]

C Milne, JA Eddy, PJ Kim, ND Price, Biotechnology Journal, 2009

Automated reconstruction of metabolic networks
• Automated reconstruction of computable metabolic network models
• Demonstrated on 130 genomes
• Provide advanced starting point for virtually any organism
• Accuracy from genomics: 65%
• With Biolog and optimization: 87%

Henry, C.
DeJongh, M, Best, AA, Frybarger, PM, and Stevens, RL, Nature Biotechnology, 2010

Integrated automated reconstructions

Integration of automatically learned statistics-based regulatory networks and biochemistry-based metabolic networks

Sriram Chandrasekaran, Bozena Sawicka, Amit Ghosh

Example of Current State-of-the-Art: rFBA
• Motivated by data limitations
• Regulatory network represented by Boolean rules
• Rules taken from literature curation
• Only a subset of the network is available under different environmental conditions
• Metabolic flux analysis performed with available reactions

Covert, MW et al., Nature, 2004

PROM models integrating TRN and metabolic network
• Automated
• Comprehensive
• Probabilistic Boolean vs Boolean
• Higher accuracy

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010

PROM MODEL - PROBABILITIES

PROM's novelty lies in the introduction of probabilities to represent gene states and gene-transcription factor (TF) interactions.

  Boolean rule:        IF (B) THEN A
  Probabilistic rule:  P(A|B) = 0.95

• P(A = 1|B = 0): the probability of gene A being ON when its transcription factor B is OFF
• P(A = 1|B = 1): the probability of gene A being ON when B is ON

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010

CONSTRAINING FLUXES USING PROBABILITIES

[Figure: TF, p(mRNA|TF), flux bound p*Vmax, optimal flux state.]

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010

PROM: Basis is a constraint-based metabolic model

Constraint-based analysis involves solving the linear optimization problem:

\[
\max_{v} \; w^{T} v
\quad \text{subject to} \quad
S v = 0, \qquad lb \le v \le ub
\]

where S is the stoichiometric matrix, v is a flux vector representing a particular flux configuration, w^T v is the linear objective function, and lb, ub are vectors containing the minimum and maximum fluxes through each reaction.
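
The two ideas above (a conditional probability estimated from expression data, and a flux bound scaled to p*Vmax inside a steady-state model) can be tried on a toy problem. The following is a minimal sketch, not the published PROM implementation: the four-reaction chain, the gene and TF labels, the binarized expression calls, and the Vmax values are all invented for illustration, and the scaled bound is applied as a hard limit for simplicity. Because the toy pathway is an unbranched chain, S·v = 0 forces all fluxes to be equal, so the linear program reduces to taking the smallest upper bound; a real network would need an LP solver.

```python
# Toy sketch only: the pathway, gene/TF labels, expression calls, and Vmax
# values below are invented for illustration and do not come from the talk.


def on_probability(target_on, tf_on, tf_state):
    """Estimate P(target = 1 | TF = tf_state) from binarized expression calls."""
    selected = [t for t, f in zip(target_on, tf_on) if f == tf_state]
    if not selected:
        raise ValueError("no samples with the requested TF state")
    return sum(selected) / len(selected)


def steady_state_residual(S, v):
    """Return the product S.v, which must be all zeros at steady state."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]


def max_chain_flux(ub):
    """Optimum of: max v_export subject to S.v = 0 and 0 <= v <= ub.

    Valid only for an unbranched chain, where steady state forces every flux
    to be equal, so the optimum is the smallest upper bound.
    """
    return min(ub)


# Linear chain: uptake -> M1 -> M2 -> M3 -> export.
# Rows are metabolites M1, M2, M3; columns are the four reactions.
S = [
    [1, -1, 0, 0],
    [0, 1, -1, 0],
    [0, 0, 1, -1],
]
ub = [10.0, 8.0, 6.0, 100.0]  # hypothetical Vmax for each reaction

# Hypothetical binarized microarray calls across eight samples for gene A
# (assumed to encode the enzyme of reaction index 1) and its TF B.
gene_a_on = [1, 1, 1, 0, 1, 0, 0, 0]
tf_b_on = [1, 1, 1, 1, 0, 0, 0, 0]

p_on_given_tf_on = on_probability(gene_a_on, tf_b_on, 1)   # 3 of 4 samples
p_on_given_tf_off = on_probability(gene_a_on, tf_b_on, 0)  # 1 of 4 samples

# Simulated TF B knockout: scale the regulated reaction's bound to p * Vmax.
ub_knockout = list(ub)
ub_knockout[1] = p_on_given_tf_off * ub[1]

wild_type_growth = max_chain_flux(ub)
knockout_growth = max_chain_flux(ub_knockout)
v_knockout = [knockout_growth] * 4  # every flux is equal in the chain

print("P(A=1|B=1), P(A=1|B=0):", p_on_given_tf_on, p_on_given_tf_off)
print("wild type vs knockout objective:", wild_type_growth, knockout_growth)
print("S.v for knockout solution:", steady_state_residual(S, v_knockout))
```

In this toy case the knockout lowers the bound on the regulated step from 8 to 2, which makes that step the new bottleneck and reduces the objective from 6 to 2.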
PROM Approach

PROM finds a flux distribution that satisfies the same constraints as FBA plus additional constraints due to the transcriptional regulation:

\[
\min \; (\kappa \cdot \alpha + \kappa \cdot \beta)
\quad \text{subject to} \quad
lb' - \alpha \le v \le ub' + \beta, \qquad \alpha, \beta \ge 0
\]

where lb', ub' are constraints based on transcriptional regulation (the flux bound cues), α, β are non-negative terms that represent deviation from those constraints, and κ represents the penalty for such deviations.

Data used for the E. coli PROM model

  E. coli metabolic model      iAF1260
  Metabolic reactions          2382
  Regulatory data              RegulonDB
  Regulatory interactions      1773
  Microarrays                  907
  Total genes in the model     1400
  Validation data set          1875 growth phenotypes

Feist A et al, Molecular Systems Biology, 2007
Chandrasekaran, S., and Price, N.D., PNAS, 2010

Automated PROM model has similar accuracy to RFBA

[Figure: growth phenotype predictions for each transcription factor under each growth condition.]

Transcription factors: tdcR, crp, malT, glpR, gntR, xylR, asnC, rbsR, ilvY, glnG, rhaS, cpxR, cytR, soxR, melR

Growth conditions: 1,2-Propanediol, 2-Deoxy Adenosine, a-D-Glucose, a-D-Lactose, a-Keto-Glutaric Acid, Acetic Acid, Acetoacetic Acid, Adenosine, Citric Acid, D,L-Malic Acid, D-Alanine, D-Fructose, D-Galactose, D-Galacturonic Acid, D-Gluconic Acid, D-Glucose-6-Phosphate, D-Glucuronic Acid, D-Mannitol, D-Mannose, D-Melibiose, D-Ribose, D-Serine, D-Sorbitol, D-Trehalose, D-Xylose, Formic Acid, Fumaric Acid, Glycerol, Glycolic Acid, Inosine, L-Alanine, L-Arabinose, L-Asparagine, L-Aspartic Acid, L-Fucose, L-Glutamic Acid, L-Glutamine, L-Lactic Acid, L-Malic Acid, L-Proline, L-Rhamnose, L-Serine, L-Threonine, Maltose, Maltotriose, N-Acetyl-b-D-Mannosamine, N-Acetyl-D-Glucosamine, Pyruvic Acid, Succinic Acid, Sucrose, Thymidine, Uridine, Butyric Acid, D,L-Carnitine, Dihydroxy Acetone, g-Amino Butyric Acid, Glycine, L-Arginine, L-Histidine, L-Isoleucine, L-Leucine, L-Lysine, L-Methionine, L-Ornithine, L-Phenylalanine, L-Tartaric Acid, L-Valine, N-Acetyl-Neuraminic Acid, Putrescine, Adenine, Adenosine, Ala-Asp, Ala-Gln, Ala-Glu, Ala-Gly, Ala-His, Ala-Leu, Ala-Thr, Allantoin, Ammonia, Cytidine, Cytosine, D-Alanine, D-Glucosamine, D-Serine, Gly-Asn, Gly-Gln, Gly-Glu, Gly-Met,
Glycine, Guanine, Guanosine, Inosine, L-Alanine, L-Arginine, L-Asparagine, L-Aspartic Acid, L-Cysteine, L-Glutamic Acid, L-Glutamine, L-Histidine, L-Isoleucine, L-Leucine, L-Lysine, L-Methionine, L-Ornithine, L-Phenylalanine, L-Proline, L-Serine, L-Threonine, L-Tryptophan, L-Tyrosine, L-Valine, Met-Ala, N-Acetyl-D-Glucosamine, N-Acetyl-D-Mannosamine, Nitrate, Nitrite, Putrescine, Thymidine, Uracil, Urea, Uridine, Xanthine, Xanthosine

Covert MW et al, Nature, 2004
Chandrasekaran S, and Price ND, PNAS, 2010

COMPARISON WITH RFBA

Legend:
• Non-lethal, both PROM and RFBA are right
• Lethal, both PROM and RFBA are right
• PROM wrong, RFBA right
• PROM right, RFBA wrong
• Lethal, both wrong
• Non-lethal, both wrong

Accuracy: PROM 85%, RFBA 81%

AUTOMATED (PROM) vs MANUAL (RFBA)

Increased comprehensiveness relative to previous RFBA model

[Figure: bar chart comparing the PROM E. coli model with E. coli iMC1010 by number of interactions, regulated metabolic genes, and transcription factors; axis from 0 to 2000.]

Automated learning from high-throughput data improves comprehensiveness

Covert MW, Nature, 2004
Chandrasekaran, S, and Price, ND, In review, 2010

Growth rate prediction by PROM

  Culture            Actual   PROM
  WT + O2            0.71     0.7382
  WT - O2            0.49     0.385
  ΔarcA + O2         0.69     0.7651
  ΔarcA - O2         0.38     0.3224
  Δfnr + O2          0.63     0.5635
  Δfnr - O2          0.41     0.2181
  Δfnr/ΔarcA + O2    0.65     0.6596
  Δfnr/ΔarcA - O2    0.3      0.204
  ΔappY + O2         0.64     0.7152
  ΔappY - O2         0.48     0.3287
  ΔoxyR + O2         0.64     0.7876
  ΔoxyR - O2         0.48     0.3287
  ΔsoxS + O2         0.72     0.7687
  ΔsoxS - O2         0.46     0.379

Experimental data taken from MW Covert et al, Nature, 2004
Chandrasekaran, S., and Price, N.D., PNAS, 2010

Results: Quantitative Growth Prediction

[Figure: predicted growth rate vs. experimental growth rate.]

• Overall correlation with experimental data: R = 0.95
• Function of both oxygen switch (dominant) and regulation

PROM Model Inputs for M. tuberculosis

  M. tuberculosis metabolic model   iNJ661
  Metabolic reactions               1028
  Regulatory data                   Balazsi et al
  Regulatory interactions           218
  Microarrays                       437
  Total genes in the model          691
  Validation data set               30 TF knockout

Jamshidi NJ, and Palsson, BO, BMC Systems Biology, 2007
Balazsi G et al, Molecular Systems Biology, 2008; Boshoff HI et al, JBC, 2004

Accuracy in predicting essentiality of TF for optimal growth

  TF        Predicted growth rate
  dnaA      0.03
  Rv0485    0.042
  crp       0.03
  sigD      0.05
  kdpE      0.052
  ideR      0.038
  Rv1395    0.028
  argR      0.047
  sigC      0.024
  sigH      0.05
  lrpA      0.032
  Rv3575c   0.026
  oxyS      0.052
  nadR      0.052
  hspR      0.052
  regX3     0.052
  Rv0586    0.052
  narL      0.052
  sigE      0.052
  furA      0.052
  Rv1931c   0.052
  furB      0.052
  lexA      0.052
  pknK      0.052
  dosR      0.052
  birA      0.052
  sigF      0.052
  kstR      0.052
  cyp143    0.052
  embR      0.052

Accuracy: 95%; Sensitivity: 83%; Specificity: 100%

Legend:
• Correct prediction: essential gene, non-essential gene
• Candidate essential
• Incorrect prediction: non-essential gene, essential gene

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010

PROM Model Inputs for S. cerevisiae

  S. cerevisiae metabolic model   iMM904
  Metabolic reactions             1577
  Regulatory data                 YEASTRACT
  Regulatory interactions         4200
  Microarrays                     904, M3D
  Total genes in the model        904
  Validation data set             136 TF knockout

Duarte NC et al BMC Genomics 2004
Steinmetz LM et al.
Nature Genetics 2002
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

Increased comprehensiveness relative to previous RFBA model

                              RFBA model iMH805/775   PROM model
  Transcription factors       55                      136
  Regulated metabolic genes   348                     904
  Interactions                775                     4200

[Figure: bar chart of the same three counts for the two models; axis from 0 to 5000.]

Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Herrgard et al., Genome Res, 2006

Accuracy in predicting essentiality of TF for optimal growth
• Predicts correctly 135/136 of lethal/nonlethal calls
• Identifies 8 lethal TF KOs, with only 1 false positive
• Lone miss (Gcn4) is a very slow grower (multiple days)

Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

Validation: Quantitative Growth Prediction

Growth rate prediction by PROM

             Glucose                Galactose              Fructose
             SUR = 6.3; OUR = 2.5   SUR = 2.1, OUR = 3.9   SUR = 2.6, OUR = 6.2
  Culture    Actual   PROM          Actual   PROM          Actual   PROM
  WT         0.21     0.22          0.13     0.15          0.2      0.23
  adr1       0.21     0.22          0.13     0.15          0.2      0.23
  cat8       0.21     0.22          0.13     0.15          0.2      0.23
  mig2       0.21     0.22          0.13     0.15          0.2      0.23
  sip4       0.21     0.22          0.13     0.12          0.2      0.19
  gal4       0.21     0.22          0.03     0.01          0.2      0.23
  rtg1       0.21     0.22          0.08     0.07          0.2      0.23
  mth1       0.21     0.21          0.11     0.15          0.2      0.23
  nrg1       0.21     0.20          0.13     0.14          0.2      0.22
  mig1       0.21     0.21          0.13     0.15          0.2      0.21
  gcr2       0.17     0.15          0.13     0.15          0.16     0.17

Experimental data taken from MJ Herrgard et al, Genome Res 2006
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

[Figure: predicted growth rate vs. experimental growth rate, both axes from 0 to 0.25.]

• Overall correlation with experimental data: R = 0.96
• Driven by both substrate (dominant) and regulation

Quantitative Growth Prediction for 77 TF knockout Phenotypes with Galactose

[Figure: predicted growth rate vs. experimental growth rate, both axes from 0 to 0.25.]

Overall correlation with experimental data:
R = 0.90 (based only on regulation; the metabolic model alone would be a flat line)

Experimental data taken from SM Fendt et al, Molecular Systems Biology 2010
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

Prediction of Metabolic flux for ΔGcn4 mutant strain

                                 flux (expt)              flux (model)
  Reaction                       WT         GCN4          WT         GCN4
  G6P <=> F6P (net)              492.89     362.99        500        500
  PEP ->                         1280.46    1290.54       17.72      18.94
  P5P <=> EC2 + G3P (net)        127.82     213.63        0.263      0.217
  F6P <=> EC2 + E4P (net)        -60.99     -103.85       -0.182     -0.19
  S7P <=> EC3 + E4P (net)        66.82      109.79        8.661      9.19
  PYR -> ACA + CO2               1162.09    1035.51       17.78      15.94
  ETH -> ETHOUT                  662.32     666.92        15.82      17.72
  ACE -> ACCOA                   504.22     371.79        0.147      0.067
  OAAMIT + ACCOAMIT -> CITMIT    515.08     496.43        0.529      0.811
  OAAMIT <=> OAA (net)           -243.97    -156.85       -498.08    -464.36
  CITMIT <=> CIT (net)           51.48      72.45         0.035      0.076
  SER -> CYS                     2.84       2.07          0.230      0.0019
  SER <=> GLY + METTHF (net)     8.92       14.02         0.09709    0.0448
  OAA -> ASP                     27.41      21.02         500        474.91
  PYR ->                         4.32       3.32          0.67622    0.1471
  AKG -> GLU                     27.48      25.89         0.0285     0.0131
  GLU -> ORN                     5.81       3.72          0.04626    0.0258
  CHOR -> PPHN                   5.83       5.94          0.03111    0.0679

Experimental data taken from SM Fendt et al, Moxley et al, PNAS 2009
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

PROM Highlights
• PROM is a new approach for integrating the transcriptional network with metabolism
  • Automated and comprehensive
• We compared it with state-of-the-art metabolic-regulatory models of E. coli
  • Comparable accuracy
  • More comprehensive (automated from HT data)
• We constructed the first genome-scale integrated regulatory-metabolic model for M. tuberculosis
• We compared it with state-of-the-art metabolic-regulatory models of S.
  cerevisiae
  • Much more accurate
  • Much more comprehensive (automated from HT data)
• PROM can accurately predict the effect of perturbations to transcriptional regulators and subsequently be used to predict microbial growth phenotypes quantitatively

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010

Constraint-based Reconstruction and Analysis Conference

Confirmed Speakers: Eivind Almaas, Ronan Fleming, Vassily Hatzimanikatis, Christopher Henry, Hermann-Georg Holzhütter, Costas Maranas, Jens Nielsen, Bernhard Palsson, Jason Papin, Balázs Papp, Nathan Price, Eytan Ruppin, Uwe Sauer, Stefan Schuster, Daniel Segre, Ines Thiele

Key Dates
• April 7, 2011: Abstract deadline for oral & poster presentations (WILL EXTEND)
• June 24-26, 2011: COBRA conference

Acknowledgments

Nathan D. Price Lab @ the University of Illinois, Urbana-Champaign

Postdocs: Nick Chia, Cory Funk, Amit Ghosh, Pan-Jun Kim, Charu Gupta Kumar, Younhee Ko, Vineet Sangar

Graduate Students: Daniel Baker, Matthew Benedict, Sriram Chandrasekaran, John Earls, Piyush Labhsetwar, Bozena Sawicka, Shuyi Ma, Jaeyun Sung, James Eddy, Andrew Magis, Chunjing Wang

Funding Sources
• NIH / National Cancer Institute Howard Temin Pathway to Independence Award
• NSF CAREER
• Department of Defense - TATRC
• Department of Energy
• Energy Biosciences Institute (BP)
• Luxembourg-ISB Systems Medicine Program
• Roy J. Carver Charitable Trust Young Investigator Award

Matthew Gonnerman, Seyfullah Kotil, Caroline Milne, Matthew Richards, Yuliang Wang