Metabolic/Subsystem Reconstruction and Modeling

Given a “complete” set of genes…
• Can we assemble a “complete” picture of the biology of an organism?
• Gene products don’t generally function in isolation
• The whole is greater than the sum of the parts? Or can it also be less?

A few examples of higher-order entities (multiple gene products and even some additional components)
• Protein complexes (ribosomes, enzymes, secretion systems, etc.)
• Pathways
• Metabolism (linked pathways)
• Processes (chemotaxis, splicing, etc.)
• Cellular structures (membrane, cell wall, etc.)

Metabolic Reconstruction
• Determination of which metabolic pathways are present in an organism based on the genome content
• Can provide insight into organisms as well as environments
• But we can only reconstruct what we recognize

KEGG (Kyoto Encyclopedia of Genes and Genomes)

KEGG - Reference Pathway Overview

KEGG - Escherichia coli MG1655 overview

KEGG - Citrate Cycle

KEGG - Citrate Cycle (E. coli MG1655)
Mouse over EC number 1.3.99.1 (succinate dehydrogenase)

KEGG
• Other functionality
• Growing
• Automated annotation server for assigning genes from a new genome to pathways
• Map subsets of genes to pathways (enrichment analyses)

Bacterial chemotaxis - Pectobacterium atrosepticum
… but what about the other 33 receptors?

One size doesn’t fit all
• Specialized pathways for individual organisms in specialized database resources
• Allow for variations on a theme

The SEED - variants

Pathway holes can lead to discovery

Metabolic Model
• Computable metabolic reconstruction

Five uses:
1. Contextualization of high-throughput data
2. Guiding metabolic engineering
3. Directing hypothesis-driven discovery
4. Interrogation of multi-species relationships
5. Network discovery

Constraint-based modeling
- A stoichiometric matrix, S (M × N), is constructed for an organism, where M = metabolites (rows) and N = reactions (columns).

Example:
r1: m1 + m2 => m3
r2: m3 <=> m1 + m4

        r1    r2    …    rk
m1      -1     1
m2      -1     0
m3       1    -1
m4       0     1
…
mi

The dynamic mass balance equation

$$\frac{dm_i}{dt} = \sum_k s_{ik} v_k$$

- $s_{ik}$ are the entries of S
- $v_k$ is the flux of reaction k, which produces and/or degrades metabolite $m_i$
- $m_i$ is the concentration of a given metabolite

In matrix form:

$$\frac{d\mathbf{m}}{dt} = S\mathbf{v}$$

- m = a vector that represents the set of metabolites
- v = flux vector

At steady state there is no accumulation or depletion of metabolites in the network, so the rate of production equals the rate of consumption. This balance of fluxes is represented mathematically as

$$S\mathbf{v} = 0$$

- Bounds that further constrain individual variables, such as fluxes, concentrations, and kinetic constants, can be identified: $v_{min} \le v \le v_{max}$
- Irreversible reactions have $v_{min} = 0$
- Some metabolites, such as O2 or CO2, have $v_{max} = \infty$
- Other metabolites are constrained based on experimental measurements, as determined for the biomass reaction for E. coli (1 g dry cell weight)

There are normally more columns (reactions, ~2,300) than rows (metabolites, ~1,100), so there is no single solution but rather a steady-state solution space containing all possible solutions. (Thiele I. et al. 2009 PLOS Comp. Biol.)

Flux Balance Analysis (FBA)
FBA calculates the flow of metabolites through this metabolic network, thereby making it possible to predict the growth rate of an organism or the rate of production of a biotechnologically important metabolite.
- With no constraints, the flux distribution of a biological network may lie at any point in a solution space.
- When the mass balance constraints imposed by the stoichiometric matrix S and the capacity constraints imposed by the lower and upper bounds ($a_i$ and $b_i$) are applied to a network, they define an allowable solution space.
- Through optimization of an objective function, FBA can identify a single optimal flux distribution that lies on the edge of the allowable solution space.
(Orth, Thiele, and Palsson Nat. Biotech 2010)

The iterative reconstruction and history of the E. coli metabolic network (Feist A.M.
and B.O. Palsson (2008) Nature Biotechnology)

Applications of the RMN of E. coli
(Feist A.M. and B.O. Palsson (2008) Nature Biotechnology)

Validation of metabolic models through comparison of in silico vs. experimental data with or without oxygen
• Comparison of carbon source utilization
• Comparison of batch growth

Flux Balance Analysis (FBA)
Given an uptake rate for key nutrients (such as glucose and oxygen), the maximum possible growth rate of the cells can be predicted in silico.
[Figure: batch growth time course. x-axis: Time (h); y-axes: gDW/L and mmol/L. Series: Glucose, Acetate, Formate, Lactate, Ethanol, Succinate (mmol/L) and Biomass (gDW/L).]
(Becker SA, et al. (2007) Nature Protocols)

Carbon source utilization results
Tested = tested compounds included in the model. Agreement = in silico and experimental agreement. False positives: experimental = N, in silico = Y. False negatives: experimental = Y, in silico = N.

Strain                          Condition   Tested   Agreement   False positives   False negatives
E. coli K-12 MG1655             O2          76       70          4                 2
E. coli K-12 MG1655             No O2       76       66          1                 9
E. coli K-12 W3110              O2          76       71          1                 4
E. coli K-12 W3110              No O2       76       64          0                 12
E. coli O157:H7 (EHEC) EDL933   O2          76       69          2                 5
E. coli O157:H7 (EHEC) EDL933   No O2       76       63          2                 11
E. coli O157:H7 (EHEC) Sakai    O2          76       68          2                 6
E. coli O157:H7 (EHEC) Sakai    No O2       76       64          1                 11
E. coli (UPEC) CFT073           O2          76       67          3                 6
E. coli (UPEC) CFT073           No O2       76       63          0                 13
E. coli (UPEC) UTI89            O2          76       71          1                 4
E. coli (UPEC) UTI89            No O2       76       65          0                 11
Salmonella LT2                  O2          55       52          2                 1
Salmonella LT2                  No O2       55       48          0                 7

In general, there is good agreement between in silico and experimental carbon source utilization under both aerobic (>88% accurate) and anaerobic (about 83% or better) conditions.

Batch growth results in MOPS minimal media + 0.2% glucose, anaerobic
[Figure: bar chart of Biomass (g/L) for E. coli K12 MG1655, E. coli K12 W3110, E. coli EDL933 (O157:H7), E. coli Sakai (O157:H7), E. coli CFT (UPEC), E. coli UTI89 (UPEC), and S. typhimurium LT2.]

M. tuberculosis
• Built a genome-scale model
• Predicted essential genes using FBA and compared them to a saturated transposon-based characterization of essentiality (78% accuracy/agreement)
• Compared flux through all pathways under slow and fast growth by changing nutrient uptake flux constraints
  - Major difference in isocitrate lyase and glyoxylate shunt

Yeast deletion mutants
• Used quantitative image analysis to measure growth of replica-pinned cells on agar under 16 conditions (no growth, slow growth, wt growth)
• Used FBA to predict growth from the yeast model
• 94% agreement
• Refined experiments based on the model (checked mutations, secondary mutations, unlinked phenotypes)
• Gained insight into glycerol and raffinose catabolism
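
All of the FBA predictions in these case studies rest on the stoichiometric matrix S and the steady-state condition Sv = 0 from the constraint-based modeling slides. The following is a minimal sketch of that bookkeeping in plain Python, not any group's actual pipeline. Reactions r1 and r2 are taken from the slide example; the exchange reactions `ex_m2` and `ex_m4`, the function names, and the two flux vectors are hypothetical and added only so that a nonzero steady state exists.

```python
# Toy network. r1 and r2 follow the slide example; ex_m2 and ex_m4 are
# ASSUMED exchange reactions (not in the slides) that open the system.
REACTIONS = {
    "r1": {"m1": -1, "m2": -1, "m3": 1},   # m1 + m2 => m3
    "r2": {"m3": -1, "m1": 1, "m4": 1},    # m3 <=> m1 + m4
    "ex_m2": {"m2": 1},                    # assumed uptake:    => m2
    "ex_m4": {"m4": -1},                   # assumed secretion: m4 =>
}
METABOLITES = ["m1", "m2", "m3", "m4"]


def stoichiometric_matrix(reactions, metabolites):
    """Build S as a list of rows: one row per metabolite, one column per reaction."""
    return [[coeffs.get(met, 0) for coeffs in reactions.values()]
            for met in metabolites]


def mass_balance(S, v):
    """Return dm/dt = S v for a flux vector v ordered like the columns of S."""
    return [sum(s_ik * v_k for s_ik, v_k in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when no metabolite accumulates or is depleted (S v = 0 within tol)."""
    return all(abs(rate) <= tol for rate in mass_balance(S, v))


S = stoichiometric_matrix(REACTIONS, METABOLITES)

# Hypothetical flux vectors, in column order r1, r2, ex_m2, ex_m4.
balanced = [2.0, 2.0, 2.0, 2.0]     # every metabolite is produced as fast as consumed
unbalanced = [2.0, 1.0, 2.0, 1.0]   # r2 too slow: m3 accumulates, m1 is depleted

for name, v in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, mass_balance(S, v), is_steady_state(S, v))
```

FBA itself adds flux bounds and a linear objective (for example, the biomass reaction) on top of this check and hands the problem to a linear programming solver. A gene deletion is then simulated by fixing the bounds of the affected reactions to zero and re-solving.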