Linear Algebraic Approaches to Metabolic Systems Analysis: Adventures for Undergrads… and Up
NIMBIOS Workshop, June 20, 2014
Terrell L. Hodge

NOTE: The ideas discussed in these slides provide but one inroad into the use of algebraic (here, particularly linear algebraic) methods for the representation and analysis of metabolic and other biochemical reaction networks. This is a fast-moving field, and much newer research on this topic goes well beyond what is covered here. These slides complement the undergraduate curriculum materials in Chapter 8 of Mathematical Concepts and Methods in Modern Biology, but we will also soon start with and use other materials for today’s presentation and exercises (posted on the NIMBIOS blog).

Outline of Workshop Topics
Metabolic Pathways
Stoichiometric Matrices
Null Spaces: Extreme Pathways
Left Null Spaces: Extreme Pool Maps
Going Further: SVDs and More

Metabolic Pathways
Context
Examples
One Mathematical Approach
[diagram from Wikipedia]

The Meaning of “Life”?
Living organisms/systems are, thermodynamically, open systems that tend to maintain a steady state.
Steady state: “All rates of flows in the system are constant, so the system does not change with time.” [A steady state is a stable state!]
The eventual state of a closed thermodynamic system is equilibrium.
Equilibrium: death for a living system. (However, individual reactions in a system may be close to equilibrium.) [Death is also a very stable state.]

Metabolic pathways [Voet & Voet]
“Metabolism is the overall process through which living systems acquire and utilize free energy to carry out their various functions.”
Metabolism is enacted through metabolic pathways: chains of “consecutive enzymatic reactions that produce specific products for use by an organism”.
The metabolites in a metabolic pathway are usually taken to be the substrates, intermediates, and products in this chain of reactions.

Example: Glycolysis [diagram from Voet & Voet]
‘Glykos’ = ‘sweet’, ‘lysis’ = ‘loosening’ (Greek)
Widely shared mechanism of life forms for energy extraction
Overall reaction: GLU + 2NAD+ + 2ADP + 2Pi → 2NADH + 2PYR + 2ATP + 2H2O + 4H+

Stoichiometric Matrix S
Metabolic system with m metabolites, n reactions
Dynamic mass balance equation: dC/dt = Sv
$S = (s_{ij})$ is an integer-valued m by n matrix; row i corresponds to metabolite i and column j to reaction j:
$S = \begin{pmatrix} s_{11} & \cdots & s_{1n} \\ s_{21} & \cdots & s_{2n} \\ \vdots & & \vdots \\ s_{m1} & \cdots & s_{mn} \end{pmatrix}$
– $s_{ij} = 0$ if metabolite i is not involved in reaction j
– $s_{ij} < 0$ if metabolite i is a substrate in reaction j ($|s_{ij}|$ moles (units) consumed in reaction j)
– $s_{ij} > 0$ if metabolite i is a product of reaction j ($|s_{ij}|$ moles formed in reaction j)

Example: Glycolysis [diagram from Voet & Voet]

Dynamic Mass-Balance Equation(s): dC/dt = Sv
Metabolic system with m metabolites, n reactions
$C_i = [X_i]$ := concentration in the metabolic system of $X_i$ := metabolite i, for i = 1, …, m
$dC_i/dt = s_{i1}v_1 + s_{i2}v_2 + \dots + s_{in}v_n$, where $v_j$ = rate of reaction j := flux of reaction j

Example: Glycolysis Dynamic Mass Balance [diagram from Voet & Voet]
First two reactions, again.
Exercise: Repeat for extended two-reaction system, and/or first three reactions.

A Linear Algebraic Perspective
Sv = 0: Null Space of S, N(S)
$x^T S = 0$: Left Null Space of S, $N(S^T)$
Picture modified from [Famili and Palsson]
Focus for the next slides: Sv = 0 and the null space N(S).
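
As a minimal sketch of the dynamic mass-balance equation dC/dt = Sv, the snippet below builds a stoichiometric matrix for the first two glycolysis reactions and evaluates S v for one flux vector. The reaction forms used (hexokinase: GLC + ATP → G6P + ADP; phosphoglucose isomerase: G6P → F6P), the omission of H+, the metabolite ordering, and the flux values are illustrative assumptions, not taken from the slides' diagrams.

```python
# Rows are metabolites, columns are reactions (assumed ordering).
metabolites = ["GLC", "ATP", "G6P", "ADP", "F6P"]
reactions = ["HK", "PGI"]

S = [
    [-1,  0],  # GLC: consumed by HK
    [-1,  0],  # ATP: consumed by HK
    [ 1, -1],  # G6P: produced by HK, consumed by PGI
    [ 1,  0],  # ADP: produced by HK
    [ 0,  1],  # F6P: produced by PGI
]


def mass_balance(S, v):
    """Return dC/dt = S v as a list with one entry per metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


v = [2.0, 0.5]  # hypothetical fluxes v1 (HK) and v2 (PGI)
dC_dt = mass_balance(S, v)

for name, rate in zip(metabolites, dC_dt):
    print(f"d[{name}]/dt = {rate:+.2f}")
```

With these fluxes G6P accumulates, because HK produces it faster than PGI consumes it; a steady state for G6P would require v1 = v2.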

Null Space of S
Sv = 0: steady-state solutions to the dynamic mass balance equation dC/dt = Sv
$N(S) := \{ y \in \mathbb{R}^n \mid Sy = 0 \}$; a vector v in N(S) is a (steady-state) flux vector for the metabolic system
Vectors in N(S) give dependencies among the columns of S (i.e., reactions)

Example: Glycolysis [diagram from Voet & Voet]

Metabolic Pathways Exercise, Step 1: Build Your Own Stoichiometric Matrix
A ‘toy’ example from [Schilling and Palsson].

Metabolic Pathways Exercise, Step 2: Read off the “Pathways”
Figures from [Schilling & Palsson]

Metabolic Pathways Exercise: Drawing “Paths”
Figure from [Schilling & Palsson]

Metabolic Pathways Exercise, Step 3: Change Basis, Read Pathways
Figures from [Schilling & Palsson]

Metabolic Pathways Exercise: Drawing “Paths” Again
Figure from [Schilling & Palsson]

From the Null Space of S to Extreme Pathways
Null space of S: Standard methods yield a mathematically valid basis of N(S), but the resulting vectors may not be biologically valid total flux vectors (for example, a basis vector may assign a negative flux to an irreversible reaction).
Base-changing: Aim for a “biologically valid” basis of N(S); does one necessarily exist? Even if so, what about uniqueness?
Next: Convex hulls, extreme pathways, and examples.

Biologically “Good” Flux Vectors
In the convex hull flux cone(S), there is an analogue of a basis for N(S), only better: a generating set of ‘independent’ flux vectors $P = \{p_1, \dots, p_t\}$, unique up to taking positive scalar multiples, for which every w in flux cone(S) is a non-negative linear combination of vectors in P. (The set P is unique; the coefficients of a given w need not be when t exceeds the dimension of N(S).)
Image from [Schilling, Letscher, Palsson]

Example: Extreme Pathways (Expas) [Schilling, Schuster, Palsson & Heinrich]
Look here!
Basis (transposed) for N(S): $b_1, b_2, b_3$
$p_1 = f_1 := b_1 - b_2$, $p_2 = f_2 := b_1$, $p_3 = f_3 := b_3 - b_2$, $p_4 = f_4 := b_3$
Expas: systematically independent basis P (transposed) for the convex flux cone: $f_1, f_2, f_3, f_4$
Figures and tables from [Schilling, Schuster, Palsson & Heinrich]

Expas Example: Human Red Blood Cell (HRBC) [Wiback & Palsson]
Model accounts for 39 metabolites and 32 internal metabolic reactions, as well as 19 external ones (12 primary exchange and 7 currency exchange fluxes).
Resulting flux cone(S) has |P| = 54; further partitioning into ‘types’ yields 39 expas of interest (36 Type I, 3 Type II).

HRBC: Some of the Type I Expas [Wiback & Palsson]

HRBC: More Type I and II Expas [Wiback & Palsson]
Outcomes include:
Unique and mathematically precise description of pathways, including key ‘historical pathways’, but extending to many ‘less intuitive’ paths that reflect network properties
Opportunity to predict system effects of enzymopathies and other ‘load’ capacities on individual reactions
Another not-so-subtle point of the last few slides: consider the advantages of a good mathematical framework, such as linear algebra.

Enzymopathies in HRBC [Çakir, Tacer & Ülgen]
Human red blood cell model with 44 metabolites and 39 reactions
Investigates 5 (of about 20 known) enzymopathies (in this case, enzyme deficiencies) using metabolic pathway analysis
Follows work including [Wiback & Palsson], but using EFMs (elementary flux modes)
One aim: identify targets for drug intervention for diseases caused by enzyme alterations/dysfunction

Recall: A Linear Algebraic Perspective
Sv = 0: Null Space of S, N(S)
$x^T S = 0$: Left Null Space of S, $N(S^T)$ (focus here)
Picture modified from [Famili and Palsson]

Left Null Space of S
$N(S^T) = \{ x \in \mathbb{R}^m \mid S^T x = 0 \} = N(-S^T) = \{ x \in \mathbb{R}^m \mid x^T S = 0 \}$ *
A vector x in $N(S^T)$ is a potential “pool map”, defining a conservation relationship**
Vectors in $N(S^T)$ give dependencies among the rows of S (metabolites)
* $\mathbb{R}^m$ consists of column vectors.
** Details in the Appendix.
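
To make the earlier change of basis for N(S) concrete (a standard basis b versus non-negative pathway vectors p), here is a minimal sketch that computes a null-space basis by exact Gauss-Jordan elimination and then changes basis by hand. The four-reaction network, its reaction ordering, and the helper name `null_space` are hypothetical; this is not the example from [Schilling, Schuster, Palsson & Heinrich], and it is not the expa algorithm.

```python
from fractions import Fraction


def null_space(S):
    """Return a basis of N(S) = {v : S v = 0} using exact row reduction."""
    rows = [[Fraction(x) for x in row] for row in S]
    n = len(rows[0])
    pivots = []
    r = 0
    for c in range(n):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fc in free:
        # Set one free variable to 1 and solve for the pivot variables.
        v = [Fraction(0)] * n
        v[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][fc]
        basis.append(v)
    return basis


# Hypothetical toy network, all reactions irreversible:
# v1: A -> B, v2: B -> (export), v3: (uptake) -> A, v4: A -> (export)
S = [
    [-1,  0, 1, -1],  # A
    [ 1, -1, 0,  0],  # B
]

b1, b2 = null_space(S)
p1 = b1                                # already non-negative
p2 = [x + y for x, y in zip(b1, b2)]   # change of basis removes negative fluxes

print("b1 =", [int(x) for x in b1])
print("b2 =", [int(x) for x in b2], "(negative entries: not a valid flux)")
print("p2 = b1 + b2 =", [int(x) for x in p2])
```

In this toy case b2 spans part of N(S) but would run irreversible reactions backwards, while p1 and p2 span the same space with non-negative entries only.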

Conservation and Pool Maps Exercise, Step 1: Build Your Own Dynamic Mass-Balance Equations
A ‘toy’ example from [Nikolaev, Burgard, & Maranas]

Conservation and Pool Maps Exercise, Step 1: Build Your Own Dynamic Mass-Balance Equations (Solution)
Note the difference from the previous sign conventions; can use –S in place of the prior S.
Example from [Nikolaev, Burgard, & Maranas]

Conservation and Pool Maps Exercise, Step 2: Find Conserved Cycles
Note the difference from the previous sign conventions; can use –S in place of the prior S.
Diagram from [Nikolaev, Burgard, & Maranas]

Alternate Perspective: Reaction Maps to Compound Maps
e.g., as in [Famili and Palsson]
Reaction map, with S and N(S):
– Metabolites: nodes
– Reactions: arrows
– Substrates: tails of edges
– Products: heads of edges
Compound map, with $-S^T$ and $N(-S^T)$:
– Reactions: nodes
– Metabolites: arrows
– Substrates: edges entering nodes
– Products: edges exiting nodes

Left Null Space of S: Compound Maps and Extreme Pool Maps
As before, a basis of $N(-S^T)$ lacks uniqueness and may not be biologically interesting
As before, compute the convex basis, and call the resulting (unique) vectors extreme pool maps (extreme pools)

Example: Extreme Pool Maps in Glycolysis [Nikolaev, Burgard & Maranas]
Glycolysis represented with 11 metabolites (16 if ATP, ADP, NAD+, NADH, and H2O are included) and 13 reactions.
Flux cone($-S^T$) has a systematically independent basis with |P| = 8 vectors, so there are 8 extreme pool maps.
Diagram from [Nikolaev, Burgard & Maranas]
Glycolysis is offered as a real-life system that is both rich and small enough for direct computation and analysis of extreme pools.
However, the paper considers alternative methods to elucidate and analyze extreme pools for larger systems, paralleling alternate ‘flux coupling’ methods for extreme paths.
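
A vector x with x^T S = 0 defines a conserved pool because d(x^T C)/dt = x^T S v = 0 for every flux vector v, not only at steady state. The sketch below checks this on a hypothetical four-metabolite network in which ATP + ADP is the pool. The network, the flux schedule, and the step size are assumptions chosen for illustration, and exact fractions are used so the conserved quantity can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical toy network (rows: metabolites, columns: reactions):
# v1: (uptake) -> A, v2: A + ATP -> B + ADP, v3: B -> (export), v4: ADP -> ATP
metabolites = ["A", "B", "ATP", "ADP"]
S = [
    [1, -1,  0,  0],  # A
    [0,  1, -1,  0],  # B
    [0, -1,  0,  1],  # ATP
    [0,  1,  0, -1],  # ADP
]

x = [0, 0, 1, 1]  # candidate pool map: ATP + ADP


def left_product(x, S):
    """Return the row vector x^T S."""
    return [sum(x[i] * S[i][j] for i in range(len(S))) for j in range(len(S[0]))]


def euler_step(C, S, v, dt):
    """One explicit Euler step of dC/dt = S v."""
    return [c + dt * sum(s * vj for s, vj in zip(row, v)) for c, row in zip(C, S)]


def pool(x, C):
    """Size of the pool x^T C."""
    return sum(xi * ci for xi, ci in zip(x, C))


print("x^T S =", left_product(x, S))  # all zeros means x is in N(S^T)

C = [Fraction(1), Fraction(1, 2), Fraction(3), Fraction(1)]
dt = Fraction(1, 10)
# Arbitrary fluxes that are NOT steady states; the pool should still not change.
flux_schedule = [[2, 1, 0, 3], [0, 4, 1, 1], [1, 1, 5, 0]]

pools = [pool(x, C)]
for v in flux_schedule:
    C = euler_step(C, S, v, dt)
    pools.append(pool(x, C))

print("ATP + ADP over time:", [str(p) for p in pools])
```

The individual concentrations of ATP and ADP change at every step, but their sum does not, which is the sense in which a vector in the left null space describes a conservation relationship.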

Metabolic Pathway Analysis, Extreme Pathways, and Extreme Pools: Some Consequences
Yields mathematically precise definitions of metabolic pools and pathways that take a systems/network approach
Yields a ‘unique’ generating set, with properties similar to vector space bases (‘minimality’ and ‘spanning’)
Gives geometrically and graphically appealing interpretations
Algorithms and programs exist for computing extreme paths and extreme pools
Linear algebra provides an accessible mathematical framework that is rich in computational power and is a base for many other mathematical structures
Extends current biological ‘intuition’; suggests mechanisms for understanding how living systems maintain steady states and fight or fall to disease, as well as for the proper design of medical interventions.

Linear Algebra Again: The Four Fundamental Subspaces and S
Sv = 0: Null Space of S, N(S)
$x^T S = 0$: Left Null Space of S, $N(S^T)$
Focus here.
Picture modified from [Famili and Palsson]

Singular Value Decomposition (SVD): Learning More From S [Price, et al.]
Diagram from [Price, et al.]; ‘p’ in the diagram = ‘t’ below.
Recall the convex basis of expas $P = \{p_1, \dots, p_t\}$ for flux cone(S).
Set P to be the matrix with columns $p_1, \dots, p_t$.
Find the SVD $P = U \Sigma V^T$.
The analysis allows for comparison of extreme pathways for different metabolic systems, and may assist in identifying key branch points (targets for regulation).

Singular Value Decomposition (SVD): Learning More From S [Price, et al.]
The column vectors of U (‘modes’) give information about flux variability within the cone.
The singular values measure variance in the directions given by the corresponding U vectors.
Diagrams from [Price, et al.]

Some Issues in Use of Extreme Pathways/Extreme Pool Maps
In small to medium systems, expas/expools can be calculated, but giving their biological interpretations is not automated!
Scaling to the genome level is an issue: implementations of the original algorithms for computing flux cones (like expa) are problematic for large systems; computational round-off error for large S, combinatorial explosion, and NP-completeness issues arise…
Perspective may be enhanced by comparison with other linear algebra/convex analysis/linear programming methods*, e.g., EFMs, FCA, MCCA and MCPI, FluxAnalyzer…
Dynamic and regulatory information are not, in general, treated in the MPA (metabolic pathway analysis) approach.
Biologists are not out of jobs: good biological data and physical approaches to flux determination (isotope labeling, etc.) are still important.
* Still other approaches/enhancements exist, making use of aspects of probability and statistics, etc…

References
Schilling and Palsson, The underlying pathway structure of biochemical reaction networks.
Famili and Palsson, The convex basis of the left null space of the stoichiometric matrix leads to the definition of metabolically meaningful pools.
Schilling, Schuster, Palsson and Heinrich, Metabolic pathway analysis: Basic concepts and scientific applications in the postgenomic era.
Çakir, Tacer and Ülgen, Metabolic pathway analysis of enzyme-deficient human red blood cells.
Suthers, Burgard, Dasika, Nowroozi, Van Dien, Keasling, Maranas, Metabolic flux elucidation for large-scale models using 13C labeled isotopes.
Price, Reed, Papin, Famili and Palsson, Analysis of metabolic capabilities using singular value decomposition of extreme pathway matrices.
Schilling, Letscher, and Palsson, Theory for systematic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective.
Wiback and Palsson, Extreme pathway analysis of human red blood cell metabolism.
Nikolaev, Burgard, and Maranas, Elucidation and Structural Analysis of Conserved Pools for Genome-Scale Metabolic Reconstructions.
Voet and Voet, Biochemistry (3rd Ed.)
Bell and Palsson, expa: a program for calculating extreme pathways in biochemical reaction networks.
Becker, Feist, Mo, Hannum, Palsson, and Herrgard, Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox.
Schuster, Dandekar, and Fell, Detection of elementary flux modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering.
Klamt, Stelling, Ginkel and Gilles, FluxAnalyzer: exploring structure, pathways, and flux distributions in metabolic networks on interactive flux maps.
Palsson, Representing Reconstructed Networks Mathematically: The Stoichiometric Matrix (lecture series). Systems Biology Research Group, http://gcrg.ucsd.edu/
Schuster, Fell, and Dandekar, A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks.
Schuster and Hilgetag, On elementary flux modes in biochemical reaction systems at steady state.
Schuster, Hilgetag, Woods and Fell, Reaction routes in biochemical reaction systems: algebraic properties, validated calculation procedure and example from nucleotide metabolism.