Complex Models in Systems Biology

Program on Development, Assessment, and Utilization of Complex Computer Models
SAMSI, September 11, 2006

Reinhard Laubenbacher
Virginia Bioinformatics Institute and Mathematics Department
Virginia Tech

The Hallmarks of Cancer
Hanahan & Weinberg (2000)

Biochemical Networks
[Figure: a biochemical network drawn across three levels: gene space (Genes 1 to 4), protein space (Proteins 1 to 4 and Complex 3:4), and metabolic space (Metabolites 1 and 2).]
Brazhnik, P., de la Fuente, A. and Mendes, P. Trends in Biotechnology 20, 2002

System-Level Experimental Data

Systems biology
• New technology allows system-level measurements in molecular biology.
• Can build system-level models.
• Need to scale up modeling technology.

The dynamic GNS simulation of interconnected signal transduction pathways and gene expression networks controlling human cell growth contains over 2,000 variables. The model describes the processes of endocytosis, receptor signaling, signal transduction, transcriptional control of gene expression networks, and protein translation and degradation mechanisms. It predicts various physiological outcomes such as cell cycle progression and arrest through G1-S and G2-M starting from mitogenic signaling, cell cycle arrest and apoptosis induction via p53, and the interplay between survival signals and apoptosis.
Gene Network Sciences, http://www.gnsbiotech.com/news-press020603.html

Fruit fly image: www.ars-grin.gov/mia/images/News/

Wildtype Gene Expression
Nature 406, 2000

A Boolean network is a time-discrete dynamical system $f = (f_1, \dots, f_n): \{0, 1\}^n \to \{0, 1\}^n$. Each $f_i$ is a Boolean function. Dynamics is generated by iteration. For a binary vector $x$ we have $f(x) = (f_1(x), \dots, f_n(x))$, that is, the variables are updated synchronously.

A Mathematics Program
• Study stochastic sequential dynamical systems of the form $f = (f_1, \dots, f_n): k^n \to k^n$, where $k$ is a finite field and each $f_i: k^n \to k^n$ changes only the $i$-th coordinate.
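As a minimal sketch of the two kinds of update map defined so far, the Python below builds a synchronous Boolean update and the corresponding local maps that change a single coordinate. The three-node network (`f1`, `f2`, `f3`) and all function names are hypothetical and chosen only for illustration; they do not come from any model cited in these slides.

```python
from itertools import product

# Hypothetical 3-node Boolean network (illustrative only, not from the slides).
def f1(x):
    return x[1] & x[2]   # x1 <- x2 AND x3

def f2(x):
    return x[0] | x[2]   # x2 <- x1 OR x3

def f3(x):
    return 1 - x[0]      # x3 <- NOT x1

COORDS = (f1, f2, f3)

def synchronous_step(x):
    """f(x) = (f1(x), ..., fn(x)): every coordinate is updated at once."""
    return tuple(fi(x) for fi in COORDS)

def local_step(i, x):
    """Local map that changes only coordinate i (0-based) and keeps the rest."""
    y = list(x)
    y[i] = COORDS[i](x)
    return tuple(y)

def trajectory(x, steps):
    """Iterate the synchronous update and record every visited state."""
    states = [x]
    for _ in range(steps):
        x = synchronous_step(x)
        states.append(x)
    return states

def fixed_points():
    """Enumerate all 2^n states and keep those with f(x) = x."""
    n = len(COORDS)
    return [x for x in product((0, 1), repeat=n) if synchronous_step(x) == x]

if __name__ == "__main__":
    for state in trajectory((1, 1, 1), 5):
        print(state)
    print("fixed points:", fixed_points())
```

Exhaustive enumeration as in `fixed_points` is only feasible for small n, since the state space has 2^n elements; it is used here purely to make the definition concrete.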
An update of the system is computed by choosing an update order of the variables based on a probability distribution on update orders. That is, $f = f_i \circ f_j \circ \dots$

Fact: Each $f_i$ can be represented uniquely as a polynomial function.

“Bottom-up modeling”: Model individual pathways and aggregate to system-level models.
“Top-down modeling”: Develop network inference methods for system-level phenomenological models.

Model Types
Ideker, Lauffenburger, Trends in Biotech 21, 2003

Challenges
Cellular biochemical networks are
• Nonlinear
• High-dimensional
• Poorly understood
• Underdetermined by available data, which are typically noisy
• Difficult to perturb

Network inference
Problem: Given $D = \{(s_i, t_i) \in k^n \times k^n\}$, find the “most likely” model $f: k^n \to k^n$ such that $f(s_i) = t_i$.

Using methods from computational algebra, one can describe the entire space of possible models in a compact way and choose a most parsimonious model by optimizing model structure.

R. Laubenbacher and B. Stigler, A computational algebra approach to the reverse-engineering of gene regulatory networks, J. Theor. Biol. 229 (2004)
A. Jarrah, R. Laubenbacher, B. Stigler, and M. Stillman, Reverse-engineering of polynomial dynamical systems, Adv. in Appl. Math. (2006), in press

Application
Use the Albert-Othmer Boolean model to generate time courses (wild-type and knockout mutant), totaling 24 time courses of 7 data points each. (Note that the system has $2^{21}$ possible states.) The reverse-engineering algorithm recovers the “wiring diagram” of the network correctly, as well as 19 of the 21 Boolean functions.

Summary
• Understanding cellular networks is very important (e.g., personalized medicine).
• System-level data are increasingly available and increasingly quantitative.
• Many mathematical and computational problems are waiting to be solved.
• There is a large community of life scientists eager to collaborate.
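The stochastic sequential update from the Mathematics Program slide can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the field is taken to be F_3, the three coordinate polynomials in `POLYS` are invented, and the distribution on update orders is an arbitrary two-point distribution. The sketch only shows the mechanism: draw an update order at random, then apply the local maps one after another in that order.

```python
import random

P = 3  # assumed field size for illustration: k = F_3

# Hypothetical coordinate polynomials over F_3 in x[0], x[1], x[2].
POLYS = (
    lambda x: (x[1] * x[2] + 1) % P,
    lambda x: (x[0] + 2 * x[2]) % P,
    lambda x: (x[0] * x[0] + x[1]) % P,
)

def local_update(i, x):
    """Apply the i-th local map: only coordinate i (0-based) changes."""
    y = list(x)
    y[i] = POLYS[i](x)
    return tuple(y)

def sequential_update(x, order):
    """Update the variables one at a time, in the sequence listed in `order`."""
    for i in order:
        x = local_update(i, x)
    return x

def stochastic_update(x, orders, weights, rng):
    """Draw one update order from the given distribution, then apply it."""
    order = rng.choices(orders, weights=weights, k=1)[0]
    return sequential_update(x, order), order

if __name__ == "__main__":
    rng = random.Random(0)            # seeded so runs are repeatable
    orders = [(0, 1, 2), (2, 1, 0)]   # hypothetical support of the distribution
    weights = [0.7, 0.3]              # hypothetical probabilities
    x = (1, 2, 0)
    for _ in range(5):
        x, used = stochastic_update(x, orders, weights, rng)
        print(used, x)
```

With these made-up polynomials, the same starting state (1, 2, 0) is sent to different states under the two update orders, which is the reason the update order has to be part of the model specification.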