An Information Integration System for Automated Reconstruction and Dynamical Modeling of Gene Regulatory Networks

Michael Baitaluk1, Amarnath Gupta1, Shubhada Godbole2, Xufei Qian1, Vijay Chickarmane2, and Animesh Ray2
1San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92039
2Keck Graduate Institute, Claremont, CA 91711

I. AIMS
This poster describes methods for data integration, automated retrieval and dynamic simulation of gene/protein interaction networks as a tool to understand cellular regulation from a systems biology perspective. Three questions are addressed:
1. How to develop a truly integrated computational platform that allows intelligent retrieval of any genomic-scale or single-gene information.
2. How to automate the retrieval of cellular interaction networks by data integration approaches.
3. How to derive dynamic simulations from the static networks.

II. RATIONALE
[Figure labels: Large interaction datasets; Curated data from literature; Gene expression datasets; Warehouse of data; Network query; Annotation, Ontology; Static interaction network model]

Internal Data Model Basics:
1. Primary Nodes (States): All molecules (e.g. DNAs, RNAs, proteins), small molecules (e.g. ions, ATP, lipids), physical events (heat, radiation, stress).
2. Connector Nodes (Transitions): All types of interactions (binding, chemical reaction, expression, etc.).
3. Graph Nodes (hypernodes): Complex objects (protein complexes, pathways, cell processes) that might contain graphs.
Cellular compartment: part of the model.

III. SIMULATION OF EARLY MEIOSIS NETWORK
[Figure: meiosis in S. cerevisiae, from a vegetative cell (2N) to a tetrad, 4 x (1N). Node labels: Ime1p, NDT80, Ndt80p]
Transcriptional activation of the IME1 gene is the trigger of sporulation/meiosis.
Ime1p is a transcription factor that activates Ime2p, a kinase, which in turn allows autoactivation of Ime1p; together they activate a second transcription factor gene, NDT80.

PathSys query: Retrieved a network of early meiosis genes.

Dynamical model: Petri Net (a state-transition model), Ordinary Differential Equations.

Data Model: PathSys’s data model is related to the Hybrid Functional Petri Net (HFPN) data model in that HFPN modeling has places (states) and transitions as nodes in the interaction graph, and certain concepts such as node attributes and abstractions may be defined. One could treat our primary nodes, together with their attributes, as states, and our connector nodes, together with their attributes, as transition specifiers.

[Figure: BiologicalNetworks architecture. Labels: GUI Level; Querying; Querying Templates; Querying Wizard; DAG Querying Window; Ontologies; Annotation Server; Gene Ontology; KEGG Ontology; …; External DB Sources (SOAP)]

Developed Software:

PathSys: PathSys is a data integration platform that provides dynamic integration over diverse databases containing genetic, protein-protein and DNA-protein interactions, protein localization, and microarray data available through published literature.

BiologicalNetworks: BiologicalNetworks provides access to PathSys through a novel query engine that stores and queries directed acyclic graphs, such as ontologies and taxonomies, in addition to molecular network interactions, allowing easy retrieval and visualization of complex biological networks.
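The correspondence between PathSys node types and Petri net places and transitions described above can be illustrated with a small sketch. This is not PathSys code: the class names (PrimaryNode, ConnectorNode), the function to_petri_net, the "initial_tokens" attribute, and the encoding of the Ime1p/Ime2p/NDT80 interactions are all assumptions made here for illustration, and Ndt80p stands in for the product of the NDT80 gene.

```python
from dataclasses import dataclass, field


@dataclass
class PrimaryNode:
    """A molecule, small molecule, or physical event (treated as a place)."""
    name: str
    kind: str
    attributes: dict = field(default_factory=dict)


@dataclass
class ConnectorNode:
    """An interaction between primary nodes (treated as a transition)."""
    name: str
    interaction: str
    inputs: list
    outputs: list


def to_petri_net(primary_nodes, connector_nodes):
    """Map primary nodes to places and connector nodes to transitions."""
    # Hypothetical convention: a node attribute carries the initial marking.
    places = {p.name: p.attributes.get("initial_tokens", 0) for p in primary_nodes}
    transitions = {}
    for c in connector_nodes:
        for name in c.inputs + c.outputs:
            if name not in places:
                raise KeyError(f"{c.name} refers to unknown node {name!r}")
        transitions[c.name] = {"consumes": list(c.inputs), "produces": list(c.outputs)}
    return places, transitions


# Hypothetical encoding of the early meiosis description. An activator
# appears in both inputs and outputs so that it is not used up.
primary = [
    PrimaryNode("Ime1p", "protein", {"initial_tokens": 1}),
    PrimaryNode("Ime2p", "protein"),
    PrimaryNode("Ndt80p", "protein"),
]
connectors = [
    ConnectorNode("Ime1p_activates_Ime2p", "activation",
                  ["Ime1p"], ["Ime1p", "Ime2p"]),
    ConnectorNode("NDT80_expression", "expression",
                  ["Ime1p", "Ime2p"], ["Ime1p", "Ime2p", "Ndt80p"]),
]

places, transitions = to_petri_net(primary, connectors)
print(places)
print(transitions["NDT80_expression"])
```

Keeping the node attributes in a plain dictionary mirrors the statement that a primary node together with its attributes plays the role of a state; graph nodes (hypernodes) and compartments are left out of this sketch.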
BiologicalNetworks: http://brak.sdsc.edu/pub/BiologicalNetworks

[Figures: BiologicalNetworks architecture and PathSys system architecture. Component labels: Graph Editing and Selection; Layout Engine; Visual Mapper; DB Interfaces; Project Interfaces; PetriNet GUI; Simulation Engine; User Privileges; Update/Editing Graph, DB; Middle Tier; DataBase Tier; Graph Engine; DAG Engine; PetriNet Engine; Cytoscape; Graph Representation; GRN Schema (Ternary); KEGG Ternary Relations Schema; Gene Ontology Schema; Generic Schema; Biological Networks GRN+GO+…; BIRN Ontologies; External DB Sources]

[Figure: Petri Net-equivalent of Ime1 network]

[Figure: Petri Net simulation results]

From static networks to dynamic simulation: the method
The diagrams above show how a query for a particular set of genes returns a network, for which a Petri Net simulation can then be run.

[Figure: method diagram. Labels: Query; Static network; a) GO function libraries (for biological/molecular processes); b) Parameters and initial conditions; PN-software; Petri Net Model]

[Figure: expression time course. Series: IME1, IME2, UME6, RIM15, NDT80_at, SUM1_at, YHP1. Time axis: t=0, 1h, 2h, 3h, 4h, 6h, 8h, 10h. Vertical axis: -500 to 3500]
Microarray expression data (fitted here by polynomials) suggest that IME1 mRNA may indeed oscillate over time. [Data from: Primig et al., Nat Genet. 26: 415-23 (2000)]

PathSys: The system is equipped with two novel query engines with a built-in SQL-like querying language that supports operations on paths, trees, and graphs. The server provides querying services and an information management framework over PathSys. The system integrates over 20 curated and publicly contributed data sources for the budding yeast (S. cerevisiae) and the fly (D. melanogaster). PathSys is capable of generating and simulating dynamical gene regulatory models from molecular interaction graphs, based on Hybrid Functional Petri Nets and XML technology, allowing the user to simulate and predict gene expression dynamics.
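To make the idea of simulating a retrieved network as a Petri net concrete, here is a minimal sketch of a purely discrete token game. It is an illustration only, not the PathSys simulation engine: it ignores the continuous part of Hybrid Functional Petri Nets, uses no rates or fitted parameters, and the place names, transition names, and initial marking are assumptions chosen to follow the Ime1p, Ime2p, NDT80 description on this poster.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition["consumes"].items())


def fire(marking, transition):
    """Return the marking after firing one transition (input is not mutated)."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["consumes"].items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition["produces"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


def simulate(marking, transitions, steps):
    """Record the marking after each step.

    In each step, a transition fires once if it was enabled at the start
    of the step and is still enabled when its turn comes.
    """
    history = [dict(marking)]
    for _ in range(steps):
        start = history[-1]
        current = dict(start)
        for t in transitions.values():
            if enabled(start, t) and enabled(current, t):
                current = fire(current, t)
        history.append(current)
    return history


# Hypothetical network: activators are consumed and produced again,
# so they act as read-only inputs.
transitions = {
    "express_IME1": {"consumes": {"IME1_trigger": 1},
                     "produces": {"Ime1p": 1}},
    "activate_Ime2p": {"consumes": {"Ime1p": 1},
                       "produces": {"Ime1p": 1, "Ime2p": 1}},
    "express_NDT80": {"consumes": {"Ime1p": 1, "Ime2p": 1},
                      "produces": {"Ime1p": 1, "Ime2p": 1, "Ndt80p": 1}},
}
initial = {"IME1_trigger": 1, "Ime1p": 0, "Ime2p": 0, "Ndt80p": 0}

history = simulate(initial, transitions, steps=3)
for step, marking in enumerate(history):
    print(step, marking)
```

The step rule used here (fire every transition that was enabled at the start of the step) is one simple choice among several possible firing policies; a hybrid model would replace token counts with continuous quantities and rate functions.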
[Figure labels: PathSys; Experiment; Pure Dynamical Model]

Petri Nets: Petri nets are a promising tool for modeling systems with concurrency and resource sharing. In addition, they can easily represent hybrid systems comprising both continuous and discrete dynamics. For data-poor biological systems, Petri nets are a useful descriptive intermediate; as additional data become available, they can be converted to fully continuous models, which are then amenable to a wide range of analytical and numerical techniques.

Summary
We describe PathSys, a database system that integrates over 20 curated and publicly contributed data sources for the budding yeast (S. cerevisiae) and the fly (D. melanogaster), and BiologicalNetworks, a bioinformatics software platform for visualizing molecular interaction networks; integrating these interactions with other graph-structured data such as ontologies (e.g. Gene Ontology) and taxonomies (such as the enzyme classification system and the functional classification of yeast proteins); integrating interactions with gene expression data and other state data; querying all types of data to extract biologically meaningful relations; and pathway modeling and simulation. The model describes automated reconstruction and dynamic simulation of gene regulatory networks. This is achieved by first querying a database to obtain a network, then using a Petri Net to create a logical state-transition flow chart, and finally using this to construct a reaction scheme, which is described by a set of coupled ODEs.

Acknowledgements: We thank Herbert Sauro, KiriLynn Svay, and Aditya Bagchi. This project is supported by National Science Foundation grants EIA-0205061 and EIA-0130059.