An Information Integration System for Automated Reconstruction and Dynamical Modeling of Gene
Regulatory Networks
Michael Baitaluk1, Amarnath Gupta1, Shubhada Godbole2, Xufei Qian1, Vijay Chickarmane2, and Animesh Ray2
1San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92039, 2Keck Graduate Institute, Claremont, CA 91711
I. AIMS
This poster describes methods for data integration, automated retrieval and
dynamic simulation of gene/protein interaction networks as a tool to
understand cellular regulation from a systems biology perspective. Three
questions are addressed:
1. How can we develop a truly integrated computational platform that allows
intelligent retrieval of any genomic-scale or single-gene information?
2. How can the retrieval of cellular interaction networks be automated by data
integration approaches?
3. How can dynamic simulations be derived from the static networks?
II. RATIONALE
[Figure labels: Large interaction datasets; Curated data from literature; Gene expression datasets]
Internal Data Model Basics:
1. Primary Nodes (States): All molecules (e.g. DNAs, RNAs, proteins), small molecules (e.g. ions, ATP, lipids), physical events (heat, radiation, stress).
2. Connector Nodes (Transitions): All types of interactions (binding, chemical reaction, expression, etc.).
3. Graph Nodes (hypernodes): Complex objects (protein complexes, pathways, cell processes) that might contain graphs.
Cellular compartment: part of the model.
III. SIMULATION OF EARLY MEIOSIS NETWORK
[Figure: meiosis in S. cerevisiae, from vegetative cell (2N) to tetrad, 4 x (1N)]
[Figure labels: Warehouse of data; Network query; Annotation, Ontology; Static interaction network model; Ime1p; NDT80; Ndt80p]
Transcriptional activation of the IME1 gene is the trigger of sporulation/meiosis.
Ime1p is a transcription factor that activates Ime2p, a kinase, which allows
autoactivation of Ime1p; together they activate a second transcription
factor gene, NDT80.
Data Model: PathSys's data model is related to the Hybrid
Functional Petri Net (HFPN) data model in that HFPN
modeling has places (states) and transitions as nodes in
the interaction graph, and certain concepts such as node
attributes and abstractions may be defined. One could
treat our primary and connector nodes, together with
their attributes, as "state" and transition specifiers
respectively.
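The correspondence between the three node types and Petri net places and transitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, fields, and the `to_petri_net` helper are hypothetical and are not taken from the PathSys schema.

```python
from dataclasses import dataclass, field

@dataclass
class PrimaryNode:
    """A state: a molecule, small molecule, or physical event."""
    name: str
    kind: str                       # e.g. "gene", "protein", "ion"
    attributes: dict = field(default_factory=dict)

@dataclass
class ConnectorNode:
    """A transition: an interaction linking input states to output states."""
    name: str
    kind: str                       # e.g. "binding", "expression"
    inputs: list
    outputs: list
    attributes: dict = field(default_factory=dict)

@dataclass
class GraphNode:
    """A hypernode: a complex object that may itself contain a graph."""
    name: str
    primaries: list = field(default_factory=list)
    connectors: list = field(default_factory=list)

def to_petri_net(graph):
    """Map primary nodes to places and connector nodes to transitions."""
    places = [p.name for p in graph.primaries]
    transitions = {
        c.name: ([i.name for i in c.inputs], [o.name for o in c.outputs])
        for c in graph.connectors
    }
    return places, transitions

# Hypothetical example: expression of the IME1 gene produces Ime1p.
ime1_gene = PrimaryNode("IME1", "gene")
ime1p = PrimaryNode("Ime1p", "protein")
expression = ConnectorNode("IME1_expression", "expression", [ime1_gene], [ime1p])
early_meiosis = GraphNode("early_meiosis", [ime1_gene, ime1p], [expression])

places, transitions = to_petri_net(early_meiosis)
```

Here `places` lists the primary nodes and `transitions` maps each connector to its input and output places, which is the shape a Petri net simulator would consume.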
PathSys query: Retrieved a network of early meiosis genes:
Dynamical model: Petri Net (a state-transition model), Ordinary Differential Equations
Developed Software:
[Figure: BiologicalNetworks architecture. GUI level labels: Querying Templates]
PathSys: PathSys is a data integration platform that provides dynamic integration
over diverse databases containing genetic, protein-protein and DNA-protein
interactions, protein localization, and microarray data available through published
literature.
[Figure labels (architecture diagram): Querying Wizard; DAG Querying Window; Ontologies; Annotation Server; Gene Ontology; KEGG Ontology; …; External DB Sources (SOAP)]
BiologicalNetworks: provides access to PathSys through a novel query engine that
stores and queries directed acyclic graphs such as ontologies and taxonomies in
addition to molecular network interactions, allowing easy retrieval and visualization
of complex biological networks. http://brak.sdsc.edu/pub/BiologicalNetworks
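As a rough illustration of what querying a directed acyclic graph alongside molecular data involves, the sketch below walks a toy is-a hierarchy and collects genes annotated at or below a term. The term names, the annotations, and both function names are assumptions made for this example; they are not Gene Ontology content and not the BiologicalNetworks query language.

```python
from collections import deque

# Toy is-a DAG (child -> parents); term names are illustrative only.
IS_A = {
    "meiosis": ["cell cycle process", "reproductive process"],
    "cell cycle process": ["cellular process"],
    "reproductive process": ["biological process"],
    "cellular process": ["biological process"],
    "biological process": [],
}

# Toy gene-to-term annotations, also illustrative only.
ANNOTATIONS = {"IME1": "meiosis", "IME2": "meiosis", "GENE_X": "cellular process"}

def ancestors(term, dag):
    """Return every term reachable from `term` by following is-a edges."""
    seen = set()
    queue = deque(dag.get(term, []))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(dag.get(parent, []))
    return seen

def annotated_under(term, annotations, dag):
    """Genes annotated to `term` itself or to any of its descendants."""
    return sorted(
        gene for gene, t in annotations.items()
        if t == term or term in ancestors(t, dag)
    )
```

Calling `annotated_under("reproductive process", ANNOTATIONS, IS_A)` returns the two genes annotated to the more specific term, which is the kind of result that can then seed a network query.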
[Figure: Petri Net simulation results]
[Figure: Petri Net-equivalent of Ime1 network]
[Figure: PathSys system architecture. Labels from the architecture diagrams: External DB Sources; KEGG; Ternary Relations Schema; Generic Schema; Graph Representation; Graph Editing and Selection; Layout Engine; Visual Mapper; DB Interfaces; PetriNet GUI; Project Interfaces; Simulation Engine; User Privileges; Update/Editing Graph, DB; Graph Engine; Middle Tier; DataBase Tier; DAG Engine; PetriNet Engine; Cytoscape; GRN Schema (Ternary); Gene Ontology Schema; BiologicalNetworks GRN+GO+…; BIRN Ontologies]
From static networks to dynamic simulation: the method
The diagrams above show how a query for a particular set of genes
returns a network, on which a Petri Net simulation can then be run.
[Figure: microarray expression time courses at t=0, 1h, 2h, 3h, 4h, 6h, 8h, 10h for IME1, IME2, UME6, RIM15, NDT80_at, SUM1_at, YHP1]
Microarray expression data (fitted here by polynomials) suggest that IME1 mRNA
may indeed oscillate over time.
[Data from: Primig et al., Nat Genet. 26: 415-23 (2000)]
[Figure: method workflow. Labels: Query; Static network; a) GO function libraries (for biological/molecular processes); b) Parameters and initial conditions; PN-software; Petri Net Model]
PathSys: The system is equipped with two novel query engines with a built-in SQL-like query language that supports path, tree, and graph operations. The server provides
querying services and an information management framework over PathSys. The
system integrates over 20 curated and publicly contributed data sources for the
budding yeast (S. cerevisiae) and fly (D. melanogaster). PathSys is capable of
generating and simulating dynamical gene regulatory models from molecular
interaction graphs based on Hybrid Functional Petri Nets and XML technology,
allowing the user to simulate and predict gene expression dynamics.
[Figure labels: PathSys; Experiment; Pure Dynamical Model]
Petri Nets: Petri Nets are a promising tool for modeling systems with concurrency and resource sharing. In addition, they
can easily represent hybrid systems comprising both continuous and discrete dynamics. For data-poor biological systems, Petri
Nets are a useful descriptive intermediate, and as additional data become available they can be converted to full continuous
models, which are then amenable to a wide range of analytical and numerical techniques.
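To make the state-transition idea concrete, here is a minimal sketch of a purely discrete Petri net in Python. It is not the Hybrid Functional Petri Net engine used by PathSys: the firing rule (every enabled transition fires once per step, in listed order), the token counts, and the three toy transitions loosely named after the Ime1 network are all assumptions chosen for illustration.

```python
def enabled(marking, inputs):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[place] >= n for place, n in inputs.items())

def fire(marking, inputs, outputs):
    """Consume input tokens and produce output tokens; returns a new marking."""
    new = dict(marking)
    for place, n in inputs.items():
        new[place] -= n
    for place, n in outputs.items():
        new[place] = new.get(place, 0) + n
    return new

def simulate(marking, transitions, steps):
    """Fire each enabled transition once per step, in listed order."""
    history = [dict(marking)]
    for _ in range(steps):
        for inputs, outputs in transitions.values():
            if enabled(marking, inputs):
                marking = fire(marking, inputs, outputs)
        history.append(dict(marking))
    return history

# Toy network (hypothetical token counts and stoichiometry).
marking = {"IME1_mRNA": 2, "Ime1p": 0, "Ime2p": 0, "NDT80_mRNA": 0}
transitions = {
    "make_Ime1p": ({"IME1_mRNA": 1}, {"Ime1p": 1}),
    "activate_IME2": ({"Ime1p": 1}, {"Ime1p": 1, "Ime2p": 1}),
    "activate_NDT80": (
        {"Ime1p": 1, "Ime2p": 1},
        {"Ime1p": 1, "Ime2p": 1, "NDT80_mRNA": 1},
    ),
}

history = simulate(marking, transitions, steps=3)
```

Catalytic steps are written with the activator on both the input and output side, so its tokens are returned after firing. A hybrid net would replace some of these integer token moves with continuous rates.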
Summary
We describe PathSys, a database system that integrates over 20 curated and publicly contributed data
sources for the budding yeast (S. cerevisiae) and fly (D. melanogaster), and BiologicalNetworks, a
bioinformatics software platform for visualizing molecular interaction networks; integrating these
interactions with other graph-structured data such as ontologies (e.g. Gene Ontology) and taxonomies
(like the enzyme classification system and functional classification of yeast proteins); integrating
interactions with gene expression data and other state data; querying all types of data to extract
biologically meaningful relations; and pathway modeling and simulation.
The model describes automated reconstruction and dynamic simulation of gene regulatory networks. This is
achieved by first querying a database to obtain a network, using a Petri Net to create a logical state-transition flow chart, and then using this to construct a reaction scheme, which is described by a set of
coupled ODEs.
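The final step, going from a reaction scheme to coupled ODEs, can be sketched as follows. The sketch assumes mass-action kinetics and forward Euler integration, neither of which is stated on the poster; the rate constants, the degradation reaction, and the initial concentrations are hypothetical placeholders.

```python
def derivatives(state, reactions):
    """Mass-action right-hand side: each reaction adds k * product(reactants)."""
    d = {species: 0.0 for species in state}
    for rate, reactants, products in reactions:
        flux = rate
        for r in reactants:
            flux *= state[r]
        for r in reactants:
            d[r] -= flux
        for p in products:
            d[p] += flux
    return d

def euler(state, reactions, dt, steps):
    """Integrate the coupled ODEs with fixed-step forward Euler."""
    state = dict(state)
    for _ in range(steps):
        d = derivatives(state, reactions)
        state = {s: state[s] + dt * d[s] for s in state}
    return state

# Each reaction is (rate constant, reactants, products); values are hypothetical.
reactions = [
    (1.0, ["Ime1p"], ["Ime1p", "Ime2p"]),                          # Ime1p activates Ime2p
    (0.5, ["Ime2p"], []),                                          # assumed Ime2p decay
    (0.2, ["Ime1p", "Ime2p"], ["Ime1p", "Ime2p", "NDT80_mRNA"]),   # joint activation of NDT80
]
initial = {"Ime1p": 1.0, "Ime2p": 0.0, "NDT80_mRNA": 0.0}

final = euler(initial, reactions, dt=0.01, steps=2000)
```

Species that appear on both sides of a reaction act as catalysts and their net change from that reaction is zero. For stiff or larger systems an adaptive solver would be a better choice than fixed-step Euler.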
Acknowledgements: We thank Herbert Sauro, KiriLynn Svay, Aditya Bagchi. This project is supported
by National Science Foundation grants EIA-0205061 and EIA-0130059.