Realistic modeling of biological systems
Synopsis
Realistic modeling of biological systems is becoming feasible thanks to recent progress in experimental and computational methods. On May 1-5, 2005, a workshop was held in Mizpe Hayamim, Israel, to explore the motivation, meaning and methodologies for realistic modeling of biological systems. The goals of such modeling, and the measures of its success, were also discussed and debated.
An ad hoc definition of a realistic model of a biological system (RMBS) is "a comprehensive model of a complex biological system that can be interrogated to reproduce and predict the behavior of the system under realistic conditions". Most definitions of scientific models require them to be simpler than the system under study, offering abstractions that derive from understanding.
A debated requirement was that models, like Popperian hypotheses, should produce falsifiable predictions that stand up to tests. Some of the biologists who participated in the workshop proposed a special type of model, the communicative model. Such models capture the knowledge of different researchers in different fields in a concise representation, which may have little predictive power if that knowledge is still too partial. The goals of communicative realistic models are to describe existing knowledge in a unified structure, and to create a framework which allows one, through interacting with the model, to develop new insights, to highlight contradictions, or to point to missing information. It is important to note that only some participants considered this type of realistic model acceptable or useful.
The motivations for developing RMBSs that were discussed included:
- Integrating knowledge and data about a system, bridging disciplines and turfs.
- Testing the existing understanding of the underlying mechanisms by the ability of
the model to reproduce the emerging properties of the system.
- Predicting the behaviors of the system, to suggest experiments that will
discriminate between competing hypotheses.
- Explaining which features of the underlying system are most important in
generating the emerging properties of the system.
Some of the parameters for examining the performance of a realistic model are:
- Completeness: the fraction of known behaviors and structures of the system that
are captured by the model.
- Usefulness: the ability to utilize the model to draw conclusions or make decisions
(such as experimental design).
- Predictiveness: the ability of the model to assist in formulating predictions that
are successfully validated.
- Depth: the ability of the model to describe emerging macroscopic properties of
the system across levels of biological organization (e.g. molecular, cellular, tissue)
based on microscopic elements and rules.
A major topic of discussion was the testing of a model's realism. It was largely fueled by David Harel's proposal for a "grand challenge", in the form of "The Extended Turing Test". Under this test, a model is realistic if "one versed in the field" cannot tell the model from the real system. The discussion of this test pointed to the need to define the legalistic aspects of such a test, as well as the need to establish mechanisms that will prevent failure simply because of differences in the way the answers are communicated (e.g. probing the state of the model being faster than running a test on the real system). With more immediately achievable goals in mind, Ronan Sleep presented a roadmap challenge based on the study of specific systems involving gastrulation and other developmental processes. Sleep's roadmap starts with…
Several models were discussed, demonstrating the place of realistic modeling along the traditional levels of organization in biology. These included modeling metabolism with realistic modeling of membranes and the cytosol (Gordon Brodick), the lytic/lysogenic switch in lambda phage (Anastasia Yartseva), C. elegans vulval development (Jane Hubbard and Michael Stern), and different aspects of immune cell generation (Howard Petrie, Sol Efroni and Naaman Kam) …
[I included here some of the summaries I made during the meeting that ended up more mature. I would appreciate everyone sending me one or two paragraphs that capture the essence of their message for the summary – e.r.]
Sorin Solomon presented a model which, much like Conway's famous Game of Life, allows simple automata to reproduce or die on a grid. In Sorin's system, two types of creatures coexist on the grid: breeders can reproduce when they meet catalyzers, and die with a given probability at each cycle, while catalyzers neither die nor reproduce. Both catalyzers and breeders diffuse randomly on the grid. This very simple system produced two very interesting observations about the nature of emerging properties in complex systems. First, modeling the fate of the system with differential equations gave false predictions. According to such solutions, which average out the behavior of all the elements in the system, if the average "death" rate is bigger than the average "birth" rate, the breeders will gradually go extinct. However, running the simulation shows that random high-density nuclei of breeders around a catalyzer can breed faster than they die; as a result, for most starting conditions the breeders took over the entire grid. The second emerging property of this system is that a cloud of breeders seemed to follow the catalyzers: while none of the individual breeders had any directional interaction with the catalyzers, their density followed them.
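The contrast Sorin described can be reproduced with a toy lattice simulation. The sketch below is not the model presented at the workshop; the grid size, the rates, and the exact update rules are all assumed for illustration. It pairs the agent-level simulation with the "averaged" mean-field dynamics that wrongly predict extinction.

```python
import random

def simulate(L=20, n_cat=2, n_breed=100, p_die=0.1, p_birth=0.5,
             steps=200, seed=0):
    """Toy breeder/catalyzer model on an L x L torus.

    Both species random-walk; each step a breeder dies with probability
    p_die, and reproduces with probability p_birth when it shares a site
    with a catalyzer. Returns the breeder population per step.
    """
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    def step_pos(pos):
        dx, dy = rng.choice(moves)
        return ((pos[0] + dx) % L, (pos[1] + dy) % L)

    cats = [(rng.randrange(L), rng.randrange(L)) for _ in range(n_cat)]
    # start breeders on catalyzer sites so local clusters can form
    breeders = [cats[i % n_cat] for i in range(n_breed)]
    history = []
    for _ in range(steps):
        cats = [step_pos(c) for c in cats]
        cat_sites = set(cats)
        next_breeders = []
        for b in breeders:
            if rng.random() < p_die:
                continue                           # this breeder dies
            if b in cat_sites and rng.random() < p_birth:
                next_breeders.append(step_pos(b))  # offspring diffuses away
            next_breeders.append(step_pos(b))
        breeders = next_breeders[:5000]            # cap to keep the toy run bounded
        history.append(len(breeders))
    return history

def mean_field(n0, L=20, n_cat=2, p_die=0.1, p_birth=0.5, steps=200):
    """Averaged dynamics: the birth rate is diluted over the whole grid,
    so the net growth rate is negative and extinction is predicted."""
    n = float(n0)
    rate = p_birth * n_cat / (L * L) - p_die
    for _ in range(steps):
        n *= 1.0 + rate
    return n
```

With these assumed parameters the mean-field population decays to essentially zero, while the discrete simulation can sustain local clusters of breeders near the catalyzers, which is the qualitative point of the example.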
Jane Hubbard and XXX presented models of vulval development in the worm C. elegans. Prof. Hubbard explained the special challenge and opportunity this worm poses for modeling. Being transparent, fast to breed, and of relatively simple anatomy, this worm is one of the best understood models of development. Decades of research have provided us with clear documentation of cell fates[1] from the zygote all the way to the mature adult with its 1000 or so cells. Furthermore, it is rather simple to silence specific genes in the worm[2], providing an easy way to test the effects of perturbations. In addition, laser ablation can be used to explore the effect of destroying specific cells or structures during development. Michael Stern presented a model of the processes that determine vulval positioning. In this process, a cascade of intercellular signals leads to the differentiation of a set of cells into two cell types: one which will later die to form the vulval cavity, and one which will form its walls. A simulation of this process was presented, based on Live Sequence Charts (LSCs), which captures the formation of the early vulva from simple rules that each of the cells follows.

[1] An interesting discussion surrounded the use of the term "fate". Prof. Irun Cohen argued that it conveys a sense of a predestined process, while in reality the process is driven by a series of "here and now" decisions. Irun expressed concern that the term may mislead because of its everyday English meaning. The developmental biologists acknowledged the concern, but explained that the term is too central to embryogenesis to be replaced.

[2] Gene silencing in C. elegans can be achieved by creating transgenic E. coli bacteria that express a specific siRNA designed to target specific genes or groups of genes, and feeding the worms with these bacteria; C. elegans naturally feeds on bacteria.
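As a purely illustrative sketch (not the LSC model that was presented), the idea that cells adopt one of a small number of fates by following simple local rules can be written as a threshold rule on a hypothetical graded signal; the signal form, thresholds, and fate names below are all assumed.

```python
def assign_fates(n_cells, source, high=0.6, low=0.3):
    """Toy gradient model: a signal decays with distance from a source
    cell; a strong signal induces the 'cavity' fate, an intermediate
    signal the 'wall' fate, and a weak signal no vulval fate."""
    fates = []
    for i in range(n_cells):
        signal = 1.0 / (1 + abs(i - source))
        if signal >= high:
            fates.append("cavity")
        elif signal >= low:
            fates.append("wall")
        else:
            fates.append("other")
    return fates

# For a row of 7 cells with the source at cell 3, the rule yields a
# symmetric pattern: cavity in the center, walls flanking it.
```

This kind of per-cell rule, applied uniformly, is what lets a simulation grow a spatial pattern without any cell "knowing" the global outcome.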
Ohad Parnes discussed models. He argued that in biomedical research there is no real difference between experimental systems and model systems. All experimental systems are models: we are not studying nature directly, but trying to describe "agents", a concept first introduced by Muller, that can explain the behavior of the system in a cause-and-effect way. For example, Schwann showed that the process of stomach digestion requires an agent, later found to be pepsin. Similarly, fermentation was shown to result from a specific agent; through this model, Schwann identified cells as the agent. In the 1970s and 80s, the agent concept started weakening. For example, the clonal selection theory cannot explain the behavior of the system from the individual "agents" alone, and in epidemiology, given the same bacteria, different societies will develop different disease patterns: the patterns emerge from both the substrate (the people) and the agent (the bacteria). In the 1940s and 1950s, system-theoretical approaches had tried to identify high-level rules about the behavior of all biological systems. The idea was to define rules that all systems follow, and then drill down and understand how individual systems realize these rules.
Agent-based programming / agent-based modeling: the agents entering biological models from the field of computer modeling are different from the "old" agent model. Each agent is a much more complex model: it has goals, bounded rationality, etc. This was explicitly not allowed in the physiology of 18th-century science. In modern modeling, goals, behavior and rationality are introduced into physiological systems.
The Q&A discussion revolved around where the field is heading. There was a discussion of replacing the agent with new agents; computational agents were mentioned as a possible emerging new type of agent. Yoram Luzon mentioned the difficulty of reasoning about a system with more than 4-5 elements.
Carl Schaefer presented the Pathway Interaction Database (PID)
(http://cmap.nci.nih.gov/PW) as a prototype database of metabolic and signaling
interactions extracted from the representations of pathways available from KEGG
(http://www.genome.jp/kegg/) and BioCarta (http://www.biocarta.com). The database
currently contains 4207 interactions from 85 human metabolic pathways and 3064
interactions from 259 human signaling pathways. In the PID data model, there are four
interaction types (reaction, modification, transcription, and translocation), four molecule
types (protein, complex, RNA, and compound), and four role types (input, output, agent,
and inhibitor). Post-translational modifications and cellular locations are specified by
labels on uses of molecules in interactions. This scheme models a few simple relations -- cause/effect, producer/consumer -- in a way that supports computation across the entire
set of interactions.
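The four-by-four-by-four scheme can be sketched as a small data model. The class and field names below are illustrative, not PID's actual schema; only the enumerated type, molecule, and role vocabularies come from the description above.

```python
from dataclasses import dataclass
from enum import Enum

class InteractionType(Enum):
    REACTION = "reaction"
    MODIFICATION = "modification"
    TRANSCRIPTION = "transcription"
    TRANSLOCATION = "translocation"

class MoleculeType(Enum):
    PROTEIN = "protein"
    COMPLEX = "complex"
    RNA = "rna"
    COMPOUND = "compound"

class Role(Enum):
    INPUT = "input"
    OUTPUT = "output"
    AGENT = "agent"
    INHIBITOR = "inhibitor"

@dataclass(frozen=True)
class MoleculeUse:
    """A molecule as used in one interaction: modifications and cellular
    location are labels on the use, not on the molecule itself."""
    name: str
    mol_type: MoleculeType
    role: Role
    modifications: tuple = ()   # e.g. ("phosphorylated",)
    location: str = ""          # e.g. "nucleus"

@dataclass
class Interaction:
    id: str
    int_type: InteractionType
    participants: list          # list of MoleculeUse
```

Attaching modification state and location to the *use* rather than the molecule is what lets the same protein appear in different interactions in different states.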
One interesting exploratory use of the PID data is the construction of interaction profiles
of phenotypes. An interaction profile is analogous to a gene expression profile. A gene
expression profile is a set of pairs, each pair consisting of a gene id and a value of “up” or
“down”. Similarly, an interaction profile is a set of pairs, each pair containing an
interaction id and a value of “on” or “off”. In most cases, the proteomics data needed to
construct a profile of signaling interactions is simply not available. However, one can use
gene expression data to specify an initial state for the set of signaling interactions in the
PID, and then apply a set of simple rules which interpret the cause/effect relation to infer
which interactions are active for a gene expression dataset. Using this approach, one can
infer that a given post-translationally modified form of a gene product is present in one sample but absent in another, even though the gene is equally expressed in both samples.
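The interpretation rules themselves are not spelled out in the synopsis. One plausible minimal rule (an interaction is "on" when every input and agent gene is "up" and no inhibitor gene is) can be sketched as:

```python
def infer_profile(interactions, expression):
    """Infer an interaction profile (id -> "on"/"off") from a gene
    expression profile (gene -> "up"/"down").

    interactions: dict id -> {"inputs": [...], "agents": [...],
                              "inhibitors": [...]}
    Hypothetical rule: "on" iff all inputs and agents are "up"
    and no inhibitor is "up".
    """
    profile = {}
    for iid, parts in interactions.items():
        required = parts.get("inputs", []) + parts.get("agents", [])
        active = (all(expression.get(g) == "up" for g in required)
                  and not any(expression.get(g) == "up"
                              for g in parts.get("inhibitors", [])))
        profile[iid] = "on" if active else "off"
    return profile
```

Chaining such a rule through the cause/effect relation is what would allow expression data to stand in for missing proteomics data, as the paragraph above describes.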
We have applied this method to data from 18 brain tumor (glioblastoma multiforme)
samples and 7 normal brain samples from NCI’s REMBRANDT project
(http://rembrandt.nci.nih.gov), computing, for each tumor sample, the set of interactions
that are active in cancer but not in normal. The interactions unique to a given tumor
sample typically aggregate into sets of connected graphs. The size of these graphs varies
from 3 interactions (the minimum size for this analysis) to 35 interactions in the largest
graph. While some of these graphs are unique to a single tumor sample (39 such graphs), other graphs are shared by several tumor samples; one graph is present in 12 of the 18 samples. Furthermore, in some cases one graph may include all the interactions in another graph. Using this relation of inclusion, we create a partial ordering of graphs. This ordering, in turn, implies an ordering of the tumor samples containing these graphs, which might reflect a progression in the activation of connected interactions that are not found to be active in normal samples.
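The grouping into connected graphs and the inclusion ordering can be sketched as follows. Here interactions are assumed to be connected when they share a molecule, which is one reasonable reading of "connected" but not necessarily the criterion used in the analysis.

```python
def connected_components(interactions):
    """Group interactions into connected graphs.

    interactions: dict id -> set of molecule names; two interactions
    are connected if their molecule sets overlap. Uses union-find.
    """
    ids = list(interactions)
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in ids:
        for j in ids:
            if i < j and interactions[i] & interactions[j]:
                parent[find(i)] = find(j)
    comps = {}
    for i in ids:
        comps.setdefault(find(i), set()).add(i)
    return [frozenset(c) for c in comps.values()]

def inclusion_order(graphs):
    """Partial order on graphs (as frozensets of interaction ids):
    pairs (a, b) where a's interactions are a strict subset of b's."""
    return [(a, b) for a in graphs for b in graphs if a < b]
```

Ordering samples by the inclusion relation among their active-in-tumor graphs is then a direct lookup over these pairs.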