Multistate Modeling and Simulation for Regulatory Networks
Zhen Liu, Clifford A. Shaffer, Umme Juka Mobassera, Layne T. Watson, and Yang Cao
Department of Computer Science
Program in Genetics, Bioinformatics, and Computational Biology
Virginia Tech
Goal: Modeling the Cell Cycle
(John Tyson)
[Figure: cell cycle diagram with phases G1, S (DNA replication), G2, and M (mitosis)]
Regulatory Network Modeling

Model using a series of chemical reactions.




The actors are proteins (“chemical species”)
whose interaction rates are modeled by rate
laws
Species are created, consumed, combined
Populations can rise and fall, under the
control of other species
Loops and cycles
Decomposition of Models

Modelers find it natural to divide into
“bundles” of reactions.
Multistate Phosphorylation Motif


Blocks relate to naturally occurring motifs
Example: antagonistic interaction between
Clb2 and Cdh1, with Cdc14 as the control
variable driving phosphorylation of Cdh1

Forms a bi-stable switch
Multistate Version

The reality is more complex, as a protein
can undergo multiple levels of
phosphorylation, which can affect the
behavior of the larger system
Multistate Modeling


Equations on chemical species with multiple
states, related in some meaningful way
Expressing as single-state equations would
require dozens of reactions.
JigCell Model Builder Support
Problems



Complications arise from the potential
combinatorial explosion of states in
complexes
Example: Two multistate species each
with 10 states could form complexes with
potentially 100 states.
A{i} + B{j} -> AB{i,j}
This presents challenges to simulation.
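To make the state count concrete, here is a minimal sketch that expands the rule A{i} + B{j} -> AB{i,j} into explicit single-state reactions. The function name `expand_binding_rule` and the 0-based state labels are assumptions made for illustration; they do not come from JigCell.

```python
from itertools import product

def expand_binding_rule(n_a, n_b):
    """Expand the rule A{i} + B{j} -> AB{i,j} into explicit reactions.

    States are labelled 0..n-1 here, which is an assumption of this sketch.
    """
    return [
        f"A{{{i}}} + B{{{j}}} -> AB{{{i},{j}}}"
        for i, j in product(range(n_a), range(n_b))
    ]

reactions = expand_binding_rule(10, 10)
print(len(reactions))   # one reaction (and one complex state) per (i, j) pair
print(reactions[0])
```

A single rule therefore stands in for as many reactions as there are (i, j) pairs, which is why a simulator that enumerates reactions has much more bookkeeping to do than one that works on rules.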
Stochastic Simulation

Reaction networks have often been modeled
using ODEs


Track concentrations of chemical species
ODE models cannot account for stochastic
effects



Small numbers for some species (RNA)
Variations in inputs => Differing outputs
Simulation ensemble => Distribution
Gillespie’s SSA (1)



N molecular species {S1, …, SN}.
M reaction channels {R1, …, RM}.
For reaction channel Rj:



Propensity function aj
State change vector vj = (v1,j, …, vN,j)
aj(x)dt gives probability that one Rj reaction
will occur in next infinitesimal time interval
given state vector x.
Gillespie’s SSA (2)



Select two random numbers r1 and r2, uniform on (0, 1)
Let a0(x) be the sum of all the reaction
propensities at state vector x.
Time for next reaction to occur is t + τ, where
τ = (1/a0(x)) ln(1/r1).
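One way to see where the time-step formula comes from, assuming (as in the standard SSA formulation) that the waiting time τ to the next reaction is exponentially distributed with rate a0(x) and that r1 is uniform on (0, 1): set the survival probability equal to r1 and solve for τ.

```latex
\begin{align*}
P(\tau > s) &= e^{-a_0(x)\,s}, \\
r_1 = e^{-a_0(x)\,\tau}
\;\Longrightarrow\;
\tau &= \frac{1}{a_0(x)} \ln\!\left(\frac{1}{r_1}\right).
\end{align*}
```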
Gillespie’s SSA (3)



Index j for next reaction is given by the
smallest integer satisfying
Σ_{l=1..j} al(x) > r2 a0(x).
System state updated after each reaction,
including populations and propensities
Observations:


A population-based simulation
SSA calculates propensities for reactions
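The three SSA slides can be sketched as one short loop. This is a minimal illustration of the direct method under stated assumptions, not the implementation used for the timings in this talk: the function name `ssa_direct`, its parameters, and the decay example (A -> nothing, rate constant 0.5) are all hypothetical.

```python
import math
import random
from itertools import accumulate

def ssa_direct(x0, propensities, state_changes, t_end, rng):
    """Direct-method SSA sketch.

    x0            initial population vector
    propensities  list of functions a_j(x), one per reaction channel
    state_changes list of state change vectors v_j
    """
    x = list(x0)
    t = 0.0
    trajectory = [(t, tuple(x))]
    while True:
        # Running sums of the propensities; the last entry is a0(x).
        cumulative = list(accumulate(a_j(x) for a_j in propensities))
        a0 = cumulative[-1]
        if a0 <= 0.0:
            break                      # no reaction can fire
        r1 = 1.0 - rng.random()        # in (0, 1], so 1/r1 is finite
        r2 = rng.random()              # in [0, 1)
        tau = math.log(1.0 / r1) / a0  # time to the next reaction
        if t + tau > t_end:
            break
        t += tau
        # Smallest j whose running sum exceeds r2 * a0(x).
        j = next(idx for idx, c in enumerate(cumulative) if c > r2 * a0)
        x = [x_i + v_i for x_i, v_i in zip(x, state_changes[j])]
        trajectory.append((t, tuple(x)))
    return trajectory

# Hypothetical example: A -> (nothing) with rate constant 0.5, 10 molecules.
trajectory = ssa_direct(
    x0=[10],
    propensities=[lambda x: 0.5 * x[0]],
    state_changes=[(-1,)],
    t_end=float("inf"),
    rng=random.Random(1),
)
print(trajectory[-1])
```

Every step recomputes all reaction propensities and scans them linearly, which is the cost the rule-based algorithms later in the talk try to reduce.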
Rule-Based Modeling





A rule defines how a molecular particle
reacts with other particles
A_{open,?,?} + B --k--> A_{B,?,?}
Subscripts describe the matching
configurations for binding sites
Convenient for representation
Updating propensities of rules faster(?)
than updating propensities of reactions
Network-Free Algorithm (1)



(Sneddon et al. 2008)
Alternative to turning rules into collections
of reactions and performing SSA.
Conceptually similar to SSA, but



Calculate propensities for rules.
Particle based (not population based)
Keep list of particles associated with each rule
Network-Free Algorithm (2)

Simulation loop:




Calculate propensity for each rule (cheaper
than SSA)
Calculate rule and time of next event
Select particles from associated list
Update the particle lists as necessary (major
expense)
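The simulation loop above might look roughly like the following for unimolecular rules on one multistate species. This is a simplified sketch, not the algorithm of Sneddon et al. as published: the `Rule` class, `build_lists`, `nfa_step`, and the two-rule phosphorylation example with its rate constants are assumptions made for illustration.

```python
import math
import random
from itertools import accumulate

class Rule:
    """Unimolecular rule with its list of currently matching particles."""
    def __init__(self, name, rate, matches, apply):
        self.name = name
        self.rate = rate
        self.matches = matches      # state -> bool
        self.apply = apply          # state -> new state
        self.particles = []         # ids of particles that match this rule

    def propensity(self):
        return self.rate * len(self.particles)

def build_lists(states, rules):
    """Fill each rule's particle list from scratch."""
    for rule in rules:
        rule.particles = [pid for pid, s in states.items() if rule.matches(s)]

def nfa_step(states, rules, t, rng):
    """Fire one event; return the new time, or None if nothing can fire."""
    cumulative = list(accumulate(rule.propensity() for rule in rules))
    a0 = cumulative[-1]
    if a0 <= 0.0:
        return None
    tau = math.log(1.0 / (1.0 - rng.random())) / a0
    threshold = rng.random() * a0
    rule = next(r for r, c in zip(rules, cumulative) if c > threshold)
    pid = rng.choice(rule.particles)          # select a particle from the list
    states[pid] = rule.apply(states[pid])
    # Update the particle lists: this bookkeeping is the major expense.
    for other in rules:
        listed = pid in other.particles
        wanted = other.matches(states[pid])
        if wanted and not listed:
            other.particles.append(pid)
        elif listed and not wanted:
            other.particles.remove(pid)
    return t + tau

# Hypothetical example: 20 particles, phosphorylation levels 0..2.
MAX_LEVEL = 2
rules = [
    Rule("phosphorylate", 1.0, lambda s: s < MAX_LEVEL, lambda s: s + 1),
    Rule("dephosphorylate", 0.5, lambda s: s > 0, lambda s: s - 1),
]
states = {pid: 0 for pid in range(20)}
build_lists(states, rules)
rng = random.Random(7)
t = 0.0
for _ in range(200):
    t = nfa_step(states, rules, t, rng)
print(t, sorted(states.values()))
```

Plain Python lists keep the sketch short; membership tests and removals are linear in the list length here, so a serious implementation would want a structure with constant-time removal.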
Population-Based NFA (PNFA)


(Our first contribution)
Modification to NFA: (go back to) using
populations for single-state species



Hybrid particle/population approach
Attempts to cut down on the size of the
lists associated with the rules
Can be viewed as an optimization to NFA

at worst degrades to NFA
Full-Scale SSA (FSSSA) (1)




(Our second contribution)
Use populations even for multi-state
species
Should work well unless there is a small
population spread across many states
Can view as more direct conversion of SSA
to rules (pure population-based approach)
Full-Scale SSA (FSSSA) (2)



For each species, store an array of
populations (one for each state)
Might be a sparse array
Store with each rule the population count
for all associated reactants
Full-Scale SSA (FSSSA) (3)

Simulation loop:




Calculate propensity for each rule (cheaper
than SSA)
Calculate rule and time of next event
Select a state for each reactant from the
population array
Update populations of affected species
(states) and population counts for associated
rules (might require modifying arrays)
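A population-based version of the same loop can be sketched with a dictionary standing in for the sparse population array. As before, this is an illustrative sketch under assumptions, not the authors' FSSSA code: `Rule`, `init_counts`, `pick`, `fsssa_step`, and the example rules and rates are hypothetical.

```python
import math
import random
from dataclasses import dataclass
from itertools import accumulate
from typing import Callable

@dataclass
class Rule:
    rate: float
    matches: Callable[[int], bool]   # which reactant states the rule accepts
    apply: Callable[[int], int]      # reactant state -> product state
    count: int = 0                   # cached population over matching states

def init_counts(pop, rules):
    for rule in rules:
        rule.count = sum(n for s, n in pop.items() if rule.matches(s))

def pick(weights, rng):
    """Index chosen with probability proportional to its weight."""
    cumulative = list(accumulate(weights))
    threshold = rng.random() * cumulative[-1]
    return next(i for i, c in enumerate(cumulative) if c > threshold)

def fsssa_step(pop, rules, t, rng):
    """pop maps state -> population and holds only non-zero entries."""
    props = [rule.rate * rule.count for rule in rules]
    a0 = sum(props)
    if a0 <= 0.0:
        return None
    tau = math.log(1.0 / (1.0 - rng.random())) / a0
    rule = rules[pick(props, rng)]
    # Select a reactant state, weighted by population, from the sparse array.
    candidates = [s for s in pop if rule.matches(s)]
    old = candidates[pick([pop[s] for s in candidates], rng)]
    new = rule.apply(old)
    # Update populations, dropping entries that reach zero.
    pop[old] -= 1
    if pop[old] == 0:
        del pop[old]
    pop[new] = pop.get(new, 0) + 1
    # Update the cached population counts of the affected rules.
    for other in rules:
        other.count += int(other.matches(new)) - int(other.matches(old))
    return t + tau

# Hypothetical example: 20 molecules, phosphorylation levels 0..2.
rules = [
    Rule(1.0, lambda s: s < 2, lambda s: s + 1),
    Rule(0.5, lambda s: s > 0, lambda s: s - 1),
]
pop = {0: 20}
init_counts(pop, rules)
rng = random.Random(7)
t = 0.0
for _ in range(200):
    t = fsssa_step(pop, rules, t, rng)
print(t, dict(sorted(pop.items())))
```

No per-molecule objects or lists are touched: an event changes two population entries and adjusts a few integer counts, which is consistent with the small System Update cost reported for FSSSA.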
Comparisons: Selection



SSA does linear search through reactions
NFA, PNFA do linear search through rules,
then select qualifying objects from
associated reactant lists
FSSSA does linear search through rules,
then only needs to search state lists
(populations)
Comparisons: Update




SSA updates populations of some
reaction’s reactants and products
NFA must create/destroy molecule
objects, and update associated rule lists
PNFA does the same, but does little work on single-state species populations
FSSSA updates sparse matrix info.
Bi-stable Switch Model

Reaction-based form:



12 species
44 reactions
Rule-based form:



1 single-state species, 1 multi-state
7 rules
Non-zero populations in each state
Simulation Times: Switch
Algorithm   Total CPU Time   Propensity Update   Reactant Selection   System Update   Other
SSA         115              72.0                30.6                 5.3             7.1
NFA         341              11.1                34.0                 286.0           9.9
PNFA        246              9.9                 26.2                 200.8           9.1
FSSSA       117              9.2                 32.4                 66.2            9.2
Cell Cycle Model

Reaction-based form:


58 species, 185 reactions
Rule-based form:




17 single-state species, 6 multi-state
64 rules
Half the states have zero population
Observation: Affecting one multi-state species
affects only a small fraction of all the rules
Simulation Times: Cell Cycle
Algorithm   Total CPU Time   Propensity Update   Reactant Selection   System Update   Other
SSA         171              143.3               23.5                 1.4             2.8
NFA         133              36.4                20.4                 72.5            3.7
PNFA        113              34.0                17.6                 58.6            2.8
FSSSA       64               32.8                18.2                 10.5            2.5
Simulation Quality (1)
Simulation Quality (2)


This graph shows distribution of
population for Clb2, one of the species in
the cell cycle model.
The significance is that it indicates that
each simulation algorithm gives
approximately the same ensemble of
outputs.
Complexity Analysis