A Formal and Integrated Framework to Simulate Evolution of Biological Pathways
Lorenzo Dematté, Corrado Priami, Alessandro Romanel and Orkun Soyer
CMSB 07, Edinburgh, 20/09/2007

Introduction
• Interest in using evolutionary approaches to study pathways
• Current approaches to evolution use ad-hoc tools and representations of pathway dynamics
• Currently available tools to model and simulate pathway dynamics do not allow for evolutionary simulations
• Outline:
  – BetaWB language
  – Evolutionary framework
  – A running example

BetaWB language
[Diagram: a bio-process P, made of an interface and an internal behaviour]
• Stochastic: each action enabled in the system has a stochastic rate
• Three types of rules:
  – Monomolecular: describe the evolution of single entities
  – Bimolecular: describe actions that involve two or more entities
  – Events: global rules of the environment

The BetaWB language
• Operational semantics: a set of syntax-driven rules that automatically infer the possible futures of the system
[Diagram: monomolecular rules (intra), bimolecular rules (inter), complexes (join) and events]

BetaWB extensions: deterministic events
  let Kinase : bproc = #(x:1,Alpha) [ @(2).nil ];
  let A : bproc = #(y:1,Gamma) [ @(2).nil ];
  when (A: step=1500) delete(500);
  when (A: step=2500) new(500);
  when (Kinase: time=3.5) new(2000);
  when (|A|<10) new(2000);
  when (|A|=1000) delete(200);
• Event conditions can be based on a fixed step, on cardinality, or on time
• Events model the “injection” (new) or “wash out” (delete) of substances

Evolution on Computers
• How do biological systems function, and why do they function the way they do? What happened?
• Understanding how pathways emerged during evolution can help us understand their basic properties:
  – Role of complexity
  – Importance of topology
  – Importance of feedback loops
Evolutionary Framework
• In silico evolution
  – A population of individuals
  – The behaviour of each individual
  – A measure of “success”
  – Reproduction based on success
[Diagram labels: a signalling or metabolic pathway; fitness function; replicate & mutate]

Compositional Model For Signalling Pathways
[Diagram labels: bio-process with Act and Inact states; internal pi-process]

Evolutionary algorithm

1) Simulation
• Each individual in the population is simulated separately using the BetaWB stochastic simulator
[Diagram: Mapk individuals passed to the simulator]

BetaWB Simulator
• Stochastic simulator
  – Variant of the next reaction method
• Species based on structural congruence

2) Fitness
• Fitness measures how good an individual was
• According to some criteria, it determines whether an individual was successful in its life
  – If its fitness value is higher, it has a higher probability to live and reproduce
• Measures can be:
  – Direct, on the pathway output (quantity of an entity)
  – Indirect, as a result of the pathway activity (food eaten, ability to move, etc.)
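
Sketch: fitness-based reproduction
As a rough illustration of how "higher fitness, higher probability to live and reproduce" can be realised, the Python sketch below uses fitness-proportional (roulette-wheel) sampling: fitness values are normalised so they sum to 1, each individual takes a slice of a 0-1 bar, and a random number picks a slice. The names normalise, pick and next_generation, the mutate callback and the p_mutate value are illustrative assumptions, not part of the BetaWB tools. Fitness values are assumed to be non-negative with a positive sum, and individuals are assumed to be immutable (or mutate is assumed to return a new object).

```python
import random
from bisect import bisect_right
from itertools import accumulate


def normalise(fitness):
    """Scale non-negative fitness values so that they sum to 1."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("fitness values must have a positive sum")
    return [f / total for f in fitness]


def pick(slices, r):
    """Return the index of the slice of the 0-1 bar that contains r."""
    edges = list(accumulate(slices))  # right edge of each slice
    # min() guards against r falling past the last edge through rounding
    return min(bisect_right(edges, r), len(slices) - 1)


def next_generation(population, fitness, mutate=None, p_mutate=0.1, rng=random):
    """Build a new population of the same size by repeated sampling."""
    slices = normalise(fitness)
    offspring = []
    while len(offspring) < len(population):
        # "Replicate": individuals are assumed immutable, so reuse is a copy
        child = population[pick(slices, rng.random())]
        # "Possibly mutate": p_mutate is an arbitrary illustrative value
        if mutate is not None and rng.random() < p_mutate:
            child = mutate(child)
        offspring.append(child)
    return offspring


# Usage: "A" owns half of the bar, "B" and "C" a quarter each.
population = ["A", "B", "C"]
new_population = next_generation(population, [2.0, 1.0, 1.0],
                                 rng=random.Random(0))
```

An individual with zero fitness gets an empty slice and is never selected, while the population size stays constant from one generation to the next.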
3) Selection and replication
• Based on fitness values (normalized sum)
• Each individual takes a slice of a 0–1 bar
• Generate a random number (e.g. 0.345)
• Take the individual whose slice contains it and replicate it
• Possibly, mutate it
• Repeat until we have a full population

4) Mutations
  a) Initial configuration
  b) Duplication of DNA strand
  c) DNA point mutation (domain structure changes)
  d) Domain duplication (changes internal behaviour)
• In the model:
  – Changing the affinity file
  – Duplication of a bio-process
  – Addition of binders, changing the internal pi-process

Mutations (2)
• Modification of the internal process: manipulation of the pi-process AST
• Only fixed transformations (they have to match a known structure)
• Must make biological sense (further constraints on what is “technically” possible)

An example: MAPK
The mitogen-activated protein kinase cascade
Basic structure well conserved
• Three kinases in the cascade
• Phosphorylation at two sites
• Relay signals from membrane
• Stimulus/response curve very steep

MAPK known facts
• Other signalling pathways use just one kinase
  – Why three?
  – Cascade arrangement and pathway dynamics
  – Ultrasensitivity
• Ultrasensitivity is important for biological function
  – Noise-filtering, switch-like circuit
  – Depends upon dual-collision double phosphorylation

How did MAPK evolve?
• How did we reach the three-kinase configuration?
  – Are other configurations possible?
• Which intermediate steps led to the final configuration?
• It is known that the high degree of ultrasensitivity also depends upon dual-collision double phosphorylation
  – How has this structure arisen?
  – Through which steps was it combined with the cascade configuration?
(future work)

MAPK experiment setup
• Signals, 2-level phosphatases, kinases
• Area below the curve: how “quick” the response is
• Area above the curve: “switch off” response
• Fitness: ratio(area1) - ratio(area2)
[Plot annotations: introduction of signal; switch off signal]

Our results
[Plots: fitness vs. generations (two panels); annotations: first kinase activation (signal turn on), first phosphatase added to pathway]

Have we obtained MAPK?
• Not really, but we had interesting “variations”

Possible explanations
• Case C):
  – We allowed self-phosphorylation
  – Response is quick (as quick as having 2 kinases)
  – But phosphatases can target only one protein (signal switch off is slower)
• Case B):
  – Only one phosphatase was introduced
  – Signal switch off is slower also in this case
[Figures: MAPK cascade model; typical curves for c) and b)]

Cluster computation

Conclusions and future work
• Designed a framework for studying evolution, both formal and practical
  – Mutations for bio-processes
  – Tools to automate the whole process
• Applied the framework to a biological example
  – A good test for our approach; we got interesting results
• Plans to extend it with more mutations, constraints, and control of the process
  – Also easier ways to write fitness functions
  – Use the extended framework to answer more questions on our MAPK example

Thank you! Any questions?