An Automated Translation from a Narrative Language for Biological Modelling into Process Algebra

Maria Luisa Guerriero (1), John K. Heath (2), Corrado Priami (1,3)

(1) ICT Dept., University of Trento, Italy
(2) School of Biosciences, University of Birmingham, UK
(3) The Microsoft Research-University of Trento Centre for Computational and Systems Biology, Italy

21/09/2007 – CMSB 2007, Edinburgh, UK
M.L. Guerriero

1 – Overview of the talk

2 – Introduction
3 – The proposed framework
4 – The narrative language
5 – Beta-binders
6 – Translation from the narrative language into Beta-binders
7 – An example: enzymatic catalysed reaction
8 – Conclusions
9 – Further work

2 – Introduction

☞ Tools for modelling, analysis and simulation (ODE, Petri nets, statecharts, process algebra, etc.)
☞ Standard exchange languages (SBML, CellML, etc.)
☞ Biological databases (Panther, KEGG, etc.)
☞ Desiderata of modelling languages:
   ➳ Intuitive for non computer scientists (i.e. graphical or similar to natural language)
   ➳ Unambiguous (i.e. computable)
   ➳ Translatable into standard languages (to exchange models)
   ➳ Compositional (to update models and build them incrementally)
   ➳ Insensitive to incomplete knowledge
3 – The proposed framework

☞ High level textual modelling language ⇒ Intuitive description of biological systems
   ➳ Narrative-style language: sequence of basic events
   ➳ Basic events are biochemical reactions
☞ Automatic translation into formal language ⇒ Simulation and analysis of models
   ➳ Formal language hidden from the modeler
   ➳ Process algebra (Beta-binders)

4 – The narrative language

☞ Models are composed of four sections:
   ➳ Compartments involved in the evolution of the system
   ➳ Components involved in the system
   ➳ Reactions occurring in the system
   ➳ Events sequence of the evolution of the system

☞ A compartment is described by:
   ➳ identifier
   ➳ name
   ➳ size
   ➳ number of spatial dimensions

   id  name          size (unit of measure)  dimensions
   1   exosol        9.95 · 10^-12 (l)       3
   2   cellMembrane  12.57 · 10^-8 (dm^2)    2
   3   cytosol       2.10 · 10^-12 (l)       3
   4   nucleus       0.25 · 10^-12 (l)       3

☞ A component is described by:
   ➳ name
   ➳ list of interaction sites (each site has a name and a state)
   ➳ list of states
   ➳ list of possible locations
   ➳ initial quantity/concentration (value and reliability)

   name   site   site state  site active  state  state active  compart  compart active  initial quant  reliability
   LIF                                    bound  false         1        true            3000           100%
   gp130  LIF    bound       false        dimer  false         2        true            1000           50%
          Y767   phospho     false
          Y814   phospho     false
   STAT3  Y705   phospho     false        dimer  false         3        true            5000           0%
          gp130  bound       false                             4        false

☞ A reaction is described by:
   ➳ identifier
   ➳ type
   ➳ rate (value and reliability)

   id  type              rate (unit of measure)  reliability
   1   binding           8 · 10^5 (M^-1 s^-1)    100%
   2   unbinding         6 · 10^-4 (s^-1)        100%
   3   dimerization      inf (M^-1 s^-1)         20%
   4   phosphorylation   0.2 (s^-1)              100%
   5   homodimerization  inf (s^-1)              20%
   6   relocation        10 (min)                50%
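The first three model sections can be pictured as plain records. The following is a minimal sketch, assuming Python dataclasses as the host representation; the class and field names (`Compartment`, `Site`, `Component`, `Reaction`, `initial_location`) are illustrative choices and are not part of the narrative language or of its translator.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Compartment:
    id: int
    name: str
    size: Optional[float] = None
    unit: Optional[str] = None
    dimensions: Optional[int] = None


@dataclass
class Site:
    name: str
    state: str      # e.g. "phospho" or "bound"
    active: bool    # is the site initially in that state?


@dataclass
class Component:
    name: str
    sites: List[Site] = field(default_factory=list)
    states: Dict[str, bool] = field(default_factory=dict)     # state name -> initially active?
    locations: Dict[int, bool] = field(default_factory=dict)  # compartment id -> initially there?
    quantity: float = 0
    reliability: Optional[int] = None  # percent


@dataclass
class Reaction:
    id: int
    type: str
    rate: float                        # float("inf") for immediate reactions
    unit: Optional[str] = None
    reliability: Optional[int] = None  # percent


def initial_location(component: Component) -> int:
    """Return the id of the compartment the component starts in."""
    return next(cid for cid, here in component.locations.items() if here)


# Values taken from the example tables on the slides.
cytosol = Compartment(3, "cytosol", 2.10e-12, "l", 3)
stat3 = Component(
    "STAT3",
    sites=[Site("Y705", "phospho", False), Site("gp130", "bound", False)],
    states={"dimer": False},
    locations={3: True, 4: False},
    quantity=5000,
    reliability=0,
)
phosphorylation = Reaction(4, "phosphorylation", 0.2, "s^-1", 100)
```

Keeping reliability as an optional field mirrors the slides' point that the language should tolerate incomplete knowledge.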
☞ The evolution of the system is described by a narrative of events, possibly grouped into processes
☞ An event is a textual description of a reaction, described by:
   ➳ identifier
   ➳ textual semiformal description
   ➳ identifier of the reaction
☞ The textual semiformal description can be prefixed by an optional condition

      if condition then event description

   ➳ The condition can be on the state of a component (or site) or on its position (component.site is state, component is in compartment, etc.); negative and multiple conditions can be specified
   ➳ The event description points out the occurring reaction and the involved component(s) (component reaction or component1 reaction component2, where reaction can be phosphorylates, binds, activates, etc.); the interaction/modification site can be specified

☞ Events can be concurrent (e.g. involving different proteins), sequential (e.g. phosphorylation allowed after protein is bound), or alternative (e.g. competitive ligand/receptor binding); identifiers of alternative events can be specified, and conditions can be used to enforce ordering of sequential events

   id  description                                                                                      react  alternative
   LIF-gp130 binding
   1   if gp130.LIF is not bound and LIF is not bound then LIF binds gp130 on LIF                       1
   2   if gp130.LIF is bound and LIF is bound then LIF unbinds gp130 on LIF                             2
   LIF pathway
   3   if LIFR.LIF is bound then LIFR dimerises with gp130                                              3
   4   if gp130 is dimer then gp130 phosphorylates on Y767;Y814                                         4      2
   STAT3 pathway
   5   if gp130.Y767 is phospho and STAT3 is in 3 then gp130 binds STAT3 on gp130                       1
   6   if STAT3.gp130 is bound then STAT3 phosphorylates on Y705                                        4
   7   if STAT3.gp130 is bound and STAT3.Y705 is phospho then gp130 unbinds STAT3 on gp130              2
   8   if STAT3.Y705 is phospho and STAT3.gp130 is not bound then STAT3 homodimerises                   5
   9   if STAT3 is in 3 and STAT3 is dimer then STAT3 relocates to 4                                    6
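To illustrate the "if condition then event description" form, here is a small sketch that splits a description into its conditions and its event text. It is an assumption-laden toy, not the authors' parser: the names `Condition`, `split_description` and `parse_condition` are hypothetical, and only single-subject conditions joined by "and" are handled.

```python
import re
from typing import List, NamedTuple, Tuple


class Condition(NamedTuple):
    subject: str   # "component" or "component.site"
    negated: bool  # True for "is not ..."
    kind: str      # "state" or "location"
    value: str     # state name, or compartment id for locations


def parse_condition(text: str) -> Condition:
    """Parse 'X is S', 'X is not S', 'X is in N' or 'X is not in N'."""
    m = re.fullmatch(r"(\S+) is (not )?(in )?(\S+)", text.strip())
    if m is None:
        raise ValueError(f"unrecognised condition: {text!r}")
    subject, neg, loc, value = m.groups()
    return Condition(subject, neg is not None, "location" if loc else "state", value)


def split_description(text: str) -> Tuple[List[Condition], str]:
    """Split 'if C1 and C2 then EVENT' into ([C1, C2], 'EVENT')."""
    m = re.fullmatch(r"if (.+?) then (.+)", text.strip())
    if m is None:
        return [], text.strip()  # unconditional event
    conds = [parse_condition(c) for c in m.group(1).split(" and ")]
    return conds, m.group(2)


conds, event = split_description(
    "if gp130.Y767 is phospho and STAT3 is in 3 then gp130 binds STAT3 on gp130"
)
```

For the description of event 5, `conds` holds one state condition on `gp130.Y767` and one location condition on `STAT3`, and `event` is the remaining text `gp130 binds STAT3 on gp130`.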
5 – Beta-binders

☞ Bio-inspired language of concurrent interacting processes with typed interaction sites

   [Figure: a bio-process. Body (internal behaviour): P. Interface (interactions): typed binders x1 : ∆1, x2^h : ∆2, x3 : ∆3]

☞ System: parallel composition of interacting bio-processes
☞ Operations:
   ➳ Input/output communications (intra / inter)
   ➳ Operations on interfaces (expose / hide / unhide / chtype)
   ➳ Merging/splitting of processes (bind / unbind)
   ➳ Sequential/parallel/choice composition of processes
   ➳ Conditional expressions (on binders type/visibility)

6 – Translation from the narrative language into Beta-binders

☞ Components ⇒ bio-processes
   ➳ Component interaction site ⇒ beta binder (active/inactive state ⇒ type)
   ➳ Component state ⇒ beta binder (active/inactive state ⇒ type)
   ➳ Component location ⇒ beta binder (current location ⇒ type)
☞ Events ⇒ pi-processes placed in parallel composition
   ➳ Monomolecular event ⇒ one pi-process
   ➳ Bimolecular event ⇒ pair of interacting pi-processes
   ➳ Site/state/location modification ⇒ chtype action
   ➳ Reaction rate ⇒ type affinity or action rate
   ➳ Condition/alternative event ⇒ condition on binder type
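The idea of a bio-process whose interface is a set of typed binders, changed by expose / hide / unhide / chtype, can be pictured with a toy state container. This is only a sketch under strong assumptions: it ignores rates, affinities, communication and the internal pi-process, and the classes `Binder` and `BioProcess` are hypothetical names, not BetaWB's API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Binder:
    subject: str
    type: str
    hidden: bool = False


@dataclass
class BioProcess:
    name: str
    binders: Dict[str, Binder] = field(default_factory=dict)

    def expose(self, subject: str, type_: str) -> None:
        """Add a new visible binder to the interface."""
        if subject in self.binders:
            raise ValueError(f"binder {subject!r} already exists")
        self.binders[subject] = Binder(subject, type_)

    def hide(self, subject: str) -> None:
        self.binders[subject].hidden = True

    def unhide(self, subject: str) -> None:
        self.binders[subject].hidden = False

    def chtype(self, subject: str, new_type: str) -> None:
        """Change the type of an existing binder."""
        self.binders[subject].type = new_type

    def check(self, subject: str, type_: str) -> bool:
        """Condition on binder type, as used to guard pi-processes."""
        b = self.binders.get(subject)
        return b is not None and b.type == type_


# A component location is encoded as the type of a "loc" binder,
# so relocation becomes a chtype on that binder.
stat3 = BioProcess("STAT3")
stat3.expose("loc", "cytosol_3")
stat3.chtype("loc", "nucleus_4")
```

The point of the encoding is that every site, state and location change in the narrative reduces to the single `chtype` operation, and every narrative condition reduces to a `check` on a binder type.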
☞ Empty bio-processes are added
☞ Beta binders for compartments are added (subject = “loc”, type = initial compartment name concatenated with its identifier)

      STAT3 located in compartment 3 (cytosol) ⇒ β(loc : cytosol_3) added to STAT3 bio-process

☞ Beta binders for states are added (subject = state name, type = state name concatenated with component name)

      Gp130 can be a dimer ⇒ β(dimer : monomer_gp130) added to gp130 bio-process

☞ Beta binders for sites are added (subject = site name concatenated with state name, type = state name concatenated with component name and site name)

      Site Y705 of receptor STAT3 can be phosphorylated ⇒ β(Y705_pho : depho_STAT3_Y705) added to STAT3 bio-process

☞ Pi-process for monomolecular event = sequence of chtype actions, possibly prefixed by conditions

      Event 4 is phosphorylation of sites Y767 and Y814 on gp130 (phosphorylations can occur if gp130 is dimer). Pi-process added to gp130:

      Pho_4 = if (dimer, dimer_gp130) then
                 chtype(0.2, Y767_pho, pho_gp130_Y767) . chtype(inf, Y814_pho, pho_gp130_Y814) . Pho_4

☞ Pi-processes for bimolecular event = communication action, followed by a sequence of chtype actions, possibly prefixed by conditions

      Event 6 is binding of gp130 to STAT3 (binding can occur if site Y767 on gp130 is phosphorylated and STAT3 is in cytosol). Pi-processes added to gp130 and STAT3:

      Bind_6A = if (Y767_pho, pho_gp130_Y767) then gp130_binding_6⟨⟩ . Bind_6A
      Bind_6B = if (loc, cytosol_3) then gp130_bou() . chtype(inf, gp130_bou, bou_STAT3_gp130) . Bind_6B

☞ Concurrent reactions = pi-processes placed in parallel composition
☞ Sequential reactions = the second one is prefixed by a condition on binder modified by the first one
☞ Mutually exclusive reactions = both prefixed by condition on binder modified at the end of the other one
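The shape of a monomolecular pi-process (an optional condition, a sequence of chtype actions, then a recursive call) can be sketched as simple string assembly. This is an illustration only: the helper names `chtype` and `monomolecular` are hypothetical, and the underscore-joined identifiers are an assumption about how the names on the slides were written.

```python
from typing import List, Optional, Tuple


def chtype(rate, binder: str, new_type: str) -> str:
    """Render one chtype action as text."""
    return f"chtype({rate}, {binder}, {new_type})"


def monomolecular(name: str, actions: List[str],
                  condition: Optional[Tuple[str, str]] = None) -> str:
    """Render 'name = [if (binder, type) then] a1 . a2 . ... . name'."""
    body = " . ".join(actions + [name])  # trailing name: the process restarts
    if condition is not None:
        binder, type_ = condition
        body = f"if ({binder}, {type_}) then {body}"
    return f"{name} = {body}"


# Event 4: phosphorylation of Y767 and Y814 on gp130, allowed if gp130 is a dimer.
pho_4 = monomolecular(
    "Pho_4",
    [chtype(0.2, "Y767_pho", "pho_gp130_Y767"),
     chtype("inf", "Y814_pho", "pho_gp130_Y814")],
    condition=("dimer", "dimer_gp130"),
)
print(pho_4)
```

Only the first chtype carries the reaction rate 0.2; the second is immediate (inf), so the two site changes behave as a single timed step, as in the slide's Pho_4.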
7 – An example: enzymatic catalysed reaction

E + S <--> ES --> EP --> E + P

   id  name     size  dimensions
   1   cytosol  1.0   3

   name  descr              site     site state  site act  state  state act  compart  compart act  initial quant  reliab
   E     enzyme                                            bound  false      1        true         100            100%
   S     substrate/product  product  active      false     bound  false      1        true         1000           100%

   id  type        rate (unit of measure)  reliability
   1   binding     1 (M^-1 s^-1)           50%
   2   unbinding   1 (s^-1)                50%
   3   activation  1 (M^-1 s^-1)           50%
   4   unbinding   1 (s^-1)                50%

   id  description                                                                                   reaction  alternative
   E-S binding/unbinding
   1   if E is not bound and S is not bound and S.product is not active then E binds S               1
   2   if E is bound and S is bound then E unbinds S                                                 2
   Product formation
   3   if E is bound and S is bound and S.product is not active then E activates S on product        3         2
   4   if E is bound and S is bound and S.product is active then E unbinds S                         4

   [steps = 100000]

   << BASERATE:inf, s0:inf, s1:inf, s2:inf, s3:inf, s4:inf >>

   let Bind_1A : pproc = if (not(bou_E,Bound_E)) then bou_S_1<binding>.ch(inf,bou_E,Bound_E)...
   let Unbind_2A : pproc = if (bou_E,Bound_E) then bou_S_2<unbinding>.ch(inf,bou_E,Unbound_E)...
   let Act_3A : pproc = if (bou_E,Bound_E) then act_S_3<activation>.s3<s0>.nil endif;
   let Unbind_4A : pproc = if (bou_E,Bound_E) then bou_S_4<unbinding>.ch(inf,bou_E,Unbound_E)...
   let P_E : pproc = !s1{}.Bind_1A | !s2{}.Unbind_2A | !s3{}.Act_3A | !s4{}.Unbind_4A;

   let E : bproc = #(bou_S_4:0,Unbinding_S_4), #(act_S_3:0,Activation_S_3), #(bou_S_2:0,Unbinding_S_2),
                   #(bou_S_1:0,Binding_S_1), #(bou_E:0,Unbound_E), #(loc_E:0,Cyto_1)
                   [ Bind_1A | Unbind_2A | Act_3A | Unbind_4A | P_E ];

   let Bind_1B : pproc = if ((not(bou_S,Bound_S)) and (not(product_act_S,Active_S_product)))...
   let Unbind_2B : pproc = if (not(product_act_S,Active_S_product)) then if (bou_S,Bound_S)...
   let Act_3B : pproc = if (not(bou_S,Unbound_S)) then if ((bou_S,Bound_S) and (not...
   let Unbind_4B : pproc = if ((bou_S,Bound_S) and (product_act_S,Active_S_product)) then.
   let P_S : pproc = !s1{}.Bind_1B | !s2{}.Unbind_2B | !s3{}.Act_3B | !s4{}.Unbind_4B;

   let S : bproc = #(product_act_S:0,Inactive_S_product), #(bou_S:0,Unbound_S), #(loc_S:0,Cyto_1)
                   [ Bind_1B | Unbind_2B | Act_3B | Unbind_4B | P_S ];

   run 100 E || 1000 S

☞ Beta-binders model was simulated with BetaWB (a)

   (a) See http://www.cosbi.eu/Rpty_Soft_BetaWB.php

8 – Conclusions

☞ Positive feedback from biologists!
☞ The proposed language is simple, yet unambiguous and automatically translatable into formal languages
☞ Modelers can describe biological systems in simple words, simulate models, and obtain results, without the need to know the target formal language
☞ Main differences with SBML:
   ➳ System evolution is given in terms of a narrative of events
   ➳ It is easy to represent details about components and reactions
   ➳ Components can have multiple states
   ➳ Reliability values to represent uncertainty of biological knowledge

9 – Further work

☞ Additional primitives and features can be added to the language, and new features of the target language can be exploited ⇒ compactness, efficiency, representation of other types of events
☞ The translator can be extended to other languages ⇒ SBML for exchange of models, other formal languages to use and compare different tools for simulation and analysis
☞ Integration with biological databases ⇒ automatic extraction of data and parameters

Thank you!
⟨model⟩            ::= ⟨comparts_decl⟩⟨compons_decl⟩⟨reacts_decl⟩⟨procs_decl⟩
⟨comparts_decl⟩    ::= Compartments ⟨comparts_list⟩
⟨compons_decl⟩     ::= Components ⟨compons_list⟩
⟨reacts_decl⟩      ::= Reactions ⟨reacts_list⟩
⟨procs_decl⟩       ::= Narrative ⟨procs_list⟩

⟨comparts_list⟩    ::= ⟨compartment⟩
                     | ⟨compartment⟩⟨comparts_list⟩
⟨compons_list⟩     ::= ⟨component⟩
                     | ⟨component⟩⟨compons_list⟩
⟨reacts_list⟩      ::= ⟨reaction⟩
                     | ⟨reaction⟩⟨reacts_list⟩
⟨procs_list⟩       ::= ⟨proc⟩
                     | ⟨proc⟩⟨procs_list⟩

⟨compartment⟩      ::= (⟨id⟩, ⟨compart_name⟩, ⟨opt_size⟩, ⟨opt_unit⟩, ⟨opt_dim⟩)
⟨component⟩        ::= (⟨name⟩, ⟨opt_inform_descr⟩, ⟨opt_sites_def⟩, ⟨opt_states_def⟩, ⟨opt_comparts_def⟩, ⟨initial_quantity⟩)
⟨reaction⟩         ::= (⟨id⟩, ⟨react_type⟩, ⟨rate_const⟩)

⟨proc⟩             ::= Process ⟨opt_inform_descr⟩⟨events_list⟩
⟨events_list⟩      ::= ⟨event⟩
                     | ⟨event⟩⟨events_list⟩
⟨event⟩            ::= (⟨id⟩, ⟨form_descr⟩, ⟨react_id⟩, ⟨opt_altern_event⟩)

⟨opt_sites_def⟩    ::= ε
                     | ⟨sites_def⟩
⟨sites_def⟩        ::= ⟨site_def⟩
                     | ⟨site_def⟩; ⟨sites_def⟩
⟨site_def⟩         ::= ⟨name⟩ : ⟨state_name⟩ : ⟨is_active⟩

⟨opt_states_def⟩   ::= ε
                     | ⟨states_def⟩
⟨states_def⟩       ::= ⟨state_def⟩
                     | ⟨state_def⟩; ⟨states_def⟩
⟨state_def⟩        ::= ⟨state_name⟩ : ⟨is_active⟩

⟨opt_comparts_def⟩ ::= ε
                     | ⟨comparts_def⟩
⟨comparts_def⟩     ::= ⟨compart_def⟩
                     | ⟨compart_def⟩; ⟨comparts_def⟩
⟨compart_def⟩      ::= ⟨id⟩ : ⟨is_active⟩

⟨initial_quantity⟩ ::= (⟨quantity⟩, ⟨opt_reliability⟩)
⟨rate_const⟩       ::= (⟨rate⟩, ⟨opt_unit⟩, ⟨opt_reliability⟩)

⟨form_descr⟩       ::= ⟨event_descr⟩
                     | if ⟨conds⟩ then ⟨event_descr⟩
⟨conds⟩            ::= ⟨cond⟩
                     | ⟨cond⟩ and ⟨conds⟩
⟨cond⟩             ::= ⟨names⟩ is ⟨state_name⟩
                     | ⟨names⟩ is not ⟨state_name⟩
                     | ⟨names⟩ is in ⟨id⟩
                     | ⟨names⟩ is not in ⟨id⟩
⟨names⟩            ::= ⟨name⟩
                     | ⟨name⟩.⟨name⟩
                     | ⟨name⟩; ⟨names⟩
                     | ⟨name⟩.⟨name⟩; ⟨names⟩
⟨sites⟩            ::= ⟨name⟩
                     | ⟨name⟩; ⟨sites⟩
⟨event_descr⟩      ::= ⟨complex_name⟩⟨bimol_react⟩⟨complex_name⟩ on ⟨sites⟩
                     | ⟨complex_name⟩⟨bimol_react⟩⟨complex_name⟩
                     | ⟨complex_name⟩⟨monomol_react⟩ on ⟨sites⟩
                     | ⟨complex_name⟩⟨monomol_react⟩
                     | ⟨complex_name⟩ relocates to ⟨id⟩
                     | ⟨complex_name⟩ degrades
                     | ⟨complex_name⟩ degrades ⟨complex_name⟩
                     | ⟨complex_name⟩ synthesises ⟨complex_name⟩
                     | ⟨complex_name⟩ homodimerises
                     | ⟨complex_name⟩ dehomodimerises
                     | ⟨complex_name⟩ dimerises with ⟨complex_name⟩
                     | ⟨complex_name⟩ dedimerises from ⟨complex_name⟩
⟨complex_name⟩     ::= ⟨name⟩
                     | ⟨name⟩ : ⟨complex_name⟩
⟨id⟩               ::= Int

⟨opt_size⟩         ::= ε | Int
⟨opt_unit⟩         ::= ε | Str
⟨opt_dim⟩          ::= ε | Int
⟨name⟩             ::= Ide
⟨opt_inform_descr⟩ ::= ε | Str
⟨quantity⟩         ::= Int | Real
⟨opt_reliability⟩  ::= ε | Int
⟨rate⟩             ::= Int | Real | inf
⟨react_id⟩         ::= Int
⟨opt_altern_event⟩ ::= ε | alternative to ⟨id⟩
⟨is_active⟩        ::= Bool

⟨compart_name⟩     ::= nucleus | cytosol | exosol
                     | cellMembrane | nucleusMembrane | Ide
⟨react_type⟩       ::= phosphorylation | dephosphorylation
                     | binding | unbinding
                     | homodimerization | dehomodimerization
                     | dimerization | dedimerization
                     | activation | deactivation
                     | hydrolysis | dehydrolysis
                     | degradation | synthesis | relocation
⟨state_name⟩       ::= phosphorylated | bound | active | hydrolysed | dimer
⟨bimol_react⟩      ::= phosphorylates | dephosphorylates | binds | unbinds
                     | activates | deactivates | hydrolyses | dehydrolyses
⟨monomol_react⟩    ::= phosphorylates | dephosphorylates | hydrolyses | dehydrolyses
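As an illustration of how the event-description productions separate monomolecular from bimolecular events, the sketch below recognises only the first four alternatives (a reaction verb with an optional "on sites" suffix). It assumes simple one-word component names, so complex names built with ":" are out of scope, and `parse_event` is a hypothetical helper rather than part of the translator.

```python
# Verb sets copied from the bimolecular and monomolecular reaction productions.
BIMOL = {"phosphorylates", "dephosphorylates", "binds", "unbinds",
         "activates", "deactivates", "hydrolyses", "dehydrolyses"}
MONOMOL = {"phosphorylates", "dephosphorylates", "hydrolyses", "dehydrolyses"}


def parse_event(text: str) -> dict:
    """Classify 'A verb B [on s1;s2]' or 'A verb [on s1;s2]'."""
    head, _, sites = text.strip().partition(" on ")
    site_list = [s.strip() for s in sites.split(";")] if sites else []
    words = head.split()
    if len(words) == 3 and words[1] in BIMOL:
        return {"kind": "bimolecular", "verb": words[1],
                "components": [words[0], words[2]], "sites": site_list}
    if len(words) == 2 and words[1] in MONOMOL:
        return {"kind": "monomolecular", "verb": words[1],
                "components": [words[0]], "sites": site_list}
    raise ValueError(f"unsupported event description: {text!r}")


mono = parse_event("gp130 phosphorylates on Y767;Y814")
bi = parse_event("LIF binds gp130 on LIF")
```

The same verb (for example phosphorylates) appears in both sets, so it is the number of participants, not the verb, that decides whether one pi-process or a pair of interacting pi-processes is needed.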
☞ Bio-processes are initially empty, and then beta binders are added

      for all component ∈ Components do
          component.bioprocess ← new empty bio-process
          CompartsToBetaBinders(component)
          StatesToBetaBinders(component)
          SitesToBetaBinders(component)
      end for

☞ Type of a beta binder representing a compartment = compartment name concatenated with compartment identifier

      STAT3 located in compartment 3 (cytosol) ⇒ β(loc : cytosol_3) added to STAT3 bio-process

      binder ← new beta binder in component.bioprocess
      binder.name ← “loc”
      if component is in compartment at initial state then
          binder.type ← compartment.name ^ compartment.id
      end if

☞ Type of a binder representing a state = state name concatenated with component name

      Gp130 can be a dimer ⇒ β(dimer : monomer_gp130) added to gp130 bio-process

      for all state ∈ component.states do
          binder ← new beta binder in component.bioprocess
          binder.name ← state.name
          if component is in state at initial state then
              binder.type ← state.name ^ component.name
          else
              binder.type ← state.opposite_name ^ component.name
          end if
      end for

☞ Type of a binder representing a site = state name concatenated with component name and site name

      Site Y705 of receptor STAT3 can be phosphorylated ⇒ β(Y705_pho : depho_STAT3_Y705) added to STAT3 bio-process

      for all site ∈ component.sites do
          binder ← new beta binder in component.bioprocess
          binder.name ← site.name ^ site.state.name
          if site is in state at initial state then
              binder.type ← site.state.name ^ component.name ^ site.name
          else
              binder.type ← site.state.opposite_name ^ component.name ^ site.name
          end if
      end for
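The three binder-naming rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions: concatenation is rendered with an underscore, the `OPPOSITE` table contains only the two state/opposite pairs that appear in the slide examples, and `component_binders` is a hypothetical function name.

```python
from typing import List, Tuple

# Opposite state names seen in the slide examples (pho/depho, dimer/monomer).
OPPOSITE = {"pho": "depho", "dimer": "monomer"}


def join(*parts) -> str:
    """Concatenate name parts; the '_' separator is an assumption."""
    return "_".join(str(p) for p in parts)


def component_binders(name: str,
                      sites: List[Tuple[str, str, bool]],
                      states: List[Tuple[str, bool]],
                      location: Tuple[str, int]) -> List[Tuple[str, str]]:
    """Return (subject, type) pairs for one component.

    sites:    (site name, state name, initially active?)
    states:   (state name, initially active?)
    location: (initial compartment name, compartment id)
    """
    binders = [("loc", join(*location))]
    for state, active in states:
        prefix = state if active else OPPOSITE[state]
        binders.append((state, join(prefix, name)))
    for site, state, active in sites:
        prefix = state if active else OPPOSITE[state]
        binders.append((join(site, state), join(prefix, name, site)))
    return binders


stat3 = component_binders(
    "STAT3",
    sites=[("Y705", "pho", False)],
    states=[("dimer", False)],
    location=("cytosol", 3),
)
```

For STAT3 this yields the binders `loc : cytosol_3`, `dimer : monomer_STAT3` and `Y705_pho : depho_STAT3_Y705`, matching the form of the examples given with each rule.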
☞ Pi-process representing a monomolecular event = sequence of chtype actions, possibly prefixed by conditions; reaction rates are based on input definitions

      Event 4 is phosphorylation of sites Y767 and Y814 on gp130 (phosphorylations can occur if gp130 is dimer). The pi-process added to gp130 is a sequence of two chtype actions, prefixed by a condition:

      Pho_4 = if (dimer, dimer_gp130) then
                 chtype(0.2, Y767_pho, pho_gp130_Y767) . chtype(inf, Y814_pho, pho_gp130_Y814) . Pho_4

      if event.reaction is relocation then
          piproc ← chtype(“loc”, new_location_type)
      else if event.reaction is phosphorylation then
          piproc ← chtype(binder_site_pho, site_phosphorylated_type)
      else if event.reaction is dephosphorylation then
          piproc ← chtype(binder_site_pho, site_dephosphorylated_type)
      else if ... then
      end if
      if event.condition is specified then
          piproc ← if (binder_cond, cond_state_type) then piproc
      end if

☞ Pi-processes representing a bimolecular event = communication action, followed by a sequence of chtype actions, possibly prefixed by conditions; type affinities are based on input definitions

      Event 6 is binding of gp130 to STAT3 (binding can occur if site Y767 on gp130 is phosphorylated and STAT3 is in cytosol). The pi-process added to gp130 is an output action, prefixed by a condition:

      Bind_6A = if (Y767_pho, pho_gp130_Y767) then gp130_binding_6⟨⟩ . Bind_6A

      The pi-process added to STAT3 is a chtype action, prefixed by an input action and by a condition:

      Bind_6B = if (loc, cytosol_3) then gp130_bou() . chtype(inf, gp130_bou, bou_STAT3_gp130) . Bind_6B

      if event.reaction is phosphorylation then
          piproc1 ← binder_event_id⟨⟩
          piproc2 ← binder_site_pho().chtype(binder_site_pho, site_phosphorylated_type)
          AddAffinity(binder_event_id_type, site_dephosphorylated_type)
      else if ... then
      end if
      if event.condition is specified on event.component1 then
          piproc1 ← if (binder_cond, cond_state_type) then piproc1
      end if
      if event.condition is specified on event.component2 then
          piproc2 ← if (binder_cond, cond_state_type) then piproc2
      end if

☞ Concurrent reactions = pi-processes placed in parallel composition. Sequential reactions = the second one is prefixed by a condition on the binder modified by the first one. Mutually exclusive reactions = both are prefixed by a condition on the binder modified at the end of the other one

      for all ev ∈ Events do
          if ev is monomolecular then
              pproc ← EventToPiProcess(ev)
              if ev is alternative to prev_ev then
                  pproc ← if not(prev_ev.mod_binder, prev_ev.mod_type) then pproc
                  prev_ev.pproc ← if not(ev.mod_binder, ev.mod_type) then prev_ev.pproc
              end if
              AddPiprocessInParallel(ev.component.bioproc, pproc)
          else if ev is bimolecular then
              ⟨pproc1, pproc2⟩ ← EventToPiProcesses(ev)
              if ev is alternative to prev_ev then
                  if both ev and prev_ev involve ev.component1 then
                      pproc1 ← if not(prev_ev.mod_binder, prev_ev.mod_type) then pproc1
                      prev_ev.pproc ← if not(ev.mod_binder, ev.mod_type) then prev_ev.pproc
                  end if
                  if both ev and prev_ev involve ev.component2 then
                      ...
                  end if
              end if
              AddPiprocessInParallel(ev.component1.bioproc, pproc1)
              AddPiprocessInParallel(ev.component2.bioproc, pproc2)
          end if
      end for
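The mutual-exclusion step of the last algorithm can be sketched for pi-processes that live in a single bio-process. The code below is a simplified, string-based illustration: `guard_not` and `compose` are hypothetical names, each event is a plain dict, and the example data are loosely modelled on the substrate-side processes of the enzymatic example rather than produced by the real translator.

```python
from typing import Dict, List


def guard_not(binder: str, type_: str, body: str) -> str:
    """Prefix a process with a negative condition on a binder type."""
    return f"if not({binder}, {type_}) then {body}"


def compose(events: List[dict]) -> str:
    """Guard alternative events against each other, then compose in parallel.

    Each event has: id, body (process text), mod (binder, type) describing
    the binder change it performs, and an optional 'alternative' event id.
    """
    by_id: Dict[int, dict] = {e["id"]: e for e in events}
    procs: Dict[int, str] = {e["id"]: e["body"] for e in events}
    for ev in events:
        alt = ev.get("alternative")
        if alt is not None:
            prev = by_id[alt]
            # ev may fire only if prev has not yet made its modification...
            procs[ev["id"]] = guard_not(*prev["mod"], procs[ev["id"]])
            # ...and prev only if ev has not yet made its own.
            procs[alt] = guard_not(*ev["mod"], procs[alt])
    return " | ".join(procs[e["id"]] for e in events)


events = [
    {"id": 2, "body": "Unbind_2B", "mod": ("bou_S", "Unbound_S")},
    {"id": 3, "body": "Act_3B", "mod": ("product_act_S", "Active_S_product"),
     "alternative": 2},
]
system = compose(events)
print(system)
```

Because each guard tests the binder that the other event modifies last, whichever of the two alternatives completes first disables the other, while events with no declared alternative remain freely concurrent.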