On the Analysis of Numerical Data Time Series in Temporal Logic
François Fages, Aurélien Rizk
Equipe Projet Contraintes, INRIA Paris-Rocquencourt
CMSB 2007

Introduction

Logical paradigm for systems biology [Fages, SAS/LOPSTR 2005]:
- Biological model = Transition system
- Biological property = Temporal logic formula
- Biological validation = Model-checking

Implementation in the Biochemical Abstract Machine (BIOCHAM) modeling environment, available at http://contraintes.inria.fr/BIOCHAM

Our goal is to extract from experimental data the relevant biological properties, formalized in LTL.

Introduction (continued)

Biocham contains:
- a rule-based language to write models of biochemical networks;
- a temporal logic language to formalize experimental knowledge;
- simulation, model-checking and inference tools.
[Diagram: Experimental Data, Temporal Properties, Simulated Data and Biological Model, connected by Analysis, Inference, Simulation and Validation.]

→ The objective is to design a time series analyzer that computes, from experimental data, a set of biological properties in temporal logic.

Plan

1 Temporal logic with constraints
2 LTL constraint solving
3 LTL patterns
4 Inference of biological properties
5 Future work

Kripke structure

[Figure: concentration [A] (axis 0 to 3) against time t, at time points t0, t1, t2, t3.]

Linear temporal logic formulas are interpreted over a set of concentration traces.
- X φ (next): φ is true at the next time point;
- F φ (finally): φ is true at some time point in the future;
- G φ (globally): φ is true at all time points in the future;
- φ1 U φ2 (until): φ1 is true until φ2 becomes true.

Kripke structure

Definition (Kripke structure). A Kripke structure K is a triple K = (S, R, L), where S is a set of states, R ⊆ S × S is the transition relation and L is a labeling function. Here:
- a state is a vector of molecule concentrations;
- a state transition links two consecutive time points;
- atomic propositions are constraints on molecule concentrations and their derivatives.

LTL formulas with numerical constraints

- F([A]>10): the concentration of A eventually gets above the threshold value 10.
- G([A]+[B]<[C]): the concentration of C is always greater than the sum of the concentrations of A and B.
- F((d[M]/dt > 0) and F((d[M]/dt < 0) and F((d[M]/dt > 0)))): the derivative of [M] is positive, later negative, and later positive again, i.e. it changes sign at least twice.

From model-checking to constraint solving

[Figure: trace of [A] against time t, reaching a maximum of 10.]

- Model-checking: F([A] ≥ 8), the formula is true.
- Constraint solving: F([A] ≥ v), the formula is true for any v ≤ 10.

Constraint LTL syntax

C-ltl = Atom | F(C-ltl) | G(C-ltl) | X(C-ltl) | (C-ltl) U (C-ltl)
      | (C-ltl) and (C-ltl) | (C-ltl) or (C-ltl) | (C-ltl) ⇒ (C-ltl) | not (C-ltl)
Atom  = Value Op Variable | Value Op Value
Op    = < | > | ≤ | ≥
Value = float | [molecule] | d[molecule]/dt | d²[molecule]/dt² | Time
      | Value + Value | Value − Value | −Value | Value × Value | Value / Value | Value ^ Value

LTL constraint solving problem

Given a trace T and an LTL formula φ with n variables, the constraint solving problem ∃v ∈ ℝⁿ (φ(v)) is the problem of determining the valuations v of the variables for which the formula φ is true in T. In other words, we look for the domain of validity D_φ ⊂ ℝⁿ such that T ⊨_LTL ∀v ∈ D_φ (φ(v)).
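To make the step from model-checking to constraint solving concrete, here is a minimal Python sketch restricted to the single formula F([A] ≥ v). The function names and the sample values are hypothetical; only the peak value 10 is taken from the slide. It is an illustration under these assumptions, not the BIOCHAM implementation.

```python
from typing import Sequence


def model_check_finally_geq(trace: Sequence[float], threshold: float) -> bool:
    """Model-checking: is F([A] >= threshold) true at the first time point?"""
    return any(value >= threshold for value in trace)


def solve_finally_geq(trace: Sequence[float]) -> float:
    """Constraint solving: return the bound b such that F([A] >= v)
    holds at the first time point exactly for the valuations v <= b."""
    return max(trace)


# Hypothetical samples of [A]; the trace peaks at 10 as on the slide.
trace_A = [2.0, 6.0, 10.0, 7.0, 3.0]

print(model_check_finally_geq(trace_A, 8.0))  # True
print(solve_finally_geq(trace_A))             # 10.0, i.e. domain v <= 10
```

Model-checking answers one yes/no question per threshold, whereas the solver returns the whole validity domain in a single pass over the trace.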
LTL constraint solving algorithm

[Figure, built up over several animation steps: trace of [A] (axis 0 to 3) at time points t0 to t3; each time point is labeled with its validity domain for [A] ≥ p, the labels being p ≤ 1, p ≤ 2, p ≤ 2 and p ≤ 3.]

Atomic sub-formula: [A] >= p at time t is true for any p less than or equal to [A](t).
[Figure, continued: starting from the end of the trace, each time point is additionally labeled with its validity domain for F([A] ≥ p), the labels being p ≤ 2 and p ≤ 3.]

F operator: F([A] >= p)(t) is true iff [A] >= p is true at t or at some later time point.
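The labeling described on these slides can be sketched in a few lines of Python. This is a minimal sketch, assuming a four-point trace [1, 3, 2, 2] chosen to be consistent with the labels p ≤ 1, p ≤ 2 and p ≤ 3 shown in the figure (the exact position of each value is an assumption), and representing every domain "p ≤ b" by its upper bound b. The function names are hypothetical.

```python
from typing import List


def atomic_bounds(trace: List[float]) -> List[float]:
    """Domain of [A] >= p at each time point t_i: p <= [A](t_i)."""
    return list(trace)


def finally_bounds(atom: List[float]) -> List[float]:
    """Domain of F([A] >= p) at each time point, computed from the end
    of the trace. The union of 'p <= a' and 'p <= b' is 'p <= max(a, b)'."""
    result = [0.0] * len(atom)
    running = float("-inf")  # empty domain before any point is visited
    for i in range(len(atom) - 1, -1, -1):
        running = max(running, atom[i])
        result[i] = running
    return result


# Assumed values of [A] at t0, t1, t2, t3.
trace_A = [1.0, 3.0, 2.0, 2.0]
atom = atomic_bounds(trace_A)
print(atom)                  # [1.0, 3.0, 2.0, 2.0]
print(finally_bounds(atom))  # [3.0, 3.0, 2.0, 2.0]
```

A single backward pass suffices because the domain at a time point only depends on the atomic domain at that point and on the domain already computed for the next point.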
On the four-point trace, this gives
F([A] >= p)(t0) ≡ ([A] >= p)(t0) ∨ ([A] >= p)(t1) ∨ ([A] >= p)(t2) ∨ ([A] >= p)(t3).

Simplification rules

- F φ → φ ∨ X F φ, that is D_{Fφ}(t_i) = D_{Fφ}(t_{i+1}) ∪ D_φ(t_i);
- G φ → φ ∧ X G φ;
- (φ U ψ) → ψ ∨ (φ ∧ X(φ U ψ)).

Formulas are normalized into a negation-free form using the following equivalences:
¬X φ → X ¬φ, ¬F φ → G ¬φ, ¬G φ → F ¬φ, ¬(φ U ψ) → G ¬ψ ∨ (¬ψ U (¬φ ∧ ¬ψ)).

Domain simplification rules:

LTL constraint solving algorithm

- Starting from the end of the trace, label each time point t_i by the sub-formula F ψ (resp. G ψ and ψ1 U ψ2) and its domain of validity
  D_{Fψ}(t_i) = D_{Fψ}(t_{i+1}) ∪ D_ψ(t_i), resp.
  D_{Gψ}(t_i) = D_{Gψ}(t_{i+1}) ∩ D_ψ(t_i),
  D_{ψ1 U ψ2}(t_i) = D_{ψ2}(t_i) ∪ (D_{ψ1 U ψ2}(t_{i+1}) ∩ D_{ψ1}(t_i));
- label each time point t_i by the sub-formula X ψ (resp. ψ1 and ψ2, and ψ1 or ψ2) and its domain of validity
  D_{Xψ}(t_i) = D_ψ(t_{i+1}), resp.
  D_{ψ1 and ψ2}(t_i) = D_{ψ1}(t_i) ∩ D_{ψ2}(t_i),
  D_{ψ1 or ψ2}(t_i) = D_{ψ1}(t_i) ∪ D_{ψ2}(t_i).

Properties of the constraint solving algorithm

Strong completeness. The instantiation algorithm is correct and complete: a valuation v makes φ true at time t_i, T, t_i ⊨_LTL φ(v), if and only if v is in the computed domain of φ at t_i, v ∈ D_φ(t_i).

Complexity. The instantiation algorithm has a time complexity O(kndv +1 ), where k, d and v are respectively the size, the depth and the number of variables of the LTL formula, and n is the size of the trace.

Efficient in practice; implemented in GNU Prolog (1200 lines).

Biologically relevant LTL constraints

- Reachability: F([A]>=p). What threshold p does species A attain in the trace?
- Stability: G([A]=<p1 & [A]>=p2). What is the range of values taken by [A]?
- Oscillation: F((d([A])/dt>0 & [A]>v1) & (F((d([A])/dt<0 & [A]<v2)))). What amplitude (v1 − v2) is attained in at least one oscillation?
- Influence: G(d[A]/dt>p1 -> d2[B]/dt2>=0). Above which threshold does the derivative of A have an influence on B?

Evaluation on simulated data

Evaluation made on simulated data from a cell cycle model [Tyson et al. Kinetic analysis of a molecular model of the budding yeast cell cycle. Molecular Biology of the Cell (2000)]:

0.015 : _=>Cyclin.
MA(200) : Cyclin+Cdc2~{p1} => Cdc2-Cyclin~{p1,p2}
MA(0.018): Cdc2-Cyclin~{p1,p2} => Cdc2-Cyclin~{p1}
180*([Cdc2-Cyclin~{p1}])^2*[Cdc2-Cyclin~{p1,p2}] : Cdc2-Cyclin~{p1,p2} =[Cdc2-Cyclin~{p1}]=> Cdc2-Cyclin~{p1}
MA(1) : Cdc2-Cyclin~{p1} => Cyclin~{p1}+Cdc2
MA(0.6) : Cyclin~{p1} =>_
MA(100) : Cdc2 => Cdc2~{p1}
MA(100) : Cdc2~{p1} => Cdc2

Reachability analysis

[Figure: simulated concentrations (axis 0 to 0.5) of Cdc2-Cyclin~{p1}, Cdc2-Cyclin~{p1,p2}, Cdc2 and Cyclin~{p1} over time (0 to 100).]

biocham: trace_analyze(F([Cdc2-Cyclin~{p1}] ≥ v))
[[v ≤ 0.194]]

Oscillation amplitude analysis

[Figure: same simulated trace as in the reachability analysis.]

biocham: trace_analyze(oscil(Cdc2,1))
[[v2 ≥ 0.338, v1 ≤ 0.479], [v2 ≥ 0.341, v1 ≤ 0.480]]

Maximum amplitude of 1 oscillation: 0.479 − 0.338 = 0.141.
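The amplitude figure above follows from the two domain boxes returned for the oscillation query: each box bounds v1 from above and v2 from below, and the amplitude is the largest difference v1 − v2 over all boxes. A minimal Python sketch of that arithmetic, assuming each box is stored as a hypothetical pair (lower bound on v2, upper bound on v1):

```python
from typing import List, Tuple

# Hypothetical encoding of one solution box: (lower bound on v2, upper bound on v1).
Box = Tuple[float, float]


def max_amplitude(boxes: List[Box]) -> float:
    """Largest value of v1 - v2 reachable in any of the solution boxes."""
    return max(v1_max - v2_min for v2_min, v1_max in boxes)


# The two boxes reported on the slide for oscil(Cdc2,1).
boxes = [(0.338, 0.479), (0.341, 0.480)]
print(round(max_amplitude(boxes), 3))  # 0.141
```

The second box gives 0.480 − 0.341 = 0.139, so the first box determines the maximum.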
Oscillation amplitude analysis

[Figure: simulated concentrations (axis 0 to 0.5) of Cdc2-Cyclin~{p1}, Cdc2-Cyclin~{p1,p2}, Cdc2 and Cyclin~{p1} over time (0 to 100).]

biocham: trace_analyze(oscil(Cdc2,2))

Maximum amplitude of 2 oscillations: 0.138

Influence analysis

[Figure: same simulated trace.]

biocham: trace_analyze(G(d[Cyclin~{p1}]/dt > p1 → d2[Cdc2]/dt2 ≥ 0))

Influence scores are computed by normalizing the value obtained for p1 by the highest value of d[Cyclin~{p1}]/dt.

Positive influence scores

Molecules             Cdc2   Cdc2~{p1}   Cyclin   Cdc2-Cyclin~{p1,p2}   Cdc2-Cyclin~{p1}   Cyclin~{p1}
Cdc2                  0.00   0.01        0.00     0.00                  0.90               0.50
Cdc2-Cyclin~{p1,p2}   0.11   0.12        0.34     0.02                  0.00               0.09

(1) _ ⇒ Cyclin.
(2) Cyclin+Cdc2~{p1} ⇒ Cdc2-Cyclin~{p1,p2}
(3) Cdc2-Cyclin~{p1,p2} ⇒ Cdc2-Cyclin~{p1}
(4) Cdc2-Cyclin~{p1,p2} =[Cdc2-Cyclin~{p1}]⇒ Cdc2-Cyclin~{p1}
(5) Cdc2-Cyclin~{p1} ⇒ Cyclin~{p1}+Cdc2
(6) Cyclin~{p1} ⇒ _
(7) Cdc2 ⇒ Cdc2~{p1}
(8) Cdc2~{p1} ⇒ Cdc2

Experimental-like data

[Figure: concentrations (axis 0 to 0.5) of Cdc2-Cyclin~{p1}, Cdc2-Cyclin~{p1,p2}, Cdc2 and Cyclin~{p1} over time (0 to 100).]

biocham: trace_analyze(F([Cdc2-Cyclin~{p1}] ≥ v))
[[v ≤ 0.194]]

biocham: trace_analyze(oscil(Cdc2-Cyclin~{p1},1))
Maximum amplitude of 1 oscillation: 0.192.

biocham: trace_analyze(oscil(Cdc2-Cyclin~{p1},2))
Maximum amplitude of 2 oscillations: 0.012

biocham: trace_analyze(G(d[Cyclin~{p1}]/dt > p1 → d2[Cdc2]/dt2 ≥ 0))

Positive influence scores

Molecules             Cdc2   Cdc2~{p1}   Cyclin   Cdc2-Cyclin~{p1,p2}   Cdc2-Cyclin~{p1}   Cyclin~{p1}
Cdc2                  0.59   0.59        0.00     0.00                  0.49               0.48
Cdc2-Cyclin~{p1,p2}   0.00   0.00        0.73     0.59                  0.00               0.00

(1) _ ⇒ Cyclin.
(2) Cyclin+Cdc2~{p1} ⇒ Cdc2-Cyclin~{p1,p2}
(3) Cdc2-Cyclin~{p1,p2} ⇒ Cdc2-Cyclin~{p1}
(4) Cdc2-Cyclin~{p1,p2} =[Cdc2-Cyclin~{p1}]⇒ Cdc2-Cyclin~{p1}
(5) Cdc2-Cyclin~{p1} ⇒ Cyclin~{p1}+Cdc2
(6) Cyclin~{p1} ⇒ _
(7) Cdc2 ⇒ Cdc2~{p1}
(8) Cdc2~{p1} ⇒ Cdc2

Conclusion

- We have generalized LTL model-checking by providing an LTL constraint solving algorithm;
- we gave examples of LTL formulas for computing relevant specifications from experimental data.

Ongoing work:
- relax the condition on the occurrence of variables in constraint LTL formulas;
- define other LTL patterns formalizing biological properties;
- compare the influence scores obtained with existing techniques;
- evaluate this method on experimental data (European project TEMPO).

Future work: use this algorithm in a parameter search method.