Temporal logic with constraints
LTL constraint solving
LTL patterns
Inference of biological properties
Future work
On the Analysis of Numerical Data Time Series in
Temporal Logic
François Fages, Aurélien Rizk
Equipe Projet Contraintes, INRIA Paris-Rocquencourt
CMSB 2007
Introduction
Logical paradigm for systems biology [Fages, SAS/LOPSTR 2005] :
Biological model = Transition system
Biological property = Temporal logic formula
Biological validation = Model-checking
Implementation in the Biochemical Abstract Machine BIOCHAM, a modeling environment available at
http://contraintes.inria.fr/BIOCHAM
Our goal is to extract from experimental data the relevant biological properties, formalized in LTL.
Biocham contains:
a rule-based language for writing models of biochemical networks;
a temporal logic language for formalizing experimental knowledge;
simulation, model-checking and inference tools.
[Diagram: Experimental Data, Temporal Properties, Simulated Data and Biological Model, connected by Analysis, Inference, Simulation and Validation steps.]
→ The objective is to design a time series analyzer that computes, from experimental data, a set of biological properties expressed in temporal logic.
Plan
1 Temporal logic with constraints
2 LTL constraint solving
3 LTL patterns
4 Inference of biological properties
5 Future work
Kripke structure
[Figure: concentration [A] (axis from 0 to 3) plotted against time t at time points t0, t1, t2, t3.]
Linear temporal logic formulas are interpreted over a set of concentration traces.
X φ (next): φ is true at the next time point;
F φ (finally): φ is true at some time point in the future;
G φ (globally): φ is true at all time points in the future;
φ1 U φ2 (until): φ1 is true until φ2 becomes true.
Definition (Kripke structure)
A Kripke structure K is a triple K = (S, R, L) where S is a set of states, R ⊆ S × S is the transition relation and L is a labeling function. Here:
state: a vector of molecule concentrations;
transition: the step between two consecutive time points;
atomic propositions: constraints on molecule concentrations and their derivatives.
On the Analysis of Numerical Data Time Series in Temporal Logic
Temporal logic with constraints
LTL constraint solving
LTL patterns
Inference of biological properties
Future work
LTL formulas with numerical constraints
F([A]>10): the concentration of A eventually gets above the threshold value 10.
G([A]+[B]<[C]): the concentration of C is always greater than the sum of the concentrations of A and B.
F((d[M]/dt > 0) and F((d[M]/dt < 0) and F((d[M]/dt > 0)))): the derivative of [M] is positive, later negative, and later positive again, i.e. it changes sign at least twice.
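As an illustration of how the first two formulas read on a finite trace, here is a minimal Python sketch. The helper names `finally_` and `globally`, the dictionary-based state and the sample values are all assumptions made for this example; they are not part of BIOCHAM.

```python
from typing import Callable, Dict, List

State = Dict[str, float]   # concentrations at one time point (hypothetical layout)
Trace = List[State]

def finally_(pred: Callable[[State], bool], trace: Trace, i: int = 0) -> bool:
    """F pred at time point i: pred holds at i or at some later time point."""
    return any(pred(s) for s in trace[i:])

def globally(pred: Callable[[State], bool], trace: Trace, i: int = 0) -> bool:
    """G pred at time point i: pred holds at i and at every later time point."""
    return all(pred(s) for s in trace[i:])

# Hypothetical three-point trace, made up for the example.
trace = [
    {"A": 2.0, "B": 1.0, "C": 5.0},
    {"A": 12.0, "B": 3.0, "C": 20.0},
    {"A": 6.0, "B": 2.0, "C": 9.0},
]

reaches_10 = finally_(lambda s: s["A"] > 10, trace)              # F([A]>10)
c_dominates = globally(lambda s: s["A"] + s["B"] < s["C"], trace)  # G([A]+[B]<[C])
print(reaches_10, c_dominates)
```

Both checks return plain booleans here; the constraint solving view introduced next replaces the constants by variables.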
From model-checking to constraint solving
[Figure: trace of [A] against time t, reaching the value 10.]
Model-checking: F([A] ≥ 8), the formula is true.
Constraint solving: F([A] ≥ v), the formula is true for any v ≤ 10.
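The contrast between the two questions can be sketched in a few lines of Python. The function names and the sample trace (peaking at 10, as in the slide) are assumptions for illustration only.

```python
from typing import List

def model_check_F_geq(values: List[float], threshold: float) -> bool:
    """Model-checking: is F([A] >= threshold) true on this trace?"""
    return any(x >= threshold for x in values)

def solve_F_geq(values: List[float]) -> float:
    """Constraint solving: return the bound b such that F([A] >= v)
    is true exactly for the values v <= b."""
    return max(values)

trace_A = [2.0, 6.0, 10.0, 7.0, 4.0]   # hypothetical samples of [A]

print(model_check_F_geq(trace_A, 8.0))  # True: a yes/no answer for one threshold
print(solve_F_geq(trace_A))             # 10.0: the whole set of valid thresholds, v <= 10
```

Model-checking answers one query at a time, whereas the solved bound answers every threshold query at once.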
Constraint LTL syntax
C-ltl = Atom | F(C-ltl) | G(C-ltl) | X(C-ltl) | (C-ltl) U (C-ltl)
      | (C-ltl) and (C-ltl) | (C-ltl) or (C-ltl)
      | (C-ltl) ⇒ (C-ltl) | not (C-ltl)
Atom  = Value Op Variable | Value Op Value
Op    = < | > | ≤ | ≥
Value = float | [molecule] | d[molecule]/dt | d²[molecule]/dt² | Time
      | Value + Value | Value − Value | − Value | Value × Value
      | Value / Value | Value ^ Value
LTL Constraint solving problem
Given a trace T and an LTL formula φ with n variables, the constraint solving problem ∃v ∈ ℝ^n (φ(v)) is the problem of determining the valuations v of the variables for which the formula φ is true in T.
In other words, we look for the domain of validity D_φ ⊂ ℝ^n such that
T ⊨_LTL ∀v ∈ D_φ (φ(v)).
LTL constraint solving algorithm
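[Figure: trace of [A] (axis from 0 to 3) at time points t0, t1, t2, t3. In the successive builds of the slide, each time point is labeled with the domain of p for [A] ≥ p (p ≤ 1, p ≤ 2 or p ≤ 3) and then for F([A] ≥ p) (p ≤ 2 or p ≤ 3).]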
Atomic sub-formula: [A] >= p at time t is true for any p less than or equal to [A](t).
F operator: F([A] >= p)(t) is true iff [A] >= p is true at t or at some later time point.
F([A] >= p)(t0) ≡ ([A] >= p)(t0) ∨ ([A] >= p)(t1) ∨ ([A] >= p)(t2) ∨ ([A] >= p)(t3)
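Read as domains, each disjunct contributes the set of values of p it admits, so the disjunction becomes a union. A short worked version, taking the atomic domains p ≤ 1, p ≤ 2, p ≤ 2 and p ≤ 3 that label the trace in the slide's figure (how they are assigned to particular time points does not affect the result at t0):

```latex
\begin{aligned}
D_{F([A]\ge p)}(t_0)
  &= \bigcup_{i=0}^{3} D_{[A]\ge p}(t_i)
   = \bigcup_{i=0}^{3} \{\, p \mid p \le [A](t_i) \,\} \\
  &= \{\, p \mid p \le \max_{0 \le i \le 3} [A](t_i) \,\}
   = \{\, p \mid p \le 3 \,\}.
\end{aligned}
```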
Simplification rules
F φ → φ ∨ X F φ, that is D_{Fφ}(t_i) = D_{Fφ}(t_{i+1}) ∪ D_φ(t_i);
G φ → φ ∧ X G φ;
(φ U ψ) → ψ ∨ (φ ∧ X(φ U ψ)).
Formulas are normalized into a negation-free form using the following equivalences:
¬X φ → X ¬φ,
¬F φ → G ¬φ,
¬G φ → F ¬φ,
¬(φ U ψ) → G ¬ψ ∨ (¬ψ U (¬φ ∧ ¬ψ)).
Domain simplification rules :
LTL constraint solving algorithm
starting from the end of the trace, label each time point t_i by the sub-formula F ψ (resp. G ψ and ψ1 U ψ2) and its domain of validity
D_{Fψ}(t_i) = D_{Fψ}(t_{i+1}) ∪ D_ψ(t_i)
resp.
D_{Gψ}(t_i) = D_{Gψ}(t_{i+1}) ∩ D_ψ(t_i)
D_{ψ1 U ψ2}(t_i) = D_{ψ2}(t_i) ∪ (D_{ψ1 U ψ2}(t_{i+1}) ∩ D_{ψ1}(t_i));
label each time point t_i by the sub-formula X ψ (resp. ψ1 and ψ2, and ψ1 or ψ2) and its domain of validity
D_{Xψ}(t_i) = D_ψ(t_{i+1})
resp.
D_{ψ1 and ψ2}(t_i) = D_{ψ1}(t_i) ∩ D_{ψ2}(t_i)
D_{ψ1 or ψ2}(t_i) = D_{ψ1}(t_i) ∪ D_{ψ2}(t_i).
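The backward labeling can be sketched in Python for a deliberately restricted case: a single variable p and atoms of the form [X] >= p, so that every domain is a down-set {p | p <= b} and can be stored as its bound b. Union then keeps the larger bound and intersection the smaller one. This is a simplification for illustration, not the general algorithm (which handles several variables and arbitrary domains); the function names, the sample values and the treatment of the last time point as looping on itself are assumptions.

```python
from typing import List

def atom_geq(values: List[float]) -> List[float]:
    """Bounds for `[X] >= p`: at time point i the domain is p <= values[i]."""
    return list(values)

def finally_(d: List[float]) -> List[float]:
    """D_F(t_i) = D_F(t_{i+1}) U D(t_i), computed from the end of the trace."""
    out = list(d)
    for i in range(len(d) - 2, -1, -1):
        out[i] = max(out[i + 1], d[i])      # union of down-sets
    return out

def globally(d: List[float]) -> List[float]:
    """D_G(t_i) = D_G(t_{i+1}) n D(t_i)."""
    out = list(d)
    for i in range(len(d) - 2, -1, -1):
        out[i] = min(out[i + 1], d[i])      # intersection of down-sets
    return out

def until(d1: List[float], d2: List[float]) -> List[float]:
    """D_U(t_i) = D2(t_i) U (D_U(t_{i+1}) n D1(t_i))."""
    out = list(d2)
    for i in range(len(d2) - 2, -1, -1):
        out[i] = max(d2[i], min(out[i + 1], d1[i]))
    return out

def next_(d: List[float]) -> List[float]:
    """D_X(t_i) = D(t_{i+1}); assumption: the last point loops on itself."""
    return d[1:] + d[-1:]

a = atom_geq([1.0, 3.0, 2.0, 2.0])   # hypothetical values of [A] at t0..t3
print(finally_(a))   # [3.0, 3.0, 2.0, 2.0]
print(globally(a))   # [1.0, 2.0, 2.0, 2.0]
print(next_(a))      # [3.0, 2.0, 2.0, 2.0]
```

Each operator is a single backward pass over the trace, which is what the recurrences on t_{i+1} suggest.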
Properties of the constraint solving algorithm
Strong completeness
The instantiation algorithm is correct and complete:
a valuation v makes φ true at time t_i, T, t_i ⊨_LTL φ(v),
if and only if v is in the computed domain of φ at t_i, v ∈ D_φ(t_i).
Complexity
The instantiation algorithm has a time complexity O(kndv +1 ) where k, d,
v are respectively the size, the depth and the number of variables of the
LTL formula and where n is the size of the trace.
Efficient in practice; implemented in GNU Prolog (1200 lines).
Biologically relevant LTL constraints
Reachability: F([A]>=p), what threshold p does species A attain in the trace?
Stability: G([A]=<p1 & [A]>=p2), what is the range of values taken by [A]?
Oscillation: F((d([A])/dt>0 & [A]>v1) & (F((d([A])/dt<0 & [A]<v2)))), what amplitude (v1 − v2) is attained in at least one oscillation?
Influence: G(d[A]/dt>p1 -> d2[B]/dt2>=0), above which threshold does the derivative of [A] have an influence on [B]?
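For the two simplest patterns the domain of validity can be read directly off the trace. The following Python sketch shows this; the function names and sample concentrations are hypothetical.

```python
from typing import List, Tuple

def reachability_bound(values: List[float]) -> float:
    """F([A] >= p) holds on the trace exactly for p <= max(values)."""
    return max(values)

def stability_bounds(values: List[float]) -> Tuple[float, float]:
    """G([A] =< p1 & [A] >= p2) holds exactly for
    p1 >= max(values) and p2 <= min(values)."""
    return max(values), min(values)

samples = [0.20, 0.45, 0.30, 0.25]   # hypothetical concentrations of A

p_reach = reachability_bound(samples)
p1_low, p2_high = stability_bounds(samples)
print(f"F([A] >= p) holds for p <= {p_reach}")
print(f"G([A] =< p1 & [A] >= p2) holds for p1 >= {p1_low} and p2 <= {p2_high}")
```

The tightest stability bounds, p1 = max and p2 = min, give the range of values taken by [A] on the trace.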
Evaluation on simulated data
Evaluation made on simulated data from a cell cycle model [Tyson et al. Kinetic analysis of a molecular model of the budding yeast cell cycle. Molecular Biology of the Cell (2000)]:
0.015     : _ => Cyclin.
MA(200)   : Cyclin+Cdc2~{p1} => Cdc2-Cyclin~{p1,p2}
MA(0.018) : Cdc2-Cyclin~{p1,p2} => Cdc2-Cyclin~{p1}
180*([Cdc2-Cyclin~{p1}])^2*[Cdc2-Cyclin~{p1,p2}]
          : Cdc2-Cyclin~{p1,p2} =[Cdc2-Cyclin~{p1}]=> Cdc2-Cyclin~{p1}
MA(1)     : Cdc2-Cyclin~{p1} => Cyclin~{p1}+Cdc2
MA(0.6)   : Cyclin~{p1} => _
MA(100)   : Cdc2 => Cdc2~{p1}
MA(100)   : Cdc2~{p1} => Cdc2
Reachability analysis
[Figure: simulated concentrations of Cdc2-Cyclin~{p1}, Cdc2-Cyclin~{p1,p2}, Cdc2 and Cyclin~{p1}; horizontal axis from 0 to 100, concentration axis from 0 to 0.5.]
biocham: trace_analyze(F([Cdc2-Cyclin~{p1}] ≥ v))
[[v ≤ 0.194]]
Oscillation amplitude analysis
[Figure: simulated concentrations of Cdc2-Cyclin~{p1}, Cdc2-Cyclin~{p1,p2}, Cdc2 and Cyclin~{p1}; horizontal axis from 0 to 100, concentration axis from 0 to 0.5.]
biocham: trace_analyze(oscil(Cdc2,1))
[[v2 ≥ 0.338, v1 ≤ 0.479],
[v2 ≥ 0.341, v1 ≤ 0.480]]
Maximum amplitude of 1 oscillation: 0.479 − 0.338 = 0.141.
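The amplitude is obtained by taking, over the boxes returned by the solver, the largest difference between the upper bound on v1 and the lower bound on v2. A small Python sketch of that step, using the two boxes reported above (the dictionary layout and function name are assumptions for illustration):

```python
from typing import Dict, List

# The two solution boxes reported on the slide: v2 >= v2_min, v1 <= v1_max.
boxes: List[Dict[str, float]] = [
    {"v2_min": 0.338, "v1_max": 0.479},
    {"v2_min": 0.341, "v1_max": 0.480},
]

def max_amplitude(solution_boxes: List[Dict[str, float]]) -> float:
    """Largest value of v1 - v2 allowed by any single box."""
    return max(b["v1_max"] - b["v2_min"] for b in solution_boxes)

print(round(max_amplitude(boxes), 3))  # 0.141
```

The second box yields 0.480 − 0.341 = 0.139, so the first box determines the maximum.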
biocham: trace_analyze(oscil(Cdc2,2))
Maximum amplitude of 2 oscillations: 0.138
Influence analysis
[Figure: simulated concentrations of Cdc2-Cyclin~{p1}, Cdc2-Cyclin~{p1,p2}, Cdc2 and Cyclin~{p1}; horizontal axis from 0 to 100, concentration axis from 0 to 0.5.]
biocham: trace_analyze(G(d[Cyclin~{p1}]/dt > p1 → d2[Cdc2]/dt2 ≥ 0))
Influence scores are computed by normalizing the value obtained for p1 by the highest value of d[Cyclin~{p1}]/dt.
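For a formula of the form G(d[A]/dt > p1 → d2[B]/dt2 ≥ 0), the implication can only fail at time points where d2[B]/dt2 < 0, so the formula holds exactly when p1 is at least the largest d[A]/dt observed at such points. The Python sketch below computes that threshold from precomputed derivative samples. The slides do not give the exact normalization formula; dividing the threshold by the highest derivative is only one literal reading of the sentence above and is labeled as an assumption, as are the function names and sample values.

```python
from typing import List, Optional

def influence_threshold(dA: List[float], d2B: List[float]) -> float:
    """Smallest bound t such that G(dA > p1 -> d2B >= 0) holds for all p1 >= t.
    Returns -inf when d2B is never negative (the formula holds for any p1)."""
    violating = [da for da, d2b in zip(dA, d2B) if d2b < 0]
    return max(violating) if violating else float("-inf")

def normalized_threshold(dA: List[float], d2B: List[float]) -> Optional[float]:
    """ASSUMPTION: threshold divided by the highest value of dA.
    This is one possible reading of 'normalizing', not the documented score."""
    highest = max(dA)
    if highest <= 0:
        return None                      # normalization undefined in this sketch
    return influence_threshold(dA, d2B) / highest

# Hypothetical derivative samples at four time points.
dA = [0.1, 0.5, 0.3, -0.2]     # d[A]/dt
d2B = [-1.0, 1.0, -0.5, 2.0]   # d2[B]/dt2

print(influence_threshold(dA, d2B))    # 0.3
print(normalized_threshold(dA, d2B))
```

A low threshold means the implication already holds for small derivative values of A.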
Positive influence scores
Molecules           | Cdc2 | Cdc2~{p1} | Cyclin | Cdc2-Cyclin~{p1,p2} | Cdc2-Cyclin~{p1} | Cyclin~{p1}
Cdc2                | 0.00 | 0.01      | 0.00   | 0.00                | 0.90             | 0.50
Cdc2-Cyclin~{p1,p2} | 0.11 | 0.12      | 0.34   | 0.02                | 0.00             | 0.09
(1) _ ⇒ Cyclin
(2) Cyclin+Cdc2~{p1} ⇒ Cdc2-Cyclin~{p1,p2}
(3) Cdc2-Cyclin~{p1,p2} ⇒ Cdc2-Cyclin~{p1}
(4) Cdc2-Cyclin~{p1,p2} =[Cdc2-Cyclin~{p1}]⇒ Cdc2-Cyclin~{p1}
(5) Cdc2-Cyclin~{p1} ⇒ Cyclin~{p1}+Cdc2
(6) Cyclin~{p1} ⇒ _
(7) Cdc2 ⇒ Cdc2~{p1}
(8) Cdc2~{p1} ⇒ Cdc2
Experimental-like data
[Figure: experimental-like time series of Cdc2-Cyclin~{p1}, Cdc2-Cyclin~{p1,p2}, Cdc2 and Cyclin~{p1}; horizontal axis from 0 to 100, concentration axis from 0 to 0.5.]
biocham: trace_analyze(F([Cdc2-Cyclin~{p1}] ≥ v))
[[v ≤ 0.194]]
biocham: trace_analyze(oscil(Cdc2-Cyclin~{p1},1))
Maximum amplitude of 1 oscillation: 0.192.
biocham: trace_analyze(oscil(Cdc2-Cyclin~{p1},2))
Maximum amplitude of 2 oscillations: 0.012
biocham: trace_analyze(G(d[Cyclin~{p1}]/dt > p1 → d2[Cdc2]/dt2 ≥ 0))
Positive influence scores
Molecules           | Cdc2 | Cdc2~{p1} | Cyclin | Cdc2-Cyclin~{p1,p2} | Cdc2-Cyclin~{p1} | Cyclin~{p1}
Cdc2                | 0.59 | 0.59      | 0.00   | 0.00                | 0.49             | 0.48
Cdc2-Cyclin~{p1,p2} | 0.00 | 0.00      | 0.73   | 0.59                | 0.00             | 0.00
(1) _ ⇒ Cyclin
(2) Cyclin+Cdc2~{p1} ⇒ Cdc2-Cyclin~{p1,p2}
(3) Cdc2-Cyclin~{p1,p2} ⇒ Cdc2-Cyclin~{p1}
(4) Cdc2-Cyclin~{p1,p2} =[Cdc2-Cyclin~{p1}]⇒ Cdc2-Cyclin~{p1}
(5) Cdc2-Cyclin~{p1} ⇒ Cyclin~{p1}+Cdc2
(6) Cyclin~{p1} ⇒ _
(7) Cdc2 ⇒ Cdc2~{p1}
(8) Cdc2~{p1} ⇒ Cdc2
Conclusion
we have generalized LTL model-checking by providing an LTL constraint solving algorithm;
we gave examples of LTL formulas for computing relevant specifications from experimental data.
Ongoing work:
relax the condition on the occurrence of variables in constraint LTL formulas;
define other LTL patterns formalizing biological properties;
compare the influence scores obtained with those of existing techniques;
evaluate this method on experimental data (European project TEMPO).
Future work:
use this algorithm in a parameter search method.