Vérification de systèmes et réécriture de plus en plus efficace Yohan Boichut LIFO - Université d’Orléans JIRC 08, Orléans 9 octobre 2008 1/16 Context Java Bytecode analysis I I Rewriting semantics for the Java Bytecode Static analysis from reachability analysis in rewriting I I Tree automata technique [RTA98] Timbuk tool (http://www.irisa.fr/lande/genet/timbuk) 2/16 Context Java Bytecode analysis I I Rewriting semantics for the Java Bytecode Static analysis from reachability analysis in rewriting I I Tree automata technique [RTA98] Timbuk tool (http://www.irisa.fr/lande/genet/timbuk) 2/16 Rewriting Semantics for the Java Bytecode JVM Execution Trace inst 1 s1 inst 2 s2 inst 3 s3 inst 4 s4 inst 5 s5 s6 s i : JVM state : JVM transition inst i : Bytecode instruction For a given program P I I JVM states as Terms Rewrite rules for I I the Bytecode instructions interpretation (generic rules) the program 3/16 Reachability Analysis in Rewriting R* ? Bad E R*(E) Over−approximation of R*(E) E : initial terms JVM initial state R : term rewriting system JVM Transition relation for the given program Bad : forbidden terms Forbidden JVM states 4/16 Tree automata completion L(A 0 ) R*(L(A 0 )) L(A 1 ) L(A 2 ) ... L(A k ) I A set of terms is represented by a tree automaton language I Ai+1 = Ai + new transitions and states I Completion stops when a fix point automaton is found 5/16 Computing substitutions for a completion step g g a f b c 6/16 Computing substitutions for a completion step g g a f b c a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → qa qb qc qg 1 qf qg 2 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g g g R f → x a x y y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g g a f x y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg1 ] g[qa ] a f [qb ] x y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg1 ] g[qa ] a f [qb ] x y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg1 ] g a f x y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g g a f x y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg2 ] g[qg1 ] a x f [qf ] y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg2 ] g[qg1 ] f [qf ] a[qa ] x[qb ] y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg2 ] g[qg1 ] f [qf ] a[qa ] x[qb ] y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg2 ] g[qg1 ] f [qf ] a[qa ] x[qb ] y 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg2 ] g[qg1 ] f [qf ] a[qa ] x[qb ] y[qc ] 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg2 ] g[qg1 ] f [qf ] a[qa ] x[qb ] y[qc ] 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g a f b c qa qb qc qg 1 qf qg 2 g[qg2 ] g a f x[qb ] y[qc ] 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g f a b c qa qb qc qg 1 qf qg 2 g[qg2 ] g g f → x a y x[qb ] y[qc ] 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g g f a b c qa qb qc qg 1 qf qg 2 g[qg2 ] g[qg2 ] g f → x[qb ] y[qc ] a x[qb ] y[qc ] 6/16 Computing substitutions for a completion step a → b → c → → A g (qa , qb ) f (qc ) → g (qg 1 , qf ) → g (qb , qc ) → g g f a b c qa qb qc qg 1 qf qg 2 qg 2 g[qg2 ] g[qg2 ] g f → x[qb ] y[qc ] a x[qb ] y[qc ] 6/16 Don’t forget that it is used for Verification! I Security protocols: [CADE00,WITS03,CAV05,TFIT06,ICTAC06] I Java program verification: [RTA 07] 7/16 Don’t forget that it is used for Verification! I Security protocols: [CADE00,WITS03,CAV05,TFIT06,ICTAC06] I Java program verification: [RTA 07] 7/16 Don’t forget that it is used for Verification! I Security protocols: [CADE00,WITS03,CAV05,TFIT06,ICTAC06] I Java program verification: [RTA 07] 7/16 Don’t forget that it is used for Verification! I Security protocols: [CADE00,WITS03,CAV05,TFIT06,ICTAC06] I Java program verification: [RTA 07] But for the verification of Java programs. . . I I TRS are huge (more than 600 rules for a bubble sort program) Computation times with Timbuk may exceed 4 days! 7/16 Don’t forget that it is used for Verification! I Security protocols: [CADE00,WITS03,CAV05,TFIT06,ICTAC06] I Java program verification: [RTA 07] But for the verification of Java programs. . . I I TRS are huge (more than 600 rules for a bubble sort program) Computation times with Timbuk may exceed 4 days! Goal: Propose practical techniques to solve scalability issues 7/16 Reachability Preserving TRS Transformation Fact: Collecting all possible ground instances of a deep pattern may be expensive Idea: Transform TRS into simpler TRS I A simple form for the left hand-side of rules (depth max=2) I I I Flat: f (x1 , . . . , xn ) or c f (t1 , . . . , tn ) where each ti is flat Reachability preserving (Terms computed with the original TRS must be also computed by the resulting TRS) 8/16 Reachability Preserving TRS Transformation Fact: Collecting all possible ground instances of a deep pattern may be expensive Idea: Transform TRS into simpler TRS I A simple form for the left hand-side of rules (depth max=2) I I I Flat: f (x1 , . . . , xn ) or c f (t1 , . . . , tn ) where each ti is flat Reachability preserving (Terms computed with the original TRS must be also computed by the resulting TRS) 8/16 Reachability Preserving TRS Transformation Fact: Collecting all possible ground instances of a deep pattern may be expensive Idea: Transform TRS into simpler TRS I A simple form for the left hand-side of rules (depth max=2) I I I Flat: f (x1 , . . . , xn ) or c f (t1 , . . . , tn ) where each ti is flat Reachability preserving (Terms computed with the original TRS must be also computed by the resulting TRS) 8/16 Reachability Preserving TRS Transformation Fact: Collecting all possible ground instances of a deep pattern may be expensive Idea: Transform TRS into simpler TRS I A simple form for the left hand-side of rules (depth max=2) I I I Flat: f (x1 , . . . , xn ) or c f (t1 , . . . , tn ) where each ti is flat Reachability preserving (Terms computed with the original TRS must be also computed by the resulting TRS) 8/16 Reachability Preserving TRS Transformation Fact: Collecting all possible ground instances of a deep pattern may be expensive Idea: Transform TRS into simpler TRS I A simple form for the left hand-side of rules (depth max=2) I I I Flat: f (x1 , . . . , xn ) or c f (t1 , . . . , tn ) where each ti is flat Reachability preserving (Terms computed with the original TRS must be also computed by the resulting TRS) 8/16 Transformation of the example rule g g R g a f x → x y y 9/16 Transformation of the example rule g g a a f x y → C1 9/16 Transformation of the example rule g g C1 a f x y → C1 9/16 Transformation of the example rule g g C1 a g (C1 , x) f x y → → C1 C2 (x) 9/16 Transformation of the example rule g C2 f x y a g (C1 , x) → → C1 C2 (x) 9/16 Transformation of the example rule g C2 f x y a g (C1 , x) f (y ) → → → C1 C2 (x) C3 (y ) 9/16 Transformation of the example rule g C2 C3 x y a g (C1 , x) f (y ) → → → C1 C2 (x) C3 (y ) 9/16 Transformation of the example rule g C2 C3 x y a g (C1 , x) f (y ) g (C2 (x), C3 (y )) → → → → C1 C2 (x) C3 (y ) C4 (x, y ) 9/16 Transformation of the example rule C4 x y a g (C1 , x) f (y ) g (C2 (x), C3 (y )) → → → → C1 C2 (x) C3 (y ) C4 (x, y ) 9/16 Transformation of the example rule g C4 x → y a g (C1 , x) f (y ) g (C2 (x), C3 (y )) C4 (x, y ) → → → → → x y C1 C2 (x) C3 (y ) C4 (x, y ) g (x, y ) 9/16 Transformation of the example rule g g R g a → f x x y y a g (C1 , x) φ(R) f (y ) g (C2 (x), C3 (y )) C4 (x, y ) → → → → → C1 C2 (x) C3 (y ) C4 (x, y ) g (x, y ) 9/16 Transformation of the example rule g g R g a → f x x y y a g (C1 , x) φ(R) f (y ) g (C2 (x), C3 (y )) C4 (x, y ) → → → → → C1 C2 (x) C3 (y ) C4 (x, y ) g (x, y ) ∀t, t 0 ∈ T (F).t→R t 0 =⇒ t→∗φ(R) t 0 9/16 Main Result φ (R)*(L(A0)) R*(L(A0)) L(A0 ) 10/16 Main Result An over-approximation computed for φ(R) is also an over-approximation for R φ (R)*(L(A0)) R*(L(A0)) L(A0 ) L(Ak ) 10/16 Dedicated completion algorithm (1) Facts I For each l→r ∈ φ(R), l does not exceed a depth of 2 I Very close to a direct pattern-matching on transitions g (C2 (x), C3 (y )) → C4 (x, y ) g (qg 1 , qf ) → qg 2 I For this transition, the current matching algorithm computes all possible instances from g (qg 1 , qf ) Can we reduce the substitution computation to a simple pattern-matching problem? 11/16 Dedicated completion algorithm (1) Facts I For each l→r ∈ φ(R), l does not exceed a depth of 2 I Very close to a direct pattern-matching on transitions g (C2 (x), C3 (y )) → C4 (x, y ) g (qg 1 , qf ) → qg 2 I For this transition, the current matching algorithm computes all possible instances from g (qg 1 , qf ) Can we reduce the substitution computation to a simple pattern-matching problem? 11/16 Dedicated completion algorithm (1) Facts I For each l→r ∈ φ(R), l does not exceed a depth of 2 I Very close to a direct pattern-matching on transitions g (C2 (x), C3 (y )) → C4 (x, y ) g (qg 1 , qf ) → qg 2 I For this transition, the current matching algorithm computes all possible instances from g (qg 1 , qf ) Can we reduce the substitution computation to a simple pattern-matching problem? 11/16 Dedicated completion algorithm (2) a b c g (qa , qb ) f (qc ) g (qg 1 , qf ) g (qb , qc ) C1 C3 (qc ) C2 (qb ) → → → → → → → → → → qa qb qc qg 1 qf qg 2 qg 2 qa qf qg 1 a g (C1 , x) f (y ) g (C2 (x), C3 (y )) C4 (x, y ) → → → → → C1 C2 (x) C3 (y ) C4 (x, y ) g (x, y ) 12/16 Dedicated completion algorithm (2) a b c g (qa , qb ) f (qc ) g (qg 1 , qf ) g (qb , qc ) C1 C3 (qc ) C2 (qb ) → → → → → → → → → → qa qb qc qg 1 qf qg 2 qg 2 qa qf qg 1 a g (C1 , x) f (y ) g (C2 (x), C3 (y )) C4 (x, y ) → → → → → C1 C2 (x) C3 (y ) C4 (x, y ) g (x, y ) 12/16 Dedicated completion algorithm (2) a b c g (qa , qb ) f (qc ) g (qg 1 , qf ) g (qb , qc ) C1 C3 (qc ) C2 (qb ) → → → → → → → → → → qa qb qc qg 1 qf qg 2 qg 2 qa qf qg 1 g qg1 qf 12/16 Dedicated completion algorithm (2) a b c g (qa , qb ) f (qc ) g (qg 1 , qf ) g (qb , qc ) C1 C3 (qc ) C2 (qb ) → → → → → → → → → → qa qb qc qg 1 qf qg 2 qg 2 qa qf qg 1 g qg1 qf 12/16 Dedicated completion algorithm (2) a b c g (qa , qb ) f (qc ) g (qg 1 , qf ) g (qb , qc ) C1 C3 (qc ) C2 (qb ) → → → → → → → → → → qa qb qc qg 1 qf qg 2 qg 2 qa qf qg 1 g qf list g qa C2 qb qb 12/16 Dedicated completion algorithm (2) a b c g (qa , qb ) f (qc ) g (qg 1 , qf ) g (qb , qc ) C1 C3 (qc ) C2 (qb ) → → → → → → → → → → qa qb qc qg 1 qf qg 2 qg 2 qa qf qg 1 g qf list g qa C2 qb qb 12/16 Dedicated completion algorithm (2) a b c g (qa , qb ) f (qc ) g (qg 1 , qf ) g (qb , qc ) C1 C3 (qc ) C2 (qb ) → → → → → → → → → → qa qb qc qg 1 qf qg 2 qg 2 qa qf qg 1 g list g qa qb list C2 f C3 qb qc qc 12/16 Dedicated completion algorithm (3) We want to do a completion step with the rule g (C2 (x), C3 (y )) → C4 (x, y ). g C2 C4 C3 → x y x y The particular form of the rules allows us to replace the substitution calculus by associative pattern-matching 13/16 Dedicated completion algorithm (3) We want to do a completion step with the rule g (C2 (x), C3 (y )) → C4 (x, y ). g list C4 list → C2 C3 x y x y The particular form of the rules allows us to replace the substitution calculus by associative pattern-matching 13/16 Dedicated completion algorithm (3) We want to do a completion step with the rule g (C2 (x), C3 (y )) → C4 (x, y ). g list g list list list ≺ ≺ C2 C3 x y g qa qb C2 f C3 qb qc qc The particular form of the rules allows us to replace the substitution calculus by associative pattern-matching 13/16 Dedicated completion algorithm (3) We want to do a completion step with the rule g (C2 (x), C3 (y )) → C4 (x, y ). g list g list list list ≺ ≺ C2 C3 x y g qa qb C2 f C3 qb qc qc The particular form of the rules allows us to replace the substitution calculus by associative pattern-matching 13/16 Dedicated completion algorithm (3) We want to do a completion step with the rule g (C2 (x), C3 (y )) → C4 (x, y ). g list C4 list → g qa qb C2 f C3 qb qc qc qb qc The particular form of the rules allows us to replace the substitution calculus by associative pattern-matching 13/16 General schema of the implementation The Tom language [RTA’07]: Piggybacking Rewriting on top of Java I Efficient support for algebraic terms (hash-consing), I Pattern-matching (AU theory, variadic operators), I Expressive strategy language (à la ELAN, Stratego). 14/16 General schema of the implementation The Tom language [RTA’07]: Piggybacking Rewriting on top of Java I Efficient support for algebraic terms (hash-consing), I Pattern-matching (AU theory, variadic operators), I Expressive strategy language (à la ELAN, Stratego). Generator of dedicated completion programs written in Tom CCG Specification input file Completion Code Generator Tom completion program dedicated to the input Tom compiler Java completion program 14/16 Experimental results TRS size (nb of rules) Timbuk: Time (secs) Tom: Time (secs) Timbuk/Tom NSPK protocol 13 View-Only protocol 15 Java program (chained lists) 303 19.7 6420 37387 5.9 3 150 40 303 120 In practice, conclusive analyses with Timbuk are also conclusive with Tom 15/16 Experimental results TRS size (nb of rules) Timbuk: Time (secs) Tom: Time (secs) Timbuk/Tom NSPK protocol 13 View-Only protocol 15 Java program (chained lists) 303 19.7 6420 37387 5.9 3 150 40 303 120 In practice, conclusive analyses with Timbuk are also conclusive with Tom 15/16 Conclusion Main results: I Definition of a reachability preserving transformation on TRS I Computations of over-approximations using associative pattern-matching I Implementation in Tom/Java I A factor 10 in general, and up to 100 on Java examples Future work: I Verification of MIDlets I A better control of approximations I Using threads to parallelize the completion procedure 16/16 Conclusion Main results: I Definition of a reachability preserving transformation on TRS I Computations of over-approximations using associative pattern-matching I Implementation in Tom/Java I A factor 10 in general, and up to 100 on Java examples Future work: I Verification of MIDlets A better control of approximations I Using threads to parallelize the completion procedure I 16/16