Algorithmic verification Ahmed Rezine IDA, Linköpings Universitet Vårtermin 2015 Outline Overview Model checking Symbolic execution Outline Overview Model checking Symbolic execution Program verification and Approximations We often want to answer whether the program is safe or not (i.e., has some erroneous reachable configurations or not): Safe Program Unsafe Program Static Program Analysis and Approximations I Finding all configurations or behaviours (and hence errors) of arbitrary computer programs can be easily reduced to the halting problem of a Turing machine. I This problem is proven to be undecidable, i.e., there is no algorithm that is guaranteed to terminate and to give an exact answer to the problem. I An algorithm is sound in the case where each time it reports the program is safe wrt. some errors, then the original program is indeed safe wrt. those errors I An algorithm is complete in the case where each time it is given a program that is safe wrt. some errors, then it does report it to be safe wrt. those errors Static Program Analysis and Approximations I The idea is then to come up with efficient approximations and algorithms to give correct answers in as many cases as possible. Over-approximation Under-approximation Static Program Analysis and Approximations I A sound analysis cannot give false negatives I A complete analysis cannot give false positives False Positive False Negative In this lecture We will briefly introduce different types of verification approaches: I Model checking: exhaustive, aims for soundness I Symbolic execution: partial, aims for completeness Administrative Aspects: I There will be two lab sessions I These might not be enough and you might have to work more I You will need to write down your answers to each question on a draft. I You will need to demonstrate (individually) your answers in one of the lab sessions on a computer for me or Ola I Once you get the green light, you can write your report in a pdf form and send it (in pairs) to me or Ola. I You will get questions in the final exam about this lecture and the labs. Outline Overview Model checking Correctness properties Symbolic execution Model checking I Model checking is a push button verification approach I Given: I a model M of the system to be verified, and I a correctness property Φ to be checked: absence of deadlocks, livelocks, starvation, under/over specification, violations of constraints/assertions, etc I The model checking tool returns: I a counter example in case M does not model Φ, or I a mathematical guaranty that the M does model Φ Model Checking: Verification vs debugging I Model checking tools are used both: I To establish correctness of a model M with respect to a correctness property Φ I More importantly, to find bugs and errors in M early during the design M as a Kripke structure Assume a set of atomic propositions AP. A Kripke structure M is a tuple (S ; S0 ; R ; L) where: 1. S is a finite set of states S is the set of initial states R S S is the transition relation s.t. for any s 2 S, R(s s 0 ) holds for some s 0 2 S L : S ! 2AP labels each state with the atomic propositions 2. S0 3. 4. ; that hold on it. Programs as Kripke structures 1 int x = 0; 2 3 4 5 6 void thread (){ int v = x ; x = v + 1; } 7 8 9 10 11 12 13 14 void main (){ fork ( thread ); int u = x ; x = u + 1; join ( thread ); assert ( x == 2); } Synchronous circuits as Kripke structures v00 = v10 = v20 = :v0 v0 v1 (v0 ^ v1 ) v2 (1) (2) (3) Synchronous circuits as Kripke structures v00 = v10 = v20 = :v0 v0 v1 (v0 ^ v1 ) v2 (1) (2) (3) Asynchronous circuits handled using a disjunctive R instead of a conjunctive one like for synchronous circuits. Temporal Logics I Temporal logics are formalisms to describe sequences of transitions I Time is not mentioned explicitly (in today’s lecture) I Instead, temporal operators are used to express that certain states are: I never reached I eventually reached I more complex combinations of those Computation Tree Logic (CTL) Computation trees are obtained by unwinding the Kripke structure Computation Tree Logic (CTL) M ; s0 j= EF g M ; s0 j= AF g M ; s0 j= EG g M ; s0 j= AG g Listeners in JPF1 1 From Mehlitz’s HCSS 2010 presentation Outline Overview Model checking Symbolic execution Testing I I I I Most common form of software validation Explores only one possible execution at a time For each new value, run a new test. On a 32 bit machine, if(i==2014) bug() would require 232 different values to make sure there is no bug. I The idea in symbolic testing is to associate symbolic values to the variables Symbolic Testing I Main idea by JC. King in “Symbolic Execution and Program Testing” in the 70s I Use symbolic values instead of concrete ones I Along the path, maintain a Path Constraint (PC ) and a symbolic state (Σ) I I I I I PC collects constraints on variables’ values along a path, Σ associates variables to symbolic expressions, We get concrete values if PC is satisfiable The program can be run on these values Negate a condition in the path constraint to get another path Symbolic Execution: a simple example I Can we get to the ERROR? explore using SSA forms. I Useful to check array out of bounds, assertion violations, etc. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 foo ( int x ,y , z ){ PC1 = true x = y - z; PC2 = PC1 if ( x == z ){ PC3 = PC2 z = z - 3; PC4 = PC3 if (4* z < x + y ){ PC5 = PC4 if (25 > x + y ) { PC6 = PC5 ... } else { ERROR ; PC10 = PC6 } } } ... ^ x1 = y0 ^ x1 = z0 ^ z1 = z0 ^ 4 z1 PC = (x1 = y0 z0 ^ x1 = z0 ^ z1 = z0 3 < z0 3 x1 + y0 ^ 25 x1 + y0 ^ 4 z1 < x1 + y0 Check satisfiability with an SMT solver (e.g., http://rise4fun.com/Z3) x x x x x 7! x0 7 x1 ! 7 x1 ! 7! x1 7! x1 x 7! x1 y ! 7 y0 z ! 7 z1 y y ;y ;y ;y ; ; 7! y0 7 y0 ! 7 y0 ! 7! y0 7! y0 ; ^ 25 x1 + y0 ) z z ;z ;z ;z ; ; ; 7! z0 7 z0 ! 7 z0 ! 7! z1 7! z1 Symbolic execution today I Leverages on the impressive advancements for SMT solvers I Modern symbolic execution frameworks are not purely symbolic, and not necessarily static: I They can follow a concrete execution while collecting constraints along the way, or I They can treat some of the variables concretely, and some other symbolically I This allows them to scale, to handle closed code or complex queries