A Tutorial on Automated Verification
Tevfik Bultan

Who are these people and what do they have in common?
2007 Clarke, Edmund M.
2007 Emerson, E. Allen
2007 Sifakis, Joseph
1996 Pnueli, Amir
1991 Milner, Robin
1980 Hoare, C. Antony R.
1978 Floyd, Robert W.
1972 Dijkstra, E. W.

Outline
• Software’s Chronic Crisis
• Temporal Logics and Model Checking Problem
• Symbolic Model Checking
• Automata Theoretic Model Checking
• Software Verification Using Explicit State Model Checking with Java Path Finder
• Bounded Model Checking
• Symbolic Software Model Checking with Predicate Abstraction and Counter-Example Guided Abstraction Refinement

Software’s Chronic Crisis
Large software systems often:
• Do not provide the desired functionality
• Take too long to build
• Cost too much to build
• Require too many resources (time, space) to run
• Cannot evolve to meet changing needs
  – For every 6 large software projects that become operational, 2 others are canceled
  – On average, software development projects overshoot their schedule by half
  – Three quarters of the large systems do not provide the required functionality

Software Failures
• There is a long list of failed software projects and software failures
• You can find a list of famous software bugs at: http://www5.in.tum.de/~huckle/bugse.html
• I will talk about two famous and interesting software bugs

Ariane 5 Failure
• A software bug caused the European Space Agency’s Ariane 5 rocket to crash 40 seconds into its first flight in 1996 (cost: half a billion dollars)
• The bug was in a software component that was being reused from Ariane 4
• A software exception occurred during execution of a data conversion from a 64-bit floating point value to a 16-bit signed integer value
  – The value was larger than 32,767, the largest integer storable in a 16-bit signed integer, and thus the conversion failed and an exception was raised by the program
• When the primary computer system failed due to this problem, the secondary system started running.
  – The secondary system was running the same software, so it failed too!

Ariane 5 Failure
• The programmers for Ariane 4 had decided that this particular velocity figure would never be large enough to raise this exception.
  – Ariane 5 was a faster rocket than Ariane 4!
• The calculation containing the bug actually served no purpose once the rocket was in the air.
  – Engineers chose long ago, in an earlier version of the Ariane rocket, to leave this function running for the first 40 seconds of flight to make it easy to restart the system in the event of a brief hold in the countdown.
• You can read the report of the Ariane 5 failure at: http://www.ima.umn.edu/~arnold/disasters/ariane5rep.html

Mars Pathfinder
• A few days into its mission, NASA’s Mars Pathfinder computer system started rebooting itself
  – Cause: Priority inversion during preemptive priority scheduling of threads
• Priority inversion occurs when
  – a thread that has higher priority is waiting for a resource held by a thread with a lower priority
• Pathfinder contained a data bus shared among multiple threads and protected by a mutex lock
• Two threads that accessed the data bus were: a high-priority bus management thread and a low-priority meteorological data gathering thread
• Yet another thread, with medium priority, was a long-running communications thread (which did not access the data bus)

Mars Pathfinder
• The scenario that caused the reboot was:
  – The meteorological data gathering thread accesses the bus and obtains the mutex lock
  – While the meteorological data gathering thread is accessing the bus, an interrupt causes the high-priority bus management thread to be scheduled
  – The bus management thread tries to access the bus and blocks on the mutex lock
  – The scheduler starts running the meteorological thread again
  – Before the meteorological thread finishes its task, yet another interrupt occurs and the medium-priority (and long-running) communications thread gets scheduled
  – At this point high-priority bus
management thread is waiting for the low-priority meteorological data gathering thread, and the low-priority meteorological data gathering thread is waiting for the medium-priority communications thread
  – Since the communications thread had long-running tasks, after a while a watchdog timer would go off, notice that the high-priority bus management thread had not been executed for some time, conclude that something was wrong, and reboot the system

Software’s Chronic Crisis
• Software product size is increasing exponentially
  – faster, smaller, cheaper hardware
• Software is everywhere: from TV sets to cell phones
• Software is in safety-critical systems
  – cars, airplanes, nuclear power plants
• We are seeing more of
  – distributed systems
  – embedded systems
  – real-time systems
• These kinds of systems are harder to build
• Software requirements change
  – software evolves rather than being built

Summary
• Software’s chronic crisis: Development of large software systems is a challenging task
  – Large software systems often: do not provide the desired functionality; take too long to build; cost too much to build; require too many resources (time, space) to run; cannot evolve to meet changing needs

What is this?

First Computer Bug
• In 1947, Grace Murray Hopper was working on the Harvard University Mark II Aiken Relay Calculator (a primitive computer).
• On the 9th of September, 1947, when the machine was experiencing problems, an investigation showed that there was a moth trapped between the points of Relay #70, in Panel F.
• The operators removed the moth and affixed it to the log. The entry reads: "First actual case of bug being found."
• The word went out that they had "debugged" the machine, and the term "debugging a computer program" was born.

Can Model Checking Help
• The question is: Can the automated verification techniques we have been discussing be used in finding bugs in software systems?
• Today I will discuss some automated verification techniques that have been successful in identifying bugs

State of the art in automated verification: Model Checking
• What is model checking?
  – Automated verification technique
  – Focuses on bug finding rather than proving correctness
  – The basic idea is to exhaustively search for bugs in software
  – Has many flavors
    • Explicit-state model checking
    • Symbolic model checking
    • Bounded model checking

Model Checking Evolution
• Earlier model checkers had their own input specification languages
  – For example Spin, SMV
• This requires translating the system to be verified into the input language of the model checker
  – Most of the time these translations are not automated and use ad hoc simplifications and abstractions
• More recently, several researchers developed tools for model checking programs
  – These model checkers work directly on programs, i.e., their input language is a programming language
  – These model checkers use well-defined techniques for restricting the state space or use automated abstraction techniques

Explicit-State Model Checking Programs
• Verisoft from Bell Labs
  – C programs, handles concurrency, bounded search, bounded recursion.
  – Uses stateless search and partial order reduction.
• Java Path Finder (JPF) at NASA Ames
  – Explicit state model checking for Java programs, bounded search, bounded recursion, handles concurrency.
  – Uses techniques similar to the techniques used in Spin.
• CMC from Stanford for checking systems code written in C

Symbolic Model Checking of Programs
• CBMC
  – This is the bounded model checker we discussed earlier; it bounds the loop iterations and recursion depth.
  – Uses a SAT solver.
• SLAM project at Microsoft Research
  – Symbolic model checking for C programs. Can handle unbounded recursion but does not handle concurrency.
  – Uses predicate abstraction and BDDs.
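
To make the idea of exhaustively searching for bugs concrete, here is a minimal Python sketch of an explicit-state search with an optional depth bound. It is not how Spin, JPF, or CBMC are implemented (CBMC, for instance, hands the bounded problem to a SAT solver instead of enumerating states). The function name find_violation and the toy transition system are hypothetical and chosen only for illustration.

```python
from collections import deque


def find_violation(initial_states, successors, is_bad, max_depth=None):
    """Breadth-first search for a reachable bad state.

    Returns a shortest trace (list of states) ending in a bad state, or
    None if no bad state is found within max_depth transitions
    (max_depth=None means the search is unbounded).
    """
    frontier = deque((s, (s,)) for s in initial_states)
    visited = set(initial_states)
    while frontier:
        state, path = frontier.popleft()
        if is_bad(state):
            return list(path)            # counter-example trace
        if max_depth is not None and len(path) - 1 >= max_depth:
            continue                     # bound reached: do not expand further
        for nxt in successors(state):
            if nxt not in visited:       # explore each state only once
                visited.add(nxt)
                frontier.append((nxt, path + (nxt,)))
    return None


# Hypothetical toy system: the state is an integer in 0..9, each step either
# adds 3 or doubles the value (mod 10), and the "bug" is reaching the value 7.
def toy_successors(x):
    return [(x + 3) % 10, (x * 2) % 10]


def toy_is_bad(x):
    return x == 7


trace = find_violation([0], toy_successors, toy_is_bad)
bounded = find_violation([0], toy_successors, toy_is_bad, max_depth=3)
```

With no bound the search returns a shortest trace to the bad state; with a bound that is too small it returns None, which says only that no bug exists within that many steps, not that the system is correct.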

Beyond Model Checking
• Promising results obtained in the model checking area created a new interest in automated verification
• Nowadays, there is a wide spectrum of verification/analysis/testing techniques with varying levels of power and scalability
  – Bounded verification using SAT solvers
  – Symbolic execution using combinations of decision procedures
  – Dynamic symbolic execution (aka concolic execution)
  – Various types of symbolic analysis: shape analysis, string analysis, size analysis, etc.

What to Verify?
• Before we start talking about automated verification techniques, we need to identify what we want to verify
• It turns out that this is not a very simple question
• First we will discuss issues related to this question

Temporal Logics and Model Checking Problem

A Mutual Exclusion Protocol
Two concurrently executing processes are trying to enter a critical section without violating mutual exclusion

  Process 1:
  while (true) {
    out:  a := true; turn := true;
    wait: await (!b or !turn);
    cs:   a := false;
  }
  ||
  Process 2:
  while (true) {
    out:  b := true; turn := false;
    wait: await (!a or turn);
    cs:   b := false;
  }

Reactive Systems: A Very Simple Model
• We will use a very simple model for reactive systems
• A reactive system generates a set of execution paths
• An execution path is a sequence of the states (configurations) of the system, starting from some initial state
• There is a transition relation which specifies the next-state relation, i.e., given a state, which states can follow that state

State Space
• The state space of a program can be captured by the valuations of the variables and the program counters
• For our example, we have
  – two program counters: pc1, pc2
    domains of the program counters: {out, wait, cs}
  – three boolean variables: turn, a, b
    boolean domain: {True, False}
• Each state of the program is a valuation of all the variables

State Space
• Each state can be written as a tuple (pc1,pc2,turn,a,b)
• Initial states: {(o,o,F,F,F),
(o,o,F,F,T), (o,o,F,T,F), (o,o,F,T,T), (o,o,T,F,F), (o,o,T,F,T), (o,o,T,T,F), (o,o,T,T,T)}
  – initially: pc1=o and pc2=o
• How many states total? 3 * 3 * 2 * 2 * 2 = 72
  exponential in the number of variables and the number of concurrent components

Transition Relation
• The transition relation specifies the next-state relation, i.e., given a state, which states can come immediately after that state
• For example, given the initial state (o,o,F,F,F)
  Process 1 can execute: out: a := true; turn := true;
  or Process 2 can execute: out: b := true; turn := false;
• If process 1 executes, the next state is (w,o,T,T,F)
• If process 2 executes, the next state is (o,w,F,F,T)
• So the state pairs ((o,o,F,F,F),(w,o,T,T,F)) and ((o,o,F,F,F),(o,w,F,F,T)) are included in the transition relation

Transition Relation
The transition relation is like a graph; edges represent the next-state relation
[Figure: fragment of the transition graph over the states (o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T), (w,o,T,T,F), (w,w,T,T,T)]

Transition System
• A transition system T = (S, I, R) consists of
  – a set of states S
  – a set of initial states I ⊆ S
  – and a transition relation R ⊆ S × S
• A common assumption in model checking
  – R is total, i.e., for all s ∈ S, there exists s’ such that (s,s’) ∈ R

Execution Paths
• An execution path is an infinite sequence of states x = s_0, s_1, s_2, ... such that s_0 ∈ I and for all i ≥ 0, (s_i, s_{i+1}) ∈ R
Notation: For any path x
  x_i denotes the i’th state on the path (i.e., s_i)
  x^i denotes the i’th suffix of the path (i.e., s_i, s_{i+1}, s_{i+2}, ...
)

Execution Paths
A possible execution path: ((o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T))^ω
(^ω means repeat the above three states infinitely many times)
[Figure: the same fragment of the transition graph, over the states (o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T), (w,o,T,T,F), (w,w,T,T,T)]

Temporal Logics
• Pnueli proposed using temporal logics for reasoning about the properties of reactive systems
• Temporal logics are a type of modal logic
  – Modal logics were developed to express modalities such as “necessity” or “possibility”
  – Temporal logics focus on the modality of temporal progression
• Temporal logics can be used to express, for example, that:
  – an assertion is an invariant (i.e., it is true all the time)
  – an assertion eventually becomes true (i.e., it will become true sometime in the future)

Temporal Logics
• We will assume that there is a set of basic (atomic) properties called AP
  – These are used to write the basic (non-temporal) assertions about the program
  – Examples: a=true, pc0=c, x=y+1
• We will use the usual boolean connectives: ¬, ∧, ∨
• We will also use four temporal operators:
  Invariant p  : G p (aka □p) (Globally)
  Eventually p : F p (aka ◇p) (Future)
  Next p       : X p (aka ○p) (neXt)
  p Until q    : p U q

LTL Properties
[Figure: timelines illustrating X p (p holds in the next state), G p (p holds in every state), F p (p holds in some future state), and p U q (p holds in every state until a state where q holds)]

Example Properties
mutual exclusion: G(¬(pc1=c ∧ pc2=c))
starvation freedom: G(pc1=w ⇒ F(pc1=c)) ∧ G(pc2=w ⇒ F(pc2=c))

Given the execution path:
  x = ((o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T))^ω
  x |= pc1=o
  x |= X (pc2=w)
  x |= F (pc2=c)
  x |= (¬turn) U (pc2=c ∧ b)
  x |= G(¬(pc1=c ∧ pc2=c))
  x |= G(pc1=w ⇒ F(pc1=c)) ∧ G(pc2=w ⇒ F(pc2=c))

LTL Model Checking
• Given a transition system T and an LTL property p
  T |= p iff for all execution paths x in T, x |= p
For example:
  T |=? G(¬(pc1=c ∧ pc2=c))
  T |=? G(pc1=w ⇒ F(pc1=c)) ∧ G(pc2=w ⇒ F(pc2=c))
Model checking problem: Given a transition system T and an LTL property p, determine if T is a model for p (i.e., if T |= p)
Complexity: (|S|+|R|) × 2^O(|f|), where |f| is the size of the formula

Linear Time vs.
Branching Time
• In linear time logics we look at the execution paths individually
• In branching time logics we view the computation as a tree
  – computation tree: unroll the transition relation
[Figure: a transition system over states s1, s2, s3, s4, its execution paths, and the corresponding computation tree]

Computation Tree Logic (CTL)
• In CTL we quantify over the paths in the computation tree
• We use the same four temporal operators: X, G, F, U
• However, we attach path quantifiers to these temporal operators:
  – A : for all paths
  – E : there exists a path
• We end up with eight temporal operators:
  – AX, EX, AG, EG, AF, EF, AU, EU

CTL Properties
[Figure: a four-state transition system labeled with p, its computation tree rooted at s3, and the truth of EX, AX, EG, AF, and EF formulas over p at s3]

CTL Model Checking
• Given a transition system T = (S, I, R) and a CTL property p
  T |= p iff for all initial states s ∈ I, s |= p
Model checking problem: Given a transition system T and a CTL property p, determine if T is a model for p (i.e., if T |= p)
Complexity: O(|f| × (|S|+|R|))
For example:
  T |=? AG(¬(pc1=c ∧ pc2=c))
  T |=? AG(pc1=w ⇒ AF(pc1=c)) ∧ AG(pc2=w ⇒ AF(pc2=c))
• Question: Are CTL and LTL equivalent?

CTL vs.
LTL
• CTL and LTL are not equivalent
  – There are properties that can be expressed in LTL but cannot be expressed in CTL
    • For example: FG p
  – There are properties that can be expressed in CTL but cannot be expressed in LTL
    • For example: AG(EF p)
• Hence, the expressive powers of CTL and LTL are not comparable

Symbolic Model Checking

Temporal Properties as Fixpoints [Emerson and Clarke 80]
Here are some interesting CTL equivalences:
  AG p = p ∧ AX AG p
  EG p = p ∧ EX EG p
  AF p = p ∨ AX AF p
  EF p = p ∨ EX EF p
  p AU q = q ∨ (p ∧ AX (p AU q))
  p EU q = q ∨ (p ∧ EX (p EU q))
Note that we wrote the CTL temporal operators in terms of themselves and the EX and AX operators

Fixpoint Characterizations
  Fixpoint Characterization            Equivalence
  AG p = νy . p ∧ AX y                 AG p = p ∧ AX AG p
  EG p = νy . p ∧ EX y                 EG p = p ∧ EX EG p
  AF p = μy . p ∨ AX y                 AF p = p ∨ AX AF p
  EF p = μy . p ∨ EX y                 EF p = p ∨ EX EF p
  p AU q = μy . q ∨ (p ∧ AX (y))       p AU q = q ∨ (p ∧ AX (p AU q))
  p EU q = μy . q ∨ (p ∧ EX (y))       p EU q = q ∨ (p ∧ EX (p EU q))

Least Fixpoint
Given a monotonic function F, the least fixpoint μy . F y is the limit of the following sequence (assuming F is ∪-continuous):
  ∅, F ∅, F^2 ∅, F^3 ∅, ...
If S is finite, then we can compute the least fixpoint using the above sequence

EF Fixpoint Computation
EF p = μy . p ∨ EX y is the limit of the sequence:
  ∅, p ∨ EX ∅, p ∨ EX(p ∨ EX ∅), p ∨ EX(p ∨ EX(p ∨ EX ∅)), ...
which is equivalent to
  ∅, p, p ∨ EX p, p ∨ EX (p ∨ EX (p)), ...

EF Fixpoint Computation
[Figure: four-state transition system s1, s2, s3, s4 in which p holds in s1 and s4]
  Start:         ∅
  1st iteration: p ∨ EX ∅ = {s1,s4} ∪ EX(∅) = {s1,s4} ∪ ∅ = {s1,s4}
  2nd iteration: p ∨ EX(p ∨ EX ∅) = {s1,s4} ∪ EX({s1,s4}) = {s1,s4} ∪ {s3} = {s1,s3,s4}
  3rd iteration: p ∨ EX(p ∨ EX(p ∨ EX ∅)) = {s1,s4} ∪ EX({s1,s3,s4}) = {s1,s4} ∪ {s2,s3,s4} = {s1,s2,s3,s4}
  4th iteration: p ∨ EX(p ∨ EX(p ∨ EX(p ∨ EX ∅))) = {s1,s4} ∪ EX({s1,s2,s3,s4}) = {s1,s4} ∪ {s1,s2,s3,s4} = {s1,s2,s3,s4}

EF Fixpoint Computation
EF(p): states that can reach p
[Figure: growing sets p, p ∪ EX(p), p ∪ EX(p) ∪ EX(EX(p)), ... converging to EF(p)]

Greatest Fixpoint
Given a monotonic function F, the greatest fixpoint νy . F y is the limit of the following sequence (assuming F is ∩-continuous):
  S, F S, F^2 S, F^3 S, ...
If S is finite, then we can compute the greatest fixpoint using the above sequence

EG Fixpoint Computation
Similarly, EG p = νy . p ∧ EX y is the limit of the sequence:
  S, p ∧ EX S, p ∧ EX(p ∧ EX S), p ∧ EX(p ∧ EX (p ∧ EX S)), ...
which is equivalent to
  S, p, p ∧ EX p, p ∧ EX (p ∧ EX (p)), ...

EG Fixpoint Computation
[Figure: four-state transition system s1, s2, s3, s4 in which p holds in s1, s3, and s4]
  Start:         S = {s1,s2,s3,s4}
  1st iteration: p ∧ EX S = {s1,s3,s4} ∩ EX({s1,s2,s3,s4}) = {s1,s3,s4} ∩ {s1,s2,s3,s4} = {s1,s3,s4}
  2nd iteration: p ∧ EX(p ∧ EX S) = {s1,s3,s4} ∩ EX({s1,s3,s4}) = {s1,s3,s4} ∩ {s2,s3,s4} = {s3,s4}
  3rd iteration: p ∧ EX(p ∧ EX(p ∧ EX S)) = {s1,s3,s4} ∩ EX({s3,s4}) = {s1,s3,s4} ∩ {s2,s3,s4} = {s3,s4}

EG Fixpoint Computation
EG(p): states that can avoid reaching ¬p
[Figure: shrinking sets p, p ∩ EX(p), p ∩ EX(p) ∩ EX(EX(p)), ... converging to EG(p)]

Symbolic Model Checking [McMillan et al. LICS 90]
• Basic idea: Represent sets of states and the transition relation as Boolean logic formulas
• Fixpoint computation becomes formula manipulation
  – pre-condition (EX) computation: Existential variable elimination
  – conjunction (intersection), disjunction (union) and negation (set difference), and equivalence check
• Use an efficient data structure for boolean logic formulas
  – Binary Decision Diagrams (BDDs)

Symbolic Pre-condition Computation
• Remember the function EX : 2^S → 2^S which is defined as:
  EX(p) = { s | (s,s’) ∈ R and s’ ∈ p }
• We can symbolically compute pre as follows
  EX(p) ≡ ∃V’ R ∧ p[V’ / V]
  – V : current-state boolean variables
  – V’ : next-state boolean variables
  – p[V’ / V] : rename variables in p by replacing current-state variables with the corresponding next-state variables
  – ∃V’ f : existentially quantify out all the variables in V’ from f

An Extremely Simple Example
Variables: x, y: boolean
Set of states: S = {(F,F), (F,T), (T,F), (T,T)}
  S ≡ True
Initial condition: I ≡ ¬x ∧ ¬y
Transition relation (negates one variable at a time):
  R ≡ (x’=¬x ∧ y’=y) ∨ (x’=x ∧ y’=¬y)    (= means ⇔)
[Figure: transition graph over the states (F,F), (T,F), (F,T), (T,T)]

An Extremely Simple Example
Given p ≡ x ∧ y, compute EX(p)
  EX(p) ≡ ∃V’ R ∧ p[V’ / V]
        ≡ ∃V’ R ∧ x’ ∧ y’
        ≡ ∃V’ ((x’=¬x ∧ y’=y) ∨ (x’=x ∧ y’=¬y)) ∧ x’ ∧ y’
        ≡ ∃V’ ((x’=¬x ∧ y’=y) ∧ x’ ∧ y’) ∨ ((x’=x ∧ y’=¬y) ∧ x’ ∧ y’)
        ≡ ∃V’ (¬x ∧ y ∧ x’ ∧ y’) ∨ (x ∧ ¬y ∧ x’ ∧ y’)
        ≡ (¬x ∧ y) ∨ (x ∧ ¬y)
  EX(x ∧ y) ≡ (¬x ∧ y) ∨ (x ∧ ¬y)
In other words EX({(T,T)}) ≡ {(F,T), (T,F)}

An Extremely Simple Example
Let’s compute EF(x ∧ y)
The fixpoint sequence is
  False, x ∧ y, (x ∧ y) ∨ EX(x ∧ y), (x ∧ y) ∨ EX((x ∧ y) ∨ EX(x ∧ y)), ...
If we do the EX computations, we get:
  iteration 0: False
  iteration 1: x ∧ y
  iteration 2: (x ∧ y) ∨ (¬x ∧ y) ∨ (x ∧ ¬y)
  iteration 3: True
  EF(x ∧ y) ≡ True
In other words EF({(T,T)}) ≡ {(F,F), (F,T), (T,F), (T,T)}
[Figure: the transition graph annotated with the iteration (1, 2, or 3) at which each state enters the fixpoint]

An Extremely Simple Example
• Based on our results, for our extremely simple transition system T=(S,I,R) we have
  I ⊆ EF(x ∧ y), hence: T |= EF(x ∧ y)
  (i.e., there exists a path from each initial state where eventually x and y both become true at the same time)
  I ⊄ EX(x ∧ y), hence: T |≠ EX(x ∧ y)
  (i.e., there does not exist a path from each initial state where in the next state x and y both become true)

An Extremely Simple Example
• Let’s try one more property: AF(x ∧ y)
• To check this property we first convert it to a formula which uses only the temporal operators in our basis:
  AF(x ∧ y) ≡ ¬EG(¬(x ∧ y))
  If we can find an initial state which satisfies EG(¬(x ∧ y)), then we know that the transition system T does not satisfy the property AF(x ∧ y)

An Extremely Simple Example
Let’s compute EG(¬(x ∧ y))
The fixpoint sequence is
  True, ¬x ∨ ¬y, (¬x ∨ ¬y) ∧ EX(¬x ∨ ¬y), ...
If we do the EX computations, we get:
  iteration 0: True
  iteration 1: ¬x ∨ ¬y
  iteration 2: ¬x ∨ ¬y
  EG(¬(x ∧ y)) ≡ ¬x ∨ ¬y
Since I ∩ EG(¬(x ∧ y)) ≠ ∅ we conclude that T |≠ AF(x ∧ y)
[Figure: the transition graph annotated with the iteration at which the state (T,T) leaves the fixpoint]

Symbolic CTL Model Checking Algorithm
• Translate the formula to a formula which uses the basis
  – EX p, EG p, p EU q
• Atomic formulas can be interpreted directly on the state representation
• For EX p compute the precondition using existential variable elimination as we discussed
• For EG and EU compute the fixpoints iteratively

SMV [McMillan 93]
• BDD-based symbolic model checker
• Finite state
• Temporal logic: CTL
• Focus: hardware verification
  – Later applied to software specifications, protocols, etc.
• SMV has its own input specification language
  – concurrency: synchronous, asynchronous
  – shared variables
  – boolean and enumerated variables
  – bounded integer variables (binary encoding)
    • SMV is not efficient for integers, but that can be fixed
  – fixed size arrays

SMV Language
• An SMV specification consists of a set of modules (one of them must be called main)
• Modules can have access to shared variables
• Modules can be composed asynchronously using the process keyword
• Module behaviors can be specified using the ASSIGN statement, which assigns values to next values of variables in parallel
• Module behaviors can also be specified using TRANS statements, which allow specification of the transition relation as a logic formula where next-state values are identified using the next keyword

Example Mutual Exclusion Protocol
Two concurrently executing processes are trying to enter a critical section without violating mutual exclusion

  Process 1:
  while (true) {
    out:  a := true; turn := true;
    wait: await (b = false or turn = false);
    cs:   a := false;
  }
  ||
  Process 2:
  while (true) {
    out:  b := true; turn := false;
    wait: await (a = false or turn);
    cs:   b := false;
  }

Example Mutual Exclusion Protocol in SMV

  MODULE process1(a,b,turn)
  VAR
    pc: {out, wait, cs};
  ASSIGN
    init(pc) := out;
    next(pc) :=
      case
        pc=out : wait;
        pc=wait & (!b | !turn) : cs;
        pc=cs : out;
        1 : pc;
      esac;
    next(turn) :=
      case
        pc=out : 1;
        1 : turn;
      esac;
    next(a) :=
      case
        pc=out : 1;
        pc=cs : 0;
        1 : a;
      esac;
    next(b) := b;
  FAIRNESS
    running

  MODULE process2(a,b,turn)
  VAR
    pc: {out, wait, cs};
  ASSIGN
    init(pc) := out;
    next(pc) :=
      case
        pc=out : wait;
        pc=wait & (!a | turn) : cs;
        pc=cs : out;
        1 : pc;
      esac;
    next(turn) :=
      case
        pc=out : 0;
        1 : turn;
      esac;
    next(b) :=
      case
        pc=out : 1;
        pc=cs : 0;
        1 : b;
      esac;
    next(a) := a;
  FAIRNESS
    running

Example Mutual Exclusion Protocol in SMV

  MODULE main
  VAR
    a : boolean;
    b : boolean;
    turn : boolean;
    p1 : process process1(a,b,turn);
    p2 : process process2(a,b,turn);
  SPEC
    AG(!(p1.pc=cs & p2.pc=cs))
    -- AG(p1.pc=wait -> AF(p1.pc=cs)) & AG(p2.pc=wait -> AF(p2.pc=cs))

Here is the output when I run SMV on this example to check the mutual exclusion property

  % smv mutex.smv
  -- specification AG (!(p1.pc = cs & p2.pc = cs)) is true

  resources used:
  user time: 0.01 s, system time: 0 s
  BDD nodes allocated: 692
  Bytes allocated: 1245184
  BDD nodes representing transition relation: 143 + 6

Example Mutual Exclusion Protocol in SMV
The output for the starvation freedom property:

  % smv mutex.smv
  -- specification AG (p1.pc = wait -> AF p1.pc = cs) & AG ... is true

  resources used:
  user time: 0 s, system time: 0 s
  BDD nodes allocated: 1251
  Bytes allocated: 1245184
  BDD nodes representing transition relation: 143 + 6

Example Mutual Exclusion Protocol in SMV
Let’s insert an error: change
  pc=wait & (!b | !turn) : cs;
to
  pc=wait & (!b | turn) : cs;

  % smv mutex.smv
  -- specification AG (!(p1.pc = cs & p2.pc = cs)) is false
  -- as demonstrated by the following execution sequence
  state 1.1:
  a = 0
  b = 0
  turn = 0
  p1.pc = out
  p2.pc = out
  [stuttering]

  state 1.2:
  [executing process p2]

  state 1.3:
  b = 1
  p2.pc = wait
  [executing process p2]

  state 1.4:
  p2.pc = cs
  [executing process p1]

  state 1.5:
  a = 1
  turn = 1
  p1.pc = wait
  [executing process p1]

  state 1.6:
  p1.pc = cs
  [stuttering]

  resources used:
  user time: 0.01 s, system time: 0 s
  BDD nodes allocated: 1878
  Bytes allocated: 1245184
  BDD nodes representing transition relation: 143 + 6

Symbolic Model Checking with BDDs
• BDDs are used as a data structure for encoding truth sets of Boolean logic formulas in symbolic model checking
• One can use BDD-based symbolic model checking for any finite state system using a Boolean encoding of the state space and the transition relation
• Why are we using symbolic model checking?
  – We hope that the symbolic representations will be more compact than the explicit state representation on average
  – In the worst case we may not gain anything

Problems with BDDs
• The BDD for the transition relation could be huge
  – Remember that the BDD could be exponential in the number of disjuncts and conjuncts
  – Since we are using a Boolean encoding, there could be a large number of conjuncts and disjuncts
• The EX computation could result in exponential blow-up
  – Exponential in the number of existentially quantified variables

Heuristics
• Instead of computing a monolithic BDD for the whole transition system, partition the transition relation in order to keep the BDD size small
• Use a good variable ordering in order to keep the BDD sizes small
  – Use heuristics to find good variable orderings
  – Use dynamic variable ordering heuristics that change the variable ordering dynamically if the BDD size grows too much
• Use other data structures (such as multi-terminal decision diagrams)

Counter-Example Generation
• Remember: Given a transition system T = (S, I, R) and a CTL property p
  T |= p iff for all initial states s ∈ I, s |= p
• Verification vs. Falsification
  – Verification:
    • Show: initial states ⊆ truth set of p
  – Falsification:
    • Find: a state ∈ initial states ∩ truth set of ¬p
    • Generate a counter-example starting from that state
• The ability to find counter-examples is one of the biggest strengths of model checkers

An Example
• We want to check the property AG(p)
• We compute the fixpoint for EF(¬p)
• We check if the intersection of the set of initial states I and the truth set of EF(¬p) is empty
  – If it is not empty we generate a counter-example path starting from the intersection
• In order to generate the counter-example path, save the fixpoint iterations.
• After the fixpoint computation converges, do a second pass to generate the counter-example path.
EF(¬p): states that can reach ¬p
[Figure: the fixpoint iterations ¬p, EX(¬p), EX(EX(¬p)), ... converging to EF(¬p), overlapping the set of initial states I; generate a counter-example path starting from a state in the intersection]

Automata Theoretic Model Checking

LTL Properties as Büchi Automata [Vardi and Wolper LICS 86]
• Büchi automata: Finite state automata that accept infinite strings
  – The better known variant of finite state automata accepts finite strings (used in lexical analysis, for example)
• A Büchi automaton accepts a string when the corresponding run visits an accepting state infinitely often
  – Note that an infinite run never ends, so we cannot say that an accepting run ends at an accepting state
• LTL properties can be translated to Büchi automata
  – The automaton accepts a path if and only if the path satisfies the corresponding LTL property

LTL Properties as Büchi Automata
[Figure: Büchi automata for G p, F p, and G (F p)]
The size of the property automaton can be exponential in the size of the LTL formula (recall the complexity of LTL model checking)

Büchi Automata: Language Emptiness Check
• Given a Büchi automaton, one interesting question is:
  – Is the language accepted by the automaton empty?
    • i.e., does it accept any string?
• A Büchi automaton accepts a string when the corresponding run visits an accepting state infinitely often
• To check emptiness:
  – Look for a cycle which contains an accepting state and is reachable from the initial state
    • Find a strongly connected component that contains an accepting state and is reachable from the initial state
  – If no such cycle can be found, the language accepted by the automaton is empty

LTL Model Checking
• Generate the property automaton from the negated LTL property
• Generate the product of the property automaton and the transition system
• Show that there is no accepting cycle in the product automaton (check language emptiness)
  – i.e., show that the intersection of the paths generated by the transition system and the paths accepted by the (negated) property automaton is empty
• If there is a cycle, it corresponds to a counter-example behavior that demonstrates the bug

LTL Model Checking Example
Property to be verified: G q
Negation of the property: ¬G q ≡ F ¬q
[Figure: example transition system with states 1 (labeled p,q), 2 (labeled q), and 3 (labeled p), where each state is labeled with the propositions that hold in that state; and the two-state property automaton for the negated property, with edges labeled either by formulas or, equivalently, by sets of propositions]

Transition System to Büchi Automaton Translation
[Figure: the example transition system (each state is labeled with the propositions that hold in that state) and the corresponding Büchi automaton, which has an added initial state i and edges labeled with the proposition sets {p,q}, {q}, {p}]

Product Automaton
[Figure: the Büchi automaton for the transition system (every state is accepting), the property automaton, and their product automaton with states (1,1), (2,1), (3,1), (3,2), (4,2)]
Accepting cycle: (1,1), (2,1), (3,1), ((4,2), (3,2))^ω
Corresponds to a counter-example path for the property G q

SPIN [Holzmann 91, TSE 97]
• Explicit state model checker
• Finite state
• Temporal logic: LTL
• Input language: PROMELA
  – Asynchronous processes
  – Shared variables
  – Message passing through (bounded) communication channels
  – Variables:
boolean, char, integer (bounded), arrays (fixed size)
  – Structured data types

Verification in SPIN
• Uses the LTL model checking approach
• Constructs the product automaton on-the-fly
  – It is possible to find an accepting cycle (i.e., a counter-example) without constructing the whole state space
• Uses a nested depth-first search algorithm to look for an accepting cycle
• Uses various heuristics to improve the efficiency of the nested depth-first search:
  – partial order reduction
  – state compression

Example Mutual Exclusion Protocol
Two concurrently executing processes are trying to enter a critical section without violating mutual exclusion

  Process 1:
  while (true) {
    out:  a := true; turn := true;
    wait: await (b = false or turn = false);
    cs:   a := false;
  }
  ||
  Process 2:
  while (true) {
    out:  b := true; turn := false;
    wait: await (a = false or turn);
    cs:   b := false;
  }

Example Mutual Exclusion Protocol in Promela

  #define cs1 process1@cs
  #define cs2 process2@cs
  #define wait1 process1@wait
  #define wait2 process2@wait
  #define true 1
  #define false 0

  bool a;
  bool b;
  bool turn;

  proctype process1()
  {
    out:  a = true; turn = true;
    wait: (b == false || turn == false);
    cs:   a = false; goto out;
  }

  proctype process2()
  {
    out:  b = true; turn = false;
    wait: (a == false || turn == true);
    cs:   b = false; goto out;
  }

  init {
    run process1(); run process2()
  }

Property automaton generation
• In the input formula, “[]” means G and “<>” means F
• The “spin -f” option generates a Büchi automaton for the input LTL formula

  % spin -f "! [] (! (cs1 && cs2))"
  never { /* ! [] (! (cs1 && cs2)) */
  T0_init:
    if
    :: ((cs1) && (cs2)) -> goto accept_all
    :: (1) -> goto T0_init
    fi;
  accept_all:
    skip
  }

  % spin -f "!([](wait1 -> <>(cs1)))"
  never { /* !([](wait1 -> <>(cs1))) */
  T0_init:
    if
    :: ( !((cs1)) && (wait1) ) -> goto accept_S4
    :: (1) -> goto T0_init
    fi;
  accept_S4:
    if
    :: (! ((cs1))) -> goto accept_S4
    fi;
  }

Concatenate the generated never claims to the end of the specification file

SPIN
• “spin -a mutex.spin” generates a C program “pan.c” from the specification file
  – This C program implements the on-the-fly nested depth-first search algorithm
  – You compile “pan.c” and run it to perform the model checking
• Spin generates a counter-example trace if it finds out that a property is violated

  % mutex -a
  warning: for p.o. reduction to be valid the never claim must be stutter-invariant
  (never claims generated from LTL formulae are stutter-invariant)
  (Spin Version 4.2.6 -- 27 October 2005)
          + Partial Order Reduction

  Full statespace search for:
          never claim             +
          assertion violations    + (if within scope of claim)
          acceptance   cycles     + (fairness disabled)
          invalid end states      - (disabled by never claim)

  State-vector 28 byte, depth reached 33, errors: 0
        22 states, stored
        15 states, matched
        37 transitions (= stored+matched)
         0 atomic steps
  hash conflicts: 0 (resolved)

  2.622   memory usage (Mbyte)

  unreached in proctype process1
          line 18, state 6, "-end-"
          (1 of 6 states)
  unreached in proctype process2
          line 27, state 6, "-end-"
          (1 of 6 states)
  unreached in proctype :init:
          (0 of 3 states)

Problems/Heuristics for explicit state model checking
• State space explosion: The number of states can be exponential in the number of variables and concurrent components
• Heuristics used by Spin:
  – On-the-fly checking: use a depth-first search that computes the product of the property automaton and the transition relation while looking for a violation
  – Bit-state hashing
    • do not store the full state information
    • might skip some unvisited states, so it is not sound (errors can be missed)
  – Partial order reduction
    • only explore a representative subset of interleavings among the concurrent processes
    • this can be done in a sound manner

Software Verification Using Explicit State Model Checking with Java Path Finder

Java Path Finder
• Program checker for Java
• Properties to be verified
  – Properties can be specified
as assertions
    • static checking of assertions
  – It can also verify LTL properties
• Implements both depth-first and breadth-first search and looks for assertion violations statically
• Uses static analysis techniques to improve the efficiency of the search
• Requires a complete Java program
• It can only handle pure Java; it cannot handle native code

Java Path Finder, First Version
• First version
  – A translator from Java to PROMELA
  – Use SPIN for model checking
• Since SPIN cannot handle unbounded data
  – Restrict the program to finite domains
    • A fixed number of objects from each class
    • Fixed bounds for array sizes
• Does not scale well if these fixed bounds are increased
• Java source code is required for translation

Java Path Finder, Current Version
• Current version of JPF has its own virtual machine: JPF-JVM
  – Executes Java bytecode
    • can handle pure Java but cannot handle native code
  – Has its own garbage collection
  – Stores the visited states and the current path
  – Offers some methods to the user to optimize verification
• Traversal algorithm
  – Traverses the state graph of the program
  – Tells JPF-JVM to move forward or backward in the state space, and to evaluate the assertion
• The rest of the slides are on the current version of JPF:
  W. Visser, K. Havelund, G. Brat, S. Park and F. Lerda. "Model Checking Programs." Automated Software Engineering Journal Volume 10, Number 2, April 2003.
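The traversal described above (move forward, move backward, evaluate the assertion, remember visited states) can be illustrated with a small explicit-state depth-first search. This is only a sketch in Python, not JPF's API: the names dfs_check, successors and holds are hypothetical, and the toy model (two threads that each perform a non-atomic increment of a shared counter) is an assumption chosen for illustration.

```python
def dfs_check(initial, successors, holds):
    """Depth-first search over a state graph.

    Returns the path to the first state that violates `holds`,
    or None if every reachable state satisfies it.
    """
    visited = set()   # states already explored (stored, as in a stateful search)
    path = []         # current path, used as the counter-example trace

    def visit(state):
        if state in visited:
            return None
        visited.add(state)
        path.append(state)              # move forward
        if not holds(state):            # evaluate the assertion
            return list(path)
        for nxt in successors(state):
            trace = visit(nxt)
            if trace is not None:
                return trace
        path.pop()                      # move backward (backtrack)
        return None

    return visit(initial)


# Toy model (assumed): state = (pcs, tmps, x).
# Each thread runs: pc 0: tmp = x;  pc 1: x = tmp + 1;  pc 2: done.
def successors(state):
    pcs, tmps, x = state
    for i in (0, 1):
        if pcs[i] == 0:                                   # read shared x
            yield (pcs[:i] + (1,) + pcs[i + 1:],
                   tmps[:i] + (x,) + tmps[i + 1:], x)
        elif pcs[i] == 1:                                 # write x = tmp + 1
            yield (pcs[:i] + (2,) + pcs[i + 1:], tmps, tmps[i] + 1)


def holds(state):
    pcs, _, x = state
    return pcs != (2, 2) or x == 2    # when both threads finish, x must be 2


initial = ((0, 0), (0, 0), 0)
trace = dfs_check(initial, successors, holds)
```

The search returns an interleaving in which both threads read x before either one writes it, so the final value is 1 and the assertion fails. A real tool stores states far more compactly than a Python set of tuples.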
Storing the States
• JPF implements a depth-first search on the state space of the given Java program
  – To do depth-first search we need to store the visited states
    • There are also verification tools which use stateless search, such as Verisoft
• The state of the program consists of
  – information for each thread in the Java program
    • a stack of frames, one for each method called
  – the static variables in classes
    • locks and fields for the classes
  – the dynamic variables (fields) in objects
    • locks and fields for the objects

Storing States Efficiently
• Since different states can have common parts, each state is divided into a set of components which are stored separately
  – locks, frames, fields
• Keep a pool for each component
  – A table of field values, lock values, frame values
• Instead of storing the value of a component in a state, store in the state the index at which the component is stored in the table
  – The whole state becomes an integer vector
• JPF collapses states to integer vectors using this idea

State Space Explosion
• State space explosion is one of the major challenges in model checking
• The idea is to reduce the number of states that have to be visited during state space exploration
• Here are some approaches used to attack state space explosion
  – Symmetry reduction
    • search equivalent states only once
  – Partial order reduction
    • do not search thread interleavings that generate equivalent behavior
  – Abstraction
    • abstract parts of the state to reduce the size of the state space

Symmetry Reduction
• Some states of the program may be equivalent
  – Equivalent states should be searched only once
• Some states may differ only in their memory layout, the order objects are created, etc.
  – these may not have any effect on the behavior of the program
• JPF makes sure that the order in which the classes are loaded does not affect the state
  – There is a canonical ordering of the classes in the memory

Symmetry Reduction
• A similar problem occurs for the location of dynamically allocated objects in the heap
  – If we store the memory location as the state, then we can miss equivalent states which have different memory layouts
  – JPF tries to remedy this problem by storing some information about the new statements that create an object and the number of times they are executed

Partial Order Reduction
• Statements of concurrently executing threads can generate many different interleavings
  – all these different interleavings are allowable behavior of the program
• A model checker has to check that the behavior of the program is correct for all possible interleavings
  – However, different interleavings may generate equivalent behaviors
• In such cases it is sufficient to check just one interleaving without exhausting all the possibilities
  – This is called partial order reduction

state space search generates 258 states
with symmetry reduction: 105 states
with partial order reduction: 68 states
with symmetry reduction + partial order reduction: 38 states

  class S1 { int x; }
  class FirstTask extends Thread {
    public void run() {
      S1 s1; int x = 1;
      s1 = new S1();
      x = 3;
    }
  }
  class S2 { int y; }
  class SecondTask extends Thread {
    public void run() {
      S2 s2; int x = 1;
      s2 = new S2();
      x = 3;
    }
  }
  class Main {
    public static void main(String[] args) {
      FirstTask task1 = new FirstTask();
      SecondTask task2 = new SecondTask();
      task1.start();
      task2.start();
    }
  }

Static Analysis
• JPF uses the following static analysis techniques for reducing the state space:
  – slicing
  – partial evaluation
• Given a slicing criterion, slicing reduces the size of a program by removing the parts of the program that have no effect on the slicing criterion
  – A slicing criterion could be a program point
  – Program
slices are computed using dependency analysis
• Partial evaluation propagates constant values and simplifies expressions

Abstraction vs. Restriction
• JPF also uses abstraction techniques such as predicate abstraction to reduce the state space
• Still, in order to check a program with JPF, typically, you need to restrict the domains of the variables, the sizes of the arrays, etc.
• Abstraction over-approximates the program behavior
  – causes spurious counter-examples
• Restriction under-approximates the program behavior
  – may result in missed errors
• If both under- and over-approximation techniques are used, then the resulting verification technique is neither sound nor complete
  – However, it is still useful as a debugging tool and it is helpful in finding bugs

JPF Java Modeling Primitives
• Atomicity (used to reduce the state space)
  – beginAtomic(), endAtomic()
• Nondeterminism (used to model non-determinism caused by abstraction)
  – int random(int); boolean randomBool(); Object randomObject(String cname);
• Properties (used to specify properties to be verified)
  – assertTrue(boolean cond)

Annotated Java Code for a Reader-Writer Lock

  import gov.nasa.arc.ase.jpf.jvm.Verify;
  class ReaderWriter {
    private int nr;
    private boolean busy;
    private Object Condr_enter;
    private Object Condw_enter;
    public ReaderWriter() {
      Verify.beginAtomic();
      nr = 0; busy = false;
      Condr_enter = new Object();
      Condw_enter = new Object();
      Verify.endAtomic();
    }
    public boolean read_exit() {
      boolean result = false;
      synchronized(this) {
        nr = (nr - 1);
        result = true;
      }
      Verify.assertTrue(!busy || nr == 0);
      return result;
    }
    private boolean Guarded_r_enter() {
      boolean result = false;
      synchronized(this) {
        if (!busy) { nr = (nr + 1); result = true; }
      }
      return result;
    }
    public void read_enter() {
      synchronized(Condr_enter) {
        while (!Guarded_r_enter()) {
          try { Condr_enter.wait(); }
          catch (InterruptedException e) {}
        }
      }
      Verify.assertTrue(!busy || nr == 0);
    }
    private boolean Guarded_w_enter() {…}
    public void write_enter() {…}
    public boolean write_exit() {…}
  }

JPF Output

>java gov.nasa.arc.ase.jpf.jvm.Main rwmain
JPF 2.1 - (C) 1999,2002 RIACS/NASA Ames Research Center
JVM 2.1 - (C) 1999,2002 RIACS/NASA Ames Research Center
Loading class gov.nasa.arc.ase.jpf.jvm.reflection.JavaLangObjectReflection
Loading class gov.nasa.arc.ase.jpf.jvm.reflection.JavaLangThreadReflection
==============================
No Errors Found
==============================
-----------------------------------
States visited       : 36,999
Transitions executed : 68,759
Instructions executed: 213,462
Maximum stack depth  : 9,010
Intermediate steps   : 2,774
Memory used          : 22.1MB
Memory used after gc : 14.49MB
Storage memory       : 7.33MB
Collected objects    : 51
Mark and sweep runs  : 55,302
Execution time       : 20.401s
Speed                : 3,370tr/s
-----------------------------------

Example Error Trace

1 error found: Deadlock
========================
*** Path to error: ***
========================
Steps to error: 2521
Step #0 Thread #0
Step #1 Thread #0 rwmain.java:4 ReaderWriter monitor=new ReaderWriter();
Step #2 Thread #0 ReaderWriter.java:10 public ReaderWriter( ) {
…
Step #2519 Thread #2 ReaderWriter.java:71 while (! Guarded_w_enter()){
Step #2520 Thread #2 ReaderWriter.java:73 Condw_enter.wait();

Bounded Model Checking
• Represent sets of states and the transition relation as Boolean logic formulas
• Instead of computing the fixpoints, unroll the transition relation up to a certain fixed bound and search for violations of the property within that bound
• Transform this search to a Boolean satisfiability problem and solve it using a SAT solver

What Can We Guarantee?
• Note that in bounded model checking we are checking only for bounded paths (paths which have at most k+1 distinct states)
  – So if the property is violated only by paths with more than k+1 distinct states, we would not find a counter-example using bounded model checking
  – Hence if we do not find a counter-example using bounded model checking, we are not sure that the property holds
• However, if we find a counter-example, then we are sure that the property is violated, since the generated counter-example is never spurious (i.e., it is always a concrete counter-example)

Bounded Model Checking: Proving Correctness
• One can also show that given an LTL property f, if E f holds for a finite state transition system, then E f also holds for that transition system using bounded semantics for some bound k
• So if we keep increasing the bound, then we are guaranteed to find a path that satisfies the formula
  – And, if we do not find a path that satisfies the formula, then we decide that the formula is not satisfied by the transition system
  – Is there a problem here?

Proving Correctness
• We can modify the bounded model checking algorithm as follows:
  – Start from an initial bound.
  – If no counter-examples are found using the current bound, increment the bound and try again.
• The problem is: we do not know when to stop

Proving Correctness
• If we can find a way to figure out when we should stop, then we would be able to provide a guarantee of correctness.
• There is a way to define a diameter of a transition system so that a property holds for the transition system if and only if it is not violated on a path bounded by the diameter.
• So if we do bounded model checking using the diameter of the system as our bound, then we can guarantee correctness if no counter-example is found.

Bounded Model Checking
• What are the differences between bounded model checking and BDD-based symbolic model checking?
  – In bounded model checking we are using a SAT solver instead of a BDD library
  – In symbolic model checking we do not unroll the transition relation as in bounded model checking
  – In bounded model checking we do not execute the iterative fixpoint computations as in symbolic model checking
  – In symbolic model checking for finite state systems both verification and falsification results are guaranteed
    • In bounded model checking we can only guarantee the falsification results; in order to guarantee the verification results we need to know the diameter of the system

Bounded Model Checking
• A bounded model checker needs an efficient SAT solver
  – The zChaff SAT solver is one of the most commonly used ones
• Most SAT solvers require their input to be in Conjunctive Normal Form (CNF)
  – So the final formula has to be converted to CNF
• Similar to BDD-based symbolic model checking, bounded model checking was also first used for hardware verification
• More recently, it has been applied to verification of software

Bounded Model Checking for Software
CBMC is a bounded model checker for ANSI-C programs
• Handles function calls using inlining
• Unwinds the loops a fixed number of times
• Allows user input to be modeled using non-determinism
  – So that a program can be checked for a set of inputs rather than a single input
• Allows specification of assertions, which are checked using bounded model checking

Loops
• Unwind the loop n times by duplicating the loop body n times
  – Each copy is guarded using an if statement that checks the loop condition
• At the end of the n repetitions an unwinding assertion is added, which is the negation of the loop condition
  – Hence if the loop iterates more than n times in some execution, the unwinding assertion will be violated and we know that we need to increase the bound in order to guarantee correctness
• A similar strategy is used for recursive function calls
  – The recursion is unwound up to a certain bound and then an assertion is generated
stating that the recursion does not go any deeper

A Simple Loop Example

Original code:
  x=0;
  while (x < 2) {
    y=y+x;
    x++;
  }

Unwinding the loop 3 times:
  x=0;
  if (x < 2) { y=y+x; x++; }
  if (x < 2) { y=y+x; x++; }
  if (x < 2) { y=y+x; x++; }
  Unwinding assertion: assert (! (x < 2))

From Code to SAT
• After eliminating loops and recursion, CBMC converts the input program to static single assignment (SSA) form
  – In SSA each variable appears at the left hand side of an assignment only once
  – This is a standard program transformation that is performed by creating new variables
• In the resulting program each variable is assigned a value only once and all the branches are forward branches (there is no backward edge in the control flow graph)
• CBMC generates a Boolean logic formula from the program using bit vectors to represent variables

Another Simple Example

Original code:
  x=x+y;
  if (x!=1) x=2;
  else x++;
  assert(x<=3);

Convert to static single assignment:
  x1=x0+y0;
  if (x1!=1) x2=2;
  else x3=x1+1;
  x4=(x1!=1)?x2:x3;
  assert(x4<=3);

Generate constraints:
  C ≡ x1=x0+y0 ∧ x2=2 ∧ x3=x1+1 ∧ ((x1!=1 ∧ x4=x2) ∨ (x1=1 ∧ x4=x3))
  P ≡ x4 <= 3

Check if C ∧ ¬P is satisfiable; if it is, then the assertion is violated. (In this example x4 is always 2, so C ∧ ¬P is unsatisfiable and the assertion holds.)
C ∧ ¬P is converted to Boolean logic using a bit vector representation for the integer variables y0, x0, x1, x2, x3, x4

Bounded Verification Approaches
• What we have discussed above is bounded verification by bounding the number of steps of the execution.
• For this approach to work the variable domains also need to be bounded, otherwise we cannot convert the problem to Boolean SAT
• Bounding the execution steps and bounding the data domain are two orthogonal approaches.
  – When people say bounded verification it may refer to either of these
  – When people say bounded model checking it typically refers to bounding the execution steps

Symbolic Software Model Checking with Predicate Abstraction and Counter-Example Guided Abstraction Refinement

Model Checking Programs Using Abstraction
• Program model checking tools generally rely on automated abstraction techniques to reduce the state space of the system, such as:
  – Abstract interpretation
  – Predicate abstraction
• If the abstraction is conservative, then if there is no error in the abstracted program we can conclude that there is no error in the original program
• In general the problem is to construct a finite state model from the program such that the errors or absence of errors can be demonstrated on the finite state model
  – Model extraction problem

Model Checking Programs via Abstraction
• Bandera
  – A tool for extracting finite state models from programs
  – Uses various abstract domains to map the state space of the program to a finite set of states via abstraction
• SLAM project at Microsoft Research
  – Symbolic model checking for C programs
  – Can handle unbounded recursion but does not handle concurrency
  – Uses predicate abstraction, counter-example guided abstraction refinement and BDDs

Abstraction (A simplified view)
• Abstraction is an effective tool in verification
• Given a transition system, we want to generate an abstract transition system which is easier to analyze
• However, we want to make sure that
  – If a property holds in the abstract transition system, it also holds in the original (concrete) transition system

Abstraction (A simplified view)
• How do we generate an abstract transition system?
• Merge states in the concrete transition system (based on some criteria)
  – This reduces the number of states, so it should be easier to do verification
• Do not eliminate transitions
  – This will make sure that the paths in the abstract transition system subsume the paths in the concrete transition system

Abstraction (A simplified view)
• For every path in the concrete transition system, there is an equivalent path in the abstract transition system
  – If no path in the abstract transition system violates a property, then no path in the concrete system can violate the property
• Using this reasoning we can verify ACTL, LTL and ACTL* properties in the abstract transition system
  – If the property holds on the abstract transition system, we are sure that the property holds in the concrete transition system
  – If the property does not hold in the abstract transition system, then we are not sure whether the property holds in the concrete transition system

Abstraction (A simplified view)
• If the property does not hold in the abstract transition system, what can we do?
• We can refine the abstract transition system (split some states that we merged)
• We have to make sure that the refined transition system is still an abstraction of the concrete transition system
• Then, we can recheck the property on the refined transition system
  – If the property does not hold again, we can refine again

Predicate Abstraction
• An automated abstraction technique which can be used to reduce the state space of a program
• The basic idea in predicate abstraction is to remove some variables from the program by just keeping information about a set of predicates about them
• For example, a predicate such as x = y may be the only information necessary about variables x and y to determine the behavior of the program
  – In that case we can just store a boolean variable which corresponds to the predicate x = y and remove variables x and y from the program
  – Predicate abstraction is a technique for doing such abstractions automatically

Predicate Abstraction
• Given a program and a set of predicates, predicate abstraction abstracts the program so that only the information about the given predicates is preserved
• The abstracted program adds nondeterminism, since in some cases it may not be possible to figure out what the next value of a predicate will be based on the predicates in the given set
• One needs an automated theorem prover to compute the abstraction

Predicate Abstraction, A Very Simple Example
• Assume that we have two integer variables x, y
• We want to abstract the program using a single predicate “x=y”
• We will divide the states of the program into two sets:
  1. The states where “x=y” is true
  2. The states where “x=y” is false, i.e., “x≠y”
• We will then merge all the states in the same set
  – This is an abstraction
  – Basically, we forget everything except the value of the predicate “x=y”

Predicate Abstraction, A Very Simple Example
• We will represent the predicate “x=y” as the boolean variable B in the abstract program
  – “B=true” will mean “x=y” and
  – “B=false” will mean “x≠y”
• Assume that we want to abstract the following program which contains only one statement:
  y := y+1

Predicate Abstraction, Step 1
• Calculate preconditions based on the predicate
  {x = y + 1} y := y + 1 {x = y}
    precondition for B being true after executing the statement y:=y+1
    Using our temporal logic notation we can say something like: {x=y+1} AX{x=y}
  {x ≠ y + 1} y := y + 1 {x ≠ y}
    precondition for B being false after executing the statement y:=y+1
    Again, using our temporal logic notation: {x≠y+1} AX{x≠y}

Predicate Abstraction, Step 2
• Use decision procedures to determine if the predicates used for abstraction imply any of the preconditions
  x = y ⇒ x = y + 1 ? No
  x ≠ y ⇒ x = y + 1 ? No
  x = y ⇒ x ≠ y + 1 ? Yes
  x ≠ y ⇒ x ≠ y + 1 ? No

Predicate Abstraction, Step 3
• Generate abstract code
Predicate abstraction of the statement y := y + 1 wrt the predicate “x=y”:
  1) Compute preconditions
     {x = y + 1} y := y + 1 {x = y}
     {x ≠ y + 1} y := y + 1 {x ≠ y}
  2) Check implications
     x = y ⇒ x = y + 1 ? No
     x ≠ y ⇒ x = y + 1 ? No
     x = y ⇒ x ≠ y + 1 ? Yes
     x ≠ y ⇒ x ≠ y + 1 ? No
  3) Generate abstract code
     IF B THEN B := false ELSE B := true | false

Model Checking Push-down Automata
A class of infinite state systems for which model checking is decidable
• Push-down automata: finite state control + one stack
• LTL model checking for push-down automata is decidable
• This may sound like a theoretical result, but it is the basis of the approach used in the SLAM toolkit for model checking C programs
  – A program with finite data domains which uses recursion can be modeled as a pushdown automaton
  – A Boolean program generated by predicate abstraction can be represented as a pushdown automaton

Predicate Abstraction + Model Checking Push Down Automata
• Predicate abstraction combined with results on model checking pushdown automata led to some promising tools
  – SLAM project at Microsoft Research for verification of C programs
  – This tool is being used to verify device drivers at Microsoft
• The main idea:
  – Use predicate abstraction to obtain finite state abstractions of a program
  – A program with finite data domains and recursion can be modeled as a pushdown automaton
  – Use results on model checking push-down automata to verify the abstracted (recursive) program

SLAM Toolkit
• The SLAM toolkit was developed to find errors in Windows device drivers
  – Examples in my slides are from the following paper:
    • “The SLAM Toolkit”, Thomas Ball and Sriram K.
Rajamani, CAV 2001
• Windows device drivers are required to interact with the Windows kernel according to certain interface rules
• The SLAM toolkit has an interface specification language called SLIC (Specification Language for Interface Checking) which is used for writing these interface rules
• The SLAM toolkit instruments the driver code with assertions based on these interface rules

A SLIC Specification for a Lock

SLIC specification:
  state {
    enum { Unlocked=0, Locked=1 } state = Unlocked;
  }
  KeAcquireSpinLock.return {
    if (state == Locked) abort;
    else state = Locked;
  }
  KeReleaseSpinLock.return {
    if (state == Unlocked) abort;
    else state = Unlocked;
  }

• This specification states that KeAcquireSpinLock has to be called before KeReleaseSpinLock is called,
• and KeAcquireSpinLock cannot be called back to back before a KeReleaseSpinLock is called, and vice versa

A SLIC Specification for a Lock

Generated C Code (from the SLIC specification above):
  enum { Unlocked=0, Locked=1 } state = Unlocked;
  void slic_abort() {
    SLIC_ERROR: ;
  }
  void KeAcquireSpinLock_return() {
    if (state == Locked) slic_abort();
    else state = Locked;
  }
  void KeReleaseSpinLock_return() {
    if (state == Unlocked) slic_abort();
    else state = Unlocked;
  }

An Example

Original code:
  void example() {
    do {
      KeAcquireSpinLock();
      nPacketsOld = nPackets;
      req = devExt->WLHV;
      if(req && req->status){
        devExt->WLHV = req->Next;
        KeReleaseSpinLock();
        irp = req->irp;
        if(req->status > 0){
          irp->IoS.Status = SUCCESS;
          irp->IoS.Info = req->Status;
        } else {
          irp->IoS.Status = FAIL;
          irp->IoS.Info = req->Status;
        }
        SmartDevFreeBlock(req);
        IoCompleteRequest(irp);
        nPackets++;
      }
    } while(nPackets!=nPacketsOld);
    KeReleaseSpinLock();
  }

Instrumented code:
  void example() {
    do {
      KeAcquireSpinLock();
      A: KeAcquireSpinLock_return();
      nPacketsOld = nPackets;
      req = devExt->WLHV;
      if(req && req->status){
        devExt->WLHV = req->Next;
        KeReleaseSpinLock();
        B: KeReleaseSpinLock_return();
        irp = req->irp;
        if(req->status > 0){
          irp->IoS.Status = SUCCESS;
          irp->IoS.Info = req->Status;
        } else {
          irp->IoS.Status = FAIL;
          irp->IoS.Info = req->Status;
        }
        SmartDevFreeBlock(req);
        IoCompleteRequest(irp);
        nPackets++;
      }
    } while(nPackets!=nPacketsOld);
    KeReleaseSpinLock();
    C: KeReleaseSpinLock_return();
  }

Boolean Programs
• After instrumenting the code, the SLAM toolkit converts the instrumented C program to a Boolean program using predicate abstraction
• The Boolean program consists of only Boolean variables
  – The Boolean variables in the Boolean program are the predicates that are used during predicate abstraction
• The Boolean program can have unbounded recursion

Boolean Programs

C Code:
  enum { Unlocked=0, Locked=1 } state = Unlocked;
  void slic_abort() {
    SLIC_ERROR: ;
  }
  void KeAcquireSpinLock_return() {
    if (state == Locked) slic_abort();
    else state = Locked;
  }
  void KeReleaseSpinLock_return() {
    if (state == Unlocked) slic_abort();
    else state = Unlocked;
  }

Boolean Program:
  decl {state==Locked}, {state==Unlocked} := F,T;
  void slic_abort() begin
    SLIC_ERROR: skip;
  end
  void KeAcquireSpinLock_return() begin
    if ({state==Locked}) slic_abort();
    else {state==Locked}, {state==Unlocked} := T,F;
  end
  void KeReleaseSpinLock_return() begin
    if ({state==Unlocked}) slic_abort();
    else {state==Locked}, {state==Unlocked} := F,T;
  end

C Code:
  void example() {
    do {
      KeAcquireSpinLock();
      A: KeAcquireSpinLock_return();
      nPacketsOld = nPackets;
      req = devExt->WLHV;
      if(req && req->status){
        devExt->WLHV = req->Next;
        KeReleaseSpinLock();
        B: KeReleaseSpinLock_return();
        irp = req->irp;
        if(req->status > 0){
          irp->IoS.Status = SUCCESS;
          irp->IoS.Info = req->Status;
        } else {
          irp->IoS.Status = FAIL;
          irp->IoS.Info = req->Status;
        }
        SmartDevFreeBlock(req);
        IoCompleteRequest(irp);
        nPackets++;
      }
    } while(nPackets!=nPacketsOld);
    KeReleaseSpinLock();
    C: KeReleaseSpinLock_return();
  }

Boolean
Program:
  void example() begin
    do
      skip;
      A: KeAcquireSpinLock_return();
      skip;
      if (*) then
        skip;
        B: KeReleaseSpinLock_return();
        skip;
        if (*) then
          skip;
        else
          skip;
        fi
        skip;
      fi
    while (*);
    skip;
    C: KeReleaseSpinLock_return();
  end

Abstraction Preserves Correctness
• The Boolean program that is generated with predicate abstraction is non-deterministic.
  – Non-determinism is used to handle the cases where the predicates used during predicate abstraction are not sufficient to determine which branch will be taken
• If we find no error in the generated abstract Boolean program, then we are sure that there are no errors in the original program
  – The abstract Boolean program allows more behaviors than the original program due to non-determinism.
  – Hence, if the abstract Boolean program is correct then the original program is also correct.

Counter-Example Guided Abstraction Refinement
• However, if we find an error in the abstract Boolean program, this does not mean that the original program is incorrect.
  – The erroneous behavior in the abstract Boolean program could be an infeasible execution path that is caused by the non-determinism introduced during abstraction.
• Counter-example guided abstraction refinement is a technique used to iteratively refine the abstract program in order to remove the spurious counter-example traces

Counter-Example Guided Abstraction Refinement
The basic idea in counter-example guided abstraction refinement is the following:
• First look for an error in the abstract program (if there are no errors, we can terminate since we know that the original program is correct)
• If there is an error in the abstract program, generate a counter-example path on the abstract program
• Check if the generated counter-example path is feasible using a theorem prover.
• If the generated path is infeasible, add the predicate from the branch condition where an infeasible choice is made to the predicate set and generate a new abstract program.
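The four steps above can be sketched as a loop over a small explicit-state system. This Python sketch is an illustration under stated assumptions and is not how SLAM works: feasibility is checked by enumerating concrete states instead of calling a theorem prover, and the refinement predicate is "the state lies in the set reached along the spurious prefix" rather than a branch condition taken from the program. The names cegar and find_abstract_path are hypothetical.

```python
from collections import deque


def find_abstract_path(a_init, a_trans, a_bad):
    """Breadth-first search for an abstract path from an initial to a bad state."""
    parent = {a: None for a in a_init}
    queue = deque(a_init)
    while queue:
        a = queue.popleft()
        if a in a_bad:
            path = []
            while a is not None:
                path.append(a)
                a = parent[a]
            return path[::-1]
        for src, dst in a_trans:
            if src == a and dst not in parent:
                parent[dst] = a
                queue.append(dst)
    return None


def cegar(init, trans, bad):
    """Return (verdict, number of predicates used) for a finite concrete system."""
    # Always track the predicate "s is bad", so that an abstract bad state
    # contains only concrete bad states.
    preds = [lambda s: s in bad]
    while True:
        def alpha(s):
            # Abstract state = truth values of the current predicates.
            return tuple(p(s) for p in preds)

        # Merge states, keep every transition (existential abstraction).
        a_init = {alpha(s) for s in init}
        a_bad = {alpha(s) for s in bad}
        a_trans = {(alpha(s), alpha(t)) for s, t in trans}

        path = find_abstract_path(a_init, a_trans, a_bad)
        if path is None:
            return "safe", len(preds)          # no abstract error: program correct

        # Feasibility check: replay the abstract path on concrete states.
        reach = {s for s in init if alpha(s) == path[0]}
        for nxt in path[1:]:
            step = {t for s, t in trans if s in reach and alpha(t) == nxt}
            if not step:
                break                          # spurious: path cannot be followed
            reach = step
        else:
            return "unsafe", len(preds)        # concrete counter-example exists

        # Refinement: split the abstract state where the path got stuck.
        frozen = frozenset(reach)
        preds.append(lambda s, f=frozen: s in f)


# Toy system (assumed): a counter modulo 8 that is incremented by 2.
trans = {(x, (x + 2) % 8) for x in range(8)}
safe_result = cegar(init={0}, trans=trans, bad={5})     # odd values are unreachable
unsafe_result = cegar(init={0}, trans=trans, bad={4})   # 0 -> 2 -> 4 is a real path
```

On a finite system every refinement splits at least one abstract state, so this loop terminates; for programs over infinite domains no such guarantee exists.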
Counter-Example Guided Abstraction Refinement

Boolean Program:
  void example() begin
    do
      skip;
      A: KeAcquireSpinLock_return();
      skip;
      if (*) then
        skip;
        B: KeReleaseSpinLock_return();
        skip;
        if (*) then
          skip;
        else
          skip;
        fi
        skip;
      fi
    while (*);
    skip;
    C: KeReleaseSpinLock_return();
  end

Refined Boolean Program (using the predicate (nPackets = nPacketsOld)):
the boolean variable b represents the predicate (nPackets = nPacketsOld)
  void example() begin
    do
      skip;
      A: KeAcquireSpinLock_return();
      b := T;
      if (*) then
        skip;
        B: KeReleaseSpinLock_return();
        skip;
        if (*) then
          skip;
        else
          skip;
        fi
        b := b ? F : *;
      fi
    while (!b);
    skip;
    C: KeReleaseSpinLock_return();
  end

Counter-Example Guided Abstraction Refinement
• Using counter-example guided abstraction refinement we are iteratively creating more and more refined abstractions
• This iterative abstraction refinement loop is not guaranteed to converge for infinite domains
  – This is not surprising since automated verification for infinite domains is undecidable in general
• The challenge in this approach is automatically choosing the right set of predicates for abstraction refinement
  – This is similar to finding a loop invariant that is strong enough to prove the property of interest
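To see what the extra predicate buys, the two Boolean programs for example() can be explored exhaustively. The Python sketch below is a hand-written encoding of those two programs (an assumption, not output of any tool); the function name error_reachable is hypothetical. It resolves every nondeterministic choice (*) both ways and reports whether slic_abort() can be reached.

```python
def error_reachable(refined):
    """Explore all behaviours of the Boolean program for example().

    The only state carried across loop iterations is `locked`, the value of
    {state==Locked}; b is reset to T right after label A in every iteration.
    """
    seen = set()
    worklist = [False]                 # {state==Locked} is F on entry
    while worklist:
        locked = worklist.pop()
        if locked in seen:
            continue
        seen.add(locked)

        # A: KeAcquireSpinLock_return()
        if locked:
            return True                # slic_abort(): lock acquired twice
        for take_branch in (False, True):          # if (*)
            lock, b = True, True                   # lock held; b := T
            if take_branch:
                # B: KeReleaseSpinLock_return(); cannot abort, the lock is held
                lock = False
                b = False                          # b := b ? F : *  with b == T
            # Loop test: while (!b) in the refined program, while (*) otherwise
            repeat_choices = [not b] if refined else [True, False]
            for repeat in repeat_choices:
                if repeat:
                    worklist.append(lock)          # next iteration
                elif not lock:
                    return True        # C: release while Unlocked -> slic_abort()
    return False


spurious_error = error_reachable(refined=False)
refined_error = error_reachable(refined=True)
```

In the unrefined program the error label is reachable, for example by releasing the lock at B and then leaving the loop. With b tracking (nPackets = nPacketsOld), the loop repeats exactly when the lock was released inside the body, and the error label becomes unreachable.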