272: Software Engineering
Fall 2012
Instructor: Tevfik Bultan
Lecture 2: Software Verification with JPF, ALV and Design for Verification

Model Checking Evolution
• Earlier model checkers had their own input specification languages
  – For example Spin, SMV
• This requires translating the system to be verified into the input language of the model checker
  – Most of the time these translations are not automated and use ad hoc simplifications and abstractions
• More recently, several researchers have developed tools for model checking programs
  – These model checkers work directly on programs, i.e., their input language is a programming language
  – These model checkers use well-defined techniques for restricting the state space or use automated abstraction techniques

Explicit-State Model Checking Programs
• Verisoft from Bell Labs
  – C programs, handles concurrency, bounded search, bounded recursion.
  – Uses stateless search and partial order reduction.
• Java Path Finder (JPF) at NASA Ames
  – Explicit state model checking for Java programs, bounded search, bounded recursion, handles concurrency.
  – Uses techniques similar to the techniques used in Spin.
• CMC from Stanford for checking systems code written in C

Symbolic Model Checking of Programs
• CBMC
  – This is the bounded model checker we discussed earlier; it bounds the loop iterations and recursion depth.
  – Uses a SAT solver.
• SLAM project at Microsoft Research
  – Symbolic model checking for C programs. Can handle unbounded recursion but does not handle concurrency.
  – Uses predicate abstraction and BDDs.
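The bounded, explicit-state search that the tools above rely on can be sketched in a few lines. The following is only an illustrative sketch, not the search code of Verisoft, JPF, or CMC: the class name BoundedDfs, the method findViolation, and the toy transition system in main are all assumptions made for this example. It enumerates states one at a time, remembers visited states, stops at a depth bound, and returns a counter-example path when a state violates the safety predicate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration; the names here do not come from any of the tools listed above.
final class BoundedDfs {

    /**
     * Depth-bounded depth-first search over an explicitly enumerated state graph.
     * Returns a path from init to a state that violates `safe`, or null when no
     * violation is found within maxDepth transitions.
     */
    static <S> List<S> findViolation(S init, Function<S, List<S>> successors,
                                     Predicate<S> safe, int maxDepth) {
        Map<S, Integer> shallowestVisit = new HashMap<>();
        Deque<S> path = new ArrayDeque<>();
        boolean found = dfs(init, 0, successors, safe, maxDepth, shallowestVisit, path);
        return found ? new ArrayList<>(path) : null;
    }

    private static <S> boolean dfs(S state, int depth, Function<S, List<S>> successors,
                                   Predicate<S> safe, int maxDepth,
                                   Map<S, Integer> shallowestVisit, Deque<S> path) {
        // Skip a state only if it was already expanded at the same or a smaller depth;
        // otherwise the depth bound could hide states that are within reach.
        Integer seenAt = shallowestVisit.get(state);
        if (seenAt != null && seenAt <= depth) {
            return false;
        }
        shallowestVisit.put(state, depth);
        path.addLast(state);
        if (!safe.test(state)) {
            return true; // the path now ends in the violating state
        }
        if (depth < maxDepth) {
            for (S next : successors.apply(state)) {
                if (dfs(next, depth + 1, successors, safe, maxDepth, shallowestVisit, path)) {
                    return true;
                }
            }
        }
        path.removeLast(); // backtrack
        return false;
    }

    public static void main(String[] args) {
        // Toy system (an assumption for the demo): from x we can move to x+1 or 2*x.
        Function<Integer, List<Integer>> step = x -> Arrays.asList(x + 1, x * 2);
        Predicate<Integer> safe = x -> x != 10; // "bad" state: x == 10
        System.out.println(findViolation(1, step, safe, 5));
        System.out.println(findViolation(1, step, safe, 3));
    }
}
```

With a bound of 5 the search reaches the bad state and reports a path to it; with a bound of 3 it reports nothing, which shows why a bounded search can only find violations, not prove their absence beyond the bound.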
Java Path Finder
• Program checker for Java
• Properties to be verified
  – Properties can be specified as assertions
    • static checking of assertions
  – It can also verify LTL properties
• Implements both depth-first and breadth-first search and looks for assertion violations statically
• Uses static analysis techniques to improve the efficiency of the search
• Requires a complete Java program
• It can only handle pure Java; it cannot handle native code

Java Path Finder, First Version
• First version
  – A translator from Java to PROMELA
  – Uses SPIN for model checking
• Since SPIN cannot handle unbounded data
  – Restrict the program to finite domains
    • A fixed number of objects from each class
    • Fixed bounds for array sizes
• Does not scale well if these fixed bounds are increased
• Java source code is required for translation

Java Path Finder, Current Version
• Current version of the JPF has its own virtual machine: JVM-JPF
  – Executes Java bytecode
    • can handle pure Java but cannot handle native code
  – Has its own garbage collection
  – Stores the visited states and stores the current path
  – Offers some methods to the user to optimize verification
• Traversal algorithm
  – Traverses the state-graph of the program
  – Tells MC-JVM to move forward, backward in the state space, and evaluate the assertion
• The rest of the slides are on the current version of JPF

Storing the States
• JPF implements a depth-first search on the state space of the given Java program
  – To do depth-first search we need to store the visited states
    • There are also verification tools which use stateless search, such as Verisoft
• The state of the program consists of
  – information for each thread in the Java program
    • a stack of frames, one for each method called
  – the static variables in classes
    • locks and fields for the classes
  – the dynamic variables (fields) in objects
    • locks and fields for the objects

Storing States Efficiently
• Since different states can have common parts, each state is divided into a set
  of components which are stored separately
  – locks, frames, fields
• Keep a pool for each component
  – A table of field values, lock values, frame values
• Instead of storing the value of a component in a state, store in the state the index at which the component is stored in the table
  – The whole state becomes an integer vector
• JPF collapses states to integer vectors using this idea
• This strategy enables JPF to collapse and uncollapse parts of the states during the state space exploration

State Space Explosion
• State space explosion is one of the major challenges in model checking
• The idea is to reduce the number of states that have to be visited during state space exploration
• Here are some approaches used to attack state space explosion
  – Symmetry reduction
    • search equivalent states only once
  – Partial order reduction
    • do not search thread interleavings that generate equivalent behavior
  – Abstraction
    • abstract parts of the state to reduce the size of the state space

Symmetry Reduction
• Some states of the program may be equivalent
  – Equivalent states should be searched only once
• Some states may differ only in their memory layout, the order in which objects are created, etc.
  – these may not have any effect on the behavior of the program
• JPF makes sure that the order in which the classes are loaded does not affect the state
  – There is a canonical ordering of the classes in the memory
• A similar problem occurs for the location of dynamically allocated objects in the heap
  – If we store the memory location as the state, then we can miss equivalent states which have different memory layouts
  – JPF tries to remedy this problem by storing some information about the new statements that create an object and the number of times they are executed

Partial Order Reduction
• Statements of concurrently executing threads can generate many different interleavings
  – all these different interleavings are allowable behaviors of the program
• A model checker has to check all possible interleavings to ensure that the behavior of the program is correct in all cases
  – However, different interleavings may generate equivalent behaviors
• In such cases it is sufficient to check just one interleaving without exhausting all the possibilities
  – This is called partial order reduction

Example: for the program below,
  state space search generates 258 states
  with symmetry reduction: 105 states
  with partial order reduction: 68 states
  with symmetry reduction + partial order reduction: 38 states

    class S1 { int x; }

    class FirstTask extends Thread {
        public void run() {
            S1 s1;
            int x = 1;
            s1 = new S1();
            x = 3;
        }
    }

    class S2 { int y; }

    class SecondTask extends Thread {
        public void run() {
            S2 s2;
            int x = 1;
            s2 = new S2();
            x = 3;
        }
    }

    class Main {
        public static void main(String[] args) {
            FirstTask task1 = new FirstTask();
            SecondTask task2 = new SecondTask();
            task1.start();
            task2.start();
        }
    }

Action Language and Action Language Verifier
• Now, I want to talk about an infinite state model checker called Action Language Verifier (ALV)
• Like Spin and SMV, ALV has its own input language, called Action Language
• So, to verify something using ALV you first have to specify it in Action Language

Action Language
• Actions specify state
  changes (transitions)
• States correspond to valuations of variables
  – boolean
  – enumerated
  – integer (possibly unbounded)
  – heap variables (i.e., pointers)
• Parameterized constants
  – specifications are verified for every possible value of the constant
• Parameterized specifications
  – Enable verification of a protocol for an arbitrary number of processes

Action Language
• Transition relation is defined using actions
  – Atomic actions: Predicates on current and next state variables
  – Action composition:
    • asynchronous (|) or synchronous (&)
• Modular
  – Modules can have submodules
  – A module is defined as asynchronous and/or synchronous compositions of its actions and submodules

Actions in Action Language
• Atomic actions: Predicates on current and next state variables
  – Current state variables: reading, nr, busy
  – Next state variables: reading’, nr’, busy’
  – Logical operators: not (!), and (&&), or (||)
  – Equality: = (for all variable types)
  – Linear arithmetic: <, >, >=, <=, +, * (by a constant)
• An atomic action: !reading and !busy and nr’=nr+1 and reading’

What can we specify in Action Language?
• For example, we can specify a Read-Write lock implementation

    integer nr;
    boolean busy;
    initial: !busy and nr=0;
    r_enter: [!busy] nr := nr+1;
    r_exit:  nr := nr-1;
    w_enter: [!busy && nr=0] busy := true;
    w_exit:  busy := false;

• Then using Action Language Verifier, we can check if this read-write lock satisfies a CTL property (like mutual exclusion)

Read-Write Lock in Action Language

    module main()
      integer nr;
      boolean busy;
      restrict: nr>=0;
      initial: nr=0 and !busy;
      module ReaderWriter()
        enumerated state {idle, reading, writing};
        initial: state=idle;
        r_enter: state=idle and !busy and nr’=nr+1 and state’=reading;
        r_exit: state=reading and nr’=nr-1 and state’=idle;
        w_enter: state=idle and !busy and nr=0 and busy’ and state’=writing;
        w_exit: state=writing and !busy’ and state’=idle;
        ReaderWriter: r_enter | r_exit | w_enter | w_exit;
      endmodule
      main: ReaderWriter*();
      spec: invariant(busy => nr=0)
      spec: invariant(busy => eventually(!busy))
    endmodule

Annotations on the specification:
• S : Cartesian product of variable domains defines the set of states (the variable declarations)
• I : Predicates defining the initial states (the initial: lines)
• R : Atomic actions of a single process (r_enter, r_exit, w_enter, w_exit)
• R : Transition relation of a process, defined as asynchronous composition of its atomic actions (the ReaderWriter: line)
• R : Transition relation of main, defined as asynchronous composition of a finite but arbitrary number of reader-writer modules (the main: line)

Arbitrary Number of Threads?
• How do we check an arbitrary number of threads?
• Counting abstraction
  – Create an integer variable for each thread state
  – Each variable counts the number of threads in a particular state
  – Generate updates and guards for these variables based on the specification
  – Local states of the threads have to be finite
    • Shared variables can be unbounded
• Counting abstraction is automated

Parameterized Read-Write Lock

    module main()
      integer nr;
      boolean busy;
      parameterized integer numReaderWriter;
      restrict: nr>=0 and numReaderWriter>=1;
      initial: nr=0 and !busy;
      module ReaderWriter()
        integer idle, reading, writing;
        initial: idle=numReaderWriter;
        r_enter: idle>0 and !busy and nr’=nr+1 and idle’=idle-1 and reading’=reading+1;
        r_exit: reading>0 and nr’=nr-1 and reading’=reading-1 and idle’=idle+1;
        w_enter: idle>0 and !busy and nr=0 and busy’ and idle’=idle-1 and writing’=writing+1;
        w_exit: writing>0 and !busy’ and writing’=writing-1 and idle’=idle+1;
        ReaderWriter: r_enter | r_exit | w_enter | w_exit;
      endmodule
      main: ReaderWriter();
      spec: invariant(busy => nr=0)
      spec: invariant(busy => eventually(!busy))
    endmodule

Action Language Verifier
• Action Language Verifier is an infinite state model checker that can verify properties of systems specified in Action Language
• The input Action Language specification is represented as a transition system:
  – S : The set of states
  – I ⊆ S : The set of initial states
  – R ⊆ S × S : The transition relation
• Properties of the input specification are expressed in temporal logics
• Invariant(p) : is true in a state if property p is true in every state reachable from that state
  – Also known as AG
• Eventually(p) : is true in a state if property p is true at some state on every execution path from that state
  – Also known as AF

Symbolic Model Checking
Given a program and a temporal property p:
• Either show that all the initial states satisfy the temporal property p
  – set of initial states ⊆ truth set of p
• Or find an initial state which does not satisfy the property p
  – a state ∈ set of initial states ∩ truth set of ¬p
• We can check these in two ways:
  – Start from the initial states and iteratively add states that are reachable from the current set of reachable states. We stop when there is nothing new to add. (This is called forward fixpoint computation.) This computes all the reachable states.
  OR
  – Start from the bad states (¬p) and iteratively add states that can reach the current set of states. We stop when there is nothing new to add. (This is called backward fixpoint computation.) This computes all the states that can reach a bad state.

Symbolic Model Checking Computes Fixpoints
[Figure: two fixpoint computations for Invariant(p).
 Backward fixpoint: starting from ¬p, repeatedly add the pre-condition (backward image); the result is the set of states that can reach ¬p, i.e., states that violate Invariant(p). Initial states in this set are initial states that violate Invariant(p).
 Forward fixpoint: starting from the initial states, repeatedly add the post-condition (forward image); the result is the set of reachable states of the system. Reachable states in ¬p are reachable states that violate p.]

Symbolic Model Checking
• Represent sets of states and the transition relation as logic formulas
• Forward and backward fixpoints can be computed by iteratively manipulating these formulas
  – Pre- and post-condition computation (aka backward and forward image): existential variable elimination
  – Conjunction (intersection), disjunction (union) and negation (set difference), and equivalence check
• Requires the use of efficient data structures for manipulation of logic formulas

Fixpoints May Not Converge
• For infinite state systems fixpoint computations may not converge (there is always something new to add, so we never stop).
• In fact, many verification problems are undecidable for infinite state systems
• So we use conservative approximations

Conservative Approximations
• Compute a lower (p−) or an upper (p+) approximation to the truth set of the property (p)
• Action Language Verifier can give three answers:
  1) “The property is satisfied”
  2) “The property is false and here is a counter-example”
  3) “I don’t know”
[Figure: diagrams relating the initial states I to p, its approximations p− and p+, and the states which violate the property.]

Action Language Tool Set
[Figure: tool architecture. An Action Language Specification is read by the Action Language Parser and checked by the Action Language Verifier, whose outputs are Verified, Counter example, or Don’t know. The Composite Symbolic Library is shown with:
 – Omega Library: Presburger Arithmetic Manipulator
 – CUDD Package: BDD Manipulator
 – MONA: Automata Manipulator]

Read-Write Lock Verification with ALV

    Instance  Integers  Booleans  Cons. Time (secs.)  Ver. Time (secs.)  Memory (Mbytes)
    RW-4      1         5         0.04                0.01               6.6
    RW-8      1         9         0.08                0.01               7
    RW-16     1         17        0.19                0.02               8
    RW-32     1         33        0.53                0.03               10.8
    RW-64     1         65        1.71                0.06               20.6
    RW-P      7         1         0.05                0.01               9.1

Read-Write Lock in Java
How do we translate this to Action Language?

    class ReadWriteLock {
        private Object lockObj;
        private int totalReadLocksGiven;
        private boolean writeLockIssued;
        private int threadsWaitingForWriteLock;

        public ReadWriteLock() {
            lockObj = new Object();
            writeLockIssued = false;
        }

        public void getReadLock() {
            synchronized (lockObj) {
                // Readers wait while a writer holds the lock or is waiting for it
                while ((writeLockIssued) || (threadsWaitingForWriteLock != 0)) {
                    try {
                        lockObj.wait();
                    } catch (InterruptedException e) {
                    }
                }
                totalReadLocksGiven++;
            }
        }

        public void getWriteLock() {
            synchronized (lockObj) {
                threadsWaitingForWriteLock++;
                // wait() must be called while holding lockObj, so the loop stays inside the synchronized block
                while ((totalReadLocksGiven != 0) || (writeLockIssued)) {
                    try {
                        lockObj.wait();
                    } catch (InterruptedException e) {
                        //
                    }
                }
                threadsWaitingForWriteLock--;
                writeLockIssued = true;
            }
        }
        ...
    }

[Figure labels: Verification of Synchronization in Java Programs; Action Language Verifier]

Two Challenges in Software Model Checking
• State space explosion
  – Exponential increase in the state space with increasing number of variables and threads
    • State space includes everything: threads, variables, control stack, heap
• Environment generation
  – Finding models for parts of software that are
    • either not available for analysis, or
    • outside the scope of the model checker

Modular Verification
• Modularity is key to scalability of any verification technique
  – Moreover, it can help in isolating the behavior you wish to focus on, removing the parts that are beyond the scope of your verification technique
• Modularity is also a key concept for successful software design
  – The question is finding effective ways of exploiting the modularity in software during verification

Interfaces for Modularity
• How do we do modular verification?
  – Divide the software into a set of modules
  – Check each module in isolation
• How do we isolate a module during verification/testing?
  – Provide stubs representing other modules (environment)
• How do we get the stubs representing other modules?
  – Write interfaces
    • Interfaces specify the behavior of a module from the viewpoint of other modules
    • Generate stubs from the interfaces

Interfaces and Modularity: Basic Idea
1. Write interface specifications for the modules
2.
Automatically generate stubs from the interface specifications
3. Automatically generated stubs provide the environment during modular verification

A Design for Verification Approach
Our design for verification approach is based on the following principles:
1. Use of design patterns that facilitate automated verification
2. Use of stateful, behavioral interfaces which isolate the behavior and enable modular verification
3. An assume-guarantee style modular verification strategy that separates verification of the behavior from the verification of the conformance to the interface specifications
4. A general model checking technique for interface verification
5. Domain specific and specialized verification techniques for behavior verification

Concurrency Controller Pattern
[Figure: class diagram of the pattern.
 – ThreadA and ThreadB use Shared (+a(), +b()) and Controller (-var1, -var2, +action1(), +action2())
 – SharedStub (+a(), +b()) and ControllerStateMachine (+action1(), +action2())
 – Helper classes: Action (+blocking(), +nonblocking(), -GuardedExecute), GuardedCommand (+guard(), +update()), StateMachine
 – Legend: used at runtime; used during interface verification; used both times]

Concurrency Controller Pattern
• Avoids usage of error-prone Java synchronization primitives: synchronized, wait, notify
• Separates controller behavior from the threads that use the controller
  – Supports a modular verification approach that exploits this modularity for scalable verification

Reader-Writer Controller
This helper class is provided. No need to rewrite it!

    class Action{
        protected final Object owner; …
        // Executes the first guarded command whose guard holds
        private boolean GuardedExecute(){
            boolean result=false;
            for(int i=0; i<gcV.size(); i++) try{
                if(((GuardedCommand)gcV.get(i)).guard()){
                    ((GuardedCommand)gcV.get(i)).update();
                    result=true; break; }
            }catch(Exception e){}
            return result;
        }
        // Waits until some guarded command can be executed
        public void blocking(){
            synchronized(owner) {
                while(!GuardedExecute()) {
                    try{owner.wait();}
                    catch (Exception e){} }
                owner.notifyAll(); }
        }
        // Tries once and reports whether a guarded command was executed
        public boolean nonblocking(){
            synchronized(owner) {
                boolean result=GuardedExecute();
                if (result) owner.notifyAll();
                return result; }
        }
    }

The controller class written by the developer:

    class RWController implements RWInterface{
        int nR; boolean busy;
        final Action act_r_enter, act_r_exit;
        final Action act_w_enter, act_w_exit;
        RWController() {
            ...
            gcs = new Vector();
            gcs.add(new GuardedCommand() {
                public boolean guard(){
                    return (nR == 0 && !busy);}
                public void update(){busy = true;}}
            );
            act_w_enter = new Action(this,gcs);
        }
        public void w_enter(){
            act_w_enter.blocking();}
        public boolean w_exit(){
            return act_w_exit.nonblocking();}
        public void r_enter(){
            act_r_enter.blocking();}
        public boolean r_exit(){
            return act_r_exit.nonblocking();}
    }

Controller Interfaces
• A controller interface defines the acceptable call sequences for the threads that use the controller
• Interfaces are specified using finite state machines
[Figure: interface state machine with states idle, reading, writing and transitions labeled r_enter, r_exit, w_enter, w_exit]

    public class RWStateMachine implements RWInterface{
        StateTable stateTable;
        final static int idle=0,reading=1,writing=2;
        public RWStateMachine(){ ...
            stateTable.insert("w_enter",idle,writing);
        }
        public void w_enter(){
            stateTable.transition("w_enter");
        }
        ...
    }

Modular Design / Modular Verification
[Figure: a Concurrent Program in which Thread 1, Thread 2, ..., Thread n access Shared Data through a Controller (with its Interface and Controller Behavior). Thread Modular Interface Verification checks each thread against an Interface Machine; Modular Behavior Verification checks the Controller Behavior.]

Behavior Verification
• Analyzing properties (specified in CTL) of the synchronization policy encapsulated with a concurrency controller and its interface
  – Verify the controller properties assuming that the user threads adhere to the controller interface
• Behavior verification with Action Language Verifier
  – We wrote a translator which translates controller classes to Action Language
  – Using counting abstraction we can check concurrency controller classes for an arbitrary number of threads

Interface Verification
• A thread is correct with respect to an interface if all the call sequences generated by the thread can also be generated by the interface machine
  – Checks if all the threads invoke controller methods in the order specified in the interfaces
  – Checks if the threads access shared data only at the correct interface states

Interface Verification
• Interface verification with Java PathFinder
  – Verify Java implementations of threads
  – Correctness criteria are specified as assertions
    • Look for assertion violations
    • Assertions are in the StateMachine and SharedStub
  – Performance improvement with thread isolation
    • thread modular verification

Thread Isolation: Part 1
• Interaction among threads
• Threads can interact with each other in only two ways:
  – invoking controller actions
  – invoking shared data methods
• To isolate the threads
  – Replace concurrency controllers with controller interface state machines
  – Replace shared data with shared stubs

Thread Isolation: Part 2
• Interaction among a thread and its environment
• Modeling thread’s calls to its environment with stubs
  – File I/O, updating GUI components, socket operations, RMI call to another program
    • Replace with
      pre-written or generated stubs
• Modeling the environment’s influence on threads with drivers
  – Thread initialization, RMI events, GUI events
    • Enclose with drivers that generate all possible events that influence controller access

Verification Framework
[Figure: verification framework for a Concurrent Program.
 – Behavior Verification: Controller Classes, Controller Behavior Machine, Counting Abstraction, Action Language Verifier
 – Interface Verification: Thread Classes, Thread Isolation, Controller Interface Machine, Java Path Finder]

A Case Study: TSAFE
Tactical Separation Assisted Flight Environment (TSAFE) functionality:
1. Display aircraft position
2. Display aircraft planned route
3. Display aircraft future projected route trajectory
4. Show conformance problems between planned and projected route

TSAFE Architecture
[Figure: architecture diagram with the labels User, Server, Client, Radar feed, Feed Parser, Flight Database, EventThread, Graphical Client, Computation, Timer, and connections marked <<TCP/IP>> and <<RMI>>]
21,057 lines of code with 87 classes

Reengineering TSAFE
• Found all the synchronization statements in the code (synchronized, wait, notify, notifyAll)
• Identified 6 shared objects protected by these synchronization statements
• Used 2 instances of a reader-writer controller and 3 instances of a mutex controller for synchronization
• In the reengineered TSAFE code the synchronization statements appear only in the Action helper class provided by the concurrency controller pattern

Behavior Verification Performance

    Controller  Time (sec)  Memory (MB)  P-Time (sec)  P-Memory (MB)
    RW          0.17        1.03         8.10          12.05
    Mutex       0.01        0.23         0.98          0.03
    Barrier     0.01        0.64         0.01          0.50
    BB-RW       0.13        6.76         0.63          10.80
    BB-Mutex    0.63        1.99         2.05          6.47

P denotes parameterized verification for an arbitrary number of threads

Interface Verification Performance

    Thread         Time (sec)  Memory (MB)
    TServer-Main   67.72       17.08
    TServer-RMI    91.79       20.31
    TServer-Event  6.57        10.95
    TServer-Feed   123.12      83.49
    TClient-Main   2.00        2.32
    TClient-RMI    17.06       40.96
    TClient-Event  663.21      33.09

Effectiveness in Finding Faults
• Created 40
  faulty versions of TSAFE by fault seeding
• Each version had at most one interface fault and at most one behavior fault
  – 14 behavior and 26 interface faults
• Among 14 behavior faults ALV identified 12 of them
  – 2 uncaught faults were spurious
• Among 26 interface faults JPF identified 21 of them
  – 2 of the uncaught faults were spurious
  – 3 of the uncaught faults were real faults that were not caught by JPF

Falsification Performance

    Thread         Time (sec)  Memory (MB)
    TServer-RMI    29.43       24.74
    TServer-Event  6.88        9.56
    TServer-Feed   18.51       94.72
    TClient-RMI    10.12       42.64
    TClient-Event  15.63       12.20

    Concurrency Controller  Time (sec)  Memory (MB)
    RW-8                    0.34        3.26
    RW-16                   1.61        10.04
    RW-P                    1.51        5.03
    Mutex-8                 0.02        0.19
    Mutex-16                0.04        0.54
    Mutex-P                 0.12        0.70

Conclusions
• ALV performance
  – Cost of parameterized verification was somewhere between concrete instances with 8 and 16 threads
  – Falsification performance was better than verification
• Completeness of the controller properties
  – Effectiveness of behavior verification by ALV critically depends on the completeness of the specified properties
• Concrete vs.
  parameterized behavior verification
  – When no faults are found, the result obtained with parameterized verification is stronger
  – However, for falsification we observed that concrete instances were as effective as parameterized instances

Conclusions
• JPF performance
  – Typically falsification performance is better than verification performance
  – In some cases faults caused execution of new code, causing the falsification performance to be worse than verification performance
• Thread isolation
  – Automatic environment generation for threads results in too much non-determinism and JPF runs out of memory
  – Dependency analysis was crucial for mitigating this
• Deep faults were difficult to catch using JPF
  – Three uncaught faults were created to test this

Conclusions
• Unknown shared objects
  – The presented approach does not handle this problem
  – Using escape analysis may help
    • We could not find a scalable and precise escape analysis tool
• Environment generation
  – This is the crucial problem in scalability of the interface verification
  – Using a design for verification approach for environment generation may help
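
To make the comparison between concrete and parameterized behavior verification more tangible, the sketch below checks one concrete instance of the read-write lock from this lecture. It is only an illustration under stated assumptions: ALV works symbolically, whereas this code enumerates states explicitly; the class RwLockReachability, its methods, and the writerChecksReaders flag (used to seed a fault by dropping the nr=0 guard of w_enter) are hypothetical names introduced here. The guards and updates follow the counting-abstraction specification of the Parameterized Read-Write Lock, and the loop in reachable is a forward fixpoint computation for a fixed number of threads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; this is not how ALV is implemented.
final class RwLockReachability {
    // A state is the integer vector [nr, busy (0/1), idle, reading, writing].
    static final int NR = 0, BUSY = 1, IDLE = 2, READING = 3, WRITING = 4;

    static List<Integer> state(int nr, boolean busy, int idle, int reading, int writing) {
        return Arrays.asList(nr, busy ? 1 : 0, idle, reading, writing);
    }

    /** Successor states under the four actions; the flag seeds a fault when false. */
    static List<List<Integer>> successors(List<Integer> s, boolean writerChecksReaders) {
        int nr = s.get(NR), idle = s.get(IDLE), reading = s.get(READING), writing = s.get(WRITING);
        boolean busy = s.get(BUSY) == 1;
        List<List<Integer>> out = new ArrayList<>();
        if (idle > 0 && !busy) {                                       // r_enter
            out.add(state(nr + 1, busy, idle - 1, reading + 1, writing));
        }
        if (reading > 0) {                                             // r_exit
            out.add(state(nr - 1, busy, idle + 1, reading - 1, writing));
        }
        if (idle > 0 && !busy && (nr == 0 || !writerChecksReaders)) {  // w_enter
            out.add(state(nr, true, idle - 1, reading, writing + 1));
        }
        if (writing > 0) {                                             // w_exit
            out.add(state(nr, false, idle + 1, reading, writing - 1));
        }
        return out;
    }

    /** Forward fixpoint: add successors until no new state appears. */
    static Set<List<Integer>> reachable(int numThreads, boolean writerChecksReaders) {
        List<Integer> init = state(0, false, numThreads, 0, 0);
        Set<List<Integer>> reached = new HashSet<>();
        Deque<List<Integer>> frontier = new ArrayDeque<>();
        reached.add(init);
        frontier.add(init);
        while (!frontier.isEmpty()) {
            for (List<Integer> next : successors(frontier.poll(), writerChecksReaders)) {
                if (reached.add(next)) {
                    frontier.add(next);
                }
            }
        }
        return reached;
    }

    /** Checks invariant(busy => nr=0) on every reachable state. */
    static boolean invariantHolds(int numThreads, boolean writerChecksReaders) {
        for (List<Integer> s : reachable(numThreads, writerChecksReaders)) {
            if (s.get(BUSY) == 1 && s.get(NR) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("correct lock, 3 threads: " + invariantHolds(3, true));
        System.out.println("seeded fault, 2 threads: " + invariantHolds(2, false));
    }
}
```

A result for 3 threads says nothing about 4 threads, which is the sense in which the parameterized result is stronger. The seeded fault, on the other hand, already shows up with 2 threads, in line with the observation that concrete instances were as effective for falsification.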