Modularity, Interfaces, and Verification

Tevfik Bultan
Department of Computer Science
University of California, Santa Barbara

Joint Work with My Students
• Action Language Verifier (ALV)
  – Tuba Yavuz-Kahveci, University of Florida, Gainesville
• Web Service Analysis Tool (WSAT)
  – Xiang Fu, Hofstra University (co-advised with Jianwen Su)
• Design for verification
  – Aysu Betin-Can, Middle East Technical University
• Interface Grammars
  – Graham Hughes, PhD candidate
http://www.cs.ucsb.edu/~bultan/vlab/publications.html

Outline
• Motivation
• PART 1: Verification of Synchronization Policies in Concurrent Programs via Interfaces
• PART 2: Verification of Conversations among Web Services via Interfaces
• PART 3: Modular Verification with Interface Grammars
• Conclusions

Model Checking Software
• Model checking
  – An automated software verification technique
  – Exhaustive exploration of the state space of a program to find bugs
• Systematically explore all possible behaviors of a program
  – look for violations of the properties of interest
    • assertion violations, deadlock
• Software model checkers: Verisoft, Java PathFinder (JPF), SLAM, BLAST, CBMC

Two Challenges in Model Checking
• State space explosion
  – Exponential increase in the state space with increasing number of variables and threads
    • State space includes everything: threads, variables, control stack, heap
• Environment generation
  – Finding models for parts of software that are
    • either not available for analysis, or
    • are outside the scope of the model checker

Modular Verification
• Modularity is key to scalability of any verification technique
  – Moreover, it can help in isolating the behavior you wish to focus on, removing the parts that are beyond the scope of your verification technique
• Modularity is also a key concept for successful software design
  – The question is finding effective ways of exploiting the modularity in software during verification

Interfaces for Modularity
• How do we do modular
verification?
  – Divide the software into a set of modules
  – Check each module in isolation
• How do we isolate a module during verification/testing?
  – Provide stubs representing other modules (environment)
• How do we get the stubs representing other modules?
  – Write interfaces
    • Interfaces specify the behavior of a module from the viewpoint of other modules
    • Generate stubs from the interfaces

Interfaces and Modularity: Basic Idea
1. Write interface specifications for the modules
2. Automatically generate stubs from the interface specifications
3. Automatically generated stubs provide the environment during modular verification

Three Applications
I will talk about three different instantiations of this basic idea:
1. Verification of synchronization policies in concurrent programs using finite state interfaces
2. Verification of conversations among web services using finite state interfaces
3. Verification of sequential interactions using interface grammars

PART 1
Concurrency Controller Pattern for Synchronization

An Infinite State Model Checker
[Figure: ALV architecture. An Action Language specification plus a CTL property passes through the Action Language parser to the model checker, which reports verified, a counter-example, or not sure. The model checker is built on the Composite Symbolic Library, which uses the Omega Library (Presburger arithmetic manipulator), the CUDD package (BDD manipulator), and MONA (automata manipulator).]

Read-Write Lock in Action Language
  module main()
    integer nr;
    boolean busy;
    restrict: nr>=0;
    initial: nr=0 and !busy;
    module ReaderWriter()
      enumerated state {idle, reading, writing};
      initial: state=idle;
      r_enter: state=idle and !busy and nr’=nr+1 and state’=reading;
      r_exit: state=reading and nr’=nr-1 and state’=idle;
      w_enter: state=idle and !busy and nr=0 and busy’ and state’=writing;
      w_exit: state=writing and !busy’ and state’=idle;
      ReaderWriter: r_enter | r_exit | w_enter | w_exit;
    endmodule
    main: ReaderWriter*();
    spec: invariant(busy => nr=0)
    spec: invariant(busy => eventually(!busy))
  endmodule
Annotations on the slide:
• S: the Cartesian product of the variable domains defines the set of states
• I: predicates defining the initial states
• R: atomic actions of a single process (r_enter, r_exit, w_enter, w_exit)
• R: transition relation of a process, defined as the asynchronous composition of its atomic actions
• R: transition relation of main, defined as the asynchronous composition of a finite but arbitrary number of reader-writer modules

Read-Write Lock in Java
  class ReadWriteLock {
    private Object lockObj;
    private int totalReadLocksGiven;
    private boolean writeLockIssued;
    private int threadsWaitingForWriteLock;
    public ReadWriteLock() {
      lockObj = new Object();
      writeLockIssued = false;
    }
    public void getReadLock() {
      synchronized (lockObj) {
        while ((writeLockIssued) || (threadsWaitingForWriteLock != 0)) {
          try {
            lockObj.wait();
          } catch (InterruptedException e) {
          }
        }
        totalReadLocksGiven++;
      }
    }
    public void getWriteLock() {
      synchronized (lockObj) {
        threadsWaitingForWriteLock++;
        while ((totalReadLocksGiven != 0) || (writeLockIssued)) {
          try {
            lockObj.wait();
          } catch (InterruptedException e) {
            //
          }
        }
        threadsWaitingForWriteLock--;
        writeLockIssued = true;
      }
    }
    ...
  }
[Figure: "Verification of Synchronization in Java Programs", with the Java code on one side and the Action Language Verifier on the other.]
How do we translate this to Action Language?

A Design for Verification Approach
Our design for verification approach is based on the following principles:
1. Use of design patterns that facilitate automated verification
2. Use of stateful, behavioral interfaces which isolate the behavior and enable modular verification
3. An assume-guarantee style modular verification strategy that separates verification of the behavior from the verification of the conformance to the interface specifications
4. A general model checking technique for interface verification
5.
Domain specific and specialized verification techniques for behavior verification

Concurrency Controller Pattern
[Figure: class diagram of the concurrency controller pattern. ThreadA and ThreadB use Shared (+a(), +b()), which has a SharedStub counterpart (+a(), +b()). Controller (-var1, -var2, +action1(), +action2()) has a ControllerStateMachine counterpart (+action1(), +action2()) based on StateMachine. Helper classes: Action (+blocking(), +nonblocking(), -GuardedExecute) and GuardedCommand (+guard(), +update()). A legend marks classes as used at runtime, used during interface verification, or used both times.]

Concurrency Controller Pattern
• Avoids usage of error-prone Java synchronization primitives: synchronize, wait, notify
• Separates controller behavior from the threads that use the controller
  – Supports a modular verification approach that exploits this modularity for scalable verification

Reader-Writer Controller
  class RWController implements RWInterface{
    int nR; boolean busy;
    final Action act_r_enter, act_r_exit;
    final Action act_w_enter, act_w_exit;
    RWController() {
      ...
      gcs = new Vector();
      gcs.add(new GuardedCommand() {
        public boolean guard(){
          return (nR == 0 && !busy);}
        public void update(){busy = true;}}
      );
      act_w_enter = new Action(this,gcs);
    }
    public void w_enter(){
      act_w_enter.blocking();}
    public boolean w_exit(){
      return act_w_exit.nonblocking();}
    public void r_enter(){
      act_r_enter.blocking();}
    public boolean r_exit(){
      return act_r_exit.nonblocking();}
  }

This helper class is provided. No need to rewrite it!
  class Action{
    protected final Object owner; …
    private boolean GuardedExecute(){
      boolean result=false;
      for(int i=0; i<gcV.size(); i++) try{
        if(((GuardedCommand)gcV.get(i)).guard()){
          ((GuardedCommand)gcV.get(i)).update();
          result=true; break; }
      }catch(Exception e){}
      return result;
    }
    public void blocking(){
      synchronized(owner) {
        while(!GuardedExecute()) {
          try{owner.wait();}
          catch (Exception e){} }
        owner.notifyAll(); }
    }
    public boolean nonblocking(){
      synchronized(owner) {
        boolean result=GuardedExecute();
        if (result) owner.notifyAll();
        return result; }
    }
  }

Controller Interfaces
• A controller interface defines the acceptable call sequences for the threads that use the controller
• Interfaces are specified using finite state machines
  public class RWStateMachine implements RWInterface{
    StateTable stateTable;
    final static int idle=0,reading=1,writing=2;
    public RWStateMachine(){ ...
      stateTable.insert("w_enter",idle,writing);
    }
    public void w_enter(){
      stateTable.transition("w_enter");
    }
    ...
  }
[Figure: interface state machine with states idle, reading, and writing. r_enter goes from idle to reading, r_exit from reading to idle, w_enter from idle to writing, w_exit from writing to idle.]

Modular Design / Modular Verification
[Figure: a concurrent program consisting of threads 1..n, a controller, and shared data. Modular interface verification is thread modular: each thread is checked separately against the interface machine. Modular behavior verification checks the controller behavior against the interface machines.]

Verification Framework
[Figure: two verification paths. Behavior verification: controller classes, controller behavior machine, counting abstraction, Action Language Verifier. Interface verification: thread classes of the concurrent program, thread isolation, thread class with the controller interface machine, Java Path Finder.]

A Case Study
• Tactical Separation Assisted Flight Environment (TSAFE)
  – 21,057 lines of code with 87 classes
• Reengineered TSAFE using 2 instances of a reader-writer controller and 3 instances of a mutex controller
  – In the reengineered version the synchronization statements appear only in the Action helper class
• Created 40 faulty versions of TSAFE by fault seeding
  – Caught 33 of them using interface and behavior verification
  – 4 of the uncaught faults were spurious
  – 3 uncaught faults were real faults and they were interface violations

Behavior Verification Performance
  Controller   Time (sec)   Memory (MB)   P-Time (sec)   P-Memory (MB)
  RW           0.17         1.03          8.10           12.05
  Mutex        0.01         0.23          0.98           0.03
  Barrier      0.01         0.64          0.01           0.50
  BB-RW        0.13         6.76          0.63           10.80
  BB-Mutex     0.63         1.99          2.05           6.47
P denotes parameterized verification for arbitrary number of threads

Interface Verification Performance
  Thread          Time (sec)   Memory (MB)
  TServer-Main    67.72        17.08
  TServer-RMI     91.79        20.31
  TServer-Event   6.57         10.95
  TServer-Feed    123.12       83.49
  TClient-Main    2.00         2.32
  TClient-RMI     17.06        40.96
  TClient-Event   663.21       33.09

Falsification Performance
  Thread          Time (sec)   Memory (MB)
  TServer-RMI     29.43        24.74
  TServer-Event   6.88         9.56
  TServer-Feed    18.51        94.72
  TClient-RMI     10.12        42.64
  TClient-Event   15.63        12.20

  Concurrency Controller   Time (sec)   Memory (MB)
  RW-8                     0.34         3.26
  RW-16                    1.61         10.04
  RW-P                     1.51         5.03
  Mutex-8                  0.02         0.19
  Mutex-16                 0.04         0.54
  Mutex-P                  0.12         0.70

Observations
•
Falsification performance is better than verification performance
• Completeness of the controller properties is an issue
• For falsification, concrete instances were as effective as parameterized instances
• Unknown shared objects: need escape analysis
• Thread isolation is challenging
• Environment generation is the crucial problem in scalability of the interface verification

PART 2
Peer Controller Pattern for Web Services

Web Services
[Figure: web service standards stack. Data: XML. Type: XML Schema. Message: SOAP. Service: WSDL. Composition: WSBPEL. Interaction: WSCDL. Implementation platforms: Microsoft .Net, Sun J2EE.]
• Loosely coupled, interaction through standardized interfaces
• Standardized data transmission via XML
• Asynchronous messaging
• Platform independent (.NET, J2EE)

Web Service Conversations
• A composite web service consists of
  – a finite set of peers
  – and a finite set of message classes
• The messages among the peers are exchanged using reliable and asynchronous messaging
  – FIFO and unbounded message queues
• A conversation is a sequence of messages generated by the peers during an execution

Checking Conversations
[Figure: three peers (Customer, Agent, Hotel) exchanging the messages query, suggest, reserve, and confirm. Each peer is a state machine with send (!) and receive (?) transitions and an input queue. The question posed is whether the conversations satisfy G(query ⇒ F(confirm)).]
This is an undecidable verification problem due to unbounded queues

Synchronizability Analysis
• A composite web service is synchronizable, if its conversation set does not change
  – when asynchronous communication is replaced with synchronous communication
• If a composite web service is synchronizable we can check its properties about its conversations using synchronous communication semantics
  – For finite state peers this is a finite state model checking problem

Web Service Analysis Tool (WSAT)
[Figure: WSAT architecture. Front end: BPEL web services (bottom-up) go through BPEL to GFSA translation, and conversation protocols (top-down) go through the GFSA parser. Intermediate representation: guarded automata. Analysis: synchronizability analysis and realizability analysis, with success, fail, and skip outcomes. Back end: GFSA to Promela translation with synchronous communication, with bounded queues, or as a single process with no communication. Verification language: Promela.]

Checking Service Implementations
There are some problems:
• People write web service implementations using programming languages such as Java, C#, etc.
• Synchronizability analysis works on state machine models
• How do we generate the state machines from the Java code?
[Figure labels: Synchronizability Analysis; Checking Service Implementations]

Design for Verification Approach
Use the same principles:
1. Use of design patterns that facilitate automated verification
2. Use of stateful, behavioral interfaces which isolate the behavior and enable modular verification
3. An assume-guarantee style modular verification strategy that separates verification of the behavior from the verification of the conformance to the interface specifications
4. A general model checking technique for interface verification
5.
Domain specific and specialized verification techniques for behavior verification

Peer Controller Pattern
[Figure: class diagram of the peer controller pattern with ApplicationThread, Communicator, PeerServlet, CommunicationInterface, CommunicationController, StateMachine, ThreadContainer, and sessionId. A legend marks classes as used at runtime, used during interface verification, or used both times. Red-bordered classes are the ones the user has to implement.]

Modular Design / Modular Verification
[Figure: a composite service consisting of peers 1..n, each with an interface. Modular interface verification is peer modular: each peer is checked separately against its interface machine. Modular conversation verification checks the conversation behavior of the interface machines.]

Verification Framework
[Figure: two verification paths. Conversation verification: peer state machines of the composite service, WSAT synchronizability analysis, Promela translation, Spin. Interface verification: peer code with the peer state machine, Java Path Finder.]

Examples
• We used this approach to implement several simple web services
  – Travel agency
  – Loan approver
  – Product ordering
• Performance of both interface and behavior verification was reasonable

Interface Verification
Interface Verification with JPF for Loan Approver
  Threads         T (sec)   M (MB)
  Customer        8.86      3.84
  Loan Approver   9.65      4.7
  Risk Assessor   8.15      3.64

Behavior Verification
• Sample Property: Whenever a request with a small amount is sent, eventually an approval message accepting the loan request will be sent.
• Loan Approval system has 154 reachable states
  – because queue lengths never exceed 1
• Behavior verification used <1 sec and 1.49 MB
• SPIN requires restricted domains
  – Have to bound the channel sizes (bounded message queues)
• In general there is no guarantee these results will hold for other queue sizes
  – Using synchronizability analysis we use queues of size 0 and still guarantee that the verification results hold for unbounded queues!
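
To make the definition of synchronizability from the slides above concrete, here is a minimal Java sketch that enumerates the conversations of two tiny peers under synchronous communication (queue size 0) and under asynchronous communication with a bounded FIFO input queue, and compares the two sets. It is only a brute-force illustration of the definition, not the analysis WSAT performs. The class and method names (`Synchronizability`, `Tr`, `Peer`, `conversations`, `requestReply`, `crossingSends`) are hypothetical, the sketch assumes exactly two acyclic peers, and comparing against a bound of 1 can expose a difference but does not by itself prove anything about unbounded queues.

```java
import java.util.*;

public class Synchronizability {
    /** One peer transition: in state `from`, send (!) or receive (?) `msg`, then go to `to`. */
    static final class Tr {
        final int from, to; final boolean send; final String msg;
        Tr(int from, boolean send, String msg, int to) {
            this.from = from; this.send = send; this.msg = msg; this.to = to;
        }
    }

    /** A peer state machine; the initial state is 0 and `fin` is its only final state. */
    static final class Peer {
        final int fin; final List<Tr> trs;
        Peer(int fin, Tr... trs) { this.fin = fin; this.trs = Arrays.asList(trs); }
    }

    /**
     * Conversations (send sequences of complete runs) of two peers.
     * bound == 0: synchronous communication (send and receive happen together).
     * bound > 0: each peer has a FIFO input queue holding at most `bound` messages.
     * Assumes acyclic peers (illustration only) so the exhaustive search terminates.
     */
    static Set<String> conversations(Peer[] p, int bound) {
        Set<String> out = new TreeSet<>();
        List<Deque<String>> q = new ArrayList<>();
        q.add(new ArrayDeque<>());
        q.add(new ArrayDeque<>());
        explore(p, new int[2], q, bound, "", out);
        return out;
    }

    private static void explore(Peer[] p, int[] st, List<Deque<String>> q,
                                int bound, String conv, Set<String> out) {
        // A run is complete when both peers are final and both queues are empty.
        if (st[0] == p[0].fin && st[1] == p[1].fin && q.get(0).isEmpty() && q.get(1).isEmpty())
            out.add(conv);
        for (int i = 0; i < 2; i++) {
            int j = 1 - i;                       // the other peer receives what peer i sends
            for (Tr t : p[i].trs) {
                if (t.from != st[i]) continue;
                String next = conv.isEmpty() ? t.msg : conv + " " + t.msg;
                if (t.send && bound == 0) {      // synchronous: needs a matching receive now
                    for (Tr r : p[j].trs) {
                        if (r.from != st[j] || r.send || !r.msg.equals(t.msg)) continue;
                        int si = st[i], sj = st[j];
                        st[i] = t.to; st[j] = r.to;
                        explore(p, st, q, bound, next, out);
                        st[i] = si; st[j] = sj;
                    }
                } else if (t.send && q.get(j).size() < bound) {   // asynchronous send
                    int si = st[i];
                    q.get(j).addLast(t.msg); st[i] = t.to;
                    explore(p, st, q, bound, next, out);
                    st[i] = si; q.get(j).removeLast();
                } else if (!t.send && bound > 0 && t.msg.equals(q.get(i).peekFirst())) {
                    int si = st[i];              // asynchronous receive from own queue head
                    q.get(i).removeFirst(); st[i] = t.to;
                    explore(p, st, q, bound, conv, out);
                    st[i] = si; q.get(i).addFirst(t.msg);
                }
            }
        }
    }

    /** Peer A: !a ?b. Peer B: ?a !b. */
    static Peer[] requestReply() {
        return new Peer[] {
            new Peer(2, new Tr(0, true, "a", 1), new Tr(1, false, "b", 2)),
            new Peer(2, new Tr(0, false, "a", 1), new Tr(1, true, "b", 2)) };
    }

    /** Peer A: !a ?b. Peer B: !b ?a. Both peers send before they receive. */
    static Peer[] crossingSends() {
        return new Peer[] {
            new Peer(2, new Tr(0, true, "a", 1), new Tr(1, false, "b", 2)),
            new Peer(2, new Tr(0, true, "b", 1), new Tr(1, false, "a", 2)) };
    }

    public static void main(String[] args) {
        for (Peer[] ps : Arrays.asList(requestReply(), crossingSends())) {
            Set<String> sync = conversations(ps, 0), async1 = conversations(ps, 1);
            System.out.println("sync=" + sync + " async(1)=" + async1
                    + " same=" + sync.equals(async1));
        }
    }
}
```

For the request/reply pair both semantics yield the single conversation "a b". For the pair where both peers send first, the synchronous composition deadlocks immediately and has no conversation, while queues of size 1 allow "a b" and "b a", so the conversation set changes with the communication semantics.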
Observations
• Once the behavior is isolated (using concurrency controller or peer controller patterns) behavior verification is quite efficient
• Interface verification is challenging
• It is necessary to find effective behavioral interface specification and verification techniques

PART 3
Interface Grammars

Interface Grammars
[Figure: an interface grammar for Component B is given to the interface compiler, which produces a Component B stub. The model checker then analyzes Component A together with the stub.]

An Example
• An interface grammar for transactions
  – Specifies the appropriate ordering for method calls to a transaction manager
      Start → Base
      Base  → begin Tail Base | ε
      Tail  → commit | rollback
  – Method calls are the terminal symbols of the interface grammar

An Example
• Consider the call sequence
      begin rollback begin commit
• Here is a derivation:
      Start ⇒ Base
            ⇒ begin Tail Base
            ⇒ begin rollback Base
            ⇒ begin rollback begin Tail Base
            ⇒ begin rollback begin commit Base
            ⇒ begin rollback begin commit

Another Example
• This interface can also be specified as a Finite State Machine (FSM)
  [Figure: FSM with transitions labeled begin, commit, and rollback]
• However, the following grammar, which specifies nested transactions, cannot be specified as a FSM
      Start → Base
      Base  → begin Base Tail Base | ε
      Tail  → commit | rollback

Yet Another Example
• Let’s add another method called setrollbackonly which forces all the pending transactions to finish with rollback instead of commit
• We achieve this by extending the interface grammars with semantic predicates and semantic actions
      Start → «r:=false; l:=0» Base
      Base  → begin «l:=l+1» Base Tail «l:=l-1; if l=0 then r:=false» Base
            | setrollbackonly «r:=true» Base
            | ε
      Tail  → «r=false» commit
            | rollback

Interface Grammar Translation
• Our interface compiler translates interface grammars to executable code:
  – the generated code is the stub for the component
• The generated code is a parser that
  – parses the incoming method calls
  – while making sure that the incoming method calls conform to the interface grammar specification

Verification with Interface Grammars
[Figure: the interface compiler turns the interface grammar into a component stub. The stub is a top-down parser with a parse table, a parser stack, and semantic predicates and semantic actions. Method invocations from the program act as the lookahead. The model checker runs the program together with the stub.]

A Case Study
• We wrote an interface grammar for the EJB 3.0 Persistence API
  – This is an API specification for mapping Java object graphs to a relational database
  – Hibernate is an implementation of this API
• Hibernate distribution contains several example test clients that are designed to fail and test exceptional behavior by violating the interface specification

A Case Study, Continued
• We used these simple clients to check the fidelity of the stub generated from our interface specification
  – We used the JPF software model checker
• None of these examples can run under JPF directly
• Time taken to develop the interface was dominated by the need to understand EJB Persistence first
  – about a couple of hours

Experiments: Falsification
  Client 1 (sec)   Client 2 (sec)   Client 3 (sec)   Client 4 (sec)   # obj.   # iter.
  1.9              1.8              1.8              2.1              1        1
  1.9              1.8              1.8              2.1              1        5
  1.9              1.8              1.8              2.1              5        1
  1.9              1.8              1.8              2.1              5        5

Experiments: Verification
  Client 1 (sec)   Client 2 (sec)   Client 3 (sec)   Client 4 (sec)   # obj.   # iter.
  3.1              1.9              1.8              2.5              1        1
  26.9             11.1             9.9              17.1             1        5
  10.1             5.9              4.2              11.8             5        1
  126.3            63.9             43.5             138.5            5        5

A Case Study, Continued
• For these simple clients, interface violations can be detected by JPF in a couple of seconds using the EJB stub generated by our interface compiler
  – Falsification time does not increase with the number of operations executed or the number of objects created by the clients
• When we fix the errors, JPF is able to verify the absence of interface violations
  – Verification time increases with the number of operations executed or the number of objects created by the clients

Interface Grammars: Uni/Bidirectional
• Interface grammars can be
  – Unidirectional: No callbacks
    [Figure: Caller, Interface, Callee]
  – Bidirectional: Need to handle callbacks
    [Figure: Comp A, Interface, Comp B]

Interface Grammars: Client/Server
• Interface grammars can be used for
  – Client verification: Generate a stub for the server
  – Server verification: Generate a driver for the server
[Figure: the interface compiler takes the interface and generates a stub that stands in for the server when checking the client, and a driver that exercises the server.]

Semantic Elements in JML
• To handle both client and server side verification
  – We need to generate stubs and drivers from the same specification
• Semantic predicate in one direction becomes a semantic action in the other direction and vice versa
• We focused on a subset of Java Modeling Language
  – A restricted subset that is reversible
  – Semantic predicates and actions are written in this subset
  – Interface compiler automatically generates code from them both for client and server side verification

Interface Grammars and Data
• A crucial part of the interface specification is specifying the allowable values for the method arguments and generating allowable return values
• Approach 1: These can be specified in the semantic actions and semantic predicates of the interface grammars
• Approach 2: Can we specify the constraints about the arguments and return values using the grammar rules?
  – Yes, grammar productions can be used to specify the structure of most recursive data structures.

Checking Arguments
• A crucial part of the interface specification is specifying the allowable values for the method arguments and generating allowable return values
• In what I have discussed so far, all of this is done in the semantic actions and semantic predicates
• The question is, can we specify the constraints about the arguments and return values using the grammar rules?
  – Recursive data structures are especially good candidates for this!

Shape Types
• Shape types [Fradet, Metayer, POPL 97] provide a formalism for specifying recursive data structures
• It is a specification formalism based on graph grammars
• Shape types can be used to specify the connections among the heap allocated objects
• Objects become the parameters of the nonterminals and the constraints on the connections among the objects are specified on the right-hand-sides of the grammar rules (similar to semantic predicates)

Shape Type for Doubly Linked List
      Doubly → p x, prev x null, L x
      L x    → next x y, prev y x, L y
      L x    → next x null
[Figure: a doubly linked list with nodes 1, 2, 3, 4 connected by next and prev pointers, with p pointing to node 1]
Derivation for this list:
      Doubly ⇒ p 1, prev 1 null, L 1
      L 1    ⇒ next 1 2, prev 2 1, L 2
      L 2    ⇒ next 2 3, prev 3 2, L 3
      L 3    ⇒ next 3 4, prev 4 3, L 4
      L 4    ⇒ next 4 null

Shape Type for Binary Tree
      Bintree → p x, B x
      B x     → left x y, right x z, B y, B z
      B x     → left x null, right x null
[Figure: a binary tree with nodes 1 to 5 connected by left and right pointers, with p pointing to node 1]

Extension to Interface Grammars
• In order to support shape types we extend the interface grammars as follows:
  – We allow nonterminals with parameters
• This extension is sufficient since the constraints about the connections among the objects can be stated using semantic predicates and semantic actions

Interface Grammars + Shape Types
Shape type:
      Doubly → p x, prev x null, L x
      L x    → next x y, prev y x, L y
      L x    → next x null
Interface grammar:
      Doubly[x] → ensure x == \new(Node) && x.getPrev() == null; L[x]
      L[x]      → Node y; ensure y == \new(Node) && x.getNext() == y && y.getPrev() == x; L[y]
                | ensure x.getNext() == null;

Interface Grammars + Shape Types
Shape type:
      Bintree → p x, B x
      B x     → left x y, right x z, B y, B z
      B x     → left x null, right x null
Interface grammar:
      Bintree[x] → ensure x == \new(Node); B[x]
      B[x]       → Node y, z; ensure y == \new(Node) && z == \new(Node) && x.getLeft() == y && x.getRight() == z; B[y] B[z]
                 | ensure x.getLeft() == null && x.getRight() == null;

Object Generation vs. Validation
• The use of shape types in interface grammars has two purposes
  – For the objects that are passed as method arguments we need to check that their shape is allowed by the shape type
    • We call this object validation
  – For the objects that are returned by the component we need to generate an object that is allowed by the shape type
    • We call this object generation

Object Generation vs. Validation
• Object generation and validation tasks are broadly symmetric
  – The set of nonterminals and productions used for object generation and validation are the same and are dictated by the shape type specification
  – In object generation semantic actions are used to set the fields of objects to appropriate values dictated by the shape type specification
  – In object validation these are constraints that are checked using semantic predicates specified as guards
• Given the semantic elements specified in JML, our interface compiler generates code both for object generation and validation

Object Generation vs. Validation
• There is a minor problem with object validation
• In shape type specifications, the assumption is that there is no aliasing among the objects unless it is explicitly specified
• This assumption is easy to enforce during object generation since every new statement creates a new object that has nothing else pointing to it
• In order to enforce the same constraint during object validation we need to make sure that there is no unspecified aliasing
  – This can be enforced by using a hash-set for storing and propagating all the observed objects

Modular Verification of Web Services
• We applied our modular verification approach based on interface grammars to both client and server side verification of Web services

Interface Grammars and Web Services
Our approach:
1. A WSDL-to-interface grammar translator automatically generates grammar productions that generate and/or validate XML arguments and return values
2. User adds control flow constraints by modifying the grammar
3. Interface compiler automatically generates a stub for client side verification and a driver for server-side verification

Interface Grammars for Web Services

Another Case Study: AWS-ECS
• We tested the Amazon E-Commerce Web Service (AWS-ECS) using our approach
• AWS-ECS WSDL specification lists 40 operations
  – that provide differing ways of searching Amazon’s product database
• We focus on the core operations:
  – ItemSearch, CartCreate, CartAdd, CartModify, CartGet, CartClear

Client-side Verification
• For client verification we used a demonstration client provided by Amazon
• This client does not check any constraints such as
  – You should not try to insert an item into a shopping cart before creating a shopping cart
• When such requests are sent to AWS-ECS, it returns an error message
• Using our approach we can easily check if the client allows such erroneous requests
• Falsification time depends on the type of faults we are looking for (data or control errors) and ranges from 10 to 60
seconds

AWS-ECS: Server Verification
• Our interface compiler automatically generates a driver that sends sequences of requests to AWS-ECS server and checks that the return values conform to the interface specification
• The driver is a sentence generator
  – It generates sequences of SOAP requests based on the interface specification
• We used two algorithms for sentence generation:
  – A random sentence generation algorithm
  – Purdom’s algorithm: A directed sentence generation algorithm that guarantees production coverage

Directed Sentence Generation
• Number of sentences generated: 5
• Average derivation length: 24
• Average number of SOAP requests/responses: 3.8
• Verification time: 20 seconds

Random Sentence Algorithm
• Number of sentences generated: 100
• Average derivation length: 17.5
• Average number of SOAP requests/responses: 3.2
[Figure: two plots for random sentence generation. One shows time in seconds against the number of executions. The other shows the percentage of covered productions and the percentage of covered nonterminals against the number of executions.]

Server-side verification
• We found two errors during server side verification
  – Errors were discovered within 10 seconds
• These errors indicate a mismatch between the interface specification and the server implementation
• It may mean that we misunderstood the description of the Web service interface
• It may also mean that there is an error in the service implementation

Conclusions
• Modular verification is a necessity
• Interfaces are crucial for modular verification
• Interface grammars provide a new specification mechanism for interfaces
• Interface grammars can be used for automated stub and driver generation leading to modular verification

Conclusions
• Behavioral interfaces can be useful both
  – From software design perspective by enabling better modular designs
  – From verification perspective by enabling more efficient modular verification
• The challenge is to find behavioral interface specification mechanisms that serve both of these goals

Related Work: Modular Verification
• Clarke, Long, McMillan, Compositional Model Checking
• Henzinger, Qadeer, Rajamani, Assume Guarantee Reasoning in Model Checking
• Flanagan, Qadeer, Thread-Modular Model Checking
• Krishnamurthi, Fisler, Modular Verification of Feature Oriented Programs

Related Work: Design for Verification
• Meyer, Design by Contract
• Flanagan, Leino, et al. ESC Java
• Mehlitz, Penix, Design for Verification Using Design Patterns
• Sarna-Starosta, Stirewalt, Dillon, Design for Verification for Synchronization

Related Work: Interfaces
• L. de Alfaro and T. A. Henzinger. Interface automata.
• O. Tkachuk, M. B. Dwyer, and C. Pasareanu. Automated environment generation for software model checking.
• T. Ball and S. K. Rajamani. SLAM interface specification language.
• G. T. Leavens et al.: JML

Related: Grammar-based Testing
• A. G. Duncan, J. S. Hutchison: Using attributed grammars to test designs and implementations
• P. M. Maurer: Generating test data with enhanced context free grammars
• P. M. Maurer: The design and implementation of a grammar-based data generator
• E. G. Sirer and B. N. Bershad: Using production grammars in software testing