Tools for Specification, Verification, and Synthesis of Concurrency Control Components

Tevfik Bultan
Department of Computer Science
University of California, Santa Barbara
bultan@cs.ucsb.edu
http://www.cs.ucsb.edu/~bultan/
http://www.cs.ucsb.edu/~bultan/composite/

Students
Tuba Yavuz-Kahveci
Xiang Fu
Constantinos Bartzis
Murat Tuncer
Aysu Betin

Problem
Concurrent programming is difficult and error prone
– In sequential programming you only worry about the “states” of the variables; in concurrent programming you also have to worry about the “states” of the processes
When there is concurrency, testing is not enough
– State space increases exponentially with the number of processes
We would like to guarantee certain properties of a concurrent system

Airport Ground Traffic Control
[Figure: A simplified model of Seattle Tacoma International Airport from [Zhong 97]]

Airport Ground Traffic Control Simulator
Simulate behavior of each airplane with a thread
Use a monitor which keeps track of number of airplanes on each runway and each taxiway
Use guarded commands (which will become the procedures of the monitor) to enforce the control logic

Tools for Specification, Verification, and Synthesis of Reactive Systems
[Diagram components: Action Language specification, Action Language Parser, Action Language Verifier, Code Generator, Verified code, Composite Symbolic Library, Omega Library, CUDD Package]

Applications
Safety-critical system specifications
– SCR (tabular), Statecharts (hierarchical state machines) specifications [Bultan, Gerber, League ISSTA98, TOSEM00]
Concurrent programs
– Synthesizing verified monitor classes from specifications [Yavuz-Kahveci, Bultan, 02]
Protocol verification
– Verification of parameterized cache coherence protocols using counting abstraction [Delzanno, Bultan CP01]
Verification of workflow specifications
– Verification of acyclic decision flows [Fu, Bultan, Hull, Su TACAS01]

Outline
Specification Language: Action Language
Verification Engine
Synthesizing Verified Monitors
Conclusions

Some Terminology
Model Checker: A program that checks if a (reactive) system satisfies a (temporal) property
Reactive System: A system which continuously interacts with its environment without terminating
– Protocols
– Requirements specifications for safety critical systems
– Concurrent programs
Temporal Property: A property expressed using temporal operators such as “invariant” or “eventually”

Expressing Properties
Properties of reactive systems are expressed in temporal logics using temporal operators
Invariant(p) : is true in a state if property p is true in every state on all execution paths starting at that state
Eventually(p) : is true in a state if property p is true at some state on every execution path starting from that state

Action Language
A state based language
– Actions correspond to state changes
States correspond to valuations of variables
– Integer (possibly unbounded), boolean and enumerated variables
  • Recently, we added heap variables (i.e., pointers)
– Parameterized constants (verified for every possible value of the constant)
Transition relation is defined using actions
– Atomic actions: Predicates on current and next state variables
– Action composition:
  • synchronous (&) or asynchronous (|)
Modular
– Modules can have submodules
– A module is defined as synchronous and asynchronous compositions of its actions and submodules

Simple Example
  module main()
    integer a,b,c,r;
    restrict a>=0 and b>=0 and c>=0;
    initial r=0;
    module max(x,y,result)
      integer x,y,result;
      boolean pc;
      initial pc = true;
      a1: pc and (x >= y) and result' = x and !pc';
      a2: pc and (y >= x) and result' = y and !pc';
      max: a1 | a2;
      spec: invariant(!pc => (result>=x and result>=y))
    endmodule
    main: max(a,r,r) | max(b,r,r) | max(c,r,r)
    spec: eventually(r>=a and r>=b and r>=c)
  endmodule

Simple Example
Action Language Verifier automatically verifies given temporal properties
If there is an error:
  a1: pc and (x > y) and result' = x and !pc';
  a2: pc and (y > x) and result' = y
      and !pc';
Action Language Verifier automatically generates a counter-example: An execution sequence where x is equal to y

Model Checking View
Every reactive system
– safety-critical software specification,
– cache coherence protocol,
– mutual exclusion algorithm, etc.
is represented as a transition system:
– S : The set of states
– I ⊆ S : The set of initial states
– R ⊆ S × S : The transition relation

Readers Writers Solution in Action Language
  module main()
    integer nr;
    boolean busy;
    restrict: nr>=0;
    initial: nr=0 and !busy;
    module Reader()
      boolean reading;
      initial: !reading;
      rEnter: !reading and !busy and nr'=nr+1 and reading';
      rExit: reading and !reading' and nr'=nr-1;
      Reader: rEnter | rExit;
    endmodule
    module Writer()
      boolean writing;
      initial: !writing;
      wEnter: !writing and nr=0 and !busy and busy' and writing';
      wExit: writing and !writing' and !busy';
      Writer: wEnter | wExit;
    endmodule
    main: Reader() | Reader() | Writer() | Writer();
    spec: invariant([busy => nr=0])
  endmodule

A Closer Look
  module main()
    integer nr;
    boolean busy;
    restrict: nr>=0;
    initial: nr=0 and !busy;
    module Reader()
      boolean reading;
      initial: !reading;
      rEnter: !reading and !busy and nr'=nr+1 and reading';
      rExit: reading and !reading' and nr'=nr-1;
      Reader: rEnter | rExit;
    endmodule
    module Writer()
      ...
    endmodule
    main: Reader() | Reader() | Writer() | Writer();
    spec: invariant([busy => nr=0])
  endmodule
Annotations on the specification:
– S : Cartesian product of variable domains defines the set of states (the variable declarations)
– I : Predicates defining the initial states (the initial: clauses)
– R : Atomic actions of the Reader (rEnter and rExit)
– R : Transition relation of Reader defined as asynchronous composition of its atomic actions (Reader: rEnter | rExit)
– R : Transition relation of main defined as asynchronous composition of two Reader and two Writer processes (the main: clause)

Actions in Action Language
Atomic actions: Predicates on current and next state variables
– Current state variables: reading, nr, busy
– Next state variables: reading', nr', busy'
– Logical operators: not (!), and (&&), or (||)
– Equality: = (for all variable types)
– Linear arithmetic: <, >, >=, <=, +, * (by a constant)
An atomic action: !reading and !busy and nr'=nr+1 and reading'

Asynchronous Composition
Asynchronous composition is equivalent to disjunction if composed actions have the same next state variables
  a1: i > 0 and i' = i + 1;
  a2: i <= 0 and i' = i - 1;
  a3: a1 | a2
is equivalent to
  a3: (i > 0 and i' = i + 1) or (i <= 0 and i' = i - 1);

Asynchronous Composition
Asynchronous composition preserves values of variables which are not explicitly updated
  a1 : i > j and i' = j;
  a2 : i <= j and j' = i;
  a3 : a1 | a2;
is equivalent to
  a3 : (i > j and i' = j) and j' = j
       or (i <= j and j' = i) and i' = i

Synchronous Composition
Synchronous composition is equivalent to conjunction if two actions do not disable each other
  a1: i' = i + 1;
  a2: j' = j + 1;
  a3: a1 & a2;
is equivalent to
  a3: i' = i + 1 and j' = j + 1;

Synchronous Composition
A disabled action does not block synchronous composition
  a1: i < max and i' = i + 1;
  a2: j < max and j' = j + 1;
  a3: a1 & a2;
is equivalent to
  a3: (i < max and i' = i + 1 or i >= max and i' = i)
      and (j < max and j' = j + 1 or j >= max and j' = j);

Model Checking
Given a program and a temporal property p:
Either show that all the initial states satisfy the temporal property p
– set of initial states ⊆ truth set of p
Or find an initial state which does not satisfy the property p
– a state ∈ set of initial states ∩ truth set of ¬p
– and generate a counter-example starting from that state

Temporal Properties Fixpoints
States that satisfy Invariant(p) are all the states which are not in Reach(¬p): The states that can reach ¬p
Reach(¬p) can be computed as the fixpoint of the following functional:
  F(states) = ¬p ∪ reach-in-one-step(states)
We call reach-in-one-step the backward image operation
Actually, Reach(¬p) is the least-fixpoint of F

Temporal Properties Fixpoints
[Figure: Backward fixpoint: repeated backward images of ¬p yield the states that can reach ¬p, i.e., the states that violate Invariant(p); the initial states inside this set are the initial states that violate Invariant(p). Forward fixpoint: repeated forward images of the initial states yield the reachable states of the system; the reachable states inside ¬p are the reachable states that violate p.]

Symbolic Model Checking
Represent sets of states and the transition relation as Boolean logic formulas
Forward and backward fixpoints can be computed by iteratively manipulating these formulas
– Forward, backward image: Existential variable elimination
– Conjunction (intersection), disjunction (union) and negation (set difference), and equivalence check
Use an efficient data structure for manipulation of Boolean logic formulas
– BDDs

BDDs
Efficient representation for boolean functions
Disjunction, conjunction complexity: at most quadratic
Negation complexity: constant
Equivalence checking complexity: constant or linear
Image computation complexity: can be exponential

Constraint-Based Verification
Can we use linear arithmetic constraints as a symbolic representation?
– Required functionality
  • Disjunction, conjunction, negation, equivalence checking, existential variable elimination
Advantages:
– Arithmetic constraints can represent infinite sets
– Heuristics based on arithmetic constraints can be used to accelerate fixpoint computations
  • Widening, loop-closures

Linear Arithmetic Constraints
Disjunction complexity: linear
Conjunction complexity: quadratic
Negation complexity: can be exponential
– Because of the disjunctive representation
Equivalence checking complexity: can be exponential
– Uses existential variable elimination
Image computation complexity: can be exponential
– Uses existential variable elimination

A Linear Arithmetic Constraint Manipulator
Omega Library [Pugh et al.]
– Manipulates Presburger arithmetic formulas: First order theory of integers without multiplication
– Equality and inequality constraints are not enough: Divisibility constraints are also needed
Existential variable elimination in Omega Library: Extension of Fourier-Motzkin variable elimination to integers
Eliminating one variable from a conjunction of constraints may double the number of constraints
Integer variables complicate the problem even further
– Can be handled using divisibility constraints

Arithmetic Constraints vs. BDDs
Constraint based verification can be more efficient than BDDs for integers with large domains
BDD-based verification is more robust
Constraint based approach does not scale well when there are boolean or enumerated variables in the specification
Constraint based verification can be used to automatically verify infinite state systems
– cannot be done using BDDs
Price of infinity
– CTL model checking becomes undecidable

Conservative Approximations
Compute a lower ( p⁻ ) or an upper ( p⁺ ) approximation to the truth set of the property ( p )
Model checker can give three answers:
– “The property is satisfied” (the initial states I are contained in the lower approximation p⁻)
– “The property is false and here is a counter-example” (some initial state lies outside the upper approximation p⁺)
– “I don’t know”
[Figure: three diagrams relating the initial states I to p⁻, p and p⁺, with the states which violate the property marked]

Computing Upper and Lower Bounds
Approximate fixpoint computations
– Widening: To compute upper bound for least-fixpoints
  • We use a generalization of the polyhedra widening operator by Cousot and Halbwachs
– Collapsing (dual of widening): To compute lower bound for greatest-fixpoints
– Truncated fixpoints: To compute lower bounds for least-fixpoints and upper bounds for greatest-fixpoints
Loop-closures
– Compute transitive closure of self-loops
– Can easily handle simple loops which increment or decrement a counter

Composite Model Checking
Each variable type is mapped to a symbolic representation type
– Map boolean and enumerated types to BDD representation
– Map integer type to arithmetic constraint representation
Use a disjunctive representation to combine symbolic representations
Each disjunct is a conjunction of formulas represented by different symbolic representations

Composite Formulas
Composite formula (CF):
  CF ::= CF ∧ CF | CF ∨ CF | ¬CF | BF | IF
Boolean Formula (BF):
  BF ::= BF ∧ BF | BF ∨ BF | ¬BF | Term_bool
  Term_bool ::= id_bool | true | false
Integer Formula (IF):
  IF ::= IF ∧ IF | IF ∨ IF | ¬IF | Term_int Rop Term_int
  Term_int ::= Term_int Aop Term_int | −Term_int | id_int | constant_int
where
  Rop denotes relational operators (=, ≠, >, <, ≥, ≤),
  Aop denotes arithmetic operators (+, -, and * with a constant)

Composite Representation
We represent composite formulas as disjunctions
Each disjunct represents a conjunction of formulas in basic symbolic types

Conjunctive Decomposition
Each composite atom is a conjunction
Each conjunct corresponds to a different symbolic representation
– x: integer; y: boolean;
– x>0 and x'=x+1 and y'=y
  • Conjunct x>0 and x'=x+1 will be represented by arithmetic constraints
  • Conjunct y'=y will be represented by a BDD
– Advantage: Image computations can be distributed over the conjunction (i.e., over different symbolic representations).
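
The conjunctive decomposition above can be illustrated with a small sketch. It is not the Composite Symbolic Library: all class and method names are hypothetical, an interval stands in for arithmetic constraints, and an explicit set of values stands in for a BDD. It only shows that, for the atom x>0 and x'=x+1 and y'=y, the forward image can be computed separately on the integer part and on the boolean part.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Toy composite atom for x: integer, y: boolean (all names are hypothetical). */
final class CompositeImageSketch {

    /** Integer part: the interval lo <= x <= hi, empty when lo > hi. */
    static final class IntPart {
        final long lo;
        final long hi;

        IntPart(long lo, long hi) { this.lo = lo; this.hi = hi; }

        boolean isEmpty() { return lo > hi; }

        /** Forward image under the integer conjunct: x > 0 and x' = x + 1. */
        IntPart forwardImage() {
            long guardedLo = Math.max(lo, 1);           // keep only states with x > 0
            if (guardedLo > hi) return new IntPart(1, 0); // guard never holds: empty set
            return new IntPart(guardedLo + 1, hi + 1);  // apply x' = x + 1
        }
    }

    /** Boolean part: the explicit set of possible values of y. */
    static final class BoolPart {
        final Set<Boolean> values;

        BoolPart(Set<Boolean> values) { this.values = new HashSet<>(values); }

        /** Forward image under the boolean conjunct: y' = y. */
        BoolPart forwardImage() { return new BoolPart(values); }
    }

    /** A composite atom: one conjunct per symbolic representation. */
    static final class Atom {
        final IntPart ints;
        final BoolPart bools;

        Atom(IntPart ints, BoolPart bools) { this.ints = ints; this.bools = bools; }

        /** The image distributes over the conjunction: each part is handled on its own. */
        Atom forwardImage() { return new Atom(ints.forwardImage(), bools.forwardImage()); }
    }

    public static void main(String[] args) {
        Atom states = new Atom(new IntPart(-2, 3), new BoolPart(Collections.singleton(true)));
        Atom image = states.forwardImage();
        System.out.println("x in [" + image.ints.lo + ", " + image.ints.hi + "], y in " + image.bools.values);
    }
}
```

The split is valid here because the two conjuncts constrain disjoint variables; a real integer representation would also handle unbounded sets, which this interval sketch does not.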
Composite Symbolic Library
Our library implements this approach using an object-oriented design
– A common interface is used for each symbolic representation
– Easy to extend with new symbolic representations
– Enables polymorphic verification
– As a BDD library we use Colorado University Decision Diagram Package (CUDD) [Somenzi et al]
– As an integer constraint manipulator we use Omega Library [Pugh et al]

Composite Symbolic Library: Class Diagram
Symbolic (common interface)
  +intersect() +union() +complement() +isSatisfiable() +isSubset() +backwardImage() +forwardImage()
BoolSym (uses CUDD Library)
  –representation: BDD
  +intersect() +union() • • •
IntSym (uses OMEGA Library)
  –representation: Polyhedra
  +intersect() +union() • • •
CompSym
  –representation: list of compAtom
  +intersect() +union() • • •
compAtom
  –atom: *Symbolic

Composite Symbolic Representation
x: integer, y: boolean
(x>0 and x'=x+1 and y'=true) or (x<=0 and x'=x and y'=y)
[Figure: a CompSym whose representation is a List<compAtom> with two ListNode<compAtom> nodes linked by next pointers. The first node's compAtom holds y'=true at index 0 and x>0 and x'=x+1 at index 1; the second node's compAtom holds y'=y at index 0 and x<=0 and x'=x at index 1.]

Satisfiability Checking
  boolean isSatisfiable(CompSym A)
    for each compAtom b in A do
      if b is satisfiable then
        return true
    return false

  boolean isSatisfiable(compAtom a)
    for each symbolic representation t do
      if a_t is not satisfiable then
        return false
    return true
[Figure: the composite formula is satisfiable if any of its composite atoms is satisfiable; a composite atom is satisfiable only if each of its conjuncts is satisfiable]
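
The two satisfiability checks above can be sketched in Java as follows. This is an illustration under stated assumptions, not the library's code: the class names echo the class diagram, but the truth table standing in for a BDD, the interval standing in for a polyhedron, and the or() helper are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy version of the disjunctive composite representation (hypothetical names). */
final class SatisfiabilitySketch {

    /** Common interface; only the isSatisfiable() operation is modelled here. */
    interface Symbolic {
        boolean isSatisfiable();
    }

    /** Stand-in for a BDD: a truth table over the boolean variables. */
    static final class BoolSym implements Symbolic {
        private final boolean[] truthTable;

        BoolSym(boolean... truthTable) { this.truthTable = truthTable.clone(); }

        public boolean isSatisfiable() {
            for (boolean row : truthTable) {
                if (row) return true;   // some valuation makes the formula true
            }
            return false;
        }
    }

    /** Stand-in for arithmetic constraints: the interval lo <= x <= hi. */
    static final class IntSym implements Symbolic {
        private final long lo;
        private final long hi;

        IntSym(long lo, long hi) { this.lo = lo; this.hi = hi; }

        public boolean isSatisfiable() { return lo <= hi; }
    }

    /** A conjunction with one conjunct per symbolic representation. */
    static final class CompAtom {
        private final List<Symbolic> parts;

        CompAtom(Symbolic... parts) { this.parts = Arrays.asList(parts.clone()); }

        /** Satisfiable only if every conjunct is satisfiable. */
        boolean isSatisfiable() {
            for (Symbolic part : parts) {
                if (!part.isSatisfiable()) return false;
            }
            return true;
        }
    }

    /** A disjunction of composite atoms. */
    static final class CompSym {
        private final List<CompAtom> atoms = new ArrayList<>();

        /** Adds one more disjunct (hypothetical helper for building examples). */
        CompSym or(CompAtom atom) { atoms.add(atom); return this; }

        /** Satisfiable if any disjunct is satisfiable. */
        boolean isSatisfiable() {
            for (CompAtom atom : atoms) {
                if (atom.isSatisfiable()) return true;
            }
            return false;
        }
    }
}
```

Checking each conjunct independently is sound here because the conjuncts of an atom range over disjoint sets of variables (boolean versus integer), so they cannot contradict each other.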
Backward Image: Composite Representation
  CompSym backwardImage(CompSym A, CompSym B)
    CompSym C;
    for each compAtom d in A do
      for each compAtom e in B do
        insert backwardImage(d,e) into C
    return C

Backward Image: Composite Atom
  compAtom backwardImage(compAtom a, compAtom b)
    for each symbolic representation type t do
      replace a_t by backwardImage(a_t, b_t)
    return a

Heuristics for Efficient Manipulation of Composite Representation
Masking
– Mask operations on integer arithmetic constraints with operations on BDDs
Incremental subset check
– Exploit the disjunctive structure by computing subset checks incrementally
Merge image computation with the subset check in least-fixpoint computations
Simplification
– Reduce the number of disjuncts in the composite representation by iteratively merging matching disjuncts
Cache expensive operations on arithmetic constraints

Polymorphic Verifier
  Symbolic TranSys::check(Node *f) {
    • • •
    Symbolic s = check(f.left)
    case EX:
      s.backwardImage(transRelation)
    case EF:
      do
        snew = s
        sold = s
        snew.backwardImage(transRelation)
        s.union(snew)
      while not sold.isEqual(s)
    • • •
  }
Action Language Verifier is polymorphic
When there are no integer variables in the specification it becomes a BDD based model checker

Synthesizing Verified Monitors [Yavuz-Kahveci, Bultan 02]
Concurrent programming is difficult
– Exponential increase in the number of states by the number of concurrent components
Monitors provide scoping rules for concurrency
– Variables of a monitor can only be accessed by monitor’s procedures
– No two processes can be active in a monitor at the same time
Java made programming using monitors a common problem

Monitor Basics
A monitor has
– A set of shared variables
– A set of procedures
  • which provide access to the shared variables
– A lock
  • To execute a monitor procedure a process has to grab the monitor lock
  • Only one process can be active (i.e. executing a procedure) in the monitor at any given time

Monitor Basics
What happens if a process needs to wait until a condition becomes true?
– Create a condition variable that corresponds to that condition
Each condition variable has a wait queue
– A process waits for a condition in the wait queue of the corresponding condition variable
– When a process updates the shared variables in a way that may cause a condition to become true, it signals the processes in the wait queue of the corresponding condition variable

Monitors
Challenges in monitor programming
– Condition variables
– Wait and signal operations
Why not use a single wait queue?
– Inefficient, every waiting process has to wake up when any of the shared variables are updated
Even with a few condition variables coordinating wait and signal operations can be difficult
– Avoid deadlock
– Avoid inefficiency due to unnecessary signaling

Monitor Specifications in Action Language
Monitors with boolean, enumerated and integer variables
Condition variables are not necessary in Action Language
– Semantics of Action Language ensures that an action is executed when it is enabled
We can automatically verify Action Language specifications
We can automatically synthesize efficient monitor implementations from Action Language specifications

Readers-Writers Monitor Specification
  module main()
    integer nr;
    boolean busy;
    restrict: nr>=0;
    initial: nr=0 and !busy;
    module Reader()
      boolean reading;
      initial: !reading;
      rEnter: !reading and !busy and nr'=nr+1 and reading';
      rExit: reading and !reading' and nr'=nr-1;
      Reader: rEnter | rExit;
    endmodule
    module Writer()
      boolean writing;
      initial: !writing;
      wEnter: !writing and nr=0 and !busy and busy' and writing';
      wExit: writing and !writing' and !busy';
      Writer: wEnter | wExit;
    endmodule
    main: Reader() | Reader() | Writer() | Writer();
    spec: invariant([busy => nr=0])
  endmodule

What About Arbitrary Number of Processes?
Use counting abstraction
– Create an integer variable for each local state of a process type
– Each variable will count the number of processes in a particular state
Local states of the process types have to be finite
– Specify only the process behavior that relates to the correctness of the monitor
– Shared variables of the monitor can be unbounded
Counting abstraction can be automated

Readers-Writers Monitor Specification After Counting Abstraction
  module main()
    integer nr;
    boolean busy;
    parameterized integer numReader, numWriter;
    restrict: nr>=0 and numReader>=0 and numWriter>=0;
    initial: nr=0 and !busy;
    module Reader()
      integer readingF, readingT;
      initial: readingF=numReader and readingT=0;
      rEnter: readingF>0 and !busy and nr'=nr+1
              and readingF'=readingF-1 and readingT'=readingT+1;
      rExit: readingT>0 and nr'=nr-1
             and readingT'=readingT-1 and readingF'=readingF+1;
      Reader: rEnter | rExit;
    endmodule
    module Writer()
      ...
    endmodule
    main: Reader() | Writer();
    spec: invariant([busy => nr=0])
  endmodule

Verification of Readers-Writers Monitor Specification
  Instance  Integers  Booleans  Cons. Time (secs.)  Ver. Time (secs.)  Memory (Mbytes)
  RW-4      1         5         0.04                0.01               6.6
  RW-8      1         9         0.08                0.01               7
  RW-16     1         17        0.19                0.02               8
  RW-32     1         33        0.53                0.03               10.8
  RW-64     1         65        1.71                0.06               20.6
  RW-P      7         1         0.05                0.01               9.1
SUN ULTRA 10 (768 Mbyte main memory)

What about the Implementation of the Monitor?
We can automatically generate code from the monitor specification
– Generate a Java class
– Make shared variables private variables
– Use synchronization to restrict access
Is the generated code efficient?
– Yes!
– We can synthesize the condition variables automatically
– There is no unnecessary thread notification

Synthesized Monitor Class: Uses Specific Notification Pattern
  public class ReadersWriters {
    private int nr;
    private boolean busy;
    private Object rEnterCond, wEnterCond;

    // Guard and update of rEnter, evaluated atomically under the monitor lock
    private synchronized boolean Guard_rEnter() {
      if (!busy) { nr++; return true; }
      else return false;
    }

    public void rEnter() throws InterruptedException {
      synchronized(rEnterCond) {
        // wait on the condition object until the guard succeeds
        while(!Guard_rEnter())
          rEnterCond.wait();
      }
    }

    public void rExit() {
      synchronized(this) { nr--; }
      // a writer may now be able to enter: notify one waiting writer
      synchronized(wEnterCond) { wEnterCond.notify(); }
    }
    ...
  }
All condition variables and wait and signal operations are generated automatically

Airport Ground Traffic Control [Zhong 97]
Modeling of airport operations using an object oriented approach
A concurrent program simulating the airport ground traffic control
– multiple planes
– multiple runways and taxiways
Can be used by controllers as advisory input
[Figure: A simplified model of Seattle Tacoma International Airport from [Zhong 97]]

Control Logic
An airplane can land using 16R only if no airplane is using 16R at the moment
An airplane can take off using 16L only if no airplane is using 16L at the moment
An airplane taxiing on one of the exits C3-C8 can cross runway 16L only if no airplane is taking off at the moment
An airplane can start using 16L for taking off only if none of the crossing exits C3-C8 is occupied at the moment (arriving airplanes have higher priority)
Only one airplane can use a taxiway at a time

Airport Ground Traffic Control Simulator
Simulate behavior of each airplane with a thread
Use a monitor which keeps track of number of airplanes on each runway and each taxiway
Use guarded commands (which will become the procedures of the monitor) to enforce the control logic

Airport Ground Traffic Control Monitor
Action Language specification
– Has 13 integer variables
– Has 4 Boolean variables per arriving airplane process to keep the local state of each airplane
– Has 2 Boolean variables per departing airplane process to keep the local state of each airplane
Automatically generated monitor class
– Has 13 integer variables
– Has 20 condition variables
– Has 34 procedures

Experiments
  Processes  Construction  Verify-P1  Verify-P2  Verify-P3
  2          0.81          0.42       0.28       0.69
  4          1.50          0.78       0.50       1.13
  8          3.03          1.53       0.99       2.22
  16         6.86          3.02       2.03       5.07
  2A,PD      1.02          0.64       0.43       0.83
  4A,PD      1.94          1.19       0.81       1.39
  8A,PD      3.95          2.28       1.54       2.59
  16A,PD     8.74          4.6        3.15       5.35
  PA,2D      1.67          1.31       0.88       3.94
  PA,4D      3.15          2.42       1.71       5.09
  PA,8D      6.40          4.64       3.32       7.35
  PA,16D     13.66         9.21       7.02       12.01
  PA,PD      2.65          0.99       0.57       0.43

Conclusions and Future Work
We can automatically verify and synthesize nontrivial monitors in Java
Our tools can deal with boolean, enumerated and (unbounded) integer variables
What about recursive data types?
– shape analysis
What about arrays?
– uninterpreted functions