Games for Formal Design and
Verification of Reactive Systems
Rajeev Alur
University of Pennsylvania
http://www.cis.upenn.edu/~alur/
ATVA, Taipei, November 2004
System Reliability
 Software bugs are pervasive
Bugs can be expensive
Bugs can cost lives
Bulk of development cost is in validation, testing, bug fixes
 Old problem that just won’t go away
 Many approaches and decades of research
Systematic testing
Programming languages technology (e.g. types)
Formal methods (specification and verification)
Grand challenge for computer science:
Tools for designing “correct” software
[Figure: a model and a temporal property are input to the Model Checker, which outputs either "yes" or an error trace]
Advantages
Automated formal verification, Effective debugging tool
Growing industrial success
In-house groups: Intel, Microsoft, Lucent, Motorola…
Commercial model checkers
Opportunities for research
Scalability is still a problem
Effective use requires great expertise
Models for Formal Analysis
 Model is usually a composition of models for components
 Component models are primarily of two types
1. Manual or automatic abstraction of code that implements the
components (e.g. tcp client)
2. Capturing the environment that the components are reacting to
(e.g. network connecting the clients)
 Nondeterminism / choice is essential in modeling
1. Abstraction loses information (e.g. control-flow graph keeps both
branches of a conditional test)
2. Environment supplies inputs and/or has unpredictable events (e.g.
a network may or may not lose a message)
 Verifying a model typically amounts to checking all
possible executions of the model
From Code to Models via Abstraction
int x, y;
if x>0 {
…………
y:=x+1
……….}
else {
…………
y:=x+1
……….}
Predicate Abstraction
bx: x>0; by : y>0
bool bx, by;
if bx {
…………
by:=true
……….}
else {
…………
by:={true,false}
……….}
Contemporary tools for software verification
(SLAM, BLAST…) use automated predicate abstraction
and symbolic model checking
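The abstraction step shown above can be illustrated with a minimal Python sketch. It recomputes the abstract effect of y:=x+1 on the predicate by (y>0) for each value of bx (x>0) by sampling concrete integers. The helper name `abstract_assign` and the sample range are assumptions made for illustration; tools such as SLAM and BLAST answer these questions with a theorem prover, not by enumeration.

```python
def abstract_assign(bx_value, sample=range(-50, 51)):
    """Possible values of by (y > 0) after y := x + 1, given bx (x > 0).

    Enumerates a finite sample of concrete x values (an assumption made
    for illustration; this is not a sound decision procedure).
    """
    outcomes = set()
    for x in sample:
        if (x > 0) == bx_value:      # concrete states consistent with bx
            y = x + 1                # the concrete assignment
            outcomes.add(y > 0)      # value of predicate by afterwards
    return outcomes


then_branch = abstract_assign(True)    # matches "by := true"
else_branch = abstract_assign(False)   # matches "by := {true,false}"
```

In the else branch the sample contains both x = 0 (giving y > 0) and x < 0 (giving y <= 0), which is why the abstract program must assign by nondeterministically.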
Classical Model Checking
[Figure: Processor1 and Processor2, each attached to a cache controller (Controller1, Controller2), connected by a Bus]
 Model is viewed as a state transition graph (no distinction
among choices of various components)
 Requirements
No reachable state has two caches in write-exclusive states
(Safety/Reachability)
Every read/write request is eventually completed
(Liveness/Linear Temporal Logic)
From every reachable state, there exists a path leading to a
quiescent state (Branching-time/CTL)
Game-based Analysis
[Figure: Processor1 and Processor2, each attached to a cache controller (Controller1, Controller2), connected by a Bus]
 Model is viewed as a game graph
Different components viewed as separate players
Each move belongs to one of the players
Strategy for a player is to choose the next move based on the
execution so far
 Requirements: Processor1 and Controller1 have a strategy
to successfully write no matter how other components
behave (adversarial/collaborative groups)
 Beyond model checking: Compute the most general model
for Processor that satisfies requirements (Synthesis)
Talk Outline
 Motivation
 Introduction to Theory of Games
 Games in Requirements (MOCHA)
 Interface Synthesis using Games (JIST)
 Conclusions
Formal Definition of Games
 A game graph G consists of
A set V of vertices and a set E of edges
A labeling of edges with moves in a set M
 When game is at a vertex v, player0 chooses a move m, player1
chooses an edge (v, u) labeled m, and game proceeds to u
 A strategy f for player0 is a function from V+ to M
 For a vertex v, strategy f, Plays(v,f) contains paths v0v1v2… in
graph G s.t. v0=v and for each i, there is an edge (vi,vi+1) labeled
with f(v0v1… vi)
 A winning condition W is a set of infinite paths over V
Reachability: a path is in W if it contains a vertex in target set F
Safety: a path is in W if all its vertices are in safe set S
 A strategy f is winning for player0 in initial state v if every path
in Plays(v,f) is in W
 Game problem: Given G, v, and W, decide if player0 has a winning
strategy starting in v (and compute the winning strategy)
Reachability Game
[Figure: reachability game graph with vertices 1 to 5 and edges labeled with moves a and b]
Player0 can win from every vertex except 4
Suffices to consider memoryless strategies
Winning set can be computed in linear time (the problem is PTIME-complete)
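A minimal Python sketch of solving a reachability game under the formal definition given earlier: player0 picks a move, player1 picks an edge with that label. The edge set `EDGES` is a hypothetical example (it is not the graph from the slide figure), and the loop below is a simple quadratic fixpoint, not the linear-time algorithm.

```python
def solve_reachability(edges, target):
    """Vertices from which player0 can force a visit to `target`.

    edges: set of (v, move, u) triples. At v, player0 picks a move m that
    labels at least one outgoing edge, then player1 picks an m-labeled edge.
    Returns (winning set, memoryless strategy as a dict vertex -> move).
    """
    succ = {}
    for v, m, u in edges:
        succ.setdefault(v, {}).setdefault(m, set()).add(u)
    win, strategy = set(target), {}
    changed = True
    while changed:
        changed = False
        for v, moves in succ.items():
            if v in win:
                continue
            for m, dests in sorted(moves.items()):
                if dests <= win:          # every m-successor already wins
                    win.add(v)
                    strategy[v] = m
                    changed = True
                    break
    return win, strategy


# Hypothetical game: target is vertex 5, vertex 4 is a trap.
EDGES = {
    (1, "a", 2), (1, "b", 4),
    (2, "a", 3), (2, "b", 4), (2, "b", 5),
    (3, "a", 5), (3, "b", 4),
    (4, "a", 4), (4, "b", 4),
}
win, strategy = solve_reachability(EDGES, {5})
```

The returned strategy depends only on the current vertex, which illustrates why memoryless strategies suffice for reachability objectives.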
Partial Information Games
Player0 does not know the game position precisely
Every vertex has an observation
A strategy maps a sequence of observations to a move
[Figure: game graph with vertices 1 to 5 and edges labeled with moves a and b; vertex color shows the observation]
Color shows observation
Cannot win from any non-target vertex
Solving a partial-information game requires a subset construction
Problem is EXPTIME-complete!
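The subset construction can be sketched in Python as a reachability game over knowledge sets (sets of vertices consistent with the observations so far). This is a minimal sketch under two stated assumptions: the target is observable (a union of observation classes), and a move is usable only if every vertex in the current knowledge set has an edge with that label. The example graph and observation names are hypothetical.

```python
def solve_partial_info(edges, obs, target, start):
    """Can player0 force `target` from `start`, seeing only observations?"""
    succ = {}
    for v, m, u in edges:
        succ.setdefault((v, m), set()).add(u)
    moves = {m for _, m, _ in edges}

    def step(K, m):
        """Knowledge sets that may follow K after move m, or None."""
        if any((v, m) not in succ for v in K):
            return None
        post = set().union(*(succ[(v, m)] for v in K))
        by_obs = {}
        for u in post:                      # split successors by observation
            by_obs.setdefault(obs[u], set()).add(u)
        return [frozenset(c) for c in by_obs.values()]

    init = frozenset([start])
    seen, todo, graph = {init}, [init], {}
    while todo:                             # explore reachable knowledge sets
        K = todo.pop()
        for m in moves:
            nxt = step(K, m)
            if nxt is None:
                continue
            graph[(K, m)] = nxt
            for K2 in nxt:
                if K2 not in seen:
                    seen.add(K2)
                    todo.append(K2)

    win = {K for K in seen if K <= target}  # reachability fixpoint
    changed = True
    while changed:
        changed = False
        for (K, m), nxt in graph.items():
            if K not in win and all(K2 in win for K2 in nxt):
                win.add(K)
                changed = True
    return init in win


# Hypothetical game: from 1 the opponent moves to 2 or 3; the right move
# to reach target 4 differs between 2 and 3; 5 is a trap.
EDGES = {(1, "a", 2), (1, "a", 3),
         (2, "a", 4), (2, "b", 5), (3, "a", 5), (3, "b", 4),
         (4, "a", 4), (4, "b", 4), (5, "a", 5), (5, "b", 5)}
blind = solve_partial_info(EDGES, {1: "w", 2: "g", 3: "g", 4: "t", 5: "w"}, {4}, 1)
sighted = solve_partial_info(EDGES, {1: "w", 2: "g", 3: "h", 4: "t", 5: "w"}, {4}, 1)
```

When vertices 2 and 3 share an observation, player0 cannot tell which move is correct and loses; when they are distinguishable, player0 wins. The number of knowledge sets can be exponential in the number of vertices.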
Safety Games for Assumption Generation
[Figure: Env (generating inputs) connected to system model M (nondeterministic)]
 Goal: System should stay within a given safe set S
 Let E be the environment that generates all possible inputs
 All paths in E || M may not stay within S
 Game: In E||M, check Env has a winning strategy to keep
the system within S
 Winning strategies in safety games are closed under union
 Most general winning strategy A for safety game: From a
vertex v allow move m if some winning strategy picks m
 Strategy A is the most permissive assumption on the
environment so that A||M is safe
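The most general winning strategy can be sketched in Python as a greatest fixpoint followed by a filter on moves. This is a minimal sketch: Env picks the move (input), the nondeterministic system picks which edge with that label is taken, and a vertex with no outgoing edges is treated as losing (an assumption). The example edge set and move names are hypothetical.

```python
def most_permissive(edges, safe):
    """Solve a safety game and return (winning set, A).

    edges: set of (v, move, u) triples.
    A[v] is the set of moves that some winning strategy may pick at v.
    """
    succ = {}
    for v, m, u in edges:
        succ.setdefault(v, {}).setdefault(m, set()).add(u)
    win = set(safe)
    while True:
        # keep v only if some move has all its successors inside win
        keep = {v for v in win
                if any(d <= win for d in succ.get(v, {}).values())}
        if keep == win:
            break
        win = keep
    A = {v: {m for m, d in succ[v].items() if d <= win} for v in win}
    return win, A


# Hypothetical model: vertex 3 is unsafe, vertex 2 is doomed to reach it.
EDGES = {(0, "req", 1), (0, "idle", 0),
         (1, "req", 2), (1, "req", 3), (1, "idle", 0),
         (2, "req", 3), (2, "idle", 3),
         (3, "idle", 3)}
win, A = most_permissive(EDGES, safe={0, 1, 2})
```

Here A allows both inputs at vertex 0 but only "idle" at vertex 1, which is the weakest restriction on the environment that keeps this model safe.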
Results on Games
 Many variations of game graphs studied
Synchronous multi-player games: Each player chooses a move
independently, and next vertex is determined by the moves selected
by all the players
Asynchronous interleaving games: A (fair) scheduler picks which
player gets to move, and the selected player chooses the next state
 Complexity of infinitary winning conditions
W specified by LTL: doubly exponential time (2EXPTIME-complete)
W specified by parity condition: in NP ∩ coNP (open whether in P)
 Games with time, probabilities, costs, rewards, pushdown models…
 Connections to tree automata, and mu-calculus model checking
Our Focus:
Can games be useful in practice?
How to solve games efficiently (state explosion problem)?
Talk Outline
 Motivation
 Introduction to Theory of Games
 Games in Requirements (MOCHA)
Alternating Temporal Logic (ATL)
Symbolic model checking for ATL
Non-repudiation for security protocols
 Interface Synthesis using Games (JIST)
 Conclusions
Overview of MOCHA
Key features
Compositional modeling language: Reactive Modules
Game-based requirements of open systems: ATL
Refinement checking by assume-guarantee rules
Joint project with UC Berkeley
See http://www.cis.upenn.edu/~mocha/
Alternating-time temporal logic
[Alur,Henzinger,Kupferman, JACM 2002]
Game-based verification of non-repudiation protocols
[Kremer and Raskin, Journal of Computer Security, 2003]
Alternating Temporal Logic
Suitable for requirements of multiagent systems
Interpreted over game graphs
Suppose Sys chooses move (dashed/solid), and Env chooses next state
EF p
AG p
<<sys>>F p
Alternating Temporal Logic
Interpreted over game graphs where set of players is P
Syntax:
phi := p | ~ phi | phi & phi | <<Q>> Next phi | <<Q>> phi Until phi
where p is a proposition, Q is a subset of P
<<Q>> phi holds at a state v iff players in Q have a winning strategy
in the game starting at v where phi gives winning condition
Sample property <<A,B>> G p
can agents A and B collaborate to maintain invariant p?
existential over choices of A & B, universal over others
Can specify games and controllability
More expressive than CTL
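The sample property <<A,B>> G p can be evaluated with a greatest fixpoint, sketched below in Python for a synchronous multi-player game. The coalition's joint move is quantified existentially and the other players' moves universally. The function name, the game (states "ok", "risk", "fail") and its transition function are hypothetical, and every player is assumed to have the same moves in every state.

```python
from itertools import product


def coalition_always(states, moves, delta, coalition, p):
    """States satisfying <<coalition>> G p.

    moves: dict player -> list of moves; delta(state, joint) -> next state,
    where joint maps every player to its chosen move.
    """
    team = [a for a in moves if a in coalition]
    rest = [a for a in moves if a not in coalition]

    def joint_choices(players):
        return [dict(zip(players, c))
                for c in product(*(moves[a] for a in players))]

    def cpre(Z):
        # exists a team choice such that, for all choices of the rest,
        # the next state stays in Z
        return {s for s in states
                if any(all(delta(s, {**mine, **theirs}) in Z
                           for theirs in joint_choices(rest))
                       for mine in joint_choices(team))}

    Z = {s for s in states if p(s)}
    while True:
        nxt = Z & cpre(Z)
        if nxt == Z:
            return Z
        Z = nxt


STATES = {"ok", "risk", "fail"}
MOVES = {"A": [0, 1], "B": [0, 1], "Env": [0, 1]}


def delta(s, j):
    if s == "fail":
        return "fail"
    if s == "ok":
        return "risk" if j["Env"] == 1 else "ok"
    if j["A"] == 1 and j["B"] == 1:      # both agents cooperate to recover
        return "ok"
    return "fail" if j["Env"] == 1 else "risk"


def not_failed(s):
    return s != "fail"


together = coalition_always(STATES, MOVES, delta, {"A", "B"}, not_failed)
alone = coalition_always(STATES, MOVES, delta, {"A"}, not_failed)
```

In this example A and B together can maintain the invariant, while A alone cannot, since B then counts as an adversary.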
Symbolic Representation of Games
Typically, model/game specified implicitly
Model variables X
Each var is of finite type, say, boolean
Move variables M (e.g. inputs)
Update: T(X,M,X’)
How new vars X’ are related to old vars X as a result of
executing one step, when Player0 chooses values of M
Reachability Game: Target specified by predicate p(X)
Computational problem:
Compute the set of states from which Player0 can force the game to reach p
Model checking the ATL formula <<Player0>>F p
Building the game graph explicitly not feasible!
Symbolic Solution
R := p(X)
repeat
  APre(R)(X) := Exists M. Forall X’. ( T(X,M,X’) -> R(X’) )
  if R contains APre(R) then return R
  else R := R union APre(R)
APre(R): Set of states from which Player0 can force the
game to reach R in one step
Similar to standard CTL model checking, except that preimage computation involves quantifier alternation
Mocha implements this symbolic solution using OBDDs as a
symbolic representation (CUDD package)
Performance: Models with 50-60 variables analyzed easily
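The fixpoint iteration can be sketched in Python, with explicit sets of variable valuations standing in for OBDDs (an assumption for illustration; Mocha itself works on OBDDs via CUDD). The transition predicate `T`, the two-bit counter model and the helper names are hypothetical, and T is assumed total so that the universal quantifier is never vacuously true.

```python
from itertools import product


def apre(R, states, move_vals, T):
    """{x | exists m. forall x'. T(x, m, x') -> x' in R}, by enumeration."""
    return {x for x in states
            if any(all((not T(x, m, x2)) or x2 in R for x2 in states)
                   for m in move_vals)}


def reach(p, states, move_vals, T):
    """States from which Player0 can force the game to reach p."""
    R = {x for x in states if p(x)}
    while True:
        pre = apre(R, states, move_vals, T)
        if pre <= R:
            return R
        R |= pre


# Two boolean model variables encode a counter with values 0..3.
STATES = set(product([False, True], repeat=2))


def val(x):
    return 2 * x[0] + x[1]


def T(x, m, x2):
    if not m:
        return x2 == x                   # move False: hold the value
    if val(x) == 1:
        return val(x2) in (2, 0)         # at 1 the opponent may reset to 0
    return val(x2) == (val(x) + 1) % 4   # move True: increment


win = reach(lambda x: val(x) == 3, STATES, [False, True], T)
```

Only the counter values 2 and 3 are winning here: from 1 the increment may be turned into a reset, so Player0 cannot force progress from 0 or 1.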
Analysis of Security Protocols
 Authentication Protocols
Goal: Establish secure communication between Alice and Bob so that
a malicious third party cannot talk to Alice pretending to be Bob
Many formal methods and model checkers have been used to analyze and find
bugs in authentication protocols (e.g. Lowe used the FDR model checker
to find a bug in Needham-Schroeder public key authentication)
Analysis involves modeling of adversary, and checking all executions
satisfy correctness requirements (only nondeterminism is
communication medium and adversary)
 Non-repudiation Protocols
Repudiation means that Bob can pretend not to have participated in
the protocol (after receiving what Bob really wanted from Alice)
Non-repudiation protocols allow Alice/Bob to have evidence of
messages sent/received, typically using a Trusted Third Party (TTP),
so that the other party cannot cheat
Game-based modeling and ATL model checking using Mocha has been
shown to be the most effective technique for analysis (KR01,KR03..)
Modeling Non-repudiation Protocols
Alice: Honest or Cheat
Bob: Honest or Cheat
Communication Channels: Nondeterministic model
Trusted Third Party: Deterministic
Analysis using Mocha
 Model described in guarded command language
Players: Alice (A), Bob (B), Communication channels (Com), TTP
NRR: Alice gets non-repudiation of receipt evidence
NRO: Bob gets non-repudiation of origin evidence
 Requirements in ATL (and not expressible in CTL)
Viability: Alice and Bob can cooperate to be fair to each other
<<A,B>> F (NRR & NRO)
Fairness to Alice: Bob and Com cannot cooperate to reach a state
where Bob has his evidence, but Alice can no longer get hers
~ <<B,Com>> F (NRO & ~ <<A>> F NRR)
 Many published protocols formally analyzed by Mocha
Asokan-Shoup-Waidner certified mail protocol (previously known
violations of fairness found)
Zhou-Gollmann non-repudiation protocol (way to cheat Alice found for
certain types of channels)
Talk Outline
 Motivation
 Introduction to Theory of Games
 Games in Requirements (MOCHA)
 Interface Synthesis using Games (JIST)
Behavioral interfaces for Java classes
Learning automata representing strategies
Implementation and results
 Conclusions
Static Interfaces for Java Classes
package java.security;
…
public abstract class Signature extends java.security.SignatureSpi {
<<variable declarations>>
protected int state = UNINITIALIZED;
public final void initVerify (PublicKey publicKey) {…}
public final byte[] sign () throws SignatureException { ….}
public final boolean verify (byte[] signature) throws SignatureException
{ ….}
public final void update (byte b) throws SignatureException {…}
..
}
Behavioral Interface
 Methods: initVerify (IV), verify (V), initSign (IS),
sign(S), update (U)
 Constraints on invocation of methods so that the
exception SignatureException is not thrown
initVerify (initSign) must be called just before verify
(sign), but update can be called in between
update cannot be called at the beginning
[Figure: interface automaton for Signature, with transitions labeled IS, IV, "S, U, IS" and "V, U, IV"]
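The behavioral interface can be written down as a small transition table, sketched here in Python. The state names ("start", "sign", "verify") and the exact transition structure are assumptions, read off from the constraints listed above and the edge labels of the automaton.

```python
# Each state maps an allowed method to the next state.
INTERFACE = {
    "start":  {"IS": "sign", "IV": "verify"},
    "sign":   {"S": "sign", "U": "sign", "IS": "sign", "IV": "verify"},
    "verify": {"V": "verify", "U": "verify", "IV": "verify", "IS": "sign"},
}


def allowed(history):
    """Methods the interface permits after `history`, or None if the
    history itself already violates the interface."""
    state = "start"
    for call in history:
        if call not in INTERFACE[state]:
            return None
        state = INTERFACE[state][call]
    return set(INTERFACE[state])


at_start = allowed([])
after_init_sign = allowed(["IS", "U"])
bad_update = allowed(["U"])          # update called at the beginning
bad_sign = allowed(["IV", "S"])      # sign called after initVerify
```

A client conforms to the interface if every call sequence it can produce keeps `allowed` from returning None.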
AbstractList.ListItr
public Object next() {
…
lastRet = cursor++;
…}
public Object prev() {
…
lastRet = cursor;
…}
public void remove() {
if (lastRet==-1)
throw new IllegalExc();
…
lastRet = -1;
…}
public void add(Object o) {
…
lastRet = -1;
…}
Behavioral Interface
[Figure: interface automaton for AbstractList.ListItr with states Start, Safe and Unsafe; transitions labeled next, prev, add and remove]
Interfaces for Java classes
 Given a Java class C with methods M and return values R, an
interface I is a function from (MxR)* to 2^M
Interface specifies which methods can be called after a given history
 Given a safety requirement S over class variables, interface I is
safe for S if calling methods according to I keeps C within S
 Given C and S, there exists a most permissive interface that is
safe wrt S
 Interfaces can be useful for many purposes
Documentation
Modular software verification (check client conforms to interface)
Version consistency checks
 JIST: Automatic extraction of finite-state interfaces
Phase 1: Abstract Java class into a Boolean class using predicate abstraction
Phase 2: Generate interface as a solution to game in abstract class
Game in Abstracted Class
[Figure: game graph of the abstracted class, with method calls such as next and prev]
From black states, Player0 gets to choose the input method call
From purple states, Player1 gets to choose a path in the abstract class till the call returns
Objective for Player0: Ensure error states (from which
exception can be raised) are avoided
Winning strategy: Correct method sequence calls
Most General winning strategy: Most permissive safe interface
Game is partial information!
Interface Synthesis
 Most permissive safe interface can be captured by a
finite automaton (as a regular language over MxR)
For partial information games, the standard way
(subset construction) to generate the interface is
exponential in the number of states of abstract class
Number of states of abstract class is exponential in
the number of predicates used for abstraction
Use of symbolic methods (e.g. OBDDs) desired
 Novel approach: Use algorithms for learning a regular
language to learn interface
Angluin’s L* algorithm
Works well if we expect the final interface to have a
small representation as a minimized DFA
L* Algorithm for Learning DFAs
Infers the structure of an
S := {ε}; // states of DFA
unknown DFA by
E := {ε}; // distinguishing expts
– membership queries
repeat:
– equivalence queries
Update T;
// member tests for (S U S•Σ)•E
Observation table (S,E,T)
MakeTClosed(S,E,T);
T: (S U S•Σ)•E {0, 1}
C := MakeConjecture(S,E,T);
Constructs a minimal DFA using
if !(c=IsEquiv(C)) then return C;
a polynomial number of
else{
queries
e = FindSuffix(c);
O(|Σ|n2 + n log m) member
Add e to E;
}
at most n-1 equivalence
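A minimal Python sketch of the learning loop follows, under several stated assumptions. It uses a simplified variant that adds every suffix of a counterexample to E (instead of FindSuffix). The equivalence query is approximated by exhaustive testing of all words up to a bounded length. The target language is the Signature interface described earlier, hard-coded in a hypothetical `safe` function that plays the role of the membership oracle.

```python
from itertools import product


def lstar(alphabet, member, find_counterexample):
    """Learn a DFA (initial, delta, accepting); words are tuples of symbols."""
    S, E, T = [()], [()], {}          # access strings, experiments, table

    def fill():                       # membership queries for (S U S.Sigma).E
        for s in S + [s + (a,) for s in S for a in alphabet]:
            for e in E:
                if s + e not in T:
                    T[s + e] = member(s + e)

    def row(s):
        return tuple(T[s + e] for e in E)

    while True:
        fill()
        closed = False
        while not closed:             # MakeTClosed
            closed = True
            rows = {row(s) for s in S}
            for s, a in product(list(S), alphabet):
                if row(s + (a,)) not in rows:
                    S.append(s + (a,))
                    fill()
                    closed = False
                    break
        # MakeConjecture: states are the distinct rows of S
        delta = {(row(s), a): row(s + (a,)) for s in S for a in alphabet}
        accepting = {row(s) for s in S if T[s]}
        dfa = (row(()), delta, accepting)
        cex = find_counterexample(dfa)
        if cex is None:
            return dfa
        for i in range(len(cex)):     # add every suffix of cex to E
            if cex[i:] not in E:
                E.append(cex[i:])


def run(dfa, word):
    state, delta, accepting = dfa
    for a in word:
        state = delta[(state, a)]
    return state in accepting


ALPHABET = ("IS", "IV", "S", "V", "U")


def safe(word):
    """Hypothetical oracle: does the call sequence avoid SignatureException?"""
    mode = None
    for m in word:
        if m == "IS":
            mode = "sign"
        elif m == "IV":
            mode = "verify"
        elif m == "U" and mode is None:
            return False
        elif m == "S" and mode != "sign":
            return False
        elif m == "V" and mode != "verify":
            return False
    return True


def bounded_equiv(dfa, max_len=6):
    """Approximate equivalence query: test all words up to max_len."""
    for n in range(max_len + 1):
        for w in product(ALPHABET, repeat=n):
            if run(dfa, w) != safe(w):
                return w
    return None


learned = lstar(ALPHABET, safe, bounded_equiv)
num_states = len({q for (q, _a) in learned[1]})
```

The learned automaton has the three interface states plus one rejecting sink. In JIST the two oracles are answered by model checking queries instead of the hard-coded functions used here.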
Implementing L*
 Transform abstract class into a model M in NuSMV (a
state-of-the-art BDD-based model checker)
 Membership Query: Is a string s in the desired language?
 Are all runs of M on s safe?
 Construct an environment Es that invokes methods
according to s, and check M||Es safe using NuSMV
 Equivalence Query: Is current conjecture interface C
equivalent to the final answer I? If not, return a string in
the difference
Subset check: Is C contained in I ? Are all strings
allowed by C safe? Check if C||M is safe using NuSMV
Superset check: Does C contain I ? Is C most
permissive?
Superset Query
 Is C maximal, that is, contains all safe method sequences?
 Problem is NP-hard, and does not directly lend to a model
checking question
 Approximate it using two tests
A sequence s is weakly safe if some run of M on s stays
safe. We can check if C includes all weakly safe runs
using a CTL model checking query over C||M.
We can locally check if allowing one more method in a
state of C keeps it safe
 Summary: Our implementation of L* computes interface I
as a minimal DFA
Guaranteed to be safe
The algorithm either reports that I is most permissive, or answers
"don't know" (in that case, the most permissive interface has more
states than I as a minimal DFA)
JIST: Java Interface Synthesis Tool
[Figure: JIST tool chain. Components: Java, Java Byte Code, Soot, Jimple, Predicate Abstractor, Boolean Jimple, BJP2SMV, NuSMV Language, Interface Synthesis, Interface Automaton]
Signature Class
3 global variable predicates used for abstraction
24 boolean variables in abstract model
83 membership, 3 subset, 3 superset queries
time: 10 seconds
JIST synthesized the most permissive interface
[Figure: the Signature class declaration shown earlier, alongside the synthesized interface automaton with transitions labeled IV, IS, "S, U, IS" and "V, U, IV"]
JIST Project
 Tool is able to construct useful interfaces for sample Java
classes in Java2SDK accurately and efficiently
 Work in progress, many challenges remain
How to choose predicates for abstraction? How to refine
abstractions?
Features of Java (e.g. class hierarchy)
Robustness of the tool
 Reference: Synthesis of Interface specifications for Java
classes, ACMN, POPL 2005
Joint work with Pavol Cerny, P. Madhusudan, Wonhong Nam
See http://www.cis.upenn.edu/jist/
Conclusions
 Games provide a modeling paradigm for multi-agent
systems to highlight the distinction among
choices/nondeterminism of different components
 Alternating temporal logic (ATL) as a specification
language for game-based requirements
Main application: Security protocols
 Synthesis of most general winning strategies
Automatic extraction of assumptions
Interfaces for software components
 Coping with state-space explosion raises new challenges
Learning-based strategy extraction seems promising
Not much research on solving games efficiently