Structural Heuristics for Directed Model Checking of Java Programs

Model Checking Java Programs using Structural Heuristics
Alex Groce
Carnegie Mellon University
Willem Visser
NASA Ames Research Center
Model Checking
• Explores graph of reachable system states
– Checking for local assertions, invariants and
general temporal (logic) properties
• Symbolic model checking
• Explicit-state model checking
Java PathFinder
Java Code:
void add(Object o) {
    buffer[head] = o;
    head = (head+1)%size;
}
Object take() {
    …
    tail=(tail+1)%size;
    return buffer[tail];
}
Bytecode (produced by JAVAC):
0: iconst_0
1: istore_2
2: goto #39
5: getstatic
8: aload_0
9: iload_2
10: aaload
[Diagram: Java Code → JAVAC → Bytecode → Model Checker (Special JVM) on top of a JVM]
Depth-first Search
push initial state on Stack
while (Stack not empty)
    s = top(Stack)
    if s has no more successors
        pop the Stack
    else
        s’ = next successor of s
        if s’ not already visited
            mark s’ visited
            if s’ is a goal state
                then terminate
            push s’ on Stack
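The pseudocode above can be sketched in Java roughly as follows. This is an illustrative sketch over an abstract state graph, not Java PathFinder's actual implementation: the class name `DfsSketch` and the `successors` and `isGoal` parameters are assumptions made for the example. As in the pseudocode, the initial state itself is not tested against the goal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

/** Illustrative explicit-stack DFS; all names here are hypothetical. */
final class DfsSketch {
    /** Returns the path from the initial state to the first goal found, or null if none is found. */
    static <S> List<S> search(S initial,
                              Function<S, List<S>> successors,
                              Predicate<S> isGoal) {
        Deque<S> stack = new ArrayDeque<>();               // current path, top = deepest state
        Deque<Iterator<S>> pending = new ArrayDeque<>();   // unexplored successors per stack entry
        Set<S> visited = new HashSet<>();

        stack.push(initial);
        pending.push(successors.apply(initial).iterator());
        visited.add(initial);

        while (!stack.isEmpty()) {
            Iterator<S> it = pending.peek();
            if (!it.hasNext()) {                           // s has no more successors
                stack.pop();
                pending.pop();
                continue;
            }
            S next = it.next();
            if (visited.add(next)) {                       // true only if not already visited
                stack.push(next);
                pending.push(successors.apply(next).iterator());
                if (isGoal.test(next)) {
                    List<S> path = new ArrayList<>(stack); // top of stack comes first
                    Collections.reverse(path);             // reorder: initial state first
                    return path;
                }
            }
        }
        return null;
    }
}
```

Because the counterexample is simply the contents of the stack, its length equals the depth at which the goal happened to be reached, even when a much shorter path to the same state exists.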
Problems with DFS
• Produces lengthy counterexamples
• If state-space is too large to fully explore
– May expend all resources on a single path when
shallow counterexamples exist
– Failed runs give little information because
states explored may be very “similar”
Directed Model Checking
• Model checking as a search in a state space
• Why not use heuristics to guide the search?
– Need to know what we’re looking for
• Can we find good heuristics for model checking?
• Bug-finding rather than verification
Best-first Search
priority queue Q = {initial state}
while (Q not empty)
    s = state in Q with lowest f
    remove s from Q
    for each successor state s’ of s
        if s’ not already visited
            mark s’ visited
            if s’ is a goal state
                then terminate
            f = h(s’)
            store (s’, f) in Q
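A minimal Java sketch of the best-first loop above, assuming an abstract state type `S` and a pluggable heuristic `h`. The class name `BestFirstSketch`, the `Entry` helper, and the insertion-order tie-breaker are assumptions added for the example (the tie-breaker only makes runs deterministic); none of this is taken from Java PathFinder.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

/** Illustrative best-first search; all names here are hypothetical. */
final class BestFirstSketch {
    /** Queue entry: a state, its heuristic value f, and an insertion counter for tie-breaking. */
    private static final class Entry<S> {
        final S state;
        final int f;
        final long order;
        Entry(S state, int f, long order) {
            this.state = state;
            this.f = f;
            this.order = order;
        }
    }

    /** Returns the first goal state generated, or null if the queue empties first. */
    static <S> S search(S initial,
                        Function<S, List<S>> successors,
                        Predicate<S> isGoal,
                        ToIntFunction<S> h) {
        Comparator<Entry<S>> lowestFirst =
                Comparator.comparingInt((Entry<S> e) -> e.f).thenComparingLong(e -> e.order);
        PriorityQueue<Entry<S>> queue = new PriorityQueue<>(lowestFirst);
        Set<S> visited = new HashSet<>();
        long counter = 0;

        queue.add(new Entry<>(initial, h.applyAsInt(initial), counter++));
        visited.add(initial);

        while (!queue.isEmpty()) {
            S s = queue.poll().state;                  // state in Q with lowest f
            for (S next : successors.apply(s)) {
                if (visited.add(next)) {               // true only if not already visited
                    if (isGoal.test(next)) {
                        return next;
                    }
                    queue.add(new Entry<>(next, h.applyAsInt(next), counter++));
                }
            }
        }
        return null;
    }
}
```

The search strategy is determined entirely by `h`: swapping in a different heuristic changes which states are expanded without touching the loop.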
Two Kinds of Heuristics
• Property-specific heuristics
– Directed at a specific error
• Number of unblocked threads as a measure of
distance to deadlock
• Static analysis for distance to an assertion check
– Focus of most previous work in field
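As a rough illustration of a property-specific heuristic, the deadlock measure above could be sketched as a function that counts unblocked threads. The `Status` enum and the decision to count only `RUNNABLE` threads (ignoring terminated ones) are assumptions made for this example, not details given in the slides.

```java
import java.util.List;

/** Illustrative deadlock-directed heuristic; all names here are hypothetical. */
final class MostBlockedSketch {
    /** Assumed thread states for the example. */
    enum Status { RUNNABLE, BLOCKED, TERMINATED }

    /**
     * Lower is better: the fewer threads that can still run,
     * the closer the state is assumed to be to a deadlock.
     */
    static int heuristic(List<Status> threads) {
        int unblocked = 0;
        for (Status status : threads) {
            if (status == Status.RUNNABLE) {
                unblocked++;
            }
        }
        return unblocked;
    }
}
```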
• Structural heuristics
– Designed to explore the structure of a program
in a systematic fashion
– But what do we mean by structure?
Structural Heuristics
• One obvious kind of structure in a program:
– Control flow
• Reachable control flow rather than just CFG
• Motivation for branch coverage metrics used in
software testing
Branch Coverage
• Instrument model checker to calculate branch
coverage
• Using a simple coverage measure as a heuristic
doesn’t work well
– Easily falls into local minima (once any branches are
taken, every state on that path has “better” coverage)
– Doesn’t distinguish between branches explored once
and branches explored many times
The Branch Counting Heuristic
• Count the number of times each branch has
been taken
• Heuristic value is then:
– Branches never before taken get lowest value
– Non-branching transitions are next lowest
– Otherwise, score is equal to the count
(lower values are explored first)
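The scoring rule above might be sketched as follows. Several details are assumptions made for the example: branches are identified by hypothetical string ids, counts are kept in a single global map rather than per path, a `null` id marks a non-branching transition, and the concrete values 0 and 1 are chosen for the two lowest tiers. The count includes the current execution, so a branch seen before always scores at least 2.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative branch-counting heuristic; all names here are hypothetical. */
final class BranchCountingSketch {
    /** How many times each branch has been taken so far (global count, an assumption). */
    private final Map<String, Integer> taken = new HashMap<>();

    /**
     * Scores the transition that just executed; lower values are explored first.
     * @param branchId id of the branch taken, or null for a non-branching transition
     */
    int score(String branchId) {
        if (branchId == null) {
            return 1;                        // non-branching: next-lowest value
        }
        int count = taken.merge(branchId, 1, Integer::sum);
        if (count == 1) {
            return 0;                        // never taken before: lowest value
        }
        return count;                        // otherwise the score is the count (>= 2)
    }
}
```

Unlike a plain coverage percentage, this score keeps growing for branches that are taken repeatedly, so a loop that has been unrolled many times becomes steadily less attractive to the search.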
Three Searches
[Figure, repeated over several slides: DFS, BFS, and Branch Counting compared on an example CFG containing an ERROR state. Each CFG state is a basic block that increments some variable x. Annotations on the figure: "Heuristic avoids taking" (truncated in the source), "Expands 15 states", "Expands 25 states", "Terminates only with depth limit".]
Experimental Results
• DEOS real-time operating system example
• This version uses an integer-valued counter,
without abstraction
Results for DEOS
Search Strategy      States    Time   Memory   Length   Max Depth
Branch-count          2,701      60     91MB      136         139
%-coverage           20,215    FAIL     FAIL     FAIL         334
Random heuristic      8,057     162    240MB      334         360
BFS                  18,054    FAIL     FAIL     FAIL         135
DFS                  14,678    FAIL     FAIL     FAIL      14,678
DFS depth 500       392,470   6,782    383MB      455         500
DFS depth 1000      146,949   2,222    196MB      987       1,000
DFS depth 4000        8,481     171    270MB    3,997       4,000
All experiments were performed on a 1.4GHz Athlon, with the Java heap
size limited to 512MB. All times are in seconds.
The Interleaving Heuristic
• An important (and very hard to find) class
of errors in Java is concurrency errors
• What kind of structure could we explore to
catch these?
– Thread-interdependency
• Not clear how to heuristically define actual
thread-interdependence
• So we use an approximation:
– Executions in which context is switched more
often are given better heuristic values
– Explores executions unlikely to appear in
testing (JVM/JITs schedule quite differently)
• Keep track on each path of which threads
are executed at each transition
• Give lower (better) heuristic score to paths
in which the most recently executed thread
has been run less frequently
• Slightly more complicated in practice,
counting live threads
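The basic idea above (before the refinement that counts live threads, which is not reproduced here) could be sketched like this. Representing the path as a list of integer thread ids and scoring by a plain occurrence count are assumptions made for the example.

```java
import java.util.List;

/** Illustrative interleaving heuristic; all names here are hypothetical. */
final class InterleavingSketch {
    /**
     * Scores a path by how often its most recently executed thread has run.
     * @param history thread ids executed along the path, oldest first
     * @return number of transitions on the path executed by the most recent
     *         thread; lower (better) values favour frequent context switches
     */
    static int score(List<Integer> history) {
        if (history.isEmpty()) {
            return 0;
        }
        int mostRecent = history.get(history.size() - 1);
        int runs = 0;
        for (int threadId : history) {
            if (threadId == mostRecent) {
                runs++;
            }
        }
        return runs;
    }
}
```

For example, after three transitions by thread 0, extending the path with thread 0 again yields a score of 4, while switching to thread 1 yields 1, so the switching path is explored first.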
Limiting the Queue
• With heuristics we are more interested in
finding bugs than in verification
• So, we apply a technique from heuristic
search literature:
– Limit the size of the priority queue!
• When queue has more than k states in it, remove all
but k states with best heuristic values
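One straightforward way to apply such a limit is sketched below, assuming the queue is a `java.util.PriorityQueue` whose head is the entry with the best heuristic value. The class and method names are hypothetical, and rebuilding the queue on every prune is chosen for clarity rather than efficiency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative queue limiting; all names here are hypothetical. */
final class BoundedQueueSketch {
    /**
     * Keeps only the k best entries of the queue (k is assumed to be >= 0).
     * "Best" means first in the queue's own ordering.
     */
    static <T> void prune(PriorityQueue<T> queue, int k) {
        if (queue.size() <= k) {
            return;                          // within the limit: nothing to discard
        }
        List<T> best = new ArrayList<>(k);
        for (int i = 0; i < k; i++) {
            best.add(queue.poll());          // poll() removes the current best entry
        }
        queue.clear();                       // discard everything else
        queue.addAll(best);
    }
}
```

Discarded states are never revisited, so the search is no longer exhaustive. This is acceptable when the aim is finding bugs rather than verification.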
Experimental Results
• Dining Philosophers
– Comparison to other results:
• Godefroid and Khurshid's TACAS ’02 paper applies
genetic algorithms to dining philosophers
– Best result reported is 17 philosophers, 177 seconds, 50%
success rate (on a slower machine)
• HSF-SPIN
– Not clear how to compare (times not given)
– Best result they show is 16 philosophers, and SPIN (using
partial order reduction) itself fails with 14 philosophers
Experimental Results
Search Strategy           Threads      States     Time   Memory   Length   Max Depth
Random heuristic                8     218,500     FAIL     FAIL     FAIL          86
BFS                             8     436,068     FAIL     FAIL     FAIL          13
DFS                             8     398,906     FAIL     FAIL     FAIL     384,286
DFS depth 500                   8   1,354,747     FAIL     FAIL     FAIL         500
DFS depth 1000                  8   1,345,289     FAIL     FAIL     FAIL       1,000
DFS depth 4000                  8   1,348,398     FAIL     FAIL     FAIL       4,000
Interleaving                    8     487,942     FAIL     FAIL     FAIL          16
Most-blocked                    8     310,317     FAIL     FAIL     FAIL         285
Interleaving (k = 1000)         8     354,552       60    137MB       67          67
Most-blocked (k = 5)            8     891,177   17,259    378MB   78,353      78,353
Most-blocked (k = 160)          8      25,023       10     12MB      172         172
Most-blocked (k = 1000)         8     123,640       46     59MB      254         278
Interleaving (k = 40)          16      69,987       16     45MB      131         131
Interleaving (k = 160)         16     290,637       60    207MB      131         132
Most-blocked (k = 40)          16     101,576       38     69MB    1,008       1,008
Interleaving (k = 5)           64     101,196       59    206MB      514         514
One Last Heuristic
• The choose-free heuristic:
– Works only for abstracted Java programs
– Rewards transitions that do not involve
nondeterminism introduced by the abstraction
– Prefers counterexamples that do not result from
loss of precision introduced by the abstraction
• Structure of abstraction, not program
Previous Work
• Edelkamp, Lafuente, and Leue
– HSF-SPIN: SPIN + heuristic search framework
• Bloem, Ravi, and Somenzi
– Symbolic Guided Search: BDDs + heuristics
– With BDDs heuristics can aid verification
• Cobleigh, Clarke, and Osterweil
– FLAVERS verification work
Conclusions
• Structural heuristics: a useful class of heuristics
– When model checking is used for debugging, we may
not know what kinds of bugs we are hunting
• Property-specific heuristics are also useful;
approach is complementary, not replacement
– Most-blocked can perform as well as or better than
interleaving in the Remote Agent example, depending
on the k limit and search method
Future Work
• Experiment with other, larger examples
• Static analysis for property-specific heuristics
• Language for properties/search/heuristics
• Discover how heuristics work when symbolic
execution is introduced into JPF
• Counterexample analysis for “bug causality”
• What other kinds of structure can be exploited
with heuristics?
– Counting occurrences of data values, perhaps