CAV

Effective Program Verification
for Relaxed Memory Models
Sebastian Burckhardt
Madanlal Musuvathi
Microsoft Research
CAV, July 10, 2008
Motivation: Memory Model Vulnerabilities

Programmers do not always follow strict locking
discipline in performance-critical code
◦ Ad-hoc synchronization with normal loads and stores or
interlocked operations is faster
◦ Result: “benign” or “intentional” data races

Such code can break on relaxed memory models
◦ Most multicore machines are not sequentially consistent
◦ Both compilers and actual hardware can contribute to effect

Vulnerabilities are hard to find, reproduce, and analyze
◦ May require specific hardware configuration and schedule
C# Example
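// Shared flags. condVariable is the shared lock object (its declaration is not shown on the slide).
volatile bool isIdling;
volatile bool hasWork;

// Consumer thread
void BlockOnIdle() {
    lock (condVariable) {
        isIdling = true;                  // store isIdling ...
        if (!hasWork)                     // ... then load hasWork
            Monitor.Wait(condVariable);
        isIdling = false;
    }
}

// Producer thread
void NotifyPotentialWork() {
    hasWork = true;                       // store hasWork ...
    if (isIdling)                         // ... then load isIdling
        lock (condVariable) {
            Monitor.Pulse(condVariable);
        }
}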
Example: Store Buffer Vulnerability

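Key pieces of the code on the previous slide (ii = isIdling, hw = hasWork):
volatile int ii = 0;
volatile int hw = 0;

Consumer          Producer
Store ii, 1       Store hw, 1
Load  hw, 0       Load  ii, 0    (would be 1 under sequential consistency)

On x86, the hardware may perform a store late, after a subsequent load by the same thread has already executed.
Bug: the Producer thread does not notice the waiting Consumer and does not send the signal.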
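
The outcome on this slide can be reproduced with a small operational model. The sketch below is not from the talk: it assumes a simplified machine in which each thread owns a FIFO store buffer that may be flushed to memory at any time, and the names `PROGRAM` and `outcomes` are hypothetical. It enumerates every schedule of the two-thread program and collects the pair of values read by the two loads.

```python
# Two-thread program from the slide: each thread stores its own flag,
# then loads the other thread's flag.
PROGRAM = {
    "consumer": [("store", "ii", 1), ("load", "hw")],
    "producer": [("store", "hw", 1), ("load", "ii")],
}

def outcomes(store_buffers):
    """Return all (consumer's load of hw, producer's load of ii) pairs.

    store_buffers=False models sequential consistency (stores hit memory
    immediately); store_buffers=True models per-thread FIFO store buffers.
    """
    results = set()

    def explore(pc, mem, bufs, loaded):
        if all(pc[t] == len(PROGRAM[t]) for t in PROGRAM):
            results.add((loaded["consumer"], loaded["producer"]))
            return
        for t in PROGRAM:
            # Option 1: flush the oldest buffered store of thread t.
            if bufs[t]:
                (addr, val), rest = bufs[t][0], bufs[t][1:]
                explore(pc, {**mem, addr: val}, {**bufs, t: rest}, loaded)
            # Option 2: execute the next instruction of thread t.
            if pc[t] < len(PROGRAM[t]):
                instr = PROGRAM[t][pc[t]]
                next_pc = {**pc, t: pc[t] + 1}
                if instr[0] == "store":
                    _, addr, val = instr
                    if store_buffers:
                        new_bufs = {**bufs, t: bufs[t] + ((addr, val),)}
                        explore(next_pc, mem, new_bufs, loaded)
                    else:
                        explore(next_pc, {**mem, addr: val}, bufs, loaded)
                else:
                    _, addr = instr
                    # A load sees the thread's own newest buffered store first.
                    pending = [v for a, v in bufs[t] if a == addr]
                    val = pending[-1] if pending else mem[addr]
                    explore(next_pc, mem, bufs, {**loaded, t: val})

    explore({t: 0 for t in PROGRAM}, {"ii": 0, "hw": 0},
            {t: () for t in PROGRAM}, {})
    return results

sc_outcomes = outcomes(store_buffers=False)
tso_outcomes = outcomes(store_buffers=True)
print(sorted(tso_outcomes - sc_outcomes))
```

In this model the only outcome added by store buffers is the one where both loads return 0, which is the case in which the Producer skips the signal while the Consumer goes to sleep.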
Abstract View of Memory Models
Given a program P, a memory model Y defines the subset $T_{P,Y} \subseteq T$ of traces corresponding to some (partial or complete) execution of P on Y.

SC (sequential consistency) is the strongest memory model.
More executions may be possible on a relaxed memory model Y.

[Figure: nested sets $T_{P,SC} \subseteq T_{P,Y} \subseteq T$]
Example: TSO
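Under TSO, processors can buffer stores in a FIFO queue.

[Figure: nested sets $T_{P,SC} \subseteq T_{P,TSO} \subseteq T$; the trace below lies in $T_{P,TSO}$ but outside $T_{P,SC}$]

1.1 Store ii, 1
2.1 Store hw, 1
1.2 Load hw, 0
2.2 Load ii, 0

Trace corresponding to the code in the store buffer vulnerability example.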
Why TSO?
Memory models are platform dependent and ridden with details.
We focus on TSO because it models store buffers, the most common relaxation.
In practice, TSO is almost the same as the x86 hardware model.

[Figure: diagram relating the memory models SC, z6, TSO, PSO, RMO, Alpha, IA-32, IA-64]

Model Checking Programs on Relaxed
Memory Models

Covering all relaxed executions is challenging
◦ Highly nondeterministic
(exposed to low-level hardware concurrency)
◦ Memory models are usually not finite-state
◦ Memory models are often a matter of negotiation
(formal descriptions are the exception)

State of the art has limited scalability
◦ Model checking using simplified operational models
◦ Bounded model checking using axiomatic models
(CheckFence)
Memory Model Safety
Observation: Programmer writes code for SC
◦ Resorts to {locks, fences, volatiles, interlocked
operations} to maintain SC behavior where needed
◦ If program P exhibits non-SC behavior,
it is most likely a bug
Definition:
A program P is Y-safe if $T_{P,SC} = T_{P,Y}$
Decomposed Program Verification on
Relaxed Memory Models
[Figure: nested sets $T_{P,SC} \subseteq T_{P,Y} \subseteq T$]

1. Verify sequentially consistent executions
   (show that all executions in $T_{P,SC}$ are correct)
2. Verify memory model safety
   (show that $T_{P,SC} = T_{P,Y}$)

Can we do 1 and 2 at the same time? Yes.
Borderline Executions

Def.: A borderline execution for P is an execution with a successor in $T_{P,TSO} \setminus T_{P,SC}$.

[Figure: nested sets $T_{P,SC} \subseteq T_{P,TSO}$]

Thm.: A program P is TSO-safe if and only if it has no borderline executions.
We can verify or falsify this as a safety property of sequentially consistent executions!
Example: TSO Borderline Execution
Borderline execution (in $T_{P,SC}$):
1.1 Store ii, 1
2.1 Store hw, 1
1.2 Load hw, 0

Successor in $T_{P,SC}$:
1.1 Store ii, 1
2.1 Store hw, 1
1.2 Load hw, 0
2.2 Load ii, 1

Successor in $T_{P,TSO} \setminus T_{P,SC}$:
1.1 Store ii, 1
2.1 Store hw, 1
1.2 Load hw, 0
2.2 Load ii, 0

Successor traces are traces with one more instruction.
Sober Tool Structure

[Figure: tool architecture, with these components]
Stateless Model Checker (CHESS): scheduler, enumerates traces
Instrumented Program
Event Stream (shared memory accesses, sync ops)
Borderline Monitor

Outputs:
(1) P correct
(2) P not TSO-safe (+ counterexample)
(3) P has SC-bug (+ counterexample)

Program output is always sound.
Tool may not terminate exploration if the number of executions is too large.
Define SC using hb relation

Trace = set of instructions (vertices) with attributes
◦ [processor].[issue index] [operation] [address], [coherence index]
The coherence index is the position of the value within the sequence of values written to the same location (i.e., “we replace each value with its sequence number”).

Add edges: program order $\to_p$ / conflict order $\to_c$
Define happens-before order $\to_{hb} = (\to_p \cup \to_c)$
A trace is sequentially consistent if and only if $\to_{hb}$ is acyclic.

This trace is SC:
1.1 Store ii, 1
1.2 Load hw, 0
2.1 Store hw, 1
2.2 Load ii, 1

This trace is not SC:
1.1 Store ii, 1
1.2 Load hw, 0
2.1 Store hw, 1
2.2 Load ii, 0
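
The acyclicity test can be tried on the two traces above. The following is a minimal sketch, not the tool's code. It assumes one particular reading of conflict order: two accesses to the same address, at least one of them a store, are ordered by coherence index, and a store precedes a load that carries the same index. The names `Instr`, `program_order`, `conflict_order`, `is_acyclic` and `is_sc` are hypothetical.

```python
from collections import namedtuple

# One trace vertex: [processor].[issue index] [operation] [address], [coherence index]
Instr = namedtuple("Instr", "proc issue op addr coh")

def program_order(trace):
    """Edges x -> y for instructions issued by the same processor, x before y."""
    return {(x, y) for x in trace for y in trace
            if x.proc == y.proc and x.issue < y.issue}

def conflict_order(trace):
    """Edges between conflicting accesses (same address, not both loads)."""
    edges = set()
    for x in trace:
        for y in trace:
            if x.addr != y.addr or (x.op == "load" and y.op == "load"):
                continue
            reads_from = x.coh == y.coh and x.op == "store" and y.op == "load"
            if x.coh < y.coh or reads_from:
                edges.add((x, y))
    return edges

def is_acyclic(nodes, edges):
    """Repeatedly strip vertices without incoming edges; a cycle blocks progress."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = {n for n in remaining if not any(y == n for _, y in edges)}
        if not sources:
            return False
        remaining -= sources
        edges = {(x, y) for x, y in edges if x not in sources}
    return True

def is_sc(trace):
    """A trace is SC iff hb = (program order U conflict order) is acyclic."""
    return is_acyclic(trace, program_order(trace) | conflict_order(trace))

sc_trace = [
    Instr(1, 1, "store", "ii", 1), Instr(1, 2, "load", "hw", 0),
    Instr(2, 1, "store", "hw", 1), Instr(2, 2, "load", "ii", 1),
]
non_sc_trace = sc_trace[:3] + [Instr(2, 2, "load", "ii", 0)]

print(is_sc(sc_trace), is_sc(non_sc_trace))
```

In the second trace the cycle runs 1.1 → 1.2 → 2.1 → 2.2 → 1.1: the final load reads the initial value of ii and is therefore conflict-ordered before the store 1.1.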
Define TSO by Relaxing hb


Define relaxed happens-before order
$\to_{rhb} = (\to_p \cup \to_c) \setminus \{ (s,l) \mid s \text{ is a store, } l \text{ is a load, and } s \to_p l \}$
A trace is possible on TSO if and only if
(1) $\to_{rhb}$ is acyclic
(2) there do not exist s, l such that $s \to_p l$ and $l \to_c s$

Thm.: This definition is equivalent to the operational TSO model (see Tech Report).

This trace is TSO, but not SC:
1.1 Store ii, 1
1.2 Load hw, 0
2.1 Store hw, 1
2.2 Load ii, 0

[Figure: the same trace drawn twice, once with $\to_{hb}$ edges (cyclic) and once with $\to_{rhb}$ edges (acyclic)]
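
The two TSO conditions can be checked mechanically in the same style. The sketch below is illustrative only. It encodes instructions as plain tuples `(processor, issue index, operation, address, coherence index)` and assumes the same reading of conflict order as in the SC definition (accesses to one address ordered by coherence index, a store before a load with the same index). The helper names and the two extra test traces (`message_passing`, `own_store_missed`) are not from the talk.

```python
def edges_p(trace):
    """Program order: same processor, smaller issue index first."""
    return {(x, y) for x in trace for y in trace
            if x[0] == y[0] and x[1] < y[1]}

def edges_c(trace):
    """Conflict order between accesses to one address, at least one a store."""
    return {(x, y) for x in trace for y in trace
            if x[3] == y[3] and "store" in (x[2], y[2])
            and (x[4] < y[4]
                 or (x[4] == y[4] and x[2] == "store" and y[2] == "load"))}

def is_acyclic(nodes, edges):
    """Strip vertices without incoming edges until none are left or we get stuck."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = {n for n in remaining if not any(y == n for _, y in edges)}
        if not sources:
            return False
        remaining -= sources
        edges = {(x, y) for x, y in edges if x not in sources}
    return True

def is_tso(trace):
    p, c = edges_p(trace), edges_c(trace)
    # Store -> load pairs in program order are the edges that TSO relaxes.
    relaxed = {(s, l) for (s, l) in p if s[2] == "store" and l[2] == "load"}
    rhb = (p | c) - relaxed
    # Condition (2): a load may not be conflict-ordered before an earlier
    # store of its own processor.
    no_missed_own_store = not any((l, s) in c for (s, l) in relaxed)
    return is_acyclic(trace, rhb) and no_missed_own_store

store_buffering = [(1, 1, "store", "ii", 1), (1, 2, "load", "hw", 0),
                   (2, 1, "store", "hw", 1), (2, 2, "load", "ii", 0)]
message_passing = [(1, 1, "store", "x", 1), (1, 2, "store", "y", 1),
                   (2, 1, "load", "y", 1), (2, 2, "load", "x", 0)]
own_store_missed = [(1, 1, "store", "x", 1), (1, 2, "load", "x", 0)]

print(is_tso(store_buffering), is_tso(message_passing), is_tso(own_store_missed))
```

The store buffering trace passes both conditions. The message passing trace has no store-to-load pair to relax, so its cycle survives in rhb. The last trace has an acyclic rhb but violates condition (2), which shows why that condition is stated separately.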
Borderline Monitor Implementation
Receiving a stream of memory accesses:
Record all stores to all locations.
For each load L, check if there exists a reordering of L with prior stores to the same location such that
  (1) $\to_{hb}$ has a cycle
  (2) $\to_{rhb}$ is acyclic
  (3) there do not exist s, l such that $s \to_p l$ and $l \to_c s$
Implementation: use a standard vector clock to compute $\to_{hb}$, and a custom vector clock (twice the width) to compute $\to_{rhb}$
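
For readers unfamiliar with vector clocks, the following is a textbook-style sketch of tracking happens-before over a stream of accesses. It is not the Sober implementation and does not attempt the custom double-width clock for rhb. It assumes the stream arrives as a sequentially consistent interleaving, so each load reads the most recent store to its address. The names `HbTracker`, `access` and `happens_before` are hypothetical.

```python
from collections import defaultdict

class HbTracker:
    """Assigns a vector clock to each access in an SC-interleaved stream."""

    def __init__(self, procs):
        self.zero = {p: 0 for p in procs}
        self.proc_clock = {p: dict(self.zero) for p in procs}
        # Per address: clock of the last store, and join of all loads so far.
        self.last_store = defaultdict(lambda: dict(self.zero))
        self.loads = defaultdict(lambda: dict(self.zero))

    @staticmethod
    def join(a, b):
        """Component-wise maximum; always returns a fresh dict."""
        return {p: max(a[p], b[p]) for p in a}

    def access(self, proc, op, addr):
        # Program order plus the conflict edge from the last store to addr.
        clock = self.join(self.proc_clock[proc], self.last_store[addr])
        if op == "store":
            # A store is also conflict-ordered after earlier loads of addr.
            clock = self.join(clock, self.loads[addr])
        clock[proc] += 1
        self.proc_clock[proc] = clock
        if op == "store":
            self.last_store[addr] = clock
        else:
            self.loads[addr] = self.join(self.loads[addr], clock)
        return (proc, dict(clock))

def happens_before(x, y):
    """x happens before y iff y's clock has caught up with x's own component."""
    (x_proc, x_clock), (_, y_clock) = x, y
    return x != y and x_clock[x_proc] <= y_clock[x_proc]

tracker = HbTracker([1, 2])
a = tracker.access(1, "store", "ii")
b = tracker.access(1, "load", "hw")
c = tracker.access(2, "store", "hw")
d = tracker.access(2, "load", "ii")
print(happens_before(a, d), happens_before(d, a))
```

Each access costs a few clock joins, so ordering queries between any two recorded accesses reduce to a single component comparison.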
Equivalent Interleavings

Typically, many different interleavings map to the same
(Mazurkiewicz) trace.

By construction, our monitor is insensitive to the choice
of interleaving
◦ Checks all hb -equivalent ones simultaneously
◦ Makes it compatible with partial order reduction
◦ Improves probability of finding bugs
Results

Good at finding bugs even if only a small number of
schedules is explored
◦ Monitor checks all hb-equivalent interleavings
◦ CHESS heuristic (iterative context bounding) seems to mix
well


Found expected store buffer vulnerabilities in
standard examples (Dekker, Bakery)
Detected 2 store buffer vulnerabilities in a
production-level concurrency library.
◦ Overall code size ~ 33 kloc
◦ Used existing test harness written by product team (slightly
adapted for use with CHESS)
◦ Bugs not previously known
Some Numbers
program name                       context   # interleavings     time     ver. time [s]
                                   bound     total   borderline  [s]      SoBeR    CHESS
Fig. 1(b)                          ∞            10            4  < 0.1    < 0.2    < 0.2
dekker                             1             5            4  < 0.1    < 0.2    < 0.2
(2 threads, 2 crit-sec, loc 82)    2            36           23  < 0.1    0.39     0.37
                                   3           183           50  < 0.1    1.9      1.8
                                   4         1,219          124  < 0.1    13.2     13.0
                                   5         8,472          349  < 0.1    106.0    100.6
bakery                             0             1            1  < 0.1    < 0.2    < 0.2
(2 threads, 3 crit-sec, loc 122)   1            25           20  < 0.1    0.47     0.43
                                   2           742          533  < 0.1    10.3     9.8
                                   3        12,436        8,599  < 0.1    189.0    181.0
takequeue                          0             3            0  n.a.     < 0.3    < 0.3
(2 threads, 6 ops, loc 374)        1            47           14  0.34     0.72     0.69
                                   2           402          189  0.43     5.2      4.9
                                   3         2,318        1,197  0.74     28.9     27.8
                                   4         9,147        5,321  0.84     125.5    118.9
                                   5        29,821       17,922  0.86     481.5    461.6
Conclusion



With increasing use of multicores, more and more programs are likely to exhibit failures caused by the memory model.
Such failures are hard to find by conventional means (code inspection, testing).
Our combination of borderline monitor & stateless model checking makes it practical to detect memory model safety violations in a unit test environment.
Future Work


Run on larger programs (runtime verification)
Handle more memory models
◦ Which memory models guarantee borderline executions?


Prove memory model safety of concurrent data
type implementations
Develop borderline monitors for other relaxed
concurrent APIs
◦ Transactional memory
◦ Concurrency Libraries