Public Deployment of
Cooperative Bug Isolation
Ben Liblit, Mayur Naik, Alice Zheng,
Alex Aiken, and Michael Jordan
UC Berkeley, Stanford University and
University of Wisconsin (pending)
Our Goal: Measure Reality
• We measure bridges, airplanes, cars…
– Where is the flight data recorder for software?
• Users are a vast, untapped resource
– 96,000 new Kazaa users during this workshop
– Users know what matters most
• Opportunity for reality-directed debugging
– Implicit bug triage for an imperfect world
Bug Isolation Architecture
[Diagram: Program Source + Guesses → Sampler → Compiler → Shipping Application → Profile & ☺/☹ → Statistical Debugging → Top bugs with likely causes]
Predicates on Program Behavior
• Guess what might be interesting
– Branches: Left? Right?
– Function returns: Negative? Zero? Positive?
– Pairs of variables: Less? Equal? Greater?
– Reference counts: Alive? Dead?
• Count how often guesses are true
• Feedback: vector of counts + outcome label
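As a rough illustration of "count how often guesses are true", the sketch below records two of the predicate families listed above and packages them with an outcome label. The function names, site strings, and report layout are assumptions made for this example, not the project's actual format.

```python
from collections import Counter

def observe_return(counts, site, value):
    """Record which of the three return-value predicates held at `site`."""
    if value < 0:
        counts[(site, "negative")] += 1
    elif value == 0:
        counts[(site, "zero")] += 1
    else:
        counts[(site, "positive")] += 1

def observe_pair(counts, site, a, b):
    """Record which ordering predicate held for a pair of variables."""
    if a < b:
        counts[(site, "less")] += 1
    elif a == b:
        counts[(site, "equal")] += 1
    else:
        counts[(site, "greater")] += 1

def feedback_report(counts, crashed):
    """Package one run's feedback: predicate counts plus an outcome label."""
    return {"counts": dict(counts), "outcome": "crash" if crashed else "good"}

# Hypothetical run: four function returns and one variable comparison.
counts = Counter()
for value in (-1, 0, 0, 7):
    observe_return(counts, "parse.c:42", value)
observe_pair(counts, "parse.c:50", 3, 5)
report = feedback_report(counts, crashed=False)
```

Predicates that were never observed to be true simply have no entry, which keeps the report sparse.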
Sampling the Bernoulli Way
• “Next sample” countdown
– Geometric distribution
• Split into acyclic regions
– Finite threshold weight
• Clone acyclic regions
– “Fast” & “slow” variants
– Choose at run time
• Result:
– Subset of dynamic behavior
– Statistically fair sample
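A minimal sketch of the countdown scheme, assuming a sampling density strictly between 0 and 1 and treating each acyclic region as a number of instrumentation sites (its threshold weight). The class and method names are invented for illustration. When the countdown exceeds the region's weight, no sample can land inside it, so the fast variant runs; otherwise the slow variant checks the countdown at every site.

```python
import math
import random

def next_countdown(rng, density):
    """Draw the number of sites until the next sample (geometric, mean 1/density)."""
    u = 1.0 - rng.random()  # in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - density)) + 1

class Sampler:
    def __init__(self, density, seed=0):
        self.rng = random.Random(seed)
        self.density = density
        self.countdown = next_countdown(self.rng, density)
        self.samples = 0

    def run_region(self, weight):
        """Execute one acyclic region containing `weight` instrumentation sites."""
        if self.countdown > weight:
            # Fast variant: the next sample lies beyond this region, skip all checks.
            self.countdown -= weight
            return
        # Slow variant: decrement at each site, sample when the countdown reaches zero.
        for _ in range(weight):
            self.countdown -= 1
            if self.countdown == 0:
                self.samples += 1
                self.countdown = next_countdown(self.rng, self.density)

# Run many regions of varying weight and compare the observed rate to the density.
sampler = Sampler(density=0.01, seed=1)
total_sites = 0
for i in range(50_000):
    weight = 1 + i % 7
    sampler.run_region(weight)
    total_sites += weight
observed_rate = sampler.samples / total_sites
```

Because the gaps between samples are geometrically distributed, each site is sampled independently with probability equal to the density, so the observed rate should sit close to 0.01.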
Multithreaded Programs
• Global next-sample countdown
– High contention, small footprint
– Want to use registers for performance
– Thread-local: one countdown per thread
• Global predicate counters
– Low contention, large footprint
– Optimistic atomic increment
Multi-Module Programs
• Forget about global static analysis
– Plug-ins, shared libraries
– Instrumented & uninstrumented code
• Self-management at compile time
– Locally derive identifying object signature
– Embed static site information within object file
• Self-management at run time
– Report feedback state on normal object unload
– Signal handlers walk global object registry
Native Compiler Integration
• Instrumentor must mimic native compiler
– You don’t have time to port & annotate by hand
• Our approach: source-to-source, then native
• Hooks for GCC:
– Stage wrapping via scripts
– Flag management via spec files
[Diagram: Program Source + Guesses → Sampler → Compiler (tool chain) → Shipping Application]
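The "source-to-source, then native" approach can be pictured as a wrapper that turns one compile request into three commands. This sketch only builds the command lines and does not run them. `instrument-sampler` and its arguments are hypothetical placeholders for the instrumentor, and the intermediate file names are assumptions; only the `gcc -E`, `-c`, and `-o` options are standard.

```python
import os

def wrapped_compile(source, flags, instrumentor="instrument-sampler"):
    """Return command lines for: preprocess, instrument, then compile natively.

    `instrumentor` is a hypothetical source-to-source tool, named for illustration.
    """
    stem = os.path.splitext(source)[0]
    preprocessed = stem + ".pre.i"
    instrumented = stem + ".inst.i"
    return [
        # 1. Let the native compiler preprocess, so its own headers and flags apply.
        ["gcc", "-E", *flags, source, "-o", preprocessed],
        # 2. Rewrite the preprocessed source to add sampled instrumentation.
        [instrumentor, preprocessed, "-o", instrumented],
        # 3. Hand the rewritten source back to the native compiler.
        ["gcc", "-c", *flags, instrumented, "-o", stem + ".o"],
    ]

commands = wrapped_compile("src/main.c", ["-O2", "-DNDEBUG"])
```

Passing the caller's flags through unchanged is what lets the wrapper mimic the native compiler.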
Keeping the User In Control
Public Deployment, To Date
[Bar chart: Reports Received (0 to 1400) per application (Evolution, Gaim, GIMP, Gnumeric, Nautilus, Rhythmbox), broken down by outcome: Good, Error, Crash]
[Bar chart: the same applications as percentages (0% to 100%), broken down by outcome: Good, Error, Crash]
Sneak Peek: Data Exploration
Conclusions
• Public deployment is challenging
– Real-world code pushes tools to their limits
– Large user communities take time to build
• But the results are worth it:
“Thanks to Ben Liblit and the Cooperative
Bug Isolation Project, this version of
Rhythmbox should be the most stable yet.”
Join the Cause!
The Cooperative Bug Isolation Project
http://www.cs.berkeley.edu/~liblit/sampler/