Public Deployment of Cooperative Bug Isolation
Ben Liblit, Mayur Naik, Alice Zheng, Alex Aiken, and Michael Jordan
UC Berkeley, Stanford University and University of Wisconsin (pending)

Our Goal: Measure Reality
• We measure bridges, airplanes, cars…
– Where is the flight data recorder for software?
• Users are a vast, untapped resource
– 96,000 new Kazaa users during this workshop
– Users know what matters most
• Opportunity for reality-directed debugging
– Implicit bug triage for an imperfect world

Bug Isolation Architecture
[Diagram labels: Program Source, Guesses, Sampler, Compiler, Shipping Application, Profile & ☺/☹, Statistical Debugging, Top bugs with likely causes]

Predicates on Program Behavior
• Guess what might be interesting
– Branches: Left? Right?
– Function returns: Negative? Zero? Positive?
– Pairs of variables: Less? Equal? Greater?
– Reference counts: Alive? Dead?
• Count how often guesses are true
• Feedback: vector of counts + outcome label

Sampling the Bernoulli Way
• “Next sample” countdown
– Geometric distribution
• Split into acyclic regions
– Finite threshold weight
• Clone acyclic regions
– “Fast” & “slow” variants
– Choose at run time
• Result:
– Subset of dynamic behavior
– Statistically fair sample
[Figure: graph annotated with weights 4, 3, 1, 2, 1, 2, 1, 1 and a threshold check “> 4?”]
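
The countdown scheme on the sampling slide can be sketched in a few lines. This is a minimal illustration in Python, not the project's actual instrumentation, which works by cloning code at compile time. The names `next_countdown` and `run_region`, the dictionary holding the countdown, and the representation of an acyclic region as a list of (site, outcome) pairs are all assumptions made for the example.

```python
import math
import random


def next_countdown(rng, rate):
    """Draw how many sites to visit until the next sample (at least 1).

    The result is geometrically distributed with mean 1/rate.
    Assumes 0 < rate <= 1.
    """
    if rate >= 1.0:
        return 1
    u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - rate)) + 1


def run_region(sites, counters, state, rng, rate):
    """Run one acyclic region; its weight is the number of sites in it.

    sites:    list of (site_id, is_true) pairs (hypothetical representation)
    counters: dict mapping site_id -> [times sampled, times true]
    state:    dict holding the current "countdown" (always >= 1 on entry)
    """
    weight = len(sites)
    if state["countdown"] > weight:
        # "Fast" variant: no sample can fall inside this region,
        # so skip every site and pay for a single subtraction.
        state["countdown"] -= weight
        return
    # "Slow" variant: check the countdown at every site.
    for site_id, is_true in sites:
        state["countdown"] -= 1
        if state["countdown"] == 0:
            seen = counters.setdefault(site_id, [0, 0])
            seen[0] += 1
            seen[1] += int(is_true)
            state["countdown"] = next_countdown(rng, rate)
```

A caller would seed the state once with `state = {"countdown": next_countdown(rng, rate)}` and then invoke `run_region` each time a region executes. Because the gaps between samples are geometric, each site is observed with probability `rate` independently of the others, which is what makes the sample fair.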

Multithreaded Programs
• Global next-sample countdown
– High contention, small footprint
– Want to use registers for performance
– Thread-local: one countdown per thread
• Global predicate counters
– Low contention, large footprint
– Optimistic atomic increment

Multi-Module Programs
• Forget about global static analysis
– Plug-ins, shared libraries
– Instrumented & uninstrumented code
• Self-management at compile time
– Locally derive identifying object signature
– Embed static site information within object file
• Self-management at run time
– Report feedback state on normal object unload
– Signal handlers walk global object registry

Native Compiler Integration
• Instrumentor must mimic native compiler
– You don’t have time to port & annotate by hand
• Our approach: source-to-source, then native
• Hooks for GCC:
– Stage wrapping via scripts
– Flag management via spec files
[Diagram labels: Guesses, Program Source, Sampler Tool Chain, Compiler, Shipping Application]

Keeping the User In Control

Public Deployment, To Date
[Chart: reports received per application (axis 0 to 1400), split into Good, Error, and Crash, for Evolution, Gaim, GIMP, Gnumeric, Nautilus, and Rhythmbox]
[Chart: share of Good, Error, and Crash reports per application (axis 0% to 100%) for Evolution, Gaim, GIMP, Gnumeric, Nautilus, and Rhythmbox]

Sneak Peek: Data Exploration

Conclusions
• Public deployment is challenging
– Real world code pushes tools to their limits
– Large user communities take time to build
• But the results are worth it:
“Thanks to Ben Liblit and the Cooperative Bug Isolation Project, this version of Rhythmbox should be the most stable yet.”

Join the Cause!
The Cooperative Bug Isolation Project
http://www.cs.berkeley.edu/~liblit/sampler/