Empirical Software Engineering at Microsoft: Transitioning Research into Practice
Christian Bird
Empirical Software Engineering Group
Microsoft Research, Redmond

A little about me...
• Computer Science undergrad
• Worked for a large tech company writing software
• Went back to school because “There has to be a more principled way”
Decisions & Policies based on intuition and anecdotal evidence:
• “Complex code has more severe bugs!”
• “Correlation => Causation”
• “Just add more people”
• “XML will solve all problems”
• “No bugs if 100% statement coverage”

Developing Software is Expensive and Time Consuming
• Many projects run over budget
• Most ship late
• And yet more money is spent on maintenance after release than before (80%)
• And I don’t think it’s because developers are inept
"The F-35 mission systems software development and test is tending towards familiar historical patterns of extended development … deferrals to later increments."

Software Engineering Goals
• Increase Productivity
• Improve Quality

Cholera Outbreak of 1854
• Outbreak began in Soho, London on August 31, 1854
• Many false ideas
  – Miasma
  – Divine Intervention
  – “It just happens”
• Government concluded they could do nothing

Dr. Snow’s Cholera Investigation
• Had hypotheses about the spread of cholera
• Interviewed families
• Collected geographic data
• Observed the community

The beginning of a scientific field
• Broad Street Pump
• Cholera outbreak stopped
• Considered the beginning of epidemiology

How do we make projects work better more often?
Empirical Method
• Gather Data
• Examine Relationships
• Make Changes & Build Tools

Social Dynamics in Programming
“Design and programming are human activities; forget that and all is lost.”
- Bjarne Stroustrup, The C++ Programming Language

Results in this talk
1. How do ownership and expertise affect software quality?
2. How do we determine who should be coordinating work?

Dealing with Large Systems in commercial contexts
• Divide system into modules and interfaces
• Assign modules to teams/developers
• This leads to strong ownership practices

Ownership & Expertise
• Can we quantify the effect of component ownership on defects?
• What is the effect of many contributions from people with low expertise?

Ownership Terms
On a per component basis:
• Major Contributor: a developer who has made at least 5% of the total commits.
• Minor Contributor: a developer who has made less than 5% of the total commits.
• Ownership: the proportion of commits made by the highest contributing developer.

Failure Correlation Analysis
                                  Windows Vista               Windows 7
Category             Metric       Pre-release  Post-release   Pre-release  Post-release
Ownership Metrics    Total         0.84         0.70           0.92         0.24
                     Minor         0.86         0.70           0.93         0.25
                     Major         0.26         0.29          -0.40        -0.14
                     Ownership    -0.49        -0.49          -0.29        -0.02
"Classical" Metrics  Size          0.75         0.69           0.70         0.26
                     Churn         0.72         0.69           0.71         0.26
                     Complexity    0.70         0.53           0.56         0.37
Minor Contributors has a higher correlation than any other measure

Correlations can be deceiving
• Guess which has more failures
• Now guess which is larger, more complex, and had more changes

Regression Analysis
• Allows us to control for component characteristics such as size, complexity, and churn
• Skewed distribution
• Use variance explained (R²) to evaluate model improvement

Regression Analysis Results
                                  Windows Vista               Windows 7
Model                             Pre-release  Post-release   Pre-release  Post-release
Base (code metrics)               26%          29%            24%          18%
Base + Total                      40% (+14%)   35% (+6%)      68% (+35%)   21% (+3%)
Base + Minor                      46% (+20%)   41% (+12%)     70% (+46%)   21% (+3%)
Base + Minor + Major              48% (+2%)    43% (+2%)      71% (+1%)    22% (+1%)
Base + Minor + Major + Ownership  50% (+2%)    44% (+1%)      72% (+1%)    22% (+0%)
The addition of each measure was statistically significant. Total had less of an effect and improved the model less than Minor.

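The ownership terms defined on the "Ownership Terms" slide can be computed directly from a per-component commit log. The following is a minimal sketch, not the tooling used in the study: the input format (one author name per commit), the function name `ownership_metrics`, and the example data are all assumptions made for illustration. Only the 5% threshold and the three definitions come from the slides.

```python
from collections import Counter

# From the slides: a major contributor has made at least 5% of the commits.
MINOR_THRESHOLD = 0.05


def ownership_metrics(commit_authors):
    """Compute the ownership terms for a single component.

    commit_authors is a list with one author name per commit to the
    component (a hypothetical input format chosen for this sketch).
    """
    if not commit_authors:
        raise ValueError("component has no commits")

    counts = Counter(commit_authors)
    total_commits = len(commit_authors)
    # Share of the component's commits made by each developer.
    shares = {dev: n / total_commits for dev, n in counts.items()}

    return {
        # Total: number of distinct contributors.
        "total": len(counts),
        # Major: developers with at least 5% of the commits.
        "major": sorted(d for d, s in shares.items() if s >= MINOR_THRESHOLD),
        # Minor: developers with less than 5% of the commits.
        "minor": sorted(d for d, s in shares.items() if s < MINOR_THRESHOLD),
        # Ownership: share of commits by the highest contributing developer.
        "ownership": max(shares.values()),
    }


# Made-up commit log for one component: 100 commits by four developers.
example_commits = ["alice"] * 60 + ["bob"] * 35 + ["carol"] * 3 + ["dave"] * 2
metrics = ownership_metrics(example_commits)
print(metrics)
```

In this made-up example, two developers fall below the 5% threshold and count as minor contributors, while the top contributor's share gives the component's ownership value.
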
Relationship to Failures
Metric                    Effect on Failures
Size, Complexity, Churn   Medium Positive
Total Contributors        Large Positive
Minor Contributors        Largest Positive
Major Contributors        Small Positive
Ownership                 Small Negative
Only statistically significant in 3 of 4 cases

But WHY do some components have so many minor contributors?

The Major-Minor-Dependency relationship
“I need to fix Foo”
“I need to change Bar, which is used by Foo”
[Diagram: Foo.exe has a dependency on Bar.dll]
Monte Carlo simulation showed that MMD happens twice as often as would be expected by chance

Replicating Defect Prediction
Pinzger, Nagappan, Murphy [FSE 08]
Network          Precision  Recall
Whole Network    75%        82%
Without Minors   44%        58%
Without Majors   84%        88%
[Diagram: nodes Baz.sys, ie.exe, sol.exe, Bar.dll, Foo.exe, kernel.dll]
Minor Contributors are vital to predictive power

Recommendations
1. Changes made by minor contributors should be reviewed with more scrutiny.
2. Potential minor contributors should communicate desired changes rather than making them.
3. Components with low ownership should be given priority by QA resources.

The Problem of Large Software
• More people
• More code
• More coordination
Coordination overhead and breakdowns can dominate the project, leading to:
– Decreased Productivity
– Lower Quality

Branches to the rescue
• Create a separate workspace for development of a feature, fix, or maintenance task.
[Diagram labels: Trunk, Branch, Initial stable state, Development work, Completion, Deliver changes]

The Cost of Isolation
• Branches allow a temporary reprieve from requirements of awareness.
• Conflicting changes to the system will eventually manifest.

SocioTechnical Congruence
[Diagram: "change Foo()" and "related change Bar()", linked by a coordination requirement]

Does STC apply to branching?
[Diagram: "work" in Branch 1 and "related work" in Branch 2, linked by a coordination requirement]

How do we identify similar branches?
A branch is characterized in two ways:
• The changes required to accomplish the goal of the branch
• The contributors making those changes

Operationalizing Branch Profiles
• Let $D$ be the set of developers in a project: $D = \{d_1, d_2, \dots\}$
• Let $F$ be the set of files in a project: $F = \{f_1, f_2, \dots\}$
• Let $B$ be the set of branches in a project

Branch Profile Vectors
For some branch $b \in B$:
• The goal profile, $b_F$, is a vector with dimension $|F|$ such that the $i$th element in $b_F$ is the number of changes to the $i$th file.
  $b_F = \langle \text{changes to } f_1, \text{to } f_2, \dots \rangle$
• The virtual team profile, $b_D$, is a vector with dimension $|D|$ such that the $j$th element in $b_D$ is the number of changes made by the $j$th developer.
  $b_D = \langle \text{changes by } d_1, \text{by } d_2, \dots \rangle$
$$\text{cosine similarity}(A, B) = \frac{\sum_{i=1}^{n} A_i \cdot B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \cdot \sqrt{\sum_{i=1}^{n} B_i^2}}$$

Branch Similarity Example
Video Decoder Branch: $v$                   Media UI Branch: $u$
$v_F = \langle 0, 34, 16, 0 \rangle$        $u_F = \langle 5, 12, 0, 3 \rangle$
$v_D = \langle 42, 8, 0 \rangle$            $u_D = \langle 0, 3, 17 \rangle$
Goal Similarity: 0.81
Team Similarity: 0.03

Which Branches Need Coordination?
• Compare all pairs of branches by file similarity and developer similarity.
• Dark areas mean many branch pairs in that area.
• Same files, but different team means potential problems
[Figure: density plot of branch pairs; one axis runs from Different Files to Same Files, the other separates Same Teams from Different Teams]

Empirical Software Engineering
• Low ownership leads to poor quality.
• We can identify coordination needs.
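
The numbers in the Branch Similarity Example can be reproduced from the cosine similarity definition. The sketch below uses the vectors from that slide. The helper `needs_coordination` and its threshold values are hypothetical, added only to illustrate the "same files, but different team" idea, and are not taken from the talk.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length profile vectors."""
    if len(a) != len(b):
        raise ValueError("profile vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: a branch with no changes is
        # treated as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def needs_coordination(goal_sim, team_sim, goal_min=0.5, team_max=0.5):
    """Flag 'same files, different team' branch pairs.

    The thresholds are hypothetical placeholders, not values from the talk.
    """
    return goal_sim >= goal_min and team_sim <= team_max


# Profiles from the Branch Similarity Example slide.
v_F, u_F = [0, 34, 16, 0], [5, 12, 0, 3]  # goal profiles (changes per file)
v_D, u_D = [42, 8, 0], [0, 3, 17]         # team profiles (changes per developer)

goal_similarity = cosine_similarity(v_F, u_F)
team_similarity = cosine_similarity(v_D, u_D)

print(f"Goal similarity: {goal_similarity:.2f}")  # 0.81
print(f"Team similarity: {team_similarity:.2f}")  # 0.03
print(needs_coordination(goal_similarity, team_similarity))  # True
```

With the slide's vectors, the two branches touch largely the same files but are worked on by almost disjoint groups of developers, which is the combination the talk flags as a potential coordination problem.
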