Software Defect Modeling at JPL
John N. Spagnuolo Jr. and John D. Powell
19th International Forum on COCOMO and Software Cost Modeling
10/27/2004

The JPL SQI Project
• Process & Product Definition: Capture, define, and refine repeatable processes and a set of engineering practices for project use
• Measurement & Benchmarking: Provide measurement infrastructure for projects, conduct empirical analysis, and package experiences for future use
• SQI Project Engineering: Provide overall technical infrastructure and work element integration
• Software Engineering Technology Infusion: Identify, evaluate, and support software tools and techniques to facilitate process and product improvement
• Deployment: Infuse practices into project use; provide training, products, mentoring and consulting for projects

SQI Measurement & Benchmarking
• The objective of the SQI Measurement Program is to provide the basis for a quantitatively based software management approach
  – Define models and measures
  – Create an infrastructure
  – Provide consulting and support
  – Produce Handbooks & Training
• Cost estimation and planning
  – Help develop total cost, schedule and plan for project activities and phases
• Quality planning and assessment
  – Help predict and assess the quality of products
• Management tracking
  – Help managers plan / monitor activities and assess risks during project execution
• Guiding improvement
  – Help JPL assess the overall effectiveness of software processes

Quality Purpose and Goals
• The purpose of this work is to predict software quality via the development of Defect Prediction Models
  – Predict defects and their effects on software projects early in a project’s life cycle
• The current goal of this work is to make use of analysis of JPL defect data to support decision making.
  – Determine and make use of existing software defect trends
  – Determine the driving forces behind exceptions to trends
• Critical Discriminators (CDs) are the distinguishing characteristics of a project (or projects) that capture and quantify the driving force behind a deviation from overall JPL defect trends
• Combine Critical Discriminators with trends to assess threats to software project goals
  – Provide a means for managers to make use of Trends and CDs for
    • Planning
    • Prediction
    • Corrective Actions
    • Process Improvement

Project-Defect-SLOC-Chart
[Chart: 44 JPL projects, rows 1 to 18 Flight Software (Project F1 to Project F18) and rows 19 to 44 Ground Software (Project G1 to Project G26), with entries for Partial Defects, All Defects, Analyzed Defects, Code Data, and Correlated Influence]

Early Defect Measurement: Basic Defect Prediction Approach
[Figure: Number of defects vs. LKSLOC (0 to 800), with lines for FSW, FSW & GSW, and GSW]
• Average Defect Density for
  – Flight Software (FSW)
  – Ground Software (GSW)
  – FSW & GSW
• Plot Lines with Slopes = Densities

One Perspective of Data: Average Work Hours per SDS vs SLOC
• Formed Two Linear Trends
• Philosophical Connection with Subsequent Charts
[Figure: Average WH's/PFR vs LKSLOC; y-axis: Average WH's per PFR (0 to 16), x-axis: LKSLOC (0 to 200)]

Preview of Major Findings
• New Trends: Given the basic defect prediction approach (see the earlier slide), the actual JPL defect data shows defect trends that are:
  – Counterintuitive on first inspection
  – More complex on further inspection
• New CDs: Newly developed software size CDs simplify the complex and counterintuitive Defect versus Size trend.
  – Managers may quickly make use of simple, valuable sub-trends that apply to their projects:
    • Without dealing with overall trend complexities that may be based on factors that do not apply to their project
    • Based on characteristics such as size that may be estimated early in the software life cycle and easily revised for accuracy as the project progresses
  – The new CD provides a means for managers to make use of the trends for planning, prediction, corrective actions and process improvement.

The High Level Story
• Difference between Fault and Failure
  – Fault: an error which may or may not have been discovered
  – Failure: outward sign of an error (a discovered fault)
  – In this presentation, Failure = Defect
• Related literature suggests that faults vs. code size may fit an exponential curve
• Relation of exponential curve to Project G4 data plots
  – Linear fit is reasonable for Project G4 defects vs. size
  – Not necessarily a contradiction
    • All faults may not have to be fixed
    • More testing resources and/or time
    • Project G4 defect curve ~ exponential fault curve
• Expand and correlate Project G4 analysis to all FSW and GSW data collected thus far

Project G4 and JPL Projects
• Project G4 data: Number of Defects vs. SDS SLOC
  – Project G4 TEST (SDS’s): Quadratic fit is the best
  – Project G4 OPS (SDS’s): Quadratic fit is the best
  – Project G4 TEST & OPS (SDS’s): Quadratic fit is the best
  – Project G4 TEST + OPS (SDS’s): Quadratic fit is the best
• JPL Projects: Number of Defects vs. Project Size
  – All FSW: Quadratic fit is the best
  – All GSW: Quadratic fit is the best
  – All FSW & GSW: Quadratic fit is the best

Interesting Fit for ALL JPL FSW
[Figure: All Collected FSW - TEST - June 30, 2004; Number of Defects (0 to 2600) vs. LKSLOC (0 to 350); series FSW with polynomial fit y = -0.0749x^2 + 28.781x - 363.52, R^2 = 0.9981]

Curve Extends to GSW, GSW & FSW, and Project G4 TEST, OPS, TEST & OPS, TEST + OPS
[Figure: four panels of Number of Defects vs. LKSLOC with quadratic fits]
  – JPL GSW - TEST, June 30, 2004: y = -0.0113x^2 + 8.7463x - 8.1403, R^2 = 0.8374
  – All JPL FSW & GSW: TEST, June 30, 2004: y = -0.0104x^2 + 7.8956x + 135.87, R^2 = 0.689
  – JASON1 GSW TEST: y = -0.0023x^2 + 0.7257x + 4.3511, R^2 = 0.5132
  – JASON1 GSW: OPS: y = -0.0029x^2 + 0.5949x + 2.3269, R^2 = 0.5848

Trends and Potential Explanation
• Upper bound on amount of testing resources & time
  – Testing resources / KSLOC is higher for smaller modules than for larger modules.
  – One possible explanation of defects / size ratios
  – Smaller Modules
    • As code gets bigger, more people tend to test more
    • Defect curve increases to a certain point
  – Medium Modules
    • With further size increases, testing resources approach their upper limit
    • Defect curve begins to “level off”
  – Larger Modules
    • Upper bound on testing resources for Project G4
    • Testing resources / KSLOC not big enough to maintain the defect / size ratio observed in smaller modules
    • Eventually the defects / KSLOC trend curve begins to decrease

Where Does the Exponential Come In?
[Figure: two panels of Number of Defects vs. LKSLOC]
  – All FSW and GSW: Exponential Fit for LKSLOC < 247 (chart title: All FSW & GSW < 248 LKSLOC): y = 65.224e^{0.0317x}, R^2 = 0.7548
  – All FSW & GSW (0 to 800 LKSLOC), with vertical markers Vertical1 and Vertical2; fits shown: polynomial y = -0.0107x^2 + 8.0852x + 156.95, R^2 = 0.7076; linear y = 0.0004x + 497.46, R^2 = 6E-08; exponential y = 220.32e^{0.0005x}, R^2 = 0.0138

Exponential Fits for JPL FSW, GSW, Project G4 GSW TEST and Project G4 GSW OPS
[Figure: four panels of number of defects (PFRs) vs. LKSLOC with exponential fits]
  – FSW < 160 LKSLOC: y = 19.295e^{0.0661x}, R^2 = 0.9207
  – GSW < 247 LKSLOC: y = 125.53e^{0.0189x}, R^2 = 0.9157
  – JASON1 GSW: Exponential Fit for LKSLOC < 103: y = 1.6196e^{0.0992x}, R^2 = 0.874
  – JASON1 OPS: # of PFRs vs. size < 57K: y = 2.1255e^{0.0777x}, R^2 = 0.8899

Relevance to Exponential Curve
• Examination of lower SLOC range: Project G4
  – Computed (max LKSLOC - min LKSLOC)/3
  – Project G4 TEST: Excellent fit to exponential curve < 103 LKSLOC
  – Project G4 OPS: Excellent fit to exponential curve < 57 LKSLOC
  – Project G4 TEST & OPS: Excellent fit to exponential curve < 103 LKSLOC
  – Project G4 TEST + OPS: Excellent fit to exponential curve < 103 LKSLOC
• Examination of lower SLOC range: FSW, GSW, FSW & GSW
  – Computed (max LKSLOC - min LKSLOC)/3
  – All FSW: Excellent fit to exponential curve < 118 LKSLOC
  – All GSW: Excellent fit to exponential curve < 247 LKSLOC
  – All FSW and GSW: Excellent fit to exponential curve < 247 LKSLOC

Observations at this Point
• The empirical evidence thus far seems to suggest:
  – The number of defects seems to increase as code size increases, peak, and then begin a decreasing trend.
  – The exponential relationship between number of faults and code size, as indicated by the “Exponential curve”, appears to be most closely approximated by the relationship between the number of defects and code size for smaller Projects/SDS’s here at JPL
  – Code does not require exponential discovery of faults to work appropriately for JPL purposes

Summary
• Defect Differences from Test to Operations by SDS
• The Size Critical Discriminator (CD) for Defect Rates
  – Size CD is a code size threshold
    • Represents successful development of a new and valuable CD
    • Distinguishes between very different expected defect rate behaviors
    • Allows defect prediction superior to simple “defects per lines of code” models
  – Significance of the Size CD
    • Intuitive results below the Size CD threshold
    • Counterintuitive results above the Size CD threshold
      – Plausible and reasonable explanations
      – Valuable rules for projects
  – Calculation of the Size CD and Differing Trends
    • Analysis of Size CD and Defect Trends gives a reasonable confidence level
    • Very high statistical confidence levels (R^2, etc.), but more data is needed to make claims of high confidence overall
    • Actual data-driven results
    • Trends persist
      – Across multiple analyses of the data
      – Across various subsets of the data

Future Work
• Relations between criticality 1 and criticality 2 defects
• For each “Number of PFR’s vs. size” chart, construct a chart corresponding to defect density vs. size
  – Determine corresponding hump in trend curves
• Identify and Classify More Critical Discriminators
  – New individual CDs
  – Interaction of multiple CDs
  – For each CD associate modules, characteristics and data
  – Establish Trends for all modules corresponding to each CD
• Investigate interaction of multiple CDs at work in a given project / SDS
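
Illustrative Sketch: Peak of a Quadratic Fit and the Lower-Range Cutoff
The quadratic defect-versus-size fits reported in the charts imply a size at which the fitted defect count peaks, and the lower SLOC range was examined using the expression (max LKSLOC - min LKSLOC)/3. The Python sketch below is illustrative only and is not the authors' tooling. It uses the coefficients printed on the All Collected FSW - TEST chart. The function names `peak_size`, `fitted_defects`, and `lower_range_cutoff` and the sample size list are hypothetical, and treating the vertex of the quadratic as an estimate of where the trend turns is an assumption, since the slides do not state how the Size CD was calculated.

```python
"""Illustrative helpers for the quadratic defect-vs-size fits (not the authors' tooling)."""


def peak_size(a: float, b: float) -> float:
    """Return the x (LKSLOC) at which y = a*x^2 + b*x + c reaches its maximum.

    Only meaningful for a < 0 (downward-opening parabola); c does not affect
    the location of the vertex.
    """
    if a >= 0:
        raise ValueError("a must be negative for the fit to have a peak")
    return -b / (2.0 * a)


def fitted_defects(a: float, b: float, c: float, lksloc: float) -> float:
    """Evaluate the quadratic fit at a given size in LKSLOC."""
    return a * lksloc ** 2 + b * lksloc + c


def lower_range_cutoff(sizes: list[float]) -> float:
    """Return (max - min) / 3, the expression given on the slides.

    Whether the original analysis added min(sizes) back to this value is not
    stated, so this follows the written expression literally.
    """
    if not sizes:
        raise ValueError("sizes must not be empty")
    return (max(sizes) - min(sizes)) / 3.0


# Coefficients as printed on the "All Collected FSW - TEST" chart.
FSW_A, FSW_B, FSW_C = -0.0749, 28.781, -363.52

if __name__ == "__main__":
    x_peak = peak_size(FSW_A, FSW_B)
    print(f"FSW fit peaks near {x_peak:.1f} LKSLOC")
    print(f"fitted defects at peak: {fitted_defects(FSW_A, FSW_B, FSW_C, x_peak):.0f}")
    # Hypothetical module sizes in LKSLOC, for illustration only.
    print(f"lower-range cutoff: {lower_range_cutoff([5.0, 40.0, 120.0, 305.0]):.1f}")
```

With the FSW coefficients, the fitted curve peaks at roughly 192 LKSLOC. Sizes beyond that point fall in the range where the fitted defect count decreases, which is the counterintuitive region the slides describe.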