Software Defect Modeling at JPL

John N. Spagnuolo Jr. and John D. Powell
19th International Forum on
COCOMO and Software Cost Modeling
10/27/2004
The JPL SQI Project
• Process & Product Definition
– Capture, define, and refine repeatable processes and a set of engineering practices for project use
• Measurement & Benchmarking
– Provide measurement infrastructure for projects, conduct empirical analysis, and package experiences for future use
• SQI Project Engineering
– Provide overall technical infrastructure and work element integration
• Software Engineering Technology Infusion
– Identify, evaluate, and support software tools and techniques to facilitate process and product improvement
• Deployment
– Infuse practices into project use; provide training, products, mentoring and consulting for projects
SQI Measurement & Benchmarking
• The objective of the SQI Measurement Program is to provide the
basis for a quantitatively based software management approach
– Define models and measures
– Create an infrastructure
– Provide consulting and support
– Produce Handbooks & Training
• Cost estimation and planning
– Help develop total cost, schedule and plan for project activities and phases
• Quality planning and assessment
– Help predict and assess the quality of products
• Management tracking
– Help managers plan / monitor activities and assess risks during project execution
• Guiding improvement
– Help JPL assess the overall effectiveness of software processes
Quality Purpose and Goals
• The purpose of this work is to predict software quality via
the development of Defect Prediction Models
– Predict defects and their effects on software projects early in a project’s life cycle
• The current goal of this work is to make use of analysis of
JPL defect data to support decision making.
– Determine and make use of existing software defect trends
– Determine the driving forces behind exceptions to trends
• Critical Discriminators (CDs) are the distinguishing characteristics of a project or projects that capture/quantify the driving force behind a deviation from overall JPL defect trends
• Combine Critical Discriminators with trends to assess threats to software project goals
– Provide a means for managers to make use of Trends and CDs for
• Planning
• Prediction
• Corrective Actions
• Process Improvement
Project-Defect-SLOC-Chart
[Chart listing 44 JPL projects]
Flight Software: Projects F1 to F18 (rows 1 to 18)
Ground Software: Projects G1 to G26 (rows 19 to 44)
Column labels: Partial Defects, All Defects, Analyzed Defects, Code, Data Correlated, Influence
Early Defect Measurement
Basic Defect Prediction Approach
[Figure: Number of defects vs. LKSLOC, with one line each for FSW, GSW, and FSW & GSW]
• Average Defect Density for
– Flight Software (FSW)
– Ground Software (GSW)
– FSW & GSW
• Plot Lines with Slopes = Densities
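The basic approach above can be sketched in a few lines. This sketch reads "average defect density" as total defects divided by total size, which is an assumption; the slide does not say how the average is taken. The project sizes and defect counts are made-up placeholders, not JPL data, and the function names are illustrative only.

```python
# Hypothetical (size in LKSLOC, defect count) pairs; NOT actual JPL data.
fsw_projects = [(50.0, 400), (120.0, 900), (200.0, 1700)]
gsw_projects = [(80.0, 150), (300.0, 600), (700.0, 1250)]


def average_defect_density(projects):
    """Total defects divided by total size (defects per LKSLOC)."""
    total_size = sum(size for size, _ in projects)
    total_defects = sum(defects for _, defects in projects)
    return total_defects / total_size


def predict_defects(density, size_lksloc):
    """Point on the line through the origin whose slope is the density."""
    return density * size_lksloc


fsw_density = average_defect_density(fsw_projects)
gsw_density = average_defect_density(gsw_projects)
combined_density = average_defect_density(fsw_projects + gsw_projects)

for label, density in [("FSW", fsw_density),
                       ("GSW", gsw_density),
                       ("FSW & GSW", combined_density)]:
    predicted = predict_defects(density, 100.0)
    print(f"{label}: {density:.2f} defects/LKSLOC, "
          f"about {predicted:.0f} defects predicted at 100 LKSLOC")
```

A single slope per software type is the simplest possible model; the later slides look at where the actual data depart from it.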
One Perspective of Data:
Average Work Hours per SDS vs SLOC
• Formed Two Linear Trends
• Philosophical Connection with Subsequent Charts
[Figure: Average WH's/PFR vs LKSLOC. Vertical axis: Average WH's per PFR; horizontal axis: LKSLOC. Data labels on the chart include 14.4055, 11.4, 8.85, 6.3, 5.539, 4.94, 4.6042 and 4.293]
Preview of Major Findings
• New Trends: Given the basic defect prediction approach (previous slide), the actual JPL defect data show defect trends that are:
– Counterintuitive on first inspection
– More complex on further inspection
• New CDs: Newly developed software size CDs simplify the complex
and counterintuitive Defect versus Size trend.
– Managers may quickly make use of simple valuable sub-trends that apply
to their projects:
• Without dealing with overall trend complexities that may be based on factors
that do not apply to their project
• Based on characteristics such as size that may be estimated early in the
software lifecycle and easily revised for accuracy as the project progresses.
– The new CD provides a means for managers to make use of the trends for
planning, prediction, corrective actions and process improvement.
The High Level Story
• Difference between Fault and Failure
– Fault: an error that may or may not have been discovered
– Failure: the outward sign of an error (a discovered fault)
– In this presentation, Failure = Defect
• Related literature suggests that faults vs. code size may fit an exponential curve
• Relation of exponential curve to Project G4 Data Plots
– Linear fit is reasonable for Project G4 defects vs. size
– Not necessarily a contradiction
• All faults may not have to be fixed
• More testing resources and/or time
• Project G4 defect curve ~ exponential fault curve
• Expand and correlate Project G4 analysis to all FSW and
GSW Data collected thus far
Project G4 and JPL Projects
• Project G4 DATA : Number of Defects vs. SDS SLOC
– Project G4 TEST (SDS’s) - Quadratic Fit is the best
– Project G4 OPS (SDS’s) - Quadratic Fit is the best
– Project G4 TEST & OPS (SDS’s) - Quadratic Fit is the best
– Project G4 TEST + OPS (SDS’s) - Quadratic Fit is the best
• JPL Projects: Number of Defects vs. Project Size
– All FSW - Quadratic fit is the best
– All GSW - Quadratic Fit is the best
– All FSW & GSW - Quadratic Fit is the best
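One way to compare candidate fits is to fit each by least squares and compare R² values. The sketch below assumes "best" is judged that way, which the slides do not state explicitly. The data points are hypothetical (chosen to rise and then level off), and `fit_polynomial` and `r_squared` are illustrative helper names, not part of any JPL tool.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V is the Vandermonde matrix.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(a[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (b[row] - tail) / a[row][row]
    return coeffs


def r_squared(xs, ys, coeffs):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - sum(c * x ** i for i, c in enumerate(coeffs))) ** 2
                 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot


# Hypothetical (LKSLOC, defects) data; NOT actual JPL data.
sizes = [10.0, 50.0, 100.0, 150.0, 200.0, 250.0, 300.0]
defects = [210.0, 860.0, 1520.0, 1850.0, 2010.0, 1890.0, 1480.0]

linear = fit_polynomial(sizes, defects, 1)
quadratic = fit_polynomial(sizes, defects, 2)
r2_linear = r_squared(sizes, defects, linear)
r2_quadratic = r_squared(sizes, defects, quadratic)
print(f"linear R^2 = {r2_linear:.4f}, quadratic R^2 = {r2_quadratic:.4f}")
```

Because a quadratic contains the straight line as a special case, its R² can never be lower on the same data; the interesting question is how much higher it is, and whether the leading coefficient is negative (a curve that eventually turns down).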
Interesting Fit for ALL JPL FSW
[Figure: All Collected FSW - TEST - June 30, 2004. Number of Defects vs. LKSLOC, series FSW with polynomial trend line Poly. (FSW)]
$y = -0.0749x^2 + 28.781x - 363.52$
$R^2 = 0.9981$
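As a worked illustration, the quadratic trend line printed on the FSW chart (y = -0.0749x² + 28.781x - 363.52, x in LKSLOC) has a negative leading coefficient, so it reaches a maximum where its derivative is zero. The sketch below is only arithmetic on the printed coefficients; it makes no claim about any individual project.

```python
# Coefficients of the FSW trend line as printed on the chart:
# y = a*x^2 + b*x + c, with x in LKSLOC and y in number of defects.
a, b, c = -0.0749, 28.781, -363.52


def fitted_defects(x):
    """Value of the printed quadratic trend line at size x (LKSLOC)."""
    return a * x * x + b * x + c


# For a < 0 the parabola peaks where dy/dx = 2*a*x + b = 0.
peak_size = -b / (2.0 * a)
peak_defects = fitted_defects(peak_size)

print(f"Fitted curve peaks near {peak_size:.1f} LKSLOC "
      f"at about {peak_defects:.0f} defects")
```

Beyond that size the fitted curve predicts fewer defects as code grows, which is the counterintuitive behavior the following slides try to explain.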
Curve Extends to GSW, GSW&FSW
Project G4 TEST, OPS, TEST & OPS, TEST + OPS
[Figure, four panels, each plotting Number of Defects vs. LKSLOC with a quadratic trend line]
Panels JPL GSW - TEST June 30 2004 and All JPL FSW & GSW: TEST June 30 2004, trend lines as printed:
$y = -0.0113x^2 + 8.7463x - 8.1403$, $R^2 = 0.8374$
$y = -0.0104x^2 + 7.8956x + 135.87$, $R^2 = 0.689$
Panels JASON1 GSW: OPS and JASON1 GSW TEST, trend lines as printed:
$y = -0.0023x^2 + 0.7257x + 4.3511$, $R^2 = 0.5132$
$y = -0.0029x^2 + 0.5949x + 2.3269$, $R^2 = 0.5848$
Trends and Potential Explanation
• Upper bound on amount of testing resources & time
– Testing resources / KSLOC is higher for smaller modules than for larger modules.
– One possible explanation of defects / size ratios
– Smaller Modules
• As code gets bigger more people tend to test more
• Defect curve increases to a certain point
– Medium Modules
• With further size increases, testing resources approach their upper limit
• Defect curve begins to “level off”
– Larger Modules
• Upper bound on testing resources for Project G4
• Testing resources/KSLOC not big enough to maintain defect / size ratio
observed in smaller modules
• Eventually defects / KSLOC trend curve begins to decrease
Where Does the Exponential come in?
All FSW and GSW: Exponential Fit for LKSLOC < 247
[Figure, two panels]
Panel All FSW & GSW < 248 LKSLOC, exponential trend line:
$y = 65.224e^{0.0317x}$, $R^2 = 0.7548$
Panel All FSW & GSW, with markers Vertical1 and Vertical2 and three trend lines:
Quadratic: $y = -0.0107x^2 + 8.0852x + 156.95$, $R^2 = 0.7076$
Linear: $y = 0.0004x + 497.46$, $R^2 = 6 \times 10^{-8}$
Exponential: $y = 220.32e^{0.0005x}$, $R^2 = 0.0138$
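To make the exponential trend line for the smaller size range more concrete, the sketch below does two things with the printed coefficients (y = 65.224e^(0.0317x)): it computes the size increase over which the fitted defect count doubles, and it shows one common way to fit such a curve, ordinary least squares on (x, ln y). The fitting method is an assumption; the slides do not say how the trend lines were produced. The sample points are generated from the printed curve itself, not taken from JPL data.

```python
import math

# Coefficients as printed on the chart: y = A * exp(k * x), x in LKSLOC.
A, k = 65.224, 0.0317

# Size increase over which the fitted defect count doubles: ln(2) / k.
doubling_size = math.log(2.0) / k


def fit_exponential(xs, ys):
    """Fit y = A*exp(k*x) by ordinary least squares on (x, ln y)."""
    n = len(xs)
    log_ys = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_x
    return math.exp(intercept), slope


# Sanity check: points sampled from the printed curve should give back A and k.
sample_sizes = [10.0, 40.0, 70.0, 100.0, 120.0]
sample_defects = [A * math.exp(k * x) for x in sample_sizes]
A_fit, k_fit = fit_exponential(sample_sizes, sample_defects)

print(f"doubling size = {doubling_size:.1f} LKSLOC")
print(f"recovered A = {A_fit:.3f}, k = {k_fit:.4f}")
```

Fitting in log space weights relative rather than absolute errors, so small projects influence the fit as much as large ones.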
Exponential Fits for JPL FSW, GSW
Project G4 GSW TEST and Project G4 GSW OPS
[Figure, four panels, each with an exponential trend line; vertical axes: # of Defects, # of pfr's; horizontal axes: LKSLOC]
Panels FSW < 160 LKSLOC and GSW < 247 LKSLOC, trend lines as printed:
$y = 19.295e^{0.0661x}$, $R^2 = 0.9207$
$y = 125.53e^{0.0189x}$, $R^2 = 0.9157$
Panels JASON1 GSW: Exponential Fit for LKSLOC < 103 and JASON1 OPS # of pfr's vs size < 57K, trend lines as printed:
$y = 1.6196e^{0.0992x}$, $R^2 = 0.874$
$y = 2.1255e^{0.0777x}$, $R^2 = 0.8899$
Relevance to Exponential Curve
• Examination of Lower SLOC range: Project G4
– Computed (max LKSLOC - min LKSLOC)/3
– Project G4 TEST - Excellent fit to Exponential Curve < 103 LKSLOC
– Project G4 OPS - Excellent fit to Exponential Curve < 57 LKSLOC
– Project G4 TEST & OPS - Excellent fit to Exponential Curve < 103 LKSLOC
– Project G4 TEST + OPS - Excellent fit to Exponential Curve < 103 LKSLOC
• Examination of Lower SLOC range: FSW, GSW, FSW & GSW
– Computed (max LKSLOC - min LKSLOC)/3
– All FSW - Excellent fit to Exponential Curve < 118 LKSLOC
– All GSW - Excellent fit to Exponential Curve < 247 LKSLOC
– All FSW and GSW - Excellent fit to Exponential Curve < 247 LKSLOC
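The range-splitting step above can be sketched as follows. The slides give only the quantity (max LKSLOC - min LKSLOC)/3; placing the cutoff at min + (max - min)/3 is an assumption made here for illustration, as are the function names and the data points, which are not JPL data.

```python
def lower_third_cutoff(sizes):
    """Assumed cutoff: min + (max - min) / 3.

    The slide states only the (max - min) / 3 step; adding the minimum
    is this sketch's assumption about how the threshold is placed.
    """
    smallest, largest = min(sizes), max(sizes)
    return smallest + (largest - smallest) / 3.0


def below_cutoff(points, cutoff):
    """Keep the (size, defects) points whose size is under the cutoff."""
    return [(size, defects) for size, defects in points if size < cutoff]


# Hypothetical (LKSLOC, defects) points; NOT actual JPL data.
points = [(5.0, 3), (20.0, 9), (60.0, 30), (110.0, 41), (150.0, 44), (305.0, 20)]

cutoff = lower_third_cutoff([size for size, _ in points])
small_range = below_cutoff(points, cutoff)

print(f"cutoff = {cutoff:.1f} LKSLOC, {len(small_range)} points below it")
```

The points in `small_range` are the ones an exponential curve would then be fitted to.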
Observations at this Point
• The empirical evidence thus far seems to suggest:
– The number of defects increases as code size increases, peaks, then begins a decreasing trend.
– The exponential relationship between number of faults and code size, as indicated by the “Exponential curve”, appears to be most closely approximated by the relationship between number of defects and code size for smaller Projects/SDS’s here at JPL.
– Code does not require exponential discovery of faults to work appropriately for JPL purposes.
Summary
• Defect Differences from Test to Operations by SDS
• The Size Critical Discriminator (CD) for Defect Rates
– Size CD is a code size threshold
• Represents successful development of a new and valuable CD
• Distinguishes between very different expected defect rate behaviors
• Allows defect prediction superior to simple “defects per lines of code” models
– Significance of the Size CD
• Intuitive results below Size CD threshold
• Counterintuitive results above the Size CD threshold
– Plausible and Reasonable explanations
– Valuable rules for projects
– Calculation of the Size CD and Differing Trends
• Analysis of Size CD and Defect Trends gives reasonable confidence level
• Very high statistical confidence levels ($R^2$, etc.), but more data is needed to make claims of high confidence overall
• Actual data-driven results
• Trends persist
– Across multiple analyses of the data
– Across various subsets of the data
Future Work
• Relations between criticality 1 and criticality 2 defects
• For each “Number of PFR’s vs. size” chart, construct a chart corresponding to defect density vs. size and determine the corresponding hump in the trend curves
• Identify and Classify More Critical Discriminators
– New individual CDs
– Interaction of multiple CDs
– For each CD associate modules, characteristics and data
– Establish Trends for all modules corresponding to each CD
• Investigate interaction of multiple CDs at work in a given
project / SDS
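The proposed density-vs-size view can be previewed from a fitted count curve. If defects follow y = ax² + bx + c, then density is y/x = ax + b + c/x, and when both a and c are negative this has a single interior maximum at x = sqrt(c/a). The sketch below applies that to the FSW trend line printed earlier in the deck; it is arithmetic on the fitted coefficients only and is no substitute for the per-chart analysis described above.

```python
import math

# FSW trend line as printed on the chart: defects = a*x^2 + b*x + c (x in LKSLOC).
a, b, c = -0.0749, 28.781, -363.52


def density(x):
    """Fitted defects per LKSLOC: (a*x^2 + b*x + c) / x."""
    return a * x + b + c / x


# d(density)/dx = a - c / x^2 = 0  ->  x = sqrt(c / a).
# With a < 0 and c < 0 the second derivative 2*c / x^3 is negative: a maximum.
hump_size = math.sqrt(c / a)
hump_density = density(hump_size)

print(f"Fitted density peaks near {hump_size:.1f} LKSLOC "
      f"at about {hump_density:.1f} defects/LKSLOC")
```

For the trend lines with a positive constant term, the same formula has no real solution and the fitted density simply falls as size grows, so a hump in density is not guaranteed for every chart.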