Early Phase Software Effort and Schedule Estimation Models

Early Phase Software Cost and Schedule Estimation Models

Presenter: Wilson Rosa
Co-authors: Barry Boehm, Ray Madachy, Brad Clark, Cheryl Jones, John McGarry
November 16, 2015
Outline
• Introduction
• Experimental Design
• Data Analysis
• Descriptive Statistics
• Effort Models
• Schedule Models
• Conclusion
Introduction

Problem Statement
• Software cost estimates are most useful at the early elaboration phase, when source lines of code (SLOC) and Function Point Analysis (FPA) counts are not yet available.
• Mainstream software cost models do not provide alternative functional size measures for early phase estimation.
Significance of Proposed Study
• This study will remedy these limitations in three ways:
  1. Introduce effort and schedule estimating models for software development projects at the early elaboration phase
  2. Perform statistical analysis on parameters that are made available to analysts at the early elaboration phase, such as:
     – Estimated functional requirements
     – Estimated peak staff
     – Estimated effort
  3. Measure the direct effect of functional requirements on software development effort
Research Questions
Question 1: Do estimated requirements relate to actual effort?
Question 2: Do estimated requirements along with estimated peak staff relate to actual effort?
Question 3: Does estimated effort relate to actual development duration?
Question 4: Are estimating models based on estimated size more accurate than those based on final size?
Experimental Design

Quantitative Method
• A non-random sample was used since NCCA had access to names in the population, and participants were selected based on their convenience and availability (see next slide)
• This study focused on projects reported at the total level rather than by CSCIs, as requirements counts at the elaboration phase are provided at the aggregate level
• To minimize threats to validity, the analysis framework focused on estimated inputs rather than final inputs
Instrumentation
• Questionnaire:
  – Software Resource Data Report (SRDR), DD Form 2630
• Source:
  – Cost Assessment Data Enterprise (CADE) website:
    http://cade.osd.mil/Files/Policy/Initial_Developer_Report.xlsx
    http://cade.osd.mil/Files/Policy/Final_Developer_Report.xlsx
• Content:
  – Allows for the collection of project context, company information, requirements, product size, effort, schedule, and quality
Sample and Population
• Empirical data from 40 very recent US DoD programs extracted from the Cost Assessment Data Enterprise:
  http://dcarc.cape.osd.mil/Default.aspx
• Each program submitted both an SRDR Initial Developer Report (Estimates) and an SRDR Final Developer Report (Actuals)
Operating Environments
[Bar chart: number of projects by operating environment (C4I, Aircraft, Missile, Automated Information System, Enterprise Resource Planning, Unmanned Aircraft), grouped into Major Automated Information Systems (AIS) and Major Defense Systems]
Delivery Year
[Bar chart: number of projects by delivery year, 2006 through 2014]
Projects were completed during the period from 2006 to 2014
Model Reliability and Validity
 Accuracy of the models was verified using five different measures:
• Coefficient of Variation (CV): Percentage expression of the standard error compared to the mean of the dependent variable; a relative measure allowing direct comparison among models.
• P-value (α): Level of statistical significance established through the coefficient alpha (p ≤ α).
• Variance Inflation Factor (VIF): Indicates whether multicollinearity (correlation among predictors) is present in a multiple regression analysis.
• Coefficient of Determination (R²): Shows how much of the variation in the dependent variable is explained by the regression equation.
• F-test: The value of the F-test is the square of the equivalent t-test; the bigger it is, the smaller the probability that the difference could occur by chance.
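As an illustration of how these measures can be computed, the following is a minimal Python sketch using statsmodels for a log-log effort regression on a hypothetical project dataset; the file name and the column names eREQ, eStaff, and PM are placeholders, not the study's actual data or code.

```python
# Sketch only: computes R2, F-stat, p-values, CV, and VIF for a log-log effort
# regression on hypothetical data; not the authors' actual analysis code.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("srdr_projects.csv")                 # hypothetical input file
X = sm.add_constant(np.log(df[["eREQ", "eStaff"]]))   # predictors in log space
y = np.log(df["PM"])                                  # actual effort in log space

fit = sm.OLS(y, X).fit()
print("R2     :", round(fit.rsquared, 2))
print("F-stat :", round(fit.fvalue, 1), " p-values:", fit.pvalues.round(4).to_dict())

# Coefficient of Variation: standard error of the estimate as a percentage of the
# mean of the dependent variable (computed here on the regression's log scale
# purely as an illustration).
cv = 100 * np.sqrt(fit.mse_resid) / y.mean()
print("CV (%) :", round(cv, 1))

# Variance Inflation Factor for each predictor (constant column excluded).
for i, name in enumerate(X.columns):
    if name != "const":
        print("VIF", name, round(variance_inflation_factor(X.values, i), 2))
```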
Data Analysis

Pairwise Correlation Analysis
• Variable selection based on pairwise correlation
  – Pairwise correlation was chosen over structural equation modeling because the number of observations (40) was far below the minimum (200) needed
  – Variables examined: Actual Effort, Estimated Total Requirements, Actual Duration, Actual Total Requirements, Estimated New Requirements, Actual New Requirements, Estimated Peak Staff, Actual Peak Staff, Scope, Volatility, and Estimated Effort
Pairwise Correlation Analysis
Columns (left to right): Actual Effort, Actual Duration, Estimated Total REQ, Actual Total REQ, Estimated New REQ, Actual New REQ, Estimated Effort, Actual Peak Staff, Estimated Peak Staff

Actual Effort           1.0   0.6   0.7   0.7   0.7   0.5   0.6   0.4   0.4
Actual Duration         0.6   1.0   0.4   0.4   0.5   0.3   0.2  -0.2  -0.2
Estimated Total REQ     0.7   0.4   1.0   0.9   0.9   0.7   0.6   0.2   0.2
Actual Total REQ        0.7   0.4   0.9   1.0   0.8   0.8   0.6   0.3   0.3
Estimated New REQ       0.7   0.5   0.9   0.8   1.0   0.9   0.7   0.2   0.2
Actual New REQ          0.5   0.3   0.7   0.8   0.9   1.0   0.5   0.5   0.4
Estimated Effort        0.6   0.2   0.6   0.6   0.7   0.5   1.0   0.6   0.6
Actual Peak Staff       0.4  -0.2   0.2   0.3   0.2   0.5   0.6   1.0   1.0
Estimated Peak Staff    0.4  -0.2   0.2   0.3   0.2   0.4   0.6   1.0   1.0
RVOL                    0.1   0.1   0.0   0.0   0.5   0.2   0.1   0.1   0.1
Scope                   0.2  -0.1   0.1   0.1   0.4   0.3   0.1   0.4   0.4

Legend (cell shading in the original slide): Strong Correlation, Moderate Correlation, Weak Correlation
 Estimated Requirements should be considered in the effort model, as they are strongly correlated with Actual Effort
 Estimated Peak Staff should also be considered in the effort model, as it is correlated with Actual Effort
 Although Estimated Effort is weakly correlated with Actual Duration, it was still chosen based on past literature
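The variable screening above could be reproduced with a short pandas snippet along the following lines; this is a sketch with hypothetical file and column names, not the study's analysis code.

```python
# Sketch only: pairwise Pearson correlations among the screened SRDR variables,
# using hypothetical column names.
import pandas as pd

df = pd.read_csv("srdr_projects.csv")                 # hypothetical input file
cols = ["actual_effort", "actual_duration", "est_total_req", "act_total_req",
        "est_new_req", "act_new_req", "est_effort", "act_peak_staff",
        "est_peak_staff", "rvol", "scope"]

corr = df[cols].corr(method="pearson")                # full pairwise correlation matrix
print(corr.round(1))

# Candidate effort-model predictors: variables strongly correlated with actual effort.
with_effort = corr["actual_effort"].drop("actual_effort")
print(with_effort[with_effort.abs() >= 0.6])
```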
Descriptive Statistics

Project Size Boxplot
[Boxplot: product size by mission area, in functional requirements; labeled values of 906 (IT) and 2001 (Defense)]
Observation: higher requirements count for defense projects
Project Duration Boxplot
[Boxplot: development duration by mission area, in months; labeled values of 21 (IT) and 39 (Defense)]
Observation: longer duration for defense systems due to interdependencies with hardware design and platform integration schedules
Productivity Boxplot: New vs. Enhancement
[Boxplot: actual hours per estimated requirement by scope (Enhancement vs. New); labeled values of 164, 183, 173, and 89 hours per requirement]
Observation: no significant difference between new and enhancement projects
Productivity Boxplot: IT vs. Defense Projects
[Boxplot: actual hours per estimated requirement by project type (ERP, AIS, Defense), with IT projects comprising ERP and AIS; labeled values of 192, 141, and 154 hours per requirement]
Observation: ERP shows higher "hours per requirement" due to challenges with customizing SAP/Oracle
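A productivity boxplot like the ones above could be regenerated from the dataset roughly as follows; the file and column names are hypothetical placeholders.

```python
# Sketch only: boxplot of actual hours per estimated requirement, grouped by
# project type, using hypothetical column names.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("srdr_projects.csv")                 # hypothetical input file
df["hours_per_req"] = df["actual_hours"] / df["est_total_req"]

ax = df.boxplot(column="hours_per_req", by="project_type")  # e.g. ERP, AIS, Defense
ax.set_ylabel("Hours per Requirement")
plt.suptitle("")                                      # drop pandas' automatic suptitle
plt.title("Actual Hours per Estimated Requirement")
plt.show()
```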
Effort Growth Boxplot: Contract Award to End
[Boxplot: percent effort growth from contract award to end by project type (ERP, AIS, Defense), with IT projects comprising ERP and AIS; labeled values of 22%, 15%, and 37%]
Observation: no significant difference between Defense and IT projects
Effort Models

Effort Model Variables
• Actual Effort (dependent): Actual software engineering effort (in Person-Months)
• Actual Total Requirements (independent): Total requirements captured in the Software Requirements Specification (SRS); the final total requirements at end of contract
• Estimated Total Requirements (independent): Total requirements captured in the Software Requirements Specification (SRS); the estimated total requirements at contract award
• Actual Peak Staff (independent): Actual peak team size, measured in full-time equivalent staff; includes only direct labor
• Estimated Peak Staff (independent): Estimated peak team size at contract award, measured in full-time equivalent staff; includes only direct labor
Effort Model 1: using Estimated REQ

Equation: PM = 22.37 × eREQ^0.5862

Where:
  PM   = Actual effort (in Person-Months)
  eREQ = Estimated total requirements

Model statistics: N = 40, R² = 76, CV = 64, Mean = 1739, MAD = 58, REQ Min = 25, REQ Max = 13900

  Variable    Coeff     T stat
  Intercept   22.37     1.8262
  eREQ        0.5862    7.3870

[Scatterplot: Actual vs. Predicted (Unit Space), predicted PM versus actual PM, 0 to 16000]
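As a usage sketch, the equation above translates directly into code; the function below simply evaluates the published Model 1 form for a given estimated requirements count, and the input of 1,000 requirements is illustrative only.

```python
# Sketch only: evaluate Effort Model 1, PM = 22.37 * eREQ^0.5862,
# where eREQ is the estimated total requirements at contract award.
def effort_model_1(e_req: float) -> float:
    """Predicted software engineering effort in Person-Months."""
    return 22.37 * e_req ** 0.5862

print(round(effort_model_1(1000)))   # illustrative input of 1,000 requirements
```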
Effort Model 2: using Actual REQ

Equation: PM = 29.08 × aREQ^0.5456

Where:
  PM   = Actual effort (in Person-Months)
  aREQ = Actual total requirements

Model statistics: N = 40, R² = 74, CV = 54, Mean = 1739, MAD = 55, REQ Min = 35, REQ Max = 12716

  Variable    Coeff     T stat
  Intercept   29.08     1.7464
  aREQ        0.5456    6.600

[Scatterplot: Actual vs. Predicted (Unit Space), predicted PM versus actual PM, 0 to 16000]
Effort Model 3: using Estimated REQ and Staff

Equation: PM = 11.82 × eREQ^0.4347 × eStaff^0.4269

Where:
  PM     = Actual effort (in Person-Months)
  eREQ   = Estimated total requirements
  eStaff = Estimated peak staff

Model statistics: N = 40, R² = 78, CV = 54, Mean = 1739, MAD = 47, REQ Min = 25, REQ Max = 13900

  Variable    Coeff     T stat
  Intercept   11.82     1.8790
  eREQ        0.4347    4.7140
  eStaff      0.4269    3.5372

[Scatterplot: Actual vs. Predicted (Unit Space), predicted PM versus actual PM, 0 to 16000]
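Model 3 can be evaluated the same way, with estimated peak staff entering as a second factor; this is a sketch and the inputs shown are illustrative only.

```python
# Sketch only: evaluate Effort Model 3, PM = 11.82 * eREQ^0.4347 * eStaff^0.4269,
# where eREQ and eStaff are the estimates available at contract award.
def effort_model_3(e_req: float, e_staff: float) -> float:
    """Predicted software engineering effort in Person-Months."""
    return 11.82 * e_req ** 0.4347 * e_staff ** 0.4269

print(round(effort_model_3(1000, 25)))   # illustrative inputs
```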
Effort Model 4: using Actual REQ and Staff

Equation: PM = 17.01 × aREQ^0.3006 × aStaff^0.5124

Where:
  PM     = Actual effort (in Person-Months)
  aREQ   = Actual total requirements
  aStaff = Actual peak staff

Model statistics: N = 40, R² = 66, CV = 50, Mean = 1739, MAD = 57, REQ Min = 35, REQ Max = 12716

  Variable    Coeff     T stat
  Intercept   17.01     5.8891
  aREQ        0.3006    3.3815
  aStaff      0.5124    4.2866

[Scatterplot: Actual vs. Predicted (Unit Space), predicted PM versus actual PM, 0 to 16000]
Schedule Models

Schedule Model Variables
• Actual Duration (dependent): Actual software engineering duration (in months) from software requirements analysis through final qualification test
• Actual Effort (independent): Actual software engineering effort at the end of the contract
• Estimated Effort (independent): Estimated software engineering effort at contract award
Schedule Model 1: using Estimated Effort

Equation: TDEV = ePM^0.5290

Where:
  TDEV = Actual duration (in months)
  ePM  = Estimated effort (in Person-Months)

Model statistics: N = 40, R² = 94, CV = 60, Mean = 38, F-stat = 683, PM Min = 17, PM Max = 7132

  Variable    Coeff     T stat    P value
  ePM         0.5290    26.14     0.0000

[Scatterplot: Actual vs. Predicted (Unit Space), predicted TDEV versus actual duration, 0 to 140 months]
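Schedule Model 1 can likewise be evaluated directly, and chained with the early-phase effort models above when only contract-award inputs are available; the sketch below uses an illustrative estimated-effort input.

```python
# Sketch only: evaluate Schedule Model 1, TDEV = ePM^0.5290, where ePM is the
# estimated software engineering effort (in Person-Months) at contract award.
def schedule_model_1(e_pm: float) -> float:
    """Predicted development duration in months."""
    return e_pm ** 0.5290

e_pm = 1200.0                               # illustrative estimated effort
print(round(schedule_model_1(e_pm), 1))     # predicted duration in months
```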
Schedule Model 2: using Actual Effort

Equation: TDEV = aPM^0.5051

Where:
  TDEV = Actual duration (in months)
  aPM  = Actual effort (in Person-Months)

Model statistics: N = 40, R² = 95, CV = 48, Mean = 38, F-stat = 887, PM Min = 27, PM Max = 14819

  Variable    Coeff     T stat    P value
  aPM         0.5051    26.14     0.0000

[Scatterplot: Actual vs. Predicted (Unit Space), predicted TDEV versus actual duration, 0 to 140 months]
Conclusion

Primary Findings
• Estimated functional requirements are a significant contributor to software development effort.
• More of the variation in effort is explained when estimated peak staff is added to the effort model.
  – Thus, the effect of estimated functional requirements on effort should be interpreted along with estimated peak staff.
• Estimated effort is a significant contributor to development duration.
Model Usefulness
 Effort models based on estimated requirements and estimated peak staff are more appropriate at the early elaboration phase
 Effort models based on final requirements and final peak staff are more appropriate after Critical Design Review, once requirements have stabilized
 Productivity boxplots (effort per requirement) are useful for crosschecking estimates at Preliminary Design Review
 The models are appropriate for both Defense and IT projects
Study Limitations
• Since data was collected at the aggregate level, the estimation models are not appropriate for projects reported at the CSCI level.
• Do not use Effort Models 1 through 4 if your input parameter is outside of the effort model range.
• Do not use Schedule Models 1 and 2 if your input parameter is outside of the schedule model range.
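One way to enforce these range limitations in practice is a simple guard before applying a model; the sketch below uses the input ranges reported on the model slides (for example, eREQ of 25 to 13900 for Effort Model 1 and ePM of 17 to 7132 for Schedule Model 1) and is illustrative only.

```python
# Sketch only: guard against applying the models outside the calibrated data
# ranges reported on the model slides.
MODEL_RANGES = {
    "effort_model_1": (25, 13900),      # estimated total requirements
    "schedule_model_1": (17, 7132),     # estimated effort in Person-Months
}

def check_range(model: str, value: float) -> None:
    lo, hi = MODEL_RANGES[model]
    if not lo <= value <= hi:
        raise ValueError(f"{model}: input {value} is outside the calibrated range [{lo}, {hi}]")

check_range("effort_model_1", 1000)     # passes for the illustrative input used earlier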
Future Work
• Develop similar effort and schedule estimation models using data reported at the CSCI level
• Build effort models using functional requirements along with other cost drivers, such as:
  – Complexity/Application Domain
  – Percent reuse
  – Requirements Volatility
  – Process Maturity
Acknowledgement
• Dr. Shu-Ping Hu, Tecolote Research
• Dr. Brian Flynn, Technomics