University of Southern California
Center for Systems and Software Engineering

Building Cost Estimating Relationships for Acquisition Decision Support

Brad Clark, Ray Madachy, Thomas Tan, & Barry Boehm
Wilson Rosa, Sponsor

November 3, 2010
25th International Forum on COCOMO and Systems/Software Cost Modeling

Topics

• Research problem and objectives
• Data challenges and resolution
• Results
• Future work

Project led by the Air Force Cost Analysis Agency (AFCAA) working with service cost agencies, and assisted by the University of Southern California and the Naval Postgraduate School.

Problem

• For many years, there have been efforts to collect data from multiple projects and organizations
  – Data Analysis Center for Software (DACS)
  – Software Engineering Information Repository (SEIR)
  – International Software Benchmarking Standards Group (ISBSG)
  – Large Aerospace Mergers (attempts to create company-wide databases)
  – USAF Mosemann Initiative (Lloyd Mosemann, Asst. Sec. USAF)
  – USC CSSE COCOMO II repository
  – DoD Software Resources Data Report (SRDR)
• Purpose: to derive estimating relationships and benchmarks for size, cost, productivity and quality
• All have faced common challenges such as data definitions, completeness and integrity

Research Objectives

• Using SRDR data, improve the quality and consistency of estimating methods across cost agencies and program offices through guidance, standardization, and knowledge sharing.
  – Characterize different Application Domains and Operating Environments within DoD
  – Analyze collected data for simple Cost Estimating Relationships (CERs) within each domain
  – Develop rules-of-thumb for missing data
• Make collected data useful to oversight and management entities

[Diagram: Data Records, Data Analysis, CERs of the form Cost = a * X^b]

SRDR Raw Data (520 observations)

PM = 1.67 * KSLOC^0.66

Data Conditioning

• Segregate data
• Normalize sizing data (predictor)
• Normalize effort data (response)
• Address multi-build data

SRDR Data Segmentation

Communication Domain Analysis-1

Domain: Communications
  Examples:
  • Radios
  • Microwave controller
  • Large telephone switching systems
  • Network management
  Brief Definition: Software that controls the transmission and receipt of voice, data, digital and video information. The software operates in real-time or in pseudo real-time. Environment: Fixed ground, mobile ground, manned and unmanned airborne, or unmanned space.

Environment: Fixed Ground
  Examples:
  • Computing facilities
  • Command and Control centers
  • Tactical Information centers
  • Communication centers
  Brief Definition: Manned and unmanned fixed, stationary land sites (buildings) with access to external power sources, backup power sources, physical access to systems, regular upgrades and maintenance to hardware and software, and support for multiple users. Possibly a noisy environment.

Mission Management Analysis-1

Domain: Mission Management
  Examples:
  • Operational Flight Program
  • Mission Computer
  • Flight Control Software
  Brief Definition: Software that enables and assists the operator in performing mission management activities, including scheduling activities based on vehicle, operational and environmental priorities. Environment: Mobile ground, avionics or manned space.

Environment: Avionics
  Examples:
  • Fixed-wing aircraft
  • Helicopters
  Brief Definition: Manned airborne platforms. Software that is complex and runs in real-time in embedded computer systems. It must often operate under interrupt control to process timelines in the nanoseconds.
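
Once records are segregated by domain and environment as above, a CER of the form Cost = a * X^b can be fit within each segment. The slides do not say which fitting procedure was used; the sketch below assumes ordinary least squares on log-transformed data, a common choice because the power law becomes a straight line in log-log space. The function name `fit_power_law` is hypothetical, and the input points are synthetic values generated from the raw-data relationship PM = 1.67 * KSLOC^0.66, not SRDR records.

```python
import math


def fit_power_law(sizes, efforts):
    """Fit effort = a * size**b by least squares on (log size, log effort).

    Assumption: log-log ordinary least squares; the slides do not state
    the actual fitting method.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log space = log(a)
    return a, b


# Synthetic, noise-free points placed exactly on PM = 1.67 * KSLOC^0.66.
# These are illustrative values only, not SRDR data.
synthetic_ksloc = [1.0, 5.0, 20.0, 80.0, 200.0]
synthetic_pm = [1.67 * k ** 0.66 for k in synthetic_ksloc]

a, b = fit_power_law(synthetic_ksloc, synthetic_pm)
print(f"a = {a:.2f}, b = {b:.2f}")
```

Because the synthetic points lie exactly on the curve, the fit recovers the coefficient and exponent it was generated from; real records would scatter around the line and yield an R^2 below 1.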
Normalizing Size

• Normalize the SLOC counting method to Logical SLOC
  – Physical SLOC count converted to Logical SLOC count by programming language
  – Non-comment SLOC count converted to Logical SLOC count by programming language
• Convert Auto-Generated SLOC to Equivalent SLOC (ESLOC)
  – Use AAF formula: (DM% * 0.4) + (CM% * 0.3) + (IM% * 0.3)
  – DM = CM = 0; IM = 100
• Convert Reused SLOC to ESLOC with AAF formula
  – DM = CM = 0; IM = 100
• Convert Modified SLOC to ESLOC
  – Use AAF formula: (DM% * 0.4) + (CM% * 0.3) + (IM% * 0.3)
  – Default values: Low – Mean – High based on 90% confidence interval
• Create Equivalent SLOC count and scale to thousands (K) to derive EKSLOC
• (New + Auto-Gen + Reused + Modified) / 1000 = EKSLOC
• Remove all records with an EKSLOC below 1.0

SLOC Count Conversion Experiment

Logical SLOC = 0.611 * NCSS Count
R^2 = 0.9974

SLOC Count Conversion Factors

  Language   Data Count   Total Line to Logical   NCSS to Logical
  Ada             4              0.25                  0.52
  C/C++          12              0.32                  0.61
  C#              8              0.35                  0.68
  Java            6              0.35                  0.72
  Perl            4              0.53                  0.70
  PHP             4              0.44                  0.66
  Overall        38              0.33                  0.64

For example, (C++ NCSS SLOC Count) * 0.61 = (C++ Logical SLOC Count)

Convert Modified Size to ESLOC

• Use AAF formula: (DM% * 0.4) + (CM% * 0.3) + (IM% * 0.3)
• Problems with missing DM, CM & IM in SRDR data
• Program interviews provided parameters for some records
• For missing data, use records that have data in all fields to derive recommended values

Convert Modified Size to ESLOC

• Communication Domain (18 observations)

                 DM%   CM%   IM%
  Median          15    28    64
  Low 90% CL      14    20    49
  Mean            25    31    62
  High 90% CL     36    42    75

• Mission Management (19 observations)

                 DM%   CM%   IM%
  Median         100   100   100
  Low 90% CL      58    69    76
  Mean            75    83    88
  High 90% CL     92    97   100

Normalizing Effort

• Labor hours are reported for 7 categories:
  – Software Requirements
  – Software Architecture (including Detailed Design)
  – Software Code (including Unit Testing)
  – Software Integration and Test
  – Software Qualification Test
  – Software Developmental Test & Evaluation
  – Other (Mgt, QA, CM, PI, etc.)
• Create effort distribution percentages for records that have hours in the requirements, architecture, code, integration and qualification test phases (the developmental test & evaluation and other phases may or may not be blank)
• Fill in missing hours using the effort distribution table
• Developmental Test and Other hours are currently not used

Distribution Percentages

• Communication (27 observations)

                 Req't%   Arch%   Code%   I&T%   QT%
  Median           16      27      32      21     4
  Low 90% CL       14      23      29      17     4
  Mean             17      27      32      20     7
  High 90% CL      20      30      35      23    10

• Mission Management (16 observations)

                 Req't%   Arch%   Code%   I&T%   QT%
  Median           24      13      34      17    11
  Low 90% CL       18      11      27      12     6
  Mean             24      14      32      17    13
  High 90% CL      30      19      37      22    20

Multi-Build Data

One More: Team Experience

• SRDR Data Definition
  – Report the percentage of project personnel in each category
  – Highly Experienced in the domain (three or more years of experience)
  – Nominally Experienced in the project domain (one to three years of experience)
  – Entry-level Experienced (zero to one year of experience)
• Need to include Team Experience (TXP) in CERs to estimate cost
• After analyzing the data, the following quantitative values are assigned:
  – Highly experienced: 0.60
  – Nominally experienced: 1.00
  – Entry-level experienced: 1.30

Data Conditioning Results

Just Kidding!
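
The size-normalization steps described in the conditioning slides (count conversion, AAF weighting, scaling to EKSLOC, and dropping small records) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the AAF percentage is divided by 100 to obtain a multiplier, and the example record is invented rather than taken from the SRDR data.

```python
# NCSS-to-Logical factors copied from the SLOC Count Conversion Factors slide.
NCSS_TO_LOGICAL = {
    "Ada": 0.52, "C/C++": 0.61, "C#": 0.68,
    "Java": 0.72, "Perl": 0.70, "PHP": 0.66,
}


def aaf_fraction(dm_pct, cm_pct, im_pct):
    """AAF = (DM% * 0.4) + (CM% * 0.3) + (IM% * 0.3), returned as a fraction.

    Assumption: the percentage result is divided by 100 to get a multiplier.
    """
    return (dm_pct * 0.4 + cm_pct * 0.3 + im_pct * 0.3) / 100.0


def to_eksloc(new, auto_gen, reused, modified, modified_dm_cm_im):
    """Combine Logical SLOC counts into Equivalent KSLOC.

    Auto-generated and reused code use DM = CM = 0, IM = 100 per the slides;
    modified code uses the supplied (DM%, CM%, IM%) triple.
    """
    auto_gen_equiv = auto_gen * aaf_fraction(0, 0, 100)
    reused_equiv = reused * aaf_fraction(0, 0, 100)
    modified_equiv = modified * aaf_fraction(*modified_dm_cm_im)
    return (new + auto_gen_equiv + reused_equiv + modified_equiv) / 1000.0


def keep_record(eksloc):
    """Records with an EKSLOC below 1.0 are removed."""
    return eksloc >= 1.0


# Hypothetical record: 20,000 NCSS of new C/C++ code, no auto-generated code,
# 10,000 Logical SLOC reused, 5,000 Logical SLOC modified. DM/CM/IM are missing,
# so the Communication-domain mean values (25, 31, 62) stand in for them.
new_logical = 20_000 * NCSS_TO_LOGICAL["C/C++"]
size = to_eksloc(new_logical, 0, 10_000, 5_000, (25, 31, 62))
print(f"{size:.3f} EKSLOC, keep={keep_record(size)}")
```

With DM = CM = 0 and IM = 100 the AAF multiplier is 0.30, so reused and auto-generated code count at 30% of their raw size in this sketch.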
Five Phases

PM = 6.35 * EKSLOC^0.939
R^2 = 0.86

Three Phases

PM = 3.8 * EKSLOC^0.953
R^2 = 0.88

Five Phases

PM = 5.06 * EKSLOC^1.22
R^2 = 0.890

Three Phases

PM = 3.47 * EKSLOC^1.19
R^2 = 0.9094

Simple Cost Estimating Relationships

  Domain               CER                              # Data Pts   EKSLOC Range   R^2
  Communications       PM = 3.8 * EKSLOC^0.95 * TXP         26        4.8 to 200    0.88
  Mission Management   PM = 3.47 * EKSLOC^1.19 * TXP        36        1.2 to 201    0.91

Notes:
  CER: Cost Estimating Relationship
  PM: Person Months (152 labor hours / month)
  EKSLOC: Equivalent Thousands of Source Lines of Code
  R^2: Coefficient of determination, ranging from 0 to 1
  Bias: Average percentage error by which the estimate is above/below the actual value

Conclusion

• Workshop: Thursday from 8:00 AM to 2:50 PM
• Come find out
  – How to use the information to construct an estimate
  – Next steps and schedule
  – Conclusions about this approach
• Discussion on how the SRDR may change
  – We made recommendations for improvements
• We will also discuss ranking Application Domains by order of productivity
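
As a rough illustration of how the Simple Cost Estimating Relationships might be used to construct an estimate, the sketch below evaluates PM = a * EKSLOC^b * TXP for a given domain and converts the result to labor hours at 152 hours per person month. The coefficients, EKSLOC ranges and experience values (0.60, 1.00, 1.30) come from the slides. The function names are hypothetical, and combining the three experience values as a percentage-weighted average is an assumption, since the slides do not say how the reported percentages are turned into a single TXP.

```python
# Coefficients and valid size ranges from the Simple CER table.
CERS = {
    "Communications": {"a": 3.8, "b": 0.95, "range": (4.8, 200.0)},
    "Mission Management": {"a": 3.47, "b": 1.19, "range": (1.2, 201.0)},
}
HOURS_PER_PM = 152  # labor hours per person month, per the table notes


def team_experience(pct_high, pct_nominal, pct_entry):
    """Blend the slide's experience values into one TXP multiplier.

    Assumption: a percentage-weighted average of 0.60 / 1.00 / 1.30.
    The slides give the three values but not the combination rule.
    """
    total = pct_high + pct_nominal + pct_entry
    return (pct_high * 0.60 + pct_nominal * 1.00 + pct_entry * 1.30) / total


def estimate_pm(domain, eksloc, txp=1.0):
    """Evaluate PM = a * EKSLOC^b * TXP, refusing sizes outside the data range."""
    cer = CERS[domain]
    low, high = cer["range"]
    if not low <= eksloc <= high:
        raise ValueError(
            f"{eksloc} EKSLOC is outside the {low} to {high} range for {domain}"
        )
    return cer["a"] * eksloc ** cer["b"] * txp


# Hypothetical project: 50 EKSLOC of Communications software,
# staffed 20% highly, 60% nominally and 20% entry-level experienced.
txp = team_experience(20, 60, 20)
pm = estimate_pm("Communications", 50.0, txp)
print(f"TXP = {txp:.2f}, effort = {pm:.1f} PM ({pm * HOURS_PER_PM:.0f} hours)")
```

The range check reflects that each CER was derived from a limited span of project sizes; extrapolating beyond it is outside what the table supports.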
Questions?

For more information, contact:

Wilson Rosa
Wilson.Rosa@pentagon.af.mil
703-604-0395

Or

Brad Clark
bkclark@csse.usc.edu
703-754-0115

Or

Ray Madachy
rjmadach@nps.edu