COCOMO II Calibration
Brad Clark, Software Metrics Inc.
Don Reifer, Reifer Consultants Inc.
22nd International Forum on COCOMO and Systems/Software Cost Modeling
USC Campus, Los Angeles, CA, 31 Oct to 2 Nov 2007

Topics
I. Globbing study results
– Glob: (informal noun) a lump of a semi-liquid substance
– COCOMO Globbing: lumping projects together based on common attributes
II. COCOMO Model calibration status

Globbing Motivation
Create pre-set cost driver ratings for different application domains
– Select an application domain and then create a set of driver ratings for it based on common characteristics
– Allows estimators to generate estimates quickly
– Permits estimators to build a knowledge base around application domain characteristics
Create calibration groups within the data
– Hypothesis: projects in the database are so diverse that "local" calibration within a group should improve model accuracy and precision

Approach -1
Globbing data into application domains is based on productivities and size.
What productivity do we use: raw or adjusted?
– Raw productivity = SLOC / actual effort
– Adjusted productivity = SLOC / (actual effort adjusted for project characteristics)
Why would we consider adjusting actual effort?
– Because there are business decisions that impact the project and are independent of the application domain
Which COCOMO II drivers are business-driven? How do we create size ranges?

Approach -2
Business-driven drivers (independent of the application):
– Analyst capability (ACAP)
– Programmer capability (PCAP)
– Personnel continuity (PCON)
– Application experience (APEX)
– Platform experience (PLEX)
– Language and tool experience (LTEX)
– Use of software tools (TOOL)
– Multi-site development (SITE)
– Required development schedule (SCED)
– All scale drivers
Application-driven drivers:
– Required software reliability (RELY)
– Database size (DATA)
– Product complexity (CPLX)
– Developed for reuse (RUSE) (maybe not)
– Documentation match to life-cycle needs (DOCU)
– Execution time constraint (TIME)
– Main storage constraint (STOR)
– Platform volatility (PVOL)

Size Buckets
When size and productivity are compared, there are three areas where productivity changes at different rates:
– Small: 2 to 25 KSLOC
– Medium: 25 to 100 KSLOC
– Large: 100 to 1,000 KSLOC
[Chart: COCOMO II Productivities (Nominal Projects): productivity in SLOC/PM versus size in KSLOC, falling from about 317 SLOC/PM at the smallest sizes to about 171 SLOC/PM at 1,000 KSLOC]

Analysis Approach -1
Data from the COCOMO II 2000 calibration (161 projects) were used to conduct the study.
The actual effort was adjusted to remove the effect of the business-driven drivers: PM'
Data points were divided into four groups based on the productivities observed in the data:
– Glob-1: Defense-like applications (real-time; complex)
– Glob-2: Telecom-like applications (high reliability)
– Glob-3: Scientific-like applications (compute-intensive)
– Glob-4: Business-like applications (data-intensive)
Each Glob was segregated into three size buckets based on COCOMO II model productivity rates:
– Small: 2 to 25 KSLOC
– Medium: 25 to 100 KSLOC
– Large: 100+ KSLOC
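The slides do not spell out exactly how actual effort was adjusted, so the following is a minimal Python sketch of one plausible reading: starting from the standard COCOMO II form PM = A * Size^(B + 0.01*sum(SF)) * prod(EM), the business-driven effort multipliers and the scale-driver contribution to the exponent are divided back out of the actual effort to obtain PM'. The multiplier and scale-factor values would come from the published COCOMO II.2000 rating tables; the function names and the example numbers are illustrative only, not from the study.

    B = 0.91  # COCOMO II.2000 base exponent

    def adjusted_effort(pm_actual, ksloc, business_ems, scale_factors):
        """Back the business-driven drivers out of actual effort (one possible
        reading of the PM' adjustment described above).
        business_ems: effort-multiplier values for ACAP, PCAP, PCON, APEX,
                      PLEX, LTEX, TOOL, SITE, SCED at their rated levels
        scale_factors: the five scale-driver values (PREC, FLEX, RESL, TEAM, PMAT)
        """
        em_product = 1.0
        for em in business_ems:
            em_product *= em
        exponent_delta = 0.01 * sum(scale_factors)   # scale-driver effect on the exponent
        return pm_actual / (em_product * ksloc ** exponent_delta)

    def adjusted_productivity(sloc, pm_prime):
        return sloc / pm_prime   # SLOC per adjusted person-month

    # Hypothetical example: 56 KSLOC project, 300 PM actual effort,
    # nominal business-driven drivers (all EM = 1.0), nominal scale factors.
    nominal_sf = [3.72, 3.04, 4.24, 3.29, 4.68]   # published COCOMO II.2000 nominal values
    pm_prime = adjusted_effort(300.0, 56.0, [1.0] * 9, nominal_sf)
    print(adjusted_productivity(56_000, pm_prime))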
Analysis Approach -2
Globbing around application-type and size has been successfully demonstrated in other cost models.
We defined the application-types as follows:

Driver   Glob-1   Glob-2   Glob-3   Glob-4
RELY     VH       H        H        N
DATA     N        H        L        N
CPLX     H        N        H        L
RUSE     N        N        N        N
DOCU     N        N        N        N
TIME     VH       H        N        N
STOR     VH       H        N        N
PVOL     H        N        N        N

We use these definitions because we can compare results against the norms reported in the March 2002 CrossTalk article "Let the Numbers Do the Talking".

COCOMO II 2000 Effort Prediction Accuracy
As a comparison to the Globbing results, the COCOMO II model accuracy reported in 2000 was:

Prediction Accuracy(1)   Before Stratification by Organization   After Stratification by Organization
PRED(20)                 63%                                      70%
PRED(25)                 68%                                      76%
PRED(30)                 75%                                      80%

Note 1: PRED(X) = Y% means that Y% of the predicted values fall within X% of the actual values.

Results -1
Projects in each Glob and size bucket were used to create a new COCOMO II constant, A, for each group (B set to 0.91).

Size Bucket     Metric     Glob-1   Glob-2   Glob-3   Glob-4
2-25 KSLOC      n          6        10       6        10
                A          3.13     2.82     2.99     2.82
                PRED(25)   83%      80%      100%     70%
                PRED(20)   67%      70%      100%     50%
25-100 KSLOC    n          13       15       15       14
                A          3.21     3.12     3.09     2.67
                PRED(25)   69%      87%      73%      93%
                PRED(20)   69%      73%      73%      86%
100+ KSLOC      n          18       8        6        9
                A          3.61     3.18     2.40     2.75
                PRED(25)   72%      75%      50%      89%
                PRED(20)   56%      25%      33%      89%

Results -2
COCOMO II constants, A and B, were created for each group.

Size Bucket     Metric     Glob-1   Glob-2   Glob-3   Glob-4
2-25 KSLOC      n          6        10       6        10
                A          3.53     4.27     2.88     1.90
                B          0.87     0.75     0.92     1.07
                PRED(25)   83%      80%      100%     60%
                PRED(20)   67%      70%      100%     50%
25-100 KSLOC    n          13       15       15       14
                A          7.14     5.42     6.44     2.05
                B          0.69     0.76     0.72     0.98
                PRED(25)   69%      80%      73%      93%
                PRED(20)   69%      80%      73%      86%
100+ KSLOC      n          18       8        6        9
                A          4.58     9.52     1.20     8.76
                B          0.87     0.73     1.04     0.68
                PRED(25)   72%      75%      50%      100%
                PRED(20)   56%      75%      50%      100%

Usage Example
Example project:
– Estimated 56,000 SLOC
– Application-type: Telecom-like (Glob-2, constant A = 3.12)
Results:
– Preset drivers:
  • RELY: High
  • DATA: High
  • TIME: High
  • STOR: High
– All other drivers: Nominal
– 382 PM, 24 months, 16 average staffing level
As more is known (but not until then), the COCOMO II drivers can be adjusted to reflect that information.
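As a check on the example above, here is a minimal Python sketch of the calculation, assuming the standard COCOMO II.2000 post-architecture equations. The effort-multiplier values for the preset High ratings, the nominal scale-factor values, and the schedule constants (3.67, 0.28) are taken from the published COCOMO II.2000 model, not from the slide; with them, the Glob-2 presets and A = 3.12 reproduce roughly the 382 PM, 24 months, and 16 average staff quoted above.

    # Sketch of the Glob-2 usage example, assuming the published COCOMO II.2000
    # post-architecture equations and rating values (not given on the slide).
    A, B = 3.12, 0.91          # locally calibrated A for the 25-100 KSLOC Telecom-like glob
    ksloc = 56.0

    # Preset driver ratings (published COCOMO II.2000 effort multipliers).
    em = {"RELY High": 1.10, "DATA High": 1.14, "TIME High": 1.11, "STOR High": 1.05}
    em_product = 1.0
    for value in em.values():
        em_product *= value

    # All five scale drivers at Nominal (published COCOMO II.2000 values).
    nominal_sf = [3.72, 3.04, 4.24, 3.29, 4.68]
    E = B + 0.01 * sum(nominal_sf)              # effort exponent, about 1.10

    pm = A * ksloc ** E * em_product            # about 382 person-months
    tdev = 3.67 * pm ** (0.28 + 0.2 * (E - B))  # about 24 months (nominal schedule)
    staff = pm / tdev                           # about 16 average staff

    print(f"{pm:.0f} PM, {tdev:.0f} months, {staff:.0f} average staff")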
Globbing Conclusions
The purpose was to create pre-set cost driver ratings for different application domains and a local calibration based on the domain.
An application domain is not characterized by all COCOMO II drivers; only a subset of the drivers describes an application domain (the question is which subset).
Calibrating a new COCOMO II constant, A, for each group makes the most sense, and the results are reasonably accurate.
– More data are needed to calibrate both A and B (10-12 projects for each group)
– Size buckets account for possible changes in B

Globbing Next Steps
Get feedback on this idea from you, the conference attendees
– Please share your thoughts with Don or Brad
The key to this approach is identifying the correct Globs and the cost-driver settings for each Glob.
– Use a consensus approach to setting the cost drivers for the different Globs
– Name and describe the Globs
– Re-run the analysis
Re-apply this technique to new project data in the repository.

Topics
I. Globbing study results
– Glob: (informal noun) a lump of a semi-liquid substance
– COCOMO Globbing: lumping projects together based on common attributes
II. COCOMO Model calibration status

COCOMO II Calibration Status
Please check the CSSE website under "Past Events" for the results of this work:
http://csse.usc.edu/csse/event/past.html

For More Information
Brad Clark, Software Metrics Inc., (703) 754-0115, Brad@software-metrics.com
Don Reifer, Reifer Consultants Inc., (310) 530-4493, dreifer@earthlink.net