APPENDIX C
COCOMO SUITE: DATA COLLECTION FORMS AND GUIDELINES

© 1999-2000 USC Center for Software Engineering. All Rights Reserved.

C.1 Introduction

Appendix C provides a set of forms and procedures for collecting effort and schedule data for a given software project throughout its life cycle, in a form compatible with the following COCOMO Suite models: COCOMO II and its emerging extensions COCOTS, COPSEMO, COQUALMO and CORADMO. These data collection or Software Project Data (SPD) forms have been kept brief, with minimal definitions and explanations. Please refer to the index and glossary for definitions of any unfamiliar terms.

The procedures are oriented around the collection of information and the updating of estimates at the project's life cycle anchor points (see Appendix A). Revising project estimates as each anchor point is achieved provides immediate benefits by furnishing 1) estimates that are more accurate, and 2) current cost-to-complete and schedule-to-complete information. Revised estimates also provide the data needed to perform up-to-date sensitivity, risk and parametric analyses.

Such data collection activities are an integral part of an effective project management process. The information gathered is used to determine whether a project is on track relative to original plans built upon initial estimates. When actual cost and schedule performance deviates from plans, new estimates may be in order.

Data collection should not be an additional burden for management. Thus, we have organized COCOMO II data collection to be management-relevant and easy to implement via the electronic forms found on the accompanying CD. For on-going projects, data collection allows you to determine whether or not your performance is on track relative to plans.
For completed projects, data collection allows you to develop a database that you can use to calibrate COCOMO II and the other Suite models more precisely to your actual experience. For both types of projects, data collection permits you to use existing knowledge to improve the accuracy of your estimating capabilities.

The data collection forms and procedures provided here enable an organization to develop the core capabilities needed to satisfy the Activities Performed of the new Level 2 Measurement and Analysis process area found in the Capability Maturity Model Integration (CMMI), recently issued at:

http://www.sei.cmu.edu/cmm/cmmi/

The activities include: establish measurement objectives; define measures; define data collection and storage procedures; define analysis procedures; collect measurement data; analyze measurement data; store data and results; and communicate results.

C.2 Procedure for Projects

The Software Project Data forms (Figures) and corresponding instructions (Tables) described below are provided herein as well as on the accompanying CD-ROM. The first form applies to all the Suite models, while forms two through five apply to COCOMO II; these are also needed, however, as a base for all of the emerging extensions. The remaining forms, six through nine, are specific to the individual extensions of COCOMO II. All the forms can be used for either on-going or completed projects:

Form SPD-1: General Information (All Models) (Figure C-1/Table C-1). Originated at the start of the project, updated at intermediate milestones and completed at the end of the project.

Form SPD-2a: Phase Summaries (Waterfall-based process) (Figure C-2a/Table C-2a). Estimated or actual phase information entered at the end of each major phase of the project following a Waterfall-based process.
Finalized at the end of the development.

Form SPD-2b: Phase Summaries (MBASE/RUP-based process) (Figure C-2b/Table C-2b). Estimated or actual phase information entered at the end of each major phase of the project following an MBASE/RUP-based process. Finalized at the end of the development.

Form SPD-3: Component Summaries (Figure C-3/Table C-3). Component data entered at the start of the project. Completed at the end of the development.

Form SPD-4: COCOMO II Progress Runs (Figure C-4/Table C-4). Estimated project cost and schedule data, and ratings for estimating parameters, entered at the end of each major phase of the project.

Form SPD-5: COCOMO II Project Actuals (Figure C-5/Table C-5). Actual project cost and schedule data, and final ratings for estimating parameters, collected at the end of the project.

Form SPD-5a: COCOMO II Project Actuals: Simple Completed Project (Figure C-5a/Table C-5a). Actual project cost and schedule data, and final ratings for estimating parameters, for simple completed projects; collected at the end of the project.

Form SPD-6a: COCOTS Project Level Data (Figure C-6a/Table C-6a). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6b: COCOTS Assessment Data (Figure C-6b/Table C-6b). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6c: COCOTS Tailoring Data (Figure C-6c/Table C-6c). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6d: COCOTS Glue Code Data (Figure C-6d/Table C-6d).
Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6e: COCOTS Volatility Data (Figure C-6e/Table C-6e). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries (Figure C-7/Table C-7). Phase cycles and activity breakdowns.

Form SPD-8: COQUALMO Defect Summaries (Figure C-8/Table C-8). Defect introduction and removal data collected by artifact and life cycle phase.

Form SPD-9: CORADMO RAD Details Summaries (Figure C-9/Table C-9). Rapid Application Development parameters (CORADMO Driver Ratings). Project ratings entered at the start of the project. Final ratings reassessed at the end of the development, relying on COPSEMO detailed effort and schedule actuals data for calibration.

C.3 Guidelines for Data Collection

C.3.1 New Projects

Projects starting out should consider collecting cost, schedule and error data at the following times during the project's life:

Project Start - develop your initial estimates using the following set of forms:
  Form SPD-1: General Information
  Form SPD-3: Component Summaries
  Form SPD-6a: COCOTS Project Level Data

At the end of Major Project Phases - update your estimates using the following set of forms:
  Form SPD-2a: Phase Summaries (Waterfall-based process)
  Form SPD-2b: Phase Summaries (MBASE/RUP-based process)
  Form SPD-3: Component Summaries
  Form SPD-4: COCOMO II Progress Runs
  Form SPD-6b: COCOTS Assessment Data
  Form SPD-6c: COCOTS Tailoring Data
  Form SPD-6d: COCOTS Glue Code Data
  Form SPD-6e: COCOTS Volatility Data
  Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries
  Form SPD-8: COQUALMO Detailed Summaries
  Form SPD-9: CORADMO RAD Project Summaries

At the end of the development - capture your project actuals using the following forms:
  Form SPD-2a: Phase Summaries (Waterfall-based process)
  Form SPD-2b: Phase Summaries (MBASE/RUP-based process)
  Form SPD-5: COCOMO II Project Actuals
  Form SPD-6b: COCOTS Assessment Data
  Form SPD-6c: COCOTS Tailoring Data
  Form SPD-6d: COCOTS Glue Code Data
  Form SPD-6e: COCOTS Volatility Data
  Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries
  Form SPD-8: COQUALMO Detailed Summaries
  Form SPD-9: CORADMO RAD Details

New projects should view data collection as an opportunity. They can use the data to benchmark their progress, develop business cases and calibrate their cost models.

C.3.2 Completed Projects

In general, it is not possible to reconstruct COCOMO II and other COCOMO Suite milestone runs and detailed phase/activity data from completed projects. If the project was estimated using earlier versions of the model, we suggest that you use our Rosetta Stone [Boehm, Reifer, Chulani, ????] to convert the data.
If it wasn't, we suggest that you try to capture as much cost-related data as possible using the following forms:

Form SPD-1: General Information - Fill out this form as best you can.

Form SPD-2a: Phase Summaries (Waterfall-based process) - Complete this form for each major delivery of a Waterfall-based process.

Form SPD-2b: Phase Summaries (MBASE/RUP-based process) - Complete this form for each major delivery of an MBASE/RUP-based process.

Form SPD-3: Component Summaries - Do the best you can with whatever data you can gather. Use a code counter to collect actuals whenever possible.

Form SPD-4: COCOMO II Progress Runs - Fill out this form using any cost- and schedule-to-complete information at your disposal. If no such information exists or is readily available, don't waste your time.

Form SPD-5: COCOMO II Project Actuals - Complete this form by sifting through your accounting reports and by inspecting the final product.

Form SPD-5a: COCOMO II Project Actuals: Simple Completed Projects - Preferably, this form should be accompanied by Form SPD-1, but it can be used as a one-page total-completed-project data collection form compatible with the data provided for a COCOMO II estimation run.

Form SPD-6a: COCOTS Project Level Data - Do the best you can with whatever data you can gather.

Form SPD-6b: COCOTS Assessment Data - Do the best you can with whatever data you can gather.

Form SPD-6c: COCOTS Tailoring Data - Do the best you can with whatever data you can gather.

Form SPD-6d: COCOTS Glue Code Data - Do the best you can with whatever data you can gather. Use a code counter to collect actuals whenever possible.

Form SPD-6e: COCOTS Volatility Data - Do the best you can with whatever data you can gather.

Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries - Complete this form by sifting through your accounting reports and applying engineering judgement based on personnel and their tasks or roles.

Form SPD-8: COQUALMO Detailed Summaries - Fill out this form as completely as you can using inspection reports, technical review reports, testing results and reports, and software trouble report records as your sources.

Form SPD-9: CORADMO RAD Details Summaries (Figure C-9/Table C-9) - Fill out this form with Rapid Application Development parameters (CORADMO Driver Ratings).

C.3.3 Maintenance Projects

COCOMO II also provides you with the capability to develop annual or other periodic maintenance cost estimates based upon a modification of the original COCOMO 81 maintenance model, described in Chapter 2, Section 2.5. We suggest that you use the forms provided when using this model. However, you will want to use the actuals and re-rate the project attributes collected on Form SPD-5 when computing the numbers.

C.4 Data Conditioning

Data conditioning is an essential activity in the software data collection and analysis process. Even when people try to provide the best data they can, there are a number of known problems and subtle sources of misunderstanding that can inject bias into their data. Using such data to calibrate cost models can lead to erroneous results unless these and other sources of data contamination are removed.

C.4.1 Sources of Data Contamination

Besides the problems of missing data and clerical errors, some of the most common sources of software data collection problems include:

1. Inconsistent definitions - The COCOMO II model defines terms differently than previous models. For example, it uses SLOC (Source Lines of Code) instead of the DSI (Delivered Source Instructions) used in the original COCOMO model (see Section 2.2.1).
An "IF-THEN-ELSE, ELSE IF" pair will now count as a single SLOC instead of two DSI when the counting conventions are based on a terminal semicolon. As another example, COCOMO uses 152 person hours per person month and assumes casual overtime is not included as part of the burden. If you used something different, the model would generate erroneous answers.

2. Improper scope - The COCOMO II model assumes that the project's scope includes certain activities and excludes others. For example, software testing is included while software support to system integration and test is not. As another example, software documentation that is normally generated during the software development life cycle is included while customer-unique documentation is not. Again, you would generate erroneous answers if you used the model outside of its proper scope. Appendix A, Section 6, records the major COCOMO II scoping assumptions.

3. Double Counting - Sometimes items are double counted, that is, taken into account twice using several factors within the model. For example, REVL is used to take into account volatile requirements. However, some people double dip by improperly rating the Precedentedness or Architecture/Risk Resolution scale factors lower than they should be to take volatility into account. To avoid this mistake, make sure you understand what each factor rating involves before you rate it.

4. Averaging - Often, people use average ratings for groupings that extend across subsystems and the project. Because they haven't taken the time to get into the details, they consolidate their estimate and lose fidelity, since little differentiation is made between different types of software. You can avoid this problem and greatly improve the accuracy of your estimates by breaking down the project into finer-grained components.

5. Garbage In, Garbage Out - Another common problem is the use of erroneous assumptions. People often use models to generate quick-and-dirty estimates. They make all sorts of simplifying assumptions in their quest for numbers. One way to avoid problems of this sort is to take a little more time to develop assumptions that are simplifying but still realistic. This often takes some interaction with both the developer and customer communities.

6. Observational Bias - Finally, many people tend to be overly optimistic or pessimistic when they estimate. Biases either way should be avoided, especially when they can become a self-fulfilling prophecy. Use of Wideband Delphi, in which groups of experts reach consensus on their estimates, reduces such biases. However, such group estimates take more time to achieve and may not be practical under some circumstances.

C.4.2 Data Conditioning Guidelines

The best defense against these problems is to provide those involved in the data collection with a clear set of definitions, automated procedures and examples. Build self-checks into your data collection system whenever possible. For example, you can ask a question in two different ways on two related forms to test the consistency of the answer (e.g., effort, schedule, average staff size). Be careful not to overdo this, however. In addition, such problems can be further avoided by collecting the data close to its source. For example, try to collect labor hours using your time card system. Finally, make data collection a natural part of the way you implement your processes. For example, collect error data as part of your software trouble reporting process. This eliminates the need to use multiple forms and makes it easier to collect the data.
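A self-check of the kind described above can be sketched in a few lines of code. The following is only an illustration and is not part of the SPD forms: it assumes that labor hours come from a time card system, converts them with the 152 person hours per person month cited in Section C.4.1, and tests whether a reported average staff size agrees with effort divided by schedule. The function names and the 10% tolerance are hypothetical choices.

```python
HOURS_PER_PERSON_MONTH = 152  # COCOMO convention cited in Section C.4.1


def to_person_months(person_hours):
    """Convert labor hours (e.g., from a time card system) to person-months."""
    return person_hours / HOURS_PER_PERSON_MONTH


def staffing_is_consistent(effort_pm, schedule_months, avg_staff, tolerance=0.10):
    """Return True if the reported average staff size agrees with
    effort / schedule to within a relative tolerance (an assumed value)."""
    if schedule_months <= 0:
        raise ValueError("schedule must be positive")
    implied_staff = effort_pm / schedule_months
    return abs(implied_staff - avg_staff) <= tolerance * implied_staff


# Example: 15,200 labor hours over a 10-month schedule implies 10 people.
effort_pm = to_person_months(15200)
ok = staffing_is_consistent(effort_pm, 10, avg_staff=10)       # consistent
suspect = staffing_is_consistent(effort_pm, 10, avg_staff=15)  # needs review
```

An organization that uses a different number of hours per person month would normalize its data to the model's convention in this way before calibration, rather than changing the check itself.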
The following additional data conditioning guidelines are recommended for inclusion in your process:

Data screening - Each form should be screened when completed to identify missing, unreasonable and inconsistent entries. For example, a large PM (person-month) estimate for a small application needs to be checked. As another example, a high rating for applications experience on a highly unprecedented application seems inconsistent. If the data is collected on-line, such checks can and should be automated.

Wideband Delphi - Whenever possible, base your estimates on more than a single person and more than one cost model. By polling the experts, this approach keeps the observational bias natural to estimating to a minimum.

On-Line Forms and Counting Conventions - Put your forms and counting conventions on-line whenever possible. Make sure that you include plenty of examples to illustrate how to fill out the forms properly and how to count correctly. If possible, construct a web site on your server and make the forms and conventions accessible to all.

Guard Competitively Sensitive Data with Your Life - Protect your cost data carefully. Don't allow unauthorized access to the database, and guard it against pilfering and pirating. Limit access to the data to those who are responsible for its use and analysis.

Compare to Industry Benchmarks - If you can, see how you compare to any published benchmarks. This provides you with yet another check on the reasonableness of your data. The COCOMO II database size, effort, and cost driver rating distributions in Chapter 4 provide one such source.
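The data screening guideline lends itself to automation when forms are collected on-line. The sketch below shows one possible shape for such a check, flagging missing, unreasonable and inconsistent entries in a single form record. The record layout, the field names and the productivity threshold are all hypothetical and are not taken from the SPD forms; any real limits should come from your own calibrated data.

```python
REQUIRED_FIELDS = ("project_id", "size_ksloc", "effort_pm", "schedule_months")

# Assumed sanity bound, for illustration only: flag entries implying fewer
# than 50 SLOC per person-month (a large PM figure for a small application).
MIN_SLOC_PER_PM = 50


def screen_form(record):
    """Return a list of human-readable problems found in one form record."""
    problems = []

    # Missing entries
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing: {field}")
    if problems:
        return problems  # the remaining checks need every required field

    # Unreasonable entries
    for field in ("size_ksloc", "effort_pm", "schedule_months"):
        if record[field] <= 0:
            problems.append(f"unreasonable: {field} must be positive")
    if problems:
        return problems

    # Inconsistent entries: large effort for a small application
    sloc_per_pm = record["size_ksloc"] * 1000 / record["effort_pm"]
    if sloc_per_pm < MIN_SLOC_PER_PM:
        problems.append(
            f"inconsistent: {sloc_per_pm:.0f} SLOC/PM is below {MIN_SLOC_PER_PM}"
        )
    return problems
```

A record that passes returns an empty list; anything else is returned to the submitter for review rather than corrected automatically, so that the data stays close to its source.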
Instructions to Prentice Hall editors for physical presentation of the data collection forms described in this appendix: The forms and accompanying instructions should appear at the end of this appendix, after all the expository text. The forms themselves (identified as figures) should appear only on left-side pages. The associated instructions (identified as tables) should appear only on right-side pages, OPPOSITE the figure a given table is explaining.

[Insert Figure C-1]
[Insert Table C-1]
[Insert Figure C-1 (cont'd)]
[Insert Table C-1 (cont'd)]
[Insert Figure C-2a]
[Insert Table C-2a]
[Insert Figure C-2b]
[Insert Table C-2b]
[Insert Figure C-3]
[Insert Table C-3]
[Insert Figure C-3 (cont'd)]
[Insert Table C-3 (cont'd)]
[Insert Figure C-3 (cont'd)]
[Insert Table C-3 (cont'd)]
[Insert Figure C-4]
[Insert Table C-4]
[Insert Figure C-5]
[Insert Table C-5]
[Insert Figure C-5 (cont'd)]
[Insert Table C-5 (cont'd)]
[Insert Figure C-5 (cont'd)]
[Insert Table C-5 (cont'd)]
[Insert Figure C-5a]
[Insert Table C-5a]
[Insert Figure C-6a]
[Insert Table C-6a]
[Insert Figure C-6a (cont'd)]
[Insert Table C-6a (cont'd)]
[Insert Figure C-6b]
[Insert Table C-6b]
[Insert Figure C-6c]
[Insert Table C-6c]
[Insert Figure C-6d]
[Insert Table C-6d]
[Insert Figure C-6d (cont'd)]
[Insert Table C-6d (cont'd)]
[Insert Figure C-6e]
[Insert Table C-6e]
[Insert Figure C-7]
[Insert Table C-7]
[Insert Figure C-8]
[Insert Table C-8]
[Insert Figure C-8 (cont'd)]
[Insert Table C-8 (cont'd)]
[Insert Figure C-8 (cont'd)]
[Insert Table C-8 (cont'd)]
[Insert Figure C-9]
[Insert Table C-9]