University of Southern California
Center for Systems and Software Engineering

Cost Estimation with COCOMO II
Barry Boehm
CS 510, Fall 2015
v3: Slide 10 edited by Jim Alstad

Outline
• Model Overview
• Model Reinterpretation for CS 577
©USC-CSSE

Software Cost Estimation Methods
• Cost estimation: prediction of both the person-effort and elapsed time of a project
• Methods:
– Algorithmic
– Expert judgement
– Estimation by analogy
– Parkinsonian
– Price-to-win
– Top-down
– Bottom-up
• Best approach is a combination of methods – compare and iterate estimates, reconcile differences
• COCOMO – the “COnstructive COst MOdel”
– COCOMO II is the update to Dr. Barry Boehm’s COCOMO 1981
• COCOMO is the most widely used, thoroughly documented, and calibrated cost model

Software Estimation Accuracy
• Effect of uncertainties over time
[Funnel chart: the relative size range of estimates narrows over time, from 0.25x–4x at Feasibility toward 1x at Initial Operating Capability, across the Feasibility, Plans/Rqts., Design, and Develop and Test phases and the Operational Concept, Life Cycle Objectives, Life Cycle Architecture, and Initial Operating Capability milestones.]

COCOMO Black Box Model
[Diagram: inputs – product size estimate; product, process, platform, and personnel attributes; reuse, maintenance, and increment parameters; organizational project data. COCOMO II outputs – development and maintenance cost and schedule estimates; cost and schedule distribution by phase, activity, and increment; recalibration to organizational data.]

Major COCOMO II Features
• Multi-model coverage of different development sectors
• Variable-granularity cost model inputs
• Flexibility in size inputs
– SLOC
– function points
– application points
– other (use cases ...?)
• Range vs. point estimates per funnel chart

Relations to ICSM-Sw/MBASE*/RUP Anchor Point Milestones
[Timeline diagram: COCOMO II estimates cover Application Composition, Inception, Elaboration, Construction, and Transition; the LCO, LCA, and IOC anchor point milestones align with the waterfall model’s Rqts. (SRR), Prod. Des. (PDR), and System Development milestones.]
*MBASE: Model-Based (System) Architecting and Software Engineering

COCOMO Effort Formulation

Effort (person-months) = A × (Size)^E × ∏ EM_i   (i = 1 … n, where n = number of cost drivers)

• Where:
– A is a constant derived from historical project data (currently A = 2.94 in COCOMOII.2000)
– Size is in KSLOC (thousand source lines of code), or converted from function points or object points
– E is an exponent for the diseconomy of scale, dependent on five additive scale drivers: E = 0.91 + 0.01 × ΣSF_i, where SF_i is the weighting factor for the ith scale driver
– EM_i is the effort multiplier for the ith cost driver.
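This effort equation can be sketched directly in Python. All the scale-factor and effort-multiplier values below are hypothetical illustrations, not ratings from any calibrated project:

```python
# Sketch of the COCOMO II effort equation (A = 2.94 from COCOMOII.2000).
# Scale-factor ratings (SF_i) and effort multipliers (EM_i) are assumed
# values for illustration only.
A = 2.94
scale_factors = [3.72, 3.04, 4.24, 3.29, 4.68]   # hypothetical SF_i ratings
effort_multipliers = [1.10, 1.00, 0.87]          # hypothetical EM_i values

# Diseconomy-of-scale exponent: E = 0.91 + 0.01 * sum(SF_i)
E = 0.91 + 0.01 * sum(scale_factors)

def effort_pm(ksloc):
    """Effort (person-months) = A * Size^E * product of all EM_i."""
    em_product = 1.0
    for em in effort_multipliers:
        em_product *= em
    return A * ksloc ** E * em_product

print(round(E, 4))              # 1.0997
print(round(effort_pm(100), 1))
```

With these assumed ratings, a 100-KSLOC project comes out to roughly 445 person-months; note how an exponent E > 1 makes effort grow faster than size.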
The geometric product of the effort multipliers yields an overall effort adjustment factor applied to the nominal effort.
• Automated translation effects are not included

Diseconomy of Scale
• Nonlinear relationship when the exponent > 1
[Chart: person-months vs. KSLOC (0–1000) for exponents B = 0.91, B = 1.00, and B = 1.226; effort grows disproportionately with size when B > 1.]

COCOMO Schedule Formulation

Schedule (months) = C × (Effort)^(0.28 + 0.2 × (E − 0.91)) × SCED% / 100

• Where:
– Schedule is the calendar time in months from the requirements baseline to acceptance
– C is a constant derived from historical project data (currently C = 3.67 in COCOMOII.2000)
– Effort is the estimated person-months excluding the SCED effort multiplier
– E is the exponent from the effort equation
– SCED% is the compression/expansion percentage from the SCED cost driver
• This is the COCOMOII.2000 calibration
• The formula can vary to reflect process models for reusable and COTS software, and the effects of application composition capabilities.

MBASE Phase Distributions

Phase           Effort %   Schedule %
Inception           6         12.5
Elaboration        24         37.5
Construction       76         62.5
Transition         12         12.5
COCOMO Total      100        100
Project Total     118        125

• See the COCOMO II book for complete phase/activity distributions

COCOMO II Output Ranges
• COCOMO II provides one-standard-deviation optimistic and pessimistic estimates.
• These reflect the sources of input uncertainty shown in the funnel chart.
• They apply to effort or schedule for all of the stage models.
• They represent 80% confidence limits: actuals fall below the optimistic estimate 10% of the time and above the pessimistic estimate 10% of the time.
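The schedule equation can be sketched the same way; C = 3.67 is the COCOMOII.2000 constant, while the effort and exponent inputs below are hypothetical:

```python
# Sketch of the COCOMO II schedule equation (C = 3.67, COCOMOII.2000).
C = 3.67

def schedule_months(effort_pm, E, sced_pct=100):
    """Months = C * Effort^(0.28 + 0.2*(E - 0.91)) * SCED% / 100."""
    exponent = 0.28 + 0.2 * (E - 0.91)
    return C * effort_pm ** exponent * sced_pct / 100.0

# Hypothetical project: 100 person-months, E = 1.10, nominal schedule.
print(round(schedule_months(100, 1.10), 1))   # about 16 months
```

Because the SCED% term scales calendar time directly, a 160% SCED rating stretches the computed schedule by the same ratio.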
Stage   Optimistic Estimate   Pessimistic Estimate
1             0.50 E                2.0 E
2             0.67 E                1.5 E
3             0.80 E                1.25 E

Reused and Modified Software
• Effort for adapted software (reused or modified) is not the same as for new software.
• Approach: convert adapted software into an equivalent size of new software.

Nonlinear Reuse Effects
• The reuse cost function does not go through the origin, due to a cost of about 5% for assessing, selecting, and assimilating the reusable component.
• Small modifications generate disproportionately large costs, primarily due to the cost of understanding the software to be modified and the relative cost of interface checking.
[Chart: relative cost vs. amount modified, from data on 2954 NASA modules [Selby, 1988]; the observed values (0.046, 0.55, 0.70, 0.75, rising to 1.0) lie well above the usual linear assumption.]

COCOMO Reuse Model
• A nonlinear estimation model to convert adapted (reused or modified) software into an equivalent size of new software:

AAF = 0.4 × (DM) + 0.3 × (CM) + 0.3 × (IM)

ESLOC = ASLOC × [AA + AAF × (1 + 0.02 × (SU) × (UNFM))] / 100,  for AAF ≤ 50
ESLOC = ASLOC × [AA + AAF + (SU) × (UNFM)] / 100,  for AAF > 50

COCOMO Reuse Model cont’d
• ASLOC – Adapted Source Lines of Code
• ESLOC – Equivalent Source Lines of Code
• AAF – Adaptation Adjustment Factor
• DM – Percent Design Modified. The percentage of the adapted software's design which is modified in order to adapt it to the new objectives and environment.
• CM – Percent Code Modified. The percentage of the adapted software's code which is modified in order to adapt it to the new objectives and environment.
• IM – Percent of Integration Required for Modified Software.
The percentage of effort required to integrate the adapted software into an overall product and to test the resulting product, as compared to the normal amount of integration and test effort for software of comparable size.
• AA – Assessment and Assimilation. Effort needed to determine whether a fully-reused software module is appropriate to the application, and to integrate its description into the overall product description. See table.
• SU – Software Understanding. Effort increment as a percentage. Only used when code is modified (zero when DM = 0 and CM = 0). See table.
• UNFM – Unfamiliarity. The programmer's relative unfamiliarity with the software (0–1), applied multiplicatively to the software understanding effort increment.

Assessment and Assimilation Increment (AA)

AA Increment   Level of AA Effort
0              None
2              Basic module search and documentation
4              Some module Test and Evaluation (T&E), documentation
6              Considerable module T&E, documentation
8              Extensive module T&E, documentation

Software Understanding Increment (SU)
• Take the subjective average of the three categories.
• Do not use SU if the component is being used unmodified (DM = 0 and CM = 0).

Very Low (SU increment to ESLOC: 50): Structure – very low cohesion, high coupling, spaghetti code. Application Clarity – no match between program and application world views. Self-Descriptiveness – obscure code; documentation missing, obscure, or obsolete.
Low (40): Structure – moderately low cohesion, high coupling. Application Clarity – some correlation between program and application. Self-Descriptiveness – some code commentary and headers; some useful documentation.
Nominal (30): Structure – reasonably well-structured; some weak areas. Application Clarity – moderate correlation between program and application. Self-Descriptiveness – moderate level of code commentary, headers, and documentation.
High (20): Structure – high cohesion, low coupling. Application Clarity – good correlation between program and application. Self-Descriptiveness – good code commentary and headers; useful documentation; some weak areas.
Very High (10): Structure – strong modularity, information hiding in data/control structures. Application Clarity – clear match between program and application world views. Self-Descriptiveness – self-descriptive code; documentation up-to-date, well-organized, with design rationale.

Programmer Unfamiliarity (UNFM)
• Only applies to modified software

UNFM Increment   Level of Unfamiliarity
0.0              Completely familiar
0.2              Mostly familiar
0.4              Somewhat familiar
0.6              Considerably familiar
0.8              Mostly unfamiliar
1.0              Completely unfamiliar

Cost Factors
• Significant factors of development cost:
– scale drivers are sources of exponential effort variation
– cost drivers are sources of linear effort variation
• product, platform, personnel, and project attributes
• effort multipliers associated with cost driver ratings
– defined to be as objective as possible
• Each factor is rated between Very Low and Very High per rating guidelines
– relevant effort multipliers adjust the cost up or down

Scale Factors
• Precedentedness (PREC)
– Degree to which the system is new and past experience applies
• Development Flexibility (FLEX)
– Need to conform with specified requirements
• Architecture/Risk Resolution (RESL)
– Degree of design thoroughness and risk elimination
• Team Cohesion (TEAM)
– Need to synchronize stakeholders and minimize conflict
• Process Maturity (PMAT)
– SEI CMM process maturity rating

Scale Factor Rating
Scale Factors (Wi), rated Very Low / Low / Nominal / High / Very High / Extra High:
• PREC: thoroughly unprecedented / largely unprecedented / somewhat unprecedented / generally familiar / largely familiar / thoroughly familiar
• FLEX: rigorous / occasional relaxation / some relaxation / general conformity / some conformity / general goals
• RESL*: little (20%) / some (40%) / often (60%) / generally (75%) / mostly (90%) / full (100%)
• TEAM: very difficult interactions / some difficult interactions / basically cooperative interactions / largely cooperative / highly cooperative / seamless interactions
• PMAT: weighted average of “Yes” answers to the CMM Maturity Questionnaire
* % of significant module interfaces specified, % of significant risks eliminated

Cost Drivers
• Product Factors
– Reliability (RELY)
– Data (DATA)
– Complexity (CPLX)
– Reusability (RUSE)
– Documentation (DOCU)
• Platform Factors
– Time constraint (TIME)
– Storage constraint (STOR)
– Platform volatility (PVOL)
• Personnel Factors
– Analyst capability (ACAP)
– Programmer capability (PCAP)
– Applications experience (APEX)
– Platform experience (PLEX)
– Language and tool experience (LTEX)
– Personnel continuity (PCON)
• Project Factors
– Software tools (TOOL)
– Multisite development (SITE)
– Required schedule (SCED)

Product Factors
• Required Software Reliability (RELY)
– Measures the extent to which the software must perform its intended function over a period of time.
– Ask: what is the effect of a software failure?

RELY:  Very Low – slight inconvenience;  Low – low, easily recoverable losses;  Nominal – moderate, easily recoverable losses;  High – high financial loss;  Very High – risk to human life

Example Effort Multiplier Values for RELY

RELY rating:   Very Low   Low    Nominal   High   Very High
Multiplier:      0.82     0.92    1.00     1.10     1.26

E.g., a Very High reliability system costs 26% more than a nominally reliable system (1.26/1.0 = 1.26), and 54% more than a Very Low reliability system (1.26/0.82 ≈ 1.54).

Product Factors cont’d
• Data Base Size (DATA)
– Captures the effect large data requirements have on development, to generate the test data that will be used to exercise the program
– Calculate the data/program size ratio: D/P = Database Size (bytes) / Program Size (SLOC)

DATA:  Low – D/P < 10;  Nominal – 10 ≤ D/P < 100;  High – 100 ≤ D/P < 1000;  Very High – D/P ≥ 1000

Product Factors cont’d
• Product Complexity (CPLX)
– Complexity is divided into five areas:
• control operations,
• computational operations,
• device-dependent operations,
• data management operations, and
• user interface management operations.
– Select the area or combination of areas that characterize the product or a sub-system of the product.
– See the module complexity table, next several slides

Product Factors cont’d
• Module Complexity Ratings vs. Type of Module
– Use a subjective weighted average of the attributes, weighted by their relative product importance.
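That weighted-average rule can be sketched numerically; the ordinal rating scale, the per-area ratings, and the importance weights below are all hypothetical:

```python
# Hypothetical sketch of combining per-area complexity ratings into one
# CPLX rating via a weighted average (Very Low = 1 ... Extra High = 6).
RATING_SCALE = {"Very Low": 1, "Low": 2, "Nominal": 3,
                "High": 4, "Very High": 5, "Extra High": 6}

area_ratings = {"control": "High", "computational": "Low",
                "device-dependent": "Nominal",
                "data management": "Very High",
                "user interface": "Nominal"}               # assumed ratings
weights = {"control": 0.4, "computational": 0.1,
           "device-dependent": 0.1,
           "data management": 0.3, "user interface": 0.1}  # assumed importance

cplx_score = sum(RATING_SCALE[area_ratings[a]] * weights[a]
                 for a in area_ratings)
print(round(cplx_score, 2))   # 3.9, i.e. just below High
```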
Control Operations:
• Very Low: straight-line code with a few non-nested structured programming operators: DOs, CASEs, IF-THEN-ELSEs; simple module composition via procedure calls or simple scripts.
• Low: straightforward nesting of structured programming operators; mostly simple predicates.
• Nominal: mostly simple nesting; some intermodule control; decision tables; simple callbacks or message passing, including middleware-supported distributed processing.
• High: highly nested structured programming operators with many compound predicates; queue and stack control; homogeneous distributed processing; single-processor soft real-time control.
• Very High: reentrant and recursive coding; fixed-priority interrupt handling; task synchronization, complex callbacks, heterogeneous distributed processing; single-processor hard real-time control.
• Extra High: multiple resource scheduling with dynamically changing priorities; microcode-level control; distributed hard real-time control.

Computational Operations:
• Very Low: evaluation of simple expressions, e.g., A = B + C*(D - E).
• Low: evaluation of moderate-level expressions, e.g., D = SQRT(B**2 - 4.*A*C).
• Nominal: use of standard math and statistical routines; basic matrix/vector operations.
• High: basic numerical analysis: multivariate interpolation, ordinary differential equations; basic truncation, round-off concerns.
• Very High: difficult but structured numerical analysis: near-singular matrix equations, partial differential equations; simple parallelization.
• Extra High: difficult and unstructured numerical analysis: highly accurate analysis of noisy, stochastic data; complex parallelization.

Device-dependent Operations:
• Very Low: simple read and write statements with simple formats.
• Low: no cognizance needed of particular processor or I/O device characteristics; I/O done at GET/PUT level.
• Nominal: I/O processing includes device selection, status checking, and error processing.
• High: operations at the physical I/O level (physical storage address translations; seeks, reads, etc.); optimized I/O overlap.
• Very High: routines for interrupt diagnosis, servicing, and masking; communication line handling; performance-intensive embedded systems.
• Extra High: device timing-dependent coding, micro-programmed operations; performance-critical embedded systems.

Data Management Operations:
• Very Low: simple arrays in main memory; simple COTS-DB queries, updates.
• Low: single-file subsetting with no data structure changes, no edits, no intermediate files; moderately complex COTS-DB queries, updates.
• Nominal: multi-file input and single-file output; simple structural changes, simple edits; complex COTS-DB queries, updates.
• High: simple triggers activated by data stream contents; complex data restructuring.
• Very High: distributed database coordination; complex triggers; search optimization.
• Extra High: highly coupled, dynamic relational and object structures; natural language data management.

User Interface Management Operations:
• Very Low: simple input forms, report generators.
• Low: use of simple graphic user interface (GUI) builders.
• Nominal: simple use of a widget set.
• High: widget set development and extension; simple voice I/O, multimedia.
• Very High: moderately complex 2D/3D, dynamic graphics, multimedia.
• Extra High: complex multimedia, virtual reality.

Product Factors cont’d
• Required Reusability (RUSE)
– Accounts for the additional effort needed to construct components intended for reuse.

RUSE:  Low – none;  Nominal – across project;  High – across program;  Very High – across product line;  Extra High – across multiple product lines

• Documentation match to life-cycle needs (DOCU)
– What is the suitability of the project's documentation to its life-cycle needs?
DOCU:  Very Low – many life-cycle needs uncovered;  Low – some life-cycle needs uncovered;  Nominal – right-sized to life-cycle needs;  High – excessive for life-cycle needs;  Very High – very excessive for life-cycle needs

Platform Factors
• Platform
– Refers to the target-machine complex of hardware and infrastructure software (previously called the virtual machine).
• Execution Time Constraint (TIME)
– Measures the constraint imposed upon a system in terms of the percentage of available execution time expected to be used by the system.

TIME:  Nominal – ≤ 50% use of available execution time;  High – 70%;  Very High – 85%;  Extra High – 95%

Platform Factors cont’d
• Main Storage Constraint (STOR)
– Measures the degree of main storage constraint imposed on a software system or subsystem.

STOR:  Nominal – ≤ 50% use of available storage;  High – 70%;  Very High – 85%;  Extra High – 95%

• Platform Volatility (PVOL)
– Assesses the volatility of the platform (the complex of hardware and software the software product calls on to perform its tasks).

PVOL:  Low – major change every 12 mo., minor change every 1 mo.;  Nominal – major: 6 mo., minor: 2 wk.;  High – major: 2 mo., minor: 1 wk.;  Very High – major: 2 wk., minor: 2 days

Personnel Factors
• Analyst Capability (ACAP)
– Analysts work on requirements, high-level design, and detailed design. Consider analysis and design ability, efficiency and thoroughness, and the ability to communicate and cooperate.

ACAP:  Very Low – 15th percentile;  Low – 35th;  Nominal – 55th;  High – 75th;  Very High – 90th

• Programmer Capability (PCAP)
– Evaluate the capability of the programmers as a team rather than as individuals.
Consider ability, efficiency and thoroughness, and the ability to communicate and cooperate.

PCAP:  Very Low – 15th percentile;  Low – 35th;  Nominal – 55th;  High – 75th;  Very High – 90th

Personnel Factors cont’d
• Applications Experience (AEXP)
– Assess the project team's equivalent level of experience with this type of application.

AEXP:  Very Low – 2 months;  Low – 6 months;  Nominal – 1 year;  High – 3 years;  Very High – 6 years

• Platform Experience (PEXP)
– Assess the project team's equivalent level of experience with this platform, including the OS, graphical user interface, database, networking, and distributed middleware.

PEXP:  Very Low – 2 months;  Low – 6 months;  Nominal – 1 year;  High – 3 years;  Very High – 6 years

Personnel Factors cont’d
• Language and Tool Experience (LTEX)
– Measures the level of programming language and software tool experience of the project team.

LTEX:  Very Low – 2 months;  Low – 6 months;  Nominal – 1 year;  High – 3 years;  Very High – 6 years

• Personnel Continuity (PCON)
– The scale for PCON is in terms of the project's annual personnel turnover.

PCON:  Very Low – 48%/year;  Low – 24%/year;  Nominal – 12%/year;  High – 6%/year;  Very High – 3%/year

Project Factors
• Use of Software Tools (TOOL)
– Assess the software tools used to develop the product in terms of their capabilities and maturity.
TOOL:  Very Low – edit, code, debug;  Low – simple, front-end, back-end CASE, little integration;  Nominal – basic life-cycle tools, moderately integrated;  High – strong, mature life-cycle tools, moderately integrated;  Very High – strong, mature, proactive life-cycle tools, well integrated with processes, methods, reuse

Project Factors cont’d
• Multisite Development (SITE)
– Assess and average two factors: site collocation and communication support.

SITE, Collocation:  Very Low – international;  Low – multi-city and multi-company;  Nominal – multi-city or multi-company;  High – same city or metro area;  Very High – same building or complex;  Extra High – fully collocated
SITE, Communications:  Very Low – some phone, mail;  Low – individual phone, FAX;  Nominal – narrowband email;  High – wideband electronic communication;  Very High – wideband electronic communication, occasional video conference;  Extra High – interactive multimedia

• Required Development Schedule (SCED)
– Measures the imposed schedule constraint in terms of the percentage of schedule stretch-out or acceleration with respect to a nominal schedule for the project.

SCED:  Very Low – 75% of nominal;  Low – 85%;  Nominal – 100%;  High – 130%;  Very High – 160%

Demos
• COCOMO II 2000.3 (standalone, in C) – see
http://greenbay.usc.edu/csci577/fall2010/site/tools/index.html or
http://greenbay.usc.edu/csci577/tools/cocomo/COCOMOII_2000_3.exe
• COCOMO II 2000.3 (web-based) – see
http://csse.usc.edu/csse/research/COCOMOII/cocomo_main.html or
http://csse.usc.edu/tools/COCOMOII.php
NOTE: no separate modules

Fast (dangerous?)
Function Point Sizing
• Count the number of files of different types
– File: a grouping of data elements handled similarly by the software
– External Input (EI): files entering the software system
– External Output (EO): files exiting the software system
– Internal Logical (IL): internal files used by the software system
– External Interface (EIF): files passed/shared between software systems
– External Query (EQ): input and immediate output response
• Use average complexity weights for all files
– FP = 4 × EI + 5 × EO + 10 × IL + 7 × EIF + 4 × EQ
• USC COCOMO II example: FP = 4(12) + 5(7) + 10(7) + 0 + 0 = 153
– Java, C++: SLOC = 153 × 50 = 7650 SLOC
– HTML, PowerBuilder: SLOC = 153 × 20 = 3060 SLOC
– Can use averages for mixes of languages
• See the COCOMO II book for the detailed function point procedure

Using COCOMO II in CS 577
• Begin with a COCOMO II bottom-up team estimate
– Source lines of code (SLOC)
– Using the adjustments for CS 577 below
– Focus on the 577b Construction phase
• Cross-check with an estimate
– Using Fast Function Point sizing
– Effort by activity, rough 577b milestone plan
• Adjust, and try to reconcile the two estimates

COCOMO II Estimates for 577b
• Disregard COCOMO II (CII) schedule estimates
• Use COCOMO II effort estimates to determine how large a team is needed for the 12-week fixed schedule
– Assuming 12 hours/week of dedicated effort per person
– Assuming 10 of the 12 weeks fill the COCOMO II Construction phase (72% of the total effort estimate)
– Assuming 100 hours/person-month for COCOMO estimates
• For the 577b Construction phase, these are equivalent:
– 1 577b team member's effort = (10 weeks)(12 hours/week) = 120 hrs
– 1.67 × [estimated COCOMO II person-months] = (1.67)(100 hours)(0.72) = 120 hrs
• So, one 577b team member's effort = 1.67 CII person-months
• And 6 577b team members' effort = 6 × 1.67 = 10 CII person-months
– 5 on-campus students + 1 off-campus student
• Or, N/1.67 577b team members' effort = N CII person-months
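The fast function-point sizing and the 577b team-size conversion can be tied together in one sketch; the file counts and the 50/20 SLOC-per-FP ratios reproduce the slides' own example and are not general-purpose constants:

```python
# Sketch of fast function-point sizing plus the 577b effort conversion,
# using the example numbers from the slides.
def function_points(ei, eo, il, eif, eq):
    """Average-complexity weights: FP = 4*EI + 5*EO + 10*IL + 7*EIF + 4*EQ."""
    return 4 * ei + 5 * eo + 10 * il + 7 * eif + 4 * eq

fp = function_points(ei=12, eo=7, il=7, eif=0, eq=0)   # 153, as on the slide
sloc_java = fp * 50    # Java/C++: ~50 SLOC per FP   -> 7650 SLOC
sloc_html = fp * 20    # HTML/PowerBuilder: ~20 SLOC per FP -> 3060 SLOC

# 577b conversion: one member's Construction effort is 10 weeks * 12 hrs/week
# = 120 hrs, which equals 1.67 CII person-months (1.67 * 100 hrs * 0.72).
def team_members_for(cii_person_months):
    return cii_person_months / 1.67

print(fp, sloc_java, sloc_html, round(team_members_for(10), 1))
```

So a 10-person-month COCOMO II estimate maps to about six 577b team members, matching the slide's 5-on-campus-plus-1 example.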