Software Sizing
Ali Afzal Malik & Vu Nguyen
University of Southern California, Center for Systems and Software Engineering (USC-CSSE)
CSCI 510, 10/16/2009, ©USC-CSSE

Outline
• Nature, value, and challenges of sizing
• Sizing methods and tools
• Sizing metrics
• Maintenance sizing
• Conclusions and references

Nature and Value of Sizing
• Definitions and general characteristics
• Value provided by sizing
• When does sizing add less value?

Size: Bigness/Bulk/Magnitude
• [Figures: a liquid is measured in gallons; a house is measured in usable square feet; software is measured in ... ?]

Sizing Characteristics
• Generally considered additive
  – Size(A ∪ B) = Size(A) + Size(B)
• Often weighted by complexity
  – Some artifacts are “heavier” to put in place
  – Examples: requirements, function points

Software Size is an Important Metric
• A key metric used to determine
  – Software project effort and cost: (effort, cost) = f(size, factors); a toy sketch of this relationship follows the Cone of Uncertainty slide
  – Time to develop
  – Staffing
  – Quality
  – Productivity

When Does Sizing Add Less Value?
• When it is easier to go directly to estimating effort
  – Imprecise size parameters (GUI builders, COTS integration)
  – Familiar, similar-size applications: analogy to previous effort (“yesterday’s weather”)
• When size is a dependent variable
  – Time-boxing with prioritized features

Sizing Challenges
• Brooks’ factors: programming system, programming product
• The cone of uncertainty

Brooks’ Factor of 9 for a Programming Systems Product (adapted from Brooks, 1995)
• [Diagram: a Program becomes a Programming Product at about 3x the cost, or a Programming System at about 3x; a Programming Systems Product combines both, at about 9x.]
• Product: handles off-nominal cases and testing; well documented
• System: handles interoperability, data management, business management
• Telecom example: 1/6 basic call processing; 1/3 off-nominals; 1/2 billing

The Cost and Size Cone of Uncertainty (Boehm et al., 2000)
• If you don’t know what you’re building, it’s hard to estimate its size or cost
• [Figure: size and cost estimate ranges narrow as the project proceeds through its lifecycle milestones]
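Note: to make (effort, cost) = f(size, factors) concrete, here is a minimal sketch in the spirit of the COCOMO II post-architecture model cited in the references. The constants A = 2.94 and B = 0.91 are the published COCOMO II.2000 nominal calibration; the scale-factor and effort-multiplier inputs below are made-up placeholders, not values from this lecture.

```python
# Minimal COCOMO-II-style effort sketch: effort = f(size, factors).
# A and B follow the published COCOMO II.2000 nominal calibration;
# the sample inputs below are illustrative placeholders only.

A, B = 2.94, 0.91

def effort_pm(ksloc, scale_factors, effort_multipliers):
    """Person-months = A * KSLOC^E * product(EMs), with E = B + 0.01 * sum(SFs)."""
    e = B + 0.01 * sum(scale_factors)
    pm = A * ksloc ** e
    for em in effort_multipliers:
        pm *= em
    return pm

# Example: a 40-KSLOC product with mid-range scale factors and
# slightly unfavorable cost drivers (all values made up).
print(f"{effort_pm(40, [3.72, 3.04, 4.24, 3.29, 4.68], [1.10, 0.95]):.1f} PM")
```

Because the exponent E typically exceeds 1, errors in the size input compound into proportionally larger effort errors, which is why sizing accuracy matters so much to estimation.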
Outline (next: Sizing Methods and Tools)
• Nature, value, and challenges of sizing
• Sizing methods and tools
• Sizing metrics
• Maintenance sizing
• Conclusions and references

Basic Methods, Strengths, and Weaknesses (adapted from Boehm, 1981)
• Pair-wise comparison. Strengths: accurate assessment of relative size. Weaknesses: the absolute size of the benchmark must be known.
• Expert judgment. Strengths: assessment of representativeness, interactions, exceptional circumstances. Weaknesses: no better than the participants; biases, incomplete recall.
• Analogy. Strengths: based on representative experience. Weaknesses: representativeness of that experience.
• Parkinson. Strengths: correlates with some experience. Weaknesses: reinforces poor practice.
• Price to win. Strengths: often gets the contract. Weaknesses: generally produces large overruns.
• Top-down. Strengths: system-level focus; efficient. Weaknesses: less detailed basis; less stable.
• Bottom-up. Strengths: more detailed basis; more stable; fosters individual commitment. Weaknesses: may overlook system-level costs; requires more effort.

Comparison with Previous Projects
• Comparable metadata: domain, user base, platform, etc.
• Pair-wise comparisons
• Differential functionality: analogy; “yesterday’s weather”

Group Consensus
• Wideband Delphi
  – Anonymous estimates; the facilitator provides a summary
  – Estimators discuss the results and their rationale
  – Iterative process: estimates converge after revision in subsequent rounds
  – Improves understanding of the product’s nature and scope
  – Works when estimators are collocated
• Planning Poker (Cohn, 2005)
  – A game played with a deck of cards
  – The moderator presents the item to be estimated
  – Participants privately choose an appropriate card from the deck
  – Divergence is discussed; iterative, with convergence in subsequent rounds
  – Ensures everyone participates
  – Useful for estimation in agile projects

Probabilistic Methods
• PERT (Putnam and Fitzsimmons, 1979)
  – Three estimates per component: optimistic (a), most likely (m), pessimistic (b)
  – Expected size E = (a + 4m + b) / 6
  – Standard deviation σ = (b − a) / 6
  – Easy to use; the ratio of σ to E reveals uncertainty
• Example (sizes in KSLOC). Component deviations combine as the square root of the sum of their squares, so σ_E = √(2.33² + 1.5² + 1.83² + 1.33²) ≈ 3.6. A small script reproducing this table follows.

  Component | a  | m  | b  | E    | σ
  SALES     | 6  | 10 | 20 | 11   | 2.33
  DISPLAY   | 4  |  7 | 13 |  7.5 | 1.5
  INEDIT    | 8  | 12 | 19 | 12.5 | 1.83
  TABLES    | 4  |  8 | 12 |  8   | 1.33
  TOTALS    | 22 | 37 | 64 | 39   | σ_E ≈ 3.6
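Note: a minimal script reproducing the PERT example above; the component names and three-point estimates are taken directly from the table.

```python
from math import sqrt

# PERT sizing: E = (a + 4m + b) / 6 and sigma = (b - a) / 6 per component;
# component sigmas combine as the square root of the sum of squares.
components = {  # name: (optimistic, most_likely, pessimistic), in KSLOC
    "SALES":   (6, 10, 20),
    "DISPLAY": (4, 7, 13),
    "INEDIT":  (8, 12, 19),
    "TABLES":  (4, 8, 12),
}

total_e = 0.0
variance = 0.0
for name, (a, m, b) in components.items():
    e = (a + 4 * m + b) / 6
    sigma = (b - a) / 6
    total_e += e
    variance += sigma ** 2
    print(f"{name:8s} E = {e:5.2f}  sigma = {sigma:.2f}")

print(f"TOTALS   E = {total_e:.1f}  sigma_E = {sqrt(variance):.1f}")  # 39.0 and 3.6
```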
Why Do People Underestimate Size? (Boehm, 1981)
• Estimators are basically optimistic and desire to please
• They have incomplete recall of previous experience
• They are generally not familiar with the entire software job

Outline (next: Sizing Metrics)
• Nature, value, and challenges of sizing
• Sizing methods and tools
• Sizing metrics
• Maintenance sizing
• Conclusions and references

Sizing Metrics vs. Time and Degree of Detail (Stutzke, 2005)
• Time and detail increase from top to bottom:

  Process phase | Possible measures                                    | Primary aids
  Concept       | Subsystems, key features                             | Product vision, analogies
  Elaboration   | User roles, use cases, screens, reports, files,      | Operational concept, specification,
                | function points, application points                  | context diagram, feature list
  Construction  | Components, objects, source lines of code,           | Architecture, top-level design,
                | logical statements                                   | detailed design, code

Sizing Metrics and Methodologies (by lifecycle artifact)
• Requirements: Function Points, COSMIC Function Points, Use Case Points, Mark II Function Points
• Architecture and design: CK object-oriented metrics, Class Points, Object Points
• Code: Source Lines of Code (SLOC), module (file) count

Counting Artifacts
• Artifacts: requirements, inputs, outputs, classes, use cases, modules, scenarios
  – Often weighted by relative difficulty
  – Easy to count at an initial stage
  – Estimates may differ based on the level of detail

Cockburn Use Case Level-of-Detail Scale (Cockburn, 2001)
• [Figure: Cockburn’s altitude scale of use case detail, from summary level down to subfunction level]

Function Point Analysis - 1
• FPA measures software size by quantifying the functionality the software provides to its users
• Five function types
  – Data: Internal Logical File (ILF), External Interface File (EIF)
  – Transactions: External Input (EI), External Output (EO), External Query (EQ)
• Each function is weighted by its complexity
• The total FP count is then adjusted using 14 general system characteristics (GSCs)
• [Figure: a human user and external applications (MS Word, MS Excel, MS Outlook) exchange EIs/EOs/EQs with the system being counted, which maintains ILFs and references EIFs]

Function Point Analysis - 2
• Advantages
  – Independent of technology and implementation
  – Can be measured early in the project lifecycle
  – Can be applied consistently across projects and organizations
• Disadvantages
  – Difficult to automate
  – Requires technical experience and estimation skills
  – Difficult to count the size of maintenance projects
• A sketch of the FP arithmetic follows.
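Note: a minimal sketch of the FP arithmetic, assuming the standard IFPUG complexity weights and the usual value adjustment factor VAF = 0.65 + 0.01 × (sum of the 14 GSC scores); the function counts below are made up for illustration.

```python
# Unadjusted function points: each function type is counted at
# low/average/high complexity and weighted with the standard
# IFPUG weights; the total is then adjusted by the VAF.
WEIGHTS = {  # type: (low, average, high)
    "EI":  (3, 4, 6),
    "EO":  (4, 5, 7),
    "EQ":  (3, 4, 6),
    "ILF": (7, 10, 15),
    "EIF": (5, 7, 10),
}

def function_points(counts, gsc_scores):
    """counts: {type: (n_low, n_avg, n_high)}; gsc_scores: 14 values in 0..5."""
    ufp = sum(n * w for t, ns in counts.items()
              for n, w in zip(ns, WEIGHTS[t]))
    vaf = 0.65 + 0.01 * sum(gsc_scores)  # value adjustment factor
    return ufp * vaf

# Illustrative example: a small system, mostly average-complexity functions.
counts = {"EI": (3, 5, 1), "EO": (2, 4, 0), "EQ": (3, 2, 0),
          "ILF": (0, 3, 1), "EIF": (1, 1, 0)}
print(f"Adjusted FP = {function_points(counts, [3] * 14):.1f}")
```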
COSMIC FP
• COSMIC measures the functionality of the software through the movement of data (Entry, Exit, Read, Write) across logical layers
• Advantages
  – Same as FPA
  – Applicable to real-time systems
• Disadvantages
  – Difficult to automate
  – Requires experience and estimation skills
  – Requires a detailed architecture
  – Difficult for customers to understand

Use Case Points
• UCP counts the functional size of the software by measuring the complexity of its use cases:
  UCP = (#actors + #use cases) × technical and environmental factors
• Advantages
  – Independent of technology and implementation
  – Can be measured early in the project lifecycle
  – Easier to count than FPA
• Disadvantages
  – Lack of standardization
  – Difficult to automate
  – High variance (low accuracy)

CK Object-Oriented Metrics
• Six OO metrics (Chidamber and Kemerer, 1994)
  – Weighted Methods per Class (WMC)
  – Depth of Inheritance Tree (DIT)
  – Number of Children (NOC)
  – Coupling Between Object Classes (CBO)
  – Response for a Class (RFC)
  – Lack of Cohesion in Methods (LCOM)
• Advantages
  – Correlated with effort
  – Can be automated using tools
• Disadvantages
  – Difficult to estimate early
  – Useful only for object-oriented code

Source Lines of Code (SLOC)
• SLOC measures software size or length by counting the logical or physical lines of source code
  – Physical SLOC = number of physical lines in the code
  – Logical SLOC = number of source statements
• Advantages
  – Can be counted automatically by tools (e.g., USC’s CodeCount)
  – Easy for customers and developers to understand
  – Reflects the developer’s view of the software
  – Supported by most cost estimation models
• Disadvantages
  – Dependent on programming language and programming style
  – Difficult to estimate early
  – Inconsistent definitions across languages and organizations

SLOC Counting Rules (adapted from Madachy, 2005)
• Standard definition for counting lines
  – Based on the SEI definition checklist from CMU/SEI-92-TR-20
  – Modified for COCOMO II
• A toy counter illustrating these rules follows the table.

  Statement type                        | Included?
  1. Executable                         | yes
  2. Non-executable:                    |
  3.   Declarations                     | yes
  4.   Compiler directives              | yes
  5.   Comments:                        |
  6.     On their own lines             | no
  7.     On lines with source code      | no
  8.     Banners and non-blank spacers  | no
  9.     Blank (empty) comments         | no
  10. Blank lines                       | no
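Note: a toy illustration of the physical vs. logical distinction, assuming a C-like language. A real counter such as USC’s CodeCount implements the full checklist above (block comments, continuations, compiler directives, and so on); this sketch handles only line comments and blank lines.

```python
# Toy SLOC counter for a C-like language: physical lines exclude blanks
# and comment-only lines; logical lines approximate statements by
# counting statement terminators (";") and block openers ("{").
def count_sloc(source: str):
    physical = logical = 0
    for line in source.splitlines():
        stripped = line.split("//", 1)[0].strip()  # drop line comments
        if not stripped:
            continue                # blank or comment-only: excluded
        physical += 1
        logical += stripped.count(";") + stripped.count("{")
    return physical, logical

code = """
int main(void) {
    // initialize the counter
    int n = 0;
    n++; n++;
    return n;
}
"""
print(count_sloc(code))  # (5, 5): 5 physical lines, 5 logical statements
```

Note how the two measures diverge on the line `n++; n++;`, which is one physical line but two logical statements; this is exactly the kind of definitional choice that makes SLOC counts inconsistent across organizations.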
Relationship Among Sizing Metrics
• Two broad categories of sizing metrics
  – Implementation-specific, e.g., Source Lines of Code (SLOC)
  – Implementation-independent, e.g., Function Points (FP)
• Need to relate the two categories, e.g., via SLOC/FP “backfiring” ratios

SLOC/FP Backfiring Table (Jones, 1996)
• [Table: typical SLOC-per-FP conversion ratios by programming language]
• Other published backfire ratios run up to 60% higher

Multisource Estimation: Implementation-Independent vs. Implementation-Dependent
• Implementation-independent estimators
  – Use cases, function points, requirements
• Advantage of implementation independence
  – Good for productivity comparisons when using VHLLs, COTS, or reusable components
• Weakness of implementation independence
  – Gives the same estimate whether development uses VHLLs, COTS, reusable components, or 3GL development
• Multisource estimation reduces risk

Outline (next: Maintenance Sizing)
• Nature, value, and challenges of sizing
• Sizing methods and tools
• Sizing metrics
• Maintenance sizing
• Conclusions and references

Sizing in Reuse and Maintenance
• [Figure: during project execution, the existing system’s code is combined with new code, adapted code, reused code, and acquired code to produce Release 1, Release 2, ...]

Calculating Total Equivalent SLOC - 1
• Formulas (a sketch walking through them follows the AAM slide):

  EKSLOC_reused  = KSLOC_reused × AAM_reused
  EKSLOC_adapted = KSLOC_adapted × (1 − AT/100) × AAM_adapted

  AAF = 0.4 × DM + 0.3 × CM + 0.3 × IM
  AAM = [AA + AAF × (1 + 0.02 × SU × UNFM)] / 100,  if AAF ≤ 50
  AAM = [AA + AAF + SU × UNFM] / 100,               if AAF > 50

  For reused (black-box) code, DM = CM = 0, so AAF_reused = 0.3 × IM_reused.

• DM = % design modified, CM = % code modified, IM = % of integration and test needed; AA = assessment and assimilation effort, SU = software understanding penalty, UNFM = programmer unfamiliarity, AT = % of code automatically translated

Calculating Total Equivalent SLOC - 2

  EKSLOC = KSLOC_added + EKSLOC_adapted + EKSLOC_reused

Adaptation Adjustment Modifier (AAM)
• [Figure: relative cost vs. relative amount of modification (AAF, 0 to 120), comparing the old and new AAM formulas for the worst case (AA = 8, SU = 50, UNFM = 1); a second curve shows the relative number of interfaces checked]
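Note: a minimal sketch of the equivalent-SLOC computation, following the COCOMO II-style reuse formulas as reconstructed above; the parameter values in the example are illustrative only, not from a real project.

```python
# Equivalent KSLOC for adapted and reused code (COCOMO II-style reuse
# model). DM/CM/IM are percentages of design modified, code modified,
# and integration/test needed; AA, SU, UNFM, AT as defined on the slide.
def aam(aaf, aa, su, unfm):
    """Adaptation Adjustment Modifier."""
    if aaf <= 50:
        return (aa + aaf * (1 + 0.02 * su * unfm)) / 100
    return (aa + aaf + su * unfm) / 100

def eksloc_adapted(ksloc, dm, cm, im, aa, su, unfm, at=0):
    aaf = 0.4 * dm + 0.3 * cm + 0.3 * im
    return ksloc * (1 - at / 100) * aam(aaf, aa, su, unfm)

def eksloc_reused(ksloc, im, aa, su, unfm):
    # Reused (black-box) code: DM = CM = 0, so AAF = 0.3 * IM.
    return ksloc * aam(0.3 * im, aa, su, unfm)

new = 20                                     # KSLOC written from scratch
adapted = eksloc_adapted(50, dm=15, cm=30, im=40, aa=4, su=30, unfm=0.4)
reused = eksloc_reused(30, im=20, aa=2, su=30, unfm=0.4)
print(f"Total EKSLOC = {new + adapted + reused:.1f}")
```

The example shows why equivalent size matters: 50 KSLOC of adapted code and 30 KSLOC of reused code contribute far less than 80 KSLOC of new development would.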
Conclusions
• Size plays an important role in estimation and project management activities
  – Software estimation is a “garbage in, garbage out” process: bad size in, bad cost out
• A number of challenges make size estimation difficult
• Different sizing metrics are available
  – FP, COSMIC FP (cfu), Use Case Points, SLOC, CK OO metrics
• A plethora of sizing methods exists; each method has strengths and weaknesses

References
• Books
  – Boehm, Barry W., Software Engineering Economics, Prentice Hall, 1981.
  – Boehm, Barry W., et al., Software Cost Estimation with COCOMO II, Prentice Hall, 2000.
  – Brooks, Frederick P., Jr., The Mythical Man-Month, Addison-Wesley, 1995.
  – Cockburn, A., Writing Effective Use Cases, Addison-Wesley, 2001.
  – Cohn, M., Agile Estimating and Planning, Prentice Hall, 2005.
  – Jones, C., Applied Software Measurement: Assuring Productivity and Quality, McGraw-Hill, 1996.
  – Stutzke, Richard D., Estimating Software-Intensive Systems, Addison-Wesley, 2005.
• Journals
  – Chidamber, S., and Kemerer, C., “A Metrics Suite for Object Oriented Design,” IEEE Transactions on Software Engineering, 1994.
  – Putnam, L. H., and Fitzsimmons, A., “Estimating Software Costs,” Datamation, 1979.
• Tutorials/Lectures/Presentations
  – Boehm, Barry W., and Malik, Ali A., “System & Software Sizing,” COCOMO Forum Tutorial, 2007.
  – Boehm, Barry W., “Cost Estimation with COCOMO II,” CSCI 577a Lecture, 2004.
  – Madachy, Ray, “COCOMO II Overview,” CSCI 510 Lecture, 2005.
  – Valerdi, Ricardo, “The Constructive Systems Engineering Cost Model,” INCOSE Presentation, 2005.
• Websites
  – http://www.crisp.se/planningpoker/
  – http://www.ifpug.org/
  – http://www.spr.com/