Software project management (intro)
Introduction to estimating development effort

What makes a successful project?
Delivering:
- agreed functionality
- on time
- at the agreed cost
- with the required quality
Stages:
1. Set targets
2. Attempt to achieve targets
BUT what if the targets are not achievable?

Over- and under-estimating
- Parkinson's Law: 'Work expands to fill the time available'
- An over-estimate is likely to cause the project to take longer than it would otherwise
- Weinberg's Zeroth Law of reliability: 'a software project that does not have to meet a reliability requirement can meet any other requirement'

A taxonomy of estimating methods
- Expert opinion - just guessing?
- Bottom-up - activity based
- Parametric, e.g. function points
- Analogy
- Artificial neural networks - a view of the future?
- Parkinson and 'price to win'

Heemstra and Kusters survey
Method                Respondents using it
Expert judgement      25.5%
Analogy               60.8%
'Capacity problem'    20.8%
Price-to-win           8.9%
Parametric models     13.7%

Heemstra and Kusters contd.
- Only 50% kept data on past projects - but 60.8% used analogy!
- 35% did not produce estimates
- 62% used methods based on intuition only
- 16% used formalized methods
- Function point users produced worse estimates!

Top-down versus Bottom-up
Top-down
- produce an overall estimate based on project cost drivers
- based on past project data
Bottom-up
- use when there is no past project data

Top-down estimates
- Produce an overall estimate using effort driver(s)
- Distribute proportions of the overall estimate to components
[Figure: overall project estimate of 100 days, distributed to the components test, design and code in shares of 30% (30 days), 30% (30 days) and 40% (40 days)]

Bottom-up estimating
1. Break the project into smaller and smaller components
[2. Stop when you get to what one person can do in one or two weeks]
3. Estimate costs for the lowest-level activities
4. At each higher level, calculate the estimate by adding the estimates for the lower levels

Parametric models
- COCOMO (lines of code) and function points are examples of these
- Problem with COCOMO etc.: the chain is
  guess -> algorithm -> estimate
  but what is desired is
  system characteristic -> algorithm -> estimate

Parametric models - continued
Examples of system characteristics:
- number of screens x 4 hours
- number of reports x 2 days
- number of entity types x 2 days
The quantitative relationship between the input and output products of a process can be used as the basis of a parametric model.

Parametric models - the need for historical data
Simplistic model for an estimate:
  estimated effort = (system size) / productivity
e.g.
  system size = lines of code
  productivity = lines of code per day
Productivity is derived from past projects:
  productivity = (system size) / effort

Parametric models
- Some models focus on task or system size, e.g. Function Points
- FPs were originally used to estimate lines of code, rather than effort
[Figure: the number of file types and the numbers of input and output transaction types feed a model, which produces 'system size']

Parametric models
- Other models focus on productivity, e.g. COCOMO
- Lines of code (or FPs etc.) are an input
[Figure: system size and productivity factors together produce the estimated effort]

COCOMO
- Based on industry productivity standards; the database is constantly updated
- Allows an organization to benchmark its software development productivity

COCOMO: Examples
Boehm simple model:
  $E = a_b \, (\mathrm{KLOC})^{b_b}$
  $D = 2.5 \, E^{d_b}$
Coefficient table
S/W Project      a_b    b_b    d_b
Organic          2.4    1.05   0.38
Semi-detached    3.0    1.12   0.35
Embedded         3.6    1.20   0.32

Estimating by analogy
[Figure: several source cases, each with attribute values and a known effort, and one target case with attribute values but unknown effort]
- Select the source case with the closest attribute values
- Use the effort from that source case as the estimate

Anchor + adjustment
[Figure: map of a forest. You are here: how do you get to the red cross? Go to the tall building by line of sight, then pace out a distance on a bearing.]

Estimating by analogy
- Identify significant attributes ('drivers')
- Locate the closest match amongst the source cases for the target
- Adjust for differences between source and target

Machine assistance for source selection (ANGEL)
[Figure: Source A, Source B and the target plotted by attribute values, one axis being number of outputs; the offsets I_t - I_s and O_t - O_s are marked between the target and a source]
  Euclidean distance $= \sqrt{(I_t - I_s)^2 + (O_t - O_s)^2}$
where t denotes the target case and s a source case.

Stages: identify
- significant features of the current project
- previous project(s) with similar features
- differences between the current and previous projects
- possible reasons for error (risk)
- measures to reduce uncertainty

System size: function points
- Based on work at IBM from 1979 onwards
- Albrecht and Gaffney wanted to measure productivity independently of lines of code
- Has now been developed by the International FP User Group (which is US based)
- Mark II FPs, developed by Symons, are mainly used in the UK

Albrecht function points
[Figure: a system showing external interface files, external inputs, internal logical files, external inquiries and external outputs]

Function points are based on
2 'data function' types:
- internal logical files (ILF)
- external interface files (EIF)
3 'transactional function' types:
- external inputs (EI)
- external outputs (EO)
- external inquiries (EQ)
Each occurrence is judged simple, average or complex.

Albrecht FP weightings
TYPE                             SIMPLE   AVERAGE   COMPLEX
ILF (Internal Logical File)         7        10        15
EIF (External Interface File)       5         7        10
EI (External Input)                 3         4         6
EO (External Output)                4         5         7
EQ (External Inquiry)               3         4         6

  $FP = \text{count total} \times \left(0.65 + 0.01 \sum_{i=1}^{14} F_i\right)$

Taking Complexity into Account
Questions for Complexity Adjustment Values
Each factor is rated on a scale from 0 (not important) to 5 (very important):
1. Data communications
2. Backup and recovery
3. Distributed functions
4. Heavily used configurations
5. Transaction rate
6. On-line data entry
7. On-line update
8. End user efficiency
9. Complex processing
10. Installation ease
11. Operational ease
12. Multiple sites
13. Facilitate change
14. Reusable

Example: FP Approach
measurement parameter         count    weight
number of user inputs           40   x   4   = 160
number of user outputs          25   x   5   = 125
number of user inquiries        12   x   4   =  48
number of files                  4   x   7   =  28
number of ext. interfaces        4   x   7   =  28
algorithms                      60   x   3   = 180
count-total                                    569
complexity multiplier                          0.84
feature points                                 478
At 0.25 p-m / FP, the estimated effort is about 120 p-m.

Some conclusions: how to review estimates
Ask the following questions about an estimate:
- What are the task size drivers?
- What productivity rates have been used?
- Is there an example of a previous project of about the same size?
- Are there examples of where the productivity rates used have actually been found?
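
The parametric calculations in these slides are simple enough to check by machine. The following is a minimal Python sketch, not a definitive implementation: it reproduces the arithmetic of the FP example and the Boehm simple model with the coefficient table given earlier. The function and variable names are hypothetical, and the 10 KLOC project in the usage note is an invented input. The slides do not state units; the sketch assumes the usual basic-COCOMO convention of effort in person-months and duration in months.

```python
# Coefficients (a_b, b_b, d_b) copied from the coefficient table in the slides.
COCOMO_COEFFICIENTS = {
    "organic": (2.4, 1.05, 0.38),
    "semi-detached": (3.0, 1.12, 0.35),
    "embedded": (3.6, 1.20, 0.32),
}


def cocomo_basic(kloc, mode):
    """Return (effort, duration): E = a_b * KLOC**b_b, D = 2.5 * E**d_b.

    Units are assumed: person-months for effort, months for duration.
    """
    a_b, b_b, d_b = COCOMO_COEFFICIENTS[mode]
    effort = a_b * kloc ** b_b
    duration = 2.5 * effort ** d_b
    return effort, duration


def count_total(items):
    """Sum count * weight over (count, weight) pairs."""
    return sum(count * weight for count, weight in items)


def complexity_multiplier(ratings):
    """Return 0.65 + 0.01 * sum(F_i) for the fourteen factor ratings."""
    if len(ratings) != 14 or any(not 0 <= r <= 5 for r in ratings):
        raise ValueError("expected 14 ratings, each between 0 and 5")
    return 0.65 + 0.01 * sum(ratings)


# (count, weight) pairs from the worked example in the slides.
example_items = [
    (40, 4),  # user inputs
    (25, 5),  # user outputs
    (12, 4),  # user inquiries
    (4, 7),   # files
    (4, 7),   # external interfaces
    (60, 3),  # algorithms
]

total = count_total(example_items)    # 569, as on the slide
feature_points = round(total * 0.84)  # slide's multiplier 0.84 gives 478
effort_pm = feature_points * 0.25     # 119.5, which the slide rounds to 120 p-m

if __name__ == "__main__":
    print(total, feature_points, effort_pm)
    print(cocomo_basic(10, "organic"))  # hypothetical 10 KLOC organic project
```

Keeping the weights and coefficients in plain data structures makes it easy to ask the review questions from the final slide: the size drivers and productivity rates behind an estimate are visible and can be swapped for an organization's own historical figures.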