In the name of God

Computer Engineering Division, Urmia Branch, Islamic Azad University
Advanced Software Engineering, Fall 2015

Course Information
- Instructor: Dr. Farhad Soleimanian Gharehchopogh
- Email: farhad.soleimanian@gmail.com
- Course Web Page: http://www.soleimanian.net/ase.pdf
- Class Hours: 8:30-11, Wednesdays
- Office Hours: Unfortunately, we don’t have any office hours.

Software Cost Estimation
Section 1: SLOC-based Models, Function Points Model and COCOMO Models

Introduction
- Ad-hoc models initially used
- Need for a formal estimation model
- Lines of code: an easily understood metric
- 1970s: SLIM (Putnam)
- 1979: Function Points (Albrecht)
- 1981: COCOMO (Boehm)

Wagerline.com

Function                                                                   Hours
Web site design                                                               10
Database model and creation                                                   10
External data feed integration and creation of individual sports pages        10
Install, set up, customize phpBB forums                                        4
Home, About Us, Contact Us pages                                               4
Leader board for each sport                                                    6
Display user’s pending picks                                                   4
Modify user profile                                                            4
Display user profile                                                           4
User registration and login                                                    4
User-defined pools                                                            16
  Create a new pool (4 hrs)
  Display pool leaders (4 hrs)
  Make picks for a pool (4 hrs)
  Display all public pools (4 hrs)

Wagerline.com
- Total Estimated Hours = 76
- 76 x $40 per hour = $3040

How do you estimate SLOC?
- Experience
- Previous system size
- Existing system size
- Breaking system into pieces
From “Schaum's Outline of Software Engineering” by David Gustafson

How do you estimate SLOC?
For each piece, estimate:
- Smallest possible SLOC: a
- Most likely SLOC: m
- Largest possible SLOC: b
From “Example of an Early Sizing, Cost and Schedule Estimate for an Application Software System” by L. H. Putnam

How do you estimate SLOC?
- Expected SLOC for each piece: $E_i = \frac{a + 4m + b}{6}$
- Total expected SLOC: $E = \sum_{i} E_i$
From “Example of an Early Sizing, Cost and Schedule Estimate for an Application Software System” by L. H.
Putnam

SLOC Estimate Example

Piece                          Smallest  Most Likely  Largest
Display user’s pending picks        200          300      500
Modify user profile                 100          150      250
Display user profile                250          300      450
User registration and login         200          220      250

What are function points?
- Functions of a software system
- 5 categories:
  - External Input
  - External Output
  - Internal File
  - External Interface
  - External Inquiry

What are function points?
[Figure: the system boundary, with internal files inside it and external inputs, external outputs, external inquiries and external interfaces crossing it]

Unadjusted Function Points (UFP)

        External Input  External Output  Internal File  External Interface  External Inquiry
Low     __ x 3          __ x 4           __ x 7         __ x 5              __ x 3
Avg.    __ x 4          __ x 5           __ x 10        __ x 7              __ x 4
High    __ x 6          __ x 7           __ x 15        __ x 10             __ x 6

$UFP = \sum_{i=1}^{3} \sum_{j=1}^{5} w_{ij} x_{ij}$
where $w_{ij}$ is the weight in row i (complexity level) and column j (category) of the table, and $x_{ij}$ is the count entered in that cell
From “Reliability of Function Points Measurement. A Field Experiment,” by Chris F. Kemerer

Adjusting for Other Factors
1. Data communications
2. Distributed functions
3. Performance
4. Heavily used configuration
5. Transaction rate
6. Online data entry
7. End user efficiency
Each factor is rated from 0 (No Influence) to 5 (Very Influential)

Adjusting for Other Factors
8. Online update
9. Complex processing
10. Reusability
11. Installation ease
12. Operational ease
13. Multiple sites
14. Facilitates change
Each factor is rated from 0 (No Influence) to 5 (Very Influential)

Value Adjustment Factor (VAF)
$VAF = 0.65 + 0.01 \sum_{i=1}^{14} r_i$
where $r_i$ is the rating of factor i
From “Reliability of Function Points Measurement. A Field Experiment,” by Chris F. Kemerer

Adjusted Function Points (AFP)
$AFP = UFP \times VAF$
From “Reliability of Function Points Measurement. A Field Experiment,” by Chris F.
Kemerer

Function Points Model
Advantages
- Estimation data available early
- Language and implementation independent
- Non-technical estimation
Disadvantages
- Difficult to automate data collection
- Possible subjective counting of function points

SLOC-based Models
Advantages
- Easy to automate data collection
- Easy to understand SLOC concept
Disadvantages
- Highly subjective estimate of SLOC
- Highly dependent on experience
- Difficult calibration for a nonnative environment

Algorithmic (Parametric) Model
- Use of mathematical equations to perform software estimation
- Equations are based on theory or historical data
- Use inputs such as SLOC, number of functions to perform and other cost drivers
- Accuracy of the model can be improved by calibrating it to the specific environment

Algorithmic (Parametric) Model (Cont.)
Examples:
- COCOMO (COnstructive COst MOdel)
  - Developed by Boehm in 1981
  - Became one of the most popular and most transparent cost models
  - Mathematical model based on data from 63 historical software projects
- COCOMO II
  - Published in 1995
  - Addresses non-sequential and rapid development process models, reengineering, reuse-driven approaches, object-oriented approaches, etc.
  - Has three sub-models: application composition, early design and post-architecture

Algorithmic (Parametric) Model (Cont.)
- Putnam’s software life-cycle model (SLIM)
  - Developed in the late 1970s
  - Based on Putnam’s analysis of the life cycle in terms of a so-called Rayleigh distribution of project personnel level versus time
  - Quantitative Software Management developed three tools: SLIM-Estimate, SLIM-Control and SLIM-Metrics

Algorithmic (Parametric) Model (Cont.)
Advantages
- Generate repeatable estimations
- Easy to modify input data
- Easy to refine and customize formulas
- Objectively calibrated to experience
Disadvantages
- Unable to deal with exceptional conditions
- Some experience and factors cannot be quantified
- Sometimes algorithms may be proprietary

Expert Judgment
- Captures the knowledge and experience of practitioners and provides estimates based upon all the projects in which the expert participated
Examples
- Delphi
  - Developed by the Rand Corporation in the late 1940s
  - Participants are involved in two assessment rounds
- Work Breakdown Structure (WBS)
  - A way of organizing project elements into a hierarchy that simplifies the task of budget estimation and control

Expert Judgment (Cont.)
Advantages
- Useful in the absence of quantified, empirical data
- Can factor in differences between past project experiences and requirements of the proposed project
- Can factor in impacts caused by new technologies, applications and languages
Disadvantages
- Estimate is only as good as the expert’s opinion
- Hard to document the factors used by the experts

Top-Down
- Also called Macro Model
- Derived from the global properties of the product and then partitioned into various low-level components
- Example: Putnam model

Top-Down (Cont.)
Advantages
- Requires minimal project detail
- Usually faster and easier to implement
- Focus on system-level activities
Disadvantages
- Tends to overlook low-level components
- No detailed basis

Bottom-Up
- The cost of each software component is estimated and the results are then combined to arrive at the total cost for the project
- The goal is to construct the estimate of the system from the knowledge accumulated about the small software components and their interactions
- An example: COCOMO’s detailed model

Bottom-Up (Cont.)
Advantages
- More stable
- More detailed
- Allows each software group to hand in an estimate
Disadvantages
- May overlook system-level costs
- More time consuming

Estimation by Analogy
- Comparing the proposed project to a previously completed, similar project in the same application domain
- Actual data from the completed projects are extrapolated
- Can be used either at system or component level

Estimation by Analogy (Cont.)
Advantages
- Based on actual project data
Disadvantages
- Impossible if no comparable project has been tackled in the past
- How well does the previous project represent this one?

Price to Win Estimation
- Price believed necessary to win the contract
Advantages
- Often rewarded with the contract
Disadvantages
- Time and money run out before the job is done

COCOMO 81
- COCOMO stands for COnstructive COst MOdel
- It is an open model
- First published by Dr. Barry Boehm in 1981
- Worked quite well for projects in the 80’s and early 90’s
- Could estimate results within ~20% of the actual values 68% of the time

COCOMO 81
COCOMO has three different models (each one increasing in detail and accuracy):
- Basic, applied early in a project
- Intermediate, applied after requirements are specified
- Advanced, applied after design is complete
COCOMO has three different modes:
- Organic: “relatively small software teams develop software in a highly familiar, in-house environment” [Boehm]
- Embedded: operates within tight constraints; the product is strongly tied to a “complex of hardware, software, regulations, and operational procedures” [Boehm]
- Semi-detached: an intermediate stage somewhere between organic and embedded.
  Usually up to 300 KDSI

COCOMO 81
- COCOMO uses two equations to calculate the effort in Man Months (MM) and the number of months estimated for the project (TDEV)
- MM is based on the number of thousand lines of delivered source instructions (KDSI)
  $MM = a \times (KDSI)^{b} \times EAF$
  $TDEV = c \times (MM)^{d}$
- EAF is the Effort Adjustment Factor derived from the cost drivers; EAF for the basic model is 1
- The values for a, b, c and d differ depending on which mode you are using

Mode            a     b      c     d
Organic         2.4   1.05   2.5   0.38
Semi-detached   3.0   1.12   2.5   0.35
Embedded        3.6   1.20   2.5   0.32

COCOMO 81
A simple example:
- The project is a flight control system (mission critical) with 319,000 DSI (319 KDSI) in embedded mode
- Reliability must be very high (RELY = 1.40)
So we can calculate:
- Effort = $1.40 \times 3.6 \times (319)^{1.20} = 5093$ MM
- Schedule = $2.5 \times (5093)^{0.32} = 38.4$ months
- Average Staffing = 5093 MM / 38.4 months = 133 FSP (full-time-equivalent software personnel)

Effort Factors (EMs)

EM    Type       Description                           Very low  Low   Nominal  High  Very high  Extra high
RELY  Product    Required system reliability           0.75      0.88  1.00     1.15  1.40       -
DATA  Product    Database size                         -         0.94  1.00     1.08  1.16       -
CPLX  Product    Complexity of system modules          0.70      0.85  1.00     1.15  1.30       1.65
TIME  Computer   Execution time constraint             -         -     1.00     1.11  1.30       1.66
STOR  Computer   Memory constraints                    -         -     1.00     1.06  1.21       1.56
VIRT  Computer   Machine volatility                    -         0.87  1.00     1.15  1.30       -
TURN  Computer   Turnaround time                       -         0.87  1.00     1.07  1.15       -
ACAP  Personnel  Analyst capability                    1.46      1.19  1.00     0.86  0.71       -
AEXP  Personnel  Analyst experience in project domain  1.29      1.13  1.00     0.91  0.82       -
PCAP  Personnel  Programmer capability                 1.42      1.17  1.00     0.86  0.70       -
VEXP  Personnel  Virtual machine experience            1.21      1.10  1.00     0.90  -          -
LEXP  Personnel  Language experience                   1.14      1.07  1.00     0.95  -          -
MODP  Project    Modern programming practices          1.24      1.10  1.00     0.91  0.82       -
TOOL  Project    Use of software tools                 1.24      1.10  1.00     0.91  0.83       -
SCED  Project    Development schedule compression      1.23      1.08  1.00     1.04  1.10       -

Effort factor values

Description of the effort multipliers

Effort multiplier                        Result
ACAP  Analyst capability                 Increasing these factors reduces effort.
PCAP  Programmer capability
AEXP  Applications experience
MODP  Modern programming practices
TOOL  Use of software tools
VEXP  Virtual machine experience
LEXP  Language experience

SCED  Schedule constraint                Decreasing these factors reduces effort.
STOR  Main memory constraint
DATA  Database size
TIME  CPU time constraint
TURN  Turnaround time
VIRT  Machine volatility
CPLX  Process complexity
RELY  Required software reliability

COCOMO II
Main objectives of COCOMO II:
- To develop a software cost and schedule estimation model tuned to the life cycle practices of the 1990’s and 2000’s
- To develop software cost database and tool support capabilities for continuous model improvement
From “Cost Models for Future Software Life Cycle Processes: COCOMO 2.0,” Annals of Software Engineering, (1995).

COCOMO II
Has three different models:
- The Application Composition Model
  - Good for projects built using rapid application development tools (GUI builders, etc.)
- The Early Design Model
  - This model can get rough estimates before the entire architecture has been decided
- The Post-Architecture Model
  - Most detailed model, used after the overall architecture has been decided on

COCOMO II Differences
- The exponent value b in the effort equation is replaced with a variable value based on five scale factors rather than constants
- Size of the project can be listed as object points, function points or source lines of code (SLOC)
- EAF is calculated from 17 cost drivers better suited to today's methods; COCOMO 81 has 15
- A breakage rating has been added to address volatility of the system

COCOMO II Calibration
- For COCOMO II results to be accurate, the model must be calibrated
- Calibration requires that all cost driver parameters be adjusted
- Requires lots of data, usually more than one company has
- The plan was to release calibrations each year, but so far only two calibrations have been done (II.1997, II.1998)
- Users can submit data from their own projects to be used in future calibrations

Importance of Calibration
- Proper calibration is very important
- The original COCOMO II.1997 could estimate within 20% of the actual values 46% of the time. This was based on 83 data points.
- The recalibration for COCOMO II.1998 could estimate within 30% of the actual values 75% of the time. This was based on 161 data points.

Is COCOMO the Best?
- COCOMO is the most popular method; however, for any software cost estimation you should really use more than one method
- Best to use another method that differs significantly from COCOMO, so your project is examined from more than one angle
- Even companies that sell COCOMO-based products recommend using more than one method. Softstar (creators of Costar) will even provide you with contact information for their competitors’ products

COCOMO Conclusions
- COCOMO is the most popular software cost estimation method
- Easy to do; small estimates can be done by hand
- USC has a free graphical version available for download
- Many different commercial versions are based on COCOMO; they supply support and more data, but at a price

Conclusions
- Project costs are being poorly estimated
- The accuracy of cost estimation has to be improved:
  - Data collection
  - Use of tools
  - Use several methods of estimation
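
Worked sketch: the two hand calculations in these slides, the three-point SLOC estimate and the COCOMO 81 effort and schedule equations, are small enough to script. The Python below is only a minimal sketch under stated assumptions: the function and variable names (`expected_sloc`, `cocomo81`, `pieces`) are illustrative and do not come from the slides or from any COCOMO tool, and the coefficients are simply the ones in the mode table above. It recomputes the SLOC Estimate Example and the flight control example.

```python
# Coefficients (a, b, c, d) per mode, copied from the slide's mode table.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def expected_sloc(smallest, most_likely, largest):
    """Three-point estimate for one piece: E_i = (a + 4m + b) / 6."""
    return (smallest + 4 * most_likely + largest) / 6


def cocomo81(kdsi, mode, eaf=1.0):
    """Return (effort in MM, schedule in months, average staffing).

    MM = a * KDSI^b * EAF and TDEV = c * MM^d, as on the slides.
    eaf defaults to 1.0, the value the slides give for the basic model.
    """
    a, b, c, d = COEFFICIENTS[mode]
    effort_mm = a * kdsi ** b * eaf
    schedule_months = c * effort_mm ** d
    return effort_mm, schedule_months, effort_mm / schedule_months


# (smallest, most likely, largest) SLOC per piece, from the SLOC Estimate Example.
pieces = {
    "Display user's pending picks": (200, 300, 500),
    "Modify user profile": (100, 150, 250),
    "Display user profile": (250, 300, 450),
    "User registration and login": (200, 220, 250),
}

# Total expected SLOC: E = sum of E_i over all pieces.
total_sloc = sum(expected_sloc(*estimate) for estimate in pieces.values())
print(f"Total expected SLOC: {total_sloc:.0f}")

# Flight control example: 319 KDSI, embedded mode, RELY very high (1.40).
effort, schedule, staff = cocomo81(319, "embedded", eaf=1.40)
print(f"Effort:   {effort:.0f} MM")
print(f"Schedule: {schedule:.1f} months")
print(f"Staffing: {staff:.0f} FSP")
```

Keeping the coefficients in one table keyed by mode mirrors the slide and makes it easy to swap in locally calibrated values, which the calibration slides recommend.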