Experiences/Observations in Developing COCOTS and Early COCOTS
Betsy Clark
Brad (baggage handler/coffee-getter) Clark
Chris Abts
20th International Forum on COCOMO and Software Cost Modeling
October 2005
smi

Outline
Data Collection Challenges
Early COCOTS Model – An Update

Data Collection Challenges
Why is data so hard to get?
  – No immediate payback for data providers (takes time and effort)
  – Fear of airing “dirty laundry”
Data collector is about as popular as a telemarketer
Projects don’t track effort by activity (assessment, tailoring, glue code)
  – Effort data must be reconstructed
  – Quality of data is highly dependent on the knowledge of the person being interviewed

Experiences in Obtaining Data
Started by sending the COCOTS data-collection survey and asking people to fill it out
  – Result: nothing!
  – Length of the survey may have discouraged people
Changed our approach to meet face to face
  – Four-hour interviews
  – This approach has worked reasonably well BUT…

Difficulty Obtaining Complete Data
…BUT critical data was occasionally omitted
      • Effort (for assessment, tailoring, or glue code)
      • Glue code size
“Fred will get back to you with that number” but Fred never does
  – No leverage after the fact to make Fred do that
Lesson learned for future data gathering
  – Send out a one-page sheet in advance containing critical data items
  – Person can prepare without being overwhelmed by a lengthy form

Data Collection Challenges
Almost every project had a new war story to tell that impacted the amount of effort spent
One consistent thread is the lack of control resulting from the use of COTS components
“There may be such a thing as too much COTS – the more COTS you have, the less control you have”

Lack of Control Impacts Effort
Just a few of the areas that are out of a project’s control
  – Vendors going out of business
  – Vendors not performing as promised
  – Products not working as advertised
“The vendor folded half-way through development and we had to start over.”
“What leverage do I have with a vendor to ensure they deliver on time?”
“Very few components worked as advertised.”
“We spent lots of effort on workarounds because of deficiencies [in product X].”

Lack of Control Impacts Effort
A few more areas that are out of a project’s control
  – Dependence on vendor to fix errors
  – Evolution of a component (the what and when of new releases)
“If you find a bug in your custom code, you can fix it yourself. The vendor may not be able to duplicate a problem we find here. If not, they have to come to our site.”
“Even when we don’t change a version, there is a lot of analysis required. It can be difficult to verify implications with a black box.”
“Vendors are constantly driven to add more functionality which puts more demand on hardware.”

Lack of Control Impacts Effort
Yet another area
  – Lack of user acceptance because vendor’s interface isn’t what they’re used to
“Each site is its own fiefdom. It’s like a 100-headed hydra trying to make the same decision. No one can say yes – they all can say no.”

Lack of Control - Summary
In going the COTS route, a project is faced with a lot of potential “gotchas” that can have a major impact on effort
Ye Yang is identifying risks in developing COTS-based systems

Important Questions
Are we modeling the right activities?
  – Assessment, tailoring, glue code are important
  – Initial data collection efforts did not include several major activities
      • Business Process Reengineering (BPR)
      • Impact analysis and verification testing for new releases or products
      • Developer training (especially for tailoring)
      • User training
      • Data conversion
Do we have the right cost drivers?
  – Yes, but some are difficult for model users to estimate early on
  – Major impetus leading to Early COCOTS

Outline
Data Collection Challenges
Early COCOTS Model

What is Early COCOTS?
35,000-foot view to be used early in the lifecycle for:
  – Rough Order-of-Magnitude (ROM) estimates
  – Range of estimates
Simplified model
  – Information known early in the lifecycle is limited

Early COCOTS Effort Estimation
PM_COTS = PM_Assessment + PM_Tailoring + PM_GlueCode
  – PM_Assessment: table lookup by COTS product
  – PM_Tailoring: linear model or distribution by COTS class
  – PM_GlueCode: productivity by COTS product

What Model Users Know Early On
System Sizing Parameters
  – Number of users
  – Number of sites
  – Amount of data (legacy and new)
      • Number and age of databases to be converted
  – Amount of legacy code to be reused
      • Totally new systems are rare
  – Number of interfacing systems
Requirements
  – Functional (high level)
  – Performance
  – Security
Architecture (Solution alternatives)
Implementation Strategy

Classes of COTS Products
Application
  – graphic information systems
  – back office retail
  – telemetry analysis
  – telemetry processing
  – financial packages
  – network managers
Infrastructure
  – databases
  – disk arrays
  – communication protocols/packages
  – middleware
  – operating systems
  – network monitors
  – device drivers
Tools
  – configuration mgmt/build tools
  – data conversion packages
  – compilers
  – emulators
  – engineering tools (req’ts mgmt, design)
  – software process tools
  – GUIs/GUI builders
  – problem management
  – report generators
  – word processors

Assessment Effort vs. Number of COTS Products
There does not appear to be a correlation between the number of products and the total time spent assessing them.
COTS Product Assessment
[Figure: Actual Effort (y-axis, 0 to 160) vs. Number of COTS Products (x-axis, 0 to 45)]

Assessment Effort Estimation
Need to explain the variation in the amount of effort spent assessing a COTS product between different COTS-intensive application developments
  – Must be known early in the life cycle
“Uncertainty” driver
  – Created from the COCOTS data
  – The Uncertainty driver rates the number of unknowns that must be resolved to ascertain the fitness of a COTS product for use in the overall system
  – Applies to all classes of COTS products

Degree of Uncertainty -1
Small
  – Select from a list of pre-certified products
  – Choice is dictated (by hardware, by other software, or by organizational policy)
  – Already using a product which will be used in this project
Medium
  – There are multiple products but a detailed assessment is not required. Assessment will be a simple exercising of the product and/or a paper-and-pencil evaluation
Large
  – One or two products get a very detailed assessment and the other product choices were certain, e.g., once the operating system was chosen the other products were selected as well

Degree of Uncertainty -2
Very Large
  – There are a fair number of COTS products with very high levels of service requirements combined with large amounts of custom code. There is a lot of uncertainty and risk around those products. A lot of effort is spent on making sure those products work
  – Verify service level agreements such as performance, reliability, availability, fault tolerance, security, interoperability, etc.
  – Quadruple redundancy
  – Seven 9’s of reliability (99.99999%)
  – Through prototyping to assess key criteria
Extra Large
  – Many different groups of users: end-to-end detailed scenarios (entire work flows and data flows) required to assess suitability
  – Example: Government Financial package suite used for multiple government agencies

Assessment Effort vs Degree of Product Uncertainty
[Figure: Actual Effort (y-axis, 0 to 180) vs. Uncertainty (x-axis, 0 to 25), with ratings S, M, L, VL, XL marked]

Assessment Input
The effort required for Assessment is determined by the Degree of Uncertainty.
Select the Degree of Uncertainty

Degree of Uncertainty    Lower 80% CL     Mean    Upper 80% CL
Small                            0.31     0.37            0.44
Medium                           0.79     1.00            1.27
Large                            2.27     2.73            3.28
Very Large                       5.16     7.44           10.73
Extra Large                     11.65    20.14           34.83

Estimated Assessment Effort = 8.27 PM * Uncertainty Rating Range
Example: The Degree of Uncertainty was judged as Large (using rating descriptions)
  Low Estimate  = 8.27 PM * 2.27 = 19 PM
  Mean Estimate = 8.27 PM * 2.73 = 23 PM
  High Estimate = 8.27 PM * 3.28 = 27 PM

Tailoring Effort Estimation
Need to explain the variation in the amount of effort spent tailoring a COTS product between different COTS-intensive application developments
  – Must be known early in the life cycle
Application-type COTS products
  – Number of User Profiles: roles and permissions
  – Different user profiles create the need for different scripts, screens, and reports.
Infrastructure-type COTS products
  – Tailoring effort appears relatively constant
  – This type of tailoring usually consists of installation and setting parameters
Tool-type COTS products
  – These don’t appear to require tailoring; training is more of a cost driver for these products

Tailoring Effort for Applications
[Figure: Effort (PM) (y-axis, 0 to 300) vs. Number of User Profiles (x-axis, 0 to 6), with fitted line]
  – Linear fit: PM = 49.967x, R² = 0.8928 (x = number of user profiles)
  – Lower 80% CL = 43.6 PM; Mean = 57.1 PM; Upper 80% CL = 70.6 PM

Tailoring Effort for Infrastructure
[Figure: Tailoring Effort Distribution for Infrastructure; Frequency (y-axis, 0 to 7) vs. Tailoring Effort (x-axis, 1 to 8)]
  – Lower 80% CL = 2.69 PM; Mean = 3.75 PM; Upper 80% CL = 4.81 PM

Glue Code Effort -1
Need to explain the variation in the amount of effort spent constructing glue code for COTS products between different COTS-intensive application developments
  – Must be known early in the life cycle
Glue Code effort is based on observed productivities in the data
Quote
  – “It is very difficult to write glue code because there are so many constraints”
Effect: productivities are lower than for coding custom software
COCOTS data also shows that the Required System/Subsystem Reliability is inversely correlated to glue code productivities

Glue Code Effort -2
Constraints on System/Subsystem Reliability (ACREL)
  – How severe are the overall reliability constraints on the system or subsystem into which the COTS components were/are being integrated? What are the potential consequences if the components fail to perform as required in any given time frame?
Low: Threat is low; if a failure occurs, losses are easily recoverable (e.g., document publishing).
Nominal: Threat is moderate; if a failure occurs, losses are fairly easily recoverable (e.g., support systems).
High: Threat is high; if a failure occurs, the risk is to mission-critical requirements.
Very High: Threat is very high; if a failure occurs, the risk is to safety-critical requirements.

Glue Code Productivities
[Figure: histogram of glue code productivity; Frequency (y-axis) vs. SLOC/PM (x-axis, 0 to 1000 and “More”); bars annotated Reliability = VH, Reliability = N, Reliability = L]

Glue Code

COTS Product                GC Effort (staff months)      SLOC    SLOC/PM    Required Reliability
Operating System                                   6       100         17    H
Communication Product                              1        30         30    L
Database                                          85     5,000         59    H
Network Manager                                    2       150         75    VH
Operating System                                  36     3,000         83    VH
Graphical User Interface                          12     1,000         83    VH
Network Manager                                   24     2,000         83    VH
Telemetry Product                                 12     2,800        233    VH
Network Manager                                    6     1,400        233    VH
Report Generator                                  12     5,000        417    H
Graphical User Interface                          60    30,000        500    H
Communication Product                              6     3,000        500    H
Graphical User Interface                         100    50,000        500    H
Graphical User Interface                          84    50,000        595    L
Database                                          12    10,000        833    H
Graphical User Interface                          12    20,000       1667    N

Future Work
Cost Model Requests
“You need to have one integrated model – not COCOMO + COCOTS”
“We need one model – not two”
“Need to know how to cost the entire life cycle”
“COTS/GOTS is high risk because we are dependent on someone else... There needs to be a process to help people evaluate risk.”
“Our biggest driver is testing – not being captured now by COCOTS”
“Does model account for effort required to evaluate vendor patches/releases?”

Contact Information
Betsy Clark
betsy@software-metrics.com
Brad Clark
(703) 754-0115
brad@software-metrics.com
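
Worked example (illustrative sketch)
The arithmetic on the Assessment Input, Tailoring Effort for Applications, and Glue Code slides can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original deck: the function and constant names are hypothetical, the constants are copied from the slides, and treating glue code effort as size divided by an observed productivity is an assumption based on the statement that glue code effort "is based on observed productivities in the data".

```python
# Hypothetical helper names; all constants are copied from the slides.

# Uncertainty multipliers from the "Assessment Input" slide:
# (lower 80% CL, mean, upper 80% CL) for each Degree of Uncertainty rating.
UNCERTAINTY_MULTIPLIERS = {
    "Small": (0.31, 0.37, 0.44),
    "Medium": (0.79, 1.00, 1.27),
    "Large": (2.27, 2.73, 3.28),
    "Very Large": (5.16, 7.44, 10.73),
    "Extra Large": (11.65, 20.14, 34.83),
}

BASE_ASSESSMENT_PM = 8.27          # from "8.27 PM * Uncertainty Rating Range"
TAILORING_PM_PER_PROFILE = 49.967  # slope of the fit "PM = 49.967x"


def assessment_effort_range(rating):
    """Return (low, mean, high) assessment effort in person-months."""
    return tuple(BASE_ASSESSMENT_PM * m for m in UNCERTAINTY_MULTIPLIERS[rating])


def application_tailoring_effort(user_profiles):
    """Tailoring effort (PM) for an application-type product, per the linear fit."""
    return TAILORING_PM_PER_PROFILE * user_profiles


def glue_code_effort(sloc, productivity_sloc_per_pm):
    """Assumption: glue code effort (PM) is size divided by observed productivity."""
    return sloc / productivity_sloc_per_pm


if __name__ == "__main__":
    low, mean, high = assessment_effort_range("Large")
    print(f"Assessment (Large): {low:.0f} / {mean:.0f} / {high:.0f} PM")
```

For a rating of Large this prints 19 / 23 / 27 PM, matching the example on the Assessment Input slide. Keeping the multipliers in a lookup table mirrors the table-lookup form of the assessment submodel and makes it easy to report the range rather than a single point estimate.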