SLOC and Size Reporting
Pongtip Aroonvatanaporn [Material by Ray Madachy]
USC Center for Systems and Software Engineering
CSCI 577b, Spring 2011. February 4, 2011

Outline
• Size Reporting Process
• SLOC Counting Rules
• Reused and Modified Software
  – COCOMO Model
• The Unified Code Count tool
• Conclusion

Goal of Presentation
• Understand the size data required at IOC (Initial Operational Capability)
• Why?
  – Important historical data on the 577 process
    • Process performance
  – Can be used for COCOMO calibration
    • A specially calibrated COCOMO for 577
    • The current COCOMO calibration uses 200+ projects
  – Helps identify COCOMO deficiencies and additional needs

Size Reporting Process
• Determine what you produced and quantify it
  – Code developed new, reused, and modified
  – Apply a code counter to the system modules
  – Apply reuse parameters to all reused and modified code to get equivalent size
• Identify any software not counted
• Provide as much background as possible
  – So someone can follow up and fill in the gaps
  – Function point counts are acceptable as a last resort
• A known problem is counting COTS code
  – Provide COCOTS inputs if doing COTS-intensive development
  – COTS development can contribute the majority of effort even though COTS size cannot be counted
• Finalizing the report
  – Add up all equivalent lines of code: the same top-level size measure that COCOMO uses
  – Count by modules
    • The modules should be consistent with your COCOMO estimate; otherwise it is nearly impossible to compare actuals with estimates

Lines of Code
• Source lines of code (SLOC) are logical source statements, NOT physical lines
• Logical source statements
  – Data declarations: non-executable statements that affect an assembler's or compiler's interpretation
  – Executable statements: cause runtime actions

Example 1: one logical statement spread over several physical lines (LOC = 1)

    String[] command = {
        "cmd.exe",
        "/C",
        "-arg1",
        "-arg2",
        "-arg3"
    };

Example 2: five logical statements (LOC = 5)

    int arg1 = 0;
    int arg2 = 4;
    String ans;
    ans = "Answer is";
    System.out.println(ans + arg1 + arg2);

SLOC Counting Rules
• Standard definition for counting lines
  – Based on the SEI definition, modified for COCOMO
• When a line or statement contains more than one type, classify it as the type with the highest precedence

Statement type (in order of precedence):
• Includes: executable statements; non-executable declarations and compiler directives
• Excludes: comments, whether on their own lines or on lines with source code, including banners, non-blank spacers, and blank (empty) comments; blank lines

How produced:
• Includes: programmed; converted with automated translators; copied or reused without change; modified
• Excludes: generated with source code generators; removed
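To illustrate the logical-versus-physical distinction behind these rules (and Example 1 above), here is a deliberately naive Java sketch. It is a toy, not the UCC algorithm: it only skips blank lines and //-comments and counts statement-terminating semicolons, so string literals or /* */ blocks containing semicolons would fool it. All class and method names here are hypothetical.

    import java.util.List;

    /**
     * Toy logical-SLOC counter: one logical statement per terminating
     * semicolon; blank lines and whole-line comments are excluded, as
     * the counting rules require. Real counters such as UCC apply
     * language-aware rules instead of this character scan.
     */
    public class NaiveLogicalSloc {

        static int count(List<String> physicalLines) {
            int logical = 0;
            for (String raw : physicalLines) {
                String line = raw.trim();
                if (line.isEmpty() || line.startsWith("//")) {
                    continue;              // blank lines and comments do not count
                }
                for (char c : line.toCharArray()) {
                    if (c == ';') {
                        logical++;         // one logical statement per terminator
                    }
                }
            }
            return logical;
        }

        public static void main(String[] args) {
            // Example 1 from the slides: seven physical lines, one statement
            int loc = count(List.of(
                "String[] command = {",
                "    \"cmd.exe\",",
                "    \"/C\",",
                "    \"-arg1\",",
                "    \"-arg2\",",
                "    \"-arg3\"",
                "};"));
            System.out.println("Logical SLOC = " + loc);  // prints 1
        }
    }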
Origin:
• Includes: new work (no prior existence); prior work taken or adapted from a previous version, build, or release; a local or modified language support library or operating system
• Excludes: commercial off-the-shelf software (COTS), other than libraries; government furnished software (GFS), other than reuse libraries; another product; a vendor-supplied language support library (unmodified); a vendor-supplied operating system or utility (unmodified)

Reused and Modified Software
• Also categorized as "adapted software"
• Problem
  – The effort for adapted software is not the same as for new software
  – How do we compare the effort for reused and modified software with that for new software?
• Counting approach
  – Convert adapted software into an equivalent size of new software

Reuse Size-Cost Model
[Figure: relative cost vs. amount modified, data on 2,954 NASA modules (Selby, 1988). The curve starts at about 0.046 at 0% modified and rises steeply (roughly 0.55 at 25%, 0.70 at 50%, 0.75 at 75%) to 1.0 at 100%, well above the usual linear assumption. It does not pass through the origin because assessing, selecting, and assimilating reusable components costs about 5% even when nothing is modified.]
• Non-linear because small modifications generate disproportionately large costs
  – The cost of understanding the software
  – The relative cost of interface checking

COCOMO Reuse Model
• Non-linear estimation model: converts adapted software into an equivalent size of new software
• The Adaptation Adjustment Factor (AAF) combines the percent of design modified (DM), the percent of code modified (CM), and the percent of effort for integration and test of the modified software (IM):

    AAF = 0.4(DM) + 0.3(CM) + 0.3(IM)

• Equivalent SLOC adds assessment and assimilation effort (AA) and a software understanding term (SU) weighted by unfamiliarity (UNFM); the bracketed expression divided by 100 is the Adaptation Adjustment Multiplier (AAM):

    ESLOC = ASLOC × [AA + AAF(1 + 0.02(SU)(UNFM))] / 100,  for AAF ≤ 50
    ESLOC = ASLOC × [AA + AAF + (SU)(UNFM)] / 100,         for AAF > 50

Reuse Model Parameters
• DM: Percent Design Modified
  – The percentage of the adapted software's design that is modified to fit the new objectives
• CM: Percent Code Modified
  – The percentage of the reused code that is modified to fit the new objectives
• IM: Percent of effort for integration and test of the modified software, relative to new software of comparable size

    IM = 100 × I&T effort (modified software) / I&T effort (new software)

Reuse Model Parameters (AA)
• Assessment and Assimilation effort: the effort needed to
  – Determine whether fully-reused software is appropriate
  – Integrate its description into the overall product description

    AA increment   Level of AA effort
    0              None
    2              Basic module search and documentation
    4              Some module Test and Evaluation (T&E), documentation
    6              Considerable module T&E, documentation
    8              Extensive module T&E, documentation
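To make the model concrete, here is a worked example with illustrative (not prescribed) values; the SU and UNFM rating scales are defined just below. Suppose 10,000 adapted SLOC with DM = 10, CM = 20, IM = 30, AA = 2, SU = 30, and UNFM = 0.4:

    AAF   = 0.4(10) + 0.3(20) + 0.3(30) = 19   (≤ 50, so the first equation applies)
    ESLOC = 10,000 × [2 + 19 × (1 + 0.02 × 30 × 0.4)] / 100
          = 10,000 × 25.56 / 100 ≈ 2,556

So the 10,000 adapted lines count as roughly 2,556 equivalent new lines for estimation purposes.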
Reuse Model Parameters (SU)
• Software Understanding effort
  – When the code is not modified (DM = 0, CM = 0), SU = 0
  – Take the subjective average of three categories: structure, application clarity, and self-descriptiveness

    Very Low (SU increment to ESLOC: 50): very low cohesion, high coupling, spaghetti code; no match between program and application world-views; obscure code with missing, obscure, or obsolete documentation
    Low (40): moderately low cohesion, high coupling; some correlation between program and application; some code commentary and headers, some useful documentation
    Nominal (30): reasonably well-structured with some weak areas; moderate correlation between program and application; moderate level of code commentary, headers, and documentation
    High (20): high cohesion, low coupling; good correlation between program and application; good code commentary and headers, useful documentation with some weak areas
    Very High (10): strong modularity, information hiding in data and control structures; clear match between program and application world-views; self-descriptive code with up-to-date, well-organized documentation including design rationale

Reuse Model Parameters (UNFM)
• Unfamiliarity: the effect of the programmer's unfamiliarity with the software

    UNFM increment   Level of unfamiliarity
    0.0              Completely familiar
    0.2              Mostly familiar
    0.4              Somewhat familiar
    0.6              Considerably unfamiliar
    0.8              Mostly unfamiliar
    1.0              Completely unfamiliar

Improved Reuse Model
• A unified model for both reuse and maintenance
  – New calibration performed by Dr. Vu Nguyen
• SLOC modified and deleted are considered equivalent to SLOC added

    AAF = 0.4(DM) + 0.3(CM) + 0.3(IM)
    ESLOC = ASLOC × [AA + AAF + (1 − (1 − AAF/100)²)(SU)(UNFM)] / 100,  for AAF ≤ 100
    ESLOC = ASLOC × [AA + AAF + (SU)(UNFM)] / 100,                      for AAF > 100

Reuse Parameter Guidelines
• New (all original software): reuse parameters not applicable
• Adapted (changes to pre-existing software): DM = 0-100%, normally > 0%; CM = 0+% to 100%, usually > DM and must be > 0%; IM = 0% to 100+%, usually moderate but can exceed 100%; AA = 0-8%; SU = 0-50%; UNFM = 0-1
• Reused (unchanged existing software): DM = 0%; CM = 0%; IM = 0-100%, rarely 0% but could be very small; AA = 0-8%; SU and UNFM not applicable
• COTS (off-the-shelf software; often requires new glue code as a wrapper around the COTS): DM = 0%; CM = 0%; IM = 0-100%; AA = 0-8%; SU and UNFM not applicable

Data Collection
• Refer to the COCOMO model definition for details on the various parameters (DM, CM, IM, etc.)
• Indicate the counting method you used
  – Manual approach?
  – Automated?
• Available code counters
  – The CSSE code counter: UCC
  – A code counter developed as part of the CSC 665 Advanced Software Engineering project
  – Third-party counters are acceptable, but make sure their counting rules are consistent
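Once the parameters are collected, the equivalent new size can be computed mechanically. Below is a minimal Java sketch of both reuse models as reconstructed above; the class and method names are hypothetical, it is not part of UCC or any official COCOMO tool, and it assumes the percent-based reading of the AAF thresholds.

    /**
     * Minimal sketch: convert adapted code into equivalent new SLOC
     * using the COCOMO II reuse model and the improved unified model.
     * DM, CM, IM, AA, and SU are percentages; UNFM is 0.0 to 1.0.
     */
    public class EquivalentSloc {

        static double adaptationFactor(double dm, double cm, double im) {
            return 0.4 * dm + 0.3 * cm + 0.3 * im;   // AAF
        }

        /** Original COCOMO II reuse model (two cases split at AAF = 50). */
        static double esloc(double asloc, double dm, double cm, double im,
                            double aa, double su, double unfm) {
            double aaf = adaptationFactor(dm, cm, im);
            double aam = (aaf <= 50)
                    ? (aa + aaf * (1 + 0.02 * su * unfm)) / 100  // small changes
                    : (aa + aaf + su * unfm) / 100;              // large changes
            return asloc * aam;
        }

        /** Improved unified reuse model (continuous in AAF). */
        static double eslocImproved(double asloc, double dm, double cm, double im,
                                    double aa, double su, double unfm) {
            double aaf = adaptationFactor(dm, cm, im);
            if (aaf > 100) {
                return asloc * (aa + aaf + su * unfm) / 100;
            }
            double ramp = 1 - Math.pow(1 - aaf / 100, 2); // 0 at AAF=0, 1 at AAF=100
            return asloc * (aa + aaf + ramp * su * unfm) / 100;
        }

        public static void main(String[] args) {
            // Same illustrative values as the worked example earlier
            System.out.println(esloc(10000, 10, 20, 30, 2, 30, 0.4));         // 2556.0
            System.out.println(eslocImproved(10000, 10, 20, 30, 2, 30, 0.4)); // ~2512.7
        }
    }

Note how the improved model replaces the discontinuous jump at the threshold with a ramp that reaches the large-change case smoothly at AAF = 100.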
The Unified Code Count Tool
• Developed at USC-CSSE
• Based on the counting-rule standards established by the SEI
• Has evolved to count all major languages, including web platforms
• Can be used to determine modified code (changed and deleted)
  – Use this data to find the equivalent "new" code

Conclusion
• Software sizing and reporting is more than simple line counting
  – Actual effort is tied to equivalent size
  – Only logical source code contributes to effort
• Accurate reporting is essential
  – For research purposes
  – For process performance evaluation and calibration
  – For future planning and productivity predictions
• Give background on any software pieces that were not counted