Analysis and construction of software measures
Jean-Marc Desharnais

Analysis of software measures
• Scale and units
• McCabe
• Albrecht Function Point
• Halstead Measure
The following presentation is inspired by documents from Dr. Alain Abran, which I am using with his permission. A book should be published in the near future.

Scale and units

Scale numbering
• Nominal
• Ordinal
• Interval
• Ratio
• Absolute

Exercise: Scale
• Function Point Analysis (FPA) uses adjusted and unadjusted points;
• Unadjusted FPA is the total number of points, obtained from the components counted according to their type and their complexity, using the points given by the 5 tables;
• Adjusted FPA is the unadjusted FPA multiplied by an adjustment factor between 0.65 and 1.35;
• This adjustment factor comes from 14 questions, each rated from 0 to 5: the sum of the ratings ranges from 14 * 0 = 0 to 14 * 5 = 70 and, divided by 100, from 0 to 0.7, which is the width of the interval 1.35 - 0.65 = 0.7.
What is inappropriate in this approach?

McCabe Cyclomatic Complexity Number
This short presentation will show that, for the McCabe measure:
• the design of the measure is ambiguous, and so is its interpretation;
• the measurement concepts and measurement instruments are based on imprecise empirical representations.

McCabe and complexity
Prior to investigating the explicit and implicit definitions behind the McCabe ‘cyclomatic complexity number’, it is interesting to survey how complexity has been defined (characteristics of the measure).

Complexity definition: IEEE
• Degree to which a system or component has a design or implementation that is difficult to understand and verify;
• Complexity is a property of the design or implementation;
• It is also related to the effort needed to understand and verify the design or implementation.
Two different entities are involved in this definition: process (effort) and product (design or source code).

Complexity and McCabe
By adding the label ‘complexity’ to the expression ‘Cyclomatic Number’, McCabe leads the reader to believe that the attribute he considered is the complexity of the source code of a program, but he does not explicitly document this claim, which is made by association only.

What do some authors say about complexity?

Complexity: Evans et al.
• Degree of complication of a system or system component;
• Determined by such factors as the number and intricacy of interfaces, the number and intricacy of conditional branches, the degree of nesting, and the types of data structures.
Note: many concepts are to be measured.

Complexity: Whitmire
Also includes computational, psychological and representational complexity:
• Computational: hardware resources needed to execute the software;
• Psychological: complexity of the problem solved by the software and characteristics of the software (size, cohesion, coupling, etc.);
• Representational: complexity that includes knowledge and experience of the problem.

Cyclomatic complexity number
• McCabe's analysis is based on measurement concepts from graph theory, transposed to software measurement;
• The cyclomatic number is: v(G) = e - n + p (1)
  where e = number of edges; n = number of nodes (vertices); p = number of separate components.

Simple example
E (edges or arrows) = 4
N (nodes or squares) = 5
P = 1 (connected component) + 1 (virtual edge, which makes the graph strongly connected) = 2
v(G) = E - N + P = 4 - 5 + 2 = 1

Exercise
What is the complexity?
Edge = ?  Node = ?  v(G) = E - N + P
Answer: Edge = 16, Node = 13, v(G) = 16 - 13 + 2 = 5

Definition of units
Left side of the equation:
• Units of v(G) derived from the definition: maximum number of linearly independent cycles in a strongly connected graph;
• The notion of a strongly connected graph indirectly explains the metamodel.
Right side of the equation:
• Three distinct types of units: edges, nodes and 'separate components';
• Numerical assignment rule: add and subtract.
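The arithmetic of the simple example and of the exercise can be reproduced with a short Python sketch. It is only an illustration: the function name is hypothetical, and it assumes the convention used in the example, namely that one virtual edge is added per connected component so that the graph becomes strongly connected before v(G) = e - n + p is applied.

```python
def cyclomatic_number(edges: int, nodes: int, components: int = 1) -> int:
    """Return v(G) = e - n + p, computed on the graph closed by virtual edges."""
    # Assumption: one virtual edge (exit back to entry) per connected component.
    virtual_edges = components
    return (edges + virtual_edges) - nodes + components


print(cyclomatic_number(4, 5))    # simple example: 4 - 5 + 2 = 1
print(cyclomatic_number(16, 13))  # exercise: 16 - 13 + 2 = 5
```

Note that the function adds and subtracts counts of edges, nodes and components as plain integers. This is exactly the numerical assignment rule described under "Definition of units", and the code says nothing about what unit the result carries.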
Definition of unit not clear
• Different types of units are involved;
• Different types of units (edges and nodes) cannot be added at the level of each unit;
• They need to be transposed to a higher level of abstraction (ex: apples and oranges share the higher abstraction of fruit);
• Graph theory has not clearly documented how the units are manipulated in its measurement procedure;
• There is a lack of clarity.
In Zuse, a proof of the scale type associated with the cyclomatic number is suggested.

Analysis: Definition of the measured entity
• The entity measured by the Cyclomatic Complexity Number is a control flow graph.
• According to McCabe, the measured entity is the source code of a given module, which corresponds to a function or a subroutine.

Analysis: Definition of the measured entity (continued)
• But do graphs correctly represent the source code entity for the purpose of measuring its Cyclomatic Number?
• In other words, has the assumption of a one-to-one relation between the source code of a given module and its corresponding graph been verified?

Analysis: Definition of the measured entity (continued)
• The source code of one module is related to one and only one graph.
• But the converse is not necessarily true: one graph can be related to one or many source codes.
• So it is not obvious that the final source code corresponds to the measured graph.
• Moreover, McCabe suggests using this Cyclomatic Number to plan the testing effort (ex: test the more complex modules).

Source code and graph
[Figure: Source code 1, Source code 2 and Source code 3; Graph 1 and Graph 2]

Interpretation in measurement
We should explore:
• whether or not there is a relationship across multiple variables;
• how strong such relationships are;
• how much these relationships vary (or not) under various conditions.
McCabe complexity
By adding the term ‘complexity’ to the ‘cyclomatic number’, McCabe has explicitly made an association between this number and a concept of complexity, but has:
• not explicitly described this association;
• not associated quantitative numbers with this association;
• not described under which conditions it would hold for particular quantities.

Interpretation in industry
Ratio scale or exponential scale?
• Program A: Cyclomatic number = 10
• Program B: Cyclomatic number = 5
• Testing program A versus program B: does A take more than twice the testing time of B?
Practitioners build their own interpretation scales without clear rules, using conventions that are not related to measurement.

Conclusion: McCabe
• Artificially labeling the Cyclomatic Number as a ‘complexity’ concept has led to considerable ambiguity about the use of this number as a measurement number rather than as a qualitative empirical model that varies according to the empirical context.
• The key problem is that of measurement units: the related concepts have been neither adequately explored nor adequately explained.
• Without such knowledge and insights, it is difficult to improve the design of the measure.
Albrecht Function Point
• In 1979, Allan Albrecht was the first to propose a new way of quantifying software size;
• Size was based on the user's view of the software;
• The functional size of software was proposed as a generic measure of the “output” of the development process;
• It allows a comparison of projects in which the software was developed using different programming languages.

Albrecht Function Point (continued)
• The approach proposed by Albrecht is characterized by its empirical nature;
• It was quite a useful method for solving the problems of the 60's and 70's, but it is more difficult to apply in modern software development (Alain Abran).
Note: Dr. Abran's article is to be published in December 2009.

Albrecht (global vision)
Unadjusted count. There are two main parts:
• the measurement process for the logical data files;
• the measurement process for the transactions.
RET = Record Element Type

Validity of the model
It must be noted that, in both Albrecht's original paper and the IFPUG documentation:
• none of the relations of these models is based on an experimentally justified theory within a precise framework;
• they are only described by a set of rules established by consensus within the normative committee of IFPUG, the Counting Practices Committee.

Unadjusted FPA
The final results, (unadjusted) function points, are most difficult to interpret:
• Many implicit dimensions;
• Use of different types of measurement;
• Mix of measurement scale types;
• No mathematical significance unless the validity of each transformation can be demonstrated theoretically or empirically.

Value Adjustment factors
• 14 general questions, each with 6 possible ratings (counting 0 as one);
• Each answer is classified from 0 to 5 (ordinal);
• The results (characteristic numbers) of the 14 general questions are added (interval);
• The classification of each system characteristic from the previous step is then multiplied by an ‘impact’ of 0.01 for each answer. Ex: if the answer to question 1 is 2, the result is 0.02 (ratio).
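The value adjustment steps above can be sketched in Python. This is a minimal illustration, not the IFPUG procedure itself. The function names and the sample ratings are hypothetical, and the sketch assumes the adjustment factor is 0.65 plus 0.01 times the sum of the 14 ratings, which gives the 0.65 to 1.35 range stated in the scale exercise.

```python
def value_adjustment_factor(ratings):
    """Return the adjustment factor from 14 ratings, each between 0 and 5."""
    ratings = list(ratings)
    if len(ratings) != 14:
        raise ValueError("expected exactly 14 ratings")
    if any(r not in range(6) for r in ratings):
        raise ValueError("each rating must be an integer from 0 to 5")
    # Assumption: factor = 0.65 + 0.01 * sum, written as (65 + sum) / 100
    # to avoid floating-point noise in the intermediate product.
    return (65 + sum(ratings)) / 100


def adjusted_function_points(unadjusted, ratings):
    """Multiply the unadjusted count by the adjustment factor."""
    return unadjusted * value_adjustment_factor(ratings)


print(value_adjustment_factor([0] * 14))  # lower bound: 0.65
print(value_adjustment_factor([5] * 14))  # upper bound: 1.35
```

The sketch makes the scale problem visible: the ratings are ordinal labels, yet the code sums them and scales them as if they were ratio-scale quantities.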
Halstead Measures
According to Halstead, a computer program is an implementation of an algorithm, considered to be a collection of tokens that can be classified as either operators or operands.

Halstead operators and operands
By counting the tokens and determining which are operators and which are operands, the following base measures can be collected:
• n1: Number of distinct operators.
• n2: Number of distinct operands.
• N1: Total number of occurrences of operators.
• N2: Total number of occurrences of operands.
• n1*: Number of potential operators.
• n2*: Number of potential operands.

Operator or operand
• Unfortunately, there is a problem in distinguishing between operators and operands;
• Halstead has not explicitly described the generic measurable concepts of operators and operands;
• He assumed that the example he gave was explicit enough.

Examples of equations
• 10 equations are based on the results of the 6 base measures: n1, n2, N1, N2, n1* and n2*;
• Example: the length (N) of a program P is: N = N1 + N2;
• Example: the vocabulary (n) of a program P is: n = n1 + n2;
• There are also volume, level, difficulty, estimator, intelligence content, programming effort and programming “time”.

Halstead measures
• In summary, the Halstead measures, as designed almost thirty years ago, do not meet a key design criterion of measures in engineering and the physical sciences.
• The implementation of the measurement functions of Halstead's measures has been interpreted in ways that differ from the goals specified by Halstead in their design.
• For example, the program length (N) has been interpreted as a measure of program complexity, which is a different characteristic of a program.
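The four counted base measures and the two example equations (N = N1 + N2 and n = n1 + n2) can be sketched in Python. This is an illustration under a strong assumption: the tokens are already classified as operators or operands, which is precisely the step Halstead left undefined. The function name and the token lists are hypothetical.

```python
def halstead_counts(operators, operands):
    """Compute n1, n2, N1, N2, length N and vocabulary n from classified tokens."""
    n1 = len(set(operators))  # distinct operators
    n2 = len(set(operands))   # distinct operands
    N1 = len(operators)       # total occurrences of operators
    N2 = len(operands)        # total occurrences of operands
    return {
        "n1": n1, "n2": n2, "N1": N1, "N2": N2,
        "length": N1 + N2,      # N = N1 + N2
        "vocabulary": n1 + n2,  # n = n1 + n2
    }


# Hypothetical classification of the tokens of: z = x + y; w = x * z
operators = ["=", "+", ";", "=", "*"]
operands = ["z", "x", "y", "w", "x", "z"]

result = halstead_counts(operators, operands)
print(result["length"], result["vocabulary"])  # 11 8
```

A different but equally defensible classification (for example, not counting ";" as an operator) changes every result, which illustrates why the missing definition of operators and operands matters.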