On Inter-Method and Intra-Method Object-Oriented

ON INTER-METHOD 1
ON INTER-METHOD AND INTRA-METHOD OBJECT-ORIENTED CLASS
COHESION
Frank Tsui, Orlando Karam, Sheryl Duggins, Challa Bonja
School of Computing and Software Engineering
Southern Polytechnic State University
Marietta, Georgia, USA 30060
KEYWORDS: Object-Oriented Design, Software Metrics, Software Quality, Systems
Evaluation
Abstract
Cohesion has been a topic of interest since structured design in the 1970’s. Cohesion may
also be viewed as a characterization of a system attribute. Today, there are numerous
researchers continuing this work into object-oriented designs. Most of the current
research has focused on the interaction of methods within a class, the inter-method
cohesion. In this paper, we consider both the inter-method cohesion and the intra-method
cohesion of a class. We have utilized the concept of program slice (Weiser, 1981) and
have extended Functional Cohesion (Bieman & Ott, 1994) to devise a new intra-method
cohesion metric, ITRA-C, for measuring cohesion of each method within the class. This
intra-method cohesion is based on the notion of effects and chaining in an effect-slice.
We further combine the (inter-method, intra-method)-tuple into one combined Class
Cohesion, which provides a quick view of bands of cohesion for categorizing classes.
Introduction
Developing high quality software continues to be a difficult task. Many attributes may be
studied to understand software. Since software engineering is still in a relatively young
stage, applying the “systems approach” as defined by R. L. Ackoff (Ackoff, 1971) where
the complete software system is studied in a holistic manner is still a challenge. In this
paper, we will focus on a specific software attribute, cohesion, and study it further
through measuring this attribute from an object oriented class perspective. Cohesion has
been shown to be an important attribute for good quality software (Bansiya & Davis,
2002; Bieman & Ott, 1994; Briand, Morasca, & Basili, 1995). In this paper, instead of the
complete software, the object oriented class itself is viewed as the system. Cohesion is an
attribute that characterizes connectedness and thus allows us to view a system as a set of
connected elements (Checkland, 1981), rather than in separate parts. We pursue an in-depth
analysis of this single attribute of the system through the various views of inter- and
intra-method cohesion metrics. We will also show how the cohesion metrics may be used
to help us design better object oriented classes. Thus, the value of the paper is not only in
extending the concepts of cohesion and the various associated metrics, but also the
application of these metrics in guiding us in improving our class, or system, design. This
emphasis on engineering software has led to research into measurements for evaluating
the quality of software. Low coupling and high cohesion have been identified as
attributes of good software design (Bansiya & Davis, 2002; Briand et al., 1994) and a
large number of metrics have been developed to measure these quality attributes. The
notion of cohesion has been in existence for several decades (Stevens et al., 1974;
Yourdon & Constantine, 1979). These early papers introduced the concept of “functional
relatedness” of modules. The relatedness among modules was called coupling and the
relatedness within a module was called cohesion. Relatedness itself is an abstract concept
that asks whether items belong together. Intuitively, those that “belong” together ought
to be designed into one entity. This made sense, especially, for the follow-on
maintenance people who had to understand and make modifications to the design and the
code. That is, if the “related” entities are spread across the system, then it is more
difficult to find them. As Checkland (1981) advocates, a system should be thought of as
a connected set of elements rather than separate parts. Other than the now well-known
seven levels of cohesion (coincidental, logical, temporal, procedural, communicational,
sequential, and functional), which defined ordered categories of cohesion, there was not a
numeric metric for modular cohesion in those early days. Bieman and Ott (1994) and
Bieman and Kang (1995; 1998) introduced numeric metrics based on program slices to
gauge “relatedness,” or cohesion.
Following the same concept of relatedness, there are several metrics designed to measure
cohesion of an object-oriented class. Briand et al. (1994;1998), Hitz and Montazeri
(1995), Chidamber and Kemerer (1994), Bansiya and Davis (2002), Counsel et al.
(2006), Henderson-Sellers (1996), Bonja and Kidanmariam (2006), Chae et al. (2004),
and Zhou et al. (2002; 2004) have proposed different approaches to measuring cohesion in
an object-oriented class. For the most part, these metrics all revolve around the notion
of relatedness of the methods in a class. The relatedness of the methods is primarily
gauged by the amount of and the type of sharing of the attributes, or data. The methods in
a class are considered more cohesive if the amount of or type of (or both) sharing of
attributes is higher. Also, the amount of interaction among methods in the form of
method invocation of other methods in the class is considered an important factor for
cohesion among methods in a class. That is, the connectedness of the methods is
considered important. But, still, whether each individual method itself is cohesive or not
is not clearly accounted for. In this paper, we will consider class cohesiveness to be
composed of two attributes:
- Relatedness, and
- Singularity in function or purpose.
If one views a class as a system, then the relatedness concept of methods in that system is
similar to the concept of coupling of the methods in the class. The more “coupled” the
methods within a class are, the higher the cohesion of that class is. Thus, inter-method
cohesion is captured by the notion of coupling of methods in the class. In such a context,
one is led to ask what the cohesion of an individual method is. That is, the singularity of
function for each method in a class is important. Thus, intra-method cohesion must also
be considered. Intra-method cohesion should answer how singular in purpose the method
is. Intuitively, the more singular the method’s functional purpose is, the higher its
intra-method cohesion is. The ideal situation
for a class is to maximize single purpose methods (intra-method cohesion) and also have
these cohesive methods be strongly related in a class (inter-method cohesion). Both
inter-method cohesion and intra-method cohesion need to be included when discussing the
cohesion of a class. Furthermore, one may want to consider which one of the
sub-attributes, inter-method or intra-method, is more important.
Design quality metrics for object oriented systems can be categorized as either static or
dynamic. Dynamic metrics measure object level coupling and dynamic complexity
(Yacoub et al, 1999). This paper will address static metrics which measure the static
cohesion of a class. The static structure of a class is considered to have only three main
parts, the class name, the instance variables, and the methods. Class cohesion is analyzed
by utilizing the instance variables and the methods of the class and their interplays within
the class. We will first discuss the traditional relatedness of methods in a class, the
inter-method cohesion metric. The notion of inter-method coupling will be studied through a
set of evolving scenarios in which an instance variable or a method is added to an “ideally”
inter-method wise cohesive class. We will then explore the concept of intra-method
cohesion and introduce an intra-method cohesion metric. In the process of extending the
metric definition, we also expand the notion of intra-method cohesion. Finally, the
combination of inter-method and intra-method cohesion will be considered. Here, the
difficulties involving multi-attribute metrics, as pointed out by Fenton and Pfleeger (1997),
are explored. A potential combination metric will be proposed, and its characteristics will
be discussed.
Inter-Method Cohesion
Cohesion of an entity is based on several basic and similar concepts (Stevens et al, 1974;
Yourdon & Constantine 1979). These range from how much the entity serves a common
goal to how related the parts of the entity are. These are intuitively similar in that if an
entity had many unrelated parts, then chances are they may be serving more than a
singular purpose. Here, we will use a very simple and contrived example for illustration.
Consider a Class Math that is designed to perform a single
service of providing the sum of a set of integer numbers. This Class Math may be
expanded to include more services in the form of methods to provide the maximum of the
set of integer numbers, the minimum of the set of integer numbers, and the average of the
set of integers. As Class Math matures and enters into maintenance mode, it is further
expanded to also accept floating point numbers. Further enhancement of Class Math may
include a method that performs the input check and restricts the input to be only integers
and floating point. In a way, these enhancements are not atypical of a Class that evolves
through its post-release enhancements. We can easily see how a very limited single
purpose Class Math can be expanded to a broader multi-purpose Class Math.
Using this Class Math example, let us examine how the various inter-method metrics
would treat the change in single purposefulness and relatedness of a Class. The
inter-method metrics which define cohesion based on the interaction of methods with the
instance variables will all treat the above Class Math in a similar way. That is, the
methods are all interacting with the same set of instance variables, the input integers and
the input floating point numbers. In Table 1, we have summarized these different cohesion
metrics of interest which we will use to trace their respective changes as Class Math
evolves.
Table 1: Some Major Inter-Method Cohesion Metrics

Metric: RCI, Briand et al. (1998)
Metric Explanation: RCI = |CI(C)| / |Max(C)|, where CI(C) is the set of all data
declaration, or DD, interactions and data-method, or DM, interactions in the Class.
Max(C) is the set of all possible DD and DM interactions.

Metric: TCC and LCC, Bieman and Kang (1995; 1998)
Metric Explanation: TCC = NDC / NP, where NDC = # of pairs of methods that directly or
indirectly use common attributes, or directly connected methods. NP = all possible # of
pairs of methods that directly or indirectly use common attributes, or all possible
directly connected pairs. LCC = (NDC + NIC) / NP, where NIC are the pairs of methods
that are indirectly connected.

Metric: CC, Bonja and Kidanmariam (2006)
Metric Explanation: CC = ( ∑(|IVC| / |IVT|) ) / |Max Pairs|, where IVC is the set of
common instance variables used by a pair of methods. IVT is the set of instance
variables used by a pair of methods. The numerator is the sum of these ratios summed
over all the pairs of methods, or n!/(2*(n-2)!) pairs, in the Class. Max Pairs is the
maximum possible pairs of methods, which is n!/(2*(n-2)!) pairs for a class with n
methods.

Metric: LCOM, Chidamber and Kemerer (1994)
Metric Explanation: LCOM = |P| - |Q| if |P| > |Q|; otherwise 0. If there are n methods,
then {Ii} is the set of instance variables used by method i, Mi. Then
P = {(Ii, Ij) where Ii ∩ Ij = Ø}, and Q = {(Ii, Ij) where Ii ∩ Ij ≠ Ø}. If for all i,
{Ii} = Ø, then P = Ø.

Metric: LCOM4, Hitz and Montazeri (1995)
Metric Explanation: LCOM4 = # of connected components in a class, where method a and
method b are connected if 1) they share an instance variable or 2) either method a
invokes method b or vice versa.

Metric: LCOM5, Henderson-Sellers (1996)
Metric Explanation: LCOM5 = [ ((1/a) ( ∑ u(Aj) )) - m ] / (1 - m), where a = # of
attributes or instance variables, u(Aj) = number of methods accessing attribute Aj,
m = number of methods, and ∑ u(Aj) is summed over all the attributes j = 1, ..., a.

Metric: CACM, Bansiya et al. (2002)
Metric Explanation: CACM = (∑∑ Oij) / (KL), where Oij is the (i,j)th entry in the
parameter occurrence matrix. Oij = 1 if the jth data type occurs as a parameter in the
ith method, and Oij = 0 otherwise. K is the number of columns or number of data types in
the parameter occurrence matrix, and L is the number of rows or the number of methods in
the parameter occurrence matrix. ∑∑ Oij is summed over all the parameter data types, K,
and over all the methods, L.

Metric: NHD, Counsel et al. (2006)
Metric Explanation: NHD = (∑∑ Aij) / [L * (K(K-1)/2)], where Aij is the entry of the
parameter agreement matrix. Aij = number of parameter agreements between method mi and
mj. K = number of methods and L = number of attribute types. NHD is the ratio of
methods agreeing on parameter types to the maximum potential of every method agreeing
with every other method in parameter types. The denominator is L attributes times the
number of pairs of methods out of K methods.
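As a concrete reading of two of the pairwise definitions in Table 1, the following Python sketch computes the Chidamber and Kemerer LCOM and the Bonja and Kidanmariam CC from a map of methods to the instance variables they use. The map `uses`, the method and variable names, and the reading of IVT as the union of the variables used by the pair are assumptions made for illustration, not definitions taken from the cited papers.

```python
from fractions import Fraction
from itertools import combinations


def lcom(uses):
    """Chidamber-Kemerer LCOM: |P| - |Q| if |P| > |Q|, otherwise 0."""
    var_sets = list(uses.values())
    if all(not s for s in var_sets):
        return 0  # Table 1: if every I_i is empty, P is taken to be empty
    pairs = list(combinations(var_sets, 2))
    p = sum(1 for a, b in pairs if not (a & b))  # pairs with no shared variable
    q = len(pairs) - p                           # pairs sharing at least one
    return p - q if p > q else 0


def cc(uses):
    """CC: mean over all method pairs of |common| / |used by the pair|.

    Assumption: IVT is read as the union of the two methods' variable sets.
    """
    pairs = list(combinations(uses.values(), 2))
    if not pairs:
        raise ValueError("CC needs at least two methods")
    total = sum(
        Fraction(len(a & b), len(a | b)) if (a | b) else Fraction(0)
        for a, b in pairs
    )
    return total / len(pairs)


# Hypothetical Class Math with four methods sharing one instance variable.
math_a = {name: {"ints"} for name in ("sum", "min", "max", "average")}
print(lcom(math_a), cc(math_a))  # 0 1

# Hypothetical extension: a fifth method that touches only an unrelated variable.
math_x = dict(math_a, log={"logfile"})
print(lcom(math_x), cc(math_x))  # 0 3/5
```

In the second case LCOM stays at 0 because |P| = 4 is still smaller than |Q| = 6, while CC already drops, which illustrates how differently the metrics in Table 1 can react to the same change.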
Consider how the value of each metric in Table 1 evolves from the most ideal
cohesive situation to a less ideal case, as described through the following five scenarios.
a) All the methods within the Class use/share the single instance variable
(e.g. integers)
b) Add another instance variable (e.g. floating type) that is shared by all the
methods.
c) Add one more method that also uses/shares the same instance variables
d) Add an instance variable that disturbs the “uniformity” of all the methods
sharing all the instance variables. That is, in the software maintenance or
evolution mode, we often will introduce an additional instance variable
into a Class without fully considering the erosion to cohesiveness of that
Class.
e) Add a method that similarly disturbs the “uniformity” of all the methods
sharing all the instance variables. Again, during software evolution we
often will introduce an additional method into a Class without
considering how it might erode the cohesiveness of that Class.
In Table 2, we have summarized the evolution of the metric values as the Class Math
evolves from the above condition (a) through condition (e). Class Math starts with
inputting a set of integers as an instance variable. There are four methods in Class Math
that compute the sum, min, max, and average of the integers respectively. Then Class
Math is expanded to input floating point numbers and the same four methods are
enhanced to compute the sum, min, max, and average of the floating point numbers. Then
an additional fifth method is included to perform a check to ensure that both integers and
floating point input numbers are between -10,000 and +10,000. Then Class Math may
evolve to either include an instance variable that only some of the methods use or include
a method that uses some of the instance variables. Let us pick a sample metric in Table 2,
Henderson-Sellers’ (1996) LCOM5, as we consider these scenarios. As we go
through the scenarios, it will become evident that even this simple evolution is not as
clear cut as it looks.
LCOM5 Computations
LCOM5 was defined by Henderson-Sellers (1996). It predominantly looks at the number
of methods that access each of the set of attributes or data, specifically only the instance
variables. Thus, LCOM5 does not deal with data to data interactions and the non-instance
variables. It focuses on instance variable to method interactions. For LCOM5, a
value of 0 is considered perfect cohesion.
For scenario (a), there are 5 data elements. One is an instance variable, I1, which is
accessed by all 4 methods. Let I1 be represented as A1, and the other 4 data elements be
A2 through A5. Recall that these 4 data elements are variables: sum, min, max, and
average. They are defined within each of the methods and accessed only by their
respective methods, m1 through m4. We realize that one may argue to have these
declared as instance variables. But for this example, we purposely choose not to do so.
Since these 4 data elements are not instance variables, they will not enter into LCOM5
computation. We only have u(A1) = 4. LCOM5 = [(1/1) *(4) – (4)] / (1-4) = 0. For
scenario (a), LCOM5 =0 is considered perfect cohesion.
For scenario (b), we introduce another instance variable, a floating type I2, to be accessed
by all 4 methods again. In this case, u(A1) = 4 and u(A2) = 4. LCOM5 = [ (1/2)* (4+4) –
4 ] / ( 1- 4) = 0. Thus for scenario (b), LCOM5 = 0 indicates that the Class cohesion
remains perfect.
Consider scenario (c) where a fifth method, input check method, is introduced to check
both of the instance variables. Therefore, u(A1) = 5 and u(A2) = 5. LCOM5 = [(1/2) * (5
+ 5) – 5]/ (1-5) = 0. For scenario (c), LCOM5 continues to indicate that the Class
cohesion remains perfect.
For scenario (d), we introduce a third instance variable, I3 that is only accessed by 1 of
the 5 methods. In this case, u(A1) and u(A2) both remain the same as before, and u(A3) =
1. LCOM5 = [(1/3)* (5 +5 +1) – 5] / ( 1- 5) = (- 4/3) / (-4) = 1/3. This indicates that the
Class cohesion has deteriorated as it is moving from the perfect 0 case towards 1, the
worst case.
The final scenario (e) is to introduce a sixth method that accesses only 1 of the 3 existing
instance variables. We will arbitrarily pick that 1 instance variable to be I1. Now, u(A1) =
6, u(A2) = 5, and u(A3) = 1. LCOM5 = [(1/3)* (6+5+1) – 6] / (1-6) = (-2)/(-5) = 2/5. This
time LCOM5 has slightly increased in value, indicating further deterioration of Class
cohesion.
The LCOM5 metric showed perfect cohesion for scenarios (a) through (c). Intuitively, this
makes sense when one is only considering the instance variables. As the two cases in
scenarios (d) and (e) show, the class cohesion eroded, and LCOM5 increased in value, as we
introduced an instance variable that is only utilized by one method followed by the
introduction of a method that only uses I1. The only inconvenient part is that LCOM5 starts
at a perfect 0 and increases in value towards the worst case, 1, as cohesion deteriorates.
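The five LCOM5 computations above can be reproduced with a short Python sketch. The function name `lcom5` and the encoding of each scenario as a list of access counts u(Aj) plus a method count m are illustrative choices; exact fractions are used so that the values 1/3 and 2/5 appear as in the text.

```python
from fractions import Fraction


def lcom5(access_counts, num_methods):
    """LCOM5 = ((1/a) * sum(u(Aj)) - m) / (1 - m).

    access_counts[j] is u(Aj), the number of methods accessing instance
    variable Aj; num_methods is m.
    """
    a = len(access_counts)
    m = num_methods
    if a == 0 or m == 1:
        raise ValueError("LCOM5 is undefined when a == 0 or m == 1")
    return (Fraction(sum(access_counts), a) - m) / (1 - m)


# (access counts per instance variable, number of methods) for each scenario
scenarios = {
    "a": ([4], 4),        # I1 used by all 4 methods
    "b": ([4, 4], 4),     # I2 added, used by all 4 methods
    "c": ([5, 5], 5),     # fifth method uses both variables
    "d": ([5, 5, 1], 5),  # I3 added, used by 1 method only
    "e": ([6, 5, 1], 6),  # sixth method uses only I1
}

for name, (counts, m) in scenarios.items():
    print(name, lcom5(counts, m))
# a 0
# b 0
# c 0
# d 1/3
# e 2/5
```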
Summarizing Scenario (a) through (e):
In Table 2, we have summarized the evolution of the metric values as the Class Math
evolves from condition (a) through condition (e) for all the metrics listed in Table 1.
Table 2: Summarizing Scenarios (a) through (e)

Briand et al. (1998): RCI
  (a) .6   (b) .6   (c) .65
  (d) add an instance variable: adding an instance variable that only interacts with
      some methods further decreases RCI
  (e) add a method: adding a method that does not access all instance variables also
      further decreases the value of RCI

Bieman and Kang (1995; 1998): TCC
  (a) 1   (b) 1   (c) 1
  (d) add an instance variable: adding an instance variable that only interacts with
      some methods creates no change to TCC
  (e) add a method: adding a method that does not access all the instance variables
      decreases TCC

Bonja and Kidanmariam (2006): CC (X)
  (a) 1   (b) 1   (c) 1
  (d) add an instance variable: adding an instance variable that only interacts with
      some methods decreases CC (X)
  (e) add a method: adding a method that does not interact with all the instance
      variables decreases CC (X)

Chidamber and Kemerer (1994): LCOM
  (a) 0   (b) 0   (c) 0
  (d) add an instance variable: adding an instance variable that creates a
      “non-uniform” situation that affects |P| and |Q|; if |P| < |Q| then LCOM = 0 and
      if |P| > |Q|, then LCOM increases in value from 0
  (e) add a method: adding a method that creates a “non-uniform” situation that affects
      |P| and |Q|; if |P| < |Q| then LCOM = 0 and if |P| > |Q|, then LCOM increases in
      value from 0

Hitz and Montazeri (1995): LCOM4
  (a) 1   (b) 1   (c) 1
  (d) add an instance variable: adding an instance variable that is not accessed by all
      methods creates a non-uniformity and increases LCOM4
  (e) add a method: adding a method that does not access all instance variables creates
      a non-uniformity and increases LCOM4

Henderson-Sellers (1996): LCOM5
  (a) 0   (b) 0   (c) 0
  (d) add an instance variable: adding an instance variable that is accessed by some
      methods increases LCOM5 above 0 towards 1, deteriorating cohesion
  (e) add a method: adding a method that accesses some instance variables also increases
      LCOM5 towards 1 and further deteriorates cohesion

Bansiya et al. (2002): CACM
  (a) 1   (b) 1   (c) 1
  (d) add an instance variable: adding an instance variable that is not used by all the
      methods decreases CACM from 1
  (e) add a method: adding a method that does not use all instance variables decreases
      CACM from 1

Counsel et al. (2006): NHD
  (a) 1   (b) 1   (c) 1
  (d) add an instance variable: adding an instance variable that is not used by all
      methods decreases NHD from 1
  (e) add a method: adding a method that does not access all the instance variables
      decreases NHD from 1
This plethora of cohesion metrics of a Class shows that the ideal value for cohesion
varies. Some start at 0 and increase in value as cohesion is compromised, and others start
at 1 and decrease in value as cohesion erodes. While each has its own strength as a
metric, it is nevertheless difficult to keep track of all the details behind these metrics.
From these Class cohesion metrics, it is clear that inter-method cohesion in a Class
resembles the concept of coupling of methods within a Class. However, in this case, we
want tight coupling among methods, not loose coupling. As such, inter-method cohesion
should be concerned with the following two characteristics:
1. Methods coupled due to sharing of control through method invocations,
and
2. Methods coupled due to sharing of data among methods.
From these two characteristics one can also see that the sharing of control may be
multi-leveled. That is, a method may invoke another method which may further invoke a third
method. This chain of invocations creates different degrees of inter-method cohesion.
Similarly, there may be a chain of sharing of data or data dependencies (Chae et al, 2004;
Zhou et al, 2002). That is, consider an instance variable, x, that is assigned a value by a
method A. In a different method, B, x is used to define another variable, y. A third
method, C, may utilize the variable y to compute something. In this scenario, methods A
and C do not directly share any data. However, there exists a data dependency
relationship that should be accounted for when considering inter-method cohesion.
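The A, B, C example can be made concrete with a small Python sketch that separates direct sharing from indirect connection. The `uses` and `calls` maps are hypothetical, and the sketch assumes that the variable y is visible to both B and C. Connection is computed in the style of LCOM4 (shared data or invocation), which is only one possible way to account for such chains.

```python
from itertools import combinations


def directly_connected(uses, calls):
    """Pairs of methods that share a variable or invoke one another."""
    pairs = set()
    for m1, m2 in combinations(sorted(uses), 2):
        shares = bool(uses[m1] & uses[m2])
        invokes = m2 in calls.get(m1, set()) or m1 in calls.get(m2, set())
        if shares or invokes:
            pairs.add((m1, m2))
    return pairs


def components(methods, pairs):
    """Connected components of the method graph (depth-first search)."""
    adjacent = {m: set() for m in methods}
    for m1, m2 in pairs:
        adjacent[m1].add(m2)
        adjacent[m2].add(m1)
    seen, result = set(), []
    for start in sorted(methods):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            current = stack.pop()
            if current in comp:
                continue
            comp.add(current)
            stack.extend(adjacent[current] - comp)
        seen |= comp
        result.append(comp)
    return result


# A assigns x; B uses x to define y; C uses y. No method invokes another.
uses = {"A": {"x"}, "B": {"x", "y"}, "C": {"y"}}
calls = {}

direct = directly_connected(uses, calls)
comps = components(uses, direct)
print(("A", "C") in direct)  # False: A and C share no data directly
print(len(comps))            # 1: A and C are still connected through B
```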
From Table 1, one would like to pick an inter-method cohesion metric that comes closest
to covering as much of these two characteristics as possible and then, if necessary, modify
or enhance the metric. For the rest of the paper, especially in the section on Combining
the Inter-method and Intra-method Cohesion, we will use the popular LCOM5 inter-method
cohesion metric, but with one modification. We will reverse it so that it starts at the
lowest cohesion of 0 and moves towards a perfect cohesion of 1.
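One natural way to read this reversal, stated here as an assumption because the exact form is not given at this point, is the complement of LCOM5. The symbol LCOM5' is introduced only for this illustration. Under that reading, the scenario values computed earlier map as follows:

```latex
\mathrm{LCOM5}' = 1 - \mathrm{LCOM5}
\qquad
\begin{aligned}
\text{scenarios (a)--(c):}\quad & \mathrm{LCOM5}' = 1 - 0 = 1 \\
\text{scenario (d):}\quad & \mathrm{LCOM5}' = 1 - \tfrac{1}{3} = \tfrac{2}{3} \\
\text{scenario (e):}\quad & \mathrm{LCOM5}' = 1 - \tfrac{2}{5} = \tfrac{3}{5}
\end{aligned}
```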
Intra-Method Cohesion
In this section, we will discuss the notion of cohesion of each individual method. The
cohesion of each method may be viewed as a micro-level of the inter-method cohesion in
that we can analyze the structural relationships of the data to the operations and
relationships among the operations. Thus, for intra-method cohesion, we believe that each
method should be viewed from the perspective of relatedness of the operations and of
data to achieve a single functionality. The key is the phrase “single functionality.” For
this we may consider reverting to the earlier definition of cohesion in terms of levels of
cohesion, from coincidental to functional.
The problem is that there is no clear and simple way to numerically measure intra-method
cohesion when it is defined through a metric of levels that is only ordered. In that
definition, only the best situation, functional cohesion level, has one function. All other
levels have multiple functions, and the manner in which the multiple functions operate
determines the level of cohesion. One way is to assign functional level to be 1/1,
sequential level to be 1/2, communicational level to be 1/3, and so on up to the worst case
of coincidental level, which will take on the value of 1/7. This primitive, numerical
metric assumes that each level differs from the next level by exactly the same
amount. Furthermore, there is no differentiation of number of functions performed at
different levels. Consider the situation where one method may perform 5 functions at the
procedural level and another method performs 2 functions at the logical level. According
to the cohesion metric by level, the one with 5 functions at the procedural level would be
1/4, and the one with 2 functions at the logical level would be 1/6. Thus, this metric only
serves as a guideline, but is quite limited in its utility.
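The limitation is easy to see in a minimal Python sketch of the level-based scoring. The function `level_score` and its `num_functions` parameter are hypothetical; the parameter is accepted only to show that the number of functions has no influence on the score.

```python
from fractions import Fraction

# Levels ordered from best (functional) to worst (coincidental).
LEVELS = [
    "functional",
    "sequential",
    "communicational",
    "procedural",
    "temporal",
    "logical",
    "coincidental",
]


def level_score(level, num_functions=1):
    """Score a method as 1/rank of its cohesion level.

    num_functions is deliberately ignored: the level-based metric has no
    way to take the number of functions into account.
    """
    return Fraction(1, LEVELS.index(level) + 1)


print(level_score("procedural", num_functions=5))  # 1/4
print(level_score("logical", num_functions=2))     # 1/6
```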
Bieman and Ott (1994) have suggested three metrics, based on data slicing, to measure
cohesion: Strong Functional Cohesion, SFC, Weak Functional Cohesion, WFC, and
adhesiveness, A. Perhaps, a better alternative to the levels of cohesion is to consider these
metrics based on data slices for intra-method cohesion. SFC is defined as the ratio of
super glue-tokens to total number of data tokens, and WFC is defined as the ratio of glue
tokens to the total number of data tokens. The adhesiveness of a data token, t, in a
procedure is defined as the ratio of slices that contain t and the total number of slices in
the procedure. If the method contains only one function, then every data token will reside
in only one function, and the adhesiveness of each token is defined to be one. The
average adhesiveness of all the data tokens in the method is defined as A(m) = (Σ A(ti) ) /
|t|, where A(ti) is the adhesiveness of data token ti and |t| is the cardinality of the set of
data tokens in the method. It provides a metric that would address cohesion of the method
in terms of the adhesiveness of the data tokens or the connectedness of these functions
through the data tokens. The adhesiveness metric does not differentiate the intra-method
cohesion by pre-defined levels. So it is possible to have a numerical adhesiveness metric
for intra-method cohesion that is the same for two different levels of cohesion, such as
the sequential and communicational levels. The nature of the functionality which
differentiated the cohesion level in the previous, ordered, cohesion levels is not part of
the cohesion metric when measured through adhesiveness of data tokens or the other two
(SFC and WFC) metrics.
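The average adhesiveness A(m) defined above can be sketched in a few lines of Python. The input format (a map from each output to the set of data tokens in its slice) and the token names are assumptions for illustration; computing the slices themselves is outside the scope of the sketch.

```python
from fractions import Fraction


def average_adhesiveness(slices):
    """A(m): mean over data tokens of (slices containing t) / (total slices).

    slices maps an output name to the set of data tokens in its data slice.
    """
    if not slices:
        raise ValueError("at least one slice is required")
    tokens = set().union(*slices.values())
    if not tokens:
        raise ValueError("slices contain no data tokens")
    n_slices = len(slices)
    total = sum(
        Fraction(sum(1 for s in slices.values() if t in s), n_slices)
        for t in tokens
    )
    return total / len(tokens)


# Hypothetical method with a single output: every token has adhesiveness 1.
one_output = {"total": {"values", "i", "total"}}
print(average_adhesiveness(one_output))  # 1

# Hypothetical method computing two outputs over the same input.
two_outputs = {
    "total": {"values", "i", "total"},
    "largest": {"values", "i", "largest"},
}
print(average_adhesiveness(two_outputs))  # 3/4
```

In the second case the tokens `values` and `i` lie on both slices (adhesiveness 1) while `total` and `largest` lie on one slice each (adhesiveness 1/2), giving (1 + 1 + 1/2 + 1/2) / 4 = 3/4.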
We propose a variation of Bieman and Ott’s (1994) metrics based on data slices as a
metric for intra-method cohesion. We will expand the notion of “output” in
Bieman and Ott (1994) to a broader set of situations. The intra-method cohesion metric
should take into account two characteristics in a method:
- The effect of the functionality in the method, and
- The chaining within the functionality.
The “effect” of the functionality is a defined set of observables. Once these are defined,
then it is a much easier characteristic to observe than general functionality. We define the
set of effects as characterized by the following specific activities over a variable:
1. Printing, displaying, or writing of a variable,
2. Returning a value of a variable, and
3. Storing of a variable.
We will discuss these three types of effects. 1) Printing, displaying, or writing a variable
is often the culmination of some specific set of activities and indicates a function is
completed. Thus, tracing the slice of code that resulted in the printing of that variable
would provide us a hint of the cohesiveness. 2) Similarly, returning a value of a variable
implies the completion of some functionality. However, this is a more difficult effect in
that the return variable may not allow us to perform a trace of the functionality. It may be
the situation where the particular method performs a synchronization activity or a sorting
activity on an instance variable array. The return value is just a success indicator. Thus,
tracing the slice of code from the return value, in this situation, will not provide us with a
view of the functionality. In the more traditional case where the return value is usually
the variable that contains the result of some functionality, tracing the slice of code from
the return value would give us an idea of the cohesiveness. 3) The final storing of a
variable may be accompanied with retrieving of the variable. This pair of activities often
represents the updating function. The slice of code between the retrieving and the storing
would represent the functionality, such as sorting an array variable, performed for
updating the variable. A simple, perhaps trivial, example is the constructor method with
input parameters. The storing of a variable without the retrieve part would imply storing
the variable after completion of some functionality. The final storing of variable is similar
to the effect of printing and writing.
More effects in a method should represent more functionality and potential diversity
in functionalities. Also, the number of variables involved in the slice of code that
produced the effect provides an indication of the size and diversity of functionalities
involved in the resulting effect. The notion of code slice is the same as that provided by
Weiser (1981). The Effect Indicator, EI, is represented as follows:
$$EI = \sum_{i \,\in\, \text{slices}} \;\; \sum_{j \,\in\, \text{variables}} V(i, j)$$
where V(i,j) is the jth variable in the ith effect code slice.
The conjecture here is that the larger the Effect Indicator is, the less cohesive the
method is. We will use the reciprocal of the EI and define:
Effect, E = 1/EI.
The Effect metric is equal to 1 when there is only one effect in the method and also when
that one effect involves only one variable. As more effects and variables are involved in
each effect, then EI increases but E decreases. Thus E varies from 1, the best case, to
potentially 0, representing the large number of effects over variables.
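A minimal Python sketch of the Effect metric follows. It assumes that each V(i,j) contributes a count of 1, so that EI is the total number of variables over all effect slices; the input format and the variable names are hypothetical.

```python
from fractions import Fraction


def effect_metric(effect_slices):
    """E = 1 / EI, where EI counts one per variable in each effect slice.

    effect_slices is a list with one set of variable names per effect.
    """
    ei = sum(len(variables) for variables in effect_slices)
    if ei == 0:
        raise ValueError("a method needs at least one effect variable")
    return Fraction(1, ei)


# One effect involving one variable: the best case, E = 1.
print(effect_metric([{"total"}]))  # 1

# Two effects, each with a slice involving two variables: EI = 4, E = 1/4.
print(effect_metric([{"total", "values"}, {"largest", "values"}]))  # 1/4
```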
The second characteristic for intra-method cohesion is the notion of the chain of the
effect. The chaining characteristic is also based on the slicing concepts from Weiser
(1981). For each effect, the slice of code for that effect is identified first. Then the
variable or variables that participate in the slice of code for that effect are traced in a
chain fashion much like the define-usage (or d-u ) path used in program testing
(Jorgensen, 2002). The length of the chain for each variable is a count of the number of
steps involved in the completion of an effect. Thus the chain length provides an
indication of the size of the function. In the event that the same variable appears several
times in the chain, we only trace the longest chain for that variable. Let the Chain Length
of the slice of code traced from the variable in the effect all the way back to those that
affect the first definition of that variable be CL. Let the span of the chain, or Chain Span, CS,
be all the steps of the code, including those not in the slice, between the variable in the
effect and the first definition or assignment of that variable. The ratio, CL/CS, would
represent the proximity attribute of the variable in the effect slice. For each variable in
each effect slice there is a Proximity Indicator PI = CL/CS. This proximity of effect
shows how physically spread out the variable in each effect in the method is. Thus, it is
an indication of the physical cohesion of the method. A method may contain more than
one effect; thus we need to compute the PI for each variable in the effect slice for all the
effects in a method. The Average PI, or API for a method is:
API = Σ PI / | PI|
For a method that has only one effect and one variable in that effect slice, and whose
effect slice is the complete method, CL = CS. Then PI = 1, and API
will also be 1. As API moves towards 0, it indicates that the slice of an effect is more
physically spread out in the method. This physical cohesion also matches well with our
intuition, especially from a maintenance perspective.
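The proximity computation can be sketched in Python as follows. The sketch assumes that a "step" is one numbered statement and that both the first definition and the effect statement are counted, so that CL = CS when the slice covers every statement in the span. The statement numbers in the example are hypothetical.

```python
from fractions import Fraction


def proximity_indicator(slice_steps, first_def, effect_step):
    """PI = CL / CS for one variable in one effect slice.

    slice_steps: statement numbers belonging to the variable's chain.
    CS counts every statement from first_def to effect_step inclusive;
    CL counts only those statements that are in the slice.
    """
    if effect_step < first_def:
        raise ValueError("the effect must not precede the first definition")
    span = effect_step - first_def + 1
    chain = sum(1 for step in slice_steps if first_def <= step <= effect_step)
    return Fraction(chain, span)


def average_pi(indicators):
    """API = sum of PI / number of PI values."""
    return sum(indicators) / len(indicators)


# Hypothetical six-statement method with two traced variables.
pi_total = proximity_indicator({1, 3, 6}, first_def=1, effect_step=6)
pi_largest = proximity_indicator({2, 4, 5, 6}, first_def=2, effect_step=6)
print(pi_total, pi_largest)                # 1/2 4/5
print(average_pi([pi_total, pi_largest]))  # 13/20
```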
The Intra-method Cohesion for a method, m, in an object is defined to be the average
of the Effect, E, and the Average Proximity Indicator, API, of that method:
ITRA-C(m) = (E + API) / 2
For an object, O, which contains multiple methods, the intra-method cohesion for the
object is the average of the intra-method cohesion of its methods:
ITRA-C(O) = Σ ITRA-C(mj) / |mj|, where |mj| is the number of methods in O.
This Intra-method Cohesion (O), or ITRA-C(O), of an object will vary from the ideal
value of 1 to the worst case of 0. The best case is that each Intra-method Cohesion (m), or
ITRA-C(m), is equal to 1. As each of the ITRA-C (m) decreases from 1, so will the
ITRA-C (O).
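As a hedged illustration of the two formulas above, the following Python sketch averages E and API per method and then averages the per-method values for the object. The function names are hypothetical, and the range check assumes that both E and API lie in [0, 1], which is consistent with ITRA-C running from 0 to 1.

```python
from typing import Iterable


def itra_c_method(effect: float, api: float) -> float:
    """ITRA-C(m) = (E + API) / 2 for a single method."""
    for name, value in (("E", effect), ("API", api)):
        # Assumption: both sub-attributes are normalized to the 0..1 interval.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (effect + api) / 2


def itra_c_object(method_scores: Iterable[float]) -> float:
    """ITRA-C(O) = sum of ITRA-C(mj) / number of methods in O."""
    scores = list(method_scores)
    if not scores:
        raise ValueError("an object needs at least one method")
    return sum(scores) / len(scores)
```

A class whose methods all score 1 reaches the ideal value of 1, and any method scoring below 1 lowers the object-level value in proportion.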
The combining of two sub-attributes related to cohesion at the method level was achieved
by just averaging the metrics for these sub-attributes. In this case, the intuitive notion of
cohesion is still preserved with the averaging. The ordering of the intra-method cohesion
matches that of the ordering of cohesion.
Note that while we can say Intra-method Cohesion (m1) > Intra-method Cohesion (m2),
we cannot pinpoint which of the sub-attributes, E or API or both, contributed to this
relationship between m1 and m2 without specifically looking at E of m1, E of m2, API of
m1 and API of m2. In the next section, we will investigate the problem of combining
metrics of different sub-attributes. Specifically, we are interested in combining the two
sub-attributes, inter-method cohesion and intra-method cohesion, and coming up with one
unified Class Cohesion metric.
Combining Inter-Method And Intra-Method Cohesion
The unification of the inter-method and intra-method cohesion metrics will be explored in
this section. In software metric theory (Fenton and Pfleeger, 1997), we are reminded of
the Representation Condition. That is, the mapping from the empirical world to the
numerical world should maintain the relations such that the relations in the empirical
world are preserved by the relations in the numerical world.
Consider the case where an inter-method metric, such as LCOM5, is picked to represent
“relatedness.” Recall that LCOM5 varies from 0 to 1 where 0 is the most cohesive and 1
is the least cohesive. For intra-method cohesion, or singularity in purpose, consider the
Intra-method Cohesion(O), ITRA-C(O), defined in the previous section. ITRA-C(O) also
varies from 0 to 1 where 1 is the most cohesive and 0 is the least cohesive. It would seem
desirable to somehow combine these two metrics and come up with a single Class (or
Object) Cohesion metric that takes into account relatedness and singularity in purpose.
An inter-method and intra-method combined Class Cohesion metric would provide us
with a quick view or comparison of cohesion among different classes.
One simple way is to just take the average of LCOM5 and ITRA-C(O). Immediately,
we would run into trouble because LCOM5 runs in the opposite direction from ITRA-C(O). Adjusting the two metrics so that they move in the same direction may sometimes be
a non-trivial task that requires redefining the attribute. In this case, we can
define an adjusted LCOM5’ = (1 - LCOM5), which reverses the direction of LCOM5 so that 0 is
the least cohesive and 1 is the most cohesive. We will utilize LCOM5’ as the adjusted inter-method cohesion in this section.
Assuming that the two metrics are adjusted and are on the same 0 to 1 interval,
combining them by taking a simple average would mean that the two sub-attributes, inter-method and intra-method, are viewed to be of the same contributive value, which may not
always be true. For the time being, let’s assume that the two metrics, LCOM5’ and
ITRA-C(O), are of equal value in their contribution to Class Cohesion, where LCOM5’ is
the adjusted inter-method metric. Even then, the Class Cohesion derived from the average
of LCOM5’ and ITRA-C(O) cannot satisfy the Representation Condition. This is
illustrated in Figure 1 below.
[Figure 1 placeholder: on the inter-method cohesion (LCOM5’) axis from 0 to 1, O1 = .3 and O2 = .6; on the intra-method cohesion (ITRA-C(O)) axis from 0 to 1, O2 = .4 and O1 = .5; both are mapped onto a Class Cohesion axis where O1 = .4 and O2 = .5.]
Class Cohesion(O1) = (.3+.5)/2 = .4
Class Cohesion(O2) = (.6+.4)/2 = .5
Figure 1: mapping from inter and intra cohesion to a unified Class Cohesion
As Figure 1 illustrates, Class Cohesion (O2) > Class Cohesion (O1). However, this
ordering does not hold for both LCOM5’ and ITRA-C(O) at the inter-method and
intra-method levels. At the inter-method level, LCOM5’(O2) > LCOM5’(O1), but at the intra-method level ITRA-C(O1)
> ITRA-C(O2). These relations are not preserved through the mapping from inter-method
and intra-method to the Class Cohesion, if we use the “averaging” function as our
mapping function.
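The Figure 1 values can be checked with a short Python sketch. The function and variable names are hypothetical; the numbers for O1 and O2 are the ones given in Figure 1. The sketch only illustrates that the averaged ordering agrees with one sub-attribute and contradicts the other.

```python
def adjusted_lcom5(lcom5: float) -> float:
    """LCOM5' = 1 - LCOM5, so that 1 is the most cohesive."""
    return 1.0 - lcom5


def averaged_class_cohesion(lcom5_adj: float, itra_c: float) -> float:
    """Simple average of the adjusted inter-method and intra-method metrics."""
    return (lcom5_adj + itra_c) / 2


# Values taken from Figure 1.
O1 = {"lcom5_adj": 0.3, "itra_c": 0.5}
O2 = {"lcom5_adj": 0.6, "itra_c": 0.4}

avg_o1 = averaged_class_cohesion(**O1)
avg_o2 = averaged_class_cohesion(**O2)

# Ordering according to each view of cohesion.
average_says_o2_higher = avg_o2 > avg_o1
inter_says_o2_higher = O2["lcom5_adj"] > O1["lcom5_adj"]
intra_says_o2_higher = O2["itra_c"] > O1["itra_c"]
```

Here the average and the inter-method metric both rank O2 above O1, while the intra-method metric ranks O1 above O2, so the single averaged number hides a disagreement between the sub-attributes.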
One alternative is to keep the sub-attributes, and represent Class Cohesion (O) as a 2-tuple:
Class Cohesion (O) = ( LCOM5’, ITRA-C(O) )
Further define the “>” relation such that Class Cohesion (O1) > Class Cohesion(O2) if
and only if
i) LCOM5’(O1) > LCOM5’(O2) AND
ii) ITRA-C(O1) > ITRA-C(O2)
Class Cohesion (O) defined in this fashion will at least preserve the relationships in the
empirical worlds of both inter-method and intra-method cohesion when the logical AND
condition is met. The question remains what to conclude when the two conditions are not
both met. The following are the major, but not all, possible situations.
a) LCOM5’ (O1) > LCOM5’ (O2) AND ITRA-C(O1) > ITRA-C(O2), then
Class Cohesion (O1) > Class Cohesion (O2)
b) LCOM5’ (O1) < LCOM5’ (O2) AND ITRA-C( O1) < ITRA-C (O2), then
Class Cohesion (O1) < Class Cohesion (O2)
c) LCOM5’ (O1) = LCOM5’ (O2) AND ITRA-C(O1) = ITRA-C(O2), then
Class Cohesion (O1) = Class Cohesion (O2)
d) LCOM5’ (O1) < LCOM5’ (O2) AND ITRA-C(O1) > ITRA-C(O2), then
Class Cohesion (O1) ≠ Class Cohesion (O2)
e) LCOM5’ (O1) > LCOM5’ (O2) AND ITRA-C(O1) < ITRA-C(O2), then
Class Cohesion (O1) ≠ Class Cohesion (O2)
For conditions d) and e) above, we can say that Class Cohesion (O1) ≠ Class Cohesion
(O2). This mapping of Class Cohesion into the 2-tuple, with the very strict logical AND
operator, creates a metric that is even more difficult to use under certain circumstances.
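Cases a) through e) can be expressed as a comparison on 2-tuples. The Python sketch below is only an illustration: the `ClassCohesion` type and the `compare_and` function are hypothetical names, and returning `None` for combinations outside the five listed cases is an assumption, since the list above is explicitly not exhaustive.

```python
from typing import NamedTuple, Optional


class ClassCohesion(NamedTuple):
    """Class Cohesion (O) kept as the 2-tuple (LCOM5', ITRA-C(O))."""
    inter: float  # adjusted inter-method cohesion, LCOM5'
    intra: float  # intra-method cohesion, ITRA-C(O)


def compare_and(o1: ClassCohesion, o2: ClassCohesion) -> Optional[str]:
    """Compare two classes under the strict logical AND definition."""
    if o1.inter > o2.inter and o1.intra > o2.intra:
        return ">"   # case a)
    if o1.inter < o2.inter and o1.intra < o2.intra:
        return "<"   # case b)
    if o1.inter == o2.inter and o1.intra == o2.intra:
        return "="   # case c)
    if (o1.inter < o2.inter and o1.intra > o2.intra) or (
        o1.inter > o2.inter and o1.intra < o2.intra
    ):
        return "!="  # cases d) and e): the sub-attributes disagree
    return None      # not covered by the listed cases (assumption)
```

With the Figure 1 values, `compare_and(ClassCohesion(0.3, 0.5), ClassCohesion(0.6, 0.4))` falls under case d), which shows how often the strict AND definition leaves two classes without a usable ordering.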
Next we consider the situation where the logical AND for inter-method and intra-method
conditions is relaxed to a logical OR situation. That is, define Class Cohesion (O1) >=
Class Cohesion (O2) if (the reverse is not necessarily true):
i) LCOM5’(O1) >= LCOM5’(O2) OR
ii) ITRA-C(O1) >= ITRA-C(O2).
The logical OR relaxes the constraints we placed on the mapping. This may allow us to
devise a metric that will replace the Class Cohesion (O1) “not-equals” Class Cohesion
(O2) situations. Using the 2-tuple of (inter-method, intra-method), or more specifically
(LCOM5’, ITRA-C(O)), the Class Cohesion (O) metric will run from (0, 0) to (1,1). The
question is what happens when we are dealing with situations where Class Cohesion (O)
is not on the diagonal or on the same “radial line” of Figure 2.
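[Figure 2 placeholder: unit square with axes Inter-method and Intra-method and corners (0,0), (1,0), (0,1), and (1,1); one region is labeled “Class Cohesion where Intra-method < inter-method” and the other “Class Cohesion where Intra-method > inter-method”.]
Figure 2: The Class Cohesion Metric on and off the diagonal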
With the logical AND condition, the “>” relationship of Class Cohesion (O) may be
unclear if the two Class Cohesion (O) reside on different radial lines. This is a problem
that makes the situation very constrained. In Figure 3, we show that besides using the
radial lines, we can also look at the location where the Class Cohesions fall and roughly
determine the “>” relationship. As Figure 3 shows, given a Class Cohesion (O) at some
point (a,b), there is an area that is clearly less cohesive and an area that is clearly more
cohesive. Both of these areas are represented with hashed lines in Figure 3. But there are
two areas that are questionable. We are also interested in what happens in the unclear
areas of Figure 3.
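[Figure 3 placeholder: axes Inter-method and Intra-method with corners (0,0), (1,0), and (0,1); around a point (a,b) there is one region labeled “more”, one labeled “less”, and two labeled “unclear”.]
Figure 3: The Class Cohesion (O) in terms of Distance Function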
With the relaxed condition of OR, we have more latitude in devising a metric. We will
explore the definition of Class Cohesion as a distance function.
Class Cohesion (O) = distance of (LCOM5’, ITRA-C(O)) from (0,0), or
$$\text{Class Cohesion}(O) = \sqrt{(\mathrm{LCOM5}' - 0)^2 + (\text{ITRA-C}(O) - 0)^2}$$
The maximum distance is SQRT(2), or approximately 1.41, in this case. Class Cohesion (O) defined as
a distance function and based on the logical OR condition does handle the cases where
Class Cohesion is either on or off the diagonal or the same radial line. Class Cohesion
(O1) > Class Cohesion (O2) if:
$$\sqrt{\mathrm{LCOM5}'(O1)^2 + \text{ITRA-C}(O1)^2} > \sqrt{\mathrm{LCOM5}'(O2)^2 + \text{ITRA-C}(O2)^2}$$
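This numerical distance function satisfies the OR condition: if the distance for O1 is
greater than the distance for O2, then at least one of LCOM5’ or ITRA-C(O) must be greater
for O1 than for O2, since otherwise the distance for O1 could not exceed that for O2.
Figure 4 shows the combined metric of Class Cohesion (O) moving from band y to band z.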
[Figure 4 placeholder: axes Inter-method and Intra-method with corners (0,0), (1,0), (0,1), and (1,1); two bands are labeled “Class Cohesion = band y” and “Class Cohesion = band z”.]
Figure 4: The Class Cohesion (O) in terms of Distance Function
On the band of Class Cohesion = band y, for example, there are classes of different
combinations of inter-method and intra-method cohesions. Thus the relaxed OR
condition gives a greater amount of flexibility in representing the Class Cohesion (O)
than the AND condition. According to Figure 4, all classes on Class Cohesion = band z
have a higher cohesion than those on the y-band.
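The distance-based Class Cohesion can be sketched in a few lines of Python. The function names are hypothetical, and the range check assumes both metrics have already been adjusted to the 0 to 1 interval. The source does not give band widths, so the sketch computes only the distance and leaves the grouping into bands open.

```python
import math
from typing import Tuple


def class_cohesion_distance(lcom5_adj: float, itra_c: float) -> float:
    """Distance of (LCOM5', ITRA-C(O)) from the origin (0, 0)."""
    for name, value in (("LCOM5'", lcom5_adj), ("ITRA-C(O)", itra_c)):
        # Assumption: both metrics are on the same 0..1 interval.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return math.hypot(lcom5_adj, itra_c)


def or_condition(o1: Tuple[float, float], o2: Tuple[float, float]) -> bool:
    """Relaxed condition: O1 >= O2 on at least one of the two sub-attributes."""
    return o1[0] >= o2[0] or o1[1] >= o2[1]


# Figure 1 values, as (LCOM5', ITRA-C(O)) tuples.
O1 = (0.3, 0.5)
O2 = (0.6, 0.4)
o2_is_more_cohesive = class_cohesion_distance(*O2) > class_cohesion_distance(*O1)
```

Under this definition the two Figure 1 classes, which were not comparable under the AND condition, receive a definite ordering, and the larger distance is consistent with the OR condition because O2 has the higher LCOM5’.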
This combined metric is a good initial step in combining the inter-method and intra-method cohesion of a class. It eliminates the earlier “unclear” situations for Class
Cohesion (O) of the AND condition where [ LCOM5’ (O1) < LCOM5’ (O2) AND
ITRA-C(O1) > ITRA-C(O2) ] or [ LCOM5’ (O1) > LCOM5’ (O2) AND ITRA-C(O1) <
ITRA-C(O2) ]. With the OR condition, the Class Cohesion (O) metric seems to better
preserve, in the numerical world, the relationships of the empirical world. This unified
Class Cohesion provides us with bands of cohesion and gives us a quick global view of
classes of cohesion. The grouping of classes may be viewed as shorthand for cataloging
classes, or systems, by cohesion bands.
Concluding Remarks
In this paper, we have taken an object oriented class as a system. The connectedness or
relatedness of the system, known as cohesion, has been shown to be associated with quality
(Bansiya & Davis, 2002; Bieman & Ott, 1994; Briand et al, 1994). Thus cohesion is an
important characteristic to investigate, and we have viewed it as such a characterization
of an object oriented class in software. More specifically, we have limited our view to
just a static analysis perspective by analyzing the interaction between the instance
variables and the methods within a class and by analyzing all the individual methods
within the class. These were classified as inter-method cohesion and intra-method
cohesion, respectively. The literature is rich with inter-method cohesion metrics; any one
of them may be picked for inter-method cohesion.
Our main contributions in this paper are twofold. First, we utilized the concept of
program slices and modified the Functional Cohesion concepts from Bieman and Ott to
formulate a new intra-method cohesion metric, ITRA-C. Second, while it may be
necessary to keep the (inter-method, intra-method)-tuple as the class metric, we also used
the distance function and unified these two metrics into one Class Cohesion metric to
provide “bands” of class cohesion as a high level categorization of classes or systems.
In the future, we plan to extend this work into several areas. One area is to further study
the bands of Class Cohesion, utilizing samples of classes and observing them through
multiple maintenance cycles. Another would utilize the (inter-cohesion, intra-cohesion)-tuple as a guide to further analyze and refactor class design. A third area would be to
extend the study to include the interactions among classes and the dynamic analysis of
cohesion such as applying Class Cohesion to run-time analysis of classes (Mitchell &
Power, 2004; Yacoub et al, 1999). And lastly, we would explore the potential of applying
the cohesion concepts and metrics to other, more general software systems beyond object
oriented software classes.
References
Ackoff, R. L. (1971). Towards A System of Systems Concepts. Management
Science, 17(11), 661- 671.
Bansiya, J. & Davis, C. (2002). A Hierarchical Model for Object-oriented Design
Quality Assessment. IEEE Transactions on Software Engineering, 28(1), 4-17.
Bieman, J. & Ott, L.M. (1994). Measuring Functional Cohesion. IEEE
Transactions on Software Engineering, 20(8), 644-657.
Bieman, J. & Kang, B. K. (1995). Cohesion and Reuse in an Object-Oriented
System. In Proceedings of Symposium on Software Reusability, Seattle,
Washington, USA.
Bieman, J. & Kang, B.K. (1998). Measuring Design Level Cohesion. IEEE
Transactions on Software Engineering, 24(2), 111-124.
Bonja, C. & Kidanmariam, E. (2006). Metrics for Class Cohesion and Similarity
Between Methods. In Proceedings of the 44th ACM Southeast Conference,
Melbourne, Florida, USA.
Briand, L., Morasca, S., & Basili, V.C. (1994). Defining and Validating High-Level Design Metrics. University of Maryland CS-TR3301-1, Maryland, USA.
Briand, L.C., Daly, J.W., & Wust, J. (1998). A Unified Framework for Cohesion
Measurement in Object-Oriented Systems. Empirical Software Engineering,
3(1), 65-117.
Chae, H.S., Kwon, Y.R., & Bae, D.H. (2004). Improving Cohesion Metrics for
Classes by Considering Dependent Instance Variables. IEEE Transactions on
Software Engineering, 30(11), 826-832.
Checkland, P. (1981). Systems thinking, systems practice. West Sussex, England:
John Wiley & Sons.
Chidamber, S. R. & Kemerer, C. F. (1994). A Metric Suite for Object-oriented
Design. IEEE Transactions on Software Engineering, 20(6), 476-493.
Counsel, S., Swift, S., & Crampton, J. (2006). The Interpretation and Utility of
Three Cohesion Metrics for Object-Oriented Design. ACM Transactions on
Software Engineering and Methodology, 15(2), 123-149.
Fenton, N.E. & Pfleeger, S. L. (1997). Software metrics: a rigorous and practical
approach, 2nd edition. PWS Publishing Company.
Henderson-Sellers, B. (1996). Object-Oriented metrics: measures of complexity.
Upper Saddle River, New Jersey: Prentice Hall.
Hitz, M. & Montazeri, B. (1995). Measuring Coupling and Cohesion in Object-Oriented Systems. In Proceedings of International Symposium on Applied
Corporate Computing, (25-27), Monterrey, Mexico.
Jorgensen, P.C. (2002). Software testing a craftsman’s approach, 3nd edition.
Boca Raton, Florida: Auerbach Publications.
Kitchenham, B., Pfleeger, S., & Fenton, N. (1995). Towards a Framework for
Software Measurement Validation. IEEE Transactions on Software
Engineering, 21(12), 929-944.
Kramer, S. & Kaindl, H. (2004). Coupling and Cohesion Metrics for Knowledge-Based Systems Using Frames and Rules. ACM Transactions on Software
Engineering and Methodology, 13(3), 332-358.
Lewis, J. & Lofton, W., (2001). Java software solutions: foundations of program
design. Reading, Massachusetts: Addison Wesley Longman, Inc.
Mitchell, A. & Power, J. F. (2004). Run-Time Cohesion Metrics: An Empirical
Investigation. In Proceedings of International Conference on Software
Engineering Research and Practice, Las Vegas, Nevada, USA.
Sarkar, S., Rama, G. M., & Kak, A.C. (2007). API-Based and Information-Theoretic Metric for Measuring Quality of Software Modularization. IEEE
Transactions on Software Engineering, 33(1), 14-32.
Stevens, W.P., Myers, G.J., & Constantine, L. (1974). Structured Design. IBM
Systems Journal, 13(2), 200-224.
Weiser, M., (1981). Program Slicing. In Proceedings of the 5th International
Conference on Software Engineering, (439 – 449), San Diego, California,
USA.
Yacoub, S., Ammar, H., & Robinson, T. (1999). Dynamic Metrics for Object-oriented Design. In Proceedings of Software Metrics Symposium, (50 – 61),
Boca Raton, Florida, USA.
Yourdon, E. & Constantine, L., (1979). Structured Design. Upper Saddle River,
New Jersey: Prentice Hall.
Zhou, Y., Xu, B., Zhao, J. & Yang, H. (2002). ICBMC: An Improved Cohesion
Measure for Classes. In Proceedings of International Conference on Software
Maintenance, (44-53), Montreal, Canada.
Zhou, Y., Lu, J., Lu, H. & Xu, B. (2004). A Comparative Study of Graph Theory-based Class Cohesion Measures. ACM SIGSOFT, Software Engineering
Notes, 29(2), 13 - 18.