Code smells, slice-based metrics and plenty of deodorant

Steve Counsell, Brunel University
Alessandro Murgia, University of Cagliari

Introduction
- Code smells are areas of code which “scream out” to be refactored
- In theory, a code smell would be indicative of a decaying class
  - Methods becoming longer
  - Classes becoming bigger
  - Coupling becoming greater
  - Cohesion deteriorating

Commonly-studied smells (cont.)
- Feature Envy: “A classic [code] smell is a method that seems more interested in a class other than the one it is in. The most common focus of the envy is the data” (Fowler)
- God Class (aka Key Class): a class that is deemed to have grown too large
- Long Method: a method that is deemed to have grown too large
- Long Parameter List: a method that is deemed to have too many parameters

Commonly-studied smells (cont.)
- Feature Envy: remedied (deodorised) by ‘moving’ the method to the class where it is most needed
- God Class: no obvious remedy. Extract class?
- Long Method: remedied by splitting the method into at least two
- Long Parameter List

What we did
- Premise: smell-based classes will have low cohesion
- We used Eclipse JDeodorant
  - Allows smell extraction from Java systems
  - Slice-based metrics plug-in (Tsantalis et al.):
    - Overlap
    - Tightness
    - Coverage (omitted from this analysis for clarity, honestly!)

Overlap and Tightness (function F)

$$\mathrm{Overlap}(F) = \frac{1}{|V_O|} \sum_{i=1}^{|V_O|} \frac{|SL_{int}|}{|SL_i|}$$

$$\mathrm{Tightness}(F) = \frac{|SL_{int}|}{\mathrm{length}(F)}$$

Program

    function Main() {
        int i;
        int smallest;
        int largest;
        int A[10];
        for (i = 0; i < 10; i++) {
            int num;
            scanf("%d", &num);
            A[i] = num;
        }
        smallest = A[0];
        largest = smallest;
        i = 1;
        while (i < 10) {
            if (smallest > A[i])
                smallest = A[i];
            if (largest < A[i])
                largest = A[i];
            i = i + 1;
        }
        printf("%d \n", smallest);
        printf("%d \n", largest);
    }

[Slide annotation: columns SL_smallest, SL_largest and SL_int mark the program lines belonging to each slice]

length = 19, |SL_smallest| = 14, |SL_largest| = 16, |SL_int| = 11

$$\mathrm{Overlap} = \frac{1}{2}\left(\frac{11}{14} + \frac{11}{16}\right) = 0.74$$

$$\mathrm{Tightness} = \frac{11}{19} = 0.58$$

Study 1 - Evolizer
- Evolizer tool
  - A tool for studying the evolution of OO systems
  - Developed at the University of Zurich
  - 300 classes/interfaces
- We looked at the following smells:
  - God class
  - Long method
  - Feature envy

God class
- The JDeodorant tool found 18 occurrences of the God class
- For each of the God classes, we extracted the two slice-based metrics for all methods in those classes:
  - Overlap
  - Tightness
- Abstract classes/interfaces an issue

God class - problems
- Constructors were often the largest methods in a God class (metrics n/a)
- Single line methods with no local variables were a frequent occurrence (metrics n/a)
- Get and set methods
- Single variable methods
- Use of ‘super’ to access a superclass

[Chart: Overlap and Tightness per method; x-axis: Method (1 to 27), y-axis: 0 to 1.2]
- Low cohesion values are not apparent

Tightness and Overlap (Counsell et al 2010)

                             N     Mean   Max    SD     Median
    Tightness (fault-prone)  372   0.32   0.99   0.32   0.21
    Tightness (fault-free)   150   0.38   1.00   0.37   0.28
    Overlap (fault-prone)    372   0.59   1.00   0.33   0.63
    Overlap (fault-free)     150   0.63   1.00   0.38   0.72

Long method
- The JDeodorant tool found 9 occurrences of Long Method
- For each of these methods, we extracted the same two slice-based metrics for all methods in those classes:
  - Overlap
  - Tightness

Long method - problems
- Constructors were
  often the largest methods (metrics n/a)
- Single line methods with no local variables were a frequent occurrence (metrics n/a)
- Get and set methods
- Single variable methods
- Use of ‘super’

[Chart: Overlap and Tightness per method; x-axis: Method (1 to 9), y-axis: 0 to 1.2]
- Low cohesion values not apparent

Feature envy
- The JDeodorant tool found 11 occurrences of Feature envy
- For each of these methods, we extracted the two slice-based metrics for all methods in those classes

Feature envy - problems
- ...the same problems as with the other two smells

[Chart: Overlap and Tightness per method; x-axis: Method (1 to 11), y-axis: 0 to 1.2]

A hypothesis
- In terms of cohesion, we would expect:
  - Long Method to contain the least cohesive methods
  - God class to contain the next least cohesive methods
  - Feature envy to be the most cohesive

A hypothesis - result (mean values)
- For Overlap:
  - God class (most cohesive)
  - Feature envy
  - Long method (least cohesive)
- For Tightness:
  - Long method (most cohesive)
  - Feature envy
  - God class (least cohesive)
- Most values of Overlap and Tightness > 0.5

Study 2 - Proprietary System

Background to Study 2
- C# sub-system of a web-based loans system providing quotes and financial information for on-line customers
- We examined two versions of one of its subsystems:
  - An early version, comprising 401 classes
  - A later version (version n), which had been the subject of a significant refactoring effort to amalgamate, minimize and optimize classes; it comprised only 101 classes

Smell analysis
- We focused on three smells which, arguably, should be easily identifiable from the source code:
  - God Class
  - Long Method
  - Lazy Class.
    A class that is not doing enough to justify its existence, identified by a small number of methods and/or executable statements; it should be merged with the nearest related class

God Class
- We found many God classes to be architectural pattern-based classes (Page Controller and Data Transfer Objects)
  - Should be left alone, irrespective of the cohesion value
- They also had relatively large amounts of coupling
- So eradicating this smell would be not only unwarranted, but also difficult (because of the coupling)

Long Method
- Class ComparisonEngine.cs contained the method with the highest number of statements
- Often ‘long’ methods are a necessary part of the implementation of an architectural pattern
- Inspection of the code revealed this method to contain one large switch statement comprising 340 statements
- Deodorising this method would be a major undertaking
  - Leave them alone
- Found evidence of these features in both start and end versions

Conclusions
- Many problems with extracting the Weiser set of metrics and interpreting them in OO
  - Use of parameters might be a better bet
- Use of variables in any cohesion metric is subject to various problems
- Decision on eradication of smells (deodorant) is a problem
  - Might explain the difficulty of capturing cohesion

Thanks for listening!
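
Supplementary sketch: the Overlap and Tightness arithmetic from the worked example earlier in the deck can be reproduced in a few lines. This is a minimal illustration, not the JDeodorant plug-in's implementation. It assumes the slice sizes are already known (computing the slices themselves is left to a slicing tool), and the function and parameter names (`tightness`, `overlap`, `slice_sizes`) are hypothetical. The docstrings also read V_O as the set of output variables, SL_i as the slice for output variable i, and SL_int as the intersection of all slices, which is the usual interpretation of these symbols rather than something the slides spell out.

```python
from typing import Mapping


def tightness(slice_intersection: int, length: int) -> float:
    """Tightness(F) = |SL_int| / length(F)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return slice_intersection / length


def overlap(slice_intersection: int, slice_sizes: Mapping[str, int]) -> float:
    """Overlap(F) = (1 / |V_O|) * sum over output variables of |SL_int| / |SL_i|."""
    if not slice_sizes:
        raise ValueError("at least one output-variable slice is required")
    ratios = [slice_intersection / size for size in slice_sizes.values()]
    return sum(ratios) / len(ratios)


# Slice sizes taken from the worked example: length = 19, |SL_int| = 11.
example_slices = {"smallest": 14, "largest": 16}

print(f"Tightness = {tightness(11, 19):.2f}")           # 11 / 19
print(f"Overlap   = {overlap(11, example_slices):.2f}")  # (11/14 + 11/16) / 2
```

Run as written, the two printed values round to 0.58 and 0.74, matching the figures on the slide.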