2009 35th Euromicro Conference on Software Engineering and Advanced Applications

Measure, Diagnose, Refactor: A Formal Quality Cycle for Software Models

Thomas Ruhroth, Hendrik Voigt, Heike Wehrheim
Department of Computer Science, University of Paderborn, 33098 Paderborn, Germany
{thomas.ruhroth, hvoigt, wehrheim}@uni-paderborn.de

Abstract—Software metrics measure the quality of code according to criteria like reusability, understandability and well-structuredness. Refactorings change code so as to improve its quality while preserving the overall behaviour. Measuring and refactoring go hand in hand: particular measurement results indicate specific design problems, which in turn suggest certain refactorings. As a consequence, we get a quality cycle of repeated steps of measuring, diagnosing and refactoring. In this paper, we transfer such a quality cycle to the area of software models. Contrary to code, models - especially UML models - usually do not consist of a single entity but of several diagrams. Our measures and refactorings jointly treat these diagrams. Contrary to code refactorings, we furthermore base the refactorings on a formal semantics for models and only employ refactorings which provably preserve the behaviour. The approach has been implemented in the tool RMC, supporting the definition as well as the execution of quality measurements and refactorings.

Keywords-quality cycle; model quality; metric; indicator; refactoring

I. INTRODUCTION

With the increasing complexity of and the ever growing need for more software, the quality of software comes to the fore. While the term "software quality" may also refer to aspects like safety or correctness, we understand it here as the adherence to fundamental software engineering principles like well-structuredness, reusability or readability (which - of course - have an impact on other properties). Software quality is usually quantified using software metrics [1], measuring for instance the size or coupledness of classes or the depth of inheritance hierarchies in object-oriented code. Refactorings [2], [3], on the other hand, are used to improve bad software quality. According to Fowler [2], "refactoring is the process of changing a software system in such a way that it does not alter the external behaviour of the code yet improves its internal structure". As these two techniques are tightly connected, they are often combined into a software quality cycle: code is measured according to certain measures, the results are analysed and a diagnosis is given, and specific diagnoses hint at refactorings.

While this is a well-established method for code, there is less work for software models, both in the area of measurements and in that of refactorings. Methods for model refactorings have been developed for UML models [4], [5], [6] as well as for formal specifications (e.g. Object-Z [7], CSP-OZ [8], [9], UML-RT with a Circus semantics [10]). Models, in particular UML models, however most often consist of several separate descriptions (viz. diagrams) which only in their combination give a suitable model of the system. Out of the above cited works, only [8], [9] jointly treat more than one such view. In the area of design measures for UML diagrams, there are a number of approaches treating class diagrams or statecharts in isolation (for a survey see [11]) but - to the best of our knowledge - none defining design measures for their combined usage.
Furthermore, no existing approach combines measures and refactorings into a quality cycle for software models.

In this paper, we propose a quality cycle for UML models written in a profile for reactive systems introduced in [12]. This profile includes class diagrams with additional Z annotations for methods and attributes, protocol statecharts and component diagrams, thus incorporating four views. The reason for choosing this particular profile is its formal semantics. Since we are ultimately interested in the correctness of models, we aim at a formal quality cycle: the refactoring step should indeed be behaviour preserving. A formal semantics for the modelling formalism allows us to prove this. A way of deriving class skeletons - consisting of variables, signatures, and pre- and postconditions of methods - for a Java implementation from such formal models is also described in [12].

For models written in this profile we first of all define several measures. The measures refer to design aspects like structuredness and reusability. Aspects like performance and reliability are not intended to be measured on such models (and also cannot be measured, since the models do not contain sufficient information about timing aspects, resource requirements etc.). Technically, measures are given as OCL (Object Constraint Language) expressions over the profile's metamodel. The measures do not only capture the quality of model diagrams separately but also interrelate them. The results of measuring are next analysed and a diagnosis is given. If quality is low, the diagnosis proposes one or more refactorings to be executed. The refactorings are chosen such that the particular quality problem indicated by the measurement is actually tackled. All refactorings can furthermore be proven to be behaviour preserving using the underlying semantics of the UML profile; the appendix gives an example of one such proof. The semantics is defined in terms of the formal language CSP-OZ [13], a combination of the process algebra CSP [14] and the state-based specification language Object-Z [15]. Behaviour preservation can then be defined using the notion of refinement for CSP-OZ.

All steps of the quality cycle are tool-supported. The tool RMC [16] allows models to be edited and measurements and refactorings to be carried out. It has several predefined measures and refactorings, but also offers the user facilities for defining new ones. In summary, this gives us a formal and tool-supported quality cycle for software models.
II. MOTIVATION AND BACKGROUND

We start by explaining the UML profile along an example of a CD player model. For software measures, usually the following rule applies: the larger the case study, the more significant the results of measurement and diagnosis. We have nevertheless refrained from choosing a large case study here, so as to keep it understandable. The approach has also been evaluated on a number of larger case studies. The following example model is used to illustrate the quality cycle throughout the paper. It is a simple model of a CD player, consisting of a class diagram and a statechart.

The profile proposed in [12] is used for modelling reactive systems consisting of processes working concurrently and communicating with each other via message exchanges. Class diagrams are used to describe the components in the system together with their interfaces. Figure 1 shows a single component (class) CDPlayer (marked by the stereotype <<component>>) and its provided interface CDService, offering volume adjustment, CD insertion and ejection as well as track choice, play and stop facilities. Interfaces are used to connect classes with each other. This can be modelled by component diagrams (not part of our example). Besides components and interfaces we may have pure data classes in the class diagram (as well as associations, inheritance etc.).

[Figure 1. Class diagram of the CD player model: component CDPlayer with attributes volume, CDin, tracks, current and provided interface CDService with methods low(), high(), insert(), eject(), choose(), play(), stop()]

Every active component can furthermore have an attached simple protocol statechart (Fig. 2). Transitions in this statechart describe the allowed ordering of the methods of the provided and required interfaces of a component. For our CD player we see that we can switch the volume between high and low at any state. Furthermore, after having inserted a CD, the component offers us to choose a track, start playing and stop the CD.

[Figure 2. Statechart of the CD player model, with states A to H]

A further part of the UML profile are specific tags which can be attached to attributes of classes and to methods in interfaces. As the profile aims at providing models with a precise formal semantics, we also need a description of the types of attributes and the signatures as well as pre- and postconditions of methods. These are given as Object-Z expressions. For instance, the following Object-Z expression describes the method insert:

insert = [ Δ(CDin, tracks, current); tr? : N | CDin′ ∧ tracks′ = tr? ∧ current′ = 0 ]

The method has an input parameter tr? and sets attribute CDin to true, tracks to the value of the parameter and current to 0, and keeps volume (primed variables refer to the after-state of a method; the Δ-list contains all variables which are modifiable by the method). In a similar fashion all methods are described. Besides allowing for the definition of a formal semantics, we can furthermore precisely define what the referenced as well as the modified variables of a method are. Here, we for instance have ref(insert) = ∅ and mod(insert) = {CDin, tracks, current}. This will prove useful when defining some of our refactorings and measurements. The semantics of such models is given in terms of the integrated formal method CSP-OZ [13] (see appendix). Hence behaviour preservation of refactorings means that the semantics of a model before and after refactoring is the same.
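The ref and mod sets drive several of the measures and refactoring preconditions below. As an illustration only (the tool works on the UML metamodel, not on Python), here is a minimal executable reading of the insert schema; the Δ-list convention that variables outside the list keep their values is from Z, while the dictionary encoding is our assumption.

```python
def apply_insert(state: dict, tr: int) -> dict:
    """Executable reading of the insert schema: the Delta-list is
    {CDin, tracks, current}; every variable outside the Delta-list
    (here: volume) keeps its value."""
    post = dict(state)                          # copy: unlisted variables are kept
    post.update(CDin=True, tracks=tr, current=0)
    return post

# ref/mod sets as stated in the text: insert reads no (unprimed) state
# variable and may modify exactly the variables in its Delta-list.
REF = {"insert": set()}
MOD = {"insert": {"CDin", "tracks", "current"}}

assert apply_insert({"CDin": False, "tracks": 0, "current": 0, "volume": "low"}, 12) \
       == {"CDin": True, "tracks": 12, "current": 0, "volume": "low"}
```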
Looking at our model, there is in particular one part which seems to be of bad quality: the statechart. Due to the high number of paths through the diagram, it is difficult to actually understand which orderings are allowed and which are not. In fact, there is yet another design problem present in the model: we have made just one class for completely different and independent aspects of the system. It turns out that the model can be refactored into two separate components with separate, much simpler statecharts (shown in Figures 3 and 4). The first component is responsible for handling all CD-related features, the second one only handles volume adjustment. Accordingly, the statecharts of the components only contain the parts responsible for the ordering of their methods. With respect to the semantics, the model before and after refactoring is still equivalent.

[Figure 3. Refactored class diagram: component CDHandling (attributes CDin, tracks, current) with interface HandlingService (insert(), eject(), choose(), play(), stop()) and component VolumeAdjustment (attribute volume) with interface VolumeService (low(), high())]

[Figure 4. Refactored statecharts, left: CDHandling, right: VolumeAdjustment]

In the following we will explain a) how to systematically detect such bad designs by measurements, and b) how to suggest and carry out behaviour-preserving improvements, i.e. refactorings.

III. INDICATING DESIGN PROBLEMS

Our objective here is to indicate and, at best, resolve design problems. Dealing exclusively with design problems means that we restrict our considerations here to the quality characteristic design quality. We detect bad designs in a software model in three steps. Given a concrete model, we first of all carry out various measurements, mainly referring to the structure of the model. Next, we analyse the results of these measurements so as to derive indicators for specific problems. In the last step we connect these indicators with actual design problems and propose refactorings in order to solve the detected design problems.

Basically, the first two steps of our approach can be applied to other quality characteristics like consistency as well. However, some restrictions apply then. The measurements presented here are solely based on the model's syntax and do not involve its semantics (only the refactorings do so). Thus, integrating additional measurements that involve human judgement, or that first need to transform a model into a semantic domain in order to perform some analysis on it, would require adjustments to our measurement step. Other quality characteristics like timing behaviour or performance can furthermore not be assessed at all, since the models simply do not contain sufficient information about such issues.

A. Measures

In general, measures are used for the quantification of quality attributes of an entity. A measure itself is just a variable to which a value is assigned as the result of a measurement. In our case, the software model presented in Section II represents the entity.

There are many important requirements one can impose on measures: measures should be precise, economically collectable, traceable to information needs, evaluated, understandable, and analysable. To be precise and economically collectable, all our measures are formulated as OCL queries. We achieve traceability by linking these measures to quality problems. These quality problems represent our information needs and justify the need for our proposed measures. We reuse some measures taken from existing publications (cf. column Literature in Tab. I). These measures have been evaluated before. However, this set of measures is incomplete with respect to our quality problems, in particular because of our aim of measuring multi-view models. Hence, we have also introduced further new measures.
These have so far been evaluated on some case studies, and have shown to be able to detect design problems of multi-view models. Basically, our set of measures is influenced by the modelling language and the corresponding views of our example model, and by the quality problems that we introduce in Subsection III-C. There we will explain the causal relationship between some measures and the investigated design problems in detail.

An overview of the measures and their concrete values for the running example is given in Table I. The table gives just an excerpt of the available measures. The Measures are named by short acronyms. The Informal Definition just gives an idea of their computation and does not replace the formal definition (which is given in OCL) that enables measurement automation. The column Scope lists the views which are considered during measurement.

Table I. Excerpt of measures. Legend for Scope: class diagram (CD), statechart (SC), Z description (Z).

Measure   | Informal Definition                                                                   | Literature | Scope  | Value for CDPlayer
LCOM4     | Number of connected parts in a class                                                  | [17]       | CD, Z  | 2
NTLP      | Number of top-level parallelizations                                                  | new        | SC     | 1
NT2NOps   | Number of transitions divided by number of operations                                 | new        | CD, SC | 19/7
MaxNLCS   | Maximum nesting level of composite states                                             | [18]       | SC     | 0
NBC       | Number of border crossings                                                            | new        | SC     | 0
TCC       | Number of methods directly connected via variables, divided by the maximum number of possible connections | [19] | CD, Z | 9/21
DCC       | Number of classes that a class is related to                                          | [20]       | CD     | 0
MaxUsedPL | Maximum length of all parameter lists used by a class                                 | new        | CD, Z  | 0
OIF       | Number of inherited operations divided by the total number of available operations    | [21]       | CD     | 0
DIT       | Depth of inheritance of a class                                                       | [22]       | CD     | 0

Most of the measures are basically based on counting model elements and should thus be understandable (e.g. NTLP or NT2NOps). LCOM4 is an exception, and we have to clarify what is meant by connected parts in a class. A connected part is a set of related operations and attributes. Two operations op1 and op2 are related if op1 calls op2, or op2 calls op1, or they both access the same attribute. We can draw a graph linking related operations and attributes to each other (see below); LCOM4 equals the number of connected components of this graph. For calculating LCOM4 we depend on information about operations and their referenced as well as modified attributes. Z descriptions include such information. For example, the operation insert modifies the attributes CDin, tracks, and current (cf. the example in Section II), which yields three corresponding edges in the graph below. The LCOM4 value for our class CDPlayer is 2.

[Graph of related operations and attributes for CDPlayer: insert(), eject(), choose(), play() and stop() are connected via the attributes CDin, tracks and current; low() and high() are connected via volume. LCOM4 = 2]

Thus, the information needed for calculating LCOM4 is gathered from two different views: the class diagram and the Z description. Note that a number of further measures in Table I refer to more than one view/diagram of the model.
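To make this computation concrete, the following Python sketch derives LCOM4 as the number of connected components of the operation/attribute graph. It is only an illustration of the definition, not the tool's OCL query; the attribute sets of the operations other than insert are plausible assumptions read off the class diagram, and call relations are omitted since the CD player operations do not call each other.

```python
def lcom4(access):
    """LCOM4 of a class. `access` maps each operation name to the set of
    attributes it references or modifies (taken from the Z schemas).
    Call relations between operations would add further edges, but are
    omitted here. Returns the number of connected components of the
    operation/attribute graph, computed via union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        parent[x] = root            # simple path compression
        return root

    def union(x, y):
        parent[find(x)] = find(y)

    for op, attrs in access.items():
        find(op)                    # operations without attributes still count
        for attr in attrs:
            union(op, attr)
    return len({find(x) for x in parent})

# CD player class from Section II; insert's set is from the text, the
# others are assumptions consistent with the class diagram.
cdplayer = {
    "insert": {"CDin", "tracks", "current"},
    "eject":  {"CDin", "tracks", "current"},
    "choose": {"current"},
    "play":   {"CDin", "current"},
    "stop":   {"CDin"},
    "low":    {"volume"},
    "high":   {"volume"},
}
assert lcom4(cdplayer) == 2       # two independent clusters, as in Table I
```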
B. Indicators

Our measures enable a quantification of quality attributes. However, it remains unclear so far how to interpret the measured values with respect to the design problems. To this end, measured values are next analysed by comparing them with thresholds. The combination of a measure and a related threshold is called an indicator. An indicator provides a qualification of an attribute and enables us to indicate bad designs. Table II gives an overview of some indicators. Again, this table is not complete but only shows an excerpt. Each Indicator is based on at least one Measure. Measures can be compared with thresholds (cf. NT2NOps > UP_NT2NOps (= 2)) or with other measures (LCOM4 > NTLP), or can be logically combined (OIF < LOW_OIF (= 0.2) ∧ DIT >= 1). Design Problems are then indicated by the indicators. The rightmost column documents which design problems are detected for our running example. We explain some design problems in the next section.

Table II. Excerpt of indicators. Thresholds for a measure M are written as follows: upper limit UP_M, lower limit LOW_M.

Measure     | Indicator (threshold example)      | Design Problem Indicated          | Value for CDPlayer
LCOM4, NTLP | LCOM4 > NTLP                       | Hidden Concurrency                | true
NT2NOps     | NT2NOps > UP_NT2NOps (= 2)         | Unnecessary Behavioral Complexity | true
MaxNLCS     | MaxNLCS > UP_MaxNLCS (= 5)         | Unnecessary Behavioral Complexity | false
NBC         | NBC > UP_NBC (= 5)                 | Unnecessary Behavioral Complexity | false
LCOM4       | LCOM4 > 1                          | Too Low Cohesion                  | true
TCC         | TCC < LOW_TCC (= 0.5)              | Too Low Cohesion                  | true
LCOM4       | LCOM4 < 1                          | Lazy Class                        | false
DCC         | DCC > UP_DCC (= 9)                 | Too Strong Coupling               | false
MaxUsedPL   | MaxUsedPL > UP_MaxUsedPL (= 5)     | Too Strong Coupling               | false
OIF, DIT    | OIF < LOW_OIF (= 0.2) ∧ DIT >= 1   | Refused Bequest                   | false

We have filled in specific values for the thresholds. Some of these values are taken from existing evaluations [19], [23]; other thresholds are based on our own experience. In fact, they have been quite effective for the detection of our design problems.

Now we come back to our LCOM4 value of 2 and provide an interpretation for it. An LCOM4 value of 0 only occurs when there are no operations in a class; this would indicate the design problem Lazy Class. An LCOM4 value of 1 indicates a cohesive class, which is a good class. If the value is higher than 1, then a design problem is indicated, because there are parts in the class that do not work together. This means that the class is doing more than one job: it is not cohesive. Thus, the LCOM4 value of 2 indicates Too Low Cohesion.

C. Design Problems

The indicators defined in Table II help us to diagnose certain design problems. Our diagnosis is quite simple: the indicators defined for one design problem are logically combined by disjunction. Thus, each indicator on its own can indicate a design problem.

[Figure 5. Mapping symptoms (indicators) to therapies (refactorings) via diagnosis (design problems)]
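The following sketch replays this disjunctive diagnosis on the measured values of Table I. The thresholds and comparisons are transcribed from Table II; the list-of-lambdas encoding is ours and merely stands in for the tool's knowledge base.

```python
# Measured values for the CDPlayer model (Table I); ratios written as fractions.
measures = {"LCOM4": 2, "NTLP": 1, "NT2NOps": 19 / 7, "MaxNLCS": 0,
            "NBC": 0, "TCC": 9 / 21, "DCC": 0, "MaxUsedPL": 0, "OIF": 0, "DIT": 0}

# Indicators from Table II as (design problem, predicate) pairs.
indicators = [
    ("Hidden Concurrency",                lambda m: m["LCOM4"] > m["NTLP"]),
    ("Unnecessary Behavioral Complexity", lambda m: m["NT2NOps"] > 2),
    ("Unnecessary Behavioral Complexity", lambda m: m["MaxNLCS"] > 5),
    ("Unnecessary Behavioral Complexity", lambda m: m["NBC"] > 5),
    ("Too Low Cohesion",                  lambda m: m["LCOM4"] > 1),
    ("Too Low Cohesion",                  lambda m: m["TCC"] < 0.5),
    ("Lazy Class",                        lambda m: m["LCOM4"] < 1),
    ("Too Strong Coupling",               lambda m: m["DCC"] > 9),
    ("Too Strong Coupling",               lambda m: m["MaxUsedPL"] > 5),
    ("Refused Bequest",                   lambda m: m["OIF"] < 0.2 and m["DIT"] >= 1),
]

def diagnose(m):
    # Indicators for the same problem are combined by disjunction:
    # a single firing indicator suffices to report the problem.
    return {problem for problem, fires in indicators if fires(m)}

assert diagnose(measures) == {"Hidden Concurrency",
                              "Unnecessary Behavioral Complexity",
                              "Too Low Cohesion"}
```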
We concentrate on six different design problems: Hidden Concurrency, Unnecessary Behavioral Complexity, Too Low Cohesion, Lazy Class, Too Strong Coupling, and Refused Bequest. For two of them, we explain why they are real design problems and which indicators help us in the context of our running example.

Diagnosing Unnecessary Behavioral Complexity: Complexity is a typical attribute of a software model (and of software systems in general). Hence, complexity is not a design problem that can be solved under all circumstances. A software model is often complex due to the problem domain and the system requirements. In such cases, there is no potential left to improve the design quality significantly. Here, we instead want to find out whether a software model is complex without any need. Thus, we require measurements indicating preventable complexity. We indicate Unnecessary Behavioral Complexity by comparing the complexity of the statechart with the complexity of a class given by its operations. On the one hand, the complexity of the statechart is given by the number of transitions (NT); on the other hand, the complexity of a class is given by the number of operations (NOps). If the statechart is very complex in relation to the class, then this indicates Unnecessary Behavioral Complexity. This is the case for our example.

Diagnosing Hidden Concurrency: If concurrency is hidden in a statechart, then the statechart is hard to understand. One has to take the whole statechart into account at once, instead of first understanding smaller units and then interconnecting them. Another problem lies in the risk during editing of the statechart: all operations always have to be considered, even for small changes or extensions. Thus, a software model should make concurrency explicit. We can indicate Hidden Concurrency by comparing the number of connected parts in a class (LCOM4) with the number of top-level parallelizations (NTLP). If LCOM4 is higher than NTLP, then some parts are independent of each other, although this fact is not explicitly modelled in the statechart. For our example this is the case.

[Figure 6. Statechart of the CD player: the ovals highlight states which will be combined into new states (dark gray: {A, B, C, D} and {E, F, G, H}; light gray: {A, E}, {B, F}, {C, G}, {D, H})]

[Figure 7. Statechart after the refactoring "State Parallelization": two parallel regions, one with states 1 to 4 for the CD handling methods and one with states 5 and 6 for low and high]

Figure 5 does not only show the relation between indicators and design problems, but also proposes possible refactorings to solve the design problems. For our example, we can diagnose the design problems Hidden Concurrency, Unnecessary Behavioral Complexity and Too Low Cohesion. Next, we see how to get rid of these problems using refactorings.

IV. REFACTORING THE MODEL

The purpose of refactorings is to improve the structure (viz. quality) of code or models while preserving their behaviour. A refactoring thus acts as a therapy for design problems. Due to the requirement of behaviour preservation, refactorings cannot be used as a therapy when quality characteristics like correctness are considered (which often require a change of the model's functionality once problems have been indicated). Usually, code refactorings are specified by ordinary text plus some examples. Since we are aiming at a formal quality cycle, we employ a formal description of refactorings (see [8]). Every refactoring is described by two templates, giving the before and after state (or model structure) of a refactoring. When executing a refactoring, we furthermore need to know where to apply it in the model. This is captured by parameters which are filled in before execution. Our tool will prompt the user for these parameters. Finally, similar to Roberts [24], every refactoring has a precondition stating its applicability. This will automatically be checked by the tool. In summary, a refactoring thus has four components: (Par, BeforeTemplate, Pre, AfterTemplate). Since the templates and preconditions easily get quite complex, we just explain them textually for our example.

The diagnosis of our example has identified three design problems: Hidden Concurrency, Unnecessary Behavioral Complexity and Too Low Cohesion. As therapies, five refactorings are suggested (see Fig. 5): "State Parallelization", "Split Method", "Join Methods", "Split Class" and "Move Feature". From this, we can directly see that neither the diagnosis nor the therapy proposes a unique solution to the quality problem. Instead, users have to decide on their own which refactoring to apply. The tool will propose all five refactorings and assist the user by checking their applicability.
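As a data-structure sketch, the four components of a refactoring and the applicability check the tool performs might look as follows in Python; the types are placeholders for RMC's actual template and model representations, which the paper does not spell out.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Refactoring:
    """The four components (Par, BeforeTemplate, Pre, AfterTemplate)."""
    name: str
    parameters: list                           # filled in by the user via the GUI
    before_template: Any                       # model structure before refactoring
    precondition: Callable[[Any, list], bool]  # applicability, checked automatically
    after_template: Any                        # model structure after refactoring

def applicable(proposed: list, model: Any) -> list:
    """The tool proposes all suggested refactorings and assists the user
    by checking which of them are actually applicable."""
    return [r for r in proposed if r.precondition(model, r.parameters)]
```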
Here, we have two refactorings which are proposed for more than one design problem, namely "State Parallelization" and "Split Class". Since the latter is not applicable (as we will see below), the best option is to take the former.

The purpose of the refactoring "State Parallelization" is to make inherent concurrency visible by introducing parallel regions into the statechart. To this end, we need to identify states which are "equal" with respect to the methods of one such potential subregion. For instance, as far as the methods low and high are concerned, the states {A, B, C, D} as well as {E, F, G, H} in the statechart in Fig. 6 (dark gray ovals) cannot be distinguished. Similarly, with respect to the methods insert, eject, choose, play and stop, the states A and E, B and F, D and H, and C and G are equal (light gray ovals in Fig. 6). As parameters to the refactoring, we thus need to give the two method sets {high, low} and {insert, choose, eject, play, stop} as well as the respective sets of "equivalent" states. The precondition check then amounts to checking equivalence (a kind of bisimulation, see appendix). The AfterTemplate finally specifies that the result is a statechart with as many parallel regions as we have method sets, with the states in these regions being the above state sets and with transitions representing the old transitions between states in the sets. The resulting statechart for our example is shown in Fig. 7. Here, we for instance have states 1 and 2 representing the state sets {A, E} and {B, F}, plus e.g. a transition from 1 to 2 corresponding to the old transition from A to B (E to F, respectively). The proof of behaviour preservation can be found in the appendix.

This refactoring gives us a new, second model. Next, the quality cycle is started again: we measure the model's quality and derive a diagnosis. In Table III we give the results of the next measurement.

Table III. Measures and indicators after the refactoring "State Parallelization".

Measure  | Value | Indicator    | Value
LCOM4    | 2     | LCOM4 > NTLP | false
NTLP     | 2     | NT2NOps > 2  | false
NT2NOps  | 10/7  | MaxNLCS > 5  | false
MaxNLCS  | 1     | NBC > 5      | false
NBC      | 0     | LCOM4 > 1    | true
TCC      | 9/21  | TCC < 0.5    | true

The values for LCOM4 and NTLP are now equal, and the value for NT2NOps is below the limit UP_NT2NOps. Therefore we have solved the design problems Hidden Concurrency and Unnecessary Behavioural Complexity. The last open design problem is Too Low Cohesion: the LCOM4 value of 2 tells us that there are two completely independent parts in the class; the suggested refactoring is thus "Split Class". As the name says, this refactoring splits the class into two (or more), thereby also splitting the statechart. Parameters to "Split Class" are the sets of methods which should be put together in a new class. Here, these are the sets M1 = {low, high} and M2 = {insert, eject, choose, play, stop}. The precondition is twofold: on the class diagram and its Z annotations we need to check (ref(M1) ∪ mod(M1)) ∩ (ref(M2) ∪ mod(M2)) = ∅, i.e. the method sets operate on different variables of the class. On the statechart we need to check whether there are parallel regions, each referring only to the methods in one such set. Again, given the values for the parameters, this can be checked automatically. As a remark: the precondition for this refactoring is for instance not fulfilled in the first model, as we do not have parallel regions there.
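A sketch of this twofold precondition check, assuming the ref/mod sets of all methods and the method sets occurring in the statechart's parallel regions are available as plain Python sets (the region extraction itself is elided):

```python
def split_class_precondition(ref, mod, m1, m2, region_methods):
    """Precondition of "Split Class" for the method sets m1 and m2:
    (ref(M1) u mod(M1)) and (ref(M2) u mod(M2)) must be disjoint, and the
    statechart must already have parallel regions, each referring only to
    the methods of one of the two sets."""
    def touched(methods):
        return set().union(*(ref[m] | mod[m] for m in methods))

    disjoint_vars = touched(m1).isdisjoint(touched(m2))
    regions_fit = {frozenset(m1), frozenset(m2)} == \
                  {frozenset(r) for r in region_methods}
    return disjoint_vars and regions_fit

# For the second model the check succeeds; for the first model it fails,
# since there are no parallel regions yet (region_methods is empty).
```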
The refactoring then splits the class as well as the statechart in two. The resulting model has already been discussed in the example section (Figs. 3 and 4). Note that this refactoring, both in its precondition check and in the transformation itself, refers to the class diagram, the Z annotations and the statechart together. Like the previous refactoring, this refactoring can be shown to preserve the overall semantics of the model.

Having obtained a third model, we once more start our quality cycle. Table IV shows the results of measuring.

Table IV. Measures and indicators after the refactoring "Split Class". HS: HandlingService, VS: VolumeService.

Measure  | Value HS | Value VS | Indicator    | Value HS | Value VS
LCOM4    | 1        | 1        | LCOM4 > NTLP | false    | false
NTLP     | 1        | 1        | NT2NOps > 2  | false    | false
NT2NOps  | 6/5      | 3/2      | MaxNLCS > 5  | false    | false
MaxNLCS  | 0        | 0        | NBC > 5      | false    | false
NBC      | 0        | 0        | LCOM4 > 1    | false    | false
TCC      | 8/10     | 1        | TCC < 0.5    | false    | false

After this final refactoring we have a model which has no measured design problems anymore and which has the same behaviour as the original model. Thus, we have indeed improved the quality of the model according to our indicators.

V. RMC TOOL

The RMC tool [16] supports both the development of formal software models and the application of the whole quality cycle to them. It has been implemented as an Eclipse plugin. It enables the creation and editing of UML models containing class diagrams with Z annotations and statecharts, the measurement of the model, a diagnosis and finally the refactoring. To support this, the tool comes with a large number of predefined measures, indicators and refactorings. In addition, the tool allows users to define their own measures, indicators and refactorings.

Measures are defined by OCL (Object Constraint Language) queries, i.e. by numerical rules that are understood by our tool. The automatic calculation saves time and can be performed in an ad hoc fashion, in contrast to measurements that involve human judgement. For the indicators, we use a range-based approach. This permits the documentation of objectives that are based on stepwise thresholds or ranges. The design problems are described by diagnoses that are derived from the results of the indicators and from a knowledge base containing information about the connections between design problems and indicators. The knowledge base of the diagnosis also gives us hints as to which refactorings can be applied to solve a problem. It is given in a configuration file and can easily be adapted to fit newly found indicators.

A similar approach is used for defining refactorings. The available refactorings are stored in a configuration file; users can extend it to fit their particular needs. The configuration file includes three main parts: refactoring interface definitions, simple refactorings, and complex refactorings. The refactoring interface definitions specify which information needs to be gathered from the user via the GUI (e.g. the new class name for renaming a class). While a simple refactoring defines a small, atomic transformation step, a complex refactoring combines simple refactorings and thus describes more complicated transformations in a modular way.
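Putting the pieces together, one possible driver for the cycle that RMC supports interactively could look as follows; measure, diagnose and the user's choice are passed in as functions, and apply, standing for executing the chosen behaviour-preserving refactoring, is an assumed interface rather than the tool's actual API.

```python
def quality_cycle(model, measure, diagnose, choose, apply, max_rounds=10):
    """Repeat measure -> diagnose -> refactor until no measured design
    problem remains or the user declines all proposed therapies."""
    for _ in range(max_rounds):
        problems = diagnose(measure(model))
        if not problems:
            return model                      # no measured design problems remain
        refactoring = choose(model, problems) # user picks an applicable therapy
        if refactoring is None:
            return model                      # user declines; stop the cycle
        model = apply(model, refactoring)     # behaviour-preserving step
    return model
```

For the CD player, this loop terminates after the two rounds described above: "State Parallelization" followed by "Split Class".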
VI. CONCLUSION AND RELATED WORK

In this paper, we have presented our quality cycle approach for software models by applying it to a small example. Our approach is based on software models whose semantic definition is given by the formal language CSP-OZ [13] and which consist of four interrelated views (class diagrams, statecharts, component diagrams and Z descriptions). Our measures take several views into account at once. Due to these multi-view measures, the corresponding indicators help us to detect bad design in the underlying model. Indicated design faults are repaired by behaviour-preserving, multi-view refactorings. In order to restructure the underlying model in a consistent way, refactorings must necessarily work on the whole set of available views as well. The quality cycle we propose here can be applied to ordinary UML models as well; but for software models that lack a formal semantics it remains unclear whether our refactorings preserve their overall behaviour.

Research on measures and on refactorings, each on its own, is a very broad field. The combination of these two techniques is however rare, even for source code. We briefly survey some existing work.

Indicators for Software Models: There are a lot of measures and indicators suggested for source code. Since models are abstractions of code, many measures cannot be directly applied to models, and indicator thresholds have to be adjusted. This is still an ongoing research field [25], [18]; only large experiments can yield practically useful values here. We intend to directly incorporate new results in this area into our tool.

Refactorings: While we refactor UML models here, we are still able to show behaviour preservation based on the formal semantics of our profile. So we combine the advantages of refactoring UML models [4], [5], [6] with those of formal models (e.g. [7], [8], [9], [10]).

Quality Cycle for Source Code: Our quality cycle has some overlaps with existing quality cycles for source code [26], [27], which also exhibit the trichotomy of measuring, diagnosing and refactoring. However, the object of study is quite different (software model vs. source code). Quality cycles for code need not consider multi-view measures and multi-view refactorings, which we do here.

Diagnosis: Instead of using measures and indicators, [28] suggests logic rules defined in a Prolog variant for detecting design problems. [29] analyses the change history of versioning systems for source code; they detect critical pieces of code and can predict future refactorings. Both works present interesting approaches orthogonal to ours and could be used to complement our quality cycle.

Acknowledgement: This work was partially funded by the German Research Council DFG under grant WE 2290/6-1.

REFERENCES

[1] D. N. Card and R. L. Glass, Measuring Software Design Quality. Upper Saddle River, NJ, USA: Prentice-Hall, 1990.

[2] M. Fowler, Refactoring: Improving the Design of Existing Code. Addison-Wesley, 2004.

[3] T. Mens and T. Tourwé, "A Survey of Software Refactoring," IEEE Trans. Software Eng., vol. 30, no. 2, pp. 126–139, 2004.

[4] J. Küster, R. Heckel, and G. Engels, "Defining and validating transformations of UML models," in HCC. IEEE Computer Society, 2003, pp. 145–152.

[5] T. Mens, N. V. Eetvelde, S. Demeyer, and D. Janssens, "Formalizing refactorings with graph transformations," Journal of Software Maintenance, vol. 17, no. 4, pp. 247–276, 2005.

[6] R. V. D. Straeten and M. D'Hondt, "Model Refactorings through Rule-Based Inconsistency Resolution," in Proceedings of the 2006 ACM Symposium on Applied Computing, J. Bézivin, Ed., 2006, pp. 1210–1217.

[7] T. McComb, "Refactoring Object-Z Specifications," in FASE'04, ser. LNCS. Springer, 2004, pp. 69–83.

[8] T. Ruhroth and H. Wehrheim, "Refactoring Object-Oriented Specifications with Data and Processes," in FMOODS, ser. LNCS, M. M. Bonsangue and E. B. Johnsen, Eds., vol. 4468. Springer, 2007, pp. 236–251.
[9] J. Derrick and H. Wehrheim, "Model Transformations Incorporating Multiple Views," in AMAST, ser. LNCS, M. Johnson and V. Vene, Eds., vol. 4019. Springer, 2006.

[10] R. Ramos, A. Sampaio, and A. Mota, "Transformation Laws for UML-RT," in FMOODS, ser. LNCS, R. Gorrieri and H. Wehrheim, Eds., vol. 4037. Springer, 2006.

[11] M. Genero, M. Piattini-Velthuis, J.-A. Cruz-Lemus, and L. Reynoso, "Metrics for UML Models," UPGRADE - The European Journal for the Informatics Professional, vol. 5, pp. 43–48, 2004.

[12] M. Möller, E.-R. Olderog, H. Rasch, and H. Wehrheim, "Integrating a formal method into a software engineering process with UML and Java," Formal Asp. Comput., vol. 20, no. 2, pp. 161–204, 2008.

[13] C. Fischer, "Combination and Implementation of Processes and Data: from CSP-OZ to Java," Ph.D. dissertation, University of Oldenburg, 2000.

[14] W. Roscoe, The Theory and Practice of Concurrency. Prentice-Hall, 1997.

[15] G. Smith, The Object-Z Specification Language. Kluwer Academic Publishers, 2000.

[16] H. Voigt and T. Ruhroth, "A Quality Circle Tool for Software Models," in ER, ser. LNCS, Q. Li et al., Eds., vol. 5231. Springer, 2008, pp. 526–527.

[17] M. Hitz and B. Montazeri, "Measuring Product Attributes of Object-Oriented Systems," in Proc. ESEC '95. Springer, 1995.

[18] J. A. Cruz-Lemus, M. Genero, M. Piattini, and J. A. T. Álvarez, "An Empirical Study of the Nesting Level of Composite States Within UML Statechart Diagrams," in ER (Workshops), ser. LNCS, vol. 3770. Springer, 2005, pp. 12–22.

[19] J. Bieman and B.-K. Kang, "Cohesion and reuse in an object-oriented system," SIGSOFT Softw. Eng. Notes, vol. 20, no. SI, pp. 259–262, 1995.

[20] J. Bansiya and C. G. Davis, "A Hierarchical Model for Object-Oriented Design Quality Assessment," IEEE Trans. Software Eng., vol. 28, no. 1, pp. 4–17, 2002.

[21] F. B. e Abreu, "Using OCL to formalize object oriented metrics definitions," Tech. Rep., Portugal, 2001.

[22] S. R. Chidamber and C. F. Kemerer, "A Metrics Suite for Object Oriented Design," IEEE Trans. Software Eng., vol. 20, no. 6, pp. 476–493, 1994.

[23] F. B. e Abreu and W. Melo, "Evaluating the Impact of Object-Oriented Design on Software Quality," in Proceedings of the 3rd International Software Metrics Symposium, 1996.

[24] D. B. Roberts, "Practical Analysis for Refactoring," Ph.D. dissertation, University of Illinois at Urbana-Champaign, 1999.

[25] C. Lange, "Assessing and Improving the Quality of Modeling - A Series of Empirical Studies about the UML," Ph.D. dissertation, TU Eindhoven, 2007.

[26] A. Trifu and R. Marinescu, "Diagnosing Design Problems in Object Oriented Systems," in WCRE. IEEE Computer Society, 2005, pp. 155–164.

[27] L. Tahvildari and K. Kontogiannis, "Improving design quality using meta-pattern transformations: a metric-based approach," Journal of Software Maintenance, vol. 16, no. 4-5, 2004.

[28] T. Tourwé and T. Mens, "Identifying Refactoring Opportunities Using Logic Meta Programming," in CSMR '03. Washington, DC, USA: IEEE Computer Society, 2003, p. 91.

[29] J. Ratzinger, T. Sigmund, P. Vorburger, and H. Gall, "Mining Software Evolution to Predict Refactoring," in ESEM '07. Washington, DC, USA: IEEE Computer Society, 2007.

[30] H. Rasch and H. Wehrheim, "Checking Consistency in UML Diagrams: Classes and State Machines," in FMOODS, ser. LNCS, E. Najm, U. Nestmann, and P. Stevens, Eds., vol. 2884. Springer, 2003, pp. 229–243.

[31] R. Milner, Communication and Concurrency. Prentice Hall, 1989.
APPENDIX: BEHAVIOUR PRESERVATION

The semantics of models written in our UML profile is defined by first giving an Object-Z semantics to class diagrams with Z annotations and a CSP semantics to statecharts. Object-Z is next given a CSP semantics as well, which is then combined with the statechart semantics via parallel composition. Parallel composition acts as logical conjunction here, combining the behaviour of class diagram and statechart. Thus, in short, the semantics of a model (cd, sc) is defined by

[[(cd, sc)]] = [[(cd)]] || [[(sc)]]

Of interest for showing behaviour preservation of the refactoring "State Parallelization" is the semantics of statecharts. Since we only allow simple protocol statecharts, the translation to CSP is straightforward (see [30]): a state X is translated into a CSP process P_X, which is defined as executing one of the methods on the outgoing transitions, thereby evolving into the process for the next state. As an example, we get (among others) the following processes for our initial statechart:

P_main = P_A
P_A = insert → P_B □ high → P_E
P_B = eject → P_A □ choose → P_C □ high → P_F

We only roughly sketch the proof of correctness for the refactoring "State Parallelization" here. As parameters to this refactoring we have sets of states which we want to consider equal. Assume we aim at two regions. Then we need a distribution of the set of all states S for two dimensions, 1 and 2:

S = S_0^1 ∪ ... ∪ S_n^1,   S = S_0^2 ∪ ... ∪ S_m^2

These disjoint sets will form the states of the regions. Given such a distribution, we have to check whether the transitions in the statechart actually describe a parallel composition of two separate orderings. To this end, we first define two equivalence relations on the set of states, one for each dimension k = 1, 2:

s1 ≡_k s2 ⇔ ∃ S_l^k : {s1, s2} ⊆ S_l^k

The precondition for the refactoring consists of two conditions:

• Orthogonal distribution of states (two distinct states are summarised either in the first or in the second dimension, never in both):

  ∀ s1, s2 ∈ S : s1 ≡_1 s2 ∧ s1 ≡_2 s2 ⇒ s1 = s2

• Parallel execution (equivalent states have equivalent behaviour):

  ∀ s1, s2 ∈ S : s1 −a→ s2 ∧ s1 ≡_1 s2 ⇒ ∀ s1′ ∈ S, s1′ ≡_2 s1 : ∃ s2′ ∈ S, s2′ ≡_2 s2 ∧ s1′ −a→ s2′
  ∀ s1, s2 ∈ S : s1 −a→ s2 ∧ s1 ≡_2 s2 ⇒ ∀ s1′ ∈ S, s1′ ≡_1 s1 : ∃ s2′ ∈ S, s2′ ≡_1 s2 ∧ s1′ −a→ s2′

The precondition check is carried out by the tool.
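A sketch of this precondition check on an explicit transition system (states, (source, action, target) triples, and the two partitions given as lists of state sets); it checks the two conditions literally and is our illustration, not the tool's implementation:

```python
def parallelizable(states, trans, dim1, dim2):
    """Precondition of "State Parallelization". `trans` is a set of
    (source, action, target) triples; dim1/dim2 are the two partitions
    of the state set, given as lists of state sets."""
    cls1 = {s: i for i, block in enumerate(dim1) for s in block}
    cls2 = {s: i for i, block in enumerate(dim2) for s in block}

    # Orthogonal distribution: two distinct states are never summarised
    # in both dimensions at once.
    if any(s != t and cls1[s] == cls1[t] and cls2[s] == cls2[t]
           for s in states for t in states):
        return False

    def respects(same, other):
        # An a-transition staying inside a `same`-class must be available
        # from every state with the same `other`-class, again reaching an
        # `other`-equivalent target.
        for (s1, a, s2) in trans:
            if same[s1] != same[s2]:
                continue
            for t1 in states:
                if other[t1] == other[s1] and not any(
                        u1 == t1 and b == a and other[u2] == other[s2]
                        for (u1, b, u2) in trans):
                    return False
        return True

    return respects(cls1, cls2) and respects(cls2, cls1)

# The CD player statechart with dim1 = [{A,B,C,D}, {E,F,G,H}] and
# dim2 = [{A,E}, {B,F}, {C,G}, {D,H}] passes this check.
```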
The refactoring then builds a new statechart with two parallel regions. The states of the first region are {S_0^1, ..., S_n^1}, and we have a transition S_i^1 −a→ S_j^1, i ≠ j, iff ∃ s1 ∈ S_i^1, s2 ∈ S_j^1 such that s1 −a→ s2; the second region is constructed similarly. Translating this back to CSP (where parallel regions are translated into parallel compositions), we get a process P1_main || P2_main. For showing behaviour preservation we need to prove the equivalence of P_main and P1_main || P2_main. This can be done by proving these two processes to be bisimilar [31], which - in the absence of internal operations, which is the case here - also proves equivalence with respect to CSP's failures-divergences model. The relation showing bisimilarity for the above refactoring is

R = {(P_i, P1_j || P2_k) | P_i represents state s_i, P1_j state S_j^1, P2_k state S_k^2, and s_i ∈ S_j^1 ∧ s_i ∈ S_k^2}

In a similar fashion, all statechart refactorings can be proven to be behaviour-preserving. In some cases, the proof of equivalence can simply be reduced to known equivalences among CSP terms, as can for instance be found in [14].
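To illustrate, the bisimulation property of such a relation can be checked mechanically on explicit transition sets. Here the product side is the interleaving of the two regions, whose alphabets are disjoint; the encoding is our sketch rather than a proof in a CSP tool.

```python
def is_bisimulation(rel, trans, rtrans1, rtrans2):
    """rel: set of (original state, (j, k)) pairs, where j and k are region
    states; trans/rtrans1/rtrans2: sets of (source, action, target) triples
    for the original statechart and the two regions. Checks that rel is a
    strong bisimulation between the original statechart and the
    interleaving of the two regions."""
    def moves(j, k):
        # Disjoint alphabets: each action moves exactly one region.
        return ({(a, (j2, k)) for (j1, a, j2) in rtrans1 if j1 == j} |
                {(a, (j, k2)) for (k1, a, k2) in rtrans2 if k1 == k})

    for s, (j, k) in rel:
        left = {(a, t) for (s0, a, t) in trans if s0 == s}
        right = moves(j, k)
        for a, t in left:       # every original move is matched in the product
            if not any(b == a and (t, p) in rel for b, p in right):
                return False
        for b, p in right:      # and vice versa, staying inside the relation
            if not any(a == b and (t, p) in rel for a, t in left):
                return False
    return True
```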