A DOCUMENT-BASED SOFTWARE TRACEABILITY TO SUPPORT CHANGE IMPACT ANALYSIS OF OBJECT-ORIENTED SOFTWARE

SUHAIMI BIN IBRAHIM

UNIVERSITI TEKNOLOGI MALAYSIA

PART A – Certification of Collaboration*

It is hereby certified that this thesis research project was carried out in collaboration between ______________________ and ___________________

Certified by:
Signature:
Name:
Position:
Date:
(Official stamp)

* If the preparation of the thesis/project involved collaboration.

PART B – For the use of the School of Graduate Studies Office

This thesis has been examined and certified by:

Name and Address of External Examiner: Prof. Dr. Jubair Jwamear Al-Jaafar Alhadithi, Department of Computer Science, Kulliyyah of Infor. and Comp. Technology, International Islamic University Malaysia, Gombak, Kuala Lumpur

Name and Address of Internal Examiner I: Prof. Dr. Safaai bin Deris, Department of Software Engineering, Faculty of Computer Science and Information Systems, UTM, Skudai

Name and Address of Internal Examiner II: Dr. Wan Mohd Nasir bin Wan Kadir, Department of Software Engineering, Faculty of Computer Science and Information Systems, UTM, Skudai

Certified by the Assistant Registrar at the School of Graduate Studies:
Signature:
Date:
Name:

A DOCUMENT-BASED SOFTWARE TRACEABILITY TO SUPPORT CHANGE IMPACT ANALYSIS OF OBJECT-ORIENTED SOFTWARE

SUHAIMI BIN IBRAHIM

A thesis submitted in fulfilment of the requirements for the award of the degree of Doctor of Philosophy

Faculty of Computer Science and Information System
Universiti Teknologi Malaysia

MAY 2006

ALHAMDULILLAH ……

For my beloved parents, my wife, Hjh. Aslamiah bt. Md Nor and my children, Muhammad Nazri and Noor Aini, who have given me the strength and courage.

ACKNOWLEDGEMENT

I would like to take this opportunity to thank my main supervisor, Dato' Prof. Dr.
Norbik Bashah Idris, for his encouragement, advice and inspiration throughout this research. Special thanks go to my co-supervisor, Prof. Dr. Aziz Deraman, Dean of the Faculty of Technology and Information Science, Universiti Kebangsaan Malaysia, for his constant support, technical guidance and constructive review of this research work. I am especially indebted to Prof. Malcolm Munro of Durham University, United Kingdom, a renowned expert in software impact analysis, for his early support through insightful ideas and constructive comments on the draft model and approach of this research. At the beginning of this journey he guided and encouraged me throughout my stay at Durham, giving me an inspiring and fruitful experience in the research area. Great gratitude also goes to Universiti Teknologi Malaysia for sponsoring my three-year PhD study and to MOSTI for funding the IRPA research project. My thanks also go to all CASE (Centre for Advanced Software Engineering) staff and individuals who have been involved directly or indirectly in the project. Lastly, my appreciation goes to the postgraduate students of CASE, Universiti Teknologi Malaysia, Kuala Lumpur for their participation in the controlled experiment.

ABSTRACT

The need for software modification, especially during the maintenance phase, is inevitable and remains the most costly part of the software lifecycle. A major problem for software maintainers is that seemingly small changes can ripple through the entire system and cause major unintended impacts. As a result, prior to performing the actual change, maintainers need mechanisms to understand and estimate how a change will affect the rest of the system. Current approaches to software evolution focus primarily on a limited scope of change impact analysis, e.g. code. This research is based on the premise that a more effective solution to managing system evolution can be achieved by considering a traceability approach to pre-determine the potential effects of a change.
The aim of this research is to establish a software traceability model that can support change impact analysis. It identifies the potential effects on software components in the system, which do not lie solely in code but extend to other high-level components such as design and requirements. As such, in this research, modification to software is considered to be driven by both high-level and low-level software components. This research applies comprehensive static and dynamic analysis to provide better impact infrastructures. The main research contribution of this thesis is a new software traceability approach that supports both top-down and bottom-up tracing. To further prove the concept, some software prototype tools were developed to automate the identification of the potential effects. The achievement of the model was then demonstrated using a case study on a non-trivial industrial application and evaluated via a controlled experiment. The results, when compared against an existing benchmark, proved significant and revealed notable achievements in the objective of determining change impacts.
TABLE OF CONTENTS

CHAPTER    TITLE    PAGE

DECLARATION ii
DEDICATION iii
ACKNOWLEDGEMENT iv
ABSTRACT v
TABLE OF CONTENTS vii

1 INTRODUCTION
1.1 Introduction 1
1.2 Introduction to Software Evolution 1
1.3 Background of the Problem 4
1.3.1 Broader Maintenance Perspective 4
1.3.2 Effective Change Communication 5
1.3.3 Importance of System Documentation 6
1.3.4 Incomplete Maintenance Supported CASE Tools 6
1.4 Statement of the Problem 7
1.5 Objectives of the Study 8
1.6 Importance of the Study 9
1.7 Scope of Work 9
1.8 Thesis Outline 11

2 LITERATURE REVIEW ON SOFTWARE EVOLUTION AND TRACEABILITY
2.1 Introduction 13
2.2 Software Evolution Models 13
2.3 Software Traceability in Software Engineering 16
2.3.1 Software Configuration Management 16
2.3.2 System Documentation 18
2.3.3 Requirements Traceability 21
2.3.4 Change Management Process 23
2.4 Review and Further Remarks on Software Evolution and Traceability 27
2.5 Impact Analysis Versus Traceability 29
2.5.1 Definitions of Impact Analysis and Traceability 29
2.5.2 Vertical Traceability 32
2.5.2.1 Data Dependencies 32
2.5.2.2 Control Dependencies 33
2.5.2.3 Component Dependencies 33
2.5.3 Horizontal Traceability 34
2.5.3.1 Explicit Links 34
2.5.3.2 Implicit Links 35
2.5.3.3 Name Tracing 36
2.5.3.4 Concept Location 37
2.5.4 Review on Impact Analysis and Traceability Techniques 38
2.6 Critiques on State-of-the-art Software Traceability Approaches 41
2.6.1 The Comparison Framework 41
2.6.2 Existing Software Traceability Models and Approaches 44
2.6.3 The Comparative Evaluation of Software Traceability Modeling 48
2.7 Summary of the Proposed Solution 49

3 RESEARCH METHODOLOGY
3.1 Introduction 51
3.2 Research Design 51
3.3 Operational Framework 52
3.4 Formulation of Research Problems 54
3.4.1 Need of Change Process Support 54
3.4.2 Establish Communications Within a Project 55
3.4.3 Understanding Requirements to Support Software Evolution 55
3.5 Some Considerations for Validation Process 55
3.5.1 Supporting Tools 56
3.5.2 Data Gathering and Analysis 57
3.5.3 Benchmarking 58
3.6 Some Research Assumptions 62
3.7 Summary 62

4 REQUIREMENTS TRACEABILITY MODELING
4.1 Introduction 64
4.2 Overview of Requirements Traceability 64
4.3 A Proposed Requirements Traceability Model 65
4.3.1 A Conceptual Model of Software Traceability 66
4.3.2 Dynamic and Static Analyses 67
4.3.3 Horizontal Links 68
4.3.4 Vertical Links 71
4.3.5 Program Dependencies 73
4.4 Defining Object-Oriented Impact Analysis 76
4.4.1 Ripple Effects of Program Dependencies 76
4.4.2 Defined Dependencies 79
4.5 Total Traceability of Artifact Links 80
4.6 Summary 81

5 DESIGN AND IMPLEMENTATION OF SOFTWARE TRACEABILITY
5.1 Introduction 82
5.2 System Traceability Design 82
5.2.1 Software Traceability Architecture 82
5.2.2 CATIA Use Case 86
5.2.3 CATIA Class Interactions 87
5.3 CATIA Implementation and User Interfaces 102
5.4 Other Supporting Tools 105
5.4.1 Code Parser 105
5.4.2 Instrumentation Tool 106
5.4.3 Test Scenario Tool 108
5.5 Summary 109

6 EVALUATION
6.1 Introduction 110
6.2 Evaluating Model 110
6.2.1 Hypothesizing Traces 111
6.2.2 Software Traceability and Artifact Links 114
6.3 Case Study 116
6.3.1 Outlines of Case Study 116
6.3.2 About the OBA Project 117
6.3.3 OBA Functionality 118
6.4 Controlled Experiment 122
6.4.1 Subjects and Environment 123
6.4.2 Questionnaires 123
6.4.3 Experimental Procedures 124
6.4.4 Possible Threats and Validity 125
6.5 Experimental Results 127
6.5.1 User Evaluation 127
6.5.2 Scored Results 134
6.6 Overall Findings of the Analysis 137
6.6.1 Quantitative Evaluation 138
6.6.2 Qualitative Evaluation 140
6.6.3 Overall Findings and Discussion 144
6.7 Summary 145

7 CONCLUSION
7.1 Research Summary and Achievements 146
7.2 Summary of the Main Contributions 149
7.3 Research Limitation and Future Work 150

REFERENCES 152-166
APPENDICES 167-236 (Appendix A – E)

LIST OF TABLES
2.1 An overview of the general SRS, SDD and STD documents 20
2.2 Sample form of PCR (MIL-STD-498, 2005) 23
2.3 A comparative study of traceability approaches 45
2.4 Existing features of software traceability approaches 49
3.1 Benchmark of AIS# and EIS# possibilities (Arnold and Bohner, 1993) 59
3.2 Evaluation for impact effectiveness (Arnold and Bohner, 1993) 60
3.3 Effectiveness metrics 60
4.1 Software documents and corresponding artifacts 65
4.2 Program relations in C++ 77
4.3 Classifications of artifact and type relationships 79
6.1 Summary of model specifications and used techniques 115
6.2 Cross tabulation of experience versus frequencies 127
6.3 Cross tabulation of previous jobs versus frequencies 128
6.4 Cross tabulation of previous jobs versus group distribution 128
6.5 Cross tabulation of experience versus group distribution 129
6.6 Cross tabulation of program familiarity versus group classification 129
6.7 Mean of previous jobs for groups 129
6.8 Mean of experience for groups 130
6.9 Mean of program familiarity for groups 130
6.10 Point average of group distribution 130
6.11 Mean of scores for impact analysis features 131
6.12 Results of bottom-up impact analysis 135
6.13 Results of top-down impact analysis 137
6.14 Results of artifact S-Ratio 138
6.15 Existing features of software traceability systems 141
C1 Summary of object oriented relationships 206
D1 A set of package-class references 231
D2 A set of method-class-package references 232

LIST OF FIGURES
2.1 A Stage Model (Bennett and Rajlich, 2000) 15
2.2 MIL-STD-498 Data Item Descriptions 19
2.3 SADT diagram of software maintenance activities (Bohner, 1991) 25
2.4 SADT diagram of change process (Small and Downey, 2001) 26
2.5 The comparison framework 44
3.1 Flowchart of the operational framework 53
4.1 Conceptual model of traceability system 66
4.2 Hypothesizing traces 67
4.3 Traceability from the requirement perspective 69
4.4 System artifacts and their links 80
5.1 A view of software traceability architecture 83
5.2 Use Case diagram of CATIA system 86
5.3 CATIA class diagrams 88
5.4 CATIA sequence diagrams 89
5.5 First User Interface of CATIA 102
5.6 Primary artifacts at method level 103
5.7 Summary of the impacted artifacts 104
5.8 The Instrumented Code 107
5.9 Some 'markers' extracted from a class CruiseManager 108
5.10 The impacted methods of a class CruiseManager 108
6.1 Hypothesized and observed traces 113
6.2 CSC relationships within a CSCI OBA 118
6.3 Easiness to Use in Overall 131
6.4 Easiness to Find SIS 132
6.5 AIS Support 132
6.6 Cost Estimation Support 133
6.7 Traceability Support 133
6.8 Documentation Support 133
B1 Primary option of requirement artifacts 179
B2 Potential effects and impact summary 180
B3 First CATIA interactive window 181
B4 Primary artifacts at method level 182
B5 Detailed results of secondary artifacts 183
B6 Summary of impacted artifacts by methods 184
B7 Impacted artifacts in a message window 185
B8 Primary artifacts at requirement level 186
B9 Summary of impacted artifacts by requirements 186
B10 A hyperlink between CATIA and documentation 187
C1 Table structures of recognizable tokens 189
D1 Start vehicle class diagram 208
D2 Set calibration class diagram 209
D3 Control cruising speed class diagram 210
D4 Request trip assistance class diagram 211
D5 Fill fuel class diagram 212
D6 Service vehicle class diagram 213

LIST OF ACRONYMS AND SYMBOLS

AIS - Actual Impact Set
ANOVA - Analysis of Variance
AST - Abstract Syntax Tree
CASE - Computer Aided Software Engineering
CBO - Coupling Between Object Classes
CCB - Change Control Board
CI - Component Identification
CMM - Capability Maturity Model
CSC - Computer Software Component
CSCI - Computer Software Configuration Item
CSU - Computer Software Unit
DIF - Documentation Integration Facility
DIT - Depth of Inheritance Tree
FSM - Functional Specification Manual
GEOS - Global Entity Operations System
HTML - Hypertext Markup Language
IA - Impact Analysis
LCOM - Lack of Cohesion in Methods
LOC - Lines of Code
MDD - Model Dependency Descriptor
NOC - Number of Children
OBA - Automobile Board Auto Cruise
OMT - Object Modeling Technique
OOP - Object Oriented Programming
PCR - Problem/Change Report
PIS - Primary Impact Set
RFC - Response for a Class
RUP - Rational Unified Process
SADT - Structured Analysis and Design Technique
SCM - Software Configuration Management
SDLC - Software Development Lifecycle
SDP - Software Development Plan
SIS - Secondary Impact Set
SPEM - Software Process Engineering Metamodel
SPS - Software Product Specification
SRS - Software Requirements Specification
STD - Software Test Description
SUM - Software User Manual
TRAM - Tool for Requirement and Architectural Management
UML - Unified Modeling Language
VG - Cyclomatic complexity value, V(G)
WMC - Weighted Methods per Class
XML - Extensible Markup Language

LIST OF APPENDICES

APPENDIX    TITLE    PAGE

A Procedures and Guidelines of the Controlled Experiment 167-176
B CATIA Manual and Descriptions 177-188
C Code Parser 189-206
D OBA System – The Case Study 207-234
E Published Papers 235-236

CHAPTER 1

INTRODUCTION

1.1 Introduction

This chapter provides an introduction to the research work presented in this thesis. It describes the research overview that motivates the introduction of a document-based software traceability to support change impact analysis of object-oriented software.
This is followed by a discussion of the research background, problem statements, objectives and importance of the study. Finally, it briefly explains the scope of work and the structure of the thesis.

1.2 Introduction to Software Evolution

It is widely accepted that software must change continuously in order to remain relevant in use. The need to change a software system to keep it aligned with users' needs and expectations has long been recognized within the software engineering community. Due to the inherently dynamic nature of business applications, software evolution is seen as the long-term result of software maintenance. In the current decade, the term software evolution is often used as a synonym for software maintenance, broadly defined as the modification of a software product after delivery (IEEE, 1998a), and both software maintenance and evolution assume changing the code as their basic operation.

The term software evolution lacks a standard definition, but some researchers and practitioners use it as a preferable substitute for maintenance (Bennett and Rajlich, 2000). In short, one could say that the evolution of a software system is the result of observing the changes made to the system components over a long time span. Consequently, the study of software evolution has aimed at analyzing the process of continuous change to discover trends and patterns, such as the hypothesis that software evolution behaves as a feedback process (Lehman and Ramil, 2000).

The maintenance process describes how to organize maintenance activities. Kitchenham et al. (1999) identify a number of domain factors believed to influence the maintenance process, namely i) maintenance activity type, ii) product, iii) peopleware and iv) process organization. Maintenance activity type covers corrections, requirements changes and implementation changes. Product deals with the size and product composition.
Peopleware describes the skills and user requests. Process organization manages the group and engineering resources, which include methods, tools and technology.

Software maintenance is basically triggered by a change request. Change requests are typically raised by clients or internal development staff demanding a software modification. Traditionally, software modifications are classified into the maintenance types corrective, adaptive, perfective and preventive. Chapin et al. (2001) view software evolution from a slightly different perspective. They use a classification based on maintainers' activities, grouping software evolution into mutually exclusive clusters ranging over the support interface, documentation, software properties and business rules. Each cluster is characterized by some maintenance types. They conclude that different types of maintenance or evolution may have different impacts on software and business processes.

Whatever the reason for modification, real-world software systems require continuous changes and enhancements to satisfy new and changed user requirements and expectations, to adapt to new and emerging business models and organizations, to adhere to changing legislation, to cope with technology innovation and to preserve the system structures from deterioration (Canfora, 2004). Webster et al. (2005), in their study on risk management for software maintenance projects, propose a taxonomy of risk factors that covers requirements, design, code, engineering specialities and legacy. Some common risk factors identified are incomplete specifications and limited understanding, high complexity of the required change, direct or indirect impacts on the current system's functionalities, and inadequate test planning and preparation.

The fact about software change is that changes arising from user requirements and operations propagate through the software system (Bohner, 2002).
These changes may require substantial extensions to both the database and the code. As change is the basic building block of software evolution, software change is considered a key research challenge in software engineering (Bennett and Rajlich, 2000). A large portion of the total lifecycle cost is devoted to introducing new requirements and removing or changing existing software components (Ramesh and Jarke, 2001). This intrinsically requires appropriate software traceability to manage it. However, at least three problems can be observed by investigating current software traceability approaches. First, most Computer Aided Software Engineering (CASE) tools and applications focus on the high-level software and are aimed at software development rather than maintenance, while the low-level software, e.g. code, is given less priority and is very often left to users to handle. This makes software change impact analysis extremely difficult to manage at both levels. Secondly, some research works exist on change impact analysis (Jang et al., 2001; Lee et al., 2000; Tonella, 2003), but the majority confine their solutions to a limited space, i.e. code, although more evolvable software can be achieved at the meta-model level. Finally, the ripple effects of a proposed change are not made properly visible across different levels of workproducts. If such visibility can be achieved, more concrete estimations can be made to support change decisions, cost estimation and schedule planning.

The key to the above solutions is software traceability. Software traceability provides a platform for establishing the relationships within software and for implementing change impact analysis. All these issues require an extensive study of the traceability and impact analysis in existing software systems and research works before a new model and approach can be decided.
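To make the notion of ripple effects across workproducts concrete, consider the following minimal sketch. The artifact names and link structure below are purely hypothetical illustrations, not taken from the thesis's actual model; the idea is only that transitive impacts can be computed by a breadth-first walk over traceability links.

```python
from collections import deque

# Hypothetical traceability links recorded as "depends-on" edges across
# workproducts (test -> code -> design -> requirement). None of these
# artifact names come from the thesis; they are purely illustrative.
DEPENDS_ON = {
    "test:TC-7": ["code:CruiseControl.setSpeed"],
    "code:CruiseControl.setSpeed": ["design:CruiseControl"],
    "design:CruiseControl": ["req:R-12"],
}

def impacted(changed, depends_on):
    """Return the artifacts transitively affected by a change to `changed`,
    found by a breadth-first walk over the reversed dependency edges."""
    # Invert the edges: if A depends on B, then a change to B impacts A.
    impacts = {}
    for src, targets in depends_on.items():
        for t in targets:
            impacts.setdefault(t, []).append(src)
    seen, queue = set(), deque([changed])
    while queue:
        for nxt in impacts.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# A change to requirement R-12 ripples down through design and code to a test.
print(sorted(impacted("req:R-12", DEPENDS_ON)))
```

A realistic traceability model, such as the one proposed in this thesis, distinguishes link types (horizontal versus vertical) and levels of granularity rather than treating all edges uniformly, but the transitive-closure idea is the same.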
1.3 Background of the Problem

Software traceability is fundamental to both software development and the maintenance of large systems. It concerns the ability to trace information from various resources, which requires special skills and mechanisms to manage. This research focuses on software traceability to support change impact analysis. The following are the major issues underlying the research problem.

1.3.1 Broader Maintenance Perspective

Many researchers have been working on code-based maintenance for software evolution, as mentioned earlier. This type of maintenance is focused but limited, as it deals with a single problem, i.e. source code. However, managing software change at such a restricted level is not enough to appreciate the actual impacts in the software system. Other levels of the software lifecycle, such as requirements, testing and design, should also be considered, as they are parts of the software system.

Observing the impact from a broader perspective is considerably harder, as it involves software traceability within and across different workproducts. Software workproducts refer to the explicit products of software as they appear in software documentation. For instance, test plans, test cases, test logs and test results are the workproducts of the testing document. The architectural design model and detailed design model are the workproducts of the design document. Code is by itself a workproduct of source code. Requirements and specifications are the workproducts of the software requirements specification document. Thus, the broader perspective of change impact observes the impact within and across different workproducts, such as within code and from code to design, specification, etc. Within a workproduct, the software components may be decomposed further into smaller elements, called artifacts. For instance, code can be decomposed into the software artifacts of classes, methods, attributes and variables.
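As an illustration only (the class and artifact names here are hypothetical, not the thesis's actual schema), the workproduct-to-artifact decomposition just described can be modelled as a simple containment hierarchy:

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    """One traceable unit at some level of granularity: a workproduct
    (e.g. source code, a design document) or a finer-grained element
    it contains (class, method, attribute)."""
    name: str
    kind: str
    children: list = field(default_factory=list)

    def flatten(self):
        """Yield this artifact and, recursively, every artifact it
        contains, i.e. decompose a coarse component into finer ones."""
        yield self
        for child in self.children:
            yield from child.flatten()

# The code workproduct decomposed into class -> method/attribute artifacts.
code = Artifact("source code", "workproduct", [
    Artifact("CruiseControl", "class", [
        Artifact("setSpeed", "method"),
        Artifact("currentSpeed", "attribute"),
    ]),
])

print([(a.kind, a.name) for a in code.flatten()])
```

Choosing how deep such a hierarchy should go is exactly the granularity tradeoff discussed above: coarse nodes hide complex relationships, while very fine nodes are hard to reassemble into recognizable components.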
In this thesis, the terms components and artifacts are sometimes used interchangeably to refer to the same thing. In this traceability approach, if the component relationships are too coarse, they must be decomposed in order to understand complex relationships. On the other hand, if they are too granular, it is difficult to reconstruct them into more recognizable, easily understood software components. A tradeoff between these two granularities is therefore needed in order to observe the ripple effects of change.

1.3.2 Effective Change Communication

There is a need to communicate and share information within the software development environment, e.g. between software developers and management (Lu and Yali, 2003). In Software Configuration Management (SCM), for example, the Change Control Board (CCB) needs to evaluate change requests before the actual change is implemented. This requires the CCB to consult other staff such as maintainers, software engineers and the project manager. A maintainer needs to examine, among other tasks, how much and which parts of the program modules will be affected by the change and regression testing, their complexity, and what types of test cases and requirements will be involved. The software engineers need to examine the types of software components to use and, more critically, to identify or locate the right affected software components (Bohner, 2002). The software manager is more concerned about the cost, duration and staffing before a change request is accepted. This certainly requires a special repository to handle software components with appropriate links to the various levels of the software lifecycle. Some of this information is not readily available in any CASE tools, requiring the software development staff to manually explore the major analyses in existing software models and documentation.
1.3.3 Importance of System Documentation

In established organizations with software engineering practices in place, software engineers refer to system documentation as an important instrument for communication (IEEE, 1998). High-level management finds documentation very useful when communicating with developers, and vice-versa. Despite this, many software engineers are still reluctant to make full use of system documentation, particularly when dealing with software evolution (Nam et al., 2004). To them, documentation is abstract, seldom up-to-date and time consuming to use. However, many still believe that documentation is more significant if the software it documents is visible and traceable to other software components. For example, within the documentation the user can visualize the impacted requirements and design classes in the database repository.

1.3.4 Incomplete Maintenance Supported CASE Tools

Many CASE (Computer Aided Software Engineering) tools that exist today are aimed at addressing the issues of software development rather than software maintenance (Pressman, 2004). Some claim that their tools can support both software development and maintenance; however, their applications are mainly centered on managing, organizing and controlling the overall system components, and very few focus on the impact analysis of change requests. Deraman (1998) characterizes current CASE tools as applications that provide special upfront consistency checking of software components during development but tend to ignore the equally important relationships with code as the system proceeds through development and maintenance. The traceability relationships between code and its upper-level functionalities are not explicitly defined and are very often left to software engineers to decide.
From the above scenarios, there is a need to integrate both the high-level and low-level software abstractions such that the effects of component traceability can be applied throughout the system. Two main issues need to be addressed here: firstly, software traceability within a software workproduct, and secondly, traceability across many workproducts. Software traceability in this context reflects the underlying infrastructure of the ripple effects of change, which incorporates both the techniques and the models for implementing a change impact.

1.4 Statement of the Problem

This research is intended to deal with the problems related to requirements traceability for change impact analysis as discussed in Section 1.2. The main question is: "How can an effective software traceability model and approach be produced that integrates software components at different component levels to support change impact analysis in software maintenance?" The sub-questions of the main research question are as follows:

i. Why are the current maintenance models, approaches and tools still unable to support the potential effects of change impact in a software system?
ii. What is the best way to capture the potential effects on software components in the system?
iii. How can the potential effects of a proposed change be measured?
iv. How can the usefulness of software traceability for software maintenance be validated?

Sub-question (i) will be answered via the literature review in Chapter 2. That chapter pays special attention to software evolution, its models and traceability issues. From the impact analysis perspective, it presents a study of the detailed traceability process, the techniques and the existing tools used. The strengths and drawbacks are drawn based on a comparison framework in order to propose a new model and approach to support change impact analysis. This study provides some leverage to answer sub-question (ii).
Chapter 3 describes the design methodology and evaluation plan before the research is carried out. Sub-question (iii) is addressed by a solution for measuring the potential effects. Sub-questions (ii) and (iii) are further elaborated in the traceability modeling and implementation described in Chapters 4 and 5. Lastly, sub-question (iv) leads to the quantitative and qualitative evaluation of the model and approach described in Chapter 6.

1.5 Objectives of the Study

The above problem statement serves as a premise to establish a set of specific objectives that constitute the major milestones of this research. To this end, the objectives of this research are as follows:

1) To build a new software traceability model to support change impact analysis that includes requirements, test cases, design and code.
2) To establish a software traceability approach and mechanism covering features that include artifact dependencies, ripple effects, granularity and metrics.
3) To develop supporting tools for the proposed model and approach.
4) To demonstrate and evaluate the practicability of the software traceability model and approach in supporting change impact analysis.

1.6 Importance of the Study

Software maintenance is recognized as the most expensive phase of the software lifecycle, with typical estimates of more than sixty percent of all effort expended by a development organization, and the percentage continues to rise as more software is produced (Han, 2001). As software changes are introduced, avoiding defects becomes increasingly labor intensive. Due to the lack of advanced technologies, methods and tools, software modification has been difficult, tedious, time consuming and error prone. Software maintainers need mechanisms to understand and solve maintenance tasks, e.g. how a change impact analysis can be made for a software system.
Bohner and Arnold (1996) describe a benefit of change impact analysis thus: "by identifying potential impacts before making a change, we can greatly reduce the risks of embarking on a costly change because the cost of unexpected problems generally increases with the lateness of their discovery." Clearly, software change impact is an important activity that needs to be explored in order to improve software evolution, and traceability is seen as a core infrastructure to support impact analysis.

1.7 Scope of Work

Software traceability can be applied to applications such as consistency checking (Lucca et al., 2002), defect tracking (McConnel, 1997), cross referencing (Teng et al., 2004) and reuse (Ramesh and Jarke, 2001). The techniques and approaches used may differ from one another due to different objectives and feature requirements. Some of these approaches are geared toward system development while others are designed for system evolution.

Within this scope, the research explores software traceability specifically to support change impact analysis, in which it should be possible to capture the impacts of change requests. The models and techniques used should allow the implementation of impacts across different workproducts. The approach needs to capture the software knowledge from the latest correct version of a complete system prior to implementation. It is assumed that a new traceability approach needs to develop some reverse engineering tools, if none are available in the research community, to support and simplify the capturing process. The term complete system here may refer to a very large scope, as it may govern all the software models including the business model, specification, high level design, documentation, code, etc. These models are comprised within the software lifecycle of software specification, design, coding and testing.
However, for the sake of research and implementation, this work will focus on some relevant information, namely a set of functional requirements, test cases, design and code. These software components or artifacts are seen to form the software development baseline (MIL-STD-498, 2005). The software development baseline reflects the software products that are used by, and confined to, the internal development staff rather than the external users or clients to support software evolution. Thus, this baseline model is chosen to represent a smaller scope of a large complete system.

The new work should be derived from a set of system documentation adhering to a certain software engineering standard, e.g. MIL-STD-498 (MIL-STD-498, 2005). Nevertheless, it should not be tied to the documentation environment, as the main focus is not on the traceability and impact in system documentation but rather on its information contents. With a simple interface (not within this scope), it should allow system documentation to view the traceable software components as a result of implementing the software traceability system. In this research, the scope is restricted to object-oriented systems to address change impact analysis. It should be noted that this research is not concerned with correcting the software components but only with reporting them. The software engineers or maintainers should consider these results as assistance to facilitate their initial prediction of change impact.

1.8 Thesis Outline

This thesis covers discussions on the specific issues associated with software traceability for impact analysis and how this new research is carried out. The thesis is organized in the following outline.

Chapter 2: Discusses the literature on software evolution and traceability. A few areas of interest are identified, from which all the related issues, works and approaches are highlighted.
This chapter also discusses some techniques of impact analysis and software traceability. Next is a discussion of some existing models and approaches through a comparative study based on a defined comparison framework. This leads to improvement opportunities that form a basis to develop the newly proposed software traceability model.

Chapter 3: Provides a research methodology that describes the research design, the formulation of research problems and validation considerations. This chapter gives an overview of data gathering and analysis, including benchmarking. It is followed by some research assumptions.

Chapter 4: Discusses the detailed model of the proposed software traceability for impact analysis. A set of formal notations is used to represent the conceptual model of the software traceability. It is followed by some approaches and mechanisms to achieve the model specifications.

Chapter 5: Presents the design and functionality of some developed tools to support the software traceability model. This includes the implementation of the design and component tools.

Chapter 6: The software traceability model is evaluated for its effectiveness, usability and accuracy. The evaluation criteria and methods are described and applied to the model, including modeling validation, a case study and an experiment. This research performs evaluation based on quantitative and qualitative results. Quantitative results are checked against a benchmark set forth, and qualitative results are collected based on user perception and a comparative study of the existing models and approaches.

Chapter 7: The statements on the research achievements, contributions and conclusion of the thesis are presented in this chapter. This is followed by the research limitations and suggestions for future work.
CHAPTER 2

LITERATURE REVIEW ON SOFTWARE EVOLUTION AND TRACEABILITY

2.1 Introduction

This chapter presents the state-of-the-art approaches in software evolution and software traceability. It begins with an overview of the prominent models and approaches in software evolution, followed by the recent uses of software traceability. It presents the concept of the change process, which includes change requests, ripple effects and the definitions of both impact analysis and traceability. It also presents some techniques used in software traceability and impact analysis, and discusses how the two are related to one another. Next, it summarizes a comparative study of the existing software traceability models based on predefined criteria and a comparison framework. Finally, it presents a proposed model and approach of software traceability to support change impact analysis.

2.2 Software Evolution Models

Currently, there is no standard view or generally accepted definition of the term software evolution, and this has led to software evolution and maintenance often being used interchangeably (Bennett and Rajlich, 2000; Chapin et al., 2001). For that reason, instead of reviewing the terminology definitions, the author will summarize the views and comments of other researchers and elaborate his own view on software evolution.

Perry (1994) views software evolution as a well-engineered software system that consists of three inter-related ingredients: the domains, experiences and process. The domains are the real world environments made of objects and processes. It is this model that is the abstraction basis for a system specification which can be turned into an operational system. Business policies, government regulations and technical standards are some examples of the real world environment within the context of a software system.
Experience is the skill gained from the implemented software system, such as feedback, experimentation and the accumulation of knowledge about various aspects relevant to the system. Feedback provides corrective action, while experimentation creates information for the sake of understanding, insight and judgement. The process is a means of interaction in building and evolving the software system using methods, technologies and organizations. While a method may include the techniques and tools, technology supports the automation, and the organization has to make sure that the quality, standards and practices are followed. These three dimensions are considered the fundamental sources of system evolution as they are intrinsically evolving themselves.

Bennett (1996) views software evolution in a slightly different way, as a model consisting of three main levels: the organization, process and technology. The first level focuses on how software evolution meets the organizational goals and needs within a defined time scale and budget constraint. The second level relates to a set of activities for software evolution, including initial requests, change assessment, cost estimation and change impact analysis. The third level refers to the technologies used to support the process, including automated tools, methods and languages.

Bennett and Rajlich (2000) classify software evolution into a staged model that ranges from initial development, iterative evolution and iterative servicing through phaseout and closedown (Figure 2.1). During the initial development, the system is built as a first complete version by the expertise of the software team, meeting the system architectures and specifications. When the initial development is successful, the software enters the iterative evolution stage.
[Figure 2.1: A staged model (Bennett and Rajlich, 2000) — Initial Development produces the first running version; Evolution proceeds until loss of evolvability; Servicing applies patches until servicing is discontinued; then Phase-Out and, at switch-off, Close-Down.]

At this stage, evolution may respond to demands for additional functionality and to competitive pressures, either from the clients or from the internal development staff, for the need to make modifications. As the software evolves over time, the servicing stage may expect no major changes, with minimum involvement of domain or software engineering experts. This situation occurs when major changes become too burdensome and expensive; the software evolution is then said to have reached the saturation stage, or the software is considered aging. Finally, a software system that becomes less useful in its environment and costly to maintain will enter the phaseout and closedown stages. The need and demand for continued change based on the traditional maintenance classification of corrective, adaptive, perfective and preventive cannot be avoided if such software is to remain operationally satisfactory.

The effort to classify software evolution and maintenance by Lehman and Ramil (2002) brings another useful view to the software evolution model. They classify a model of evolution as consisting of five incremental steps:

i) Initial development with complete operation
ii) Improvement on versions, releases or upgrades
iii) Application that supports a wide range of co-operative system services
iv) Process that refers to the aggregate of all activities useful to the above evolution
v) Mastery of the software evolution process

According to Lehman and Ramil (2002), to truly master software evolution one must understand and master the evolutionary properties of all those entities individually and collectively. Based on their findings, software traceability brings the highest value to both software and business processes.
2.3 Software Traceability in Software Engineering

The following sections discuss thoroughly the areas related to software traceability and the change environment as introduced in the first chapter. These areas include software configuration management, system documentation, requirements traceability and software change management.

2.3.1 Software Configuration Management

As software systems become larger and more complex, numerous corrections, extensions and adaptations tend to be more chaotic and unmanageable. The traditional way of addressing maintenance tasks individually is no longer practical. A special management system is needed, called Software Configuration Management (SCM), covering the procedures, rules, policies and methods to handle software evolution (IEEE, 1998b). SCM has been identified as a major part of well defined software development and maintenance tasks (Pressman, 2004). SCM deals with controlling the evolution of complex software systems; it supports version control and administrative aspects such as handling change requests and performing changes in a controlled manner by introducing well-defined processes (Estublier et al., 2004). The IEEE Standard 1219-1998 for software maintenance (IEEE, 1998a) highlights the major functions of SCM (or CM for short) as:

• Identification - CM needs to identify the Configuration Items (CIs) that represent the collection of components and artifacts related to a particular project.
• Change Management - Change management is the process of managing a change request, its evaluation, change approval or disapproval, and the implementation of changes.
• Status Accounting - This task is required to record statistics, to examine the status of a product, and to easily generate reports about all aspects of the product and process.
• Auditing and Review - To ensure that a change was properly implemented, one can rely on information gathered from the technical review and configuration audit.
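The four SCM functions above can be illustrated with a minimal sketch. All class, field and item names below are hypothetical and are not drawn from the IEEE standard or any particular SCM tool; the sketch merely shows how identification, change management and status accounting relate to one another over shared CIs.

```python
from dataclasses import dataclass, field

@dataclass
class ConfigurationItem:
    """Identification: a CI is a named, versioned artifact of the project."""
    name: str
    version: int = 1

@dataclass
class ChangeRequest:
    """Change management: a request is evaluated, then approved or rejected."""
    summary: str
    targets: list = field(default_factory=list)  # names of the CIs to be changed
    status: str = "submitted"                    # submitted -> approved -> implemented

class Repository:
    def __init__(self):
        self.items = {}      # CI name -> ConfigurationItem
        self.requests = []   # history of change requests (for auditing and review)

    def identify(self, name):
        self.items[name] = ConfigurationItem(name)

    def approve(self, cr):
        cr.status = "approved"
        self.requests.append(cr)

    def implement(self, cr):
        """Implementing an approved change bumps the version of each target CI."""
        assert cr.status == "approved"
        for name in cr.targets:
            self.items[name].version += 1
        cr.status = "implemented"

    def status_accounting(self):
        """Status accounting: report the recorded state of the product."""
        return {name: ci.version for name, ci in self.items.items()}

repo = Repository()
for name in ["SRS", "SDD", "login.c"]:
    repo.identify(name)
cr = ChangeRequest("Fix login timeout", targets=["SDD", "login.c"])
repo.approve(cr)
repo.implement(cr)
print(repo.status_accounting())  # {'SRS': 1, 'SDD': 2, 'login.c': 2}
```

Auditing and review would compare the implemented versions against the approved request history kept in `requests`; that step is omitted here for brevity.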
These tasks are not mutually exclusive. SCM is required to provide traceability and mechanisms to support procedures and controls over the shared CIs. The CIs may include all the components composing a software system, such as the requirements, specifications, source code, test cases and design models. The change management and identification functions play an important role in SCM, not only in responding to the need for change but also in serving the more general purpose of effective management of system evolution.

A large number of SCM researchers have been working on version control (Walrad and Strom, 2002; Nguyen et al., 2004; Cui and Wang, 2004), while others work on system hierarchies (Wei et al., 2003; Hagen et al., 2004; Janakiraman et al., 2004) and change management (Small and Downey, 2001; Brown et al., 2005; Keller et al., 2004). Wei et al. (2003) take a step further, emphasizing a meta-model to express configuration items, their relationships and their evolution for complex software product structures. Dorrow (2003) designs and implements a configurable middleware framework for distributed systems; it can address several configuration facets such as fault tolerance, security, power usage and system performance. Desai (2003) takes the initiative to support extensibility, reusability and verifiability by designing a configuration description language to model CM in a heterogeneous cluster environment. This interface incorporates the rapidly changing configurations at the client sites into server configurations for internal working administration. Eclipse (Brand et al., 2003) and ClearCase (White, 2000) are database management tools for program modules, supporting multi-version relationships among software components, configuration management and access control protection.

The study of the current SCM approaches and tools reveals that the change process is part of SCM tasks.
However, the existing change management function emphasizes the check-in and check-out procedures on CIs, authentication and version control. Little or no effort is made to determine the traceability and ripple effects of change within CIs.

2.3.2 System Documentation

System documentation is a software product and a part of software engineering knowledge (Sommerville, 2004). System documentation is vital to software maintainers, who need to understand software thoroughly before making any changes. A maintainer needs to refer to documentation and its standard in order to understand the cause and effect of changes. Regardless of the software process being adopted, developers working on a large-scale software project produce a large variety of software documents, typically including requirements, business plans, design specifications, source code, test plans, bug reports, etc.

There are a few documentation standards available for software development, such as the IEEE Standard 830-1998 and MIL-STD-498. Standards may differ from one another in terms of format templates and structures, although content-wise they can be relatively the same. The IEEE Recommended Practices have established standards for three main software documents, namely the Software Test Document - IEEE Std 829-1998 (IEEE, 1998c), the Software Requirements Specification - IEEE Std 830-1998 (IEEE, 1998d) and the Software Design Description - IEEE Std 1016-1998 (IEEE, 1998f). MIL-STD-498 provides twenty-two types of documentation (i.e. DIDs) covering all phases of the software life cycle. The general structure of the software process with DIDs is shown in Figure 2.2. The figure shows that each phase of the software development lifecycle is supported by some documents, e.g. the Plan phase is supported by the SDP, SIP and STRP, the Concept or Requirements phase is supported by four other documents, and so forth.
[Figure 2.2: MIL-STD-498 Data Item Descriptions (DIDs) — Plans: SDP, SIP, STRP; Concept/Requirements: OCD, SSS, SRS, IRS; Design: SSDD, SDD, DBDD, IDD; Qualification Test: STP, STD, STR; User/Operator and Support Manuals: SUM, SCOM, SIOM, COM, CPM, FSM; Software Products: SPS, SVD.]

From investigation, the IEEE Standard and the MIL-STD-498 Standard (including IEEE/EIA 12207 compliance) possess three documentation standards in common, i.e. the Software Requirements Specification (SRS), Software Design Description (SDD) and Software Test Document (STD). The descriptions of these documents are given in Table 2.1.

In general, software development environments are supposed to provide tools to support software documents. Several works have applied hypertext technology to improve software documents and their relationships. SLEUTH (Software Literacy Enhancing Usefulness to Humans) supports an on-line documentation environment (French et al., 1997). It provides an environment for software documentation management consisting of both an authoring and a viewing environment. The environment applies FrameMaker, which integrates hypertext into cross-referenced filter configuration files. In SLEUTH, system documentation is maintained as a hypertext with typed links to other associated documents, and the links are always kept up to date.

Table 2.1: An overview of SRS, SDD and STD documents

Software Requirements Specification (SRS): The SRS document describes recommended approaches for the software requirements. The purpose is to explain the scope of the project, references made to other standards, definitions of specific terms used, background information and essential parts of the system.

Software Design Description (SDD): The SDD document describes the necessary design information, including high level and low level designs. The practice is not limited to specific methodologies for design, configuration management or quality assurance. It can be applied to paper documents, automated databases, design description languages or other means of description.

Software Test Document (STD): The STD document describes a set of basic software test information that covers the test plan, test specification and test report. The test plan prescribes the scope, approach, resources and schedule of the testing activities. The test specification covers the test cases, test procedures and input/output assumptions. The test report identifies the test items, test log and summary report.

CHIME (Customizable Hyperlink Insertion and Maintenance) extends the hypertext capability by integrating relationships between documents and code (Devandu et al., 1999). It inserts HTML tags and links in the source code by querying a database produced by existing static analysis tools. SC (Software Concordance) is similar to CHIME but is designed to manage versioning, i.e. it can keep and update the links as the code and documents evolve (Nguyen et al., 2004). Hartmann et al. (2001) employ an approach to document software systems based on XML and the View II tool in a reverse engineering environment. It focuses on the use of specific Document Type Definitions (DTDs) as a standard for software documentation. Sulaiman (2004) employs DocView to visualize and re-document some structural information from code.

In the above discussion, some works attempt to associate documentation with other software entities, e.g. code, while others attempt to support better visualization within the documentation environment. Each work may use a different approach to software traceability to support its own application. However, it is observed that no proper attempt has been made to establish traceability between high level and low level components, such as from requirements to low level design and code.
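The missing high-to-low-level association can be sketched as a store of typed traceability links between document artifacts. The artifact identifiers and link types below are invented for illustration, assuming an SRS requirement traced through an SDD design element down to code and to a test case in the STD:

```python
from collections import defaultdict

class TraceStore:
    """Typed traceability links between artifacts at different abstraction levels."""
    def __init__(self):
        self.forward = defaultdict(list)   # source artifact -> [(link_type, target)]
        self.backward = defaultdict(list)  # reverse direction, for upward tracing

    def link(self, source, link_type, target):
        self.forward[source].append((link_type, target))
        self.backward[target].append((link_type, source))

    def trace_down(self, artifact):
        """All lower-level artifacts transitively reachable from an artifact."""
        seen, stack = set(), [artifact]
        while stack:
            node = stack.pop()
            for _, target in self.forward[node]:
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

store = TraceStore()
store.link("SRS:REQ-4", "realized_by", "SDD:ClassDiagram.Account")
store.link("SDD:ClassDiagram.Account", "implemented_by", "code:Account.java")
store.link("SRS:REQ-4", "verified_by", "STD:TC-11")

print(sorted(store.trace_down("SRS:REQ-4")))
# ['SDD:ClassDiagram.Account', 'STD:TC-11', 'code:Account.java']
```

Because every link is recorded in both directions, the same store answers the reverse question (which requirement does `code:Account.java` realize?) via `backward`, which is the reciprocal tracing that the surveyed works lack.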
It is anticipated that the existing information in the SRS, SDD and STD needs to be recaptured and reorganized into an appropriate repository in order to serve the new need for a traceability mechanism. Because component dependencies complicate the matter, special attention is required to retranslate and transform all the related sources into a special database repository. This repository then needs an appropriate mechanism to implement change impact analysis.

2.3.3 Requirements Traceability

There is a need to relate one model to another, which constitutes traceability, e.g. from design to code. Ramesh (2001) describes traceability as the ability to trace the dependent items within a model and the ability to trace the corresponding items in other models. Research on software traceability has been widely explored over the last two decades, supporting many applications such as regression testing (Jang et al., 2001), change impact (Lee et al., 2001) and visualization (Bohner and Gracanin, 2003). There are efforts to simplify software evolution at the high level of the software system, involving business requirements and specifications. For example, Wan Kadir (2005) develops a meta model, process and tool called BROOD (Business Rule-Driven Object-Oriented Design) that links business rules to software design.

Traceability is fundamental to the software development and maintenance of large systems. It is the ability to trace from high level abstractions to low level abstractions, e.g. from a requirement to its implementation code, or vice versa. This kind of traceability is called requirements traceability (Ramesh, 2001). Pursuant to this, Burd and Munro (1999) assume that requirements traceability implies that all models of the system are consistently updated; otherwise it is impossible to implement software traceability in the system. Many software engineering standards governing the development of systems (e.g.
IEEE/EIA 12207) emphasize the need for requirements traceability to be included in the software development process and treat it as a quality factor towards achieving process improvement. Process improvement via the CMM (Capability Maturity Model) is a community-developed guide for evolving towards a culture of software engineering excellence for organizational improvement (Saemu and Prompoon, 2004). To better manage change impact analysis from a broader perspective, a traceability approach is needed that provides a rich set of coarse-grained and fine-grained granularity relationships within and across different levels of software models.

Sneed (2001) relates a traceability approach to a repository by constructing maintenance tasks that link the code to testing and concept models. His concept model seems too generalized, as it includes the requirements, business rules, reports, use cases, service functions and data objects, but no formalism was described. A tool was used to select the impacted entities and pick up their sizes and complexities for effort estimation. Bianchi et al. (2000) introduce a traceability model to support impact analysis in an object-oriented environment. However, their work involves a largely manual process, which is time consuming and error prone. Lindvall and Sandahl (1998) present a traceability approach based on domain knowledge to collect and analyze software change metrics related to impact analysis for resource estimates. They relate some change requests to the impacted code in terms of classes, but no requirements or test cases are involved. Investigation of requirements traceability reveals that although the above works apply requirements traceability for impact analysis, their emphasis and approaches do not allow a direct reciprocal link between artifacts.
A new requirements traceability approach can be proposed to support direct links between requirements at the high level and other artifacts at the low level, or vice versa, in order to observe the potential effects of change.

2.3.4 Change Management Process

A maintenance task is typically initiated by a change request when there is a need to make a software change. Change requests are generally expressed in natural language by users or clients. The lexicon used to describe a concept may vary, as users, designers, programmers and maintainers use different words to describe essentially the same or similar concepts. For this reason, software developers may help clients prepare a change request so that the written problem description can be easily understood and analyzed. Table 2.2 represents a sample form of change request, also known as a PCR (Problem Change Report).

[Table 2.2: Sample Form of PCR (MIL-STD-498, 2005)]

As far as the change process is concerned, it should include tasks such as evaluating a change request or proposed change before the actual changes take place. To put change impact analysis in perspective, one first needs to understand the process of change. Madhavji (1992) defines the process of change as consisting of the following stages:

i) Identify the need to make a change to a software component in the environment.
ii) Acquire adequate change related knowledge about the item.
iii) Assess the impact of a change on other items in the environment.
iv) Select or construct a method for the process of change.
v) Make changes to all the items and ensure their inter-dependencies are resolved satisfactorily.
vi) Record the details of the changes for future reference, and release the changed item back to the environment.

For large software systems, change requests are handled by the SCM via the Change Control Board (CCB). The CCB is a special user or unit responsible for approving and authorizing changes made to the software system.
In general, the CCB requires software engineers or maintainers to provide input to the change impact analysis. The result should address many issues, such as the strategy or plan for the change process, the organizational impact and the resources used; one of the most challenging tasks is to identify which other software components would be affected, along with their sizes and complexities.

Bohner (1991) provides a view of change activities, i.e. how a change process is managed and controlled, shown as a SADT (Structured Analysis Design Technique) diagram in Figure 2.3. The sub process "Manage Software Maintenance" sets the objective of the proposed change request. The sub process "Understand Software Change and Determine Impact" determines all possible effects of the change. This activity can be automated or semi-automated over the parts of the software involved to produce the current impacts. The current impacts should provide valuable information to maintenance tasks in terms of the number of affected modules, their sizes, complexity, etc. With the current scope, management can determine whether or not to proceed with the subsequent tasks. The sub process "Design Software Change" is a continuing process, generating requirements and design specifications for the change. Here, the maintainer takes the initiative to model the system to meet the requirements of the change. The sub process "Implement Software Change" is the process of making the actual change. This activity also accounts for the ripple effect, that is, the propagation of changes to other code modules or other workproducts. This serves to check the effectiveness of the original impact analysis made earlier. The sub process "(Re) Test Affected Software" is regression testing performed on the changed software. This is to make sure that the new changes meet the requirements of the proposed change.
[Figure 2.3: SADT diagram of software maintenance activities (Bohner, 1991) — the sub processes Manage Software Maintenance, Understand Software Change and Determine Impact, Design Software Change, Implement Software Change and (Re) Test Affected Software, connected by change requests, impacts, requirements and design specifications and ripple effects, with attributes including traceability, stability, complexity, modularity, adaptability, testability and completeness.]

Small and Downey (2001) describe change activities in a slightly different way. They categorize the change process into six subsequent phases (Figure 2.4), i.e. management, analysis and prediction, design, implementation planning, resource preparation and change implementation. A0 represents the scope in which changes are conducted, encompassing all the phases. The phases can be described as follows.

[Figure 2.4: SADT diagram of change process (Small and Downey, 2001)]

Phase A1, Manage Changes, takes responsibility for planning and directing all the change activities. It monitors the environment, change situation and products. It also provides change information and recommendations internally and externally, to the enterprise and its environment. Phase A2, Analyze Situation and Predict, analyzes the possible effects and scope as required for managing change, and in support of further re-designing the enterprise. The change predictions and evaluations consider the emerging new vision, so iteration with A3 is expected. Phase A3, Re-Design Enterprise, re-designs the model and specification for the change process. This strategy requires inspiration and the involvement of experts and managers from the management and supporting resources, e.g. information technology and human resources.
Phase A4, Plan to Implement Changes, plans the preparation of schedules, the identification of resources for change, the change-implementation sequencing and the roles. Phase A5, Prepare Resources to Implement, sets up in advance all resources needed for implementing the changes, including personnel hiring and training. Phase A6, Implement Changes, performs the changes as planned.

All the above scenarios focus on the management aspect of the change process within the software maintenance environment. Madhavji emphasizes brief tasks to assess impact and implement change. Bohner elaborates the steps of change management from the initial objectives of change knowledge through its implementation and testing. Small and Downey (2001) give careful consideration to the change procedures by adding more planning to direct changes, including the preparation of schedules and resources. The above scenarios complement each other in change management, and change prediction is seen as a prerequisite to change implementation. Change prediction requires special knowledge of impact analysis and is regarded as one of the research challenges in software engineering (Finkelstein and Kramer, 2000). The main issue now is how to perform change impact analysis and its subsequent traceability. These issues are discussed extensively in the next section.

2.4 Review and Further Remarks on Software Evolution and Traceability

In summary, the views on software evolution discussed at the beginning of this section emphasize two important issues: the environment that generates changes and the process that describes the way to perform changes. The environment represents the infrastructure that governs change management, such as the SCM, documentation and requirements traceability.
Based on the objective evidence in software maintenance, software traceability is considered one of the most important sources of support for software evolution, as it governs the software change process that may later be linked to support other parts of the software lifecycle. The process refers to the issues related to the software change process, such as software technology, methods, tools and techniques. A solution to software evolution problems is likely to succeed by addressing both software traceability and the change process. We can conclude that SCM refers to the management of project coordination and the control of changes, including version control for software productivity. Documentation is concerned with all the software documents of software development activities as specified in the software lifecycle. Software traceability is the key element in all these dimensions as it can support software development and evolution. The software change process is seen as pivotal to software evolution, and its success is largely dependent on the software traceability surrounding it. A more careful consideration of the reviewed issues and problems in software evolution leads to the remarks below.

• The establishment of SCM as a discipline of software engineering demands better management and control over CIs (Estublier et al., 2004). However, the majority of approaches have overemphasized version and release control of source files, and have neglected other equally important requirements, e.g. the analysis and evaluation of change requests. These requirements demand a rich set of fine-grained relationships among the CIs, which many commercial SCM tools tend to ignore.

• Current CASE tools do support traceability relationships among software components, enabling functionalities such as visualization, navigation, cross-references, browsers and consistency checking between components in the system.
However, their objectives are geared towards managing, organizing and controlling the overall system components rather than establishing links between coarse-grained and fine-grained granularities, e.g. code to requirements.

• Managing change impact analysis from a broader perspective is more crucial and challenging as it involves software dependencies within and across different software lifecycle phases. The lack of software techniques, methods and tools makes software evolution significantly hard to manage.

• The importance of software documentation is well understood by software developers. However, many remain skeptical about its use in software evolution, perceiving documentation as seldom updated, obsolete and time consuming to maintain. Nevertheless, many agree that its use becomes more significant if more information can be made traceable within the software components of system documentation (Nam et al., 2004), e.g. software traceability for change impact analysis.

The above discussion reveals that software traceability is central to software evolution. The software change process is coupled to all other software entities and requires a special study of its own software traceability. We believe that the results of our proposed software traceability for change impact analysis will benefit other software entities in their effort to reduce maintenance tasks.

2.5 Impact Analysis Versus Traceability

Impact analysis and software traceability are two interrelated issues. The following section discusses their definitions, techniques and uses in support of the change process.

2.5.1 Definition of Impact Analysis and Traceability

Turver and Munro (1994) define impact analysis as “the assessment of a change, to the source code of a module on the other modules of the system. It determines the scope of a change and provides a measure of its complexity”.
Bohner and Arnold (1996) define impact analysis as “identifying potential consequences of a change, or estimating what needs to be modified to accomplish a change”. Both definitions emphasize the estimation of potential impacts, which is crucial in maintenance tasks because what is actually changed will not be fully known until after the software change is complete. Impact analysis is used in applications such as cost estimation, schedule planning, risk management, visualization and change impact. It can also be applied to a system or part of a system for a specific purpose, e.g. analysis of code for regression testing. Software impact analysis, or impact analysis for short, offers considerable leverage in understanding and implementing change in a system because it provides a detailed examination of the consequences of changes in software. Impact analysis provides visibility into the potential effects of proposed changes before the actual changes are implemented. The ability to identify the change impact or potential effect will greatly help a maintainer or management to determine the appropriate actions to take with respect to maintenance tasks.

The related works on impact analysis are mainly focused on code. For example, the Object-Oriented Test Model Environment (OOTME), a testing prototype tool, provides a graphical representation of an OO system that supports program relationships such as inheritance, control structures, uses, aggregation and object state behavior (Kung et al., 2000). It constructs a firewall method to detect change impacts at the data, class, method and class library levels. However, the current main use of the OOTME is to derive test cases for regression testing across functions and objects. Lee et al. (2000) claim that total automation is possible to measure the effect of proposed changes based on source code.
She develops an algorithmic approach called the Object-Oriented Data Dependence Graph (OODDG) that integrates the DDGs of intra-methods and inter-methods. An automated impact analysis tool, called chAT, was then developed that applies the OODDG graph technique to compute ripple effects. Hoffman (2000) extends the ripple-effect capability by using dynamic analysis. He developed an integrated methodology, Comparative Software Maintenance (CSM), to evaluate and measure testing coverage. With its dynamic analysis capability, it helps narrow down test tracking by implementing automatic instrumentation of the modified source code. Hutchins and Gallagher (1998) developed a tool for visualizing impact analysis, graphically depicting the dependencies between C functions and global variables. The interesting aspect of their research was the use of a debugger to dynamically trace dependencies.

There is a need to extend the concept of impact analysis to a broader perspective as suggested in Chapter 1. Implementing impact analysis at the broader perspective implies extending impact analysis from one model to another, e.g. from the code model to the design model. This is called traceability. Gotel classifies software traceability into two categories: vertical and horizontal (Gotel, 1994). Vertical traceability represents traceability within an individual model, and horizontal traceability represents traceability across different models of workproducts. Notice that this thesis sometimes uses the terms software models and software levels interchangeably; both are meant to be the same. Antoniol et al. (2000) implement traceability to check for inconsistencies between code and design models. The process focuses on design artifacts expressed in OMT (Object Modeling Technique) notations with the code written in C++. It recovers an ‘as is’ design from the code and compares the recovered design structures with the actual design structures. French et al.
(1997) implement software traceability within a documentation environment to improve its readability and cross referencing. Gupta et al. (2003) apply traceability between documentation and code to map the occurrence of essential components in both entities. Huang et al. (2003) use traceability in a distributed development environment to maintain links and update impacted artifacts. Lindvall and Sandahl (1998) developed a requirements traceability approach that links requirements to code classes. It is important to note that impact analysis and software traceability overlap in concept and use but differ in scope. Impact analysis can be viewed as the general term, although it is often applied to a specific workproduct, e.g. code, while traceability relates to the extended impact that links artifacts across different levels of workproducts.

2.5.2 Vertical Traceability

Vertical traceability refers to the association of dependent items within a model. For source code analysis, automated tools detect and capture dependency information, evaluating data, control, and component dependencies. A supporting method, called program slicing, is a basic technique first introduced by Weiser (1984) that takes advantage of data-flow and control-flow dependencies to capture "slices" of programs. Techniques for dependency analysis are used extensively in compilers and source code analysis tools. Modeling data-flow, control-flow, and component-dependency relationships are fruitful ways to determine software change impacts within a set of source code artifacts. Although dependency analysis focuses narrowly on impact information captured from source code, it is the most mature impact analysis technique available. This section describes the techniques used in program dependency analysis: data dependency, control dependency and component dependency.
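Before looking at each technique in turn, a minimal backward-slicing sketch illustrates how data-flow dependencies are exploited. This follows data flow only (Weiser's full algorithm also follows control dependencies), and the statements are invented for illustration:

```python
# Minimal backward-slice sketch over straight-line code, data flow only.
# Each statement: (line number, set of defined vars, set of used vars).
stmts = [
    (1, {"a"}, set()),      # a = input()
    (2, {"x"}, {"a"}),      # x = a + 1   (does not affect d)
    (3, {"b"}, {"a"}),      # b = a * 2
    (4, {"d"}, {"b"}),      # d = b + 10
]

def backward_slice(stmts, criterion_vars):
    """Return the lines that may affect the criterion variables."""
    relevant, sliced = set(criterion_vars), set()
    for line, defs, uses in reversed(stmts):
        if defs & relevant:            # statement defines a relevant variable
            sliced.add(line)
            relevant = (relevant - defs) | uses
    return sorted(sliced)

print(backward_slice(stmts, {"d"}))    # -> [1, 3, 4]; line 2 is sliced away
```

The worklist of "relevant" variables is what data-flow analysis maintains; statement 2 is excluded because nothing it defines reaches the slicing criterion d.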
2.5.2.1 Data Dependencies

Data dependencies are relationships amongst program statements that define or use data. A data dependence exists when a statement provides a value used directly or indirectly by another statement in a program (Horwitz et al., 1990). Data definition and use graphs are typical representations of these dependencies that assist software engineers in understanding key data flows. Key data flows need to be established among data objects when the values held by an object are used to calculate or set other values. Thus, data-flow analysis produces dependency information on what data is used and where it goes in the software system.

2.5.2.2 Control Dependencies

Control dependencies are relationships among program statements that control program execution (Horwitz et al., 1990). Control-flow analysis provides information on the logical decision points in a software system and the complexity of their structure. Control-flow technology identifies procedure-calling dependencies, logical decisions, and the conditions under which each decision holds true. In a control-flow graph, each node corresponds to a statement or a set of program statements and each arc (directed edge) indicates the flow of control from one statement (or set of statements) to another. These paths are typically represented as graphs or tables to aid in understanding the program's control flow. Examples of tools that employ control-flow analysis techniques are the McCabe Battlemap Analysis Tool (McCabe, 2005) and CodeSurfer (GrammaTech, 2005; Anderson and Teitelbaum, 2003). These tools decompose source code into its control elements to create a view of the program that specifies the control flows for analysis.

2.5.2.3 Component Dependencies

Component dependencies relate to general relationships among source-code components such as modules, files, and test runs.
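A minimal sketch of how such module-level dependencies can be recovered mechanically, here by scanning import statements; the file names and contents are invented for illustration:

```python
import re

# Hedged sketch: component (module-level) dependencies recovered by
# scanning "import X" statements in each file's text.
files = {
    "billing.py": "import orders\nimport tax\n",
    "orders.py": "import stock\n",
    "stock.py": "",
    "tax.py": "",
}

def module_deps(files):
    """Map each module name to the list of modules it imports."""
    deps = {}
    for name, text in files.items():
        mod = name.rsplit(".", 1)[0]
        deps[mod] = re.findall(r"^import (\w+)", text, re.MULTILINE)
    return deps

print(module_deps(files)["billing"])   # -> ['orders', 'tax']
```

Real cross-referencers work from parsed symbol tables rather than regular expressions, but the recovered artifact is the same: a graph of which component refers to which.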
Many techniques to detect component dependencies are supported by software engineering tools such as cross-referencers (Teng et al., 2004), test-coverage analyzers (Briand and Pfahl, 2000), and source-code comparators (Cooper et al., 2004). Cross-referencers help find impacts by indexing software information captured in one part against another part. As they bring together all object references (e.g. to a data field, disk file, flag, module, and so on), they help software engineers understand relationships among software artifacts. Test-coverage analyzers identify parts of the system that were executed during testing and indicate how processing time was spent. Test-coverage analyzers can instrument programs being tested and collect runtime statistics on coverage during execution. Thus, they are highly effective in assisting testers to determine the areas covered and uncovered by the test cases. Source-code comparators help in determining the source of changes in the code after the changes have been made. For instance, one may have changed the program at different times for different reasons. To determine the delta of change at the current stage, he or she can use a source-code comparator to identify the changed and unchanged statements.

2.5.3 Horizontal Traceability

Horizontal traceability refers to the association of corresponding items between workproducts. Software engineers very often refer to horizontal traceability simply as traceability. In general, it can be classified into four major techniques, namely explicit links, implicit links, name tracing and concept location.

2.5.3.1 Explicit Links

Explicit links can be derived from software knowledge that is already structurally constructed. Structural knowledge is syntactic information derivable from source code or other models that includes control flows, data flows and component dependencies. Structural links constitute traceability associated with syntactic dependencies between software models.
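Such structural links can be extracted mechanically from an abstract syntax tree. A sketch in Python (the thesis targets C++, where the same idea applies through a C++ parser; the class and function names here are invented):

```python
import ast

# Hedged sketch: recover explicit structural links (inheritance and calls)
# from source text by walking its abstract syntax tree.
src = """
class Account: pass

class Savings(Account):
    def interest(self):
        return compute_rate()
"""

tree = ast.parse(src)
links = []
for node in ast.walk(tree):
    if isinstance(node, ast.ClassDef):
        for base in node.bases:
            if isinstance(base, ast.Name):
                links.append(("inherits", node.name, base.id))
    elif isinstance(node, ast.FunctionDef):
        for sub in ast.walk(node):
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                links.append(("calls", node.name, sub.func.id))

print(links)
```

Each tuple is one explicit link; collected over a whole system, these tuples form the dependency database that impact analysis queries.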
In control flows, a function may need to transfer control to other parts of the program in different classes. Data flows may extend their uses to other parts of the program through relations such as inheritance, association, aggregation, etc. These program relations will be discussed in Section 4.4.1. Software engineers can apply these explicit links to reuse, retrieve or classify information between high-level and low-level software, e.g. between code and design. Antoniol et al. (2000) present a method to trace C++ classes to an OO design. Both the code classes and the OO design are translated into Abstract Object Language (AOL) intermediate representations. The AOLs extracted from the design and the code are then compared using a special algorithm that computes the best mapping between code classes and entities of the OO design. Buchsbaum et al. (2001) have developed a hybrid approach that integrates logic-based static and dynamic visualization to help determine design-implementation congruence at various levels of abstraction. Gupta et al. (2003) exploit a software concordance model to support documentation using hypermedia Abstract Syntax Tree (AST) technologies in software development and maintenance. It unifies the XML documentation design with source code using AST data structures.

2.5.3.2 Implicit Links

At the level of high-level software functionalities, where component links are poorly recognized in the program, many software components are associated with each other by implicit links (Antoniol et al., 2000). Implicit links are not easily implemented unless some semantics are made available. For example, when a linked list is chosen to implement a stack, it requires a table of data structures rather than encapsulation of the data-abstraction table in a class (an interleaving decision). If the implementation of the table is changed, i.e.
the performance requirement in the Software Requirements Specification (SRS) document is changed, the ripple effect reaches all the classes that should be changed accordingly. Such a ripple effect cannot be detected by analyzing dependencies and coupling among classes; rather, a cognitive link between the performance requirement expressed in the SRS document and the design classes implementing it should reveal this ripple effect. In another example, sharing an input or output format generates links that cannot be traced by analyzing traditional dependencies and coupling. A change in the input-output file format will impact other classes. These examples of design decisions produce both vertical and horizontal cognitive links. Unfortunately, design decisions are not usually recorded and updated in the documentation, so maintainers are often obliged to perform expensive design tracing processes before changing a software system. Moreover, high-level design decisions may be difficult to trace because they are not always code-related. This explains why maintainers usually execute impact analysis by analyzing syntactic dependencies among code components, or at least by employing name tracing techniques, but disregard cognitive links unless they can interview system experts to obtain some semantic information.

2.5.3.3 Name Tracing

Name tracing can be used to trace homonymous software entities among different models, and requires a consistent use of naming conventions throughout the software (Marcus and Maletic, 2003). In the systematic understanding of a large program, a programmer tends to know some useful components in the system. When intuition and experience fail to provide an immediate answer, programmers become more systematic in locating the needed concepts. In searching for the location of a concept, e.g. “outstanding orders”, the programmer may search for the words “outstanding”, “outstanding_orders”, “outstandingOrders”, etc. Antoniol et al.
(2002) use this technique to identify traceability links between design and code in object-oriented software. It is performed by searching for items with names similar to the ones in the starting model. Name tracing is well known in most Unix systems via the string pattern matching utility "grep", and is sometimes called the "grep technique" (Maarek et al., 1992). Depending on the application, this technique is very useful when one needs to search for all the possible occurrences of an item without considering the semantics of the object or the program syntax. Marcus et al. (2003) extend the name tracing approach by applying Latent Semantic Indexing (LSI) to automatically identify traceability links from system documentation to program source code. LSI captures a significant portion of the meaning of words, sentences or paragraphs by checking on particular word occurrences. The most important element of this method is to produce a profile of the abbreviated description of the original documents so that their relationships can be established. With the help of natural language texts, clustering and name tracing, some predefined identifiers are retrieved from both documentation and code to support easy tracing.

2.5.3.4 Concept Location

Concept location is the process of locating a feature or concept in a software system (Rajlich and Wilde, 2002). A concept is relatively easy to handle in a small system that the programmer fully understands. For a large and complex system, it can be significantly more difficult. Concept location assumes that a maintainer understands the concepts of the program domain, but does not know where in the code the concepts are located. It is directly applicable to program comprehension, as programmers or maintainers always have some pre-existing knowledge; otherwise the process of comprehension would not be possible (Rajlich and Wilde, 2002).
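Name tracing is often the first step of concept location. A minimal sketch of the grep technique described above: generate naming-convention variants of a concept and scan the source for any of them (the source lines below are invented for illustration):

```python
import re

# Hedged sketch of the "grep technique": derive snake_case, camelCase and
# concatenated variants of a concept, then search source lines for them.
def name_variants(concept):
    words = concept.lower().split()
    camel = words[0] + "".join(w.title() for w in words[1:])
    return {"_".join(words), camel, "".join(words)}

source = [
    "int outstanding_orders = 0;",
    "void report() { print(total); }",
    "int getOutstandingOrders();",
]

pattern = re.compile(
    "|".join(map(re.escape, name_variants("outstanding orders"))),
    re.IGNORECASE)
hits = [line for line in source if pattern.search(line)]
print(hits)   # matches the first and third lines, not the unrelated second
```

As the surrounding text notes, this ignores program semantics entirely: it finds name occurrences, nothing more.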
Concept location, sometimes known as domain knowledge, is normally used by experienced software developers when tracing concepts using their knowledge of how different items are interrelated. Note that several simplified definitions of what a concept is have appeared in the literature. They can be classified into four types as follows.

1) A concept that is equivalent to objects in an object-oriented program, e.g. each class represents a concept.

2) A single value, when the program domain is too trivial to have a class of its own. For example, the concept "sales discount" in the program may be implemented as a single integer rather than having its own class.

3) A concept that is spread across several classes. For example, the “purchase order” package of an inventory system is implemented in several classes.

4) Another simplified notion of concept is called the lattice concept (Tonella, 2003). According to this definition, there is a fixed set of attributes and a concept is a specific subset of these attributes. The subsets constitute a lattice and therefore concepts also constitute a lattice. However, this notion does not cover the full range of concepts encountered by the programmer, although in certain cases it can be useful.

So, the issue here is to know how to locate the implementation of a concept, e.g. to locate a requirement in the source code, where the requirement is a concept. In a small and well designed program, each functionality may be located in a single identifiable method or function. However, in a large program, it is more likely that the functionality is coded as a delocalized plan (Snelting, 2000). In a delocalized plan, pieces of code that are conceptually related are physically located in noncontiguous parts of the program.
Concept location can be implemented using one or more of the previously mentioned techniques, namely

• Explicit links
• Implicit links
• Name tracing

Lindvall and Sandahl (1998) implement domain knowledge in their case study of a maintenance project by interviewing software experts about the change relationships of software artifacts. All the useful information and cognitive links are captured to support requirements traceability. Egyed (2003) applies explicit links of program slicing via dynamic analysis. A dynamic analysis is conducted by implementing test scenarios over some test cases, and the impacted components are collected for further analysis. Sneed (2001) uses some semantics at the high management level to implement concept location.

2.5.4 Review on Impact Analysis and Traceability Techniques

Impact analysis is widely used in many applications such as regression testing, visualization, consistency checking and change impact. Meanwhile, traceability has the ability to widen the scope of impact analysis to different levels of a software system. Program slicing, data flow and control flow form the basis of support for change impact analysis. These techniques provide promising results when applied to small code or programs. As a project grows, the code becomes larger and more complex, and managing impact analysis at the level of individual statements is no longer practical. Software engineers or maintainers may prefer to manage the software artifacts at a larger granularity of impact. For example, in large software systems, identifying the impacted methods, classes or packages is more efficient than identifying individual statements (Zhao, 2002). When those artifacts are explicitly recognized, a user can intuitively spot the individual statements that need to be modified. This thesis is interested in change impact analysis that includes not only the code but also the design, specifications and requirements.
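At any of these levels, estimating the ripple effect reduces to reachability over a dependency graph: everything transitively dependent on the changed artifact is potentially impacted. A minimal sketch with invented artifact names:

```python
# Impact propagation as reachability. Edges point from an artifact to the
# artifacts that depend on it; the graph below is invented for illustration.
deps = {
    "Order":     ["Invoice", "OrderTest"],   # changing Order impacts these
    "Invoice":   ["Billing"],
    "Billing":   [],
    "OrderTest": [],
}

def impact_set(changed, deps):
    """Transitive closure of dependents of `changed` (worklist traversal)."""
    seen, work = set(), [changed]
    while work:
        node = work.pop()
        for dep in deps.get(node, []):
            if dep not in seen:
                seen.add(dep)
                work.append(dep)
    return sorted(seen)

print(impact_set("Order", deps))   # -> ['Billing', 'Invoice', 'OrderTest']
```

The same traversal works whether the nodes are statements, classes, design elements or requirements; what changes between levels is only how the edges are obtained.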
The first question is how to get the code linked to the higher level components, e.g. requirements. This is where concept location of software knowledge comes into play. Some strategies were carefully studied, as this affects the accuracy and completeness of the impacted artifacts. Performance criteria are not the main focus of this research. A typical scenario of program structures is better explained in terms of top-down and bottom-up strategies (Mayrhauser and Lang, 1999). Functional knowledge established by a top-down strategy supports the need for a well structured hierarchy of concepts. These structures, supported by data-flow and control-flow dependencies, provide a good search space for software knowledge. In the implementation of change impact analysis, one would like to identify the ripple effects of a software artifact. If a software artifact (e.g. a variable x) is changed, it will affect other callers and parts of the program that depend on it via program relations. Calling invocations are the control dependencies, while other program relations such as inheritance, friendship, aggregation, etc. are the data dependencies that need to be captured from the code. If one extends the concept of a variable x to a class, to represent a more global meaning of a software artifact (to be changed), then it applies to the component dependencies discussed earlier. Thus, software traceability needs to consider these three types of explicit links when dealing with change impact analysis, i.e. data dependencies, control dependencies and component dependencies. Name tracing, or string pattern matching, is another traceability means widely used in the programming environment. However, name tracing has serious deficiencies. It often fails, particularly when the concepts are implicitly available in the code but are not traceable or comprehensible by any program identifiers.
Name tracing is not appropriate in the current research context because it ignores syntactic program dependencies, information that is very useful in accounting for change impact. The concern now is how to establish links between requirements and implementation code. In the newly proposed model, static analysis by itself is not sufficient to implement horizontal traceability. Static analysis may be acceptable within code, but to link code to higher levels, e.g. requirements, a more concrete solution of concept analysis is needed. One acceptable approach is to use dynamic analysis, known as the reconnaissance technique (Wilde and Casey, 1996). This technique is implemented based on the execution of test cases, also known as test scenarios. As a requirement is characterized by one or more test cases, the implementation of each requirement can be materialized by performing these test cases and capturing the impacted software components in terms of methods, classes or packages. The author has experimented with this approach in support of concept location using RECON2 (Recon2, 2005), a reconnaissance tool developed by the University of West Florida. It applied a dynamic technique to capture the boundaries or impacts of a feature, and the result was promising. Although this tool was developed to help C programmers, its idea can be extended and applied to support OO programs. The benefit of concept location via dynamic analysis is that the result is more accurate than that produced by static analysis, simply because program execution can traverse possible paths that static analysis may miss. Implicit links are not considered in this research as they are more applicable at high-level abstractions, which require a great deal of research and effort.
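The reconnaissance idea can be approximated in a few lines: run a test scenario under a trace hook and record which functions execute, as a proxy for the components implementing the exercised requirement. This is only a sketch, and all function names below are invented:

```python
import sys

# Hedged sketch of the reconnaissance technique: record the set of
# functions executed while one test scenario runs.
def validate(order): return bool(order)
def price(order):    return 10 * len(order)
def audit(order):    return None          # not exercised by this scenario

def run_scenario(scenario):
    executed = set()
    def tracer(frame, event, arg):
        if event == "call":               # a new function scope was entered
            executed.add(frame.f_code.co_name)
        return None                       # no per-line tracing needed
    sys.settrace(tracer)
    try:
        scenario()
    finally:
        sys.settrace(None)
    return executed

hits = run_scenario(lambda: price(["a"]) if validate(["a"]) else None)
print(sorted(hits))
```

Functions the scenario never reaches (here, audit) stay out of the impact set, which is exactly the precision dynamic analysis buys over a static approximation.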
The objective here is to observe the requirements traceability that involves requirements, design, code and test cases, while other high-level components are left for future work.

2.6 Critiques on State-of-the-art Software Traceability Approaches

It is necessary to identify the strengths and weaknesses of the existing software traceability approaches before a new model can be proposed. For a systematic comparative review of software traceability approaches, an appropriate comparison framework has to be defined.

2.6.1 The Comparison Framework

The comparison framework needs to be defined to set the goals of the evaluation. Some criteria of the comparison framework and techniques are taken and adapted from a set of well defined features of various domains such as software engineering standards (IEEE, 1998), software lifecycle coverage (Pressman, 2004), evolvability (IEEE, 1998a) and configuration management (IEEE, 1998). The framework is divided into four related categories: coverage, coherence, measurability and pragmatics. Software coverage refers to the software development lifecycle that includes software activities such as requirements specification, design, coding, testing and operation. The software lifecycle is governed by a software development model that defines its own methodology and process workflow. For example, RUP (Rational Unified Process), one of the most recent software lifecycle models, is a software process based on the Unified Modeling Language (UML) that is iterative, architecture-centric, use-case driven and risk-driven. RUP is described in terms of a business model, which is structured into four main phases: inception, elaboration, construction and transition.
The IEEE Standard for Software Maintenance (IEEE, 1998a), on the other hand, points out that software products should conform to some standard in order to facilitate change requests covering problem identification, analysis, design, implementation, regression testing, acceptance testing and delivery. The minimum standard for software products consists of:

i) Specification – determines the requirements, their activities, risks and the testing plan.
ii) Design – determines how a system works, its specific functions and structures.
iii) Implementation – produces source code, documentation and tests, including validation and verification of the software system.

Software coherence refers to element interactions in a software system, particularly design and programs. The idea of coherence is to observe how the program dependencies known as cohesion and coupling can be recovered from the source (Sommerville, 2004). Program cohesion refers to the degree to which components stick together within a module, while program coupling is the degree of interaction between two or more modules. In object-oriented software, a module can be replaced by a class, with methods and attributes considered as its basic elements. With respect to change impact analysis, a rich set of coherence is expected at both the vertical and horizontal levels of software interaction so that more potential impacts can be derived from various perspectives.

Measurability in impact analysis indicates the ability to measure change impacts. Darcy and Kemerer (2005) describe some basic software metrics for cohesion and coupling in software systems as follows:

i) Weighted Methods per Class (WMC),
ii) Depth of Inheritance Tree (DIT),
iii) Number of Children (NOC),
iv) Coupling Between Object Classes (CBO),
v) Response For a Class (RFC) and
vi) Lack of Cohesion in Methods (LCOM).

It is observed that software as an entity is composed of many components such as lines of code, methods, classes, packages, etc.
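A hedged sketch of how three of the metrics above (WMC, RFC, CBO) can be computed, using a toy class model rather than real parsed code; the class and method names are invented:

```python
# Toy class model: each class has methods (mapped to the external methods
# they invoke, as "Class.method" strings) and a set of classes it uses.
classes = {
    "Order": {"methods": {"add": ["Stock.reserve"], "total": []},
              "uses": {"Stock"}},
    "Stock": {"methods": {"reserve": []}, "uses": set()},
}

def wmc(cls):
    """Weighted Methods per Class, with unit weight per method."""
    return len(classes[cls]["methods"])

def rfc(cls):
    """Response For a Class: own methods plus distinct methods they invoke."""
    invoked = {m for calls in classes[cls]["methods"].values() for m in calls}
    return len(classes[cls]["methods"]) + len(invoked)

def cbo(cls):
    """Coupling Between Object Classes: classes this class uses."""
    return len(classes[cls]["uses"])

print(wmc("Order"), rfc("Order"), cbo("Order"))   # -> 2 3 1
```

In practice the model would be populated by a parser; the metric definitions themselves stay this simple.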
There are many potential components that can be measured and combined to form useful information known as a metric set. The above metrics are proposed to define all the possible ways of measuring OO software. However, within the scope of this impact analysis, only WMC, CBO and RFC are directly applicable to account for ripple effects. This research is concerned with the sizes and complexity of the impacted methods and classes (WMC). The impacted methods and classes result from call invocations (RFC) and data uses (CBO). The only critical part here is to define the detailed scope of CBO. Wilkie and Kitchenham (1998) focus on one coupling measure, CBO, but with a limited scope for measuring the ripple effects of change. They found that classes with high CBO are more likely to be affected by ripple changes. From their investigation, CBO is found to be an indicator of change-proneness in general, but insufficient to account for all changes. Other dependencies such as inheritance, association and other indirect coupling need to be considered to explain the remaining ripple effects. Below is a corresponding metric set deemed useful to characterize the impact of changes:

• Number of impacted artifacts
• LOC of impacted artifacts
• Complexity values of the impacted artifacts

Pragmatics is concerned with the practical aspects of deploying and using an approach. Users in various groups, such as managers, developers and maintainers, would like to use the approach from their own perspectives. However, some common desired features that contribute to their satisfaction are high clarity, readability and understandability. The usability and usefulness of a software traceability model can promote good communication and understanding within a team. Usability can be observed in the ease of applying the process, while usefulness is mainly influenced by the significance of the results of the process.
In impact analysis, a software traceability process should support and guide users toward better estimation of ripple effects with the least cost and effort.

[Figure 2.5: The Comparison Framework]

The components of the comparison criteria are related to each other as illustrated in Figure 2.5. Coverage reflects the real-world entity in terms of the product phases of the software lifecycle to be maintained. Software coherence is then defined with respect to the coverage to support measurement. During change impact analysis, component interactions within the software are defined to demonstrate the ripple effects. Measurement is computed to determine the software metrics of the impacted components. Both coherence and measurement need to be demonstrated in a pragmatic manner, which then influences user acceptance.

2.6.2 Existing Software Traceability Models and Approaches

Based on the above discussion of the comparison framework, five software traceability approaches with some related issues were selected for comparison. For each approach, the study focuses on the techniques and mechanisms, and how they provide an environment to support impact analysis (or IA for short). Table 2.3 depicts a summary of the techniques used, levels supported, strengths and drawbacks of each approach.

1) GEOS (Sneed, 2001)

GEOS is an approach to improve the maintenance process, emphasizing concept, code and test. It attempts to perform impact analysis that can support effort and cost estimation of change requests prior to their actual implementation. GEOS applies metrics, hypertext techniques and structural impacts to identify all interdependencies.
GEOS is constructed on the basis of a relational database that covers three component levels, namely:

• Concept level – requirements, business rules, dialog/batch processes, use cases, service functions, data objects, DB views, parameters, panels, reports, conditions, test cases.
• Code level – components, modules, headers, classes, methods, interfaces, attributes, DB tables, resource tables.
• Test level – test scenarios, test objects, test scripts, test cases, test sessions, test logs.

Table 2.3: A comparative study of traceability approaches

1. Global Trace in GEOS
   Techniques used: slicing, hypertext, metrics
   Levels supported: concepts, test plans, classes
   Strengths: supports cost estimation; wide integration of impact analysis; requirements traceability
   Drawbacks: too generalized; formalism of artifact dependencies for IA is not clear; classes are the smallest artifacts; manual work at large

2. Requirements Management in TRAM
   Techniques used: hypertext, cross references, use cases, EBNF
   Levels supported: management, documentation and architectural system
   Strengths: supports requirements engineering; supports visualization; supports semantics
   Drawbacks: provides traceability at management levels only; supports impact analysis but not change impact

3. Scenario-Driven in Dynamic Analyser
   Techniques used: slicing
   Levels supported: requirements, code
   Strengths: supports requirements traceability
   Drawbacks: based on dynamic analysis only; incompleteness of impact analysis

4. Class Dependency in Model Dependency Descriptor (MDD)
   Techniques used: slicing, pattern matching, cognitive
   Levels supported: requirements, code
   Strengths: supports V&H; supports metrics; requirements traceability
   Drawbacks: manual search for IA; implementation on a case study exposes a threat to validity

5. Domain Knowledge Dependency in Domain Tracer
   Techniques used: domain knowledge
   Levels supported: requirements, classes
   Strengths: supports semantics; requirements traceability
   Drawbacks: classes are the smallest artifacts; no automation for V&H; limited to requirements–classes only
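A relational repository of this kind makes interdependencies across the concept, code and test levels queryable. The sketch below is only an illustration in the spirit of such a repository, not GEOS itself: the table layout, component names and link instances are all invented, and an in-memory SQLite database stands in for a full repository.

```python
import sqlite3

# Illustrative relational repository relating concept-, code- and test-level
# components (schema and rows are hypothetical).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE component (id INTEGER PRIMARY KEY, level TEXT, name TEXT);
CREATE TABLE link (src INTEGER, dst INTEGER);  -- src is realized by / depends on dst
""")
db.executemany("INSERT INTO component VALUES (?,?,?)", [
    (1, "concept", "use case: transfer funds"),
    (2, "code",    "class Transfer"),
    (3, "test",    "test case TC-17"),
])
db.executemany("INSERT INTO link VALUES (?,?)", [(1, 2), (2, 3)])

# Transitively, which components does concept component 1 reach?
rows = db.execute("""
    WITH RECURSIVE reach(id) AS (
        SELECT dst FROM link WHERE src = 1
        UNION
        SELECT link.dst FROM link JOIN reach ON link.src = reach.id
    )
    SELECT level, name FROM component JOIN reach USING (id)
    ORDER BY component.id
""").fetchall()
print(rows)  # [('code', 'class Transfer'), ('test', 'test case TC-17')]
```

The recursive query mirrors how a change at the concept level ripples through linked code and test components.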
In the database repository, a user is presented with a large amount of historical data on previous performance to support cost estimation. This requires a massive task, as it needs to associate the affected software components with effort, task scheduling and productivity tables, in addition to complexity determination and cost formulas. GEOS provides some insights into impact analysis to support cost estimation. However, the model seems too generalized, as it does not describe the detailed formalism of relationships between artifacts in the context of impact analysis. The main parts of the impact process still rely on human discretion, e.g. requirement–class traceability.

2) TRAM (Han, 2001)

The Tool for Requirement and Architectural Management (TRAM) was developed at Monash University, Australia to support requirements management by providing traceability between requirements and system architectural management (Han, 2001). The architectural management is focused to include the stakeholders, goals, software architectural components and documentation templates. TRAM manages requirements traceability using use cases in UML notation, EBNF and HTML hypertext. HTML document templates provide information at the system requirement level that includes the stakeholders, goals, values, acceptance criteria, authorities, risks and relationships. At the system architectural level, HTML documents provide information about the system interfaces, services, system components and use cases. The information model is codified into the software document management using DOORS' scripting language called XDL (Natarajan et al., 2000). This language has a range of features for managing, presenting, subsetting and querying information contained in its project documents. Despite providing good support for requirements traceability in the SDLC, the capabilities of TRAM are dedicated to management for project tracking and cross references.
3) Scenario-Driven Approach (Egyed, 2003)

This is a scenario-driven approach to trace dependency analysis based on test scenarios. Test scenarios drive dynamic analysis of program execution. Program traces are then transformed into footprints for program slicing. This approach provides dependencies between requirements and code with no involvement of static analysis. As the approach is based only on dynamic analysis, the ripple effects of software components are not rich enough to support change impact analysis.

4) MDD (Bianchi and Antoniol, 2000)

Two model dependency descriptors are defined: MDD1 and MDD2. MDD1 is a coarse-grained level of class analysis, while MDD2 is a fine-grained level of method and attribute analysis. The impact analysis model seems to support system coverage at the code level. MDD1 and MDD2 are independent of one another; management has to decide whether to choose coarse-grained or fine-grained analysis. This approach was applied to a case study in an uncontrolled experiment, which exposes it to a lack of validity analysis, as no proper controls were taken into account. For example, each subject group of MDD1 and MDD2 was allowed to manually explore a search impact on the target system within a thirty-day student project assignment. The approach was implemented on the basis of manual search to explore the impact of each requirement on code, and no automated tools were used. Such an experiment is expensive, time consuming and error prone.

5) Domain Tracer (Lindvall and Sandahl, 1998)

Traceability links are established between each new requirement and the expected objects of C++ classes in a design object model. A long-term case study, called PMR (Performance Management Traffic Recording), was used as an experiment.
The impact analysis performed in this study is associated with domain knowledge, as the tracing components were captured by interviewing knowledgeable software engineers instead of analyzing the software knowledge from the system. The traceability links are considered successful, as they can provide semantic and cognitive links at the high-level components. However, the cost is high, as it involves a large number of human interactions and communications. The traceability limits the evaluation to the class level, while deeper analysis is left for future work.

2.6.3 The Comparative Evaluation of Software Traceability Modeling

Table 2.4 shows a summary of the software traceability approaches against the defined criteria. In terms of software coherence and coverage, the compared approaches focus on program dependencies of cohesion and coupling. GEOS applies coherence at the high-level software structures, including the links between software specifications and classes. Slicing is used between classes to observe traceability at the code level. TRAM provides semantic knowledge to manage the links at the high-level software functionality, and no code elements are involved. Dynamic Analyzer creates program structures and slicing based on dynamic analysis of test scenarios between requirements, classes and methods. MDD applies the same principle; the only difference is that it uses static analysis and some cognitive elements of human intervention at some points. Domain Tracer adopts domain knowledge, interviewing experts to create traceability between requirements and classes.

With respect to measurability, GEOS provides more leverage to measure the impacted artifacts, which leads to effort and cost estimation. TRAM supports measurement of the high-level impacted artifacts for management purposes. Both Dynamic Analyzer and Domain Tracer measure the impacted artifacts in terms of classes.
MDD applies LOC and complexity to measure the impacted artifacts between requirements, classes and methods.

Table 2.4: Existing features of software traceability approaches

1. GEOS
   Coherence: slicing, program structures
   Coverage: concepts, test plans, classes
   Measurability: impacted classes, effort estimation
   Pragmatic: hypertext, text, TD/BU

2. TRAM
   Coherence: domain knowledge
   Coverage: management, documentation, architectural design
   Measurability: impacted components
   Pragmatic: hypertext, text, TD/BU

3. Dynamic Analyzer
   Coherence: slicing, program structures
   Coverage: reqts, classes, methods
   Measurability: impacted classes
   Pragmatic: TD, text

4. MDD
   Coherence: slicing, program structures, cognitive
   Coverage: reqts, classes, methods
   Measurability: LOC, VG
   Pragmatic: TD, text

5. Domain Tracer
   Coherence: domain knowledge
   Coverage: reqts, classes
   Measurability: impacted classes
   Pragmatic: TD, text

In terms of pragmatics, GEOS and TRAM support visualization as well as top-down and bottom-up tracing. Both apply hypertext for visualization with additional text support. Dynamic Analyzer, MDD and Domain Tracer limit their usability to top-down tracing, with more emphasis on text exploration.

2.7 Summary of the Proposed Solution

From the above discussion, some conclusions can be drawn regarding the potential of a new traceability approach to support change impacts from a broader perspective. First, requirements traceability that provides links between requirements and code is crucial and essential to observe the implementation of requirements. It needs some metrics, in terms of the sizes and complexities of the impacted artifacts, to support measurement. Second, it would be a great help to the maintenance task if one could automate direct support for change impact at the vertical and horizontal levels of analysis simultaneously. Text or visualization of the impacted artifacts is useful for providing a better software traceability approach. Third, top-down and bottom-up tracing of software knowledge may present a good impact search. Fourth, it is an added advantage to provide a mechanism that allows both the documentation environment and the change impact process to work independently.
This will improve the flexibility and efficiency of impact analysis, which is the research focus. Fifth, an integration of static and dynamic analysis may present a rich set of ripple effects to support the efficiency and flexibility of impact dependencies.

CHAPTER 3

RESEARCH METHODOLOGY

3.1 Introduction

This chapter discusses the research design methodology, which includes the operational framework, the formulation of the research problem and the evaluation plans. The evaluation plans cover a brief description of the supporting tools and environment needed, data gathering and analysis, and a benchmark. The benchmark is specially selected to support assessment of change impact analysis. The chapter ends with some assumptions of the research and a summary.

3.2 Research Design

Zelkowitz and Wallace (1998) classify methodologies of conducting research into four categories, namely scientific, engineering, empirical and analytical methods. The scientific method focuses more on formal methods and theory to prove a hypothesis. The engineering method centers its research on the know-how to develop a new approach and test a solution to verify a hypothesis. The empirical method tends to propose statistical methods as a means to validate a given hypothesis. The analytical method develops a formal theory to derive a hypothesis and compares the results with empirical observations, e.g. using inductive principles. Of the above categories, the engineering method is the most appropriate for software engineering research, within which the currently proposed model is developed. Lazaro and Marcos (2005) emphasize that, as engineering is fundamentally concerned with the design and construction of products, software engineering research should investigate how to produce more effective products, and how to do so in a cost-effective way. Software engineering research must therefore be influenced by the nature of real-world problems and development processes.
In response to the above considerations in software engineering, the research methodology is carried out in three stages. The first stage involves the development of the model for software traceability analysis. The second stage is to develop one or more prototype tools to support the model, and the third is to evaluate the model via quantitative and qualitative assessment.

3.3 Operational Framework

Figure 3.1 illustrates the operational steps of the research work. This research started with the literature review. This was a long-standing process with the intention of understanding the development and the state of the art of software traceability and impact analysis. During the literature review, the research gathered information related to software traceability and its use in software systems in order to obtain some meaningful research problems. Next was the study of the current techniques and approaches of software traceability, followed by the preparation of a comparison framework. The comparison framework provided the scope and criteria for evaluating existing models and approaches. A comparative study was carried out to reveal the strengths and weaknesses of the current models and approaches, which led to establishing a software traceability model as the research proposal. A new software traceability model was then developed. To gain a better opportunity to strengthen the proposed model, the author undertook a special research attachment at the Centre for Software Maintenance, Durham University, United Kingdom for two months. During the research attachment the author gained useful experience that allowed the exchange of ideas and feedback with other experts and researchers in the same area. This experience, together with additional literature review, helped further refine the proposed model.

[Figure 3.1: Flowchart of the operational framework — literature review → identify research problems → study techniques and approaches of software traceability → do a comparative study → (research problems satisfied? No: repeat; Yes: continue) → research proposal → develop a software traceability model → develop supporting tools → choose and analyze a case study → conduct an experiment → results evaluation → research report]

The model itself required some other supporting tools to prove its concept. The supporting tools were developed and the model was then implemented on a case study of a software project with complete documentation and source code. The output of the case study was captured to obtain useful information such as requirements, design, test cases and code components, including their relationships. With this information in hand, an experiment was conducted that employed a group of students with some working experience to represent software practitioners. This evaluation was conducted as a controlled experiment in order to reduce external threats to validity. Useful data was collected from the experiment and analyzed with the help of statistical methods. The findings were then concluded and the research report, or thesis, was produced.

3.4 Formulation of Research Problems

This section discusses how the research problem is formulated. Basically, the research background as described in Chapter 1, and the traceability approaches and mechanisms as discussed in Chapter 2, form the basis for the research justifications that follow.

3.4.1 Need of Change Process Support

Traditionally, determining the effects of a change request has been something that software maintainers do in their heads after some painstaking examination of the code and documentation. This may be sufficient for small programs, but for large programs or software this task is very tedious and error prone. Large software often requires repositories to organize its software components, which are becoming increasingly interdependent in subtle ways.
Software maintainers often conduct the change process in an informal manner that is largely manual. Performing a change request can require substantial computing resources; if attempted manually, it can be time consuming, expensive and exhausting.

3.4.2 Establish Communications Within a Project

In the software lifecycle, there is a great gap between requirements at the high level of software abstraction and code at the low level of software abstraction. Current efforts to bring requirements and implementation code closer are hampered by the scarcity of appropriate approaches, techniques and tools. These technologies of software evolution are still far behind those of software development (Boehm and Sullivan, 2000). More research work is required to promote communication and bridge the gap within a software project, e.g. between management and maintainers. If the appropriate information can be brought forward and made traceable without much effort, e.g. requirements–code traceability, this will greatly reduce the maintenance task. The change control board in SCM will directly benefit from this problem solving, as it needs to communicate with internal development staff at various levels of the software lifecycle.

3.4.3 Understanding Requirements Support

System evolution requires a better understanding of each requirement, which can only be achieved by tracing back to its sources (Ramesh, 2002). Traceability provides the ability to cross-reference items from the requirements specification to other items in the design, code and other documents. With complete traceability opportunities, more evolvable components can be achieved. This can help software engineers determine all the possible affected components caused by a requirement. If this can be achieved, some benefits associated with requirements can be acquired, e.g. support for effort estimation of a changed requirement.
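Cross-referencing of this kind can be pictured as reachability over traceability links. The sketch below is illustrative only — the artifact identifiers and link instances are invented — and finds every design, test and code item reachable from a requirement by breadth-first search:

```python
from collections import deque

# Hypothetical traceability links: artifact -> artifacts it traces to
links = {
    "R1": ["D1", "D2"],   # requirement -> design elements
    "D1": ["C1"],         # design -> code
    "D2": ["C2", "T1"],   # design -> code and test case
}

def trace_forward(start: str) -> set:
    """All artifacts reachable from `start` via traceability links (BFS)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(trace_forward("R1")))  # ['C1', 'C2', 'D1', 'D2', 'T1']
```

Everything reachable from requirement R1 is a candidate impact of changing R1, which is exactly the estimate a maintainer needs before touching the code.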
3.5 Some Considerations for the Validation Process

For the proposed traceability model and approach, the proof of concept needs to be planned and designed so that appropriate actions can be taken to validate and verify its results. One of the needs in change impact analysis is to determine or measure its accuracy, i.e. how close the potential impacts are to the actual impacts of a change. The potential impacts will be produced by the model and approach via the CATIA (Configuration Artifact Traceability tool for Impact Analysis) tool as one of the supporting tools, while the actual impacts will be determined by an expert, i.e. a software maintainer with knowledge of the particular domain. Knowledge domain here refers to the specific domain of a software project. A software project needs to be chosen from existing software systems, together with its software developers or software engineers who had been involved in the project, to act as software maintainers in a new case study. As change impact analysis in this research context needs a change request and human intervention to work on it, an experimental method of empirical study is needed. The idea of conducting an experiment is to let software maintainers apply the model and approach via a prototype tool to generate the potential impacts, and compare these impacts with the actual impacts they explore manually. The basic needs of the evaluation, described below, include supporting tools, data gathering and analysis, and benchmarking.

3.5.1 Supporting Tools

The right environment, with suitable supporting tools, needs to be considered in order to support the implementation of the proposed software traceability model and approach. Supporting tools are classified into two categories: development-wise and experimentation-wise.
With respect to development, the proposed model is applicable to any type of system environment; however, for the sake of this research it was decided to develop an environment that supports C++ programming and runs on Windows XP. The system development uses a Microsoft Access database and HTML for hypertext linked to the system documentation. It creates a system environment that allows a user to toggle between the system documentation and the tool. The environment supports this capability to ease cross-references between the two applications.

With regard to experimentation, a few supporting tools (i.e. a code parser, an instrumentation tool, a test scenario tool and a traceability tool) are required to produce software artifacts and their relationships. It was assumed that these supporting tools would be available as open source; however, due to capability limitations of the existing tools, the research ended up developing its own supporting tools. The code parser is required to reverse engineer the target C++ code into design abstractions. The instrumentation tool is used to instrument code with markers prior to the test scenario process. The test scenario tool is required to perform test cases and to capture the impacted components. The traceability tool is used to implement change impacts based on the input supplied by the other supporting tools. These supporting tools will be discussed in Section 4.5.

3.5.2 Data Gathering and Analysis

For experimentation, the research will employ a group of software engineers with proper software maintenance knowledge as the subjects, and a complete software project with documentation as a case study. Detailed experimentation will be discussed in Section 6.4. The experiment needs to be conducted in a controlled laboratory setting in order to secure the validity of the results. From the experiment, the research will capture dependent variables based on the time durations and scores the subjects produced.
The duration is captured to measure the time spent to obtain the starting impact and the estimated impact. The scores are collected in terms of the number of components, and the total size and complexity of the estimated change impact. Meanwhile, the independent variables are the subjects, the prototype and the system documentation. Statistical analysis methods such as Frequencies, Descriptives, Compare Means and Crosstabs will be used via the SPSS statistical package to evaluate the quantitative results. The one-way ANOVA procedure is used to generate a one-way analysis of variance for one dependent variable based on one independent variable, also known as a factor. Hence, the time and scores are the dependent variables, while the subjects and the project case study are the independent variables. In addition, the evaluation uses a comparative framework (Arnold and Bohner, 1993) as a benchmark that provides guidelines and approaches to compare and measure the effectiveness and accuracy of ripple effects.

3.5.3 Benchmarking

One has to admit that there is a lack of dimensions for comparing one impact analysis (IA) approach with others in the software maintenance research community. It is hard to discern what work contributes to IA and what does not without a basic framework for assessing the technology. Fortunately, Arnold and Bohner (1993) provide a framework and attributes as an aid to comparing IA approaches and assessing the strengths and weaknesses of individual IA approaches. They define some metrics to characterize the impact of changes on a software system, as depicted in Table 3.1. They define IA as an activity of identifying what to modify to accomplish a change, or of identifying the potential consequences of a change. The following sets have been chosen to characterize the impact analysis products:

1. The set of components that are thought to be primarily affected by a change, PIS (Primary Impact Set); its size is PIS#.

2.
The set of components secondarily estimated as being affected by a ripple change, SIS (Secondary Impact Set); its size is SIS#.

3. The set of components actually modified when changes have been performed, AIS (Actual Impact Set); its size is AIS#.

4. All possible impacts of components in the system that are estimated as being affected by a change, including both primary and secondary impact sets, SS (System Set); its size is SS#.

Table 3.1: Benchmark of AIS# and SIS# possibilities (Arnold and Bohner, 1993)

Case 1
  Defining conditions: SIS# = AIS#; SIS# ⊆ System#
  Measurement point: AIS# / SIS# = 1
  Interpretation when present: Best. The estimated impact set matches the AIS#. If this happens regularly, the usefulness of IA is substantially increased.
  Desired trend: AIS# = SIS# always. Expected: AIS# = SIS# in 10% of a random sample of impact analyses. Goal: AIS# = SIS# in 70% of a random sample of impact analyses.

Case 2
  Defining conditions: SIS# > AIS#; AIS# ⊂ SIS#; SIS# ⊆ System#
  Measurement point: H < AIS# / SIS# < 1, for some user-selected H such that 0 < H < 1
  Interpretation when present: "Safe". The SIS contains the AIS#, and the SIS# is not much bigger than the AIS#.
  Desired trend: AIS# < SIS# never. Expected: AIS# < SIS# in 50% of a random sample of impact analyses. Goal: AIS# < SIS# in 20% of a random sample of impact analyses.

Case 3
  Defining conditions: SIS# >> AIS#; AIS# ⊂ SIS#; SIS# ⊆ System#
  Measurement point: AIS# / SIS# < H, where H is as in the preceding row
  Interpretation when present: Safe, but not so good. A big jump from AIS# to SIS# means a lot of things to check in the SIS# before arriving at the AIS#.
  Desired trend: AIS# << SIS# never. Expected: AIS# << SIS# in 40% of a random sample of impact analyses. Goal: AIS# << SIS# in 10% of a random sample of impact analyses.

Case 4
  Defining conditions: AIS# > SIS#; SIS# ⊂ AIS#; AIS# ⊆ System#
  Measurement point: M < SIS# / AIS# < 1, for some user-selected M such that 0 < M < 1
  Interpretation when present: Expected. IA approximates, and falls short of, what needs to be changed.
  Desired trend: SIS# < AIS# never.
Expected: SIS# < AIS# in 60% of a random sample of impact analyses. Goal: SIS# < AIS# in 20% of a random sample of impact analyses.

Case 5
  Defining conditions: AIS# >> SIS#; SIS# ⊂ AIS#; SIS#, AIS# ⊆ System#
  Measurement point: SIS# / AIS# < M, where M is as in the preceding row
  Interpretation when present: Not so good. A big jump from SIS# to AIS# means extra work to discover the AIS#.
  Desired trend: SIS# << AIS# never. Expected: SIS# << AIS# in 30% of a random sample of impact analyses. Goal: SIS# << AIS# in 10% of a random sample of impact analyses.

Case 6
  Defining conditions: AIS# ∩ SIS# > 0, AIS# ≠ SIS#; SIS# ⊆ System#
  Measurement point: SIS# ∩ AIS# > 0
  Interpretation when present: Not so good. Extra work to check SIS# objects that aren't in the AIS#; extra work to discover objects in the AIS# not in the SIS#.
  Desired trend: SIS# ∩ AIS# near 1 in most of a random sample of impact analyses. Expected: 0.7 < SIS# ∩ AIS# < 1 in 60% of a random sample of impact analyses. Goal: 0.9 < SIS# ∩ AIS# < 1 in 80% of a random sample of impact analyses.

Case 7
  Defining conditions: AIS# ∩ SIS# = 0; SIS# ⊆ System#
  Measurement point: SIS# ∩ AIS# = 0
  Interpretation when present: Not so good. A worse version of case 6.
  Desired trend: SIS# ∩ AIS# = 0 never. Expected: SIS# ∩ AIS# = 0 in 20% of a random sample of impact analyses. Goal: SIS# ∩ AIS# = 0 in 5% of a random sample of impact analyses.

We would like to evaluate IA as laid out by Arnold and Bohner in Table 3.2. In general, we can conclude that:

1. We would like the SIS to be as close as possible to the PIS.
2. We would like the SIS to be much smaller than the system (SS#).
3. We would like the SIS to contain the AIS#, and the AIS# to be equal to or smaller than the SIS.
4. We would like the granularities to match, if possible.

Table 3.2: Evaluation for impact effectiveness (Arnold and Bohner, 1993)

PIS and SIS
  Explanation: What trend is observed in the relative size of the PIS and SIS when the approach is applied to typical problems? We would like the SIS to be as close as possible to the PIS.
  Desired IA effectiveness trend: PIS / SIS = 1 (i.e. PIS = SIS), or nearly 1.
SIS and System#
  Explanation: What trend is observed in the relative size of the SIS and System# when the approach is applied to typical problems? We would like the SIS to be much smaller than the System#.
  Desired IA effectiveness trend: SIS / System# ≤ N, where N is some small tolerance level.

SIS and AIS#
  Explanation: What trend is observed in the relative size of the SIS and AIS# when the approach is applied to typical problems? We would like the SIS to contain the AIS#, and the AIS# to be equal to or smaller than the SIS.
  Desired IA effectiveness trend: SIS / AIS# = 1 (i.e. SIS = AIS#), or nearly 1.

Granularity
  Explanation: What is the relative granularity of the artifact object model vs. the interface object model? We would like the granularities to match, if possible.
  Desired IA effectiveness trend: G1, granularities match. (It is even better if the granularities are fine enough too.)

Bianchi et al. (2000) defined and applied some metrics based on the granularity and traceability models. They determined metrics of effectiveness based on Inclusiveness, S-Ratio, Amplification and Change Rate, as below.

Table 3.3: Effectiveness metrics

  Metric                         Range     Desired Trend
  Inclusiveness                  0 or 1    1
  S-Ratio = AIS#/SIS#            0 to 1    1
  Amplification = SIS#/PIS#      >= 1      1
  Change Rate = SIS#/System#     > 0       << 1

Inclusiveness

Inclusiveness is used for assessing the adequacy of an impact analysis approach. It is defined as Inclusiveness = 1 if AIS ⊆ SIS, or Inclusiveness = 0 otherwise. This metric indicates that if Inclusiveness = 1, then it is worthwhile considering the further metrics S-Ratio, Amplification and Change Rate.

S-Ratio

S-Ratio is defined as an indicator of the adherence between the AIS and SIS. This metric indicates how far the SIS corresponds to the AIS, and can be expressed as

  S-Ratio = AIS# / SIS#

The smaller the ratio, the fewer of the estimated impacted components actually need to be changed, and the less modification effort is required. The desired trend is an S-Ratio of 1, i.e. the AIS# and SIS# coincide.

Amplification

The Amplification metric observes the ripple-sensitivity, i.e.
how much the SIS extends the PIS:

  Amplification = SIS# / PIS#

The desired trend is for the SIS not to be much bigger than the PIS, so that the Amplification tends toward 1.

Change Rate

Change Rate assesses the sharpness of an impact analysis approach. It is defined as

  Change Rate = SIS# / System#

The smaller the estimated impact in a system, the better the result of the impact analysis approach. The desired trend is for the SIS# to be a relatively small subset of the system, such that the Change Rate is << 1. Table 3.3 summarizes the effectiveness metrics.

3.6 Some Research Assumptions

1) The study assumes that the system documentation is consistently updated with the current software system; otherwise the results would be less effective. To ensure this, a case study should be selected from a very recent and complete project development with documentation.

2) The study assumes that the users are familiar with the documentation and the domain knowledge of the selected project. The maintainers should have existing knowledge of its application area and the overall structure of the system.

3) It is assumed that a change request has been decomposed into understandable software artifacts, such as methods and classes, prior to the implementation of change impact analysis.

4) The study focuses on object-oriented software with UML class diagrams.

3.7 Summary

This chapter explained the research methodology and procedures used to carry out the research. The research work is carried out based on software disciplines and practices that relate software technology, including know-how and tools, to support a real-world environment. Justifications for planning the research against its initial objectives and issues lead to a plan for a validation approach. In the proposed software traceability modeling, the validation process is justified to include a case study, an experiment, data gathering and analysis methods, and a benchmark.
This should provide a means to assess and validate the contribution of the proposed model. Some assumptions of this research were laid out to determine the proper actions to take prior to its implementation.

CHAPTER 4

REQUIREMENTS TRACEABILITY MODELING

4.1 Introduction

This chapter describes the proposed software traceability model and approach to support change impact analysis. The model is established around the functional requirements, design, test cases and code workproducts. The chapter begins with an overview of requirements traceability, followed by the proposed model and approach. Some formal notations are presented to cover vertical and horizontal traceability, followed by the construction of ripple effects of object-oriented dependencies.

4.2 Overview of Requirements Traceability

It is not easy to establish a standard requirements traceability mechanism that can support all applications associated with software maintenance. This is because different applications may require different software knowledge and different ways of tackling artifact relationships. Bohner (1996) produced a requirements traceability model ranging from the requirements phase to the implementation phase. This model illustrates software traceability across the lifecycle, covering the requirements, design, coding and testing phases. The model seems ideal for observing software traceability in a system, but it is too generalized to appreciate its usability for change impact analysis. In general, requirements traceability can be established between software development phases; however, the know-how of traceability is crucial and depends on the type of application. Cross referencers may prefer to observe the component traceability of the same reference items. Visualization and program understanding require the ability to view artifact dependencies in both forward and backward directions.
Call invocation analysis prefers to trace software relationships based on the forward execution of calling scenarios. In change impact analysis, software traceability is much more complex, as it is derived from different sources of program dependencies such as call invocation, inheritance, aggregation, etc., which will be discussed in the forthcoming sections of this chapter.

4.3 A Proposed Requirements Traceability Model

It is quite challenging to implement total traceability across the whole software lifecycle, as a system is composed of different management, policy and development levels. In this research, requirements traceability is focused on supporting change impact analysis. The products of the software development levels, as described in the system documentation and code, are emphasized as follows.

Table 4.1: Software documents and corresponding artifacts

Software Documents                          Artifacts
Software Requirement Specification (SRS)    Requirements
Software Design Description (SDD)           Design
Software Test Description (STD)             Test cases
Source Code                                 Code

Table 4.1 presents the software documents and their corresponding software artifacts needed to support the proposed traceability model and approach. The SRS supports the essential parts of the system, as it defines a set of user requirements. These requirements are normally described in text descriptions that the users can understand. Each requirement needs to be translated into architectural design and detailed design, as described in the SDD. Based on the SDD, the requirements then need to be transformed into code, and with the availability of test cases the software engineers should be able to implement the requirements. These scenarios give an indication that there could be traceability relationships to establish. Prior to change impact analysis, let us first clarify how the traceability relationships exist between the requirements, design, test cases and code.
The conceptual model is explained in the following section.

4.3.1 A Conceptual Model of Software Traceability

Figure 4.1 reflects the notion of the proposed traceability system: establishing the relationships between software components, which include the requirements, design, test cases and code. These components represent the workproducts of the software development phases, which can be extracted from the requirements and specification, design, testing and code documents respectively. The arrows represent links derived from static and dynamic analysis of component relationships.

Figure 4.1: Conceptual model of software traceability

Static analysis derives software dependencies between components from a static study of the code and other related models. Dynamic analysis, on the other hand, derives software dependencies from the execution of the software to find traces, such as executing test cases to find the impacted code. This model is classified into two categories: vertical and horizontal traceability. Vertical traceability refers to the association of dependent items within a model, e.g. code, while horizontal traceability refers to the association of corresponding items between models, e.g. between code and design. The above model is further explained in the following section.

4.3.2 Dynamic and Static Analyses

We believe that there exist relationships amongst the software artifacts in a software system. These relationships need to be captured, not only within the same level but also across different levels of workproducts. This information is needed prior to the implementation of change impact analysis. The process of tracing and capturing these artifacts is called hypothesizing traces. Hypothesizing traces needs to cover both dynamic and static analyses.
Figure 4.2: Hypothesizing traces

Figure 4.2 illustrates how the links between artifacts are established. The hypothesizing process is divided into two parts: dynamic analysis and static analysis. Dynamic analysis establishes the links between requirements, test cases and code via test scenarios, as described in Section 5.4. Test scenarios are a process of executing test cases to exercise requirements, where each requirement is characterized by one or more test cases. The impacted code, in terms of the executed methods, is then produced and collected for further use.

Static analysis is used to capture the structural relationships between artifacts in code. Here, artifacts can be interpreted as variables, data, methods and classes. However, within this research scope, it needs to capture the methods, classes and their relationships based on the program cohesion and coupling of UML class diagrams. These structures can be recovered with the help of some supporting tools, as described in Section 5.4. This represents the program dependencies within code, which can later be upgraded to the design level based on the UML class diagram specifications. The upgrading process is explained in Section 6.2.1 and Section 6.2.2.

Now, the hypothesizing process provides the ability to establish links between requirements, test cases and code on one side, and between code and design on the other. The issue now is how to create the other links, i.e. requirements to design, and test cases to design. To manage this issue, consider the links between requirements and code, in which code can be upgraded to design as mentioned earlier. This makes it possible to determine the links between requirements and design as well. The same principle applies to establish links between test cases and design.
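The derivation just described — obtaining requirements-to-design links by composing requirements-to-code links with the code-to-design upgrade — amounts to a relational composition of link tables. The following is a minimal illustrative sketch (the names are hypothetical, not CATIA code):

```cpp
#include <map>
#include <set>
#include <string>

// A link table maps an artifact identifier to the artifacts it links to.
using Links = std::map<std::string, std::set<std::string>>;

// Compose two link tables: if R1 -> C3 and C3 -> D2, then R1 -> D2.
// This is how requirements-to-design (and test-cases-to-design) links
// can be derived once code artifacts are mapped up to the design level.
Links compose(const Links& first, const Links& second) {
    Links result;
    for (const auto& [source, mids] : first)
        for (const auto& mid : mids) {
            auto it = second.find(mid);
            if (it != second.end())
                result[source].insert(it->second.begin(), it->second.end());
        }
    return result;
}
```

The same `compose` call works for any pair of adjacent link tables, so one mechanism covers both missing horizontal links.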
This information needs to be stored in a database for further implementation.

4.3.3 Horizontal Links

Horizontal links are the part of the traceability model that involves inter-artifact relationships, such that each artifact at one level provides links to artifacts at other levels. Figure 4.3 shows conceptual traceability from the point of view of requirements. For example, R1 is a requirement that has direct impacts on test cases T1 and T2. R1 also has direct impacts on the design artifacts D1, D2, D3 and on the code components C1, C3, C4. The same principle applies to R2. R1 and R2 might impact the same artifacts, e.g. T2, D3 and C4. Thus, in general, the system impact can be interpreted as follows.

Figure 4.3: Traceability from the requirement perspective

S = (G, E) .......................................................... (H-i)
G = GR ∪ GD ∪ GC ∪ GT ............................................... (H-ii)
E = ER ∪ ED ∪ EC ∪ ET ............................................... (H-iii)

Where:
S - the total impact in the system
G - artifacts
GR - requirement artifacts
GD - design artifacts
GC - code artifacts
GT - test case artifacts
E - relationships
ER - relationships from a requirement artifact point of view
ED - relationships from a design artifact point of view
ET - relationships from a test case artifact point of view
EC - relationships from a code artifact point of view
∪ - set union (the OR symbol).

The above specifications can be interpreted as follows: the total impact (S) consists of the software artifacts and their relationships. For an artifact of interest of type requirements (GR), design (GD), code (GC) or test cases (GT), it is possible to observe its traceability relationships with the other artifacts in the system.

Horizontal traceability can be derived from the following perspectives.

1) Requirement Traceability

ER ⊆ GR x SGR ....................................................... (H-iv)
SGR = GD ∪ GC ∪ GT .................................................. (H-v)
GR = {R1, R2, ......, Rn} ...........................................
(H-vi)

Where:
⊆ - is a subset of
x - has a link to
SGR - artifacts of other levels
{R1, R2, ......, Rn} - the set of requirement instances

This can be interpreted as: ER is a subset of the links between requirements (GR) and artifacts of other levels (SGR), which can be design, code or test cases. GR is the set of requirements.

2) Design Traceability

ED ⊆ GD x SGD ....................................................... (H-vii)
SGD = GR ∪ GD ∪ GT ∪ GC ............................................. (H-viii)
GD = {D1, D2, ......, Dn} ........................................... (H-ix)

ED is defined as a traceability relationship from the point of view of a design artifact. ED is a subset of the links between design (GD) and artifacts of other levels (SGD), which can be requirements, code or test cases. GD is the set of design instances.

3) Test Case Traceability

ET ⊆ GT x SGT ....................................................... (H-x)
SGT = GR ∪ GD ∪ GC .................................................. (H-xi)
GT = {T1, T2, ......, Tn} ........................................... (H-xii)

ET is defined as a traceability relationship from the point of view of a test case. ET is a subset of the links between test cases (GT) and artifacts of other levels (SGT), which can be requirements, design or code. GT is the set of test case instances.

4) Code Traceability

EC ⊆ GC x SGC ....................................................... (H-xiii)
SGC = GR ∪ GD ∪ GT ∪ GC ............................................. (H-xiv)
GC = {C1, C2, ......, Ct} ........................................... (H-xv)

EC is defined as a traceability relationship from the point of view of a code artifact. EC is a subset of the links between code (GC) and artifacts of other levels (SGC), which can be requirements, design or test cases. GC is the set of code instances. Code can be further decomposed into more detailed artifact structures, as discussed in Section 4.3.5.

4.3.4 Vertical Links

Vertical links are the part of the traceability model that involves intra-artifact relationships, in which an artifact provides links to other components within the same level of artifacts.
In principle, the vertical levels within this research scope are as follows.

a) Requirement level
b) Test case level
c) Design level
d) Code level

The requirement level here refers to the functional requirements, while the test case level refers to the test descriptions and input parameters that describe all the situations that need to be tested to fulfil a requirement. In some systems, some requirements and test cases may be further decomposed into sub-components. To comply with this model, however, each is uniquely identified. To illustrate this, consider the following example.

Req#: 5
Code: SRS_REQ-02-05
Description: The driver presses an "Activation" button to activate the AutoCruise function.

The test involves two major groups of test cases, namely test case#1 and test case#2.

1) Test case#: 1
   Code: TCASE-12-01
   Description: Launch the Auto Cruise when speed > 80 km/hr.

   i) Test case#: 1.1
      Code: TCASE-12-01-01
      Description: Launch the Auto Cruise while not in fifth gear.

   ii) Test case#: 1.2
      Code: TCASE-12-01-02
      Description: Launch the Auto Cruise while in fifth gear.

2) Test case#: 2
   Code: TCASE-12-02
   Description: Display the LED with a warning message "In Danger" while on auto cruise if the speed is >= 150 km/h.

Now, Req#5 requires three sets of test cases instead of two, as the group test case#1 needs to be split into its individual test case#1.1 and test case#1.2.

The design level can be classified into high-level design abstractions (e.g. collaboration design models) and low-level design abstractions (e.g. class diagrams), or a combination of both. In this research, less attention is paid to deriving traceability from high-level design abstractions, as this would require further research and complicate the work. Low-level design abstractions are applied to represent a design that contains the class diagrams, while the code includes all the program statements in general.
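The unique identification scheme illustrated by Req#5 can be held in a simple lookup from requirement codes to their leaf-level test-case codes. The sketch below is hypothetical (the table type is not from the thesis), but the codes follow the example above:

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative traceability record: each requirement code maps to the
// uniquely identified test cases that exercise it. Note that Req#5 maps
// to three leaf test cases, since the group TCASE-12-01 is split into
// its individual parts.
std::map<std::string, std::vector<std::string>> requirementToTestCases() {
    return {
        {"SRS_REQ-02-05",
         {"TCASE-12-01-01", "TCASE-12-01-02", "TCASE-12-02"}}
    };
}
```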
4.3.5 Program Dependencies

As all the classes, methods and data (i.e. variables) in a program are related to one another, it is better to first understand the behavior of the program dependencies between them in object-oriented programming (OOP). The sections above discussed the general traceability between the requirements, test cases, design and code. As the traceability moves further down, the relationships need to be refined in more detail and any ambiguities resolved. For example, in the detailed design and code, there might be ambiguity about which artifacts should be represented where, as both may contain overlapping components; e.g. should the classes be classified as part of the design or the code? To some software developers, this is just a matter of development choice. As classes may appear in both design and code, the proposed model takes the design to contain all the classes and their interactions as they appear in the class diagrams, while the code contains all the methods, their contents and their interactions. Class diagrams can be derived from the design document (i.e. the SDD), from which the classes and their interactions can be captured. Fortunately, this information can also be recovered from the source code with the help of a code parser, which can be developed to capture the methods, their contents and their interactions.

In the above model, GC (in Section 4.3.3, part 4) can be further refined to include the detailed components as below.

GC = (V, E) ......................................................... (V-i)
V = {V1, V2, ......, Vt} ............................................ (V-ii)
E = {E1, E2, ......, Es} ............................................ (V-iii)
Er = Vi x Vj ........................................................ (V-iv)

Where:
GC - the global program dependency in code
V - a set of nodes (i.e. program artifacts) V1, V2, ......, Vt
E - a set of links E1, E2, ......, Es
Er - a program dependency created at one instance between two nodes, Vi and Vj.
x - a link between two nodes.

This can be interpreted as follows: GC consists of V, a set of nodes V1, V2, ......, Vt, and E, a set of links E1, E2, ......, Es. Er is the result of a program dependency between two nodes, Vi and Vj, at a given time. The artifact dependencies in the code fall into the following categories.

1. Class linked to a class

Vi, Vj ∈ VC ......................................................... (V-v)
E = VC x VC ......................................................... (V-vi)

Given Vi, Vj as instances of two different classes, there will be a class-class dependency. This can be achieved via inheritance or class friendship.

2. Method defined in a class

Vi ∈ VC ............................................................. (V-vii)
Vj ∈ VM ............................................................. (V-viii)
E = VM x VC ......................................................... (V-ix)

Given Vi as a class instance and Vj as a method instance, there will be a method-class dependency, which can occur when i) a method is defined in a class, or ii) a method is associated with another class via method friendship.

3. Data defined in a class

Vi ∈ VC ............................................................. (V-x)
Vj ∈ VD ............................................................. (V-xi)
E = VD x VC ......................................................... (V-xii)

Given Vi as a class instance and Vj as a data instance, there will be a data-class dependency, which can occur via i) a data declaration defined in a class, or ii) a data declaration defined in a class but associated with another class via composition, aggregation or association. More related information is found in the following section.

4. Data defined in a method

Vi ∈ VM ............................................................. (V-xiii)
Vj ∈ VD ............................................................. (V-xiv)
E = VD x VM .........................................................
(V-xv)

Given Vi as a method instance and Vj as a data instance, there will be a data-method dependency, which can occur via i) a data declaration defined in a method, or ii) a data declaration defined outside a method as global but within the same class, linked via attribute creation.

5. Method calls method

Vi, Vj ∈ VM ......................................................... (V-xvi)
E = VM x VM ......................................................... (V-xvii)

Given Vi, Vj as instances of two different methods, there will be a method-method dependency, which occurs when i) a method calls another method within a class, or ii) a method calls another method across two different classes.

6. Data uses data

Vi, Vj ∈ VD | (∀Ck, ∃Vi : Ck x Vi ∈ VC x VD) x (∀Ck, ∃Vj : Ck x Vj ∈ VC x VD) ............ (V-xviii)

OR

Vi, Vj ∈ VD | (∀Mk, ∃Vi : Mk x Vi ∈ VM x VD) x ((∀Mi, ∃Vj : Mi x Vj ∈ VM x VD ∩ Mk = Mi) ∪ (∀Ck, ∃Vj : Ck x Vj ∈ VC x VD)) ............ (V-xix)

E = VD x VD ......................................................... (V-xx)

Given Vi, Vj as instances of two different data, there will be a data-data dependency, which can occur when the first data item, Vi, uses (is assigned to, assigned by or applied to) the second data item, Vj. It is immaterial whether the two data items reside within the same class as global data (part V-xviii), within the same method as local data, or one from the other (part V-xix). No data item is allowed to access data of a different method of a different class, due to object-oriented protection of locally defined data.

4.4 Defining Object Oriented Impact Analysis

The object-oriented interactions need to be defined, because they determine the nature and behavior of impact analysis. These interactions are then summarized to conclude the nature of the information flow and the results of change impact.

4.4.1 Ripple Effects of Program Dependencies

The software artifacts in this research context of OO programming are defined as consisting of variables (i.e. data), methods and classes.
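The six dependency categories above can be recorded as typed edges in the dependency graph GC = (V, E). The following sketch is illustrative only (names and types are assumptions, not the thesis's implementation):

```cpp
#include <string>
#include <vector>

// Node kinds mirror the artifact types used in the dependency rules:
// classes (VC), methods (VM) and data (VD).
enum class Kind { Class, Method, Data };

struct Node { Kind kind; std::string name; };

// One edge Er = Vi x Vj, tagged with its dependency category
// (class-class, method-class, data-class, data-method,
//  method-method or data-data).
struct Edge { Node from; Node to; std::string category; };

struct DependencyGraph {
    std::vector<Edge> edges;

    // Record one program dependency between two artifacts.
    void link(Node vi, Node vj, const std::string& category) {
        edges.push_back(Edge{vi, vj, category});
    }
};
```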
The ripple effects of change determine how other artifacts would be affected if an artifact is changed. For example, consider a method-to-method relationship (MxM) of call invocations such as

M1 → M2
M1 → M4

where M1 calls two other methods, M2 and M4. If any change is made to M2 or M4, it will have a potential effect on M1. So, in this context of change impact, the analysis works the other way around: picking a callee and finding its corresponding callers for its potential effects. In other words, a change made to a callee may have a potential (ripple) effect on its callers.

Table 4.2 presents the program relations with definitions and examples of all possible dependencies in C++. As far as this work is concerned, the method-method (MxM), method-class (MxC) and class-method (CxM) relationships need to be established. The types of relationship are depicted in the relationship column. Each relationship is presented with a brief C++ example in the example column, while the definition column describes the meaning of the relationship. An arrow '←' represents cause and effect, i.e. the cause is on the right side and the effect is on the left side.

Table 4.2: Program relations in C++

Call (MxM)
  Definition: An operation (method) of the first class calls an operation of the second class.
  Impact: a1() ← b1()
  Example: class B { void b1(); };
           class A { void a1() { B test; test.b1(); } };

Composition (CxC)
  Definition: A class contains a data member of some other class type.
  Impact: A ← B
  Example: class B {};
           class A { B test; };

Create (MxC)
  Definition: Some operation of the first class creates an object of the second class (instantiation).
  Impact: a1() ← B
  Example: class B {};
           class A { void a1() { B t; } };

Friendship i) (CxC) ii) (CxM)
  Definition: Dependency between two classes via friendship.
  Impact: i) A ← B  ii) A ← c1()
  Example: class B {};
           class A { friend class B; friend int C::c1(); };

Uses (MxC)
  Definition: Data uses other data of different classes.
  Impact: a1() ← B, a1() ← C
  Example: class B { int k; ... };
           class C { int m; ... };
           class A { B b; C c; void a1() { int m = b.k + c.m; } };

Inheritance (CxC)
  Definition: Inheritance relation among classes.
  Impact: A ← B
  Example: class B {};
           class A : public B {};

Association (MxC)
  Definition: A class contains an operation with formal parameters that have some class type.
  Impact: a1() ← B
  Example: class B {};
           class A { void a1(B* par = 0); };

Aggregation (MxC)
  Definition: A class contains data members of pointer (reference) type to some other class.
  Impact: a1() ← B
  Example: class B {};
           class A { void a1() { B* test; } };

Define i) (MxC) ii) (CxM)
  Definition: A class contains data members and member functions.
  Impact: i) A ← a1()  ii) a1() ← A
  Example: class A { void a1(); };

In the inheritance relationship, if class A is inherited from class B, then any change made in class B may affect class A and all its lower subclasses, but not its upper classes. In the call relationship, there is a method-method relationship, as the called method b1() may affect the calling method a1(). In the composition relationship, test in class A is a data member of class B's type, which implies that a change in B may affect class A, a class-class relationship. In the create relationship, a change made in class B would affect the creation of objects in method a1(); thus, class B may affect method a1() in a method-class relationship. In the friend relationship, two types of friendship can occur, namely class friendship and method friendship. Class friendship results in a class-class relationship, as it allows another class to access the class's private attributes, while method friendship results in a method-class relationship, as it allows a method of another class to access the class's private attributes. In the use relationship, the use of data (i.e. variables or data members in C++) in data assignment is considered.
In this work, the use relationships that provide links between components of different methods and classes need to be applied. In Table 4.2, the use relationship involves data from other classes in a data assignment, e.g. k and m from classes B and C respectively. So, classes B and C would impact method a1() in a method-class relationship. In the association relationship, a class contains an operation with formal parameters that have some class type; a change of the class type B would affect method a1() in a method-class relationship. In the aggregation relationship, a class contains data members of pointer (reference) type to some other class, so a change in class B may affect method a1() of class A in a method-class relationship. Lastly, the define relationship observes i) a method-class relationship, when one or more methods are defined in a class, so that any change in a method affects its class, and ii) a class-method relationship, when a change in a class implies an impact on its methods.

4.4.2 Defined Dependencies

Based on the above discussion of program dependencies, the dependencies can be summarized into artifact and type relationships, as shown in Table 4.3. The types of relationship are either direct (no brackets) or indirect (in brackets). Direct interactions result from explicit links, and indirect interactions from implicit links between the two artifacts.

Table 4.3: Classifications of artifact and type relationships

Artifact Relationships    Types of Relationship
CxC                       composition, class friendship, inheritance [create, association, aggregation, friendship, uses, call, define]
CxM                       define, friendship [call]
MxC                       define, uses, association, create, aggregation [call]
MxM                       call

The reason behind these direct and indirect interactions is that an impact on a data item implies an impact on its method, and an impact on a method implies an impact on the class it belongs to.
Thus, they need to be made explicit by upgrading the lower-level matrices into the higher-level matrices, e.g. upgrading MxM into MxC, CxM and CxC. With these new formations, the existing tables captured earlier need to be updated. This gives a broader potential effect as the analysis moves to higher levels. The point here is to allow users to identify the impact at any level of relationship. Note that, in this context of analysis, MxC is not the same as CxM: MxC observes the potential effect of a method on other parts of the code in terms of classes, whereas CxM observes the potential effect of a class on other parts of the code in terms of methods.

4.5 Total Traceability of Artifact Links

Figure 4.4 represents the total traceability of the proposed model and approach. The horizontal relationships occur across boundaries, as shown by the thin solid arrows. The cross-boundary relationship for requirements-test cases is shown as RxT, test cases-methods as TxM, and so forth. The vertical relationships occur at the method level (MxM, method interactions), class level (CxC, class interactions) and package level (PxP, package interactions). The basic MxM interactions can simply be upgraded into class interactions and package interactions by the use of mapping tables, based on the fact that a package is made up of one or more classes and a class is made up of one or more methods. The thick dotted lines represent top-down and bottom-up traceability. By top-down tracing, the model can identify traceability from the higher-level artifacts down to the lower levels, e.g. from a test case we can identify its associated implementation code. Bottom-up tracing allows us to identify the impacted artifacts from a lower to a higher level of artifacts, e.g. from a method we can identify its impacted test cases and requirements.
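The upgrading of method-level interactions to class level via a mapping table can be sketched as follows. This is an illustrative fragment (names are assumptions); the method-to-class map plays the role of the mapping table, and the same scheme lifts CxC to PxP with a class-to-package table:

```cpp
#include <map>
#include <set>
#include <string>

// Upgrade an MxM interaction table to CxC using a method-to-class
// mapping table: if method m1 depends on method m2, then class(m1)
// depends on class(m2).
std::map<std::string, std::set<std::string>> upgrade(
    const std::map<std::string, std::set<std::string>>& mxm,
    const std::map<std::string, std::string>& methodToClass) {
    std::map<std::string, std::set<std::string>> cxc;
    for (const auto& [caller, callees] : mxm)
        for (const auto& callee : callees)
            cxc[methodToClass.at(caller)].insert(methodToClass.at(callee));
    return cxc;
}
```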
Figure 4.4: System artifacts and their links

4.6 Summary

The software traceability model highlights the ability to establish links between requirements, test cases, design and code. The traceability of the proposed model depends on the infrastructure of artifacts and links produced by the hypothesizing process. This process incorporates both static and dynamic analyses in order to capture a rich set of artifacts and relationships. Careful consideration is given to analyzing the program dependencies in detail, covering methods, classes and packages. A set of well-defined dependencies is then used to determine the behavior of object-oriented impact analysis. With the help of an upgrading mechanism between artifacts, the model provides the opportunity to trace artifacts both top-down and bottom-up when performing impact analysis.

CHAPTER 5

DESIGN AND IMPLEMENTATION OF SOFTWARE TRACEABILITY

5.1 Introduction

The objective of this chapter is to design and implement the proposed software traceability model and approach. The system traceability design encompasses the traceability architecture, the system Use Case, module interactions and operation, followed by a brief explanation of the implementation. Some parts of the early process require support from supporting tools, which are also covered in this chapter.

5.2 System Traceability Design

The proposed system is viewed as a complete system within its defined scope, covering several subsystems and interfaces. It is classified into the software traceability architecture, Use Case diagram, modules and operation to facilitate understanding of the design process.

5.2.1 Software Traceability Architecture

A new traceability system is designed to fulfil a critical need of the change impact process, as described in SCM.
This application attempts to automate change impact analysis, which has long been treated as one of the core issues in software maintenance. However, the emergence of this application is not meant to provide a total solution to software maintenance, but rather to support maintenance tasks in addition to the existing software maintenance packages. Before proceeding to the system implementation, let us first understand the system architecture and its environment.

Figure 5.1: A view of the Software Traceability architecture

CATIA was developed in Java and runs on Microsoft Windows. It is about 4k LOC, specially built to analyze the impact of a change request on a target system written in C++. CATIA (Figure 6.2) was designed to consist of four main functionalities: the artifact repository, impact analysis, impact measurement and the impact viewer. It is assumed that all the software knowledge to be used by CATIA has been extracted from the system documentation and code using supporting tools. This information needs to be analyzed and transformed into the artifact repository for further impact analysis. Figure 5.1 represents an overview of the software traceability architecture, which consists of CATIA as the main system together with the following related interfaces.

a) Change Request

A change request represents a need for software modification. In this research context, the change request has been translated into identified artifacts of the initial target of change, called the primary artifacts. With these primary artifacts in hand, the software maintainer interacts with CATIA, a new software traceability prototype tool specially developed to support change impact.
b) Artifact Repository

The artifact repository represents a collection of artifacts and their links captured from the available resources, including documentation, code and other related information. This information needs to be stored in a database of artifact relationship tables to make it readily available for CATIA.

c) Hypothesizing Process

The hypothesizing process captures the artifacts and their links from the available resources. This process involves three main components, namely documentation, code and supporting tools. Supporting tools are required to hypothesize traces of the existing artifacts using dynamic and static analyses, as described in the previous chapter.

d) CATIA System

CATIA represents the main part of the proposed software traceability system. CATIA is developed as a software prototype tool to implement the model in order to observe its effectiveness and usability. The CATIA system can be divided into three main functionalities: impact analysis, impact measurement and the impact viewer.

i) Impact Analysis

The objective of impact analysis, from this research perspective, is to establish the potential impacts of a proposed change through both vertical and horizontal traceability. Here, vertical traceability can occur between a) methods (MxM), b) classes (CxC) and c) packages (PxP). Packages are considered design components at a higher level of abstraction, as each consists of one or more classes. Meanwhile, horizontal traceability covers relationships that occur across different levels of artifacts, such as requirements-test cases (RxT), test cases-methods (TxM), requirements-methods (RxM) and requirements-classes (RxC). Given a proposed change as a primary artifact, the tool should be able to identify the ripple effects on the impacted artifacts in either top-down or bottom-up tracing. In top-down tracing, it can identify the traceability from the higher-level artifacts down to the lower levels, e.g.
from a test case, its associated implementation methods can be identified. Bottom up tracing identifies the impacted artifacts from lower to higher level artifacts, e.g. from a method, its affected requirements can be identified.

ii) Impact Measurement

Impact measurement is essential when a user would like to know some metrics related to the impacted components in terms of LOC (lines of code) and VG (cyclomatic complexity values). The measurement is computed as CATIA keeps track of and discovers more impacted components during the impact search. As this happens, it collects and sums up the LOC and VG of the impacted artifacts.

iii) Impact Viewer

The impact viewer provides an interface that enables a user to interact with the CATIA system. Users can identify the impacted components and other associated information via this impact viewer.

5.2.2 CATIA Use Case

Figure 5.2 shows the Use Case diagram of the CATIA system, which involves a user and a main process called Do Impact. The user represents a software maintainer who is responsible for making changes to the software system. The software maintainer interacts with the Do Impact process by selecting the initial target artifacts for change. In response to this request, the Do Impact process determines the potential impacts and classifies them into artifact types for viewing and summary purposes.

[Figure 5.2: Use Case diagram of CATIA system — Software Maintainer linked to the Do Impact use case]

1) Do Impact Process

a) Aim
This process provides a service to the software maintainer in an effort to identify the potential impacts of a change.

b) Characteristic of Activation
It is activated upon the software maintainer's request.

c) Pre Condition
CATIA is initialized.

d) Basic Flows
i) The process starts receiving and verifying the user request.
ii) The process interprets user input into control commands.
iii) The process activates and controls appropriate actions in response to the control commands.
iv) The process searches for and captures artifacts and their related information through the artifact relationships.
v) The process acknowledges and updates the appropriate metric and impact tables.
vi) The Use Case ends.

5.2.3 CATIA Class Interactions

The CATIA system is made up of ten main classes specifically designed to automate software traceability for impact analysis. Figure 5.3 shows the CATIA class diagram, in which the arrows represent the activations between classes. Upon receiving an activation, each class, via its main operation, manages and activates other related operations to carry out its tasks. Each class contains a set of operations and attributes; however, the attributes are purposely left out of the figure to show only the class functionalities. Figure 5.4 complements the class diagram by showing the sequence of operations of the CATIA implementation. CATIA begins its operation by launching the StartUp class. StartUp sets up the system environment to establish a database connection and initialize some variables and pointers; the initialization process is handed over to another class, Initializer. StartUp then triggers the Manage_Primary class to manage and control the change impact. This class accepts and validates the user input for the primary and secondary categories of artifacts involved in the impact. Once the primary category is set appropriately, Manage_Primary activates the Manage_Secondary class to manage the secondary, or potential, search impacts. The impact search requires some initialization of the appropriate metric and impact tables, which are updated accordingly during the impact task. The update process of these tables is managed by triggering Mtd_Impact, Cls_Impact, Tcs_Impact, Pkg_Impact or Req_Impact as necessary, one at a time. The total summary is then computed and updated for viewing purposes. Manage_Secondary expects another input option from the user to view the impact summary and calls the Display_Impact class to fulfil the user's view request.
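The activation flow described above can be sketched in Java, CATIA's implementation language. The class and operation names echo the thesis's class diagram (Manage_Primary, perform_secIMP); however, the bodies, data structures and numbers below are illustrative assumptions, not CATIA's actual source.

```java
import java.util.*;

// Simplified sketch of the Manage_Primary -> Manage_Secondary activation
// and the LOC/VG accumulation described under Impact Measurement.
// The relationship-table names (MxM, CxM, ...) follow the thesis; the
// data structures here are illustrative placeholders.
public class ManagePrimarySketch {

    // Map a primary artifact type to the relationship table that links it
    // to secondary methods, as perform_secIMP() does case by case.
    static String tableFor(String primaryType) {
        switch (primaryType) {
            case "method":      return "MxM";
            case "class":       return "CxM";
            case "package":     return "PxM";
            case "test case":   return "TxM";
            case "requirement": return "RxM";
            default: throw new IllegalArgumentException(primaryType);
        }
    }

    // Scan one row of a binary relationship table: every '1' marks an
    // impacted secondary artifact whose LOC and VG are accumulated.
    static int[] sumImpacts(int[] row, int[] loc, int[] vg) {
        int count = 0, totalLoc = 0, totalVg = 0;
        for (int j = 0; j < row.length; j++) {
            if (row[j] == 1) { count++; totalLoc += loc[j]; totalVg += vg[j]; }
        }
        return new int[]{count, totalLoc, totalVg};
    }

    public static void main(String[] args) {
        int[] row = {1, 0, 1, 1, 0};       // hypothetical: methods 0, 2, 3 impacted
        int[] loc = {40, 10, 30, 25, 5};   // hypothetical LOC per method
        int[] vg  = {15, 2, 12, 6, 1};     // hypothetical VG per method
        int[] sum = sumImpacts(row, loc, vg);
        System.out.println(tableFor("method") + ": " + sum[0]
                + " impacted, LOC=" + sum[1] + ", VG=" + sum[2]);
    }
}
```

The same row-scan is reused for every primary/secondary pairing; only the table changes, which is why the thesis's five prim*_Do_impact() algorithms read almost identically.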
To facilitate understanding of these operations and the interaction process between classes, a set of class algorithms is attached below.

[Figure 5.3: CATIA Class Diagrams — StartUp (main()); Initializer (Init_variables(), create_Mtd_rships(), create_Cls_rships(), create_Pkg_rships(), create_Tcs_rships()); Manage_Primary (manage_primary(), accept_UsrInput(), validate_input(), request_view()); Manage_Secondary (Perform_secIMP(), Initial_IMP(), primMtd_Do_impact(), primCls_Do_impact(), primPkg_Do_impact(), primTcs_Do_impact(), primReq_Do_impact()); Mtd_Impact (sumMtd_impact()); Cls_Impact (sumCls_impact()); Tcs_Impact (sumTcs_impact()); Pkg_Impact (sumPkg_impact()); Req_Impact (sumReq_impact()); Display_Impact (display_impact())]

[Figure 5.4: CATIA Sequence diagrams — StartUp, Initializer, Manage_Primary, Manage_Secondary, the five *_Impact classes and Display_Impact exchanging Init_variables(), manage_primary(), accept_UsrInput(), validate_input(), perform_secIMP(), sumMtd_impact(), sumCls_impact(), sumTcs_impact(), sumPkg_impact(), sumReq_impact(), request_view() and display_view()]

1. StartUp Class

The main() begins the launch of the CATIA application.

i) Pre condition
The database connection is physically and logically configured for the Microsoft Access database.

ii) Post condition
CATIA is operational.

iii) Algorithm
A brief outline of the operation:
Check the database connection.
Set and initialize the system status to workable.
Activate Init_variables() to initialize values and variables.
Activate manage_primary() to start the primary impact.
Close the system.

2. Initializer Class

Init_variables() sets the class pointers by defining all the global variables used at the launch of CATIA. This includes the initialization of class variables and the database connection. Init_variables() is also responsible for the initialization of the artifact relationships.

i) Pre condition
CATIA is running.

ii) Post condition
All impact tables, metric tables and summary tables are initialized.
iii) Algorithm
A brief outline of the operation:
Initialize all global variables and pointers.
Set the primary and secondary impacts to null.
Initialize the metric, impact and summary tables.
Create the artifact table pointers.

3. Manage_Primary Class

manage_primary() manages the primary artifacts that cause the impact. It checks the primary and secondary input to activate the impacts.

i) Pre condition
All impact tables, metric tables and summary tables are initialized.

ii) Post condition
The potential artifacts are defined. All impact tables, metric tables and summary tables are updated.

iii) Algorithm
A brief outline of the operation:
Activate accept_UsrInput() to receive input from the user for both the primary and secondary selections.
Activate validate_input() to verify the input.
Set one primary artifact at a time. For each primary artifact, choose among the following cases:
Case primary artifact = 'method': set the secondary artifact to capture the impact; set the relationship tables of the said primary vs. secondary; activate perform_secIMP().
Case primary artifact = 'class': set the secondary artifact to capture the impact; set the relationship tables of the said primary vs. secondary; activate perform_secIMP().
Case primary artifact = 'package': set the secondary artifact to capture the impact; set the relationship tables of the said primary vs. secondary; activate perform_secIMP().
Case primary artifact = 'test case': set the secondary artifact to capture the impact; set the relationship tables of the said primary vs. secondary; activate perform_secIMP().
Case primary artifact = 'requirement': set the secondary artifact to capture the impact; set the relationship tables of the said primary vs. secondary; activate perform_secIMP().
Repeat the process until the end of the list of primary artifacts.

4. Manage_Secondary Class

perform_secIMP() manages the potential impacts for each primary artifact.
i) Pre condition
The corresponding metric tables and impact tables are initialized.

ii) Post condition
The corresponding metric tables and impact tables are updated.

iii) Algorithm
A brief outline of the operation:
Activate Initial_IMP().
If the current primary type = 'method': initialize the metric and impact tables associated with the primary 'method'; activate primMtd_Do_impact() to manage the secondary impacts; activate sumMtd_impact() to summarize the impacts associated with the primary 'method'.
If the current primary type = 'class': initialize the metric and impact tables of CxM; activate primCls_Do_impact() to manage the secondary impacts; activate sumCls_impact() to summarize the impacts associated with the primary 'class'.
If the current primary type = 'package': initialize the metric and impact tables of PxM; activate primPkg_Do_impact() to manage the secondary impacts; activate sumPkg_impact() to summarize the impacts associated with the primary 'package'.
If the current primary type = 'test case': initialize the metric and impact tables of TxM; activate primTcs_Do_impact() to manage the secondary impacts; activate sumTcs_impact() to summarize the impacts associated with the primary 'test case'.
If the current primary type = 'requirement': initialize the metric and impact tables of RxM; activate primReq_Do_impact() to manage the secondary impacts; activate sumReq_impact() to summarize the impacts associated with the primary 'requirement'.

4-a. Algorithm for primMtd_Do_impact()

This operation manages the primary artifacts of type 'method'.

i) Pre condition
The metric and impact tables of the primary type 'method' are initialized.

ii) Post condition
The metric and impact tables are updated.

iii) Algorithm
A brief outline of the operation:
If the current secondary artifact = 'method':
Search for occurrences of '1' in the MxM table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of MxM.
If the current secondary type = 'class':
Search for occurrences of '1' in the MxC table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of MxC.
If the current secondary type = 'package':
Search for occurrences of '1' in the MxP table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of MxP.
If the current secondary type = 'test case':
Search for occurrences of '1' in the MxT table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of MxT.
If the current secondary type = 'requirement':
Search for occurrences of '1' in the MxR table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of MxR.

4-b. Algorithm for primCls_Do_impact()

This operation manages the primary artifacts of type 'class'.

i) Pre condition
The metric and impact tables of the primary type 'class' are initialized.

ii) Post condition
The metric and impact tables are updated.

iii) Algorithm
A brief outline of the operation:
If the current secondary artifact = 'method':
Search for occurrences of '1' in the CxM table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of CxM.
If the current secondary type = 'class':
Search for occurrences of '1' in the CxC table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of CxC.
If the current secondary type = 'package':
Search for occurrences of '1' in the CxP table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of CxP.
If the current secondary type = 'test case':
Search for occurrences of '1' in the CxT table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of CxT.
If the current secondary type = 'requirement':
Search for occurrences of '1' in the CxR table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of CxR.

4-c. Algorithm for primPkg_Do_impact()

This operation manages the primary artifacts of type 'package'.

i) Pre condition
The metric and impact tables of the primary type 'package' are initialized.

ii) Post condition
The metric and impact tables are updated.

iii) Algorithm
A brief outline of the operation:
If the current secondary artifact = 'method':
Search for occurrences of '1' in the PxM table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of PxM.
If the current secondary type = 'class':
Search for occurrences of '1' in the PxC table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of PxC.
If the current secondary type = 'package':
Search for occurrences of '1' in the PxP table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of PxP.
If the current secondary type = 'test case':
Search for occurrences of '1' in the PxT table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of PxT.
If the current secondary type = 'requirement':
Search for occurrences of '1' in the PxR table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of PxR.

4-d. Algorithm for primTcs_Do_impact()

This operation manages the primary artifacts of type 'test case'.

i) Pre condition
The metric and impact tables of the primary type 'test case' are initialized.

ii) Post condition
The metric and impact tables are updated.

iii) Algorithm
A brief outline of the operation:
If the current secondary artifact = 'method':
Search for occurrences of '1' in the TxM table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of TxM.
If the current secondary type = 'class':
Search for occurrences of '1' in the TxC table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of TxC.
If the current secondary type = 'package':
Search for occurrences of '1' in the TxP table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of TxP.
If the current secondary type = 'requirement':
Search for occurrences of '1' in the TxR table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of TxR.

4-e. Algorithm for primReq_Do_impact()

This operation manages the primary artifacts of type 'requirement'.

i) Pre condition
The metric and impact tables of the primary type 'requirement' are initialized.

ii) Post condition
The metric and impact tables are updated.

iii) Algorithm
A brief outline of the operation:
If the current secondary artifact = 'method':
Search for occurrences of '1' in the RxM table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of RxM.
If the current secondary type = 'class':
Search for occurrences of '1' in the RxC table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of RxC.
If the current secondary type = 'package':
Search for occurrences of '1' in the RxP table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of RxP.
If the current secondary type = 'test case':
Search for occurrences of '1' in the RxT table.
If found, count the number of impacted artifacts and identify their size and complexity.
Increment the total impacted sizes and complexities.
Update the metric and impact tables of RxT.

5. Mtd_Impact Class

sumMtd_impact() manages the summary of the potential impacts of primary type 'method'.

i) Pre condition
The summary table of primary type 'method' is initialized.

ii) Post condition
The summary table is updated.

iii) Algorithm
A brief outline of the operation:
Initialize the values of the method impact summary.
Create the summary from the metric and impact tables for the primary 'method'.

6.
Cls_Impact Class

sumCls_impact() manages the summary of the potential impacts of primary type 'class'.

i) Pre condition
The summary table of primary type 'class' is initialized.

ii) Post condition
The summary table is updated.

iii) Algorithm
A brief outline of the operation:
Initialize the values of the class impact summary.
Create the summary from the metric and impact tables for the primary 'class'.

7. Pkg_Impact Class

sumPkg_impact() manages the summary of the potential impacts of primary type 'package'.

i) Pre condition
The summary table of primary type 'package' is initialized.

ii) Post condition
The summary table is updated.

iii) Algorithm
A brief outline of the operation:
Initialize the values of the package impact summary.
Create the summary from the metric and impact tables for the primary 'package'.

8. Tcs_Impact Class

sumTcs_impact() manages the summary of the potential impacts of primary type 'test case'.

i) Pre condition
The summary table of primary type 'test case' is initialized.

ii) Post condition
The summary table is updated.

iii) Algorithm
Initialize the values of the test case impact summary.
Create the summary from the metric and impact tables for the primary 'test case'.

9. Req_Impact Class

sumReq_impact() manages the summary of the potential impacts of primary type 'requirement'.

i) Pre condition
The summary table of primary type 'requirement' is initialized.

ii) Post condition
The summary table is updated.

iii) Algorithm
A brief outline of the operation:
Initialize the values of the requirement impact summary.
Create the summary from the metric and impact tables for the primary 'requirement'.

10. Display_Impact Class

display_impact() manages the display of the impacts as requested by the user.

i) Pre condition
The summary tables are initialized.

ii) Post condition
The summary tables are updated.
iii) Algorithm
A brief outline of the operation:
Initialize all the metric and impact summary tables.
Sum up the summaries of all the metric and impact tables.

5.3 CATIA Implementation and User Interfaces

CATIA assumes that a change request has already been translated and expressed in terms of some acceptable artifacts, i.e. requirements, packages, classes, methods or test cases. CATIA was designed to manage the potential effect of one type of artifact at a time. The system works such that, given an artifact as a primary impact, CATIA determines its effects on other artifacts, known as the secondary artifacts.

[Figure 5.5: First user interface of CATIA]

Figure 5.5 shows the initial user interface, where the user is required to choose a type of primary artifact. Upon this selection, CATIA displays a list of all the related artifacts of that type for the user to view. From the list, the user needs to specify the names of the artifacts of interest to be treated as the primary artifacts. The user may choose from the list or type in the right codes to represent the actual artifacts. Figure 5.6 shows that the user has selected mtd2, mtd12 and mtd15 to represent the initial primary artifacts. Upon receiving the primary artifacts, CATIA prompts another window that requires the user to decide the types of secondary artifacts for which to view the impact.

[Figure 5.6: Primary artifacts at method level]

Figure 5.7 shows that the primary artifacts mtd2, mtd12 and mtd15 are displayed with their respective summary impacts. Taking mtd2 as an example, this method has a potential effect on four other methods in the system, which brings the total metrics to LOC 117 and VG 47. In terms of classes, mtd2 has impacted three classes, with LOC 190 and VG 71. In terms of packages, mtd2 has impacted two packages of size LOC 279 and VG 100.
mtd2 was also associated with four test cases, whose impacted artifacts took up total metrics of LOC 373 and VG 123. This detailed information is produced in addition to the existing summary via a message window.

[Figure 5.7: Summary of impacted artifacts]

Figure 5.7 also shows the detailed impacts, e.g. mtd2 has caused an impact on three classes: cls1, cls2 and cls8. Cls1 is CDrivingStation, with a total size and complexity of 39 and 14 respectively; the total impacted size and complexity were 37 and 12. Cls2 is CPedal, with a total size and complexity of 18 and 6 respectively; the total impacted size and complexity were 16 and 4. Cls8 is CCruiseManager, with a total size and complexity of 133 and 51 respectively; the total impacted size and complexity were 64 and 31.

5.4 Other Supporting Tools

Three types of supporting tools were developed to facilitate the software traceability model and approach, namely a code parser, an instrumentation tool and a test scenario tool. The first deals specifically with static analysis of the code, while the other two were designed and implemented based on dynamic analysis of test case execution. The output produced by these tools was collected and analyzed to later be used as input to implement software traceability for change impacts.

5.4.1 Code Parser

Some useful information must be extracted from the source code before the subsequent traceability analysis takes place. This information represents the program structures that need to be captured by identifying the following artifacts.

1. Identify the classes with their associated methods.
2. Identify the inheritances and friendships.
3. Identify the call invocations.
4. Identify the aggregations, associations and compositions.
5. Identify the component relationships that represent how external data are being used in the current method or class via data uses.

The detailed description of the code parser is available in Appendix C.
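The kind of extraction listed above can be illustrated with a small Java fragment that pulls class names and out-of-line method definitions (Class::method) from C++ source text. This regex-based sketch is an assumption made for illustration only; a real parser such as TokenAnalyzer must handle far more (templates, inheritance lists, friendships, call sites).

```java
import java.util.*;
import java.util.regex.*;

// Illustrative sketch of items 1-2 of the extraction list: find class
// declarations and out-of-line method definitions in C++ source text.
public class ParserSketch {
    static final Pattern CLASS_DEF  = Pattern.compile("\\bclass\\s+(\\w+)");
    static final Pattern METHOD_DEF = Pattern.compile("(\\w+)::(\\w+)\\s*\\(");

    // Returns the class names declared in the source text.
    static List<String> classes(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = CLASS_DEF.matcher(src);
        while (m.find()) out.add(m.group(1));
        return out;
    }

    // Returns "Class.method" pairs for out-of-line definitions.
    static List<String> methods(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = METHOD_DEF.matcher(src);
        while (m.find()) out.add(m.group(1) + "." + m.group(2));
        return out;
    }

    public static void main(String[] args) {
        String src = "class CPedal { void press(); };\n"
                   + "void CPedal::press() { }\n"
                   + "CCruiseManager::checkCruiseMode() { }\n";
        System.out.println(classes(src)); // [CPedal]
        System.out.println(methods(src)); // [CPedal.press, CCruiseManager.checkCruiseMode]
    }
}
```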
The author experimented with some existing code parsers such as Columbus (Columbus, 2005), McCabe (McCabe, 2005) and CodeSurfer (GrammaTech, 2005) to help capture the above information. However, due to the limited capabilities of these existing tools, a new code parser called TokenAnalyzer was developed. TokenAnalyzer was written in Java to collect the software dependencies of C++ programs. All this information is then summarized and transformed into a database of transformation matrices, or tables for short. The tables are created to contain the binary relationships of paired components in the form of the MxM, MxC, CxM and CxC program dependencies. These dependencies are made available later, during the implementation of the software traceability process.

5.4.2 Instrumentation Tool

The instrumentation process complements the static analysis of the above code parser. A new instrumentation tool, called CodeMentor, was developed to support the instrumentation process prior to the test scenarios. CodeMentor was written in C to automatically insert some 'markers' into the original C++ code. These 'markers' are inserted at the following statements without affecting the original functionality of the program.

• Start of method/class - the e switch inserts instrumentation at the entry point of any function or class, including the main() function.
• Return - the r switch inserts instrumentation whenever an explicit or an implicit RETURN is encountered.
• Exit - the x switch inserts instrumentation whenever an EXIT statement is encountered.
• If - the d switch inserts instrumentation whenever an IF decision statement is encountered.

The 'markers' are dummy statements that indicate the presence of '#TRACE#', line numbers and file names at the above points in the programs. These dummy statements are called the instrumented statements. Figure 5.8 highlights some instrumented statements, as shown by the arrows.
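CodeMentor itself was written in C; purely for illustration, the entry-point idea behind its e switch can be sketched in Java as follows. The one-line brace heuristic and the marker text are simplifying assumptions, not the tool's actual logic.

```java
import java.util.*;

// Sketch of entry-point instrumentation in the spirit of CodeMentor's
// 'e' switch: after each line that opens a function body, insert a
// dummy '#TRACE#' statement carrying the line number. Real brace and
// scope tracking is far more involved; this is an illustration only.
public class InstrumentSketch {
    static List<String> instrument(List<String> code) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            String line = code.get(i);
            out.add(line);
            // crude heuristic: a line ending a signature with '{' opens a body
            if (line.matches(".*\\)\\s*\\{\\s*")) {
                out.add("fputs(\"\\n#TRACE# Entry Line=" + (i + 1)
                        + "\", dst_file1);");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList(
            "CCruiseInfo::CCruiseInfo() {",
            "    bActive = false;",
            "}");
        instrument(src).forEach(System.out::println);
    }
}
```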
When the test scenarios are performed, the 'markers' of the impacted components are produced to represent the impacted code. The instrumentation process produces a new code version, called the instrumented code. The instrumented code is then compiled to generate an instrumented target program. With the help of some test cases, the test scenarios of each requirement are performed over the instrumented target program. The execution process produces trace records that represent the parts of the program affected by the test cases. The test scenario process is explained further in the following section.

    /********************************************************************/
    /* FUNCTION NAME    : CCruiseInfo                                   */
    /* FUNCTION PURPOSE : CCruiseInfo constructor                       */
    /* PARAMETER        : None                                          */
    /* RETURN VALUE     : None                                          */
    /********************************************************************/
    CCruiseInfo::CCruiseInfo()
    {
        /* entry instrumentation */
        fputs("\n#TRACE# I am Inside Function *CCruiseInfo.CCruiseInfo()* Line=55", dst_file1);
        fprintf(dst_file1, "^ %d ^", ++count_exe);
        bActive = false;
        bAcceleration = false;
        bAccelerationPedal = false;
        /* decision (IF) instrumentation */
        if (countIFTrue1 > 0) {
            fputs("\n#TRACE# If statement Line : 59", dst_file1);
            ++countIFTrue1;
        }
        bResume = false;
        bSuspend = false;
        dPrevCruiseSpeed = 0;
        throttle->controlThrottle(throttleCommand, throttlePosition, 0);
        displayMsg->displayCommonMsg("--------------------"); // display
        led->lightLed(ledShm, 1, false);
        led->lightLed(ledShm, 2, false);
        led->lightLed(ledShm, 3, false);
        /* return instrumentation */
        fputs("\n#TRACE# END OF Function CCruiseInfo.CCruiseInfo() Line=67 ", dst_file1);
        fprintf(dst_file1, "^ %d ^", ++count_exe);
    }
    /******************* END OF FUNCTION *************************/

Figure 5.8: The Instrumented Code

5.4.3 Test Scenario Tool

Test scenarios are the process of executing test cases in order to determine the impacted code of a program execution.
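The core step of analyzing a test scenario's output — reading the dumped '#TRACE#' records and keeping only the non-duplicate methods — can be sketched as follows. The record format is taken from the instrumented code of Figure 5.8; everything else in this fragment is an illustrative assumption.

```java
import java.util.*;
import java.util.regex.*;

// Sketch of the trace-record analysis: scan the '#TRACE#' lines dumped
// during a test scenario and collect the non-duplicate impacted methods.
public class TraceSketch {
    static final Pattern ENTRY =
        Pattern.compile("#TRACE# I am Inside Function \\*([\\w.]+)\\(\\)\\*");

    static Set<String> impactedMethods(List<String> traceLines) {
        Set<String> methods = new LinkedHashSet<>();   // preserves first-seen order
        for (String line : traceLines) {
            Matcher m = ENTRY.matcher(line);
            if (m.find()) methods.add(m.group(1));     // duplicates are ignored
        }
        return methods;
    }

    public static void main(String[] args) {
        List<String> trace = Arrays.asList(
            "#TRACE# I am Inside Function *CCruiseManager.checkCruiseMode()* Line=88",
            "#TRACE# If statement Line : 59",
            "#TRACE# I am Inside Function *CCruiseManager.checkCruiseMode()* Line=88");
        // one entry despite two executions of the same method
        System.out.println(impactedMethods(trace));
    }
}
```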
[Figure 5.9: Some 'markers' extracted from the class CCruiseManager]

[Figure 5.10: The impacted methods of the class CCruiseManager]

A new test scenario tool, called TestAnalyzer, was developed to capture this impacted code in terms of non-duplicate executed methods. During the test scenarios, the 'markers' are dumped into .txt files, as shown in Figure 5.9. TestAnalyzer is then run to read and analyze the results of the test scenarios in the .txt files. TestAnalyzer has the ability to capture the non-duplicate impacted methods and map them to the appropriate classes, as shown in Figure 5.10. The figure represents the class CCruiseManager with its four impacted methods: CCruiseManager(), checkCruiseMode(), getOperationSignal(int intKeycode) and regulateSpeed(int intOperationCode). Test scenarios are used as a means to establish the basic relationships between test cases and methods, which can further link to requirements.

5.5 Summary

A software traceability approach was specifically developed in this research to support the crucial need for change impact analysis. This chapter has described the system architecture of the proposed model and its design, including the system Use Case and class diagrams. The implementation process was designed to include class interactions, traceability algorithms and user interfaces. The main contribution of the proposed system can be observed in its ability to support top down and bottom up tracing in an effort to identify the potential impacts of change. Other supporting tools were used to capture and link artifacts at various levels for the model implementation.

CHAPTER 6

EVALUATION

6.1 Introduction

This chapter discusses the evaluation methods for the software traceability model and approach, which involve evaluating the model, a case study and an experiment. The objective of this evaluation is to confirm and prove that the proposed software traceability model and approach support change impact analysis.
To achieve this objective, firstly, this research elaborates on the modeling itself, showing how each model specification item is satisfied. Secondly, the output of the analysis is tested on a case study and a controlled experiment. Thirdly, quantitative and qualitative evaluations are measured based on the scored results. Some other qualitative values are also considered through a comparative study of existing traceability models and approaches in order to strengthen the overall findings. The chapter concludes with a summary of the proposed model.

6.2 Evaluating the Model

The validation process can be applied to several software workproducts, including requirements, design, testing and code. The objective of this modeling evaluation is to prove and verify how the underlying construction of the model can be achieved.

6.2.1 Hypothesizing Traces

To understand the validation process better, let us recall the initial construction of the software traceability model.

S = (G, E) ................................................................ (H-i)
G = GR ∪ GD ∪ GC ∪ GT ........................................ (H-ii)
E = ER ∪ ED ∪ EC ∪ ET ......................................... (H-iii)

1) Requirement Traceability
ER ⊆ GR x SGR ....................................................... (H-iv)
SGR = GD ∪ GC ∪ GT ............................................. (H-v)
GR = {R1, R2, ..., Rn} .............................................. (H-vi)

2) Design Traceability
ED ⊆ GD x SGD ...................................................... (H-vii)
SGD = GR ∪ GT ∪ GD ∪ GC ................................... (H-viii)
GD = {D1, D2, ..., Dn} .............................................. (H-ix)

3) Test Case Traceability
ET ⊆ GT x SGT ....................................................... (H-x)
SGT = GR ∪ GD ∪ GC ............................................. (H-xi)
GT = {T1, T2, ..., Tn} ............................................... (H-xii)

4) Code Traceability
EC ⊆ GC x SGC ...................................................... (H-xiii)
SGC = GR ∪ GD ∪ GT ∪ GC ................................... (H-xiv)
GC = {C1, C2, ..., Ct} ............................................... (H-xv)
GC = (V, E) .............................................................. (V-i)
V = {V1, V2, ..., Vt} .................................................. (V-ii)
E = {E1, E2, ..., Es} ..................................................
(V-iii)
Er = Vi x Vj .............................................................. (V-iv)

4-a) Class linked to a class
Vi, Vj ∈ VC .............................................................. (V-v)
E = VC x VC ............................................................ (V-vi)

4-b) Method defined in a class
Vi ∈ VC ................................................................... (V-vii)
Vj ∈ VM .................................................................. (V-viii)
E = VM x VC ........................................................... (V-ix)

4-c) Data defined in a class
Vi ∈ VC ................................................................... (V-x)
Vj ∈ VD ................................................................... (V-xi)
E = VD x VC ............................................................ (V-xii)

4-d) Data defined in a method
Vi ∈ VM .................................................................. (V-xiii)
Vj ∈ VD ................................................................... (V-xiv)
E = VD x VM ........................................................... (V-xv)

4-e) Method calls method
Vi, Vj ∈ VM ............................................................. (V-xvi)
E = VM x VM .......................................................... (V-xvii)

4-f) Data uses data
Vi, Vj ∈ VD | (∀Ck, ∃Vi : Ck x Vi ∈ VC x VD) x (∀Ck, ∃Vj : Ck x Vj ∈ VC x VD) ........ (V-xviii)
OR
Vi, Vj ∈ VD | (∀Mk, ∃Vi : Mk x Vi ∈ VM x VD) x ((∀Mi, ∃Vj : Mi x Vj ∈ VM x VD ∩ Mk = Mi) ∪ (∀Ck, ∃Vj : Ck x Vj ∈ VC x VD)) ........ (V-xix)
E = VD x VD ............................................................ (V-xx)

The above model can be generalized as a system impact that consists of four main categories, i.e. requirements, design, test cases and code. The main idea now is how to capture and establish the artifact relationships within and across each level of workproducts, as illustrated by the model specifications H-i, H-ii and H-iii. The process involves the task of capturing and tracing artifacts in an environment called the hypothesizing process (Figure 6.1).
Clarify knowledge Test Cases 2. Observe traces Design 3. Generate traces 3. Generate traces Code Figure 6.1: Hypothesized and observed traces Hypothesized traces can often be elicited from system documentation or corresponding software models. It is not important in this approach whether the hypotheses should be performed by manual exploration through the available documentations and software models or by automation with the help of some supporting tools. However, in this context of analysis, some supporting tools were developed to facilitate the process. Figure 6.1 presents the steps to fulfill the hypothesized trace process, as follows. 1. For each requirement, identify some selected test cases (RxT). Clarify this knowledge in the available documentation. This process is intended to satisfy validation associated to requirements (H-iv, H-v and H-vi). 2. Run test scenarios (i.e. dynamic analysis) using TestAnalyzer, to observe traces between the test cases and implementation methods (TxM). Another supporting tool, called CodeMentor was developed to instrument the source code prior to test scenarios. This will satisfy validation associated to test cases (H-x, H-xi and Hxii). 3. Perform a static analysis on code to capture the program relation structures of class diagrams using a third supporting tool, called TokenAnalyzer. This 114 process creates program dependencies that can be upgraded to higher level artifacts based on the upgrading scheme as described in Section 4.5. This process will produce a class-class (CxC), method-class (MxC), class-method (CxM) dependency deriving from the basic method-method (MxM). This will partly satisfy traceability associated to code (H-xiii to H-xv, V-i to V-xxi) and design (H-xiii, H-xiv, H-xv). 6.2.2 Software Traceability and Artifact Links High level traceability is referred to as a tracing involving the requirements and test cases. This tracing involves dynamic analysis of program execution via test scenarios. 
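As a minimal sketch of the model hypotheses above (H-i to H-vi), the artifact sets and trace links can be expressed directly as sets and checked for well-formedness; the artifact names here are hypothetical, not taken from the OBA system:

```python
# Hypothetical artifact sets illustrating S = (G, E) with G = GR ∪ GD ∪ GC ∪ GT (H-i, H-ii).
GR = {"R1", "R2"}            # requirements
GD = {"D1"}                  # design artifacts
GC = {"M1", "M2"}            # code artifacts (methods here)
GT = {"T1"}                  # test cases
G = GR | GD | GC | GT

# Requirement trace links: ER ⊆ GR x SGR, where SGR = GD ∪ GC ∪ GT (H-iv, H-v).
SGR = GD | GC | GT
ER = {("R1", "T1"), ("R1", "M1"), ("R2", "D1")}

# Every requirement trace must start in GR and end in SGR (H-iv).
assert all(src in GR and dst in SGR for src, dst in ER)
```

The same membership check applies, with the corresponding target sets SGD, SGT and SGC, to the design, test case and code traceability hypotheses.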
In this approach, each requirement is characterized by some test cases. Having the RxT and TxM from the hypothesized traces earlier, a new RxM can be produced using a transitive closure.

(RxT) and (TxM) → (RxM)

OBA developers were employed to further refine the RxM with the help of the supporting tools (i.e. CodeMentor and TestAnalyzer) to give the true impacts of RxM. The refined RxM was then used to compute a RxC and RxP by upgrading the methods into classes and the classes into packages. For the bottom-up traceability, the underlying MxR (i.e. the inverted RxM table) was used and transformed into CxR and PxR. The same principle applied to the test cases with their low level artifacts.

The low level traceability was classified as a tracing that can be established around the code and design levels. This traceability confirms the validation associated with code (H-xiii to H-xv, V-i to V-xx). The impacts can be captured between components based on static analysis of the program relation structures, as discussed in Section 4.4.1. The impacts are then upgraded such that an impact to a method implies an impact to its class, and an impact to a class implies an impact to the package it belongs to. These dependencies make it possible to upgrade the underlying MxM into MxC, MxP, PxM, CxM, CxC, CxP, PxC and PxP with the help of mapping tables.
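The composition and upgrading described above can be sketched as relational operations over mapping tables. The class and package names below are taken from the OBA description in Section 6.3.3; the method names and the particular links are hypothetical:

```python
# RxT: requirement -> test cases; TxM: test case -> methods (from test scenarios).
RxT = {"R1": {"T1", "T2"}}
TxM = {"T1": {"m_start", "m_speed"}, "T2": {"m_speed", "m_fuel"}}

# Transitive closure: (RxT) and (TxM) -> (RxM).
RxM = {r: set().union(*(TxM[t] for t in ts)) for r, ts in RxT.items()}

# Upgrading scheme: a method impact implies its class, a class impact its package.
method_class = {"m_start": "CIgnition", "m_speed": "CSpeedCtrl", "m_fuel": "CFuel"}
class_package = {"CIgnition": "Driving Station", "CSpeedCtrl": "Speed Mgmt", "CFuel": "Trip Mgmt"}

RxC = {r: {method_class[m] for m in ms} for r, ms in RxM.items()}
RxP = {r: {class_package[c] for c in cs} for r, cs in RxC.items()}
```

Inverting the RxM table in the same way yields the MxR, CxR and PxR tables used for the bottom-up traceability.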
Table 6.1: Summary of model specification and techniques used

  Model specification           Traceability               Techniques used
  H-i to H-iii                  System traceability        Based on the artifact perspective
  H-iv to H-vi                  Requirement traceability   Test scenarios, transitive closure, upgrading scheme
  H-vii to H-ix                 Design traceability        Static analysis on program relation structures, upgrading scheme
  H-x to H-xii                  Test case traceability     Test scenarios, upgrading scheme
  H-xiii to H-xv, V-i to V-xx   Code traceability          Test scenarios, transitive closure, static analysis on program relation structures, upgrading scheme

At the low level traceability, the vertical links can occur between a) method-method, b) class-class, and c) package-package, while the horizontal links can occur across boundaries between a) method-class, b) method-package, and c) class-package.

Table 6.1 summarizes the validation process of the system traceability. System traceability is composed of requirement, design, test case and code traceability. Requirement traceability can be achieved via test scenarios, transitive closure and the upgrading scheme. Test case traceability is verified by the same test scenarios and upgrading scheme, but no transitive closure is required. Design traceability can be satisfied using static analysis on program relation structures and upgrading, while code traceability can be achieved via test scenarios, transitive closure, static analysis on program relation structures and the upgrading scheme.

6.3 Case Study

The main criteria that apply to this empirical study are: i) a software project as a case study with complete system documentation, ii) the software components of the system development are always updated and consistent, iii) the project adopts the object-oriented methodology, and iv) the subjects are familiar with the software project and documentation standard.
To ensure that criterion ii) really holds, the best system to consider is the most recently developed one. In this research, the traceability method was applied to a case study of a software project called the Automobile Board Auto Cruise (OBA).

6.3.1 Outline of the Case Study

OBA was a teamwork development project assigned to final year postgraduate students of software engineering at the Centre For Advanced Software Engineering, Universiti Teknologi Malaysia. The OBA project was assigned to groups of students in an effort to realize and practice a software development methodology based on MIL-STD-498 (MIL-STD-498, 2005). The targeted OBA system for the case study was specially selected from the best group of the year. The software system was about 4K LOC with complete 480-page written documentation adhering to the MIL-STD-498 standard. The software design was built based on the UML specification and design standards (Booch et al., 1999), with the code written in C++. The detailed analysis of the OBA case study is described in Appendix D.

The remarkable point of selecting this case study was that the subjects (students) were already familiar with the case study and its domain knowledge, which is one of the crucial criteria for establishing an experiment. The students had just completed their three month OBA development project. Prior to the experimentation, the code was analyzed to obtain the program dependencies of methods, classes and packages. Substantial time and effort was spent translating each requirement into its implementation code via test scenarios, as mentioned earlier. The objective was to obtain the traceability infrastructures for the vertical and horizontal levels before the controlled experiment started.

6.3.2 About the OBA Project

The OBA project itself is an interface system built to allow a driver to interact with his car while in auto cruise or non-auto cruise mode.
In auto cruise mode, interactions like accelerating, suspending speed, resuming speed and braking should be technically balanced against the current speed of the car. In non-auto cruise mode, manual interactions such as pressing or releasing pedals to change speed, braking, mileage maintenance and filling fuel need to be taken into consideration. The interface should also allow changing modes between auto cruise and non-auto cruise. The OBA developers had to balance the fuel consumption against the current speed in order to regulate the cruising speed at a certain level. Exceeding the speed limit triggers a warning message. A warning message also applies to mileage maintenance when the total trip reaches the maintenance kilometer limit. In addition, the OBA system should be able to interact with an engine simulator, a readily supported subsystem that simulates the running of the mechanical parts of the engine.

The OBA was initially designed by the THALES Corporation (formerly known as THOMSON), a France-based company specializing in communications, telecommunications and military weapons, including software development. THALES' initial target for OBA was to provide comprehensive training to its clients in software development methodology. With the help of the prescribed design and guidelines, the clients or students proceed with the OBA development to its completion, including software programs and documentation. As an industrially based project, OBA was designed taking into consideration software engineering principles and standards. The existing OBA was designed and prepared based on the MIL-STD-498 standard.

6.3.3 OBA Functionality

The OBA hierarchical structure can best be explained in MIL-STD-498 terms. As a whole system, OBA can be classified as a Computer Software Configuration Item (CSCI). The CSCI can be split into several subsystems, called Computer Software Components (CSC).
Each CSC represents a package in the object oriented concept. Each CSC may consist of one or more classes, called Computer Software Units (CSU). Figure 6.2 shows the relationships amongst the CSCs in the CSCI OBA.

[Figure 6.2: CSC relationships within the CSCI OBA — API, Display Mgmt, Control Panel, Driving Station, Car Utility, global, Trip Mgmt, Mileage Mgmt, Cruise Mgmt, Maintenance Mgmt, Speed Mgmt, Init Mgmt and Timer Mgmt]

The OBA system can be classified into the following processes.

• Driving Station
Driving Station implements the Start Vehicle and Control Cruising Speed capabilities. This package manages the activities of interaction between the CSCI and the shared memory. It provides the capability for the CSCI to access and read or write the gear data in the shared memory. The CSC consists of:
- CDrivingStation Class
- CPedal Class
- CIgnition Class
- CGear Class

• Control Panel
This package implements the Control Cruising Speed capability that performs calculation for kilometer reference, acceleration, average fuel consumption on trip, average fuel consumption between two fillings and odd parity. The CSC consists of:
- CKeypad Class

• Init Mgmt
This package provides the capability to start the vehicle, initialize the fuel tank and kilometer reference, and reset the initial distance. The CSC consists of:
- CFcdInitMgmt Class
- CInitManager Class

• Cruise Mgmt
This package manages the auto-cruise control. It provides the capability to maintain a constant speed during auto-cruise and notifies the driver of over speeding. The CSC consists of:
- CFcdCruise Class
- CCruiseManager Class
- CCruiseInfo Class

• Car Utility
This package manages the activities of interaction between the car and the shared memory. It provides the capability for the CSCI to access and read or write the throttle (voltage) and shaft (pulse) data in the shared memory.
The CSC consists of:
- CThrottle Class

• Mileage Mgmt
This package holds the classes for calibration and mileage calculation. This CSC consists of:
- CFcdMileage Class
- CCalibrationManager Class
- CMileageCtrl Class
- CCalibration Class

• Timer Mgmt
This package holds the classes responsible for time calculation. This CSC consists of:
- CTimer Class

• Speed Mgmt
This package manages the pulses and speed-related operations. The speed process provides the capability for the CSCI to regulate the speed and acceleration. This CSC consists of:
- CSpeedCtrl Class

• Trip Mgmt
This package manages the new trip status, distance, time and fuel consumption. This CSC consists of:
- CFcdTrip Class
- CTripManager Class
- CFuelManager Class
- CTrip Class
- CFuel Class

• Maintenance Mgmt
This package implements the capabilities of Start Vehicle, Supervise Average Fuel Consumption, and Assist Maintenance Schedule. It also provides the capability for the CSCI to monitor the vehicle distance and notify the driver of the required maintenance controls based on the vehicle distance. This CSC consists of:
- CFcdMaintenance Class
- CMaintenanceManager Class
- CMaintenance Class

• Display Mgmt
This package provides the capability for the CSCI to access and read the LED and Display Panel data to and from the shared memory. This CSC consists of:
- CDisplay Class
- CLED Class

• API
This package provides the interface between the CSCI and other components with the shared memory. This CSC consists of:
- OBATargetMachine Class
- InterruptMaskRegister Class
- InterruptVectorSlot Class

6.4 Controlled Experiment

The aim of the experiment is to see how well the prototype can support the maintenance task in terms of its efficiency and accuracy in determining the change impact. The subjects were provided with a complete OBA system that included a set of requirements, test cases, design model, source code and documentation.
Given some change requests, the subjects were first asked to obtain the potential effects of the change requests. Secondly, the subjects had to derive from these potential effects the actual change impacts for comparison. The potential effects can simply be obtained using the CATIA prototype. For the actual change impacts, however, the subjects had to manually explore the code and other components. Some preliminary guidelines and a briefing were given prior to the controlled experiment to make sure that the subjects really understood the objective of the evaluation.

6.4.1 Subjects and Environment

The subjects of the study were post-graduate students in the final semester of the software engineering course at the Centre For Advanced Software Engineering, Universiti Teknologi Malaysia. They knew basic software maintenance as this subject was taught in the post-graduate programme. The students were familiar with C++ as this language was taught and used in some course work projects, besides C and Java. In the coursework programme, the students were taught software project management and how to read and write documentation based on the DOD standard, MIL-STD-498. All the subjects were familiar with the OBA project as this case study was developed as a teamwork project along with its complete documentation.

The subjects were motivated to get involved in the experiment by making clear that they would gain valuable experience in the know-how of software maintenance. From investigation, all the students had working experience of at least one year in the software industry, except one with less than one year. Before the experiment started, the students' particulars were collected, including their C++ grades, so that the students could be divided into groups of equally distributed knowledge and working experience.
6.4.2 Questionnaires

The questionnaires were specially formulated to support the user evaluation of the controlled experiment. The main objective of the assessment was to measure the efficiency and accuracy of the change impact. The questionnaires consisted of two sections (see Appendix A): Section A and Section B. Section A related to the subjects' previous professional background, and Section B to the use of CATIA and the maintenance method. Section A was targeted at collecting information about the subjects' working experience just before they enrolled in the post-graduate study.

Section B was targeted at recording the scores of the dependent variables and the user assessment of the CATIA prototype with respect to change impact analysis. The independent variables were the CATIA prototype, the subjects and the OBA system, while the dependent variables were the scores of change impact. The scheme of score evaluation was formulated based on (Kendall and Kendall, 1998). The details are discussed in the following section.

The ripple effects are classified into two traceability modes: vertical and horizontal. Vertical traceability observes the ripple effects within a component level, e.g. at the method level. Horizontal traceability observes the ripple effects across different component levels, e.g. from the methods to classes, test cases and requirements, or vice versa. A few questions asked the subjects what steps they had taken and which source of information, the documentation or the source code, consumed most of their time while identifying the change impact. A free requirement selection required the subjects to select one from a set of the available requirements and capture the associated ripple effects in terms of the test cases, design and code.
There were six questions related to the features of CATIA, in which the users were asked to rate its usefulness with respect to impact analysis. The users were also requested, as an open question, to give comments on the overall performance of CATIA.

6.4.3 Experimental Procedures

Twenty participants were involved in a one day experiment on the above OBA maintenance project case study. They were divided into five groups of fairly distributed knowledge and working experience. Subjects were then given a briefing and training session before the actual experiment took place. They were provided with a set of system documentation, the CATIA tool (Appendix B) and the source code of the OBA project. Subjects then had to perform some understanding tasks and one impact analysis task prior to the experimentation. The idea behind this training session was to avoid confounding factors arising from subjects not being aware of the experimental setting and procedures, and of how to perform impact analysis in the required manner.

The experiment proper was then performed, in which each group was given two sets of change requests. The first set was meant to exercise bottom-up traceability, while the second set was meant to exercise top-down traceability. In the first set, the subjects had to transform a change request into some understandable software components (PIS, the primary impact set) and use the CATIA tool to generate the potential effects (SIS, the secondary impact set) in terms of the number of components impacted, their sizes and complexities. Based on the SIS, the groups had to derive the actual impacts (AIS, the actual impact set). The AIS needed to be manually verified by exploring and searching through the right paths in the code and other components, including design, test cases and requirements. They had to consult the documentation if necessary. Lastly, they had to modify, compile and submit the compiled code together with the results they had collected.
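The first-set task of expanding a PIS into a SIS over the component dependencies, then narrowing the SIS down to an AIS, can be sketched as graph reachability; the dependency edges and component names below are hypothetical:

```python
from collections import deque

def secondary_impact(pis, deps):
    """SIS: every component reachable from the PIS through dependency edges."""
    seen, queue = set(pis), deque(pis)
    while queue:
        node = queue.popleft()
        for nxt in deps.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical method-level dependencies (MxM).
deps = {"m1": {"m2", "m3"}, "m3": {"m4"}, "m5": {"m1"}}
pis = {"m1"}
sis = secondary_impact(pis, deps)
ais = {"m1", "m3"}    # the subset the subjects verify manually in the code
assert ais <= sis     # Inclusiveness = 1 when the AIS lies inside the SIS
```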
In the second set of change requests, the subjects had to choose any one of the existing requirements, i.e. its corresponding requirement number, and use CATIA to search for the impacted artifacts, including methods, classes, packages and test cases.

6.4.4 Possible Threats and Validity

A few factors may contribute to possible threats in this study. First, the subjects who participated in the study were post-graduate students of software engineering. Although the experiment was conducted in an academic setting, the subjects were representative of software professionals as they had working experience in the industry. Thus, the threat to external validity, i.e. that the result of the research could not be generalized, can be eliminated. A laboratory setting such as this allows the investigation of a larger number of hypotheses at a lower cost than field studies.

Second, there could be an unequal distribution of working experience in each group. To counterbalance this issue, the individuals were also evaluated based on their academic grades in the C++ module. During the analysis of the experiment, a test of the correlation of subject expertise with the number of correct answers was conducted, and no significant correlation was found between the expertise factors and the dependent variables. This indirectly eliminated the confounding factor that the result might depend on the experience of the users.

Third, the experiment tried to reduce the internal threats to the minimum possible, for example the issue of plagiarism when each group is given the same type of change request. Although the result could then be more comparable, an act of plagiarism is very difficult to prevent, and the reliability of the result would be questionable. To avoid this phenomenon, each group was assigned a different change request. Five sets of change requests were developed, each with a different problem to solve.

Fourth, there is the issue of unfamiliarity with the tool under evaluation.
To overcome this learning curve issue, short training on the use of the CATIA tool was given prior to the experimentation. The controlled experiment was conducted on all groups under proper supervision. A one day experiment was allocated in order to secure a good result and to keep the participants' enthusiasm in the study. It was foreseen that the subjects might very frequently flip through the documentation to search for common information. To reduce this problem, some useful information was made available, including documentation, code and tables, in both hard and soft copies for easy reference. Lastly, there could be some careless mistakes in the results the subjects had drawn, caused by the human error of manually examining the software components. This factor was eliminated by double checking the work flows of each group after the experimentation.

6.5 Experimental Results

Below is a discussion of the research results based on the controlled experiment. The research results are divided into two parts: the user evaluation and the scored results.

6.5.1 User Evaluation

This research applied the SPSS statistical package to evaluate the results of the experiment. A few statistical procedures were used in the evaluation process, such as cross tabulation, frequencies and the One-Way ANOVA test.

Table 6.2: Cross tabulation of experience versus frequencies

  Working experience   Frequency   Percent   Valid Percent   Cumulative Percent
  Less than 1 year         1          5.0         5.0               5.0
  1 - 3 years             14         70.0        70.0              75.0
  4 - 6 years              4         20.0        20.0              95.0
  More than 6 years        1          5.0         5.0             100.0
  TOTAL                   20        100.0       100.0

Twenty subjects participated in the controlled experiment. The majority of them had more than one year of experience in the industry before they enrolled in the postgraduate study (Table 6.2).
Only one student with less than one year was recorded, while fourteen students had between 1 and 3 years of experience, four had 4 to 6 years and one had more than 6 years of working experience. In terms of job position (Table 6.3), 35.0 percent had worked as programmers, 40.0 percent as software engineers/system analysts, and 25.0 percent as project leaders/project managers.

Table 6.3: Cross tabulation of previous jobs versus frequencies

  Previous job                        Frequency   Percent   Valid Percent   Cumulative Percent
  Programmer                              7         35.0        35.0              35.0
  Software Engineer/System Analyst        8         40.0        40.0              75.0
  Project Leader/Project Manager          5         25.0        25.0             100.0
  TOTAL                                  20        100.0       100.0

The division of the subjects into five groups was made such that each group had as fair a distribution of expertise and working experience as possible. This is illustrated by Table 6.4 through Table 6.6. To verify this distribution, a cross-tabulation analysis and a mean test were performed. The data in Table 6.7 through Table 6.10 show that the combination of previous job, years of working experience and C++ familiarity provides a fairly even point average of group distribution. From the analysis using a One-Way ANOVA test between the scores and the group distribution, there is no significant evidence that the correct scores are affected by the groups.
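A One-Way ANOVA of this kind compares between-group and within-group variance of the scores; a small F statistic supports the conclusion that group membership does not affect the scores. A pure-Python sketch with hypothetical score data (not the experiment's raw scores):

```python
def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over lists of scores."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical correct-answer scores for five groups of four subjects each.
scores = [[7, 8, 6, 7], [8, 7, 7, 8], [6, 7, 8, 7], [7, 7, 8, 6], [8, 6, 7, 7]]
f_stat = one_way_anova_f(scores)  # a small F suggests no group effect
```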
Table 6.4: Cross tabulation of previous jobs versus group distribution

  Previous job                        Group A   Group B   Group C   Group D   Group E   TOTAL
  Programmer                              1         1         2         2         1       7
  Software Engineer/System Analyst        2         2         1         1         2       8
  Project Leader/Project Manager          1         1         1         1         1       5
  TOTAL                                   4         4         4         4         4      20

Table 6.5: Distribution of experience across the groups (Groups A - E, four subjects each)

  Years in software development   TOTAL
  Less than 1 year                   1
  1 - 3 years                       14
  4 - 6 years                        4
  More than 6 years                  1
  TOTAL                             20

Table 6.6: Distribution of program familiarity across the groups (Groups A - E, four subjects each)

  Program familiarity   TOTAL
  B-                       2
  B                        6
  B+                       5
  A                        7
  TOTAL                   20

Table 6.7: Mean of previous jobs for groups

  Groups    Mean   Frequency   Standard Deviation
  Group A   2.00       4             .82
  Group B   2.00       4             .82
  Group C   1.75       4             .96
  Group D   1.75       4             .96
  Group E   2.00       4             .82
  TOTAL     1.90      20             .79

Table 6.8: Mean of experience for groups

  Groups    Mean   Frequency   Standard Deviation
  Group A   2.25       4             .50
  Group B   2.25       4            1.26
  Group C   2.25       4             .50
  Group D   2.25       4             .50
  Group E   2.25       4             .50
  TOTAL     2.25      20             .64

Table 6.9: Mean of program familiarity for groups

  Groups    Mean   Frequency   Standard Deviation
  Group A   2.75       4            1.50
  Group B   2.75       4            1.50
  Group C   3.00       4             .82
  Group D   3.00       4             .82
  Group E   3.00       4             .82
  TOTAL     2.90      20            1.02

Table 6.10: Point average of group distribution

  Groups    Point Average
  Group A       2.33
  Group B       2.33
  Group C       2.33
  Group D       2.33
  Group E       2.41

Table 6.11: Mean of scores for impact analysis features

  Impact analysis feature      Frequency   Mean   Standard Deviation
  Easy to use in overall           20       4.80        .70
  Easy to find SIS                 20       4.40        .82
  Supports AIS                     20       4.40        .60
  Supports cost                    20       3.80       1.51
  Supports traceability            20       4.75        .72
  Speeds up AIS                    20       4.65        .88
  Helps traceability in doc.       20       4.15       1.31
  Usefulness of GUI                20       3.90       1.07

The subjects were asked to evaluate the CATIA tool (Table 6.11) with respect to its usefulness in supporting change impact analysis on a six-point scale (1-Not At All, 2-Low, 3-Moderate, 4-Useful, 5-Very Useful, 6-Extremely Useful). From the perspective of overall ease of use (Figure 6.3), 10% of the subjects chose 6 (extremely useful), 65% chose 5, 20% chose 4, and 5% chose 3. The majority of the sample agreed that the CATIA tool is easy to use overall. On the question of ease of finding the SIS (Figure 6.4), 5% chose 6, 45% chose 5, 35% chose 4, and 15% chose 3. The result in Table 6.11 shows that the SIS support was rated very satisfactory.

[Figure 6.3: Easiness to Use in Overall]
[Figure 6.4: Easiness to Find SIS]
[Figure 6.5: AIS Support]

On the support of AIS (Figure 6.5), 5% chose 6, 30% chose 5, and 65% chose 4. The result in Table 6.11 also shows that the AIS support was rated satisfactory. With regard to cost estimation support (Figure 6.6), 10% chose 6, 30% chose 5, 20% chose 4, 20% chose 3, 10% chose 2, and 10% chose 1. In terms of traceability support (Figure 6.7), 10% chose 6, 60% chose 5, 25% chose 4, and 5% chose 3; the result shows that the traceability support was rated very satisfactory. The majority of the subjects agreed that the tool could speed up the AIS. On the question of helping traceability in documentation (Figure 6.8), 10% chose 6, 45% chose 5, 10% chose 4, 20% chose 3, and 15% chose 2. The majority also agreed that the tool provided a sufficient GUI.
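The reported means follow directly from these response distributions; as a quick consistency sketch using only figures stated in the text, the "Easy to find SIS" percentages above reproduce the 4.40 mean of Table 6.11:

```python
# 'Easy to find SIS' responses on the 6-point scale (percent of the 20 subjects).
distribution = {6: 5, 5: 45, 4: 35, 3: 15}  # scale value -> percent choosing it

mean = sum(scale * pct for scale, pct in distribution.items()) / 100
assert round(mean, 2) == 4.40  # matches Table 6.11
```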
[Figure 6.6: Cost Estimation Support]
[Figure 6.7: Traceability Support]
[Figure 6.8: Documentation Support]

6.5.2 Scored Results

The OBA project comprises 46 requirements, 34 test cases, 12 packages, 23 classes and 80 methods. Table 6.12 shows the results of the bottom-up impact analysis collected from the experiment. The results show how a PIS produced at the lowest level affects the SIS and AIS of the higher levels. For example, PCR#1 produced two PIS, five SIS and three AIS at the method level. At the class level, PCR#1 produced two PIS, two SIS and one AIS. At the package level, PCR#1 produced one PIS, one SIS and one AIS. At the test case level, PCR#1 produced 16 AIS, and at the requirement level, nineteen AIS. Note that there were no PIS at the test case and requirement levels because the dynamic analysis of concept location, as produced by the user, was applied there.

The scored results of the experiment in Table 6.12 were then compared to the previously defined benchmark described in Chapter 3. The evaluation can be explained as follows. At each software level, the Inclusiveness was 1, meaning that the AIS was always part of the SIS. This indicates that the actual impacts were within the estimated impacts, and thus the further metrics such as S-Ratio, Amplification and Change Rate were meaningful. The S-Ratio expresses how closely the AIS adheres to the SIS. The data show that adherence gets worse (i.e. the S-Ratio decreases) as the degree of granularity gets finer.
This means that the maintainer will have to devote substantial maintenance effort to searching for the components that will actually be modified. The results show that the Amplification increases as the granularity gets finer, while the Change Rate increases as the granularity gets coarser. The difference is expected, as the Amplification is linked to the PIS while the Change Rate is linked to the SIS: the PIS increases as the granularity gets finer, and the SIS gets closer to the overall system as the granularity gets coarser.

Table 6.12: The results of bottom-up change impacts

  Grp   Levels    PIS#   SIS#   AIS#   Incl   S-Ratio   Ampl   Change Rate
  A     Method      2      5      3      1      0.60     2.5       0.06
        Class       2      2      1      1      0.50     1.0       0.09
        Package     1      1      1      1      1.0      1.0       0.08
        T. Case     -     30     16      1      0.53      -        0.88
        Reqt.       -     44     19      1      0.43      -        0.96
  B     Method      2     13      8      1      0.62     6.5       0.16
        Class       2      4      3      1      0.75     2.0       0.17
        Package     2      3      2      1      0.67     1.5       0.25
        T. Case     -     32     13      1      0.41      -        0.94
        Reqt.       -     44     17      1      0.38      -        0.96
  C     Method      3     10      6      1      0.60     3.33      0.13
        Class       3      4      3      1      0.75     1.33      0.17
        Package     2      3      2      1      0.67     1.5       0.25
        T. Case     -     34     16      1      0.47      -        1.0
        Reqt.       -     46     20      1      0.44      -        1.0
  D     Method      3     11      6      1      0.55     3.67      0.14
        Class       3      4      3      1      0.75     1.33      0.17
        Package     2      2      2      1      1.0      1.0       0.17
        T. Case     -     34     10      1      0.29      -        1.0
        Reqt.       -     46     15      1      0.33      -        1.0
  E     Method      2      5      3      1      0.60     2.5       0.06
        Class       2      2      2      1      1.00     1.0       0.09
        Package     2      2      2      1      1.0      1.0       0.17
        T. Case     -     31      8      1      0.26      -        0.91
        Reqt.       -     45     22      1      0.49      -        0.98

  Analysis Effort (PIS), in minutes:      A: 22, B: 27, C: 30, D: 35, E: 25
  Specification Effort (AIS), in minutes: A: 20, B: 35, C: 27, D: 25, E: 22

The S-Ratio averages at the method, class and package levels were calculated to be 0.59, 0.75 and 0.87 respectively. These results show that as the degree of granularity gets finer (e.g. methods), the SIS tends to be less accurate, i.e. 0.59. As the degree of granularity gets coarser, the AIS and SIS become more generalized and the ratio gets closer to 1.0. The Amplification is not applicable to the test cases and requirements as no PIS are reported for these artifacts.
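The derived columns of Table 6.12 are consistent with computing the metrics per row as sketched below, with the formulas assumed from the Chapter 3 benchmark: S-Ratio = AIS#/SIS#, Amplification = SIS#/PIS#, and Change Rate = SIS# over the system size at that level. Group A's method row is used as the example (the OBA system has 80 methods):

```python
def impact_metrics(pis, sis, ais, system_size):
    """Benchmark metrics for one granularity level (component counts)."""
    return {
        "s_ratio": ais / sis,              # how closely the AIS adheres to the SIS
        "amplification": sis / pis,        # how much the SIS extends the PIS
        "change_rate": sis / system_size,  # share of the system estimated impacted
    }

# Group A, method level (Table 6.12): PIS=2, SIS=5, AIS=3 over 80 methods.
m = impact_metrics(pis=2, sis=5, ais=3, system_size=80)
assert round(m["s_ratio"], 2) == 0.60
assert m["amplification"] == 2.5
assert round(m["change_rate"], 2) == 0.06
```

The same formulas reproduce the other rows, e.g. the class-level Change Rate of 0.09 for group A (2 of 23 classes).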
At the test case and requirement levels, the SIS produced were too generalized, as almost all the test cases and requirements were affected by the PIS. This is due to the fact that some common program modules, such as the start/end module and the variable initialization module, are always called when executing any test case or requirement.

Two efficiency metrics of the maintenance process were also assessed: the Analysis Effort, the time the subjects spent searching for the PIS, and the Specification Effort, the time taken to search for the AIS. These efforts were expressed in minutes. The data show that the Analysis Effort required was greater in groups C and D; the increased effort can be explained by an increase in PIS# and SIS#. Group B recorded the highest Specification Effort due to an increase in AIS#. In conclusion, the greater effort devoted to the analysis of a finer grained model is counterbalanced by the greater accuracy of the modification accomplished.

Table 6.13 shows the results of the top-down impact analysis collected from the experiment. In the bottom-up traceability above, this research found that unnecessarily large SIS were produced by a small PIS. To reduce this issue, substantial time was allocated prior to the actual experiment to capturing and selecting the true impacts, by employing developers to implement test scenarios on each test case and requirement. They captured all the potential impacts and derived from them the true impacts with the help of the supporting tool, TestAnalyzer. Thus, in the top-down traceability the user decisions were treated as AIS and no SIS applied.

Table 6.13 shows that Req9 (group A) caused impacts on 22 methods, 13 classes, 9 packages and 3 test cases. The impacts took up LOC (348) and VG (116) on methods, LOC (473) and VG (154) on classes, LOC (491) and VG (160) on packages, and LOC (348) and VG (116) on test cases.
Req46 (group B) caused impacts on 26 methods, 11 classes, 8 packages and 3 test cases. The impacts took up LOC (389) and VG (106) on methods, LOC (477) and VG (123) on classes, LOC (473) and VG (133) on packages, and LOC (389) and VG (106) on test cases, and so forth. Note that no requirement is allowed to cause an impact on another requirement, as mentioned earlier.

Table 6.13: Results of top-down impact analysis

Grp  PCR#  Primary Artifact  Secondary Artifacts  Count  LOC  VG
A    1     Req9              Methods               22    348  116
                             Classes               13    473  154
                             Packages               9    491  160
                             T. Cases               3    348  116
                             Reqts.                 -      -    -
B    2     Req46             Methods               26    389  106
                             Classes               11    477  123
                             Packages               8    473  133
                             T. Cases               3    389  106
                             Reqts.                 -      -    -
C    3     Req22             Methods               23    289   99
                             Classes               14    473  154
                             Packages              10    491  160
                             T. Cases               2    289   99
                             Reqts.                 -      -    -
D    4     Req36             Methods               23    229   64
                             Classes               12    352  109
                             Packages               8    485  138
                             T. Cases               2    229   64
                             Reqts.                 -      -    -
E    5     Req32             Methods               19    244   72
                             Classes               10    399  114
                             Packages               7    485  138
                             T. Cases               1    244   72
                             Reqts.                 -      -    -

6.6 Findings of the Analysis

This section discusses the findings of the research from two perspectives: first, the findings drawn from the quantitative perspective, based on the scored results gathered during the controlled experiment; second, the findings drawn from the qualitative perspective, based on the user evaluation in the experiment as well as on existing models and approaches of similar types of application.

6.6.1 Quantitative Evaluation

The Inclusiveness metric was 1 in each case analyzed, meaning that the AIS was always contained within the SIS. This finding shows that the automated impact analysis was not affected by errors, i.e. the further metrics are meaningful. The S-Ratio metric expresses how far the estimated impact (SIS) corresponds to the actual impact (AIS). Table 6.14 shows that the average S-Ratio was 0.87 for packages, 0.75 for classes and 0.59 for methods, which means adherence gets worse (i.e. the S-Ratio decreases) as the degree of granularity gets finer.
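The LOC and VG figures in Table 6.13 are aggregates over the impacted artifacts at each level. The aggregation can be sketched as follows; the list of impacted methods and their per-method sizes and complexities are hypothetical illustrations:

```python
# Hypothetical impacted methods: (name, LOC, VG cyclomatic complexity)
impacted_methods = [
    ("validate", 40, 12),
    ("dispatch", 55, 18),
    ("logEvent", 25, 7),
]

# Aggregate the size and complexity of the impact set at the method level
count = len(impacted_methods)
total_loc = sum(loc for _, loc, _ in impacted_methods)
total_vg = sum(vg for _, _, vg in impacted_methods)
print(count, total_loc, total_vg)  # 3 120 37
```

The same summation over the enclosing classes and packages yields the class-level and package-level rows of the table.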
Based on the benchmark discussed in Section 3.5.3, our results fall under priority 1, i.e. the desired trend is an S-Ratio closer to 1.0.

Table 6.14: Results of artifact S-Ratio

Artifact  S-Ratio  Priority
Package   0.87     1
Class     0.75     1
Method    0.59     2

From the above result, it is obvious that when packages are the target artifacts of the ripple effects, CATIA is almost certain to locate the actually affected ones. If classes are the next finer target artifacts of interest, the result is less certain. If methods, the finest target artifacts, are examined, the results are less accurate still, although all remain within the expected impacts. In this case, it is presumed that the maintainer will have to devote a large part of the maintenance effort to discarding the estimated components that will not actually be modified. Therefore, the system administrator should decide whether efficiency or accuracy of the maintenance task is the main objective to be satisfied, and choose the granularity of the software level accordingly.

Amplification expresses how much the SIS extends the PIS. The result shows that the Amplification increases as the granularity gets finer. This is due to the fact that more component dependencies exist at the lower level, so the SIS tends to increase considerably. The Change Rate expresses how large the estimated impact is relative to the overall system. The data show that this metric gets worse (i.e. increases) as the granularity gets coarser. More maintenance effort is required as the degree of granularity gets finer to search for the actually impacted components. At coarse granularity, the proposed traceability approach tends to provide efficiency but with less accuracy, as the impact is too generalized. This was revealed by the analysis when the average S-Ratio was calculated over the overall system. At the requirement and test case levels, the results seem unpredictable.
This is due to the fact that a requirement itself is very subjective. That is, the estimated impacts on code can be captured via test case execution, but the actually impacted components (AIS) are arbitrary and can only be decided by the user. In comparison with other works, although some works exist in the same research area, no comparably detailed results have been made available to compare with. To date, the work of Bianchi et al. (2000), with their approach called MDD, can be used for a limited comparison. MDD was split into two parts: MDD1 and MDD2. MDD1 worked on a coarse-grained granularity that focused on classes, while MDD2 worked on a fine-grained granularity that focused on methods, attributes and variables. MDD1 produced average results of S-Ratio = 0.40, Amplification = 9.52 and Change Rate = 0.74. MDD2 produced average results of S-Ratio = 0.33, Amplification = 36.58 and Change Rate = 0.48. At the class level, our results showed averages of S-Ratio = 0.75, Amplification = 1.332 and Change Rate = 0.138. It appears that at the class level our result is better (i.e. more accurate) than MDD1, as the S-Ratio and Amplification are closer to 1.0 and the Change Rate is much less than 1.0. MDD2 is not comparable to our result (at the method level), as it involves more detailed artifacts including attributes and variables. However, this result supports the research findings: as the analysis is taken to a finer-grained granularity, it becomes less accurate.

6.6.2 Qualitative Evaluation

The research application was exposed to users by letting them use and evaluate its effectiveness under a controlled experiment. User perception took into account the feedback and comments from users on the usefulness of the prototype tool to support change impact analysis. Questionnaires were designed to establish whether the prototype was useful and effective in supporting change impact analysis.
The majority agreed that CATIA is easy to use in supporting change impact analysis. They found it easy to identify the potential effects at various levels. Since the potential effects are derived from change requests, the users were guided in searching for the actual impacts in a short time. On the question of the time taken to capture the AIS, the majority responded that CATIA can speed up the search. From the analysis, it was noticed that the subjects spent more time leafing through the documentation and code to search for the PIS; to search for the AIS, however, they managed to speed up with the help of CATIA. The figures showed that CATIA is a useful means of supporting better prediction for maintenance decisions involving cost, effort and resources. One participant commented that more information is required to make the cost estimation more practical, e.g. the use of function points and expertise values, which are out of our scope. He noted that CATIA nevertheless provided very useful support for identifying the impacted code components. With regard to traceability within software components, a large number of participants agreed that the tool can help them search for the right components in the system in either top-down or bottom-up tracing. They suggested that the tool's capabilities be extended to other high-level abstractions such as the high-level design and specifications. With respect to the GUI, a number agreed that our tool supports a usable user interface to some extent. With respect to other models and approaches, five selected applications were chosen for comparison and evaluation, as illustrated in Table 6.15. The evaluation is made based on the traceability features defined in Section 2.6.
Table 6.15: Existing features of software traceability

Systems/       Ripple   Change  Fine        Metrics  Top Down/    Visuali-     Graph
Projects       Effects  Impact  Grained     Support  Bottom Up    zation       Formalism
                                                     Tracing
GEOS           Y        Y       classes     Y        1 way        Y            N
TRAM           Y        N       system      N        2 ways       Y            N
                                components
Software       N        N       variables   N        2 ways       Y            N
Concordance
Dynamic        Y        Y       methods     N        1 way        N            N
Analyzer
MDD            Y        Y       methods     Y        1 way        N            Y
Domain Tracer  Y        Y       classes     Y        1 way        N            N
CATIA          Y        Y       methods     Y        2 ways       Y (limited)  Y

1. All the tools provide ripple effects of a requirement except for the Software Concordance. Some of these ripple effects are determined in advance by human intervention, e.g. in Domain Tracer, GEOS and TRAM. Dynamic Analyzer creates the ripple effects by implementing test scenarios. CATIA attempts to identify ripple effects by test scenarios as well as static analysis, which allows a richer set of artifact dependencies for automation.

2. Some of the tools are not directly developed to support change impact, but rather to provide traceability among components. GEOS was developed to support effort and cost estimation. TRAM was developed to provide traceability at the level of high-level design and specification. Software Concordance provides traceability between documentation and code. Dynamic Analyzer and MDD support change impact via explicit links, while Domain Tracer achieves this using domain knowledge gathered by interviewing software engineers. CATIA attempts to support change impact using explicit links as well as concept location, which provides a more accurate result of change impact.

3. GEOS and Domain Tracer support fine-grained traceability at the class level. Dynamic Analyzer and MDD support traceability at the method level, while TRAM is built to provide traceability at the level of high-level architectural components. Only Software Concordance manages to support traceability at the statement level, which includes the variables.
However, not all the variables are catered for in the traceability; only those defined in the documentation are taken into consideration. CATIA supports traceability at the method level for change impact analysis. It is argued that tracking change impact at the variable or statement level is not practical, as this would involve a lot of resources and memory space.

4. GEOS provides good support for metrics, as it was developed to support effort and cost estimation. Measurement for effort and cost estimation requires a large set of historical data from past projects in order to provide good predictions. MDD and Domain Tracer do provide some metric measurements; however, it is not clear for what level of artifacts these metrics are provided. CATIA supports metrics for measuring the impacted artifacts in terms of size and complexity.

5. Most of the tools do not provide sufficient support for both top-down and bottom-up tracing for impact analysis. GEOS, MDD, Domain Tracer and Dynamic Analyzer support one-way tracing, which means either top-down or bottom-up traceability but not both at the same time. TRAM and Software Concordance support two-way traceability. CATIA was specially designed to provide two-way tracing, i.e. both top-down and bottom-up traceability. The user can examine the potential effects of a proposed change from either the high-level artifacts or the low-level artifacts.

6. Visualization provides a useful mechanism to support program understanding. GEOS, TRAM and Software Concordance offer a useful visualization of relationships between components, while Dynamic Analyzer, MDD and Domain Tracer are more focused on text exploration. CATIA provides limited visualization, as this facility requires a lot of effort and memory space that may affect system performance. CATIA emphasizes text exploration, as this provides sufficient information for users to take the necessary actions.

7.
Many researchers do not describe in detail the graph formalism of impact analysis that they adopt. Only MDD reveals its formalism, as follows: System = Cr ∪ Ca ∪ Cd ∪ Ci and Rel = Ev ∪ Eh, where Cr, Ca, Cd and Ci are the sets of components from the requirements, analysis, design and implementation models, respectively; Ev ⊆ (Cr × Cr) ∪ (Ca × Ca) ∪ (Cd × Cd) ∪ (Ci × Ci) is the vertical traceability relationship; and Eh ⊆ (Cr × Ca) ∪ (Ca × Cd) ∪ (Cd × Ci) is the horizontal traceability relationship. This traceability tends to be too generalized because it does not apply to an artifact of interest. This research is interested in the impact as seen from the point of view of a particular artifact. Our graph formalism was explained in detail in Sections 5.3.3 and 5.3.5.

6.6.3 Overall Findings and Discussion

From the model perspective, the proposed software traceability manages to establish links between requirements, test cases, code and design. This traceability represents some parts of the whole software system. In real-world software maintenance, other software components also need to be considered: for example, the test results and test logs in software testing, the high-level design in the design document, and the high-level requirements in the requirements document. As far as this proposed model is concerned, it deals only with the functional requirements, which need to be translated in advance into identified artifacts to be changed. The mechanism to transform the requirements into these initial target components remains an open research question. Thus, in this model, the experts, i.e. the software maintainers, are assumed to be able to translate a requirement into the target artifacts. The proposed model captures the design from the source code rather than from the design database or design document. The design structures are based on a standard detailed design document as described in the DoD documentation standard MIL-STD-498. The high-level design remains a limitation of this proposed model.
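The two-way tracing discussed above can be illustrated with a simple directed traceability graph: top-down impact is reachability along the links, and bottom-up impact is reachability along the reversed links. A minimal sketch, with hypothetical artifact names and links that are not taken from the thesis:

```python
from collections import defaultdict, deque

def reachable(edges, start):
    """All artifacts reachable from `start` along directed edges (excluding start)."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical links: a requirement exercises a test case which executes a method;
# a package contains a class which contains that method.
edges = [("Req9", "TC3"), ("TC3", "m.save"),
         ("Pkg.db", "Cls.Store"), ("Cls.Store", "m.save")]

top_down = reachable(edges, "Req9")                          # what Req9 impacts
bottom_up = reachable([(d, s) for s, d in edges], "m.save")  # what leads to m.save
print(sorted(top_down))   # ['TC3', 'm.save']
print(sorted(bottom_up))  # ['Cls.Store', 'Pkg.db', 'Req9', 'TC3']
```

The same graph answers both queries: the user starts from a high-level artifact (top-down) or from a low-level artifact (bottom-up), matching the two tracing directions compared in Table 6.15.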
In terms of the approach to establishing traceability within a system, the proposed model attempts to integrate static and dynamic analysis in order to establish a rich set of artifact dependencies. However, the results obtained from the dynamic analysis via test scenarios may be exposed to inaccuracy, as they depend on the completeness of the test cases. From the analysis, the results show that the potential effects covered by the test cases are much larger, almost double the size of the actual impacts. To mitigate this issue, the proposed approach applied a cognitive method, i.e. a human decision is required to determine the true impacts. These true impacts can be drawn from the test scenarios with the help of the supporting tools.

6.7 Summary

This chapter has described the analysis and findings of the controlled experiment that evaluated how well the proposed traceability model achieves effectiveness and accuracy in dealing with change impact analysis. The scored results produced by the subjects and the time taken to complete the experiment were treated as the useful variables. The experiment revealed that the proposed model provides some significant achievements in handling change impact. The subjects indicated that the tool provides useful interfaces and improves the productivity of software maintenance. The approach was also compared qualitatively with other similar approaches. The significant difference can be observed in its ability to directly link a component at one level to components at other levels in top-down and bottom-up traceability. In general, this model is able to improve some aspects of the change impact features discussed in the earlier literature.

CHAPTER 7

CONCLUSION

The software traceability approach proposed earlier in this research is summarized in this chapter. The chapter starts with a summary and an explanation of how the research achieves each objective set earlier.
It is followed by the main contributions of this research in response to the change process in software evolution. Finally, it describes some limitations of the current scope and areas of future research.

7.1 Research Summary and Achievements

A set of research objectives defined at the early stage set the direction of this research. These objectives are highlighted here to observe and summarize how each has been implemented and achieved.

Objective 1: To build a new software traceability model to support change impact analysis that covers requirements, test cases, design and code.

As described in Chapter 4, the software traceability model was designed to establish links and dependencies within a software system involving four main artifacts: requirements, design, test cases and code. The aim of this model is to establish software traceability from an artifact's point of view, e.g. from a requirement, to identify its potential impacts on other parts of the system. This model managed to establish links between artifacts within and across different software workproducts. On further investigation, and with some tradeoffs between design and code, this model was refined and evaluated down to the detailed artifacts, which included the requirements, test cases, packages, classes and methods. The proposed model managed to tackle the software traceability issues from two perspectives: vertical and horizontal. Vertical traceability was handled using the combined techniques of data flows, control flows and composite dependencies, while horizontal traceability was managed using explicit links, implicit links and concept location.

Objective 2: To establish a software traceability approach and mechanism that cover features including artifact dependencies, ripple effects, granularity and metrics.

One of the crucial factors in a software traceability approach lies in its capability to capture and reconstruct artifacts from the available resources.
The proposed approach applied dynamic analysis to capture the links between the requirements, test cases and implementation. It applied static analysis to capture program dependencies within the code. As the traceability approach was designed to utilize the benefits of the OO paradigm and the widely accepted UML class diagrams, the program dependencies were successfully recovered from the code based on the program relations of association, aggregation, call invocation, data uses and inheritance. The results of the static and dynamic analyses were then integrated to provide a rich set of artifacts and their relationships, as discussed in Chapter 4. A comparative study was performed on the existing software traceability approaches in terms of the techniques used, capabilities, features, strengths and drawbacks. The findings revealed that some improvement opportunities could be made to the existing traceability approaches, particularly in the aspects of impact measurement and total traceability.

Objective 3: To develop software supporting tools to support the proposed model and approach.

The manual activities of the change impact process involve searching for the right components in the design model and code, identifying their propagation to other models including the user requirements, and determining the sizes and complexities of the affected components. These activities are very tedious, time consuming and error prone. Therefore, there is a need to develop supporting tools and mechanisms to facilitate all these tasks. Chapter 5 described the tools developed to support change impact analysis: for example, CodeMentor was developed to capture the program dependencies, TestAnalyzer to perform test scenarios, and CATIA to implement the change impact and traceability. These prototype tools automate the production of useful information for change impact analysis, including the artifact structures, metrics and potential impacts.
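The integration of static and dynamic analysis described above amounts to taking the union of two dependency-edge sets: one recovered by parsing the code (association, call invocation, inheritance, etc.) and one recorded while executing test scenarios. A minimal sketch, with hypothetical artifact names and edges invented for illustration:

```python
# Hypothetical dependencies recovered by static analysis of the code
static_deps = {
    ("Cls.Order", "Cls.Item", "association"),
    ("m.checkout", "m.pay", "call"),
}

# Hypothetical dependencies observed while executing test scenarios
dynamic_deps = {
    ("Req12", "TC7", "validates"),
    ("TC7", "m.checkout", "executes"),
    ("m.checkout", "m.pay", "call"),  # also observed at run time
}

# Integrated model: a richer set of artifact dependencies; duplicate edges merge
integrated = static_deps | dynamic_deps
print(len(integrated))  # 4
```

Dynamic analysis contributes the requirement-to-code links that a parser cannot see, while static analysis contributes the code-internal dependencies a test run may miss; the union feeds the impact analysis.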
Objective 4: To demonstrate and evaluate the practicability of the software traceability model and approach to support change impact analysis.

Chapter 6 presented an evaluation of the traceability application by five groups of students specially selected to represent software practitioners (with some working experience) in a controlled experiment. In the experiment, the model was applied to a maintenance project of an embedded software system as a case study. The evaluation revealed that the proposed model meets its expectations and satisfies the users. Some metrics were applied based on a benchmark introduced by Arnold (1993). The results showed that the S-Ratio averages at the method, class and package levels met the desired trend. The results expressed in Analysis Effort and Specification Effort also showed significant achievements with respect to change rate and duration.

7.2 Summary of the Main Contributions

The main contributions of the proposed software traceability model and approach can be summarized as follows.

1. The new software traceability model and approach provide a traceability mechanism to support change impact analysis. This work differs from others in that it attempts to integrate the software components, including the requirements, test cases, design and code, with methods considered the smallest artifacts.

2. The new software traceability model and approach allow traceability from an artifact's point of view to search for the impacted components in either a top-down or a bottom-up strategy. This helps the maintainers or management identify all the potential effects before taking appropriate actions.

3. The new approach provides an integrated approach between dynamic and static analysis with the help of some supporting tools. This integration provides a rich set of artifacts and relationships for change impact analysis.
Investigation revealed that in object-oriented impact analysis, the existing parsers and tools do not capture sufficient information to determine the ripple effects of a change to an artifact.

4. The new approach provides the ability to produce the sizes and complexities of the affected artifacts at each software level. The ability to measure these useful metrics supports other critical needs such as design decisions and effort and cost estimation.

7.3 Research Limitation and Future Work

Despite the above contributions, the findings are also exposed to some limitations, as follows:

1. As large systems are more complex than one could imagine, the usefulness of this approach has currently been tested on, and is applicable to, a small software system. Large systems may involve integrated applications across different platforms and environments.

2. The new approach deals with the functional requirements, not the constraint requirements, because the constraint requirements are not testable and measurable.

3. The requirement-code traceability is dependent on the completeness of the test cases that characterize a requirement.

4. The new software traceability model focuses only on the direct links, not the indirect links, between artifacts. By direct links is meant the first-order relationships that can be established directly from an artifact's point of view, not via the second, third or fourth order.

It is foreseen, based on the scope established and the limitations discovered, that the following areas may constitute possible future work:

1. The approach focuses on the low-level design, while the high-level design and specification are reserved for future work. The reason is that the high-level design and specification require special attention, as the component dependencies are too abstract and subjective, and need proper semantics.

2. The approach provides a traceability mechanism for the software objects derived from the documentation and code.
The automated traceability and cross-referencing that link the change ripple effects to documentation and code are seen as another research opportunity to explore in the future.

3. Visualization seems to be helpful to support program understanding. The user may find it useful if the tool can provide a view of the program or system structures, such as trees or control flows of the affected software segments.

REFERENCES

Anderson, P., Reps, T., Teitelbaum, T. and Zarins, M. (2003). Tool Support for Fine-Grained Software Inspection. IEEE Software. 1-9.

Antoniol, G., Canfora, G.A., Gasazza, G. and Lucia, A. (2002). Recovering Traceability Links Between Code and Documentation. IEEE Transactions on Software Engineering. 28(10): 970-983.

Antoniol, G., Caprile, B., Potrich, A. and Tonella, P. (2000). Design-Code Traceability for Object Oriented Systems. The Annals of Software Engineering. 9: 35-58.

Antoniol, G., Penta, M.D. and Merlo, E. (2004). An Automatic Approach to Identify Class Evolution Discontinuities. Proceedings of the 7th International Workshop on Principles of Software Evolution. IEEE Computer Society. 31-40.

Arnold, R.S. and Bohner, S.A. (1993). Impact Analysis - Towards A Framework for Comparison. Proceedings of the Software Maintenance Conference. September 27-30. USA: IEEE Computer Society. 292-301.

Bennett, K. (1996). Software Evolution: Past, Present and Future. Information and Software Technology. 38(11): 673-680.

Bennett, K. and Rajlich, V. (2000). Software Maintenance and Evolution: A Roadmap. Proceedings of the Conference on the Future of Software Engineering. May. USA: ACM Press. 73-87.

Bianchi, A., Fasolino, A.R. and Visaggio, G. (2000). An Exploratory Case Study of the Maintenance Effectiveness of Traceability Models. International Workshop on Program Comprehension. June 10-11. Ireland: IEEE Computer Society. 149-158.

Bieber, M., Kukkonen, H. and Balasubramanian, V. (2000). Hypertext Functionality. ACM Computing Surveys. 1-6.

Bieman, J.M. and Kang, B.K. (1998).
Measuring Design-Level Cohesion. IEEE Transactions on Software Engineering. 24(2): 111-124.

Boehm, B. and Sullivan, K. (2000). Software Economics: A Roadmap. Proceedings of the Conference on the Future of Software Engineering. ACM. 319-344.

Bohner, S.A. (1991). Software Change Impact Analysis for Design Evolution. Eighth International Conference on Software Maintenance and Reengineering. IEEE Computer Society Press. 292-301.

Bohner, S.A. (2002). Software Change Impacts: An Evolving Perspective. Proceedings of the International Conference on Software Maintenance. October 4-6. Canada: IEEE Computer Society. 263-271.

Bohner, S.A. and Arnold, R.S. (1996). Software Change Impact Analysis. California: IEEE Computer Society Press.

Bohner, S.A. and Gracanin, D. (2003). Software Impact Analysis in a Virtual Environment. Proceedings of the 28th Annual NASA Goddard Software Engineering Workshop. December 3-4. USA: IEEE Computer Society. 143-151.

Booch, G., Rumbaugh, J. and Jacobson, I. (1999). The Unified Modeling Language User Guide. USA: Addison-Wesley.

Brand, V.D., Jong, H.A., Klint, P. and Kooiker, A.T. (2003). A Language Development Environment for Eclipse. Proceedings of the 2003 Object Oriented Programming, Systems, Languages and Applications. October. USA: ACM Press. 55-59.

Briand, L.C. and Pfahl, D. (2000). Using Simulation for Assessing the Real Impact of Test Coverage on Defect-Coverage. IEEE Transactions on Reliability. 49(1): 60-70.

Brown, A.B., Keller, A. and Hellerstein, J.L. (2005). A Model of Configuration Complexity and its Application to a Change Management System. Integrated Network Management, IFIP/IEEE International Symposium. 631-644.

Buchsbaum, A., Chen, Y.F., Huang, H., Koutsofios, E., Mocenigo, J. and Rogers, A. (2001). Visualizing and Analyzing Software Infrastructures. IEEE Software. 62-70.

Burd, E. and Munro, M. (1999). An Initial Approach towards Measuring and Characterizing Software Evolution.
Proceedings of the Sixth Working Conference on Reverse Engineering. 168-174.

Canfora, G. (2004). Software Evolution in the Era of Software Services. Proceedings of the Seventh International Workshop on Software Evolution (IWPSE'04). September 6-7. Japan: IEEE Computer Society. 9-18.

Chapin, N., Hale, J.E. and Khan, K.M. (2001). Types of Software Evolution and Software Maintenance. Journal of Software Maintenance and Evolution: Research and Practice. 13(1): 3-30.

Chaumun, M.A., Kabaili, H., Keller, R.K., Lustman, F. and Saint-Denis, G. (2000). Design Properties and Object-Oriented Software Changeability. Proceedings of the Fourth European Conference on Software Maintenance. February 29-March 3. Switzerland: IEEE Computer Society. 45-54.

Chen, K. and Rajlich, V. (2000). Case Study of Feature Location Using Dependence Graph. Eighth International Workshop on Program Comprehension. June 10-11. Ireland: IEEE Computer Society Press. 241-247.

Chen, X., Tsai, W.T. and Huang, H. (1996). Omega - An Integrated Environment for C++ Program Maintenance. IEEE Computer Society Press. 114-123.

Columbus. (2005). Analyzer Front End Company for Reverse Engineering Community. Last accessed on October 24, 2005. http://www.frontendart.com

Cooper, D., Khoo, B., Konsky, B.R. and Robey, M. (2004). Java Implementation Verification Using Reverse Engineering. Australian Computer Science Conference. Australian Computer Society. 26: 203-211.

Cui, L. and Wang, H. (2004). Version Management of EAI Based on the Semantics of Workflow. Proceedings of the Eighth International Conference on Computer Supported Cooperative Work in Design. November 6-10. USA: ACM. 341-344.

Darcy, D.P. and Kemerer, C.F. (2005). OO Metrics in Practice. IEEE Software. 22(6): 17-19.

Deraman, A. (1998). A Framework for Software Maintenance Model Development. Malaysian Journal of Computer Science. 11(2): 23-31.

Desai, N., Lusk, A., Bradshaw, R. and Evard, R. (2003). BCFG: A Configuration Management Tool for Heterogeneous Environments.
Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER'03). December 1-4. Hong Kong: IEEE Computer Society. 500-503.

Devandu, P., Chen, Y.F., Muller, H. and Martin, J. (1999). Chime: Customizable Hyperlink Insertion and Maintenance Engine for Software Engineering Environments. Proceedings of the 21st International Conference on Software Engineering. May 16-22. USA: IEEE Computer Society. 473-482.

Dorrow, K. (2003). Flexible Fault Tolerance in Configurable Middleware for Embedded Systems. Proceedings of the 27th Annual International Computer Software and Applications Conference. November 3-6. USA: IEEE Computer Society. 563-569.

Egyed, A. (2003). A Scenario-Driven Approach to Trace Dependency Analysis. IEEE Transactions on Software Engineering. 29(2): 324-332.

Engels, G. and Groenewegen, L. (2000). Object-Oriented Modeling: A Roadmap. Proceedings of the International Conference on the Future of Software Engineering. June 4-11. Ireland: ACM Press. 103-117.

Estublier, J., Leblang, D. and Clemm, G. (2004). Impact of the Research Community on the Field of Software Configuration Management. International Conference on Software Engineering. May 23-28. Scotland: ACM. 643-644.

Finkelstein, A. and Kramer, J. (2000). Software Engineering: A Roadmap. Proceedings of the Conference on the Future of Software Engineering. June 4-11. Ireland: ACM Press. 3-23.

Fiutem, R. and Antoniol, G. (1998). Identifying Design-Code Inconsistencies in Object-Oriented Software: A Case Study. Proceedings of the International Conference on Software Maintenance. USA: IEEE Computer Society Press. 94-102.

French, J.C., Knight, J.C. and Powell, A.L. (1997). Applying Hypertext Structures to Software Documentation. Information Processing and Management. 33(2): 219-231.

Garlan, D. (2000). Software Architecture: A Roadmap. Proceedings of the Conference on the Future of Software Engineering. June 4-11. Ireland: ACM Press. 91-101.

Gotel, O. and Finkelstein, A. (1994).
An Analysis of the Requirements Traceability Problem. Proceedings of the First International Conference on Requirements Engineering. USA: IEEE Computer Society. 94-101.

GrammaTech. (2005). CodeSurfer Application. Last accessed on October 24, 2005. http://www.grammatech.com/products/codesurfer/

Gupta, S.C., Nguyen, T.N. and Munson, E.V. (2003). The Software Concordance: Using a Uniform Document Model to Integrate Program Analysis and Hypermedia. Proceedings of the Tenth Asia-Pacific Software Engineering Conference. December 10-12. Thailand: IEEE Computer Society. 164-173.

Hagen, V., McDonald, A., Atchison, B., Hanlon, A. and Lindsay, P. (2004). SubCM: A Tool for Improved Visibility of Software Change in an Industrial Setting. IEEE Transactions on Software Engineering. 30(10): 675-693.

Han, J. (2001). TRAM: A Tool for Requirements and Architecture Management. Proceedings of the 24th Australian Conference on Computer Science. Australia: IEEE Computer Society. 60-68.

Hartmann, J., Huang, S. and Tilley, S. (2001). Documenting Software Systems with Views II: An Integrated Approach Based on XML. Proceedings of the Annual ACM Conference on Systems Documentation (SIGDOC 2001). October 21-24. USA: ACM. 237-246.

Hutchins, M. and Gallagher, K. (1998). Improving Visual Impact Analysis. Proceedings of the International Conference on Software Maintenance. November 16-20. USA: IEEE Computer Society. 294-301.

Hoffman, M.A. (2000). A Methodology to Support the Maintenance of Object-Oriented Systems Using Impact Analysis. Louisiana State University: Ph.D. Thesis.

Horwitz, S., Reps, T. and Binkley, D. (1990). Interprocedural Slicing Using Dependence Graphs. ACM Transactions on Programming Languages and Systems. 12(1): 26-60.

Huang, J.C., Chang, C.K. and Christensen, M. (2003). Event-Based Traceability for Managing Evolutionary Change. IEEE Transactions on Software Engineering. 29(9): 796-810.

IEEE. (1998a). IEEE Standard for Software Maintenance. New York, IEEE Std 1219-1998.

IEEE. (1998b).
IEEE Standard for Software Configuration Management Plans. New York, IEEE-Std 828-1998.

IEEE. (1998c). IEEE Standard for Software Test Documentation. New York, IEEE-Std 829-1998.

IEEE. (1998d). IEEE Recommended Practice for Software Requirements Specifications. New York, IEEE-Std 830-1998.

IEEE. (1998e). Guides - A Joint Guide Developed by IEEE and EIA - Industry Implementation of International Standard ISO/IEC 12207. New York, IEEE/EIA 12207.

IEEE. (1998f). IEEE Recommended Practice for Software Design Descriptions. New York, IEEE-Std 1016-1998.

ISO/IEC 12207. (2005). ISO/IEC 12207 Software Lifecycle Processes. Last accessed on October 24, 2005. www.abelia.com/docs/12207cpt.pdf

Jacobson, I., Booch, G. and Rumbaugh, J. (1999). Unified Software Development Process. USA: Addison-Wesley.

Janakiraman, G., Santos, J.R. and Turner, Y. (2004). Automated System Design for Availability. Proceedings of the 2004 International Conference on Dependable Systems and Networks. June 28-July 1. Italy: IBM. 411-420.

Jang, K.Y., Munro, M. and Kwon, Y.R. (2001). An Improved Method of Selecting Regression Tests for C++ Programs. Journal of Software Maintenance and Evolution: Research and Practice. 13: 331-350.

J-STD-016-1995. (2005). A Comparison of IEEE/EIA 12207, J-STD-016, and MIL-STD-498. Last accessed on October 24, 2005. http://www.abelia.com/pubsmain.htm

Kabaili, H., Keller, R.K. and Lustman, F. (2001). Cohesion as Changeability Indicator in Object-Oriented Systems. IEEE Computer Society. 39-46.

Keller, A., Hellerstein, J.L., Wolf, J.L. and Wu, K.L. (2004). The CHAMPS System: Change Management with Planning and Scheduling. Network Operations and Management Symposium. April 19-23. Korea: IEEE/IFIP. 395-408.

Kendall, K.E. and Kendall, J.E. (1998). Systems Analysis and Design. Fourth Edition. USA: Prentice Hall.

Kitchenham, B.A., Travassos, G.H., Mayrhauser, A.V. and Schneidewind, N. (1999). Towards an Ontology of Software Maintenance.
Journal of Software Maintenance: Research and Practice. 11: 365-389.

Kuloor, C. and Eberlein, A. (2003). Aspect-Oriented Requirements Engineering for Software Product Lines. Proceedings of the Tenth International Conference and Workshop on the Engineering of Computer-Based Systems. USA: IEEE Computer Society. 98-107.

Kung, D.C., Liu, C.H. and Hsia, P. (2000). An Object-Oriented Web Test Model for Testing Web Applications. The 24th Annual International Conference on Computer Software and Applications. 537-542.

Lakhotia, A. and Deprez, J.C. (1999). Restructuring Functions with Low Cohesion. IEEE Computer Society Press. 381-390.

Lazaro, M. and Marcos, E. (2005). Research in Software Engineering: Paradigms and Methods. PHISE'05.

Lee, J.K. and Jang, W.H. (2001). Component Identification Method with Coupling and Cohesion. Proceedings of the Eighth Asia-Pacific Software Engineering Conference. December 4-7. China: IEEE Computer Society. 79-86.

Lee, M., Offutt, A.J. and Alexander, R. (2000). Algorithmic Analysis of the Impacts of Changes to Object-Oriented Software. Proceedings of the 34th International Conference on Technology of Object-Oriented Languages and Systems. July 30 - August 4. USA: IEEE Computer Society. 61-70.

Lehman, M.M. and Ramil, J.F. (2000). Towards a Theory of Software Evolution - and Its Practical Impact. International Symposium on Principles of Software Evolution. November 1-2. Japan: IEEE Computer Society. 2-11.

Lehman, M.M. and Ramil, J.F. (2002). Software Evolution and Software Evolution Processes. Annals of Software Engineering. 14: 275-309.

Lindvall, M. (1997). Evaluating Impact Analysis - A Case Study. Journal of Empirical Software Engineering. 2(2): 152-158.

Lindvall, M. and Sandahl, K. (1998). Traceability Aspects of Impact Analysis in Object-Oriented Systems. Journal of Software Maintenance: Research and Practice. 10: 37-57.

Lu, X. and Yali, G. (2003). Risk Analysis in Project of Software Development.
Proceedings of the Engineering Management Conference on Managing Technologically Driven Organizations. November 2-4. New York: IEEE Computer Society. 72-75.

Lucca, G.A., Di Penta, M. and Gradara, S. (2002). An Approach to Classify Software Maintenance Requests. Proceedings of the International Conference on Software Maintenance. October 3-6. Canada: IEEE Computer Society. 93-102.

Maarek, Y., Berry, D. and Kaiser, G. (1991). An Information Retrieval Approach for Automatically Constructing Software Libraries. IEEE Transactions on Software Engineering. 17(8): 800-813.

Madhavji, N.H. (1992). Environment Evolution: The Prism Model of Changes. IEEE Transactions on Software Engineering. 18(5): 380-392.

Marcus, A. and Maletic, J.I. (2003). Recovering Documentation-to-Source-Code Traceability Links Using Latent Semantic Indexing. Proceedings of the Twenty-Fifth International Conference on Software Engineering. May 3-10. USA: IEEE Computer Society. 125-135.

Mayrhauser, A.V. and Lang, S. (1999). A Coding Scheme to Support Systematic Analysis of Software Comprehension. IEEE Transactions on Software Engineering. 25(4): 526-540.

McCabe. (2005). McCabe Software - Assuring Quality Throughout the Application Lifecycle. Last accessed on October 24, 2005. http://www.mccabe.com

McConnell, S. (1997). Gauging Software Readiness with Defect Tracking. IEEE Software. 14(3): 135-136.

MIL-STD-498. (2005). Roadmap for MIL-STD-498. Last accessed on October 24, 2005. http://www2.umassd.edu/SWPI/DOD/MIL-STD-498/ROADMAP.PDF

Murphy, G.C., Notkin, D. and Sullivan, K. (1995). Software Reflexion Models: Bridging the Gap Between Source and High-Level Models. Proceedings of the Third Symposium on Foundations of Software Engineering. October 12-15. USA: ACM. 18-28.

Nam, C., Lim, J.L., Kang, S. Bae, Lee J.H. (2004). Declarative Development of Web Applications with Active Documents. Proceedings of the Russian-Korean International Symposium on Science and Technology. IEEE Computer Society. 68-72.
Natarajan, B., Gokhale, A., Yajnik, S. and Schmidt, D.C. (2000). DOORS: Towards High-Performance Fault Tolerant CORBA. Proceedings of the Second International Symposium on Distributed Objects and Applications. September 21-23. Belgium: IEEE Computer Society. 39-48.

Nguyen, T.N., Munson, E.V., Boyland, J.T. and Thao, C. (2004). Flexible Fine-Grained Version Control for Software Documents. Proceedings of the 11th Asia-Pacific Software Engineering Conference. November 30 - December 3. Korea: IEEE Computer Society. 212-219.

Nuseibeh, B. and Easterbrook, S. (2000). Requirements Engineering: A Roadmap. Proceedings of the International Conference on the Future of Software Engineering. June 4-11. Ireland: ACM Press. 35-46.

Perry, D.E. (1994). Dimensions of Software Evolution. Proceedings of the International Conference on Software Maintenance. USA: IEEE Computer Society. 296-303.

Pressman, R.S. (2004). Software Engineering: A Practitioner's Approach. Sixth Ed. New York: McGraw-Hill.

Rajlich, V. and Wilde, N. (2002). The Role of Concepts in Program Comprehension. Proceedings of the 10th International Workshop on Program Comprehension. June 27-29. France: IEEE Computer Society. 271-278.

Ramesh, B. and Jarke, M. (2001). Toward Reference Models for Requirements Traceability. IEEE Transactions on Software Engineering. 27(1): 58-93.

Ramesh, B. (2002). Process Knowledge Management with Traceability. IEEE Software. 19(3): 50-52.

Rational. (2005). Rational Software Corporation. Last accessed on October 24, 2005. www.rational.com

Recon2. (2005). Recon2 Tool for C Programmers. Last accessed on October 24, 2005. http://www.cs.uwf.edu/~recon/recon2/

Rilling, J., Seffah, A. and Bouthlier, C. (2002). The CONCEPT Project - Applying Source Code Analysis to Reduce Information Complexity of Static and Dynamic Visualization Techniques. Proceedings of the First International Workshop on Visualizing Software for Understanding and Analysis. UK: IEEE Computer Society. 99.

Saemu, J. and Prompoon, N. (2004).
Tool and Guidelines Support for Capability Maturity Model's Software Subcontract Management. Proceedings of the 11th Asia-Pacific Software Engineering Conference. 158-165.

Sefika, M., Sane, A. and Campbell, R.H. (1996). Monitoring Compliance of a Software System with Its High-Level Design Models. Proceedings of the Eighteenth International Conference on Software Engineering. March 25-29. Germany: IEEE Computer Society. 387-396.

Singer, J. and Vinson, N.G. (2002). Ethical Issues in Empirical Studies of Software Engineering. IEEE Transactions on Software Engineering. 28(12): 1171-1180.

Small, A.W. and Downey, E.A. (2001). Managing Change: Some Important Aspects. Proceedings of the Change Management and the New Industrial Revolution (IEMC'01). October 7-9. USA: IEEE Computer Society. 50-57.

Sneed, H.M. (2001). Impact Analysis of Maintenance Tasks for a Distributed Object-Oriented System. Proceedings of the International Conference on Software Maintenance. November 6-10. Italy: IEEE Computer Society. 180-189.

Sommerville, I. (2004). Software Engineering. Seventh Ed. USA: Addison-Wesley Publishing Company.

Sulaiman, S. (2004). A Document-Like Software Visualization Method for Effective Cognition of C-Based Software Systems. Universiti Teknologi Malaysia: Ph.D. Thesis.

Teng, Q., Chen, X., Zhao, X., Zhu, W. and Zhang, L. (2004). Extraction and Visualization of Architectural Structure Based on Cross References Among Object Files. Proceedings of the 28th Annual International Conference on Computer Software and Applications. September 27-30. Hong Kong: IEEE Computer Society. 508-513.

Tonella, P. (2003). Using a Concept Lattice of Decomposition Slices for Program Understanding and Impact Analysis. IEEE Transactions on Software Engineering. 29(6): 495-509.

Turver, R.J. and Munro, M. (1994). An Early Impact Analysis Technique for Software Maintenance. Journal of Software Maintenance: Research and Practice. 6(1): 35-52.

Tvedt, R.T., Costa, P. and Lindvall, M. (2002). Does the Code Match the Design?
A Process for Architecture Evaluation. Proceedings of the Eighteenth International Conference on Software Maintenance. October 4-6. Canada: IEEE Computer Society. 393-401.

Tvedt, R.T., Lindvall, M. and Costa, P. (2003). A Process for Software Architecture Evaluation Using Metrics. Proceedings of the 27th Annual NASA Goddard/IEEE Software Engineering Workshop. December 5-6. Maryland: NASA Goddard/IEEE Software. 175-182.

Wan Kadir, W.N. (2005). Business Rule-Driven Object-Oriented Design. University of Manchester: Ph.D. Thesis.

Webster, P.B., Oliveira, K.M. and Anquetil, N. (2005). A Risk Taxonomy Proposal for Software Maintenance. Proceedings of the 21st IEEE International Conference on Software Maintenance. 453-461.

Wei, Y., Zhang, S. and Zhong, F. (2003). A Meta-Model for Large-Scale Software Systems. International Conference on Systems, Man and Cybernetics. October 5-8. USA: IEEE Computer Society. 501-505.

Weiser, M. (1984). Program Slicing. IEEE Transactions on Software Engineering. 10(4): 352-357.

White, B.A. (2000). Software Configuration Management Strategies and Rational ClearCase. USA: Addison-Wesley.

Wilde, N., Buckellew, M., Page, H. and Rajlich, V. (2001). A Case Study of Feature Location in Unstructured Legacy Fortran Code. Proceedings of the Fifth European Conference on Software Maintenance and Reengineering. March 14-16. Portugal: IEEE Computer Society. 68-76.

Wilde, N. and Casey, C. (1996). Early Field Experience with the Software Reconnaissance Technique for Program Comprehension. Proceedings of the Third Working Conference on Reverse Engineering. November 8-10. USA: IEEE Computer Society. 270-276.

Wilde, N., Gomez, J.A., Gust, T. and Strasburg, D. (1992). Locating User Functionality in Old Code. Proceedings of the International Conference on Software Maintenance. November 9-12. Sweden: IEEE Computer Society. 200-205.

Snelting, G. (2000). Understanding Class Hierarchies Using Concept Analysis. ACM Transactions on Programming Languages and Systems.
22(3): 123-135.

Walrad, C. and Strom, D. (2002). The Importance of Branching Models in SCM. IEEE Computer. 35(9): 31-38.

Wilkie, F.G. and Kitchenham, B.A. (1998). Coupling Measures and Change Ripples in C++ Application Software. Proceedings of the Third International Conference on Empirical Assessment and Evaluation in Software Engineering. April 12-14. UK: IEEE Computer Society. 112-130.

Zelkowitz, M.V. and Wallace, D.R. (1998). Experimental Models for Validating Technology. IEEE Computer. 31(2): 23-31.

Zhao, J. (2002). Change Impact Analysis to Support Architectural Evolution. Software Maintenance and Evolution: Research and Practice. 14: 317-333.

APPENDIX A

Procedures and Guidelines of the Controlled Experiment

Objectives of the study:
1. To study why a maintenance task is so complex.
2. To investigate the effectiveness of the process and technique used to manage PCRs.
3. To identify the effect of change from a broader perspective that includes the code, design, test cases and requirements.
4. To investigate the relationships between software artifacts within and across different software workproducts (e.g. code).
5. To realize the significant use of software traceability in change impact analysis.

The questionnaire consists of two sections:
1. Section A: Previous working background
2. Section B: Use of CATIA and tool evaluation

Please answer all the questions.

Definitions:
• PCR: acronym for Problem Change Request, which specifies the problems and reasons for a change to be made.
• Software workproducts: a set of software products such as code, design, specifications, testing, requirements and documentation. In our case study, we use the code, design, test cases and requirements as our workproducts.
• Software artifacts: software components derived from the individual workproducts e.g.
variables and methods derived from the code workproduct; packages and classes derived from the design workproduct; test cases from the test specification workproduct; and requirements from the requirements specification workproduct.
• Software traceability: the ability to trace the dependent artifacts within a workproduct and across different workproducts, e.g. within code and across to design, test cases and requirements.
• Change impact analysis: a detailed examination of the potential effects of a proposed change to software artifacts before the actual change is implemented.

Some descriptions of the maintenance method and the use of CATIA:

Each group is required to analyze each PCR. For each PCR, please identify the appropriate primary artifacts (PIS). A PIS is an initial software target (e.g. a method, class, test case or requirement) that is first identified as a candidate for change. For each PIS, please identify the potential effects (i.e. change impact) on other artifacts (i.e. other methods, classes, test cases or requirements) with the help of our maintenance tool, called CATIA. You are supplied with CATIA and a set of documentation (IRS, SRS, SDD, STD, code) to support your change impact analysis. We consider these potential effects as our secondary artifacts (SIS). The SIS may be unnecessarily large. Based on this SIS, please derive the actual artifacts (AIS). The AIS is the set of components actually modified as a result of changing the PIS.

The steps to manage change impact analysis are as follows:
1) Manually explore the right artifact(s) in the OBA system. Use the existing code and documents (SRS, IRS, SDD, STD) if necessary to help you find the right artifacts. Mark their IDs (i.e. method no. or class no.) as your PIS candidates.
2) Use CATIA to obtain the SIS in terms of which methods, classes, packages, test cases and requirements are affected by the PIS.
3) Obtain the AIS from the SIS.
4) Change the code accordingly based on the PCR.
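The PIS, SIS and AIS of the steps above can be pictured as plain set operations: the SIS is the union of the potential effects of each primary artifact, and the AIS is the subset of the SIS actually modified. The following is a minimal sketch only; the class name, the `TRACE` map and its artifact IDs are hypothetical illustrations, not part of CATIA:

```java
import java.util.*;

public class ImpactSets {
    // Hypothetical traceability map: artifact -> artifacts it may impact.
    static final Map<String, Set<String>> TRACE = Map.of(
        "Mtd7", Set.of("Mtd7", "Mtd8", "CSU1", "Tcs1", "Req2"),
        "Mtd11", Set.of("Mtd11", "CSU6", "Tcs4"));

    // SIS: union of the potential effects of every primary artifact (PIS).
    static Set<String> secondary(Set<String> pis) {
        Set<String> sis = new TreeSet<>();
        for (String p : pis) sis.addAll(TRACE.getOrDefault(p, Set.of()));
        return sis;
    }

    // AIS: the subset of the SIS the maintainer actually decides to modify.
    static Set<String> actual(Set<String> sis, Set<String> modified) {
        Set<String> ais = new TreeSet<>(sis);
        ais.retainAll(modified);
        return ais;
    }

    public static void main(String[] args) {
        Set<String> sis = secondary(Set.of("Mtd7", "Mtd11"));
        System.out.println(sis);
        System.out.println(actual(sis, Set.of("Mtd7", "CSU1")));
    }
}
```

This also makes the guideline "SIS may be unnecessarily large" concrete: the AIS can only shrink the SIS, never grow it.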
Sample Results:

PCR#: 5

Select one type of primary artifacts (PIS) where appropriate and fill in the details.
Time starts: 2.35
  Tick   Types of Artifacts   PIS details
  √      Methods              Mtd7, Mtd11
         Classes
         Packages
         Test cases
         Requirements
Time ends: 2.50

For each PIS, select all types of secondary artifacts (SIS) and fill in the details.
Time starts: 2.55
PIS Candidate: Mtd7
  Types of Artifacts   Count   SIS details
  Methods              4       Mtd7, Mtd8
  Classes              3       CSU1, CSU6, CSU7
  Packages             2       CSC1, CSC4
  Test Cases           5       Tcs1, Tcs4, Tcs5
  Requirements         4       Req2, Req4, Req7, Req10
Time ends: 3.05

For each PIS, select all types of actual artifacts (AIS) and fill in the details.
Time starts: 3.00
PIS Candidate: Mtd7
  Types of Artifacts   Count   AIS details
  Methods              4       Mtd7
  Classes              3       CSU1, CSU6
  Packages             2       CSC1, CSC4
  Test Cases           5       Tcs1, Tcs4, Tcs5
  Requirements         4       Req2, Req4, Req7, Req10
Time ends: 3.20

CONTROLLED EXPERIMENT

You are supplied with the CATIA tool, documentation (SRS, IRS, SDD, STD) and code to manage the change impact analysis. Each group is required to work on a set of the designated PCRs. Please proceed with your work on the change impact based on the procedures and guidelines given.

Program Change Reports (PCRs):

PCR#1
The system should highlight a warning message when the speed is more than 15 km/h below or above the cruising speed, and highlight a danger message when the speed exceeds 155 km/h instead of its default 150 km/h.
Choose any one of the existing requirements in the OBA system. Record its potential effects on other components in terms of methods, classes, test cases and requirements.

PCR#2
The system should maintain the continuous signal of the maintenance limit of "Air Filter Change", but maintain its intermittent signal at 100 km lower than the original value before reaching the limit.
Choose any one of the existing requirements in the OBA system. Record its potential effects on other components in terms of methods, classes, test cases and requirements.
PCR#3
The cruise controller of the vehicle needs to be adjusted to regulate speed at +-2 km/h from the current speed, with a speed limit of no more than 145.5 km/h for the LED danger message to be lighted.
Choose any one of the existing requirements in the OBA system. Record its potential effects on other components in terms of methods, classes, test cases and requirements.

PCR#4
Due to new development features, the keypad handler should supply input to auto cruise at gear four and above. The default fuel fill code is to be adjusted to 24 instead of 25.
Choose any one of the existing requirements in the OBA system. Record its potential effects on other components in terms of methods, classes, test cases and requirements.

PCR#5
The pedal functions need to respond to acceleration, brake and clutch during auto cruise and non auto cruise. The default brake code should be identified as 250 instead of 200 to nullify the auto cruise.
Choose any one of the existing requirements in the OBA system. Record its potential effects on other components in terms of methods, classes, test cases and requirements.

Please get additional copies of the following tables to write down your analysis results.

GROUP: _________     PCR#: ___________

Just select one type of primary artifacts (PIS)
Time starts: ______
  Tick   Types of Artifacts   PIS details
         Methods
         Classes
         Packages
         Test cases
         Requirements
Time ends: ______

Select all types of secondary artifacts (SIS)
Time starts: ______
PIS Candidate: ______
  Type of Artifacts    Count   SIS details
  Methods
  Classes
  Packages
  Test Cases
  Requirements
Time ends: ______

Select all types of actual artifacts (AIS)
Time starts: ______
PIS Candidate: ______
  Type of Artifacts    Count   AIS details
  Methods
  Classes
  Packages
  Test Cases
  Requirements
Time ends: ______

Section A: Previous Working Background (Tick √ the answers that apply)

A1: What was your previous job?
(Tick only one)
Programmer
Software Engineer / System Analyst
Project Leader / Project Manager
Configuration engineer
Quality engineer
Other, please specify: ___________________________

A2: What are your reasons for studying Software Engineering? (Tick one or more)
Company expects me to help implement SE practices after I finish my study
Just for fun
To get a Master's degree
To become a software expert
To improve my knowledge of software engineering standards and practices
Other, please specify: ___________________________

A3: What have you learnt in Software Engineering? (Tick one or more)
Software development practices
Project management
Software maintenance
Software contracts
Object-oriented design and programming
Documentation standard, please specify: ______________________

A4: Number of years you have been involved in software development projects before: (Tick only one)
Less than 1 year
1 - 3 years
4 - 6 years
More than 6 years
No experience at all
Other, please specify: ___________________________

A5: Number of years you have been involved in software maintenance projects before: (Tick only one)
Less than 1 year
1 - 3 years
4 - 6 years
More than 6 years
No experience at all
Other, please specify: ___________________________

A6: If yes, what types of workproducts have you been involved with in maintenance projects: (Tick one or more)
Code
Design
Specifications
Testing
Requirements and project management
Documentation
Other, please specify: ___________________________

A7: List three programming languages (or software packages) and the platforms you are most familiar with: (e.g. C and Unix platform)
  Programming languages        Platform
  1.
  2.
  3.

Section B: Use of CATIA and Maintenance Method

B3: Specify the steps you have taken to determine the PIS from the PCRs.
1.
2.
3.
4.
5.
6.
7.

B4: Specify the steps you have taken to determine the AIS from the OBA system.
1.
2.
3.
4.
5.
6.
7.
B5: Which of the following applies to you most while doing OBA maintenance: (tick one only)
We spend more time on source code than other documents to understand the PIS
We spend more time on documents than source code to understand the PIS
We spend equal time on both source code and documents to understand the PIS

B6: Which of the following applies to you most while doing OBA maintenance: (tick one only)
We spend more time on source code than other documents to understand the AIS
We spend more time on documents than source code to understand the AIS
We spend equal time on both source code and documents to understand the AIS

B7: Rank (on a scale of 1 to 7) the usefulness of the features in CATIA to support OBA maintenance:
(Not at all = 1, 2, 3, 4, 5, 6, Extremely = 7, or No Opinion)
Easy to use overall
Easy to identify the SIS of an artifact within and across different levels of workproducts
Supports change impact analysis
Provides a basis for cost estimation and planning schedules
Provides top-down and bottom-up traceability
Do you think that CATIA can speed up your task in searching for the AIS?
Do you think that CATIA can support a traceability process in documentation standards?
Do you find the user interface supplied by CATIA useful to your impact analysis?

Any comments on CATIA or on other related issues in general?

End of Questionnaires. Thank You.

APPENDIX B

CATIA MANUAL AND DESCRIPTIONS

CATIA: A Configuration Artifact Traceability Tool for Change Impact Analysis

B1. Introduction
This manual provides information on how to use the CATIA prototype. It describes the traceability process as implemented by CATIA and provides detailed information on each component of the prototype. This manual is intended for software maintainers or software engineers in their effort to manage maintenance tasks.

B2.
Overview of CATIA
CATIA (Configuration Artifact Traceability Tool for Impact Analysis) is a tool developed in Java that implements software traceability between software components to support change impact analysis. CATIA provides visibility of potential effects within and across different levels of software workproducts. Five different levels of workproducts are handled by CATIA, namely requirements, test cases, packages, classes and methods. CATIA supports two main kinds of components: primary components and secondary components. Primary components are the initial target software components from which a change impact begins. Secondary components are the potential effects of a proposed change.

B3. Accessing CATIA
This section describes how to access the primary components and secondary components in the system.

B3.1 Starting CATIA
To access the system, follow these instructions.

Step 1: Run the CATIA tool by clicking the run project icon in the tool environment.

Step 2: Select the primary option from the Artifact's pull-down menu.

Figure B1: Primary option of requirement artifacts

Step 3: Select a type of primary artifacts from the radio buttons. A selection is followed by a list of the detailed artifacts. Select one or more artifacts (Figure B1), no more than 10, from the list, e.g. req3, req8. Then press ENTER.

Step 4: Choose one or more types of secondary artifacts for which you would like to view the potential effects, e.g. methods, classes and test cases. Then click the generate button.

Figure B2: Potential effects and impact summary

Step 5: Wait a while; CATIA generates a list of the impacted artifacts for each primary artifact chosen earlier. The user can scroll up and down the window pane to visualize the detailed impacts of each component (i.e. artifact) with its corresponding metrics in LOC and VG.

B3.2 Viewing Summary
Press the view summary button to visualize the impact summary.
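The generate step above amounts to filtering the computed potential effects down to the artifact types the user ticked. A rough sketch of that filtering follows; the class name, the type-prefix convention ("Mtd", "Tcs", etc.) and the sample IDs are assumptions for illustration, not CATIA's actual implementation:

```java
import java.util.*;

public class TypeFilter {
    // Assumed convention: artifact IDs carry a type prefix,
    // e.g. "Mtd" = method, "CSU" = class, "Tcs" = test case, "Req" = requirement.
    static List<String> byTypes(Collection<String> effects, Set<String> prefixes) {
        List<String> out = new ArrayList<>();
        for (String id : effects)
            for (String p : prefixes)
                if (id.startsWith(p)) { out.add(id); break; }
        return out;
    }

    public static void main(String[] args) {
        List<String> effects = List.of("Mtd7", "Mtd8", "CSU1", "Tcs1", "Req2");
        // View only methods and test cases, as when ticking those types in Step 4.
        System.out.println(byTypes(effects, Set.of("Mtd", "Tcs")));
    }
}
```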
B3.3 Repeat the Process
The user can repeat the process for another impact analysis by clicking the back option.

B3.4 Exiting the System
To exit from the system, select the exit option from the file pull-down menu.

B4.0 Detailed Implementation of CATIA
The aim of this section is to describe the detailed implementation of CATIA. The implementation can be divided into three categories: bottom-up traceability, top-down traceability, and the interface with documentation. The user should treat this section as additional information to the manual described earlier.

B4.1 Bottom-Up Traceability
In bottom-up traceability, a user selects some primary artifacts at a lower level (e.g. methods) and inspects their potential effects at the higher levels. Figure B3 shows that the user has selected mtd2, mtd12 and mtd15 as the initial primary artifacts.

Figure B3: First CATIA interactive window

Upon receiving the primary artifacts, CATIA prompts another window that requires the user to decide the types of secondary artifacts (i.e. potential effects) for which to view the impact (Figure B4). The user can select one or more types or levels of secondary artifacts. In Figure B5, the user selects all the artifact levels, and after pressing the 'generate' button, CATIA produces a list of the impacted methods, classes, packages, test cases and requirements for each primary artifact chosen earlier.

Figure B4: Primary artifacts at method level

The user can scroll the window to see the detailed impacted artifacts. In Figure B6, the window shows that the primary artifact mtd2 has impacted four artifacts, i.e. mtd1, mtd14 and mtd16, including mtd2 itself. The information is given in terms of method name, size (LOC) and complexity (VG). Mtd1 is drivingStationHandler() with LOC 37 and VG 12, mtd2 is getState() with LOC 16 and VG 4, and so forth.
The user can view the summary of the selected primary artifacts with their impacted (secondary) artifacts by selecting the 'view summary' button (Figure B6). The figure shows the primary artifacts mtd2, mtd12 and mtd15 displayed with their respective summary impacts. Taking mtd2 as an example, this method has potential effects on four methods in the system, bringing the total metrics to LOC 117 and VG 47.

Figure B5: Detailed results of secondary artifacts

In terms of classes, mtd2 has impacted three classes with LOC 190 and VG 71. In terms of packages, mtd2 has impacted two packages of size LOC 279 and VG 100. Mtd2 was associated with four test cases, taking up total metrics of LOC 373 and VG 123 of impacted artifacts to implement it. CATIA also provides a list of the detailed artifacts with their metric values to allow users to identify the impacted components. This detailed information is produced in addition to the existing output via specific windows. Figure B6 depicts all the activities of the running system, including the output of the detailed impacted artifacts in a message window. The user may find the message window an alternative space in which to view the impacts.

Figure B6: Summary of impacted artifacts by methods

Figure B7 shows the detailed impacts, e.g. mtd2 has caused an impact on three classes: cls1, cls2 and cls8. Cls1 is CDrivingStation with a total size and complexity of 39 and 14 respectively; however, its impacted size and complexity were 37 and 12. Cls2 is CPedal with a total size and complexity of 18 and 6 respectively, but its impacted size and complexity were 16 and 4. Cls8 is CCruiseManager with a total size and complexity of 133 and 51 respectively, and its impacted size and complexity were 64 and 31.
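The summary totals quoted above (e.g. mtd2's four impacted methods giving LOC 117 and VG 47) are plain sums of the per-artifact metrics over the impacted set. A minimal sketch follows; the `Artifact` record is an assumption, the mtd1/mtd2 values are from the text, and the mtd14/mtd16 values are hypothetical fillers chosen only so the example reproduces the quoted totals:

```java
import java.util.*;

public class ImpactSummary {
    record Artifact(String id, int loc, int vg) {}

    // Summary totals: sum LOC and VG over the impacted artifacts.
    static int[] totals(List<Artifact> impacted) {
        int loc = 0, vg = 0;
        for (Artifact a : impacted) { loc += a.loc(); vg += a.vg(); }
        return new int[] { loc, vg };
    }

    public static void main(String[] args) {
        // mtd1 and mtd2 metrics come from the manual; mtd14 and mtd16
        // metrics are invented for illustration.
        List<Artifact> impactedByMtd2 = List.of(
            new Artifact("mtd1", 37, 12), new Artifact("mtd2", 16, 4),
            new Artifact("mtd14", 30, 15), new Artifact("mtd16", 34, 16));
        int[] t = totals(impactedByMtd2);
        System.out.println("LOC " + t[0] + ", VG " + t[1]); // LOC 117, VG 47
    }
}
```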
The affected methods can be recognized by looking at the method list shown earlier and finding their corresponding classes in the reference tables. Please note that the total size and complexity of a method are the same as its impacted size and complexity, because CATIA treats methods as the smallest artifacts of impact analysis.

Figure B7: Impacted artifacts in a message window

B4.2 Top-Down Traceability
Figure B8 shows another way of using CATIA. In this example, a user performed top-down traceability by choosing some requirements (i.e. req5, req9, req26) at the highest level to view the impacted artifacts at the lower levels. Figure B9 summarizes the output: req5 has caused impacts on 24 methods, 14 classes, 10 packages and 3 test cases. In terms of sizes and complexities, the impacts take up LOC 351 and VG 117 on methods, LOC 473 and VG 154 on classes, LOC 491 and VG 160 on packages, and LOC 351 and VG 117 on test cases.

Figure B8: Primary artifacts at requirement level

Figure B9: Summary of impacted artifacts by requirements

Req9 has caused impacts on 22 methods, 13 classes, 9 packages and 3 test cases. In terms of sizes and complexities, the impacts take up LOC 348 and VG 116 on methods, LOC 473 and VG 154 on classes, LOC 491 and VG 160 on packages, and LOC 348 and VG 116 on test cases, and so forth. The detailed metric descriptions of individual artifacts in top-down traceability are also available in the box window and message window. Please note that no requirement is allowed to cause an impact on other requirements, as described earlier in our proposed model.

B4.3 Interface with Documentation
The CATIA environment above was designed as a single-task application for change impact analysis, in which useful information from the system documentation has been extracted, analyzed and kept in CATIA's own database repository.
Another version of CATIA was developed to support a documentation environment by establishing a simple interface (Figure B10) using an HTML webpage.

Figure B10: A hyperlink between CATIA and documentation

The HTML page allows the user to switch between the documentation and the CATIA application. This new version shares the same resources as the first version, except that the user interfaces are different. The significant advantage of this version is that the environment provides two applications at the same time, i.e. documentation and CATIA. It gives the user the flexibility to cross over between the two applications for references without having to terminate the other session.

APPENDIX C

CODE PARSER

Our code parser, TokenAnalyzer, was developed in Java to parse C++ code into recognizable tokens. The recognizable tokens are classified into the table structures shown in Figure C1.

Figure C1: Table structures of recognizable tokens

The table structures are composed of:
1. Clstable - to keep classes.
2. Mtdtable - to keep methods with classid.
3. Parametertable - to keep parameters.
4. Attable - to keep attributes.
5. Datatypetable - to keep all the data types.

What is the .xmi file?
The .xmi file is a token file consisting of classes, methods, parameters, attributes and relationships extracted from the code, based on the XML/UML format.

Defining a class
A class is defined by using the keyword "class" followed by a programmer-specified name and the class definition in brackets. The class definition contains the class members and member functions. In the .xmi file, a class is defined as below:

<!-- ================ ClassName [class] ================= -->

We developed a special program to read the .xmi file and extract from it a set of classes. The underlying .xmi format required us to apply a regular expression to help parse the code (Listing C1).
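For illustration, a pattern along these lines can pull the class name out of such a marker line. This standalone sketch is illustrative only; the class name and the exact expression used by TokenAnalyzer may differ:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClassMarker {
    // Matches e.g. "<!-- ====== CPedal [class] ====== -->" and
    // captures the class name that precedes the "[class]" tag.
    static final Pattern CLASS_LINE =
        Pattern.compile("<!--\\s*=+\\s*(\\w+)\\s*\\[class]\\s*=+\\s*-->");

    static String className(String line) {
        Matcher m = CLASS_LINE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(className("<!-- ====== CPedal [class] ====== -->"));
    }
}
```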
Based on the structure of the UML token file, it is possible to read from a specific location of the token file repeatedly and dynamically as conditions change, i.e. when i) a specific token is matched by the keyword, and ii) more than one member artifact is detected. The java.util.regex package provides the facilities to read the token file and capture the data correctly. The techniques used include:
• capturing the data into double (two-dimensional) arrays upon reading the token file and passing the related arrays into a database;
• using normal variables and Boolean-type variables to control the behavior and functionality of the methods and to handle the token file based on the required conditions.

Listing C1: A Regex Program to Implement Regular Expressions

//From ProcessParse.java file
import java.util.regex.Matcher;
…
public static CharSequence FileInfo(String filename) throws IOException {
    CharBuffer cbuffer = null;
    try {
        FileInputStream finput = new FileInputStream(filename);
        FileChannel fchanel = finput.getChannel();
        ByteBuffer buffer = fchanel.map(FileChannel.MapMode.READ_ONLY, 0, (int) fchanel.size());
        cbuffer = Charset.forName("8859_1").newDecoder().decode(buffer);
    } catch (Exception e) {
        System.out.println(" FROM FindClass Method : " + e.getMessage());
    }
    return cbuffer;
}

//Engine that reads the regular expression
public static Matcher xmi_Matcher(String compile, String filename) throws Exception {
    Pattern pattern = Pattern.compile(compile);
    Matcher matcher = pattern.matcher(FileInfo(filename));
    return matcher;
}

//From RunParser.java file
Matcher match_Class = ProcessParse.xmi_Matcher("\\w.+\\[class]", ProcessParse.filename); //the regular expression
Classes.xmi_findClass(match_Class);

//From Classes.java file
public static void xmi_findClass(Matcher match) throws SQLException {
    //Local variable declaration
    int rs;
    Parse p = null;
    try {
        …
        int count = 1;
        while (match.find()) { //find() returns true when the regular expression matches
            p = new Parse();
            String[]
result = match.group().trim().split("\\s");
            find = result[0]; //the class name obtained from the .xmi file
            //some code
        }
    } catch (Exception e) { }
}

Defining a method

A method is a member function declared in a class. In the .xmi file, a method is known as an 'operation' and is defined as follows:

<!-- ============== ClassName [class] MethodName [Operation] =============== -->

Here is the snippet of the code to get the methods:

//From RunParser.java file
Matcher match_Mtd = ProcessParse.xmi_Matcher("\\w.+\\[Operation]", ProcessParse.filename);
Method.xmi_findMtd(match_Mtd);

//From Methods.java file
public static void xmi_findMtd(Matcher match) throws IOException {
    try {
        //.......
        //More regular expressions
        Matcher match_2 = ProcessParse.xmi_Matcher(
            "<Foundation.Core.Classifier xmi.idref = \\'[^ ][(\\w),(\\w)]*\\'",
            ProcessParse.filename); //data type
        Matcher match_1 = ProcessParse.xmi_Matcher(
            "visibility = \\'[^ ][(\\w),(\\w)]*\\'",
            ProcessParse.filename); //accessibility
        while (match.find()) {
            p = new Parse();
            String[] result = match.group().trim().split("\\s");
            p.setClsname(result[0]); //class
            p.setMtdname(result[4]); //method
            int go = match.end();
            if (match_1.find(go)) { //find the next line after finding the method
                String[] result_1 = match_1.group().trim().split("\\s");
                p.setVisibility(result_1[2].replace('\'', ' ').trim()); //access type: public, private, protected
            }
            int end = match_1.end();
            if (match_2.find(end)) {
                String[] result_2 = match_2.group().trim().split("\\s");
                p.setDtyid(result_2[3].replace('\'', ' ').trim()); //data type
            }
            //....some other code
        }
    } catch (Exception e) { }
}

Defining an attribute

An attribute is a class member declared as data with some variation of data types.
In the .xmi, the attributes can be presented as below: <!-- =============== ClassName [class] attributeName [Attribute] =============== --> Here, is the snippet of the code to get the attributes: //From RunParser.java file Matcher match_Attribute= ProcessParse.xmi_Matcher("\\w.+\\[Attribute]", ProcessParse.filename); Attribute.xmi_findAttribute (match_Attribute); //From Attribute.java file public static void xmi_findAttribute(Matcher match) throws IOException { try { //....... //More Regular expression Matcher match_2 = ProcessParse.xmi_Matcher( "<Foundation.Core.Classifier xmi.idref = \\'[^ ][(\\w),(\\w)]*\\'", ProcessParse.filename); Matcher match_1 = ProcessParse.xmi_Matcher( "visibility = \\'[^ ][(\\w),(\\w)]*\\'", ProcessParse.filename); while(match.find()){ p = new Parse(); String[] result = match.group().trim().split("\\s"); p.setClsname(result[0]); //Class p.setAttname(result[4]); //Attribute int go = match.end(); if(match_1.find(go)){ String[] result_1 = match_1.group().trim().split("\\s"); p.setVisibility(result_1[2].replace('\'',' ').trim());//Access type-public, private, protected } int end = match_1.end(); if(match_2.find(end)){ String[] result_2 = match_2.group().trim().split("\\s"); p.setDtyid(result_2[3].replace('\'', ' ').trim()); //Data type } ....//some others code } } } catch (Exception e) { } } 194 Defining a parameter Parameter is a piece of information in a call invocation that provides communication between parts of the program. Below represents some tags of parameters: <UML:BehavioralFeature.parameter> …. </UML:BehavioralFeature.parameter> Here, is the snippet of the code to get the parameters: //From RunParser.java file Parameter.xmi_findParameter(ProcessParse.TokenString(ProcessParse.filename)); //From Parameter.java file public static void xmi_findParameter(String[] result) throws SQLException { try { //....... 
for (int i = 0; i < result.length; i++) {
            p = new Parse();
            //System.out.println("Test Mode " + result[i]);
            if (result[i].trim().equals("<UML:BehavioralFeature.parameter>")) {
                buf = result[i - 4].trim().split(" ");
                start = i;
                found = true;
            }
            if (result[i].trim().equals("</UML:BehavioralFeature.parameter>")) {
                end = i;
                if (found) {
                    for (int k = start; k < end; k++) {
                        if (result[k].trim().substring(0, 16).matches("<UML:Parameter x")) {
                            count = count + 1;
                            String[] temp = result[k + 1].trim().split(" ");
                            p.setMtdname(buf[10]); //method
                            p.setClsname(buf[6]); //class
                            p.setParamname(temp[2].replace('\'', ' ').trim()); //parameter
                            p.setVisibility(temp[5].replace('\'', ' ').trim()); //access type: public, private, protected
                            p.setKind(temp[11].replace('\'', ' ').trim()); //return or inout
                        }
                        if (result[k].trim().substring(0, 16).matches("<Foundation.Core")) {
                            String[] temp = result[k].trim().split(" ");
                            p.setDtyid(temp[3].replaceAll("/>", " ").replace('\'', ' ').trim()); //data type
                            xmi_insertParam(p, count);
                        }
                        //....some other code
                    }
                }
                //....some other code
            }
        }
    } catch (Exception e) {}
}

Defining a data type

A data type in a programming language defines the kind of data a variable holds. Examples of data types are integer, float, char, string, and pointer. Below is the tag for a data type:

<!-- ================= [DataType] ================== -->

Here is the snippet of the code to get the data types:

//From RunParser.java file
DataType.xmi_findDataType(ProcessParse.TokenFile(ProcessParse.filename));

//From DataType.java file
public static void xmi_findDataType(LineNumberReader result) throws SQLException {
    try {
        //.......
Vector v = new Vector(); while ( (str = result.readLine()) != null) { StringTokenizer st = new StringTokenizer(str); while (st.hasMoreTokens()) { words = st.nextToken(); v.addElement(words); } } int count =1; for (int i = 0; i < v.size(); i++) { p = new Parse(); if("[DataType]".equals(v.elementAt(i))){ p.setClsname(((String)v.elementAt(i-1)).replace('=',' ').replace('*',' ').trim()); //Class p.setDatatypename(((String)v.elementAt(i-1)).replace('=',' ').trim()); //Data type name p.setDtyid(((String)v.elementAt(i+6)).replace('\'',' ').trim()); // Data type id p.setVisibility(((String)v.elementAt(i+12)).replace('\'',' ').trim()); // Access type-public, private, protected xmi_insertDataType(p,count); count++; } } } ....//some others code } } } catch (Exception e) {} } 196 RELATIONSHIP There are three main types of relationship between classes: generalization (inheritance), Aggregation/composition and association. • Generalization - This implies an "is a" relationship. One class is derived from the base class. Generalization is implemented as inheritance in C++. The derived class has more specialization. It can either override the methods of the base, or add new methods. Examples are a poodle class derived from a dog class, or a paperback class derived from a book class. • Aggregation/Composition - This implies a "has a" relationship. One class can be defined from other classes, that is, it contains objects of any component classes. For example, a car class contains the objects such as tires, doors, engine, and seats. • Association - Two or more classes interact with one another. We need to extract information from each other in some way e.g. a car class may need to interact with a road class, or if you live near any metropolitan area, the car class needs to pay tolls to a collector class. Other relationships are calls, creation and friendship. • Call - This relationship is an invocation that establishes communication within methods and classes. 
• Creation – A new instance object created in a method of different classes. • Friendship – A class gives a grant access to other classes. In the .xmi file, the declarations or definitions of relationship are defined as below: 197 1. Inheritance <UML:Generalization.parent> .. </UML:Generalization.parent> //derive class <UML:Generalization.child> .. </UML:Generalization. child > //base class 2. Aggregation <!-- ================= ClassName A-ClassName B [Aggregation] ================= --> 3. Composition <!-- ================= ClassName A-ClassName B [Composition] ================= --> 4. Association <!-- ================ ClassName A-ClassName B [Association] ================ --> 5. Call <!-- ================= ClassName A-ClassName B [Call] ==================== --> 6. Creation <!-- ================ ClassName A-ClassName B [Create] =================== --> 7. Friendship <UML: Dependency.client> .. </UML: Dependency.client> Dependency.supplier> 8.<UML: .. </UML: Dependency.supplier> //use the grant //give the grant 198 Below, are the snippets of the code to get the relationships: 1. Inheritance //RunParser.java Matcher match_Inheritance = ProcessParse.xmi_Matcher( "<UML:Generalization.child>", ProcessParse.filename); //to find base class int one = Relationship.xmi_findMain(match_Inheritance, "<UML:Generalization.parent>", "Inheritance", "Generalization", 1); // to find derive class 2. Aggregation //RunParser.java Matcher match_Aggregation = ProcessParse.xmi_Matcher("\\w.+\\[Aggregation]", ProcessParse.filename); //Aggregation definition – represent in regular expression int four = Relationship.xmi_findRelationship(match_Aggregation, "name = \\'[^ ][(\\w),(\\w)]*\\'", "Aggregation", three); //name of instance object that create in aggregation relationship using ‘arrow’ to call class member 3. 
Composition //RunParser.java Matcher match_Composition = ProcessParse.xmi_Matcher("\\w.+\\[Composition]", ProcessParse.filename); //Aggregation definition – represent in regular expression int three = Relationship.xmi_findRelationship(match_ Composition, "name = \\'[^ ][(\\w),(\\w)]*\\'", " Composition", two); //name of instance object that create in composition relationship using ‘dot’ to call class member 4. Association //RunParser.java Matcher match_Association = ProcessParse.xmi_Matcher("\\w.+\\[Association]", ProcessParse.filename); //Association definition – represent in regular expression int five = Relationship.xmi_findRelationship(match_ Association, "name = \\'[^ ][(\\w),(\\w)]*\\'", " Association ", four); //name of instance object that become a parameter to class member function 199 5. Call //RunParser.java Matcher match_Call= ProcessParse.xmi_Matcher("\\w.+\\[Call]", ProcessParse.filename); //Call definition – represent in regular expression int seven = Relationship.xmi_findRelationship(match_Call, "name = \\'[^ ][(\\w),(\\w)]*\\'", " Association ", six); //name of the call method 6. Creation //RunParser.java Matcher match_Create = ProcessParse.xmi_Matcher("\\w.+\\[Create]", ProcessParse.filename); //Creation definition – represent in regular expression int six = Relationship.xmi_findRelationship(match_Create, "targetScope = \\'[^ ][(\\w),(\\w)]*\\'", "Creation", five); 7. Friend //RunParser.java Matcher match_Friend = ProcessParse.xmi_Matcher("<UML:Dependency.client>", ProcessParse.filename); //subclass that use grant superclass int two = Relationship.xmi_findMain(match_Friend,"<UML:Dependency.supplier>", "Friend", "Friendship", one); // superclass that give a grant to subclass 200 A sample program of artifact relationships in code is described as below. 
//Relationship.java public static int xmi_findMain(Matcher match, String type, String art, String f, int count){ Matcher match_1 = ProcessParse.xmi_Matcher("\\<!-- [\\w]* -->", ProcessParse.filename); Matcher match_2 = ProcessParse.xmi_Matcher(type, ProcessParse.filename); Matcher match_3 = ProcessParse.xmi_Matcher("\\<!-- [\\w]* -->", ProcessParse.filename); while (match.find()) { p = new Parse(); String[] result = match.group().trim().split("\\s"); int go = match.end(); if (match_1.find(go)) { String[] result_1 = match_1.group().split(" "); p.setClassname(result_1[2]); //child – subclass } int end = match_1.end(); if (match_2.find(end)) { int end_1 = match_2.end(); String[] result_ = match_2.group().split(" "); if (match_3.find(end_1)) { String[] result_2 = match_3.group().split(" "); p.setClsname(result_2[2]); //parent – superclass } } p.setArtifacts(art); // type of artifacts p.setReltype(f); // type of relationships xmi_insertRelation(p, count); count++; } } catch (Exception e) { ConnectionDB.closeConnection(con); System.out.println(" FROM xmi_findMain Method : " + e.getMessage()); } 201 Relationship Between Classes And Methods A relationship involving classes and methods can be interpreted as Class x Class (CxC), Class x Method (CxM), Method x Class (MxC) and Method x Method (MxM) tables. These relationship tables contain artifact relationships captured from the code. The artifact relationships are captured based on some detected tokens such as via dot and arrow relationships. The relationship tables provide input to impact analysis under the following file names: 1. C X C a. AGGREGATIONTABLE b. ASSOCIATIONTABLE c. COMPOSITIONTABLE d. CREATIONTABLE e. FRIENDTABLE f. INHERITIMPTABLE 2. C X M a. FRIENDCXMTABLE 3. M X C a. USESMXCTABLE b. CREATIONMXCTABLE c. AGGREGATIONMXCTABLE d. ASSOCIATIONMXCTABLE e. COMPOSITIONMXCTABLE f. INHERITANCEMXCTABLE 4. M X M a. CALLTABLE b. 
USESMXMTABLE

To implement the relationships between classes and methods, we need the following steps.

1. Keep track of all the header and source files, and save them into a table called SOURCEFILE. The table contains the following information:
   - ID – auto number
   - CLSID – the class id
   - CLSNAME – the class name
   - SOURCE – file name *.cpp
   - HEADER – file name *.h

2. Create an object table, OBJECTTABLE, to save object instances that are derived from the classes. This information is available in the source files (i.e. the SOURCEFILE table). The OBJECTTABLE contains the following fields:
   - ID – auto number
   - CLSNAME – class name
   - TYPE – Pointer or Dot
   - OBJECT – object name
   - FROMFILE – either Source or Header

3. From the OBJECTTABLE, we need to create two other tables, OBJECTINTERACTION and CREATIONINTERACTION.
   a. OBJECTINTERACTION
      i. ID – auto number
      ii. FROMMETHOD – methods that perform call invocations
      iii. OBJECT – the object names
      iv. TYPE – can be either "->" or "."
      v. TOMETHOD – methods or attributes that receive the call invocations
   b. CREATIONINTERACTION
      i. ID – auto number
      ii. OBJECT – the object names
      iii. FROMMETHOD – methods that create the new objects
      iv. TYPE – can be either "= new " or " "

4. The system will generate the USESMXCTABLE and USESMXMTABLE tables for the uses relationship with the help of the OBJECTINTERACTION table. This process identifies the method-class and method-method relationships. Another table, CREATIONMXCTABLE, can be created for method-class relationships via object creation with the help of the CREATIONINTERACTION table. Below is a representation of the method-class relationships as they appear in the USESMXCTABLE and CREATIONMXCTABLE tables:

   Class (csu1 … csun) x Method (mtd1 … mtdn)

   The table below represents the method-method relationships as they appear in the USESMXMTABLE table:

   Method (mtd1 … mtdn) x Method (mtd1 … mtdn)

5.
If there exists a relationship between a method and a class, the system puts a '1' in the MxC table; otherwise the cell is null. The sample code below shows that a method a() of class A has a relationship with class B when it is called in class B as an operation.

Code example:

class A { //csu1
    void a(); //mtd1
};

class B { //csu2
    A a1;
    void b(); //mtd2
};

//operation
void B::b() {
    a1.a();
};

The above example can be illustrated in the sample table below, where mtd1 represents method a() and csu2 represents class B. The relationship between method a() and class B, i.e. a() → csu2, is recorded as:

   code   csu1   csu2
   mtd1    0      1
   mtd2    0      0

6. A CxC table will record the class-class relationships using the RELATIONSHIPTABLE. The class-class relationships can be derived from inheritance, aggregation, association, composition and friendship. Below is a table representation of the class-class relationships:

   Class (csu1 … csun) x Class (csu1 … csun)

7. For the call relationship (MxM), we use a different approach:
   a. Extract the call graph scheme (.rsf) to get the method-to-method relationships.
   b. Create a call relationship table, CALLTABLE (MxM):

      Method (mtd1 … mtdn) x Method (mtd1 … mtdn)

   c. Create a new table called TEMPMTDTABLE with the following fields:
      i. UID – identification numbers
      ii. ID – the IDs extracted from the .rsf
      iii. MTDNAME – the method names extracted from the .rsf

For example, Listing C2 represents a sample token file received from the .rsf. We read the .rsf file and use the word Call as a tag to get ID1. MTDNAME (doCruise) is a method name of class CFcdCruise. ID1 is a unique value for the method doCruise; we keep ID1 as the caller in the TEMPMTDTABLE. ID2 is another unique value that we keep in the TEMPMTDTABLE as the callee.
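The caller/callee bookkeeping described above can be sketched in a few lines of Java. This is an illustrative sketch, not the thesis tool's actual code: the IDs are hypothetical, the records are kept in memory rather than in the TEMPMTDTABLE database table, and inverting the pairs gives, for every callee, the callers that are potentially affected if the callee changes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CallTableDemo {
    // calls: each element is {callerId, calleeId}, as extracted from
    // "Call ID1 ID2" records in the .rsf file.
    static Map<String, Set<String>> invert(List<String[]> calls) {
        Map<String, Set<String>> affected = new HashMap<>();
        for (String[] c : calls) {
            // Key by callee (c[1]); collect its callers (c[0]).
            affected.computeIfAbsent(c[1], k -> new TreeSet<>()).add(c[0]);
        }
        return affected;
    }

    public static void main(String[] args) {
        List<String[]> calls = List.of(
            new String[]{"id52386", "id58229"},  // doCruise calls id58229
            new String[]{"id60001", "id58229"}); // hypothetical second caller
        // Changing id58229 potentially affects both of its callers.
        System.out.println(invert(calls).get("id58229")); // prints "[id52386, id60001]"
    }
}
```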
Listing C2: A sample token file of .rsf

type id52386 Function
name id52386 "CFcdCruise::doCruise"                                          (declaration ID)
id52386 "void doCruise (int) (F:\1- RITA\MyWork\OBA\FcdCruise.cpp(86))"      (MTDNAME)
Call id52386 id58229                                                         (ID1, ID2)

Based on the TEMPMTDTABLE, we create the method-method relationships in the table MTDTABLE. The table below represents the MTDTABLE, where ID1 is represented by mtd2 and ID2 is represented by mtd1, such that ID1 calls ID2:

   code   mtd1   mtd2
   mtd1    0      0
   mtd2    1      0

Table C2 presents a summary of the object oriented artifacts and their relationships. We classify the relationship tables into the forms CxC, CxM, MxC and MxM. Please note that these tables are used as input to the change impact process. To make them useful for the change impact, we have to reverse the existing tables: given ID1 → ID2, meaning that ID1 has a relation with ID2 (e.g. ID1 calls ID2), we work the other way around by considering ID2 as a cause and ID1 as an effect. The interpretation is that if ID2 is to be changed then ID1 is potentially affected. The summary of the relationship tables is illustrated below.

Table C2: Summary of object oriented relationships

No | Matrix | Relationship | Symbols              | Tables
1  | CxC    | Inheritance  | Class A → Class B    | INHERITIMPTABLE
   |        | Association  | Class A → Class B    | ASSOCIATIONTABLE
   |        | Aggregation  | Class A → Class B    | AGGREGATIONTABLE
   |        | Composition  | Class A → Class B    | COMPOSITIONTABLE
   |        | Friendship   | Class A → Class B    | FRIENDTABLE
   |        | Creation     | Class A → Class B    | CREATIONTABLE
2  | CxM    | Friendship   | Class A → Method B   | FRIENDCXMTABLE
3  | MxC    | Uses         | Method A → Class A   | USESMXCTABLE
   |        | Creation     | Method A → Class A   | CREATIONMXCTABLE
   |        | Aggregation  | Method A → Class A   | AGGREGATIONMXCTABLE
   |        | Composition  | Method A → Class A   | COMPOSITIONMXCTABLE
   |        | Association  | Method A → Class A   | ASSOCIATIONMXCTABLE
   |        | Inheritance  | Method A → Class A   | INHERITANCEMXCTABLE
4  | MxM    | Call         | Method A → Method B  | CALLTABLE
   |        | Uses         | Method A → Method B  | USESMXMTABLE

APPENDIX D

OBA SYSTEM - The Case Study

D1.
OBA Overview

Automobile Board Auto Cruise (OBA) is an auto cruise system that acts as an interface between a car system and its driver. OBA allows a driver to interact with his car while in auto cruise or non-auto cruise mode, e.g. accelerating, suspending and resuming the cruising speed, braking the car, mileage maintenance, and switching between the auto cruise and non-auto cruise modes. While the mechanical part of the engine function is managed by a readily available subsystem called OBATargetMachine, OBA can communicate with this subsystem via a shared memory function with some designated capabilities. These capabilities include:
• Initialize calibration values
• Message passing to the display panel
• Capture throttle values for acceleration
• Initialize the maintenance limits
• Obtain the number of pulses for distance and speed calculations
• Initialize the auto speed limits
• Initialize the auto control limits

OBA has to compute and manage values such as the speed, acceleration, mileage maintenance, and warning and danger messages based on the state of the car.

D2. Project of Industrial Standard

The architectural design and specifications of OBA, together with a software simulator (OBATargetMachine), were prepared by the THALES company in Paris, France as a training recruitment programme in software development methodology for its clients. As an international company specializing in the automotive, aerospace, electronics and telecommunication industries, Thales designed the business architecture and specification of OBA to meet the industrial standard MIL-STD-498.

D3. CSCI Internal Interfaces

These interfaces provide the capability of interaction between classes. The interfaces are identified according to three types of classes, i.e. the boundary classes, entity classes and control classes. The boundary classes deal with the user's selection of the alternative operations.
The entity classes manage and store the data and their operations on the database. The control classes provide the main program control over a specific process; a control class initiates and identifies the other classes to be involved in a particular process. The class interactions in each package are illustrated in Figures D1 through D6.

Figure D1: Start Vehicle Class Diagram

Figure D2: Set Calibration Class Diagram

Figure D3: Control Cruising Speed Class Diagram

Figure D4: Request Trip Assistance Class Diagram

Figure D5: Fill Fuel Class Diagram

Figure D6: Service Vehicle Class Diagram
D4. CSCI Data Element Requirements

This section identifies the interfaces between the different functionalities described above. From the object-oriented point of view, we provide for each class its main activities and its received and sent messages.

D4.1 Boundary Classes

AccelerationPedal
Class Type : Boundary Class
Responsibilities : Allow the driver to press or release the acceleration pedal in order to increase or decrease the speed
Received Event : pressPedal(), releasePedal()
Sent Event : getOperationSignal(Boolean, String)
Attributes : Not applicable

BrakePedal
Class Type : Boundary Class
Responsibilities : Allow the driver to press the brake pedal in order to reduce the speed
Received Event : pressPedal(), isReleased(), releasePedal()
Sent Event : getOperationSignal(Boolean, String)
Attributes : Not applicable

Gear
Class Type : Boundary Class
Responsibilities : Allow the driver to change the gear
Received Event : changeGear(), isHighest()
Sent Event : getOperationSignal(Boolean, String)
Attributes : Not applicable

Keypad
Class Type : Boundary Class
Responsibilities : Allow the driver to select operations of the system
Received Event : Not applicable
Sent Event : getOperationSignal(Boolean, String)
Attributes : Not applicable

Display
Class Type : Boundary Class
Responsibilities : Display appropriate messages to the driver
Received Event : displayCalibrationMsg(String), displayMaintenanceMsg(String), displaySpeedMsg(String), displayFuelMsg(String)
Sent Event : Not applicable
Attributes : Not applicable

LED
Class Type : Boundary Class
Responsibilities : Enable the LED to be lit under a specific condition
Received Event : lightLED(Integer, Boolean)
Sent Event : Not applicable
Attributes : Not applicable

Ignition
Class Type : Boundary Class
Responsibilities : Allow the driver to switch the engine on/off
Received Event : clickBtn(Integer), isOn()
Sent Event : getSignal()
Attributes : Not applicable

D4.2 Entity Classes
Calibration
Class Type : Entity Class
Responsibilities : Store the calibration reference
Received Event : init(), isStart(), setCalibration(Integer), getCalibration()
Sent Event : Not applicable
Attributes : intCalibration

CruiseInfo
Class Type : Entity Class
Responsibilities : Store the cruising information
Received Event : setCruiseSpeed(Double), setCruiseStatus(Boolean, String), getPrevSpeed(), isActive()
Sent Event : Not applicable
Attributes : boolActivate, boolAccelerate, boolResume, boolSuspend

Trip
Class Type : Entity Class
Responsibilities : Store the trip information
Received Event : getInitialMileage(), getInitialTime(), setInitialMileage(Integer), setInitialTime(Integer)
Sent Event : Not applicable
Attributes : intInitTime, intInitMileage

Fuel
Class Type : Entity Class
Responsibilities : Store the fuel information
Received Event : init(), getTankCapacity(), getRecentFillAmount(), setRecentFillAmount(Integer), getRecentFillMileage(), setRecentFillMileage(Integer)
Sent Event : Not applicable
Attributes : intFillMileage, intAmountFuel

Maintenance
Class Type : Entity Class
Responsibilities : Store the maintenance information
Received Event : getNextService(), setNextService(Integer), setServiceNum(), getServiceNum()
Sent Event : Not applicable
Attributes : intSchedule, intServiceMileage, intNextServiceMileage, intServiceNum

PulseSensor
Class Type : Entity Class
Responsibilities : Store the number of pulses at a specified time
Received Event : getPulse()
Sent Event : Not applicable
Attributes : intNumOfPulse

Timer
Class Type : Entity Class
Responsibilities : Store the current time
Received Event : init(), getTime()
Sent Event : Not applicable
Attributes : intInitTime

Mileage
Class Type : Entity Class
Responsibilities : Store the current mileage information
Received Event : init(), getMileage(), calculateMileage(Integer)
Sent Event : Not applicable
Attributes : intMileage

Speed
Class Type : Entity Class
Responsibilities : Store the current speed information
Received Event : init(), getSpeed(), calculateSpeed(Integer), isMoving(), isValidRange()
Sent Event : Not applicable
Attributes : intSpeed

D4.3 Control Classes

InitManager
Class Type : Control Class
Responsibilities : Initialize all the information of the vehicle upon activation of the system
Received Event : getSignal()
Sent Event : Not applicable
Attributes : Not applicable

CalibrationManager
Class Type : Control Class
Responsibilities : Set the calibration reference
Received Event : getCalibrationStatus(Boolean), getAccelerationSignal(Boolean), verifyPulses(), calculateCalibration(Integer)
Sent Event : isActivate(), isMoving(), getPulse(), updateCalibration(Integer)
Attributes : Not applicable

CruiseManager
Class Type : Control Class
Responsibilities : Manage the cruising function
Received Event : getOperationSignal(Boolean, String), getLEDSignal(), verifyCurrentSpeed(Integer)
Sent Event : lightLED(Integer, Boolean), isOn(), isStart(), isHighest(), isValidRange(), setCruisingStatus(Boolean, String), getSpeed(), setCruiseSpeed(Double), regulateSpeed(Integer), regulateAcceleration()
Attributes : Not applicable

TripManager
Class Type : Control Class
Responsibilities : Manage the trip information
Received Event : getSignal(Boolean, Integer), calculateAverageSpeed(), calculateFuelConsumption()
Sent Event : isOn(), getMileage(), setInitialMileage(Integer), isMoving(), getTime(), getInitialMileage(), getInitialTime(), getRecentFuelAmount(), displaySpeedMsg(String), displayFuelMsg(String)
Attributes : intStopTime, intNextStartTime

FuelManager
Class Type : Control Class
Responsibilities : Manage the fuel information
Received Event : getSignal(Boolean), getFuelAmount(Integer), verifyEnteredAmount(), calculateFuelConsumption()
Sent Event : isOn(), getMileage(), getTankCapacity(), setRecentFillAmount(Integer), getLastFillDistance(), setRecentFillDistance(), displayFuelMsg(String)
Attributes : intMileage

MaintenanceManager
Class Type : Control Class
Responsibilities
: Manage the maintenance information
Received Event : getSignal(Boolean), verifyMaintenance()
Sent Event : getMileage(), getNextService(), displayMaintenanceMsg(String), setNextService(Integer), setServiceNum()
Attributes : Not applicable

Throttle
Class Type : Control Class
Responsibilities : Manage the throttle status
Received Event : regulateSpeed(Integer), regulateAcceleration(), resumeCruiseSpeed(Integer)
Sent Event : getSpeed(), getLEDSignal(Integer)
Attributes : Not applicable

D5. Data Specifications and Analysis

Below are the data specifications used during our data gathering and impact analysis. They comprise a set of requirements, test cases, packages, classes and methods.

D5.1 List of requirements

Below is the list of requirements captured from the SRS document. These requirements are used in the test scenarios and the traceability process. Each requirement has a unique code and a description: the first two digits of the code identify the requirement group and the last two digits identify the requirement unit. We deliberately assign these unique identifiers so that each requirement is individually recognizable and unambiguous.

Req 1 : SRS_REQ-01-01 desc: The driver clicks the ignition button on the driving station to turn the engine of the car on.
Req 2 : SRS_REQ-01-02 desc: The CSCI resets the default value for calibration (5291 pulses/km).
Req 3 : SRS_REQ-01-03 desc: The CSCI resets the default value for the fuel tank (35 litres).
Req 4 : SRS_REQ-02-01 desc: No calibration is performed during AutoCruise.
Req 5 : SRS_REQ-02-02 desc: To activate the AutoCruise function, the transmission must be in gear 5.
Req 6 : SRS_REQ-02-03 desc: The speed must be >= 80 km/h and <= 175 km/h for AutoCruise.
Req 7 : SRS_REQ-02-04 desc: The driver presses the “Activation” button to activate the AutoCruise function.
Req 8 : SRS_REQ-02-05 desc: The system informs the driver by lighting the on/off LED on the instrument panel.
Req 9 : SRS_REQ-02-06 desc: The cruise control capacity is to maintain a constant speed.
Req 10 : SRS_REQ-02-07 desc: The driver presses the “Deactivation” button and the system shall return control to the driver.
Req 11 : SRS_REQ-02-08 desc: The system informs the driver that the AutoCruise function is deactivated by turning off the LED on the instrument panel.
Req 12 : SRS_REQ-02-09 desc: The driver presses the “Acceleration” button to increase the cruising speed.
Req 13 : SRS_REQ-02-10 desc: The system measures the acceleration and keeps it at 1 km/h/s.
Req 14 : SRS_REQ-02-11 desc: The driver presses the “Stop” Acceleration button and the system keeps the vehicle at the new speed.
Req 15 : SRS_REQ-02-12 desc: The driver presses the acceleration pedal to increase the vehicle speed.
Req 16 : SRS_REQ-02-13 desc: The driver presses the brake to suspend the cruise control function.
Req 17 : SRS_REQ-02-14 desc: The cruise control capacity is suspended instantaneously.
Req 18 : SRS_REQ-02-15 desc: The driver presses the clutch pedal to suspend the cruise control function.
Req 19 : SRS_REQ-02-16 desc: The brake and clutch pedals are released and the transmission is in the highest gear.
Req 20 : SRS_REQ-02-17 desc: The driver presses the “Resume” button.
Req 21 : SRS_REQ-02-18 desc: The system returns the vehicle to the speed that was selected before braking or changing speed.
Req 22 : SRS_REQ-02-19 desc: The system lights the danger-message LED when the speed exceeds 150 km/h.
Req 23 : SRS_REQ-02-20 desc: The system lights the warning-message LED when the speed is more than 10 km/h below or above the cruising speed.
Req 24 : SRS_REQ-02-21 desc: The AutoCruise is deactivated when the current speed is < 80 km/h.
Req 25 : SRS_TREQ-02-01 desc: The reaction of the software towards the fuel inlet mechanism must take place within 0.5 seconds.
Req 26 : SRS_REQ-03-01 desc: Calibration is possible only if the cruising speed control is inactive.
Req 27 : SRS_REQ-03-02 desc: The driver presses “Start” calibration at the start of the measured kilometre.
Req 28 : SRS_REQ-03-03 desc: The driver presses “Stop” calibration after a 1 km drive.
Req 29 : SRS_REQ-03-04 desc: The system stores the number of pulses of the shaft during this interval.
Req 30 : SRS_REQ-03-05 desc: When the current count of pulses exceeds the calibration error limit, an error message is displayed on the control panel.
Req 31 : SRS_REQ-04-01 desc: The driver presses the “Fill Fuel” button and enters the amount of fuel.
Req 32 : SRS_REQ-04-02 desc: The driver shall click on the “Enter” button.
Req 33 : SRS_REQ-04-03 desc: The system verifies that the amount of fuel is within a reasonable range.
Req 34 : SRS_REQ-04-04 desc: Invalid fuel capacity amount.
Req 35 : SRS_TREQ-04-01 desc: The system displays the average fuel consumption to the driver in less than 1 second.
Req 36 : SRS_REQ-05-01 desc: The driver presses the “Start trip” button.
Req 37 : SRS_REQ-05-02 desc: The driver presses the “Average Speed” button.
Req 38 : SRS_REQ-05-03 desc: The system displays the average speed of the vehicle since the most recent start-of-trip request, including the stop periods of the vehicle.
Req 39 : SRS_REQ-05-04 desc: The driver presses the “Fuel Consum” button.
Req 40 : SRS_REQ-05-05 desc: The system calculates and displays the average fuel consumption of the vehicle since the most recent start-of-trip request up to the most recent filling input completion.
Req 41 : SRS_TREQ-05-01 desc: The system responds to the driver by displaying the average speed of the trip in less than 1 second.
Req 42 : SRS_REQ-06-01 desc: The system must watch the mileage of the vehicle against the corresponding maintenance level.
Req 43 : SRS_REQ-06-02 desc: The system displays an appropriate message intermittently within 400 km before reaching the mileage corresponding to a maintenance level.
Req 44 : SRS_REQ-06-03 desc: The system displays an appropriate message continuously within 80 km before reaching the mileage corresponding to a maintenance level.
Req 45 : SRS_REQ-06-04 desc: The driver clicks the “Done” Service button and the system suppresses the existing message.
Req 46 : SRS_TREQ-06-01 desc: The display of vehicle maintenance messages must take place less than 10 seconds after the maintenance milestone has been reached.

D5.2 List of test cases

Below is the list of test cases involved in the OBA case study. Each test case is represented by a unique code followed by its description. The first two digits of the code identify the test case group and the last two digits identify the test case unit. We deliberately assign these unique identifiers so that each test case is individually recognizable and unambiguous.

tcs 1 : STD_REQ-01-01 desc: Initialize OBA
tcs 2 : STD_REQ-02-01 desc: Launch Auto Cruise While Not in Gear 5
tcs 3 : STD_REQ-02-02 desc: Launch Auto Cruise While in Gear 5
tcs 4 : STD_REQ-02-03 desc: Start Calibration During Auto Cruise
tcs 5 : STD_REQ-02-04 desc: Light “Hazard Bolting” LED When Launching Auto Cruise While Speed is >= 150 km/h
tcs 6 : STD_REQ-02-05 desc: Regulate Speed During Auto Cruise
tcs 7 : STD_REQ-02-06 desc: Light “Care to Speed” LED During Auto Cruise
tcs 8 : STD_REQ-02-07 desc: Regulate Speed During Auto Cruise on Uphill Road
tcs 9 : STD_REQ-02-08 desc: Regulate Speed During Auto Cruise on Downhill Road
tcs 10 : STD_REQ-02-09 desc: Accelerate at 1 km/h/s When the “Start” Acceleration Button is Clicked
tcs 11 : STD_REQ-02-10 desc: Increase Speed by Pressing the Acceleration Pedal
tcs 12 : STD_REQ-02-11 desc: Select the Current Speed as New Cruising Speed by Clicking the “Stop” Acceleration Button
tcs 13 : STD_REQ-02-12 desc: Light “Hazard Bolting” LED During Auto Cruise
tcs 14 : STD_REQ-02-13 desc: Suspend Cruise Control by Pressing Brake Pedal or Clutch Pedal
tcs 15 : STD_REQ-02-14 desc: Resume
the Cruising Speed While Not Releasing the Brake Pedal
tcs 16 : STD_REQ-02-15 desc: Resume Cruising Speed While in Suspend Mode and Speed is < 80 km/h
tcs 17 : STD_REQ-02-16 desc: Resume Cruising Speed While in Suspend Mode and Speed is >= 80 km/h
tcs 18 : STD_REQ-02-17 desc: Deactivate Auto Cruise by Clicking on the “Deactivation” Button
tcs 19 : STD_REQ-02-18 desc: Change Gear During Auto Cruise
tcs 20 : STD_REQ-02-19 desc: Fuel Inlet Mechanism Responds in Less Than 0.5 Seconds to Actions of the Driver
tcs 21 : STD_REQ-03-01 desc: Start Calibration While Auto Cruise is Not Activated
tcs 22 : STD_REQ-03-02 desc: Stop Calibration When Reaching a 1 km Distance
tcs 23 : STD_REQ-03-03 desc: Stop Calibration When Exceeding a 1 km Distance
tcs 24 : STD_REQ-03-04 desc: Activate Auto Cruise While in Calibration Mode
tcs 25 : STD_REQ-04-01 desc: Fill Fuel With Fuel Amount of More Than 1 Decimal
tcs 26 : STD_REQ-04-02 desc: Fill Fuel With Invalid Fuel Amount
tcs 27 : STD_REQ-04-03 desc: Fill Fuel With Cancelled Transaction
tcs 28 : STD_REQ-04-04 desc: Display Average Fuel Consumption Between 2 Fillings in Less Than 1 Second
tcs 29 : STD_REQ-05-01 desc: Get Average Speed or Average Fuel Consumption Without Clicking the “New” Trip Button
tcs 30 : STD_REQ-05-02 desc: Get Average Speed During a Trip
tcs 31 : STD_REQ-05-03 desc: Get Average Fuel Consumption During a Trip
tcs 32 : STD_REQ-06-01 desc: Display “Oil & Filter Change” Message Intermittently and Continuously, 400 km and 80 km Respectively, Before Reaching Maintenance Milestones
tcs 33 : STD_REQ-06-02 desc: Display “Air Filter Change” Message Intermittently and Continuously, 400 km and 80 km Respectively, Before Reaching Maintenance Milestones
tcs 34 : STD_REQ-06-03 desc: Display “General Check” Message Intermittently and Continuously, 400 km and 80 km Respectively, Before Reaching Maintenance Milestones

D5.3 Package and Class Relationships

The OBA project consists of 46 requirements, 34 test cases, 12 packages, 23 classes and 80 methods. The CSU codes identify the individual classes and the CSC numbers identify the packages. The package classification is available in the SRS document and needs to be associated with the classes of the OBA programs. These artifacts and their relationships are stored in a table for further reference (Table D1). This table is used throughout the traceability and change impact process whenever classes need to be related to packages, or vice versa.

Table D1: A set of package-class references (Ref_CSU_CSC)

ID  Class Name           CSU Code  CSC No  CSC Name
1   CDrivingStation      CSU-1     1       DrivingStation
2   Cpedal               CSU-2     1       DrivingStation
3   Cignition            CSU-3     1       DrivingStation
4   Cgear                CSU-4     1       DrivingStation
5   Ckeypad              CSU-5     2       ControlPanel
6   CInitManager         CSU-6     3       InitMgmt
7   CFcdCruise           CSU-7     4       CruiseMgmt
8   CCruiseManager       CSU-8     4       CruiseMgmt
9   Ccruiseinfo          CSU-9     4       CruiseMgmt
10  DLL_Utility          CSU-10    5       DLL_CarUtility
11  DLL_Mileage          CSU-11    6       DLL_MileageMgmt
12  Ctimer               CSU-12    7       TimerMgmt
13  CSpeedCtrl           CSU-13    8       SpeedMgmt
14  CFcdTrip             CSU-14    9       TripMgmt
15  CTripManager         CSU-15    9       TripMgmt
16  CFuelManager         CSU-16    9       TripMgmt
17  Ctrip                CSU-17    9       TripMgmt
18  Cfuel                CSU-18    9       TripMgmt
19  CFcdMaintenance      CSU-19    10      MaintenanceMgmt
20  CMaintenanceManager  CSU-20    10      MaintenanceMgmt
21  Cmaintenance         CSU-21    10      MaintenanceMgmt
22  DLL_Display          CSU-22    11      DLL_DisplayMgmt
23  DLL_OTM              CSU-23    12      DLL_OTMMgmt

D5.4 Package, Class and Method Relationships

Table D2 below represents the relationships between methods, classes and packages. There are 80 methods, 23 classes and 12 packages captured in the OBA project. The method codes identify the individual methods, the CSU IDs identify the classes and the CSC IDs identify the packages. This table is used whenever artifacts need to be related to one another.
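The cross-reference scheme just described can be pictured as a small lookup structure. The sketch below is illustrative only: the few sample rows are taken from Tables D1 and D2, but the container choice and the helper name `packageOfMethod` are assumptions for illustration, not part of the OBA tool.

```cpp
#include <map>
#include <string>

// Illustrative sketch of the Ref_Mtd_CSU_CSC cross-references: a method
// code maps to its method name, class (CSU) ID and package (CSC) ID.
// Only three sample rows from Tables D1/D2 are shown.
struct MethodRef {
    std::string methodName;
    int csuId;   // class (CSU) identifier
    int cscId;   // package (CSC) identifier
};

const std::map<std::string, MethodRef> kMethodRefs = {
    {"Mtd1",  {"drivingStationHandler()", 1, 1}},
    {"Mtd14", {"getOperationSignal()",    8, 4}},
    {"Mtd77", {"getNextService()",       21, 10}},
};

// Sample rows of the CSU and CSC name tables (Table D1).
const std::map<int, std::string> kCsuNames = {
    {1, "CDrivingStation"}, {8, "CCruiseManager"}, {21, "Cmaintenance"}};
const std::map<int, std::string> kCscNames = {
    {1, "DrivingStation"}, {4, "CruiseMgmt"}, {10, "MaintenanceMgmt"}};

// Trace a method code to the name of the package (CSC) containing it,
// as the traceability process does when relating artifacts.
std::string packageOfMethod(const std::string& methodCode) {
    return kCscNames.at(kMethodRefs.at(methodCode).cscId);
}
```

For example, `packageOfMethod("Mtd77")` walks Mtd77 → CSU 21 → CSC 10 and yields the package name MaintenanceMgmt, mirroring a row of Table D2.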
Table D2: A set of method-class-package references (Ref_Mtd_CSU_CSC)

ID  Method Name                                 Method Code  CSU ID  CSC ID
1   drivingStationHandler()                     Mtd1         1       1
2   getState()                                  Mtd2         2       1
3   isOn()                                      Mtd3         3       1
4   isHighest()                                 Mtd4         4       1
5   keypadHandler()                             Mtd5         5       2
6   CInitManager::CInitManager()                Mtd6         6       3
7   CInitManager::init()                        Mtd7         6       3
8   CInitManager::getSignal()                   Mtd8         6       3
9   main()                                      Mtd9         6       3
10  CFcdCruise::CFcdCruise()                    Mtd10        7       4
11  doCruise()                                  Mtd11        7       4
12  doCruiseStatus()                            Mtd12        7       4
13  CCruiseManager::CCruiseManager()            Mtd13        8       4
14  getOperationSignal()                        Mtd14        8       4
15  regulateSpeed()                             Mtd15        8       4
16  checkCruiseModel()                          Mtd16        8       4
17  CCruiseInfo::CCruiseInfo()                  Mtd17        9       4
18  isActive()                                  Mtd18        9       4
19  setCruiseSpeed()                            Mtd19        9       4
20  setCruiseStatus()                           Mtd20        9       4
21  getPrevSpeed()                              Mtd21        9       4
22  DLL_Utility::DLL_Utility()                  Mtd22        10      5
23  DLL_Mileage::DLL_Mileage()                  Mtd23        11      6
24  CTimer::CTimer()                            Mtd24        12      7
25  CTimer::init()                              Mtd25        12      7
26  timeHandler()                               Mtd26        12      7
27  getTime()                                   Mtd27        12      7
28  calculateTime()                             Mtd28        12      7
29  controlBlinkMsg()                           Mtd29        12      7
30  CSpeedctrl::CSpeedctrl()                    Mtd30        13      8
31  isValidRange()                              Mtd31        13      8
32  getSpeed()                                  Mtd32        13      8
33  calculateSpeed()                            Mtd33        13      8
34  CFcdTrip::CFcdTrip()                        Mtd34        14      9
35  CFcdTrip::init()                            Mtd35        14      9
36  doTrip()                                    Mtd36        14      9
37  doStatus()                                  Mtd37        14      9
38  doFillFuel()                                Mtd38        14      9
39  CTripManager::CTripManager()                Mtd39        15      9
40  getTripStatus()                             Mtd40        15      9
41  calculateAverageSpeed()                     Mtd41        15      9
42  CTripManager::calculateFuelConsumption()    Mtd42        15      9
43  CTripManager::getSignal()                   Mtd43        15      9
44  CFuelManager::CFuelManager()                Mtd44        16      9
45  CFuelManager::getSignal()                   Mtd45        16      9
46  getFuelAmount()                             Mtd46        16      9
47  concateNum()                                Mtd47        16      9
48  verifyEnteredAmount()                       Mtd48        16      9
49  CFuelManager::calculateFuelConsumption()    Mtd49        16      9
50  getFuelFillStatus()                         Mtd50        16      9
51  CTrip::CTrip()                              Mtd51        17      9
52  setInitialMileage()                         Mtd52        17      9
53  setInitialTime()                            Mtd53        17      9
54  getInitialTime()                            Mtd54        17      9
55  getInitialMileage()                         Mtd55        17      9
56  CFuel::CFuel()                              Mtd56        18      9
57  CFuel::init()                               Mtd57        18      9
58  getRecentFillAmount()                       Mtd58        18      9
59  getTankCapacity()                           Mtd59        18      9
60  setRecentFillAmount()                       Mtd60        18      9
61  setRecentFillMileage()                      Mtd61        18      9
62  getRecentFillMileage()                      Mtd62        18      9
63  CFcdMaintenance::CFcdMaintenance()          Mtd63        19      10
64  CFcdMaintenance::doInit()                   Mtd64        19      10
65  doMaintenance()                             Mtd65        19      10
66  CMaintenanceManager::CMaintenanceManager()  Mtd66        20      10
67  CMaintenanceManager::init()                 Mtd67        20      10
68  CMaintenanceManager::getSignal()            Mtd68        20      10
69  verifyMaintenance()                         Mtd69        20      10
70  serviceDone()                               Mtd70        20      10
71  reachServiceMileage()                       Mtd71        20      10
72  CMaintenance::CMaintenance()                Mtd72        21      10
73  CMaintenance::init()                        Mtd73        21      10
74  setServiceMileage()                         Mtd74        21      10
75  setNextService()                            Mtd75        21      10
76  getServiceNum()                             Mtd76        21      10
77  getNextService()                            Mtd77        21      10
78  getServiceMileage2()                        Mtd78        21      10
79  DLL_Display::DLL_Display()                  Mtd79        22      11
80  DLL_OTM::DLL_OTM()                          Mtd80        23      12

APPENDIX E

Published Papers

Ibrahim, S., Idris, N.B. and Deraman, A. (2003). Change Impact Analysis for Maintenance Support. Proceedings of the International Information Technology Symposium (ITSIM’03). September 30-October 2. Malaysia: UKM Press. 270-278.

Ibrahim, S., Idris, N.B. and Deraman, A. (2003). Case Study: Reconnaissance Techniques to Support Feature Location Using RECON2. Asia-Pacific Software Engineering Conference. December 10-12. Thailand: IEEE Computer Society. 371-378.

Ibrahim, S., Al-Busaidi, M.A. and Sarkan, H. (2004). A Dynamic Analysis Approach for Concept Location. Technical Report of Software Engineering. CASE/Mei 2004/LT2.

Ibrahim, S. and Mohamad, R.N. (2004). Code Parser for C++. Technical Report of Software Engineering. CASE/Sept 2004/LT3.

Ibrahim, S., Idris, N.B., Munro, M. and Deraman, A. (2004). A Software Traceability Mechanism to Support Change Impact. Proceedings of ICCTA on Theory and Applications. September 28-30. Egypt: Computer Scientific Society. 117-124.

Ibrahim, S., Idris, N.B.
and Deraman, A. (2004). A Software Traceability Tool to Support Impact Analysis. Proceedings of the Confederation of Scientific and Technological Associations in Malaysia. October 5-7. Kuala Lumpur: COSTAM. 460-468.

Ibrahim, S., Idris, N.B., Munro, M. and Deraman, A. (2005). Implementing a Document-Based Requirements Traceability: A Case Study. Proceedings of the International Conference on Software Engineering. February 15-17. Austria: The International Association of Science and Technology for Development (IASTED). 124-131.

Ibrahim, S., Idris, N.B., Munro, M. and Deraman, A. (2005). A Requirements Traceability to Support Change Impact Analysis. Asian Journal of Information Technology. Pakistan: Grace Publication. 4(4): 345-355.

Ibrahim, S., Idris, N.B., Munro, M. and Deraman, A. (2005). Integrating Software Traceability for Change Impact Analysis. Arab International Journal of Information Technology. Jordan: AIJIT Press. 2(4): 301-308.

Ibrahim, S., Idris, N.B., Munro, M. and Deraman, A. (2005). A Software Traceability Model to Support Change Impact Analysis. Post-Graduate Annual Research Seminar (PARS’2005). May 17-18. UTM: Faculty of Computer Science and Information System. 284-289.

Sarkan, H., Ibrahim, S. and Al-Busaidi, M. (2005). A Dynamic Analysis Approach to Support Regression Testing. The First Malaysian Software Engineering Conference. December 12-13. Penang: USM Press. 241-245.

Ibrahim, S., Idris, N.B., Munro, M. and Deraman, A. (2006). A Requirements Traceability Validation for Change Impact Analysis. Proceedings of the International Conference on Software Engineering Research and Practice (SERP). June 26-29. Las Vegas: IEEE Computer Society.