TABLE OF CONTENTS

CHAPTER  TITLE  PAGE

DECLARATION  ii
DEDICATION  iii
ACKNOWLEDGEMENT  iv
ABSTRACT  v
TABLE OF CONTENTS  vii
LIST OF TABLES  xiv
LIST OF FIGURES  xv
LIST OF ACRONYMS AND SYMBOLS  xvii
LIST OF APPENDICES  xviii

1  INTRODUCTION
  1.1 Introduction  1
  1.2 Background of the Research Problem  2
  1.3 Statement of the Problem  3
  1.4 Objective of the Study  4
  1.5 Scope of Work  5
  1.6 Importance of the Study  5
  1.7 Thesis Outline  6
  1.8 Summary  7

2  LITERATURE REVIEW
  2.1 Introduction  8
  2.2 Introduction of Software Maintenance  9
    2.2.1 Software Maintenance Categories  11
    2.2.2 Problem in Software Maintenance  12
  2.3 Program Understanding  13
    2.3.1 Program Understanding Support Mechanism  15
      2.3.1.1 Unaided Browsing  15
      2.3.1.2 Leveraging Corporate Knowledge and Experience  16
      2.3.1.3 Computer Aided Technique  16
    2.3.2 Program Understanding via Reverse Engineering  17
  2.4 Reverse Engineering  18
    2.4.1 Reverse Engineering Concept and Definition  19
    2.4.2 Challenges in Reverse Engineering  21
    2.4.3 Program Understanding in Reverse Engineering Automating Approaches  23
  2.5 Parsing Technique  24
    2.5.1 Two Way of Parsing  25
    2.5.2 Parsing Methods  26
      2.5.2.1 Directionality  26
      2.5.2.2 Search Techniques  27
      2.5.2.3 Left Corner Parsing  28
    2.5.3 Time Requirement  28
  2.6 Extraction Process  29
    2.6.1 Pattern Matching  29
    2.6.2 Regular Expression  30
      2.6.2.1 Basic Concepts  32
      2.6.2.2 Portable Operating System Interface (POSIX) Syntax  34
    2.6.3 Pattern Matching and Regular Expression in Artifact Extraction  36
  2.7 Abstraction Process  36
    2.7.1 Graphical Representation  38
    2.7.2 Textual Representation  38
  2.8 Concept Location  40
    2.8.1 Concept Location in Source Code  41
    2.8.2 Static Concept Location Techniques  43
      2.8.2.1 String Pattern Matching Technique  44
      2.8.2.2 Dependency Search Technique  44
      2.8.2.3 IR-based Technique  45
  2.9 Code Query in Reverse Engineering Tools  45
    2.9.1 Windows Grep  46
    2.9.2 Rigi  47
      2.9.2.1 Rigi Features  48
      2.9.2.2 Rigi Query Technique  49
    2.9.3 CodeSurfer  49
      2.9.3.1 CodeSurfer Features  51
      2.9.3.2 CodeSurfer Query Technique  52
    2.9.4 The Comparative Evaluation of Existing Tools  53
  2.10 Proposed Solution  54
  2.11 Summary  55

3  RESEARCH METHODOLOGY
  3.1 Introduction  56
  3.2 Operational Framework  57
    3.2.1 Phase 1: Formulation of Research Problem  58
      3.2.1.1 Literature Reviews  58
        3.2.1.1.1 Understanding the Need of Change Request Process  59
        3.2.1.1.2 Understanding Structured Programming Concept  59
        3.2.1.1.3 Understanding the Extraction Process  60
        3.2.1.1.4 Understanding the Abstraction Technique  60
      3.2.1.2 Analysis Current Approach and Existing Tools  60
      3.2.1.3 Research Proposal  61
    3.2.2 Phase 2: Prototype Development  61
      3.2.2.1 Code Query Model Design  62
      3.2.2.2 Code Query Prototype Development  62
    3.2.3 Phase 3: Implementation and Evaluation  63
      3.2.3.1 Supporting Tools  64
      3.2.3.2 Choose Case Study  64
      3.2.3.3 Experimental  65
      3.2.3.4 Evaluation  65
    3.2.4 Phase 4: Research Report  66
  3.3 Research Assumption  66
  3.4 Summary  66

4  CODE QUERY MODEL
  4.1 Introduction  68
  4.2 Overview of Code Query  68
  4.3 Code Query in Structured Programming  70
    4.3.1 Structured Programming Concept  70
      4.3.1.1 Relationship in Structured Programming  72
      4.3.1.2 Dependencies in Structured Programming  73
      4.3.1.2 Observations about Structured Programming  73
  4.4 A Proposed Code Query  74
    4.4.1 Keyword  75
    4.4.2 Extraction of Artifacts  75
      4.4.2.1 Parser  76
      4.4.2.2 Pattern Matching  77
      4.4.2.3 Regular Expression  77
    4.4.3 Abstraction of Artifacts  78
      4.4.3.1 Code Query in Textual Representation  79
      4.4.3.1 Code Query in Graphical Representation  80
  4.5 Summary  81

5  DESIGN AND IMPLEMENTATION OF CODE QUERY
  5.1 Introduction  82
  5.2 Code Query Design  82
    5.2.1 Code Query Architecture  83
      5.2.1.1 Problem Change Request  84
      5.2.1.2 Artifacts Repository  84
      5.2.1.3 Extraction Process  85
      5.2.1.4 Abstraction Process  87
    5.2.2 Code Query Use Case  87
    5.2.3 Code Query Class Interactions  91
  5.3 Code Query Implementation and User Interfaces  96
  5.4 Other Supporting Tools  101
  5.5 Summary  101

6  EVALUATION
  6.1 Introduction  102
  6.2 Case Study  103
    6.2.1 Outlines of Case Study  103
    6.2.2 GI Project Briefing  104
  6.3 Controlled Experimental  104
    6.3.1 Subject and Environment  105
    6.3.2 Questionnaires  105
    6.3.3 Experimental Procedures  106
    6.3.4 Possible Threats and Validity  106
  6.4 The Analysis  107
    6.4.1 Analysis of the Controlled Experiment  107
    6.4.2 Analysis of the Usability Study  110
  6.5 Finding Analysis  113
    6.5.1 Acceptance Tool  113
    6.5.2 Qualitative Evaluation  114
  6.6 Summary  116

7  CONCLUSION AND FUTURE WORK
  7.1 Introduction  117
  7.2 Contribution  118
  7.3 Research Limitation and Future Works  118
  7.4 Summary  119

REFERENCES  120
Appendices A-B  124-134

LIST OF TABLES

TABLE NO.  TITLE  PAGE

2.1  Quantifiers of Regular Expression  33
2.2  Metacharacters for BRE Standard  35
2.3  Features of The Static Concept Location Techniques  43
2.4  Existing Features of current tools  54
4.1  Relationship Types  72
4.2  Regular Expression for Match Common Programming Language  78
6.1  Job versus frequencies  108
6.2  Year of Experience in Software Development  109
6.3  Year of Experience in Software Maintenance  109
6.4  Mean of scores for Code Query  111
6.5  Mean of Usefulness of Tools  112
6.6  Mean Comparison between Tools  114
6.7  Existing Features of Code Query Systems  115

LIST OF FIGURES

FIGURE NO.  TITLE  PAGE

2.1  Software Maintenance Process  9
2.2  Reverse Engineering  19
2.3  Forward Engineering  20
2.4  Overview of Parser Process  25
2.5  Most concept location techniques rely on an intermediate representation of the source code  41
2.6  Window Grep Search Results  47
2.7  View produced by Rigi via RigiEdit  49
2.8  CodeSurfer Project Viewer  50
2.9  Finder Viewer  52
3.1  Operational Framework  57
4.1  Overview of Code Query Model  69
4.2  Structured Programming – Tax Calculation  71
4.3  Function Relationship  72
4.4  Code Query Approach  74
4.5  Metrics on the important files  79
4.6  Graphical presentation for function and variables of vehicle simulation  81
5.1  Code Query Architecture  83
5.2  Use Case Diagram of Code Query System  88
5.3  Code Query Class Diagram  91
5.4  Code Query Sequence Diagrams  92
5.5  Code Query Process Algorithm Flowchart  94
5.6  Code Query Introduction Screen  96
5.7  First user interface of Code Query  97
5.8  File Path and the Keyword Field  97
5.9  Textual Representation  98
5.10  Graphical Representation  99
5.11  Low Level of Abstraction - Source Code Viewer  99
5.12  High Level of Abstraction - Detail Relationship of Artifacts  100
6.1  Usefulness and Usability of Tools  110
6.2  Usefulness of Tool  112

LIST OF ACRONYMS AND SYMBOLS

BRE - Basic Regular Expressions
GUI - Graphical User Interface
IEEE - The Institute of Electrical and Electronics Engineers, Inc
LSI - Latent Semantic Indexing
PBS - Portable Bookshelf
PCR - Program Change Request
RUP - Rational Unified Process
SDLC - Systems Development Life Cycle
SHriMP - Simple Hierarchical Multi-Perspective
SLC - Software Life Cycle
SLCM - Software Life Cycle Model
SLCP - Software Life Cycle Process
UML - Unified Modeling Language
WWW - World Wide Web

LIST OF APPENDICES

APPENDIX  TITLE  PAGE

A  Questionnaire On Usability Of Software Understanding Tool  124
B  User Manual  130