REVERSE ENGINEERING AS A METHOD OF SYSTEM REDOCUMENTATION

BUHARI, B. A.
Department of Mathematics, Usmanu Danfodiyo University, Sokoto, Nigeria.
buhari.bello@udusok.edu.ng, belloabuhari2@gmail.com

ABSTRACT

Reverse engineering for software is the process of analyzing a program in an effort to create a representation of the program at a higher level of abstraction than source code. Reverse engineering is a process of design recovery. Reverse engineering tools extract data, architectural, and procedural design information from an existing program. This paper explores the application of reverse engineering to understanding the processing of a system. Data flow diagram models, refined to successive levels of detail, are used to describe the flow of information within a system.

A faculty seminar presented in the Faculty of Science, April 2010

INTRODUCTION

The term reverse engineering has its origins in the hardware world. A company disassembles a competitive hardware product in an effort to understand its competitor's design and manufacturing "secrets." These secrets could be easily understood if the competitor's design and manufacturing specifications were obtained. But these documents are proprietary and unavailable to the company doing the reverse engineering. In essence, successful reverse engineering derives one or more design and manufacturing specifications for a product by examining actual specimens of the product.

Reverse engineering for software is quite similar. In most cases, however, the program to be reverse engineered is not a competitor's. Rather, it is the company's own work (often done many years earlier). The "secrets" to be understood are obscure because no specification was ever developed. Therefore, reverse engineering for software is the process of analyzing a program in an effort to create a representation of the program at a higher level of abstraction than source code. Reverse engineering is a process of design recovery.
Reverse engineering tools extract data, architectural, and procedural design information from an existing program.

Reverse engineering should produce, preferably automatically, documents that help software engineers understand the system. Over the last ten years, reverse engineering research has produced a number of capabilities for analyzing code, including subsystem decomposition (Umar, 1997), concept synthesis (Biggerstaff et al., 1994), design, program and change pattern matching (Gamma et al., 1995; Stevens and Pooley, 1998), analysis of static and dynamic dependencies (Systa, 1999), object-oriented metrics (Chidamber and Kemerer, 1994), and others. In general, these approaches have been successful in treating the software at the syntactic level to address specific information needs and to span relatively narrow information gaps.

REVERSE ENGINEERING IN DATA PROCESSING

The first real reverse engineering activity begins with an attempt to understand and then extract the procedural abstractions represented by the source code. To understand procedural abstractions, the code is analyzed at varying levels of abstraction: system, program, component, pattern, and statement.

The overall functionality of the entire application system must be understood before more detailed reverse engineering work occurs. This establishes a context for further analysis and provides insight into interoperability issues among applications within the system. Each of the programs that make up the application system represents a functional abstraction at a high level of detail.

Information is transformed as it flows through a computer-based system. The system accepts input in a variety of forms; applies hardware, software, and human elements to transform it; and produces output in a variety of forms.
Input may be a control signal transmitted by a transducer, a series of numbers typed by a human operator, a packet of information transmitted on a network link, or a voluminous data file retrieved from secondary storage. The transform(s) may comprise a single logical comparison, a complex numerical algorithm, or the rule-inference approach of an expert system. Output may light a single LED or produce a 200-page report.

A data flow diagram (DFD) is a graphical representation that depicts information flow and the transforms that are applied as data move from input to output. A rectangle represents an external entity; that is, a system element (e.g., hardware, a person, another program) or another system that produces information for transformation by the software or receives information produced by the software. A circle (sometimes called a bubble) represents a process or transform that is applied to data (or control) and changes it in some way. An arrow represents one or more data items (data objects). All arrows on a data flow diagram should be labeled. A double line represents a data store, that is, stored information that is used by the software.

CREATING A DATA FLOW MODEL

The data flow diagram enables the software engineer to develop models of the information domain and functional domain at the same time. As the DFD is refined into greater levels of detail, the analyst performs an implicit functional decomposition of the system, thereby accomplishing the fourth operational analysis principle for function. At the same time, the DFD refinement results in a corresponding refinement of data as it moves through the processes that embody the application.

Taking the Student Information System as an example, a level 0 DFD for the system is shown in Fig. 1. The primary external entities (boxes) produce information for use by the system and consume information generated by the system.
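The notation described above can be sketched as a small data structure. The following is a minimal illustration only, not the tooling used in this paper: the names Kind, Node, Flow, DataFlowDiagram, build_level0 and problems are hypothetical, and the name of the central process is an assumption because the figure itself is not reproduced here. It records the Student, Course and Exam entities of the running example with their input data objects, and checks the rule that every arrow must be labeled.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    """DFD symbols as described in the text (names are illustrative)."""
    EXTERNAL_ENTITY = "rectangle"   # produces or consumes information
    PROCESS = "circle"              # transform applied to data
    DATA_STORE = "double line"      # stored information used by the software


@dataclass(frozen=True)
class Node:
    name: str
    kind: Kind


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    label: str  # the data object(s) carried by the arrow


@dataclass
class DataFlowDiagram:
    level: int
    nodes: dict = field(default_factory=dict)
    flows: list = field(default_factory=list)

    def add_node(self, name, kind):
        self.nodes[name] = Node(name, kind)

    def add_flow(self, source, target, label):
        self.flows.append(Flow(source, target, label))

    def problems(self):
        """List notation problems: unlabeled arrows or unknown endpoints."""
        found = []
        for f in self.flows:
            if not f.label.strip():
                found.append(f"unlabeled arrow {f.source} -> {f.target}")
            for end in (f.source, f.target):
                if end not in self.nodes:
                    found.append(f"unknown node {end!r}")
        return found


def build_level0():
    """Level 0 DFD with only the input side stated in the text."""
    system = "Student Information System"  # assumed name of the single bubble
    dfd = DataFlowDiagram(level=0)
    dfd.add_node(system, Kind.PROCESS)
    inputs = [
        ("Student", "student data"),
        ("Course", "course data"),
        ("Exam", "exam data"),
    ]
    for entity, data in inputs:
        dfd.add_node(entity, Kind.EXTERNAL_ENTITY)
        dfd.add_flow(entity, system, data)
    return dfd


if __name__ == "__main__":
    level0 = build_level0()
    for flow in level0.flows:
        print(f"{flow.source} --[{flow.label}]--> {flow.target}")
    print("problems:", level0.problems())
```

A lower-level diagram could be represented the same way with a higher level value; the output flows are omitted here because their receiving entities are shown only in the figure.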
The input entities are Student, Course and Exam. The labeled arrows represent data objects or data object type hierarchies. The input data objects are student data, course data and exam data.

Fig. 1 Level 0 DFD for Student Information Software

The level 0 DFD is now expanded into a level 1 model, as shown in Fig. 2. Here, to produce personal information, there are three processes: request student information, store student information, and display student information. Similarly, to produce the course registered report, there are three processes: request course information, store course information, and display course registered. Furthermore, to produce both the senate format result and the transcript report, there are five processes: request exam result, store exam result, process results, display senate format result, and display transcript.

Fig. 2 Level 1 DFD for Student Information Software

The processes represented at DFD level 1 can be further refined into lower levels. For example, the process "process result" can be refined into a level 2 DFD, as shown in Fig. 3. Further processes then become visible: access against student data, course data and exam result; read student information, course information and exam information; inquire course and exam information for a department; and inquire course and exam information for a student.

Fig. 3 Level 2 DFD that refines the process "process result"

CONCLUSION

Reverse engineering for software is the process of analyzing a program in an effort to create a representation of the program at a higher level of abstraction than source code. Reverse engineering is a process of design recovery. The data flow diagram enables the software engineer to develop models of the information domain and functional domain at the same time.
As the DFD is refined into greater levels of detail, the analyst performs an implicit functional decomposition of the system, thereby accomplishing the fourth operational analysis principle for function. At the same time, the DFD refinement results in a corresponding refinement of data as it moves through the processes that embody the application. This paper explores one reverse engineering step, extracting abstractions, taking the Student Information System as an example, in order to understand the processing of a system.

REFERENCES

Biggerstaff, T. J. et al. (1994): Program understanding and the concept assignment problem. In: Proceedings of the 15th International Conference on Software Engineering (ICSE), pp. 482-498. ACM Press, 1993.

Chidamber, S. R. and Kemerer, C. F. (1994): A metrics suite for object oriented design. In: IEEE Transactions on Software Engineering, Vol. 20, No. 06, June 1994, pp. 476-493.

Gamma, E. et al. (1995): Design Patterns - Elements of Reusable Object Oriented Software. Addison Wesley Professional Computing Series. Addison-Wesley, 1995.

Pressman, R. S. (2001): Software Engineering: A Practitioner’s Approach. Fifth edition. McGraw-Hill Companies Inc., New York, 2001.

Stevens, P. and Pooley, R. (1998): Systems reengineering patterns. In: Proceedings of the ACM SIGSOFT 6th International Symposium on the Foundations of Software Engineering (FSE), Vol. 23, No. 06, Software Engineering Notes, pp. 17-23, November 1998.

Systa, T. (1999): The relationships between static and dynamic models in reverse engineering Java software. In: Proceedings of the 6th Working Conference on Reverse Engineering (WCRE). IEEE Computer Society Press, October 1999.

Umar, A. (1997): Application (Re)Engineering: Building Web-Based Applications and Dealing with Legacies. Prentice Hall, Upper Saddle River, NJ, 1997.