Karolina Muszyńska

Reverse engineering - looking at the solution to figure out how it works
Reverse engineering - breaking something down in order to understand it, build a copy, or improve it
Reverse engineering is used to examine software or software components to figure out how they process business rules, where they source data, and how they make decisions. In short, the goal is to understand how the software supports the business.
The use of this elicitation technique is increasing across the field because of the many legacy systems (old computer systems) that need to be updated or replaced. Applications built 30 years ago have to be reverse engineered so people can figure out how they work.

When you are not sure what is happening within your code, or need to understand how an old computer system calculates a certain field (business users may ask how the system supports the business process)
When the software documentation is out of date (sometimes there is no documentation at all)
When business users are not aware of the business rules being enforced (the business may have changed in the years since the rules were hard-coded into the application)
When you are interfacing systems and need to verify the correctness of data in each system (a challenge you face when you create ongoing interfaces or one-time data migrations)

As a learning tool
As a way to make new, compatible products that are cheaper than what is currently on the market
For making software interoperate more effectively, or to bridge data between different operating systems or databases
To uncover the undocumented features of commercial products

Security flaws
Questionable privacy practices
Two main areas of threats to reverse engineering:
◦ shrink-wrap licenses that explicitly prohibit anyone who opens or uses the software from reverse engineering it
◦ the Digital Millennium Copyright Act (DMCA), which prohibits the creation or dissemination of tools or information that could be used to break technological safeguards that protect software from being copied

One important activity in reverse engineering is to recover the architecture of the system. There must be a process to rebuild the UML models that together provide the architectural view of the system.
In RUP (Rational Unified Process), the construction of the use-case model is central to the reverse engineering process:
◦ use cases are used to recover the business process model the system supports
◦ use cases are analyzed to build the system analysis model that represents a hypothetical architecture for the software
◦ use cases are used as the source of scenarios to be run to find the software elements that are involved in the implementation of the business functions
Based on: http://www.ibm.com/developerworks/rational/library/sep06/dugerdil/

The RUP reverse engineering process proceeds through the following three steps:
1. Assess the scope of the reengineering project
2. Build the abstract models
3. Recover the architecture of the software

Step 1, assessing the scope of the reengineering project, includes:
◦ Developing vision
◦ Developing business cases
◦ Finding actors and use cases
◦ Identifying and assessing risks
In this initial step, the business scope and desired quality attributes of the reengineered system are set. Usually, these quality attributes are defined beforehand, and the system is restructured to fulfill them.
If the quality of the actual code is too poor, complete reengineering may not be worth the effort:
◦ management may decide to extract and reengineer only some critical components
◦ when the system structure is so bad that no useful piece of code can be reused, the reengineering work could be limited to extracting the knowledge embedded in the old system to help specify a new one
This step is iterative: the more we know about the actual structure of the system, the better we can assess the economic relevance of restructuring it.

Step 2, building the abstract models, includes:
◦ Detailing a business use case
◦ Finding business workers and entities
◦ Detailing a business entity
◦ Database analysis
Building the architectural model requires understanding the system itself, and the source code alone does not offer much help. As an aid to understanding the system, we can create a hypothetical architecture and then try to discover it in the code.
A representation of the business process the system is intended to support, as well as a tentative domain model, can be built by gathering system usage information from all the people involved.

Step 3, recovering the architecture of the software, contains the following tasks:
◦ Analysis of the Implementation Model
◦ Running the use cases
◦ Analyzing the call graph
◦ Mapping the functions to the Implementation Model
◦ Validating the hypothetical architecture
◦ Rebuilding the high-level architecture
The problem is to create the traceability link between the high-level analysis model elements and the low-level software components.

Summary
One of the key problems in reverse engineering a legacy software system is to understand the code and build an architectural representation of it
One possible solution is the reconstruction of the UML models through RUP
Some applications, such as Microsoft Visual Studio .NET, can be used to reverse engineer applications into UML or other development diagrams
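
The third step (running the use cases, analyzing the call graph, and mapping functions to model elements) can be illustrated with a minimal Python sketch. It is not taken from the RUP material: the "legacy" functions, the `HYPOTHETICAL_MODEL` mapping, and the element names such as `AccountEntity` are all assumptions made up for this example, and Python's `sys.setprofile` hook stands in for whatever tracing facility a real legacy platform offers.

```python
import sys

# Stand-ins for legacy functions (hypothetical, for illustration only).
def read_account(account_id):
    return {"id": account_id, "balance": 100.0}

def compute_interest(account):
    return account["balance"] * 0.05

def print_statement(account_id):
    account = read_account(account_id)
    total = account["balance"] + compute_interest(account)
    return f"{account['id']}: {total:.2f}"

def close_account(account_id):
    return read_account(account_id)["balance"]

# Hypothetical architecture: legacy function name -> analysis-model element.
HYPOTHETICAL_MODEL = {
    "read_account": "AccountEntity",
    "compute_interest": "InterestControl",
    "print_statement": "StatementBoundary",
    "close_account": "AccountControl",
}

def run_use_case(scenario):
    """Run one scenario; return the mapped functions executed and call-graph edges."""
    functions, edges, stack = set(), set(), []

    def profiler(frame, event, arg):
        # Only Python-level call/return events matter here; C calls are ignored.
        if event not in ("call", "return"):
            return
        name = frame.f_code.co_name
        if name not in HYPOTHETICAL_MODEL:
            return
        if event == "call":
            functions.add(name)
            if stack:
                edges.add((stack[-1], name))  # (caller, callee)
            stack.append(name)
        elif stack:
            stack.pop()

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        scenario()
    finally:
        sys.setprofile(previous)  # restore whatever hook was active before
    return functions, edges

def trace_use_cases(use_cases):
    """Build the traceability link: use case -> model elements it exercised."""
    trace = {}
    for name, scenario in use_cases.items():
        functions, _ = run_use_case(scenario)
        trace[name] = sorted({HYPOTHETICAL_MODEL[f] for f in functions})
    return trace

if __name__ == "__main__":
    use_cases = {
        "Print statement": lambda: print_statement(42),
        "Close account": lambda: close_account(42),
    }
    for use_case, elements in trace_use_cases(use_cases).items():
        print(use_case, "->", ", ".join(elements))
```

Comparing the elements each use case actually touches with the elements the hypothetical architecture predicted is one simple way to validate, or revise, that architecture. Functions that never appear in any trace are candidates for dead code or for use cases that have not been identified yet.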