Software maintenance and
evolution
Software re-engineering and reverse
engineering
Prof. Robertas Damaševičius
Prof. Vytautas Štuikys
Kaunas University of Technology
Glossary of terms (Lithuanian – English)
• Tiesioginė inžinerija – Forward engineering
• Apgrąžos (atvirkštinė) inžinerija – Reverse engineering
• Dvipusė inžinerija (reinžinerija) – Reengineering
• Rekonstravimas – Reconstruction, refactoring
• Restruktūrizavimas – Restructuring, remodularization
Definitions
• Forward engineering - traditional software engineering
approach starting with requirements analysis and
progressing to implementation of a system
• Reverse engineering – system analysis process to:
– identify the system's components and their
interrelationships and
– create representations of the system in another form or at
higher levels of abstraction
• Reengineering - process of analysis and change
whereby a system is modified by first reverse
engineering and then forward engineering.
• Re-factoring (restructuring) - transformation of a system
from one representational form to another
Forward Engineering and Reengineering*
[Figure: two process flows.
Forward engineering: System specification → Design and implementation → New system.
Software re-engineering: Existing software system → Understanding and transformation → Re-engineered system.]
*According to Sommerville
Contents
• Re-engineering vs maintenance
• Reverse engineering
• Re-factoring (re-structuring)
System Reengineering
• Restructuring or rewriting part or all of a system
without changing its functionality
• Required when some (but not all) subsystems of
a larger system require frequent maintenance
• Reengineered system may also be restructured
and should be re-documented
Goals of Reengineering
• Port to other Platform
– when hardware or software support becomes obsolete
• Design extraction
– to improve maintainability, portability, etc.
• Exploitation of New Technology
– new language features, standards, libraries, etc.
– when tools to support restructuring are readily available
Reengineering Categories
• Automatic restructuring
• Automatic and semi-automatic transformation
• Design recovery and reimplementation
• Code reverse engineering and forward engineering
• Data reverse engineering and schema migration
• Migration of legacy systems to modern platforms
Reengineering Techniques
• Restructuring
– automatic conversion from unstructured to structured code
– source code translation
— Chikofsky and Cross
• Data reengineering
– integrating and centralizing multiple databases
– unifying multiple, inconsistent representations
– upgrading data models
— Sommerville, ch 32
• Refactoring
– renaming/moving methods/classes etc.
Life-Cycle of Reengineering
[Figure: reengineering life cycle linking Requirements, Designs and Code through five steps: (0) requirement analysis, (1) model capture, (2) problem detection, (3) problem resolution, (4) program transformation. Annotations: people centric, lightweight.]
(According to S. Ducasse et al.)
Generic Reengineering Process
• Requirement analysis: analyse which parts of your
requirements have changed
• Model capture: reverse engineer from the source-code into a
more abstract form, typically some form of a design model
• Problem detection: identify design problems in that abstract
model
• Problem resolution: propose an alternative design that will
solve the identified problem
• Program transformations: make the necessary changes to
the code, so that it adheres to the new design yet preserves
all the required functionality
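The model capture step can be sketched in Python with the standard `ast` module. This is only an illustration under stated assumptions, not the tooling used by Ducasse et al.: the function name `capture_model` and the sample source are hypothetical. The sketch lifts source code into a more abstract form, here a mapping from each class to its base classes and methods.

```python
import ast

def capture_model(source: str) -> dict:
    """Return {class_name: {"bases": [...], "methods": [...]}} for one module."""
    tree = ast.parse(source)
    model = {}
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            model[node.name] = {
                # ast.unparse turns a base-class expression back into text
                "bases": [ast.unparse(base) for base in node.bases],
                "methods": [item.name for item in node.body
                            if isinstance(item, ast.FunctionDef)],
            }
    return model

# Hypothetical legacy source used only for this example
SAMPLE = '''
class Account:
    def deposit(self, amount): ...
    def withdraw(self, amount): ...

class SavingsAccount(Account):
    def add_interest(self): ...
'''

model = capture_model(SAMPLE)
for name, info in model.items():
    print(name, info["bases"], info["methods"])
```

A model of this kind is the input to problem detection: for instance, classes with very many methods or very deep inheritance chains can be flagged for review.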
Reengineering Process*
[Figure: reengineering process.
Activities: reverse engineering, source code translation, program structure improvement, program modularisation, data reengineering.
Artefacts: original program, program documentation, structured program, modularised program, original data, reengineered data.]
*According to Sommerville
Software Reengineering
Process Model (1)
• Software Inventory analysis
– sorting active software applications by business
criticality, longevity, current maintainability, and other
local criteria
– helps to identify reengineering candidates
• Document restructuring options
– update poor documents if they are used
– fully rewrite the documentation for critical systems
focusing on the "essential minimum"
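Inventory analysis can be sketched as a simple ranking. The scoring formula, the 1 to 5 scales and the sample applications below are assumptions made for this illustration; the slide itself prescribes only the sorting criteria, and a real organisation would add its own local criteria.

```python
from dataclasses import dataclass

@dataclass
class Application:
    name: str
    criticality: int       # 1 (low) .. 5 (high), hypothetical scale
    remaining_years: int   # expected longevity
    maintainability: int   # 1 (poor) .. 5 (good), hypothetical scale

def reengineering_priority(app: Application) -> int:
    # Critical, long-lived, hard-to-maintain systems score highest.
    return app.criticality * app.remaining_years * (6 - app.maintainability)

def rank_candidates(apps):
    """Sort applications from strongest to weakest reengineering candidate."""
    return sorted(apps, key=reengineering_priority, reverse=True)

inventory = [
    Application("billing", criticality=5, remaining_years=10, maintainability=1),
    Application("intranet wiki", criticality=2, remaining_years=3, maintainability=4),
    Application("payroll", criticality=4, remaining_years=8, maintainability=3),
]
ranked = [app.name for app in rank_candidates(inventory)]
print(ranked)
```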
Software Reengineering
Process Model (2)
• Reverse engineering
– process of design recovery
– analyzing a program in an effort to create a
representation of the program at some abstraction
level higher than source code
• Code restructuring
– source code is analyzed and violations of structured
programming practices are noted and repaired
– revised code needs to be reviewed and tested
Software Reengineering
Process Model (3)
• Data restructuring
– existing data structures are reviewed for quality
• Forward engineering
– sometimes called reclamation or renovation
– recovers design information from existing source
code
– uses this design information to reconstitute the
existing system to improve its overall quality or
performance
Re-engineering advantages
• Reduced risk
– There is a high risk in new software development:
development problems, staffing problems and
specification problems
• Reduced cost
– Cost of re-engineering is often less than costs of
developing new software
How?
[Figure: taxonomy of reengineering approaches.
Manual: reading code.
Automated: extracting info and presenting info, covering static analysis, querying, program slicing, browsing, dependency graphs, animation, and dynamic analysis (profiling, stepwise execution).]
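As one example of automated static analysis, the sketch below extracts a dependency graph (which function calls which) without running the program. It is a minimal illustration built on Python's standard `ast` module; the function name `call_graph` and the sample source are assumptions, and only direct calls by plain name are recorded (method calls such as `text.split()` are ignored).

```python
import ast

def call_graph(source: str) -> dict:
    """Map each top-level function to the set of names it calls directly."""
    graph = {}
    for func in ast.parse(source).body:
        if isinstance(func, ast.FunctionDef):
            graph[func.name] = {
                node.func.id
                for node in ast.walk(func)
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            }
    return graph

# Hypothetical program under analysis
SAMPLE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''

graph = call_graph(SAMPLE)
for caller, callees in graph.items():
    print(caller, "->", sorted(callees))
```

Dynamic analysis would instead observe a running program, for example with a profiler, and so sees only the paths that a given execution actually takes.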
Contents
• Re-engineering vs maintenance
• Reverse engineering
• Re-factoring (re-structuring)
Principles of reverse engineering
• Reverse engineering:
– Systematic process of acquiring important design
factors and information regarding engineering aspects
from an existing product
– A process that analyses a product or technology to
find out its design aspects and its functions
– A form of analysis that engages an individual in
constructive learning about the design and
functionality of systems and products
Reverse engineering
• Goal: to facilitate change by allowing a software system
to be understood in terms of what it does, how it works
and its architectural representation.
• Objectives:
– to recover lost information,
– to facilitate migration between platforms,
– to improve and/or provide new documentation,
– to extract reusable components,
– to reduce maintenance effort,
– to cope with complexity,
– to detect side effects,
– to assist migration to a CASE environment,
– to develop similar or competitive products.
Goals of Reverse Engineering
• Cope with complexity
– need techniques to understand large, complex systems
• Recover lost information
– extract what changes have been made and why
• Detect side effects
– help understand ramifications of changes
• Synthesize higher abstractions
– identify latent abstractions in software
• Facilitate reuse
– detect candidate reusable artifacts and components
— Chikofsky and Cross
Uses of Reengineering
• Recreating the design, the design decisions and the
information developed by the original design team
• Learning the working principle of a system
• Justifying the design decisions of the original design
team
• Redesigning the product to improve it or adapt it to
modern circumstances
• Understanding the functionality of a product in depth
• Evaluating a product to understand its limitations and
comparing it with other products using simulation
technology
Role of Reverse Engineering
(Reengineering)
• Reengineering plays a significant role in the
development of science and technology innovations
• Reengineering involves disassembling a product and
disclosing how it works
• It is often an effective way of learning how to construct a
product or technology, or how to improve it
Reverse Engineering Concepts
(1)
• Abstraction level
– ideally want to be able to derive design information at the highest
level possible
• Completeness
– level of detail provided at a given abstraction level
• Interactivity
– degree to which humans are integrated with automated reverse
engineering tools
Reverse Engineering Concepts
(2)
• Directionality
– one-way means the software engineer doing the
maintenance activity is given all information extracted
from source code
– two-way means the information is fed to a
reengineering tool that attempts to regenerate the old
program
• Extract abstractions
– meaningful specification of processing performed is
derived from old source code
Reverse Engineering Activities
• Understanding process
– source code is analyzed at varying levels of detail
– to understand procedural abstractions and overall functionality
• Understanding data
– internal data structures
– database structure
• Understanding user interfaces
– what are basic actions processed by the interface?
– what is the system's behavioral response to these actions?
Reverse Engineering Techniques
• Re-documentation
– pretty printers
– diagram generators
– cross-reference listing generators
• Design recovery
– software metrics
– browsers
– visualization tools
– static analyzers
– dynamic (trace) analyzers
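A cross-reference listing generator, one of the re-documentation aids listed above, can be sketched in a few lines. This is a minimal illustration that assumes Python source as input and uses the standard `ast` module; the function name `cross_reference` and the sample program are hypothetical.

```python
import ast
from collections import defaultdict

def cross_reference(source: str) -> dict:
    """Map each identifier to the sorted line numbers where it occurs."""
    xref = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            xref[node.id].add(node.lineno)
    return {name: sorted(lines) for name, lines in sorted(xref.items())}

# Hypothetical program to be re-documented
SAMPLE = """total = 0
for price in [3, 4]:
    total = total + price
print(total)
"""

listing = cross_reference(SAMPLE)
for name, lines in listing.items():
    print(f"{name:<8} {lines}")
```

The same traversal could count decision points per function to produce a simple software metric, which is how many design recovery tools start.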
Factors that Motivate the Application
of Reverse Engineering
Summary of Objectives and Benefits
of Reverse Engineering
Benefits of Reverse Engineering for
Software Maintenance
• Corrective change:
– abstraction of unnecessary detail gives greater insight into
the parts of the program to be corrected
– easier to identify defective program components and the
source of residual errors
• Adaptive/perfective change:
– Eases understanding of system’s components and their
interrelationships, showing where new requirements fit and
how they relate to existing components
– Extracted information can be used during enhancement
of the system or for the development of another product
• Preventive change:
– brings benefit to future maintenance of a system
Legality of software re-engineering
• Software reengineering is done
– to retrieve the source code of a program because the
source code was lost,
– to study how the program performs certain operations
– to improve the performance of a program
– to fix a bug (correct an error in the program when the
source code is not available)
– to identify malicious content in a program, such as a virus
– to adapt a program written for use with one
microprocessor for use with another
• Reengineering for the purpose of copying or duplicating
programs may constitute a copyright violation.
• In some cases, the licensed use of software specifically
prohibits reverse engineering
Reengineering for software
• Reengineering takes a program and constructs a high level
representation for documentation, maintenance or reuse
• To accomplish this, most current reengineering techniques
begin by analyzing a program’s structure
• The structure is determined by lexical, syntactic and
semantic rules for legal program constructs
• Because we know how to do these kinds of analyses quite
well, it is natural to try and apply them to understanding a
program
Reengineering and program
rephrasing
• Imagine trying to understand a program in which
all identifiers have been systematically replaced
by random names and in which all indentation
and comments have been removed
• Example of transformation by rephrasing is
obfuscation
• The task of understanding would be difficult if
not impossible
Obfuscation
• Source code obfuscation is a useful anti-reversing
technique
• There may exist a requirement to ship the source code
of an application so that the machine code can be
generated on the end user’s computer
• If the source code contains intellectual property that is
worth protecting, one can perform transformations to the
source code which make it difficult to read, but have no
impact on the machine code that would ultimately be
generated when the program is compiled.
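The effect of identifier obfuscation can be sketched with Python's standard `ast` module. This is a toy illustration, not a production obfuscator: the class `RenameIdentifiers`, the name mapping and the sample function are assumptions. The transformed source loses its meaningful names, comments and layout, yet computes the same results.

```python
import ast

class RenameIdentifiers(ast.NodeTransformer):
    """Replace chosen identifiers with meaningless names."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_FunctionDef(self, node):
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)   # also rename arguments and body
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

# Hypothetical source worth protecting
SOURCE = '''
def net_price(gross, tax_rate):
    # strip the tax from a gross price
    tax = gross * tax_rate / (1 + tax_rate)
    return gross - tax
'''

MAPPING = {"net_price": "f0", "gross": "a", "tax_rate": "b", "tax": "c"}
tree = RenameIdentifiers(MAPPING).visit(ast.parse(SOURCE))
obfuscated = ast.unparse(tree)   # comments and original layout are dropped too
print(obfuscated)

# Both versions behave identically when executed.
original_ns, obfuscated_ns = {}, {}
exec(SOURCE, original_ns)
exec(obfuscated, obfuscated_ns)
```

For a reverse engineer the obfuscated text is structurally intact but has lost the names and comments that carry the link to the problem domain.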
Applications of Reengineering
• Reverse engineering is used for many purposes:
– as a learning tool;
– as a way to make new, compatible products that are
cheaper than what's currently on the market;
– to make software interoperate more effectively;
– to bridge data between different operating systems or
databases;
– to uncover the undocumented features of
commercial products.
Reengineering and program
understanding
• The source code by itself is not sufficient to
understand the program
• The problem is that programs have a purpose: their job
is to compute something
• For the computation to be of value, the program must
model or approximate some aspect of the real world
• To the extent that the model is accurate, the program will
succeed in accomplishing its purpose
• To the extent that the model is comprehended by the
reverse engineer, the process of understanding the
program will be eased
Contents
• Re-engineering vs maintenance
• Reverse engineering
• Re-factoring (re-structuring)
Basic Issues in refactoring
• What is refactoring?
• Why should you refactor?
• When should you refactor?
• Categories of refactorings
What is refactoring?
• A refactoring is a software transformation that
– preserves the external behaviour of the software
– improves the internal structure of the software
• Refactoring [Fowler 1999]
– [noun] a change made to the internal
structure of software to make it easier to understand
and cheaper to modify without changing its observable
behaviour
– [verb] to restructure software by applying a series of
refactorings without changing its (user) observable
behavior
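A small before-and-after pair makes the definition concrete. The functions below are invented for this illustration and are not taken from Fowler: the second version applies an Extract Function refactoring, changing the internal structure while the observable results stay the same.

```python
DISCOUNT_FACTOR = 0.9   # hypothetical business rule: 10% off large invoices
THRESHOLD = 100

# Before: one function mixes summing and discounting.
def invoice_total_before(items):
    total = 0
    for price, qty in items:
        total += price * qty
    if total > THRESHOLD:
        total = total * DISCOUNT_FACTOR
    return total

# After: the same computation split into named steps.
def subtotal(items):
    return sum(price * qty for price, qty in items)

def apply_discount(amount):
    return amount * DISCOUNT_FACTOR if amount > THRESHOLD else amount

def invoice_total_after(items):
    return apply_discount(subtotal(items))

# Behaviour preservation is checked on sample inputs.
samples = [[], [(10, 2), (5, 1)], [(60, 2)]]
for items in samples:
    assert invoice_total_before(items) == invoice_total_after(items)
```

Checks of this kind give evidence, not proof, of behaviour preservation; they are only as good as the inputs chosen.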
Why should you refactor? (1)
• To improve the software design
• To counter code decay (software aging)
and increase its lifetime
• To increase software understandability
(comprehensibility)
• To reduce costs of software maintenance
Why should you refactor? (2)
• …
• To find bugs and write more robust code
• To reduce testing
– automatic refactorings are guaranteed to be behaviour-preserving
• To prepare for / facilitate future customisations
• To turn an OO application into a framework
– introduce design patterns in a
behaviour-preserving way
When should you refactor?
• Only if you see the need for it
– Not on a preset periodical basis
• Apply the rule of three
– When 1st time: implement from scratch
– When 2nd time: implement something similar by code duplication
– When 3rd time: do not implement similar things again or
duplicate, but refactor
• Refactor when adding new features
– Especially if feature is difficult to integrate with the existing code
• Refactor during bug fixing
– If a bug is very hard to trace, refactor first to make the code
more understandable
• Refactor during code reviews
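The rule of three can be illustrated as follows. All function names and the email check are hypothetical: the validation is written once, duplicated for the second use, and extracted into a shared helper when a third use appears.

```python
# First and second occurrence: the same check is simply duplicated.
def register_user(email):
    if "@" not in email or email.startswith("@"):
        raise ValueError(f"invalid email: {email!r}")
    return {"user": email.lower()}

def subscribe_newsletter(email):
    if "@" not in email or email.startswith("@"):
        raise ValueError(f"invalid email: {email!r}")
    return {"subscriber": email.lower()}

# Third occurrence: refactor instead of copying again.
def normalized_email(email):
    """Validate and normalize an address in one shared place."""
    if "@" not in email or email.startswith("@"):
        raise ValueError(f"invalid email: {email!r}")
    return email.lower()

def invite_guest(email):
    return {"guest": normalized_email(email)}
```

Once the helper exists, the two earlier functions can be rewritten to call it as well, so that the rule lives in one place.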
Categories of refactorings
• Based on granularity
– low-level (primitive) vs
– high-level (composite) refactorings
• Based on programming language
– language-specific (e.g. Java, Smalltalk, ...)
– language-independent (e.g. [Tichelaar&al 2000])
• Degree of formality
– formal (e.g. [Bergstein 1997])
– ad hoc (unsystematic, without a defined method) (e.g. [Fowler 1999])
– semi-formal
• Degree of automation
– fully automated (e.g. [Moore 1996])
– interactive (e.g. Refactoring Browser of [Roberts&al 1997])
– fully manual (e.g. [Fowler 1999])
Refactoring to Understand
Problem: How do you decipher cryptic code?
Solution: Re-factor it till it makes sense
• Goal (for now) is to understand, not to reengineer
• Work with a copy of the code
• Refactoring requires an adequate test base
• Restructuring is a transformation from one form of
representation to another
• Refactoring is the object-oriented variant of restructuring
• The subject's external behavior is preserved
• Idea is to make existing code more extensible
Program structure improvement
• Maintenance tends to corrupt the structure of a
program. It becomes harder and harder to
understand
• Programs may be simplified to make them
more readable
Types of Restructuring
• Code restructuring
– Program transformation
– Architecture transformations
• Data restructuring
– analysis of source code
– data redesign
– data record standardization
– data name rationalization
– file or database translation
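Data name rationalization and data record standardization can be sketched as below. The legacy field names, the accepted date formats and the sample records are assumptions made for this example; a real migration would derive them from an analysis of the existing data.

```python
from datetime import datetime

# Hypothetical mapping from legacy field names to one rationalized name
FIELD_NAMES = {
    "cust_nm": "customer_name", "CustomerName": "customer_name",
    "dob": "birth_date", "DateOfBirth": "birth_date",
}
# Date layouts assumed to occur in the legacy data
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")

def standardize_date(text: str) -> str:
    """Convert any known legacy date layout to ISO 8601 (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

def standardize_record(record: dict) -> dict:
    clean = {}
    for key, value in record.items():
        name = FIELD_NAMES.get(key, key)          # name rationalization
        if name == "birth_date":
            clean[name] = standardize_date(value)  # record standardization
        else:
            clean[name] = value.strip()
    return clean

legacy_a = {"cust_nm": " Ada Lovelace ", "dob": "10/12/1815"}
legacy_b = {"CustomerName": "Ada Lovelace", "DateOfBirth": "1815-12-10"}
print(standardize_record(legacy_a))
print(standardize_record(legacy_b))
```

Two records that looked different in their source systems become identical after standardization, which is what makes later integration or deduplication possible.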
Program modularization
• Process of re-organising a program so that related
program parts are collected together in a single module
– Data abstractions: Abstract data types where data
structures and associated operations are grouped
– Hardware modules: All functions required to interface with
a hardware unit
– Functional modules: Modules containing functions that
carry out closely related tasks
– Process support modules: Modules where the functions
support a business process or process fragment
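The data abstraction case can be sketched as a small before-and-after example. The names are invented for this illustration: a bare list with scattered helper functions is regrouped into one module-like unit, an abstract data type that owns both the data structure and its operations.

```python
# Before: the data structure is a bare list and its operations are
# free functions that could live anywhere in the program.
def push_item(items, value):
    items.append(value)

def pop_item(items):
    return items.pop()

# After: data and associated operations are grouped in one abstraction.
class Stack:
    """Abstract data type: clients use the operations, not the list."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items

    def __len__(self):
        return len(self._items)

stack = Stack()
stack.push("first")
stack.push("second")
top = stack.pop()
```

Because callers no longer touch the underlying list, its representation can later be changed without affecting the rest of the program.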
Restructuring Approaches*
[Figure: restructuring approaches arranged along an axis of increased cost: automated program restructuring, automated source code conversion, program and data restructuring, automated restructuring with manual changes, restructuring plus architectural changes.]
*According to Sommerville
Refactoring activities
• Identify where the software should be re-factored
• Determine which re-factoring(s) should be applied to the
identified places
• Guarantee that the applied refactoring preserves
behaviour
• Apply the refactoring
• Assess the effect of the refactoring on quality
characteristics of the software
• Maintain the consistency between the re-factored
program code and other software artefacts
Types of software artifacts
• Refactoring program source code
• Refactoring of models, independent of the
underlying programming language
– Refactoring of UML diagrams
• Restructuring software architectures
• Restructuring of software requirements
Qualities of refactoring tools
• Reliability: check if the provided re-factorings
are truly behavior preserving
• Coverage: number of refactoring activities
supported by the tool
• Configurability: modifying refactoring pattern
specifications
• Scalability: combine frequently used primitive
re-factorings into composite re-factorings
• Language independence: applicability to
different languages
Program Translation Process*
[Figure: program translation process.
Input: system to be re-engineered.
Steps: identify source code differences, design translator instructions, automatically translate code, manually translate code.
Output: re-engineered system.]
*According to Sommerville
Source Code Translation
• Converting the code from one language (or language
version) to another
• May be necessary because of
– hardware platform updates
– staff skill shortages
– organizational policy changes
• Cost and time effective only if an automatic translator is
available
• Manual fine tuning of new code is always required
Restructuring problems
• Problems with re-structuring are:
– Loss of comments
– Loss of documentation
– Heavy computational demands
• Restructuring doesn’t help with poor modularisation where
related components are dispersed throughout the code.
• The understandability of data-driven programs may not be
improved by re-structuring.
Restructuring Benefits
• Improved program and documentation quality
• Makes programs easier to learn
– improves productivity
– reduces developer frustration
• Reduces effort required to maintain software
• Software is easier to test and debug
Summary: acquired knowledge
• Objective of re-engineering is to improve the system
structure to make it easier to understand and maintain
• Re-engineering involves source code translation, reverse
engineering, program structure improvement and data reengineering
• Reverse engineering is the process of deriving the
system design (documents, models, e.g., control graph
and specification) from its source code
• Program structuring involves reorganization to group
related items
• Data re-engineering may be necessary because of
inconsistent data management
Conclusions
• Reverse engineering is a general approach for transforming
products and artefacts from a lower-level representation to a
higher-level representation
• Directly related to learning and understanding of facts,
products, processes and artefacts
• Process of new knowledge discovery can be also
conceived of as a re-engineering process or extending
previous knowledge
• Reengineering is a way of obtaining know-how or
knowledge, which can be used for many purposes,
including better software maintenance
References
• I. Sommerville, Software Engineering. New York, NY:
Addison-Wesley, 1995.
• R. Arnold, Software Reengineering. IEEE Computer
Society Press, 1993.
• J.H. Cross, E.J. Chikofsky, C.H. May, "Reverse
engineering". Advances in Computers, 35: 199-254,
1992.