OORPT
Object-Oriented Reengineering Patterns and Techniques

1. Introduction
Prof. O. Nierstrasz

OORPT — Introduction

OORPT

Lecturer: Prof. Oscar Nierstrasz, Oscar.Nierstrasz@iam.unibe.ch, Schützenmattstr. 14/103, Tel. 031 631.4618
Assistants: Dr. Tudor Gîrba, Adrian Kuhn
Lectures: IWI 001, Wednesdays @ 10h15-12h00
Exercises: IWI 001, Wednesdays @ 12h00-13h00
WWW: www.iam.unibe.ch/~scg/Teaching/OORPT/
Email list: www.iam.unibe.ch/mailman/listinfo/oorpt-vorlesung
Text: “Object-Oriented Reengineering Patterns,” Serge Demeyer, Stéphane Ducasse and Oscar Nierstrasz, Morgan Kaufmann and DPunkt, 2002, ISBN 1-55860-639-4.

© Stéphane Ducasse, Serge Demeyer, Oscar Nierstrasz

Roadmap

> Goals of this course
> Why Reengineer?
  — Lehman's Laws
  — Object-Oriented Legacy
> Typical Problems
  — common symptoms
  — architectural problems & refactoring opportunities
> Reverse and Reengineering
  — Definitions
  — Techniques
  — Patterns

Goals of this course

We will try to convince you:
> Yes, Virginia, there are object-oriented legacy systems too!
> Reverse engineering and reengineering are essential activities in the lifecycle of any successful software system. (And especially OO ones!)
> There is a large set of lightweight tools and techniques to help you with reengineering.
> Despite these tools and techniques, people must do the job, and they represent the most valuable resource.

Course Schedule

Week  Date       Lesson
1     25-Oct-06  Introduction
2     1-Nov-06   Lab: understanding legacy software
3     8-Nov-06   Presentation of results (all groups)
4     15-Nov-06  Reverse Engineering
5     22-Nov-06  Visualization for Program Understanding
6     29-Nov-06  Design Extraction and Visualization
7     6-Dec-06   Problem Detection and Duplicated Code
8     13-Dec-06  Restructuring
9     20-Dec-06  Lab: using reengineering tools
10    10-Jan-07  Meta-modeling
11    17-Jan-07  Architectural Extraction
12    24-Jan-07  Testing and Migration Strategies
13    31-Jan-07  TBA
14    7-Feb-07   Final Exam

Email list

Please register yourself in the course mailing list!
https://www.iam.unibe.ch/mailman/listinfo/oorpt-vorlesung

What is a Legacy System?

“legacy”: A sum of money, or a specified article, given to another by will; anything handed down by an ancestor or predecessor.
— Oxford English Dictionary

A legacy system is a piece of software that:
• you have inherited, and
• is valuable to you.

Typical problems with legacy systems:
• original developers not available
• outdated development methods used
• extensive patches and modifications have been made
• missing or outdated documentation
So further evolution and development may be prohibitively expensive.

Software Maintenance - Cost

[Figure: Relative Cost of Fixing Mistakes, by phase: requirement x1, design x5, coding x10, testing x20, delivery x200]

Relative Maintenance Effort: between 50% and 75% of global effort is spent on “maintenance”!

Solution?
• Better requirements engineering?
• Better software methods & tools (database schemas, CASE-tools, objects, components, …)?

Continuous Development

Maintenance effort by category (data from [Lien78a]):

Category                              Share
Perfective (new functionality)        60.3%
Adaptive (new platforms or OS)        18.2%
Corrective (fixing reported errors)   17.4%
Other                                 4.1%

The bulk of the maintenance cost is due to new functionality. Even with better requirements, it is hard to predict new functions.

Lehman's Laws

A classic study by Lehman and Belady [Lehm85a] identified several “laws” of system change.

Continuing change
> A program that is used in a real-world environment must change, or become progressively less useful in that environment.

Increasing complexity
> As a program evolves, it becomes more complex, and extra resources are needed to preserve and simplify its structure.

Those laws are still applicable…

What about Objects?

Object-oriented legacy systems
> = successful OO systems whose architecture and design no longer respond to changing requirements

Compared to traditional legacy systems
> The symptoms and the source of the problems are the same
> The technical details and solutions may differ

OO techniques promise better
> flexibility
> reusability
> maintainability
> …
but they do not come for free.

What about Components?

Components can be very brittle … After a while one inevitably resorts to glue :-)

Modern Methods & Tools?

[Glas98a] quoting an empirical study from Sasa Dekleva (1992):
> Modern methods(*) lead to more reliable software
> Modern methods lead to less frequent software repair
> and ...
> Modern methods lead to more total maintenance time

Contradiction? No!
• Modern methods make it easier to change ... and this capacity is used to enhance functionality!

(*) process-oriented structured methods, information engineering, data-oriented methods, prototyping, CASE-tools (not OO!)

How to deal with Legacy?

New or changing requirements will gradually degrade the original design … unless extra development effort is spent to adapt the structure.

New functionality can be added in two ways.

Hack it in?
• duplicated code
• complex conditionals
• abusive inheritance
• large classes/methods
Take a loan on your software; pay back via reengineering.

First …
• refactor
• restructure
• reengineer
Investment for the future; paid back during maintenance.

Common Symptoms

Lack of Knowledge
> obsolete or no documentation
> departure of the original developers or users
> disappearance of inside knowledge about the system
> limited understanding of entire system
> missing tests

Process symptoms
> too long to turn things over to production
> need for constant bug fixes
> maintenance dependencies
> difficulties separating products
> simple changes take too long

Code symptoms
> duplicated code
> code smells
> long build times

Common Problems

Architectural Problems
> insufficient documentation = non-existent or out-of-date
> improper layering = too few or too many layers
> lack of modularity = strong coupling
> duplicated code = copy, paste & edit code
> duplicated functionality = similar functionality by separate teams

Refactoring opportunities
> misuse of inheritance = code reuse vs polymorphism
> missing inheritance = duplication, case-statements
> misplaced operations = operations outside classes
> violation of encapsulation = type-casting; C++ "friends"
> class abuse = classes as namespaces

FAMOOS Case Studies

Domain               LOC         Reengineering Goal
pipeline planning    55,000      extract design
user interface       60,000      increase flexibility
embedded switching   180,000     improve modularity
mail sorting         350,000     portability & scalability
network management   2,000,000   unbundle application
space mission        2,500,000   identify components

Different reengineering goals … but common themes and problems!
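
As a rough illustration of the refactoring opportunity "missing inheritance = duplication, case-statements", the following Python sketch contrasts a type-code conditional with a polymorphic version of the same behaviour. All names (`area_with_type_code`, `Shape`, `Circle`, `Rectangle`, `total_area`) are hypothetical and chosen only for illustration; this is not code from the course material or the case studies.

```python
import math
from abc import ABC, abstractmethod
from typing import Iterable


# Before (hypothetical): a type code plus a case-style conditional.
# Every new kind of shape forces an edit here and in similar functions.
def area_with_type_code(kind: str, dims: dict) -> float:
    if kind == "circle":
        return math.pi * dims["radius"] ** 2
    elif kind == "rectangle":
        return dims["width"] * dims["height"]
    raise ValueError(f"unknown shape kind: {kind}")


# After (hypothetical): each subclass carries its own behaviour, so the
# conditional disappears and new shapes are added as new classes.
class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        ...


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes: Iterable[Shape]) -> float:
    # Callers rely on polymorphic dispatch instead of inspecting a type code.
    return sum(shape.area() for shape in shapes)
```

A call such as `total_area([Circle(1.0), Rectangle(2.0, 3.0)])` needs no knowledge of the concrete classes. Whether such a transformation pays off depends on how often new cases are added compared with how often new operations are added.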

Some Terminology

“Forward Engineering is the traditional process of moving from high-level abstractions and logical, implementation-independent designs to the physical implementation of a system.”

“Reverse Engineering is the process of analyzing a subject system to identify the system’s components and their interrelationships and create representations of the system in another form or at a higher level of abstraction.”

“Reengineering ... is the examination and alteration of a subject system to reconstitute it in a new form and the subsequent implementation of the new form.”

— Chikofsky and Cross [in Arnold, 1993]

Goals of Reverse Engineering

> Cope with complexity
  — need techniques to understand large, complex systems
> Generate alternative views
  — automatically generate different ways to view systems
> Recover lost information
  — extract what changes have been made and why
> Detect side effects
  — help understand ramifications of changes
> Synthesize higher abstractions
  — identify latent abstractions in software
> Facilitate reuse
  — detect candidate reusable artifacts and components

— Chikofsky and Cross [in Arnold, 1993]

Reverse Engineering Techniques

> Redocumentation
  — pretty printers
  — diagram generators
  — cross-reference listing generators
> Design recovery
  — software metrics
  — browsers, visualization tools
  — static analyzers
  — dynamic (trace) analyzers

Goals of Reengineering

> Unbundling
  — split a monolithic system into parts that can be separately marketed
> Performance
  — “first do it, then do it right, then do it fast”
  — experience shows this is the right sequence!
> Port to other Platform
  — the architecture must distinguish the platform dependent modules
> Design extraction
  — to improve maintainability, portability, etc.
> Exploitation of New Technology
  — i.e., new language features, standards, libraries, etc.

Reengineering Techniques

> Restructuring (Chikofsky and Cross)
  — automatic conversion from unstructured to structured code
  — source code translation
> Data reengineering (Sommerville, ch 32)
  — integrating and centralizing multiple databases
  — unifying multiple, inconsistent representations
  — upgrading data models
> Refactoring
  — renaming/moving methods/classes etc.

The Reengineering Life-Cycle

[Figure: the reengineering life-cycle, relating Requirements, Designs and Code through the steps below]

(0) requirement analysis
(1) model capture
(2) problem detection
(3) problem resolution
(4) program transformation

• people centric
• lightweight

Reverse engineering Patterns

Reverse engineering patterns encode expertise and trade-offs in extracting design from source code, running systems and people.
— Even if design documents exist, they are typically out of sync with reality.

Example: Interview During Demo

Reengineering Patterns

Reengineering patterns encode expertise and trade-offs in transforming legacy code to resolve problems that have emerged.
— These problems are typically not apparent in the original design but are due to architectural drift as requirements evolve.

Example: Move Behaviour Close to Data

A Map of Reengineering Patterns

[Figure: map of the pattern clusters]

> Setting Direction
> First Contact
> Initial Understanding
> Detailed Model Capture
> Tests: Your Life Insurance
> Migration Strategies
> Detecting Duplicated Code
> Redistribute Responsibilities
> Transform Conditionals to Polymorphism

Summary

> Software “maintenance” is really continuous development
> Object-oriented software also suffers from legacy symptoms
> Reengineering goals differ; symptoms don’t
> Common, lightweight techniques can be applied to keep software healthy

License

> http://creativecommons.org/licenses/by-sa/2.5/

Attribution-ShareAlike 2.5

You are free:
• to copy, distribute, display, and perform the work
• to make derivative works
• to make commercial use of the work

Under the following conditions:
Attribution. You must attribute the work in the manner specified by the author or licensor.
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.

• For any reuse or distribution, you must make clear to others the license terms of this work.
• Any of these conditions can be waived if you get permission from the copyright holder.

Your fair use and other rights are in no way affected by the above.