Software Analysis and Testing

Paolo Tonella, Alessandro Marchetto, Cu D. Nguyen, Mariano Ceccato
Software Engineering (SE) unit, Fondazione Bruno Kessler (FBK)

Team

Paolo Tonella, Cu D. Nguyen, Mariano Ceccato, Alessandro Marchetto
Software Engineering research unit (http://se.fbk.eu) at Fondazione Bruno Kessler (http://www.fbk.eu)

FBK

FBK (more than 350 researchers) is a research organization of the Autonomous Province of Trento that promotes research in the areas of science, technology and humanities. FBK's objectives are to:
(i) conduct research that obtains recognition at an international level;
(ii) carry out applied research of strategic importance to the province;
(iii) publicize scientific results and promote economic development; and
(iv) encourage innovation throughout the province.

FBK – CIT

Three main areas of Information Technology:
- Engineering
- Content
- Interaction

FBK is organized into Research Units.
- Research Units are research groups that are above critical mass but still manageable.
- Research Units include senior and young researchers, postdocs, PhD students, project managers, system architects and programmers.
- Research Units are strongly encouraged towards collaborations and projects that can exploit synergies among different competences.

Web site

Material for the course:
- Course objectives and program
- Course agenda
- Notes and slides of the lectures
- Project deliverables
- Exam rules and dates

http://selab.fbk.eu/swat

Lecture schedule

Wed.  16.30 - 18.30  room A107
Fri.  14.30 - 16.30  room A107

Please come to the lab lectures with your laptop, or ask us to provide one.
… Is any laptop needed?

Exam

The exam consists of an oral discussion of the project carried out during the laboratory. During the course we will work on a project. The exam will consist of some questions on the project implementation plus one question on the theory behind the project.
Project deliverables include both implementation artifacts (Eclipse project, test cases, etc.) and documentation artifacts (reports). Both are mandatory for admission to the exam. The next lecture will be devoted to the presentation of the course project.

The project must be delivered one week before the exam date!

Objective

This course aims at teaching students how to analyze and test a software system when it evolves to accommodate a set of change requirements (e.g., adding new functionality, bug fixing, adaptation or restructuring), by carrying out a software project which involves:
- definition of acceptance tests for the change requirements;
- definition of tests for the user interface;
- definition of unit tests for the modules implemented to realize the change requirements;
- application of test adequacy criteria to the implemented modules;
- regression testing w.r.t. the preserved functionalities of the evolved software.

Program

Background and Context:
- Software maintenance and evolution
- Code analysis
- Software testing

Testing:
- Acceptance test
- GUI testing
- Unit test
- Structural test (path and data flow)
- Mutation test
- Automated path generation and path testing
- Regression test
- Test case prioritization

Laboratory:
- program analysis and understanding
- user interface test
- acceptance test creation and execution
- unit test creation and execution
- coverage and mutation test
- regression test

Questions:
- Which testing to apply?
- How to define test cases?
- How to automate test case execution?
- How to decide when to stop testing?
Tools:
- FitNesse
- JUnit
- MuJava
- MuClipse
- Clover
- Jumble
- Emma
- …

Agenda

- Intro & course project: Sep 14, 16, 21, 23
- Acceptance testing: Sep 28, 30; Oct 5
- GUI testing: Oct 7, 12
- Unit testing: Oct 14, 19
- Analysis and testing theory: Oct 21, 26, 28; Nov 2, 30
- Debugging: Nov 4, 9, 11, 16
- Coverage and mutation testing: Nov 18, 23, 25
- Regression testing: Dec 2, 7
- Advanced topic: Dec 14, 16

Software Maintenance and Software Testing

Software maintenance

Software maintenance has been defined as “the modification of a software product after delivery to correct faults, to improve performance or other attributes, or to adapt the product to a changed environment.” (ANSI/IEEE, 1983)

Maintenance activities can be classified into four categories:
- Perfective maintenance
- Adaptive maintenance
- Corrective maintenance
- Preventive maintenance

Software maintenance

Maintenance activities are difficult:
- Continuing change.
- Increasing complexity.
- Fundamental law of program evolution.
- Conservation of organizational stability (invariant work rate).
- Conservation of familiarity (perceived complexity).

Software maintenance steps

- Change request
- Program understanding
- Change location identification
- Change implementation
- Testing
- Ripple effects

Legacy system

- They were implemented years ago (1970)
- Their technology has become obsolete (obsolete languages, language styles, hardware, …)
- They have been maintained for a long time (30 years)
- Their structure has deteriorated and does not facilitate understanding
- Their documentation (if it exists) has become obsolete
- The original authors are not available

Attention!
- They contain business rules not recorded elsewhere
- They cannot be easily replaced (important!)
- They represent a large investment

Legacy dilemma

What should we do with legacy code?
- Build the new system from scratch.
- Try to understand the legacy code and reconstitute it in a new form.

Reverse Engineering

Reverse engineering is the process of taking something (a device, an electrical component, a car, a piece of software, …) apart and analyzing in detail how it works, usually with the intention of constructing a new device or program that does the same thing.

Forward engineering is the traditional process of moving from high-level abstractions to the physical implementation of a system:
Requirements → Design → Implementation

Reverse engineering is “the inverse” of forward engineering:
Implementation (code) → “Abstract Code Representation” → Design → Requirements

Re-engineering

Re-engineering is the examination (reverse engineering) of a system to reconstitute it (forward engineering) in a new form. This process may include modifications to meet new requirements not met by the original system (in which case semantics cannot be preserved).

The re-engineering process takes many forms, depending on its objectives. Sample objectives are:
- code migration/porting (e.g., C to C++)
- re-engineering code in order to reuse it
- …

Horse-shoe model of reengineering

[Figure: horse-shoe model of reengineering]

Restructuring

Restructuring is the transformation from one representation to another at the same relative abstraction level, while preserving the system's external behavior (functionality and semantics).

Examples:
- Code level: from an unstructured (“spaghetti”) form to a structured (“goto-less”) form; conversion of a set of “if-statements” into a “case structure”.
- Design level: introducing design patterns (e.g., the Model-View-Controller architecture).

Program analysis

Program analysis is the (automated) inspection of a program to infer some of its properties. In some cases, properties can be inferred without running the program (static analysis). In other cases, properties can be inferred only by running the program (dynamic analysis).
Examples are:
- Type analysis (type inference)
- Dead code analysis
- Clone analysis
- Pointer analysis
- …

Software Testing

Software testing is a key activity, both in software development and in software maintenance. Its objective is opposed to that of development: instead of making the system work, it aims at breaking it, thus revealing the presence of defects. This is the main reason why development and testing teams should be separate.

Code analyses are extremely helpful to testing. Specifically, structural testing (aka white-box testing), as opposed to functional testing (aka black-box testing), assumes that the internal structure of the program can be accessed and analyzed, and that testing criteria can be derived from such an analysis. The outcome supports:
- Test case production/automatic generation.
- Definition of stopping criteria.

Testing

One of the practical methods commonly used to detect the presence of errors (failures) in a computer program is to test it on a set of inputs. Testing detects errors; only exhaustive testing, which is usually infeasible, can prove correctness (absence of errors).

[Diagram: the “inputs” I1, I2, I3, …, In, … are fed to our program. Is the output correct? Expected results =? Obtained results]

Examples of test case

Test input description:
1. Log in to <Abc page> as administrator
2. Go to the Reports page
3. Click on the ‘Schedule reports' button
4. Add reports
5. Update
Expected results: the report schedule should be added to the report schedule table.

Test case for sort:
- Test data: <12 -29 32>
- Expected output: -29 12 32

Terminology …

Failure: an observable incorrect behavior or state of a given system. In this case, the system displays a behavior that is contrary to its specifications/requirements. Thus, a failure is tied (only) to system executions/behaviors and it occurs at runtime, when some part of the system enters an unexpected state.

Fault (commonly called “bug” or “defect”): a defect in a system.
A failure may be caused by the presence of one or more faults in a given system. However, the presence of a fault in a system may or may not lead to a failure: for example, a system may contain a fault in a fragment of code that is never exercised, so this kind of fault does not lead to a software failure.

Error: the developer mistake that produces a fault. Often, it is caused by human activities, such as typing errors.

Example …

LOC  Code
1    program double ();
2    var x,y: integer;
3    begin
4      read(x);
5      y := x * x;
6      write(y)
7    end

Failure: for x = 3 the program outputs y = 9. This is a failure of the system, since the correct output would be 6.

Fault: the fault that causes the failure is in line 5. The * operator is used instead of +.

Error: the error that leads to this fault may be:
- a typing error (the developer has written * instead of +)
- a conceptual error (e.g., the developer doesn't know how to double a number)

Terminology …

Test case: an input sequence and the associated expected output.
Testing: the process of executing a program with the intent of finding errors.
Test suite: a set of test cases for a system.
- Testing cannot guarantee the absence of faults.
- Strategies for defining test suites.
- Formal methods (e.g., model checking) can be used to statically verify software properties; this is not testing.
Debugging: finding and fixing faults in the code.

Sources for test case definition …

- The requirements of the program (its specification)
- An informal description
- A set of scenarios (use cases)
- A set of sequence diagrams
- A state machine
- The system itself (the code or the execution of the application)
- A set of selection criteria
- Heuristics (e.g., guidelines for testing)
- Experience (of the tester)

Testing: three main questions …

At which level should testing be conducted?
- Unit
- Integration
- System

How to choose inputs?
- Considering the program as a black box: using the specifications/use cases/requirements
- Considering the program as a white box: using the structure
- A randomly selected set of inputs is often not adequate …

How to identify the expected output?
- Test oracles

Test phases

- Acceptance testing: checks whether the end-user functionalities are actually delivered. It is often a contractual prerequisite for the user to accept and pay for the software.
- Unit testing: testing of a single function, procedure or class. It is usually done by the developer, not by a separate testing team.
- Integration testing: checks that units tested in isolation work properly when put together. It often requires drivers and stubs to simulate the missing components while integrating the system.
- System testing: the goal is to ensure that the whole system works properly.
- Regression testing: checks that the system preserves its functionality after maintenance and/or evolution.

Acceptance vs. Unit Testing

Acceptance tests are specified by the customer and the analyst to test that the overall system is functioning as required (did the developers build the right system?). Acceptance tests typically test the entire system, or some large chunk of it.

Unit tests are written by the developers to test a functionality as they write it. Unit tests typically test each unit of a system in isolation.

Iterative software development

[Diagram labels: prioritized functionalities; write acceptance tests (“written before”); write and execute unit tests; execute acceptance tests (“executed after the development”); increment + system; “at different points in the process”]

Testing tools

- GUI: Jemmy/Abbot/JFCUnit/…
- Web UI: HttpUnit/Canoo/Selenium
- Business logic: FIT/FitNesse (high level), JUnit (low level), Cactus
- Persistence layer: JUnit/SQLUnit/XMLUnit
- Performance and load testing: JMeter/JUnitPerf

Badly designed systems make testing difficult

- We have a thick GUI that has program logic.
- The interfaces between the modules are not clearly defined.
- Testing of specific functions cannot be isolated.
- Testing has to be done through the GUI.
- Testing is difficult.

[Diagram: “Badly designed system”, GUI-test drivers]

Well-architected applications make testing simple

Design for testability:
- The GUI does not contain any program logic other than dealing with presentation.
- The interfaces between the modules are well defined.
- This gives us testing advantages: unit and system acceptance testing are simpler and they can be automated.

[Diagram: “Well architected application”]

How good are these test cases?

Adequacy = level of confidence provided by a test suite applied to the system under test.

Several criteria:
- coverage of the requirements (what if 100% is covered w.r.t. the implemented features?)
- coverage of the code (what if 100% is covered w.r.t. the code?)
- fault detection (what if no fault is found?)

Coverage Testing

Coverage measures describe the degree to which a program has been tested.

Many types of coverage measures:
- statements
- branches
- paths
- methods, classes
- requirement specifications, etc.
Example with two input data:
(1) x = 7 => a = 8 and b = 6; statements executed: [1, 2, 3, 4, 5, 6, 8]
(2) x = 22 => a = 23 and b = 21; statements executed: [1, 2, 3, 4, 6, 7, 8]

1  scanf("%d", &x);
2  a = x + 1;
3  b = x - 1;
4  if (a < 10)
5      x++;
6  if (b > 20)
7      x--;
8  printf("%d\n", x);

Mutation Testing

- Mutant: a copy of the original program with a small change (seeded fault).
- Mutant killed: a mutant is killed if, for some test case, its behavior/output differs from that of the original program.
- Mutation score: the number of killed mutants (typically reported as a ratio over the total number of mutants).

Regression testing

Selective re-testing of a system or component to verify that modifications have not caused unintended effects (ripple effects).

It can be conducted at each of the test levels: unit, integration, system.

[Diagram: modifications turn Version X into the changed Version X+1; among the test cases T1, T2, T3, …, Tn, the affected ones (red) are the test cases to modify]

- find the affected test cases (red)
- change the affected test cases (red)
- execute them
- define new test cases, if necessary

Test case prioritization

The faults revealed by a test case are unknown until the test case is executed and its output is evaluated against the oracle.

The order in which test cases are executed affects:
- The rate of fault detection: good orderings reveal faults earlier than bad ones.
- The rate of code coverage: good orderings meet the required coverage level earlier than bad ones.

This information can be used in regression testing to prioritize the re-execution of the test cases, so that faults are found early.

Conclusions

- Software maintenance involves several activities (reverse engineering, analysis, etc.).
- One of the most relevant and hardest activities is software testing.
- Different types of testing exist: acceptance, unit, etc.
- The motivation for acceptance testing is demonstrating working functionalities from the end-user perspective.
- The motivation for unit testing is finding faults in the unit the developer is working on.
- Several issues impact testing: test case definition, test case execution, adequacy criteria, etc.
- Badly designed systems make testing difficult.
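
As a concrete illustration of the statement coverage example from the Coverage Testing slide, the sketch below transcribes the 8-statement C fragment into Python and records which statement numbers each input executes. It is only a hand-instrumented sketch: the names run_example and statement_coverage are hypothetical, the input is passed as an argument instead of being read with scanf, and in the laboratory a coverage tool such as Emma or Clover would do this bookkeeping automatically.

```python
def run_example(x, executed):
    """Transcription of the 8-statement example; 'executed' collects
    the numbers of the statements that are actually run."""
    executed.add(1)      # 1: scanf("%d", &x)  (here x is a parameter)
    executed.add(2)      # 2: a = x + 1
    a = x + 1
    executed.add(3)      # 3: b = x - 1
    b = x - 1
    executed.add(4)      # 4: if (a < 10)
    if a < 10:
        executed.add(5)  # 5: x++
        x += 1
    executed.add(6)      # 6: if (b > 20)
    if b > 20:
        executed.add(7)  # 7: x--
        x -= 1
    executed.add(8)      # 8: printf("%d\n", x)  (here x is returned)
    return x


def statement_coverage(inputs, total_statements=8):
    """Run the example on every input; return the statements executed
    per input and the fraction of statements covered by the whole suite."""
    covered = set()
    per_input = {}
    for x in inputs:
        trace = set()
        run_example(x, trace)
        per_input[x] = sorted(trace)
        covered |= trace
    return per_input, len(covered) / total_statements


if __name__ == "__main__":
    per_input, ratio = statement_coverage([7, 22])
    for x, statements in per_input.items():
        print(f"x = {x}: statements {statements}")
    print(f"statement coverage of the suite: {ratio:.0%}")
```

Each input alone executes 7 of the 8 statements, while the two inputs together execute all of them. Full statement coverage does not mean every path has been tried: the path on which both conditions are false (for example x = 15) is not exercised by this suite.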