Outline of the Lecture: Software Testing

"No issue is meaningful unless it can be put to the test of decisive verification." (C.S. Lewis, 1934)

• Some Notations
• Testing Levels
  – Component/Unit/Module/Basic Testing
  – Integration Testing
  – System Testing Steps
    • Function testing / Thread testing
    • Performance testing
    • Acceptance testing
    • Installation testing
• Test Automation
• Termination Problem

Testing a Ballpoint Pen

What is expected from this pen? Its intended use!
• Does the pen write in the right color, with the right line thickness?
• Is the logo on the pen according to company standards?
• Is it safe to chew on the pen?
• Does the click mechanism still work after 100 000 clicks?
• Does it still write after a car has run over it?

Bridge, automobile, television, word processor: the product of any engineering activity must be verified against its requirements throughout its development.

Goal: develop software to meet its intended use! But: human beings make mistakes.

(Figure: the customer produces the requirements definition with the functional requirements; the developer produces the requirements specification with the nonfunctional requirements, then the design specification; the code is the system.)

Verifying a bridge means verifying its design, construction, process, and so on. Software must be verified in much the same spirit. In this lecture, however, we shall learn that verifying software is perhaps more difficult than verifying other engineering products, and we shall try to clarify why this is so.

Basic Definitions: Error, Fault, Failure

A human error (mistake) can lead to a fault (defect, bug), which can lead to a failure.

Debugging vs. Testing
• Testing: to demonstrate the existence of a fault
• Debugging: to find the bug
  – fault identification
  – fault correction / removal

Types of Faults (classification depends on the organization, e.g. IBM, HP)
• Algorithmic: e.g. division by zero
• Computation & Precision: e.g. order of operations
• Documentation: documentation does not match the code
• Stress/Overload: data-structure sizes (dimensions of tables, sizes of buffers)
• Capacity/Boundary: x devices, y parallel tasks, z interrupts
• Timing/Coordination: real-time systems
• Throughput/Performance: required speed
• Recovery: e.g. power failure
• Hardware & System Software: e.g. a modem
• Standards & Procedures: organizational standards; difficult for programmers to follow each other

Integration Testing

Objective: to ensure that the code implements the design properly.
• Top-down
• Bottom-up
• Big-bang
• Sandwich

Unit & Integration Testing

(Figure: component code passes through unit tests, which check boundary conditions, independent paths, interfaces, and so on, to become tested components; tested components are combined by integration tests into integrated modules. Test cases are derived from the design specification.)

• Stub: a piece of throw-away code that emulates a called unit.
• Driver: a piece of throw-away code that emulates the next level up in the hierarchy.
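To make the two definitions concrete, here is a minimal C sketch (all the names, fetch_price and order_total, are hypothetical, not from the lecture): the component under test calls a lower-level unit that is replaced by a stub, while a throw-away driver exercises the component from above.

```c
#include <assert.h>
#include <stdio.h>

/* Stub: throw-away code emulating the called unit fetch_price,
   which is not yet integrated. It returns a canned answer. */
int fetch_price(int item_id) {
    (void)item_id;
    return 100;
}

/* Component under test. */
int order_total(int item_id, int quantity) {
    return fetch_price(item_id) * quantity;
}

/* Driver: throw-away code emulating the next level up. */
int main(void) {
    assert(order_total(42, 0) == 0);    /* boundary condition */
    assert(order_total(42, 3) == 300);  /* typical case */
    printf("unit tests passed\n");
    return 0;
}
```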
The integration strategies can be illustrated on a component hierarchy with A at the top, B, C and D at the next level, and E, F and G at the lowest level:

• Top-down: A is tested first, with stubs standing in for B, C and D; lower levels are then attached level by level, each stub being replaced as the real component arrives.
• Modified top-down: as top-down, but each component is also unit-tested with its own stubs and drivers before being merged into the hierarchy.
• Bottom-up: the leaf components E, F and G are tested first, using drivers; then B, C and D are integrated with their subtrees; finally A is added.
• Big-bang: all components are put together and tested at once.
• Sandwich: top-down and bottom-up integration combined, meeting at a target level (here B, C, D).
• Modified sandwich: sandwich integration plus individual unit testing of the target-level components before integration.

Comparison of Integration Strategies

                                 Top-down  Mod. top-down  Bottom-up  Big-bang  Sandwich  Mod. sandwich
  Integration                    Early     Early          Early      Late      Early     Early
  Time to basic working program  Early     Early          Late       Late      Early     Early
  Drivers needed                 No        Yes            Yes        Yes       Yes       Yes
  Stubs needed                   Yes       Yes            No         Yes       Yes       Yes

Unit Testing

Course book: Pfleeger, S.L.: Software Engineering: Theory and Practice, second edition, Chapter 8, Section 8.4, page 356.

(Figure: an input is fed to the test object and produces an output; an oracle judges the output: failure or not?)

• Code reviews
  – Walkthroughs
  – Inspections
• White/open box testing
• Black/closed box testing

Two Types of Oracles
• Human: an expert who can examine an input and its associated output and determine whether the program delivered the correct output for this particular input.
• Automated: a system capable of performing the same task.

Balls and Urn
• Testing can be viewed as selecting different-colored balls from an urn, where the urn is the program and the balls are its inputs:
  – black ball = input on which the program fails
  – white ball = input on which the program succeeds
• Only when testing is exhaustive is the urn "emptied".
• A correct program is an urn containing only white balls; a program that always fails contains only black balls; a typical program contains mostly white balls with some black ones.
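A small illustration of the analogy (not from the slides; abs_under_test, oracle_ok, and the seeded fault are all made up): random testing draws balls from the urn, and an automated oracle, here an independent check of the output, classifies each draw as black or white.

```c
#include <stdio.h>
#include <stdlib.h>

/* Program under test, with a deliberately seeded fault:
   it returns a negative result for inputs below -9000. */
int abs_under_test(int x) {
    return x < -9000 ? x : (x < 0 ? -x : x);
}

/* Automated oracle: an independent way to judge the output. */
int oracle_ok(int input, int output) {
    return output >= 0 && (output == input || output == -input);
}

int main(void) {
    int black = 0;                        /* failures observed */
    for (int i = 0; i < 1000; i++) {      /* draw 1000 balls */
        int x = rand() % 20001 - 10000;   /* input in [-10000, 10000] */
        if (!oracle_ok(x, abs_under_test(x)))
            black++;                      /* a black ball */
    }
    printf("black balls (failures): %d of 1000\n", black);
    return 0;
}
```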
Walkthroughs (you lead the discussion)

Material: design, code, a chapter of the user's guide, and so on. Roles:
• presenter
• coordinator
• secretary
• maintenance oracle
• standards bearer
• user representative

Inspections (originally introduced by Fagan, 1976; the review team leads the discussion)
• overview (code, inspection goal)
• preparation (individually)
• reporting
• rework
• follow-up

Inspections (cont.): some classical programming errors
• use of uninitialized variables
• jumps into loops
• non-terminating loops
• incompatible assignments
• array indexes out of bounds
• off-by-one errors
• improper storage allocation or de-allocation
• mismatches between actual and formal parameters in procedure calls

Experiments
• 82% of faults were discovered during design and code inspections (Fagan).
• 93% of all faults in a 6000-line application were found by inspections (Ackerman et al., 1986).
• 85% of all faults were removed by inspections, in a study of the history of 10 million lines of code (Jones, 1977).
• Inspections find code faults; prototyping finds requirements problems.

Discovery activity: faults found per thousand lines of code

  Requirements review    2.5
  Design review          5.0
  Code inspection       10.0
  Integration test       3.0
  Acceptance test        2.0

Source: Jones, S. et al., Developing International User Information. Bedford, MA: Digital Press, 1991.

Proving Code Correct
• Formal proof techniques
• Symbolic execution
• Automated theorem proving

Black Box / Closed Box Testing

The tester sees only inputs and outputs, never the internals. It can reveal:
• incorrect or missing functions
• interface errors
• performance errors

Black-Box Testing
• Definition: a strategy in which testing is based on requirements and specifications.
• Applicability: all levels of system development
  – unit
  – integration
  – system
• Disadvantage: you can never be sure how much of the system under test has been tested.
• Advantage: directs the tester to choose subsets of tests that are both efficient and effective in finding defects.

Black-Box Testing Techniques
• Exhaustive testing
• Equivalence class testing (equivalence partitioning)
• Boundary value analysis

Exhaustive Testing
• Definition: testing with every member of the input value space.
• Input value space: the set of all possible input values to the program.

Equivalence Class Testing
• Equivalence class (EC) testing is a technique used to reduce the number of test cases to a manageable level while still maintaining reasonable test coverage.
• Each EC consists of a set of data that is treated the same by the module or that should produce the same result. Any data value within a class is equivalent, in terms of testing, to any other value in that class.

Identifying the Equivalence Classes

Take each input condition (usually a sentence or phrase in the specification) and partition it into two or more groups. For example, for the input condition "x: 1-50":
• valid equivalence class: 1 <= x <= 50
• invalid equivalence classes: x < 1 and x > 50
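A minimal sketch of this example as unit tests, with one test per class (the validator accept_x is hypothetical):

```c
#include <assert.h>

/* Hypothetical validator for the input condition "x: 1-50". */
int accept_x(int x) {
    return x >= 1 && x <= 50;
}

int main(void) {
    assert(accept_x(25) == 1);  /* valid EC:   1 <= x <= 50 */
    assert(accept_x(0)  == 0);  /* invalid EC: x < 1        */
    assert(accept_x(99) == 0);  /* invalid EC: x > 50       */
    return 0;
}
```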
Guidelines
1. If an input condition specifies a range of values, identify one valid EC and two invalid ECs.
2. If an input condition specifies a number (e.g., one through six owners can be listed for the automobile), identify one valid EC and two invalid ECs (no owners; more than six owners).
3. If an input condition specifies a set of input values and there is reason to believe that each is handled differently by the program, identify one valid EC for each value and one invalid EC.
4. If an input condition specifies a "must be" situation (e.g., the first character of the identifier must be a letter), identify one valid EC (it is a letter) and one invalid EC (it is not a letter).
5. If there is any reason to believe that elements in an EC are not handled in an identical manner by the program, split the EC into smaller equivalence classes.

Identifying the Test Cases
1. Assign a unique number to each EC.
2. Until all valid ECs have been covered by test cases, write a new test case covering as many of the uncovered valid ECs as possible.
3. Until all invalid ECs have been covered by test cases, write a test case that covers one, and only one, of the uncovered invalid ECs.

Applicability and Limitations
• Most suited to systems in which much of the input data takes on values within ranges or within sets.
• Assumes that data in the same EC really is processed in the same way by the system; the simplest way to validate this assumption is to ask the programmer about the implementation.
• EC testing is equally applicable at the unit, integration, system, and acceptance test levels. All it requires are inputs or outputs that can be partitioned based on the system's requirements.

Equivalence Partitioning: Example

Specification: the program accepts four to eight inputs, each a 5-digit integer greater than 10000.

(Figure: the input space is partitioned into valid and invalid inputs, each mapped to its expected outputs.)

Boundary Value Testing

Boundary value testing focuses on the boundaries simply because that is where so many defects hide. The defects can be in the requirements or in the code; the most efficient way of finding such defects, in either place, is through inspection (see Gilb and Graham, Software Inspection).

Technique
1. Identify the ECs.
2. Identify the boundaries of each EC.
3. Create test cases for each boundary value by choosing one point on the boundary, one point just below it, and one point just above it.

Boundary value analysis for the example yields three classes: less than 10000, between 10000 and 99999, and more than 99999.

Applicability and Limitations

Boundary value testing is equally applicable at the unit, integration, system, and acceptance test levels. All it requires are inputs that can be partitioned and boundaries that can be identified from the system's requirements.
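A sketch of the resulting boundary tests, reading the valid class as 10000-99999 as on the slide (the validator is_valid_input is made up for illustration):

```c
#include <assert.h>

/* Hypothetical validator: 5-digit integers, 10000..99999. */
int is_valid_input(long v) {
    return v >= 10000 && v <= 99999;
}

int main(void) {
    /* lower boundary: just below, on, and just above it */
    assert(!is_valid_input(9999));
    assert( is_valid_input(10000));
    assert( is_valid_input(10001));
    /* upper boundary: just below, on, and just above it */
    assert( is_valid_input(99998));
    assert( is_valid_input(99999));
    assert(!is_valid_input(100000));
    return 0;
}
```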
White Box Testing

Testing based on the internals of the code:
• logical decisions
• loops
• internal data structures
• paths
• …
Coverage!!

White-box techniques:
• Control flow testing
• Data flow testing

White-Box Testing
• Definition: a strategy in which testing is based on the internal paths, structure, and implementation of the software under test (SUT).
• Applicability: all levels of system development (path testing!)
  – unit
  – integration
  – system
  – acceptance
• Disadvantages: (1) the number of execution paths may be very large; (2) test cases may not detect data sensitivity; (3) it assumes that the control flow is correct (nonexistent paths!); (4) the tester must have programming skills.
• Advantage: the tester can be sure that every path has been identified and tested.

Control Flow Graphs

Control flow graphs are built from process blocks, decision points, and junction points; the basic patterns are sequence, if, while, until, and case.

Definition: given a program written in an imperative programming language, its program graph is a directed graph in which nodes are statement fragments and edges represent flow of control (a complete statement is a "default" statement fragment).

Levels of Coverage (test coverage metrics)
• Statement (line) coverage
• Decision (branch) coverage
• Condition coverage
• Decision/condition coverage
• Multiple condition coverage
• Path coverage

Statement Coverage

  begin
    if (y >= 0) then y = 0;
    abs = y;
  end;

(Flow graph: the decision y >= 0, its "yes" branch through y = 0, then abs = y.)

Test case 1: input y = ?; expected result: ?; actual result: ?

Branch Coverage

The same program, but both branch outcomes must now be exercised:
• test case 1 (yes): input y = 0; expected result: 0; actual result: 0
• test case 2 (no): input y = ?; expected result: ?; actual result: ?

Condition Coverage

  begin
    if (x < 10 && y > 20)
      z = foo(x, y);
    else
      z = fie(x, y);
  end;

Each atomic condition (x < 10 and y > 20) must evaluate to both true and false:
• test case 1: input x = ?, y = ?; expected result: ?; actual result: ?
• test case 2: input x = ?, y = ?; expected result: ?; actual result: ?

Decision/Condition Coverage

The same program; the test cases must make each condition and the decision as a whole take both outcomes.

Multiple Condition Coverage

The same program; all combinations of condition outcomes are required:

               x < ?   y > ?
  test case 1:   t       t
  test case 2:   t       f
  test case 3:   f       t
  test case 4:   f       f

Path with Loops

(Figure: a flow graph with nodes a-e containing a loop through c, b, d; because the loop body may repeat, the number of distinct paths is unbounded.)

Path Coverage

(Figure: a flow graph with two decisions, x <> 0 and z > 10, and the computations z = z - x, z = sin(x), z = 0, and z = z / x. The four decision combinations (n, n), (n, y), (y, n), (y, y) each require a test input x = ?, z = ?.)

• Path coverage demands all possible execution paths.
• Question: how do we know how many paths to look for?
• Answer: by computing the cyclomatic complexity.

Computation of Cyclomatic Complexity

Cyclomatic complexity has a foundation in graph theory and is computed in the following ways:
1. For a flow graph G: V(G) = E - N + 2, where E is the number of edges and N the number of nodes.
2. For a flow graph G with only binary decisions: V(G) = P + 1, where P is the number of binary decisions.

(Exercise figure: a flow graph with nodes A-S; determine E, N, and V(G) = E - N + 2, then check the result against V(G) = P + 1.)
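A small worked example, not from the slides: a C function with two binary decisions, so V(G) = P + 1 = 3, and the flow-graph count E - N + 2 agrees; three basis paths need three test cases.

```c
#include <stdio.h>

int classify(int x, int y) {
    int c = 0;      /* node a */
    if (x < 10)     /* node b: decision 1 */
        c += 1;     /* node c */
    if (y > 20)     /* node d: decision 2 */
        c += 2;     /* node e */
    return c;       /* node f */
}

int main(void) {
    /* Edges: a-b, b-c, b-d, c-d, d-e, d-f, e-f, so E = 7, N = 6:
       V(G) = E - N + 2 = 7 - 6 + 2 = 3, and P + 1 = 2 + 1 = 3.
       Three calls cover three basis paths (TT, TF, FF): */
    printf("%d %d %d\n", classify(5, 30), classify(5, 5), classify(50, 5));
    return 0;
}
```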
Data Flow Testing

DEF(S) = {x | statement S contains a definition of variable x}
USE(S) = {x | statement S contains a use of variable x}
A def-use chain (du-chain) is a triple [x, S, S'], where statement S defines x and statement S' uses it.

Example: for S1: i = 1; S2: while (i <= n) …, the triple [i, S1, S2] is a du-chain.

Annotated example (du = def-use, dk = def-kill):

  s = 0;                 <- s defined
  i = 1;
  s = 1;                 <- s defined again with no intervening use: suspicious
  while (i <= n) {
    s += i;
    i++;
  }
  print(s); print(i); print(n);

Define-use-kill patterns for a variable along a path:
• dd: defined and defined again: not invalid, but suspicious
• du: defined and used: perfectly correct
• dk: defined and then killed: not invalid, but probably a programming error
• ud: used and defined: acceptable
• uu: used and used again: acceptable
• uk: used and killed: acceptable
• kd: killed and defined: acceptable
• ku: killed and used: a serious defect
• kk: killed and killed: probably a programming error
• ~d: the variable does not exist, then it is defined
• ~u: the variable does not exist, then it is used
• ~k: the variable does not exist, then it is killed

Data Flow Graphs

(Figure: a control flow graph annotated with define-use-kill information for the variables x, y, and z. Tracing each variable along all paths flags, among correct define-use pairs, a ~use (major blunder), a define-define (suspicious), define-kill and kill-kill pairs (probable programming errors), and a kill-use (major blunder): six problems in total!)

Relative Strengths of Test Strategies (B. Beizer, 1990)

From strongest to weakest: all paths; all definition-use paths; all uses. Below that the hierarchy splits: all predicate/some computational uses leads to all predicate uses and on down to branch and statement coverage, while all computational/some predicate uses leads to all computational uses and all definitions.

Program Slicing

Full program:

  s = 0;
  i = 1;
  while (i <= n) {
    s += i;
    i++;
  }
  print(s); print(i); print(n);

Slice with respect to print(i):

  i = 1;
  while (i <= n) {
    i++;
  }
  print(i);
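The slicing example written out as runnable C; comments mark which statements the slice with respect to print(i) keeps (n is given a fixed value just to make the sketch self-contained).

```c
#include <stdio.h>

int main(void) {
    int n = 5;               /* assumed input for the sketch     */
    int i = 1;               /* kept: defines i                  */
    while (i <= n) {         /* kept: controls the loop          */
        /* s += i; */        /* sliced away: affects only s      */
        i++;                 /* kept: redefines i                */
    }
    printf("%d\n", i);       /* the slicing criterion: print(i)  */
    return 0;
}
```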
System Testing

Objective: to ensure that the system does what the customer wants it to do. The customer's requirements definition and functional requirements drive function testing; the developer's requirements specification and nonfunctional requirements drive the later steps:
• Function testing
• Performance testing
• Acceptance testing
• Installation testing

(Figure: the full test process. Component code passes unit tests to become tested components; integration tests produce integrated modules; a function test against the system functional requirements yields a functioning system; a performance test against the other software requirements yields verified, validated software; an acceptance test against the customer requirements specification yields an accepted system; and an installation test in the user environment puts the system in use!)

Function Testing (testing one function at a time; test cases are generated from the functional requirements)
• have a high probability of detecting a fault
• use a test team independent of the designers and programmers
• know the expected actions and output
• test both valid and invalid input
• never modify the system just to make testing easier
• have stopping criteria

Cause-Effect: Example

Causes:
• C1: command is credit
• C2: command is debit
• C3: account number is valid
• C4: transaction amount is valid

Effects:
• E1: print "invalid command"
• E2: print "invalid account number"
• E3: print "debit amount not valid"
• E4: debit account
• E5: credit account

For example: C2 and C3 and not C4 together cause E3.

Performance Testing (nonfunctional requirements)
• Stress tests
• Volume tests
• Configuration tests
• Compatibility tests
• Regression tests
• Security tests
• Timing tests
• Environment tests
• Quality tests
• Recovery tests
• Maintenance tests
• Documentation tests
• Human factors / usability tests

Acceptance Testing (by customers and users, against their needs)
• Benchmark test: a set of special test cases
• Pilot test: everyday working
  – Alpha test: at the developer's site, in a controlled environment
  – Beta test: at one or more customer sites
• Parallel test: the new system runs in parallel with the previous one

Installation Testing (at the user's site)

If the acceptance test was run at the developer's site, an installation test follows at the user's site; otherwise it may not be needed!

Test Planning
• Establishing test objectives
• Designing test cases
• Writing test cases
• Testing test cases
• Executing tests
• Evaluating test results

Automated Testing Tools

What can be automated? Scaffolding, the oracle, and test case generation.
• Code analysis tools: static, dynamic
• Test execution tools: capture-and-replay, stubs & drivers, comparators
• Test case generators

Termination Problem

How do we decide when to stop testing? This is the main problem for managers! Termination takes place when:
• the resources (time & budget) are exhausted,
• the seeded faults have been found, or
• some coverage level is reached.
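The slides do not spell out how "the seeded faults have been found" becomes a stopping rule; one standard reading (an assumption here, not from the lecture) is Mills' fault-seeding estimate: if S faults are seeded and testing finds s of them along with n genuine faults, the estimated total number of genuine faults is n * S / s.

```c
#include <stdio.h>

int main(void) {
    /* Illustrative numbers only (made up). */
    int seeded       = 20;  /* faults deliberately planted      */
    int seeded_found = 15;  /* planted faults found by testing  */
    int real_found   = 30;  /* genuine faults found by testing  */

    /* Mills' estimate: total genuine faults ~ n * S / s. */
    double total = (double)real_found * seeded / seeded_found;
    printf("estimated genuine faults: %.0f (about %.0f still remain)\n",
           total, total - real_found);
    return 0;
}
```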
Real-Life Examples
• The first U.S. space mission to Venus failed (reason: a missing comma in a Fortran DO loop).
• December 1995: an American Airlines Boeing 757 crashed into a mountain in Colombia, killing 159 people. The cause was an incorrect one-letter computer command (Cali and Bogotá, 132 miles in the opposite direction, had the same coordinate code).
• June 1996: the Ariane-5 rocket self-destructed, at a cost of $500 million (reason: reuse of software from Ariane-4 without the recommended testing).
• Australia: a man was jailed because of a computer glitch, for a traffic fine he had actually paid five years earlier.
• Dallas: a prisoner was released due to a program design flaw. He had been temporarily transferred from one prison to another as a witness, and the computer recorded a "temporary assignment".

Summary: Goals of Software Testing, a Historical Evolution

Driven by growth in cost, complexity, and the number of applications, the goal of testing has evolved:
• 1950s: testing was not distinguished from debugging (1957: Charles Baker distinguishes debugging from testing).
• 1960s-70s: find faults (June 1972: the first formal conference on software testing, at the University of North Carolina, Bill Hetzel; 1979: Myers, "The Art of Software Testing").
• 1980s: establish confidence, measure software quality assurance, and set test objectives (1981: Deutsch, software project verification and validation); ultimately, prevent software faults.

Summary of Techniques

Unit level:
• code inspections and code walkthroughs
• black-box: exhaustive testing, equivalence class testing, boundary value analysis
• white-box: control flow testing, data flow testing

Integration testing: top-down, bottom-up, big-bang, sandwich.

System testing: function testing, performance testing, acceptance testing, installation testing.

And …

"Testing can show the presence, but never the absence of errors in software." (E. Dijkstra, 1969)