Introduction to Software Testing Chapter 1 A Talk in 3 Parts Jeff Offutt Information & Software Engineering SWE 437 Software Testing www.ise.gmu.edu/~offutt/ 637 – Introduction (Ch 1) Cost of Testing You’re going to spend about half of your development budget on testing, whether you want to or not. In real-world usage, testing is the principle post-design activity Restricting early testing usually increases cost Extensive hardware-software integration requires more testing 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Part 1 : Why Test? If you don’t know why you’re conducting a test, it won’t be very helpful. Written test objectives and requirements are rare How much testing is enough? – Common objective – spend the budget … 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Cost of Not Testing Program Managers often say: “Testing is too expensive.” Not testing is even more expensive Planning for testing after development is prohibitively expensive What are some of the costs of NOT testing? 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Part 2 : What ? But … what should we do ? 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Important Terms Validation & Verification Verification: The process of determining whether the products of a given phase of the software development process fulfill the requirements established during the previous phase – (Are we building the product right) Validation : The process of evaluating software at the end of software development to ensure compliance with intended usage – (Are we building the right product) IV&V stands for “independent verification and validation” 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Static and Dynamic Testing Static Testing : Testing without executing the program. – This include software inspections and some forms of analyses. Dynamic Testing : Testing by executing the program with real inputs 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Testing Results Dynamic testing can only reveal the presence of failures, not their absence The only validation for non-functional requirements is the software must be executed to see how it behaves Both static and dynamic testing should be used to provide full V&V coverage 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Software Faults, Errors & Failures Software Fault : A static defect in the software Software Error : An incorrect internal state that is the manifestation of some fault Software Failure : External, incorrect behavior with respect to the requirements or other description of the expected behavior Faults in software are design mistakes and will always exist 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Software Faults, Errors & Failures public static int numZero (int[] x) { // Effects: if x == null throw NullPointerException // else return the number of occurrences of 0 in x int count = 0; for (int i = 1; i < x.length; i++) { if (x[i] == 0) { count++; } } return count; } Input [1, 2, 0] Output is 1 Input [0, 1, 2, 0] Output is 1 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Testing & Debugging Testing : Finding inputs that cause the software to fail Debugging : The process of finding a fault given a failure 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Fault & Failure Model Three conditions necessary for a failure to be observed 1. Reachability : The location or locations in the program that contain the fault must be reached 2. Infection : The state of the program must be incorrect 3. Propagation : The infected state must propagate to cause some output of the program to be incorrect 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Some more terminology used in testing… 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Observability and Controllability Software Observability : How easy it is to observe the behavior of a program in terms of its outputs, effects on the environment and other hardware and software components – Software that affects hardware devices, databases, or remote files have low observability Software Controllability : How easy it is to provide a program with the needed inputs, in terms of values, operations, and behaviors – Easy to control software with inputs from keyboards – Inputs from hardware sensors or distributed software is harder – Data abstraction reduces controllability and observability 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Top-Down and Bottom-Up Testing Top-Down Testing : Test the main procedure, then go down through procedures it calls, and so on Bottom-Up Testing : Test the leaves in the tree (procedures that make no calls), and move up to the root. – Each procedure is not tested until all of its children have been tested 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 White-box and Black-box Testing Black-box testing : Deriving tests from external descriptions of the software, including specifications, requirements, and design White-box testing : Deriving tests from the source code internals of the software, specifically including branches, individual conditions, and statements This view is really out of date. The more general question is: from what level of abstraction to we derive tests? 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Changing Notions of Testing Old view of testing is of testing at specific software development phases – Unit, module, integration, system … New view is in terms of structures and criteria – Graphs, logical expressions, syntax, input space 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Old : Testing at Different Levels Acceptance testing: Is the software acceptable to the user? System testing: Test the overall functionality of the system Integration testing: Test how modules interact with each other Module testing: Test each class, file, module or component Unit testing: Test each unit (method) individually main Class P Class A Class B method mA1() method mB1() method mA2() method mB2() 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 New : Test Coverage Criteria A tester’s job is simple : Define a model of the software, then find ways to cover it Test Requirements: Specific things that must be satisfied or covered during testing Test Criterion: A collection of rules and a process that define test requirements Testing researchers have defined dozens of criteria, but they are all really just a few criteria on four types of structures … 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Example Given a bag of jelly beans, test it. Test Coverage Criteria: Test every flavor Idea: the criteria if fully satisfied will fully test our system Using the criteria we develop Test Requirements ( – must test grape, orange, cherry…) Using those requirements we create test cases: – Select a jelly bean until you get a grape one. Eat it. – Select a jelly bean until you get a cherry one. Eat it. –… 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 New : Criteria Based on Structures Structures : Four ways to model software 1. Graphs 2. Logical Expressions 3. Input Domain Characterization 4. Syntactic Structures 637 – Introduction (Ch 1) (not X or not Y) and A and B A: {0, 1, >1} B: {600, 700, 800} C: {swe, cs, isa, infs} if (x > y) z = x - y; else z = 2 * x; © Jeff Offutt, 2005-2007 1. Graph Coverage – Structural 2 1 6 5 Node Edge (Branch) Path(Statement) Cover Cover every edge Coverevery everynode path 3 This graph may represent • statements & branches 7 4 • methods & calls •••12567 12567 12567 •••1343567 1343567 1257 •• 1357 13567 • components & signals • 1357 • states and transitions • 1343567 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 • 134357 … 1. Graph Coverage – Data Flow def = {m} def = {a , m} use = {y} 2 def = {x, y} 6 use = {x} use = {a} 1 5 use = {x} def = {a} use = {a} use = {m} 7 Defs & Uses Pairs All AllDefs Uses 3 •Every (x, 1, (1,2)), (x, 1, (1,3)) def used once Every def “reaches” every • use (y, 1, 4), (y, 1, 6) • 1, 2, 5, 6, 7 This graph contains: • defs: nodes & edges where variables get values 4 def = {m} use = {y} • uses: nodes & edges where values are accessed 637 – Introduction (Ch 1) • •(a, 1,2, 2,(5,6)), 5, 3, 6,5, 7(a,72, (5,7)), (a, • 1, 3, 4, 3, (5,6)), (a, 3, (5,7)), • 1, 2, 5, 7 • (m, 4, 7), (m, 6, 7) • 1, 3, 5, 6, 7 • 1, 3, 5, 7 © Jeff Offutt, 2005-2007 • 1, 3, 4, 3, 5,7 2. Logical Expressions ( (a > b) or G ) and (x < y) Transitions Logical Program Decision Statements Software Specifications 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Expressions 2. Logical Expressions ( (a > b) or G ) and (x < y) Predicate Coverage : Each predicate must be true and false – ( (a>b) or G ) and (x < y) = True, False Clause Coverage : Each clause must be true and false – (a > b) = True, False – G = True, False – (x < y) = True, False Combinatorial Coverage : Various combinations of clauses – Active Clause Coverage: Each clause must determine the predicate’s result 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 2. Logic – Active Clause Coverage ( (a > b) or G ) and (x < y) With these values for G and (x<y), (a>b) determines the value of the predicate 637 – Introduction (Ch 1) 1 T F T 2 F F T 3 F T T 4 F F T 5 T T T 6 T T F © Jeff Offutt, 2005-2007 duplicate 3. Input Domain Characterization Describe the input domain of the software – Identify inputs, parameters, or other categorization – Partition each input into finite sets of representative values – Choose combinations of values System level – Number of students – Level of course – Major { 0, 1, >1 } { 600, 700, 800 } { swe, cs, isa, infs } Unit level – Parameters F (int X, int Y) – Possible values X: { <0, 0, 1, 2, >2 }, Y : { 10, 20, 30 } – Tests • F (-5, 10), F (0, 20), F (1, 30), F (2, 10), F (5, 20) Equivalence Partition in the Pressman book 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Input Domain Characterization (Equivalence Partitioning) 3 <4 4 637 – Introduction (Ch 1) 10 Between 4 and 10 > 10 Number of input values 9999 10000 < 10000 7 11 50000 100000 99999 Between 10000 and 99999 © Jeff Offutt, 2005-2007 > 99999 4. Syntactic Structures Based on a grammar, or other syntactic definition Primary example is mutation testing 1. 2. 3. 4. Induce small changes to the program: mutants Find tests that cause the mutant programs to fail: killing mutants Failure is defined as different output from the original program Check the output of useful tests on the original program Example program and mutants if (x > y) if (x >= y) z = x - y; z = x + y; z = x – m; else z = 2 * x; if (x > y) z = x - y; else z = 2 * x; 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Coverage Given a set of test requirements TR for coverage criterion C, a test set T satisfies C coverage if and only if for every test requirement tr in TR, there is at least one test t in T such that t satisfies tr Infeasible test requirements : test requirements that cannot be satisfied – No test case values exist that meet the test requirements – Dead code – Detection of infeasible test requirements is formally undecidable for most test criteria 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Coverage Example Coverage Criteria: test every node Test Requirements: – Execute Node 1, – Execute Node 2, – Exceute Node … 2 1 6 5 3 4 Test Set: – Test Case 1: 1,2,5,6,7 – Test Case 2: 1,3,4,3,5,7 637 – Introduction (Ch 1) Criteria give you ©a Jeff recipe for test requirements Offutt, 2005-2007 7 Two Ways to Use Test Criteria 1. Directly generate test values to satisfy the criterion often assumed by the research community most obvious way to use criteria. Very hard without automated tools 2. Generate test values externally and measure against the criterion usually favored by industry – – sometimes misleading if tests do not reach 100% coverage, what does that mean? 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 How to use Coverage Criteria (Direct method) Define your representation of the system as one of the four models – – – – Graph Logical Expression Input Domain Syntactic Structure Determine your criteria – What rule will you use (some examples.. There are many more!) • We will cover every edge in the graph • We will verify each boolean clause as true and false • We will verify one value in each input partition (equivalence class) Using that criteria, determine the set of Test Requirements you need to satisfy your criteria – Must cover graph edge (2,5), (1,6), (4,1), … Create the test cases that satisfy your test criteria 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Generators and Recognizers Generator : A procedure that automatically generates values to satisfy a criterion Recognizer : A procedure that decides whether a given set of test values satisfies a criterion Both problems are provably undecidable for most criteria It is possible to recognize whether test cases satisfy a criterion far more often than it is possible to generate tests that satisfy the criterion Coverage analysis tools are quite plentiful 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Some Tools Static Analysis Tools – FindBugs - Finds MANY categories of bugs – Checkstyle - coding standard violations – PMD - Maybe a lot more, but seems to be mainly unused variables, also cut-n-paste code. – Jamit - Java Access Modifier Inference Tool - find tighter access modifiers – UPDATE Spring2010: SQE: A nice integration with Netbeans: • http://kenai.com/projects/sqe/pages/Home – http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis Unit Testing – Junit – see UnitTesting Slides Coverage Analysis Tool – Netbeans Plugins - Unit Tests Code Coverage Plugin Mutation Testers – http://www.mutationtest.net/twiki/bin/view/Resources/WebHome 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Unit Testing See unit testing slides 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007 Quiz review What is the difference between verification and validation? What is the difference between static and dynamic testing? Can testing prove the absence of errors? What is a software fault? Error? Failure? How do they relate? What are the three types of code coverage criteria? What is logical coverage? What is input domain characterization? What is mutation testing? 637 – Introduction (Ch 1) © Jeff Offutt, 2005-2007