Improving Logic-Based Testing Jeff Offutt

Improving Logic-Based Testing
Jeff Offutt
Professor, Software Engineering
George Mason University
Fairfax, VA, USA
www.cs.gmu.edu/~offutt/
offutt@gmu.edu
Joint research with Gary Kaminski and Paul Ammann
Improving Logic-Based Testing, invited to Journal of Systems and Software
Outline
1. Motivation
2. Logic-Based Testing
3. Making Mutation Cheaper
4. Strengthening MCDC
5. Making MCDC Obsolete
6. Summary
Better Logic-Testing
© Kaminski, Ammann, Offutt
Software is a Skin that
Surrounds Our Civilization
Quote due to Dr. Mark Harman
Linköping, January 2011
© Jeff Offutt
Costly Software Failures
• “The Economic Impacts of Inadequate Infrastructure for
Software Testing”
– Inadequate software testing costs the US alone between $22 and
$59 billion USD annually
– Better testing could cut this amount in half
• 2006 : Amazon’s BOGO offer became a double discount
• 2007 : Symantec says that most security vulnerabilities are
now due to faulty software
– And more than half are in web applications
• Huge losses due to web application failures
– Financial services : $6.5 million per hour (just in USA!)
– Credit card sales applications : $2.4 million per hour (in USA)
World-wide monetary loss is staggering
Cost Of Late Testing
[Figure: bar chart comparing fault origin (%), fault detection (%), and unit cost (X); vertical axis from 0 to 60]
Software Engineering Institute; Carnegie Mellon University; Handbook CMU/SEI-96-HB-002
Covering Logic Expressions
• Logic expressions show up in many situations
• They are essential to defining software behavior
• Covering logic expressions is required by the US Federal
Aviation Administration for safety critical software
• Logical expressions can come from many sources
– Decisions in programs
– UML : FSMs and statecharts, activity diagrams
– Requirements
– SQL queries
• Test designs are subsets of expressions’ truth assignments
Logic Predicates and Clauses
• A predicate is an expression that evaluates to a boolean
value
• Predicates can contain
– boolean variables
– non-boolean variables compared with >, <, ==, >=, <=, !=
– boolean function calls
• Internal structure is created by logical operators
– ¬ – the negation operator
– ∧ – the and operator
– ∨ – the or operator
– → – the implication operator
– ⊕ – the exclusive or operator
– ↔ – the equivalence operator
• A clause is a predicate with no logical operators
Power of Logic Testing
• Logic expressions encode the behavior of software
• Logic expressions define the domain of values for which
the software behaves in a certain way
• Logic expressions are often
– Complicated
– Subtle
– Easy to get wrong, both in design and implementation
Testing logic predicates is a cost-effective way to
find many subtle software faults
Problems Addressed
• This (mostly) theoretical talk presents results on three
problems with logic predicate testing :
1. Redundant mutation operators for predicate testing
2. Weakness of major logic testing criterion : MCDC
3. A stronger logic test criterion, minimal-MUMCUT
• Solutions based on theoretical analysis
• Solutions can be immediately used to create better tools
and stronger criteria, with very slight additional cost
Mutation Testing
• Mutation helps testers design tests directly to find common
mistakes
1. Modify the software in small, syntactic, ways (mutants)
• Replace a variable, replace an operator, delete statements, …
2. Design or find a test to cause each mutant to result in
incorrect behavior (killing mutants)
• The resulting tests are very strong—will detect most
mistakes in the software
Fundamental Premise : If the software contains a
fault, there will usually be mutants that can only
be killed by a test that also detects that fault
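A minimal sketch of what "killing a mutant" means, assuming a toy function under test (the names original, mutant, and kills are illustrative, not taken from any mutation tool):

```python
def original(a, b):
    """Program under test: True when a is strictly greater than b."""
    return a > b


def mutant(a, b):
    """Mutant: the relational operator > is replaced by >=."""
    return a >= b


def kills(test_input):
    """A test kills the mutant if the original and the mutant disagree on it."""
    return original(*test_input) != mutant(*test_input)


# Candidate test inputs (a, b); only the boundary case tells the two versions apart.
tests = [(3, 1), (1, 3), (2, 2)]
killing = [t for t in tests if kills(t)]
print(killing)
```

Only the input with a == b separates the two versions, so a tester who wants to kill this mutant is pushed toward the boundary test.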
Redundancy in Mutation
• Mutation is widely considered to be “expensive”
• This expense is largely based on the high number of test
requirements—mutants
• But Li et al. found that mutation needed fewer tests !
[Figure: two bar charts over 29 Java classes. Left: number of test requirements (axis 0 to 4000). Right: number of tests (axis 0 to 500).]
Li, Praphamontripong, Offutt, An experimental comparison of four unit test criteria, Mutation 2009
Eliminating Redundancy
• This is strong evidence that mutation tools use many
redundant operators
• A more clever mutation system should have less
redundancy
• Fewer mutants means less work for the tester … cheaper!
Mutation Predicate Testing
• Traditional ROR operator :
Each occurrence of a relational operator (<, >, <=,
>=, ==, !=) is replaced by each other operator, and
the expression is replaced by True and False.
• Example:
– Original predicate: a > b
– Mutant 1 : a < b
– Mutant 2 : a <= b
– Mutant 3 : a >= b
– Mutant 4 : a == b
– Mutant 5 : a != b
– Mutant 6 : true
– Mutant 7 : false
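The traditional ROR operator can be sketched as a small generator. This is an illustration only, not the implementation used by any mutation tool; ROR_OPS and ror_mutants are names made up for this sketch.

```python
import operator

# The six relational operators, in the order used on the slide.
ROR_OPS = {
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def ror_mutants(op):
    """Return the traditional ROR mutants of `a op b` as (label, function) pairs."""
    mutants = [(f"a {other} b", fn) for other, fn in ROR_OPS.items() if other != op]
    mutants.append(("true", lambda a, b: True))    # whole expression replaced by True
    mutants.append(("false", lambda a, b: False))  # whole expression replaced by False
    return mutants


mutants = ror_mutants(">")
labels = [label for label, _ in mutants]
print(labels)
```

Each relational operator yields seven mutants this way: five operator replacements plus the two constants.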
Mutation Predicate Testing
A fault hierarchy establishes theoretical dominance relations
among faults:
If fault A dominates fault B, then any test that
detects fault A will by definition detect fault B
[Figure: Lau and Yu's logic fault hierarchy over the fault types LIF, LRF, TOF, LOF, LNF, ORF+, TNF, ORF., and ENF; an arrow means "detects"]
ROR Mutant Hierarchy
If mutant A dominates mutant B, then any test
that detects mutant A will by definition detect
mutant B
[Figure: dominance hierarchies among the ROR mutants of two predicates.
Panel 1, mutants for a < b : false, a <= b, a != b, a == b, a > b, a >= b, true.
Panel 2, mutants for a >= b : true, a > b, a == b, a != b, a <= b, a < b, false.]
A Cheaper ROR Operator
Each occurrence of a relational operator (<, >,
<=, >=, ==, !=) is replaced by operators as
follows:
• < : <=, !=, False
• > : >=, !=, False
• <= : <, ==, True
• >= : >, ==, True
• == : <=, >=, False
• != : <, >, True
Saves four mutants for each
relational operator
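A quick sanity check of the replacement table, not the proof from the paper. It assumes totally ordered operands such as integers, so that a relational expression depends only on whether x < y, x == y, or x > y. The names OPS, CHEAPER_ROR, REGIONS, killing_regions, and check are made up for this sketch.

```python
import operator

OPS = {
    "<": operator.lt, ">": operator.gt, "<=": operator.le,
    ">=": operator.ge, "==": operator.eq, "!=": operator.ne,
    "True": lambda x, y: True, "False": lambda x, y: False,
}

# The cheaper-ROR replacement table from the slide.
CHEAPER_ROR = {
    "<": ["<=", "!=", "False"],
    ">": [">=", "!=", "False"],
    "<=": ["<", "==", "True"],
    ">=": [">", "==", "True"],
    "==": ["<=", ">=", "False"],
    "!=": ["<", ">", "True"],
}

# One representative operand pair per ordering of x and y.
REGIONS = {"x<y": (1, 2), "x==y": (2, 2), "x>y": (2, 1)}


def killing_regions(original, mutant):
    """Orderings of (x, y) on which the mutant's value differs from the original's."""
    return {name for name, (x, y) in REGIONS.items()
            if OPS[original](x, y) != OPS[mutant](x, y)}


def check(original):
    """True if each kept mutant dies in exactly one ordering and together they need all three."""
    regions = [killing_regions(original, m) for m in CHEAPER_ROR[original]]
    return all(len(r) == 1 for r in regions) and set().union(*regions) == set(REGIONS)


results = {op: check(op) for op in CHEAPER_ROR}
print(results)
```

Under that assumption, tests that kill the three kept mutants must exercise all three orderings. Every other ROR mutant differs from the original in at least one ordering, so the same tests kill it too.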
MCDC
• Modified Condition / Decision Coverage
– Required by the USA Federal Aviation Administration to test
safety critical software
• Each clause (condition) in a predicate (decision) is required
to be tested with true and false when the clause “matters”
– Changing the value of the clause changes the predicate’s value
• Example :
p = a ∨ (b ∧ c)
Test 1 for a : a=true, (b ∧ c)=false
Test 2 for a : a=false, (b ∧ c)=false
• MCDC considered to thoroughly probe predicates
• Most useful when predicate has more than 4 clauses
– Otherwise, we can test all 2^N truth assignments
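A sketch of the "clause matters" idea, assuming the example predicate reads p = a ∨ (b ∧ c). That reading is consistent with both example tests for a requiring the parenthesized term to be false. The sketch holds the other clauses fixed within each pair, which is one common reading of MCDC. The function names are illustrative.

```python
from itertools import product


def p(a, b, c):
    """Assumed example predicate: p = a or (b and c)."""
    return a or (b and c)


def determines(pred, index, values):
    """True if flipping clause `index` in `values` changes the predicate's value."""
    flipped = list(values)
    flipped[index] = not flipped[index]
    return pred(*values) != pred(*flipped)


def mcdc_pairs(pred, index, n):
    """Pairs of tests that differ only in clause `index`, where that clause determines pred."""
    pairs = []
    for values in product([True, False], repeat=n):
        # Start each pair from the assignment where the clause is true.
        if values[index] and determines(pred, index, values):
            flipped = tuple(not v if i == index else v for i, v in enumerate(values))
            pairs.append((values, flipped))
    return pairs


pairs_for_a = mcdc_pairs(p, 0, 3)
for true_case, false_case in pairs_for_a:
    print(true_case, false_case)
```

Every pair found for clause a has (b and c) false, matching the two example tests.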
Weakness of MCDC
• MCDC was invented in the early 1990s
• Research community has invented additional logic criteria
since
– MCDC is weaker than ROR-mutation
• MCDC works at the clause level
• ROR works at the relational operator level
Solution : Extend MCDC to the
relational operator level
Stronger MCDC
• MCDC can be extended to include requirements to kill
ROR mutants
• Method :
– MCDC requires clause c = x op y to have two values, True and
False
– Cheaper-ROR requires c to have three values :
x < y, x == y, x > y
– The two MCDC values will always satisfy at least two of the
cheaper-ROR requirements
– Add one additional test to cover the third
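The method above can be sketched for a single clause. This illustrates the idea only and is not the algorithm from the paper; relation and missing_relations are hypothetical helper names, and the operand values are made up.

```python
def relation(x, y):
    """Classify operands into one of the three orderings used by cheaper-ROR."""
    if x < y:
        return "<"
    if x == y:
        return "=="
    return ">"


def missing_relations(operand_pairs):
    """Orderings not yet exercised by the given (x, y) operand pairs."""
    covered = {relation(x, y) for x, y in operand_pairs}
    return {"<", "==", ">"} - covered


# Clause c = (x <= y): one MCDC test makes it true, the other makes it false.
mcdc_values = [(10, 11), (11, 10)]
todo = missing_relations(mcdc_values)
print(todo)
```

Because the true and false values of a relational clause always fall in different orderings, at most one ordering is left for the extra test.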
Cost is Minor
• MCDC on a predicate with N clauses requires N+1 .. 2N
tests
• MCDC + ROR requires N more (2N+1 ... 3N tests)
• Algorithm and proof in paper
Example
p=abc
a = (a1 < a2), b = (b1 <= b2), c = (c1 == c2)
The following test set satisfies MCDC :
T = { t1, t2, t3, t4} = { ttf, tft, tff, ftf }
Which can be refined with the following value assignments :
The last three columns show the operand relation each test contributes for clauses a, b, and c.

Test       Value   a1   a2   b1   b2   c1   c2    a    b    c
t1         TTF      5    6   10   11   21   22    <    <
t2         TFT      5    6   11   10   21   21              ==
t3         TFF      5    6   11   10   21   22         >    <
t4         FTF      6    5   10   11   21   22    >
ROR tests
New (t1)            5    5   10   11   21   22    ==
New (t1)            5    6   10   10   21   22         ==
New (t2)            5    6   11   10   22   21              >
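The operand values in this example can be checked mechanically. A small sketch, using the clause definitions a = (a1 < a2), b = (b1 <= b2), c = (c1 == c2); the labels new1 to new3 and the helper names are chosen here for illustration:

```python
# Operand values (a1, a2, b1, b2, c1, c2) copied from the example.
MCDC_TESTS = {
    "t1": (5, 6, 10, 11, 21, 22),
    "t2": (5, 6, 11, 10, 21, 21),
    "t3": (5, 6, 11, 10, 21, 22),
    "t4": (6, 5, 10, 11, 21, 22),
}
ROR_TESTS = {
    "new1": (5, 5, 10, 11, 21, 22),
    "new2": (5, 6, 10, 10, 21, 22),
    "new3": (5, 6, 11, 10, 22, 21),
}


def relation(x, y):
    """Classify an operand pair as '<', '==' or '>'."""
    return "<" if x < y else "==" if x == y else ">"


def clause_values(test):
    """Truth values of clauses a, b, c for one test."""
    a1, a2, b1, b2, c1, c2 = test
    return (a1 < a2, b1 <= b2, c1 == c2)


def relations_covered(tests):
    """For each clause, the set of operand orderings exercised by the tests."""
    covered = {"a": set(), "b": set(), "c": set()}
    for a1, a2, b1, b2, c1, c2 in tests.values():
        covered["a"].add(relation(a1, a2))
        covered["b"].add(relation(b1, b2))
        covered["c"].add(relation(c1, c2))
    return covered


before = relations_covered(MCDC_TESTS)
after = relations_covered({**MCDC_TESTS, **ROR_TESTS})
print(before)
print(after)
```

The four MCDC tests alone leave one ordering uncovered per clause. The three added tests close those gaps.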
Faults in Logic Expressions
• Types of common and possible faults in logic expressions
have been categorized
– LIF : Literal Insertion Fault … An extraneous literal, or clause, is in
the predicate
– LRF : Literal Replacement Fault … The wrong literal, or clause, is
used in the predicate
– LOF : Literal Omission Fault … A literal, or clause, that should
have been in the predicate was omitted
– TNF : Term Negation Fault … A term, or collection of clauses,
that should have been negated was not (or vice versa)
• Lau and Yu defined nine types of logic faults and explored
their relationships
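A small illustration of the four fault types listed above, using a made-up original predicate `a and b` over three variables. The faulty versions are examples chosen for this sketch, not cases from Lau and Yu's study.

```python
from itertools import product


def original(a, b, c):
    """Intended predicate: a and b (c is available but unused)."""
    return a and b


FAULTY = {
    "LIF": lambda a, b, c: a and b and c,  # extraneous literal c inserted
    "LRF": lambda a, b, c: a and c,        # literal b replaced by c
    "LOF": lambda a, b, c: a,              # literal b omitted
    "TNF": lambda a, b, c: not (a and b),  # the term (a and b) negated
}


def detecting_tests(faulty):
    """Truth assignments on which the faulty predicate disagrees with the original."""
    return [t for t in product([True, False], repeat=3)
            if original(*t) != faulty(*t)]


detected = {name: detecting_tests(f) for name, f in FAULTY.items()}
for name, tests in detected.items():
    print(name, len(tests))
```

In this example every assignment exposes the term negation fault, while only one assignment exposes the literal insertion fault. This is why some fault types are much harder to detect than others.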
Lau and Yu’s Fault Hierarchy
A fault hierarchy establishes theoretical dominance relations
among faults:
If fault A dominates fault B, then any test that
detects fault A will by definition detect fault B
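[Figure: Lau and Yu's fault hierarchy over LIF, LRF, TOF, LOF, LNF, ORF+, TNF, ORF., and ENF, annotated with the criteria minimal-MUMCUT and MCDC; an arrow means "detects"]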
Empirical Studies
• The fault hierarchy result is theoretical
– That is, MCDC is only guaranteed to detect TNF and ENF faults
– But it could detect others by serendipity
• Two studies on different sets of logic expressions
1. Software from 5 airplane “black boxes” (Line Replaceable Units)
2. Air traffic collision avoidance system (TCAS)
Empirical Results
• Black boxes
– 20,256 logic expressions from 5 different “black boxes”
– Expressions originally studied by Chilenski
– 132 expressions with at least 5 unique literals
– 125 were simple, so that MCDC can find all faults, leaving 7
– MCDC found 81% of the possible faults (140 of 173)
• TCAS
– 19 predicates (originally studied by Chen, Lau and Yu)
– Larger predicates, several with 25 clauses
– MCDC found only 35% of the possible faults (205 of 580)
Size Matters
Faults found grouped by number of unique literals
Unique Literals   Total Faults   Faults Found   Percent
5                 71             61             86%
7                 523            328            63%
8                 316            152            48%
9                 1630           650            40%
10                1120           432            39%
11                956            312            33%
12                4870           1421           29%
13                1623           535            33%
Minimal-MUMCUT vs. MCDC
• Minimal-MUMCUT finds significantly more faults than
MCDC
– Especially in large, complicated, logic expressions
– Which is precisely when engineers are most likely to make
mistakes!
– Also very hard to debug when the software fails
• Requires up to four times as many tests
• Suggested approach
1. Use all combinations on predicates with fewer than 5 clauses
2. Use MCDC with 5 to 10 clauses
3. Use minimal-MUMCUT with more than 10 clauses
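The suggested approach amounts to a simple selection rule on the number of clauses. A sketch, with suggested_criterion as a made-up function name:

```python
def suggested_criterion(num_clauses):
    """Pick a logic test criterion from the number of clauses in a predicate."""
    if num_clauses < 5:
        return "all combinations"   # at most 2**4 = 16 truth assignments
    if num_clauses <= 10:
        return "MCDC"
    return "minimal-MUMCUT"


for n in (3, 5, 10, 25):
    print(n, suggested_criterion(n))
```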
Recommendations
1. Replace ROR with cheaper-ROR in mutation tools
– No loss in strength
– Saves 4 test requirements (mutants) for each relational operator
– We are currently implementing in the muJava mutation tool
2. Extend logic criteria such as MCDC with ROR
– Logic testing should apply to the relational operator level
– Small increase in the number of tests
– Large increase in the testing strength
3. Replace MCDC
– Better: Replace MCDC with Minimal-MUMCUT + ROR
– RTCA-DO-178B has been in effect for almost 20 years
– MCDC was a brilliant idea … in 1992
– MCDC was adopted and required without scientific validation
• Little theoretical analysis and no experimental studies
– We now know that MCDC is poor at discovering crucial faults