Mutant Subsumption Graphs
Mutation 2014, March 31, 2014
Bob Kurtz, Paul Ammann, Marcio Delamaro, Jeff Offutt, Lin Deng

Introduction
• In this talk, we will
  – Define true subsumption, dynamic subsumption, and static subsumption to model the redundancy between mutants
  – Develop a graph model to display the subsumption relationship
  – Examine via an example how subsumption relationships behave and evolve

Motivation
• What exactly is subsumption, anyway?
  – Lots of prior work: fault hierarchies, subsuming HOMs, etc.
  – Can we specify some rules and produce a useful representation?
• What can we do with it once we have it?
  – Can we select a minimal set of mutants to reduce testing complexity?
  – Can we use subsumption as a fitness function for tasks like evaluating automated program repair candidates?

“True” Subsumption
• Given a set of mutants M on artifact A, mutant mi subsumes mutant mj (mi → mj) iff:
  – Some test kills mi
  – All tests that kill mi also kill mj
• “True” subsumption represents the actual relationship between mutants
• We’d like to get this relationship, but in general it is undecidable, so we must approximate it

Dynamic Subsumption
• Dynamic subsumption approximates true subsumption using a finite test set T
• Given a set of mutants M on artifact A and a test set T, mutant mi dynamically subsumes mutant mj iff:
  – Some test in T kills mi
  – All tests in T that kill mi also kill mj
• If T contains all possible tests, then dynamic subsumption = true subsumption

Static Subsumption
• Static subsumption approximates true subsumption using static analysis of mutants rather than test execution
• Given a set of mutants M on artifact A, mutant mi statically subsumes mutant mj iff:
  – Analysis shows that some test could kill mi
  – Analysis shows that all tests that could kill mi also would kill mj
• If we had omniscient analysis, then static subsumption = true subsumption

An Informal View
[Figure: Venn diagram over all tests, with regions labeled “Tests that kill mi”, “Tests that kill mj”, and “Tests that kill mk”; annotated mi → mj]

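The dynamic subsumption definition above can be checked mechanically once each mutant's kill set over T is known. The following is a minimal Python sketch, not the authors' tooling; the `kills` mapping, the mutant and test names, and the function name `dynamically_subsumes` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def dynamically_subsumes(kills: Dict[str, Set[str]], mi: str, mj: str) -> bool:
    """Return True if mi dynamically subsumes mj for the tests recorded in `kills`.

    `kills` maps each mutant to the set of tests in T that kill it.
    """
    killers_of_mi = kills[mi]
    # Condition 1: some test in T kills mi.
    # Condition 2: every test in T that kills mi also kills mj.
    return bool(killers_of_mi) and killers_of_mi <= kills[mj]


# Hypothetical kill data for T = {t1, t2, t3}; not taken from the talk.
kills = {
    "m1": {"t1"},
    "m2": {"t1", "t2"},
    "m3": set(),  # live mutant: no test in T kills it
}

print(dynamically_subsumes(kills, "m1", "m2"))  # True
print(dynamically_subsumes(kills, "m2", "m1"))  # False: t2 kills m2 but not m1
print(dynamically_subsumes(kills, "m3", "m1"))  # False: no test kills m3
```

Because the answer depends only on the tests recorded in `kills`, it can change as T grows, which is why this is an approximation of true subsumption.
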
Graph Model
• In the Mutant Subsumption Graph (MSG) graph model
  – Nodes represent a maximal set of indistinguished mutants
  – Edges represent the subsumption relationship
  – Thus, m1 → m2 → m3 is represented as:
[Figure: graph for m1 → m2 → m3]

Dynamic Subsumption Graph (DMSG)
[Figure: example with tests t1, t2, t3, t4 and mutants m1, m2, m3, m4, m5; callout: “Indistinguished” mutants]
T = { t1, t2, t3, t4 }

Minimal Mutants
• Minimal mutants are not subsumed by any other mutants
• If we execute a test set that kills all the minimal mutants, then we will kill all the mutants
  – All other mutants are redundant!

DMSG Growth
• We can observe the growth of the DMSG as we add tests
  – Dashed nodes indicate live mutants
[Figure: DMSG over m1, m2, m3, m4 in four panels: T = { t1 }, T = { t1, t2 }, T = { t1, t2, t3 }, T = { t1, t2, t3, t4 }]

Subsumption State Model
• Mutants change state (with respect to subsumption relationships) as tests are added.
  – Live or killed
  – Distinguished or indistinguished
  – Minimal or subsumed
    • Only if killed
• These 3 attributes combine to create 8 possible states, but since subsumption is not defined for live mutants, we only care about 6 states

The cal() Example
• To explore mutant subsumption graphs in more detail, we selected a small example program
• cal() is a simple Java program of < 20 SLOC
  – cal() calculates the number of days between two dates in the same year
• Chosen for its well-defined finite input space
  – See Ammann and Offutt, Introduction to Software Testing
• We used muJava to generate 173 mutants

The cal() Example
• Dynamic subsumption requires a test set
• We used the Advanced Combinatorial Testing System (ACTS) to generate a test set
  – Pairwise combinations of months and year types (divisible-by-400, divisible-by-100, divisible-by-4, other) generated 90 test cases
  – Test set killed 145 mutants, and the remaining 28 were analyzed by hand and determined to be equivalent

The cal() Example
• 31 nodes of indistinguished mutants
• 7 nodes of minimal mutants
  – Even though muJava generated 145 nonequivalent mutants, we need
    to kill only 7 (one from each of these nodes) to ensure that we kill all 145

DMSG Growth for cal()
• We can observe the growth of the DMSG as we individually add the 90 “pairwise” tests in random order
  – Graph shows the number of minimal mutant nodes (red) and the total number of graph nodes (red + blue)
[Figure: DMSG growth for cal()]

cal() DMSG for Different Test Sets
• 6-test “minimal” test set: 17 nodes, 6 minimal nodes
• 90-test “pairwise” test set: 31 nodes, 7 minimal nodes
• 312-test “combinatorial” test set: 33 nodes, 9 minimal nodes

cal() in C
• We implemented the cal() program in C, then used Proteum to generate mutants
  – Proteum’s mutation operators are not based on the selective set of operators, so it generated many more mutants: 891
  – The same 90 tests killed all but 71 mutants, and those 71 were determined to be equivalent
• 128 nodes, only 18 minimal nodes

Dynamic Approximation
• May group mutants together where a distinguishing test is missing
• May add unsound edges where a contradicting test is missing
[Figure: TMSG compared with DMSG]

Static Approximation
• May group mutants together where unable to solve constraints
• If analysis is sound, should never add unsound edges
[Figure: TMSG compared with SMSG]

Static Refinement of the DMSG
• Can the dynamic results be refined by static analysis?
• We performed a manual analysis of a small portion of the graph

Static Refinement of the DMSG
• COI_1 is killed by all tests
• AORB_4 is killed whenever (month2=month1)
• AOIU_7 is killed whenever (month2≠month1) or whenever ((month2=month1)∧(day2≠day1))
[Figure: Venn diagram with regions “Tests that kill COI_1 (all tests)”, “Tests that kill AORB_4”, and “Tests that kill AOIU_7”]

Static Refinement of the DMSG
• COI_1 is killed by all tests
• AORB_4 is killed whenever (month2=month1)
• AORB_2 is killed whenever ((month2=month1)∧((day2-day1)≠(day2/day1)))
[Figure: Venn diagram with regions “Tests that kill COI_1 (all tests)”, “Tests that kill AORB_2”, and “Tests that kill AORB_4”]

Static Refinement of the DMSG
• AORB_2 is killed whenever (month2=month1)∧((day2-day1)≠(day2/day1))
• AORB_3 is killed whenever (month2=month1)∧((day2-day1)≠(day2%day1))
[Figure: Venn diagram with regions “All tests / tests that kill COI_1”, “Tests that kill AORB_2”, and “Tests that kill AORB_3”, with a “?” between the last two]
• What is the relationship between these mutants?

Static Refinement of the DMSG
• Combinations of day1 and day2 that kill:
  – both AORB_2 and AORB_3 are GREEN
  – neither are BLUE
  – AORB_2 but not AORB_3 are RED
  – AORB_3 but not AORB_2 are YELLOW
• This one test case breaks AORB_3 → AORB_2

Static Refinement of the DMSG
• Static analysis removes the unsound edge between AORB_3 and AORB_2
[Figure: graph fragment before and after refinement, labeled “Refines to”]

“Stubborn” Mutants
• Yao, Harman, and Jia define “stubborn” mutants as those nonequivalent mutants which are not killed by a branch-adequate test set
  – “A Study of Equivalent and Stubborn Mutation Operators Using Human Analysis of Equivalence”, ICSE 2014
[Figure label: 63% kill]
• What’s the relationship between “stubborn” mutants and minimal mutants?
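
Returning to the AORB_2 versus AORB_3 question from the static refinement slides, the color classification can be reproduced by brute force. The sketch below is an illustration under stated assumptions, not the authors' analysis: it takes the two kill conditions as written on the slide, considers only the same-month case, and assumes 1 ≤ day1 ≤ day2 ≤ 31. The name `classify` is hypothetical. On these positive operands, Python's `//` and `%` give the same results as Java's `/` and `%`.

```python
from collections import Counter


def classify(day1: int, day2: int) -> str:
    """Color a (day1, day2) pair by which of AORB_2 and AORB_3 it kills."""
    expected = day2 - day1
    # Kill conditions as stated on the slide (same-month case only).
    kills_aorb_2 = expected != day2 // day1  # (day2-day1) != (day2/day1)
    kills_aorb_3 = expected != day2 % day1   # (day2-day1) != (day2%day1)
    if kills_aorb_2 and kills_aorb_3:
        return "GREEN"
    if kills_aorb_2:
        return "RED"
    if kills_aorb_3:
        return "YELLOW"
    return "BLUE"


# Assumed input domain: same month, 1 <= day1 <= day2 <= 31.
pairs = [(d1, d2) for d1 in range(1, 32) for d2 in range(d1, 32)]
colors = Counter(classify(d1, d2) for d1, d2 in pairs)
yellow = [p for p in pairs if classify(*p) == "YELLOW"]

print(colors)
print(yellow)
```

Under these assumptions the only YELLOW pair is (2, 4), which is consistent with the remark that a single test case breaks AORB_3 → AORB_2. RED pairs also exist (for example day1 = day2), so neither mutant subsumes the other on this domain.
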
[Figure label from the “Stubborn” Mutants slide: 82% kill]

Summary
• We have developed a succinct definition of mutant subsumption, as well as two practical approximations, dynamic and static subsumption
• We have developed a graphical notation for subsumption
• We have investigated some properties of subsumption, including growth patterns of the DMSG and a state machine

Open Questions
• Why are the Java/muJava and C/Proteum subsumption graphs so different?
• Can we analyze static subsumption using Java Pathfinder and differential symbolic execution (or some other tools/techniques)?
• How do we merge dynamic and static MSGs to get closer to the “true” MSG?
• What is the relationship between minimal and “stubborn” mutants?

Related Information
• Establishing Theoretical Minimal Sets of Mutants
  – Paul Ammann, Marcio Delamaro, and Jeff Offutt
  – Tuesday, 11:30-1:00 in the Burlington Room

Questions?
rkurtz2@gmu.edu

Minimal Mutant Operators
• AORB_13
• ROR_16, ROR_20
• ROR_17, AORB_12, AORB_11, AORB_10, AOIS_20, AOIS_22, AOIS_21, AORB_9, AOIS_33, AOIS_34, AOIS_19, LOI_6, LOI_9, ROR_21, ROR_24, ROR_28
• ROR_14, ROR_10
• AORB_19
• AORB_3
• AOIS_46, AOIS_8

Minimal Mutant Operators
Operator  #Minimal  #Total  %Minimal
AOIS      8         70      11.4%
AOIU      0         7       0%
AORB      7         32      21.9%
COI       0         7       0%
COR       0         4       0%
LOI       2         19      10.5%
ROR       8         34      23.5%
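
The graph model and the notion of minimal mutants used throughout the talk can be sketched in a few lines. This is an illustrative Python sketch only, not the tooling used for cal(): the function name `build_dmsg` and the kill data are hypothetical. It groups killed mutants with identical kill sets into nodes and marks a node minimal when no other node's kill set is a proper subset of its own.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def build_dmsg(
    kills: Dict[str, Set[str]]
) -> Tuple[Dict[FrozenSet[str], List[str]], List[FrozenSet[str]]]:
    """Group killed mutants into DMSG nodes and find the minimal nodes.

    Returns (nodes, minimal): `nodes` maps a kill set to the sorted mutants
    sharing it (indistinguished mutants); `minimal` lists the kill sets of
    nodes that no other node subsumes.
    """
    nodes: Dict[FrozenSet[str], List[str]] = {}
    for mutant, tests in kills.items():
        if tests:  # subsumption is not defined for live mutants
            nodes.setdefault(frozenset(tests), []).append(mutant)
    for group in nodes.values():
        group.sort()
    # A node is minimal when no other node's kill set is a proper subset of it.
    minimal = [k for k in nodes if not any(other < k for other in nodes)]
    return nodes, minimal


# Hypothetical kill data; not the cal() results.
kills = {
    "m1": {"t1"},
    "m2": {"t1", "t2"},
    "m3": {"t1", "t2"},  # indistinguished from m2
    "m4": {"t3"},
    "m5": {"t2", "t3"},
    "m6": set(),         # live mutant, left out of the graph
}

nodes, minimal = build_dmsg(kills)
for kill_set in minimal:
    print(sorted(kill_set), nodes[kill_set])
```

In this toy data the minimal nodes are {m1} and {m4}. Tests t1 and t3 kill them, and those two tests also kill m2, m3, and m5, which mirrors the claim that killing the minimal mutants kills all killed mutants.
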