Program Debugging with Effective Software Fault Localization W. Eric Wong Department of Computer Science The University of Texas at Dallas ewong@utdallas.edu http://www.utdallas.edu/~ewong The work is done in conjunction with Vidroha Debroy (Microsoft) and Ruizhi Gao (UTD) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 1 Speaker Biographical Sketch Professor & Director of International Outreach Department of Computer Science University of Texas at Dallas Guest Researcher Computer Security Division National Institute of Standards and Technology (NIST) Vice President, IEEE Reliability Society Founder & Steering Committee co-Chair for the SERE conference (IEEE International Conference on Software Security and Reliability) (http://paris.utdallas.edu/sere13) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 22 A Simple View of Software Assurance Test generation & execution Before the observation of any failure – Analysis – Prediction Reliability Fault-prone modules Security/safety vulnerability etc. After a failure is observed – Bug fixing – Regression testing – etc. Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 3 Motivation Testing and debugging activities constitute one of the most expensive aspects of software development – Often more than 50% of the cost [Hailpern & Santhanam, 2003] Manual debugging is… – Tedious – Time Consuming – Error prone – Prohibitively expensive Need ways to debug… automatically B. Hailpern and P. Santhanam, “Software Debugging, Testing, and Verification,” IBM Systems Journal, 41(1):4-12, 2002 Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 4 Debugging Today (1) Program debugging consists of three fundamental activities – Learning that the program has a fault – fault detection – Finding the location of the fault – fault localization – Actually removing the fault – fault fixing A lot of progress has been made in the area of test case generation and thus we can assume that we will have a collection of test cases (i.e., a test set) that can reveal that the program has faults. – So this talk will not cover the first task (fault detection) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 5 Debugging Today (2) Recently fault localization has received a lot of focus – It is one of the most expensive debugging activities [Vessey, 1985] Iris Vessy, “Expertise in Debugging Computer Programs: A Process Analysis,” International Journal of Man-Machine Studies, 23(5):459-494, March1985 Fault Fixing has also been an important research area – Have to be very careful not to introduce new faults in the process Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 6 Software Fault Localization : Our Publications 7 Our Publications (1) The DStar Method for Effective Software Fault Localization, W. Eric Wong, V. Debroy and R Gao, IEEE Transactions on Reliability (to appear in December 2013) Effective Software Fault Localization using an RBF Neural Network, W. E. Wong, V. Debroy, R. Golden, X. Xu and B. Thuraisingham, IEEE Transactions on Reliability, Volume 61, Number 1, Pages 149-169, March 2012 Towards Better Fault Localization: A Crosstab-based Statistical Approach, W. E. Wong, V. Debroy and D, Xu, IEEE Transactions on Systems, Man, and Cybernetics -- Part C: Applications & Reviews, Volume 42, Number 3, pp. 378-396, May 2012 A Family of Code Coverage-based Heuristics for Effective Fault Localization, W. E. Wong, V. Debroy and B. Choi, Journal of Systems and Software, Volume 83, Issue 2, pp. 188-208, February 2010 (Top cited among all the papers published by JSS in 2010) Metamorphic Slice: An application in Spectrum-Based Fault Localization, X. Xie, W. E. Wong, T. Y. Chen, and B. Xu Information and Software Technology, Volume 55, Issue 5, Pages 866-879, May 2013 A Consensus-based Strategy to Improve the Quality of Fault Localization, V. Debroy and W. E. Wong, Software: Practice and Experience (accepted for publication) (Article first published online: 28 November, 2011)(DOI: 10.1002/spe.1146) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 8 Our Publications (2) BP Neural Network-based Effective Fault Localization, W. E. Wong and Y. Qi, International Journal of Software Engineering and Knowledge Engineering, Volume 19, Issue 4, pp. 573-597, June 2009 Effective Program Debugging based on Execution Slices and Inter-Block Data Dependency, W. E. Wong and Y. Qi, Journal of Systems and Software , Volume 79, Number 7, pp. 891-903, July 2006 Ties Within Fault Localization Rankings: Exposing and Addressing the Problem, X. Xu, V. Debroy, W. E. Wong and D. Guo, International Journal of Software Engineering and Knowledge Engineering Volume 21, Issue 6, pp. 803-827, September 2011 On the Estimation of Adequate Test Set Size Using Fault Failure Rates Reference, V. Debroy and W. E. Wong, Journal of Systems and Software Volume 84, Issue 4, pp. 587-602, April 2011 Using Mutation to Automatically Suggest Fixes for Faulty Programs, V. Debroy and W. E. Wong, in Proceedings of the 3rd International Conference on Software Testing, Verification and Validation (ICST), Paris, France, April 2010 Insights on Fault Interference for Programs with Multiple Bugs, V. Debroy and W. E. Wong, in Proceedings of the 20th IEEE International Symposium on Software Reliability Engineering (ISSRE), pp. 165-174, Mysuru, Karnataka, India, November 2009 Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 9 Software Fault Localization : Traditional Approach 10 Commonly Used Techniques Insert print statements Add assertions or set breakpoints Examine core dump or stack trace Rely on programmers’ intuition and domain expert knowledge Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 11 Execution Dice-based Fault Localization 12 The cSuds Tool Suite Also known as IBM C and C++ Testing and Maintenance Toolsuite http://xsuds.argreenhouse.com Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 13 One Failed and One Successful Test Possible locations of faults Code in the execution dice (top priority) Code in the failed execution slice but not in the dice A bug is in the failed execution slice (the red A bug is in the failed execution slice (the red path) path) but not in the successful execution slice and in the successful execution slice (the blue path) (the blue path) The dicing-based technique can be effective in locating some program bugs – H. Agrawal, J. R. Horgan, S. London, and W. E. Wong, “Fault localization using execution slices and dataflow tests,” in Proceedings of the 6th IEEE International Symposium on Software Reliability Engineering, pp. 143-151, Toulouse, France, October 1995. †Authors ‡Number are listed in alphabetical order of citations: more than 200 (according to the Google Scholar) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 1414 Suspiciousness Ranking-based Fault Localization 15 Overview Compute the suspiciousness (likelihood of containing bug) of each statement Rank all the executable statements in descending order of their suspiciousness Examine the statements one-by-one from the top of the ranking until the first faulty statement is located Statements with higher suspiciousness should be examined before statements with lower suspiciousness as the former are more likely to contain bugs than the latter Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 16 Techniques for Computing Suspiciousness Take advantage of code coverage (namely, execution slice) and execution result of each test (success or failure) for debugging. Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 17 Different Techniques Program Spectra-based Fault Localization Code Coverage-based Fault Localization Statistical Analysis-based Fault Localization Neural Network-based Fault Localization Similarity Coefficient-based Fault Localization Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 18 Program Spectra-based Fault Localization 19 Tarantula J. A. Jones and M. J. Harrold, “Empirical evaluation of the Tarantula automatic fault-localization technique,” in Proc. of the 20th IEEE/ACM Conference on Automated Software Engineering, pp. 273-282, Long Beach, California, USA, December, 2005 X/X+Y, X=(NCF/NF) & Y=(NCS/NS) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 20 Ochiai A. Ochiai, “Zoogeographic studies on the soleoid fishes found in Japan and its neighboring regions,” Bull. Japan Soc. Sci. Fish 22, 526–530, 1957 R. Abreu, P. Zoeteweij, R. Golsteijn, and A.J.C. van Gemund, “A Practical Evaluation of Spectrum-based Fault Localization,” Journal of Systems and Software, 82(11):1780 - 1792, 2009 NCF N F ( NCF NCS ) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 21 Code Coverage-based Fault Localization & Calibration W. E. Wong, V. Debroy and B. Choi, “A Family of Code Coverage-based Heuristics for Effective Fault Localization,” Journal of Systems and Software, Volume 83, Issue 2, pp. 188-208, February 2010 (Best Paper Award; COMPSAC 2007) (The most cited papers among those published by JSS in 2010) 22 Code Coverage-based & Calibration (1) Suppose for a large test suite, say 1000 test cases, a majority of them, say 995, are successful test cases and only a small number of failed test cases (five in this example) will cause an execution failure. The challenge is how to use these five failed tests and the 995 successful tests to conduct an effective debugging. How can each additional test case that executes the program successfully help locate program bugs? What about each additional test case that makes the program execution fail? Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 23 Code Coverage-based & Calibration (2) Should all the successful test executions provide the same contribution to locate software bugs? Intuitively, the answer should be “no” We propose that with respect to a piece of code, the contribution introduced by the first successful test that executes it in computing its likelihood of containing a bug is larger than or equal to that of the second successful test that executes it, which is larger than or equal to that of the third successful test that executes it, etc. The same also applies to the failed tests. Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 24 Crosstab-based Fault Localization W. Eric Wong, Vidroha Debroy and Dianxiang Xu, “Towards Better Fault Localization: A Crosstab-based Statistical Approach,” IEEE Transactions on Systems, Man, and Cybernetics – Part C: Applications & Reviews, Volume 42, Number 3, pp. 378-396, May 2012 25 Neural Network-based Fault Localization • W. Eric Wong, Vidroha Debroy, Richard Golden, Xiaofeng Xu and Bhavani Thuraisingham, “Effective Software Fault Localization using an RBF Neural Network,” IEEE Transactions on Reliability, Volume 61, Number 1, Pages 149-169, March 2012 • W. Eric Wong and Yu Qi, “BP Neural Network-based Effective Fault Localization,” International Journal of Software Engineering and Knowledge Engineering, 19(4): 573-597, June 2009 26 Summary of RBF-based Fault Localization (1) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 27 Summary of RBF-based Fault Localization (2) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 28 Three Novel Aspects Introduce a method for representing test cases, statement coverage, execution results within a modified RBF neural network formalism – Training with example test cases and execution results – Testing with virtual test cases Develop a novel algorithm to simultaneously estimate the number of hidden neurons and their receptive field centers Instead of using the traditional Euclidean distance which has been proved to be inappropriate in the fault localization context, a weighted bit-comparison based distance is defined to measure the distance between the statement coverage vectors of two test cases. – Estimate the number of hidden neurons and their receptive field centers – Compute the output of each hidden neuron Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 29 DStar − A Similarity Coefficient-based Fault Localization (IEEE Transactions on Reliability, accepted for publication) (December 2013 Issue) 30 The Construction of D* (1) The suspiciousness assigned to a statement should be Intuition 1: directly proportional to the number of failed test cases that cover it suspiciousness(s) α NCF Intuition 2: inversely proportional to the number of successful test cases that cover it suspiciousness(s) α 1/NCS Intuition 3: inversely proportional to the number of failed test cases that do not cover it suspiciousness(s) α 1/NUF Conveniently enough such a coefficient already exists Kulczynski [Kulczynski, 1928]: NCF /(NCS+NUF) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 31 The Construction of D* (with * = 2) (2) However, we also have a fourth intuition … Intuition 4: Intuition 1 is the most sound of the other intuitions and should therefore carry a higher weight. Kulczynski does not lead to the realization of the fourth intuition. Under the circumstances we might try to do something like this: suspiciousness(s) 2 NCF NUF NCS or maybe even suspiciousness(s) 100 NCF NUF NCS But this is not going to help us (as we shall later see) So instead we make use of a different coefficient (D*) suspiciousness(s) NCF NCF NUF NCS Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 32 Empirical Evaluation 33 Our Observation D* is evaluated across 24 programs and is compared to 38 different fault localization techniques. Both single-fault and multi-fault programs are used. Results indicate that D* is more effective at locating faults than all the other techniques it is compared to. An empirical evaluation is also conducted to illustrate how the effectiveness of D* increases as * grows, and then levels off when * exceeds a critical value. Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 34 24 Subject Programs Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 35 38 Techniques (1) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 36 38 Techniques (2) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 37 38 Techniques (3) Tarantula Ochiai Ochiai2 Crosstab RBF Heuristics H3b and H3c Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 38 Three Evaluation Metrics/Criteria Total number of statements examined – An absolute measure The EXAM score: the percentage of code examined – A relative (graphical) measure The Wilcoxon Signed-Rank Test – A statistical measure Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 39 Ties in the Ranking: Best/Worst The suspiciousness assigned to a statement by D* (and other techniques) may not be unique, i.e., two or more statements can be tied for the same position in the ranking. Assuming a faulty statement and some correct statements are tied – In the best case we examine the faulty statement first – In the worst case we examine it last For each of the previously discussed evaluation criteria, we will have the best case and the worst case effectiveness. – Presenting only the average would have resulted in a loss of information Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 40 Results – Total Number of Statements Examined (1) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 41 Results – Total Number of Statements Examined (2) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 42 Results – Total Number of Statements Examined (3) Regardless of whether the best or worst case is considered, D2 is always the most effective of the similarity coefficient-based fault localization techniques except for – the Siemens suite where D2 is the second & – the worst case of the make program where D2 is the second (however, D2 is the most effective for the best case) This consistent high performance with respect to D2 deserves attention as the other techniques are not just less effective than D2, but are also less consistent Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 43 Results – EXAM Score (best case on Unix) Percentage of Faulty Versions where Fault is Located 100.00% 95.63% 91.25% 86.88% 82.50% 78.13% 73.75% 69.38% 65.00% D2 Best Case Mountford Best Case Kulcynzki Best Case 0% 0% 10% 20% 30% 40% 50% 60% 70% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 44 Results – EXAM Score (worst case on Unix) Percentage of Faulty Versions where Fault is Located 100.00% 91.88% 83.75% 75.63% 67.50% 59.38% 51.25% 43.13% 35.00% D2 Worst Case Mountford Worst Case Kulcynzki Worst Case 0% 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 45 Results – EXAM Score (best case on Space) Percentage of Faulty Versions where Fault is Located 100.00% 98.13% 96.25% 94.38% 92.50% 90.63% 88.75% 86.88% 85.00% D2 Best Case Harmonic Mean Best Case Phi Best Case 0% 0% 5% 10% 15% 20% 25% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 46 Results – EXAM Score (worst case on Space) Percentage of Faulty Versions where Fault is Located 100.00% 97.50% 95.00% 92.50% 90.00% 87.50% 85.00% 82.50% 80.00% D2 Worst Case Harmonic Mean Worst Case Phi Worst Case 0% 0% 5% 10% 15% 20% 25% 30% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 47 Results – EXAM Score (best case on Ant) Percentage of Faulty Versions where Fault is Located 100.00% 96.25% 92.50% 88.75% 85.00% 81.25% 77.50% 73.75% 70.00% D2 Best Case Pearson Best Case Phi Best Case 0% 0.00% 0.10% 0.20% 0.30% 0.40% 0.50% 0.60% 0.70% 0.80% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 48 Results – EXAM Score (worst case on Ant) Percentage of Faulty Versions where Fault is Located 100% 96.25% 92.50% 88.75% 85.00% 81.25% 77.50% 73.75% 70.00% D2 Worst Case Pearson Worst Case Phi Worst Case 0% 0.00% 0.10% 0.20% 0.30% 0.40% 0.50% 0.60% 0.70% 0.80% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 49 Results – Wilcoxon Signed-Rank Test Confidence with which it can be claimed that D2 is more effective than other techniques (best and worst cases): 99.99 % in most cases. If the alternative hypothesis is modified to include the equality, i.e., D2 is more effective than or at least as effective as the other techniques: 100% in almost every case. Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 50 More Discussion on D* D* with a higher value for the * Compare D* with other fault localization techniques Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 51 Effectiveness of D* The effectiveness of D* for the make program increases until it levels off as the value of * increases. A similar observation also applies to other programs. Total number of statements examined for make 14000 D* Worst 12000 10000 D* Best 8000 6000 4000 2000 0 2 8 14 20 26 32 38 44 50 Star Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 52 Results – Total Number of Statements Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 53 Programs with Multiple Faults 54 Tow Approaches One bug at a time Using “fault-focused” clustering Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 55 One Bug at A Time Program (P) Test set (T) Examining code until the first bug is located and fixed Suspiciousness ranking Program execution & data collection At least one failure Computing suspiciousness of each statement & generating a ranking Execution traces & execution results No failure Process Termination Only one bug is fixed for each iteration. All the test cases have to be re-executed at the beginning of each iteration. The number of bugs is the same as the number of iterations. Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 56 Fault-Focused Clustering-based Technique (1) Process Termination Program (P) Test set (T) failed test cases fi (i=1,2,3,…n) Program execution & data collection successful test cases (S) Suspiciousness rankings ri (i=1,2,3,…n) Estimation of the # of clusters & their centers Estimated clusters & their centers Fixing the first bug suggested by each ranking Clustering f1 ∪ S f2 ∪ S r1 r2 ... fn ∪ S rn Fault-focused clusters Ci (i=1,2,3,…k) Fault-focused suspiciousness rankings Ri (i=1,2,3,…k) Using failed test cases in each cluster along with all the successful test cases to generate “fault-focused” rankings Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 57 Fault-Focused Clustering-based Technique (2) More than one bug can be fixed during each iteration. The number of iterations is reduced. Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 58 Ongoing Case Study using Programs with Multiple Bugs 40 fault localization techniques – – – – – – – – D* Tarantula Ochiai RBF H3B and H3C O and Op Crosstab 31 similarity coefficient-based 23 Programs – – – – – – – – gzip grep make (more than 20K lines of code) (Not artificial bugs) Ant (more than 75K lines of code) (Not artificial bugs) space flex Siemens Unix Each program has multiple faults – 50 distinct versions of 3-bug, 4-bug, 5-bug, 6-bug and 7-bug Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 59 Some Results (1) 50 4-bug distinct versions of grep Crosstab for fault-focused clustering-based technique OBA: one bug at a time using Crosstab, Ochiai, and RBF, respectively Best Average Worst Average Number of Iterations Fault-focused 159.34 435.78 1.94 Crosstab-OBA 257.56 501.04 4 Ochiai-OBA 485.94 793.18 4 RBF-OBA 455.04 763.82 4 “Fault-focused” clustering-based technique is better not only in terms of effectiveness (smaller average number of statements need to be examined) but also in terms of efficiency (smaller number of iterations) Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 60 Some Results (2) Exam Score (best case) Percentage of Faulty Versions where Faults are Located 100% 90% 80% 70% 60% 50% 40% 30% Fault-focused Best Case 20% Crosstab-OBA Best Case Ochiai-OBA Best Case 10% RBF-OBA Best Case 0% 0% 5% 10% 15% 20% 25% 30% 35% 40% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 61 Some Results (3) Exam Score (worst case) Percentage of Faulty Versions where Faults are Located 100% 90% 80% 70% 60% 50% 40% 30% Fault-focused Worst Case 20% Crosstab-OBA Worst Case Ochiai-OBA Worst Case 10% RBF-OBA Worst Case 0% 0% 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% Percentage of Code Examined Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 62 Conclusion 63 What We Have Discussed Existing and new fault localization techniques – Many of them use the same information (statement coverage and execution results) to identify suspicious code likely to contain program bug(s) D* is very efficient and can be automated – Via an evaluation across 24 different programs, D* is shown to be more effective than 38 other fault localization techniques (including 31 similarity coefficient-based techniques, and 7 other contemporaries (Crosstab, RBF, H3b and H3c, Tarantula, Ochiai, and Ochiai2). – Upper bound on effectiveness: The effectiveness of D* increases along with * before leveling off after * passes a critical point. Debugging programs with multiple bugs Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 64 Automatic Bug Fixing 65 Our Approach (JSS paper) In the context of program debugging we have discussed mutation and fault localization Fault Localization Mutation + The Good: Can result in potential fixes for faulty programs automatically. The Good: Can potentially identify the exact location of a fault in a program. The Bad: We have no idea as to where in a program a fault is, and so we do not know how to proceed. Randomly examining mutants can be prohibitively expensive. The Bad: Even if we locate the fault, we have no idea as to how to fix the fault. This is left solely as the responsibility of the programmers/debuggers. So…what if we combined these two? Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas) 66 Questions ? 67