W. Eric Wong

advertisement
Program Debugging with Effective
Software Fault Localization
W. Eric Wong
Department of Computer Science
The University of Texas at Dallas
ewong@utdallas.edu
http://www.utdallas.edu/~ewong
The work is done in conjunction with
Vidroha Debroy (Microsoft) and Ruizhi Gao (UTD)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
1
Speaker Biographical Sketch
 Professor & Director of International Outreach
Department of Computer Science
University of Texas at Dallas
 Guest Researcher
Computer Security Division
National Institute of Standards and Technology (NIST)
 Vice President, IEEE Reliability Society
 Founder & Steering Committee co-Chair for the SERE conference
(IEEE International Conference on Software Security and Reliability)
(http://paris.utdallas.edu/sere13)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
22
A Simple View of Software Assurance
 Test generation & execution
 Before the observation of any failure
– Analysis
– Prediction
 Reliability
 Fault-prone modules
 Security/safety vulnerability
 etc.
 After a failure is observed
– Bug fixing
– Regression testing
– etc.
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
3
Motivation
 Testing and debugging activities constitute one of the most expensive
aspects of software development
– Often more than 50% of the cost [Hailpern & Santhanam, 2003]
 Manual debugging is…
– Tedious
– Time Consuming
– Error prone
– Prohibitively expensive
Need ways to debug…
automatically
B. Hailpern and P. Santhanam, “Software Debugging, Testing, and Verification,” IBM Systems Journal, 41(1):4-12, 2002
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
4
Debugging Today (1)
 Program debugging consists of three fundamental activities
– Learning that the program has a fault – fault detection
– Finding the location of the fault – fault localization
– Actually removing the fault – fault fixing
 A lot of progress has been made in the area of test case generation and
thus we can assume that we will have a collection of test cases
(i.e., a test set) that can reveal that the program has faults.
– So this talk will not cover the first task (fault detection)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
5
Debugging Today (2)
 Recently fault localization has received a lot of focus
– It is one of the most expensive debugging activities [Vessey, 1985]
Iris Vessy, “Expertise in Debugging Computer Programs: A Process Analysis,” International Journal of
Man-Machine Studies, 23(5):459-494, March1985
 Fault Fixing has also been an important research area
– Have to be very careful not to introduce new faults in the process
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
6
Software Fault Localization :
Our Publications
7
Our Publications (1)






The DStar Method for Effective Software Fault Localization, W. Eric Wong,
V. Debroy and R Gao, IEEE Transactions on Reliability (to appear in
December 2013)
Effective Software Fault Localization using an RBF Neural Network, W. E.
Wong, V. Debroy, R. Golden, X. Xu and B. Thuraisingham, IEEE
Transactions on Reliability, Volume 61, Number 1, Pages 149-169, March
2012
Towards Better Fault Localization: A Crosstab-based Statistical Approach, W.
E. Wong, V. Debroy and D, Xu, IEEE Transactions on Systems, Man, and
Cybernetics -- Part C: Applications & Reviews, Volume 42, Number 3, pp.
378-396, May 2012
A Family of Code Coverage-based Heuristics for Effective Fault Localization,
W. E. Wong, V. Debroy and B. Choi, Journal of Systems and Software,
Volume 83, Issue 2, pp. 188-208, February 2010
(Top cited among all the papers published by JSS in 2010)
Metamorphic Slice: An application in Spectrum-Based Fault Localization, X.
Xie, W. E. Wong, T. Y. Chen, and B. Xu Information and Software
Technology, Volume 55, Issue 5, Pages 866-879, May 2013
A Consensus-based Strategy to Improve the Quality of Fault Localization, V.
Debroy and W. E. Wong, Software: Practice and Experience (accepted for
publication) (Article first published online: 28 November, 2011)(DOI:
10.1002/spe.1146)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
8
Our Publications (2)






BP Neural Network-based Effective Fault Localization, W. E. Wong and Y.
Qi, International Journal of Software Engineering and Knowledge
Engineering, Volume 19, Issue 4, pp. 573-597, June 2009
Effective Program Debugging based on Execution Slices and Inter-Block Data
Dependency, W. E. Wong and Y. Qi, Journal of Systems and Software ,
Volume 79, Number 7, pp. 891-903, July 2006
Ties Within Fault Localization Rankings: Exposing and Addressing the
Problem, X. Xu, V. Debroy, W. E. Wong and D. Guo, International Journal of
Software Engineering and Knowledge Engineering Volume 21, Issue 6, pp.
803-827, September 2011
On the Estimation of Adequate Test Set Size Using Fault Failure Rates
Reference, V. Debroy and W. E. Wong, Journal of Systems and Software
Volume 84, Issue 4, pp. 587-602, April 2011
Using Mutation to Automatically Suggest Fixes for Faulty Programs, V.
Debroy and W. E. Wong, in Proceedings of the 3rd International Conference
on Software Testing, Verification and Validation (ICST), Paris, France, April
2010
Insights on Fault Interference for Programs with Multiple Bugs, V. Debroy and
W. E. Wong, in Proceedings of the 20th IEEE International Symposium on
Software Reliability Engineering (ISSRE), pp. 165-174, Mysuru, Karnataka,
India, November 2009
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
9
Software Fault Localization :
Traditional Approach
10
Commonly Used Techniques
 Insert print statements
 Add assertions or set breakpoints
 Examine core dump or stack trace
Rely on programmers’ intuition and domain expert knowledge
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
11
Execution Dice-based Fault Localization
12
The cSuds Tool Suite
 Also known as IBM C and C++ Testing and Maintenance Toolsuite
http://xsuds.argreenhouse.com
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
13
One Failed and One Successful Test
Possible locations of faults
 Code in the execution dice (top priority)
 Code in the failed execution slice but not in the dice
 A bug is in the failed execution slice (the red  A bug is in the failed execution slice (the red path)
path) but not in the successful execution slice
and in the successful execution slice (the blue path)
(the blue path)
 The dicing-based technique can be effective in locating some program bugs
– H. Agrawal, J. R. Horgan, S. London, and W. E. Wong, “Fault localization using execution slices and
dataflow tests,” in Proceedings of the 6th IEEE International Symposium on Software Reliability
Engineering, pp. 143-151, Toulouse, France, October 1995.
†Authors
‡Number
are listed in alphabetical order
of citations: more than 200 (according to the Google Scholar)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
1414
Suspiciousness Ranking-based
Fault Localization
15
Overview
 Compute the suspiciousness (likelihood of containing bug) of each
statement
 Rank all the executable statements in descending order of their
suspiciousness
 Examine the statements one-by-one from the top of the ranking until the
first faulty statement is located
 Statements with higher suspiciousness should be examined before
statements with lower suspiciousness as the former are more likely to
contain bugs than the latter
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
16
Techniques for Computing Suspiciousness
 Take advantage of code coverage (namely, execution slice) and execution
result of each test (success or failure) for debugging.
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
17
Different Techniques
 Program Spectra-based Fault Localization
 Code Coverage-based Fault Localization
 Statistical Analysis-based Fault Localization
 Neural Network-based Fault Localization
 Similarity Coefficient-based Fault Localization
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
18
Program Spectra-based Fault Localization
19
Tarantula

J. A. Jones and M. J. Harrold, “Empirical evaluation of the Tarantula automatic
fault-localization technique,” in Proc. of the 20th IEEE/ACM Conference on
Automated Software Engineering, pp. 273-282, Long Beach, California, USA,
December, 2005
X/X+Y, X=(NCF/NF) & Y=(NCS/NS)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
20
Ochiai

A. Ochiai, “Zoogeographic studies on the soleoid fishes found in Japan and its
neighboring regions,” Bull. Japan Soc. Sci. Fish 22, 526–530, 1957

R. Abreu, P. Zoeteweij, R. Golsteijn, and A.J.C. van Gemund, “A Practical
Evaluation of Spectrum-based Fault Localization,” Journal of Systems and
Software, 82(11):1780 - 1792, 2009
NCF
N F  ( NCF  NCS )
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
21
Code Coverage-based Fault Localization
&
Calibration
W. E. Wong, V. Debroy and B. Choi, “A Family of Code Coverage-based
Heuristics for Effective Fault Localization,” Journal of Systems and Software,
Volume 83, Issue 2, pp. 188-208, February 2010
(Best Paper Award; COMPSAC 2007)
(The most cited papers among those published by JSS in 2010)
22
Code Coverage-based & Calibration (1)
 Suppose for a large test suite, say 1000 test cases, a majority of them, say
995, are successful test cases and only a small number of failed test cases
(five in this example) will cause an execution failure.
 The challenge is how to use these five failed tests and the 995 successful
tests to conduct an effective debugging.
 How can each additional test case that executes the program successfully
help locate program bugs?
 What about each additional test case that makes the program execution
fail?
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
23
Code Coverage-based & Calibration (2)
 Should all the successful test executions provide the same contribution to
locate software bugs?
 Intuitively, the answer should be “no”
 We propose that with respect to a piece of code, the contribution
introduced by the first successful test that executes it in computing its
likelihood of containing a bug is larger than or equal to that of the second
successful test that executes it, which is larger than or equal to that of the
third successful test that executes it, etc.
 The same also applies to the failed tests.
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
24
Crosstab-based Fault Localization
W. Eric Wong, Vidroha Debroy and Dianxiang Xu, “Towards Better
Fault Localization: A Crosstab-based Statistical Approach,”
IEEE Transactions on Systems, Man, and Cybernetics – Part C:
Applications & Reviews, Volume 42, Number 3, pp. 378-396, May 2012
25
Neural Network-based
Fault Localization
• W. Eric Wong, Vidroha Debroy, Richard Golden, Xiaofeng Xu
and Bhavani Thuraisingham, “Effective Software Fault Localization using
an RBF Neural Network,” IEEE Transactions on Reliability,
Volume 61, Number 1, Pages 149-169, March 2012
• W. Eric Wong and Yu Qi, “BP Neural Network-based Effective
Fault Localization,” International Journal of Software Engineering
and Knowledge Engineering, 19(4): 573-597, June 2009
26
Summary of RBF-based Fault Localization (1)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
27
Summary of RBF-based Fault Localization (2)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
28
Three Novel Aspects
 Introduce a method for representing test cases, statement coverage,
execution results within a modified RBF neural network formalism
– Training with example test cases and execution results
– Testing with virtual test cases
 Develop a novel algorithm to simultaneously estimate the number of
hidden neurons and their receptive field centers
 Instead of using the traditional Euclidean distance which has been
proved to be inappropriate in the fault localization context,
a weighted bit-comparison based distance is defined to measure the
distance between the statement coverage vectors of two test cases.
– Estimate the number of hidden neurons and their receptive field centers
– Compute the output of each hidden neuron
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
29
DStar − A Similarity Coefficient-based
Fault Localization
(IEEE Transactions on Reliability, accepted for publication)
(December 2013 Issue)
30
The Construction of D* (1)
 The suspiciousness assigned to a statement should be
 Intuition 1: directly proportional to the number of failed test cases that
cover it
suspiciousness(s) α NCF
 Intuition 2: inversely proportional to the number of successful test cases
that cover it
suspiciousness(s) α 1/NCS
 Intuition 3: inversely proportional to the number of failed test cases that
do not cover it
suspiciousness(s) α 1/NUF
 Conveniently enough such a coefficient already exists
Kulczynski [Kulczynski, 1928]: NCF /(NCS+NUF)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
31
The Construction of D* (with * = 2) (2)
 However, we also have a fourth intuition …
 Intuition 4: Intuition 1 is the most sound of the other intuitions and should
therefore carry a higher weight.
 Kulczynski does not lead to the realization of the fourth intuition.
 Under the circumstances we might try to do something like this:
suspiciousness(s) 
2  NCF
NUF  NCS
or maybe even
suspiciousness(s) 
100  NCF
NUF  NCS
 But this is not going to help us (as we shall later see)
 So instead we make use of a different coefficient (D*)
suspiciousness(s) 
NCF  NCF
NUF  NCS
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
32
Empirical Evaluation
33
Our Observation
 D* is evaluated across 24 programs and is compared to 38 different fault
localization techniques.
 Both single-fault and multi-fault programs are used.
 Results indicate that D* is more effective at locating faults than all the
other techniques it is compared to.
 An empirical evaluation is also conducted to illustrate how
the effectiveness of D* increases as * grows, and then levels off
when * exceeds a critical value.
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
34
24 Subject Programs
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
35
38 Techniques (1)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
36
38 Techniques (2)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
37
38 Techniques (3)
 Tarantula
 Ochiai
 Ochiai2
 Crosstab
 RBF
 Heuristics H3b and H3c
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
38
Three Evaluation Metrics/Criteria
 Total number of statements examined
– An absolute measure
 The EXAM score: the percentage of code examined
– A relative (graphical) measure
 The Wilcoxon Signed-Rank Test
– A statistical measure
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
39
Ties in the Ranking: Best/Worst
 The suspiciousness assigned to a statement by D* (and other techniques)
may not be unique, i.e., two or more statements can be tied for the same
position in the ranking.
 Assuming a faulty statement and some correct statements are tied
– In the best case we examine the faulty statement first
– In the worst case we examine it last
 For each of the previously discussed evaluation criteria, we will have the
best case and the worst case effectiveness.
– Presenting only the average would have resulted in a loss of information
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
40
Results – Total Number of Statements Examined (1)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
41
Results – Total Number of Statements Examined (2)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
42
Results – Total Number of Statements Examined (3)
 Regardless of whether the best or worst case is considered, D2 is always
the most effective of the similarity coefficient-based fault localization
techniques except for
– the Siemens suite where D2 is the second &
– the worst case of the make program where D2 is the second
(however, D2 is the most effective for the best case)
 This consistent high performance with respect to D2 deserves attention
as the other techniques are not just less effective than D2, but are also less
consistent
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
43
Results – EXAM Score (best case on Unix)
Percentage of Faulty Versions where Fault is Located
100.00%
95.63%
91.25%
86.88%
82.50%
78.13%
73.75%
69.38%
65.00%
D2 Best Case
Mountford Best Case
Kulcynzki Best Case
0%
0%
10%
20%
30%
40%
50%
60%
70%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
44
Results – EXAM Score (worst case on Unix)
Percentage of Faulty Versions where Fault is Located
100.00%
91.88%
83.75%
75.63%
67.50%
59.38%
51.25%
43.13%
35.00%
D2 Worst Case
Mountford Worst Case
Kulcynzki Worst Case
0%
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
45
Results – EXAM Score (best case on Space)
Percentage of Faulty Versions where Fault is Located
100.00%
98.13%
96.25%
94.38%
92.50%
90.63%
88.75%
86.88%
85.00%
D2 Best Case
Harmonic Mean Best Case
Phi Best Case
0%
0%
5%
10%
15%
20%
25%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
46
Results – EXAM Score (worst case on Space)
Percentage of Faulty Versions where Fault is Located
100.00%
97.50%
95.00%
92.50%
90.00%
87.50%
85.00%
82.50%
80.00%
D2 Worst Case
Harmonic Mean Worst Case
Phi Worst Case
0%
0%
5%
10%
15%
20%
25%
30%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
47
Results – EXAM Score (best case on Ant)
Percentage of Faulty Versions where Fault is Located
100.00%
96.25%
92.50%
88.75%
85.00%
81.25%
77.50%
73.75%
70.00%
D2 Best Case
Pearson Best Case
Phi Best Case
0%
0.00%
0.10%
0.20%
0.30%
0.40%
0.50%
0.60%
0.70%
0.80%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
48
Results – EXAM Score (worst case on Ant)
Percentage of Faulty Versions where Fault is Located
100%
96.25%
92.50%
88.75%
85.00%
81.25%
77.50%
73.75%
70.00%
D2 Worst Case
Pearson Worst Case
Phi Worst Case
0%
0.00%
0.10%
0.20%
0.30%
0.40%
0.50%
0.60%
0.70%
0.80%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
49
Results – Wilcoxon Signed-Rank Test
 Confidence with which it can be claimed that D2 is more effective than
other techniques (best and worst cases): 99.99 % in most cases.
 If the alternative hypothesis is modified to include the equality, i.e., D2 is
more effective than or at least as effective as the other techniques:
100% in almost every case.
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
50
More Discussion on D*
 D* with a higher value for the *
 Compare D* with other fault localization techniques
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
51
Effectiveness of D*
 The effectiveness of D* for the make program increases until it levels off
as the value of * increases.
 A similar observation also applies to other programs.
Total number of statements examined for make
14000
D* Worst
12000
10000
D* Best
8000
6000
4000
2000
0
2
8
14
20
26
32
38
44
50
Star
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
52
Results – Total Number of Statements Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
53
Programs with Multiple Faults
54
Tow Approaches
 One bug at a time
 Using “fault-focused” clustering
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
55
One Bug at A Time
Program (P)
Test set (T)
Examining code
until the first bug
is located and
fixed
Suspiciousness ranking
Program
execution & data
collection
At least one failure
Computing
suspiciousness of
each statement &
generating a
ranking
Execution traces
& execution results
No failure
Process
Termination
 Only one bug is fixed for each iteration.
 All the test cases have to be re-executed at the beginning of each iteration.
 The number of bugs is the same as the number of iterations.
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
56
Fault-Focused Clustering-based Technique (1)
Process Termination
Program (P)
Test set (T)
failed test cases fi (i=1,2,3,…n)
Program execution
& data collection
successful test cases (S)
Suspiciousness rankings
ri (i=1,2,3,…n)
Estimation of the # of
clusters & their
centers
Estimated clusters &
their centers
Fixing the first
bug suggested by
each ranking
Clustering
f1 ∪ S
f2 ∪ S
r1
r2
...
fn ∪ S
rn
Fault-focused clusters
Ci (i=1,2,3,…k)
Fault-focused
suspiciousness rankings
Ri (i=1,2,3,…k)
Using failed test cases in
each cluster along with
all the successful test
cases to generate
“fault-focused”
rankings
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
57
Fault-Focused Clustering-based Technique (2)
 More than one bug can be fixed during each iteration.
 The number of iterations is reduced.
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
58
Ongoing Case Study using Programs with Multiple Bugs
 40 fault localization techniques
–
–
–
–
–
–
–
–
D*
Tarantula
Ochiai
RBF
H3B and H3C
O and Op
Crosstab
31 similarity coefficient-based
 23 Programs
–
–
–
–
–
–
–
–
gzip
grep
make (more than 20K lines of code) (Not artificial bugs)
Ant (more than 75K lines of code) (Not artificial bugs)
space
flex
Siemens
Unix
 Each program has multiple faults
– 50 distinct versions of 3-bug, 4-bug, 5-bug, 6-bug and 7-bug
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
59
Some Results (1)
 50 4-bug distinct versions of grep
 Crosstab for fault-focused clustering-based technique
 OBA: one bug at a time using Crosstab, Ochiai, and RBF, respectively
Best Average
Worst Average
Number of
Iterations
Fault-focused
159.34
435.78
1.94
Crosstab-OBA
257.56
501.04
4
Ochiai-OBA
485.94
793.18
4
RBF-OBA
455.04
763.82
4
“Fault-focused” clustering-based technique is better not only in terms of
effectiveness (smaller average number of statements need to be examined) but
also in terms of efficiency (smaller number of iterations)
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
60
Some Results (2)
 Exam Score (best case)
Percentage of Faulty Versions where Faults are Located
100%
90%
80%
70%
60%
50%
40%
30%
Fault-focused Best Case
20%
Crosstab-OBA Best Case
Ochiai-OBA Best Case
10%
RBF-OBA Best Case
0%
0%
5%
10%
15%
20%
25%
30%
35%
40%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
61
Some Results (3)
 Exam Score (worst case)
Percentage of Faulty Versions where Faults are Located
100%
90%
80%
70%
60%
50%
40%
30%
Fault-focused Worst Case
20%
Crosstab-OBA Worst Case
Ochiai-OBA Worst Case
10%
RBF-OBA Worst Case
0%
0%
5%
10%
15%
20%
25%
30%
35%
40%
45%
50%
Percentage of Code Examined
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
62
Conclusion
63
What We Have Discussed
 Existing and new fault localization techniques
– Many of them use the same information (statement coverage and execution
results) to identify suspicious code likely to contain program bug(s)
 D* is very efficient and can be automated
– Via an evaluation across 24 different programs, D* is shown to be more
effective than 38 other fault localization techniques (including 31 similarity
coefficient-based techniques, and 7 other contemporaries (Crosstab, RBF,
H3b and H3c, Tarantula, Ochiai, and Ochiai2).
– Upper bound on effectiveness:
 The
effectiveness of D* increases along with * before leveling off after * passes a
critical point.
 Debugging programs with multiple bugs
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
64
Automatic Bug Fixing
65
Our Approach (JSS paper)
 In the context of program debugging we have discussed mutation
and fault localization
Fault Localization
Mutation
+
The Good: Can result in potential
fixes for faulty programs
automatically.
The Good: Can potentially
identify the exact location of a
fault in a program.
The Bad: We have no idea as to
where in a program a fault is, and
so we do not know how to proceed.
Randomly examining mutants can
be prohibitively expensive.
The Bad: Even if we locate the
fault, we have no idea as to how
to fix the fault. This is left solely
as the responsibility of the
programmers/debuggers.
So…what if we combined these two?
Program Debugging with Effective Software Fault Localization (© 2012 Professor W. Eric Wong, The University of Texas at Dallas)
66
Questions ?
67
Download