PPTX

advertisement
Delta Debugging
CS 6340
How do we go from this ...
<td align=left valign=top>
<SELECT NAME="op sys" MULTIPLE SIZE=7>
<OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION
VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows
2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac
System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System
8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac
System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION
VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION
VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HPUX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION
VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION
VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION
VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="bug severity" MULTIPLE SIZE=7>
<OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION
VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION
VALUE="enhancement">enhancement</SELECT>
</tr>
</table>
File
Print
Segmentation Fault
2
... to this?
<SELECT>
File
Print
Segmentation Fault
3
Simplification
Once one has reproduced a problem, one must
find out what’s relevant
– Does failure really depend on 10,000 lines of input?
– Does failure really require this exact schedule?
– Does failure really need this sequence of calls?
4
Why Simplify?
• Ease of communication: a simplified test case
is easier to communicate
• Easier debugging: smaller test cases result in
smaller states and shorter executions
• Identify duplicates: simplified test cases
subsume several duplicates
5
Real-World Scenario
• The Mozilla open-source web browser project receives several
dozens bug reports a day
• Each bug report has to be simplified
– Eliminate all details irrelevant to producing the failure
• To facilitate debugging
• To make sure it does not replicate a similar bug report
• In July 1999, Bugzilla listed more than 370 open bug reports
for Mozilla
– These were not even simplified
– Mozilla engineers were overwhelmed with work
– They created the Mozilla BugAThon: a call for volunteers to process
bug reports
Real-World Scenario
• The Mozilla open-source web browser project receives several
dozens bug reports a day
Today: For 15 bug reports simplified, you
• Each bug report has to be simplified
get a Firefox plushie
– Eliminate all details irrelevant to producing the failure
https://developer.mozilla.org/en/Gecko_
• To facilitate debugging
• To make sure it does not replicate a similar bugBugAThon
report
• In July 1999, Bugzilla listed more than 370 open bug reports
for Mozilla
– These were not even simplified
For 5 bug reports simplified, the volunteer
– Mozilla
engineers
overwhelmed
with work
would
be invited
to thewere
launch
party; 20 bugs
– Theyearn
created
the Mozilla
BugAThon:
a call for volunteers to process
would
a T-shirt
signed by
the grateful
bug reports engineers.
7
Your Solution
• How do you solve these problems?
• Binary search
– Cut the test case in half
– Iterate
• Brilliant idea: Why not automate this?
8
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
9
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
✘
10
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
11
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
✘
12
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
13
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
✘
14
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
15
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
✔
16
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
17
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
✘
18
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
✔
19
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
✔
20
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
Simplified input
21
Complex Input
<td align=left valign=top>
<SELECT NAME="op sys" MULTIPLE SIZE=7>
<OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION
VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows
2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac
System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System
8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac
System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION
VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION
VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HPUX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION
VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION
VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION
VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="bug severity" MULTIPLE SIZE=7>
<OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION
VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION
VALUE="enhancement">enhancement</SELECT>
</tr>
</table>
File
Print
Segmentation Fault
22
Simplified Input
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• Simplified from 896 lines to one single line
• Required only 12 tests
23
Binary Search
• <SELECT NAME="priority" MULTIPLE SIZE=7>
24
Binary Search
• <SELECT NAME="priority" MULTIPLE SIZE=7>
✘
25
Binary Search
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• <SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
26
Binary Search
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• <SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
27
Binary Search
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• <SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
What do we do if both halves pass?
28
Change Granularity
Finer
Granularity of test
case
Coarser
Granularity of test
case
Failure of the test
case
Higher
Lower
Progress of the
search
Slower
Faster
29
Change Granularity
Finer
Granularity of test
case
Coarser
Granularity of test
case
Failure of the test
case
Higher
Lower
Progress of the
search
Slower
Faster
30
General Delta-Debugging Algorithm
Basic idea:
1. Start with few & large changes first
∆1
∆2
∆1
∆2
1. If all alternatives pass or are unresolved,
apply more & smaller changes.
∆1
∆2
∆3
∆4
∆5
∆6
∆7
∆8
∆1
∆2
∆3
∆4
∆5
∆6
∆7
∆8
31
Binary Search
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• <SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
32
Binary Search
•
•
•
•
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
✔
33
Binary Search
•
•
•
•
•
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
✔
✘
34
Binary Search
•
•
•
•
•
•
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
✔
✘
✘
35
Binary Search
•
•
•
•
•
•
•
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
✔
✘
✘
✔
36
Inputs and Failures
• Let E be the set of possible inputs
• rP  E corresponds to an input that passes
• rF  E corresponds to an input that fails
37
Binary Search
•
•
•
•
•
•
•
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
✔
✘
✘
✔
Example of rF and rP ?
38
Binary Search
•
•
•
•
•
•
•
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<SELECT NAME="priority" MULTIPLE SIZE=7>
✘
✔
✔
✔
✘
✘
✔
Example of rF and rP ?
rF = <SELECT NALE SIZE=7>
rP = <SELECT NAME="priori
39
Changes
• We can go from one input to another by changes
• A change  is a mapping  : E  E which takes one input and
changes it to another input
• Example: ’ = insert ME="priori at appropriate position
r1 = <SELECT NAty" MULTIPLE SIZE=7>
What is ’(r1)?
<SELECT NAME="priority" MULTIPLE SIZE=7>
40
Decomposing Changes
• A change  can be decomposed to a number of elementary
changes 1, 2, ..., n where
 = 1 o 2 o ... o n and (i o j)(r) = i(j(r))
• For example, deleting a part of the input file can be
decomposed to deleting characters one be one from the file
– in other words: by composing deleting of single characters,
we can get a change that deletes part of the input file
’ = insert ME="priori
’ = 1 o 2 o 3 o 4 o 5 o 6 o 7 o 8 o 9 o 10
– 1 = insert M
– 2 = insert E
– …
41
Summary
• We have an input without failure: rP
• We have an input with failure: rF
• We have a set of changes cF = {1, 2, ..., n } such that:
rF = (1 o 2 o ... o n )(rP)
• Each subset c of cF is a test case
42
Testing Test Cases
• Given a test case c, we would like to know if input generated
by applying changes in c to rP causes a failure
• We define the following function:
test: Powerset(cF)  {P, F, ?}
such that, given c = {1, 2, ..., m}  cF
test(c) = F iff (1 o 2 o ... o m )(rP) is a failing input
43
Minimizing Test Cases
• Now the question is: Can we find the minimal test case c such
that test(c) = F?
• A test case c  cF is called the global minimum of cF if:
for all c’ cF , |c’| < |c|  test(c’)  F
• Global minimum is the smallest set of changes which will make
the program fail
• Finding the global minimum may require us to perform
exponential number of tests
44
Search for 1-minimal Input
• Different problem formulation:
Find a set of changes that cause the failure, but
removing any change causes the failure to go away
• This is 1-minimality
45
Minimizing Test Cases
• A test case c  cF is called a local minimum of cF if:
for all c’ c. test(c’)  F
• A test case c  cF is n-minimal if:
for all c’ c. |c|  |c’|  n  test(c’)  F
• A test case is 1-minimal if:
for all i  c. test(c – {i})  F
46
Naïve Algorithm
• To find a 1-minimal subset of c:
• If for all i  c. test(c – {i})  F, then c is 1-minimal
• Else recurse on c – {} for some   c. test(c – {}) = F
47
Analysis
• In the worst case,
– We remove one element from the set per iteration
– After trying every other element
• Work is potentially
N + (N-1) + (N-2) + …
• This is O(N2)
48
Work Smarter, Not Harder
• We can often do better
• Silly to start out removing 1 element at a time
– Try dividing change set in 2 initially
– Increase # of subsets if we can’t make progress
– If we get lucky, search will converge quickly
49
Minimization Algorithm
• The delta debugging algorithm finds a 1-minimal test case
• It partitions the set cF to 1, 2, ... n
– 1, 2, ... n are pairwise disjoint
– cF = 1  2  ...  n
• Define the complement of i as i = cF  i
• Start with n = 2
• Tests each test case defined by partition and their complements
• Reduce test case if a smaller failure inducing set is found
– otherwise refine the partition, i.e. n = n*2
50
Steps of the Minimization Algorithm
Let n = 2
(A) Start with  as test set
(B) Test each 1, 2, ... n and each 1, 2, ..., n
(C) There are four possible outcomes:
1. Some i causes failure
– Go to step (A) with  = i and n = 2
2. Some i causes failure
– Go to step (A) with  = i and n = n  1
3. No test causes failure
– Refine granularity: Go to (A) with  =  and n = 2n
4. The granularity can no longer be refined
– Done, found the 1-minimal subset
51
Delta Debugging
52
Asymptotic Analysis
• Worst case is still quadratic
• Subdivide until each set is of size 1
– Reduced to the naïve algorithm
• Good news
– For single failure, converges in log N
– Binary search again
53
GNU C Compiler
#define SIZE 20
double mult(double z[], int n) {
int i, j;
i = 0;
for (j = 0; j < n; j++) {
i = i + j + 1;
z[i] = z[i] * (z[0] + 1.0);
}
return z[n];
• This program (bug.c) crashes GCC
2.95.2 when optimization is enabled
• Goal: minimize this program to file a
bug report
• For GCC, a passing program run is the
empty input
• For sake of simplicity, model each
change as insertion of a single
character
– r is running GCC on empty input
– r means running GCC on bug.c
– each change i inserts ith character
of bug.c
}
void copy(double to[], double from[], int count) {
int n = (count + 7) / 8;
switch (count % 8) do {
case 0: *to++ = *from++;
case 7: *to++ = *from++;
case 6: *to++ = *from++;
case 5: *to++ = *from++;
case 4: *to++ = *from++;
case 3: *to++ = *from++;
case 2: *to++ = *from++;
case 1: *to++ = *from++;
} while (--n > 0);
return mult(to, 2);
}
int main(int argc, char *argv[]) {
double x[SIZE], y[SIZE];
double *px = x;
while (px < x + SIZE)
*px++ = (px – x) * (SIZE + 1.0);
54
return copy(y, x, SIZE)
}
GNU C Compiler
• The test procedure:
– creates the appropriate subset of bug.c
– feeds it to GCC
– returns  if GCC crashes,  otherwise
77
755
377
188
55
GNU C Compiler
• The minimized code is:
t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}
…
• The test case is 1-minimal
– No single character can be removed
without removing the failure
– Even every superfluous whitespace
has been removed
– The function name has shrunk from
mult to a single t
– Program has an infinite loop, but GCC
still isn’t supposed to crash
double mult(double z[], int n) {
int i, j;
i = 0;
for (j = 0; j < n; j++) {
i = i + j + 1;
z[i] = z[i] * (z[0] + 1.0);
}
return z[n];
}
• So where could the bug be?
– We already know it is related to optimization
– If we remove –O option to turn off optimization, the failure disappears
56
GNU C Compiler
• The GCC documentation lists 31 options to control
optimization on Linux:
–ffloat-store
–fforce-mem
–fno-inline
–fkeep-static-consts
–fstrength-reduce
–fcse-skip-blocks
–fgcse
–fschedule-insns2
–fcaller-saves
–fmove-all-movables
–fstrict-aliasing
–fno-default-inline
–fforce-addr
–finline-functions
–fno-function-cse
–fthread-jumps
–frerun-cse-after-loop
–fexpensive-optimizations
–ffunction-sections
–funroll-loops
–freduce-all-givs
–fno-defer-pop
–fomit-frame-pointer
–fkeep-inline-functions
–ffast-math
–fcse-follow-jumps
–frerun-loop-opt
–fschedule-insns
–fdata-sections
–funroll-all-loops
–fno-peephole
• It turns out that applying all of these options causes the failure
to disappear
– Some option(s) prevent the failure
57
GNU C Compiler
• Use test case minimization to find the preventing option(s)
– Each i stands for removing a GCC option
– Having all i applied means to run GCC with no option (failing)
– Having no i applied means to run GCC with all options (passing)
• After 7 tests, the single option -ffast-math is found which
prevents the failure
– Unfortunately, it is not a good candidate for a workaround
as it may alter program’s semantics
– Thus, we remove -ffast-math from list of options and repeat
– After 7 tests, we find -fforce-addr also prevents the failure
– Further tests shows none other option prevents the failure
58
GNU C Compiler
This is what we can send to the GCC maintainers:
– The minimal test case
– “The failure only occurs with optimization”
– “-ffast-math and -fforce-addr prevent the failure”
59
Case Studies
• Minimizing Mozilla input
– Print the html file containing <SELECT>
• Minimizing fuzz input
– Fuzzing: feed program with randomly generated
input and check if it crashes
– Used delta debugging to minimize fuzz input
60
Another Application
• Yesterday, my program worked. Today, it does not. Why?
– The new release 4.17 of GDB changed 178,000 lines
– No longer integrated properly with DDD (a graphical front-end)
– How to isolate the change that caused the failure?
61
Summary
• Delta Debugging is a technique, not a tool
• Bad News:
– Probably must be re-implemented for each
significant system
– To exploit knowledge of changes
• Good News:
– Relatively simple algorithm, big payoff
– It is worth re-implementing
62
Download