50.530: Software Engineering
Sun Jun
SUTD
Course Outline
Date    Topic
Sep 15  Introduction
Sep 22  Automatic Testing
Sep 29  Delta Debugging
Oct 13  Bug Localization
Oct 20  Specification Mining
Nov 3   Race Detection
Nov 10  Hoare Logic and Proving
Nov 17  Symbolic Execution
Nov 24  Invariant Generation
Dec 1   Software Model Checking
Dec 12  Rely Guarantee Reasoning
Dec 19  Final Exam
Remarks
Debugging
Verification
Dec 15, 10 - 12
Week 5: Specification Mining for Debugging
Where is the bug?
Where the bug is depends on what the programmer wants at
each step. How do we know what the programmer wants?
We “find out” what the programmer wants, borrowing
ideas and techniques from machine learning.
Sahoo et al. ASPLOS 2013
USING LIKELY INVARIANTS FOR AUTOMATED SOFTWARE FAULT LOCALIZATION
The Idea
Delta Debugging can be inefficient and hard to scale because it compares a pair of concrete
program states: the two states differ in too many places, and the differences are too
detailed.
Good
Bad
The Idea
In fact, the details don’t matter. The fact that the graph is cyclic matters.
The Idea
1. Generate more passed test cases
Good
Good
Good
The Idea
2. Generate likely invariants
At L, x = 1 and y = -2
At L, x = 2 and y = 0
At L, x = 3 and y = 1
Likely invariant at L: 1<=x<=3 and -2<=y<=1
What forms of invariants do I use?
The Idea
3. Test the likely invariant with the failed test
Bad: At L, x = 50 and y = 0
Likely invariant at L: 1<=x<=3 and -2<=y<=1
The failed run violates the invariant, since x = 50 is outside 1<=x<=3.
L is a candidate root cause of the bug!
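Steps 2 and 3 can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class RangeInvariant and its methods infer and holds are hypothetical names, and the sketch handles a single integer variable at a single location. The values are the ones shown for location L above.

```java
import java.util.List;

// Hypothetical helper: a closed integer interval [lo, hi] learned from passing runs.
final class RangeInvariant {
    final int lo;
    final int hi;

    RangeInvariant(int lo, int hi) {
        this.lo = lo;
        this.hi = hi;
    }

    // Infer the tightest range that covers every value observed at one program point.
    static RangeInvariant infer(List<Integer> observed) {
        if (observed.isEmpty()) {
            throw new IllegalArgumentException("need at least one observation");
        }
        int lo = Integer.MAX_VALUE;
        int hi = Integer.MIN_VALUE;
        for (int v : observed) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        return new RangeInvariant(lo, hi);
    }

    // True if a value seen in another run stays inside the learned range.
    boolean holds(int value) {
        return lo <= value && value <= hi;
    }

    public static void main(String[] args) {
        // Values of x and y at L in the three passing runs.
        RangeInvariant x = infer(List.of(1, 2, 3));
        RangeInvariant y = infer(List.of(-2, 0, 1));
        // Failing run: x = 50 and y = 0.
        System.out.println("x invariant holds: " + x.holds(50));
        System.out.println("y invariant holds: " + y.holds(0));
    }
}
```

A location becomes a candidate root cause as soon as one of its invariants fails on the failing run; here only the invariant on x fails.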
The Idea
4. Reduce the candidate root causes
• Dynamic program slicing: finding out which
statements affect the candidate root cause
• Dynamic dependence filtering: given two root
causes A and B, if B is affected by A and A
comes earlier, A is more likely the real cause.
Overall Picture
How to generate inputs?
What invariants to generate?
How to conclude one
candidate root cause is more
likely than the other?
Where is the bug?
It fails when the date is
0000-Jan-01.
From MySQL database server
1. Generate Inputs
• The inputs should be “close” to the failure
input, in the spirit of “nearest neighbor”.
• Systematically generate inputs based on the
DD algorithm.
The initial good inputs + good inputs generated from DD
A queue of good inputs to generate more good inputs from
A list of good inputs
Algorithm 1
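The queue-based idea can be sketched as follows. This is not the paper's Algorithm 1, only a simplified worklist under loud assumptions: the class NearbyInputs is hypothetical, new inputs are produced by deleting one character at a time (a crude stand-in for DD-style reduction), and the predicate passes stands in for actually running the program and checking that it does not fail.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class NearbyInputs {
    // All inputs obtained by deleting exactly one character from s.
    static List<String> deleteOne(String s) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            out.add(s.substring(0, i) + s.substring(i + 1));
        }
        return out;
    }

    // Collect up to `limit` passing inputs, exploring variants of the failing input first.
    static List<String> generate(String failing, Predicate<String> passes, int limit) {
        Set<String> good = new LinkedHashSet<>();  // the list of good inputs
        Set<String> seen = new HashSet<>();        // avoid re-running the same input
        Deque<String> queue = new ArrayDeque<>();  // inputs to generate more inputs from
        queue.add(failing);                        // the failing input is only a seed
        seen.add(failing);
        while (!queue.isEmpty() && good.size() < limit) {
            String current = queue.poll();
            for (String candidate : deleteOne(current)) {
                if (good.size() >= limit) break;
                if (!seen.add(candidate)) continue;
                if (passes.test(candidate)) {
                    good.add(candidate);
                    queue.add(candidate);
                }
            }
        }
        return new ArrayList<>(good);
    }

    public static void main(String[] args) {
        // Toy oracle (assumption): the program fails only when the year is 0000.
        Predicate<String> passes = s -> !s.startsWith("0000");
        System.out.println(generate("0000-01-01", passes, 3));
    }
}
```

Because the queue is first-in first-out, inputs one deletion away from the failing input are tried before inputs two deletions away, which keeps the good inputs close to the failure.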
Consider the input SELECT DATE_FORMAT("0000-01-01", '%W %d %M %Y')
from the MySQL example: does Algorithm 1 work?
If a specification of the input format is given, we can generate
better, more meaningful inputs.
Generate new inputs based on type
Algorithm 2
Research Discussion
How do we guarantee that the generated inputs
are close to the failure input?
Can we generate inputs at a program
point closer to the failure?
2. Generate Invariants
• The invariant should rightly “guess” what the
programmer wants somewhere in the
program.
– Where do we generate invariants?
– What form should the invariants take?
Invariant: The returned
value must be positive.
How should we know this?
2. Generate Invariants
• Where do we generate invariants?
– (in the paper) load, store and function return
instructions.
• Load: array[i] * 5 + 2
• Store: array[i] = array[k] + 100;
• Return: return x + y;
How would you justify this?
What is the consequence?
2. Generate Invariants
• What form should the invariants take?
– (in the paper) a range invariant, e.g., x in [1..5]
How would you justify this?
Overall Picture
How to generate inputs?
What invariants to generate?
How to conclude one
candidate root cause is more
likely than the other?
4. Reduce Candidate Causes
• Using dynamic program slicing: given a
statement S, the backward slice of S contains
all statements which S depends on.
– S is data dependent on a preceding statement if S
reads a value written by that statement.
– S is control dependent on a preceding statement
if the outcome of the latter determines whether S
is executed.
Remove all those candidate causes which the initial failure
statement does not depend on.
Dynamic Program Slicing
int[] previous = new int[5];
public int max (int[] list) {
    int max = list[0];
    for (int i = 1; i < list.length-1; i++) {
        if (max < list[i]) {
            max = list[i];
        }
    }
    previous[0] = max;
    return max;
}
So if the value of returned max caused a failure,
“previous[0] = max” should not be a candidate
cause.
Execution trace of max (one loop iteration):
public int max (int[] list)
int max = list[0]
int i = 1
i < list.length-1
if (max < list[i])
max = list[i]
i++
i < list.length-1
previous[0] = max
return max
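A backward dynamic slice over such a trace can be computed with a small amount of code. The sketch below is a simplification, not the tool used in the paper: the class DynamicSlicer, the Event record of what each executed statement reads and writes, and the hand-written trace for the call max(new int[] {3, 7, 5}) are all assumptions made for illustration. Arrays are treated as single variables, and control dependences are supplied by hand.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DynamicSlicer {
    // One executed statement instance (all field names are hypothetical).
    static final class Event {
        final String stmt;        // source text, for display only
        final String def;         // variable written, or null
        final List<String> uses;  // variables read
        final int controlParent;  // trace index of the governing branch, or -1

        Event(String stmt, String def, List<String> uses, int controlParent) {
            this.stmt = stmt;
            this.def = def;
            this.uses = uses;
            this.controlParent = controlParent;
        }
    }

    // Trace indices in the backward dynamic slice of trace.get(criterion).
    static Set<Integer> backwardSlice(List<Event> trace, int criterion) {
        // Forward pass: link each event to the latest writer of every variable it reads.
        List<List<Integer>> deps = new ArrayList<>();
        Map<String, Integer> lastDef = new HashMap<>();
        for (int i = 0; i < trace.size(); i++) {
            Event e = trace.get(i);
            List<Integer> d = new ArrayList<>();
            for (String v : e.uses) {
                Integer writer = lastDef.get(v);
                if (writer != null) d.add(writer);             // data dependence
            }
            if (e.controlParent >= 0) d.add(e.controlParent);  // control dependence
            deps.add(d);
            if (e.def != null) lastDef.put(e.def, i);
        }
        // Backward pass: collect everything reachable from the criterion.
        Set<Integer> slice = new TreeSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(criterion);
        while (!work.isEmpty()) {
            int i = work.pop();
            if (slice.add(i)) {
                for (int d : deps.get(i)) work.push(d);
            }
        }
        return slice;
    }

    // Trace of max(new int[] {3, 7, 5}); the loop body runs once, for i = 1.
    static List<Event> maxTrace() {
        return List.of(
            new Event("int max = list[0]", "max", List.of("list"), -1),
            new Event("int i = 1", "i", List.of(), -1),
            new Event("i < list.length-1", null, List.of("i", "list"), -1),
            new Event("if (max < list[i])", null, List.of("max", "list", "i"), 2),
            new Event("max = list[i]", "max", List.of("list", "i"), 3),
            new Event("i++", "i", List.of("i"), 2),
            new Event("i < list.length-1", null, List.of("i", "list"), 2),
            new Event("previous[0] = max", "previous", List.of("max"), -1),
            new Event("return max", null, List.of("max"), -1));
    }

    public static void main(String[] args) {
        List<Event> trace = maxTrace();
        for (int i : backwardSlice(trace, 8)) {
            System.out.println(i + ": " + trace.get(i).stmt);
        }
    }
}
```

Slicing on the final "return max" keeps the statements that computed max and leaves out "previous[0] = max", since nothing the return reads was written by it.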
Exercise 1
int sum = 0;
int i = 0;
while (i < 1100) {
sum += i;
i++;
}
assert(sum >=0);
Use program slicing on the assertion.
4. Reduce Candidate Causes
• Using dependency filtering: if a faulty
statement that is the bug’s root cause triggers
an invariant failure, then any statement using
the faulty value computed by that statement
might also trigger an invariant failure.
• If S and T are both candidate causes and T
(control- or data-)depends on S, remove T.
Is this justified?
(Figure: two statements, each marked "Invariant failure here", linked by a dependency.)
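Dependency filtering amounts to a reachability check over the dependence edges. The following is a minimal sketch, assuming the dependences are already available as a map from each statement to the statements it directly depends on; the class DependenceFilter and the statement labels S, T, M and U are hypothetical, and the dependence graph is assumed to be acyclic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class DependenceFilter {
    // Keep only candidates that do not depend, directly or transitively, on another candidate.
    static Set<String> filter(Set<String> candidates, Map<String, Set<String>> dependsOn) {
        Set<String> kept = new LinkedHashSet<>();
        for (String t : candidates) {
            if (!dependsOnOtherCandidate(t, candidates, dependsOn)) {
                kept.add(t);
            }
        }
        return kept;
    }

    // Depth-first walk along dependence edges, looking for a different candidate.
    private static boolean dependsOnOtherCandidate(
            String start, Set<String> candidates, Map<String, Set<String>> dependsOn) {
        Set<String> visited = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String s = work.pop();
            if (!visited.add(s)) continue;
            if (!s.equals(start) && candidates.contains(s)) return true;
            for (String next : dependsOn.getOrDefault(s, Set.of())) {
                work.push(next);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // T depends on M, and M depends on S; U is independent of both.
        Map<String, Set<String>> dependsOn = Map.of("T", Set.of("M"), "M", Set.of("S"));
        System.out.println(filter(Set.of("S", "T", "U"), dependsOn));
    }
}
```

In the example T is dropped because it depends on S through M, while S and U remain as candidates.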
4. Reduce Candidate Causes
• If there are multiple failed test cases, with the
same cause of failure, intersect the candidate
cause set for each failed test case.
Is this justified?
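Intersecting the candidate sets of several failed test cases is a plain set intersection. A minimal sketch, assuming each failed test case has already produced its own set of candidate statements; the class CandidateIntersection and the labels S1 to S4 are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CandidateIntersection {
    // Statements that are candidate causes in every failed test case.
    static Set<String> intersect(List<Set<String>> perFailedTest) {
        if (perFailedTest.isEmpty()) {
            return new LinkedHashSet<>();
        }
        Set<String> common = new LinkedHashSet<>(perFailedTest.get(0));
        for (Set<String> candidates : perFailedTest.subList(1, perFailedTest.size())) {
            common.retainAll(candidates);
        }
        return common;
    }

    public static void main(String[] args) {
        Set<String> first = Set.of("S1", "S2", "S3");
        Set<String> second = Set.of("S2", "S3", "S4");
        System.out.println(intersect(List.of(first, second)));
    }
}
```

The intersection only helps when the failed test cases really share one cause; if they fail for different reasons, the true cause of one of them can be intersected away.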
Case Study
• Objects of analysis
– The Squid HTTP proxy server
– The MySQL database server
– The Apache HTTP web server
• Selected 8 real software bugs
– The bugs have to be in software versions that the
authors' tool can support
– No concurrency bugs. Why?
– No missing code bugs. Why?
Case Study
Case Study: Effectiveness
Q1: Can the approach find the true root
causes of bugs?
• For each bug, the correction patch in the bug
reports is used to identify the minimal
statements which should be changed or
deleted to remove the failure symptom.
Q2: How many false positives does it generate?
Is this justified?
Case Study: Effectiveness Results
Given a set of remaining causes, find out the statements the causes depend on.
Compared with Tarantula
The range of source code that has to be checked.
What are the limitations?
LEARN TO DEBUG
Research Discussion
(Figure: scatter plot with axes feature 1 and feature 2; the X points form a small cluster in the middle of the O points.)
What if the vectors are located as in the figure above?
Research Discussion
int[] previous = new int[5];
public int max (int[] list) {
    int max = list[0];
    for (int i = 1; i < list.length-1; i++) {
        if (max < list[i]) {
            max = list[i];
        }
    }
    previous[0] = max;
    return max;
}
As an expert programmer, how do you learn what the programmer wants?
What does this program do and how do you know?
Research Discussion
if (card == null) {
    printk (KERN_ERR, "capidrv-%d: … %d!\n", card->contrnr, id);
}
How do you know there is a bug
in the program?
Research Discussion
int mxser_write (struct tty_struct *tty, …) {
struct mxser_struct *info = tty->driver_data;
unsigned long flags;
if (!tty || !info->xmit_buf) {
return (0);
}
}
There is a potential problem here. Why?
Exercise 3
Take this program and this input as an example.
Apply both methods and argue whether each one works
to find the bug. If there is a challenge, how would
you overcome it, or what assumptions would you
make to overcome it?
Research Discussion
What other sources can we use to learn what the programmer really wants?
The Overall View
the behaviors we wanted
the behaviors we have