Tracing regression bugs

Presented by Dor Nir
Outline






Problem introduction.
Regression bug – definition.
Industry tools.
Proposed solution.
Experimental results.
Future work.
Micronose corp.


A big company founded by Nataly Noseman.
1998 – Version 1 of Nosepad (a great success).
Nosepad version 1
class Nosepad
{
    bool bDirty;
    void AddNose(){
        ...
        bDirty = true;
    }
    void DeleteNose(){
        ...
        bDirty = true;
    }
    bool IsDirty(){
        return bDirty;
    }
    void Save(){
        ...
        bDirty = false;
    }
    void Exit(){
        if(IsDirty()) {
            if(MsgBox("Save your noses?"))
                Save();
        }
        CloseWindow();
    }
}
Nosepad version 2

New features required:





…
…
Undo/Redo mechanism.
…
Micronose expanding


Promotions.
New recruitments.
Undo/Redo Design


Undo stack – Each operation is added to the undo stack.
Redo stack – When an operation is undone, it moves to the redo stack.
[Figure: the undo and redo stacks after typing the keys "a" and "b", then Undo and Redo.]
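The two-stack design can be tried out with a minimal sketch such as the one below. It is not the Nosepad code: the `Operation` struct, the `UndoRedo` class and the `TypeKey` helper are hypothetical names introduced only for illustration, and the sketch assumes every operation knows how to apply and revert itself.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical operation type: callbacks that apply and revert one edit.
struct Operation {
    std::function<void()> apply;
    std::function<void()> revert;
};

class UndoRedo {
public:
    // Perform a new operation, remember it for undo, and drop the redo history.
    void Do(Operation op) {
        op.apply();
        undoStack_.push_back(std::move(op));
        redoStack_.clear();
    }

    // Revert the latest operation and move it to the redo stack.
    bool Undo() { return Move(undoStack_, redoStack_, /*reapply=*/false); }

    // Re-apply the latest undone operation and move it back to the undo stack.
    bool Redo() { return Move(redoStack_, undoStack_, /*reapply=*/true); }

private:
    static bool Move(std::vector<Operation>& from, std::vector<Operation>& to,
                     bool reapply) {
        if (from.empty()) return false;  // nothing to undo or redo
        Operation op = std::move(from.back());
        from.pop_back();
        if (reapply) op.apply(); else op.revert();
        to.push_back(std::move(op));
        return true;
    }

    std::vector<Operation> undoStack_;
    std::vector<Operation> redoStack_;
};

// Hypothetical helper: an operation that types one key into `text`.
inline Operation TypeKey(std::string& text, char key) {
    return Operation{[&text, key] { text.push_back(key); },
                     [&text] { text.pop_back(); }};
}
```

Typing "a" and "b", then calling `Undo()` and `Redo()`, moves the "b" operation to the redo stack and back. Performing a new operation after an undo clears the redo stack, so the undone operation can no longer be redone.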
Nosepad version 2
class Nosepad
{
    …
    Stack undoStack;
    Stack redoStack;
    void AddNose(){
        ...
        undoStack.Add(AddNoseOp);
        redoStack.Clear();
    }
    void DeleteNose() {
        ...
        undoStack.Add(DelNoseOp);
        redoStack.Clear();
    }
    .
    .
    .
    void Undo(){
        undoStack.Top().Operate(false);
        redoStack.Push(undoStack.Pop());
    }
    void Redo(){
        redoStack.Top().Operate(true);
        undoStack.Push(redoStack.Pop());
    }
}
Zelda from QA
Nosepad version 2 correction
class Nosepad
{
    …
    Stack undoStack;
    Stack redoStack;
    void AddNose(){
        ...
        undoStack.Add(AddNoseOp);
        redoStack.Clear();
    }
    void DeleteNose() {
        undoStack.Add(DelNoseOp);
        redoStack.Clear();
    }
    void Undo(){
        undoStack.Top().Operate(false);
        redoStack.Push(undoStack.Pop());
    }
    void Redo(){
        redoStack.Top().Operate(true);
        undoStack.Push(redoStack.Pop());
    }
    bool IsDirty(){
        return bDirty && !undoStack.IsEmpty();
    }
}
Zelda from QA
Regression bug observations



The second bug is a regression bug.
The same test succeeds on version 1 and fails on version 2.
The specifications of version 1 have not changed in version 2 (there are only additions).
Regression bug
Version 1 specifications: 1. X  2. Y  3. Z
Changes in code
Version 2 specifications: 1. X  2. Y  3. Z  4. A  5. B
Bug… but no regression
Regression bug definition

Regression bug – a change in existing code that alters the behavior of the application so that it no longer meets a specification that was previously met.
How to avoid regression bugs?

Prevent introducing regression bugs into the code:




Simple design.
Programming language.
Good programming.
Methodology.




Test driven development.
Code review.
Communication.
Find regression bugs before release of the product.


Extensive testing.
White-box / black-box testing.
Automatic tools


Detect whether a regression bug exists.
Quick Test Professional.
Where is it?

What was the cause of the regression bug?

What was the change that caused the regression bug?
What is a change?



Changing existing code lines.
Adding new code lines.
Deleting code lines.
Problem definition

Given a checkpoint C that failed and the source code S of the AUT (application under test), we want to find the places (changes) p1, p2, ..., pn in the code S that cause C to fail.
We want to do it independently of the source code language or technology.
We know that at time T (prior to the failure) the checkpoint passed.
Solution 1
Cooperation between QA (tests) and the programmer (source code).
Solution 1 - map
Tests mapped to source code:
Check text in message box: Windows.cpp, errorMessages.cpp
File t.xml was created successfully: File.cpp, IO.cs
"SELECT NAMES from Table1" is not empty: C:\code\files, DB project
Solution 1




Much work has to be done for each new test.
Maintenance is hard.
We end up with a lot of code to be analyzed.
Could use automatic tools (profilers).
Solution 2
The same map from tests to source code, restricted to only the changes:
Check text in message box: Windows.cpp, errorMessages.cpp
File t.xml was created successfully: File.cpp, IO.cs
"SELECT NAMES from Table1" is not empty: C:\code\files, DB project
Source Control







Version control tool.
Database of source code.
Check-in / check-out operations.
History of versions.
Differences between versions.
Very common in software development.
Currently on the market: VSS, StarTeam, ClearCase, CVS and many more.
Finding regression bug
Checkpoint-to-code tool
Input: failed checkpoint.
First phase: source code and the source control tool (Change A, Change B, …).
Second phase: heuristics.
Output: relevant changes:
1. Change X
2. Change Y
3. Change Z
…
Heuristics (second phase)



Rank changes.
Each heuristic gets a different weight.
Two kinds of heuristics:


Technology dependent.
Not technology dependent.
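One way to combine weighted heuristics into a single ranking is a weighted sum, sketched below. The slides only say that each heuristic gets its own weight; the weighted sum, the `Change` struct and the `RankChanges` function are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Change {
    std::string id;              // hypothetical label for a source-control change
    std::vector<double> scores;  // one score per heuristic
};

// Returns (change id, weighted total) pairs, highest total first.
// Assumption: the heuristics are combined with a plain weighted sum.
std::vector<std::pair<std::string, double>> RankChanges(
        const std::vector<Change>& changes, const std::vector<double>& weights) {
    std::vector<std::pair<std::string, double>> ranked;
    for (const Change& change : changes) {
        assert(change.scores.size() == weights.size());
        double total = 0.0;
        for (std::size_t i = 0; i < weights.size(); ++i) {
            total += weights[i] * change.scores[i];
        }
        ranked.emplace_back(change.id, total);
    }
    // stable_sort keeps the input order for changes with equal totals.
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const std::pair<std::string, double>& a,
                        const std::pair<std::string, double>& b) {
                         return a.second > b.second;
                     });
    return ranked;
}
```

With weights `{1.0, 2.0}`, a change scoring `{0.5, 1.0}` totals 2.5 and is listed ahead of a change scoring `{1.0, 0.0}`, so a heavily weighted heuristic can outrank a higher raw score from a lightly weighted one.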
Non-technology heuristics



Do not depend on the technology of the code.
Text driven.
No semantics.
Code lines affinity
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3"
Check in comment affinity
Check-in comment: Go to the next waiter when next item event is raise
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3"
File affinity
Words histogram in file Clerk.cpp:
Waiter   186
Waiters  15
Next     26
Number   174
…
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3"
File Name affinity
File name: ClerkDlg.cpp
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3"
More possible non-technology heuristics

Programmer's history.



Reliable vs. “Prone to error” programmers.
Experience in the specific module.
Time of change.


Late at night.
Close to the release deadline.
Technology heuristics



Depend on the source code language.
Take advantage of known keywords.
Use the semantics.
Function / Class / Namespace affinity
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3"
Code complexity
Nesting depth, number of branches.

if(b1 && b2)
{
    if(c2 && d1)
        c1 = true;
    else
    {
        if((c2 && d2) || e1)
            c1 = false;
    }
}
>
if(b1 && b2 && c2 && d1)
    c1 = true;
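A rough way to put numbers on nesting depth and branching is sketched below. It is a purely textual approximation written for this example, not the tool's implementation: it counts `if` keywords and the deepest brace nesting, and it ignores comments and string literals. The names `Complexity` and `MeasureComplexity` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct Complexity {
    int branches;  // number of `if` keywords
    int maxDepth;  // deepest `{ ... }` nesting level
};

static bool IsIdentChar(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) != 0 || ch == '_';
}

// Scans C-like source text once, tracking brace depth and counting `if`.
Complexity MeasureComplexity(const std::string& code) {
    Complexity result{0, 0};
    int depth = 0;
    for (std::size_t i = 0; i < code.size(); ++i) {
        const char ch = code[i];
        if (ch == '{') {
            result.maxDepth = std::max(result.maxDepth, ++depth);
        } else if (ch == '}') {
            if (depth > 0) --depth;
        } else if (code.compare(i, 2, "if") == 0) {
            // Count "if" only when it is a whole word, not part of an identifier.
            const bool startsWord = (i == 0) || !IsIdentChar(code[i - 1]);
            const bool endsWord =
                (i + 2 >= code.size()) || !IsIdentChar(code[i + 2]);
            if (startsWord && endsWord) ++result.branches;
        }
    }
    return result;
}
```

Applied to the two snippets above, the nested version has three branches and depth 2, while the flat version has one branch and depth 0, which matches the ordering shown on the slide.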
Words affinity problem
{red, flower, white, black, cloud} vs. {rain, green, red, coat}
Words affinity problem (cont.)
Affinity({red, flower, white, black, cloud}, {rain, green, red, coat}) > Affinity({red, flower, white, black, cloud}, {train, table, love})
Word affinity
Affinity(red, flower) < Affinity(red, blue) < Affinity(red, red)
How can we measure affinity?

Vector space model of information retrieval – S.K.M. Wong, Raghavan.
Similarity of documents.
Improving web search results using affinity graph – Benyu Zhang, Hua Li, Lei Ji, Wensi Xi, Weiguo Fan.
Similarity of documents.
Diversity vs. information richness of documents.
Affinity definition

Synonym(a) – the group of words that are synonyms of a or similar in meaning to a.
Synonym(choose) = {chosen, picked out; choice, superior, prime; discriminating, choosy, picky, select, selection}
Words affinity definition (cont)

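$$\mathrm{ShallowAffinity}(a, b) = \begin{cases} 1 & \text{if } a = b \\ 0 & \text{otherwise} \end{cases}$$

$$\mathrm{Affinity}(a, b) = \begin{cases} 1 & \text{if } a = b \\ \mathrm{ShallowAffinity}(\mathrm{Synonym}(a), \mathrm{Synonym}(b)) & \text{otherwise} \end{cases}$$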
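The word-level definitions can be sketched as follows. The `Thesaurus` map is a hypothetical stand-in for a synonym source, and the sketch assumes that the shallow affinity of two synonym groups is the fraction of identical word pairs. Whether a synonym group contains the word itself is also an assumption of the example data.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using WordGroup = std::vector<std::string>;
// Hypothetical synonym source: maps a word to its synonym group.
using Thesaurus = std::map<std::string, WordGroup>;

// 1 if the two words are identical, 0 otherwise.
double ShallowAffinity(const std::string& a, const std::string& b) {
    return a == b ? 1.0 : 0.0;
}

// Assumed group form: fraction of (a, b) pairs that are identical words.
double GroupShallowAffinity(const WordGroup& A, const WordGroup& B) {
    if (A.empty() || B.empty()) return 0.0;
    double sum = 0.0;
    for (const std::string& a : A)
        for (const std::string& b : B) sum += ShallowAffinity(a, b);
    return sum / (static_cast<double>(A.size()) * static_cast<double>(B.size()));
}

// Synonym group of a word; empty when the word is unknown.
WordGroup Synonym(const Thesaurus& thesaurus, const std::string& word) {
    const auto it = thesaurus.find(word);
    return it == thesaurus.end() ? WordGroup{} : it->second;
}

// 1 for identical words, otherwise the shallow affinity of the synonym groups.
double Affinity(const Thesaurus& thesaurus, const std::string& a,
                const std::string& b) {
    if (a == b) return 1.0;
    return GroupShallowAffinity(Synonym(thesaurus, a), Synonym(thesaurus, b));
}
```

With synonym groups {clerk, waiter} for "clerk" and {waiter, clerk} for "waiter", two of the four word pairs match, so `Affinity` returns 0.5 for the pair. Words with no shared synonyms get 0.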
Affinity of groups of words
$$A = \{a_1, a_2, a_3, \ldots, a_n\}, \qquad B = \{b_1, b_2, b_3, \ldots, b_m\}$$

$$\mathrm{ShallowAffinity}(A, B) = \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} \mathrm{ShallowAffinity}(a_i, b_j)}{|A|\,|B|}$$

$$\mathrm{AsymmetricAffinity}(A, B) = \frac{\sum_{i=1}^{n} \max\{\mathrm{Affinity}(a_i, b_1), \ldots, \mathrm{Affinity}(a_i, b_m)\}}{|A|}$$

$$\mathrm{Affinity}(A, B) = \frac{\mathrm{AsymmetricAffinity}(A, B) + \mathrm{AsymmetricAffinity}(B, A)}{2}$$
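The group-level affinity can be sketched as below. This is an illustration rather than the tool's code: the word-level affinity is passed in as a callback (the `WordAffinity` alias is a hypothetical name), and empty groups are assumed to have affinity 0.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

using WordGroup = std::vector<std::string>;
// Word-level affinity supplied by the caller (hypothetical alias).
using WordAffinity =
    std::function<double(const std::string&, const std::string&)>;

// Average, over the words of A, of each word's best match in B.
double AsymmetricAffinity(const WordGroup& A, const WordGroup& B,
                          const WordAffinity& affinity) {
    if (A.empty() || B.empty()) return 0.0;  // assumption: empty groups score 0
    double sum = 0.0;
    for (const std::string& a : A) {
        double best = 0.0;
        for (const std::string& b : B) best = std::max(best, affinity(a, b));
        sum += best;
    }
    return sum / static_cast<double>(A.size());
}

// Symmetric affinity: the mean of the two asymmetric directions.
double GroupAffinity(const WordGroup& A, const WordGroup& B,
                     const WordAffinity& affinity) {
    return (AsymmetricAffinity(A, B, affinity) +
            AsymmetricAffinity(B, A, affinity)) / 2.0;
}
```

Using exact word equality as the word-level affinity, {red, flower, white, black, cloud} against {rain, green, red, coat} gives (1/5 + 1/4) / 2 = 0.225, while the same group against {train, table, love} gives 0.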
Using affinity in the tool


Words(C) = the group of words in the description of the checkpoint C.
Words(P) = the group of words in the source code, check-in comment, file, etc.
$$\mathrm{Rank}(C, P) = \mathrm{Affinity}(\mathrm{Words}(C), \mathrm{Words}(P))$$
Using affinity in heuristics

Code line affinity:
Words(P, L) = the group of words in the source code located L lines from the change P.
β – a coefficient that gives a different weight to lines inside the change.
Check-in comment affinity:
$$\mathrm{Rank}_2(C, P) = \mathrm{Affinity}(\mathrm{Words}(C), \mathrm{Words}(\mathrm{checkin}(P)))$$
Using affinity in heuristics (cont.)

File affinity: P is a change in file F with histogram map.

$$\mathrm{HstgrmAffinity}(A, B, \mathit{map}) = \frac{\sum_{i=1}^{n} \max\{\mathrm{Affinity}(a_i, b_1), \ldots, \mathrm{Affinity}(a_i, b_m)\} \cdot \mathit{map}[a_i]}{\sum_{i=1}^{n} \mathit{map}[a_i]}$$

$$\mathrm{FileRank}(C, F) = \mathrm{HstgrmAffinity}(\mathrm{Words}(C), \mathrm{Words}(F), \mathrm{Hstgrm}(F))$$

$$\mathrm{Rank}_3(C, P) = \mathrm{FileRank}(C, F)$$
Using affinity in heuristics (cont.)

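File name affinity:
$$\mathrm{Rank}_4(C, P) = \mathrm{Affinity}(\mathrm{Words}(C), \mathrm{Words}(\mathrm{FileName}(P)))$$

Code elements affinity:
$$\begin{aligned}
\mathrm{Rank}_5(C, P) ={} & \tfrac{1}{2}\,\mathrm{Affinity}(\mathrm{Words}(C), \mathrm{Words}(\mathrm{FunctionName}(P))) \\
& + \tfrac{3}{8}\,\mathrm{Affinity}(\mathrm{Words}(C), \mathrm{Words}(\mathrm{ClassName}(P))) \\
& + \tfrac{1}{8}\,\mathrm{Affinity}(\mathrm{Words}(C), \mathrm{Words}(\mathrm{Namespace}(P)))
\end{aligned}$$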
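As a small worked example of the code elements rank, the sketch below combines three affinities that are assumed to have been computed already, using the weights 1/2, 3/8 and 1/8 from the slide. The `CodeElementAffinities` struct and the `Rank5` function are hypothetical names.

```cpp
#include <cassert>

// Affinities between the checkpoint words and the names of the code elements
// that enclose the change (values assumed to be computed elsewhere).
struct CodeElementAffinities {
    double functionName;
    double className;
    double namespaceName;
};

// Weighted combination with the weights given on the slide: 1/2, 3/8, 1/8.
double Rank5(const CodeElementAffinities& a) {
    return 0.5 * a.functionName + 0.375 * a.className + 0.125 * a.namespaceName;
}
```

Because the weights sum to 1, a change whose function, class and namespace names all match the checkpoint perfectly gets rank 1. A perfect match on the function name alone is worth as much as perfect matches on both the class and the namespace (0.5 each).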
Algorithm
Input: C – checkpoint.
       T – the last time checkpoint C passed.
1. Get the latest version of the source code for C from the source control tool.
2. Get file versions from the source control tool one by one until the version's check-in time is earlier than T. For each file version:
   a. Get the change between the two sequential versions.
   b. Analyze and rank the change with respect to the checkpoint C (Rank(C, P)).
   c. Add the rank to the DB.
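The loop in step 2 can be sketched as follows for a single file. This is an illustration under several assumptions: the history is an in-memory list (newest version first) standing in for the source control tool, a change is represented by the old and new file contents, the ranking function is supplied by the caller, and the returned vector stands in for the DB. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct FileVersion {
    std::string file;
    long checkinTime;     // larger means more recent
    std::string content;
};

struct RankedChange {
    std::string file;
    long checkinTime;
    double rank;
};

// Ranks a change (old text -> new text) against the checkpoint description.
using Ranker = std::function<double(const std::string& checkpoint,
                                    const std::string& oldText,
                                    const std::string& newText)>;

// `history` holds the versions of one file, newest first.
std::vector<RankedChange> RankChangesSince(const std::string& checkpoint,
                                           long lastPassTime,
                                           const std::vector<FileVersion>& history,
                                           const Ranker& rank) {
    std::vector<RankedChange> db;
    for (std::size_t i = 0; i < history.size(); ++i) {
        const FileVersion& newer = history[i];
        // Versions checked in before T were present when the checkpoint passed.
        if (newer.checkinTime < lastPassTime) break;
        // A file with no older version is treated as changed from empty text.
        const std::string oldText =
            (i + 1 < history.size()) ? history[i + 1].content : std::string();
        db.push_back({newer.file, newer.checkinTime,
                      rank(checkpoint, oldText, newer.content)});
    }
    return db;
}
```

With check-ins at times 30, 20 and 10 and T = 15, only the changes checked in at 30 and 20 are ranked. The version from time 10 serves only as the baseline for the change made at time 20.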
Observations
 Rank i (C , P1 )  Rank j (C , P2 )
and i ≠ j
P1  P2


Better affinity gives better results.
The project is not always in a valid state.
Implementation

Visual SourceSafe

Araxis Merge – Diff tool.

MS Word

WordNet

MS Access – DB.
WordNet




Developed at Princeton University.
Large lexical database of English.
English nouns, verbs, adjectives, and adverbs are organized into synonym sets, each representing one underlying lexicalized concept.
Different relations link the synonym sets.
Additional views


Group by file.
Group by time of change.
The tool
Experimental Results

Source code:
C++
MFC framework
891 files in 29 folders
3 million lines of code
3984 check-ins
Experimental results (cont.)
Checkpoint   No grouping   Group by file
1            1             1
2            2             7
3            2             2
4            -             -
5            -             1
Challenges

Time.




Words equality.
Source code vocabulary.


Cache.
Filtering by one heuristic.
Example - m_CountItemInTable.
Additional synonyms.

Clerk ≈ Waiter.
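The source code vocabulary problem can be illustrated by splitting an identifier such as m_CountItemInTable into plain words before comparing it with the checkpoint text. The sketch below is one possible approach, not the tool's implementation. It assumes camelCase and underscore conventions and drops a leading "m" member prefix; `SplitIdentifier` is a hypothetical name, and runs of capitals (acronyms) are split letter by letter.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Splits an identifier into lowercase words at underscores, digits and
// lowercase-to-uppercase boundaries.
std::vector<std::string> SplitIdentifier(const std::string& identifier) {
    std::vector<std::string> words;
    std::string current;
    auto flush = [&] {
        if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    };
    for (char raw : identifier) {
        const unsigned char ch = static_cast<unsigned char>(raw);
        if (!std::isalpha(ch)) {  // '_' and digits only separate words
            flush();
            continue;
        }
        if (std::isupper(ch)) flush();  // a capital letter starts a new word
        current.push_back(static_cast<char>(std::tolower(ch)));
    }
    flush();
    // Assumption: a leading "m" is a member-variable prefix, not a real word.
    if (words.size() > 1 && words.front() == "m") words.erase(words.begin());
    return words;
}
```

For m_CountItemInTable this yields the words count, item, in and table, which can then be matched against the checkpoint description or looked up in a synonym source.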
Future work



Add more heuristics.
Learning mechanism – automatic tuning of heuristics.
Why? Finding out more about the sources of regression bugs.
Bad programmer.
Deadline.
Technology.
Design.