Predicting Bugs Using Antipatterns

Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan

Existing bug prediction: inputs -> Model -> Predict Bugs
- Past defects, history of churn (Zimmermann, Hassan et al.)
- Topic modeling (Chen et al.)

Antipatterns:
- Weaknesses in design
- Not technically incorrect, and do not prevent a system from functioning
- Indicate a deeper problem in the system

Antipatterns indicate weaknesses in the design that may increase the risk of bugs in the future (Fowler 1999).

Study overview (diagram):
- CVS Repository -> Mining Source Code Repositories -> Detecting Antipatterns
- Bugzilla -> Mining Bug Repositories
- Calculating Metrics -> Analyzing -> RQ1, RQ2, RQ3

Systems   Releases (#)        Churn     LOCs
Eclipse   2.0 - 3.3.1 (12)    148,454   26,209,669
ArgoUML   0.12 - 0.26.2 (9)   21,427    2,025,730

- Detection with DECOR (Moha et al.)
- 13 different antipatterns

Systems   #Antipatterns
Eclipse   273,766
ArgoUML   15,100

[Figure: axes labeled "# of Antipatterns" and "# Files"]

RQ1: Do antipatterns affect the density of bugs in files?

RQ2: Do the proposed antipattern-based metrics provide additional explanatory power over traditional metrics?

RQ3: Can we improve traditional bug prediction models with antipattern information?

Null hypothesis (RQ1): the density of bugs in files with antipatterns is the same as in files without antipatterns.

Systems   Releases (#)   D_A - D_NA > 0   p-value < 0.05
Eclipse   12             8                8
ArgoUML   9              6                6

D_A: density of bugs in files with antipatterns
D_NA: density of bugs in files without antipatterns
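The comparison of bug densities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not define density, so it is taken here as bugs per line of code, and the two groups are compared by their mean density. The function names and all data are hypothetical, and the significance test behind the p-value column is not reproduced.

```python
from statistics import mean

def bug_density(bugs, loc):
    """Bugs per line of code (assumed definition of density)."""
    return bugs / loc

def density_gap(files):
    """Return mean density with antipatterns minus mean density without.

    files: iterable of (antipattern_count, bug_count, loc) for one release.
    """
    with_ap = [bug_density(b, loc) for ap, b, loc in files if ap > 0]
    without_ap = [bug_density(b, loc) for ap, b, loc in files if ap == 0]
    return mean(with_ap) - mean(without_ap)

# Hypothetical data: two releases, four files each.
releases = {
    "r1": [(3, 4, 400), (0, 1, 500), (1, 2, 100), (0, 0, 250)],
    "r2": [(2, 1, 1000), (0, 2, 200), (0, 1, 100), (5, 3, 300)],
}

gaps = {name: density_gap(files) for name, files in releases.items()}
# Number of releases in which files with antipatterns are denser in bugs.
positive = sum(1 for gap in gaps.values() if gap > 0)
print(gaps, positive)
```

Counting the releases with a positive gap mirrors the "D_A - D_NA > 0" column; a real analysis would add a statistical test per release.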

RQ2: Do the proposed antipattern-based metrics provide additional explanatory power over traditional metrics?

Proposed antipattern-based metrics:
- Average Number of Antipatterns (ANA)
- Antipattern Recurrence Length (ARL)
- Antipattern Cumulative Pairwise Differences (ACPD)
- Antipattern Complexity Metric (ACM)

Number of antipatterns per file and release (example):

File     1.0  2.0  3.0  4.0  5.0  6.0
a.java   3    4    0    2    1    3
b.java   4    5    1    0    0    3
c.java   0    6    5    4    5    4

ANA(a.java) = 2.16, ARL(a.java) = 18.76, ACPD(a.java) = 0
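The slides give the resulting values but not the formulas. As a minimal sketch, the Python below assumes definitions that reproduce the numbers for a.java: ANA as the mean count per release, ACPD as the sum of differences between consecutive releases, and ARL as the length of the longest run of consecutive releases with at least one antipattern, multiplied by e^((c + b) / n), where n is the number of releases, c the number of releases with antipatterns, and b the index of the release that ends the longest run. The function names and the tie-breaking rule (the first longest run wins) are illustrative assumptions.

```python
import math

def ana(counts):
    """Average number of antipatterns per release (assumed definition)."""
    return sum(counts) / len(counts)

def arl(counts):
    """Antipattern recurrence length (assumed definition, see lead-in)."""
    n = len(counts)
    c = sum(1 for k in counts if k > 0)  # releases with antipatterns
    best_len, best_end, run = 0, 0, 0
    for i, k in enumerate(counts, start=1):
        run = run + 1 if k > 0 else 0
        if run > best_len:  # on ties, the first longest run is kept
            best_len, best_end = run, i
    return best_len * math.exp((c + best_end) / n)

def acpd(counts):
    """Sum of differences between consecutive releases (assumed definition)."""
    return sum(b - a for a, b in zip(counts, counts[1:]))

# Antipattern counts of a.java for releases 1.0 to 6.0, from the example.
A_JAVA = [3, 4, 0, 2, 1, 3]

print(ana(A_JAVA))            # 13/6, about 2.17 (the slide shows 2.16)
print(round(arl(A_JAVA), 2))  # 18.76
print(acpd(A_JAVA))           # 0
```

Under these assumptions ACPD telescopes to the last count minus the first, which is why a.java scores 0 despite changing in every release.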

- The proposed antipattern-based metrics provide additional explanatory power over traditional metrics
- ARL shows the biggest improvement

RQ3: Can we improve traditional bug prediction models with antipattern information?

Metric name   Description
LOC           Source lines of code
MLOC          Executable lines of code
PAR           Number of parameters
NOF           Number of attributes
NOM           Number of methods
NOC           Number of children
VG            Cyclomatic complexity
DIT           Depth of inheritance tree
LCOM          Lack of cohesion of methods
NOT           Number of classes
WMC           Number of weighted methods per class
PRE           Number of pre-release bugs
Churn         Number of lines of code added, modified, or deleted

Step-wise analysis:
1) Removing Independent Variables
2) Collinearity Analysis
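The slides do not say how collinearity was assessed. One common approach, sketched below as an assumption and not as the authors' procedure, is to compute pairwise Pearson correlations and drop any metric that is strongly correlated with one already kept. The 0.8 threshold, the function names, and the data are all hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def drop_collinear(metrics, threshold=0.8):
    """Greedily keep metrics whose |r| with every kept metric is below threshold."""
    kept = []
    for name, values in metrics.items():
        if all(abs(pearson(values, metrics[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-file metric values for five files.
metrics = {
    "LOC":  [100, 200, 300, 400, 500],
    "MLOC": [90, 210, 290, 410, 480],
    "ARL":  [5, 1, 4, 2, 3],
}

print(drop_collinear(metrics))  # ['LOC', 'ARL']
```

The result depends on iteration order: metrics listed first are kept in preference to later, correlated ones, so the order should reflect which metrics matter most to the model.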

[Figure: two bar charts over the metrics Churn, PRE, LOC, MLOC, NOT, NOF, NOM, ACM, ACPD, ARL; the first chart is labeled Eclipse]

- ARL remained statistically significant and

- ARL can improve cross-system bug prediction on the two studied systems

Backup Slides

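The 13 antipatterns:
- AntiSingleton
- Blob
- Class Data Should Be Private
- Lazy Class
- LPL (Long Parameter List)
- Long Method
- Spaghetti Code
- SG (Speculative Generality)
- SwissArmyKnife
- Complex Class
- Large Class
- Message Chain
- RPB (Refused Parent Bequest)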

Null hypothesis: there is no difference in the density of future bugs between files with antipatterns and files without antipatterns.

Findings

In general, the density of bugs in a file with antipatterns is higher than the density of bugs in a file without antipatterns.
