
Cross-application Fan-in Analysis for Finding Application-specific Concerns

Makoto Ichii

Takashi Ishio

Katsuro Inoue

Osaka University

2008/12/2 AOAsia 4

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Coding pattern detection

Automatic detection of crosscutting concerns helps

Finding refactoring opportunities

Understanding application-specific coding rules

Fung: Coding pattern detection tool

[Ishio, 2008][Miyake, 2007]

Detects coding patterns including crosscutting concerns from an application using a data mining technique

Basic idea: “a crosscutting concern code frequently appears across an application”

[Ishio, 2008] T. Ishio, H. Date, T. Miyake and K. Inoue, "Mining Coding Pattern to Detect Crosscutting Concerns in Java Programs", Proc. WCRE2008, 2008

[Miyake, 2007] T. Miyake, T. Ishio, K. Taniguchi and K. Inoue, "Towards Maintenance Support for Idiom-based Code Using Sequential Pattern Mining", Proc. AOASIA3, 2007

Example of coding pattern

Coding pattern

An ordered sequence of method calls and control statements that frequently appears in source code.

Process of coding pattern detection

Source code → parse & normalize → Method call sequence → Sequential pattern mining → Coding pattern

Source code (three fragments):

    …
    if (log.isDebugEnabled()) {
        log.debug(getMessage());
    }
    …

    …
    String status = getStatus();
    if (log.isDebugEnabled()) {
        log.debug(status);
    }
    …

    …
    if (log.isDebugEnabled()) {
        log.debug("QBK");
    }
    …

Method call sequences (after parsing and normalization):

    … isDebugEnabled() IF getMessage() debug() END_IF
    … getStatus() isDebugEnabled() IF debug() END_IF
    … isDebugEnabled() IF debug() END_IF

Coding pattern (after sequential pattern mining):

    1: isDebugEnabled()
    2: IF
    3: debug()
    4: END_IF
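To make the example concrete, the following minimal Python sketch counts how many of the three normalized sequences contain the logging pattern as an ordered subsequence. It only computes the support of one candidate pattern; it is not the sequential pattern mining algorithm used by Fung. The function names `is_subsequence` and `support` are hypothetical, and allowing gaps between pattern elements is an assumption suggested by the first sequence, where getMessage() sits between IF and debug().

```python
def is_subsequence(pattern, sequence):
    """Return True if the pattern elements occur in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # "element in remaining" advances the iterator, so order is enforced.
    return all(element in remaining for element in pattern)


def support(pattern, sequences):
    """Number of sequences that contain the pattern as an ordered subsequence."""
    return sum(1 for sequence in sequences if is_subsequence(pattern, sequence))


# Normalized method call sequences from the slide.
SEQUENCES = [
    ["isDebugEnabled()", "IF", "getMessage()", "debug()", "END_IF"],
    ["getStatus()", "isDebugEnabled()", "IF", "debug()", "END_IF"],
    ["isDebugEnabled()", "IF", "debug()", "END_IF"],
]

LOGGING_PATTERN = ["isDebugEnabled()", "IF", "debug()", "END_IF"]

if __name__ == "__main__":
    print(support(LOGGING_PATTERN, SEQUENCES))
```

All three sequences contain the pattern, so its support in this toy input is 3.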

Needs for application-specific concerns

Detected coding patterns include generic idioms

Idioms also frequently appear across code base

Less interesting to developers who need application-specific knowledge

Target application Detected patterns

Logging

1: isDebugEnabled()

2: IF

3: debug()

4: END_IF

Iterator idiom

1: iterator()

2: hasNext()

3: LOOP

4: next()

5: hasNext()

6: END_LOOP

Filtering approach: cross-application fan-in analysis

Key Idea

Generic idioms appear in various applications

Application-specific patterns appear in a few applications

→ Measure how widely a class/pattern is used across applications

– “Universality” metric

Logging

1: isDebugEnabled()

2: IF

3: debug()

4: END_IF

Iterator idiom

1: iterator()

2: hasNext()

3: LOOP

4: next()

5: hasNext()

6: END_LOOP

Logging: appears in only two applications

Iterator idiom: appears in almost all applications

Approach overview

Collect various applications

Including target application

Analyze the use-relation between the classes in the applications

Measure the universality metric for each class

Filter out the patterns comprising only universally-used classes.

[Figure: Target application → Detected patterns → Filtered patterns. Application collection → Use-relation between classes → List of universally-used classes, which is used to filter the detected patterns.]

Example detected patterns shown in the figure:

indexOf / lastIndexOf / substring

iterator() / hasNext() / LOOP / next() / hasNext() / END_LOOP

isDebugEnabled() / IF / debug() / END_IF

activate / IF / deactivate / END_IF

contains / IF / get / END_IF

Cross-application use-relation

An extension of ordinary static use-relation analysis between classes in an application.

Build a use-relation graph

Node: class

Edge: static use-relation between classes

Kinds of use-relation

Inheritance, Method call, Field access, Instantiation and Variable/Parameter declaration

Source code (WarehouseApp):

    class Liquor {
        long price;
        String name;
    }

    class Warehouse {
        Liquor liq = new Liquor();
    }

Use-relation graph (WarehouseApp): Warehouse → Liquor

Cross-application use-relation

Analyze use-relation between classes across application borders

Analyze intra-application use-relation

→ in the same way as for a single application

If there are several copies of “used class” in different applications, create edges to all of them

[Figure: use-relation graph spanning WarehouseApp (Warehouse, Liquor) and StoreApp (Store, Shelf, Liquor, Paper). The Liquor class in StoreApp is a copy of Liquor in WarehouseApp.]
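The rule "create edges to all copies" can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: copies are matched by class name, the function name `build_cross_application_edges` is hypothetical, and only the Warehouse → Liquor use comes from the slides. The uses listed for Store and Shelf are made up for the example because the edges of the original figure are not recoverable.

```python
from collections import defaultdict

# (application, class name) -> names of the classes it uses.
# Only Warehouse -> Liquor comes from the slides; the other uses are hypothetical.
USES = {
    ("WarehouseApp", "Warehouse"): ["Liquor"],
    ("WarehouseApp", "Liquor"): [],
    ("StoreApp", "Store"): ["Shelf", "Liquor", "Paper"],
    ("StoreApp", "Shelf"): ["Liquor", "Paper"],
    ("StoreApp", "Liquor"): [],
    ("StoreApp", "Paper"): [],
}


def build_cross_application_edges(uses):
    """Return the set of (user, used) edges between (application, class) nodes."""
    # Index every copy of a class by its name (assumption: same name means copy).
    copies = defaultdict(list)
    for app, name in uses:
        copies[name].append((app, name))

    edges = set()
    for user, used_names in uses.items():
        for used_name in used_names:
            # Create an edge to every copy, regardless of which application holds it.
            for target in copies.get(used_name, []):
                edges.add((user, target))
    return edges


if __name__ == "__main__":
    for user, target in sorted(build_cross_application_edges(USES)):
        print(user, "->", target)
```

With this input, Warehouse gets an edge to the Liquor class in WarehouseApp and another to its copy in StoreApp.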

Class fan-in and application fan-in

Class fan-in of a class c

The number of classes using c

Application fan-in of a class c

The number of applications using c

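    App  Class      CFI  AFI
    WA   Warehouse  0    0
    WA   Liquor     3    2
    SA   Store      0    0
    SA   Shelf      1    1
    SA   Liquor     3    2
    SA   Paper      2    1

WA: WarehouseApp; SA: StoreApp; CFI: class fan-in; AFI: application fan-in

[Figure: the cross-application use-relation graph of WarehouseApp and StoreApp]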
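The two fan-in definitions can be sketched in Python as below. The input maps each class to the class names it uses; apart from Warehouse → Liquor, these uses are hypothetical and were chosen only so that the result reproduces the CFI and AFI values of the example table. Treating classes with the same name as copies that share their users is likewise an assumption, and `fan_in` is a hypothetical helper name.

```python
from collections import defaultdict

# (application, class name) -> names of the classes it uses.
# Hypothetical uses, chosen to reproduce the CFI/AFI values in the example table.
USES = {
    ("WA", "Warehouse"): ["Liquor"],
    ("WA", "Liquor"): [],
    ("SA", "Store"): ["Shelf", "Liquor", "Paper"],
    ("SA", "Shelf"): ["Liquor", "Paper"],
    ("SA", "Liquor"): [],
    ("SA", "Paper"): [],
}


def fan_in(uses):
    """Return {(app, class): (class fan-in, application fan-in)}."""
    using_classes = defaultdict(set)  # class name -> classes that use it
    using_apps = defaultdict(set)     # class name -> applications that use it
    for (app, cls), used_names in uses.items():
        for used_name in used_names:
            using_classes[used_name].add((app, cls))
            using_apps[used_name].add(app)
    # Every copy of a class receives the users of all copies.
    return {
        (app, cls): (len(using_classes[cls]), len(using_apps[cls]))
        for (app, cls) in uses
    }


if __name__ == "__main__":
    for (app, cls), (cfi, afi) in fan_in(USES).items():
        print(app, cls, cfi, afi)
```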

The Class Universality Metric

Class universality of a class c

Represents how widely a class is used

→ From many classes / applications

\[
\mathrm{universality}(c) = \frac{\log(i_c + 1)}{\log(|C|)} \times \frac{\log(a_c + 1)}{\log(|A|)}
\]

i_c: class fan-in of c; a_c: application fan-in of c; |C|: total number of classes; |A|: total number of applications

First factor: frequently used locally. Second factor: frequently used universally.

    App  Class          CFI  AFI  Univ.
    WA   Warehouse      0    0    0
    WA   Liquor         3    2    1.2
    SA   Store          0    0    0
    SA   Shelf          1    1    0.39
    SA   Liquor (copy)  3    2    1.2
    SA   Paper          2    1    0.61
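The example values can be checked with a short Python sketch. It assumes the metric is the product of log(i_c + 1) / log(|C|) and log(a_c + 1) / log(|A|), and that the toy example counts |C| = 6 classes (copies counted separately) and |A| = 2 applications. The function name `class_universality` is hypothetical.

```python
import math


def class_universality(class_fan_in, app_fan_in, total_classes, total_apps):
    """Product of the normalized log class fan-in and log application fan-in.

    Assumes total_classes > 1 and total_apps > 1 so both denominators are non-zero.
    """
    local_term = math.log(class_fan_in + 1) / math.log(total_classes)
    universal_term = math.log(app_fan_in + 1) / math.log(total_apps)
    return local_term * universal_term


# (application, class) -> (CFI, AFI), taken from the example table.
FAN_IN = {
    ("WA", "Warehouse"): (0, 0),
    ("WA", "Liquor"): (3, 2),
    ("SA", "Store"): (0, 0),
    ("SA", "Shelf"): (1, 1),
    ("SA", "Liquor (copy)"): (3, 2),
    ("SA", "Paper"): (2, 1),
}

TOTAL_CLASSES = len(FAN_IN)                    # 6 classes, copies counted separately
TOTAL_APPS = len({app for app, _ in FAN_IN})   # 2 applications

if __name__ == "__main__":
    for (app, cls), (cfi, afi) in FAN_IN.items():
        value = class_universality(cfi, afi, TOTAL_CLASSES, TOTAL_APPS)
        print(f"{app} {cls}: {value:.2f}")
```

Under these assumptions the sketch gives roughly 1.23 for Liquor, 0.39 for Shelf and 0.61 for Paper, which agrees with the table to the precision shown there. In such a tiny collection the value can exceed 1 because a_c + 1 may be larger than |A|.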

The Pattern Universality Metric

Pattern universality of a pattern p

The minimum universality value of the classes whose methods are invoked in p

A universal pattern comprises only universal classes

Coding pattern

1: iterator()

2: hasNext()

3: LOOP

4: next()

5: hasNext()

6: END_LOOP

Involved classes

    Class       Univ.
    Collection  0.72
    Iterator    0.77

Pattern universality = 0.72
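As a minimal Python sketch of this definition, a pattern can be represented as a list of (receiver class, element) pairs, with `None` as the class for control elements such as LOOP. This representation and the function name `pattern_universality` are assumptions made for illustration; the universality values are the ones shown on the slide.

```python
# Class universality values from the slide.
CLASS_UNIVERSALITY = {"Collection": 0.72, "Iterator": 0.77}

# The iterator idiom; control elements carry no receiver class (None).
ITERATOR_PATTERN = [
    ("Collection", "iterator()"),
    ("Iterator", "hasNext()"),
    (None, "LOOP"),
    ("Iterator", "next()"),
    ("Iterator", "hasNext()"),
    (None, "END_LOOP"),
]


def pattern_universality(pattern, class_universality):
    """Minimum universality over the classes whose methods are invoked in the pattern."""
    involved = {cls for cls, _ in pattern if cls is not None}
    return min(class_universality[cls] for cls in involved)


if __name__ == "__main__":
    print(pattern_universality(ITERATOR_PATTERN, CLASS_UNIVERSALITY))
```

Taking the minimum means a single application-specific class is enough to keep a pattern from being treated as universal.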

Case studies

Case Study 1

Measure class universality value of actual classes

Case Study 2

Measure pattern universality value of coding patterns detected by Fung

Case Study 1

Overview

Questions

Q.1 What kind of classes have high universality?

Q.2 Can universality distinguish widely used classes from classes that are simply frequently used?

Q.3 What threshold value is good for filtering?

Process

Measure class universality of classes in application collection

Investigate the result to answer the questions

→ The top-20 classes by universality [Q.1]

→ Difference between the universality and the fan-in [Q.2]

→ Distribution of the universality [Q.3]

Target

39 application packages (131,328 classes)

→ Java SE 1.5

→ Various OSS packages covering a broad range of domains

– Eclipse (IDE), Azureus (Network client), Apache Tomcat (Network server), Freemind (Drawing tool), …

Case Study 1

Top 20 classes by class universality

    Rank  Class name                          Univ.  CFI
    1     java.lang.String                    0.933  69,324
    2     java.lang.Object                    0.915  55,628
    3     java.util.List                      0.793  12,981
    4     java.lang.System                    0.780  11,191
    5     java.lang.Class                     0.776  10,590
    6     java.lang.Throwable                 0.775  10,467
    7     java.util.Iterator                  0.773  10,191
    8     java.util.ArrayList                 0.772  10,135
    9     java.lang.Exception                 0.761  8,840
    10    java.util.Map                       0.757  8,476
    11    java.lang.Integer                   0.748  7,568
    12    java.util.Set                       0.741  6,954
    13    java.io.File                        0.736  6,554
    14    java.lang.StringBuffer              0.735  6,907
    15    java.io.PrintStream                 0.730  6,132
    16    java.util.HashMap                   0.730  6,129
    17    java.io.IOException                 0.725  6,115
    18    java.util.Collection                0.724  5,690
    19    java.lang.IllegalArgumentException  0.714  5,057
    20    java.lang.Runnable                  0.699  6,790

Q.1 What kind of classes have high universality?

→ Fundamental / utility classes

Case Study 1

Universality and fan-in

High universality / Low fan-in: classes with fundamental / utility role

    Class name                Rank [Univ.]  Rank [CFI]
    java.lang.Character       39            104
    java.util.LinkedList      41            105
    java.io.FileOutputStream  56            177
    java.lang.Comparable      78            240
    java.util.Stack           95            354

Low universality / High fan-in: classes implementing crosscutting concerns in a large application

    Class name                  Rank [Univ.]  Rank [CFI]
    org.eclipse.swt...Control   213           25
    org.eclipse.swt.SWT         221           34
    org.eclipse.core…IResource  564           69
    org.openide.util.NbBundle   1,398         24
    org.openide.ErrorManager    1,496         54

Q.2 Can universality distinguish widely used classes from classes that are simply frequently used?

→ Yes.

Case Study 1

Distribution

1.0-0.5: general-purpose classes

Primitive/fundamental classes, collection utilities, …

0.5-0.2: domain-specific classes

Logging utility, networking, GUI, …

0.2-0: application-local classes

    Univ.      # of classes  Package
    1.0 – 0.9  2             java.lang
    0.9 – 0.8  0             -
    0.8 – 0.7  17            java.util, java.lang, java.io
    0.7 – 0.6  18            java.lang, java.util, java.io, java.net, java.awt
    0.6 – 0.5  49            java.util, java.lang, java.io, javax.swing, java.awt,...
    0.5 – 0.4  80            java.io, java.lang, javax.swing, javax.swing, java.awt,...
    0.4 – 0.3  196           org.eclipse.swt.widgets, javax.swing, java.util, java.awt.event, java.lang, ...
    0.3 – 0.2  348           org.eclipse.swt.widgets, org.eclipse.swt.graphics, javax.swing,
    0.2 – 0.1  1,385         org.gudy.azureus2.core3.util, org.bouncycastle.asn1, ...
    0.1 – 0.0  129,233       soot.jimple.parser.node, org.apache.poi.....functions, test, soot.coffi, ...

Q.3 What threshold value is good for filtering?

→ 0.2 for finding application-specific concerns

→ 0.5 for filtering out generic concerns

Case Study 2

Overview

Question

Can the pattern universality distinguish among generic, domain-specific and application-specific patterns?

Process

Categorize coding patterns according to pattern universality

→ 1.0 – 0.5: Generic pattern

→ 0.5 – 0.2: Domain-specific pattern

→ 0.2 – 0.0: Application-specific pattern

Target

Coding patterns

→ Azureus (presented in [Ishio, 2008])

Application collection

→ Same as Case Study 1

Case Study 2

Result

Generic patterns (2290 patterns)

String manipulation

→ String.lastIndexOf() / IF / String.substring() / END_IF

Collection manipulation

→ List.get() / IF / List.remove() / END_IF

Domain-specific patterns (79 patterns)

Collection manipulation

→ Map.size() / Iterator.remove() / LinkedHashMap.get() / LinkedHashMap.remove()

→ Domain-specific?

Application-specific patterns (2293 patterns)

Logging

→ LOOP / Thread.sleep() / Debug.printStackTrace() / END_LOOP

Synchronization

→ IF / AEMonitor.enter() / ArrayList.remove() / AEMonitor.exit() / END_IF

Q. Can the pattern universality distinguish generic / domain-specific / application-specific patterns?

→ Almost yes.

Discussion

Universality metric can distinguish universally-used classes

Resource management classes in Eclipse/NetBeans are distinguished as application-specific

→ although they have large fan-in

Universality metric value may depend on the set of applications analyzed

Case studies on different targets are needed

→ E.g. industrial software systems.

Discussion

Some domain-specific classes have higher class universality than general-purpose classes

Ideas to improve the metric

Propagate fan-in through important use-relation

→ E.g. inheritance

Combine with other metrics

    Rank [Univ.]  Class name               Univ.
    33            java.awt.Component       0.63
    113           java.util.ListIterator   0.46
    230           java.util.LinkedHashMap  0.36

+ Less popular generic concerns may be more interesting than famous domain-specific ones

Summary and future work

Cross-application fan-in analysis for filtering coding patterns

Measures universality, a metric that represents how widely a class/pattern is used

Future work

Case studies with different applications

Refinement of the universality metric
