A Service-Oriented Componentization Framework
for Java Software Systems
MASc Seminar
Shimin Li
Software Technologies Applied Research Lab
Department of Electrical & Computer
Engineering
Outline



Motivation
Research Goals
Proposed Framework (SOC4J)





Architecture
Processes
Case Studies
Thesis Contributions
Future Work
August 29, 2006
Shimin Li, MASc Seminar
2
Motivation




Service-oriented computing has dramatically changed the way in
which we develop software systems.
Providing competitive services to the global market is critical to the
success of businesses and organizations.
Many competitive services have already been implemented in
existing systems.
To expose all or parts of an existing system as business services is
one of the most effective ways to leverage the value of the system.


A business service of a software system is an abstract resource that
represents a capability of performing tasks that represent a coherent
functionality.
To reuse business services, the service-oriented architecture
suggests realizing them into self-contained components.

A self-contained component is a component that contains all the source
code necessary to implement its services.
Research Questions

How to reuse an existing object-oriented software system?



Transforming the functionality of the software system into services
by identifying critical business services embedded in the system.
Realizing the identified services into self-contained components.
How to improve the maintainability of an existing object-oriented software system?


Transforming the monolithic architecture of the existing system into
a more flexible service-oriented architecture.
Reconstructing the existing system into a component-based system.
Research Goals

To identify critical business services embedded in an existing
Java system.

To realize identified services as self-contained components.

To reconstruct the existing system into a component-based
system.

To build a comprehensive framework addressing the above
objectives, based on the following research areas:




Program Comprehension
Program Migration
Architecture Recovery
Software Reuse
Service-Oriented Componentization Framework for
Java Software Systems (SOC4J)
[Figure: the SOC4J framework, from Java source code to a component-based system]
Stage I: Architecture Recovery (AR)
Source Code Modeling: Java source code → source code models
Architecture Modeling: source code models → architectural models
Stage II: Service Identification (SI)
Top-Level Service Identification: architectural models → top-level services
Low-Level Service Identification: top-level services → top-level services and their low-level services
Stage III: Component Generation (CG)
Service Realization: services → self-contained components, kept in the Self-Contained Component Repository
Stage IV: System Transformation (ST)
Architecture Reconstruction: self-contained components → component-based system
Stage I (AR) : Source Code Modeling

Goal


To build a complete data model set for Java source code at different levels
of abstraction to support structural analysis and recovery.
Approach
[Figure: source code modeling pipeline. Java Source Code → Interpreter → Raw Data → Model Generator → Source Code Models (XML Doc). JavaCC (Java Compiler Compiler) generates the Interpreter from a JavaCC grammar.]
Source Code Models





JPackage – To model Java packages
JFile – To model Java source files
JClass – To model Java classes and interfaces
JMethod – To model Java methods and constructors
To construct the Basic View (BView) of the system
Stage I (AR) : Architecture Modeling

Goal


To establish a repository of relationships among classes and interfaces
which can easily be queried in the service identification stage.
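Approach
[Figure: architecture modeling pipeline. Input: Source Code Models (XML Doc). Processing components: XML Parser, Relationship Extractor, Graph Generator, Graph Transformer, Metric Generator. Intermediate data: Objects. Outputs: CIRG and CIDG.]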
Architectural Models



Class/Interface Relationship Graph (CIRG)
Class/Interface Dependency Graph (CIDG)
To build the Structure View (SView) of the system
Class/Interface Relationship Graph (CIRG)

Purpose



To capture different types of relationships among classes and interfaces
To describe relationships as graph representations
Definitions
A Labeled Directed Graph (LDG) is a tuple $\Gamma(V, E, L_V, L_E, l_V, l_E)$, where $V$ is a set of nodes (or
vertices), $E$ is a set of edges (or arcs), $L_V$ is a set of node labels, $L_E$ is a set of edge labels,
$l_V : V \to L_V$ is a label function that maps nodes to node labels, and $l_E : E \to L_E$ is a label
function that maps edges to edge labels.
The CIRG of an object-oriented system is an LDG, where $V$ is the set of all classes/interfaces of
the system, $l_V(v)$ returns the full name (i.e., the package name concatenated with the class or
interface name) of $v$ for any $v \in V$, $E = \{(v, w) \in V \times V \mid v \text{ references } w\}$, and $l_E(e)$
returns the types of relationships between the source node and target node of $e$ for any $e \in E$.
In SOC4J, the type of a relationship is one of IN, RE, AS, AG, CO, and US, which represent
inheritance, realization, association, aggregation, composition, and usage, respectively.
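The CIRG definition can be illustrated with a small data structure. The following is only a minimal sketch, not the JComp implementation: the class name CirgSketch, its methods, and the demo class names are hypothetical, and nodes are identified directly by their full names so that the node label function is the identity.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a CIRG: a directed graph whose edges carry sets of relationship types. */
final class CirgSketch {

    /** The six relationship types named in the definition. */
    enum Relation { IN, RE, AS, AG, CO, US }

    // source full name -> (target full name -> relationship types on that edge)
    private final Map<String, Map<String, EnumSet<Relation>>> edges = new LinkedHashMap<>();

    /** Registers a node; the node label l_V(v) is the full name itself. */
    void addNode(String fullName) {
        edges.computeIfAbsent(fullName, k -> new LinkedHashMap<>());
    }

    /** Records that source references target through the given relationship type. */
    void addRelation(String source, String target, Relation type) {
        addNode(source);
        addNode(target);
        edges.get(source)
             .computeIfAbsent(target, k -> EnumSet.noneOf(Relation.class))
             .add(type);
    }

    /** The edge label l_E(e): all relationship types from source to target (empty if no edge). */
    Set<Relation> label(String source, String target) {
        Map<String, EnumSet<Relation>> out = edges.get(source);
        if (out == null || !out.containsKey(target)) {
            return EnumSet.noneOf(Relation.class);
        }
        return EnumSet.copyOf(out.get(target));
    }

    /** The node set V. */
    Set<String> nodes() {
        return edges.keySet();
    }

    public static void main(String[] args) {
        CirgSketch g = new CirgSketch();
        // One edge may carry several relationship types, as in a <<CO, AG>> label.
        g.addRelation("demo.C1", "demo.C3", Relation.CO);
        g.addRelation("demo.C1", "demo.C3", Relation.AG);
        g.addRelation("demo.C2", "demo.C1", Relation.IN);
        System.out.println(g.label("demo.C1", "demo.C3"));
    }
}
```

Storing a set of types per edge, rather than one edge per type, keeps E a plain subset of V × V as in the definition.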
Class/Interface Dependency Graph (CIDG)

Purpose



To capture the dependency relationship among classes and interfaces
To represent the CIRG at different levels of abstraction
Definition
The CIDG of an object-oriented system is an LDG, where $V$ is the set of
all classes/interfaces of the system, $l_V(v)$ returns the full name (i.e.,
the package name concatenated with the class or interface name) of $v$ for
any $v \in V$, $E = \{(v, w) \in V \times V \mid v \text{ depends on } w\}$, $L_E = \emptyset$, and hence
$l_E(e)$ returns an empty label for any $e \in E$.
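One way to read the CIDG definition is as the CIRG with its edge labels dropped. The sketch below assumes exactly that: every labeled relationship from v to w counts as a dependency of v on w. The names CidgSketch, Edge, and toCidg are hypothetical, and the slides do not state that JComp derives the CIDG this way.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: derive an unlabeled dependency graph from labeled relationship edges. */
final class CidgSketch {

    /** A labeled CIRG edge; the type is ignored when the CIDG is built. */
    static final class Edge {
        final String source;
        final String target;
        final String type;

        Edge(String source, String target, String type) {
            this.source = source;
            this.target = target;
            this.type = type;
        }
    }

    /** Builds an adjacency map with at most one unlabeled edge per (source, target) pair. */
    static Map<String, Set<String>> toCidg(List<Edge> cirgEdges) {
        Map<String, Set<String>> cidg = new LinkedHashMap<>();
        for (Edge e : cirgEdges) {
            cidg.computeIfAbsent(e.source, k -> new LinkedHashSet<>()).add(e.target);
            // Make sure the target is present as a node even if it has no outgoing edges.
            cidg.computeIfAbsent(e.target, k -> new LinkedHashSet<>());
        }
        return cidg;
    }

    public static void main(String[] args) {
        List<Edge> cirg = Arrays.asList(
                new Edge("demo.C1", "demo.C3", "CO"),
                new Edge("demo.C1", "demo.C3", "AG"),
                new Edge("demo.C2", "demo.C1", "IN"));
        System.out.println(toCidg(cirg));
    }
}
```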
[Figure: an example CIRG and the corresponding CIDG over the classes C1 to C5, one of which is marked abstract. The CIRG edges carry the labels <<IN>>, <<RE>>, <<RE>>, <<AG>>, <<CO, AG>>, <<AS>>, and <<US, AS>>; the CIDG edges are unlabeled.]
Service Description in SOC4J

Classification



Top-Level Service (TLS): A top-level service is a service that is not used by
any other services of the system. It may contain a hierarchy of low-level
services that further describes/modularizes the service.
Low-Level Service (LLS): A low-level service is a service that is underneath
a top-level service and may be used by other low-level services.
Representation
A service is represented as a tuple: (name, CF, SHG)
name – name of the service,
CF – Façade Class Set of the service,
SHG – Service Hierarchy Graph that is associated with a top-level
service. The SHG describes structural relationships between a
top-level service and its low-level services.
[Figure: An Example of SHG. Top-level service: MASc Seminar Arrangement. Low-level services: Program Status Checking, Room Booking, Announcement, Database Connection.]
All top-level services of a system and their SHGs build the Service View (ServView).
Stage II (SI) : Top-Level Service Identification

Goals



To identify the top-level services embedded in an existing Java software
system.
To build an initial SHG for each identified top-level service. Low-level
services within the initial SHG are called atomic services. An atomic
service is a service provided by a single Java class or interface.
Rationale



By their definition, top-level services partition the system into independent
parts.
Each of these independent parts contains an entry point of the system.
From the user's point of view, each of these independent parts represents
a service (i.e., top-level service) to the outside world.
Stage II (SI) : Top-Level Service Identification cont’d

Approach
[Figure: Top-Level Service Identification Process. CIRG, CIDG → CIDG Transformation → MCIDGs → Top-Level Service Candidate Generation → top-level service candidates → Service Validation → validated top-level services and their atomic services (described in SHGs).]
CIDG Transformation – decomposing the CIDG into a set of rooted
components. A rooted component is called a Modularized CIDG (MCIDG).
Each MCIDG is a subgraph of the CIDG and represents an independent
part of the system.
Top-Level Service Candidate Generation – generating top-level service
candidates from MCIDGs and describing each candidate as a tuple,
(name, CF, SHG).
Service Validation – validating each candidate by examining classes
within its façade class set (classes in the façade class set represent
the functionality of the service) and assigning a meaningful name to
each accepted service.
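The decomposition into rooted components can be sketched as follows. This is an illustrative reading only, under two assumptions that the slides do not confirm: a root is a class or interface on which nothing else depends (no incoming CIDG edge), and the MCIDG of a root is everything reachable from it. The names TopLevelCandidateSketch, roots, and reachable are hypothetical. Dependency cycles without such a root would need extra handling.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: roots of a dependency graph and the subgraph nodes reachable from each. */
final class TopLevelCandidateSketch {

    /** Nodes with no incoming dependency edge; assumed here to be the roots of the rooted components. */
    static Set<String> roots(Map<String, Set<String>> cidg) {
        Set<String> roots = new LinkedHashSet<>(cidg.keySet());
        for (Set<String> targets : cidg.values()) {
            roots.removeAll(targets);
        }
        return roots;
    }

    /** All nodes reachable from root, including root itself (iterative depth-first search). */
    static Set<String> reachable(Map<String, Set<String>> cidg, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            String v = stack.pop();
            if (!seen.add(v)) {
                continue; // already visited
            }
            for (String w : cidg.getOrDefault(v, Collections.<String>emptySet())) {
                stack.push(w);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cidg = new LinkedHashMap<>();
        cidg.put("A", new LinkedHashSet<>(Set.of("B", "C")));
        cidg.put("B", new LinkedHashSet<>(Set.of("C")));
        cidg.put("C", new LinkedHashSet<>());
        cidg.put("D", new LinkedHashSet<>(Set.of("C")));
        for (String root : roots(cidg)) {
            System.out.println(root + " -> " + reachable(cidg, root));
        }
    }
}
```

Under this reading, two candidates may share classes (C in the example), which is consistent with each MCIDG being a subgraph rather than a partition block.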
Stage II (SI) : Low-Level Service Identification

Goal


To identify reusable services underneath each top-level service.
Rationale




The initial SHG built in the top-level service identification process is a
rooted directed graph. It represents the structural dependency between
a top-level service and its low-level services (atomic services).
Atomic services (provided by a single Java class) are very fine-grained
and therefore have very limited reusability.
Highly related atomic services could be clustered together to represent a
new service. The newly identified service has higher level of granularity
and thus presents a higher potential of reuse.
After service clustering, a new SHG can be built by introducing the newly
identified services.
Stage II (SI) : Low-Level Service Identification cont’d

Approach
[Figure: Low-Level Service Identification process. Input: a top-level service and its atomic services (described in SHG). Steps: SHG Transformation → SHG → Service Aggregation (Dominance Tree Generation → DTree of SHG → Dominance Tree Reduction → reduced DTree) → SHG Reconstruction → SHG. If the termination criteria are not satisfied, the process repeats; otherwise the output is the input top-level service and its low-level services (described in the newly built SHG).]
SHG Transformation – preprocessing the SHG, such as collapsing cycles.
Dominance Tree Generation – generating the dominance tree from the SHG.
Dominance Tree Reduction – identifying highly related services and
clustering these services into a new service. The newly identified
service has a higher level of granularity.
SHG Reconstruction – reconstructing the SHG from a reduced dominance
tree.
Termination Criteria:
(1) The top-level service has been nicely modularized by its low-level
services.
(2) Low-level services present an appropriate level of granularity.
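The dominance tree generation step can be illustrated with the textbook iterative dominator computation. This is a generic sketch, not the algorithm used in JComp (the slides do not say which one is used). It assumes the SHG is given as a successor map in which every node is reachable from the root; the names DominanceTreeSketch and immediateDominators are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: immediate dominators of a rooted directed graph. */
final class DominanceTreeSketch {

    /** Returns child -> parent edges of the dominance tree (the root has no entry). */
    static Map<String, String> immediateDominators(Map<String, Set<String>> graph, String root) {
        // Collect all nodes and derive predecessor sets from the successor map.
        Set<String> nodes = new LinkedHashSet<>(graph.keySet());
        for (Set<String> targets : graph.values()) {
            nodes.addAll(targets);
        }
        Map<String, Set<String>> preds = new LinkedHashMap<>();
        for (String v : nodes) {
            preds.put(v, new LinkedHashSet<>());
        }
        for (Map.Entry<String, Set<String>> e : graph.entrySet()) {
            for (String w : e.getValue()) {
                preds.get(w).add(e.getKey());
            }
        }

        // dom(root) = {root}; every other node starts with the full node set.
        Map<String, Set<String>> dom = new LinkedHashMap<>();
        for (String v : nodes) {
            dom.put(v, new LinkedHashSet<>(nodes));
        }
        dom.put(root, new LinkedHashSet<>(Collections.singleton(root)));

        // Iterate dom(v) = {v} ∪ intersection of dom(p) over all predecessors p, until stable.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String v : nodes) {
                if (v.equals(root)) {
                    continue;
                }
                Set<String> next = new LinkedHashSet<>(nodes);
                for (String p : preds.get(v)) {
                    next.retainAll(dom.get(p));
                }
                next.add(v);
                if (!next.equals(dom.get(v))) {
                    dom.put(v, next);
                    changed = true;
                }
            }
        }

        // The dominators of v form a chain, so the immediate dominator is the
        // strict dominator whose own dominator set is exactly one element smaller.
        Map<String, String> idom = new LinkedHashMap<>();
        for (String v : nodes) {
            if (v.equals(root)) {
                continue;
            }
            for (String d : dom.get(v)) {
                if (!d.equals(v) && dom.get(d).size() == dom.get(v).size() - 1) {
                    idom.put(v, d);
                }
            }
        }
        return idom;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> shg = new LinkedHashMap<>();
        shg.put("R", new LinkedHashSet<>(Set.of("A", "B")));
        shg.put("A", new LinkedHashSet<>(Set.of("C")));
        shg.put("B", new LinkedHashSet<>(Set.of("C")));
        shg.put("C", new LinkedHashSet<>(Set.of("D")));
        System.out.println(immediateDominators(shg, "R"));
    }
}
```

In the example, C is used by both A and B, so its parent in the dominance tree is R, while D is used only through C and stays under C. Services that hang under the same dominator are natural candidates for clustering in the reduction step.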
Component Description in SOC4J

Classification



Top-Level Component (TLC): A top-level component is a component that
realizes a top-level service.
Low-Level Component (LLC): A low-level component is a component
that realizes a low-level service.
Representation
A component is described as a tuple: (name, if, CF, CC, CHG)
name – name of the component,
if – interface of the component,
CF – Façade Class Set of the component,
CC – Constituent Class Set of the component,
CHG – Component Hierarchy Graph that is associated with a top-level
component. The CHG describes structural relationships between a
top-level component and its low-level components.
[Figure: An Example of CHG. Top-level component: MASc Seminar Arrangement. Low-level components: Program Status Checking, Room Booking, Announcement, Database Connection.]
All top-level components of a system and their CHGs build the Component View (CompView)
Component Reusability Model in SOC4J
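[Figure: reusability model hierarchy with four levels (Characteristic, Quality Factor, Criteria, Metric). Characteristic: Reusability. Quality factors: Understandability, Adaptability, Portability. Criteria and their metrics: Complexity (RPD), Observability (RCO), Customizability (RCC), External Dependency (SCCr, SCCp).]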






RPD (Reference Parameter Density) measures the occurrence of reference parameters in the interface
of a component.
RCO (Rate of Component Observability) measures the percentage of readable properties in all fields
declared in the interface of a component.
RCC (Rate of Component Customizability) measures the percentage of writable properties in all fields
declared in the interface of a component.
SCCr (Self-Completeness of Component's Return Values) measures the percentage of business methods
without any return values in all business methods implemented in a component.
SCCp (Self-Completeness of Component's Parameters) measures the percentage of business methods
without any parameters in all business methods implemented in a component.
Based on the above metrics, reusability is formulated as a value in [-1, 1]. A higher value
represents a higher level of reusability.
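The four percentage metrics can be sketched as plain ratios. The sketch below covers only RCO, RCC, SCCr, and SCCp as they are worded above; it does not attempt RPD or the combined reusability value in [-1, 1], because the slides do not give those formulas. The class and method names are hypothetical, and returning 0 for an empty denominator is an assumption.

```java
/** Hypothetical sketch of the ratio-style metrics described on the slide. */
final class ReusabilityMetricsSketch {

    /** Share of part in whole as a value in [0, 1]; defined as 0 when whole is 0 (assumption). */
    static double ratio(int part, int whole) {
        return whole == 0 ? 0.0 : (double) part / whole;
    }

    /** RCO: readable properties among all fields declared in the component interface. */
    static double rco(int readableProperties, int interfaceFields) {
        return ratio(readableProperties, interfaceFields);
    }

    /** RCC: writable properties among all fields declared in the component interface. */
    static double rcc(int writableProperties, int interfaceFields) {
        return ratio(writableProperties, interfaceFields);
    }

    /** SCCr: business methods without a return value among all business methods. */
    static double sccr(int methodsWithoutReturnValue, int businessMethods) {
        return ratio(methodsWithoutReturnValue, businessMethods);
    }

    /** SCCp: business methods without parameters among all business methods. */
    static double sccp(int methodsWithoutParameters, int businessMethods) {
        return ratio(methodsWithoutParameters, businessMethods);
    }

    public static void main(String[] args) {
        // Illustrative counts only; they are not taken from the case studies.
        System.out.println("RCO  = " + rco(3, 4));
        System.out.println("RCC  = " + rcc(1, 4));
        System.out.println("SCCr = " + sccr(2, 8));
        System.out.println("SCCp = " + sccp(4, 8));
    }
}
```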
Stage III (CG) : Service Realization

Goal


To realize each identified service (both top-level service and low-level
service) into a self-contained component.
Approach





Name the component by copying its service’s name.
Compute the façade class set, CF, by copying its service’s façade class
set.
Extract the constituent class set, CC, from the CIDG in order to make the
component self-contained.
Create a new interface, if, and modify source code of classes/interfaces
in the façade class set so that the user can access all public methods
and class fields defined in classes in the façade class set through the
newly created interface.
Generate the CHG for a top-level component, based on the SHG of its
service.
The above code modification is a kind of refactoring because this modification
does not change the observable behavior of the original system.
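The extraction of the constituent class set can be pictured as a closure over the CIDG. The sketch below assumes that a component becomes self-contained once it includes every class or interface that its façade classes depend on, directly or transitively; the slides do not spell out the exact rule. The names ConstituentClassSetSketch and constituentClasses, and the demo class names, are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: constituent classes as the dependency closure of the façade classes. */
final class ConstituentClassSetSketch {

    /** Returns the façade classes plus everything they transitively depend on in the CIDG. */
    static Set<String> constituentClasses(Map<String, Set<String>> cidg, Set<String> facade) {
        Set<String> constituents = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(facade);
        while (!pending.isEmpty()) {
            String v = pending.pop();
            if (!constituents.add(v)) {
                continue; // already included
            }
            for (String w : cidg.getOrDefault(v, Collections.<String>emptySet())) {
                pending.push(w);
            }
        }
        return constituents;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cidg = new LinkedHashMap<>();
        cidg.put("demo.Booking", new LinkedHashSet<>(Set.of("demo.Room", "demo.Db")));
        cidg.put("demo.Room", new LinkedHashSet<>(Set.of("demo.Db")));
        cidg.put("demo.Db", new LinkedHashSet<>());
        cidg.put("demo.Unrelated", new LinkedHashSet<>());
        System.out.println(constituentClasses(cidg, Set.of("demo.Booking")));
    }
}
```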
Stage IV (ST) : Architecture Reconstruction

Goal


To reconstruct an existing object-oriented system into a component-based
system, based on the components extracted from the system.
Reference Model for Component-Based Systems
[Figure: containment diagram. The target system (a component-based system) contains one or more top-level components (JAR files). Top-level and low-level components (JAR files) contain low-level components and classes/interfaces (Java files).]
Stage IV (ST) : Architecture Reconstruction

Approach

We adopt a bottom-up integration technique that works on the
extracted components, starting with the components in the lowest
position in the component hierarchy:
for each top-level component t do
    while there exists a low-level component in t.CHG do
        find the component c that is in the lowest position in t.CHG;
        retrieve the parents of c in t.CHG;
        refactor the source code of each parent to access c through its interface;
        remove component c from t.CHG;
    end while
end for
The above reconstruction process does not change the observable behavior
of the original system.
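The control flow of the loop above can be sketched in Java for a single top-level component. The sketch makes two assumptions: the CHG is acyclic and given as a map from each component to its children (with an entry for every component), and "lowest position" means a component with no remaining children. The refactoring step is only marked by a comment, and the names BottomUpIntegrationSketch and integrationOrder are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: order in which the low-level components of one CHG are integrated. */
final class BottomUpIntegrationSketch {

    /** Returns the low-level components of the CHG, lowest first. */
    static List<String> integrationOrder(Map<String, Set<String>> chg, String topLevel) {
        // Work on a copy so the caller's CHG is left untouched.
        Map<String, Set<String>> work = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : chg.entrySet()) {
            work.put(e.getKey(), new LinkedHashSet<>(e.getValue()));
        }
        List<String> order = new ArrayList<>();
        while (work.size() > 1) {
            // Find a low-level component with no remaining children.
            String lowest = null;
            for (Map.Entry<String, Set<String>> e : work.entrySet()) {
                if (!e.getKey().equals(topLevel) && e.getValue().isEmpty()) {
                    lowest = e.getKey();
                    break;
                }
            }
            if (lowest == null) {
                throw new IllegalStateException("CHG is not acyclic");
            }
            // Here each parent of 'lowest' would be refactored to access it through its interface.
            for (Set<String> children : work.values()) {
                children.remove(lowest);
            }
            work.remove(lowest);
            order.add(lowest);
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> chg = new LinkedHashMap<>();
        chg.put("T", new LinkedHashSet<>(Set.of("A", "B")));
        chg.put("A", new LinkedHashSet<>(Set.of("C")));
        chg.put("B", new LinkedHashSet<>(Set.of("C")));
        chg.put("C", new LinkedHashSet<>());
        System.out.println(integrationOrder(chg, "T"));
    }
}
```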
JComp – Java Componentization Kit


JComp is a toolkit that implements the proposed SOC4J framework and provides
an integrated workbench for componentizing Java software systems.
It is built on top of the Eclipse Rich Client Platform (RCP) and is composed
of a set of plug-ins.
[Figure: JComp Architecture. The JComp RCP application consists of the Parser, Modeler, Extractor, Generator, and Transformer plug-ins. It runs on the Eclipse RCP platform: UI (Generic Workbench), JFace, SWT, Resource Manager, and Platform Runtime (OSGi).]
Parser Plug-in: Generating source code models (JPackage, JFile, JClass, and
JMethod).
Modeler Plug-in: Building architectural models (CIRG and CIDG).
Extractor Plug-in: Identifying business services.
Generator Plug-in: Generating a self-contained component for each service.
Transformer Plug-in: Reconstructing an existing system into a component-based
system.
Case Studies

Jetty: an open-source, standards-based, and full-featured web server
implemented entirely in Java.

Apache Ant: a software tool for automating software build processes.
It is similar to make, but it is written in Java and is primarily intended
for use with Java.
Project      Version   LOC     Java Source Files   Packages   Classes   Interfaces
Jetty        5.1.10    44125   318                 25         273       47
Apache Ant   1.6.5     86468   690                 70         640       60
Obtained Results - Jetty

33 top-level service candidates were generated.

16 top-level services were accepted.

The unacceptable candidates are dead code, debugging modules,
or testing modules.


For example, we found 8 dead classes in org.mortbay.util package
and a debugging module whose entry point is the class
org.mortbay.servlet.ProxyServlet.
Low-level services underneath each of the 16 top-level services were
identified.
Business Services Identified from Jetty
ID    Top-Level Service               Classes and Interfaces   Low-Level Services
T1    Win32 Server                    248                      11
T2    Dynamic Servlet Invoker         207                      12
T3    Jetty Server MBean              126                      9
T4    Proxy Request Handler           113                      7
T5    XML Configuration MBean         87                       5
T6    Web Application MBean           86                       6
T7    Administration Servlet          56                       5
T8    CGI Servlet                     49                       5
T9    Host Socket Listener            46                       5
T10   Web Configuration               34                       3
T11   Authentication Access Handler   30                       3
T12   Servlet Response Wrapper        27                       2
T13   IP Access Handler               18                       0
T14   Multipart Form Data Filter      16                       2
T15   HTML Script Block               12                       1
T16   Applet Block                    9                        1

Low-Level Services within Win32 Server
Low-Level Service         Component Reusability
Jetty Server              0.9
Service Handlers          0.6
Resource Handler          0.7
Security Handler          0.7
Socket Listener           0.8
HTTP Connection           0.9
HTTP Request              0.7
HTTP Response             0.5
Web Application Context   0.6
Servlet                   0.7
Servlet Handler           0.8
Reusability of Components Extracted from Jetty
[Figure: chart over the top-level components C1 to C16 (x-axis: Top-Level Components; y-axis: Reusability, from 0 to 1) with two series: Reusability of Top-Level Components, and Average Reusability of Low-Level Components in a Top-Level Component.]
Time and Space Statistics of JComp
Measurement Item                                             Jetty   Apache Ant
Source Code Modeling Time (min:sec)                          2:18    5:20
Architecture Modeling Time (min:sec)                         4:19    9:15
Top-Level Service Candidate Identification Time (min:sec)    8:45    19:43
Average Low-Level Service Identification Time (min:sec)      1:06    0:54

Measurement Item                  Jetty   Apache Ant
Source Code Space (MB)            2.95    5.69
Source Code Model Space (MB)      1.43    3.34
Architectural Model Space (MB)    1.57    3.92

JComp was run on a Windows desktop with an Intel Pentium 4 3.4 GHz CPU and 2 GB of memory.
Thesis Contributions

The design and implementation of comprehensive graph
representations of an object-oriented system at different levels of
abstraction.

The design and implementation of an efficient and effective
methodology for identifying and realizing critical business services
embedded in an existing object-oriented system.

The exploration of an incremental program comprehension
approach. The BView, SView, ServView, and CompView built by
the proposed framework help users understand the program step by step.

The design and implementation of a toolkit that provides an
integrated workbench for componentizing Java software systems.
Future Work

To apply dynamic analysis on system behavior within the first
stage of the SOC4J framework to improve the detection of class
relationships.

To investigate some algorithmic processes that can be used to
automatically categorize the identified services and components.

To improve the precision of the service identification by
considering design-patterns, alternate implementations of the
algorithms, and alternate definitions of the class relationships.

To extend the SOC4J framework to other programming
languages, for instance, C++, or even C and COBOL.