A Framework for the Assessment and Selection
of Software Components and Connectors in
COTS-based Architectures
Jesal Bhuta, Chris Mattmann
{jesal, mattmann}@usc.edu
USC Center for Systems & Software Engineering
http://csse.usc.edu
February 13, 2007
Outline
 Motivation and Context
 COTS Interoperability Evaluation Framework
 Demonstration
 Experimentation & Results
 Conclusion and Future Work
COTS-Based Applications Growth Trend
 The number of systems using OTS components is steadily increasing
– USC e-Services projects show the number of CBAs rising from 28% in 1997 to 70% in 2002
– The Standish Group's 2000 survey found similar results (54%) in industry [Standish 2001 - Extreme Chaos]
[Chart: CBA growth trend in USC e-Services projects, percentage of CBAs by year, 1997-2002, shown alongside the Standish Group results]
COTS Integration: Issues
 COTS products are created with their own sets of assumptions, which are not always compatible
– Example: integrating a Java-based Customer Relationship Management (CRM) product with Microsoft SQL Server
 The CRM supports JDBC; MS SQL Server supports ODBC
[Diagram: Java CRM (JDBC) connected to Microsoft SQL Server (ODBC)]
Case Study [Garlan et al. 1995]
 Develop a software architecture toolkit
 COTS selected
– OBST, a public domain object-oriented database
– InterViews, a GUI toolkit
– SoftBench, an event-based tool integration mechanism
– Mach RPC interface generator, an RPC mechanism
 Estimated time to integrate: 6 months and 1 person-year
 Actual time to integrate: 2 years and 5 person-years
Problem: Reduced Trade-Off Space
 Detailed interoperability assessment is effort intensive
– Requires detailed analysis of interfaces and COTS characteristics, plus prototyping
 A large number of COTS products is available in the market
– Over 100 CRM solutions and over 50 databases = 5000 possible combinations
 As a result, interoperability assessment is often neglected until late in the development cycle
 This reduces the trade-off space: medium- and low-priority requirements are traded against the cost to integrate COTS
[Diagram: a large number of COTS choices is narrowed by high-priority functional criteria filtering, then by medium- and low-priority functional criteria filtering, across COTS product types A through D]
Statement of Purpose
To develop an efficient and effective COTS interoperability
assessment framework by:
1. Utilizing existing research and observations to introduce
concepts for representing COTS products
2. Developing rules that define when specific interoperability
mismatches could occur
3. Synthesizing (1 and 2) to develop a comprehensive
framework for performing interoperability assessment
early (late inception) in the system development cycle
Efficient: Acting or producing effectively with a minimum of unnecessary effort
Effective: Producing the desired effect (effort reduction during COTS integration)
Proposed Framework: Scope
 Specifically addresses the problem of technical
interoperability
 Does not address non-technical interoperability issues
– Human-computer interaction incompatibilities
– Inter/intra organization incompatibilities
[Diagram: lifecycle timeline (Inception ending at IRR, Elaboration ending at LCO, Construction ending at LCA, then IOC) annotated with Conceptualize Architecture, Identify COTS Software Products, Apply Proposed Framework, Detailed Analysis & Prototyping, and Integration & Testing; the proposed framework is applied in the high return on investment area around late inception]
IRR – Inception Readiness Review; LCO – Life Cycle Objective Review;
LCA – Life Cycle Architecture Review; IOC – Initial Operational Capability
[Boehm 2000]
Motivating Example: Large-Scale Distributed Scenario
 Manage and disseminate
– Digital content (planetary science data)
– Digital metadata
 Data disseminated in multiple intervals
 Two user classes separated by distributed geographic networks (Internet)
– Scientists from the European Space Agency (ESA)
– External users
[Diagram: NASA JPL (Pasadena, USA) hosts a Digital Asset Management System, a Data Retrieval Component, and a Query Manager fed by additional planetary data; high voluminous data connectors (C1 & C2) carry digital content and metadata to a mirrored Digital Asset Management System, Data Retrieval Component, and Query Manager at ESA (Madrid, Spain); a high voluminous data connector (C3) serves external user systems. Legend: data flow, data store, custom/COTS components, organization intranet.]
Interoperability Evaluation Framework Interfaces
[Diagram: the developer supplies the COTS components and the proposed system architecture to the Interoperability Evaluation Framework, whose COTS Interoperability Evaluator (StudioI) draws on the COTS representation attributes and on integration rules and strategies. The evaluator produces a COTS interoperability analysis report and estimates lines of glue code; the glue-code estimate feeds the COCOTS glue-code estimation model [Chris Abts 2002], yielding a cost and effort estimate to integrate the COTS products.]
COTS Representation Attributes
 COTS General Attributes (4): Name, Role*, Type, Version
 COTS Dependency Attributes* (6): Communication Dependency*, Communication Incompatibilities*, Deployment Language*, Execution Language Support*, Underlying Dependency*, Same Node Incompatibilities*
 COTS Interface Attributes* (14): Binding*, Communication Language Support*, Control Inputs*, Control Outputs*, Control Protocols*, Error Handling Inputs*, Error Handling Outputs*, Extensions*, Data Inputs*, Data Outputs*, Data Protocols*, Data Format*, Data Representation*, Packaging*
 COTS Internal Assumption Attributes (16): Backtracking, Control Unit, Component Priorities, Concurrency, Distribution, Dynamism, Encapsulation, Error Handling Mechanism, Implementation Language*, Layering, Preemption, Reconfiguration, Reentrant, Response Time, Synchronization, Triggering Capability
* indicates the attribute or attribute set can have multiple values
(a minimal code sketch of this representation follows)
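A minimal sketch, assuming a Python encoding, of how these attribute groups might be represented; the field names below are illustrative, cover only a subset of the attributes, and are not the framework's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative subset of the COTS representation attributes; field names
# are assumptions for this sketch, not the framework's actual schema.
@dataclass
class InterfaceAttributes:
    name: str                                        # e.g. "Backend Interface"
    binding: List[str] = field(default_factory=list)
    control_inputs: List[str] = field(default_factory=list)
    control_outputs: List[str] = field(default_factory=list)
    data_inputs: List[str] = field(default_factory=list)
    data_outputs: List[str] = field(default_factory=list)
    data_protocols: List[str] = field(default_factory=list)
    error_outputs: List[str] = field(default_factory=list)
    packaging: Optional[str] = None

@dataclass
class CotsDefinition:
    # General attributes
    name: str
    roles: List[str]
    cots_type: str            # the "Type" general attribute
    version: str
    # Dependency attributes (asterisked groups may hold multiple values)
    underlying_dependencies: List[str] = field(default_factory=list)
    execution_language_support: List[str] = field(default_factory=list)
    deployment_languages: List[str] = field(default_factory=list)
    # One entry per declared interface
    interfaces: List[InterfaceAttributes] = field(default_factory=list)
    # Internal assumption attributes (single-valued)
    control_unit: str = "unspecified"
    concurrency: str = "unspecified"
    error_handling_mechanism: str = "unspecified"
    reentrant: Optional[bool] = None
```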
COTS Definition Example: Apache 2.0

COTS General Attributes (4)
Name: Apache
Role: Platform
Type: Third-party component
Version: 2.0

COTS Dependency Attributes (4)
Communication Dependency: None
Deployment Language: Binary
Execution Language Support: CGI
Underlying Dependencies: Linux, Unix, Windows, Solaris (OR)

COTS Interface Attributes (14)      Backend Interface                      | Web Interface
Binding:                            Runtime Dynamic                        | Topologically Dynamic
Communication Language Support:     C, C++                                 |
Control Inputs:                     Procedure call, Trigger                |
Control Outputs:                    Procedure call, Trigger, Spawn         |
Control Protocols:                  None                                   |
Error Inputs:                                                              |
Error Outputs:                      Logs                                   |
Data Inputs:                        Data access, Procedure call, Trigger   |
Data Outputs:                       Data access, Procedure call, Trigger   |
Data Protocols:                     HTTP Error Codes                       | HTTP
Data Format:                        N/A                                    | N/A
Data Representation:                Ascii, Unicode, Binary                 | Ascii, Unicode, Binary
Extensions:                         Supports Extensions                    |
Packaging:                          Executable Program                     | Web service

COTS Internal Assumption Attributes (16)
Backtracking: No
Control Unit: Central
Component Priorities: No
Concurrency: Multi-threaded
Distribution: Single-node
Dynamism: Dynamic
Encapsulation: Encapsulated
Error Handling Mechanism: Notification
Implementation Language: C++
Layering: None
Preemption: Yes
Reconfiguration: Offline
Reentrant: Yes
Response Time: Bounded
Synchronization: Asynchronous
Triggering Capability: Yes
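For illustration only, part of the Apache 2.0 definition above could be transcribed as plain data; the keys mirror the attribute names on this slide and are assumptions, not the tool's actual input format.

```python
# Partial transcription of the Apache 2.0 COTS definition above.
# Keys are illustrative; this is not StudioI's actual input format.
apache_2_0 = {
    "general": {
        "name": "Apache", "role": "Platform",
        "type": "Third-party component", "version": "2.0",
    },
    "dependencies": {
        "communication_dependency": None,
        "deployment_language": "Binary",
        "execution_language_support": ["CGI"],
        # (OR) on the slide: any one of these platforms satisfies the dependency
        "underlying_dependencies": ["Linux", "Unix", "Windows", "Solaris"],
    },
    "internal_assumptions": {
        "control_unit": "Central", "concurrency": "Multi-threaded",
        "distribution": "Single-node", "error_handling_mechanism": "Notification",
        "implementation_language": "C++", "reentrant": True, "preemption": True,
    },
}
```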
COTS Interoperability Evaluation Framework
[Diagram: the project analyst defines the architecture and COTS combinations through the Architecting User Interface Component, drawing COTS definitions from the COTS Definition Repository, which is populated by the COTS Definition Generator and shared with the COTS Selection Framework and the Interoperability Analysis Framework. The resulting deployment architecture is passed to the Integration Analysis Component, which applies integration rules from the Integration Rules Repository and exchanges connector queries and responses with the COTS Connector Selector of the Level of Service Connector Selection Framework, receiving connector options in return. The output is the COTS interoperability analysis report.]
Integration Rules
 Interface analysis rules
– Example: 'Failure due to incompatible error communication'
 Internal assumption analysis rules
– Example: 'Data connectors connecting components that are not always active'
 Dependency analysis rules
– Example: 'Parent node does not support dependencies required by the child components'
 Each rule includes pre-conditions and results (a minimal rule representation is sketched below)
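A minimal sketch, assuming a Python encoding, of how a rule with pre-conditions and a result might be represented and evaluated against a pair of connected components; names such as `IntegrationRule` and `evaluate` are illustrative, not the framework's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An illustrative rule structure: a rule fires when all of its
# pre-conditions hold for a pair of connected components.
Component = Dict[str, object]              # attribute name -> value, as in the COTS definitions
PreCondition = Callable[[Component, Component], bool]

@dataclass
class IntegrationRule:
    name: str
    pre_conditions: List[PreCondition]
    result: str                            # mismatch reported when the rule fires

def evaluate(rule: IntegrationRule, a: Component, b: Component) -> List[str]:
    """Return the reported mismatch if every pre-condition holds, else nothing."""
    if all(check(a, b) for check in rule.pre_conditions):
        return [f"{rule.name}: {rule.result}"]
    return []
```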
Integration Rules: Interface Analysis
 'Failure due to incompatible error communication'
– Pre-conditions
 2 components (A and B) communicating via data and/or control (bidirectional)
 One component's (A) error handling mechanism is 'notify'
 The two components have incompatible error output/error input methods
– Result
 A failure in component A will not be communicated to component B, causing a permanent block or failure in component B
(an illustrative check for this rule follows)
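A sketch of how the pre-conditions above might be checked against two component definitions; the attribute keys (`error_handling_mechanism`, `error_outputs`, `error_inputs`) follow the representation attributes from earlier slides, but the function and values are an illustration, not the framework's implementation.

```python
from typing import Dict, Optional

Component = Dict[str, object]

def incompatible_error_communication(a: Component, b: Component) -> Optional[str]:
    """Illustrative check for 'Failure due to incompatible error communication'."""
    # Pre-condition (assumed satisfied here): A and B communicate via data
    # and/or control, bidirectionally.
    # Pre-condition: A's error handling mechanism is 'notify'
    notifies = a.get("error_handling_mechanism") == "notify"
    # Pre-condition: no error output of A matches an error input of B
    a_outputs = set(a.get("error_outputs", []))
    b_inputs = set(b.get("error_inputs", []))
    incompatible = not (a_outputs & b_inputs)
    if notifies and incompatible:
        return ("Failures in component A will not be communicated to component B, "
                "which may block or fail permanently")
    return None

# Example with hypothetical attribute values:
crm = {"error_handling_mechanism": "notify", "error_outputs": ["Exception"]}
db = {"error_inputs": ["Error code"]}
print(incompatible_error_communication(crm, db))
```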
Integration Rules: Internal Assumption Analysis
 'Data connectors connecting components that are not always active'
– Pre-conditions
 2 components connected via a data connector
 One of the components does not have a central control unit
– Result
 Potential data loss
[Diagram: Component A connected to Component B via a pipe]
(an illustrative check for this rule follows)
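A corresponding sketch for this rule, again with illustrative attribute keys (`control_unit`); it flags potential data loss when components joined by a data connector are not both centrally controlled.

```python
from typing import Dict, Optional

Component = Dict[str, object]

def data_connector_not_always_active(a: Component, b: Component,
                                     connector_type: str) -> Optional[str]:
    """Illustrative check for data connectors between components that are not always active."""
    # Pre-condition: the two components are connected via a data connector (e.g. a pipe)
    if connector_type != "data":
        return None
    # Pre-condition: one of the components does not have a central control unit
    if a.get("control_unit") != "Central" or b.get("control_unit") != "Central":
        return "Potential data loss"
    return None

# Example with hypothetical attribute values:
print(data_connector_not_always_active({"control_unit": "Central"},
                                        {"control_unit": "None"}, "data"))
```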
Integration Rules: Dependency Analysis
 'Parent node does not support dependencies required by the child components'
– Pre-condition:
 A component in the system requires one or more software components to function, and its parent node does not provide them
– Result:
 The component will not function as expected
(an illustrative check for this rule follows)
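A sketch of the dependency check, assuming each component definition lists its `underlying_dependencies` and each deployment node lists the software it provides; the names and the simple all-required treatment (ignoring the (OR) alternatives seen in the Apache example) are illustrative.

```python
from typing import Dict, List

def unsupported_dependencies(component: Dict[str, List[str]],
                             node_provides: List[str]) -> List[str]:
    """Illustrative check: dependencies the parent node does not provide."""
    required = component.get("underlying_dependencies", [])
    missing = [dep for dep in required if dep not in node_provides]
    # A non-empty result means the component will not function as expected
    return missing

# Example with hypothetical values: the node provides Linux and Apache only
print(unsupported_dependencies({"underlying_dependencies": ["Linux", "mod_ssl"]},
                               ["Linux", "Apache"]))
```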
Voluminous Data Intensive Interaction Analysis
 An extension point implementation of the Level of Service Connector Selector
 Distribution connector profiles (DCPs)
– Data access, distribution, and streaming metadata [Mehta et al. 2000] captured for each profiled connector
– Can be generated manually or using an automated process
 Distribution scenarios
– Constraint queries phrased against the architectural vocabulary of data distribution (see the sketch after this list):
 Total Volume
 Number of Users
 Number of User Types
 Delivery Intervals
 Data Types
 Geographic Distribution
 Access Policies
 Performance Requirements
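A minimal sketch of how a distribution scenario and a distribution connector profile (DCP) might be captured for the selector, assuming a Python encoding; the scenario dimensions follow the list above, while the DCP fields and example values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DistributionScenario:
    # Constraint-query dimensions from the architectural vocabulary of data distribution
    total_volume_gb: float
    num_users: int
    num_user_types: int
    delivery_intervals: int
    data_types: List[str]
    geographic_distribution: str           # e.g. "WAN" or "LAN"
    access_policies: List[str]
    performance_requirements: Dict[str, float]

@dataclass
class DistributionConnectorProfile:
    # Data access, distribution, and streaming metadata [Mehta et al. 2000]
    # captured for each profiled connector; the fields here are illustrative.
    name: str                              # e.g. "bbFTP", "GridFTP"
    data_access: Dict[str, str] = field(default_factory=dict)
    distribution: Dict[str, str] = field(default_factory=dict)
    streaming: Dict[str, str] = field(default_factory=dict)
```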
Voluminous Data Intensive Interaction Analysis
 Need to understand the relationship between the scenario dimensions and the connector metadata
– If we understand the relationship, we know which connectors to select for a given scenario
 The current approach allows both Bayesian inference and linear equations as means of relating the connector metadata to the scenario dimensions (a rough illustration follows)
 For our motivating example
– 3 connectors, C1-C3
– Profiled 12 major OTS connector technologies
 Including bbFTP, GridFTP, UDP bursting technologies, FTP, etc.
– Apply the selection framework to "rank" the most appropriate of the 12 OTS connector solutions for given example scenarios
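The slides do not give the selection algorithms themselves; as a rough illustration of the Bayesian idea, a naive Bayes-style ranking could score each profiled connector by how likely its metadata is to satisfy each scenario dimension. Everything below (the probability table, the dimension names) is hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical "fit" probabilities: P(dimension satisfied | connector).
# In practice these would come from the DCP metadata and expert calibration.
FIT: Dict[str, Dict[str, float]] = {
    "bbFTP":   {"high_volume": 0.9, "many_users": 0.4, "wan": 0.8},
    "GridFTP": {"high_volume": 0.9, "many_users": 0.7, "wan": 0.9},
    "FTP":     {"high_volume": 0.3, "many_users": 0.6, "wan": 0.7},
}

def rank_connectors(scenario_dims: List[str]) -> List[Tuple[str, float]]:
    """Rank connectors by the product of per-dimension fit probabilities (naive Bayes style)."""
    scores = {}
    for connector, fit in FIT.items():
        score = 1.0
        for dim in scenario_dims:
            score *= fit.get(dim, 0.5)     # 0.5 = uninformative value for unknown dimensions
        scores[connector] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Example: a high-volume, WAN-distributed scenario such as C1/C2 in the motivating example
print(rank_connectors(["high_volume", "wan"]))
```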
Voluminous Data Intensive Interaction Analysis
 Precision-recall analysis
– Evaluated the framework against 30 real-world data distribution scenarios
– 10 high-volume, 9 medium-volume, and 11 low-volume scenarios
– Used expert analysis to develop an "answer key" for the scenarios
 Set of "right" connectors
 Set of "wrong" connectors
 Applied the Bayesian and linear programming connector selection algorithms
– Clustered the ranked connector lists using k-means clustering (k=2) to develop a comparable answer key for each algorithm (see the sketch below)
 Bayesian selection algorithm: 80% precision; linear programming: 48%
– Bayesian algorithm is more "white box"
– Linear algorithm is more "black box"
– White box is better
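A sketch of the evaluation step described above: cluster a ranked connector score list into two groups with k-means (k=2), treat the higher cluster as the "selected" connectors, and compute precision against an expert answer key. The scores and answer key below are made up; only the procedure follows the slide.

```python
from typing import Dict, List, Set

def two_means(values: List[float], iters: int = 20) -> float:
    """Return a threshold splitting values into two clusters (1-D k-means, k=2)."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not low or not high:
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return (lo + hi) / 2

def precision(scores: Dict[str, float], answer_key: Set[str]) -> float:
    """Precision of the 'selected' (higher) cluster against the expert answer key."""
    threshold = two_means(list(scores.values()))
    selected = {c for c, s in scores.items() if s > threshold}
    return len(selected & answer_key) / len(selected) if selected else 0.0

# Hypothetical ranked scores for one scenario and the experts' "right" connectors
scores = {"GridFTP": 0.81, "bbFTP": 0.72, "FTP": 0.21, "SCP": 0.15}
print(precision(scores, answer_key={"GridFTP", "bbFTP"}))   # -> 1.0
```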
Demonstration
Experiment 1
 Conducted in a graduate software engineering course on 8 projects
– 6 projects were COTS-based applications
 2 web-based (3-tier) projects, 1 shared-data project, 1 client-server project, 1 web-service interaction project, and 1 single-user system
– Teams applied the framework before the RLCA* milestone on their respective projects
– Data collected using surveys
 Immediately after the interoperability assessment
 After the completion of the project
* Rebaselined Life Cycle Architecture
Experiment 1 Results

Data Set                     Groups                                                 Mean      Standard Deviation   P-Value
Dependency Accuracy*         Pre Framework Application                              79.3%     17.9                 0.017
                             Post Framework Application                             100%      0
Interface Accuracy**         Pre Framework Application                              76.9%     14.4                 0.0029
                             Post Framework Application                             100%      0
Actual Assessment Effort     Projects using this framework                          1.53 hrs  1.71                 0.053
                             Equivalent projects that did not use this framework    5 hrs     3.46
Actual Integration Effort    Projects using this framework                          9.5 hrs   2.17                 0.0003
                             Equivalent projects that did not use this framework    18.2 hrs  3.37
* Accuracy of Dependency Assessment:
1 – (number of unidentified dependencies/total number of dependencies)
** Accuracy of Interface Assessment:
1 – (number of unidentified interface interaction mismatches/total number of interface interactions)
Accuracy: a quantitative measure of the magnitude of error [IEEE 1990]
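For illustration (hypothetical numbers): a team that leaves 2 of 10 project dependencies unidentified would score a dependency accuracy of 1 – 2/10 = 80%.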
Experiment 2 – Controlled Experiment

                               Treatment Group   Control Group
Number of Students             75                81
On-Campus Students             60                65
DEN Students                   15                16
Average Experience             1.473 years       1.49 years
Average On-Campus Experience   0.54 years        0.62 years
Average DEN Experience         5.12 years        5 years
Experiment 2 - Cumulative Results

Data Set                     Groups                 Mean      Standard Deviation   P-Value
Hypothesis IH1:              Treatment Group (75)   100%      0                    <0.0001 (t=20.7; sdev=8.31; DOF=154)
Dependency Accuracy          Control Group (81)     72.5%     11.5
Hypothesis IH2:              Treatment Group (75)   100%      0                    <0.0001 (t=13.0; sdev=9.37; DOF=154)
Interface Accuracy           Control Group (81)     80.5%     13.0
Hypothesis IH3:              Treatment Group (75)   72.8 min  28.8                 <0.0001 (t=-9.04; sdev=77.5; DOF=154)
Actual Assessment Effort     Control Group (81)     185 min   104
Experiment 2 – DEN Results

Data Set                     Groups                 Mean      Standard Deviation   P-Value
Hypothesis IH1:              Treatment Group (60)   100%      0                    <0.0001 (t=17.9; sdev=8.50; DOF=123)
Dependency Accuracy          Control Group (65)     72.6%     11.8
Hypothesis IH2:              Treatment Group (60)   100%      0                    <0.0001 (t=12.0; sdev=9.12; DOF=123)
Interface Accuracy           Control Group (65)     80.4%     12.6
Hypothesis IH3:              Treatment Group (60)   67.1 min  23.1                 <0.0001 (t=-8.75; sdev=74.2; DOF=123)
Actual Assessment Effort     Control Group (65)     183 min   100
Conclusion and Future Work
 Results (so far) indicate a "sweet spot" in small e-services projects
 Framework-based tool automates initial interoperability
analysis:
– Interface, internal assumption, dependency mismatches
 Further experimental analysis ongoing
– Different software development domains
– Projects with greater COTS complexity
 Additional quality of service extensions
Questions