A Framework for Identification and Resolution of Interoperability Mismatches in COTS-Based Systems

Jesal Bhuta, Barry Boehm
Center for Systems and Software Engineering
University of Southern California
{jesal, boehm}@usc.edu

Abstract

Software systems today are frequently composed from prefabricated commercial components that provide complex functionality and engage in complex interactions. Projects that utilize multiple commercial-off-the-shelf (COTS) products often confront interoperability conflicts, resulting in budget and schedule overruns. These conflicts occur because of incompatible assumptions made by the developers of these products. Identifying such conflicts and planning strategies to resolve them is critical for developing such systems under budget and schedule constraints. In this paper we present an attribute-based framework that can be used to perform high-level, automated interoperability assessment to filter out COTS product combinations whose integration will not be feasible within the project constraints. Our framework is built upon standard definitions of both COTS components and connectors and is intended for use by architects and developers during the design phase of a software system. Our preliminary experience in using the framework indicates an increase in interoperability assessment productivity of 50% and in accuracy of 20%.

1. Introduction

Economic imperatives are changing the nature of software development processes to reflect both the opportunities and challenges of using commercial-off-the-shelf (COTS) products. Processes are increasingly moving away from the time-consuming development of custom software from lines of code towards the assessment, tailoring, and integration of off-the-shelf (OTS) or other reusable components [6][16]. COTS-based systems provide several benefits, such as reduced upfront development costs, rapid time to deploy, and reduced maintenance and evolution costs. These economic considerations often entice organizations to piece together their software systems from pre-built components. However, these benefits are accompanied by several risk factors, such as high maintenance costs, inaccessible source code, and no control over the evolution of COTS products [4]. One such risk factor is interoperability amongst the selected COTS products. The first example of such an interoperability issue was documented by Garlan et al. in [10] when attempting to construct a suite of software architectural modeling tools using a base set of four reusable components. Garlan et al. termed this problem architectural mismatch and found that it occurs due to specific assumptions that a COTS component makes about the structure of the application in which it is to appear, assumptions that ultimately do not hold true. The best-known solution for identifying architectural mismatches is prototyping COTS interactions as they would occur in the conceived system. Such an approach is extremely time- and effort-intensive. Alternatively, development teams often assess their COTS-based architectures manually to identify mismatches. Such assessments also take significant time due to the incoherent documentation provided by COTS vendors. This problem is further compounded by the present-day COTS market, where a multitude of COTS product choices exist for any given functionality, increasing the number of COTS combinations that need to be assessed for interoperability.
At the University of Southern California (USC) we have developed an attribute-driven framework that addresses the selection of COTS components and connectors to ensure that they can be integrated within project budget and schedule. Our framework identifies COTS component incompatibilities and recommends resolution strategies, partly by using specific connectors and glue code to integrate these components. Such incompatibility information can be used to estimate COTS integration effort [2], which in turn can serve as a criterion when selecting COTS products. The assessment conducted by the framework can be carried out as early as the inception phase, as soon as the development team has identified possible architectures and a set of COTS components and connectors. Further, we evaluate the utility of our framework through two experiments performed in a graduate-level software engineering course using the framework-based tool, Integration Studio (iStudio).

The rest of this paper is structured as follows. Section 2 presents background and related work. Section 3 describes the COTS interoperability assessment framework. Section 4 presents our experiments with the framework and the corresponding results. Finally, Section 5 presents our conclusions and future directions for this work.

2. Background and Related Work

Several researchers have been working on component-based software architectures, component integration, OTS-based system development, and architectural mismatch analysis. This section describes the results of these past efforts. Researchers have proposed several COTS component selection approaches [3][4][6][7]. Of these, [3][7] are largely geared towards the selection and implementation of COTS products based on business and functional criteria. The approach presented by Mancebo et al. in [12] focuses on a COTS selection process based upon architectural constraints, and does not address the interoperability issue. Ballurio et al. [4] provide a detailed but time-intensive and manual method for assessing COTS component interoperability, making it inappropriate for assessing a large number of COTS combinations. Yakimovich et al. [15] have proposed an incompatibility model that provides a classification of COTS incompatibilities and strategies for their resolution across system-related (hardware and software) and environment-related (development and target) components. However, the identification and integration strategies recommended are extremely high-level and require manual analysis for incompatibility identification. Davis et al. [8] present notations for representing architectural interactions in order to perform multi-phase pre-integration analysis for component-based systems. They define a set of 21 component characteristics for identifying problematic component interactions, and the interactions themselves. The authors further recommend the use of existing [11][14] architectural resolution strategies. Most characteristics, however, require access to and an understanding of the source code, which makes this approach complicated to use for COTS-based systems. Moreover, unlike our approach, it does not identify interface-level incompatibilities amongst system components. While our approach is similar in spirit, it is applicable to components whose source code is inaccessible or difficult to understand.
Gacek [9] investigates the problem of architectural mismatch during system composition. Extending work done in [1], she presents 14 conceptual features, using which she defines 46 architectural mismatches across six connector types: call, spawn, data connector, shared data, trigger, and shared resource. Our work utilizes and extends this research. Mehta et al. [14] propose a taxonomy of software connectors. The taxonomy identifies four major service categories addressed by connectors: communication, conversion, coordination, and facilitation. The authors further identify eight primitive connector types and classify them along a set of dimensions and sub-dimensions unique to each connector type. Our work utilizes these service categories, as well as the connector classification, for the identification of COTS interfaces.

3. Interoperability Assessment Framework

The framework is modeled using three key components: COTS representation attributes, used to define the interoperability characteristics of a specific COTS product; integration rules, which define the preconditions for mismatches to occur; and a COTS interoperability evaluation process, which uses the attribute-based COTS definitions and rules to analyze a given architecture for mismatches. The framework outputs an assessment report that includes three major analyses:
1. Interface (or packaging) mismatch analysis, which identifies incompatible communication interfaces between two components.
2. Dependency analysis, which ensures that the facilities required by the COTS packages used in the system are provisioned (e.g., a Java-based CRM solution requires the Java Runtime Environment).
3. Internal assumption mismatch analysis, which identifies mismatches caused by assumptions that interacting COTS products make about each other's internal structure [9].
In the remainder of this section we describe each of the framework components in detail.

3.1 COTS Representation Attributes

The COTS representation attributes are a set of 40 attributes that define COTS product interoperability characteristics. COTS interoperability characteristics defined using these attributes are used by the integration analysis component, along with the integration assessment rules (described in the next section), to carry out the interoperability analysis. These attributes have been derived from the literature, as well as from our observations in various software integration projects. The two major criteria used for selecting these attributes were:
1. Attributes should capture enough detail about the major sources of COTS product mismatches identified above (interface, internal assumption, and dependency mismatches).
2. Attributes should be defined at a high enough level that COTS vendors can provide attribute definitions without revealing confidential product information.
To date we have surveyed about 40 COTS products, of which 30 were open source. For the non-open-source COTS products we could identify at least 34 of the 40 attributes from publicly accessible information alone. We chose not to include several candidate attributes, such as data topology, control structure, and control flow, because they were too detailed and required an understanding of the internal design of COTS products to define, could be represented at a higher level by an already included attribute, or did not account for enough significant mismatches to warrant inclusion. We have classified the selected attributes into the four groups shown in Figure 1.
Figure 1. COTS Interoperability Attributes

COTS General Attributes (4): Name, Role*, Type, Version
COTS Interface Attributes (14): Binding*, Communication Language Support*, Control Inputs*, Control Outputs*, Control Protocols*, Data Inputs*, Data Outputs*, Data Protocols*, Data Format*, Data Representation*, Error Handling Inputs*, Error Handling Outputs*, Extensions*, Packaging*
COTS Internal Assumption Attributes (16): Backtracking, Control Unit, Component Priorities, Concurrency, Distribution, Dynamism, Encapsulation, Error Handling Mechanism, Implementation Language*, Layering, Preemption, Reconfiguration, Reentrant, Response Time, Synchronization, Triggering Capability
COTS Dependency Attributes (6): Communication Dependency*, Communication Incompatibility*, Deployment Language*, Execution Language Support*, Same-Node Incompatibility*, Underlying Dependency*
(* indicates that the attribute can have multiple values)

Attributes (or attribute sets) marked with an asterisk may take multiple values for a given COTS product. The remainder of this section summarizes the attribute classifications; full descriptions of all the attributes can be accessed at [5].

COTS general attributes aid in the identification and querying of COTS products. They include name, version, role, and type.

COTS interface attributes define the interactions supported by the COTS product, where an interaction is an exchange of data or control amongst components. A COTS product may have multiple interfaces, in which case it will have multiple interface definitions. For example, the Apache Web Server has one complete interface definition for its web interface (interaction via HTTP) and another for its server interface (interaction via procedure call). These attributes include packaging (source code modules, object modules, dynamic libraries, etc.), data and control inputs, outputs, protocols, and the like.

COTS internal assumption attributes capture the assumptions a developer makes about the internal operation of the COTS product. For example, the developers of the Apache Web Server assume that the software will contain a central control unit that regulates the behavior of the system. These attributes include synchronization, concurrency, distribution, and others.

COTS dependency attributes define the facilities required by a COTS product, i.e., the software the COTS product requires for successful execution. For example, any Java-based system requires the Java Runtime Environment (JRE) as a platform. These attributes include underlying and communication dependencies, deployment language support, and execution language support.

3.2 Integration Assessment Rules

The integration assessment rules are a set of rules used to perform the interoperability analysis. Every rule has a set of preconditions which, if true for the given architecture and components, identify an architectural mismatch. For example, consider one of the architectural mismatches found by Gacek in [9]: "Data connectors connecting components that are not always active." The preconditions for this mismatch are: two components are connected via a data connector (only), and one of the components does not have a central control unit. There are similar rules for performing interface, dependency, and internal assumption analysis; a sketch of how such a rule might be encoded follows.
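As an illustration, the following minimal Python sketch shows how the precondition of Gacek's rule above might be encoded. This is our own sketch, not the framework's actual rule format; CotsDefinition, Connection, and data_connector_active_check are hypothetical names.

from dataclasses import dataclass
from typing import Optional

@dataclass
class CotsDefinition:
    # Hypothetical subset of the internal assumption attributes in Figure 1.
    name: str
    control_unit: bool = True   # does the product assume a central control unit?

@dataclass
class Connection:
    source: CotsDefinition
    target: CotsDefinition
    connector_type: str         # e.g. "call", "spawn", "data connector" [9]

def data_connector_active_check(conn: Connection) -> Optional[str]:
    # Preconditions: components connected via a data connector (only),
    # and one of them lacks a central control unit (is not always active).
    if conn.connector_type == "data connector" and not (
            conn.source.control_unit and conn.target.control_unit):
        return (f"Mismatch: data connector between {conn.source.name} and "
                f"{conn.target.name}, but one component is not always active")
    return None

# Example: a web server streaming log data to a batch-style report generator.
server = CotsDefinition("Apache Web Server", control_unit=True)
reporter = CotsDefinition("Report Generator", control_unit=False)
print(data_connector_active_check(Connection(server, reporter, "data connector")))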
Interface analysis discovers whether two communicating COTS components share a common interface; if they do not, the analysis includes recommendations on the type of "glueware" (or "glue code") required to integrate the components. Dependency analysis rules verify that the architecture satisfies all the dependencies a COTS product requires. Finally, for internal assumption analysis we build upon the mismatches identified in [9] and add new mismatches based on the newly added attributes.

3.3 COTS Interoperability Evaluator

To develop the COTS interoperability evaluator we needed to address two significant challenges:
1. Ensure that the effort spent in COTS interoperability assessment with the framework is much less than the effort spent performing the assessment manually.
2. Ensure that the framework is extensible, i.e., that it can be updated to reflect prevailing COTS characteristics.
We address these challenges with a framework that is modular and automated, and in which COTS definitions and assessment criteria can be updated on the fly. Our framework allows an organization to maintain a reusable and frequently updated portion (the COTS selector) remotely, and a minimally updated portion (the interoperability analyzer) on the client side. This allows a dedicated team to maintain definitions for the COTS products being assessed by the organization. The internal architecture of the COTS interoperability evaluator is shown in Figure 2; it consists of the following sub-components.

Figure 2. COTS Interoperability Evaluator. (Architecture sketch: the project analyst defines the architecture and COTS combinations through the architecting user interface component, which draws on the COTS definition generator and COTS definition repository; the integration analysis component evaluates the deployment architecture against the integration rules repository, queries the COTS connector selector to generate connector options, and produces the COTS interoperability analysis report.)

COTS Definition Generator is a software utility that allows users, as well as COTS vendors and other COTS experts, to define COTS components in a generally accepted standard format based on the COTS representation attributes. For brevity we omit the full description of our XML format and point the reader to [5].

COTS Definition Repository is an online store of COTS definitions, indexed and categorized by their roles and the functionality they provide (database systems, graphic toolkits, etc.). The repository is queried by the various sub-components of the interoperability evaluator. In practice this component would be shared across the organization to enable reuse of COTS definitions. Alternatively, such a repository could be maintained and updated by a third-party vendor and its access licensed out to various organizations.

Architecting User Interface Component provides a graphical user interface with which developers create the system deployment diagram. The component queries the COTS definition repository to obtain the definitions of the COTS products used in the conceived system.

Integration Rules Repository holds the integration rules that drive the analysis results and interoperability assessment. The rules repository can be maintained remotely; however, the complete repository must be downloaded to the client side (the interoperability analyzer) before an interoperability assessment can be performed.

Integration Analysis Component contains the actual algorithm for analyzing the system. It uses the rules specified in the integration rules repository, along with the architecture specification, to perform the internal assumption, interface (or packaging), and dependency analyses.
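To make the description concrete, the following minimal sketch (ours, not iStudio's actual algorithm) shows the core analysis loop this implies: every rule in the rules repository is applied to every connection in the deployment architecture, and the findings are collected into the report's three sections. The rules_by_section layout and the rule signature are hypothetical, matching the rule sketch in Section 3.2.

from typing import Callable, Dict, List

def analyze(connections: list,
            rules_by_section: Dict[str, List[Callable]]) -> Dict[str, list]:
    # Rules take a connection and return a mismatch description or None.
    report = {section: [] for section in rules_by_section}
    for conn in connections:
        for section, rules in rules_by_section.items():
            for rule in rules:
                finding = rule(conn)
                if finding is not None:
                    report[section].append(finding)
    return report

# Usage: register data_connector_active_check (previous sketch) under the
# internal assumption section, e.g.:
# report = analyze(connections, {"interface": [...], "dependency": [...],
#                                "internal_assumption": [data_connector_active_check]})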
When the integration analysis component encounters an interface mismatch, it queries the COTS connector selector component to determine whether an existing bridge connector could be used to integrate the components. If no bridge connector is available, it recommends that a wrapper of the appropriate type (communication, coordination, or conversion) be utilized, and provides simple textual information (in human-readable form) describing the functionality the wrapper must supply to enable interaction between the two components. In addition, the integration analysis component identifies mismatches caused by the internal assumptions that COTS components make, as well as COTS component dependencies not satisfied by the architecture. For cases where the COTS component definition has missing information, the integration analysis component includes both an optimistic and a pessimistic outcome. All of these findings are included in the interoperability analysis report.

COTS Connector Selector is a query interface used by the integration analysis component to identify a bridging connector in the event of an interface incompatibility, or a quality-of-service (QoS) specific connector.

Quality of Service Connector Selection Framework is an extensible component built for identifying QoS-specific connectors. We are working on one such extension that aids in the selection of connectors for highly distributed and voluminous data. Other QoS extensions may include connectors for mobile-computing environments that require a low memory footprint, or connectors for highly reliable, fault-tolerant systems.

COTS Interoperability Analysis Report is the output of the evaluator and contains the results of the analysis in three major sections: (1) internal assumption mismatch analysis, (2) interface (packaging) mismatch analysis, and (3) dependency analysis. This report is the ultimate product of the interoperability framework.

4. Empirical Study and Results

To demonstrate the utility of our framework we conducted two experiments in a graduate software engineering course at USC. The course focuses on the development of a software system requested by a real-world client. Graduate students enrolled in the course form teams of about 5 to 6 members to design and implement a software system within a 24-week period. During this period the project progresses through the inception, elaboration, construction, and transition phases. Our first experiment was conducted close to the end of the elaboration phase. We asked 6 teams whose architectures included at least 3 COTS products to use our framework-based tool on their respective projects, and we measured results in four areas:
1. Accuracy of interface incompatibilities identified by the framework, calculated as 1 - (number of interface incompatibilities missed by the team / total number of interface incompatibilities).
2. Accuracy of dependencies identified by the framework, calculated as 1 - (number of dependencies missed by the team / total number of dependencies).
Both the interface and dependency assessment results produced by our framework were later verified through a survey conducted after the project was implemented. These results evaluate the completeness and correctness of our interface and dependency rules.
3. Effort spent assessing the architectures using the framework, as opposed to the effort spent assessing the architectures manually by an equivalent team. These results demonstrate the efficiency of using our framework to perform interoperability assessment rather than a manual assessment.
4. Effort spent performing the actual integration after using the framework, as opposed to the effort spent by an equivalent team. These results validate the overall utility of our framework.
Equivalent teams for comparing actual integration effort were chosen from past projects such that they used similar COTS products, had similar architectures, and had team members with similar years of project development experience. Upon performing independent t-tests for the four cases above we recorded the results shown in Table 1. Our results indicate that the framework increases dependency assessment accuracy and interface assessment accuracy by more than 20% and reduces both assessment effort and integration effort by approximately 50%. These results are significant at the alpha = 5% level, with the exception of architecture assessment effort (p = 0.053), which is marginally significant. The tool's perfect detection record in this experiment indicates that it has a strong "sweet spot" in the area of smaller e-services applications with relatively straightforward COTS components, but with enough complexity that less COTS-experienced software engineers are unlikely to succeed fully in interoperability assessment. We plan to conduct further tool evaluations on larger projects with more complex COTS products.

Table 1. Empirical assessment of our framework

Interface Assessment Accuracy (p = 0.0029)
  Before using the framework: mean 76.9%, std. dev. 14.4
  After using the framework:  mean 100%,  std. dev. 0

Dependency Assessment Accuracy (p = 0.017)
  Before using the framework: mean 79.3%, std. dev. 17.9
  After using the framework:  mean 100%,  std. dev. 0

Effort spent in performing architecture assessment (p = 0.053)
  Projects using the framework: mean 1.53 hrs, std. dev. 1.71
  Equivalent projects:          mean 5 hrs,    std. dev. 3.46

Effort spent when integrating the COTS products (p = 0.0003)
  Projects using the framework: mean 9.5 hrs,  std. dev. 2.17
  Equivalent projects:          mean 18.2 hrs, std. dev. 3.37

5. Conclusions and Future Work

In this paper we presented an attribute-based framework that enables interoperability assessment of architectures using COTS product combinations early in the software development life cycle. Using our framework does not eliminate detailed testing and prototyping for evaluating COTS interoperability; however, it does provide an analysis of interface compatibilities and dependencies and recommends the connectors or glue code required, all of which can be tested during detailed prototyping. Moreover, since the framework-based tool is automated, it enables evaluation of a large number of architectures and COTS combinations, increasing the trade-off space for COTS component and connector selection. Our current experimental results in using this framework have shown a 20% increase in accuracy and a 50% increase in productivity of COTS interoperability assessment. In the near future we are planning experiments and evaluations to gather empirical data to further test the utility of the attributes and the tool. In addition, we are collaborating with researchers identifying similar attributes to assess architectures for quality-of-service (QoS) parameters.
One such QoS extension being incorporated in our tool addresses voluminous data-intensive interactions [13]. It is important to note that these attributes must be periodically updated based on prevailing COTS characteristics.

6. References

[1] Abd-Allah A., "Composing Heterogeneous Software Architectures," PhD dissertation, University of Southern California, 1996.
[2] Abts C., "Extending the COCOMO II Software Cost Model to Estimate Effort and Schedule for Software Systems Using Commercial-Off-The-Shelf (COTS) Software Components: The COCOTS Model," PhD dissertation, University of Southern California, 2004.
[3] Albert C., Brownsword L., "Evolutionary Process for Integrating COTS-Based Systems (EPIC)," SEI Technical Report CMU/SEI-2002-TR-005.
[4] Ballurio K., Scalzo B., Rose L., "Risk Reduction in COTS Software Selection with BASIS," First International Conference on COTS-Based Software Systems (ICCBSS), Orlando, Florida, Feb 2003.
[5] Bhuta J., "A Framework for Intelligent Assessment and Resolution of Commercial Off-The-Shelf (COTS) Product Incompatibilities," USC Technical Report USC-CSE-2006-608, 2006.
[6] Boehm B., Port D., Yang Y., Bhuta J., Abts C., "Composable Process Elements for Developing COTS-Based Applications," 2003 ACM-IEEE International Symposium on Empirical Software Engineering (ISESE 2003).
[7] Brownsword L., Oberndorf P., Sledge C., "Developing New Processes for COTS-Based Systems," IEEE Software, Vol. 17, No. 4, July/August 2000.
[8] Davis L., Gamble R., Payton J., "The Impact of Component Architectures on Interoperability," Journal of Systems and Software, 2002.
[9] Gacek C., "Detecting Architectural Mismatches During System Composition," PhD dissertation, University of Southern California, 1998.
[10] Garlan D., Allen R., Ockerbloom J., "Architectural Mismatch or Why It's Hard to Build Systems out of Existing Parts," International Conference on Software Engineering, 1995.
[11] Keshav R., Gamble R., "Towards a Taxonomy of Architecture Integration Strategies," 3rd International Software Architecture Workshop, 1-2 Nov 1998.
[12] Mancebo E., Andrews A., "A Strategy for Selecting Multiple Components," Proceedings of the ACM Symposium on Applied Computing, 2005.
[13] Mattmann C., "Software Connectors for Highly Distributed and Voluminous Data-Intensive Systems," Proc. ASE, Tokyo, Japan, 2006.
[14] Mehta N., Medvidovic N., Phadke S., "Towards a Taxonomy of Software Connectors," Proceedings of the 22nd International Conference on Software Engineering (ICSE 2000), 2000.
[15] Yakimovich D., Bieman J., Basili V., "Software Architecture Classification for Estimating the Cost of COTS Integration," 21st International Conference on Software Engineering, 1999.
[16] Yang Y., Bhuta J., Boehm B., Port D., "Value-Based Processes for COTS-Based Applications," IEEE Software, special issue on COTS Integration, July/August 2005.