

Model-Based Interoperability of Heterogeneous
Information Systems: An Industrial Case Study
Nikola Milanovic¹, Mario Cartsburg¹, Ralf Kutsche¹, Jürgen Widiker¹,
and Frank Kschonsak²

¹ TU Berlin
{nmilanov,mcartsbg,rkutsche,jwidiker}@cs.tu-berlin.de
² Klopotek AG
f.kschonsak@klopotek.de
Abstract. Integration of heterogeneous and distributed IT systems is
one of the major cost-driving factors in the software industry. We
introduce a model-based approach for information system integration and
demonstrate it on an industrial case study of data integration between
the Oracle database management system and the SAP R/3 enterprise
resource planning system. Particular focus is on multi-level modeling
abstractions, integration conflict analysis (automatic data model
matching), semantic reasoning, code generation and tool support.
1 Introduction and Related Work
Integration of complex and heterogeneous IT systems is one of the major
cost-driving factors in the software industry today. There is an
increasing need to systematically address integration in accidental
architectures that have grown in an uncontrolled manner in
heterogeneous enterprise environments.
Schema matching approaches [1] detect dependencies between data model
elements at the model or instance level. Extract-Transform-Load (ETL)
tools, such as CloverETL, use schema matching methodology to enable the
integration of multiple data sources; today, such tools are primarily
used for data warehousing. Furthermore, there are languages that enable
the specification of transformations between data models, such as
Ensemble [2]. However, the applicability of all these approaches
diminishes as the heterogeneity of the underlying systems increases,
for example, when the systems are not relational, or when the data
model is not immediately accessible in the form of an ER model or a
UML class diagram.
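To make the idea of schema matching concrete, the following is a
minimal sketch of purely name-based matching over two flat attribute
lists. The attribute names and the acceptance threshold are
hypothetical illustrations, not taken from the case study; matchers of
the kind surveyed in [1] additionally exploit types, instance data and
structural information.

import java.util.List;

// Minimal sketch of name-based schema matching between two flat
// attribute lists. Real matchers combine several similarity measures.
public class NameMatcher {

    // Normalized Levenshtein similarity between two attribute names,
    // in [0, 1], where 1 means identical (ignoring case).
    static double similarity(String a, String b) {
        a = a.toLowerCase();
        b = b.toLowerCase();
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                d[i][j] = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1]
                                + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1));
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) d[a.length()][b.length()] / max;
    }

    public static void main(String[] args) {
        // Hypothetical attribute names of a source table and a target
        // structure; not the actual Oracle or SAP R/3 schemas.
        List<String> source = List.of("CUST_NAME", "CUST_STREET", "CITY");
        List<String> target = List.of("NAME1", "STREET", "CITY1");
        for (String s : source)
            for (String t : target) {
                double sim = similarity(s, t);
                if (sim > 0.5) // naive acceptance threshold
                    System.out.printf("%s <-> %s (%.2f)%n", s, t, sim);
            }
    }
}

Running the sketch prints candidate correspondences such as
CUST_STREET <-> STREET; pairs like CUST_NAME and NAME1 are missed,
which is exactly why name matching alone is insufficient in practice.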
Another approach is the Service-Oriented Architecture (SOA). In SOA,
data sources are wrapped as services and made accessible to the
Enterprise Service Bus (ESB) engine, which orchestrates data and
functional logic. The de facto standard for business process
orchestration is BPEL, and there are numerous commercial and
open-source ESB engines. An ESB, however, expects that all service
endpoints are compatible and that no data or behavior conflicts will
occur between them. Otherwise, either the endpoint itself has to be
modified (frequently impossible), or the transformation has to be
performed at the message level (XSLT or BPELJ).
From the architectural standpoint, this is problematic, as it mixes
orchestration, data model and implementation concerns. Furthermore,
all ESB implementations mandate SOAP/XML serialization, which restricts
the application domain and hampers performance. Finally, approaches
such as mapping editors (e.g., Altova MapForce) and extended UML
editors (e.g., E2E Bridge) lack code generation and are as such more
suitable for analysis than for development.
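To illustrate the message-level transformation glue criticized above,
the sketch below applies an XSLT stylesheet using the standard
javax.xml.transform API. The file names customer2sap.xslt,
customer.xml and sap_message.xml are hypothetical placeholders, not
part of the case study.

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch of message-level transformation between two incompatible
// service endpoints, as an ESB would apply it on each message.
public class MessageTransform {
    public static void main(String[] args) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource("customer2sap.xslt"));
        t.transform(new StreamSource("customer.xml"),
                    new StreamResult("sap_message.xml"));
    }
}

Note that the mapping knowledge lives in the stylesheet, outside both
the orchestration and the data model, which is precisely the mixing of
concerns pointed out above.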
For these reasons, as part of an R&D program of the German government,
the project BIZYCLE (www.bizycle.de) was started in early 2007 to
investigate at large scale the potential of model-based software and
data integration methodologies, their tool support, and their practical
applicability in different industrial domains. In this paper we present
one BIZYCLE industrial case study and try to identify potential
advantages of the proposed process, platform and tools.
2 Integration Scenario and the Current Solution
In the media and publishing industry, there are often several IT
systems which have to synchronize their data. We describe a scenario
where the Oracle database server of an IT software provider has to send
Customer Master Data to the SAP ERP system, which is hosted on the
customer side. The Oracle database stores customer information and is
therefore the basis of any financial transaction. The SAP system
manages financial accounting services.
In the current solution, the Oracle database exports the required data
into a CTM file, a proprietary data format created by the software
provider. This file is then converted into a plain-text ASCII file and
stored on a file server (the provider's endpoint). A dedicated ABAP
program (SAP's legacy language) on the customer side reads this file
and converts it into a proprietary Batch Input structure, which is then
imported. The scenario is point-to-point, asynchronous and involves
several modal fragmentations. Many formats are used, and the
responsibility in case of inconsistent data is not always clear to
assign. Furthermore, the fragmentations make evolution very difficult.
An additional third-party supplier also has to be hired to develop the
ABAP import routine.
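Since CTM is proprietary, its record layout is not public; the
following sketch therefore assumes, purely for illustration, a
semicolon-separated layout with three fields, and shows only the
intermediate conversion step into a fixed-width ASCII file for the
ABAP reader. File names and field widths are hypothetical.

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of the CTM-to-ASCII conversion step only.
// The record layout of the proprietary CTM format is assumed here.
public class CtmToAscii {
    public static void main(String[] args) throws IOException {
        try (BufferedWriter out =
                Files.newBufferedWriter(Path.of("customers.txt"))) {
            for (String line : Files.readAllLines(Path.of("export.ctm"))) {
                String[] f = line.split(";"); // assumed field separator
                // Fixed-width ASCII record for the ABAP batch-input reader.
                out.write(String.format("%-10s%-35s%-35s%n",
                        f[0], f[1], f[2]));
            }
        }
    }
}

Even in this toy form, the sketch shows why the pipeline is fragile:
the field order, separators and widths are implicit conventions shared
between the exporter and the ABAP import routine, with no model or
schema against which either side could be checked.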
Although outdated and inflexible, this solution represents industrial
practice in the field of large-scale data integration. It is influenced
by factors such as the presence of legacy code (e.g., the CTM export
and the ABAP import), time-to-market pressure, the mixture of business,
presentation and data layers, and the high investment and steep
learning curve required for refactoring. Furthermore, sector-specific
factors in the media and publishing sector (complex and dynamic
business processes, strong graphical and aesthetic requirements, and
outdated technologies) contribute to the current unfavorable situation.
3 BIZYCLE Integration Process
Integration tasks are performed today by experienced software engineers
and application experts, manually programming the connectors, i.e., the
glue among