This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

Integration of Distributed Enterprise Applications: A Survey

Wu He and Li Da Xu, Senior Member, IEEE

Abstract: Many industrial enterprises acquire disparate systems and applications over the years. Integrating these systems and applications is often necessary to satisfy business requirements. To help researchers in industrial informatics understand the state of the art in enterprise application integration, this paper discusses the architectures and key technologies used for integrating distributed enterprise applications, illustrates their strengths and weaknesses, and identifies research trends and opportunities in this increasingly important area.

Index Terms: Distributed Enterprise Applications, Middleware, Enterprise Application Integration, Web Services, Service-oriented Architecture (SOA), Enterprise Service Bus, Industrial Informatics, Enterprise Systems, Industrial Information Integration Engineering, Radio Frequency Identification (RFID), Internet of Things (IoT)

I. INTRODUCTION

A distributed enterprise application is defined as an application with software components residing on more than one computer in a network [1]. The network is often heterogeneous, composed of diverse computers, devices, and operating systems. In industrial enterprise environments, many industry systems consist of numerous technologies, protocols, applications, and devices distributed across a network [2], [3]. As industry environments have become increasingly distributed and heterogeneous across organizational and geographical boundaries in recent years, there is strong demand to integrate distributed applications in order to increase enterprises' competitiveness.
In particular, the integration of distributed industrial applications has been of interest in the arena of industrial information systems. For example, according to IEC 61499 (the International Electrotechnical Commission’s Function Block specification) [4], a distributed control system consists of a number of applications that may be distributed among multiple devices. Often a control processing application resides in one device and an output conversion application resides in another. Sometimes a function of an application is distributed across several devices and requires the cooperation of its different parts to work properly. Additionally, the applications and devices may be developed or provided by different vendors [3] with different programming languages, formats, and protocols. Significant integration efforts are required to increase the interoperability and other collaborative features of these applications and devices.

Manuscript received December 24, 2011. Accepted for publication February 22, 2012. Copyright © 2009 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.
Wu He is with Old Dominion University, Norfolk, VA 23529, USA (phone: 757-683-5008; email: whe@odu.edu).
Li Da Xu is with the Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China, and Old Dominion University, Norfolk, VA 23529, USA (email: lxu@odu.edu).

Over the past three decades, many enterprises have invested heavily in integrating distributed enterprise applications because of continuous mergers and acquisitions, joint ventures, outsourcing, corporate restructuring, infrastructure upgrades, and the adoption of mobile devices, smart embedded devices, and wireless sensors.
Enterprises that are able to integrate their various enterprise applications gain a distinct competitive advantage, such as strategic use of company data and technology for greater efficiency and profit [5]. Distributed enterprise applications typically require their distributed components to interact with one another through remote communication mechanisms such as message passing and remote invocation [6] in networked environments. However, distributed enterprise applications are often unable to communicate with each other because of a lack of interoperability, variable formats, different protocols, and dissimilar standards of operation. As distributed enterprise applications continue to grow in scale and complexity, integrating them has become a challenging task. For example, many industry enterprises have trouble integrating industry applications rapidly, inexpensively, and seamlessly. Such applications include computer-aided design systems, engineering document management systems, manufacturing execution systems, and product data management systems running on different hosts. For many industry enterprises, it is imperative that these systems cooperate to achieve enterprise business objectives.

Copyright (c) 2011 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.

To address integration problems, an IT solution named Enterprise Application Integration (EAI) has been developed to help achieve quality integration [5]. EAI encompasses technologies that enable distributed and heterogeneous applications to interact with one another across the network and help integrate many individual applications into a seamless whole [5].
It consists of plans, methods, and tools that aim to consolidate and coordinate computer applications. EAI facilitates the integration of both intra-organizational and inter-organizational systems. EAI solutions comprise the efficient integration of diverse business processes and data across enterprises, the interoperation and integration of intra-organizational and inter-organizational enterprise applications, the conversion of varied data representations among the systems involved, and the connection of proprietary/legacy data sources, enterprise systems (ES), applications, processes, and workflows across organizations [7]. By creating an integrative structure, EAI connects heterogeneous data sources, systems, and applications within or across enterprises. With EAI, intra- or inter-enterprise application systems can be integrated seamlessly, so that different divisions or even different enterprises can cooperate with each other [8].

EAI is highly relevant to industrial informatics, as industrial informatics concerns the information flow and information systems within the entire industrial organization [8]. EAI is a key research area in Industrial Information Integration Engineering (IIIE), a scientific sub-discipline of engineering. Broadly speaking, IIIE is a set of foundational concepts and techniques that facilitate the industrial information integration process; more specifically, IIIE comprises methods for solving complex problems in developing IT infrastructure for industrial sectors, especially with respect to information integration [8]. One of the main purposes of establishing IIIE is to respond to challenging demands from industry. For instance, the International Electronics Manufacturing Initiative (iNEMI) revealed in 1996 that existing factory information and communication systems were hardly interoperable and required huge integration efforts [9].
Specifically, iNEMI pointed out a technology gap in the middleware technologies required to integrate communicating machines and factory information systems. Afterwards, substantial efforts were made to integrate disparate applications and systems in industry sectors. Nowadays many industries such as telecommunication, manufacturing, logistics, and electrical power have implemented and integrated distributed enterprise applications such as distributed control systems in factory automation, e-manufacturing systems, and distributed electronics production systems [3], [10-12]. For example, a factory management system may include an order processing application, a production scheduling application, and a production control application that reside on three separate servers [13]. The three distributed applications then cooperate with each other to provide a collaborative order management function. The literature also shows that, for some industrial applications such as manufacturing control software and manufacturing execution systems, distributed enterprise applications sometimes perform better or require fewer resources than a large centralized software system, owing to the increased demand for agility, flexibility, and scalability, users dispersed across various locations, and unanticipated system change [1], [11].

As the dialogue among researchers and practitioners in the industrial informatics and enterprise systems areas is growing [8], [14], this paper reviews the past, present, and future development of enterprise application integration architectures and technologies, in an effort to give readers of IEEE Transactions on Industrial Informatics an avenue to understand the state of the art and future trends of enterprise application integration in industries.
The topics of interest to industrial informatics readers include, but are not limited to, distributed enterprise architectures, middleware technologies for integrating distributed enterprise applications in industries, and research trends and challenges involved in the integration of distributed enterprise applications. However, the review is by no means meant to be exhaustive. We hope this paper helps readers become more aware of the challenges and opportunities in this increasingly important area and bring their expertise to bear on the research challenges of integrating enterprise applications in industries.

The rest of the paper is organized as follows. Section II presents a brief overview of the historical development of distributed enterprise application architectures. Section III provides a brief overview of the past development of the main distributed application integration technologies. Recent research on enterprise application integration technologies is presented in Section IV. Based on the review, future trends and research challenges are discussed in Section V. Conclusions are drawn in Section VI.

II. HISTORICAL DEVELOPMENT: DISTRIBUTED ENTERPRISE APPLICATION ARCHITECTURES

Distributed enterprise application architectures have undergone an extensive evolution. Early-generation enterprise applications were built on centralized mainframes. As the capacity of personal computers increased, many applications and tasks were moved to users' computers to better satisfy business or processing needs. As a result, first-generation distributed enterprise applications were developed on a two-tier client/server architecture in the 1980s [15]. In a two-tier client/server architecture, the client is responsible for presenting the application to the user while the server is in charge of data management and storage [16].
As the complexity of transactions and the amount of data continued to increase, a 3-tier architecture became popular in enterprise application development in the mid-to-late 1990s. In a 3-tier architecture, software components are divided into three layers: a presentation layer, an application layer, and a database layer [14]. The client tier focuses on the user interface and interacts with the middle tier via mechanisms such as DLLs, APIs, or RPC. The middle tier focuses on application logic and interacts with the database tier via standard database interfaces such as ODBC. Middleware technologies such as CORBA are often deployed on the middle tier to integrate distributed enterprise applications, including independently developed applications. In addition, TP (transaction processing) monitors often run on the middle tier to meet scalability, workload, and resource balancing needs [14].

As Web applications became widespread, the 3-tier architecture was extended to the web-centric architecture by adding web clients and web servers (e.g., Apache). In a web-centric architecture, the web client sends HTTP requests to the web server for content. The web server either returns the content directly or passes the request on to a specific application server. The application server interacts with the back-end database and sends responses back to the client [14]. Furthermore, the 3-tier architecture can be extended to a multi-tier architecture. For example, additional tiers are often introduced between the client and data layers for security, workload and resource balancing, and performance monitoring.

III.
INTEGRATION OF DISTRIBUTED ENTERPRISE APPLICATIONS: THE PAST

Enterprise integration includes physical system integration, application integration, and business integration [14]. This paper mainly concerns application integration on heterogeneous platforms. The research community proposed EAI solutions to help achieve quality application integration. Originally, EAI focused only on integrating intra-organizational applications, but it has since been expanded to cover aspects of inter-organizational integration [7-8]. EAI provides ways to integrate heterogeneous applications on different platforms [8].

Integration can be studied along different dimensions, including integration scope (intra-enterprise and inter-enterprise integration), integration point of view (user's view, designer's view, and programmer's view), integration layer, and integration level [14]. Furthermore, intra-enterprise integration can be divided into horizontal and vertical integration [13-14]. Inter-enterprise integration includes B2B (business-to-business) and B2C (business-to-customer) integration [14]. Integration can take place at the hardware, platform, syntactical, and semantic levels. From the technology perspective, researchers have found it useful to study integration in terms of layers [17]: the communication layer, data layer, business logic layer, and presentation layer. Below we give a brief overview of the main integration technologies for each layer.

A. Communication Layer Integration

Integrating distributed applications requires the separate applications to be able to communicate with one another and exchange information. For example, an application may need to know the status and operations of a remote application in order to perform tasks such as scheduling. Typically, a set of protocols is needed to transport information between two applications. Examples of such protocols are HTTP and, in CORBA, IIOP [17].

B.
Data Integration

Research on data integration mainly deals with moving or federating data between heterogeneous data sources, which can reside on different machines under different operating systems and database management systems [14], [18]. Data integration involves extensive data conversion among three elements: the source schema, the mediated (target) schema, and the mapping between them. The source schema is the data model of the data sources to be integrated; the mediated schema is the integrated system's view of the existing data sources; the mapping provides mechanisms for transforming queries and data from the integrated system to those of the data sources [17], [19]. The drawback of data integration is that it requires significant effort to understand the data models and to maintain the mediated schema whenever the source schema changes.

C. Business Logic Integration

Integration at this layer can be further divided into integration at sub-layers such as basic coordination, functional interfaces, business protocols and policies, and non-functional properties [17]. The traditional approach to application integration involves low-level network and operating system programming, which makes the resulting enterprise system difficult to maintain, configure, and upgrade. To make application integration at the business logic level easier, the research community has mainly focused on the development of middleware technologies. A number of middleware technologies have been developed over the past two decades to build and integrate distributed enterprise applications [20]. Middleware is designed to provide high-level primitives and hide the lower-level primitives of the underlying computing machinery, thus making distributed systems design much easier and faster [20-21].
By adopting the abstractions that middleware provides, we can isolate applications from the variety of ever-changing hardware platforms, operating systems, networks, protocols, and transports that make up enterprise computing systems [2]. Owing to these advantages, the literature reports extensive use of middleware technologies in industrial environments [3], [6], [13]. As an important integration technology, middleware is often used by industrial enterprises to integrate new applications with legacy applications [8]. Typically, communication middleware provides two types of remote communication: message passing and/or remote invocation [6]. More specifically, message passing includes synchronous and asynchronous messaging [21]; remote invocation includes synchronous, client-side asynchronous, and server-side asynchronous remote invocations [6]. Additionally, middleware can provide functions that ensure the reliability, scalability, and performance of enterprise systems. Figure 1 illustrates the use of middleware in distributed applications.

Fig. 1. The use of middleware in distributed applications

There are many types of middleware, such as RPC-based middleware, message-oriented middleware, event-based middleware, database middleware, transaction processing (TP) monitors, security middleware, agent-based middleware, and service-oriented middleware [14], [22-23]. In addition, some companies such as SAP implement their own custom middleware as part of their application solutions. Each of these technologies and approaches has its own advantages and disadvantages.
As custom middleware has limited portability, interoperability, and configurability, and can be expensive to develop and maintain [2], our review mainly focuses on general-purpose middleware technologies. Our review shows that the most commonly used general-purpose middleware technologies at the business logic layer for integrating distributed enterprise applications include RPC-based (remote invocation) middleware (e.g., DCE, DCOM, CORBA, and Java RMI) and message-oriented middleware (MOM). Below is a brief introduction to some of the major middleware technologies.

1) Distributed Computing Environment (DCE)

DCE was developed by the Open Software Foundation in the early 1990s and was designed to support distributed applications in heterogeneous hardware and software environments. DCE consists of multiple components, such as the Remote Procedure Call (RPC), the Cell and Global Directory Services (CDS and GDS), the Security Service, DCE Threads, the Distributed Time Service (DTS), and the Distributed File Service (DFS). These components were integrated to work closely together [24-25]. DCE is supported on many different platforms, including older legacy operating systems. A main advantage of DCE is that the DCE RPC facility provides a way for software modules running on different systems to communicate [14]. Compared with traditional network programming methods such as socket calls, RPC is relatively simple to code. DCE supports both portability and interoperability by providing the developer with capabilities that hide differences among the various hardware, software, and networking elements an application deals with in a large network [24]. However, DCE does not have strong support for object-oriented languages because RPC is inherently procedural. As object-oriented languages such as Java became widely used in industrial and business applications, DCE lost its popularity in the marketplace [25].
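To make concrete the point that an RPC call is simpler to code than raw socket handling, the following minimal sketch uses Python's standard-library XML-RPC modules as a stand-in. It is not DCE RPC, and the procedure name `get_machine_status` and the machine identifier are hypothetical, chosen only for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def get_machine_status(machine_id):
    """Hypothetical remote procedure: report the status of a machine."""
    return {"machine_id": machine_id, "status": "running"}


def start_server():
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(get_machine_status)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_remote_status(port, machine_id):
    # The proxy hides marshalling, connection handling, and the wire format,
    # so the remote call reads like a local function call.
    with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.get_machine_status(machine_id)


if __name__ == "__main__":
    server = start_server()
    print(call_remote_status(server.server_address[1], "press-07"))
    server.shutdown()
    server.server_close()
```

The client never opens a socket or parses bytes itself, which is the kind of convenience the text attributes to RPC facilities in general.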
2) Distributed Component Object Model (DCOM)

DCOM is a Microsoft technology for communication among software components distributed across networked computers [14]. DCOM is the distributed extension of COM, which provides a set of interfaces allowing clients and servers to communicate within the same computer [14]. Using DCOM, two objects on two separate computers are able to call each other's methods. DCOM builds an object remote procedure call (ORPC) layer on top of DCE RPC to support remote objects. DCOM supports object-oriented languages such as C++ and Java and is well supported on Windows platforms. However, DCOM supports only a few non-Windows operating systems, which limits its use in heterogeneous networks [25].

3) Common Object Request Broker Architecture (CORBA)

CORBA is an object-oriented middleware technology defined by the Object Management Group in the early 1990s. Unlike DCOM, CORBA is platform and language independent. CORBA supports a broad range of platforms and programming languages and is able to integrate legacy software applications. The Object Request Broker (ORB) is the core of the system. The Internet Inter-ORB Protocol (IIOP) is used as the standard communication protocol between ORBs. An ORB delivers requests from client applications to server applications. The CORBA specification provides a uniform framework across the entire distributed environment and makes applications built using an ORB highly portable across diverse platforms [14]. CORBA has had applications in many domains, including telecommunications, finance, medicine, and manufacturing [27]. However, CORBA was complex and hard to use correctly, leading to long development times and high defect rates. As implementing CORBA-based distributed applications is costly and technically complex, interest in CORBA has declined sharply [25]. Currently, CORBA has largely lost its position in the marketplace.
Nevertheless, some strengths of CORBA have been incorporated into technologies such as J2EE and Web services.

4) Java Remote Method Invocation (RMI)

Java RMI was released by Sun around 1997 and provides a distributed computing platform specifically focused on Java-based clients and servers [28]. Java RMI enables the programmer to create distributed Java-to-Java applications, in which the methods of remote Java objects can be invoked from other Java virtual machines, possibly on different hosts [14], [28]. Owing to Java’s inherent platform independence, RMI-based applications can run on a wide variety of computing platforms. However, RMI relies heavily on Java and does not directly support other common languages such as C or C++.

5) Message Oriented Middleware (MOM)

MOM relies on messages to enable communication between separate systems. MOM uses the passing and queuing of messages, as well as multiprotocol support, to carry information and action requests between heterogeneous distributed applications or between distributed components within an application [6], [17]. MOM enhances flexibility by allowing applications to exchange messages without needing to know on which platform or processor the other application is located. MOM is inherently a loosely coupled, asynchronous technology [14]. MOM supports a range of messaging styles, such as request-response, prolonged conversation, application queues, publish/subscribe messaging, and broadcasting, and it provides strong support for asynchronous communication [6]. The main disadvantages of MOM include limited scalability and heterogeneity support, a lack of standards, and poor portability [18]. MOM has been used successfully in some industrial systems such as integrated manufacturing systems [12], [17].

D. Presentation Layer Integration

Integration in the presentation layer focuses on user interface (UI) integration. Presenting an integrated and dynamic view to users is the main goal of UI integration [29]. UI integration often builds applications by integrating components at the graphical user interface. Although there is a large body of research on integration at the data layer and the business logic layer, the research community has done little work at the presentation level [19]. A recent example of user interface integration is the Web mashup, such as integrating Google Maps with other applications. Portlets are another UI integration technology and can help produce customizable portal applications [29]. Further work on standardization at the presentation level is needed for effective user interface integration to take off [19].

IV. RECENT RESEARCH ON THE TECHNOLOGY INTEGRATION OF DISTRIBUTED ENTERPRISE APPLICATIONS

In this section, we give a brief overview of recent research on technologies for integrating distributed enterprise applications, including J2EE, .NET, Web services, Service Oriented Architecture (SOA), and Enterprise Service Bus.

A. J2EE (Java 2 Enterprise Edition)

J2EE has emerged as a leading platform for developing enterprise applications [30]. J2EE contains technologies such as Java Database Connectivity, Enterprise JavaBeans, Java Naming and Directory Interface, Remote Method Invocation, Java Server Pages (JSP), and XML. In particular, Enterprise JavaBeans (EJB) provides a simplified method for developing component-based distributed applications over heterogeneous environments [14], [17]. J2EE has been used extensively in industrial systems.
For example, J2EE was used as a system framework to integrate the information systems of supply chain alliance enterprises [31].

B. Microsoft’s .NET Framework

The .NET Framework standardizes how languages refer to data and objects and allows objects written in different languages to operate together. It allows developers to write an application in different languages and to execute it on different types of runtime systems and environments [26]. The .NET programming languages all compile to a common intermediate language, Microsoft Intermediate Language (MSIL), which runs on the .NET Framework. MSIL is then converted into machine code during application execution. The .NET Framework also provides a remoting infrastructure that allows an object on one computer to call the methods of an object on another computer [32]. Using .NET remoting, objects can communicate with one another even though they reside on different computers. Many developers use remoting to implement distributed applications within an intranet. As .NET runs only on Windows systems, .NET technology has limited reach in heterogeneous environments.

C. Web Services, Service-oriented Architecture and Enterprise Service Bus

As traditional middleware such as CORBA and DCOM is typically used for intranet applications and in many cases cannot cross firewall boundaries, Web services have been developed in recent years to support the integration of Internet applications. A Web service is a collection of functions that are packaged as a single entity and published to the network for use by other programs. Web services can be accessed from any language and can run on any operating system. They use HTTP as the underlying transport, which allows function requests to pass through firewalls. XML is used to format the input and output parameters of the request, so the request is not tied to any particular component technology or object calling convention.
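As an illustration of how XML-formatted parameters decouple a request from any particular calling convention, the sketch below builds and parses a SOAP 1.1-style envelope with Python's standard library. The service namespace `urn:example:inventory`, the operation `GetStockLevel`, and its parameters are hypothetical; a real service would define them in its own interface description.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:inventory"  # hypothetical service namespace


def build_request(operation, params):
    """Serialize an operation name and its parameters as an XML envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from the XML text alone."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


if __name__ == "__main__":
    xml_text = build_request("GetStockLevel", {"itemId": "A-100", "warehouse": 3})
    print(xml_text)
    print(parse_request(xml_text))
```

Because the receiver needs only the XML text, the sender and receiver can be written in different languages and run on different platforms; all values travel as text, so the receiver sees the warehouse number as the string "3".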
The main Web services protocols include SOAP (the protocol for interacting with a Web service), WSDL (the language for specifying the interface to a Web service), and UDDI (the repository for storing references to Web services so that clients can find them) [14], [33]. Web services are building blocks for constructing Web-based distributed applications and can be viewed as middleware that is better suited to use across the Internet. In essence, Web services are built around the concept of messaging, and these messages frequently take the form of request/response-type remote procedure calls on remote objects [34]. Both J2EE and .NET can be used to create Web services. The Web services model involves three roles: a service broker that acts as a lookup service between a service provider and a service requester; a service provider that publishes its services to the service broker; and a service requester that asks the service broker where to find a suitable service provider and then binds itself to that provider [33].

Web services play an important role in integrating different middleware systems. They can provide a “middleware for middleware” abstraction layer for modern integration applications [2]. Because different middleware has different advantages, many enterprises have used various middleware products for application integration over the years. As a result, these enterprises face a “Middleware Islands” problem, because middleware technologies and products from different vendors do not interoperate easily [35]. Consequently, enterprises have to find ways to integrate these different middleware systems. In some cases, ad hoc techniques such as adapters can be used to extend one of the enterprise middleware systems to wrap or integrate the others. In many cases, however, this is not possible for reasons such as cost and a lack of technical expertise [2].
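The adapter technique mentioned above can be sketched as follows. This is a toy, in-process illustration under stated assumptions: `QueueMiddleware` is a hypothetical stand-in for a message-oriented product, and `RpcStyleAdapter` wraps it so that callers written against a call/return interface can use it unchanged. Neither class corresponds to a real middleware API.

```python
from collections import deque


class QueueMiddleware:
    """Hypothetical message-oriented middleware with request and reply queues."""

    def __init__(self):
        self._requests = deque()
        self._replies = deque()
        self._handlers = {}

    def register_handler(self, topic, handler):
        self._handlers[topic] = handler

    def send(self, topic, body):
        self._requests.append((topic, body))

    def process_pending(self):
        # Deliver every queued request to its handler and queue the reply.
        while self._requests:
            topic, body = self._requests.popleft()
            self._replies.append(self._handlers[topic](body))

    def receive(self):
        return self._replies.popleft()


class RpcStyleAdapter:
    """Adapter: exposes the queue middleware through a call/return interface."""

    def __init__(self, middleware):
        self._middleware = middleware

    def call(self, operation, **params):
        self._middleware.send(operation, params)
        self._middleware.process_pending()
        return self._middleware.receive()


if __name__ == "__main__":
    mom = QueueMiddleware()
    mom.register_handler(
        "reorder_level", lambda body: {"item": body["item"], "reorder_at": 20}
    )
    adapter = RpcStyleAdapter(mom)
    print(adapter.call("reorder_level", item="A-100"))
```

Each pair of middleware products needs its own adapter of this kind, which hints at why the approach becomes costly as the number of "islands" grows.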
Web services and their underlying principle, service-oriented architecture (SOA), are considered good solutions for such middleware-to-middleware interworking [2], [35]. SOA represents the latest trend in integrating heterogeneous systems and different middleware systems. SOA provides guidelines on how services are described, discovered, and used [8], [14]. In SOA, software applications are packaged as services. Each service has a well-defined interface that lists the operations it provides and the set of messages it accepts and sends in response [35]. Services can be recombined and reused to create new applications [8]. In industrial systems, SOA has been successfully applied to supply chain management systems, manufacturing execution systems, train car management systems, control systems for semiconductor processing equipment, healthcare medical systems, electronic power application systems, electronics production systems, etc. [8], [12], [36-37]. For example, a factory’s online business system can have a purchase-and-ordering application that communicates with an inventory application on another web server that specifies the items that need to be reordered, or with a credit bureau's Web service that handles credit-history requests from loan services for prospective borrowers [38].

Service-oriented integration is an evolution of EAI in which proprietary connections are replaced with standards-based connections over an Enterprise Service Bus (ESB), which is location transparent and provides a flexible set of routing, mediation, monitoring, and transformation capabilities.
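The routing, monitoring, and transformation capabilities listed above can be illustrated with a toy, in-process bus. This is only a sketch under stated assumptions: the `ServiceBus` class, the message type `legacy.order`, and the `<item>;<quantity>` legacy format are hypothetical, and a production ESB adds transport, reliability, and security features that are omitted here.

```python
class ServiceBus:
    """Toy in-process bus: routes messages by type and transforms payloads."""

    def __init__(self):
        self._routes = {}    # message type -> (transform, handler)
        self.audit_log = []  # minimal monitoring: record each routed type

    def register(self, message_type, handler, transform=None):
        self._routes[message_type] = (transform, handler)

    def send(self, message_type, payload):
        if message_type not in self._routes:
            raise KeyError(f"no route for message type {message_type!r}")
        transform, handler = self._routes[message_type]
        self.audit_log.append(message_type)   # monitoring
        if transform is not None:
            payload = transform(payload)      # transformation
        return handler(payload)               # routing to the service


def parse_legacy_order(text):
    # Hypothetical legacy format: "<item>;<quantity>"
    item, quantity = text.split(";")
    return {"item": item, "quantity": int(quantity)}


def schedule_production(order):
    return f"scheduled {order['quantity']} x {order['item']}"


if __name__ == "__main__":
    bus = ServiceBus()
    bus.register("legacy.order", schedule_production, transform=parse_legacy_order)
    print(bus.send("legacy.order", "A-100;5"))
```

The sender names only a message type, not a destination, which is a simplified form of the location transparency described above; the format conversion lives in the bus rather than in either application.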
An ESB can work across different middleware products and standards to implement enterprise-wide SOA [2], [8]. Together with the open SOAP protocol, an ESB can shield applications from the different middleware-specific protocols (CORBA IIOP, J2EE RMI, etc.) used by heterogeneous systems, so that data flows smoothly between application systems and interoperability between them improves [39]. Thus, an SOA-based ESB is often viewed as a new middleware technology. For example, an electric power application system in Northern China was integrated using an ESB [39]. Figure 2 illustrates an SOA-oriented integration environment using an ESB. In summary, Web services, SOA, and ESB provide a promising framework for inter-enterprise integration.

Fig. 2. An SOA-oriented integration environment using ESB.

V. FUTURE PERSPECTIVES

A. Trends

Integrating various industry applications is an ongoing task for industrial enterprises that are adopting new technologies and embedded devices. Some new trends in this area include:

1) As Web services, SOA, and ESB are increasingly applied to complex integration tasks that involve both existing and legacy applications, there is a growing need to ensure Quality of Service (QoS) in the integration process [40]. Because different Web service applications often have different QoS requirements and may compete with each other for resources such as bandwidth and processing time, it is necessary to develop empirically tested QoS integration models that measure and monitor QoS parameters, check the agreed-upon service levels, report violations to the authorized parties, and implement dynamic selection mechanisms for QoS-aware Web services [41-42]. We expect models such as the Trusted and Autonomic Service Cooperation model to be deployed in industrial enterprises. In addition, services in service-oriented industry applications will increasingly be integrated using different multi-tenancy patterns [43].
2) As the size and complexity of industrial applications continue to grow, the amount of data increases exponentially. There is an increasing need to integrate OLAP (Online Analytical Processing), knowledge discovery, data mining functions, and data sources for decision support, information integration, and other business needs [44-46]. Application-specific middleware such as data mining middleware will increasingly be developed and deployed for industrial information system integration.

3) The semantic web and social networking technologies are still in their infancy with regard to industrial applications [47-48]. The integration of semantic web and social networking technologies with sensor data is expected to grow in industrial applications [49] and to add more value for customers and partners [47] in industrial settings. Various ontology approaches for semantic integration and interoperability [14] will mature and become more applicable in industrial environments.

4) Mobile applications, embedded systems, and smart embedded devices are increasingly deployed in industrial enterprises. Advances in semiconductor and transmission technology have added many advanced functionalities to interconnected and self-reliant smart embedded devices such as sensors and actuators. This creates new opportunities and challenges for building communication and interoperability between industrial systems and embedded devices. Because of their strong support for both autonomy and interoperability, SOA approaches are capable of implementing communication and data exchange between embedded devices and applications [50]. We expect industry applications to be increasingly integrated, using SOA approaches, with services running on large numbers of networked, resource-limited mobile and smart embedded devices.
On the other hand, techniques such as embedded Web servers [51], gateway implementation and XML [52-53], and TCP/IP-based protocol application programming [51-52] will continue to be used to exchange data with the large number of heterogeneous Networked Control Systems (NCS) and commercial Programmable Logic Controllers (PLC) deployed in industry over the years. A main reason is that SOA has not yet been adopted in these NCSs and PLCs and thus cannot be considered a feasible solution for integration with already deployed control systems [51].

B. Some Research Challenges

1) As more and more industrial enterprises adopt Web services to integrate various applications and devices, security challenges become more prominent. For example, security risks may arise in the processes of command transfer, remote diagnosis, and maintenance [3] because information exchange and device access now have to cross multiple corporate networks and the Internet.

2) Many industrial enterprises have distributed real-time control systems (DRCS) that communicate with sensors and actuators over a communication network [53]. Generally, a DRCS requires greater reliability, robustness, and efficiency than data-centric applications. In particular, industrial enterprises such as power plants impose strict performance constraints and response-time requirements on their industrial systems and applications. For example, many industrial automation and control systems collect data from a large variety of heterogeneous sensors and require real-time analysis of the collected data [54-57].
As DRCS play an increasingly crucial role in critical industrial operations and transactions, new integration techniques such as real-time distribution middleware, distributed real-time Java [6], and real-time SOA solutions [54-55] are needed to address performance and time-constraint concerns, predict performance, and mitigate integration risks such as lost or missent data; incomplete, unreadable, or invalid data; and accountability and security problems [58-59].

3) In practice, the integration of enterprise applications and devices may fail because of various unfavorable factors and unanticipated changes. Research on integration reliability, including data and information reliability, is therefore highly valued by industrial enterprises [54]. New guidelines and methods, such as task load balancing, fault tolerance, message scheduling, and transaction mechanisms, need to be developed further to ensure the reliability, maintainability, fast diagnosis, and robustness of integrated industrial applications and devices in various environments.

4) As new technologies and devices are constantly introduced into industrial systems, user interface integration for industrial systems poses many new problems [19], such as the variety of interface types, definitions, and service interfaces (functional, discovery, binding, etc.) [60]. Interface integration requires a good understanding of the various applications, devices, and enterprise-wide integration requirements. Currently, there is a lack of conceptual modeling techniques that can effectively elicit, represent, and analyze enterprise-wide integration requirements [61].

5) In recent years, new technologies such as wireless sensor networks (WSN), Radio Frequency Identification (RFID), and the Internet of Things (IoT) have been deployed in industrial systems such as logistics systems, material flow systems, and supply chain management systems [62-63].
These new technologies have made the integration of industrial applications, devices, and various interfaces more difficult. There is a lack of services, guidelines, and standard architectures that allow heterogeneous devices, sensors, aggregators, actuators, and context-aware applications from diverse domains to interact while preserving reusability, security, and privacy [64]. For example, the lack of services for connecting users to the appropriate sensor networks becomes very apparent as the number of sensor network data sources increases [65]. The deployment of built-in and dynamically deployed user services in a physical device still needs further research to ensure the expected functionality and performance [66]. Various middleware solutions and architectures (e.g., publisher/subscriber architecture, real-time message bus architecture, ontology architecture) for integrating industrial applications, RFID, WSN, and IoT have been proposed [14], [50], [54], [65-73]. However, these solutions and architectures are typically designed for their respective domains. Developing an ontology that precisely defines the concepts and properties of an enterprise architecture domain integrating these new technologies is considered challenging [74]. The scalability and customization of these middleware solutions and architectures remain open issues that need further research.

VI. CONCLUSION

Enterprise application integration concerns the interoperability of applications on heterogeneous platforms as well as access to shared data and services by various distributed applications [14], [75-76]. Over the past three decades, a number of technologies and techniques have been developed to address the integration of distributed enterprise applications in networked environments.
In this paper, we have surveyed the state of the art in the integration of distributed enterprise applications, covering essential concepts, architectures, and key technologies for application integration, as well as research directions. As enterprises extend their boundaries across different business areas and acquire disparate technology solutions and applications, integrating distributed enterprise applications becomes inevitable for enterprises that need to remain competitive. There are strong demands in industry to add interoperability and other collaborative features to existing industrial information systems [3], [14], [74]. As more and more industrial enterprises adopt multi-tier client/server, Internet, and service-oriented architectures for their enterprise applications and industrial devices [12], [27], [77-81], the need for interoperability is prominent in the industrial enterprise environment [13]. Many research challenges still need to be resolved before integrated industrial systems become more widely applicable [8]. These include user interface integration; reliability, performance management, and security risk management [3]; data mining for distributed applications [82]; cross-infrastructure service access protection and related service orchestration [83]; integration frameworks for supply chain applications [84-85]; integration of hybrid wireless networks in enterprise systems [86]; architecture design for real-time sensor networks [65]; and control architecture for sensor, actuator, and control service implementation [60]. Additionally, we believe that the dialogue between the industrial informatics and enterprise systems areas needs to be enhanced in order to drive new development of integrated industrial enterprise systems.
REFERENCES

[1] A.S. Tanenbaum. Distributed Systems, Principles and Paradigms. Prentice Hall, 2002.
[2] S. Vinoski. Integration with Web Services. IEEE Internet Computing, 7(6): 75-77, 2003.
[3] Y. Xu, R. Song, L. Korba, L. Wang, W. Shen and S. Lang, “Distributed device networks with security constraints,” IEEE Transactions on Industrial Informatics, 1(4), 217-225, 2005.
[4] Function Blocks for Industrial-Process Measurement and Control Systems, 2000. IEC TC65/WG6, IEC-TC65/WG6 Committee.
[5] D.S. Linthicum. Enterprise application integration. Addison-Wesley Longman Ltd., Essex, UK, 2000.
[6] P.B. Val, M. Garcia-Valls, I. Estevez-Ayres, “Simple Asynchronous Remote Invocations for Distributed Real-Time Java,” IEEE Transactions on Industrial Informatics, 5(3), 289-298, 2009.
[7] K. Qureshi, “Enterprise application integration,” in Proc. IEEE 2005 Int. Conf. Emerging Technologies, Islamabad, Pakistan, Sep. 17-18, pp. 340-345, 2005.
[8] L. Xu, “Enterprise Systems: State-of-the-Art and Future Trends,” IEEE Transactions on Industrial Informatics, vol. 7, no. 4, pp. 630-640, 2011.
[9] “Roadmap for board assembly,” in NEMI Technology Roadmaps 1996 Edition. Herndon: INEMI, Dec. 1996.
[10] S. Chen, “Open design of networked power quality monitoring systems,” IEEE Transactions on Instrumentation and Measurement, 53(2), 597-601, 2004.
[11] A. Lüder, A. Klostermeyer, J. Peschke, A. Bratoukhine, and T. Sauter, “Distributed Automation: PABADIS versus HMS,” IEEE Transactions on Industrial Informatics, vol. 1, no. 1, pp. 31-38, 2005.
[12] I.M. Delamer and J.L.M. Lastra, “Service-Oriented Architecture for Distributed Publish/Subscribe Middleware in Electronics Production,” IEEE Transactions on Industrial Informatics, 2(4), pp. 281-294, 2006.
[13] A. Kalogeras, J. Gialelis, C. Alexakos, M.
Georgoudakis, and S. Koubias, “Vertical integration of enterprise industrial systems utilizing web services,” IEEE Trans. Industrial Informatics, 2(2): 120-128, 2006.
[14] S. Izza, “Integration of industrial information systems: from syntactic to semantic integration approaches,” Enterprise Information Systems, 3(1), 1-57, 2009.
[15] V. Matena, S. Krishnan, L. DeMichiel, and B. Stearns. Applying Enterprise JavaBeans™: Component-Based Development for the J2EE™ Platform, Second Edition. Addison Wesley, 2003.
[16] A. Sinha, “Client-server computing,” Communications of the ACM, 35(7), July 1992, pp. 77-98.
[17] B. Benatallah and H. R. Motahari-Nezhad, “Service Oriented Architecture: Overview and Directions,” in Advances in Software Engineering, Alfredo Ferro and Egon Boerger (Editors), Lecture Notes in Computer Science, Volume 5316/2008, 116-130, 2008.
[18] D. Chen, G. Doumeingts, F. Vernadat, “Architectures for enterprise integration and interoperability: Past, present and future,” Computers in Industry, 59(7), pp. 647-659, 2008.
[19] F. Daniel, J. Yu, B. Benatallah, F. Casati, M. Matera, and R. Saint-Paul, “Understanding ui integration: A survey of problems, technologies, and opportunities,” IEEE Internet Computing, 11(3): 59-66, 2007.
[20] P. Bernstein, “Middleware: A model for distributed systems services,” Communications of the ACM, pp. 86-98, 1996.
[21] S.L. Ooi and M.T. Su, “Integrating enterprise application using message-oriented middleware and J2EE technologies,” International Conference on Computing & Informatics, 1-5, June 2006.
[22] W. Emmerich, “Software Engineering and Middleware: A Roadmap,” Proceedings of the Conference on The Future of Software Engineering, Limerick, Ireland, pp. 117-129, 2000.
[23] Q. Chen, J. Yao, and R. Xing, “Middleware components for e-commerce infrastructure: An analytical review,” Journal of Issues in Informing Science and Information Technology, 3, 137-146, 2006.
[24] Open Software Foundation.
OSF DCE Application Development Guide. Open Software Foundation, 11 Cambridge Center, Cambridge, MA, revision 1.0, update 1.0.2 edition, 1994.
[25] Recommendations for Using DCE, DCOM, and CORBA Middleware, 1998. Sponsored by: DISA/JIEO Center for Computer Systems Engineering (JEXF).
[26] Microsoft Corporation. DCOM Technical Overview, 2011. Available: http://technet.microsoft.com/en-us/library/cc722925.aspx
[27] S. Vinoski, “CORBA: integrating diverse applications within distributed heterogeneous environments,” IEEE Communications Magazine, 35(2), pp. 46-55, 1997.
[28] J. Maassen, R. van Nieuwpoort, R. Veldema, H. E. Bal, T. Kielmann, C. Jacobs, and R. Hofman, “Efficient Java RMI for Parallel Programming,” ACM Trans. Prog. Lang. Syst., 23(6), 2001.
[29] F. Bellas, “Standards for Second-Generation Portals,” IEEE Internet Computing, 8(2), pp. 54-60, 2004.
[30] R. Johnson, “J2EE Development Frameworks,” Computer, vol. 38, no. 1, pp. 107-110, January 2005.
[31] H. Zhang, B. Zhang, and B. Liu, “Information Integration Solutions Based on J2EE for Supply Chain Enterprises,” 2011 International Conference on Management and Service Science (MASS), pp. 1-4, Wuhan, China, 2011.
[32] S. Khan, K. Qureshi, and H. Rashid, “Performance comparison of ICE, HORB, CORBA and dot NET remoting middleware technologies,” International Journal of Computer Applications, vol. 3, no. 11, pp. 15-18, 2010.
[33] J. Roy and A. Ramanujan, “Understanding Web services,” IT Profess., vol. 3, no. 6, pp. 69-73, 2001.
[34] M.D. Hanes, S.C. Ahalt, and A. K. Krishnamurthy, “A Comparison of Java RMI, CORBA, and Web Services Technologies for Distributed SIP Applications,” High Performance Embedded Computing Annual Workshop (HPEC), 2002.
[35] S. Baker and S. Dobson, “Comparing service-oriented and distributed object architectures,” Proceedings of the International Symposium on Distributed Objects and Applications, 2005. LNCS 3760, pp. 631-645.
[36] Y. H. Yin and J. Y.
Xie, “Reconfigurable manufacturing execution system for pipe cutting,” Enterprise Information Systems, 5(3), 287-299, 2011.
[37] L. Duan, W. N. Street and E. Xu, “Healthcare information systems: data mining methods in the creation of a clinical recommender system,” Enterprise Information Systems, 5(2), 169-181, 2011.
[38] M. Dumitrache, S. Dumitra, and M. Baciu, “Web services integration with distributed applications,” Journal of Applied Quantitative Methods, 5(2), 223-233, 2010.
[39] R. Xu, J. Bai, and Y. Wang, “The research and implementation of power application system integration based on enterprise service bus,” 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS), Beijing, China, 2010.
[40] B. S. Farroha and D. L. Farroha, “Policy-based QoS requirements in a SOA enterprise framework-an investigative analysis,” IEEE Military Communications Conference, pp. 1-7, 2007.
[41] J. Ai, J. Gao, J. Yu, and Z. Zhao, “A Concept for QoS Integration in Trusted and Autonomic Service Cooperation,” The 1st International Conference on Information Science and Engineering (ICISE2009), 3922-3925, 2009.
[42] D.A. D'Mello and V. S. Ananthanarayana, “Dynamic selection mechanism for quality of service aware web services,” Enterprise Information Systems, 4(1), 23-60, 2010.
[43] R. Mietzner, F. Leymann and T. Unger, “Horizontal and vertical combination of multi-tenancy patterns in service-oriented applications,” Enterprise Information Systems, 5(1), 59-77, 2011.
[44] V.T. Ravi and G. Agrawal, “Integrating and optimizing transactional memory in a data mining middleware,” 2009 International Conference on High Performance Computing (HiPC), 215-224, 2009.
[45] B. Liu, S.G. Cao, and W. He, “Distributed Data Mining for E-Business,” Information Technology and Management, 12(1), pp. 1-13, 2011.
[46] E. Xu, M. Wermus and D. Blythe Bauman, “Development of an integrated medical supply information system,” Enterprise Information Systems, 5(3), 385-399, 2011.
[47] J.G. Breslin, D. O'Sullivan, A. Passant, and L. Vasiliu, “Semantic Web computing in industry,” Computers in Industry, 61(8), pp. 729-741, 2010.
[48] G. Governatori and R. Iannella, “A modelling and reasoning framework for social networks policies,” Enterprise Information Systems, 5(1), 145-167, 2011.
[49] M. Kaplan and M. Haenlein, “Users of the world, unite! The challenges and opportunities of Social Media,” Business Horizons, 53, pp. 59-68, 2010.
[50] D. Guinard, V. Trifa, S. Karnouskos, P. Spiess, and D. Savio, “Interacting with the SOA-Based Internet of Things: Discovery, Query, Selection, and On-Demand Provisioning of Web Services,” IEEE Transactions on Services Computing, 3(3), 223-235, 2010.
[51] A. Jestratjew and A. Kwiecien, “Performance of HTTP Protocol in Networked Control Systems,” IEEE Transactions on Industrial Informatics, volume 99, 2012. DOI: 10.1109/TII.2012.2183138
[52] T. Sauter and M. Lobashov, “How to Access Factory Floor Information Using Internet Technologies and Gateways,” IEEE Transactions on Industrial Informatics, 7(4), pp. 699-712, 2011.
[53] S. Eberle, “Adaptive Internet Integration of Field Bus Systems,” IEEE Trans. on Industrial Informatics, vol. 3, no. 1, pp. 12-20, Feb. 2007.
[54] L. Du, C. Duan, S. Liu, and W. He, “Research on Service Bus for Distributed Real-time Control Systems,” The 6th IEEE Joint International Information Technology and Artificial Intelligence Conference, 401-405, 2011.
[55] T. Cucinotta, A. Mancina, G. F. Anastasi, G. Lipari, L. Mangeruca, R. Checcozzo, and F. Rusinà, “A Real-Time Service-Oriented Architecture for Industrial Automation,” IEEE Trans. on Industrial Informatics, vol.
5, no. 3, pp. 267-277, Aug. 2009.
[56] T. Cucinotta, L. Palopoli, L. Abeni, D. Faggioli, and G. Lipari, “On the Integration of Application Level and Resource Level QoS Control for Real-time Applications,” IEEE Transactions on Industrial Informatics, 6(4), pp. 479-491, 2010.
[57] X. Liu, Q. Wang, S. Gopalakrishnan, W. He, L. Sha, H. Ding, and K. Lee, “ORTEGA: An Efficient and Flexible Online Fault Tolerance Architecture for Real-Time Control Systems,” IEEE Transactions on Industrial Informatics, vol. 4, no. 4, pp. 213-224, November 2008.
[58] A. Capozucca and N. Guelfi, “Modelling dependable collaborative time-constrained business processes,” Enterprise Information Systems, 4(2), 153-214, 2010.
[59] R. Gleghorn, “Enterprise Application Integration: A Manager’s Perspective,” IT Professional, 7(6), 17-23, 2005.
[60] A. Pohl, H. Krumm, F. Holland, I. Luck, and F.-J. Stewing, “Service-Orientation and Flexible Service Binding in Distributed Automation and Control Systems,” Proc. 22nd Advanced Information Networking and Applications-Workshops, 2008, pp. 1393-1398.
[61] N. Bolloju, “Conceptual Modeling of Systems Integration Requirements,” IEEE Software, 26(5), pp. 66-74, 2009.
[62] C. Hsu and W. Wallace, “An industrial network flow information integration model for supply chain management and intelligent transportation,” Enterprise Information Systems, 1(3), 327-351, 2007.
[63] S. Kumar, B. Kadow and M. Lamkin, “Challenges with the introduction of radio-frequency identification systems into a manufacturer's supply chain - a pilot study,” Enterprise Information Systems, 5(2), 235-253, 2011.
[64] S. Bandyopadhyay, M. Sengupta, S. Maiti and S. Dutta, “Role of middleware for Internet of Things: a study,” International Journal of Computer Science & Engineering Survey (IJCSES), vol. 2, no. 3, pp. 94-105, 2011.
[65] C. Gadea, B. Ionescu, and D. Ionescu, “Real-time collaborative intelligent services for sensor networks,” Proc. ICCC-CONTI 2010, May 2010, pp. 511-516.
[66] G. Candido, A.W.
Colombo, J. Barata, and F. Jammes, “Service-Oriented Infrastructure to Support the Deployment of Evolvable Production Systems,” IEEE Transactions on Industrial Informatics, 7(4), 759-767, 2011.
[67] F. Gandino, B. Montrucchio, M. Rebaudengo, and E.R. Sanchez, “On Improving Automation by Integrating RFID in the Traceability Management of the Agri-Food Sector,” IEEE Transactions on Industrial Electronics, 56(7), pp. 2357-2365, 2009.
[68] Y. Tian, J.V. Geiger, H.B. Su, S.V. Kumar, and P.R. Houser, “Middleware-Based Sensor Web Integration,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 3(4), 467-472, 2010.
[69] S. Haller, S. Karnouskos and C. Schroth, “The Internet of Things in an enterprise context,” in Future Internet Systems (FIS), LNCS, vol. 5468, Springer, pp. 14-28, 2009.
[70] C. Lee and C. Chung, “RFID Data Processing in Supply Chain Management Using a Path Encoding Scheme,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 5, May 2011.
[71] M. Kranz, P. Holleis and A. Schmidt, “Embedded interaction: interacting with the Internet of Things,” IEEE Internet Computing, March/April, 46-53, 2010.
[72] T. Perumal, A.R. Ramli, C.Y. Leong, K. Samsudin, and S. Mansor, “Middleware for heterogeneous subsystems interoperability in intelligent buildings,” Automation in Construction, 19(2): 160-168, 2010.
[73] L. Xu, “Information Architecture for Supply Chain Quality Management,” International Journal of Production Research, vol. 49, no. 1, pp. 183-198, January 2011.
[74] H. Panetto and A. Molina, “Enterprise integration and interoperability in manufacturing systems: Trends and issues,” Computers in Industry, 59(7), pp. 641-646, 2008.
[75] K. Wang, X. Bai, J. Li and C. Ding, “A service-based framework for pharmacogenomics data integration,” Enterprise Information Systems, 4(3), 225-245, 2010.
[76] O. Erol, B.J. Sauser and M.
Mansouri, “A framework for investigation into extended enterprise resilience,” Enterprise Information Systems, 4(2), 111-136, 2010.
[77] G. Cândido, A. W. Colombo, J. Barata, and F. Jammes, “Service-Oriented Infrastructure to Support the Deployment of Evolvable Production Systems,” IEEE Transactions on Industrial Informatics, 7(4), pp. 759-767, 2011.
[78] C. Fu, G. Zhang, J. Yang and X. Liu, “Study on the contract characteristics of Internet architecture,” Enterprise Information Systems, 5(4), 495-513, 2011.
[79] F. Jammes and H. Smit, “Service-Oriented Paradigms in Industrial Automation,” IEEE Transactions on Industrial Informatics, vol. 1, no. 1, pp. 62-70, 2005.
[80] D. Liu, R. Deters and W.J. Zhang, “Architectural design for resilience,” Enterprise Information Systems, 4(2), 137-152, 2010.
[81] T. Zhang, S. Ying, S. Cao and J. Zhang, “A modeling approach to service-oriented architecture,” Enterprise Information Systems, 2(3), 239-257, 2008.
[82] D.M. Chiang, C. Lin, and M. Chen, “The adaptive approach for storage assignment by mining data of warehouse management system for distribution centres,” Enterprise Information Systems, 5(2), 219-234, 2011.
[83] Q. Li, J. Zhou, Q. Peng, C. Li, C. Wang, J. Wu, and B. Shao, “Business processes oriented heterogeneous systems integration platform for networked enterprises,” Computers in Industry, 61, 127-144, 2010.
[84] H. Hvolby and J. Trienekens, “Challenges in business systems integration,” Computers in Industry, 61(9), pp. 808-812, 2010.
[85] M. Zdravković, H. Panetto, M. Trajanović and A. Aubry, “An approach for formalising the supply chain operations,” Enterprise Information Systems, 5(4), 401-421, 2011.
[86] S. Li, L. Xu, X. Wang and J. Wang, “Integration of hybrid wireless networks in cloud services oriented enterprise information systems,” Enterprise Information Systems, 6(2), 165-187, 2012.

Wu He received the B.S. degree in computer science from DongHua University, China, in 1998, and the Ph.D.
degree in information science from the University of Missouri, USA, in 2006. His research interests include enterprise applications and knowledge management.

Li Da Xu (M’86-SM’11) received the M.S. degree in information science and engineering from the University of Science and Technology of China, in 1981, and the Ph.D. degree in systems science and engineering from Portland State University, Portland, USA, in 1986. He serves as the Founding Chair of IFIP TC8 WG8.9 and the Founding Chair of the IEEE SMC Society Technical Committee on Enterprise Information Systems.

ACKNOWLEDGMENT

The authors appreciate the valuable comments from the Editor and three anonymous reviewers. This project was partially supported by the NSFC (National Natural Science Foundation of China) Grant 71132008, the Changjiang Scholar Program of the Ministry of Education of China, and the US National Science Foundation Grant 1044845.