Enterprise Integration Modeling Strategies: Dynamic, Semantic, Canonical Models

Author: Dave Hollander
Publication Date: March 8, 2016

© Copyright 2003 Contivo, Inc. All rights reserved. The information contained herein has been obtained from sources believed to be reliable. Contivo disclaims all warranties as to the accuracy, completeness or adequacy of such information. Contivo has no liability for errors, omissions, or inadequacies in this document.

Dynamic Semantic Canonical Models

Contents
Introduction – Integration Strategies
Dynamics – Staying Ahead of the Curve
Semantics – Understanding Intent
Canonicals – Managing Business Standards
Models – Tying it all Together

Introduction – Integration Strategies

Integration is now a competitive necessity for enterprises. Integration projects that span internal applications, link critical demand- and supply-chain partners, or connect customers all face the problem of making disparate systems interoperate seamlessly by exchanging business information.

The integration imperative requires strategic solutions to accommodate the pervasive change and heterogeneity common in our IT environments. While many companies strive to reduce the number of applications, the effect on integration is minimal when compared to the number of sources of change and variety. Sources include legacy systems, standards, applications and middleware that are introduced or revised, as well as organizations that restructure, merge, and move through the ongoing cycle of centralization and decentralization.
Adding to this are new initiatives to gain competitive advantage, often by deploying new applications and creating new business functions.

A successful integration strategy must provide an IT manager the ability to:
- Accelerate development of global, networked business processes and visibility into distributed operations.
- Increase flexibility by minimizing the risks associated with inevitable organizational and technical change and by introducing reusability, accountability and collaboration to leverage today’s effort in future projects.
- Reduce total costs when integrating legacy applications and web services, even when scaled to enterprise boundaries and beyond.

To address the imperative of effective integration, companies explore a variety of integration strategies, including deploying middleware, managing meta-data, and adopting business messaging standards. Unfortunately, all too often, when faced with staffing changes and the need to run parallel, distributed, multi-department and inter-company integration projects, these strategies do not scale successfully or achieve the desired business results. Integration using EAI middleware, application servers, B2B hubs or Service-Oriented Architectures such as Web services delivers rapid connectivity but in doing so actually enlarges the problem. Meta-data management can improve the efficiency of managing an application but seldom encompasses the full breadth of integration. And business standards implemented as messaging formats or “canonicals” are difficult to implement and to keep current in the dynamic environment our businesses operate in.

While middleware, meta-data management, and messaging standards all have crucial roles in a successful enterprise integration strategy, when deployed in a traditional approach they are insufficient to overcome the challenges presented by manual integration design. Traditional integration implementation methods are manual, labor-intensive, and time-consuming processes.
Traditionally integrated systems are expensive to maintain because a significant portion of the labor involved is dedicated to managing and mapping detailed relationships between the interchanged data. As the number of interconnected systems increases, manual integration methods result in implementation costs that can soar beyond five times the original investment in integration software.

“Manual integration is among the top problems facing major enterprises; there is simply no practical, cost-effective way to solve it manually.” – Jon Derome, Yankee Group

The goal of a successful integration strategy must be to reduce the incremental costs associated with every system that is integrated. As the number of interconnected systems goes up, not only must costs and time-to-deploy go down, but flexibility to address heterogeneity and pervasive change must also increase. Traditional integration fails to meet these goals.

To meet and conquer the challenges of enterprise-scale integration, integration strategies must accommodate a variety of middleware, enable meta-data management capabilities that span multiple technologies, and provide flexibility in applying business standards. Four key design factors in such a strategy are:
- Dynamic – responsive to message iteration, application versioning, changes in your information chains, and your partners’ and your own organizational changes.
- Semantic – focused on the meanings associated with data in documents, not on its representation.
- Canonical – expresses business judgment of which differences in business information are important and which are not.
- Model – an abstract representation of an object, process, system, or information used to improve communications between all stakeholders, improve design quality, and reduce development and maintenance costs.
Strategies that successfully manage these four aspects of integration design will provide the IT manager with the ability to accelerate development, increase flexibility, and reduce total costs. In short, these are the keys to designing a strategy that encourages and engages business process owners in the quick and affordable development of networked processes.

Dynamics – Staying Ahead of the Curve

Traditional integration design relies on creating static specifications of systems and interchanged information. Integration strategies that use messaging, such as middleware, brokers, hubs or newer Service-Oriented Architectures (SOA) such as Web services, rely on precise, rigid specifications of the interchanged messages. Interoperability is achieved by writing code based on these static specifications. The challenge to message-based approaches is keeping up with the change that is ever-present in the business environment.

To accommodate change, messaging architectures have historically introduced optional elements into their message formats. While “optionality” can provide some flexibility and improve acceptance, it raises implementation costs and reduces interoperability. Many messaging veterans, such as Health Level Seven’s (HL7) Modeling and Methodology Committee, find that optionality creates more problems than it solves.

“However, the wide uses of optionality, while helpful in gaining consensus and in writing a somewhat more compact specification, imposes significant costs on interface implementers. In fact, optionality, as a “feature” of HL7 Version 2.n, is associated with most of the issues that Version 3 is trying to solve.
Optionality makes it harder to precisely define the semantics of a specific message and makes it virtually impossible to generate conformance claims for messaging.” – Message Development Framework, Version 3.3, HL7, p15

To accommodate the dual needs of keeping up with the dynamics while providing stable, static specifications, an integration solution should have the ability to separate message specifications into levels:
- Abstract: dynamically managed models
- Concrete: static interchange formats
- Physical: static code that executes

Abstract models describe the terms and concepts that must be exchanged to achieve a business result without the extra detail about how the data is stored in a message. Because they are less precise than the other two levels, abstract models are easier to change and can be kept up to date. Change should first be reflected in the abstract modeling layer to take advantage of quicker, lower-cost processes that are focused on describing the impact of the change and assuring stakeholder consensus.

Concrete formats represent the full detail of the messages and system interfaces being integrated. Concrete formats describe the contents and storage requirements of all of the data to be exchanged. These formats are often expressed as COBOL copybooks, database schemas, XML Schemas, Document Type Definitions (DTDs), or UML. These formats must be rigid and precise since they represent the specifications that programmers use to code behavior into their systems.

The physical layer in an integration environment is the actual runtime infrastructure. Runtime systems locate data in messages and initiate system behaviors based on the data. Traditionally, software programs must be written to locate the information as well as to initiate, perform, and monitor the appropriate behaviors.
The trend toward meta-data driven systems, such as those implemented using XML Schema, provides some flexibility to this layer by enabling limited variations in the actual runtime messages and interfaces. However, the flexibility is limited to the ability of the meta-data standards to describe where data is located and the intended changes in system behavior.

Finally, to make the three-layer strategy work there must be a rigorous and repeatable refining and distilling process for moving a project through the three layers. Once abstract models are agreed upon, it must be possible to refine them by adding specification detail about how the data is recorded in the message. Completed concrete formats then need to be mapped to identify how the data is to be transformed as it is exchanged, and code that performs that transformation needs to be generated.

Augmenting traditional message design architectures with a third tier for abstract models can make the design environment dynamic enough to meet the strategic integration goals. To achieve this goal, it is not enough to just introduce a third tier; it must be designed to be responsive to the intent of the exchanged information, and it must enable meaningful and manageable business standards.

Semantics – Understanding Intent

Meta-data management often focuses on the concrete formats for the data to be exchanged, stored, or processed. However, the meta-data (the data that describes the actual data) is seldom semantic; that is, it is seldom abstract enough to provide the scalability and adaptability needed to meet integration goals.

Traditional integration design seldom deals directly with semantics: the meaning of information. Semantics are the domain of human programmers and mappers who use their experience and specifications to infer the semantics of messages and interfaces.
Unfortunately, while this traditional treatment leverages the strengths of humans to understand the nuances of semantics, it cannot scale when faced with staffing changes, nor does it meet the demands of integration projects that span department or company boundaries.

The conventional attempt to overcome the semantic scaling problem is to initiate an effort to design or adopt a single, uniform dictionary of terms and messages. These projects become overwhelmed when they try to enforce these messages on all systems and partners involved, or even try to promote the dictionary and messages as an industry-level standard. Abstract models based on semantics can avoid this potentially devastating distraction.

A closer look at semantics will help illustrate how semantics benefit integration strategies. For example, consider a large multinational company’s policy that describes issuing, reclaiming and destroying keys. The policy may read something like: “Keys, notched and grooved, metal implements that are turned to open or close a lock, shall be…”. The physical, not semantic, definition of “key” works fine until new technology is introduced. With the introduction of electronic keys, the policies need to be amended: “Keys, notched and grooved, metal implements that are turned or small devices that use a transmitter to open or close a lock.” Policies generalized to use a semantic definition, “Devices used to open or close a lock”, do not have to be revised with every technology change.

Concentration on intent enables innovation and insulates agreements about information from the influence of organizational change, technology and even cultural differences, such as the differences between American and European conventions for representing numbers (1,000,000.00 vs. 1.000.000,00).

In message-based systems, semantics describe what information is interchanged, not how. The semantic of “ship-date” remains the same regardless of whether it is represented as 15-07-02 or Jul 15, 02.
But determining whether “ship-date” means the date a shipment is 1) on the dock; 2) available to ship; or 3) loaded on the truck is best addressed at the abstract model level. The model level can be used to resolve what information is logically the same from a business perspective.

To manage semantics in the abstract model layer, an integration modeling solution would allow integration architects to identify semantic concepts like “ship-date” and group them into categories. Interfaces representing data messages are modeled to associate concepts with specific data fields. For example, the data fields Street, ADDRLINE, and STRAS from xCBL, OAGIS and SAP IDocs respectively are all semantically equivalent; they are associated with the address1 concept in the address category and are grouped together.

To fully deliver on this abstract modeling process, it must be possible to refine the models by adding detail about how the data is represented. These details, used in the lower tiers, include rules for transforming items that are logically the same but physically represented differently, and rules for locating the information in the message. To enable semantic decisions to automatically guide the deployment of physical runtime processes, rules must be defined such that they can be deployed on a variety of platforms and systems, as they are in the Contivo Enterprise Integration Modeling solution.

Semantic-level agreements are essential for an integration strategy to successfully span the entire organization. If semantics are implemented differently at different points in a networked business process, then additional processes must be implemented to resolve the differences. The development of a shared semantic model enables deeper understanding of the data within transactions and reduces the likelihood of failing to recognize that two or more data items are related and represent the same business value.
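As a rough illustration of the modeling approach described above, the following Python sketch binds the fields Street, ADDRLINE, and STRAS to a shared address1 concept and derives a field-to-field mapping from those bindings. It is only a sketch under stated assumptions: the names Concept, Interface, derive_mapping, and transform, the ship-date field names ShipDate and SHIPDT, and the per-interface date formats are all invented for this example and do not describe the Contivo product or the actual xCBL, OAGIS, or SAP IDoc definitions.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Concept:
    """Abstract tier: a semantic concept, independent of any message format."""
    category: str
    name: str


@dataclass
class Interface:
    """Concrete tier: binds one format's field names to semantic concepts."""
    name: str
    bindings: dict = field(default_factory=dict)  # field name -> Concept
    date_format: str = "%Y-%m-%d"                 # assumed representation rule

    def bind(self, field_name, concept):
        self.bindings[field_name] = concept


def derive_mapping(source, target):
    """Pair source and target fields that are bound to the same concept."""
    by_concept = {concept: name for name, concept in target.bindings.items()}
    return {
        name: by_concept[concept]
        for name, concept in source.bindings.items()
        if concept in by_concept
    }


def transform(record, source, target):
    """Physical tier: copy values across the mapping, re-representing dates."""
    out = {}
    for src_field, tgt_field in derive_mapping(source, target).items():
        value = record[src_field]
        if source.bindings[src_field].category == "date":
            parsed = datetime.strptime(value, source.date_format)
            value = parsed.strftime(target.date_format)
        out[tgt_field] = value
    return out


ADDRESS1 = Concept("address", "address1")
SHIP_DATE = Concept("date", "ship-date")

xcbl = Interface("xCBL", date_format="%d-%m-%y")
xcbl.bind("Street", ADDRESS1)
xcbl.bind("ShipDate", SHIP_DATE)   # hypothetical field name

sap = Interface("SAP IDoc", date_format="%b %d, %y")
sap.bind("STRAS", ADDRESS1)
sap.bind("SHIPDT", SHIP_DATE)      # hypothetical field name

oagis = Interface("OAGIS")
oagis.bind("ADDRLINE", ADDRESS1)

result = transform({"Street": "1 Main St", "ShipDate": "15-07-02"}, xcbl, sap)
print(result)
```

In a design along these lines, adding another format would mean binding its fields to existing concepts once, rather than writing a separate map against every other format; whether that holds in practice depends on how well the concepts capture the business meaning.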
The goal of achieving complex, globally distributed business operations and visibility requires an effective strategy for managing semantics. Clear understanding of the intent of data within messages allows enterprises to develop networked business processes that have a more direct impact on operations and less duplicated data. Semantically enriched integration strategies assure faster implementation and less redundant data, and make it easier to keep current with changing business requirements.

Canonicals – Managing Business Standards

In traditional integration strategies, business standards are sometimes implemented as static messaging formats commonly referred to as “canonicals”. Canonicals establish a shared, common view of business information by eliminating differences the business deems insignificant in enterprise data resources. Differences exist because each disparate system and each standard has a unique perspective regarding what information is important and how to represent it. Dates, for example, vary not only in how they are represented (July 27, 2001 or 07 Jul 01) but also in what they mean (ship-date or ready-to-ship-date). Canonicals represent both the preferred representation and the business judgment as to which data elements are unique and which are equivalent. Eliminating differences can ease integrating systems and enable limited message and interface reusability.

Typically, canonicals are defined in the concrete tier, where they describe all of the details about what information is to be included and how the information is to be structured. XML DTDs or Schemas are increasingly being used to define canonical formats. Deployed successfully, canonical formats enable:
- Reusable business events. Universally reusable messages exchanged between any set of applications.
- Independence of action.
Canonicals establish the technical requirements for achieving enterprise goals that project teams can use for planning and deploying integrations with a minimum of inter-team coordination.

While canonical formats are beneficial, for an integration strategy to be scalable it needs to be enhanced to overcome their drawbacks, which include:
- Lots of models and formats. Because canonical formats are static, they must be designed prior to integrating. However, once deployed, canonicals are difficult to change because each and every system using the canonical event will need to change. In practice, large-scale integration environments use multiple versions and variations of canonicals at any given time. Additionally, canonicals are just a few of the many formats used to describe and design interoperability between business systems. Successful canonical integration strategies must relate all of the models and formats and provide facilities to coordinate their development.
- Performance. Few systems will use the canonical format as their native data format. To exchange data using canonicals, the source system will have to transform its data into the canonical and the receiving system will also have to transform the data. The performance cost of this additional transformation can be too high: business processes that must sustain very high throughput cannot afford the transformation overhead.

Traditionally, canonicals do not use the abstract semantic definitions that would enable them to keep pace with our dynamic business environment. Canonicals, as commonly used today, are important but insufficient to assure that an integration strategy is scalable.

“Like other IT reuse strategies, interface and message reuse is hard to achieve” – R. Schulte; Tactical Guidelines, TG-15-1933; Gartner, Inc.

The solution lies in applying canonicals to the abstract model tier.
Abstract, semantically defined canonical models establish a shared view of what information must be exchanged in a business process. Because abstract canonical models deal only with semantics and not with formatting details, they are faster and easier to develop and maintain than static, standard dictionaries and messages, which must be implemented at both the concrete and physical levels to achieve the desired results. An abstract canonical model acts as an enterprise vocabulary from which multiple concrete formats can be created and maintained. When deployed in a system that manages the relationships between the abstract, concrete and physical layers, a single abstract semantic model can deliver reuse even with differing canonical formats and a variety of physical systems.

Models – Tying it all Together

Modeling has been successfully applied to engineering design activities such as architecture, semiconductors, electronic circuits, and software. Yet modeling has not been broadly and successfully applied to integration.

Fundamentally, the success of modeling is due to its ability to identify significant issues and constraints and present these in multiple representations and at various levels of detail. This enables all stakeholders to collaborate in the design process and to reach consensus. It also makes it easier for designers to divide a problem into smaller, more manageable pieces.

For integration modeling to deliver the same benefits, it must be adapted to identify the issues and constraints that are important to integration. The most fundamental issue is semantic identity and equivalence and the representational details associated with the semantics. To be scalable across an enterprise and beyond, a successful integration design strategy must be able to use abstract semantic models as the organization responds to the forces of change around it.
The modeling technique must also be able to refine the semantic model into static concrete formats that are used by system developers to implement system behavior, and to further distill the formats into runtime code compatible with the physical systems it will execute on. Without processes for refining and distilling, the models will go out of date and gather dust.

Integration design strategies that can leverage the benefits of modeling to semantically manage the integration data in multiple formats across multiple platforms will:
- Encourage the development of networked processes and visibility with the flexibility to respond to change.
- Involve all the stakeholders and encourage meaningful, informed communication and collaboration.
- Lead to integration projects that are implemented quickly and affordably.

In summary, dynamic, semantic, canonical models are keys to making your integration strategy deliver a competitive advantage.