The Current and Future Role of Data Warehousing in Corporate Application Architecture
Χρήστος Μπακάλης, Γιώργος Χαραλαμπόπουλος

Part 1

Introduction
Data Warehouse: A repository of historical data, subject-oriented and organized, summarized and integrated from various sources so that it can be easily accessed and manipulated for decision support.

Data Warehouse as a middleware layer
[Figure: the Data Warehouse as a layer between decision support applications and operational applications]

The goal: Application Integration
Examine the future role of Data Warehousing in Corporate Application Architectures.
Analyze the potential of reusing data warehousing methodology and management concepts for decoupling traditional transactional applications and channel-oriented applications.

Foundation
An application architecture model with three dimensions:
1. Business process.
2. Business unit.
3. Business function.

3-D Model
Integration concepts can be visualized in this 3-D model.
[Figure: 3-D model with the axes business function, business unit, and business process]

Business process
Comprises all the processes that are supported by applications.
Examples: CRM, order processing, product development, risk management, corporate planning.

Business unit
Comprises all the organizational units that result from customer segmentation, product grouping, or a combination of both.
Examples: Retail banking units, fixed-line telephony units, life insurance units.

Business function
Comprises all the functions that are supported by applications.
Examples: Create file orders, calculate prices, create contracts, billing, or plan resource utilization.

Locating applications in the 3-D model
[Figure: 3-D model (axes: business unit, business function, business process) locating decision support applications, the Data Warehouse, application clusters A and B, cross-product applications, vertical applications, and transactional applications]

Product-oriented integration (1)
It is a cross-functional integration strategy.
From these ‘vertical’ applications, companies transfer certain business functions into dedicated cross-product applications.
Example: Customer data management is transferred from various product-specific applications into a single cross-product ‘partner management’ application, to avoid the problems of redundant customer data management and to create opportunities for cross-selling programs.

Product-oriented integration (2)
Although all the data managed by cross-product applications are processed by all other applications and thereby become ‘core’ data, they should be treated as operational data.
As a result, cross-product applications can be treated as transactional applications.
Thus, product-oriented integration is complemented by core data integration.

The role of Data Warehouse
It is the intermediate layer by which subject-oriented information for decision support applications is derived from transaction data.
This database is used by all decision support applications as a single source of consistent data.
It has its own architecture components for data extraction, data staging, data transformation, data integration, data correction, etc.
A data warehouse can be implemented as a centralized system, but it can also be implemented in a decentralized way.

Characteristics of Data warehousing (1)
Organization: Data are organized by detailed subject and contain only information relevant for decision support.
Consistency: Data in different operational databases may be encoded differently. In the data warehouse they are coded in a consistent manner.

Characteristics of Data warehousing (2)
Time variant: The data are kept for several years so they can be used for trends, forecasting, and comparisons over time.
Nonvolatile: Data in the warehouse are not updated.
Relational: A relational structure is used.
Client/Server: Provides easy access to data.

Channel management and integration (1)
Customers demand multiple access channels to products/services.
Management has to decide which channels to use for which products/services without being constrained by IS/IT restrictions.
Access media: Cellular phone and WAP, Internet, phone, etc.

Channel management and integration (2)
As a consequence, vertical applications and cross-product applications have to be complemented by channel-specific applications, e.g. WWW portal, WAP portal, etc.
Hence, product-oriented integration (along with core data integration) should be complemented by channel-oriented integration.

Representation of channel-specific applications
Channel-specific applications can be represented as ‘horizontal’ applications in the 3-D model.
Channel-specific applications are created by transferring and integrating selected business functions from vertical applications.
How to decouple horizontal and vertical applications?

Operational data stores (1)
[Figure: 3-D model (axes: business function, business unit, business process) showing application clusters A and B, cross-product applications, and vertical applications connected through data staging and an operational data store to channel-specific applications such as the WAP portal and the WWW portal]

Operational data stores (2)
The concept of operational data stores is introduced when real-time access is required.
It is used for short-term decisions involving mission-critical applications rather than for the medium- and long-term decisions associated with the regular data warehouse.
It can also be thought of as a source system for the data warehouse, to avoid duplication of integration functionality.

Operational data stores VS Data warehouse (1)
Focus on providing actual data for reporting: A data warehouse is sufficient.
Focus on applications that have to exchange subject-oriented data in real time: An operational data store should be introduced.

Operational data stores VS Data warehouse (2)
Operational data stores: A ‘local’ closed-loop approach can be supported between vertical and horizontal applications.
Data warehousing: Efficient information supply between transactional applications and decision support applications can be achieved.

Reusing Data warehousing concepts for application integration based on Operational data stores
1) Project justification.
2) Permanent organization.
3) Development methodology.
4) Meta data management.

Project justification
Application integration provides tangible benefits.
As a result, project justification can benefit from data warehousing-related issues such as the division between IT and the business units.

Permanent organization
Data ownership has emerged as a conceptual foundation from which roles and responsibilities, as well as processes for permanent data warehousing, were derived.
[Figure: Data warehousing, application integration, organizational issues]

Development methodology
Missing specifications: Data marts can be used to avoid them.
The data warehouse development phases appear to have high reuse potential for application integration.

Meta data management
Meta data: Data about data, including summaries, indices, software programs about data, etc.
All meta data that are relevant for data warehousing are also relevant for application integration based on operational data stores, and vice versa.

Part 2

Critical view of Data Warehousing

Basic Roles
Utility.
Dependence.
Enabling.

Utility
It is aimed at reducing the costs of processing and communicating information throughout the organization.
This is achieved by aggregating the data and organizing them by subject, so that they contain the information relevant for decision support.

Dependence
The performance of a business process depends upon the information infrastructure, as with the use of an ERP package.
The link between the business strategy and the infrastructure investment is obvious.
Whether to use data warehousing or operational data stores should be decided carefully, depending on the focus.
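
The choice between a data warehouse and an operational data store, as stated on the "Operational data stores VS Data warehouse" slides, can be sketched as a small decision rule. This is only an illustrative sketch: the names `Requirement`, `Store`, and `choose_stores` are hypothetical and do not come from the presentation, and a real decision would weigh more factors than these two flags.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Store(Enum):
    DATA_WAREHOUSE = "data warehouse"
    OPERATIONAL_DATA_STORE = "operational data store"


@dataclass(frozen=True)
class Requirement:
    """Hypothetical description of what an integration project focuses on."""
    name: str
    # Applications must exchange subject-oriented data in real time.
    needs_real_time_exchange: bool
    # Data are needed for reporting, trends, and comparisons over time.
    needs_historical_reporting: bool


def choose_stores(req: Requirement) -> set[Store]:
    """Map the stated focus to the store(s) the slides associate with it."""
    stores: set[Store] = set()
    if req.needs_real_time_exchange:
        stores.add(Store.OPERATIONAL_DATA_STORE)
    if req.needs_historical_reporting:
        stores.add(Store.DATA_WAREHOUSE)
    return stores


portal = Requirement("WWW portal", needs_real_time_exchange=True,
                     needs_historical_reporting=False)
planning = Requirement("corporate planning", needs_real_time_exchange=False,
                       needs_historical_reporting=True)
both = Requirement("CRM", needs_real_time_exchange=True,
                   needs_historical_reporting=True)

for req in (portal, planning, both):
    names = sorted(store.value for store in choose_stores(req))
    print(req.name, "->", names)
```

When both flags are set, the function returns both stores, which corresponds to the coexistence of data warehouse and operational data stores rather than a choice of one over the other.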

Enabling
Enabling infrastructures provide architectures and platforms for new applications.
This yields flexibility.
Benefits: time savings for data suppliers and users, and availability of better information as a foundation for better decisions.
Coexistence of data warehouse and operational data stores.

Strategic alignment
How to link infrastructure to business strategy.
Specify the needs of the corporation.
Example: Data warehouse suitability.
1. Large amounts of data.
2. Data stored in different systems.
3. Necessity for users to conduct extensive analysis.

(Knowledge) Sharing
An infrastructure is usually shared by the members of a community, in the sense that it is the same single object used by all of them.
Users access the data warehouse and take a copy of the data needed for analysis.
This analysis is done using mining tools and leads to knowledge.

Openness
Infrastructures are open in the sense that there are no limits to the number of users, stakeholders, vendors, etc. involved in the network.
In the case of data warehousing, this leads to varying constellations and alliances between the humans (users) that access the data and the non-human tools (data warehouse).

Heterogeneity
Data warehousing constituencies include technological components and humans (socio-technical networks); thus interaction is a crucial factor of success.
Lack of incentive to share data and knowledge can be costly.
The data warehouse as a middleware layer can link decision support applications with operational applications, and integrate independent components (ecologies of infrastructures).

Increasing Returns
Increasing returns: The more a product is produced, sold, or used, the more valuable or profitable it becomes.
The same applies to infrastructure standards.
Data warehousing: Lower cost. Exploitation of warehouse data leads to knowledge. Greater efficiency.

Path dependence
Path dependence means that past events have a large impact on future development.
Form of path dependence: Compatibility.
Operational data stores should not be developed from scratch.

Switching costs and Lock-in
As the community using the same technology or standard grows, switching to a new technology or standard becomes an increasingly large coordination challenge.
How to introduce operational data stores? Coexistence with data warehousing.
Key issue: A strategy to avoid lock-in (evolution strategy).

Evolution strategy
An evolution strategy offers an easy migration path and centers on reducing switching costs, so that users can try the new technology gradually.
Key issue: Linkage between the new technology and the old one.

Actor-Network theory
Infrastructure is a powerful actor in itself, seeking allies and fighting battles in order to survive.
Separating human actors and non-human tools a priori creates difficulties in understanding the implementation of infrastructure.
Well-run infrastructure: Successful alliance between human and non-human actors.
Data warehousing cost: Lack of incentive to share data.

THE END