ERP Data Warehouse
Architectures, Tools & Technologies

by Wipro Technologies
January 2002

Table of Contents

1 Executive Summary
2 Introduction
3 Technical Challenges Associated with ERP Data Warehousing
4 Desired Features of the ERP Data Warehouse
5 Architectural Choices
6 Tools & Technology Available
  6.1 Packaged Solution from ERP Vendors
    6.1.1 SAP Business Information Warehouse
  6.2 Extraction Tools
    6.2.1 ActaWorks from Acta
    6.2.2 DataStage from Ascential
    6.2.3 PowerCenter from Informatica
7 Conclusion
8 Appendix A

Wipro Confidential Page 2 of 37

1 Executive Summary

ERP applications came into existence with the promise of an integrated application environment that would end the uncontrolled growth of stove-pipe IS applications and serve the needs of the whole enterprise. After implementing expensive ERP packages, organizations and product vendors alike realized that although these solutions streamlined operational processes and IS applications, they made it extremely difficult to serve the information needs of management. As a result, organizations had to implement data warehouses for their decision support and business intelligence needs.

Organizations have three options for implementing the data warehouse.

ERP-centric Data Warehouse: The data warehouse is implemented using the ERP vendor's data warehousing package, such as SAP Business Information Warehouse or PeopleSoft Enterprise Warehouse. Because these packages are proprietary, this option is recommended only when more than 80% of the data in the data warehouse comes from the same vendor's OLTP systems. Otherwise the cost of data integration and customization may exceed the benefits of the well-integrated application environment.

Two Independent Data Warehouses: One data warehouse is built with non-ERP source data and the other is built within the ERP environment with ERP source data. This option does not provide a true enterprise or cross-functional view and can result in multiple versions of the truth. Maintaining two environments also adds overhead in cost, manpower and diversity of skills, and creates confusion among business users.

Custom Build Data Warehouse: This is built outside the ERP environment using best-of-breed tools and technologies. It is a highly flexible solution that enables a single version of the truth and can grow incrementally as the organization's information needs grow. It is also highly scalable.
However, it takes slightly longer to implement and requires more development effort. This option is recommended for a cross-functional, high-performance, high-volume, multi-dimensional analytical environment with a large user base.

Detailed advantages and disadvantages of each of these options are provided in section 5.

ETL tools for extracting data from SAP R/3 and loading it into SAP BW:

ActaWorks from Acta: ActaWorks is tightly integrated with SAP R/3 and works seamlessly with both SAP R/3 and BW. It can also extract data from non-SAP data sources. It is becoming popular among BW installations where SAP R/3 is the primary source, and it can extract incremental changes from SAP R/3.

DataStage from Ascential: Ascential's DataStage is also one of the leading ETL tools. SAP is a reseller of DataStage and of the DataStage Load PACK for SAP BW. These tools are integrated into the mySAP Business Intelligence framework.

PowerCenter from Informatica: Informatica PowerCenter is a strong ETL tool. It has separate plug-ins (PowerConnect) for SAP R/3, Siebel, PeopleSoft and others, so it can extract data from SAP R/3, other ERP systems and legacy systems. It could be the better choice when the majority of the data comes from non-SAP legacy sources.

All three products are SAP certified. ActaWorks was the first of them to be well integrated with SAP R/3 and is popular among SAP R/3 users. SAP later became a reseller of DataStage and integrated it into its mySAP BI platform. A detailed comparison of the three ETL tools is provided in Appendix A.

2 Introduction

Operational systems have been streamlined by deploying packaged enterprise resource planning (ERP) applications. These packages replace legacy and homegrown systems that are not well integrated.
Traditionally, ERP packages have automated back-office operations such as finance, human resources and manufacturing. Now there are packages for front-office operations such as sales, marketing and customer service.

However, ERP systems cannot address decision-support requirements, for several reasons:

- ERP applications are designed to process large volumes of simple requests.
  - Larger queries take a long time to process and need more resources.
- ERP databases contain thousands of small tables that eliminate data redundancies.
  - It is easy to find and update a single data item, but querying is difficult.
- ERP databases are very difficult to access, query and navigate.
  - Some ERP systems store data in proprietary formats, making it difficult to access.
  - Finding the right entity within thousands of tables is a formidable barrier.
- An ERP system does not satisfy all the operational requirements of an enterprise. Similarly, not all the modules of an ERP package meet the requirements of an enterprise, so organizations implement part of an ERP package, or multiple ERP packages, which may co-exist with other legacy applications.

Therefore, there is a need to implement a data warehouse that sources data from the ERP, CRM and legacy systems to serve the information needs of business users. This paper outlines the technical issues involved, the desired features, and the architectural options available for implementing the data warehouse in ERP and non-ERP environments.

3 Technical Challenges Associated with ERP Data Warehousing

The following technical issues are involved in extracting data from ERP sources:
- The proprietary nature of ERP systems' programming environments and APIs
- The complex architectures of ERP systems, which embed business logic and processes
- The data schemas of ERP systems, which are complex and typically contain thousands of tables (SAP has about 9,000), often named with abbreviations
- The use of non-standard storage formats
- Change data capture

4 Desired Features of the ERP Data Warehouse

ERP data warehousing requires an ETL infrastructure that can extract and integrate data from multiple diverse platforms such as legacy systems, CRM, sales force automation and external marketing data providers.

Capturing changed data from the ERP and legacy applications is a challenge because of the large volume of transactions, the complex architecture, and the short time window available for extracting data from ERP applications.

Organizations require information and analysis in real time to support important decisions. To achieve this, the ERP data warehouse must extract and transform data from ERP applications in near real time.

Meta data management and the reconciliation of inconsistent meta data are among the biggest problems organizations face in their data warehousing applications.

The ERP data warehouse should support both technical analysts and less technical general business users.

ERP data warehouses are expected to store the global data of an organization. This requires separating reference data, which changes over time, from transactional data, which does not change once recorded. A dimensional model that uses the slowly changing dimensions concept addresses this well.

5 Architectural Choices

Approaches for implementing data warehouses, with their advantages and disadvantages:

ERP Centric Data Warehouse: The data warehouse is built within the ERP environment (the DSS provided by the ERP vendor), and non-ERP source data is also pulled into the DSS system provided by the same ERP vendor.
This option is recommended when the majority of the data warehouse data (more than 80%) is sourced from ERP systems and business content for the required functional areas is available in the DSS provided by the ERP vendor. Otherwise the integration and customization effort can outweigh the benefits of tight integration.

Two Independent Data Warehouses: One data warehouse is built with ERP data and the other is built from non-ERP data sources. Organizations tend to grow into this arrangement naturally because it is technically easier and politically acceptable.

Custom Build Data Warehouse outside ERP environment: The data warehouse is built using best-of-breed tools outside the ERP environment. This option requires data extraction from ERP sources, which could prove costly. However, ETL tools such as ActaWorks, Ascential DataStage and Informatica PowerCenter can now extract data from the ERP application layer, which mitigates the issue to some extent.

The following table elaborates on the advantages and disadvantages of each of the above options:

Option: ERP centric Data Warehouse
Advantages:
- Tight integration of operational and decision support systems
- Easier to implement a closed feedback loop DW
- Industry best practices are made available in the form of business processes and standard reports
Disadvantages:
- Not flexible
- Considerable customization effort; requires 3rd party ETL tools to integrate non-ERP source data
- Integration of non-ERP data (organizational or external) into the ERP environment is complex due to proprietary interfaces and limited business content
- ERP vendors are traditionally strong in OLTP, but not in DSS applications
- Not proven for high-performance, high-volume multidimensional analysis with a large user base
- Not all the functionality may be supported by any given ERP vendor
- Growth to a real-time data warehouse may not be possible

Option: Two Independent Data Warehouses
Advantages:
- Easier to implement technically
- Politically natural solution
- Earlier investments in existing DW initiatives are protected
Disadvantages:
- No enterprise/cross-functional view
- Higher maintenance and sustenance costs
- Prone to inconsistencies across the two data warehouses, leading to two versions of the truth
- Ambiguity among the user community

Option: Custom built Data Warehouse outside ERP environment
Advantages:
- Flexible
- A true enterprise-wide single version of the truth can be attained
- Easier to integrate external data
- Scalability is not an issue
- Open architecture is amenable to real-time data warehouse refresh and closed loop feedback
Disadvantages:
- Data extraction from ERP OLTP systems is complex
- 3rd party vendor tools need to keep up to date with the changing ERP environment
- Longer time to implement

6 Tools & Technology Available

6.1 Packaged Solution from ERP Vendors

6.1.1 SAP Business Information Warehouse

Since SAP announced its Business Information Warehouse in 1998, the product has gone through many transformations. Until version 2.1C, SAP BW was used primarily for operational reporting that was not possible within SAP R/3. It had several limitations in areas such as drill-across, ODS structure and scalability. Version 2.1C (mySAP BI) appears to have addressed these issues and now offers a sound BI platform for SAP R/3 users.

SAP has tied up with Ascential to integrate its ETL tool, DataStage, into the BI platform. With this it has overcome its weakness in bringing non-ERP data into the business warehouse.

On the UI side it still does not have a competitive OLAP tool, although OLAP tools from its partners, such as Business Objects and Cognos, can be used instead. The Business Explorer UI that comes with the business warehouse is Excel-like and does not offer robust OLAP functionality.

Business content is also still limited and does not match competitors' offerings in the packaged applications space, such as those from Epiphany, Broadbase/EPM, DecisionPoint Application, Hyperion, Gentia, NCR, SAS and Alphablox.
6.2 Extraction Tools

6.2.1 ActaWorks from Acta

Acta was the first vendor to bring to market a product specifically tailored to support data warehousing with ERP systems. Today Acta offers the most comprehensive data warehousing and data integration products for use with ERP systems. ActaWorks for SAP is designed to support tight integration with SAP ERP applications.

In addition to providing an intuitive GUI for mapping data from SAP and non-SAP sources to a data warehouse or data mart, ActaWorks extracts data via the SAP R/3 application layer, allowing access to all SAP data and business logic. ActaWorks also features a component that supports real-time updates and change data capture for data warehouses. Acta also offers pre-packaged data marts, or Rapid Marts, for use with ActaWorks to speed warehouse development.

ActaWorks for SAP consists of five key components: ActaWorks Designer, a meta data repository, ActaWorks Server, ActaWorks Integrator for SAP and ActaWorks Administrator.

ActaWorks Designer is a graphical tool for defining the data mappings, transformations and control logic needed to manage a complex multi-step process for populating a data warehouse. Designer lets users define data mappings and transformation rules using a GUI modeled on SQL.

The data mappings and transformation rules specified with Designer are stored in the ActaWorks meta data repository. The repository also stores information describing the schema of the SAP and non-SAP data sources and the target data warehouse schema. To help users identify the right information to extract, ActaLink provides English-language descriptions of both tables and columns.

The hub of the transformation process is ActaWorks Server, which performs complex data transformations and integrates data from non-SAP sources with SAP data. The server is designed to provide high throughput and uses in-memory transformations and parallel pipelining.
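The idea of pipelined, in-memory transformation can be illustrated with a minimal Python sketch. This is not ActaWorks code and says nothing about how the ActaWorks Server is implemented; the stage names, the row layout and the currency rates are all hypothetical. Each stage is a generator, so a row flows through every stage in memory without intermediate staging tables. The sketch shows pipelining only; running stages in parallel would additionally need threads or processes.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def extract(source: Iterable[Row]) -> Iterator[Row]:
    """Yield source rows one at a time (stand-in for a real extractor)."""
    for row in source:
        yield row


def convert_currency(rows: Iterable[Row], rates: Dict[str, float]) -> Iterator[Row]:
    """Add an amount in USD to each row using a hypothetical rate table."""
    for row in rows:
        converted = dict(row)  # copy so the source row is not mutated
        rate = rates[str(row["currency"])]
        converted["amount_usd"] = round(float(row["amount"]) * rate, 2)
        yield converted


def keep_open_orders(rows: Iterable[Row]) -> Iterator[Row]:
    """Pass through only rows whose status is OPEN."""
    for row in rows:
        if row["status"] == "OPEN":
            yield row


def run_pipeline(source: Iterable[Row], rates: Dict[str, float]) -> List[Row]:
    """Chain the stages; each row passes through all stages before the next is read."""
    return list(keep_open_orders(convert_currency(extract(source), rates)))
```

Because the stages are chained lazily, memory use stays proportional to one row per stage rather than to the size of the source, which is the property the pipelining approach is meant to provide.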
To extract data from SAP, the ActaWorks Integrator for SAP automatically generates optimized ABAP/4 code. This removes the need to write and maintain custom ABAP/4 code. The features of the Integrator are:

- Populates the meta data repository with SAP's logical view of the data
- Translates ANSI SQL constructs specified in the Designer into their ABAP/4 (Open SQL) equivalents
- Automatically generates the ABAP/4 code that extracts the data
- Uses the SAP administrative infrastructure by extracting data via SAP's application server layer, thereby providing access to all SAP data, including data stored in pool and cluster tables, and to other SAP business logic
- Automatically extracts hierarchies from SAP

ActaWorks Administrator provides facilities for warehouse administrators to schedule and monitor jobs.

Changed transactions in the source (SAP) can be captured using IDocs (Intermediate Documents). IDocs capture data while a transaction is being processed. This is a very effective means of capturing data from SAP when the underlying tables do not contain date and time stamps. ActaWorks generates ABAP to read staged IDoc data from header and detail records.

ActaWorks supports real-time data transformation, including receiving messages from ERP systems or XML-based e-commerce applications. "Real-time" means that ActaWorks reacts to messages as they are sent, performing predefined operations to respond appropriately. Real-time updates from SAP require the Acta Real-Time component to be installed.

For real-time data extraction, ActaWorks Real-Time uses SAP R/3 Application Link Enabling (ALE) technology and IDocs to capture and process transactions. IDocs can be enriched with other R/3 or non-R/3 data as specified in the real-time data flow design.
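The message-based change capture described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ActaWorks or SAP API: an IDoc is modeled here as a plain dictionary with one header record and a list of detail items, and the field names (doc_number, customer_id, line_no and so on) and the customer lookup table are hypothetical. Each incoming message is flattened into one row per detail item, enriched with non-R/3 data, and applied to the target as an upsert, so no date or time stamps on the source tables are needed.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]


def flatten_idoc(idoc: dict, customer_names: Dict[str, str]) -> List[dict]:
    """Join the header record to each detail item and enrich with a lookup."""
    header = idoc["header"]
    rows = []
    for item in idoc["items"]:
        rows.append({
            "doc_number": header["doc_number"],
            "customer_id": header["customer_id"],
            # Enrichment with data that is not part of the message itself.
            "customer_name": customer_names.get(header["customer_id"], "UNKNOWN"),
            "line_no": item["line_no"],
            "material": item["material"],
            "quantity": item["quantity"],
        })
    return rows


def apply_messages(
    warehouse: Dict[Key, dict],
    messages: List[dict],
    customer_names: Dict[str, str],
) -> Dict[Key, dict]:
    """Apply messages in arrival order; later messages overwrite earlier rows."""
    for idoc in messages:
        for row in flatten_idoc(idoc, customer_names):
            warehouse[(row["doc_number"], row["line_no"])] = row
    return warehouse
```

Keying the target on document number and line number means a repeated message for the same line replaces the earlier version, which is one simple way to make the change feed idempotent.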
6.2.2 DataStage from Ascential

Using DataStage XE, warehouse developers can take data from diverse sources and complex data forms such as legacy data, B2B and web environments, as well as enterprise applications such as SAP and Siebel. They can transform this data and load it into a warehouse, data mart or business intelligence application for analysis. DataStage XE also manages meta data, integrating it with the most commercially popular data modeling and data access tools. Finally, the quality assurance component enables warehouse administrators to audit, monitor and manage the quality of the data as the warehouse expands and evolves.

Specifically, DataStage XE is an integrated set of software components consisting of:

- Quality Manager, for the data quality assurance that is critical for accurate business analysis
- MetaStage, for meta data integration, in order to maintain consistent analytic interpretations and to track changes to the data warehouse
- DataStage, for data collection and integration from diverse sources for complete "snapshots", and for data movement and transformation for system and end-user productivity
- DataStage XE/390, for extracting legacy data while using the power of the mainframe infrastructure

As part of DataStage XE, Quality Manager gives development teams and business users the ability to audit, monitor and certify data quality at key points throughout the data integration lifecycle. They can also identify a wide range of data quality problems and business rule violations that can inhibit data migration efforts, and generate data quality metrics for projecting financial returns. By improving the quality of the data going into DataStage transformations, organizations also improve warehouse performance and the data quality of the resultant target data.
The end result is validated data and information for making smart business decisions, and a reliable, repeatable and accurate process for making sure information maintains its quality over time.

A critical component of DataStage XE is MetaStage, Ascential's solution for meta data management across data warehouse environments. Most data warehouses and marts are created using a wide variety of tools that cannot exchange meta data. As a result, business users are unable to understand and leverage enterprise data because the contextual information, or meta data, they require is unavailable or unintelligible. Based on patented technology, MetaStage offers broad support for sharing meta data between third-party data environments. MetaStage uses MetaBrokers to ensure the complete exchange of all related meta data, regardless of source type.

DataStage is a client/server development tool for building and supporting data migration applications. Ascential Software offers options such as the XML Pack, Enterprise Application Packs and the MQ Series Plug-in. On the server side, DataStage has a transformation engine that enables complex processing while providing ease of use, management control and maximum performance.

The DataStage client is a graphical tool with the following major components: Manager, Designer, Director and Administrator. The Manager supports the import and export of meta data, as well as the central control of shared transformation objects. The Designer visually represents the data transformation process with an intuitive, easy-to-use graphical engine. The Director, as its name implies, supports the scheduling and execution of completed transformations, and the Administrator provides housekeeping and security functions. Data warehousing professionals use the DataStage client to interact with the DataStage Server, the workhorse that processes the transformations and moves data at run time.
Enterprise application (EA) systems provide critical data sources for business analysis. DataStage XE provides full integration with leading enterprise applications including SAP, Siebel and PeopleSoft. The DataStage Extract PACKs for SAP R/3, Siebel and PeopleSoft, and the DataStage Load PACK for SAP BW, enable warehouse developers to integrate this data with the organization's other data sources. The DataStage Extract PACK:

1. Provides extensive transformation capabilities to manipulate SAP R/3 data and load it into a new or existing data warehouse or data mart.
2. Generates ABAP/4, SAP's programming language. Automatic generation of ABAP code shields the developer from the complexity of writing ABAP code manually and, more importantly, reduces development and maintenance costs.
3. Provides access to all SAP R/3 data, including transparent, pool, view and cluster tables, through the DataStage meta data object browser. With over 15,000 SAP tables of known complexity, the meta data object browser enables easy navigation through the info hierarchies before joining multiple R/3 tables, which simplifies the process.
4. Offers two methods of operation to optimize performance and resources. Generated ABAP code can be uploaded to the R/3 system via remote function call; for warehouse developers who do not have direct access to the R/3 system, the R/3 script can be moved manually via FTP and imported by an R/3 administrator. Job scheduling can be controlled either from the DataStage Director or natively from the SAP scheduling services.
5. Performs complex transformations easily with drag-and-drop operations in the DataStage Designer's graphical mapping tool.
6. Uses SAP's RFC library and IDocs, two of the primary data interchange mechanisms for access to SAP R/3, thus conforming to SAP interfacing standards.
7. Captures incremental changes and produces event-triggered updates with SAP's IDoc (Intermediate Document) functionality. DataStage's IDoc extract interface retrieves IDoc meta data and automatically translates the segment fields into DataStage, achieving real-time SAP data integration.

6.2.3 PowerCenter from Informatica

PowerCenter from Informatica is one of the popular and powerful tools in the ETL space. It offers seamless integration with a wide range of data sources, including ERP, mainframe and relational systems as well as e-commerce and legacy applications.

Informatica's PowerConnect for PeopleSoft and PowerConnect for SAP can directly extract and integrate data from SAP R/3 and PeopleSoft applications, as well as other formats. PowerConnect modules are component-based offerings that complement and extend the functionality of Informatica's core data warehouse development platform, PowerCenter.

PowerConnect for SAP provides Informatica PowerMart/PowerCenter users with native, high-speed data extraction from SAP R/3 systems, enabling full access to all SAP R/3 tables and SAP R/3 info hierarchies. PowerConnect for SAP extracts data from SAP using ABAP/4, SAP's proprietary 4GL. Using PowerConnect, users can access all SAP R/3 tables, including transparent, pool and cluster tables. This allows full access to all data residing in SAP R/3's application layer. Once extracted, SAP data is delivered to the PowerCenter server, which transforms the data for delivery to the target data warehouse, data marts or other analytic applications.

PowerConnect for SAP lets you customize the R/3 extraction routines for load processing. You can choose to stage the data in an intermediary file or stream it directly into the PowerCenter server. In addition, when accessing data in R/3, PowerConnect performs only the actual extraction processes on the R/3 system.
Transformation and load processing occur within the PowerCenter helping to minimize the load on the R/3 environment. 7 Conclusion Companies have been struggling for some time now to build data warehouses and data marts that will allow their users to perform better and easier analysis of SAP data. Due to the complexity of the SAP R/3 system and a lack of good data warehousing products specifically designed to handle SAP data, companies were forced to write their own custom extraction programs in ABAP/4. This however is changing and good number vendors, recognizing the opportunity, have introduced ETL products that can assist in extracting and integrating SAP and non-SAP data and moving it into the warehouse. SAP is seriously pursuing its efforts to provide a scalable BI platform by upgrading its Business Information Warehouse. It is enhancing the business content in each of the new versions, but still lacks the capabilities provided by competing packaged solutions. It has also integrated DataStage (an ETL tool) to integrate non-SAP data also into BW platform. Meta group predicts that by 2005, SAP BW can become a dominant player in the packaged data warehouse players catering to enterprise level information needs of SAP R/3 users. It may not achieve the same success among non SAP R/3 users. Wipro Confidential Page 12 of 37 ERP Data Warehouse 8 Appendix A Category Version----> Architecture Criteria Architecutre Scalable and Extensible Technology Informatica PowerCenter 5.0 Hub and Spoke Architecture Wipro Confidential 5.0 Open Client Server Platform facilitate the sharing of Meta Data Highly scalable and extensible Scalable, Flexible technology. Scale up as the Technology. data and load grows. 
Scales up w.r.t the hardware and software Client Platform Windows 2000/NT/98 Server Platforms Acta Works Sun Solaris, AIX, HP-UNIX, Windows NT/2000 Ascential Data Stage XE 5.1 Client Server Architecture Highly scalable Scales up w.r.t the hardware and software Windows 98/NT/2000, Windows 95/NT/2000 OS/2 Windows NT/2000, HP- Windows NT ( Intel and Alpha Platforms ), UNIX Unix, Solaris, AIX AIX, HP-UX, Sun Solaris, COMPAQ Tru64. Data Stage XE 390 works on OS/390 platform. Page 13 of 37 ERP Data Warehouse Which DBMS are supported for extraction and loading For Extraction: DB/2 Oracle, Informix, DB/2 /400,Flat Microsoft SQL Server, Files,IMS,Informix, MS SQL Sybase, DB2 UDB, Server, ODBC-compliant MS Access, Oracle, databases, and flat files Sybase,UDB,VSAM,ODBC,Others Targets: Informix DB/2 /400,MS SQL Server, MS Access,,Oracle, PeopleSoft Enterprise Performance Management(EPM),SAP® Business Information Warehouse (BW),Sybase,UDB,Flat Files,Others Support for ERP Sources Wipro Confidential QSAM: Sequential flat files ISAM: VSAM: KSDS, RSDS, ESDS support GROUPS, multilevel arrays, REDEFINES, and all PICTURE clauses. DB2, Adabas, Oracle OCI ( For releases 7 and 8 ) , Sybase Open Client , Informix CLI , OLE/DB for Microsoft SQL Server 7, ODBC. DataStage XE provides full integration with leading enterprise applications including SAP, Siebel, and PeopleSoft. The DataStage Extract PACKs for SAP R/3, Siebel and PeopleSoft, and the DataStage Load PACK for SAP BW enable warehouse developers to integrate this data with the organization's other data sources Page 14 of 37 ERP Data Warehouse Code Reusability capability within the product Supports development of All the objects in the Mapplets which acts as library object library can be rebetween Mappings and also can useable. An object can make transformations shareable be data flow, workflow, across Mappings. job etc. Parallelism Supports parallelism, one can run multiple mapping session on the same server. 
Wipro Confidential Permits the reuse of existing code through APIs thereby eliminating redundancy and retesting of established business rules Supports Parallelism, if it Automatically distributes is running on a multi independent job flows prcessor computer. It across multiple CPU takes full advantage of processes.This feature the Hardware ensures the best use of Architecture. available resources and speeds up overall processing time for the application. Page 15 of 37 ERP Data Warehouse Code Generator PowerCenter does not generate code,all the mappings developed will be inform of GUI interface. Does generate Code, but the Data Flow or Job Flow defined can be converted to code to check with Acta Support. Only Datastage XE/390 version automatically generates and optimizes native COBOL code and JCL scripts that run on the OS/390 mainframe. PowerCenter is based on Hub & Transformation is Transformation is engine Data engine based and relies based - column-toTransformation Spoke architecture and has column mappings Method (Engineinbuilt Transformation engine. on the server. Based ?) Wipro Confidential Page 16 of 37 ERP Data Warehouse Building & Managing Aggregates Support for various data types Data Quality Check functionality or feature Wipro Confidential Aggregation can be built using Aggrigation thru Read to Enhances performance the built in transformation use Transformation and reduces I/O with its provided. function built-in sorting and aggregation capabilities. The Sort and Aggregation stages of DataStage work directly on rows as they pass through the engine rather than depending on SQL and intermediate tables. Supports most of the industry Supports most of the It supports most of the standard data types. This also industry standard data industry standard data depends on the kind of source types types. It supports XML system being used. also. 
Through Quality Manager it is possible to audit, monitor, and certify data quality at key points throughout the data integration lifecycle. Page 17 of 37 ERP Data Warehouse Debugging and Does not a separate debugging Error Correction can be Helps developers verify Tool. The workaround is by done for each job their code with a built-in logging setting the "verbose" property workflow, data flow and debugger thereby features on each transformation. By this even object. informatica will create log files in the server, which can be used for further analysis. Exception Handling Wipro Confidential increasing application reliability as well as reducing the amount of time developers spend fixing errors and bugs. Supports debugging on row-by-row basis using break points. DataStage immediately detects and corrects errors in logic or unexpected legacy data values using this. Highly useful for complex transformation, date conversions etc. Throws out the error records or Support exception Supports exception rejected records into a log file handling no extra effort handling. required. Page 18 of 37 ERP Data Warehouse How Tool Provides information about exception Through log files stored in the server Restarting an Support restarting of the aborted ETL mappings process Wipro Confidential Through Log files Developers can closely observe the running jobs in the Monitor Window to provide run-time feedback on userselected intervals.The powerful process viewer estimates rows-persecond and allows developers to pinpoint possible bottle-necks and/or points of failure. Using the Director, the developer can browse detailed log records as each step of a job completes. These date and time stamped log records include notes reported by the DataStage Server as well as messages returned by the operating environment or source and target database systems. DataStage highlights log records with colored icons (green for informational, yellow are warnings, red for fatal)for easy identification. 
Restarting an aborted ETL process (continued)
  ActaWorks: Restart is possible. Can restart from the point of failure.
  DataStage: Restart is possible. Can restart from the point of failure.

Memory (Minimum/Recommended) requirement at client machine
  PowerCenter: 128 MB / 256 MB
  ActaWorks: 64 MB / 128 MB
  DataStage: 64 MB

Memory (Minimum/Recommended) requirement at server machine
  PowerCenter: Depends on the kind of application running, 128 MB / 256 MB
  ActaWorks: 64 MB / 128 MB
  DataStage: Minimum 256 MB

Repository Backup and Recovery
  PowerCenter: PowerCenter comes with good features for backup and recovery of the repository. This can be done through Repository Manager.
  ActaWorks: Repository backup can be taken by using Repository Manager.
  DataStage: Supports a distributed repository. Remote sites can subscribe to a set of meta data objects within the warehouse application. These sites are notified via email when meta data changes occur within their subscription. DataStage XE offers version control for components such as table definitions, transformation rules, and source/target column mappings within a 2-part numbering scheme.

Meta data support

Metadata Capture
  Automatically captures the meta data and stores it in the repository.
  Stores all the meta data in the repository.
  DataStage: Captures the meta data automatically using a component called 'MetaStage'. It also offers broad support for sharing meta data between third-party data environments using MetaBrokers. It maintains a complete catalog of the organization's meta data, including physical, technical, business and process meta data.

Business View meta data
  PowerCenter: Not available. Business meta data needs to be documented while building the mappings. This data will be stored in the meta data repository. Using SQL commands it is possible to query the meta data.
  ActaWorks: Only technical meta data is stored.
  DataStage: DataStage XE provides warehouse developers with a central hub that manages meta data at the tool-integration level. Remote sites can subscribe to a set of meta data objects within the warehouse application. These sites are notified via email when meta data changes occur within their subscription.

Meta data security
  PowerCenter: Since meta data is stored in the repository of the product it is very well protected.
  ActaWorks: Provides meta data security through Repository Manager; needs a user ID and password to log in.
  DataStage: User-level security provided by DataStage Administrator.

Meta data is captured and stored in the PowerCenter repository.

Web Integration support
  PowerCenter: Does not have any web integration.
  ActaWorks: By using Access Server for Web administration. Using this it is possible to control the whole loading process from a remote machine.
  DataStage: Yes, supports Web integration using the Plugin API.

Versioning Support
  PowerCenter: Supports versioning with the help of the repository and allows one to define the baseline.
  ActaWorks: Supports versioning through the central repository.
  DataStage: DataStage XE offers version control, which saves the history of all the ETL development. It preserves application components such as table definitions, transformation rules, and source/target column mappings within a 2-part numbering scheme. Developers can review older rules and optionally restore entire releases that can then be moved to distributed locations.

Metadata repository's compliance to one of the industry meta data standards
  PowerCenter: Sharable through the Metadata Exchange (MX2) API.
  ActaWorks: Does not exchange the meta data with other applications.
  DataStage: Has its own version of the Common Meta Model. The meta data can be shared using the MetaBroker.

Meta data views using query tools
  PowerCenter: PowerCenter comes with a meta data reporting tool which helps users access the meta data stored in the repository. One can view meta data using query tools like SQL etc.
  ActaWorks: The central repository provides a meta data viewing facility, and the repository tables can also be queried using SQL statements.
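Querying repository meta data with SQL, as mentioned for PowerCenter and ActaWorks above, might look like the sketch below. The table `mapping_columns` and its columns are purely hypothetical stand-ins created in an in-memory SQLite database; the real repository schemas of these products are not described in this document and will differ.

```python
import sqlite3


def build_demo_repository():
    """Create an in-memory stand-in for a meta data repository (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mapping_columns ("
        "mapping_name TEXT, source_table TEXT, source_column TEXT, "
        "target_table TEXT, target_column TEXT)"
    )
    conn.executemany(
        "INSERT INTO mapping_columns VALUES (?, ?, ?, ?, ?)",
        [
            ("m_sales", "ORDERS", "AMOUNT", "FACT_SALES", "SALES_AMT"),
            ("m_sales", "ORDERS", "ORDER_DATE", "FACT_SALES", "DATE_KEY"),
            ("m_cust", "CUSTOMERS", "NAME", "DIM_CUSTOMER", "CUST_NAME"),
        ],
    )
    return conn


def targets_fed_by(conn, source_table):
    """Return the target tables that receive data from the given source table."""
    rows = conn.execute(
        "SELECT DISTINCT target_table FROM mapping_columns "
        "WHERE source_table = ? ORDER BY target_table",
        (source_table,),
    ).fetchall()
    return [row[0] for row in rows]
```

For example, `targets_fed_by(build_demo_repository(), "ORDERS")` lists the targets that depend on the ORDERS source, which is the kind of question a meta data query is typically used to answer.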
Meta data views using query tools (continued)
  DataStage: No tool currently available. The entire history of the data can be derived and viewed using Data Lineage.

Ease of setup

Easy installation procedure
  PowerCenter: The installation process depends on the platform on which it is being installed. Sometimes it can run into rough weather due to various reasons, but in most cases it is very easy to install.
  ActaWorks: Easy to install; only two components need to be installed.
  DataStage: An industry standard installation script provided for each of the DataStage "Packages" helps in easier installation and automated configuration.

Ability to generate data mart schema similar to source database
  PowerCenter: It is possible to generate the target data mart schema similar to the source database.
  ActaWorks: Possible to generate the data mart schema.
  DataStage: Possible to create the data mart schema similar to the source.

Support for designing data mart
  PowerCenter: Supports the Star Schema data model for target data mart design.
  ActaWorks: E-Caches provides ready-to-use data mart suites with all the ETL facility defined.
  DataStage: Does not support this directly. But with the data integration capabilities of DataStage/DataStage 390 together with DB2 Warehouse Manager's data warehouse generation and management capabilities it is possible to design a data mart/warehouse.

Importing data models from modeling tools
  PowerCenter: It is possible to import the data models from different modelling tools by using a plug-in called MX.
  ActaWorks: Does not support.
  DataStage: The MetaBroker for a particular tool represents the meta data just as it is expressed in the tool's schema. It accomplishes the exchange of meta data between tools by automatically decomposing the meta data concepts of one tool into their atomic elements via the MetaHub and recomposing those elements to represent the meta data concepts from the perspective of the receiving tool. In this way all meta data and their relationships in the integrated suite are captured and retained for use by any of the tools. Summarizing, MetaBrokers facilitate meta data exchange between DataStage and popular data modeling and business intelligence tools.

Transformations

Filter
  PowerCenter: Supports Filter transformation.
  ActaWorks: Supports various types of transformations: Filtering, Merging, Key Generation, Table Comparison etc.
  DataStage: Supports Filter transformation.

Format conversion
  PowerCenter: Supports format conversion and data type conversion.
  ActaWorks: Format conversion is possible.
  DataStage: Supports format conversion such as date & time display, numeric representation, national currency rules, collating sequences etc.

Lookup
  PowerCenter: Supports Lookup transformation very well.
  ActaWorks: Lookup functionality is possible, with three types of functionality: pre-cached, cache-on-demand, no-cache.
  DataStage: Supports lookup procedures and hashed lookup tables to increase performance.

Scope for user defined fields
  PowerCenter: One can define user-defined variables, but there is no such thing as scope.
  ActaWorks: Possible to define variables with global or local scope, and parameter values can also be passed between various projects.
  DataStage: One can define user-defined variables.

Joins
  Supports most of the join types.
  Supports all types of joins.
  Supports most of the join types using join transformation.

Support for external procedures
  PowerCenter: Supports external procedures; it is possible to call stored procedures through mappings.
  ActaWorks: Possible to call COM objects, DLL functions etc.
  DataStage: Built into DataStage are several features exclusively designed to support the packaging and deployment of completed data migration applications.

Management

Scheduling feature
  PowerCenter: Supports a good scheduling feature, and it is possible to schedule the job/session using Server Manager. With limited workflow mechanism.
  ActaWorks: Good scheduler within the tool, with workflow mechanism and calendar.

Defining calendar and using it for ad-hoc scheduling
Defining calendar and using it for ad-hoc scheduling (continued)
  Yes, it is possible in a very sophisticated manner.
  DataStage: Using the DataStage Director it is possible to schedule the jobs.

Scheduling feature (continued)
  DataStage: Good graphical scheduling and monitoring feature provided by the DataStage component called Data Director. It can also generate CRON scripts to schedule from Unix. With the DataStage Job Control API and Command Language interface provided, any remote C program or command shell can be used to initiate jobs, query their results or program a more complex job execution sequence.

Performance monitoring of ETL process
  Provides more control to the user through more attributes, for better monitoring.
  DataStage: No special performance monitor tool, but developers can closely observe the running jobs in the Monitor Window, which provides run-time feedback at user-selected intervals. The powerful process viewer estimates rows per second and allows developers to pinpoint possible bottlenecks and/or points of failure.

Performance Options
  ActaWorks: It's a strong point of Acta as it gives more parameters for performance improvement.
  DataStage: Can provide very high performance. Can enhance performance using in-memory hash tables, reducing I/O operations with its built-in sorting and aggregation capabilities. DataStage allows bypassing ODBC and "talking" natively to the source and target structures using direct calls, thereby increasing performance.

Specifying the atomicity of the updates
  PowerCenter: It is possible to load a large set of records to the target database.
  ActaWorks: Possible to specify the atomicity of the updates.
  DataStage: Does not support atomicity of updates.

Security - Data Encryption
  PowerCenter: Has good security features, managed through Repository Manager. No encryption facility.
  ActaWorks: Provides good security through Repository Manager. Does not provide an encryption facility.
  DataStage: Provides security features using Administrator.

Security and Access Control using LDAP
  Not available.
  No option to provide an LDAP interface.
  Not available.

Adaptability

Impact analysis capability
  PowerCenter: It is possible to find out the impact of a change which needs to be done.
  ActaWorks: Provides impact analysis capability.
  DataStage: Good impact analysis capabilities provided by the MetaStage Hub across the integrated environment. It gives the entire relationship associated with an object.

SCD
  PowerCenter: Requires programmatic design to update the SCD.
  ActaWorks: Can be handled using filter and lookup transforms.
  DataStage: Requires programmatic design to update the SCD.

Support for growth

Version/configuration management
  PowerCenter: Supports versioning and configuration management.
  ActaWorks: Provides a good interface to control the versions.
  DataStage: Provides version control through a distributed repository. (The repository can exist on either source or target.)

Ability to handle various source types from flat files to major RDBMS
  PowerCenter: Supports flat files, Oracle, SQL Server, DB2, and other ODBC-compliant RDBMS.
  ActaWorks: Only Oracle 8.x, Informix, SQL Server and DB2. Also provides SAP R/3 connectivity without any plug-ins.
  DataStage: Supports heterogeneous sources like Oracle, Informix, SQL Server, DB2, flat files, XML, and ERP sources like Oracle Apps, SAP R/3, PeopleSoft etc.

Support for External loader
  PowerCenter: One can call an external procedure in the mapping using external transformation.
  ActaWorks: Yes.

Incremental upload
  PowerCenter: This needs to be handled in mappings manually.
  ActaWorks: Yes.
  DataStage: Supports incremental load. Changed Data Capture captures changes to the operational data and produces Delta Store files. DataStage XE uses these files to update the data warehouse. From a workflow perspective, the warehouse developer defines a Delta Data Store file as an input table within one of the DataStage XE products on a Windows 95/NT platform.
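The incremental load idea described above, where captured changes are applied to the warehouse instead of reloading everything, can be sketched as follows. This is an illustration only: the change-record layout (operation, key, row) and the function name are assumptions made for the example and do not represent the Delta Store file format or any vendor API.

```python
def apply_delta(target, delta):
    """Apply change records, in order, to a target table held as a dict keyed by primary key.

    Each change record is a hypothetical (operation, key, row) tuple.
    """
    for operation, key, row in delta:
        if operation in ("insert", "update"):
            target[key] = row  # upsert the latest image of the row
        elif operation == "delete":
            target.pop(key, None)  # ignore deletes for keys that are already gone
        else:
            raise ValueError(f"unknown operation: {operation}")
    return target


# Warehouse table as loaded by the previous run.
warehouse = {1: {"name": "Acme", "city": "Pune"}, 2: {"name": "Globex", "city": "Delhi"}}

# Changes captured from the operational system since that run.
changes = [
    ("update", 1, {"name": "Acme", "city": "Mumbai"}),
    ("insert", 3, {"name": "Initech", "city": "Chennai"}),
    ("delete", 2, None),
]

apply_delta(warehouse, changes)
```

Only three change records are processed here, regardless of how large the warehouse table is, which is the main reason incremental loading is preferred over full reloads for large sources.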
Support for External loader (continued)
  DataStage: DataStage supports a wide variety of such bulk load utilities, either by directly calling a vendor's bulk load API or by generating the control and matching data file for batch input processing. DataStage developers simply connect a Bulk Load Stage icon to their jobs and then fill in the performance settings that are appropriate for their particular environment.

Intermediate file generation during loading
  PowerCenter: Only generates a temp file when doing sorting or loading.
  ActaWorks: Does not generate an intermediate file during loading.
  DataStage: Does not require intermediate files or secondary storage locations to perform aggregation or intermediate sorting during the loading process.

Event based loading
  Supports event based loading.
  Does not support a "true" workflow mechanism. This can be done using external schedulers or workflow tools like AppWorks or NT Scheduling, or using Mainframe OPC Scheduling tools.
  Yes it is possible for do

Support for wide range of databases for storing (target) information
  PowerCenter: Supports Oracle, Informix, SQL Server, DB2 etc.
  ActaWorks: Only Oracle 8.x, Informix, SQL Server and DB2.
  DataStage: Sybase Adaptive Server, Sybase Adaptive Server IQ, Microsoft SQL Server 7 via OLE/DB, Microsoft SQL Server 6.5 via BCP, Informix Redbrick, Teradata, UDB. Bulk loaders: Oracle, Informix ADO/XPO High Performance. Ascential databases: UniVerse, Unidata. Also XML, e-mail systems and Web logs, ERP data and MQSeries messages.

Support for multi-user development environment
  PowerCenter: Supports a multi-user development environment.
  ActaWorks: Supports a multi-user development environment.
  DataStage: Supports a multi-user client-server development environment.

Advanced Data Transformation

Re-usability
  PowerCenter: Supports re-usability of the code by making transformations reusable.
  ActaWorks: Provides various reusable objects like jobs, workflows, dataflows etc.
  DataStage: Code reusability is supported. Ascential's Quality Manager provides a framework for developing a self-contained and reusable Project which consists of business rules, analysis results, measurements, history and reports about a particular source or target environment.

Support for built in functions
  PowerCenter: Supports built-in transformations like aggregator, filter etc.
  ActaWorks: Supports built-in functions.
  DataStage: Pre-built functions and routines are available.

Handling duplicate records
  PowerCenter: Does not handle duplicate rows. To be handled programmatically.
  ActaWorks: Possible to handle duplicate records.
  DataStage: Does not handle duplicate rows. To be handled programmatically.

Lookup cache
  PowerCenter: Supports caching of lookup tables.
  ActaWorks: Possible to define a lookup cache through lookup transformations.
  DataStage: Supports lookup cache.

Consistency and Global Meta data re-use
  PowerCenter: Using the PowerCenter and PowerMart model it is possible to handle global meta data.
  ActaWorks: Supports global meta data.
  DataStage: MetaBrokers enable the sharing of meta data among all of the tools in the warehouse environment. With MetaBrokers, tools can share meta data without having to change their internal meta schema to conform to a common model.

Compatibility with third party tools

Compatibility of ETL Tools with EAI tools
  PowerCenter: Currently PowerCenter supports the following EAI vendors as source/target for the data: IBM MQ Series, TIBCO, Vitria and webMethods.
  ActaWorks: Supports the EAI tool TIBCO as an input.
  DataStage: Only IBM MQ Series is supported.

Licensing & Pricing

Server Licensing
  PowerCenter: Licensing includes the following for the Basic Version:
    · No ability to add on PowerMarts
    · No Global Repository
    · No centralized monitoring
    · 1 Server Engine*
    · 2 Relational Database Source Types
    · 2 Target Instances
    · Unlimited Flat File Sourcing
    · Unlimited Developers
    Single-CPU Unix version costs: US$ 140 K
    Windows NT/2000 version: US$ 95 K
  ActaWorks: Provides evaluation and permanent licenses, which support a multi-user environment and SAP R/3 connectivity.
  DataStage: Information not available.

  Information not available.

Client Licensing
  PowerCenter: There is no separate licensing for the client. It comes along with the server.
  ActaWorks: There is no separate license required for the client.
  DataStage: Information not available.

ODC Licensing
  No transfers are allowed from the client-owned software to Wipro. A separate license has to be procured. A lab license may suffice, at half the cost of the production license.

Vendor Information

2 consecutive years of profitability
  PowerCenter: Informatica was recently named the 11th fastest-growing technology company in Silicon Valley by Deloitte & Touche. The ranking resulted from the company's 10,491 percent revenue growth between 1995 and 1999.
  ActaWorks: Acta continues to see strong growth in data integration, with second quarter revenue growth results up 110%.

Significant third party partner support
  PowerCenter: PowerCenter works with most of the software, database and hardware vendors. Built mostly on open systems. Products like PowerConnect for DB2 have been brought out by Informatica and are supported.
  DataStage: SAP is a reseller of Ascential's DataStage and DataStage Load PACK for SAP BW, with the sole target being SAP BW.

Global presence and support
  PowerCenter: Has a global presence and has support in most of the continents.
  DataStage: Ascential Software Corporation is the leading provider of Information Asset Management solutions to the Global 2000.

Number of Customers
  More than 1800 as of Aug '01.
  Around 1300 as of Oct 2001.
  More than 200 customers as of Oct 2001.

Company financial info readily available
  All the information regarding the health of the company has been reported on its website.
  DataStage: Revenue for Ascential Software's DataStage®, Media360™ and related product and service offerings was $27.0 million in the third quarter, an increase of 14% from $23.6 million in the third quarter of 2000. Revenue for these offerings for the nine months ended September 30, 2001 was $93.9 million, an increase of 47% over the $63.8 million in the first nine months of 2000.

Company focus on ETL segment for the future
  PowerCenter: Informatica came to the BI market with the ETL product and has established itself as a major player in the market. This product will continue to be the flagship product despite the change in its positioning in the BI market.
  ActaWorks: Acta is well positioned to drive the "data integration market" and is coming up as a major player.
  DataStage: Adds significant meta data management services to the entire data warehouse, including ETL. Intends to offer the capability for heterogeneous cross-tool analysis and query capabilities. Exploitation of XML integration to enhance e-business communication. Delivers key MetaBroker development capabilities for its customers and partners.
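The year-over-year growth percentages quoted for Ascential's revenue can be reproduced from the dollar figures given in the text. The snippet below is a simple arithmetic check; the function name is made up for the example and the inputs are the figures quoted above, in millions of US dollars.

```python
def growth_percent(current, previous):
    """Percentage increase of `current` over `previous`."""
    return (current - previous) / previous * 100


# Third quarter 2001 versus third quarter 2000.
third_quarter_growth = growth_percent(27.0, 23.6)

# Nine months ended September 30, 2001 versus the first nine months of 2000.
nine_month_growth = growth_percent(93.9, 63.8)
```

Rounded to whole percentages, both values agree with the 14% and 47% increases stated in the revenue summary.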