ERP Data Warehouse
Architectures, Tools & technologies
by
Wipro Technologies
January 2002
Table of Contents
1 Executive Summary
2 Introduction
3 Technical Challenges Associated with ERP Data warehousing
4 Desired features of the ERP Data Warehouse
5 Architectural Choices
6 Tools & Technology Available
6.1 Packaged Solution from ERP vendors
6.1.1 SAP Business Information Warehouse
6.2 Extraction Tools
6.2.1 ActaWorks from Acta
6.2.2 DataStage from Ascential
6.2.3 PowerCenter from Informatica
7 Conclusion
8 Appendix A
Wipro Confidential
1 Executive Summary
ERP applications have come into existence with a great promise of providing an
integrated applications environment that addresses all the issues surrounding
uncontrolled growth of stove pipe IS applications and serving full enterprise needs.
After implementing expensive ERP packages, organizations as well as product vendors
realized that although these solutions streamlined operational processes and IS
applications, it was extremely difficult to serve the information needs of management. As
a result organizations had to implement data warehouses for their decision support and
business intelligence needs.
Organizations have three options for implementing the data warehouse.
ERP-centric Data Warehouse: The data warehouse is implemented using the ERP vendor’s
data warehousing package, such as SAP Business Information Warehouse or
PeopleSoft Enterprise Warehouse.
Because these packages are proprietary, this option is recommended only when
more than 80% of the data in the data warehouse comes from the same vendor’s OLTP
systems. Otherwise, data integration and customization costs may exceed the
benefits of the well-integrated application environment.
Two Independent Data Warehouses: One data warehouse is built with non-ERP
source data and the other is built within the ERP environment with ERP source data.
This option does not provide a true enterprise or cross-functional view and can result in
multiple versions of the truth. It also carries the burden of maintaining two
environments, which adds overhead in cost, manpower and the range of skills required,
and it can confuse business users.
Custom-Built Data Warehouse: This is built outside the ERP environment using
best-of-breed tools and technologies.
This is a highly flexible and scalable solution that enables a single version of the truth
and can grow incrementally as organizational information needs grow. However, it
takes somewhat longer to implement and requires more development effort. This option is
recommended for cross-functional, high-performance, high-volume, multi-dimensional
analytical environments with a large user base.
Detailed advantages and disadvantages of each of these options are provided in
section 5.
ETL tools for extraction of data from SAP R/3 and loading into SAP BW:
ActaWorks from Acta:
ActaWorks is tightly integrated with SAP R/3 and works seamlessly with SAP R/3 as well
as BIW. It can also extract data from non-SAP R/3 data sources. It is becoming
popular among BIW installations where SAP R/3 is the primary source. It has
features to extract incremental changes from SAP R/3.
DataStage from Ascential:
Ascential’s DataStage is also one of the leading ETL tools. SAP is a reseller of
DataStage and of the DataStage Load PACK for SAP BW. These tools are integrated into the
mySAP business intelligence framework.
PowerCenter from Informatica:
Informatica PowerCenter is a strong ETL tool. It has separate plug-ins (PowerConnect)
for SAP R/3, Siebel, PeopleSoft and others. Hence, it can extract data from SAP R/3,
other ERP systems and legacy systems. It could be a better choice when the majority of the
data comes from non-SAP legacy sources.
All three products are SAP certified. ActaWorks was the first product
developed to be well integrated with SAP R/3, and it is popular among SAP R/3 users. SAP
later became a reseller of DataStage and integrated it into its mySAP BI
platform.
A detailed comparison of these three ETL tools is provided in Appendix A.
2 Introduction
Operational systems have been streamlined by deploying packaged enterprise resource
planning (ERP) applications. These packages replace legacy and homegrown systems
that are not well integrated. Traditionally, ERP packages have automated back-office
operations, such as finance, human resources, and manufacturing. Now there are
packages for front-office operations, such as sales, marketing, and customer service.
However, ERP systems cannot address decision-support requirements for several
reasons:
• ERP applications are designed to process large volumes of simple requests
  - Larger queries take a long time to process and need more resources
• ERP databases contain thousands of small tables that eliminate data redundancies
  - It is easy to find and update a single data item, but querying is difficult
• ERP databases are very difficult to access, query, and navigate
  - Some ERP systems store data in proprietary formats, making it difficult to access
  - Finding the right entity within thousands of tables is a formidable barrier
• An ERP system does not satisfy all the operational requirements of an enterprise.
  Similarly, not all the modules of an ERP package meet the requirements of an
  enterprise, so enterprises implement only part of an ERP package, or multiple
  ERP packages that may co-exist with other legacy applications
Therefore, there is a need to implement a data warehouse sourcing the data from the
ERP, CRM and legacy systems to serve the information needs of business users.
This paper outlines the technical issues involved, the desired features and the architectural
options available for implementing the data warehouse in ERP and non-ERP
environments.
3 Technical Challenges Associated with ERP Data warehousing
The following technical issues are involved in extracting data from ERP sources:
• The proprietary nature of ERP systems’ programming environments and APIs
• The complex architectures of ERP systems, which embed business logic and
  processes
• The data schemas of ERP systems, which are complex and typically contain
  thousands of tables (SAP has about 9,000), often described with abbreviations
• The use of non-standard storage formats
• Change data capture
4 Desired features of the ERP Data Warehouse
• ERP data warehousing requires an ETL infrastructure that enables the
  extraction and integration of data from multiple diverse platforms such as legacy,
  CRM and sales force automation systems and external marketing data providers.
• Capturing changed data from ERP and legacy applications is
  a challenge because of the large volume of transactions, the complex architecture and the
  short time window available for extracting data from ERP applications.
• Organizations require information and analysis in real time to support important
  decisions. To achieve this, the ERP data warehouse must extract and transform
  data from ERP applications in a near real-time manner.
• Meta data management and the reconciliation of inconsistent Meta data are among the biggest
  problems organizations face with their data warehousing applications.
• The ERP data warehouse should support both technical analysts and less
  technical general business users.
• ERP data warehouses are expected to store the global data of an organization. This
  requires separating reference data, which changes over time, from transactional
  data, which is constant. A dimensional model with slowly changing dimensions
  can address this well.
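As an illustration of the slowly changing dimension idea, the following Python sketch shows one common variant (often called Type 2), in which a changed reference record closes the old version and adds a new one so that history is preserved. This is a minimal sketch under stated assumptions, not a method prescribed by this paper: the function name `apply_scd2`, the row layout and the `OPEN_END` marker are all hypothetical.

```python
from datetime import date

# Conventional "still current" end date; the value itself is an assumption.
OPEN_END = date(9999, 12, 31)


def apply_scd2(dimension, key, attributes, effective):
    """Apply one reference-data change as a Type 2 dimension update.

    dimension is a list of dict rows holding the business key, the tracked
    attributes, and valid_from / valid_to dates.
    """
    current = next(
        (row for row in dimension
         if row["key"] == key and row["valid_to"] == OPEN_END),
        None,
    )
    if current is not None:
        unchanged = all(current.get(k) == v for k, v in attributes.items())
        if unchanged:
            return dimension  # no change, so history stays as it is
        current["valid_to"] = effective  # close the old version
    dimension.append(
        {"key": key, **attributes, "valid_from": effective, "valid_to": OPEN_END}
    )
    return dimension


# Hypothetical customer whose region changes during 2001.
customers = []
apply_scd2(customers, "C100", {"region": "EMEA"}, date(2001, 1, 1))
apply_scd2(customers, "C100", {"region": "APAC"}, date(2001, 9, 1))
```

Transactions dated before the change can then be joined to the older row and later transactions to the newer one, which is what keeps reference data that changes over time separate from constant transactional data.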
5 Architectural Choices
Approaches for Implementing Data Warehouses with advantages and disadvantages:
ERP-Centric Data Warehouse: The data warehouse is built within the ERP environment
(the DSS provided by the ERP vendor), with non-ERP source data also pulled into the DSS
provided by the same ERP vendor.
This option is recommended when the majority of the data warehouse data (more than 80%)
is sourced from ERP systems and business content for the required functional areas is
available in the DSS provided by the ERP vendor. Otherwise, the integration and customization
effort can outweigh the benefits of tight integration.
Two Independent Data Warehouses: One data warehouse is built with ERP data and
the other is built from non-ERP data sources. This arrangement tends to arise naturally
because it is technically easier and politically convenient.
Custom-Built Data Warehouse outside the ERP environment: The data warehouse is
built using best-of-breed tools outside the ERP environment. This option requires
data extraction from ERP sources, which could prove costly. However, with the advent of ETL
tools such as ActaWorks, Ascential DataStage and Informatica PowerCenter, which can extract
data from the ERP application layer, the issue is mitigated to some extent.
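The 80% guideline above can be written down as a simple rule of thumb. The Python sketch below is only an illustration: the function name `recommend_architecture` and its parameters are hypothetical, and falling back to the custom-built option whenever the ERP-centric conditions are not met is an assumption made for this sketch rather than a rule stated in this paper.

```python
ERP_CENTRIC = "ERP-centric data warehouse"
CUSTOM_BUILT = "Custom-built data warehouse outside the ERP environment"


def recommend_architecture(erp_share, vendor_content_available):
    """Suggest an option from the share of warehouse data sourced from ERP.

    erp_share is a fraction between 0 and 1; vendor_content_available says
    whether the vendor's DSS ships business content for the required areas.
    """
    if not 0.0 <= erp_share <= 1.0:
        raise ValueError("erp_share must be a fraction between 0 and 1")
    if erp_share > 0.80 and vendor_content_available:
        return ERP_CENTRIC
    # Assumption: otherwise prefer the custom-built option.
    return CUSTOM_BUILT


suggestion = recommend_architecture(0.9, True)
```

In practice the decision also depends on factors such as existing investments and time to implement, which a two-input rule like this does not capture.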
The following table elaborates on the advantages and disadvantages of each of the above
options:
Option: ERP-centric Data Warehouse
Advantages:
• Tight integration of operational and decision support systems
• Easier to implement a closed feedback loop DW
• Industry best practices are made available in the form of business processes and
  standard reports
Disadvantages:
• Not flexible
• Considerable customization effort; requires 3rd-party ETL tools to integrate non-ERP
  source data
• Integration of non-ERP data (organizational or external) into the ERP environment is
  complex due to proprietary interfaces and limited business content
• ERP vendors are traditionally strong in OLTP, but not in DSS applications
• Not proven for high-performance, high-volume multidimensional analysis with a large
  user base
• Not all the functionality may be supported by any given ERP vendor
• Growth to a real-time Data Warehouse may not be possible
Option: Two Independent Data Warehouses
Advantages:
• Easier to implement technically
• Politically natural solution
• Earlier investments in existing DW initiatives are protected
Disadvantages:
• No enterprise/cross-functional view
• Higher maintenance and sustenance costs
• Prone to inconsistencies across the two data warehouses, leading to two versions of
  the truth
• Ambiguity among the user community
Option: Custom-built Data Warehouse outside the ERP environment
Advantages:
• Flexible
• A true enterprise-wide single version of the truth can be attained
• Easier to integrate external data
• Scalability is not an issue
• Open architecture is amenable to real-time Data Warehouse refresh and closed-loop
  feedback
Disadvantages:
• Data extraction from ERP OLTP systems is complex
• 3rd-party vendor tools need to keep up to date with the changing ERP environment
• Longer time to implement
6 Tools & Technology Available
6.1 Packaged Solution from ERP vendors
6.1.1 SAP Business Information Warehouse
Since SAP announced its Business Information Warehouse in 1998, it has gone through
many transformations. Until version 2.1C, SAP BW was primarily used for
operational reporting that was not possible within SAP R/3. It had several limitations
in areas such as drill-across, ODS structure and scalability. Version 2.1C (mySAP BI) seems
to have addressed these issues, and it now offers a sound BI platform for SAP R/3 users.
SAP has tied up with Ascential to integrate its ETL tool DataStage as part of the BI
platform. With this it has overcome the weakness of transporting non-ERP data into
its business warehouse.
On the UI end it still does not have a competitive OLAP tool, though its partners’ OLAP
tools, such as Business Objects and Cognos, can be used instead. The Business Explorer
UI that comes with the business warehouse is Excel-like and does not offer robust OLAP
functionality.
Business content is also still limited and does not match its competitors’ offerings in
the packaged applications space, such as those from Epiphany, Broadbase/EPM,
DecisionPoint Application, Hyperion, Gentia, NCR, SAS and Alphablox.
6.2 Extraction Tools
6.2.1 ActaWorks from Acta
Acta was the first vendor to bring a product to market specifically tailored to support data
warehousing with ERP systems. Today Acta offers the most comprehensive data
warehousing and data integration products for use with ERP systems.
ActaWorks for SAP is designed to support tight integration with SAP ERP applications.
In addition to providing an intuitive GUI for mapping data from SAP and non-SAP
sources to a data warehouse or data mart, ActaWorks extracts data via the SAP R/3
application layer, allowing access to all SAP data and business logic. ActaWorks also
features a component that supports real-time updates and change-data capture for data
warehouses. Acta also offers pre-packaged data marts, or Rapid Marts, for use with
ActaWorks to speed warehouse development.
ActaWorks for SAP consists of five key components: ActaWorks Designer, a Meta data
repository, ActaWorks Server, ActaWorks Integrator for SAP and ActaWorks
Administrator.
ActaWorks Designer is a graphical tool for defining the data mappings, transformations
and control logic necessary for managing a complex multi-step process for populating a
data warehouse. Designer allows users to define data mappings and transformation
rules using a GUI modeled on SQL.
The data mappings and transformation rules specified with Designer are stored in the
ActaWorks Meta data repository. The repository also stores information describing the
schema for SAP and non-SAP data sources and the target data warehouse schema. To
facilitate the process of identifying the right information to extract, ActaLink provides
English language descriptions of both tables and columns.
The hub of the transformation process is ActaWorks Server, which performs complex
data transformations and integrates data from non-SAP sources with SAP data. The
server is designed to provide high throughput and uses in-memory transformations and
parallel pipelining.
To extract data from SAP, the ActaWorks Integrator for SAP automatically generates
optimized ABAP/4 code. This removes the need to write and maintain custom ABAP/4
code. The features of the Integrator are:
• Populates the Meta data repository with the SAP logical view of the data.
• Translates ANSI SQL constructs specified in the Designer into their ABAP/4
  (OpenSQL) equivalents.
• Automatically generates the ABAP/4 code that extracts the data.
• Uses the SAP administrative infrastructure by extracting data via SAP’s application
  server layer, thereby providing access to all SAP data, including data stored in
  pool and cluster tables, and other SAP business logic.
• Automatically extracts the hierarchies from SAP.
ActaWorks Administrator provides facilities for warehouse administrators to schedule
and monitor jobs.
Capture of changed transactions in the source (SAP) can be implemented using
IDocs (Intermediate Documents). IDocs capture data when a transaction is
being processed. This is a very effective means of capturing data from SAP when the
underlying tables do not contain date and time stamps. ActaWorks generates ABAP to
read staged IDoc data from header and detail.
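For contrast, when source tables do carry date and time stamps, changed rows can be picked up with a simple "watermark" query, which is the case where IDoc-based capture is not strictly needed. The Python sketch below illustrates that timestamp approach only; it does not show ActaWorks, ABAP or IDoc processing. The `orders` table, its columns and the function name `extract_changes` are hypothetical, and an in-memory SQLite database stands in for the source system.

```python
import sqlite3


def extract_changes(conn, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT order_id, amount, changed_at FROM orders "
        "WHERE changed_at > ? ORDER BY changed_at",
        (last_watermark,),
    ).fetchall()
    # Advance the watermark to the latest change seen, if there was one.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


# Hypothetical source table with a change timestamp on every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, changed_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 100.0, "2002-01-10T08:00:00"),
        (2, 250.0, "2002-01-11T09:30:00"),
        (3, 75.5, "2002-01-12T14:15:00"),
    ],
)

# Only rows changed since the previous extraction run are returned.
changed, watermark = extract_changes(conn, "2002-01-10T23:59:59")
```

Storing the returned watermark between runs means each extraction reads only the rows changed since the previous one. Without a reliable timestamp column this query has nothing to filter on, which is why transaction-level capture becomes attractive.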
ActaWorks supports real-time data transformation, including receiving messages from
ERP systems or XML-based e-commerce applications. “Real-time” means that
ActaWorks reacts to messages as they are sent, performing predefined operations to
respond appropriately. Real-time updates from SAP require the
Acta RealTime Component to be installed. For real-time data extraction, ActaWorks Real-Time uses
SAP R/3 Application Link Enabling (ALE) technology and Intermediate Documents
(IDocs) to capture and process transactions. IDocs can be enriched with other R/3 or
non-R/3 data as specified in the real-time data flow design.
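The general pattern described here, reacting to each message with a predefined operation and enriching it with other data, can be sketched in a few lines of Python. This is a generic illustration and not the ActaWorks, ALE or IDoc interface: the message layout, the `CUSTOMER_REGION` lookup and the handler names are all hypothetical.

```python
import queue

# Hypothetical reference data used to enrich incoming messages.
CUSTOMER_REGION = {"C100": "EMEA", "C200": "APAC"}

warehouse_rows = []  # stands in for the target data warehouse table


def handle_order(message):
    """Enrich an order message with region data and load it into the target."""
    row = dict(message["payload"])
    row["region"] = CUSTOMER_REGION.get(row["customer_id"], "UNKNOWN")
    warehouse_rows.append(row)


# Message type -> predefined operation.
HANDLERS = {"ORDER": handle_order}


def process_messages(inbox):
    """Handle every queued message; messages of unknown type are skipped."""
    processed = 0
    while True:
        try:
            message = inbox.get_nowait()
        except queue.Empty:
            return processed
        handler = HANDLERS.get(message["type"])
        if handler is not None:
            handler(message)
            processed += 1


inbox = queue.Queue()
inbox.put({"type": "ORDER", "payload": {"order_id": 1, "customer_id": "C100"}})
inbox.put({"type": "UNKNOWN", "payload": {}})
inbox.put({"type": "ORDER", "payload": {"order_id": 2, "customer_id": "C999"}})
handled = process_messages(inbox)
```

A production flow would block on the queue and handle failures; the sketch drains the queue once so that it terminates.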
6.2.2 DataStage from Ascential
Using DataStage XE, warehouse developers can take data from diverse sources and
complex data forms such as legacy data, B2B and web environments, as well as
enterprise applications such as SAP and Siebel. They can transform this data and load it
into a warehouse, data mart or business intelligence application for analysis. By
managing the Meta data, DataStage XE completely integrates Meta data with the most
commercially popular data modeling and data access tools. Finally, the quality
assurance component enables warehouse administrators to audit, monitor, and manage
the quality of the data as the warehouse expands and evolves.
Specifically, DataStage XE is an integrated set of software components consisting of:
• Quality Manager for data quality assurance critical for accurate business analysis
• MetaStage for Meta data integration in order to maintain consistent analytic
  interpretations as well as track changes to the data warehouse
• DataStage for data collection and integration from diverse sources for complete
  "snapshots" and data movement and transformation for system and end-user
  productivity
• DataStage XE/390 for extracting legacy data while using the power of the
  mainframe infrastructure
As part of DataStage XE, Quality Manager gives development teams and business
users the ability to audit, monitor, and certify data quality at key points throughout the
data integration lifecycle. Further, they can identify a wide range of data quality problems
and business rule violations that can inhibit data migration efforts as well as generate
data quality metrics for projecting financial returns.
By improving the quality of the data going into DataStage transformations, organizations
also improve warehouse performance and the data quality of the resultant target data.
The end result is validated data and information for making smart business decisions
and a reliable, repeatable and accurate process for making sure information maintains
its superior quality over time.
A critical component of DataStage XE is MetaStage, Ascential’s solution for meta data
management across data warehouse environments. Most data warehouses and marts
are created using a wide variety of tools that cannot exchange Meta data. As a result,
business users are unable to understand and leverage enterprise data because the
contextual information, or Meta data, required is unavailable or unintelligible. Based on
patented technology, MetaStage offers broad support for sharing Meta data between
third-party data environments. MetaStage uses MetaBrokers to ensure the complete
exchange of all related meta data, regardless of source type.
DataStage is a client/server development tool for building and supporting data migration
applications. Ascential Software offers options such as XML Pack, Enterprise Application
Packs, and the MQ Series Plug-in. On the server side, DataStage has a transformation
engine that enables complex processing while providing ease of use, management
control and maximum performance. The DataStage client is a graphical tool with the
following major components: Manager, Designer, Director, and Administrator. The
DataStage Manager supports the import/export of meta data, as well as the central
control of shared transformation objects. The Designer is the tool that visually represents
the data transformation process with an intuitive easy-to-use graphical engine. The
Director, as its name implies, supports the scheduling and execution of completed
transformations, and the Administrator provides for housekeeping and security functions.
Data warehousing professionals use the DataStage client to interact with the DataStage
Server, the workhorse that processes the transformations and moves data at run-time.
Enterprise application (EA) systems provide critical data sources for business analysis.
DataStage XE provides full integration with leading enterprise applications including
SAP, Siebel, and PeopleSoft.
The DataStage Extract PACKs for SAP R/3, Siebel and PeopleSoft, and the DataStage
Load PACK for SAP BW enable warehouse developers to integrate this data with the
organization's other data sources. The DataStage Extract PACK provides:
1. Extensive transformation capabilities to manipulate SAP R/3 data and load it into a
new or existing data warehouse or data mart.
2. Generation of ABAP/4, SAP’s programming language. Automated ABAP code generation
shields developers from the complexity of writing ABAP code manually and, more
importantly, reduces development and maintenance costs.
3. Access to all SAP R/3 data, including transparent, pool, view and cluster tables,
through a unique feature, the DataStage Meta data object browser. With over 15,000
SAP tables and their known complexity, the Meta data object browser simplifies the
process by enabling easy navigation through the info hierarchies before joining
multiple R/3 tables.
4. Two methods of operation to optimize performance and resources:
generated ABAP code can be uploaded to the R/3 system via remote function
call or, for warehouse developers who do not have direct access to the R/3
system, the R/3 script can be moved manually via FTP and imported by an R/3
administrator. Job scheduling can be controlled either from the DataStage
Director or natively from the SAP scheduling services.
5. Complex transformations performed easily with drag-and-drop operations using
DataStage Designer’s graphical mapping tool.
6. Use of SAP’s RFC library and IDocs, two of the primary data interchange
mechanisms for access to SAP R/3, thus conforming to SAP interfacing
standards.
7. The ability to capture incremental changes and produce
event-triggered updates with SAP’s IDoc (Intermediate Document) functionality.
DataStage’s IDoc extract interface retrieves IDoc Meta data and automatically
translates the segment fields into DataStage, achieving real-time SAP data
integration.
6.2.3 PowerCenter from Informatica
PowerCenter from Informatica is one of the popular and powerful tools in the ETL space.
It offers seamless integration with a wide range of data sources, including ERP, mainframe and
relational systems as well as e-commerce and legacy applications. Informatica’s
PowerConnect for PeopleSoft and PowerConnect for SAP can directly extract and
integrate data from SAP R/3 and PeopleSoft applications, as well as other formats.
PowerConnect modules are component-based offerings that complement and extend the
functionality of Informatica’s core data warehouse development platform,
PowerCenter.
PowerConnect for SAP provides Informatica PowerMart/PowerCenter users with native,
high-speed data extraction from SAP R/3 systems, enabling full access to all SAP R/3
tables and SAP R/3 info hierarchies. PowerConnect for SAP extracts data from SAP
using ABAP/4, SAP’s proprietary 4GL. Using PowerConnect, users can access all SAP
R/3 tables, including transparent, pool and cluster tables. This allows full access to all
data residing in SAP R/3’s application layer. Once extracted, SAP data is delivered to
the PowerCenter server, which transforms the data for delivery to the target data
warehouse, data marts, or other analytic applications.
PowerConnect for SAP lets you customize the R/3 extraction routines for load
processing. You can choose to stage the data in an intermediary file or stream it directly
into the PowerCenter Server. In addition, when accessing data in R/3, PowerConnect
performs only the actual extraction processes on the R/3 system. Transformation and load
processing occur within PowerCenter, helping to minimize the load on the R/3
environment.
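The choice between staging extracted data in an intermediary file and streaming it directly into the transformation step can be illustrated with a small Python sketch. It is a generic illustration of the two delivery modes and does not use the PowerConnect or PowerCenter interfaces: `extract_rows`, `transform`, the sample rows and the CSV staging format are all hypothetical.

```python
import csv
import os
import tempfile


def extract_rows():
    """Stand-in for the source extraction; yields rows one at a time."""
    yield {"material": "M-01", "qty": "10"}
    yield {"material": "M-02", "qty": "4"}


def transform(row):
    """Runs on the ETL server side, not on the source system."""
    return {"material": row["material"], "qty": int(row["qty"])}


def load_streaming():
    """Stream rows straight from extraction into transformation."""
    return [transform(row) for row in extract_rows()]


def load_staged(path):
    """Stage extracted rows in an intermediary file, then transform from it."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["material", "qty"])
        writer.writeheader()
        writer.writerows(extract_rows())
    with open(path, newline="") as handle:
        return [transform(row) for row in csv.DictReader(handle)]


stage_path = os.path.join(tempfile.mkdtemp(), "stage.csv")
streamed = load_streaming()
staged = load_staged(stage_path)
```

Both paths produce the same rows. Staging costs an extra write and read but leaves a file that can be reprocessed without touching the source again, while streaming avoids the intermediate storage.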
7 Conclusion
Companies have been struggling for some time now to build data warehouses and data
marts that allow their users to perform better and easier analysis of SAP data. Due to
the complexity of the SAP R/3 system and a lack of good data warehousing products
specifically designed to handle SAP data, companies were forced to write their own
custom extraction programs in ABAP/4.
This, however, is changing: a good number of vendors, recognizing the opportunity, have
introduced ETL products that can assist in extracting and integrating SAP and non-SAP
data and moving it into the warehouse.
SAP is seriously pursuing its efforts to provide a scalable BI platform by
upgrading its Business Information Warehouse. It is enhancing the business
content in each new version, but BW still lacks the capabilities provided by
competing packaged solutions. SAP has also integrated DataStage (an ETL tool) to
bring non-SAP data into the BW platform.
META Group predicts that by 2005, SAP BW can become a dominant player among
packaged data warehouse vendors catering to the enterprise-level information needs
of SAP R/3 users. It may not achieve the same success among non-SAP R/3
users.
8 Appendix A
Comparison of Informatica PowerCenter 5.0, ActaWorks 5.0 and Ascential DataStage XE 5.1
Category: Architecture
Criterion: Architecture
  Informatica PowerCenter: Hub and Spoke Architecture
  ActaWorks: Open Client Server Platform; facilitates the sharing of Meta Data
  Ascential DataStage XE: Client Server Architecture
Criterion: Scalable and Extensible Technology
  Informatica PowerCenter: Highly scalable and extensible technology. Scales up as the
    data and load grow. Scales up w.r.t. the hardware and software
  ActaWorks: Scalable, Flexible Technology.
  Ascential DataStage XE: Highly scalable. Scales up w.r.t. the hardware and software
Criterion: Client Platform
  Informatica PowerCenter: Windows 2000/NT/98
  ActaWorks: Windows 98/NT/2000, OS/2
  Ascential DataStage XE: Windows 95/NT/2000
Criterion: Server Platforms
  Informatica PowerCenter: Sun Solaris, AIX, HP-UNIX, Windows NT/2000
  ActaWorks: Windows NT/2000, HP-Unix, Solaris, AIX
  Ascential DataStage XE: Windows NT (Intel and Alpha Platforms), UNIX (AIX, HP-UX,
    Sun Solaris, COMPAQ Tru64). DataStage XE 390 works on the OS/390 platform.
Criterion: Which DBMS are supported for extraction and loading
  Informatica PowerCenter:
    For Extraction: DB/2, DB/2 /400, Flat Files, IMS, Informix, MS SQL Server,
    MS Access, Oracle, Sybase, UDB, VSAM, ODBC, Others
    Targets: Informix, DB/2 /400, MS SQL Server, MS Access, Oracle, PeopleSoft
    Enterprise Performance Management (EPM), SAP® Business Information
    Warehouse (BW), Sybase, UDB, Flat Files, Others
  ActaWorks: Oracle, Informix, Microsoft SQL Server, Sybase, DB2 UDB,
    ODBC-compliant databases, and flat files
  Ascential DataStage XE: QSAM: Sequential flat files. ISAM. VSAM: KSDS, RRDS, ESDS;
    supports GROUPS, multilevel arrays, REDEFINES, and all PICTURE clauses.
    DB2, Adabas, Oracle OCI (for releases 7 and 8), Sybase Open Client,
    Informix CLI, OLE/DB for Microsoft SQL Server 7, ODBC.
Criterion: Support for ERP Sources
  Ascential DataStage XE: DataStage XE provides full integration with leading
    enterprise applications including SAP, Siebel, and PeopleSoft. The
    DataStage Extract PACKs for SAP R/3, Siebel and PeopleSoft, and the
    DataStage Load PACK for SAP BW enable warehouse developers to
    integrate this data with the organization's other data sources
Criterion: Code Reusability capability within the product
  Informatica PowerCenter: Supports development of Mapplets, which act as a library
    between Mappings and can also make transformations shareable across Mappings.
  ActaWorks: All the objects in the object library are reusable. An object can
    be a data flow, workflow, job etc.
  Ascential DataStage XE: Permits the reuse of existing code through APIs, thereby
    eliminating redundancy and retesting of established business rules
Criterion: Parallelism
  Informatica PowerCenter: Supports parallelism; one can run multiple mapping sessions
    on the same server.
  ActaWorks: Supports parallelism if it is running on a multi-processor computer. It
    takes full advantage of the hardware architecture.
  Ascential DataStage XE: Automatically distributes independent job flows across
    multiple CPU processes. This feature ensures the best use of available resources
    and speeds up overall processing time for the application.
Criterion: Code Generator
  Informatica PowerCenter: PowerCenter does not generate code; all the mappings
    developed are in the form of a GUI interface.
  ActaWorks: Does generate code; whether the Data Flow or Job Flow defined can be
    converted to code is to be checked with Acta Support.
  Ascential DataStage XE: Only the DataStage XE/390 version automatically generates
    and optimizes native COBOL code and JCL scripts that run on the
    OS/390 mainframe.
Criterion: Data Transformation Method (Engine-Based?)
  Informatica PowerCenter: PowerCenter is based on Hub & Spoke architecture and has
    an inbuilt Transformation engine.
  ActaWorks: Transformation is engine based and relies on the server.
  Ascential DataStage XE: Transformation is engine based - column-to-column mappings
Criterion: Building & Managing Aggregates
  Informatica PowerCenter: Aggregation can be built using the built-in transformation
    provided.
  ActaWorks: Aggregation through a ready-to-use Transformation function
  Ascential DataStage XE: Enhances performance and reduces I/O with its built-in
    sorting and aggregation capabilities. The Sort and Aggregation stages of
    DataStage work directly on rows as they pass through the engine
    rather than depending on SQL and intermediate tables.
Criterion: Support for various data types
  Informatica PowerCenter: Supports most of the industry standard data types. This also
    depends on the kind of source system being used.
  ActaWorks: Supports most of the industry standard data types
  Ascential DataStage XE: It supports most of the industry standard data types. It
    supports XML also.
Criterion: Data Quality Check functionality or feature
  Ascential DataStage XE: Through Quality Manager it is possible to audit,
    monitor, and certify data quality at key points throughout the data
    integration lifecycle.
Criterion: Debugging and logging features
  Informatica PowerCenter: Does not have a separate debugging tool. The workaround is
    to set the "verbose" property on each transformation. With this, Informatica
    creates log files on the server, which can be used for further analysis.
  ActaWorks: Error correction can be done for each job, workflow, data flow and
    even object.
  Ascential DataStage XE: Helps developers verify their code with a built-in
    debugger, thereby increasing application reliability as well as
    reducing the amount of time developers spend fixing errors and bugs.
    Supports debugging on a row-by-row basis using break points. DataStage
    immediately detects and corrects errors in logic or unexpected legacy data
    values using this. Highly useful for complex transformations, date
    conversions etc.
Criterion: Exception Handling
  Informatica PowerCenter: Throws out the error records or rejected records into a
    log file
  ActaWorks: Supports exception handling; no extra effort required.
  Ascential DataStage XE: Supports exception handling.
Feature: How the tool provides information about exceptions
PowerCenter: Through log files stored on the server.
ActaWorks: Through log files.
DataStage: Developers can closely observe the running jobs in the Monitor Window, which provides run-time feedback at user-selected intervals. The process viewer estimates rows per second and allows developers to pinpoint possible bottlenecks and/or points of failure. Using the Director, the developer can browse detailed log records as each step of a job completes. These date and time stamped log records include notes reported by the DataStage Server as well as messages returned by the operating environment or the source and target database systems. DataStage highlights log records with colored icons (green for informational, yellow for warnings, red for fatal) for easy identification.

Feature: Restarting an aborted ETL process
PowerCenter: Supports restarting of the mappings.
ActaWorks: Restart is possible. Can restart from the point of failure.
DataStage: Restart is possible. Can restart from the point of failure.
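Restarting from the point of failure usually depends on some record of which steps already completed. The following Python sketch illustrates that general idea only; it is not the mechanism used by any of the three tools. The `run_job` helper, the step names and the JSON checkpoint file are all hypothetical.

```python
import json
import os
import tempfile


def run_job(steps, checkpoint_path):
    """Run (name, func) steps in order, skipping steps finished in an earlier run."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)
    for name, func in steps:
        if name in done:
            continue  # completed before the previous abort
        func()  # an exception here aborts the job; the checkpoint keeps earlier steps
        done.append(name)
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)
    os.remove(checkpoint_path)  # clean finish: the next run starts from the beginning
    return done


# Demonstration with a hypothetical three-step job whose second step fails once.
calls = []
state = {"fail": True}


def extract():
    calls.append("extract")


def transform():
    if state["fail"]:
        raise RuntimeError("simulated abort")
    calls.append("transform")


def load():
    calls.append("load")


steps = [("extract", extract), ("transform", transform), ("load", load)]
path = os.path.join(tempfile.mkdtemp(), "job.ckpt")

try:
    run_job(steps, path)   # aborts inside transform
except RuntimeError:
    pass

state["fail"] = False
run_job(steps, path)       # resumes at transform; extract is not repeated
```

In this sketch the checkpoint is written after every step, so an abort loses at most the step that was in progress.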
Feature: Memory (Minimum/Recommended) requirement at client machine
PowerCenter: 128 MB / 256 MB
ActaWorks: 64 MB / 128 MB
DataStage: 64 MB

Feature: Memory (Minimum/Recommended) requirement at server machine
PowerCenter: Depends on the kind of application running; 128 MB / 256 MB
ActaWorks: 64 MB / 128 MB
DataStage: Minimum 256 MB

Feature: Repository Backup and Recovery
PowerCenter: PowerCenter comes with good features for backup and recovery of the repository. This can be done through Repository Manager.
ActaWorks: Repository backup can be taken by using Repository Manager.
DataStage: Supports a distributed repository. Remote sites can subscribe to a set of meta data objects within the warehouse application. These sites are notified via email when meta data changes occur within their subscription. DataStage XE offers version control of components such as table definitions, transformation rules, and source/target column mappings within a 2-part numbering scheme.
Category: Meta data support

Feature: Metadata Capture
PowerCenter: Meta data is captured and stored in the repository of PowerCenter.
ActaWorks: Automatically captures the meta data and stores it in the repository.
DataStage: Stores all the meta data in the repository. Captures the meta data automatically using a component called MetaStage. It also offers broad support for sharing meta data between third-party data environments using MetaBrokers. It maintains a complete catalog of the organization's metadata, including physical, technical, business and process meta data.

Feature: Business View meta data
PowerCenter: Business meta data needs to be documented while building the mappings. This data will be stored in the meta data repository. Using SQL commands it is possible to query the meta data.
ActaWorks: Not available. Only technical meta data is stored.
DataStage: DataStage XE provides warehouse developers with a central hub that manages meta data at the tool-integration level. Remote sites can subscribe to a set of meta data objects within the warehouse application. These sites are notified via email when meta data changes occur within their subscription.

Feature: Meta data security
PowerCenter: Since meta data is stored in the repository of the product, it is very well protected.
ActaWorks: Provides meta data security through the repository manager; a user ID and password are needed to log in.
DataStage: User-level security provided by DataStage Administrator.
Feature: Web Integration support
PowerCenter: Does not have any web integration.
ActaWorks: By using Access Server for Web administration. Using this it is possible to control the whole loading process from a remote machine.
DataStage: Yes, supports Web integration using the Plugin API.

Feature: Versioning Support
PowerCenter: Supports versioning with the help of the repository and allows one to define the baseline.
ActaWorks: Supports versioning through the central repository.
DataStage: DataStage XE offers version control, which saves the history of all the ETL development. It preserves application components such as table definitions, transformation rules, and source/target column mappings within a 2-part numbering scheme. Developers can review older rules and optionally restore entire releases that can then be moved to distributed locations.

Feature: Metadata repository's compliance to one of the industry meta data standards
PowerCenter: Sharable through the Metadata Exchange (MX2) API.
ActaWorks: Does not exchange the metadata with other applications.
DataStage: Has its own version of the Common Meta Model. The meta data can be shared using the MetaBroker.
Feature: Meta data views using query tools
PowerCenter: PowerCenter comes with a meta data reporting tool which helps users access the meta data stored in the repository. One can view meta data using query tools like SQL etc.
ActaWorks: The central repository provides a meta data viewing facility, and the repository tables can also be queried using SQL statements.
DataStage: No tool currently available. The entire history of the data can be derived and viewed using Data Lineage.

Category: Ease of setup

Feature: Easy installation procedure
PowerCenter: The installation process depends on the platform on which it is being installed. Sometimes it can run into rough weather due to various reasons, but in most cases it is very easy to install.
ActaWorks: Easy to install; only two components need to be installed.
DataStage: An industry standard installation script provided for each of the DataStage "Packages" helps in easier installation and automated configuration.

Feature: Ability to generate data mart schema similar to source database
PowerCenter: It is possible to generate the target data mart schema similar to the source database.
ActaWorks: Possible to generate the data mart schema.
DataStage: Possible to create the data mart schema similar to the source.

Feature: Support for designing data mart
PowerCenter: Supports Star Schema data model for target data mart design.
ActaWorks: E-Caches provides ready-to-use data mart suites with all the ETL facility defined.
DataStage: Does not support this directly. But by combining the data integration capabilities of DataStage/DataStage 390 with DB2 Warehouse Manager's data warehouse generation and management capabilities, it is possible to design a data mart/warehouse.
Feature: Importing data models from modeling tools
PowerCenter: It is possible to import the data models from different modelling tools by using a plug-in called MX.
ActaWorks: Does not support.
DataStage: The MetaBroker for a particular tool represents the meta data just as it is expressed in the tool's schema. It accomplishes the exchange of meta data between tools by automatically decomposing the meta data concepts of one tool into their atomic elements via the MetaHub and recomposing those elements to represent the meta data concepts from the perspective of the receiving tool. In this way all meta data and their relationships in the integrated suite are captured and retained for use by any of the tools. In summary, MetaBrokers facilitate meta data exchange between DataStage and popular data modeling and business intelligence tools.
Category: Transformations

Feature: Filter
PowerCenter: Supports Filter transformation.
ActaWorks: Supports various types of transformations: Filtering, Merging, Key Generation, Table Comparison etc.
DataStage: Supports Filter transformation.

Feature: Format conversion
PowerCenter: Supports format conversion and data type conversion.
ActaWorks: Format conversion is possible.
DataStage: Supports format conversion such as date and time display, numeric representation, national currency rules, collating sequences etc.

Feature: Lookup
PowerCenter: Supports Lookup transformation very well.
ActaWorks: Lookup functionality is possible, with three types of functionality: pre-cached, cache-on-demand and no-cache.
DataStage: Supports lookup procedures and hashed lookup tables to increase performance.
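The three lookup modes named in the ActaWorks column (pre-cached, cache-on-demand, no-cache) differ mainly in when the source is queried. The Python sketch below illustrates that trade-off under simple assumptions; the `Lookup` class, its `mode` strings and the `fetch` callable are hypothetical and do not correspond to any tool's API.

```python
class Lookup:
    """Key lookup against a slow source, with a selectable caching mode."""

    def __init__(self, fetch, mode="no_cache", preload=None):
        self.fetch = fetch        # callable key -> value; stands in for a database query
        self.mode = mode
        self.cache = {}
        self.source_hits = 0      # number of times the source was actually queried
        if mode == "pre_cached":
            # Load the whole lookup table into an in-memory hash table up front.
            self.cache = dict(preload or {})

    def get(self, key):
        if self.mode != "no_cache" and key in self.cache:
            return self.cache[key]
        self.source_hits += 1
        value = self.fetch(key)
        if self.mode == "cache_on_demand":
            self.cache[key] = value  # remember a row only after its first use
        return value


# Hypothetical lookup table and key stream.
table = {"IN": "India", "US": "United States"}
keys = ["IN", "US", "IN", "IN"]

hits = {}
for mode in ("no_cache", "cache_on_demand", "pre_cached"):
    lookup = Lookup(table.get, mode=mode, preload=table)
    names = [lookup.get(k) for k in keys]
    hits[mode] = lookup.source_hits
```

With this key stream the source is queried four times without a cache, twice with cache-on-demand, and not at all when the table is pre-cached, at the cost of loading every row up front.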
Feature: Scope for user defined fields
PowerCenter: One can define user-defined variables, but there is no notion of scope.
ActaWorks: Possible to define variables with global or local scope, and also to pass parameter values between various projects.
DataStage: One can define user-defined variables.

Feature: Joins
PowerCenter: Supports most of the join types.
ActaWorks: Supports all types of joins.
DataStage: Supports most of the join types using the join transformation.

Feature: Support for external procedures
PowerCenter: Supports external procedures; it is possible to call stored procedures through mappings.
ActaWorks: Possible to call COM objects, DLL functions etc.
DataStage: Built into DataStage are several features exclusively designed to support the packaging and deployment of completed data migration applications.
Category: Management feature

Feature: Scheduling
PowerCenter: Supports a good scheduling feature, and it is possible to schedule the job/session using Server Manager. The work-flow mechanism is limited.
ActaWorks: Good scheduler within the tool, with a work-flow mechanism and calendar.
DataStage: Good graphical scheduling and monitoring feature provided by the DataStage component called DataStage Director. It can also generate CRON scripts to schedule from Unix. With the DataStage Job Control API and Command Language interface provided, any remote C program or command shell can be used to initiate jobs, query their results or program a more complex job execution sequence.

Feature: Defining calendar and using it for ad-hoc scheduling
ActaWorks: Yes, it is possible in a very sophisticated manner.
DataStage: Using the DataStage Director it is possible to schedule the jobs.
Feature: Performance monitoring of ETL process
ActaWorks: Provides more control to the user through more attributes, for better monitoring.
DataStage: No special performance monitor tool, but developers can closely observe the running jobs in the Monitor Window, which provides run-time feedback at user-selected intervals. The process viewer estimates rows per second and allows developers to pinpoint possible bottlenecks and/or points of failure.

Feature: Performance Options
ActaWorks: It is a strong point of Acta, as it gives more parameters for performance improvement.
DataStage: Can provide very high performance. Can enhance performance using in-memory hash tables, reducing I/O operations with its built-in sorting and aggregation capabilities. DataStage allows developers to bypass ODBC and "talk" natively to the source and target structures using direct calls, thereby increasing performance.

Feature: Specifying the atomicity of the updates
PowerCenter: It is possible to load a large set of records to the target database.
ActaWorks: Possible to specify the atomicity of the updates.
DataStage: Does not support atomicity of updates.

Feature: Security – Encryption
PowerCenter: Has good security features, managed through Repository Manager. No encryption facility.
ActaWorks: Provides good security through the repository manager. Does not provide an encryption facility.
DataStage: Provides security features using DataStage Administrator.
Feature: Security and Access Control using LDAP
PowerCenter: Not available.
ActaWorks: No option to provide an LDAP interface.
DataStage: Not available.

Category: Adaptability

Feature: Impact analysis capability
PowerCenter: It is possible to find out the impact of a change which needs to be done.
ActaWorks: Provides impact analysis capability.
DataStage: Good impact analysis capabilities provided by the MetaStage Hub across the integrated environment. It gives the entire set of relationships associated with an object.

Category: Support for growth

Feature: SCD
PowerCenter: Requires programmatic design to update the SCD.
ActaWorks: Can be handled using filter and lookup transforms.
DataStage: Requires programmatic design to update the SCD.
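"Programmatic design" for a slowly changing dimension (SCD) typically means writing the versioning logic by hand. As a hedged illustration, the Python sketch below applies a Type 2 update, where a changed row is expired and a new version is appended. The row layout (`key`, `attrs`, `valid_from`, `valid_to`, `current`) and the function name `apply_scd2` are assumptions made for this example, not any tool's data model.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply a Type 2 change: expire the current row and append a new version.

    dimension: list of dicts with keys key, attrs, valid_from, valid_to, current
    incoming:  dict mapping business key -> latest attribute dict from the source
    """
    current = {row["key"]: row for row in dimension if row["current"]}
    for key, attrs in incoming.items():
        old = current.get(key)
        if old is not None and old["attrs"] == attrs:
            continue  # nothing changed, keep the current version
        if old is not None:
            old["valid_to"] = today   # close the superseded version
            old["current"] = False
        dimension.append({"key": key, "attrs": attrs,
                          "valid_from": today, "valid_to": None, "current": True})
    return dimension


# Hypothetical customer dimension: C1 moves city, C2 is a new customer.
dim = [{"key": "C1", "attrs": {"city": "Pune"},
        "valid_from": date(2001, 1, 1), "valid_to": None, "current": True}]
apply_scd2(dim, {"C1": {"city": "Bangalore"}, "C2": {"city": "Chennai"}},
           date(2002, 1, 15))
```

The old C1 row stays in the dimension with its validity closed, so history remains queryable while only one row per key is marked current.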
Feature: Version/configuration management
PowerCenter: Supports versioning and configuration management.
ActaWorks: Provides a good interface to control the versions.
DataStage: Provides version control through a distributed repository. (The repository can exist on either source or target.)

Feature: Ability to handle various source types, from flat files to major RDBMS
PowerCenter: Supports flat files, Oracle, SQL Server, DB2, and other ODBC-compliant RDBMS.
ActaWorks: Only Oracle 8.x, Informix, SQL Server and DB2. Also provides SAP R/3 connectivity without any plug-ins.
DataStage: Supports heterogeneous sources like Oracle, Informix, SQL Server, DB2, flat files, XML, and ERP sources like Oracle Apps, SAP R/3, PeopleSoft etc.
Feature: Incremental upload
PowerCenter: This needs to be handled in mappings manually.
ActaWorks: Yes.
DataStage: Supports incremental load. Changed Data Capture captures changes to the operational data and produces Delta Store files. DataStage XE uses these files to update the data warehouse. From a workflow perspective, the warehouse developer defines a Delta Data Store file as an input table within one of the DataStage XE products on a Windows 95/NT platform.

Feature: Support for External loader
PowerCenter: One can call an external procedure in the mapping using an external transformation.
ActaWorks: Yes.
DataStage: DataStage supports a wide variety of bulk load utilities, either by directly calling a vendor's bulk load API or by generating the control and matching data file for batch input processing. DataStage developers simply connect a Bulk Load Stage icon to their jobs and then fill in the performance settings that are appropriate for their particular environment.
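The incremental upload row above describes loading only captured changes instead of reloading the whole source. The Python sketch below illustrates that idea in its simplest form. The change-record layout (`op`, `key`, `row`) is a hypothetical stand-in and is not the Delta Store file format.

```python
def apply_delta(target, delta):
    """Apply insert/update/delete change records to a target table keyed by primary key."""
    for change in delta:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            target[key] = change["row"]   # upsert the latest image of the row
        elif op == "delete":
            target.pop(key, None)         # tolerate deletes of rows never loaded
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Hypothetical warehouse table and one batch of captured changes.
warehouse = {1: {"name": "bolt", "qty": 10}, 2: {"name": "nut", "qty": 5}}
delta = [
    {"op": "update", "key": 1, "row": {"name": "bolt", "qty": 7}},
    {"op": "delete", "key": 2},
    {"op": "insert", "key": 3, "row": {"name": "washer", "qty": 40}},
]
apply_delta(warehouse, delta)
```

Only three change records are processed here, regardless of how many unchanged rows the source system holds.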
Feature: Intermediate file generation during loading
PowerCenter: Only generates a temp file when doing sorting or loading.
ActaWorks: Does not generate an intermediate file during loading.
DataStage: Does not require intermediate files or secondary storage locations to perform aggregation or intermediate sorting during the loading process.

Feature: Event based loading
PowerCenter: Does not support a "true" work-flow mechanism. This can be done using external schedulers or workflow tools like AppWorks or NT Scheduling, or using Mainframe OPC Scheduling tools.
ActaWorks: Yes, it is possible.
DataStage: Supports event based loading.

Feature: Support for wide range of databases for storing (target) information
PowerCenter: Supports Oracle, Informix, SQL Server, DB2 etc.
ActaWorks: Only Oracle 8.x, Informix, SQL Server and DB2.
DataStage: Sybase Adaptive Server, Sybase Adaptive Server IQ, Microsoft SQL Server 7 via OLE/DB, Microsoft SQL Server 6.5 via BCP, Informix Redbrick, Teradata, UDB. Bulk loaders: Oracle, Informix ADO/XPO High Performance. Ascential databases: UniVerse, Unidata. Also XML, e-mail systems and Web logs, ERP data and MQSeries messages.

Feature: Support for multi-user development environment
PowerCenter: Supports a multi-user development environment.
ActaWorks: Supports a multi-user development environment.
DataStage: Supports a multi-user client-server development environment.
Category: Advanced Data Transformation

Feature: Re-usability
PowerCenter: Supports re-usability of the code by making transformations reusable.
ActaWorks: Provides various reusable objects like jobs, workflows, dataflows etc.
DataStage: Code reusability is supported. Ascential's Quality Manager provides a framework for developing a self-contained and reusable Project which consists of business rules, analysis results, measurements, history and reports about a particular source or target environment.

Feature: Support for built-in functions
PowerCenter: Supports built-in transformations like aggregator, filter etc.
ActaWorks: Supports built-in functions.
DataStage: Pre-built functions and routines are available.
Feature: Handling duplicate records
PowerCenter: Does not handle duplicate rows. To be handled programmatically.
ActaWorks: Possible to handle duplicate records.
DataStage: Does not handle duplicate rows. To be handled programmatically.

Feature: Lookup cache
PowerCenter: Supports caching of lookup tables.
ActaWorks: Possible to define a lookup cache through lookup transformations.
DataStage: Supports lookup cache.

Category: Consistency and re-use

Feature: Global Meta data
PowerCenter: Using the PowerCenter and PowerMart model it is possible to handle global meta data.
ActaWorks: Supports global meta data.
DataStage: MetaBrokers enable the sharing of meta data among all of the tools in the warehouse environment. With MetaBrokers, tools can share meta data without having to change their internal meta schema to conform to a common model.
Category: Compatibility with third party tools

Feature: Compatibility of ETL tools with EAI tools
PowerCenter: Currently PowerCenter supports the following EAI vendors as source/target for the data: IBM MQ Series, TIBCO, Vitria and webMethods.
ActaWorks: Supports the EAI tool TIBCO as an input.
DataStage: Only IBM MQ Series is supported.
Category: Licensing & Pricing

Feature: Server Licensing
PowerCenter: Licensing includes the following for the Basic Version:
· No ability to add on PowerMarts
· No Global Repository
· No centralized monitoring
· 1 Server Engine*
· 2 Relational Database Source Types
· 2 Target Instances
· Unlimited Flat File Sourcing
· Unlimited Developers
· Single CPU
ActaWorks: Provides evaluation and permanent licenses, which support a multi-user environment and SAP R/3 connectivity. Unix version costs US$ 140 K; Windows NT/2000 version costs US$ 95 K.
DataStage: Information not available.

Feature: Client Licensing
PowerCenter: There is no separate licensing for the client. It comes along with the server.
ActaWorks: There is no separate license required for the client.
DataStage: Information not available.

Feature: ODC Licensing
PowerCenter: No transfers are allowed from the client-owned software to Wipro. A separate license has to be procured. A lab license, at half the cost of the production license, may be sufficient.
DataStage: Information not available.
Category: Vendor Information

Feature: 2 consecutive years of profitability
PowerCenter: Informatica was recently named the 11th fastest-growing technology company in Silicon Valley by Deloitte & Touche. The ranking resulted from the company's 10,491 percent revenue growth between 1995 and 1999.
ActaWorks: Acta continues to see strong growth in data integration, with second quarter revenue up 110%.

Feature: Significant third party partner support
PowerCenter: PowerCenter works with most of the software, database and hardware vendors. Built on most with open system. Products like PowerConnect for DB2 have been brought by Informatica and are supported.
DataStage: SAP is a reseller of Ascential's DataStage and DataStage Load PACK for SAP BW, with the sole target being SAP BW.

Feature: Global presence and support
PowerCenter: Has a global presence and has support on most of the continents.
DataStage: Ascential Software Corporation is the leading provider of Information Asset Management solutions to the Global 2000.

Feature: Number of Customers
PowerCenter: Around 1300 as of Oct 2001.
ActaWorks: Has more than 200 customers as of Oct 2001.
DataStage: More than 1800 as of Aug 2001.
Feature: Company financial info readily available
PowerCenter: All the information regarding the health of the company is reported on its website.
DataStage: Revenue for Ascential Software's DataStage®, Media360™ and related product and service offerings was $27.0 million in the third quarter, an increase of 14% from $23.6 million in the third quarter of 2000. Revenue for these offerings for the nine months ended September 30, 2001 was $93.9 million, an increase of 47% over the $63.8 million in the first nine months of 2000.

Feature: Company focus on ETL segment for the future
PowerCenter: Informatica came to the BI market with the ETL product and has established itself as a major player in the market. This product will continue to be the flagship product despite the change in its positioning in the BI market.
ActaWorks: Acta is well positioned to drive the "data integration market" and is coming up as a major player.
DataStage: Adds significant meta data management services to the entire data warehouse, including ETL. Intends to offer the capability for heterogeneous cross-tool analysis and query. Exploits XML integration to enhance e-business communication. Delivers key MetaBroker development capabilities for its customers and partners.