IBM Software Group
IBM Information Server
Transform – DataStage
©IBM Corporation

Why “Transform?”

Why Transformation?
Business Driver: Single View of Corporate Data
Projects Related to Information Infrastructure
Business Goals
  Opportunity (discover new revenue sources)
  Control (fraud detection, inventory)
  Regulatory compliance (SOX, Basel, money laundering)
  Balanced scorecard dashboards, BAM
IT Initiatives
  Application integration
  Platform migration
  On-demand transformation and correction
  Application re-engineering and migration (ERP to CRM)
  Decision Support (BI, DW, Data Marts)
  Portals
Information Integration

Transformation Pain
  Multiple sources for the same entity
  Lack of standards or consistent semantic meanings across systems
  Embedded business intelligence
  Evolving transformation requirements
  Need for batch and real-time and service oriented architectures
  Extreme data volumes!
  Business rules for resolving data conflicts
  Ownership and accountability
  Zero re-use of skills and processes

How Is This Being Done Today?
  Hand coding: Java, C, C++, VB, .NET, COBOL, 4GLs…
  Spreadsheet “farms”
  Early generation ETL tools
  Competitive products

IBM Information Server
Delivering information you can trust
  Discover, model, and govern information structure and content
  Standardize, merge, and correct information
  Combine and restructure information for new uses
  Synchronize, virtualize and move information for in-line delivery

The IBM Solution: IBM Information Server
Delivering information you can trust
  IBM Information Server: Unified Deployment, Unified Metadata Management
  WebSphere DataStage: Complex transformation for simplified data exchange and reduced coding

Implementation Examples
  Uses real-time data in a financial data warehouse for intra-day analytics
  Improves supply chain management by creating forecasts from POS data
  Basel II initiative will release about 40% of its minimum capital requirements
  Deutsche Bahn Group
  Replaced 4,000 hand-coded interfaces to create single view of ticket data
  Manages 3 terabytes of store sales data for customer and product analysis

WebSphere DataStage
  Design integration projects within a graphical, codeless environment
  Integrate data from the widest range of enterprise and external data sources
  Produce re-usable components
  Deploy jobs in real-time, batch mode, or as services
  Leverage the most scalable and adaptable parallel processing engine
[Diagram: IBM Information Server. Labels: DataStage, QualityStage, Client, Common Services, Parallel Processing, Metadata, Common Connectivity, Sources, Targets]

Graphical Design Metaphor

Pre-Built Transformations for Productivity

Graphical Design Metaphor
  Extensive list of available transformation functions to select from
  Context-sensitive menu: Easy access to transforms

Error notification
  Immediate notification when there’s a problem!

Extensive Re-use
Shared Containers
  Graphical unit of re-use
  Share one developer’s (subject matter expert):
    Meta data research
    Business rule definitions
    Transformation logic
    Special techniques
Routines
  Re-usable functions
Web Services
  Deploy jobs as web services. Invoke from other jobs or applications
  Use Web Services

Connectivity Ensures Data Access
Enterprise Applications
  JD Edwards, Oracle Applications, PeopleSoft, SAP BW (BAPI, IDOC), SAP R/3 (ABAP, BAPI, IDOC), Siebel
Business Exchange Formats
  XMLS, EXML, EDI, FIX, SWIFT, HIPAA
RDBMS
  IBM DB2, IBM IMS, VSAM, Oracle, Informix, RedBrick, SQL Server, Sybase, Teradata, U2 (Universe, UniData), Tandem NON-STOP SQL, SAS
Real-Time
  WebSphere MQ, SeeBeyond, Java Messaging Services, Java (Client & Transformer), XML (Read / Write), XSL-T, XSL-T Transformer, Web Services (SOAP), Enterprise Java Beans
Flat File and General Access
  VSAM, CICS, IDMS, C-ISAM, Sequential File, Complex Flat File, File Set, Data Set, Named Pipe, FTP (standard, secure), Compressed / Encoded Data, External Command Call, Parallel Wrap 3rd party applications
…And many more!

Benefits of Scalability
  Process the same data volume in less time
  - or -
  Process more data in the same amount of time
[Charts: Processing Time (hours) vs. Number of CPUs; Processing Volume (gigabytes) vs. Number of CPUs; CPU axis from 2 to 32]

Parallel Execution Enables Timely Integration
  Uniprocessor
  SMP System
  MPP, GRID, and Clustered Systems

Enabling Parallelism
Given a Job Design:
…DataStage creates “n” processes at runtime for each Stage, where “n” is the number of logical nodes defined in a configuration file

Metadata Driven Integration
Shared metadata across product modules
  Better and faster communication between team members
  Immediate access to definitions and notes on all objects
  Greater understanding, better data
Powerful Metadata driven design tools
  Quick Find and Advanced Find
  Impact Analysis
  Data Lineage reports
  Greater productivity, easier maintenance, reuse
[Screenshots: Impact Analysis; Find Capability]

DataStage Strength Summary
  Graphical, top-down design metaphor
  Extensible, component based architecture
  Strong Re-use capabilities: Shared Containers, Routines & Web Services
  Graphical sequencing (“job flow”)
  Application Deployment
  Parameterization
  Changed Data Capture
  Ubiquitous Connectivity
  Unlimited Scalability: Design serially, deploy in parallel

Thank You