Developing and Deploying Data Warehouse and Business Intelligence Solutions Kerr-McGee Information Management Group Skye Brannon DW Analyst Jeff Bridgwater DW Manager Sarena Sherrard Sr. DW Analyst Who is Kerr-McGee? • Kerr-McGee is an Oklahoma City-based energy and inorganic chemical company with worldwide operations and assets of approximately $10 billion. • http://www.kerr-mcgee.com/ Agenda • Introduction to DW/BI Concepts • Extract, Transform & Load (ETL) • Business Intelligence / Reporting • A Day in the Life DW / BI Concepts Information Management Strategy Structure the systems and data relationships to provide user-friendly customer access to data in order to provide decision-making information. Adding Value to Data Information Pyramid What is a Data Warehouse? A copy of data from one or more On-line Transaction Processing (OLTP) systems specifically structured for Query, Reporting and Analysis (QRA). • Data is typically at a summarized level to limit the size and complexity of the data warehouse • Data is usually cleansed and merged to create an “apples to apples” comparisons End-User Reporting OLTP Systems Data Warehouse The Idea Behind Data Warehousing Source Systems POS INVENTORY FINANCIALS Extract & Transform Data Warehouse Meaningful & Easy Access ? BILLING Data Information Framework Architecture Metadata Management Enterprise Reporting External Data Data Extraction Datamarts Data Cleansing Data Integration OLAP Query Data Improvement Data Mining Source Systems Data Transformation Operational Data Store Data Warehouse Project Management & Quality Assurance Operations & Systems Management Information Delivery Business Intelligence Business Activity Sales Marketing Service Customers Manage Inventory Maintain Accounts Data Business Intelligence Integrated Meaningful Consistent Validated Easy to Use Leveragable Timely Product Performance Sales Performance Customer Behavior Marketing Performance Inventory Monitoring Financial Performance Information Corp. Oracle Financials Passport InPower Production Operations Financial Operations Chemical Passport Maximo European Oracle Financials Chemhouse TOLAS O&G Tobin Issues • Multiple Versions of the Truth • Different definitions for similar data • Multiple Reporting Tools and Reports • Impacts Performance of Oracle Financials Domestic Oracle Financials Adage Data Warehouses Existing Reporting Systems HR Data Warehouse DFW Novistar Energy Financials O&G Data Warehouse PREMAS P2000 Merak Others Aberdeen Oracle Financials Intl. Systems Aberdeen Data Warehouse HR Data Warehouse Corp. Oracle Financials Passport? Phased Out Peoplesoft Production Operations “Centralized” Data Warehouse(s) Chemical Passport Domestic Oracle Financials Financial Operations Data Warehouses Existing Reporting Systems Consolidated Analysis & Reporting Solution (Cognos Business Intelligence) Adage European Oracle Financials Maximo TOLAS Consolidated Reporting Systems Benefits • Consolidated Tool Enables Multiple Economies of Scale - Massaging & Distribution - Desktop/Maintenance - Transaction System Load’ Novistar Energy Tobin • Single Version of the Truth • Business Intelligence environment leads to Timely Analysis & Reporting Dashboards & KPI Mgmt Chemhouse O&G DFW PREMAS P2000 Merak Others Budget & Forecasting Application Aberdeen Oracle Financials Intl. Systems O&G Data Warehouse Aberdeen Data Warehouse Possibly Phased Out or Integrated Possibly Phased Out or Integrated STANDARDIZE, CONSOLIDATE, MINIMIZE, & SIMPLIFY Data Warehouse Roles • Manager Planning and management of entire product or project lifecycle; May assist in ETL & BI Interface design and development • Data Warehouse Architect – Applies knowledge of technology options, platforms, and design techniques across product and project lifecycle; responsible for design of overall warehouse process • ETL Specialist – Analysis and design of extraction, transformation, and loading strategy; development of ETL scripts and procedures • Business Intelligence Specialist – Design and development of multidimensional-cubes & reports; performance and tuning of chosen technologies • Web Interface Specialist – Design and development of application interface elements; coordinates interfaces between application components ETL Data Extraction and Transformation Metadata Management • External Data • Data Extract Cleansing and Integration Process • • • Data Extraction and Transformation Plan/Forecast/ Applying business rules to turn data into useable Analysis information Clean up and standardization OLAP of consumers, vendors, Global / Dept/ Query products, etc. Business Unit Summary Integration of disparate internal and external data and Analysis Can be 70% - 80% of effort Issues Data Datamarts - Can be difficult and time consuming toVisualization define Data business rules Mining Source Systems Data Transformation - Extraction tools automate only the more simple tasks Executive Operational Data Information Information Data Store Warehouse Analysis Systems Project Management & Quality Assurance Operations & Systems Management ETL Challenges Data Volume + Source Inclusion + Extract Timing = Warehouse Complexity • Warehouse Complexity • Destructive vs. Incremental Loads • Integration (mergers/acquisitions) – Data integration (consolidation of information) – Process integration (consolidation of processes) • Smaller windows of opportunity – Make decision in a shorter period of time due to competitive, global market • Global marketplace (DW timing updates) • High-profile e-Business initiatives – Satisfying requirements ETL - The “Heavy Lifting” • Challenge to develop efficient, consistent methods of gathering and cleansing heterogeneous data – Capture and load of data from multiple source systems (both internal and external) – Integrates data into a single source – Cross-system mapping to standard identifiers (surrogate keys) – Aggregation for information delivery and BI initiatives ETL Tools - Only Half the Story • Half the story: ETL Tools Extract, Transform, and Load data • Transport data between sources and targets • Document data element changes (metadata) • Administer run-time processes and operations – – – – Scheduling Error management Audit logs Statistics ETL Tools – Core Components Databases/Files Metadata Import/Export Target Adapters Extract Transform Load Source Adapters Databases/Files/ Legacy Apps Runtime Metadata Services Metadata Repository Design Manager ETL - The Options • Software Products – “Homegrown” Solutions • SQL*LOADER, PL/SQL, GATEWAYS & LINKS • Developer-bound • Little cohesion between components – Niche Players (Oracle Warehouse Builder / DataJunction) • Enhanced Scheduling & Logging • Not Multi-Warehouse Oriented – Informatica Powermart • • • • Great UI Powerful Scheduling & Logging High Price Proprietary Transform Language ETL - The Reality Flat File O/S Scripts Mainfram e DBMS COBOL Code ERP 3GL Code RDBMS SQL Scripts Loader Utility Data Repository Oracle RDBMS PERL Scripts “New” Sources Interface Apps Informatica Powermart Repository Manger Workflow Manager Designer Workflow Monitor Business Intelligence / Reporting What is Business Intelligence? Business Intelligence is the transformation of data into information you can use to drive your business. There are a number of vendors that have developed Business Intelligence software. Kerr-McGee uses Cognos. Business Intelligence Tools Metadata Management Business Intelligence Tools • Combination of applications and tools External • Provide analysis, presentation and Data Data reporting facilities for users Extract Cleansing • Tailored to meet diverse needs of and Integrationmgrs, analysts executives, Process • Data may reside in ODS, data warehouse or data mart • Issues – Source Systems How do you choose the right tool or tools? Data Operational Data Transformation Data Store Warehouse Plan/Forecast/ Analysis OLAP Query Datamarts Data Visualization Data Mining Information Analysis Project Management Management & & Quality Quality Assurance Assurance Project Operations & Systems Management Global / Dept/ Business Unit Summary and Analysis Executive Information Systems Categorize Information Needs Highly Summarized Senior Management Executive Moderately Management Summarized Business Analysts Market Researchers Integrated Operational Data Business Users Financial analysts, product managers, etc.. Salespersons, line managers, administrative staff, etc.. Information Delivery Mechanisms Wireless Mobile Web or C/S Predefined Mechanism Summaries Delivery Considerations Data Operational Trends Web or C/S Visualization Integrated with Operations? Detailed Reporting only? Real-time or based Directed on a Periodic Specialized Business CycleAnalysis (Financials) Algorithms Tethered or ‘disconnected’? C/S Standardized Web Operational Wireless Reporting Operational Trends (in millions) Net Revenues Net income Earnings per share Return on net revenues Ad-hoc Queries Cash & s/t investments Total Assets Shareholder Equity 1998 1999 $x,xxx $x,xxx xxx xxx 2000 $x,xxx xxx 2001 2002 $x,xxx $x,xxx xxx x,xxx x.xx x.xx x.xx x.xx xx% xx% xx% xx% xx% $xxx $xxx $xxx $xxx $xxx x.xx $xxx $xxx $xxx $xxx $xxx xxx xxx xxx xxx x,xxx Web or C/S Operational Trends B.I. Infrastructure All things Cognos - Cognos is a vendor. The suite of applications we’ve bought and use from Cognos are: - Access Manager – (Security) - Upfront – (Portal – http:/intranet/kmbi) - PowerPlay – (reports/cubes) - Impromptu – (Web based PDF reports) - NoticeCast – (conditional report notifications) - *Visualizer – (graphic depictions of data warehouse/cube information) *What we will cover. Terminology Cube - A multidimensional way to analyze information, designed to provide quick answers to the who, what, why, when, and where business questions. Drill Down - Going from a summarized view to a more detailed view of information within the same cube Drill Across - Linking data from One Subject Area to Another (General Ledger to Accounts Payable) Drill Through - Linking to source data using selected filters Powerplay Web - On-Line Analysis Tool for cubes (slice/dice, drill down, drill across & drill through) Newsbox -A web based folder used to store views of data (reports). Every KMBI user has their own personal newsbox. Cognos - Upfront - Upfront - Portal Management Cognos - PowerPlay - PowerPlay – web reports/slicing and dicing/data analysis, based on cubes. More Information on Cognos website: http://www.cognos.com/products/businessintelligence/analysis/ Cognos - Impromptu - Impromptu – printable reports (in PDF) that may/or may not be produced with prompts for filtered information. Cognos - Visualizer - Visualizer – interactive graphic depictions of data warehouse/cube information Developing Visualizations • Initial Project meeting should include: – Client - gives input on look and feel, data requirements, timelines – Project Manager – ensures project is feasible within budget and time restraints at the onset and through out the project. – Data Warehouse Architect – ensures all the needed data is in the data warehouse. – Business Intelligence Specialist (cube builder) – ensures all the needed data is in the cube, in the correct format – Web Interface Specialist (visualization builder) – works with cube builde and client to established look and feel, navigation, chart styles, etc.. Day in the Life Typical Day