Leveraging the Microsoft Repository in Your Warehouse Saeed K. Rahimi Graduate Programs in Software University of St. Thomas DWSoft Corporation Contents Data Warehousing and Repository The Open Information Model and Meta Data Coalition Microsoft’s Data Warehousing Strategy Integrating the Repository Into Your Warehouse © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 2 Data Warehouse Architecture Operational Data Data warehouse Life Cycle DM • Extract • Cleans • Transform • Load DW DM DM Metadata Services SQL Server 7.5 - Object Persistence Services Repository © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 3 What is Metadata? (Continued) Information not just data What is 122599? Is it an integer number? A string? Christmas day 1999? Repository provides metadata about this: mmddyy © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 4 What Is Metadata? Information about your Systems Source systems, Warehouse systems, Processes Etc. Databases Source databases Warehouse databases Table Columns © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 5 What Is Metadata? Information about your Transformations From source system to data warehouse From data warehouse to source systems Calculations Total-Sales = amount*quantity – discount + shipping&handling © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 6 What Is Metadata? (Continued) OLAP information Sales cube TotalSales measure Time, product, geography dimensions Other information Reporting Tools Modeling Tools © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 7 Metadata Benefits Roadmap to information in the warehouse What is in the warehouse What it means How to get the information Who owns it When it was created How it was created/generated © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 8 Metadata Benefits All users that interact with the warehouse will benefit from enhanced information Analysts can better understand what data is available, and how it was calculated Maintains a history prior to current processes © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 9 Metadata Benefits (Continued) Provides a common frame of reference – “what does total sales mean” Provides a central storage location of warehouse knowledge (helps with turnover, growth) Allows enterprises to ensure data marts/warehouses are built using common terms and goals © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 10 Open Information Model The schema for the repository Common set of core definitions to promote sharing of information among different tools and vendors Covers basic set of sub-models Created with help/input from over 20 partners and reviewed by 300 Microsoft has transferred rights to evolve the OIM to the meta data coalition (MDC) © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 11 MDC OIM Components Current components Unified Modeling Language (UML) Relational database schema Data transformations Multidimensional schema (OLAP) English semantics Legacy databases Component descriptions Components under review Report definitions Entity/relationship diagrams Business engineering © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 12 Sample OIM Constructs Transformation – mapping from Source column to Target column TfmTransformationHasTargetColumns TfmTransformationHasSourceColumns ITfmTransformation TfmObjectSetConsistsOfObject ITfmTransformableObjectSet © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 ITfmTransformableObject (Columns) Page 13 Virtual Relationship This This A A B VS B C C D D © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 14 Virtual Properties This This A A a b VS B c C a b c d e B D d © Copyright 2000 Saeed K. Rahimi c C D e d DAMA Presentation, August 2000 e Page 15 Microsoft’s Strategy Impact Prior to SQL server 7.0, DW was primarily available only to very large companies Tools were expensive, expertise hard to find Microsoft has lowered the bar by bundling components (with SQL server for free) and making DW available to many more companies © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 16 Microsoft Warehouse Components Data transformation services Extract, transform, load OLAP services Multidimensional server and analysis Repository Information sharing English query English to SQL translation © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 17 Repository Maintenance Importing information into the repository has always been a challenging task Information needs to be mapped to the model components Information capture needs to be automated Once in the repository, maintaining versions of the information is needed New information load can not destroy the information already in the repository © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 18 Loading the Repository Data Transformation Services OLAP Services Star Schema Database OLTP Database English Query Application Transform Other Sources OLAP DB Schema DB Schema Semantics Other Repository © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 19 Loading the Repository DTS can save to repository OLAP information can be saved to repository with SQL Server 2000 OLE DB and ODBC data sources and related information can be imported into repository with SQL Server 2000 © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 20 Implementation Issues DTS doesn’t provide complete source to target mappings for certain transformations For example When a query used as source, Interfaces to many third party tools incomplete or not available Modeling tools ERWin Visio PowerDesigner Warehouse tools Business Objects Information Advantage © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 21 Repository Features Extensible metamodel Version management Dynamic COM based interface Batch XIF file based interface Workspace concept Some pre-built interfaces to data warehouse tools © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 22 Keys to Success Ensure repository information is current Provide access to all users of the warehouse Tackle implementation in well defined, small steps Designate someone with the responsibility and authority to manage the repository Maintain long term focus © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 23 Access to All Users Show users what they want to see Provide simple, easy to understand information Make warehouse information easily available “in context” Use familiar interface and UI concepts Web based access Client/server based access Provide custom integration to existing tools and interfaces when necessary © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 24 Modes of Access Administrative Access Management Meta-model (schema) changes Meta-data loading and maintenance End-user Access Browsing Navigation Impact analysis © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 25 Types of Access Local and over the LAN access Metamodel management Repository bulk loading Interface from other tools (loading meta-data) Remote and over the WEB access Browser based accessible from anywhere on the globe Navigation oriented Interface from/to other tools May be minor editing capabilities © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 26 Access Control User Classification Different Class of Users Need to Access Different Objects in the Repository Different User Profiles Different Read/write Access Control Possible User Classes Business Executive Casual User Database Administrator Data Warehouse Administrator Repository Administrator © Copyright 2000 Saeed K. Rahimi DAMA Presentation, August 2000 Page 27 Questions? Comments or discussions? Emails: skrahimi@stthomas.edu srahimi@dwsoft.com