Bus Matrix… the foundation of your Data Warehouse
Bill Anton
Prime Data Intelligence

About Me
• I Love Data!
• …also, Microsoft DW/BI (MCTS/MCITP, MCSA/MCSE)
• Independent Consultant @ Prime Data Intelligence, LLC
• Atlanta BI SQL Server Users Group
• Twitter: @SQLbyoBI
• Blog: http://byoBI.com
• Email: william.anton@gmail.com

What we will cover today
• Dimensional Modeling 101
  • What, Why, How
  • Common Challenges
• Bus Matrix
  • What is it?
  • How does it help?
  • Examples

What is Dimensional Modeling?
Facts
• additive amounts
• E.g. Sales amount, inventory quantity
• SUM, AVERAGE, MAX, MIN, COUNT
Dimensions
• descriptive attributes
• E.g. Date, Product, Location, Customer
• GROUP BY <attribute>, <attribute>, etc.

What is Dimensional Modeling?
[Diagram: “Star Schema”, one FACT table in the center surrounded by five DIMENSION tables]

What is Dimensional Modeling?
• Denormalization
  • “Repeating Values”
  • Opposite of “normalized” (e.g. 3rd Normal Form)
  • Optimized for reads (not writes)

Dimensional Modeling 101
Question: What are the most common types of Data Warehouse methodologies/architectures?
• Kimball
• Inmon
• Data Vault

Dimensional Modeling 101
Question: For which of these DW methodologies should you include a dimensional model?
Answer: All of them (Kimball, Inmon, Data Vault)

Kimball Dimensional DW
Source → Stage → DW → Cubes

Inmon 3NF EDW + Data Mart(s)
Source → Stage → 3NF DW → Data Marts → Cubes

Data Vault + Data Mart(s)
Source → Stage → Data Vault DW → Data Marts → Cubes

Why Dimensional Modeling
• Intuitive to Business Users
  • Simpler than OLTP/3NF
  • Rise of Self-Service (E.g. Power Pivot, Power View)
• Iterative Development
  • “Agile”
• Performance
  • Optimized for analytical queries, e.g. sales amount by product in 2013 for top 10 all-time customers
• And many more…
  • See Teo Lachev’s “WHY SEMANTIC LAYER” newsletter: http://www.prologika.com/Newsroom/Newsletter2013Fall.aspx

Intuitive to Business Users
[Callouts on the slide label the words in each question as Dimensions or Facts]
• How many bikes did we sell last year?
• Do we sell more bikes to single or married females?
• What was our most/least profitable product this year?
• What was the Average Monthly Gross Margin Return on Inventory Investment (GMROII) by Product Category for the trailing 6 months?
  • It’s Complicated

Star-Schema
[Diagram: Sales FACT surrounded by the DIMENSIONs Date, Customer, Product, Store, Sales Person]

1 “Star” per Fact table
[Diagram: Sales Process star (Sales fact with Date, Customer, Product, Store, Sales Person) beside Inventory Process star (Inventory fact with Date, Product, Store)]

Facts are related through dimensions…
[Diagram: the Sales Process and Inventory Process stars, linked through the dimensions they share (Date, Product, Store)]

Facts are related through dimensions… “Conformed Dimensions”
• A conformed dimension is a set of data attributes that have been physically referenced by multiple fact tables using the same key value to refer to the same structure, attributes, domain values, definitions and concepts.
• Dimensions are conformed when they are either exactly the same (including keys) or one is a perfect subset of the other.
• Dimension tables are NOT conformed if the attributes are labeled differently or contain different values.
[Diagram: Sales and Inventory facts sharing the dimensions Customer, Date, Product, Sales Person, Store]

Dimensions: Conformed vs Unconformed

Revisiting Average Monthly Gross Margin Return on Inventory Investment (GMROII)
[Diagram: Sales fact (Customer, Date, Sales Person, Product, Store) and Inventory fact (Date, Product, Store)]
Average Monthly GMROII = (Profit for total time period) / (Sum of each month ending inventory cost)
What was the Average Monthly Gross Margin Return on Inventory Investment (GMROII) by Product Category for the trailing 6 months?
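
To make the GMROII question concrete, here is a minimal sketch of a "drill-across" calculation: each fact table is aggregated separately to Product Category through the same Product dimension, and only then are the two results divided. All table contents, keys, and function names below are hypothetical, and the sketch assumes the slide's two labels are the numerator (profit for the period) and the denominator (sum of month-ending inventory cost).

```python
from collections import defaultdict

# Hypothetical conformed Product dimension: product key -> category.
product_dim = {1: "Bikes", 2: "Bikes", 3: "Accessories"}

# Hypothetical Sales fact rows: (month, product_key, gross_margin).
sales_fact = [
    ("2013-07", 1, 500.0), ("2013-08", 1, 700.0),
    ("2013-07", 2, 300.0), ("2013-08", 2, 100.0),
    ("2013-07", 3, 40.0), ("2013-08", 3, 60.0),
]

# Hypothetical Inventory fact rows: (month, product_key, month_end_inventory_cost).
inventory_fact = [
    ("2013-07", 1, 2000.0), ("2013-08", 1, 1500.0),
    ("2013-07", 2, 300.0), ("2013-08", 2, 200.0),
    ("2013-07", 3, 100.0), ("2013-08", 3, 100.0),
]


def total_by_category(fact_rows, dim):
    """Sum a fact's measure per category, looked up through the shared dimension."""
    totals = defaultdict(float)
    for _month, product_key, amount in fact_rows:
        totals[dim[product_key]] += amount
    return dict(totals)


def gmroii_by_category(sales, inventory, dim):
    """Aggregate each fact on its own, then divide the matching category totals."""
    margin = total_by_category(sales, dim)
    inventory_cost = total_by_category(inventory, dim)
    return {
        category: margin[category] / inventory_cost[category]
        for category in margin
        if inventory_cost.get(category)  # skip categories with no inventory cost
    }


result = gmroii_by_category(sales_fact, inventory_fact, product_dim)
print(result)
```

The division is only meaningful because both facts use the same product keys and the same category values. If the Inventory process carried its own, differently labeled product list, the two totals could not be lined up without manual mapping.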
Where things start to get complex…
• 1 Star per Fact table
• Multiple Fact tables per business process
• Multiple business processes in an enterprise

Dimensional Model becomes a “Galaxy of Stars”
• Finance
• Production
• Sales
• Distribution
• HR

ER Diagram: Adventure Works Sample DW
[Image: ER diagram]

For bigger Data Warehouses…
[Images: the ER diagram above turns into a far larger, denser one]

Variety of Problems to Overcome with Dimensional Modeling
• Communication & Strategy
  • What’s the short term plan of attack?
  • What’s the long term plan of attack?
• Documentation
  • What’s in our Data Warehouse?
  • Business Users can’t read ER diagrams
  • Business Users are typically only familiar with 1 or 2 business processes
    • E.g. Sales User vs Inventory User; Warehouse Supervisor vs CEO
• Conforming Dimensions is hard…REALLY hard
  • So are changes (E.g. Impact Analysis)

What’s the Solution?
• Train business users to read ER Diagrams?
• Simplify Data Model?
  • Ignore certain business processes?
  • Don’t use Conformed Dimensions?
• Force business users to manually map data between processes?

What about a Bus Matrix?

What is a Bus Matrix?
2-dimensional visualization showing the intersection of facts and dimensions

Variety of Use-Cases for a Bus Matrix
• Documentation, Communication, Training
  • Facilitate User Adoption of BI tools
  • Communicate Expectations w/ Business
  • New users unfamiliar with new business process
• Team Development
  • Agile
  • Prioritization of Tasks
  • Divide & Conquer
• Road-Mapping
  • Prioritization of Business Processes in a Business Intelligence “Program”

Documentation for Business

Documentation for IT

Master Bus Matrix

Team Development
• Sprint 1: Internet Sales
• Sprint 2: Reseller Sales

Road-Mapping

When To Create a Bus Matrix
• During Requirements Gathering
  • Before You Start Development!
• Updated Over Time
  • Changes to Business Processes
  • New Source Systems (E.g. mergers/acquisitions)

How To Create a Bus Matrix
• Manual via Excel
• Automated via SSRS

Manual
• Only option when starting out ;-)
• Updates can be made quickly as requirements come in
• Adds development overhead, but the ROI is well worth it

Automated
• Reporting pack with drill-through to data dictionary information
• Can be based on Cube or Relational Database (*FK required)
• Incorporate query statistics to visualize common usage patterns
• Use MDS to allow SMEs to manage business definitions
Based on example from Alex Whittles: http://www.purplefrogsystems.com/blog/2010/09/olap-cube-documentation-in-ssrs-part-1/

QUESTIONS

References
• Twitter: @SQLbyoBI
• Blog: http://byoBI.com
• Email: william.anton@gmail.com
• http://byobi.com/blog/bus-matrix/
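
Backup: Bus Matrix as data
As a rough illustration of the "automated" idea, a bus matrix can be derived from a mapping of fact tables to the dimensions they reference. The sketch below hard-codes that mapping using the Sales and Inventory stars from earlier in the deck. In practice it would come from foreign-key or cube metadata, which is not shown here, and every function name is hypothetical.

```python
# Hypothetical metadata: fact table -> dimensions it references.
# In a real warehouse this would be read from foreign keys or cube metadata.
fact_to_dims = {
    "Sales": ["Date", "Customer", "Product", "Store", "Sales Person"],
    "Inventory": ["Date", "Product", "Store"],
}


def build_bus_matrix(mapping):
    """Return (dimension columns, {fact: {dimension: bool}}) for the mapping."""
    dims = []
    for fact_dims in mapping.values():
        for dim in fact_dims:
            if dim not in dims:  # keep first-seen column order
                dims.append(dim)
    matrix = {
        fact: {dim: dim in fact_dims for dim in dims}
        for fact, fact_dims in mapping.items()
    }
    return dims, matrix


def shared_dimensions(dims, matrix):
    """Dimensions referenced by more than one fact: candidates for conforming."""
    return [dim for dim in dims if sum(row[dim] for row in matrix.values()) > 1]


def render(dims, matrix):
    """Format the matrix as a plain-text grid with an X at each intersection."""
    width = max(len(name) for name in list(matrix) + dims) + 2
    lines = ["".ljust(width) + "".join(dim.ljust(width) for dim in dims)]
    for fact, row in matrix.items():
        cells = "".join(("X" if row[dim] else "").ljust(width) for dim in dims)
        lines.append(fact.ljust(width) + cells)
    return "\n".join(line.rstrip() for line in lines)


dims, matrix = build_bus_matrix(fact_to_dims)
print(render(dims, matrix))
print("Shared:", shared_dimensions(dims, matrix))
```

Sharing a name does not make a dimension conformed. The keys, attributes, and values still have to match, so the "shared" list is only a starting point for that review.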