Cubes for Flat Table Land
Michael P. Antonovich
http://SharePointMike.wordpress.com
#SharePointMikeA

My Published Books
• User’s Guide to the Apple ][ – 1983
• FoxPro 2 Programming Guide – 1992
• Debugging and Maintaining FoxPro – 1992
• Using Visual FoxPro 3.0 – 1995
• Using Visual FoxPro 5.0 – 1996
• Office and SharePoint User’s Guide – 2007
• Office and SharePoint User’s Guide – 2010

Speaker at:
• Code Camp 2009, 2010, 2011, 2012 Orlando
• SharePoint Saturday 2011 & 2012 Tampa, 2012 Orlando
• SQL Saturday - #1, #4, #8, #10, #15, #16, #21, #32, #38, #49, #62, #74, #79, #85, #86, #110, #130, #151, #168
• IT PRCamp – Jacksonville 2012

OPASS Mtg, October 24, 2012

Some Basic BI Terminology
IMPORTANT BI TERMS
• Aggregate: A mathematical function that allows you to summarize values of an attribute
• Dimension: A dimension is essentially a look-up table that may define a hierarchy or drill-down path such as Year > Quarter > Month
• Measure: A measure is something that identifies a value
• Fact: A fact is another term for a measure that contains numeric data that can be grouped along one or more dimensional hierarchies
• Star Schema: All dimension tables radiate out from a single fact table
• Snowflake Schema: One fact table may relate to another fact table before relating to dimension tables. One dimension table can also have a related dimension table
A Pivot table or chart is usually based around a single fact table

Two Models in SSAS
• Multidimensional Model
    • No major functionality changes since 2008 R2
• Tabular Model
    • Visually and functionally resembles PowerPivot 2012
• Both can be installed as separate instances on the same server.

Advantages of the MultiDimensional Model
• Tested technology since SQL 2000
• Pre-calculated aggregates provide performance enhancements.
• Can handle larger data since it can store data on disk (MOLAP) or directly query the relational data source (ROLAP)
• Uses MDX, which is supported by many third-party client tools.

Disadvantages of the MultiDimensional Model
• The model is getting ‘old’ and is not being revised (designed for 32-bit, row-based data and disk storage).
• MDX is perceived as being difficult to learn.
• Processing a multidimensional model can result in substantial downtime for large models.
• Changes to one table require the entire model to be reprocessed.
• Not compatible with PowerPivot

Advantages of the Tabular Model
• A 100% memory-based model provides greater performance.
• The xVelocity analytics column-based engine offers significant query performance improvements.
• Queries and formulas use DAX, which is ‘easier’ to learn than MDX. (MDX is also supported.)
• Queries data from many different data sources.

Disadvantages of the Tabular Model
• Does not support many-to-many relationships.
• Does not support true role-playing dimensions.
• Does not support cell-level security.
• Does not support security on measures.
• Does not support translations of metadata for locales.
• Does not support ragged hierarchies.

Which to Choose?
• For most applications (60-70%) either model will work.
• Do you currently have a model in Multidimensional mode?
• Are you just learning Analysis Services?
• Licensing issues?
• Compatibility with PowerPivot?
• Hardware?
• Performance issues?

SSAS Tabular Uses DAX
• DAX stands for Data Analysis Expressions
• DAX is used to:
    • Create calculated columns
    • Create custom measures

Basic Syntax
• DAX expressions always begin with an equal sign: =
• Column references can be qualified or unqualified
    TableName[ColumnName]
    [ColumnName]

DAX Data Types
• Integer
• Real
• Currency
• Date (DateTime)
• TRUE/FALSE (Boolean)
• Text

DAX Operators
• +, -, *, /
• =, <>
• >, <
• >=, <=
• &
• AND &&
• OR ||
• NOT !
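
The syntax rules, column references, and operators listed above can be sketched as a few calculated-column formulas. This is only an illustration: the table and column names (FactSales, SalesAmount, TotalCost, SalesQuantity, ReturnQuantity, DiscountAmount, ReturnAmount, BrandName, ProductName) are assumptions loosely modeled on a Contoso-style sales database and may not match your model. Each formula would be entered on its own in a calculated column; the comment lines are annotations for the reader.

```dax
-- Qualified references: TableName[ColumnName]
= FactSales[SalesAmount] - FactSales[TotalCost]

-- Unqualified references: [ColumnName], resolved against the current table
= [SalesAmount] - [TotalCost]

-- Text concatenation with & (assumed columns in a product dimension table)
= [BrandName] & " " & [ProductName]

-- Logical AND (&&) inside IF, evaluated once per row
= IF([SalesQuantity] > 0 && [ReturnQuantity] = 0, "No returns", "Check row")

-- Logical OR (||) producing a TRUE/FALSE (Boolean) column
= [DiscountAmount] > 0 || [ReturnAmount] > 0
```

Qualified references are the safer habit once a formula reaches across tables, since an unqualified name only resolves unambiguously within the table that owns the column.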

DAX Functions
• 2010 version consisted of 135 functions
    • 71 functions are similar to Excel functions
        • 69 have the same name – 2 do not
            • Excel TEXT → DAX FORMAT
            • Excel DATEDIF → DAX YEARFRAC
    • 64 functions are unique to DAX
        • Aggregate data functions
        • Date-related functions
• 2012 version has a little over 170 functions (no, I will not cover them all today)

Types of DAX Calculations
• Simple Calculations
    • Calculated columns within fact tables
    • Calculated columns for dimension tables
    • Calculated columns between tables
    • Calculated columns to eliminate lookup tables
    • Calculated columns to serve as links to tables using multiple columns
    • (Calculated columns are calculated for every row in the table)
    • Context is the row
• Aggregate Calculations
    • Calculate unique measures
    • Context is in the evaluation of the pivot data
    • (Aggregate measures are only calculated for the displayed data in the Pivot table)

Tabular Model Can Import From
• Microsoft Access 2003, 2007, 2010
• Microsoft SQL Server 2005, 2008, 2008 R2
• Oracle Relational DB 9i, 10g, 11g
• Teradata V2R6, V12
• IBM Relational Database 8.1
• Sybase Relational Databases
• Many other ODBC Databases
• Text files (.txt, .tab, .csv)
• Analysis Services Cubes from SQL Server
• Data Feeds using Atom 1.0 Format
• Excel Files from 97-2003, 2007, 2010

Demo 1a: Retrieve Data from Multiple Sources
• Open C:\Contoso2012\Stores.xlsx and rename to C:\Contoso2012\SQLSaturday1.xlsx
• Go to PowerPivot window and load SQL Server database: Contoso2012 using all tables
• Add to Data Model, Stores from the current spreadsheet. This is a linked table.
• Add Access database ProductCategories.
• Add Excel file: Geography.xlsx

Demo 1a: Load Data

Loading Data into the Tabular Model Demo

Demo 1b: Create Relationships

Demo 1c: Show Diagram View

Creating Relations Between Tables Demo

Technical vs. Useless Columns
• Technical Columns
    • Used to link tables (IDs)
    • Used to calculate other columns
    • Hide from Pivot Table Field List
• Useless Columns
    • Came in when data was imported from the data source
    • Not used in pivot table or to link tables
    • Delete to improve performance

Demo 2: Eliminate Useless Columns and Hide Technical Columns

Denormalize Data Model
• Eliminate tables and columns that are not going to be used.
• Flatten the structure by creating calculated dimension attributes based on values in other tables.
• Hide columns that are used in calculations but that users no longer need to see.

Create a Hierarchy
• Predefine common hierarchies for users
• Hierarchies are defined from the largest grouping to the smallest: Product Category > Product Subcategory > Product
• After defining the hierarchy, you can remove the individual columns used to define the hierarchy.

Demo 3: Define a Product Hierarchy

Demo 4: Demo of Cube (so far) Using Excel Pivot

Building Hierarchies Demo

Create a Calculated Measure
• For those times when a built-in measure just isn’t enough…
• …you need a custom measure created using DAX to satisfy the need.
• What is new in 2012 is that calculated measures can now be defined in the calculation area of the fact table.
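
As a rough sketch of what typing a custom measure into the calculation area might look like, the two definitions below use the `MeasureName := expression` form. They are illustrations only, not the formulas from the demo: the table and column names (FactSales[SalesAmount], DimChannel[ChannelName]) and the channel value "Store" are assumptions based on a Contoso-style model, and the second measure assumes a relationship exists between the two tables.

```dax
-- Hypothetical measure: sums an assumed SalesAmount column
-- over whatever rows the pivot table's current filters leave visible
TotalSales := SUM(FactSales[SalesAmount])

-- Hypothetical measure: the same sum, with CALCULATE adding a filter
-- on an assumed channel-name column before the sum is evaluated
StoreSales := CALCULATE(
    SUM(FactSales[SalesAmount]),
    DimChannel[ChannelName] = "Store"
)
```

Once defined, a measure can be referenced by name in other measures (for example `[StoreSales]`), which keeps later calculations short and avoids repeating the filter logic.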

Creating Custom DAX Measures
For example, suppose you want to display the percent increase or decrease in sales by product in your stores channel for this year vs. last year. You need a new measure to calculate store sales:
By default, the above calculates sales for the entire table. However, in the pivot table, we can use the dimension YEAR as a filter or slicer to perform the calculation for each year in the table.

Dimensions Serve as Filters
Use Time Functions to calculate measures for other time periods.
The above expression allows us to reference an existing expression but apply an additional filter to the calculation of StoreSales (which is already filtering on the channel: store). That additional filter in this case calculates the store sales for the date one year prior to the current date of the record.

Calculate the Percent Change in Sales
Given the prior two calculated measures, store sales for the current year and store sales for the prior year for each period in the cube, you can now calculate the percent change using an expression like:

Using Error Checking
Actually, the above sample works only because the slicer limited the calculations to specific years. In general, however, you need to check equations for error conditions such as dividing by zero, using a formula more like:

Demo 5a: Define a Calculated Measure

Demo 5b: The Pivot Table with Calculated Measures

Turning a Calculated Measure into a KPI
KPIs are nothing more than calculated measures in a fact table that are compared to a target value to determine whether the value is good or bad.

Adding KPI Calculations
What is a KPI?
Key Performance Indicator
Key Performance Indicators provide information at a glance to indicate the status of a measurable fact about your company/organization

Adding KPI Calculations
A KPI Needs:
• A Base Value
• A Target Value
• A number of status intervals
• Thresholds for each interval
• Symbols to use to indicate status

Demo 6a: Using DAX to Create a KPI

Demo 6b: Using DAX to Create a KPI

Adding a KPI Demo

Sorting by Other Fields
You may have noticed in the previous demo that while the rows displayed the sales by month, the months were sorted alphabetically, not chronologically. No one will accept that. How can you sort the months correctly?
PowerPivot 2012 introduces a Sort by Another Column feature!

Define a Calculated Column with the Month #

Associate the Month Label with the New Column

Demo 6c: Correctly Ordered Months

Sorting on Alternate Columns Demo

Useful Links
My blog is running a series of articles on working with the Tabular model
http://SharePointMike.wordpress.com
Using the SSAS Tabular Model Week 1
http://sharepointmike.wordpress.com/2012/10/06/using-the-ssas-tabular-model-week-1/
Gathering Data From Different Data Sources Week 2
http://sharepointmike.wordpress.com/2012/10/13/using-the-ssas-tabular-model-week-2/
Displaying your first Pivot Table from a Tabular Model
http://sharepointmike.wordpress.com/2012/10/20/using-the-ssas-tabular-model-week-3/
Hierarchies
http://sharepointmike.wordpress.com/2012/10/27/using-the-ssas-tabular-model-week-4-hierarchies
http://sharepointmike.wordpress.com/2012/11/03/using-the-ssas-tabular-model-week-5-hierarchies-2
KPIs
http://sharepointmike.wordpress.com/2012/11/10/using-the-ssas-tabular-model-week-6-kpi
Clean-Up in Week 7
http://sharepointmike.wordpress.com/2012/11/17/using-the-ssas-tabular-model-clean-up-in-week-7
DAX On-line Function Reference
http://technet.microsoft.com/en-us/library/ee634396.aspx

GOT QUESTIONS?

Thank You
Don’t forget your evaluations.
Mike@micmin.org
Blog site: http://sharepointmike.wordpress.com/