Cubes for Flat Table Landers

Michael P. Antonovich
http://SharePointMike.wordpress.com
#SharePointMikeA
My Published Books
User’s Guide to the Apple ][ - 1983
FoxPro 2 Programming Guide – 1992
Debugging and Maintaining FoxPro – 1992
Using Visual FoxPro 3.0 – 1995
Using Visual FoxPro 5.0 – 1996
Office and SharePoint User’s Guide – 2007
Office and SharePoint User’s Guide – 2010
Speaker at:
Code Camp 2009, 2010, 2011, 2012 Orlando
SharePoint Saturday 2011 & 2012 Tampa, 2012 Orlando
SQL Saturday - #1, #4, #8, #10, #15, #16, #21, #32, #38, #49, #62, #74, #79, #85, #86, #110, #130,
#151, #168
IT PRCamp – Jacksonville 2012
OPASS Mtg
2
October 24, 2012
Some Basic BI Terminology
IMPORTANT BI TERMS
Aggregate: A mathematical function that allows you to summarize values of an attribute
Dimension: A dimension is essentially a look-up table that may define a hierarchy or drill-down path such as Year > Quarter > Month
Measure: A measure is something that identifies a value
Fact: A fact is another term for a measure that contains numeric data that can be grouped along one or more dimensional hierarchies
Star Schema: All dimension tables radiate out from a single fact table
Snowflake Schema: One fact table may relate to another fact table before relating to dimension tables. One dimension table can also have a related dimension table
A Pivot table or chart is usually based around a single fact table
Two Models in SSAS
Multidimensional Model
No major functionality changes since 2008 R2
Tabular Model
Visually and functionally resembles PowerPivot 2012
Both can be installed as separate instances on the
same server.
Advantages of the
MultiDimensional Model
Tested technology since SQL 2000
Pre-calculated aggregates provide performance
enhancements.
Can handle larger data since it can store data on disk
(MOLAP) or directly query the relational data source
(ROLAP)
Uses MDX which is supported by many 3rd party
client tools.
Disadvantages of the
MultiDimensional Model
The model is getting ‘old’ and is not being revised (it was designed
for 32-bit, row-based data and disk storage).
MDX is perceived as being difficult to learn.
Processing a multidimensional model can result in
substantial downtime for large models.
Changes to one table require the entire model to be
reprocessed.
Not compatible with PowerPivot
Advantages of the
Tabular Model
A 100% memory-based model provides greater
performance.
The xVelocity analytics column based engine offers
significant query performance improvements.
Queries and formulas use DAX which is ‘easier’ to
learn than MDX. (MDX is also supported)
Queries data from many different data sources.
Disadvantages of the
Tabular Model
Does not support many-to-many relationships.
Does not support true role-playing dimensions.
Does not support cell-level security.
Does not support security on measures.
Does not support translations of metadata for locales.
Does not support ragged hierarchies.
Which to Choose?
For most applications (60-70%) either model will work.
Do you currently have a model in Multidimensional mode?
Are you just learning Analysis Services?
Licensing issues?
Compatibility with PowerPivot?
Hardware?
Performance Issues?
SSAS Tabular Uses DAX
DAX stands for Data Analysis Expressions
DAX is used to:
Create calculated columns
Create custom measures
Basic Syntax
DAX expressions always begin with an equal sign: =
Column References can be qualified or unqualified
TableName[ColumnName]
[ColumnName]
DAX Data Types
• Integer
• Real
• Currency
• Date (DateTime)
• TRUE/FALSE (Boolean)
• Text
DAX Operators
• +, -, *, /
• =, <>
• >, <
• >=, <=
• &
• AND &&
• OR ||
• NOT !
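The syntax rules above might look like this in practice. This is a minimal sketch of a few calculated-column formulas, each entered separately. The table and column names (FactSales, SalesAmount, TotalCost, SalesQuantity, StoreKey) are assumptions based on the Contoso sample database and may differ in your model.

```dax
// Fully qualified column references (calculated column in FactSales)
= FactSales[SalesAmount] - FactSales[TotalCost]

// Unqualified references resolve against the table that holds the column
= [SalesAmount] - [TotalCost]

// Comparison and logical operators combined inside IF
= IF([SalesQuantity] > 0 && [SalesAmount] > 0, "Sold", "No sale")

// & concatenates text; the numeric key is converted to text
= "Store " & [StoreKey]
```

Qualified references are the safer habit when a formula touches more than one table, since the same column name can exist in several tables.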
DAX Functions
2010 Version consisted of 135 functions
71 functions are similar to Excel functions
69 have the same name – 2 do not
TEXT → FORMAT
DATEDIF → YEARFRAC
64 functions are unique to DAX
Aggregate data functions
Date related functions
2012 Version has a little over 170 functions (no, I will not cover
them all today)
Types of DAX Calculations
Simple Calculations
Calculated columns within fact tables
Calculated columns for dimension tables
Calculated columns between tables
Calculated columns to eliminate lookup tables
Calculated columns to serve as links to tables using multiple columns
Context is the row
(Calculated columns are calculated for every row in the table)
Aggregate Calculations
Calculate unique measures
Context is in the evaluation of the pivot data
(Aggregate measures are only calculated for the displayed data in the Pivot table)
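The difference between the two kinds of calculation can be sketched as follows. The names FactSales, SalesAmount, TotalCost, and TotalSales are assumptions for illustration (Contoso-style), not formulas taken from the slides.

```dax
// Calculated column in FactSales: evaluated once for every row,
// using the values of that row (row context)
= [SalesAmount] - [TotalCost]

// Measure: evaluated for each cell of the pivot table, over the rows
// left visible by the current row, column, filter, and slicer selections
TotalSales := SUM(FactSales[SalesAmount])
```

The calculated column is stored with the table when the model is processed, while the measure is computed only when a pivot table asks for it.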
Tabular Model Can Import From
Microsoft Access 2003, 2007, 2010
Many other ODBC Databases
Microsoft SQL Server 2005, 2008, 2008 R2
Text files (.txt, .tab, .csv)
Oracle Relational DB 9i, 10g, 11g
Analysis Services Cubes from SQL Server
Teradata V2R6, V12
Data Feeds using Atom 1.0 Format
IBM Relational Database 8.1
Excel Files from 97-2003, 2007, 2010
Sybase Relational Databases
Demo 1a: Retrieve Data from Multiple Sources
• Open C:\Contoso2012\Stores.xlsx and rename it to C:\Contoso2012\SQLSaturday1.xlsx
• Go to the PowerPivot window and load the SQL Server database Contoso2012, using all tables
• Add Stores from the current spreadsheet to the Data Model. This is a linked table.
• Add the Access database ProductCategories.
• Add the Excel file Geography.xlsx
Demo 1a: Load Data
Loading Data into the
Tabular Model
Demo
Demo 1b: Create
Relationships
Demo 1c: Show Diagram
View
Creating Relations
Between Tables
Demo
Technical vs. Useless
Columns
Technical Columns
Used to link tables (IDs)
Used to calculate other columns
Hide from Pivot Table Field List
Useless Columns
Came in when data was imported from the data source
Not used in pivot table or to link tables
Delete to improve performance
Demo 2: Eliminate Useless
Columns and Hide
Technical Columns
Denormalize Data
Model
Eliminate tables and columns that are not going to be
used.
Flatten the structure by creating calculated dimension
attributes based on values in other tables.
Hide columns that are used in calculations but that users no
longer need to see.
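Flattening a snowflaked product dimension could be sketched with the RELATED function, which follows existing many-to-one relationships to fetch a value from a related table. The table and column names below are assumptions based on the Contoso sample database, and the sketch presumes relationships already exist from DimProduct to DimProductSubcategory and on to DimProductCategory.

```dax
// Calculated columns added to DimProduct, one formula per new column

// Pull the subcategory name into the product table
= RELATED(DimProductSubcategory[ProductSubcategoryName])

// Pull the category name across the chained relationship
= RELATED(DimProductCategory[ProductCategoryName])
```

Once both columns exist in DimProduct, the two lookup tables no longer need to be visible to users.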
Create a Hierarchy
Predefine common hierarchies for users
Hierarchies are defined from the largest grouping to
the smallest:
Product Category
Product Subcategory
Product
After defining the hierarchy, you can hide the
individual columns used to define the hierarchy.
Demo 3: Define a Product
Hierarchy
Demo 4: Demo of Cube (so
far) Using Excel Pivot
Building Hierarchies
Demo
Create a Calculated
Measure
For those times when a built-in measure just isn’t enough…
…you need a custom measure created using DAX to satisfy the need.
What is new in 2012 is that calculated measures can now be defined
in the calculation area of the fact table.
Creating Custom DAX
Measures
For example, suppose you want to display the
percent increase or decrease in sales by product in
your stores channel for this year vs last year.
You need a new measure to calculate store sales:
By default, the above calculates sales for the entire
table. However, in the pivot table, we can use the
dimension: YEAR as a filter or slicer to perform the
calculation by each year in the table.
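A store sales measure of the kind described above might look like the following. This is a sketch, not the original slide's formula. The measure name StoreSales appears in the slides, but FactSales[SalesAmount], DimChannel[ChannelName], and the channel value "Store" are assumptions based on the Contoso sample database.

```dax
// Sum the sales amount, keeping only rows sold through the store channel
StoreSales := CALCULATE(
    SUM(FactSales[SalesAmount]),
    DimChannel[ChannelName] = "Store"
)
```

CALCULATE applies the channel filter on top of whatever filters the pivot table already supplies, which is why adding Year as a slicer breaks the total down by year without changing the formula.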
Dimensions Serve as Filters
Use Time Functions to calculate measures for other time
periods.
The above expression allows us to reference an existing
expression but apply an additional filter to the calculation of
StoreSales (which is already filtering on the channel: store).
That additional filter in this case calculates the Store sales
for the date one year prior to the current date of the record.
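A prior-year version of the measure could be sketched with the DATEADD time function. The measure name StoreSalesLastYear and the date column DimDate[Datekey] are hypothetical names chosen for illustration. The sketch assumes DimDate holds a contiguous range of dates and is related to the fact table.

```dax
// Re-evaluate the existing StoreSales measure with the dates
// in the current context shifted back one year
StoreSalesLastYear := CALCULATE(
    [StoreSales],
    DATEADD(DimDate[Datekey], -1, YEAR)
)
```

SAMEPERIODLASTYEAR(DimDate[Datekey]) is another time function that could serve as the filter argument here.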
Calculate the Percent
Change in Sales
Given the prior two calculated measures, store sales for the
current year and store sales for the prior year for each
period in the cube, you can now calculate the percent
change using an expression like:
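One possible form of that expression, as a sketch rather than the original slide's formula. It assumes the two measures are named StoreSales and StoreSalesLastYear; the second name and the name PctChange are hypothetical.

```dax
// Change from the prior year, expressed as a fraction of the prior year
PctChange := ([StoreSales] - [StoreSalesLastYear]) / [StoreSalesLastYear]
```

Formatting the measure as a percentage in the model displays a result such as 0.125 as 12.5%.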
Using Error Checking
Actually, the above sample works only because the slicer
limited the calculations to specific years. However, in
general, you need to check equations for error
conditions like dividing by zero by using a formula more
like:
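A guarded variant might be sketched as follows, not as the original slide's formula. It assumes the measures are named StoreSales and StoreSalesLastYear; the second name and the name PctChangeSafe are hypothetical.

```dax
// Return BLANK when there are no prior-year sales to divide by;
// a blank prior-year value also compares equal to 0 in DAX
PctChangeSafe := IF(
    [StoreSalesLastYear] = 0,
    BLANK(),
    ([StoreSales] - [StoreSalesLastYear]) / [StoreSalesLastYear]
)
```

Returning BLANK rather than 0 keeps rows with no prior-year data out of the pivot table instead of showing a misleading 0% change.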
Demo 5a: Define a
Calculated Measure
Demo 5b: The Pivot Table
with Calculated Measures
Turning a Calculated
Measure into a KPI
KPIs are nothing more than calculated measures in
a fact table that are compared to a target value to
determine whether the value is good or bad.
Adding KPI Calculations
What is a KPI?
Key Performance Indicator
Key Performance Indicators provide information at
a glance to indicate the status of a measurable fact
about your company/organization
Adding KPI Calculations
A KPI Needs:
• A Base Value
• A Target Value
• A number of status intervals
• Thresholds for each interval
• Symbols to use to indicate status
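The base and target values in the list above can each be supplied by a measure. As a sketch under stated assumptions: the base value is an existing measure named StoreSales, and the target is a hypothetical goal of 10% growth over a prior-year measure assumed to be named StoreSalesLastYear. The 10% figure and the name StoreSalesTarget are invented for illustration only.

```dax
// Target value for the KPI: prior-year store sales plus 10%
StoreSalesTarget := [StoreSalesLastYear] * 1.10
```

The status intervals, their thresholds, and the symbols are not written in DAX; they are chosen in the KPI dialog when the base measure is turned into a KPI.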
Demo 6a: Using DAX to
Create a KPI
Demo 6b: Using DAX to
Create a KPI
Adding a KPI
Demo
Sorting by Other Fields
You may have noticed in the previous demo that while the rows displayed
the sales by month, the months were sorted alphabetically, not
chronologically.
No one will accept that.
How can you sort the months correctly?
PowerPivot 2012 introduces a Sort by Another Column feature!
Define a Calculated
Column with the Month #
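A calculated column along these lines could supply the month number. This is a sketch that assumes the date table is named DimDate and its date column is Datekey, as in the Contoso sample database.

```dax
// Calculated column in DimDate: month number from 1 (January) to 12 (December)
= MONTH(DimDate[Datekey])
```

The month label column is then set to sort by this numeric column, so January sorts first rather than April.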
Associate the Month Label
with the New Column
Demo 6c: Correctly
Ordered Months
Sorting on Alternate
Columns
Demo
Useful Links
My blog is running a series of articles on working with the Tabular model
http://SharePointMike.wordpress.com
Using the SSAS Tabular Model Week 1
http://sharepointmike.wordpress.com/2012/10/06/using-the-ssas-tabular-model-week-1/
Gathering Data From Different Data Sources Week 2
http://sharepointmike.wordpress.com/2012/10/13/using-the-ssas-tabular-model-week-2/
Displaying your first Pivot Table from a Tabular Model
http://sharepointmike.wordpress.com/2012/10/20/using-the-ssas-tabular-model-week-3/
Hierarchies
http://sharepointmike.wordpress.com/2012/10/27/using-the-ssas-tabular-model-week-4-hierarchies
http://sharepointmike.wordpress.com/2012/11/03/using-the-ssas-tabular-model-week-5-hierarchies-2
KPIs
http://sharepointmike.wordpress.com/2012/11/10/using-the-ssas-tabular-model-week-6-kpi
Clean-Up in Week 7
http://sharepointmike.wordpress.com/2012/11/17/using-the-ssas-tabular-model-clean-up-in-week-7
DAX On-line Function Reference
http://technet.microsoft.com/en-us/library/ee634396.aspx
GOT QUESTIONS?
Thank You
Don’t forget your evaluations.
Mike@micmin.org
Blog site:
http://sharepointmike.wordpress.com/