2012-08-09-Tips+and+Tricks+for+Dimensional+Modeling

Tips and Tricks for Dimensional Modeling
By Shawn Jackson
Overview
• Set of techniques and concepts used in data warehouse design
• Intended to support end-user queries and oriented around understandability and performance
• Uses the concepts of facts (measures) and dimensions (context)
• Facts are typically (but not always) numerical values that can be aggregated
• Dimensions are groups of hierarchies and descriptors that define the facts
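To make the fact/dimension split concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows (dim_supplier, fact_purchase, "Hypothetical Parts Inc") are illustrative assumptions, not taken from the slides. The fact table holds additive measures, and the dimension supplies the context used to group them.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_supplier (
    supplier_key   INTEGER PRIMARY KEY,  -- surrogate key
    supplier_name  TEXT,
    supplier_state TEXT
);
CREATE TABLE fact_purchase (
    supplier_key INTEGER REFERENCES dim_supplier(supplier_key),
    quantity     INTEGER,  -- measure
    amount       REAL      -- measure
);
INSERT INTO dim_supplier VALUES
    (1, 'Acme Supply Co', 'CA'),
    (2, 'Hypothetical Parts Inc', 'IL');
INSERT INTO fact_purchase VALUES
    (1, 10, 250.0),
    (1, 5, 125.0),
    (2, 3, 90.0);
""")

# Aggregate a fact (amount) by a dimension attribute (state).
totals = conn.execute("""
    SELECT d.supplier_state, SUM(f.amount)
    FROM fact_purchase AS f
    JOIN dim_supplier AS d ON d.supplier_key = f.supplier_key
    GROUP BY d.supplier_state
    ORDER BY d.supplier_state
""").fetchall()
print(totals)  # [('CA', 375.0), ('IL', 90.0)]
```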
Star Schema
Snowflake Schema
Kimball University:
10 Essential Rules of Dimensional Modeling (#1-5)
1. Load detailed atomic data into dimensional structures
   • Store data at the lowest grain
   • Use summary tables/views to improve performance as necessary
2. Structure dimensional models around business processes
   • Fact tables should be based on a business event
   • Complement single process fact tables with consolidated fact tables that combine metrics from multiple processes at the same level of detail
3. Ensure that every fact table has an associated date dimension table
4. Ensure that all facts in a single fact table are at the same grain or level of detail
5. Resolve many-to-many relationships in fact tables
Kimball University:
10 Essential Rules of Dimensional Modeling (#6-10)
6. Resolve many-to-one relationships in dimension tables
7. Store report labels and filter domain values in dimension tables
   • Don’t store codes and descriptions in the fact table
   • Make sure the full description of the code is in the dimension table
8. Make certain that dimension tables use a surrogate key
9. Create conformed dimensions to integrate data across the enterprise
   • Date dimension is a common example
   • Single version of the truth
10. Continuously balance requirements and realities to deliver a DW/BI solution that's accepted by business users and that supports their decision-making
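Since the date dimension comes up in both rule 3 and rule 9, a rough sketch of generating one may help. The function name, the column names, and the YYYYMMDD integer key are assumptions made for illustration; the slides do not prescribe a layout.

```python
from datetime import date, timedelta

def build_date_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # assumed key format
            "full_date": day,
            "year": day.year,
            "month": day.month,
            "day_of_week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
        })
        day += timedelta(days=1)
    return rows

date_rows = build_date_dimension(date(2012, 8, 9), date(2012, 8, 11))
print(date_rows[0]["date_key"])  # 20120809
```

Every fact table that joins to this one table shares the same calendar attributes, which is what makes the dimension conformed.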
Slowly Changing Dimensions
• Type 0
• Type 1
• Type 2
• Type 3
• Type 4
• Type 6
SCD Type 0
• Rows are added but never changed
• Missing true business / natural key
• Typically are only used in derived dimensions
• Type 0 attributes are more common

Supplier Key  Supplier Name
123           Acme Supply Co
124           Acme Supply Company
SCD Type 1
• Rows can be updated or added based upon business key
• Historical information is not tracked

Initial row:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Suply Co   CA

After the name is corrected (overwritten in place):

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA

After the state changes (overwritten in place):

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL
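The Type 1 behavior can be sketched in a few lines of Python. This is a minimal illustration, assuming the dimension is held as an in-memory dict keyed by the business key (Supplier_Code); the function name apply_type1 and the lowercase field names are hypothetical.

```python
def apply_type1(dimension, incoming):
    """Update the row matching the business key, or add it if missing."""
    code = incoming["supplier_code"]
    row = dimension.get(code)
    if row is None:
        # New business key: assign the next surrogate key and insert.
        next_key = max((r["supplier_key"] for r in dimension.values()), default=0) + 1
        dimension[code] = {"supplier_key": next_key, **incoming}
    else:
        # Existing business key: overwrite in place, so history is lost.
        row.update(incoming)
    return dimension[code]

suppliers = {
    "ABC": {"supplier_key": 123, "supplier_code": "ABC",
            "supplier_name": "Acme Suply Co", "supplier_state": "CA"},
}
# Correct the misspelled name, then record the move to IL.
apply_type1(suppliers, {"supplier_code": "ABC",
                        "supplier_name": "Acme Supply Co", "supplier_state": "CA"})
apply_type1(suppliers, {"supplier_code": "ABC",
                        "supplier_name": "Acme Supply Co", "supplier_state": "IL"})
print(suppliers["ABC"])
```

After both calls there is still a single row with surrogate key 123, and nothing records that the supplier was ever in CA.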
SCD Type 2
• Rows are only added
• A version number or effective dates are used to keep track of history

Supplier Key  Supplier Code  Supplier Name   Supplier State  Start Date   End Date
123           ABC            Acme Supply Co  CA              01-Jan-2000  21-Dec-2004
124           ABC            Acme Supply Co  IL              22-Dec-2004
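A rough Python sketch of the Type 2 pattern, using effective dates rather than a version number. It assumes the dimension is a list of dicts and that an open-ended current row has end_date set to None; the function and field names are hypothetical. Note that in this sketch the only change ever made to an existing row is filling in its end date.

```python
from datetime import date, timedelta

def apply_type2(rows, incoming, change_date):
    """Add a new version row when a tracked attribute changes."""
    code = incoming["supplier_code"]
    current = next(
        (r for r in rows if r["supplier_code"] == code and r["end_date"] is None),
        None,
    )
    if current is not None:
        if all(current[k] == v for k, v in incoming.items()):
            return current  # nothing changed, so no new version
        # Close the old version the day before the change takes effect.
        current["end_date"] = change_date - timedelta(days=1)
    new_row = {
        "supplier_key": max((r["supplier_key"] for r in rows), default=0) + 1,
        **incoming,
        "start_date": change_date,
        "end_date": None,  # open-ended: this is the current version
    }
    rows.append(new_row)
    return new_row

supplier_rows = [{
    "supplier_key": 123, "supplier_code": "ABC",
    "supplier_name": "Acme Supply Co", "supplier_state": "CA",
    "start_date": date(2000, 1, 1), "end_date": None,
}]
apply_type2(supplier_rows,
            {"supplier_code": "ABC", "supplier_name": "Acme Supply Co",
             "supplier_state": "IL"},
            date(2004, 12, 22))
```

Facts loaded before 22-Dec-2004 keep pointing at key 123 (CA), while later facts use key 124 (IL), which is how history is preserved.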
SCD Type 3
• Rows are updated but not added
• Historical information is preserved through extra columns

Supplier Key  Supplier Code  Supplier Name   Original / Prior Supplier State  Effective Date  Current Supplier State
123           ABC            Acme Supply Co  CA                               22-Dec-2004     IL
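As a small illustration of Type 3, the sketch below shifts the current value into the prior-value column before overwriting it. The function name and field names are assumptions; only one level of history is kept, so a second change would discard the oldest value.

```python
from datetime import date

def apply_type3(row, new_state, effective_date):
    """Keep one prior value in an extra column; no rows are added."""
    if new_state != row["current_state"]:
        row["prior_state"] = row["current_state"]
        row["current_state"] = new_state
        row["effective_date"] = effective_date
    return row

supplier = {
    "supplier_key": 123, "supplier_code": "ABC",
    "supplier_name": "Acme Supply Co",
    "prior_state": None, "effective_date": None, "current_state": "CA",
}
apply_type3(supplier, "IL", date(2004, 12, 22))
print(supplier["prior_state"], supplier["current_state"])  # CA IL
```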
SCD Type 4
• Combination of type 1 and type 2 dimensions
• Rows are updated in the type 1 table and added in the type 2 table

Supplier

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL

Supplier History

Supplier HistKey  Supplier Key  Supplier Code  Supplier Name   Supplier State  Start Date   End Date
1001              123           ABC            Acme Supply Co  CA              01-Jan-2000  21-Dec-2004
1002              123           ABC            Acme Supply Co  IL              22-Dec-2004
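The two-table arrangement can be sketched as follows. This is a simplified illustration, assuming the supplier already exists in the current table, that the current table is a dict keyed by business key, and that the history table is a list of dicts; all names are hypothetical.

```python
from datetime import date, timedelta

def apply_type4(current, history, incoming, change_date):
    """Overwrite the current table and append a version to the history table."""
    row = current[incoming["supplier_code"]]  # assumes the supplier exists
    row.update(incoming)                      # type 1 style overwrite
    # Close the open history row for this supplier.
    for h in history:
        if h["supplier_key"] == row["supplier_key"] and h["end_date"] is None:
            h["end_date"] = change_date - timedelta(days=1)
    # Type 2 style insert into the history table.
    history.append({
        "supplier_hist_key": max((h["supplier_hist_key"] for h in history),
                                 default=1000) + 1,
        **row,
        "start_date": change_date,
        "end_date": None,
    })

current_table = {"ABC": {"supplier_key": 123, "supplier_code": "ABC",
                         "supplier_name": "Acme Supply Co", "supplier_state": "CA"}}
history_table = [{"supplier_hist_key": 1001, "supplier_key": 123,
                  "supplier_code": "ABC", "supplier_name": "Acme Supply Co",
                  "supplier_state": "CA",
                  "start_date": date(2000, 1, 1), "end_date": None}]
apply_type4(current_table, history_table,
            {"supplier_code": "ABC", "supplier_state": "IL"}, date(2004, 12, 22))
```

Queries that only need the latest values read the small current table, and queries that need history read the history table.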
SCD Type 6 / hybrid
• Combines type 1, 2 and 3 in one table

Supplier Key  Supplier Code  Supplier Name   Current State  Prior State  Start Date   End Date     Current Flag
123           ABC            Acme Supply Co  NY             CA           01-Jan-2000  21-Dec-2004  N
124           ABC            Acme Supply Co  NY             IL           22-Dec-2004  03-Feb-2008  N
125           ABC            Acme Supply Co  NY             NY           04-Feb-2008               Y
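One way to read the hybrid table: a new row is added per change (type 2), the current state is overwritten on every row for the supplier (type 1), and each row also carries the state that applied during its own date range in a separate column (type 3). The sketch below replays the slide's example under that reading; the function and field names are hypothetical.

```python
from datetime import date, timedelta

def apply_type6(rows, supplier_code, new_state, change_date):
    """Add a version row and overwrite current_state on all versions."""
    versions = [r for r in rows if r["supplier_code"] == supplier_code]
    current = next(r for r in versions if r["current_flag"] == "Y")
    # Type 2 part: close out the row that was current until now.
    current["end_date"] = change_date - timedelta(days=1)
    current["current_flag"] = "N"
    # Type 1 part: every existing version shows the latest state.
    for r in versions:
        r["current_state"] = new_state
    # Type 3 part: prior_state keeps the state valid for this row's range.
    rows.append({
        "supplier_key": max(r["supplier_key"] for r in rows) + 1,
        "supplier_code": supplier_code,
        "supplier_name": current["supplier_name"],
        "current_state": new_state,
        "prior_state": new_state,
        "start_date": change_date,
        "end_date": None,
        "current_flag": "Y",
    })

hybrid_rows = [{
    "supplier_key": 123, "supplier_code": "ABC",
    "supplier_name": "Acme Supply Co",
    "current_state": "CA", "prior_state": "CA",
    "start_date": date(2000, 1, 1), "end_date": None, "current_flag": "Y",
}]
apply_type6(hybrid_rows, "ABC", "IL", date(2004, 12, 22))
apply_type6(hybrid_rows, "ABC", "NY", date(2008, 2, 4))
```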
Roleplaying Dimensions
• Recycled for multiple applications within the same database
• Date dimension is commonly used (sale date, delivery date)
• Can be used to get different views of data
Roleplaying Example
Factless Fact Tables
• Tracking events
• Many to many joins
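A factless fact table has only dimension keys and no measures; the row itself records that an event happened, and it also resolves the many-to-many relationship between the dimensions. The sketch below uses a hypothetical class-attendance example with sqlite3; none of these table names or rows come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, student_name TEXT);
CREATE TABLE dim_class   (class_key   INTEGER PRIMARY KEY, class_name   TEXT);
-- No measure columns: each row means "this student attended this class".
CREATE TABLE fact_attendance (
    date_key    INTEGER,
    student_key INTEGER REFERENCES dim_student(student_key),
    class_key   INTEGER REFERENCES dim_class(class_key)
);
INSERT INTO dim_student VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO dim_class   VALUES (1, 'Algebra'), (2, 'Biology');
INSERT INTO fact_attendance VALUES
    (20120809, 1, 1),
    (20120809, 2, 1),
    (20120809, 1, 2);
""")

# Counting rows is the only "measure" available.
attendance = conn.execute("""
    SELECT c.class_name, COUNT(*)
    FROM fact_attendance AS f
    JOIN dim_class AS c ON c.class_key = f.class_key
    GROUP BY c.class_name
    ORDER BY c.class_name
""").fetchall()
print(attendance)  # [('Algebra', 2), ('Biology', 1)]
```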