Outline Adamson & Venerable Chapter 2 & Homework 5 & 6 Hints

Class 06.3.1 (TAL)
Adamson & Venerable
Chapter 2 &
Homework 5 & 6 Hints
PART 1: Transforming
Relational Databases into Dimensional
Diagrams
Spring 2016
Outline
• Homework #5 Requirements
• Dimensional Modeling
– Fact Table(s)
– Dimension Tables
• Transforming TAL TPS to DW
– ETL  OrderDetail
• Extract, Transform, Load
• Time Dimension Creation and Explanation
• Tools/Analyzer/Documenter
• Creating Reports (MOV)
Exercise #5
Due: March 2 (Sec1 & Sec2; Thurs)
Points: 20 points
Pratt & Last (8th ed.): TAL Distributors [TAL] and Colonial Adventure
Tours [CAT] Databases; Transform to Data Warehouses
1. Assignments must have a cover sheet (scoring form with your name)
and a table of contents with page numbers.
Use Access with the TAL and CAT databases.
2. Redesign both the TAL and CAT databases as they would be for a data
warehouse, as described in Adamson & Venerable [Chapters 1 & 2] and
Jukic [Chapter 7]. Use the Star diagram as the basis for their design.
Be sure to include a meaningful Time dimension table. See website.
3. Turn in printouts of the REVISED Star Diagrams for both databases.
4. On a separate page(s), clearly identify for each database:
a. Fact tables, dimension tables, primary keys, foreign keys, alternate keys, etc. Use
the Relationship Report feature in Access to print out the ERD.
b. Use Relational Notation from Pratt & Last, which I stressed in class.
5. Identify the correct Normal Form [1NF, 2NF, 3NF, etc.] of each table.
• NOTE: You should use the ORIGINAL Access copies of the TAL and
CAT databases for this assignment.
Homework #5 Scoresheet
Name: _______________ Day / Eve
Score: _____/20
Datasheet View
Comments:
Dimensional Modeling in Sales
• In a Data Warehouse (DW) designed to
analyze SALES (ORDERS) data, an
important component of a dimensional
model is the Product Dimension.
• Product dimension includes fields
(columns) to represent characteristics
used to differentiate each product in the
marketplace, a.k.a., Discriminators.
Key Business Term: Discriminators
• Discriminators:
– Descriptive characteristics of a product that further
describe it and are relevant to purchasing decisions.
– Tracking discriminators allows the business analyst to
monitor performance of various product styles,
influencing production and marketing plans.
• Discriminators for a men’s suit: Cloth, color,
style/cut, weight, size
• Discriminators for vehicles: Model name, model
styling package, line, category, exterior color,
model year, interior color
Other Dimensions in Sales
• Time dimension: Time key, month, day, date, year, day
of week, quarter.
• Customer_Demographic dimension: This does not
require a row for each customer, but groups customers
by different combinations of age, gender, income, and
geography. The degree of demographic segmentation
varies by industry.
• Dealer dimension: Data on dealer performance are
needed to support decisions on which dealers should
be eased out of business.
• Method_Of_Payment dimension (lease, financing
options, etc.)
Fact Table: Storing derived facts
A commonly used derived fact should be stored,
and not calculated in reports and queries. Cutting
such “redundant” key measures from the fact
table results in the following:
1. Development of reports gets more complex
2. Increased potential for errors in reports
3. Increased documentation requirements
4. One hundred dollars’ worth of disk space is
saved (40 MB of space savings for a 10-million-row fact table)
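The 40 MB figure can be checked with a line of arithmetic. The sketch below assumes the dropped derived fact is a single 4-byte column and uses decimal megabytes; neither detail is stated on the slide.

```python
# Assumption (not on the slide): the derived fact occupies one 4-byte column.
rows = 10_000_000          # 10-million-row fact table
bytes_per_value = 4        # hypothetical column width

# Decimal megabytes: 1 MB = 1,000,000 bytes.
saved_mb = rows * bytes_per_value / 1_000_000
print(f"{saved_mb:.0f} MB saved")
```

Under these assumptions the saving is small next to the extra report complexity listed in items 1 to 3.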
BEGIN: TAL Transactional
Database (TPS)
5. TAL Star Diagram (Pre-Design)
Item
Transformation Stages
(Steps for Homework #5)
1. De-normalization Process
a. Start with Normalized Tables (TAL TPS)
b. Determine Dimensions and Fact Tables
c. Delete Relationships (optional here; may defer to step 3 below)
2. Rebuild Tables (ETL)
3. Rebuild Relationship Diagram as Star Diagram, a.k.a., Dimension Table
4. Build or add additional Dimension tables (Time, etc.)
5. Final Star Diagram: The TAL Data Warehouse.
6. Create Views for Reports
7. Homework #5 provides background information needed to
complete Homework 6, too!
1. TAL TPS ERD
Redrawn to form “most
likely” Star Diagram
1.c. Delete Relationships
(may postpone until Step 3.)
Need to Build Fact Table:
OrderDetail
OrderDetail[OrderNum, ItemNum,
OrderDate, CustomerNum, RepNum,
NumOrdered, QuotedPrice, ExtendedPrice]
2.a. ETL Tables
• Using copied operations database
• Be sure all ops. data is saved and backed up …
multiple times.
• Data staging & cleansing
– Denormalize extra relationships: Create Order_Detail
• Order:OrderLine;
• Customer:SalesRep;
• Order:OrderNum, PartNum → OrderDetail
• Customer:OrderNum CustNum → SalesRep
– Transform data for new tables in Access:
• Make Table Order_Detail
• Create Time dimension table
– EXCEL Option:
• Export data files, if needed, to rebuild elsewhere (Excel)
• Re-Import data files to new tables
2.b. OrderDetail Query
SQL View OrderDetail Query
SELECT Orders.OrderNum, OrderLine.ItemNum,
Orders.OrderDate, Orders.CustomerNum,
Customer.RepNum, OrderLine.NumOrdered,
OrderLine.QuotedPrice, [NumOrdered]*[QuotedPrice] AS
ExtendedPrice
FROM Rep, Customer, Orders, OrderLine
WHERE
Customer.CustomerNum = Orders.CustomerNum AND
Orders.OrderNum = OrderLine.OrderNum AND
Rep.RepNum = Customer.RepNum;
Datasheet View; Save OrderDetail
QBE Design View
Option: RepNum added in
Orders of Homework #4
2.c. Make Table OrderDetail Table
SELECT Orders.OrderNum, OrderLine.ItemNum, Orders.OrderDate, Orders.CustomerNum,
Customer.RepNum, OrderLine.NumOrdered, OrderLine.QuotedPrice,
[NumOrdered]*[QuotedPrice] AS ExtendedPrice INTO OrderDetail
FROM Rep INNER JOIN ((Customer INNER JOIN Orders ON Customer.CustomerNum =
Orders.CustomerNum) INNER JOIN OrderLine ON Orders.OrderNum =
OrderLine.OrderNum) ON Rep.RepNum = Customer.RepNum;
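The make-table step can be tried outside Access as well. The following is a minimal sketch using Python's built-in sqlite3 module, where `CREATE TABLE ... AS SELECT` plays the role of Access's `SELECT ... INTO`. The sample rows are hypothetical and the tables carry only the columns the join needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed-down TPS tables; all rows below are hypothetical sample data.
CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT);
CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY, CustomerName TEXT, RepNum TEXT);
CREATE TABLE Orders (OrderNum TEXT PRIMARY KEY, OrderDate TEXT, CustomerNum TEXT);
CREATE TABLE OrderLine (OrderNum TEXT, ItemNum TEXT, NumOrdered INTEGER,
                        QuotedPrice REAL, PRIMARY KEY (OrderNum, ItemNum));

INSERT INTO Rep VALUES ('R1', 'Alpha');
INSERT INTO Customer VALUES ('C1', 'Sample Toys', 'R1'), ('C2', 'Demo Games', 'R1');
INSERT INTO Orders VALUES ('O1', '2015-10-12', 'C1'), ('O2', '2015-10-15', 'C2');
INSERT INTO OrderLine VALUES ('O1', 'I1', 2, 10.0), ('O1', 'I2', 1, 5.5),
                             ('O2', 'I1', 3, 9.0);

-- SQLite counterpart of the Access make-table query: one row per order line,
-- with the derived fact ExtendedPrice stored in the new table.
CREATE TABLE OrderDetail AS
SELECT Orders.OrderNum, OrderLine.ItemNum, Orders.OrderDate, Orders.CustomerNum,
       Customer.RepNum, OrderLine.NumOrdered, OrderLine.QuotedPrice,
       OrderLine.NumOrdered * OrderLine.QuotedPrice AS ExtendedPrice
FROM Rep
JOIN Customer ON Rep.RepNum = Customer.RepNum
JOIN Orders ON Customer.CustomerNum = Orders.CustomerNum
JOIN OrderLine ON Orders.OrderNum = OrderLine.OrderNum;
""")

order_detail = conn.execute(
    "SELECT OrderNum, ItemNum, RepNum, ExtendedPrice "
    "FROM OrderDetail ORDER BY OrderNum, ItemNum"
).fetchall()
for row in order_detail:
    print(row)
```

The fact table ends up with exactly as many rows as OrderLine, which is a quick sanity check after any make-table run.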
Correct OrderDetail
Based upon Original Data
Datasheet View; Original OrderDetail Data
2.d. OrderDetail Table
Set Primary Keys
Create INDEXes
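In Access the key and indexes are set in Table Design View. As a rough equivalent, the sketch below declares them in SQLite through Python's sqlite3 module. The composite key (OrderNum, ItemNum) and the choice of indexed columns are assumptions based on the OrderDetail relation, and the index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderDetail (
    OrderNum      TEXT NOT NULL,
    ItemNum       TEXT NOT NULL,
    OrderDate     TEXT,
    CustomerNum   TEXT,
    RepNum        TEXT,
    NumOrdered    INTEGER,
    QuotedPrice   REAL,
    ExtendedPrice REAL,
    PRIMARY KEY (OrderNum, ItemNum)   -- assumed composite key: one row per order line
);
-- Hypothetical index names; one index per foreign key used to join to a dimension.
CREATE INDEX idx_orderdetail_customer ON OrderDetail (CustomerNum);
CREATE INDEX idx_orderdetail_rep      ON OrderDetail (RepNum);
CREATE INDEX idx_orderdetail_date     ON OrderDetail (OrderDate);
""")

row = ("O1", "I1", "2015-10-12", "C1", "R1", 2, 10.0, 20.0)  # hypothetical row
conn.execute("INSERT INTO OrderDetail VALUES (?, ?, ?, ?, ?, ?, ?, ?)", row)

# The primary key rejects a second row with the same (OrderNum, ItemNum).
try:
    conn.execute("INSERT INTO OrderDetail VALUES (?, ?, ?, ?, ?, ?, ?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# PRAGMA index_list returns (seq, name, unique, origin, partial) per index.
index_names = sorted(r[1] for r in conn.execute("PRAGMA index_list('OrderDetail')"))
print(duplicate_rejected, index_names)
```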
2.e. Data Cleansing (Optional)
Fix dates: /2004 → /2015
1.c. Delete Relationships (if
you skipped this step)
Need to Insert Fact Table:
OrderDetail
OrderDetail[OrderNum, ItemNum,
OrderDate, CustomerNum, RepNum,
NumOrdered, QuotedPrice, ExtendedPrice]
3. Build Star Diagram
[Star diagram: one Fact Table linked to three Dimension Tables]
What’s Missing?
4. CREATE Time Dimension
QBE View
Datasheet View
Use SQL DISTINCT to
eliminate redundant dates
4.a. Built-in Date/Time Functions
4.b. Built-in Date/Time Scalar
Functions
SELECT DISTINCT Orders.OrderDate,
Year(Orders.OrderDate) AS [Year],
Month(Orders.OrderDate) AS [Month],
Day(Orders.OrderDate) AS [Day],
Weekday(Orders.OrderDate) AS [WeekDay]
FROM Orders;
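The same DISTINCT-plus-date-functions idea can be sketched in SQLite through Python's sqlite3 module. SQLite has no `Year()` or `Weekday()`, so `strftime` is swapped in. Adding 1 to `%w` is an assumption made to mimic Access's default Weekday numbering (1 = Sunday). The order rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderNum TEXT PRIMARY KEY, OrderDate TEXT, CustomerNum TEXT);
-- Hypothetical orders; two of them share a date.
INSERT INTO Orders VALUES ('O1', '2015-10-12', 'C1'),
                          ('O2', '2015-10-12', 'C2'),
                          ('O3', '2015-10-15', 'C1');

-- DISTINCT collapses repeated dates so each date gets one row in the dimension.
CREATE TABLE Time2015 AS
SELECT DISTINCT OrderDate,
       CAST(strftime('%Y', OrderDate) AS INTEGER)     AS Year,
       CAST(strftime('%m', OrderDate) AS INTEGER)     AS Month,
       CAST(strftime('%d', OrderDate) AS INTEGER)     AS Day,
       CAST(strftime('%w', OrderDate) AS INTEGER) + 1 AS WeekDay  -- 1 = Sunday
FROM Orders;
""")

time_rows = conn.execute("SELECT * FROM Time2015 ORDER BY OrderDate").fetchall()
for row in time_rows:
    print(row)
```

Three orders produce two dimension rows, because the two orders placed on 2015-10-12 share one date.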
4.c. Make Table from Query
QBE View Time Convert
Query
Make Table: RUN!
Time2015 Table
4.d. Time Table w/Indexes
Datasheet View
Time2015 Table
Table Design View
Time2015 Table
Insert Primary Key
Create INDEXes as Needed
More Built-in Date/Time
Scalar Functions
SELECT DISTINCT Orders.OrderDate,
Year(Orders.OrderDate) AS [Year],
Month(Orders.OrderDate) AS [Month],
MonthName(Month(Orders.OrderDate), 0) AS MName,
Day(Orders.OrderDate) AS [Day],
Weekday(Orders.OrderDate) AS [WeekDay], Now() AS [Now]
FROM Orders;
TAL Datasheet Views (cont.)
Customer Table
Item Table
Rep Table
5. TAL Star Diagram (Pre-Final)
5. TAL DW -- Relation List
Fact Table
• OrderDetail [OrderNum, ItemNum, OrderDate, CustomerNum,
RepNum, NumOrdered, QuotedPrice, ExtendedPrice]
Dimension Tables (Full)
• Customer [CustomerNum, CustomerName, Street, City, State,
PostalCode, Balance, CreditLimit, (RepNum)]
• Rep [RepNum, LastName, FirstName, Street, City, State,
PostalCode, Commission, Rate]
• Item [ItemNum, Desc, OnHand, Category, Storehouse,
Price, (Allocation)]
• Time [OrderDate, Month, Year, Day, DayofWeek]
6. CAT DW -- Relation List
Fact Table
• ReservationFacts[ReservationID, TripID, CustomerNum,
GuideNum, TripDate (or TimeKey), NumPersons, TripPrice,
OtherFees, TotalTripPrice, SeasonCode]
Dimension Tables
• Customer[CustomerNum, LastName, FirstName, Address, City, State,
PostalCode, Phone]
• Trips[TripID, TripName, StartLocation, State, Distance, MaxGrpSize,
Type, Season]
• Guides[GuideNum, LastName, FirstName, Address, City, State,
PostalCode, PhoneNum, HireDate]
• Season [SeasonCode, SeasonName]
• Time [TripDate, Month, Year, Day, DayofWeek]
7. Final Touch: Replacing
OrderDate with Time_key
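One way to picture this final step is sketched below with Python's sqlite3 module. The names `TimeDim`, `TimeKey`, and `OrderDetailDW` are hypothetical, and SQLite's `INTEGER PRIMARY KEY` stands in for an Access AutoNumber field. The fact rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical fact rows that still carry the raw OrderDate.
CREATE TABLE OrderDetail (OrderNum TEXT, ItemNum TEXT, OrderDate TEXT, ExtendedPrice REAL);
INSERT INTO OrderDetail VALUES ('O1', 'I1', '2015-10-12', 20.0),
                               ('O1', 'I2', '2015-10-12', 5.5),
                               ('O2', 'I1', '2015-10-15', 27.0);

-- Time dimension with a surrogate key; keys are assigned in insertion order.
CREATE TABLE TimeDim (TimeKey INTEGER PRIMARY KEY, OrderDate TEXT UNIQUE);
INSERT INTO TimeDim (OrderDate)
SELECT DISTINCT OrderDate FROM OrderDetail ORDER BY OrderDate;

-- Rebuild the fact table so it references TimeKey instead of OrderDate.
CREATE TABLE OrderDetailDW AS
SELECT od.OrderNum, od.ItemNum, t.TimeKey, od.ExtendedPrice
FROM OrderDetail AS od
JOIN TimeDim AS t ON t.OrderDate = od.OrderDate;
""")

fact_rows = conn.execute(
    "SELECT OrderNum, ItemNum, TimeKey FROM OrderDetailDW ORDER BY OrderNum, ItemNum"
).fetchall()
print(fact_rows)
```

With a surrogate key in place, the date attributes live only in the Time dimension, and the fact table joins to it through a compact integer.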
8. FULL Time Table in Access
Time2008-2015
Table Design View
Table View or Datasheet View
9. Tools/Analyze/Documenter
CUSTOMER table
Tools/Analyze/Tables
10. Creating Reports:
Customer Subtotal Query
SQL View
SELECT OrderDetail.CustomerNum,
Count(OrderDetail.RepNum) AS CountOfOrders,
Sum(OrderDetail.ExtendedPrice) AS CustomerSubTotal,
Count(OrderDetail.OrderNum) AS CountOfCustomers
FROM OrderDetail
GROUP BY OrderDetail.CustomerNum;
Datasheet View
CustomerSubtotals
Total Customer Sales
SQL View
SELECT Sum(CustomerSubTotals.CustomerSubTotal) AS CustomerTotal
FROM CustomerSubTotals;
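The two-level rollup (per-customer subtotals, then a grand total over the subtotals) can be sketched with SQLite views via Python's sqlite3 module. Views stand in for saved Access queries here. The `CountOfLines` column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical fact rows, trimmed to the columns the rollup needs.
CREATE TABLE OrderDetail (OrderNum TEXT, ItemNum TEXT, CustomerNum TEXT, ExtendedPrice REAL);
INSERT INTO OrderDetail VALUES ('O1', 'I1', 'C1', 20.0),
                               ('O1', 'I2', 'C1', 5.0),
                               ('O2', 'I1', 'C2', 75.0);

-- Level 1: one row per customer.
CREATE VIEW CustomerSubTotals AS
SELECT CustomerNum,
       COUNT(*)           AS CountOfLines,      -- hypothetical name: order lines per customer
       SUM(ExtendedPrice) AS CustomerSubTotal
FROM OrderDetail
GROUP BY CustomerNum;

-- Level 2: a single-row grand total built on top of the first view.
CREATE VIEW CustomerTotalSales AS
SELECT SUM(CustomerSubTotal) AS CustomerTotal
FROM CustomerSubTotals;
""")

subtotals = conn.execute("SELECT * FROM CustomerSubTotals ORDER BY CustomerNum").fetchall()
grand_total = conn.execute("SELECT CustomerTotal FROM CustomerTotalSales").fetchone()[0]
print(subtotals, grand_total)
```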
Datasheet View
Creating an Annotated View
SQL View
SELECT CustomerSubTotals.CustomerNum, Customer.CustomerName,
CustomerSubTotals.CountOfOrders, CustomerSubTotals.CountOfCustomers,
CustomerSubTotals.CustomerSubTotal
FROM CustomerSubTotals INNER JOIN Customer ON
CustomerSubTotals.CustomerNum = Customer.CustomerNum
ORDER BY Customer.CustomerName;
Datasheet View
Calculating Percentage
SQL View
SELECT CustomerSubtotalAnnotatedQuery.CustomerNum,
CustomerSubtotalAnnotatedQuery.CustomerName,
CustomerSubtotalAnnotatedQuery.CustomerSubTotal,
CustomerTotalSales.CustomerTotal,
[CustomerSubtotalAnnotatedQuery].[CustomerSubTotal]/[CustomerTotal] AS CustomerPercent
FROM CustomerSubtotalAnnotatedQuery, CustomerTotalSales;
Note: Cartesian Product
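The Cartesian product is harmless here because the total query returns exactly one row, so every subtotal row is paired with the same grand total. A minimal sketch in SQLite via Python's sqlite3 module follows. A small table stands in for the saved annotated query, and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the saved annotated query; the rows are hypothetical.
CREATE TABLE CustomerSubtotalAnnotatedQuery (
    CustomerNum TEXT, CustomerName TEXT, CustomerSubTotal REAL);
INSERT INTO CustomerSubtotalAnnotatedQuery VALUES
    ('C1', 'Sample Toys', 25.0),
    ('C2', 'Demo Games', 75.0);

-- Single-row total, as in the Total Customer Sales query.
CREATE VIEW CustomerTotalSales AS
SELECT SUM(CustomerSubTotal) AS CustomerTotal
FROM CustomerSubtotalAnnotatedQuery;
""")

# CROSS JOIN with a one-row relation: N subtotal rows x 1 total row = N rows.
percent_rows = conn.execute("""
SELECT a.CustomerNum,
       a.CustomerSubTotal / t.CustomerTotal AS CustomerPercent
FROM CustomerSubtotalAnnotatedQuery AS a
CROSS JOIN CustomerTotalSales AS t
ORDER BY a.CustomerNum
""").fetchall()
print(percent_rows)
```

If the total query ever returned more than one row, the join would duplicate customers, so it is worth confirming the row count stays equal to the number of customers.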
Datasheet View
MOV
Materializing Object Views
• 1. Objects: Tables
• 2. Views: Views/Queries
• 3. Materialize: Reports, Forms
– layouts, formats, etc.
Summary
• Complete Transformations
• How normal are the resulting tables?
– 1NF, 2NF, 3NF?
• Document Transformation maps
• Prepare for Appending Tables with new
data
• Extract, Transform, Load (ETL) Tables
• Tools/Analyze/Documenter
• Create Data for Reports: MOV … !!!