Class 06.3.1 (TAL)
Adamson & Venerable Chapter 2 & Homework 5 & 6 Hints
PART 1: Transforming Relational Databases into Dimensional Diagrams
Spring 2016

Outline
• Homework #5 Requirements
• Dimensional Modeling
  – Fact Table(s)
  – Dimension Tables
• Transforming TAL TPS to DW
  – ETL OrderDetail
    • Extract, Transform, Load
• Time Dimension Creation and Explanation
• Tools/Analyze/Documenter
• Creating Reports (MOV)

Exercise #5
Due: March 2 (Sec1 & Sec2; Thurs)
Points: 20 points
Pratt & Last (8th ed.): TAL Distributors [TAL] and Colonial Adventure Tours [CAT] Databases; Transform to Data Warehouses
1. Assignments must have a cover sheet (scoring form with your name) and a table of contents (with page numbers). Use Access with the TAL and CAT databases.
2. Redesign both the TAL and CAT databases as they would be for a data warehouse, as described in Adamson & Venerable [Chapters 1 & 2] and Jukic [Chapter 7]. Use the Star diagram as the basis for the design. Be sure to include a meaningful Time dimension table. See website.
3. Turn in printouts of the REVISED Star Diagrams for both databases.
4. On separate page(s), clearly identify for each database:
   a. Fact tables, dimension tables, primary keys, foreign keys, alternate keys, etc. Use the Relationship Report feature in Access to print out the ERD.
   b. Use Relational Notation from Pratt & Last, which I stressed in class.
5. Identify the correct Normal Form [1NF, 2NF, 3NF, etc.] of each table.
• NOTE: You should use the ORIGINAL Access copies of the TAL and CAT databases for this assignment.

Homework #5 Scoresheet
Name: _______________   Day / Eve   Score: _____/20
Datasheet View
Comments:

Dimensional Modeling in Sales
• In a Data Warehouse (DW) designed to analyze SALES (ORDERS) data, an important component of a dimensional model is the Product Dimension.
• The Product dimension includes fields (columns) that represent the characteristics used to differentiate each product in the marketplace, a.k.a. Discriminators.

Key Business Term: Discriminators
• Discriminators:
  – Descriptive characteristics of a product that further describe it and are relevant to purchasing decisions.
  – Tracking discriminators allows the business analyst to monitor the performance of various product styles, influencing production and marketing plans.
• Discriminators for a men's suit: cloth, color, style/cut, weight, size
• Discriminators for vehicles: model name, model styling package, line, category, exterior color, model year, interior color

Other Dimensions in Sales
• Time dimension: time key, month, day, date, year, day of week, quarter.
• Customer_Demographic dimension: This does not require a row for each customer, but groups customers by different combinations of age, gender, income, and geography. The degree of demographic segmentation varies by industry.
• Dealer dimension: Data on dealer performance are needed to support decisions on which dealers should be eased out of business.
• Method_Of_Payment dimension (lease, financing options, etc.)

Fact Table: Storing derived facts
A commonly used derived fact should be stored in the fact table, not recalculated in every report and query. Cutting such "redundant" key measures from the fact table results in the following:
1. Development of reports gets more complex
2. Increased potential for errors in reports
3. Increased documentation requirements
4. One (1) hundred dollars' worth of disk space is saved (40 MB of space savings for a 10-million-row fact table)

BEGIN: TAL Transactional Database (TPS)

5. TAL Star Diagram (Pre-Design)

Transformation Stages (Steps for Homework #5)
1. De-normalization Process
   a. Start with Normalized Tables (TAL TPS)
   b. Determine Dimensions and Fact Tables
   c. Delete Relationships (optional here; may defer to step 3 below)
2. Rebuild Tables (ETL)
3. Rebuild Relationship Diagram as Star Diagram, a.k.a., Dimension Table
4. Build or add additional Dimension tables (Time, etc.)
5. Final Star Diagram: The TAL Data Warehouse.
6. Create Views for Reports
Homework #5 provides background information needed to complete Homework 6, too!

1. TAL TPS ERD Redrawn to form "most likely" Star Diagram

1.c. Delete Relationships (may postpone until Step 3.)
Need to Build Fact Table: OrderDetail
OrderDetail[OrderNum, ItemNum, OrderDate, CustomerNum, RepNum, NumOrdered, QuotedPrice, ExtendedPrice]

2.a. ETL Tables
• Using copied operations database
• Be sure all ops. data is saved and backed up … multiple times.
• Data staging & cleansing
  – Denormalize extra relationships: Create Order_Detail
    • Order:OrderLine;
    • Customer:SalesRep;
    • Order: OrderNum, ItemNum OrderDetail
    • Customer: OrderNum CustNum SalesRep
  – Transform data for new tables in Access:
    • Make Table Order_Detail
    • Create Time dimension table
  – EXCEL Option:
    • Export data files, if needed, to rebuild elsewhere (Excel)
    • Re-import data files to new tables

2.b. OrderDetail Query
SQL View: OrderDetail Query
    SELECT Orders.OrderNum, OrderLine.ItemNum, Orders.OrderDate,
           Orders.CustomerNum, Customer.RepNum, OrderLine.NumOrdered,
           OrderLine.QuotedPrice,
           [NumOrdered]*[QuotedPrice] AS ExtendedPrice
    FROM Rep, Customer, Orders, OrderLine
    WHERE Customer.CustomerNum = Orders.CustomerNum
      AND Orders.OrderNum = OrderLine.OrderNum
      AND Rep.RepNum = Customer.RepNum;
Datasheet View; Save OrderDetail
QBE Design View
Option: RepNum added in Orders of Homework #4

2.c.
Make Table OrderDetail Table
    SELECT Orders.OrderNum, OrderLine.ItemNum, Orders.OrderDate,
           Orders.CustomerNum, Customer.RepNum, OrderLine.NumOrdered,
           OrderLine.QuotedPrice,
           [NumOrdered]*[QuotedPrice] AS ExtendedPrice
    INTO OrderDetail
    FROM Rep INNER JOIN ((Customer
         INNER JOIN Orders ON Customer.CustomerNum = Orders.CustomerNum)
         INNER JOIN OrderLine ON Orders.OrderNum = OrderLine.OrderNum)
         ON Rep.RepNum = Customer.RepNum;

Correct OrderDetail Based upon Original Data
Datasheet View; Original OrderDetail Data

2.d. OrderDetail Table
• Set Primary Keys
• Create INDEXes

2.e. Data Cleansing (Optional)
Fix dates (/2004, /2015)

1.c. Delete Relationships (if you skipped this step)
Need to Insert Fact Table: OrderDetail
OrderDetail[OrderNum, ItemNum, OrderDate, CustomerNum, RepNum, NumOrdered, QuotedPrice, ExtendedPrice]

3. Build Star Diagram
[Diagram: a Fact Table linked to three Dimension Tables]
What's Missing?

4. CREATE Time Dimension
QBE View; Datasheet View
Use SQL DISTINCT to eliminate redundant dates

4.a. Built-in Date/Time Functions

4.b. Built-in Date/Time Scalar Functions
    SELECT DISTINCT Orders.OrderDate,
           Year(Orders.OrderDate) AS [Year],
           Month(Orders.OrderDate) AS [Month],
           Day(Orders.OrderDate) AS [Day],
           Weekday(Orders.OrderDate) AS [WeekDay]
    FROM Orders;

4.c. Make Table from Query
QBE View: Time Convert Query
Make Table: RUN!
Time2015 Table

4.d. Time Table w/Indexes
Datasheet View: Time2015 Table
Table Design View: Time2015 Table
• Insert Primary Key
• Create INDEXes as Needed

More Built-in Date/Time Scalar Functions
    SELECT DISTINCT Orders.OrderDate,
           Year(Orders.OrderDate) AS [Year],
           Month(Orders.OrderDate) AS [Month],
           MonthName(Month(Orders.OrderDate), 0) AS MName,
           Day(Orders.OrderDate) AS [Day],
           Weekday(Orders.OrderDate) AS [WeekDay],
           Now() AS [Now]
    FROM Orders;

TAL Datasheet Views (cont.)
Customer Table; Item Table; Rep Table

5. TAL Star Diagram (Pre-Final)

5.
TAL DW -- Relation List
Fact Table
• OrderDetail [OrderNum, ItemNum, OrderDate, CustomerNum, RepNum, NumOrdered, QuotedPrice, ExtendedPrice]
Dimension Tables (Full)
• Customer [CustomerNum, CustomerName, Street, City, State, PostalCode, Balance, CreditLimit, (RepNum)]
• Rep [RepNum, LastName, FirstName, Street, City, State, PostalCode, Commission, Rate]
• Item [ItemNum, Desc, OnHand, Category, Storehouse, Price, (Allocation)]
• Time [OrderDate, Month, Year, Day, DayofWeek]

6. CAT DW -- Relation List
Fact Table
• ReservationFacts [ReservationID, TripID, CustomerNum, GuideNum, TripDate (or TimeKey), NumPersons, TripPrice, OtherFees, TotalTripPrice, SeasonCode]
Dimension Tables
• Customer [CustomerNum, LastName, FirstName, Address, City, State, PostalCode, Phone]
• Trips [TripID, TripName, StartLocation, State, Distance, MaxGrpSize, Type, Season]
• Guides [GuideNum, LastName, FirstName, Address, City, State, PostalCode, PhoneNum, HireDate]
• Season [SeasonCode, SeasonName]
• Time [TripDate, Month, Year, Day, DayofWeek]

7. Final Touch: Replacing OrderDate with Time_key

8. FULL Time Table in Access
Time2008-2015 Table
Design View; Table View or Datasheet View

9. Tools/Analyze/Documenter
Tools/Analyze/Documenter: CUSTOMER table
Tools/Analyze/Tables

10.
Creating Reports: Customer Subtotal Query
SQL View (saved as CustomerSubTotals)
    SELECT OrderDetail.CustomerNum,
           Count(OrderDetail.RepNum) AS CountOfOrders,
           Sum(OrderDetail.ExtendedPrice) AS CustomerSubTotal,
           Count(OrderDetail.OrderNum) AS CountOfCustomers
    FROM OrderDetail
    GROUP BY OrderDetail.CustomerNum;
Note: both Count() columns count the OrderDetail rows (order lines) in each customer's group.
Datasheet View: CustomerSubTotals

Total Customer Sales
SQL View (saved as CustomerTotalSales)
    SELECT Sum(CustomerSubTotals.CustomerSubTotal) AS CustomerTotal
    FROM CustomerSubTotals;
Datasheet View

Creating an Annotated View
SQL View (saved as CustomerSubtotalAnnotatedQuery)
    SELECT CustomerSubTotals.CustomerNum, Customer.CustomerName,
           CustomerSubTotals.CountOfOrders,
           CustomerSubTotals.CountOfCustomers,
           CustomerSubTotals.CustomerSubTotal
    FROM CustomerSubTotals INNER JOIN Customer
         ON CustomerSubTotals.CustomerNum = Customer.CustomerNum
    ORDER BY Customer.CustomerName;
Datasheet View

Calculating Percentage
SQL View
    SELECT CustomerSubtotalAnnotatedQuery.CustomerNum,
           CustomerSubtotalAnnotatedQuery.CustomerName,
           CustomerSubtotalAnnotatedQuery.CustomerSubTotal,
           CustomerTotalSales.CustomerTotal,
           [CustomerSubtotalAnnotatedQuery].[CustomerSubTotal]/[CustomerTotal] AS CustomerPercent
    FROM CustomerSubtotalAnnotatedQuery, CustomerTotalSales;
Cartesian Product: CustomerTotalSales returns a single row, so listing both queries without a join condition pairs every customer row with the grand total.
Datasheet View

MOV: Materializing Object Views
• 1. Objects: Tables
• 2. Views: Views/Queries
• 3. Materialize: layouts, formats, etc. (Reports, Forms)

Summary
• Complete Transformations
• How normal are the resulting tables?
  – 1NF, 2NF, 3NF?
• Document Transformation maps
• Prepare for Appending Tables with new data
• Extract, Transform, Load (ETL) Tables
• Tools/Analyze/Documenter
• Create Data for Reports: MOV … !!!
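
Sketch: Steps 2, 4, and 10 outside Access
The make-table query, the Time dimension, and the percentage report above can be tried end to end in a few lines. This is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module instead of Access, so the SQL dialect differs (CREATE TABLE ... AS replaces SELECT ... INTO, and strftime replaces Year/Month/Day/Weekday). The sample rows, the table name TimeDim, and the function names build_warehouse and customer_percentages are made up for illustration and are not the TAL data.

```python
import sqlite3


def build_warehouse(conn):
    """Create a tiny TAL-like TPS, then derive the fact table, the time
    dimension, and the two report views (all sample rows are invented)."""
    conn.executescript("""
        CREATE TABLE Rep (RepNum TEXT PRIMARY KEY, LastName TEXT);
        CREATE TABLE Customer (CustomerNum TEXT PRIMARY KEY,
                               CustomerName TEXT, RepNum TEXT);
        CREATE TABLE Orders (OrderNum TEXT PRIMARY KEY,
                             OrderDate TEXT, CustomerNum TEXT);
        CREATE TABLE OrderLine (OrderNum TEXT, ItemNum TEXT,
                                NumOrdered INTEGER, QuotedPrice REAL,
                                PRIMARY KEY (OrderNum, ItemNum));

        INSERT INTO Rep VALUES ('R1', 'Alpha'), ('R2', 'Beta');
        INSERT INTO Customer VALUES ('C1', 'Acme Toys', 'R1'),
                                    ('C2', 'Beta Games', 'R2');
        INSERT INTO Orders VALUES ('O1', '2015-10-12', 'C1'),
                                  ('O2', '2015-10-12', 'C2'),
                                  ('O3', '2015-10-15', 'C1');
        INSERT INTO OrderLine VALUES ('O1', 'I1', 2, 10.0),
                                     ('O1', 'I2', 1, 30.0),
                                     ('O2', 'I1', 5, 10.0),
                                     ('O3', 'I3', 4, 5.0);

        -- Step 2: make-table query; ExtendedPrice is stored as a derived fact.
        CREATE TABLE OrderDetail AS
        SELECT o.OrderNum, l.ItemNum, o.OrderDate, o.CustomerNum, c.RepNum,
               l.NumOrdered, l.QuotedPrice,
               l.NumOrdered * l.QuotedPrice AS ExtendedPrice
        FROM Rep AS r
        JOIN Customer AS c ON r.RepNum = c.RepNum
        JOIN Orders AS o ON c.CustomerNum = o.CustomerNum
        JOIN OrderLine AS l ON o.OrderNum = l.OrderNum;

        -- Step 4: time dimension; DISTINCT keeps one row per order date.
        -- SQLite's %w runs 0..6 from Sunday (Access Weekday() runs 1..7).
        CREATE TABLE TimeDim AS
        SELECT DISTINCT OrderDate,
               CAST(strftime('%Y', OrderDate) AS INTEGER) AS Year,
               CAST(strftime('%m', OrderDate) AS INTEGER) AS Month,
               CAST(strftime('%d', OrderDate) AS INTEGER) AS Day,
               CAST(strftime('%w', OrderDate) AS INTEGER) AS DayOfWeek
        FROM Orders;

        -- Step 10: saved queries become views.
        CREATE VIEW CustomerSubTotals AS
        SELECT CustomerNum, SUM(ExtendedPrice) AS CustomerSubTotal
        FROM OrderDetail
        GROUP BY CustomerNum;

        CREATE VIEW CustomerTotalSales AS
        SELECT SUM(CustomerSubTotal) AS CustomerTotal
        FROM CustomerSubTotals;
    """)


def customer_percentages(conn):
    """Pair each customer subtotal with the one-row grand total
    (a deliberate Cartesian product) and compute the share."""
    return conn.execute("""
        SELECT s.CustomerNum, s.CustomerSubTotal,
               s.CustomerSubTotal / t.CustomerTotal AS CustomerPercent
        FROM CustomerSubTotals AS s CROSS JOIN CustomerTotalSales AS t
        ORDER BY s.CustomerNum
    """).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_warehouse(connection)
    for row in customer_percentages(connection):
        print(row)
```

Because the grand-total view always returns exactly one row, the cross join cannot multiply the customer rows; that is the only reason the join condition can be left out safely.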