11/5/2015
Class 11.21
BCIS 4660 Project 8 Notes Part 1
Fall 2015

Exercise #8 [TEAM PROJECT] p. 1 of 2
Due: Nov 17 (Sec 1; Tues) & Nov 19 (Sec 2; Thurs)
Points: 40 points
Pratt & Last: TAL Distributors Data Warehouse
Assignments must have a cover sheet (deliverable 1) and a table of contents (deliverable 2), and must indicate the NAMES OF ALL TEAM MEMBERS and the TEAM NUMBER. Assignments must be typed using a word processor (Word, WordPerfect) and have a professional look. Use of ACCESS is REQUIRED for this assignment. Place the score sheet in the front pocket.
• Perform a Data Warehouse ETL for TAL Distributors. Use your [corrected] Star Diagram from Exercise #7 and populate the tables in the model with the new data available at the course Web site, combining the data of at least 6 teams.
• Part 8.1 – Data Definitions: Use your corrected Star Diagram from Hwk #7 and other necessary data definition documentation.
  – Turn in a List of Relations (deliverable 3). On one (1) separate page, clearly list your fact and dimension tables using the simplified relational notation. E.g., EMPLOYEE [ENum, LastName, FirstName, … deptno, …]
  – Be sure to include the TEAM [and/or STATE, TERRITORY] data and add this key field to every record in the ORDER_DETAIL table.
  – Turn in a copy of the Index Panes for each table (3a).
  – Turn in a printed ACCESS ERD of your Star Diagram (deliverable 4). It may come from the previous homework, if it was correct.
  – Turn in a printed copy of the Documenter output (d5) of the Table definitions [HINT: Tools/Analyze/Documenter/Tables – all tables].

Exercise #8 [TEAM PROJECT] p. 2 of 2
• Part 8.2: Load data and print out your Fact and Dimension Tables.
  – Load the data provided by the instructor into your tables.
    • Note: Dates of all transactions should fall between 1/1/2008 and 12/31/2014.
  – Turn in Transformation Maps (d6): Carefully document any data cleansing activities that you performed. Note any data problems you encountered.
  – Create the necessary DETAILED transaction (lowest granularity) Fact Table that joins all the dimension tables. Show your SQL queries.
  – Turn in printouts of the contents of your fact and dimension tables (d7). Adjust their size and give them a professional look [reduce fonts to decrease paper waste]. Use landscape orientation when necessary. Make sure the data is in some reasonable order, such as ID or date, whichever is most appropriate. Print only the first 2 pages of each table.
NOTE: See the Scoresheet for Merge requirements & BONUS. 6 merges beyond CA & FL, i.e., your state PLUS 5 more!
• Part 8.3: Generate the following SQL Queries & REPORTS:
  – Use your knowledge of SQL to create the following COMPUTED queries/views (SQL Query) and create the corresponding ACCESS REPORTS (all reports MUST have a Grand Total & Break-field Label, e.g. CustomerName, RepName, etc.): (d Tab3)
    1. Total Sales by Month (Subtotal by Year)
    2. Total Sales by Customer by Month for 2014-2015 (Subtotal by CustomerName)
    3. Total Sales by Item by Month for 2014 (Subtotal by ItemName; ASC order by ItemName)
    4. Total Sales in 2008-2009 by RepName (DESC order by Rep Sales within Year; Subtotal by Year for all Reps)
    5. Total Sales by Territory (NE, SW, etc.) for 2008-2015 (Subtotal by Year); (Bonus 2 pts) Download to EXCEL and graph it.
    6. (BONUS 5 pts) Total Sales by TeamID by Year (All Years; Subtotal by TeamID; and Grand Total)
  – Turn in printouts & SQL code used for the 6 reports listed above (d8).
  – Turn in the .accdb file (d9) with STAR models, data, views, and reports (floppy, CD, e-mail attachment, or bring a flash disk to the instructor’s office).
    • Use this naming convention: TALDW08_TeamS.xx.accdb (standard naming convention)

Territory Assignments
Team  Territory  ItemNo  RepNo  OrdNo   CustNo  State
1     NW         GME, C  090    010000  0100    WA
2     NW         GME, D  080    020000  0200    OR
3     SW         GME, F  070    030000  0300    AZ
4     SW         PZL, A  060    040000  0400    NM
5     MW         PZL, B  050    050000  0500    MO
6     SE         TOY, G  040    060000  0600    FL
7     NE         TOY, H  030    070000  0700    NY
8     NE         TOY, I  020    080000  0800    ME
9     SW         PZL, J  010    090000  0900    OK
10    SE         PZL, K  100    100000  1000    LA
11    MW         GME, L  110    110000  1100    NE
12    NW         GME, M  200    120000  1200    ID
13    SW         GME, N  190    130000  1300    NV
14    SW         PZL, P  180    140000  1400    TX
15    MW         PZL, R  170    150000  1500    KS
16    SE         TOY, S  160    160000  1600    AL
17    NE         TOY, T  150    170000  1700    NJ
18    NE         TOY, U  140    180000  1800    MA
19    SW         PZL, V  130    190000  1900    CO
20    SW         PZL, W  120    200000  2000    CA

TAL Distributors Star Diagram
[figure: star diagram; the time dimension table is labeled Time2008-2016]

5.B. TAL DW -- Relation List
Fact Table
• OrderDetail [OrderNum, ItemNum, TimeKey, CustNum, RepNum, NumOrdered, QuotedPrice, ExtendedPrice]
Dimension Tables (Full)
• Customer [CustNum, CustName, Street, City, State, PostalCode, Balance, CreditLimit, (RepNum)]
• Rep [RepNum, LastName, FirstName, Street, City, State, PostalCode, Commission, Rate]
• Item [ItemNum, Desc, OnHand, Category, Storehouse, Price, (Allocation)]
• Time [TimeKey, OrderDate, Month, Cal_Year, Fiscal_Year, Quarter, Month_Key, Month_Day, Serial_Num, Week_Num, Julian, Day_of_Week, Day_of_Week_Num]
• State [TeamID, StateCode, StateName, TerritoryCode]

MS Project GANTT Chart
• Serial Activities?
  – Some tasks MUST be serial
• Parallel Activities?
  – Increases human productivity
  – One of the most common ways of increasing IT capabilities
  – N-fold! E.g.,
    • Hard drives
    • Printers
    • Data Entry devices

DW: Table Load Approaches
BASE TABLE   Team1     Team2     Team3     …  TeamN
CUSTOMER     CUST1     CUST2     CUST3     …  CUSTn
REP          REP1      REP2      REP3      …  REPn
Item         Item1     Item2     Item3     …  Itemn
ORDERDETAIL  ORD_DET1  ORD_DET2  ORD_DET3  …  ORD_DETn
TIME         N/C       N/C       N/C       …  N/C
TEAM         N/C       N/C       N/C       …  N/C
Serial vs. Parallel

Data ETL Procedures: Coding
• Follow the coding guidelines in Exercise #7
  – E.g., how you created the OrderDetail table in the last assignment. Also, see next page
• Initial Pre-Load Procedures:
  – (Do this prior to loading all original TAL data into the TAL OrderDetail Star Diagram)
  – Denormalize Customer:Rep, Orders:OrderLine
  – All fields MUST be the same type and size as in the original TAL operational database
  – Enforce Referential Integrity on all dimensions
  – Note: CreditLimit constraint ($5,000, $7,500, $10,000, $15,000) enforced
  – Change any dates outside the RANGE 2008-2015

Data Load Procedures: General
• Use the TAL1DataWarehouse.accdb/mdb file as your ETL (Extract, Transform, Load) work space.
• Note: The TAL Data Warehouse work tables will be:
  – Customer, Item, Rep, Time, Territory and OrderDetail1A
• You should load all the data into these tables using the appropriate SQL statements: Make-Table, Append, and Update queries. EXCEL is not acceptable.
  – Examples are provided in the Access .accdb files.

Data Load Procedures: Step 1
• Team 1 and Team 2 data have already been loaded into the work tables. "Load" means that the External Data/Access File option was used to locate the Team1 & Team2 databases, and the Customer tables were then extracted from both.
• Note: The remaining tables are numbered as follows: Customer3, Customer4, …, Item3, Item4, and so on.
• Note: The Customer3 and Customer4 tables were also loaded; see the Append examples on the next pages.
  – There are several examples in the Access .mdb file under Query Objects, Form Objects, and Report Objects.

Creating Append Tables: Customer
• The Append Customer2 query demonstrates how to append the Customer2 table to your base Customer table:
    INSERT INTO Customer ( CustomerNum, CustomerName, Street, City, State, Zip, Balance, CreditLimit )
    SELECT Customer2.Customer_Num, Customer2.Customer_Name, Customer2.Street, Customer2.City, Customer2.State, Customer2.Zip, Customer2.Balance, Customer2.Credit_Limit
    FROM Customer2
    WHERE (((Customer2.State)<>"FL"));
Note: The WHERE clause excludes the “FL” customers from the APPEND-TABLE action query. It is not strictly necessary, because duplicate records WILL NOT be added to a table where the primary key constraint is enforced. However, the clause prevents the warning message that “these records will not be added due to key constraints.” Do the same for the Team2 through TeamN Customer tables.

Extracting Customer Table from Team 1
Extracting (E): Select the Customer Table. It was added as Customer1.

Build the Append Query (TL)
Then Run it!

CUSTOMER ETL Notes
• You will need to load (L) Customer5, Customer6, Customer7 and Customer8 into Customer. Do one at a time.
• Double-check all field properties, types, sizes, and constraints
• Verify that the codes of new records agree with the coding convention for that Territory.
• Make sure the Primary key is declared
• Credit Limit constraints will be enforced
• Adjust foreign keys
  – Remove the Rep key during the Append process
  – Make sure the Rep key has been placed in the OrderDetailx table
• TRANSFORM MAPS

REP Notes
• Adjust the Primary key field type and size
• Make sure this key has been added to the OrderDetailx table
• Verify that the codes of new records agree with the coding convention for that Territory.
[Append Rep2 query]:
    INSERT INTO Rep ( RepNum, LastName, FirstName, Street, City, State, Zip, Commission, Rate )
    SELECT Rep2.RN, Rep2.Last_Name, Rep2.First_Name, Rep2.Street, Rep2.City, Rep2.State, Rep2.Zip, Rep2.Commision, Rep2.Rate
    FROM Rep2
    WHERE (((Rep2.State)<>"FL"));
• Note: Field names do not need to be identical; however, field types and sizes do need to be consistent.

TERRITORY Notes
• Use TeamID as Primary Key; FL = 6; CA = 20
• Use the following Territory IDs: NW, SW, MW, SE, NE
• The Territory Table has been updated for all the approved team and territory codes.
• Cleanse spellings, geographical inconsistencies, and territorial redundancies from the OrderDetailx data.
• This field may need to be added to the OrderDetailx table for Teamxx.

Item Notes
• Complete descriptions (names) and associated data
• Text fields need to be set to the correct size to avoid truncation
[Append Item2 query]:
    INSERT INTO Item ( ItemNum, Description, OnHand, Class, Storehouse, Price )
    SELECT Item2.Item_Num, Item2.Description, Item2.OnHand, Item2.Class, Item2.Storehouse, Item2.Price
    FROM Item2
    WHERE Item2.Item_Num NOT IN ("AT94", "BV06", …, "KV94")
      AND Item2.Item_Num NOT LIKE "G*";

Order Detail
• This is the most complex data load and MUST be done after the other loads have been completed for each territory/team
• Since you will be enforcing referential integrity and primary key constraints, the Customer, Rep, Item, and Territory tables must all be updated before this append will work.
• Make sure the field types and sizes of the OrderDetailx table agree with those of OrderDetail1A
• A computed field called ExtendedPrice is added to the Fact Table. This will save computational time later.

OrderDetail Notes
• The Append query has been modified for OrderDetail1 and is shown here for illustrative purposes [Append OrderDetail1 query].
Assumes TeamID is a text item:
    INSERT INTO OrderDetail1A ( Order_Num, Item_Num, Order_Date, Customer_Num, RN, TeamID, NumOrdered, QuotedPrice, ExtendedPrice )
    SELECT OrderDetail1.OrderNum, OrderDetail1.ItemNum, OrderDetail1.Date, OrderDetail1.CustomerNum, OrderDetail1.RepNum, "01", OrderDetail1.NumOrdered, OrderDetail1.QuotedPrice, OrderDetail1.QuotedPrice * OrderDetail1.NumOrdered
    FROM OrderDetail1;
[Remember NOT to duplicate the FL OrderDetails: TeamID = 6]
• Note: The “01” team code (if TeamID is a text field) can be inserted during the append process. Go to the SQL code view prior to running the query. Also, compute the ExtendedPrice from the NumOrdered and QuotedPrice fields during the data load.

Cleaning Up & Transform Maps
• I recommend you move (Export/Import) the completed tables to a “Clean” Data Warehouse; or
• Save TAL1DataWarehouse.mdb as TALTeamx.mdb and delete all the extra tables and transform queries. Retain the TAL1DataWarehouse.mdb file for documentation purposes.
• Under TAB2 it says to include the TRANSFORM Maps. These should include:
  – The Append SQL code you created
  – Notes regarding field size and type changes that you had to perform
  – Coding problems encountered and how you fixed them
  – Recommendations to the DBAs in each territory to improve their table structures for next month’s data loading.
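The OrderDetail append above (a constant team code plus ExtendedPrice computed during the load) is the kind of step a Transform Map should document. As a minimal sketch only, the same logic can be tried outside Access with Python's built-in sqlite3 module. SQLite, the sample rows, the OrderDate column name in the source table, and the helper name append_order_detail are all assumptions made for illustration; none of them come from the course files. INSERT OR IGNORE stands in for Access rejecting rows that violate the primary key.

```python
import sqlite3

# ASSUMPTION: an in-memory SQLite database stands in for the Access work space.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderDetail1A (
    Order_Num TEXT, Item_Num TEXT, Order_Date TEXT, Customer_Num TEXT,
    RN TEXT, TeamID TEXT, NumOrdered INTEGER, QuotedPrice REAL,
    ExtendedPrice REAL,
    PRIMARY KEY (Order_Num, Item_Num)
);
CREATE TABLE OrderDetail1 (
    OrderNum TEXT, ItemNum TEXT, OrderDate TEXT, CustomerNum TEXT,
    RepNum TEXT, NumOrdered INTEGER, QuotedPrice REAL
);
-- Made-up sample rows, for illustration only.
INSERT INTO OrderDetail1 VALUES
    ('010001', 'C101', '2014-03-15', '0101', '090', 2, 10.00),
    ('010001', 'C102', '2014-03-15', '0101', '090', 1, 24.50),
    ('010002', 'C101', '2014-04-02', '0102', '090', 3, 9.50);
""")


def append_order_detail(conn, source_table, team_id):
    """Append one team's rows to the fact table.

    Tags every row with team_id (text) and computes ExtendedPrice during
    the load. Returns the number of rows actually added.
    """
    # source_table is interpolated into the SQL text, so pass trusted names only.
    before = conn.total_changes
    conn.execute(
        f"""
        INSERT OR IGNORE INTO OrderDetail1A
            (Order_Num, Item_Num, Order_Date, Customer_Num, RN, TeamID,
             NumOrdered, QuotedPrice, ExtendedPrice)
        SELECT OrderNum, ItemNum, OrderDate, CustomerNum, RepNum, ?,
               NumOrdered, QuotedPrice, QuotedPrice * NumOrdered
        FROM {source_table}
        """,
        (team_id,),
    )
    conn.commit()
    return conn.total_changes - before


added = append_order_detail(conn, "OrderDetail1", "01")
# Running the same append again adds nothing: the primary key blocks duplicates.
again = append_order_detail(conn, "OrderDetail1", "01")
rows = conn.execute(
    "SELECT Order_Num, Item_Num, TeamID, ExtendedPrice "
    "FROM OrderDetail1A ORDER BY Order_Num, Item_Num"
).fetchall()
```

Recording the number of rows added per source table, as the helper returns, is one simple way to note in a Transform Map how many records each team contributed and whether any were rejected as duplicates.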
Required Table Listings
• Contents of Fact Table and Dimension Tables: Datasheet View Printouts:
  – Use the ACCESS report feature to make a nice-looking report [See examples in REPORTs]
  – Make sure the data is “logically” ordered first
  – Use landscape orientation when appropriate to avoid wrap-around pages
  – Print ONLY 2 pages of longer tables
  – Adjust the line spacing & font size to minimize wasted paper
  – Print ONLY the 1 last page of the TIME table

Reports: Management Reports
• Tab 3: Management Reports
  – Start with a Query, which extracts all the necessary columns from each dimension or fact table
  – Use the Access Report function to create nice-looking reports
  – Use sorting/grouping and subtotals to display the six (6) reports specified for this section.
  – You may suppress “detail” records to shorten the report length, i.e., show only the subtotals and grand totals

Alternative SQL Merge: UNION ALL Query
• Often it is useful to create a UNION query to merge all tables prior to a Make-Table query
• Create the structure you desire with queries, i.e., SELECT statements
• Then UNION the queries
• MAKE Query CUSTOMER:
    SELECT * FROM Customer
    UNION ALL SELECT * FROM Customer2
    UNION SELECT * FROM Customer3
    UNION SELECT * FROM Customer4
    UNION … [more Unions] ;
• NOTE: UNION automatically removes duplicate rows; UNION ALL keeps them!

Additional ACCESS Tools
• Finding Missing Parents: “Find Unmatched Query Wizard”
• Finding Duplicate Records: “Find Duplicates Query Wizard”

Referential Integrity Problem with Item and OrderDetail
[figure: relationship diagram; the time dimension table is labeled Time2008-2016]
Probable Cause: Missing parent record in Item Table

Finding Missing Parents: “Find Unmatched Query Wizard”
• When Referential Integrity appears to be an issue, use:
• Query Wizard:
  – Find Unmatched Query Wizard
Step 1: Select tables to analyze
  1.A. Select OrderDetail, then Next
  1.B. Select Item, then Next
Step 2: Select Match Field(s) & Query columns to display
  2.A. Select Matching Fields: ItemNum in both tables
  2.B. Select Fields that will help identify unmatched records (orphans)
Step 3: Final Steps
  3.A. Name the query, then Finish
  3.B. Sample Results: Check for each ItemNum in Item Table (orphans)

Finding Duplicate Records: Find Duplicates Query Wizard
• When you can’t assign a Primary Key because of duplicate records, use:
• Query Wizard:
  – Find Duplicates Query Wizard
Step 1: Select tables to analyze
  1.A. Select OrderDetail Table, then Next
  1.B. Select Primary keys: OrderNum, ItemNum, then Next
Step 2: Select Extra Field(s) to help identify duplicate records
  Select Extra Fields OrderDate and/or TerritoryID, then Next
  Name the query and Finish
Step 3: Next Steps
  Sample Results: Check for each (OrderNum, ItemNum) in OrderDetail Table
  Prune the duplicates:
    1. Manually
    2. Extract and then
    3. DELETE query

Isolate the Duplicate Keys
Find DISTINCT identifier:
    SELECT DISTINCT OrderDetail.OrderNum, OrderDetail.ItemNum
    FROM OrderDetail
    WHERE (((OrderDetail.OrderNum) In (SELECT [OrderNum] FROM [OrderDetail] As Tmp GROUP BY [OrderNum],[ItemNum] HAVING Count(*)>1 And [ItemNum] = [OrderDetail].[ItemNum])))
    ORDER BY OrderDetail.OrderNum, OrderDetail.ItemNum;
Actual Duplicate Keys (Not so many!)

DELETE Query for Duplicates
DELETE Query SQL View:
    DELETE OrderDetail.*, OrderDetail.OrderNum
    FROM OrderDetail
    WHERE (((OrderDetail.OrderNum) In ("21608","21610","21613","21614","21617","21619","21623")));
[figure: DELETE Query Design View]

Class 11.21B
BCIS 4660 Project 8 Notes Part 1. Add-Ons
Fall 2015

HIDDEN TABLES
There are 2 ways to hide tables (objects) when the Objects Window is getting too cluttered:
1. Hide the Table: right-click on the table name (or any object) to open the options menu. Click “Hide”
2. Create a CUSTOM Group Directory. Preferred method. See below.
Access Objects Window Very Cluttered:
Viewing Custom Categories & Groups (click dropdown arrow):
Note: Object Type is the default view. We will want to build the Custom category options.
Now proceed to next slide.

Creating Custom Categories
Right-click on “All Access Objects”
Select “Navigation Options…”
Select Custom Category
Click “Add Group” to create “subdirectories”
Note: Add Custom Groups … Unassigned Objects is a default group

Access Custom Window
Viewing Custom Categories & Groups (click dropdown arrow):
Note: Now select Custom
Custom Groups appear:

Adding Objects To Custom Categories
Right-click on an object you wish to add:
Select “Data Warehouse Tables” Group
Or simply Drag & Drop the object from one location to another

THE END
• Add-Ons Section