BCIS 4660 Project 8 Notes Part 1 Fall 2015

11/5/2015
Class 11.21
Exercise #8 [TEAM PROJECT] p. 1 of 2
Due: Nov 17 (Sec 1; Tues) & Nov 19 (Sec 2; Thurs)
Points: 40 points
Pratt & Last: TAL Distributors Data Warehouse
Assignments must have cover sheet (deliverable 1), table of contents (deliverable 2), and
indicate the NAMES OF ALL TEAM MEMBERS and the TEAM NUMBER. Assignments
must be typed using a word processor (Word, WordPerfect) and have a professional look.
Use of ACCESS is REQUIRED for this assignment. Place the score sheet in the front
pocket.
• Perform a Data Warehouse ETL for TAL Distributors. Use your [corrected]
Star Diagram from Exercise #7 and populate the tables in the model with the
new data available at the course Web site, combining at least 6 teams.
• Part 8.1 – Data Definitions: Use your corrected Star Diagram from Hwk #7
and other necessary data definition documentation.
– Turn in a List of Relations (deliverable 3). On one (1) separate page, clearly list your
fact and dimension tables using the simplified relational notation. E.g.,
EMPLOYEE [ENum, LastName, FirstName, … deptno, …]
– Be sure to include the TEAM [and/or STATE, TERRITORY] data and add this keyfield
for every record in the ORDER_DETAIL table.
– Turn in copy of the Index Panes for each table (3a)
– Turn in a printed ACCESS ERD of your Star Diagram (deliverable 4). May be from
previous homework, if it was correct.
– Turn in a printed copy of Documenter output (d5) of Table definitions [HINT:
Tools/Analyze/Documenter/Tables – all tables]
Exercise #8 [TEAM PROJECT] p. 2 of 2
• Part 8.2: Load data and print out your Fact and Dimension Tables.
– Load the data provided by the instructor into your tables.
• Note: Dates of all transactions should fall between 1/1/2008 and 12/31/2014
– Turn in Transformation Maps (d6): Carefully document any data cleansing activities that
you performed. Note any data problems you encountered.
– Create the necessary DETAILED transaction (lowest granularity) Fact Table that joins
all the dimension tables. Show your SQL queries.
– Turn in printouts of the contents of your fact and dimension tables (d7). Adjust their
size and give them a professional look [reduce fonts to decrease paper waste]. Use
landscape orientation when necessary. Make sure the data is in some reasonable order,
such as ID or date, whichever is most appropriate. Print only the first 2 pages of each
table. NOTE: See Scoresheet for Merge requirements & BONUS: 6 merges beyond
CA & FL, i.e., your state PLUS 5 more.
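One way to build the detailed (lowest granularity) view is a query that joins the fact table to each dimension on its key. The sketch below assumes the table and field names from the relation list in these notes (OrderDetail, Customer, Rep, Item, Time); your own star diagram may use different names, and a TeamID join would be added once that key is in your fact table. Access requires the nested parentheses around multi-table joins, and Time and Desc are bracketed because they clash with reserved words.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Assumed names (from the relation list): OrderDetail, Customer, Rep, Item, [Time].
SELECT OD.OrderNum, OD.ItemNum, T.OrderDate,
       C.CustName, R.LastName AS RepLastName, I.[Desc] AS ItemDesc,
       OD.NumOrdered, OD.QuotedPrice, OD.ExtendedPrice
FROM (((OrderDetail AS OD
  INNER JOIN Customer AS C ON OD.CustNum = C.CustNum)
  INNER JOIN Rep AS R ON OD.RepNum = R.RepNum)
  INNER JOIN Item AS I ON OD.ItemNum = I.ItemNum)
  INNER JOIN [Time] AS T ON OD.TimeKey = T.TimeKey
ORDER BY OD.OrderNum, OD.ItemNum;
```

Because these are inner joins, any fact row whose key is missing from a dimension drops out of the result, so comparing this query's row count with the fact table's row count is a quick integrity check.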
• Part 8.3: Generate the following SQL Queries & REPORTS:
– Use your knowledge of SQL to create the following COMPUTED queries/views (SQL
Query) and create the corresponding ACCESS REPORTS (all reports MUST have a Grand
total & Break-field Label, e.g. CustomerName, RepName, etc.): (d Tab3)
1. Total Sales by Month (Subtotal by Year)
2. Total Sales by Customer by Month for 2014-2015 (Subtotal by CustomerName)
3. Total Sales by Item by Month for 2014 (Subtotal by ItemName; ASC order by ItemName)
4. Total Sales in 2008-2009 by RepName (DESC order by Rep Sales within Year; Subtotal by Year
for all Reps)
5. Total Sales by Territory (NE, SW, etc.) for 2008-2015 (Subtotal by Year); (Bonus 2pts) Download
to EXCEL and graph it.
6. (BONUS 5pt) Total Sales by TeamID by Year (All Years; Subtotal by TeamID; and Grand Total)
– Turn in printouts & SQL Code used for the 6 reports listed above (d8).
– Turn in .accdb file (d9) with STAR models, data, views, reports (floppy, CD, e-mail
attachment, or bring a flash disk to the instructor’s office).
• Use this naming convention: TALDW08_TeamS.xx.accdb (standard naming convention)
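As a starting point for report 1, the monthly totals can be computed with a GROUP BY query over the fact table and the Time dimension. This is a minimal sketch, assuming the field names from the relation list in these notes (ExtendedPrice, TimeKey, Cal_Year, Month) and assuming Month holds a month number that sorts correctly; adjust the names to your own tables. The yearly subtotal and the grand total would then come from grouping in the Access report rather than from the query itself.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Assumed names: OrderDetail.ExtendedPrice, [Time].Cal_Year, [Time].[Month].
SELECT T.Cal_Year, T.[Month], Sum(OD.ExtendedPrice) AS TotalSales
FROM OrderDetail AS OD
  INNER JOIN [Time] AS T ON OD.TimeKey = T.TimeKey
GROUP BY T.Cal_Year, T.[Month]
ORDER BY T.Cal_Year, T.[Month];
```

The other reports follow the same pattern: add the relevant dimension (Customer, Item, Rep, or territory) to the joins and to the GROUP BY list, and restrict the years with a WHERE clause on Cal_Year.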
Territory Assignments
Team  Territory  ItemNo  RepNo  OrdNo   CustNo  State
1     NW         GME, C  090    010000  0100    WA
2     NW         GME, D  080    020000  0200    OR
3     SW         GME, F  070    030000  0300    AZ
4     SW         PZL, A  060    040000  0400    NM
5     MW         PZL, B  050    050000  0500    MO
6     SE         TOY, G  040    060000  0600    FL
7     NE         TOY, H  030    070000  0700    NY
8     NE         TOY, I  020    080000  0800    ME
9     SW         PZL, J  010    090000  0900    OK
10    SE         PZL, K  100    100000  1000    LA
11    MW         GME, L  110    110000  1100    NE
12    NW         GME, M  200    120000  1200    ID
13    SW         GME, N  190    130000  1300    NV
14    SW         PZL, P  180    140000  1400    TX
15    MW         PZL, R  170    150000  1500    KS
16    SE         TOY, S  160    160000  1600    AL
17    NE         TOY, T  150    170000  1700    NJ
18    NE         TOY, U  140    180000  1800    MA
19    SW         PZL, V  130    190000  1900    CO
20    SW         PZL, W  120    200000  2000    CA
TAL Distributors Star Diagram
[Figure: star diagram; the Time dimension table is labeled Time2008-2016]
5.B. TAL DW -- Relation List
Fact Table
• OrderDetail [OrderNum, ItemNum, TimeKey, CustNum,
RepNum, NumOrdered, QuotedPrice, ExtendedPrice]
Dimension Tables (Full)
• Customer [CustNum, CustName, Street, City, State,
PostalCode, Balance, CreditLimit, (RepNum)]
• Rep [RepNum, LastName, FirstName, Street, City, State,
PostalCode, Commission, Rate]
• Item [ItemNum, Desc, OnHand, Category, Storehouse,
Price, (Allocation)]
• Time [TimeKey, OrderDate, Month, Cal_Year, Fiscal_Year,
Quarter, Month_Key, Month_Day, Serial_Num, Week_Num,
Julian, Day_of_Week, Day_of_Week_Num]
• State [TeamID, StateCode, StateName, TerritoryCode]
MS Project GANTT Chart
• Serial Activities?
– Some tasks MUST be serial
• Parallel Activities?
– Increases human productivity
– One of the most common ways of increasing IT capabilities (N-fold!); e.g.,
• Hard drives
• Printers
• Data Entry devices
DW: Table Load Approaches
BASE TABLE    Team1      Team2      Team3      …    TeamN
CUSTOMER      CUST1      CUST2      CUST3      …    CUSTn
REP           REP1       REP2       REP3       …    REPn
Item          Item1      Item2      Item3      …    Itemn
ORDERDETAIL   ORD_DET1   ORD_DET2   ORD_DET3   …    ORD_DETn
TIME          N/C        N/C        N/C        …    N/C
TEAM          N/C        N/C        N/C        …    N/C
Serial vs. Parallel
Data ETL Procedures: Coding
• Follow Coding guidelines in Exercise #7
– E.g., How you created the OrderDetail table in the last
assignment. Also, see next page
• Initial Pre-Load Procedures:
– (Do this prior to loading all original TAL data into the
TAL OrderDetail Star Diagram)
– Denormalize Customer:Rep, Orders:OrderLine
– All fields MUST be same type and size as in original
TAL operational database
– Enforce Referential Integrity on all dimensions
– Note: CreditLimit constraint ($5,000, $7,500,
$10,000, $15,000) enforced
– Change any dates out of the RANGE 2008-2015
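Before loading, simple SELECT queries can list the rows that break the last two rules. The sketch below assumes a source table named OrderDetail1 with a Date field and a Customer table with CustomerNum, CustomerName, and CreditLimit fields, as in the append examples in these notes; each team's tables may use other names. Access runs one statement per query, so each statement would be saved as its own query.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Query 1 (assumed table OrderDetail1): order lines dated outside 2008-2015.
SELECT OD.OrderNum, OD.ItemNum, OD.[Date]
FROM OrderDetail1 AS OD
WHERE OD.[Date] < #1/1/2008# OR OD.[Date] > #12/31/2015#;

-- Query 2 (assumed table Customer): credit limits outside the four allowed values,
-- including missing ones.
SELECT C.CustomerNum, C.CustomerName, C.CreditLimit
FROM Customer AS C
WHERE C.CreditLimit NOT IN (5000, 7500, 10000, 15000)
   OR C.CreditLimit Is Null;
```

Rows returned by either query are candidates for the cleansing notes in your Transformation Maps.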
Data Load Procedures: General
• Use the TAL1DataWarehouse.accdb/mdb file
as your ETL (Extract, Transform, Load) work
space.
• Note: The TAL Data Warehouse work tables will
be:
– Customer, Item, Rep, Time, Territory and
OrderDetail1A
• You should load all the data into these tables
using appropriate SQL statements from Make-Table, Append-Table, and Update
queries. EXCEL is not acceptable.
– Examples are provided in the Access .accdb files.
Data Load Procedures: Step 1
• Team 1 and Team 2 data have already been loaded into the work tables.
“Loaded” means that the External Data/Access File option was used to locate
the Team1 & Team2 databases and to extract the Customer tables from both.
• Note: The remaining tables are numbered as
follows: Customer3, Customer4, …, Item3, Item4,
and so on.
• Note: Customer3 and Customer4 tables were also
loaded, see Append examples on the next pages
– There are several examples in the Access .mdb file
under Query Objects, Form Objects, and Report
Objects.
Creating Append Tables:
Customer
• The Append Customer2 query demonstrates how to
Append Customer2 table to your base Customer Table:
INSERT INTO Customer ( CustomerNum, CustomerName, Street, City,
State, Zip, Balance, CreditLimit )
SELECT Customer2.Customer_Num, Customer2.Customer_Name,
Customer2.Street, Customer2.City, Customer2.State,
Customer2.Zip, Customer2.Balance, Customer2.Credit_Limit
FROM Customer2
WHERE (((Customer2.State)<>"FL"));
Note: The WHERE clause was used to exclude the “FL” customers
from the APPEND-TABLE action query. This is not necessary, as
duplicate records WILL NOT be added to a table where the primary
key constraint is enforced. However, this clause prevents a warning
message that “these records will not be added due to key
constraints.” Do the same for Team2 thru TeamN Customer Tables.
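An alternative to filtering on State is to skip any incoming row whose key already exists in the base table, which avoids the key-violation warning regardless of which state the overlap comes from. This is a sketch only: it assumes Customer3 uses the same field names as Customer2 (Customer_Num, Customer_Name, Credit_Limit, and so on), which may not hold for every team's table.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Assumption: Customer3 has the same field names as Customer2.
INSERT INTO Customer ( CustomerNum, CustomerName, Street, City,
                       State, Zip, Balance, CreditLimit )
SELECT C3.Customer_Num, C3.Customer_Name, C3.Street, C3.City,
       C3.State, C3.Zip, C3.Balance, C3.Credit_Limit
FROM Customer3 AS C3
  LEFT JOIN Customer AS C ON C3.Customer_Num = C.CustomerNum
WHERE C.CustomerNum Is Null;
```

The LEFT JOIN keeps every Customer3 row, and the Is Null test keeps only those with no match in Customer, so rows already loaded are left out of the append.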
Extracting Customer Table from Team 1
Extracting (E)
Select the Customer Table
It was added as Customer1
Build the Append Query (TL)
Then Run it!
CUSTOMER ETL Notes
• You will need to load (L) Customer5, Customer6,
Customer7 and Customer8 into Customer. Do one at a
time.
• Double check all field properties, types, sizes, and
constraints
• Verify codes of new records agree with the coding
convention for that Territory.
• Make sure Primary key is declared
• Credit Limit constraints will be enforced
• Adjust foreign keys
– Remove Rep key during Append process
– Make sure the Rep key has been placed in the OrderDetailx
table
• TRANSFORM MAPS
REP Notes
• Adjust Primary key field type and size
• Make sure this key has been added to the OrderDetailx
table
• Verify codes of new records agree with the coding
convention for that Territory [Append Rep2 query].
INSERT INTO Rep ( RepNum, LastName, FirstName, Street, City,
State, Zip, Commission, Rate )
SELECT Rep2.RN, Rep2.Last_Name, Rep2.First_Name,
Rep2.Street, Rep2.City, Rep2.State, Rep2.Zip,
Rep2.Commision, Rep2.Rate
FROM Rep2
WHERE (((Rep2.State)<>"FL"));
• Note: Field names do not need to be identical; however,
field types and sizes do need to be consistent.
TERRITORY Notes
• Use TeamID as Primary Key; FL = 6; CA = 20
• Use the following Territory IDs: NW, SW,
MW, SE, NE
• The Territory Table has been updated for
all the approved team and territory codes.
• Cleanse spellings, geographical
inconsistencies, territorial redundancies,
from OrderDetailx data.
• This field may need to be added to the
OrderDetailx table for Teamxx.
Item Notes
• Complete descriptions (names), and
associated data
• Text fields need to be set to correct size to
avoid truncation [Append Item2 query]:
INSERT INTO Item ( ItemNum, Description,
OnHand, Class, Storehouse, Price )
SELECT Item2.Item_Num, Item2.Description,
Item2.OnHand, Item2.Class, Item2.Storehouse,
Item2.Price FROM Item2
WHERE Item2.Item_Num NOT IN ("AT94",
"BV06", … "KV94")
AND Item2.Item_Num NOT LIKE "G*";
Order Detail
• This is the most complex data load and MUST be
done after the other loads have been completed
for each territory/team
• Since you will be enforcing referential integrity
and primary key constraints, the Customer, Rep,
Item, and Territory tables must all be updated
before this append will work.
• Make sure the field types and sizes of the
OrderDetailx table agree with those of OrderDetail1A
• A computed field called ExtendedPrice is added
to the Fact Table. This will save
computational time later.
OrderDetail Notes
• The Append query has been modified for OrderDetail1
and is shown here for illustrative purposes [Append
OrderDetail1 query]. Assumes TeamID is text item:
INSERT INTO OrderDetail1A ( Order_Num, Item_Num, Order_Date,
Customer_Num, RN, TeamID, NumOrdered, QuotedPrice,
ExtendedPrice )
SELECT OrderDetail1.OrderNum, OrderDetail1.ItemNum,
OrderDetail1.Date, OrderDetail1.CustomerNum,
OrderDetail1.RepNum, 1, OrderDetail1.NumOrdered,
OrderDetail1.QuotedPrice, OrderDetail1.QuotedPrice *
OrderDetail1.NumOrdered
FROM OrderDetail1;
[Remember NOT to duplicate the FL OrderDetails: TeamID = 6]
• Note: The team code (“01” if TeamID is a text field) can be inserted
during the append process: go to SQL view and add it to the SELECT list before
running the query. Also compute the ExtendedPrice from
the NumOrdered and QuotedPrice fields during the data
load.
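If some rows were appended without ExtendedPrice, an Update query can fill it in afterwards. The sketch assumes the OrderDetail1A field names used in the append query above (NumOrdered, QuotedPrice, ExtendedPrice) and only touches rows where the value is still missing.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Fill in ExtendedPrice only where the append left it empty.
UPDATE OrderDetail1A
SET ExtendedPrice = [NumOrdered] * [QuotedPrice]
WHERE ExtendedPrice Is Null;
```

Computing the value during the append remains the simpler route; this query is a repair step for rows that were loaded before the computed field was added.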
Cleaning Up & Transform Maps
• I recommend you move (Export/Import) the completed
tables to a “Clean” Data Warehouse; or
• Save TAL1DataWarehouse.mdb as TALTeamx.mdb and
delete all the extra tables and transform queries. Retain
the TAL1DataWarehouse.mdb file for documentation
purposes.
• Under TAB2 it says to include the TRANSFORM Maps.
This should include:
– The Append SQL code you created
– Notes: regarding field size and type changes that you had to
perform
– Coding problems encountered and how you fixed them
– Recommendations to the DBAs in each territory to improve their
table structures for next month’s data loading.
Required Table Listings
• Contents of Fact Table and Dimension
Tables: Datasheet View Printouts:
– Use ACCESS report feature to make a nice
looking report [See examples in REPORTs]
– Make sure data is “logically” ordered first;
– Use landscape orientation when appropriate
to avoid wrap around pages
– Print ONLY 2 pages of longer tables
– Adjust the line spacing & font size to minimize
the wasted paper
– Print ONLY 1 last page of the TIME table
Reports: Management Reports
• Tab 3: Management Reports
– Start with a Query, which extracts all the
necessary columns from each dimension or fact
table
– Use the Access Report function to create nice-looking
reports
– Use sorting/grouping and subtotals to display
the six (6) reports specified for this section.
– You may suppress “detail” records to shorten the
report length. i.e., show only the subtotals and
grand totals
Alternative SQL Merge:
UNION ALL Query
• Often it is useful to create a UNION query to
merge all tables prior to a Make-Table query
• Create the structure you desire with queries, i.e.,
SELECT statements
• Then UNION the queries
• MAKE Query CUSTOMER:
SELECT * FROM Customer UNION ALL
SELECT * FROM Customer2 UNION
SELECT * FROM Customer3 UNION
SELECT * FROM Customer4 UNION … [more Unions]
;
• NOTE: UNION automatically deletes duplicate rows; UNION ALL keeps them!
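Once the UNION query is saved, a Make-Table query can materialize it, and a primary key can then be declared on the result. The names qryCustomerUnion, CustomerMerged, and pkCustomerMerged below are hypothetical, and CustomerNum is assumed to be the key field name coming from the first SELECT in the union.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Statement 1: Make-Table query from the saved union (hypothetical name qryCustomerUnion).
SELECT U.* INTO CustomerMerged
FROM qryCustomerUnion AS U;

-- Statement 2: run separately; declares the primary key on the new table.
ALTER TABLE CustomerMerged
  ADD CONSTRAINT pkCustomerMerged PRIMARY KEY (CustomerNum);
```

UNION only removes rows that are identical in every column, so two rows with the same CustomerNum but different data both survive; in that case the ALTER TABLE step fails, which flags the conflict for cleansing.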
Additional ACCESS Tools
• Finding Missing Parents:
“Find Unmatched Query Wizard”
• Finding Duplicate Records:
“Find Duplicates Query Wizard”
Referential Integrity Problem with Item and OrderDetail
Probable Cause: Missing parent record in Item Table
Finding Missing Parents:
“Find Unmatched Query Wizard”
• When Referential Integrity appears to be an issue, use:
• Query Wizard:
– Find Unmatched Query Wizard
Step 1: Select tables to analyze
1.A. Select OrderDetail, then Next
1.B. Select Item, then Next
Step 2: Select Match Field(s) & Query columns to display
2.A. Select Matching Fields: ItemNum in both tables
2.B. Select Fields that will help identify unmatched records (orphans)
3. Final Steps
3.A. Name the query
Finish 
3.B. Sample Results:
Check for each ItemNum
in Item Table (orphans)
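The same orphan check can be written by hand as an outer join. The sketch below assumes the OrderDetail and Item tables share an ItemNum field, as in the wizard steps above; the query the wizard saves should have roughly this shape, though its exact text may differ.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Order lines whose ItemNum has no parent record in Item (orphans).
SELECT OD.OrderNum, OD.ItemNum
FROM OrderDetail AS OD
  LEFT JOIN Item AS I ON OD.ItemNum = I.ItemNum
WHERE I.ItemNum Is Null
ORDER BY OD.ItemNum, OD.OrderNum;
```

Each ItemNum returned needs either a new parent record in Item or a corrected code in OrderDetail before referential integrity can be enforced.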
Finding Duplicate Records:
Find Duplicates Query Wizard
• When you can’t assign a Primary Key because of duplicate records, use:
• Query Wizard:
– Find Duplicates Query Wizard
Step 1: Select tables to analyze
1.A. Select OrderDetail Table
Next 
1.B. Select Primary keys:
OrderNum, ItemNum
Next 
Step 2: Select Extra Field(s) to
help identify dup. records
Select Extra Fields
OrderDate and/or TerritoryID
Next 
Name the query and
Finish 
Step 3: Next Steps
Sample Results:
Check for each
(OrderNum, ItemNum)
in OrderDetail Table
Prune the duplicates:
1. Manually
2. Extract and then
3. DELETE query
Isolate the Duplicate Keys
Find DISTINCT identifier
Actual Duplicate Keys (Not so many!)
SELECT DISTINCT OrderDetail.OrderNum, OrderDetail.ItemNum
FROM OrderDetail
WHERE (((OrderDetail.OrderNum) In
(SELECT [OrderNum] FROM [OrderDetail] As Tmp
GROUP BY [OrderNum],[ItemNum]
HAVING Count(*)>1 And [ItemNum] = [OrderDetail].[ItemNum])))
ORDER BY OrderDetail.OrderNum, OrderDetail.ItemNum;
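A shorter query gives the same list of duplicate keys and also shows how many copies of each exist. It is a sketch that assumes (OrderNum, ItemNum) is the intended composite key of OrderDetail; NumRows is just an illustrative column alias.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- One row per duplicated (OrderNum, ItemNum) pair, with its number of copies.
SELECT OD.OrderNum, OD.ItemNum, Count(*) AS NumRows
FROM OrderDetail AS OD
GROUP BY OD.OrderNum, OD.ItemNum
HAVING Count(*) > 1
ORDER BY OD.OrderNum, OD.ItemNum;
```

When this query returns no rows, the composite primary key can be declared on the table.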
DELETE Query for Duplicates
DELETE Query (SQL View):
DELETE OrderDetail.*, OrderDetail.OrderNum
FROM OrderDetail
WHERE (((OrderDetail.OrderNum) In
("21608","21610","21613","21614","21617","21619","21623")));
DELETE Query (Design View): [screenshot]
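Note that a DELETE filtered on OrderNum removes every row for the listed orders, not only the extra copies, which is why the extract step comes first. One possible sequence is sketched below: copy one instance of each affected row to a holding table, delete, then append the rows back. OrderDetailKeep is a hypothetical table name, and SELECT DISTINCT only collapses rows that are identical in every field, so near-duplicates still need manual review.

```sql
-- Comment lines are annotations only; delete them before pasting into Access SQL view.
-- Run each statement as its own query, in this order.

-- 1. Extract: one copy of each row for the affected orders (hypothetical table OrderDetailKeep).
SELECT DISTINCT OD.* INTO OrderDetailKeep
FROM OrderDetail AS OD
WHERE OD.OrderNum In ("21608","21610","21613","21614","21617","21619","21623");

-- 2. Delete: remove all rows for those orders from the fact table.
DELETE FROM OrderDetail
WHERE OrderNum In ("21608","21610","21613","21614","21617","21619","21623");

-- 3. Reload: append the de-duplicated rows back.
INSERT INTO OrderDetail
SELECT K.* FROM OrderDetailKeep AS K;
```

Working on a backup copy of the table is advisable, since the DELETE cannot be undone once the query has run.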
Class 11.21B
BCIS 4660
Project 8 Notes
Part 1.Add-Ons
Fall 2015
HIDDEN TABLES
There are 2 ways to Hide tables (objects),
when the Objects Window is getting too
cluttered:
1. Hide the Table; Use Right-click on table
name (or any object) to open options
menu. Click “Hide”
2. Create a CUSTOM Group Directory.
Preferred method. See below.
Access Objects Window: Very Cluttered
Viewing Custom Categories & Groups (click dropdown arrow):
Note: Object Type is the default view. We will want to build the Custom category
options.
Now proceed to next slide.
Creating Custom Categories
Right-click on “All Access Objects”
Select “Navigation Options…”
Select Custom Category
Click “Add Group” to create “subdirectories”
Add Custom Groups …
Note: Unassigned Objects is a default group
Access Custom Window
Viewing Custom Categories & Groups (click dropdown arrow):
Custom Groups appear:
Note: Now select Custom
Adding Objects To Custom Categories
Right-click on an object
you wish to:
Select “Data Warehouse
Tables” Group
Or Simply Drag & Drop Object from one location to another
THE END
• Add-Ons Section