The Grocery Store

Data Warehousing
(Kimball, Ch. 2-4)
Dr. Vairam Arunachalam
School of Accountancy, MU
Grocery Store case terminology
SKUs – Stock-keeping units
UPCs – Universal Product Codes
POS system – Point of Sale system
Promotions – temporary price reductions (TPRs),
newspaper ads, newspaper inserts, displays (shelf
displays and end-aisle displays), coupons
Promotion dimension – lift, baseline sales,
time shifting, cannibalization, growing the
market
Sep. 9, 1999
Dr. Vairam Arunachalam
The Grocery Store

Steps in the Design Process:
– Choose a business process to model (e.g.,
daily item movement)
– Choose the grain of the business process (e.g.,
SKU by store by promotion by day)
– Choose the dimensions applicable to the fact
table (e.g., time, product, store, promotion)
– Choose the measured facts (e.g., dollar sales,
unit sales, dollar cost, customer count)
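
The four design choices above can be sketched as a small star schema. This is only an illustration using Python's built-in sqlite3 module; every table and column name (sales_fact, time_dim, and so on) is an assumption made for the example, not taken from Kimball's figures.

```python
import sqlite3

# Hypothetical star schema for the "daily item movement" business process.
# Grain: one fact row per SKU, per store, per promotion, per day.
SCHEMA = """
CREATE TABLE time_dim      (time_key INTEGER PRIMARY KEY, day_date TEXT, day_of_week TEXT);
CREATE TABLE product_dim   (product_key INTEGER PRIMARY KEY, sku TEXT, description TEXT, category TEXT);
CREATE TABLE store_dim     (store_key INTEGER PRIMARY KEY, store_name TEXT, region TEXT);
CREATE TABLE promotion_dim (promotion_key INTEGER PRIMARY KEY, promotion_name TEXT, ad_type TEXT);

CREATE TABLE sales_fact (
    -- One foreign key per chosen dimension; together they express the grain.
    time_key       INTEGER REFERENCES time_dim(time_key),
    product_key    INTEGER REFERENCES product_dim(product_key),
    store_key      INTEGER REFERENCES store_dim(store_key),
    promotion_key  INTEGER REFERENCES promotion_dim(promotion_key),
    -- The measured facts.
    dollar_sales   REAL,
    unit_sales     INTEGER,
    dollar_cost    REAL,
    customer_count INTEGER,
    PRIMARY KEY (time_key, product_key, store_key, promotion_key)
);
"""


def build_schema():
    """Create the illustrative schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def fact_table_columns(conn):
    """Return the column names of the fact table, in declaration order."""
    rows = conn.execute("PRAGMA table_info(sales_fact)").fetchall()
    return [row[1] for row in rows]
```

In this sketch the composite primary key of the fact table is exactly the list of dimension keys, which is one way of seeing how the grain statement fixes the dimensionality.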
Salient Principles
– The data warehouse almost always
demands data expressed at the lowest
possible grain of each dimension…
– A careful grain statement determines the
dimensionality of the fact table.
– The number of sales transaction line items
in a business can be estimated by dividing
the gross revenue of the business by the
average price of a sales item.
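
The line-item estimate in the last principle is simple arithmetic. A minimal sketch follows; the function name and the revenue and price figures are hypothetical, chosen only to make the division concrete.

```python
def estimate_line_items(gross_revenue, average_item_price):
    """Estimate the number of sales line items as revenue / average price."""
    if average_item_price <= 0:
        raise ValueError("average_item_price must be positive")
    return gross_revenue / average_item_price


# Hypothetical chain: $4 billion in annual revenue, $2.00 average item price.
annual_line_items = estimate_line_items(4_000_000_000, 2.00)
print(f"{annual_line_items:,.0f} line items per year")
```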
Salient Principles (contd.)
– The fact table in a dimensional schema is
naturally highly normalized
– Efforts to normalize any of the tables in a
dimensional database solely in order to
save disk space are a waste of time
– The dimension tables must not be
normalized but should remain as flat
tables. (Because?…)
Salient Principles (contd.)
– Most data warehouses need an explicit time
dimension table even though the primary
time key may be an SQL date-valued object.
(Because?…)
– Drilling down in a data warehouse is adding
row headers from the dimension tables.
Drilling up is subtracting row headers.
– The product dimension is one of the primary
dimensions in nearly every data warehouse.
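
Drilling down as "adding row headers" can be illustrated with a query whose GROUP BY list grows. The sketch below uses sqlite3 with a made-up product dimension and fact table; the names sales_by, category, and brand, and all of the data, are assumptions for the example.

```python
import sqlite3

ALLOWED_HEADERS = ("category", "brand")  # hypothetical product attributes


def make_sales_db():
    """Build a tiny in-memory database with made-up sales data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, category TEXT, brand TEXT);
        CREATE TABLE sales_fact (product_key INTEGER, dollar_sales REAL);
        INSERT INTO product_dim VALUES
            (1, 'Snacks', 'Crunchy'), (2, 'Snacks', 'Salty'), (3, 'Dairy', 'Creamy');
        INSERT INTO sales_fact VALUES (1, 10.0), (1, 5.0), (2, 7.0), (3, 4.0);
    """)
    return conn


def sales_by(conn, row_headers):
    """Total dollar sales grouped by the given dimension attributes."""
    if not row_headers:
        raise ValueError("at least one row header is required")
    for header in row_headers:
        if header not in ALLOWED_HEADERS:
            raise ValueError(f"unknown row header: {header}")
    cols = ", ".join(row_headers)
    sql = (
        f"SELECT {cols}, SUM(f.dollar_sales) "
        "FROM sales_fact f JOIN product_dim p ON f.product_key = p.product_key "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()


conn = make_sales_db()
summary = sales_by(conn, ["category"])          # drilled up: one row header
detail = sales_by(conn, ["category", "brand"])  # drilled down: brand added
```

Moving from summary to detail adds the brand row header; dropping it again is drilling up.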
Normalization review
1NF: no repeating groups; primary key defined
2NF: non-key attributes functionally dependent on the entire primary key
3NF: no dependencies between non-key attributes

Other Issues
Database sizing
Domain transfer – design variations
Additive vs. semi-additive (or non-additive) facts

The Warehouse

Inventory Models:
– The Inventory Snapshot model
– Delivery Status model
– Transaction model
Inventory Snapshot Model
– Fig. 3.2
– Gross Margin Return on Inventory (GMROI)
= [(Qty Ship) * (Value at LSP - Value at Cost)] / [(Daily Avg Qty) * (Value at LSP)],
where LSP is the latest selling price
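
The GMROI formula can be written as a small function. This is a sketch under the assumption that "Value at LSP" and "Value at Cost" are per-unit values; the function name, parameter names, and sample numbers are all hypothetical.

```python
def gmroi(qty_shipped, value_at_lsp, value_at_cost, daily_avg_qty):
    """Gross Margin Return on Inventory, following the slide's formula.

    Assumes value_at_lsp and value_at_cost are per-unit values.
    """
    denominator = daily_avg_qty * value_at_lsp
    if denominator == 0:
        raise ValueError("daily_avg_qty * value_at_lsp must be non-zero")
    return qty_shipped * (value_at_lsp - value_at_cost) / denominator


# Hypothetical SKU: 100 units shipped, $5.00 latest selling price,
# $3.00 cost, 20 units on hand on an average day.
example = gmroi(qty_shipped=100, value_at_lsp=5.0, value_at_cost=3.0, daily_avg_qty=20)
```

With these made-up inputs the margin earned is 200 against an average inventory value of 100 at selling price, so the ratio is 2.0.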
Delivery Status Model
– Steps:
    – Received
    – Inspected
    – Placed into inventory
    – Authorized to sell
    – Picked from inventory
    – Boxed
    – Shipped
Delivery Status Model (contd.)
– Exception conditions:
    – Failed inspection
    – Returned to vendor
    – Damaged in handling
    – Lost
    – Returned from customer
    – Returned to inventory
    – Written off
    – Refunded
– Fig. 3.3
Transaction Model
– Includes:
    – Receive shipment line item
    – Place SKU into inspection hold
    – Release SKU from inspection hold
    – Place SKU into inspection failed with reason
    – Mark SKU for return to vendor with reason
    – Place SKU in bin
    – Authorize SKU for sale
    – Pick SKU from bin
Transaction Model (contd.)
– Includes:
    – Package SKU for shipment
    – Ship SKU to customer
    – Bill customer
    – Receive SKU from customer with reason
    – Return SKU to inventory from customer return
    – Remove SKU from inventory with reason
– Fig. 3.4
Transaction Model (contd.)
– Sample queries:
    – How many times have we placed a product into an
      inventory bin on the same day we have picked the
      product from the same bin at a different time?
    – What is the clustering in time of customer returns
      of a particular SKU?
    – How many separate shipments did we receive from
      vendor X and when did we get them?
    – On which SKUs have we had more than one round
      of QA inspection failures that caused the return of
      the product to the vendor?
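
One of the sample queries (shipments received from vendor X) can be sketched against a transaction-grained fact table. The table inventory_txn_fact, its columns, and its rows are invented for illustration and are not the layout of Fig. 3.4.

```python
import sqlite3


def make_txn_db():
    """Build a tiny in-memory transaction fact table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE inventory_txn_fact (
            txn_date TEXT, txn_type TEXT, vendor TEXT, sku TEXT,
            shipment_no TEXT, quantity INTEGER);
        INSERT INTO inventory_txn_fact VALUES
            ('1999-09-01', 'receive shipment line item', 'Vendor X', 'SKU-1', 'S100', 40),
            ('1999-09-01', 'receive shipment line item', 'Vendor X', 'SKU-2', 'S100', 10),
            ('1999-09-03', 'receive shipment line item', 'Vendor X', 'SKU-1', 'S101', 25),
            ('1999-09-03', 'receive shipment line item', 'Vendor Y', 'SKU-3', 'S200', 5),
            ('1999-09-04', 'pick SKU from bin',          'Vendor X', 'SKU-1', NULL,  3);
    """)
    return conn


def shipments_received(conn, vendor):
    """Return (shipment_no, received_date) for each shipment from a vendor."""
    sql = """
        SELECT shipment_no, MIN(txn_date)
        FROM inventory_txn_fact
        WHERE txn_type = 'receive shipment line item' AND vendor = ?
        GROUP BY shipment_no
        ORDER BY MIN(txn_date), shipment_no
    """
    return conn.execute(sql, (vendor,)).fetchall()


conn = make_txn_db()
received = shipments_received(conn, "Vendor X")
```

Grouping by the shipment number collapses the line items of one shipment into a single row, so the length of the result is the number of separate shipments.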
Transaction Model (contd.)
– Transplant context (e.g., FedEx)
– Compare models
Salient principles
– All measures that record a static level (such as…)
are inherently nonadditive across time. However,
in these cases the measure may be usefully
aggregated across time by averaging over the
number of time periods.
– Document control numbers (such as…) usually are
presented as degenerate dimensions (i.e.,
dimension keys with no corresponding dimension
table) in fact tables where the grain of the table is
the document itself or a line item in the
document.
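
The first principle above can be illustrated with a few snapshot rows. The sketch assumes inventory quantity on hand as an example of a static level; the data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical daily inventory snapshot rows: (day, sku, quantity_on_hand).
SNAPSHOTS = [
    ("1999-09-01", "SKU-1", 100), ("1999-09-01", "SKU-2", 50),
    ("1999-09-02", "SKU-1", 80),  ("1999-09-02", "SKU-2", 50),
    ("1999-09-03", "SKU-1", 60),  ("1999-09-03", "SKU-2", 20),
]


def total_on_hand_by_day(rows):
    """Summing a level across products within a single day is meaningful."""
    totals = defaultdict(int)
    for day, _sku, qty in rows:
        totals[day] += qty
    return dict(totals)


def average_on_hand_over_time(rows):
    """Across time, average the daily totals over the number of periods."""
    daily = total_on_hand_by_day(rows)
    return sum(daily.values()) / len(daily)


daily_totals = total_on_hand_by_day(SNAPSHOTS)
average_level = average_on_hand_over_time(SNAPSHOTS)
```

Adding the three daily totals would give 360 units, a number that never existed in the warehouse; dividing by the three periods gives an average level of 120.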
Salient principles (contd.)

Exceptions to absolute additivity in the fact
table can be made where the additive
measures are more conveniently delivered in
a view. Examples include computed time
spans from a large number of date fields, as
well as extended monetary amounts derived
from unit costs and prices. In such a case, it
is important to have all users access the view
instead of the underlying table. (Tie to 3NF)
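
Delivering derived measures through a view might look like the following sqlite3 sketch. The shipment_fact table, the shipment_fact_v view, and every column and row are assumptions made for the example.

```python
import sqlite3


def make_shipments_db():
    """Base table stores unit values and dates; the view adds derived measures."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE shipment_fact (
            order_no TEXT, sku TEXT, quantity INTEGER,
            unit_price REAL, unit_cost REAL,
            order_date TEXT, ship_date TEXT);

        -- Users query the view, so the extended amounts and the time span
        -- are always computed the same way.
        CREATE VIEW shipment_fact_v AS
        SELECT order_no, sku, quantity,
               quantity * unit_price AS extended_price,
               quantity * unit_cost  AS extended_cost,
               CAST(julianday(ship_date) - julianday(order_date) AS INTEGER) AS days_to_ship
        FROM shipment_fact;

        INSERT INTO shipment_fact VALUES
            ('A1', 'SKU-1', 10, 2.50, 1.50, '1999-09-01', '1999-09-04'),
            ('A2', 'SKU-2',  4, 5.00, 3.00, '1999-09-02', '1999-09-03');
    """)
    return conn


def total_extended_price(conn):
    """Sum the additive extended price through the view."""
    return conn.execute("SELECT SUM(extended_price) FROM shipment_fact_v").fetchone()[0]


def days_to_ship(conn):
    """Return the computed order-to-ship span for each order."""
    rows = conn.execute("SELECT order_no, days_to_ship FROM shipment_fact_v ORDER BY order_no")
    return rows.fetchall()
```

Unit price is not additive across rows, but the extended price exposed by the view is, which is why the sketch sums only the view's column.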
Shipments
The ideal shipments fact table (Fig. 4.1)
Typical customer ship-to dimension (Fig. 4.2)
Typical deal dimension (Fig. 4.3)
Typical ship mode dimension (Fig. 4.4)
