Business Intelligence / Decision Models
Week 2
IT Infrastructure & Marketing Database Design and Implementation

Outline
• Issues with Mkt Databases
• DBMS
• Database Design and Schemas
• Data Integrity and Hygiene
• Demo and Lab: Table redundancy and Queries

DB Marketing Problems
• Lack of a marketing strategy.
• Focus on promotions instead of relationships.
• Failure to have a 360° picture of every customer.
• Failure to personalize your communications.
• Building a DB and sending e-mails in house.
• Getting the economics wrong.
• Failure to use tests and controls.
• Lack of a forceful leader.
• Bad DB architecture
• Corrupted data

DB Environment

Traditional Environment: Silo Approach
Source: Laudon and Laudon 2012

Data Warehouse Technology

Marketing Datamart

Data Warehouse Architecture

Metadata

Database Management Systems (DBMS)

Flat Files
• Sequential
• Fixed or variable length record
[Figure: sequential records with Name, Address, and Transactions fields]

DBMS with VSAM Index
[Figure: index on province code (TN, NE, NB, IPE, QC, ON, MB, SK, AB, BC) pointing to records]

Hierarchical Indexed Direct Access DBMS
• Top down: Cust_id, Name, Purchases, Products

Indexed Direct Access DBMS

Index
Key     Record
107     4
110     6
145     1
167     2
234     5
267     3

Records
Record  Key     Data
1       145     ……….
2       167     ……….
3       267     ……….
4       107     ……….
5       234     ……….
6       110     ……….
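The index and record tables on the Indexed Direct Access slide can be read as a two-step lookup: search the sorted index for the key, then jump straight to the record number it gives. The following is a minimal sketch of that idea only, not how any particular DBMS implements it. The key and record values are taken from the slide; the function names `find_record_number` and `read_record` are hypothetical, and the record contents are reduced to the stored key because the slide elides the rest.

```python
from bisect import bisect_left

# Index from the slide: keys kept in sorted order, each paired with a record number.
INDEX_KEYS = [107, 110, 145, 167, 234, 267]
INDEX_RECORD_NUMBERS = [4, 6, 1, 2, 5, 3]

# Records in physical (arrival) order: record number -> stored key.
# The remaining fields are elided on the slide, so they are omitted here.
RECORDS = {1: 145, 2: 167, 3: 267, 4: 107, 5: 234, 6: 110}


def find_record_number(key):
    """Binary-search the sorted index; return the record number or None."""
    pos = bisect_left(INDEX_KEYS, key)
    if pos < len(INDEX_KEYS) and INDEX_KEYS[pos] == key:
        return INDEX_RECORD_NUMBERS[pos]
    return None


def read_record(key):
    """Go directly to the record for `key` without scanning the file."""
    record_number = find_record_number(key)
    if record_number is None:
        return None
    return RECORDS[record_number]
```

Calling `find_record_number(145)` consults only the index and then reads record 1 directly, whereas a sequential flat file would have to be scanned from the start.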
Reversed Hierarchical DBMS
• Bottom up / Top down: Cust_id, Name, Psyte Code, Lifestyle, Purchases, Products

Reversed Hierarchical DBMS

NAME        PSYTE   PURCHASES
Dubé        18      120
Smith       34      130
Bertrand    18      150
White       56      200
Harris      34      50
Habib       18      300
Jones       34      430

PSYTE   NAMES
18      Dubé; Bertrand; Habib
34      Smith; Harris; Jones
56      White

Relational Database

CUSTOMERS               ORDERS                  PRODUCTS
Customer ID (PK)        Order ID (PK)           Product ID (PK)
Cust First Name         Customer ID (FK)        Product Name
Cust Last Name          Product ID (FK)         Product Description
Street                  Order Date
City                    Order Amount
State
Zip

Relational DBMS: Multiple Tables
Source: Laudon and Laudon 2012

Relational DBMS with Query
Source: Laudon and Laudon 2012

Relational Design

An Unnormalized Relation For Order (flat file)
An unnormalized relation contains repeating groups. For example, there can be many parts and suppliers for each order. There is only a one-to-one correspondence between Order Number and Order Date.
Source: Laudon and Laudon 2012

Normalized Tables Created From Order
• Pros: Data integrity and updating
• Cons: Processing speed for large data sets
Source: Laudon and Laudon 2012

Charitable Contributions

The “Classic” Star Schema
Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr., Level
Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
Product Dimension: PRODUCT KEY, Product Desc.,
Brand, Color, Size, Manufacturer, Level
Time Dimension: PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Resolution, Sequence

• A single fact table, with detail and summary data
• Fact table primary key has only one key column per dimension
• Each key is generated
• Each dimension is a single table, highly de-normalized
• Tradeoff between data integrity, updating and speed
• Some alternatives: Star and Snowflake structure
• Benefits: easy to understand, easy to define hierarchies, reduces the number of physical joins, low maintenance, very simple metadata
Source: Kishore-jaladi-DW.ppt

Data Integrity and Hygiene

Illustrating Data Hygiene
Assumptions: CPM = $500; Price = $60; GM = 50% (margin per order P − C = $30)

                                All customers     Less undel. (15%)   Less dup. (20%)
Quantity                        2,000,000         1,700,000           1,360,000
Response                        29,000            29,000              29,000
Cost                            $1,000,000        $850,000            $680,000
Revenue (responses × $30)       $870,000          $870,000            $870,000
BE = FC / (P − C)               1,000,000 / $30   850,000 / $30       680,000 / $30
Break-even responses            33,334            28,334              22,667
Response Rate                   1.45%             1.71%               2.13%
CPO                             $34.48            $29.31              $23.45
Profit                          -$130,000         $20,000             $190,000
ROI                             -13%              2%                  28%

Data Hygiene Processes (1)
• Standardize names
• Standardize addresses
    • Address 1, Address 2, City, Province, Postal Code
    • Abbreviations (apt., ave, p.o., province)
    • Replace prestige names with postal addresses (e.g. Commerce Court)
• Scrubbing
    • Title, First name, Initials, Family name, Suffix
    • Ex.:
      c/o, co, c/o
• Delivery: FSA/LDU, Postal walk
• Address change database

Data Hygiene Processes (2)
• Data Comparison
• Duplicate (cost, abuse)
• Householding
    • Hyphenated names, maiden names, spouse’s name
    • Blended families, roommates
• Consolidation (merge/purge)
    • Multiple accounts (financial services)
    • Multiple policies (insurance)
    • Multiple phone numbers (telco)
    • Multiple divisions within firm

Wrap-up
• Issues with Mkt Databases
• DBMS
• Database Design and Schemas
• Data Integrity and Hygiene
• Demo and Lab: Table redundancy and Queries

Next Week
• Data Import
• Data Preparation
• Data Transformation
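
Worked check: Illustrating Data Hygiene
The figures on the Illustrating Data Hygiene slide earlier in this deck can be reproduced with a few lines of arithmetic. This is a sketch under the slide's stated assumptions (CPM = $500, Price = $60, GM = 50%, a fixed 29,000 responses) and one further assumption of mine: that break-even responses are rounded up to the next whole order, which matches the slide's 33,334. The function name `campaign_economics` and the dictionary keys are hypothetical labels, not terms from the course.

```python
import math

CPM = 500.0            # mailing cost per thousand pieces (from the slide)
PRICE = 60.0           # selling price per order (from the slide)
GROSS_MARGIN = 0.50    # gross margin rate (from the slide)
RESPONSES = 29_000     # orders received, held constant across scenarios


def campaign_economics(quantity):
    """Return the slide's metrics for a mailing of `quantity` pieces."""
    cost = quantity / 1000 * CPM
    margin_per_order = PRICE * GROSS_MARGIN        # P - C = $30
    contribution = RESPONSES * margin_per_order    # labelled "Revenue" on the slide
    profit = contribution - cost
    return {
        "cost": cost,
        "contribution": contribution,
        # Assumption: round up, since a fractional order cannot break even.
        "break_even_responses": math.ceil(cost / margin_per_order),
        "response_rate": RESPONSES / quantity,
        "cost_per_order": cost / RESPONSES,
        "profit": profit,
        "roi": profit / cost,
    }


# Integer arithmetic avoids floating-point drift in the list sizes.
all_customers = 2_000_000
deliverable = all_customers * (100 - 15) // 100    # remove 15% undeliverable
deduplicated = deliverable * (100 - 20) // 100     # then remove 20% duplicates

scenarios = {
    "all customers": campaign_economics(all_customers),
    "less undeliverables": campaign_economics(deliverable),
    "less duplicates": campaign_economics(deduplicated),
}
```

Because responses stay at 29,000 while the mailed quantity shrinks, every improvement in the table comes from lower cost, which is the point of the slide: cleaning the list turns a loss into a profit without any change in customer behaviour.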