Market Basket Analysis & Association Rules, CRM

Business Intelligence Technologies – Data Mining

Lecture 2: Market Basket Analysis, Association Rules

Agenda

 Market basket analysis & Association rules

 Case Discussion

 Software demo

 Exercise

Barbie® → Candy

1. Put them closer together in the store.

2. Put them far apart in the store.

3. Package candy bars with the dolls.

4. Package Barbie + candy + poorly selling item.

5. Raise the price on one, lower it on the other.

6. Barbie accessories for proofs of purchase.

7. Do not advertise candy and Barbie together.

8. Offer candies in the shape of a Barbie Doll.

Market Basket Analysis (MBA)





MBA in retail setting

 Find out what are bought together

 Cross-selling

 Optimize shelf layout

 Product bundling

 Timing promotions

 Discount planning (avoid double-discounts)

 Product selection under limited space

 Targeted advertisement, Personalized coupons, item recommendations

Usage beyond Market Basket

 Medical (one symptom after another)

 Financial (customers with mortgage acct also have saving acct)

What the data contains

Transaction No.   Item 1      Item 2      Item 3      Item 4
100               Beer        Diaper      Chocolate   Cheese
101               Milk        Chocolate   Shampoo
102               Beer        Wine        Vodka
103               Beer        Cheese      Diaper      Chocolate
104               Ice Cream   Diaper      Beer
…

Customer No.

Age

100 >50

101

102

35-50

<35

Income

High

Mid

High

103

104

…

>50

<35

Mid

Low

Saving_acct Children Mortgage

Yes Yes Yes

No

Yes

No

No

No

Yes

Yes

No

No

Yes

Yes

No

Rules Discovered from MBA

 Actionable Rules

 Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars

 Trivial Rules

 Customers who purchase large appliances are very likely to purchase maintenance agreements

 Inexplicable Rules

 When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners

Learning Frequent Itemsets and Association Rules from Data

A descriptive approach for discovering relevant and valid associations among items in the data.

If buy diapers → Then buy beer

The itemset corresponding to this rule is {Diaper, Beer}.

Itemset: A collection of items.

Frequent Itemset: An itemset that occurs often in the data.

Oftentimes, finding frequent itemsets is enough.

Market Basket Analysis

Transaction No.   Item 1      Item 2      Item 3      Item 4
100               Beer        Diaper      Chocolate   Cheese
101               Milk        Chocolate   Shampoo
102               Beer        Wine        Vodka
103               Beer        Cheese      Diaper      Chocolate
104               Ice Cream   Diaper      Beer
…

Examples:

Shoppers who buy Diaper are very likely to buy Beer.

If buy Diaper → Then buy Beer

Shoppers who buy Beer and Diaper are likely to buy Cheese and Chocolate.

If buy Beer, Diaper → Then buy Cheese, Chocolate

Association Rules

Rule format:

If {set of items} (LHS) → Then {set of items} (RHS)

Example: If {Diaper, Baby Food} → Then {Beer, Wine}

LHS implies RHS

Evaluation of Association Rules

What rules should be considered valid?

If {Diaper} (LHS) → Then {Beer} (RHS)

An association rule is considered valid if it meets the thresholds set for the chosen evaluation measures.

Rule Evaluation

Transaction No.   Item 1      Item 2      Item 3      …
100               Beer        Diaper      Chocolate
101               Milk        Chocolate   Wine
102               Beer        Wine        Vodka
103               Beer        Cheese      Diaper
104               Ice Cream   Diaper      Beer
….

Milk & Wine co-occur.

But… only 2 out of 200K transactions contain these items.

Rule Evaluation – Support

Support: The frequency with which the items in LHS and RHS co-occur.

E.g., the support of the {Diaper} → {Beer} rule is 3/5: 60% of the transactions contain both items.

Support = (No. of transactions containing items in LHS and RHS) / (Total No. of transactions in the dataset)

Transaction No.   Item 1      Item 2      Item 3      …
100               Beer        Diaper      Chocolate
101               Milk        Chocolate   Shampoo
102               Beer        Wine        Vodka
103               Beer        Cheese      Diaper
104               Ice Cream   Diaper      Beer
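As a minimal sketch, the support formula can be checked against the five transactions in the table. The function name `support` and the dictionary layout are illustrative choices, not part of the lecture.

```python
# Transactions 100-104 from the table, each stored as a set of items.
transactions = {
    100: {"Beer", "Diaper", "Chocolate"},
    101: {"Milk", "Chocolate", "Shampoo"},
    102: {"Beer", "Wine", "Vodka"},
    103: {"Beer", "Cheese", "Diaper"},
    104: {"Ice Cream", "Diaper", "Beer"},
}

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

# Support of the rule {Diaper} -> {Beer} is the support of {Diaper, Beer}.
rule_support = support({"Diaper", "Beer"}, transactions)
print(rule_support)  # 3 of 5 transactions
```

Because support depends only on the union of LHS and RHS, {Diaper} → {Beer} and {Beer} → {Diaper} get the same value.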

Support evaluation is not enough?





My friend Bill, an 85-year-old man, told me a joke at a party last Friday:

An old man is celebrating his 103rd birthday.

“I will hold my 104th birthday party next year. You are all welcome to join me,” he announces to his guests proudly.

“How do you know you will still be alive then?” one of his guests asks.

“Because very few people die between the ages of 103 and 104,” he replies.

Explain the logic of the old man and provide your comments.

The old man’s logic: P{103+ & died} is low, so 1 - P{103+ & died} is high.

Common knowledge: P{103+ & died} = P{103+} * P{died|103+}, where P{103+} is low.

So P{103+ & died} is low because P{103+} is low, while P{died|103+} is still high.

In rule terms, the joint probability corresponds to support and the conditional probability to confidence, so support alone can mislead.

Rule Evaluation - Confidence

Is Beer leading to Diaper purchase or Diaper leading to Beer purchase?

Among the transactions with Diaper, 100% have Beer.

Among the transactions with Beer, 75% have Diaper.

Transaction No.   Item 1      Item 2      Item 3      …
100               Beer        Diaper      Chocolate
101               Milk        Chocolate   Shampoo
102               Beer        Wine        Vodka
103               Beer        Cheese      Diaper
104               Ice Cream   Diaper      Beer

Confidence = (No. of transactions containing both LHS and RHS) / (No. of transactions containing LHS)

Confidence for {Diaper} → {Beer}: 3/3
When Diaper is purchased, the likelihood of Beer purchase is 100%.

Confidence for {Beer} → {Diaper}: 3/4
When Beer is purchased, the likelihood of Diaper purchase is 75%.

So, {Diaper} → {Beer} is a more important rule according to confidence.
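The two confidence values can be reproduced with a short sketch over the same five transactions. The helper names `count` and `confidence` are assumptions made for illustration.

```python
# The five transactions from the table, as sets of items.
transactions = [
    {"Beer", "Diaper", "Chocolate"},   # 100
    {"Milk", "Chocolate", "Shampoo"},  # 101
    {"Beer", "Wine", "Vodka"},         # 102
    {"Beer", "Cheese", "Diaper"},      # 103
    {"Ice Cream", "Diaper", "Beer"},   # 104
]

def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for items in transactions if itemset <= items)

def confidence(lhs, rhs, transactions):
    """Share of the transactions containing LHS that also contain RHS."""
    return count(lhs | rhs, transactions) / count(lhs, transactions)

conf_diaper_beer = confidence({"Diaper"}, {"Beer"}, transactions)  # 3/3
conf_beer_diaper = confidence({"Beer"}, {"Diaper"}, transactions)  # 3/4
print(conf_diaper_beer, conf_beer_diaper)
```

Unlike support, confidence is directional: swapping LHS and RHS changes the denominator.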

Rule Evaluation - Lift

Item 4 Transaction No.

Item 1

100

101

102

103

104

Beer

Milk

Beer

Beer

Milk

Item 2 Item 3

Diaper Chocolate

Chocolate Shampoo

Milk Vodka

Milk

Diaper

Diaper

Beer

Chocolate

Chocolate

…

What are the support and confidence for the rule {Chocolate} → {Milk}?

Support = 3/5, Confidence = 3/4

Very high support and confidence.

Does Chocolate really lead to Milk purchase?

No! Milk occurs in 4 out of 5 transactions, so Chocolate actually decreases the chance of a Milk purchase (3/4 < 4/5).

Lift = (3/4)/(4/5) = 0.9375 < 1

Rule Evaluation – Lift (cont.)

Lift measures how much more likely the RHS is given the LHS than it is overall.

Lift = confidence of the rule / frequency of the RHS

Example: {Diaper} → {Beer}

Total number of customers in database: 1000
No. of customers buying Diaper: 200
No. of customers buying Beer: 50
No. of customers buying Diaper & Beer: 20

Frequency of Beer = 50/1000 (5%)
Confidence = 20/200 (10%)
Lift = 10%/5% = 2

A lift higher than 1 implies people have a higher chance of buying Beer when they buy Diaper. A lift lower than 1 implies people have a lower chance of buying Milk when they buy Chocolate.
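A small sketch of the lift calculation, working directly from counts. The function name `lift` and its parameter names are illustrative. The counts come from the two lift examples in the slides (1000 customers for Diaper/Beer, 5 transactions for Chocolate/Milk).

```python
def lift(n_total, n_lhs, n_rhs, n_both):
    """Lift = confidence of LHS -> RHS divided by the overall frequency of RHS."""
    confidence = n_both / n_lhs
    rhs_frequency = n_rhs / n_total
    return confidence / rhs_frequency

# {Diaper} -> {Beer}: 1000 customers, 200 buy Diaper, 50 buy Beer, 20 buy both.
lift_diaper_beer = lift(n_total=1000, n_lhs=200, n_rhs=50, n_both=20)

# {Chocolate} -> {Milk}: 5 transactions, 4 with Chocolate, 4 with Milk, 3 with both.
lift_choc_milk = lift(n_total=5, n_lhs=4, n_rhs=4, n_both=3)

print(lift_diaper_beer, lift_choc_milk)
```

The first value is above 1 and the second below 1, matching the two interpretations given in the slide.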

Algorithm to Extract Association Rules (1)

Given a set of transactions T, the goal of association rule mining is to find all rules having
 support ≥ minsup threshold
 confidence ≥ minconf threshold

Brute-force approach:
 List all possible association rules
 Compute the support and confidence for each rule
 Prune rules that fail the minsup and minconf thresholds

Computationally prohibitive!
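To see why brute force is prohibitive, the sketch below counts how many rules exist over d distinct items. It assumes a rule is any pair LHS → RHS in which both sides are non-empty and share no items. The function names are illustrative.

```python
from math import comb

def count_rules(d):
    """Count rules LHS -> RHS over d items (both sides non-empty, disjoint)."""
    total = 0
    for k in range(2, d + 1):
        # comb(d, k) itemsets of size k; each splits into 2**k - 2 rules
        # (every subset as LHS except the empty set and the full itemset).
        total += comb(d, k) * (2**k - 2)
    return total

def closed_form(d):
    """Equivalent closed form: 3^d - 2^(d+1) + 1."""
    return 3**d - 2 ** (d + 1) + 1

for d in (3, 6, 10, 20):
    print(d, count_rules(d))
```

With only 6 items there are already 602 candidate rules, and the count grows roughly as 3^d.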

Frequent Itemset Generation

Brute-force approach:
 Each itemset in the lattice is a candidate frequent itemset
 Count the support of each candidate by scanning the database
 Match each transaction against every candidate
 Complexity ~ O(NMw) => Expensive since M = 2^d !!!

Here N is the number of transactions, w is the transaction width, M is the number of candidates, and d is the number of distinct items.

TID   Items
1     Bread, Milk
2     Bread, Diaper, Beer, Eggs
3     Milk, Diaper, Beer, Coke
4     Bread, Milk, Diaper, Beer
5     Bread, Milk, Diaper, Coke

Mining Association Rules

TID   Items
1     Bread, Milk
2     Bread, Diaper, Beer, Eggs
3     Milk, Diaper, Beer, Coke
4     Bread, Milk, Diaper, Beer
5     Bread, Milk, Diaper, Coke

Example of Rules:

{Milk, Diaper} → {Beer} (s=0.4, c=0.67)
{Milk, Beer} → {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} → {Milk} (s=0.4, c=0.67)
{Beer} → {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} → {Milk, Beer} (s=0.4, c=0.5)
{Milk} → {Diaper, Beer} (s=0.4, c=0.5)

Observations:

• All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}

• Rules originating from the same itemset have identical support but can have different confidence

• Thus, we may decouple the support and confidence requirements
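The six rules and their (s, c) values can be regenerated with a short sketch that enumerates every binary partition of {Milk, Diaper, Beer}. Variable and function names are illustrative.

```python
from itertools import combinations

# The five transactions (TID 1-5) from the table.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

itemset = frozenset({"Milk", "Diaper", "Beer"})
rules = {}
for size in range(1, len(itemset)):
    for lhs in combinations(sorted(itemset), size):
        lhs = frozenset(lhs)
        rhs = itemset - lhs
        s = support(itemset)        # identical for every partition
        c = s / support(lhs)        # depends on the LHS only
        rules[(lhs, rhs)] = (s, c)

for (lhs, rhs), (s, c) in rules.items():
    print(sorted(lhs), "->", sorted(rhs), f"s={s:.2f} c={c:.2f}")
```

The support is computed once per itemset, while confidence varies with the LHS. This is what allows the support and confidence checks to be separated.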

Mining Association Rules

Two-step approach:

1. Frequent Itemset Generation
   Generate all itemsets whose support ≥ minsup

2. Rule Generation
   Generate high confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset

Frequent itemset generation is still computationally expensive

Algorithm to Extract Association Rules (2)







The standard algorithm: Apriori

Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499

The Association Rules problem was defined as:

Generate all association rules that have
 support greater than the user-specified minimum support
 and confidence greater than the user-specified minimum confidence

The base algorithm uses support and confidence, but we can also use lift to rank the rules discovered by Apriori.

The algorithm performs an efficient search over the data to find all such rules.

Finding Association Rules from Data

The association rule discovery problem is decomposed into two sub-problems:

1. Find all sets of items (itemsets) whose support is above minimum support, called frequent itemsets or large itemsets.

2. From each frequent itemset, generate rules whose confidence is above minimum confidence.

Given a large itemset Y and a subset X of Y:

Calculate the confidence of the rule X → (Y − X).

If its confidence is above the minimum confidence, then X → (Y − X) is an association rule we are looking for.

Example

Transaction No.   Item 1      Item 2      Item 3
100               Beer        Diaper      Chocolate
101               Milk        Chocolate   Shampoo
102               Beer        Wine        Vodka
103               Beer        Cheese      Diaper
104               Ice Cream   Diaper      Beer

A data set with 5 transactions

Minimum support = 40%, Minimum confidence = 80%

Phase 1: Find all frequent itemsets

{Beer} (support=80%), {Diaper} (60%), {Chocolate} (40%)
{Beer, Diaper} (60%)

Phase 2:

Beer → Diaper (conf. 60% ÷ 80% = 75%)
Diaper → Beer (conf. 60% ÷ 60% = 100%)

Only Diaper → Beer meets the 80% minimum confidence.

Phase 1: Finding all frequent itemsets

How to perform an efficient search of all frequent itemsets?

Note: every frequent itemset of size n contains itemsets of size n−1, and these must also be frequent.

Example: if {diaper, beer} is frequent then {diaper} and {beer} are each frequent as well

This means that…





If an itemset is not frequent (e.g., { wine } ) then no itemset that includes wine can be frequent either, such as {wine, beer} .

We therefore first find all itemsets of size 1 that are frequent.

Then try to “expand” these by counting the frequency of all itemsets of size 2 that include frequent itemsets of size 1.



Example:

If {wine} is not frequent we need not try to find out whether {wine, beer} is frequent. But if both {wine} & {beer} were frequent then it is possible

(though not guaranteed) that {wine, beer} is also frequent.

Then take only itemsets of size 2 that are frequent, and try to expand those, etc.
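The first expansion step (size 1 to size 2) can be sketched on the five-transaction example with minimum support 40%. The variable names are illustrative, and the threshold is treated as "at least 2 of 5 transactions", consistent with {Chocolate} (40%) being listed as frequent in that example.

```python
from itertools import combinations

# The five transactions from the example (IDs 100-104).
transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]
min_count = 2  # 40% of 5 transactions

# Pass 1: count single items and keep only the frequent ones.
item_counts = {}
for t in transactions:
    for item in t:
        item_counts[item] = item_counts.get(item, 0) + 1
frequent_1 = sorted(i for i, n in item_counts.items() if n >= min_count)

# Pass 2: only pairs made of frequent single items are worth counting.
candidates_2 = [frozenset(pair) for pair in combinations(frequent_1, 2)]
frequent_2 = [c for c in candidates_2
              if sum(1 for t in transactions if c <= t) >= min_count]

print(frequent_1)                       # frequent single items
print(len(candidates_2), "candidate pairs instead of", 9 * 8 // 2)
print([sorted(c) for c in frequent_2])  # frequent pairs
```

Nine distinct items would give 36 possible pairs, but pruning leaves only three candidates to count.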

Phase 2: Generating Association Rules

Assume {Milk, Bread, Butter} is a frequent itemset .















Using items contained in the itemset, list all possible rules:

{Milk} → {Bread, Butter}
{Bread} → {Milk, Butter}
{Butter} → {Milk, Bread}
{Milk, Bread} → {Butter}
{Milk, Butter} → {Bread}
{Bread, Butter} → {Milk}

Calculate the confidence of each rule.

Pick the rules with confidence above the minimum confidence.

Confidence of {Milk} → {Bread, Butter}:

(No. of transactions that support {Milk, Bread, Butter}) / (No. of transactions that support {Milk}) = Support {Milk, Bread, Butter} / Support {Milk}

Association







If the rule {Yogurt} → {Bread, Butter} is found to have minimum confidence, does it mean the rule {Bread, Butter} → {Yogurt} also has minimum confidence?

No.

Example:

Support of {Yogurt} is 20%
Support of {Yogurt, Bread, Butter} is 10%
Support of {Bread, Butter} is 50%

Confidence of {Yogurt} → {Bread, Butter} is 10%/20% = 50%

Confidence of {Bread, Butter} → {Yogurt} is 10%/50% = 20%
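The asymmetry can be checked directly from the three support values, with no transaction data needed. The lookup table and the function name `confidence` are illustrative.

```python
# Support values from the Yogurt example, keyed by itemset.
support = {
    frozenset({"Yogurt"}): 0.20,
    frozenset({"Bread", "Butter"}): 0.50,
    frozenset({"Yogurt", "Bread", "Butter"}): 0.10,
}

def confidence(lhs, rhs):
    """Confidence of LHS -> RHS = support(LHS and RHS together) / support(LHS)."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    return support[lhs | rhs] / support[lhs]

conf_forward = confidence({"Yogurt"}, {"Bread", "Butter"})
conf_reverse = confidence({"Bread", "Butter"}, {"Yogurt"})
print(conf_forward, conf_reverse)
```

Both rules share the same numerator, so the rule with the rarer LHS gets the higher confidence.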

Agrawal (94)’s Apriori Algorithm—An Example

Transactions

T-ID   Items
10     A, C, D
20     B, C, E
30     A, B, C, E
40     B, E

1st scan → C1

Itemset   sup
{A}       2
{B}       3
{C}       3
{D}       1
{E}       3

L1

Itemset   sup
{A}       2
{B}       3
{C}       3
{E}       3

C2

Itemset
{A, B}
{A, C}
{A, E}
{B, C}
{B, E}
{C, E}

2nd scan → C2 with counts

Itemset   sup
{A, B}    1
{A, C}    2
{A, E}    1
{B, C}    2
{B, E}    3
{C, E}    2

L2

Itemset   sup
{A, C}    2
{B, C}    2
{B, E}    3
{C, E}    2

C3

Itemset
{B, C, E}

{A, B, C}?

3rd scan → L3

Itemset     sup
{B, C, E}   2
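The walk-through above can be reproduced with a compact level-wise sketch. It is a simplified illustration, not the optimized algorithm from the Agrawal and Srikant paper. The minimum support count of 2 is an assumption, chosen because it matches which itemsets survive each pass in the example.

```python
from itertools import combinations

transactions = [
    {"A", "C", "D"},       # T-ID 10
    {"B", "C", "E"},       # T-ID 20
    {"A", "B", "C", "E"},  # T-ID 30
    {"B", "E"},            # T-ID 40
]
min_count = 2  # assumed minimum support count

def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def apriori(transactions, min_count):
    """Return {frequent itemset: support count}, found level by level."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items
             if count(frozenset([i]), transactions) >= min_count]
    frequent, k = {}, 1
    while level:
        for s in level:
            frequent[s] = count(s, transactions)
        k += 1
        prev = set(level)
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop candidates with any infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = [c for c in candidates if count(c, transactions) >= min_count]
    return frequent

frequent = apriori(transactions, min_count)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The prune step is why {A, B, C} never reaches the third scan: its subset {A, B} was not frequent in the second pass.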

Sequential Patterns

Instead of finding associations between items within a single transaction, find associations between items across related transactions over time.

Customer ID   Transaction Date   Item 1                  Item 2   …
AA            2/2/2001           Laptop                  Case
AA            1/13/2002          Wireless network card   Router
BB            4/5/2002           laptop                  iPaq
BB            8/10/2002          Wireless network card   Router
…             …                  …                       …

Sequence: {Laptop}, {Wireless Card, Router}

A sequence has to satisfy some predetermined minimum support.
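A minimal sketch of how the support of a sequence might be counted per customer. It assumes each customer's transactions are already sorted by date, that item names are normalized (for example "laptop" to "Laptop"), and that the row assignment of the flattened table is as read here. The function name `contains` is illustrative.

```python
# One ordered list of transactions (item sets) per customer.
customers = {
    "AA": [{"Laptop", "Case"}, {"Wireless network card", "Router"}],
    "BB": [{"Laptop", "iPaq"}, {"Wireless network card", "Router"}],
}

def contains(sequence, pattern):
    """True if each pattern element is a subset of some transaction, in order."""
    pos = 0
    for element in pattern:
        # Advance to the earliest remaining transaction that covers this element.
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next element must match a strictly later transaction
    return True

pattern = [{"Laptop"}, {"Wireless network card", "Router"}]
seq_support = sum(contains(s, pattern) for s in customers.values()) / len(customers)
print(seq_support)
```

Support here is the share of customers whose history contains the pattern, so the order of purchases matters: the reversed pattern would match neither customer.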

Examples of Sequence Data

Sequence Database: Customer
 Sequence: Purchase history of a given customer
 Element (Transaction): A set of items bought by a customer at time t
 Event (Item): Books, dairy products, CDs, etc.

Sequence Database: Web Data
 Sequence: Browsing activity of a particular Web visitor
 Element (Transaction): A collection of files viewed by a Web visitor after a single mouse click
 Event (Item): Home page, index page, contact info, etc.

Sequence Database: Event data
 Sequence: History of events generated by a given sensor
 Element (Transaction): Events triggered by a sensor at time t
 Event (Item): Types of alarms generated by sensors

Sequence Database: Genome sequences
 Sequence: DNA sequence of a particular species
 Element (Transaction): An element of the DNA sequence
 Event (Item): Bases A, T, G, C

[Figure: a sequence drawn as a series of elements (transactions), each containing one or more events (items) labeled E1 to E4.]

Examples of Sequence

Web sequence:
< {Homepage} {Electronics} {Digital Cameras} {Canon Digital Camera} {Shopping Cart} {Order Confirmation} {Return to Shopping} >

Sequence of books checked out at a library:
< {Fellowship of the Ring} {The Two Towers} {Return of the King} >

Applications of Association Rules











Market-Basket Analysis:

 e.g. Product assortment optimization (see next slide)

Recommendations: Determines which books are frequently purchased together and recommends associated books or products to people who express interest in an item.

Healthcare: Studying the side-effects in patients with multiple prescriptions, we can discover previously unknown interactions and warn patients about them.

Fraud detection: Finding in insurance data that a certain doctor often works with a certain lawyer may indicate potential fraudulent activity. (virtual items)

Sequence Discovery: looks for associations between items bought over time. E.g., we may notice that people who buy chili tend to buy antacid within a month. Knowledge like this can be used to plan inventory levels.

Product Assortment Optimization

Graphs of expected sales (e.g., derived from association rules) and costs (e.g., of purchasing and holding inventory) can allow us to optimize the number and selection (choice) of items in a product category.

[Figure: two plots of Dollars against Products in Category. The first shows Revenues, Costs, and Margin; the second marks the Max Profit point, where Margin = Revenues - Costs.]

Agenda


 Case Discussion

 Software demo

 Exercise

Case - Merkur

1. What are the benefits of finding the associated products sold together within the same transaction, or sold together to the same customer? (i.e., use transaction or customer as the unit of analysis)

2. How to perform an item-based Market Basket Analysis or a customer-based Market Basket Analysis, and what are the benefits of each? (i.e., MBA based on data about a specific item, MBA based on data about a specific customer)

3. What are the interesting results from MBA discussed in the case?

4. How to decide promotion items based on MBA?

5. How to evaluate a promotion based on MBA?

6. How does MBA help product bundling?

7. Please brainstorm a promotion plan based on MBA to maximize the net profit of the retailer.

8. How to do targeted promotion over time?

9. Other possible strategies based on MBA?

Agenda


 Case Discussion

 Software demo

 Exercise

Transaction No.Item 1

100 Beer

101

102

Milk

Beer

103

104

Beer

Milk

Exercise

Item 2

Diaper

Item 3

Chocolate

Chocolate Shampoo

Soap Vodka

Cheese Wine

Diaper Beer

Item 4

Chocolate

Given the above list of transactions, do the following:

1) Find all the frequent itemsets (minimum support 40%)

2) Find all the association rules (minimum confidence 70%)

3) For the discovered association rules, calculate the lift

What to Do After Class



Read Chapter 4, 9



Read cases for Lecture 3



Get familiar with SAS or WEKA, replicate the class demo.



Talk to candidate companies for your project

