Market Basket Analysis & Association Rules, CRM

Business Intelligence Technologies: Data Mining
Lecture 2: Market Basket Analysis, Association Rules
Agenda

Market basket analysis & Association rules

Case Discussion

Software demo

Exercise
Barbie® → Candy
1. Put them closer together in the store.
2. Put them far apart in the store.
3. Package candy bars with the dolls.
4. Package Barbie + candy + poorly selling item.
5. Raise the price on one, lower it on the other.
6. Barbie accessories for proofs of purchase.
7. Do not advertise candy and Barbie together.
8. Offer candies in the shape of a Barbie Doll.
Market Basket Analysis (MBA)

MBA in retail setting
• Find out which items are bought together
• Cross-selling
• Optimize shelf layout
• Product bundling
• Timing promotions
• Discount planning (avoid double-discounts)
• Product selection under limited space
• Targeted advertisement, personalized coupons, item recommendations

Usage beyond Market Basket
• Medical (one symptom after another)
• Financial (customers with a mortgage account also have a savings account)
What the data contains

Transaction No. | Item 1    | Item 2    | Item 3    | Item 4
100             | Beer      | Diaper    | Chocolate | Cheese
101             | Milk      | Chocolate | Shampoo   |
102             | Beer      | Wine      | Vodka     |
103             | Beer      | Cheese    | Diaper    |
104             | Ice Cream | Diaper    | Beer      | Chocolate
…

Customer No. | Age   | Income | Saving_acct | Children | Mortgage
100          | >50   | High   | Yes         | Yes      | Yes
101          | 35-50 | Mid    | No          | No       | No
102          | <35   | High   | Yes         | No       | Yes
103          | >50   | Mid    | Yes         | No       | Yes
104          | <35   | Low    | No          | Yes      | No
…
Rules Discovered from MBA

Actionable Rules
• Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars

Trivial Rules
• Customers who purchase large appliances are very likely to purchase maintenance agreements

Inexplicable Rules
• When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
Learning Frequent Itemsets and Association Rules from Data
A descriptive approach for discovering relevant and valid associations among items in the data.
If buy Diapers → Then buy Beer
The itemset corresponding to this rule is {Diaper, Beer}
Itemset: A collection of items.
Frequent Itemset: An itemset that occurs often in data.
Oftentimes, finding frequent itemsets is enough.
Market Basket Analysis

Transaction No. | Item 1    | Item 2    | Item 3    | Item 4
100             | Beer      | Diaper    | Chocolate | Cheese
101             | Milk      | Chocolate | Shampoo   |
102             | Beer      | Wine      | Vodka     |
103             | Beer      | Cheese    | Diaper    |
104             | Ice Cream | Diaper    | Beer      | Chocolate
…

Examples:
Shoppers who buy Diaper are very likely to buy Beer.
  If buy Diaper → Then buy Beer
Shoppers who buy Beer and Diaper are likely to buy Cheese and Chocolate.
  If buy Beer, Diaper → Then buy Cheese, Chocolate
Association Rules
Rule format: If {set of items} (LHS) → Then {set of items} (RHS)
Example: If {Diaper, Baby Food} → Then {Beer, Wine}
LHS implies RHS
Evaluation of Association Rules
What rules should be considered valid?
If {Diaper} (LHS) → Then {Beer} (RHS)
An association rule is valid if it satisfies some evaluation measures.
Rule Evaluation
• Milk & Wine co-occur
• But… only 2 out of 200K transactions contain these items

Transaction No. | Item 1    | Item 2    | Item 3
100             | Beer      | Diaper    | Chocolate
101             | Milk      | Chocolate | Wine
102             | Beer      | Wine      | Vodka
103             | Beer      | Cheese    | Diaper
104             | Ice Cream | Diaper    | Beer
…
Rule Evaluation – Support
Support: the frequency with which the items in LHS and RHS co-occur.
E.g., the support of the {Diaper} → {Beer} rule is 3/5: 60% of the transactions contain both items.

Support = (No. of transactions containing items in LHS and RHS) / (Total no. of transactions in the dataset)

Transaction No. | Item 1    | Item 2    | Item 3
100             | Beer      | Diaper    | Chocolate
101             | Milk      | Chocolate | Shampoo
102             | Beer      | Wine      | Vodka
103             | Beer      | Cheese    | Diaper
104             | Ice Cream | Diaper    | Beer
…
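The support calculation can be checked with a few lines of Python. This is only a sketch over the five transactions shown in the table; the `support` helper and the dictionary layout are illustrative assumptions, not part of the lecture material.

```python
# Transactions from the table above (transaction no. -> set of items).
transactions = {
    100: {"Beer", "Diaper", "Chocolate"},
    101: {"Milk", "Chocolate", "Shampoo"},
    102: {"Beer", "Wine", "Vodka"},
    103: {"Beer", "Cheese", "Diaper"},
    104: {"Ice Cream", "Diaper", "Beer"},
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


# The support of the rule {Diaper} -> {Beer} is the support of LHS and RHS together.
diaper_beer = support({"Diaper", "Beer"}, transactions)
print(diaper_beer)  # 0.6
```

Because support only looks at co-occurrence, {Diaper} → {Beer} and {Beer} → {Diaper} get the same value.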
Support evaluation is not enough?

My friend Bill, an 85-year-old man, told me a joke at a party last Friday:

An old man is celebrating his 103rd birthday.
“I will hold my 104th birthday party next year. You are all welcome to join me,” he announces to his guests proudly.
“How do you know you will still be alive then?” one of his guests asks.
“Because very few people die between the ages of 103 and 104,” he replies.

Explain the logic of the old man and provide your comments.

The old man’s logic: P{103+ & died} is low; so 1 - P{103+ & died} is high.
Common knowledge: P{103+ & died} = P{103+} * P{died|103+}, where P{103+} is low.
So the low value of P{103+ & died} is due to P{103+}, while P{died|103+} is still high.
In rule terms, the joint probability corresponds to support and the conditional probability to confidence.
Rule Evaluation - Confidence
Is Beer leading to Diaper purchase or Diaper leading to Beer purchase?
• Among the transactions with Diaper, 100% have Beer.
• Among the transactions with Beer, 75% have Diaper.

Transaction No. | Item 1    | Item 2    | Item 3
100             | Beer      | Diaper    | Chocolate
101             | Milk      | Chocolate | Shampoo
102             | Beer      | Wine      | Vodka
103             | Beer      | Cheese    | Diaper
104             | Ice Cream | Diaper    | Beer
…

Confidence = (No. of transactions containing both LHS and RHS) / (No. of transactions containing LHS)

Confidence for {Diaper} → {Beer}: 3/3
  When Diaper is purchased, the likelihood of Beer purchase is 100%.
Confidence for {Beer} → {Diaper}: 3/4
  When Beer is purchased, the likelihood of Diaper purchase is 75%.
So, {Diaper} → {Beer} is a more important rule according to confidence.
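The asymmetry between the two directions is easy to reproduce. The sketch below assumes the five transactions from the table; `count` and `confidence` are hypothetical helper names, and the code assumes the LHS occurs in at least one transaction.

```python
# The five transactions from the table above.
transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]


def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for items in transactions if itemset <= items)


def confidence(lhs, rhs, transactions):
    """count(LHS and RHS) / count(LHS); assumes the LHS occurs at least once."""
    both = count(set(lhs) | set(rhs), transactions)
    return both / count(lhs, transactions)


conf_diaper_beer = confidence({"Diaper"}, {"Beer"}, transactions)  # 3/3
conf_beer_diaper = confidence({"Beer"}, {"Diaper"}, transactions)  # 3/4
print(conf_diaper_beer, conf_beer_diaper)  # 1.0 0.75
```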
Rule Evaluation - Lift

Transaction No. | Item 1 | Item 2    | Item 3    | Item 4
100             | Beer   | Diaper    | Chocolate |
101             | Milk   | Chocolate | Shampoo   |
102             | Beer   | Milk      | Vodka     | Chocolate
103             | Beer   | Milk      | Diaper    | Chocolate
104             | Milk   | Diaper    | Beer      |
…

What’s the support and confidence for rule {Chocolate} → {Milk}?
Support = 3/5
Confidence = 3/4
Very high support and confidence.
Does Chocolate really lead to Milk purchase?
No! Because Milk occurs in 4 out of 5 transactions. Chocolate is even decreasing the chance of Milk purchase (3/4 < 4/5).
Lift = (3/4)/(4/5) = 0.9375 < 1
Rule Evaluation – Lift (cont.)
Measures how much more likely the RHS is given the LHS than the RHS on its own.
• Lift = confidence of the rule / frequency of the RHS
Example: {Diaper} → {Beer}
• Total number of customers in database: 1000
• No. of customers buying Diaper: 200
• No. of customers buying Beer: 50
• No. of customers buying Diaper & Beer: 20
• Frequency of Beer = 50/1000 (5%)
• Confidence = 20/200 (10%)
• Lift = 10%/5% = 2
A lift higher than 1 (as here) implies people have a higher chance of buying Beer when they buy Diaper. A lift lower than 1 (as in the {Chocolate} → {Milk} example) implies people have a lower chance of buying Milk when they buy Chocolate.
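Both lift examples can be recomputed from raw counts. This is a small sketch; the `lift` function and its parameter names are illustrative assumptions, and it simply applies the slide's definition (confidence of the rule divided by frequency of the RHS).

```python
def lift(n_total, n_lhs, n_rhs, n_both):
    """Lift of LHS -> RHS from raw counts.

    n_total: total transactions (or customers)
    n_lhs:   those containing the LHS
    n_rhs:   those containing the RHS
    n_both:  those containing both LHS and RHS
    """
    confidence = n_both / n_lhs
    rhs_frequency = n_rhs / n_total
    return confidence / rhs_frequency


# {Diaper} -> {Beer}: 1000 customers, 200 buy Diaper, 50 buy Beer, 20 buy both.
lift_diaper_beer = lift(1000, 200, 50, 20)  # about 2

# {Chocolate} -> {Milk}: 5 transactions, 4 with Chocolate, 4 with Milk, 3 with both.
lift_chocolate_milk = lift(5, 4, 4, 3)  # about 0.9375

print(round(lift_diaper_beer, 4), round(lift_chocolate_milk, 4))
```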
Algorithm to Extract Association Rules (1)

Given a set of transactions T, the goal of association rule mining is to find all rules having
• support ≥ minsup threshold
• confidence ≥ minconf threshold

Brute-force approach:
• List all possible association rules
• Compute the support and confidence for each rule
• Prune rules that fail the minsup and minconf thresholds
• Computationally prohibitive!
Frequent Itemset Generation

Brute-force approach:
• Each itemset in the lattice is a candidate frequent itemset
• Count the support of each candidate by scanning the database
• Match each transaction against every candidate
• Complexity ~ O(NMw) => Expensive since M = 2^d !!!

Here N is the number of transactions, w is the maximum transaction width, M is the number of candidate itemsets, and d is the number of distinct items.

TID | Items
1   | Bread, Milk
2   | Bread, Diaper, Beer, Eggs
3   | Milk, Diaper, Beer, Coke
4   | Bread, Milk, Diaper, Beer
5   | Bread, Milk, Diaper, Coke
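To make the cost of the brute-force approach concrete, the sketch below enumerates every non-empty itemset over the five transactions shown and counts its support by scanning all transactions. The variable names are illustrative assumptions; the empty itemset is left out, so the number of candidates is 2^d - 1.

```python
from itertools import combinations

# The five transactions (TID 1 to 5) from the slide.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

# d distinct items across all transactions.
items = sorted(set().union(*transactions))

# Every non-empty subset of the items is a candidate itemset.
candidates = [
    frozenset(combo)
    for k in range(1, len(items) + 1)
    for combo in combinations(items, k)
]

# Match each transaction against every candidate (N * M subset checks).
support_count = {
    cand: sum(1 for t in transactions if cand <= t) for cand in candidates
}

print(len(items), len(candidates))  # 6 63
```

With only six items there are already 63 candidates; each extra item doubles that number, which is why this approach does not scale.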
Mining Association Rules

TID | Items
1   | Bread, Milk
2   | Bread, Diaper, Beer, Eggs
3   | Milk, Diaper, Beer, Coke
4   | Bread, Milk, Diaper, Beer
5   | Bread, Milk, Diaper, Coke

Example of Rules:
{Milk, Diaper} → {Beer} (s=0.4, c=0.67)
{Milk, Beer} → {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} → {Milk} (s=0.4, c=0.67)
{Beer} → {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} → {Milk, Beer} (s=0.4, c=0.5)
{Milk} → {Diaper, Beer} (s=0.4, c=0.5)

Observations:
• All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but can have different confidence
• Thus, we may decouple the support and confidence requirements
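The six rules and their (s, c) values can be regenerated by enumerating every binary partition of {Milk, Diaper, Beer}. The sketch below assumes the five transactions shown; the `count` helper and the `rules` dictionary are illustrative names.

```python
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


itemset = frozenset({"Milk", "Diaper", "Beer"})
rules = {}
# Each non-empty proper subset is an LHS; the remaining items form the RHS.
for k in range(1, len(itemset)):
    for combo in combinations(sorted(itemset), k):
        lhs = frozenset(combo)
        rhs = itemset - lhs
        s = count(itemset) / len(transactions)  # same for every partition
        c = count(itemset) / count(lhs)         # depends on the LHS
        rules[(lhs, rhs)] = (round(s, 2), round(c, 2))

for (lhs, rhs), (s, c) in rules.items():
    print(sorted(lhs), "->", sorted(rhs), f"s={s}, c={c}")
```

Support is computed once per itemset, while confidence changes with the LHS, which is the decoupling noted in the observations.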
Mining Association Rules

Two-step approach:
1. Frequent Itemset Generation
   Generate all itemsets whose support ≥ minsup
2. Rule Generation
   Generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset

Frequent itemset generation is still computationally expensive.
Algorithm to Extract Association Rules (2)

The standard algorithm: Apriori
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499

The Association Rules problem was defined as:
Generate all association rules that have
• support greater than the user-specified minimum support
• and confidence greater than the user-specified minimum confidence
• The base algorithm uses support and confidence, but we can also use lift to rank the rules discovered by Apriori.

The algorithm performs an efficient search over the data to find all such rules.
Finding Association Rules from Data
The association rule discovery problem is decomposed into two sub-problems:
1. Find all sets of items (itemsets) whose support is above minimum support. These are called frequent itemsets or large itemsets.
2. From each frequent itemset, generate rules whose confidence is above minimum confidence.
   Given a large itemset Y, and X a subset of Y:
   Calculate the confidence of the rule X → (Y - X).
   If its confidence is above the minimum confidence, then X → (Y - X) is an association rule we are looking for.
Example

Transaction No. | Item 1    | Item 2    | Item 3
100             | Beer      | Diaper    | Chocolate
101             | Milk      | Chocolate | Shampoo
102             | Beer      | Wine      | Vodka
103             | Beer      | Cheese    | Diaper
104             | Ice Cream | Diaper    | Beer

A data set with 5 transactions
Minimum support = 40%, Minimum confidence = 80%

Phase 1: Find all frequent itemsets
{Beer} (support = 80%)
{Diaper} (60%)
{Chocolate} (40%)
{Beer, Diaper} (60%)

Phase 2:
Beer → Diaper (conf. 60% ÷ 80% = 75%)
Diaper → Beer (conf. 60% ÷ 60% = 100%)
Only Diaper → Beer meets the 80% minimum confidence.
Phase 1: Finding all frequent itemsets
How to perform an efficient search of all frequent itemsets?
Note: frequent itemsets of size n contain itemsets of size n-1 that also must be frequent.
Example: if {diaper, beer} is frequent then {diaper} and {beer} are each frequent as well.
This means that…
• If an itemset is not frequent (e.g., {wine}) then no itemset that includes wine can be frequent either, such as {wine, beer}.
• We therefore first find all itemsets of size 1 that are frequent. Then try to “expand” these by counting the frequency of all itemsets of size 2 that include frequent itemsets of size 1.

Example:
If {wine} is not frequent we need not try to find out whether {wine, beer} is frequent. But if both {wine} & {beer} were frequent then it is possible (though not guaranteed) that {wine, beer} is also frequent.
Then take only itemsets of size 2 that are frequent, and try to expand those, etc.
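The level-wise search described above can be sketched as follows, using the five-transaction example with minimum support 40%. This is a simplified illustration, not the published Apriori procedure: the function name `frequent_itemsets` is hypothetical, and it expands each frequent itemset by one frequent item and recounts support, without the further candidate pruning an optimized implementation would apply.

```python
transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Size 1: keep only the frequent single items.
    all_items = set().union(*transactions)
    level = {
        frozenset([item])
        for item in all_items
        if support(frozenset([item])) >= min_support
    }
    frequent_items = set().union(*level)

    result = {}
    while level:
        for itemset in level:
            result[itemset] = support(itemset)
        # Expand each frequent itemset by one frequent item it lacks,
        # then keep only the expansions that are themselves frequent.
        candidates = {s | {i} for s in level for i in frequent_items if i not in s}
        level = {c for c in candidates if support(c) >= min_support}
    return result


found = frequent_itemsets(transactions, min_support=0.4)
for itemset, sup in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

Items such as Wine never enter the search after the first pass, so no itemset containing them is ever counted.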
Phase 2: Generating Association Rules
Assume {Milk, Bread, Butter} is a frequent itemset.
• Using items contained in the itemset, list all possible rules
  {Milk} → {Bread, Butter}
  {Bread} → {Milk, Butter}
  {Butter} → {Milk, Bread}
  {Milk, Bread} → {Butter}
  {Milk, Butter} → {Bread}
  {Bread, Butter} → {Milk}
• Calculate the confidence of each rule
• Pick the rules with confidence above the minimum confidence

Confidence of {Milk} → {Bread, Butter}
= (No. of transactions that support {Milk, Bread, Butter}) / (No. of transactions that support {Milk})
= Support {Milk, Bread, Butter} / Support {Milk}
Association

If the rule {Yogurt} → {Bread, Butter} is found to meet the minimum confidence, does it mean the rule {Bread, Butter} → {Yogurt} also meets the minimum confidence?

No.
Example:
• Support of {Yogurt} is 20%
• Support of {Yogurt, Bread, Butter} is 10%
• Support of {Bread, Butter} is 50%
• Confidence of {Yogurt} → {Bread, Butter} is 10%/20% = 50%
• Confidence of {Bread, Butter} → {Yogurt} is 10%/50% = 20%
Agrawal (94)’s Apriori Algorithm—An Example

Transactions
T-ID | Items
10   | A, C, D
20   | B, C, E
30   | A, B, C, E
40   | B, E

1st scan → C1
Itemset | sup
{A}     | 2
{B}     | 3
{C}     | 3
{D}     | 1
{E}     | 3

L1
Itemset | sup
{A}     | 2
{B}     | 3
{C}     | 3
{E}     | 3

C2 (candidates generated from L1)
{A, B}, {A, C}, {A, E}, {B, C}, {B, E}, {C, E}

2nd scan → C2 with support counts
Itemset | sup
{A, B}  | 1
{A, C}  | 2
{A, E}  | 1
{B, C}  | 2
{B, E}  | 3
{C, E}  | 2

L2
Itemset | sup
{A, C}  | 2
{B, C}  | 2
{B, E}  | 3
{C, E}  | 2

C3 (candidates generated from L2)
{B, C, E}
{A, B, C}?

3rd scan → L3
Itemset   | sup
{B, C, E} | 2
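The step from L2 to C3, including the question of whether {A, B, C} belongs in C3, can be sketched as a join followed by a prune. The function name `generate_candidates` is an illustrative assumption, the join here is a simple pairwise union rather than the prefix-based join of the published algorithm, and the input is assumed to be non-empty.

```python
from itertools import combinations

# L2 from the example: the frequent 2-itemsets.
L2 = {frozenset(s) for s in [{"A", "C"}, {"B", "C"}, {"B", "E"}, {"C", "E"}]}


def generate_candidates(frequent_k):
    """Join frequent k-itemsets into (k+1)-itemsets, then prune every
    candidate that has a k-subset which is not frequent."""
    k = len(next(iter(frequent_k)))  # assumes frequent_k is non-empty
    joined = {a | b for a in frequent_k for b in frequent_k if len(a | b) == k + 1}
    return {
        cand
        for cand in joined
        if all(frozenset(sub) in frequent_k for sub in combinations(cand, k))
    }


C3 = generate_candidates(L2)
print([sorted(c) for c in C3])  # [['B', 'C', 'E']]

# {A, B, C} is dropped because its subset {A, B} is not in L2.
abc_is_candidate = frozenset({"A", "B", "C"}) in C3
print(abc_is_candidate)  # False
```

Pruning before the scan means only {B, C, E} has to be counted against the transactions in the third pass.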
Sequential Patterns
Instead of finding associations between items in a single transaction, find associations between items across related transactions over time.

Customer ID | Transaction Date | Item 1                | Item 2
AA          | 2/2/2001         | Laptop                | Case
AA          | 1/13/2002        | Wireless network card | Router
BB          | 4/5/2002         | Laptop                | iPaq
BB          | 8/10/2002        | Wireless network card | Router
…

Sequence: {Laptop}, {Wireless network card, Router}
A sequence has to satisfy some predetermined minimum support.
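Support for a sequence can be counted per customer: a customer supports the sequence if its elements appear, in order, in successive transactions. The sketch below assumes only the two customers shown, reads the dates as month/day/year, and normalizes item spellings ("laptop", "Wireless Card"); `contains_sequence` is a hypothetical helper name.

```python
from datetime import date

# Customer -> list of (transaction date, items), from the rows shown above.
history = {
    "AA": [
        (date(2001, 2, 2), {"Laptop", "Case"}),
        (date(2002, 1, 13), {"Wireless network card", "Router"}),
    ],
    "BB": [
        (date(2002, 4, 5), {"Laptop", "iPaq"}),
        (date(2002, 8, 10), {"Wireless network card", "Router"}),
    ],
}


def contains_sequence(transactions, pattern):
    """True if each element of `pattern` is a subset of some transaction,
    with the matching transactions occurring in chronological order."""
    ordered = [items for _, items in sorted(transactions, key=lambda t: t[0])]
    pos = 0
    for element in pattern:
        # Advance to the next transaction that contains this element.
        while pos < len(ordered) and not element <= ordered[pos]:
            pos += 1
        if pos == len(ordered):
            return False
        pos += 1  # the next element must come from a later transaction
    return True


pattern = [{"Laptop"}, {"Wireless network card", "Router"}]
supporters = [c for c, txns in history.items() if contains_sequence(txns, pattern)]
seq_support = len(supporters) / len(history)
print(supporters, seq_support)  # ['AA', 'BB'] 1.0
```

Reversing the pattern (router first, laptop later) matches neither customer, which is what distinguishes a sequential pattern from a plain itemset.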
Examples of Sequence Data

Sequence Database | Sequence | Element (Transaction) | Event (Item)
Customer | Purchase history of a given customer | A set of items bought by a customer at time t | Books, dairy products, CDs, etc.
Web Data | Browsing activity of a particular Web visitor | A collection of files viewed by a Web visitor after a single mouse click | Home page, index page, contact info, etc.
Event data | History of events generated by a given sensor | Events triggered by a sensor at time t | Types of alarms generated by sensors
Genome sequences | DNA sequence of a particular species | An element of the DNA sequence | Bases A, T, G, C

[Figure: a sequence drawn as a series of elements (transactions), each containing one or more events (items) labeled E1 to E4.]
Examples of Sequence

Web sequence:
< {Homepage} {Electronics} {Digital Cameras} {Canon
Digital Camera} {Shopping Cart} {Order Confirmation}
{Return to Shopping} >

Sequence of books checked out at a library:

<{Fellowship of the Ring} {The Two Towers} {Return of the
King}>
Applications of Association Rules

• Market-Basket Analysis: e.g. product assortment optimization (see next slide)
• Recommendations: Determine which books are frequently purchased together and recommend associated books or products to people who express interest in an item.
• Healthcare: Studying the side-effects in patients with multiple prescriptions, we can discover previously unknown interactions and warn patients about them.
• Fraud detection: Finding in insurance data that a certain doctor often works with a certain lawyer may indicate potential fraudulent activity. (virtual items)
• Sequence Discovery: Looks for associations between items bought over time. E.g., we may notice that people who buy chili tend to buy antacid within a month. Knowledge like this can be used to plan inventory levels.
Product Assortment Optimization
Graphs of expected sales (e.g. derived from association rules) and costs (e.g. of purchasing and holding inventory) can allow us to optimize the number and selection (choice) of items in a product category.
[Figure: two charts plotting Dollars against Products in Category. The first shows Revenues, Costs, and Margin; the second shows Margin = Revenues - Costs with the Max Profit point marked.]
Case - Merkur
1. What are the benefits of finding the associated products sold together within the same transaction, or sold together to the same customer? (i.e. use transaction or customer as the unit of analysis)
2. How to perform an item-based Market Basket Analysis or a customer-based Market Basket Analysis, and what are the benefits of each? (i.e. MBA based on data about a specific item, MBA based on data about a specific customer)
3. What are the interesting results from MBA discussed in the case?
4. How to decide promotion items based on MBA?
5. How to evaluate a promotion based on MBA?
6. How does MBA help product bundling?
7. Please brainstorm a promotion plan based on MBA to maximize the net profit of the retailer.
8. How to do targeted promotion over time?
9. Other possible strategies based on MBA?
Exercise

Transaction No. | Item 1 | Item 2    | Item 3    | Item 4
100             | Beer   | Diaper    | Chocolate |
101             | Milk   | Chocolate | Shampoo   |
102             | Beer   | Soap      | Vodka     |
103             | Beer   | Cheese    | Wine      |
104             | Milk   | Diaper    | Beer      | Chocolate

Given the above list of transactions, do the following:
1) Find all the frequent itemsets (minimum support 40%)
2) Find all the association rules (minimum confidence 70%)
3) For the discovered association rules, calculate the lift
What to Do After Class
• Read Chapter 4, 9
• Read cases for Lecture 3
• Get familiar with SAS or WEKA, replicate the class demo.
• Talk to candidate companies for your project