Profit Mining: From Patterns to Action
Ke Wang, Senqiang Zhou, Jiawei Han
Simon Fraser University
1
Why Profit Mining?
A major obstacle in data mining applications is the gap between:
– statistic-based pattern extraction and
– value-based decision making
Profit mining:
– value-based data mining
2
An Example
Suppose we want to maximize profit. Association rules [AIS93] such as
– {Perfume} -> Lipstick (more often)
– {Perfume} -> Diamond (more profit)
do not suggest which items (and prices) to recommend to a customer who bought Perfume.
Similar problems arise with correlation, classification, etc.
3
The Problem
Given: a set of transactions of the form:
– {<I,P,Q>,…, <I,P,Q> | <I,P,Q>}, for Item, Promotion code, and Quantity. The "|" separates non-target items from target items.
– Example: {<FlakedChick., $3,2> | <Sunchip,$1,1>}
Recommend target <I,P> to customers who buy non-target items, to maximize profit.
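One way to hold such a transaction in code is sketched below. The class names `Sale` and `Transaction` are illustrative and do not come from the slides, and the promotion code P is simplified to a price, as in the slide's example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sale:
    item: str      # I: item name
    promo: float   # P: promotion code, simplified here to a price (assumption)
    qty: int       # Q: quantity bought


@dataclass
class Transaction:
    non_target: List[Sale]  # sales to the left of "|"
    target: List[Sale]      # sales to the right of "|"


# The slide's example: {<FlakedChick., $3, 2> | <Sunchip, $1, 1>}
t = Transaction(
    non_target=[Sale("FlakedChick.", 3.0, 2)],
    target=[Sale("Sunchip", 1.0, 1)],
)
```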
4
Not a Prediction Problem
An example (the arithmetic implies a cost of $0.5 per pack):
– 100 customers each bought 1 pack for $1/pack. Profit=100(1-0.5)=$50.
– 100 customers each bought 4 packs for $3.2/4-pack. Profit=100(3.2-2)=$120.
Prediction repeats the history.
Profit mining gets smarter from the history, by recommending “right items” and “right prices”.
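The two profit figures in this example can be checked with a few lines of Python. This is only a sketch: the function name `total_profit` is made up for illustration, and the cost of $0.5 per pack is inferred from the slide's arithmetic rather than stated in it.

```python
COST_PER_PACK = 0.5  # assumption: inferred from 100(1-0.5) and 100(3.2-2)


def total_profit(customers, price_per_sale, packs_per_sale,
                 cost_per_pack=COST_PER_PACK):
    """Profit when every customer makes one sale of `packs_per_sale` packs."""
    return customers * (price_per_sale - packs_per_sale * cost_per_pack)


single_pack = total_profit(100, 1.0, 1)   # $1 per pack
four_pack = total_profit(100, 3.2, 4)     # $3.2 per 4-pack

print(f"single pack: ${single_pack:.2f}, 4-pack: ${four_pack:.2f}")
```

The 4-pack promotion earns less per pack but more per customer, which is the point of the slide.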
5
Challenge I - notion of profit
A purely statistical approach favors
– {Perfume} -> Lipstick
A purely profit-based approach favors
– {Perfume} -> Diamond
Profit mining considers:
– both statistical significance and profit significance.
6
Challenge II - customer intention
Mining On Availability (MOA):
– Paying a higher price implies the willingness to pay a lower price.
– {<FC,$3>} -> <Sunchip,$1> can be extracted from transaction {<FC,$5> | <Sunchip,$1.5>}
Recognizing this behavior brings new sales opportunities (at lower price).
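The MOA assumption can be sketched as a price comparison: a rule item priced at p is taken to match any purchase of the same item at a price of at least p. The function names and the dictionary encoding (item mapped to price) are illustrative choices, not the authors' notation.

```python
def moa_body_matches(rule_body, bought):
    """True if every item in the rule body was bought at the rule price or higher."""
    return all(item in bought and bought[item] >= price
               for item, price in rule_body.items())


def moa_head_hit(rule_head, target_bought):
    """True if the recommended target item was bought at the rule price or higher."""
    item, price = rule_head
    return item in target_bought and target_bought[item] >= price


# Slide example: rule {<FC,$3>} -> <Sunchip,$1>
rule_body, rule_head = {"FC": 3.0}, ("Sunchip", 1.0)
# Transaction {<FC,$5> | <Sunchip,$1.5>}
non_target, target = {"FC": 5.0}, {"Sunchip": 1.5}

supported = (moa_body_matches(rule_body, non_target)
             and moa_head_hit(rule_head, target))
```

Here `supported` is true because both purchases were made at prices above those in the rule.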
7
Challenge III - search space
Thousands of items, and many more sales.
Any combination can trigger a recommendation.
Searching over alternative concepts (food, meat, etc.) and prices makes it worse.
8
Step 1: generating rules
Association rules:
– {Diaper} -> Beer, supp=10%, conf=80%
Recommendation rules:
– {g1,…,gk} -> <I,P>, where gi is <Item,Price>, or Item, or Concept.
– {<FlakedChick., $3.8>} -> <Sunchip,$4.5>
– {FlakedChick.} -> <Sunchip,$4.5>
– {Meat} -> <Sunchip,$4.5>
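The three rule bodies above are the same sale described at three levels of generality. A minimal sketch of how those forms could be enumerated follows; the `PARENT` hierarchy and the function name `generalizations` are hypothetical, chosen only to mirror the FlakedChick. and Meat example.

```python
# Hypothetical concept hierarchy (child -> parent); not given in the slides.
PARENT = {"FlakedChick.": "Meat", "Meat": "Food"}


def generalizations(item, price):
    """List the forms g_i can take for one sale: <Item,Price>, Item, then Concepts."""
    forms = [(item, price), item]
    concept = PARENT.get(item)
    while concept is not None:
        forms.append(concept)
        concept = PARENT.get(concept)
    return forms


forms = generalizations("FlakedChick.", 3.8)
```

Each returned form is a candidate for a position in a rule body, which is why the search space grows quickly.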
9
Handling alternative concepts and prices
10
Step 2: building the model
We rank rules by the “average profit” made by the recommendation of a rule.
– {<FC,$3.5>} -> <Sunchip,$1> matches
  t1: {<FC,$4.0> | <Sunchip,$2>} (a hit)
  t2: {<FC,$4.5> | <Milk,$3.5>} (a miss)
– If the cost of Sunchip is $0.7, the hit earns $1 - $0.7 = $0.3 and the miss earns $0, so the average profit is $0.3/2 = $0.15.
To recommend, we select the highest-ranked matching rule.
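The ranking and recommendation steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: every sale has quantity 1, matching follows the MOA price comparison, and the function names and dictionary encoding are invented for the example.

```python
def body_matches(rule_body, non_target):
    """MOA match: each rule item was bought at the rule price or higher."""
    return all(i in non_target and non_target[i] >= p
               for i, p in rule_body.items())


def average_profit(rule, transactions, cost):
    """Average profit per matched transaction; a miss contributes 0."""
    body, (head_item, head_price) = rule
    matched = [t for t in transactions if body_matches(body, t["non_target"])]
    if not matched:
        return 0.0
    profit = 0.0
    for t in matched:
        paid = t["target"].get(head_item)
        if paid is not None and paid >= head_price:       # a hit
            profit += head_price - cost[head_item]        # quantity 1 assumed
    return profit / len(matched)


def recommend(rules_best_first, non_target):
    """Return the head <I,P> of the first matching rule in rank order, or None."""
    for body, head in rules_best_first:
        if body_matches(body, non_target):
            return head
    return None


rule = ({"FC": 3.5}, ("Sunchip", 1.0))
t1 = {"non_target": {"FC": 4.0}, "target": {"Sunchip": 2.0}}   # hit
t2 = {"non_target": {"FC": 4.5}, "target": {"Milk": 3.5}}      # miss
avg = average_profit(rule, [t1, t2], cost={"Sunchip": 0.7})
print(f"average profit: ${avg:.2f}")
```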
11
Step 3: Pruning the model
The model favors “high average profit” rules.
Such rules may bring a large profit.
Such rules may also be random noise.
We cannot prune them based on statistical frequency alone.
12
Pruning the model
We prune rules to increase the estimated profit on the whole population.
We organize rules into a specificity tree: the parent of a rule is the highest-ranked rule that is more general than it.
We cut off subtrees of the tree to maximize the estimated profit.
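The cut can be pictured as a generic bottom-up tree pruning. The sketch below is an illustration only and not necessarily the authors' procedure: the slides do not say how the profit estimates are computed, so they are supplied here as given numbers, and the `Node` fields and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    rule: str
    own_est: float      # estimated profit on transactions this rule covers itself
    as_leaf_est: float  # estimated profit if this rule replaces its whole subtree
    children: List["Node"] = field(default_factory=list)


def prune(node):
    """Bottom-up cut: drop a subtree when collapsing it does not lose estimated profit."""
    if not node.children:
        return node.as_leaf_est
    kept = node.own_est + sum(prune(child) for child in node.children)
    if node.as_leaf_est >= kept:
        node.children = []          # cut off the more specific rules
        return node.as_leaf_est
    return kept


# Hypothetical estimates, chosen only to exercise both outcomes.
a1 = Node("a1", own_est=1, as_leaf_est=1)
a = Node("a", own_est=1, as_leaf_est=5, children=[a1])
b = Node("b", own_est=6, as_leaf_est=6)
root = Node("root", own_est=2, as_leaf_est=10, children=[a, b])
best = prune(root)
```

In this example the subtree under `a` is cut because the general rule alone is estimated to earn more, while the root keeps both children.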
13
14
Evaluation
Synthetic datasets: IBM synthetic data generator, modified to have price and cost.
1000 items and 1000K transactions
For non-target item i:
– cost(i) = c/i
– price_j(i) = (1 + j*10%) * cost(i), j = 1, 2, 3, 4.
For target items:
– Dataset I has 2 target items
– Dataset II has 10 target items
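The cost and price scheme for non-target items can be written out directly. In this sketch the constant c is not given in the slides, so 100.0 is a placeholder, and item indices are assumed to start at 1.

```python
def cost(i, c=100.0):
    """cost(i) = c / i; c = 100.0 is a placeholder, not a value from the slides."""
    return c / i


def prices(i, c=100.0):
    """The four prices of item i: 10%, 20%, 30% and 40% above its cost."""
    return [(1 + j * 0.10) * cost(i, c) for j in (1, 2, 3, 4)]


for i in (1, 2, 1000):
    print(i, round(cost(i), 3), [round(p, 3) for p in prices(i)])
```

Under this scheme the profit margin per unit is always between 10% and 40% of cost, and higher-numbered items are cheaper.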
15
Profit Gain on Dataset I
16
Hit Ratio on Dataset I
17
Hit Ratio on Dataset I
18
Profit Gain on Dataset II
19
Hit Ratio on Dataset II
20
Hit Ratio on Dataset II
21
Conclusion
Proposed a new direction of data mining: mining for profit.
Directly factors the business goal into data mining.
Related work: microeconomic view of data mining [KPR98]
22