Transcript ICDM2002

Using Text Mining to Infer Semantic
Attributes for Retail Data Mining
Rayid Ghani & Andrew Fano
Accenture Technology Labs, USA
ICDM 2002
Who are we?
Accenture Technology Labs
• R&D Group for Accenture
• ~ 50 researchers in Chicago, Palo Alto (California) and Sophia
Antipolis (France)
• Research in Data Mining, Machine Learning, Ubiquitous Computing,
Wearable Computing, Language Technologies, Virtual & Augmented
Reality, Collaborative Workspaces…
Current State of Retail Data Mining
• Large amounts of data captured about transactions
• Each Retailer has terabytes of data in their data warehouse
• Several data mining algorithms applied to this data
Problem:
Today’s transaction data can’t answer
important marketing questions.
What do your best selling items have in common?
What about the worst sellers?
What do the products a customer has purchased say about them?
What do the products your competitors sell say about them?
What’s Missing?
Captured data focuses on the
transaction, not the product.
Product information captured with
transactions is typically limited to little
more than SKU, size, brand and price.
But what does a SKU mean?
Current Data Mining Practice
• Treat products as generic unique entities/objects with no associated
semantics
• Semantics are applied by humans AFTER the algorithm has done the
learning e.g. interpreting association rules, decision trees
Product Semantics:
What does a product mean?
What does this shirt say about her?
Is it conservative or flashy?
Trendy or classic?
Formal or casual?
Where would we get this information?
Extract underlying attributes from
product marketing descriptions
Marketing descriptions are designed to
convey a particular image to customers.
These descriptions implicitly contain these
more elusive attributes.
DKNY Jeans Ruched Side-Tie Tee
Get back to basics with a fresh new look
this season. The Ruched Side-Tie Tee has a
drawstring tie at left hip with shirred detail
down the side. Stretch provides a flattering,
shapely fit. V-neck.
SKU: 655432
UPC: 4200006200
Item: DKNYTee
Price: $49
Training the System
[Diagram: Product Descriptions + Domain Experts → Product descriptions marked up with attribute values → Supervised Learning Algorithm → Learned Statistical Models]
Inferring Attributes via Text Classification
• Build one classifier for each attribute type
• Simple statistical classifier – Naïve Bayes Multinomial
model (McCallum & Nigam 1998)
– For all words (description) and attribute values:
• calculate P(word | attribute value) using the manually
rated items
– Given a new item description:
• Calculate P(attribute value | item description) for all
attribute values
• Use Maximum Likelihood
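The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the whitespace tokenizer, the Laplace smoothing, and the toy training pairs are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(labeled_items):
    """labeled_items: list of (description, attribute_value) pairs (hypothetical format)."""
    word_counts = defaultdict(Counter)  # attribute value -> word counts
    class_counts = Counter()            # attribute value -> number of rated items
    vocab = set()
    for text, value in labeled_items:
        words = text.lower().split()
        word_counts[value].update(words)
        class_counts[value] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab

def posterior(text, model):
    """Return P(attribute value | item description) for every attribute value."""
    word_counts, class_counts, vocab = model
    total_items = sum(class_counts.values())
    log_scores = {}
    for value in class_counts:
        total_words = sum(word_counts[value].values())
        score = math.log(class_counts[value] / total_items)  # class prior
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                # Laplace-smoothed estimate of P(word | attribute value)
                score += math.log((word_counts[value][w] + 1) / (total_words + len(vocab)))
        log_scores[value] = score
    # Normalize in a numerically stable way
    top = max(log_scores.values())
    unnorm = {v: math.exp(s - top) for v, s in log_scores.items()}
    z = sum(unnorm.values())
    return {v: u / z for v, u in unnorm.items()}

def classify(text, model):
    """Pick the attribute value with the highest probability."""
    probs = posterior(text, model)
    return max(probs, key=probs.get)

# Toy, made-up training data for a single attribute type
model = train_nb([
    ("leopard flirty straps chemise", "flashy"),
    ("classic double-breasted blazer trouser", "conservative"),
])
print(classify("flirty leopard top", model))
```

One such model would be trained per attribute type (Formality, Trendiness, and so on), each with its own set of attribute values.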
Naïve Bayes Results
[Figure: chart of classification accuracy (0 to 90) for Naïve Bayes vs. Baseline on each attribute: AgeGroup, Functionality, Formality, Conservative, Sportiness, Trendiness, BrandAppeal]
Can we get something for free?
Semi-supervised Learning
• Many product descriptions are available at minimal or no cost from retail
websites
• Labeling them is expensive
• Can we utilize the unlabeled product descriptions to provide better
performance?
Semi-Supervised Learning
• Apply algorithms that combine labeled and unlabeled data for
classification
– Expectation-Maximization (Nigam et al. 1999)
– Co-Training (Blum & Mitchell 1999)
– Co-EM (Nigam & Ghani, 2000)
– ECOC + Co-Training (Ghani, 2002)
The EM Algorithm
[Diagram: Learn from labeled data → Naïve Bayes → Estimate labels (E-Step) → Probabilistically add to labeled data (M-Step)]
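The loop can be sketched as follows: train Naïve Bayes on the labeled descriptions, estimate soft labels for the unlabeled ones (E-step), then re-estimate the model from both sets (M-step). This is a minimal sketch under stated assumptions, not the cited algorithm's exact form: the function names, the Laplace smoothing, the fixed iteration count, and the equal weighting of labeled and unlabeled data are all choices made for illustration.

```python
import math
from collections import defaultdict

def m_step(weighted_docs, classes, vocab):
    """Re-estimate priors and P(word | class) from (words, class-probabilities) pairs."""
    class_mass = {c: 0.0 for c in classes}
    word_mass = {c: defaultdict(float) for c in classes}
    for words, probs in weighted_docs:
        for c, p in probs.items():
            class_mass[c] += p
            for w in words:
                word_mass[c][w] += p  # fractional word counts
    n = len(weighted_docs)
    prior = {c: (class_mass[c] + 1) / (n + len(classes)) for c in classes}
    cond = {}
    for c in classes:
        total = sum(word_mass[c].values())
        cond[c] = {w: (word_mass[c][w] + 1) / (total + len(vocab)) for w in vocab}
    return prior, cond

def e_step(words, prior, cond):
    """Estimate P(class | words) under the current model."""
    log_p = {
        c: math.log(prior[c]) + sum(math.log(cond[c][w]) for w in words if w in cond[c])
        for c in prior
    }
    top = max(log_p.values())
    unnorm = {c: math.exp(v - top) for c, v in log_p.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}

def em_naive_bayes(labeled, unlabeled, iterations=10):
    classes = sorted({label for _, label in labeled})
    # Labeled items keep hard (0/1) class probabilities throughout
    lab = [(text.lower().split(), {c: float(c == label) for c in classes})
           for text, label in labeled]
    unl = [text.lower().split() for text in unlabeled]
    vocab = {w for words, _ in lab for w in words} | {w for words in unl for w in words}
    prior, cond = m_step(lab, classes, vocab)  # learn from labeled data only
    for _ in range(iterations):
        soft = [(words, e_step(words, prior, cond)) for words in unl]  # E-step
        prior, cond = m_step(lab + soft, classes, vocab)               # M-step
    return prior, cond
```

In this sketch a word such as "straps" that never appears in the labeled set can still pick up an association with a class, because it co-occurs with labeled words in the unlabeled descriptions.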
EM Results
[Figure: chart of classification accuracy (0 to 100) for EM vs. Naïve Bayes vs. Baseline on each attribute: AgeGroup, Functionality, Formality, Conservative, Sportiness, Trendiness, BrandAppeal]
A Peek at the Learned Models
Not Conservative (Flashy): leopard, chemise, straps, flirty
Extremely Conservative: double-breasted, seasonless, trouser, classic, blazer
A Peek at the Learned Models
Informal: jean, denim, sweater, tee
Formal: jacket, skirt, lines, seam, crepe
A Peek at the Learned Models
Loungewear: chemise, silk, kimono, lounge, robe, gown
Extremely Sporty: sneaker, rubber, miraclesuit, athletic, mesh
What can this be used for?
Applications
Example applications that we have built include:
• Recommender System
• Copywriter’s Workbench
• Competitive Comparisons
Recommender System
[Diagram: Retailer’s Web Site → Extracted Descriptions of Products Browsed → Product Semantics Knowledge Base (Learned Statistical Models) → Evolving User Profile]
Advantages over Traditional
Recommender Systems
This approach gives us some of the underlying attributes that
characterize a customer’s preferences.
We can therefore begin to explain the preference rather than simply rely
on the co-occurrence of purchases (e.g. people who bought x also
bought y).
This helps with:
• Handling new products/rapidly changing products
• Low Frequency Products
• Cross Category Recommendations
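One way to picture an attribute-based profile is as a running average of the attribute distributions inferred for the products a user has browsed, against which any product with a description can be scored. The sketch below is an assumption-laden illustration, not the system described in the slides: `update_profile`, `match_score`, the averaging rule, and the overlap score are all hypothetical.

```python
def update_profile(profile, item_attrs, n_seen):
    """Fold one browsed item into the profile as a running average.

    profile / item_attrs: {attribute: {value: probability}} (hypothetical format)
    n_seen: number of items already folded into the profile
    """
    new_profile = {}
    for attr in set(profile) | set(item_attrs):
        old = profile.get(attr, {})
        dist = item_attrs.get(attr)
        if dist is None:
            new_profile[attr] = dict(old)  # attribute missing on this item: keep as is
            continue
        values = set(old) | set(dist)
        new_profile[attr] = {
            v: (old.get(v, 0.0) * n_seen + dist.get(v, 0.0)) / (n_seen + 1)
            for v in values
        }
    return new_profile

def match_score(profile, item_attrs):
    """Mean overlap between profile and item over the attributes they share."""
    shared = [a for a in item_attrs if a in profile]
    if not shared:
        return 0.0
    overlap = sum(
        sum(profile[a].get(v, 0.0) * p for v, p in item_attrs[a].items())
        for a in shared
    )
    return overlap / len(shared)

# Made-up classifier outputs for two browsed items
profile = {}
browsed = [
    {"Formality": {"informal": 0.9, "formal": 0.1}},
    {"Formality": {"informal": 0.7, "formal": 0.3}},
]
for n, item in enumerate(browsed):
    profile = update_profile(profile, item, n)
```

Because scoring needs only a description-derived attribute distribution, a brand-new or rarely purchased product can be scored without any purchase history.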
Cross-Category
Recommendations
• Difficult for collaborative filtering and content-based systems
• Build a model of the user - personality, stylistic attributes
• Taste in clothing might also be suggestive of taste in other products,
say furniture and home decoration
• Create models for different product classes and create mappings
among these models
Application II
Competitive Comparison Tool
• Just as consumers may be profiled by what they buy, retailers can be
profiled by what they sell
• Track and compare how the positioning of products from different
retailers changes over time
• Brands can track how different retailers/stores position their products
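A retailer profile of this kind could be as simple as the average of the attribute distributions inferred for the items it sells. The following is a hypothetical sketch: the function names and the numbers are invented for illustration and do not come from the slides.

```python
def retailer_profile(item_dists):
    """Average P(value) for one attribute over a retailer's items.

    item_dists: list of {value: probability} dicts, one per item (hypothetical format)
    """
    if not item_dists:
        return {}
    totals = {}
    for dist in item_dists:
        for value, p in dist.items():
            totals[value] = totals.get(value, 0.0) + p
    return {value: t / len(item_dists) for value, t in totals.items()}

def profile_gap(a, b):
    """Per-value difference between two profiles (a minus b)."""
    return {v: a.get(v, 0.0) - b.get(v, 0.0) for v in set(a) | set(b)}

# Made-up Formality distributions for two retailers' catalogs
retailer_a = retailer_profile([
    {"formal": 0.8, "informal": 0.2},
    {"formal": 0.6, "informal": 0.4},
])
retailer_b = retailer_profile([{"formal": 0.2, "informal": 0.8}])
gap = profile_gap(retailer_a, retailer_b)
```

Recomputing the profiles on catalog snapshots from different dates would give the change in positioning over time.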
Application III
Copywriter’s Toolkit
• Can this system be used to help write product
descriptions?
• A tool for copywriters that provides feedback to help them
position a product in a particular way.
• Writers can assess their descriptions and get word
recommendations
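Word recommendations could be derived from the learned P(word | attribute value) estimates by ranking words that favor the target value over the alternatives. The sketch below is an assumption, not the tool's actual method: `recommend_words`, the log-ratio score, the fallback probability, and the probabilities themselves are made up for the example.

```python
import math

def recommend_words(description, cond, target, k=3):
    """Suggest up to k words, absent from the description, that most favor `target`.

    cond: {attribute_value: {word: P(word | attribute_value)}} (hypothetical format)
    """
    present = set(description.lower().split())
    others = [v for v in cond if v != target]
    scores = {}
    for word, p in cond[target].items():
        if word in present:
            continue
        # Strongest competing value for this word; tiny fallback if none lists it
        rival = max((cond[v].get(word, 1e-6) for v in others), default=1e-6)
        scores[word] = math.log(p / rival)
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Made-up word probabilities for two values of one attribute
cond = {
    "flashy": {"flirty": 0.2, "straps": 0.15, "classic": 0.01, "shirt": 0.1},
    "conservative": {"flirty": 0.01, "straps": 0.02, "classic": 0.2, "shirt": 0.1},
}
print(recommend_words("classic pinstripe shirt", cond, "flashy", k=2))
```

A ratio-based score is used here so that words common under every value (such as "shirt") are not suggested merely because they are frequent.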
Screenshot
Classy and chic, this longsleeve pinstripe shirt has the glamorous appeal of a 40s movie star or European songstress.
• Shirring along front button
placket.
• Double-button extended cuffs.
• 3 1/2" side-seam slits.
• Cotton/polyester; dry clean.
• By BCBG Max Azria; imported.
Increase Tone : skin, flirty, low-neck, slim-fit, straps,
Summary
• “Understand” a product and hence the individual customer
• Use Text Learning (supervised and semi-supervised) to abstract from
product (description) to subjective, domain-specific features to create
enhanced product databases
• Create applications that have more semantic knowledge of products
and can help understand consumer behavior
• Provide Data Mining algorithms with semantic attributes to operate on
and build better and more domain specific models