IDEAS 2011
Lisbon
21-23 September
MINING SEMANTIC DATA FOR SOLVING
FIRST-RATER AND COLD-START
PROBLEMS IN RECOMMENDER SYSTEMS
María N. Moreno, Saddys Segrera, Vivian F. López,
M. Dolores Muñoz and Ángel Luis Sánchez
Data Mining Research Group
http://mida.usal.es
Department of
Computing and Automatic
CEDI 2010
Contents
Introduction
Recommender Systems
Recommendation framework
Case Study
Conclusions
Mining Semantic Data for Solving First-rater and Cold-start Problems in Recommender Systems
María N. Moreno, Saddys Segrera, Vivian F. López, M. Dolores Muñoz and Ángel Luis Sánchez
CEDI 2010
Introduction
Recommender systems
Recommender systems provide users with intelligent mechanisms to find products to purchase
Applications: e-commerce, e-learning, tourism, news pages…
Drawbacks: low performance, low reliability of recommendations…
[Figure labels: commerce server, catalog, client]
Introduction
Proposal
Objective: overcome critical drawbacks in recommender systems
Methodology: semantic-based Web mining
Associative classification (Web Mining): machine learning technique that combines concepts from classification and association
Domain-specific ontology (Semantic Web): enrichment of the data to be mined with semantic annotations
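To make the idea of associative classification concrete, the following is a minimal brute-force sketch, not any specific published algorithm and not the authors' implementation. It mines rules whose consequent is a class label, keeps those above support and confidence thresholds, and classifies with the first matching rule. The attribute names, the toy records and the thresholds are all assumptions chosen for illustration.

```python
from itertools import combinations


def mine_class_rules(rows, min_support=0.3, min_confidence=0.7, max_len=2):
    """Mine rules of the form {attribute=value, ...} -> class label.

    rows: list of (features_dict, label) pairs.
    Returns (antecedent, label, support, confidence) tuples, best rules first.
    """
    n = len(rows)
    antecedent_counts = {}
    rule_counts = {}
    for features, label in rows:
        items = sorted(features.items())
        for size in range(1, max_len + 1):
            for antecedent in combinations(items, size):
                antecedent_counts[antecedent] = antecedent_counts.get(antecedent, 0) + 1
                key = (antecedent, label)
                rule_counts[key] = rule_counts.get(key, 0) + 1
    rules = []
    for (antecedent, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, label, support, confidence))
    # Rank by confidence, then support, then shorter antecedents.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, features, default=None):
    """Return the label of the first (best ranked) rule whose antecedent matches."""
    items = set(features.items())
    for antecedent, label, _, _ in rules:
        if set(antecedent) <= items:
            return label
    return default


# Hypothetical toy data: user profile plus movie genre, labelled by outcome.
toy_rows = [
    ({"gender": "F", "age": "[25, 34]", "genre": "Drama"}, "recommend"),
    ({"gender": "F", "age": "[25, 34]", "genre": "Drama"}, "recommend"),
    ({"gender": "M", "age": "[18, 24]", "genre": "Drama"}, "skip"),
    ({"gender": "M", "age": "[18, 24]", "genre": "Action"}, "recommend"),
    ({"gender": "F", "age": "[25, 34]", "genre": "Action"}, "skip"),
]
rules = mine_class_rules(toy_rows)
```

With these thresholds only two-attribute rules survive on the toy data; real associative classifiers add pruning and rule-selection steps that this sketch omits.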
Recommender Systems
Classification of recommendation methods
Content-based: compare text documents to user profiles
Collaborative filtering: based on the opinions of other users (ratings)
Memory-based (user-based): find users with similar preferences (neighbors) by means of statistical techniques
Model-based (item-based): use data mining techniques to develop a model of user ratings
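As an illustration of the memory-based (user-based) approach, here is a minimal sketch that finds neighbors with Pearson correlation and predicts a rating as a mean-centered weighted average. This is one common textbook formulation, used here as an assumption; the slides do not say which statistical technique is meant. The user names, ratings and the choice of k are hypothetical.

```python
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = mean(ratings_a[i] for i in common)
    mean_b = mean(ratings_b[i] for i in common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((ratings_b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(ratings, user, item, k=2):
    """Predict a 1-5 rating from the k most similar users who rated the item.

    Returns None for a user with no ratings at all (the cold-start case).
    """
    if not ratings[user]:
        return None
    user_mean = mean(ratings[user].values())
    neighbors = []
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim > 0:
            neighbors.append((sim, other))
    neighbors.sort(reverse=True)
    top = neighbors[:k]
    if not top:
        return user_mean
    num = sum(sim * (ratings[o][item] - mean(ratings[o].values())) for sim, o in top)
    den = sum(sim for sim, _ in top)
    return min(5.0, max(1.0, user_mean + num / den))


# Hypothetical toy ratings: user -> {movie: score}.
ratings = {
    "ann": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 3, "m4": 4},
    "eve": {"m1": 1, "m2": 5, "m4": 2},
    "zoe": {},
}
```

Every prediction scans all other users, which is where the scalability concern for memory-based methods comes from.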
Recommender Systems
Critical drawbacks
Sparsity: the number of ratings needed for prediction is greater than the number of ratings obtained from users
Scalability: performance problems, mainly in memory-based methods, where the computation time grows linearly with both the number of customers and the number of products in the site
First-rater problem: new products have never been rated, so they cannot be recommended
Cold-start problem: new users cannot receive recommendations because they have not yet rated any products
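The drawbacks above can be observed directly on a ratings table. The sketch below is an illustration under stated assumptions: sparsity is measured as the fraction of empty user/product cells (a common convention, not a definition taken from the slides), and the identifiers and toy data are hypothetical.

```python
def rating_matrix_report(users, items, ratings):
    """Summarise a ratings table.

    ratings: iterable of (user_id, item_id, score) triples.
    """
    rated_pairs = {(u, i) for u, i, _ in ratings}
    density = len(rated_pairs) / (len(users) * len(items))
    rated_users = {u for u, _ in rated_pairs}
    rated_items = {i for _, i in rated_pairs}
    return {
        # Fraction of user/product cells with no rating.
        "sparsity": 1.0 - density,
        # Users with no ratings: the cold-start case.
        "cold_start_users": sorted(set(users) - rated_users),
        # Products with no ratings: the first-rater case.
        "first_rater_items": sorted(set(items) - rated_items),
    }


# Hypothetical toy data.
users = ["u1", "u2", "u3"]
items = ["m1", "m2", "m3", "m4"]
toy_ratings = [("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4)]
report = rating_matrix_report(users, items, toy_ratings)
```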
Recommendation framework
Associative classification (Web Mining)
Sparsity: only slightly sensitive to sparse data
Scalability: model-based approach
Domain-specific ontology (Semantic Web)
First-rater problem:
Use of taxonomies to classify products
Induction of abstract patterns that relate user profiles to categories of products
Cold-start problem:
Recommendations based on user profiles
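A minimal sketch of how a taxonomy can address both problems: products are annotated with a higher-level category, and patterns are induced between user profiles and categories rather than individual products. The taxonomy, category names, profile fields and records below are hypothetical placeholders, not the ontology used in the case study, and the "most frequently liked category" pattern is a deliberately simple stand-in for a mined model.

```python
from collections import Counter

# Hypothetical genre taxonomy (assumption, for illustration only).
GENRE_TAXONOMY = {
    "Action": "Thrill",
    "Thriller": "Thrill",
    "Horror": "Thrill",
    "Comedy": "Light",
    "Romance": "Light",
    "Drama": "Serious",
    "Documentary": "Serious",
}


def annotate(record, taxonomy=GENRE_TAXONOMY):
    """Return a copy of the record with a 'category' derived from its genre."""
    annotated = dict(record)
    annotated["category"] = taxonomy.get(record["genre"], "Unknown")
    return annotated


def preferred_categories(history):
    """Map each (gender, age_group) profile to its most frequently liked category."""
    counts = {}
    for rec in history:
        if rec["liked"]:
            profile = (rec["gender"], rec["age_group"])
            counts.setdefault(profile, Counter())[annotate(rec)["category"]] += 1
    return {profile: c.most_common(1)[0][0] for profile, c in counts.items()}


# Hypothetical historical data.
history = [
    {"gender": "F", "age_group": "[25, 34]", "genre": "Action", "liked": True},
    {"gender": "F", "age_group": "[25, 34]", "genre": "Thriller", "liked": True},
    {"gender": "F", "age_group": "[25, 34]", "genre": "Comedy", "liked": False},
    {"gender": "M", "age_group": "> 55", "genre": "Drama", "liked": True},
]
patterns = preferred_categories(history)

# A product nobody has rated yet still receives a category from the taxonomy.
new_movie = annotate({"title": "Unrated release", "genre": "Horror"})
```

Because the pattern is keyed on a profile and a category, it applies to a user who has just registered and to a movie that has no ratings.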
Recommendation framework
Off-line process
Historical data -> Data mining algorithms -> Low level model
Domain ontology -> Provide annotations -> Historical data with semantic annotations -> Data mining algorithms -> High level model
On-line process
Active user -> Recommendation request
[new user] Registration -> Check high level model -> Recommendations
[old user] new products -> Check high level model -> Recommendations
[old user] old products -> Check low level model -> Recommendations
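The routing in the on-line process can be sketched as a small decision function. This is an interpretation of the diagram under stated assumptions: the function name, the argument names and the use of plain sets to represent known users and already rated products are hypothetical.

```python
def choose_model(user_id, product_id, known_users, rated_products):
    """Pick which off-line model to consult for a recommendation request.

    Returns "high" for the high level model and "low" for the low level model.
    """
    if user_id not in known_users:
        # New user: only the registration profile is available.
        return "high"
    if product_id not in rated_products:
        # Old user, new product: no ratings exist for this product.
        return "high"
    # Old user, old product: rating history is available.
    return "low"


# Hypothetical state of the system.
known_users = {"u1", "u2"}
rated_products = {"m1", "m2"}
```

Both models are built off-line, so the on-line step reduces to a lookup followed by applying a precomputed model.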
Case Study
MovieLens Data
User Data: ID (Num.), Gender (Binary), Age (Num.), Occupation (String), Zip (Num.)
Movies Data: ID (Num.), Title (String), Genre (Binary, 19 attributes)
Ratings Data: ID (Num.), User ID (Num.), Movie ID (Num.), Rating (Num., 1 - 5)
Slide annotations: score, rating_bin
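The slide names a rating_bin attribute next to the 1 - 5 rating but does not state how it is derived. One plausible reading is a binarized rating; the sketch below shows that reading only, and the threshold of 3 and the 0/1 encoding are assumptions, not values taken from the case study.

```python
def binarize_rating(score, threshold=3):
    """Map a 1-5 rating to 1 (above threshold) or 0 (otherwise).

    The default threshold is an assumption for illustration.
    """
    if not 1 <= score <= 5:
        raise ValueError(f"rating out of range: {score}")
    return 1 if score > threshold else 0


# Hypothetical (user_id, movie_id, rating) triples.
toy_scores = [(1, 10, 5), (1, 11, 3), (2, 10, 1), (2, 12, 4)]
rating_bin = [binarize_rating(score) for _, _, score in toy_scores]
```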
Case Study
MovieLens Data
ID (Num.)
User Gender (Binary)
*User Age: < 18, [18, 24], [25, 34], [35, 44], [45, 49], [50, 55], > 55
User Occupation (String)
Movie Title (String)
*Movie Genre (String)
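The age intervals listed on this slide can be applied with a simple lookup. The sketch below uses the interval labels exactly as shown; the function name and the assumption that ages are non-negative whole numbers are illustrative.

```python
from bisect import bisect_right

# Interval labels as listed on the slide.
AGE_LABELS = ["< 18", "[18, 24]", "[25, 34]", "[35, 44]", "[45, 49]", "[50, 55]", "> 55"]
# Smallest age that falls into each interval after the first.
AGE_LOWER_BOUNDS = [18, 25, 35, 45, 50, 56]


def age_group(age):
    """Return the interval label for an age given as a non-negative integer."""
    if age < 0:
        raise ValueError(f"invalid age: {age}")
    return AGE_LABELS[bisect_right(AGE_LOWER_BOUNDS, age)]
```

For example, `age_group(24)` falls in `[18, 24]` and `age_group(56)` in `> 55`.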
Case Study
Ontology definition
Case Study
Results
Associative classification methods (CBA, CMAR, FOIL and CPAR) were compared to non-associative classification algorithms
Conclusions
A framework for recommender systems is proposed in order to overcome some critical drawbacks
The proposal combines web mining methods and domain-specific ontologies in order to induce models at two abstraction levels:
The low level model relates users, movies and ratings for making the recommendations
The high level model is used for recommending unrated movies or for making recommendations to new users, overcoming the first-rater and cold-start problems
The off-line model induction avoids scalability problems at recommendation time
Associative classification methods provide a way to deal with the sparsity problem
IDEAS 2011
Lisbon
21-23 September
THANKS FOR YOUR ATTENTION !
MINING SEMANTIC DATA FOR SOLVING FIRST-RATER AND COLD-START PROBLEMS IN RECOMMENDER SYSTEMS
María N. Moreno*, Saddys Segrera, Vivian F. López, M. Dolores Muñoz & Ángel Luis Sánchez
*[email protected]
Department of
Computing and Automatic