PowerPoint from Honors Thesis Defense - E


Interactive Database Design:
Exploring Movies through
Categories
Stacey Aylor
Statement of Purpose
• To determine if the concept of
personalized categories is a viable
paradigm for exploring databases
• In order to accomplish this goal:
– Find an effective trial list generator
and rater combination
– Create a system that allows users to
create categories and discover
collective statistics about them
– Draw conclusions based on user
testing and feedback
Test Domain: Movies
• Internet Movie Database
• Downloadable
• User-defined keywords
and movie reviews
• Features
– Admissions
– Runtime
– Release Date
– MPAA Ratings
– Opening Weekend Take
Category Approach
• Universal and
natural thinking
pattern
• Wittgenstein, Rosch
• Difficult to
articulate
• Examples
– Realistic Cartoons
– Disaster Dramas
– Smart Sci-Fi
Related work
• Recommender
systems
– Netflix
– iTunes Genius
– Amazon
– Pandora
– Jinni
– zMovie
• Collaborative-based
– Nanocrowd
• Content-based
– Musicovery
Related work
• Interactive Database Exploration
– QBE (IBM) (Zloof, [8])
– Khoros (Konstantinides & Rasure, [9])
– Magic Lens (Bier et al., [10])
– DataSplash (Tioga-2) (Olston et al., [3])
Related work
• Visual Database Exploration
– ViQing (Olston et al., [13])
– Improvise (Weaver, [2])
– DEVise (Livny et al., [4])
– MineSet (Brunk et al., [6])
– VisDB (Keim & Kriegel, [12])
• Fuzzy Databases
– FSQL (Galindo, [1])
– FCQL (Meier et al. [11])
Summary of related work
• Lots of research into recommending new items to users
• Lots of research into visual database exploration
• However, no existing work on allowing users to create custom categories and then discover aggregate information about them
Membership Prediction
• Data mining classification
• Active Learning: intelligently choosing
which items to ask the user to classify
• My approach:
– Ask for a small number of initial
examples (5)
– Present trial lists to obtain other
examples and counterexamples
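The loop described above (a handful of seed examples, then iterated trial lists) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: `classify` stands in for the user's yes/no judgment, and a real trial list generator would rank the unlabeled movies rather than take them in order.

```python
import random

def active_learning_session(movies, classify, n_initial=5, n_rounds=6, list_size=10):
    """Sketch of the active-learning flow: seed with a few examples,
    then present successive trial lists for the user to label."""
    labeled = {}
    # Step 1: a small number of initial examples (5 in the thesis).
    for title in random.sample(movies, n_initial):
        labeled[title] = classify(title)
    # Step 2: iterations of trial lists (Experiment 1 used 6 rounds of 10 films).
    for _ in range(n_rounds):
        unlabeled = [m for m in movies if m not in labeled]
        trial_list = unlabeled[:list_size]  # a real TLG would rank these by informativeness
        for title in trial_list:
            labeled[title] = classify(title)
    return labeled
```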
Naïve Bayes
• A standard benchmark algorithm
• Laplace Smoothing
• Prior and setting the threshold
• Bernoulli vs. multinomial unigram model
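To make the bullets above concrete, here is a minimal sketch of a keyword-based Bernoulli Naïve Bayes classifier with Laplace smoothing and an adjustable decision threshold. The data shapes (`examples` as keyword-set/label pairs) are assumptions for illustration; it also assumes the training data contains both examples and counterexamples.

```python
import math

def train_bernoulli_nb(examples, vocab):
    """examples: list of (keyword_set, in_category) pairs.
    Returns class priors and per-keyword probabilities with
    Laplace (add-one) smoothing."""
    pos = [kw for kw, y in examples if y]
    neg = [kw for kw, y in examples if not y]
    prior_pos = len(pos) / len(examples)
    model = {"prior": (math.log(prior_pos), math.log(1 - prior_pos)), "probs": {}}
    for w in vocab:
        # Laplace smoothing: add 1 to each count and 2 to the denominator,
        # so a keyword unseen in one class never yields a zero probability.
        p_pos = (sum(w in kw for kw in pos) + 1) / (len(pos) + 2)
        p_neg = (sum(w in kw for kw in neg) + 1) / (len(neg) + 2)
        model["probs"][w] = (p_pos, p_neg)
    return model

def predict(model, keywords, vocab, threshold=0.0):
    """Bernoulli model: every vocabulary keyword contributes,
    whether present or absent (unlike the multinomial unigram model,
    which only counts occurrences)."""
    log_pos, log_neg = model["prior"]
    for w in vocab:
        p_pos, p_neg = model["probs"][w]
        if w in keywords:
            log_pos += math.log(p_pos)
            log_neg += math.log(p_neg)
        else:
            log_pos += math.log(1 - p_pos)
            log_neg += math.log(1 - p_neg)
    # The threshold on the log-odds can be tuned to trade precision for recall.
    return (log_pos - log_neg) > threshold
```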
“UMW Algorithm”
(experimental)
• Theory: some keywords
relevant and others aren’t
• Progressively identify and
eliminate meaningless
keywords
• Deliberately present keywords
with undetermined relevance
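The slide gives only the intuition behind the UMW algorithm, so the following is a hypothetical sketch of that idea, not the thesis's actual procedure: keywords seen too rarely are flagged as undetermined (and would be deliberately surfaced in future trial lists), keywords appearing about equally among examples and counterexamples are eliminated as meaningless, and the rest are kept as relevant. The `min_obs` and `margin` knobs are invented for illustration.

```python
def partition_keywords(keywords, pos_counts, neg_counts, min_obs=3, margin=0.2):
    """Hypothetical UMW-style split of keywords into relevant,
    meaningless, and undetermined buckets, based on how often each
    keyword appeared among examples (pos_counts) vs counterexamples."""
    relevant, meaningless, undetermined = [], [], []
    for w in keywords:
        p, n = pos_counts.get(w, 0), neg_counts.get(w, 0)
        total = p + n
        if total < min_obs:
            # Too few observations: present this keyword deliberately
            # in later trial lists to resolve its relevance.
            undetermined.append(w)
        elif abs(p - n) / total <= margin:
            # Appears about equally on both sides: eliminate as meaningless.
            meaningless.append(w)
        else:
            relevant.append(w)
    return relevant, meaningless, undetermined
```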
Trial List Generator
Algorithm
• Controls which movies are
presented to the user to classify
(active learning)
• I compared three algorithms:
– Naïve Bayes (Bernoulli) based on
Keywords
– Naïve Bayes (multinomial unigram)
based on Keywords
– Custom Algorithm “Keyword UMW”
Rater Algorithm
• Controls how we will predict whether a given movie is probably a member of the user's category
• I compared four algorithms:
– Keyword Naïve Bayes (Bernoulli)
– Keyword Naïve Bayes (multinomial unigram)
– Free Text Naïve Bayes (Bernoulli)
– Free Text Naïve Bayes (multinomial unigram)
Experiment 1
• 62 Testers
• Web-based experiment
• Each assigned one of the three
Trial List Generator algorithms
• Created a “Chick Flicks” category
– 5 Initial Examples
– 6 Iterations of trial lists (10 films
each)
– 1 test list (45 films)
• Same process for a user-defined
category
Algorithm Evaluation
• Accuracy: how frequently was the system correct in predicting category membership?
• Precision: of the movies predicted to be in the category, what percentage actually were?
• Recall: of the movies that were actually examples, how many of them were predicted to be?
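The three definitions above translate directly into code. A small sketch, assuming predictions and ground-truth labels as parallel boolean lists:

```python
def evaluate(predictions, actuals):
    """Accuracy, precision, and recall as defined on the slide.
    predictions/actuals: parallel lists of booleans (in category?)."""
    pairs = list(zip(predictions, actuals))
    tp = sum(p and a for p, a in pairs)       # predicted in, actually in
    fp = sum(p and not a for p, a in pairs)   # predicted in, actually out
    fn = sum(a and not p for p, a in pairs)   # actually in, predicted out
    accuracy = sum(p == a for p, a in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```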
Accuracy by TLG
Precision by TLG
Accuracy by Rater
Precision by Rater
Experiment 1 Conclusions
• Trial List Generators:
– “UMW algorithm”: Best Precision
– Keyword Naïve Bayes: Best Accuracy
• Raters:
– Bernoulli vs. multinomial unigram
• Bernoulli: Better Accuracy and Precision
– Free-text Naïve Bayes: Best Accuracy
and Precision
• For aggregate statistics, precision (not
accuracy) is actually most important
Prototype interface
• Category Builder
Process
– 5 Examples
– 3 Lists
• Find out
interesting
information
Demonstration
Experiment 2
• 12 Subjects
• Focused Empirical Testing
• Goal: to discover whether new users
could use a category-based paradigm
to answer complex analytical
questions
• Specific Tasks
– Part 1: Create Categories
– Part 2: Simple Analysis
– Part 3: Complex Analysis
Part 3 Questions
• Do films in Category2 generally
receive higher IMDB ratings than
films in Category3?
• Consider how many Chick Flicks
have been released in recent
years, compared to films in
general. What has the trend
been? Are Chick Flicks becoming
relatively more frequent, or less?
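The trend question above amounts to an aggregate computation over the predicted category: for each release year, what fraction of all films were category members? A minimal sketch, assuming lists of release years as the (hypothetical) data shape:

```python
from collections import Counter

def category_trend(category_years, all_years):
    """For each release year, the fraction of all films that were
    predicted members of the category. Rising fractions over time
    mean the category is becoming relatively more frequent."""
    cat = Counter(category_years)
    total = Counter(all_years)
    return {year: cat[year] / total[year] for year in sorted(total)}
```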
Strengths
• Overall: easy (and enjoyable!) to
use
• The information users found
generally aligned with their
intuition, with some surprising
(and enlightening) findings
• Applicability to other fields
Weaknesses
• Thinking of initial examples is
difficult for some people
• Complex analysis variability
• Lack of cross-checking
Conclusions
• With only a small time
investment, users are able to
specify categories that can be
used for accurate prediction
• This paradigm makes it possible
for users to discover new kinds
of information relevant to their
own natural groupings
Special thanks to Jonathan
Morin and Jesse Hatfield for
their valuable contributions
to this project
Bibliography
[1] J. Galindo, J. Medina, O. Pons, and J. Cubero, “A Server for Fuzzy SQL Queries,” Flexible Query Answering Systems, 1998, p. 164.
[2] C. Weaver, “Building Highly-Coordinated Visualizations in Improvise,” Proc. of the IEEE Symposium on Information Visualization, Washington, DC: IEEE Computer Society, 2004, pp. 159-166.
[3] C. Olston, A. Woodruff, A. Aiken, M. Chu, V. Ercegovac, M. Lin, M. Spalding, and M. Stonebraker, “DataSplash,” Proc. of the 1998 ACM SIGMOD international conference on Management of data, Seattle, WA: ACM, 1998, pp. 550-552.
[4] M. Livny, R. Ramakrishnan, K. Beyer, G. Chen, D. Donjerkovic, S. Lawande, J. Myllymaki, and K. Wenger, “DEVise: integrated querying and visual exploration of large datasets,” ACM SIGMOD Record, vol. 26, 1997, pp. 301-312.
[5] E. Rosch, “Family Resemblances: Studies in the Internal Structure of Categories,” Readings in language and mind, Wiley-Blackwell, 1996, pp. 442-459.
[6] C. Brunk, J. Kelly, and R. Kohavi, “MineSet: An Integrated System for Data Mining,” Proc. of the 3rd Int’l Conf. on Knowledge Discovery and Data Mining, Newport Beach, CA: AAAI, 1997, pp. 135-138.
[7] E. Rosch, “Principles of Categorization,” Concepts: core readings, MIT Press, 1999, pp. 189-204.
[8] M.M. Zloof, “Query-by-example: a data base language,” IBM Systems Journal, vol. 16, Dec. 1977, pp. 324-343.
[9] K. Konstantinides and J. Rasure, “The Khoros software development environment for image and signal processing,” IEEE Transactions on Image Processing, vol. 3, 1994, pp. 243-252.
[10] E. Bier, M. Stone, K. Pier, W. Buxton, and T. DeRose, “Toolglass and magic lenses,” Proc. of the 20th annual conf. on Computer Graphics and interactive techniques, Anaheim, CA: 1993, pp. 73-80.
[11] A. Meier, N. Werro, M. Albrecht, and M. Sarakinos, “Using a fuzzy classification query language for customer relationship management,” Proc. of the 31st int’l conf. on Very large data bases, Trondheim, Norway: VLDB Endowment, 2005, pp. 1089-1096.
[12] D. Keim and H. Kriegel, “VisDB: Database Exploration Using Multidimensional Visualization,” IEEE Computer Graphics and Applications, vol. 14, 1994, pp. 40-49.
[13] C. Olston, M. Stonebraker, A. Aiken, and J. Hellerstein, “VIQING: Visual Interactive Querying,” Proc. of the IEEE Symposium on Visual Languages, Washington, DC: IEEE Computer Society, 1998, p. 162.
Recall by TLG
Recall by Rater