I(2) - CS Students - Stanford University

Download Report

Transcript I(2) - CS Students - Stanford University

Evaluating, Combining and Generalizing
Recommendations with Prerequisites
Aditya Parameswaran
Stanford University
(with Profs. Hector Garcia-Molina and Jeffrey D. Ullman)
1
Statistics (at Stanford):
>10,000 registered users
37,000 listed courses
160,000 evaluations
Overall
172 universities
100,000 users
2
Course Recommendations
• Instead of a traditional ranked list …
• Recommend a good package satisfying
– Prerequisites
(e.g., algebra  calculus)
– Requirements
(e.g., > 3 math courses)
– Planning constraints
(e.g., no two in same slot)
• Recent work on recommending packages
Prior Work
– Yahoo! Travel Plans [De Choudhury et. al. WWW 10]
Yahoo! Composite items [Roy et. al. SIGMOD 10]
– Minimizing Cost
[Xie et. al. RecSys 10]
3
Intuitive Example
• Nodes represent all items not taken yet
• Edges imply prerequisites
Prerequisites: YES
score: 29
A(5)
B(6)
C(3)
J(8)
G(7)
H(8)
K(9)
D(7)
E(2)
I(2)
Prerequisites: NO
score: 32
4
Example: General Prerequisites
Set Theory
Arithmetic
Algebra
Probability
Adv. Math
Geometry
Optimization
Algorithms
Information
Theory
Statistics
5
Formal Problem
• Directed acyclic graph G(V, E)
– with some nodes Labeled AND or OR
• Every node x has a score(x)
• Recommend k = |A| courses such that
–
score(a) is maximized
{a ϵ A}
– Prerequisites of all nodes are met
Chain Graphs
AND Graphs
OR Graphs
AND-OR Graphs
6
Outline of Work
Chain Graphs
• Complexity
• Chain Graphs: PTIME DP
AND Graphs
OR Graphs
• AND / OR / AND-OR: NP-Hard
AND-OR Graphs 7
• Adaptable Approx Algorithms
1) Breadth First
For Chain Graphs
2) Greedy
3) Top Down
• Worst case per structure
• Complexity: DP > Greedy > Top Down > BF
• Merge Algorithm
• Experiments
Sample
• Extensions to Fuzzy Prerequisites
Chain Graph Algorithm
Chain 0
Chain 1
….
Chain i -1 Chain i
Chain n
0
Score of best
feasible set of j
items from first
i chains
j
k
B [j, i] = max over all x
{B [x, i–1] + 1 … (j—x) of ith chain}
Complexity: O(nk2)
To pick j items from i chains:
Pick
• x items from i-1 chains
• First j – x items from the ith chain
8
Breadth First Algorithm Illustration
• K =4
• Add items until k = 4
• Swap items
A(5)
B(6)
C(3)
J(8)
G(7)
H(8)
K(9)
D(7)
E(2)
I(2)
9
Top Down & Greedy Algorithms
• Algorithms between extremes
– Efficient: Breadth First
– Inefficient but Exact: Dynamic Programming
• Top Down is the reverse of Breadth First
– Add best items first, then try to add prerequisites
• Greedy reasons about entire chains at once
– Tries to add prefixes of chains with high avg score
10
Outline of Work
Chain Graphs
• Complexity
• Chain Graphs: PTIME DP
AND Graphs
OR Graphs
• AND / OR / AND-OR: NP-Hard
AND-OR Graphs 11
• Adaptable Approx Algorithms
1) Breadth First
For Chain Graphs
2) Greedy
3) Top Down
• Worst case per structure
• Complexity: DP > Greedy > Top Down > BF
• Merge Algorithm
• Experiments
Sample
• Extensions to Fuzzy Prerequisites
Breadth First: Worst Case
• Worst case in terms of:
– d: maximum length of chain
– m: max difference in score in a given chain
lots of chains of depth d
a-Є
a+m-Є
a-Є
a-Є
a+m-Є
a+m-Є
lots of singleton elements
a
a
a
difference = k/d x (da + (d-1)m) - ka
= k/d x (d-1)m
a+m-Є
a+m-Є
a+m-Є
12
Experimental Setup
Chain Graphs
Three Algorithms: greedy, td, bf
Top-2 of td, bf
AND Graphs
Three Algorithms: greedy, td, bf
Top-2 of td, bf
Merge-2 of td, bf
Top-3 of greedy, td, bf
• Measure How we perform
• As fraction of DP (for chains) & no-prereqs (for AND graphs)
• Vary:
– n: number of components
– d: max depth of chain / component
– p: probability of a long chain / large component
– k: size of package
• Score: exponentially distributed
13
Ratio of Dynamic Programming Solution
Chain Graphs on Varying k
Size of the desired package
14
Ratio of No-prerequisite Solution
AND Graphs on Varying
Probability of Long Component
15
Conclusions
• Dynamic Programming Algorithm
– Only for Chain Graphs
– Guaranteed best recommendations
• Greedy Value Algorithm
– Adaptable to any structure
– Almost as good recommendations as DP
– With less complexity
• Top Down and Breadth First
– Even better complexity
– Not as good recommendations as Greedy
– Can be improved using Merge algorithm
16
Ratio of Dynamic Programming Solution
Chain Graphs on Varying p
Probability of a long chain (k small)
17