Intro to Content Optimization

Yury Lifshits, Yahoo! Research
Largely based on slides by
Bee-Chung Chen, Deepak Agarwal & Pradheep Elango
1
Outline
• High-level overview
• Explore/Exploit algorithm for Yahoo! Frontpage
2
High-Level Overview of
Content Optimization
3
New Research Area
Content optimization =
optimizing publishing choices
for every user visit
under a given objective function
4
Publishing Choices
• Top stories
• Related stories
• Tweets, updates, trending queries/topics
• Headline text, pictures, text length
• Ads
• Layout
• Modules, order of modules
• Menu items
• Balance between types/topics of content
5
Opportunities (User Visits)
• Time
• Context
• Search query
• Referrer, user session
• User demographics
• User history
• User interest profile
• User social graph (social targeting)
6
Objective function
• Clicks
• Time spent
• Engagement: comments, shares, sign-ups
• Actions on subsequent pages
• Ad revenue
• Off-line conversion
• Long-term objectives
+ Business-rule constraints
7
Item Inventory: articles, web pages, ads, …
Opportunity: users, queries, pages, …
• Use an automated algorithm to select item(s) to show
• Get feedback (clicks, time spent, …)
• Refine the models
• Repeat (a large number of times)
• Measure metric(s) of interest (total clicks, total revenue, …)
8
Some examples
Simple version
– I have an important module on my page; the content inventory is obtained
from a third-party source and further refined through editorial oversight.
Can I algorithmically recommend content for this module? I want to drive up
the total CTR of this module.
More advanced
– I got an X% lift in CTR, but I have additional information on other
downstream utilities (e.g., dwell time). Can I increase downstream utility
without losing too many clicks?
Highly advanced
– There are multiple modules running on my website. How do I take a
holistic approach and perform a simultaneous optimization?
9
Modeling: Key Components
Feature construction
– Content: IR, clustering, taxonomy, entities, …
– User profiles: clicks, views, social, community, …
Offline models: initialization (logistic regression, GBDT, …)
Online models: fine-resolution corrections at the item and user level, with quick updates
Explore/Exploit: adaptive sampling
10
Tasks of Content Optimization
• Understand content (Offline)
• Serve content to optimize our objectives (Online)
• Quickly learn from feedback obtained using
ML/Statistics (Offline + Online)
• Constantly enhance our content inventory to
improve future performance (Offline)
• Constantly enhance our user understanding to
improve future performance (Offline + Online)
• Iterate
11
Science of Content Optimization
Large scale Machine Learning & Statistics:
• Offline Models
• Online Models
• Collaborative Filtering
• Explore/Exploit
12
Explore/Exploit Algorithm for
Yahoo! Frontpage Story Selection
13
Modules on the Yahoo! front page (annotated screenshot):
• Recommend search queries
• Recommend packages: image, title, summary, links to other pages;
pick 4 out of a pool of K (K = 20–40); dynamic; routes traffic to other pages
• Recommend applications
• Recommend news articles
14
Problems in this example
• Optimize CTR on different modules together in a
holistic way
– Today Module, Trending Now, Personal Assistant,
News, Ads
– Treat them as independent?
• For a given module
– Optimize some combination of CTR, downstream
engagement and perhaps revenue.
15
Single Module CTR Optimization Problem
• Pick the top n items (stories) from an item pool of K items (stories)
for each user visit to the Yahoo! homepage, in order to maximize the
number of clicks in the Today Module
• In general, items can be articles, ads, modules, or configuration
parameters of the page layout
• One may also replace clicks with any performance metric observable
with low latency
16
Simplified Single Module Problem
• Only consider the first position
– ~2/3 of clicks happen at the first position
– Pick the best one from K items (stories)
• The single best one for all users
– No personalization in this talk
– "Best" means having the highest click-through rate (CTR)
• How to solve this "simple" problem?
– Can't we just show (exploit) the item having the highest CTR?
– We need to explore every available item (using some fraction of
traffic) to estimate its CTR
• Explore too little → unreliable CTR estimates
• Explore too much → little traffic to show the best item
• How much traffic should we allocate to each item now, in order to
maximize the total number of clicks in the future (e.g., in a week)?
17
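To make the explore/exploit tension above concrete, here is a minimal ε-greedy serving loop, one of the baseline schemes compared in the bucket test later in the talk. This is an illustrative sketch, not the production system: the item pool, the true_ctr simulator, and the value of ε are all assumptions.

```python
import random

def epsilon_greedy_serve(n_visits, true_ctr, epsilon=0.1, seed=0):
    """Serve n_visits page views from a fixed item pool with an epsilon-greedy rule.

    true_ctr is a hypothetical {item: click probability} simulator; in a real
    system the clicks would come from user feedback, not from a known CTR.
    """
    rng = random.Random(seed)
    clicks = {i: 0 for i in true_ctr}
    views = {i: 0 for i in true_ctr}
    total_clicks = 0
    for _ in range(n_visits):
        if rng.random() < epsilon:
            item = rng.choice(list(true_ctr))   # explore: keep every item's CTR estimate fresh
        else:
            # exploit: show the item with the highest estimated CTR so far
            item = max(true_ctr, key=lambda i: clicks[i] / views[i] if views[i] else 0.0)
        views[item] += 1
        clicked = rng.random() < true_ctr[item]
        clicks[item] += int(clicked)
        total_clicks += int(clicked)
    return total_clicks

# Too small an epsilon risks locking onto a mis-estimated item; too large an
# epsilon wastes traffic on weak items: exactly the trade-off on this slide.
print(epsilon_greedy_serve(10_000, {"a": 0.03, "b": 0.05, "c": 0.02, "d": 0.04}))
```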
Example Scenario
• 5 min intervals, 100 visits per interval
• One new story arrives every interval, story expires
after 4 intervals
• Every story is either “strong” (100% CTR) or
“weak” (0% CTR)
• New story is strong with probability 75%, weak
with probability 25%
What is the optimal strategy to allocate 100 views
for the next interval between the current 4 stories?
18
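One way to build intuition for this toy scenario is to simulate it. The sketch below is my own illustration, not from the talk: because a story's CTR is either 0% or 100%, a single impression reveals its type, so a policy that spends one view probing each untested story and gives the remaining views to a known-strong story captures nearly all of the available clicks.

```python
import random

VISITS, LIFETIME, P_STRONG, INTERVALS = 100, 4, 0.75, 10_000

def run(policy, seed=0):
    rng = random.Random(seed)
    pool, total = [], 0                      # pool holds (age, is_strong, observed) per story
    for _ in range(INTERVALS):
        pool = [(a + 1, s, o) for a, s, o in pool if a + 1 <= LIFETIME]  # expire old stories
        pool.append((1, rng.random() < P_STRONG, None))                  # new story, type unknown
        pool, clicks = policy(pool)
        total += clicks
    return total / INTERVALS

def explore_then_exploit(pool):
    """Spend one view on each story of unknown type, then give the rest to a known-strong story."""
    clicks, views_left, new_pool = 0, VISITS, []
    for age, strong, observed in pool:
        if observed is None:
            views_left -= 1
            clicks += strong                 # one impression reveals strong (click) or weak (no click)
            observed = strong
        new_pool.append((age, strong, observed))
    if any(o for _, _, o in new_pool):       # exploit: remaining views go to a known-strong story
        clicks += views_left
    return new_pool, clicks

print("average clicks per interval:", run(explore_then_exploit))
```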
Sequential Decision Problem
(Timeline: past intervals …, t − 2, t − 1, and the current interval t = now; clicks in the future.)
At time t, allocate page views across the item pool: x1% of page views to
Item 1, x2% to Item 2, …, xK% to Item K.
Determine (x1, x2, …, xK) based on the clicks and views observed before t,
in order to maximize the expected total number of clicks in the future.
19
Modeling the Uncertainty, NOT just the Mean
Simplified setting: two items
• We know the CTR of Item A well (say, it has been shown 1 million times)
• We are uncertain about the CTR of Item B (shown only 100 times)
• If we only make a single decision, give 100% of page views to Item A
• If we make multiple decisions in the future, explore Item B, since its
CTR can potentially be higher
(Figure: probability density of the CTR of Item A and Item B.)
20
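To quantify the uncertainty argument on this slide, the sketch below compares the CTR posteriors of a heavily shown Item A and a lightly shown Item B and estimates the probability that B is actually better. The Beta-Binomial posteriors, the click counts, and the Monte Carlo estimate are illustrative assumptions, not the talk's production model (the later slides use a Gamma prior with a time-series component).

```python
import random

def beta_sample(rng, a, b):
    # Draw from Beta(a, b) via two Gamma draws (stdlib only, no numpy/scipy needed).
    x = rng.gammavariate(a, 1.0)
    y = rng.gammavariate(b, 1.0)
    return x / (x + y)

def prob_b_beats_a(clicks_a, views_a, clicks_b, views_b, n=50_000, seed=0):
    """Monte Carlo estimate of P(CTR_B > CTR_A) under independent
    Beta(1 + clicks, 1 + non-clicks) posteriors for each item."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n):
        p_a = beta_sample(rng, 1 + clicks_a, 1 + views_a - clicks_a)
        p_b = beta_sample(rng, 1 + clicks_b, 1 + views_b - clicks_b)
        wins += p_b > p_a
    return wins / n

# Item A: shown 1,000,000 times with a 5% CTR -> its posterior is sharply peaked.
# Item B: shown only 100 times with 4 clicks  -> its posterior is wide, so B may well be better.
print(prob_b_beats_a(clicks_a=50_000, views_a=1_000_000, clicks_b=4, views_b=100))
```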
CTR Curves of Some Items in Two Days
Each curve is the 1st-position CTR of an item over time
CTRs are estimated using 1% random data
(See our WWW’09 paper for more information)
21
Characteristics of Our Application
• Non-stationary CTR
– The CTR of each item changes over time
• Dynamic item pools
– Items come and go with short lifetimes (~10hr)
• Batch serving
– For scalability reasons, data is processed in batches (e.g., one
batch per minute)
– We need to provide a sampling plan for each batch (time interval)
22
Bayesian Explore/Exploit
– Adaptation of Whittle (1988) to our problem setting
• With approximations to ensure computational feasibility
• With a time-series model to track the non-stationary CTR
– It provides an approximately Bayes optimal solution
– Development
• Bayes optimal solution to a simplified case:
Two items, two intervals
• Near optimal solution to the general case by using the above
solution as a building block
23
Bayesian Solution: Two Items, Two Intervals
• Two time intervals: t = 0 (N0 views) and t = 1 (N1 views)
– Item P: we are uncertain about its CTR, p0 at t = 0 and p1 at t = 1
– Item Q: we know its CTR exactly, q0 at t = 0 and q1 at t = 1
• Question: what fraction x of the N0 views at t = 0 should go to item P,
and (1 − x) to item Q?
• To determine x, we need to estimate what would happen in the future
– After serving fraction x, we obtain c clicks on item P (not yet
observed; a random variable)
– Once c is observed, we can update the estimate p̂1(x, c) of p1
– If x and c are given, the optimal solution at t = 1 is: give all views
to item P iff E[ p1(x, c) | x, c ] > q1
(Figure: CTR densities of items P and Q at t = 0 (now) and t = 1.)
24
The Two Item, Two Interval Case
• Expected total number of clicks in the two intervals:
E[#clicks] = N0 x p̂0 + N0 (1 − x) q0 + N1 Ec[ max{ p̂1(x, c), q1 } ]
– The first two terms are E[#clicks] at t = 0 (from item P and item Q);
the last term is E[#clicks] at t = 1, where we show the item with the
higher expected CTR: max{ p̂1(x, c), q1 }
• Rearranging:
E[#clicks] = N0 q0 + N1 q1 + N0 x (p̂0 − q0) + N1 Ec[ max{ p̂1(x, c) − q1, 0 } ]
– N0 q0 + N1 q1 is E[#clicks] if we always show item Q; the remaining
terms are Gain(x, q0, q1), the gain of exploring the uncertain item P
with fraction x of views
• Gain(x, q0, q1) = expected number of additional clicks if we explore
the uncertain item P with fraction x of views in interval 0, compared
to a scheme that only shows the certain item Q in both intervals
• Solution: argmax_x Gain(x, q0, q1)
25
Two Items, Two Intervals: Normal Approximation
• Approximate p̂1(x, c) by a normal distribution
– Reasonable approximation because of the central limit theorem
Gain(x, q0, q1) ≈ N0 x (p̂0 − q0)
+ N1 [ σ1(x) φ( (q1 − p̂1) / σ1(x) ) + (p̂1 − q1) ( 1 − Φ( (q1 − p̂1) / σ1(x) ) ) ]
where φ and Φ are the standard normal density and CDF
• Prior of p1 ~ Gamma(a, b):
p̂1 = Ec[ p̂1(x, c) ] = a / (a + b)
σ1²(x) = Var[ p̂1(x, c) ] = [ x N0 / (a + b + x N0) ] · [ a b / ( (a + b)² (1 + a + b) ) ]
• Proposition: Using the approximation, the Bayes optimal solution x can be
found in time O(log N0)
26
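As a sanity check of the formula above, here is a sketch that evaluates Gain(x, q0, q1) under the normal approximation and maximizes it over x. It follows the prior moments exactly as written on the slide; the plain grid search (the Proposition promises an O(log N0) method), the helper names, and the example numbers are my own illustrative choices, not the production implementation.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gain(x, q0, q1, p0_hat, a, b, N0, N1):
    """Gain(x, q0, q1) under the normal approximation, using the moments stated on the slide."""
    p1_hat = a / (a + b)                                   # E_c[p^1(x, c)]
    var1 = (x * N0 / (a + b + x * N0)) * (a * b / ((a + b) ** 2 * (1 + a + b)))
    sigma1 = math.sqrt(var1) if var1 > 0 else 1e-12        # guard against division by zero at x = 0
    z = (q1 - p1_hat) / sigma1
    explore = sigma1 * norm_pdf(z) + (p1_hat - q1) * (1.0 - norm_cdf(z))
    return N0 * x * (p0_hat - q0) + N1 * explore

def best_x(q0, q1, p0_hat, a, b, N0, N1, grid=2000):
    # Plain grid search over x in [0, 1]; the slide's Proposition gives an O(log N0) method instead.
    xs = [i / grid for i in range(grid + 1)]
    return max(xs, key=lambda x: gain(x, q0, q1, p0_hat, a, b, N0, N1))

# Hypothetical numbers: certain item Q with a 4% CTR in both intervals,
# uncertain item P with a slightly lower point estimate but a weak prior.
x = best_x(q0=0.04, q1=0.04, p0_hat=0.035, a=2.0, b=48.0, N0=10_000, N1=10_000)
print("fraction of interval-0 views to give the uncertain item P:", round(x, 3))
```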
Bayesian Solution: General Case
• From two items to K items
– Very difficult problem:
max over x ≥ 0, Σi xi = 1 of:  N0 Σi xi p̂i0 + N1 Ec[ maxi { p̂i1(xi, ci) } ]
Note: c = [c1, …, cK], where ci is a random variable representing the
number of clicks we may get on item i
The second term can be rewritten as an allocation z over items at t = 1:
max over z ≥ 0 of:  Ec[ Σi zi(c) p̂i1(xi, ci) ],  subject to Σi zi(c) = 1 for all possible c
– Apply Whittle's Lagrange relaxation (1988) to our problem setting
• Relax "Σi zi(c) = 1, for all c" to Ec[ Σi zi(c) ] = 1
• Apply Lagrange multipliers (q0 and q1) to enforce the constraints:
min over q0, q1 of:  N0 q0 + N1 q1 + Σi max over xi of Gain(xi, q0, q1)
– We essentially reduce the K-item case to K independent two-item
sub-problems (which we have already solved)
27
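Following the reduction on this slide, the K-item sampling plan can be sketched by minimizing the Lagrangian over (q0, q1) and reading off each item's maximizing xi. The block below reuses the same assumed prior model as the previous sketch; the crude grid minimization and the final normalization of the xi into a serving plan are my own simplifications, not the production solver.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gain(x, q0, q1, p0_hat, a, b, N0, N1):
    # Same normal-approximation Gain as in the previous sketch.
    p1_hat = a / (a + b)
    var1 = (x * N0 / (a + b + x * N0)) * (a * b / ((a + b) ** 2 * (1 + a + b)))
    sigma1 = math.sqrt(var1) if var1 > 0 else 1e-12
    z = (q1 - p1_hat) / sigma1
    return N0 * x * (p0_hat - q0) + N1 * (sigma1 * norm_pdf(z) + (p1_hat - q1) * (1.0 - norm_cdf(z)))

def best_gain(item, q0, q1, N0, N1, grid=100):
    # Returns (max over x of Gain_i, the maximizing x) for one item i = (p0_hat, a, b).
    p0_hat, a, b = item
    return max((gain(i / grid, q0, q1, p0_hat, a, b, N0, N1), i / grid) for i in range(grid + 1))

def lagrangian(q0, q1, items, N0, N1):
    # The relaxed objective on this slide: N0*q0 + N1*q1 + sum_i max_x Gain_i(x, q0, q1).
    return N0 * q0 + N1 * q1 + sum(best_gain(it, q0, q1, N0, N1)[0] for it in items)

def sampling_plan(items, N0, N1, grid=30, q_max=0.1):
    # Crude 2-D grid minimization over the multipliers (q0, q1); a real solver would
    # exploit the convexity of the Lagrangian in (q0, q1) instead.
    qs = [q_max * i / grid for i in range(grid + 1)]
    _, q0, q1 = min((lagrangian(a_, b_, items, N0, N1), a_, b_) for a_ in qs for b_ in qs)
    xs = [best_gain(it, q0, q1, N0, N1)[1] for it in items]
    s = sum(xs)
    return [x / s for x in xs] if s > 0 else xs   # normalize into a serving plan (a simplification)

# Hypothetical item pool: (estimated CTR at t = 0, prior a, prior b) for each item.
items = [(0.040, 40, 960), (0.035, 2, 48), (0.030, 1, 30), (0.050, 5, 95)]
print(sampling_plan(items, N0=10_000, N1=10_000))
```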
Bayesian Solution: General Case
• From two intervals to multiple intervals
– Approximate multiple intervals by two stages
• Non-stationary CTR
– Incorporate a time-series model (WWW’09) into our solution
• Coarse-grained personalization:
– Partition user-feature space into segments (e.g., decision tree)
– Explore/exploit most popular items for each segment
28
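The coarse-grained personalization above amounts to keying the whole explore/exploit state by user segment. Below is a minimal sketch of that structure; the segment rule, item names, and the placeholder selection function are illustrative assumptions, and any scheme from this talk (ε-greedy, UCB, the Bayesian sampling plan) could be plugged in as the selector.

```python
from collections import defaultdict
import random

def assign_segment(user):
    # Stand-in for a decision tree over user features (here: age bucket x gender).
    return (user["age"] // 20, user["gender"])

def highest_ctr(counts, rng):
    # Placeholder selection rule over {item: [clicks, views]}; unviewed items get a random score.
    return max(counts, key=lambda item: counts[item][0] / counts[item][1]
               if counts[item][1] else rng.random())

class SegmentedServer:
    """Coarse-grained personalization: one independent explore/exploit state per user segment."""
    def __init__(self, items, select=highest_ctr, seed=0):
        self.items, self.select, self.rng = list(items), select, random.Random(seed)
        # segment -> {item: [clicks, views]}
        self.counts = defaultdict(lambda: {i: [0, 0] for i in self.items})

    def serve(self, user):
        seg = assign_segment(user)
        item = self.select(self.counts[seg], self.rng)
        self.counts[seg][item][1] += 1
        return seg, item

    def record_click(self, seg, item):
        self.counts[seg][item][0] += 1

# Usage: serve a visit, then record the observed click for that segment/item pair.
server = SegmentedServer(["story1", "story2", "story3"])
seg, item = server.serve({"age": 34, "gender": "f"})
server.record_click(seg, item)   # called only when the user actually clicks
```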
Simulation Experiment: Different Traffic Volume
• Simulation with ground truth estimated based on real data (WWW’09)
• Setting: 16 live items per interval
• Scenarios: Web sites with different traffic volume (x-axis)
29
Simulation Experiment: Different Sizes of the Item Pool
• Simulation with ground truth estimated based on real data
• Setting: 1000 views per interval; average item lifetime = 20 intervals
• Scenarios: Different sizes of the item pool (x-axis)
30
Experimental Result: Controlled Bucket Test
Bayes2x2, B-UCB1, and ε-Greedy were implemented in production
and used to serve 3 random samples of real users on a Yahoo! site
31
Characteristics of Different Schemes
• Why the Bayesian solution has better performance
• Characterize each scheme by three dimensions:
– Exploitation regret: The regret of a scheme when it is showing
the item which it thinks is the best (may not actually be the best)
• 0 means the scheme always picks the actual best
• It quantifies the scheme's ability to find good items
– Exploration regret: The regret of a scheme when it is exploring
the items which it feels uncertain about
• It quantifies the price of exploration (lower → better)
– Fraction of exploitation (higher → better)
• Fraction of exploration = 1 – fraction of exploitation
32
Characteristics of Different Schemes
(Figure: each scheme plotted by exploitation regret vs. exploration regret,
and by exploitation regret vs. exploitation fraction; "good" means low
regret and a high exploitation fraction.)
• Exploitation regret: ability to find good items (lower → better)
• Exploration regret: price of exploration (lower → better)
• Fraction of exploitation (higher → better)
33
Summary
• Explore/exploit is an effective strategy to maximize CTR
in a content display system
• Ongoing research
– Explore/exploit for personalized recommendation, page layout
optimization, and in the presence of business constraints
34
Thank You!!
Questions
35