MSIM 852
Fall 2007
Simulation Selection Problems:
Overview of an Economic Analysis
Based On Paper By:
Stephen E. Chick
Noah Gans
Presented By:
Michael C. Jones
Problem Statement:
A manager must select a project to
implement from a group of potential projects.
The manager has a model of each project,
and some idea of the statistics of the
projects, but does not know the expected
value of any project…
The manager must select a project, or choose to simulate a project, or choose to abandon all of them and do nothing…
Problem Statement:
Running a simulation costs money, and
takes time. Recognize that taking time
reduces the Net Present Value (NPV) of any
project ultimately selected.
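As a rough illustration of why delay matters (my notation, not the paper's; assume a constant per-period discount rate r), a payoff with expected value E[X] received t periods from now is worth

$$\mathrm{NPV} = \frac{E[X]}{(1+r)^{t}}$$

today, so every period spent simulating shrinks the value of whichever project is finally implemented.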
Net Present Value:
How much is information worth?
Bayes’ Rule
"The essence of the Bayesian approach is to provide a mathematical
rule explaining how you should change your existing beliefs in the light
of new evidence. In other words, it allows scientists to combine new
data with their existing knowledge or expertise. The canonical example
is to imagine that a precocious newborn observes his first sunset, and
wonders whether the sun will rise again or not. He assigns equal prior
probabilities to both possible outcomes, and represents this by placing
one white and one black marble into a bag. The following day, when the
sun rises, the child places another white marble in the bag. The
probability that a marble plucked randomly from the bag will be white
(ie, the child's degree of belief in future sunrises) has thus gone from a
half to two-thirds. After sunrise the next day, the child adds another
white marble, and the probability (and thus the degree of belief) goes
from two-thirds to three-quarters. And so on. Gradually, the initial belief
that the sun is just as likely as not to rise each morning is modified to
become a near-certainty that the sun will always rise." *
*”In Praise of Bayes,” The Economist, 30 Sept 2000
Bayes’ Rule
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
An example will help…
Bayes’ Rule
• Suppose you have two bags of marbles to select
from. One bag (a1) has 75% white marbles and
25% black marbles, the other bag (a2) has 25%
white marbles and 75% black marbles.
• You pick a bag, but do not know which one.
• From the bag, you select 5 marbles (with replacement): 4 white and 1 black. Call this experiment B.
• Which bag do you have? What is P(a1|B)? What is P(a2|B)?
Bayes’ Rule
$$P(a_1 \mid B) = \frac{P(B \mid a_1)\,P(a_1)}{P(B)}$$
but:
$$P(B) = P(B \mid a_1)\,P(a_1) + P(B \mid a_2)\,P(a_2)$$
and:
$$P(a_1) = P(a_2) = 0.5$$
so:
$$P(a_1 \mid B) = \frac{P(B \mid a_1)}{P(B \mid a_1) + P(B \mid a_2)}$$
Bayes’ Rule
$$P(a_1 \mid B) = \frac{P(B \mid a_1)}{P(B \mid a_1) + P(B \mid a_2)}$$
but:
$$P(B \mid a_1) = \binom{5}{1}\left(\tfrac{3}{4}\right)^{4}\left(\tfrac{1}{4}\right) = 0.3955$$
and:
$$P(B \mid a_2) = \binom{5}{1}\left(\tfrac{1}{4}\right)^{4}\left(\tfrac{3}{4}\right) = 0.0146$$
so:
$$P(a_1 \mid B) = \frac{0.3955}{0.3955 + 0.0146} \approx 0.9643$$
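The same numbers can be checked with a few lines of code; the sketch below (function and argument names are mine, not from the slides) recomputes the posterior using binomial likelihoods and equal priors.

```python
from math import comb

def posterior_bag1(n_white=4, n_black=1, p_white_a1=0.75, p_white_a2=0.25, prior_a1=0.5):
    """Bayes' rule for the marble example: posterior probability of bag a1
    after drawing n_white white and n_black black marbles with replacement."""
    n = n_white + n_black
    # Binomial likelihoods P(B | a1) and P(B | a2)
    like_a1 = comb(n, n_white) * p_white_a1**n_white * (1 - p_white_a1)**n_black  # ~0.3955
    like_a2 = comb(n, n_white) * p_white_a2**n_white * (1 - p_white_a2)**n_black  # ~0.0146
    prior_a2 = 1 - prior_a1
    return like_a1 * prior_a1 / (like_a1 * prior_a1 + like_a2 * prior_a2)

print(round(posterior_bag1(), 4))  # -> 0.9643
```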
Multi-armed Bandit Problem
• A gambler has a slot
machine with several
arms (or several slot
machines to select from)
but does not know the
expected payout. How
can he maximize his
probability of playing the
machine with the highest
payout?
Multi-armed Bandit Problem
• Classic Approach
– Epsilon First
– Epsilon Greedy (sketched after this list)
– Epsilon Decreasing
• Indifference Zones…
• Gittins Index…
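For reference, here is a minimal epsilon-greedy sketch (my own illustration of the classic approach named above, not anything from the paper): with probability epsilon pull a random arm to explore, otherwise pull the arm with the best observed average payout.

```python
import random

def epsilon_greedy(pull, n_arms, n_plays, epsilon=0.1):
    """Minimal epsilon-greedy bandit loop. `pull(arm)` returns one sampled payout."""
    counts = [0] * n_arms
    means = [0.0] * n_arms
    total = 0.0
    for _ in range(n_plays):
        if 0 in counts or random.random() < epsilon:
            arm = random.randrange(n_arms)                    # explore
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit the best arm so far
        payout = pull(arm)
        counts[arm] += 1
        means[arm] += (payout - means[arm]) / counts[arm]     # running average payout
        total += payout
    return means, total

# Example: three arms paying 1 with (unknown to the gambler) probabilities 0.3, 0.5, 0.7.
true_p = [0.3, 0.5, 0.7]
means, total = epsilon_greedy(lambda a: float(random.random() < true_p[a]), 3, 10_000)
```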
Multi-armed Bandit Problem
• Gittins Index…
– Gittins found a dynamic allocation
index based on Bayes’ rule.
– Does not follow Bayes directly.
(Consider what could happen after
the first pull if he followed Bayes
exactly…)
– Weights the results by the “regret”
cost of incorrect selection,
discounted to the horizon.
Gittins’ Index:
• Gittins’ Index is the value of the game:
– For several machines, n = 1, 2, …, N,
– machine n is in state i. If machine n is selected, a reward $r_i^n$ is earned. Machine n then transitions to state j with some known probability $P_{ij}^n$.
$$Z_i^n = r_i^n + \beta \sum_j P_{ij}^n \, v_j^n(z)$$
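The slides do not show how this index is actually evaluated. The sketch below is one standard route I am assuming here (Whittle's retirement-value calibration), not the paper's procedure: it treats beta as the discount factor and v as the continuation value, and binary-searches the retirement reward z at which stopping and continuing are equally attractive for a single machine with reward vector r and transition matrix P. All names are illustrative.

```python
import numpy as np

def gittins_index(P, r, beta, state, tol=1e-6):
    """Gittins index of `state` for one machine (rewards r, transitions P,
    discount beta), found by calibrating a retirement option: binary-search
    the lump-sum z at which retiring and continuing are equally good."""
    P, r = np.asarray(P, float), np.asarray(r, float)
    lo, hi = r.min() / (1 - beta), r.max() / (1 - beta)   # indifference value lies in this range
    while hi - lo > tol:
        z = 0.5 * (lo + hi)
        v = np.full(len(r), z)                            # value function with retirement reward z
        for _ in range(5000):                             # value iteration: v = max(z, r + beta * P v)
            v_new = np.maximum(z, r + beta * (P @ v))
            if np.max(np.abs(v_new - v)) < 1e-9:
                break
            v = v_new
        if v[state] > z + tol:                            # continuing is strictly better -> index above z
            lo = z
        else:
            hi = z
    return (1 - beta) * 0.5 * (lo + hi)                   # expressed as an equivalent per-period reward

# Tiny example: state 0 pays nothing now but may lead to the rewarding state 1.
print(gittins_index(P=[[0.5, 0.5], [0.1, 0.9]], r=[0.0, 1.0], beta=0.9, state=0))
```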
Returning to the Manager:
• Gittins’ Index is difficult (and computationally expensive) to calculate.
• The Index may be approximated by the
Optimal Expected Discounted Reward:
Returning to the Manager:
• The Index may be approximated by the Optimal Expected Discounted Reward. Select the most profitable of:
– Do nothing.
• E[·] = 0
– Simulate a project.
• E[·] = $-c_i + E[X(T \geq t+1)]$
– Implement a project now.
• E[·] = $E[X(t)]$
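Read literally, the slide is a one-step comparison of three expected values. The sketch below is only illustrative (the argument names are mine); in the paper the "simulate" branch would itself come from the Bayesian update and the discounting of the delayed decision.

```python
def best_action(exp_npv_now, exp_npv_if_wait, sim_cost):
    """Pick the most profitable of the slide's three options.
    exp_npv_now    : E[X(t)], expected NPV of implementing the best project now
    exp_npv_if_wait: E[X(T >= t+1)], expected NPV of deciding after one more simulation
    sim_cost       : c_i, the cost of running that simulation
    """
    values = {
        "do nothing": 0.0,
        "simulate": -sim_cost + exp_npv_if_wait,
        "implement now": exp_npv_now,
    }
    return max(values, key=values.get), values

# Simulation is worth running only when the extra information outweighs its cost and the delay.
print(best_action(exp_npv_now=120.0, exp_npv_if_wait=135.0, sim_cost=10.0))
```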
Procedure
Summary
• Strengths:
– Provides a tool for accounting for the cost (both opportunity and direct) of simulation.
– May be automated so the business manager
can apply the tool.
• Weaknesses:
– Limiting assumptions (Known variance but
unknown mean?)
– Mathematically abstract.
Future
• Not applicable directly
• The authors state that the paper presents
more questions than it answers.
• May provide stimulation for future
research.
• “If I have seen a little further it is by
standing on the shoulders of Giants."
Newton, 1676