
Conjugate Distributions & EVPI/EVSI with a Normal Distribution
1
A Joint Distribution with Two
Random Variables
A general joint probability mass function:
$p(x_i, y_j) = P(X = x_i,\ Y = y_j)$

The marginal probability mass function (discrete case):
$p_X(x) = \sum_i p(x, y_i)$

The marginal probability density function (continuous case):
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$
2
An Example for A Discrete Case
Joint pmf p(x, y):

          Y = 0   Y = 1   Y = 2   Y = 3
  X = 0    1/8     2/8     1/8      0
  X = 1     0      1/8     2/8     1/8

$p_Y(0) = P(Y = 0) = P(X = 0, Y = 0) + P(X = 1, Y = 0) = 1/8 + 0 = 1/8$
3
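As a quick check, here is a minimal plain-Python sketch (the dictionary layout and function name are mine, not from the slides) that stores the joint pmf from the table and recovers the same marginal:

```python
# Encode the joint pmf from the table and recover the marginal p_Y(0) = 1/8.
from fractions import Fraction as F

# joint pmf p(x, y): rows are X = 0, 1; columns are Y = 0..3
p = {
    (0, 0): F(1, 8), (0, 1): F(2, 8), (0, 2): F(1, 8), (0, 3): F(0),
    (1, 0): F(0),    (1, 1): F(1, 8), (1, 2): F(2, 8), (1, 3): F(1, 8),
}

def marginal_Y(y):
    """p_Y(y) = sum over x of p(x, y)."""
    return sum(prob for (x, yy), prob in p.items() if yy == y)

print(marginal_Y(0))        # 1/8, matching the slide
print(sum(p.values()))      # 1, a sanity check that the table is a valid pmf
```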
An Example for A Continuous Case
$f(x, y) = \frac{12}{7}\,(x^2 + xy), \qquad 0 \le x \le 1,\ 0 \le y \le 1$

Sol.]
$f_X(x) = \frac{12}{7}\int_0^1 (x^2 + xy)\,dy = \frac{12}{7}\left(x^2 + \frac{x}{2}\right)$
4
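For the continuous example, a short symbolic check is possible; the sketch below assumes sympy is available and simply re-does the integration from the solution above:

```python
# Integrating the joint density over y should give f_X(x) = (12/7)(x^2 + x/2),
# and f_X should integrate to 1 over [0, 1].
import sympy as sp

x, y = sp.symbols('x y', nonnegative=True)
f_xy = sp.Rational(12, 7) * (x**2 + x*y)

f_X = sp.integrate(f_xy, (y, 0, 1))
print(sp.simplify(f_X))               # 12*x**2/7 + 6*x/7 (possibly in factored form)
print(sp.integrate(f_X, (x, 0, 1)))   # 1, so f_X is a proper density
```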
The Expectation of a Function g(X, Y) of Two Random Variables
$E[g(X, Y)] = \sum_y \sum_x g(x, y)\,p(x, y)$ in the discrete case

$E[g(X, Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)\,f(x, y)\,dx\,dy$ in the continuous case

Example] $g(X, Y) = X + Y$ in the continuous case.

Sol.]
$E[X + Y] = \int\!\!\int (x + y)\,f(x, y)\,dx\,dy$
$\quad = \int\!\!\int x\,f(x, y)\,dx\,dy + \int\!\!\int y\,f(x, y)\,dx\,dy$
$\quad = \int x \left[\int f(x, y)\,dy\right] dx + \int y \left[\int f(x, y)\,dx\right] dy$
$\quad = \int x\,f_X(x)\,dx + \int y\,f_Y(y)\,dy = E(X) + E(Y)$
5
Conditional Probability
The Discrete Case:
$p_{X|Y}(x \mid y) = P\{X = x \mid Y = y\} = \frac{P\{X = x, Y = y\}}{P\{Y = y\}} = \frac{p(x, y)}{p_Y(y)}$

$E[X \mid Y = y] = \sum_x x\,P\{X = x \mid Y = y\} = \sum_x x\,p_{X|Y}(x \mid y)$
6
Example
Given the joint pmf p(1,1) = 0.5, p(1,2) = 0.1, p(2,1) = 0.1, p(2,2) = 0.3,
find the conditional pmf of X given Y = 1.

Sol.]
$p_Y(1) = \sum_x p(x, 1) = p(1, 1) + p(2, 1) = 0.6$

$p_{X|Y}(1 \mid 1) = \frac{P\{X = 1, Y = 1\}}{P\{Y = 1\}} = \frac{0.5}{0.6} = \frac{5}{6}$

$p_{X|Y}(2 \mid 1) = \frac{p(2, 1)}{p_Y(1)} = \frac{0.1}{0.6} = \frac{1}{6}$
7
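A small plain-Python sketch of this example (variable and function names are mine) that computes the marginal, the conditional pmf, and the conditional expectation:

```python
# Conditional pmf p_{X|Y}(x | y) = p(x, y) / p_Y(y) for the example above.
p = {(1, 1): 0.5, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.3}

def p_Y(y):
    return sum(prob for (x, yy), prob in p.items() if yy == y)

def p_X_given_Y(x, y):
    return p[(x, y)] / p_Y(y)

print(p_Y(1))               # 0.6
print(p_X_given_Y(1, 1))    # 0.833... = 5/6
print(p_X_given_Y(2, 1))    # 0.166... = 1/6
# conditional expectation E[X | Y = 1]
print(sum(x * p_X_given_Y(x, 1) for x in (1, 2)))   # 7/6 ≈ 1.167
```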
Example
If X and Y are independent Poisson random variables with respective means
$\lambda_1$ and $\lambda_2$, calculate the conditional expected value of X
given that $X + Y = n$.

Sol.]
Let us first calculate the conditional probability mass function of X given
that $X + Y = n$:

$P\{X = k \mid X + Y = n\} = \frac{P\{X = k, X + Y = n\}}{P\{X + Y = n\}} = \frac{P\{X = k, Y = n - k\}}{P\{X + Y = n\}} = \frac{P\{X = k\}\,P\{Y = n - k\}}{P\{X + Y = n\}}$

$\quad = \frac{e^{-\lambda_1}\lambda_1^{k}}{k!}\cdot\frac{e^{-\lambda_2}\lambda_2^{n-k}}{(n-k)!}\left[\frac{e^{-(\lambda_1+\lambda_2)}(\lambda_1+\lambda_2)^{n}}{n!}\right]^{-1}$

$\quad = \frac{n!}{k!\,(n-k)!}\cdot\frac{\lambda_1^{k}\lambda_2^{n-k}}{(\lambda_1+\lambda_2)^{n}} = \binom{n}{k}\left(\frac{\lambda_1}{\lambda_1+\lambda_2}\right)^{k}\left(\frac{\lambda_2}{\lambda_1+\lambda_2}\right)^{n-k}$
8
It may be concluded from the equation that the conditional distribution of X,
given that $X + Y = n$, is the binomial distribution with parameters $n$ and
$\lambda_1/(\lambda_1 + \lambda_2)$. Hence

$E[X \mid X + Y = n] = n\,\frac{\lambda_1}{\lambda_1 + \lambda_2}$
9
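The result can also be checked numerically. The sketch below, assuming scipy is available and using arbitrary test values for λ1, λ2 and n, compares the conditional pmf built from Poisson pmfs with the binomial pmf:

```python
# Check that P{X = k | X + Y = n} matches Binomial(n, lam1/(lam1+lam2)).
from scipy.stats import poisson, binom

lam1, lam2, n = 2.0, 3.0, 10   # arbitrary test values

for k in range(n + 1):
    cond = (poisson.pmf(k, lam1) * poisson.pmf(n - k, lam2)
            / poisson.pmf(n, lam1 + lam2))
    bino = binom.pmf(k, n, lam1 / (lam1 + lam2))
    assert abs(cond - bino) < 1e-12

# conditional mean, compared with n*lam1/(lam1+lam2) = 4
cond_mean = sum(k * binom.pmf(k, n, lam1 / (lam1 + lam2)) for k in range(n + 1))
print(cond_mean)   # ≈ 4.0
```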
Example (Sums of Independent Poisson Random Variables): Let X and Y be
independent Poisson random variables with respective means $\lambda_1$ and
$\lambda_2$. Calculate the distribution of $X + Y$.

Sol.] Since the event $\{X + Y = n\}$ may be written as the union of the
disjoint events $\{X = k, Y = n - k\}$, $0 \le k \le n$, we have

$P\{X + Y = n\} = \sum_{k=0}^{n} P\{X = k, Y = n - k\} = \sum_{k=0}^{n} P\{X = k\}\,P\{Y = n - k\}$

$\quad = \sum_{k=0}^{n} \frac{e^{-\lambda_1}\lambda_1^{k}}{k!}\cdot\frac{e^{-\lambda_2}\lambda_2^{n-k}}{(n-k)!} = \frac{e^{-(\lambda_1+\lambda_2)}}{n!}\sum_{k=0}^{n}\frac{n!}{k!\,(n-k)!}\lambda_1^{k}\lambda_2^{n-k} = \frac{e^{-(\lambda_1+\lambda_2)}(\lambda_1+\lambda_2)^{n}}{n!}$

Conclusion: $X + Y$ has a Poisson distribution with mean $\lambda_1 + \lambda_2$.
10
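The same kind of numerical check works for the sum: convolving the two Poisson pmfs should reproduce a single Poisson pmf with mean λ1 + λ2 (scipy assumed, test values mine):

```python
# The convolution of two Poisson pmfs equals a Poisson pmf with mean lam1+lam2.
from scipy.stats import poisson

lam1, lam2 = 2.0, 3.0
for n in range(15):
    conv = sum(poisson.pmf(k, lam1) * poisson.pmf(n - k, lam2)
               for k in range(n + 1))
    assert abs(conv - poisson.pmf(n, lam1 + lam2)) < 1e-12
print("X + Y behaves as Poisson with mean", lam1 + lam2)
```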
Moment Generating Functions
The moment generating function $\phi(t)$ of the random variable X is defined
for all values of t by

$\phi(t) = E[e^{tX}] = \sum_x e^{tx} p(x)$ if X is discrete
$\phi(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$ if X is continuous

$\phi'(t) = \frac{d}{dt}E[e^{tX}] = E\!\left[\frac{d}{dt}e^{tX}\right] = E[X e^{tX}]$, so $\phi'(0) = E[X]$

$\phi''(t) = \frac{d}{dt}\phi'(t) = \frac{d}{dt}E[X e^{tX}] = E[X^2 e^{tX}]$, so $\phi''(0) = E[X^2]$
11
Example: The binomial distribution with parameters n and p

$\phi(t) = E[e^{tX}] = \sum_{k=0}^{n} e^{tk}\binom{n}{k}p^{k}(1-p)^{n-k} = \sum_{k=0}^{n}\binom{n}{k}(pe^{t})^{k}(1-p)^{n-k} = \left[pe^{t} + (1-p)\right]^{n}$

$\phi'(t) = \frac{d}{dt}E[e^{tX}] = n\left[pe^{t} + (1-p)\right]^{n-1} pe^{t}$

$\phi'(0) = E[X] = np$

$V(X) = E(X^2) - [E(X)]^2$, where $E(X^2) = \phi''(0)$

$\phi''(t) = n(n-1)\left[pe^{t} + (1-p)\right]^{n-2}(pe^{t})^{2} + n\left[pe^{t} + (1-p)\right]^{n-1} pe^{t}$

$\phi''(0) = E[X^2] = n(n-1)p^2 + np$

$V(X) = E[X^2] - [E(X)]^2 = n(n-1)p^2 + np - (np)^2 = np(1-p)$
12
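The slide's differentiation can be reproduced symbolically; the sketch below assumes sympy and recovers E[X] = np and V(X) = np(1 − p) from the mgf:

```python
# Differentiate the binomial mgf phi(t) = (p*e^t + 1 - p)^n at t = 0.
import sympy as sp

t, p, n = sp.symbols('t p n', positive=True)
phi = (p * sp.exp(t) + 1 - p) ** n

EX  = sp.diff(phi, t).subs(t, 0)       # n*p
EX2 = sp.diff(phi, t, 2).subs(t, 0)    # n*(n-1)*p**2 + n*p
var = sp.simplify(EX2 - EX**2)         # equals n*p*(1 - p)

print(sp.simplify(EX), sp.factor(var))  # sympy may print the variance as -n*p*(p - 1)
```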
Bayesian Inference
Definitions:

Prior distribution: $g(\theta)$
Conditional distribution (likelihood): $f(X \mid \theta)$
Joint distribution: $f(X, \theta) = f(X \mid \theta)\,g(\theta)$
Marginal distribution:
$f(X) = \sum_{\theta} f(X \mid \theta)\,g(\theta)$ for a discrete case
$f(X) = \int f(X \mid \theta)\,g(\theta)\,d\theta$ for a continuous case

Posterior distribution (continuous case):
$h(\theta \mid X) = \frac{f(X, \theta)}{f(X)} = \frac{f(X \mid \theta)\,g(\theta)}{\int f(X \mid \theta)\,g(\theta)\,d\theta}$
13
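To make the definitions concrete, here is a small numerical illustration with a discrete parameter θ; the two candidate values of θ, the binomial likelihood, and the data are made up purely for illustration (scipy assumed):

```python
# Posterior h(theta | x) = f(x | theta) g(theta) / sum_theta f(x | theta) g(theta).
from scipy.stats import binom

prior = {0.3: 0.5, 0.6: 0.5}          # g(theta) over two hypothetical values
x, n_trials = 7, 10                    # observed data: 7 successes in 10 trials

joint = {th: binom.pmf(x, n_trials, th) * g for th, g in prior.items()}
marginal = sum(joint.values())         # f(x)
posterior = {th: j / marginal for th, j in joint.items()}
print(posterior)                       # most of the mass moves to theta = 0.6
```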
Example with a Normal Prob. Dist.
Prior distribution of $\mu$: $N(\mu_0, \sigma_0^2)$
Conditional distribution: $X \sim N(\mu, \sigma^2)$
Posterior distribution of $\mu$:

$h(\mu \mid X) = \frac{f(X \mid \mu)\,g(\mu)}{\int f(X \mid \mu)\,g(\mu)\,d\mu}$

$f(X \mid \mu)\,g(\mu) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left[-\frac{1}{2\sigma^2}(x-\mu)^2\right]\cdot\frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\!\left[-\frac{1}{2\sigma_0^2}(\mu-\mu_0)^2\right]$

$\int f(X \mid \mu)\,g(\mu)\,d\mu = \int \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left[-\frac{1}{2\sigma^2}(x-\mu)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\!\left[-\frac{1}{2\sigma_0^2}(\mu-\mu_0)^2\right]d\mu$
As we experienced with this example, it can be extremely difficult to obtain
the closed form for a posterior probability distribution, and in many cases no
closed form can be obtained at all. Therefore, we work with families of
probability distributions within which the prior and the posterior have a
special relationship. Such a family is called a "conjugate family of
distributions".
14
A General Conjugate Family of
Distributions
Prior Dist.              Sampling Process      Posterior Dist.
Uniform or Beta          Binomial, Pascal      Beta
Exponential or Gamma     Poisson, Gamma        Gamma
Normal                   Normal                Normal
15
Normal Prior Probabilities and the
Value of Information
Suppose that a company has an opportunity to buy a machine for $17,200. The
machine, if successful, will save labor hours in a production process that now
uses a large amount of hand labor. The physical life of this machine is one
year. Assume the incremental cost of a labor hour to the company is $8; thus,
if the machine saves more than 2,150 labor hours ($17,200/$8 = 2,150), the
company will benefit from owning the machine.
Let us assume that the production engineer feels that the mean
number of hours saved will be 2,300 and that there is a 50-50
chance that the actual hours saved could be less than 2,100 or
more than 2,500 hours for the year.
16
We first need to calculate the variance of the normal probability distribution
to determine its shape from the information given. Let us solve the problem by
looking at the graph below.
The quartiles of a normal distribution lie about 0.67 standard deviations from
the mean, so $0.67\sigma = 200$ hours and $\sigma = 200/0.67 \approx 300$ hours.

Fig. 1 Labor Hours Saved (normal prior with mean 2,300 and quartiles at 2,100
and 2,500; each tail beyond a quartile carries probability 0.25)
17
Profit ==-17,200 + 8X
(X: the actual number of labor hours saved)
E()=-17,200 + 8E(X)=-17,200 + 8*2,300 = $1,200
Decision: Based on the prior information, it is better for a
company to purchase a machine.
However, the company wants to gather more information
before taking an action paying some amount of cost.
Hence, it is interested in the EVPI.
Recall that the EVPI is the expected opportunity loss.
So, it is requited to set up the opportunity loss function for the
problem as in the next slide.
18
$OLF = \$8\,(2{,}150 - X)$ if $X \le 2{,}150$
$OLF = 0$ if $X > 2{,}150$

Fig. 2 Labor Hours Saved (opportunity loss in dollars: a line with slope
-$8 per hour, falling from $17,200 at X = 0 to zero at the break-even point
X = 2,150)
19
Combining Figures 1 and 2 yields Figure 3. Figure 1 gives us the probability
information and Figure 2 the information on the amount of loss, so Figure 3
shows us the expected opportunity loss.

Fig. 3 Labor Hours Saved (OLF overlaid on the prior distribution with
σ = 300, break-even point 2,150 and prior mean 2,300)
20
Observation: The EVPI depends on the following three factors.
i)
The standard deviation. This is a measure of how “uncertain”
the estimator is about the prior mean.
ii) The distance of the prior mean from the break-even point.
This is important because it helps determine how likely the
decision maker is to change the decision because of new
evidence.
iii) The absolute value of the slope of the OLF. The slope of the
line is a measure of how rapidly the loss increases as the
hours saved decrease below the break-even point.
21
The Derivation of the EVPI Formula
Partial expectation (for a normal random variable X with mean E(X), standard
deviation $\sigma$, and $Z = (X_b - E(X))/\sigma$):

$\int_{-\infty}^{X_b} X f(X)\,dX = E(X)\,P_N(Z) - \sigma\,P_N'(Z)$

where $P_N$ is the standard normal cdf and $P_N'$ the standard normal density.

$EVPI = E(OLF) = \int_{-\infty}^{X_b} C\,(X_b - X)\,f(X)\,dX$

$\quad = C\left[X_b\int_{-\infty}^{X_b} f(X)\,dX - \int_{-\infty}^{X_b} X f(X)\,dX\right]$

$\quad = C\left[X_b P_N(Z) - E(X) P_N(Z) + \sigma P_N'(Z)\right]$

$\quad = C\,\sigma\left[\frac{X_b - E(X)}{\sigma}\,P_N(Z) + P_N'(Z)\right]$

$\quad = C\,\sigma\,N(D), \qquad \text{where } D = \frac{|X_b - E(X)|}{\sigma}$

and $N(D) = P_N'(D) - D\,[1 - P_N(D)]$ is the unit normal loss integral
(here $X_b < E(X)$, so $Z = -D$ and $Z P_N(Z) + P_N'(Z) = P_N'(D) - D[1 - P_N(D)]$).
22
Example: C = $8, σ = 300, E(X) = 2,300, X_b = 2,150
Sol.]: D = |2,150 - 2,300|/300 = 0.5, N(0.5) = 0.1978
EVPI = ($8)(300)(0.1978) ≈ $475
23
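The EVPI calculation is easy to script. The sketch below (scipy assumed; function names are mine) implements EVPI = CσN(D) with the unit normal loss integral and reproduces the $475 figure:

```python
# EVPI = C * sigma * N(D), N(D) = pdf(D) - D * (1 - cdf(D)).
from scipy.stats import norm

def unit_normal_loss(D):
    return norm.pdf(D) - D * (1.0 - norm.cdf(D))

def evpi(C, sigma, prior_mean, break_even):
    D = abs(break_even - prior_mean) / sigma
    return C * sigma * unit_normal_loss(D)

# the machine example: C = $8/hour, sigma = 300 hours, E(X) = 2,300, Xb = 2,150
print(evpi(8, 300, 2300, 2150))   # ≈ 475, matching the slide
```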
Revision of Normal Probability by Sampling
Suppose that a firm is considering manufacturing a new product. It will market
this product through a chain of 5,000 stores. The firm is uncertain about the
level of sales for the product and has not been able to decide whether or not
to market it. Based on its experience, the management of the firm expresses
its judgment about the mean sales of the new product per store as a subjective
probability distribution: the distribution of the average sales per store is
normal, with a mean of 400 units and a standard deviation of 30.

With no other evidence, the firm would make its decision using the above
information. However, it may be able to experiment and obtain additional
information.
24
Revising the Prior Distribution
A prior distribution of the population mean: $N(\mu_0, \sigma_0)$
A sampling distribution (of the sample mean): $N(\bar{X}, \sigma_{\bar{X}})$

Let "I" represent the amount of information contained in a distribution.
Then we have

$I_0 = \frac{1}{\sigma_0^2}, \qquad I_{\bar{X}} = \frac{1}{\sigma_{\bar{X}}^2} = \frac{n}{\sigma_p^2} \approx \frac{n}{s^2}, \qquad I_1 = \frac{1}{\sigma_1^2}$

Revised mean:

$\mu_1 = \frac{I_0}{I_0 + I_{\bar{X}}}\,\mu_0 + \frac{I_{\bar{X}}}{I_0 + I_{\bar{X}}}\,\bar{X}$
25
Example 1] The prior normal distribution: $N(400, 30)$.
The sampling distribution: $N(420, 200)$, i.e., sample mean $\bar{X} = 420$
with population standard deviation $\sigma_p = 200$.
The sample size: n = 100 stores.
What is the posterior mean of sales per store?

Sol.]
$I_0 = \frac{1}{\sigma_0^2} = \frac{1}{30^2}, \qquad I_{\bar{X}} = \frac{n}{\sigma_p^2} = \frac{100}{200^2} = \frac{1}{400}$

$\mu_1 = \frac{1/30^2}{1/30^2 + 1/400}\,(400) + \frac{1/400}{1/30^2 + 1/400}\,(420) = 413.846$
26
Revision of the Standard Deviation
$I_0 = \frac{1}{\sigma_0^2}, \qquad I_{\bar{X}} = \frac{1}{\sigma_{\bar{X}}^2} = \frac{n}{\sigma_p^2} \approx \frac{n}{s^2}, \qquad I_1 = \frac{1}{\sigma_1^2}$

Revised standard deviation: $I_1 = I_0 + I_{\bar{X}}$, i.e.

$\frac{1}{\sigma_1^2} = \frac{1}{\sigma_0^2} + \frac{1}{\sigma_{\bar{X}}^2}, \qquad \sigma_1^2 = \frac{\sigma_0^2\,\sigma_{\bar{X}}^2}{\sigma_0^2 + \sigma_{\bar{X}}^2}$

Example 2]
$\sigma_1^2 = \frac{(30^2)(20^2)}{30^2 + 20^2} = 276.9, \qquad \sigma_1 = 16.64$
27
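A plain-Python sketch of the revision formulas (function name is mine), checked against Examples 1 and 2:

```python
# Normal-normal revision: prior N(mu0, sigma0), sample mean xbar with
# sigma_p = population SD and sample size n.
def revise_normal(mu0, sigma0, xbar, sigma_p, n):
    sigma_xbar_sq = sigma_p**2 / n              # variance of the sample mean
    I0, Ix = 1.0 / sigma0**2, 1.0 / sigma_xbar_sq
    mu1 = (I0 * mu0 + Ix * xbar) / (I0 + Ix)    # revised mean
    sigma1 = (1.0 / (I0 + Ix)) ** 0.5           # revised standard deviation
    return mu1, sigma1

print(revise_normal(400, 30, 420, 200, 100))    # ≈ (413.85, 16.64)
```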
A Comparison of The Prior and
Posterior Distribution
1=16.6
0=30
0= 400 1=413.8
28
The Posterior Normal Distribution and Decision
Making
Example 3] The cost of machinery and promotion for the new product is
$520,000. The variable profit per unit sold is 25 cents.

Profit function: $\pi = -520{,}000 + 0.25 \times 5{,}000 \times \mu_1 = -520{,}000 + 1{,}250\,\mu_1$

Break-even mean: $\mu_b = 520{,}000 / 1{,}250 = 416$ units

Decision: Do not market the new product, because $\mu_b = 416 > \mu_1 = 413.8$.

EVPI (using the posterior distribution):
$D = |413.8 - 416| / 16.6 = 0.13, \qquad N(0.13) = 0.3373$
$EVPI = (1{,}250)(16.6)(0.3373) \approx \$7{,}000$
29
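A quick check of the Example 3 arithmetic, assuming scipy:

```python
# EVPI with the posterior values mu1 = 413.8, sigma1 = 16.6, C = $1,250/unit.
from scipy.stats import norm

def unit_normal_loss(D):
    return norm.pdf(D) - D * (1.0 - norm.cdf(D))

C, sigma1, mu1, mu_b = 1250, 16.6, 413.8, 416
D = abs(mu_b - mu1) / sigma1                  # ≈ 0.13
print(C * sigma1 * unit_normal_loss(D))       # ≈ 7,000
```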
The Decision to Sample
Let us now consider whether sampling is appropriate and how
large a sample should be taken. The analysis in this section
differs from that of the previous section in that it is prior to
rather than posterior to the sample.
Example 4] D = |400 - 416| / 30 = 0.53, N(0.53) = 0.1887
EVPI = (1,250)(30)(0.1887) = $7,076
30
The Expected Value of Sample Information
The EVSI results from the fact that a sample reduces the posterior expected
loss. That is, the sample, by supplying additional information, reduces the
probability of a wrong decision. The amount of reduction in uncertainty is

$\sigma^{*2} = \sigma_0^2 - \sigma_1^2, \qquad \text{where } \sigma_1^2 = \frac{\sigma_0^2\,\sigma_{\bar{X}}^2}{\sigma_0^2 + \sigma_{\bar{X}}^2}$

equivalently,

$\sigma^{*} = \frac{\sigma_0^2}{\sqrt{\sigma_0^2 + \sigma_{\bar{X}}^2}}$

and $EVSI = C\,\sigma^{*}\,N(D^{*})$ with $D^{*} = |\mu_0 - \mu_b| / \sigma^{*}$.

Example 5]
$\sigma^{*} = \frac{30^2}{\sqrt{30^2 + 20^2}} = 24.9$

$D^{*} = \frac{|400 - 416|}{24.9} = 0.64, \qquad N(0.64) = 0.1580$

$EVSI = (1{,}250)(24.9)(0.1580) = \$4{,}918$
31
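A sketch of the Example 5 computation (scipy assumed); σ* is computed as σ0²/√(σ0² + σ_X̄²):

```python
# EVSI = C * sigma_star * N(D_star) for Example 5.
from scipy.stats import norm

def unit_normal_loss(D):
    return norm.pdf(D) - D * (1.0 - norm.cdf(D))

C, mu0, mu_b = 1250, 400, 416
sigma0, sigma_xbar = 30, 20                       # sigma_xbar = 200 / sqrt(100)
sigma_star = sigma0**2 / (sigma0**2 + sigma_xbar**2) ** 0.5
D_star = abs(mu0 - mu_b) / sigma_star             # ≈ 0.64
print(sigma_star, C * sigma_star * unit_normal_loss(D_star))
# ≈ 24.96 and ≈ $4,920, matching the slide's $4,918 up to rounding
```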
Example 6] Assume that $\mu_b = 402$ instead of 416.

$D = |400 - 402| / 30 = 0.07, \qquad EVPI = (1{,}250)(30)\,N(0.07) = \$13{,}684$
$D^{*} = |400 - 402| / 24.9 = 0.08, \qquad EVSI = (1{,}250)(24.9)\,N(0.08) = \$11{,}211$

With the break-even point at 402, just above the prior mean of 400, both the
EVPI and the EVSI are much larger than in the original case ($\mu_b = 416$).
32
Example 7] Assume that $\sigma_0 = 150$ instead of 30.

$\sigma_{\bar{X}} = \frac{\sigma_p}{\sqrt{n}} = \frac{200}{10} = 20$

$\sigma^{*} = \frac{\sigma_0^2}{\sqrt{\sigma_0^2 + \sigma_{\bar{X}}^2}} = \frac{150^2}{\sqrt{150^2 + 20^2}} = 148.7$

$D^{*} = \frac{|400 - 402|}{148.7} = 0.0134, \qquad N(0.0134) = 0.3923$

$EVSI = (1{,}250)(148.7)(0.3923) = \$72{,}919$
33
The Meanings of EVPI and EVSI
EVPI:
•It is the expected cost of uncertainty.
•The expected cost of uncertainty is determined by the probability that an
investment decision based on existing information will be wrong, and by the
consequences if the wrong decision is made.
•It is also the maximum a decision-maker should be willing to pay
for additional evidence to inform this decision in the future.
•If the EVPI exceeds the expected costs of additional research then
it is potentially cost-effective to demand more information to
substantiate a claim.
34
The Optimum Sample Size
ENG=EVSI – Cost of Sampling
As the size of the sample increases, its value (EVSI) increases, but at a
decreasing rate. However, the EVSI can never be greater than the EVPI. The
cost of sampling increases as the size of the sample increases. As a result,
the ENG rises at first and then declines.

Figure: as n grows, the EVSI rises toward the EVPI ceiling while the cost of
sampling rises steadily; the ENG (EVSI minus cost) peaks at the optimum
sample size.
35
Example ] Toy Manufacturer
The initial investment cost: $500,000
The incremental profit: $1/toy, The break-even mean of sales: 500,000
units/year
# of stores: 50,000 stores; break-even mean per store: $\mu_b$ = 500,000/50,000 = 10 units/store

The prior distribution of mean sales per store is normal with $\mu_0 = 12$
units. There is a 50-50 chance that the true mean lies between 8 and 16, so
$0.67\sigma_0 = 4$ and $\sigma_0 = 4/0.67 \approx 6$ units (quartiles at 8 and
16, with probability 0.25 in each tail).
36
Figure: OLF for the toy example ($\sigma_0 = 6$, break-even $\mu_b = 10$,
prior mean $\mu_0 = 12$).

C = 50,000 stores × $1/toy = $50,000 per unit of mean sales per store
D = |12 - 10| / 6 = 0.333, N(0.33) = 0.2555
EVPI = ($50,000)(6)(0.2555) = $76,650
37
Discussion: Let us consider whether our action will change after we have
obtained the sample information. If we take a sample and observe its mean, we
shall then revise the mean of our prior subjective distribution. If the
revised mean is greater than $\mu_b = 10$, we shall still choose to market the
toy, and the value of the sample information would turn out to be zero.
Otherwise, we choose not to market the toy, and the sample would have value.
Hence, the value of the sample depends to a great extent on the value of the
revised mean. The revised mean ($\mu_1$) is unknown before taking a sample.
It is assumed that the best estimate of the revised mean is the mean of the
prior distribution, since $\mu_0$ is an unbiased estimate of $\mu_1$.
38
Example] Assume that $\sigma_p = 10$ and $n = 25$.

$\sigma_{\bar{X}} = \frac{\sigma_p}{\sqrt{n}} = \frac{10}{5} = 2$

$\sigma^{*} = \frac{\sigma_0^2}{\sqrt{\sigma_0^2 + \sigma_{\bar{X}}^2}} = \frac{36}{\sqrt{36 + 4}} = 5.69$

$D^{*} = \frac{|12 - 10|}{5.69} = 0.351 \approx 0.35, \qquad EVSI = (50{,}000)(5.69)\,N(0.35) = \$70{,}556$

This estimate would be made before a sample is taken.

ENG = $70,556 - $2,500 ($100/store × 25 stores sampled) = $68,056
39
Iterating the calculation on the previous slide for several sample sizes gives
the following table. That is, we can determine the optimum sample size by
trial and error.
  n    Var. of sample mean σ_X̄²   σ*     D*     N(D*)   EVSI      Sample Cost   ENG
  20   5.00                        5.62   0.36   0.25    $69,126   $2,000        $67,126
  25   4.00                        5.69   0.35   0.25    $70,556   $2,500        $68,056
  35   2.85                        5.75   0.35   0.25    $72,125   $3,500        $68,625
  40   2.50                        5.77   0.35   0.25    $72,739   $4,000        $68,739
  45   2.22                        5.80   0.34   0.25    $73,099   $4,500        $68,599
40
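The ENG column is largest at about n = 40, so the optimum sample size is roughly 40 stores. The sketch below (scipy assumed; names are mine) reproduces the trial-and-error search; small differences from the table come from the table's rounded values of σ*, D* and N(D*):

```python
# ENG(n) = EVSI(n) - sampling cost for the toy example:
# sigma0 = 6, mu0 = 12, mu_b = 10, C = $50,000, sigma_p^2 = 100, $100 per store.
from scipy.stats import norm

def unit_normal_loss(D):
    return norm.pdf(D) - D * (1.0 - norm.cdf(D))

C, mu0, mu_b, sigma0 = 50_000, 12, 10, 6
sigma_p_sq, cost_per_store = 100, 100

def eng(n):
    sigma_xbar_sq = sigma_p_sq / n
    sigma_star = sigma0**2 / (sigma0**2 + sigma_xbar_sq) ** 0.5
    D_star = abs(mu0 - mu_b) / sigma_star
    evsi = C * sigma_star * unit_normal_loss(D_star)
    return evsi - cost_per_store * n

for n in (20, 25, 35, 40, 45):
    print(n, round(eng(n)))      # ENG peaks around n = 40, as in the table
```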