Module 1:
Statistical Issues in
Micro simulation
Paul Sousa
Overview
Numerical Solution
Simulation
Random number generation
Transformation
Techniques: Gibbs sampling, Metropolis-Hastings algorithm
Variance reduction techniques
Conclusion
[Diagram: numerical solution methods divide into the Monte Carlo technique and simulation; simulation divides into deterministic simulation and stochastic simulation (Monte Carlo simulation).]
Introduction
Model solution: analytical vs. numerical.
Numerical solution: substitutes numbers for the independent variables and parameters; it needs an iteration technique.
Numerical techniques: the Monte Carlo method and simulation.
Simulation: deterministic simulation and stochastic simulation.
Deterministic simulation: does not necessarily imply the use of random numbers.
Stochastic simulation: uses random numbers; also referred to as Monte Carlo simulation.
Linear Congruential Generators
A sequence of integers I_1, I_2, ..., each between 0 and m-1 (where m is a large number), is generated by the recurrence relation:
I_{j+1} = mod (a I_j + c, m)
where a and c are positive integers known as the multiplier and the increment, and m is the modulus.
To calculate mod (X, m), divide X by m, then take the fractional part of the quotient and multiply it by m.
e.g. mod (12, 7) = 5
12/7 = 1.7143
0.7143 x 7 = 5
Finally, dividing I_j by m gives a uniform variable between 0 and 1.
Linear congruential methods are very fast, but are not completely free of sequential correlation on successive calls.
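The recurrence above can be sketched in a few lines of Python. This is a minimal illustration, not a production generator: the function name `lcg` and the default constants (a = 1664525, c = 1013904223, m = 2^32, a commonly quoted textbook choice) are assumptions and do not come from the slides.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield uniform variates in [0, 1) from a linear congruential generator.

    The default a, c and m are illustrative assumptions, not values
    taken from the slides.
    """
    state = seed
    while True:
        state = (a * state + c) % m  # I_{j+1} = mod(a * I_j + c, m)
        yield state / m              # divide by m to scale into [0, 1)


# The slide's worked example: mod(12, 7) is the remainder of 12 / 7.
remainder = 12 % 7

gen = lcg(seed=12345)
draws = [next(gen) for _ in range(5)]
print(remainder, draws)
```

Running the generator twice with the same seed reproduces the same sequence, which is why such draws are called pseudo-random.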
Transformation to other Distributions
Consider a random variable X with density function f (x) and corresponding cumulative distribution function F (x). If the inverse of the cumulative distribution function for X can be calculated, then draws of X can be obtained from draws of a standard uniform U.
By definition, F (x) = k means that the probability of obtaining a draw equal to or below x is k, where k is between 0 and 1. A draw u from the standard uniform provides a number between 0 and 1. We can set F (x) = u,
thus x = F^(-1) (u)
This procedure works only for univariate distributions.
Univariate Density Example
Example: Extreme value distribution
Density function: f (x) = exp (-x) * exp (-exp (-x))
CDF: F (x) = exp (-exp (-x))
A draw from this density is obtained as x = -ln (-ln u), where u is a draw from the standard uniform.
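The extreme value example can be checked numerically. The sketch below assumes Python's standard `random` module as the source of uniforms; the helper names `draw_extreme_value` and `extreme_value_cdf` are hypothetical and introduced only for illustration.

```python
import math
import random


def extreme_value_cdf(x):
    """CDF from the slide: F(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def draw_extreme_value(rng):
    """Inverse-CDF draw: x = -ln(-ln u) with u standard uniform."""
    u = rng.random()
    while u == 0.0:  # random() can return exactly 0, where ln is undefined
        u = rng.random()
    return -math.log(-math.log(u))


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [draw_extreme_value(rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)

# Round trip: applying F to the transformed value recovers u.
u_check = extreme_value_cdf(-math.log(-math.log(0.3)))
print(sample_mean, u_check)
```

The sample mean should land near 0.577 (the Euler-Mascheroni constant, which is the mean of this distribution), and the round trip should return approximately 0.3.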
Draws from more complicated densities:
Accept-Reject Method
Importance Sampling
Gibbs Sampling
Metropolis-Hastings Algorithm
Accept-Reject Method
A more general way of drawing, which also applies to multivariate distributions.
Suppose we want to draw from a multivariate density g (x) within the range a ≤ x ≤ b, i.e. to draw from:
f (x) = (1/k) g (x) if a ≤ x ≤ b
f (x) = 0 otherwise
where k is a normalizing constant (the probability under g that a ≤ x ≤ b).
We can obtain draws from f by simply drawing from g and retaining (“accepting”) the draws that are within the relevant range and discarding (“rejecting”) the draws that are outside the range.
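The accept-reject rule described above can be sketched as follows. For brevity the example is univariate (the slide speaks of multivariate densities) and assumes g is a standard normal truncated to [-0.5, 2.0]; the function name `draw_truncated`, the bounds, and the `max_tries` guard are illustrative assumptions.

```python
import random


def draw_truncated(draw_g, a, b, rng, max_tries=1_000_000):
    """Draw from g until a draw lands in [a, b], then return it.

    draw_g is any function that takes the random generator and returns
    one draw from the untruncated density g. max_tries is a safety
    guard added for this sketch.
    """
    for _ in range(max_tries):
        x = draw_g(rng)
        if a <= x <= b:  # accept draws inside the range
            return x
        # otherwise reject the draw and try again
    raise RuntimeError("no draw accepted; the range may have very low probability")


def standard_normal(rng):
    return rng.gauss(0.0, 1.0)


rng = random.Random(1)
draws = [draw_truncated(standard_normal, -0.5, 2.0, rng) for _ in range(10_000)]
print(min(draws), max(draws))
```

The number of draws from g needed to obtain 10,000 accepted draws is itself random, which is one reason the method is considered crude.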
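Accept-Reject Method
Advantage: it can be applied whenever it is possible to draw from the untruncated density.
Disadvantage: it is a crude method. The number of accepted draws is not known in advance, and most draws are rejected when the range a ≤ x ≤ b has low probability under g.
However, it is a useful “last option”.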
Importance Sampling
Suppose x has a density f (x) that cannot be easily drawn from by other procedures. Suppose further that there is another density g (x) that can be easily drawn from.
Draws from f (x) can be obtained as follows:
1. Take a draw from g (x) and label it x1.
2. Weight the draw by f (x1) / g (x1).
3. Repeat this process many times.
The set of weighted draws is equivalent to a set of draws from f.
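The three steps above can be illustrated with a small sketch. It assumes, purely for illustration, that the target f is a standard normal and the easy-to-draw density g is a normal with standard deviation 2; the function names are hypothetical. The weighted draws are then used to estimate the second moment of x under f, which equals 1 for a standard normal.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def importance_sample(n_draws, rng):
    """Return draws from g together with their weights f(x) / g(x)."""
    draws, weights = [], []
    for _ in range(n_draws):
        x = rng.gauss(0.0, 2.0)                                  # step 1: draw from g
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)    # step 2: weight
        draws.append(x)
        weights.append(w)
    return draws, weights                                        # step 3: many repeats


rng = random.Random(2)
draws, weights = importance_sample(100_000, rng)

mean_weight = sum(weights) / len(weights)  # should be close to 1
second_moment = sum(w * x * x for x, w in zip(draws, weights)) / len(draws)
print(mean_weight, second_moment)
```

A g with heavier tails than f, as assumed here, keeps the weights bounded; a g with thinner tails than f can produce a few very large weights and an unstable estimate.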
Gibbs Sampling
For multivariate distributions, it is sometimes difficult to draw directly from the joint density and yet easy to draw from the conditional density of each element given the values of the other elements. Gibbs sampling can be used in these situations.
Consider two random variables x1 and x2.
The joint density is f (x1, x2), and the conditional densities are f (x1|x2) and f (x2|x1).
Gibbs sampling proceeds by drawing iteratively from the conditional densities: drawing x1 conditional on a value of x2, drawing x2 conditional on this draw of x1, drawing a new x1 conditional on the new value of x2, and so on.
This process converges to draws from the joint density.
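As a concrete case of the alternating scheme above, the sketch below assumes a bivariate standard normal with correlation rho, for which each conditional density is itself normal: x1 given x2 has mean rho * x2 and variance 1 - rho^2, and symmetrically for x2. The target distribution, the function names and the burn-in length are illustrative assumptions, not part of the slides.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_draws, burn_in, rng):
    """Alternate draws from f(x1|x2) and f(x2|x1) for a bivariate normal."""
    cond_sd = math.sqrt(1.0 - rho * rho)
    x1, x2 = 0.0, 0.0  # arbitrary starting values
    pairs = []
    for t in range(burn_in + n_draws):
        x1 = rng.gauss(rho * x2, cond_sd)  # x1 conditional on current x2
        x2 = rng.gauss(rho * x1, cond_sd)  # x2 conditional on the new x1
        if t >= burn_in:                   # discard early, unconverged draws
            pairs.append((x1, x2))
    return pairs


def sample_correlation(pairs):
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    v1 = sum((a - m1) ** 2 for a, _ in pairs) / n
    v2 = sum((b - m2) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(v1 * v2)


rng = random.Random(4)
pairs = gibbs_bivariate_normal(rho=0.6, n_draws=50_000, burn_in=1_000, rng=rng)
print(sample_correlation(pairs))
```

The sample correlation of the retained pairs should be close to the assumed rho of 0.6, which is a simple check that the chain has reached the joint density.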
Metropolis-Hastings Algorithm
1. Start with a value of the vector x, labeled x0.
2. Choose a trial value of x1 as x1t = x0 + η, where η is drawn from a distribution g (η) that has zero mean. Usually a normal distribution is specified for g (η).
3. Calculate the density at the trial value x1t, and compare it with the density at the original value x0, i.e. compare f (x1t) with f (x0). If f (x1t) > f (x0), then accept x1t, label it x1, and move to step 4. If f (x1t) ≤ f (x0), then accept x1t with probability f (x1t) / f (x0), and reject it with probability 1 - f (x1t) / f (x0). To determine whether to accept or reject x1t in this case, draw a standard uniform μ. If μ ≤ f (x1t) / f (x0), then keep x1t. Otherwise, reject x1t. If x1t is accepted, then label it x1. If x1t is rejected, then use x0 as x1.
Metropolis-Hastings Algorithm
4. Choose a trial value of x2 as x2t = x1 + η, where η is a new draw from g (η).
5. Apply the rule in step 3 to either accept x2t as x2 or reject x2t and use x1 as x2.
6. Continue this process for many iterations. The sequence xt becomes equivalent to draws from f (x) for sufficiently large t.
This is a general but computationally intensive algorithm.
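Steps 1 to 6 can be sketched as a short loop. The example is a scalar version (the slides describe a vector x) and assumes an unnormalized standard normal target, f (x) = exp (-x^2 / 2), with normal steps η; the function names, step size and burn-in length are illustrative assumptions.

```python
import math
import random


def target_density(x):
    """Unnormalized standard normal density (assumed target for this sketch)."""
    return math.exp(-0.5 * x * x)


def metropolis(f, x0, n_iter, step_sd, rng):
    x = x0                                       # step 1: starting value
    chain = []
    for _ in range(n_iter):
        trial = x + rng.gauss(0.0, step_sd)      # steps 2 and 4: x + eta, eta zero-mean normal
        ratio = f(trial) / f(x)
        if ratio > 1.0:                          # step 3: higher density, always accept
            x = trial
        elif rng.random() <= ratio:              # otherwise accept with probability ratio
            x = trial
        # if rejected, the previous value is repeated in the chain
        chain.append(x)
    return chain                                 # step 6: many iterations


rng = random.Random(3)
chain = metropolis(target_density, x0=0.0, n_iter=50_000, step_sd=1.0, rng=rng)
kept = chain[5_000:]                             # drop early iterations
chain_mean = sum(kept) / len(kept)
chain_var = sum((x - chain_mean) ** 2 for x in kept) / len(kept)
print(chain_mean, chain_var)
```

Only the ratio f (trial) / f (x) enters the rule, so the normalizing constant of f is never needed; this is a large part of the method's generality.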
Variance Reduction
The use of independent random draws in simulation is appealing because it is conceptually straightforward and the statistical properties of the resulting simulator are easy to derive.
However, there are other ways to take draws that can provide greater accuracy for a given number of draws.
In taking a sequence of draws from the density f ( ), two issues are at stake: coverage and covariance.
Coverage: suppose the objective is to approximate an integral over the entire domain, F (x) = ∫ f (x) dx.
A more accurate approximation is obtained by evaluating f (x) throughout the entire domain of f, i.e. with better coverage.
Variance Reduction
Covariance
With independent draws, the covariance over draws is zero. The variance of a simulator based on R independent draws is therefore the variance based on one draw divided by R.
If the draws are negatively correlated instead of independent, then the variance of the simulator is lower.
The issue of covariance is related to coverage
By inducing a negative correlation between draws, better coverage is usually assured.
E.g. with R = 2, if the two draws are taken independently, then both could end up being at the low side of the distribution. If negative correlation is induced, then the second draw will tend to be high if the first draw is low, which provides better coverage.
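For the R = 2 case, the effect of the correlation on the variance can be written out. The short derivation below assumes a simulator equal to the average of two draws x_1 and x_2 with common variance \sigma^2 and correlation \rho; these symbols are introduced here for illustration and do not appear in the slides.

```latex
\begin{aligned}
\operatorname{Var}\!\left(\frac{x_1 + x_2}{2}\right)
  &= \frac{1}{4}\left[\operatorname{Var}(x_1) + \operatorname{Var}(x_2)
     + 2\operatorname{Cov}(x_1, x_2)\right] \\
  &= \frac{1}{4}\left[\sigma^2 + \sigma^2 + 2\rho\sigma^2\right]
   = \frac{\sigma^2}{2}\,(1 + \rho)
\end{aligned}
```

With \rho = 0 this equals \sigma^2 / 2, the one-draw variance divided by R = 2; any \rho < 0 gives a smaller value.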
Variance Reduction Techniques
Antithetics
Antithetic draws are obtained by creating various types of mirror images of a random draw.
For a symmetric density that is centered on zero, the simplest antithetic variate is created by reversing the sign of all elements of a draw, e.g.
x_{2k} = -x_{2k-1}, k = 1, ..., n/2
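The sign-reversal rule can be sketched as below, assuming a standard normal as the symmetric, zero-centered density. The function name `antithetic_normal_draws` and the choice of exp (x) as the quantity being averaged are illustrative assumptions.

```python
import math
import random


def antithetic_normal_draws(n, rng):
    """Return n draws in pairs (x, -x), so that x_{2k} = -x_{2k-1}."""
    if n % 2 != 0:
        raise ValueError("n must be even so that every draw has a mirror image")
    draws = []
    for _ in range(n // 2):
        x = rng.gauss(0.0, 1.0)
        draws.extend([x, -x])  # the draw and its antithetic partner
    return draws


rng = random.Random(6)
draws = antithetic_normal_draws(10_000, rng)

# Use the paired draws to simulate E[exp(x)], which is exp(0.5) for a standard normal.
estimate = sum(math.exp(x) for x in draws) / len(draws)
print(estimate, math.exp(0.5))
```

Each pair is perfectly negatively correlated, so a low first draw is always matched by a high second draw, in line with the coverage argument for negatively correlated draws.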
Variance Reduction Techniques
Systematic sampling
Systematic sampling creates a grid of points over the support of the density and randomly shifts the entire grid.
Consider draws from a uniform distribution between 0 and 1. The unit interval is divided into four segments, and draws are taken in a way that assures one draw in each segment with equal distance between the draws. Take a draw from a uniform between 0 and 0.25 as x1; then x2 = 0.25 + x1; x3 = 0.5 + x1; x4 = 0.75 + x1.
Systematic sampling implies a tradeoff between the number of random draws and the coverage: in the example above, one random draw determines all four points.
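The four-segment example generalizes directly to any number of segments. A minimal sketch, assuming Python's standard `random` module; the function name `systematic_uniform_draws` is hypothetical.

```python
import random


def systematic_uniform_draws(n_segments, rng):
    """One random shift in the first segment, then an evenly spaced grid."""
    width = 1.0 / n_segments
    shift = rng.uniform(0.0, width)  # the only random draw
    return [k * width + shift for k in range(n_segments)]


rng = random.Random(5)
draws = systematic_uniform_draws(4, rng)  # x1, x2 = 0.25 + x1, x3 = 0.5 + x1, x4 = 0.75 + x1
print(draws)
```

Every segment receives exactly one point, and consecutive points are exactly one segment width apart.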