Monte Carlo simulation - University of South Carolina
Random Numbers and Simulation
Generating truly random numbers with a computer algorithm is not possible
• Programs have been developed to generate pseudo-random numbers
• Values are generated from deterministic algorithms
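The point about deterministic algorithms can be illustrated with a minimal sketch. It is written in Python rather than R (an assumption made for illustration only), and the helper name draw is hypothetical. Starting the generator from the same seed reproduces exactly the same "random" values:

```python
import random


def draw(seed, n=5):
    """Return n pseudo-random Uniform(0, 1) values from a generator started at `seed`."""
    rng = random.Random(seed)  # Mersenne Twister: a deterministic algorithm
    return [rng.random() for _ in range(n)]


first = draw(seed=2011)
second = draw(seed=2011)

print(first == second)            # same seed, same sequence
print(first == draw(seed=2012))   # a different seed gives a different sequence
```

Fixing the seed is also what makes a simulation study reproducible.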
© Fall 2011 John Grego and the University of South Carolina
Random Numbers
Good pseudo-random deviates can pass standard statistical tests for randomness
They appear to be independent and identically distributed
Random number generators for common
distributions are available in R
Special techniques (STAT 740) may be
needed as well
Monte Carlo Simulation
Some common uses of simulation
• Modeling stochastic behavior
• Calculating definite integrals
• Approximating the sampling distribution of a statistic (e.g., the maximum of a random sample)
Modeling Stochastic Behavior
Buffon’s needle
Random Walk
Observe X1, X2, …, where p = P(Xi = 1) = P(Xi = −1) = .5, and study S1, S2, …, where
$S_i = \sum_{j=1}^{i} X_j$
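A minimal sketch of one simulated path of this random walk follows. It uses Python rather than R (an assumption for illustration), and the function name random_walk and its parameters are hypothetical:

```python
import random
from itertools import accumulate


def random_walk(n_steps, p=0.5, seed=None):
    """Simulate S_1, ..., S_n, where each step X_j is +1 with probability p, else -1."""
    rng = random.Random(seed)
    steps = [1 if rng.random() < p else -1 for _ in range(n_steps)]
    return list(accumulate(steps))  # running sums: S_i = X_1 + ... + X_i


path = random_walk(20, seed=1)
print(path)
```

Each call with a different seed gives another realization of S1, S2, …, so properties of the process can be studied by repeating the call many times.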
Modeling Stochastic Behavior
This is also called Gambler’s ruin; each
Xi represents a $1 bet with a return of
$2 for a win and $0 for a loss.
Gambler’s Ruin
The properties of a fair game (p=.5) are
a lot more interesting than the properties
of an unfair game (p≠.5)
Some properties of this process are easy to anticipate (e.g., $E(S_i)$)
Gambler’s Ruin
Some properties are difficult to anticipate, and simulation can help in studying them.
• Expected number of returns to 0
• Expected length of a winning streak
• Probability of going broke given an initial
bank
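Two of the quantities listed above can be estimated with a short simulation. This is a sketch only, in Python rather than R. The stopping rule (the gambler quits on reaching a target amount) and all function and parameter names are assumptions added for illustration; the slides do not specify them.

```python
import random


def ruin_probability(bank, target, p=0.5, n_games=10_000, seed=0):
    """Estimate P(going broke) when starting with `bank` dollars and
    betting $1 at a time until reaching 0 or `target` (assumed stopping rule)."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_games):
        money = bank
        while 0 < money < target:
            money += 1 if rng.random() < p else -1
        ruined += (money == 0)
    return ruined / n_games


def mean_returns_to_zero(n_steps, p=0.5, n_walks=10_000, seed=0):
    """Estimate the expected number of times S_i = 0 during the first n_steps bets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_walks):
        s = 0
        for _ in range(n_steps):
            s += 1 if rng.random() < p else -1
            total += (s == 0)
    return total / n_walks


print(ruin_probability(bank=3, target=10))
print(mean_returns_to_zero(n_steps=20))
```

For a fair game with this stopping rule, the ruin probability has the known closed form 1 − bank/target, which gives a check on the simulation before it is applied to cases without a simple formula.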
Calculating Definite Integrals
In statistics, we often have to calculate
difficult definite integrals (posterior
distributions, expected values)
$I = \int_a^b h(x)\,dx$
(here, x could be multidimensional)
Calculating Definite Integrals
Example 1
$I_1 = \int_0^1 \frac{4}{1+x^2}\,dx$
Example 2
$I_2 = \int_0^1 \int_0^1 \left(4 - x_1^2 - 2x_2^2\right)\,dx_2\,dx_1$
Hit-or-Miss Monte Carlo
Example 1
$h(x) = \frac{4}{1+x^2}$
$\int_0^1 \frac{4}{1+x^2}\,dx = 4(\arctan(1) - \arctan(0)) = 4(\pi/4) = \pi$
Determine c such that c ≥ h(x) across the entire region of interest (here, c = 4)
Hit-or-Miss Monte Carlo
Generate n random uniform (Xi,Yi) pairs,
Xi’s from U[a,b] (here, U[0,1]) and Yi’s
from U[0,c] (here, U[0,4])
Count the number of times (call this m)
that Yi is less than h(Xi)
Then $I_1 \approx c(b-a)\,m/n$
• I.e., (height)(width)(proportion under curve)
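The hit-or-miss steps above can be sketched as follows for Example 1, whose exact value is π. The code is Python rather than R (an assumption for illustration), and the function name hit_or_miss is hypothetical:

```python
import math
import random


def hit_or_miss(h, a, b, c, n=100_000, seed=0):
    """Hit-or-miss estimate of the integral of h over [a, b], assuming 0 <= h(x) <= c."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.uniform(a, b)   # X_i from U[a, b]
        y = rng.uniform(0, c)   # Y_i from U[0, c]
        if y < h(x):            # the point falls under the curve
            hits += 1
    return c * (b - a) * hits / n   # (height)(width)(proportion under curve)


def h(x):
    return 4 / (1 + x * x)


estimate = hit_or_miss(h, a=0, b=1, c=4)
print(estimate, math.pi)
```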
Classical Monte Carlo Integration
$I = \int_a^b h(x)\,dx$
Take n random uniform values, U1, …, Un, over [a,b] and estimate I using
$\hat{I} = \frac{b-a}{n} \sum_{i=1}^{n} h(U_i)$
This method seems almost too straightforward, but it is actually more efficient (its estimates have smaller variance) than Hit-or-Miss Monte Carlo
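A minimal sketch of the classical estimator, applied to the integrand 4/(1 + x^2) on [0, 1] (exact value π). It is written in Python rather than R as an assumption for illustration, and the function name classical_mc is hypothetical:

```python
import math
import random


def classical_mc(h, a, b, n=100_000, seed=0):
    """Classical Monte Carlo estimate of the integral of h over [a, b]:
    (b - a) times the average of h at n Uniform[a, b] draws."""
    rng = random.Random(seed)
    total = sum(h(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


def h(x):
    return 4 / (1 + x * x)


estimate = classical_mc(h, a=0, b=1)
print(estimate, math.pi)
```

For this integrand the variance contributed by a single draw is 4 + 2π − π² ≈ 0.41 for the classical estimator, compared with 16(π/4)(1 − π/4) ≈ 2.70 for hit-or-miss with c = 4, so the classical estimate is noticeably more precise for the same n.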
Expected Value of a Function of a Random Variable
Suppose X is a random variable with density f. Find E[h(X)] for some function h, e.g.,
$E[X]$, $E[X^2]$, $E[\sin X]$
Expected Value of a Function of a Random Variable
$E[h(X)] = \int_{\mathcal{X}} h(x) f(x)\,dx$
For n random values X1, X2, …, Xn from the distribution of X (i.e., with density f),
$E[h(X)] \approx \frac{1}{n} \sum_{i=1}^{n} h(X_i)$
Examples
Example 3: If X is a random variable with a N(10,1) distribution, find $E(X^2)$
Example 4: If Y is a random variable with a Beta(5,1) distribution, find $E(-\ln Y)$
There are more advanced methods of
integration using simulation (Importance
Sampling)
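Examples 3 and 4 can be approximated by averaging h over simulated draws. The sketch below uses Python rather than R (an assumption for illustration); mc_expectation and its arguments are hypothetical names. Both examples have known answers that serve as a check: E(X^2) = Var(X) + [E(X)]^2 = 101, and E(−ln Y) = 1/5.

```python
import math
import random


def mc_expectation(h, draw, n=100_000, seed=0):
    """Estimate E[h(X)] by the average of h over n simulated values of X.
    `draw` takes a random.Random instance and returns one value of X."""
    rng = random.Random(seed)
    return sum(h(draw(rng)) for _ in range(n)) / n


# Example 3: X ~ N(10, 1); random.gauss takes the mean and the standard deviation
ex3 = mc_expectation(lambda x: x ** 2, lambda rng: rng.gauss(10, 1))

# Example 4: Y ~ Beta(5, 1)
ex4 = mc_expectation(lambda y: -math.log(y), lambda rng: rng.betavariate(5, 1))

print(ex3, ex4)
```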
Integration
integrate() performs numerical
integration for functions of a single
variable (not using simulation techniques)
adapt() in the adapt package performs
multivariate numerical integration
Approximating the Sampling Distribution of a Statistic
To perform inference (CI’s, hypothesis tests) based on sample statistics, we need to know the sampling distribution of the statistic, at least up to an approximation
Example: X1, X2, …, Xn ~ iid $N(\mu, \sigma^2)$.
$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$ has a t(n − 1) distribution
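The t(n − 1) result can be checked by simulation. The sketch below is in Python rather than R (an assumption for illustration), with hypothetical names. It simulates T for samples of size n = 5 and records how often |T| exceeds 2.776, the upper 2.5% point of the t(4) distribution; the proportion should be close to 0.05.

```python
import math
import random


def simulate_t(n, mu=0.0, sigma=1.0, m=20_000, seed=0):
    """Simulate m values of T = (xbar - mu) / (s / sqrt(n)) from N(mu, sigma^2) samples."""
    rng = random.Random(seed)
    t_values = []
    for _ in range(m):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(x) / n
        s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))  # sample std. dev.
        t_values.append((xbar - mu) / (s / math.sqrt(n)))
    return t_values


t_values = simulate_t(n=5)
crit = 2.776  # upper 2.5% point of t(4)
tail = sum(abs(t) > crit for t in t_values) / len(t_values)
print(tail)
```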
Approximating the Sampling Distribution of a Statistic
What if the data’s distribution is not
known?
• Large sample: Central Limit Theorem
• Small sample: Normal theory or
nonparametric procedures based on
permutation distributions
Approximating the Sampling Distribution of a Statistic
If the population distribution is known, we
can approximate the sampling distribution
with simulation.
• Repeatedly (m times) generate random
samples of size n from the population
distribution
• Calculate a statistic (say, S) each time
• The empirical (observed) distribution of the S values approximates the true sampling distribution of S
Example
X1, X2, X3, X4 ~ Expon(1)
What is the sampling distribution of:
$\bar{X}$ (the mean)
$\frac{\max(X) + \min(X)}{2}$ (the midrange)
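A sketch of how this example could be simulated: generate m samples of size 4 from Expon(1), compute the statistic each time, and summarize the resulting values. The code is Python rather than R (an assumption for illustration), the draws are assumed independent, and all names are hypothetical.

```python
import random


def simulate_statistic(stat, n=4, m=20_000, seed=0):
    """Return m simulated values of `stat`, each computed on n Expon(1) draws."""
    rng = random.Random(seed)
    return [stat([rng.expovariate(1.0) for _ in range(n)]) for _ in range(m)]


def mean(x):
    return sum(x) / len(x)


def midrange(x):
    return (max(x) + min(x)) / 2


def summarize(values):
    """Mean, median, and 95th percentile of the simulated values."""
    s = sorted(values)
    return {"mean": sum(s) / len(s), "median": s[len(s) // 2], "p95": s[int(0.95 * len(s))]}


means = simulate_statistic(mean)
midranges = simulate_statistic(midrange)
print(summarize(means))
print(summarize(midranges))
```

A histogram of means or midranges would show the shape of each approximate sampling distribution. As a check, the simulated averages should be near the exact expectations, 1 for the mean and 7/6 for the midrange.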