
Lecture 2
Basics of probability in statistical
simulation and stochastic programming
Leonidas Sakalauskas
Institute of Mathematics and Informatics
Vilnius, Lithuania
EURO Working Group on Continuous Optimization
Content
- Random variables and random functions
- Law of Large Numbers
- Central Limit Theorem
- Computer simulation of random numbers
- Estimation of multivariate integrals by the Monte-Carlo method

Simple remark
- Probability theory provides the library of mathematical probabilistic models.
- Statistics gives us the manual for choosing a probabilistic model coherent with the collected data.
- Statistical simulation (the Monte-Carlo method) gives us the knowledge to simulate a random environment by computer.
Random variable
A random variable X is described by:
- its set of support SUPP(X);
- its probability measure.
The probability measure is described by the distribution function:
$F(x) = \mathrm{Prob}(X < x)$
Probabilistic measure
A probabilistic measure has three components:
- continuous;
- discrete (integer);
- singular.
Continuous r.v.
A continuous r.v. is described by a probability density function $f(x)$. Thus:
$F(x) = \int_{-\infty}^{x} f(y)\,dy$
Continuous variable
If the probability measure is absolutely continuous, the expected value of a random function is:
$E f(X) = \int f(x) \cdot p(x)\,dx$
Discrete variable
A discrete r.v. is described by mass probabilities:
$x_1, x_2, \ldots, x_n$
$p_1, p_2, \ldots, p_n$
Discrete variable
If the probability measure is discrete, the expected value of a random function is a sum or series:
$E f(X) = \sum_{i=1}^{n} f(x_i) \cdot p_i$
Singular variable
The probabilistic measure of a singular r.v. is concentrated on a set having zero Borel measure (say, the Cantor set).
Law of Large Numbers (Chebyshev, Kolmogorov)
$\lim_{N \to \infty} \frac{\sum_{i=1}^{N} z_i}{N} = \bar{z},$
here $z_1, z_2, \ldots, z_N$ are independent copies of the r.v. $\xi$, and $\bar{z} = E\xi$.
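A minimal sketch of the LLN in action (the die-rolling example is mine, not from the slides): the average of independent copies of a random variable approaches its expectation as the sample grows.

```python
import random

def sample_average(N, seed=0):
    """Average of N independent fair-die rolls; by the LLN
    it approaches E(xi) = 3.5 as N grows."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(N)) / N
```

Comparing `sample_average(100)` with `sample_average(100_000)` shows the fluctuation around 3.5 shrinking as N increases.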
What did we learn?
The integral
$\int f(x, z) \cdot p(z)\,dz$
is approximated by the sampling average
$\frac{1}{N} \sum_{j=1}^{N} f(x, z_j),$
if the sample size N is large; here $z_1, z_2, \ldots, z_N$ is a sample of copies of the r.v. $\xi$, distributed with the density $p(z)$.
Central limit theorem (Gauss, Lindeberg, ...)
$\lim_{N \to \infty} P\left( \frac{\bar{x}_N - \mu}{\sigma / \sqrt{N}} < x \right) = \Phi(x),$
here
$\bar{x}_N = \frac{\sum_{i=1}^{N} x_i}{N}, \quad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{y^2}{2}}\,dy,$
$\mu = EX, \quad \sigma^2 = D^2 X = E(X - \mu)^2$
Berry-Esseen theorem
$\sup_x \left| F_N(x) - \Phi(x) \right| \le 0.41 \cdot \frac{E|X - \mu|^3}{\sigma^3 \sqrt{N}},$
where
$F_N(x) = \mathrm{Prob}(\bar{x}_N < x)$
What did we learn?
According to the LLN:
$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i, \quad \sigma^2 \approx \frac{\sum_{i=1}^{N} (x_i - \bar{x}_N)^2}{N}, \quad E|X - EX|^3 \approx \frac{\sum_{i=1}^{N} |x_i - \bar{x}_N|^3}{N}.$
Thus, apply the CLT to evaluate the statistical error of the approximation and its validity.
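These plug-in estimates can be computed directly; a minimal sketch (the function name is mine), returning the sample average together with its CLT-based 95% error bound $1.96 \cdot \sigma / \sqrt{N}$:

```python
import math

def mean_with_error(sample, z=1.96):
    """Sample average together with its CLT-based error bound
    z * sigma / sqrt(N), using the plug-in estimate of sigma^2."""
    N = len(sample)
    xbar = sum(sample) / N
    sigma2 = sum((x - xbar) ** 2 for x in sample) / N
    return xbar, z * math.sqrt(sigma2 / N)
```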
Example
Suppose some event occurred n times in N independent experiments. Then the confidence interval of the probability of the event is:
$\left( \hat{p} - \frac{1.96 \cdot \sqrt{\hat{p} \cdot (1 - \hat{p})}}{\sqrt{N}},\; \hat{p} + \frac{1.96 \cdot \sqrt{\hat{p} \cdot (1 - \hat{p})}}{\sqrt{N}} \right),$
here $\hat{p} = \frac{n}{N}$ (1.96 is the 0.975 quantile of the normal distribution; the significance level of the confidence interval is 5%).
The Berry-Esseen condition is valid if:
$N \cdot \hat{p} \cdot (1 - \hat{p}) \ge 6$
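A minimal sketch of this interval (the function name is mine):

```python
import math

def proportion_ci(n, N, z=1.96):
    """Normal-approximation confidence interval for an event
    probability p = n/N; the approximation is trusted when the
    Berry-Esseen condition N*p*(1-p) >= 6 holds."""
    p = n / N
    half = z * math.sqrt(p * (1 - p) / N)
    return p - half, p + half
```

For example, `proportion_ci(30, 100)` gives an interval centered at 0.3, and the Berry-Esseen condition holds since 100 · 0.3 · 0.7 = 21 ≥ 6.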
Statistical integrating
$I = \int_a^b f(x)\,dx = \;?$
Main idea: exploit the play of a large number of random events.
Statistical integration
$E f(X) = \int f(x) \cdot p(x)\,dx \approx \frac{\sum_{i=1}^{N} f(x_i)}{N}, \quad x_i \sim p(\cdot)$
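A minimal sketch of this sampling-average estimator (function names are mine, not from the slides):

```python
import random

def mc_expectation(f, sampler, N, seed=42):
    """Approximate E f(X) = integral of f(x)*p(x) dx by the
    sample average (1/N) * sum f(x_i), with x_i drawn from p."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(N)) / N

# Example: E[X^2] for X ~ Uniform(0,1) equals 1/3
estimate = mc_expectation(lambda x: x * x, lambda rng: rng.random(), 100_000)
```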
Statistical simulation and Monte-Carlo method
$F(x) = \int f(x, z) \cdot p(z)\,dz \to \min_x$
is approximated by
$\frac{\sum_{i=1}^{N} f(x, z_i)}{N} \to \min_x, \quad z_i \sim p(\cdot)$
(Shapiro (1985), etc.)
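A toy illustration of this sample-average approximation idea (the quadratic objective is my own choice, not from the slides): minimizing $\frac{1}{N}\sum_i (x - z_i)^2$ over x is solved in closed form by the sample mean, which converges to the minimizer of $E(x - Z)^2$, namely $EZ$.

```python
import random

def saa_argmin(N, seed=3):
    """Sample-average approximation of min_x E(x - Z)^2 for
    Z ~ Uniform(0,1): the sample objective (1/N)*sum (x - z_i)^2
    is minimized at the sample mean, which approaches E(Z) = 0.5."""
    rng = random.Random(seed)
    zs = [rng.random() for _ in range(N)]
    return sum(zs) / N
```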
Simulation of random variables
There are a lot of techniques and methods to simulate r.v.'s.
Let the r.v. U be uniformly distributed in the interval (0,1].
Then the random variable
$\xi = F^{-1}(U)$
is distributed with the cumulative distribution function $F(\cdot)$.
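A minimal sketch of this inverse-transform method for the exponential distribution, where $F^{-1}$ is available in closed form (the function name is mine):

```python
import math
import random

def exponential_sample(lam, N, seed=1):
    """Inverse-transform sampling: for U ~ Uniform(0,1], the value
    xi = F^{-1}(U) = -ln(1 - U) / lam has the Exp(lam) distribution."""
    rng = random.Random(seed)
    # rng.random() is in [0,1), so 1 - U lies in (0,1] and log is safe
    return [-math.log(1.0 - rng.random()) / lam for _ in range(N)]
```

The sample mean of such draws approaches $1/\lambda$, the expectation of the exponential distribution.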
$F(a) = \int_0^{\infty} x \cdot \cos(x \cdot \sin(a \cdot x)) \cdot e^{-x}\,dx$
$f(x) = x \cdot \cos(x \cdot \sin(a \cdot x))$
N = 100, 1000
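Since $e^{-x}$ on $(0, \infty)$ is the Exp(1) density, $F(a)$ is exactly $E f(X)$ with $X \sim \mathrm{Exp}(1)$, so it can be estimated by combining the inverse transform with a sampling average (a sketch; helper names are mine):

```python
import math
import random

def f(x, a):
    """Integrand without the density factor: x * cos(x * sin(a*x))."""
    return x * math.cos(x * math.sin(a * x))

def estimate_F(a, N, seed=7):
    """Monte-Carlo estimate of F(a) = E f(X), X ~ Exp(1),
    with X simulated by the inverse transform x = -ln(1 - U)."""
    rng = random.Random(seed)
    return sum(f(-math.log(1.0 - rng.random()), a) for _ in range(N)) / N
```

As a sanity check, for a = 0 the integrand reduces to $x e^{-x}$, so $F(0) = 1$, and the estimate with large N should be close to 1.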
Wrap-up and conclusions
- The expectations of random functions, defined by multivariate integrals, can be approximated by sampling averages according to the LLN, if the sample size is sufficiently large;
- the CLT can be applied to evaluate the reliability and statistical error of this approximation.