Lecture 2
Basics of probability in statistical simulation and stochastic programming
Leonidas Sakalauskas
Institute of Mathematics and Informatics
Vilnius, Lithuania
EURO Working Group on Continuous Optimization
Content
Random variables and random functions
Law of Large Numbers
Central Limit Theorem
Computer simulation of random numbers
Estimation of multivariate integrals by the Monte-Carlo method
Simple remark
Probability theory provides the library of mathematical probabilistic models.
Statistics gives us the manual for choosing a probabilistic model coherent with the collected data.
Statistical simulation (the Monte-Carlo method) gives us the knowledge to simulate a random environment by computer.
Random variable
A random variable $X$ is described by:
its set of support SUPP( X );
a probability measure.
The probability measure is described by the distribution function:
$F(x) = \mathrm{Prob}(X \le x)$
Probability measure
A probability measure has three components:
Continuous;
Discrete (integer);
Singular.
Continuous r.v.
A continuous r.v. is described by a probability density function $f(x)$. Thus:
$F(x) = \int_{-\infty}^{x} f(y)\,dy$
Continuous variable
If the probability measure is absolutely continuous, the expected value of a random function is
$\mathbf{E}f(X) = \int f(x)\,p(x)\,dx$
Discrete variable
A discrete r.v. is described by mass probabilities:
values $x_1, x_2, \ldots, x_n$ with probabilities $p_1, p_2, \ldots, p_n$
Discrete variable
If the probability measure is discrete, the expected value of a random function is a sum or series:
$\mathbf{E}f(X) = \sum_{i=1}^{n} f(x_i)\, p_i$
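As a quick illustration of the discrete case, here is a minimal Python sketch; the support, the mass probabilities, and the function f below are illustrative choices, not taken from the slides.

```python
# Expected value of f(X) for a discrete r.v.:  E f(X) = sum_i f(x_i) * p_i
xs = [1, 2, 3, 4]          # support x_1, ..., x_n (illustrative)
ps = [0.1, 0.2, 0.3, 0.4]  # mass probabilities p_1, ..., p_n (sum to 1)

def f(x):                  # an arbitrary random function f
    return x ** 2

Ef = sum(f(x) * p for x, p in zip(xs, ps))
print(Ef)                  # 0.1*1 + 0.2*4 + 0.3*9 + 0.4*16 = 10.0
```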
Singular variable
The probability measure of a singular r.v. is concentrated on a set of zero Lebesgue measure (say, the Cantor set).
Law of Large Numbers (Chebyshev, Kolmogorov)
$\lim_{N \to \infty} \frac{\sum_{i=1}^{N} z_i}{N} = \bar{z}$,
here $z_1, z_2, \ldots, z_N$ are independent copies of a r.v. with expectation $\bar{z}$.
What did we learn ?
The integral
$\int f(x, z)\, p(z)\, dz$
is approximated by the sampling average
$\frac{1}{N}\sum_{j=1}^{N} f(x, z_j)$,
if the sample size $N$ is large; here $z_j$, $j = 1, \ldots, N$, is the sample of copies of the r.v. distributed with the density $p(z)$.
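A minimal Python sketch of this approximation; the density p is assumed to be standard normal and f(x, z) = cos(x + z) is an illustrative choice, picked because the exact value of the integral is known for checking.

```python
import numpy as np

rng = np.random.default_rng(0)

# Approximate  ∫ f(x, z) p(z) dz  by the sampling average (1/N) sum_j f(x, z_j),
# with z_1, ..., z_N drawn from the density p (here: standard normal).
def f(x, z):
    return np.cos(x + z)

x = 0.5
N = 100_000
z = rng.standard_normal(N)                 # sample z_1, ..., z_N ~ p
estimate = np.mean(f(x, z))

# Exact value for this particular choice:  E cos(x + Z) = cos(x) * exp(-1/2)
print(estimate, np.cos(x) * np.exp(-0.5))
```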
Central limit theorem (Gauss, Lindeberg, ...)
$\lim_{N \to \infty} P\!\left( \frac{\bar{x}_N - \mu}{\sigma / \sqrt{N}} < x \right) = \Phi(x)$,
here
$\bar{x}_N = \frac{\sum_{i=1}^{N} x_i}{N}$, $\qquad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{y^2}{2}}\, dy$,
$\mu = \mathbf{E}X$, $\qquad \sigma^2 = \mathbf{D}X = \mathbf{E}(X - \mu)^2$.
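A short Python sketch of the CLT in use: an approximate 95% confidence interval for the mean, with an illustrative exponential sample (any distribution with finite variance would do).

```python
import numpy as np

rng = np.random.default_rng(1)

# CLT in practice: for large N the standardized sample mean is ~ N(0, 1),
# so an approximate 95% confidence interval for mu = E X is
#   x_bar ± 1.96 * sigma / sqrt(N),  with sigma estimated from the sample.
N = 10_000
x = rng.exponential(scale=2.0, size=N)   # illustrative sample, true mean = 2

x_bar = x.mean()
sigma = x.std(ddof=1)
half_width = 1.96 * sigma / np.sqrt(N)

print(f"mean estimate: {x_bar:.4f} ± {half_width:.4f}")
```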
Berry-Esseen theorem
$\sup_x \left| F_N(x) - \Phi(x) \right| \le \frac{0.41\, \mathbf{E}|X - \mu|^3}{\sigma^3 \sqrt{N}}$,
where
$F_N(x) = \mathrm{Prob}\!\left( \frac{(\bar{x}_N - \mu)\sqrt{N}}{\sigma} < x \right)$.
What did we learn ?
According to the LLN, the unknown moments are estimated by the sample statistics
$\bar{x}_N = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \tilde{\sigma}^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x}_N)^2}{N}, \qquad \mathbf{E}|X - \mathbf{E}X|^3 \approx \frac{\sum_{i=1}^{N} |x_i - \bar{x}_N|^3}{N}.$
Thus, apply the CLT to evaluate the statistical error of the approximation and its validity.
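A Python sketch of these estimates and of the resulting error assessment; the sample below is illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Sample statistics used to assess the Monte-Carlo approximation:
#   x_bar   - sampling average (the estimate itself),
#   sigma2  - sample variance, for the CLT error  1.96 * sigma / sqrt(N),
#   m3      - sample third absolute central moment, for the
#             Berry-Esseen bound  0.41 * m3 / (sigma^3 * sqrt(N)).
x = rng.exponential(size=1000)          # illustrative sample x_1, ..., x_N
N = x.size

x_bar = x.mean()
sigma2 = np.mean((x - x_bar) ** 2)
m3 = np.mean(np.abs(x - x_bar) ** 3)

stat_error = 1.96 * np.sqrt(sigma2 / N)
be_bound = 0.41 * m3 / (sigma2 ** 1.5 * np.sqrt(N))

print(f"estimate {x_bar:.3f}, statistical error {stat_error:.3f}, "
      f"Berry-Esseen bound {be_bound:.3f}")
```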
Example
Let some event occur n times in N independent experiments.
Then the confidence interval for the probability of the event is
$\left( \hat{p} - \frac{1.96\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{N}},\;\; \hat{p} + \frac{1.96\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{N}} \right)$,
here $\hat{p} = \frac{n}{N}$ (1.96 is the 0.975 quantile of the normal distribution, i.e., a 5% significance level).
The Berry-Esseen condition is valid if
$N\,\hat{p}(1-\hat{p}) \ge 6$ !!!
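A small Python sketch of this confidence interval; the counts n and N are illustrative.

```python
import math

# Confidence interval for an event probability from n successes in N trials:
#   p_hat ± 1.96 * sqrt(p_hat * (1 - p_hat)) / sqrt(N)
n, N = 37, 200                     # illustrative counts
p_hat = n / N

half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat)) / math.sqrt(N)
print(f"p in ({p_hat - half_width:.3f}, {p_hat + half_width:.3f}) "
      f"with ~95% confidence")

# Rule of thumb from the slide for the normal approximation to be valid:
print("approximation valid:", N * p_hat * (1 - p_hat) >= 6)
```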
Statistical integration …
$I = \int_a^b f(x)\,dx = \;???$
Main idea – to use the "gaming" of a large number of random events.
Statistical integration
$\mathbf{E}f(X) = \int f(x)\,p(x)\,dx \approx \frac{\sum_{i=1}^{N} f(x_i)}{N}, \qquad x_i \sim p(\cdot)$
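A minimal Python sketch of statistical integration over a finite interval, writing I = ∫_a^b f(x) dx as (b − a)·E f(U) with U uniform on (a, b); the integrand and limits are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Statistical integration: write  I = ∫_a^b f(x) dx  as  (b - a) * E f(U)
# with U uniform on (a, b), and estimate the expectation by a sampling average.
def f(x):
    return np.exp(-x ** 2)

a, b, N = 0.0, 1.0, 100_000
x = rng.uniform(a, b, size=N)           # x_1, ..., x_N ~ p(.) = Uniform(a, b)
I_hat = (b - a) * np.mean(f(x))

print(I_hat)    # ≈ 0.7468, the exact value of ∫_0^1 exp(-x^2) dx
```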
Statistical simulation and
Monte-Carlo method
$F(x) = \int f(x, z)\,p(z)\,dz \to \min_x$
is approximated by
$\frac{\sum_{i=1}^{N} f(x, z_i)}{N} \to \min_x, \qquad z_i \sim p(\cdot)$
(Shapiro (1985), etc.)
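A minimal Python sketch of this sample average approximation; the objective f(x, z), the distribution of z, and the grid-search minimizer are illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(4)

# Sample average approximation: replace  F(x) = ∫ f(x, z) p(z) dz -> min_x
# by minimizing the sampling average over a fixed sample z_1, ..., z_N.
def f(x, z):
    return (x - z) ** 2 + 0.1 * np.abs(x)

N = 10_000
z = rng.normal(loc=1.0, scale=2.0, size=N)      # z_i ~ p(.)

def F_hat(x):
    return np.mean(f(x, z))                     # sample average objective

# Simple grid search keeps the sketch dependency-free; any smooth
# optimizer could be used instead.
grid = np.linspace(-5.0, 5.0, 2001)
x_star = grid[np.argmin([F_hat(x) for x in grid])]
print("approximate minimizer:", x_star)
```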
Simulation of random variables
There are many techniques and methods to simulate r.v.'s.
Let the r.v. $U$ be uniformly distributed in the interval (0, 1].
Then the random variable
$F^{-1}(U)$
is distributed with the cumulative distribution function $F(\cdot)$.
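A minimal Python sketch of this inverse-transform method for the exponential distribution, whose inverse c.d.f. is available in closed form.

```python
import numpy as np

rng = np.random.default_rng(5)

# Inverse-transform method: if U ~ Uniform(0, 1], then X = F^{-1}(U) has
# cumulative distribution function F.  Illustrated with the exponential
# distribution F(x) = 1 - exp(-x), whose inverse is F^{-1}(u) = -ln(1 - u).
N = 100_000
u = rng.uniform(size=N)                 # U ~ Uniform(0, 1)
x = -np.log(1.0 - u)                    # X = F^{-1}(U) ~ Exp(1)

print(x.mean(), x.var())                # both ≈ 1 for Exp(1)
```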
$F(a) = \int_0^{\infty} x \cos\!\big(x \sin(a x)\big)\, e^{-x}\, dx, \qquad f(x) = x \cos\!\big(x \sin(a x)\big)$
(Monte-Carlo estimates plotted for N = 100 and N = 1000.)
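A Python sketch of this example: since e^{-x} on (0, ∞) is the Exp(1) density, F(a) = E f(X) with X ~ Exp(1), sampled here by the inverse transform; the value a = 1 is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(6)

# The integral  F(a) = ∫_0^∞ x cos(x sin(a x)) e^{-x} dx  is E f(X) with
# f(x) = x cos(x sin(a x)) and X ~ Exp(1), whose density is e^{-x} on (0, ∞).
def f(x, a):
    return x * np.cos(x * np.sin(a * x))

def F_estimate(a, N):
    u = rng.uniform(size=N)
    x = -np.log(1.0 - u)                # X ~ Exp(1) via the inverse transform
    return np.mean(f(x, a))

a = 1.0                                 # illustrative value of the parameter
for N in (100, 1000):
    print(N, F_estimate(a, N))
```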
Wrap-Up and conclusions
o the expectations of random functions, defined by multivariate integrals, can be approximated by sampling averages according to the LLN, if the sample size is sufficiently large;
o the CLT can be applied to evaluate the reliability and statistical error of this approximation.