Lecture 4
Nonlinear Stochastic Programming
by the Monte-Carlo method
Leonidas Sakalauskas
Institute of Mathematics and Informatics
Vilnius, Lithuania
EURO Working Group on Continuous Optimization
Content

Stochastic unconstrained optimization
Monte-Carlo estimators
Statistical testing of optimality
Gradient-based stochastic algorithm
Rule for Monte-Carlo sample size regulation
Counterexamples
Nonlinear stochastic constrained optimization
Convergence Analysis
Counterexample
Stochastic unconstrained optimization

Let us consider the stochastic unconstrained optimization problem

$$F(x) = \mathbf{E}\, f(x, \omega) \to \min_{x \in D \subseteq R^n},$$

where $\omega$ is an elementary event in the probability space $(\Omega, \Sigma, \mathbf{P}_x)$, $f : R^n \times \Omega \to R$ is a random function, and $\mathbf{P}_x$ is the measure defined by the probability density function $p : R^n \times \Omega \to R$.
Monte-Carlo samples

We assume here that Monte-Carlo samples of a certain size N are provided for any $x \in R^n$,

$$Y = (y^1, y^2, \dots, y^N),$$

and that the sampling estimator of the objective function is computed:

$$\tilde{F}(x) = \frac{1}{N} \sum_{j=1}^{N} f(x, y^j),$$

together with the sampling variance, which is useful for evaluating the accuracy of the estimator:

$$\tilde{D}^2(x) = \frac{1}{N-1} \sum_{j=1}^{N} \left( f(x, y^j) - \tilde{F}(x) \right)^2.$$
Gradient

The gradient is estimated using the same random sample:

$$\tilde{g}(x) = \frac{1}{N} \sum_{j=1}^{N} g(x, y^j),$$

where $g(x, y^j)$ denotes the stochastic gradient of $f$ at $(x, y^j)$.
Covariance matrix

The sampling covariance matrix

$$A(x) = \frac{1}{N-n} \sum_{j=1}^{N} \left( g(x, y^j) - \tilde{g}(x) \right) \left( g(x, y^j) - \tilde{g}(x) \right)^T$$

is also computed and used later on for normalising the gradient estimator.
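A minimal sketch of these estimators, assuming NumPy and a user-supplied random function f(x, y) with stochastic gradient g(x, y); the quadratic test function at the bottom is purely illustrative and not taken from the lecture:

import numpy as np

def mc_estimators(x, Y, f, g):
    """Monte-Carlo estimators at the point x from the sample Y = (y^1, ..., y^N):
    objective mean F, sampling variance D2, gradient mean G and the sampling
    covariance matrix A of the gradient (normalised by N - n)."""
    N, n = len(Y), len(x)
    fv = np.array([f(x, y) for y in Y])        # f(x, y^j), j = 1..N
    gv = np.array([g(x, y) for y in Y])        # g(x, y^j), shape (N, n)
    F = fv.mean()
    D2 = fv.var(ddof=1)                        # 1/(N-1) * sum (f - F)^2
    G = gv.mean(axis=0)
    dev = gv - G
    A = dev.T @ dev / (N - n)                  # 1/(N-n) * sum (g - G)(g - G)^T
    return F, D2, G, A

# illustrative random function: f(x, y) = ||x - y||^2 with y ~ N(0, I)
f = lambda x, y: float(np.sum((x - y) ** 2))
g = lambda x, y: 2.0 * (x - y)

rng = np.random.default_rng(0)
x = np.array([1.0, -0.5])
Y = rng.standard_normal((1000, 2))
print(mc_estimators(x, Y, f, g))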
Gradient search procedure

Let some initial point $x^0 \in D \subseteq R^n$ be given, let a random sample of a certain initial size $N^0$ be generated at this point, and let the Monte-Carlo estimates be computed. The following iterative stochastic gradient search procedure can then be used:

$$x^{t+1} = x^t - \rho \cdot \tilde{g}(x^t).$$
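A minimal sketch of this iteration for the illustrative quadratic function above, with a fixed step length rho and a fixed sample size (the adaptive rule for the sample size follows on the next slides):

import numpy as np

rng = np.random.default_rng(1)
g = lambda x, y: 2.0 * (x - y)     # stochastic gradient of f(x, y) = ||x - y||^2, y ~ N(0, I)

x = np.array([5.0, -3.0])          # initial point x^0
rho, N = 0.1, 500                  # step length and (fixed) sample size

for t in range(200):
    Y = rng.standard_normal((N, 2))                 # Monte-Carlo sample at x^t
    G = np.mean([g(x, y) for y in Y], axis=0)       # gradient estimator
    x = x - rho * G                                 # x^{t+1} = x^t - rho * g~(x^t)

print(x)   # approaches the minimiser E y = 0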
Monte-Carlo sample size problem

There is no great need to compute the estimators with high accuracy at the start of the optimisation, because at that stage it suffices to evaluate only approximately the direction leading to the optimum.

Therefore one can take rather small samples at the beginning of the search and increase the sample size later on, so that the objective function is estimated with the desired accuracy exactly at the time of deciding whether a solution of the optimisation problem has been found.
Monte-Carlo sample size problem

The following version of the rule for regulating the sample size is proposed:

$$N^{t+1} = \min\left[ \max\left( \frac{n \cdot \mathrm{Fish}(\gamma, n, N^t - n)}{\left( \tilde{g}(x^t) \right)^T \left( A(x^t) \right)^{-1} \tilde{g}(x^t)} + n,\ N_{\min} \right),\ N_{\max} \right],$$

where $\mathrm{Fish}(\gamma, n, N^t - n)$ is the $\gamma$-quantile of the Fisher distribution with $(n, N^t - n)$ degrees of freedom, and $N_{\min}$, $N_{\max}$ are bounds imposed on the sample size.
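A sketch of this update, assuming SciPy's F-distribution quantile plays the role of Fish(γ, n, N − n); G and A denote the gradient estimator and its sampling covariance matrix from the earlier slides, and the bounds N_min, N_max are illustrative:

import numpy as np
from scipy.stats import f as fisher

def next_sample_size(G, A, N, gamma=0.95, N_min=100, N_max=100000):
    """Sample-size rule: proportional to the Fisher quantile divided by the
    Hotelling-type quadratic form G' A^{-1} G, clipped to [N_min, N_max]."""
    n = len(G)
    quad = float(G @ np.linalg.solve(A, G))      # g~(x)^T A(x)^{-1} g~(x)
    quantile = fisher.ppf(gamma, n, N - n)       # Fish(gamma, n, N - n)
    N_new = int(np.ceil(n * quantile / quad)) + n
    return min(max(N_new, N_min), N_max)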
Statistical testing of the optimality hypothesis

The optimality hypothesis can be accepted for some point $x^t$ with significance $1 - \mu$ if the following condition is satisfied:

$$T^t = \frac{(N^t - n)}{n} \left( \tilde{g}(x^t) \right)^T \left( A(x^t) \right)^{-1} \tilde{g}(x^t) \le \mathrm{Fish}(\mu, n, N^t - n).$$

Next, using the asymptotic normality of the sampling estimator, we decide that the objective function is estimated with a permissible accuracy $\varepsilon$ if its confidence bound does not exceed this value:

$$\eta_\beta \cdot \tilde{D}(x^t) / \sqrt{N^t} \le \varepsilon,$$

where $\eta_\beta$ is the $\beta$-quantile of the standard normal distribution.
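A sketch of these two termination tests under the same reconstruction; mu, beta and eps are illustrative choices of the quantile levels and of the permissible accuracy:

import numpy as np
from scipy.stats import f as fisher, norm

def optimality_tests(G, A, D2, N, mu=0.95, beta=0.95, eps=1e-2):
    """Returns True when both the gradient test and the accuracy test pass."""
    n = len(G)
    T = (N - n) / n * float(G @ np.linalg.solve(A, G))   # Hotelling-type statistic
    gradient_ok = T <= fisher.ppf(mu, n, N - n)          # T <= Fish(mu, n, N - n)
    eta = norm.ppf(beta)                                 # normal quantile eta_beta
    accuracy_ok = eta * np.sqrt(D2 / N) <= eps           # confidence bound on F~
    return gradient_ok and accuracy_ok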
Importance sampling

Let us consider an application of stochastic programming to the estimation of probabilities of rare events:

$$P(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-\frac{t^2}{2}}\, dt = \frac{1}{\sqrt{2\pi}} \int_{x-a}^\infty e^{-a t - \frac{a^2}{2}}\, e^{-\frac{t^2}{2}}\, dt,$$

where

$$g(a, t) = \frac{e^{-\frac{t^2}{2}}}{e^{-\frac{(t-a)^2}{2}}} = e^{-a t + \frac{a^2}{2}}, \qquad P(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty g(a, t)\, e^{-\frac{(t-a)^2}{2}}\, dt.$$
Importance sampling

Assume that $a$ is the parameter that should be chosen. The second moment of the importance sampling estimator is:

$$D^2(x, a) = \frac{1}{\sqrt{2\pi}} \int_x^\infty g^2(a, t)\, e^{-\frac{(t-a)^2}{2}}\, dt = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-\frac{t^2}{2} - a t + \frac{a^2}{2}}\, dt = \frac{1}{\sqrt{2\pi}} \int_{x-a}^\infty e^{-\frac{t^2}{2} - 2 a t - a^2}\, dt.$$
Importance sampling

Select the parameter $a$ in order to minimize the variance

$$\sigma^2(x, a) = D^2(x, a) - P^2(x);$$

for $a = 0$ this reduces to the crude Monte-Carlo variance $P(x) - P^2(x)$.

[Figure: variance of the importance sampling estimator as a function of the parameter a, plotted for a between 0 and 8.]
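A sketch comparing the crude and the shifted (importance sampling) estimators of P(x) for x = 5; the shift a = 5 and the sample size are illustrative choices, and the weight is the likelihood ratio g(a, t) derived above:

import numpy as np

rng = np.random.default_rng(2)
x, a, N = 5.0, 5.0, 200_000

# crude Monte-Carlo: indicator of {t > x} with t ~ N(0, 1);
# at this sample size it typically observes no exceedances at all
t = rng.standard_normal(N)
crude = (t > x).astype(float)

# importance sampling: draw t ~ N(a, 1) and reweight by g(a, t) = exp(-a t + a^2 / 2)
s = rng.standard_normal(N) + a
w = np.exp(-a * s + a ** 2 / 2.0)
importance = (s > x) * w

print(crude.mean(), crude.std(ddof=1) / np.sqrt(N))
print(importance.mean(), importance.std(ddof=1) / np.sqrt(N))
# exact value: P(5) = 2.86650e-7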
Importance sampling

t | a_t | Sample size | ε (%) | P̃_t
1 | 5.000 | 1000 | 16.377 | 2.48182×10⁻⁷
2 | 5.092 | 51219 | 2.059 | 2.87169×10⁻⁷
3 | 5.097 | 217154 | 1.000 | 2.87010×10⁻⁷

Exact value: P(x) = 2.86650×10⁻⁷.
Manpower-planning problem

The employer must decide upon a base level of regular staff at various skill levels. The recourse actions available are regular-staff overtime or outside temporary help, used to meet the unknown demand for services at minimum cost (Ermoliev and Wets (1988)).
Manpower-planning problem

$x_j$ : base level of regular staff at skill level j = 1, 2, 3,
$y_{j,t}$ : amount of overtime help,
$z_{j,t}$ : amount of temporary help,
$c_j$ : cost of regular staff at skill level j = 1, 2, 3,
$q_j$ : cost of overtime,
$r_j$ : cost of temporary help,
$w_t$ : demand for services at period t,
$\alpha_t$ : absentee rate for regular staff at time t,
$\lambda_{j-1}$ : required ratio of the amount of staff at skill level j to the amount at level j−1.
Manpower-planning problem

The problem is to choose the numbers of staff at the three levels, $x = (x_1, x_2, x_3)$, in order to minimize the expected costs:

$$F(x) = \sum_{j=1}^{3} c_j\, x_j + \sum_{t=1}^{12} \mathbf{E} \min \Big( \sum_{j=1}^{3} \big( q_j\, y_{j,t} + r_j\, z_{j,t} \big) \Big)$$

s.t.

$$x_j \ge 0,\quad y_{j,t} \ge 0,\quad z_{j,t} \ge 0,\qquad j = 1, 2, 3,\quad t = 1, 2, \dots, 12,$$

$$\sum_{j=1}^{3} \big( y_{j,t} + z_{j,t} \big) \ge w_t - \alpha_t \sum_{j=1}^{3} x_j,\qquad t = 1, 2, \dots, 12,$$

$$y_{j,t} \le 0.2\, \alpha_t\, x_j,\qquad j = 1, 2, 3,\quad t = 1, 2, \dots, 12,$$

$$\lambda_{j-1} \big( x_{j-1} + y_{j-1,t} + z_{j-1,t} \big) - \big( x_j + y_{j,t} + z_{j,t} \big) \ge 0,\qquad j = 2, 3,\quad t = 1, 2, \dots, 12,$$

where the demands are normally distributed, $w_t \sim N(\mu_t, \sigma_t^2)$, $t = 1, 2, \dots, 12$, and the parameter $l$ scales the demand variability.
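A heavily simplified sketch of how the expected recourse cost could be estimated by Monte-Carlo: a single period, no skill-ratio constraints, and made-up staffing, cost and demand figures (none of the numbers come from the original study); the second-stage problem is solved with SciPy's linear programming routine:

import numpy as np
from scipy.optimize import linprog

def recourse_cost(x, w, alpha, q, r):
    """Cheapest mix of overtime y_j and temporary help z_j covering the shortfall
    w - alpha * sum(x), with overtime capped at 0.2 * alpha * x_j."""
    c = np.concatenate([q, r])                       # minimise q'y + r'z
    A_ub = -np.ones((1, 6))                          # -(sum y + sum z) <= -(w - alpha * sum x)
    b_ub = [-(w - alpha * np.sum(x))]
    bounds = [(0.0, 0.2 * alpha * xj) for xj in x] + [(0.0, None)] * 3
    return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs").fun

x = np.array([9000.0, 5500.0, 1100.0])               # candidate base staffing levels
cb = np.array([4.0, 6.0, 9.0])                       # regular-staff cost per unit
q = np.array([6.0, 9.0, 13.0])                       # overtime cost per unit
r = np.array([8.0, 12.0, 18.0])                      # temporary-help cost per unit
alpha = 0.9
rng = np.random.default_rng(3)
demand = rng.normal(15000.0, 500.0, size=500)        # w ~ N(mu, sigma^2)

expected_recourse = np.mean([recourse_cost(x, w, alpha, q, r) for w in demand])
print(cb @ x + expected_recourse)                    # estimated total expected cost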
Manpower-planning problem

Manpower amounts and costs in USD (confidence interval 100 USD):

l | x1 | x2 | x3 | F
0 | 9222 | 5533 | 1106 | 94,899
1 | 9222 | 5533 | 1106 | 94,899
10 | 9376 | 5616 | 1106 | 96,832
30 | 9452 | 5672 | 1036 | 96,614
Nonlinear Stochastic Programming

The constrained continuous (nonlinear) stochastic programming problem is in general

$$F_0(x) = \mathbf{E} f_0(x, \omega) = \int_{R^n} f_0(x, z)\, p(z)\, dz \to \min,$$

$$F_1(x) = \mathbf{E} f_1(x, \omega) = \int_{R^n} f_1(x, z)\, p(z)\, dz \le 0,$$

$$x \in X.$$
Nonlinear Stochastic Programming

Let us define the Lagrange function

$$L(x, \lambda) = F_0(x) + \lambda\, F_1(x),$$

which is itself an expectation:

$$L(x, \lambda) = \mathbf{E}\, l(x, \lambda, \omega), \qquad l(x, \lambda, \omega) = f_0(x, \omega) + \lambda\, f_1(x, \omega).$$
Nonlinear Stochastic Programming

Procedure for stochastic optimization:

$$x^{t+1} = x^t - \rho\, \nabla_x \tilde{L}(x^t, \lambda^t),$$

$$\lambda^{t+1} = \max\Big[ 0,\ \lambda^t + \rho\, \big( \tilde{F}_1(x^t) + \eta_\beta\, \tilde{D}_{F_1}(x^t) / \sqrt{N^t} \big) \Big],$$

where $x^0$, $\lambda^0$, $N^0$ are the initial values, $\rho > 0$ and $\beta > 0$ are the parameters of the optimization, and $\tilde{D}_{F_1}(x^t)$ is the sampling standard deviation of the constraint estimator.
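A sketch of this primal-dual procedure on an illustrative model (f0(x, ξ) = ||x − ξ||² and f1(x, ξ) = 0.5 − x₁ + ξ₁, neither taken from the lecture), with the confidence-bound term in the multiplier update as reconstructed above:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
eta = norm.ppf(0.95)                       # quantile eta_beta for the confidence-bound term
rho, N = 0.05, 2000

# illustrative random functions: f0(x, xi) = ||x - xi||^2, f1(x, xi) = 0.5 - x[0] + xi[0]
f0_grad = lambda x, xi: 2.0 * (x - xi)
f1 = lambda x, xi: 0.5 - x[0] + xi[0]
f1_grad = lambda x, xi: np.array([-1.0, 0.0])

x, lam = np.array([2.0, 2.0]), 0.0
for t in range(300):
    Xi = rng.standard_normal((N, 2))                                     # Monte-Carlo sample
    gradL = np.mean([f0_grad(x, xi) + lam * f1_grad(x, xi) for xi in Xi], axis=0)
    F1 = np.array([f1(x, xi) for xi in Xi])
    x = x - rho * gradL                                                  # primal step
    lam = max(0.0, lam + rho * (F1.mean() + eta * F1.std(ddof=1) / np.sqrt(N)))   # dual step

print(x, lam)   # roughly x = (0.5, 0) and lambda = 1, up to the confidence-bound bias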
Conditions and testing of optimality

The optimality conditions are

$$\nabla F_0(x^*) + \lambda^*\, \nabla F_1(x^*) = 0, \qquad \lambda^*\, F_1(x^*) = 0, \qquad \lambda^* \ge 0,$$

and they are tested statistically. The stationarity of the Lagrangian is checked by the criterion

$$(N^t - n)\, \big( \nabla_x \tilde{L} \big)^T A^{-1} \big( \nabla_x \tilde{L} \big) / n \le \mathrm{Fish}(\mu, n, N^t - n),$$
the constraint value is checked through its confidence bound,

$$\tilde{F}_1(x^t) + \eta_\beta\, \tilde{D}_{F_1}(x^t) / \sqrt{N^t} \ge 0,$$

and the objective and constraint estimates are required to be within the permissible accuracy:

$$2\, \eta_\beta\, \tilde{D}_i(x^t) / \sqrt{N^t} \le \varepsilon_i, \qquad i = 0, 1.$$
Analysis of Convergence

In general the sample size increases as a geometric progression,

$$N^{t+1} \ge Q \cdot N^t, \qquad Q > 1,$$

so that the total number of generated samples is bounded by $\sum_{i=0}^{t} N^i \le \frac{Q}{Q-1}\, N^t$.
Wrap-Up and Conclusions

The approach presented here is grounded in the stopping procedure and in the rule for adaptive regulation of the Monte-Carlo sample size, which takes the accuracy of the statistical modeling into account.

Several stochastic gradient estimators were compared by computer simulation, studying how well the estimators support testing of the optimality hypothesis by statistical criteria.

It was demonstrated that the minimal Monte-Carlo sample size needed to approximate the distribution of the Hotelling statistic, computed from the gradient estimators, by the Fisher distribution depends on the approximation approach and on the dimensionality of the task.
Wrap-Up and Conclusions

The computational results show that the approach developed provides an estimator for reliable checking of the optimality hypothesis over a wide range of dimensionalities of the stochastic optimization problem (2 < n < 100).

The termination procedure proposed allows us to test the optimality hypothesis and to evaluate reliably the confidence intervals of the objective and constraint functions.