Some basic information on
Mathematics in Cuba
Creation of the career: 1962
1962-68: Use of Czech and Polish professors.
1968-75: Redefinition of plans and use of French professors in graduate and postgraduate studies.
1975-92: Postgraduate studies with the Soviet Union (Pure Mathematics and Probability) and the GDR (Optimization and Statistics).
Specializations
Pure (Functional Analysis, Differential Equations, Algebra, Dynamical Systems, Education, History)
Applied (Stochastics, Optimization, Numerical Analysis)
Centers of Higher Education and Research
Centers of Higher Education:
14 universities
12 polytechnical institutes
25 medical institutes
18 pedagogical institutes
25 military and political CES
Mathematicians are trained in 3 universities (West, Center, East), with fewer than 10 graduates per year.
Academy of Sciences:
14 delegations
82 institutes (1 in Mathematics)
Mathematicians also work in other institutes (Statistics, Numerical Problems, Differential Equations, Dynamical Systems).
54 other research institutes attached to ministries.
So mathematicians are valuable in
the labour market
My group
12 applied mathematicians.
Organizers of the biennial International Workshop on Operations Research (supported by DAAD and IMU; ALIO-IFORS, 2004). Mono-thematic: Financial Mathematics (2001), Logistics (2003).
Organizers of the International Operations Research Conference (supported by TWAS, DAAD and IMU). By invitation only: 100 foreigners / 50 Cubans. Co-organized by Humboldt Universität and a scientific society. The March 2006 edition is co-organized by Humboldt Universität and Université Paris I (Panthéon-Sorbonne).
Host organizers of several regional conferences (20 foreigners / 20 Cubans).
Editors of an international journal: Investigación Operacional.
The talk and our work in Havana
We aim at relating our research to economic problems through cooperation with:
Université Paris I (Mathematical Economics)
Humboldt Universität (Financial and Macroeconomics)
Universidade da Coruña (Quantitative Methods for Economics)
The talk presents results obtained in projects with Humboldt (DE) and A Coruña (ES).
THE EXISTENCE OF A CONTAMINATION OF TWO
DISTRIBUTIONS: SOLUTIONS OF STOCHASTIC OPTIMIZATION
PROBLEMS.
Carlos N. Bouza
Universidad de La Habana
Objectives of this contribution
• Discuss the effect of dealing with a contamination of two distributions in the solution of some optimization problems
• Illustrate how the regression problem may be solved
• Discuss the effect of estimating the probability measure involved in the solution of a Stochastic Programming problem through the convergence of the approximation error
• Illustrate how this problem may be solved
• Discuss how the existence of a contamination of DFs in Stochastic Programming can be handled, and the convergence of a proposal
• Study the behavior of the proposed procedure through a Monte Carlo experiment
Estimation of a density function
The estimator of the density is
f_n(z) = ∑_{i=1}^n δ_{m(n)}(z, z_i)/n
An exhaustive set of properties of the delta family {δ_m}:
1) ∫ δ_m(z,z′) μ(dz′) = 1.
2) ∫ δ_m²(z,z′) μ(dz′) ≤ c′m, ∀z′ ∈ Z′, Z′ a subset of a compact set C[E].
3) ∀z ∈ Z, ∀z′ ∈ Z′, if m ≤ m′ then δ_m(z,z′) ≤ c′m.
4) ∀z ∈ Z, ∀φ ∈ C[E]: |∫ δ_m(z,z′)φ(z′) μ(dz′) − φ(z)| → 0 as n → ∞.
5) ∀φ ∈ C[E]: Sup_{z∈Z} |∫ δ_m(z,z′)φ(z′) μ(dz′) − φ(z)| → 0 as n → ∞.
6) ∀z ∈ Z, ∀φ ∈ C[E]: |∫ δ_m(z,z′)φ(z′) μ(dz′) − φ(z)| ≤ s′m^{s″}.
7) If E = ℝ then ∫ |δ_m(z,z′)| μ(dz′) is bounded over Z.
8) ∀(z,z′) ∈ ℝ^s × ℝ^s, δ_m(z,z′) ≥ 0.
9) If E = ℝ^s, ∃n₀ such that ∀n > n₀, δ_m(z,z′) ≤ O(m^s).
10) If E = ℝ^s, Sup{δ_m(z,z′) : ||z − z′|| > c′m} ≤ O(m^s/n), c′ > 1.
11) If E = ℝ^s, Sup_{z′}{||z − z′||⁻¹ δ_m(z,z′)} ≤ c′, c′ > 1.
12) If E = ℝ^s, lim_{r→∞} Sup{r·δ_m(z,z′) : ||z − z′|| < r} = 0.
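Delta sequences of this kind include ordinary kernel density estimators. As a minimal sketch of f_n, the block below uses a Gaussian kernel with a fixed bandwidth; both choices are illustrative assumptions, not the particular δ_{m(n)} family of the talk.

```python
import numpy as np

def f_n(z, sample, h):
    """Delta-sequence-style density estimate: f_n(z) = (1/n) * sum_i K_h(z - z_i),
    here with a Gaussian kernel K and bandwidth h (illustrative choices)."""
    u = (z - sample[:, None]) / h                    # shape (n, len(z))
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)   # kernel values
    return k.mean(axis=0) / h

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=1000)             # n = 1000 draws from N(0,1)
est = f_n(np.array([0.0]), sample, h=0.3)[0]         # estimate of f(0); the true density value is ~0.399
```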
Some assumptions on F and F*
F1) F (F*) is absolutely continuous with respect to the Lebesgue measure in E^s.
F2) The density function f(z) {f*(z)} of F(z) {F*(z)} is bounded for every z ∈ Z_F {z ∈ Z_F*}.
The real DF of z is a contamination of two DFs, F and H, both belonging to the class Æ:
F*_λ = λF(z) + (1 − λ)H(z)
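Such a contaminated DF is easy to simulate: draw each observation from F with probability λ and from H otherwise. A minimal sketch; the two normals anticipate the N(100,5) and N(50,5) used in the Monte Carlo design later, and λ = 0.8 is an arbitrary illustration.

```python
import numpy as np

def sample_contaminated(n, lam, rng):
    """Draw n values from F*_lambda = lam*F + (1-lam)*H,
    with F = N(100, 5) and H = N(50, 5)."""
    from_f = rng.random(n) < lam                 # per-draw component indicator
    return np.where(from_f,
                    rng.normal(100.0, 5.0, n),
                    rng.normal(50.0, 5.0, n))

rng = np.random.default_rng(1)
z = sample_contaminated(100000, lam=0.8, rng=rng)
# the mixture mean is lam*100 + (1-lam)*50 = 90 for lam = 0.8
```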
Optimization Problems and Linear Regression
The programming problem:
P1: Maximize {cᵀx | Ax = b, x ≥ 0}
x ∈ ℝⁿ is the vector of decision variables,
c ∈ ℝⁿ is the cost vector, whose components are known constants,
A is a full-rank m×n matrix.
The feasibility region of P1 is assumed to be bounded and nonempty.
The regression model considers that the response is modeled as
Y = g(c, x*) + ε
where ε is an unobservable random variable that expresses the difference between the observed Y and the predicted Y* = g(c, x*).
If g is linear in c, the regression is called linear.
Some considerations
The design variables X_ij, j = 1,…,k, have a marginal distribution P_d defined on the Borel sets of ℝ.
The regression distribution P is defined on the Borel sets of ℝ^(k+1), h > k.
S(h+1) is the set of possible regression equations for a particular problem.
An estimator of the vector of regression parameters is a functional
T: S(h+1) → ℝ^(k+1)
T is regression-equivariant, scale-invariant and affine-equivariant.
The regression model P₀ determines a DF F₀.
For a (pseudo)metric π, a weak neighborhood of F₀ is B(F₀, ε) = {F | π(F, F₀) < ε}.
The general optimization problem generated by the LRP
The M-estimation point of view:
Min_c ∑_{i=1}^n ρ(y_i, g(x_i, c))
ρ is a convex function.
If the x_ij's have a symmetric DF, the solution is unique.
Under general regularity conditions the asymptotic distribution is normal.
The L2 norm and the quadratic programming problem
The use of ρ(y_i, g(x_i, c)) = (y_i − g(x_i, c))² poses the Least Squares problem
Min_c ∑_{i=1}^n (y_i − g(x_i, c))²
The solution is optimal only if, for every i = 1,…,n:
• Cov(ε_i, ε_i′) = 0 if i ≠ i′
• E(ε_i²) = σ²
• ε_i is distributed according to a Gaussian
• E(ε_i) = 0
• g(x_i, c) = cᵀx_i + ε_i
• n ≥ k + 1
• No design variable can be expressed as a linear combination of the others
• ε_i and x_i are independent
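Under these conditions the LS problem has the familiar closed-form solution; a minimal numpy sketch with simulated Gaussian data (the coefficients and sample sizes are made-up illustrations):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # intercept + k designs
c_true = np.array([1.0, 2.0, -3.0])                         # made-up coefficients
y = X @ c_true + rng.normal(0.0, 0.5, n)                    # Gaussian errors

c_hat, *_ = np.linalg.lstsq(X, y, rcond=None)               # minimizes the sum of squares
```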
The L1 norm and the linear programming problem
The use of ρ(y_i, g(x_i, c)) = |y_i − g(x_i, c)| poses the optimization problem
Min_c ∑_{i=1}^n |y_i − g(x_i, c)|
which is solved by
Min_c ∑_{i=1}^n (y_i − g(x_i, c))⁺ + (y_i − g(x_i, c))⁻
ST: y_i − c₀ − ∑_{j=1}^k c_j x_ij − (y_i − g(x_i, c))⁺ + (y_i − g(x_i, c))⁻ = 0;
(y_i − g(x_i, c))⁺, (y_i − g(x_i, c))⁻ ≥ 0, i = 1,…,n
The solution is known as Least Absolute Value. It is optimal only if, for every i = 1,…,n:
• g(x_i, c) = cᵀx_i + ε_i
• E(ε_i) = 0; E(ε_i²) = σ²; Cov(ε_i, ε_i′) = 0 if i ≠ i′
• ε_i is distributed according to a Laplacian
• n ≥ k + 1
• No design variable can be expressed as a linear combination of the others
• ε_i and x_i are independent
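The LAV fit can be computed without an LP solver through iteratively reweighted least squares, a common numerical stand-in for the LP formulation above (the data below are simulated with Laplacian errors, the case where LAV is optimal):

```python
import numpy as np

def lav_fit(X, y, iters=50, eps=1e-6):
    """Least Absolute Value fit via iteratively reweighted least squares:
    each step solves a weighted L2 problem with weights 1/|residual|."""
    c = np.linalg.lstsq(X, y, rcond=None)[0]       # start from the LS solution
    for _ in range(iters):
        w = 1.0 / np.maximum(np.abs(y - X @ c), eps)
        Xw = X * w[:, None]
        c = np.linalg.solve(X.T @ Xw, X.T @ (w * y))
    return c

rng = np.random.default_rng(3)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.laplace(0.0, 0.5, n)  # Laplacian errors
c_hat = lav_fit(X, y)
```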
The use of two norms as a model for outliers: L2 & L1
ρ(y_i, g(x_i, c)) = λ(y_i − g(x_i, c))² + (1 − λ)|y_i − g(x_i, c)|
Now
F*_λ = λF(z) + (1 − λ)H(z)
We deal with a parametric problem.
A solution for λ known was derived by Arthanari and Dodge (1981, Mathematical Programming in Statistics, Wiley, New York).
Parametric programming algorithms for solving the case of unknown λ have been derived by Allende-Bouza (1993, in “Parametric Optimization and Related Topics”, P. Lang, Frankfurt) and a sequel of papers with collaborators published in 1994-2004.
The Program
If we consider that it is a combination of a Normal and a Laplacian, a program that implements a one-step M-estimator is
Min_{(c,λ)} (1 − λ)[vᵀQv] + λcᵀv
Subject to: Av ≥ b, v ≥ 0
where
bᵀ = [Y, −Y]; cᵀ = [0_{1×2k}, eᵀ_{1×n}], eᵀ = [1,…,1],
λ ∈ (0,1), Q ∈ L[2k+n, 2k+n], A ∈ L[2n, 2k+n]
Stochastic optimization is of use in many applications when some of the coefficients are unknown and of random nature.
• The touchstone example is a decision-maker that must make decisions under uncertainty. Generally he/she should model the problem as an optimization one.
• Uncertainty may be modeled by the randomness of the studied phenomena.
• It may be quite difficult to write down the theoretical model.
• A more complicated task is to compute the solution, because the calculation of a numerical solution generally relies on rather complicated theoretical models.
• Even the solution of simple linear programming problems requires the computation of the optimal solutions of non-linear models.
• The availability of large computing capacities has made it possible to develop software packages for solving a large collection of stochastic programs; see Kall-Wallace (1994) and Birge-Louveaux (1997), for example.
The basic Optimization Problem
Take the probability space (Ω, S, μ).
μ generates a distribution function (DF) F(z) with support Z_F ⊆ E^s.
A broad class of Stochastic Programming Problems is described by
φ(F, ε) = Inf{∫_{Z_F} g(z,x) μ(dz) = E_F(g(z,x)) | x ∈ X_F(ε)}
where
• g1) g(z,x) is a uniformly continuous function.
• g2) ∀x ∈ X, g(x,z) is a Lipschitz function of z ∈ Z_F(μ) with Lipschitz constant L independent of x.
For a fixed ε ∈ [0,1]:
• X1) X_F(ε) = {x ∈ X | F[g(z,x) ≥ 0] ≥ ε}
• X2) If ε = 0 then X_F(ε) = X.
• X3) X is a convex set.
The SLPWFR
φ(μ) = min{cᵀx + ∫_Z Q(z,x) μ(dz) | x ∈ X ⊆ ℝ^k}
z = (r, b, A) ∈ Z, r ∈ ℝ^s, b ∈ ℝ^q, A ∈ L(k, q),
X is a convex polyhedral set,
W ∈ L(s, q) is the recourse matrix,
Q(z,x) = inf{rᵀy | Wy = b − Ax, y ≥ 0}
and for every z ∈ Z and x ∈ X the sets
{y | Wy = b − Ax, y ≥ 0} and {u | Wᵀu ≤ r}
are nonempty.
Bounding the AE in SLPWCFR
Römisch-Wakolbinger (1987, in “Parametric Optimization and Related Topics”):
• AE = |φ(μ) − φ(ν_m)| = |∫_Z g(z) μ(dz) − ∫_Z g(z) ν_m(dz)| ≤ C[1 + M_p(μ) + M_p(ν_m)]·β^{1−1/p}(μ, ν_m)
where, for a p > 0,
• M_p(a) < ∞ is the moment of order p of the probability measure a = μ, ν_m,
• C is a known constant,
• β(μ, ν_m) is the Bounded Lipschitz metric.
Estimation of the AE
Römisch-Wakolbinger (1987, in “Parametric Optimization and Related Topics”): use ν_m = μ_n, the empirical measure.
Bouza (1992, in “System Modelling and Optimization”, Springer): use f_n(z) = ∑_{i=1}^n δ_{m(n)}(z, z_i)/n.
The convergence of φ(μ_n) to φ(μ) depends on the convergence of μ_n to μ.
The convergence of φ(f_n) to φ(μ) = φ(f) depends on the convergence of f_n to the density function f of the DF associated with μ.
Main point
The effect of using an incorrect distribution for solving the problem is measured by the difference between the true value of the optimal solution and the computed one: the approximation error (AE).
Some other results on the use of the Approximation
Error when we deal with an uncontaminated DF
Römisch-Wakolbinger (1987), Römisch-Schultz (1987),
Römisch (1989), Bouza-Allende (1989),
Römisch-Schultz (1990), Römisch-Schultz (1991), Dupačová (1991),
Bouza (1992), Römisch-Schultz (1992), Gröwe-Römisch (1992),
etc.
⇒ The usual procedure is to derive a bound of the AE and to analyze its behavior when an estimator of F is used.
What happens if the real DF of z is a contamination of two DFs,
F*_λ = λF(z) + (1 − λ)H(z)?
Kanková (1998, YUJOR) analyzed this problem.
Kanková's result for s = 1
As
Δ_g(F, F*_λ) = |E_F(g(z,x)) − E_{F*_λ}(g(z,x))| = (1 − λ)|E_F(g(z,x)) − E_H(g(z,x))|,
we expect that D(F, F*_λ) is large if λ → 0 and small if λ → 1.
Theorem 2.1 (Kanková, 1998): if s = 1 and ε = 0 then
Δ_g(F, F*_λ) ≤ L·Sup_z|F(z) − F*_λ(z)|
whenever
• g1, g2, F1 and F2 hold,
• X is a compact set,
• Z_F = <c, c′>, c′ > 0, Z_H ⊆ Z_F(μ),
• Δ = Sup_z|F(z) − H(z)|.
Remarks
If λ is fixed:
• Sup_z|F(z) − F*_λ(z)| = (1 − λ)Δ
• |φ(F(z, ε)) − φ(F*_λ(z, ε))| ≤ L(1 − λ)Δ
The determination of λ may be of importance, by itself, in many applications.
If we deal with a contamination of two Gaussians,
D(F, H) = Sup_z|F(z) − H(z)| can be evaluated directly.
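The identity Sup|F − F*_λ| = (1 − λ)·Sup|F − H| follows algebraically from F − F*_λ = (1 − λ)(F − H); it can be checked numerically for two Gaussian DFs. The means and variances below are the ones used later in the Monte Carlo design, and λ = 0.7 is an arbitrary illustration:

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(z, mu, sigma):
    """Normal DF via the error function."""
    return 0.5 * (1.0 + erf((z - mu) / (sigma * sqrt(2.0))))

lam = 0.7
grid = np.linspace(0.0, 150.0, 15001)
F = np.array([norm_cdf(z, 100.0, 5.0) for z in grid])   # F = N(100, 5)
H = np.array([norm_cdf(z, 50.0, 5.0) for z in grid])    # H = N(50, 5)
F_star = lam * F + (1.0 - lam) * H                      # contaminated DF

lhs = np.abs(F - F_star).max()                          # sup |F - F*_lambda|
rhs = (1.0 - lam) * np.abs(F - H).max()                 # (1 - lambda) * sup |F - H|
# lhs == rhs up to rounding, since F - F*_lambda = (1 - lambda)(F - H)
```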
Kanková's result for s > 1
Theorem 2.2 (Kanková, 1998): if s > 1 and the hypotheses of Theorem 2.1 hold, then
|φ(F(z, ε)) − φ(F*_λ(z, ε))| ≤ 8sLΔ
whenever
• Z_F(ε) = ∏_{j=1}^s <c_j, c′_j>,
• c′_j > 0, c_j < c′_j.
Main Result of Bouza et al. (2005), VIII-ParaOpt (El Cairo)
Theorem 3.2: Assume that 1)-4) hold and that
- N(f) = Sup{f(z) : z′ ∈ G = {z′ ∈ E | ∀m ≥ m₀, δ_m(z, z′) ≠ 0}} < ∞,
- ∀ε > 0, Z_F is covered by L(ε) balls of radius ε > 0, and there exist s′, s″ > 0 with L(ε) < s′e^{s″},
- there exist s‴, s⁗ > 0 such that ∀m > m₀, ∀z, z′ ∈ Z_F and z* ∈ E, for a metric D,
|δ_m(z, z′) − δ_m(z*, z′)| < s‴m^{s⁗}D(z, z*).
Then with probability one it holds that
|φ(F*_λ, 0) − φ(F, 0)| ≤ 8Ls·[Sup_z|F*_λn(z) − F_λ(z)|]^{1/s}
An Application: the flower girl problem as described by Römisch (2002)
The girl buys x_t flowers at a cost c*_t.
Each unit is sold at a price c_t > c*_t.
On day t she has a remainder of s_{t−1} unsold flowers from day t−1.
• A flower cannot be carried over more than one additional day.
• The demand d_t is random.
• The number of unsold flowers is z_t.
The flower vendor wants to solve the stochastic problem
FGP: φ = ArgMax E[(c_1 − c*_1)x_1 + ∑_{t=2}^T ((c_t − c*_t)x_t − c_t z_t)]
Subject to: s_t + z_t ≥ x_t + s_{t−1} − d_t;
z_t ≥ s_{t−1} − d_t; x_t ≥ 0; t = 1,…,T
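A one-period simplification of the FGP is the classical newsvendor problem, which can be attacked by sample-average approximation. The prices and the N(100,5) demand below are illustrative (the demand distribution matches the Monte Carlo design that follows); this is a sketch, not the talk's multi-period solution method:

```python
import numpy as np

rng = np.random.default_rng(5)
c_buy, c_sell = 1.0, 3.0                   # made-up prices with c_sell > c_buy
demand = rng.normal(100.0, 5.0, 20000)     # simulated random demand

def expected_profit(x):
    sold = np.minimum(x, demand)           # cannot sell more than is demanded
    return (c_sell * sold - c_buy * x).mean()

candidates = np.arange(80, 121)
best_x = int(candidates[np.argmax([expected_profit(x) for x in candidates])])
# the newsvendor critical fractile P(d <= x) = (c_sell - c_buy)/c_sell = 2/3
# puts the optimal order a bit above the mean demand of 100
```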
Design of a Monte Carlo Experiment
• The distribution function F is a normal distribution, N(100, 5) = N(1).
• The distribution function H is a normal distribution, N(50, 5) = N(λ).
• A shock in the market has the effect of generating the demands according to the contaminated distribution function F*_λ.
• The demands were generated according to one of these pdf's.
• The solution of FGP is relatively easy to compute because of the normality of the random variables.
• φ(q) was computed for q = 1, 0.01, 0.05, 0.10.
• The sample sizes used for generating demands were n = 50, 100, 500, 1000.
• The density function was estimated.
• Each sample was replicated 10, 25 or 100 times.
• The evaluation of a particular density function estimator R for a sample size n in run j is φ(q, R | n)_j.
• The measure of accuracy was the mean relative error (AE)
Δ_N(q, q′ | R) = ∑_{j=1}^N |φ(q, R | n)_j − φ(q)| / (N·|φ(q)|)
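The accuracy measure Δ_N is just a normalized sum of absolute deviations; a minimal sketch with made-up run values:

```python
import numpy as np

def mean_relative_ae(phi_runs, phi_true):
    """Delta_N = sum_j |phi_hat_j - phi_true| / (N * |phi_true|),
    the mean relative approximation error over N replicated runs."""
    phi_runs = np.asarray(phi_runs, dtype=float)
    return np.abs(phi_runs - phi_true).sum() / (len(phi_runs) * abs(phi_true))

delta = mean_relative_ae([9.0, 11.0, 10.5, 9.5], phi_true=10.0)   # -> 0.075
```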
Kernel        | K(z)
Epanechnikov  | 0.75(1 − 0.2z²)/√5 if |z| < √5; 0 otherwise
Gaussian      | (2π)^{−1/2}exp{−z²/2} for z ∈ ℝ
Triangular    | 1 − |z| if |z| < 1; 0 otherwise
Gröwe-Römisch (1992) developed some studies on the speed of convergence of kernel estimators and derived that m(n) = n^{−a} is adequate for a > 1. We will use a = 5.
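The three kernels can be written down directly; the block below checks, under the normalizations given in the table (including the 1/√5 factor for the Epanechnikov kernel), that each integrates to 1:

```python
import numpy as np

def epanechnikov(z):
    z = np.asarray(z, dtype=float)
    r = np.sqrt(5.0)
    return np.where(np.abs(z) < r, 0.75 * (1.0 - 0.2 * z**2) / r, 0.0)

def gaussian(z):
    z = np.asarray(z, dtype=float)
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def triangular(z):
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) < 1.0, 1.0 - np.abs(z), 0.0)

# each kernel should integrate to 1; check with a Riemann sum on a fine grid
grid = np.linspace(-6.0, 6.0, 120001)
dz = grid[1] - grid[0]
areas = [float((k(grid) * dz).sum()) for k in (epanechnikov, gaussian, triangular)]
```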
Monte Carlo Experiments
Table 1. Results of the use of the estimates when N(100,5) is the correct model.

        Epanechnikov      Gaussian          Triangular
n/N     10    25    100   10    25    100   10    25    100
50      0.33  0.21  0.17  0.25  0.29  0.15  0.19  0.28  0.15
100     0.18  0.28  0.14  0.19  0.13  0.11  0.15  0.09  0.08
500     0.16  0.10  0.10  0.13  0.09  0.08  0.50  0.34  0.33
1000    0.32  0.34  0.32  0.31  0.32  0.30  0.36  0.31  0.30

- The Gaussian-kernel estimate was the best only for large samples.
- The triangular-kernel estimate has a good behavior for small samples.
- The computation based on the Epanechnikov kernel was 3.87 times faster than the Gaussian-kernel alternative.
Table 2. Behavior of Δ_N(1, 0.01 | R).

        Epanechnikov      Gaussian          Triangular
n/N     10    25    100   10    25    100   10    25    100
50      0.27  0.25  0.16  0.23  0.23  0.13  0.21  0.20  0.12
100     0.19  0.17  0.10  0.30  0.29  0.25  0.27  0.26  0.23
500     0.25  0.24  0.20  0.22  0.23  0.20  0.30  0.28  0.24
1000    0.28  0.24  0.22  0.25  0.21  0.18  0.25  0.19  0.15

- The Epanechnikov-kernel estimate is not better than the rest of the estimators.
- It is remarkable that the triangular-kernel estimate is also better than the Gaussian one for small samples.
- The Gaussian-kernel estimate is better for large samples.
Table 3. Behavior of Δ_N(1, 0.05 | R).

        Epanechnikov      Gaussian          Triangular
n/N     10    25    100   10    25    100   10    25    100
50      0.27  0.24  0.14  0.25  0.22  0.14  0.22  0.22  0.14
100     0.19  0.12  0.25  0.33  0.32  0.31  0.29  0.27  0.29
500     0.27  0.25  0.29  0.23  0.28  0.27  0.29  0.26  0.22
1000    0.27  0.26  0.20  0.27  0.25  0.20  0.25  0.19  0.19

- The triangular kernel is the best general choice.
- The other kernels have a similar behavior, as a second-best choice when they are not the best alternative.
Table 4. Behavior of Δ_N(1, 0.10 | R).

        Epanechnikov      Gaussian          Triangular
n/N     10    25    100   10    25    100   10    25    100
50      0.23  0.18  0.14  0.23  0.17  0.12  0.20  0.17  0.12
100     0.19  0.17  0.11  0.31  0.31  0.31  0.30  0.28  0.27
500     0.29  0.28  0.25  0.26  0.26  0.26  0.25  0.26  0.25
1000    0.19  0.17  0.11  0.29  0.28  0.25  0.24  0.24  0.24

- The triangular kernel is the best general choice.
- The Epanechnikov kernel is the second-best choice.
The study was completed by measuring the effect of using the usually assumed model when it is not adequate.
The measure of accuracy used was
Δ(q, q′) = |φ(q) − φ(q′)| / φ(q), q = 1, 0.01, 0.05, 0.10.
Table 5 contains the values of these AE's.
Note that they are considerably larger than those associated with the estimations.
This fact supports that it is less risky to use an adequate estimate than to use an incorrect pdf.
Table 5. Values of Δ(q, q′).

Correct q   q′ used for solving the FGP
            1      0.01   0.05   0.10
1           0      1.03   0.82   0.77
0.01        1.03   0      0.65   0.59
0.05        0.82   0.65   0      0.51
0.10        0.77   0.59   0.51   0
In fact it is less risky to use an adequate estimate than
to use an incorrect pdf.