Factorized Distributed Algorithm
Download
Report
Transcript Factorized Distributed Algorithm
FDA- A scalable evolutionary
algorithm for the optimization of
ADFs
By Hossein Momeni
Factorized Distributed Algorithm
Outline
• Factorization Theorem
• FDA
• Analysis of FDA for large populations
• Boltzmann and Truncation selections
• Finite and critical population
• Numerical results
• LFDA
Page 2 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Introduction
• In a deceptive function the global optimum x=(1,…,1)
is isolated.
• Neighbors of the second best fitness value x=(0,…,0)
have large fitness value
• GAs are deceived by the fitness distribution
• Most Gas will convergence to x=(0,…,0)
Page 3 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Solutions
• Mathematical methods are suitable to optimize
deceptive functions
• Consider additively decomposed functions (ADF)
• Sj are non-overlapping substrings of X with k elements
• This class of functions is of great theoretical and
practical importance
• Optimization of an arbitrary in this space is NP complete
Page 4 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
ADFs Optimization Approaches
• Adaptive recombination
• Explicit detection of relations (kargupta&Goldberg, 97)
• Dependency trees(Baluja&Davies, 97)
• Bivariate marginal distributions (pelikan&Muhleinbein,98)
• Estimation of Distributions(Muhlenbein et all,1997)
Page 5 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
ADF
•
Definition: An additively decomposed function (ADF) is defined
by:
f ( x) f i ( xsi ) s s1 , s2 ,..., sl
si S
•
si X
For theoretical analysis, use Boltzmann Distribution
Page 6 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Gibbs or Boltzmann distribution
•
Definition: The Gibbs or Boltzmann distribution of a function f is
defined for u>=1 by
Expu f ( x)
p( x) :
Fu
•
•
•
Fu is partition function
larger function value f(x) and larger p(x)
Such a search distribution is suitable for an optimization
problem
• exponential computation
Page 7 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Reduce of B.D. computation
1) Approximate the Boltzmann distribution (simulated Annealing)
2) Look for ADFs with distribution computation in Polynomial time
•
factorize distribution into a product of marginal and
conditional probabilities (used by FDA)
Page 8 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Input sets for Factorization theorem
Definition: if S={s1,s2, …, sl} for i=1, 2,…, l then
In the decomposable graphs theory:
di histories
bi residuals
ci separators
Page 9 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Factorization Theorem
Theorem1: Let p(x) be a Boltzmann distribution on X
If
then
Page 10 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
FDAr
S0: set t=0, generate (1-r)*N>>o point randomly and r*N points (Equation 16)
S1: selection
S2: Compute
p s ( xbi xci , t )
using selected points
S3: Generate a new population p ( x, t 1)
l
s
p
( xbi xci , t )
i 1
S4: If termination criteria is met, Finish
S5: Add the best point of previous generation to generated points (elitist)
S6: Set t=t+1, Go to Step2
Page 11 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Analysis of Factorization Algorithm
The computational Complexity depends on the factorization and
population size N
• Number of function evaluations: FE=GENe*N
GENe is the number of generation till Convergence
p(x,t+1)=p(x,t)
•
•
The computational Complexity of computing N new search points is
compl(Npoi nts) l N
•
The Computational Complexity of computing probability is
l
compl(p) ( 2 ) M
si
i 1
Page 12 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Analysis of … (Contd)
•
Computation of FDA depends on:
1)
2)
3)
Number of decomposition functions (l)
Size of the defining sets (si)
Size of selected point (M)
•
An infinite population is needed to exactly computation
•
Should use a minimal population size N* in a numerical efficient FDA
•
Computation of N* is a difficult problem for any search method using
a population of points
Page 13 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
FDA-FAC
• S0: set i=1,
~
si
is non-linear sub-function
i
~
• S1: compute d i : ~
sj
j 1
•
~
S2: Select sk which has maximal overlap with di
~
and sk di
• S3: if no set is found go to step 5
• S4: Set
~
si 1 sk ,
i : i 1
if i<L go Step1
• S5: Compute the factorization using Eq. 6 with sets
Page 14 Of 47
Iran University of Science and Technology
~
si
November 2006
Factorized Distributed Algorithm
Generation of Initial Population
• Normally the initial population is generated randomly
• with ADF, initial point can be generated with this
information.
• Generate subsets
with high local fitness values
• Distribution
is an approximation of
• Conditional probabilities are computed using local fitness
functions
Page 15 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Generation of Initial Population….
•
•
The larger u, the steeper distribution
if u=1 the distribution is uniform.
•
if function Onemax(n)=∑xi then
•
FDA computes span=1 and u=10
Page 16 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Generation of Initial Population….
• if function Onemax(n)=∑xi then
• FDA computes span=1 and u=10
• There will be 10 times more 1s than 0s in the initial
population
• Such an initial population might not give a B.D.
• Only half of the population is generated by this method
• Other half is generated randomly
Page 17 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Convergence of FDA
• If points are selected base on Bol. Distribution
convergence of FDA is proved.
• The distribution ps of selected points is given by:
• If p(x,t) is B.D. then ps(x,t) is B.D.
• FDA computes new search points according to
Page 18 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
•
Theorem2
: If the initial points are distributed according to
with u>=1, then for FDA the
distribution at generation is given by
with w u.v t
Tip: B. Selection with fixed basis v>1 defines an annealing schedule
with T (t ) 1(t ln( v) ln( u))
that t is number of generation
Theorem3 remains valid for any annealing schedule with
Page 19 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Let
then base on Theorem 2 :
X opt {x1opt , x2opt ,...} be the set of optima,
•
Theorem 3(Convergence):
•
FDA with B. selection is exact simulated annealing algorithm.
•
simulated annealing is controlled by 2 parameters: N(T) and annealing
schedule
•
N can be called population size
Page 20 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Truncation Selection Vs B. selection
Numerically truncation selection is easier to implement
• With truncation threshold דthe best *דN individual are selected.
s
• Conditional probabilities of selected point is: p ( xb xc , t )
• Based on factorization theorem to generate new search points :
•
i
i
l
P( x, t 1) p s ( xbi xci , t )
i 1
•
Problem: After Truncation selection the distribution is not B.D.
l
therefore:
s
p ( x, t ) p s ( xbi xci , t )
i 1
s
p
(
x
,
t
1
)
p
( xopt , t )
opt
• With this inequality
convergence proof difficult.
Page 21 Of 47
Iran University of Science and Technology
that this makes a
November 2006
Factorized Distributed Algorithm
Theoretical Analysis for Infinite populations
•
For analysis two linear function will be investigated:
n
OneMax n ( x) xi
i 1
n
Int n ( x) 2i 1 xi
i 1
OneMax has (n+1) different fitness value which are multinomial D.
• Int has 2n different fitness value.
•
For ADFs the multinomial distribution is typical
• The distribution generated by Int is more special
• Both functions is linear, therefore can use following factorization:
•
n
p ( x, t 1) p ( xi , t )
i 1
Page 22 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
•
Theorem 4
For B. selection with basis v the probabilities distribution for
OneMax is given by:
tf ( x )
p( X , t )
•
v
(1 v t ) n
Number of generations to generate the optimum is given by:
GEN
Page 23 Of 47
ln
n
ln( v)
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
•
•
Theorem 5
For Truncation selection דwith selection intensity I דthe
marginal probability p(t) obeys for OneMax
I
p(t 1) p(t ) np (t )(1 p (t ))
n
The approximate solution of this equation is :
p(t ) 0.5(1 sin(
Where
•
t(
2
I
t arcsin( 2 p0 1))
n
arcsin( 2 p0 1))
n
I
The number of generations till convergence is given by:
Page 24 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Page 25 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Comparison Truncation & B. selection
• T.S. need more number of generation to convergence
than B.S.
• GENe is of order
for B.S. and for T.S. is
• If basis v is small (e.g. v=1.2) T.S. convergence is faster
Page 26 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
• B.S. with fixed v gives an annealing schedule of
Page 27 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
• FDA with truncation selection generates a B.D. with
annealing schedule
• The annealing schedule depends on the average fitness
and the variance
Page 28 Of 47
of the population.
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Page 29 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Page 30 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
• For Int the B.D. is concentrated around the optimum
• The selected population has a small diversity
• In finite population this cause a problem, some genes
will get fixed to wrong alleles
Page 31 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Analysis of FDA for Finite Populations
In finite population, convergence of FDA can be Probabilistic
Page 32 Of 47
Iran University of Science and Technology
November 2006
Factorized Distributed Algorithm
Analysis of FDA for Finite Populations
Cumulative fixation probability for Int(16) Truncation Selection vs.
Boltzmann selection with v=1.01
Page 33 Of 47
Iran University of Science and Technology
November 2006