Transcript Document

Recent Advances in Differential Evolution
Yong Wang
Lecturer, Ph.D.
School of Information Science and Engineering,
Central South University
[email protected]
Outline of My Talk
 Introduction to Differential Evolution
 The State-of-the-Art of Differential Evolution
 Composite Differential Evolution
 Orthogonal Crossover based Differential Evolution
 Conclusion
2
Evolutionary Algorithms
• What are evolutionary algorithms (EAs)?
 EAs are intelligent optimization and search techniques
inspired by nature
• Why evolutionary algorithms (EAs)?
f(x) of evolutionary algorithms (EAs)
• The framework
the first individual
the second individual
Population
the NPth individual
Selection
Replacement
10
Parent Set
f(x,y)
The optimal solution!
5
The optimal solution!
Crossover
0
Mutation
-5
New Solutions
Is it the optimal solution?
-10
60
x
40
y
60
40
20
20
0
0
x
3
Differential Evolution (1/2)
• Differential evolution (DE), proposed by Storn and Price in
1995, is one of the main branches of evolutionary
algorithms (EAs).
• DE includes three main operators, i.e., mutation operator,
crossover operator, and selection operator.
• Currently, DE has been successfully used in various fields.
4
Differential Evolution (2/2)
• The algorithmic framework of DE
the target vectors
Remark: mutation + crossover = trial vector generation strategy
5
The Mutation Operators
the fashion the base
vector has been selected
• rand/1
• rand/2
• best/1
• best/2
the number of the
difference vector
the base vector the scaling factor




vi ,G  xr1,G  F  ( xr 2,G  xr 3,G )
the difference vector






vi ,G  xr1,G  F  ( xr 2,G  xr 3,G )  F  ( xr 4,G  xr 5,G ) the scaling factor




vi ,G  xbest,G  F  ( xr1,G  xr 2,G )






vi ,G  xbest,G  F  ( xr1,G  xr 2,G )  F  ( xr 3,G  xr 4,G )
• current-to-best/1






vi ,G  xi ,G  F  ( xbest,G  xi ,G )  F  ( xr1,G  xr 2,G )
• current-to-rand/1






vi ,G  xi ,G  rand  ( xr1,G  xi ,G )  F  ( xr 2,G  xr 3,G )
Remark: r1, r2, r3, r4, and r5 are different indexes uniformly randomly selected
from {1,, NP} \ {i}, xbest ,G is the best individuals in the current population.
The scaling factor F plays a very important role in mutation.
6
The Characteristics of the Mutation Operators (1/3)
• rand/1




vi ,G  xr1,G  F  ( xr 2,G  xr 3,G )
• Characteristics
 rand/1 is the most commonly used mutation operator in the
literature.
 All vectors for mutation are selected from the population at
random and, consequently, it has no bias to any special
search directions and chooses new search directions in a
random manner.
 It usually demonstrates slow convergence speed and bears
stronger exploration capability.
7
The Characteristics of the Mutation Operators (2/3)
• rand/2






vi ,G  xr1,G  F  ( xr 2,G  xr 3,G )  F  ( xr 4,G  xr 5,G )
• Characteristics
 In rand/2, two difference vectors are added to the base
vector, which might lead to better perturbation than the
strategies with only one difference vector.
 It can generate more different trial vectors than the rand/1
mutation operator with respect to the same population.
 When using rand/2, the diversity of the population can be
kept, however, it has a side effect on the convergence
speed of DE.
8
The Characteristics of the Mutation Operators (3/3)
• best/1
• best/2




vi ,G  xbest,G  F  ( xr1,G  xr 2,G )






vi ,G  xbest,G  F  ( xr1,G  xr 2,G )  F  ( xr 3,G  xr 4,G )






• current-to-best/1 vi ,G  xi ,G  F  ( xbest,G  xi ,G )  F  ( xr1,G  xr 2,G )
• Characteristics
 best/1, best/2 and current-to-best/1 usually have the fast
convergence speed and perform well when solving unimodal
problems.
 They are easier to get stuck at a local optimum and thereby lead to
a premature convergence when solving multimodal problems.
 The best/1 is a degenerated case of the current-to-best/1 with the
first scaling factor F being equal to 1.
9
The Crossover Operators (1/2)
• Binomial crossover
the target vector
xi ,G  ( xi ,1,G , xi ,2,G ,
, xi ,r ,G ,
, xi , D,G )
rand1>CR rand2>CR
the trial vector
ui,G  (ui ,1,G , ui ,2,G , , ui ,r ,G , , ui , D,G )
the mutant vector
rand1≤CR rand2≤CR
vi,G  (vi ,1,G , vi ,2,G , , vi ,r ,G ,
mix
, vi , D,G )


ui , G is always different from xi ,G
10
The Crossover Operators (2/2)
• Exponential crossover
the target vector
xi,G  ( xi ,1,G ,
the trial vector
ui,G  (ui,1,G , mix, ui , L,G , , ui , L r ,G , , ui , D,G )
the mutant vector vi ,G  (vi ,1,G ,
, xi , L,G ,
, vi, L,G ,
, xi , L  r ,G ,
, vi, L  r ,G ,
, xi , D,G )
, vi, D,G )
Pr(r≥v)= CRv-1
The crossover control parameter CR plays a very important role in crossover.
11
The Characteristics of the Crossover Operators
• Characteristics
 Binomial crossover is similar to discrete crossover in genetic
algorithm.
 Exponential crossover is functionally equivalent to two-point
crossover in genetic algorithm.
 Exponential crossover has the capability in maintaining the
linkage among variables and the building block.
 Binomial crossover may destroy building block.
12
DE Variations
• By combining different mutation operators and different
crossover operators, we can obtain different DE variants.
• DE/x/y/z




DE: differential evolution
x: the fashion the base vector has been selected
y: the number of the difference vector
z: the type of the crossover operator; “bin” represents the
binomial crossover and “exp” represents the exponential
crossover
• DE/rand/1/bin, DE/rand/1/exp, DE/rand/2/bin, …
13
The Illustrative Graph of DE/rand/1/bin
the triangle denotes the trial vector
x2

ui ,G

x i ,G

v i ,G

xr 2 , G


F ( x r 2 ,G  x r 3,G )

x r1,G
base vector
0
perturbed
vectors

x r 3, G
x1
14
On Rotation Invariance (1/3)
• Why the rotation invariance is very important for
optimization algorithms
 We have no a prior knowledge about the topology structure
of the optimization problems
80
60
40
20
0
5
5
0
0
-5
-5
15
On Rotation Invariance (2/3)
• In DE, the crossover control parameter CR controls the
rotation invariance to a certain degree
CR=0.0
CR=0.5
CR=1.0
S. Das, and P. N. Suganthan. Differential evolution: A survey of the state-of-the-art.
IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 4-31, 2011.
16
On Rotation Invariance (3/3)
• current-to-rand/1 is a rotation-invariant strategy




vi ,G  xr1,G  F  ( xr 2,G  xr 3,G )
rand/1




ui ,G  xi ,G  rand  (vi ,G  xi ,G )

ui , G
arithmetic crossover





 xi ,G  rand  ( xr1,G  xi ,G )  rand  F  ( xr 2,G  xr 3,G )






vi ,G  xi ,G  rand  ( xr1,G  xi ,G )  F  ( xr 2,G  xr 3,G )
binomial
crossover/
exponential
crossover
Remark: current-to-rand/1 can be
considered as rand/1 + arithmetic
crossover, in which the crossover
control parameter CR is unnecessary
arithmetic crossover
17
Outline of My Talk
 Introduction to Differential Evolution
 The State-of-the-Art of Differential Evolution
 Composite Differential Evolution
 Orthogonal Crossover based Differential Evolution
 Conclusion
18
The Current Research Directions of DE
• The DE performance mainly depends on two components
 trial vector generation strategy (i.e., the mutation and
crossover operators)
 control parameters (i.e., the population size NP, the scaling
factor F, and the crossover control parameter CR).
• Much effort has been made to improve the performance
of DE
 Introduction of new trial vector generation strategy for
generating new solutions
 Tuning the control parameters (static/deterministic,
dynamic/adaptive, and self-adaptive)
 Hybridization of DE with other operators or methods
 Use of multiple populations (distributed DE)
19
Six Representative DE
•
jDE (self-adaptive parameters in DE, IEEE TEC, 2006, 10(6))
•
DEahcSPX (DE with adaptive hill-climbing and simplex crossover, IEEE
TEC, 2008, 12(1))
•
SaDE (DE with strategy adaptation, IEEE TEC, 2009, 13(2))
•
JADE (adaptive DE with optional external archive IEEE TEC, 2009, 13(5))
•
DEGL (DE using a neighborhood-based mutation operator, IEEE TEC,
2009, 13(3))
•
ODE (opposition-based DE, IEEE TEC, 2008, 12(1))
20
jDE
• Main motivation
 How to self-adaptively adjust the scaling factor F and the
crossover control parameters CR of DE
• Main idea
 F and CR are applied at individual level
x1,G
F1,G
CR1,G
x2,G
F2,G
CR2,G
…
…
xNP,G
FNP,G
…
CRNP,G
21
DEahcSPX (1/2)
• Main motivation
Local improvement process
(LIP) oriented LS
Crossover-based
LS (XLS)
 Incorporating local search (LS) heuristics is often very useful
in designing an effective evolutionary algorithm for global
optimization.
• Main challenges of XLS
 the length of the XLS
 the selection of individuals which undergo the XLS
 the choice of the other parents which participate in the
crossover operation
 whether deterministic or stochastic application of XLS
should be used
22
DEahcSPX (2/2)
• Main techniques
 At each generation, firstly, the best individual with other np
individuals randomly chosen from the population are
selected to participate in the simplex crossover (SPX).
 One offspring is produced and if the offspring is better than
the best individual, then it will be used to replace the best
individual.
 Afterward, DE is implemented.
adaptive hillclimbing (ahc)
23
SaDE (1/4)
• Main motivation
 At different stages of evolution, different trial vector generation
strategies coupled with different control parameter settings
may be required in order to achieve the best performance.
• Main idea
 Adaptively adjust the trial vector generation strategies and the
control parameters simultaneously by learning from their
previous experiences.
24
SaDE (2/4)
• How to adapt the trial vector generation strategy
 Use four trial vector generation strategies to construct the
strategy candidate pool
25
SaDE (3/4)
• How to adapt the trial vector generation strategy
 For each trial vector generation strategy at generation G,
SaDE records:


nk,G : the number of the trial vectors generated by the kth
strategy
nsk,G: the number of the trial vectors generated by the kth
strategy which can enter the next generation
 During the first LP generations, each trial vector generation
strategy is chose with the same probability. When the
generation number G is larger than LP, the probability, pk,G,
of using control parameter setting k is calculated as follows:
∑
=
∑
G -1
S k ,G
g = G - LP
G -1
avoid all the success rates being equal to zero
nsk , g
g = G - LP
nk , g
+ ς and pk ,G =
S k ,G
∑
4
k =1
S k ,G
26
SaDE (4/4)
• How to adapt F and CR
 the parameter F is approximated by a normal distribution with
mean value 0.5 and standard deviation 0.3, denoted by N(0.5,0.3).
 CR obeys a normal distribution with mean value CRm and standard
deviation Std=0.1, denoted by N(CRm,Std) where CRm is initialized
as 0.5.
 CRMemoryk is used to store those values with respect to the kth
strategy that have generated trial vectors successfully entering the
next generation within the previous LP generations.
 During the first LP generations, CR values with respect to kth
strategy are generated by N(0.5,0.1).
 At each generation after LP generations, the median value stored
in CRMemoryk will be calculated to overwrite CRmk. Then, CR values
can be generated according to N(CRmk,0.1) when applying the kth
strategy.
27
JADE (1/4)
• Main motivation
 The current-to-best/1 benefit from its fast convergence by
incorporating best solution information in the evolutionary
search. However, the best solution information may also
cause problems such as premature convergence due to the
resultant reduced population diversity.
 A well-designed parameter adaptation scheme is usually
beneficial to enhance the robustness of an algorithm.
How to exploit the advantages and overcome the disadvantages of the
current-to-best/1 and how to adapt F and CR during the evolution
28
JADE (2/4)
• A new mutation operator: current-to-pbest/1


p



vi ,G  xi ,G  Fi  ( xbest

x
)

F

(
x

x
i ,G
i
r1,G
r 2,G )
,G
• The characteristics of current-to-pbest/1
 Any of the top 100p% solutions can be randomly chosen
to play the role of the single best solution in DE/currentto-best.
 Recently explored inferior solutions, when compared to
the current population, provide additional information
about the promising progress direction. Denote A as the
set of archived inferior solutions and P as the current
population.

 xr 2,G is randomly chosen from the union P  A .
29
JADE (3/4)
• How to adapt CR
CRi  randni ( CR ,0.1)
The mean CR is initialized to be 0.5 and then updated at the end
of each generation as:
CR  (1  c)  CR  c  meanA ( SCR )
where c is a positive constant between 0 and 1, SCR is the set of
all successful crossover probabilities CRi at generation G. and
meanA(·) is the usual arithmetic mean.
30
JADE (4/4)
• How to adapt F
Fi  randci (  F ,0.1)
The location parameter  F of the Cauchy distribution is initialized
to be 0.5 and then updated at the end of each generation as
 F  (1  c)   F  c  meanL ( S F )
where SF is the set of all successful mutation factors in generation G
and meanL(·) is the Lehmer mean
 FS F 2
meanL ( S F ) 
 FS F
F
F
31
DEGL (1/3)
• Main motivation
 A proper tradeoff between exploration and exploitation is
necessary for the efficiency and effectiveness of a
population-based stochastic search method.
 The current-to-best/1 of DE favors exploitation only, since all
the vectors are attracted by the same best position found so
far by the entire population.
 As a result of such exploitative tendency, in many cases, the
population of DE may lose its global exploration abilities
within a relatively small number of generations.
How to balance the exploration and exploitation in the current-to-best/1
32
DEGL (2/3)
• Main idea
global mutation model
local neighborhood model
how to define the
neighborhood
Remark: w controls the balance between the exploration and exploitation
33
DEGL (3/3)
• How to set the parameter w
 increasing weight factor


linear increment
exponential increment
 random weight factor
 self-adaptive weight factor
the weight factor associated with the best
individual of the population
34
ODE (1/3)
• Main motivation
 All population-based optimization algorithms, no exception
for DE, suffer from long computational times because of
their evolutionary/stochastic nature.
 In the absence of a priori information about the solution, we
usually start with random guesses. The computation time, is
related to the distance of these initial guesses from the
optimal solution.
• Main idea
 By using the current solution and its opposite solution, the
convergence speed of DE can be enhanced (oppositionbased learning).
the current solution
a
the optimal solution
(a+b)/2
the opposite solution
35
b
ODE (2/3)
• Basic definitions of opposition-based learning
 Opposite number: Let x  [a, b] be a real number. The
opposite number x is defined as x  a  b  x .
 Opposite solution: Let x  ( x1 , , xD ) be a solution in Ddimensional space, where x1 , , xD  and xi  [ai , bi ] . The
opposite point is completely defined by its components
xi  ai  bi  xi .
 Opposition-Based Comparison: Let x  ( x1 , , xD ) be a
solution in D-dimensional space and f ( x ) its objective
function value. According to the definition of the opposite
solution, x  ( x1 , , xD ) is the opposite solution of x  ( x1 , , xD ).
If f ( x)  f ( x ) , then x can be replaced with x.
36
ODE (3/3)
• Opposition-based population initialization
• Opposition-based generation jumping
37
Outline of My Talk
 Introduction to Differential Evolution
 The State-of-the-Art of Differential Evolution
 Composite Differential Evolution
 Orthogonal Crossover based Differential Evolution
 Conclusion
38
Composite Differential Evolution (CoDE)
• Motivation
 During the last decade, DE researchers have suggested
many empirical guidelines for choosing trial vector
generation strategies and control parameter settings.


some trial vector generation strategies are suitable for the
global search and some others are useful for rotated problems
some control parameter settings can speed up the
convergence and some other settings are effective for
separable functions
 However, these experiences have not yet systematically
exploited in DE algorithm design.
whether the performance of DE can be improved by combining
several effective trial vector generation strategies with some
suitable control parameter settings
39
Composite Differential Evolution (CoDE)
• Main idea
DE/rand/1/bin
F=1.0, CR=0.1
DE/rand/2/bin
F=1.0, CR=0.9
DE/current-to-rand/1
F=0.8, CR=0.2
strategy candidate pool
parameter candidate pool
Y. Wang, Z. Cai, and Q. Zhang, “Differential evolution with composite trial
vector generation strategies and control parameters.” IEEE Transactions on
Evolutionary Computation, vol. 15, no. 1, pp. 55-66, 2011.
40
Composite Differential Evolution (CoDE)
• In general, we expect that the chosen trial vector
generation strategies and control parameter settings
show distinct advantages.
• Thus, they can be effectively combined to solve different
kinds of problems.
41
Composite Differential Evolution (CoDE)
• Basic properties of the strategy candidate pool
 DE/rand/1/bin has stronger global exploration ability, and it
is effective when solving multimodal problems.
 DE/rand/2/bin may lead to better permutation than
DE/rand/1/bin, since the former uses two difference vectors.
 DE/current-to-rand/1 is rotation-invariant and suitable for
rotated problems.
42
Composite Differential Evolution (CoDE)
• Basic properties of the parameter candidate pool
 A large value of F can make the mutant vectors distribute widely in
the search space and can increase the population diversity.
 A low value of F makes the search focus on neighborhoods of the
current solutions, and thus it can speed up the convergence.
 A large value of CR can make the trial vector very different from the
target vector. Therefore, the diversity of the offspring population
can be encouraged.
 A small value of CR is very suitable for separable problems, since
in this case the trial vector may be different from the target vector
by only one parameter and, as a result, each parameter is
optimized independently.
43
Composite Differential Evolution (CoDE)
• Basic properties of the parameter candidate pool
 When combined with the three strategies, [F=1.0,CR=0.1] is
for dealing with separable problems.
 [F=1.0,CR=0.9] is mainly to maintain the population diversity
and to make the three strategies powerful in global
exploration.
 [F=0.8,CR=0.2] encourages the exploitation of the three
strategies in the search space and thus accelerates the
convergence speed of the population.
Conclusion: the selected strategies and parameter settings exhibit
distinct advantages and, therefore, they can complement one
another for solving optimization problems of different characteristics.
44
Composite Differential Evolution (CoDE)
• The main framework
the first trial
vector
target vector xi,G
the second trial
vector
the best trial vector
the third trial
vector
combining each trial vector generation
strategies with one control parameter
setting randomly selected
comparison
45
Composite Differential Evolution (CoDE)
• The experimental results
 25 test functions proposed in the IEEE CEC2005 were used
to study the performance of the proposed CoDE




unimodal functions F1–F5
basic multimodal functions F6–F12
expanded multimodal functions F13–F14
hybrid composition functions F15–F25
 The average and standard deviation of the function error
value | f ( xbest )  f ( x* ) | were recorded for measuring the
performance of each algorithm
the 
bestFor
solution
the
eachfound
test by
function,
25theindependent
were conducted with
global optimumruns
of the
algorithm
a run
test function
300,000infunction
evaluations (FES)
as the termination criterion
 Wilcoxon’s rank sum test at a 0.05 significance level was
conducted on the experimental results
46
Composite Differential Evolution (CoDE)
• The experimental results
 Comparison with four state-of-the-art DE
“-”, “+”, and “≈” denote that the performance of the corresponding algorithm is
worse than, better than, and similar to that of CoDE, respectively.
basic multimodal functions
CoDE is the best
expanded multimodal functions
hybrid composition functions
unimodal functions
CoDE is the second best
Overall, CoDE is better than the four competitors.
47
Composite Differential Evolution (CoDE)
• The experimental results
 Comparison with CLPSO, CMA-ES, and GL-25
“-”, “+”, and “≈” denote that the performance of the corresponding algorithm is
worse than, better than, and similar to that of CoDE, respectively.
Overall, CoDE significantly outperforms CLPSO, CMA-ES, and GL-25.
48
Composite Differential Evolution (CoDE)
• The experimental results
 Random selection of the control parameter settings (CoDE)
 Adaptive selection of the control parameter settings
(adaptive CoDE)
 Adaptive CoDE VS CoDE
the adaptive CoDE outperforms CoDE on one unimodal function
CoDE wins the adaptive CoDE on another unimodal function
CoDE wins on two hybrid composition functions
Overall, CoDE is slightly better than the adaptive CoDE.
49
Outline of My Talk
 Introduction to Differential Evolution
 The State-of-the-Art of Differential Evolution
 Composite Differential Evolution
 Orthogonal Crossover based Differential Evolution
 Conclusion
50
Orthogonal Crossover based Differential Evolution
• Motivation
 The crossover operators of DE can only generate a vertex
of a hyper-rectangle defined by the mutant and target
vectors.
 Therefore, the search ability of DE may be limited.
x2
the triangle denotes the trial vector

x i ,G
whether the search ability of DE can
be enhanced by effectively probing
the hyper-rectangle defined by the
mutant and target vectors

v i ,G

xr 2 , G


F ( x r 2 ,G  x r 3,G )

x r1,G
base vector
0
perturbed
vectors

x r 3, G
x1
51
Orthogonal Crossover based Differential Evolution
• Orthogonal crossover (OX) can make a systematic and
rational search in a region defined by the parent
solutions.
• OX is based on orthogonal design.
• Orthogonal design: an example
Factors (K)
Levels
(Q)
the temperature
the amount of fertilizer
the pH value of the soil
20 ℃
100 g/m2
6
25 ℃
150 g/m2
7
30 ℃
200 g/m2
8
The number of all the experiments is QK
the main aim of orthogonal design is to choose several representative
combinations
52
Orthogonal Crossover based Differential Evolution
• How to implement the orthogonal design
 Orthogonal array: An orthogonal array for K factors with Q
levels and M combinations is often denoted by LM(QK).
a level combination
a level
a factor
 The orthogonality of an orthogonal array means that:


each level of the factor occurs the same number of times in
each column
each possible level combination of any two given factors
occurs the same number of times in the array.
53
Orthogonal Crossover based Differential Evolution
• Quantization orthogonal crossover (QOX)
 Step 1: quantize the solution space defined by two parents
into a finite number of points

e  (1.0, 3.0)

g  (3.0,1.0)
x1
[1.0, 3.0]
x2
[1.0, 3.0]
(1.0, 2.0, 3.0) (1.0, 2.0, 3.0)

e
x2
the number
of variables
3.0
D
2.0
Q
1.0

g
0.0
1.0
2.0
3.0
x1
the number
of levels
Y. W. Leung and Y. Wang, “An orthogonal genetic algorithm with quantization
for global numerical optimization,” IEEE Transactions on Evolutionary
Computation, vol. 5, no. 1, pp. 41-53, 2001.
54
Orthogonal Crossover based Differential Evolution
• Quantization orthogonal crossover (QOX)
 Step 2: select a small, but representative sample of points
as the potential offspring by orthogonal design


If D≤K, the first D columns of LM(QK) can be used directly
If D>K, the decision vector will be divided into K subvectors

 e = (2.0, 5.0,1.0, 0.0, 8.0 , 3.0)
e  (1.0, 3.0)
 g = (4.0,3.0,2.0, 6.0, 2.0, 5.0)
g  (3.0,1x.0) x x x x x
x1
D=2
1
x2
H1
2
3
4
5
3.0
2.0
6
1.0
H2
H3

g
H4
0.0
2.0 3.0
3.0 4.0
4.0 5.0

e
x2
K=4
2.0
5.0
8.0
1.0
2.0
3.0
x1
55
Orthogonal Crossover based Differential Evolution
• We proposed a generic framework for using QOX in DE
variants
x1,G
xk , G
vk ,G
LM(QK)
x2,G
uk _1,G
uk _ 2,G
xNP,G
comparison
uk ,G
uk _ M , G
Remark: Our framework uses QOX to complement binomial crossover or
exponential crossover for searching some promising regions in the search space.
Y. Wang, Z. Cai, and Q. Zhang, “Enhancing the search ability of differential
evolution through orthogonal crossover,” Information Sciences, vol. 185, no. 1,
pp. 153-177, 2012.
56
Orthogonal Crossover based Differential Evolution
• An instantiation of our framework
an improvement
 DE/rand/1/bin + QOX = OXDE

e
x2
3.0
2.0
1.0

g
0.0
1.0
2.0
3.0
x1
• Our framework can be easily generalized to other DE
variants by replacing DE/rand/1/bin with other DE variants.
57
Orthogonal Crossover based Differential Evolution
• The experimental results
 A suite of 24 test instances is used for our experimental
studies


the first 10 test instances are widely used in the evolutionary
computation community
the other 14 test instances are the first 14 test instances
designed for the IEEE CEC2005
 Parameter settings

NP=D=30, F=0.9, CR=0.9, and FESmax= 10,000×D
 The average and standard deviation of the function error
value | f ( xbest )  f ( x* ) | were recorded for measuring the
performance of each algorithm
 For each test function, 50 independent runs were conducted
 The t-test at a 0.05 significance level has been used in
58
the global optimum of the
the best solution found by the
comparison
test function
algorithm in a run
Orthogonal Crossover based Differential Evolution
• The experimental results
successful condition
 How to measure the successful run


*
A run is successful if | f ( xbest )  f ( x ) | 
-2
The parameter  is set to 10 for test functions F6-F14,
and 10-6 for the rest of the test functions
 How to measure the convergence speed

The mean and standard derivation of FESs among 50
independent runs are used to measure the convergence
speed of an algorithm
In a successful run, FESs is the number of FES needed
for reaching successful condition
In an unsuccessful run, FESs is set to FESmax
59
Orthogonal Crossover based Differential Evolution
• The experimental results
 OXDE VS DE/rand/1/bin
“-”, “+”, and “≈” denote that the performance of the corresponding algorithm is
worse than, better than, and similar to that of OXDE, respectively.



OXDE can achieve at least one successful run on 11 test
functions
DE/rand/1/bin can achieve at least one successful run on 7 test
functions
OXDE has a faster convergence speed on these 11 test
instances.
OXDE certainly outperforms DE/rand/1/bin in terms of both the solution
quality and the convergence speed.
60
Orthogonal Crossover based Differential Evolution
• The experimental results
 OXDE VS DE/rand/1/bin

Effect of population size (NP=50, 100, 200, and 300)
“-”, “+”, and “≈” denote that the performance of the corresponding algorithm is
worse than, better than, and similar to that of OXDE, respectively.
OXDE performs significantly better than DE/rand/1/bin with different
population sizes.
61
Orthogonal Crossover based Differential Evolution
• The experimental results
 OXDE VS DE/rand/1/bin

Effect of population size (NP=50, 100, 200, and 300)
Fgrw (NP=100)
F5 (NP=200)
62
Orthogonal Crossover based Differential Evolution
• The experimental results
 OXDE VS DE/rand/1/bin

Effect of the number of variables (D=10, 50, 100, and 200)
“-”, “+”, and “≈” denote that the performance of the corresponding algorithm is
worse than, better than, and similar to that of OXDE, respectively.
The advantage of OXDE over DE/rand/1/bin increases as the number
of variables increases.
63
Orthogonal Crossover based Differential Evolution
• The experimental results
 OXDE VS DE/rand/1/bin

Effect of the number of variables (D=10, 50, 100, and 200)
Fack (D=100)
Fsch (D=200)
64
Orthogonal Crossover based Differential Evolution
• The experimental results
 Runtime complexity of OXDE
65
Orthogonal Crossover based Differential Evolution
• The experimental results
 Can our framework improve other DE variants?


Our framework can greatly improve the performance of
DEahcSPX, DE/rand/1/exp, DE/rand/2/exp, and DE/rand/2/bin
Our framework can also improve the performance of jDE,
SaDE, and JADE to a certain degree
our framework could be an effective way to improve
the performance of other DE variants.
66
Orthogonal Crossover based Differential Evolution
• The experimental results
 Comparison with opposition-based DE (ODE)
“-”, “+”, and “≈” denote that the performance of the corresponding algorithm is
worse than, better than, and similar to that of OXDE, respectively.
OXDE performs better than ODE.
67
Orthogonal Crossover based Differential Evolution
• The experimental results
 Comparison with other state-of-the-art EAs
“-”, “+”, and “≈” denote that the performance of the corresponding algorithm is
worse than, better than, and similar to that of OXDE, respectively.
OXDE is a generally good global function optimizer.
68
Orthogonal Crossover based Differential Evolution
• The experimental results
 Orthogonal crossover VS uniformly random sampling and
Halton sampling
OXDE is able to exhibit better performance than HSDE in nearly all test instances;
URSDE shows better performance than OXDE in four test instances
OXDE outperforms URSDE in eight test instances
the same level of improvement cannot be achieved via
additional sampling strategies
69
Outline of My Talk
 Introduction to Differential Evolution
 The State-of-the-Art of Differential Evolution
 Composite Differential Evolution
 Orthogonal Crossover based Differential Evolution
 Conclusion
70
Conclusion
• We have demonstrated that the experiences and
knowledge obtained from the researchers can be
exploited to improve the performance of DE significantly
for the first time.
• We have verified that the search ability of DE can be
enhanced by effectively probing the hyper-rectangle
defined by the mutant and target vectors for the first time.
The source codes of CoDE and OXDE can be downloaded from the
following URL: http://deptauto.csu.edu.cn/staffmember/YongWang.htm
71