Slides for Introduction to Stochastic Search
and Optimization (ISSO) by J. C. Spall
CHAPTER 9
EVOLUTIONARY COMPUTATION I:
GENETIC ALGORITHMS
•Organization of chapter in ISSO
–Introduction and history
–Coding of θ
–Standard GA operations
–Steps of basic GA algorithm
–Extensions to basic GA algorithm
–Numerical examples
Background: Evolutionary Computation
• Evolutionary computation (EC) methods are based on
population of solutions
– Each iteration involves propagating all elements of the
population
– Each member of population (“chromosome”) corresponds
to one value of θ
• Genetic algorithms (GAs) are most popular form of EC
• Early work in 1950s and 1960s; influential 1975 book by
John Holland laid foundation for modern implementations
• Dramatic increase in activity in mid-1980s
• Population-based structure well suited to parallel
processing
– But infeasible in some real-time applications
Background: EC (cont’d)
• Motivation for EC: Evolution seems to work well in
nature…perhaps it can be used in optimization
• Three main types of EC
–Genetic Algorithms (Chap. 9 of ISSO)
–Evolution Strategies (Chap. 10)
–Evolutionary Programming (Chap. 10)
[Figure: Prototype EC Method, a cycle from Initial Population through
Selection, Reproduction, and Mutation that feeds the next iteration
(generation)]
• Many other types of EC exist (ant colony, particle swarm,
differential evolution, etc.)
Desirable Performance for GA with
Population of 12 Candidate Solutions
Role of GAs in Global Problems
• GA largely motivated for global search/optimization problems
– Global problem generally very difficult
– GAs (and related) have long history of success in global
problems
– Some global problems essentially impossible to solve
• Much solid research and applications with GAs
• Unfortunately, more misrepresentations, dubious claims, and
“hype” than other methods. For example, GA software ads:
– “…can handle the most complex problems, including
problems unsolvable by any other method.”
– “…uses GAs to solve any optimization problem!”
• No clear indication of problem class(es) for which GAs are
superior to other methods
Fitness Function and Coding of θ
• Need to define “fitness function” to be maximized
• Fitness function may depend on θ or on coded form of θ
• Common choice is −L(θ) when algorithm works directly
with θ
• Other choices desirable in some applications
– Add positive constant to ensure fitness always ≥ 0
– Rescaling to avoid extreme values that can cause
instabilities due to selection step
• Coding of θ: Binary bit and real-number (floating point)
representations of θ are most common forms
– Bit representation applies to each element of θ for each of
the members of the population (e.g., θ → [0 1 1 0…1 0]); see
the coding sketch below
– Real-number “coding” (i.e., no coding of θ) becoming popular
due to effectiveness in applications
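As a concrete illustration, here is a minimal Python sketch of bit coding for one element of θ and of shifting −L(θ) by a positive constant so fitness stays nonnegative. The interval [lo, hi], the resolution n_bits, and the shift value are assumptions for the sketch, not values from ISSO.
```python
# Hedged sketch: bit coding of one element of theta, and shifting -L(theta)
# so fitness stays >= 0. [lo, hi], n_bits, and shift are illustrative.

def encode(theta_i, lo, hi, n_bits=10):
    """Map a scalar theta_i in [lo, hi] to a list of n_bits binary digits."""
    frac = (theta_i - lo) / (hi - lo)
    level = round(frac * (2**n_bits - 1))
    return [int(b) for b in format(level, f"0{n_bits}b")]

def decode(bits, lo, hi):
    """Invert encode(): bit list back to a scalar in [lo, hi]."""
    level = int("".join(map(str, bits)), 2)
    return lo + level / (2**len(bits) - 1) * (hi - lo)

def fitness(loss_value, shift=100.0):
    """Fitness to maximize: -L(theta) plus a constant chosen so that
    fitness >= 0 on the domain (assumes L <= shift there)."""
    return shift - loss_value

bits = encode(0.37, 0.0, 1.0)
print(bits, decode(bits, 0.0, 1.0))  # decode error is about (hi - lo)/2**n_bits
```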
Standard GA Operations
• Selection is the mechanism by which the “parents” are
chosen for producing offspring to be passed into next
generation
– Selection tends to pick best population elements as parents
• Elitism passes best chromosome(s) to next generation
intact
– Elite chromosomes also eligible for selection as parents
– Inclusion of elitism critical to practical performance of GA
(Holland’s original formulation did not include elitism)
• Crossover takes parent-pairs from selection step and
creates offspring
• Mutation makes “slight” random modifications to some or
all of the offspring in next generation
• Selection, crossover, and mutation discussed below,
followed by the basic steps of a GA….
Selection
• Parent selection methods based on probability of
selection being increasing function of fitness
• Roulette-wheel selection is common method
– Probability an individual is selected is equal to its fitness
divided by the total fitness in the population
• Problem: Selection probability highly dependent on
units and scaling for fitness function
– May cause premature convergence to local optima
• Rank selection and tournament selection methods
reduce sensitivity to choice of fitness function
– More robust: Only compare which chromosomes are
better, not relative magnitudes of fitness functions
– Often produce better practical results (see, e.g., examples
in Sect. 9.7 of ISSO); both selection rules are sketched below
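The two selection rules above admit a short Python sketch; the tournament size k = 2 and the toy population below are assumptions for illustration, not from ISSO.
```python
# Sketch (not ISSO's code) of the two selection rules above: roulette-wheel
# selection, whose probabilities scale with raw fitness, and tournament
# selection, which uses fitness comparisons only. k = 2 is an assumed value.
import numpy as np

rng = np.random.default_rng(0)

def roulette_select(population, fitnesses):
    """Pick one parent with probability fitness / total fitness.
    Requires fitnesses >= 0; sensitive to units/scaling, as noted above."""
    probs = np.asarray(fitnesses, dtype=float)
    probs /= probs.sum()
    return population[rng.choice(len(population), p=probs)]

def tournament_select(population, fitnesses, k=2):
    """Draw k chromosomes at random; the fittest of the k becomes a parent.
    Only comparisons matter, so fitness scaling has no effect."""
    idx = rng.choice(len(population), size=k, replace=False)
    return population[max(idx, key=lambda i: fitnesses[i])]

pop = ["A", "B", "C", "D"]
fit = [1.0, 2.0, 3.0, 4.0]
print(roulette_select(pop, fit), tournament_select(pop, fit))
```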
Crossover
• Crossover provides means of “mixing” two parents (from
selection step) to provide two offspring
• For given parents, crossover occurs with specified
probability < 1
– Else, parents placed directly into next generation (clones)
• Examples of crossover operator (both sketched in code below):
A. One-point crossover
B. Two-point crossover
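A minimal Python sketch of the two operators, assuming bit-coded parents; drawing the splice points uniformly at random is an illustrative choice, not a prescription from ISSO.
```python
# Sketch of the two crossover operators named above for bit-coded parents.
import random

def one_point_crossover(p1, p2):
    """Swap the tails of two parents after one random splice point."""
    c = random.randint(1, len(p1) - 1)
    return p1[:c] + p2[c:], p2[:c] + p1[c:]

def two_point_crossover(p1, p2):
    """Swap the middle segments between two random splice points."""
    a, b = sorted(random.sample(range(1, len(p1)), 2))
    return (p1[:a] + p2[a:b] + p1[b:],
            p2[:a] + p1[a:b] + p2[b:])

p1, p2 = [0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]
print(one_point_crossover(p1, p2))
print(two_point_crossover(p1, p2))
```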
Mutation
• Mutation operator introduces spontaneous variability (as in
random search algorithms)
• Mutation generally makes only small changes to solution
• Bit-based coding and real (floating point) coding require
different types of mutation
– Bit-based mutation generally involves “flipping” bit(s)
– Real-based mutation often involves adding small (Monte
Carlo) random vector to chromosomes
• Mutation in both codings is sketched in the example below:
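A short Python sketch of both mutation styles; the per-bit probability pm and the Gaussian step size sigma are assumed illustrative values, not ISSO prescriptions.
```python
# Sketch of both mutation styles described above. pm and sigma are
# illustrative parameter choices.
import numpy as np

rng = np.random.default_rng(0)

def bit_mutation(bits, pm=0.01):
    """Flip each bit independently with small probability pm."""
    return [1 - b if rng.random() < pm else b for b in bits]

def real_mutation(theta, sigma=0.1):
    """Add a small (Monte Carlo) Gaussian random vector to a
    real-coded chromosome."""
    theta = np.asarray(theta, dtype=float)
    return theta + sigma * rng.standard_normal(theta.shape)

print(bit_mutation([0, 1, 1, 0, 1], pm=0.3))   # e.g., one element flipped
print(real_mutation([0.5, -1.2, 3.0]))
```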
Essential Steps of Basic GA
(Noise-Free Measurements)
Step 0 (initialization) Randomly generate initial population
of N (say) chromosomes and evaluate fitness function.
Step 1 (parent selection) Set Ne = 0 if elitism strategy is not
used; 0 < Ne < N otherwise. Select with replacement
N − Ne parents from full population.
Step 2 (crossover) For each pair of parents identified in step
1, perform crossover on parents at a randomly chosen
splice point (or points if using multi-point crossover) with
probability Pc.
Essential Steps of GA (cont’d)
Step 3 (replacement and mutation) Replace the non-elite
N − Ne chromosomes with the current population of
offspring from step 2. Perform mutation on the bits with a
small probability Pm.
Step 4 (fitness and end test) Compute the fitness values for
the new population of N chromosomes. Terminate the
algorithm if stopping criterion or budget of fitness function
evaluations is met; else return to step 1.
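Putting steps 0–4 together, here is a hedged end-to-end Python sketch on a toy real-coded problem with fitness −L(θ) for L(θ) = ||θ||². The operator choices (tournament selection, one-point crossover, Gaussian mutation) and all parameter values are assumptions for illustration, not prescriptions from ISSO.
```python
# End-to-end sketch of steps 0-4 on a toy problem: maximize -L(theta)
# with L(theta) = ||theta||^2 over real-coded chromosomes.
import numpy as np

rng = np.random.default_rng(1)
N, Ne, p = 20, 2, 5               # population size, number of elites, dimension
Pc, Pm, sigma = 0.8, 0.1, 0.3     # crossover prob., mutation prob., step size

def fitness(theta):
    return -np.sum(theta**2)      # -L(theta); maximized at theta = 0

pop = rng.uniform(-5, 5, size=(N, p))              # step 0: initial population
for generation in range(100):
    fits = np.array([fitness(x) for x in pop])
    elite = pop[np.argsort(fits)[-Ne:]]            # elites pass intact

    def tournament():                              # step 1: parent selection
        i, j = rng.integers(N, size=2)
        return pop[i] if fits[i] > fits[j] else pop[j]

    children = []
    while len(children) < N - Ne:                  # step 2: crossover
        a, b = tournament().copy(), tournament().copy()
        if rng.random() < Pc:
            c = rng.integers(1, p)                 # random splice point
            a[c:], b[c:] = b[c:].copy(), a[c:].copy()
        children.extend([a, b])
    children = np.array(children[:N - Ne])

    mask = rng.random(children.shape) < Pm         # step 3: mutation
    children[mask] += sigma * rng.standard_normal(mask.sum())

    pop = np.vstack([elite, children])             # step 4: new population

best = max(pop, key=fitness)
print(best, fitness(best))
```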
Examples of GAs in ISSO
• Section 9.7 of ISSO includes several numerical examples
– Usual caveats regarding caution in drawing general
inferences; summary of examples below
• Example 9.4: unimodal function with p = 10
– GA with real coding outperforms GA with bit coding w/
noise-free loss measurements; real and bit coding similar with
noisy measurements
• Example 9.5: multimodal function with p = 2 and noise-free
loss measurements
– Plot of function on next slide
– GA outperforms SA with injected randomness
• Example 9.6: issues with traveling salesperson problem
– Fundamental issue is to ensure legal tours (one common
remedy is sketched below)
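One common way to guarantee legal tours, sketched below under the assumption of permutation coding, is a permutation-preserving operator such as order crossover (OX); this is a standard technique, not necessarily the remedy used in Example 9.6.
```python
# Naive crossover on city lists can repeat or skip cities. Order crossover
# (OX) keeps tours legal by construction: a standard technique sketch.
import random

def order_crossover(p1, p2):
    """Copy a random slice from parent 1, then fill the remaining cities
    in the order they appear in parent 2; the child visits every city
    exactly once, so the tour stays legal."""
    n = len(p1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = p1[a:b]                  # inherited slice from parent 1
    fill = [city for city in p2 if city not in child]
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

print(order_crossover([0, 1, 2, 3, 4, 5], [5, 3, 0, 4, 2, 1]))
```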
Loss Function for Example 9.5