Transcript Week10

Topic 10
Evolutionary Programming, Genetic
Algorithms
•
•
•
•
•
•
•
•
•
Intelligence and Evolution
Evolution as Optimisation
Sexual Reproduction
Evolutionary Computation
Genetic Algorithms
Selection, Recombination & Mutation
GA Process
Example: The Travelling Salesman Problem
Advantages/Disadvantages of Genetic Algorithms
Reading:
Champandard Chapter 32
Link to Evolutionary Programming on Website
Link to Genetic Algorithms on Website
ICT219
1
Intelligence and Evolution
• One way of understanding intelligence is as the capability of a creature
to adapt itself to an ever-changing environment
• We normally think of adaptation as changes in the characteristics
(including behaviours) of a single animal in response to experiences
over its history
• But adaptation is also change in the characteristics of a species, over
the generations, in response to environmental change
• An individual creature is in competition with other individuals of the
same species for resources, mates etc.
• There is also rivalry from other species which may be a direct (predator)
or indirect (food, water, land, etc.) threat
• In nature, evolution operates on populations of organisms, ensuring by
natural selection that characteristics that serve the members well tend
to be passed on to the next generation, while those that don’t die out
ICT219
2
Evolution as Optimisation
• Evolution can be seen as a process leading to the optimisation of a
population’s ability to survive and thus reproduce in a specific
environment.
• Evolutionary fitness - the measure of the ability to respond adequately
to the environment, is the quantity that is actually optimised in natural
life
• Consider a normal population of rabbits. Some rabbits are naturally
faster than others. Any characteristic has a range of variation that is due
to i) sexual reproduction and ii) mutation
• We may say that the faster rabbits possess superior fitness, since they
have a greater chance of avoiding foxes, surviving and then breeding
• If two parents have superior fitness, there is a good chance that a
combination of their genes will produce an offspring with even higher
fitness. We say that there is crossover between the parents’ genes
• Over successive generations, the entire population of rabbits tends to
become faster to meet their environment challenges in the face of foxes
Sexual Reproduction
• The key to understanding evolution in nature lies in the basic biology
of reproduction
• The chromosome is the basic carrier of the genes, which are the units
of the genetic code that control an individual’s characteristics. Each
gene can take on one of a number of possible forms, called an allele
• An allele is like the value of a variable, and represents the effect that a
gene will have on the physical makeup of a body
• An individual’s particular sequence of alleles is called the genotype. It
determines the expression of characteristics in the individual’s body,
called the phenotype
• In humans, most cells contain 23 pairs of chromosomes. But
reproductive cells (spermatozoa and ova) contain 23 single
chromosomes, because they must merge with their opposite number
to produce a new offspring
• During fertilization of the ova byICT219
the sperm, the chromosomes from4
each recombine to form the 23 pairs of the new individual
Sexual Reproduction
Image: Osvego City School District Regents Exam Prep Center
• Selection operates as survival and choice of mates between parents
• Recombination of genes is the mechanism that generates the next
generation’s characteristics
• Sometimes random copying errors, called mutations, occur during the
recombination process. These are also important because they lead to
new characteristics, usually useless, occasionally adaptive
An albino is a common mutant
Evolutionary Computation
• Evolutionary computation simulates evolution on a computer. The
result of such a simulation is a series of optimisation algorithms,
usually based on a simple set of characteristics – the equivalent of
a genome
• Recall that optimisation iteratively improves the quality of solutions
to some problem until an optimal (or at least feasible) solution is
found
• Evolutionary computation is an umbrella term that includes genetic
algorithms (Holland, 1975), evolution strategies (Schwefel, 1981)
genetic programming (Koza, 1994) and other methods
• A-life researchers frequently experiment with populations of
simulated organisms put into artificial competition and subjected to
the laws of natural selection ICT219
7
Evolutionary Computation
“Standard Model”
evolution process
phenotype
convert
(a genotype)
apply
alleles
problem
feedback
(fitness)
population of individuals, each with its own
genotype
the genome
(data structure)
for this species
ICT219
8
Genetic Algorithms
• Genetic algorithms dispense with phenotypes altogether,
and evolve the genotypes directly
• This is more efficient because no conversion is needed –
the genotypes are applied directly to the problem at hand
• Usually the elements can be specially designed to make the
process work faster and more efficiently than real evolution
ICT219
Image: Michael Goodin
9
Genetic Algorithms
• Genetic algorithms were introduced by John Holland (1975) with
the aim of making a computer do what nature does - find good
combinations of characteristics, blindly
• He was concerned with algorithms that manipulate strings of binary
digits – an artificial “chromosome”
• Each artificial chromosome consists of a number of “genes”; in the
simplest case, each gene may have an “allele” of 0 or 1:
• Two mechanisms link a GA to the problem it is solving:
• Encoding is how the bits control the characteristics (incl. behaviour)
of the system in the world
• Evaluation is working out the fitness conferred by an artificial
chromosome by testing in the world
ICT219
10
Selection, Recombination & Mutation
• Populations of artificial chromosomes will be generated over a
number of generations
• This involves selection, recombination and mutation
• The chromosomes’ fitness is used to select them in pairs for mating
• As recombination takes place, a crossover operator exchanges
parts of the two single chromosomes from the mated individuals
• A mutation operator flips genes value in some randomly chosen
location of the chromosome at some rate set by a parameter
• Chromosomes can actually be arbitrarily complex data structures:
- bit strings (1011010000010101)
- real numbers (43.2, -33.1, ... , 0.0, 89.2)
- set permutations (E11, E3, E7, ... , E1, E15)
- lists of rules (R1, R2, R3, ... , R22, R23)
- program elements (genetic programming)
- really, any data structure ICT219
11
Basic Genetic Algorithm
• Step 1. Represent the problem domain as a chromosome of fixed
length n, choose the size of population of chromosomes N, a
crossover probability pc and a mutation probability pm
• Step 2. Define a fitness function f to measure the performance
(fitness) of an individual chromosome in the problem domain. The
fitness function is the basis for selecting chromosomes that will be
mated during reproduction
• Step 3: Randomly generate an initial population of chromosomes of
size N: x1, x1, ... ,xN
• Step 4: Calculate the fitness of each individual chromosome:
f(x1), f(x2), ... , f(x)N
• Step 5: Select a pair of chromosomes for mating from the current
population. Parent chromosomes are selected with a probability
related to their fitness (eg by Roulette wheel, tournament, ranking,
random walk or remainder methods)
ICT219
12
Basic Genetic Algorithm
• Step 6. From those parents, create a pair of offspring chromosomes
by applying the genetic operators crossover and mutation
• Step 7. Place the created offspring chromosomes in the new
population
• Step 8: Repeat from Step 5 until the new population size equals the
old population size
• Step 9: Replace the initial (parent) chromosome population with the
new (offspring) population
• Step 10: Go to Step 4, and repeat the process until some termination
(or optimisation) criterion is satisfied
• Note: the best individuals from the final population should now
perform above the criterion level
• Iterative process: each iteration is a generation. Typical run is
anywhere from 50 to > 500 iterations
ICT219
13
Genetic Algorithm Process
ICT219
14
Example: Travelling Salesman Problem
• Find a tour of a given set of cities so that 1) each city is visited only
once 2) the total distance traveled is minimised
• Representation is an ordered list of city numbers (known as order-based
GA):
1) London
3) Dunedin
5) Beijing
7) Tokyo
2) Venice
4) Singapore 6) Phoenix
8) Victoria
CityList1
CityList2
(3 5 7 2 1 6 4 8)
(2 5 7 6 8 1 3 4)
• Recombination uses i) crossover and ii) mutation by inversion:
Parent1
Parent2
(3 5 7 2 1 6 4 8)
(2 5 7 6 8 1 3 4)
Child
(2 5 7 2 1 6 3 4)
ICT219
crossover
15
Example: Travelling Salesman Problem
• Mutation involves reordering of the list: eg flip elements 3 and 6
•
*
Before:
After :
*
(2 5 7 2 1 6 3 4)
(2 5 6 2 1 7 3 4)
mutation
• Now consider 30 unnamed cities on a grid, each with (x,y) coordinates
ICT219
16
Example: Travelling Salesman Problem
ICT219
17
Example: Travelling Salesman Problem
ICT219
18
Example: Travelling Salesman Problem
ICT219
19
The Travelling Salesman Problem
In 1987, Martin Groetschel and Olaf Holland found an optimal tour of 666
interesting places in the world. Source: http://www.tsp.gatech.edu//index.html
Pros and Cons of Genetic Algorithms
• GAs are a flexible, widely applicable optimisation process
• Unlike NNs, GAs tend to avoid local minima, and find the global solution
(if you are not in a hurry) because search is not restricted to a single
part of the problem space
•
• Can optimise a lot of parallel measures simultaneously (multi-objective)
• Operators can be customised to take advantage of regularities or
constraints in a particular domain to improve speed or quality of
convergence
But...
• Abstractions about the problem itself such as mathematical
simplifications can’t be used
• Difficult to predict how long convergence will take – randomness in the
process means this might vary widely
• Because it requires representation and processing on a sizable
population of genotypes, it could be expensive in terms of memory and
computation – very complex problems could be infeasible
ICT219
21
GAs can be used when...
• Alternative solutions are too slow or overly complicated
• Need an exploratory tool to examine new approaches
• Need an “anytime” algorithm – always a solution, only improves
• Want to make a hybrid with another system
• Problem is similar to one that has already been successfully solved
using a GA!
GAs are now used to optimise
the design parameters of complex
machines, such as jet engines
ICT219
22
References
• Holland, John H. Adaptation in Natural and Artificial System. Ann Arbor:
The University of Michigan Press, 1975.
• Koza, J.R. Genetic Programming II: Automatic Discovery of Reusable
Programs. Cambridge, Mass.:MIT Press, 1994.
• Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems,
Harlow, Essex: Addison Wesley, Pearson Education Limited, 2002.
Chapter 7.
• Schwefel, H.P. Numerical Optimisation of Computer Models. Chichester:
John Wiley & Assoc., 1981.
ICT219
23