Transcript Bath
Genetic Algorithms
An Evolutionary Approach to Problem
Solving
Illusions of Design
Living things in nature seem to be like they
were designed by a skilled engineer/designer.
Evolution Theory: Too exquisitely
designed to be “random”.
If not random, then what is the non-random
process?
Evolution
A process of cumulative selection
Individuals with new traits are developed through
Mutations
Sexual Reproduction
Individuals with better traits are
more likely to survive and are
more likely to transfer those traits to their
descendents.
Key Principles of Evolution
http://www.pbs.org/wgbh/evolution/library/11/2/e_s_4.html
Variety: population of individuals with different set of
traits
Through reproduction: The importance of sexual reproductions
versus asexual reproduction
(http://www.pbs.org/wgbh/evolution/library/01/5/l_015_03.html)
Random mutations
Evaluation & Selection (through constraints in the
environment)
Reproduction: Transfer of traits to offsprings
From Evolution of life in Nature
to Evolution of Solutions
Can evolutionary principles be used to
develop solutions to problems?
Can these principles be used to do Data
Mining?
Computation Analogy of
Evolution
What is the equivalent of
an individual
(chromosome)?
What is the equivalent of
an individual having “good”
genetic material/good set
of traits?
What is the equivalent of a
population of individuals?
What is the equivalent of
reproduction?
What is the equivalent of
survival of the fittest?
Computation Analogy of Evolution
What is an individual
A collection of traits
Example: An individual has the following
traits
Color: Black or White
Speed: Fast, Medium, Slow or Very Slow
Intelligence: Very Smart, Smart, Somewhat Smart,
Medium Smart, Dumb or Very Dumb
GA – Generic Strategy
Start with a “population” of solutions (individuals)
Repeat
Choose 2 solutions from the population
With a certain probability Apply a crossover to create
child1, child2
With a certain probability, mutate child1 and child2
Update the population (discard of bad solutions).
Until stopping criteria has been met
Output the best solution(s) in the population
What is an individual
A collection of traits
Example: An individual has the following
traits
Color: Black or White
Speed: Fast, Medium, Slow, Very Slow
Intelligence: Very Smart, Smart, Somewhat Smart,
Medium Smart, Dumb or Very Dumb
Typically a coding system is used to code each
trait. E.g., binary coding.
Representing Solutions
The Fitness Function: Evaluation of
Individual Solutions
A “fitness function” .
Process of Evaluating the
Fitness
How a Population of Solutions
Looks Like
Reproduction: Crossover
Crossover between parents’ traits creating two children.
There are many crossover operators
1. Randomly choose a position and cross over contents
before that position. Crossover point
White
Medium
Dumb
Black
Slow
Smart
Children:
White
Slow
Smart
Black
Medium
Dumb
Reproduction
Crossover
2. Randomly choose a set of traits (genes) and
cross over those.
Examples: Cross over color and intelligence
White
Medium
Dumb
Black
Slow
Smart
Children:
Black
Medium
Smart
White
Slow
Dumb
Reproduction
Mutation
The purpose of mutations is to take a single solution and
introduce some random “shock” or changes to it to
create a new solution.
Implementation: Randomly chose a trait for each child
and randomly change its value to another valid value
Black
Black
Medium
Fast
Smart
Smart
White
Slow
Dumb
White
Slow
Intelligent
What is the equivalent of
survival of the fittest?
Simply give solutions with better fitness a higher
probability of being chosen for reproduction.
Sample with replacement: solutions with bigger
fitness will be selected more times.
Population
What will be the composition of
population in future
generations?
The Traveling Salesman
Problem
A travelling salesman who has to visit a set of n cities.
Find the order in which the salesman visits cities so as to
minimize total distance.
Variants of this problem are found in several domains.
As n gets very large, exhaustive search becomes impossible
due to the combinatorial nature of the problem.
Need heuristic methods to find good solutions, even if these
are not guaranteed to be the “best”.
The Traveling Salesman Problem
Consider the TSP problem
5 cities to visit: London, Oxford, Cambridge,
Brighton, and Bath.
What is the best path?
Cambridge
Oxford
London
Bath
Brighton
The Traveling Salesman Problem
Genetic Algorithms Solution
Step-1:
An individual is a “candidate solution”, a path.
Examples of candidate solutions for the TSP:
London, Oxford, Cambridge, Brighton, Bath
London, Bath, Oxford, Brighton, Cambridge
Brighton, London, Cambridge, Bath, Oxford.
…
The Traveling Salesman Problem
Coding Scheme for Candidate Solutions
Order in Order in
Order in
Order in Order in
sequence sequence
sequence
sequence sequence
of
of
of
of
of
London
Oxford Cambridge Brighton
Bath
Example:
London Oxford Cambridge Brighton Bath
1
2
3
4
5
Oxford London Cambridge Bath Brighton
2
1
3
5
4
The Traveling Salesman Problem
Step 2: Fitness Function
How “good” are the following solutions?
London, Oxford, Cambridge, Brighton, Bath
London, Bath, Oxford, Brighton, Cambridge
Brighton, London, Cambridge, Bath, Oxford.
The Traveling Salesman Problem
Fitness in TSP Problem: Distance Table
Distance (in Miles)
London
London
Oxford
Cambridge
Brighton
Bath
0
Oxford Cambridge
Brighton Bath
350
50
280
470
0
130
270
310
0
210
340
0
220
0
The Traveling Salesman Problem
Creating New Solutions
Crossover Operation : Randomly choose a position and cross over
contents before that position.
Crossover Operation for TSP: part of the first parent is copied and the
rest is taken in the same order as in the second parent
London
1
1
Oxford
Cambridge
Crossover point
2
5
Brighton
Bath
3
4
5
3
2
4
Reproduction
London
The Traveling Salesman Problem
Oxford
Cambridge
Crossover point
1
2
1
5
Brighton
Bath
3
4
5
3
2
4
London Oxford Cambridge Brighton Bath
London Brighton Cambridge Bath Oxford
Child:
London Oxford Brighton Cambridge Bath
1
2
4
3
5
Reproduction
London
The Traveling Salesman Problem
Oxford
Cambridge
Crossover point
1
2
1
5
Brighton
Bath
3
4
5
3
2
4
London Oxford Cambridge Brighton Bath
London Brighton Cambridge Bath Oxford
Child:
London Cambridge Brighton Bath Oxford
1
5
2
3
4
Reproduction
Mutation in the TSP Problem
Randomly changing one gene won’t work.
London
Oxford
Cambridge
Brighton
Bath
1
2
3
4
5
1
2
4
4
5
Design mutations around the “swap” concept:
1
2
5
4
3
GA for TSP
For the TSP problem we have:
Solution representation
A fitness evaluation function
Crossover operations on parents
Mutation on a single solution
Start with a population of solutions, and let
them evolve
Next Step: Initialize a population
Decision parameter: population size
Let us choose 5 solutions in our population.
We will now randomly initialize the population.
London, Oxford, Cambridge, Brighton, Bath
Oxford, London, Cambridge, Bath, Brighton
Cambridge, Oxford, Brighton, Bath, London
Bath, London, Brighton Cambridge, Oxford
Bath, Oxford, Cambridge, London, Brighton
(-1320)
(-1230)
(-1140)
(-1400)
(-990)
Evolving Solutions for TSP
Repeat
Choose 2 solutions from the population
With a certain “crossover probability” (say, 0.8) apply a
crossover operator to create child1, child2
With a certain “mutation” probability (say 0.1), mutate
child1, child2
Place the resulting 2 children in the population
Selection: which 5 solutions survive – the probability of
each individual to survive is propositional to its fitness
function
After stopping, output the best chromosome in the
population for the solution
(Until stopping criteria has been met)
Overview of the Selection Process
Over time solutions mate and traits passed
on to the offspring.
Children with “better” traits have a ability to
survive.
The weak solutions gradually disappear
from the population.
Good solution predominate the population
Building Solutions through
Evolution
http://www.pbs.org/saf/1103/video/watchon
line.htm
GA – Advantages:
Not engineering : enables finding
surprising solutions to problems
Quickly and reliably solve problems that
are hard to tackle by traditional means.
Implicit parallelism makes GAs a very
efficient optimization algorithm.
Great property is the ability to find
approximate solutions to combinatorially
explosive problems.
GA - Disadvantages
A heuristic: GAs may find only nearoptimal solutions.
Further restrictions are the difficulties of
choosing a suitable representation
technique, and making the right decision
regarding the choice of the selection
method and the genetic operator
probabilities