Transcript Bath

Genetic Algorithms
An Evolutionary Approach to Problem
Solving
Illusions of Design


Living things in nature seem to be like they
were designed by a skilled engineer/designer.
Evolution Theory: Too exquisitely
designed to be “random”.
If not random, then what is the non-random
process?
Evolution
A process of cumulative selection
Individuals with new traits are developed through
Mutations
Sexual Reproduction
Individuals with better traits are
more likely to survive and are
more likely to transfer those traits to their
descendents.
Key Principles of Evolution
http://www.pbs.org/wgbh/evolution/library/11/2/e_s_4.html

Variety: population of individuals with different set of
traits




Through reproduction: The importance of sexual reproductions
versus asexual reproduction
(http://www.pbs.org/wgbh/evolution/library/01/5/l_015_03.html)
Random mutations
Evaluation & Selection (through constraints in the
environment)
Reproduction: Transfer of traits to offsprings
From Evolution of life in Nature
to Evolution of Solutions
Can evolutionary principles be used to
develop solutions to problems?
Can these principles be used to do Data
Mining?
Computation Analogy of
Evolution





What is the equivalent of
an individual
(chromosome)?
What is the equivalent of
an individual having “good”
genetic material/good set
of traits?
What is the equivalent of a
population of individuals?
What is the equivalent of
reproduction?
What is the equivalent of
survival of the fittest?
Computation Analogy of Evolution
What is an individual
A collection of traits
Example: An individual has the following
traits
Color: Black or White
 Speed: Fast, Medium, Slow or Very Slow
 Intelligence: Very Smart, Smart, Somewhat Smart,
Medium Smart, Dumb or Very Dumb

GA – Generic Strategy


Start with a “population” of solutions (individuals)
Repeat






Choose 2 solutions from the population
With a certain probability Apply a crossover to create
child1, child2
With a certain probability, mutate child1 and child2
Update the population (discard of bad solutions).
Until stopping criteria has been met
Output the best solution(s) in the population
What is an individual
A collection of traits
Example: An individual has the following
traits
Color: Black or White
 Speed: Fast, Medium, Slow, Very Slow
 Intelligence: Very Smart, Smart, Somewhat Smart,
Medium Smart, Dumb or Very Dumb


Typically a coding system is used to code each
trait. E.g., binary coding.
Representing Solutions
The Fitness Function: Evaluation of
Individual Solutions

A “fitness function” .
Process of Evaluating the
Fitness
How a Population of Solutions
Looks Like
Reproduction: Crossover
Crossover between parents’ traits creating two children.
There are many crossover operators
1. Randomly choose a position and cross over contents
before that position. Crossover point
White
Medium
Dumb
Black
Slow
Smart
Children:
White
Slow
Smart
Black
Medium
Dumb
Reproduction
Crossover
2. Randomly choose a set of traits (genes) and
cross over those.
 Examples: Cross over color and intelligence
White
Medium
Dumb
Black
Slow
Smart
Children:
Black
Medium
Smart
White
Slow
Dumb
Reproduction
Mutation
The purpose of mutations is to take a single solution and
introduce some random “shock” or changes to it to
create a new solution.
Implementation: Randomly chose a trait for each child
and randomly change its value to another valid value
Black
Black
Medium
Fast
Smart
Smart
White
Slow
Dumb
White
Slow
Intelligent
What is the equivalent of
survival of the fittest?
Simply give solutions with better fitness a higher
probability of being chosen for reproduction.
Sample with replacement: solutions with bigger
fitness will be selected more times.
Population
What will be the composition of
population in future
generations?
The Traveling Salesman
Problem





A travelling salesman who has to visit a set of n cities.
Find the order in which the salesman visits cities so as to
minimize total distance.
Variants of this problem are found in several domains.
As n gets very large, exhaustive search becomes impossible
due to the combinatorial nature of the problem.
Need heuristic methods to find good solutions, even if these
are not guaranteed to be the “best”.
The Traveling Salesman Problem
Consider the TSP problem
5 cities to visit: London, Oxford, Cambridge,
Brighton, and Bath.
What is the best path?
Cambridge
Oxford
London
Bath
Brighton
The Traveling Salesman Problem
Genetic Algorithms Solution
Step-1:
An individual is a “candidate solution”, a path.
Examples of candidate solutions for the TSP:
London, Oxford, Cambridge, Brighton, Bath
 London, Bath, Oxford, Brighton, Cambridge
 Brighton, London, Cambridge, Bath, Oxford.
 …

The Traveling Salesman Problem
Coding Scheme for Candidate Solutions
Order in Order in
Order in
Order in Order in
sequence sequence
sequence
sequence sequence
of
of
of
of
of
London
Oxford Cambridge Brighton
Bath

Example:

London  Oxford  Cambridge  Brighton  Bath
1

2
3
4
5
Oxford  London  Cambridge  Bath  Brighton
2
1
3
5
4
The Traveling Salesman Problem
Step 2: Fitness Function
How “good” are the following solutions?
London, Oxford, Cambridge, Brighton, Bath
 London, Bath, Oxford, Brighton, Cambridge
 Brighton, London, Cambridge, Bath, Oxford.

The Traveling Salesman Problem
Fitness in TSP Problem: Distance Table
Distance (in Miles)
London
London
Oxford
Cambridge
Brighton
Bath
0
Oxford Cambridge
Brighton Bath
350
50
280
470
0
130
270
310
0
210
340
0
220
0
The Traveling Salesman Problem
Creating New Solutions
Crossover Operation : Randomly choose a position and cross over
contents before that position.
Crossover Operation for TSP: part of the first parent is copied and the
rest is taken in the same order as in the second parent
London
1
1
Oxford
Cambridge
Crossover point
2
5
Brighton
Bath
3
4
5
3
2
4
Reproduction
London
The Traveling Salesman Problem
Oxford
Cambridge
Crossover point
1
2
1
5
Brighton
Bath
3
4
5
3
2
4
London  Oxford  Cambridge  Brighton  Bath
London  Brighton  Cambridge Bath  Oxford
Child:
London  Oxford  Brighton  Cambridge Bath
1
2
4
3
5
Reproduction
London
The Traveling Salesman Problem
Oxford
Cambridge
Crossover point
1
2
1
5
Brighton
Bath
3
4
5
3
2
4
London  Oxford  Cambridge  Brighton  Bath
London  Brighton  Cambridge Bath  Oxford
Child:
London  Cambridge  Brighton Bath  Oxford
1
5
2
3
4
Reproduction
Mutation in the TSP Problem


Randomly changing one gene won’t work.
London
Oxford
Cambridge
Brighton
Bath
1
2
3
4
5
1
2
4
4
5
Design mutations around the “swap” concept:
1
2
5
4
3
GA for TSP

For the TSP problem we have:
Solution representation
 A fitness evaluation function
 Crossover operations on parents
 Mutation on a single solution


Start with a population of solutions, and let
them evolve
Next Step: Initialize a population
Decision parameter: population size
Let us choose 5 solutions in our population.
We will now randomly initialize the population.





London, Oxford, Cambridge, Brighton, Bath
Oxford, London, Cambridge, Bath, Brighton
Cambridge, Oxford, Brighton, Bath, London
Bath, London, Brighton Cambridge, Oxford
Bath, Oxford, Cambridge, London, Brighton
(-1320)
(-1230)
(-1140)
(-1400)
(-990)
Evolving Solutions for TSP

Repeat
Choose 2 solutions from the population
 With a certain “crossover probability” (say, 0.8) apply a
crossover operator to create child1, child2
 With a certain “mutation” probability (say 0.1), mutate
child1, child2
 Place the resulting 2 children in the population
 Selection: which 5 solutions survive – the probability of
each individual to survive is propositional to its fitness
function
After stopping, output the best chromosome in the
population for the solution


(Until stopping criteria has been met)
Overview of the Selection Process
Over time solutions mate and traits passed
on to the offspring.
Children with “better” traits have a ability to
survive.
The weak solutions gradually disappear
from the population.
Good solution predominate the population
Building Solutions through
Evolution
http://www.pbs.org/saf/1103/video/watchon
line.htm
GA – Advantages:



Not engineering : enables finding
surprising solutions to problems
Quickly and reliably solve problems that
are hard to tackle by traditional means.
Implicit parallelism makes GAs a very
efficient optimization algorithm.

Great property is the ability to find
approximate solutions to combinatorially
explosive problems.
GA - Disadvantages


A heuristic: GAs may find only nearoptimal solutions.
Further restrictions are the difficulties of
choosing a suitable representation
technique, and making the right decision
regarding the choice of the selection
method and the genetic operator
probabilities