The Standard Genetic Algorithm

Download Report

Transcript The Standard Genetic Algorithm

The Standard Genetic Algorithm
Dr. Chrisantha Fernando
Systems Biology Centre
University of Birmingham
DIY Evolution.
QuickTime™ and a
TIFF (Uncompressed) decompressor
are needed to see this picture.
Genotype
 Binary String
 Real numbered String
Represents
length of
left leg.
Size of head, etc…
1101011111110101
0.2, 0.5, 0.2, 0.83, 0.01, 0.3
1101000101110101
0.6, 0.1, 0.5, 0.8, 0.9, 1.0
Evaluation
 Interpret the genotype to produce the phenotype.
 In the simplest case they are the same thing.
 E.g. Imagine we desire the string
 000000000
 We can define fitness of any string as the number
of places where it is the same as the above string,
e.g.
 000000000
 010101010
 ------------101010101 = 5 = Fitness
A Trivial Example
 int evaluate(int *g) {
 int i,
 r=0;
 for (i=0;i<10;i++)
 r += (g(i) == 0);
 return(r);
}
I. Harvey
So first initialize a population




int popn[30][10];
void initialise_popn() {
int i,j;
for (i=0;i<30;i++)
 for (j=0;j<10;j++)
 popn[i][j]= flip_a_bit();
}
I. Harvey
Main Loop
 For n times round generation loop








evaluate all the population (of 30)
select preferentially the fitter ones as parents
for 30 times round repro loop
pick 2 from parental pool
recombine to make 1 offspring
mutate the offspring
end repro loop
throw away parental generation and replace with
offspring
 End generation loop
I. Harvey
More complicated Evaluations
 Genotype encodes a neural network.
QuickTime™ and a
TIFF (Uncompressed) decompressor
are needed to see this picture.
QuickTime™ and a
TIFF (Uncompressed) decompressor
are needed to see this picture.
QuickTime™ and a
TIFF (Uncompressed) decompressor
are needed to see this picture.
Dario Floreano’s Lab
Even More Complicated
Evaluations
QuickTime™ and a
TIFF (Uncompressed) decompressor
are needed to see this picture.
Problems in Practice
 An evaluation may take a long time, in
which case the genetic algorithm will be
slow.
 An evaluation may be noisy, so the same
agent may have different fitness each time
you make an evaluation. If it is too noisy,
then good agents may be lost from the
population.
Selection Methods
 Truncation Selection
 All parents come from top 50% or top 20% etc..
 Fitness Proportionate Selection
 E.g. if all fitnesses are 2, 4, 6, 8, 9, then select parent
using roulette wheel selection with probability 2/29,
4/29, 6/29, 8/29, 9/29
 Problems with this are
 If early on one agent dominates there is too much selective
pressure.
 If later agents have very similar fitnesses there is too little
selective pressure.
 Scaling methods can be used to get around these problems.
 Rank Selection: Ignore absolute fitness, can
use other slopes, but here the best is
selected 2x more than the average.
2
QuickTime™ and a
TIFF (Uncompressed) decompressor
are needed to see this picture.
0
Elitism
 Force a direct un-mutated copy of the best
of the last generation.
 Never loose the best.
 Useful if there is a lot of noise in fitness
assessments, or if mutation is very likely to
produce a low fitness offspring.
Mutation (Asexual Reproduction)
 Mutate at randomly chosen loci with a small
probability.
 Mutate all loci by a very small amount
(vector mutation).
 With binary you do bit flips, with real
valued mutation you might multiply the
value by a Gaussian distributed random
number.
Recombination (Sexual
Reproduction)
Parent A
10100010101111
10100010000111
Parent B
00000011000111
00000011101111
1-point random equal crossover
Uniform crossover
00100010100111
The Competing Conventions
Problem
Head, Eye, Face, Leg
Face, Eye, Leg, Head
Head, Eye, Face, Head
Face, Eye, Leg, Leg
OK, so two heads, and two
legs. Sometimes competing
conventions can help, sometimes it can hinder.
Schema Theorem
 John Holland.
 A theory of how GAs work. Not everyone
agrees with this, but it is worth reading his
book if you are interested.
A Black Art
 No universal algorithm suitable for all
cases.
 Need to get a feeling for it by doing it.
Relationship to Real Genomes
 Usually GAs use haploid genomes, not
diploid ones, i.e. there is only one copy of
each ‘gene’.
Homework. I’m happy to help if
you need. [email protected]

The Card Problem


Genotype encoding


Each card can be in Pile_0 or Pile_1, there are 1024 possible ways of sorting them into 2 piles, and
you have to find the best. Think of a sensible way of encoding any possible solution-attempt as a
genotype.
Fitness


You have 10 cards numbered from 1 to 10. You have to choose a way of dividing them into 2 piles, so
that the cards in Pile_0 SUM to a number as close as possible to 36, and the remaining cards in
Pile_1 MULTIPLY to a number as close as possible to 360.
Some of these solution-attempts will be closer to the target than others. Think of a sensible way of
evaluating any solution-attempt and scoring it with a fitness-measure.
The GA

Write a program, in any sensible programming language, to run a GA with your genotype encoding
and Fitness function. Run it 100 times and see what results you get.
Who can write the GA that solves the problem in the least number of generations?