Intro to AI
Genetic Algorithm
Ruth Bergman
Fall 2004
Imitating Nature
Aspects of the evolution of organisms:
• Organisms that are ill-suited to an environment
have little chance of reproducing (natural selection)
• Conversely, the best-fitted organisms have a better chance of
surviving and reproducing
Reproduction:
• Offspring are similar to their parents
• Random mutations occur, and they can produce better (or worse)
fitted individuals
“On the Origin of Species by Means of Natural Selection”, C.
Darwin (1859)
Encoding:
• An organism is fully represented by its DNA string, i.e. a
string over a finite alphabet (4 symbols)
• Each element of this string is called a gene
Genetic Algorithm (GA)
• Developed by John Holland in the early 1970s
• Optimization and machine learning techniques
inspired by the process of natural evolution and
evolutionary genetics
– Solutions are encoded as chromosomes
– Search proceeds by maintaining a population of
solutions
– Reproduction favors “better” chromosomes
– New chromosomes are generated during reproduction
through processes such as mutation and crossover
GA Framework
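[Figure: the GA cycle. A population drawn from the search space (A = 0 1 0 0 0, B = 1 0 1 1 0, C = 1 1 0 1 0, D = 0 1 0 1 1) goes through selection and crossover (parents 1 0 1 1 0 and 0 1 0 1 1 yield children 1 0 0 1 1 and 0 1 1 1 0), then mutation and fitness evaluation; reproduction feeds the new individuals back into the population.]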
GA Procedure
• Start with a population of N individuals
1. Apply the fitness function to all the individuals
2. Randomly select N/2 pairs of individuals for reproduction
(repetition allowed)
3. Each pair generates two children (reproduction with crossover)
4. Apply a random mutation to the children with small probability.
The children become the next generation
5. Repeat steps 1-4 until some termination criterion is met
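The numbered steps can be sketched as a generic loop. This is only an illustrative skeleton: the name `run_ga` and its parameters (`fitness`, `select_pair`, `crossover`, `mutate`, `done`) are hypothetical placeholders for the operators described in the following slides, and it assumes N is even.

```python
def run_ga(population, fitness, select_pair, crossover, mutate,
           max_generations=100, done=None):
    """Generational GA loop (a sketch; all operator arguments are caller-supplied).

    fitness(ind) -> number
    select_pair(population, scores) -> (parent_a, parent_b)
    crossover(parent_a, parent_b) -> (child_1, child_2)
    mutate(child) -> child
    done(population, scores) -> bool, optional extra stopping test
    """
    for _ in range(max_generations):
        # Step 1: evaluate every individual.
        scores = [fitness(ind) for ind in population]
        if done is not None and done(population, scores):
            break
        children = []
        # Step 2: pick N/2 pairs (assumes N is even).
        for _ in range(len(population) // 2):
            parent_a, parent_b = select_pair(population, scores)
            # Step 3: each pair produces two children.
            for child in crossover(parent_a, parent_b):
                # Step 4: mutate each child.
                children.append(mutate(child))
        # The children become the next generation (step 5 is the loop itself).
        population = children
    return population
```

Passing the operators in as functions keeps the loop independent of the encoding, so the same skeleton works for bit strings or any other representation.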
Encoding Scheme
• An individual (an organism) represents a
possible solution to the problem you want to solve
• An individual is represented by a binary string. Such
a string is intended to be the complete description of
the individual
• Example:
Suppose you have to find a number between 0 and 255
whose binary representation contains the same number of 1s
and 0s.
An individual is a string of 8 bits, e.g.:
h = 0 1 1 1 1 1 1 0 = 126
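A minimal sketch of this encoding, assuming individuals are stored as Python lists of 0/1 integers; the helper names `encode` and `decode` are illustrative and not part of the slides.

```python
def encode(value, n_bits=8):
    """Return value as a list of n_bits bits, most significant bit first."""
    if not 0 <= value < 2 ** n_bits:
        raise ValueError(f"value must be in [0, {2 ** n_bits - 1}]")
    return [int(bit) for bit in format(value, f"0{n_bits}b")]


def decode(bits):
    """Return the integer represented by a list of bits."""
    return int("".join(str(bit) for bit in bits), 2)


h = encode(126)
print(h)          # [0, 1, 1, 1, 1, 1, 1, 0]
print(decode(h))  # 126
```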
Fitness Function
• A fitness function says how good a
solution is, i.e. how well an individual fits the environment
• Example
$f(h) = 8 - |n_1 - n_0|$
where $n_1$ is the number of 1s in h and $n_0$ is the number of 0s in h.
Note that the fitness function takes its minimum value (i.e. 0)
when $n_1 = 8$ or $n_0 = 8$, and its maximum value (i.e. 8) when
$n_1 = n_0 = 4$
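The example fitness function can be written directly, assuming an individual is a list of 0/1 integers. Using `len(h)` in place of the constant 8 is an assumption that generalizes the formula to other string lengths; for 8-bit strings it is identical.

```python
def fitness(h):
    """f(h) = len(h) - |n1 - n0|, which is 8 - |n1 - n0| for 8-bit strings."""
    n1 = sum(h)          # number of 1s
    n0 = len(h) - n1     # number of 0s
    return len(h) - abs(n1 - n0)


print(fitness([0, 1, 1, 1, 1, 1, 1, 0]))  # 4  (n1 = 6, n0 = 2)
print(fitness([1, 1, 1, 1, 1, 1, 1, 1]))  # 0  (minimum)
print(fitness([0, 0, 1, 1, 0, 1, 1, 0]))  # 8  (maximum, n1 = n0 = 4)
```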
The Initial Population
0 1 1 1 1 1 1 0
1 1 1 1 1 1 1 0
0 0 1 0 0 1 0 0
0 0 0 0 0 0 0 1
Selection
• Roulette wheel selection
– Compute each individual’s contribution to the global fitness as
$P(h_i) = f(h_i) / \sum_{j=1}^{N} f(h_j)$
– Pairs for reproduction are chosen by randomly drawing
individuals (with replacement) according to the distribution P
individual   encoding           fitness   P
A            0 1 1 1 1 1 1 0    4         .33
B            1 1 1 1 1 1 1 0    2         .17
C            0 0 1 0 0 1 0 0    4         .33
D            0 0 0 0 0 0 0 1    2         .17
[Figure: Roulette wheel with slices A 33%, B 17%, C 33%, D 17%]
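A sketch of roulette wheel selection for the population above, using the standard-library `random.choices` for the weighted draw with replacement. The function names and the uniform fallback for an all-zero population are assumptions made for illustration.

```python
import random


def selection_probabilities(scores):
    """Return each individual's share of the total fitness."""
    total = sum(scores)
    if total == 0:
        # Assumption: if every fitness is 0, fall back to a uniform draw.
        return [1 / len(scores)] * len(scores)
    return [score / total for score in scores]


def roulette_select(population, scores, k, rng=random):
    """Draw k individuals with replacement, proportionally to fitness."""
    weights = selection_probabilities(scores)
    return rng.choices(population, weights=weights, k=k)


scores = [4, 2, 4, 2]  # fitness of A, B, C, D
print([round(p, 2) for p in selection_probabilities(scores)])
# [0.33, 0.17, 0.33, 0.17]
parents = roulette_select(["A", "B", "C", "D"], scores, k=4)
```

Drawing `k = N` individuals and pairing them up consecutively gives the N/2 pairs required by the procedure.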
Crossover
– Randomly choose a crossover point “c”, i.e. a number
between 1 and n, where n is the length of the string
– Return two children: one composed of the first c bits of the
first parent and the last n-c bits of the second parent, the
other composed of the first c bits of the second parent and
the last n-c bits of the first parent
parents                children (here c = 6)
0 1 1 1 1 1 | 1 0      0 1 1 1 1 1 | 0 0
0 0 1 0 0 1 | 0 0      0 0 1 0 0 1 | 1 0

1 1 1 1 1 1 | 1 0      1 1 1 1 1 1 | 0 1
0 0 0 0 0 0 | 0 1      0 0 0 0 0 0 | 1 0
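One-point crossover as described above can be sketched as follows, assuming list-of-bits individuals. The optional `c` argument is an illustrative addition so that the slide's example can be reproduced deterministically.

```python
import random


def one_point_crossover(parent_a, parent_b, c=None, rng=random):
    """Swap the tails of two equal-length parents after position c."""
    n = len(parent_a)
    if len(parent_b) != n:
        raise ValueError("parents must have the same length")
    if c is None:
        # randint is inclusive on both ends: c is between 1 and n.
        c = rng.randint(1, n)
    child_1 = parent_a[:c] + parent_b[c:]
    child_2 = parent_b[:c] + parent_a[c:]
    return child_1, child_2


a = [0, 1, 1, 1, 1, 1, 1, 0]
c_parent = [0, 0, 1, 0, 0, 1, 0, 0]
print(one_point_crossover(a, c_parent, c=6))
# ([0, 1, 1, 1, 1, 1, 0, 0], [0, 0, 1, 0, 0, 1, 1, 0])
```

Note that with c = n the children are copies of the parents; restricting c to 1..n-1 would be a possible variation.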
Mutation
• Mutation on individuals:
some of the children’s bits are changed (with a small,
independent probability)
before mutation    after mutation     fitness
0 1 1 1 1 1 0 0    0 1 1 1 1 1 1 0    $f = 8 - |6 - 2| = 4$
0 0 1 0 0 1 1 0    0 0 1 1 0 1 1 0    $f = 8 - |4 - 4| = 8$ (maximum found)
1 1 1 1 1 1 0 1    1 1 1 1 1 1 0 1    $f = 8 - |7 - 1| = 2$
0 0 0 0 0 0 1 0    0 0 1 0 0 0 1 0    $f = 8 - |2 - 6| = 4$
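A minimal sketch of bit-flip mutation, assuming list-of-bits individuals. The default probability of 0.01 is only an illustrative value.

```python
import random


def mutate(bits, p_mutation=0.01, rng=random):
    """Flip each bit independently with probability p_mutation."""
    return [1 - bit if rng.random() < p_mutation else bit for bit in bits]


child = [0, 1, 1, 1, 1, 1, 0, 0]
print(mutate(child, p_mutation=0.0))  # unchanged: [0, 1, 1, 1, 1, 1, 0, 0]
print(mutate(child, p_mutation=1.0))  # every bit flipped: [1, 0, 0, 0, 0, 0, 1, 1]
```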
Stopping Criteria
• Convergence:
– A population is said to converge when all the genes have
converged, i.e. when each bit has the same value in at
least 95% of the individuals in the population
• Since convergence is not guaranteed, we must
consider other stopping criteria:
– A maximum number of generations
– Almost constant fitness of the best individual
– Almost constant average fitness of the
population
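The convergence test can be sketched as a per-bit majority check, assuming a population of equal-length lists of 0/1 integers. The function name `has_converged` is illustrative.

```python
def has_converged(population, threshold=0.95):
    """True if, at every bit position, at least `threshold` of the
    individuals share the same value."""
    n = len(population)
    # zip(*population) groups the bits found at each position.
    for position in zip(*population):
        ones = sum(position)
        majority = max(ones, n - ones)
        if majority / n < threshold:
            return False
    return True


same = [[0, 1, 1, 0]] * 20
print(has_converged(same))                               # True
mixed = [[0, 1, 1, 0]] * 18 + [[1, 1, 1, 0]] * 2
print(has_converged(mixed))                              # False (18/20 = 90% on bit 0)
```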
Parameter Settings
• Population size
– How many chromosomes are in the population
• Too few chromosomes → only a small part of the search space is explored
• Too many chromosomes → the GA slows down
– Recommendation: 20-30, 50-100
• Probability of crossover
– How often crossover will be performed
– Recommendation: 80%-95%
• Probability of mutation
– How often parts of a chromosome will be mutated
– Recommendation: 0.5%-1%
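These settings could be grouped in a small configuration object. The class name `GAParams`, its field names, and the specific defaults (picked from inside the recommended ranges) are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GAParams:
    """Illustrative GA settings; defaults lie within the recommended ranges."""
    population_size: int = 30     # recommended: 20-30, 50-100
    p_crossover: float = 0.9      # recommended: 0.80-0.95
    p_mutation: float = 0.01      # recommended: 0.005-0.01

    def __post_init__(self):
        if self.population_size < 2:
            raise ValueError("population_size must be at least 2")
        for name in ("p_crossover", "p_mutation"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must be a probability in [0, 1]")


params = GAParams(population_size=50)
print(params)
```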
Optimization Search
• A genetic algorithm is a search algorithm
• evaluation function ≡ fitness function
• Similar to beam search with N beams, but
– Next generation selected stochastically
– Sexual reproduction
• Similar to hill-climbing, but
– Convergence to global optimum is expected eventually
[Figure: hill-climbing method compared with GA search method]
Genetic Programming
• One of the central challenges of CS is to get
a computer to do what needs to be done,
without telling it how to do it
– Automatic programming (or program synthesis)
• GP is a branch of genetic algorithms
• Main difference between GP and GA
– Representation of the solution (computer program)
• GA: a string of numbers
– fixed-length character strings
• GP: computer program (Lisp or Scheme)
– Represent hierarchical computer programs of dynamically
varying sizes and shapes
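To illustrate the difference in representation, a GP individual can be modeled as an expression tree instead of a fixed-length string. The sketch below uses nested Python tuples in place of Lisp S-expressions; the operator set `OPS` and the helper names are assumptions, not part of any particular GP system.

```python
import operator

# Hypothetical function set: binary arithmetic operators only.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a program tree: (op, left, right), a variable name, or a constant."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]      # variable lookup
    return tree               # numeric constant


def size(tree):
    """Number of nodes in the tree; unlike a GA string, it varies per individual."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1


program = ("+", ("*", "x", "x"), 1)   # Lisp: (+ (* x x) 1)
print(evaluate(program, {"x": 3}))    # 10
print(size(program))                  # 5
```

Crossover in this setting would typically swap subtrees between two programs rather than string tails, which is why program size and shape can change from one generation to the next.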
Evonomy
• Evonomy brings advanced artificial intelligence and financial markets
together. It generates an enormous number of different trading
approaches and selects the one with the best profit/risk combination.
• Evonomy is based on a genetic algorithm that applies the powerful
mechanisms found in biological evolution to financial markets.
(1) The system creates a large population of virtual traders each of
which has its own recipe for beating the market.
(2) The system tests the traders against the relevant historical market
data.
(3) The traders with the best profit-risk profiles survive and have
offspring: the new generation of traders is created by mating and
mutating the most promising recipes.
(4) The system runs the battle for survival as long as some traders
stand out and dominate the population. These traders have the
optimal profit/risk profile; any change in their recipe would make
them less fit.