Genetic Algorithms and Evolutionary Computation
Download
Report
Transcript Genetic Algorithms and Evolutionary Computation
Genetic Algorithms and
Evolutionary Computation
Presentation by Shumon
(Loutfouz Zaman)
COSC 6111
Is it Bioinformatics?
• Not the same as
Bioinformatics
• Used in Bioinformatics
(we will explore
examples later)
• Used in a variety of
other fields
Not really!!!
What is GA?
• A searching technique which is based
on the biological process of evolution.
• This technique is a heuristic (trial,
error, approximation)
• Evolution based on natural selection
Why Evolution?
• Example: searching
for a set of rules or
equations that will
predict the ups and
downs of a financial
market
• Huge # of
possible
solutions
• Use of
parallelism and
intelligent
computation
(different
possibilities are
explored
simultaneously)
Bioinformatics
• Example: In searching
for proteins with specified
properties, rather than
evaluate one amino acid
sequence at a time it
would be much faster to
evaluate many
simultaneously.
Why Evolution? (2)
• Adaptive problems
(continue to perform
well in a changing
environment)
• Innovative
(construct new and
original)
• Robot control in
which a robot has to
perform a task in a
variable environment
• Example: New
algorithm for
accomplishing a
computational task,
New scientific
discovery.
Why Evolution? (3)
• Complex solutions that are
difficult to program by hand
• Example: Problem of
creating artificial
intelligence. In game
development industry
programmers must come
up with efficient AI within
tight time constraints.
Bottom-up vs. Top-down
• Before AI people believed in using
rules that would present intelligence
in program explicitly, e.g. Expert
System (Top-down)
Problem: rules too hard to encode
• Now, they believe in Bottom-up
approach where only simple rules are
encoded and intelligence emerges
from parallel application and
interaction of these simple rules, e.g.
neural systems, GA
Appealing of Evolution in Biology
• Method of searching among an enormous
number of possibilities for "solutions."
• Desired "solutions" are highly fit organisms
• A method for designing innovative
solutions to complex problems
• Evolution tests and changes millions of
species in parallel by trial and error
• Species evolve by means of random
mutation
Search Spaces in GA
• Problem in Bioengineering: find a protein (seq. of amino acids) used
to let’s say to fight Bird flu.
• Search Space here is a collection of all possible protein seq.’s
(infinite set of possibilities).
• Simplify problem: assume we know the length of this seq. is 8 – still
a huge set (20 possibilities for each pos.)
• Candidate solution seq.: A G G M C G B L
• Let’s define distances:
Dprotein1= (A G G M C G B L, M G G M C G B L) = 1
Dprotein2= (A G G M C G B L, L B M P A F G A) = 8
• Obvious algorithm to select next candidate for the test would
assume correlation between quality of neighbor solutions (close in
space) depending on the result of previous test
• GA would occasionally select candidates for crossover from different
regions in the space.
Fitness Landscapes in GA
most fit
• Fitness landscape is a
representation of all
possible genotypes along
with their fitnesses
• Let genotype be a bitstring of size l
genotype
fitness
00
0.7
01
1.0
10
0.1
11
0.0
• Fitness function f. Can be
represented by
(l + 1)-dimensional plot
Example: Let l = 2
Fitness Landscape (…continued)
Global max: evolution
Local max: adaptation
Local max: adaptation
• Representation by graph (opposed to
(l+1) dim. plot)
Simple GA Operators
• Selection. Selects chromosomes (bit
strings) in the population for reproduction.
The fitter, the chromosome the more likely it
will be selected
• Crossover. Exchanges bits between two
chromosomes to create two offspring.
E.g.: 10000100 & 11111111 are crossed
over at 3rd bit pos. : 10011111 & 11100100
• Mutation. Randomly flips some of the bits
in a chromosome. (Pmutation < 0.001)
E.g. 00000100 is mutated into 01000100
Simple GA Illustration
•
•
•
•
•
•
•
•
•
•
=52
Assume we have a deck of cards
Think of the deck as of population
Think of each card as of a
chromosome
We select two cards which look
good
Make copies of them
Cut them in halves
Glue each half with the
corresponding half of the other
card
Put the original cards back in the
deck
Repeat the same thing until we
have the new deck of 52 glued
cards
Burn the old deck!
Simple GA
• 1) Randomly generate a population of n l-bit X
• 2) Calculate fitness f(x)
• 3) Loop until n O are created:
a. X1,X2 = Select().
//Pselection(Xi) = f(Xi). (Selection with replacement)
XA
b. (O1, O2) = Crossover(X1,X2)
//with Pcrossover at a random point.
If Pcrossover = 0 then O1 = X1, O2 = X2
c. (O1, O2) = Mutate (O1, O2)
//with Pmutation
if (n is odd) delete random O
• 4) i in n, Xi = Oi //replace with new generation
• 5) Go to Step 2.
A
Properties of GA
• Each iteration of GA is called generation
• A GA is typically iterated through 50 to 500,
sometimes more generations
Running Time = #generations x n =
(50;500) x n
• Entire set of generations is called a run
(diff. runs give diff. behavior)
• At the end of a run there are one or more
highly fit chromosomes
Detailed Example of GA
• Let f(x) be # of 1’s in a bit-string
• Pcrossover=0.7, Pmutation=0.001
Initially
After 1 iteration
X Label X String
Fitness
X Label X String
Fitness
A
00000110
2
E’
10110000
3
B
11101110
6
F
01101110
5
C
00100000
1
C
00100000
1
D
00110100
3
B’
01101110
5
Total : 12
Total : 14
unfit
although max fitness
decreased, but total fitness
increased (moving from local
max to global max)
Using GA to Evolve Strategies for
the Prisoner’s Dilemma
• Beavis & Butthead engaged in antisocial behavior for which they got
arrested
• Beavis is offered a deal: if he
confesses and testify against
Butthead, he will be released and
Butthead will go Juvenile Hall
• If Butthead would confess and testify,
then they both will go to Juvenile Hall
• Beavis is told that Butthead is being
offered precisely the same deal
• If neither testifies, then they will be
both sentenced to community service
• Neither of the boys understood the
deal they were offered after the police
repeated it 10 times, so the police
didn’t bother to put them in separate
cells, because they are too stupid to
think about co-operation anyway
Algorithms
Thisyou
sucks!
suck!
dillweed!
Prisoner’s Dilemma (2)
Cooperate
Butthead,
I am
Cooperate
I’ll kick
Cornholio!
Heh your @$$!
Are you
heh
threatening
cool!
me?
Defect
Defect
hi, my
name is
bin laden
Prisoner’s Dilemma (3)
• Robert Axelrod (U of Michigan) wanted to find a good strategy for
Prisoner’s Dilemma
• Organized tournament
• He solicited strategies from researchers in different disciplines
• Each participant submitted a computer program that implemented a
strategy
• Varies programs played iterated games with each other
• Each program remembered what move (cooperate or defect) it and
opponent did in previous games (strategy based on memory)
• Some strategies were complicated (based on Markov process,
Bayesian inference, etc…)
• But the winning strategy was the simplest of all submitted: TIT-FORTAT. It offers cooperation and reciprocates it. But if the other player
defects, TIT FOR TAT punishes with defection of it’s own, and
continues punishing until opponent cooperates again.
Prisoner’s Dilemma (4)
•
•
•
•
•
•
•
•
•
•
Encoding strategies into chromosomes:
CC (case 1)
CD (case 2)
DC (case 3)
DD (case 4)
A strategy is a rule that gives actions for each case. E.g.:
TIT-FOR-TAT:
If CC (case 1), then C
If CD (case 2), then D
If DC (case 3), then C
If DD (case 4), then D
Prisoner’s Dilemma (5)
•
•
•
•
•
•
•
•
Axelrod’s strategies remembered three previous games
So there are 4x4x4=64 possibilities for the previous three games
CC CC CC (case 1)
CC CC CD (case 2)
…
DD DD DC (case 63)
DD DD DD (case 64)
Therefore a strategy is encoded by a 64-letter string, e.g.
CDCCCDDCCCDD…
• Actual string was 70-letter long, 6 additional were representing
hypothetical previous games, used to decide how to make first
actual game
• Total # of strategies is 270...The search space is too big to be
searched exhaustively. So, run GA on it!
Prisoner’s Dilemma (6)
• Axelrod’s first experiment had 20 strategies.
• He found out that 8 of the human generated
strategies were representative of the entire set
of strategies (all programs submitted). This
served as fitness function. Interesting is the fact
that this set didn’t include TIT-FOR-TAT.
• Axelrod performed 40 different runs of 50
generations each. (each run used different seed
for random number).
• Observation: TIT-FOR-TAT is the way to go!
Prisoner’s Dilemma (7)
• Conclusion? Beavis & Butthead will be
better off if they are sentenced to doing
some community service together
Nothing escapes
the all-mighty
bunghole!
Future Directions
• GAs promise to solve technological problems
and machine learning
• Crossover and mutation are only the barest
bones of real genetic systems. Researchers
looked into other mechanisms: dominance,
translocation, sexual differentiation, genetic
regulation. These mechanics could be used in
problem solving with GAs. (more advanced GA,
changing environment)
• Connection to Math Genetics (no enough
attention has been paid what’s done here)
What did we learn today?
• What are GAs
• Why use Evolution for computation and how BioEvolution is appealing for these purposes
• Search Spaces
• Fitness Landscape and Fitness Function
• Simple GA Operators
• Simple GA Pseudocode
• Example of GA iterations
• How to use GA to Evolve Prisoner’s Dilemma
• Future Directions
References
• Introduction to Genetic Algorithms by Melanie Mitchell
• USING NEURAL NETWORKS AND GENETIC ALGORITHMS TO
PREDICT STOCK MARKET RETURNS (Thesis submitted to U of
Manchester by Efstathios Kalyvas)
• http://en.wikipedia.org/wiki/Fitness_landscape
• www.irit.fr/COSI/glossary/fulllist.php
• www.csse.monash.edu.au/hons/projects/1995/Andrew.Bulhak/node
38.html
• bioinformatics.unc.edu/
• http://www.boskowan.com/www/jirka/asimo/hondaasimoplayingfootball.jpg
• http://www.clipart.com
• http://www.animationfactory.com/