Transcript Part I-A
CS321 HS 2009
Autonomic Computer Systems
Evolutionary Computation I
November 17, 2009
Lidia Yamamoto and Manolis Sifalakis
University of Basel http://cn.cs.unibas.ch
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
1
Overview
Part I (today)
•
Genetic Algorithms (GA)
•
Genetic Programming (GP)
Part II (Thursday)
•
Representations
•
Dynamic environments
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
2
Overview of Today’s Lecture
Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
3
Textbooks
Melanie Mitchell, “An Introduction to
Genetic Algorithms”, MIT Press, 1998.
W. Banzhaf, P. Nordin, R. E. Keller,
and F. D. Francone. "Genetic
Programming, An Introduction".
Morgan Kaufmann Publishers, Inc.,
1998.
R. Poli, W. B. Langdon, and N. F.
McPhee. "A Field Guide to Genetic
Programming". Published via
http://lulu.com, 2008.
http://www.gp-field-guide.org.uk/.
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
4
Optimization and Self-Optimization
Optimization (Operations Research)
•
maximize/minimize objective function subject to constraints
•
search for a solution in a vast search space or solution space
(space of all possible solutions)
•
linear/non-linear
Self-Optimization (IBM Autonomic Computing)
•
ability of the system to optimize its operations based on a
given target operation profile (objectives and constraints)
•
system continually seeks to improve its performance and
efficiency, in order to meet end-users' needs with minimal
human intervention
•
also able to track and respond to profile changes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
5
Optimization (Operations Research)
General formulation of an optimization problem:
maximize:
subject to:
f(x) = objective function
gi(x) = constraints
f(x)
global
optimum
local
optima
search
space
Simple example:
1 variable (x),
no constraints
x
best solution
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
6
Heuristic Optimization
Search space often discrete, and too large for exhaustive search
Heuristic optimization seeks to explore promising regions of the
search space through a beamed search:
•
refinement of promising solutions seen so far
•
do not guarantee that a global optimum will be found
•
might be trapped in local optima
•
goal is to provide a satisfactory solution in reasonable time
Some heuristic optimization algorithms:
•
Evolutionary algorithms: Inspired by genetics and Darwinian
evolution
•
Swarm algorithms: Inspired by the behavior of social insects
(ants, bees...), flocks of birds, fish, etc.
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
7
Evolutionary Computation
Heuristic optimization method
•
Beamed search in a vast space of possible solutions
Inspired by Darwinian evolution:
•
Biological evolution by natural selection
•
Survival of the fittest
Advantages:
•
Simplicity
•
Potential parallelism
•
Able to work with limited information: only fitness improvement
Disadvantages:
•
Computation cost (e.g. large populations, complex fitness)
•
No guarantees (like all heuristics)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
8
An Eukaryotic Cell
(Prokaryotic = without nucleus, e.g. bacteria)
Eukaryotic = with nucleus, e.g. plant and animal cells
membrane
cytoplasm
organelles
and other
cell components
(e.g. mitochondria, ribosomes...)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
nucleus
genome
(genetic material)
9
The Genome
Each cell nucleus contains one or more chromosomes
A chromosome contains a number of genes
A gene is a region of genomic (DNA) sequence defining a
functional block
Chromosomes may occur in one or more copies within a cell:
•
Diploid cell: has two copies of each chromosome (e.g. in
humans and most animals)
•
Haploid cell: only one copy (e.g. gametes)
•
Polyploid cell: several copies (typically a power of 2, e.g.
tomatoes, crops)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
10
The Genome
chromosome pair
DNA
(diploid organism)
cell nucleus
gene
double helix
chromosomes
chromosomes
A
T
G
C
genes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
11
The Genome
A gene is the basic unit of heredity in a living organism
•
region of genomic sequence
•
can be seen as a functional block
•
typical encodes a protein, which leads to a trait, e.g. eye color
Locus is the position of a gene in the chromosome
Allele: each possible alternative DNA sequence for a given
gene locus (e.g. leading to different traits such as red or white
flowers)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
12
The Genome
Genotype: the genetic material of an individual
Phenotype: the ensemble of observable traits (e.g. flower color,
leaf shape) resulting from the genotype of an individual
allele for
white
flowers
allele
for red
flowers
homologous
chromosomes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
13
Recombination and Mutation
The DNA is able to replicate with a high fidelity (with the help of
enzymes)
However, mutations (errors in the copy process) might still occur,
e.g. due to chemical agents, radiations, etc.
Recombination (crossover) during sexual reproduction:
chromosomes swap genes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
14
Darwinian Evolution
Reproduction = replication + (unlimited) heritable variation
•
Replication of the DNA sequence
•
Cell replication
•
Organism reproduction
•
Variation: mutation, recombination
Fitness = Reproduction rate
•
how fast an organism (or species) is able to reproduce
Selection: survival of the fittest
•
exponential growth + finite resources = competition
•
outcome: competitive exclusion (survival of the fittest)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
15
Overview
Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
16
Evolutionary Computation
Genetic Algorithms (GA)
•
goal: find an optimum solution (e.g. combination of
parameters) to an instance of a problem
•
candidate solutions are typically strings
Genetic Programming (GP)
•
goal: find an optimum program able to solve any instance of
the problem
•
candidate solutions are programs, e.g. linear (e.g. assembly
language), tree (e.g. LISP), graph (dataflow, cartesian GP)
Other variants not covered in this course:
•
Evolution Strategies (ES), Evolutionary Programming (EP)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
17
Evolutionary Computation: Basic Concepts
Individual: a candidate solution to the problem to be solved
Population: set of candidate solutions
Genotype: (compact) representation of individuals; can be broken
down into chromosomes, genes, alleles,...
•
•
GA: string of bits, integers, characters, etc., examples:
bin: 01101011
hex: 39fe87ac3b46
dec: 2039384757
alpha: addbadbaaccd
GP: program, e.g. linear, tree, graph,...
linear: I1: R1 = R2 + R3; I2: R3 = R4 * R1; I3: Jump I2
LISP tree: (* (+ 3 a) (if (< a b) (- x y) (/ x 3)))
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
18
Evolutionary Computation: Basic Concepts
Genetic operators: variation functions that transform a
set of individuals (parents) into a new set (offspring)
Common operators:
•
Mutation: random change in genotype, with low probability
parent
offspring
mutation
01010010001
01010101001
•
Crossover: recombine portions of two genotypes
offspring 1
parent 1
1011 1100101
parent 2
crossover point
0010 011100010
crossover
1011 011100010
offspring 2
0010 1100101
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
19
Evolutionary Computation: Basic Concepts
Phenotype: (expanded) individual that can be evaluated for
fitness (e.g. program)
•
can be the same as the genotype (direct encoding)
•
or different: indirect encoding: genotype-phenotype map
phenotype
genotype
map
0 23 198 54
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
if (a > 10) then
x = 20
else
print b
20
Evolutionary Computation: Basic Concepts
Fitness: measure of how good a candidate solution is
•
tested on a number of test cases (training set)
•
expressed as a fitness function: e.g. error between ideal and
obtained solution (on training case); absolute or relative
performance measure
Selection strategy: Algorithm that selects individuals in the
population that will build the next generation
•
Principle: "survival of the fittest": best fit individuals have a
higher chance of being selected
•
Selected individuals undergo variation through genetic
operators to form the next generation
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
21
Evolutionary Computation: Basic Algorithm
pop = initial population
generation = 0
evaluate(pop)
while bestfit(pop) not good enough, and not stop:
•
parents = select(pop)
•
children = recombine(parents) with probability pr
•
mutate(children) with probability pm
•
pop = children + select(pop)
•
evaluate(pop)
•
generation = generation + 1
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
22
Fitness Landscape
n-dimensional
landscape:
•
fitness function is the
objective function:
•
is the
genotype to be optimized
•
peaks: local optima
3-D case: intuitive
visualization: z = f (x, y)
z
y
x
example of an initial population
(red dots) on a fitness landscape
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
23
Fitness Landscape
Convergence example:
z
z
y
x
y
x
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
24
Fitness Evaluation
Examples of fitness functions:
Error fitness function: sum of distances (error) between expected
(pi) and obtained (oi) solution on training case i out of n cases, for
individual p:
in this case, best fitness = 0
Relative performance measure, e.g. success rate (best fitness =
100%)
Absolute performance measure, e.g. amount of credits won,
amount of food found (best fitness = infinite)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
25
Selection Strategies
Selection is typically stochastic: "survival of the fittest":
•
best individuals have a higher chance of being selected
•
selected individuals become "parents" and produce "offspring”
•
offspring form the next generation of individuals to be
evaluated
Examples of selection strategies:
•
Fitness-proportional selection (Roulette wheel)
•
Ranking selection
•
Tournament selection
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
26
Selection Strategies
Fitness-proportional or Roulette Wheel selection: the probability
of selection (pi) of individual i (out of a population of m
individuals) is proportional to its fitness:
Ranking selection: probability of selection is a function of rank of
individual from worst to best fitness.
Tournament selection: a group of randomly selected individuals
"compete" in a tournament; winners (best fitness) produce
offspring which will replace losers.
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
27
Overview
Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
28
Genetic Algorithms
Genetic Algorithms: goal: find an optimum solution (e.g.
combination of parameters) to an instance of a problem
Became popular via John Holland, 1970s
Example: solving the Travelling Salesman Problem (NP-hard):
given a set of cities and connections (roads) between them, find the
shortest tour that visits all cities (and each city only once).
individual = candidate tour (sequence of cities)
fitness = length of tour (the shorter the better)
initial population = set of randomly generated tours
iterate by evaluating, selecting, mutating and recombining tours
until stop criterion (e.g. max number of generations,
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
29
Sample Iteration
Ci = individuals (chromosomes)
with two genes: X and Y
Fitness function:
parents:
C4-C3, C4-C1
recombine by
swapping genes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
30
Simple GA Example: Discover the hidden sentence
Goal: Find out what's the hidden sentence: e.g. "this is a test" (without blanks);
only the length of the target sentence is known to the GA
•
(generalization of the OneMax problem: maximize the number of ones in a
bitstring)
Fitness: Similarity measure between obtained and target string, using a string
alignment (edit distance) algorithm :
•
100% similarity = identical strings = optimum
Fixed-length genotype (character string)
Mutations:
•
point mutation (letter replacement, e.g. "ABCD" "ABXD");
•
shift (rotate) left/right (e.g. "ABCD" "BCDA")
Two-point crossover
Tournament selection, size 4 (two best remain in population, and their
offspring replace the two worst)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
31
Simple GA Example: Typical run
> ./strga "thisisatest"
gen= 1, fit= -27, best=qthakeqsfzt
gen= 2, fit= 9, best=ophrispstet
gen= 3, fit= 9, best=ophrispstet
gen=10, fit= 45, best=qthisattest
gen=15, fit= 81, best=thisisatesc
gen=18, fit= 100, best=thisisatest
Size of search space=3.670344e+15
Population size=1000 tested=45070
Percent of search space explored: 5.176626e-10 %
Size of search space = number of all possible strings of same length as target
(11 characters in example) using alphabet a-z = 2611 possible strings
Typically only a small fraction of (huge) search space is explored
But this problem is quite (too) easy for a GA (smooth fitness landscape)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
32
Search for solutions aggressively
Hill climbing: always chooses best solution so far
Always reaches a hilltop
But maybe not the highest top:
can be trapped in a local maximum
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
E.g. Steepest Ascend
33
Search for solutions with a GA
Solution discovered probabilistically
Problem: Can not guarantee discovery of hilltop
May reach global maxima by wandering through search space
GA
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
34
GA: Real-World Applications
Numerical and Combinatorial Optimisation
•
Economic
•
Viability of gene propagation
Social systems
•
Host-parasite co-evolution, resource flow, biological arm races
Population Genetics
•
Biding and trading strategies, stock trends
Ecology
•
Job-Shop Scheduling, Traveling salesman
Evolution of social behavior in insect colonies
Computer art
•
Automatic generation of graphics, music
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
35
GA for Computer Art
Interactive evolution: user (computer artist) determines fitness
Examples by Karl Sims, http://www.karlsims.com/
Competing
virtual creatures
1994
Genetic Images
1993
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
Galapagos
(3D animated forms)
1997
36
Overview
Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
37