Transcript Part I-A

CS321 HS 2009
Autonomic Computer Systems
Evolutionary Computation I
November 17, 2009
Lidia Yamamoto and Manolis Sifalakis
University of Basel http://cn.cs.unibas.ch
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
1
Overview


Part I (today)
•
Genetic Algorithms (GA)
•
Genetic Programming (GP)
Part II (Thursday)
•
Representations
•
Dynamic environments
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
2
Overview of Today’s Lecture

Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
3
Textbooks

Melanie Mitchell, “An Introduction to
Genetic Algorithms”, MIT Press, 1998.

W. Banzhaf, P. Nordin, R. E. Keller,
and F. D. Francone. "Genetic
Programming, An Introduction".
Morgan Kaufmann Publishers, Inc.,
1998.

R. Poli, W. B. Langdon, and N. F.
McPhee. "A Field Guide to Genetic
Programming". Published via
http://lulu.com, 2008.
http://www.gp-field-guide.org.uk/.
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
4
Optimization and Self-Optimization


Optimization (Operations Research)
•
maximize/minimize objective function subject to constraints
•
search for a solution in a vast search space or solution space
(space of all possible solutions)
•
linear/non-linear
Self-Optimization (IBM Autonomic Computing)
•
ability of the system to optimize its operations based on a
given target operation profile (objectives and constraints)
•
system continually seeks to improve its performance and
efficiency, in order to meet end-users' needs with minimal
human intervention
•
also able to track and respond to profile changes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
5
Optimization (Operations Research)

General formulation of an optimization problem:
maximize:
subject to:



f(x) = objective function
gi(x) = constraints
f(x)
global
optimum
local
optima
search
space
Simple example:
1 variable (x),
no constraints
x
best solution
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
6
Heuristic Optimization

Search space often discrete, and too large for exhaustive search

Heuristic optimization seeks to explore promising regions of the
search space through a beamed search:

•
refinement of promising solutions seen so far
•
do not guarantee that a global optimum will be found
•
might be trapped in local optima
•
goal is to provide a satisfactory solution in reasonable time
Some heuristic optimization algorithms:
•
Evolutionary algorithms: Inspired by genetics and Darwinian
evolution
•
Swarm algorithms: Inspired by the behavior of social insects
(ants, bees...), flocks of birds, fish, etc.
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
7
Evolutionary Computation

Heuristic optimization method
•



Beamed search in a vast space of possible solutions
Inspired by Darwinian evolution:
•
Biological evolution by natural selection
•
Survival of the fittest
Advantages:
•
Simplicity
•
Potential parallelism
•
Able to work with limited information: only fitness improvement
Disadvantages:
•
Computation cost (e.g. large populations, complex fitness)
•
No guarantees (like all heuristics)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
8
An Eukaryotic Cell

(Prokaryotic = without nucleus, e.g. bacteria)

Eukaryotic = with nucleus, e.g. plant and animal cells
membrane
cytoplasm
organelles
and other
cell components
(e.g. mitochondria, ribosomes...)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
nucleus
genome
(genetic material)
9
The Genome

Each cell nucleus contains one or more chromosomes

A chromosome contains a number of genes

A gene is a region of genomic (DNA) sequence defining a
functional block

Chromosomes may occur in one or more copies within a cell:
•
Diploid cell: has two copies of each chromosome (e.g. in
humans and most animals)
•
Haploid cell: only one copy (e.g. gametes)
•
Polyploid cell: several copies (typically a power of 2, e.g.
tomatoes, crops)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
10
The Genome
chromosome pair
DNA
(diploid organism)
cell nucleus
gene
double helix
chromosomes
chromosomes
A
T
G
C
genes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
11
The Genome

A gene is the basic unit of heredity in a living organism
•
region of genomic sequence
•
can be seen as a functional block
•
typical encodes a protein, which leads to a trait, e.g. eye color

Locus is the position of a gene in the chromosome

Allele: each possible alternative DNA sequence for a given
gene locus (e.g. leading to different traits such as red or white
flowers)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
12
The Genome

Genotype: the genetic material of an individual

Phenotype: the ensemble of observable traits (e.g. flower color,
leaf shape) resulting from the genotype of an individual
allele for
white
flowers
allele
for red
flowers
homologous
chromosomes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
13
Recombination and Mutation

The DNA is able to replicate with a high fidelity (with the help of
enzymes)

However, mutations (errors in the copy process) might still occur,
e.g. due to chemical agents, radiations, etc.

Recombination (crossover) during sexual reproduction:
chromosomes swap genes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
14
Darwinian Evolution


Reproduction = replication + (unlimited) heritable variation
•
Replication of the DNA sequence
•
Cell replication
•
Organism reproduction
•
Variation: mutation, recombination
Fitness = Reproduction rate
•

how fast an organism (or species) is able to reproduce
Selection: survival of the fittest
•
exponential growth + finite resources = competition
•
outcome: competitive exclusion (survival of the fittest)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
15
Overview

Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
16
Evolutionary Computation



Genetic Algorithms (GA)
•
goal: find an optimum solution (e.g. combination of
parameters) to an instance of a problem
•
candidate solutions are typically strings
Genetic Programming (GP)
•
goal: find an optimum program able to solve any instance of
the problem
•
candidate solutions are programs, e.g. linear (e.g. assembly
language), tree (e.g. LISP), graph (dataflow, cartesian GP)
Other variants not covered in this course:
•
Evolution Strategies (ES), Evolutionary Programming (EP)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
17
Evolutionary Computation: Basic Concepts

Individual: a candidate solution to the problem to be solved

Population: set of candidate solutions

Genotype: (compact) representation of individuals; can be broken
down into chromosomes, genes, alleles,...
•
•
GA: string of bits, integers, characters, etc., examples:
bin: 01101011
hex: 39fe87ac3b46
dec: 2039384757
alpha: addbadbaaccd
GP: program, e.g. linear, tree, graph,...
linear: I1: R1 = R2 + R3; I2: R3 = R4 * R1; I3: Jump I2
LISP tree: (* (+ 3 a) (if (< a b) (- x y) (/ x 3)))
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
18
Evolutionary Computation: Basic Concepts

Genetic operators: variation functions that transform a

set of individuals (parents) into a new set (offspring)

Common operators:
•
Mutation: random change in genotype, with low probability
parent
offspring
mutation
01010010001
01010101001
•
Crossover: recombine portions of two genotypes
offspring 1
parent 1
1011 1100101
parent 2
crossover point
0010 011100010
crossover
1011 011100010
offspring 2
0010 1100101
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
19
Evolutionary Computation: Basic Concepts

Phenotype: (expanded) individual that can be evaluated for
fitness (e.g. program)
•
can be the same as the genotype (direct encoding)
•
or different: indirect encoding: genotype-phenotype map
phenotype
genotype
map
0 23 198 54
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
if (a > 10) then
x = 20
else
print b
20
Evolutionary Computation: Basic Concepts

Fitness: measure of how good a candidate solution is
•
tested on a number of test cases (training set)
•
expressed as a fitness function: e.g. error between ideal and
obtained solution (on training case); absolute or relative
performance measure

Selection strategy: Algorithm that selects individuals in the

population that will build the next generation
•
Principle: "survival of the fittest": best fit individuals have a
higher chance of being selected
•
Selected individuals undergo variation through genetic
operators to form the next generation
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
21
Evolutionary Computation: Basic Algorithm

pop = initial population

generation = 0

evaluate(pop)

while bestfit(pop) not good enough, and not stop:
•
parents = select(pop)
•
children = recombine(parents) with probability pr
•
mutate(children) with probability pm
•
pop = children + select(pop)
•
evaluate(pop)
•
generation = generation + 1
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
22
Fitness Landscape
 n-dimensional
landscape:
•
fitness function is the
objective function:
•
is the
genotype to be optimized
•

peaks: local optima
3-D case: intuitive
visualization: z = f (x, y)
z
y
x
example of an initial population
(red dots) on a fitness landscape
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
23
Fitness Landscape

Convergence example:
z
z
y
x
y
x
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
24
Fitness Evaluation
Examples of fitness functions:

Error fitness function: sum of distances (error) between expected
(pi) and obtained (oi) solution on training case i out of n cases, for
individual p:
in this case, best fitness = 0

Relative performance measure, e.g. success rate (best fitness =
100%)

Absolute performance measure, e.g. amount of credits won,
amount of food found (best fitness = infinite)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
25
Selection Strategies


Selection is typically stochastic: "survival of the fittest":
•
best individuals have a higher chance of being selected
•
selected individuals become "parents" and produce "offspring”
•
offspring form the next generation of individuals to be
evaluated
Examples of selection strategies:
•
Fitness-proportional selection (Roulette wheel)
•
Ranking selection
•
Tournament selection
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
26
Selection Strategies

Fitness-proportional or Roulette Wheel selection: the probability
of selection (pi) of individual i (out of a population of m
individuals) is proportional to its fitness:

Ranking selection: probability of selection is a function of rank of
individual from worst to best fitness.

Tournament selection: a group of randomly selected individuals
"compete" in a tournament; winners (best fitness) produce
offspring which will replace losers.
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
27
Overview

Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
28
Genetic Algorithms

Genetic Algorithms: goal: find an optimum solution (e.g.
combination of parameters) to an instance of a problem

Became popular via John Holland, 1970s

Example: solving the Travelling Salesman Problem (NP-hard):

given a set of cities and connections (roads) between them, find the
shortest tour that visits all cities (and each city only once).

individual = candidate tour (sequence of cities)

fitness = length of tour (the shorter the better)

initial population = set of randomly generated tours

iterate by evaluating, selecting, mutating and recombining tours
until stop criterion (e.g. max number of generations,
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
29
Sample Iteration
Ci = individuals (chromosomes)
with two genes: X and Y
Fitness function:
parents:
C4-C3, C4-C1
recombine by
swapping genes
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
30
Simple GA Example: Discover the hidden sentence

Goal: Find out what's the hidden sentence: e.g. "this is a test" (without blanks);
only the length of the target sentence is known to the GA
•

(generalization of the OneMax problem: maximize the number of ones in a
bitstring)
Fitness: Similarity measure between obtained and target string, using a string
alignment (edit distance) algorithm :
•
100% similarity = identical strings = optimum

Fixed-length genotype (character string)

Mutations:
•
point mutation (letter replacement, e.g. "ABCD" "ABXD");
•
shift (rotate) left/right (e.g. "ABCD" "BCDA")

Two-point crossover

Tournament selection, size 4 (two best remain in population, and their
offspring replace the two worst)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
31
Simple GA Example: Typical run
> ./strga "thisisatest"
gen= 1, fit= -27, best=qthakeqsfzt
gen= 2, fit= 9, best=ophrispstet
gen= 3, fit= 9, best=ophrispstet
gen=10, fit= 45, best=qthisattest
gen=15, fit= 81, best=thisisatesc
gen=18, fit= 100, best=thisisatest
Size of search space=3.670344e+15
Population size=1000 tested=45070
Percent of search space explored: 5.176626e-10 %

Size of search space = number of all possible strings of same length as target
(11 characters in example) using alphabet a-z = 2611 possible strings

Typically only a small fraction of (huge) search space is explored

But this problem is quite (too) easy for a GA (smooth fitness landscape)
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
32
Search for solutions aggressively



Hill climbing: always chooses best solution so far
Always reaches a hilltop
But maybe not the highest top:
can be trapped in a local maximum
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
E.g. Steepest Ascend
33
Search for solutions with a GA

Solution discovered probabilistically

Problem: Can not guarantee discovery of hilltop

May reach global maxima by wandering through search space
GA
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
34
GA: Real-World Applications

Numerical and Combinatorial Optimisation
•

Economic
•

Viability of gene propagation
Social systems
•

Host-parasite co-evolution, resource flow, biological arm races
Population Genetics
•

Biding and trading strategies, stock trends
Ecology
•

Job-Shop Scheduling, Traveling salesman
Evolution of social behavior in insect colonies
Computer art
•
Automatic generation of graphics, music
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
35
GA for Computer Art

Interactive evolution: user (computer artist) determines fitness

Examples by Karl Sims, http://www.karlsims.com/
Competing
virtual creatures
1994
Genetic Images
1993
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
Galapagos
(3D animated forms)
1997
36
Overview

Evolutionary Computation, Part I
•
•
Introduction
–
(Self-)Optimization
–
Basic definitions from genetics
Evolutionary Computation
–
Common definitions, basic algorithm
–
Genetic Algorithms (GA)
–
Genetic Programming (GP)
•
Example: Symbolic regression with TinyGP
S321 HS 2009: Evolutionary Computation I, L. Yamamoto, M. Sifalakis, 17 Nov. 2009
37