Transcript Slide 1

Simple Selectorecombinative GAs

Scale poorely on hard problems (multimodal, deceptive, high
degree of subsolution interaction, noise, ...), largely the result of
their mixing behaviour
Inability of SGA to correctly identify and adequately mix the appropriate
BBs in subsequent generations
 Exponential computation complexity of SGA


Crossover operators or other exchange emchanisms are needed
such that adapt to the problem at hand

‹#›
Linkage adaptation
Genetic Algorithm with Limited Convergence
Messy Genetic Algorithms (mGAs)

Inspiration from the nature


evolution starts from the simplest forms of life
mGAs depart from SGA in four ways:
messy codings
 messy operators
 separation of processing into three heterogeneous phases
 epoch-wise iteration to improve the complexity of solution

‹#›
Genetic Algorithm with Limited Convergence
mGA’s codings

Tagged alleles

Variable-length strings: (name1, allele1) … (nameN, alleleN)
((4,0) (1,1) (2,0) (4,1) (4,1) (5,1))

Over-specification
multiple gene instances (gene 4)
 majority voting – would express deceptive genes too readily
 first-come first-served (left to right expression) - positional priority


Underspecification
missing gene instances (gene 3)
 average schema value – variance is too high
 competitive template – solution locally optimal with respect to k-bit
perturbations

‹#›
Genetic Algorithm with Limited Convergence
Messy operators: cut & splice

Cut – divides a single string into two parts

Splice – joins the head of one string with the tail of the other one

When short strings are mated – probability of cut is small  mostly the
string will be just spliced


‹#›
the strings’ length is doubled
When long string are mated – probability of cut is large  one-point
crossover
Genetic Algorithm with Limited Convergence
Three heterogeneous phases

Initialization
Enumerative initialization of the population with all sub-strings of a certain
length k<<l  (lk)2k  O(lk) computations
 Guaranteed that all BBs of certain size are present in the population


Primordial phase
Only selection used to dope the population with good BBs
 Good linkage groups are selected before their alleles are allowed to be
mixed


Juxtapositional phase
selection + cut&splice
 Mixing of the BBs

‹#›
Genetic Algorithm with Limited Convergence
Fast messy genetic algorithms

Probabilistically complete enumeration
Population of strings of length l’ close to l is generated
 Assumption: each string contains many different BBs of length
k<<l


Building block filtering
extracts highly-fit and effectively linked BBs
 repeats (1) selection and (2) gene deletion
 only O(l) computations to converge


Extended thresholding


number of genes in common
fmGA vs mGA


‹#›
tournaments are held only between strings that have a threshold
150-bit long problem, 305-bit deceptive function
1.9105 vs. 5.9108 evaluations
Genetic Algorithm with Limited Convergence
Gene expression messy GA gemGA

Messy ???
No variable-length strings
 No under- or over-specification
 No left-to-right expression


Messy use of heterogeneous phases of processing in gemGA
Linkage learning phase - first identifies linkage groups
 Mixing phase – selection + recombination


‹#›
exchanges good allele combinations within those groups to find
optimal solution
Genetic Algorithm with Limited Convergence
gemGA: The idea

Linkage learning phase

Transcription I (antimutation)



Each string undergoes l one-bit perturbations
Improvements are ignored ?!? (bit does not belong to optimal BB)
Changes that degrade the structure are marked as possible linkage groups
candidates
Ex.: two 3-bit deceptive BBs 111
101
marked not marked
(degrades) (improves)

Transcription II

Identifies the exact relations among the genes by checking nonlinearities
IF f(X’i) + f(X’j) != f(X’ij) THEN link(i,j)
‹#›
Genetic Algorithm with Limited Convergence