Evolutionary Computation

Download Report

Transcript Evolutionary Computation

Evolutionary Computation
Robert S. Roos
CMPSC 580
Junior Seminar
Spring 2007
Recall…
[See CS 580 “Lecture Notes and Handouts” from 1/30/2007]
Genetic algorithm
(Holland, 1975)
Initialize population P to random bit strings of length L
Repeat:
for each chromosome c in P compute fitness(c)
repeat
initialize new population P’ to emptyset, Ø
select some fraction of P based on fitness, add to P’
fill in the rest of P’ by performing crossover
randomly mutate elements of P’
P = P’
until stopping condition is met
Before Creating a GA…
…Is the problem well-suited to solution by a
genetic algorithm?
“Find shortest paths in a given graph using a GA?” NO! Polynomialtime algorithm.
“Find minimum spanning tree in a given graph using a GA?” NO!
Polynomial time algorithm.
“Can a given graph be colored using no more than k colors?” NO!
This is a decision problem, not an optimization problem. (But if we
ask “how many colors are needed?” it becomes an optimization
problem.)
Many hard problems already have efficient approximation algorithms.
For example, “pure” GAs are not as good as other methods for TSP.
To Create a GA, You Must…
• Select a representation (what do the
chromosomes represent?)
• Select a fitness function for chromosomes
• (Maybe) design problem-specific mutation
and crossover operators
• Select parameters (population size;
crossover probability; mutation probability)
• Decide on stopping condition
Example
Zero-One Knapsack: Given n objects x0, x1, …, xn-1 with
weights w0, w1, …, wn-1 and values v0, v1, …, vn-1
(respectively), and given a knapsack with capacity C,
choose a set of objects of maximum total value whose total
weight does not exceed C.
Chromosome: n-bit vector. The i-th bit is 1 if and only if xi
is chosen.
Fitness: ∑xivi – penalty, where penalty is zero if ∑xiwi ≤ C,
some large value otherwise.
Crossover, mutation: the usual
Parameters: Pop. Size=100, pc = .5, pm = .01
Stopping condition: Halt when no change in best solution
after 100 consecutive generations
Variations
Real-valued genes: Instead of using bits, use floating-point values.
Crossover exchanges values rather than single bits; mutation adds or
subtracts a small quantity from a real value.
Ordering Problems: chromosome is a permutation (TSP, scheduling,
etc.) Needs specialized crossover and mutation operators. E.g.,
consider TSP, 5 cities, chromosomes ABDEC and ACDEB. Ordinary
crossover could produce nonsense like “ABDEB” and “ACDEC”.
Hybrid GA: use non-genetic operators to improve solutions. E.g., for
function maximization, use hill-climbing or other local search techniques
in the neighborhood of a candidate solution to find a better solution
“Kick” operators: in the middle of computation, allow a portion of the
population to be replaced with random chromosomes
Elitist GA: always keep current best-known solution in the population
Some Terms
Genotype: The description of an individual at the gene level; the
representation used for individuals in the genetic algorithm. Sometimes
used in place of “chromosome”, but “genotype” is used more often in the
context of broad discussions of GAs in general rather than particular GAs
Phenotype: The individual described by a given genotype; the result of
expressing the genes in the genotype. (NOTE: there could be multiple
genotypes that map to the same phenotype – e.g., “ABCDE” and
“BCDEA” might both describe the same TSP tour)
Allele: One of the possible values that can be used in a particular
position of the chromosome. In a bit string the alleles are just 0 and 1.
Codon: the smallest unit of a chromosome that represents a single “trait”
of the individual. In simple bit string representations the codons are just 0
and 1, but if several positions in the bit string are always grouped
together and given an interpretation as a single entity, each such group of
bits is a codon.
Some Terms
Epistasis: Occurs when fitness is determined in a nonlinear way by a
number of genes. E.g., f(001) = f(010) = f(100) = 1; f(011) = f(101) =
f(110) = 2; f(111) = 2; and f(000) = 3. Also sometimes called Linkage.
Max-ones: a well-known “test” problem for GAs in which the goal is
simply to maximize the number of 1’s in a bitstring. (Also “Onemax”)
Building Block: in the traditional GA, a collection of bits and their
corresponding positions (i.e., a hyperplane) such that (1) individuals
having those bits in those positions tend to have higher fitness; (2) the
number of bits in the set is small (“low order”); (3) the bits are
relatively close together (“small defining length”). The Building Block
theorem shows that ordinary crossover and mutation tend to increase
the number of representatives of building blocks in the population at
an exponential rate.
Some Terms
Uniform Crossover: A type of crossover in which a child
chromosome is formed from two parents by randomly selecting one of
the two parents for each bit position. Roughly half of the bits in the
child come from each parent. Ex: 01100 x 10101 -> 00101
“Exploration versus Exploitation”: the fundamental tension in a
GA. We must balance work spent “exploring” the space of possible
solutions versus the work spent “exploiting” promising areas of the
solution space. Too much emphasis on either can result in
convergence to local optima.
Deceptive problem: a problem that is hard for GAs to solve. For
instance, define a four-bit subproblem in which the fitness of any
string is the number of 1s EXCEPT for “1111”, which has fitness 3,
and “0000”, which has fitness 4. Create a larger problem by stringing
together a number of these smaller subproblems.
Then f(0111 1011 1111) = 3+3+3=9, f(0010 0110 1001) = 1+2+2=5,
etc., but f(0000 0000 0000) = 4+4+4 = 12.
Library Exercise
Find as many books as you can on the shelves of the Alden
library about “genetic algorithms”, “genetic programming”,
“evolutionary computation”, and related areas. DON’T
remove them (unless you really intend to sign them out and
use them), but jot down brief titles and authors. I’ll ask you to
read this list out loud. I suggest you divide up into smaller
groups.
Locate the areas on our shelves where books about genetic
algorithms and evolutionary computation tend to be shelved.
What call numbers seem to occur most often?
Make a list of titles and authors and bring this back to the
classroom. I’ll ask you to read your list. Yes, there is a point
to this exercise!
Past Student Research
1998: Kim L. Bailey, ``Steady State Versus Generational Genetic
Algorithms.''
1998: Ricardo D. Cortes. ``The Traveling Salesperson Problem.''
1999: Sarah Toohey.``Using Ant Colony Systems to Solve the Traveling
Salesperson Problem.''
2000: Daniel Phifer.``Dynamic Genetic Operators and Local Optimum
Escape.''
2001: Dmitri Zagidulin, ``Data Abstraction in Genetic Programming.''
2002: Charles DiVittorio. ``Numerical Sequence Recognition Using
Genetic Programming.''
2002: Brian Hykes. ``Genetic Algorithms for Constraint Satisfaction: An
Example Using the `Minesweeper' Problem.''
Past Student Research
2002: Brian Zorman. ``The Creation and Analysis of a JavaSpace-Based
Distributed Genetic Algorithm.''
GECCO 2002: Tiffany Bennett, Jennifer Hannon, Elizabeth Zehner, and
Robert Roos. ``A genetic algorithm for improved shellsort sequences.''
2003. Chris McNamara. ``Experimenting with Distributed
Genetic Grouping Algorithms.''
2004. Joe Zumpella. ``Genetic Algorithms for Texture Generation.''
2005. Kristen Walcott. ``Prioritizing Regression Test Suites for TimeConstrained Execution Using a Genetic Algorithm.''
2005. Jason Zeleznik. ``Optimizing Genetic Algorithms in the Presence of
a Dynamic Fitness Function.''
Past Student Research
2006: Ian Volkwein. Optimizing Neural Network Topology Using
Particle Swarms
2006: Jonathan Goytia. Investigating Cooperative Particle Swarm
Optimization to Train Neural Networks for Learning Checkers
2006: Matthew McGettigan. Using Ant Colony Optimization with the
Rural Postman Problem.