Genetic Algorithm Power Point Presentation

Download Report

Transcript Genetic Algorithm Power Point Presentation

GENETIC ALGORITHMS AND
GENETIC PROGRAMMING
John R. Koza
Consulting Professor (Medical Informatics)
Department of Medicine
School of Medicine
Consulting Professor
Department of Electrical Engineering
School of Engineering
Stanford University
Stanford, California 94305
[email protected]
http://www.smi.stanford.edu/people/koza/
DEFINITION OF THE GENETIC
ALGORITHM (GA)
The genetic algorithm is a probabalistic search
algorithm that iteratively transforms a set
(called a population) of mathematical objects
(typically fixed-length binary character
strings), each with an associated fitness value,
into a new population of offspring objects using
the Darwinian principle of natural selection
and using operations that are patterned after
naturally occurring genetic operations, such as
crossover (sexual recombination) and mutation.
GENETIC ALGORITHM (GA)
Generation 0
Generation 1
Individuals
Fitness
Offspring
011
$3
111
001
$1
010
110
$6
110
010
$2
010
HAMBURGER RESTAURANT
PROBLEM
• Price
1 = $ 0.50 price
0 = $10.00 price
• Drink
1 = Coca Cola
0 = Wine
• Ambiance
1 = Fast snappy service
0 = Leisurely service with tuxedoed waiter
CHROMOSOME (GENOME) OF THE
GLOBAL OPTIMUM
McDONALD's
1
1
1
THE SEARCH SPACE
1
2
3
4
5
6
7
8
000
001
010
011
100
101
110
111
• Alphabet size K=2, Length L=3
• Size of search space: KL=2L=23=8
IMPRACTICALITY OF RANDOM
OR ENUMERATIVE SEARCH
• 81-bit problems are very small for GA
• However, even if L is as small as 81, 281 ~
1027 = number of nanoseconds since the
beginning of the universe 15 billion years
ago
GA FLOWCHART
GENERATION 0
Generation 0
011 3
001 1
110 6
010 2
1
2
3
4
Total
Worst
Average
Best
DEFINITION OF THE GENETIC
ALGORITHM (GA)
The genetic algorithm is a probabalistic search
algorithm that iteratively transforms a set
(called a population) of mathematical objects
(typically fixed-length binary character
strings), each with an associated fitness value,
into a new population of offspring objects using
the Darwinian principle of natural selection
and using operations that are patterned after
naturally occurring genetic operations, such as
crossover (sexual recombination) and mutation.
PROBABILISTIC SELECTION
BASED ON FITNESS
•
•
•
•
•
Better individuals are preferred
Best is not always picked
Worst is not necessarily excluded
Nothing is guaranteed
Mixture of greedy exploitation and
adventurous exploration
• Similarities to simulated annealing (SA)
PROBABILISTIC SELECTION
BASED ON FITNESS
0.17
0.25
0.08
0.5
DARWINIAN FITNESS
PROPORTIONATE SELECTION
Generation 0
1
011 3
.25
2
001 1
.08
3
110 6
.50
4
010 2
.17
Total
12
Worst
1
Average
3.00
Best
6
Mating pool
011
3
110
6
110
6
010
2
17
2
4.5
6
DEFINITION OF THE GENETIC
ALGORITHM (GA)
The genetic algorithm is a probabalistic search
algorithm that iteratively transforms a set
(called a population) of mathematical objects
(typically fixed-length binary character
strings), each with an associated fitness value,
into a new population of offspring objects using
the Darwinian principle of natural selection
and using operations that are patterned after
naturally occurring genetic operations, such as
crossover (sexual recombination) and mutation.
MUTATION OPERATION
• Parent chosen probabilistically based on
fitness
Parent
010
• Mutation point chosen at random
Parent
--0
• One offspring
Offspring
011
AFTER MUTATION OPERATION
Generation 0
1
011 3
.25
2
001 1
.08
3
110 6
.50
4
010 2
.17
Total
12
Worst
1
Average
3.00
Best
6
Mating pool Generation 1
011
3
110
6
110
6
010
2
--- 011 3
17
2
4.5
6
CROSSOVER OPERATION
• 2 parents chosen probabilistically based on
fitness
Parent 1 Parent 2
011
110
CROSSOVER (CONTINUED)
• Interstitial point picked at random
Fragment 1
01-
Fragment 2
11-
• 2 remainders
Remainder 1 Remainder 2
--1
--0
• 2 offspring produced by crossover
Offspring 1
111
Offspring 2
010
AFTER CROSSOVER OPERATION
Generation 0
1
011 3
.25
2
001 1
.08
3
110 6
.50
4
010 2
.17
Total
12
Worst
1
Average
3.00
Best
6
Mating pool Generation 1
011
3
2
111 7
110
6
2
010 2
110
6
010
2
17
2
4.5
6
AFTER REPRODUCTION
OPERATION
Generation 0 Mating pool Generation 1
1
011 3
.25
2
001 1
.08
3
110 6
.50 110
6
--- 110 6
4
010 2
.17
Total
12
17
Worst
1
2
Average
3.00
4.5
Best
6
6
DEFINITION OF THE GENETIC
ALGORITHM (GA)
The genetic algorithm is a probabalistic search
algorithm that iteratively transforms a set
(called a population) of mathematical objects
(typically fixed-length binary character
strings), each with an associated fitness value,
into a new population of offspring objects using
the Darwinian principle of natural selection
and using operations that are patterned after
naturally occurring genetic operations, such as
crossover (sexual recombination) and mutation.
GENERATION 1
Generation 0
1
011 3
.25
2
001 1
.08
3
110 6
.50
4
010 2
.17
Total
12
Worst
1
Average
3.00
Best
6
Mating pool
011
3
110
6
110
6
010
2
17
2
4.5
6
Generation 1
2
111 7
2
010 2
--- 110 6
--- 011 3
18
2
4.5
7
DEFINITION OF THE GENETIC
ALGORITHM (GA)
The genetic algorithm is a probabalistic search
algorithm that iteratively transforms a set
(called a population) of mathematical objects
(typically fixed-length binary character
strings), each with an associated fitness value,
into a new population of offspring objects using
the Darwinian principle of natural selection
and using operations that are patterned after
naturally occurring genetic operations, such as
crossover (sexual recombination) and mutation.
DEFINITION OF THE GENETIC
ALGORITHM (GA)
The genetic algorithm is a probabalistic search
algorithm that iteratively transforms a set
(called a population) of mathematical objects
(typically fixed-length binary character
strings), each with an associated fitness value,
into a new population of offspring objects using
the Darwinian principle of natural selection
and using operations that are patterned after
naturally occurring genetic operations, such as
crossover (sexual recombination) and mutation.
PROBABILISTIC STEPS
• The initial population is typically random
• Probabilistic selection based on fitness
- Best is not always picked
- Worst is not necessarily excluded
• Random picking of mutation and
crossover points
• Often, there is probabilistic scenario as
part of the fitness measure
ANTENNA DESIGN
ANTENNA DESIGN
• The problem (Altshuler and Linden 1998) is to
determine the x-y-z coordinates of the 3dimensional position of the ends (X1, Y1, Z1, X2,
Y2, Z2,… , X7, Y7, Z7) of 7 straight wires so that
the resulting 7-wire antenna satisfies certain
performance requirements
• The first wire starts at feed point (0, 0, 0) in the
middle of the ground plane
• The antenna must fit inside the 0.5 cube
ANTENNA GENOME
X1
Y1
+0010 -1110
Z1
X2
Y2
+0001 +0011 -1011
Z2
…
+0011 …
• 105-bit chromosome (genome)
• Each x-y-z coordinate is represented by 5 bits
(4-bit granularity for data plus a sign bit)
• Total chromosome is 3  7  5 = 105 bits
ANTENNA FITNESS
• Antenna is for ground-to-satellite
communications for cars and handsets
• We desire near-uniform gain pattern 10
above the horizon
• Fitness is measured based on the
antenna's radiation pattern. The
radiation pattern is simulated by
National Electromagnetics Code (NEC)
ANTENNA FITNESS
• Fitness is sum of the squares of the
difference between the average gain and
the antenna's gain
• Sum is taken for angles  between -90
and +90 and all azimuth angles  from
0 to 180
• The smaller the value of fitness, the
better
GRAPH OF ANTENNA FITNESS
U. S. PATENT 5,719,794
10-MEMBER TRUSS
10-MEMBER TRUSS
• Prespecified topological arrangement of the 10
members, the load, and the wall (Goldberg and
Samtani 1986)
• Truss has 10 members (6 are length of 30 feet
and 4 are length 30√2 = 41 feet)
• The problem is to determine the cross-sectional
areas (A1, … , A10) of each of the 10 members
so as to minimize weight of the material for a
truss that supports the 2 loads
• The weight is based on volume (i.e., crosssectional area  length)
TRUSS GENOME
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
0010 1110 0001 0011 1011 0011 1111 0011 0011 1010
•
•
•
•
•
40-bit chromosome (genome)
4-bit granularity for truss diameters
0000 = smallest diameter
1111 = largest diameter
Total chromosome is 4  10 = 40 bits
TRUSS FITNESS
• Two-part (multiobjective) fitness measure
– First, fitness is computed by taking the sum,
over the 10 members, of the cross-sectional
area of each member times the length of each
member (30 feet or 30√2 = 41 feet).
– Second, a penalty (up to 10%) is imposed for
violating the stress constraints. Stresses are
computed using standard mechanical
engineering techniques.
• The smaller the total fitness, the better
CELLULAR AUTOMATA
STATE TRANSITION TABLE
#
0
1
2
3
4
…
127
WWW
0
0
0
0
0
…
1
WW
0
0
0
0
0
…
1
W
0
0
0
0
1
…
1
X
0
0
0
0
1
…
1
E
0
0
1
1
0
…
1
EE
0
1
0
1
0
…
1
EEE
1
0
0
0
0
…
1
Rule
a0
a1
a2
a3
a4
…
a127
CELLULAR AUTOMATA
A0
A1
A2
…
A127
a0
a1
a2
…
a127
• 128-bit chromosome (genome)
PROBLEM-SPECIFIC GENOMES
N  M GENOME
1
1
1
1
1
1
1
1
1
0
1
1
1
1
0
1
1
1
0
1
0
1
1
1
1
1
1
1
1
0
1
1
1
0
1
1
1
0
1
1
1
1
1
1
0
1
1
1
0
GENETIC ALGORITHM USING
VARIABLE-LENGTH STRINGS
• 5-WIRE ANTENNA (5  15 = 75 bits)
X1
Y1
+0010 -1110
Z1
…
+0001
…
X5
Y5
+0010 -1110
Z5
+0001
• 4-WIRE ANTENNA (4  15 = 60 bits)
X1
Y1
Z1
+1010 -0110 +1101
…
…
X4
Y4
Z4
+1010 -0110 +1001
GENETIC PROGRAMMING
THE CHALLENGE
"How can computers learn to solve problems without
being explicitly programmed? In other words, how
can computers be made to do what is needed to be
done, without being told exactly how to do it?"
 Attributed to Arthur Samuel (1959)
CRITERION FOR SUCCESS
"The aim [is] ... to get machines to exhibit behavior, which
if done by humans, would be assumed to involve the use
of intelligence.“
 Arthur Samuel (1983)
REPRESENTATIONS
• Decision trees
• If-then production
rules
• Horn clauses
• Neural nets
• Bayesian networks
• Frames
• Propositional logic
• Binary decision
diagrams
• Formal grammars
• Coefficients for
polynomials
• Reinforcement
learning tables
• Conceptual clusters
• Classifier systems
A COMPUTER PROGRAM
GENETIC PROGRAMMING (GP)
• GP applies the approach of the genetic
algorithm to the space of possible
computer programs
• Computer programs are the lingua
franca for expressing the solutions to a
wide variety of problems
• A wide variety of seemingly different
problems from many different fields can
be reformulated as a search for a
computer program to solve the problem.
GP  MAIN POINTS
• Genetic programming now routinely delivers
high-return human-competitive machine
intelligence.
• Genetic programming is an automated
invention machine.
• Genetic programming has delivered a
progression of qualitatively more substantial
results in synchrony with five approximately
order-of-magnitude increases in the
expenditure of computer time.
DEFINITION OF “HIGHRETURN”
The AI ratio (the “artificial-to-intelligence”
ratio) of a problem-solving method as the
ratio of that which is delivered by the
automated operation of the artificial
method to the amount of intelligence that
is supplied by the human applying the
method to a particular problem
DEFINITION OF “ROUTINE”
A problem solving method is routine if it is
general and relatively little human effort
is required to get the method to
successfully handle new problems within
a particular domain and to successfully
handle new problems from a different
domain.
CRITERIA FOR
“HUMAN-COMPETITIVENESS”
• Previously patented, an improvement over a
patented invention, or patentable today
• Publishable in its own right as a new scientific
result independent of the fact that the result was
mechanically created
• Holds it own in regulated competition against
humans (or programs)
• 5 other similar criteria that are “arms-length”
from the fields of AI, ML, GP
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP
•
•
•
•
•
Toy problems
Human-competitive non-patent results
20th-century patented inventions
21st-century patented inventions
Patentable new inventions
GP FLOWCHART
A COMPUTER PROGRAM IN C
int foo (int time)
{
int temp1, temp2;
if (time > 10)
temp1 = 3;
else
temp1 = 4;
temp2 = temp1 + 1 + 2;
return (temp2);
}
OUTPUT OF C PROGRAM
Time
Output
0
6
1
6
2
6
3
6
4
6
5
6
6
6
7
6
8
6
9
6
10
6
11
7
12
7
PROGRAM TREE
(+ 1 2 (IF (> TIME 10) 3 4))
CREATING RANDOM PROGRAMS
CREATING RANDOM PROGRAMS
• Available functions
F = {+, -, *, %, IFLTE}
• Available terminals
T = {X, Y, Random-Constants}
• The random programs are:
– Of different sizes and shapes
– Syntactically valid
– Executable
GP GENETIC OPERATIONS
•
•
•
•
Reproduction
Mutation
Crossover (sexual recombination)
Architecture-altering operations
MUTATION OPERATION
MUTATION OPERATION
•
•
•
•
Select 1 parent probabilistically based on fitness
Pick point from 1 to NUMBER-OF-POINTS
Delete subtree at the picked point
Grow new subtree at the mutation point in same
way as generated trees for initial random
population (generation 0)
• The result is a syntactically valid executable
program
• Put the offspring into the next generation of the
population
CROSSOVER OPERATION
CROSSOVER OPERATION
• Select 2 parents probabilistically based on fitness
• Randomly pick a number from 1 to NUMBER-OFPOINTS for 1st parent
• Independently randomly pick a number for 2nd parent
• The result is a syntactically valid executable program
• Put the offspring into the next generation of the
population
• Identify the subtrees rooted at the two picked points
REPRODUCTION OPERATION
• Select parent probabilistically based on
fitness
• Copy it (unchanged) into the next
generation of the population
FIVE MAJOR PREPARATORY STEPS FOR GP
•
•
•
•
•
Determining the set of terminals
Determining the set of functions
Determining the fitness measure
Determining the parameters for the run
Determining the method for designating a result and
the criterion for terminating a run
ILLUSTRATIVE GP RUN
SYMBOLIC REGRESSION
Independent Dependent
variable X
variable Y
-1.00
1.00
-0.80
0.84
-0.60
0.76
-0.40
0.76
-0.20
0.84
0.00
1.00
0.20
1.24
0.40
1.56
0.60
1.96
0.80
2.44
1.00
3.00
PREPARATORY STEPS
Objective:
Find a computer program with one input
(independent variable X) whose output
equals the given data
1
Terminal set:
T = {X, Random-Constants}
2
Function set:
F = {+,
3
Fitness:
The sum of the absolute value of the
differences
between
the
candidate
program’s output and the given data
(computed over numerous values of the
independent variable x from –1.0 to +1.0)
4
Parameters:
Population size M = 4
5
Termination:
An individual emerges whose sum of
absolute errors is less than 0.1
-,
*, %}
SYMBOLIC REGRESSION
POPULATION OF 4 RANDOMLY CREATED
INDIVIDUALS FOR GENERATION 0
SYMBOLIC REGRESSION x2 + x + 1
FITNESS OF THE 4 INDIVIDUALS IN GEN 0
x+1
x2 + 1
2
x
0.67
1.00
1.70
2.67
SYMBOLIC REGRESSION x2 + x + 1
GENERATION 1
Mutant of (c)
Copy of (a)
picking “2”
as mutation
point
First offspring of
crossover of (a)
and (b)
picking “+” of
parent (a) and
left-most “x” of
parent (b) as
crossover points
Second offspring
of crossover of
(a) and (b)
picking “+” of
parent (a) and
left-most “x” of
parent (b) as
crossover points
CLASSIFICATION
GP TABLEAU – INTERTWINED
SPIRALS
Objective:
2
Create a program to classify a given
point in the x-y plane to the red or
blue spiral
Terminal set: T = {X,Y,Random-Constants}
Function set: F = {+,-,*,%,IFLTE,SIN,COS}
3
Fitness:
1
4
5
The number of correctly classified
points (0 – 194)
Parameters: M = 10,000. G = 51
Termination: An individual program scores 194
WALL-FOLLOWER
FITNESS
BEST OF GENERATION 57
BOX MOVER – BEST OF GEN 0
BOX MOVER
GEN 45 – FITNESS CASE 1
TRUCK BACKER UPPER
TRUCK BACKER UPPER
• 4-Dimensional control problem
–
–
–
–
horizontal position, x
vertical position, y
angle between trailer and horizontal, t
angle between trailer and cab, d
• One control variable (steering wheel turn
angle)
• State transition equations map the 4 state
variables into 1 output (the control variable)
• Simulation run over many initial conditions
and over hundreds of time steps
GENETIC PROGRAMMING: ON THE
PROGRAMMING OF COMPUTERS BY
MEANS OF NATURAL SELECTION
(Koza 1992)
2 MAIN POINTS FROM 1992 BOOK
• Virtually all problems in artificial intelligence,
machine learning, adaptive systems, and
automated learning can be recast as a search
for a computer program.
• Genetic programming provides a way to
successfully conduct the search for a computer
program in the space of computer programs.
SOME RESULTS FROM 1992 BOOK
•
•
•
•
•
•
•
•
•
•
Intertwined Spirals
Truck Backer Upper
Broom Balancer
Wall Follower
Box Mover
Artificial Ant
Differential Games
Inverse Kinematics
Central Place Foraging
Block Stacking
•
•
•
•
•
•
•
•
Randomizer
Cellular Automata
Task Prioritization
Image Compression
Econometric Equation
Optimization
Boolean Function Learning
Co-Evolution of GamePlaying Strategies
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP
•
•
•
•
•
Toy problems
Human-competitive non-patent results
20th-century patented inventions
21st-century patented inventions
Patentable new inventions
COMPUTER PROGRAMS
• Subroutines provide one way to REUSE code  possibly
with different instantiations of the dummy variables
(formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
executing code
SYMBOLIC REGRESSION
Fitness case L0
W0
H0
L1
W1
H1
Dependent variable D
1
3
4
7
2
5
3
54
2
7
10
9
10
3
1
600
3
10
9
4
8
1
6
312
4
3
9
5
1
6
4
111
5
4
3
2
7
6
1
-18
6
3
3
1
9
5
4
-171
7
5
9
9
1
7
6
363
8
1
2
9
3
9
2
-36
9
2
6
8
2
6
10
-24
10
8
1
10
7
5
1
45
EVOLVED SOLUTION
(- (* (* W0 L0) H0)
(* (* W1 L1) H1))
DIFFERENCE IN VOLUMES
D = L0W0H0 – L1W1H1
AUTOMATICALLY DEFINED
FUNCTION volume
AUTOMATICALLY DEFINED
FUNCTION volume
(progn
(defun volume (arg0 arg1 arg2)
(values
(* arg0 (* arg1 arg2))))
(values
(- (volume L0 W0 H0)
(volume L1 W1 H1))))
AUTOMATICALLY DEFINED
FUNCTIONS
• ADFs provide a way to REUSE code
• Code is typically reused with different
instantiations of the dummy variables
(formal parameters)
ADDITION OF V0 AND V1
Fitness case L0
W0
H0
L1
W1
H1
V0
V1
D
1
3
4
7
2
5
3
84
30
54
2
7
10
9
10
3
1
630
30
600
3
10
9
4
8
1
6
360
48
312
4
3
9
5
1
6
4
135
24
111
5
4
3
2
7
6
1
24
42
-18
6
3
3
1
9
5
4
9
180
-171
7
5
9
9
1
7
6
405
42
363
8
1
2
9
3
9
2
18
54
-36
9
2
6
8
2
6
10
96
120
-24
10
8
1
10
7
5
1
80
35
45
DIVIDE AND CONQUER
DIVIDE AND CONQUER
• Decompose a problem into sub-problems
• Solve the sub-problems
• Assemble the solutions of the subproblems into a solution for the overall
problem
CHANGE OF REPRESENTATION
CHANGE OF REPRESENTATION
• Identify regularities
• Change the representation
• Solve the overall problem
ADF IMPLEMENTATION
• Each overall program in population includes
– a main result-producing branch (RPB) and
– function-defining branch (i.e., automatically defined
function, ADF)
• In generation 0, create random programs with
different ingredients for the RPB and the ADF
– Terminal set for ADF typically contains dummy
arguments (formal parameters), such as ARG0,
ARG1, …
– Function set of the RPB contains ADF0
– ADFs are private and associated with a particular
individual program in the population
ADF MUTATION
• Select parent probabilistically on the
basis of fitness
• Pick a mutation point from either RPB or
an ADF
• Delete sub-tree rooted at the picked point
• Grow a new sub-tree at the picked point
composed of the allowable ingredients
appropriate for the picked point
• The offspring is a syntactically valid
executable program
ADF CROSSOVER
• Select parent probabilistically on the basis of
fitness
• Pick a crossover point from either RPB or an
ADF of the FIRST patent
• The choice of crossover point in the SECOND
parent is RESTRICTED to the picked RPB or
to the picked ADF
• The sub-trees are swapped
• The offspring are syntactically valid executable
programs
GENETIC PROGRAMMING II: AUTOMATIC
DISCOVERY OF REUSABLE PROGRAMS
(Koza 1994)
MAIN POINTS OF 1994 BOOK
• Scalability is essential for solving non-trivial
problems in artificial intelligence, machine
learning, adaptive systems, and automated
learning
• Scalability can be achieved by reuse
• Genetic programming provides a way to
automatically discover and reuse subprograms
in the course of automatically creating computer
programs to solve problems
COMPUTER PROGRAMS
• Subroutines provide one way to REUSE code  possibly
with different instantiations of the dummy variables
(formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
executing code
MEMORY
Settable
(named)
Indexed
vector
variables memory
Matrix
memory
Relational memory
LANGDON'S DATA STRUCTURES
•
•
•
•
Stacks
Queues
Lists
Rings
COMPUTER PROGRAMS
• Subroutines provide one way to REUSE code  possibly
with different instantiations of the dummy variables
(formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
executing code
AUTOMATICALLY DEFINED
ITERATION (ADI)
• The overall program includes an iterationperforming branch (IPB) in addition to a
result-producing branch (RPB) and functiondefining branches (ADF)
• There are no infinite loops because the iteration
is performed over a known, fixed set
– protein or DNA sequence (of varying length)
– time-series data
– two-dimensional array of pixels
• Memory is usually involved and is used to
communicate between IPB, RPB, and ADF
TRANSMEMBRANE SEGMENT
IDENTIFICATION PROBLEM
Goal is to classify a given protein segment
as being a transmembrane domain or
non-transmembrane area of the protein
TRANSMEMBRANE SEGMENT
IDENTIFICATION PROBLEM
(progn
(defun ADF0 ()
(ORN (ORN (ORN (I?) (H?)) (ORN (P?) (G?))) (ORN (ORN (ORN (Y?)
(N?)) (ORN (T?) (Q?))) (ORN (A?) (H?))))))
(defun ADF1 ()
(values (ORN (ORN (ORN (A?) (I?)) (ORN (L?) (W?))) (ORN (ORN
(T?) (L?)) (ORN (T?) (W?))))))
(defun ADF2 ()
(values (ORN (ORN (ORN (ORN (ORN (D?) (E?)) (ORN (ORN (ORN
(D?) (E?)) (ORN (ORN (T?) (W?)) (ORN (Q?) (D?)))) (ORN (K?)
(P?)))) (ORN (K?) (P?))) (ORN (T?) (W?))) (ORN (ORN (E?)
(A?)) (ORN (N?) (R?))))))
(progn (loop-over-residues
(SETM0 (+ (- (ADF1) (ADF2)) (SETM3 M0))))
(values (% (% M3 M0) (% (% (% (- L -0.53) (* M0 M0)) (+ (% (%
M3 M0) (% (+ M0 M3) (% M1 M2))) M2)) (% M3 M0))))))
TRANSMEMBRANE SEGMENT
IDENTIFICATION PROBLEM
• in-sample correlation of 0.976
• out-of-sample correlation of 0.968
• out-of-sample error rate 1.6%
AUTOMATICALLY DEFINED
LOOP (ADL)
•
•
•
•
loop initialization branch, LIB
loop condition branch, LCB
loop body branch, LBB
loop update branch, LUB
ADL
COMPUTER PROGRAMS
• Subroutines provide one way to REUSE code  possibly
with different instantiations of the dummy variables
(formal parameters)
• Loops (and iterations) provide a 2nd way to REUSE code
• Recursion provide a 3rd way to REUSE code
• Memory provides a 4th way to REUSE the results of
executing code
AUTOMATICALLY DEFINED
RECURSION (ADR)
•
•
•
•
recursion condition branch, RCB
recursion body branch, RBB
recursion update branch, RUB
recursion ground branch, RGB
ADR
HUMAN-COMPETITIVE RESULTS
(NOT RELATED TO PATENTS)
Transmembrane segment identification problem for proteins
Motifs for D–E–A–D box family and manganese superoxide dismutase family of proteins
Cellular automata rule for Gacs-Kurdyumov-Levin (GKL) problem
Quantum algorithm for the Deutsch-Jozsa “early promise” problem
Quantum algorithm for Grover’s database search problem
Quantum algorithm for the depth-two AND/OR query problem
Quantum algorithm for the depth-one OR query problem
Protocol for communicating information through a quantum gate
Quantum dense coding
Soccer-playing program that won its first two games in the 1997 Robo Cup competition
Soccer-playing program that ranked in the middle of field in 1998 Robo Cup competition
Antenna designed by NASA for use on spacecraft
Sallen-Key filter
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP
•
•
•
•
•
Toy problems
Human-competitive non-patent results
20th-century patented inventions
21st-century patented inventions
Patentable new inventions
GENETIC PROGRAMMING III: DARWINIAN
INVENTION AND PROBLEM SOLVING
(Koza, Bennett, Andre, Keane 1999)
SUBROUTINE DUPLICATION
SUBROUTINE CREATION
SUBROUTINE DELETION
ARGUMENT DUPLICATION
ARGUMENT DELETION
16 ATTRIBUTES OF A SYSTEM FOR
AUTOMATICALLY CREATING COMPUTER
PROGRAMS
• Starts with "What needs
to be done"
• Tells us "How to do it"
• Produces a computer
program
• Automatic determination
of program size
• Code reuse
• Parameterized reuse
• Internal storage
• Iterations, loops, and
recursions
• Self-organization of
hierarchies
• Automatic determination of
program architecture
• Wide range of
programming constructs
• Well-defined
• Problem-independent
• Wide applicability
• Scalable
• Competitive with humanproduced results
GENETIC PROGRAMMING PROBLEM
SOLVER (GPPS)
AUTOMATIC SYNTHESIS OF
BOTH THE TOPOLOGY AND
SIZING OF ANALOG ELECTRICAL
CIRCUITS BY MEANS OF
DEVELOPMENTAL GENETIC
PROGRAMMING
AUTOMATED CIRCUIT SYNTHESIS
• The topology of a circuit includes specifying the
gross number of components in the circuit, the
type of each component (e.g., a capacitor), and
a netlist specifying where each lead of each
component is to be connected.
• Sizing involves specifying the values (typically
numerical) of each of the circuit's components.
COMPONENT-CREATING FUNCTIONS
•
•
•
•
•
Resistor R function
Capacitor C function
Inductor L function
Diode D function
Transistor Q function (3-leaded)
COMPONENT-CREATING FUNCTIONS
TOPOLOGY-MODIFYING FUNCTIONS
• SERIES division
• PARALLEL division
• VIA
• FLIP
TOPOLOGY-MODIFYING FUNCTIONS
DEVELOPMENT-CONTROLLING
FUNCTIONS
• END function
• NOP (No Operation) function
• SAFE_CUT function
THE INITIAL CIRCUIT
DEVELOPMENTAL GP
(LIST (C (– 0.963 (– (– -0.875 -0.113)
0.880)) (series (flip end) (series
(flip end) (L -0.277 end) end) (L (– 0.640 0.749) (L -0.123 end)))) (flip
(nop (L -0.657 end)))))
CAPACITOR-CREATING FUNCTION
(LIST (C (– 0.963 (– (– -0.875
-0.113) 0.880)) (series (flip
end) (series (flip end) (L –
0.277 end) end) (L (– -0.640
0.749) (L -0.123 end)))) (flip
(nop (L -0.657 end)))))
CAPACITOR-CREATING FUNCTION
SERIES DIVISION FUNCTION
(LIST (C (– 0.963 (– (– -0.875
-0.113) 0.880)) (series (flip
end) (series (flip end) (L –
0.277 end) end) (L (– -0.640
0.749) (L -0.123 end)))) (flip
(nop (L -0.657 end)))))
SERIES DIVISION
DEVELOPMENTAL GP
EVALUATION OF FITNESS
+
IN
z0
OUT
Embryonic Circuit
Program Tree
Full y D esi gned C i rc ui t (N etG rap h)
C ircuit N etl i st (asci i)
C ircuit Si mul ator (SPI C E )
C ircuit B ehav i or (O utp ut)
Fi tness
DESIRED BEHAVIOR OF A LOWPASS
FILTER
EVOLVED CAMPBELL FILTER
U. S. patent 1,227,113
George Campbell
American Telephone and Telegraph
1917
EVOLVED ZOBEL FILTER
U. S. patent 1,538,964
Otto Zobel
American Telephone and Telegraph Company
1925
EVOLVED SALLEN-KEY FILTER
EVOLVED DARLINGTON EMITTERFOLLOWER SECTION
U. S. patent 2,663,806
Sidney Darlington
Bell Telephone Laboratories
1953
NEGATIVE FEEDBACK
HAROLD BLACK’S RIDE ON THE
LACKAWANNA FERRY
Courtesy of Lucent Technologies
20th-CENTURY PATENTS
Campbell ladder topology for filters
Zobel “M-derived half section” and “constant K” filter sections
Crossover filter
Negative feedback
Cauer (elliptic) topology for filters
PID and PID-D2 controllers
Darlington emitter-follower section and voltage gain stage
Sorting network for seven items using only 16 steps
60 and 96 decibel amplifiers
Analog computational circuits
Real-time analog circuit for time-optimal robot control
Electronic thermometer
Voltage reference circuit
Philbrick circuit
NAND circuit
Simultaneous synthesis of topology, sizing, placement, and routing
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP
•
•
•
•
•
Toy problems
Human-competitive non-patent results
20th-century patented inventions
21st-century patented inventions
Patentable new inventions
SIX POST-2000 PATENTED
INVENTIONS
EVOLVED HIGH CURRENT LOAD CIRCUIT
REGISTER-CONTROLLED CAPACITOR
CIRCUIT
LOW-VOLTAGE CUBIC CIRCUIT
VOLTAGE-CURRENT-CONVERSION
CIRCUIT
LOW-VOLTAGE BALUN CIRCUIT
TUNABLE INTEGRATED ACTIVE FILTER
21st-CENTURY PATENTED INVENTIONS
Low-voltage balun circuit
Mixed analog-digital variable capacitor circuit
High-current load circuit
Voltage-current conversion circuit
Cubic function generator
Tunable integrated active filter
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP
•
•
•
•
•
Toy problems
Human-competitive non-patent results
20th-century patented inventions
21st-century patented inventions
Patentable new inventions
NOVELTY-DRIVEN EVOLUTION
• Two factors in fitness measure
– Circuit’s behavior in the frequency domain
– Largest number of nodes and edges (circuit
components) of a subgraph of the given circuit that
is isomorphic to a subgraph of a template
representing the prior art. Graph isomorphism
algorithm with the cost function being based on the
number of shared nodes and edges (instead of just
the number of nodes).
NOVELTY-DRIVEN EVOLUTION
• For circuits not scoring the maximum
number of hits (101), the fitness of a
circuit is the product of the two factors.
• For circuits scoring 101 hits (100%compliant individuals), fitness is the
number of shared nodes and edges
divided by 10,000.
PRIOR ART TEMPLATE
NON-INFRINGING SOLUTION NO. 1
NON-INFRINGING SOLUTION NO. 5
GP AS AN INVENTION MACHINE
CIRCUIT-CONSTRUCTING PROGRAM TREE
WITH ADFs
LOWPASS FILTER WITH ADFs
ADF0
AUTOMATIC SYNTHESIS OF
CIRCUIT LAYOUT
INCLUDING THE PLACEMENT OF
COMPONENTS AND ROUTING OF
WIRES
ALONG WITH THE TOPOLOGY
AND SIZING
CIRCUIT LAYOUT
• Circuit placement involves the assignment of
each of the circuit's components to a particular
physical location on a printed circuit board or
silicon wafer.
• Routing involves the assignment of a particular
physical location to the wires between the leads
of the circuit's components.
LAYOUT
LAYOUT — GENERATION 0
100%-COMPLIANT LOWPASS FILTER
GENERATION 25 WITH 5 CAPACITORS AND
11 INDUCTORS  AREA OF 1775.2
100%-COMPLIANT LOWPASS FILTER
GENERATION 30 WITH 10 INDUCTORS AND
5 CAPACITORS  AREA OF 950.3
100%-COMPLIANT LOWPASS FILTER
BEST-OF-RUN CIRCUIT OF GENERATION
138 WITH 4 INDUCTORS AND 4
CAPACITORS  AREA OF 359.4
LAYOUT  60 DB AMPLIFIER
AUTOMATIC SYNTHESIS
OF BOTH THE TOPOLOGY
AND TUNING OF
CONTROLLERS
PROGRAM TREE FOR A
CONTROLLER
CONTROLLER BLOCKS
•
•
•
•
•
•
•
gain
integrator
differentiator
adder
subtractor
multiplier
differential input
integrators
• inverter
•
•
•
•
•
•
•
•
lead
lag
two-parameter lag
absolute value
limiter
divider
delay
conditional operators
(switches)
FUNCTION SET FOR
CONTROLLER SYNTHESIS
F = {GAIN,INVERTER,LEAD,LAG,LAG2,
DIFFERENTIAL_INPUT_INTEGRATOR,
DIFFERENTIATOR, ADD_SIGNAL,
SUB_SIGNAL,ADD_3_SIGNAL,ADF0,
ADF1,ADF2,ADF3,ADF4}
TERMINAL SET FOR
CONTROLLER SYNTHESIS
T = { REFERENCE_SIGNAL,
CONTROLLER_OUTPUT,
PLANT_OUTPUT}
CONSTRAINED SYNTACTIC
STRUCTURE
• A grammar that specifies what functions
and terminals are allowed as arguments
to particular functions
• For example, the first argument of the
GAIN function must be a value-setting
subtree whereas the second can be from
the general pool of functions
• Also known as strong typing
TWO-LAG PLANT
8 FITNESS CASES
• 8 elements of the fitness
represent 2  2  2 choices:
measure
– 2 different values of the plant's internal gain,
K (1.0 and 2.0),
– 2 different values of the plant's time constant
 (0.5 and 1.0),
– 2 different values for the height of the
reference signal (rising from 0 to 1 volts or
from 0 to 1 microvolts at t = 100 milliseconds
FITNESS MEASURE
• For each of these 8 fitness cases, a transient analysis
(time domain) is performed using the SPICE
simulator.
• The contribution to fitness for the 8 elements is
– Integral of time-weighted absolute error (ITAE)
– e(t) is difference between plant output and reference signal.
– Multiplication by B (106 or 1) makes both reference signals equally
influential.
– Additional weighting function, A, heavily penalizes non-compliant
amounts of overshoot. A weights all variations up to 2% above the
reference signal by 1.0, but bigger variations by 10.0.
EVOLVED CONTROLLER FOR TWO-LAG
PLANT
LESS ITAE AND OVERSHOOT
BETTER DISTURBANCE REJECTION
REVERSE ENGINEERING OF METABOLIC
PATHWAYS
EVOLVED PATHWAY
ANTENNA SYNTHESIS USING GP
(PROGN3
(TURN-RIGHT 0.125)
(LANDMARK
(REPEAT 2
(PROGN2
(DRAW 1.0 HALF-MM-WIRE)
(DRAW 0.5 NO-WIRE)))
(TRANSLATE-RIGHT 0.125 0.75))
USING A TURTLE TO DRAW TWODIMENSIONAL ANTENNA
BEST-OF-RUN ANTENNA FROM
GENERATION 90
3-DIMENSIONAL ANTENNA
NASA EVOLVED ANTENNA
To be on satellite to be launched in 2004
OTHER STRUCTURES
GENETIC NETWORK FOR lac
operon
EVOLVED NETWORK
(IF (< LACTOSE_LEVEL 9.139 ) (IF (<
REPRESSOR_LEVEL 6.270 ) (IF (> GLUCOSE_LEVEL
5.491 ) 2.02 (IF (< CAP_LEVEL 0.639 ) 2.033 (IF
(< CAP_LEVEL 4.858 ) (IF (> LACTOSE_LEVEL 2.511 )
(IF (> CAP_LEVEL 7.807 ) 5.586 (IF (>
LACTOSE_LEVEL 2.114 ) 1.978 2.137 ) ) 0.0 ) (IF
(> REPRESSOR_LEVEL 4.015 ) 0.036 (IF (<
GLUCOSE_LEVEL 5.128 ) 10.0 (IF (< REPRESSOR_LEVEL
4.268 ) 2.022 9.122 ) ) ) ) ) ) (IF (> CAP_LEVEL
0.842 ) 0.0 5.97 ) ) (IF (< CAP_LEVEL 1.769 )
2.022 (IF (< GLUCOSE_LEVEL 2.382 ) (IF (>
LACTOSE_LEVEL 1.256 ) (IF (> LACTOSE_LEVEL 1.933
) (IF (> GLUCOSE_LEVEL 2.022 ) (IF (<
GLUCOSE_LEVEL 5.183 ) 6.323 (IF (> CAP_LEVEL
1.208 ) 9.713 0.842 ) ) 10.0 ) (IF (>
GLUCOSE_LEVEL 6.270 ) 2.109 ) 1.965 ) ) 0.665 )
1.982 ) ) )
IN C-STYLE PSEUDO CODE
if(LACTOSE_LEVEL < 9.139)
{
if(REPRESSOR_LEVEL <
6.270)
{
LAC_mRNA_LEVEL =
2.022;
}
else
{
LAC_mRNA_LEVEL =
0.0;
}
}
else
{
if(CAP_LEVEL < 1.769)
{
LAC_mRNA_LEVEL =
2.022;
}
else
{
if(GLUCOSE_LEVEL <
2.382)
{
LAC_mRNA_LEVEL
= 10.0;
}
else
{
LAC_mRNA_LEVEL
= 1.982;
} } }
PARAMETERIZED TOPOLOGIES
• One of the most important characteristics
of computer programs is that they
ordinarily contain inputs (free variables)
and conditional operations
PARAMETERIZED TOPOLOGY FOR
LOWPASS FILTER
PARAMETERIZED TOPOLOGY FOR
HIGHPASS FILTER
PARAMETERIZED TOPOLOGY FOR
GENERAL-PURPOSE CONTROLLER
EVOLVED EQUATIONS FOR GENERALPURPOSE CONTROLLER
EVOLVED EQUATIONS FOR GENERALPURPOSE CONTROLLER
PATENTABLE NEW INVENTIONS
PID tuning rules that outperform the Ziegler-Nichols and Åström-Hägglund tuning rules
General-purpose controllers outperforming Ziegler-Nichols and Åström-Hägglund rules
PROGRESSION OF QUALITATIVELY MORE
SUBSTANTIAL RESULTS PRODUCED BY GP
•
•
•
•
•
Toy problems
Human-competitive non-patent results
20th-century patented inventions
21st-century patented inventions
Patentable new inventions
PARALLELIZATION WITH SEMI-ISOLATED
SUBPOPULATIONS
GP PARALLELIZATION
• Like Hormel, Get Everything Out of the
Pig, Including the Oink
• Keep on Trucking
• It Takes a Licking and Keeps on Ticking
• The Whole is Greater than the Sum of
the Parts
PETA-OPS
• Human brain operates at 1012 neurons
operating at 103 per second = 1015 ops per
second
• 1015 ops = 1 peta-op = 1 bs (brain second)
EVOLVABLE HARDWARE
CORNER OF XILINX XC6216
FUNCTION UNIT FOR CELL OF
XILINIX XC6216
SORTING NETWORK
A
B
C
D
H
G
F
E
EVOLVED SORTING NETWORK
GP 1987–2002
System
Dates
Speed-up
over first
system
HumanProblem Category
competitive
results
Serial LISP
1987–1994
1 (base)
0
toy problems
64 transputers
1994–1997
9
2
human-competitive
results not related to
patented inventions
64 PowerPC’s
1995–2000
204
12
20th-century
patented inventions
70 Alpha’s
1999–2001
1,481
2
20th-century
patented inventions
1,000 Pentium II’s
2000–2002
13,900
12
21st-century
patented inventions
4-week runs on
1,000 Pentium II’s
2002-2003
130,000
2
patentable new
inventions
PROMISING GP APPLICATION AREAS
• Problem areas involving many variables
that are interrelated in highly non-linear
ways
• Inter-relationship of variables is not well
understood
• Discovery of the size and shape of the
solution is a major part of the problem
• "Black art" problems
PROMISING GP APPLICATION AREAS
(CONTINUED)
• Areas where you simply have no idea how
to program a solution, but where you
know what you want
PROMISING GP APPLICATION AREAS
(CONTINUED)
• Problem areas where a good approximate
solution is satisfactory
 design
 control
 bioinformatics
 classification
 data mining
 system identification
 forecasting
PROMISING GP APPLICATION AREAS
(CONTINUED)
 Areas where large computerized databases are
accumulating and computerized techniques are
needed to analyze the data
 genome, protein, microarray data
 satellite image data
 astronomical data
 petroleum databases
 financial databases
 medical records
 marketing databases
PROMISING GP APPLICATION AREAS
(CONTINUED)
 Areas for which humans find it very
difficult to write good programs
 parallel computers
 cellular automata
 multi-agent strategies
 field-programmable game arrays
 digital signal processors
 swarm intelligence
CHARACTERISTICS SUGGESTING
USE OF GP
(1) discovering the size and shape of the solution,
(2) reusing substructures,
(3) discovering the number of substructures,
(4) discovering the nature of the hierarchical references among
substructures,
(5) passing parameters to a substructure,
(6) discovering the type of substructures (e.g., subroutines, iterations,
loops, recursions, or storage),
(7) discovering the number of arguments possessed by a substructure,
(8) maintaining syntactic validity and locality by means of a
developmental process, or
(9) discovering a general solution in the form of a parameterized
topology containing free variables
DESIGNING A GIRAFFE
•
•
•
•
•
•
Long neck
Long tongue
Vegetable-digesting enzymes in stomach
4 legs
Long legs
Brown coloration
THE DESIGN OF A GOOD GIRAFE
Neck
length
Tongue
length
Carnivorous? Number Leg
of legs length
15.11
feet
14 inches No
4
9.96 feet Brown
Floating
point
Floating
point
Integer
Floating Categorical
point
Boolean
Coloration
NON-LINEARITY — GIRAFE
• Taken one-by-one, some gene values found in a
giraffe, such as the long neck contribute (alone)
negatively to fitness
– requires considerable material to construct
– requires considerable energy to maintain
– prone to injury (thereby hurting rate of survival and
reproduction)
• Thus, maximizing any one variable will not
lead to the global optimum solution
NON-LINEARITY (CONTINUED)
• When the variables are taken in pairs
(there are 15 possible pairs), many
combinations of pairs (e.g., Long neck
and long tongue) are doubly detrimental
NON-LINEARITY (CONTINUED)
• But, certain combinations of traits, when taken
together, are "co-adapted sets of alleles" that
yield a very fit animal for eating high acacia
leaves in the jungle environment, having good
camouflage, having high escape velocity when
faced with predators, and exploiting a niche
(and avoiding competition) with other animals
feeding on low-hanging vegetation
SEARCH METHODS IN GENERAL
•
•
•
•
•
Initial structure(s)
Fitness measure
Operations for creating new structures
Parameters
Termination criterion and method of
designating the result
SPACE WITH MANY LOCAL OPTIMA
SEARCH METHODS
• Blind random search does not use acquired
information in deciding on the future direction
of the search
• Hill combing and gradient descent use acquired
information; however, they are prone to
becoming trapped on local optima
• The previous point is especially true for nontrivial search spaces
7 DIFFERENCES BETWEEN GP
AND ARTIFICIAL INTELLIGENCE
AND MACHINE LEARNING
APPROACHES
REPRESENTATION
Genetic programming overtly conducts it
search for a solution to the given problem
in program space
ROLE OF POINT-TO-POINT
TRANSFORMATIONS IN THE
SEARCH
Genetic programming does not conduct its
search by transforming a single point in the
search space into another single point, but
instead transforms a set of points into
another set of points
ROLE OF HILL CLIMBING IN THE
SEARCH
Genetic programming does not rely
exclusively on greedy hill climbing to
conduct its search, but instead allocates a
certain number of trials, in a principled
way, to choices that are known to be
inferior
DETERMINISM IN THE SEARCH
Genetic programming conducts its search
probabilistically
ROLE OF AN EXPLICIT
KNOWLEDGE BASE
Genetic programming does NOT make use
of a knowledge base
ROLE OF FORMAL LOGIC IN THE
SEARCH
Genetic programming does not utilize
formal logic in it’s search strategy.
Contradictory alternatives are created
and actively maintained.
UNDERPINNINGS OF THE
TECHNIQUE
Biologically inspired
TURING (1948)
Turing made the connection between
searches and the challenge of getting a
computer to solve a problem without
explicitly programming it in his 1948 essay
“Intelligent Machines”
"Further research into intelligence of
machinery will probably be very greatly
concerned with 'searches' ... “
TURING’S 3 APPROACHES TO
MACHINE INTELLIGENCE (1948)
LOGIC-BASED SEARCH
One approach that Turing identified is a
search through the space of integers
representing candidate computer
programs.
TURING’S 3 APPROACHES
(CONTINUED)
CULTURAL SEARCH
A second approach is the "cultural search“
which relies on knowledge and expertise
acquired over a period of years from
others (akin to present-day knowledgebased systems).
TURING’S 3 APPROACHES
(CONTINUED)
GENETICAL OR EVOLUTIONARY
SEARCH
"There is the genetical or evolutionary
search by which a combination of genes is
looked for, the criterion being the survival
value.“
TURING (1950)
From Turing’s 1950 paper "Computing
Machinery and Intelligence" …
“We cannot expect to find a good childmachine at the first attempt. One must
experiment with teaching one such machine
and see how well it learns. One can then try
another and see if it is better or worse.
There is an obvious connection between this
process and evolution, by the identifications”
TURING (1950) (CONTINUED)
“Structure of the child machine =
Hereditary material
“Changes of the child machine =
Mutations
“Natural selection =
Judgment of the experimenter”