Estimation of Distribution Algorithm and Genetic Programming
Structure Complexity Lab, Seoul National University
KIM KANGIL
Index
What is EDA?
EDA with GP
Grammar Model-based Program Evolution
How it works
Conclusion
Conventional GA/GP vs. EDA
[Diagram: the conventional GA/GP cycle compared with EDA, which maintains a stochastic model instead of applying variation operators directly]
What is EDA?
Generates a stochastic model of the components used to construct individuals
Keeps a probability distribution over these components
Follows a cycle and strategy similar to traditional evolutionary computation
No crossover and mutation operators
[Diagram: the EDA process. The Model (component type 1, component type 2, ...) generates a Population; Selection picks the Best Individuals, which in turn update the Model.]
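Before moving to the GP case, a concrete sketch of this cycle may help. The following is a minimal EDA over fixed-length bit strings with a univariate marginal model, in the style of PBIL; the fitness function, parameter values, and names are illustrative and not taken from the slides.

```python
import random

def one_max(bits):                          # toy fitness: number of 1 bits
    return sum(bits)

def eda(n_bits=20, pop_size=50, n_best=10, lr=0.3, generations=50):
    probs = [0.5] * n_bits                  # model: P(bit i = 1)
    for _ in range(generations):
        # generate a population from the model (no crossover or mutation)
        pop = [[1 if random.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        # selection: keep the best individuals
        best = sorted(pop, key=one_max, reverse=True)[:n_best]
        # update the model toward the distribution of the best individuals
        for i in range(n_bits):
            freq = sum(ind[i] for ind in best) / n_best
            probs[i] = (1 - lr) * probs[i] + lr * freq
    return probs

print(eda())    # each probability drifts toward 1.0 on this problem
```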
EDA with GP
Uses a tree representation for individuals
Requires a more complex and structured model
How do we obtain a distribution over such a complex structure?
Dependencies must be maintained to preserve structure
How do we keep meaningful building blocks?
A grammar is well suited to explaining this internal structure
Grammar Model-based Program Evolution (GMPE)
Uses a stochastic context-free grammar (SCFG) as the model, e.g.:
A -> B 0.5
A -> C 0.5
Tree representation for each individual
No fixed set of components over which to record probabilities
Probability learning combined with structural learning at every generation
Uses Minimum Message Length (MML) to bound generalization
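The model is just a set of weighted productions. The sketch below shows how such an SCFG can be sampled to generate an individual, assuming the grammar is stored as a dict mapping each nonterminal to (right-hand side, count) pairs; the layout and names are illustrative, not GMPE's actual data structures. The grammar used here is the one built on the next slide.

```python
import random

# Converted grammar from the upcoming example: counts double as weights.
grammar = {
    "S":  [(("N1",), 1), (("N2",), 1)],
    "N1": [(("Y",), 1)],
    "N2": [(("N3", "N4", "N5"), 1)],
    "N3": [(("X",), 1)],
    "N4": [(("+",), 1)],
    "N5": [(("X",), 1)],
}

def sample(symbol):
    """Expand a symbol, picking productions proportionally to their counts."""
    if symbol not in grammar:               # terminal: emit it as-is
        return [symbol]
    rhs_options = grammar[symbol]
    rhs = random.choices([r for r, _ in rhs_options],
                         weights=[c for _, c in rhs_options])[0]
    out = []
    for s in rhs:
        out.extend(sample(s))
    return out

print(" ".join(sample("S")))                # prints "Y" or "X + X"
```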
Model Implementation
[Diagram: Best Individuals, via Converting, become an SCFG that generates exactly the same language as the best individuals; Merging then yields a grammar with an expanded language. The example's best individuals are the parse trees for "Y" (Exp -> Y) and "X + X" (Exp -> Exp Op Exp); the root of the "Y" tree becomes N1, the root of the "X + X" tree becomes N2, and the start productions are S -> N1 and S -> N2.]
Give a unique ID to each node
Each parent-child relation becomes a production
The model is constructed as a stochastic CFG
Give a count of 1 to each created production
The resulting grammar with counts:
S  -> N1          1
S  -> N2          1
N1 -> Y           1
N2 -> N3 N4 N5    1
N3 -> X           1
N4 -> +           1
N5 -> X           1
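A minimal sketch of this converting step follows, assuming each best individual is given as a nested tuple (label, child, ...) with terminal strings at the leaves; the representation and names are my own illustration of the slide's example, not GMPE's actual code.

```python
from itertools import count

def convert(best_individuals):
    ids = count(1)
    grammar = []                     # productions as [lhs, rhs_tuple, count]
    original_type = {}               # nonterminal -> original node label

    def walk(node):
        nt = "N%d" % next(ids)                   # unique ID for this node
        original_type[nt] = node[0]
        # each parent-child relation becomes one production with count 1
        rhs = tuple(walk(c) if isinstance(c, tuple) else c for c in node[1:])
        grammar.append([nt, rhs, 1])
        return nt

    for ind in best_individuals:
        grammar.append(["S", (walk(ind),), 1])   # one start production per tree
    return grammar, original_type

g, types = convert([("Exp", "Y"),                                     # "Y"
                    ("Exp", ("Exp", "X"), ("Op", "+"), ("Exp", "X"))])  # "X + X"
for lhs, rhs, c in g:
    print(lhs, "->", " ".join(rhs), c)
```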
Converting - detail
Giving a count of 1 to each production makes it easy to calculate probabilities over the total structure
Even after some nonterminals are merged, we still know how many times each production was used in the best individuals
This grammar can generate only the best individuals selected in the last generation
Without a generalization step, it cannot search the solution space
Merging
Select two nonterminals and merge them
Every production using these nonterminals is rewritten to use the new nonterminal
If some productions become equivalent, compact them into one production, summing their counts
(See the worked example and the code sketch that follow.)
SCFG before merging:
S  -> N1          1
S  -> N2          1
N1 -> Y           1
N2 -> N3 N4 N5    1
N3 -> X           1
N4 -> +           1
N5 -> X           1

Merge (N1, N2) to N6:
S  -> N6          1
S  -> N6          1
N6 -> Y           1
N6 -> N3 N4 N5    1
N3 -> X           1
N4 -> +           1
N5 -> X           1

Simplify (sum the counts of equivalent productions):
S  -> N6          2
N6 -> Y           1
N6 -> N3 N4 N5    1
N3 -> X           1
N4 -> +           1
N5 -> X           1
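The merge-and-simplify step above can be sketched as follows, reusing the [lhs, rhs, count] production list from the converting sketch; it assumes the caller has already checked that the two nonterminals share the same original type (see the next slide).

```python
def merge(grammar, a, b, new):
    """Merge nonterminals a and b into `new`, then compact duplicates."""
    counts = {}
    for lhs, rhs, c in grammar:
        lhs = new if lhs in (a, b) else lhs                  # rename on the left
        rhs = tuple(new if s in (a, b) else s for s in rhs)  # ... and on the right
        # simplify: productions that became equivalent sum their counts
        counts[(lhs, rhs)] = counts.get((lhs, rhs), 0) + c
    return [[lhs, rhs, c] for (lhs, rhs), c in counts.items()]

g = [["S", ("N1",), 1], ["S", ("N2",), 1], ["N1", ("Y",), 1],
     ["N2", ("N3", "N4", "N5"), 1], ["N3", ("X",), 1],
     ["N4", ("+",), 1], ["N5", ("X",), 1]]
for lhs, rhs, c in merge(g, "N1", "N2", "N6"):
    print(lhs, "->", " ".join(rhs), c)       # S -> N6 now carries count 2
```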
Merging - detail
Selecting nonterminals:
To keep consistency with the original grammar, record the original type in each nonterminal
Select only nonterminals that have the same original type
How many nonterminals should be merged?
Use Minimum Message Length to bound the generalization
No crossover is needed: the merged model contains all possible combinations, each with a probability
Minimum Message Length
Similar to Minimum Description Length
Used to describe the model size; based on entropy
A lower MML means a simpler model
Merge the nonterminals that reduce this cost
This keeps the model neither too complicated nor too simple
Calculate the cost for all possible merges
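The slides do not give the exact cost formula, so the following is only a rough, entropy-based two-part cost in the spirit of MML: bits to state the grammar plus bits to encode the production choices given their probabilities. Every pricing decision here is an assumption, not GMPE's actual criterion.

```python
import math

def mml_cost(grammar):
    """Two-part cost: model description bits + data-given-model bits."""
    # grammar: list of [lhs, rhs_tuple, count]
    symbols = {lhs for lhs, _, _ in grammar} | \
              {s for _, rhs, _ in grammar for s in rhs}
    bits_per_symbol = math.log2(len(symbols))
    # model part: price each symbol occurrence in each production uniformly
    model_cost = sum((1 + len(rhs)) * bits_per_symbol for _, rhs, _ in grammar)
    # data part: each production use costs -log2 of its probability,
    # where probability = count / total count for that left-hand side
    lhs_totals = {}
    for lhs, _, c in grammar:
        lhs_totals[lhs] = lhs_totals.get(lhs, 0) + c
    data_cost = sum(c * -math.log2(c / lhs_totals[lhs])
                    for lhs, _, c in grammar)
    return model_cost + data_cost

g = [["S", ("N6",), 2], ["N6", ("Y",), 1], ["N6", ("N3", "N4", "N5"), 1],
     ["N3", ("X",), 1], ["N4", ("+",), 1], ["N5", ("X",), 1]]
print(mml_cost(g))   # a merge is accepted when it reduces this cost
```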
Current issues
How much should the model be generalized or specialized to make the evolutionary process more efficient?
How can we know which subpart of the model should be changed?
Considering the distribution and various measures that describe properties of the model (such as similarity)
Efficient learning by collecting the more important information
Conclusion
GMPE is one EDA-with-GP approach
Updates structure along with probabilities at every generation
Useful for collecting frequently used productions thanks to its flexible grammar
Easy to add constraints, as in GGGP, to reduce the search space
Good at extracting information from the best solution set
Can explain, within the model, which parts are important
Costly to maintain the model, and prone to over-generalization; MML is used to bound the generalization step