Slides - Centre for Policy Modelling

Download Report

Transcript Slides - Centre for Policy Modelling

Using Localised ‘Gossip’ to
Structure Distributed Learning
Bruce Edmonds
Centre for Policy Modelling
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-1
The Problem
• For many problems/situations universal
solutions are unreachable
…in such situations one has to seek partial
solutions (i.e. solutions that are
valid/effective only in a subdomain).
• Sometimes the relevant subdomains seem
obvious (e.g. biology vs. physics)
…but in many other situations the best way to
subdivide a situation also needs to be
discovered (entangled with solution types).
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-2
Fitting data globally and piecewise
Graphs of
piecewise
models
Data points
Graph of global candidate
model
Problem Domain
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-3
Solution source
• Both ecology and human society inhabit
situations where universal solutions are not
reachable
• Even closely related species are successful
in different regions and niches
• Human techniques for dealing with the
environment have spread over the areas
where these techniques work
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-4
Cavalli-Sforza, Menozzi, and Piazza
1994 p. 257 – Cultural Diffusion
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-5
Beef Cows in the USA 2002
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-6
Milk Cows in the USA 2002
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-7
Change in the use of irrigation in USA
1997-2002
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-8
Different ranges of different species
Striped
grasshopper
Greenstriped
grasshopper
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-9
Distribution of terms for soft drinks in
the USA – Matthew Campbell’s map
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-10
Only occasionally do global parasites
arise…
…like homo sapiens!
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-11
An Illustration of the Basic Algorithm
D
3.7
2.1
p
0.9
2.2
Some Space of Characteristics
(Learning Domain & Content)
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-12
The algorithm outline (generic version)
Initialise space with a random set of genes
Repeat
ForEach gene from 1 to popSize
Randomly select a locality
randomly select from locality
a set of sample genes
evaluate set in the locality
chose two best from set
if randomNum < probCrossover
then cross two best -> newInd
else best -> newInd
Next gene
New population composed of newInds
Until finished
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-13
Two phases of this approach
• When species successfully propagate over
regions they tend to “speciate” into many varieties
• Information learnt is spread over the population
not in a single best individual
• Thus if you want to understand the results it is
helpful to add an “analysis” phase
…which does a sort of “cluster analysis” of the
locally best solutions in the population
• I do this by: turning off variation; allowing only one
solution per location; and massive but strictly local
propagation to nearby locations (in this 2nd phase)
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-14
An application to the Cleveland
Heart Disease Data Set
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-15
Cleveland Heart Disease Data Set – the
processed sub-set used
In processed sub-set:
• 281 entries
• 14 attributes numeric or numerically coded
• Attribute 14 is the outcome (0, 1, 2, 3, 4)
• Some attributes: 1 - age, 2 - sex, 4 - resting
blood pressure (trestpbs), 5 - cholesterol
(chol)
• Available at the repository of Machine
Learning
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-16
Why this particular data set?
• It is fairly large
• It is quite complex
• I know hardly anything about the causes of
heart disease
• Its accessible
• ML techniques so far have not found a very
high performing global solution
• It seemed a vaguely useful thing to do
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-17
The Solution Form
• Solutions are a set of 5 numeric functions (one for
each outcome), each coded as tree expressions
– E.g. Outcome 0 has weight calculated by: [TIMES [MIN
[CONST -0.6] [INPUT 8]] [SAFEDIVIDE [INPUT 1]
[CONST 0.5]]]
– Which simplifies to: 2 * V8 * V1
– Each of 5 functions evaluated (given 13 inputs)
– Function with highest value gives prediction
• Functions: MIN, MAX, IGZ, TIMES, MINUS,
PLUS, SAFEDIVIDE
• Leaves: inputs 1,2,…,13 and constants -1, -.9,.., 1
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-18
The space of characteristics
• Is essentially the 281 points in the data set
…with the distance structure determined by
the cartesian distance within the chosen
space of characteristics
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-19
The 3 sets of runs (12 runs each)
• Global: a standard GP approach, evaluation
against 10% random sample, population of
281, 90% crossover
• Local: set of solutions evaluated at a point
in the space, taken from point plus some
from neighbouring localities, population
800, 20% crossover
– Local (1, 2): space defined by age and sex
– Local (4, 5): space defined by restbps and chol
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-20
Measuring the success
• Cost of each approach measured in terms
of the number of evaluations of a solution at
a point in the space, since this dominates
the computational time
• Effective error is:
Global runs: the average error (over all points) of
the best solution in the population
Local runs: the average of the error of set of the
best solution at each point evaluated at that
point
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-21
Comparison of global and local runs
50%
Global
Effective Error
40%
30%
Local (1, 2)
20%
10%
Local (4, 5)
0%
10000
100000
1000000
Evaluations
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-22
10000000
Error and Spread in Local(1, 2)
30%
6
Analysis Phase
25%
5
Error
20%
4
15%
3
10%
2
Spread
5%
1
0%
0
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-23
500000
450000
400000
350000
300000
250000
200000
150000
100000
50000
0
Evaluations
Average Gene Spread
Average Locally Best Error
Development Phase
Error and Spread in Local(4, 5)
30%
6
Analysis Phase
25%
5
20%
4
15%
3
10%
2
Spread
5%
1
Error
0%
0
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-24
500000
450000
400000
350000
300000
250000
200000
150000
100000
50000
0
Evaluations
Average Gene Spread
Average Locally Best Error
Development Phase
Male
Both
Female
Spread of solutions using items 1&2
Age
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-25
cholesterol
Spread of solutions using items 4&5
resting blood pressure
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-26
Related Work
• Local Regression (or the slightly more
general locally weighted learning)
• Clustering techniques
• ‘Demes’ in GP
• Evolving parts of a problem separately
(DCCGA etc.)
• Decision tree induction (e.g. C4.5)
• Ecological models
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-27
Conclusion
• Memetic/ecological processes that combine
local propagation and solution development
can find and exploit niches in complex
problems
…but this does not lead to neat global
solutions (in cases I have tried)
…and can be sensitive to the selection of the
space over which propagation occurs
(although am investigating systems where
this is also discovered, so wish me luck!)
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-28
The End
Bruce Edmonds
bruce.edmonds.name
Centre for Policy Modelling
cfpm.org
Using localised gossip to structure distributed learning, Bruce Edmonds, SIC@AISB, Univ. of Herts., April 2005, http://cfpm.org/~bruce slide-29