Genetic Algorithms

Machine Learning, Data Mining,
Genetic Algorithms, Neural Networks
ISYS370
Dr. R. Weber
Concept Learning is a Form of Inductive Learning

• Learner uses:
  – positive examples (instances that ARE examples of a concept) and
  – negative examples (instances that ARE NOT examples of a concept)
Concept Learning

• Needs empirical validation
• Whether the data are dense or sparse determines the quality of different methods
Validation of Concept Learning i

• The learned concept should be able to correctly classify new instances of the concept
  – When it succeeds on a real instance of the concept, it finds a true positive
  – When it fails on a real instance of the concept, it finds a false negative
Validation of Concept Learning ii

• The learned concept should be able to correctly classify new instances of the concept
  – When it succeeds on a counterexample, it finds a true negative
  – When it fails on a counterexample, it finds a false positive
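
To make the four outcomes concrete, here is a minimal sketch of how they are counted when validating a learned concept; the instances, labels, and learned-concept predicate are all hypothetical stand-ins:

def count_outcomes(instances, labels, predict):
    """Count TP/FN/TN/FP for a learned concept `predict`."""
    tp = fn = tn = fp = 0
    for instance, is_example in zip(instances, labels):
        if predict(instance):
            if is_example:
                tp += 1   # real instance, correctly accepted
            else:
                fp += 1   # counterexample, wrongly accepted
        else:
            if is_example:
                fn += 1   # real instance, wrongly rejected
            else:
                tn += 1   # counterexample, correctly rejected
    return tp, fn, tn, fp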
Rule Learning

• Rule learning is widely used in data mining
• Version Space Learning is a search method to learn rules
• Decision trees are another way to learn rules
Decision trees

• A knowledge representation formalism
• Represent mutually exclusive rules (a disjunction)
• A way of breaking up a data set into classes or categories
• Classification rules that determine, for each instance with attribute values, whether it belongs to one class or another
• Not incremental
Decision trees

• leaf nodes represent classes
• decision nodes represent tests on attribute values
• from decision nodes, branches grow for each possible outcome of the test

From Cawsey, 1997
Decision tree induction

• Goal is to correctly classify all example data
• Several algorithms induce decision trees: ID3 (Quinlan 1979), CLS, ACLS, ASSISTANT, IND, C4.5
• Constructs a decision tree from past data
• Attempts to find the simplest tree (not guaranteed, because it is based on heuristics)

ID3 algorithm

• Starts from:
  – a set of target classes
  – training data containing objects of more than one class
• ID3 uses tests to refine the training data set into subsets that contain objects of only one class each
• Choosing the right test is the key
How does ID3 choose tests?

• Information gain, or 'minimum entropy'
• Maximizing information gain corresponds to minimizing entropy
• It prefers predictive features (good indicators of the outcome)
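A sketch of these two quantities, using the standard formulation of Shannon entropy and information gain; the dict-based encoding of examples is an assumption for illustration, not ID3's original data structure:

import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(examples, labels, attribute):
    """Entropy reduction from splitting `examples` on `attribute`.
    `examples` is a list of dicts mapping attribute names to values."""
    total = len(labels)
    remainder = 0.0
    for v in {ex[attribute] for ex in examples}:
        subset = [lab for ex, lab in zip(examples, labels) if ex[attribute] == v]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

At each node, ID3 picks the attribute with the highest information gain as the next test.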
Example: loan data

     Monthly income   Job status   Repayment   Loan status
1    2,000            Salaried     200         Good
2    4,000            Salaried     600         Very bad
3    3,000            Waged        300         Very good
4    1,500            Salaried     400         Bad
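
For illustration only: in these four rows, the repayment amount alone separates the good outcomes from the bad ones, so a tree induced from this table could behave like the sketch below. The split points and attribute choices are assumptions, not the output of an actual ID3 run:

def classify_loan(monthly_income, job_status, repayment):
    # Illustrative tree consistent with the four rows above.
    if repayment <= 300:
        if job_status == "Waged":
            return "Very good"
        return "Good"          # salaried, low repayment
    else:
        if repayment >= 600:
            return "Very bad"
        return "Bad"           # repayment between 300 and 600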
Data mining tasks ii

• Link analysis
  – Rules: association generation
  – Relationships between entities
• Deviation detection
  – How things change over time, trends
KDD applications

• Fraud detection
  – Telecom (calling cards, cell phones)
  – Credit cards
  – Health insurance
• Loan approval
• Investment analysis
• Marketing and sales data analysis
  – Identify potential customers
  – Effectiveness of sales campaigns
  – Store layout
Text mining

The problem starts with a query, and the solution is a set of information (e.g., patterns, connections, profiles, trends) contained in several different texts that are potentially relevant to the initial query.
Text mining applications

• IBM Text Navigator
  – Clusters documents by content
  – Each document is annotated with the 2 most frequently used words in its cluster
• Concept Extraction (Los Alamos)
  – Text analysis of medical records
  – Uses a clustering approach based on a trigram representation
  – Documents are represented as vectors; cosine is used for comparison
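
A minimal sketch of the trigram-vector-plus-cosine idea; the character-level tokenization and the example sentences are assumptions, since the exact Los Alamos representation is not detailed here:

from math import sqrt
from collections import Counter

def trigrams(text):
    """Represent a document as a bag of character trigrams."""
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

doc1 = trigrams("patient presents with chest pain")
doc2 = trigrams("chest pain reported by the patient")
print(cosine(doc1, doc2))  # similarity in [0, 1]; higher = more alike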
Problem solving method     Reasoning type
rule-based ES              deductive reasoning
case-based reasoning       analogical reasoning
inductive ML, NN           inductive reasoning
search algorithms          search
Genetic Algorithms
(GA)
Genetic algorithms (i)

• learn by experimentation
• inspired by natural genetics: they originate new solutions from existing ones
• impose representational restrictions
• good for improving the quality of other methods, e.g., search algorithms, CBR
• a form of evolutionary algorithm (a broader class)
Genetic algorithms (ii)

• requires an evaluation function to guide the process
• a population of genomes represents possible solutions
• operations are applied over these genomes
• operations can be mutation and crossover
• operations produce new offspring
• an evaluation function tests how fit an offspring is
• the fittest will survive to mate again
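
A minimal GA sketch tying these pieces together. The genome encoding (bit strings) and the evaluation function (count of 1-bits, the classic "one-max" toy problem) are assumptions chosen only for illustration:

import random

GENOME_LEN, POP_SIZE, GENERATIONS = 20, 30, 50

def fitness(genome):                      # evaluation function
    return sum(genome)

def crossover(a, b):                      # single-point crossover
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]

def mutate(genome, rate=0.01):            # flip bits with small probability
    return [bit ^ 1 if random.random() < rate else bit for bit in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # the fittest survive to mate again: keep the fitter half as parents
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(POP_SIZE - len(parents))]
    population = parents + offspring

print(max(fitness(g) for g in population))

Selection here is simple truncation (keep the fitter half); roulette-wheel or tournament selection are common alternatives.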
Genetic algorithms: demos

• http://ai.bpa.arizona.edu/~mramsey/ga.html (you can change parameters)
• http://www.rennard.org/alife/english/gavgb.html (presented by Steven Thompson)
Neural Networks
(NN)
The evidence

[Slide figure: training of vision, ≈ 2nd-5th week]
NN: model of brains

• input and output
• neurons
• synapses
• electric transmissions
Elements

• input nodes
• output nodes
• links
• weights
Terminology

• input and output nodes (or units) connected by links
• each link has a numeric weight
• weights store information
• networks are trained on training sets (examples) and afterwards tested on test sets to assess the network's accuracy
• learning/training takes place as weights are updated to reflect the input/output behavior
The concept
1 = Yes, 0 = No

4 legs   fly   lay eggs
0        1     1          => bird
1        0     0          => mammal
1        1     0          => mammal
The concept, with one weight per input
1 = Yes, 0 = No; weights: 0.5  0.5  0.5

4 legs   fly   lay eggs
0        1     1          => bird
1        0     0          => mammal
1        1     0          => mammal
4 legs   fly   lay eggs   weighted sum (weights 0.5, 0.5, 0.5)
0        1     1          0*0.5 + 1*0.5 + 1*0.5 = 1       => bird
1        0     0          1*0.5 + 0*0.5 + 0*0.5 = 0.5     => mammal
1        1     0          1*0.5 + 1*0.5 + 0*0.5 = 1       => mammal

Goal is to have weights that recognize different representations of mammals and birds as such.
Suppose we want bird to score greater than 0.5 and mammal to score 0.5 or less. With equal weights of 0.5 this fails: the third animal, a mammal, also scores 1 and would be classified as a bird.
4 legs   fly   lay eggs   weighted sum (weights 0.25, 0.25, 0.5)
0        1     1          0*0.25 + 1*0.25 + 1*0.5 = 0.75   => bird
1        0     0          1*0.25 + 0*0.25 + 0*0.5 = 0.25   => mammal
1        1     0          1*0.25 + 1*0.25 + 0*0.5 = 0.5    => mammal

With weights 0.25, 0.25, 0.5, bird scores greater than 0.5 and both mammals score 0.5 or less, as required.
The training

Targets: mammal = 1, bird = 0

Output_i = Step(Σ_j w_j · f_ij),  where i indexes training examples and j indexes the features (j=1: 4 legs, j=2: flies, j=3: lays eggs)

• learning takes place as weights are updated to reflect the input/output behavior
• goal: minimize the error between the expected and the actual outcome

[Slide figure: a grid of 0/1 training examples over the three features]
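
A sketch of this training loop on the bird/mammal data from the previous slides. The update is the standard perceptron rule, which matches the slide's description; the threshold, learning rate, and number of epochs are assumptions for illustration:

def step(x, threshold=0.5):
    return 1 if x > threshold else 0

# features f = (4 legs, flies, lays eggs); targets: mammal = 1, bird = 0
training_set = [((0, 1, 1), 0),   # bird
                ((1, 0, 0), 1),   # mammal
                ((1, 1, 0), 1)]   # flying mammal

weights = [0.5, 0.5, 0.5]         # initial weights from the earlier slides
rate = 0.1

for _ in range(100):              # epochs
    for f, expected in training_set:
        actual = step(sum(w * x for w, x in zip(weights, f)))
        error = expected - actual
        # move each weight in proportion to its input and the error
        weights = [w + rate * error * x for w, x in zip(weights, f)]

print(weights)

On this data the loop settles on weights like (0.6, 0.2, 0.2), which put the bird at 0.4 and the mammals at 0.6 and 0.8, on the correct sides of the 0.5 threshold.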
NN demo…..
Characteristics

• NN implement inductive learning algorithms (through generalization); therefore they require several training examples to learn
• NN do not provide an explanation of why a task was performed the way it was
• no explicit knowledge; they use data
• tasks: classification (pattern recognition), clustering, diagnosis, optimization, forecasting (prediction), modeling, reconstruction, routing
Where are NN applicable?

• where they can form a model from training data alone
• when there may be an algorithm, but it is not known or has too many variables
• when there are enough examples available
• when it is easier to let the network learn from examples
• when other inductive learning methods may not be as accurate
Applications (i)

• predicting movements of stocks, currencies, etc., from previous data
• recognizing signatures by comparing those made (e.g., in a bank) with those stored
• monitoring the state of aircraft engines (by monitoring vibration levels and sound, early warning of engine problems can be given); British Rail has been testing an application to monitor diesel engines
Applications (ii)

• pronunciation (rules with many exceptions)
• handwritten character recognition (a network with 200,000 weights is impossible to train; the final network had 9,760 weights, used 7,300 examples to train and 2,000 to test, and reached 99% accuracy)
• learning brain patterns to control and activate limbs, as in the "Rats control a robot by thought alone" article
• credit assignment
CMU Driving: ALVINN

• learns from human drivers how to steer a vehicle along a single lane on a highway
• ALVINN is implemented in two vehicles equipped with computer-controlled steering, acceleration, and braking
• cars can reach 70 mph with ALVINN
• programs that reason about the entire problem environment reach only 4 mph
Why use NN for the driving task?

• there is no good theory of driving, but it is easy to collect training samples
• training data is obtained with a human* driving the vehicle
  – 5 min of training; the algorithm then runs for 10 min
• driving is continuous and noisy
• almost all features contribute useful information

*humans are not very good generators of training instances when they behave too regularly, without making mistakes
The neural network

• INPUT: a video camera generates a 30x32 grid of input nodes
• OUTPUT: a 30-node layer corresponding to steering directions
• the vehicle steers in the direction of the output node with the highest activation
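
A schematic of that input/output mapping; the single weight matrix is an assumption for brevity (the real ALVINN network also has a hidden layer), and the weights here are untrained:

import numpy as np

rng = np.random.default_rng(0)
image = rng.random((30, 32))            # camera frame -> 30x32 input grid
weights = rng.normal(size=(960, 30))    # illustrative, untrained weights

activations = image.reshape(-1) @ weights     # 30 output activations
steering = int(np.argmax(activations))        # steer toward the most active output
print(steering)   # index into the 30 steering directions (ordering assumed)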
Resources

http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html#what
http://www.ri.cmu.edu/projects/project_160.html
http://www.txtwriter.com/Onscience/Articles/ratrobot.html