Learning - echosf.net


Introduction to Machine Learning Algorithms

What is Artificial Intelligence (AI)?

- Design and study of computer programs that behave intelligently.
- Designing computer programs to make computers smarter.
- Study of how to make computers do things at which, at the moment, people are better.

Research Areas and Approaches

Artificial Intelligence
- Research: Learning Algorithms, Inference Mechanisms, Knowledge Representation, Intelligent System Architecture
- Application: Intelligent Agents, Information Retrieval, Electronic Commerce, Data Mining, Bioinformatics, Natural Language Processing, Expert Systems
- Paradigm: Rationalism (Logical), Empiricism (Statistical), Connectionism (Neural), Evolutionary (Genetic), Biological (Molecular)

Concept of Machine Learning
Context
[Figure: Machine Learning shown in the context of Computer Science (AI), Cognitive Science, Statistics, and Information Theory.]

Why Machine Learning?

- Recent progress in algorithms and theory
- Growing flood of online data
- Computational power is available
- Budding industry

Three niches for machine learning
- Data mining: using historical data to improve decisions
  - Medical records --> medical knowledge
- Software applications we can't program by hand
  - Autonomous driving
  - Speech recognition
- Self-customizing programs
  - Newsreader that learns user interests

Learning: Definition

Definition
Learning is the improvement of performance in some environment through the acquisition of knowledge resulting from experience in that environment.

[Diagram: learning is the improvement of behavior on some performance task, through acquisition of knowledge, based on partial task experience.]

A Learning Problem: EnjoySport

Sky    Temp  Humid   Wind    Water  Forecast  EnjoySport
Sunny  Warm  Normal  Strong  Warm   Same      Yes
Sunny  Warm  High    Strong  Warm   Same      Yes
Rainy  Cold  High    Strong  Warm   Change    No
Sunny  Warm  High    Strong  Cool   Change    Yes

What is the general concept?
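
One way to look for a general concept in this table is the Find-S procedure from the concept-learning literature. The slide does not name a method, so treat the following as a sketch under that assumption. It reads the table as four training examples over six attributes (Sky, Temp, Humid, Wind, Water, Forecast) and represents a hypothesis as a tuple in which "?" accepts any value. The names `find_s` and `matches` are illustrative.

```python
# Training examples as read from the EnjoySport table: (attributes, label).
# Attribute order: Sky, Temp, Humid, Wind, Water, Forecast.
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positives."""
    hypothesis = None  # None means no positive example has been seen yet
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)
        else:
            # Generalize every attribute that disagrees with this positive example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return tuple(hypothesis) if hypothesis is not None else None


def matches(hypothesis, attributes):
    """True if the hypothesis classifies the attribute tuple as positive."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, attributes))


concept = find_s(EXAMPLES)
print(concept)
```

Under this reading, the learned hypothesis keeps Sky, Temp, and Wind fixed and generalizes the remaining attributes; it also rejects the single negative example, which `matches` can confirm.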

Metaphors and Methods

Metaphor               Method
Neurobiology           Connectionist Learning
Biological Evolution   Genetic Learning
Heuristic Search       Tree / Rule Induction
Memory and Retrieval   Case-Based Learning
Statistical Inference  Probabilistic Induction

What is the Learning Problem?

- Learning = improving with experience at some task
  - Improve over task T,
  - With respect to performance measure P,
  - Based on experience E.
- E.g., Learn to play checkers
  - T: Play checkers
  - P: % of games won in world tournament
  - E: opportunity to play against self

Machine Learning: Tasks

- Supervised Learning
  - Estimate an unknown mapping from known input-output pairs
  - Learn $f_w$ from training set $D = \{(x, y)\}$ s.t. $f_w(x) \approx y = f(x)$
  - Classification: y is discrete
  - Regression: y is continuous
- Unsupervised Learning
  - Only input values are provided
  - Learn $f_w$ from $D = \{(x)\}$ s.t. $f_w(x) \approx x$
  - Compression
  - Clustering
- Reinforcement Learning
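
As a small illustration of the supervised regression case, the sketch below learns a parameterized function from input-output pairs. It assumes the simplest possible model family, a straight line with parameters w = (w0, w1), fitted by ordinary least squares; the slides do not prescribe a model or a fitting method, and the names `fit_line` and `predict` and the toy data are illustrative.

```python
def fit_line(pairs):
    """Least-squares fit of f_w(x) = w0 + w1 * x to a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    w1 = cov_xy / var_x          # slope
    w0 = mean_y - w1 * mean_x    # intercept
    return w0, w1


def predict(w, x):
    """Evaluate the learned function f_w at input x."""
    w0, w1 = w
    return w0 + w1 * x


# Toy training set D generated by the (normally unknown) mapping y = 2x + 1.
D = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w = fit_line(D)
print(w, predict(w, 4.0))
```

Because y is continuous here, this is regression; replacing the real-valued output with a class label would turn the same setup into classification.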

Machine Learning: Strategies

- Rote learning
- Concept learning
- Learning from examples
- Learning by instruction
- Inductive learning
- Deductive learning
- Explanation-based learning (EBL)
- Learning by analogy
- Learning by observation

Supervised Learning

- Given a sequence of input/output pairs of the form <x_i, y_i>, where x_i is a possible input and y_i is the output associated with x_i.
- Learn a function f that accounts for the examples seen so far, f(x_i) = y_i for all i, and that makes a good guess for the outputs of the inputs that it has not seen.
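
A nearest-neighbor rule is one simple learner that meets both requirements: it reproduces y_i exactly on every stored x_i and guesses the output of the closest stored input otherwise. The sketch below assumes numeric scalar inputs with distinct x_i; the function name `learn` and the example data are illustrative, and the slides do not single out this method.

```python
def learn(examples):
    """Return a function f with f(x_i) == y_i on the examples seen so far.

    For an unseen input, f guesses the output of the nearest stored input.
    Assumes the x_i are distinct numbers.
    """
    table = list(examples)

    def f(x):
        _, nearest_y = min(table, key=lambda pair: abs(pair[0] - x))
        return nearest_y

    return f


examples = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
f = learn(examples)

print([f(x) == y for x, y in examples])  # consistency on the seen pairs
print(f(7.5))                            # a guess for an unseen input
```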

Examples of Input-Output Pairs

Task                   Inputs                                          Outputs
Recognition            Descriptions of objects                         Classes that the objects belong to
Action                 Descriptions of situations                      Actions or predictions
Janitor robot problem  Descriptions of offices (floor, prof's office)  Yes or No (indicating whether or not the office contains a recycling bin)

Unsupervised Learning

- Clustering
  A clustering algorithm partitions the inputs into a fixed number of subsets or clusters so that inputs in the same cluster are close to one another.
- Discovery learning
  The objective is to uncover new relations in the data.
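
The clustering description above can be sketched with k-means, one common algorithm that partitions inputs into a fixed number k of clusters. The slides do not name a specific algorithm, so this is an assumed choice; the initialization (first k points as starting centroids), the function names, and the toy points are illustrative simplifications.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Partition points into k clusters; returns (centroids, clusters)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, clusters


points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.5, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(clusters)
```

With well-separated data like this, points near (1, 1) end up in one cluster and points near (9, 9) in the other; real data usually needs a more careful initialization.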

Online and Batch Learning

- Batch methods
  Process large sets of examples all at once.
- Online (incremental) methods
  Process examples one at a time.
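
The difference can be seen on a deliberately tiny learning task: estimating a mean. In this sketch the batch version needs the whole data set at once, while the online version updates its estimate after each example and never stores the data. The names `batch_mean` and `OnlineMean` are illustrative.

```python
def batch_mean(examples):
    """Batch method: uses the full set of examples in one pass."""
    return sum(examples) / len(examples)


class OnlineMean:
    """Online (incremental) method: updates the estimate one example at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        self.count += 1
        # Move the estimate toward x by a step that shrinks as count grows.
        self.mean += (x - self.mean) / self.count
        return self.mean


data = [2.0, 4.0, 6.0, 8.0]

learner = OnlineMean()
for x in data:
    learner.update(x)

print(batch_mean(data), learner.mean)
```

Both approaches arrive at the same estimate here; the online form is useful when examples arrive as a stream or the data set does not fit in memory.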

Machine Learning Algorithms and Applications

Machine Learning Algorithms

- Neural Learning
  - Multilayer Perceptrons (MLPs)
  - Self-Organizing Maps (SOMs)
- Evolutionary Learning
  - Genetic Algorithms
- Probabilistic Learning
  - Bayesian Networks (BNs)
- Other Machine Learning Methods
  - Decision Trees (DTs)

Neural Nets for Handwritten Digit Recognition
[Figure: after pre-processing, a digit image is fed to the input units of a network with hidden units and output units labeled 0 to 9; the diagram shows a training phase and a test phase.]

ALVINN System: Neural Network Learning to Steer an Autonomous Vehicle

Learning to Navigate a Vehicle by Observing a Human Expert (1/2)

- Inputs
  The images produced by a camera mounted on the vehicle.
- Outputs
  The actions taken by the human driver to steer the vehicle or adjust its speed.
- Result of learning
  A function mapping images to control actions.

Learning to Navigate a Vehicle by Observing a Human Expert (2/2)

Data Recorrection by a Hopfield Network
[Figure panels: original target data; corrupted input data; recorrected data after 10 iterations; recorrected data after 20 iterations; fully recorrected data after 35 iterations.]
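
The idea behind this slide, recovering a stored pattern from a corrupted copy, can be sketched with a very small Hopfield network. The example below assumes Hebbian weights, patterns of +1/-1 values, and asynchronous unit-by-unit updates; the two 8-unit patterns are made up for illustration and are far smaller than the image data in the figure, so the number of sweeps needed here says nothing about the iteration counts shown on the slide.

```python
def train(patterns):
    """Hebbian learning: w[i][j] sums p[i] * p[j] over patterns, zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, max_sweeps=20):
    """Update units one at a time until no unit changes (or max_sweeps is hit)."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(w[i][j] * state[j] for j in range(n))
            new_value = 1 if field > 0 else -1 if field < 0 else state[i]
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two hypothetical stored patterns (orthogonal to each other).
P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = train([P1, P2])

corrupted = list(P1)
corrupted[0] = -corrupted[0]  # flip one unit of P1

print(recall(W, corrupted) == P1)
```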

ANN for Face Recognition
A 960 x 3 x 4 network is trained on gray-level images of faces to predict whether a person is looking to their left, right, ahead, or up.

Data Mining
[Figure: data mining pipeline. Database/data warehouse --> (Selection & Sampling) --> Target data --> (Preprocessing & Cleaning) --> Cleaned data --> (Transformation & Reduction) --> Transformed data --> (Data Mining) --> Patterns/model --> (Interpretation/Evaluation) --> Knowledge --> Performance system.]

Hot Water Flashing Nozzle with Evolutionary Algorithms
Hans-Paul Schwefel performed the original experiments.
[Figure labels: Start; hot water entering; at throat: Mach 1 and onset of flashing; steam and droplet at exit.]

Machine Learning Applications in Bioinformatics

Bayesian Networks for Gene Expression Analysis

- Learning
  [Figure: Data --> Preprocessing --> Processed data --> Learning algorithm --> a Bayesian network over Gene A, Gene B, Gene C, Gene D, and Target.]
- Inference
  [Figure: in the learned network over Gene A, Gene B, Gene C, Gene D, and Target, the values of Gene C and Gene B are given, belief propagation is run, and the probability for the target is computed.]
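
The inference step, computing the probability of the target once Gene B and Gene C are observed, can be sketched on a toy network. Everything specific here is an assumption made for illustration: the structure (Target is the parent of Gene B and Gene C; Gene A and Gene D are left out), the on/off states, and all probability values are hypothetical and not taken from the slides. The posterior is computed by direct enumeration rather than belief propagation, which on a network this small yields the same result.

```python
# Hypothetical network: Target --> GeneB, Target --> GeneC.
# All numbers below are made up for illustration.
P_TARGET = {True: 0.3, False: 0.7}       # prior P(Target)
P_GENE_B_ON = {True: 0.8, False: 0.2}    # P(GeneB = on | Target)
P_GENE_C_ON = {True: 0.7, False: 0.4}    # P(GeneC = on | Target)


def likelihood(p_on, observed_on):
    """Probability of the observed state given P(state = on)."""
    return p_on if observed_on else 1.0 - p_on


def posterior_target(gene_b_on, gene_c_on):
    """P(Target = True | GeneB, GeneC) by enumerating both Target values."""
    joint = {}
    for target in (True, False):
        joint[target] = (
            P_TARGET[target]
            * likelihood(P_GENE_B_ON[target], gene_b_on)
            * likelihood(P_GENE_C_ON[target], gene_c_on)
        )
    return joint[True] / (joint[True] + joint[False])


print(posterior_target(gene_b_on=True, gene_c_on=True))
```

With these made-up numbers, observing both genes as "on" raises the probability of the target from its prior of 0.3 to about 0.75.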

Multilayer Perceptrons for Gene Finding and Prediction
[Figure: network inputs labeled coding potential value, GC composition, length, donor, acceptor, and intron vocabulary; output labeled discrete exon score, plotted between 0 and 1 along the sequence (bases).]

Self-Organizing Maps for DNA Microarray Data Analysis
[Figure: the input is connected by a bundle of synaptic connections to a two-dimensional array of postsynaptic neurons, with the winning neurons marked.]

Biological Information Extraction
[Figure: information extraction from text data into a database. Stages shown: Data Analysis & Field Identification; Data Classification & Field Extraction; Field Property Identification & Learning; Database Template; DB Record Filling (fields such as Location and Date); DB.]

Biomolecular Computing
[Figure: a binary string (011001101010001) shown alongside a DNA sequence (ATGCTCGAAGCT).]