Transcript Example

Universidad de Buenos Aires
Master's in Data Mining and Knowledge Discovery
Machine Learning
1 - Introduction
Eduardo Poggi
Ernesto Mislej
Autumn 2008
1
2 Agenda



Machine Learning
Learning systems
Assignments
3 Field of Study

Artificial Intelligence






Planning
Natural Language
Robotics
Knowledge Representation
…
Machine Learning

“Knowledge” Discovery





Clusters
Rules
Concepts
Patterns
…
4 Multidisciplinary Field
Machine Learning (at the center), drawing on:
Artificial Intelligence
Probability & Statistics
Computational Complexity Theory
Neurobiology
Information Theory
Philosophy
5 Intelligence

“People have processes that allow them to solve complex problems; the set of these processes that we do not understand is what we call intelligence.” (M. Minsky)

A permanently changing definition = “unexplored regions of Africa”.
6 Machine Learning


Machine learning is the study of how to
make computers automatically learn;
the goal is to make computers improve
their performance through experience.
The purpose of this course is to present
the key concepts, algorithms and theory
that form the core of Machine Learning.
7 ML & DM



Information: Set of patterns or
expectations that underlie the data.
Data Mining: Extraction of implicit,
previously unknown and potentially
useful information from data.
Machine Learning: Provides the
technical basis (algorithms) of data
mining.
8 Epistemological differences among Computer Science, ML and DM

Classic data processing | Machine Learning (and Statistics) | DM
Simulates deductive reasoning (= applies an existing model) | Simulates inductive reasoning (= invents a model) | Simulates inductive reasoning ("even more inductive")
Validation according to precision | Validation according to precision | Validation according to utility and comprehensibility
Results as universal as possible | Results as universal as possible | Results relative to particular cases
Elegance = conciseness | Elegance = conciseness | Elegance = adequacy to the user's model
Tends to reject AI | Either tends to reject AI (Statistics) or claims to belong to AI (ML) | Naturally integrates AI, DB, Stat., and MMI
9 Model of learning systems
Class of Tasks (T)
Performance (P)
Experience (E)
Computer + Learning Algorithm
10 Class of Tasks

It is the kind of activity on which the
computer will learn to improve its
performance. Examples:




Learning to play chess
Recognizing images of handwritten words
Diagnosing patients coming into the hospital
“Discovering” patterns in data
11 Settings for learning



Tasks are generated by a random process
outside the learner
The learner can pose queries to a teacher
The learner explores its surroundings
autonomously

Example: Learning to play chess



Learn from a specific sequence
Ask: what if the sequence is this?
Give me an amateur player and then an expert player
12 Experience and Memory

Texts learned “by heart”






“… who could bear such harsh …”
“Lasciate ogni speranza, voi ch'intrate”
Relations
Without stored data there is no learning
Reasoning ability does not compensate for ignorance
Relation, Generalization and Abstraction
13 Experience and Performance


Experience: What has been recorded in
the past.
Performance: A measure of the quality
of the response or action.



Example: Handwriting recognition using Neural Networks
Experience: a database of handwritten images with their correct classification
Performance: Accuracy in classification
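
The performance measure in this example can be made concrete with a minimal sketch. The labels below are made-up illustrative data, not taken from the course, and `accuracy` is a hypothetical helper name.

```python
def accuracy(predicted, actual):
    """Fraction of examples whose predicted class matches the recorded class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Experience (E): the correct class recorded for each image (made-up labels).
actual_labels = ["a", "b", "a", "c", "b"]
# Classes assigned by some classifier to the same images (also made up).
predicted_labels = ["a", "b", "c", "c", "b"]

# Performance (P): accuracy of the classifications.
score = accuracy(predicted_labels, actual_labels)
print(score)
```

Four of the five made-up predictions match, so the measure rewards exactly the behavior the task cares about: assigning the recorded class.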
14 Performance



Effectiveness = Qr / Qp
Efficacy = Effectiveness * Tp / Tr
Efficiency = Efficacy * Rp / Rr





Q = Quantity of units (includes quality).
T = Time
R = Resources
p = planned
r = real (actual)
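
The three ratios on this slide chain into one another, which a short worked example may make easier to see. This is only a sketch: the function names and all the numbers are made up for illustration.

```python
def effectiveness(q_real, q_planned):
    """First ratio on the slide: Qr / Qp."""
    return q_real / q_planned


def efficacy(q_real, q_planned, t_real, t_planned):
    """Second ratio: the first ratio scaled by Tp / Tr."""
    return effectiveness(q_real, q_planned) * t_planned / t_real


def efficiency(q_real, q_planned, t_real, t_planned, r_real, r_planned):
    """Third ratio: the second ratio scaled by Rp / Rr."""
    return efficacy(q_real, q_planned, t_real, t_planned) * r_planned / r_real


# Made-up plan: 100 units in 10 hours using 4 resources.
# Made-up outcome: 90 units in 12 hours using 5 resources.
e1 = effectiveness(90, 100)             # 90 / 100
e2 = efficacy(90, 100, 12, 10)          # e1 * 10 / 12
e3 = efficiency(90, 100, 12, 10, 5, 4)  # e2 * 4 / 5
print(round(e1, 3), round(e2, 3), round(e3, 3))
```

Each step penalizes a different overrun: producing fewer units than planned, taking longer than planned, and using more resources than planned.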
15 Designing a Learning System
1. Define the knowledge to learn
2. Define the representation of the target knowledge
3. Define the learning mechanism
+ Define a monitoring mechanism
Example:
Handwriting recognition using Neural Networks
1. A function to classify handwritten images
2. A linear combination of handwritten features
3. A linear classifier
16 The Knowledge to Learn
Supervised learning: A function to predict the class of new examples
Let X be the space of possible examples
Let Y be the space of possible classes
Learn F : X -> Y
Example:
In learning to play chess the following are possible interpretations:
X : the space of board configurations
Y : the space of legal moves
17 The Representation of the Target Knowledge
Example: Diagnosing a patient coming into the hospital.
Features:
• X1: Temperature
• X2: Blood pressure
• X3: Blood type
• X4: Age
• X5: Weight
• Etc.
Given a new example X = < x1, x2, …, xn >
F(X) = w1 x1 + w2 x2 + … + wn xn
If F(X) > T predict heart disease
otherwise predict no heart disease
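
The weighted sum and threshold rule above can be sketched in a few lines. Everything numeric here is an assumption made for illustration: the weights, the threshold, and the patient values are invented and have no medical meaning, and the categorical feature (blood type) is left out because a weighted sum needs numeric inputs.

```python
def linear_score(weights, features):
    """F(X) = w1*x1 + w2*x2 + ... + wn*xn."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def predict(weights, features, threshold):
    """Predict heart disease when F(X) exceeds the threshold T."""
    if linear_score(weights, features) > threshold:
        return "heart disease"
    return "no heart disease"


# Feature order (assumed): temperature, blood pressure, age, weight.
weights = [0.0, 0.02, 0.03, 0.01]  # made-up weights
threshold = 5.0                    # made-up threshold T

patient_a = [37.0, 150.0, 70.0, 90.0]  # F(X) is about 6.0
patient_b = [36.5, 110.0, 30.0, 65.0]  # F(X) is about 3.75

print(predict(weights, patient_a, threshold))
print(predict(weights, patient_b, threshold))
```

In a learning system the weights would not be written by hand as they are here; finding them from recorded examples is the job of the learning mechanism.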
18 The Representation of the Target Knowledge

There are many possibilities:

The class of functions is very expressive.
You can represent almost any function, but to be effective the method needs lots of examples.

The class of functions is very limited.
You don't need many examples, but the class may fail to contain the true target function.

Characteristics: validity, expressiveness, ease of inference, adaptability.
19 The Learning Mechanism 1
Machine learning algorithms abound:
• Decision Trees
• Rule-based systems
• Neural networks
• Nearest-neighbor
• Support-Vector Machines
• Bayesian Methods
Important characteristics of the learning mechanism:
• What is the class of functions
• How do you search over the class of functions
20 The Learning Mechanism 2
Example:
Look over the space of all possible decision trees.
Prefer small trees to large trees.
[Figure: two candidate decision trees, one labeled “Higher score” and the other “Lower score”.]
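
The slide does not say how the score is computed. One possible reading, sketched below under that assumption, is training accuracy minus a small penalty per node, so that of two trees that fit the data equally well the smaller one scores higher. The tree encoding, the penalty value, and the toy data are all hypothetical.

```python
def count_nodes(tree):
    """A tree is a leaf label (str) or a tuple (feature, if_false, if_true)."""
    if isinstance(tree, str):
        return 1
    _, if_false, if_true = tree
    return 1 + count_nodes(if_false) + count_nodes(if_true)


def classify(tree, example):
    """Follow boolean feature tests from the root down to a leaf label."""
    while not isinstance(tree, str):
        feature, if_false, if_true = tree
        tree = if_true if example[feature] else if_false
    return tree


def score(tree, examples, size_penalty=0.01):
    """Assumed scoring rule: accuracy minus a penalty proportional to tree size."""
    correct = sum(classify(tree, x) == y for x, y in examples)
    return correct / len(examples) - size_penalty * count_nodes(tree)


# Made-up training data with two boolean features.
examples = [
    ({"fever": True, "cough": True}, "sick"),
    ({"fever": True, "cough": False}, "sick"),
    ({"fever": False, "cough": True}, "healthy"),
    ({"fever": False, "cough": False}, "healthy"),
]

small_tree = ("fever", "healthy", "sick")
large_tree = ("fever",
              ("cough", "healthy", "healthy"),
              ("cough", "sick", "sick"))

print(score(small_tree, examples) > score(large_tree, examples))
```

Both trees classify every example correctly, so the size penalty alone decides the ranking, which is the preference the slide describes.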
21 Choices in designing a learning program

Determine Type of Training Experience
  Games against experts
  Games against self
  Table of correct moves

Determine Target Function
  Board -> Value
  Board -> Move

Determine Representation of Learned Function
  Function
  Rules
  Artificial neural network

Determine Learning Algorithm
  Gradient descent
  Linear programming
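
To illustrate how these choices fit together, here is a minimal sketch of one combination: a Board -> Value target, represented as a linear function of board features, and fitted by gradient descent on the squared error (the least-mean-squares update). The three features (a constant term, own piece count, opponent piece count), the training values, and the hyperparameters are all assumptions made up for the example.

```python
def predict(weights, features):
    """Linear Board -> Value function: weighted sum of board features."""
    return sum(w * x for w, x in zip(weights, features))


def lms_fit(examples, n_features, learning_rate=0.01, epochs=2000):
    """Stochastic gradient descent on squared error (LMS weight update)."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features, target in examples:
            error = target - predict(weights, features)
            # Move each weight in proportion to the error and its feature value.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights


# Made-up training values generated from a hidden linear rule,
# value = 0.5 + 1.0 * own_pieces - 1.0 * opponent_pieces,
# so that the fitted weights can be checked against it.
hidden = [0.5, 1.0, -1.0]
boards = [(1, 3, 1), (1, 1, 3), (1, 2, 2), (1, 4, 0), (1, 0, 2), (1, 0, 0)]
examples = [(b, predict(hidden, b)) for b in boards]

weights = lms_fit(examples, n_features=3)
print([round(w, 3) for w in weights])
```

Because the made-up values are noise-free and exactly linear, the fitted weights should approach the hidden rule; with real game outcomes the fit would only be approximate.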
22 Application 1
23 Application 1
Autonomous Car Driving
Class of Tasks:
Learning to drive on highways from stereo vision.
Knowledge:
Images and steering commands recorded
while observing a human driver.
Performance Module: Accuracy in classification
24 Application 2
Learning to classify astronomical structures.
Classes: galaxy, stars, unknown
Features:
o Color
o Size
o Mass
o Temperature
o Luminosity
25 Application 2
Classifying Astronomical Objects
Class of Tasks:
Learning to classify new objects.
Knowledge:
database of images with correct
classification.
Performance Module: Accuracy in classification
26 Other Applications

Bio-Technology








Protein Folding Prediction
Micro-array gene expression
Computer Systems Performance Prediction
Credit Applications
Fraud Detection
Detection of consumption patterns (repeat and sporadic purchases, related products)
Character Recognition (US Postal Service)
Web Applications


Document Classification
Learning User Preferences
27 Different models

Deductive
Inductive
Memorization
Classification
Clustering
Theorization
Hybrid
EBL – Explanation-based learning
SBL – Similarity-based learning
CBL – Case-based learning
28 Should I care about Machine Learning at all?




Machine learning is becoming increasingly popular
and has become a cornerstone in many industrial
applications.
Machine learning provides algorithms for data mining,
where the goal is to extract useful pieces of
information (i.e., patterns) from large databases.
The computer industry is heading towards systems
that will be able to adapt and heal themselves
automatically.
The electronic game industry is now focusing on
games where characters adapt and learn through
time.
29 Summary




Machine learning is the study of how to make
computers automatically learn.
A learning algorithm needs the following
elements: class of tasks, performance metric,
and body of experience.
Designing a learning algorithm requires defining
the knowledge to learn, the
representation of the target knowledge, and
the learning mechanism.
Machine learning has many successful
applications and is becoming increasingly
important in science and industry.
30 Assignments

Read:
Chapter 1 of Mitchell

Suggested reading:
Kodratoff, Yves: Machine Learning and Data Mining
Kodratoff, Yves: Cuando el ordenador aprende
Kvitca, Adolfo: Resolución de problemas con IA. EBAI, 1998. Chs. 1-4.
Rich, Elaine: AI. McGraw-Hill, 1984. Chs. 2 and 3.

Design a learning system for some game