Machine Learning

Tito Morais Brito Oliveira
System Modelling and Analysis
Machine Learning
• Machine learning is a scientific discipline that explores the
construction and study of algorithms that can learn from data.
• Definitions of machine learning:
• "Field of study that gives computers the ability to learn without
being explicitly programmed" by Arthur Samuel, 1959.
• "A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if
its performance at tasks in T, as measured by P, improves with
experience E" by Tom M. Mitchell.
Machine Learning
• What are the differences between Machine Learning and Artificial Intelligence?
Machine learning deals with designing and developing algorithms that evolve their behavior based on
empirical data.
Artificial intelligence encompasses other areas apart from machine learning, including
knowledge representation, natural language processing/understanding, planning, robotics,
etc.
• Machine Learning and Data Mining overlap in many cases, but in essence they
have these differences:
Machine learning focuses on prediction, based on known properties learned from the training
data.
Data mining focuses on the discovery of (previously) unknown properties in the data. This is
the analysis step of Knowledge Discovery in Databases.
Machine Learning
Categories of Machine Learning:
• Supervised Learning
• The computer is presented with example inputs and their desired outputs, given by a
"teacher", and the goal is to learn a general rule that maps inputs to outputs.
• Unsupervised Learning
• No labels (tag, mark, stamp) are given to the learning algorithm, leaving it on its own to
find structure in its input.
• Reinforcement Learning
• A computer program interacts with a dynamic environment in which it must achieve a
certain goal (such as driving a vehicle). It learns and relearns from its actions and the
effects/rewards of those actions, without a teacher explicitly telling it whether it
has come close to its goal or not. Another example is learning to play a game by playing
against an opponent.
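As a minimal sketch of the supervised case described above, the following Python snippet learns from labelled examples with a 1-nearest-neighbour rule. The data set, the labels and the function name are hypothetical, and nearest-neighbour is only one of many possible learners.

```python
# Supervised learning sketch: a 1-nearest-neighbour classifier.
# The training data and the function name are hypothetical illustrations.

def nearest_neighbour(train, query):
    """Return the label of the training example closest to `query`."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(train, key=lambda pair: sq_dist(pair[0], query))
    return label

# Example inputs with their desired outputs (the labels given by the "teacher").
train = [((1.0, 1.0), "small"), ((1.5, 0.5), "small"),
         ((8.0, 9.0), "large"), ((9.0, 8.5), "large")]

print(nearest_neighbour(train, (2.0, 1.0)))
print(nearest_neighbour(train, (7.5, 9.5)))
```

An unsupervised method would receive only the input points, without the "small"/"large" labels, and would have to find the two groups on its own.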
Machine Learning - Approaches
• Decision tree learning
• Association rule learning
• Artificial neural networks
• Inductive Logic programming
• Support vector machines
• Clustering
• Bayesian networks
• Genetic Algorithms
• Reinforcement learning
• Representation learning
• Similarity and metric learning
• Sparse dictionary learning
Machine Learning - Decision tree learning
Decision tree learning uses a
decision tree as a predictive
model which maps observations
about an item to conclusions about
the item's target value.
Tree models where the target
variable can take a finite set of
values are called classification
trees.
Decision tree learning
Classification trees
Leaves represent class labels and branches represent conjunctions (logical
AND) of features that lead to those class labels.
Regression trees
Decision trees where the target variable can take continuous values, for
example real numbers (e.g. the price of a house, or a patient’s length of
stay in a hospital).
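The difference between the two kinds of tree shows up at the leaves. In the sketch below, which assumes the common convention of a majority vote for classification and a mean for regression, the function names and sample values are hypothetical.

```python
from collections import Counter
from statistics import mean

def classification_leaf(labels):
    """Predict the most frequent class among the records that reach the leaf."""
    return Counter(labels).most_common(1)[0][0]

def regression_leaf(values):
    """Predict the mean target value of the records that reach the leaf."""
    return mean(values)

print(classification_leaf(["spam", "ham", "spam"]))   # a class label
print(regression_leaf([250_000, 310_000, 280_000]))   # a number, e.g. a house price
```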
Decision tree learning
Decision tree algorithms are commonly applied in data mining.
The “learning” process consists of splitting a set into subsets based
on an attribute value, and repeating this process recursively on each
subset (recursive partitioning). The recursion stops when the subset at
a node has the same value of the target variable throughout, or when
splitting no longer adds value to the predictions.
The data can be represented by records of the form:
(x, Y) = (x1, x2, …, xk, Y)
where Y is the target variable that we are trying to understand, classify
or generalize, and x is the vector of input variables (x1, x2, …, xk)
that are used for the classification.
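Recursive partitioning with the two stopping rules above can be sketched as follows. This is an illustrative toy rather than ID3, C4.5 or CART: it assumes categorical attributes and picks the split with the lowest weighted Gini impurity. The data set and all names are hypothetical.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, attributes):
    """rows: list of (x, Y) records, where x is a dict attribute -> value."""
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop: all records at this node share the same target value
    # (or there is nothing left to split on).
    if len(set(labels)) == 1 or not attributes:
        return majority

    def weighted_gini(attr):
        groups = {}
        for x, y in rows:
            groups.setdefault(x[attr], []).append(y)
        return sum(len(g) / len(rows) * gini(g) for g in groups.values())

    best = min(attributes, key=weighted_gini)
    # Stop: splitting no longer adds value (impurity does not decrease).
    if weighted_gini(best) >= gini(labels):
        return majority

    remaining = [a for a in attributes if a != best]
    children = {}
    for value in {x[best] for x, _ in rows}:
        subset = [(x, y) for x, y in rows if x[best] == value]
        children[value] = build_tree(subset, remaining)
    return {"attribute": best, "children": children, "default": majority}

def predict(tree, x):
    # Walk down until a leaf (a plain label) is reached.
    while isinstance(tree, dict):
        tree = tree["children"].get(x[tree["attribute"]], tree["default"])
    return tree

rows = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]
tree = build_tree(rows, ["outlook", "windy"])
print(predict(tree, {"outlook": "rainy", "windy": "yes"}))
```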
Decision tree learning
Some techniques, often called ensemble methods, construct more
than one decision tree:
Bagging decision trees (bootstrap aggregating) builds
multiple decision trees by repeatedly sampling the training data with
replacement, and lets the trees vote for a consensus prediction (e.g.
house pricing).
A Random Forest classifier uses a number of decision trees in order to
improve the classification rate. The method combines the bagging
idea above with the random selection of features.
Many others…
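A minimal bagging sketch, assuming one-split trees ("stumps") on a single numeric feature as the base learner. The stump learner, the toy price data and all names are hypothetical simplifications. A random forest would additionally pick a random subset of features at each split, which is not shown here.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a one-split tree on 1-D data given as a list of (x, label)."""
    best = None
    for threshold in sorted({x for x, _ in sample}):
        left = [y for x, y in sample if x <= threshold]
        right = [y for x, y in sample if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label

def bagging(data, n_trees, rng):
    trees = []
    for _ in range(n_trees):
        # Bootstrap sample: same size as the data, drawn with replacement.
        sample = rng.choices(data, k=len(data))
        trees.append(train_stump(sample))
    return trees

def vote(trees, x):
    """Consensus prediction: the label predicted by most trees."""
    return Counter(tree(x) for tree in trees).most_common(1)[0][0]

data = [(1, "cheap"), (2, "cheap"), (3, "cheap"),
        (7, "expensive"), (8, "expensive"), (9, "expensive")]
trees = bagging(data, n_trees=25, rng=random.Random(0))
print(vote(trees, 1), vote(trees, 9))
```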
Decision tree learning
Specific decision-tree algorithms:
• ID3 (the precursor to C4.5)
• C4.5
• CART (Classification And
Regression Tree)
• CHAID
• MARS
Metrics
Different algorithms use different
metrics for measuring the "best" split. These
metrics usually measure the
homogeneity of the target variable within the subsets:
• Gini impurity – used by the CART
algorithm for classification trees.
• Information gain – based on the
concept of entropy (used by ID3
and C4.5).
• Variance reduction – used by CART for regression trees.
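The three metrics can be written down directly from their standard definitions. The sketch below is illustrative; the function names are hypothetical and the example split is made up.

```python
from collections import Counter
from math import log2
from statistics import pvariance

def gini_impurity(labels):
    """1 - sum of squared class proportions; 0 for a pure node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    return entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)

def variance_reduction(parent, children):
    """Variance of the parent minus the size-weighted variance of the children."""
    n = len(parent)
    return pvariance(parent) - sum(len(ch) / n * pvariance(ch) for ch in children)

labels = ["a", "a", "b", "b"]
print(gini_impurity(labels))                                   # 0.5
print(entropy(labels))                                         # 1.0
print(information_gain(labels, [["a", "a"], ["b", "b"]]))      # 1.0
print(variance_reduction([1, 1, 3, 3], [[1, 1], [3, 3]]))
```

A perfectly separating split, as in the last two calls, removes all impurity, so the gain equals the impurity of the parent node.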
Association rule learning
Association rule learning is a popular and well-researched method for
discovering interesting relations between variables in large databases
(e.g. Netflix, Amazon, shopping carts). These relations are expressed as
association rules.
For example, the rule {onions, potatoes} ⇒ {hamburger meat}, found in the
sales data of a supermarket,
would indicate that if a customer buys onions and potatoes together,
he or she is likely to also buy hamburger meat.
Association rule learning
How important is this approach?
The concept of association rules was popularized particularly by the
1993 article of Agrawal et al., "Mining association rules between sets of
items in large databases",
which had acquired more than 6000 citations according to Google Scholar as
of March 2008, and is thus one of the most cited papers in the data mining
field.
A purported survey of behavior of supermarket shoppers discovered that
customers (presumably young men) who buy diapers tend also to buy beer.
This anecdote became popular as an example of how unexpected
association rules might be found from everyday data.
Association rule learning
Useful Concepts:
• Support: supp(X) denotes the support of an itemset X (example: X = {A, B, C}).
It is defined as the proportion of
transactions in the data set which contain
the itemset.
• Confidence: conf(X ⇒ Y) = supp(X ∪ Y) / supp(X)
For example, conf({butter, milk} ⇒ {butter}) = 1
means that every customer who buys
butter and milk also buys butter.
• Lift: lift(X ⇒ Y) = supp(X ∪ Y) / (supp(X) × supp(Y))
• Conviction: conv(X ⇒ Y) = (1 − supp(Y)) / (1 − conf(X ⇒ Y))
Algorithms:
• Apriori algorithm
• Eclat algorithm
• FP-growth algorithm
• AprioriDP
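The four measures can be computed directly from a list of transactions using their standard definitions. The sketch below is illustrative; the tiny transaction database and the function names are hypothetical.

```python
def support(itemset, transactions):
    """Proportion of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """conf(X => Y) = supp(X u Y) / supp(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    """lift(X => Y) = supp(X u Y) / (supp(X) * supp(Y))."""
    return (support(set(x) | set(y), transactions)
            / (support(x, transactions) * support(y, transactions)))

def conviction(x, y, transactions):
    """conv(X => Y) = (1 - supp(Y)) / (1 - conf(X => Y)); infinite if conf is 1."""
    conf = confidence(x, y, transactions)
    if conf == 1.0:
        return float("inf")
    return (1 - support(y, transactions)) / (1 - conf)

transactions = [
    {"milk", "bread"},
    {"butter"},
    {"beer", "diapers"},
    {"milk", "bread", "butter"},
    {"bread"},
]

print(support({"milk", "bread"}, transactions))                 # 0.4
print(confidence({"butter", "milk"}, {"butter"}, transactions)) # 1.0
print(lift({"milk"}, {"bread"}, transactions))
print(conviction({"bread"}, {"milk"}, transactions))
```

A lift above 1 suggests that the two itemsets occur together more often than would be expected if they were independent.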
Neural Network Learning
Artificial neural networks (ANNs) are a family of statistical learning
algorithms inspired by biological neural networks (the central nervous
systems of animals, in particular the brain) and are used to estimate or
approximate functions that can depend on a large number of inputs and are
generally unknown.
They are often applied to handwriting recognition, since they are adaptive
by nature.
“The number of neurons in the human brain: 100 billion, and the number of
synapses each can make: 10,000. The human brain is likely the most complex
thing in the universe.” The Evolving Brain, by R. Grant Steen, 2007.
Neural Network Learning
Models
Network Function (formal model)
The network can be defined as a composition of other
functions:
f(x) = K(Σi wi gi(x))
K: activation function
(for example, one whose output is ON (1) or OFF (0)).
wi: weight of each connection between neurons (synapse
or impulse strength of each neuron).
gi(x): the function associated with each individual neuron.
f(x): the output function of the whole network.
Neural networks can also be divided into layers, e.g.
input, hidden and output.
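The network function can be read as "apply the activation K to a weighted sum of the neuron functions". As a minimal sketch, the snippet below uses an ON/OFF step function for K. The weights and the neuron functions are hypothetical, hand-picked so that the network behaves like a logical AND gate rather than learned from data.

```python
def step(z):
    """Activation function K: ON (1) if the weighted sum is non-negative, else OFF (0)."""
    return 1 if z >= 0 else 0

def network(x, weights, neurons, activation=step):
    """f(x) = K(sum_i w_i * g_i(x))."""
    return activation(sum(w * g(x) for w, g in zip(weights, neurons)))

# Hypothetical neuron functions g_i: the two inputs and a constant bias term.
neurons = [lambda x: x[0], lambda x: x[1], lambda x: 1.0]
# Hypothetical weights w_i, chosen by hand so that f(x) computes x[0] AND x[1].
weights = [1.0, 1.0, -1.5]

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, network(x, weights, neurons))
```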
Neural Network Learning
Bidirectional Associative Memory (BAM)
A two-layer feedback neural network that associates two
vectors.
Proposed by Kosko, it is based on matrix
multiplication with:
M: weight matrix
X: input matrix
Y: output matrix
Example:
A = (1,0,1,0,1,0,1,0)
B = (1,1,1,0,1,1,1)
In the BAM model, A plays the role of X and B the role of Y.
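The following sketch stores the pair (A, B) and recalls each vector from the other. It assumes the usual BAM formulation, which the slide does not spell out: vectors are converted to bipolar form (0 becomes -1), the weight matrix is the sum of outer products of the stored pairs, and recall thresholds a matrix product. Function names are hypothetical, and a zero sum is mapped to 0 here as a simplification.

```python
def to_bipolar(v):
    """Map binary {0, 1} to bipolar {-1, +1}."""
    return [2 * b - 1 for b in v]

def to_binary(v):
    """Threshold: positive sums become 1, everything else 0 (simplification)."""
    return [1 if s > 0 else 0 for s in v]

def train(pairs):
    """Weight matrix M = sum over stored pairs of outer(X, Y), in bipolar form."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    M = [[0] * m for _ in range(n)]
    for x, y in pairs:
        bx, by = to_bipolar(x), to_bipolar(y)
        for i in range(n):
            for j in range(m):
                M[i][j] += bx[i] * by[j]
    return M

def recall_forward(M, x):
    """Y = threshold(X . M)."""
    bx = to_bipolar(x)
    return to_binary([sum(bx[i] * M[i][j] for i in range(len(M)))
                      for j in range(len(M[0]))])

def recall_backward(M, y):
    """X = threshold(M . Y)."""
    by = to_bipolar(y)
    return to_binary([sum(M[i][j] * by[j] for j in range(len(by)))
                      for i in range(len(M))])

A = (1, 0, 1, 0, 1, 0, 1, 0)
B = (1, 1, 1, 0, 1, 1, 1)
M = train([(A, B)])

print(recall_forward(M, A))                          # recovers B
print(recall_backward(M, B))                         # recovers A
print(recall_forward(M, (1, 0, 1, 0, 1, 0, 1, 1)))   # A with one flipped bit
```

With a single stored pair, a slightly corrupted A still recalls B, which is the associative behaviour the model is designed for.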
Genetic Algorithms
Genetic algorithm (GA) is a search heuristic that mimics the process of
natural selection.
It belongs to a broader class of algorithms called Evolutionary
Algorithms (EA), which use operations such as inheritance, mutation,
selection and crossover.
Genetic Algorithms
A population of candidate
solutions (called individuals or
creatures) to an optimization
problem is evolved toward better
solutions.
Each candidate solution has a set of
properties (its chromosomes or
genotype) which can be mutated
and altered.
Genetic Algorithms
Evolution usually starts from a
population of randomly generated
individuals, and is an iterative
process, with the population in each
iteration called a generation.
In the evolution environment:
the fitness of every individual in the
population is evaluated;
the evaluation involves an objective
function;
the algorithm selects the fitter
individuals.
Genetic Algorithms
After selection, a new generation
is created, and the genetic
operators are applied to this
new generation.
Reproduction:
A pair of parents is selected
according to their fitness values
(the individuals with the best fitness
values are selected).
Genetic Algorithms
Crossover:
It is used to vary the programming of a
chromosome or chromosomes from
one generation to the next. You can
have one-point, two-point, etc.
crossovers. The example on the slide
is a one-point crossover.
Mutation:
It is used to maintain genetic
diversity from one generation of a
population of genetic algorithm
chromosomes to the next.
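The whole cycle (random initial population, fitness evaluation, selection, one-point crossover, mutation) can be sketched on a toy problem. The example below assumes the "OneMax" objective (maximize the number of 1-bits), tournament selection and elitism. These choices, the parameter values and all names are illustrative assumptions, not prescribed by the slides.

```python
import random

def fitness(individual):
    """Objective function (OneMax): the number of 1-bits in the chromosome."""
    return sum(individual)

def one_point_crossover(a, b, rng):
    """Swap the tails of two parents after a random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(individual, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in individual]

def select_parents(population, rng, k=3):
    """Tournament selection (an assumed choice): best of k random individuals."""
    def tournament():
        return max(rng.sample(population, k), key=fitness)
    return tournament(), tournament()

def evolve(n_bits=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    # Evolution starts from a population of randomly generated individuals.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism (assumed): carry the best individual over unchanged.
        next_gen = [max(population, key=fitness)]
        while len(next_gen) < pop_size:
            p1, p2 = select_parents(population, rng)
            for child in one_point_crossover(p1, p2, rng):
                next_gen.append(mutate(child, rate, rng))
        population = next_gen[:pop_size]
    return max(population, key=fitness)

best = evolve()
print(fitness(best), best)
```

Because of elitism the best fitness never decreases from one generation to the next. Mutation keeps some diversity so the search does not stall on a single chromosome.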
Bibliography
• Coursera, Machine Learning (Stanford class),
https://www.coursera.org/course/ml
• Stuart Russell, Peter Norvig, 2009, Artificial Intelligence: A Modern
Approach
• Kosko, 1988, Bidirectional Associative Memories
• Witten, Frank, Hall, Data Mining: Practical Machine Learning Tools and
Techniques
Questions?