Artificial Neural Networks


Learning in Agents
Material collected, assembled and
extended by
S. Costantini,
Computer Sc. Dept. Univ. of L’Aquila
Many thanks to all colleagues that share
teaching material on the web.
Why Learning Agents?
• Designers cannot foresee all situations that the agent will encounter.
• To display full autonomy, agents need to learn from and adapt to novel environments.
• Learning is a crucial part of intelligence.

What is Machine Learning?
Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. [Mitchell 97]
• Example: T = “play tennis”, E = “playing matches”, P = “score”

ML (machine learning):
Another View
ML can be seen as the task of:
• taking a set of observations represented in a given object/data language, and
• representing (the information in) that set in another language, called the concept/hypothesis language.
A side effect of this step is the ability to deal with unseen observations.
When an agent learns:
• The range of behaviors is expanded: the agent can do more, or
• The accuracy on tasks is improved: the agent can do things better, or
• The speed is improved: the agent can do things faster.

Machine Learning Biases



• The concept/hypothesis language specifies the language bias, which limits the set of all concepts/hypotheses that can be expressed/considered/learned.
• The preference bias allows us to decide between two hypotheses (even if they both classify the training data equally well).
• The search bias defines the order in which hypotheses will be considered.
– Important if one does not search the whole hypothesis space.
Concept Language and
Black- vs. White-Box Learning
• Black-Box Learning: Interpretation of the learning result is unclear to a user.
• White-Box Learning: Creates (symbolic) structures that are comprehensible.

Machine Learning vs.
Learning Agents
Machine Learning:
• Classic Machine Learning (learning as the only goal)
• Active Learning
• Closed Loop Machine Learning
Learning as one of many goals:
• Learning Agent(s)
Integrating Machine
Learning into the Agent
Architecture

Time constraints on learning

Synchronisation between agents’ actions

Learning and recall

Timing analysis of theories learned
Time Constraints on
Learning

Machine Learning alone:
• predictive accuracy matters, time doesn’t (it is just a price to pay)
ML in Agents:
• Soft deadlines: resources must be shared with other activities (perception, planning, control)
• Hard deadlines: imposed by the environment: Make up your mind now!
Learning and Recall
An agent must strike a balance between:
• Learning, which updates the model of the world
• Recall, which applies the existing model of the world to other tasks
Learning and Recall (2)
[Figure: diagram of three steps: update sensory information; recall the current model of the world to choose and carry out an action; learn a new model of the world]
• In theory, the two can run in parallel
• In practice, they must share limited resources
Learning and Recall (3)
Possible strategies:
• Parallel learning and recall at all times
• Mutually exclusive learning and recall
– After incremental, eager learning, examples are discarded…
– …or kept if batch or lazy learning is used
• Cheap on-the-fly learning (preprocessing), off-line computationally expensive learning
– reduce raw information, change object language
– analogy with human learning and the role of sleep
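The difference between discarding examples (incremental, eager learning) and keeping them (lazy learning) can be sketched as follows. This is only an illustration under assumptions: the class names `EagerMeanLearner` and `LazyMeanLearner` are hypothetical, and the "model of the world" is reduced to the mean of a single sensor reading.

```python
class EagerMeanLearner:
    """Incremental, eager learning: update the model at once, then drop the example."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def learn(self, x):
        # Running-mean update; the example x is not stored afterwards.
        self.count += 1
        self.mean += (x - self.mean) / self.count

    def recall(self):
        # Recall is cheap: the model is already up to date.
        return self.mean


class LazyMeanLearner:
    """Lazy learning: keep every example and do the work only at recall time."""

    def __init__(self):
        self.examples = []

    def learn(self, x):
        # Learning is cheap: just remember the observation.
        self.examples.append(x)

    def recall(self):
        # The stored examples are processed when an answer is needed.
        return sum(self.examples) / len(self.examples)


eager, lazy = EagerMeanLearner(), LazyMeanLearner()
for reading in [2.0, 4.0, 6.0]:
    eager.learn(reading)
    lazy.learn(reading)
```

Both learners end up with the same estimate here; they differ in when the computation happens and in how much memory they keep, which is what matters when learning and recall share limited resources.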
Types of Learning Task
• Supervised Learning: there is a “teacher”
• Unsupervised Learning: autonomous
• Reinforcement Learning: the agent is given a (usually pre-defined) reward if the knowledge coming from learning proves useful for reaching the agent’s goals.

Learning to Coordinate
• Good coordination is crucial for good MAS (multi-agent system) performance.
• Example: soccer team.
• Coordination protocols are often difficult to define in advance.
• Needed: learning of coordination
• Idea: use reinforcement learning

Soccer Formation
Soccer Formation Control
• Formation control is a coordination problem.
• Good formations and set-plays seem to be a strong factor in winning teams.
• To date: pre-defined.
• Can (near-)optimal formations be learned by reinforcement learning? New ideas are being tried out...

Learning as Knowledge
Extraction

Extracting useful patterns from data:
Data Mining
Data Mining Taxonomy
Predictive Method
- …predict the value of a particular
attribute…
Descriptive Method
- …finding human-interpretable
patterns that describe the data…
Definition of Data Mining

“…The non-trivial process of
identifying valid, novel, potentially
useful, and ultimately understandable
patterns in data…”
Fayyad, Piatetsky-Shapiro, Smyth [1996]
Overview
• Introduction
• Data Mining Taxonomy
• Data Mining Models and Algorithms
• Quick Wins with Data Mining
• Privacy-Preserving Data Mining

Classification &
Regression
Classification:
…aim to identify the characteristics that
indicate the group to which each case
belongs…
Two Crows Corporation
Regression:
…uses existing values to forecast what
other values will be…
Two Crows Corporation
Clustering & Association
Clustering:
…divides a database into different groups…
…find groups that are very different from each
other, with similar members….
Two Crows Corporation
Association:
…involve determinations of affinity-how
frequently two or more things occur
together…
Two Crows Corporation
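As a rough illustration of "how frequently two or more things occur together", the sketch below counts how often each pair of items appears in the same transaction. The function name `pair_support` and the shopping-basket data are assumptions made for this example, not part of the quoted definition.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Return, for each unordered item pair, the fraction of transactions containing both."""
    counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so each pair is counted once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical shopping baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
support = pair_support(baskets)
```

A pair with high support is a candidate for an association; practical systems usually add further measures on top of this raw co-occurrence frequency.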
Deviation Detection & Pattern
Discovery
Deviation Detection:
…discovering most significant changes in data from
previously measured or normative values…
V. Kumar, M. Joshi, Tutorial on High Performance Data Mining.
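One simple way to picture deviation from "previously measured or normative values" is to flag new measurements that lie far from a baseline. The sketch below is an assumption-laden illustration, not the method of the cited tutorial: the function name `deviations`, the threshold `k`, and the data are all hypothetical.

```python
import statistics


def deviations(baseline, new_values, k=3.0):
    """Return the new values lying more than k standard deviations from the baseline mean."""
    mean = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation of the baseline
    return [v for v in new_values if abs(v - mean) > k * spread]


# Hypothetical previously measured values and new measurements.
baseline = [10, 11, 9, 10, 10]
flagged = deviations(baseline, [10.5, 15, 9.8, 4])
```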
Sequential Pattern Discovery:
…process of looking for patterns and rules that predict
strong sequential dependencies among different
events…
V. Kumar, M. Joshi, Tutorial on High Performance Data Mining.

Data Mining Models &
Algorithms
• Neural Networks
• Decision Trees
• Rule Induction
• K-nearest Neighbor
• Logistic regression
• Discriminant Analysis

Neural Networks
• efficiently model large and complex problems;
• may be used in classification problems or for regression;
• start with an input layer => hidden layer => output layer.
[Figure: a feed-forward network with nodes numbered 1 to 6, grouped into Inputs, Hidden Layer and Output]
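The flow from input layer to hidden layer to output layer can be sketched as a forward pass. The layer sizes (two inputs, three hidden units, one output), the sigmoid transform and all weight values below are assumptions chosen for illustration; the network is untrained and bias terms are left out to keep the sketch short.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def layer(inputs, weights):
    """Each unit takes a weighted sum of all its inputs and applies the sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)))
        for unit_weights in weights
    ]


def forward(inputs, hidden_weights, output_weights):
    """Propagate the inputs through the hidden layer, then through the output layer."""
    hidden = layer(inputs, hidden_weights)
    return layer(hidden, output_weights)


# Illustrative, untrained weights (assumed values).
hidden_weights = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]  # 3 hidden units, 2 inputs each
output_weights = [[0.7, -0.2, 0.4]]                      # 1 output unit, 3 hidden inputs

prediction = forward([1.0, 0.0], hidden_weights, output_weights)
```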
Neural Networks (cont.)
• can be easily implemented to run on massively parallel computers;
• cannot be easily interpreted;
• require an extensive amount of training time;
• require a lot of data preparation (involving very careful data cleansing, selection, preparation, and pre-processing);
• require a sufficiently large data set and a high signal-to-noise ratio.
Decision Trees
• a way of representing a series of rules that lead to a class or value;
• basic components of a decision tree: decision nodes, branches and leaves;
[Figure: example decision tree]
Income>40,000?
    No: Job>5?
        Yes: Low Risk
        No: High Risk
    Yes: High Debt?
        Yes: High Risk
        No: Low Risk
Decision Trees (cont.)
• handle non-numeric data very well;
• work best when the predictor variables are categorical;
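Because a decision tree is a series of rules, the example tree (income, job tenure, debt) can be written as nested tests. The sketch below is one reading of the flattened figure, so the assignment of outcomes to branches is an assumption, as are the function name `credit_risk` and the interpretation of "Job>5" as years in the current job.

```python
def credit_risk(income, years_in_job, high_debt):
    """Classify a case by walking the example tree from the root to a leaf."""
    if income > 40_000:
        # Right-hand subtree: decided by the debt level.
        return "High Risk" if high_debt else "Low Risk"
    # Left-hand subtree: decided by job tenure (assumed to be in years).
    return "Low Risk" if years_in_job > 5 else "High Risk"
```

Each path from the root to a leaf corresponds to one rule, for example "income above 40,000 and no high debt gives Low Risk".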
Rule Induction
• method of deriving a set of rules to classify cases;
• generates a set of independent rules which do not necessarily form a tree;
• the rules may not cover all possible situations;
• the rules may sometimes conflict in their predictions.
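The last two points, incomplete coverage and conflicting predictions, can be made concrete with a small sketch. The two rules and the name `classify` are hypothetical; a real rule-induction system would derive its rules from data rather than have them written by hand.

```python
# Hypothetical independent rules: (name, condition, predicted class).
RULES = [
    ("r1", lambda case: case["income"] > 40_000, "Low Risk"),
    ("r2", lambda case: case["high_debt"], "High Risk"),
]


def classify(case):
    """Return the set of classes predicted by all rules that fire for the case."""
    return {label for _, condition, label in RULES if condition(case)}


covered = classify({"income": 50_000, "high_debt": False})      # exactly one rule fires
conflicting = classify({"income": 50_000, "high_debt": True})   # both rules fire and disagree
uncovered = classify({"income": 10_000, "high_debt": False})    # no rule fires
```

An empty result means the case is not covered; a result with more than one class means the rules conflict and some resolution strategy is needed.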
K-nearest neighbor
• decides in which class to place a new case by examining some number of the most similar cases or neighbors;
• assigns the new case to the same class to which most of its neighbors belong;
[Figure: scatter plot of cases labelled X and Y around a new case N]
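The two steps above, finding the most similar cases and taking a majority vote, can be sketched in a few lines. Euclidean distance as the similarity measure, the default `k=3`, the function name `knn_classify` and the sample points are all assumptions made for this illustration.

```python
import math
from collections import Counter


def knn_classify(new_case, labelled_cases, k=3):
    """Assign new_case to the class held by most of its k nearest neighbours."""
    # Sort the known cases by Euclidean distance to the new case and keep the k closest.
    nearest = sorted(labelled_cases, key=lambda item: math.dist(item[0], new_case))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled cases: (coordinates, class).
cases = [
    ((1.0, 1.0), "X"),
    ((1.5, 2.0), "X"),
    ((2.0, 1.0), "X"),
    ((6.0, 6.0), "Y"),
    ((7.0, 5.5), "Y"),
]
label_near_x = knn_classify((1.2, 1.5), cases)
label_near_y = knn_classify((6.5, 6.0), cases)
```

The choice of k and of the distance measure strongly affects the result, so both are usually tuned for the data at hand.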
Artificial Neural
Networks
Introduction

What is neural computing/neural networks?
• The brain is a remarkable computer.
• It interprets imprecise information from the senses at an incredibly high speed.
Introduction
• A good example is the processing of visual information: a one-year-old baby is much better and faster at recognising objects, faces, and other visual features than even the most advanced AI system running on the fastest supercomputer.
• Most impressive of all, the brain learns (without any explicit instructions) to create the internal representations that make these skills possible.
Biological Neural Systems

• The brain is composed of approximately 100 billion (10^11) neurons.
• A typical neuron collects signals from other neurons through a host of fine structures called dendrites.
[Figure: schematic drawing of two biological neurons connected by synapses, with the axon, synapse and dendrites labelled]
• The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches.
• At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons.
• When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity down its axon.
• Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on the other changes.
What is a Neural Net?

A neural net simulates some of the learning functions
of the human brain. It can recognize patterns and
"learn." You can use it to forecast and make smarter
business decisions. It can also serve as an "expert
system" that simulates the thinking of an expert and
can offer advice. Unlike conventional rule-based
artificial-intelligence software, a neural net extracts
expertise from data automatically - no rules are
required.

In other words, through the use of a trial-and-error method, the system “learns” to become an “expert” in the field the user gives it to study.
Components Needed:

In order for a neural network to learn, it needs two basic components:
• Inputs
– any information the expert uses to determine his/her final decision or outcome.
• Outputs
– the decisions or outcomes arrived at by the expert that correspond to the inputs entered.
How does a neural network
learn?

• A neural network learns by determining the relation between the inputs and outputs.
• The system determines such relationships by calculating the relative importance of each input to the outputs.
• Through trial and error the system compares its results with the expert-provided results in the data until it has reached an accuracy level defined by the user.
• With each trial the weights assigned to the inputs are changed until the desired results are reached.
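The trial-and-error loop described above can be sketched with a single threshold unit. The slides do not name a specific update rule, so the classic perceptron rule is used here as a stand-in; the bias term, the learning rate, the function names and the logical-AND training data are likewise assumptions.

```python
def predict(weights, bias, inputs):
    """Threshold unit: output 1 if the weighted sum plus bias is positive, else 0."""
    a = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if a > 0 else 0


def train(examples, target_accuracy=1.0, rate=1.0, max_trials=1000):
    """Adjust the weights trial by trial until the requested accuracy is reached."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for trial in range(1, max_trials + 1):
        for inputs, expected in examples:
            # Compare the network's result with the expert-provided result.
            error = expected - predict(weights, bias, inputs)
            # Perceptron rule: shift each weight in proportion to the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
        correct = sum(predict(weights, bias, x) == y for x, y in examples)
        if correct / len(examples) >= target_accuracy:
            return weights, bias, trial
    return weights, bias, max_trials


# Expert-provided data: inputs and the corresponding decision (logical AND).
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias, trials = train(and_examples)
```

The `target_accuracy` parameter plays the role of the user-defined accuracy level; `max_trials` guards against data that a single unit cannot fit.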
Artificial Neural Networks

Artificial neurons are analogous to their biological inspirers.
[Figure: an artificial neuron. Inputs x1, x2, …, xN, weighted by w1, w2, …, wN, are summed to give the activation a, which passes through the transform function f to give the output y.]
Here the neuron is actually a processing unit: it calculates the weighted sum of the input signals to the neuron to generate the activation signal a, given by
$$a = \sum_{i=1}^{N} w_i x_i$$
where w_i is the strength of the synapse connected to the neuron and x_i is an input feature to the neuron.
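The weighted sum that produces the activation signal a can be written directly in code. The function name `activation` and the numeric values are illustrative assumptions.

```python
def activation(weights, inputs):
    """Weighted sum of the inputs: a = w1*x1 + w2*x2 + ... + wN*xN."""
    if len(weights) != len(inputs):
        raise ValueError("each input needs exactly one weight")
    return sum(w * x for w, x in zip(weights, inputs))


# Three inputs with assumed synapse strengths.
a = activation([0.5, -1.0, 2.0], [1.0, 2.0, 3.0])
```

A negative weight acts like an inhibitory synapse and a positive weight like an excitatory one.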
Artificial Neural Networks

The activation signal is passed through a transform function to
produce the output of the neuron, given by
$$y = f(a)$$

The transform function can be linear, or non-linear, such as a
threshold or sigmoid function [more later …].

For a linear function, the output y is proportional to the activation
signal a. For a threshold function, the output y is set at one of two
levels, depending on whether the activation signal a is greater than
or less than some threshold value. For a sigmoid function, the
output y varies continuously as the activation signal a changes.
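The three kinds of transform function can be sketched as follows. The parameter names (`slope`, `theta`, `low`, `high`) are assumptions, and the logistic function is used as one common example of a sigmoid; the slides do not say which sigmoid is meant.

```python
import math


def linear(a, slope=1.0):
    """Output proportional to the activation signal."""
    return slope * a


def threshold(a, theta=0.0, low=0.0, high=1.0):
    """Output set at one of two levels, depending on which side of theta a falls."""
    return high if a > theta else low


def sigmoid(a):
    """Logistic sigmoid: output varies smoothly between 0 and 1 as a changes."""
    return 1.0 / (1.0 + math.exp(-a))


outputs = {
    "linear": linear(0.5),
    "threshold": threshold(0.5),
    "sigmoid": sigmoid(0.5),
}
```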
Artificial Neural Networks

Artificial neural network models (or simply neural networks) are
typically composed of interconnected units or artificial neurons. How
the neurons are connected depends on some specific task that the
neural network performs.

Two key features of neural networks distinguish them from other sorts of computing developed to date:
• Neural networks are adaptive, or trainable
• Neural networks are naturally massively parallel
These features suggest the potential for neural network systems capable of learning, autonomously improving their own performance, adapting automatically to changing environments, being able to make decisions at high speed and being fault tolerant.
Neural Network
Architectures

Feed-forward single layered networks

Feed-forward multi-layer networks

Recurrent networks
Neural Network Applications
• Speech/Voice recognition
• Optical character recognition
• Face detection/Recognition
• Pronunciation (NETtalk)
• Stock-market prediction
• Navigation of a car
• Signal processing/Communication
• Imaging/Vision
• …
