Neural Networks Primer

Dr Bernie Domanski
The City University of New York / CSI
2800 Victory Blvd 1N-215
Staten Island, New York 10314
[email protected]
http://domanski.cs.csi.cuny.edu
What is a Neural Network?
Artificial Neural Networks (ANNs)
Provide a general, practical method for learning
– real-valued functions
– discrete-valued functions
– vector-valued functions
from examples
Learning algorithms "tune" the input weights to best fit a training set of input-output pairs
© B. Domanski, 2000-2001. All Rights Reserved.
Slide 2
What is a Neural Network?
• ANN learning is robust to errors in the training-set data
• ANNs have been applied to problems like:
– interpreting visual scenes
– speech recognition
– learning robot control strategies
– recognizing handwriting
– face recognition
Biological Motivation
• ANNs are built out of a densely interconnected set of simple units (neurons)
• Each neuron takes a number of real-valued inputs and produces a single real-valued output
• Inputs to a neuron may be the outputs of other neurons
• A neuron's output may be used as input to many other neurons
Biological Analogy
Human brain: ~10^11 neurons
Each neuron is connected to ~10^4 other neurons
Neuron activity is inhibited or excited through interconnections with other neurons
Neuron switching time: ~10^-3 seconds (human)
Time to recognize mom: ~10^-1 seconds
This implies only several hundred neuron firings
Complexity of the Biological System
• Speculation: highly parallel processes must be operating on representations that are distributed over many neurons, since human neuron switching speeds are slow
• The motivation for ANNs is to capture this highly parallel computation based on a distributed representation
A Simple Neural Net Example
[Figure: input nodes linked to neurons; each link carries a weight, and the neurons produce an output]
How Does the Network Work?
Assign a weight to each input link
Multiply each weight by its input value (0 or 1)
Sum all the weighted inputs
If the sum > the neuron's threshold, then output = +1, else output = -1

[Figure: a single neuron computing OR – inputs X and Y with weights 100 and 100, threshold 99]

X  Y  output
0  0   -1
0  1   +1
1  0   +1
1  1   +1

[Figure: a two-layer network computing Exclusive-OR – a hidden neuron Z with weights W1 = 50, W2 = 50 and threshold 99, feeding an output neuron with weights W3 = 30, W4 = 30, W5 = -30 and threshold 59]

So for the X = 1, Y = 1 case –
IF w1*X + w2*Y > 99 THEN OUTPUT = Z = +1
   50*1 + 50*1 > 99  ✓
IF w3*X + w4*Y + w5*Z > 59 THEN OUTPUT = +1 ELSE OUTPUT = -1
   30*1 + 30*1 + (-30)*1 > 59  ✗  so OUTPUT = -1

X  Y  output
0  0   -1
0  1   +1
1  0   +1
1  1   -1
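The forward pass just described can be sketched in a few lines of Python; this is a minimal illustration, with the weights and thresholds taken from the slide:

```python
# A threshold unit: output +1 if the weighted sum of the inputs
# exceeds the threshold, otherwise -1.
def unit(weights, inputs, threshold):
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else -1

# Two-layer Exclusive-OR network from the figure:
# hidden neuron Z (weights 50, 50, threshold 99) feeds the
# output neuron (weights 30, 30, -30, threshold 59).
def xor_net(x, y):
    z = unit([50, 50], [x, y], 99)
    return unit([30, 30, -30], [x, y, z], 59)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "->", xor_net(x, y))
```

Note that the hidden unit fires only for X = Y = 1 (50 + 50 > 99), and its -30 link then suppresses the output neuron; that is what lets the two-layer net compute XOR, which no single threshold unit can.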
Appropriate Problems for Neural Networks
• Instances represented by vectors of many defined features (e.g., measurements)
• Output may be a discrete value or a vector of discrete values
• Training examples may contain errors
• Non-trivial training sets imply non-trivial training time
• Very fast application of the learned network to subsequent instances
• We don't have to understand the learned function – only apply the learned rules
How Are ANNs Trained?
• Initially –
– choose small random weights (wi)
– set the threshold = 1
– choose a small learning rate (r)
• Apply each member of the training set to the neural net model, using the training rule to adjust the weights
The Training Rule Explained
• Modify the weights (wi) according to the Training Rule:
wi = wi + Δwi   where
Δwi = r * (t – a) * xi
• Here –
r is the learning rate (e.g., 0.2)
t = target output
a = actual output
xi = the i-th input value
Training for ‘OR’
Training Set:
X1  X2  target
0   0    -1
0   1    +1
1   0    +1
1   1    +1

Initial random weights: W1 = .3, W2 = .7
Learning rate: r = .2
Applying the Training Set for OR - 1
Starting weights: w1 = .3, w2 = .7 (threshold = 1)

0 0 → -1  ✓
0 1 → -1  ✗ (target is +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 0 = 0
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 1 = .4

w1 = w1 + Δw1 = .3 + 0 = .3
w2 = w2 + Δw2 = .7 + .4 = 1.1
Applying the Training Set for OR - 2
Starting weights: w1 = .3, w2 = 1.1

0 0 → -1  ✓
0 1 → +1  ✓
1 0 → -1  ✗ (target is +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 1 = .4
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 0 = 0

w1 = w1 + Δw1 = .3 + .4 = .7
w2 = w2 + Δw2 = 1.1 + 0 = 1.1
Applying the Training Set for OR - 3
Starting weights: w1 = .7, w2 = 1.1

0 0 → -1  ✓
0 1 → +1  ✓
1 0 → -1  ✗ (target is +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 1 = .4
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 0 = 0

w1 = w1 + Δw1 = .7 + .4 = 1.1
w2 = w2 + Δw2 = 1.1 + 0 = 1.1
Applying the Training Set for OR - 4
Final weights: w1 = 1.1, w2 = 1.1

0 0 → -1  ✓
0 1 → +1  ✓
1 0 → +1  ✓
1 1 → +1  ✓

All four training examples are now classified correctly, so training is complete.
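The updates above can be reproduced with a short training loop. This is a sketch using the slides' parameters (threshold 1, learning rate .2, initial weights .3 and .7), sweeping the training set until a full pass makes no errors:

```python
# Perceptron-style trainer mirroring the slides' procedure:
# on each misclassified example, apply dw_i = r * (t - a) * x_i.
def fire(weights, inputs, threshold=1.0):
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s > threshold else -1

def train(examples, weights, r=0.2, max_passes=100):
    for _ in range(max_passes):
        errors = 0
        for inputs, target in examples:
            actual = fire(weights, inputs)
            if actual != target:
                errors += 1
                weights = [w + r * (target - actual) * x
                           for w, x in zip(weights, inputs)]
        if errors == 0:          # a clean pass: training is done
            break
    return weights

OR_SET = [((0, 0), -1), ((0, 1), +1), ((1, 0), +1), ((1, 1), +1)]
w = train(OR_SET, [0.3, 0.7])
print([round(wi, 1) for wi in w])  # ends at the slides' final weights, 1.1 and 1.1
```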
Training for ‘AND’
Training Set:
X1  X2  target
0   0    -1
0   1    -1
1   0    -1
1   1    +1

Initial random weights: W1 = .3, W2 = .7
Learning rate: r = .2
Applying the Training Set for AND - 1
Starting weights: w1 = .3, w2 = .7 (threshold = 1)

0 0 → -1  ✓
0 1 → -1  ✓
1 0 → -1  ✓
1 1 → -1  ✗ (target is +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 1 = .4
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 1 = .4

w1 = w1 + Δw1 = .3 + .4 = .7
w2 = w2 + Δw2 = .7 + .4 = 1.1
Applying the Training Set for AND - 2
Starting weights: w1 = .7, w2 = 1.1

0 0 → -1  ✓
0 1 → +1  ✗ (target is -1)

Δw1 = r * (t – a) * x1 = .2 * (-1 – (+1)) * 0 = 0
Δw2 = r * (t – a) * x2 = .2 * (-1 – (+1)) * 1 = -.4

w1 = w1 + Δw1 = .7 + 0 = .7
w2 = w2 + Δw2 = 1.1 – .4 = .7
Applying the Training Set for AND - 3
Final weights: w1 = .7, w2 = .7

0 0 → -1  ✓
0 1 → -1  ✓
1 0 → -1  ✓
1 1 → +1  ✓

All four training examples are now classified correctly, so training is complete.
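The AND run can be checked the same way. Again a sketch with the slides' parameters (threshold 1, learning rate .2, initial weights .3 and .7); after the two corrections shown above, every pass is clean:

```python
# The same per-example training rule, applied to the AND training set.
def fire(w1, w2, x1, x2, threshold=1.0):
    return 1 if w1 * x1 + w2 * x2 > threshold else -1

AND_SET = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), +1)]
w1, w2, r = 0.3, 0.7, 0.2
for _ in range(10):                  # a handful of passes is plenty here
    for (x1, x2), t in AND_SET:
        a = fire(w1, w2, x1, x2)
        w1 += r * (t - a) * x1       # no change when a == t
        w2 += r * (t - a) * x2
print(round(w1, 1), round(w2, 1))    # settles at the slides' final weights, .7 and .7
```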
Applying the Technology
Date       #Trans  CPUBusy  RespTime  DiskBusy  NetBusy
01-Oct-93      28        3         9        71        3
02-Oct-93     140       80         6        90        4
03-Oct-93     156       87         4        12        5
04-Oct-93     187       95         7        69        5
05-Oct-93     226       40         0        16        5
06-Oct-93     288       16         5        40        6
07-Oct-93     309       10         2        64        6
08-Oct-93     449       84         4        18        8
09-Oct-93     453       89         3        32        8
10-Oct-93     481       77         2        44        8
11-Oct-93     535       23         8        61        8
12-Oct-93     609       37         3        86        9
13-Oct-93     658       58         9        51        9
14-Oct-93     739       33         8        25        9
15-Oct-93     776       25         1        34       10
Select The Data Set
Choose data for the
Neugent
Select The Output That You Want to Predict
Choose Inputs
Identify the
Outputs
Train And Validate the Neugent
Choose Action to be
performed:
• Create the model
(Quick Train)
• Train & Validate (to
understand the
predictive capability)
• Investigate the data
(Export to Excel or
Data Analysis)
Validate the Neugent With the Data Set
Selecting Training Data –
• Select a random sample percentage
• Use the entire data set
Neugent Model is Trained, Tested, and
Validated
Training Results –
•Model Fit: 99.598%
(trained model quality)
•Predictive Capability: 99.598%
(tested model quality)
View The Results in Excel
Consult trained Neugent for prediction
Save results using Excel
Data Analysis
Stats & Filtering:
mean, min, max, std
dev, filtering
constraints
Ranking:
input significance
Correlation Matrix:
corr. between all fields
Correlation Matrix
The closer to 1, the stronger the indication that the
information represented by the two fields is the same
NetBusy vs #Trans = .9966
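As a sketch of what the correlation matrix reports, the Pearson coefficient can be computed directly on the 15-day sample from the "Applying the Technology" table (on this small sample it comes out somewhat lower than the .9966 quoted, which presumably reflects the full data set):

```python
# Pearson correlation between #Trans and NetBusy, using the 15 rows
# from the slide's data table.
from math import sqrt

trans   = [28, 140, 156, 187, 226, 288, 309, 449, 453, 481, 535, 609, 658, 739, 776]
netbusy = [3, 4, 5, 5, 5, 6, 6, 8, 8, 8, 8, 9, 9, 9, 10]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(round(pearson(trans, netbusy), 4))
```

The closer the coefficient is to 1, the stronger the indication that the two fields carry the same information; here network busy rises almost in lockstep with transaction volume.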
Actual Vs Predicted
[Chart: "Net Busy: Actual Vs Predicted" – daily Usage (0–120) from 10/1/93 through 2/18/94, with curves for NetBusy_actual and NetBusy_predicted]
Actual Vs Predicted
Test results:
label        NetBusy_actual  NetBusy_predicted
01-Oct-93          3              3.93916
02-Oct-93          4              3.55748
03-Oct-93          5              6.07377
04-Oct-93          5              5.22928
05-Oct-93          5              4.69116
06-Oct-93          6              4.7997
07-Oct-93          6              7.08912
08-Oct-93          8              7.35073
09-Oct-93          8              5.7424
10-Oct-93          8              7.92246
11-Oct-93          8              7.94412
12-Oct-93          9              9.02078
13-Oct-93          9              9.51989
14-Oct-93          9              8.7129
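One simple way to summarize the table's agreement (not necessarily the metric Neugent itself reports) is the mean absolute prediction error over these 14 test rows:

```python
# Mean absolute error between NetBusy_actual and NetBusy_predicted
# for the 14 test rows shown above.
actual    = [3, 4, 5, 5, 5, 6, 6, 8, 8, 8, 8, 9, 9, 9]
predicted = [3.93916, 3.55748, 6.07377, 5.22928, 4.69116, 4.7997, 7.08912,
             7.35073, 5.7424, 7.92246, 7.94412, 9.02078, 9.51989, 8.7129]

mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(round(mae, 3))  # well under one NetBusy unit on average
```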
Summary
Neural Networks
 Modeled after neurons in the brain
 Artificial neurons are simple
 Neurons can be trained
 Networks of neurons can be taught how to
respond to input
 Models can be built quickly
 Accurate predictions can be made
Questions?
 Questions, comments, … ??
 Finding me –
Dr Bernie Domanski
 Email: [email protected]
 Website: http://domanski.cs.csi.cuny.edu
 Phone: (718) 982-2850 Fax: 2356
 Thanks for coming and listening !