Neural Networks Primer
Dr. Bernie Domanski
The City University of New York / CSI
2800 Victory Blvd 1N-215, Staten Island, New York 10314
[email protected]
http://domanski.cs.csi.cuny.edu
© B. Domanski, 2000-2001. All Rights Reserved.
What is a Neural Network?
Artificial Neural Networks (ANNs) provide a general, practical method for learning real-valued, discrete-valued, and vector-valued functions from examples.
Algorithms "tune" input parameters to best fit a training set of input-output pairs.
What is a Neural Network?
ANN learning is robust to errors in the training-set data.
ANNs have been applied to problems like:
• Interpreting visual scenes
• Speech recognition
• Learning robot control strategies
• Recognizing handwriting
• Face recognition
Biological Motivation
ANNs are built out of a densely interconnected set of simple units (neurons).
• Each neuron takes a number of real-valued inputs
• Each neuron produces a single real-valued output
• Inputs to a neuron may be the outputs of other neurons
• A neuron's output may be used as input to many other neurons
Biological Analogy
Human brain: ~10^11 neurons, each connected to ~10^4 other neurons.
Neuron activity is inhibited or excited through interconnections with other neurons.
Neuron switching time ≈ 10^-3 seconds (human); time to recognize mom ≈ 10^-1 seconds.
This implies only several hundred neuron firings.
Complexity of the Biological System
Speculation: since human neuron switching speeds are slow, highly parallel processes must be operating on representations that are distributed over many neurons.
The motivation for ANNs is to capture this highly parallel computation based on a distributed representation.
A Simple Neural Net Example
[Figure: a simple network - input nodes X and Y, each connected to an output neuron by a link; each link carries a weight.]
How Does the Network Work?
• Assign a weight to each input link
• Multiply each weight by the input value (0 or 1)
• Sum all the weight-input combinations
• If Sum > Threshold for the neuron, then Output = +1; else Output = -1

So for the X = 1, Y = 1 case:
IF w1*X + w2*Y > 99 THEN OUTPUT = Z = +1 (here 50*1 + 50*1 > 99, so Z = +1)
IF w3*X + w4*Y + w5*Z > 59 THEN OUTPUT = +1, ELSE OUTPUT = -1
(here 30*1 + 30*1 + (-30)*1 = 30, which is not > 59, so OUTPUT = -1)
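The two-layer threshold computation above can be sketched directly in Python (a minimal illustration using the weights and thresholds from the slides; the function names are mine, and outputs use the slides' +1/-1 encoding):

```python
# Sketch of the two-layer threshold network from the slides:
# hidden weights w1 = w2 = 50 (threshold 99),
# output weights w3 = w4 = 30, w5 = -30 (threshold 59).

def fire(weighted_sum, threshold):
    """A unit outputs +1 when its weighted input sum exceeds its threshold."""
    return 1 if weighted_sum > threshold else -1

def xor_net(x, y):
    # Hidden neuron Z fires (+1) only for the (1, 1) input.
    z = fire(50 * x + 50 * y, 99)
    # Output neuron combines the raw inputs with Z's +1/-1 output.
    return fire(30 * x + 30 * y + (-30) * z, 59)

for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, y, "->", xor_net(x, y))  # reproduces the Exclusive-OR truth table
```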
[Figure: OR network - inputs X and Y feed the output neuron through links with weights 100 and 100; the neuron's threshold is 99.]

OR truth table:
X  Y  output
0  0  -1
0  1  +1
1  0  +1
1  1  +1

[Figure: Exclusive-OR network - inputs X and Y feed a hidden neuron Z through links with weights W1 = 50 and W2 = 50 (threshold 99); X, Y, and Z feed the output neuron through links with weights W3 = 30, W4 = 30, and W5 = -30 (threshold 59).]

Exclusive-OR truth table:
X  Y  output
0  0  -1
0  1  +1
1  0  +1
1  1  -1
Appropriate Problems for Neural Networks
• Instances where there are vectors of many defined features (e.g., measurements)
• Output may be a discrete value or a vector of discrete values
• Training examples may contain errors
• Non-trivial training sets imply non-trivial time for training
• Very fast application of the learned network to a subsequent instance
• We don't have to understand the learned function - only the learned rules
How Are ANNs Trained?
• Initially choose small random weights (wi)
• Set threshold = 1
• Choose a small learning rate (r)
• Apply each member of the training set to the neural net model, using the training rule to adjust the weights
The Training Rule Explained
Modify the weights (wi) according to the Training Rule:

wi = wi + Δwi, where Δwi = r * (t - a) * xi

Here:
• r is the learning rate (e.g., 0.2)
• t = target output
• a = actual output
• xi = i-th input value
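A one-line sketch of applying the rule (the function name and list representation are illustrative, not from the slides):

```python
# Training rule: delta_w_i = r * (t - a) * x_i, then w_i += delta_w_i.
def update_weights(weights, inputs, target, actual, r=0.2):
    return [w + r * (target - actual) * x for w, x in zip(weights, inputs)]

# Example values from the OR walkthrough that follows:
# input (0, 1), target +1, actual -1, weights (.3, .7).
w = update_weights([0.3, 0.7], [0, 1], target=1, actual=-1)
print(w)  # w1 is unchanged (x1 = 0); w2 gains .2 * 2 * 1 = .4
```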
Training for ‘OR’
Training set:
X1  X2  target
0   0   -1
0   1   +1
1   0   +1
1   1   +1

Initial random weights: W1 = .3, W2 = .7
Learning rate: r = .2
Applying the Training Set for OR - 1
Current weights: w1 = .3, w2 = .7

X1  X2  a
0   0   = -1
0   1   = -1  X (target is +1)

Δw1 = r * (t - a) * x1 = .2 * (1 - (-1)) * 0 = .2 * (2) * 0 = 0
Δw2 = .2 * (1 - (-1)) * 1 = .2 * (2) * 1 = .4

w1 = w1 + Δw1 = .3 + 0 = .3
w2 = w2 + Δw2 = .7 + .4 = 1.1
Applying the Training Set for OR - 2
Current weights: w1 = .3, w2 = 1.1

X1  X2  a
0   0   = -1
0   1   = +1
1   0   = -1  X (target is +1)

Δw1 = r * (t - a) * x1 = .2 * (1 - (-1)) * 1 = .2 * (2) * 1 = .4
Δw2 = .2 * (1 - (-1)) * 0 = .2 * (2) * 0 = 0

w1 = w1 + Δw1 = .3 + .4 = .7
w2 = w2 + Δw2 = 1.1 + 0 = 1.1
Applying the Training Set for OR - 3
Current weights: w1 = .7, w2 = 1.1

X1  X2  a
0   0   = -1
0   1   = +1
1   0   = -1  X (target is +1)

Δw1 = r * (t - a) * x1 = .2 * (1 - (-1)) * 1 = .2 * (2) * 1 = .4
Δw2 = .2 * (1 - (-1)) * 0 = .2 * (2) * 0 = 0

w1 = w1 + Δw1 = .7 + .4 = 1.1
w2 = w2 + Δw2 = 1.1 + 0 = 1.1
Applying the Training Set for OR - 4
Final weights: w1 = 1.1, w2 = 1.1

X1  X2  a
0   0   = -1
0   1   = +1
1   0   = +1
1   1   = +1

All outputs now match the targets - training is complete.
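The walkthrough above can be reproduced with a short training loop. This is a sketch: the slides show one correction per pass, while the loop below sweeps the whole training set each epoch, but it arrives at the same final weights (1.1, 1.1):

```python
# Epoch-style perceptron training with the slides' OR setup:
# threshold 1, learning rate r = .2, initial weights (.3, .7).
def train(examples, w, threshold=1.0, r=0.2, max_epochs=100):
    for _ in range(max_epochs):
        changed = False
        for (x1, x2), t in examples:
            # Neuron fires +1 when the weighted sum exceeds the threshold.
            a = 1 if w[0] * x1 + w[1] * x2 > threshold else -1
            if a != t:
                # Training rule: w_i += r * (t - a) * x_i
                w = [w[0] + r * (t - a) * x1, w[1] + r * (t - a) * x2]
                changed = True
        if not changed:       # every example classified correctly
            return w
    return w

or_set = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train(or_set, [0.3, 0.7]))  # converges to weights near (1.1, 1.1)
```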
Training for ‘AND’
Training set:
X1  X2  target
0   0   -1
0   1   -1
1   0   -1
1   1   +1

Initial random weights: W1 = .3, W2 = .7
Learning rate: r = .2
Applying the Training Set for AND - 1
Current weights: w1 = .3, w2 = .7

X1  X2  a
0   0   = -1
0   1   = -1
1   0   = -1
1   1   = -1  X (target is +1)

Δw1 = r * (t - a) * x1 = .2 * (1 - (-1)) * 1 = .4
Δw2 = .2 * (1 - (-1)) * 1 = .4

w1 = w1 + Δw1 = .3 + .4 = .7
w2 = w2 + Δw2 = .7 + .4 = 1.1
Applying the Training Set for AND - 2
Current weights: w1 = .7, w2 = 1.1

X1  X2  a
0   0   = -1
0   1   = +1  X (target is -1)

Δw1 = r * (t - a) * x1 = .2 * (-1 - (+1)) * 0 = 0
Δw2 = .2 * (-1 - (+1)) * 1 = -.4

w1 = w1 + Δw1 = .7 + 0 = .7
w2 = w2 + Δw2 = 1.1 - .4 = .7
Applying the Training Set for AND - 3
Final weights: w1 = .7, w2 = .7

X1  X2  a
0   0   = -1
0   1   = -1
1   0   = -1
1   1   = +1

All outputs now match the targets - training is complete.
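The same epoch-style training sketch, applied to the AND targets with the same starting point (a sketch under the same caveat as before: the slides correct one example per pass, but the loop settles at the same final weights, .7 and .7):

```python
# Epoch-style perceptron training with the slides' AND setup:
# threshold 1, learning rate r = .2, initial weights (.3, .7).
def train(examples, w, threshold=1.0, r=0.2, max_epochs=100):
    for _ in range(max_epochs):
        changed = False
        for (x1, x2), t in examples:
            a = 1 if w[0] * x1 + w[1] * x2 > threshold else -1
            if a != t:
                w = [w[0] + r * (t - a) * x1, w[1] + r * (t - a) * x2]
                changed = True
        if not changed:
            return w
    return w

and_set = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
print(train(and_set, [0.3, 0.7]))  # converges to weights near (.7, .7)
```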
Applying the Technology
Date       #Trans  CPUBusy  RespTime  DiskBusy  NetBusy
01-Oct-93      28        3         9        71        3
02-Oct-93     140       80         6        90        4
03-Oct-93     156       87         4        12        5
04-Oct-93     187       95         7        69        5
05-Oct-93     226       40         0        16        5
06-Oct-93     288       16         5        40        6
07-Oct-93     309       10         2        64        6
08-Oct-93     449       84         4        18        8
09-Oct-93     453       89         3        32        8
10-Oct-93     481       77         2        44        8
11-Oct-93     535       23         8        61        8
12-Oct-93     609       37         3        86        9
13-Oct-93     658       58         9        51        9
14-Oct-93     739       33         8        25        9
15-Oct-93     776       25         1        34       10
Select The Data Set
Choose data for the Neugent
Select The Output That You Want to Predict
Choose Inputs
Identify the Outputs
Train And Validate the Neugent
Choose Action to be performed:
• Create the model (Quick Train)
• Train & Validate (to understand the predictive capability)
• Investigate the data (Export to Excel or Data Analysis)
Validate the Neugent With the Data Set
Selecting training data:
• Select a random sample percentage, or
• Use the entire data set
Neugent Model is Trained, Tested, and Validated
Training Results –
• Model Fit: 99.598% (trained model quality)
• Predictive Capability: 99.598% (tested model quality)
View The Results in Excel
Consult trained Neugent for prediction
Save results using Excel
Data Analysis
• Stats & Filtering: mean, min, max, std dev, filtering constraints
• Ranking: input significance
• Correlation Matrix: correlation between all fields
Correlation Matrix
The closer the value is to 1, the stronger the indication that the information represented by the two fields is the same.
NetBusy vs. #Trans = .9966
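The slides do not say which correlation the tool computes; assuming the standard Pearson coefficient, a minimal sketch (with made-up series, not the slide data):

```python
# Pearson's r for two equal-length series: covariance divided by the
# product of the series' standard deviations (scale factors cancel).
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A perfectly linear pair of series gives a value at (or within float
# error of) 1, matching the "closer to 1" reading above.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
```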
Actual Vs Predicted
[Chart: "Net Busy: Actual Vs Predicted" - usage (0-120) plotted by date from 10/1/93 through 2/18/94, with series NetBusy_actual and NetBusy_predicted.]
Actual Vs Predicted
Test results:

label      NetBusy_actual  NetBusy_predicted
01-Oct-93  3               3.93916
02-Oct-93  4               3.55748
03-Oct-93  5               6.07377
04-Oct-93  5               5.22928
05-Oct-93  5               4.69116
06-Oct-93  6               4.7997
07-Oct-93  6               7.08912
08-Oct-93  8               7.35073
09-Oct-93  8               5.7424
10-Oct-93  8               7.92246
11-Oct-93  8               7.94412
12-Oct-93  9               9.02078
13-Oct-93  9               9.51989
14-Oct-93  9               8.7129
Summary
• Neural networks are modeled after neurons in the brain
• Artificial neurons are simple
• Neurons can be trained
• Networks of neurons can be taught how to respond to input
• Models can be built quickly
• Accurate predictions can be made
Questions?
Questions, comments, … ??
Finding me -
Dr. Bernie Domanski
Email: [email protected]
Website: http://domanski.cs.csi.cuny.edu
Phone: (718) 982-2850
Fax: 2356
Thanks for coming and listening !