THE PERCEPTRON: (Classification)
Threshold unit:
$o^\mu = \Theta\!\left(\sum_j w_j x_j^\mu - \theta\right)$
where $o^\mu$ is the output for input pattern $x^\mu$, $w_j$ are the synaptic weights and $y^\mu$ is the desired output.
[Diagram: a threshold unit with five inputs and weights $w_1 \dots w_5$.]
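As a concrete illustration (the values chosen here are my own, not from the slides), such a unit is a few lines of MATLAB:

    % Evaluate a threshold unit: o = 1 if w'*x exceeds the threshold.
    w = [1; 1];                 % synaptic weights
    theta = 1.5;                % threshold
    x = [1; 1];                 % one input pattern
    o = (w' * x - theta) > 0;   % Heaviside step
    disp(o)                     % prints 1 for this pattern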
AND:
x1 x2 | y
 1  1 | 1
 1  0 | 0
 0  1 | 0
 0  0 | 0
[Plot: AND in the $(x_1, x_2)$ plane; a line separates (1,1) from the other points. Weights $w_1 = w_2 = 1$, bias $-1.5$.]
Linearly separable.
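A quick MATLAB check (illustrative) that the weights shown above reproduce the AND truth table:

    % Threshold unit with w = [1 1] and bias -1.5 on all binary inputs.
    X = [1 1; 1 0; 0 1; 0 0];
    o = (X * [1; 1] - 1.5) > 0;
    disp(o')    % prints 1 0 0 0: the AND column of the table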
OR:
x1 x2 | y
 1  1 | 1
 1  0 | 1
 0  1 | 1
 0  0 | 0
[Plot: OR in the $(x_1, x_2)$ plane; a line separates (0,0) from the other points. Weights $w_1 = w_2 = 1$, bias $-0.5$.]
Linearly separable.
XOR:
x1 x2 | y
 1  1 | 0
 1  0 | 1
 0  1 | 1
 0  0 | 0
[Plot: XOR in the $(x_1, x_2)$ plane.]
Linearly separable?
[Two plots: an example of a linearly separable problem and an example of a linearly inseparable one.]
A linearly inseparable problem is not necessarily a difficult one.
Perceptron learning rule:
$\Delta w_j = \eta\,(y^\mu - o^\mu)\,x_j^\mu$
[Diagram: a threshold unit with weights $w_1 \dots w_5$.]
Convergence proof: see Hertz, Krogh & Palmer (HKP).
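As a minimal sketch of the rule in MATLAB (my own illustration on the AND problem, with an arbitrary learning rate; not from the slides):

    % Online perceptron learning on the AND problem.
    X = [1 1; 1 0; 0 1; 0 0];      % input patterns, one per row
    y = [1; 0; 0; 0];              % desired outputs
    Xb = [X ones(4,1)];            % constant input 1 carries the bias weight
    w = zeros(3,1); eta = 0.1;     % weights (incl. bias) and learning rate
    for epoch = 1:100
        errs = 0;
        for mu = 1:4
            o = (Xb(mu,:) * w) > 0;                 % threshold unit output
            w = w + eta * (y(mu) - o) * Xb(mu,:)';  % perceptron update
            errs = errs + (o ~= y(mu));
        end
        if errs == 0, break; end   % converged: all patterns correct
    end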
Assignment 3a: program in MATLAB a perceptron with the perceptron learning rule and solve the OR, AND and XOR problems. (Due Feb 8th)
Show Demo
Summary – what can perceptrons do and how?
Linear single-layer network (approximation, curve fitting):
Linear unit:
$o^\mu = \sum_j w_j x_j^\mu$
where $o^\mu$ is the output for input pattern $x^\mu$, $w_j$ are the synaptic weights and $y^\mu$ is the desired output.
[Diagram: a linear unit with weights $w_1 \dots w_5$.]
Minimize the mean square error:
$E = \frac{1}{2}\sum_{\mu=1}^{P}\left(y^\mu - o^\mu\right)^2$ or, equivalently up to a constant factor, $E = \frac{1}{2P}\sum_{\mu=1}^{P}\left(y^\mu - o^\mu\right)^2$
The best solution is obtained when E is minimal.
For linear neurons there is an exact solution to this, called the pseudo-inverse (see HKP).
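For instance, a minimal sketch using MATLAB's built-in pinv (the data here is my own illustration):

    % Exact least-squares weights for a linear unit via the pseudo-inverse.
    X = rand(20, 3);          % 20 input patterns, 3 inputs each
    y = X * [2; -1; 0.5];     % targets generated by a known linear map
    w = pinv(X) * y;          % minimizes E = 0.5 * sum((y - X*w).^2)
    disp(norm(y - X * w))     % ~0: recovers the generating weights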
Looking for a solution by gradient descent:
$\Delta w_j = -\eta \frac{\partial E}{\partial w_j}$
[Plot: $E$ as a function of $w$; the negative gradient points downhill toward the minimum.]
Chain rule:
$\frac{\partial E}{\partial w_j} = \sum_\mu \frac{\partial E}{\partial o^\mu}\,\frac{\partial o^\mu}{\partial w_j}$
Since $o^\mu = \sum_k w_k x_k^\mu$:
$\frac{\partial o^\mu}{\partial w_j} = x_j^\mu$
Error: $E = \frac{1}{2}\sum_\mu \left(y^\mu - o^\mu\right)^2$, so $\frac{\partial E}{\partial o^\mu} = -\left(y^\mu - o^\mu\right)$
Therefore:
$\Delta w_j = \eta \sum_\mu \left(y^\mu - o^\mu\right) x_j^\mu$
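In MATLAB this batch update (the delta rule) is a one-line step inside a loop (an illustrative sketch with arbitrary data and learning rate):

    % Gradient descent for a linear unit.
    X = rand(20, 3); y = X * [2; -1; 0.5];   % illustrative data
    w = zeros(3, 1); eta = 0.02;
    for t = 1:2000
        o = X * w;                   % outputs for all patterns
        w = w + eta * X' * (y - o);  % dw_j = eta * sum_mu (y - o) * x_j
    end
    disp(norm(y - X * w))            % should be small after convergence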
Which types of problems can a linear network solve?
Sigmoidal neurons:
$o^\mu = \sigma\!\left(\sum_j w_j x_j^\mu\right)$, with $\sigma(x) = \frac{1}{1 + \exp(-\beta x)}$
Which types of problems can a sigmoidal network solve?
Assignment 3b: implement a one-layer linear network and a one-layer sigmoidal network; fit, in 1D, a linear, a sigmoid and a quadratic function with both networks.
For example:
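A minimal sketch (my own starting point, not the assignment solution) of gradient descent for a single sigmoidal unit fitting a 1D sigmoid target:

    % Fit a 1D sigmoid target with one sigmoidal unit (weight + bias).
    beta = 1; sig = @(x) 1 ./ (1 + exp(-beta * x));
    xs = linspace(0, 1, 21)';       % 1D inputs
    X = [xs ones(size(xs))];        % second column is the bias input
    y = sig(4 * xs - 2);            % example target curve
    w = 0.1 * randn(2, 1); eta = 0.5;
    for t = 1:5000
        o = sig(X * w);                            % unit outputs
        delta = (y - o) .* beta .* o .* (1 - o);   % error * sigma'
        w = w + eta * X' * delta;                  % gradient step
    end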
Multi-layer networks:
• Can solve classification problems that are not linearly separable.
• Can approximate an arbitrary function, given 'enough' units in the hidden layer.
[Diagram: a feed-forward network with an input layer, a hidden layer, and an output layer.]
Multi-layer networks:
The sigmoid activation:
$\sigma(x) = \frac{1}{1 + \exp(-\beta x)}$
[Plot: example sigmoid.]
Network output:
$o^\mu = \sigma\!\left(\sum_k w_k^2\, \sigma\!\left(\sum_j w_{k,j}^1 x_j^\mu\right)\right)$
Note: $w^1$ is not a vector but a matrix.
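The formula transcribes directly into a MATLAB forward pass (the sizes here are my own arbitrary illustration):

    % Forward pass of a two-layer sigmoidal network.
    beta = 1; sig = @(x) 1 ./ (1 + exp(-beta * x));
    W1 = randn(4, 3);      % w^1: 4 hidden units x 3 inputs -- a matrix
    w2 = randn(1, 4);      % w^2: hidden-to-output weights -- a vector
    x = rand(3, 1);        % one input pattern
    V = sig(W1 * x);       % hidden-layer outputs
    o = sig(w2 * V);       % network output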
Solving linearly inseparable problems
XOR:
x1 x2 | y
 1  1 | 0
 1  0 | 1
 0  1 | 1
 0  0 | 0
Hint: XOR = (x1 OR x2) AND NOT (x1 AND x2).
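The hint is easy to verify over all four binary inputs (a one-off illustrative check):

    % Verify XOR = (x1 OR x2) AND NOT (x1 AND x2).
    x1 = [1 1 0 0]; x2 = [1 0 1 0];
    disp((x1 | x2) & ~(x1 & x2))    % prints 0 1 1 0, the XOR truth table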
How do we learn a multi-layer network? The credit assignment problem!
[Diagram: a two-layer network solving XOR with hand-set weights; the values shown include 1, -1, 0.5 and -0.5, with the XOR truth table alongside.]
Gradient descent / back-propagation, the solution to the credit assignment problem:
$E = \frac{1}{2}\sum_{\mu=1}^{P}\left(y^\mu - o^\mu\right)^2$, where $o^\mu = \sigma\!\left(\sum_k w_k^2\, \sigma\!\left(\sum_j w_{k,j}^1 x_j^\mu\right)\right)$
From hidden-layer-to-output weights:
$\frac{\partial E}{\partial w_k^2} = -\sum_\mu \left(y^\mu - o^\mu\right) V_k^\mu$, where $V_k^\mu = \sigma\!\left(\sum_j w_{k,j}^1 x_j^\mu\right)$, so
$\Delta w_k^2 = \eta \sum_\mu \left(y^\mu - o^\mu\right) V_k^\mu$
Note: for simplicity of the derivation the second (output) sigmoid has been left out here; in general it must be kept.
For the input-to-hidden weights, apply the chain rule through the hidden units:
$\frac{\partial E}{\partial w_{k,j}^1} = \sum_\mu \frac{\partial E}{\partial o^\mu}\,\frac{\partial o^\mu}{\partial V_k^\mu}\,\frac{\partial V_k^\mu}{\partial w_{k,j}^1}$
with $\frac{\partial o^\mu}{\partial V_k^\mu} = w_k^2$ and $\frac{\partial V_k^\mu}{\partial w_{k,j}^1} = \sigma'(h_k^\mu)\, x_j^\mu$, where $h_k^\mu = \sum_j w_{k,j}^1 x_j^\mu$.
Therefore:
$\Delta w_{k,j}^1 = \eta \sum_\mu \left(y^\mu - o^\mu\right) w_k^2\, \sigma'(h_k^\mu)\, x_j^\mu$
Assignment 3c: program a two-layer network in MATLAB and solve the XOR problem. Fit the curve x(x-1) between 0 and 1; how many hidden units did you need?
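A compact starter sketch following the derivation above (my own illustration, not the assignment solution; it uses a linear output unit as in the simplified derivation, and random initialization may need a few restarts):

    % Two-layer network trained by back-propagation on XOR.
    beta = 1; sig = @(x) 1 ./ (1 + exp(-beta * x));
    X = [1 1; 1 0; 0 1; 0 0]; y = [0; 1; 1; 0];
    Xb = [X ones(4,1)];                 % bias input
    K = 2;                              % hidden units
    W1 = randn(K, 3); w2 = randn(1, K); eta = 0.5;
    for t = 1:20000
        V = sig(Xb * W1');              % hidden outputs, P x K
        o = V * w2';                    % linear output, P x 1
        err = y - o;
        D = (err * w2) .* beta .* V .* (1 - V);   % back-propagated deltas
        w2 = w2 + eta * err' * V;                 % output-weight update
        W1 = W1 + eta * D' * Xb;                  % hidden-weight update
    end
    disp(o')                            % should approach 0 1 1 0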
Formal neural networks can accomplish many tasks, for example:
• Perform complex classification
• Learn arbitrary functions
• Account for associative memory
Some applications: robotics, character recognition, speech recognition, medical diagnostics.
This is not neuroscience, but it is loosely motivated by neuroscience and carries important information for neuroscience as well.
For example: Memory, learning and some aspects of development are assumed to be based on synaptic plasticity.
What did we learn today?
Is Backprop biologically realistic?