G51IAI Introduction to AI Andrew Parkes Neural Networks 1
Neural Networks
• AIMA (Russell and Norvig, Artificial Intelligence: A Modern Approach): Section 20.5 of the 2003 edition
• Fundamentals of Neural Networks: Architectures, Algorithms and Applications. Fausett, L., 1994
• An Introduction to Neural Networks (2nd Ed). Morton, IM, 1995
Brief History
• Try to create artificial intelligence based on the natural intelligence we know:
• The brain
– massively interconnected neurons
Natural Neural Networks
• Signals “move” between neurons electrochemically
• The synapses release a chemical transmitter – the sum of which can cause a threshold to be reached – causing the neuron to “fire”
• Synapses can be inhibitory or excitatory
Natural Neural Networks
• We are born with about 100 billion neurons
• A neuron may connect to as many as 100,000 other neurons
Natural Neural Networks
• McCulloch & Pitts (1943) are generally recognised as the designers of the first neural network
• Many of their ideas are still used today, e.g.
– many simple units, “neurons”, combine to give increased computational power
– the idea of a threshold
Modelling a Neuron
• a_j : Activation value of unit j
• W_{j,i} : Weight on link from unit j to unit i
• in_i : Weighted sum of inputs to unit i
• a_i : Activation value of unit i
• g : Activation function

$in_i = \sum_j W_{j,i} \, a_j$
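The weighted sum above can be sketched in a few lines of Python. This is only an illustration: it assumes the unit's output is a_i = g(in_i), as the definitions suggest, and the function names and example numbers are made up for the sketch rather than taken from the slides.

```python
def weighted_input(weights, activations):
    """in_i: sum over j of W[j,i] * a_j for a single unit i."""
    return sum(w * a for w, a in zip(weights, activations))


def unit_output(weights, activations, g):
    """a_i = g(in_i): apply the activation function g to the weighted sum."""
    return g(weighted_input(weights, activations))


# Hypothetical example values (not from the slides).
incoming_weights = [0.5, -1.0, 2.0]   # W[j,i] for j = 0, 1, 2
incoming_acts = [1.0, 1.0, 0.5]       # a_j for j = 0, 1, 2

in_i = weighted_input(incoming_weights, incoming_acts)   # 0.5 - 1.0 + 1.0
a_i = unit_output(incoming_weights, incoming_acts,
                  lambda x: 1 if x >= 0 else 0)          # step at 0

print(in_i, a_i)
```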
Activation Functions
• Step_t(x) = 1 if x ≥ t, else 0 (threshold = t)
• Sign(x) = +1 if x ≥ 0, else -1
• Sigmoid(x) = 1/(1 + e^{-x})
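As a rough sketch, the three activation functions listed above could be written in Python as follows. The function names are chosen for this example only.

```python
import math


def step(x, t=0.0):
    """Step_t(x): 1 if x >= t, else 0."""
    return 1 if x >= t else 0


def sign(x):
    """Sign(x): +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1


def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^(-x)).

    Fine for moderate x; math.exp overflows for very large negative x.
    """
    return 1.0 / (1.0 + math.exp(-x))


for x in (-2.0, 0.0, 2.0):
    print(x, step(x, t=1.0), sign(x), round(sigmoid(x), 3))
```

Step and Sign jump between two values, while Sigmoid rises smoothly from near 0 to near 1 and passes through 0.5 at x = 0.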
Building a Neural Network
1. “Select Structure”: Design the way that the neurons are interconnected
2. “Select weights” – decide the strengths with which the neurons are interconnected
– weights are selected to get a “good match” to a “training set”
– “training set”: set of inputs and desired outputs
– often use a “learning algorithm”
Neural Networks
• Hebb (1949) developed the first learning rule, on the premise that if two neurons were active at the same time, the strength of the connection between them should be increased
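One common way to write Hebb's premise down is to add a small amount to a weight whenever the two units it connects are active together. The sketch below assumes that form; the learning rate `rate` and the example activations are hypothetical and do not come from the slides.

```python
def hebb_update(weight, a_pre, a_post, rate=0.1):
    """Return the new weight: it grows by rate * a_pre * a_post.

    With 0/1 activations the weight only changes when both units are active.
    """
    return weight + rate * a_pre * a_post


w = 0.0
# Hypothetical sequence of (activation of unit j, activation of unit i).
for a_pre, a_post in [(1, 1), (1, 0), (0, 1), (1, 1)]:
    w = hebb_update(w, a_pre, a_post)

print(w)  # only the two (1, 1) steps increased the weight
```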
Neural Networks
• During the 1950s and 1960s many researchers worked, amidst great excitement, on a particular net structure called the “perceptron”
• Minsky & Papert (1969) demonstrated a strong limit on the power of perceptrons
– saw the death of neural network research for about 15 years
• Only in the mid-1980s (Parker and LeCun) was interest revived, because of their learning algorithm for a better design of net
– (in fact Werbos discovered the algorithm in 1974)
Basic Neural Networks
• Will first look at the simplest networks
• “Feed-forward”
– Signals travel in one direction through net
– Net computes a function of the inputs
The First Neural Networks
Neurons in a McCulloch-Pitts network are connected by directed, weighted paths
[Figure: a McCulloch-Pitts unit Y with inputs X1, X2 and X3; link weights 2, 2 and -1]
If the weight on a path is positive the path is excitatory, otherwise it is inhibitory
The activation of a neuron is binary. That is, the neuron either fires (activation of one) or does not fire (activation of zero).
For the network shown here, the activation function for unit Y is
f(y_in) = 1 if y_in ≥ θ, else 0
where y_in is the total input signal received and θ is the threshold for Y
Originally, all excitatory connections into a particular neuron had the same weight, although connections into different neurons could carry different weights
Later, weights were allowed to be arbitrary
Each neuron has a fixed threshold. If the net input into the neuron is greater than or equal to the threshold, the neuron fires
The threshold is set such that any non-zero inhibitory input will prevent the neuron from firing
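The threshold rule can be tried out on a small example. The sketch below assumes the weights from the diagram are attached as X1 → 2, X2 → 2, X3 → -1, and it assumes a threshold of 4 for Y; the slides do not state the threshold, so that value is an assumption chosen so that a single inhibitory input blocks firing.

```python
def mp_neuron(inputs, weights, theta):
    """McCulloch-Pitts unit: fire (1) if the net input reaches theta, else 0."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0


WEIGHTS = [2, 2, -1]   # assumed wiring: X1 -> 2, X2 -> 2, X3 -> -1
THETA = 4              # assumed threshold (not given on the slide)

for x1 in (0, 1):
    for x2 in (0, 1):
        for x3 in (0, 1):
            y = mp_neuron([x1, x2, x3], WEIGHTS, THETA)
            print(x1, x2, x3, "->", y)
```

Under these assumptions the largest excitatory input is 2 + 2 = 4, so Y fires only when X1 and X2 are both on and X3 is off; with X3 on, the net input is at most 3 and stays below the threshold.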
Building Logic Gates
• Computers are built out of “logic gates”
• Can we use neural nets to represent logical functions?
• Use threshold (step) function for activation function
– all activation values are 0 (false) or 1 (true)
AND Function
[Figure: inputs X1 and X2 each connect to unit Y with weight 1]
AND
X1 X2 Y
1 1 1
1 0 0
0 1 0
0 0 0
Threshold(Y) = 2
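The AND unit can be checked against its truth table with a short script. This is a sketch using the weights (1 and 1) and threshold (2) given above; the helper name `fires` is made up for the example.

```python
def fires(inputs, weights, theta):
    """Threshold unit: 1 if the weighted sum of inputs is >= theta, else 0."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0


# Truth table from the slide: (X1, X2) -> Y
AND_TABLE = {(1, 1): 1, (1, 0): 0, (0, 1): 0, (0, 0): 0}

and_ok = all(fires(x, [1, 1], 2) == y for x, y in AND_TABLE.items())
print("AND unit matches its truth table:", and_ok)
```

Only the input (1, 1) gives a net input of 2, which is the one case that reaches the threshold.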
OR Function
[Figure: inputs X1 and X2 each connect to unit Y with weight 2]
OR
X1 X2 Y
1 1 1
1 0 1
0 1 1
0 0 0
Threshold(Y) = 2
AND NOT Function
[Figure: X1 connects to unit Y with weight 2; X2 connects to Y with weight -1]
ANDNOT
X1 X2 Y
1 1 0
1 0 1
0 1 0
0 0 0
Threshold(Y) = 2
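The OR and AND NOT units can be checked in the same table-driven way. The sketch assumes OR uses weight 2 on both inputs, and AND NOT uses weight 2 on X1 and -1 on X2 (the assignment that agrees with the truth table), each with threshold 2. The `GATES` dictionary and `fires` helper are names invented for this example.

```python
def fires(inputs, weights, theta):
    """Threshold unit: 1 if the weighted sum of inputs is >= theta, else 0."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0


# name -> (weights for X1 and X2, threshold, truth table (X1, X2) -> Y)
GATES = {
    "OR": ([2, 2], 2, {(1, 1): 1, (1, 0): 1, (0, 1): 1, (0, 0): 0}),
    "AND NOT": ([2, -1], 2, {(1, 1): 0, (1, 0): 1, (0, 1): 0, (0, 0): 0}),
}

results = {
    name: all(fires(x, weights, theta) == y for x, y in table.items())
    for name, (weights, theta, table) in GATES.items()
}
print(results)
```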
Simple Networks
          AND       OR        NOT
Input 1   0 0 1 1   0 0 1 1   0 1
Input 2   0 1 0 1   0 1 0 1
Output    0 0 0 1   0 1 1 1   1 0
[Figure: a threshold unit with t = 0.0, inputs labelled x, y and -1, and link weights labelled W = 1.5 and W = 1]
Perceptron
• Synonym for Single-Layer, Feed-Forward Network
• First studied in the 1950s
• Other networks were known about but the perceptron was the only one capable of learning and thus all research was concentrated in this area
Perceptron
• A single weight only affects one output, so we can restrict our investigations to a model as shown on the right
• Notation can be simpler, i.e.
$O = Step_0\left(\sum_j W_j I_j\right)$
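A minimal sketch of this single-output form follows. Because the step is taken at 0, a non-zero threshold has to be folded into the weights; the example assumes the usual trick of adding a fixed input of -1 whose weight plays the role of the threshold. The AND weights used here (1.5 on the fixed input, 1 on each real input) are an assumption for illustration, not a reading of the slide's diagram.

```python
def perceptron(weights, inputs):
    """O = Step_0(sum_j W_j * I_j): output 1 if the weighted sum is >= 0."""
    total = sum(w * i for w, i in zip(weights, inputs))
    return 1 if total >= 0 else 0


# Assumed example: AND with a threshold of 1.5 folded in as the weight
# on a fixed input of -1 (first position), then weight 1 on x and on y.
AND_WEIGHTS = [1.5, 1.0, 1.0]


def and_gate(x, y):
    return perceptron(AND_WEIGHTS, [-1, x, y])


for x in (0, 1):
    for y in (0, 1):
        print(x, y, "->", and_gate(x, y))
```

Testing x + y - 1.5 ≥ 0 is the same as testing x + y ≥ 1.5, so the fixed input lets a single Step_0 unit behave like a unit with threshold 1.5.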
What can perceptrons represent?
          AND       XOR
Input 1   0 0 1 1   0 0 1 1
Input 2   0 1 0 1   0 1 0 1
Output    0 0 0 1   0 1 1 0
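[Figure: two plots, AND and XOR, each showing the input points (0,0), (0,1), (1,0) and (1,1)]
• Functions whose inputs with output 1 can be separated from those with output 0 by a straight line, as for AND, are called Linearly Separable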
• Only linearly separable functions can be represented by a perceptron
• XOR cannot be represented by a perceptron
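A quick experiment can make the difference between AND and XOR concrete. The sketch below tries every combination of two weights and a threshold on a small grid and reports whether any of them reproduces the truth table. The grid and the function names are assumptions for this example, and a grid search is only an illustration, not a proof that no weights exist.

```python
from itertools import product


def step_unit(w1, w2, t, x1, x2):
    """Single threshold unit: 1 if w1*x1 + w2*x2 >= t, else 0."""
    return 1 if w1 * x1 + w2 * x2 >= t else 0


def find_weights(table, grid):
    """Return the first (w1, w2, t) on the grid matching the table, or None."""
    for w1, w2, t in product(grid, repeat=3):
        if all(step_unit(w1, w2, t, x1, x2) == y
               for (x1, x2), y in table.items()):
            return (w1, w2, t)
    return None


AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

GRID = [k / 2 for k in range(-8, 9)]   # -4.0, -3.5, ..., 4.0

and_solution = find_weights(AND, GRID)
xor_solution = find_weights(XOR, GRID)

print("AND:", and_solution)
print("XOR:", xor_solution)
```

The search finds weights for AND but none for XOR. The reason is general: XOR would need t > 0, w1 ≥ t and w2 ≥ t, yet also w1 + w2 < t, which is impossible.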
Linear Separability is also possible in more than 3 dimensions – but it is harder to visualise
XOR
• XOR is not “linearly separable”
– Cannot be represented by a perceptron
• What can we do instead?
1. Convert to logic gates that can be represented by perceptrons
2. Chain together the gates
• Make sure you understand the following
– check it using truth tables
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
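The truth-table check suggested above can also be done mechanically. This sketch enumerates all four input combinations and compares the gate expression with XOR (written as `x1 != x2` for booleans); the function name is made up for the example.

```python
from itertools import product


def xor_via_gates(x1, x2):
    """(X1 AND NOT X2) OR (X2 AND NOT X1), using Python booleans."""
    return (x1 and not x2) or (x2 and not x1)


identity_holds = True
for x1, x2 in product([False, True], repeat=2):
    lhs = x1 != x2                      # X1 XOR X2
    rhs = bool(xor_via_gates(x1, x2))
    print(int(x1), int(x2), int(lhs), int(rhs))
    identity_holds = identity_holds and (lhs == rhs)

print("Identity holds for all inputs:", identity_holds)
```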
XOR Function
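[Figure: two-layer network with inputs X1 and X2, hidden units Z1 and Z2, and output Y; four links of weight 2 and two links of weight -1]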
XOR
X1 X2 Y
1 1 0
1 0 1
0 1 1
0 0 0
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
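Chaining the gates gives a small two-layer network. The sketch below assumes the wiring implied by the formula: Z1 computes X1 AND NOT X2 (weights 2 and -1), Z2 computes X2 AND NOT X1 (weights -1 and 2), and Y computes Z1 OR Z2 (weights 2 and 2). It also assumes every unit has threshold 2, as in the earlier gate examples; the slide does not state these details explicitly.

```python
def fires(inputs, weights, theta=2):
    """Threshold unit: 1 if the weighted sum of inputs is >= theta, else 0."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0


def xor_net(x1, x2):
    """Two-layer network: hidden units Z1, Z2 feed the output unit Y."""
    z1 = fires([x1, x2], [2, -1])    # assumed: Z1 = X1 AND NOT X2
    z2 = fires([x1, x2], [-1, 2])    # assumed: Z2 = X2 AND NOT X1
    return fires([z1, z2], [2, 2])   # assumed: Y = Z1 OR Z2


for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, "->", xor_net(x1, x2))
```

No single unit here computes XOR; it is the hidden units Z1 and Z2 that make the combined function possible.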
Single- vs. Multiple-Layers
• Once we chain together the gates we have “hidden layers”
– layers that are “hidden” from the output lines
• Have just seen that hidden layers allow us to represent XOR
– Perceptron is single-layer
– Multiple layers increase the representational power, so e.g. can represent XOR
• Generally, useful nets have multiple layers
– typically 2-4 layers
Expectations
• Be able to explain the terminology used, e.g.
– activation functions
– step and threshold functions
– perceptron
– feed-forward
– multi-layer, hidden layers
– linear separability
• XOR
– why perceptrons cannot cope with XOR
– how XOR is possible with hidden layers
Questions?