
Lecture 2 - Neural Network (Cont’d)

Page 1: Lecture 2 - Neural Network (Cont’d)

Neural Network Models

Types of NN

Supervised Training

• Backpropagation Network

• RBF Network

• LVQ Network

• Elman Network

Unsupervised Training

• Hamming Network

• KSOMs

• SOFM

• LVQ

• Hopfield Network

• ART

Page 2: Lecture 2 - Neural Network (Cont’d)

Radial Basis Function (RBF) Network

• RBF network → a two-layer feedforward network, except that the 1st layer does not use the weighted sum of inputs and the sigmoid transfer function like the other multilayer networks.

• The outputs of the first-layer neurons each represent a “radial function” determined by the distance between the network input and the ‘center’ of the basis function.

• As the input moves away from the center, the neuron's output drops rapidly toward zero (equivalently, as the distance between w and p decreases, the output increases).

• A radial basis neuron thus acts as a detector that produces 1 whenever the input p is identical to its weight vector.
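The underlying transfer function is radbas(n) = exp(-n^2), which peaks at 1 when its net input (the weighted distance) is zero; a quick sketch:

n = -3:0.1:3;
a = radbas(n);     % equals exp(-n.^2): 1 at n = 0, falling rapidly toward 0
plot(n,a)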

• The 2nd layer uses a linear transfer function.

• RBF networks → require more neurons but train faster than standard feedforward backpropagation networks.

• The output of the first layer for a feedforward network net can be obtained with the following code:

a{1} = radbas(netprod(dist(net.IW{1,1},p),net.b{1}))  % distances to the weight rows, scaled by the biases, passed through radbas

• The function newrbe can produce a network with zero error on the training vectors. It is called in the following way:

net = newrbe(P,T,SPREAD)

This takes matrices of input vectors P and target vectors T, and a spread constant SPREAD for the radial basis layer, and returns a network with weights and biases such that the outputs are exactly T when the inputs are P.
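As a minimal sketch with hypothetical sine-fitting data (the SPREAD value here is an assumed choice):

P = -1:0.1:1;            % 21 input points
T = sin(pi*P);           % targets to interpolate
net = newrbe(P,T,1);     % exact design: one radbas neuron per input vector
Y = sim(net,P);          % reproduces T exactly on the training inputs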


Page 3: Lecture 2 - Neural Network (Cont’d)

• The function newrb iteratively creates a radial basis network one neuron

at a time. Neurons are added to the network until the sum-squared

error falls beneath an error goal or a maximum number of neurons

has been reached. The call for this function is:

net = newrb(P,T,GOAL,SPREAD)

• The error of the new network is checked, and if low enough newrb is

finished. Otherwise the next neuron is added. This procedure is

repeated until the error goal is met, or the maximum number of

neurons is reached.
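A minimal sketch with the same hypothetical data as above (the GOAL and SPREAD values are assumed):

P = -1:0.1:1;
T = sin(pi*P);
net = newrb(P,T,0.02,1);   % stops once the sum-squared error falls below 0.02
Y = sim(net,P);            % typically uses far fewer neurons than newrbe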

• Radial basis networks, even when designed efficiently with newrbe,

tend to have many times more neurons than a comparable feed-

forward network with tansig or logsig neurons in the hidden layer.

• Choose a spread constant larger than the distance between adjacent

input vectors, but smaller than the distance across the whole input

space.

Radial Basis Function (RBF) Network

demorb1

• The first design method, newrbe, finds an exact solution. The

function newrbe creates radial basis networks with as many radial

basis neurons as there are input vectors in the training data.

• The second method, newrb, finds the smallest network that can

solve the problem within a given error goal. Typically, far fewer

neurons are required by newrb than are returned by newrbe.

• Generalized regression neural networks (GRNNs) are often used for function approximation.

• Probabilistic neural networks can be used for classification

problems.
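Both have simple design calls in the toolbox (newgrnn and newpnn); a minimal sketch with hypothetical data (the spread values are assumed):

P = [1 2 3 4];  T = [2.0 4.1 5.9 8.1];
gnet = newgrnn(P,T,1);        % GRNN for function approximation
y = sim(gnet,2.5)             % smoothed estimate from nearby training points

Tc = [1 1 2 2];               % class indices for the same inputs
pnet = newpnn(P,ind2vec(Tc)); % PNN for classification (default spread)
vec2ind(sim(pnet,P))          % recovers the class indices 1 1 2 2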


Page 4: Lecture 2 - Neural Network (Cont’d)

Radial Basis Function (RBF) Network

Hamming Network

• Hamming network → an example of a competitive network (unsupervised learning). It operates in two steps:

(i) compute the distance between the stored prototype patterns and the input pattern

(ii) perform a competition to determine which neuron represents the prototype pattern closest to the input

(In the unsupervised-learning Self-Organizing Map (SOM), the prototype patterns are adjusted as new inputs are presented to the network, so that the network can cluster the inputs into different categories.)

Page 5: Lecture 2 - Neural Network (Cont’d)

Hamming Network

• Hamming network → 1st layer: a feedforward layer; 2nd layer: a recurrent layer (known as the competitive layer).

• It is called a Hamming network because the neuron in the feedforward layer with the largest output corresponds to the prototype pattern that is closest in Hamming distance to the input pattern.

• In the competitive layer, the neurons compete with each other to determine a winner. After the competition, only one neuron has a nonzero output.

• Transfer function in the competitive layer: poslin (linear for

+ve numbers and zero for –ve numbers)
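To make the feedforward layer concrete, a minimal hand-worked sketch with hypothetical ±1 prototype patterns (bias set to the number of input elements R, a common choice that keeps the outputs nonnegative):

W = [1 -1 -1; 1 1 -1];   % two stored prototype patterns, one per row
b = [3; 3];              % bias = R = 3 input elements
p = [1; 1; -1];          % input pattern (identical to prototype 2)
a1 = W*p + b             % a1 = [4; 6]: the largest entry marks the
                         % prototype closest in Hamming distance to p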

Kohonen Self-Organizing Maps (SOMs)

• Self-organization in networks: networks can learn to detect

regularities and correlations in their input and adapt their

future responses to that input accordingly

• Self-organizing maps learn to recognize groups of similar

input vectors in such a way that neurons physically close

together in the neuron layer respond to similar input vectors.

• The weights of the winning neuron (a row of the input weight

matrix) are adjusted with the Kohonen learning rule.

Supposing that the ith neuron wins, the ith row of the input weight matrix is adjusted with the update

iIW(q) = iIW(q-1) + α(p(q) - iIW(q-1))

where p(q) is the current input vector and α is the learning rate.
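A minimal numeric sketch of one such update (all values hypothetical):

IW = [0.2 0.4; 0.9 0.8];   % input weight matrix, one row per neuron
p = [0.1; 0.3];            % current input vector
i = 1;                     % index of the winning neuron
alpha = 0.1;               % learning rate (assumed)
IW(i,:) = IW(i,:) + alpha*(p' - IW(i,:))   % row i moves toward p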

Page 6: Lecture 2 - Neural Network (Cont’d)

p = [.1 .8 .1 .9; .2 .9 .1 .8];   % four two-element input vectors
net = newc([0 1; 0 1],2);         % create a competitive network with 2 neurons
net.trainParam.epochs = 500;
net = train(net,p);
a = sim(net,p);
ac = vec2ind(a)                   % convert network outputs to class indices

ac = 1 2 1 2

• We see that the network has been trained to classify

the input vectors into two groups, those near the origin,

class 1, and those near (1,1), class 2.

KSOM in MATLAB

democ1

• Self-organizing feature maps (SOFM) learn to classify input

vectors according to how they are grouped in the input space.

They differ from competitive layers in that neighboring neurons

in the self-organizing map learn to recognize neighboring

sections of the input space.

• Thus self-organizing maps learn both the distribution (as do

competitive layers) and topology of the input vectors they are

trained on.

• Functions gridtop, hextop or randtop can arrange the neurons in

a grid, hexagonal, or random topology.

• Instead of updating only the winning neuron (competitive

network), all neurons within a certain neighborhood of the

winning neuron are updated using the Kohonen rule.

SOFM

Page 7: Lecture 2 - Neural Network (Cont’d)

• Different topologies for the original neuron locations can be specified with the functions gridtop, hextop, or randtop.

• An 8x10 set of neurons in a hextop topology can be created and

plotted with the code shown below:

pos = hextop(8,10);

plotsom(pos)

SOFM

net = newsom([0 2; 0 1] , [2 3]);

P = [.1 .3 1.2 1.1 1.8 1.7 .1 .3 1.2 1.1 1.8 1.7;...

0.2 0.1 0.3 0.1 0.3 0.2 1.8 1.8 1.9 1.9 1.7 1.8]

We can plot all of this with

plot(P(1,:),P(2,:),'.g','markersize',20)

hold on

plotsom(net.iw{1,1},net.layers{1}.distances)

hold off

net.trainParam.epochs = 1000;

net = train(net,P);
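Once trained, the map can classify inputs just as the competitive network did; a minimal sketch:

a = sim(net,P);        % winner-take-all output, one column per input
classes = vec2ind(a)   % winning neuron index for each input vector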

SOFM in MATLAB

Page 8: Lecture 2 - Neural Network (Cont’d)

• Competitive network learns to categorize the input

vectors presented to it

• Self-organizing map learns to categorize input vectors. It

also learns the distribution of input vectors.

• Self-organizing maps also learn the topology of their input

vectors. Neurons next to each other in the network learn

to respond to similar vectors. The layer of neurons can be

imagined to be a rubber net which is stretched over the

regions in the input space where input vectors occur.

• Self-organizing maps allow neurons that are neighbours to

the winning neuron to output values.

Summary for KSOM

Learning Vector Quantization (LVQ)

• LVQ networks classify input vectors into target classes by using a

competitive layer to find subclasses of input vectors, and then

combining them into the target classes.

• LVQ: hybrid network - uses both unsupervised and supervised

learning for classification

• Supervised learning → a target is given

Page 9: Lecture 2 - Neural Network (Cont’d)

LVQ in MATLAB

net = newlvq(PR,S1,PC,LR,LF)

P = [-3 -2 -2 0 0 0 0 +2 +2 +3; 0 +1 -1 +2 +1 -1 -2 +1 -1 0]

and

Tc = [1 1 1 2 2 2 2 1 1 1];

T = ind2vec(Tc)

The following call creates a network with four neurons in the first layer and two neurons in the second layer. The second-layer weights will have 60% (6 of the 10 entries in Tc above) of their columns with a 1 in the first row, corresponding to class 1, and 40% of their columns with a 1 in the second row, corresponding to class 2.

net = newlvq(minmax(P),4,[.6 .4]);

net.trainParam.epochs = 1000;

net.trainParam.lr = 0.05;

net = train(net,P,T);

Y = sim(net,P)

Yc = vec2ind(Y)

Yc =

1 1 1 2 2 2 2 1 1 1

Recurrent Network

• Example: Elman (supervised learning) and Hopfield networks (unsupervised)

• Elman networks are two-layer backpropagation networks, with the addition of a

feedback connection from the output of the hidden layer to its input. This

feedback path allows Elman networks to learn to recognize and generate

temporal patterns, as well as spatial patterns.

• Elman networks, by having an internal feedback loop, are capable of learning to

detect and generate temporal patterns. This makes Elman networks useful in

such areas as signal processing and prediction where time plays a dominant role.

• The Hopfield network is used to store one or more stable target vectors. These

stable vectors can be viewed as memories that the network recalls when presented with similar vectors, which act as cues to the network memory.

• Hopfield networks can act as error correction or vector categorization

networks. Input vectors are used as the initial conditions to the network, which

recurrently updates until it reaches a stable output vector.

Page 10: Lecture 2 - Neural Network (Cont’d)

Elman Network

net = newelm([0 1],[5 1],{'tansig','logsig'});

P = round(rand(1,8))

P = 1 0 1 1 1 0 1 1

and

T = [0 (P(1:end-1)+P(2:end) == 2)]

T = 0 0 0 1 1 0 0 1

Here T is defined to be 0 except when two ones occur in P, in which case T

will be 1.

Pseq = con2seq(P);    % convert to sequences for training over time
Tseq = con2seq(T);
net = train(net,Pseq,Tseq);

Y = sim(net,Pseq);

z = seq2con(Y);
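As a minimal check, the converted output can be compared against the targets (more training epochs may be needed if the errors are large):

err = round(z{1}) - T    % zeros wherever the network reproduces T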

Page 11: Lecture 2 - Neural Network (Cont’d)

Hopfield Network

T = [-1 -1 1; 1 -1 1]';

net = newhop(T);

Ai = T;

[Y,Pf,Af] = sim(net,2,[],Ai);

Y

This gives us

Y =

-1 1

-1 -1

1 1

The network has indeed been designed to be stable at its design points.

Ai = {[-0.9; -0.8; 0.7]}

[Y,Pf,Af] = sim(net,{1 5},{},Ai);

Y{1}

We get

Y =

-1

-1

1

Adaptive Resonance Theory (ART)

• ART → unsupervised learning; designed to achieve

learning stability while maintaining sensitivity to novel

inputs

• As each input is presented to the network, it is compared with the prototype vector that most closely matches it.

• If the matching is not adequate, a new prototype is

selected.

• In this way, previously learned memories (prototypes) are not eroded by new learning.
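A minimal sketch of this compare-and-commit behavior, with hypothetical binary inputs and an assumed vigilance threshold (illustrative logic only, not a toolbox function):

X = [1 1 0; 0 1 0; 1 0 1]';   % three hypothetical binary inputs (columns)
rho = 0.7;                    % vigilance threshold (assumed)
protos = X(:,1);              % first input becomes the first prototype
for k = 2:size(X,2)
    p = X(:,k);
    overlap = protos'*p;               % |p AND w| for each prototype w
    match = overlap / max(sum(p),1);   % vigilance ratio |p AND w| / |p|
    [best,j] = max(match);
    if best >= rho
        protos(:,j) = double(protos(:,j) & p);  % refine matching prototype
    else
        protos = [protos, p];          % inadequate match: add new prototype
    end
end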

Page 12: Lecture 2 - Neural Network (Cont’d)

Applications of NN

Appcr1: Character Recognition

S1 = 10; S2 = 26;

net = newff(minmax(P),[S1 S2],{'logsig' 'logsig'},'traingdx');

Training Without Noise

The network is initially trained without noise for a maximum of 5000 epochs

or until the network sum-squared error falls beneath 0.1.

P = alphabet;

T = targets;

net.performFcn = 'sse';

net.trainParam.goal = 0.1;

net.trainParam.show = 20;

net.trainParam.epochs = 5000;

net.trainParam.mc = 0.95;

[net,tr] = train(net,P,T);

Applications of NN

Appcr1: Character Recognition

Training With Noise

netn = net;

netn.trainParam.goal = 0.6;

netn.trainParam.epochs = 300;

T = [targets targets targets targets];
[R,Q] = size(alphabet);   % R = 35 pixels per letter, Q = 26 letters
for pass = 1:10
    P = [alphabet, alphabet, ...
        (alphabet + randn(R,Q)*0.1), ...
        (alphabet + randn(R,Q)*0.2)];
    [netn,tr] = train(netn,P,T);
end

Page 13: Lecture 2 - Neural Network (Cont’d)

Applications of NN

Appcr1: Character Recognition

Training Without Noise Again

Once the network has been trained with noise it makes sense to train it without

noise once more to ensure that ideal input vectors are always classified

correctly.

To test the system a letter with noise can be created and presented to the

network.

noisyJ = alphabet(:,10) + randn(35,1)*0.2;   % the letter 'J' plus noise
plotchar(noisyJ);
A2 = sim(net,noisyJ);
A2 = compet(A2);              % winner-take-all: 1 for the largest output
answer = find(A2 == 1);       % index of the recognized letter
plotchar(alphabet(:,answer));

Applications of NN

Applin2: Adaptive Prediction

time1 = 0:0.05:4;

time2 = 4.05:0.024:6;

time = [time1 time2];

T = [sin(time1*4*pi) sin(time2*8*pi)];

Since we will be training the network incrementally, we change T to a sequence.

T = con2seq(T);

P = T;

lr = 0.1;

delays = [1 2 3 4 5];

net = newlin(minmax(cat(2,P{:})),1,delays,lr);

% newlin already initializes the weights and biases, so the network can
% adapt to the signal directly:
[net,a,e] = adapt(net,P,T);
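The outputs and errors returned by adapt are sequences; converting them back with seq2con (as with the Elman example) lets us inspect how the error shrinks as the network adapts. A minimal sketch:

y = seq2con(a);
err = seq2con(e);
plot(time,err{1})      % adaptation error over time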