Lecture 6-1. Perceptron: The Simplest Neural Network
Saint-Petersburg State Polytechnic University, Distributed Intelligent Systems Dept.
Prof. Dr. Viacheslav P. Shkodyrev, e-mail: [email protected]

ARTIFICIAL NEURAL NETWORKS. The perceptron is one of the first and simplest artificial neural networks, presented in the mid-1950s. It was the first mathematical model to demonstrate the new paradigm of machine learning in a computational environment, and a threshold-logic-unit model for classification tasks.

Lecture 6-2. Perceptron: Neural Networks
The perceptron was proposed by Rosenblatt (1958) as the first model of learning with a teacher, i.e. supervised learning. The model is an enhancement of the threshold logic unit (TLU), used for the classification of patterns said to be linearly separable. How can this model be formalized and interpreted?

Objective of Lecture 6. This lecture introduces the simplest class of neural networks, the perceptron, and its application to pattern classification. The functionality of the perceptron and the threshold-logic-unit model can be interpreted geometrically via a separating hyperplane. In this lecture we define the perceptron learning rule, explain perceptron networks and their training algorithms, and discuss the limitations of these networks. You will learn:
- what a single-layer perceptron is, via the threshold-logic-unit model;
- perceptrons as linear and non-linear classifiers, via threshold logic theory;
- multi-layer perceptron networks;
- perceptron learning rules.

Lecture 6-3. The Simple Nonlinear One-Layer Neural Network
It was the perceptron that first demonstrated the new paradigm of trainable computation algorithms.

[Figure: one-layer perceptron with an input pattern, an association unit, and a threshold element.]

We solve a classification task when we assign an image, represented by a feature vector, to one of two classes, denoted F_A and F_B, so that class F_A corresponds to the character a and class F_B corresponds to the character b. Using the perceptron training algorithm, we can classify two linearly separable classes, as the sketch below illustrates.
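To make the training algorithm concrete, here is a minimal sketch in Python; the two point clusters, the learning rate eta, and the zero threshold are illustrative assumptions, not data from the lecture.

```python
import numpy as np

def step(u, thresh=0.0):
    """Threshold logic activation: fire (1) when u reaches the threshold."""
    return np.where(u >= thresh, 1, 0)

def train_perceptron(X, y_teach, eta=0.1, epochs=100):
    """Rosenblatt-style training on a linearly separable two-class set."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, y_teach):
            y = step(w @ x + b)
            e = t - y                 # quantized error: -1, 0, or +1
            w += eta * e * x          # adjust weights only on mistakes
            b += eta * e
            errors += int(e != 0)
        if errors == 0:               # an error-free epoch: stop training
            break
    return w, b

# Two linearly separable clusters standing in for classes F_A and F_B.
X = np.array([[0.1, 0.2], [0.3, 0.1], [0.9, 0.8], [0.8, 1.0]])
y_teach = np.array([0, 0, 1, 1])
w, b = train_perceptron(X, y_teach)
print(w, b, step(X @ w + b))          # final weights and class labels
```

For linearly separable classes the perceptron convergence theorem guarantees that this loop terminates with all patterns classified correctly.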
Lecture 6-4. Single-Layer Perceptron as the Simplest Model for Classification

[Figure: architecture of the single-layer perceptron (SLP). Input patterns a, b, c, ... are presented as an input vector x = (x_1, x_2, ..., x_n) and multiplied by the synapse matrix W; each unit applies the threshold logic activation, giving y = 1 once u reaches u_thresh.]

The single-layer perceptron was the first simple model to generate great interest, owing to its ability to generalize from its training vectors and to learn from initially randomly distributed connections. A forward-pass sketch of this architecture follows.
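As a sketch of this architecture, the forward pass is a single matrix-vector product followed by the hard limiter; the weights, biases, and input below are arbitrary illustrative values.

```python
import numpy as np

def slp_forward(W, b, x, u_thresh=0.0):
    """Single-layer perceptron: u = Wx + b, then threshold logic activation."""
    u = W @ x + b                       # one activation per output neuron
    return np.where(u >= u_thresh, 1, 0)

W = np.array([[0.5, -0.2, 0.1],         # synapse matrix: 2 neurons x 3 inputs
              [0.3,  0.8, -0.5]])
b = np.array([-0.1, 0.2])
x = np.array([1.0, 0.0, 1.0])
print(slp_forward(W, b, x))             # [1 1] for these illustrative weights
```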
Lecture 6-5. Linearly Separable Classification Problem via the SLP

For a one-input perceptron the activation is $u = wx + b$; for a two-input perceptron with $\mathbf{x} = (x_1, x_2)^T$ it is $u = w_1 x_1 + w_2 x_2 + b$. The threshold $u_{thresh}$ splits the input space into two decision regions:

$$S_A = \{\mathbf{x} : u(\mathbf{x}) < u_{thresh}\}, \qquad S_B = \{\mathbf{x} : u(\mathbf{x}) > u_{thresh}\}.$$

[Figure: geometric interpretation of threshold logic units. For the one-input perceptron the decision boundary is the point $x_0$ at which $u(x) = u_{thresh}$; for the two-input perceptron it is the straight line $w_1 x_1 + w_2 x_2 + b = u_{thresh}$ separating $S_A$ from $S_B$.]
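A small numerical illustration of this geometry (all weights here are assumed values chosen for the example):

```python
import numpy as np

# One-input TLU: u = w*x + b; the boundary is the point x0 where u = u_thresh.
w, b, u_thresh = 2.0, -1.0, 0.0
x0 = (u_thresh - b) / w                # here x0 = 0.5
print("boundary point x0 =", x0)

# Two-input TLU: u = w1*x1 + w2*x2 + b; the boundary is a straight line.
w_vec, b2 = np.array([1.0, 1.0]), -1.5
for pt in [np.array([0.2, 0.3]), np.array([1.0, 1.0])]:
    region = "S_B" if w_vec @ pt + b2 > u_thresh else "S_A"
    print(pt, "->", region)            # below the line: S_A, above: S_B
```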
Lecture 6-6. Limitations of Single-Layer Perceptrons: the Exclusive-OR (XOR) Gate Problem [Minsky, Papert, 1969]

XOR logic:

Input x_1 | Input x_2 | Output u
    0     |     0     |    0
    0     |     1     |    1
    1     |     0     |    1
    1     |     1     |    0

[Figure: the four corners x(0,0), x(0,1), x(1,0), x(1,1) of the unit square; no single straight line separates the two output classes.]

A linear separating surface cannot solve the XOR gate classification task, as the check below confirms. The problem is overcome by a multi-layer network.
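The non-separability can be checked by brute force: the fragment below scans a grid of candidate weights for a single threshold unit and finds none that reproduces the XOR table. The grid itself is an illustrative device, not part of the lecture.

```python
import numpy as np
from itertools import product

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])

# Search a coarse grid of weights and biases for a single TLU solving XOR.
grid = np.linspace(-2, 2, 21)
solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, grid, grid)
    if np.array_equal((X @ np.array([w1, w2]) + b >= 0).astype(int), y_xor)
]
print("single-layer solutions found:", len(solutions))   # prints 0
```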
Lecture 6-7. Two-Layer Perceptron for Non-Linear Separability

With a one-input two-layer perceptron we obtain a closed separable region by cutting out a boundary. The two first-layer units are

$$u_1^{(1)} = w_1^{(1)} x + b_1, \qquad u_2^{(1)} = w_2^{(1)} x + b_2,$$

and the second layer combines the regions

$$S_1 = \{x : u_1(x) > u_{thresh}\}, \qquad S_2 = \{x : u_2(x) > u_{thresh}\}$$

into the closed separable boundary $S = S_1 \cap S_2$ in the one-dimensional space of $x$.

[Figure: the two thresholds cut the x-axis at x_01 and x_02, enclosing the segment between them.]
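A minimal sketch of this construction, with illustrative cut points at x = 1.0 and x = 2.0: the two first-layer units each fire on one side of a cut, and the second-layer unit ANDs them, so only the enclosed segment is accepted.

```python
import numpy as np

def tlu(u, thresh=0.0):
    return (u >= thresh).astype(int)

# Layer 1: two cuts on the x-axis. Opposite weight signs make u1 > 0 hold
# for x > 1.0 and u2 > 0 hold for x < 2.0, so both hold only on (1.0, 2.0).
def two_layer(x):
    u1 = tlu( 1.0 * x - 1.0)          # fires for x > 1.0  (region S_1)
    u2 = tlu(-1.0 * x + 2.0)          # fires for x < 2.0  (region S_2)
    return tlu(u1 + u2 - 1.5)         # layer 2: logical AND of S_1 and S_2

xs = np.array([0.5, 1.5, 2.5])
print(two_layer(xs))                  # [0 1 0]: only 1.5 lies inside S
```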
Lecture 6-8. Topology Classification of Multilayer NNs

With a two-input two-layer perceptron net we can realize a convex separable surface. Each first-layer neuron defines a decision boundary

$$u_i^{(1)} = (\mathbf{w}_i^{(1)})^T \mathbf{x} + b_i^{(1)}, \qquad i = 1, 2, 3,$$

and the layer-2 neuron intersects them into the convex separable region

$$S_A = S_1 \cap S_2 \cap S_3$$

in the two-dimensional input space.

[Figure: decision boundaries of neurons 1, 2 and 3 enclosing the convex region S_A; network with inputs x_1, x_2, first-layer units u_1^{(1)}, u_2^{(1)}, u_3^{(1)} and output unit u^{(2)}.]
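The same mechanism in two dimensions, sketched with three assumed half-planes whose intersection is a triangle; the output unit fires only when all three first-layer units agree.

```python
import numpy as np

def tlu(u):
    return (u >= 0).astype(int)

# Layer 1: three half-planes whose intersection is a triangular region.
W1 = np.array([[ 1.0,  0.0],     # neuron 1: x1 > 0
               [ 0.0,  1.0],     # neuron 2: x2 > 0
               [-1.0, -1.0]])    # neuron 3: x1 + x2 < 1
b1 = np.array([0.0, 0.0, 1.0])

def convex_region(x):
    h = tlu(W1 @ x + b1)                     # which half-planes contain x
    return tlu(np.array([h.sum() - 2.5]))    # layer 2: AND of all three

for pt in [np.array([0.2, 0.2]), np.array([0.8, 0.8])]:
    print(pt, "->", convex_region(pt))       # inside -> [1], outside -> [0]
```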
Lecture 6-9. Learning of Neural Networks

With a three-layer two-input perceptron net we can realize a non-convex separable surface. The complex concave separable surface is the union

$$Z = S_{A1} \cup S_{A2}, \qquad \text{where } S_{A1} = \{\mathbf{x} : u_1^{(2)}(\mathbf{x}) > u_{thresh}\}, \quad S_{A2} = \{\mathbf{x} : u_2^{(2)}(\mathbf{x}) > u_{thresh}\}.$$

[Figure: three-layer network with inputs x_1, x_2, first-layer units u_1^{(1)}, u_2^{(1)}, u_3^{(1)}, second-layer units u_1^{(2)}, u_2^{(2)}, and output y_1^{(3)}, realizing the non-convex region Z.]
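A sketch of the union operation: two convex sub-regions (illustrated here as two intervals on a single input, for brevity) are built as on the previous slides, and a final unit ORs them into the non-convex set Z.

```python
import numpy as np

def tlu(u):
    return (u >= 0).astype(int)

def band(x, lo, hi):
    """Convex sub-region: AND of two cuts, as on the previous slides."""
    return tlu(tlu(x - lo) + tlu(hi - x) - 1.5)

def non_convex(x):
    s1 = band(x, 0.0, 1.0)           # sub-region S_A1
    s2 = band(x, 2.0, 3.0)           # sub-region S_A2
    return tlu(s1 + s2 - 0.5)        # layer 3: logical OR -> Z = S_A1 U S_A2

xs = np.array([0.5, 1.5, 2.5])
print(non_convex(xs))                # [1 0 1]: two disjoint pieces, not convex
```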
Lecture 6-10. Learning Rule for the Single-Layer Perceptron

Learning of the SLP is posed as an optimization task: find the weights

$$\mathbf{W}_{opt}: \; J(\mathbf{W}) = J(\mathbf{y}_{tar} - \mathbf{y}) \rightarrow \min,$$

solved iteratively by

$$w_{ij}^{(k+1)} = w_{ij}^{(k)} + \Delta w_{ij}^{(k)}, \qquad \Delta w_{ij}^{(k)} = -\eta \, \frac{\partial J(\mathbf{W})}{\partial w_{ij}^{(k)}}.$$

Three variants of the cost function give three learning rules:

- Rosenblatt's learning rule, based on quantized error minimization: $J_R(\mathbf{W}) = \langle \mathbf{e}, \mathbf{e} \rangle$ with $\mathbf{e} = \mathbf{y}_{teach} - \mathrm{sgn}(\mathbf{W}\mathbf{x})$;
- the modified Rosenblatt's learning rule, based on non-quantized error minimization: $J_M(\mathbf{W}) = \frac{1}{2} \langle \mathbf{e}, \mathbf{e} \rangle$ with $\mathbf{e} = \mathbf{y}_{teach} - \mathbf{y}(\mathbf{W}\mathbf{x})$;
- the Widrow-Hoff learning rule (delta rule), based on state error minimization: $J_{WH}(\mathbf{W}) = \langle \mathbf{e}, \mathbf{e} \rangle$ with $\mathbf{e} = \mathbf{u}_{teach} - \mathbf{u}$ (sketched below).

The aim of learning is to minimize the instantaneous squared error of the output signal.
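Of the three rules, only the first two are expanded on the following slides, so here is a minimal sketch of the Widrow-Hoff (delta) rule; the targets u_teach and the learning rate are illustrative.

```python
import numpy as np

def delta_rule(X, u_teach, eta=0.05, epochs=200):
    """Widrow-Hoff rule: gradient descent on the squared state error."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, u_teach):
            u = w @ x + b             # linear state, before the hard limiter
            e = t - u                 # non-quantized state error
            w += eta * e * x          # Delta w = eta * e * x
            b += eta * e
    return w, b

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
u_teach = np.array([-1.0, -1.0, 1.0])  # desired pre-activation targets
print(delta_rule(X, u_teach))
```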
Lecture 6-11. Rosenblatt's Learning Rule

We determine the cost function via the quantized error $\mathbf{e} = (e_1, e_2, \ldots, e_m)^T$:

$$J_R(\mathbf{W}) = \langle \mathbf{e}, \mathbf{e} \rangle, \qquad \mathbf{e} = \mathbf{y}_{teach} - \mathbf{y} = \mathbf{y}_{teach} - \mathrm{sgn}(\mathbf{W}\mathbf{x}),$$

where each element $e_j$ is the quantized error

$$e_j = \begin{cases} 0, & \text{if } y_{teach,j} = 0, \; y_j = 0, \\ -1, & \text{if } y_{teach,j} = 0, \; y_j = 1, \\ 1, & \text{if } y_{teach,j} = 1, \; y_j = 0, \\ 0, & \text{if } y_{teach,j} = 1, \; y_j = 1. \end{cases}$$

Then the weight change $\Delta w_{ij}^{(k)} = -\eta \, \partial J_R(\mathbf{W}) / \partial w_{ij}^{(k)}$ takes the values

$$\Delta w_{ij}^{(k)} = \eta \, e_j^{(k)} x_i^{(k)} = \begin{cases} 0, & \text{if } y_j = 0, \; e_j = 0, \\ \eta \, x_i, & \text{if } y_j = 0, \; e_j = 1, \\ -\eta \, x_i, & \text{if } y_j = 1, \; e_j = -1, \\ 0, & \text{if } y_j = 1, \; e_j = 0. \end{cases}$$

The first original perceptron learning rule for adjusting the weights was developed by Rosenblatt; the sketch below spells out these cases in code.
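The case table collapses to the compact update Δw_ij = η e_j x_i with e_j in {-1, 0, +1}; the code below keeps the cases explicit so they can be matched against the table (data and learning rate are illustrative).

```python
import numpy as np

def quantized_error(y_teach_j, y_j):
    """The four cases of the quantized error e_j from the slide's table."""
    if y_teach_j == 0 and y_j == 0: return  0   # correct, no change
    if y_teach_j == 0 and y_j == 1: return -1   # fired when it should not
    if y_teach_j == 1 and y_j == 0: return  1   # failed to fire
    return 0                                    # y_teach_j == 1 and y_j == 1

def rosenblatt_update(w, x, y_teach, eta=0.1):
    """One Rosenblatt step: Delta w_ij = eta * e_j * x_i."""
    y = (w @ x >= 0).astype(int)
    e = np.array([quantized_error(t, yj) for t, yj in zip(y_teach, y)])
    return w + eta * np.outer(e, x)             # rank-1 weight correction

w = np.zeros((2, 3))                            # 2 output neurons, 3 inputs
x = np.array([1.0, 0.5, -0.5])
print(rosenblatt_update(w, x, y_teach=np.array([1, 0])))
```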
Lecture 6-12. Modified Rosenblatt's Learning Rule

In modern perceptron implementations the hard-limiter function is usually replaced by a smooth nonlinear activation function such as the sigmoid. We determine the modified cost function via the non-quantized error $\mathbf{e}$:

$$J_M(\mathbf{W}) = \frac{1}{2} \langle \mathbf{e}, \mathbf{e} \rangle, \qquad \mathbf{e} = \mathbf{y}_{teach} - \mathbf{y}(\mathbf{W}\mathbf{x}),$$

where $y(u) = \left(1 + \exp(-u)\right)^{-1}$, or $y(u) = \tanh(u/2)$.

Applying the algebraic transformation to $\Delta w_{ij}^{(k)} = -\eta \, \partial J_M(\mathbf{W}) / \partial w_{ij}^{(k)}$, we get the final equation

$$\Delta w_{ij}^{(k)} = \eta \, e_j^{(k)} \left[1 - \left(y_j^{(k)}\right)^2\right] x_i^{(k)}.$$
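A sketch of one step of the modified rule with a hyperbolic-tangent activation. It uses y = tanh(u), so the derivative factor is exactly (1 - y^2); the slide's tanh(u/2) variant differs only by a constant absorbed into η. Inputs and targets are illustrative.

```python
import numpy as np

def modified_rosenblatt_step(w, b, x, y_teach, eta=0.1):
    """One gradient step on J_M = 1/2 <e, e> with a smooth tanh activation."""
    y = np.tanh(w @ x + b)            # smooth unit in place of the hard limiter
    e = y_teach - y                   # non-quantized error
    grad_factor = e * (1.0 - y**2)    # (1 - y^2) is the tanh derivative
    w += eta * grad_factor * x        # Delta w_i = eta * e * (1 - y^2) * x_i
    b += eta * grad_factor
    return w, b, e

w, b = np.zeros(2), 0.0
for _ in range(50):                   # repeated steps shrink the error
    w, b, e = modified_rosenblatt_step(w, b, np.array([1.0, -1.0]), 1.0)
print(w, b, e)
```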
Learning RuleSLP ( ) AJxInitW0W[k]| | k W AyyteachW[k+1]Delta
RuleModif. Rosen.Rozenblatt Learning of a SLP illustrates a
supervised learning rule which aims to assign the input patterns
{x1, x2, ,xp} to one of the prespecified classes or categories with
desired response if perceptron outputs for every classes we know in
advanced the desired response.fTarfxL 6-13 Lecture 6Block-Diagram
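The block diagram amounts to the loop below: initialize W_0, present each pattern, compare the response with the teacher's, and apply whichever update rule is plugged in. The toy patterns and the random initialization scale are assumptions of the sketch.

```python
import numpy as np

def train_slp(X, y_teach, update, epochs=50):
    """Generic SLP training loop: Init W0 -> y -> compare -> W[k+1]."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=X.shape[1])   # Init W0 (small random)
    for _ in range(epochs):
        for x, t in zip(X, y_teach):
            y = int(w @ x >= 0)                  # current response
            w = update(w, x, t, y)               # plug in any learning rule
    return w

def rosenblatt(w, x, t, y, eta=0.1):
    return w + eta * (t - y) * x                 # quantized-error correction

X = np.array([[1.0, 0.2, 0.1],    # a leading 1 folds the bias into W
              [1.0, 0.8, 0.9]])
y_teach = np.array([0, 1])
w = train_slp(X, y_teach, rosenblatt)
print(w, [int(w @ x >= 0) for x in X])           # learned weights and labels
```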
Lecture 6-14. Block Diagram of Rosenblatt's Learning Rule

Rosenblatt's learning rule realizes the weight change value as

$$\Delta \mathbf{W}[k] = -\eta \, \nabla_{\mathbf{W}} J_R = \eta \, \mathbf{e}[k] \, \mathbf{x}^T[k],$$

a matrix with elements

$$\Delta w_{ij}[k] = \begin{cases} 0, & \text{if } y_j[k] = 0, \; e_j[k] = 0, \\ \eta \, x_i e_j[k], & \text{if } y_j[k] = 0, \; e_j[k] = 1, \\ \eta \, x_i e_j[k], & \text{if } y_j[k] = 1, \; e_j[k] = -1, \\ 0, & \text{if } y_j[k] = 1, \; e_j[k] = 0. \end{cases}$$
Recommended References

1. Minsky M., Papert S. Perceptrons, Expanded Edition: An Introduction to Computational Geometry. MIT Press, 1988.
2. Haykin S. Neural Networks: A Comprehensive Foundation. Macmillan, New York, 1994.
3. Fausett L. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall, 1994.
4. Cichocki A., Unbehauen R. Neural Networks for Optimization and Signal Processing. Wiley, 1993.