Artificial Neural Networks and Classification

An artificial neural network is a simple brain-like device that can learn by adjusting connections between its neurons.

The brain as a computer

The brain's architecture

Human (and animal) brains have a 'computer' architecture which consists of a complex web of about 10^11 highly inter-connected processing units called neurons. Processing involves signals being sent from neuron to neuron by complicated electrochemical reactions in a highly parallel manner.

The neuron

A neuron is a nerve cell consisting of:
- a cell body (soma) containing a nucleus
- a number of fibres called dendrites, branching out from the body
- a single long fibre called the axon (a centimetre or longer)

The axon branches and connects to the dendrites of other neurons:
- the connecting junction is called the synapse
- each neuron connects to between a dozen and 100,000 other neurons

A real neuron

Signal propagation

Chemical transmitter substances are released from the synapses and enter the dendrites. This raises or lowers the electrical potential of the cell body:
- synapses that raise the potential are called excitatory
- those that lower it are inhibitory

When a threshold is reached, an electrical pulse, the action potential, is sent down the axon (firing). This spreads into the axon's branches, reaching synapses and releasing transmitters into the cell bodies of other neurons.

Brain versus computer

Storage capacity: the brain has more neurons than a computer has bits.

Speed: the brain is much slower than a computer; a neuron has a firing speed of 10^-3 secs, compared to a computer switching speed of 10^-11 secs. The brain relies on massive parallelism for performance: you can recognise your mother in 0.1 secs.

The brain is more suited to intelligence processing and learning. It is good at forming associations; this seems to be the basis of learning.

The brain is also more fault tolerant: neurons die all the time and computation continues. Task performance exhibits graceful degradation, in contrast to the brittleness of computers.

Artificial neural networks

What is an artificial neural network?

An artificial neural network (ANN) is a grossly oversimplified version of the brain's architecture:
- it has far fewer 'neurons' (several hundred or thousand)
- it has a much simpler internal structure
- the firing mechanism is less complex
- the signals consist of real numbers passed from one neuron to another

How does a network behave?

Most ANNs can be regarded as input-output devices: numerical input is propagated through the network from neuron to neuron till it reaches the output. The connections between neurons have numerical weights which are used to combine the signals reaching a neuron. Learning involves establishing the weight values (strengths) to achieve a particular goal. In theory the strengths could be programmed rather than learnt, but for the most part this would be impossibly tedious.

Designing a network

Creating an ANN requires the following to be specified:
- Network topology: the number of units, the pattern of interconnectivity amongst them, and the mathematical type of the weights
- Transfer function: this combines the inputs impinging on the unit and produces the unit activation level, which then becomes the output signal
- Representation for examples
- Learning law: this states how weights are to be modified to achieve the learning goal

Network topology - neurons and layers

Specifies how many nodes (neurons) there are and how they are connected. In a fully connected network, each node is connected to every other. Often networks are organised in layers (slabs), with no connections between nodes in a layer - only across. The first layer is the input layer; the last, the output layer. Layers between the input and output layers are called hidden.

The input units typically do not carry out internal computation, i.e. do not have transfer functions; they merely pass on their signal values. The output units send their signal directly to the outside world.

Network topology - weights

Weights are usually real-valued. At the start of learning, their values are often set randomly. If there is a connection from a to b, then a has influence over the activation value of b.

Excitatory influence: high activation in unit a contributes to high activation in unit b; this is modelled by a positive weight.

Inhibitory influence: high activation in unit a contributes to low activation in unit b; this is modelled by a negative weight.

Network topology - flow of computation

Although connections are uni-directional, some networks have pairs of units connected in both directions: there is a connection from unit a to unit b and one back from unit b to unit a.

Networks in which there is no looping back of connections are called feed-forward: signals are 'fed forward' from input through to output. Networks in which outputs are eventually fed back into the network as inputs are called recurrent.

Examples of feed-forward topologies

[Figure: a single layer network with a 6-node input layer and a 2-node output layer, and a two layer network with one hidden layer, consisting of a 4-node input layer, a 4-node hidden layer and a 1-node output layer]

The transfer function - combining input signals

The input signals to a neuron must be combined into a single value, the activation level, to be output. Usually this transfer takes place in two stages: first the inputs are combined, and then passed through another function to produce the output. The most common method of combination is the weighted sum:

    sum = w1 x1 + ... + wn xn

Here xi is the signal, wi is the weight on connection i, and n is the number of input signals.

The transfer function - the activation level

The weighted sum is passed through an activation function to produce the output signal (activation level) y'. Commonly used functions are:

- Linear: the output is just the weighted sum
- Linear threshold (step function): the weighted sum is thresholded at a value c; if it is less than c, then y' = 0, otherwise y' = 1
- Sigmoid response (logistic) function: a continuous version of the step function which produces graceful degradation around the 'step' at c:

    y' = 1 / (1 + e^-(sum - c))
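As an illustration of the two-stage transfer function above, here is a minimal Python sketch; the function names and sample values are ours, not the slides':

```python
import math

def weighted_sum(weights, signals):
    # sum = w1*x1 + ... + wn*xn
    return sum(w * x for w, x in zip(weights, signals))

def linear(s):
    # output is just the weighted sum
    return s

def step(s, c=0.0):
    # thresholded at c: 0 below c, 1 otherwise
    return 0.0 if s < c else 1.0

def sigmoid(s, c=0.0):
    # continuous version of the step, with graceful degradation around c
    return 1.0 / (1.0 + math.exp(-(s - c)))

s = weighted_sum([0.5, -0.3, 0.8], [1.0, 0.0, 1.0])   # 1.3
print(linear(s), step(s), sigmoid(s))
```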

Activation function graphs

[Figure: graphs of the linear, step and sigmoid activation functions; the step and sigmoid rise from 0 to 1 around the threshold c]

Example

[Figure: worked example of a unit combining its weighted inputs; the weight values are not recoverable from the transcript]

Learning with ANNs

What tasks can a network learn?

Networks can be trained for the following tasks:
- Classification
- Pattern association, e.g. English verbs mapped to their past tense
- Content addressable/associative memory, e.g. recalling/restoring a whole image when provided with a part of it

These all involve mappings. The mapping of input to output is determined by the settings of all the weights in the network (the weight vector) - this is what is learnt. The network node configuration together with the weight vector is the knowledge structure.

Learning laws

Learning provides a means of finding the weight settings to implement a mapping. This is only possible if the network is capable of representing the mapping: the more complex the mapping, the larger the network that will be required, including a greater number of hidden layers.

Initially, weights are set at random and altered in response to the training data. A regime for weight alteration to achieve the required mapping is called a learning law. Even if a network can represent a mapping, a particular learning law may not be able to learn it.

Representation of training examples

Unlike decision trees, which handle both discrete and continuous (numeric) attributes, ANNs can handle only the latter. All discrete attributes must be converted (encoded) to be numeric. This also applies to the class. Several ways are available, and the choice affects the success of learning.

Description attributes

It is desirable for all attributes to have values in the same range, usually taken to be 0 to 1. This is achieved for numeric attributes using normalisation:

    value -> (value - min value) / (max value - min value)

For discrete attributes one can use:
- 1-out-of-N encoding (distributed): N binary (0-1) units are used to represent the N values of the attribute, one for each
- local encoding: values are mapped to numbers in the range 0 to 1; more suited to ordered values
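A small Python sketch of these encodings (the helper names and sample values are ours):

```python
def normalise(value, min_value, max_value):
    # value -> (value - min value) / (max value - min value), giving a 0..1 range
    return (value - min_value) / (max_value - min_value)

def one_out_of_n(value, values):
    # distributed (1-out-of-N) encoding: one binary unit per possible value
    return [1 if v == value else 0 for v in values]

def local_encoding(value, ordered_values):
    # local encoding: map the i-th of N ordered values into the range 0..1
    return ordered_values.index(value) / (len(ordered_values) - 1)

print(normalise(70, -50, 150))                                   # 0.6
print(one_out_of_n('overcast', ['sunny', 'overcast', 'rain']))   # [0, 1, 0]
print(local_encoding('normal', ['low', 'normal', 'high']))       # 0.5
```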

Class attribute

1-out-of-N or local encoding can be used for the class. The network output after learning is usually only approximate; e.g. in a binary class problem with classes represented by 0 and 1, the network might output 0.5 and this would be taken as '1'.

Using 1-out-of-N encoding allows for a probabilistic interpretation, e.g. the classes for the car domain (unacc, acc, good, vgood) can be represented with four binary units, e.g. acc -> (0, 1, 0, 0). Output of (0...

Network configuration

Encoding of training examples affects network size. The input layer will have:
- one unit for each numeric attribute
- one for each locally encoded discrete attribute
- 1 for each binary discrete attribute
- k for each distributed encoding of a discrete attribute, where the attribute has k > 2 values

Usually there are a small number of hidden layers (one or two).
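A sketch of the input-layer sizing rule in Python (the spec format is our own; it assumes distributed encoding for discrete attributes with more than two values):

```python
def input_layer_size(attributes):
    # attributes: list of ('numeric',), ('binary',) or ('discrete', k) specs
    size = 0
    for spec in attributes:
        if spec[0] in ('numeric', 'binary'):
            size += 1                    # one unit each
        else:
            k = spec[1]
            size += k if k > 2 else 1    # k units for a distributed encoding
    return size

# golf domain: outlook (3 values), temperature, humidity (3 values), windy
print(input_layer_size([('discrete', 3), ('numeric',),
                        ('discrete', 3), ('binary',)]))   # 8
```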

Pyramid structure

Hidden layers are used to reduce the dimensionality of the input. A network has a pyramid structure if:
- the first hidden layer has fewer nodes than the input layer
- each hidden layer has fewer than its predecessor
- the output layer has the fewest

The pyramid structure facilitates learning. In classification, each hidden layer appears to partially classify the examples, until the actual classes are reached in the output layer.

The learning process

Classification learning uses a feedback mechanism. An example is fed through the network using the existing weights. The output value is O; the correct output value, i.e. the class in the example, is T (target). If O ≠ T, some or all of the weights are changed slightly. The extent of the change usually depends on T - O, called the error.

The delta rule

A weight, wi, on a connection carrying signal xi can be modified by adding an amount Δwi proportional to the error:

    Δwi = η (T - O) xi

where η is the learning rate. η is a positive constant, usually set at about 0.1 and gradually decreased during learning. The update formula for wi is then:

    wi <- wi + Δwi
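A minimal sketch of one delta-rule update in Python; the names target, output, learning_rate and inputs are ours for the slides' T, O, η and xi:

```python
def delta_rule_update(weights, inputs, target, output, learning_rate=0.1):
    # w_i <- w_i + eta * (T - O) * x_i
    error = target - output
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]

# if the unit output 0 but the target class is 1, weights on active inputs rise
print(delta_rule_update([0.3, -0.4, 0.2], [1.0, 0.0, 1.0], target=1, output=0))
```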

Training epochs

For each example in the training set, the description attribute values are fed as input to the network and propagated through to the output, and each weight is updated. This constitutes one epoch, or cycle, of learning. The process is repeated till it is decided to stop; many thousands of epochs may be necessary. The final set of weights represents the learned mapping.
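Continuing the delta-rule sketch above, an epoch loop for a single unit with a step activation might look like this (predict, train and the threshold value are our own illustrative choices):

```python
def predict(weights, inputs, threshold=0.5):
    # single unit: weighted sum passed through a step function at `threshold`
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= threshold else 0

def train(weights, examples, epochs=1000, learning_rate=0.1):
    # one epoch = one pass over all training examples, updating after each
    for _ in range(epochs):
        for inputs, target in examples:
            output = predict(weights, inputs)
            weights = delta_rule_update(weights, inputs, target, output,
                                        learning_rate)
    return weights
```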

Worked example - golf domain

Conversion of attributes:

    Attribute      Values
    -----------    -------------------------
    Outlook        sunny, overcast, rain
    Temperature    -50 to 150 F
    Humidity       low, normal, high
    Windy          true, false
    Class          yes, no

    Attribute                Values
    ---------------------    ---------------------------
    Sunny, Overcast, Rain    0 or 1 each (1-out-of-3)
    Temperature              0 to 1;  T <- (T + 50) / 200
    Low, Normal, High        0 or 1 each (1-out-of-3)
    Windy                    0, 1
    Play golf                1, 0
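A sketch of this conversion applied to one example (the helper name and attribute order are ours):

```python
def encode_example(outlook, temperature, humidity, windy):
    # 1-out-of-3 for outlook and humidity, normalised temperature, binary windy
    outlook_units = [1 if outlook == v else 0
                     for v in ('sunny', 'overcast', 'rain')]
    temp_unit = (temperature + 50) / 200       # T <- (T + 50) / 200
    humidity_units = [1 if humidity == v else 0
                      for v in ('low', 'normal', 'high')]
    windy_unit = 1 if windy else 0
    return outlook_units + [temp_unit] + humidity_units + [windy_unit]

print(encode_example('sunny', 70, 'high', False))
# [1, 0, 0, 0.6, 0, 0, 1, 0] - the eight network inputs
```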

Network configuration

Use a single layer network (no hidden units) with a step function to illustrate the delta rule. Initialise the weights as shown, and set η = 0.1.

[Figure: a single layer network with a bias input fixed at -1 (weight w0) and eight inputs - Sunny, Overcast, Rain, Temperature, Low, Normal, High, Windy - each with a small positive or negative initial weight w1 to w8; the individual values are not recoverable from the transcript]

Feeding a training example

The first example is (sunny, ...

The backpropagation algorithm

Learning in multi-layered networks

Networks with one or more hidden layers are necessary to represent complex mappings. In such a network, the basic delta learning law is insufficient: it only defines how to update weights in output units (it uses T - O). To update hidden node weights, we have to define their error. This is achieved by the backpropagation algorithm.

The backpropagation process

Inputs are fed through the network in the usual way: this is the forward pass. Output layer weights are adjusted based on errors... then weights in the previous layer are adjusted... and so on back to the first layer: this is the backwards pass (or backpropagation). Errors determined in a layer are used to determine those in the previous layer.

Illustrating the error contribution

A hidden node is partially 'credited' for errors in the next layer; these errors are created in the forward pass.

[Figure: a hidden node connected by weights w1 ... wk to nodes in the next layer carrying error1 ... errork]

    error contribution = w1 × error1 + ... + wk × errork
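In code, the credited error is just this weighted sum over the node's outgoing connections (a sketch, with our own names):

```python
def error_contribution(outgoing_weights, next_layer_errors):
    # w1*error1 + ... + wk*errork
    return sum(w * e for w, e in zip(outgoing_weights, next_layer_errors))
```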

The backpropagation algorithm

A backpropagation network is a multi-layered feed-forward network using the sigmoid response activation function.

Backpropagation algorithm: initialise all network weights to small random numbers (between 0.5 and 0.05) ...
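The transcript cuts the algorithm listing off here. As a minimal sketch of the remaining steps (ours, not the slides' pseudocode), one forward/backward pass for a network with a single hidden layer of sigmoid units and one sigmoid output unit could look like this:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def backprop_step(x, target, w_hidden, w_output, eta=0.1):
    # w_hidden: one weight vector per hidden unit; w_output: output unit weights
    # forward pass
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_output, hidden)))

    # backward pass: the output delta uses (T - O) and the sigmoid derivative
    delta_out = output * (1 - output) * (target - output)
    # each hidden node is 'credited' for the error via its outgoing weight
    delta_hidden = [h * (1 - h) * w_output[j] * delta_out
                    for j, h in enumerate(hidden)]

    # weight updates, delta-rule style
    w_output = [w + eta * delta_out * h for w, h in zip(w_output, hidden)]
    w_hidden = [[w + eta * dh * xi for w, xi in zip(ws, x)]
                for ws, dh in zip(w_hidden, delta_hidden)]
    return w_hidden, w_output
```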

Termination conditions

Many thousands of iterations (epochs or cycles) may be necessary to learn a classification mapping; the more complex the mapping to be learnt, the more cycles will be required. Several termination conditions are used:
- stop after a given number of epochs
- stop when the error on the training examples (or on a separate validation set) falls below some agreed level

Stopping too soon results in underfitting, too late in overfitting.
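The two stopping rules can be combined in a single check, sketched here with our own names and default values:

```python
def should_stop(epoch, error, max_epochs=10000, target_error=0.01):
    # stop after a fixed number of epochs, or once the (training or
    # validation) error falls below the agreed level
    return epoch >= max_epochs or error <= target_error
```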

Backpropagation as a search

Learning is a search for a network weight vector to implement the required mapping. The search is hill-climbing, or rather descending, called steepest gradient descent. The heuristic used is the total (T - O)² error over the output units.

Problems with the search

The size of step is controlled by the learning rate parameter; this must be tuned for individual problems. If the step is too large, the search becomes inefficient.

The error surface tends to have extensive flat areas and troughs with very little slope. It can be difficult to reduce error in such regions: weights have to move large distances, and it can be hard to determine the right direction. High numerical accuracy is required, e.g. 3...

The trained network

After learning, backpropagation may be used as a classifier: descriptions of new examples are fed into the network and the class is read from the output layer. For 1-out-of-N output representations, exact values of 0 and 1 will not usually be obtained.

Sensitivity analysis (using test data) determines which attributes are most important for classification. An attribute is regarded as important if small changes in its value affect the classification.
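A crude sketch of sensitivity analysis as just described (the function and parameter names are ours; classify stands for the trained network used as a classifier):

```python
def sensitivity(classify, examples, attribute_index, delta=0.05):
    # fraction of test examples whose class changes when one attribute
    # is nudged by +delta: higher means the attribute matters more
    changed = 0
    for x in examples:
        perturbed = list(x)
        perturbed[attribute_index] += delta
        if classify(perturbed) != classify(x):
            changed += 1
    return changed / len(examples)
```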

Backpropagation versus ID3

These two algorithms are the giants of classification learning. Which is better? The jury is still out. There are major differences:
- ID3 favours discrete attributes, backprop favours continuous (but each handles both types)
- Backprop handles noise well; by using pruning, so does ID3
- Backprop is much slower than ID3 and may get stuck
- ID3 tells us which attributes are important; backprop does this (to some extent) with sensitivity analysis
- Backprop's learned knowledge structure (the weight vector) is not understandable, whereas an ID3 tree can be comprehended (although this is difficult if the tree is large)