Page 1: Nn for Class_rkjha

Neural Networks (in its most general form)

A neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest; the network is usually implemented by using electronic components or is simulated in software on a digital computer.


[Figure: a biological neuron, with its dendrites labeled]

Page 2: Nn for Class_rkjha

Introduction

Neural networks are one of the important components of Artificial Intelligence (AI). They have been studied for many years in the hope of achieving human-like performance in many fields, such as (1) speech and image recognition and (2) information retrieval.

To make the term 'neural network' as used in this paper clear, and to expand considerably on its content, it is useful first to give a definition of the term, analyze the general structure of a neural network, and explore the advantages of neural network models.

Page 3: Nn for Class_rkjha

The Definition of Neural Network

According to Miller (1990): "Neural networks are also called neural nets, connectionist models, collective models, parallel distributed processing models, neuromorphic systems, and artificial neural networks by various researchers (p. 1-4)."

Similarly, in his article, Lippmann (1987) states: "Artificial neural net models or simple 'neural nets' go by many names such as connectionist models, parallel distributed processing models, and neuromorphic systems (p. 4)."

However, Doszkocs and his coworkers (1990) think connectionist models are more general than neural network models and "they include several related information processing approaches, such as artificial neural networks, spreading activation models, associative networks, and parallel distributed processing (p. 209)."

In their mind, "early connectionist models were called neural network models because they literally tried to model networks of brain cells (neurons) (p. 212)".

Page 4: Nn for Class_rkjha

The Definition of Neural Network

A neural network model (or neural model), as the term is used here, refers to a connectionist model that simulates the biophysical information processing occurring in the nervous system. So, even though connectionist models and neural network models have the same meaning in some of the literature, we prefer to regard connectionist models as the more general concept and neural network models as a subgroup of them.

A preliminary definition of a neural network was given by Kevin Gurney (1999): "A neural network is an interconnected assembly of simple processing elements, units or nodes, whose functionality is loosely based on the animal neuron. The processing ability of the network is stored in the inter-unit connection strengths, or weights, obtained by a process of adaptation to, or learning from, a set of training patterns."

Page 5: Nn for Class_rkjha

The Definition of Neural Network

The Hecht-Nielsen Neurocomputer Corporation provides the following definition of a neural network:

A cognitive information processing structure based upon models of brain function. In a more formal engineering context: a highly parallel dynamical system with the topology of a directed graph that can carry out information processing by means of its state response to continuous or initial input (as cited in Miller, 1990, p. 1-3).

Haykin (1999) has offered the following definition of a neural network viewed as an adaptive machine: "A neural network is a parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: (1) knowledge is acquired by the network from its environment through a learning process; (2) interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge (p. 2)."

Page 6: Nn for Class_rkjha

Summary

In summary, neural networks are models that attempt to achieve good performance via dense interconnection of simple computational elements (Lippmann, 1987).

Page 7: Nn for Class_rkjha

Review

A short review of early neural network models is given by Doszkocs, Reggia and Lin (1990), with three representative examples as follows.

• Networks based on logical neurons. These are the earliest neural network models. A logical neuron is a binary-state device, which is either off or on. There is no mechanism for learning, and a network for a desired input-output relationship must be designed manually (a minimal sketch of such a neuron follows this list).

• Elementary perceptrons. These are a kind of neural network developed during the 1950s and 1960s, which learns through changes of synaptic strength. Given any set of input patterns and any desired classification of the patterns, it is possible to construct an elementary perceptron that will perform the desired classification. The crowning achievement of work on elementary perceptrons is the perceptron convergence theorem.

• Linear networks. These are another class of neural models, developed primarily during the 1960s and 1970s. Much work on linear networks has focused on associative memories.
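The logical-neuron idea above can be made concrete in a few lines. Here is a minimal sketch in Python (the weights and threshold are illustrative choices, not taken from the reviewed papers): the unit outputs 1 only when its weighted input sum reaches a fixed threshold, and an AND gate must be wired by hand because there is no learning mechanism.

import itertools

def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts logical neuron: binary state, no learning."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Hand-designed AND gate: fires only when both inputs are on.
for a, b in itertools.product((0, 1), repeat=2):
    print(a, b, mp_neuron([a, b], weights=[1, 1], threshold=2))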

Page 8: Nn for Class_rkjha

Historical Sketch of Neural Networks_1940s

1940s: The natural components of mind-like machines are simple abstractions based on the behavior of biological nerve cells, and such machines can be built by interconnecting such elements.

W. McCulloch & W. Pitts (1943): the first theory on the fundamentals of neural computing (neuro-logical networks),

“A Logical Calculus of the Ideas Immanent in Nervous Activity”

==> the McCulloch-Pitts neuron model.

(1947) “How We Know Universals” - an essay on networks capable of recognizing spatial patterns invariant under geometric transformations.

Cybernetics: an attempt to combine concepts from biology, psychology, mathematics, and engineering.

Page 9: Nn for Class_rkjha

Historical Sketch of Neural Networks_1940s_Continued

D.O. Hebb (1949), “The Organization of Behavior”: the first theory of psychology built on conjectures about neural networks

==> neural networks might learn by constructing internal representations of concepts in the form of “cell-assemblies” - subfamilies of neurons that would learn to support one another’s activities.

==> Hebb’s learning rule: “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.”
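Hebb's postulate is commonly formalized as a weight change proportional to the product of the two cells' activities, delta_w = eta * x * y. A minimal sketch of that formalization (the learning rate and activity values are illustrative assumptions, not from Hebb's text):

def hebb_update(w, x, y, eta=0.1):
    """Hebbian learning: strengthen the weight in proportion to the
    correlated activity of presynaptic x and postsynaptic y."""
    return w + eta * x * y

w = 0.5
for _ in range(3):              # repeated co-activation of cells A and B
    w = hebb_update(w, x=1.0, y=1.0)
print(w)                        # the connection strength has grown to ~0.8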

Page 10: Nn for Class_rkjha

Historical Sketch of Neural Networks_1950s

Cybernetic machines were developed as specific architectures to perform specific functions ==> “machines that could learn to do things they aren’t built to do”

M. Minsky (1951) built a reinforcement-based network learning system.

IRE Symposium (1955) ==> “The Design of Machines to Simulate the Behavior of the Human Brain,” with four panel members: W.S. McCulloch, A.G. Oettinger, O.H. Schmitt, N. Rochester; invited questioners: M. Minsky, M. Rubinoff, E.L. Gruenberg, J. Mauchly, M.E. Moran, W. Pitts; and the moderator H.E. Tompkins.

F. Rosenblatt (1958): the first practical Artificial Neural Network (ANN) - the Perceptron

==> “The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain.”
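Rosenblatt's perceptron learns by correcting its weights on every misclassified pattern. The following is a minimal sketch of the classical perceptron rule (the data, learning rate, and epoch count are illustrative assumptions, not code from the 1958 paper):

def train_perceptron(patterns, targets, eta=0.1, epochs=10):
    """Perceptron rule: on each error, w += eta * (t - y) * x (bias included)."""
    w, b = [0.0] * len(patterns[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
            err = t - y
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
            b += eta * err
    return w, b

# Learns the linearly separable OR function.
w, b = train_perceptron([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 1, 1, 1])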

By the end of the 1950s, the NN field became dormant because of new AI advances based on serial processing of symbolic expressions.

Page 11: Nn for Class_rkjha

Historical Sketch of Neural Networks_1960s_Continued

Connectionism (Neural Networks) - versus - Symbolism (Formal Reasoning)

B. Widrow & M.E. Hoff (1960), “Adaptive Switching Circuits”: presents an adaptive perceptron-like network. The weights are adjusted so as to minimize the mean square error between the actual and desired output

==> the Least Mean Square (LMS) error algorithm.

(1961) Widrow and his students, “Generalization and Information Storage in Networks of Adaline Neurons.”
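In the LMS algorithm the output is the linear combiner itself, and each weight moves in proportion to the error between desired and actual output, which amounts to gradient descent on the mean square error. A minimal sketch (the learning rate and the single training pattern are illustrative assumptions):

def lms_step(w, x, d, eta=0.05):
    """One Widrow-Hoff step: y = w.x, then w += eta * (d - y) * x."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0]
for _ in range(100):            # repeated presentations shrink the error
    w = lms_step(w, x=[1.0, 0.5], d=1.0)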

M. Minsky & S. Papert (1969), “Perceptrons”: a formal analysis of perceptron networks explaining their limitations and indicating directions for overcoming them

==> the relationship between the perceptron’s architecture and what it can learn: “no machine can learn to recognize X unless it poses some scheme for representing X.”

Limitations of the perceptron networks led to a pessimistic view of the NN field as having no future ==> no more interest in, or funds for, NN research!!

Page 12: Nn for Class_rkjha

Review_continued

In 1990, Widrow and Lehr reviewed the 30 years of adaptive neural networks. They gave a description of (1) the history, (2) origination, (3) operating characteristics, and (4) basic theory of several supervised neural network training algorithms, including:

a. the perceptron rule,
b. the LMS (least mean square) algorithm,
c. the three Madaline rules, and
d. the back-propagation technique.

In his book, Haykin (1999) has provided some historical notes on neural networks based on a year-by-year literature review, which includes McCulloch and Pitts's logical neural network, Wiener's Cybernetics, Hebb's The Organization of Behavior, and so on.

In conclusion, the author claims that "perhaps more than any other publication, the 1982 paper by Hopfield and the 1986 two-volume book by Rumelhart and McClelland were the most influential publications responsible for the resurgence of interest in neural networks in the 1980s" (p. 44).

Page 13: Nn for Class_rkjha

Advantages of Neural Network Models over Traditional IR Models

In neural network models, information is represented as a network of weighted, interconnected nodes. In contrast to traditional information processing methods,neural network models are "self-processing" in that no external program operates on the network: the network literally processes itself, with "intelligent behavior" emerging from the local interactions that occur concurrently between the numerous network components (Reggia & Sutton, 1988).

According to Doszkocs, Reggia and Lin (1990), neural network models in general are fundamentally different from traditional information processing models in at least two ways.

• First, they are self-processing. Traditional information processing models typically make use of a passive data structure, which is always manipulated by an active external process or procedure. In contrast, the nodes and links in a neural network are active processing agents. There is typically no external active agent that operates on them; "intelligent behavior" is a global property of neural network models.

• Second, neural network models exhibit global system behaviors derived from concurrent local interactions among their numerous components. The external process that manipulates the underlying data structures in traditional IR models typically has global access to the entire network/rule set, and processing is strongly and explicitly sequentialized (Doszkocs, Reggia & Lin, 1990).

Page 14: Nn for Class_rkjha

Advantages of Neural Network Models over Traditional IR Models_Continued

Pandya and Macy (1996) have summarized that neural networks are natural classifiers with significant and desirable characteristics, which include but are not limited to the following:

• Resistance to noise
• Tolerance to distorted images/patterns (the ability to generalize)
• Superior ability to recognize partially occluded or degraded images
• Potential for parallel processing

Page 15: Nn for Class_rkjha

Advantages and Disadvantages

Advantages:
• A neural network can perform tasks that a linear program cannot.
• When an element of the neural network fails, the network can continue without any problem thanks to its parallel nature.
• A neural network learns and does not need to be reprogrammed.
• It can be implemented in any application.
• It can be implemented without any problem.

Disadvantages:
• The neural network needs training to operate.
• The architecture of a neural network is different from the architecture of microprocessors, and therefore it needs to be emulated.
• Large neural networks require high processing time.

Page 16: Nn for Class_rkjha

Components of a Neural Network

A neural network has three components:

• a network,
• an activation rule, and
• a learning rule.

1. The network consists of a set of nodes (units) connected together via directed links. Each node in the network has a numeric activation level associated with it at time t. The overall pattern vector of activations represents the current state of the network at time t.

2. The activation rule is a local procedure that each node follows in updating its activation level in the context of input from neighboring nodes.

3. The learning rule is a local procedure that describes how the weights on connections should be altered as a function of time.
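A minimal sketch can make the three components concrete (the network size, weights, and squashing function here are illustrative assumptions, not from the text): the state is a vector of activation levels, the activation rule updates a node from its neighbors' activations, and the learning rule alters a weight using only the activations at its two ends.

import math

state = [0.2, 0.7, 0.0]                  # activation level of each node at time t
weights = [[0.0, 0.5, -0.3],             # weights[j][k]: directed link j -> k
           [0.4, 0.0, 0.8],
           [0.1, -0.2, 0.0]]

def activation_rule(k):
    """Local update of node k from the input sent by neighboring nodes."""
    net = sum(weights[j][k] * state[j] for j in range(len(state)))
    return 1.0 / (1.0 + math.exp(-net))  # squash to (0, 1)

def learning_rule(j, k, eta=0.01):
    """Local weight change as a function of the two endpoint activations."""
    weights[j][k] += eta * state[j] * state[k]

state = [activation_rule(k) for k in range(len(state))]  # next network state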

Page 17: Nn for Class_rkjha

Elements of the Model

Similarly, Haykin (1999) thinks there are three basic elements of the neuronal model, which include:

1. A set of synapses or connecting links, each of which is characterized by a weight or strength of its own.

2. An adder for summing the input signals, weighted by the respective synapses of the neuron. These operations constitute a linear combiner.

3. An activation function for limiting the amplitude of the output of a neuron. (p. 10)

The activation function is also referred to as a squashing function, in that it squashes (limits) the permissible amplitude range of the output signal to some finite value. Types of activation functions include: 1) the threshold function; 2) the piecewise-linear function; and 3) the sigmoid function.

The sigmoid function, whose graph is s-shaped, is by far the most common form of activation function used in the construction of neural networks (p. 14).
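Haykin's three elements compose into the familiar single-neuron computation: a weighted sum of the inputs (the adder, or linear combiner) passed through an activation function. A minimal sketch using the sigmoid (the inputs, weights, and bias are illustrative values):

import math

def neuron(x, w, b):
    """Linear combiner followed by a sigmoid squashing function."""
    v = sum(wk * xk for wk, xk in zip(w, x)) + b   # adder output
    return 1.0 / (1.0 + math.exp(-v))              # sigmoid activation

y = neuron(x=[0.5, -1.0, 2.0], w=[0.8, 0.2, -0.5], b=0.1)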

Page 18: Nn for Class_rkjha

Activation Function_continued

Activation functions: the activation function acts as a squashing function, such that the output of a neuron in a neural network is between certain values (usually 0 and 1, or -1 and 1).

In general, there are three types of activation functions, denoted by Φ(·):

• Threshold function
• Piecewise-linear function
• Sigmoid function

First, there is the threshold function, which takes on a value of 0 if the summed input is less than a certain threshold value (v), and the value 1 if the summed input is greater than or equal to the threshold value.

Page 19: Nn for Class_rkjha

Activation Function_continued

Secondly, there is the piecewise-linear function. This function again can take on the values of 0 or 1, but it can also take on values in between, depending on the amplification factor in a certain region of linear operation.

Thirdly, there is the sigmoid function. This function can range between 0 and 1, but it is also sometimes useful to use the -1 to 1 range. An example of a sigmoid function for the -1 to 1 range is the hyperbolic tangent function.
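The three function types can be written out directly. A minimal sketch (the threshold v and the amplification factor are illustrative parameters; tanh serves as the example of a -1 to 1 sigmoid mentioned above):

import math

def threshold(x, v=0.0):
    """0 below the threshold v, 1 at or above it."""
    return 1.0 if x >= v else 0.0

def piecewise_linear(x, gain=1.0):
    """Linear with slope `gain` in the middle region, clipped to [0, 1]."""
    return min(1.0, max(0.0, gain * x + 0.5))

def sigmoid(x):
    """Logistic sigmoid, range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_sigmoid(x):
    """Hyperbolic tangent, a sigmoid for the (-1, 1) range."""
    return math.tanh(x)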

Page 20: Nn for Class_rkjha

Activation Function_continued

[Figure: graphs of the activation functions]

Page 21: Nn for Class_rkjha

Activation Functions

Function | Definition | Range
Identity | x | (-inf, +inf)
Logistic | 1 / (1 + e^-x) | (0, +1)
Hyperbolic | (e^x - e^-x) / (e^x + e^-x) | (-1, +1)
Negative exponential | e^-x | (0, +inf)
Softmax | e^x_i / sum_j e^x_j | (0, +1)
Unit sum | x_i / sum_j x_j | (0, +1)
Square root | sqrt(x) | (0, +inf)
Sine | sin(x) | [0, +1]
Ramp | -1 if x <= -1; x if -1 < x < 1; +1 if x >= 1 | [-1, +1]
Step | 0 if x < 0; +1 if x >= 0 | [0, +1]

Page 22: Nn for Class_rkjha

A framework for distributed representation

An artificial neural network consists of a pool of simple processing units which communicate by sending signals to each other over a large number of weighted connections. A set of major aspects of a parallel distributed model can be distinguished:

1. a set of processing units ('neurons,' 'cells');
2. a state of activation y_k for every unit, which is equivalent to the output of the unit;
3. connections between the units; generally each connection is defined by a weight w_jk which determines the effect which the signal of unit j has on unit k;
4. a propagation rule, which determines the effective input s_k of a unit from its external inputs;
5. an activation function F_k, which determines the new level of activation based on the effective input s_k(t) and the current activation y_k(t) (i.e., the update);
6. an external input (aka bias, offset) θ_k for each unit;
7. a method for information gathering (the learning rule);
8. an environment within which the system must operate, providing input signals and, if necessary, error signals.
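Aspects 4 and 5 of this framework are commonly combined into the update s_k(t) = sum_j w_jk * y_j(t) + θ_k followed by y_k(t+1) = F_k(s_k(t)). A minimal sketch under that reading, with tanh standing in for F_k:

import math

def effective_input(k, y, w, theta):
    """Propagation rule: s_k = sum_j w[j][k] * y[j] + theta[k]."""
    return sum(w[j][k] * y[j] for j in range(len(y))) + theta[k]

def update_unit(k, y, w, theta):
    """Activation function F_k applied to the effective input of unit k."""
    return math.tanh(effective_input(k, y, w, theta))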

Page 23: Nn for Class_rkjha

Processing units

Each unit performs a relatively simple job:

1. receive input from neighbours or external sources and use this to compute an output signal which is propagated to other units;
2. apart from this processing, a second task is the adjustment of the weights.

The system is inherently parallel in the sense that many units can carry out their computations at the same time. Within neural systems it is useful to distinguish three types of units:

1. input units (indicated by an index i) which receive data from outside the neural network,

2. output units (indicated by an index o) which send data out of the neural network, and

3. hidden units (indicated by an index h) whose input and output signals remain within the neural network.

During operation, units can be updated either synchronously or asynchronously. With synchronous updating, all units update their activation simultaneously; with asynchronous updating, each unit has a (usually fixed) probability of updating its activation at a time t, and usually only one unit will be able to do this at a time. In some cases the latter model has some advantages.
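The two update modes differ only in which state each unit reads. A minimal sketch (update_unit is a hypothetical per-unit rule, e.g. the one sketched for the framework above, with its weights already bound):

import random

def synchronous_step(y, update_unit):
    """All units update together, each reading the *old* state y."""
    return [update_unit(k, y) for k in range(len(y))]

def asynchronous_step(y, update_unit):
    """One randomly chosen unit updates, reading the *current* state."""
    k = random.randrange(len(y))
    y[k] = update_unit(k, y)
    return y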

Page 24: Nn for Class_rkjha

Neural Network Topologies

In the previous section we discussed the properties of the basic processing unit in an artificial neural network. This section focuses on the pattern of connections between the units and the propagation of data. As for this pattern of connections, the main distinction we can make is between:

• Feed-forward neural networks
• Recurrent neural networks

Feed-forward neural networks, where the data flow from input to output units is strictly feed-forward. The data processing can extend over multiple (layers of) units, but no feedback connections are present; that is, there are no connections extending from outputs of units to inputs of units in the same layer or previous layers.

Recurrent neural networks do contain feedback connections. Contrary to feed-forward networks, the dynamical properties of the network are important. In some cases, the activation values of the units undergo a relaxation process such that the neural network evolves to a stable state in which these activations do not change anymore. In other applications, the changes of the activation values of the output neurons are significant, such that the dynamical behaviour constitutes the output of the neural network (Pearlmutter, 1990).
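A minimal sketch of the two topologies (the layer shapes, weights, and fixed relaxation step count are illustrative assumptions): a feed-forward pass moves data strictly forward through successive layers, while a recurrent network repeatedly feeds its activations back through the same weights until, ideally, they stop changing.

import math

def layer(y, w, theta):
    """One layer: out[k] = tanh(sum_j w[j][k] * y[j] + theta[k])."""
    return [math.tanh(sum(w[j][k] * y[j] for j in range(len(y))) + theta[k])
            for k in range(len(theta))]

def feed_forward(x, layers):
    """Strictly forward data flow, no feedback connections."""
    for w, theta in layers:
        x = layer(x, w, theta)
    return x

def relax(y, w, theta, steps=50):
    """Recurrent network: iterate the feedback weights toward a stable state."""
    for _ in range(steps):
        y = layer(y, w, theta)
    return y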

Page 25: Nn for Class_rkjha

Artificial Neural Network Architectures

Artificial neural networks are mathematical entities that are modeled after existing biological neurons found in the brain. All the mathematical models are based on the basic building block known as the artificial neuron. A simple neuron is shown in figure 1: a neuron with a single R-element input vector. Here the individual element inputs p1, p2, …, pn
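In the usual formulation of this figure, the neuron forms a net input n = w.p + b from the R-element input vector p and applies a transfer function f, so that a = f(w.p + b). A minimal sketch under that assumption, with a logistic f as an illustrative choice:

import math

def single_neuron(p, w, b, f):
    """Single neuron with R-element input vector p: a = f(w.p + b)."""
    n = sum(wi * pi for wi, pi in zip(w, p))   # net input (weighted sum)
    return f(n + b)

a = single_neuron(p=[1.0, 2.0, 3.0], w=[0.5, -0.25, 0.1], b=0.2,
                  f=lambda n: 1.0 / (1.0 + math.exp(-n)))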