A Formal Model for Definition and Simulation of Generic Neural Networks

M. A. ATENCIA¹, G. JOYA² and F. SANDOVAL²

¹ Departamento de Lenguajes y Ciencias de la Computación, E.T.S.I. Informática, Universidad de Málaga, Campus de Teatinos, 29071 Málaga, Spain. E-mail: [email protected]
² Departamento de Tecnología Electrónica, E.T.S.I. Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071 Málaga, Spain

Abstract. This paper presents the definition of a formal data structure, which assists in the characterization of any neural paradigm, with no restriction, including higher-order networks. Within this model, a neural network is mathematically described by specifying some static parameters (number of neurons, order) as well as a set of statistical distributions (which we call the network 'dynamics'). Once a concrete set of distributions is defined, a single algorithm can simulate any neural paradigm. The presented structure assists in an exhaustive and precise description of the network characteristics and the simulation parameters, providing us with a unified criterion for comparing models and evaluating proposed systems. Though not presented here, the formal model has inspired a software simulator, which implements any system defined according to this structure, thus facilitating the analysis and modelling of neuronal paradigms.

Key words: abstract data type, artificial neural networks, backpropagation, computer simulation, data structure, higher-order, Hopfield paradigm, system modelling

1. Introduction

Any proposed neural solution should fulfil some requirements: easy simulation, quantitative evaluation and the possibility of comparison with other alternatives. However, each paradigm is specified with a different terminology, sometimes with qualitative descriptions of the network characteristics; as for simulations, they often depend upon ad-hoc programming, and the implementation details are seldom completely specified. These two facts obstruct the evaluation of paradigms and the comparison among them. To avoid these problems, we consider it necessary to have available a formal structure that allows the description of both the topological and the dynamical characteristics of any neuronal paradigm. The specification of this structure must be accomplished with a precise language, which includes mechanisms for the representation of parallelism and synchronization processes, typical in neural networks. Also, this structure must be easily implemented in simulation software, which is desirable to ease the following processes:

Neural Processing Letters 11: 87-105, 2000. © 2000 Kluwer Academic Publishers. Printed in the Netherlands.


• Modification or extension of the model without reprogramming.
• Interactive evaluation of every small modification of the model.
• Obtainment of information about the behaviour of the model. For instance, in the case of recurrent networks, we may be interested in observing the stability (or instability) of the network.

This paper presents a data structure that fulfils the mentioned requirements. The specification of the network's dynamical properties is performed by means of probability distributions; each network is described by a specific choice of the set of distributions. We name the network state 'configuration'; the evolution of the configuration is obtained by the application of an algorithm with a fixed definition, but whose result depends on the concrete set of distributions that the network dynamics comprises. Unlike other models [12, 13], we have aimed to achieve maximum generality in the neuronal paradigms that may be represented by the formal data structure. We have also implemented a software package [1] that simulates the configuration evolution starting from the static (order, number of neurons) and dynamic (probability distributions) properties of the network. Contrary to other simulators [2], these properties need not be chosen from a previously established set; instead, they are defined by the modeller.

In Section 2 we define the characteristics of a neural network that are relevant for our work: configuration and dynamics. The configuration is the set of variables that embody all the information about the network state at a given instant. The network dynamics is a set of functions and probability distributions that determine the dynamical behaviour of the network; that is, given a configuration, the dynamics governs the process of obtaining a new configuration. In Section 3, the network evolution is described as the change of its configuration, according to an algorithm that is applicable to every network; however, the concrete behaviour of this algorithm depends on the concrete network dynamics. The algorithm comprises several processes, which, in order of complexity, are: activation, step, simulation phase and experiment. An activation implies the firing (and possibly a change of state) of a single neuron. A step is a sequence of activations, and represents a simultaneous change of state of a set of neurons. A simulation phase is a succession of steps, and is the largest process that a network may undergo as long as there are no changes in weights (apart from their initialization). An experiment is a sequence of phases, and it may involve changes in weights, as well as in neuron states. These concepts are clarified by their application to two neuronal paradigms: the Hopfield model and the multilayer perceptron. Finally, Section 4 summarizes the conclusions and presents some fields of future work. In Appendices A.1 and A.2 the paradigms that have been used as examples are briefly described. A summary of notation is included at the end.


2. Neural Network: Configuration and Dynamics

The study of a neural network through its computer simulation suggests the following definition: a neural network is a dynamical system whose state evolves along a discrete temporal scale. This evolution will be formulated as the successive transformation of one configuration into another, according to an algorithm. As will be explained in Section 3, even though this algorithm is fixed, its application makes use of some parameters (which are probability distributions) that are defined differently for each concrete network. In this way, any neural network may be mathematically defined by specifying its static parameters (number of neurons, order) and its dynamical properties (probability distributions), and the simulation of the network may be achieved with a single algorithmic scheme.

2.1. DEFINITION: CONFIGURATION

A neural network is a dynamical system whose state, at a given instant, is characterized by a 5-tuple C, named configuration.

C = {n, m, s, W, a}

where

n is named the number of neurons, n ∈ ℕ.

m is named the order, m ∈ ℕ.

s is a vector named the vector of states, s = (s_0, ..., s_{n−1}), s_i ∈ ℝ, ∀i ∈ {0, ..., n−1}.

W is called the collection of weights, W = (W_0, ..., W_m), where W_j is the collection of weights of order j, which comprises the n^{j+1} weights

$$W_j = \{\, w_{i_0 i_1 \ldots i_j} \in \mathbb{R} \;:\; i_k \in \{0, \ldots, n-1\},\ k \in \{0, \ldots, j\} \,\}$$

Zero-order weights, the members of the collection W_0, are often called biases or thresholds.

a is called the latest activated neuron, a ∈ {0, ..., n−1}.

We will denote the set of all possible configurations by the symbol Γ.
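To fix ideas, the following Python sketch (ours, for illustration; it is not the simulator of [1]) encodes the 5-tuple. Storing each collection W_j as a mapping from (j+1)-tuples of indices to reals is an assumption that keeps arbitrary orders representable.

```python
# Hypothetical sketch of the configuration 5-tuple C = {n, m, s, W, a}.
# W[j] stores the order-j weights as a mapping from the (j+1)-index
# tuple (i_0, ..., i_j) to a real value, so any order m is representable.
from dataclasses import dataclass
from typing import Dict, List, Tuple

Weights = List[Dict[Tuple[int, ...], float]]  # W[0] ... W[m]

@dataclass
class Configuration:
    n: int          # number of neurons
    m: int          # order
    s: List[float]  # vector of states s_0 ... s_{n-1}
    W: Weights      # weight collections W_0 ... W_m
    a: int          # latest activated neuron

# Example: a first-order network (m = 1) with 3 neurons and zero weights.
C = Configuration(
    n=3, m=1,
    s=[0.0, 0.0, 0.0],
    W=[{(i,): 0.0 for i in range(3)},                       # W_0: biases
       {(i, j): 0.0 for i in range(3) for j in range(3)}],  # W_1
    a=0,
)
```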


2.2. NOMENCLATURE: PARAMETRIC PROBABILITY DISTRIBUTIONS

Given a set A, let θ be a probability distribution depending on a parameter x ∈ A and defined on a set B; that is, given an element x ∈ A, θ assigns a probability P_x(y) to each element y ∈ B. Under these conditions, we can perform a drawing of lots among the elements of B, so that each element y ∈ B may be chosen with probability P_x(y); with the symbol θ(x) we designate the result of a concrete execution of such a drawing. With the symbol p(B : A) we designate the class of all probability distributions on a set B that depend on a parameter x ∈ A; that is, if θ ∈ p(B : A), then θ(x) ∈ B, where x ∈ A. These drawing processes may be seen as the execution of an algorithm or, using the terminology of the programming languages field, of a procedure with parameters. If the distribution θ on the set B has no parameters, we will write θ ∈ p(B), and the execution of the drawing process produces a result θ( ) ∈ B, which is equivalent to the execution of a procedure without parameters.

Moreover, the evaluation of a function f: A → B is also equivalent to the application of an algorithm which, given x ∈ A, produces an element y ∈ B; but in this case the process is deterministic, as the value y = f(x) is the same whenever the algorithm is executed. Thus, given a function f: A → B, we adopt the convention that f ∈ p(B : A), unifying the nomenclature for probability distributions (stochastic processes) and functions (deterministic processes).

For instance, the functions or probability distributions that characterize the dynamics of a neural network usually depend on the current configuration of the network; that is, they belong to p(B : Γ), and the set B is defined differently in each case. Using this nomenclature, we define the following auxiliary probability distributions, which will be used below:

ν ∈ p(I) is the uniform distribution on the interval I = [0, 1].
φ ∈ p({0, ..., i−1} : ℕ) is the discrete probability distribution that, given i ∈ ℕ, assigns to each integer 0, ..., i−1 the probability 1/i.

It is worth noting that, in a parametric distribution, not only the probabilities but also the definition set may depend on the parameter. That is, we may have a distribution θ ∈ p(B_x : A), where x ∈ A, and B_x ≠ B_y if x ≠ y. The distribution φ defined above is an example of this situation, as the set {0, ..., i−1} depends on the value i ∈ ℕ under consideration. We will not use any particular notation to mark this fact, which will be easily deduced from the context.
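This procedural reading translates directly into code. A minimal sketch, assuming only Python's standard random module; nu and phi stand for the auxiliary distributions ν and φ just defined.

```python
# Distributions in p(B : A) as procedures: given a parameter in A, a call
# draws an element of B.  Deterministic functions share the same interface.
import random

def nu() -> float:
    """nu in p(I): uniform drawing on the interval I = [0, 1]."""
    return random.random()

def phi(i: int) -> int:
    """phi in p({0,...,i-1} : N): each integer 0..i-1 has probability 1/i.
    Note that the definition set itself depends on the parameter i."""
    return random.randrange(i)

print(nu())    # e.g. 0.3721...
print(phi(5))  # one of 0, 1, 2, 3, 4
```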

2.3. DEFINITION: DYNAMICS

The evolution of a neural network (the transformation of one configuration into another) is achieved by means of the application of a fixed algorithm, whose results depend on the evaluation of the probability distributions that have been defined for each concrete network. A data structure which comprises these distributions is named the network dynamics. Given a neural network, its dynamics is described by the 8-tuple


D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W}. The distributions that compose D are described below; many of them use the current configuration of the network as a parameter, so from now on we will assume a configuration C ∈ Γ, defined as C = {n, m, s, W, a}.

θ_sl ∈ p({0, ..., n−1} : Γ) is a probability distribution called selection. Given a configuration C ∈ Γ, θ_sl(C) will be used as the mechanism for choosing the neuron that must activate at that instant. For instance, if we define θ_sl(C) = (a+1) mod n, neurons will fire sequentially in a cyclic way, as in the multilayer perceptron (Appendix A.2); with θ_sl(C) = φ(n), neurons will activate randomly, as in asynchronous feedback networks (Appendix A.1). (The distribution φ(n) was defined in Paragraph 2.2 as the uniform discrete distribution on the set {0, ..., n−1}.)

θ_pt ∈ p(ℝ : Γ × {0, ..., n−1}) is a probability distribution named neuron input or potential. When a neuron, say i, is selected for firing, θ_pt(C, i) calculates the input to the activated neuron. For instance, in a first-order network, a typical definition is

$$\theta_{pt}(C, i) = \sum_{j=0}^{n-1} w_{ij} s_j$$

This is the case in the examples of Appendices A.1 and A.2.

θ_ac ∈ p(ℝ : ℝ) is a distribution named activation. Once a neuron is selected for activation, the distribution θ_ac results in the new neuron state, taking the input of the neuron as a parameter. Usually, θ_ac is a monotonically increasing function, e.g. θ_ac(x) = tanh(x), as in the examples of Appendices A.1 and A.2.

θ_sn ∈ p({0, 1} : Γ) is a distribution which will be designated synchronism or step decision. θ_sn will be used to simulate synchronism: in our model only a single neuron activates at each instant, in order to ease the implementation on sequential computers. Simultaneous activation of several neurons is modelled by means of a simulation step (see Paragraph 3.2 for a detailed description): neurons fire in a sequence {i_1, i_2, ..., i_k}; when a configuration C is reached such that the evaluation of θ_sn(C) results in the value 1, the step is concluded; this process produces an effect identical to the simultaneous activation of the set of neurons {i_1, i_2, ..., i_k}. For instance, in an asynchronous network, no more than one neuron activates at each instant, and this may be simulated by defining a constant value for the step decision, θ_sn(C) = 1, as in a Hopfield network (Appendix A.1). In a layered network, a simulation step must be performed whenever every neuron in a layer has fired. This behaviour is modelled by defining

$$\theta_{sn}(C) = \begin{cases} 0 & \text{if } layer(a) = layer((a+1) \bmod n) \\ 1 & \text{otherwise} \end{cases}$$

as in the multilayer perceptron shown in Appendix A.2. This example points out the tight interdependence between this distribution and the selection θ_sl: a correct definition of synchronism relies on the sequential activation of neurons in a previously defined order.

The remaining distributions, which also belong to the definition of the network dynamics, correspond to simulation characteristics rather than to intrinsic properties of the network. These characteristics are often specified with a qualitative description, but we think that an accurate definition of them is necessary for a precise analysis and comparison among models.


θ_in ∈ p(ℝ : Γ × {0, ..., n−1}) is a distribution named state initialization or, simply, initialization. Given a configuration C ∈ Γ, neuron i will be set to the result of the evaluation of θ_in(C, i) whenever a simulation phase starts (see Paragraph 3.3). For instance, if we define θ_in(C, i) = ν( ), every neuron initializes its value randomly in the interval [0, 1], as in the Hopfield network of Appendix A.1 (according to Paragraph 2.2, ν( ) is a uniform distribution). In a backpropagation network, such as the multilayer perceptron of Appendix A.2, we define

$$\theta_{in}(C, i) = \begin{cases} p_{ji} & \text{if } layer(i) = 0 \\ s_i & \text{otherwise} \end{cases}$$

where P_j = (p_{j1}, ..., p_{j n_0}) is the pattern currently being presented to the network; i.e. neurons in the input layer hold the pattern, and any other neuron keeps the value it reached at the end of the previous simulation phase.

θ_te ∈ p({0, 1} : Γ) is a distribution named termination or halt condition. For each configuration C ∈ Γ reached after a simulation step is completed, θ_te(C) is calculated: if (and only if) it results in the value 1, the simulation phase stops (see Paragraph 3.3). For instance, in Appendix A.1 we consider a Hopfield network whose neurons take values in the interval [0, 1] while the solutions (final states) must lie in the set {0, 1}; the halt condition may be defined as every neuron state being less than ε away from one of the extremes of the interval, which may be implemented as θ_te(C) = min_i(dif(s_i)), where

$$dif(s_i) = \begin{cases} 1 & \text{if } s_i < \varepsilon \text{ or } s_i > 1 - \varepsilon \\ 0 & \text{otherwise} \end{cases}$$

If the network simulation stops when all neurons have been sequentially activated, we define

$$\theta_{te}(C) = \begin{cases} 1 & \text{if } a = n - 1 \\ 0 & \text{otherwise} \end{cases}$$

as in the multilayer perceptron (Appendix A.2).

θ_re ∈ p({0, 1} : Γ) is a distribution named repetition condition. For each configuration C ∈ Γ reached after a simulation phase is completed, θ_re(C) is calculated: if (and only if) it results in the value 1, a new simulation phase is undertaken, at the beginning of which initialization of states and weights may be performed. In Paragraphs 3.3 and 3.4 the concepts of simulation phase and experiment (sequence of simulation phases) are described in full detail. If the network simulation consists of a single simulation phase, as in the Hopfield network of Appendix A.1, this distribution must be defined as the constant θ_re(C) = 0. However, in a multilayer perceptron (Appendix A.2) the simulation must restart while training patterns remain and, also, until the network is correctly trained on these patterns (up to an error ε). This behaviour is implemented by defining

$$\theta_{re}(C) = \begin{cases} 0 & \text{if no patterns remain and } Error < \varepsilon \\ 1 & \text{otherwise} \end{cases}$$

θ_W is a tuple named weight initializations, defined as θ_W = {θ_{w_0}, θ_{w_1}, ..., θ_{w_m}}, where θ_{w_j} ∈ p(ℝ : Γ × {0, ..., n−1}^{j+1}) for every j. The collection of distributions θ_W embodies the process that calculates the weights of a neural network starting from a configuration C, assigning to the order-j weights the values w_{i_0...i_j} = θ_{w_j}(C, i_0, ..., i_j). In a Hopfield network, weights are calculated just once, by identifying the function to be minimized with the energy function (see, for instance, Appendix A.1). On the other hand, in a network trained with backpropagation, weights are randomly initialized and are modified after every simulation phase, according to the formulae shown in Appendix A.2. This latter example shows the need for a process that comprises several simulation phases, which we name experiment: as we will see in Paragraph 3.3, weight calculation is only accomplished at the beginning of a simulation phase, not during the phase, so a training process (which is a weight change mechanism) requires several phases.
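As an illustration of how the 8-tuple might be held in software (a sketch of ours, not the simulator of [1]), the dynamics can be a record of callables. The concrete bodies below are the asynchronous, Hopfield-style example choices discussed in the text, with placeholder zero weights and ε = 0.05; they build on the Configuration sketch of Paragraph 2.1.

```python
# Hypothetical record for the dynamics D = {theta_sl, ..., theta_W}.
# Each field is a callable with the signature given in the text.
import math
import random
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Dynamics:
    sel: Callable        # theta_sl(C)    -> neuron chosen for activation
    pot: Callable        # theta_pt(C, i) -> input (potential) of neuron i
    act: Callable        # theta_ac(u)    -> new state from potential u
    syn: Callable        # theta_sn(C)    -> 1 concludes the current step
    init: Callable       # theta_in(C, i) -> initial state of neuron i
    term: Callable       # theta_te(C)    -> 1 halts the simulation phase
    rep: Callable        # theta_re(C)    -> 1 launches another phase
    wgt: List[Callable]  # theta_W: one initializer per weight order

# Asynchronous, Hopfield-style example dynamics (placeholder weights).
D = Dynamics(
    sel=lambda C: random.randrange(C.n),          # random selection: phi(n)
    pot=lambda C, i: sum(C.W[1][(i, j)] * C.s[j] for j in range(C.n)),
    act=math.tanh,
    syn=lambda C: 1,                              # every activation is a step
    init=lambda C, i: random.random(),            # nu()
    term=lambda C: min(1 if s < 0.05 or s > 0.95 else 0 for s in C.s),
    rep=lambda C: 0,                              # a single simulation phase
    wgt=[lambda C, i: 0.0, lambda C, i, j: 0.0],  # zero weights, for brevity
)
```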

3. Network Evolution: Transformation of Configurations

Once the number of neurons, the order and the dynamics of a neural network have been defined, its evolution may be simulated by means of a fixed algorithm, consisting of the application of several processes that transform the network configuration. These processes are, ordered according to their complexity:

• Activation
• Simulation step
• Simulation phase
• Experiment

The algorithmic definition of these processes is permanent, but it involves the probability distributions that make up the network dynamics. Thus, the result of the execution of this algorithm depends on the concrete neural network that we are simulating; moreover, the components of the dynamics are probability distributions, so successive executions of the simulation may produce different results. Our aim is a single algorithmic scheme that is able to simulate any neural network whose behaviour is accurately defined.

3.1. ACTIVATION

We define an activation as a process that starts from an 'actual' configuration C^i and a 'working' configuration C^w, and results in a new configuration C^o which is identical to C^w, except for the state of a single neuron i. The neuron i that is being activated is picked according to the result of the selection distribution θ_sl, which receives as a parameter the working configuration; the state of neuron i is calculated by the potential θ_pt and the activation θ_ac, using as a parameter the actual configuration C^i. This formalism is adopted so that the new neuron states resulting from several successive activations do not alter the actual configuration; instead, they are stored in the working configuration. When a simulation step is completed, the actual state vector is updated, producing the same result as the simultaneous activation of several neurons (see Paragraph 3.2 below for the definition of step and examples).

DEFINITION: ACTIVATION
Let:

D be a dynamics, D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W};
C^i be a configuration, C^i = {n, m, s^i, W, a^i}, s^i = (s^i_j);
C^w be a configuration, C^w = {n, m, s^w, W, a^w}, s^w = (s^w_j).

An activation, A_D(C^i, C^w), is defined as the process of obtaining a new configuration A_D(C^i, C^w) = C^o = {n, m, s^o, W, a^o}, s^o = (s^o_j), where

$$a^o = \theta_{sl}(C^w)$$

$$s^o_j = \begin{cases} \theta_{ac}(\theta_{pt}(C^i, a^o)) & \text{if } j = a^o \\ s^w_j & \text{if } j \neq a^o \end{cases}$$
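The definition transcribes almost literally into code; a sketch reusing the Configuration and Dynamics records from the earlier sketches:

```python
# Activation A_D(Ci, Cw): copy the working configuration, select a neuron
# with theta_sl(Cw), and recompute only that neuron's state, taking the
# potential from the *actual* configuration Ci.
from copy import deepcopy

def activation(D, Ci, Cw):
    Co = deepcopy(Cw)
    Co.a = D.sel(Cw)                     # a_o = theta_sl(C_w)
    Co.s[Co.a] = D.act(D.pot(Ci, Co.a))  # s_o = theta_ac(theta_pt(C_i, a_o))
    return Co
```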

3.2. SIMULATION STEP

A simulation step is a formal concept that aims to represent the simultaneous activation of several neurons, a parallel process, on a sequential computer or programming language. A simulation step may be defined as the successive computation of the state of one or more neurons, starting from an initial configuration C, such that the calculated states do not influence the calculation of the states of subsequent neurons. Instead, the calculated states are stored in successive configurations C', C'', ..., which may be called 'working configurations'. Only at the end of the step is a resulting configuration with every updated state obtained.

The dynamical description of a neural network must determine the sets of neurons that activate simultaneously at each instant. In this work, this is accomplished by means of functions, so the definition of a step results in the following formal mechanism: the step condition θ_sn is evaluated after every activation; if it produces the value 1, the step finishes; otherwise, a new activation occurs. In this way, the different classes of networks may be represented with adequate definitions of θ_sn and the selection distribution θ_sl, for instance:

• Asynchronous networks: θ_sn(C) = 1 for every C: every activation involves a step. The result is equivalent to the asynchronous activation of the neurons. This is the case of the Hopfield network, as in Appendix A.1.


• Synchronous networks:

$$\theta_{sn}(C) = \begin{cases} 1 & \text{if } a = n - 1 \\ 0 & \text{if } a \neq n - 1 \end{cases} \qquad \theta_{sl}(C) = (a + 1) \bmod n$$

Neuron activation occurs in a cyclic way, but new states pass to the output only after every cycle; that is, with our terminology, a step is completed after the activation of neuron n−1. The same result is obtained if all neurons activate simultaneously.

• Layered networks:

$$\theta_{sn}(C) = \begin{cases} 1 & \text{if } a \text{ is the last neuron of a layer} \\ 0 & \text{otherwise} \end{cases} \qquad \theta_{sl}(C) = (a + 1) \bmod n$$

Neuron activation occurs in a cyclic way but, unlike the previous case, a simulation step is completed whenever all neurons in a layer have been activated. The same result is obtained if all neurons in a layer activate simultaneously. This is the definition used for the multilayer perceptron in Appendix A.2.

DEFINITION: STEP
Let:

D be a dynamics, D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W};
C^i be a configuration, C^i = {n, m, s^i, W, a^i}, s^i = (s^i_j).

A simulation step, S_D(C^i), is defined as the process of obtaining a new configuration S_D(C^i) = C^o = {n, m, s^o, W, a^o}, s^o = (s^o_j), by means of the following sequence, which finishes when the evaluation of the step condition results in the value 1:

C^1 = A_D(C^i, C^i),    θ_sn(C^1) = 0
C^2 = A_D(C^i, C^1),    θ_sn(C^2) = 0
...
C^{p−1} = A_D(C^i, C^{p−2}),    θ_sn(C^{p−1}) = 0
C^p = A_D(C^i, C^{p−1}),    θ_sn(C^p) = 1

The obtained configuration is C^o = C^p.
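A sketch of the step, built on the activation function above. It also collects the set of fired neurons, which the phase-level bookkeeping of Paragraph 3.3 will need; returning this set is our convenience, not part of the formal definition.

```python
# Step S_D(Ci): activate repeatedly against the fixed actual configuration
# Ci until theta_sn returns 1.  The set of fired neurons is also returned,
# since the phase (Paragraph 3.3) needs the activated set AC(S_D(Ci)).
def step(D, Ci):
    Cw = activation(D, Ci, Ci)
    fired = {Cw.a}
    while D.syn(Cw) != 1:
        Cw = activation(D, Ci, Cw)
        fired.add(Cw.a)
    return Cw, fired
```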

3.3. SIMULATION PHASE

A simulation phase is a process that transforms one configuration into another and consists of two stages: an initialization procedure, and a sequence of simulation steps that finishes when either:


• the network has reached a stable state, or
• the evaluation of the halt condition θ_te results in the value 1.

A network is considered to have reached a stable state if every neuron has been selected at least once in an activation process but there has been no modification of the state vector. We will need some preliminary definitions.

Given the step S_D(C), with working configurations C^1, C^2, ..., C^p, the subset of {0, ..., n−1} that contains the indices of the neurons that have been activated at least once is called the activated set, represented by the symbol AC(S_D(C)):

$$AC(S_D(C)) = \bigcup_{j=1}^{p} \{ a^j \}, \quad \text{where } C^j = \{n, m, s^j, W, a^j\}$$

Given the step S_D(C) and a set E_0 ⊆ {0, ..., n−1} (the activated set before the step), the activated set after the step is called the stable-activated set, SA(S_D(C), E_0) ⊆ {0, ..., n−1}, and it is defined as the union of E_0 and the activated set of S_D(C) if the step has not produced a modification of the network state; otherwise, the stable-activated set is the empty set, that is:

$$SA(S_D(C), E_0) = \begin{cases} \varnothing & \text{if } s \neq s' \\ E_0 \cup AC(S_D(C)) & \text{if } s = s' \end{cases}$$

where s and s' denote the state vectors before and after the step, respectively.

If the only criterion to stop the simulation in a feedback network is convergence to a stable state, the halt condition may be defined as the constant θ_te = 0. On the other hand, in feedforward-only networks, where the concept of stability is not applicable, the specification of θ_te allows for the definition of the criterion to stop the simulation. For instance, in the multilayer perceptron of Appendix A.2, we have defined θ_te = 1 if and only if every neuron has been activated.

DEFINITION: SIMULATION PHASE
Let:

D be a dynamics, D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W};
C^i be a configuration, C^i = {n, m, s^i, W, a^i}, s^i = (s^i_j).

A simulation phase, P_D(C^i), is defined as the process of obtaining a new configuration P_D(C^i) = C^o = {n, m, s^o, W, a^o}, s^o = (s^o_j), through:

(1) An initialization stage resulting in a configuration C^1 = {n, m, s^1, W^1, a^1}; W^1 = (W^1_j), j = 0, ..., m; W^1_j = (w^1_{i_0...i_j}), i_k = 0, ..., n−1; s^1 = (s^1_i):

(a) Weight initialization: $w^1_{i_0 \ldots i_j} = \theta_{w_j}(C^i, i_0, \ldots, i_j)$
(b) State initialization: $s^1_j = \theta_{in}(C^i, j)$

(2) A sequence of simulation steps that finishes when either the network has reached a stable state (that is, the stable-activated set comprises every neuron) or the evaluation of the halt condition results in the value 1:

E_1 = ∅
C^2 = S_D(C^1),    E_2 = SA(S_D(C^1), E_1),    E_2 ≠ {0, ..., n−1} and θ_te(C^2) = 0
C^3 = S_D(C^2),    E_3 = SA(S_D(C^2), E_2),    E_3 ≠ {0, ..., n−1} and θ_te(C^3) = 0
...
C^{p−1} = S_D(C^{p−2}),    E_{p−1} = SA(S_D(C^{p−2}), E_{p−2}),    E_{p−1} ≠ {0, ..., n−1} and θ_te(C^{p−1}) = 0
C^p = S_D(C^{p−1}),    E_p = SA(S_D(C^{p−1}), E_{p−1}),    E_p = {0, ..., n−1} or θ_te(C^p) = 1

The obtained configuration is P_D(C^i) = C^o = C^p.
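A sketch of the phase, under the same assumptions as the previous sketches (the step function returning the fired set, and weight collections W_0, ..., W_m indexed by tuples):

```python
# Phase P_D(Ci): (1) initialize weights via theta_W and states via theta_in;
# (2) run steps until the stable-activated set E covers every neuron
# (stability) or the halt condition theta_te returns 1.
from copy import deepcopy
from itertools import product

def phase(D, Ci):
    C = deepcopy(Ci)
    for j, theta_wj in enumerate(D.wgt):       # weight initialization
        for idx in product(range(C.n), repeat=j + 1):
            C.W[j][idx] = theta_wj(Ci, *idx)
    C.s = [D.init(Ci, i) for i in range(C.n)]  # state initialization
    E = set()                                  # stable-activated set
    while True:
        Cnext, fired = step(D, C)
        # SA: grow E if the step left the state vector unchanged, else reset.
        E = (E | fired) if Cnext.s == C.s else set()
        C = Cnext
        if E == set(range(C.n)) or D.term(C) == 1:
            return C
```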

3.4. EXPERIMENT

The transformation processes defined so far allow for the complete representation of neuronal paradigms which do not involve changes in weights, such as the Hopfield network. The models that include some learning mechanism must be simulated using a new process, which we call experiment. An experiment is defined as a sequence of one or more simulation phases; at the end of every phase, the repetition condition θ_re is evaluated and, if it returns the value 1, a new phase is executed. For instance, in the multilayer perceptron (Appendix A.2) a simulation phase is executed whenever a new training pattern is presented. Thus, new simulation phases must be performed while patterns remain. Moreover, if all patterns have been presented and the error is still significant, a new presentation of all patterns must be accomplished. Taking these considerations into account, the following definition is adopted:

$$\theta_{re}(C) = \begin{cases} 0 & \text{if no patterns are left and } Error < \varepsilon \\ 1 & \text{otherwise} \end{cases}$$

At the beginning of each simulation phase, as mentioned above, initialization of weights and states may be performed. This can be used to implement models that involve modification of some parameters of the network, such as the slope of the activation function [8, 9].

DEFINITION: EXPERIMENT
Let:

D be a dynamics, D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W};
NN be a neural network with n neurons and order m.

We define an experiment, denoted E_D(n, m), as the obtainment of a final configuration E_D(n, m) = C^o = {n, m, s^o, W, a^o}, s^o = (s^o_j), through the following procedure, which finishes when the evaluation of the repetition condition results in the value 0:

C^1 = {n, m, s^1, W^1, a^1}, where
    s^1 = (s^1_j),  s^1_j = 0,  j = 0, ..., n−1
    W^1 = (W^1_j),  j = 0, ..., m;  W^1_j = (w^1_{i_0...i_j}),  w^1_{i_0...i_j} = 0,  i_k = 0, ..., n−1
    a^1 = 0

C^2 = P_D(C^1),    θ_re(C^2) = 1
C^3 = P_D(C^2),    θ_re(C^3) = 1
...
C^{p−1} = P_D(C^{p−2}),    θ_re(C^{p−1}) = 1
C^p = P_D(C^{p−1}),    θ_re(C^p) = 0

The obtained configuration is E_D(n, m) = C^o = C^p.
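The outermost loop, sketched under the same assumptions as the previous blocks:

```python
# Experiment E_D(n, m): start from the null configuration (zero states and
# weights, a = 0) and chain phases while theta_re returns 1.
from itertools import product

def experiment(D, n, m):
    C = Configuration(
        n=n, m=m,
        s=[0.0] * n,
        W=[{idx: 0.0 for idx in product(range(n), repeat=j + 1)}
           for j in range(m + 1)],
        a=0,
    )
    C = phase(D, C)
    while D.rep(C) == 1:
        C = phase(D, C)
    return C

# Example: run the Hopfield-style dynamics D sketched in Paragraph 2.3.
# final = experiment(D, n=3, m=1)
```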

4. Conclusions

We have presented a model for the definition of neural networks with a comprehensive scope. No restriction is imposed a priori on the network features. A neural network is completely and precisely described by specifying:

• Number of neurons.
• Order.
• Dynamics, which is a set of probability distributions or functions. Its definition characterizes the behaviour of the described paradigm.

The evolution of any network is formally described by means of a fixed algorithm, which is composed of the following processes:

• Activation, which corresponds to the firing of a single neuron.
• Simulation step, which is a sequence of one or more activations.
• Simulation phase, which is a sequence of usually many steps.
• Experiment, which is a sequence of one or more phases.

The result of this algorithm depends on the concrete definition of the distributions which the network dynamics comprises. The specification of the network characteristics through a set of mathematical objects is a straightforward and precise method, which eases a complete and objective description of the network, thus facilitating its simulation and interactive study, as well as its evaluation and comparison with other models. This open and flexible structure eases the design of hybrid systems with non-neuronal elements, such as neuro-fuzzy and stochastic systems, or inference networks [10] based upon classical artificial intelligence. This formal structure has inspired the implementation of simulation software [1] that directly performs the simulation of any neural network defined according to the formal model.


Moreover, the simulator includes some additional facilities, such as the modification of network parameters during the simulation, or the storage of intermediate results.

Some directions of future work include:

• Constructive-destructive learning, with the implementation of mechanisms for the modification of the topological features of the network (number of neurons and order) that we have considered fixed in this paper.
• Application of the formal model and the simulator to the analytic and statistical study of several neuronal paradigms, especially convergence and stability analysis in Hopfield networks [7].
• Simulation software enhancements, adopting object-oriented programming and handling networks of greater size (in both order and number of neurons). Moreover, a parallel implementation of the simulator is under study.

Appendices: Examples of Definition of Neuronal Paradigms

A.1. HOPFIELD NETWORK FOR THE SOLUTION OF THE TRAVELLING SALESMAN PROBLEM (TSP)

In a continuous Hopfield network [4] neurons are activated randomly and asynchronously. When neuron i is activated, its state s_i is calculated according to the expression:

$$u_i = \sum_{j=0}^{n-1} w_{ij} s_j - w_i, \qquad s_i = g(u_i)$$

where n is the number of neurons, w_ij is the weight of the connection from neuron j to neuron i, w_i is the bias or threshold of neuron i, and g is a sigmoid-like function, such as the hyperbolic tangent. The network has the following energy function:

$$E = -\sum_{i,j=0}^{n-1} w_{ij} s_i s_j + \sum_{i=0}^{n-1} w_i s_i$$

In the Travelling Salesman Problem (TSP) there is a set of p nodes (cities), linked pairwise by paths of some length. The solution of the problem is a minimal-length path that travels through all the nodes once and only once, going back to the initial node. This problem, like other optimization problems [6], is NP-complete [3], and its traditional solutions are computationally complex. A first-order Hopfield network for the solution of the TSP has been proposed which usually provides an acceptable solution, though not always the optimum [5]. The network comprises n = p² neurons arranged in a square matrix with p rows and p columns. When the network reaches a valid solution, one and only one neuron in each row and column is on (value 1) and the rest are off (value 0). The network state is mapped into a solution in the following way: if the neuron at column i and row x is on, it represents that node x is the i-th node of the path. The energy function that the network must minimize, corresponding to minimum-length paths, is

$$E = \frac{A}{2} \sum_{x} \sum_{i} \sum_{j \neq i} s_{xi} s_{xj} + \frac{B}{2} \sum_{i} \sum_{x} \sum_{y \neq x} s_{xi} s_{yi} + \frac{C}{2} \left( \sum_{x} \sum_{i} s_{xi} - p \right)^{2} + \frac{D}{2} \sum_{x} \sum_{y \neq x} \sum_{i} d_{xy}\, s_{xi} (s_{y,i+1} + s_{y,i-1})$$

where (x, i) denotes the neuron at row x and column i, s_xi is the state of that neuron, A, B, C and D are positive constants, and d_xy is the length of the path from city x to city y. To simplify the notation, column subindices are taken modulo p, that is, s_{y,p} = s_{y,0}. When this function is compared with the network energy, the following values for the weights are obtained:

$$w_{xi,yj} = -A\, \delta_{xy} (1 - \delta_{ij}) - B\, \delta_{ij} (1 - \delta_{xy}) - C - D\, d_{xy} (\delta_{j,i+1} + \delta_{j,i-1})$$

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

where w_{xi,yj} is the weight of the connection from neuron (x, i) to neuron (y, j); and the biases are computed by:

$$w_{xi} = -C\, p$$

In [5], the following values for the constants are proposed: A = B = 500, C = 200, D = 500.
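For illustration, the pair of weight initializers θ_w0 and θ_w1 might be coded as below; the index mapping node(i) = i div p, stage(i) = i mod p is our assumption, since the paper leaves the transformation node(·), stage(·) abstract.

```python
# Sketch of the TSP weight construction of [5]: A = B = D = 500, C = 200.
A_ = B_ = D_ = 500.0
C_ = 200.0

def delta(u, v):                     # Kronecker delta
    return 1.0 if u == v else 0.0

def make_tsp_weights(p, dist):       # dist[x][y]: path length between cities
    def w0(conf, i):                 # biases: w_xi = -C p
        return -C_ * p
    def w1(conf, i, j):
        x, z = i // p, i % p         # node(i), stage(i)  (assumed mapping)
        y, t = j // p, j % p         # node(j), stage(j)
        return (-A_ * delta(x, y) * (1 - delta(z, t))
                - B_ * delta(z, t) * (1 - delta(x, y))
                - C_
                - D_ * dist[x][y] * (delta(t, (z + 1) % p)
                                     + delta(t, (z - 1) % p)))
    return [w0, w1]                  # theta_W = {theta_w0, theta_w1}
```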

According to the formal model described in this paper, the Hopfield network for the solution of the TSP with p nodes is defined as:

n = p², m = 1

D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W}

Let C be a configuration, C = {n, m, s, W, a}, s = (s_i), i = 0, ..., n−1:

$$\theta_{sl}(C) = \varphi(n)$$

$$\theta_{pt}(C, i) = \sum_{j=0}^{n-1} w_{ij} s_j$$

$$\theta_{ac}(x) = \tanh(x)$$

$$\theta_{sn}(C) = 1$$

$$\theta_{in}(C, i) = \nu(\,)$$

The simulation finishes when an (almost) discrete state is reached:

$$\theta_{te}(C) = \min_i(dif(s_i)), \quad \text{where } dif(s_i) = \begin{cases} 1 & \text{if } s_i < \varepsilon \text{ or } s_i > 1 - \varepsilon \\ 0 & \text{otherwise} \end{cases}$$

$$\theta_{re}(C) = 0$$

$$\theta_W = \{\theta_{w_0}, \theta_{w_1}\}$$

$$\theta_{w_0}(C, i) = -200\, p$$

$$\theta_{w_1}(C, i, j) = -500\, \delta_{xy} (1 - \delta_{zt}) - 500\, \delta_{zt} (1 - \delta_{xy}) - 200 - 500\, d_{xy} (\delta_{t,z+1} + \delta_{t,z-1})$$

where x = node(i), z = stage(i), y = node(j), t = stage(j) is the transformation from neuron indices to the bidimensional arrangement that represents the obtained path.

A.2. MULTILAYER PERCEPTRON WITH BACKPROPAGATION LEARNING

Neurons in the multilayer perceptron are arranged in layers, and the neurons in each layer are activated simultaneously. When a neuron i at layer p is activated, its new state is computed according to the expression:

$$u_i = \sum_{j} w_{ij} s_j - w_i, \qquad s_i = \tanh(u_i)$$

where the index j in the sum ranges only over the neurons in layer p−1. The weight values are computed through the backpropagation algorithm: several training patterns are presented to the network, and after every presentation the weights of the connections to the neurons in the output layer are updated as:

$$\delta_i = (1 - s_i^2)(d_i - s_i), \qquad \Delta w_{ij} = \mu\, \delta_i s_j$$

where d_i is the correct value of neuron i for the presented pattern and μ is a parameter called the learning rate. To compute the connections to the neurons in intermediate layers, the error is 'back-propagated':

$$\delta_i = (1 - s_i^2) \sum_{l} w_{li}\, \delta_l, \qquad \Delta w_{ij} = \mu\, \delta_i s_j$$

where, if neuron i belongs to layer p, the sum over l includes only neurons in layer p+1. In [11], the value μ = 0.05 is suggested.
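These update rules translate directly into code; a sketch for tanh units, where next_ws and next_deltas (hypothetical names) collect the weights w_li and the deltas δ_l of the layer above:

```python
# Backpropagation quantities from the formulas above (mu = 0.05 as in [11]).
MU = 0.05

def output_delta(s_i, d_i):
    # delta_i = (1 - s_i^2)(d_i - s_i) for a neuron in the output layer
    return (1 - s_i ** 2) * (d_i - s_i)

def hidden_delta(s_i, next_ws, next_deltas):
    # delta_i = (1 - s_i^2) * sum_l w_li delta_l over the layer above
    return (1 - s_i ** 2) * sum(w * d for w, d in zip(next_ws, next_deltas))

def weight_increment(delta_i, s_j):
    # Delta w_ij = mu * delta_i * s_j
    return MU * delta_i * s_j
```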

In connection with the formal model introduced here, the simultaneous activation of the neurons of one layer is modelled as a simulation step, and the activation of all the neurons in the network after the presentation of one pattern is a simulation phase.


The complete definition of a multilayer perceptron with backpropagation learning is as follows:

$$n = \sum_{j=0}^{k-1} n_j$$

where n_j is the number of neurons in layer j and k is the number of layers.

m = 1

D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W}

Let C be a configuration, C = {n, m, s, W, a}:

$$\theta_{sl}(C) = \begin{cases} a + 1 & \text{if } a < n - 1 \\ n_0 & \text{otherwise} \end{cases}$$

Note that neurons in the first layer (the input layer) are never activated.

$$\theta_{pt}(C, i) = \sum_{j=0}^{n-1} w_{ij} s_j - w_i$$

$$\theta_{ac}(x) = \tanh(x)$$

$$\theta_{sn}(C) = \begin{cases} 0 & \text{if } layer(a) = layer((a+1) \bmod n) \\ 1 & \text{otherwise} \end{cases}$$

$$\theta_{in}(C, i) = \begin{cases} p_{ji} & \text{if } layer(i) = 0 \\ s_i & \text{otherwise} \end{cases}$$

where P_j = (p_{j1}, ..., p_{j n_0}) is the input pattern currently being presented to the network. The simulation phase finishes when the last neuron is activated:

$$\theta_{te}(C) = \begin{cases} 1 & \text{if } a = n - 1 \\ 0 & \text{otherwise} \end{cases}$$

A new simulation phase is executed if either the pattern presentation has not finished or, after the presentation, the error is still significant:

$$\theta_{re}(C) = \begin{cases} 0 & \text{if no patterns are left and } Error < \varepsilon \\ 1 & \text{otherwise} \end{cases}$$

θ_W = {θ_{w_0}, θ_{w_1}}, where random initialization in [−1, 1] is written as 2ν( ) − 1:

$$\theta_{w_0}(C, i) = \begin{cases} 0 & \text{if } layer(i) = 0 \\ 2\nu(\,) - 1 & \text{if } layer(i) \neq 0 \text{ and this is the first simulation phase} \\ w_i + 0.05\, \delta_i \cdot (-1) & \text{if } layer(i) \neq 0 \text{ and this is not the first simulation phase} \end{cases}$$

$$\theta_{w_1}(C, i, j) = \begin{cases} 0 & \text{if } layer(i) \neq layer(j) + 1 \\ 2\nu(\,) - 1 & \text{if } layer(i) = layer(j) + 1 \text{ and this is the first simulation phase} \\ w_{ij} + 0.05\, \delta_i s_j & \text{if } layer(i) = layer(j) + 1 \text{ and this is not the first simulation phase} \end{cases}$$

where

$$\delta_i = \begin{cases} (1 - s_i^2)(O^j_i - s_i) & \text{if } layer(i) = k - 1 \text{ (output layer)} \\ (1 - s_i^2) \sum_{l:\, layer(l) = layer(i) + 1} w_{li}\, \delta_l & \text{if } layer(i) \neq k - 1 \end{cases}$$

and O^j_i is the correct output of neuron i for the pattern P_j.

Summary of Notation

C  Configuration of a neural network. It is a 5-tuple: C = {n, m, s, W, a}.
Γ  Class of all possible configurations.
p(B : A)  Class of all probability distributions on the set B, depending on a parameter x ∈ A.
p(B)  Class of all probability distributions on the set B.
θ(x)  If θ ∈ p(B : A) and x ∈ A, θ(x) is the result of the execution of a drawing process among the elements of B, according to the distribution θ with parameter x. If θ ∈ p(B), θ( ) is the result of the execution of a drawing process among the elements of B, according to the distribution θ.
ν( )  Uniform random variable on the interval I = [0, 1]. Note that ν ∈ p(I).
φ(i)  Discrete random variable that, for a given i ∈ ℕ, assigns to each integer 0, ..., i−1 the probability 1/i. Note that φ ∈ p({0, ..., i−1} : ℕ).
D  Dynamics of a neural network, which comprises the following probability distributions: D = {θ_sl, θ_pt, θ_ac, θ_sn, θ_in, θ_te, θ_re, θ_W}.
A_D  Activation process on a network with dynamics D.
S_D  Simulation step on a network with dynamics D.
AC  Activated set.
SA  Stable-activated set.
P_D  Simulation phase on a network with dynamics D.
E_D  Experiment process on a network with dynamics D.

Acknowledgements

This work has been partially supported by the Spanish Comisión Interministerial de Ciencia y Tecnología (CICYT), Project No. TIC98-0562.

References

1. Atencia, M. A.: An arbitrary order neural network design and simulation environment, graduating work, Dpto. Tecnología Electrónica, Univ. Málaga (Spain), (in Spanish), 1997.

2. García del Valle, M., García, C., López, F. J. and Acevedo, I.: Generic neural network model and simulation toolkit, In: Mira, Moreno-Díaz and Cabestany (eds.), Biological and Artificial Computation: From Neuroscience to Technology, Lecture Notes in Computer Science, No. 1240, Springer-Verlag, (1997), pp. 313-322.

3. Garey, M. R. and Johnson, D. S.: Computers and Intractability. A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, (1979).

4. Hopfield, J. J.: Neurons with graded response have collective computational properties like those of two-state neurons, Proc. Nat. Acad. Sci. U.S.A., 81 (1984), 3088-3092.

5. Hopfield, J. J. and Tank, D. W.: Neural computation of decisions in optimization problems, Biological Cybernetics, 52 (1985), 141-152.

6. Joya, G., Atencia, M. A. and Sandoval, F.: Application of high-order Hopfield neural networks to the solution of diophantine equations, In: A. Prieto (ed.), Artificial Neural Networks, Lecture Notes in Computer Science, No. 540, Springer-Verlag, (1991), pp. 395-400.

7. Joya, G., Atencia, M. A. and Sandoval, F.: Associating arbitrary-order energy functions to an artificial neural network. Implications concerning the resolution of optimization problems, Neurocomputing, 14 (1997), 139-156.

8. Joya, G., Atencia, M. A. and Sandoval, F.: Hopfield neural network applied to optimization problems: some theoretical and simulation results, In: Mira, Moreno-Díaz and Cabestany (eds.), Biological and Artificial Computation: From Neuroscience to Technology, Lecture Notes in Computer Science, No. 1240, Springer-Verlag, (1997), pp. 556-565.

9. Joya, G.: Contributions of high order artificial neural networks to the design of autonomous systems. Doctoral Thesis, Dpto. Tecnología Electrónica, Univ. Málaga. Servicio de Publicaciones e Intercambio Científico de la Universidad de Málaga, (in Spanish), 1997.

10. Mira, J., Herrero, J. C. and Delgado, A. E.: A generic formulation of neural nets as a model of parallel and self-programming computation, In: Mira, Moreno-Díaz and Cabestany (eds.), Biological and Artificial Computation: From Neuroscience to Technology, Lecture Notes in Computer Science, No. 1240, Springer-Verlag, (1997), pp. 195-206.

11. Müller, B. and Reinhardt, J.: Learning Boolean Functions with Back-Propagation, Neural Networks. An Introduction, Springer-Verlag, (1990), pp. 222-227.

12. Santos, J., Cabarcos, M., Otero, R. P. and Mira, J.: Parallelization of connectionist models based on a symbolic formalism, In: Mira, Moreno-Díaz and Cabestany (eds.), Biological and Artificial Computation: From Neuroscience to Technology, Lecture Notes in Computer Science, No. 1240, Springer-Verlag, (1997), pp. 304-312.

13. Strey, A.: EpsiloNN - A specification language for the efficient parallel simulation of neural networks, In: Mira, Moreno-Díaz and Cabestany (eds.), Biological and Artificial Computation: From Neuroscience to Technology, Lecture Notes in Computer Science, No. 1240, Springer-Verlag, (1997), pp. 714-722.
