
Quality & Quantity 29: 339-403, 1995. © 1995 Kluwer Academic Publishers. Printed in the Netherlands.

Self-reflexive networks: Theory · Topology · Applications*

MASSIMO BUSCEMA
SEMEION, Centro Ricerche di Scienze della Comunicazione, Viale di Val Fiorita 86, 00144 Rome, Italy

Abstract. We present a new type of Artificial Neural Network: the Self-Reflexive Networks. We state their theoretical presuppositions; their dynamics is analogous to the one ascribed to autopoietic systems: self-referentiality, unsupervised learning, and unintentionally cooperative and contractual activities of their own units. We also hypothesize a new concept of perception. We present the basic equations of Self-Reflexive Networks and new concepts such as the dynamic target, Re-entry with dedicated and fixed connections, and Meta-units. We then experiment with a specific type of Self-Reflexive Network, the Monodedicated, in the interpretation of a toy-DB, and we hint at other experimentations already carried out, in progress, and planned. From the applicative work that we present, a few specific and novel features of this type of Neural Network emerge:

(a) the capability of answering complex, strange, wrong or imprecise questions through the same algorithms by which the learning phase took place;
(b) the capability of spontaneously transforming their own learning inaccuracy into analogic capability and original self-organization capability;
(c) the capability of spontaneously integrating the models experienced at different moments into an achronic hyper-model;
(d) the capability of behaving as if they had explored a decision graph of large dimensions, both in depth and in breadth, and consequently of behaving as an Addressing Memory for self-dynamic Contents;
(e) the capability of always learning, rapidly, whatever the complexity of the learning patterns;
(f) the capability of answering simultaneously from different points of view, behaving in this case as a network that builds several similarity models for each vector-stimulus it receives;
(g) the capability of adjusting, in a biunivocal way, each question to the consulted DB and each DB to the questions submitted to it, with the consequence of a continuous creation of new answering models;
(h) the capability of building, during the learning phase, a weights matrix that provides a sub-conceptual representation of the bidirectional relations between each pair of input variables;
(i) the capability, through the Meta-units, of integrating into a unitary typology nodes with different saturation speeds and, therefore, with different memories: while the SR units are short-memory nodes, since each new stimulus zeroes the previous one, the Meta-units memorize the different SR stimuli over time, functioning as a medium-term memory. This should confirm that medium-term memory is of a different level from immediate memory, and that it is based only upon relations among perceptive stimuli distributed in parallel and in sequence. In this context the weights matrix constitutes the SR long-term memory, and in this sense it will be opportune to devise a method through which the Meta-units can influence, over time, the weights matrix itself. In any case, in the SR there are service or filter nodes, and learning nodes that act as if they were weights (the Meta-units).

* The original essay was published in M. Buscema et al., Reti Neurali AutoRiflessive. Teoria · Metodi · Applicazioni · Confronti, QR Semeion no. 1, 9-74, Armando, Roma, 1994.

1. Philosophy and structure of self-reflexive networks*

1.1. The limits of forward networks

The main problem of Forward Networks (FN) is their strict functional dependency on target units.

Under this profile, the FN are Passive Deformation Networks, with respect to their Immersion space.

This Immersion space is made up of Input Models and Teaching Units (targets) that deform the internal dynamics of FN plastically, namely the Hidden units, and their Incoming and Outgoing weights.

This means that the values assumed by the Hidden units and by their weights at the end of a learning session do not necessarily define the "sub-conceptual representation" of the task learned by the FN. Instead, they are the internal projection of the external constraints (input + target) within which a passive Network (with external targets) has to work.

This means that this kind of FN does not perceive anything of the Input it receives. In fact, if the FN perceived it, it would structure a series of stable Outputs of its own choosing, which would be the external representation of the internal structuralization that defines it autopoietically, in order to make these external inputs compatible with its internal dynamics.

In this sense, Forward Networks are an interesting example of neobehaviourism.

S → (r → s → r → s) → R
(Input)  (Hidden)  (Output)

Therefore, these are Passive Deformation Networks that can be extremely helpful in understanding a few support mechanisms of cognitive structures; for example, the simplest ones, in which another type of Network "communicates" its own target to the FN, whose task is then to match its functioning mechanisms to the model of other, more complex mental mechanisms.

* All the experimentations carried out were realized through the Semeion AF1 program (Buscema 1993), except where explicitly noted.


The Forward Network, therefore, seems to be a support Network for mental functioning rather than a model of mental functioning.

Their specificity is not in perceiving the Inputs but in plastically finding forms of stability for multiple boundaries imposed from the outside.2

1.2. The structure of perception

One of the essential characteristics of a living organism is its capability of adjusting to the environment by making the environment adjust to it. 3

We refer to that particular interactive "organism-environment" process, in which the organism "lives" the environment, trying to "deform" it according to its life needs, while simultaneously the environment answers the organism's actions, forcing it to modify its structure. This complex negotiation draws, step by step, the dynamic morphology both of the environment and of the organism defining it.4

In the FN, the Network's learning is essentially of an adjusting type: the target models to which the Network must conform itself when specific perturbations (Inputs) take place are either mirrors of the Input Models, or actions decided independently of the subjective reading that the Network is carrying out on the Input urging it; for example:

Forward Network

  Input-Specular Targets      Input-Independent Targets
  I      →   T                I      →   T
  101    →   101              101    →   1
  100    →   100              100    →   0
  001    →   001              001    →   1
  etc.                        etc.

On the contrary, the simulation of any organism assumes the capability of restructuring the Inputs within that organism, according to the way that organism is structured. Besides, it includes generating actions towards the environment which are the product of the restructuring that the organism is operating.

An essential condition is that the Network itself finds a behavioral target, i.e. a subjective interpretation that the Network makes of the Input models it is testing.

By that we mean that a neural Network simulating any organism must act as an autopoietic system.5


The first characteristic of an autopoietic system is to perceive the environment and not to submit to it.

At present, perception is considered the restructuring strategy that a subject works out on a portion of the world in order to internalize it into its own organizational structure.6

Perception is then a process that the subject activates according to the internal imbalance it is trying to compensate.7

This conception of the perceptive activity, even though it assumes an active role of the perceiving subject, seemed to us rather naive.

First of all, the perceptive activity is characterized as a process with a beginning and an end; see the figure.

[Figure: perception as a bounded process. An external object is progressively adjusted to the internal structure of the subject until its internalization is completed.]

It is assumed that this process might be iterative (continuously repeated) and singular (different each time according to the state of the subject and to the functional nature of the perceived object).

But from our point of view, the perceptive activity is part of the very existence of the organism. We intend perception, therefore, to be a collateral and obligatory effect of a living organism's simple existence.

Also from this point of view, perception is a process, but it is not an activity that begins and finishes, nor an activity that produces itself in certain moments of the organism's life.

It is, instead, a continuous activity. The organism exists as long as it perceives. Of course, this continuity will be subjected to a certain scansion and to a varying complexity, linked to the type of objects of the world that are perceived and to the condition of the perceiving subject, defining in each moment the perceptive continuum.

For each living organism, therefore, perception is a socialization of the existence.


A second characteristic of the perceptive activity is, from our point of view, its functioning as a process continuously reminding the organism of its space-temporal cohesion. 8

That means that the perceptive activity is made up of a multiplicity of parallel and asynchronous micro-perceptions, for each of which the difference between the external world and the organism is not important, but whose global effect is the reformulation and/or re-affirmation of the autonomous topology of the perceiving organism.9

The third characteristic of perception should be the activity of trying to compensate for the living organism's fundamental imbalance. This means that each organism is living because it is structurally unstable. Such an imbalance "bounces" from one point of the organism to another, forcing it to continuously attack the external world in order to locally re-balance itself.10

Such dynamics are possible only if we imagine living structures as structures created in a world with at least 4 dimensions. That would imply a continuous instability, due to the greater complexity of the organism with respect to its immersion space, which would produce the manifestation of the rejected dimensions in the form of time.11

Therefore, the organism continuously perceives parts of itself and of the external world, as a compensation activity for its structural imbalance.

The fourth characteristic of the perception process should be that this activity creates the units it perceives, on the basis of the perturbations that the organism generates on its external world. In other words, the perceptive activity does not consist in "taking" units from the environment, but rather in re-formatting the units of the environment into units which are such only within its internal organization.12

The Input, therefore, is only a mass of perturbations whose location depends on the location of the sensors of the organism, and whose articulation depends on the resolution of conflicts between what the sensors of the organism perceive as location and intensity, and the organization of the internal imbalances which, at that specific moment, they are trying to compensate in order to continue to exist space-temporally.

Such an Input re-formatting rule is not valid just to explain the relation between environment and organism, but also to explain the interpretative dynamics through which each unit of the organism behaves with respect to any other unit.

Such a concept of perception obligates us to redefine the perceptive activity itself as a process marked by the following steps:

(a) In order to exist, the organism is internally unbalanced; therefore, moved by an internal need, it perturbs the environment, and the energy spent in perturbing the environment locally re-balances it.


(b) The perturbed environment re-organizes itself producing, in its turn, "clouds" of perturbations over the organism, which will "feel" these perturbations according to the typology, the location and the sensitivity of its sensors and of its more internal units.

(c) The perturbative wave produced by the environment will travel from the outside towards the inside of the organism, according to an outer point of view. Actually, each layer of the organism's micro-units will re-interpret the interpretations of the perturbations carried out in the previous layer, with the possibility that new internal micro-units of the organism may be made up by repeating these processes.

Therefore, what the organism feels is a fragmented, distributed and asynchronous set of re-interpretations, at a level in which its units work starting from a cloud of "self-produced" perturbations. At this level a perceiving consciousness does not yet exist.

(d) Each re-interpretation that each micro-unit of the organism works out on the arriving perturbation is the result of what that unit can perceive. It is the result of that unit's specific conditions at the moment it received the perturbations, and of all the perturbations that the unit exchanges with other micro-units.

Therefore, the re-interpretation that each unit makes of the variation of its course will simply be the result of the variation of its perturbation exchanges with the other units to which it is connected and/or connects itself. A fragmented process of unbalancing thus begins throughout the entire organism. This process is analogous to a "feedback loop", and could be identified as "training and perception work".

(e) This process of unbalancings and readjustments tends to minimize the energy of the original perturbation, in order to re-affirm the coherent topology of the organism. Such a minimization begins to produce itself when the organism lets emerge, towards the environment, a logic of virtual perturbation which is more or less isomorphous to the fragmented logic through which its process of distributed perturbation occurred.

At this point, the perception process begins to stabilize itself as a form of perceptive consciousness. The perceptive consciousness is the compensation that the organism self-generates as a project of virtual manipulation of the environment; this project is isomorphous to the unbalancing that it underwent and, therefore, it is a project that, when it creates itself, re-affirms the identity of the organism in the environment. I feel, I change, I interpret, therefore I exist.

While what we have described in a linear way was producing itself, the organism was continuing to produce its micro-improvements even on the environment, which was answering with new urges that rendered the process more complex and, in some way, continuous.

The organism, therefore, continuously generates pieces of perceptive consciousness every time it can minimize the energy self-produced by its continuous internal and external activity.

When this activity of perpetually re-planning the virtual manipulation of the environment, isomorphous to the internal dynamics that the organism is living in order to exist in the environment, tends towards determinate schemes, it is possible that the virtual manipulation begins to transform itself into reality, slowly producing a modification of the topology of the organism itself and producing what we see as a biological change. This change can become genotypical.

In short, the biology describing an organism is the set of perceptions that the organism has come up with in its negotiation process with the environment.

Therefore, a gap between thoughts and biological shapes does not exist: the latter are perceptions that the project of manipulating the environment has slowly brought forth.

Under this profile, the shape of the feline's claws is the biologically frozen perception of the re-production cycle of a predator.

1.3. Self-reflexive networks

On this ground, in 1989 I started experimenting with the possibility of creating unsupervised Artificial Neural Networks (ANN) able to use cooperative learning algorithms instead of competitive ones.

The importance of unsupervised ANN is plain: many complex systems are self-organizing and do not need an externally given target. The importance of cooperative algorithms is less plain but equally deep-rooted: a competitive process is the radicalization of a cooperative process; the latter, therefore, implies the former, showing all its possible shades.

The Self-Reflexive Artificial Networks were born then, and their characteristics are quite simple (Buscema and Massini, 1993).

These ANN are apparently provided with three layers. The Input layer is made up of a series of "k" nodes arranged to catch the signals from the environment; the Hidden units' "j" layer and the Output units' "i" layer actually concern the same units caught at two different moments: when receiving the signals from the Input layer the ANN units work as Hidden units; when communicating their behavior to the environment they appear as Output units.


This means that the number of Hidden and Output units of these ANN is always the same: every Hidden node has a copy in an Output node.

The connections between the Input and the Hidden layers can be regulated in different ways; we have studied some of them: the ones with maximum connection, the ones with dedicated connections and the ones with mono-connections.

The connections between the Hidden and the Output layers, instead, as a rule must be at the maximum gradient, but other possibilities are being experimented with (Pandin, 1993).

The propagation algorithm of the signal in this kind of ANN is very simple and traditional; it can easily be summarized in the following equation:

$$U_i = f\Big(\sum_j U_j \cdot W_{ij} + \theta_i\Big) \qquad (1.1)$$

where f is a semilinear function among the already known ones (sigmoid, hyperbolic tangent, arc tangent, etc.).

The learning algorithm is based on Back Propagation (Rumelhart et al., 1986) and, therefore, follows the descending gradient method. Whereas in Back Propagation the error back-propagation occurs through the comparison between the desired Output and the real one at each Output node, in the Self-Reflexive ANN the Output Delta of each node is generated through the comparison between the real Output and the value of the Hidden node of which that Output node is a copy:

$$\delta^{out}_i = (U_j - U_i) \cdot U_i \cdot (1 - U_i) \qquad (1.2)$$

where $U_i \cdot (1 - U_i)$ is the first derivative of $U_i$ if the transfer function of (1.1) is a sigmoid bounded between 0 and 1.

This small difference in the learning equation has great effects on the ANN behavior. The most obvious one is that the ANN will change its target at each cycle even if it is facing an identical Input.

The Self-Reflexive ANN, therefore, have a dynamic target that they self-program during their learning. This fact makes them similar to autopoietic systems, which determine anew the meaning of each external perturbation with respect to their internal dynamics. In this sense, the learning process of a Self-Reflexive ANN is analogous to a perception process: the ANN has perceived some external stimulus when it has succeeded in structuring its own external reaction (Output) with a dynamic analogous to the one through which it perceived the external stimulus (Hidden); in the figure:

Input (k)              Hidden (j)                               Output (i)
External Stimulus  →   Perception of the External Stimulus  →   Reaction to the External Stimulus

[Diagram: the comparison between Hidden and Output constitutes the Perception Process.]
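As a minimal sketch of how this works, under hypothetical variable names (the paper's experiments were run with the Semeion AF1 program, whose code is not reproduced here), one learning cycle of a monodedicated Self-Reflexive network according to equations (1.1) and (1.2) could look as follows; only the Hidden-Output update is shown, the Input-Hidden weights being adjusted analogously:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 5                                     # n input = n hidden = n output nodes
w_in = rng.uniform(0.1, 0.5, n)           # one dedicated Input->Hidden weight per node
theta_h = np.zeros(n)                     # Hidden biases
W = rng.uniform(-0.1, 0.1, (n, n))        # full Hidden->Output weight matrix
theta_o = np.zeros(n)                     # Output biases
lr = 0.5                                  # learning rate (assumed value)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = rng.integers(0, 2, n).astype(float)   # one input pattern

# Propagation, equation (1.1): Hidden first, then Output.
u_h = sigmoid(x * w_in + theta_h)         # monodedicated: elementwise, one weight each
u_o = sigmoid(W @ u_h + theta_o)

# Dynamic target, equation (1.2): each Hidden value is the moving target
# of its own Output copy; no external teaching unit is involved.
delta_o = (u_h - u_o) * u_o * (1.0 - u_o)

# Descending-gradient update of the Hidden->Output weights and biases.
W += lr * np.outer(delta_o, u_h)
theta_o += lr * delta_o

# Per-pattern error; its mean over all patterns gives the Global
# mean Error defined below in equation (1.3).
E = 0.5 * np.sum((u_o - u_h) ** 2)
print(f"error after one cycle: {E:.6f}")
```

Note that, since the target moves with the Hidden values, repeating this cycle on the same input still changes the target at every pass, which is exactly the dynamic-target behavior described above.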

It is clear that, under these conditions, the Self-Reflexive ANN learns when it minimizes its Energy internally. This minimization occurs on the weights matrix. Therefore, the Global mean Error of a Self-Reflexive ANN will be given by:

$$E = \frac{1}{M} \sum_{p=1}^{M} \frac{1}{2} \sum_{i=1}^{N} \left(U^{O}_{pi} - U^{H}_{pi}\right)^{2} \qquad (1.3)$$

where M = number of Input models, N = number of Output and Hidden nodes, $U^{O}$ = Output units, and $U^{H}$ = Hidden units.

An introductory paper on the Self-Reflexive ANN at maximum gradient has already been written (Buscema and Massini 1993, chapters XIII and XIV).

On that occasion Giulia Massini of the Semeion Research Center made some experiments on this type of ANN, showing their capability of mapping the Input models through the matrix of the Hidden-Output weights (Buscema and Massini 1993, chapter XIV, paragraphs 14.2 and 14.3).

These experimentations show how the Hidden-Output weights matrix tends to stabilize the fuzzy relationships of "sociality" and of "autarchy" that each Input node entertains with all the others, in the case where the number of Input nodes is equal to the number of Hidden and of Output nodes.

In this case, in fact, having assumed that each Input node is a "hypothesis" (Hinton 1981), it is possible to maintain that a Self-Reflexive ANN generates the weights matrix that allows one to ask questions of this network as if it were a constraint satisfaction network (compare Rumelhart et al. 1986, vol. 2, chapter 14).

Interesting experimentations and discoveries have also been carried out on Self-Reflexive ANN with dedicated Input-Hidden connections. More particularly, the behavior of a Self-Reflexive ANN with connections dedicated in couples towards the Hidden units has been studied: each Hidden unit receives only 2 connections, coming from 2 Input units. In this specific case the ANN maps in output the weighted and negotiated difference (DNP) between the 2 values of the 2 Input nodes (Buscema and Massini 1993, chapter VIII and chapter XIV, par. 14.4).

[Figures 1a and 1b: schematic Input-Hidden-Output layer diagrams.]

Figs. 1a and 1b. Two Self-Reflexive ANN with maximum gradient in the Input-Hidden weights. This architecture could be useful for Morphing techniques: in this case a condensative Morphing; if the architecture were overturned (2 × 3), an expansive Morphing. In the coming months we will explore this possibility. Fig. 1b presents a wave architecture with two Hidden-Output unit levels (3 × 2 × 3). We must still explore these types of architectures, which seem interesting as mechanisms for filtering and manipulating signals.

[Figure 1c: schematic of a Self-Reflexive ANN with dedicated Input-Hidden connections.]

Fig. 1c. A Self-Reflexive ANN with dedicated Input-Hidden connections. These are localistic architectures, since every Hidden node receives specific signals only from specific Input nodes. In these cases we can say that every Hidden node is semantized: every Hidden node (and its Output copy) is comparable to a concept whose semantic traits are given by the Input nodes connected to it. Once learning has occurred, the analysis of the Hidden-Output weights matrix allows one to find out the sociality and autarchy relations that arose between one "concept" and another. A further study of their application has been carried out for the analysis of interpersonal perceptions (Buscema and Massini, 1993).


Recently, M. Pandin and G. Didoné began to experiment with the behavior of monodedicated Self-Reflexive ANN, in which each Input node is connected with only one Hidden node (Pandin 1993; Didoné 1993).

With this architecture the Self-Reflexive ANN have shown 2 different capabilities: the capability of filtering, in order to carry out the analysis of similarities among forms, and the capability of acting as a fuzzy database.

A monodedicated Self-Reflexive Network to which different Input models are submitted, for example the 21 graphic letters of the Italian alphabet, behaves like a maximum-gradient Self-Reflexive, only quicker: it minimizes the global error (RMS) after a certain number of cycles.

The problem is that the Output mapping of the Input vectors that the network is able to show after learning has occurred is rather imperfect. This is easy to explain. The ANN is structurally inclined to stabilize its own Hidden-Output weights on the minimum of distinctions among the various Input nodes: it may therefore happen that it varies a certain output node between 0.2 and 0.9, while the variation it needs in order to distinguish another output node lies between 0.45 and 0.61. A Self-Reflexive ANN practically maps the input by recoding it into its internal system of differences. Nevertheless, this has the consequence of an imperfect mapping of the Input onto the Output.

In the monodedicated Self-Reflexive ANN it is possible to avoid the problem through a normalization equation. In fact, from the Hidden value of each node it is possible to establish exactly the value of the corresponding Input node, because in this type of ANN there is only one weight connecting the Input to the Hidden.

Therefore, if:

$$U_j = \frac{1}{1 + e^{-(U_k \cdot W_{jk} + \theta_j)}} \qquad (1.4)$$

where k = j, $U_j$ = Hidden node value, $U_k$ = Input node value, $\theta_j$ = Hidden node bias, $W_{jk}$ = connection between $U_k$ and $U_j$, and e = base of the natural logarithm (e = 2.718281828459); then:

$$U_k = \frac{\ln(U_j) - \ln(1 - U_j) - \theta_j}{W_{jk}} \qquad (1.5)$$
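For completeness, the inversion leading from (1.4) to (1.5) is a one-line algebraic step:

$$\frac{1}{U_j} = 1 + e^{-(U_k W_{jk} + \theta_j)}
\;\Longrightarrow\;
U_k W_{jk} + \theta_j = \ln\frac{U_j}{1-U_j}
\;\Longrightarrow\;
U_k = \frac{\ln U_j - \ln(1-U_j) - \theta_j}{W_{jk}}.$$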

Given that learning is measured through the minimization of the distance between the Hidden and Output nodes, we can rewrite (1.5) replacing $U_j$ (Hidden) with its copy $U_i$ (Output):


$$U_k' = \frac{\ln(U_i) - \ln(1 - U_i) - \theta_j}{W_{jk}} \qquad (1.6)$$

where k = j = i and $U_k'$ = Normalized Output value.

I have called (1.6) the normalization equation, and it is the one I needed in order to rewrite the Output values of the monodedicated Self-Reflexive. It is clear that through this equation, once learning has occurred, the mapping of the Input onto the Output will appear very effective. But this is not the point. Equation (1.6) is, in fact, useful in order to ask questions of the ANN once the learning phase has been concluded.
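As a minimal sketch (assuming the monodedicated weights are stored in a vector `w_in` and the Hidden biases in `theta_h`, both hypothetical names), equation (1.6) amounts to inverting the logistic function node by node:

```python
import numpy as np

def normalize_output(u_out, w_in, theta_h):
    """Normalization equation (1.6): invert the sigmoid of each Output copy
    through its single dedicated Input-Hidden weight, recovering the Input
    value that the network is implicitly asserting."""
    u = np.clip(u_out, 1e-9, 1.0 - 1e-9)   # guard the logit against 0 and 1
    return (np.log(u) - np.log(1.0 - u) - theta_h) / w_in
```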

The interrogation of a monodedicated Self-Reflexive occurs through a very simple technique that we have elsewhere called Re-entry (Buscema 1994, chapter 3.2.9). The Re-entry consists in feeding the Output of an ANN back into its Input vector until the difference between the Output and the Input is minimized. In Feed Forward ANN this technique is only possible with self-associated topologies. The result we obtain is that the generated Output is constrained towards one of the targets that the ANN has previously learned (the most similar one, least far from the initial Input).

In the monodedicated Self-Reflexive we have used this technique by feeding the normalized Output of the ANN back into the Input. The results were interesting.
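A minimal sketch of the Re-entry loop, under the same hypothetical names as the previous sketches (`x_ext`, `w_in`, `theta_h`, `W`, `theta_o` are assumptions, not the AF1 program's API):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reentry(x_ext, clamp, w_in, theta_h, W, theta_o, tol=1e-4, max_cycles=100):
    """Feed the normalized Output back into the Input until Input and Output
    differ, node by node, by less than `tol`. `clamp` is a boolean mask of the
    externally activated nodes; holding them fixed during the Re-entry is a
    design choice of this sketch (the paper notes that an activated node may
    also drift towards its own stability value)."""
    x = x_ext.astype(float).copy()
    for cycle in range(max_cycles):
        u_h = sigmoid(x * w_in + theta_h)                        # propagation (1.1)
        u_o = sigmoid(W @ u_h + theta_o)
        u = np.clip(u_o, 1e-9, 1.0 - 1e-9)
        x_new = (np.log(u) - np.log(1.0 - u) - theta_h) / w_in   # normalization (1.6)
        x_new[clamp] = x_ext[clamp]                              # keep the question fixed
        if np.max(np.abs(x_new - x)) < tol:
            break
        x = x_new
    return u_o, cycle + 1
```

A query such as the "soap" example discussed below would then amount to setting one component of `x_ext` to 1.0, marking it in `clamp`, and reading the stabilized Output as the reconstructed frame.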

We show, then, some experimentations that we have carried out using a specific variety of Re-entry.

1.4. Some examples

We have shown to a monodedicated Self-Reflexive ANN all the objects present in 50 rooms. The rooms were: 10 types of bathrooms, 10 types of kitchens, 10 types of living-rooms, 10 types of offices. Across all 50 rooms, a variety of at most 50 objects was present (soap, cupboard, etc.).

We have let a 50 × 50 × 50 Self-Reflexive ANN learn the 50 rooms (see Figure 2). Then we have asked questions of the Network with the Re-entry technique, fixing the external Input each time on 1 or more of the 50 objects that the Network had surveyed. This was done in order to evaluate the capability of the Network to rebuild "frames" through both a parallel and a sequential "thought".13

Figures 3a, b, c, d show the ability of a Self-Reflexive ANN to behave as a generator of prototypes and, therefore, the possibility of using it as a fuzzy SQL (Structured Query Language) in order to ask questions of a Data Base of any nature.


Fig. 2. The monodedicated SR ANN that we used for the analysis of the 50 rooms discussed in the text. The SR architecture of this type is always square (n × n: the same number of Input and of Hidden-Output nodes). In this case each Input node has a copy in the corresponding Hidden node which, in its turn, has a copy in the corresponding Output node. At the moment it is the most experimented SR. For its use, the normalization equation and the Re-entry technique are fundamental. Its applicative properties seem to be of three types: (a) building of SQL with fuzzy and/or probabilistic questioning on the basis of resonances among records; (b) filtering of similarities among signals (visual, acoustic, etc.) without forcing the Input signal towards pre-constituted targets; (c) analysis of the minimization of information in a complex system (AF1 program, Semeion copyright 1993; Buscema 1993).

The most outstanding characteristics of these applications have shown to be the following:

1. The Self-Reflexive stabilizes its own answer frame in a very small number of cycles.

2. Thanks to the Re-entry technique, the Self-Reflexive is not forced to push all its node values towards 0 or towards 1, as happens instead in the Constraint Satisfaction (CS) ANN (Rumelhart et al. 1986); its final answer is stable with respect to the data and not with respect to the calculation algorithm used.

3. The node that has been activated from the outside tends to assume its own stability value, which is different from the one set by the user. This allows one to activate from the outside Inputs that do not succeed in activating a complete or determinate frame and that, during the Re-entry, self-depress themselves. The CS ANN, on the contrary, always answer each Input with a complete frame; this means that in a living-room they make no difference between a sofa, a ceiling and a picture.

4. The equations through which the Self-Reflexive has learned are the same equations through which it answers to perturbations. This is important from a theoretical point of view.

[Figs. 3a, b, c, d. The monodedicated Self-Reflexive Network after having learned the 50 rooms, surveying all 50 objects that may or may not appear in each room. The original screens list the 50 objects with their activation values; only the captions are reproduced here.]

Fig. 3a. In the screen the object "soap" is activated (see the asterisk marking the word "soap"). After 99 Re-entries, the ANN draws the frame of a "bathroom" with all the objects that it deems opportune in order to fill this type of room. We must point out that when the value of an object tends towards 0.5, its pertinence to the environment processed by the ANN is indifferent; when it tends towards 0.0 it is impertinent; and when it tends towards 1.0 it is pertinent.


A new series of experimentations may point out the capability of the Self-Reflexive ANN to organize the temporal sequences of its own experiences into spatial models which are codified in its own connections.

Following an idea of G. Didoné and M. Pandin we have built 31 Input models, each of which was determined by a 9 × 9 matrix. In each model 80 nodes were empty (value = 0.0) and only one was completely on (value = 1.0). An 81 × 81 Self-Reflexive has rapidly learned the 31 Input models, reproducing them in its normalized Output (see Equation 1.6).


Fig. 3b. In the screen, starting out from the external Input "cupboard", the ANN builds its frame of a "kitchen".


Fig. 3c. The screen is interesting because, starting out from the external Input "fire-place", the ANN is not able to build a convincing frame of a "living-room", since it does not consider this object very distinctive of the "living-room" frame.


Fig. 3d. In the screen, where the external Input was made up of 3 objects ("carpet", "sofa", "fire-place"), after 99 Re-entries the ANN draws a good scenery for a "living-room". We think that this capability of self-evaluating the relevance of the external Input is particularly important.

This was the first part of the experiment. The reader must know, nevertheless, that by placing the 31 Input models one upon another, one obtains a child's drawing of a "face" made up of 31 points. Once learning occurred, the Self-Reflexive showed that it had understood this "integrative logic": in fact, when it is activated with an empty Input vector (81 values = 0.0), after 2 Re-entries it shows the "face" that the Input models distributed to it through time. Actually, as we discovered afterwards, the coding of the "face" had already been codified in the 81 Input-Hidden weights of the Self-Reflexive during the learning (see Figures 4a, b, c).

Subsequent experiences carried out with G. Didoné and M. Pandin have revealed that the Self-Reflexive has also codified the order of succession through which the models were sent to it, the models whose integration realizes the drawing of the "face".

The third experimentation that we show focuses on the capability of Self-Reflexive ANNs to filter the similarity among models of different signals and to create analogies among forms, taking advantage of their own learning error residuals. With this aim we have built a 49 × 49 Self-Reflexive. Each model was characterized by a horizontal bar of 7 saturated nodes, while the other 42 nodes were completely off; the 7 learning models were complementary. After the Self-Reflexive had almost perfectly learned the 7 models (RMS = 0.000002), we have re-inserted the 7 Input models, letting the Self-Reflexive operate on each of them a sufficient number of Re-entries to stabilize itself (difference between normalized Input and Output at each node smaller than 0.0001).

An interesting phenomenon occurred: during the Re-entries, the Output began to modify itself until it transformed itself into an excitatory field exactly opposed by an inhibitory counterfield, whose sum left the global energy of the Output itself unaltered (see Figures 5a, b, c).

This and other experiments show that a Self-Reflexive provided with Re-entry takes advantage of its own learning error in order to create a typical analogic plasticity: the new organizations occur thanks to the exploitation of noise. If nature learned everything perfectly, it would never be able to create new solutions.

Imperfection is the shadow of an eternal creativity.

Fig. 4a. The 31 Input Models, viewed by the Self-Reflexive Network for 25,000 cycles, grouped 9 × 9 each.

Fig. 4b. The 81 Input-Hidden weights of the Self-Reflexive, shown once learning occurred.


Fig. 4c. The normalized Output of the Self-Reflexive after 2 Re-entries, when it is stimulated with an empty 9 × 9 Input matrix (all values = 0.0); excited nodes in dark, inhibited ones in light. We point out that the global saturation value of the Output, $G = \sum_{i=1}^{81} U_i$, is 0.64. The ratio between the 31 excited nodes and the 50 inhibited nodes of the Output is 0.62.

Fig. 5a. The 7 Input models submitted to the Self-Reflexive Network. We point out that the learning order we chose is bottom-up.

Fig. 5b. The Hidden-Output weights matrix once learning occurred. Receptive fields of 7 × 7 spontaneously took place.

Fig. 5c. To the right, the stimulus given to the Self-Reflexive in Input; to the left, its normalized Output after 100 Re-entries (Cycles: 100, Goodness: 6.9985). An excitatory field spontaneously took place, in the direction of the learning order of the Input Models to which the Self-Reflexive network was submitted; a complementary inhibitory field counteropposed it. The Goodness value, nevertheless, remains unvaried (6.999) and analogous to the sum of the Input values that activated it.

2. Self-reflexive networks and other networks

2.1. Data base

In order to evaluate the power and the specific abilities of an ANN, it is often useful to use toy examples.

In our case the toy example is a small set of data that represents a problem which is considered complex and upon which the abilities of other ANN have been tested.

The toy from which we will take the cue is a small Data Base (DB) already used by Rumelhart and others in previous publications (Rumelhart et al. 1986). In this small Data Base there are 27 records that represent some characteristics of the male subjects of the movie West Side Story (see Table 1).


Table 1. The DB that we have used in this experimentation

Name    Band    Age  Academic Degree  Civil Status  Profession

Art     Jets    40   J.H.             Single        Pusher
Al      Jets    30   J.H.             Married       Burglar
Sam     Jets    20   COL.             Single        Bookie
Clyde   Jets    40   J.H.             Single        Bookie
Mike    Jets    30   J.H.             Single        Bookie
Jim     Jets    20   J.H.             Divorced      Burglar
Greg    Jets    20   H.S.             Married       Pusher
John    Jets    20   J.H.             Married       Burglar
Doug    Jets    30   H.S.             Single        Bookie
Lance   Jets    20   J.H.             Married       Burglar
George  Jets    20   J.H.             Divorced      Burglar
Pete    Jets    20   H.S.             Single        Bookie
Fred    Jets    20   H.S.             Single        Pusher
Gene    Jets    20   COL.             Single        Pusher
Ralph   Jets    30   J.H.             Single        Pusher
Phil    Sharks  30   COL.             Married       Pusher
Ike     Sharks  30   J.H.             Single        Bookie
Nick    Sharks  30   H.S.             Single        Pusher
Don     Sharks  30   COL.             Married       Burglar
Ned     Sharks  30   COL.             Married       Bookie
Karl    Sharks  40   H.S.             Married       Bookie
Ken     Sharks  20   H.S.             Single        Burglar
Earl    Sharks  40   H.S.             Married       Burglar
Rick    Sharks  30   H.S.             Divorced      Burglar
Ol      Sharks  30   COL.             Married       Pusher
Neal    Sharks  30   H.S.             Single        Bookie
Dave    Sharks  30   H.S.             Divorced      Pusher

As it is easy to verify, the layout of each record is the following:

Type of Record:   Name of the Subject
                  Age Group
                  Academic Degree
                  Civil Status
                  Profession
                  Band of Belonging

In this DB each record "field" assumes finite and mutually exclusive options.

More specifically:

# Name of the Subject:   27 different names, each of which identifies one subject

# Age Group:             20 years old
                         30 years old
                         40 years old

# Academic Degree:       Junior High School (J.H.)
                         High School (H.S.)
                         College (COL.)

# Civil Status:          Single
                         Married
                         Divorced

# Profession:            Pusher
                         Burglar
                         Bookie

# Band of Belonging:     Jets
                         Sharks

Before proceeding to the experimentations with different types of ANN, we make an elementary statistical analysis of the data of this bonsai DB.

The analysis of the frequencies of the field "# Name of the Subject" is simple: there are 27 names and 27 subjects, and each subject has a single unambiguous name. Therefore, each name will have a frequency of attribution equal to 3.70% and a frequency of non-attribution equal to 96.30%.

The Frequencies table for the other fields is, instead, the following:

a. # Band of Belonging

          Jets    Sharks
  %       55.56   44.44
  N       15      12

b. # Age Group

          20's    30's    40's
  %       37.04   48.15   14.81
  N       10      13      4

c. # Academic Degree

          J.H.    H.S.    College
  %       37.04   40.74   22.22
  N       10      11      6

d. # Civil Status

          Single  Married  Divorced
  %       48.15   37.04    14.81
  N       13      10       4

e. # Profession

          Pusher  Bookie  Burglar
  %       33.33   33.33   33.33
  N       9       9       9

Let us now couple the variables and observe, for each pair, the distribution of the frequencies, the chi-square and the contingency coefficient:

1. # Band - Age Group

Jets

20's 9

30 's 4

40 's 2

Sharks

1

9

2

2. # Band - Academic Degree

Jets

J.H. 9

H.S. 4

College 2

Sharks

1

Chi-square: 8.090 (MAX: 27.00)

Contingence: 0.48 (MAX: 0.707)

% Associative: 67.903

Chi-square: 7.646 (MAX: 27.00)

Contingence: 0.47 (MAX: 0.707)

% Associative: 66.436


3. # Band - Civil Status

          Jets    Sharks
Single    9       4
Married   4       6
Divorced  2       2

Chi-square: 2.015 (MAX: 27.00)
Contingency: 0.264 (MAX: 0.707)
% Associative: 37.265

4. # Band - Profession

          Jets    Sharks
Pusher    5       4
Bookie    5       4
Burglar   5       4

Chi-square: 0.000 (MAX: 27.00)
Contingency: 0.000 (MAX: 0.707)
% Associative: 00.000

5. # Age Group - Academic Degree

          20s     30s     40s
J.H.      4       4       2
H.S.      4       5       2
College   2       4       0

Chi-square: 1.784 (MAX: 54.00)
Contingency: 0.249 (MAX: 0.816)
% Associative: 30.488

6. # Age Group - Civil Status

          20s     30s     40s
Single    5       6       2
Married   3       5       2
Divorced  2       2       0

Chi-square: 1.120 (MAX: 54.00)
Contingency: 0.200 (MAX: 0.816)
% Associative: 24.442

7. # Age Group - Profession

          20s     30s     40s
Pusher    3       5       1
Bookie    2       5       2
Burglar   5       3       1

Chi-square: 2.515 (MAX: 54.00)
Contingency: 0.292 (MAX: 0.816)
% Associative: 35.754

8. # Academic Degree - Civil Status

          J.H.    H.S.    College
Single    5       6       2
Married   3       3       4
Divorced  2       2       0

Chi-square: 3.368 (MAX: 54.00)
Contingency: 0.333 (MAX: 0.816)
% Associative: 40.786


9. # Academic Degree - Profession

          Pusher  Bookie  Burglar
J.H.      2       3       5
H.S.      4       4       3
College   3       2       1

Chi-square: 2.582 (MAX: 54.00)
Contingency: 0.295 (MAX: 0.816)
% Associative: 36.182

10. # Civil Status - Profession

          Pusher  Bookie  Burglar
Single    5       7       1
Married   3       2       5
Divorced  1       0       3

Chi-square: 9.208 (MAX: 54.00)
Contingency: 0.504 (MAX: 0.816)
% Associative: 61.762
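For readers who wish to reproduce the figures above, the following is a minimal sketch in Python (ours, not part of the original study; the function name `association` is hypothetical). It assumes that the Contingency coefficient is C = sqrt(chi2/(chi2 + n)), that its maximum is sqrt((q - 1)/q) with q = min(rows, columns), and that "% Associative" is 100 · C/C_max; under these assumptions it matches the values reported in the tables above.

    from math import sqrt

    def association(table):
        """Chi-square, contingency coefficient and % Associative of a table."""
        rows = [sum(r) for r in table]
        cols = [sum(c) for c in zip(*table)]
        n = sum(rows)
        chi2 = sum((table[i][j] - rows[i] * cols[j] / n) ** 2
                   / (rows[i] * cols[j] / n)
                   for i in range(len(rows)) for j in range(len(cols)))
        q = min(len(rows), len(cols))
        c = sqrt(chi2 / (chi2 + n))
        c_max = sqrt((q - 1) / q)
        return chi2, n * (q - 1), c, c_max, 100 * c / c_max

    # 1. # Band - Age Group (rows: 20s, 30s, 40s; columns: Jets, Sharks)
    print(association([[9, 1], [4, 9], [2, 2]]))
    # -> 8.090 (MAX 27.0), C = 0.48 (MAX 0.707), % Associative = 67.9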

These analyses show the existence of bonds of typicality among the different fields of our DB; for example:

IF the field "Being a Je t " is activated

THEN: is rather typical

and it is quite typical and

THEN: is quite atypical in order

THEN: is more typical

"to be 20 years old" " to associate with the J .H." "to be a Single"

"to be 20 years old" "to be a Bookmaker"

"to associate with the J .H ." "in order to be more a Burglar" than "a Pusher".

These fuzzy rules, which link our DB data in a not always transparent way, are the object of the experimentations that follow.

The aim, in fact, consists in evaluating which ANN are able to answer more effectively some of "our" questions, to which an SQL (Structured Query Language) engine would not be able to answer at all. For example: (1) Who, out of the 27 subjects, best represents the typical Jet and who best represents the typical Shark? (2) To which subjects is SAM most similar? (3) Which subjects are the most representative of the category of "Burglars", and which other properties should they possess?


The questions on which we will verify the different types of ANN will not only be questions of typicality and questions that imply the automatic recovery of missing or poor information. We will also submit impossible questions and virtual questions.

A question is called impossible when, with respect to the DB that we are questioning, at least one pair of prerequisites present in the question contradict each other. For example: in our DB no subject is both 20s and 40s. Therefore, if I ask any SQL engine to find all the subjects that are both 20s and 40s, I am asking an impossible question, because the two properties in question exclude each other reciprocally. The SQL will simply answer that this combination does not exist in my DB. Instead, the ANN that we will experiment with will try to give an answer.

A virtual question is, instead, a question presenting a set of prerequisites that, together, are not realized in the DB that we are questioning. If, for example, I ask an SQL engine to find in its DB all the subjects that are "40s", that "have attended College" and "that are bookmakers", the answer will be, as in the previous case, that there are no subjects having all these properties together.

The ANN that we will experiment with will also have to give answers to these types of virtual questions.

2.2. The self-reflexive learning

In two subsequent publications Rumelhart and McClelland have presented two ANN (Rumelhart et al. 1986; McClelland and Rumelhart 1988). The first was called Interactive Activation and Competition (IAC); the second one was called Constraint Satisfaction (CS).

In their second publication McClelland and Rumelhart (1988) also provide software that allows one to use both ANN on different types of DB. One of these DB is the one we are talking about. We will compare, then, the performances of the IAC and of the CS using the software package that the 2 authors have provided, and we will compare both performances with the results of the Self-Reflexive (SR) network that we have created. For a further study of the IAC and the CS we refer to the cited texts. On the SR we provide some specific indications on its use and a few methodological considerations.

We have used a monodedicated SR, made up of 41 Input nodes: the 27 names of the 27 subjects and the 14 characteristics that represent the properties of the other 5 fields. Each characteristic has been codified with a simple "no" (value = 0.00) if it was not present in the record of a subject, or "yes" (value = 1.00) if it was.
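As an illustration of this coding, here is a minimal sketch (ours, not the original software; the helper name `encode` is hypothetical) of how one record of the DB becomes a 41-component Input vector:

    # 27 name nodes plus 14 property nodes, each set to 1.00 ("yes") or 0.00 ("no")
    NAMES = ["Art", "Al", "Sam", "Clyde", "Mike", "Jim", "Greg", "John",
             "Doug", "Lance", "George", "Pete", "Fred", "Gene", "Ralph",
             "Phil", "Ike", "Nick", "Don", "Ned", "Karl", "Ken", "Earl",
             "Rick", "Ol", "Neal", "Dave"]
    PROPERTIES = ["Jets", "Sharks", "20s", "30s", "40s", "J.H.", "H.S.",
                  "College", "Single", "Married", "Divorced",
                  "Pusher", "Bookie", "Burglar"]

    def encode(name, traits):
        """Return the 41-component Input vector of one record."""
        return ([1.0 if n == name else 0.0 for n in NAMES] +
                [1.0 if p in traits else 0.0 for p in PROPERTIES])

    # e.g. the record of Ken: Sharks, 20s, H.S., Single, Burglar
    ken = encode("Ken", {"Sharks", "20s", "H.S.", "Single", "Burglar"})
    assert len(ken) == 41 and sum(ken) == 6.0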


The SR has carried out its learning on 27 models, that is, the 27 records that make up our DB. At the beginning of the learning all the 1804 weights of the SR have been initialized with homogeneous values between 0 and 1. More specifically: the 41 weights that connect the Input layer to the Hidden one have received the value 1.00; the 1681 weights that connect the Hidden layer to the Output layer have received the value 0.01; finally, the 82 biases of the Output nodes and the Hidden nodes have received the value 0.000.

During the learning phase the Self-Momentum algorithm has been used (Buscema and Massini 1993). The training has been carried out for 29700 cycles, that is, 1100 epochs. At that point, the mean RMS value for each model was 0.0001, while the Global one (summing all the errors of all the nodes of every model, over all models) was 0.003. More formally:

IF:

    E_p = (1/2) · Σ_{i=1..N} (Hidden_i − Output_i)²        N = number of Hidden and Output nodes

    Ē_p = (Σ_{p=1..M} E_p) / M                             M = number of Models
                                                           E_p = Error of each Model
                                                           Ē_p = Mean Error for each Model

THEN:

    Ē_p = 0.0001        GE = Σ_{p=1..M} E_p = 0.003        GE = Global Error
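A minimal sketch of this error bookkeeping (our notation; the helper names are hypothetical):

    def model_error(hidden, output):
        # E_p: half the sum of the squared Hidden-Output differences of one model
        return 0.5 * sum((h - o) ** 2 for h, o in zip(hidden, output))

    def training_errors(models):
        # models: list of (hidden_vector, output_vector) pairs, one per record
        e = [model_error(h, o) for h, o in models]
        # mean error per model (0.0001 above) and Global Error GE (0.003 above)
        return sum(e) / len(e), sum(e)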

Once trained, the SR has been questioned with the Normalization and Re-entry technique which we have seen previously (Buscema 1994).

The normalized Output values, during the questioning, have been bounded between 0 and 1, and the external Input has been left fixed at each cycle of Re-entry.

2.3. Advanced self-reflexive: the meta-units

The experimentations carried out have prompted a methodological enhancement of the SR.

The SR that is submitted to training is, at the moment, the same one that is questioned through the Re-entry. This does not allow the SR to generate answers on the models that it has learned, namely on the records, but only on the properties by which it has learned them (namely, the field options).

Practically, if I activate from the outside the properties "20s" and "Sharks" in a very well trained SR, after a reasonable number of Re-entries it will also have co-activated the properties "H.S.", "Burglar", "Single" and "his name is Ken".

Nevertheless, the property "his name is Ken" does not correspond to the "Subject Ken". The "Subject Ken" is a record made up of 6 properties, among which there is the property "to have Ken as a name", which is a considerable difference.

Actually, it would be very important that during the Re-entry the SR organized resonances among the records that it activates. This would allow it to enhance the fuzzy similarity among the various subjects.

In the IAC and in the CS this aim is easily reached, especially in the DB example that we are examining: it is enough to add to these ANN a pool of Hidden units, namely units not addressable as Inputs, equal in number to the records of the DB.

In order to represent a specific record, each added Hidden unit will have positive connections (for example: +1) with all the properties that make up that record, and inhibitory connections (for example: −1) with all the other Hidden units that represent the other records.

Once this architecture is set, both the Input properties and the new Hidden units are treated as nodes of the same type by the updating equations of the units.

On the SR, nevertheless, it is not possible to proceed with the same simplicity, for at least 2 reasons:
a) The IAC and the CS are circuit ANN; the SR is a layered ANN. Then, to which layer must the new units that represent the DB records be added? And with which connections, and where?
b) The IAC receives the values of its connections directly from the DB architecture; the CS may behave in an analogous way, or let a different statistical and/or neural mechanism prepare the connections. Nevertheless, when questioned, both have a specific updating algorithm for the units which has no relation with the way their connections have been built. In the SR, instead, the answering algorithm is part of the learning algorithm: it is the same SR that learns and answers.

These problems induced us to proceed in the following way. In the questioning phase one adds to the SR M new Output units, equal in number to the models (records) that it learned during the training phase. Therefore, if during the training phase the SR is configured in this way:

a. Input Level:   Input_k    1 ≤ k ≤ N
b. Hidden Level:  Hidden_j   1 ≤ j ≤ N
c. Output Level:  Output_i   1 ≤ i ≤ N

    {i = j = k;  N = number of Nodes}

in the questioning phase the SR will show its Output Level modified in this way:

c. Output Level:  Output_i + NewOutput_p    {1 ≤ p ≤ M;  M = number of Learned Models}

We have called these new Output nodes of the SR "MetaUnits". Each MetaUnit is connected to all of the SR Hidden units with specific weights. More specifically: the value of each weight between a MetaUnit and all the Hidden units is a function of the Input nodes values that represent the MetaUnit. Therefore:

W_pj = f^m(Input_j^p)        {p = Record; j = Hidden Node}        (2.1)

This is possible because each Hidden node, as was demonstrated (Didoné 1993), is a receptor of the Input node with which it is connected.

The function f^m that manipulates the Input is a classic MiniMax scaling function:

W_pj = Scale · Input_j^p + Offset        (2.1a)

where

Scale = (high − low) / (Max − Min)        (2.1b)

Offset = (Max · low − Min · high) / (Max − Min)        (2.1c)

The Max and Min values represent, respectively, the maximum and the minimum value entering the function. Since the values are all taken from the Input vector, Max will be equal to 1.0 and Min will be equal to 0.0.

The high and low values, instead, represent the maximum and minimum values that the weight W_pj can take.

These two values are derived from the analysis of the weights between the Hidden level and the Output level that the SR has developed during the learning phase. More specifically, one calculates the mean of the absolute values of the weights matrix W_ij^O:

X̄_w^O = (Σ_{i=1..N} Σ_{j=1..N} |W_ij^O|) / N²        (2.1d)

and, then, the high and low values are initialized:

high = X̄_w^O    and    low = −X̄_w^O.        (2.1e)

It is also possible to initialize these values through other procedures: for example, instead of the mean X̄_w of the weights, it is possible to use the function Max_w[|W_ij|] or Min_w[|W_ij|].

In any case, the choice of the criterion for establishing the maximum and minimum values of the new weights W_pj is a filter choice; it is a matter of deciding which impact the Hidden units of the SR must have on the new Output units.

Once we have established the new weights matrix W_pj, the SR is ready to be questioned through the Re-entry, also considering the added Output units.
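A minimal sketch of this MiniMax initialization of the MetaUnit weights, under our reading of equations (2.1a)-(2.1e) (the function name `metaunit_weights` is hypothetical):

    def metaunit_weights(records, w_out):
        # records: list of Input vectors, one per learned model
        # w_out:   N x N Hidden-Output weights matrix of the trained SR
        n = len(w_out)
        mean_abs = sum(abs(w) for row in w_out for w in row) / n ** 2  # (2.1d)
        high, low = mean_abs, -mean_abs                                # (2.1e)
        max_v, min_v = 1.0, 0.0
        scale = (high - low) / (max_v - min_v)                         # (2.1b)
        offset = (max_v * low - min_v * high) / (max_v - min_v)        # (2.1c)
        # W_pj = Scale * Input_j^p + Offset                            # (2.1a)
        return [[scale * x + offset for x in record] for record in records]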

During the forward algorithm, in fact, the NewOutput_p are calculated starting from the Hidden units through the weights W_pj, in the same way as the old Output_i units:

NewOutput_p = f^s(Net_p)        where f^s = sigmoid function        (2.2)

where:

Net_p = Σ_{j=1..N} Hidden_j · W_pj        (2.2a)

and then:

NewOutput_p = 1 / (1 + e^(−Net_p)).        (2.2b)

The NewOutputp values that are calculated at each cycle of questioning contribute to the global value of the MetaUnitsp.

The MetaUnitsp, then, are saturation units, that increase and/or decrease their total value with respect to the values of the NewOutputp at each Re- entry and with respect to a certain parameter of activation decay.

The maximum decay parameter is established by the SR itself at each Re-entry. We have calculated it as the mean of all the NewOutput_p values at each Re-entry:

decay = (Σ_{p=1..M} NewOutput_p) / M        (2.3)

The subsequent operation consists in calculating the Deltap, by which each MetaUnitp will increase or decrease its value at each Re-entry.

We have established this value with respect to the Net Input (Net_p) that each MetaUnit_p receives from the others at each Re-entry:

if (Net_p > 0)
    then Delta_p = Net_p · (1 − MetaUnits_p)
    else Delta_p = Net_p · MetaUnits_p.        (2.4)

The Net Input, Net_p, that arrives at each MetaUnit_p is determined by 2 parameters:
a) a function, f^t, of the sum of the differences between a function, f^i, of the NewOutput value of the MetaUnit_p that we are calculating and the same function, f^i, of the value of each of the other MetaUnits;
b) a linear function of the value of the MetaUnit that we are calculating at that moment.

The equation regulating these parameters is then the following:

Net_p = f^t(DdD_p)        (2.5)

where f^t is the hyperbolic tangent function, which in a semilinear way bounds the Net_p value within the excluded limits −1 and +1:

Net_p = (e^(DdD_p) − e^(−DdD_p)) / (e^(DdD_p) + e^(−DdD_p)).        (2.5a)

The value DdD_p (Degree of Difference) is, instead, calculated in this way:

DdD_p = Σ_{j=1..M} [(f^i(NewOutput_p) − f^i(NewOutput_j)) · (1 − MetaUnits_p)].        (2.5b)

The function f^i that adjusts the NewOutput values in (2.5b) is a function that we could call a virtual inversion of these values. It is, in fact, expressed in this way:

f^i(NewOutput_p) = (lg(NewOutput_p) − lg(1 − NewOutput_p)) / X̄_w^H        (2.6)

where X̄_w^H is the mean of the weights that connect the Input level and the Hidden level of the SR (we remind the reader that the SR is monodedicated); then:

X̄_w^H = (Σ_{j=1..N} W_jk^H) / N,        j = k.        (2.6a)

At this point it is possible to state the equation that updates the MetaUnits_p at each Re-entry:

MetaUnits_p(n+1) = MetaUnits_p(n) + Delta_p − Decay · MetaUnits_p(n).        (2.7)

By this algorithm the MetaUnits update their activation values at each Re-entry, within the included limits 0 and 1.

All this occurs without the help of any parameter external to the SR, but always and only using, locally, the information that the SR created for itself during the training phase.
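The following is a minimal sketch, under our reading of equations (2.2)-(2.7), of one Re-entry step of the MetaUnits (the name `metaunits_step` and the argument layout are our assumptions):

    from math import exp, log, tanh

    def metaunits_step(hidden, w_meta, meta, xw_h):
        # hidden: current Hidden vector; w_meta: MetaUnit weights W_pj;
        # meta: current MetaUnit activations; xw_h: mean Input-Hidden weight
        nets = [sum(h * w for h, w in zip(hidden, row)) for row in w_meta]
        new_out = [1.0 / (1.0 + exp(-n)) for n in nets]                 # (2.2b)
        decay = sum(new_out) / len(new_out)                             # (2.3)
        fi = [(log(o) - log(1.0 - o)) / xw_h for o in new_out]          # (2.6)
        updated = []
        for p, m in enumerate(meta):
            ddd = sum((fi[p] - fi[j]) * (1.0 - m)
                      for j in range(len(meta)))                        # (2.5b)
            net_p = tanh(ddd)                                           # (2.5a)
            delta = net_p * (1.0 - m) if net_p > 0 else net_p * m       # (2.4)
            updated.append(m + delta - decay * m)                       # (2.7)
        return updated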

The experimentations that follow will show the capabilities and the limits of this architecture.

2.4. The experimentation questions

We have submitted the three ANN that we chose to compare, IAC, CS and SR, to different types of questions. Obviously, it is a matter of questions which an SQL engine would either answer in a banal way or not answer at all.

The DB we chose represents a toy, given its simplicity and its dimensions. But it is by using toys that we discover children's inclinations. We hope that this analogy works for the SR networks. We have thought of 4 types of questions: 1. Prototypical Questions; 2. Incorrect Questions; 3. Impossible Questions; 4. Virtual Questions.

1. In the Prototypical Questions one activates one or more properties of the DB and one evaluates whether the ANN is able to activate the Subjects (Records) and the other properties that best represent a Prototype of the property/ies activated from the outside. We have put a hundred questions of this type to the 3 ANN. We will show two of them: 1a. the subjects and the most significant characteristics of the Jets; 1b. the subjects and the most significant characteristics of the Sharks.

2. The Incorrect Questions are a type of activation for which a normal DB would not find an answer, since at least one of the terms of the question is wrong with respect to the DB on which we are operating. For example, the subject called Ken is characterized by the following properties: he is a Shark; he is 20s; he attended High School; he is Single; he is a Burglar.

There is no Jet having the same properties as Ken, and Ken is the only Shark who is 20s. The problem is: if we ask the 3 ANN to find all the subjects of the Jets Band having the last 4 characteristics of Ken, what will the answer be? Will they realize that we are looking for Ken, but that we made an error in formulating the question, thinking of him as a Jet?

Also in this case we will put 2 questions to the 3 ANN: 2a. find the Jets Subjects that are 20s, that attended the H.S., that are Single and that are Burglars (I am thinking of Ken, but Ken is a Shark); 2b. find the characteristics of a Jets Subject called Dave (but, unfortunately, Dave is a Shark).

3. The Impossible Questions are questions that presume the search for properties that in that DB cannot appear in the same record. For example, the property "being a Jet" and the property "being a Shark" cannot both appear as true in the same subject. But in the theory of Fuzzy Sets it is possible to conceive subjects that have characteristics of fuzzy intersection between what, from the point of view of the classic theory of Sets, are considered two incompatible universes (Schmucker 1984; Zadeh 1987; Kosko 1992). We will put, even in this case, 2 questions to the ANN: 3a. find all the subjects that are both Jets and Sharks; 3b. find all the subjects that are called both Al and Karl.

4. The Virtual Questions are the other side of the Incorrect Questions. It is a matter of questions that look for sets of properties which are not realized in any subject of the DB. This does not mean that subjects having just that combination of characteristics cannot be added one day to the DB. In this sense, they are Virtual Questions. Nor does it mean that there are no subjects in that DB that come close to the virtual combinations of requested properties (and with respect to them the question is Incorrect). In our opinion, the Virtual Questions are, theoretically, the most important and, practically, the most useful. To be able to answer these questions is a step that allows one to make predictions on the development of the subjects of the DB and/or to discover submerged relations among those subjects. We will put 2 Virtual Questions to the 3 ANN: 4a. find the 40s subjects that have studied at College; 4b. find the Divorced subjects that work as Bookies.

Each of the 3 ANN will answer these 8 "strange questions". We will consider the answer of each of the 3 ANN complete once, for more than 10 cycles, its Degree of Goodness has not varied within the eighth figure after the decimal point.

3. Synthesis of the ANN in comparison with each other

In order to render the comparison that we are about to make more readable, we use the following abbreviations:
1. IAC: for the Interactive Activation and Competition by McClelland (Rumelhart et al. 1986: 27-31; McClelland and Rumelhart 1988: 11-47);
2. CS: for the Constraint Satisfaction by Rumelhart, Hinton and others (Rumelhart et al. 1986: 7-57; McClelland and Rumelhart 1988: 49-81);
3. SR: for the Self-Reflexive created by myself and described in these pages (but see also Buscema and Massini 1993; Pandin 1993; Didoné 1993).

3.1. IAC parameters and equations

The criterion for the definition of the weights in this ANN is very simple: among the properties of the same field there is an inhibitory relation (in our case: W_ij = −1); among the properties connected to the same record there is an excitatory relation (in our case W_ij = +1); in all other cases there is a relation of indifference (in our case W_ij = 0).

The basic equation for the calculation of the Net Input that reaches each unit, is the following:

Net_i = Σ_j W_ij · Output_j + ExtInput_i        (3.1)

where Output_j is the Output value of each IAC unit in that cycle, and ExtInput_i is the activation value of the external Input of that same unit.


We have to specify that, in practice, this equation has been modified with shift parameters; then:

Net_i = alfa · (Σ_j W_ij · Output_j) + estr · ExtInput_i.        (3.1a)

In the experimentations that we will present we have put alfa = 0.1 and estr = 0.4.

The equation for the calculation of the updating Delta of each unit is the following:

if (Net_i > 0)
    ΔU_i = (Max − U_i) · Net_i − Decay · (U_i − Rest)
else
    ΔU_i = (U_i − Min) · Net_i − Decay · (U_i − Rest)        (3.2)

where:
Max: maximum excitatory value that each unit U_i may assume; in our case Max = 1.0;
Min: maximum inhibitory value that each unit U_i may assume; in our case Min = −0.2;
Decay: decay value of the signal of each unit at each cycle; in our case Decay = 0.1;
Rest: rest value of each unit; in our case Rest = −0.1.

The parameter values that we have fixed are the same ones that were fixed by the authors of the IAC in their experimentations on the same toy DB that we are using in these pages (McClelland and Rumelhart 1988: 40).

The program used for the experimentation has been the one provided by the same authors: IAC.EXE (McClelland and Rumelhart 1988).
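A minimal sketch of this updating rule, with the parameter values quoted above (this is our illustration, not the authors' IAC.EXE; the final clamping of the activations to [Min, Max] is our assumption):

    ALFA, ESTR, MAX, MIN, DECAY, REST = 0.1, 0.4, 1.0, -0.2, 0.1, -0.1

    def iac_step(units, weights, ext_input):
        new_units = []
        for i, u in enumerate(units):
            net = ALFA * sum(w * o for w, o in zip(weights[i], units)) \
                  + ESTR * ext_input[i]                                 # (3.1a)
            if net > 0:
                du = (MAX - u) * net - DECAY * (u - REST)               # (3.2)
            else:
                du = (u - MIN) * net - DECAY * (u - REST)
            new_units.append(min(MAX, max(MIN, u + du)))  # clamp: our assumption
        return new_units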

3.2. CS parameters and equations

The CS that we worked with used a deterministic algorithm for the calculation of the net Input:

Net_i = istr · (Σ_j U_j · W_ij + bias_i) + estr · input_i        (3.3)

where istr is the shift of the internal net Input (in this case istr = 0.1) and estr is the shift of the external input with respect to the unit U_i (in this case estr = 0.4). The bias_i, in this case, was fixed differently for the units that represent the DB properties (bias_i = −2.0) and for the units that represent the DB records (bias_i = −1.0); we have followed, also in this case, the default indications given by the authors (McClelland and Rumelhart 1988: 68).

The equation for the updating of the units of the CS is the following:

if (Net_i > 0)
    ΔU_i = Net_i · (1 − U_i)
else
    ΔU_i = Net_i · U_i.        (3.4)

At each cycle, the updating of the units occurred according to a random order.

The program used for the experimentation has been the one provided by the same authors: CS.EXE (McClelland and Rumelhart 1988).
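A minimal sketch of the CS rule (3.3)-(3.4) with the quoted defaults and a random update order (again ours, not the authors' CS.EXE):

    import random

    ISTR, ESTR = 0.1, 0.4

    def cs_step(units, weights, bias, ext_input):
        for i in random.sample(range(len(units)), len(units)):  # random order
            net = ISTR * (sum(units[j] * weights[i][j]
                              for j in range(len(units))) + bias[i]) \
                  + ESTR * ext_input[i]                                 # (3.3)
            if net > 0:
                units[i] += net * (1.0 - units[i])                      # (3.4)
            else:
                units[i] += net * units[i]
        return units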

3.3. Synthesis of the SR equations

We have already described the SR in the previous pages. We think it is useful, therefore, to synthesize the equations of the monodedicated SR that we have used in the following experimentations:

(a) Forward Algorithm

    U_i = f^s(Σ_j U_j · W_ij + θ_i) = f^s(Net_i) = 1 / (1 + e^(−Net_i)).

(b) Backward Algorithm (the exponents I, H and O stand for the Input, the Hidden and the Output levels):

    d_i = U_i^H − U_i^O
    SelfMomentum_i = d_i · Δout_i(n−1)
    Δout_i(n) = SelfMomentum_i + d_i · U_i^O · (1 − U_i^O)
    Δhidden_j(n) = [Σ_{i=1..N} Δout_i(n) · W_ij^O] · U_j^H · (1 − U_j^H)
    θ_i(n+1)^O = θ_i(n)^O + Δout_i(n) · Rate
    W_ij(n+1)^O = W_ij(n)^O + Δout_i(n) · U_j^H · Rate
    θ_j(n+1)^H = θ_j(n)^H + Δhidden_j(n) · Rate
    W_jk(n+1)^H = W_jk(n)^H + Δhidden_j(n) · U_k^I · Rate.

(c) Output Normalization

    Opposite_i = (lg(U_i^O) − lg(1 − U_i^O) − θ_i^O) / X̄_w^H.

(d) Re-entry

    G(n) = Σ_{i=1..N} Opposite_i

    While G(n−1) ≠ G(n):
        U_i^I = Opposite_i.

(e) Weights of the MetaUnits (p = index of the MetaUnits; M = number of MetaUnits)

    W_pj = f^m(Input_j^p) = Scale · Input_j^p + Offset

    Scale = (high − low) / (Max − Min)
    Offset = (Max · low − Min · high) / (Max − Min)
    Max = 1.0;   Min = 0.0
    high = X̄_w^O;   low = −X̄_w^O
    X̄_w^O = (Σ_{i=1..N} Σ_{j=1..N} |W_ij^O|) / N²        (W^O = Output weights matrix)

(f) MetaUnits: Forward Calculation

    NewOutput_p = f^s(Σ_{j=1..N} U_j^H · W_pj) = f^s(Net_p) = 1 / (1 + e^(−Net_p)).

(g) Calculation of the MetaUnits

    Decay = (Σ_{p=1..M} NewOutput_p) / M

    if (Net_p > 0)
        then Δ_p = Net_p · (1 − MetaUnits_p)
        else Δ_p = Net_p · MetaUnits_p

    Net_p = f^t(DdD_p) = (e^(DdD_p) − e^(−DdD_p)) / (e^(DdD_p) + e^(−DdD_p))

    DdD_p = Σ_{j=1..M} [(f^i(NewOutput_p) − f^i(NewOutput_j)) · (1 − MetaUnits_p)]

    f^i(NewOutput_p) = (lg(NewOutput_p) − lg(1 − NewOutput_p)) / X̄_w^H

    X̄_w^H = (Σ_{j=1..N} W_j^H) / N

    MetaUnits_p(n+1) = MetaUnits_p(n) + Δ_p − Decay · MetaUnits_p(n).
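As a compact illustration of points (a) and (b), the following is a minimal sketch (our reading, with hypothetical names and an assumed Rate value) of one learning step of the monodedicated SR, where the Hidden vector works as the dynamic target of the Output:

    from math import exp

    def sigmoid(x):
        return 1.0 / (1.0 + exp(-x))

    def sr_step(inp, w_h, th_h, w_o, th_o, prev_dout, rate=0.5):
        # (a) Forward: one dedicated Input->Hidden weight per node
        # (monodedicated), full Hidden->Output matrix.
        hid = [sigmoid(x * w + t) for x, w, t in zip(inp, w_h, th_h)]
        out = [sigmoid(sum(h * w for h, w in zip(hid, row)) + t)
               for row, t in zip(w_o, th_o)]
        # (b) Backward: the Hidden vector is the (dynamic) target.
        d = [h - o for h, o in zip(hid, out)]
        dout = [d[i] * prev_dout[i]                      # SelfMomentum term
                + d[i] * out[i] * (1.0 - out[i])
                for i in range(len(out))]
        dhid = [sum(dout[i] * w_o[i][j] for i in range(len(out)))
                * hid[j] * (1.0 - hid[j]) for j in range(len(hid))]
        for i in range(len(out)):                        # Output biases, weights
            th_o[i] += dout[i] * rate
            for j in range(len(hid)):
                w_o[i][j] += dout[i] * hid[j] * rate
        for j in range(len(hid)):                        # Hidden biases, weights
            th_h[j] += dhid[j] * rate
            w_h[j] += dhid[j] * inp[j] * rate
        return hid, out, dout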


4. The answer given by the ANN to the DB

The DB that we have used in the experimentations is made up of 27 records, one for each subject, and 6 fields. The first 5 fields refer to classificatory properties:

a. The Belonging Band of the Subject:     Jets, Sharks
b. The Age Group of the Subject:          20s, 30s, 40s
c. The Academic Degree of the Subject:    J.H., H.S., College
d. The Civil Status of the Subject:       Single, Married, Divorced
e. The Profession of the Subject:         Pusher, Bookie, Burglar

The sum of all these properties is realized, by all the ANN that we will examine, in 14 nodes. The sixth field treats an indexical property: the name of each subject. In this case, then, 27 different names, one for each subject, will be available.

These characteristics of the DB imply that each of the 2 comparison ANN (IAC and CS) will be made up of 41 visible nodes and 27 hidden nodes, according to the following chart:

                Visible nodes                 Visible nodes            Hidden nodes
                Classificatory properties     Indexical properties     Records

No. of nodes    14                            27                       27

Meaning         The sum of the properties     The set of the field     Each record is a subject
                from the fields: band, age    properties "name of      meant as a choice of
                group, academic degree,       the subject"             property of every
                civil status, profession                               visible node

In the case of the SR the visible nodes double themselves (41 Input Units + 41 Output Units); there are 41 Hidden Units, which are part of the intermediate SR structure, and 27 nodes are added in Output and are called MetaUnits, each of which represents one of the 27 Records foreseen by the DB. Also the weights matrix of the 3 ANN is consequently different:
(a) IAC: 840 connections;
(b) CS: 881 connections (840 weights + 41 biases);
(c) SR: 2901 connections (1681 Hidden-Output weights + 41 Input-Hidden weights + 41 Hidden biases + 41 Output biases + 1097 Hidden-MetaUnits weights).

This surely means that the SR is more complex than the other ANN, and this greater complexity will only be justified by significantly better results.

In order to make the reading of the following data easier, we decided not to show the answers of the 3 ANN on the 27 indexical properties. Therefore the names of the subjects that will appear in the charts will refer to their Records and not to the "fact of having a certain name".

We have also decided to show only the Outputs with a value greater than zero.

This should make the visual comparison of the results that the 3 ANN have given to our 8 questions easier.

4.1. The concept of plausibility

All 3 tested ANN answer with values between 0 and 1, or between −0.2 and +1.0 (the IAC).

The Output value that stabilizes at each node, after an opportune number of cycles, indicates the plausibility with which that node satisfies the set of constraints to which the ANN is submitted.

The concept of plausibility is completely different from that of probability.

The plausibility of the belonging of an element to a class is the degree of fuzzy belonging of that element to that class, and it is determined by a special function.

For example, given a fuzzy subclass A of the universe U, to which the elements {a, b, c, d, e, f} belong, it is possible to say that, with respect to A:
(a) a is present with a degree of belonging 1.0;
(b) b is present with a degree of belonging 0.9;
(c) c is present with a degree of belonging 0.2;
(d) d is present with a degree of belonging 0.8;
(e) e is present with a degree of belonging 1.0;
(f) f is present with a degree of belonging 0.0;


this is like rewriting A in the following way:

{1/a, 0.9/b, 0.2/c, 0.8/d, 1/e, 0/f}

For further study, compare Schmucker (1984).
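A minimal sketch of this notation (ours; purely illustrative):

    # a fuzzy subclass is a map element -> degree of belonging; an ANN answer
    # can be read the same way (plausibility in [0, 1], printed x 100 below)
    A = {"a": 1.0, "b": 0.9, "c": 0.2, "d": 0.8, "e": 1.0, "f": 0.0}

    def show(fuzzy_set):
        # writes the set in the degree/element notation used above
        return "{" + ", ".join(f"{v:g}/{k}" for k, v in fuzzy_set.items()) + "}"

    print(show(A))   # -> {1/a, 0.9/b, 0.2/c, 0.8/d, 1/e, 0/f}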

The answers of the three ANN, then, must be interpreted as the plausibility of the belonging of each node to the class of constraints imposed by the activation of the ANN weights plus the external input.

The value of plausibility of each node will be put in brackets, near the name of each property and record of the ANN, and it will be multiplied by 100.

4.2. The prototypical questions

4.2.1. The Jets prototype
The node "Jets" has been activated in the three ANN with the value 1.0. The IAC and the CS needed 400 cycles in order to stabilize; for the SR, 100 cycles were necessary. The node activated from the outside is marked with the sign "*".

Jets: the Properties

IAC             CS              SR
*JETS (84)      *JETS (100)     *JETS (100)
PUSHER (62)     BURGLAR (99)    BURGLAR (100)
SINGLE (62)     MARRIED (99)    DIVORCED (100)
20s (44)        20s (99)        20s (100)
J.H. (44)       J.H. (99)       J.H. (100)

All three ANN agree in considering that the typical Jet is a 20s subject with the J.H. title. This was evident at first sight: within the Sharks there is only one 20s Subject and only one subject with the J.H. title.

The typical civil status of the Jets seems to be the single one (out of 13 singles, 9 are Jets).

Nevertheless, only 4 of these are 20s, and only another 4 have the J.H. title. And there is no subject among the Jets that is Single, 20s and J.H. at the same time.

The IAC, instead, offers this solution: the typical civil status of the Jets is Single: a quantitative dazzle. The same dazzle for the profession: since 4/5 of the single Jets are Pushers, then Pusher is the typical profession of the Jets. It is as if the IAC were able to take the variables involved into account only as couples.

Among the Jets there are:
– many 20s Singles, therefore Single and 20s;
– many Singles with J.H., therefore Single and J.H.;
– many Single Pushers, therefore Single and Pusher.

The fact that there are only a few relations between 20s and J.H., and no relation between Single, 20s and J.H., forces the IAC to decrease the plausibility of the 2 basic properties of the Jets: to be 20s and to have attended the J.H.

The CS choice seems to be better: out of the 5 Burglar Jets, 4 are 20s and two are married. The problem is rather that the situation is analogous among the Sharks: out of 4 Burglars, 2 are married. Therefore to be Burglar and Married seems to be more a possible collateral effect of being 20s than a typical characteristic of the Jets.

Furthermore, since the CS is provided with a maximization algorithm, it tends to squash all the properties towards the maximum activation. Just as if a judge considered a slaughter in the same way as an insult, just because they both belong to the class of crimes.

The SR choice, instead, is sharper and more precise: to be a Burglar and Divorced, and to be 20s with the J.H. academic degree, is typical of the Jets.

In fact, among the 5 Burglars of the Jets:
(a) 2 of them are Married, are 20s and have the J.H. academic degree, but there are another 2 among the Sharks which are Married;
(b) 1 is Married and has the J.H. academic degree, but is 30s (a typical property of the Sharks);
(c) 2 are Divorced, are 20s and have the J.H. academic degree; among the Sharks there are only 2 that have the J.H. title and are 30s.

Therefore, among the 4 Jets who are 20s with the J.H. title:
(a) all are Burglars;
(b) there is no 20s subject in the Sharks with the J.H. academic degree;
(c) the only 2 Divorced Burglars among the Sharks have typical characteristics of the Sharks; namely, they are 30s and have the H.S. academic degree.

Jets: the Records

IAC           CS             SR
ART (49)      JIM (99)       GEORGE (90)
FRED (49)     JOHN (99)      JIM (89)
GENE (49)     LANCE (99)     JOHN (87)
RALPH (49)    GEORGE (99)    LANCE (87)
              ART (77)       AL (83)
                             KEN (72)
                             RICK (71)
                             ART (64)
                             GENE (62)
                             SAM (61)
                             CLYDE (61)
                             GREG (60)
                             RALPH (57)
                             FRED (55)
                             PETE (54)
                             MIKE (52)

The IAC, having roughly understood the Jets, points out as typical Jets 4 subjects that seem typically atypical:
(a) Art is 40s and not 20s;
(b) Fred has the H.S. title, typical of the Sharks;
(c) Gene has the College title, which is more typical of the Sharks than of the Jets;
(d) Ralph is 30s, which is a typical trait of the Sharks.

Why such a macroscopic error? Simply because all of these 4 subjects are Single and Pushers, two traits that the IAC erroneously considered typical of the Jets.

The CS generates a better answer: 2 of the most plausible subjects (Jim and George) are Divorced Burglars and the other 2 are Married Burglars (John and Lance). Art is considered less plausible but still rather a Jet, for the following reasons: (a) he does not have the specific Sharks traits (30s and H.S.); (b) he has a specific Jets trait (J.H.); (c) he does not show the combination Single + Bookie (e.g. Clyde), typical of the Jets that have Sharks traits (Pete, Doug, Mike) and of a typical Shark (Neal).

Nevertheless, both the IAC, with its errors, and the CS, with its intuitions, seem rather monotonic. For example, they cannot conceive sophisticated scales of plausibility. In fact, they see which Jets are a little less Jets, rather than seeing which Sharks could be Jets. This is exactly what the SR achieves. The first 4 typical Jets of the SR coincide with the typical Jets of the CS. Nevertheless, their plausibility scores are very different.

The SR, even while having the trait "Divorced" as a typical trait of the Jets, is able to give a very high plausibility to 2 Married subjects (John and Lance). Moreover, it treats as an exception the fact that a subject like Al (83%) is 30s, since he is characterized by traits like Married, Burglar and J.H. And not only that: in 6th and 7th place of the SR appear 2 Sharks: Ken and Rick.

Ken is framed as a possible Jet (72%) for 2 reasons: (a) he is 20s, and he is the only 20s among the Sharks; (b) he is a Burglar, a property that the SR has demonstrated to weigh heavily when it is connected to being 20s. Rick is framed among the possible Jets (71%) for one reason: he is a Divorced Burglar, a combination of traits that for the SR is typical of the Jets. The typicality scores of the SR go on to stabilize the plausibility value of every other Jet: except for Doug, all the Jets have a degree of belonging to the Jets class which is larger than zero.

Doug, in fact, presents 2 typical traits of the Sharks (30s and H.S.) and 2 traits that are not typical for the Jets, nor for the Sharks (Single + Bookie).

From this first questioning, the answers given by the SR seem richer, more analytical and more reasoned than the answers given by the other 2 ANN. It is as if the SR reasoned simultaneously on all the combinations among the 41 properties of the DB, while the other 2 ANN were forced to prune the combinatory tree according to the combinations that empower themselves during the processing.

4.2.2. The Sharks prototype
Also in this case, 400 cycles were necessary for the IAC and the CS; for the SR, 100 cycles were necessary.

Sharks: the Properties

IAC             CS              SR
*SHARKS (84)    *SHARKS (100)   *SHARKS (100)
30s (65)        30s (99)        BURGLAR (100)
COLLEGE (65)    H.S. (99)       30s (96)
MARRIED (65)    MARRIED (99)    H.S. (91)
PUSHER (51)     PUSHER (99)     MARRIED (90)
                COLLEGE (27)    J.H. (14)
                                DIVORCED (12)
                                20s (5)
                                PUSHER (2)
                                JETS (1)


The variable Profession is distributed in an equiprobable way both in the Jets and in the Sharks: 5 Pushers, 5 Bookies and 5 Burglars in the Jets, and 4 Pushers, 4 Bookies and 4 Burglars in the Sharks. In order to understand which of the 3 professions is more typical of the Sharks, it is necessary to understand how the three of them are connected to the other 3 fields, which work in this question as independent variables.

A tree-graph may partially help to visualize the problem:

[Tree-graph omitted: it spreads the Sharks subjects of each profession — a) Pusher, b) Bookie, c) Burglar — over the branches age group → academic degree → civil status, with the number of subjects on each branch.]

At first sight it seems to make sense that the profession Pusher is representative of the Sharks. In this sense, one may maintain that the CS answer is more complete than the answer of the IAC: the trait "College", in fact, is no more typical of the Sharks Pushers than the trait "H.S.", and the CS renders this different representativity in terms of plausibility (H.S. = 99%; College = 27%).

Nevertheless, both the IAC and the CS miss the comparison between the Pusher traits (30s, College, Married) in the Sharks and in the Jets.

This oversight occurs because these 2 ANN have strongly suppressed the trait Jets during their processing.

In the Jets, in fact, 3 Pushers out of 5 present either the trait H.S. (2) or the trait College (1). It is not by chance that in the previous question the IAC was deceived by this combination into considering that the profession Pusher characterized the Jets.

The SR, on the contrary, answers in a more articulated way. It agrees with the other 2 ANN on the 3 typical traits of the Sharks: 30s, H.S. and Married. But it does not flatten them by giving them the same plausibility. It singles out the profession Burglar as typical of the Sharks when connected with these traits, and with other traits whose manifestation it does not exclude: J.H. (14%), Divorced (12%), 20s (5%), Pusher (2%) and even Jets (1%).

The answers that the 3 ANN have generated on the Records bring into focus, even more clearly, the non-monotonic complexity of the SR. Such complexity can be explained by imagining 2 ways of proceeding that the ANN carry on in order to solve the problem.

The first way, which is of a competitive type and specific to the IAC and the CS, is the following: the stronger traits straightaway steer the solution in a certain direction. These traits exclude the traits that are counterposed to them. The latter, in the successive phases, no longer have a propositive role; rather they can, at most, reshuffle the power of the emerging traits. Briefly, the consequences of the most important choices of these 2 ANN cannot retroact on the choice model that generated them; they can only reshuffle its capacity.

The second way to carry on the processing of the traits is, instead, of a cooperative type, and it is typical of the SR: at each cycle all the traits are convened to renegotiate from the beginning their participation in the elaborative process, considering their real power. In this way, the consequences of the most important choices may retroact on the whole model that is being configured, even redrawing a new one, which will be submitted to an analogous evaluation.


Sharks: the Records

IAC           CS            SR
PHIL (63)     PHIL (99)     AL (88)
OL (63)       NICK (99)     EARL (87)
DON (47)      OL (99)       RICK (87)
NED (47)      DAVE (99)     DON (87)
                            DAVE (86)
                            KEN (86)
                            KARL (83)
                            PHIL (81)
                            OL (81)
                            NICK (81)
                            NED (81)
                            NEAL (80)
                            LANCE (1)

For the IAC the most representative Sharks subjects are the 4 subjects having the College title. This is quite strange, since no subject having the H.S. title is selected: out of 12 Sharks, 7 present the H.S. trait and 4 of them are 30s. Among the Jets, instead, only 4 out of 15 subjects present the H.S. trait, and 3 of them are 20s. The trait that unites the 4 selected Sharks is Married. But out of the 6 Sharks subjects that are Married, only 2 are also Pushers.

For the CS the most representative Sharks subjects are the 2 Married Pushers with the College title (an answer analogous to the IAC's) and the only 2 Pushers with the H.S. trait.

In this case the answer seems more pondered. Nevertheless, one does not understand why the profession of Pusher should be so important in order to be a Shark, when the Pusher Sharks are exactly 1/3 of all the Sharks and the Jets Pushers are 1/3 of all the Jets.

The SR answer is more complex, as usual. All the Sharks subjects except one are activated with a plausibility higher than or equal to 80%. These 11 Sharks subjects, seen as typically Sharks, have different professions: 4 Pushers, 4 Burglars and 3 Bookies. This takes place even though the SR has simultaneously maintained that the profession Burglar combines better than the others in drawing the typical Shark.

The excluded Shark is IKE, and the reason for such an exclusion is clear: IKE is a collage of traits atypical for the Sharks but not typical for the Jets:
– he is 30s, like a Shark;
– he is the only Shark with the J.H. title, which is typical of the Jets;
– like Neal, he is one of the only Single Bookies of the Sharks, which is a typical combination of the Jets;
– no typical Jet with his combination of traits exists.

This makes him an anomalous subject for both bands: in fact, he looks like Mike, who is a little Jets, and like Doug, who is definitely an atypical Jet.

According to the SR, therefore, the Sharks band is globally more compact than the Jets band, and even more varied: Earl (87%) is 40s and not 30s; Rick (87%) is Divorced and not Married; Don (87%) has the College title and not the H.S.; Dave (86%) is Divorced and not Married, and besides he is a Pusher and not a Burglar; Ken (86%) is a single 20s; Karl (83%) is a 40s Bookie; etc.

Briefly, although the SR has selected specific traits of the Sharks among the properties, this has not prevented it from giving also a strong plausibility to some Sharks not having all those traits.

Moreover, the SR indicates as a typical Shark a Jets subject, Al, who had also come out as a quite typical Jet.

This should not surprise: if one considers Al's J.H. trait to be an exception, this subject has typical Sharks traits. On the other hand, if one considers Al's 30s trait an exception, one also has a fair representation of a Jet.

This qualifies Al as an ambiguous subject: very Sharks and quite Jets. The contrary of IKE, who is a neutral subject: little Sharks and not enough Jets.

This means that the answers of the SR also offer a sociogram of the records on which the SR reasons. This implies that, through its quantitative activations, the SR also draws a semantics and a syntax of the records on which it is working.

4.3. The incorrect questions

4.3.1. Jets or Sharks?
In this case we have simulated a search for the Subject KEN, asking a question containing an error. The traits of the subject Ken are the following:

• Sharks • 20s • H.S. • Single • Burglar

The external input we activated implied 4 traits: 3 of them were the same last 3 that identify the subject Ken, but the trait Sharks has intentionally been exchanged for the trait Jets.

The question put to the three ANN has then taken the following shape:

• Jets • H.S. • Single • Burglar

We specify that there is no Jet having these characteristics. Therefore, the problem was: will the 3 ANN be able to understand that Ken is the nearest subject to this description, even though we have mistaken the most important and discriminating trait of each subject? For the IAC and the CS, 400 cycles were necessary; for the SR, 100.

• Jets • H.S. • Single • Burglar: the Properties

IAC              CS               SR
*SINGLE (85)     *SINGLE (100)    *SINGLE (100)
*H.S. (84)       *H.S. (100)      20s (100)
*JETS (83)       *JETS (100)      *H.S. (99)
*BURGLAR (79)    *BURGLAR (100)   *BURGLAR (96)
20s (62)         20s (99)         *JETS (95)
                                  SHARKS (8)
                                  J.H. (5)
                                  30s (3)
                                  MARRIED (3)

On the fundamental properties the 3 ANN agree: the subjects which respond best to the input constraints are surely 20s.

While for the IAC this characteristic is merely active, for the CS and the SR it is a fundamental trait.

For the SR, nevertheless, the Jets trait is slightly self-depressed with respect to the others; this is a sign that this ANN has perhaps understood that the subjects at issue must also be searched for among the Sharks. The Sharks trait, in fact, is activated at 8%, J.H. at 5%, 30s at 3% and Married at 3%.

• Jets • H.S. • Single • Burglar: the Records

IAC            CS             SR
PETE (62)      PETE (99)      KEN (89)
FRED (56)      FRED (99)      PETE (87)
KEN (52)       KEN (99)       FRED (87)
DOUG (39)                     DOUG (86)
SAM (24)                      LANCE (85)
                              GENE (82)
                              GEORGE (81)
                              JIM (81)
                              SAM (81)
                              GREG (80)
                              JOHN (80)


The answers of the 3 ANN are globally satisfactory. Both the IAC and the CS agree on the first 3 subjects. The SR follows this interpretative line, but it overturns the order of plausibility of the IAC: Ken is the subject which responds more than the others to the questioning conditions.

The SR, moreover, presents a very wide and diversified list of virtual candidates to the questioning conditions.

4.3.2. A Jet called Dave
In this second experimentation the simulated error is more profound. From the DB we know that the subject called Dave is a Shark; according to the SR, he is a rather typical Shark.

Therefore, we have asked the 3 ANN to search for a Jet called Dave, who obviously does not exist in the reference Data Base.

The aim consists in evaluating which of the 3 ANN, once faced with this wrong request, is able to self-correct, and in which way. (Even in this case, 400 cycles for the IAC and the CS and 100 cycles for the SR.)

• Jets • Dave: the Properties

IAC                  CS                   SR
*CALLED DAVE (78)    *CALLED DAVE (100)   *CALLED DAVE (89)
*JETS (84)           *JETS (100)          20s (100)
SINGLE (62)          MARRIED (99)         PUSHER (99)
PUSHER (62)          BURGLAR (99)         DIVORCED (98)
J.H. (43)            J.H. (99)            H.S. (96)
20s (43)             20s (99)             *JETS (96)
                                          COLLEGE (7)
                                          SHARKS (6)
                                          BOOKIE (6)
                                          30s (3)
                                          MARRIED (3)

One can see straightaway that the SR is the only ANN to have also given a minimum of plausibility to the traits that characterize the subject called Dave: Sharks (6%), 30s (3%), H.S. (96%), Divorced (98%) and Pusher (99%). The first two traits, typically Sharks, have rather low scores, while the last 3, which are shared among the Jets too, present higher scores.

The other two ANN have tried to give an answer all the same, but their choices are reductive: once they verified that the constraint "Jets" is stronger than the constraint "called Dave", they excluded once and for all both the Sharks subjects and the Divorced ones.


• Jets • Dave: the Records

IAC           CS             SR
ART (47)      JIM (99)       GREG (90)
FRED (47)     JOHN (99)      DAVE (89)
GENE (47)     LANCE (99)     FRED (88)
RALPH (47)    GEORGE (99)    GENE (85)
                             GEORGE (85)
                             AL (84)
                             JIM (84)
                             PETE (84)
                             SAM (83)
                             KEN (68)
                             RICK (68)
                             NICK (62)
                             DOUG (39)
                             ART (27)

The SR is the only ANN that realizes the pertinence of giving high plausibility to the subject Dave: the CS answers according to its model of the Jets, while the IAC considers only the Jets that are least far from the traits of the prototypical Jet. On the contrary, the SR, as well as behaving as if it had realized the error in the question, also offers a series of record models that could be the analogues, within the Jets, of a Sharks Dave. In the first place Greg (90%), who is neither a typical Jet nor a typical Shark. Nevertheless, he is typically 20s like the Jets, as Dave is typically 30s like the Sharks; he is Married, which is not a typical trait of the Jets, as Dave is Divorced, which is not a typical trait of the Sharks. Furthermore he is, like Dave, a Pusher with the H.S. title. Analogous considerations can be made on the other SR proposals.

4.4. The impossible questions

4.4.1. Both Jets and Sharks
This question is only apparently impossible. It is true that in the reference DB there is no subject characterized by both of these properties: either one is a Jet or one is a Shark, unless double-crossing is expected.

Nevertheless, it is possible to ask oneself the following question: which are, if any, the subjects whose traits are so complex as to let them play both in the Jets team and in the Sharks team? Briefly: who are the virtual double-crossers?


It is quite a different question from one that would sound like: "Which subjects are neither quite Jets nor quite Sharks?" To pick up an earlier line of reasoning: in this last case it is a matter of locating the "most neutral" subjects, while in our case we search for the "most complex" subjects. (For the IAC and the CS we have taken 400 cycles; for the SR, 200 cycles.)

• Jets • Sharks: the Properties

IAC             CS              SR
*JETS (79)      *JETS (100)     J.H. (100)
*SHARKS (79)    *SHARKS (100)   BURGLAR (99)
SINGLE (68)     MARRIED (99)    DIVORCED (98)
30s (65)        30s (99)        20s (93)
BOOKIE (65)     BURGLAR (99)    *JETS (52)
H.S. (60)       J.H. (99)       *SHARKS (50)
                20s (3)         30s (8)
                                SINGLE (3)

The IAC and the CS choose, in a rather radical way, the traits from which they select the subjects that best answer the constraints imposed by the question. They agree on one trait only: to be 30s. Then, for the IAC it is a matter of Singles, Bookies, H.S., while for the CS it is a matter of Married, Burglar, J.H., with a gleam left open to the trait 20s (3%). For the SR the conditions for selecting the subjects pertinent to this question are fuzzier.

There will be a strong area of research among the 20s with J.H., Burglar and Divorced and a more limited area of research characterized by subjects with the traits 30s and Single.

• Jets • Sharks: the Records

IAC           CS            SR
NEAL (59)     AL (99)       GEORGE (90)
DOUG (59)     JOHN (99)     JIM (89)
MIKE (33)     LANCE (99)    JOHN (86)
IKE (33)      PHIL (99)     LANCE (86)
PETE (30)     DON (99)      KEN (82)
NICK (30)                   RICK (82)
                            IKE (69)

The IAC has chosen the least prototypical subjects of both teams.


The CS has made a choice that is difficult to explain at first sight: it seems as if the Married were privileged: 30s with College if Sharks; with J.H. if Jets.

The choice of the SR is more complex. Leaving out Ike, whose plausibility is rather reduced, the SR has chosen 4 Jets and 2 Sharks. The 2 Sharks subjects, Ken and Rick, were previously found rather prototypical of their team. Nevertheless, the traits H.S. and Single are not a strange combination among the Jets. In the same way, Rick's traits Divorced and Burglar are rather typical of the Jets, even if the traits 30s and H.S. prevent him from being a typical Jet. Furthermore, for Ken, the trait 20s characterizes him as a Jet, while the combination Single + Burglar seems to leave him out of this team.

If the complex or ambiguous nature of Ken and Rick is found to be clear, the nature of George and Jim seems less easy to grasp: previously the SR, in fact, located them as typical Jets. And an analogous reasoning can be made for the couple John and Lance. Actually, being Divorced is typical but rare among the Jets (2 models), just as it is atypical and rare among the Sharks (2 models). For the couple of traits "Divorced" and "Burglar" the reasoning can be repeated, with the difference that the number of Sharks with this double characteristic is reduced to one: Rick.

The couple George and Jim was then chosen for its similarity to the subject Rick. It is as if the SR had thought that "they can understand each other". The couple Lance and John has been selected for analogous reasons:
(a) Married is a very characteristic Sharks trait;
(b) Burglar is equidistributed, but Married Burglar is more typical of the Sharks than of the Jets;
(c) 20s and Burglar are traits that Ken also presents.

In other words: the SR has not simply answered our question through a list of subjects, but it proposed to us a new team of subjects:
– all Burglars;
– Married, Single and Divorced;
– 20s and 30s;
– with J.H. and H.S.

That is a set of characteristics that, once distributed among the 6 subjects, makes each of them typical of his team but not a stranger to the other team and/or to the other 5 selected.

4.4.2. Who could simultaneously be Karl and Al?
Karl and Al are the names of two subjects who are very different from each other: they belong to different bands (Sharks and Jets), they have different ages (40s and 30s) and they have different academic degrees (H.S., J.H.).

The only trait that they have in common is that they are both Married. Furthermore, neither of them is a good representative of the Band to which he belongs, even if Karl is less atypical for the Sharks than Al is for the Jets. The question that we will put to the 3 ANN is not, then, impossible under every profile: it is true that there is no subject in our DB that is simultaneously called Karl and Al, but it is also true that there may be subjects that fuzzily own characteristics belonging to the fuzzy subset of the characteristics of Karl and Al. The IAC and the CS needed 400 cycles in order to stabilize; the SR needed 200.

• Al • Karl: the Properties

IAC              CS              SR
*KARL (77)       *KARL (100)     JETS (100)
*AL (75)         *AL (100)       MARRIED (100)
MARRIED (54)     MARRIED (99)    40s (97)
SHARKS (50)      GEORGE (99)     *KARL (91)
40s (50)         SHARKS (99)     J.H. (89)
H.S. (50)        30s (99)        *AL (87)
BURGLAR (29)     H.S. (99)       BOOKIE (75)
BOOKIE (19)      BURGLAR (99)    BURGLAR (32)
                 40s (15)        H.S. (11)
                                 30s (5)
                                 SINGLE (3)
                                 SHARKS (1)

The answer of the IAC is clear: it has privileged Karl, except in the case of the job, since the property Burglar, which characterizes Al, is actually more typical of the Sharks than the property Bookie.

The answer of the CS is analogous but more radical: the characteristic Bookie disappears and being 40s becomes only lightly plausible.

Besides the greater complexity of its answer, the SR did not try to adjust its choices to the statistical schemes present in the DB; it rather created a new type of subject, a hybrid not present in the DB, a virtual subject that better interprets the conditions of the question we have made:
(a) surely a Jet (as Al);
(b) surely Married (as both);
(c) very plausibly 40s (as Karl);
(d) very plausibly with the J.H. title (as Al);
(e) preferably a Bookie (as Karl);
(f) but even a Burglar is acceptable (as Al);
(g) it is not excluded that he could have the H.S. title (as Karl);
(h) little plausible if he were 30s (as Al);
(i) little plausible if he were Single (as neither of them);
(l) it is not forbidden, but it would be unusual, if he were a Shark (as Karl).

In our DB there is no subject having the characteristics from "a" to "d". Thus the SR has created the premises for the birth of a new type of subject: through technology, it has conceived the computer son of Karl and Al.

• Al • Karl: the Records

IAC          CS           AF
KARL (66)    KARL (99)    KARL (89)
EARL (55)    AL (99)      AL (89)
AL (19)      EARL (99)    CLYDE (89)
             DON (99)     JOHN (86)
             RICK (33)    LANCE (86)
                          ART (84)
                          MIKE (83)
                          EARL (81)
                          JIM (72)
                          GEORGE (72)
                          GREG (53)

The choice of the Records confirms what has been said. The IAC does not consider the Jets and merely tolerates Al. It associates the subject Earl with Karl because the former is analogous to the latter except for the job: he is in fact a Burglar, a property privileged by the IAC. The CS leaves out all the Jets but Al and enlarges the list of Sharks candidates: Earl is chosen for reasons analogous to the IAC's; Don is the only Married 30s Burglar among the Sharks, and being 30s is typically Sharks; Rick is merely tolerated: he is promoted because he is a Burglar and 30s with the H.S. title, typical Sharks traits, and inhibited because he is not Married.

The candidates of the SR instead come from both bands. Together with Karl and Al, Clyde is selected: he could be defined as "an Al" that has Karl's job and is not Married. John and Lance are a good "genetic synthesis" between Al and Karl: - Jets (as Al); - 20s (a new trait);



- J.H. (as Al); - Married (as both); - Burglar (as Karl).

The same holds for the subject Art, who nevertheless presents 2 new traits with respect to Karl and Al (Single + Pusher), and for Mike, who presents only one new trait and otherwise mixes traits of Al and Karl.

But the SR also selects Earl, a Sharks subject presenting a combination of Karl's characteristics and just one of Al's.

One could also discuss all the other choices of the SR. We prefer, nevertheless, to offer a few considerations of a more general character.

The SR answers with greater degrees of freedom than the other 2 ANN. Furthermore, in answering, it does not adjust itself to the statistical structure of the DB in which it operates: it adjusts the structure of the DB to the structure of the question submitted to it. This allows the SR to create new types of records in its answers.

While the CS and the IAC, in answering the question, try to maximize the compatible units and minimize the incompatible ones, the SR spontaneously seeks a Min-Max solution among the units: the maximum activation that minimizes the distance among the units, given their internal constraints (the weights) and the external ones (the inputs). The CS and the IAC work with a "step-2 memory": the tree-graph of all the possible combinations of their units can be covered only in depth, through fundamentally binary options. The SR covers the same graph both in depth and in extension, its solutions showing an N-step memory, where N is the number of all the units in play.

This allows the SR to strongly activate certain Records presenting characteristics that are completely inhibited among the units representing the DB properties.
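
One way to make this contrast concrete is the "goodness" measure used for constraint-satisfaction networks in Rumelhart et al. (1986, vol. II, chap. 14), which this paper cites; the sketch below is our reading, not the paper's own formalism.

    import numpy as np

    def goodness(a, W, ext):
        """Goodness of a network state a: large when strongly active units
        are linked by positive weights (the internal constraints) and agree
        with the external input ext. IAC/CS settling is hill-climbing on
        this surface through essentially binary local options (the 'step-2
        memory' of the text); the SR, on the authors' account, trades total
        activation against constraint violation across all N units at once."""
        return 0.5 * a @ W @ a + ext @ a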

4.5. The virtual questions

4.5.1. 40 years old at College

In the DB we are examining there are no 40s subjects with a College academic degree. To bind the 3 ANN to these 2 characteristics means asking: which subjects that attended College present characteristics similar to the 40s subjects, and which 40s subjects present characteristics similar to the subjects that attended College?



• 40s • College: the Properties

IAC             CS              AF
*COLLEGE (84)   *COLLEGE (100)  MARRIED (100)
*40s (75)       *40s (100)      *COLLEGE (98)
SHARKS (64)     SHARKS (99)     SHARKS (97)
MARRIED (64)    MARRIED (99)    *40s (94)
30s (53)        30s (99)        BURGLAR (86)
PUSHER (50)     PUSHER (99)     PUSHER (15)
                                20s (9)
                                H.S. (7)
                                JETS (6)

The 3 ANN agree that a hypothetical subject presenting the two traits College and 40s together very plausibly comes from the Sharks band and is plausibly Married.

Once this course is taken, the IAC and the CS provide, in a rather deterministic way, the characteristics they had already provided to represent the model of the typical Shark: Pusher and 30s.

The SR behaves differently from the first 2, and differently from how it behaved when it determined the characteristics of the typical Shark: it selects Burglar as very plausible and Pusher as only fairly plausible.

At first sight this choice may seem strange: of the four Sharks subjects holding the College title, 2 are Pushers, one is a Burglar and one is a Bookie. Nevertheless, all 4 are Married, and this trait is typical of the Burglars in both bands: 3 of the 5 Burglars in the Jets are Married, as are 2 of the 4 in the Sharks. Furthermore, the only 2 40s present in the Sharks are Married, and one of them is also a Burglar.
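
The counts invoked in this paragraph are plain co-occurrence statistics over the toy DB. A sketch of how they are obtained, using a small illustrative fragment of the DB (only records whose traits the surrounding text itself states; field and function names are ours):

    # Illustrative fragment of the Jets & Sharks toy DB, with fields
    # (band, age, degree, marital status, job), as described in the text.
    db = {
        "Karl":  ("Sharks", "40s", "H.S.", "Married", "Bookie"),
        "Al":    ("Jets",   "30s", "J.H.", "Married", "Burglar"),
        "Earl":  ("Sharks", "40s", "H.S.", "Married", "Burglar"),
        "Clyde": ("Jets",   "40s", "J.H.", "Single",  "Bookie"),
    }

    FIELDS = ("band", "age", "degree", "marital", "job")

    def count(db, **conditions):
        """Number of subjects satisfying all the given trait conditions."""
        hits = 0
        for traits in db.values():
            rec = dict(zip(FIELDS, traits))
            if all(rec[k] == v for k, v in conditions.items()):
                hits += 1
        return hits

    # The kind of ratio discussed above, e.g. Married Burglars per band:
    print(count(db, band="Sharks", job="Burglar", marital="Married"))  # 1 here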

Singling out the characteristic Burglar as fundamental, and Pusher as secondary, within the requested virtual type is an answer that presumes a very sophisticated analysis of our DB. That the SR also considers the characteristics 20s, H.S. and Jets not excluded offers a further occasion for reflection on this behavior, which comes from observing the different types of records selected by the 3 ANN.

• 40s • College: the Records

IAC          CS           AF
PHIL (61)    PHIL (99)    EARL (91)
OL (61)      OL (99)      DON (89)
NED (44)     NED (99)     NED (89)
DON (44)     DON (99)     PHIL (87)
                          OL (87)
                          KARL (87)
                          GENE (79)
                          JOHN (77)
                          LANCE (77)
                          KEN (75)
                          AL (75)
                          RICK (71)

With the usual difference in radicality, the IAC and the CS answer our virtual question alike: they select only Sharks Records.

The SR subsumes the answers of the other 2 ANN: it completes them, selecting Records from both bands (8 Sharks and 4 Jets), and provides a complex and fuzzy setting.

As seen previously, it even singled out, as characteristics typical of this question, properties different from those of the first 2 ANN.

Earl, the most activated candidate, is a Shark: 40s, Married, a Burglar, with the H.S. title. Don and Ned, instead, are 30s, a characteristic the SR did not even activate; this did not prevent it from including these two 30s subjects among the most representative candidates of the virtual type posed by our question.

A scheme (see Fig. 1) may make the SR's choices clearer. The diagram shows that the SR is able to explore the graph of all the possible states of the DB both in depth and in extension. In this specific case the SR provides several possible FRAMES as answers to our virtual question, each endowed with its own plausibility and made up of characteristics selected from the whole universe of the DB properties, not only from those seen as emergent.

[Fig. 1. Graph of the SR's answer to the query 40s • College. The central frame, built around Earl (91), carries Married (100), College (98), Sharks (97), 40s (94) and Burglar (86); the other candidates (Don (89), Ned (89), Phil + Ol (87), Karl (87), Gene (79), Lance + John (77), Al (75), Ken (75), Rick (71)) are linked to it through the properties each substitutes, where x → y means: the property y substitutes the property x.]

4.5.2. Bookie at risk of divorce

There is no subject in our DB that works as a Bookie and is Divorced. In this case too, then, the 3 ANN have tried to answer our virtual question. We remind the reader that there are only 4 subjects with the trait Divorced, 2 in the

Sharks and 2 in the Jets. Furthermore, the profession Bookie, besides being distributed analogously between the 2 bands, is the only profession that none of the 3 ANN singled out as specific to either band. The state of equilibrium was reached after 400 cycles for the IAC and the CS, and after 200 cycles for the SR.

• Divorced • Bookie: the Properties

IAC               CS                AF
*BOOKIE (81)      *BOOKIE (100)     *BOOKIE (99)
*DIVORCED (81)    *DIVORCED (100)   H.S. (99)
30s (62)          30s (99)          *DIVORCED (98)
SHARKS (62)       J.H. (99)         SHARKS (96)
H.S. (58)         SINGLE (99)       40s (82)
                  JETS (99)         20s (22)
                                    JETS (7)
                                    MARRIED (7)
                                    PUSHER (5)
                                    COLLEGE (3)



This time the answers of the 3 ANN are quite diversified. The first reason for this unlikeness may lie in the fact that the trait Divorced is more typical of the Jets than of the Sharks, in spite of its even quantitative distribution. But the typical Divorced Jets also carry the traits Married and Burglar, combinations typical of the Sharks too.

The IAC does not catch this association and points out only the typical properties of the Sharks. The CS feels the presence of this strange association but is not able to resolve it; it therefore offers an apparently confused answer, building a frame that looks like an unconsidered cocktail of traits of the 2 bands: Jets with J.H., but 30s.

The SR makes a choice that leaves the other 2 ANN completely behind: it privileges the Sharks, but assigns to the virtual type it is building the typical traits of the Divorced Jets subjects (20s), while an H.S. title and being 40s, traits not atypical for the Bookies, characterize it. The solution of the Records will clarify these aspects better:

• Divorced • Bookie: the Records

IAC          CS           AF
DAVE (54)    CLYDE (99)   KARL (91)
NEAL (54)    MIKE (99)    DAVE (89)
RICK (54)    DOUG (99)    RICK (86)
NED (13)     IKE (99)     EARL (86)
IKE (13)     RALPH (99)   NEAL (86)
DOUG (1)                  PETE (81)
                          KEN (80)
                          DOUG (78)
                          CLYDE (75)
                          NED (74)
                          IKE (72)
                          NICK (71)
                          GEORGE (50)

The IAC looks mostly among the Sharks, the CS among the Jets, with some interesting overlap.

The SR looks in both bands (9 Sharks and 4 Jets), including among its most activated candidates the union of the IAC and CS Records (except Mike and Ralph).

The diagrams of the answers of the 3 ANN (Figures 2, 3 and 4) allow a closer look at their ways of working.



[Fig. 2. Graph of the IAC. Central frame: Divorced (81), Bookie (81), Sharks (62), 30s (62), H.S. (58); no record matches it exactly (NONE). Candidates, each annotated with the traits on which it deviates: Dave (54): Pusher (-12); Neal (54): Single (-11); Rick (54): Burglar (-12); Ned (13): College (-13), Married (-14); Ike (13): J.H. (-13), Single (-11); Doug (1): Jets (-13), Single (-11).]

[Fig. 3. Graph of the CS. Central frame: Divorced (100), Bookie (100), Jets (99), 30s (99), J.H. (99); no record matches it exactly (NONE). Candidates, each annotated with the traits on which it deviates: Clyde (99): 40s (0), Single (99); Mike (99): Single (99); Doug (99): H.S. (0), Single (99); Ike (99): Single (99), Sharks (0); Ralph (51): Single (99), Pusher (0).]

We remind the reader that the state of rest (REST) for the IAC is -0.1 (that is, -10 in this notation), while the state of maximum inhibition is -0.2 (that is, -20 in this notation).
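
These are the standard IAC activation constants (McClelland and Rumelhart 1988): rest -0.1, minimum -0.2, maximum 1.0, with the tables reporting activations multiplied by 100. For reference, one unit's IAC update reads as follows (the decay value shown is the usual default, an assumption on our part):

    def iac_update(a, net, rest=-0.1, a_min=-0.2, a_max=1.0, decay=0.1):
        """One IAC step for a single unit. net is its summed weighted input
        (only positively active senders contribute). Excitation pushes the
        activation toward a_max, inhibition toward a_min, and decay pulls
        it back toward rest; multiply by 100 for the notation used here."""
        if net > 0:
            delta = (a_max - a) * net - decay * (a - rest)
        else:
            delta = (a - a_min) * net - decay * (a - rest)
        return a + delta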

It is now clearer that the IAC is able to move away from the conceived frame by one trait, in order to locate subjects plausible for the answer, and by 2 traits at most, in order to locate barely acceptable subjects, provided the preferred band of belonging is not cancelled.

The emphasis on the trait Single allows the CS to draw analogies between the "ideal model" it deduced from the virtual question and the real Records present in the DB. This trait also allows the CS to activate an atypical subject such as Ike.



[Fig. 4. Graph of the SR. Central frame: Bookie (99), Divorced (99), Sharks (96), 40s (82), H.S. (99); no record matches it exactly (NONE). Candidates, each annotated with the properties it substitutes: Karl (91): Married (7); Dave (89): 30s (0), Pusher (5); Rick (86): Burglar (0), 30s (0); Earl (86): Burglar (0); Neal (86); Pete (81): Single, 20s (22), Jets; Ken (80): 20s (22), Single (0), Burglar (0); Doug (78): Jets (7); Clyde (75): Single (0), Jets (7), J.H. (0); Ned (74): 30s (0), College (3); Ike (72): Jets (7); Nick (71): Pusher (5); George (50): Jets (7), 20s (22), J.H. (0).]

The structure of its analogies is, nevertheless, analogous to that of the IAC.

We must point out that the most activated subjects are all Bookies. On the contrary, no subject appears that is characterized by the trait Divorced: the CS succeeded in substituting Single for Divorced, and then worked as if the trait Divorced had never existed.

One could argue, then, that the IAC interpreted the virtual question rigidly: it looked for typical Sharks that were either Divorced or Bookies. The CS, on the contrary, escaped the question.

Figure 4 shows the diagram of the SR. Commenting on it fully would be a long and complex task, but we can offer a few synthetic considerations:



(a) the diagram of the SR subsumes many of the selections made by the other 2 ANN;

(b) the ideal frame chosen by the SR seems richer in consequences than the ideal frames of the other 2 ANN;

(c) the activation value of each Record seems determined by the local values that the SR attributed to all the properties of the DB; this means that the SR built several similarity models around the ideal model it selected, without suffering any contradiction;

(d) the activation logic of the Records is not linear. With respect to the ideal model, the subjects Doug and Nick differ in 3 traits:

Doug {Jets*, 30s*, H.S., Single*, Bookie} → 78%
Nick {Sharks*, 30s*, H.S., Single*, Pusher} → 71%

nevertheless, their difference in activation is remarkable. The same holds for Clyde and Pete:

Clyde {Jets*, 40s, J.H.*, Single*, Bookie} → 75%
Pete {Jets*, 20s*, H.S., Single*, Bookie} → 81%

This means that the SR answers according to quanti-qualitative parameters. This concept applies, in fact, whenever the dynamics of a system does not show a marked homeomorphism between quantitative and qualitative deviation.
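
The point can be verified directly from the four records just quoted: all four deviate from the ideal frame {Sharks, 40s, H.S., Divorced, Bookie} in exactly 3 traits, yet their activations span ten points. A small check using only the traits and activations given above:

    ideal = {"band": "Sharks", "age": "40s", "degree": "H.S.",
             "marital": "Divorced", "job": "Bookie"}  # the SR's ideal frame

    records = {  # traits and SR activations exactly as quoted in the text
        "Doug":  ("Jets",   "30s", "H.S.", "Single", "Bookie", 78),
        "Nick":  ("Sharks", "30s", "H.S.", "Single", "Pusher", 71),
        "Clyde": ("Jets",   "40s", "J.H.", "Single", "Bookie", 75),
        "Pete":  ("Jets",   "20s", "H.S.", "Single", "Bookie", 81),
    }

    for name, (*traits, act) in records.items():
        diffs = sum(t != ideal[k] for k, t in zip(ideal, traits))
        print(f"{name}: {diffs} differing traits -> activation {act}%")
    # All four differ in exactly 3 traits, yet activations run from 71% to
    # 81%: quantitative deviation alone does not fix the activation.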

5. Conclusions

We have presented a new type of Artificial Neural Networks: the Self-Reflexive Networks.

We have stated their theoretical presuppositions: their dynamics is analogous to that ascribed to autopoietic systems: self-referentiality, unsupervised learning, and the unintentionally cooperative and contractual activities of their own units.

We have also presumed that between the perceptive dynamics of biological systems and biological forms there is an underground connection, analogous to the one that links the wanting-to-do to the done. In our opinion, the Self-Reflexive Networks make this secret complexity between the abstract and the concrete, between the existing and the being, even more evident.

We have also hypothesized a new concept of perception; a concept which



frames time as the necessary spatial instability of biological forms: under this profile, chaos appears as the shadow of complexity in the 3-dimensional spaces in which we are all squashed.

We have presented the basic equations of the Self-Reflexive Networks and the more specific ones: known concepts, such as the propagation of a signal in a network and the back-propagation method, and new concepts, such as those of the dynamic target, of Re-entry with dedicated and fixed connections, and of the Meta-Units. The last, perhaps, is one of the most complex of this paper.

We have then experimented with a specific type of Self-Reflexive Network, the Monodedicated, in the interpretation of a toy-DB, and we have hinted at other experimentations already carried out, in progress, or planned.

From the applicative work we have presented, a few specific features and novelties of this type of Neural Network emerge:

(a) The capability of answering complex, strange, wrong or imprecise questions through the same algorithms by which the learning phase took place.

(b) The capability of spontaneously transforming its own learning inaccuracy into analogic capability and original self-organization capability.

(c) The capability of spontaneously integrating the models it experienced at different moments into an achronical hyper-model.

(d) The capability of behaving as if it had explored a decision graph of large dimensions, both in depth and in extension, with the consequence of behaving as an Addressing Memory for self-dynamic Contents.

(e) The capability of always learning, rapidly and in any case, whatever the complexity and populousness of the learning models.

(f) The capability of answering simultaneously from different points of view, behaving, in this case, as a network that builds several similarity models for each vector-stimulus it receives.

(g) The capability of adjusting, in a biunivocal way, each question to the DB being consulted and each DB to the questions submitted to it; the consequence of this is the continuous creation of new answering models.

(h) The capability of building, during the learning phase, a weights matrix that provides a sub-conceptual representation of the bidirectional relations between each couple of input variables.

(i) The capability, through the Metaunits, of integrating into a unitary typology nodes with different saturation speeds and, therefore, with different memories: while the SR units are short-memory nodes, since each new stimulus zeros the previous one, the Metaunits memorize the different SR stimuli over time, functioning as a medium-term memory (a toy sketch follows below). This should confirm that medium-term memory is of a different level from immediate memory, and that it is based only upon relations among perceptive stimuli distributed in parallel and in sequence. In this context the weights matrix constitutes the SR's long-term memory, and in this sense it will be opportune to devise a method through which the Metaunits can, over time, influence the weights matrix itself. In any case, in the SR there are service or filter nodes and learning nodes that act as if they were weights (the Metaunits).
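
As a caricature of this three-level memory, here is a toy sketch (our reading, not the paper's Metaunit equations): the SR unit is simply overwritten by each new stimulus, while the Metaunit accumulates a decaying trace of the stimuli it receives, its saturation speed set by a rate parameter.

    class ShortMemoryUnit:
        """SR-style unit: each new stimulus zeros (replaces) the previous one."""
        def __init__(self):
            self.a = 0.0
        def receive(self, stimulus):
            self.a = stimulus              # immediate memory only

    class MetaUnit:
        """Toy Metaunit: integrates successive SR stimuli over time; rate
        plays the role of the saturation speed, yielding a medium-term trace."""
        def __init__(self, rate=0.2):
            self.rate = rate
            self.trace = 0.0
        def receive(self, stimulus):
            self.trace += self.rate * (stimulus - self.trace)

    # Feeding the same alternating sequence to both shows the difference:
    u, m = ShortMemoryUnit(), MetaUnit()
    for s in (1.0, 0.0, 1.0, 0.0):
        u.receive(s)
        m.receive(s)
    print(u.a)      # 0.0   -> only the last stimulus survives
    print(m.trace)  # ~0.26 -> an average of the recent history survives

The weights matrix then plays the long-term role: a method by which the Metaunits slowly modify it, as wished for above, would close the loop between the three memory levels.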

From a more prosaic, applicative point of view, it would be interesting to use this type of ANN to evaluate the effects that bills would have on different parts of society before they become laws.

Many harmful side effects, unforeseeable at first sight, could be discovered before making the whole country pay for them. It would be sufficient to collect the appropriate data in order to carry out experimentations of this kind. The same technique could be used to analyze and discover in advance the strategies of criminal organizations involved in drug distribution, money laundering, arms trafficking, etc.

Anyway, for the moment we prefer to remain within the more strictly scientific and experimental field, since the civic use of these methods does not interest anybody.

Notes

1. On autopoietic systems and on the differences that they present with respect to other systems, see Maturana and Varela 1980.

2. Other critical considerations on Forward Networks may be found in the new chapter of the expanded edition of Minsky and Papert 1969, "Epilogue: The New Connectionism".

3. On this point, see Grassé 1973, Riedl 1978, and Waddington 1970 and 1975.
4. On this level, particular relevance is assumed by Nicolis and Prigogine 1987.
5. See Maturana and Varela 1980 and Varela 1979.
6. See Fraisse and Piaget 1963 and Lindsay and Norman 1977.
7. In this direction move both Maturana and Varela 1980 and Thom 1972.
8. In the same direction, also Maturana and Varela 1980.
9. On this point of view, Minsky 1986 and Rumelhart et al. 1986 also agree.

10. There are partial traces of these ideas in Von Foerster 1981 and Thom 1972-1980.
11. On this point, Von Foerster 1981 is the only one who hints at the relevance of the spatial dimensions for the organization of living structures.
12. These are the activities that Maturana and Varela 1980 define as the Self-Referentiality of Autopoietic Systems.
13. An analogous experiment was carried out by the PDP group in 1986 (Rumelhart et al. 1986, vol. II, chap. 14), using a Constraint Satisfaction ANN they created by modifying the basic algorithm of the Boltzmann machine.


References


Buscema, M. (1993). AFI: Shell to Program Self-Reflexive Networks, Semeion Software n. 7, Rome.

Buscema, M. (1994). Squashing Theory. Reti Neurali per la Previsione dei Sistemi Complessi, Collana Semeion, Armando Editore, Rome (Eng. vers.: Squashing Theory. An Evaluative and Prediction Model for Complex Systems, Semeion, Rome, 1993).

Buscema, M. and Massini, G. (1993). Il Modello MQ. Reti Neurali e Percezione Interpersonale, Collana Semeion, Armando Editore, Rome (Eng. vers.: The MQ Model, Semeion, Rome, 1993).

Buscema, M. et al. (1994). Reti Neurali AutoRiflessive. Teoria, Metodi, Applicazioni e Confronti, Quaderni Semeion n. 1, Armando Editore, Rome.

Carpenter, G.A. and Grossberg, S. (1991). Pattern Recognition by Self-Organizing Neural Networks, The MIT Press, Cambridge, MA.

Didoné, G. (1993). Alcuni aspetti significativi della Rete AutoRiflessiva Monodedicata, in M. Buscema et al. (1994), Reti Neurali AutoRiflessive. Teoria, Metodi, Applicazioni e Confronti, pp. 75-82, Quaderni Semeion n. 1, Armando Editore, Rome.

Foerster, H. von (1981). Observing Systems, Intersystems Publications, Seaside, CA.

Fraisse, P. and Piaget, J. (eds) (1963). Traité de Psychologie Expérimentale: IV. La Perception, Presses Universitaires de France, Paris.

Grassé, P.P. (1973). L'Évolution du Vivant, Albin Michel, Paris.

Hinton, G.E. (1981). Implementing Semantic Networks in Parallel Hardware, in G.E. Hinton and J.A. Anderson (eds), Parallel Models of Associative Memory, Erlbaum, Hillsdale, NJ, pp. 161-188.

Kosko, B. (1992). Neural Networks and Fuzzy Systems, Prentice Hall, New Jersey.

Lindsay, P. and Norman, D. (1977). Human Information Processing, Academic Press, New York.

Maturana, H. and Varela, F. (1980). Autopoiesis and Cognition. The Realization of the Living, D. Reidel Publishing Company, Dordrecht, Holland.

McClelland, J.L. and Rumelhart, D.E. (1988). Explorations in Parallel Distributed Processing, The MIT Press, Cambridge, MA.

Minsky, M. (1986). The Society of Mind, Simon and Schuster, New York.

Minsky, M. and Papert, S.A. (1969). Perceptrons, The MIT Press, Cambridge, MA (Expanded Edition 1988).

Nicolis, G. and Prigogine, I. (1987). Exploring Complexity. An Introduction, R. Piper GmbH and Co. KG, Munich.

Pandin, M. (1993). Data Base e Reti Neurali, in M. Buscema et al. (1994), Reti Neurali AutoRiflessive. Teoria, Metodi, Applicazioni e Confronti, pp. 83-91, Quaderni Semeion n. 1, Armando Editore, Rome.

Riedl, R. (1978). Order in Living Organisms, Wiley Interscience, Chichester-New York.

Rumelhart, D.E., McClelland, J.L. and the PDP Research Group (eds) (1986). Parallel Distributed Processing. Vol. 1: Foundations; Vol. 2: Psychological and Biological Models, The MIT Press, Cambridge, MA.

Schmucker, K.S. (1984). Fuzzy Sets, Natural Language Computations and Risk Analysis, Computer Science Press, USA.

Thom, R. (1972). Stabilité Structurelle et Morphogénèse. Essai d'une Théorie Générale des Modèles, InterÉditions, Paris.

Thom, R. (1980). Modèles Mathématiques de la Morphogénèse, Christian Bourgois, Paris.

Varela, F. (1979). Principles of Biological Autonomy, Elsevier North Holland, New York.

Waddington, C. (ed.) (1970). Towards a Theoretical Biology, 3 volumes, Edinburgh University Press, Edinburgh.

Waddington, C. (ed.) (1975). The Evolution of an Evolutionist, Edinburgh University Press, Edinburgh.

Zadeh, L.A. (1987). Fuzzy Sets and Applications, John Wiley and Sons, New York.