

1998 Special Issue

Coordinate-free sensorimotor processing: computing with population codes

Pietro G. Morasso a,*, Vittorio Sanguineti b, Francesco Frisone a, Luca Perico a

a Department of Informatics, Systems and Telecommunications, University of Genova, Genova, Italy
b Department of Physiology, Northwestern University Medical School, Chicago, USA

Received 13 November 1997; revised 4 May 1998; accepted 4 May 1998

Abstract

The purpose of this study is to outline a computational architecture for the intelligent processing of sensorimotor patterns. The focus is on the nature of the internal representations of the outside world that are necessary for planning and other goal-oriented functions. A model of cortical map dynamics and self-organization is proposed that integrates a number of concepts and methods partly explored in the field. The novelty and the biological plausibility are related to the global architecture, which allows one to deal with sensorimotor patterns in a coordinate-free way, using population codes as distributed internal representations of external variables and the coupled dynamics of cortical maps as a general tool of trajectory formation. The basic computational features of the model are demonstrated in the case of articulatory speech synthesis, and some of the metric properties are evaluated by means of simple simulation studies. © 1998 Elsevier Science Ltd. All rights reserved.

Keywords: Population code; Cortical map; Cortical dynamics; Field computing; Self-organization; Speech; Topology representing network; Hebbian learning

1. Introduction

A fundamental feature of sensorimotor processing in biological or robotic organisms is its ecological nature, i.e. the fact that the relevant dynamics applies to the whole ensemble “organism + environment” and the latter is a full partner, not a mere passive “slave” of the former. There is no doubt that the implications of this concept of circularity have not been explored to their full extent, although the main idea has been around for some time (since the pioneering work of J. Piaget and J.J. Gibson), shifting from one research field to another: cognitive psychology, cognitive neuroscience, robotics, neural networks, artificial life (AL), etc. In AL, for example, attention is focused on very simple organisms which are able to exhibit some form of intelligent behavior without any explicit form of internal intelligence, a situation that has also been described as pre-rational intelligence (Cruse, 1996). Such simple organisms have “simple” sensory and motor organs, and the related sensorimotor processing can be reduced to a rather straightforward (although tunable) analog circuitry, which directly transforms sensory signals into motor commands. The richness of the observed behavior is mainly a consequence of the complexity and, in a sense, creativity of the non-linear, dissipative, non-equilibrium dynamics of the environment; thus, a small amount of adaptability is sufficient for the organism to tailor its behavior to the essential constraints. However, adaptability does not necessarily imply intelligence. With the exception of “hard-wired” tropistic organisms, most existing organisms must be adaptable (in order to survive), but not all of them can be considered “intelligent”, whatever the specific definition we use for such an elusive concept (Fig. 1).

For the scope of the paper, we limit ourselves to the domain of sensorimotor processing, and we argue that in such a context a thing called intelligence is a necessity for organisms which have the burden of managing the complex sensory and motor organs required for complex tasks, such as manipulation and phonation. “Complex organs” of this kind would be useless for an insect and, in general, for an organism that could only rely on reflexive processing modules, however adaptable. What is needed, in general, is the ability to build internal representations of the external world, which allow the organism to anticipate, plan, and imagine sensorimotor patterns (whether real or only plausible) in order to free itself from the “tyranny” of control automatisms. We propose such ability as an operational definition of sensorimotor intelligence, for an organism that can take advantage of complex sensory and motor organs when carrying out complex sensorimotor tasks.

Many possible computational architectures may be conceived that fit the requirement, but it may be considered that the nature of the phylogenetic process favors solutions which evolve in an incremental, although non-linear, way from older ones. In the evolution of the nervous system from a series of ganglia (invertebrates) to a central nervous system differentiated into spinal cord and brain (vertebrates), the emergence of the cerebral cortex is clearly a turning point. In fact, it makes available a new piece of neuronal hardware, which not only is massively integrated with afferent and efferent flows (a strong link with real reality) but, at the same time, has an endogenous dynamics which gives it the power of planning and problem solving on realistic but not necessarily real sensorimotor patterns (i.e. a suitable degree of virtual reality).

Substantial advances in the study of sensorimotor cortical areas have been achieved since the pioneering work in the 1950s and 1960s by Mountcastle, Hubel, Wiesel, Evarts, and others, thus gaining an understanding of the cortex as a continuously adapting system, shaped by competitive and cooperative interactions. However, the greatest part of the effort has been devoted to the investigation of the receptive-field properties of the cortical maps, whereas relatively little attention has been devoted to the role of lateral connections and the cortical dynamic processes that are determined by the patterns of recurrent excitation (Amari, 1977; Kohonen, 1982; Grajski et al., 1990; Reggia et al., 1992; Martinetz et al., 1994; Sirosh et al., 1996; Morasso et al., 1996).

* Requests for reprints should be sent to Dr P.G. Morasso, University of Genova, DIST, via Opera Pia 13, I-16145 Genova, Italy. Tel.: +39-10-3532749; fax: +39-10-3532154; e-mail: [email protected]

0893–6080/98/$19.00 © 1998 Elsevier Science Ltd. All rights reserved. PII: S0893-6080(98)00065-3

Neural Networks 11 (1998) 1417–1428, Pergamon

The paper gives a contribution in this direction by investigating a computational model for the cortical processing of high-dimensional sensorimotor variables. It is based on topologically organized cortical maps, which support the formation of internal representations of the variables in a coordinate-free way by means of population codes. We show that the same mechanism of cortical dynamics, which induces the emergence of the population code from the topology of connections, is also able to carry out task-relevant computations with population codes by exploiting the coupled dynamics of the different cortical areas.

The model is demonstrated in the field of speech motor control, with the very limited goal of showing the feasibility of the computational mechanism and its ability to deal with complex patterns in a general way. Admittedly, this is a qualitative test and the main purpose is to outline a new way of looking at cortical computation, without attempting, for lack of space, a quantitative comparison with alternative models on specific aspects of the theory. The model develops previous concepts on field computing (Morasso et al., 1997a) and speech production (Sanguineti et al., 1997; 1998) and investigates in more detail the metric aspects of the map dynamics.

2. Cortical dynamics and population codes

In our model, a cortical map is characterized by its ability to store internal, distributed representations $C$ of a generic environmental vector $x$. The representation is constructed according to the scheme of Fig. 2:¹

• $x \in X \subset \mathbb{R}^n$ is mapped onto a large set of “filters” $F = \{f_k\}_{1,N}$, each of which is characterized by a particular response function $f_k(x)$. We call $F$ a thalamic representation and we map it onto the cortical map, as an external input, via a set of convergent, unidirectional connections $W_{ki}$.

• $C$ emerges from the dynamic interaction between the thalamo-cortical inputs (via $W_{ki}$, giving rise to the receptive field properties of the cortical units) and the cortico-cortical inputs (via the intra-connections $C_{ij}$, which express the topological structure of the map): $x \Rightarrow F \Rightarrow C \Leftrightarrow C$. $C$ is given by a population code $\{V_i\}_{1,M}$ (a pattern of activity clustered around a winning neuron) that implicitly expresses the posterior probability distribution of $x$ given the available sensorimotor measurements.
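The idea that a pattern of activity clustered around a winning neuron implicitly encodes a value of $x$ can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the Gaussian tuning curves, the regular lattice of preferred values, the parameter `sigma`, and the activity-weighted-mean readout in `decode`. The cortical dynamics that would sharpen the code is deliberately left out.

```python
import numpy as np

def tuning_responses(x, centers, sigma):
    """Gaussian response of every unit to the input vector x."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def decode(activity, centers):
    """Read the encoded value back as the activity-weighted mean of the centers."""
    weights = activity / activity.sum()
    return weights @ centers

# Hypothetical 21 x 21 lattice of preferred values covering [-1, 1]^2.
axis = np.linspace(-1.0, 1.0, 21)
centers = np.array([(a, b) for a in axis for b in axis])

x = np.array([0.3, -0.1])                      # value to be represented
V = tuning_responses(x, centers, sigma=0.2)    # population code {V_i}
x_hat = decode(V, centers)                     # implicit estimate of x
```

The point of the sketch is that no unit stores the coordinates of $x$ explicitly: the value is carried by the shape and location of the activity pattern over the map, and any readout is a convenience for the observer.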

In principle, this scheme can be applied to maps that represent the “distal” space of targets and/or obstacles and to maps related to the “proximal” space of body configurations and/or motor commands. Our theory suggests that the relations among such maps should be bi-directional, in order to carry out a variety of task-related computational operations. For example, in some cases we might wish to predict the distal sensory patterns which are likely to arise from an internally generated motor command pattern; in others, we might be interested in planning the time course of the motor commands required for reaching a distal target while satisfying proximal constraints and avoiding distal obstacles. In both cases, the different interacting maps are required to deal with patterns that are motor and sensory at the same time. In this section we have simply introduced some general ideas about cortical maps and population codes that can be applied to a whole family of possible models. The next section addresses some topics of biological plausibility for this class of models, and then we present our proposal of a multiple map architecture.

Fig. 1. The ecological nature of sensorimotor processing.

Fig. 2. Cortical representation of environmental variables.

¹ For simplicity, Fig. 2 does not include the cross-connections among cortical maps that are an essential part of the theory.

2.1. Biological plausibility of cortical map models

As regards the biological plausibility of cortical map models, the somatotopic or ecotopic layout of many cortical areas has long suggested a kind of topological organization, associated with a dimensionality reduction of the representational space, thus motivating the development of a (large) family of self-organizing network models.

From its beginning, however, the effort has been affected by a number of misconceptions, partly due to the overemphasis on the receptive field properties of cortical neurons. Only recently has a new understanding of the cortex as a dynamical system been emerging, which focuses attention on the competitive and cooperative effects of lateral connections (Sirosh et al., 1996). It has been shown that cortico-cortical organization is not static but changes with ontogenetic development, together with the patterns of thalamo-cortical connections (Katz et al., 1992). In short, it has been suggested that cortical areas can be seen as a massively interconnected set of elementary processing elements, which constitute a computational map (Knudsen et al., 1987). From the modeling point of view, the most common misconceptions about cortical functionality can be reduced to the following three items:

• flatness of cortical maps (related to the locality of lateral connections);

• fixed lateral connections (versus plastic thalamo-cortical connections, which determine receptive-field properties);

• Mexican-hat function of lateral interactions (it implies a significant amount of recurrent inhibition for the formation of localized responses by lateral feedback).

The flatness assumption that characterizes the classic map models (Amari, 1977; Kohonen, 1982) is contradicted by the fact that the structure of lateral connections is not genetically determined but depends mostly on electrical activity during development. More precisely, the connections have been observed to grow exuberantly after birth and reach their full extent within a short period; during the subsequent development, a pruning process takes place, so that the mature cortex is characterized by a well defined pattern of connectivity, which includes a large amount of non-local connections: this rules out all the models limited to a purely 2-D circuitry. Moreover, the superficial connections to non-neighboring columns are organized into characteristic patterns: a collateral of a pyramidal axon typically travels a characteristic lateral distance without giving off terminal branches and then produces tightly packed terminal clusters (possibly repeating the process several times over a total distance of several millimeters). Such a characteristic distance is not a universal cortical parameter and is not distributed in a purely random fashion but is different in different cortical areas (Gilbert et al., 1979; Schwark et al., 1989; Calvin, 1995). Thus, the development of lateral connections depends on the cortical activity caused by the external inflow, in such a way as to capture and represent the (hidden) correlation in the input channels. Each individual lateral connection is “weak” enough to go virtually unnoticed while mapping the receptive fields of cortical neurons, but the total effect on the overall dynamics of cortical maps can be substantial, as is revealed by cross-correlation studies (Singer, 1995). Lateral connections from superficial pyramids tend to be recurrent (and excitatory) because 80% of synapses are with other pyramids and only 20% with inhibitory interneurons, most of them acting within columns (Nicoll et al., 1993). Recurrent excitation is likely to be the underlying mechanism which produces the synchronized firing that has been observed in distant columns.

The existence (and preponderance) of massive recurrent excitation in the cortex is in contrast with what could be expected, at least in primary sensory areas, considering the ubiquitous presence of peristimulus competition (or “Mexican-hat pattern”), which has been observed in many pathways, such as the primary somatosensory cortex, and has been confirmed by direct excitation of cortical areas as well as by correlation studies; in other words, in the cortex there is a significantly larger amount of long-range inhibition than expected from the density of inhibitory synapses. In general, “recurrent competition” has been assumed to be the same as “recurrent inhibition”, providing an antagonistic organization that sharpens responsiveness to an area smaller than would be predicted from the anatomical funneling of inputs. Thus, an intriguing question is how long-range competition can arise without long-range inhibition; a possible solution is the mechanism of gating inhibition based on a competitive distribution of activation, proposed by Reggia et al. (1992) and further investigated by Morasso et al. (1996).

2.2. The proposed model: interacting cortical maps

The following model of cortical dynamics is proposed where, for simplicity, we lump the generic $i$th cortical column into a single processing element, characterized by an activity level $V_i$ and two kinds of inputs ($h_i^{lat}$ and $h_i^{ext}$):

$$\frac{dV_i}{dt} = g(t)\left[-g_i V_i + h_i^{lat} + h_i^{ext}\right] \qquad (1)$$

The equation simply says that $V_i$ evolves under the action of three competing influences:

1. a self-inhibition (weighted by the parameter $g_i > 0$);
2. a net input $h_i^{lat}$ coming from the set of lateral connections inside the same cortical map;
3. a net external input $h_i^{ext}$ coming from thalamo-cortical connections (or cortico-cortical connections from other cortical maps).

The first term is consistent with the already mentioned intra-columnar nature of inhibitory synapses; we can also say that it gives the column the character of a “leaky integrator”. The second term is a recurrent input, intended to express the massive lateral excitatory connections:

$$h_i^{lat} = \sum_j \frac{C_{ij} V_j}{\sum_k V_k}$$

where the sum is extended to the set of columns laterally connected to the given element and the connection weights $C_{ij}$ are positive and symmetric. This term includes an element of gating inhibition, because the activity level of each neuron is normalized (as in Reggia's model) according to the average activity of its immediate neighbors. The symmetry of connections implies that the map, as a kind of continuous Hopfield network, is characterized by point-attractor dynamics. This kind of gating inhibition allows the attractor pattern (i.e. the population code) to be much sharper than the receptive field, as is clearly apparent in the simulations.

The external input, which in the scheme of Fig. 2 is detected by a set of “filters”, defines the receptive field properties of the unit. It is a function of the environmental variable $x$, with a preferred value or receptive field center $w_i$. In the simulations, we simply used a broad Gaussian $G_i(x)$,² whose covariance matrix identifies the receptive field size and shape. For the dynamics of the cortical map, however, it is essential to add a shunting interaction term (an idea borrowed from Grossberg, 1973):

$$h_i^{ext} = G_i(x)\,V_i$$

This contributes, together with gating inhibition, to the following type of transient behavior: the sudden shift of the input variable $x$ (say, the selection of a new target) first induces a diffusion process (which initially flattens the population code, spreading the activity pattern over a large part of the network) and then a re-sharpening process around the target (which builds up faster and faster as the diffused wave-form reaches the target area). The combination of the two processes is the propagation of the population code toward the new target, following a geodesic in the characteristic manifold of the map, and this kind of behavior is the basic component of the envisaged coordinate-free computational mechanism operating on internal representations of the external world.³

The transient behavior is smooth due to the combined action of diffusion and re-sharpening, even with a time-invariant gain $g$. In this case, however, there is no control over the timing, which can change unpredictably because of the unavoidable fluctuations of the equation parameters. This undesirable effect can be counteracted, without changing the nature and simplicity of the model, by using a suitable time-varying gain $g(t)$,⁴ a concept which has been explored in a variety of contexts: “GO-signal” (Bullock et al., 1989), “terminal attractor” pace-maker (Barhen et al., 1989), “y-model” (Morasso et al., 1993; Morasso et al., 1997b). In this way, the timing behavior is more robust with respect to the fluctuations in the equation parameters, which allows the synchronization of concurrent cortical dynamic processes. A biologically plausible implementation of this concept is related to the basal-thalamo-cortical loop and the well established role of the basal ganglia in the initiation and speed control of voluntary movements.
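The robustness of the timing can be checked on a toy problem. The sketch below applies the gain $g(t) = (dy/dt)/(1 - y)$ to a scalar relaxation $dz/dt = g(t)\,r\,(z^{*} - z)$ rather than to the full map. The scalar system, the intrinsic rate `rate`, the minimum-jerk polynomial chosen for the sigmoid $y(t)$ and the names `y_sigmoid`, `y_rate`, `gain` and `relax` are all assumptions for illustration; the paper only states that $y(t)$ is a sigmoid going from 0 to 1 in $T$ s.

```python
import math

def y_sigmoid(t, T):
    """Smooth ramp from 0 to 1 over [0, T]; the minimum-jerk polynomial is an assumed choice."""
    s = min(max(t / T, 0.0), 1.0)
    return 10 * s**3 - 15 * s**4 + 6 * s**5

def y_rate(t, T):
    """Time derivative dy/dt of the ramp above."""
    s = min(max(t / T, 0.0), 1.0)
    return 30 * s**2 * (1.0 - s) ** 2 / T

def gain(t, T, eps=1e-12):
    """g(t) = (dy/dt) / (1 - y); eps only guards the division as y approaches 1."""
    return y_rate(t, T) / (1.0 - y_sigmoid(t, T) + eps)

def relax(z0, target, rate, T, steps=2000):
    """Integrate dz/dt = g(t) * rate * (target - z) over [0, T].

    Each step uses the exact solution for a gain held constant over the step,
    which stays stable even when g(t) becomes very large near t = T.
    """
    dt = T / steps
    z = z0
    for k in range(steps):
        z += (target - z) * (1.0 - math.exp(-dt * gain(k * dt, T) * rate))
    return z

# The same duration T is enforced for very different intrinsic rates.
finals = [relax(0.0, 1.0, rate, T=1.0) for rate in (0.5, 1.0, 2.0)]
```

For this scalar system the error obeys $e(t) = e(0)\,(1 - y(t))^{r}$, so it vanishes at $t = T$ for any positive rate $r$: the duration is set by the gain signal, not by the parameters of the relaxation.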

Fig. 3 shows a simulation of a cortical map that illustrates the described computational mechanism. The input environmental variable $x$ is two-dimensional, varying in a circular domain. The map neurons ($N = 128$) are characterized as follows: (i) the receptive field centers are set according to a regular tessellation of the input domain; (ii) the receptive fields are radially symmetric and their sizes are large (comparable to the size of the input domain); and (iii) the lateral connections are consistent with the Voronoi tessellation of the input domain. The map is initially at rest in the point $x = (0.2, -0.2)$; the transient is initiated by switching on the pacemaker $y(t)$, and during the simulation the external input $x$ remains fixed at its final value $(-0.2, 0.2)$.
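A rough numerical sketch of Eq. (1) with the gated lateral term and the shunting external term is given below. It is not the simulation of Fig. 3: a square 11 × 11 lattice with 4-neighbour connections stands in for the Voronoi-based connectivity, the gain is held constant, and the parameter values are untuned guesses. The range of the normalizing sum over $k$ is also an assumption (unit $j$ plus its lattice neighbours), chosen because it keeps every term of the lateral input bounded. The names `make_lattice`, `receptive_fields` and `euler_step` are hypothetical.

```python
import numpy as np

def make_lattice(n):
    """n x n lattice of receptive-field centres with 4-neighbour lateral connections."""
    axis = np.linspace(-0.5, 0.5, n)
    centers = np.array([(a, b) for a in axis for b in axis])
    idx = np.arange(n * n).reshape(n, n)
    C = np.zeros((n * n, n * n))
    C[idx[:, :-1].ravel(), idx[:, 1:].ravel()] = 1.0   # links along one lattice axis
    C[idx[:-1, :].ravel(), idx[1:, :].ravel()] = 1.0   # links along the other axis
    return centers, C + C.T                            # symmetric weights C_ij

def receptive_fields(x, centers, sigma):
    """Broad Gaussian G_i(x) centred on each preferred value w_i."""
    return np.exp(-np.sum((centers - x) ** 2, axis=1) / (2.0 * sigma ** 2))

def euler_step(V, C, G, g_i, gain, dt):
    """One Euler step of dV_i/dt = gain * (-g_i V_i + h_lat_i + h_ext_i)."""
    neighbourhood = V + (C > 0).astype(float) @ V      # assumed range of the sum over k
    h_lat = C @ (V / neighbourhood)                    # gated lateral excitation
    h_ext = G * V                                      # shunting external input
    return V + dt * gain * (-g_i * V + h_lat + h_ext)

centers, C = make_lattice(11)                          # 121 units
start, target = np.array([0.2, -0.2]), np.array([-0.2, 0.2])
V = receptive_fields(start, centers, sigma=0.1) + 1e-3 # initial code near the start point
G = receptive_fields(target, centers, sigma=0.5)       # input switched to the target
for _ in range(5000):
    V = euler_step(V, C, G, g_i=1.5, gain=1.0, dt=0.01)
```

With these choices the activity stays positive and bounded, which makes the sketch safe to experiment with. Whether the peak actually propagates along the map, as described in the text, depends on the connectivity and on parameters that would have to be tuned against the original simulations.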

A few words of justification are needed on the choice of Reggia's model as a reference for our attempt to model some dynamic aspects of the sensorimotor cortex. Although no solid experimental evidence of competitive distribution of activation is available, the weaker and less committing feature of gating inhibition that we use in our model, with the specific goal of inducing the smooth propagation of population codes, is quite attractive from many points of view: (i) it is consistent with the widespread presence of recurrent excitation;⁵ (ii) it gives the right emphasis to the role of lateral connections; (iii) it is supported by converging lines of evidence on the continuous variation of the equilibrium trajectories that underlie coordinated movements; and (iv) it is not contradicted by specific experimental data.

² With respect to the diagram of Fig. 2, $G_i(x)$ must be considered as an approximation of the convergent thalamo-cortical input $\sum_k f_k(x)$.

³ The propagation of the population code cannot be obtained with Reggia's original model. In that model the map equation (to be compared with Eq. (1)) is as follows: $dV_i/dt = -g_i V_i + (M - V_i)[h_i^{lat} + h_i^{ext}]$, with $h_i^{lat} = \sum_j c_p V_j (V_i + q) / \sum_k (V_k + q)$, where $c_p$ and $q$ are map-wide constants. Instead of a propagation, the model yields a combination of a waning peak (in the old location) and a growing peak (in the new location). In our model, the gating and shunting terms are both necessary in order to obtain the propagation effect.

⁴ The following time-varying gain has been used: $g = g(t) = (dy/dt)/(1 - y)$, where $y(t)$ is a sigmoid (0 → 1 in $T$ s). See Morasso et al. (1997b) for details.

⁵ The basic dynamic behavior of the network is preserved, even if we include some degree of recurrent inhibition.

Fig. 3. Transient behavior of a cortical map model.

2.3. Training

The learning procedure of the cortical model is an extension of the technique proposed by Martinetz et al. (1994) for the TRN model. If we apply such a technique to the network illustrated by the previous simulation, with a training set uniformly distributed in the domain, we obtain a distribution of receptive fields and a pattern of lateral connections quite similar to the one adopted in the simulation. In the TRN model, however, the lateral connections do not influence the network dynamics and there are different learning rules for the thalamo-cortical and the cortico-cortical connections: the former rule is Hebbian, but the latter uses an explicit ordering procedure which is biologically implausible and contradicts the basic requirement of parallelism and locality. In the proposed extension (Frisone et al., 1997), the same Hebbian rule is used for adapting both the thalamo-cortical $W_{ij}$ and the cortico-cortical $C_{kl}$ connection weights: in the former case the weights are increased proportionally to the correlation between the thalamic input and the cortical activation ($X_i V_j$) and, in the latter, to the correlation between the corresponding cortical activities ($V_k V_l$). The ordering procedure, which is explicit in the TRN rule, is implicitly performed in our model by the cortical dynamics itself: it attributes to the cortical units, at steady state, an activation level $V_i$ that is higher the closer the receptive field center $w_i$ is to the input pattern $x$. Thus, learning should only be applied at steady state, after the application of a new input pattern. In fact, simulations have shown that this kind of learning strategy yields a tessellation quite similar to that of the TRN model. As learning proceeds, both groups of connection weights are modulated, thus shifting the receptive field centers and, at the same time, pruning the lateral connection weights which fall under a given threshold.
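The Hebbian update described in the training section can be sketched as a single learning step applied once the map has settled. The learning rate `eta`, the passive `decay` of the lateral weights (added here so that unused connections can fall below the pruning threshold) and the threshold `prune_below` are assumptions made for illustration; the text specifies only the correlation terms $X_i V_j$ and $V_k V_l$ and the existence of a pruning threshold.

```python
import numpy as np

def hebbian_step(W, C, X, V, eta=0.05, decay=0.01, prune_below=1e-3):
    """One learning step, meant to be applied with V at steady state.

    W[i, j] : thalamo-cortical weight from input channel i to unit j
    C[k, l] : lateral (cortico-cortical) weight between units k and l
    """
    W = W + eta * np.outer(X, V)                     # grows with the correlation X_i V_j
    C = (1.0 - decay) * C + eta * np.outer(V, V)     # grows with the correlation V_k V_l
    np.fill_diagonal(C, 0.0)                         # no self-connections
    C[C < prune_below] = 0.0                         # prune weights under the threshold
    return W, C

X = np.array([1.0, 0.0])                             # toy thalamic input
V = np.array([1.0, 0.5, 0.0])                        # toy steady-state population code
W, C = hebbian_step(np.zeros((2, 3)), np.zeros((3, 3)), X, V)
```

Because the same outer-product rule drives both groups of weights, units that are co-active at steady state strengthen their mutual connection, while pairs that are never co-active stay disconnected.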

3. Cortical control of speech movements

In this section, we show how the cortical model can be applied to sensorimotor control problems by means of the dynamic interaction between cortical maps. As an example, we consider the case of speech motor control.⁶ The minimal computational architecture that has been implemented consists of (i) an articulatory map, representing the space $X$ of articulatory gestures, i.e. the geometric configurations that the vocal tract can possibly assume during speech movements, and (ii) an acoustic map, storing the acoustic consequences in a formant space $Y$. In particular, the environmental vectors $x$ and $y$ were defined as follows:

• $x$ is 10-dimensional and is based on a geometric characterization of the vocal tract due to Badin et al. (1995): $x$ = [LH (Lip Height), LP (Lip Protrusion), JH (Jaw Height), TB (Tongue Body), TD (Tongue Dorsum), TT (Tongue Tip), TA (Tongue Advance), LY (Larynx), VH (Velum Height), LV (Lips Vertical)]$^T$;

• $y$ is five-dimensional and stores the first five formants of the vocal tract: $y = [F_1, F_2, F_3, F_4, F_5]^T$.

An important comment is that this choice of parameterization for the acoustic and motor spaces does not imply that the cortical maps are directly encoded in this way. We chose those parameters for convenience and compatibility with the available training set. However, any other kind of parameterization would be acceptable if it is powerful enough to fit the data. One of the strong points of the cortical map model is that it gives a coordinate-free representation of the environmental variables and so is rather insensitive to the parameterization of input and output.

The two maps (which contain 1000 and 500 neurons, respectively) were trained by means of a data set in which the acoustic output of a male French speaker,⁷ pronouncing VVV and VCV sequences, was synchronized with a cineradiographic acquisition, yielding about 5000 digitized X-ray images of the sagittal view of the vocal tract (at a sampling frequency of 50 Hz) (Badin et al., 1995). From these data, $x$ and $y$ vectors were extracted⁸ and then used in a training procedure which adopted the TRN strategy (for learning the receptive field centers and the patterns of intra-connections), extended so as to learn an analogous set of cross-connections between the two maps.⁹ The cross-connections implicitly code the functional relationship between the two manifolds and also allow one to map the population code of one map as external input for the other: this induces coupled acoustic-articulatory dynamics that is a general-purpose tool for solving a number of sensorimotor problems in a simple and unified framework. Unfortunately there is no space for exploring all the implications of the model in this context. We simply list a few of them as an illustration of the generality of the approach.
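The extension of the training set mentioned in footnote 8 (random sampling of Gaussians centred on the learned prototype vectors until the set reaches 10 times the number of neurons) can be sketched as follows. The isotropic covariance `sigma`, the equal mixture weights and the random stand-in prototypes are assumptions made for illustration; the footnote does not give these details.

```python
import numpy as np

def augment_training_set(prototypes, sigma, factor=10, seed=0):
    """Draw factor * n_units samples from an equal-weight Gaussian mixture
    centred on the prototype vectors (isotropic covariance, an assumed choice)."""
    rng = np.random.default_rng(seed)
    n_units, dim = prototypes.shape
    picks = rng.integers(0, n_units, size=factor * n_units)    # component of each sample
    noise = sigma * rng.standard_normal((factor * n_units, dim))
    return prototypes[picks] + noise

# Random stand-in for the 500 learned centres of a five-dimensional acoustic map.
prototypes = np.random.default_rng(1).uniform(0.0, 1.0, size=(500, 5))
samples = augment_training_set(prototypes, sigma=0.05)
```

A larger sample drawn this way mainly benefits the estimate of the lateral connectivity, which needs many co-activations of neighbouring units to become reliable.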

Fig. 4 shows the histograms of lateral connections in the two maps after learning; they are consistent, for the $X$ manifold, with an intrinsic dimensionality of about 3–4 and, for the $Y$ manifold, with a dimensionality of about 4–5.¹⁰ This is consistent with estimates of dimensionality performed with standard statistical methods. Moreover, if we project the acoustic map onto the plane of the first two formants, we recover the classical triangle of vowels (Fig. 5 (top)), which in fact is “bent” if we also consider the third formant (Fig. 5 (bottom)).
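The link between the number of lateral connections and the dimensionality of the manifold can be turned into a rough estimator. The sketch assumes the standard kissing numbers 2, 6, 12 and 24 for dimensions 1 to 4 and the best known value, 40, for dimension 5, and it interpolates linearly between them. The interpolation and the function name `estimate_dimension` are illustrative choices, not the procedure used for Fig. 4.

```python
import numpy as np

# Kissing numbers for dimensions 1..5 (40 is the best known value in dimension 5).
KISSING = {1: 2, 2: 6, 3: 12, 4: 24, 5: 40}

def estimate_dimension(C):
    """Estimate the intrinsic dimension from the mean number of lateral connections."""
    degree = (np.asarray(C) > 0).sum(axis=1)
    dims = sorted(KISSING)
    ks = [KISSING[d] for d in dims]
    return float(np.interp(degree.mean(), ks, dims))

def circulant_connections(n, reach):
    """Toy connection matrix: every unit is linked to `reach` units on each side of a ring."""
    C = np.zeros((n, n))
    for k in range(1, reach + 1):
        C += np.roll(np.eye(n), k, axis=1) + np.roll(np.eye(n), -k, axis=1)
    return C

ring = circulant_connections(10, 1)      # 2 connections per unit
dense = circulant_connections(40, 9)     # 18 connections per unit
dim_ring = estimate_dimension(ring)
dim_dense = estimate_dimension(dense)
```

A mean of 18 connections, halfway between 12 and 24, maps to a value between 3 and 4, which is how a histogram of connection counts can be read as "a dimensionality of about 3–4".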

An important issue in motor control is redundancy, and the maps can be used as a tool for visualizing it. The speech articulatory system (which includes tongue, jaw, lips and larynx) has, mechanically speaking, an infinite number of degrees of freedom. However, in functional terms the real number of

6 This is only an exercise, although a complex one. There is no space for a detailed comparison with alternative models and we do not claim that it is better in any sense. The point we wish to make is that it is possible to handle high-dimensional sensorimotor patterns by means of a totally distributed architecture that only relies on population codes.

7 The data were made available in the framework of the Esprit Project SPEECH-MAPS, coordinated by ICP-INPG in Grenoble.

8 The TRN algorithm requires a training set of suitable size, particularly for yielding a sufficient approximation of the lateral connectivity. To overcome the limitation of the available data set we exploited the fact that a Gaussian mixture centered on the learned prototype vectors optimally approximates the underlying probability density function of the data set, and we randomly sampled the Gaussians in order to extend the training set to a minimum size (10 times the number of neurons).

9 The connection weight $C_{ij}$ between neuron i in one map and neuron j in the other was modified proportionally to the product $V_i V_j$, with a decay term proportional to $C_{ij}$.

10 In a regular tessellation, according to the theory of dense sphere packing, the kissing number K is a precise index of the dimensionality of the manifold (e.g. K = 2, 6, 12, 24, 40 for a dimensionality equal to 1, 2, 3, 4, 5, respectively). In a quasi-regular tessellation, as achieved by the TRN algorithm, the same information is approximated by the number of lateral connections.

1422 P.G. Morasso et al. / Neural Networks 11 (1998) 1417–1428


degrees of freedom or functional articulators is only 4–5 but is greater than the estimated dimensionality of the acoustic manifold. This means that each target phoneme can be produced by a variety of articulations, which define the no-motion-manifold of the phoneme.11 The maps provide a good way of exploring the morphology of such manifolds. For example, if we consider the /u/ phoneme, which is identified by a point in the Y space, the corresponding no-motion-manifold in X can be evaluated by projecting that point onto the articulatory map, via the cross-connections. Inspecting the projections on the different articulatory planes, such as the JH–LH plane (Fig. 6), shows that such a manifold is basically two-dimensional.
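The projection of a phoneme onto the articulatory map can be sketched as a simple look-up through the cross-connection matrix. In the sketch below, the nearest-neuron selection, the threshold on the connection strength, the function name and all the toy numbers are hypothetical; the study obtains the projection through the cross-connections but does not specify a thresholding rule.

```python
import numpy as np

def no_motion_manifold(y_target, acoustic_centers, articulatory_centers, C, threshold=0.5):
    """Articulatory prototypes cross-connected to the acoustic neuron nearest to y_target.

    C[i, k] is the cross-connection between articulatory neuron i and acoustic neuron k;
    threshold is a hypothetical cut-off on the connection strength.
    """
    nearest = np.argmin(np.linalg.norm(acoustic_centers - y_target, axis=1))
    members = np.flatnonzero(C[:, nearest] > threshold)
    return members, articulatory_centers[members]

# Toy data (hypothetical): 3 acoustic neurons, 5 articulatory neurons.
acoustic_centers = np.array([[300.0, 800.0], [500.0, 1500.0], [700.0, 1200.0]])
articulatory_centers = np.array(
    [[0.0, 0.0], [0.1, 0.4], [0.2, 0.8], [0.9, 0.1], [1.0, 0.5]]
)
C = np.zeros((5, 3))
C[[0, 1, 2], 0] = 1.0  # three different articulations yield the first acoustic prototype
C[3, 1] = 1.0
C[4, 2] = 1.0

members, manifold = no_motion_manifold(
    np.array([310.0, 790.0]), acoustic_centers, articulatory_centers, C
)
print(members)
print(manifold)
```

Plotting two columns of `manifold` against each other would give the kind of planar projection shown for /u/ in the JH–LH plane.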

In the same way, it is possible to describe the effect of articulatory constraints, such as the bite-block, which can be defined by a constant value of the articulatory variable JH (or a constant constraining a combination of several articulatory variables). The functional reduction of the acoustic space that is a consequence of such a constraint can be evaluated by identifying the neurons, in the articulatory map, which approximately fit the constraint, and isolating, in the acoustic map, the neurons that are cross-connected with them. We get a direct picture (Fig. 7) of the reduced manifold of phonemes that can be uttered in such a constrained condition, to be compared with the larger manifold of Fig. 5.
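The two-step selection described above (articulatory neurons that approximately fit the constraint, then the acoustic neurons cross-connected with them) might look as follows. The tolerance `tol`, the connection threshold, the function name and the toy data are assumptions for illustration only.

```python
import numpy as np

def bite_block_acoustic_subset(articulatory_centers, C, jh_column, jh_value,
                               tol=0.05, threshold=0.5):
    """Indices of the articulatory neurons fitting JH ~= jh_value and of the
    acoustic neurons cross-connected with at least one of them.

    C[i, k] links articulatory neuron i to acoustic neuron k.
    tol and threshold are hypothetical tolerances.
    """
    fits = np.abs(articulatory_centers[:, jh_column] - jh_value) <= tol
    linked = (C[fits] > threshold).any(axis=0)
    return np.flatnonzero(fits), np.flatnonzero(linked)

# Toy data (hypothetical): column 0 plays the role of the JH variable.
articulatory_centers = np.array([[0.0, 0.2], [0.02, 0.7], [0.5, 0.1], [0.9, 0.6]])
C = np.zeros((4, 3))
C[0, 0] = C[1, 1] = C[2, 1] = C[3, 2] = 1.0

fits, reachable = bite_block_acoustic_subset(
    articulatory_centers, C, jh_column=0, jh_value=0.0
)
print(fits)       # articulatory neurons compatible with the constraint
print(reachable)  # acoustic neurons that remain reachable
```

In this toy case the third acoustic neuron is lost, because its only articulatory partner violates the constraint; this is the reduction of the acoustic manifold that the bite-block experiment visualizes.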

Finally, the computational power of the dual-map model is demonstrated by testing its ability to generate coordinated acoustic-articulatory patterns in VV transitions. Fig. 8 illustrates the case of an /ae/ transition, for which we had experimental data. The initial conditions in the two maps were chosen by centering the two population codes according to the available data vectors and allowing the overall system to stabilize. The phoneme /e/ was then given as new external input at t = 0. It was applied to the neuron in the acoustic

Fig. 4. Histogram of intra-connections for the acoustic map (left) and articulatory map (right).

Fig. 5. Acoustic map (500 neurons) projected onto the F1–F2 and F2–F3 planes (receptive field centers are visualized as points and lateral connections as segments).

11 Articulatory movements inside the no-motion-manifold of a given phoneme are called non-audible gestures because they do not affect the acoustic characteristics of the phoneme.


map that was the closest to /e/, with an amplitude modulated in time as shown in the lower-right graph of Fig. 8. (At the same time, the amplitude of the initial /a/ input was decreased to 0 in a symmetric way, while keeping constant the equation gain g.) The two maps started co-evolving in time, as dictated by Eq. (1), under the driving influence of the following cross-coupling term:

$$\left\{ h_i^{\mathrm{ext}} \right\}_{\text{articulatory map}} = \left\{ \sum_k C_{ik} V_k \right\}_{\text{acoustic map}}$$

where the C-coefficients identify the cross-connections between the two maps. The population code in the acoustic map was attracted by the target phoneme /e/, producing a moving wave of activation that was similar to the one illustrated in Fig. 3, with the difference that it was five-dimensional instead of two-dimensional. At the same time, the population code in the articulatory map was attracted by a moving target, identified by the cross-coupling term above. At the end of the transient, the articulatory map settled in a configuration that implicitly selected, in the no-motion-manifold of /e/, the configuration closest to the initial one. In other words, an effect of the cross-coupling is to establish a correspondence between phonemes and no-motion manifolds, and the map dynamics is then a navigation tool that carries out the inverse acoustic-articulatory mapping, without any explicit regularization or optimization procedure. Fig. 8 shows the articulatory-acoustic transitions generated by integrating the equations: they are consistent, from the qualitative and quantitative points of view, with the experimental data mentioned above.
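The cross-coupling term amounts to a matrix-vector product between the cross-connection weights and the activity of the acoustic map. The sketch below computes it for a toy population code; the random weights, the bump-shaped activity and the function name are hypothetical, and the map dynamics of Eq. (1) is not reproduced here.

```python
import numpy as np

def cross_coupling_input(C, V_acoustic):
    """External input to each articulatory neuron: h_ext[i] = sum_k C[i, k] * V_acoustic[k]."""
    return C @ V_acoustic

rng = np.random.default_rng(1)
n_art, n_ac = 5, 8
C = rng.uniform(0.0, 1.0, size=(n_art, n_ac))  # hypothetical learned cross-connections

# Hypothetical population code: a normalized bump of activity centered on acoustic neuron 3.
V = np.exp(-0.5 * (np.arange(n_ac) - 3.0) ** 2)
V /= V.sum()

h_ext = cross_coupling_input(C, V)
print(h_ext)
```

As the bump of activity travels across the acoustic map during a transition, `h_ext` changes with it, which is what makes the target seen by the articulatory map a moving one.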

In principle, the same model could be applied to generate VCV transitions, but then we would need to expand it by including an additional map that is appropriate to represent consonant-targets in addition to vowel-targets. It is indeed well known that consonants are poorly represented in acoustic terms (e.g. in terms of formants) whereas they can be precisely identified by specifying the location of constrictions in the vocal tract. Therefore, a straightforward extension of the dual-map model to the VCV paradigm requires training a constriction map, together with the acoustic and articulatory maps described above. A VCV transition would be generated by activating the sequence of targets in the two maps (acoustic and constriction, respectively) and allowing the combined dynamics to carry out the overall computation. In fact, the available data set allowed us to estimate the constrictions but unfortunately the size of the set (barely enough for the vowels) was quite insufficient for the consonants.

4. Metric properties

In the previous section, the computational power of the proposed cortical map model was demonstrated by showing the qualitative properties of the mechanism in dealing with complex sensorimotor patterns in a coordinate-free way, i.e. operating directly with population codes. At no point of the sensorimotor processing architecture is there a need for the organism to perform an explicit evaluation of coordinate values. In this section we investigate the basic metric properties and the robustness with respect to different kinds of disturbances.

4.1. Static and dynamic precision

First, we wish to estimate the degree of accuracy to which it is possible to recover a given environmental variable from

Fig. 6. Map of non-audible gestures corresponding to /u/ displayed in the plane JH–LH.


its population code in the best possible conditions. For this purpose, we considered a perfect topological map in which the receptive field centers were pre-set according to a regular tessellation of the input domain and the lateral connections were perfectly consistent with the Delaunay triangulation. A simple least-squares read-out mechanism x = f(V) was then used. The input domain was sampled, using the samples as input stimuli and allowing the map to reach the corresponding equilibrium states. These were the data used for finding the best linear read-out operator at steady state (the Moore–Penrose pseudo-inverse operator). During the transients from an initial state to a target, illustrated for example by the graphs in Fig. 3, we used the same operator for estimating the trajectory implicitly determined by the dynamic-map equations and evaluating how well these trajectories approximate the corresponding geodesic curves or ideal trajectories (straight lines in the simulations).
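The least-squares read-out can be sketched with a pseudo-inverse, as follows. Note the assumptions: the equilibrium states of the map are replaced here by a static, normalized Gaussian tuning on a square grid (a hypothetical stand-in, since the map dynamics is not reproduced), and the grid size, `sigma`, the number of samples and the `rcond` cut-off are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the map: Gaussian tuning on a regular 9 x 9 grid of centers.
g = np.linspace(-1.0, 1.0, 9)
centers = np.array([(a, b) for a in g for b in g])  # shape (81, 2)
sigma = 0.2                                         # hypothetical tuning width

def population_code(x):
    """Normalized Gaussian activity; a placeholder for the steady state of the map."""
    v = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2.0 * sigma ** 2))
    return v / v.sum()

# Sample the input domain (unit disc) and collect the corresponding population codes.
samples = rng.uniform(-1.0, 1.0, size=(600, 2))
samples = samples[np.linalg.norm(samples, axis=1) <= 1.0]
V = np.array([population_code(x) for x in samples])  # shape (m, 81)

# Best linear read-out in the least-squares sense: x ~= V @ W with W = pinv(V) @ X.
# rcond is a numerical safeguard against very small singular values (an assumption).
W = np.linalg.pinv(V, rcond=1e-6) @ samples

x_test = np.array([0.2, -0.1])
x_hat = population_code(x_test) @ W
print(x_hat, np.linalg.norm(x_hat - x_test))
```

The same matrix `W` can then be applied to the activity recorded at each time step of a transient, which is how a read-out fitted at steady state serves as a probe of the dynamic trajectory.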

A 2-D, circular input domain, with a radius of 1, was used in the simulations. Fig. 9(A) shows the simulation of a four-sided trajectory near the center of the domain. The map was allowed to settle in each vertex before turning on the next one. The top-left panel of the figure displays four input stimuli (small circles), the estimated trajectory (dotted lines), and the corresponding ideal trajectories (full lines); the bottom-left panel shows the evolution of the errors12 along the curvilinear coordinates, respectively for the four movements; the top-right panel shows the neurons of the map (dots) and the succession of neurons with the highest activity during the transients (stars; circles + stars identify the four corners); in the bottom-right panel the same information is displayed together with the underlying pattern of lateral connections. The figure shows that, as expected, the dynamic accuracy is significantly lower than the static accuracy but remains acceptable. Fig. 9(B) and (C) document the graceful degradation of performance when the mechanism operates near the border of the domain or even beyond (carrying out an extrapolation function). Moreover, in spite of the increasing metric error, the computational mechanism remarkably preserves the topological structure of the reconstructed pattern.
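The error measure of footnote 12 (the magnitude of the difference between points on the ideal straight line and the read-out estimates) can be sketched as below. How estimated and ideal points are paired is not stated in the text; this sketch assumes an orthogonal projection of each estimate onto the ideal segment, and the toy trajectory is invented for illustration.

```python
import numpy as np

def trajectory_error(estimated, start, end):
    """Curvilinear coordinate and error magnitude for each estimated point.

    Each estimate is paired with its orthogonal projection onto the segment
    start -> end (an assumed pairing rule).
    """
    direction = end - start
    s = np.clip((estimated - start) @ direction / (direction @ direction), 0.0, 1.0)
    ideal = start + s[:, None] * direction
    return s, np.linalg.norm(estimated - ideal, axis=1)

start, end = np.array([0.0, 0.0]), np.array([1.0, 0.0])
s_ideal = np.linspace(0.0, 1.0, 11)

# Toy estimated trajectory: bows away from the straight line by at most 0.05.
estimated = np.column_stack((s_ideal, 0.05 * np.sin(np.pi * s_ideal)))

s, err = trajectory_error(estimated, start, end)
print(err.max())
```

Plotting `err` against `s` gives the kind of error-versus-curvilinear-coordinate curve described for the bottom-left panel of Fig. 9.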

4.2. Sensitivity to noise in the thalamo-cortical connections

Such noise affects the position of the receptive field center in the input domain. We added a noise uniformly distributed in a circle of radius 0.1 (10% of the domain size) and the simulation result is shown in Fig. 10. The dynamic errors are larger than in the previous case but the static ones are comparable (they are exaggerated by the fact that we used, for simplicity, the same read-out matrix computed for the perfect map). The topological robustness is also confirmed in this case.
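A perturbation of this kind can be generated by sampling uniformly inside a disc and adding the samples to the receptive field centers. The sketch below uses the square-root transform of the radius, which is the standard way to obtain a uniform density over the disc area; the grid of centers and the function name are hypothetical.

```python
import numpy as np

def jitter_centers(centers, radius, rng):
    """Add to each 2-D center a displacement uniformly distributed in a disc."""
    n = len(centers)
    r = radius * np.sqrt(rng.uniform(size=n))        # sqrt gives uniform density per unit area
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return centers + np.column_stack((r * np.cos(theta), r * np.sin(theta)))

rng = np.random.default_rng(0)
g = np.linspace(-1.0, 1.0, 9)
centers = np.array([(a, b) for a in g for b in g])   # hypothetical regular map

noisy_centers = jitter_centers(centers, radius=0.1, rng=rng)
displacement = np.linalg.norm(noisy_centers - centers, axis=1)
print(displacement.max())
```

Sampling the radius without the square root would concentrate the displacements near the original centers and understate the effect of the noise.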

4.3. Sensitivity to noise in the lateral connections

In the previous simulations, the lateral connections were all topologically correct, symmetric and equal. In one set of simulations, we estimated the influence of additive noise on these connections, while keeping the topological correctness. We found that a 10–20% noise level (on top of the nominal value = 1) hardly had any effect on the performance of the network. The effect is more significant, as one might expect, if such “noise” affects the topological structure of the network. In Fig. 11 we show two examples: one can be labeled “random pruning” (top panel: 25% of the connections were randomly destroyed) and the other “random sprouting” (bottom panel: 10% long connections were randomly added). From this experiment and a number of similar ones, we could conclude that the cortical map model is more sensitive to over-pruning than over-sprouting. The possible biological implication is intriguing:

Fig. 7. Reduced acoustic map, corresponding to the bite-block experiment.

12 The error is the magnitude of the vector difference between the points on the ideal trajectory (the straight line joining the initial to the final point) and the points estimated from the population code via the read-out procedure.


we might expect biological networks to follow the strategy of having some redundancy in the patterns of lateral connections (although it slightly degrades the metric precision) in order to exhibit a greater robustness with respect to the unavoidable destruction due to aging or accidental damage.
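The two kinds of topological noise can be sketched as operations on a symmetric 0/1 matrix of lateral connections. This is a simplified illustration: the function names and the ring-shaped toy map are hypothetical, and the sprouting step adds connections between arbitrary unconnected pairs, without the distance criterion that "long connections" would imply.

```python
import numpy as np

def prune(adjacency, fraction, rng):
    """Remove a random fraction of the existing connections (symmetric 0/1 matrix)."""
    A = adjacency.copy()
    i, j = np.nonzero(np.triu(A, k=1))
    drop = rng.choice(len(i), size=int(round(fraction * len(i))), replace=False)
    A[i[drop], j[drop]] = 0
    A[j[drop], i[drop]] = 0
    return A

def sprout(adjacency, fraction, rng):
    """Add random new connections, as a fraction of the existing number of connections."""
    A = adjacency.copy()
    n_edges = np.count_nonzero(np.triu(A, k=1))
    i, j = np.nonzero(np.triu(1 - A, k=1))  # currently unconnected pairs
    add = rng.choice(len(i), size=int(round(fraction * n_edges)), replace=False)
    A[i[add], j[add]] = 1
    A[j[add], i[add]] = 1
    return A

# Toy map: a ring of 20 neurons, i.e. 20 lateral connections.
n = 20
ring = np.zeros((n, n), dtype=int)
for k in range(n):
    ring[k, (k + 1) % n] = 1
    ring[(k + 1) % n, k] = 1

rng = np.random.default_rng(0)
pruned = prune(ring, 0.25, rng)     # 25% random pruning
sprouted = sprout(ring, 0.10, rng)  # 10% random sprouting
print(pruned.sum() // 2, sprouted.sum() // 2)
```

Both functions work on copies, so the same unperturbed map can be reused to compare the two conditions.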

4.4. Sensitivity to the receptive field size

In most simulations the receptive field size (coded by the standard deviation of $G_i(x)$, the external input functions) was 0.75, i.e. a large fraction of the domain size. Halving it did not have any effect on the performance. Similarly, using a non-circular covariance had a small effect, provided that the eigenvalue ratio was not too large. The reason is probably that the intrinsic dynamics tends in any case to sharpen the population code to a region in the immediate neighborhood of the “winning” neuron.

In general, it appears that the cortical model is a rather robust computational mechanism of trajectory formation that can carry out a number of “computational missions” in a homogeneous way. In particular, it allows the population codes to implicitly store distributed representations of high-dimensional environmental variables and operates on them with spatio-temporal competence. We wish to emphasize that the read-out mechanism of the population code employed in the simulations is simply a probe and there is no need for the robotic/biological organism to ever use it, because the multi-map computational architecture is totally distributed and only operates with population codes. This feature of our model should be contrasted with the same notion of population code in motor control, as originally proposed by Georgopoulos et al. (1983) and later developed in a large set of models inspired by those findings; the models generally assume the presence of a specific read-out mechanism, thus introducing the vexing question about the intrinsic coordinates of the population code. Our model, on the contrary, in agreement with Sanger (1994), is based on the idea that population codes give the cortex the opportunity to manipulate coordinate-free representations

Fig. 8. Evolution of the articulatory commands (left) and formant variables (right) during a simulated /ae/ transition.


of external variables, eliminating even the need to formulate the question.

5. Conclusions

The circularity of the organism–environment interaction has two complementary implications: one is related to learning and the other to dynamics.

Learning is a process of self-organization at the macroscopic-behavioral level (the Piagetian circular reaction) as well as at the microscopic level (where the Hebbian paradigm applies). Both operational levels imply that what is learnt inside is somehow similar to the world outside. In fact, such similarity or somatotopic/ecotopic organization has always been the target of a line of criticism centered on the homunculus paradox: if the brain operates

Fig. 9. Ideal map with border effects.

Fig. 10. Map with noise in the thalamo-cortical connections (uniformly distributed on a circle with a radius of 0.1).

Fig. 11. Map with topological noise in the lateral connections: 25% random pruning (up); 10% random addition of long connections (down).


by building internal representations of the environment, who is going to manage them? The paradox is essentially an infinite regression of computational layers. However, this only holds for hierarchical models in which there is a sharp separation between the world and the computational mechanism: in such models the sensorimotor data are purified and abstracted as the computation goes up from one level to the next one, ending up with pure symbols and/or coordinates of abstract variables.

In a circular-distributed model, on the contrary, sensorimotor patterns interact in a distributed, coordinate-free way by means of their population codes, without any need of extracting (or reading-out) coordinates and/or symbols, but taking advantage of the pattern-formation properties of the overall dynamics. Dynamics also operates as a regularization mechanism, inverting in an implicit way the sensory-motor mappings. Inflow is transformed into the outflow in a continuous way and the intermediate distributed patterns of activation have at the same time sensory and motor nature, while keeping some degree of somatotopic/ecotopic organization.

Finally, we wish to observe that the distributed and analog nature of the proposed internal representations might provide a natural and simple interface between sensorimotor and cognitive processes, without any need of explicit symbolic computations. The role of the cognitive/attentional system is indeed to identify, select, mask, etc. generalized targets/obstacles, thus leaving to the sensorimotor dynamics the task of generating the outflow of motor commands and triggering the causally related inflow of sensory reafferences. At the same time, the complementary task of the cognitive system is to categorize situations/patterns in the perceptual data, extracting “symbols” out of the continuous flow, which guide the further selection of targets/obstacles. In this way, the action–perception cycle is closed, externally, through the dynamics of the real world and, internally, through the categorization/selection interplay between the dynamics of the sensorimotor maps and the underlying cognitive processes.

Acknowledgements

The research was supported by the EU project SPEECH-MAPS, ISS, CNR, and MURST.

References

Amari, S. (1977). Dynamics of pattern formation in lateral-inhibition type neural fields. Biological Cybernetics, 27, 77–87.

Badin, P., Gabioud, B., Beautemps, D., Lallouache, T., Bailly, G., Maeda, S., Zerling, J. P., & Block, J. (1995). Cineradiography of VCV sequences: articulatory-acoustic data for speech production model. In: International Conference on Acoustics, Trondheim, Norway (pp. 349–352).

Barhen, J., Gulati, S., & Zak, M. (1989). Neural learning of constrained nonlinear transformations. IEEE Computer, 6, 67–76.

Bullock, D., & Grossberg, S. (1989). VITE and FLETE: Neural modules for trajectory formation and postural control. In: W. A. Hershberger (Ed.), Volitional action (pp. 253–297). Amsterdam: North-Holland/Elsevier.

Calvin, W. (1995). Cortical columns, modules and Hebbian cell assemblies. In: M. Arbib (Ed.), The handbook of brain theory and neural networks (pp. 269–272). Cambridge, MA: MIT Press.

Cruse, H. (1996). Neural networks as cybernetic systems. Stuttgart: G. Thieme Verlag.

Frisone, F., Perico, L., & Morasso, P. (1997). Extending the TRN model in a biologically plausible way. In: W. Gerstner, A. Germond, M. Hasler, & J. D. Nicoud (Eds.), Artificial neural networks, LNCS vol. 1327 (pp. 201–206).

Georgopoulos, A. P., Caminiti, R., Kalaska, J. F., & Massey, J. T. (1983). Spatial coding of movements: A hypothesis concerning the coding of movement direction by cortical populations. Exper. Brain Research Suppl., 7, 327–336.

Gilbert, C. D., & Wiesel, T. N. (1979). Morphology and intracortical projections of functionally identified neurons in cat visual cortex. Nature, 280, 120–125.

Grajski, K. A., & Merzenich, M. M. (1990). Hebb-type dynamics is sufficient to account for the inverse magnification rule in cortical somatotopy. Neural Computation, 2, 71–84.

Grossberg, S. (1973). Contour enhancement, short term memory, and constancies in reverberating neural networks. Studies in Applied Mathematics, 52, 213–257.

Katz, L. C., & Callaway, E. M. (1992). Development of local circuits in mammalian visual cortex. Annual Review of Neuroscience, 15, 31–56.

Knudsen, E. I., du Lac, S., & Esterly, S. (1987). Computational maps in the brain. Ann. Rev. Neuroscience, 10, 41–65.

Kohonen, T. (1982). Self-organizing formation of topologically correct feature maps. Biological Cybernetics, 43, 59–69.

Martinetz, T., & Schulten, K. (1994). Topology representing networks. Neural Networks, 7, 507–522.

Morasso, P., Sanguineti, V., & Tsuji, T. (1993). A dynamical model for the generation of curved trajectories. In: S. Gielen, & B. Kappen (Eds.), Proceedings ICANN’93, London: Springer Verlag (pp. 115–118).

Morasso, P., & Sanguineti, V. (1996). How the brain can discover the existence of external egocentric space. Neurocomputing, 12, 289–310.

Morasso, P., Sanguineti, V., & Spada, G. (1997). A computational theory of targeting movements based on force fields and topology representing networks. Neurocomputing, 15, 411–434.

Morasso, P., & Sanguineti, V. (Eds.), (1997a). Self-organization, cortical maps and motor control. Amsterdam: North Holland Elsevier.

Nicoll, A., & Blakemore, C. (1993). Patterns of local connectivity in the neocortex. Neural Computation, 5, 665–680.

Reggia, J. A., D’Autrechy, C. L., Sutton III, G. G., & Weinrich, M. (1992). A competitive distribution theory of neocortical dynamics. Neural Computation, 4, 287–317.

Sanger, T. (1994). Theoretical considerations for the analysis of population coding in motor cortex. Neural Computation, 6, 29–37.

Sanguineti, V., Laboissière, R., & Ostry, D. J. (1998). A dynamic biomechanical model for neural control of speech production. Journal of the Acoustical Society of America, 103 (3), 1615–1627.

Sanguineti, V., Laboissière, R., & Payan, Y. (1997). A control model of human tongue movements in speech. Biological Cybernetics, 77, 11–22.

Schwark, H. D., & Jones, E. G. (1989). The distribution of intrinsic cortical axons in area 3b of cat primary somatosensory cortex. Experimental Brain Research, 78, 501–513.

Singer, W. (1995). Development and plasticity of cortical processing architectures. Science, 270, 758–764.

Sirosh, J., Miikkulainen, R., & Choe, Y. (1996). Lateral interactions in the cortex. Hypertext book, www.cs.utexas.edu/users/nn/web-pubs/htmlbook96.
