Top Banner
A bioinspired collision detection algorithm for VLSI implementation J.Cuadri *a , G.Liñán a , R.Stafford b , M.S.Keil a , E.Roca a a Instituto de Microelectrónica de Sevilla, Centro Nacional de Microelectrónica, Avda. Reina Mercedes s/n, Campus Universidad de Sevilla, E-41012 Sevilla (Spain). b School of Biology, Ridley Building, University of Newcastle upon Tyne, NE1 7RU (United Kingdom). ABSTRACT In this paper a bioinspired algorithm for collision detection is proposed, based on previous models of the locust (Locusta migratoria) visual system reported by F.C. Rind and her group, in the University of Newcastle-upon-Tyne 1,2 . The algorithm is suitable for VLSI implementation in standard CMOS technologies as a system-on-chip for automotive applications. The working principle of the algorithm is to process a video stream that represents the current scenario, and to fire an alarm whenever an object approaches on a collision course. Moreover, it establishes a scale of warning states, from no danger to collision alarm, depending on the activity detected in the current scenario. In the worst case, the minimum time before collision at which the model fires the collision alarm is 40 msec (1 frame before, at 25 frames per second). Since the average time to successfully fire an airbag system is 2 msec, even in the worst case, this algorithm would be very helpful to more efficiently arm the airbag system, or even take some kind of collision avoidance countermeasures. Furthermore, two additional modules have been included: a "Topological Feature Estimator" and an “Attention Focusing Algorithm”. The former takes into account the shape of the approaching object to decide whether it is a person, a road line or a car. This helps to take more adequate countermeasures and to filter false alarms. The latter centres the processing power into the most active zones of the input frame, thus saving memory and processing time resources. Keywords: bioinspired model, LGMD, vision chip, VLSI, collision avoidance. 1-INTRODUCTION Collision avoidance is nowadays one of the most important fields of research in the automotive industry. The reason is straightforward: automotive accidents cause hundreds of thousands deaths and injuries every year, and therefore, in addition to educational measures, developing systems that help to reduce that number is mandatory. Plenty of works are being developed world wide within this subject, using different points of view and strategies. In 3 a vehicle tracking mechanism based on traditional image processing techniques is presented. The idea is to extract three images from the input frame, each representing a different spatial resolution, and to search for objects that match the symmetry properties that a vehicle is supposed to present for each of them. The algorithm, thus, involves muti-resolution image processing, edge detection, symmetry computation, etc, and it needs a high processing power and large memory resources. Very sophisticated systems can be found in nature, improved after thousands of years of evolution. Analysing those systems, and trying to implement them or to extract some methodology from their structure is a promising path for science. For instance, the locusts visual system has a neural structure called Lobula Giant Movement Detector (LGMD) that selectively responds to looming objects that are supposed to be approaching on a collision course. This allows those animals to fly in huge and very dense swarms of them without colliding, apart from escaping from hazardous situations as approaching predators. In 4 , a very original LGMD implementation is proposed. It consists of the correlation between consecutive frames in the pre-synaptic layers to the LGMD, so that the only task is to integrate these inputs. The pre- synaptic processing that is accomplished is the Reichardt Correlation, between the actual frame, and two predictions made from the previous frame, one expanding and the other contracting it. So, if the object is approaching, the former correlation * Mail: [email protected], Phone: +34-955-056666, Fax: +34-955-056686 Copyright 2005 Society of Photo-Optical Instrumentation Engineers. This paper will be published in the Proceedings of Microelectronics for the New Millenium Symposium and is made available as an electronic preprint with permission of SPIE. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited.
11

A bioinspired collision detection algorithm for VLSI implementation

Feb 25, 2023

Download

Documents

Luis Moreno
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: A bioinspired collision detection algorithm for VLSI implementation

Copyright 2005 Society of Photo-Optical Instrumentation Engineers.This paper will be published in the Proceedings of Microelectronics for the New Millenium Symposium and is made available as an electronic preprint with permission of SPIE. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited.

A bioinspired collision detection algorithm for VLSI implementation

J.Cuadri*a, G.Liñána, R.Staffordb, M.S.Keila, E.Rocaa aInstituto de Microelectrónica de Sevilla, Centro Nacional de Microelectrónica, Avda. Reina

Mercedes s/n, Campus Universidad de Sevilla, E-41012 Sevilla (Spain).

bSchool of Biology, Ridley Building, University of Newcastle upon Tyne, NE1 7RU (United Kingdom).

ABSTRACT

In this paper a bioinspired algorithm for collision detection is proposed, based on previous models of the locust(Locusta migratoria) visual system reported by F.C. Rind and her group, in the University of Newcastle-upon-Tyne1,2. Thealgorithm is suitable for VLSI implementation in standard CMOS technologies as a system-on-chip for automotiveapplications. The working principle of the algorithm is to process a video stream that represents the current scenario, andto fire an alarm whenever an object approaches on a collision course. Moreover, it establishes a scale of warning states,from no danger to collision alarm, depending on the activity detected in the current scenario. In the worst case, theminimum time before collision at which the model fires the collision alarm is 40 msec (1 frame before, at 25 frames persecond). Since the average time to successfully fire an airbag system is 2 msec, even in the worst case, this algorithm wouldbe very helpful to more efficiently arm the airbag system, or even take some kind of collision avoidance countermeasures.Furthermore, two additional modules have been included: a "Topological Feature Estimator" and an “Attention FocusingAlgorithm”. The former takes into account the shape of the approaching object to decide whether it is a person, a road lineor a car. This helps to take more adequate countermeasures and to filter false alarms. The latter centres the processing powerinto the most active zones of the input frame, thus saving memory and processing time resources.

Keywords: bioinspired model, LGMD, vision chip, VLSI, collision avoidance.

1-INTRODUCTION

Collision avoidance is nowadays one of the most important fields of research in the automotive industry. The reasonis straightforward: automotive accidents cause hundreds of thousands deaths and injuries every year, and therefore, inaddition to educational measures, developing systems that help to reduce that number is mandatory. Plenty of works arebeing developed world wide within this subject, using different points of view and strategies. In 3 a vehicle trackingmechanism based on traditional image processing techniques is presented. The idea is to extract three images from the inputframe, each representing a different spatial resolution, and to search for objects that match the symmetry properties that avehicle is supposed to present for each of them. The algorithm, thus, involves muti-resolution image processing, edgedetection, symmetry computation, etc, and it needs a high processing power and large memory resources.

Very sophisticated systems can be found in nature, improved after thousands of years of evolution. Analysing thosesystems, and trying to implement them or to extract some methodology from their structure is a promising path for science.For instance, the locusts visual system has a neural structure called Lobula Giant Movement Detector (LGMD) thatselectively responds to looming objects that are supposed to be approaching on a collision course. This allows those animalsto fly in huge and very dense swarms of them without colliding, apart from escaping from hazardous situations asapproaching predators. In 4, a very original LGMD implementation is proposed. It consists of the correlation betweenconsecutive frames in the pre-synaptic layers to the LGMD, so that the only task is to integrate these inputs. The pre-synaptic processing that is accomplished is the Reichardt Correlation, between the actual frame, and two predictions madefrom the previous frame, one expanding and the other contracting it. So, if the object is approaching, the former correlation

* Mail: [email protected], Phone: +34-955-056666, Fax: +34-955-056686

Page 2: A bioinspired collision detection algorithm for VLSI implementation

will have a high value, and if the object is receding, it will be the latter the one which will have a higher value. Theexcitation to the LGMD is generated thus from this correlations. In the same direction but a different approach there is thework presented in 5. There, the design and test of a bioinspired analog VLSI chip that detects collisions up to 500 msecbefore they occur is explained. Particularly, this work is based on the “Elementary Motion Detector” (EMD) units that canbe found in flies and other insects. This EMD is used to measure the optic flow in the input sequence, and it consists of a1D array of elements that correlate its visual input with the one from the previous timestep shifted rightwards or leftwards.The idea is to design an array of EMD disposed in a radial way, from the centre of the array to the borders. Therefore, anapproaching object would generate response in almost all the EMD’s, given that its edges would all move outwards in theradial sense, while an object that approaches in non-collision course, or a translating object, would generate response inonly a fraction of them. Thus, summing up all the EMD responses and setting an appropriate threshold, a collision alarmcan be fired accurately with this device. In the framework of the biological modelling, other works are being carried withinthe visual system of the locust. In 6 and 7 works about the Descending Contralateral Movement Detectors (DCMDs) of thelocust are reported. The DCMDs are two neural structures present on each side of the locust, running along the length ofthe body from head to thorax. In these works authors examine in separate experiments the steering responses of tetheredflying locusts to identical stimuli. Those stimuli consisted of computer-generated spheres with different sizes approachingat different speeds. The experiments have revealed that the DCMD is more sensitive to variations in target trajectory in thehorizontal plane than in the vertical plane, and moreover, it has been proved that the DCMD responses also depend on thesize and angle of approaching trajectory of the target. It is also concluded that the DCMDs helps the locust to make evasivemanoeuvres associated with the presence of predators, rather than being related to collision avoidance, as it was thought inthe beginning.

In this paper a model for the Lobula Giant Movement Detector is presented, based on previous studies1,2, and from it,a Collision Avoidance System for automotive applications is proposed. Even though the initial mathematical descriptionof the LGMD is relatively simple, it still presents some drawbacks for its electronic implementation. Therefore, the primarygoal is to simplify and to adapt this model to embed it in a VLSI system-on-chip using standard CMOS technology. Besides,two complementary modules have been designed for this model, in order to make the whole system more robust, versatileand optimized. These modules are called "Topological Feature Estimator" and "Attention Focusing Mechanism". Withrespect to the former, its aim is to extract further information about the current alarm situation, particularly making an earlyclassification of the approaching object that is generating the collision alarm. In parallel to that, the latter’s aim is tooptimize the use of the computing resources restricting the processing to the zones of the frame that present more activityat a given instant.

The paper is organized as follows: in section 2 the initial Rind et al. model is presented. In section 3, the architecturalissues for the electronic implementation and the measures taken to guarantee its suitability are presented. Thecomplementary modules are described in section 4, whereas in section 5 experimental results are presented. Finally,conclusions are drawn out in section 6.

2-INITIAL MODEL OUTLINE

The starting point of this paper is the LGMD Model proposed by Rind et al. in1. The model represents a neural networkstructure, decomposed into 4 retinotopically connected layers of neurons that interact with each other, as shown in Fig.1.The model takes as input the luminance of the current scene, and calculates a movement map, i.e. the difference betweenthe current frame and the previous one, so that a signal proportional to the moving objects is generated. The movement mapsignal is also passed to the inhibitory 'I' layer, where inhibition spreads between neighbouring units is delayed beforeinteracting with the current excitation. This process removes much of the excitation caused by small changes in backgroundmovement, which may cause spurious alerts. Between the ‘E’ and ‘I’ layer a critical race between excitation and spreadinhibition is accomplished. Thus, if the excitation is strong enough it will go on to the next layer. Otherwise it will becancelled by inhibition, thus not causing any response in the model. The result of this competitive race between ‘E’ and ‘I’leads to the signal that feeds the 'S' cells. This layer is directly presynaptic to the LGMD, and all its neuron’s axons convergein the LGMD neuron, where their response is integrated to generate the LGMD potential value. There is also a forwardpathway that directly connects the luminance input with the LGMD. This Feedforward inhibition’s purpose to eliminateLGMD responses against global excitations, such as sudden changes in the illumination conditions.

Page 3: A bioinspired collision detection algorithm for VLSI implementation

This conceptual model has been algorithmically implemented in order to simulate it8, and for that purpose, some minorchanges have been made to the original idea. The input image has been decided to be of 100x150 pixels size, and itsgrayscale resolution to be of 6 bits. This dynamic range has shown to be fine enough for this particular application. Theinput image is first passed to the so called ON-OFF layer that calculates the movement map of the current scene. Let

be the luminance signal captured by the photosensors, where i and j are the row and column index, and T thediscrete timesteps. Then, the ON-OFF signal (the movement map), is defined as:

(1)

where 0 < i < 100 is the row index, 0 < j < 150 is the column index, and T are discrete time steps. Movement map values are rounded (so that ) before fed into the subsequent stage, the inhibition layer, which is calculated

in Eq. (2).

(2)

where denotes convolution between a and b, and K is a convolution kernel that implements a low pass filter. is rounded before fed into the next stage, the Summing Cells layer, where the computation of the critical race

between excitation and inhibition is accomplished (Eq. (3)). Note that can present either positive or negativevalues.

(3)

After that, a new final layer has been introduced before the LGMD value. This is called the Block Summing Cells.Here, a local activity map is calculated, summing up the excitation present in the S layer in groups of m rows and n columns,thus giving as a result a array. Since the S cells could adopt both positive and negative values, correspondingto ON and OFF activity, respectively, summing cell activity into an block may cancel. This is made to eliminatesmall scale spatial changes in the S units by allowing its ON and OFF activity to antagonise over small spatial scales. Thisphenomenon occurs in the locust’s neural pathway 1, and has clear advantages for automotive applications, because it isable to eliminate activity caused by background movement in general, such as camera vibrations, turning movements and

Figure 1: Original Rind et al. Model for the locust’s Lobula Giant Movement Detector (see functional descriptionbelow)

L i j T, ,( )

C i j t, ,( ) L i j T, ,( ) L i j T 1–, ,( )–2

---------------------------------------------------------=

C i j T, ,( ) C 0 31[ , ]∈

I i j T, ,( ) 1.5 C i j T 1–, ,( ) C i j T, ,( )–[ ] K⊗⋅=

a b⊗I i j T, ,( )

I i j T, ,( )

S i j T, ,( ) max C i j T, ,( ) 1.5 I i j T 1–, ,( )⋅– 0{ , } max 1.5 I i j T 1–, ,( ) C i j T, ,( )–⋅ 0{ , }–=

100m

--------- 150

n--------- ×

m n×

Page 4: A bioinspired collision detection algorithm for VLSI implementation

so on, whilst maintaining sensitivity to looming objects approaching over a high contrast background. So, the BSC layer isgiven by:

(4)

where: and , and in this case, and .

From the BSC layer, the net scalar excitation is calculated by summing up all the values of its cells, adequately scaledto get the signal into the proper dynamic range:

(5)

Finally, the LGMD potential value results from the temporal averaging of , in the following way:

(6)

LGMDCoeff is the coefficients vector for the LGMD potential calculation. The LGMD potential value is at each timestepcompared with the value of a dynamic threshold, and if the LGMD is greater that it, a spike is generated and the values

, and are set to 0. This phenomenon consists of the discharging of the cell membrane after firing onespike. Therefore, a new LGMD variable is needed, not subjected to that reset:

(7)

The threshold is calculated in the following manner:

(8)

ThreshCoeff is the coefficients vector for the threshold calculation, and a is a threshold minimum value, that helps to getrid of spurious excitation. The collision alarm firing process begins with the spike generation. One spike is fired at everyframe in which the LGMD value is greater that the threshold value, as it actually happens in the locusts. If so, as mentionedbefore, the LGMD values of , and are set to 0, and the processing continues in the same way with the nextframes. If four or more spikes are detected in a 5 frames interval, then the collision alarm is fired. Otherwise the modelremains silent.

3-ARCHITECTURAL ISSUES FOR VLSI IMPLEMENTATION

The objective is to integrate this model into a chip, so that a compact and reliable system can be fabricated for realapplications in the automotive industry. This is not straightforward, due to the presence of parameters and operations whoseimplementation in silicon is either impossible or very problematic (time, power or area consuming) using analog buildingblocks. Therefore, modifications and simplifications to the model are mandatory. Furthermore, adaptation of the model tothe particular application is also needed, because the conditions under which it works are different from those of the locusts’case. For instance, the frame rate of image acquisition in the system will be 25 frames/sec, while in the locust’s visualsystem the LGMD’s spike frequency can exceed 500 Hz.

The main architectural problems regarding the VLSI implementation of the model are related to silicon area and powerconsumption. Particularly, if a convolution kernel has to be executed in a given layer of the model, it implies a high densityof synaptic connections between the cells in that layer, and consequently an increase of cell area due to the required extrarouting.

B p q T, ,( ) S i j T, ,( )∑=

1 i 100≤ ≤ 1 j 150≤ ≤ 1 p 10≤ ≤ 1 q 15≤ ≤

e T( )b p q T, ,( )∑

25-------------------------------=

l T( ) e T( )

l T( ) LGMDCoeff 3( ) l T 2–( ) LGMDCoeff 2( ) l T 1–( ) LGMDCoeff 1( ) e T( )⋅+⋅+⋅=

I T( ) I T 1–( ) I T 2–( )

v T( ) LGMDCoeff 3( ) v T 2–( ) LGMDCoeff 2( ) v T 1–( ) LGMDCoeff 1( ) e T( )⋅+⋅+⋅=

V T( )

V T( ) ThreshCoeff 3( ) v T 5–( )⋅ ThreshCoeff 2( ) v T 10–( )⋅ ThreshCoeff 1( ) v T 14–( )⋅+ + a+=

T T 1– T 2–

Page 5: A bioinspired collision detection algorithm for VLSI implementation

Additionally, if the parameters of the model present values that differ too much one from another, then a wide dynamicrange is needed to operate with all of them simultaneously. If the implementation is digital, this consumes again a highamount of area. The problem is even more difficult to resolve if the different operators are implemented by using analogblocks where very high dynamic ranges in either constants or data cannot be accommodated due to the area and powerconsumption constraints existing in these type of systems.

Apart from this, an additional problem appears with the need of using information from previous timesteps on somelayers. Each timestep of memory needed means a complete new layer to store, with the consequent spent of area in thedigital case, and the problem of maintaining the information along time in the analog memories (to counterbalance currentleakages, etc).

To summarize, given all the previous considerations, the changes in the model should be oriented to simplify thestructure, to reduce the required memory for information in the layers from previous timesteps, and to equalize as much aspossible the parameters and the dynamic range of its signals. To achieve that, and based on simulations using real andartificial automotive situations, the first modification has been to remove the spread of inhibition realized by theconvolution kernel K in Eq. (2). This implies a big simplification of the structure, and saves area by removingapproximately synaptic connections that were needed for that operation to be executed in the focal plane.Simulations have shown that this does not affect the model’s behaviour in a remarkable way. Another structuralmodification that leads to a further reduction of the required integration area has been to remove the whole BSC layer, aslong as it has revealed not to affect in an important way the responses of the model, because the background activity isindeed efficiently attenuated by previous layers. With respect to the dynamic range of the signals, experiments have shownthat most often the values of the cells in all the layers of the model are positive or zero, being negative (and near zero) invery few occasions. Thus, after each layer, a half or complete wave rectification has been introduced, with thecorresponding reduction of the dynamic range.

To keep the behaviour of the model unchanged and even to enhace it, a readjustment of the parameters and scalefactors has been carried out, according to either heuristic rules or results from optimization genetic algorithms 9. Relatedto the parameters, a variation in the inhibition coefficient in Eq. (3) has also been introduced, called Adaptive Inhibiton. Itconsists of the variation of the inhibition influence depending on the number of consecutive spikes that have been generateduntil the current frame. The higher the number of consecutive spikes, the stronger the inhibition (i.e., the higher theinhibition coefficient, , Eq. (11)). This is useful to avoid false spikes and therefore, mistakes in firing thecollision alarm. Finally, to reduce the memory needed for the model’s execution, and given that the convolution kernel hasalready been removed from the inhibition, if its dependence from the previous values of the movement map is alsoremoved, then it could be approximated by the actual value of . With the same goal of memory saving, andtherefore, area saving, and considering that the signals are only allowed to have positive values, has also beensimplified, as can be seen in Eq. (11). As a result, the new layer by layer description of the model is the following:

Movement Map: (9)

Inhibition: (10)

Activity: , (11)

with k varying accordingly to the number of consecutive spikes

Excitation: (12)

LGMD: (13)

The spiking generation as well as the decision about the collision alarm remain unchanged.

100 150×( ) 8×

InhCoeff k( )

C i j T, ,( )S i j T, ,( )

C i j T, ,( ) L i j T, ,( ) L i j T 1–, ,( )–2

---------------------------------------------------------=

I i j T, ,( ) C i j T, ,( )=

S i j T, ,( ) C i j T, ,( ) InhCoeff k( ) I i j T 1–, ,( )×–=

e T( )S i j T, ,( )∑150

----------------------------=

l T( ) LGMDCoeff 3( ) l T 2–( ) LGMDCoeff 2( ) l T 1–( ) LGMDCoeff 1( ) e T( )⋅+⋅+⋅=

Page 6: A bioinspired collision detection algorithm for VLSI implementation

4-MODEL COMPLEMENTS

4.1 Attention Focusing AlgorithmDownloading retinotopical information from the array is a very important point, since it could become a bottleneck

for the processing speed. One of the possibilities is to read the information from all the cells in a column of the processorsarray, storing it in the result matrix, and after that, reading the next column. This process is repeated for every column inthe processing array. Anyhow, this process of extracting the information always results in lowering the maximumprocessing speed, due to the parallel to serial conversion that is actually being accomplished. So, reducing the amount ofinformation to be extracted from the array at each timestep is very desirable. Experiments have shown that information inthe latter layers of the model is concentrated on a part of the frame. Mainly this happens when there is some approachingobject, and the excitation is not being generated by a turning movement or camera vibration (Fig. 2). These are the casesin which the model should respond firing the collision alarm. Focusing on that region that contains the information aboutthe relevant activity, and rejecting the rest of the frame would help to achieve the goal of reducing the amount ofinformation previously explained.

In order to do that, an attention mechanism based on the distribution of the excitation in the S-layer has beendeveloped. The main idea is to divide the input frame in square-shaped groups of cells (attention grid) that can be eitheractivated or disabled. Each of these groups is called an attention cell. If an attention cell is active, the processing cellsbelonging to it compute the operations needed for the model, and their respective result must be downloaded from the array.If it is disabled, no operation is made, and its cells stay silent. In this case these cells do not have to be taken into accountwhen reading the information from the array. The activation and switching off of the attention cells is accomplished by anattention module. If no particular region presents a high activity, then it returns to the initial attention grid. The attentionalgorithm is decomposed into three main modules:

• "Attention: this module makes the decision of which attention cells present enough activity to activate their neighbours (activating cell). It also disables the ones in which there was not enough activity in the previous frame, transferring their activation to the nearest disabled attention cell that belongs to the default activation grid.

• "Activate: this module is called by "Attention", and activates the attention cells that surround the current activating cell. It returns the number of cells it has actually activated, since it depends on the location of the central cell, and the previous state of activation of its surrounding attention cells.

• "Disable: this module is also called by "Attention", and its duty is to switch off the same number of attention cells "Activate" has activated. The goal is to maintain constant the amount of information to extract from the array of the S cells in the model. The rule is to switch off the attention cells that are furthest to the current activating cell.

The activation grid will consequently act as a mask for every layer in the model, fact that could also be interpreted asif the disabled attention cells did not work at all, not consuming power, and not having information to be downloaded fromthe array. This is what will actually happen in the on-chip implementation of the model. This implies a power saving, andan acceleration of the process of reading the information from the last layer of the model, lowering the bitrate needed to

Figure 2: Examples of distribution of the excitation in the S layer. Note that it concentrates in the surrounding of the approachingobject (inside the red border), while the rest of the cells in the frame remain silent.

Page 7: A bioinspired collision detection algorithm for VLSI implementation

accomplish that task in time. One important point regarding the algorithm is the default activation grid. By construction,the attention mechanism presents a strong tendency to go back to the default activation grid, and therefore its shape willaffect the global behaviour of the model. Tests have been made with initial activation grids with different shape and size,as can be seen in Fig. 6. The pointer present in these frames represents the centre of mass of the excitation in the S layer.Thus, it points to the approaching object. Its colour means no danger nor activity when it is green, activity but no danger ifit is yellow, and activity and alarm when it is red.

4.2 Topological Feature EstimatorTo calculate the LGMD potential, one of the final steps is to sum up all pixels in the S layer (excitation). Clearly this

implies a great loss of information. It would be possible to further describe the environment if topological information insome of the layers of the model was taken into account. For instance, if the excitation in the S layer is known to be line-shaped, the kind of approaching object could be discriminated (a horizontal line (a road line), a vertical line (probably aperson, or a traffic light), and so forth). If the excitation is spread in the whole S layer, then it is probably due to a turningmovement of the car, in which all the background moves rapidly and generates activity. If it is localized in a given regionof the frame, then it is more probable that it has been generated by an approaching object, probably a car or some otherobject in the trajectory of the vehicle. Besides, it has been checked that some very usual situations in automotiveenvironments generate false collision alarms, which are unavoidable with the skills of the model. These situations comprisethe approaching of a horizontal road line (that is assumed to be an approaching object by the model), a quick turn of thecar, or a bump in the road causing a violent shake of the camera. Another advantage of making an early classification ofthe approaching object is the possibility of taking more accurate countermeasures. It is obvious that the safety measureswould not be the same if the object in collision course is a pedestrian than if it is a car. In the former, the safety concernsalmost totally to the pedestrian (braking, evasive manoeuvre…), while in the latter, it also involves the hosting vehicle.Thus, an additional module called "Topological Feature Estimator" (TpFE) has been developed to evaluate the nature ofthe approaching object and generate an early classification of it.

To accomplish the task of classifying, it has been decided to use the information present in the excitatory layer (S-layer). This layer is directly pre-synaptic to the LGMD in the biological model. The main reason is that on that layer thebackground and non significant items have been removed already, and only the movement map of relevant objects in thefield of view remains. Thus, it is particularly appropriate for extracting information about the geometrical features of theobject. For these purposes, the most important feature of an object (particularly in a 2D environment) is its shape. Theclassification is made into four different classes: vertical-shaped, horizontal-shaped, global (in the spatial sense), or dot-shaped. These would correspond to a vertical-shaped object (traffic light, person…), a road line or road stripes, a turningmovement, and an approaching car, respectively. For the classification, let be the pixel values in the S layer at agiven time instant T. First, two vectors are created,

where: (14)

where: (15)

Let be called column vector and row vector, since they are created summing up the columns and the rowsof the S layer, respectively. Notice that the first thirty samples of have been removed. This has been done for stabilitypurposes, given that usually there is plenty of undesired activity in the upper zone of the frame, due to the horizonmovement. The mean value and the variance of these vectors are defined in the following way:

Sij T( )

V T( ) v31 v32… v100, ,[ ]t= vi Sij T( )

j 1=

150

∑=

H T( ) h1 h2… h150, ,[ ]= hj Sij T( )

i 1=

100

∑=

V T( ) H T( )V T( )

Page 8: A bioinspired collision detection algorithm for VLSI implementation

, (16)

, (17)

From these, two descriptors can be defined:

Vertical Deviation (18)

Horizontal Deviation (19)

Those descriptors represent the verticality and horizontalness of the shape of the object, respectively. Comparing themthe main component of the shape can be obtained, following the rule:

If &

Vertical shape (person, traffic light…)

else if &

Horizontal shape (road line, road stripes…)

else if & & &

Dot shape (Car…)

else

None (turning, non identifiable object…)

where a and b are adjustable parameters that depend on the dynamic range of the pixels in the S layer and the expecteddimensions of the objects to classify as cars. The decision made by this rule is stored each timestep in an N elements buffer,one for each possible class, in such a way that we have the N latest decisions about the shape of the object in memory. Toimprove the robustness of the method and its accuracy, the definitive decision is made from the information in these buffers,in a way that could be defined as a "voting" mechanism. Thus, if in the N timesteps there is a majority of "X" decisions(where "X" can be "horizontal", "vertical", and so forth…), then it is accepted as valid. This is true for all the classes exceptfor the 'None' one, which is accepted as valid only if all the other buffers are completely empty. Therefore, before decidingthat the object is unclassifiable, there must pass N timesteps without any decision about its shape.

5-EXPERIMENTAL RESULTS

Tests have been made with a set of videos recorded in automotive situations (Fig. 3). In total, there are 49 videosequences representing several scenarios, comprising driving on a highway, collisions with a balloon car, pedestrianscrossing in front of the car, roundabouts, and other common situations in traffic. The sequences are 8 bit@100x150 AVIfiles at 25 fps.

µv T( ) 170------ vi T( )

i 31=

100

∑= σ2v T( ) 1

70( )2

-------------- vi T( ) µv T( )–( )2

i 31=

100

∑=

µh T( ) 1150--------- hj T( )

j 1=

150

∑= σ2h T( ) 1

150( )2

----------------- hj T( ) µh T( )–( )2

j 1=

150

∑=

δv T( ) µv T( ) σv T( )–=

δh T( ) µh T( ) σh T( )–=

δv T( ) 0> δh T( ) 0<

δv T( ) 0< δh T( ) 0>

δv T( ) 0> δh T( ) 0> Maxj hj T( ){ } µh T( )–( ) a> Maxi vi T( ){ } µv T( )–( ) b>

Page 9: A bioinspired collision detection algorithm for VLSI implementation

The behaviour of the model has been simulated using Matlab(Tm) , and it has shown an excellent overall performance.With respect to the collision alarms, it fires at about 7 frames before the collision actually occurs. This means 300 msec.approx. Examples of the alarm level can be observed in Fig. 4 and Fig. 5, being represented by the blue bar below the inputframe. The colour code goes from green (left side, no danger) to dark red (right side, collision alarm). In addition to that,the TpFE has revealed a percentage of success in classification of 73% (Fig. 4). This means that when there was a collisionalarm, 73% of the times the object that was creating it was classified correctly.

With respect to the attention focusing algorithm, tests have shown that the best default attention grids are the ones thatuniformly distribute the attention over the frame when there is no particular source of excitation, namely, a) and c) in Fig.6. Examples of the concentration of the attention can be observed in Fig. 5. After running the simulations over all the testsequences, it has been concluded that the attention algorithm keeps the number of active attention cells constant, thusmaintaining the requirements with respect to computing speed and downloading bitrate, while concentrates the attentioninto the more active zones of the frame. This has also revealed to be very useful in filtering false alarms, since lots ofspurious excitation coming from peripheral zones of the frame (due to camera vibrations, turning movements, etc.) is no

Figure 3: Examples of test videos: (a) Driving on a highway. (b) Roundabout. (c) Collision with test balloon car. (d) Pedestriancrossing in front of the car.

Figure 4: Classification examples of the appraching object using topological information from the S-layer, and their correspondingalarm level (blue bar below. The color code goes from green (no danger) to dark red (collision alarm)).

Figure 5: Examples of the attention focusing depending on the activity detected in each part of the frame. Note how the attentionpattern in zones where no activity is observed correspond whether to the default attention grid or to disabled attention cells (black).

Page 10: A bioinspired collision detection algorithm for VLSI implementation

longer taken into account. As the main drawback, this enhancement of the robustness causes the model to miss some alarmsand warnings, above all those that correspond to sudden appearances of pedestrians or objects crossing slowly in front ofthe car. Nevertheless, the behaviour of the model improves globally.

To summarize, the TpFE gives additional information about the nature of the approaching object, while the attentionalgorithm helps to drastically lower the computing power needed for the execution of the model. Moreover, it reduces thebottleneck effect of downloading the information from the array. Besides, it leads to a more efficient use of the availablecomputing resources. This makes the model more robust, responding more accurately to the real situations under which ithas been tested.

6-CONCLUSIONS

A collision detection algorithm based on the locust’s vision system has been presented. The algorithm shows anexcellent performance when tested in standard traffic situations, alarming about the collision up to 300 msec before itactually occurs. Furthermore, the complementary modules included have revealed to be very useful for making the modelmore reliable and the final system more robust and optimized.

ACKNOWLEDGEMENTS

This work has been partially funded by project IST-2001-38097 (LOCUST) and TIC2003 - 09817-C02-01(VISTA). Mr. Cuadri’s work is funded by an F.P.U. grant of the Spanish Ministry of Education and Science. Special thanksto Volvo Car Corporation for recording the test videos representing automotive situations.

REFERENCES

1.F.C. Rind and P.J. Simmons, “Seeing what is coming: building collision-sensitive neurones”. Trends Neuroscience 22(5), pp 215-220. May 1999.

2.Rind, F.C., & Bramwell, D.I. (1996). “Neural network based on the input organisation of an identified neuron signalingimpending collision”. J. Neurophysiol., 75 (3), 967-984.

3.Broggi, A.; Cerri, P.; Antonello, P.C., “Multi-resolution vehicle detection using artificial vision”, Intelligent VehiclesSymposium, 2004 IEEE, 14-17 June 2004, Pages:310 - 314.

4.S.Bermudez, P.Verschure, “A Collision Avoidance Model Based on the Lobula Giant Movement Detector (LGMD)neuron of the Locust”, Proceedings of the IJCNN, Budapest 2004.

Figure 6: Different Default Attention Grids tested. (a) 4x6 attention cells, uniformly distributed. (b) 4x6 attention cells, focused ona-priori interesting areas of the frame. (c) 10x15 attention cells, uniformly distributed. (d) 10x15 attention cells, focused on a-prioriinteresting areas of the frame.

(b)(a) (c) (d)

Page 11: A bioinspired collision detection algorithm for VLSI implementation

5.R.R. Harrison, “A Low-Power Analog VLSI Visual Collision Detector”, Advances in Neural Information ProcessingSystems (NIPS 2003), Vancouver, Canada, 2004.

6.Gray, J.R., Lee, J.K. and Robertson, R.M., “Activity of descending contralateral movement detector neurons and collisionavoidance behaviour in response to head-on visual stimuli in locusts”. J. Comp. Physiol. A 187:115-129 (2001).

7.Roger D. Santer, Peter J. Simmons, F. Claire Rind, “Gliding behaviour elicited by lateral looming stimuli in flyinglocusts”, J. Comp. Physiol. 191: 61–73 (2005).

8.Project IST-2001-38097 (LOCUST), Deliverable 11, “Biological Model Report”.

9.Yue S., Rind F.C., Keil M.S., Cuadri J. & Stafford R. "A bio-inspired visual collision detection mechanism for cars:Optimisation of a model of a locust neuron to a novel environment", submitted to Neurocomputing, October 2004.