Incorporating Statistical Background Model and Joint Probabilistic Data Association Filter into Motorcycle Tracking

Phi-Vu Nguyen
Faculty of Information Technology, University of Science, Ho Chi Minh City, Vietnam

Hoai-Bac Le
Faculty of Information Technology, University of Science, Ho Chi Minh City, Vietnam

Abstract— Multi-target tracking is an attractive research field due to its widespread application areas and challenges. Every point tracking method includes two mechanisms: object detection and data association. This paper combines a statistical background modeling method for foreground object detection with the Joint Probabilistic Data Association filter (JPDAF) in the context of motorcycle tracking. A major limitation of JPDAF is its inability to adapt to changes in the number of targets; in this work it is modified so that JPDAF can be successfully applied with a known number of targets at each time instant. The experimental system works well with fewer than 10 targets per frame and is able to self-evolve with gradual and “once-off” background changes.

Keywords— Multi-target tracking, point tracking, data association, JPDA, JPDAF, foreground object detection, statistical background model, motorcycle tracking.

I. INTRODUCTION

Motion understanding is an essential function of human vision; consequently, object tracking plays a crucial role in computer vision. Multi-target tracking has widespread applications in both military areas (air defense, air traffic control, ocean surveillance) and civilian areas (automatic surveillance in public or restricted places), especially as human labour becomes more and more expensive.

Object tracking, in general, is a challenging problem. Its complexities arise from the following factors [4]: loss of information caused by projection from 3D to 2D space, complex object motions, complex object shapes, partial and full object occlusions, scene illumination changes, and real-time processing requirements. There are three main categories of object tracking [4]: point tracking, kernel tracking, and silhouette tracking. While kernel and silhouette tracking are concerned with object shapes, point tracking considers an object as a point and focuses only on its position and motion, which can be represented by a state vector. Filtering is a class of methods suited to the dynamic state estimation problems of point tracking.

In multi-target tracking, we also face the task of finding a correspondence between the current targets and measurements, called data association. Data association is a complicated problem, especially in the presence of occlusions, misdetections, and entries and exits of objects. There are many statistical techniques for data association [7]; among them, Joint Probabilistic Data Association (JPDA) is a widely used method ([8],[9]) that finds a correspondence between measurements and objects at the current time step by enumerating all possible associations and computing the association probabilities. However, a good JPDA filter (JPDAF) requires accurate measurements, which means we need good object detection results.

Every tracking method requires an object detection mechanism. Some methods need detection only the first time an object appears, while others, point tracking among them, need it in every frame. One effective basis for foreground object detection is an accurate background model. Recently, L. Li et al. proposed a foreground object detection method based on statistical modeling of complex backgrounds [1]. This work used a Bayesian framework to incorporate three types of features (spectral, spatial, and temporal) into a representation of a complex background containing both stationary and nonstationary objects. With the statistics of background features, the method is able to represent the appearances of both static and dynamic background pixels and to self-evolve under gradual as well as sudden “once-off” background changes.

Taking advantage of the excellent object detection results of this method, this paper employs JPDAF for vehicle tracking in the motorcycle lane. A major limitation of JPDAF is its inability to adapt to changes in the number of targets, because it cannot distinguish a measurement originating from a newly appearing object from a false alarm. However, in the context of motorcycle surveillance, we propose a strategy to detect new objects entering and objects leaving the observation area, so that JPDAF can be successfully applied with a known number of targets at each time instant. The experimental system achieves good results with fewer than 10 targets per frame, including the detection and tracking of wrong-way motorcycles. Motorcycle tracking in particular, and traffic tracking in general, is an interesting but challenging application. Its main difficulties can be enumerated as: the severe occlusions when traffic density is high (especially during rush hours), the shadows of big vehicles, and the real-time processing demand of a traffic surveillance system. This paper is the next step (after [6]) in the effort to find the most suitable approach for automatic traffic surveillance in the big cities of Vietnam.

978-1-4244-2379-8/08/$25.00 (c)2008 IEEE

The remainder of this paper is organized as follows: section II presents the main ideas of the statistical modeling of complex backgrounds proposed in [1]; section III reviews the background of JPDAF, with a complete algorithm and experimental results on simulated data at the end of the section; section IV combines the statistical background model and the modified JPDAF so that they can be applied to the motorcycle tracking situation; the experimental results of this combination are presented in section V; and the conclusion follows.

II. STATISTICAL BACKGROUND MODELING FOR FOREGROUND OBJECT DETECTION

A. Bayesian framework for classifying background and foreground points

Let s = (x, y) be a pixel in a video frame at time t, given in Cartesian coordinates, and let v be the feature vector extracted at s. Using the Bayes formula, the probability that s belongs to the background given v is:

$$P_s(b|\mathbf{v}) = \frac{P_s(\mathbf{v}|b)\, P_s(b)}{P_s(\mathbf{v})} \quad (1)$$

where b denotes that s belongs to the background. Similarly, the probability that s belongs to a foreground object given v is:

$$P_s(f|\mathbf{v}) = \frac{P_s(\mathbf{v}|f)\, P_s(f)}{P_s(\mathbf{v})} \quad (2)$$

where f denotes that s is a foreground point. According to the Bayesian decision rule, s is classified as a background point if:

$$P_s(b|\mathbf{v}) > P_s(f|\mathbf{v}) \quad (3)$$

After some transformations, (3) is equivalent to:

$$2\, P_s(\mathbf{v}|b)\, P_s(b) > P_s(\mathbf{v}) \quad (4)$$

where P_s(b) is the probability that s is classified as background, P_s(v) is the probability that v is observed at s, and P_s(v|b) is the probability that v is observed when s has already been classified as background. Thus, P_s(v|b), P_s(b) and P_s(v), which are modeled and estimated from statistics in subsections B and C, can be used to judge whether a point comes from the background or the foreground.
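The decision rule (4) is a one-line comparison once the three probabilities are available. A minimal Python sketch (the function name and the example values are ours, not from the paper):

```python
def is_background(p_v_given_b, p_b, p_v):
    """Bayesian decision rule of Eq. (4): classify pixel s as background
    when 2 * P_s(v|b) * P_s(b) > P_s(v)."""
    return 2.0 * p_v_given_b * p_b > p_v

# A feature vector observed 80% of the time while s was background,
# with P_s(b) = 0.9 and overall frequency P_s(v) = 0.75:
print(is_background(0.8, 0.9, 0.75))  # True, since 2*0.8*0.9 = 1.44 > 0.75
```

The three probabilities themselves come from the per-pixel statistics tables described next.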

B. Statistics for background features and feature selection

To estimate P_s(v|b), P_s(b) and P_s(v), we need a data structure that accumulates the statistical information relevant to feature vector v at s over a sequence of frames. Each feature type at s has a table of statistics defined as:

$$T_{\mathbf{v}}(s) = \left\{ p^t_{\mathbf{v}}(b),\ \{S^t_{\mathbf{v}}(i)\}_{i=1,\dots,M(\mathbf{v})} \right\} \quad (5)$$

where $p^t_{\mathbf{v}}(b)$ tracks P_s(b) at time t based on the classification results at s up to time t, and $\{S^t_{\mathbf{v}}(i)\}_{i=1,\dots,M(\mathbf{v})}$ records the statistics of the M(v) feature vectors with the highest frequencies at s. Each $S^t_{\mathbf{v}}(i)$ contains:

$$S^t_{\mathbf{v}}(i) = \begin{cases} p^t_{\mathbf{v}_i} = P_s(\mathbf{v}_i) \\ p^t_{\mathbf{v}_i|b} = P_s(\mathbf{v}_i|b) \\ \mathbf{v}_i = (v_i^1, \dots, v_i^{D(\mathbf{v})})^T \end{cases} \quad (6)$$

where D(v) is the dimension of v_i. In the table T_v(s), the $\{S^t_{\mathbf{v}}(i)\}_{i=1,\dots,M(\mathbf{v})}$ are kept sorted in descending order with respect to $p^t_{\mathbf{v}_i}$, the frequency of v_i. Then, the first N(v) (N(v) < M(v)) members of $\{S^t_{\mathbf{v}}(i)\}_{i=1,\dots,M(\mathbf{v})}$ are used to estimate P_s(v|b), P_s(b) and P_s(v) in subsection C.

Another important issue in background modeling is feature selection. Herein, three types of features (spectral, spatial and temporal) are combined for complex background modeling.

1) Feature selection for static background pixels: due to the constancy in shape and appearance of a pixel belonging to a static background object, spectral and spatial features (here, its color and gradient) are exploited. Let c = (R, G, B)^T be the color vector and e = (g_x, g_y)^T be the gradient vector of a pixel s; then two tables, T_c(s) and T_e(s), are needed to learn them.

Because two types of features are used, the decision rule in (4) must be modified, with the assumption that the color and gradient vectors are independent:

$$2\, P_s(\mathbf{c}|b)\, P_s(\mathbf{e}|b)\, P_s(b) > P_s(\mathbf{c})\, P_s(\mathbf{e}) \quad (7)$$

With color and gradient features, we need a quantization measure that is less sensitive to illumination changes, so a normalized distance measure based on the inner product of two vectors is adopted [2]:

$$d(\mathbf{v}_1, \mathbf{v}_2) = 1 - \frac{2\,\langle \mathbf{v}_1, \mathbf{v}_2 \rangle}{\|\mathbf{v}_1\|^2 + \|\mathbf{v}_2\|^2} \quad (8)$$

where v ∈ {c, e}; v_1 and v_2 are identified with each other if d(v_1, v_2) < δ.

2) Feature selection for dynamic background pixels: the motion of a dynamic background object is usually confined to a small range (so that it is still regarded as background) and is periodic; waving tree branches and their shadows are an example. Hence, the color co-occurrence feature is used to exploit these properties. Let c_{t-1} = (R_{t-1}, G_{t-1}, B_{t-1})^T and c_t = (R_t, G_t, B_t)^T be the color features at times t-1 and t at pixel s; then the color co-occurrence vector at time t and pixel s is defined as cc_t = (R_{t-1}, G_{t-1}, B_{t-1}, R_t, G_t, B_t)^T. In this case, another distance measure is used:

$$d(\mathbf{cc}_t, \mathbf{cc}_j) = \max_{k \in [1..6]} \left| cc_t^k - cc_j^k \right| \quad (9)$$
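The two distance measures (8) and (9) can be sketched in Python as follows (a minimal illustration; the concrete vector layouts in the examples are assumptions of ours):

```python
import numpy as np

def d_normalized(v1, v2):
    """Normalized inner-product distance of Eq. (8), used for color and
    gradient vectors; it is small when v1 and v2 point the same way with
    similar magnitudes, and is less sensitive to illumination scaling."""
    v1, v2 = np.asarray(v1, float), np.asarray(v2, float)
    return 1.0 - 2.0 * np.dot(v1, v2) / (np.dot(v1, v1) + np.dot(v2, v2))

def d_cooccurrence(cc_t, cc_j):
    """Max absolute component difference of Eq. (9) for the 6-dimensional
    color co-occurrence vectors of dynamic background pixels."""
    return np.max(np.abs(np.asarray(cc_t, float) - np.asarray(cc_j, float)))

print(d_normalized([1, 0], [2, 0]))  # 1 - 4/5, i.e. about 0.2
print(d_cooccurrence([10, 20, 30, 10, 20, 30], [12, 20, 30, 10, 25, 30]))  # 5.0
```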



C. Learning the statistics of background features

With the data structure for statistics in place, we now turn to the feature learning procedure. There are two kinds of background changes, and each has its own learning strategy.

1) Gradual background changes: once we have the classification result at pixel s (subsection D) and time t, its statistics at the next time instant are updated as follows:

$$\begin{aligned} p^{t+1}_{\mathbf{v}}(b) &= (1-\alpha)\, p^t_{\mathbf{v}}(b) + \alpha L^t_b \\ p^{t+1}_{\mathbf{v}_i} &= (1-\alpha)\, p^t_{\mathbf{v}_i} + \alpha L^t_{\mathbf{v}_i} \\ p^{t+1}_{\mathbf{v}_i|b} &= (1-\alpha)\, p^t_{\mathbf{v}_i|b} + \alpha L^t_b L^t_{\mathbf{v}_i} \end{aligned} \quad (10)$$

where v ∈ {c, e, cc} and 0 < α < 1 is the learning rate. If s is classified as a background point at time t, then $L^t_b = 1$, else $L^t_b = 0$. If the input feature vector v_t is identified with v_i, then $L^t_{\mathbf{v}_i} = 1$; otherwise, $L^t_{\mathbf{v}_i} = 0$. Besides, if there is no v_i in the table T_v(s) identified with v_t, the last component in $\{S^{t+1}_{\mathbf{v}}(i)\}_{i=1,\dots,M(\mathbf{v})}$ is replaced by a new one:

$$\mathbf{v}_{M(\mathbf{v})} = \mathbf{v}_t, \quad p^{t+1}_{\mathbf{v}_{M(\mathbf{v})}} = \alpha, \quad p^{t+1}_{\mathbf{v}_{M(\mathbf{v})}|b} = \alpha \quad (11)$$
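The gradual update (10)-(11) for one pixel's table can be sketched as below. The dictionary layout is illustrative, not the paper's exact data structure:

```python
def update_gradual(stats, is_bg, matched_index, alpha=0.005):
    """Gradual-change learning of Eqs. (10)-(11) for one pixel's table.
    `stats` holds p_b (for P_s(b)) and a frequency-sorted list of records
    with fields p_v (P_s(v_i)) and p_vb (P_s(v_i|b))."""
    L_b = 1.0 if is_bg else 0.0
    stats["p_b"] = (1 - alpha) * stats["p_b"] + alpha * L_b   # Eq. (10), line 1
    for i, rec in enumerate(stats["records"]):
        L_v = 1.0 if i == matched_index else 0.0
        rec["p_v"] = (1 - alpha) * rec["p_v"] + alpha * L_v
        rec["p_vb"] = (1 - alpha) * rec["p_vb"] + alpha * L_b * L_v
    if matched_index is None:
        # no table entry matched v_t: replace the least-frequent one, Eq. (11)
        stats["records"][-1] = {"p_v": alpha, "p_vb": alpha}
    # keep the table sorted by frequency, as required for the top-N(v) estimates
    stats["records"].sort(key=lambda r: r["p_v"], reverse=True)

table = {"p_b": 0.9,
         "records": [{"p_v": 0.8, "p_vb": 0.75}, {"p_v": 0.1, "p_vb": 0.05}]}
# exaggerated alpha = 0.5 so the effect of one frame is visible:
update_gradual(table, is_bg=True, matched_index=0, alpha=0.5)
print(table["p_b"])  # 0.95
```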

2) “Once-off” background changes: a “once-off” background change occurs when there is a sudden change in illumination, or when a moving foreground object stops and becomes a background instance; that is, when background suddenly becomes foreground or vice versa. When this happens, we have:

$$\sum_{i=1}^{N(\mathbf{v})} P_s(f)\, P_s(\mathbf{v}_i|f) = \sum_{i=1}^{N(\mathbf{v})} P_s(\mathbf{v}_i) - \sum_{i=1}^{N(\mathbf{v})} P_s(b)\, P_s(\mathbf{v}_i|b) > M \quad (12)$$

or

$$\sum_{i=1}^{N(\mathbf{v})} p^t_{\mathbf{v}_i} - p^t_{\mathbf{v}}(b) \sum_{i=1}^{N(\mathbf{v})} p^t_{\mathbf{v}_i|b} > M \quad (13)$$

where M is a high percentage threshold (80% ~ 90%). Thus, (13) can be used as a condition to check whether a “once-off” background change is happening. In that case, the statistics of the foreground should be turned into background statistics:

$$\begin{aligned} p^{t+1}_{\mathbf{v}}(b) &= 1 - p^t_{\mathbf{v}}(b) \\ p^{t+1}_{\mathbf{v}_i} &= p^t_{\mathbf{v}_i} \\ p^{t+1}_{\mathbf{v}_i|b} &= \frac{p^t_{\mathbf{v}_i} - p^t_{\mathbf{v}}(b)\, p^t_{\mathbf{v}_i|b}}{p^{t+1}_{\mathbf{v}}(b)} \end{aligned} \quad (14)$$

for i = 1, …, N(v).

It has also been proved that, under this learning process, $\sum_{i=1}^{N(\mathbf{v})} p^{t+1}_{\mathbf{v}_i|b}$ converges to 1 as long as the background features are observed frequently [1].
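The “once-off” check (13) and the statistics swap (14) for one table can be sketched as follows (the example numbers are ours):

```python
def once_off_update(p_b, p_v, p_vb, M=0.85):
    """'Once-off' change handling: the check of Eq. (13) and, when it fires,
    the statistics swap of Eq. (14). p_v and p_vb are the lists
    {p^t_{v_i}} and {p^t_{v_i|b}} over the top N(v) entries of one table."""
    if sum(p_v) - p_b * sum(p_vb) <= M:
        return p_b, p_v, p_vb               # no once-off change detected
    new_p_b = 1.0 - p_b                     # Eq. (14): swap background prior
    new_p_vb = [(pv - p_b * pvb) / new_p_b  # renormalise conditionals
                for pv, pvb in zip(p_v, p_vb)]
    return new_p_b, list(p_v), new_p_vb

# A vector set seen 90% of the time but never while labelled background
# signals a sudden change (0.9 - 0.1*0.0 = 0.9 > M = 0.85):
p_b2, p_v2, p_vb2 = once_off_update(0.1, [0.9], [0.0], M=0.85)
print(p_b2)  # 0.9
```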

D. Foreground object detection

1) Change detection: in order to perform the proper feature selection mentioned in subsection C, we need to know whether a pixel is static or dynamic. Therefore, color-based background differencing and interframe differencing are applied to detect changes. Background differencing computes the difference between the background B(s,t) and the input frame, while interframe differencing performs the same operation on consecutive frames. Let F_bd(s,t) and F_td(s,t) be the background difference and interframe difference, respectively. If F_bd(s,t) = 0 and F_td(s,t) = 0, pixel s is treated as a nonchange background point. If F_td(s,t) = 1, s is classified as a dynamic point, and color co-occurrence features are used for background/foreground classification; otherwise, s is a static point, so color and gradient features are used in the next step.
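The routing logic of the change detection step amounts to a small decision table; a sketch (the label strings are ours):

```python
def change_type(F_bd, F_td):
    """Change detection routing: decide which feature set handles a pixel,
    based on background differencing (F_bd) and interframe differencing
    (F_td), each 0 or 1."""
    if F_bd == 0 and F_td == 0:
        return "nonchange-background"   # no classification needed
    if F_td == 1:
        return "dynamic"                # use color co-occurrence features
    return "static"                     # use color and gradient features

print(change_type(1, 0))  # static
```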

2) Background/foreground classification: let v_t be the input feature at pixel s and time t. The probabilities are estimated as follows:

$$\begin{aligned} P_s(b) &= p^t_{\mathbf{v}}(b) \\ P_s(\mathbf{v}_t) &= \sum_{j \in U(\mathbf{v}_t)} p^t_{\mathbf{v}_j} \\ P_s(\mathbf{v}_t|b) &= \sum_{j \in U(\mathbf{v}_t)} p^t_{\mathbf{v}_j|b} \end{aligned} \quad (15)$$

where v ∈ {c, e, cc} and U(v_t) is the set of v_j ∈ T_v(s) that are identified with v_t:

$$U(\mathbf{v}_t) = \{ \mathbf{v}_j \in T_{\mathbf{v}}(s),\ d(\mathbf{v}_t, \mathbf{v}_j) \le \delta \text{ and } j \le N(\mathbf{v}) \} \quad (16)$$

If there is no v_j ∈ T_v(s) identified with v_t, then P_s(v_t) = P_s(v_t|b) = 0.

As stated above, if s is a static pixel, we have v = c and v = e, so T_c(s) and T_e(s) are used as their tables of statistics. After calculating the probabilities as in (15), (7) is used to classify s as background or foreground. Note that in this case $P_s(b) = p^t_{\mathbf{c}}(b) = p^t_{\mathbf{e}}(b)$. Similarly, if s is a dynamic pixel, v = cc and (4) is used as the classification criterion.
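The estimation in (15)-(16) can be sketched as follows; the tuple-based table layout and the scalar toy features are illustrative only:

```python
def estimate_probs(table, v_t, dist, delta, N):
    """Probability estimation of Eqs. (15)-(16): sum the frequencies of the
    first N table entries whose feature vectors match v_t within delta.
    `table` is a list of (v_i, p_v, p_vb) tuples sorted by frequency."""
    p_v = p_vb = 0.0
    for v_i, pv, pvb in table[:N]:      # j <= N(v), Eq. (16)
        if dist(v_t, v_i) <= delta:     # v_i is "identified with" v_t
            p_v += pv
            p_vb += pvb
    return p_v, p_vb                    # both stay 0 when nothing matches

dist = lambda a, b: abs(a - b)          # toy 1-D stand-in for Eq. (8)/(9)
table = [(10, 0.5, 0.5), (14, 0.25, 0.125), (30, 0.125, 0.0)]
print(estimate_probs(table, 11, dist, delta=4, N=3))  # (0.75, 0.625)
```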

3) Foreground object segmentation: after background/foreground classification has finished for all pixels, an “oil spreading” algorithm is applied to find connected regions of foreground pixels. Then some heuristic techniques are used to separate objects that stick together because of shadows.

E. Background maintenance

To make the background differencing in the change detection step more accurate, the background image should be updated regularly. Let B(s,t) and I(s,t) be the background and the input frame at s and time t. If s is a nonchange background point, the background at s is updated as:

$$\mathbf{B}(s, t+1) = (1-\beta)\, \mathbf{B}(s, t) + \beta\, \mathbf{I}(s, t) \quad (17)$$

where 0 < β < 1. Otherwise, if s is classified as a background point (static or dynamic), the background at s is replaced by the new value:

$$\mathbf{B}(s, t+1) = \mathbf{I}(s, t) \quad (18)$$

Figure 1 presents the complete algorithm of foreground object detection.



Figure 1. The complete algorithm of foreground object detection.

III. JOINT PROBABILISTIC DATA ASSOCIATION FILTER

Let M(t) be the number of objects at time t, and N(t) be the number of measurements received. The sets of objects and measurements at time t are denoted respectively as:

$$X_t = \{\mathbf{x}_{t,1}, \mathbf{x}_{t,2}, \dots, \mathbf{x}_{t,M(t)}\} \quad (19)$$

$$Z_t = \{\mathbf{z}_{t,1}, \mathbf{z}_{t,2}, \dots, \mathbf{z}_{t,N(t)}\} \quad (20)$$

Let θ = {θ_{j,i}, j = 1..N(t), i = 1..M(t)} denote a joint association event between objects and measurements, where θ_{j,i} is the particular event which assigns measurement z_{t,j} to object i. The joint association event probability is:

$$\begin{aligned} p(\theta|Z_{1:t}) &= p(\theta|Z_t, N(t), Z_{1:t-1}) \\ &= \frac{1}{c}\, p(Z_t|\theta, N(t), Z_{1:t-1})\, p(\theta|Z_{1:t-1}, N(t)) \\ &= \frac{1}{c}\, p(Z_t|\theta, N(t), Z_{1:t-1})\, p(\theta|N(t)) \end{aligned} \quad (21)$$

where Z_{1:t} is the sequence of measurements up to time t and c is a normalization constant. The first term, p(Z_t|θ, N(t), Z_{1:t-1}), is the likelihood function of the measurements, given by:

$$p(Z_t|\theta, N(t), Z_{1:t-1}) = \mu_F(\phi) \prod_{\theta_{j,i} \in \theta} g_t(\mathbf{x}_{t,i}|\mathbf{z}_{t,j}) \quad (22)$$

where φ is the number of false alarms, μ_F(φ) is the probability of that number of false alarms, which is usually Poisson distributed, and g_t(x_{t,i}|z_{t,j}) is the likelihood that measurement z_{t,j} originated from target x_{t,i}. The second term, p(θ|N(t)), is the prior probability of a joint association event, given by:

$$p(\theta|N(t)) = \frac{\phi!}{N(t)!}\, p_D^{N(t)-\phi}\, (1-p_D)^{M(t)-N(t)+\phi}\, \mu_F(\phi) \quad (23)$$

where p_D is the probability of detection of an object, under the assumption that target detection occurs independently over time with known probability. Thus, the probability of a joint association event is:

$$p(\theta|Z_{1:t}) = \frac{1}{c}\, \frac{\phi!}{N(t)!}\, p_D^{N(t)-\phi}\, (1-p_D)^{M(t)-N(t)+\phi}\, \left(\mu_F(\phi)\right)^2 \prod_{\theta_{j,i} \in \theta} g_t(\mathbf{x}_{t,i}|\mathbf{z}_{t,j}) \quad (24)$$
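For a single enumerated event, (24) can be evaluated directly; a sketch (the argument names and toy numbers are ours, and the normalization constant c is left as a parameter):

```python
import math

def joint_event_prob(g_values, M, N, phi, p_D, lam, c=1.0):
    """Unnormalised joint-association-event probability of Eq. (24).
    `g_values` are the likelihoods g_t(x_i|z_j) of the pairings in the
    event, phi is the number of false alarms, and mu_F is Poisson(lam)."""
    mu_F = math.exp(-lam) * lam ** phi / math.factorial(phi)
    return (1.0 / c) * math.factorial(phi) / math.factorial(N) \
        * p_D ** (N - phi) * (1 - p_D) ** (M - N + phi) \
        * mu_F ** 2 * math.prod(g_values)

# Two targets, two measurements, no false alarms, both targets detected:
p = joint_event_prob([0.9, 0.8], M=2, N=2, phi=0, p_D=0.98, lam=0.1)
print(p > 0)  # True
```

In the filter proper, this value is computed for every feasible event and normalised so that the event probabilities sum to one.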

The state estimation of object i is:

$$\begin{aligned} \hat{\mathbf{x}}_{t,i} &= E[\mathbf{x}_{t,i}|Z_{1:t}] \\ &= \sum_{j=1}^{N(t)} \int \mathbf{x}_{t,i}\, p(\mathbf{x}_{t,i}, \theta_{j,i}|Z_{1:t})\, d\mathbf{x}_{t,i} \\ &= \sum_{j=1}^{N(t)} \int \mathbf{x}_{t,i}\, p(\mathbf{x}_{t,i}|\theta_{j,i}, Z_{1:t})\, p(\theta_{j,i}|Z_{1:t})\, d\mathbf{x}_{t,i} \end{aligned} \quad (25)$$

Let the association probability for a particular association between measurement zt,j and object i be defined by:

$$\beta_{j,i} = p(\theta_{j,i}|Z_{1:t}) = \sum_{\theta:\ \theta_{j,i} \in \theta} p(\theta|Z_{1:t}) \quad (26)$$

Hence, (25) becomes:

$$\hat{\mathbf{x}}_{t,i} = \sum_{j=1}^{N(t)} \beta_{j,i}\, E[\mathbf{x}_{t,i}|\theta_{j,i}, Z_{1:t}] = \sum_{j=1}^{N(t)} \beta_{j,i}\, \hat{\mathbf{x}}^j_{t,i} \quad (27)$$

where $\hat{\mathbf{x}}^j_{t,i}$ is the state estimate from the Kalman filter [10] under the assumption that measurement z_{t,j} is associated with object i.

Note that $\sum_{j=1}^{N(t)} \beta_{j,i} < 1$, and in practice it is difficult to propose a model that estimates β_{j,i} in (26) exactly as the theory prescribes, so we normalize β_{j,i} so that $\sum_{j=1}^{N(t)} \bar{\beta}_{j,i} = 1$ before using it in (27). Hence:

$$\bar{\beta}_{j,i} = \frac{\beta_{j,i}}{\sum_{j'=1}^{N(t)} \beta_{j',i}} \quad (28)$$

and (27) becomes:

$$\hat{\mathbf{x}}_{t,i} = \sum_{j=1}^{N(t)} \bar{\beta}_{j,i}\, \hat{\mathbf{x}}^j_{t,i} \quad (29)$$
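The normalization (28) and the mixing step (29) can be sketched as follows (the per-hypothesis estimates here are toy values standing in for Kalman filter outputs):

```python
import numpy as np

def jpda_state_update(beta, x_hat_j):
    """State update of Eqs. (28)-(29): normalise the association
    probabilities of one target so they sum to one, then mix the
    per-hypothesis Kalman estimates x_hat^j_{t,i}."""
    beta = np.asarray(beta, float)
    beta_bar = beta / beta.sum()           # Eq. (28)
    x_hat_j = np.asarray(x_hat_j, float)   # shape (N(t), state_dim)
    return beta_bar @ x_hat_j              # Eq. (29)

# Two measurement hypotheses for one target, weighted 3:1 (unnormalised):
est = jpda_state_update([3.0, 1.0], [[0.0, 0.0], [4.0, 8.0]])
print(est)  # [1. 2.]
```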

Figure 2 below is the complete algorithm of JPDAF at each time instant t.



Figure 2. The complete algorithm of JPDAF at each time instant.

Figure 3 shows the experimental results of JPDAF on simulated data with 8 targets over 100 time steps. In each image pair, the left image is the simulated data and the right one is the estimated track of each target. The targets' positions are initialized in the area [0..500] x [0..50], false alarms are drawn randomly in the area [0..200] x [0..200], P_D = 0.98, and $\mu_F(\phi) \sim \mathrm{Poisson}(\lambda = 0.1)$.

Figure 3. JPDA filter results on simulated data for 8 targets.

IV. COMBINING STATISTICAL BACKGROUND MODEL AND JPDA FILTER FOR MOTORCYCLE TRACKING

A. Moving object detection

The statistical background model is applied to detect moving objects in the motorcycle lane with the parameters in Table I.

TABLE I. PARAMETERS FOR STATISTICAL BACKGROUND MODEL




The color and gradient vectors are obtained by quantizing their domains to 256 resolution levels, while for color co-occurrence vectors the number of quantized levels is 32; δ = 0.005 is used for the distance measure in (8), while δ = 2 is used for (9).

B. Multi-target tracking

Using the measurements obtained from the detection stage, JPDAF performs data association between the current measurements and targets. At each time t, relying on the accuracy of the detection results, we can propose a strategy to detect new objects entering the observation area: if

$$\exists\, \mathbf{z}_{t,j} \in Z_t \text{ such that } \forall\, \mathbf{x}_{t-1,i} \in X_{t-1},\ \|\mathbf{z}_{t,j} - \mathbf{x}_{t-1,i}\| > \varepsilon \quad (30)$$

where $\mathbf{x}_{t-1,i} = (x_{t-1,i}, y_{t-1,i})$ and $\mathbf{z}_{t,j} = (z^x_{t,j}, z^y_{t,j})$ are respectively the Cartesian coordinates of object i at time t-1 and of measurement j at time t, and ε is a small positive number, then z_{t,j} is considered a measurement originating from a new object. That is, if a measurement is not “too close” to any target from the last time instant, a new target is implied to have appeared. Besides, if an object is at the end of the observation area and is not a new object, or if it has been misdetected for more than 3 time instants, it is removed.
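The entry rule (30) reduces to a gating test against the previous frame's targets; a sketch with toy coordinates:

```python
import numpy as np

def find_new_objects(Z_t, X_prev, eps):
    """New-object rule of Eq. (30): a measurement that is farther than eps
    from every target of the previous frame starts a new track. Returns
    the indices of such measurements."""
    new = []
    for j, z in enumerate(Z_t):
        z = np.asarray(z, float)
        if all(np.linalg.norm(z - np.asarray(x, float)) > eps for x in X_prev):
            new.append(j)
    return new

measurements = [(5.0, 5.0), (100.0, 100.0)]
targets_prev = [(6.0, 5.0)]
print(find_new_objects(measurements, targets_prev, eps=10.0))  # [1]
```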

To increase the accuracy of JPDAF, besides the spatial distance, color histogram information should be incorporated into the likelihood g_t(x_{t,i}|z_{t,j}) in (22). Hence, the Bhattacharyya distance is employed to calculate the “distance” between the reference color model $\mathbf{K}^* = \{k^*(n; \mathbf{x}_0)\}_{n=1,\dots,N}$ and the candidate color model $\mathbf{K}(\mathbf{x}) = \{k(n; \mathbf{x}_t)\}_{n=1,\dots,N}$ of each target (details in [3]):

$$\xi[\mathbf{K}^*, \mathbf{K}(\mathbf{x})] = \left[ 1 - \sum_{n=1}^{N} \sqrt{k^*(n; \mathbf{x}_0)\, k(n; \mathbf{x})} \right]^{1/2} \quad (31)$$

where the reference color model of a target is chosen as its last state and the candidate color model is its current measurement. Moreover, to increase accuracy, the reference and candidate models are divided into two sub-regions (Figure 4), and the color likelihood of a candidate model is then:

$$cl_t(\mathbf{x}_{t,i}|\mathbf{z}_{t,j}) \propto e^{-\lambda \sum_{w=1}^{2} \xi^2[\mathbf{K}^*_w, \mathbf{K}_w(\mathbf{x}_t)]} \quad (32)$$

Figure 4. The reference color model.

Let dl_t(x_{t,i}|z_{t,j}) be the spatial distance likelihood (p(z_t|X_t)) obtained from the Kalman filter between measurement j and target i [10]; then the likelihood g_t(x_{t,i}|z_{t,j}) in (24) is defined as:

$$g_t(\mathbf{x}_{t,i}|\mathbf{z}_{t,j}) = \gamma\, dl_t(\mathbf{x}_{t,i}|\mathbf{z}_{t,j}) + (1-\gamma)\, cl_t(\mathbf{x}_{t,i}|\mathbf{z}_{t,j}) \quad (33)$$

where 0.5 < γ < 1, because spatial distance information has a higher priority than color in this context. In our application, we chose γ = 0.7.
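Equations (31)-(33) can be sketched per sub-region as below; the scale λ and the toy histograms are assumptions of ours, while γ = 0.7 follows the text:

```python
import math

def color_likelihood(K_ref, K_cand, lam=20.0):
    """Color likelihood in the spirit of Eqs. (31)-(32) for one sub-region:
    Bhattacharyya distance between two normalised color histograms, mapped
    through an exponential. lam is an assumed scale, not a paper value."""
    xi2 = 1.0 - sum(math.sqrt(a * b) for a, b in zip(K_ref, K_cand))
    return math.exp(-lam * xi2)

def g_likelihood(dl, cl, gamma=0.7):
    """Combined likelihood of Eq. (33): spatial distance weighted above
    color, with gamma = 0.7 as in the paper."""
    return gamma * dl + (1 - gamma) * cl

same = [0.5, 0.5]
print(color_likelihood(same, same))  # identical histograms give 1.0
print(g_likelihood(0.8, 0.4))        # close to 0.68
```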

V. EXPERIMENTAL RESULTS

A. Object detection results

Figure 5 (a) shows some results of object detection; the left image of each pair is the input frame and the right one is the detection result. The experimental sequences were taken from the motorcycle lane in cloudy weather, and although the illumination changes are easily seen, the detection algorithm still works very well. The background is learned rapidly: Figure 5 (b) is a background learned after 60 frames. Together with the statistics of background features, the background/foreground classification step is very accurate, with almost no misclassified background points. The difficulty lies in the segmentation step: when the object density at the end of the observation area is high, occlusions occur frequently and the segmentation step often makes mistakes (Figure 6). Figure 5 (c) is an example of a “once-off” background change: a motorbike stopped close to the pavement for a while and became background soon after.



Figure 5. (a) Some results of object detection; (b) Learned background image at frame 60; (c) An example of “once-off” background change.

Figure 6. Some successful and failed results of segmenting objects that stick together due to shadows.

Table II gives the quantitative results of object detection. The system was tested on ten sequences with object density < 10 objects/frame; each sequence has an average length of 10 seconds and uses its first 30 frames (1 second) for initial background learning (“+30” in the Length column). The precision rate of 100% demonstrates that no background object was classified as foreground, and the losses in the recall rate are caused by incorrect segmentation.

TABLE II. THE STATISTICS OF OBJECT DETECTION RESULTS

Figure 7. Some results of tracking.




B. Tracking results

The results of JPDAF depend on the object detection results: if objects are correctly detected, the tracking algorithm works very well. In general, the system works well with a reasonable number of targets per frame (< 10). With the strategy for detecting objects entering and exiting the observation area, the JPDAF can also detect and track motorcycles driven in the wrong direction. Figure 7 shows some tracking results, including the tracking of a wrong-way motorcycle (Figure 7 (d), object 10). Table III shows the statistics of fully correct tracks in the ten sequences above (objects mis-tracked in any frame are not counted).

Since JPDAF is an NP-hard problem (the number of possible joint association events at each time instant t is $\sum_{i=1}^{\min(M(t),N(t))} C^i_{M(t)} A^i_{N(t)}$), its computation cost is one of its major weak points. All of these experiments were run on a Pentium IV 2.4 GHz with 512 MB RAM; due to the high cost of the object detection and tracking algorithms, the processing rate is 2 s/frame at a frame size of 360x240, while the sequence rate is 30 frames/s.
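The event count quoted above can be evaluated directly to see how quickly enumeration blows up; a small Python check (math.comb and math.perm require Python 3.8+):

```python
import math

def num_joint_events(M, N):
    """Number of joint association events for M targets and N measurements,
    as quoted in the text: sum over i of C(M, i) choices of detected
    targets times A(N, i) ordered assignments of measurements to them."""
    return sum(math.comb(M, i) * math.perm(N, i)
               for i in range(1, min(M, N) + 1))

for n in (2, 4, 6, 8):
    print(n, num_joint_events(n, n))  # grows combinatorially with n
```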

TABLE III. THE STATISTICS OF TRACKING RESULTS

VI. CONCLUSION

This paper is the next step in the search for an efficient approach to a motorcycle surveillance system, after the use of the Particle filter in [6]. Some improvements have been achieved in the object detection step, which now gives more accurate results over the whole observation area along with the ability to adapt efficiently to illumination changes and “once-off” changes. However, occlusions have not been strictly handled, and the computation cost remains one of the major limitations. In the future, we hope that many new multi-target tracking methods will be applied in this context and the best selection will be produced.

REFERENCES

[1] L. Li, W. Huang, I. Y. Gu, and Q. Tian, “Statistical modeling of complex backgrounds for foreground object detection,” IEEE Transactions on Image Processing, Vol. 13, No. 11, Nov. 2004.

[2] L. Li and M. Leung, “Integrating intensity and texture differences for robust change detection,” IEEE Transactions on Image Processing, Vol. 11, pp. 105-112, Feb. 2002.

[3] K. Okuma, A. Taleghani, N. de Freitas, J. J. Little, and D. G. Lowe, “A boosted particle filter: Multitarget detection and tracking,” Proceedings of ECCV 2004, Vol. I, pp. 28-39, 2004.

[4] A. Yilmaz, O. Javed, and M. Shah, “Object tracking: a survey,” ACM Computing Surveys, Vol. 38, No. 4, Dec. 2006.

[5] O. Frank, “Multiple Target Tracking,” thesis for the degree of Dipl. El.-Ing. ETH, Swiss Federal Institute of Technology Zurich, Feb. 2003.

[6] Hoai Bac Le, Nam Trung Pham, and Tuong Vu Le Nguyen, “Applied Particle Filter in Traffic Tracking,” Proceedings of the IEEE International Conference RIVF 2006.

[7] I. J. Cox, “A review of statistical data association techniques for motion correspondence,” International Journal of Computer Vision, Vol. 10, No. 1, pp. 53-66, 1993.

[8] C. Rasmussen and G. Hager, “Probabilistic data association methods for tracking complex visual objects,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, pp. 560-576, 2001.

[9] D. Schulz, W. Burgard, D. Fox, and A. B. Cremers, “Tracking multiple moving targets with a mobile robot using Particle filters and statistical data association,” Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Seoul, Korea, 2001.

[10] B. Ristic, S. Arulampalam, and N. Gordon, “Beyond the Kalman Filter,” Artech House, 2004.
