Kernel-Based Object Tracking

Dorin Comaniciu, Visvanathan Ramesh, Peter Meer

Real-Time Vision and Modeling Department
Siemens Corporate Research
755 College Road East, Princeton, NJ 08540

Electrical and Computer Engineering Department
Rutgers University
94 Brett Road, Piscataway, NJ 08854-8058

Abstract

A new approach toward target representation and localization, the central component in visual tracking of non-rigid objects, is proposed. The feature-histogram-based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially smooth similarity functions suitable for gradient-based optimization; hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as the similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only a few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking.

Keywords: non-rigid object tracking; target localization and representation; spatially-smooth similarity function; Bhattacharyya coefficient; face tracking.

1 Introduction

Real-time object tracking is a critical task in many computer vision applications such as surveillance [44, 16, 32], perceptual user interfaces [10], augmented reality [26], smart rooms [39, 75, 47], object-based video compression [11], and driver assistance [34, 4].

Two major components can be distinguished in a typical visual tracker. Target Representation and Localization is mostly a bottom-up process which also has to cope with changes in the appearance of the target. Filtering and Data Association is mostly a top-down process dealing with the dynamics of the tracked object, learning of scene priors, and evaluation of different hypotheses. The way the two components are combined and weighted is application dependent and plays a decisive role in the robustness and efficiency of the tracker. For example, face tracking in
The stopping criterion threshold $\epsilon$ used in Step 6 is derived by constraining the vectors $\hat{\mathbf{y}}_0$ and $\hat{\mathbf{y}}_1$ to be within the same pixel in original image coordinates. A lower threshold will induce subpixel accuracy. From real-time constraints (i.e., uniform CPU load in time), we also limit the number
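
The stopping test described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `mean_shift_locate`, the precomputed `weights` image, the disc-shaped window of radius `radius`, and the iteration cap `max_iter` with its default value are all assumptions made for illustration. Each step moves the estimate to the weighted centroid of the pixels inside the window, and the loop ends once two successive estimates are closer than `eps` (one pixel by default) or the cap is reached.

```python
import numpy as np

def mean_shift_locate(weights, y0, radius, eps=1.0, max_iter=20):
    """Weighted-centroid mean shift; stops when the step is shorter than eps.

    weights  : 2-D array of non-negative per-pixel weights (assumed given)
    y0       : initial (row, col) estimate
    radius   : window radius in pixels (hypothetical parameter)
    eps      : stopping threshold in pixels; < 1 asks for subpixel accuracy
    max_iter : iteration cap (an assumption of this sketch)
    """
    h, w = weights.shape
    y = np.asarray(y0, dtype=float)
    it = 0
    for it in range(1, max_iter + 1):
        # Bounding box of the window, clipped to the image.
        r0 = max(int(round(y[0])) - radius, 0)
        r1 = min(int(round(y[0])) + radius + 1, h)
        c0 = max(int(round(y[1])) - radius, 0)
        c1 = min(int(round(y[1])) + radius + 1, w)
        if r0 >= r1 or c0 >= c1:
            break  # window fell outside the image
        rows, cols = np.mgrid[r0:r1, c0:c1]
        # Keep only pixels inside the disc centred on the current estimate.
        inside = (rows - y[0]) ** 2 + (cols - y[1]) ** 2 <= radius ** 2
        win = weights[r0:r1, c0:c1] * inside
        total = win.sum()
        if total <= 0:
            break  # no support under the window; keep the current estimate
        y_new = np.array([(rows * win).sum(), (cols * win).sum()]) / total
        shift = np.linalg.norm(y_new - y)
        y = y_new
        if shift < eps:
            break  # successive estimates lie within eps of each other
    return y, it
```

Setting `eps` below 1 trades extra iterations for subpixel accuracy, while `max_iter` bounds the per-frame cost regardless of how the weights behave.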
Figure 11: Measurements and estimated state for the Hand sequence. (a) The measurement value (dotted curve) and the estimated location (continuous curve) as a function of the frame index. Upper curves correspond to the y filter, while the lower curves correspond to the x filter. (b) Estimated velocity. The dotted curve is for the y filter and the continuous curve is for the x filter.
[2] F. Aherne, N. Thacker, and P. Rockett, “The Bhattacharyya metric as an absolute similarity measure for frequency coded data,” Kybernetika, vol. 34, no. 4, pp. 363–368, 1998.
[3] S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for on-line non-linear/non-Gaussian Bayesian tracking,” IEEE Trans. Signal Process., vol. 50, no. 2, pp. 174–189, 2002.
[4] S. Avidan, “Support vector tracking,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, volume I, 2001, pp. 184–191.
[5] Y. Bar-Shalom and T. Fortmann, Tracking and Data Association. Academic Press, 1988.
[6] B. Bascle and R. Deriche, “Region tracking through image sequences,” in Proc. 5th Intl. Conf. on Computer Vision, Cambridge, MA, 1995, pp. 302–307.
[7] S. Birchfield, “Elliptical head tracking using intensity gradients and color histograms,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Santa Barbara, CA, 1998, pp. 232–237.
[8] M. Black and D. Fleet, “Probabilistic detection and tracking of motion boundaries,” Intl. J. of Computer Vision, vol. 38, no. 3, pp. 231–245, 2000.
[9] Y. Boykov and D. Huttenlocher, “Adaptive Bayesian recognition in tracking rigid objects,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Hilton Head, SC, 2000, pp. 697–704.
[10] G. R. Bradski, “Computer vision face tracking as a component of a perceptual user interface,” in Proc. IEEE Workshop on Applications of Computer Vision, Princeton, NJ, October 1998, pp. 214–219.
[11] A. D. Bue, D. Comaniciu, V. Ramesh, and C. Regazzoni, “Smart cameras with real-time video object generation,” in Proc. IEEE Intl. Conf. on Image Processing, Rochester, NY, volume III, 2002, pp. 429–432.
[12] G. Caenen, V. Ferrari, A. Zalesny, and L. Van Gool, “Analyzing the layout of composite textures,” in Proceedings Texture 2002 Workshop, Copenhagen, Denmark, 2002, pp. 15–19.
[13] T. Cham and J. Rehg, “A multiple hypothesis approach to figure tracking,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Fort Collins, CO, volume II, 1999, pp. 239–245.
[14] H. Chen and T. Liu, “Trust-region methods for real-time tracking,” in Proc. 8th Intl. Conf. on Computer Vision, Vancouver, Canada, volume II, 2001, pp. 717–722.
[15] Y. Chen, Y. Rui, and T. Huang, “JPDAF-based HMM for real-time contour tracking,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, volume I, 2001, pp. 543–550.
[16] R. Collins, A. Lipton, H. Fujiyoshi, and T. Kanade, “Algorithms for cooperative multisensor surveillance,” Proceedings of the IEEE, vol. 89, no. 10, pp. 1456–1477, 2001.
[17] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Machine Intell., vol. 24, no. 5, pp. 603–619, 2002.
[18] D. Comaniciu, V. Ramesh, and P. Meer, “Real-time tracking of non-rigid objects using mean shift,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Hilton Head, SC, volume II, June 2000, pp. 142–149.
[19] T. Cover and J. Thomas, Elements of Information Theory. John Wiley & Sons, New York, 1991.
[21] D. DeCarlo and D. Metaxas, “Optical flow constraints on deformable models with applications to face tracking,” Intl. J. of Computer Vision, vol. 38, no. 2, pp. 99–127, 2000.
[22] A. Djouadi, O. Snorrason, and F. Garber, “The quality of training-sample estimates of the Bhattacharyya coefficient,” IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 92–97, 1990.
[24] A. Elgammal, D. Harwood, and L. Davis, “Non-parametric model for background subtraction,” in Proc. European Conf. on Computer Vision, Dublin, Ireland, volume II, June 2000, pp. 751–767.
[25] F. Ennesser and G. Medioni, “Finding Waldo, or focus of attention using local color information,” IEEE Trans. Pattern Anal. Machine Intell., vol. 17, no. 8, pp. 805–809, 1995.
[26] V. Ferrari, T. Tuytelaars, and L. V. Gool, “Real-time affine region tracking and coplanar grouping,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, volume II, 2001, pp. 226–233.
[27] P. Fieguth and D. Terzopoulos, “Color-based tracking of heads and other mobile objects at video frame rates,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 1997, pp. 21–27.
[28] K. Fukunaga, Introduction to Statistical Pattern Recognition. Academic Press, second edition, 1990.
[29] J. Garcia, J. Valdivia, and X. Vidal, “Information theoretic measure for visual target distinctness,” IEEE Trans. Pattern Anal. Machine Intell., vol. 23, no. 4, pp. 362–383, 2001.
[30] D. Gavrila, “The visual analysis of human movement: A survey,” Computer Vision and Image Understanding, vol. 73, pp. 82–98, 1999.
[32] M. Greiffenhagen, D. Comaniciu, H. Niemann, and V. Ramesh, “Design, analysis and engineering of video monitoring systems: An approach and a case study,” Proceedings of the IEEE, vol. 89, no. 10, pp. 1498–1517, 2001.
[33] G. Hager and P. Belhumeur, “Real-time tracking of image regions with changes in geometry and illumination,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Francisco, CA, 1996, pp. 403–410.
[34] U. Handmann, T. Kalinke, C. Tzomakas, M. Werner, and W. von Seelen, “Computer vision for driver assistance systems,” in Proceedings SPIE, volume 3364, 1998, pp. 136–147.
[35] I. Haritaoglu and M. Flickner, “Detection and tracking of shopping groups in stores,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, 2001.
[36] I. Haritaoglu, D. Harwood, and L. Davis, “W4: Who? When? Where? What? - A real time system for detecting and tracking people,” in Proc. of Intl. Conf. on Automatic Face and Gesture Recognition, Nara, Japan, 1998, pp. 222–227.
[37] J. Huang, S. Kumar, M. Mitra, W. Zhu, and R. Zabih, “Spatial color indexing and applications,” Intl. J. of Computer Vision, vol. 35, no. 3, pp. 245–268, 1999.
[38] C. Hue, J. Cadre, and P. Perez, “Sequential Monte Carlo filtering for multiple target tracking and data fusion,” IEEE Trans. Signal Process., vol. 50, no. 2, pp. 309–325, 2002.
[39] S. Intille, J. Davis, and A. Bobick, “Real-time closed-world tracking,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 1997, pp. 697–703.
[42] S. Julier and J. Uhlmann, “A new extension of the Kalman filter to nonlinear systems,” in Proc. SPIE, volume 3068, 1997, pp. 182–193.
[43] T. Kailath, “The divergence and Bhattacharyya distance measures in signal selection,” IEEE Trans. Commun. Tech., vol. 15, pp. 52–60, 1967.
[44] V. Kettnaker and R. Zabih, “Bayesian multi-camera surveillance,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Fort Collins, CO, 1999, pp. 253–259.
[45] G. Kitagawa, “Non-Gaussian state-space modeling of nonstationary time series,” J. of Amer. Stat. Assoc., vol. 82, pp. 1032–1063, 1987.
[46] S. Konishi, A. Yuille, J. Coughlan, and S. Zhu, “Fundamental bounds on edge detection: An information theoretic evaluation of different edge cues,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Fort Collins, 1999, pp. 573–579.
[47] J. Krumm, S. Harris, B. Meyers, B. Brumitt, M. Hale, and S. Shafer, “Multi-camera multi-person tracking for EasyLiving,” in Proc. IEEE Intl. Workshop on Visual Surveillance, Dublin, Ireland, 2000, pp. 3–10.
[48] B. Li and R. Chellappa, “Simultaneous tracking and verification via sequential posterior estimation,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Hilton Head, SC, volume II, 2000, pp. 110–117.
[52] R. Mahler, “Engineering statistics for multi-object tracking,” in Proc. IEEE Workshop Multi-Object Tracking, 2001.
[53] S. McKenna, Y. Raja, and S. Gong, “Tracking colour objects using adaptive mixture models,” Image and Vision Computing Journal, vol. 17, pp. 223–229, 1999.
[54] R. Merwe, A. Doucet, N. Freitas, and E. Wan, “The unscented particle filter,” Technical Report CUED/F-INFENG/TR380, Cambridge University Engineering Department, 2000.
[55] K. Nickels and S. Hutchinson, “Estimating uncertainty in SSD-based feature tracking,” Image and Vision Computing, vol. 20, pp. 47–58, 2002.
[56] C. Olson, “Image registration by aligning entropies,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, volume II, 2001, pp. 331–336.
[57] P. Perez, C. Hue, J. Vermaak, and M. Gangnet, “Color-based probabilistic tracking,” in Proc. European Conf. on Computer Vision, Copenhagen, Denmark, volume I, 2002, pp. 661–675.
[58] W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical Recipes in C. Cambridge University Press, second edition, 1992.
[63] D. Reid, “An algorithm for tracking multiple targets,” IEEE Trans. Automatic Control, vol. AC-24, pp. 843–854, 1979.
[64] A. Roche, G. Malandain, and N. Ayache, “Unifying maximum likelihood approaches in medical image registration,” Technical Report 3741, INRIA, 1999.
[65] R. Rosales and S. Sclaroff, “3D trajectory recovery for tracking multiple objects and trajectory guided recognition of actions,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Fort Collins, CO, 1999, pp. 117–123.
[66] Y. Rui and Y. Chen, “Better proposal distributions: Object tracking using unscented particle filter,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, volume II, 2001, pp. 786–793.
[67] S. Sclaroff and J. Isidoro, “Active blobs,” in Proc. 6th Intl. Conf. on Computer Vision, Bombay, India, 1998, pp. 1146–1153.
[68] D. W. Scott, Multivariate Density Estimation. Wiley, 1992.
[69] C. Sminchisescu and B. Triggs, “Covariance scaled sampling for monocular 3D body tracking,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, volume I, 2001, pp. 447–454.
[74] R. Wildes, R. Kumar, H. Sawhney, S. Samarasekera, S. Hsu, H. Tao, Y. Guo, K. Hanna, A. Pope, D. Hirvonen, M. Hansen, and P. Burt, “Aerial video surveillance and exploitation,” Proceedings of the IEEE, vol. 89, no. 10, pp. 1518–1539, 2001.
[75] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, “Pfinder: Real-time tracking of the human body,” IEEE Trans. Pattern Anal. Machine Intell., vol. 19, pp. 780–785, 1997.
[76] Y. Wu and T. Huang, “A co-inference approach to robust tracking,” in Proc. 8th Intl. Conf. on Computer Vision, Vancouver, Canada, volume II, 2001, pp. 26–33.
[77] F. Xu and K. Fujimura, “Pedestrian detection and tracking with night vision,” in Proc. IEEE Intelligent Vehicle Symposium, Versailles, France, 2002.
[78] A. Yilmaz, K. Shafique, N. Lobo, X. Li, T. Olson, and M. Shah, “Target tracking in FLIR imagery using mean shift and global motion compensation,” in IEEE Workshop on Computer Vision Beyond Visible Spectrum, Kauai, Hawaii, 2001.
[79] “Real-time tracking of non-rigid objects using mean shift.” US patent pending, 2000.