Seminars, XXIII cycle
Tracking in 3D video streams
Ing. Samuele Salti
Tutors: Prof. Tullio Salmon Cinotti, Prof. Luigi Di Stefano
The Tracking problem
• Detection
– Object model, track initiation, track termination, …
• Tracking
– Object motion model, model update, …
• Multi-target tracking / Data association
– Occlusion handling, combinatorial problem (exponential complexity with growing number of targets), …
2D Tracking
• State-of-the-art performance in 2D videos
• Main idea: Tracking-by-Detection
– “Reliable” detector used in every frame: Implicit Shape Model (ISM), Histogram of Oriented Gradients (HOG), etc.
– Tracking reformulated as data association across frames
• Limitations
– People pose
– Occlusions & clutter
– Illumination changes
– Output is 2D
Leibe & al., IJCV 08; Breitenstein & al., ICCV 09
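To make "tracking as data association across frames" concrete, here is a minimal sketch of one simple association strategy: greedy nearest-neighbour matching between existing track positions and the detections of the current frame. It is only an illustration, not the method of the cited works; the function name `associate` and the gating threshold `max_dist` are assumptions made for this example.

```python
import math

def associate(tracks, detections, max_dist=50.0):
    """Greedily match track positions to detections by Euclidean distance.

    tracks, detections: lists of (x, y) image positions.
    max_dist: hypothetical gating threshold in pixels; farther pairs are ignored.
    Returns (matches, unmatched_tracks, unmatched_detections), where matches
    is a list of (track_index, detection_index) pairs.
    """
    # Collect every candidate pair inside the gate, closest pairs first.
    pairs = sorted(
        (math.dist(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
        if math.dist(t, d) <= max_dist
    )
    matches, used_tracks, used_dets = [], set(), set()
    for _, ti, di in pairs:
        # Each track and each detection can be used at most once.
        if ti not in used_tracks and di not in used_dets:
            matches.append((ti, di))
            used_tracks.add(ti)
            used_dets.add(di)
    unmatched_tracks = [i for i in range(len(tracks)) if i not in used_tracks]
    unmatched_dets = [i for i in range(len(detections)) if i not in used_dets]
    return matches, unmatched_tracks, unmatched_dets
```

Unmatched detections would typically start new tracks and unmatched tracks would be candidates for termination. A greedy pass is cheap but not globally optimal, which is one reason data association becomes a combinatorial problem as the number of targets grows.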
Why not just one image?
• From a single view it is not possible to unambiguously reconstruct the 3D structure of the scene
• This is due to perspective projection, which maps points of 3D space onto a 2D space (the image plane of the camera), so depth along each viewing ray is lost
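A small sketch of why one image is ambiguous, assuming an ideal pinhole camera with the optical axis along Z (the function name `project` and the focal length value are choices made for this example): two different 3D points on the same viewing ray land on the same image point.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z), with Z > 0, onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same viewing ray as `near`, twice as far from the camera

# Both points project to the same image coordinates, so depth cannot be
# recovered from this single view.
same_pixel = project(near) == project(far)
```

A second view, or a depth sensor, provides the extra constraint needed to tell `near` and `far` apart.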
3D acquisition devices
3D data and previous work
• Most exploited approach
– Camera calibrated with respect to the ground plane
– People “detected” with background subtraction
– 2D projection of 3D data
– Tracking in 2D plan view
• Limitations
– Assumes a static camera
– Requires a background model
– Requires calibration
– Bottom-up approach
Beymer & Konolige 2000; Iocchi & Bolles, ICIP 2005; Harville & Li, CVPR 04; Yous & al., ECCV WS 2008
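The plan-view idea can be sketched as follows: 3D points are dropped onto the ground plane and counted per cell, giving a top-down occupancy map in which people appear as compact blobs. This is a generic illustration rather than the exact procedure of any of the cited works. It assumes the points are already expressed in a ground-aligned frame (X lateral, Y height, Z depth), which is what the calibration step provides; the function name `plan_view`, the cell size, and the ranges are hypothetical.

```python
def plan_view(points, cell=0.1, x_range=(-2.0, 2.0), z_range=(0.0, 4.0)):
    """Accumulate 3D points (X, Y, Z) into a ground-plane occupancy grid.

    The height Y is discarded; each grid cell counts the points whose
    (X, Z) ground position falls inside it. Units are assumed to be metres.
    """
    cols = round((x_range[1] - x_range[0]) / cell)
    rows = round((z_range[1] - z_range[0]) / cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, _height, z in points:
        col = int((x - x_range[0]) // cell)
        row = int((z - z_range[0]) // cell)
        # Points outside the monitored area are ignored.
        if 0 <= row < rows and 0 <= col < cols:
            grid[row][col] += 1
    return grid
```

Tracking then operates on peaks of this 2D map, which explains why the approach depends on calibration and on a reliable foreground segmentation.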
My contribution
• Design an enhanced people detector, exploiting the full potential of 3D data
• Toward this goal
– Propose a new 3D descriptor of local shape suitable for our task
– Design a theoretically sound and adaptive way to merge 2D and 3D information for the purpose of people detection (i.e. object category recognition)
• Plug this into a tracking framework conceived for time-critical, online applications
– No global optimization
– More emphasis on tracking than on data association
– Recursive Bayesian Estimation (RBE) methods
• Enhance RBE via machine learning
3D shape descriptor
• Our proposal, dubbed HON: Histogram of Normals
• Designed to be
– Fast
– Robust to noise and clutter
– Robust to sampling density variations
• Definition of a new, robust way to compute an invariant local reference frame
• Inspired by successful approaches to 2D texture description (Lowe, IJCV 04)
[Figure: label cos θ]
HON: Results on noise and clutter
[Plot: recall vs. 1-precision]
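For readers unfamiliar with the plot axes, the sketch below computes recall and 1-precision under the definitions commonly used when evaluating local descriptors by feature matching. The slides do not state the exact protocol, so these definitions and the function name are assumptions.

```python
def recall_and_one_minus_precision(correct_matches, false_matches, correspondences):
    """Return (recall, 1-precision) for one matching threshold.

    correct_matches: matches that agree with the ground truth
    false_matches:   matches that do not
    correspondences: total number of ground-truth correspondences
    """
    total_matches = correct_matches + false_matches
    recall = correct_matches / correspondences if correspondences else 0.0
    one_minus_precision = false_matches / total_matches if total_matches else 0.0
    return recall, one_minus_precision
```

Sweeping the matching threshold and plotting the resulting pairs gives one curve per descriptor; a better descriptor reaches higher recall at lower 1-precision.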
HON: Results on sampling density
[Plot: recall vs. 1-precision]
Recursive Bayesian Estimation
• RBE provides a theoretically sound conceptual solution to the problem of state estimation in the presence of uncertainty.
• RBE is widely employed in the context of Visual Tracking and Motion Analysis.
• In this framework the system is completely specified by a first-order Markov model composed of
– a transition model in state space
– a measurement model
– an initial state
• Practical instantiations
– the Kalman filter (linear & Gaussian scenario, optimal solution)
– the particle filter (non-linear / non-Gaussian scenario, sub-optimal solution)
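$$\mathbf{x}_k = f_k(\mathbf{x}_{k-1}, \upsilon_k) \;\Rightarrow\; p(\mathbf{x}_k \mid \mathbf{x}_{k-1})$$
$$\mathbf{z}_k = h_k(\mathbf{x}_k, \eta_k) \;\Rightarrow\; p(\mathbf{z}_k \mid \mathbf{x}_k)$$
$$\mathbf{x}_0 \;\Rightarrow\; p(\mathbf{x}_0)$$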
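As an illustration of the linear & Gaussian instantiation, here is a minimal sketch of one predict/update cycle of a scalar Kalman filter. The coefficient names `f`, `q`, `h`, `r` and their default values are hypothetical and chosen only for this example; a real tracker would use vector states and matrices.

```python
def kalman_step(x, p, z, f=1.0, q=0.01, h=1.0, r=0.25):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : current measurement
    f, q : transition coefficient and process-noise variance (hypothetical values)
    h, r : measurement coefficient and measurement-noise variance (hypothetical values)
    """
    # Predict: propagate the estimate through the transition model.
    x_pred = f * x
    p_pred = f * p * f + q
    # Update: correct the prediction with the new measurement.
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new

# Usage: filter a short sequence of noisy measurements of a constant value.
x, p = 0.0, 1.0
for z in [1.1, 0.9, 1.05, 0.98]:
    x, p = kalman_step(x, p, z)
```

Note that `f` and `q` must be chosen before filtering starts, which is exactly the a priori specification of the transition model required by RBE.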
Motivations
• A major limitation of RBE is the requirement to specify the transition model a priori.
• In most cases this model is unknown, and it is either selected empirically from a restricted set of standard models or learned off-line.
• Neither approach allows the transition model to change over time, although this would be beneficial and neither the conceptual solution nor the solving algorithms require the model to stay fixed.
Proposal
• In case of a completely observable system, we propose to learn the transition model on-line.
• In such a case, the transition model is directly related to the dynamics exhibited by the measurements. Hence, it is possible to exploit their temporal evolution in order to learn the function $f_k(\mathbf{x}_{k-1}, \upsilon_k; \mathbf{z}_{1:k-1})$ and, implicitly, the PDF $p(\mathbf{x}_k \mid \mathbf{x}_{k-1}, \mathbf{z}_{1:k-1})$.
• Furthermore, we propose to learn the motion model using Support Vector Machines in ε-regression mode (SVR)
– SVR theoretical properties reduce the risk of overfitting
– SVR can learn non-linear mappings effectively via the kernel trick
– SVR can be trained very efficiently exploiting SMO (Sequential Minimal Optimization)
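The idea of learning the transition model from the temporal evolution of the measurements can be sketched in a few lines. To keep the example dependency-free, ordinary least squares stands in for SVR here, so this is not the proposed method, only the same learning setup with a simpler regressor. It assumes a scalar, fully observable state (so each measurement is a direct observation of the state); the function name `fit_transition` is hypothetical.

```python
def fit_transition(measurements):
    """Fit z_k ≈ f * z_{k-1} on a window of past measurements.

    Returns (f, q): the least-squares transition coefficient and the mean
    squared residual, used here as an estimate of the process-noise variance.
    NOTE: least squares replaces SVR purely for illustration.
    """
    prev = measurements[:-1]
    curr = measurements[1:]
    denom = sum(a * a for a in prev)
    if not prev or denom == 0.0:
        raise ValueError("need at least two measurements with non-zero energy")
    f = sum(a * b for a, b in zip(prev, curr)) / denom
    residuals = [b - f * a for a, b in zip(prev, curr)]
    q = sum(e * e for e in residuals) / len(residuals)
    return f, q
```

Calling this on a sliding window of recent measurements at every time step yields a transition model that adapts as the target's motion changes.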
Support Vector Kalman
• In the linear & Gaussian scenario, RBE reduces to the Kalman filter.
• In this case, the PDF we want to estimate becomes
$$p(\mathbf{x}_k \mid \mathbf{x}_{k-1}, \mathbf{z}_{1:k-1}) = N(\mathbf{x}_k; \mu_k, \Sigma_k) = N(\mathbf{x}_k; \mathbf{F}_k \mathbf{x}_{k-1}, \mathbf{Q}_k)$$
• Therefore, we use SVRs to estimate
– the transition matrix $\mathbf{F}_k$
– the associated noise covariance matrix $\mathbf{Q}_k$
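For reference, the transition matrix and noise covariance enter the Kalman filter only through the prediction step. In the standard formulation, with $\hat{\mathbf{x}}$ denoting the state estimate and $\mathbf{P}$ its covariance (notation introduced here, not taken from the slides), the step reads:

```latex
\begin{aligned}
\hat{\mathbf{x}}_{k \mid k-1} &= \mathbf{F}_k \, \hat{\mathbf{x}}_{k-1 \mid k-1} \\
\mathbf{P}_{k \mid k-1} &= \mathbf{F}_k \, \mathbf{P}_{k-1 \mid k-1} \, \mathbf{F}_k^{\top} + \mathbf{Q}_k
\end{aligned}
```

Replacing fixed values of $\mathbf{F}_k$ and $\mathbf{Q}_k$ with estimates refreshed at every step therefore leaves the update step of the filter unchanged.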
Simulations
Mean Shift Tracking
Future work
• Design an enhanced people detector, exploiting the full potential of 3D data
• Toward this goal
– Propose a new 3D descriptor of local shape suitable for our task
– Design a theoretically sound and adaptive way to merge 2D and 3D information for the purpose of people detection (i.e. object category recognition)
• Plug this into a tracking framework conceived for time-critical, online applications
– No global optimization
– More emphasis on tracking than on data association
– Recursive Bayesian Estimation (RBE) methods
• Enhance RBE via machine learning