Feature and object tracking algorithms for video tracking
Student: Oren Shevach
Instructor: Arie Nakhmani
Dec 28, 2015
Overview
• Given a video sequence, the goal is to track the objects in the video and to overcome occlusions.
• The camera can be stationary or moving.
• Movement between frames:
▫ Translations: movement along the x and y axes.
▫ Affine transformations: rotations, scaling.
Overview
• Project goal: study and understand two tracking algorithms:
▫ KLT: feature tracking
▫ GLOMO: object learning
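Kanade Lucas Tomasi-KLT
• Basic feature tracking algorithm.
• Good feature: consider small rectangular windows all over the image; a good feature is a window that can be tracked easily through a sequence of images.
• Feature movement:
$$J(AX + d) = I(X)$$
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad d = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}, \qquad X = \begin{bmatrix} x \\ y \end{bmatrix}$$
where $I$ is the current image, $J$ is the next image, $X$ is the current pixel, $A$ is the affine transform matrix and $d$ is the translation vector.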
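As a rough illustration of the feature movement model, the sketch below maps window coordinates through X → AX + d. The function name `affine_move` and the example values are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def affine_move(points, A, d):
    """Map pixel coordinates X -> A X + d for an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=float)
    return points @ np.asarray(A, dtype=float).T + np.asarray(d, dtype=float)

# Corners of a small window centred on the origin (hypothetical example values).
window = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])

# Pure translation: A is the identity, only d moves the window.
translated = affine_move(window, np.eye(2), [2.0, 0.5])

# Rotation by 90 degrees combined with uniform scaling by 2.
theta = np.pi / 2
A = 2.0 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
rotated_scaled = affine_move(window, A, [0.0, 0.0])
```

With A equal to the identity the model reduces to a pure translation; rotation and scaling enter only through A.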
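Kanade Lucas Tomasi-KLT
• Tracking goal: find the parameters $A$ and $d$ that minimize the dissimilarity between the current and the next image in the sequence:
$$\epsilon = \iint_W \left[ J(AX + d) - I(X) \right]^2 w(X)\, dX$$
where $\epsilon$ is the dissimilarity and $w(X)$ is the weighting function, usually 1.
• To find the minimum, we set the first derivative of the dissimilarity to zero.
• Taylor expansion of the next image, assuming small movements:
$$J(AX + d) \approx J(X) + g^T \left( (A - \mathbb{1}) X + d \right), \qquad g = \begin{bmatrix} g_x \\ g_y \end{bmatrix} = \begin{bmatrix} \partial J / \partial x \\ \partial J / \partial y \end{bmatrix}$$
where $g$ is the gradient vector and $\mathbb{1}$ is the 2×2 identity matrix.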
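The dissimilarity can be evaluated numerically as a sum over a discrete window. The following is a minimal sketch under several assumptions that are not in the slides: w(X) = 1, bilinear interpolation for non-integer positions, X measured relative to the window centre, and a synthetic test image. The helper names `bilinear` and `dissimilarity` are hypothetical.

```python
import numpy as np

def bilinear(img, xs, ys):
    """Sample img at real-valued (x, y) positions; coordinates are clamped to the image."""
    h, w = img.shape
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.minimum(np.floor(xs).astype(int), w - 2)
    y0 = np.minimum(np.floor(ys).astype(int), h - 2)
    fx, fy = xs - x0, ys - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x0 + 1]
    bottom = (1 - fx) * img[y0 + 1, x0] + fx * img[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def dissimilarity(I, J, A, d, center, half=3):
    """Sum over a (2*half+1)^2 window of [J(AX + d) - I(X)]^2, with w(X) = 1."""
    cx, cy = center
    offs = np.arange(-half, half + 1, dtype=float)
    X, Y = np.meshgrid(offs, offs)            # window coordinates, origin at the centre
    Xw = A[0, 0] * X + A[0, 1] * Y + d[0]     # first row of AX + d
    Yw = A[1, 0] * X + A[1, 1] * Y + d[1]     # second row of AX + d
    diff = bilinear(J, cx + Xw, cy + Yw) - bilinear(I, cx + X, cy + Y)
    return float(np.sum(diff ** 2))

# Synthetic smooth image and a copy shifted by 2 pixels along x (hypothetical data).
yy, xx = np.mgrid[0:40, 0:40].astype(float)
I = np.sin(xx / 4.0) + np.cos(yy / 5.0)
J = np.sin((xx - 2.0) / 4.0) + np.cos(yy / 5.0)   # so that J(x + 2, y) = I(x, y)

eps_wrong = dissimilarity(I, J, np.eye(2), np.array([0.0, 0.0]), center=(20, 20))
eps_right = dissimilarity(I, J, np.eye(2), np.array([2.0, 0.0]), center=(20, 20))
```

In this toy case the dissimilarity vanishes for the true translation d = (2, 0) and is positive for d = (0, 0), which is the behaviour the minimization relies on.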
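Kanade Lucas Tomasi-KLT
• We receive the equation to solve:
$$T z = a$$
• $T$ consists of the gradients and the pixel locations:
$$T = \iint_W \begin{bmatrix}
x^2 g_x^2 & x^2 g_x g_y & xy\, g_x^2 & xy\, g_x g_y & x g_x^2 & x g_x g_y \\
x^2 g_x g_y & x^2 g_y^2 & xy\, g_x g_y & xy\, g_y^2 & x g_x g_y & x g_y^2 \\
xy\, g_x^2 & xy\, g_x g_y & y^2 g_x^2 & y^2 g_x g_y & y g_x^2 & y g_x g_y \\
xy\, g_x g_y & xy\, g_y^2 & y^2 g_x g_y & y^2 g_y^2 & y g_x g_y & y g_y^2 \\
x g_x^2 & x g_x g_y & y g_x^2 & y g_x g_y & g_x^2 & g_x g_y \\
x g_x g_y & x g_y^2 & y g_x g_y & y g_y^2 & g_x g_y & g_y^2
\end{bmatrix} w\, dX$$
• $z$ holds the movement parameters:
$$z = \begin{bmatrix} a_{11} - 1 & a_{21} & a_{12} & a_{22} - 1 & d_1 & d_2 \end{bmatrix}^T$$
• $a$ is the error vector:
$$a = \iint_W \left[ I(X) - J(X) \right] \begin{bmatrix} x g_x & x g_y & y g_x & y g_y & g_x & g_y \end{bmatrix}^T w\, dX$$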
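The linear system T z = a can be assembled directly from the gradients and the pixel coordinates, since T is the sum over the window of v vᵀ with v = [x g_x, x g_y, y g_x, y g_y, g_x, g_y]ᵀ. The sketch below is one possible discretisation, assuming w = 1, coordinates measured from the window centre, and finite-difference gradients from `np.gradient`; the function name `solve_affine_step` and the synthetic `texture` are hypothetical.

```python
import numpy as np

def solve_affine_step(I_win, J_win, gx, gy, X, Y):
    """Build T and a over one window (w = 1) and solve T z = a.

    I_win, J_win: current and next image inside the window.
    gx, gy: gradients of J inside the window.
    X, Y: pixel coordinates relative to the window centre.
    Returns z = [a11 - 1, a21, a12, a22 - 1, d1, d2].
    """
    v = np.stack([X * gx, X * gy, Y * gx, Y * gy, gx, gy], axis=-1).reshape(-1, 6)
    err = (I_win - J_win).reshape(-1)
    T = v.T @ v          # sum over the window of v v^T
    a = v.T @ err        # sum over the window of [I - J] v
    return np.linalg.solve(T, a)

# Hypothetical demo: a smooth synthetic texture with a small known motion.
def texture(x, y):
    return np.sin(0.5 * x) * np.cos(0.3 * y) + 0.5 * np.sin(0.2 * x + 0.4 * y)

offs = np.arange(-7, 8, dtype=float)
X, Y = np.meshgrid(offs, offs)
A_true = np.array([[1.002, 0.001], [-0.001, 0.998]])
d_true = np.array([0.03, -0.02])

J_win = texture(X, Y)
# I(X) = J(AX + d), evaluated analytically so no interpolation is needed.
I_win = texture(A_true[0, 0] * X + A_true[0, 1] * Y + d_true[0],
                A_true[1, 0] * X + A_true[1, 1] * Y + d_true[1])
gy, gx = np.gradient(J_win)   # np.gradient returns the row (y) derivative first

z = solve_affine_step(I_win, J_win, gx, gy, X, Y)
```

The result `z` can be compared against `A_true - np.eye(2)` and `d_true`; because of the first-order Taylor approximation and the finite-difference gradients, a single step gives only an estimate, which is why the algorithm iterates.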
Kanade Lucas Tomasi-KLT
• The solution is iterative:
• Initialization: $A_0 = \mathbb{1}$ (identity matrix), $d_0 = (0, 0)^T$, $J_0 = J(X)$
• Iteration step (continue until convergence):
• Calculate the matrix $T$ and the vector $a$
• Obtain the parameters from $z$
• Update: $A_i, d_i \rightarrow J_{i+1} = J_i(A_i X + d_i)$
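The iteration can be sketched for the simplest case. The example below is a simplification rather than the full scheme from the slides: A is fixed to the identity so that only the translation d is estimated, and the next image is an analytic function (a stand-in for interpolating a real frame). The names `track_translation` and `texture` are hypothetical.

```python
import numpy as np

def track_translation(I_win, J_func, X, Y, max_iter=20, tol=1e-6):
    """Iterate the KLT step for a translation-only model (A fixed to the identity).

    J_func(x, y) returns the next image at real-valued coordinates.
    """
    d = np.zeros(2)                                # initialization: d_0 = (0, 0)
    for _ in range(max_iter):
        J_win = J_func(X + d[0], Y + d[1])         # next image warped by the current d
        gy, gx = np.gradient(J_win)                # gradients of the warped window
        v = np.stack([gx, gy], axis=-1).reshape(-1, 2)
        T = v.T @ v                                # 2x2 translation block of T
        a = v.T @ (I_win - J_win).reshape(-1)      # error vector
        step = np.linalg.solve(T, a)
        d = d + step                               # update the parameters
        if np.linalg.norm(step) < tol:             # stop at convergence
            break
    return d

# Hypothetical demo: a synthetic texture moved by a known sub-pixel translation.
def texture(x, y):
    return np.sin(0.5 * x) * np.cos(0.3 * y) + 0.5 * np.sin(0.2 * x + 0.4 * y)

offs = np.arange(-7, 8, dtype=float)
X, Y = np.meshgrid(offs, offs)
d_true = np.array([0.8, -0.5])
I_win = texture(X + d_true[0], Y + d_true[1])      # I(X) = J(X + d_true)

d_est = track_translation(I_win, texture, X, Y)
```

Each pass re-linearizes around the current estimate, so the residual I − J shrinks as d approaches the true displacement.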
KLT-Results
[Figures, three slides: video frames overlaid with the tracked features, numbered 1 to 150.]
GLOMO-Greedy Unsupervised Learning Of Multiple Objects
• Tracks all objects in the sequence and the background.
• Algorithm output (for one object):
▫ $f_l$: the object
▫ $\pi_l$: object mask
▫ $j_b$: background transformation
▫ $j_l$: object transformation
▫ $b$: background
▫ $\sigma_f^2$: object variance
▫ $\sigma_b^2$: background variance
Tracking objects from images
• Given a sequence of frames, in each frame the object goes through a transformation and may be noisy.
• We assume there are $J$ possible transformations.
• The object parameters are tracked using EM, an iterative procedure that computes the maximum likelihood estimate in the presence of missing or hidden data.
• Object density:
$$p_f(x_i; f_i) = N(x_i; f_i, \sigma_f^2)$$
Tracking objects from images
• EM example (for one object with a static background):
▫ Expectation: given the current parameters, find the weight of each possible transformation:
$$p(j \mid x^n) = \frac{P_j\, N(x^n; T_j f, \sigma_f^2)}{\sum_{k=1}^{J} P_k\, N(x^n; T_k f, \sigma_f^2)}$$
▫ Maximization: update the parameters:
$$f = \frac{1}{N} \sum_{n=1}^{N} \sum_{j=1}^{J} p(j \mid x^n)\, T_j^{-1} x^n$$
$$\sigma_f^2 = \frac{1}{NP} \sum_{n=1}^{N} \sum_{j=1}^{J} p(j \mid x^n) \left\| x^n - T_j f \right\|^2$$
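The E and M steps above can be tried on a toy problem. The sketch below makes several assumptions that are not in the slides: frames are 1-D signals of P pixels, the J transformations T_j are cyclic shifts (so T_j⁻¹ is the opposite shift), the priors P_j are kept uniform, and there is no background term. The function name `em_one_object` and the data are hypothetical.

```python
import numpy as np

def em_one_object(frames, n_iter=20, seed=0):
    """EM for one object seen under J = P cyclic shifts.

    frames: (N, P) array, each row a 1-D "image" of P pixels.
    Returns the object f, the variance, and the responsibilities p(j | x^n).
    """
    N, P = frames.shape
    rng = np.random.default_rng(seed)
    f = frames.mean(axis=0) + 0.01 * rng.standard_normal(P)   # initial object
    var = frames.var() + 1e-6                                  # initial variance
    prior = np.full(P, 1.0 / P)                                # P_j, kept uniform
    for _ in range(n_iter):
        # E-step: p(j | x^n) is proportional to P_j N(x^n; T_j f, var)
        shifted = np.stack([np.roll(f, j) for j in range(P)])            # (J, P)
        sq = ((frames[:, None, :] - shifted[None, :, :]) ** 2).sum(-1)   # (N, J)
        log_post = np.log(prior) - 0.5 * sq / var
        log_post -= log_post.max(axis=1, keepdims=True)                  # stabilise
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: f = (1/N) sum_n sum_j p(j | x^n) T_j^{-1} x^n
        f = np.zeros(P)
        for j in range(P):
            f += post[:, j] @ np.roll(frames, -j, axis=1)
        f /= N
        # M-step: var = (1/(N P)) sum_n sum_j p(j | x^n) |x^n - T_j f|^2
        shifted = np.stack([np.roll(f, j) for j in range(P)])
        sq = ((frames[:, None, :] - shifted[None, :, :]) ** 2).sum(-1)
        var = (post * sq).sum() / (N * P) + 1e-12
    return f, var, post

# Hypothetical data: a 1-D bump shown at random cyclic shifts with small noise.
rng = np.random.default_rng(1)
template = np.exp(-0.5 * ((np.arange(16) - 5.0) / 1.5) ** 2)
shifts = rng.integers(0, 16, size=30)
frames = np.stack([np.roll(template, s) for s in shifts])
frames = frames + 0.05 * rng.standard_normal(frames.shape)

f_hat, var_hat, post = em_one_object(frames)
```

The Gaussian normalisation constant is the same for every j, so it cancels in the E-step and only the squared distances are needed. The recovered `f_hat` is defined only up to a cyclic shift, since shifting the object and all transformations together leaves the likelihood unchanged.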
Tracking objects from images-problems
• The background, the objects and all the parameters are found together in the EM iterations.
• For more than one object and a moving background, the complexity is too high.
• The probability model doesn't handle object occlusion.
A more efficient approach is needed!
The Algorithm-GLOMO
• First find the background, and then each object separately.
• New probability density model: each pixel is either part of the object/background (Gaussian term) or explained by a uniform distribution (anything else):
$$p_f(x_i; f_i) = \alpha_f N(x_i; f_i, \sigma_f^2) + (1 - \alpha_f)\, U(x_i)$$
where $\alpha_f$ is the fraction of time the pixel is not occluded.
• Tracking is done on the relevant pixels only, to speed it up.
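To see how the uniform term makes the model tolerant of occlusion, the sketch below evaluates the mixture α_f N(x; f, σ_f²) + (1 − α_f) U(x) for a single pixel. It assumes pixel values in [0, 1], so U is uniform on that range; the function names and the numeric values are hypothetical.

```python
import numpy as np

def robust_pixel_density(x, f, var_f, alpha_f, lo=0.0, hi=1.0):
    """alpha_f * N(x; f, var_f) + (1 - alpha_f) * U(x), with U uniform on [lo, hi]."""
    x = np.asarray(x, dtype=float)
    gauss = np.exp(-0.5 * (x - f) ** 2 / var_f) / np.sqrt(2.0 * np.pi * var_f)
    uniform = np.where((x >= lo) & (x <= hi), 1.0 / (hi - lo), 0.0)
    return alpha_f * gauss + (1.0 - alpha_f) * uniform

def object_responsibility(x, f, var_f, alpha_f, lo=0.0, hi=1.0):
    """Posterior probability that the pixel came from the Gaussian (object) term."""
    x = np.asarray(x, dtype=float)
    gauss = alpha_f * np.exp(-0.5 * (x - f) ** 2 / var_f) / np.sqrt(2.0 * np.pi * var_f)
    return gauss / robust_pixel_density(x, f, var_f, alpha_f, lo, hi)

# A pixel that matches the object value, and one far from it (e.g. occluded).
p_match = robust_pixel_density(0.1, f=0.1, var_f=0.01, alpha_f=0.8)
p_occluded = robust_pixel_density(0.9, f=0.1, var_f=0.01, alpha_f=0.8)
r_match = object_responsibility(0.1, f=0.1, var_f=0.01, alpha_f=0.8)
r_occluded = object_responsibility(0.9, f=0.1, var_f=0.01, alpha_f=0.8)
```

A pixel far from the object value keeps a density of about 1 − α_f instead of almost zero, so a few occluded pixels no longer dominate the likelihood of a transformation.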
The Algorithm-GLOMO
• Algorithm steps:
▫ The user defines the number of objects to find and the number of EM iterations.
▫ Find the background and its transformations, assuming all masks are zero.
▫ Define a vector $z$ that contains the relevant pixels, and initialize it: $z_0^n = \mathbf{1}$
▫ For each object, find the object parameters and transformations by applying EM for that object.
The Algorithm-GLOMO
▫ After tracking object $l$, update the vector $z$ with the object pixels: $z_{l+1}^n$ is computed from $z_l^n$, the object transformation $T_{j_l^n}$ and $r_l^n$, where a value of 1 means the pixel is part of the object.
GLOMO-Results
Example:
[Figure: Mask and Foreground*Mask panels, and the Background]
Reconstruction:
[Figure: reconstructed frames, original ordering and new ordering]
GLOMO-Results
Example: moving background
[Figure: Mask, Foreground*Mask and Background]
GLOMO-Results
Example: change in the number of frames
[Figure: Mask, Foreground*Mask and Background for 20 frames and for 60 frames]
GLOMO-Results
Example: change in the number of frames
[Figure: Mask, Foreground*Mask and Background for 110 frames]
GLOMO-Results
Example: change in the number of EM iterations, for 20 frames
[Figure: Mask, Foreground*Mask and Background for 70 iterations and for 300 iterations]
Conclusions
• Both algorithms worked well on high-quality pictures with large, well-defined objects.
• On low-quality pictures with less defined objects, GLOMO did not recognize the objects well and KLT lost all the features very quickly.
• Both algorithms handled a moving camera and a changing background well.
• KLT doesn't recover from occlusions, while GLOMO handles them very well.
• GLOMO doesn't work well on 20 frames but works well on more than 100 frames, so it can't run in real-time systems, whereas KLT can.