Analyzing Human Behavior in Video Sequences
Juergen Gall
Analyzing Human Behavior
Videos
Low-level features, e.g., gradients, optical flow
Analyzing Human Behavior
Human Pose
09.10.2017 Juergen Gall – Institute of Computer Science III – Computer Vision Group
21 Actions from HMDB
928 clips, 33183 frames
HMDB51 (Kuehne et al., ICCV 2011)
Puppet Annotation
Joint-annotated HMDB (JHMDB)
[ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ]
[ http://jhmdb.is.tue.mpg.de ]
Study with Annotated Data (2013)
• Large potential gain from pose features
• Not achievable with existing 2D human pose estimation methods
[Figure: gain over the baseline when ground truth (GT) is given]
• Given puppet flow (low level): + ~11%
• Given puppet mask (mid level): + ~9%
• Given joint positions, i.e., pose features (high level): + ~20%
[ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ]
[ http://jhmdb.is.tue.mpg.de ]
CNNs for Pose Estimation
Stack CNNs:
[ S.-E. Wei et al. Convolutional Pose Machines. CVPR 2016 ]
Coupled Action Recognition and Pose
Estimation
[ U. Iqbal et al. Pose for Action – Action for Pose. FG 2017 ]
Pose Estimation in Videos
Video datasets for human pose in unconstrained videos
do not exist.
Unconstrained means:
• Publicly available content from the Internet (e.g.,
YouTube)
• Multiple persons in a video (no assumption about
position)
• Arbitrary number of visible joints (truncation and
occlusion)
• Large scale variations (unknown scale)
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Pose-Track Dataset
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Challenge ICCV 2017
[ http://posetrack.net/workshops/iccv2017 ]
Pose Track: Simultaneous Pose
Estimation and Tracking
Estimate pose + person association over time:
• Predict body joints (CNN trained on MPII Pose)
• Build a graph with temporal and spatial edges
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Pose Track: Simultaneous Pose
Estimation and Tracking
[Figure: frames f, f’ and f’’]
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Pose Track: Simultaneous Pose
Estimation and Tracking
Unaries: Confidences of detected joints
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Pose Track: Simultaneous Pose
Estimation and Tracking
Spatial binaries: Extract square bounding box around
detection
Two cases:
• Different joint type:
  • Logistic regression based on distance and orientation
• Same joint type:
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
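A minimal sketch of how a pairwise term for two detections of different joint types could be scored with logistic regression on distance and orientation. The feature layout (distance normalised by the box size, cosine and sine of the orientation), the weights and the helper names are assumptions made for illustration; they are not taken from the paper.

```python
import math

def pair_features(d1, d2, box_size):
    """Distance and orientation features for two detections given as (x, y)."""
    dx, dy = d2[0] - d1[0], d2[1] - d1[1]
    dist = math.hypot(dx, dy) / box_size      # distance relative to the box size
    angle = math.atan2(dy, dx)                # orientation of the connecting line
    return [dist, math.cos(angle), math.sin(angle)]

def logistic(features, weights, bias):
    """Probability that both detections belong to the same person."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters; in practice they would be learned from annotated pairs.
weights = [-2.0, 0.5, 1.5]
bias = 1.0

p_near = logistic(pair_features((100, 100), (100, 130), 60.0), weights, bias)
p_far = logistic(pair_features((100, 100), (400, 130), 60.0), weights, bias)
```

With these made-up weights, the nearby pair receives a high probability and the distant pair a low one.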
Pose Track: Simultaneous Pose
Estimation and Tracking
Temporal binaries: Compute optical flow (DeepMatching)
Logistic regression:
[Figure: frames f and f’]
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Pose Track: Simultaneous Pose
Estimation and Tracking
Solve integer linear program:
[Figure: frames f, f’ and f’’]
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Pose Track: Simultaneous Pose
Estimation and Tracking
To obtain plausible poses, constraints are added:
• Spatial transitivity:
• Temporal transitivity:
• Spatio-temporal transitivity:
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
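The exact inequalities from the slides are not recoverable here. As a hedged illustration, transitivity in graph-partitioning formulations is commonly written as x_ab + x_bc - 1 <= x_ac for binary edge variables: if a and b belong together and b and c belong together, then a and c must as well. The sketch below checks a given labeling against that generic form; the node names and the dictionary layout are hypothetical.

```python
from itertools import permutations

def violated_triples(nodes, x):
    """Return all ordered triples (a, b, c) with x_ab + x_bc - 1 > x_ac.

    x maps frozenset({u, v}) to 0 or 1; 1 means "u and v belong to the
    same person". Every pair of nodes must have an entry.
    """
    bad = []
    for a, b, c in permutations(nodes, 3):
        ab = x[frozenset((a, b))]
        bc = x[frozenset((b, c))]
        ac = x[frozenset((a, c))]
        if ab + bc - 1 > ac:
            bad.append((a, b, c))
    return bad

# Hypothetical detections: head and neck in frame f, neck in frame f'.
nodes = ["head_f", "neck_f", "neck_f1"]
consistent = {
    frozenset(("head_f", "neck_f")): 1,    # spatial edge
    frozenset(("neck_f", "neck_f1")): 1,   # temporal edge
    frozenset(("head_f", "neck_f1")): 1,   # spatio-temporal edge
}
inconsistent = dict(consistent)
inconsistent[frozenset(("head_f", "neck_f1"))] = 0

violations = violated_triples(nodes, inconsistent)
```

In an integer linear program these inequalities would be added as constraints rather than checked afterwards; the check only makes the condition concrete.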
Pose Track: Simultaneous Pose
Estimation and Tracking
To obtain plausible poses, constraints are added:
Spatio-temporal consistency:
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
Pose Track: Simultaneous Pose
Estimation and Tracking
Estimate pose + person association over time:
• Predict body joints (CNN trained on MPII Pose)
• Build a graph with temporal and spatial edges
• Partition spatio-temporal graph
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
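The three steps can be illustrated with a toy example. Note the swap: instead of the integer linear program from the slides, this sketch partitions the graph greedily with union-find, merging detections whose edge probability exceeds a threshold. That is a deliberately simpler stand-in, not the method of the paper; the detections, edge probabilities and threshold are invented.

```python
def find(parent, i):
    """Find the cluster representative of node i (with path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def partition(num_nodes, edges, threshold=0.5):
    """Greedy partition: merge along edges with probability above threshold."""
    parent = list(range(num_nodes))
    for i, j, p in sorted(edges, key=lambda e: -e[2]):
        if p > threshold:
            ri, rj = find(parent, i), find(parent, j)
            if ri != rj:
                parent[rj] = ri
    return [find(parent, i) for i in range(num_nodes)]

# Step 1 (assumed done): joint detections as (frame, joint type).
detections = [(0, "head"), (0, "neck"), (0, "head"), (1, "head"), (1, "head")]

# Step 2: edges (i, j, probability that i and j belong to the same person).
edges = [
    (0, 1, 0.9),  # spatial edge within frame 0
    (0, 2, 0.1),  # spatial edge between two heads in frame 0
    (1, 2, 0.2),  # spatial edge within frame 0
    (0, 3, 0.8),  # temporal edge, frame 0 to frame 1
    (2, 4, 0.7),  # temporal edge, frame 0 to frame 1
    (0, 4, 0.1),  # temporal edge
    (2, 3, 0.2),  # temporal edge
]

# Step 3: partition; each cluster label corresponds to one person track.
labels = partition(len(detections), edges)
```

Unlike the integer linear program, the greedy variant cannot enforce constraints jointly, so it may merge detections that a global solution would keep apart.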
Pose Track: Evaluation
• Pose estimation accuracy (mAP)
• Person association (MOTA)
[ U. Iqbal et al. Pose-Track: Joint Multi-Person
Pose Estimation and Tracking. CVPR 2017 ]
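For reference, MOTA in its standard CLEAR MOT form is 1 - (misses + false positives + identity switches) / number of ground-truth objects, summed over all frames. The sketch below computes it from per-frame counts; how these counts are obtained for body joints is not covered, and the numbers are invented.

```python
def mota(frames):
    """Compute MOTA from a list of per-frame counts.

    Each entry is (misses, false_positives, id_switches, num_ground_truth).
    """
    errors = sum(m + fp + sw for m, fp, sw, _ in frames)
    total_gt = sum(gt for _, _, _, gt in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    return 1.0 - errors / total_gt

# Hypothetical counts for three frames with 10 ground-truth joints each.
frames = [(1, 0, 0, 10), (0, 1, 1, 10), (2, 0, 0, 10)]
score = mota(frames)
```

Because the errors are summed before dividing, MOTA can become negative when a tracker makes more errors than there are ground-truth objects.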
Video Analysis for Studying the
Behavior of Mice
Recurrent Neural Networks
• Gated units (LSTM/GRU)
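As a brief illustration of a gated unit, the sketch below implements one GRU step for a scalar input and a scalar hidden state. Real networks use weight matrices and vectors; the parameter names and values are placeholders chosen for this example.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, p):
    """One GRU update for scalar input x and scalar hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])              # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])              # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * h_cand                              # gated interpolation

# Placeholder parameters; a trained network would learn these.
params = {"wz": 0.5, "uz": 0.1, "bz": 0.0,
          "wr": -0.3, "ur": 0.8, "br": 0.0,
          "wh": 1.2, "uh": 0.7, "bh": 0.0}

h = 0.0
for x in [0.5, -1.0, 0.25]:
    h = gru_step(x, h, params)
```

The update gate z decides how much of the previous state is kept, which is what lets gated units carry information across many frames.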
Weakly Supervised Learning
• Fully supervised:
• Weakly supervised (transcripts)
[ A. Richard et al. Weakly Supervised Action Learning with RNN
based Fine-to-Coarse Modeling. CVPR 2017 ]
Weakly Supervised Learning
• Represent an activity a like “spoon_powder” by latent
sub-activities s1(a), s2(a), s3(a), …
• Optimal number of sub-activities is unknown:
  • Many sub-activities for long activities
  • Few sub-activities for short activities
[Figure: sequence of sub-activities s1(a), s2(a), s3(a), s4(a), s5(a), s6(a)]
Model
• RNN with Gated Recurrent Units (GRU)
Model
• Hidden Markov Models (HMMs) enforce a fixed order of
sub-activities: s1(a), s2(a), s3(a), …
• HMMs use the probabilities of the RNN as input
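A minimal sketch of how a fixed state order can be enforced on frame-wise probabilities: a Viterbi alignment over a left-to-right model in which each frame either stays in the current sub-activity or moves to the next one. Transition probabilities and priors are ignored for simplicity, and the probabilities below are invented stand-ins for RNN outputs.

```python
import math

def align(log_probs):
    """Assign one state per frame; states are visited in order 0..S-1.

    log_probs[t][s] is the log-probability of state s at frame t.
    """
    T, S = len(log_probs), len(log_probs[0])
    if T < S:
        raise ValueError("need at least one frame per state")
    neg_inf = float("-inf")
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_probs[0][0]              # the sequence starts in state 0
    for t in range(1, T):
        for s in range(S):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else neg_inf
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            if best > neg_inf:
                score[t][s] = best + log_probs[t][s]
    path = [S - 1]                             # the sequence ends in the last state
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Invented frame-wise probabilities for two sub-activities over five frames.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8], [0.1, 0.9]]
log_probs = [[math.log(p) for p in row] for row in probs]
path = align(log_probs)
```

For this input the alignment keeps the first two frames in the first sub-activity and assigns the remaining three to the second.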
Model
• Hidden Markov Model (HMM) for each activity
Model
• The transcripts define the order of activities:
Weakly Supervised Learning
[ A. Richard et al. Weakly Supervised Action Learning with RNN
based Fine-to-Coarse Modeling. CVPR 2017 ]
Results
• Accuracy on unseen sequences (video without
transcript)
[ A. Richard et al. Weakly Supervised Action Learning with RNN
based Fine-to-Coarse Modeling. CVPR 2017 ]
Results
• Accuracy on unseen sequences (video with
transcript)
[ A. Richard et al. Weakly Supervised Action Learning with RNN
based Fine-to-Coarse Modeling. CVPR 2017 ]
Research Unit - Anticipating Human
Behavior
20.09.2016 Research Unit 2535 - Anticipating Human Behavior
[ https://pages.iai.uni-bonn.de/FOR2535 ]
Thank you for your attention.