Temporal Segmentation of Egocentric Videos Yair Poleg Chetan Arora Shmuel Peleg CVPR 2014 Presenter: Hsin-Ping Huang

Source: vision.cs.utexas.edu/381V-fall2017/slides/huang_paper.pdf · 2017. 10. 18.

Aug 18, 2021


Temporal Segmentation of Egocentric Videos

Yair Poleg Chetan Arora Shmuel Peleg

CVPR 2014

Presenter: Hsin-Ping Huang


Egocentric Video

• Browsing long unstructured videos is time consuming!

• Video

[Figure: Policeman; UN Inspectors in Syria; Google Glass]


Video credit: HUJI EgoSeg Dataset


Related Work

• Understanding Objects and Activities [Fathi et al., ICCV 2011] [Ryoo et al., CVPR 2013]

• Unsupervised Segmentation [Kitani et al., CVPR 2011]

– Clustering: no semantic meanings

• Hard to generalize

• Short-term: seconds; Long-term: minutes/hours


Related Work

Story-Driven Summarization

[Lu et al., CVPR 2013]


Contribution

• Segment the video temporally into a hierarchy of motion classes

• Detect fixation of wearer’s gaze


Difficulty

• Two sources of information

– Motion of the wearer

– Objects and activities

• Hard to find ego-motion

– Head rotation

– Depth variations

– Dynamic objects

[Figure: Feature Tracking and Optical Flow. Image credit: Voodoo Camera Tracker (top)]


Classification of Wearer’s Motion


Instantaneous Displacement (ID)

• Compute the ID at patches

[Figure: instantaneous displacement of one patch under forward motion; motion detector]


Cumulative Displacement (CD)

• Compute the CD by integrating the ID

[Figure: CD curves for patches right of focus and left of focus. Outside scene: expanding curve; inside scene: horizontal]
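The integration step can be sketched as a running sum. This is a minimal illustration, assuming each patch yields one (dx, dy) instantaneous displacement per frame; the function name and the sample values are hypothetical and do not come from the slides.

```python
from itertools import accumulate


def cumulative_displacement(instantaneous):
    """Integrate per-frame (dx, dy) displacements of one patch into a CD curve."""
    xs = accumulate(dx for dx, _ in instantaneous)
    ys = accumulate(dy for _, dy in instantaneous)
    return list(zip(xs, ys))


# Hypothetical patch: steady drift to the right with vertical jitter.
ids = [(1.0, 0.5), (1.0, -0.5), (1.0, 0.5), (1.0, -0.5)]
cd = cumulative_displacement(ids)
print(cd)
```

The jitter cancels in the running sum while the steady drift accumulates, which is why a long-term trend is easier to read from the CD curve than from individual displacements.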


Motion Vector and Radial Projection Response

• Compute motion vectors as the slopes of smoothed CDs

• Compute radial projection response

• Video

[Figure: focus of expansion; angle test "< φ ?"]
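A rough sketch of the first bullet: smooth one patch's CD curve and take its least-squares slope as the motion vector. The moving-average smoother, the window size, and all function names are assumptions. The slide does not give the formula for the radial projection response, so the last function is only an assumed stand-in that counts motion vectors lying within an angle φ of the outward radial direction from an assumed focus of expansion.

```python
import math


def moving_average(values, window):
    """Smooth a curve, keeping only positions fully covered by the window.

    Requires len(values) >= window.
    """
    half = window // 2
    return [sum(values[i - half:i + half + 1]) / (2 * half + 1)
            for i in range(half, len(values) - half)]


def slope(values):
    """Least-squares slope of values against frame index (needs >= 2 values)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def motion_vector(cd_curve, window=5):
    """Slope of the smoothed CD curve in x and y (assumed smoother and window)."""
    xs = moving_average([x for x, _ in cd_curve], window)
    ys = moving_average([y for _, y in cd_curve], window)
    return slope(xs), slope(ys)


def radial_alignment_fraction(patch_centers, vectors, focus, phi_deg=30.0):
    """Assumed stand-in: fraction of vectors within phi of the outward radial direction."""
    hits = 0
    for (px, py), (vx, vy) in zip(patch_centers, vectors):
        rx, ry = px - focus[0], py - focus[1]
        norm = math.hypot(rx, ry) * math.hypot(vx, vy)
        if norm == 0:
            continue  # patch at the focus or zero motion: no defined angle
        cos_angle = max(-1.0, min(1.0, (rx * vx + ry * vy) / norm))
        if math.degrees(math.acos(cos_angle)) < phi_deg:
            hits += 1
    return hits / len(vectors)


# Hypothetical CD curve drifting 2 px right and 1 px up per frame.
mv = motion_vector([(2.0 * t, -1.0 * t) for t in range(10)])
```

With four patches around the focus, purely outward vectors give a fraction of 1.0, while a uniform sideways shift aligns with only one of the four radial directions.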


Video credit: Shmuel Peleg


Motion Vector and Radial Projection Response

[Figure: comparison of Walking, Standing and Riding Bus. Rows: Instantaneous Displacement Vectors (Global Motion, Head Motion); Motion Vectors (labels: large radially outwards, mix, small); Radial Projection Response in the Outside Region (labels: low, high, low)]


Feature

• AVG of top/bottom 6% motion vectors

• DIFF of top/bottom 6% motion vectors

• AVG of motion vectors

• Motion vectors

• Number of successful flow computations

• AVG and SD of instantaneous displacements

• Radial projection response


Classifier

• Train SVM classifiers for each binary classification task in the proposed class hierarchy
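One way to picture classification with a hierarchy of binary classifiers is a tree walk in which each inner node makes a single two-way decision. The sketch below is illustrative only: the tree shape, the feature indices, and the thresholds are hypothetical, and plain threshold functions stand in for trained SVM decision functions.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Features = Sequence[float]


@dataclass
class Node:
    """Inner node: one binary classifier choosing between two branches."""
    decide: Callable[[Features], bool]  # True -> left branch, False -> right branch
    left: Union["Node", str]
    right: Union["Node", str]


def classify(node: Union[Node, str], x: Features) -> str:
    """Walk from the root to a leaf label, one binary decision per inner node."""
    while isinstance(node, Node):
        node = node.left if node.decide(x) else node.right
    return node


# Hypothetical two-level tree; each lambda stands in for a trained binary SVM.
tree = Node(
    decide=lambda x: x[0] < 0.5,
    left=Node(lambda x: x[1] < 0.2, "Sitting", "Standing"),
    right=Node(lambda x: x[2] > 0.5, "Walking", "Riding Bus"),
)

print(classify(tree, [0.1, 0.1, 0.0]))
```

Each classifier only has to separate the two groups below its node, so a frame reaches a leaf class after at most one decision per level of the tree.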


Detecting Period of Gaze Fixation


Gaze

[Figure: cumulative displacement under left motion and right motion; original CD curve and smoothed CD curve]


Cumulative Difference

• Compute the cumulative difference

• Motion detector threshold: > 1 standard deviation (higher peaks)

• Gaze hypothesis threshold: > 80%

[Figure: gaze, with positive (+) and negative (-) cumulative difference]


Experiment


Dataset

• More than 65 hours of egocentric video

• Manually annotated as one of the leaf classes

• Video


Video credit: HUJI EgoSeg Dataset


Classification of Wearer’s Motion

[Figure: leaf node accuracy and inner node accuracy; Sitting vs Standing, Bus vs Standing]

Average: 70%, Best: 97%


Detecting Period of Gaze Fixation

• Valid gaze fixation: a head fixation lasting more than 5 seconds
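The 5-second validity rule can be sketched as a run-length filter over per-frame fixation flags. This assumes some detector has already produced one boolean per frame; the function name, the frame rate, and the sample flags are hypothetical.

```python
def fixation_periods(is_fixated, fps, min_seconds=5.0):
    """Return (start, end) frame ranges, end exclusive, lasting longer than min_seconds."""
    periods, start = [], None
    # A trailing False acts as a sentinel that closes a run ending at the last frame.
    for i, flag in enumerate(list(is_fixated) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) / fps > min_seconds:
                periods.append((start, i))
            start = None
    return periods


# Hypothetical flags at 2 fps: a 6 s run, a 2 s gap, then a 3 s run.
flags = [True] * 12 + [False] * 4 + [True] * 6
periods = fixation_periods(flags, fps=2)
print(periods)
```

Only the first run survives the filter, since the second lasts 3 seconds.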


Conclusion


Weakness

• Mixed features from adjacent activities

– Short-term sitting when riding


Weakness

• Mixed activities

– Waiting in line = Standing + Walking

– Riding an open train = Open or Riding?

– Standing while coming into the station = Static or Box?

• Ambiguity in gaze fixation

– A left and right turn in quick succession

– A person turns in place


Strength

• Simple, efficient and robust

• Use only the recorded video

• Make no assumptions on the scene structure

• Focus on long-term activities to prevent over-segmentation of the video


Extension

• Use a bilateral filter to find long-term trends

• Use a regularization framework like MRF on the classification results

• Handle the ambiguity in gaze fixation

• Combine with external sources such as GPS and inertial sensors

• Generalize to detect short-term activities

• Aid video summarization
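As an illustration of the MRF-style regularization suggested above, the sketch below treats the per-frame labels as a chain and finds the labelling that balances classifier scores against a fixed penalty for every label change, using dynamic programming. The chain structure, the Potts-style penalty, the function name, and the sample scores are all assumptions; the slides do not specify a model.

```python
def smooth_labels(scores, switch_cost=1.0):
    """MAP labelling of a chain: maximise summed scores minus a penalty per label change.

    scores[t][k] is the classifier score of label k at frame t (higher is better).
    """
    n_labels = len(scores[0])
    cost = [-s for s in scores[0]]
    back = []
    for frame in scores[1:]:
        new_cost, pointers = [], []
        for k in range(n_labels):
            # Best previous label for arriving at label k in this frame.
            prev = min(range(n_labels),
                       key=lambda j: cost[j] + (switch_cost if j != k else 0.0))
            step = switch_cost if prev != k else 0.0
            new_cost.append(cost[prev] + step - frame[k])
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)
    label = min(range(n_labels), key=lambda k: cost[k])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return path[::-1]


# Hypothetical two-class scores with one weakly supported outlier at frame 2.
scores = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.7, 0.3]]
print(smooth_labels(scores, switch_cost=1.0))
print(smooth_labels(scores, switch_cost=0.0))
```

With a penalty of 1.0 the isolated flip at frame 2 is removed, while a penalty of 0.0 reproduces the independent per-frame decisions. The penalty therefore controls how strongly short segments are suppressed.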