Leveraging Textural Features for Recognizing Actions in Low Quality … · 2020. 7. 30. · HMDB-BQ and HMDB-MQ respectively. Rahman, See and Ho Leveraging exturTe for HAR MMU, ...

Jan 24, 2021

Leveraging Textural Features for Recognizing Actions in

Low Quality Videos

Saimunur Rahman, John See, Chiung Ching Ho

Centre of Visual Computing, Faculty of Computing and Informatics, Multimedia University, Cyberjaya 63100, Selangor, Malaysia

RoViSP 2016, Penang, Malaysia

Rahman, See and Ho Leveraging Texture for HAR MMU, Cyberjaya 1 / 18

Visual human actions

Human actions: major visual events in movies, news, ...

Low quality videos: low frame resolution, low frame rate, compression artifacts, motion blurring

We recognize human actions from low quality videos

Leverage textures together with shape and motion features to improve action recognition from low quality videos.

Motivation

Recognizing human actions from video is of central importance due to its large real-world application domain:

  - surveillance, human-computer applications, video indexing, etc.

Many methods have been proposed in recent years, but the majority are focused on high quality videos that offer fine details and strong signal fidelity.

  - not suitable for real-time and lightweight applications

Current methods are not designed for processing low quality videos.

Summary of Approach

Detect space-time patches with a feature detector and describe them using shape and motion descriptors.

Calculate textural features from entire space-time volume.

Combine shape, motion and textural features to improve performance.

Summary of Contribution

Propose textural features to alleviate the limitation of shape and motion features.

Use BSIF-TOP as a textural feature descriptor for action recognition in low quality videos.

Evaluate various textural features on low quality videos.

Related Work

Shape and motion features
  - Space-Time Interest Points [Laptev et al.'05]
  - Dense Trajectories [Wang et al.'11]

Textural features
  - LBP-TOP [Kellokumpu et al.'09]
  - Extended LBP-TOP [Mattivi and Shao'09]

Similar approaches
  - Joint Feature Utilization [Rahman et al.'15, See and Rahman'15]

Outline

1 Shape and Motion Features

2 Textural Features

3 Dataset

4 Evaluation Framework

5 Experimental Results

6 Conclusion

Shape and Motion Feature Representation

Spatio-temporal interest points (IPs) are detected by the Harris3D detector [Laptev'05].

The 3D patch around each IP is described using HOG and HOF [Laptev'08].
  - HOG: histogram of oriented gradients (encodes shape)
  - HOF: histogram of optical flow (encodes motion)
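To make the idea of an orientation histogram concrete, here is a minimal sketch of the histogramming step behind HOG for a single 2D patch. It is an illustration under stated assumptions, not the descriptor of [Laptev'08]: the function name `orientation_histogram`, the choice of 4 unsigned bins and the use of a single cell (no spatio-temporal grid) are all assumptions made here. HOF follows the same pattern, with optical-flow vectors in place of intensity gradients.

```python
import numpy as np

def orientation_histogram(patch, n_bins=4):
    """Magnitude-weighted histogram of unsigned gradient orientations for one 2D patch."""
    patch = np.asarray(patch, dtype=np.float64)
    gy, gx = np.gradient(patch)                   # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)     # unsigned orientation in [0, pi)
    # Map each angle to a bin index; clamp guards against rounding at the upper edge.
    bins = np.minimum((angle / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

# A horizontal intensity ramp: all gradient energy lies along x (orientation 0).
ramp = np.tile(np.arange(8.0), (8, 1))
print(orientation_histogram(ramp))
```

For the ramp, all the weight falls in the first bin; transposing the ramp moves it to the bin that contains the vertical orientation.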

Textural Feature Representation

Three types of textural features are calculated from the entire space-time volume:

  - LBP: Local Binary Pattern [Zhao et al.'08].
  - LPQ: Local Phase Quantization [Zhao et al.'08].
  - BSIF: Binarized Statistical Image Features [Kannala and Rahtu'12].

To obtain dynamic textures we apply the three orthogonal planes (TOP) technique [Zhao et al.'08].

  - Features are calculated from the XY, XT and YT planes of the space-time volume (XYT).
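The TOP idea can be sketched in a few lines: slice the volume along each of the three plane orientations, compute a 2D texture histogram per slice, and concatenate the three results. The sketch below uses the basic 3 x 3 LBP operator as the per-plane descriptor. The function names, the T x H x W array layout, and the simplifications (no uniform patterns, no block partitioning, no radius or neighbour parameters) are assumptions made for illustration and do not reproduce the exact configuration used in the experiments.

```python
import numpy as np

def lbp_histogram(plane):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes of a 2D array."""
    plane = np.asarray(plane, dtype=np.float64)
    rows, cols = plane.shape
    center = plane[1:-1, 1:-1]
    # Neighbour offsets, visited clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = plane[1 + dy:rows - 1 + dy, 1 + dx:cols - 1 + dx]
        codes += (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

def lbp_top(volume):
    """Concatenate LBP histograms averaged over XY, XT and YT slices of a T x H x W volume."""
    volume = np.asarray(volume, dtype=np.float64)
    t, h, w = volume.shape
    xy = np.mean([lbp_histogram(volume[k, :, :]) for k in range(t)], axis=0)  # appearance
    xt = np.mean([lbp_histogram(volume[:, j, :]) for j in range(h)], axis=0)  # horizontal motion
    yt = np.mean([lbp_histogram(volume[:, :, i]) for i in range(w)], axis=0)  # vertical motion
    return np.concatenate([xy, xt, yt])

rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(10, 12, 16))   # synthetic stand-in for a grey-scale clip
descriptor = lbp_top(video)
print(descriptor.shape)
```

Replacing `lbp_histogram` with an LPQ or BSIF code histogram would give the corresponding LPQ-TOP or BSIF-TOP layout; the plane slicing and concatenation stay the same.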

Dataset: KTH Action [Schüldt et al.'04]

A total of 599 videos captured in a controlled environment.

6 action classes performed by 25 actors in 4 different scenarios.

Sampling rate: 25 fps, Resolution: 160 × 120 pixels.

Evaluation protocol: original experimental setup by authors.

Six downsampled versions were created: 3 spatial (SDα) and 3 temporal (SDβ).

  - We limit α, β = {2, 3, 4}, where α and β denote spatial and temporal downsampling to half, one third and one fourth of the original resolution or frame rate respectively.
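As an illustration of what the factors 2, 3 and 4 mean in practice, the sketch below downsamples a clip stored as a T x H x W array. The slides do not state how the downsampling was carried out, so block averaging for the spatial case and frame skipping for the temporal case are assumptions, as are the function names.

```python
import numpy as np

def spatial_downsample(video, alpha):
    """Reduce height and width by an integer factor alpha using block averaging."""
    t, h, w = video.shape
    h2, w2 = h // alpha, w // alpha
    # Crop so both dimensions are exact multiples of alpha, then average each block.
    cropped = video[:, :h2 * alpha, :w2 * alpha].astype(np.float64)
    return cropped.reshape(t, h2, alpha, w2, alpha).mean(axis=(2, 4))

def temporal_downsample(video, beta):
    """Reduce the frame rate by an integer factor beta by keeping every beta-th frame."""
    return video[::beta]

video = np.zeros((100, 120, 160))   # 4 s of a 160 x 120 clip at 25 fps, stored as T x H x W
for factor in (2, 3, 4):
    print(factor,
          spatial_downsample(video, factor).shape,
          temporal_downsample(video, factor).shape)
```

With a factor of 2, a 160 × 120 frame becomes 80 × 60 and a 25 fps clip keeps every second frame.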

Dataset: HMDB51 [Kuehne et al.'11]

A total of 6,766 videos of 51 action classes collected from movies or YouTube.

Videos are annotated with a rich set of meta-labels, including quality information.

  - three quality labels were used, i.e. 'good', 'medium' and 'bad'.

Evaluation protocol: three training-testing splits by authors.

We use the split specified for training, while testing is done using only videos with 'bad' and 'medium' labels; for clarity, we denote them as HMDB-BQ and HMDB-MQ respectively.
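The selection of the two test subsets can be sketched as a simple filter over clip records. The `Clip` record layout, the field values and the function name `select_test_clips` are hypothetical and only serve to illustrate the protocol; they do not describe how the dataset distributes its meta-labels.

```python
from collections import namedtuple

# Hypothetical record layout used only for this illustration.
Clip = namedtuple("Clip", ["name", "action", "quality", "split"])

def select_test_clips(clips, quality):
    """Return the test-split clips that carry the given quality label."""
    return [c for c in clips if c.split == "test" and c.quality == quality]

clips = [
    Clip("a.avi", "run", "bad", "test"),
    Clip("b.avi", "run", "good", "test"),
    Clip("c.avi", "wave", "medium", "test"),
    Clip("d.avi", "wave", "bad", "train"),
]

hmdb_bq = select_test_clips(clips, "bad")      # test clips labelled 'bad'
hmdb_mq = select_test_clips(clips, "medium")   # test clips labelled 'medium'
print([c.name for c in hmdb_bq], [c.name for c in hmdb_mq])
```

Training clips are left untouched by the filter, which matches the statement that the training split is used as specified.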

Evaluation Framework

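[Figure: evaluation framework. The input video feeds two branches. Shape-motion branch: shape-motion feature detection (STIPs), shape-motion feature representation (HOG/HOF), then feature encoding. Textural branch: textural feature calculation and representation (LBP/LPQ/BSIF spatio-temporal textures over the x, y, t volume). The feature histograms from both branches are passed to a multi-class non-linear SVM.]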
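The final stage of the framework, where feature histograms from both branches reach a non-linear SVM, can be sketched as follows. The slides do not name the kernel or the fusion rule, so the per-descriptor L1 normalisation, the plain concatenation and the exponential chi-square kernel below are assumptions chosen because they are common for histogram features; the function names are also introduced here for illustration.

```python
import numpy as np

def l1_normalise(hist):
    """Scale a histogram so that its entries sum to one (all-zero input is returned as is)."""
    hist = np.asarray(hist, dtype=np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist

def fuse(histograms):
    """Concatenate per-descriptor histograms after normalising each one."""
    return np.concatenate([l1_normalise(h) for h in histograms])

def chi2_kernel(X, Y, gamma=1.0):
    """exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)) for every pair of rows of X and Y."""
    X = np.asarray(X, dtype=np.float64)[:, None, :]
    Y = np.asarray(Y, dtype=np.float64)[None, :, :]
    num = (X - Y) ** 2
    den = X + Y
    # Bins that are empty in both histograms contribute zero distance.
    dist = np.sum(np.divide(num, den, out=np.zeros_like(num), where=den > 0), axis=2)
    return np.exp(-gamma * dist)

# Toy example: a 3-bin shape-motion histogram and a 2-bin texture histogram per video.
video_a = fuse([[3, 1, 0], [2, 2]])
video_b = fuse([[0, 1, 3], [2, 2]])
K = chi2_kernel([video_a, video_b], [video_a, video_b])
print(K.round(3))
```

A kernel matrix of this form can be handed to any SVM implementation that accepts a precomputed kernel.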

Experimental Results: KTH dataset

Performance (average accuracy over all classes) comparison:

Best method: HOG+HOF+BSIF-TOP

Spatially downsampled videos benefit greatly from textural features.

BSIF-TOP outperforms the other textural features.
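For reference, one common reading of "average accuracy over all classes" is the mean of the per-class accuracies, so that every action class counts equally regardless of how many test clips it has. The sketch below follows that reading, which is an assumption about the metric; the function name and the toy labels are made up for illustration.

```python
from collections import defaultdict

def average_class_accuracy(y_true, y_pred):
    """Mean of the per-class accuracies, with each class weighted equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == guess)
    return sum(correct[c] / total[c] for c in total) / len(total)

y_true = ["walk", "walk", "walk", "run", "box", "box"]
y_pred = ["walk", "walk", "run", "run", "box", "walk"]
print(average_class_accuracy(y_true, y_pred))
```

In the toy example the per-class accuracies are 2/3 for "walk", 1 for "run" and 1/2 for "box", and the reported value is their mean.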

Experimental Results: HMDB51 dataset

Performance (average accuracy over all classes) comparison:

Best method: HOG+HOF+BSIF-TOP

Texture vastly improves performance on both 'bad' and 'medium' quality videos.

BSIF-TOP outperforms the other textural features.

Experimental Results: BSIF-TOP vs. other textures

Performance improvement by BSIF-TOP over LBP-TOP and LPQ-TOP when aggregated with HOG+HOF:

LPQ-TOP is better for spatially downsampled videos.

LBP-TOP is better for temporally downsampled videos.

Using BSIF-TOP, the HMDB-BQ and HMDB-MQ results improve to almost double the baseline.

Experimental Results: Computational Complexities

Computational cost (feature detection/calculation + quantization time) of various feature descriptors:

Runtimes were measured on a machine with a 3.6 GHz Core i7 processor and 32 GB RAM.

All tests were run on a sample video of 656 frames from the KTH-SD2 dataset.

Ranking of descriptors in terms of speed:
  - LPQ-TOP > BSIF-TOP > HOG+HOF > LBP-TOP.

Conclusion

We leveraged textural features to improve the recognition of human actions in low quality video clips.

Considering that most current approaches involve only shape and motion features, the use of textural features is a novel proposition that improves the recognition performance by a good margin.

BSIF-TOP offers a significant leap of around 16% and 18% on the KTH-SD4 and HMDB-MQ datasets respectively, over their original baselines.

In future, we intend to extend this work towards a larger variety of human action datasets.

It is also worth designing textural features that are more discriminative and robust towards complex backgrounds.

Acknowledgement

This work is supported, in part, by MOE Malaysia under Fundamental Research Grant Scheme (FRGS) project FRGS/2/2013/ICT07/MMU/03/4.

Thank You!

Q & A
