
Action and Gait Recognition From Recovered 3-D Human Joints
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 4, AUGUST 2010


Transcript
Page 1

Action and Gait Recognition From Recovered 3-D Human Joints

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 4, AUGUST 2010

Junxia Gu, Member, IEEE, Xiaoqing Ding, Senior Member, IEEE,

Shengjin Wang, Member, IEEE, and Youshou Wu

Adviser: Ming-Yuan Shieh

Student: Shun-Te Chuang

PPT preparation: 100%

Page 2

Outline

ABSTRACT
INTRODUCTION
PREVIOUS WORK
FULL-BODY TRACKING METHOD FOR POSE RECOVERY
CLASSIFICATION
EXPERIMENTAL RESULTS AND ANALYSIS
CONCLUSION

Page 3

Abstract

A common viewpoint-free framework that fuses pose recovery and classification for action and gait recognition is presented in this paper.

First, a markerless pose recovery method is adopted to automatically capture the 3-D human joint and pose parameter sequences from volume data.

Second, multiple configuration features (combination of joints) and movement features (position, orientation, and height of the body) are extracted from the recovered 3-D human joint and pose parameter sequences.

Page 4

Abstract

A hidden Markov model (HMM) and an exemplar-based HMM are then used to model the movement features and configuration features, respectively.

Finally, actions are classified by a hierarchical classifier that fuses the movement features and the configuration features, and persons are recognized from their gait sequences with the configuration features.

Page 5

INTRODUCTION

VIDEO-BASED study of human motion has been receiving increasing attention over the past decades.

This interest has been motivated by applications such as intelligent video surveillance and human–computer interaction.

With increased awareness in security issues, motion analysis is becoming increasingly important in surveillance systems.

Page 6

INTRODUCTION

Action recognition is a further requirement: understanding what a person is doing.

Current intelligent surveillance systems are in urgent need of noninvasive and viewpoint-free research on motion analysis.

This paper focuses on the movement of main body segments (arms, legs, and torso). A human gait is extracted from a “walk” action.

Page 7

INTRODUCTION

In this paper, a vision-based markerless pose recovery approach is proposed to extract 3-D human joints.

The human joint sequence is one of the most effective and discriminative representations of human motion.

It contains rich information, including the position and orientation of the body as well as the positions of the joints.

The information is categorized into two types: movement features and configuration features.

Page 8

INTRODUCTION

The changes of position, orientation, and height of the body, which describe the global movement of the subject, are defined as movement features.

The sequences of human joint positions, which describe the change of relative configuration of body segments, are defined as configuration features.

A hidden Markov model (HMM) and an exemplar-based HMM (EHMM) are employed to characterize the movement and configuration features, respectively.

Both the HMM and the EHMM have been used to recognize actions and gaits.
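To make this modeling step concrete, the following is a minimal sketch (not the authors' implementation): it trains one Gaussian HMM per action class on movement-feature sequences and classifies a test sequence by maximum log-likelihood, using the hmmlearn library as a stand-in. Feature extraction and the exemplar-based variant are left out, and all names here are illustrative.

```python
# Sketch: per-class Gaussian HMMs over movement-feature sequences,
# classification by maximum log-likelihood (hmmlearn used as a stand-in).
import numpy as np
from hmmlearn import hmm

def train_class_models(train_seqs, n_states=5):
    """train_seqs: dict mapping class label -> list of (T_i, D) feature arrays."""
    models = {}
    for label, seqs in train_seqs.items():
        X = np.vstack(seqs)                      # concatenate this class's sequences
        lengths = [len(s) for s in seqs]         # per-sequence lengths for hmmlearn
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[label] = m
    return models

def classify(models, seq):
    """Return the class whose HMM assigns the highest log-likelihood to seq."""
    return max(models, key=lambda label: models[label].score(seq))
```

This per-class training and maximum-likelihood decision is the same pattern that the MAP classifiers in the CLASSIFICATION section formalize.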

Page 9

INTRODUCTION

Fig. 1. Flowchart of the video-based motion recognition.

Page 10

PREVIOUS WORK

1. Appearance-Based Methods: Appearance-based approaches are widely used in action and gait representation. They directly represent human motion using image information, such as silhouettes, edges, and optical flow.

2. Human Model-Based Methods: Human model-based approaches represent an action or a gait with body segments, joint positions, or pose parameters.

Page 11

PREVIOUS WORK

A previous approach combined stochastic search with gradient descent for local pose refinement to recover complex whole-body motion.

Model initialization was automatic, with the subject standing upright with arms and legs spread in the "Da Vinci" pose.

The tracking speed was below 1 s per frame. In this paper, an adaptive particle filter method is proposed for pose recovery.

Page 12

PREVIOUS WORK

First, the whole body of a subject in each frame is segmented into several body segments.

A particle filter with an adaptive particle number is then used to track each body segment.

This method decomposes the search space and reduces the computational complexity.
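A minimal sketch of this per-segment tracking idea is given below. The motion model, the likelihood function, and the rule for adapting the particle count (based here on the effective sample size) are placeholders chosen for illustration, not the paper's actual formulation.

```python
# Sketch: particle filter with an adaptive particle count for one body segment.
# The random-walk motion model and the ESS-based adaptation rule are assumptions.
import numpy as np

def resample(particles, weights, n_out):
    """Multinomial resampling down/up to n_out particles."""
    idx = np.random.choice(len(particles), size=n_out, p=weights)
    return particles[idx]

def track_segment(init_pose, frames, likelihood, n_min=50, n_max=500, noise=0.05):
    """init_pose: 1-D pose vector of the segment (e.g., joint angles).
    frames: iterable of per-frame 3-D volume data.
    likelihood: callable(pose, volume) -> non-negative match score."""
    n = n_max
    particles = init_pose + noise * np.random.randn(n, init_pose.size)
    for volume in frames:
        # Prediction: diffuse hypotheses with a simple random-walk motion model.
        particles = particles + noise * np.random.randn(*particles.shape)
        # Update: weight each hypothesis by how well it explains the volume data.
        w = np.array([likelihood(p, volume) for p in particles])
        w = w / w.sum()
        # Adapt the particle count from the effective sample size:
        # fewer particles when the weights are concentrated (confident tracking).
        n_eff = 1.0 / np.sum(w ** 2)
        n = int(np.clip(2 * n_eff, n_min, n_max))
        estimate = (w[:, None] * particles).sum(axis=0)   # weighted-mean pose estimate
        particles = resample(particles, w, n)
        yield estimate
```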

Page 13

FULL-BODY TRACKING METHOD FOR POSE RECOVERY

Human Model

Human Model-Based Full-Body Tracking

Page 14

Human Model

Page 15

Human Model-Based Full-Body Tracking

Page 16

CLASSIFICATION

A. HMM and EHMM Learning

B. Classifier for Gait Recognition

C. Classifier for Action Recognition

Page 17

A. HMM and EHMM Learning

The EHMM is different from the HMM in the definition of the observation densities. For the HMM, the general representation of the observation densities is a Gaussian mixture model (GMM) of the following form:
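The equation itself did not survive the transcript; in standard notation (the symbols here are generic and not necessarily the paper's), the GMM observation density of state $j$ is

$$b_j(\mathbf{y}_t) \;=\; \sum_{m=1}^{M} w_{jm}\,\mathcal{N}\big(\mathbf{y}_t;\ \boldsymbol{\mu}_{jm},\ \boldsymbol{\Sigma}_{jm}\big), \qquad \sum_{m=1}^{M} w_{jm} = 1,$$

where $w_{jm}$, $\boldsymbol{\mu}_{jm}$, and $\boldsymbol{\Sigma}_{jm}$ are the weight, mean, and covariance of the $m$-th mixture component.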

In the EHMM, the definition of the observation probability is as follows:
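The original equation is likewise missing; a common exemplar-based formulation (an assumption, not necessarily the paper's exact definition) ties the observation probability of state $j$ to its exemplar $\mathbf{e}_j$ through a distance $d(\cdot,\cdot)$ between joint configurations:

$$P(\mathbf{y}_t \mid q_t = j) \;\propto\; \exp\!\left(-\frac{d(\mathbf{y}_t, \mathbf{e}_j)^2}{2\sigma^2}\right).$$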

Page 18

B. Classifier for Gait Recognition

For each person $c \in \{1, \ldots, C_p\}$ in the database, we learn an EHMM gait model $\lambda^{(c)}_{S_{\text{gait}}}$ with features $S_{\text{gait}}$ and an EHMM gait model $\lambda^{(c)}_{L_{\text{gait}}}$ with features $L_{\text{gait}}$. A testing gait sequence $Y = \{y_0, \ldots, y_K\}$ is classified with the following MAP estimation:
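The equation is not reproduced in the transcript; one plausible form (equal weighting of the two models is an assumption) is

$$\hat{c} \;=\; \arg\max_{c \in \{1,\ldots,C_p\}} \Big[\log P\big(S_Y \mid \lambda^{(c)}_{S_{\text{gait}}}\big) + \log P\big(L_Y \mid \lambda^{(c)}_{L_{\text{gait}}}\big)\Big],$$

where $S_Y$ and $L_Y$ denote the $S_{\text{gait}}$ and $L_{\text{gait}}$ feature streams extracted from the test sequence $Y$.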

Page 19

C. Classifier for Action Recognition

A testing action sequence $Y = \{y_0, \ldots, y_K\}$, with features $S_Y$, $R_Y$, $P_Y$, $O_Y$, and $H_Y$, is classified with a two-layer classifier that fuses multiple features. The first layer is a weighted-MAP classifier that fuses the three movement features and the configuration feature of the whole body as follows:
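The formula is missing from the transcript; an illustrative form of such a weighted-MAP fusion (the weights and notation are assumptions) is

$$c_1 \;=\; \arg\max_{c}\Big[w_S \log P\big(S_Y \mid \lambda^{(c)}_S\big) + w_P \log P\big(P_Y \mid \lambda^{(c)}_P\big) + w_O \log P\big(O_Y \mid \lambda^{(c)}_O\big) + w_H \log P\big(H_Y \mid \lambda^{(c)}_H\big)\Big],$$

where $\lambda^{(c)}_{(\cdot)}$ are the models learned for action class $c$ on the corresponding features and $w_{(\cdot)}$ are the fusion weights.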

If the decision $c_1$ of the first layer belongs to the single-arm actions, sequence $Y$ is further recognized by a second MAP classifier with the arm features as follows:
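Again with assumed notation, such a second-layer decision could be written as

$$c_2 \;=\; \arg\max_{c \,\in\, \mathcal{C}_{\text{arm}}} \log P\big(R_Y \mid \lambda^{(c)}_R\big),$$

where $\mathcal{C}_{\text{arm}}$ is the set of single-arm actions and $\lambda^{(c)}_R$ the model learned on the arm configuration features.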

Page 20

EXPERIMENTAL RESULTS AND ANALYSIS

There are 11 actions, and each subject performs each action three times. All samples contain approximately 29,100 frames in total.

These actions include "check watch," "cross arm," "scratch head," "sit down," "get up," "turn around," "walk in a circle," "wave hand," "punch," "kick," and "pick up." To demonstrate view invariance, subjects freely change their orientations. The acquisition is achieved using five standard FireWire cameras.

The image resolution is 390 × 291 pixels, and the volume of interest is divided into 64 × 64 × 64 voxels.
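The slides do not describe how the volume data are reconstructed; as a hedged illustration only, a silhouette-based visual-hull carving over such a 64 × 64 × 64 grid could look like the sketch below (the silhouette masks, camera projection matrices, and volume bounds are assumed inputs, not taken from the paper).

```python
# Sketch: visual-hull carving of a voxel grid from multi-view silhouettes.
import numpy as np

def carve_volume(silhouettes, projections, bounds, res=64):
    """silhouettes: list of HxW binary masks; projections: list of 3x4 camera matrices;
    bounds: ((xmin, xmax), (ymin, ymax), (zmin, zmax)) of the volume of interest."""
    axes = [np.linspace(lo, hi, res) for lo, hi in bounds]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    pts = np.stack([X, Y, Z, np.ones_like(X)], axis=-1).reshape(-1, 4)  # homogeneous voxel centers
    occupied = np.ones(len(pts), dtype=bool)
    for mask, P in zip(silhouettes, projections):
        uvw = pts @ P.T                                   # project voxel centers into this view
        z = uvw[:, 2]
        valid = z > 1e-6                                  # voxel must lie in front of the camera
        u = np.zeros(len(pts), dtype=int)
        v = np.zeros(len(pts), dtype=int)
        u[valid] = np.round(uvw[valid, 0] / z[valid]).astype(int)
        v[valid] = np.round(uvw[valid, 1] / z[valid]).astype(int)
        inside = valid & (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
        hit = np.zeros(len(pts), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]] > 0      # voxel projects onto the silhouette
        occupied &= hit                                   # keep voxels consistent with every view
    return occupied.reshape(res, res, res)
```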

Page 21

EXPERIMENTAL RESULTS AND ANALYSIS

Fig. 8. Samples of images and 3-D volume data.

Page 22

EXPERIMENTAL RESULTS AND ANALYSIS

Fig. 9. Annotation of the human joint. (a) Annotation of the knee joint. (b) Results of annotation.

Page 23

EXPERIMENTAL RESULTS AND ANALYSIS

Fig. 11. Average error of individual joint positions.

Fig. 10. Average error of the joint positions.

Page 24

EXPERIMENTAL RESULTS AND ANALYSIS

Fig. 12. Results of the pose recovery.

Page 25

EXPERIMENTAL RESULTS AND ANALYSIS

Fig. 13. Selected exemplars and recognition rate versus the number of exemplars.

Page 26

EXPERIMENTAL RESULTS AND ANALYSIS

Fig. 14. Comparison of convergence performance between the EHMM and the HMM.

Page 27

EXPERIMENTAL RESULTS AND ANALYSIS

Fig. 15. Average recognition rates of actions.

Page 28

CONCLUSION

The main contribution of this paper is the fusion of pose recovery and motion recognition.

Future work includes automatically segmenting temporal sequences, reducing the computational complexity, analyzing more complex actions, and recognizing 2-D actions based on the 3-D EHMM.

The free-viewpoint 3-D human joint sequence contains a significant amount of information for motion analysis.

In addition to representing the single actions used in this paper, it can be used for more applications, such as analysis of complex actions.

Page 29

CONCLUSION

The high number of DOFs and the huge number of 3-D points make the human model-based pose recovery method very time consuming.

To address this problem, parallel computing, code optimization, and GPUs can be used to reduce the time cost. At present, it is also difficult to obtain robust volume data of subjects in surveillance and content-analysis scenarios.

Actions and gaits are affected by various factors, including clothing, age, and gender. In the future, performance under these factors will be analyzed on larger databases.