Multimodal Information Analysis for Emotion Recognition (Tele Health Care Application) Malika Meghjani Gregory Dudek and Frank P. Ferrie

Jan 02, 2016

Page 1

Multimodal Information Analysis for Emotion Recognition (Tele Health Care Application)

Malika Meghjani, Gregory Dudek and Frank P. Ferrie

Page 2

Content

1. Our Goal

2. Motivation

3. Proposed Approach

4. Results

5. Conclusion

Page 3

Our Goal

• Automatic emotion recognition using audio-visual information analysis.

• Create video summaries by automatically labeling the emotions in a video sequence.

Page 4

Motivation

• Map emotional states of the patient to nursing interventions.

• Evaluate the role of nursing interventions in improving the patient's health.

Page 5

Proposed Approach

[Diagram: Visual Feature Extraction → Visual-based Emotion Classification; Audio Feature Extraction → Audio-based Emotion Classification; both classifier outputs feed Data Fusion (decision-level fusion), which produces the Recognized Emotional State.]

Page 6

Visual Analysis

1. Face Detection (Viola-Jones face detector using Haar wavelets and a boosting algorithm)

2. Feature Extraction (Gabor filters: 5 spatial frequencies and 4 orientations, giving 20 filter responses for each frame)

3. Feature Selection (select the most discriminative features across all emotional classes)

4. SVM Classification (classification with probability estimates)
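The Gabor feature-extraction step above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the filter bandwidth `sigma` and the list of centre frequencies are assumed values, and the filters are applied in the frequency domain as the slides describe.

```python
import numpy as np

def gabor_bank_responses(face, frequencies=(0.05, 0.1, 0.2, 0.3, 0.4),
                         n_orientations=4):
    """Apply a 5-frequency x 4-orientation Gabor bank in the frequency
    domain and return the 20 magnitude responses (one per filter)."""
    h, w = face.shape
    F = np.fft.fft2(face)
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequency axis
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequency axis
    sigma = 0.05                      # filter bandwidth (assumed value)
    responses = []
    for f0 in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            u0, v0 = f0 * np.cos(theta), f0 * np.sin(theta)
            # Gaussian transfer function centred at the target frequency
            H = np.exp(-((fx - u0) ** 2 + (fy - v0) ** 2) / (2 * sigma ** 2))
            responses.append(np.abs(np.fft.ifft2(F * H)))
    return np.stack(responses)        # shape: (20, h, w)
```

The 20 response maps would then be flattened into one feature vector per frame before the feature-selection step.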

Page 7

Visual Feature Extraction

[Figure: a face image is convolved with a bank of frequency-domain Gabor filters (5 frequencies × 4 orientations), followed by feature selection and automatic emotion classification.]

Page 8

Audio Analysis

1. Audio Pre-Processing (remove leading and trailing edges)

2. Feature Extraction (statistics of pitch and intensity contours, and Mel-frequency cepstral coefficients)

3. Feature Normalization (remove inter-speaker variability)

4. SVM Classification (classification with probability estimates)
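Steps 2 and 3 can be sketched for the intensity contour alone (pitch and MFCC features follow the same per-segment pattern). This is a NumPy-only illustration with assumed frame sizes; the function names are hypothetical, not from the slides.

```python
import numpy as np

def intensity_features(signal, frame_len=400, hop=160):
    """RMS intensity contour over frames, summarized by simple statistics."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    contour = np.sqrt((frames ** 2).mean(axis=1))   # RMS per frame
    return np.array([contour.mean(), contour.std(),
                     contour.min(), contour.max()])

def normalize_per_speaker(feats, speaker_ids):
    """Z-score each speaker's features to remove inter-speaker variability."""
    out = np.empty_like(feats, dtype=float)
    for s in np.unique(speaker_ids):
        m = speaker_ids == s
        out[m] = (feats[m] - feats[m].mean(axis=0)) / (feats[m].std(axis=0) + 1e-8)
    return out
```

Normalizing per speaker rather than globally is what removes speaker identity from the features while preserving within-speaker emotional variation.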

Page 9

Audio Feature Extraction

[Figure: audio signal → audio feature extraction (speech rate, pitch, intensity, spectrum analysis, Mel-frequency cepstral coefficients, i.e. the short-term power spectrum of sound) → automatic emotion classification.]

Page 10

SVM Classification
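The slides use SVM classification with probability estimates in both modalities. A minimal sketch with scikit-learn (standing in for whatever toolkit the authors used, which the slides do not name) on synthetic feature vectors:

```python
# Minimal sketch: SVM classification with per-class probability estimates.
# scikit-learn and the synthetic two-class data are assumptions for
# illustration; the slides do not specify the toolkit or features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 20)),    # e.g. "sad" feature vectors
               rng.normal(2, 1, (40, 20))])   # e.g. "happy" feature vectors
y = np.array([0] * 40 + [1] * 40)

clf = SVC(kernel="rbf", probability=True).fit(X, y)
proba = clf.predict_proba(X[:1])   # probability estimate per emotional class
```

These per-class probabilities, derived from the SVM margins, are exactly what the later decision-level fusion step multiplies across the two modalities.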

Page 11

Feature Selection

[Figure: average count of features selected vs. error distance to the bounding planes.]

• The feature selection method is similar to the SVM classification (a wrapper method).

• It generates a separating plane by minimizing the weighted sum of distances of misclassified data points to two parallel bounding planes.

• It suppresses as many components of the normal to the separating plane as possible while still providing consistent classification results.
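The suppression step can be illustrated very simply: given the normal vector w of a learned separating plane, features whose components of w are near zero contribute little to the decision and can be dropped. This is a simplified sketch of that idea only (the helper name and threshold scheme are hypothetical), not the full wrapper optimization described above.

```python
import numpy as np

def select_by_plane_normal(w, keep=40):
    """Keep the features with the largest |w_j| in the normal w of a
    learned separating plane; all other components are suppressed."""
    idx = np.argsort(np.abs(w))[::-1][:keep]
    return np.sort(idx)               # indices of the selected features

w = np.array([0.01, -2.3, 0.4, 0.0, 1.1])
select_by_plane_normal(w, keep=2)     # -> features 1 and 4
```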

Page 12

Data Fusion

1. Decision Level:

• Obtain a probability estimate for each emotional class using the SVM margins.

• The probability estimates from the two modalities are multiplied and re-normalized to give the final decision-level emotional classification.

2. Feature Level:

• Concatenate the audio and visual features and repeat the feature selection and SVM classification process.
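The decision-level rule above (multiply, then re-normalize) is one line of NumPy; the example probabilities are made up for illustration:

```python
import numpy as np

def decision_level_fusion(p_audio, p_visual):
    """Multiply per-class probability estimates from the two modalities
    and re-normalize to get the fused emotional-class distribution."""
    fused = np.asarray(p_audio) * np.asarray(p_visual)
    return fused / fused.sum()

p_a = np.array([0.6, 0.3, 0.1])   # audio SVM probability estimates
p_v = np.array([0.2, 0.5, 0.3])   # visual SVM probability estimates
decision_level_fusion(p_a, p_v)   # -> array([0.4, 0.5, 0.1])
```

Note that the product rule lets a modality that is confident about a class (here the visual one) dominate the fused decision even when the other modality mildly disagrees.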

Page 13

Database and Training

Database:

1. Visual-only posed database

2. Audio-visual posed database

Training:

• Audio segmentation based on the minimum window required for feature extraction.

• Corresponding visual key-frame extraction in the segmented window.

• Image-based training and audio-segment-based training.

Page 14

Experimental Results

Database                      | Training Examples | Subjects | Emotional States | Recognition Rate                          | Validation Method
Posed visual data only (CKDB) | 120               | 20       | 5 + Neutral      | 75%                                       | Leave-one-subject-out cross-validation
Posed audio-visual data (EDB) | 270               | 9        | 6                | 82% (decision level), 76% (feature level) |

Page 15

Time Series Plot

(*Cohn-Kanade database, posed visual-only database)

75% leave-one-subject-out cross-validation results

[Figure: time-series plot of the classified emotions: Surprise, Sad, Angry, Disgust, Happy.]

Page 16

Feature Level Fusion

(*eNTERFACE 2005, posed audio-visual database)

Page 17

Decision Level Fusion

(*eNTERFACE 2005, posed audio-visual database)

Page 18

Confusion Matrix

Page 19

Demo

Page 20

Conclusion

• Combining the two modalities (audio and visual) improves overall recognition rates by 11% with decision-level fusion and by 6% with feature-level fusion.

• Emotions where vision wins: disgust, happiness and surprise.

• Emotions where audio wins: anger and sadness.

• Fear was equally well recognized by the two modalities.

• Automated multimodal emotion recognition is clearly effective.

Page 21

Things to do…

• Inference based on temporal relation between instantaneous classifications.

• Tests on natural audio-visual database (on-going).