Transcript
Page 1: Title Slide

Emotion State Recognition System & Its Analysis Using Soft Computing

Presented by: Anuj Mehra (2006IPG13)
Guided by: Prof. Anupam Shukla

Page 2: Introduction

Automatic recognition of facial gestures (i.e., facial muscle activity) is rapidly becoming an area of intense interest in the field of machine vision research.

Facial expressions play a significant role in our social and emotional lives. They are visually observable, conversational, and interactive signals that clarify our current focus of attention and regulate our interactions with the environment and with the people around us.

They are our direct and naturally preeminent means of communicating emotions.

Page 3: Introduction (Contd.)

Nonverbal communication plays a very important role in human communication. Telephones have mainly been used for business communication, but recently they are used more and more for everyday communication among family members and friends.

In addition to human-to-human communication, communication between humans and computer agents has become more and more common. Computer agents that act as communication mediators will become common entities in our society. As such, the capability of communicating with humans through both verbal and nonverbal channels will be essential.

Page 4: Objective

The main aim of this research is to develop and analyze an automatic emotion state recognition system, and its applications, using facial features and soft computing.

Nonverbal communication is one of the most important channels of human interaction today, and a system of this kind makes it possible to interpret such cues automatically.

Page 5: Literature Review

In 1971, Ekman and Friesen [7] postulated six primary emotions: joy, surprise, anger, sadness, fear, and disgust. These are referred to as the universally defined basic emotions.

In 1978, Suwa et al. [8] presented a preliminary investigation into automatic facial expression analysis from an image sequence. Before this, the study of facial expression analysis had been a subject for psychologists only.

In 1994, Kobayashi and Hara [9] developed an active human interface for machine recognition of human emotions from facial expressions using a neural network. They obtained a high recognition rate of 90% for the six basic emotions.

Page 6: Literature Review (Contd.)

In 1997, Huang and Huang [19] introduced an automatic facial expression recognition system consisting of two parts: facial feature extraction and facial expression recognition. The system applies a point distribution model and a gray-level model to find the facial features. The position variations of certain designated points on the facial features were described by 10 action parameters.

In 1999, Donato et al. [23] quantified facial movement in terms of component actions. They compared various techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through optical flow estimation, holistic spatial analysis, independent component analysis, etc.

Page 7: Literature Review (Contd.)

In 2000, Pantic and Rothkrantz [26] described an Integrated System for Facial Expression Recognition (ISFER), which performs recognition and emotional classification of human facial expressions from a still full-face image.

In 2001, Smith et al. [27] described a neural network analogue of HMM interpolation methods for analyzing facial expressions. The network demonstrated robust recognition of the six upper-face action units, whether they occurred individually or in combination.

In 2001, Suzuki et al. [28] described a model of the interrelationship between the physical features of a face and its emotional impression using a unique neural network.

Page 8: Motivation

In contrast to previous approaches to automatic AU (Action Unit) detection, which did not deal with static face images, the research proposed here addresses the problem of automatic AU coding from static face images.

The research is undertaken with two motivations:

1. While motion records are necessary for studying the temporal dynamics of facial behavior, static images are important for obtaining configurational information about facial expressions.

Since 100 still images, or a minute of videotape, take approximately one hour to score manually in terms of AUs [5], it is obvious that automating facial expression measurement would be highly beneficial. While some efforts have been made to automate FACS coding from face image sequences, no such effort has been made for the case of static face images.

Page 9: Motivation (Contd.)

2. A basic understanding of how to achieve automatic facial gesture analysis is necessary if facial expression analyzers capable of handling partial and inaccurate data are to be developed.

Page 10: Methodology

The proposed system recognizes five basic emotion states: anger, happiness, sadness, neutral, and disgust.

In this research work, two methodologies are proposed. Both involve the following steps:

Detection of the face from the dataset images (a detection sketch follows this list)
Applying PCA to obtain a low-dimensional representation
Euclidean distance to predict the class to which the test image belongs (methodology 1)
A neural network to predict the class to which the test image belongs (methodology 2)
Euclidean distance from the neutral class to estimate the intensity of the expression
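The presentation does not name the face detector that was used. Purely as an illustration of the first step, the sketch below crops and normalizes the face region with OpenCV's stock Haar-cascade frontal-face detector; the function name and the 64x64 patch size are assumptions.

    # Illustrative face-detection step (assumption: OpenCV Haar cascade; the
    # original detector is not specified in the presentation).
    import cv2

    def detect_face(image_path, size=(64, 64)):
        """Return the largest detected face as a resized grayscale patch, or None."""
        gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if gray is None:
            return None
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces) == 0:
            return None
        x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest face
        return cv2.resize(gray[y:y + h, x:x + w], size)     # normalized patch for PCA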

Page 11: Discussion

Of the several methods for recognizing facial gestures, the Facial Action Coding System (FACS) [5] is the best known and most commonly used.

FACS is an index of facial expressions, but it does not actually provide any bio-mechanical information about the degree of muscle activation.

FACS [5] defines 32 AUs, each of which corresponds to a contraction or relaxation of one or more muscles.

Intensities of FACS are annotated by appending letters A-E (for minimal-maximal intensity) to the Action Unit number (e.g. AU 1A is the weakest trace of AU 1 and AU 1E is the maximum intensity possible for the individual person).

Page 12: Feature Points

Feature Points [1] (figure not reproduced in this transcript)

Page 13: Database

Sample Database [31,32] (figure not reproduced in this transcript)

Page 14: Identified Face

(Figure not reproduced in this transcript.)

Page 15: Input

Training Image, Expression (contiguous ranges summarized; a parsing sketch follows below):
Image001.jpg to Image013.jpg: happy
Image014.jpg to Image024.jpg: disgust
Image025.jpg to Image034.jpg: anger
Image035.jpg to Image043.jpg: sad
Image044.jpg to Image050.jpg: neutral
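As an illustration only, the snippet below reads a plain "filename,label" listing of the kind shown above into (image, expression) pairs; the file name and exact layout are assumptions rather than the project's actual format.

    import csv

    def load_labels(listing_path):
        """Read 'Image001.jpg,happy'-style lines into (filename, label) pairs."""
        pairs = []
        with open(listing_path, newline="") as f:
            for row in csv.reader(f):
                if len(row) == 2:                  # skip blank or malformed lines
                    pairs.append((row[0].strip(), row[1].strip()))
        return pairs

    # Example (hypothetical file name):
    # load_labels("training_labels.csv")
    # -> [("Image001.jpg", "happy"), ("Image002.jpg", "happy"), ...]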

Page 16: Algorithm

1. The training images are used to create a low-dimensional face space. This is done by performing Principal Component Analysis (PCA) on the training image set and keeping the leading principal components (i.e., the eigenvectors with the largest eigenvalues). In this process, projected versions of all the training images are also created.

2. The test images are also projected onto the face space, so that all test images are represented in terms of the selected principal components.

3. The Euclidean distance of a projected test image from each of the projected training images is calculated, and the minimum is chosen in order to find the training image most similar to the test image. The test image is assumed to fall in the same class as the closest training image. (A code sketch of these three steps follows.)
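A minimal sketch of steps 1-3, assuming each face image has already been cropped and flattened into a one-dimensional vector (one row of train_vectors); variable and function names are illustrative, not the project's actual code.

    import numpy as np

    def fit_pca(train_vectors, n_components):
        """Step 1: build the face space and project the training images into it."""
        mean = train_vectors.mean(axis=0)
        centered = train_vectors - mean
        # Principal components via SVD; rows of vt are eigenvectors of the covariance.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        components = vt[:n_components]           # eigenvectors with the largest eigenvalues
        projections = centered @ components.T    # projected training images
        return mean, components, projections

    def classify(test_vector, mean, components, projections, labels):
        """Steps 2-3: project the test image and take the label of the nearest training image."""
        proj = (test_vector - mean) @ components.T
        distances = np.linalg.norm(projections - proj, axis=1)   # Euclidean distances
        return labels[int(np.argmin(distances))]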

Page 17: Algorithm (Contd.)

In the second methodology, after calculating the principal components, a back-propagation algorithm (BPA) network is applied as the classifier. To determine the intensity of a particular expression, its Euclidean distance from the mean of the projected neutral images is calculated. Under the assumption made here, the greater this distance, the farther the expression is from neutral, and hence the stronger it is taken to be.
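The network architecture is not given in the presentation, so the sketch below stands in a small scikit-learn back-propagation network (MLPClassifier) trained on the PCA projections, together with the distance-from-neutral intensity measure described above; the layer sizes and the choice of library are assumptions.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def train_bpa(projections, labels):
        """Train a small feed-forward network (fitted by back-propagation) on PCA features."""
        net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
        net.fit(projections, labels)
        return net

    def expression_intensity(test_projection, neutral_projections):
        """Euclidean distance from the mean neutral projection; larger means a stronger expression."""
        neutral_mean = neutral_projections.mean(axis=0)
        return float(np.linalg.norm(test_projection - neutral_mean))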

The emotions extracted from the system are used to control a music player.
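How the recognized emotion drives the music player is not detailed in the presentation; the fragment below shows one hypothetical way to do it, mapping each emotion to a playlist and launching a command-line player. The playlist paths and the "vlc" command are invented for illustration.

    import subprocess

    PLAYLISTS = {                      # hypothetical emotion-to-playlist mapping
        "happy": "playlists/upbeat.m3u",
        "sad": "playlists/calm.m3u",
        "anger": "playlists/soothing.m3u",
        "disgust": "playlists/neutral_mix.m3u",
        "neutral": "playlists/neutral_mix.m3u",
    }

    def play_for_emotion(emotion, player="vlc"):
        """Start the (assumed) command-line player with the playlist for this emotion."""
        playlist = PLAYLISTS.get(emotion, PLAYLISTS["neutral"])
        subprocess.Popen([player, playlist])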

Page 18: Output

Testing Image, Distance From Neutral, Expression, Best Match
Image001.jpg, 2221, neutral, Image046.jpg
Image002.jpg, 3669, happy, Image008.jpg
Image003.jpg, 4764, disgust, Image014.jpg
Image004.jpg, 4462, anger, Image029.jpg
Image005.jpg, 3933, anger, Image025.jpg
Image006.jpg, 4745, happy, Image003.jpg
Image007.jpg, 5398, sad, Image041.jpg
Image008.jpg, 5851, happy, Image010.jpg
Image009.jpg, 2503, neutral, Image046.jpg
Image010.jpg, 4183, happy, Image008.jpg
Image011.jpg, 5135, sad, Image040.jpg
Image012.jpg, 6319, anger, Image031.jpg
Image013.jpg, 5292, happy, Image006.jpg
Image014.jpg, 6207, happy, Image012.jpg
Image015.jpg, 5899, happy, Image006.jpg
Image016.jpg, 4163, sad, Image040.jpg
Image017.jpg, 4002, neutral, Image046.jpg
Image018.jpg, 6088, disgust, Image022.jpg
Image019.jpg, 4331, disgust, Image022.jpg
Image020.jpg, 5274, anger, Image026.jpg
Image021.jpg, 5002, anger, Image029.jpg
Image022.jpg, 5135, disgust, Image021.jpg
Image023.jpg, 4134, disgust, Image018.jpg
Image024.jpg, 4570, disgust, Image022.jpg
Image025.jpg, 4331, disgust, Image023.jpg
Image026.jpg, 3387, neutral, Image049.jpg
Image027.jpg, 4800, disgust, Image016.jpg
Image028.jpg, 4274, disgust, Image023.jpg
Image029.jpg, 5359, anger, Image029.jpg
Image030.jpg, 5994, disgust, Image022.jpg
Image031.jpg, 5921, anger, Image029.jpg

Page 19: Result

              PCA + Euclidean Distance    PCA + BPA
Training      100%                        89%
Validation    98.3%                       85.7%

Page 20: Graphs

(Graphs of the recognition results; figures not reproduced in this transcript.)

Page 21: Future Scope

The proposed system could also be extended so that it can handle distractions such as occlusions (e.g., by a hand), glasses, and facial hair.

The output obtained from the system is currently used to control a music player, and it could also be applied in other areas and fields related to emotion recognition.

Page 22: Previous Work

[1] Mehra Anuj, Shukla Anupam, Tiwari Ritu, “Intelligent Biometric System for Speaker Identification using Lip Features with PCA and ICA”, Journal of Computing, Volume 2, Issue 4, April 2010, pp. 120-127.

[2] Mehra Anuj, Shukla Anupam, Tiwari Ritu, “Expert System for Speaker Identification Using Lip Features with PCA”, Intelligent Systems and Applications (ISA) 2010, IEEE, Wuhan, 22-23 May 2010.

[3] Mehra Anuj, Shukla Anupam, “Emotion State Recognition System using Euclidean distance and PCA”, Journal of Computing [in review].

Page 23: References

P. Ekman and E. Rosenberg (2005), “Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System”, Oxford University Press, 2nd Edition, Feb. 2005.

R. Gutierrez-Osuna et al. (2005), “Speech-Driven Facial Animation with Realistic Dynamics”, IEEE Transactions on Multimedia, Vol. 7 (1), pp 33-41.

A. V. Barbosa and H. C. Yehia (2001), “Measuring the Relation between Speech Acoustics and 2D Facial Motion” Speech Communication Vol. 26, pp 23-48.

L. C. De Silva, T. Miyasato and R. Nakatsu (1997), “Facial Emotion Recognition Using Multimodal Information” Proceedings of IEEE International Conference on Information Communication and Signal Processing (ICICS’97) Singapore, pp 397-401.

R. G. Osuna, P. K. Kakumanu, A. Esposito, O. N. Garcia, A. Bojorquez, J. L. Castillo and I. Rudomin (2005), “Speech-Driven Facial Animation with Realistic Dynamics”, IEEE Transactions on Multimedia, Vol. 7 (01), pp 33-42.

C. Darwin (1965), “The Expression of Emotion in Man and Animals”, John Murray, 1872. Reprinted by University of Chicago Press, 1965.

P. Ekman and W. V. Friesen (1971), “Constants across cultures in the face and emotion” Journal of Personality and Social Psychology, Vol. 17(2), pp 124-129.

M. Suwa, N. Sugie, and K. Fujimora (1978), “A preliminary note on pattern recognition of human emotional expression,” Proceedings of the Fourth International Joint Conference on Pattern Recognition, Kyoto (Japan): pp 408-410.

H. Kobayashi and F. Hara (1994), “Analysis of the Neural Network and Recognition Characteristics of 6 Basic Facial Expressions” IEEE, International Workshop on Robot and Human Communication 1994, pp 222-227.

F. Kawakami, S. Morishima, H Yamada and H. Harashima (1994), “Construction of 3-D Emotion Space Based on Parameterised Faces” IEEE, International Workshop on Robot and Human Communication, 1994, pp 216-221.

Page 24: References (Contd.)

R. R. Advent, C. T. Ng and J. A. Nel (1994), “Machine Vision Recognition of Facial Affect Using Back-propagation Neural Networks”, Proceedings of the 16th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 1994, Engineering Advances: New Opportunities for Biomedical Engineers, pp 1364-1365.

S. Morishima (1996), “Modeling of facial expression and emotion for human communication system”, Displays, 1996, pp 15-25.

R. Herpers, M. Michaelis, K. H. Lichtenauer and G. Sommer (1996), “Edge and Key point Detection in Facial Regions”, 2nd International Conference on Automatic Face and Gesture Recognition, pp 212-217.

H. Demirel, T. J. Clarke and P. Y. K. Cheung (1996), “Adaptive Automatic Facial Feature Segmentation”, 2nd International Conference on Automatic Face and Gesture Recognition, pp 277-282.

Y. Yacoob and L. S. Davis (1996), “Recognizing Human Facial Expressions From Long Image Sequences Using Optical Flow”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18 (06), pp 636-642.

M Rosenblum, Y. Yacoob and L. S. Davis (1996), “Human Expression Recognition from Motion Using a Radial Basis Function Network Architecture”, IEEE Transactions on Neural Networks, Vol. 7 (05), pp 1121-1138.

I. A. Essa and A. P. Pentland (1997), “Coding, Analysis, Interpretation and Recognition of Facial Expressions”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19(7), pp 757-763.

A. Lanitis, C. J. Taylor and T. F. Cootes (1997), “Automatic Interpretation and coding of Face Images using Flexible Models”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19(7), pp 743-756.

C. L. Huang and Y. M. Huang (1997), “Facial Expression Recognition Using Modal-Based Feature Extraction and Action Parameters Classification”, Journal of Visual Communication and Image Representation, Vol. 8 (03), pp 278-290.

P. Eisert and B. Girod (1997), “Facial Expression Analysis for Modal-Based Coding of Video Sequences”, Picture Coding Symposium, Berlin-1997: pp 33-38.

Page 25: References (Contd.)

G. A. Abrantes and F. Pereira (1998), “Interactive Analysis for MPEG-4 Facial Model Configuration”, EUROGRAPHICS 1998, Lisboa (Portugal), pp 1-4.

C. Gershenson (1999), “Modelling Emotions with Multidimensional Logic”, 18th International Conference of North American Fuzzy Information Processing Society, New York City, pp 42-46.

G. Donato, M. S. Bartlett, J. C. Hager, P. Ekman and T. J. Sejnowski (1999), “Classifying Facial Actions”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21(10), pp 974-989.

M. D. Bonis, P. D. Boeck, F. Perez-Diaz and M. Nahas (1999), “A Two-process Theory of Facial Perception of Emotions”, Life Science, Vol. 322(8), pp 669-675.

J. J. J. Lien, T. Kanade, J. F. Cohn and C. C. Li (2000), “Detection, Tracking and Classification of action units in facial expression”, Journal of Robotics and Autonomous Systems, Vol. 31(2000), pp 131-146.

M. Pantic & L. J. M. Rothkrantz (2000), “Expert system for automatic analysis of facial expressions”, Image and Vision Computing, Vol. 18, pp 881-905.

E. Smith, M. S. Bartlett and J. Movellan (2001), “Computer Recognition of Facial Actions: A study of co-articulation effects”, Proceedings of 8th Joint Symposium on Neural Computations, May 2001, pp 1-6.

K. Suzuki, H. Yamada and S. Hashimoto (2001), “Interrelating physical features of facial expression and its impression”, Proceedings of the IEEE International Conference on Neural Networks, 2001, pp 1864-1869.

Y. L. Tian, T. Kanade, J. F. Cohn (2001), “Recognizing Action Units for Facial Expression Analysis”, Transactions on Pattern Analysis and Machine Intelligence, Vol. 23(02), pp 97-105.

M. Nahas and M. D. Bonis (2001), “Image Technology and Facial Expression of Emotions”, Proceedings of 10th IEEE, International Workshop on Robot and Human Interactive Communication, 2001, pp 524-527.

Vidit Jain, Amitabha Mukherjee. The Indian Face Database. http://vis-www.cs.umass.edu/~vidit/IndianFaceDatabase/, 2002.

Japanese Female Facial Expression (JAFFE) Database - http://www.kasrl.org/jaffe.html

Page 26: Thank You