Anitha C et al. / International Journal of Engineering Science and Technology Vol. 2(10), 2010, 5158-5174

    A SURVEY ON FACIAL EXPRESSION DATABASES

ANITHA C*
Senior Lecturer, Dept. of ISE, BMS College of Engineering, Bangalore, India.

M K VENKATESHA
Principal, RNS Institute of Technology, Bangalore, India.

B SURYANARAYANA ADIGA
Senior Consultant, TCS Ltd, Bangalore, India.

Abstract: Human faces are non-rigid objects with a high degree of variability in size, shape, color, and texture. Face databases are extensively used to evaluate the algorithms employed in facial expression/gesture recognition systems. Automated face and facial gesture recognition has immense potential in identification of criminals, surveillance and retrieval of missing children, office security, credit card verification, video document retrieval, telecommunication, high-definition television, medicine, human-computer interfaces, multimedia facial queries, and low-bandwidth transmission of facial data. This paper presents a comprehensive survey of the currently available databases that can be used in facial expression recognition systems. The growth in face database development has been tremendous in recent years.

Keywords: Facial expression recognition systems, databases, human-computer interaction

    I. Facial Expression Databases

Images used for facial expression recognition are either static images or image sequences. An image sequence potentially contains more information than a still image because it also captures temporal information. The usage of these databases is restricted to research purposes only. The most commonly used databases include the Cohn-Kanade facial expression database, the Japanese Female Facial Expression (JAFFE) database, the MMI database, and the CMU-PIE database. More recently, 3D databases have come into wider use. Table 1 summarizes the facial expression databases that are currently available for evaluation use.

Each of the databases listed in Table 1 below is briefly described. The description mainly emphasizes the techniques used during the development of the database. This is followed by a few of the face images from the database. Finally, the salient features of the database are listed. The conclusion at the end gives details for researchers requiring access to these databases.

*Corresponding author is a faculty member and research scholar at BMS College of Engineering, Bangalore, Karnataka, India.


Table 1: Facial expression databases available for evaluation

| Name of database | No. of images | Expression / pose / illumination | Color / Gray | Resolution | Images / image sequences | Number of subjects | Year |
|---|---|---|---|---|---|---|---|
| FERET | 14,051 | 2 / 9-20 / 2 | Gray | 256x384 | Images | 1199 | 1996 |
| JAFFE | 213 | 7 / 1 / 1 | Gray | 256x256 | Images | 10 | 1998 |
| AR database | 3288 | 4 / 1 / 4 | Color | 768x576 | Images | 116 | 1998 |
| Cohn-Kanade AU-coded, v1 | 486 | 6 / 1 / 1 | Gray | 640x490 | Image sequences | 97 | 2000 |
| CAS-PEAL database | 30,900 | 6 / 21 / 9-15 | Color | 360x480 | Images | 1040 | 2003 |
| Korean Face Database (KFDB) | 52,000 | 5 / 7 / 16 | Color | 640x480 | Images | 1000 | 2003 |
| MMI FE | 200+ images, 800+ sequences | 6 / 2 / 1 | Color | 720x576 | Images & image sequences | 52 | 2005 |
| University of Texas Video db | - | 11 / 9 / 1 | Color | 720x480 | Images, video streams | 284 | 2005 |
| BU-3D FE | 2500 | 7 / 2 / 4 | Color | 1040x1329 | Images, 3D models | 100 | 2006 |
| FG-NET | 399 | 6 / 1 / 1 | Color | 320x240 | Image sequences | 18 | 2006 |
| FE db of MPI for Biological Cybernetics | 5600 | 4 / 7 / 1 | Color | 256x256 | Images | 200 | 2006 |
| BU-4D FE | 606 3D sequences | 6 / 1 / 1 | Color | 1040x1329 | Image sequences, 3D models | 101 | 2008 |
| Radboud Face Database | 8040 | 8 / 5 / 1 | Color | 1024x681 | Images | 67 | 2010 |


    I.I. FERET database

The Face Recognition Technology (FERET) program was sponsored by the Department of Defense (DoD) Counterdrug Technology Development Program Office [Phillips et al. (2000)]. The goal of the FERET program was to develop automatic face recognition capabilities that could be employed to assist security, intelligence, and law enforcement personnel in the performance of their duties. The program consisted of three major elements: (a) sponsoring research, (b) collecting the FERET database, and (c) performing the FERET evaluations. The database was collected in 15 sessions between August 1993 and July 1996. It contains 1564 sets of images for a total of 14,126 images, covering 1199 individuals and 365 duplicate sets of images. A duplicate set is a second set of images of a person already in the database, usually taken on a different day.

Salient features:

Age-related facial change was considered while collecting the images, with the interval between two sessions extending up to two years for some subjects. Duplicate sets of images are included. It has the largest number of subjects among the databases surveyed.

I.II. JAFFE database

Ten female subjects posed for the six basic expressions: happiness, sadness, anger, disgust, fear and surprise, plus the neutral face (see figure 2). Each subject posed three to four examples per expression, giving a total of 219 images [Lyons et al. (1998)]. The still images were captured in a controlled environment (pose and illumination; see figure 1). Semantic ratings of the expressions, averaged over 60 Japanese female subjects in psychological experiments, serve as ground truth. According to Michael J Lyons, an expression is rarely pure; it is usually a mixture of different emotions. A 5-level scale was therefore used for each expression image (5 for high, 1 for low). Two such ratings were produced, one including the fear expression images and the other excluding them. Each image is labelled with the predominant expression in that image. The resolution is comparatively low at 256 x 256, and the number of subjects is only ten, the smallest among the databases compared here.
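To illustrate how such averaged semantic ratings translate into an expression label, the minimal sketch below applies the labelling rule described above to made-up rating values; the numbers are invented for illustration and are not actual JAFFE data.

```python
# Hypothetical illustration of the JAFFE labelling rule: each image carries
# averaged 5-point semantic ratings (5 = high, 1 = low) for the six basic
# expressions, and the image is labelled with the predominant expression.
# The rating values below are invented for illustration only.

def predominant_expression(ratings):
    """Return the expression with the highest averaged semantic rating."""
    return max(ratings, key=ratings.get)

if __name__ == "__main__":
    example_ratings = {
        "happiness": 4.6, "sadness": 1.3, "anger": 1.1,
        "disgust": 1.2, "fear": 1.4, "surprise": 2.1,
    }
    print(predominant_expression(example_ratings))  # -> happiness
```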

    Fig 1: Apparatus used to photograph the facial expressions.


    Fig 2: Examples of images from the Japanese Female Facial Expression database.

    Salient features:

It has the smallest number of subjects among the facial expression databases surveyed. Manual (semantic) rating was used to label the facial expressions.

I.III. AR database

The AR database was collected at the Computer Vision Centre in Barcelona, Spain in 1998. It contains images of 116 individuals (63 men and 53 women) [Martinez et al. (1998)]. The imaging and recording conditions (camera parameters, illumination setting, and camera distance) were carefully controlled and constantly recalibrated to ensure that the settings were identical across subjects. The resulting RGB colour images are 768 x 576 pixels in size. The subjects were recorded twice at a two-week interval. During each session 13 conditions with varying facial expressions, illumination and occlusion were captured. Figure 3 shows an example of each condition. So far, more than 200 research groups have accessed the database.

    Courtesy: [Gross (2005)]

    Fig 3: AR database. The conditions are (1) neutral, (2) smile, (3) anger, (4) scream, (5) left light on, (6) right light on, (7) both lights on, (8) sun glasses, (9) sun glasses/ left light, (10) sun glasses/ right light, (11) scarf, (12) scarf/ left light, (13) scarf/ right light.
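For quick reference in code, the 13 recording conditions listed in the caption above can be kept as a simple lookup table; the indices and names below are taken directly from figure 3, and the snippet is only an illustrative sketch.

```python
# AR database recording conditions, indexed as in figure 3 above.
AR_CONDITIONS = {
    1: "neutral",                   2: "smile",
    3: "anger",                     4: "scream",
    5: "left light on",             6: "right light on",
    7: "both lights on",            8: "sun glasses",
    9: "sun glasses / left light", 10: "sun glasses / right light",
    11: "scarf",                   12: "scarf / left light",
    13: "scarf / right light",
}

print(AR_CONDITIONS[4])  # -> scream (a non-prototypic gesture)
```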

    Salient features:

The first facial expression database to consider occlusions in face images. Includes scream, a non-prototypic gesture. To enable testing and modelling using this database, 22 facial feature points are manually labelled on each face.

I.IV. Cohn-Kanade facial expression database

Subjects in the available portion of the database were 97 university students enrolled in introductory psychology classes. They ranged in age from 18 to 30 years. Sixty-five percent were female, fifteen percent were African-American, and three percent were Asian or Latino. The observation room was equipped with a chair for the subject and two


Panasonic WV3230 cameras, each connected to a Panasonic S-VHS AG-7500 video recorder with a Horita synchronized time-code generator. One of the cameras was located directly in front of the subject, and the other was positioned 30 degrees to the right of the subject [Kanade et al. (2000)]. Only image data from the frontal camera are available at this time. Subjects were instructed by an experimenter to perform a series of 23 facial displays that included single action units (e.g., AU 12, or lip corners pulled obliquely) and combinations of action units (e.g., AU 1+2, or inner and outer brows raised). Subjects began and ended each display with a neutral face; the image sequences provided run from neutral to the target expression (see figure 4). Before performing each display, an experimenter described and modelled the desired display. Six of the displays were based on descriptions of prototypic basic emotions (i.e., joy, surprise, anger, fear, disgust, and sadness). Image sequences from neutral to target display were digitized into 640 x 480 or 640 x 490 pixel arrays with 8-bit precision for grayscale values. The images are available in PNG and JPEG formats and are labelled with their corresponding VITC (vertical interval time code). The final frame of each image sequence was coded using FACS (Facial Action Coding System), which describes the subject's expression in terms of action units (AUs). FACS coding was performed by a certified FACS coder.
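As a rough illustration of how AU-coded displays relate to the prototypic emotions, the sketch below checks whether a coded set of AUs contains a commonly cited prototype combination (e.g., AU 6+12 for joy). The prototype table is an approximation assembled for illustration only; it is not the database's official coding scheme.

```python
# Sketch: matching a FACS-coded display (a set of action units) against
# commonly cited prototype AU combinations. These prototypes are an
# approximation for illustration; the Cohn-Kanade database itself is coded
# frame by frame by certified FACS coders.

PROTOTYPES = {
    "joy":      {6, 12},
    "surprise": {1, 2, 5, 26},
    "sadness":  {1, 4, 15},
    "anger":    {4, 5, 7, 23},
}

def candidate_emotions(coded_aus):
    """Return prototype emotions whose AUs are all present in the coded set."""
    return [emo for emo, aus in PROTOTYPES.items() if aus <= set(coded_aus)]

if __name__ == "__main__":
    # Hypothetical final-frame coding: brows raised, eyes widened, jaw dropped.
    print(candidate_emotions([1, 2, 5, 26]))  # -> ['surprise']
```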

Courtesy: Jeffrey Cohn

Fig 4: An image sequence of a subject expressing 'Surprise' from the Cohn-Kanade Facial Expression Database.

    Salient features:

    Image sequences considered instead of mug shots. Evaluation performed based on Action Unit recognition.

I.V. CAS-PEAL database

PEAL stands for Pose, Expression, Accessory and Lighting. CAS-PEAL is a large-scale Chinese face database with variations in pose, expression, accessories and lighting [Gao et al. (2004)]. The database currently contains 99,594 images of 1040 individuals (595 males and 445 females). Five different expressions, six accessories (3 glasses and 3 caps), and 15 lighting conditions were considered while capturing the images. Nine equally spaced cameras were used to capture the different horizontal poses simultaneously (see figure 6). The subjects were also asked to look up and down to capture another set of 18 images. The conditions considered during the database creation are listed in Table 2. The database currently available for research is a subset containing 30,900 images of 1040 subjects. These images belong to two main subsets: a frontal subset and a pose subset. In the frontal subset, all images are captured from camera C4 with the subject looking directly into this camera. Among them, 377 subjects have images with 6 expressions (see figure 7), 438 subjects have images wearing 6 different accessories, 233 subjects have images under at least 9 lighting changes, 296 subjects have images against 2 or 4 different backgrounds, and 296 subjects have images at different distances from the cameras. In addition, 66 subjects have images recorded in two sessions at a six-month interval (see figure 8). The pose subset includes images of 1040 subjects across 21 different poses without any other variations.


Fig 6: Plan view of the CAS-PEAL camera system.

    Table 2: Sources of variations considered in CAS-PEAL database

| Source of variation | Facing directions | Expression | Lighting | Accessory | Background | Aging | Distance |
|---|---|---|---|---|---|---|---|
| # Variations | 3 | 6 | 15 | 6 | 4 | 2 | 2 |
| # Combined (x 9 viewpoints) | 27 | 54 | 135 | 54 | 36 | 18 | 18 |

# Viewpoints: 9. # Total: 342.
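The combined counts in Table 2 follow from multiplying each variation count by the nine viewpoints; a small sketch verifying the totals:

```python
# Verify the counts in Table 2: each variation is captured from 9 viewpoints,
# so the combined count per source is 9 * (#variations), and the grand total
# is the sum over all sources.

VIEWPOINTS = 9
variations = {
    "facing directions": 3, "expression": 6, "lighting": 15,
    "accessory": 6, "background": 4, "aging": 2, "distance": 2,
}

combined = {name: VIEWPOINTS * n for name, n in variations.items()}
print(combined)                 # e.g. expression -> 54, lighting -> 135
print(sum(combined.values()))   # -> 342, matching the table total
```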

    Fig 7: Example images of one subject with 6 expressions across three poses in CAS-PEAL database.


Fig 8: Example images captured with a time difference; the images in the bottom row were captured six months later.

Salient features:

    Time/age consideration during image collection. Inclusion of multiple accessories in database. Consideration of surprise and open mouth categories in the database.

I.VI. Korean face database

The Korean Face Database (KFDB) contains facial imagery of a large number of Korean subjects collected under carefully controlled conditions [Hwang et al. (2003)]. Images with varying pose, illumination, and facial expression were recorded. The subjects were imaged in the middle of an octagonal frame carrying seven cameras and eight lights (of two types: fluorescent and incandescent) against a blue-screen background. The cameras were placed between ±45° off frontal in both directions at 15° increments. Figure 9 shows example images for all seven poses. Pose images were collected in three styles: natural (no glasses, no hair band to hold back hair from the forehead), hair band, and glasses. The lights were located in a full circle around the subject at 45° intervals (see figure 9). Separate frontal pose images were recorded with each light turned on individually, for both the fluorescent and incandescent lights. Figure 10 shows example images for all eight illumination conditions. In addition, five images using the frontal fluorescent lights were obtained with the subjects wearing glasses. The subjects were also asked to display five facial expressions (neutral, happy, surprise, anger, and blink), which were recorded under two different coloured lights (see figure 11), resulting in 10 images per subject. In total, 52 images were obtained per subject. The database also contains extensive ground truth information: the locations of 26 feature points (where visible) are available for each face image.
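The figure of 52 images per subject quoted above can be reconstructed from the recording protocol described in this section; a short check of that arithmetic:

```python
# Reconstruct the 52 images per subject in KFDB from the protocol described
# above: 7 poses in 3 styles, 8 light positions in 2 light types (frontal
# pose), 5 frontal images with glasses, and 5 expressions under 2 lights.

pose_images       = 7 * 3   # 7 poses x (natural, hair band, glasses) = 21
illumination_imgs = 8 * 2   # 8 light positions x (fluorescent, incandescent) = 16
glasses_imgs      = 5       # frontal fluorescent lights, subject wearing glasses
expression_imgs   = 5 * 2   # 5 expressions x 2 coloured lights = 10

total = pose_images + illumination_imgs + glasses_imgs + expression_imgs
print(total)  # -> 52, as stated in the text
```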

Fig 9: Pose variation in KFDB. The poses vary from +45° through full frontal to -45°.


Fig 10: Illumination variation in KFDB. Lights from eight different positions (L1-L8), located in a full circle around the subject, were used. For each position, images were taken with both fluorescent and incandescent lights.

Fig 11: Example colour images of expression changes under two kinds of illumination.

Salient features:

Usage of two types of illumination (fluorescent and incandescent lights). Blinking, a non-prototypic gesture, is included in the database.

I.VII. MMI FE database

The MMI facial expression database was developed by the Man-Machine Interaction group of Delft University of Technology, the Netherlands. It was the first web-based facial expression database [Pantic et al. (2005)]. The basic criteria defined for this database include easy accessibility, extensibility, manageability and user-friendliness, with online help files and various search criteria. The database contains both still images and video streams depicting the six basic expressions: happiness, anger, sadness, disgust, fear and surprise. Activations of individual facial action muscles are also covered. The database was built using JavaScript, Macromedia Flash, MySQL, PHP and the Apache HTTP server: JavaScript for the creation of dynamic pages, Macromedia Flash for rich internet applications (animation features), MySQL as the database server, PHP for its compatibility with MySQL and its open-source nature, and the Apache HTTP server for its open-source licence, security, extensibility and efficiency. The database provides users with an easily searchable repository; over 200 images and 800 video sequences can be accessed. The database currently has 308 active users.
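To give a feel for the kind of keyword search such a MySQL/PHP repository exposes, here is a minimal, self-contained sketch that uses Python's built-in sqlite3 in place of the real MySQL back end; the table layout and sample rows are hypothetical and only the idea of searchable expression metadata is taken from the description above.

```python
import sqlite3

# Hypothetical metadata table standing in for the MMI repository's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recordings (id INTEGER, subject INTEGER, "
             "expression TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO recordings VALUES (?, ?, ?, ?)",
    [(1, 7, "happiness", "video"), (2, 7, "disgust", "image"),
     (3, 12, "happiness", "image")],
)

# Search criterion: all happiness recordings, as a user of the web interface
# might request via its search form.
for row in conn.execute(
        "SELECT id, subject, kind FROM recordings WHERE expression = ?",
        ("happiness",)):
    print(row)
```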

Salient features:

First web-based facial expression database. Includes both still images and image sequences.

I.VIII. University of Texas video database

The database contains 284 students from the University of Texas (76 males, 208 females). It provides a combination of static images and video sequences of faces [O'Toole et al. (2005)]. The static images and video clips were shot indoors in a controlled environment (figure 12). The database also includes video sequences of people walking and conversing under variable illumination and at varying distances. The high quality facial


mug shots provide nine discrete views ranging from left profile to right profile in equal-degree steps. All participants were asked to wear a grey-coloured smock to cover their clothing from the camera. The video sequences captured were categorized into three varieties. The first is the moving version of the static mug shots: the subjects were asked to move their heads and pause briefly at the required angles, and the recording from the first clip to the last lasted about 10 seconds. The second variety consists of dynamic facial speech videos capturing the rigid and non-rigid movements of the subject while speaking. The subjects were asked to produce animated speech, including head motions, facial expressions and eye gaze together with the speech movements; they answered a series of mundane questions, and their responses were recorded as the dynamic video sequence. The audio response was not recorded. The duration of each video sequence was 10 seconds. The third variety of video sequences comprised facial expressions. The expressions captured were both prototypic and non-prototypic, viz. happiness, sadness, fear, disgust, anger, puzzlement, laughter, surprise, boredom and disbelief. The expressions have not been rated and no ground truth is provided. There are instances where more than one expression was expressed by the subject. The video sequences of people comprise two variations (figure 13): a gait video and a conversational video. In the gait video, the subjects walk parallel or perpendicular to the line of sight of the camera, approaching the camera but veering off to the left at the end. The conversation video shows a conversation between two people, one facing the camera and the other facing away from it. Natural gestures were portrayed by the subject facing the camera, such as giving directions to various destinations in the building. The lighting was variable owing to light entering from outside through the glass windows. The close-range videos provide test stimuli for face recognition and tracking algorithms that operate when the head is undergoing rigid and/or non-rigid transformations. The dynamic mug shots, speech, and expression videos are likewise useful for computer graphics modelling of heads and facial animation. The entire database requires about 160 GB of storage and is available to researchers only. Images are provided in TIFF format and videos in DV stream format.
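The nine mug-shot views span left to right profile in equal-degree steps; with the -90° to +90° range shown in figure 12, the step works out to 22.5°, as the small sketch below illustrates.

```python
# The nine still mug-shot views span left profile (-90 degrees) to right
# profile (+90 degrees) in equal steps; for nine views that step is
# 180 / 8 = 22.5 degrees (range and step taken from figure 12).

num_views = 9
step = (90 - (-90)) / (num_views - 1)
angles = [-90 + i * step for i in range(num_views)]
print(step)    # -> 22.5
print(angles)  # -> [-90.0, -67.5, ..., 0.0, ..., 67.5, 90.0]
```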

Fig 12: Row 1 shows a facial mug shot series with nine still images, varying in pose from left (-90°) to right (+90°) profile in 22.5° steps. The second row contains five still images extracted from a facial speech video. The third and fourth rows contain images extracted from a disgust expression video and a laughter expression video, respectively.


    Fig 13: The first row of the figure contains five still images extracted from a parallel gait video. The second row contains five still images extracted from a perpendicular gait video. The third row of the figure contains five still images extracted from a conversation video.

    Salient features:

A combination of mug shots, image streams, conversational videos and gait videos in one database (audio was not recorded). Can be used to evaluate a wider range of algorithms. The storage requirement is large (about 160 GB).

I.IX. BU-3D FE databases

Binghamton University was instrumental in creating 3D facial expression databases for the purpose of algorithm evaluation. The databases come in two versions, one with static data and the other with dynamic data. The static database includes still colour images, while the dynamic database contains video sequences of subjects with expressions. Both databases include the neutral expression in addition to the six prototypic expressions.

I.IX.I. BU-3DFE: 3D static facial expression database

Although 3D facial models have been extensively used for 3D face recognition and 3D face animation, the usefulness of such data for 3D facial expression recognition has remained largely unexplored [Yin et al. (2006)]. This 3D facial expression database (called the BU-3DFE database) includes 100 subjects with 2500 facial expression models. The BU-3DFE database is available to the research community (areas of interest are as diverse as affective computing, computer vision, human-computer interaction, security, biomedicine, law enforcement, and psychology). The database presently contains 100 subjects (56% female, 44% male), ranging from 18 to 70 years of age, with a variety of ethnic/racial ancestries, including White, Black, East-Asian, Middle-East Asian, Indian, and Hispanic/Latino. Participants in the face scans include undergraduates, graduates and faculty from the institute's departments of Psychology, Arts, and Engineering (Computer Science, Electrical Engineering, System Science, and Mechanical Engineering). The majority of participants were undergraduates from the Psychology Department (collaborator: Dr. Peter Gerhardstein). Each subject performed seven expressions in front of the 3D face scanner (see the right of figure 14). With the exception of the neutral expression, each of the six prototypic expressions (happiness, disgust, fear, anger, surprise and sadness) was captured at four levels of intensity (see the left of figure 14). There are therefore 25 instant 3D expression models for each subject, resulting in a total of 2,500 3D facial expression models in the database. Associated with each expression shape model is a corresponding facial texture image captured from two views (about +45° and -45°). As a result, the database consists of 2,500 two-view texture images and 2,500 geometric shape models.
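The model count quoted above follows directly from the capture protocol (one neutral model plus six expressions at four intensity levels, for 100 subjects); a quick check:

```python
# Check the BU-3DFE model count from the protocol described above:
# each subject contributes 1 neutral model plus 6 prototypic expressions
# at 4 intensity levels, and there are 100 subjects.

subjects = 100
models_per_subject = 1 + 6 * 4           # -> 25 "instant" 3D expression models
total_models = subjects * models_per_subject
print(models_per_subject, total_models)  # -> 25 2500
```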


Fig 14: (a) Four levels of facial expression intensity, from low to high; the expression models show the cropped face region and the entire head. (b) The seven expressions (neutral, angry, disgust, fear, happiness, sad and surprise) for a female and a male subject, with face images and face models, from the BU-3DFE database.

    Salient features:

Introduction of 3D into facial expression databases. 3D models are included along with the texture images. Intensity levels for expressions are considered.

I.IX.II. BU-4DFE (3D + time): 3D dynamic facial expression database

To analyze facial behaviour in a dynamic rather than a static 3D space, the BU-3DFE database was extended to the BU-4DFE. This newly created high-resolution 3D dynamic facial expression database is available to the scientific research community [Yin et al. (2008)]. The 3D facial expressions are captured at video rate (25 frames per second). For each subject there are six model sequences, one for each of the six prototypic facial expressions (anger, disgust, happiness, fear, sadness, and surprise). Each expression sequence contains about 100 frames (a sample is shown in figure 16). The database contains 606 3D facial expression sequences captured from 101 subjects, with a total of approximately 60,600 frame models. Each 3D model of a 3D video sequence has a resolution of approximately 35,000 vertices (see figure 15). The texture video has a resolution of about 1040 x 1329 pixels per frame. The resulting database consists of 58 female and 43 male subjects, with a variety of ethnic/racial ancestries, including Asian, Black, Hispanic/Latino, and White. This database includes the salient features of the 3D database in the previous sub-section together with the dynamic characteristics.
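The frame count of roughly 60,600 models follows from the 606 sequences of about 100 frames each; a one-line check under that stated approximation:

```python
# BU-4DFE: 101 subjects x 6 prototypic expressions = 606 sequences; at roughly
# 100 frames per sequence this gives about 60,600 frame models, as stated.

sequences = 101 * 6
frames_per_sequence = 100  # approximate figure from the text
print(sequences, sequences * frames_per_sequence)  # -> 606 60600
```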


    Fig 15: Individual model views from BU-4D FE database

Fig 16: Sample expression image and model sequences (male and female) from the BU-4D FE database.

Salient features:

Includes all the features of the BU-3D FE database, with dynamic (temporal) characteristics added. The image sequences begin and end with a neutral expression.

I.X. FG-NET database

The FG-NET database with Facial Expressions and Emotions from the Technical University of Munich is an image database containing face images of a number of subjects performing the six basic emotions defined by Ekman & Friesen. The database was developed in an attempt to assist researchers who investigate the effects of different facial expressions [Wallhoff (2006)]. It was generated as part of the European Union project FG-NET (Face and Gesture Recognition Research Network). One of the underlying paradigms of this database is to let the observed people react as naturally as possible. Consequently, an attempt was made to evoke genuine emotions by playing video clips or still images after a short introduction phase, rather than asking the person


to play a role. As a result, head movements occur in all directions. The covered emotions are happiness, disgust, anger, fear, sadness, surprise and neutral (see figure 17 for a typical image sequence). The images were acquired using a Sony XC-999P camera equipped with an 8 mm COSMICAR 1:1.4 television lens. A BTTV 878 frame grabber card was used to grab the images at a size of 640 x 480 pixels, a colour depth of 24 bits and a frame rate of 25 frames per second. For capacity reasons, the images were converted into 8-bit JPEG-compressed images with a size of 320 x 240. The database can be downloaded as a collection of MPEG-compressed movies. After extraction, the images are stored separately in subdirectories as follows: {anger, disgs, fears, happy, neutr, sadns, surpr}.
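Since the extracted FG-NET material is organised into the emotion subdirectories listed above, a small sketch of how one might walk that layout and build an image/label list; the directory names come from the text, while the root path is a placeholder.

```python
import os

# Emotion subdirectory names as listed above for the extracted FG-NET data;
# the root path passed to index_fgnet() is a placeholder for wherever the
# MPEG movies were extracted.
EMOTION_DIRS = ["anger", "disgs", "fears", "happy", "neutr", "sadns", "surpr"]

def index_fgnet(root):
    """Return (path, emotion) pairs for every file found under the emotion folders."""
    samples = []
    for emotion in EMOTION_DIRS:
        folder = os.path.join(root, emotion)
        if not os.path.isdir(folder):
            continue  # skip folders that were not extracted
        for name in sorted(os.listdir(folder)):
            samples.append((os.path.join(folder, name), emotion))
    return samples

if __name__ == "__main__":
    print(len(index_fgnet("./feedtum")))  # placeholder root directory
```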

Fig 17: Image sequence of a subject from the neutral state to an emotion (happiness) state.

Salient feature:

The expressions on the faces are as natural as possible.

I.XI. FE database of MPI for Biological Cybernetics

The faces were filmed in a purpose-built video laboratory using five synchronized digital cameras, giving a frontal view and four profile views at 22° and 45° [Pilz et al. (2006)]. To avoid the effect of hair and clothing, each subject was provided with a black cap and a black scarf while posing for the cameras (figure 18(b)). Eight amateur actors were filmed making a range of isolated expression gestures. Two of the clips, relating to the anger and surprise expressions, were used. The actors were asked to act out the situation using words such as "wow!" for surprise and "what!" for anger. The recording system was designed as a distributed computer cluster of video and audio recording nodes (see figure 18(a)). Each recording node consists of a specialized digital video camera, a specialized frame grabber and, optionally, a sound card, attached to a standard Intel x86-compatible PC with fast hard disks. The computers of the nodes were connected to each other, to a control computer and to a file server via a standard 100 Mbit Ethernet local area network. The frame grabbers of all nodes were connected to each other for the transmission of an electronic trigger signal, allowing high-precision synchronization of frame capture between the nodes.


    Fig. 18 (a) Schematic overview of the Max Planck video lab, (b) Example stimuli from one actor showing the five different perspectives used in the current experiments.

    Salient feature:

Highly segmented/normalized images through the use of a black cap and scarf.

I.XII. Radboud Faces Database (RaFD)

RaFD is a set of pictures of 67 models (including Caucasian males and females, Caucasian children, both girls and boys, and Moroccan Dutch males) displaying eight different emotions [Langner et al. (2010)]. The RaFD is an initiative of the Behavioural Science Institute of the Radboud University Nijmegen (the Netherlands), and can be used freely for non-commercial scientific research by researchers who work for an officially accredited university. In accordance with the Facial Action Coding System, each model was trained to show the following expressions: anger, disgust, fear, happiness, sadness, surprise, contempt, and neutral (figure 19(a)). Each emotion was shown with three different gaze directions, and all pictures were taken from five camera angles simultaneously (figures 19(b) and (c), respectively). The targeted emotional expressions were based on prototypes [Ekman et al. (2002)]. The action units targeted in this database follow a variation of the Directed Facial Action Task [Ekman (2007)], as shown in figure 20. The number of registered users is already 334.
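The 8040 images listed for RaFD in Table 1 are consistent with the protocol described here (67 models, 8 expressions, 3 gaze directions, 5 simultaneous camera angles); a quick check:

```python
# RaFD image count from the protocol above: every model shows 8 expressions,
# each with 3 gaze directions, photographed from 5 camera angles at once.

models, expressions, gazes, cameras = 67, 8, 3, 5
print(models * expressions * gazes * cameras)  # -> 8040, matching Table 1
```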


Fig. 19 (a) Eight emotional expressions, from top left: sad, neutral, anger, contemptuous, disgust, surprise, fear and happiness; (b) three gaze directions: left, straight and right; (c) five camera angles at 180°, 135°, 90°, 45° and 0°.

Fig. 20 Targeted action units (AUs) for all emotional expressions.

Salient features:

Contempt, a non-prototypic expression, is included. Different gaze directions are considered. It is the most recent facial expression database in this survey.

II. Conclusions

The databases discussed in this paper, chronologically ordered from 1996 to 2010, are currently freely available for research evaluation purposes (other available facial expression databases either restrict usage to a particular group or charge an access fee). There are other face databases; most of them are either


inaccessible or inactive. The databases surveyed here can be accessed by contacting the respective resource persons. The authors have restricted the list to databases that can be used for facial expression recognition, rather than face recognition alone. Almost all of them have good ground truth and have been well evaluated by their creators using standard recognition methods. Table 3 below lists the resource persons to contact for access to each database.

    Table 3: Information on database accessibility

| S no. | Name of the database | Contact for accessibility | University / country |
|---|---|---|---|
| 1 | FERET database | http://face.nist.gov/colorferet/request.html | George Mason University, USA |
| 2 | JAFFE database | Michael J Lyons, http://www.kasrl.org/jaffe_download.html | Psychology Department, Kyushu University, Japan |
| 3 | AR database | Aleix M Martinez, [email protected] | Computer Vision Center, Purdue University, Barcelona, Spain |
| 4 | Cohn-Kanade facial expression database | Jeffrey Cohn, [email protected] | Carnegie Mellon University, Robotics Institute, Pittsburgh |
| 5 | CAS-PEAL database | Shaoxin Li, [email protected] | Face Group, Chinese Academy of Sciences, China |
| 6 | Korean Face database | http://www.kisa.or.kr/eng/main.jsp | Center for Artificial Vision Research, Korea University, Korea |
| 7 | MMI FE database | Maja Pantic, [email protected] | Delft University of Technology, Delft, The Netherlands |
| 8 | University of Texas Video database | Alice O'Toole, [email protected] | University of Texas, Dallas |
| 9 | BU-3D FE database | Lijun Yin, [email protected] | Binghamton University, State University of New York |
| 10 | FG-NET database | [email protected] | Technical University of Munich, Munich |
| 11 | MPI for Biological Cybernetics database | http://faces.kyb.tuebingen.mpg.de/index.php | Max Planck Institute for Biological Cybernetics, Tübingen, Germany |
| 12 | Radboud Face database | Ron Dotsch, www.rafd.nl | Radboud University, Nijmegen, The Netherlands |


Acknowledgement

The authors are grateful to all the developers of the facial expression databases mentioned in this paper: Michael J Lyons, Aleix M Martinez, Jeffrey Cohn, Shaoxin Li, Maja Pantic, Alice O'Toole, Lijun Yin, Ron Dotsch, and all others who contributed their valuable suggestions and feedback. The authors are also grateful to the Department of Electronics and Communication Engineering, BMS College of Engineering, Bangalore, for extending their support.

References

[1] Ekman P, Friesen W V & Hager J C (2002), Facial Action Coding System: Investigator's Guide, Salt Lake City, UT: Research Nexus.
[2] Ekman P (2007), The directed facial action task, in Handbook of Emotion Elicitation and Assessment, Oxford, UK: Oxford University Press.
[3] Gao W, Cao B, Shan S, Zhou D, Zhang X, and Zhao D (2004), The CAS-PEAL large-scale Chinese face database and evaluation protocols, Technical Report JDL-TR-04-FR-001, Joint Research & Development Laboratory.
[4] Gross R (2005), Face Databases, chapter in Handbook of Face Recognition, Springer-Verlag.
[5] Hwang B W, Byun H, Roh M C, and Lee S W (2003), Performance evaluation of face recognition algorithms on the Asian face database, KFDB, in Audio- and Video-Based Biometric Person Authentication (AVBPA).
[6] Kanade T, Cohn J, and Tian Y (2000), Comprehensive database for facial expression analysis, in Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition.
[7] Langner O, Dotsch R, Bijlstra G, Wigboldus D H J, Hawk S T, and Van Knippenberg A (2010), Presentation and validation of the Radboud Faces Database, Cognition and Emotion, Psychology Press.
[8] Lyons M, Akamatsu S, Kamachi M, and Gyoba J (1998), Coding facial expressions with Gabor wavelets, in 3rd International Conference on Automatic Face and Gesture Recognition.
[9] Martinez A R and Benavente R (1998), The AR face database, Computer Vision Center (CVC) Technical Report, Barcelona.
[10] O'Toole A J, Harms J, Snow S L, Hurst D R, Pappas M R, Ayyad J H, and Abdi H (2005), A video database of moving faces and people, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Pantic M, Valstar M, Rademaker R, and Maat L (2005), Web-based database for facial expression analysis, in Proc. of IEEE Int'l Conf. on Multimedia and Expo (ICME'05).
[12] Phillips P J, Moon H, Rizvi S, and Rauss P J (2000), The FERET evaluation methodology for face-recognition algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(10).
[13] Pilz K S, Thornton I M, and Bülthoff H H (2006), A search advantage for faces learned in motion, Experimental Brain Research, 171(4).
[14] Wallhoff F (2006), Facial Expressions and Emotion Database, http://www.mmk.ei.tum.de/~waf/fgnet/feedtum.html, Technische Universität München.
[15] Yin L, Wei X, Sun Y, Wang J, and Rosato M J (2006), A 3D facial expression database for facial behavior research, in 7th International Conference on Automatic Face and Gesture Recognition (FGR06).
[16] Yin L, Chen X, Sun Y, Worm T, and Reale M (2008), A high-resolution 3D dynamic facial expression database, in 8th International Conference on Automatic Face and Gesture Recognition (FGR08).
