Anitha C et al. / International Journal of Engineering Science and Technology Vol. 2(10), 2010, 5158-5174
A SURVEY ON FACIAL EXPRESSION DATABASES
ANITHA C*
Senior Lecturer, Dept. of ISE, BMS College of Engineering, Bangalore, India.
M K VENKATESHA
Principal, RNS Institute of Technology, Bangalore, India.
B SURYANARAYANA ADIGA
Senior Consultant, TCS Ltd., Bangalore, India.
Abstract: Human faces are non-rigid objects with a high degree of variability in size, shape, color, and texture. Face databases are extensively used for the evaluation of algorithms in facial expression/gesture recognition systems. Automated systems for face and facial gesture recognition have immense potential in identification of criminals, surveillance and retrieval of missing children, office security, credit card verification, video document retrieval, telecommunication, high-definition television, medicine, human-computer interfaces, multimedia facial queries, and low-bandwidth transmission of facial data. This paper presents a comprehensive survey of the currently available databases that can be used in facial expression recognition systems; the growth in face database development has been tremendous in recent years.
Keywords: Facial expression recognition systems, databases, human-computer interaction
I. Facial Expression Databases
Images used for facial expression recognition are either static images or image sequences. An image sequence potentially contains more information than a still image, because it also captures temporal information. The use of these databases is restricted to research purposes only. The most commonly used databases include the Cohn-Kanade facial expression database, the Japanese Female Facial Expression (JAFFE) database, the MMI database, and the CMU-PIE database. More recently, 3D databases have come into wider use. Table 1 summarizes the facial expression databases that are currently available for evaluation.
Each of the databases listed in table 1 below is briefly described. The description mainly emphasizes the techniques used during the development of the database, followed by a few face images from the database; finally, the salient features of the database are listed. The conclusion at the end gives details for researchers requiring access to these databases.
*Corresponding author is a faculty member and research scholar at BMS College of Engineering, Bangalore, Karnataka, India.
Table 1: Facial expression databases available for evaluation

Name of database | No. of images | Expression/pose/illumination | Color/gray | Resolution | Images / image sequences | No. of subjects | Year
FERET | 14,051 | 2/9-20/2 | Gray | 256x384 | Images | 1199 | 1996
JAFFE | 213 | 7/1/1 | Gray | 256x256 | Images | 10 | 1998
AR database | 3288 | 4/1/4 | Color | 768x576 | Images | 116 | 1998
Cohn-Kanade AU-coded, v1 | 486 | 6/1/1 | Gray | 640x490 | Image sequences | 97 | 2000
CAS-PEAL database | 30,900 | 6/21/9-15 | Color | 360x480 | Images | 1040 | 2003
Korean Face Database (KFDB) | 52,000 | 5/7/16 | Color | 640x480 | Images | 1000 | 2003
MMI FE | 800+ sequences, 200+ images | 6/2/1 | Color | 720x576 | Images & image sequences | 52 | 2005
University of Texas Video db | - | 11/9/1 | Color | 720x480 | Images, video streams | 284 | 2005
BU-3D FE | 2500 | 7/2/4 | Color | 1040x1329 | Images, 3D models | 100 | 2006
FG-NET | 399 | 6/1/1 | Color | 320x240 | Image sequences | 18 | 2006
FE db of MPI for Biological Cybernetics | 5600 | 4/7/1 | Color | 256x256 | Images | 200 | 2006
BU-4D FE | 606 3D sequences | 6/1/1 | Color | 1040x1329 | Image sequences, 3D models | 101 | 2008
Radboud Faces Database | 8040 | 8/5/1 | Color | 1024x681 | Images | 67 | 2010
I.I. FERET database
The Face Recognition Technology (FERET) program was sponsored by the Department of Defense (DoD) Counterdrug Technology Development Program Office [Phillips et al. (2000)]. The goal of the FERET program was to develop automatic face recognition capabilities that could be employed to assist security, intelligence and law enforcement personnel in the performance of their duties. The program consisted of three major elements: (a) sponsoring research, (b) collecting the FERET database, and (c) performing the FERET evaluations. The database was collected in 15 sessions between August 1993 and July 1996. It contains 1564 sets of images, for a total of 14,126 images covering 1199 individuals and 365 duplicate sets of images. A duplicate set is a second set of images of a person already in the database, usually taken on a different day.
Salient features:
- Age-related facial change was considered while collecting the images, with the interval between two sessions extending up to two years for some subjects.
- Duplicate sets of images are included.
- Largest number of subjects among the databases surveyed.
I.II. JAFFE database
Ten female subjects posed for the six basic expressions: happiness, sadness, anger, disgust, fear and surprise, plus the neutral face (see figure 2). Each subject posed three to four examples per expression, for a total of 213 images [Lyons et al. (1998)]. The still images were captured in a controlled environment (for pose and illumination, see figure 1). As ground truth, semantic ratings of the expressions were obtained from psychological experiments, averaged over 60 Japanese female raters. According to Michael J Lyons, an expression is never pure but rather a mixture of different emotions, so a 5-level scale was used for each expression image (5 for high, 1 for low). Two such ratings were produced, one including the fear expression images and one excluding them. Each image is labeled with the predominant expression in that image. The images have a comparatively low resolution of 256 x 256 pixels, and the number of subjects, ten, is the smallest among the databases surveyed.
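As a simple illustration of this rating-based labeling scheme, consider the minimal sketch below; the rating values shown are hypothetical, not taken from the database.

```python
# Hypothetical averaged semantic ratings for one image on the 5-level
# scale described above (5 = high, 1 = low); the values are illustrative.
mean_ratings = {"happiness": 1.4, "sadness": 1.2, "anger": 1.1,
                "disgust": 1.3, "fear": 1.8, "surprise": 4.6}

# The image is labeled with its predominant (highest-rated) expression.
label = max(mean_ratings, key=mean_ratings.get)
print(label)  # surprise
```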
Fig 1: Apparatus used to photograph the facial expressions.
Fig 2: Examples of images from the Japanese Female Facial
Expression database.
Salient features:
- Smallest number of subjects among the databases surveyed.
- Manual (semantic) rating used to label facial expressions.
I.III. AR database
The AR database was collected at the Computer Vision Centre in Barcelona, Spain, in 1998. It contains images of 116 individuals (63 men and 53 women) [Martinez et al. (1998)]. The imaging and recording conditions (camera parameters, illumination setting, and camera distance) were carefully controlled and constantly recalibrated to ensure identical settings across subjects. The resulting RGB colour images are 768 x 576 pixels in size. The subjects were recorded twice at a two-week interval. During each session, 13 conditions with varying facial expressions, illumination and occlusion were captured. Figure 3 shows an example of each condition. So far, more than 200 research groups have accessed the database.
Courtesy: [Gross (2005)]
Fig 3: AR database. The conditions are (1) neutral, (2) smile,
(3) anger, (4) scream, (5) left light on, (6) right light on, (7)
both lights on, (8) sun glasses, (9) sun glasses/ left light, (10)
sun glasses/ right light, (11) scarf, (12) scarf/ left light, (13)
scarf/ right light.
Salient features:
- First facial expression database to consider occlusions in face images.
- Inclusion of scream, a non-prototypic gesture.
- To enable testing and modelling, 22 facial feature points are manually labelled on each face.
I.IV. Cohn-Kanade facial expression database
Subjects in the available portion of the database were 97 university students enrolled in introductory psychology classes. They ranged in age from 18 to 30 years; sixty-five percent were female, 15 percent were African-American, and three percent were Asian or Latino. The observation room was equipped with a chair for the subject and two
Panasonic WV3230 cameras, each connected to a Panasonic S-VHS AG-7500 video recorder with a Horita synchronized time-code generator. One camera was located directly in front of the subject; the other was positioned 30 degrees to the subject's right [Kanade et al. (2000)]. Only image data from the frontal camera are available at this time. Subjects were instructed by an experimenter to perform a series of 23 facial displays that included single action units (e.g., AU 12, lip corners pulled obliquely) and combinations of action units (e.g., AU 1+2, inner and outer brows raised). Subjects began and ended each display from a neutral face, and the image sequences provided run from neutral to the target expression (see figure 4). Before each display, an experimenter described and modelled the desired display. Six of the displays were based on descriptions of the prototypic basic emotions (joy, surprise, anger, fear, disgust, and sadness). Image sequences from neutral to target display were digitized into 640 x 480 or 640 x 490 pixel arrays with 8-bit grayscale precision. The images are available in PNG and JPEG formats and are labelled using their corresponding VITC (vertical interval time code). The final frame of each image sequence was coded using FACS (Facial Action Coding System), which describes the subject's expression in terms of action units (AUs). FACS coding was performed by a certified FACS coder.
Courtesy: Jeffrey Cohn
Fig 4: An image sequence of a subject expressing 'surprise' from the Cohn-Kanade Facial Expression Database.
Salient features:
- Image sequences considered instead of mug shots.
- Evaluation performed based on action unit recognition.
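To make the AU-based labels concrete, here is a minimal sketch of interpreting them; the AU descriptions are the standard FACS definitions quoted above, while the parsing helper itself is only illustrative, not part of the database's tooling.

```python
# Minimal sketch: interpret FACS labels such as "12" or the combination "1+2".
# The '+'-separated label format mirrors how combinations are written above.
AU_NAMES = {
    1: "inner brow raiser",
    2: "outer brow raiser",
    12: "lip corner puller",
}

def parse_aus(label: str) -> list[int]:
    """Split a combination label like '1+2' into individual action units."""
    return [int(part) for part in label.split("+")]

for au in parse_aus("1+2"):
    print(f"AU {au}: {AU_NAMES.get(au, 'unknown')}")
```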
I.V. CAS-PEAL database
PEAL stands for Pose, Expression, Accessory and Lighting. CAS-PEAL is a large-scale Chinese face database with variations in pose, expression, accessory and lighting [Gao et al. (2004)]. The full database contains 99,594 images of 1040 individuals (595 males and 445 females). Five different expressions, six accessories (3 pairs of glasses and 3 caps), and 15 lighting conditions were considered while capturing the images. Nine equally spaced cameras were used to capture different horizontal poses simultaneously (see figure 6), and the subjects were also asked to look up and down to capture another set of 18 images. The conditions considered during database creation are listed in table 2. The database currently available for research is a subset containing 30,900 images of 1040 subjects. These images belong to two main subsets: a frontal subset and a pose subset. In the frontal subset, all images are captured from camera C4 with the subject looking directly into the camera. Among them, 377 subjects have images with 6 expressions (see figure 7), 438 subjects have images wearing 6 different accessories, 233 subjects have images under at least 9 lighting changes, 296 subjects have images against 2 or 4 different backgrounds, and 296 subjects have images at different distances from the cameras. Also, 66 subjects have images recorded in two sessions at a six-month interval (see figure 8). The pose subset includes images of 1040 subjects across 21 different poses without any other variations.
Fig 6: Plan form of CAS-PEAL camera system.
Table 2: Sources of variation considered in the CAS-PEAL database (#Viewpoints: 9)

Variation | Facing directions | Expression | Lighting | Accessory | Background | Aging | Distance
#Variations | 3 | 6 | 15 | 6 | 4 | 2 | 2
#Combined (x 9 viewpoints) | 27 | 54 | 135 | 54 | 36 | 18 | 18
#Total: 342
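The combined counts in table 2 are simply each variation count multiplied by the 9 viewpoints; a quick arithmetic check:

```python
viewpoints = 9
variations = {"facing directions": 3, "expression": 6, "lighting": 15,
              "accessory": 6, "background": 4, "aging": 2, "distance": 2}

# Each condition is captured from all 9 viewpoints simultaneously.
combined = {name: viewpoints * n for name, n in variations.items()}
print(combined["lighting"])    # 135
print(sum(combined.values()))  # 342, matching the table total
```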
Fig 7: Example images of one subject with 6 expressions across
three poses in CAS-PEAL database.
Fig 8: Example images captured with a time difference; the images in the bottom row were captured after six months.
Salient features:
- Time/age variation considered during image collection.
- Inclusion of multiple accessories in the database.
- Inclusion of surprise and open-mouth categories in the database.
I.VI. Korean face database
The Korean Face Database (KFDB) contains facial imagery of a large number of Korean subjects collected under carefully controlled conditions [Hwang et al. (2003)]. Images with varying pose, illumination, and facial expression were recorded. The subjects were imaged in the middle of an octagonal frame carrying seven cameras and eight lights (of two types: fluorescent and incandescent) against a blue-screen background. The cameras were placed between ±45° off frontal in both directions at 15° increments. Figure 9 shows example images for all seven poses. Pose images were collected in three styles: natural (no glasses, no hair band to hold back hair from the forehead), hair band, and glasses. The lights were located in a full circle around the subject at 45° intervals (see figure 10). Separate frontal-pose images were recorded with each light turned on individually, for both the fluorescent and incandescent lights. Figure 10 shows example images for all eight illumination conditions. In addition, five images under the frontal fluorescent lights were obtained with the subjects wearing glasses. The subjects were also asked to display five facial expressions (neutral, happy, surprise, anger, and blink), which were recorded under the two differently colored lights (see figure 11), resulting in 10 images per subject. In total, 52 images were obtained per subject. The database also contains extensive ground truth information: the location of 26 feature points (where visible) is available for each face image.
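The per-subject total quoted above follows from these conditions (7 poses x 3 styles, 8 light positions x 2 light types, 5 glasses images, and 5 expressions x 2 lights); a quick check:

```python
pose_images = 7 * 3        # 7 camera poses, 3 styles (natural, hair band, glasses)
light_images = 8 * 2       # 8 light positions, fluorescent and incandescent
glasses_images = 5         # frontal fluorescent images with glasses worn
expression_images = 5 * 2  # 5 expressions under two colored lights

total = pose_images + light_images + glasses_images + expression_images
print(total)  # 52 images per subject
```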
Fig 9: Pose variation in KFDB. The poses vary from +45° through full frontal to -45°.
Fig 10: Illumination variation in KFDB. Lights at eight positions (L1-L8) located in a full circle around the subject were used. For each position, images were taken with both fluorescent and incandescent lights.
Fig 11: Example colour images of expression changes under two kinds of illumination.
Salient features:
- Two types of illumination used (fluorescent and incandescent lights).
- Blinking, a non-prototypic gesture, considered in the database.
I.VII. MMI FE database
The MMI facial expression database was developed by the Man-Machine Interaction group of Delft University of Technology, Netherlands. It was the first web-based facial expression database [Pantic et al. (2005)]. The basic criteria defined for the database include easy accessibility, extensibility, manageability and user-friendliness, with online help files and various search criteria. The database contains both still images and video streams depicting the six basic expressions: happiness, anger, sadness, disgust, fear and surprise; activations of individual facial action muscles are also covered. The database was built using JavaScript, Macromedia Flash, MySQL, PHP and the Apache HTTP server: JavaScript for dynamic pages, Macromedia Flash for rich internet applications (animation features), MySQL as the database server, PHP for its compatibility with MySQL and its open-source nature, and the Apache HTTP server for its openness, security, extensibility and efficiency. The result is an easily searchable repository: over 200 images and 800 video sequences can be accessed by the user. The database currently has 308 active users.
Salient features:
- First web-based facial expression database.
- Includes both still images and image sequences.
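The web front end essentially searches recording metadata; the following minimal sketch conveys the idea. The table layout and field names are hypothetical, and SQLite (from Python's standard library) stands in for the MySQL server the developers used.

```python
import sqlite3

# Hypothetical metadata table; the real MMI schema is not published here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recording (id INTEGER, subject INTEGER, "
             "expression TEXT, kind TEXT)")  # kind: 'image' or 'sequence'
conn.execute("INSERT INTO recording VALUES (1, 7, 'happiness', 'sequence')")
conn.execute("INSERT INTO recording VALUES (2, 7, 'anger', 'image')")

# A search like the web interface's: all happiness sequences.
rows = conn.execute("SELECT id, subject FROM recording "
                    "WHERE expression = ? AND kind = ?",
                    ("happiness", "sequence")).fetchall()
print(rows)  # [(1, 7)]
```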
I.VIII. University of Texas video database
The database contains 284 students from the University of Texas (76 males, 208 females). It combines static images and video sequences of face images [OToole et al. (2005)]. The static images and video clips were shot indoors in a controlled environment (figure 12). The database also includes video sequences of people walking and conversing under variable illumination and distances. The high-quality facial
mug shots provide nine discrete views, ranging from left profile to right profile in equal steps of 22.5°. All participants were asked to wear a grey smock to hide their clothing from the camera. The video sequences fall into three varieties. The first is a moving version of the static mug shots: the subjects were asked to move their heads and pause briefly at the required angles, the whole clip lasting about 10 seconds. The second variety consists of dynamic facial speech videos capturing the rigid and non-rigid movements of the subject while speaking. The subjects were asked to animate their speech, combining head motions, facial expressions and eye gaze with speech movements; they answered a series of mundane questions, and their responses were recorded as dynamic video sequences. The audio was not recorded. The duration of each video sequence was 10 seconds. The third variety comprises facial expressions, both prototypic and non-prototypic: happiness, sadness, fear, disgust, anger, puzzlement, laughter, surprise, boredom and disbelief. The expressions have not been rated, and no ground truth is provided. In some instances more than one expression was expressed by the subject. The video sequences of people comprise two variations (figure 13): gait video and conversational video. In the gait video, the subjects walk parallel or perpendicular to the line of sight of the camera, approaching the camera but veering off to the left at the end. The conversation video shows a conversation between two people, one facing the camera and the other facing away; the subject facing the camera portrayed natural gestures, such as giving directions to various destinations in the building. Lighting was variable owing to light falling in from the glass windows. The close-range videos provide test stimuli for face recognition and tracking algorithms that operate when the head is undergoing rigid and/or non-rigid transformations. The dynamic mug shots, speech and expression videos are likewise useful for computer-graphics modeling of heads and facial animation. The entire database occupies about 160 GB and is available to researchers only. Images are provided in TIFF format and videos in DV stream format.
Fig 12: Row 1 shows a facial mug-shot series of nine still images, varying in pose from left (-90°) to right (+90°) profile in 22.5° steps. The second row contains five still images extracted from a facial speech video. The third and fourth rows contain images extracted from a disgust expression and a laughter expression video, respectively.
Fig 13: The first row of the figure contains five still images
extracted from a parallel gait video. The second row contains five
still images extracted from a perpendicular gait video. The third
row of the figure contains five still images extracted from a
conversation video.
Salient features:
- A combination of mug shots, image streams, speech videos, conversational videos and gait videos.
- Usable for a greater range of algorithm evaluations.
- Large memory requirement (about 160 GB).
I.IX. BU-3D FE databases
Binghamton University was instrumental in creating 3D facial expression databases for the purpose of algorithm evaluation. The databases come in two versions, one with static data and the other with dynamic data. The static database includes still color images, while the dynamic database contains video sequences of subjects with expressions. Both include the neutral expression along with the six prototypic expressions.
I.IX.I. BU-3DFE: 3D static facial expression database
While 3D facial models have been extensively used for 3D face recognition and 3D face animation, the usefulness of such data for 3D facial expression recognition was unknown [Yin et al. (2006)]. This 3D facial expression database (the BU-3DFE database) includes 100 subjects with 2500 facial expression models. The BU-3DFE database is available to the research community (areas of interest are as diverse as affective computing, computer vision, human-computer interaction, security, biomedicine, law enforcement, and psychology). The database presently contains 100 subjects (56% female, 44% male), ranging from 18 to 70 years of age, with a variety of ethnic/racial ancestries, including White, Black, East-Asian, Middle-East Asian, Indian, and Hispanic/Latino. Participants in the face scans include undergraduates, graduates and faculty from the university's departments of Psychology, Arts, and Engineering (Computer Science, Electrical Engineering, System Science, and Mechanical Engineering); the majority were undergraduates from the Psychology Department (collaborator: Dr. Peter Gerhardstein). Each subject performed seven expressions in front of the 3D face scanner (see right of figure 14). With the exception of the neutral expression, each of the six prototypic expressions (happiness, disgust, fear, anger, surprise and sadness) was captured at four levels of intensity (see left of figure 14). There are therefore 25 instant 3D expression models per subject, giving a total of 2,500 3D facial expression models in the database. Associated with each expression shape model is a corresponding facial texture image captured from two views (about +45° and -45°). As a result, the database consists of 2,500 two-view texture images and 2,500 geometric shape models.
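The model count follows directly from the capture protocol (six prototypic expressions at four intensity levels, plus one neutral, per subject):

```python
subjects = 100
expressions, intensity_levels = 6, 4

models_per_subject = expressions * intensity_levels + 1  # +1 for neutral
print(models_per_subject)             # 25
print(models_per_subject * subjects)  # 2500 3D expression models in total
```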
Fig 14: (a) Four levels of facial expression from low to high; the expression models show the cropped face region and the entire facial head. (b) The seven expressions (neutral, anger, disgust, fear, happiness, sadness and surprise) for a female and a male subject, with face images and face models, from the BU-3DFE database.
Salient features:
- Introduction of 3D into facial expression databases.
- Inclusion of 3D models along with the images.
- Intensity levels for expressions considered.
I.IX.II. BU-4DFE (3D + time): 3D dynamic facial expression database
To extend the analysis of facial behaviour from a static 3D space to a dynamic 3D space, the BU-3DFE was extended to the BU-4DFE. This newly created high-resolution 3D dynamic facial expression database is available to the scientific research community [Yin et al. (2008)]. The 3D facial expressions are captured at video rate (25 frames per second). For each subject, there are six model sequences showing the six prototypic facial expressions (anger, disgust, happiness, fear, sadness, and surprise). Each expression sequence contains about 100 frames (a sample is shown in figure 16). The database contains 606 3D facial expression sequences captured from 101 subjects, for a total of approximately 60,600 frame models. Each 3D model of a 3D video sequence has a resolution of approximately 35,000 vertices (see figure 15), and the texture video has a resolution of about 1040 x 1329 pixels per frame. The database comprises 58 female and 43 male subjects, with a variety of ethnic/racial ancestries, including Asian, Black, Hispanic/Latino, and White. It shares the salient features of the static 3D database in the previous sub-section, with dynamic characteristics in addition.
Fig 15: Individual model views from the BU-4DFE database.
Fig 16: Sample expression image and model sequences (male and female) from the BU-4DFE database.
Salient features:
- Includes all features of the BU-3DFE database, plus dynamic characteristics.
- The image sequences begin and end with a neutral expression.
I.X. FG-NET database
The FG-NET Database with Facial Expressions and Emotions from the Technical University of Munich is an image database containing face images of a number of subjects performing the six basic emotions defined by Ekman and Friesen. It was developed to assist researchers who investigate the effects of different facial expressions [Wallhoff (2006)], and was generated as part of the European Union project FG-NET (Face and Gesture Recognition Research Network). One of its underlying paradigms is to let the observed people react as naturally as possible. Consequently, an attempt was made to evoke genuine emotions by playing video clips or showing still images after a short introduction phase, rather than asking the person
to play a role. As a result, head movements in all directions are included. The covered emotions are happiness, disgust, anger, fear, sadness, surprise and neutral (see figure 17 for a typical image sequence). The images were acquired using a Sony XC-999P camera equipped with an 8mm COSMICAR 1:1.4 television lens. A BTTV 878 frame-grabber card was used to grab the images at a size of 640 x 480 pixels, a color depth of 24 bits and a frame rate of 25 frames per second. For storage reasons, the images were converted to 8-bit JPEG-compressed images of 320 x 240 pixels. The database can be downloaded as a collection of MPEG-compressed movies. After extraction, the images are stored separately in subdirectories named {anger, disgs, fears, happy, neutr, sadns, surpr}.
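A minimal sketch of walking that directory layout after extraction follows; the root path is hypothetical, while the subdirectory names are the ones listed above.

```python
from pathlib import Path

EMOTION_DIRS = ["anger", "disgs", "fears", "happy", "neutr", "sadns", "surpr"]
root = Path("feedtum_extracted")  # hypothetical extraction root

# Collect the JPEG frames stored under each emotion's subdirectory.
for emotion in EMOTION_DIRS:
    frames = sorted((root / emotion).glob("*.jpg"))
    print(emotion, len(frames), "frames")
```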
Fig 17: Image sequence of a subject from the neutral state to the emotion (happiness) state.
Salient feature:
- The expressions on the faces are as natural as possible.
I.XI. FE database of MPI for Biological Cybernetics
The faces were filmed in a purpose-built video laboratory using five synchronized digital cameras: one frontal view and four profile views at 22° and 45° [Pilz et al. (2006)]. To avoid the effect of hair and clothing, each subject was provided with a black cap and a black scarf while posing for the cameras (figure 18(b)). Eight amateur actors were filmed making a range of isolated expression gestures. Two clips were used, relating to the anger and surprise expressions; the actors were asked to act out the situation using words like "wow!" for surprise and "what!" for anger. The system was designed as a distributed computer cluster of video and audio recording nodes (see figure 18(a)). Each recording node consists of a specialized digital video camera, a specialized frame grabber and, optionally, a sound card, attached to a standard Intel x86-compatible PC with fast hard disks. The computers of the nodes were connected to each other, to a control computer, and to a file server via a standard 100 Mbit/s Ethernet local area network. The frame grabbers of all nodes were connected to each other for the transmission of an electronic trigger signal that allows high-precision synchronization of frame capture between the nodes.
Fig. 18 (a) Schematic overview of the Max Planck video lab, (b)
Example stimuli from one actor showing the five different
perspectives used in the current experiments.
Salient feature:
- Highly uniform/normalized images owing to the use of a black cap and scarf.
I.XII. Radboud Faces Database (RaFD)
The RaFD is a set of pictures of 67 models (including Caucasian males and females, Caucasian children, both girls and boys, and Moroccan Dutch males) displaying eight different emotions [Langner et al. (2010)]. The RaFD is an initiative of the Behavioural Science Institute of the Radboud University Nijmegen (the Netherlands), and can be used freely for non-commercial scientific research by researchers at officially accredited universities. Following the Facial Action Coding System, each model was trained to show the following expressions: anger, disgust, fear, happiness, sadness, surprise, contempt, and neutral (figure 19(a)). Each emotion was shown with three different gaze directions, and all pictures were taken from five camera angles simultaneously (figures 19(b) and (c), respectively). The targeted emotional expressions were based on prototypes [Ekman et al. (2002)]; the action units targeted in this database are a variation of the Directed Facial Action Task [Ekman (2007)], as shown in figure 20. The number of registered users is already 334.
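The image count given for RaFD in table 1 follows from this design (8 expressions x 3 gaze directions x 5 camera angles per model, over 67 models):

```python
models = 67
expressions, gazes, cameras = 8, 3, 5

per_model = expressions * gazes * cameras
print(per_model)           # 120 pictures per model
print(per_model * models)  # 8040, matching table 1
```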
Fig. 19 (a) Eight emotional expressions, from top left: sad, neutral, anger, contemptuous, disgust, surprise, fear and happiness; (b) three gaze directions: left, straight and right; (c) five camera angles at 180°, 135°, 90°, 45° and 0°.
Fig. 20 Targeted action units (AUs) for all emotional expressions.
Salient features:
- Contempt, a non-prototypic expression, is included.
- Different gaze directions are considered.
- The most recent facial expression database in this survey.
II. Conclusions
The databases discussed in this paper, chronologically ordered from 1996 to 2010, are currently freely available for research evaluation purposes (other available facial expression databases either restrict usage to a particular group or charge an access fee). There are other face databases, but most of them are either
inaccessible or inactive. The databases surveyed here can be accessed by contacting the respective resource persons. The authors have restricted the list to databases that can be used for facial expression recognition, not merely face recognition. Almost all of them have good ground truth and have been evaluated by their creators using standard recognition methods. Table 3 below lists the resource persons to contact for access to the databases.
Table 3: Information on database accessibility

S no. | Name of the database | Contact for accessibility | University / country
1 | FERET database | http://face.nist.gov/colorferet/request.html | George Mason University, USA
2 | JAFFE database | Michael J Lyons, http://www.kasrl.org/jaffe_download.html | Psychology Department, Kyushu University, Japan
3 | AR database | Aleix M Martinez, [email protected] | Computer Vision Center, Purdue University, Barcelona, Spain
4 | Cohn-Kanade facial expression database | Jeffrey Cohn, [email protected] | Carnegie Mellon University, Robotics Institute, Pittsburgh
5 | CAS-PEAL database | Shaoxin Li, [email protected] | Face Group, Chinese Academy of Sciences, China
6 | Korean Face database | http://www.kisa.or.kr/eng/main.jsp | Center for Artificial Vision Research, Korea University, Korea
7 | MMI FE database | Maja Pantic, [email protected] | Delft University of Technology, Delft, The Netherlands
8 | University of Texas Video database | Alice O'Toole, [email protected] | University of Texas, Dallas
9 | BU-3D FE database | Lijun Yin, [email protected] | Binghamton University, State University of New York
10 | FG-NET database | [email protected] | Technical University of Munich, Munich
11 | MPI for Biological Cybernetics database | http://faces.kyb.tuebingen.mpg.de/index.php | Max Planck Institute for Biological Cybernetics, Tübingen, Germany
12 | Radboud Faces Database | Ron Dotsch, www.rafd.nl | Radboud University, Nijmegen, The Netherlands
Acknowledgement
The authors are grateful to all the developers of the facial expression databases mentioned in this paper: Michael J Lyons, Aleix M Martinez, Jeffrey Cohn, Shaoxin Li, Maja Pantic, Alice O'Toole, Lijun Yin, Ron Dotsch, and all others who contributed valuable suggestions and feedback. The authors are also grateful to the Department of Electronics and Communications Engineering, BMS College of Engineering, Bangalore, for extending their support.
References
[1] Ekman P, Friesen W V, and Hager J C (2002), Facial Action Coding System: Investigator's Guide, Salt Lake City, UT: Research Nexus.
[2] Ekman P (2007), The directed facial action task, in Handbook of Emotion Elicitation and Assessment, Oxford, UK: Oxford University Press.
[3] Gao W, Cao B, Shan S, Zhou D, Zhang X, and Zhao D (2004), CAS-PEAL large-scale Chinese face database and evaluation protocols, Technical Report JDL-TR-04-FR-001, Joint Research & Development Laboratory.
[4] Gross R (2005), Face databases, in Handbook of Face Recognition, Springer-Verlag.
[5] Hwang B W, Byun H, Roh M C, and Lee S W (2003), Performance evaluation of face recognition algorithms on the Asian face database, KFDB, in Audio- and Video-Based Biometric Person Authentication (AVBPA).
[6] Kanade T, Cohn J, and Tian Y (2000), Comprehensive database for facial expression analysis, in Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition.
[7] Langner O, Dotsch R, Bijlstra G, Wigboldus D H J, Hawk S T, and Van Knippenberg A (2010), Presentation and validation of the Radboud Faces Database, Cognition and Emotion, Psychology Press.
[8] Lyons M, Akamatsu S, Kamachi M, and Gyoba J (1998), Coding facial expressions with Gabor wavelets, in 3rd International Conference on Automatic Face and Gesture Recognition.
[9] Martinez A M and Benavente R (1998), The AR face database, Computer Vision Center (CVC) Technical Report, Barcelona.
[10] O'Toole A J, Harms J, Snow S L, Hurst D R, Pappas M R, Ayyad J H, and Abdi H (2005), A video database of moving faces and people, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Pantic M, Valstar M, Rademaker R, and Maat L (2005), Web-based database for facial expression analysis, in Proc. IEEE International Conference on Multimedia and Expo (ICME'05).
[12] Phillips P J, Moon H, Rizvi S, and Rauss P J (2000), The FERET evaluation methodology for face-recognition algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(10).
[13] Pilz K S, Thornton I M, and Bülthoff H H (2006), A search advantage for faces learned in motion, Experimental Brain Research, 171(4).
[14] Wallhoff F (2006), Facial Expressions and Emotion Database, http://www.mmk.ei.tum.de/~waf/fgnet/feedtum.html, Technische Universität München.
[15] Yin L, Wei X, Sun Y, Wang J, and Rosato M J (2006), A 3D facial expression database for facial behavior research, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).
[16] Yin L, Chen X, Sun Y, Worm T, and Reale M (2008), A high-resolution 3D dynamic facial expression database, 8th International Conference on Automatic Face and Gesture Recognition (FGR08).