EmoTABOU Corpus
J-C. Martin, L. Devillers, A. Zara – LIMSI-CNRS
V. Maffiolo, G. Le Chenadec – France Télécom R&D, France
Jan 22, 2016
Research Context and Goals
Research context
Long-term goal: a model of human-computer emotional interaction
Requires knowledge of emotional multimodal behaviors during human-human interaction (e.g. synchronization between modalities)
Corpus-based approach
Existing experimental data and studies are limited:
  monomodal
  acted single emotions
  emotion but interaction not videotaped (EmoTV)
  or interaction videotaped but not emotional
Related work: discriminative features of movement quality (Wallbott 98)
Emotion and movement quality:
  Hot anger: high movement activity, expansive movements, high movement dynamics
  Elated joy: high movement activity, expansive movements, high movement dynamics
  Happiness: low movement dynamics
  Disgust: inexpansive movements
  Contempt: low movement activity, low movement dynamics
  Sadness: low movement activity, low movement dynamics
  Despair: expansive movements
  Terror: high movement activity
  Boredom: low movement activity, inexpansive movements, low movement dynamics
  Shame, Interest, Pride: –
How does this apply to behaviors that are not instructed, in-lab acting of single emotions?
Research context
Cognitive-motivational-emotive systems such as EMA (Gratch and Marsella) are mainly based on theoretical psycho-cognitive models and on behavioral models derived from acted emotions.
Our aim is to observe and analyze corpora of spontaneous human-human interaction with emotional multimodal behavior, in order to build more realistic and “natural” behavioral models.
Research questions
How do emotion and interaction combine?
What is the impact of both on the synchrony between modalities?
  Gesture stroke phase and lexical affiliate (McNeill 05), max F0
  Gaze during mental states (Baron-Cohen 97) and turn-taking (Allwood 06)
How to collect behaviors that are spontaneous, multimodal, emotional, and occur during interaction?
Experimental Protocol
Experimental protocol: adaptation of the Taboo game
Taboo game: 1 card with one secret word and 5 forbidden words
Multimodal elicitation: iconic and deictic gestures
Experimental protocol: adaptation of the Taboo game
Emotion elicitation:
  uncommon words to elicit surprise or embarrassment
  one player was a naive subject; the other player was instructed to elicit emotion using strategies (e.g. deliberately not finding the word)
Collected data
10 pairs of players
8 hours of video
Upper body + face close-up
[Setup diagram: participants seated around a table; C = associate (confederate), S = subject, E = experimenter]
Sample clip: « palimpseste »
Levels of annotation
• multimodal behavior (acoustic/gestures/face)
• linguistic behavior (dialog/communicative acts)
• emotional/mental state behavior
• strategic behavior
Annotation of emotion & context
Previous scheme
Multi-level scheme for emotion and context representation:
  emotion labels (broad sense, including attitude, emotion, mood)
  dimensions (valence and intensity)
  contextual information (quality, speaker, etc.)
EmoTV (Devillers, Abrilian, Martin, 2005); CEMO (Devillers et al., 2005)
EmoTabou scheme
Adaptation of our previous scheme for emotion annotation in interaction
We added:
  a more general set of mental states
  dialog acts
  communicative acts
  a contextual information scheme (sub-dialog of the game, role of the subject, card, etc.)
  meta-information
Emotion labels in EmoTabou
The protocol for obtaining this list was to rate the emotion words of the Humaine list (55 terms) in terms of their relevance for the task (majority voting procedure, 5 people); a sketch of this selection step is given after the label list.
In order to represent complex emotions, we allow the annotation of at most 5 emotions per segment.
We then combine the annotations of the several labelers into a soft-vector representation.
Positive: Amusement, Excitation POS, Satisfaction, Joy, Pride, Contentment, Relief, Surprise POS
Negative: Sadness, Cold anger, Boredom, Worry, Guilt, Agitation, Nervousness, Weariness, Anxiety, Frustration, Disappointment, Irritation, Embarrassment, Exasperation, Stress, Surprise NEG
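As an illustration of the label-selection step described above, a minimal Python sketch of a majority vote over the raters' relevance judgments; the rater data and the "3 out of 5" threshold are hypothetical, not the actual ratings used for EmoTabou:

```python
from collections import Counter

# Hypothetical relevance judgments: each of the 5 raters lists the Humaine
# terms they consider relevant for the Taboo task (excerpt only).
ratings = [
    {"Amusement", "Stress", "Irritation", "Pride"},
    {"Amusement", "Stress", "Boredom"},
    {"Amusement", "Irritation", "Stress"},
    {"Amusement", "Relief", "Stress"},
    {"Amusement", "Stress", "Irritation"},
]

votes = Counter(term for rater in ratings for term in rater)
# Keep the terms judged relevant by a strict majority of the 5 raters.
selected = sorted(term for term, n in votes.items() if n >= 3)
print(selected)  # ['Amusement', 'Irritation', 'Stress']
```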
Mental states and emotions
In other studies, lists of emotional labels have been extended with other “mental states” (Reidsma 06, Le Chenadec et al., 05)
We added to our list the mental states defined by Baron-Cohen (96) (e.g. “Thinking”, “Unsure”)
Aims:
  study the relation between the emotions and mental states of the two players
  study how emotions and mental states are expressed through multimodal behaviors in human interactions
Mental states (Baron-Cohen, 1996)
Dialog acts (DAMSL)
DAMSL scheme (“Dialog Act Markup in Several Layers”): annotation of interaction
  4 levels: Information Level, Communicative Status, and Forward- and Backward-Looking Functions
  more than 75 tags
Experiments using multi-level annotations (dialogic and emotion tags), carried out in FP5-AMITIES (Devillers 02), showed correlations between emotion and some dialog acts in speech (e.g. anger with repetition)
We use a reduced set of DAMSL tags adapted to EmoTabou
Dialog acts (DAMSL): reduced tag set
Forward-Looking Functions: Give a cue, Suggest a word, Assert other, Ask a question
Backward-Looking Functions: Understand, Answer Yes, Answer No, Don’t know, Interjection
Communicative Status: Unintelligible
Communicative functions
Previous works have already provided lists of communicative functions (Poggi, Pelachaud)
Here, we defined a list after analysing our corpus:
  Abandon, Disapprove, Criticize, Self-criticize, Lack-of-confidence, Doubt about other, Rush, Unkind, Irony, Mocking, Joke, Sarcastic
  Admire, Approve, Congratulate, Encourage, Propose strategy
Contextual information and meta-information
We defined a contextual information scheme:
  strategies: list of strategies given to the associate or observed in the corpus
  game phases: give a card, play, give the result
  card: the word to guess
  player role: “devin” (mind-reader) or mime
Meta-information (post-game information):
  subject personality (Eysenck Personality Inventory)
  questionnaire (emotions felt and elicited)
Examples of the different strategies
Associate – Irritation: the associates have the instruction to criticize the subject
Card – Embarrassment: to embarrass the subject, unusual words were chosen, such as « palimpseste »
Experimenter – Stress: the subject has 2 minutes to guess a word; after 1 min 30, the experimenter announces 30 seconds left, then 15 seconds…
Annotation protocol for emotion and context
The coding scheme is implemented in Anvil (Kipp 04)
To annotate the corpus, we proceed in the following way:
  1) segmentation
  2) annotation by four annotators
Iterative definition of the coding scheme:
  test with one video
  measure agreement (intra- and inter-coder agreement; see the sketch below)
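As an illustration of the agreement step, a minimal sketch computing Cohen's kappa between two coders, assuming each coder assigns one dominant emotion label per segment; the slides do not specify which agreement metric was actually used, and the label sequences below are hypothetical:

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders on the same segments."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical dominant emotion label per segment, from two coders
coder1 = ["Amusement", "Stress", "Embarrassment", "Amusement", "Irritation"]
coder2 = ["Amusement", "Stress", "Amusement", "Amusement", "Irritation"]
print(round(cohen_kappa(coder1, coder2), 2))  # 0.71
```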
Annotation of multimodal behaviors
Informal study of the collected behaviors
[Figure, four stills: (a) iconic gesture describing the action “turning a split”; (b) deictic gesture indicating the scores listed on the blackboard; (c) adaptor gesture done by the naive subject and (d) its imitation by the instructed subject]
Annotation of multimodal expressive behaviors
Gaze direction
Gesture: phase, function (including manipulators), expressivity (adapted from Pelachaud 05)
Facial expressions (subset of AUs)
Head movement
Posture
Annotation of multimodal expressive behaviors
Anvil (Kipp 04)
1 coder
Agreement
Future Directions: Illustrations of possible measures
Descriptive analysis of one clip: emotion distribution
Speaker turns: 60% of clip (Subject: 40%, Associate: 20%)
Main emotions, Subject (77% of clip): Amusement 36.6%, Stress 11.8%, Exasperation 9.7%, Embarrassment 9.25%
Main emotions, Associate: Embarrassment 34.4%, Amusement 27.6%, Stress 11.4%
Emotions vs mental states
We observed some relations between our set of emotions and the more general mental states, e.g. Embarrassment -> Unsure
Example: soft-vector representation
(Emotion label, weight, intensity, valence); (Mental state, weight)
Subject: “ok, I propose that we do not even try this one and accept the penalty”
Subject.emotion (amusement 2/6, 4, 4; embarrassment 1/6, 5, 2; disappointment 1/6, 4, 2; pride 1/6, 4, 4; calm 1/6, 1, 3)
Subject.MentalState (unsure 2/3; thinking 1/3)
Associate.emotion (amusement 2/5, 3, 4; frustration 1/5, 1, 2; embarrassment 1/5, 3, 3; irritation 1/5, 4, 2)
Associate.MentalState ()
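A minimal sketch of how such a soft vector could be built from the individual annotations, assuming each annotator contributes (label, intensity, valence) tuples for the segment; the weight is the label's share of all annotated labels, and averaging intensity and valence over the annotators who used the label is an assumption rather than the documented procedure:

```python
from collections import Counter

def soft_vector(annotations):
    """Combine emotion labels from several annotators into a soft vector.

    `annotations` is a list of per-annotator label lists; each element is a
    (label, intensity, valence) tuple (at most 5 labels per annotator and
    segment). Returns {label: (weight, mean intensity, mean valence)}.
    """
    flat = [item for coder in annotations for item in coder]
    counts = Counter(label for label, _, _ in flat)
    total = len(flat)
    vector = {}
    for label in counts:
        intensities = [i for (l, i, _) in flat if l == label]
        valences = [v for (l, _, v) in flat if l == label]
        vector[label] = (counts[label] / total,
                         sum(intensities) / len(intensities),
                         sum(valences) / len(valences))
    return vector

# Hypothetical annotations of the subject's segment by three coders
coders = [
    [("amusement", 4, 4), ("embarrassment", 5, 2)],
    [("amusement", 4, 4), ("disappointment", 4, 2)],
    [("pride", 4, 4), ("calm", 1, 3)],
]
print(soft_vector(coders))
# {'amusement': (0.333..., 4.0, 4.0), 'embarrassment': (0.166..., 5.0, 2.0), ...}
```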
Emotions / communicative acts
Subject – Associate:
  Amusement: Abandon 33%, Criticize 39%, Lack-of-confidence 31%, Doubt about other 0%
  Embarrassment: Abandon 33%, Criticize 61%, Lack-of-confidence 28%, Doubt about other 100%
Associate – Subject:
  Irritation: Criticize 2%, Disapprove 0, Lack-of-confidence 1%, Doubt about other 34%
  Embarrassment: Criticize 18%, Disapprove 0, Lack-of-confidence 1%, Doubt about other 0
  Exasperation (annoyance): Criticize 26%, Disapprove 3%, Lack-of-confidence 21%, Doubt about other 0
  Stress: Criticize 18%, Disapprove 3%, Lack-of-confidence 0, Doubt about other 1%
  Amusement: Criticize 34%, Disapprove 29%, Lack-of-confidence 0, Doubt about other 46%
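For illustration, a minimal sketch of how such emotion / communicative-act co-occurrence percentages could be derived from time-aligned annotation tracks of the two players; the interval format, track contents, and counting rule (share of emotion segments overlapped by the act) are assumptions, not the actual Anvil export or the authors' exact measure:

```python
def overlaps(a, b):
    """True if two (start, end) intervals overlap."""
    return a[0] < b[1] and b[0] < a[1]

def cooccurrence(emotion_segments, act_segments):
    """For each emotion label, the share of its segments during which each
    communicative act of the other player occurs at least once.

    Both arguments map labels to lists of (start, end) intervals in seconds.
    """
    table = {}
    for emo, spans in emotion_segments.items():
        table[emo] = {
            act: sum(any(overlaps(s, a) for a in acts) for s in spans) / len(spans)
            for act, acts in act_segments.items()
        }
    return table

# Hypothetical excerpt: subject emotions vs. associate communicative acts
subject_emotions = {"Amusement": [(2.0, 6.0), (10.0, 12.0), (20.0, 22.0)]}
associate_acts = {"Criticize": [(4.0, 7.0)], "Doubt about other": [(30.0, 32.0)]}
print(cooccurrence(subject_emotions, associate_acts))
# {'Amusement': {'Criticize': 0.333..., 'Doubt about other': 0.0}}
```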
Multimodal behaviors: illustration of measures to be done
Naïve player gestures (RH + symmetrical gestures): 83% of video / 7 gesture units
High percentage of manipulators (48%) and holds (60%)
Gaze direction x gesture type (see the sketch below):
  adaptors: 59% interlocutor, 31% elsewhere
  deictics: 63% interlocutor, 18% panel, 19% elsewhere
Expressivity and gesture type: 59% of deictics expanded, 94% of adaptors contracted
Expressivity and phases: 44% of smooth gestures occur during the stroke
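A minimal sketch of the gaze direction x gesture type measure, assuming each gesture and each gaze span is available as a (start, end, label) interval; the data format and function names are hypothetical, not the actual annotation export:

```python
from collections import defaultdict

def gaze_by_gesture_type(gestures, gaze):
    """Distribution of gaze targets over the duration of each gesture type.

    `gestures`: list of (start, end, gesture_type); `gaze`: list of
    (start, end, target). Returns, per gesture type, the share of gesture
    time spent gazing at each target.
    """
    time = defaultdict(lambda: defaultdict(float))
    for gs, ge, gtype in gestures:
        for s, e, target in gaze:
            time[gtype][target] += max(0.0, min(ge, e) - max(gs, s))
    return {
        gtype: {t: d / sum(targets.values()) for t, d in targets.items()}
        for gtype, targets in time.items()
    }

# Hypothetical annotation excerpts (seconds)
gestures = [(0.0, 2.0, "deictic"), (3.0, 6.0, "adaptor")]
gaze = [(0.0, 1.5, "interlocutor"), (1.5, 4.0, "panel"), (4.0, 6.0, "interlocutor")]
print(gaze_by_gesture_type(gestures, gaze))
# {'deictic': {'interlocutor': 0.75, 'panel': 0.25},
#  'adaptor': {'interlocutor': 0.666..., 'panel': 0.333...}}
```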
Future directions: synchronization between modalities
Further annotations: lexical affiliate, facial expression close-up
Temporal analysis: between modalities; between modalities / mental states
Measures: gaze / mental state; individual behaviors
Future directions: synchronization between modalities
Annotation and measures: feedback, sequencing and turn management (Allwood 06); imitation
Relations between the different levels of annotation in the behavior of the two players
Comparison with pairs of naïve subjects