
JC Martin - LIMSI/CNRS - WP5 WS 1

Manual Annotation of Multimodal Behaviors in Emotional TV Interviews

J.-C. Martin, S. Abrilian, L. Devillers

LIMSI-CNRS, France

JC Martin - LIMSI/CNRS - WP5 WS 2

Outline

Introduction
- Goals
- Requirements on annotation
- Emotional parameters of multimodal behaviors

Coding scheme
- 1st coding scheme and annotation
- 2nd coding scheme and example on 1 video

Future directions

JC Martin - LIMSI/CNRS - WP5 WS

3

Introduction

JC Martin - LIMSI/CNRS - WP5 WS 4

Introduction: Goals

- How do modalities correlate in non-acted emotions? Annotations and models as one source of knowledge:
  - Coordination between modalities during non-acted emotion
  - Synthesis of non-acted spontaneous multimodal emotions in ECAs
- How to code/represent multimodal emotional behavior?
  - Methodology (which attributes can be annotated easily by hand)
  - Trade-off / intermediate level:
    - Manual global free-text annotation of the whole video
    - Manual medium/high-order signs
    - Automatic low-level signs
- WP5 + WP6 + WP4 + (WP3)

JC Martin - LIMSI/CNRS - WP5 WS 5

Introduction: Requirements on the coding scheme

- Enable annotation (or computation):
  - Literature: main attributes of emotional behaviors
  - Corpus-based approach: cover behaviors observed in EmoTV
- Multi-level annotation of temporal data (sketched below):
  - Global annotation: manual annotation of multimodal signs for the global sequence; computations from manual annotations in each modality (mono, red, comp)
  - Emotional segment level: computations from manual annotations in each modality (mono, red, comp)
- Provide one source of knowledge for ECA specification
- Enable reliability and readability
- Annotation time
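A minimal sketch of how such a multi-level annotation could be represented; the class and field names (ModalityAnnotation, EmotionalSegment, ClipAnnotation) are hypothetical and are not part of the EmoTV coding scheme itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ModalityAnnotation:
    """One manually annotated sign in a single modality (e.g. 'gesture', 'facial')."""
    modality: str
    label: str
    start: float  # seconds
    end: float    # seconds

@dataclass
class EmotionalSegment:
    """An emotional segment of the clip, with its own per-modality annotations."""
    emotion_label: str
    start: float
    end: float
    signs: List[ModalityAnnotation] = field(default_factory=list)

@dataclass
class ClipAnnotation:
    """Global level: whole-clip signs plus a list of emotional segments."""
    clip_id: str
    global_signs: List[ModalityAnnotation] = field(default_factory=list)
    segments: List[EmotionalSegment] = field(default_factory=list)

    def signs_in_segment(self, segment: EmotionalSegment) -> List[ModalityAnnotation]:
        """Computation from manual annotations: global signs overlapping a segment."""
        return [s for s in self.global_signs
                if s.start < segment.end and s.end > segment.start]
```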

JC Martin - LIMSI/CNRS - WP5 WS 6

Introduction: Emotional parameters of multimodal behaviors

Psychology & behavior:
- Montepare, J., Koff, E., Zaitchik, D. and Albert, M. (1999). "The use of body movements and gestures as cues to emotions in younger and older adults." Journal of Nonverbal Behavior.
- Wallbott, H. G. (1998). "Bodily expression of emotion." European Journal of Social Psychology.

- Detection of emotions + relevant non-verbal behaviors
- Acted data
- +/- basic emotions
- Age, gender
- Facial expression masked

Expressivity in ECAs (Hartmann & Pelachaud 2004)

JC Martin - LIMSI/CNRS - WP5 WS 7

Introduction: Emotional parameters of multimodal behaviors

(Boone and Cunningham 1996; Boone and Cunningham 1998): changes in tempo, directional changes, frequency, muscle tension, duration. Acted.

(De Meijer 1991): trunk (stretching, bowing), arm (opening, closing), vertical direction (upward, downward), sagittal direction (forward, backward), force (strong, light), velocity (fast, slow), directness. Acted.

JC Martin - LIMSI/CNRS - WP5 WS 8

Introduction: Multimodal corpora from TV clips
- Communicative functions: Kipp (2003), MUMIN (Allwood et al. 2004), Musical Score (Magno Caldognetto et al. 2004)
- Emotions / informal annotation: Orage (Atifi and Marcoccia 2001)

JC Martin - LIMSI/CNRS - WP5 WS 9

Coding Scheme

JC Martin - LIMSI/CNRS - WP5 WS 10

Current status

- 1st: annotation of 35 clips from EmoTV by 2 coders
- 2nd: iterative definition and application to 1 clip of EmoTV using Anvil (SA, JCM)
  - Annotation guide written
  - 1 meeting with Catherine Pelachaud (Paris 8) to investigate use for WP6

JC Martin - LIMSI/CNRS - WP5 WS 11

Movement quality: annotated vs. computed
- Quality (annotated): number of repetitions; fluidity: smooth / normal / jerky; strength: soft / normal / hard; speed: slow / normal / fast; spatial expansion: contracted / normal / expanded
- Computed: start / end / duration; movement direction, type, angle approximation
- Torso: computed from the Pose track
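A sketch of the annotated movement-quality attributes and their three-value scales as a record; the names (MovementQuality and the Enum classes) simply mirror the scales above and are illustrative, not the actual Anvil track definition.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Fluidity(Enum):
    SMOOTH = "smooth"; NORMAL = "normal"; JERKY = "jerky"

class Strength(Enum):
    SOFT = "soft"; NORMAL = "normal"; HARD = "hard"

class Speed(Enum):
    SLOW = "slow"; NORMAL = "normal"; FAST = "fast"

class SpatialExpansion(Enum):
    CONTRACTED = "contracted"; NORMAL = "normal"; EXPANDED = "expanded"

@dataclass
class MovementQuality:
    """Manually annotated quality attributes of one movement."""
    repetitions: int
    fluidity: Fluidity
    strength: Strength
    speed: Speed
    spatial_expansion: SpatialExpansion
    # Computed rather than annotated: boundaries taken from the track / Pose track.
    start: Optional[float] = None
    end: Optional[float] = None

    @property
    def duration(self) -> Optional[float]:
        """Computed attribute: duration from start/end, if available."""
        if self.start is None or self.end is None:
            return None
        return self.end - self.start
```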

JC Martin - LIMSI/CNRS - WP5 WS 12

Annotation #1: Multimodal coding scheme
- Speech: transcription including non-verbal events (laughter, crying, ...)
- Posture: pose; posture shift including speed and action (4 cues with 3 to 10 attributes per cue, for instance cue = action, attribute = walk)
- Gestures: phases of gesture (preparation, stroke, retraction), handedness, speed, energy, spatial region, hand shape, direction of gesture, gesture type (beats, adaptors, deictics, ...)
- Facial expressions: subset of Facial Animation Parameters (FAPs)
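As a rough illustration, this first coding scheme can be written down as a plain mapping from tracks to their attributes; the dictionary below is a paraphrase of the slide, not the actual Anvil specification file.

```python
# Hypothetical summary of coding scheme #1 as a track -> attributes mapping.
CODING_SCHEME_1 = {
    "speech": ["transcription", "non_verbal_events"],          # laughter, crying, ...
    "posture": ["pose", "shift_speed", "shift_action"],        # e.g. action = walk
    "gesture": ["phase", "handedness", "speed", "energy",
                "spatial_region", "hand_shape", "direction", "type"],
    "facial": ["fap_subset"],                                   # subset of MPEG-4 FAPs
}

def attribute_count(scheme: dict) -> int:
    """Count attributes over all tracks (a quick sanity check on scheme size)."""
    return sum(len(attrs) for attrs in scheme.values())

if __name__ == "__main__":
    print(attribute_count(CODING_SCHEME_1))  # -> 14
```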

JC Martin - LIMSI/CNRS - WP5 WS 13

Annotation #1: Statistics
- Most frequently annotated behaviors: facial expressions (78.6% of annotated multimodal behaviors for coder1, 80.4% for coder2), gestures (11.3% for coder1, 11.9% for coder2), posture (10% for coder1, 7.7% for coder2).
- Most frequent attributes: gaze direction (26.8% for coder1, 17% for coder2), head movements (23.5% for coder1, 21% for coder2), blinking (15.8% for coder1, 17.6% for coder2), eyebrow movements (10% for coder1, 9.3% for coder2).
- The coders quantitatively agreed on some attributes (number of annotations of preparation and stroke gesture phases, number of annotations of posture-shift speed).
- Coder1 was more sensitive than coder2 in all modalities. Disagreements occurred on body poses, gesture type and energy.
- Coder1 annotated subtle body movements, whereas coder2 annotated only clearly visible movements. Coder2 associated gesture energy with gesture speed, while coder1 differentiated the two attributes, perceiving that a gesture might have high energy and slow motion.
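A sketch of how such per-coder percentages can be recomputed from lists of annotations; the input format (a list of (coder, modality) pairs) is an assumption for illustration, not the export format of the annotation tool.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

def modality_distribution(annotations: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Percentage of annotated behaviors per modality, computed separately per coder.

    `annotations` is an iterable of (coder, modality) pairs, one per annotated behavior.
    """
    per_coder: Dict[str, Counter] = {}
    for coder, modality in annotations:
        per_coder.setdefault(coder, Counter())[modality] += 1
    return {
        coder: {m: 100.0 * n / sum(counts.values()) for m, n in counts.items()}
        for coder, counts in per_coder.items()
    }

# Tiny usage example with made-up counts (not the EmoTV data):
demo = [("coder1", "facial")] * 78 + [("coder1", "gesture")] * 12 + [("coder1", "posture")] * 10
print(modality_distribution(demo)["coder1"]["facial"])  # -> 78.0
```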

JC Martin - LIMSI/CNRS - WP5 WS 14

Annotation #1: Statistics
- Many cues in coder1's annotations are shared by several emotion labels (blinking, head movements, ...), but there are also cues typical of some emotions, such as lowering the hands when despaired or slow body movement for serenity.
- There is a difference between behaviors linked to strong emotions (anger, exaltation, ...) and weak emotions (irritation, serenity); the discriminating attributes are speed and energy for gestures, and speed for body movement.
- Serenity involves no gestures, whereas exaltation is often accompanied by fast and energetic gestures.
- Anger is correlated with fast and intense gestures, whereas irritation involves slow and low-intensity gestures.

JC Martin - LIMSI/CNRS - WP5 WS 15

Annotation #1: Quantitative analysis

[Pie charts: Gesture - Phase - Speed distributions for the two coders: fast 56% / moderate 27% / slow 17% vs. fast 47% / moderate 52% / slow 1%]

- Low intercoder agreement on some attributes
- Reduce the number of values: 7 => 3 (as in the sketch below)
- Improve the annotation protocol & guide
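A minimal sketch of the "reduce the number of values" step followed by an intercoder agreement measure; the 7-to-3 mapping below and the use of Cohen's kappa are illustrative assumptions, since the slide names neither the original value set nor the agreement statistic.

```python
from typing import Dict, List

# Hypothetical 7-value speed scale collapsed to 3 values.
COLLAPSE_7_TO_3: Dict[str, str] = {
    "very_slow": "slow", "slow": "slow", "rather_slow": "slow",
    "moderate": "moderate",
    "rather_fast": "fast", "fast": "fast", "very_fast": "fast",
}

def cohen_kappa(labels1: List[str], labels2: List[str]) -> float:
    """Cohen's kappa between two coders' labels for the same items."""
    assert len(labels1) == len(labels2) and labels1
    n = len(labels1)
    categories = set(labels1) | set(labels2)
    observed = sum(a == b for a, b in zip(labels1, labels2)) / n
    expected = sum((labels1.count(c) / n) * (labels2.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0

coder1 = ["very_fast", "fast", "moderate", "slow"]
coder2 = ["fast", "rather_fast", "moderate", "very_slow"]
collapsed = [[COLLAPSE_7_TO_3[x] for x in c] for c in (coder1, coder2)]
print(cohen_kappa(coder1, coder2), cohen_kappa(*collapsed))  # agreement rises after collapsing
```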

JC Martin - LIMSI/CNRS - WP5 WS 16

Tracks or track groups
- Torso, Head, Facial expressions, Global body, Shoulders, (Arms), (Gestures)
- Alternation of poses and movements: torso, head, shoulders
- Common values for attributes: asymmetry, other

JC Martin - LIMSI/CNRS - WP5 WS 17

Methodology

- Annotation guide
- Track per track
- Annotate emotion vs. communication:
  - emotionally rich clips
  - reduced interaction (monologues in interviews)
  - exaggerated mouth / brow movements

JC Martin - LIMSI/CNRS - WP5 WS 18

Torso
- Movement direction to be computed from poses
- Poses: 3 dimensions: twist, side-side, bend (rotational, lateral, sagittal)
- Labels + approximation of angles (sketched below)
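A sketch of a torso pose entry combining categorical labels with approximate angles on the three dimensions; the label names and the angle figures are placeholders, not values prescribed by the coding scheme.

```python
from dataclasses import dataclass

# Hypothetical mapping from pose labels to approximate angles in degrees.
APPROX_ANGLE = {"none": 0, "slight": 15, "medium": 30, "strong": 45}

@dataclass
class TorsoPose:
    """Torso pose on three dimensions: twist (rotational), side-side (lateral), bend (sagittal)."""
    twist: str      # e.g. "slight"
    side_side: str  # e.g. "none"
    bend: str       # e.g. "medium"

    def angles(self) -> dict:
        """Approximate angles derived from the categorical labels."""
        return {
            "twist_deg": APPROX_ANGLE[self.twist],
            "side_side_deg": APPROX_ANGLE[self.side_side],
            "bend_deg": APPROX_ANGLE[self.bend],
        }

# Movement direction could then be derived by comparing successive poses.
print(TorsoPose("slight", "none", "medium").angles())
```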

JC Martin - LIMSI/CNRS - WP5 WS 19

Torso Pose Twist

JC Martin - LIMSI/CNRS - WP5 WS 20

Torso Pose: Side-side / Bend

JC Martin - LIMSI/CNRS - WP5 WS 21

Example: Torso fast movement

JC Martin - LIMSI/CNRS - WP5 WS 22

Head
- Movements: numerous and combined => direction annotated in the movement track
- Primary & secondary
- Position
- Movement
- FACS

JC Martin - LIMSI/CNRS - WP5 WS 23

Example: Head: 2 directions - speed

JC Martin - LIMSI/CNRS - WP5 WS 24

Gesture structural transcription (Kipp 2004; Efron 1941; McNeill 1992)
- Preparation: bringing the arm and hand into stroke position; note that changing hand shape before/after moving the arm belongs to the preparation.
- Stroke: the most energetic part of the gesture.
- SequenceOfStroke: a number of successive strokes; all strokes should be covered by this phase.
- Retract: movement back to rest position; in sitting position this is usually the arm rest, the lap, or folded arms.
- Hold: a phase of stillness just before or just after the stroke, usually used to defer the stroke so that it coincides with a certain word.

JC Martin - LIMSI/CNRS - WP5 WS 25

Gesture functional transcription
- Manipulator: contact with the body or an object; movement which serves drive-reduction or other non-communicative functions, like scratching oneself.
- Beat: synchronised with the emphasis of the speech.
- Deictic: arm/hand is used to point at an existing or imaginary object.
- Representational: represents attributes, actions, or relationships of objects and characters (concrete or abstract).
- Emblem: movement with a precise, culturally defined meaning, like the eye-wink, gestures signalling the intellectual deficiency of another person, or obscene gestures.

JC Martin - LIMSI/CNRS - WP5 WS 26

Example: Homogeneous sequence of strokes

JC Martin - LIMSI/CNRS - WP5 WS 27

Example: Manipulator gesture

JC Martin - LIMSI/CNRS - WP5 WS 28

Gesture annotation attributes
- Deictic target: Self / Camera
- Manipulator target: Chest / Hair / Eyebrows / Nose / Mouth
- Object in hand: if the person is holding an object, enter the name of the object
- Spatial region: Up / Head / Chest / Down / Extreme periphery
- Directness: Linear / Shaped pathway
- Vertical direction: Upward / Downward
- Horizontal direction: Leftward / Rightward
- Sagittal direction: Forward / Backward
- Hands relationship: Independent / Mirror / Asymmetric
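Pulling the three gesture slides together, a single annotation entry could look like the record below; the field names are illustrative assumptions, and the value sets simply copy the slides (structural phases, functional types, and the attributes listed above).

```python
from dataclasses import dataclass
from typing import Optional

PHASES = {"preparation", "stroke", "sequence_of_strokes", "retract", "hold"}
FUNCTIONS = {"manipulator", "beat", "deictic", "representational", "emblem"}
SPATIAL_REGIONS = {"up", "head", "chest", "down", "extreme_periphery"}
HANDS_RELATION = {"independent", "mirror", "asymmetric"}

@dataclass
class GestureAnnotation:
    """One gesture annotation entry combining structural and functional transcription."""
    phase: str                        # one of PHASES
    function: str                     # one of FUNCTIONS
    spatial_region: str               # one of SPATIAL_REGIONS
    hands_relationship: str           # one of HANDS_RELATION
    directness: str                   # "linear" or "shaped_pathway"
    vertical: Optional[str] = None    # "upward" / "downward"
    horizontal: Optional[str] = None  # "leftward" / "rightward"
    sagittal: Optional[str] = None    # "forward" / "backward"
    deictic_target: Optional[str] = None      # "self" / "camera" (deictics only)
    manipulator_target: Optional[str] = None  # "chest" / "hair" / ... (manipulators only)
    object_in_hand: Optional[str] = None      # free text if an object is held

    def __post_init__(self):
        # Light validation against the value sets above.
        assert self.phase in PHASES and self.function in FUNCTIONS

g = GestureAnnotation(phase="stroke", function="deictic", spatial_region="chest",
                      hands_relationship="independent", directness="linear",
                      deictic_target="camera")
```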

JC Martin - LIMSI/CNRS - WP5 WS 29

Other annotations
- Limited set of annotations for facial expressions: label + Action Unit (combination); gaze, brows, mouth, chin, nose
- Shoulders
- Arms
- Global pose and movement

JC Martin - LIMSI/CNRS - WP5 WS 30

Future directions
- Modifications for potential use as one source of knowledge for WP6 / WP4:
  - Adding temporal evolution in segments
  - Wrist position
  - Fluidity only between gestures or for repetitions?
  - Integration with other sources of knowledge (temporal)
- Validation of annotation:
  - Perceptual tests at the different levels of multimodal annotation
  - Segments of multimodal behavior
  - Annotation of common segments + intercoder agreement
- Annotation of several videos; evaluation of annotation time
- Correlations between emotions and multimodal annotations

JC Martin - LIMSI/CNRS - WP5 WS 31

Architectural Principles of a Software Platform for the Management of Multimodal Emotional Corpora

JC Martin - LIMSI/CNRS - WP5 WS 32

Goals

- Guidelines
- Illustrative combinations of tools

JC Martin - LIMSI/CNRS - WP5 WS 33

Surveys of annotation tools for multimodal corpora

- Tools: Anvil, TasX, ...
- Surveys: ISLE D10, NITE, Harper (Eurospeech), NISLab LREC 2004 paper, LREC workshops 2002 / 2004

JC Martin - LIMSI/CNRS - WP5 WS 34

Anvil (Kipp 2001) - http://www.dfki.uni-sb.de/~kipp/research/index.html

JC Martin - LIMSI/CNRS - WP5 WS 35

TASX - http://tasxforce.lili.uni-bielefeld.de/

[Screenshot callouts: tiers, panel switch, start/end-point]

JC Martin - LIMSI/CNRS - WP5 WS 36

Meta-data: MPI tools (Editor, Browser)

JC Martin - LIMSI/CNRS - WP5 WS 37

Platform examples: Wizard of Oz (Buisine et al. 2003)

[Diagram of the platform: coding scheme; JAVA / JAXP; annotations (PRAAT, ANVIL); video recordings (34 videos); metrics; SPSS; statistics]


JC Martin - LIMSI/CNRS - WP5 WS 38

Requirements / description
- Requirements of such a platform for emotion:
  - Continuous / discrete
  - Replay / validation
- Description:
  - Software
  - Data files: media, meta-data
  - Annotations: manual, automatic, mixed
  - Coding schemes
  - Documentation files
  - Paper forms

JC Martin - LIMSI/CNRS - WP5 WS 39

Architecture
- Tools
- Input / output
- Use during various iterations:
  - Segmentation
  - Agreement / vote / reduce number of classes (a vote sketch follows)
  - Re-annotation
  - Audio only, video only, audio-video
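For the "agreement / vote" step, a minimal majority-vote sketch over several coders' labels for the same segments; the tie-breaking rule (keep the first most frequent label) is an arbitrary choice for illustration, not something the platform prescribes.

```python
from collections import Counter
from typing import Dict, List

def majority_vote(labels_per_segment: Dict[str, List[str]]) -> Dict[str, str]:
    """Merge several coders' labels per segment by majority vote.

    `labels_per_segment` maps a segment id to the labels given by each coder.
    Ties are broken by keeping the first most frequent label (arbitrary choice).
    """
    return {
        segment: Counter(labels).most_common(1)[0][0]
        for segment, labels in labels_per_segment.items()
    }

votes = {"seg1": ["anger", "anger", "irritation"],
         "seg2": ["serenity", "serenity", "serenity"]}
print(majority_vote(votes))  # {'seg1': 'anger', 'seg2': 'serenity'}
```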

JC Martin - LIMSI/CNRS - WP5 WS 40

Manual Annotation of Multimodal Behaviors in Emotional TV Interviews

J.-C. Martin, S. Abrilian, L. Devillers

LIMSI-CNRS, France

JC Martin - LIMSI/CNRS - WP5 WS 41

Introduction: Emotional parameters of multimodal behaviors

(Montepare et al. 1999): hand positions, gait, fluidity, stiffness, strength, speed, spatial expansion, activity. Acted.

(Wallbott 1998): upper body, shoulders (up, backward, forward), head (downward, backward, turned sideways, bent sideways), arms, hands, movement quality (activity, spatial expansion, movement dynamics, energy, power), symmetry. Acted.
