Affective Communication Aid using Wearable Devices based on Biosignals

Yuji Takano
University of Tsukuba
1-1-1 Tennodai, Tsukuba, Japan
[email protected]

Kenji Suzuki
University of Tsukuba / JST
1-1-1 Tennodai, Tsukuba, Japan
[email protected]
ABSTRACT
We propose a novel wearable interface for sharing facial expressions between children with autism spectrum disorders (ASD) and their parents, therapists, and caregivers. The developed interface is capable of recognizing facial expressions based on physiological signal patterns taken from facial bioelectrical signals and displaying the results in real time. The physiological signals are measured from the forehead and both sides of the head. We verified that the proposed classification method is robust against facial movements, blinking, and head posture. This compact interface can support the perception of facial expressions between children with ASD and others to help improve their communication.
Categories and Subject Descriptors
I.5.5 [Implementation]: Interactive systems; K.4.2 [Social Issues]: Assistive technologies for persons with disabilities

General Terms
Measurement

Keywords
Facial expression, Smile sharing, Autism Spectrum Disorder
1. INTRODUCTION
In this paper, we propose a novel interaction method for sharing children's facial expressions with their parents in order to facilitate communication. In human communication, facial expressions carry some of the most important non-verbal information. Facial expressions include psychological information, such as emotions, which are very important aspects of communication. People can read a person's thoughts simply by observing their expression. Psychological studies have found that facial expressions can project emotions such as disgust, sadness, happiness, fear, anger, and surprise [2]. Expressing these emotions is a universal
communication skill common to all humankind and does not depend much on culture. Understanding facial expressions correctly is very important for communication with other people. Daily communication between parents and children is very important to building their relationship and has a key role in children's mental and social development. However, there are some cases where it is difficult for parents or caregivers to consistently recognize their children's facial expressions. For example, children with autism spectrum disorders (ASD) have difficulties with communicating and socially interacting through facial expressions, even with their parents. Autism comprises a wide range of neurodevelopmental disorders, and its intensity differs greatly among individuals. Therefore, setting a clear boundary between healthy and autistic people is difficult, and the mechanisms of autism have not yet been clarified. A typical example of the communication difficulties in the case of autism is the lack of facial expressions and eye contact [5, 9]. Facial expressions play an important role in communication with others, and we want to know when and how much a child's facial expression changes based on events in their daily lives. Previously, we reported on the relationship between smiles and positive social behavior [4]. The smiles of children with ASD can be quantitatively measured and analyzed by using a wearable device [6]. There are many situations where reading and understanding a child's facial expressions is desirable.

Various classification methods of facial expressions have been proposed based on different features. The facial action coding system (FACS) [3] describes facial expressions based on physical and anatomical criteria, and many researchers have embraced FACS to classify facial expressions [8]. There are also many approaches to capturing facial expressions. One method is to extract physical variations in facial features from video by means of image processing. This non-contact method is the most commonly used to recognize facial expressions; it is also easy to use, with little effort needed to install the equipment. However, it has the disadvantage of spatial limitations: it depends on the camera position and field of view, and its accuracy is affected by the head posture, so the target user has to face the camera constantly. For use in actual situations outside a controlled environment, there is little possibility of the subject staying in the same position constantly, especially for children who are moving and playing around. Thus, using image processing is difficult. Another possible approach is to use motion-capture technology to extract the three-dimensional shape of the face from the markers' coordinates and measure the physical features more properly.
Figure 1: Smile sharing: the proposed interaction paradigm (the child's smile is conveyed to the parent by an LED display and by vibration)
However, placing the markers requires preparation, which makes this approach laborious, and the markers can be easily occluded. Thus, the development of a method for capturing facial expressions that is easy to use and does not depend on spatial orientation is still a difficult challenge. We have been developing a tool to detect the facial expressions of a person who has difficulty with expressing their intent in an accurate and continuous manner through the use of a wearable device. This allows users not only to capture facial expressions but also to share them with others, even if the face is not always observable by sensors installed in the environment, such as cameras or depth sensors. In this paper, we propose the concept of smile sharing, in which a wearable device, namely an affective communication aid, that meets the above criteria is used to communicate facial expressions. We evaluated the device to verify its performance through several case studies.
2. METHODOLOGY
The proposed system provides a novel method of interaction, particularly between children and their parents, that considers use in daily life. Figure 1 shows a conceptual diagram of the proposed interaction. We first describe the method for capturing and classifying facial expressions independently of spatial orientation and then the sharing of the facial expressions.
2.1 Wearable Interface
Our proposed wearable interface can capture facial expressions independently of the spatial orientation. To realize this system, we use surface electromyography (sEMG) on the forehead and sides of the face. sEMG can be captured by using small electrodes to measure the bioelectrical signals emitted from muscles that are activated to generate facial expressions. Conventionally, the electrodes must be accurately pasted on the skin on top of facial muscles, including the orbicular muscles of the mouth and eyes, for sEMG measurement of the face. However, pasting electrodes on the skin has some disadvantages: the process takes a long time, and the electrodes are prone to interference from facial movements. A possible approach to overcoming these obstacles is measuring sEMG on the sides of the face, i.e., distal EMG [6]. We have previously shown that distal electrode locations on areas of low facial mobility yield signals with a strong amplitude that are correlated with the signals captured at the traditional positions on top of facial muscles.
Figure 2: Overview of the head-mounted interface (headband with embedded electrodes, electrode sockets, and LED)
In this study, we measured sEMG on the sides of the head and the forehead to reduce interference from physical variations of the face, and we developed an easy-to-wear interface. We used the patterns of the acquired sEMG signals to classify facial expressions. By regarding facial expressions as specific patterns of activity across several facial muscles, the interface can classify them without needing to identify individual muscle activity. A support vector machine (SVM) was used for pattern classification and to differentiate smiles from other facial expressions.
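As a rough illustration of this pattern-based approach, the sketch below trains an SVM on per-channel feature vectors extracted from windowed sEMG. The feature choice (per-channel RMS), the number of channels, and the synthetic calibration data are our own assumptions for illustration; the paper does not specify these details.

```python
# Minimal sketch (not the authors' implementation): classifying windowed
# sEMG feature vectors with an SVM, assuming one RMS feature per channel.
import numpy as np
from sklearn.svm import SVC

def rms_features(window):
    """window: (n_samples, n_channels) sEMG segment -> per-channel RMS."""
    return np.sqrt(np.mean(window ** 2, axis=0))

# Hypothetical calibration data: 150-sample windows recorded while the wearer
# smiles (label 1) or keeps a neutral face (label 0), four channels.
rng = np.random.default_rng(0)
neutral = [rng.normal(0.0, 0.05, (150, 4)) for _ in range(50)]
smiling = [rng.normal(0.0, 0.20, (150, 4)) for _ in range(50)]
X = np.array([rms_features(w) for w in neutral + smiling])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="rbf").fit(X, y)

# At run time, each incoming window is reduced to the same feature vector
# and labeled as "smile" or "other".
new_window = rng.normal(0.0, 0.20, (150, 4))
print("smile" if clf.predict([rms_features(new_window)])[0] == 1 else "other")
```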
2.2 Smile Sharing
We propose a method for sharing facial expressions so that a child's ambiguous or hidden expressions can be perceived in real time. In the current implementation, we only classify the child's smile and communicate it to their parents through various modalities. A smile is a facial expression that represents happiness [1], and the perception of smiling facilitates communication between children and their parents. Specifically, perceiving a child's smile helps in understanding what makes the child happy. By knowing what a child is interested in, the parents can communicate with him or her more intensively and feel more encouraged in their understanding. For smile sharing, we used both light-emitting and vibration devices. Using a light-emitting device helps the parents perceive a child's smile even if the child turns his or her face away. The parents can also perceive the child's smile by using a wrist-mounted vibration device even if they are not close to the child, which can happen during play. These methods are also viable for autistic children and their parents when the parents cannot look at their child's face directly.
3. SYSTEM CONFIGURATION
The system consists of an interface unit and a signal processing unit. The interface unit measures signals and outputs the classification results. The signal processing unit classifies facial expressions based on the measured sEMG signals and sends the result to the interface unit via Bluetooth wireless communication.
3.1 Interface unit
We developed two different wearable interfaces: a head-mounted device and a wrist-mounted device. Figure 2 shows an overview of the head-mounted interface. The wrist-mounted interface is a simple vibration device that vibrates when a smile is detected by the head-mounted interface.
Figure 3: Appearance of the LED interface

Figure 4: Overview of facial expression recognition using the head-mounted interface (measuring section: electrodes and A/D conversion at fs = 1000 Hz; signal processing section: comb and band-pass filtering, ICA, rectification and smoothing, feature vector extraction, calibration, class labeling and training, and SVM classification; light-emitting section: LED; sections connected via Bluetooth)
The head-mounted interface is used both to acquire facial sEMG and to display the resulting facial expression classification. It comprises dry electrodes and an LED embedded in a headband. sEMG is acquired by the interface and sent to the signal processing unit through Bluetooth wireless communication. We decided to use dry electrodes in the interface, although they are prone to noise contamination in the case of unstable contact with the skin, because they are much easier to apply to the skin and enable fast measurement of sEMG with minimal preparation time. The headband is made of elastic material, and the position of the electrodes inside the headband is adjustable. Therefore, the interface can accommodate different head sizes and shapes, and it holds the dry electrodes steadily in place to provide better stability. Figure 3 shows the appearance of the LED; the LED colors of the interface are white and red. The LED emits a red light if the wearer is smiling and a white light for any other facial expression. The LED is fitted in a small tube, as shown in Figure 3, to make it easily noticeable by others.
The wrist-mounted interface comprises a vibration motor and presents the facial expression of the headband wearer through vibration. This interface vibrates if the person wearing the head-mounted interface is smiling. By using this interface, parents can perceive their child's smile even if they cannot look at his or her face directly. In particular, the wrist-mounted interface can help the parents of autistic children perceive their child's smile.
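The sketch below illustrates how the receiving side of such a link could map each classification result to the LED color and the wrist vibration. The one-byte message format, the serial port name, and the use of pyserial are illustrative assumptions; the paper does not describe the actual Bluetooth protocol or firmware.

```python
# Illustrative sketch only: mapping a received classification label to the
# LED and the wrist vibrator. The protocol (b"S" = smile, b"N" = other) and
# the port name are hypothetical.
import serial  # pyserial; the Bluetooth link is assumed to appear as a serial port

LED_RED, LED_WHITE = "red", "white"

def drive_outputs(label, set_led, set_vibration):
    """Red LED and vibration for a smile; white LED and no vibration otherwise."""
    if label == b"S":
        set_led(LED_RED)
        set_vibration(True)
    else:
        set_led(LED_WHITE)
        set_vibration(False)

if __name__ == "__main__":
    link = serial.Serial("/dev/rfcomm0", 115200, timeout=1.0)  # hypothetical port
    while True:
        label = link.read(1)  # one classification result per analysis window
        if label:
            drive_outputs(label,
                          set_led=lambda color: print("LED:", color),
                          set_vibration=lambda on: print("vibration:", on))
```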
3.2 Signal processing unit
The signal processing unit handles the digital filtering and pattern recognition processes. The sEMG signals acquired from the interface unit are pre-processed and then classified by the SVM based on their patterns. Figure 4 shows an overview of the signal processing. The sEMG signals are acquired every 1 ms, and the facial expression recognition is performed within a certain time window (150 ms). sEMG signals vary depending on individual differences and electrode position. The system therefore first needs to be calibrated for each user by recording some facial expressions in advance and learning the wearer's signal pattern and intensity. However, it is difficult for children with ASD to participate in this calibration session. In such cases, the system user simply indicates a period of smiling time as a reference, which is used as the basis for smile recognition.
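To make the pre-processing step concrete, the sketch below applies band-pass filtering, full-wave rectification, and moving-average smoothing to one sEMG channel sampled at 1 kHz and splits the result into analysis windows. The filter order and cut-off frequencies are illustrative assumptions, not the authors' values; Figure 4 also mentions comb filtering and ICA, which are omitted here for brevity.

```python
# Minimal sketch of sEMG pre-processing of the kind outlined in Section 3.2
# and Figure 4: band-pass filtering, rectification, and smoothing.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 1000           # sampling rate [Hz]: one sample every 1 ms
WINDOW = 150        # analysis window length in samples (150 ms at 1 kHz)

def preprocess(emg, fs=FS):
    """emg: (n_samples,) raw sEMG of one channel -> smoothed envelope."""
    b, a = butter(4, [20, 450], btype="bandpass", fs=fs)  # assumed pass band
    filtered = filtfilt(b, a, emg)
    rectified = np.abs(filtered)                          # full-wave rectification
    kernel = np.ones(WINDOW) / WINDOW                     # 150 ms moving average
    return np.convolve(rectified, kernel, mode="same")

def windows(envelope, length=WINDOW):
    """Split the envelope into consecutive windows for feature extraction."""
    n = len(envelope) // length
    return envelope[: n * length].reshape(n, length)
```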
4. EXPERIMENT
We conducted two experiments to evaluate the performance of the proposed system. In this section, we present the classification accuracy of the system and its robustness against head motion.
4.1 Evaluation of classification accuracy
In order to evaluate the classification accuracy of the proposed system, we compared it to the human cognitive ability to recognize smiling in an experimental setting. We recorded videos of three people (persons A-C); each alternately smiled and held a neutral expression two or three times over about 20 s while wearing the headband interface. Nine subjects (eight male, one female) in their twenties and thirties were recruited for the experiment. Informed consent was obtained from the participants in advance. Videos of the smiling/neutral faces (A-C) were shown to the subjects, and they were asked to mark the smile intervals by clicking a button to indicate the start and stop of smiles. We covered the LED in the videos to avoid influencing the subjects' judgment. We calculated the maximum, minimum, and median values of precision and recall based on the classifications by the subjects and the proposed system. Figure 5 shows the results.
As shown in the figure, the precision for each subject was above 0.95, but the recall varied among subjects depending on who created the facial expressions. The differences in recall may have been due to differences in facial features, some of which are more difficult to recognize than others. This made it more difficult to set a threshold for smiling (as for person B), which lowered the recall. However, in terms of classification accuracy, the results were positive because the average precision was sufficient for potential applications. We also calculated the intra-class correlation coefficient to evaluate the degree of coincidence between the classification by the interface and the judgment of the subjects. The average intra-class correlation coefficient was more than 0.936, which is also sufficient.
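For reference, precision and recall can be computed directly from the marked smile intervals once both the system output and the human judgments are expressed as per-frame labels. The sketch below shows one way to do this; the label rate and the example intervals are illustrative, not the experimental data.

```python
# Simple sketch: precision and recall from per-frame smile labels.
import numpy as np

def precision_recall(predicted, reference):
    """predicted/reference: boolean arrays, True where a smile is marked."""
    tp = np.sum(predicted & reference)
    fp = np.sum(predicted & ~reference)
    fn = np.sum(~predicted & reference)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Example: 20 s of labels at 100 Hz, with the system marking slightly wider
# smile intervals than the human rater.
t = np.arange(0, 20, 0.01)
reference = ((t > 3) & (t < 7)) | ((t > 12) & (t < 16))
predicted = ((t > 2.8) & (t < 7.2)) | ((t > 12.1) & (t < 16.3))
print(precision_recall(predicted, reference))
```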
4.2 Evaluation of robustness
We then conducted an experiment to investigate the robustness of the system against head motion artifacts. In this experiment, we investigated whether the system is capable of classifying facial expressions when there are disturbances such as head motions. The classification accuracy under head nodding (forward and backward head movement), head tilting (left and right tilting), head shaking (left and right rotation), and blinking was checked to evaluate the robustness of the system.
Figure 5: Maximum, minimum, and median values of precision and recall for persons A-C (n = 9)
Figure 6: Maximum, minimum, and median values of classification accuracy [%] for neutral and smiling faces under each condition (no motion, blink, nod, tilt, shake; n = 8)
The three motions we investigated (nod, tilt, and shake) correspond to all possible head motions (roll, pitch, and yaw); therefore, a positive result means that the system is likely to be robust against any combination of head motions. We asked eight subjects to perform this experiment while wearing the headband. Each subject performed each motion for about 5 s while smiling or keeping a neutral face. Blink represents 10 blinks, and nod and tilt were each done twice. Shake represents random head shaking along the yaw axis. Figure 6 shows the maximum, minimum, and median values of the classification accuracy for each motion in the experiment. The results showed that the system classified the neutral expression with no motion with a probability of 100%. The system was able to classify the neutral expression of most subjects with an accuracy of more than 95% even when there were some disturbances. In the case of smiles, there were some cases where the smile was occasionally not detected properly. In the most prominent case, subjects reported that it was difficult to smile and blink at the same time, which probably contributed to the classification accuracy for blinking being lower than for the other motions. However, the interface was capable of classifying the smiles of the majority of the subjects with an accuracy of more than 90%.
5. DISCUSSION AND CONCLUSIONS
In this study, we considered the scenario of daily communication between children and their parents and focused on facial expressions, which are non-verbal information that is important to facilitating communication. We proposed wearable interfaces to classify facial expressions based on facial muscle activities and share them through light and vibration. We evaluated the classification accuracy and robustness of the system through experiments and verified that the signals acquired from the sides of the head and the forehead can be used for facial expression classification. Through several experiments, we verified the classification accuracy of the developed system. The results demonstrated that the interface can be used in real environments with some disturbances to classify facial expressions with high accuracy and to present smiles in real time. Further investigation will include the implementation of adaptive filtering to remove motion artifacts.
So far, we have presented the concept of a novel interaction design between children and their parents and developed interfaces that enable the realization of such interaction. We have already conducted a feasibility study with children having ASD during robot-assisted activities and confirmed that the proposed device is acceptable [7]. In the future, we plan to conduct a user study with children and families to verify that the interfaces can support the sharing and perception of facial expressions in the given scenario.
6. REFERENCES
[1] P. Ekman. An argument for basic emotions. Cognition and Emotion, 6(3):169-200, 1992.
[2] P. Ekman. Emotions Revealed: Recognizing Faces and Feelings to Improve Communication and Emotional Life. Times Books, 2003.
[3] P. Ekman and W. Friesen. Facial Action Coding System: A Technique for the Measurement of Facial Movement. Consulting Psychologists Press, 1978.
[4] A. Funahashi, A. Gruebler, T. Aoki, H. Kadone, and K. Suzuki. The smiles of a child with autism spectrum disorder during an animal-assisted activity may facilitate social positive behaviors - quantitative analysis with smile-detecting interface. J Autism Dev Disord, 44(3):685-693, 2014.
[5] K. Gray and B. Tonge. Are there early features of autism in infants and preschool children? J Paediatr Child Health, 37(3):221-226, June 2001.
[6] A. Gruebler and K. Suzuki. Design of a wearable device for reading positive expressions from facial EMG signals. IEEE Trans. on Affective Comput., (in press).
[7] M. Hirokawa, A. Funahashi, and K. Suzuki. A doll-type interface for real-time humanoid teleoperation in robot-assisted activity: A case study. In ACM/IEEE Intl. Conf. on Human-Robot Interaction, pages 174-175, 2014.
[8] J. J. Lien, T. Kanade, J. F. Cohn, and C. C. Li. Automated facial expression recognition based on FACS action units. In Proceedings of FG '98 (IEEE), April 1998.
[9] F. R. Volkmar and L. C. Mayes. Gaze behavior in autism. Development and Psychopathology, 2(1):61-69, January 1990.