Human Robot Teaming: Approaches from Joint Action and Dynamical Systems
Tariq Iqbal and Laurel D. Riek
Abstract As robots start to work alongside people, they are expected to coordinate fluently with humans in teams. Many researchers have explored the problems involved in building more interactive and cooperative robots. In this chapter, we discuss recent work and the main application areas in human-robot teaming. We also shed light on some practical challenges to achieving fluent human-robot coordination, and conclude the chapter with future directions for approaching these problems.
Key words: Human-Robot Interaction, Human-Robot Teaming, Joint Action, Dynamical Group Modeling, Coordination
1 Introduction
As robots are becoming more ubiquitous, they will be expected to interact with people in a range of settings, from dyads to groups. To be effective and functional teammates, robots need the ability to perceive and understand the activities performed by other group members. For example, if a robot can interpret the various actions performed by people around it during a social event, then it can make efficient decisions about its own actions. However, it is difficult to automatically perceive and understand all the different tasks people engage in to make effective decisions as a teammate.
If a robot could make better sense of how humans interact among themselves in a group, its interactions with humans would reach a higher level of coordination, resulting in a fluent meshing of actions [66, 28, 14, 16, 30, 58, 26]. When two or more agents work together, Hoffman and Breazeal [16] defined fluency as the quality of achieving a high level of mutual coordination and adaptation. This quality is particularly important when the agents are well-accustomed to the task and to each other.

Tariq Iqbal and Laurel D. Riek, University of California San Diego, La Jolla, CA, USA, e-mail: tiqbal,[email protected]
This chapter discusses the existing methods and applications of human-robot interaction (HRI) in cooperative tasks. In many of these situations, robots are expected to work with people to achieve a common goal through the process of human-robot joint action. Thus, we start this chapter by giving a brief introduction to joint action, both in the context of human-human and human-robot joint action. We then summarize recent applications of human-robot cooperative interaction from the literature. Finally, we conclude the chapter by briefly presenting the challenges to realizing effective human-robot coordination with respect to hardware, software, and usability.
2 Background
2.1 Approaches from Cognitive Science to Model Joint Action
When a person acts alone, their behavior is very different from when they coordinate in a group [33]. When two or more people coordinate in a group, it is important to understand the different ways they can interact among themselves and generate suitable interactive behaviors [31]. Many researchers from psychology and cognitive science have investigated the underlying mechanisms of joint action tasks, including how people interact together, how they understand the intentions of other individuals, and how they coordinate to perform a joint action. Curioni et al. [8] presented a detailed review of joint action in human teams.
Sebanz et al. [71, 34] defined joint action as a form of social interaction where two or more participants coordinate their actions in space and time while making changes to their environment. Sebanz et al. described three important parts of the successful performance of a joint action task [70]. The first part involves making predictions about the intentions of other interaction partners. The second involves understanding when to perform the actions jointly, as this is very important for temporal coordination. The last part involves understanding where and how to perform the joint action. The authors described these as the "what", "when", and "where" components of joint action.
Vesper et al. [81] suggested an architecture for joint action which focuses on planning, action monitoring, action prediction, and ways of simplifying coordination. The architecture describes the minimal requirements for an individual agent to engage in a joint action, and aims to fill the gap between approaches that focus on language and propositional attitudes and dynamical systems approaches.
Many researchers have explored the underlying mechanisms that people may employ to perform a successful joint action task [46]. To perform joint actions successfully in a group, each individual needs to integrate their own behavior with a prediction about others' behavior simultaneously [51]. For example, Novembre et al. [51] investigated whether this integration of self- and other-related behavior is underpinned by a neural process associated with motor simulation. They explored this through a music performance experiment. Their results suggested that motor simulation underpins temporal coordination during joint actions.
Other researchers took a group-perspective approach to model successful joint action. For example, Valdesolo et al. [79] investigated whether coordinated action in a group has any influence on the ability of the group members to pursue a joint goal together. Their results suggested that a person's ability to rock in synchrony enhanced that person's perceptual sensitivity to the motion of other group members, and that the ability to be synchronous with others increased their success in a joint action task.
Słowiński et al. [75] explored whether coordination between two people performing a joint action task is higher when they exhibit similar motion features. To explore this, they proposed an index of motion variability, called the individual motor signature (IMS), to capture the subtle differences between human movements. They investigated the validity of this index via a mirror game. Their results suggested that when two people shared a similar IMS, their synchronization level was higher.
2.2 Dynamical Modeling of Groups
In this subsection, we discuss the contrasting perspective, which is more bottom-up and non-linear, and which explores coordination dynamics as a mechanism for realizing joint action. In group interactions, the activities of each member continually influence the activities of other group members. Most groups create a state of interdependence, where each member's outcomes and actions are determined in part by other members of the group [10]. This process of influence can result in coordinated group activity over time.
Many disciplines have approached the problem of how to assess coordination in a system, including robotics, physics, neuroscience, psychology, dance, and music. Many of these techniques take a bottom-up approach, which first measures low-level signals and then infers high-level behavior from them [30, 23, 43, 24]. These low-level signals can include physical motion features, physiological features (e.g., heart rate), eye gaze behavior, or activity features. High-level behaviors, such as coordination within a group, are then inferred from these low-level signals.
For example, Richardson et al. [60] proposed a method to assess group synchrony by analyzing the phase synchronization of rocking chair movements. Groups of six participants rocked in their chairs with their eyes either open or closed, and the authors used a cluster-phase method to quantify phase synchronization. Their results suggested that their group-level synchrony measure could successfully distinguish between synchronous and asynchronous conditions.

Similarly, Néda et al. [45] investigated the development of synchronized clapping in a naturalistic environment. They quantitatively described how asynchronous group applause starts suddenly and transforms into synchronized clapping.
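Returning to the cluster-phase idea, the following is a minimal sketch of a Richardson-style group synchrony measure, assuming the movement time series have already been recorded and detrended. It illustrates the structure of the method, not the authors' implementation:

```python
import numpy as np
from scipy.signal import hilbert

def cluster_phase_synchrony(signals):
    """Group synchrony in the spirit of Richardson et al.'s cluster-phase
    method. `signals` is an (n_people, n_samples) array of movement time
    series (e.g., rocking-chair angles), assumed zero-mean.
    Returns a value in [0, 1]; 1 means perfect group phase locking.
    """
    phases = np.angle(hilbert(signals, axis=1))           # instantaneous phases
    cluster = np.angle(np.exp(1j * phases).mean(axis=0))  # group "cluster" phase
    rel = phases - cluster                                # relative phase per person
    # Remove each person's mean relative phase, keeping only fluctuations
    mean_rel = np.angle(np.exp(1j * rel).mean(axis=1, keepdims=True))
    # Instantaneous group phase locking, averaged over time
    rho_t = np.abs(np.exp(1j * (rel - mean_rel)).mean(axis=0))
    return rho_t.mean()
```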
Coordination among explicit and implicit behaviors has also been explored in human-human interaction. Varni et al. [80] presented a system for real-time analysis of nonverbal, affective social interaction in small groups. In their study, several pairs of violin players performed while conveying four different emotions. The authors then used recurrence quantification analysis to measure the synchronization of the performers' affective behavior. In follow-on work, the researchers developed a system capable of analyzing the interaction patterns in a group of dancers.
Konvalinka et al. [36] explored coordination among implicit physiological signals, and performed a study to measure the synchronous arousal between performers and observers during a Spanish fire-walking ritual. This synchronous arousal was derived from the heart rate dynamics of the active participants and the audience.
Taking a non-linear, dynamical systems approach, Iqbal and Riek developed a method to measure the degree of synchronous joint action in a group [21, 23, 28, 25, 30]. Their method takes multiple types of task-level events into account while measuring synchronization. It can work on multiple types of heterogeneous events and can measure asynchronous situations in a group, in contrast to most other methods from the literature, which only take a single type of event into account. The authors validated their method by applying it to both human-human and human-robot teaming scenarios. Their results suggested that the method can successfully measure the degree of coordination in a group in a way that matches the collective perception of group members. Extending this work, the authors designed a new approach to enable robots to perceive human group behavior in real-time, anticipate future actions, and synthesize their own motion accordingly (see Fig. 1) [30, 25].
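The flavor of event-based synchrony measures can be conveyed with a short sketch. The code below is a generic, Quian Quiroga-style event synchronization index averaged over members and event types; it is far simpler than the actual method in [30], and all names and the window size are illustrative:

```python
import numpy as np

def event_sync(times_a, times_b, tau):
    """Fraction of events in two timestamp lists that co-occur within tau
    seconds (a simple pairwise event synchronization index)."""
    times_a, times_b = np.asarray(times_a), np.asarray(times_b)
    if len(times_a) == 0 or len(times_b) == 0:
        return 0.0
    hits = sum(np.any(np.abs(times_b - t) <= tau) for t in times_a)
    return hits / max(len(times_a), len(times_b))

def group_event_sync(events, tau=0.25):
    """Average pairwise synchrony across members and heterogeneous event
    types. `events[member][event_type]` is a list of event timestamps."""
    members = list(events)
    scores = [event_sync(events[a].get(k, []), events[b].get(k, []), tau)
              for i, a in enumerate(members) for b in members[i + 1:]
              for k in set(events[a]) | set(events[b])]
    return float(np.mean(scores)) if scores else 0.0
```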
Lorenz et al. [40] also investigated movement coordination in human-human and human-robot teams. Their study involved both human-human and human-robot dyads tapping on two positions on a table at certain times. The authors explored whether goal-directed, but unintentional, coordination of movements occurred during these interactions. Their results suggested that humans synchronized their movements with the movements of the robots.
3 Recent Applications
As robots increasingly work with people, they need to perform joint actions with people efficiently. To achieve this, many of the aforementioned approaches have been employed in human-robot teams. This section outlines four main application areas where robots cooperatively perform joint action tasks with humans. We summarize the approaches used in these areas in Table 1 at the end of this section.
Fig. 1 People and robots are engaged in cooperative tasks (from [40, 30])
3.1 Proximate Human-Robot Teaming
In many interactions, robots and humans need to share a common physical space. Various methods have been employed to enable robots to work efficiently in close proximity to people while avoiding collisions, such as models learned from human demonstration and anticipatory action planning [77].
To build policies for robots to share a space with humans, many approaches in the literature first build models from human demonstrations. After training, robots then use these trained models to collaborate with people. For example, Ben Amor et al. [2] collected human motion trajectories as Dynamic Movement Primitives (DMPs) from a human-human task. The authors then used dynamic time warping to estimate the robot's DMP parameters. Using these parameters, they modeled human-robot joint physical activities using a new representation, called Interaction Primitives (IPs). Their experimental results suggested that a robot successfully completed a joint physical task with a person when IPs were used.
Nikolaidis et al. [49] proposed a two-phase framework to adapt a robot's collaborative policy to a human collaborator. They first grouped human activities into clusters, and then learned a reward function for each cluster using inverse reinforcement learning. The learned model was incorporated into a Mixed Observability Markov Decision Process (MOMDP) policy, with the human type as the partially observable variable. A robot then used this model to infer the human type and generate the appropriate policy.
Many researchers try to achieve successful human-robot collaboration in a shared space by modeling human activities and using that knowledge as an input to a robot's anticipatory action planning mechanism [77]. This approach enables robots to generate movement strategies to efficiently collaborate with people.
For instance, Hoffman and Weinberg [19, 18] developed an autonomous jazz-improvising robot, Shimon, which played the marimba (see Fig. 2). To play in real-time with a person, the robot needed an anticipatory action plan. The authors divided the actions into preparation and follow-through steps. Based on the anticipatory plans, their robot could simultaneously perform and react to shared activities with people.
Koppula et al. [37] also developed a method to anticipate a person's future actions. Anticipated actions were then used to plan appropriate actions for a robot performing collaborative tasks in a shared environment. In their method, they modeled humans through low-level kinematics and high-level intent, as well as using contextual information. They then modeled the human's and robot's behavior through a Markov Decision Process (MDP). Their results suggested that this approach performed better than various baseline methods for collaborative planning.
Mainprice and Berenson [41] presented a framework to allow a human and a robot to perform a manipulation task together in close proximity. This framework used early prediction of the human motion to generate a prediction of human workspace occupancy. They then used a motion planner to generate robot trajectories by minimizing a penetration cost in the predicted human workspace occupancy. They validated their framework via simulation of a human-robot collaboration scenario.
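The idea of penalizing penetration into the predicted human workspace can be sketched as a simple trajectory cost. Mainprice and Berenson use a full motion planner over a voxelized occupancy prediction, so the following only illustrates the cost structure, with an assumed occupancy function and smoothness weight:

```python
import numpy as np

def trajectory_cost(waypoints, occupancy, smoothness_weight=0.1):
    """Cost of a candidate robot trajectory against predicted human occupancy.

    waypoints: (T, 3) array of robot positions; occupancy(p) -> probability
    in [0, 1] that the human occupies point p over the planning horizon.
    """
    penetration = sum(occupancy(p) for p in waypoints)
    smoothness = np.sum(np.linalg.norm(np.diff(waypoints, axis=0), axis=1) ** 2)
    return penetration + smoothness_weight * smoothness
```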
Along these lines, Pérez-D'Arpino et al. [55] proposed a data-driven approach which used human motions to predict a target during a reaching-motion task. Unhelkar et al. [78] extended this concept to a human-robot co-navigation task. Their model used "human turn signals" during walking as anticipatory indicators of human motion. These indicators were then used to plan motion trajectories for a robot.
3.2 Human-Robot Handovers
A particular kind of activity often conducted in the proximate human-robot interaction space is the handover, an active application space in robotics research [77]. Most of the work on handovers focuses on designing algorithms for robots to successfully hand objects to people, as well as receive objects from them. Researchers working in this area use many methods to achieve these goals, including nonverbal signal analysis, human-human handover models, and legible trajectory analysis.
Many researchers have used people's non-verbal signals, such as eye gaze, body pose, and head orientation, to facilitate fluent object handover during human-robot interaction [77]. For example, Shi et al. [74] focused on building a model for a robot handing over leaflets in a public space, looking specifically at the relationship between gaze, arm extension, and approach. They used a pedestrian detector in their implementation on a small humanoid robot. Their results showed that pedestrians accepted more leaflets from the robot when their approach was employed than with another state-of-the-art approach.
Similarly, Grigore et al. [11] demonstrated that integrating an understanding of joint action into human-robot interaction can significantly improve the success rate of robot-to-human handover tasks. The authors introduced a higher-level cognitive layer which models human behavior in a handover situation. They particularly focused on the inclusion of eye gaze and head orientation in the robot's decision making.
Other researchers have investigated human-human handover scenarios for inspiration in building models for human-robot handovers [77]. Along this line of research, Huang et al. [20] analyzed data from human dyads performing a common household handover task: unloading a dish rack. They identified two coordination strategies that enabled givers to adapt to receivers' task demands, namely proactive and reactive methods, and implemented these strategies on a robot performing the same task in a human-robot team. Their results suggested that neither the proactive nor the reactive strategy could achieve both better team performance and better user experience. To address this challenge, they developed an adaptive method that achieved a better user experience with improved team performance compared to the other methods.
Regarding the fluency of a robot's actions during a handover task, Cakmak et al. [5] found that a robot's failure to convey its intention to hand over an object causes delays during the handover process. To address this challenge and achieve fluency, the authors tested two separate approaches on a robot: performing distinct handover poses, and performing unambiguous transitions between poses during the handover task. They performed an experiment where a robot used these two approaches while handing an object to a person. Their findings suggested that unambiguous transitions between poses reduced human waiting times, resulting in a smoother object handover; however, distinct handover poses had no such effect.
Other researchers have performed trajectory analysis to achieve smooth handover of objects. For example, Strabala et al. [76] proposed a coordination structure for human-robot handovers based on human-human handovers. The authors first studied how people perform handovers with their partners. From this study, the authors characterized how people approach, move their hands, and transfer objects. Taking inspiration from this structure, the authors then developed a similar handover structure for human-robot handovers, concerning the what, when, and where aspects of handovers. They experimentally validated this design structure.
3.3 Fluent Human-Robot Teaming
Many researchers in the robotics community try to build fluent human-robot teams. To achieve this goal, many approaches have been taken, including: insights from human-human teams, cognitive modeling for robots, understanding the coordination dynamics of teams, and adaptive future prediction methods [77].

To achieve fluency in human-robot teams, many researchers have investigated how people achieve fluent interaction in human-only teams. This knowledge is used to develop strategies for robots to achieve fluent interaction while interacting with people.
Fig. 2 A live performance of a robotic marimba player (from [18])
Taking insights from human-human teaming, Shah et al. [72, 73] developed a robot plan execution system, called Chaski, for use in human-robot teams. This system enables a robot to collaboratively execute a shared plan with a person: it can schedule the robot's actions and adapt to the human teammate to minimize the human's idle time. Through a human-robot teaming experiment, the authors validated that Chaski can reduce a person's idle time by 85%.
To build cognitive models for robots, researchers build on many other fields, including cognitive science, neuroscience, and psychology. For example, Hoffman and Breazeal [13] addressed the issue of planning and execution through a framework for collaborative activity in human-robot groups, building on various notions from the cognitive science and psychology literature. They presented a hierarchical, goal-oriented task execution system which integrated human verbal and nonverbal actions, as well as robot nonverbal actions, to support the shared activity requirements.
Iqbal, Rack, and Riek developed two anticipation algorithms for robots to coordinate their movements with people in teams by taking team coordination dynamics into account [30, 58]. One of the anticipation algorithms (SIA) relied on high-level group behavior understanding, whereas the other method (ECA) did not. The results indicated that the robot was more synchronous with the team and exhibited more contingent and fluent motion when the SIA method was used than when the ECA method was used. These findings suggested that the robot performed better when it had an understanding of high-level group behavior than when it did not.
Additionally, Iqbal and Riek [25] investigated how the presence of robots affects group coordination when both their behavior and their number (single robot or multi-robot) vary. Their results indicate that group coordination is significantly affected when a robot joins a human-only group. Group coordination is further affected when a second robot joins the group and behaves differently from the other robot. These results indicated that heterogeneous robot behavior in a multi-human multi-robot group can play a major role in how group coordination dynamics stabilize.
Drawing inspiration from the neuroscience and cognitive science literature, Iqbal et al. [29] developed algorithms for robots which leveraged a human-like understanding of temporal changes during the coordination process, with a particular eye toward understanding rhythmic tempo change. In their work, a robot employed two separate processes while coordinating with people: a temporal adaptation process and a temporal anticipation process. The robot used the temporal adaptation process to compensate for temporal errors that occurred while coordinating with people. Additionally, the robot used the anticipation process to predict the timing of its next action so that it coincided with the timing of the next external rhythmic signal. They applied these processes to a robot drumming synchronously with a group of people (see Fig. 3).
Building adaptive models based on predictions of future actions is another approach to achieving fluent human-robot collaboration. Hoffman and Breazeal [16] developed a cognitive architecture for robots, taking inspiration from neuropsychological principles of anticipation and perceptual simulation. In this architecture, fluency in joint action is achieved through two processes: 1) anticipation based on a model of repetitive past events, and 2) the modeling of the resulting anticipatory expectation as perceptual simulation. They implemented this architecture on a non-anthropomorphic robotic lamp, which performed a human-robot collaborative task. Their results suggested that the sense of team fluency, and the robot's contribution to that fluency, significantly increased when the robotic lamp used their architecture.
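One way to picture "anticipation based on a model of repetitive past events" is a simple frequency model over action sequences. This toy sketch is far simpler than Hoffman and Breazeal's perceptual simulation architecture, and all names are illustrative:

```python
from collections import Counter, defaultdict

class SequencePredictor:
    """Predict a teammate's next action from bigram counts of past events."""

    def __init__(self):
        self.counts = defaultdict(Counter)
        self.last = None

    def observe(self, action):
        """Record an observed action and its predecessor."""
        if self.last is not None:
            self.counts[self.last][action] += 1
        self.last = action

    def anticipate(self):
        """Most frequent successor of the last action, or None if unseen."""
        successors = self.counts.get(self.last)
        return successors.most_common(1)[0][0] if successors else None
```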
In other work, Hoffman and Breazeal [15] proposed an adaptive action selection mechanism for a robot in the context of human-robot joint action. This model made anticipatory decisions based on confidence in their validity and their relative risk. They validated their model through a study involving human subjects working with a simulated robot. They used two versions of robotic behavior during this study: one fully reactive, and one using their proposed anticipation model. Their results suggested a significant improvement in best-case task efficiency, as well as a significant difference in the perceived commitment of the robot to the team and its contribution to the team's fluency and success.
3.4 Robot as a Partner
There are still many open questions regarding the social interactional capabilities a robot should have before it can fluently and naturally interact with people as a partner. Many researchers have tried to tackle these open questions by building models for robots to understand and act appropriately as a partner in social situations [50].
Fig. 3 A human-robot drumming team (from Iqbal et al. [29])

For example, Leite et al. [38] conducted an ethnographic study to investigate how a robot's capability to recognize affective states and respond empathically can influence an interaction. The authors performed the study in an elementary school, where children interacted with a social robot that could recognize and respond empathically to some of the children's affective states. The results suggested that the robot's empathic behavior had a positive effect on how children perceived the robot.
Many researchers have also explored how a robot's explicit behavior can influence its interaction with people [39, 69]. For example, Riek et al. [65, 64] investigated how imitation by a robot affects human-robot teaming. They designed a study where a robot performed three head gesture conditions while interacting with a person: full head gesture mimicking, partial mimicking, and no mimicking. The authors found that in many cases people nodded back in response to the robot's nodding during interactions. They suggested incorporating more gestures, along with head nods, when studying affective human-robot teaming.
In another study, Riek et al. [62] explored the effect of cooperative gestures performed by a humanoid robot in a teaming scenario. The authors performed an experiment where they manipulated the gesture type, style, and orientation performed by the robot while it interacted with people. Their results suggested that people cooperated more quickly when the robot performed abrupt ("robot-like") gestures, and when the robot performed front-oriented gestures. Moreover, the speed of people's ability to decode robot gestures was strongly correlated with their ability to decode human gestures.
In HRI, eye gaze can provide important non-verbal information [77]. For example, Moon et al. [42] performed an experiment where a robot exhibited human-like gaze behavior during a handover task. In their experiment, a PR2 robot performed three different gaze behaviors while handing a water bottle to a person. The results indicated that the timing of the handover and the perceived quality of the handover event improved when the robot showed human-like gaze behavior.
Admoni et al. [1] explored whether a deviation from a robot's standard behavior can influence the interaction. The authors observed that people oftentimes overlook a robot's standard non-verbal signals (e.g., eye gaze) if they are not related to the primary task. In their experiment, the authors manipulated the handover behavior of a robot to deviate slightly from the standard expected behavior. The results suggested that a simple manipulation of a robot's standard handover timing made people more aware of the robot's other nonverbal behaviors, such as its eye gaze.
Another well-investigated approach in the field is to teach a robot appropriate behaviors through demonstration, i.e., learning from demonstration (LfD) [3]. For instance, Niekum et al. [47] developed a method to discover semantically grounded primitives in a demonstrated task, from which the authors built a finite-state representation of the task. The authors used a Beta Process Autoregressive Hidden Markov Model to automatically segment demonstrations into motion categories, which were then grounded as states in a finite automaton. The model was trained on a robot from many demonstrated examples.
Hayes et al. [12] looked at mutual feedback as an implicit learning mechanism during an LfD scenario. The authors explored grounding sequences as a feedback channel for mutual understanding. In their study, both a person and a robot provided non-verbal feedback to communicate their mutual understanding. The results showed that people provided implicit positive and negative feedback to the robot during the interaction, such as by smiling or by averting their gaze from the robot. The results of this work can help us build adaptable robot policies in the future.
Brys et al. [4] explored how to merge reinforcement learning (RL) and LfD approaches to achieve better and faster learning. One key limitation of reinforcement learning is that it often requires a huge amount of training data to achieve a desirable level of performance. For an LfD approach, there is no guarantee about the quality of the demonstrations, which can contain many errors. Brys et al. investigated the intersection between these two approaches and sped up the learning phase of RL methods using an approach called reward shaping.
Table 1 Application areas of human-robot collaboration

Proximate Human-Robot Teaming: Models from human demonstration ([2], [49]); Anticipatory action planning ([19], [18], [37], [41], [55])

Human-Robot Handovers: Non-verbal signal analysis ([74], [11]); Modelling based on human-human handovers ([20], [5]); Trajectory analysis ([76])

Fluent Human-Robot Teaming: Insights from human-human teams ([72], [73]); Cognitive modeling ([13], [30], [25], [29], [58]); Predicting actions ([16], [15])

Robot as a Partner: Explicit behavior analysis ([38], [65], [64], [62]); Eye gaze analysis ([42], [1]); Learning from demonstration ([47], [12], [4])

4 Challenges

When a robot leaves controlled spaces and begins to work alongside people, many things taken for granted in terms of perception and action no longer apply, because people act unpredictably, and little can be known about human environments in advance [63, 48, 52]. These challenges include: difficulties in human action detection, understanding of team dynamics, limitations in robot hardware and software design, and egocentric perception. This section introduces some of the challenges that researchers face while incorporating robots into human environments to coordinate with people, and briefly discusses some solutions to these problems.
4.1 Uncertainty in Human Action Detection
One of the main challenges in detecting human actions is the unpredictability of human behavior. It can be difficult for a robot to perceive and understand the different types of events involved in human activities well enough to make effective decisions, due to sensor occlusion, sensor fusion error, unanticipated motion, narrow fields of view, cluttered backgrounds, etc. [66, 7, 6, 59].
One approach to human action detection is to use classification algorithms to detect actions from video data. However, this approach has major challenges, including: intra- vs. inter-class variations between action classes, environment and recording settings, temporal variations of actions, and obtaining and labeling training data [56]. Moreover, using a classifier for action detection has several computational bottlenecks, including generalizability, abnormality detection, and classifier training [59].
Most of the approaches available in the literature cannot handle most of these challenges. Moreover, in most action recognition work, researchers assume that camera positions are static. However, this is not the case for mobile robots [6].
Ryoo and Matthies tried to address the challenge of action detection from a first-person point-of-view [68]. In their work, the authors detected seven classes of commonly observed activities during human-human interaction from a first-person point-of-view. Ryoo et al. [67] further extended this approach to detect human activities early from a robot. Using their method, a robot can detect human activities early, in real-time, in real-world environments. However, these methods still do not address other practical challenges, such as occlusion.
4.2 Unpredictable Changes in Team Dynamics
If a robot has some ability to model team dynamics, it can anticipate future actions in a team and adapt to those actions to be an effective teammate. However, understanding team dynamics is not trivial. If a robot has an understanding of its environment, then its interactions within the team might facilitate a higher level of coordination.
In some human-human team situations, team members are explicitly assigned to various roles [53]. In many others, roles emerge over time across the team members to achieve a common goal [35]. Oftentimes these roles change dynamically based on necessity. For example, a person who begins to lead a team moving a table may follow another teammate's lead later during the moving process. How people coordinate and cooperate among themselves in these situations is an important indicator for robots trying to understand the various roles in a group.
In human-robot interaction scenarios, various role distribution models are used. High-level role distribution models in the HRI paradigm include master-slave, supervisor-subordinate, partner-partner, teacher-learner, and leader-follower [53, 22, 27, 61]. However, these well-defined role distributions are rarely seen in real-world situations, and distributed roles change dynamically in many situations. Therefore, if roles are not predefined for an interaction, the robot needs to make predictions about the roles of co-present people in order to infer its own role in the group.
Understanding the roles of other people in a group is not easy for a robot. Thus, most human-robot teams are designed with some prior distribution of roles to achieve their goals. However, a dynamic understanding of role distributions in a human-robot team can enable a robot to understand team dynamics more appropriately, which can lead to more fluent interaction in the group.
4.3 Limited Behavioral Versatility on Robots
Another challenge to incorporating robots in human teams is the lack of versatility of robot behaviors. Most robots are designed to perform a specific task; therefore, most of the time they are limited in their behavioral abilities because they are restricted by their physical capabilities. For example, some robots are designed to perform manipulation tasks, some are good at recognizing and tracking people, and some are good at mobility.
However, a robot often needs more than one of these abilities simultaneously to interact fluently with, and establish trust with, people. For example, to socially interact with people, a robot needs to be able to identify them, approach them while avoiding obstacles, understand verbal and non-verbal messages, communicate verbally and non-verbally, and work alongside them. Thus, researchers need robots with versatile behaviors and abilities to build more efficient and functional human-robot teams.
Anthropomorphic robots are widely used in social environments to interact with people. These robots can engage with people in social interaction by perceiving various social cues from verbal and nonverbal channels, and by communicating with people verbally and non-verbally. However, these types of robots are often not designed with the capability to perform other tasks, such as mobility and manipulation. Kismet [32] was one of the first anthropomorphic robots with an expressive face used to interact with people in social environments through gaze, facial expression, body posture, and vocal babbling. However, due to its lack of other physical parts, such as hands, this social robot cannot perform the hand gestures needed to interact with people fluently.
The Nao robot is a widely used humanoid robot for research [44], which can walk, show expressive gestures, and verbally communicate with people. Because of its expressive body gestures and verbal communication capabilities, it became a popular platform for designing a wide variety of interactions with people. However, it lacks facial expressions and is incapable of performing manipulation tasks, which limits its utility.
There are also non-anthropomorphic robots that interact and collaborate with people. These robots can exhibit various verbal and nonverbal responses, and can also generate animated gestures while collaborating with people. For example, Hoffman and Ju [17] designed a non-humanoid robot with expressive movement in mind. This robot can perform human-like gestures, such as a head nod to express agreement and a head shake to express disagreement. However, it can express only selective gestures; it cannot express many of the gestures that are possible with an expressive face.
At the other end of the spectrum, there are many robots strictly designed to perform manipulation tasks, e.g., the Fetch and Freight robots by Fetch Robotics [9]. These robots are capable of performing dexterous manipulation tasks. However, they are not particularly functional in social situations, as oftentimes they are not safe around people and cannot easily generate expressive behaviors.
The PR2 robot by Willow Garage [57] is another widely used robotic platform, particularly for manipulation and handover research. This robot has two manipulation arms with grippers and can perform many dexterous tasks, which makes it a widely used robotic manipulator in the research community. However, the PR2 lacks the capability to perform expressive behaviors towards people, and is not very suitable for human social environments.
4.4 Lack of Infrastructure to Support Replicability
Because of the wide range of platforms used on various robots, it is very challenging for researchers to replicate studies across different robots. This limitation prevents human-robot collaboration researchers from exploring the effects of using various kinds of robots in similar situations.
These difficulties include changes in sensor modalities across platforms, variation in on-board processing units, and variation in physical structure. For example, if a robot has a high-definition RGB-D camera and an onboard graphical processing unit, then it can detect facial expressions relatively precisely. On the other hand, if another robot only has a low-definition RGB camera with no onboard processing unit, then the same algorithms will not perform consistently.
The Robot Operating System (ROS) is a commonly used platform in the academic community [54]. However, as it is open-source software, there are many challenges to using it due to a lack of software support and maintenance.
Moreover, similar algorithms need to be implemented on different platforms, as not all robots use a unified platform. This requires researchers to reimplement pre-existing algorithms to accommodate different platforms, which oftentimes delays progress. Common infrastructure would greatly help the research community achieve replicability and explore new robotic behaviors for coordinating with people.
5 Discussion
In this chapter, we discussed some exciting recent work on human-robot coordination. We briefly described recent approaches from the literature to model human-human and human-robot joint action. These approaches include: neural process modeling, group-perspective approaches, bottom-up approaches, nonlinear dynamical systems approaches, and the analysis of implicit and explicit physiological signals.
We also discussed four main application areas in the human-robot cooperation domain, namely human-robot handovers, interaction in close physical proximity, fluent human-robot teaming, and the robot as a partner. Many approaches have been taken to incorporate robots in these application domains, including: dynamic trajectory analysis, anticipatory action planning, cognitive modeling, explicit and implicit behavior analysis, affective behavior analysis, and learning from demonstration (LfD).
Although many applications of human-robot coordination exist, many practical issues must be addressed to achieve a higher level of fluency in interaction. These practical issues include a lack of work on detecting and recognizing co-present human actions and on understanding team dynamics, limitations in robot design, and a lack of infrastructure to support replicability. Computational fields, like computer vision and machine learning, are addressing specific robotic problems related to real-world scenarios, such as egocentric vision and computationally inexpensive object proposal algorithms [7, 6]. Along with improvements in these technologies and existing algorithms, social robots will become better able to cooperate with co-present people in human social environments in the future.
References
[1] Admoni H, Dragan A, Srinivasa S, Scassellati B (2014) Deliberate Delays During Robot-to-Human Handovers Improve Compliance With Gaze Communication. In: Int Conf Human-Robot Interact
[2] Amor HB, Neumann G, Kamthe S, Kroemer O, Peters J (2014) Interaction primitives for human-robot cooperation tasks. In: IEEE Conf. on Robotics and Automation
[3] Argall B, Chernova S, Veloso M (2009) A Survey of Robots Learning from Demonstration. Robotics and Autonomous Systems
[4] Brys T, Harutyunyan A, Brussel VU, Taylor ME (2015) Reinforcement Learning from Demonstration through Shaping. In: Proc Twenty-Fourth Int Jt Conf Artif Intell
[5] Cakmak M, Srinivasa SS, Lee MK, Kiesler S, Forlizzi J (2011) Using spatial and temporal contrast for fluent robot-human hand-overs. In: Proc. of ACM/IEEE HRI
[6] Chan D, Riek LD (2017) Object proposal algorithms in the wild: Are they generalizable to robot perception? In: Review
[7] Chan D, Taylor A, Riek LD (2017) Faster robot perception using salient depth perception. In: IROS
[8] Curioni A, Knoblich G, Sebanz N (2017) Joint Action in Humans - A Model for Human-Robot Interactions? Section: Human-Humanoid Interaction, Humanoid Robotics: a Reference
[9] Fetch Robot (2017) https://www.fetchrobotics.com
[10] Forsyth DR (2009) Group dynamics, 4th edn. T. Wadsworth
[11] Grigore EC, Eder K, Pipe AG, Melhuish C, Leonards U (2013) Joint action understanding improves robot-to-human object handover. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
[12] Hayes CJ, Moosaei M, Riek LD (2016) Exploring implicit human responses to robot mistakes in a learning from demonstration task. In: Robot and Human Interactive Communication (RO-MAN)
[13] Hoffman G, Breazeal C (2007) Collaboration in human-robot teams. In: AIAA Intelligent Systems Technical Conference
[14] Hoffman G, Breazeal C (2007) Cost-based anticipatory action selection for human-robot fluency. IEEE Transactions on Robotics
[15] Hoffman G, Breazeal C (2007) Effects of anticipatory action on human-robot teamwork efficiency, fluency, and perception of team. In: Human-Robot Interact
[16] Hoffman G, Breazeal C (2008) Anticipatory Perceptual Simulation for Human-Robot Joint Practice: Theory and Application Study. In: AAAI
[17] Hoffman G, Ju W (2014) Designing robots with movement in mind. Journal of Human-Robot Interaction
[18] Hoffman G, Weinberg G (2010) Synchronization in human-robot musicianship. In: Int Symp Robot Hum Interact Commun
[19] Hoffman G, Weinberg G (2011) Interactive improvisation with a robotic marimba player. Auton Robots
[20] Huang CM, Cakmak M, Mutlu B (2015) Adaptive Coordination Strategies for Human-Robot Handovers. In: Robot. Sci. Syst.
[21] Iqbal T, Riek L (2014) Assessing group synchrony during a rhythmic social activity: A systemic approach. In: Proc. of the conference of the International Society for Gesture Studies (ISGS)
[22] Iqbal T, Riek LD (2014) Role distribution in synchronous human-robot joint action. In: Proc. of IEEE RO-MAN, Towards a Framework for Joint Action
[23] Iqbal T, Riek LD (2015) A Method for Automatic Detection of Psychomotor Entrainment. IEEE Transactions on Affective Computing
[24] Iqbal T, Riek LD (2015) Detecting and Synthesizing Synchronous Joint Action in Human-Robot Teams. In: International Conference on Multimodal Interaction
[25] Iqbal T, Riek LD (2017) Coordination dynamics in multi-human multi-robot teams. IEEE Robotics and Automation Letters (RA-L)
[26] Iqbal T, Gonzales MJ, Riek LD (2014) A Model for Time-Synchronized Sensing and Motion to Support Human-Robot Fluency. In: ACM/IEEE Human-Robot Interaction, Workshop on Timing in HRI
[27] Iqbal T, Gonzales MJ, Riek LD (2014) Mobile robots and marching humans: Measuring synchronous joint action while in motion. In: AAAI Fall Symp. on AI-HRI
[28] Iqbal T, Gonzales MJ, Riek LD (2015) Joint action perception to enable fluent human-robot teamwork. In: Proc. of IEEE Robot and Human Interactive Communication
[29] Iqbal T, Moosaei M, Riek LD (2016) Tempo Adaptation and Anticipation Methods for Human-Robot Teams. In: Robotics: Science and Systems, Planning for HRI: Shared Autonomy and Collab. Robotics Workshop
[30] Iqbal T, Rack S, Riek LD (2016) Movement coordination in human-robot teams: A dynamical systems approach. IEEE Transactions on Robotics 32(4):909–919
[31] Jarrassé N, Charalambous T, Burdet E (2012) A framework to describe, analyze and generate interactive motor behaviors. PLoS One
[32] Kismet Robot (2017) https://www.ai.mit.edu/projects/humanoid-robotics-group/kismet/kismet.html
[33] Knoblich G, Jordan JS (2003) Action coordination in groups and individuals: learning anticipatory control. J Exp Psychol Learn Mem Cogn
[34] Knoblich G, Butterfill S, Sebanz N (2011) Psychological Research on Joint Action: Theory and Data. In: Res. Theory
[35] Konvalinka I, Vuust P, Roepstorff A, Frith CD (2010) Follow you, follow me: continuous mutual prediction and adaptation in joint tapping. J of Experimental Psychology
[36] Konvalinka I, Xygalatas D, Bulbulia J, Schjødt U, Jegindø EM, Wallot S, Van Orden G, Roepstorff A (2011) Synchronized arousal between performers and related spectators in a fire-walking ritual. P Natl Acad Sci USA
[37] Koppula HS, Jain A, Saxena A (2016) Anticipatory planning for human-robot teams. Springer Tracts Adv Robot
[38] Leite I, Castellano G, Pereira A, Martinho C, Paiva A (2012) Modelling empathic behaviour in a robotic game companion for children: an ethnographic study in real-world settings. In: ACM/IEEE Int Conf Human-Robot Interact
[39] Lohan KS, Lehmann H, Dondrup C, Broz F, Kose H (2017) Enriching the human-robot interaction loop with natural, semantic and symbolic gestures. Section: Human-Humanoid Interaction, Humanoid Robotics: a Reference
[40] Lorenz T, Mörtl A, Vlaskamp B, Schubö A, Hirche S (2011) Synchronization in a goal-directed task: human movement coordination with each other and robotic partners. In: Proc IEEE RO-MAN
[41] Mainprice J, Berenson D (2013) Human-robot collaborative manipulation planning using early prediction of human motion. In: IEEE/RSJ Int Conf Intell Robot Syst
[42] Moon A, Troniak DM, Gleeson B, Pan MKXJ, Zheng M, Blumer BA, MacLean K, Croft EA (2014) Meet Me Where I'm Gazing: How Shared Attention Gaze Affects Human-Robot Handover Timing. In: ACM/IEEE Int. Conf. Human-Robot Interact.
[43] Mörtl A, Lorenz T, Hirche S (2014) Rhythm patterns interaction - synchronization behavior for human-robot joint action. PLoS One
[44] Nao Robot (2017) https://www.ald.softbankrobotics.com/en/cool-robots/nao
[45] Néda Z, Ravasz E, Brechet Y, Vicsek T, Barabási AL (2000) Self-organizing processes: The sound of many hands clapping. Nature
[46] Newman-Norlund RD, Noordzij ML, Meulenbroek RGJ, Bekkering H (2007) Exploring the brain basis of joint action: co-ordination of actions, goals and intentions. Soc Neurosci
[47] Niekum S, Chitta S (2013) Incremental Semantically Grounded Learning from Demonstration. Robot Sci Syst IX
[48] Nigam A, Riek LD (2015) Social context perception for mobile robots. In: IEEE/RSJ Intelligent Robots and Systems (IROS)
[49] Nikolaidis S, Gu K, Ramakrishnan R, Shah J (2014) Efficient Model Learning for Human-Robot Collaborative Tasks. pp 1–9, DOI 10.1145/2696454.2696455
[50] Nomura T (2017) Empathy as signaling feedback between (humanoid) robots and humans. Section: Human-Humanoid Interaction, Humanoid Robotics: a Reference
[51] Novembre G, Ticini LF, Schütz-Bosbach S, Keller PE (2014) Motor simulation and the coordination of self and other in real-time joint action. Soc Cogn Affect Neurosci
[52] O'Connor MF, Riek LD (2015) Detecting social context: A method for social event classification using naturalistic multimodal data. In: Automatic Face and Gesture Recognition (FG)
[53] Ong K, Seet G, Sim S (2008) An Implementation of Seamless Human-Robot Interaction for Telerobotics. Int J Adv Robot Syst
[54] Open Source Robotics Foundation (2017) https://www.osrfoundation.org/
[55] Pérez-D'Arpino C, Shah J (2015) Fast target prediction of human reaching motion for cooperative human-robot manipulation tasks using time series classification. In: IEEE Conf. on Robotics and Automation
[56] Poppe R (2010) A survey on vision-based human action recognition. Image Vis Comput
[57] PR2 Robot (2017) http://www.willowgarage.com/pages/pr2/overview
[58] Rack S, Iqbal T, Riek L (2015) Enabling synchronous joint action in human-robot teams. In: Proc. of ACM/IEEE Human-Robot Interaction
[59] Ramanathan M, Yau WY, Teoh EK (2014) Human Action Recognition With Video Data: Research and Evaluation Challenges. IEEE Transactions on Human-Machine Systems
[60] Richardson MJ, Garcia RL, Frank TD, Gregor M, Marsh KL (2012) Measuring group synchrony: a cluster-phase method for analyzing multivariate movement time-series. Frontiers in Physiology
[61] Rickert M, Gaschler A, Knoll A (2017) Applications in HHI: Physical Cooperation. Section: Human-Humanoid Interaction, Humanoid Robotics: a Reference
[62] Riek L, Rabinowitch TC, Bremner P, Pipe A, Fraser M, Robinson P (2010) Cooperative gestures: Effective signaling for humanoid robots. In: ACM/IEEE Int Conf Human-Robot Interact
[63] Riek LD (2013) The Social Co-Robotics Problem Space: Six Key Challenges. In: Proc. of RSS, Robotics Challenges and Visions
[64] Riek LD, Robinson P (2008) Real-time empathy: Facial mimicry on a robot. In: International Conference on Multimodal Interfaces, Affective Interaction in Natural Environments (AFFINE)
[65] Riek LD, Paul PC, Robinson P (2010) When my robot smiles at me: Enabling human-robot rapport via real-time head gesture mimicry. J Multimodal User Interfaces
[66] Riek LD, Rabinowitch TC, Bremner P, Pipe AG, Fraser M, Robinson P (2010) Cooperative gestures: Effective signaling for humanoid robots. In: Proc. of ACM/IEEE HRI
[67] Ryoo M, Fuchs TJ, Xia L, Aggarwal J, Matthies L (2015) Robot-centric activity prediction from first-person videos: What will they do to me? In: Proc. of ACM/IEEE HRI
[68] Ryoo MS, Matthies L (2013) First-Person Activity Recognition: What Are They Doing to Me? In: Proc. of IEEE Computer Vision and Pattern Recognition
[69] Sandini G, Sciutti A, Rea F (2017) Movement-based communication for humanoid-human interaction. Section: Human-Humanoid Interaction, Humanoid Robotics: a Reference
[70] Sebanz N, Knoblich G (2009) Prediction in Joint Action: What, When, and Where. Top Cogn Sci
[71] Sebanz N, Bekkering H, Knoblich G (2006) Joint action: bodies and minds moving together. T Cogn Sci
[72] Shah J, Breazeal C (2010) An Empirical Analysis of Team Coordination Behaviors and Action Planning With Application to Human-Robot Teaming. Hum Factors J Hum Factors Ergon Soc
[73] Shah J, Wiken J, Williams B, Breazeal C (2011) Improved human-robot team performance using Chaski, a human-inspired plan execution system. In: Proc 6th Int Conf Human-Robot Interact
[74] Shi C, Shiomi M, Smith C, Kanda T, Ishiguro H (2013) A Model of Distributional Handing Interaction for a Mobile Robot. Robot Sci Syst
[75] Słowiński P, Zhai C, Alderisio F, Salesse R, Gueugnon M, Marin L, Bardy BG, di Bernardo M, Tsaneva-Atanasova K (2015) Dynamic similarity promotes interpersonal coordination in joint-action
[76] Strabala K, Lee MK, Dragan A, Forlizzi J, Srinivasa SS, Cakmak M, Micelli V (2012) Towards Seamless Human-Robot Handovers. J Human-Robot Interact
[77] Thomaz A, Hoffman G, Cakmak M (2016) Computational Human-Robot Interaction. Foundations and Trends in Robotics
[78] Unhelkar VV, Pérez-D'Arpino C, Stirling L, Shah J (2015) Human-robot co-navigation using anticipatory indicators of human walking motion. In: IEEE Conf. on Robotics and Automation
[79] Valdesolo P, Ouyang J, DeSteno D (2010) The rhythm of joint action: Synchrony promotes cooperative ability. J Exp Soc Psychol
[80] Varni G, Volpe G, Camurri A (2010) A System for Real-Time Multimodal Analysis of Nonverbal Affective Social Interaction in User-Centric Media. IEEE T Multimedia
[81] Vesper C, Butterfill S, Knoblich G, Sebanz N (2010) A minimal architecture for joint action. Neural Networks