Multimodal Segmentation on a Large Interactive Tabletop: Extending Interaction on Horizontal Surfaces with Gaze

Joshua Newn, Eduardo Velloso, Marcus Carter, Frank Vetere
Microsoft Research Centre for Social NUI
The University of Melbourne
[joshua.newn][evelloso][marcusc][f.vetere]@unimelb.edu.au
ABSTRACT
Eye tracking is a promising input modality for interactive tabletops. However, issues such as eyelid occlusion and the viewing angle at distant positions present significant challenges for remote gaze tracking in this setting. We present the results of two studies that explore how gaze interaction can be enabled. Our first study contributes the results of an empirical investigation of gaze accuracy on a large horizontal surface, finding gaze to be unusable close to the user (due to eyelid occlusion), accurate at arm's length, and only precise horizontally at large distances. In consideration of these results, we propose two solutions for the design of interactive systems that utilise remote gaze tracking on the tabletop: multimodal segmentation, and the use of X-Gaze, our novel technique, to interact with out-of-reach objects. Our second study evaluates and validates both these solutions in a Video-on-Demand application, presenting immediate opportunities for remote-gaze interaction on horizontal surfaces.
Author Keywords
Interactive tabletop; large horizontal surfaces; eye tracking; smooth pursuit; gaze interaction; multimodal interaction.
ACM Classification Keywords
H.5.2. Information Interfaces and Presentation: User Interfaces: Input devices and strategies
INTRODUCTION
Touch is the most widely supported input modality for interactive tabletops, providing a precise, spatial, and natural means of interacting with digital content [2, 6]. As the sizes of interactive surfaces increase, reachability becomes a problem, as touch input can only be used within physical reach [31]. To mitigate this, various novel interaction techniques that extend touch on tabletops have been proposed in the literature: through emulation of a mouse [5], the use of tools/widgets [1, 7], or by introducing an additional input device/modality [4, 16, 18, 32].
Figure 1: Multimodal Segmentation. Left: Input modalities for each segmented region. Right: The application built in relation to where each modality/interaction technique works best. This is an animated figure and is best viewed in Adobe Reader.
Recently, eye tracking has been shown to be a promising technology for remote interaction with very large vertical displays, as gaze is fast, naturally drawn to objects of interest, and able to interact with objects beyond our reach [29, 30, 34]. However, remote gaze tracking is inaccurate on large horizontal surfaces. As the results from our first study demonstrate, it is constrained for two reasons. At the far end, the wide visual angle leads to inaccuracies, while at the end closer to the user, the eyelid tends to cover the pupil as the user looks down. To get around these constraints, HCI researchers have typically used wearable eye trackers to simulate perfect gaze tracking on tabletops [3, 14]. This approach limits real-world deployment, where remote gaze tracking is often the only suitable option, e.g., in unsupervised walk-up-and-use interactive surfaces found in museums [10] or in collaborative spaces where impromptu interactions take place [39].
This paper contributes the initial steps towards enabling interaction on a large interactive tabletop with gaze input. Our first study identifies the gaze estimation accuracy and precision on a large tabletop display with a low-cost remote eye tracker. The results show that, whereas gaze pointing works well at the centre of the screen, the accuracy decreases exponentially with distance. However, even at the far end of the table, horizontal eye movements are still tracked with high precision. Based on these results, we contribute a multi-level interaction architecture in which we divide the table into three regions that employ different input modalities, i.e. multimodal segmentation (see Figure 1-Left). In the first, the Touch Region, users interact using touch gestures. In the second, the Gaze+Touch Region, users point with gaze combined with indirect touch gestures. In the third, the X-Gaze Region, users select targets with X-Gaze, a novel technique we developed in this work using a horizontal smooth pursuit tracking algorithm adapted from Vidal et al. [36]. Our second study evaluates both solutions, multimodal segmentation and X-Gaze, through a Video-on-Demand application (see Figure 1-Right) that demonstrates how they can be used in a cohesive interface on an interactive tabletop. Lastly, we present the immediate future opportunities arising from these solutions.
RELATED WORK
A considerable amount of research in HCI has explored the use and design of large (>1 m²) interactive tabletops. Touch has been the common input modality on tabletops, reflecting its ease of use, intuitiveness, and social advantages [2, 6]. However, touch is limited by the user's reach, which undermines these advantages. Gaze input presents a possible solution [12], yet HCI researchers have typically used wearable eye trackers to simulate perfect gaze tracking on tabletops [3, 14, 37]. This approach is likely due to the inherent challenge of remotely detecting gaze on a horizontal plane. As a result, few studies have explored the use of remote gaze tracking on tabletops, and those predominantly in the context of intent prediction [17, 41].
Holman [12] first proposed the idea of using remote gaze as a possible solution to the inherent problems of interacting with large tabletop surfaces (e.g. reachability). Based on this idea, an eye-tracking tabletop interface (ETTI) and an accompanying game were developed and evaluated by Yamamoto et al. [40, 41]. Beyond demonstrating the ability to predict a user's intention based on gaze detection, their experiment highlighted two developments for interaction with horizontal surfaces using gaze. First, it showed the potential for unsupervised walk-up-and-use interaction using remote gaze, such as in museums: the natural habitat of multi-user large horizontal surfaces [10]. Second, their initial empirical study of gaze detection accuracy showed that remote gaze can be used in a horizontal configuration with high accuracy. However, their system was designed so that all parts of the projected surface could be reached (within arm's length) by any single user.
Mauderer et al. [17] demonstrated the use of remote gaze on a larger surface (53-inch), where reachability starts to become a problem, which their proposed technique attempted to address. By combining touch and gaze using two existing interaction techniques, MAGIC [42] and Superflick [20], the operable area was extended, allowing gaze-assisted object placement through a flick gesture. Their results showed a high error in the y-axis compared to the x-axis across different conditions, warranting further investigation. Accordingly, our first study sets out to empirically characterise
Figure 2: The diagram illustrates the theoretical regions on the tabletop corresponding to a 5-degree visual angle near to and far from the user on a 1 m projected surface.
remote gaze tracking on a large horizontal surface. Nevertheless, these works highlight two challenges faced when using remote gaze tracking on horizontal surfaces. First, commercial eye trackers are built for vertical displays, so their underlying algorithms start from this assumption. Second, the differences in visual angle when a user is positioned on one side of the surface cause difficulties in gaze tracking.
The combination of remote gaze and touch has been explored in various ways. Stellmach and Dachselt [29] combined touch (using a hand-held device) and remote gaze input with distant displays, demonstrating the potential for multimodal gaze-supported interaction. Pfeuffer et al.'s [19] work emphasises the complementary combination of gaze for selection and multi-touch for manipulation on the same horizontal surface. Their Gaze-touch technique allowed objects within reach to be interacted with using touch, while objects out of reach were selectable using gaze and manipulated by touch gestures in the reachable space. Results showed that when touch is combined with gaze, interaction speed increased and, as physical mid-air hand movements decreased, fatigue decreased as well. Serim and Jacucci [25] illustrated that gaze input could be used to extend interaction beyond gaze pointing as seen in related works (e.g. [19, 29, 30, 34]). Their results show that gaze can support interface use with decreased reliance on visual guidance. For large interfaces, this means that touch input combined with gaze could be used differently, i.e. distinguishing whether the user touches where she is or is not looking. As demonstrated in the literature, combining touch with gaze shows great promise, motivating us to explore this area further. Thus, the contributions in this paper aim to extend this existing area of literature.
STUDY 1: CHARACTERISING EYE GAZE ON TABLETOPS
We noted two significant challenges for remote gaze tracking in this setting. First, commercial eye trackers are built for vertical displays, so their underlying algorithms start from this assumption. Second, the gaze inaccuracy increases with distance from the user on a horizontal surface due to the differences in visual angle. Here, we identify a third challenge: when users look down at the edge of the table closest to them, the eyelids cover the pupil, thereby deteriorating the accuracy. Focusing on the second challenge, the further from the user, the wider the estimation error is for a given visual angle (see Figure 2). Using the derived equation below, it is possible to predict how inaccurate (Error) the gaze estimation will be depending on the distance to the user (Distance), the given visual angle (θ), and the height of the eyes (Height).
Error = [ Height × tan(θ + arctan(Distance / Height)) − Distance ] × 100
Figure 2 shows the theoretical error curve at scale for a 5-degree visual angle at a 60 cm eye height on a 1 m surface. To characterise the gaze estimation on the tabletop in practice, we designed a study that builds on the gaze estimation accuracy evaluation by Yamamoto et al. [41] but on a much larger surface, simultaneously investigating the high error rates in the y-axis found in Mauderer et al.'s [17] evaluation. The aim of this study is therefore to empirically characterise the limitations of remote gaze on a large horizontal surface. This was done by measuring, for each ground truth point displayed, the error between it and the estimated gaze point.
Hardware Configuration
As shown in Figure 2, a face-down short-throw projector was used to create a 45-inch (1000 × 563 mm) horizontal display with a resolution of 1920 × 1080 px on the tabletop. A low-cost eye tracker (Tobii EyeX, 30 Hz) was mounted on the table surface at the edge of the projection, facing the user at a 40-50 degree angle and 25 cm away from the short edge of the table. As the eye tracker's specified working distance is 45-80 cm (tracker to eyes), this positioning was suitable and able to accommodate participants of different seated heights; for each participant, the angle of the tracker was adjusted to point towards her eyes from below for better tracking performance.
Study Procedure
Upon arrival, participants signed an informed consent form and completed a simple demographics questionnaire. Participants were asked to sit comfortably in front of the eye tracker, which we first calibrated using its default 9-point procedure. Participants were then asked to look at 63 circular target points (size: 20 px, approx. 1 cm in diameter) displayed sequentially for 4 seconds each. All participants looked at the same sequence, row by row. To make the task more comfortable, each point was presented with a grow/shrink animation.
Results
We recruited 10 participants (7M/3F), aged 25 to 52 years (mean = 32); 6 participants wore glasses, and only 2 had previous experience with eye tracking. For each of the 63 points, we computed the difference between the estimated coordinates of the gaze point and the corresponding coordinates of the ground truth. Figure 3 shows that the error in each direction depends on the position of the ground truth on the tabletop. To elaborate, vertical error refers to the deviation (in pixels) of the estimated gaze coordinates from the ground truth point in terms of the Y coordinate; likewise, horizontal error refers to the deviation in the X coordinate. With the points arranged in 9 rows of 7, Figure 3(a) shows the error for the 9 rows in terms of the y-axis, while Figure 3(b) shows the same in terms of the x-axis.
We found that in the y-axis, the error increases exponentially with the angle between the gaze direction and the tabletop, as predicted by the error estimation equation. However, there is a slight increase in the error at the positions immediately closest to the user, likely due to the eyelids partially covering the pupil. The x-axis presented a much smaller
Figure 3: Horizontal and vertical error. The vertical error increases exponentially with the distance to the user. The horizontal error is much smaller in magnitude and follows a slender U-shaped curve.
estimation error, with a curve following a slender U-shape. This is also expected, as the distance is at its minimum when users are looking straight ahead, increasing as they look towards either side. In any case, this represents consistently low error throughout the x-axis in comparison to the y-axis. We then evaluated how the distance to the user affects the tracking precision. The effect of the Y coordinate of the ground truth points on the mean horizontal and vertical standard deviations of the measurements was tested using Pearson's product-moment correlation. A significant correlation between the Y coordinate and the vertical standard deviation was found (Pearson's r(7) = 0.84, p = 0.0045), but no significant correlation with the horizontal standard deviation (p = 0.22). This means that for each target point further away from the user, the recorded gaze estimation points deviated more from the target point. As the result is highly significant (p < 0.01), it confirms our original prediction that gaze inaccuracy increases with distance in this setting. As mentioned, there is no statistical significance in the x-axis, meaning that the deviations remained consistent throughout and were not affected by changes in horizontal distance, showing that the X coordinate remained fairly accurate and usable for all 63 points.
To confirm this, we conducted a follow-up study in which the eye tracker was moved to the centre of the long edge of the table to obtain a wider surface. We collected gaze data from 10 participants (6M/4F), aged 22 to 52 years (mean = 33.2). Four participants wore glasses and three had no experience with eye trackers. In addition, three participants had taken part in our initial study. Figure 3(c) shows a similar plot to Figure 3(a), showing the consistency of error at the positions closest to the user in both studies. More importantly, Figure 3(d) shows that the tracking error remains consistently low in this configuration; we tested the correlation between the Y coordinate and the horizontal standard deviation and found no significant correlation (p = 0.8955). These results strengthened our confidence in using the X coordinate for interaction on a large tabletop, especially at the end furthest from the user.
In summary, these results show that the gaze estimation at the far end of the tabletop is highly inaccurate and that the accuracy decreases exponentially. However, the results also show that even though the vertical precision deteriorates as the distance to the user increases, the horizontal precision remains the same. For interface design, this suggests that techniques that require high accuracy (e.g. gaze pointing) are not well suited for the far end of the tabletop, but techniques that only require a consistently precise estimate, such as Pursuits [36], can still work if only the horizontal direction is considered. Instead of comparing the absolute positions of the gaze point and the target, Pursuits compares the relative movement of the eyes with moving targets on the screen. In the next section, we describe how Pursuits can be adapted to extend interaction at the far end of the tabletop with gaze. Finally, it appears that there is some tracking deterioration at the edge closer to the user due to eyelid occlusion.
STUDY 2: INTERACTIVE SYSTEM EVALUATION
The results from our first study showed that vertical gaze estimation is considerably affected by the distance to the user, but that the relative movement in the horizontal axis can still be used for interaction, even at the furthest area of the tabletop. Moreover, the findings also showed that the estimation error is at its minimum at the centre of the table, increasing again as it gets closer to the user due to eyelid occlusion. With knowledge of these constraints, we designed a system that segments the tabletop into three distinct regions, each with an interaction technique that overcomes its inherent shortfalls and builds on its opportunities, which we call multimodal segmentation. Further, we draw on guidelines by Shen et al. [27] that consider occlusion and reach, the two overarching problems our work aims to address. The purpose of the system is to evaluate the viability of (1) gaze-only interaction at the far end of the tabletop and (2) interacting with a large surface using multimodal segmentation.
Segmented Regions
In the region closest to the user (Touch Region), the gaze estimation error is high due to eyelid occlusion, but this is not necessarily a problem since the user can still interact with the region using touch. Touch is a well-suited interaction modality for reachable areas on horizontal surfaces; for this reason, only touch was adopted in this region. In the central region of the tabletop (Gaze+Touch Region), touch-based interaction techniques become awkward due to the need to reach out, whereby the user needs to move in order to reach sufficiently far. However, it is the region with the smallest gaze estimation error, meaning that both the X and Y gaze coordinates are at their most accurate and thus usable for gaze pointing in this region. Therefore, the interaction technique used in this region draws from Pfeuffer et al.'s Gaze-touch, where we combine gaze pointing with indirect touch confirmation, i.e. touch is used for manipulation in the region close to the user, while gaze is used for selection beyond the reach of the user [19].
In the area furthest from the user (X-Gaze Region), the user cannot physically reach targets, and the gaze estimation error is high in the vertical axis. However, the still-precise horizontal axis can be taken advantage of by displaying targets that move only in the horizontal direction and correlating their X coordinate with the X coordinate of the eyes. This approach has been used in previous works using 2D movements to enable interaction with public displays [36] and with smart watches [9], using both X and Y coordinates. We adapted the approach by using the X coordinate alone to overcome the limitations of remote gaze estimation at far distances on horizontal surfaces that we identified in our first study.
Smooth Pursuit Eye Movements
Smooth pursuit eye movements have recently been proposed as a solution for contexts where calibration and precise pointing are challenging, such as with public displays, smart watches, and smart homes [9, 35, 36]. The technique works by correlating the smooth movement of the eyes with moving targets on the interface to detect where the user is looking, leveraging the smooth movement our eyes naturally perform when we follow a moving object. It is known that our eyes are naturally drawn towards objects of interest, such as moving objects [13]. The Pursuits technique is suitable for scenarios where gaze tracking is inherently inaccurate, such as on horizontal surfaces, as what matters is not exactly where the user is looking but the movement the eyes make when fixated on a moving object. The details of how the Pursuits algorithm works can be found in Vidal et al. [36].
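As a rough illustration of how such a correlation step could look when restricted to the horizontal axis (as X-Gaze does), consider the sketch below. It is our reconstruction, not the authors' implementation: the window length and threshold take the 2-second and 0.95 values reported later in the Discussion, the channel names are invented, and a Pearson correlation over a sliding window stands in for the Pursuits correlation step [36].

```python
from collections import deque
from scipy.stats import pearsonr

WINDOW = 60       # samples: a 2 s activation window at the tracker's 30 Hz
THRESHOLD = 0.95  # minimum correlation for a selection (the robust setting)

gaze_x = deque(maxlen=WINDOW)
target_x = {name: deque(maxlen=WINDOW) for name in ("news", "music", "sport")}

def on_frame(gaze_sample_x, target_positions):
    """Feed one tracker frame; return the selected tag name, or None."""
    gaze_x.append(gaze_sample_x)
    for name, x in target_positions.items():
        target_x[name].append(x)
    if len(gaze_x) < WINDOW:
        return None                      # wait until the window is full
    best_name, best_r = None, THRESHOLD
    for name, xs in target_x.items():
        r, _ = pearsonr(gaze_x, xs)      # compares relative movement, not position
        if r > best_r:
            best_name, best_r = name, r
    return best_name
```

Because the correlation compares trajectories rather than absolute positions, a constant offset in the gaze estimate (the dominant error at the far end of the table) does not affect the result.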
Here, we note that in the X-Gaze Region, the optimal parameters are not known, as the Pursuits algorithm had not previously been implemented on a horizontal surface, let alone one that is large in size. In theory, a horizontal adaptation of the algorithm is well suited to visual interaction. Collewijn and Tamminga [8] suggested that, due to the extensive practice we get following everyday objects, whose motion tends to be horizontal, our ability to perform horizontal smooth pursuit is likely to be better than vertical smooth pursuit. Similar results are seen in Rottach et al.'s study [22], in which horizontal, vertical, and diagonal smooth pursuit eye movements were compared. Furthermore, existing implementations of Pursuits use gaze as the sole input; this paper presents an example of how Pursuits can be combined as part of a multimodal system.
Application
To illustrate how these regions can be combined into one cohesive application, we built a Video-on-Demand application that allows users to explore an online video library in a multimodal fashion (see Figure 1). In the X-Gaze Region, multiple tags are displayed that correspond to different video channels. These tags move left and right in distinct patterns, and the smooth pursuit correlation algorithm [36] is used to select them. However, not considering the Y coordinate of the eye gaze and tag movements substantially reduces the number of possible trajectories for different targets. For example, in 2D, even if two objects present the same horizontal movement, the selection can be disambiguated by the vertical axis. To compensate for this, the X-Gaze Region is divided into two side-by-side sub-regions (see Figure 4). The absolute X coordinate is used to estimate which sub-region the user is looking at, and moving targets are presented only in that sub-region. The relative movement of the targets is then compared with the relative movement of the eyes to select a specific tag. This way, only one of the sub-regions presents any movement at a time, with movement starting depending on the absolute horizontal coordinate of the gaze point. Moreover, sub-regioning doubles the number of visible tags, presenting more selectable channel options.

Figure 4: X-Gaze technique illustration.
When a tag is selected, the Gaze+Touch Region is populated with videos that satisfy that query. The thumbnails of the videos are displayed along with their titles and summary descriptions. To select a video, the user looks at its thumbnail and touches the Play button displayed in the Touch Region at the very bottom of the table, together with other playback controls. The video is shown in the Touch Region. The user can also use the Next 10 button in the Touch Region to repopulate the Gaze+Touch Region with another 10 videos.
Hardware Configuration
We used the same hardware configuration as in the previous study, but with added touch capabilities using an overhead Kinect mounted on top of the projector (see Figure 5-Right). Touch events were detected using the Ubi Displays Toolkit [11], which matched touch points on the table to the pixels being projected. The eye tracker was moved to the bottom of the Gaze+Touch Region, as there was no longer a need to track the user's eyes in the Touch Region, and touch gestures could have occluded the eye tracker in its previous location. Moving the eye tracker further from the user enlarges the area in which gaze can be used, and therefore the possible interaction area. In the enhanced setup used in this study, a 25% increase in display size (45-inch to 60-inch) was achieved by moving the eye tracker further forward from the user. A larger display means that more content can be displayed at any one time. The placement of the segmented regions in relation to the hardware and user is shown in Figure 5-Right.
Data Collection
The screen was recorded using Open Broadcaster Software (OBS), alongside the recording from a video camera that captured the side view of the user and the interface, as shown in Figure 5-Right. This generated video recordings for analysis, showing the actions of the users synchronised with what they saw on the display. Throughout the tasks, participants were asked to employ the think-aloud technique. These data primarily provide insights into the participants' thinking and whether they understood the system. Both interview and think-aloud data were included as part of the combined video recording, which was transcribed and coded using thematic analysis. Observations were made from a desk placed behind the participant, viewing the actions on the interface through a display clone. Notes were taken during both the tasks and the semi-structured interviews for use in the data analysis. Participants were then asked to complete a post-study questionnaire with a five-point semantic differential scale consisting of 15 dimensions (e.g. "I felt uncomfortable" versus "I felt comfortable") (see Figure 6), focusing on the X-Gaze technique, multimodal segmentation, and the overall user experience, in line with the goals of the evaluation.

Figure 5: Study 2 application and hardware configuration. Left: Highlight selection used in the Gaze+Touch Region, with a reachability demonstration shown. Right: Enhanced setup used, with added touch capabilities.
Study Procedure
Upon arrival, participants were asked to complete the consent form and a simple demographics questionnaire. Participants were then asked to sit comfortably in front of the eye tracker and undergo a standard calibration using the default 9-point procedure. Upon a successful calibration, participants were asked to perform three tasks sequentially. The tasks increased in difficulty, encouraged the participant to use the system as a whole, and were adapted from the functionality (i.e. find similar, remove odd) of a prior study on video searching [28]. In the first task, we explained how the system worked through a live tutorial, guiding participants through the steps before inviting them to explore the video library. Here, participants were encouraged to use all regions on the tabletop to familiarise themselves and to get over the initial learning curve, while we avoided giving away how the interface should be used. This allowed us to observe how participants initially approached the techniques without the complexities of a difficult task. It was followed by a short semi-structured interview to gain insights into participants' impressions, their overall perception of the system, the interaction techniques used in each region, and any difficulties that they had experienced so far.
The following tasks were presented one after another, and participants were asked to stop if they could not find all videos after 5 minutes. In the second task, the participant was asked to find similar videos, requiring the user to jump between the channels while looking for, e.g., cat videos, which are not all in a single channel. A target cat video belongs to the channel but may not be immediately obvious at first. For example, in one channel the thumbnail is a cartoon illustration of a cat instead of a picture, while in another the word "cat" appears in the video title but the thumbnail does not display a cat. In the third task, the videos were shuffled among the channels and the participants were asked to find the videos that did not belong to their respective channels. Among the 10 videos displayed, any number could be ones that did not belong to that channel, encouraging participants to go back and forth between the channels to contrast the videos with the dominant type of video in each channel. The placement of the content in relation to the tasks therefore required participants not only to navigate within a channel but also to switch between different channels, encouraging the use of the X-Gaze technique implicitly rather than explicitly. Once completed, the participants were asked to complete the post-study questionnaire, followed by a second semi-structured interview.

Figure 6: Study 2 post-study questionnaire results. The median score is shown for each dimension.
Results
A total of 13 participants took part in the study (4F/9M), aged between 24 and 38 years (mean = 28.2). However, one participant (P7) was omitted due to an inability to calibrate with the eye tracker's default calibration. All participants had little to no experience with eye tracking. Two participants wore corrective contact lenses and one wore glasses. All participants either completed all tasks within 5 minutes or gave up by asking where the remaining videos were located. Figure 6 shows the results of the post-study questionnaire, in which participants rated each dimension from 1 (strongly unfavourable) to 5 (strongly favourable). The semi-structured interviews provided further insight into these scores.
The X-Gaze technique in the distant region was quickly highlighted, with 10 out of 12 participants reporting that the technique, being minimal, was easy to learn, and that it felt natural enough that there was little room for improvement (e.g. "I looked at things and it worked right away" [P8]). This corresponds with the low improvement scores but high learnability scores in the questionnaire. However, three participants (P4, P9, P13) reported that they felt the technique was somewhat slow. Overall, participants praised the novelty of the technique, along with the ability to accurately select distant targets solely with gaze, despite the technique being unusual. For example, P3 mentioned: "You follow something with your eyes and it then selects. There is no real-world equivalent. It's easy and it's effective, but it doesn't feel like something I would normally do in my life. Though, it was intuitive, so I wasn't thinking about it too much."
In the Gaze+Touch Region, we chose not to use a gaze pointer (or cursor); instead, the tiles were highlighted (blue) when selected by dwell time, as shown in Figure 5-Left. When the gaze estimate was accurate, this provided effective, subtle feedback, but when the tracking was particularly inaccurate, it created a flickering nuisance, as noted in observation and confirmed in our video analysis. This is despite the centre of the screen being the area of the tabletop where gaze is most accurate, showing that gaze tracking in this configuration is still substantially less accurate than on vertical screens. This was expected to some extent and was compensated for by the use of large targets for the video thumbnails. However, half the participants still reported that the targets at the edges of the screen were more difficult to select than the ones in the centre. A subset of these participants (P3, P8, P12) also reported that it was sometimes harder to select videos on the bottom row (closer to the user) than on the top row (further from the user). This difficulty in selection led to the low gaze accuracy score. On the other hand, participants P4 and P5 mentioned that when reading the video titles, the button would highlight suddenly (due to delayed dwell-time activation), causing "a shock to the eyes" and contributing to eye fatigue. Further, P4 mentioned: "When I wanted to look at something, I just wanted to look, but when I wanted to select something, I wanted it to be fast." This reflects the Midas Touch problem, an inherent and expected problem when using the eye gaze modality for interaction [13].
In contrast, a few participants (P3, P4, P12) explicitly noted the ability to select beyond their physical reach when interacting with out-of-reach areas, which reflects our motivation to use eye gaze to extend reachability on large tabletops. Their positive responses are as follows:

P3: "It made sense that you can touch things in front of you and everything else [demonstrated the use of eye gaze by selecting objects that were out of reach]. It's pretty unique to be able to select stuff that is so far away."

P4: "I think it works when it's a big screen like that. Like you don't want to go all the way and touch it."

P12: "When it's here [shows touch area], it's easy to use your hands, but when it's over there [points at out-of-reach area], it's hard, so yeah. It's actually very good. I really like it, it's easy and I don't want to be [participant stretches hand out]."
Overall, participants enjoyed using the system, citing the seamless transition between the regions, and rated it highly in all dimensions, as shown in Figure 6. In the next section, we discuss the issues that arose from our evaluation, followed by the implications of our solutions.
DISCUSSION
This paper demonstrates the immediate opportunities for enabling gaze interaction on large horizontal surfaces. The findings from the first study characterised the use of remote gaze on large interactive surfaces, in this case a large tabletop. This empirical investigation showed the different accuracies in different parts of the tabletop: the highest accuracy was found in the centre, while at the far end of the tabletop accuracy remained only along the horizontal axis. This led to multimodal segmentation, the formation of regions, with each region employing an interaction technique in accordance with its strengths and limitations. Subsequently, the characterisation informed the development of X-Gaze, our novel interaction technique that enables interaction on a large tabletop where the vertical axis is unusable, especially at the far end. In the second study, both multimodal segmentation and X-Gaze were evaluated through a Video-on-Demand application that illustrates how these regions can be used together in a cohesive interface on a large interactive tabletop. The evaluation showed positive results, yielding many interesting insights into how the system can be developed in the future. In this section, we draw attention to both solutions and discuss their implications in light of our findings.
X-Gaze: Findings and Opportunities
We presented the design, development, and evaluation of X-Gaze, a novel interaction technique we developed in direct response to the need to interact with distant targets using only X coordinates. In our evaluation, participants responded favourably to the technique: its low overhead made it minimal and therefore easy to learn, with little room for improvement, and it enabled the selection of out-of-reach targets. This highlights that even techniques that look unusual at first can still provide an effective means of input to interactive systems.
Some participants found that using the technique felt slower after adapting to it, especially after completing the first task. This is a common effect in eyes-only techniques that require the user to follow a target or to dwell on it for a certain amount of time, due to a trade-off between the system's responsiveness and its robustness to errors [15]. Another possibility is that participants transitioned to being expert users after successfully using the technique during the first task, and that these expert users then felt slow due to the thresholds in place to prevent false activations. Our implementation privileged robustness, using a 2-second activation window and a 0.95 correlation threshold, substantially higher than previous works [9, 36], which came at the cost of responsiveness. Future implementations could consider giving participants control over the activation window time, reducing it as users become more adept at using the technique.
The Pursuits technique [36], from which X-Gaze was adapted, relies on interfaces that are highly dynamic, i.e. interfaces with objects of interest moving with different trajectories and speeds. The authors referred to this as a potential limitation, as the constant movement might be a source of confusion or fatigue for users over longer periods of use. X-Gaze provides a possible solution to this by leveraging gaze-aware regions, which is possible with large surfaces because the general gaze direction and point-of-regard can be estimated. Only when a gaze-aware region is activated do its objects start to move, which further provides the user with feedback that gaze has been detected in that region. The application used in our evaluation had two gaze-aware sub-regions, a number that could easily be increased.
Furthermore, Vidal et al. [36] state that their technique is potentially unsuitable for objects that contain more than a short segment of text, as it may be difficult to read and follow moving text at the same time, and that objects that move too slowly may also cause poor performance. In our implementation, we demonstrated that text could be read, as it moves in only one direction and at a slow speed. As the coordinates of only one axis are used, the slow speed of the objects does not harm performance; rather, it provides users with more control. As mentioned, a high correlation threshold could be sustained, which is likely attributable to the fact that humans perform horizontal smooth pursuit better than vertical smooth pursuit [8, 22]. X-Gaze thus addresses some limitations of Pursuits, particularly in situations where gaze accuracy becomes a problem with distance and, consequently, where reachability becomes a problem too. We note that reachability is an issue when interacting with vertical surfaces as well [10], which presents opportunities for exploring the use of X-Gaze on large vertical displays, especially in public settings.
Multimodal Segmentation: A Viable Strategy
The informed decision to divide the tabletop into segmented regions can be viewed as a divide-and-conquer strategy, in which an interaction technique was employed to conquer each segment in accordance with its strengths. Our Video-on-Demand application illustrated how these different regions can work together cohesively. This cohesion was largely achieved by the use of a flat hierarchical structure that formed a natural division between the levels. It was also made possible by having at least three levels, each progressively further from the user. These levels are visible at all times, afforded by the large display, which encouraged the serendipitous discovery of content. Moreover, this visibility, together with the techniques employed in each region, allowed the user to quickly jump between the different channels, meaning that the user does not need to traverse up and down the hierarchy. For instance, while watching a video in the region closest to the user, the user can select another channel and scan for the next video while the current video keeps playing. The seamless transition between the regions builds on the use of a familiar drill-down navigational structure commonly seen in interface design. This allowed users to bring the desired content within their reach as their interaction moved from the furthest region to the region closest to them. The two highest median scores in the questionnaire were given to content and organisation, which were well received by participants.
However, the evaluation encountered some difficulties with gaze tracking, especially in the Gaze+Touch Region (centre of the tabletop). The deterioration in accuracy reflects the curve in Figure 3(b), which shows how the accuracy deteriorates towards the edges. Likewise, some participants encountered difficulty in selecting the videos closer to them, most likely due to the error of 200 to 400 pixels in the vertical axis shown in Figure 3(a), where there is some degree of tracking deterioration due to eyelid occlusion.
Focusing on the interaction techniques employed in the regions, this work presents the first instance in which two types of gaze interaction were combined in the same interface, i.e. gaze pointing and a natural gaze behaviour (smooth pursuit eye movement). Previous works that employ the former typically relied on accurate tracking, while those that employ the latter were used in scenarios where accurate pointing was challenging. In this study, both types were identified through the characterisation and the intended use of a large surface, leading to the employment of both. In the evaluation, participants had no issues switching between the two types of interaction, and were observed to do so somewhat unconsciously. This demonstrates that there are opportunities in combining both types of interaction. Lastly, humans naturally divide spaces on computers, such as by grouping windows and icons on desktop computers. On tabletops, whether digital or non-digital, it has been shown that there is a natural segmentation, which we aim to investigate as part of our future work and discuss further in the next section.

Figure 7: Regions in relation to the Theory of Tabletop Territoriality by Scott et al. [24] in a multi-user setting.
EXTENSIONS TO MULTIUSER TABLETOP SCENARIOS
Previous studies have articulated numerous advantages of using tabletops for collaborative work. Tables in general provide a large and natural interface for supporting human-to-human interaction, with characteristics that afford the gathering of people for face-to-face communication around a shared surface [26]. However, despite its advantages for subtle communication, sharing the surface has certain drawbacks related to both space and access. Ryall et al. [23] observed that the actions of people in this setting often conflict with one another, both intentionally and accidentally. On occasion, users want to use the whole table, but sometimes privacy becomes an issue when users want to interact with displayed elements without sharing them with other users, which raises the problem of undesired access in specific situations [21]. Therefore, multiuser tabletops should allow users to protect their data against undesired access.
A potential solution is to increase the size of the interactive surface and divide the tabletop interface into territories in accordance with the Theory of Tabletop Territoriality [24]. The authors observed three distinct territories on the table, namely personal, storage, and group (shared). This division limits other users from reaching into the personal space of another user. The personal territory is typically determined by the user's reach, while the storage territory is within their extended reach [31]. However, placing and reaching for objects outside one's personal and storage territories (e.g. in the group territory) then becomes a problem. Likewise, placing or obtaining an object in another user's personal territory, even with agreement from the owner, becomes a problem. Currently proposed techniques such as I-Grabber [1] allow the user to seamlessly interact with both out-of-reach territories from the user's current location without blocking the territory of another user. This once again raises the problem of undesired access, suggesting that it might be useful to implement authorisation and privacy protocols to prevent distant users from seeing or manipulating one's workspace without permission [38].

Figure 8: Potential application of X-Gaze in a multi-user setting. A user can initiate the movement while the other user selects using X-Gaze.
Our contributions present immediate opportunities to balance the issues of private personal territory, reachability, and having a form of authorisation protocol for seeking permission to access content from another user. Coincidentally, the regions defined in this work echo the territories of the tabletop presented by Scott et al. [24] (see Figure 7). In relation to these regions, the X-Gaze technique can be used to support multi-user sessions both for passing objects and as a method of authorisation. In their personal and immediate storage territories, users can interact using touch gestures, which are well suited for private tasks such as reading, writing, and annotating. In the group territory, users interact using the combination of gaze and touch. This not only makes users aware of each other's attention but allows them to directly interact with public content at a distance. Users can also use combinations of gaze and indirect touch gestures, as exemplified in Gaze+RST [34], to move items back and forth between this space and their personal space. Finally, X-Gaze elegantly supports protecting a user's personal territory from interference by other users [33]. For the user on the other side of the table to select an item, that item must move. If the owner of the personal space authorises the other user to interact with that content, she can move the object from side to side a few times. If the other user follows it with her eyes, she is able to select that object, which can then be transferred to her personal territory. To avoid unwanted selections, the user moving the object can casually gaze into the other user's personal territory, which also naturally facilitates feedback, an important function in human-computer interaction. For instance, the user passing the object will anticipate the object reappearing in the other user's personal territory, and when this occurs it provides the user with closure that the interaction has been successful.
Consequently, this form of authorisation presents a form of social contract and can be expanded to facilitate simultaneous exchange between two users, which is best demonstrated by way of example. Take Monopoly, and picture a digitised version on a large interactive tabletop with two users sitting opposite one another. The group space displays the board, while each user holds virtual deed cards to their properties and their virtual money notes in their personal territories. In the scenario where one user purchases a property from another, one user can move the virtual deed card while the other moves the money notes. Both users consequently gaze into the personal territory of one another, forming a social contract in which both users authorise one another simultaneously upon agreed terms, thereby facilitating a natural exchange. Without neglecting the opportunities in the group territory, it is possible that gaze awareness can be employed here, which could change the collaborative social experience in this setting. Tse et al. [32] mention that monitoring the gaze of others lets us know where they are looking and where attention is directed. More importantly, gaze awareness happens easily and naturally in co-located tabletop settings, as users can easily gauge what another user is gazing at, in addition to gaze being a great indicator of attention. Therefore, making gaze explicitly visible on a shared surface has some interesting connotations. For example, when solving a large jigsaw puzzle collaboratively, one user can say to another, "Can you get me that piece?". The other user will know which piece was referred to simply by looking where the first user is looking on the surface, serving as an implicit pointing method. Alternatively, it can be used competitively, for example in a game of chess, where one user who is aware of the other user's intentions through their visible gaze may infer whether they will change their strategy or try to trick their opponent. Nevertheless, we hope to further explore how a multi-user gaze-enabled tabletop can support collaborative tasks using our interaction architecture.
CONCLUSION
This paper contributes a first step towards enabling gaze interaction on large tabletops; for the first time, gaze-based interaction has been used effectively over a large distance on a horizontal surface (1 m²). This was achieved by first identifying, in our first study, that gaze along the x-axis (parallel to the user) remained accurate enough to be usable, unlike along the y-axis (perpendicular to the user). We highlight this first-time characterisation as a key contribution of this paper.
This informed the development of two solutions, which we evaluated in a second study. First, X-Gaze, a novel gaze-based interaction technique that leverages natural gaze behaviour, was developed to enable gaze-only interaction at the far end of the tabletop. It is important to note that there are natural limitations to gaze: even maturing enabling technology will not change the way our eyes work, and it is crucial that gaze-based techniques adhere to this fact. Consequently, novel techniques such as X-Gaze show great potential, for example in enabling users to select out-of-reach objects with low overhead and high precision. Second, we demonstrated how multimodal segmentation and X-Gaze can be incorporated into an interface design through a Video-on-Demand application, which we evaluated in a user study. Our findings showed that participants overall enjoyed using the system thanks to the seamless transition between the regions and, more importantly, showed that our solutions can be used in practice. They thereby address two specific problems with respect to remote eye tracking: (1) the inaccuracy of eye tracking on horizontal surfaces at long distances (i.e. beyond physical reach) and (2) the problem of eyelid occlusion at short distances. In future work, we will expand the solutions formed in this paper to support collaborative multi-user environments.
REFERENCES
1. Abednego, M., Lee, J.-H., Moon, W., and Park, J.-H. I-Grabber: Expanding physical reach in a large-display tabletop environment through the use of a virtual grabber. In Proc. of ITS '09, ACM (2009), 61–64.
2. Ardito, C., Buono, P., Costabile, M. F., and Desolda, G. Interaction with large displays: A survey. ACM Comput. Surv. 47, 3 (Feb. 2015), 46:1–46:38.
3. Bader, T., Vogelgesang, M., and Klaus, E. Multimodal integration of natural gaze behavior for intention recognition during object manipulation. In Proc. of the 2009 Int. Conf. on Multimodal Interfaces, ICMI-MLMI '09, ACM (2009), 199–206.
4. Banerjee, A., Burstyn, J., Girouard, A., and Vertegaal, R. Pointable: An in-air pointing technique to manipulate out-of-reach targets on tabletops. In Proc. of ITS '11, ACM (2011), 11–20.
5. Bartindale, T., Harrison, C., Olivier, P., and Hudson, S. E. SurfaceMouse: Supplementing multi-touch interaction with a virtual mouse. In Proc. of TEI '11, ACM (2011), 293–296.
6. Benko, H., Morris, M. R., Brush, A. B., and Wilson, A. D. Insights on interactive tabletops: A survey of researchers and developers. Tech. Rep. MSR-TR-2009-22, March 2009.
7. Bezerianos, A., and Balakrishnan, R. The Vacuum: Facilitating the manipulation of distant objects. In Proc. of CHI '05, ACM (2005), 361–370.
8. Collewijn, H., and Tamminga, E. P. Human smooth and saccadic eye movements during voluntary pursuit of different target motions on different backgrounds. The Journal of Physiology 351, 1 (1984), 217–250.
9. Esteves, A., Velloso, E., Bulling, A., and Gellersen, H. Orbits: Gaze interaction for smart watches using smooth pursuit eye movements. In Proc. of UIST '15, ACM (2015), 457–466.
10. Geller, T. Interactive tabletop exhibits in museums and galleries. Computer Graphics and Applications, IEEE 26, 5 (Sept 2006), 6–11.
11. Hardy, J., and Alexander, J. Toolkit support for interactive projected displays. In Proc. of MUM '12, ACM (2012), 42:1–42:10.
12. Holman, D. Gazetop: Interaction techniques for gaze-aware tabletops. In CHI '07 Extended Abstracts on Human Factors in Computing Systems, CHI EA '07, ACM (2007), 1657–1660.
13. Jacob, R. J. K. What you look at is what you get: Eye movement-based interaction techniques. In Proc. of CHI '90, ACM (1990), 11–18.
14. Lander, C., Gehring, S., Krüger, A., Boring, S., and Bulling, A. GazeProjector: Accurate gaze estimation and seamless gaze interaction across multiple displays. In Proc. of UIST '15 (2015), 395–404.
15. Majaranta, P. Communication and Text Entry by Gaze. IGI Global, 2012.
16. Marquardt, N., Jota, R., Greenberg, S., and Jorge, J. A. The continuous interaction space: Interaction techniques unifying touch and gesture on and above a digital surface. In Proc. of INTERACT '11, Springer-Verlag (2011), 461–476.
17. Mauderer, M., Daiber, F., and Krüger, A. Combining touch and gaze for distant selection in a tabletop setting. In CHI 2013: Workshop on Gaze Interaction in the Post-WIMP World (2013).
18. Parker, J. K., Mandryk, R. L., and Inkpen, K. M. Integrating point and touch for interaction with digital tabletop displays. Computer Graphics and Applications, IEEE 26, 5 (Sept 2006), 28–35.
19. Pfeuffer, K., Alexander, J., Chong, M. K., and Gellersen, H. Gaze-touch: Combining gaze with multi-touch for interaction on the same surface. In Proc. of UIST '14, ACM (2014), 509–518.
20. Reetz, A., Gutwin, C., Stach, T., Nacenta, M., and Subramanian, S. Superflick: A natural and efficient technique for long-distance object placement on digital tables. In Proc. of GI '06, Canadian Information Processing Society (2006), 163–170.
21. Remy, C., Weiss, M., Ziefle, M., and Borchers, J. A pattern language for interactive tabletops in collaborative workspaces. In Proc. of EuroPLoP '10, ACM (2010), 9:1–9:48.
22. Rottach, K. G., Zivotofsky, A. Z., Das, V. E., Averbuch-Heller, L., Discenna, A. O., Poonyathalang, A., and Leigh, R. Comparison of horizontal, vertical and diagonal smooth pursuit eye movements in normal human subjects. Vision Research 36, 14 (1996), 2189–2195.
23. Ryall, K., Forlines, C., Shen, C., Morris, M. R., and Everitt, K. Experiences with and observations of direct-touch tabletops. In Proc. of TABLETOP '06, IEEE Computer Society (2006), 89–96.
24. Scott, S. D., Carpendale, S., and Inkpen, K. M. Territoriality in collaborative tabletop workspaces. In Proc. of CSCW '04, ACM (2004), 294–303.
25. Serim, B., and Jacucci, G. Pointing while looking elsewhere: Designing for varying degrees of visual guidance during manual input. In Proc. of CHI '16, ACM (2016), 5789–5800.
26. Shen, C. From clicks to touches: Enabling face-to-face shared social interface on multi-touch tabletops. In Proc. of OCSC '07, Springer-Verlag (2007), 169–175.
27. Shen, C., Ryall, K., Forlines, C., Esenther, A., Vernier, F. D., Everitt, K., Wu, M., Wigdor, D., Morris, M. R., Hancock, M., and Tse, E. Informing the design of direct-touch tabletops. IEEE Comput. Graph. Appl. 26, 5 (Sept. 2006), 36–46.
28. Smeaton, A. F., Lee, H., Foley, C., and McGivney, S. Collaborative video searching on a tabletop. Multimedia Syst. 12, 4-5 (Mar. 2007), 375–391.
29. Stellmach, S., and Dachselt, R. Look & touch: Gaze-supported target acquisition. In Proc. of CHI '12, ACM (2012), 2981–2990.
30. Stellmach, S., and Dachselt, R. Still looking: Investigating seamless gaze-supported selection, positioning, and manipulation of distant targets. In Proc. of CHI '13, ACM (2013), 285–294.
31. Toney, A., and Thomas, B. H. Considering reach in tangible and table top design. In Proc. of TABLETOP '06, IEEE (2006), 2 pp.
32. Tse, E., Greenberg, S., Shen, C., and Forlines, C. Multimodal multiplayer tabletop gaming. Computers in Entertainment (CIE) 5, 2 (Apr. 2007).
33. Tse, E., Histon, J., Scott, S. D., and Greenberg, S. Avoiding interference: How people use spatial separation and partitioning in SDG workspaces. In Proc. of CSCW '04, ACM (2004), 252–261.
34. Turner, J., Alexander, J., Bulling, A., and Gellersen, H. Gaze+RST: Integrating gaze and multitouch for remote rotate-scale-translate tasks. In Proc. of CHI '15, ACM (2015), 4179–4188.
35. Velloso, E., Wirth, M., Weichel, C., Esteves, A., and Gellersen, H. AmbiGaze: Direct control of ambient devices by gaze. In Proc. of DIS '16, ACM (2016), 812–817.
36. Vidal, M., Bulling, A., and Gellersen, H. Pursuits: Spontaneous interaction with displays based on smooth pursuit eye movement and moving targets. In Proc. of UbiComp '13, ACM (2013), 439–448.
37. Voelker, S., Matviienko, A., Schöning, J., and Borchers, J. Combining direct and indirect touch input for interactive workspaces using gaze input. In Proc. of SUI '15, ACM (2015), 79–88.
38. Voelker, S., Weiss, M., Wacharamanotham, C., and Borchers, J. Dynamic portals: A lightweight metaphor for fast object transfer on interactive surfaces. In Proc. of ITS '11, ACM (2011), 158–161.
39. Wigdor, D., Jiang, H., Forlines, C., Borkin, M., and Shen, C. WeSpace: The design development and deployment of a walk-up and share multi-surface visual collaboration system. In Proc. of CHI '09, ACM (2009), 1237–1246.
40. Yamamoto, M., Komeda, M., Nagamatsu, T., and Watanabe, T. Development of eye-tracking tabletop interface for media art works. In Proc. of ITS '10, ACM (2010), 295–296.
41. Yamamoto, M., Komeda, M., Nagamatsu, T., and Watanabe, T. Hyakunin-Eyesshu: A tabletop hyakunin-isshu game with computer opponent by the action prediction based on gaze detection. In Proc. of NGCA '11, ACM (2011), 5:1–5:4.
42. Zhai, S., Morimoto, C., and Ihde, S. Manual and gaze input cascaded (MAGIC) pointing. In Proc. of CHI '99, ACM (1999), 246–253.