
ORIGINAL ARTICLE

Juha Kela · Panu Korpipää · Jani Mäntyjärvi · Sanna Kallio · Giuseppe Savino · Luca Jozzo · Sergio Di Marca

Accelerometer-based gesture control for a design environment

Received: 18 January 2005 / Accepted: 29 April 2005 / Published online: 23 August 2005
© Springer-Verlag London Limited 2005

Abstract Accelerometer-based gesture control is studied as a supplementary or an alternative interaction modality. Gesture commands freely trainable by the user can be used for controlling external devices with a handheld wireless sensor unit. Two user studies are presented. The first study concerns finding gestures for controlling a design environment (Smart Design Studio), TV, VCR, and lighting. The results indicate that different people usually prefer different gestures for the same task, and hence it should be possible to personalise them. The second user study concerns evaluating the usefulness of the gesture modality compared to other interaction modalities for controlling a design environment. The other modalities were speech, RFID-based physical tangible objects, laser-tracked pen, and PDA stylus. The results suggest that gestures are a natural modality for certain tasks, and can augment other modalities. Gesture commands were found to be natural, especially for commands with spatial association in design environment control.

Keywords Gesture recognition · Gesture control · Multimodal interaction · Mobile device · Accelerometer

1 Introduction

A variety of spontaneous gestures, such as finger, hand, body and head movements, are used to convey information in interactions among people. Gestures can hence be considered a natural communication channel, which has not yet been fully utilised in human–computer interaction. Most of our interaction with computers to date is carried out with traditional keyboards, mice and remote controls designed mainly for stationary interaction. Mobile devices, such as PDAs, mobile phones and wearable computers, provide new possibilities for interacting with various applications, but also introduce new problems with small displays and miniature input devices. Small wireless gesture input devices containing accelerometers for detecting movements could be integrated into clothing, wristwatches, jewellery or mobile terminals to provide a means for interacting with different kinds of devices and environments. These input devices could be used to control such things as home appliances with simple user-definable hand movements. For example, simple up and down hand movements could be used to operate a garage door, adjust the volume of your stereo equipment or your room's lights.

As a general additional or alternative interaction modality, the acceleration-based gesture command interface is quite recent, and many research problems still need to be solved. Firstly, it should be examined what kind of gestures are natural and useful for the multitude of tasks they could be used for, and for what tasks other modalities are more natural. Gestures can also be combined with other modalities. The topic is very wide in scope since there are a large number of possible gestures suitable for certain tasks, as well as many tasks that could potentially be performed using gestures. Secondly, the recognition accuracy for detecting the gesture commands should be high; nearly 100% accuracy is required for user satisfaction since too many mistakes may cause the users to abandon the method. Thirdly, in personalising the gestures, the control commands need to be trained by the user. If the training is too laborious, it may also cause the users to abandon the interaction method. Therefore, the training process should be as fast and effortless as possible. Finally, practical implementation and experience are important for empirical evaluation of the gesture modality.

J. Kela (&) · P. Korpipää · J. Mäntyjärvi · S. Kallio
VTT Electronics, 1100, 90571 Oulu, Finland
E-mail: juha.kela@vtt.fi
Tel.: +358-20-722111
Fax: +358-20-7222320

G. Savino · L. Jozzo · S. D. Marca
Italdesign-Giugiaro, Via A. Grandi 25, 10024 Moncalieri, Italy
E-mail: [email protected]
E-mail: [email protected]
E-mail: [email protected]

Pers Ubiquit Comput (2006) 10: 285–299
DOI 10.1007/s00779-005-0033-8


Previous work on gesture control can be classified into two categories, camera-based and movement sensor-based. Camera-based recognition is most suitable for stationary applications, and often requires a specific camera set up and calibration. For example, the camera-based control device called the Gesture Pendant allows its wearer to control elements in the home via palm and finger gestures [1]. The movement sensor-based approach utilises different kinds of sensors, e.g. tilt, acceleration, pressure, conductivity, capacitance, etc., to measure movement. An example of such an implementation is GestureWrist, a wristwatch-type gesture recognition device using both capacitance and acceleration sensors to detect simple hand and finger gestures [2]. Accelerometer-based gesture recognition is used in, for example, a musical performance control and conducting system [3], and a glove-based system for recognition of a subset of German sign language [4]. Tsukada and Yasumura developed a wearable interface called Ubi-finger, using acceleration, touch and bend sensors to detect a fixed set of hand gestures, and an infrared LED for pointing at a device to be controlled [5]. XWand, a gesture-based interaction device, utilises both sensor-based and camera-based technologies [6]. The creators of XWand present a control device that can detect the orientation of a device using a two-axis accelerometer, a three-axis magnetometer and a one-axis gyroscope, as well as position and pointing direction using two cameras. The system is also equipped with an external microphone for speech recognition. The user can select a known target device from the environment by pointing, and control it with speech and a fixed set of simple gestures. Moreover, gesture recognition has been studied for implicit control of functions in cell phones — for answering and terminating a call for example — without the user having to explicitly perform the control function [7, 8]. In implicit control, or in other words context-based gesture control, the main problem is that users usually perform the gesture very differently in different cases, e.g. picking up a phone can be done in very many different ways depending on the situation. Explicit control directly presupposes that gestures trained for performing certain functions are always repeated as they were trained.

In addition to the camera and movement sensor-based gesture control approaches, 2D patterns, such as characters or strokes drawn on a surface with a mouse or a pen, can be used as an input modality. This category of input has been referred to as character recognition and gesture recognition in the literature [9]. Some of the current PDAs and mobile devices are equipped with touch-sensitive displays that can be utilised for pen or finger-based 2D stroke gesture control. Using a media player application in a PDA with gestures has been reported to reduce the user workload for the task [10]. Mouse-based 2D strokes have been successfully used as an input modality in the early nineties UNIX workstation CAD program (Mentor Graphics) and today's modern www browsers (e.g. Opera or Mozilla) for page navigation. The focus of this paper is on 3D movement sensor-based freely personalisable gesture control.

As a sensing device, SoapBox (Sensing, Operating and Activating Peripheral Box) is utilised in this work. It is a sensor device developed for research activities in ubiquitous computing, context awareness, multimodal and remote user interfaces, and low-power radio protocols [11]. It is a light, matchbox-sized device with a processor, a versatile set of sensors, and wireless and wired data communications. Because of its small size, wireless communication and battery-powered operation, SoapBox is easy to install at different places, including moving objects. The basic sensor board of SoapBox includes a three-axis acceleration sensor, an illumination sensor, a magnetic sensor, an optical proximity sensor and an optional temperature sensor.

Various statistical and machine learning methods can potentially be utilised in training and recognising gestures. This study applies Hidden Markov Models (HMMs), a well-known method for speech recognition [12]. HMM is a statistical signal modelling technique that can be applied to the analysis of a time series with properties that change over time. HMM is widely used in speech and handwritten character recognition as well as in gesture recognition in video-based and glove-based systems.

The main contributions of this article are the following. Two user studies were conducted. The first study examined the suitability of gestures and types of gestures for controlling selected home appliances and an interactive virtual reality design environment, Smart Design Studio. The results indicate that different people usually use different gestures for performing the same task, and hence these gestures should be easily personalisable. The second user study evaluated the usefulness of gestures compared with other modalities in design environment control. Gesture commands were found to be natural, especially for commands with spatial association to the design space. Furthermore, the users should be able to train gestures with as few repetitions as possible, since repetitions can be a nuisance. A method for reducing the number of required repetitions while maintaining proper recognition accuracy was applied [13]. For gaining empirical experience of gesture control, a prototype multimodal design environment was implemented, and a gesture recognition system was integrated as a part of the system.

The article is organised as follows. Firstly, gesture interface basic concepts are defined and categorised to clarify the characteristics of the different methods, devices and applications for gesture-based control. To address the wide topic of gesture usability and usefulness, a pre-study on gesture control in chosen application domains is presented in Sect. 3. The applied gesture training and recognition methods are introduced briefly and user-dependent recognition accuracy is evaluated in Sect. 4. The gesture recognition system was integrated into the Smart Design Studio prototype, which is described in Sect. 5. The prototype was used for the second user study that aimed at evaluating and comparing the use of different modalities for different tasks in the design environment. Subsequently, the results of both user studies are discussed. Finally, suggestions for future work are given together with conclusions.

2 Gesture control

Gestures are an alternative or complementary modality for application control with mobile devices. A multimodal interface utilising gesture commands is presented in Sect. 5. Before that, the types of movement sensor-based user interfaces need to be categorised to clarify the differences between the various approaches. Table 1 presents a categorisation based on a few essential properties of movement sensor-based control systems.

Based on their operating principle, direct measurement and control systems are not considered to be gesture recognition systems. In this article, gestures refer to the user's hand movements, collected by a set of sensors in the device in the user's hand and modelled by mathematical methods in such a way that any performed movement can be trained and later recognised in real time, based on which a discrete device control command is executed. Gesture commands can be used to control two types of applications: device internal functions, and external devices. In device internal application control, gestures are both recognised and applied inside the device. In external device control, gestures operate separate devices outside the sensing device, and the recognition, based on the collected gesture movement data, is performed either inside or outside the sensing device. If the recognition is performed outside the sensing device, the movement data is sent to an external recognition platform, such as a PC. The recognition results are mapped to commands for external device(s) and the commands are transmitted to the control target over a wireless or wired communication channel.

With regard to sensor-based hand movement interfaces, this paper focuses on gesture command interfaces (the second category in Table 1). Moreover, this article focuses on the gesture control of external devices. There are a multitude of existing simple measure and control applications that belong to category one in Table 1. The Smart Design Studio prototype presented in Sect. 5 utilises this category of control in addition to category two. Other modalities in the prototype, such as speech or a PDA stylus, can be used as an alternative or complementary interface to gestures, making the system multimodal.

3 User study on gesture types

The range of potential applications for gesture command control is extensive. This study focuses on certain chosen application areas. A pre-study was conducted in order to examine the potential suitability of gestures for the chosen applications. The goal of the user study was to gather information about feasible gestures for the control of home equipment and computer-aided design software. Another goal was to investigate whether there is a universal gesture set, a vocabulary that can be identified for certain applications. The test participants were requested to imagine, create and sketch the gesture(s) they considered most appropriate for the control functions given in the questionnaire form. They were given two weeks to contemplate and try different potential gestures in their homes before returning the questionnaire form. Use of the same gestures for similar functions in different devices, such as VCR on, TV on, etc., was also allowed. Discrete gesture movements could be performed in all three dimensions (up-down, left-right, forward-backward), and it was supported in the questionnaire notation. Although simple tilting and rotation-based interaction could be useful in some application areas, it was not within the focus of this questionnaire.

Table 1 Categorisation and properties of movement sensor-based user interfaces

Sensor-based user interface type (movement) | Operating principle | Analogy with speech interface | Personalisation | Complexity and computing load
1. Measure and control | Direct measurement of tilting, rotation or amplitude | e.g. control based on volume level | — | Very low
2. Gesture command | Gesture command recognition | Speech command recognition | Machine learning, freely personalisable | High
3. Continuous gesture command | Continuous gesture recognition | Continuous speech recognition | Machine learning, freely personalisable | Very high

The responses for the two questionnaires were acquired from European ITEA Ambience project partners and their families. The first was carried out in Finnish and the main target group was IT professionals and their families. The task was to describe gestures for controlling the basic functionality of three different home appliances — TV, VCR and lighting. The functions included such control actions as changing the channel, increasing/decreasing the volume, and switching the device on and off. The second questionnaire was carried out among Ambience project partners and included the same VCR control tasks and five 3D design software control tasks. The total number of tasks was 21 in the first and 26 in the second questionnaire. The questionnaire was initially delivered to 50 selected participants, who were also encouraged to distribute and recommend the test to other friends and colleagues. Both the questionnaires had the same open questions at the end of the form to ask for additional information, such as:

– Did the respondents find it natural to use gestures to control the given devices?
– What other devices could be controlled with gestures?
– What remote controllable devices did they own and use?
– Give free comments and ideas about the design and applicability of the gesture control device.

3.1 Questionnaire results and discussion

The total number of responses was 37, of which 73% were males and 27% females, with varying cultural backgrounds. About 78% had a technical education. The average age was around 32, while the age distribution ranged from 21 to 54. The nationalities of the respondents were 57% Finnish, 38% Italian, 3% French and 3% Dutch. Typical frequently used remotely controllable home appliances were TV (97%), stereo/Hi-fi (80%), and VCR (71%).

The VCR control gestures were given by 37 participants, TV and lighting control by 23, and 3D design software control by 14. Table 2 summarises the three most popular controlling gestures proposed for the different VCR control tasks. Each gesture command is represented by the gesture trace, the spatial plane, and the percentage of responses suggesting that particular gesture for the given control function.

The results indicate that the respondents tended to use spatial two-dimensional gestures in the x–y plane.

Table 2 Most popular gestures for VCR control


Utilising all three dimensions in one gesture was rare. The third dimension, the "depth" axis (pointing away from the user), was used for the "On-Off" gesture pair, which consists of "push forward – pull backward" movements, and for Record and Pause (Table 2). Another finding was that some of the proposed gestures follow the control logic, so that opposite control actions are represented by opposite direction gestures:

on–off → push–pull
next–previous → right–left
increase–decrease → up–down

These gestures seem intuitive for the given tasks. However, there were control tasks that did not produce such straightforward gestures, such as VCR record, VCR pause and TV mute. This was obvious with operations that have no easily identifiable mental model or universal symbol, e.g. VCR record (Fig. 1). Comments were given that specific critical functions, such as VCR record, should be protected from accidental activation by defining more complex control gestures.

Another observation was that the same gestures were proposed for different devices:

– VCR on, TV on, lights on, zoom in
– VCR off, TV off, lights off, zoom out
– VCR play, TV volume up, lights brighten
– VCR stop, TV volume down, lights dim
– VCR next channel, TV next channel
– VCR previous channel, TV previous channel

This result suggests that the users would like to control different devices using the same basic gestures. Hence, there is a need for a method for selecting the controllable device from the environment prior to making the gesture. Research studies can be found using, e.g., a laser/infrared pointer to pick out the desired control target [14].

Seventy-six percent of the respondents found it natural to use gestures for controlling the given devices, while 8% did not find it natural, and 16% left the question unanswered. Respondents commented that they found the gestures natural for some commands but the gestures should only be used for simple basic tasks. Moreover, according to the comments, the number of gestures should be relatively small and there should be a possibility of training and rehearsing difficult gestures.

When asked what devices people would like to control by gestures, the most popular device category was personal devices and home entertainment devices, such as PDAs, mobile terminals, TVs, DVD players and Hi-fi equipment. Another popular category was locking and security applications, including garage doors, alarm systems, and car and home locks. People also proposed some typical PC environment tasks, such as presentation control, Internet browser navigation and tool selection of paint software. There were also a couple of special application areas, such as vacuum cleaners, car gearboxes and microwave ovens. To generalise, the most popular application target for gesture control, according to the responses, seems to be simple functions of current remotely controllable devices.

4 Gesture training and recognition

According to the study, users prefer intuitive user-definable gestures for gesture-based interaction. This is a challenge for the gesture recognition and training system, since both real-time training and recognition are required. To make the usage of the system comfortable, a low number of repetitions of a gesture is required during the training. On the other hand, a good generalisation performance in recognition must be achieved. Other requirements include: a recogniser must maintain models of several gestures, and when a gesture is performed, training or recognition operations must be short. This section presents the methods used in real-time gesture recognition in the design environment prototype.

Fig. 1 Sample of gestures proposed for the VCR record task

Fig. 2 Block diagram of a gesture recognition/training system

In accelerometer-based gesture interaction, sensors produce signal patterns typical for gestures. These signal patterns are used in generating models that allow the recognition of distinct gestures. We have used discrete HMMs in recognising gestures. HMMs are stochastic state machines that can be applied to modelling time series with spatial and temporal variability. HMMs have been utilised in experiments for gesture and speech recognition [8, 15]. Acceleration sensor-based gesture recognition using HMMs has been studied in [4, 8, 16–18].

The recognition system works in two phases: training and recognition. A block diagram is presented in Fig. 2. Common steps for these phases are signal sampling from three accelerometers to 3D sample signals, preprocessing, and vector quantisation of signals. Repeating the same gesture produces a variety of measured signals because the tempo and the scale of the gesture can change. In preprocessing, data from the gestures is first normalised to equal length and amplitude. The data is then submitted to a vector quantiser. The purpose of the vector quantiser is to reduce the dimensionality of the preprocessed data to 1D sequences of discrete symbols that are used as inputs for the HMMs in training and in recognition.

4.1 Preprocessing

The preprocessing stage consists of interpolation or extrapolation and scaling. The gesture data is first linearly interpolated or extrapolated if the data sequence is too short or too long, respectively. The amplitude of the data is scaled using linear min-max scaling. Preprocessing normalises the variation in gesture speed (tempo) and scale, thus improving the recognition of the gesture form and direction.
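The article gives no source code for this step; the following is a minimal sketch, in Python with NumPy, of how the length and amplitude normalisation described above could be implemented. The target length of 40 samples and the [-1, 1] scaling range are assumptions, since the paper does not state the values used.

```python
import numpy as np

TARGET_LEN = 40  # hypothetical normalised gesture length (not given in the article)

def preprocess(gesture: np.ndarray, target_len: int = TARGET_LEN) -> np.ndarray:
    """Normalise a (n_samples, 3) acceleration gesture to equal length and amplitude."""
    n = gesture.shape[0]
    old_t = np.linspace(0.0, 1.0, n)
    new_t = np.linspace(0.0, 1.0, target_len)
    # Linear interpolation of each acceleration axis to the common length.
    resampled = np.column_stack(
        [np.interp(new_t, old_t, gesture[:, axis]) for axis in range(3)]
    )
    # Linear min-max scaling of the amplitude to [-1, 1].
    lo, hi = resampled.min(), resampled.max()
    return 2.0 * (resampled - lo) / (hi - lo + 1e-12) - 1.0
```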

4.2 Vector quantisation

The vector quantisation is used to convert preprocessed 3D data into 1D prototype vectors. The collection of the prototype vectors is called a codebook. In our experiments, the size of the codebook was empirically selected to be eight. Vector quantisation is done using the k-means algorithm [19]. The codebook is built using the entire data set available for the experiments. Figure 3 illustrates a 3D acceleration vector (upper diagram) and its corresponding 1D symbol vector (lower diagram) for a '>'-shaped gesture.
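As an illustration of this step, the sketch below builds an eight-entry codebook with a plain k-means loop and maps each preprocessed 3D sample to the index of its nearest prototype vector. Only the codebook size of eight comes from the article; the iteration count, the random initialisation and the function names are assumptions, and a library routine such as scipy.cluster.vq.kmeans2 could be used instead of the hand-written loop.

```python
import numpy as np

def build_codebook(samples: np.ndarray, k: int = 8, iters: int = 50,
                   seed: int = 0) -> np.ndarray:
    """Build a codebook of k prototype vectors from (N, 3) acceleration samples."""
    rng = np.random.default_rng(seed)
    centroids = samples[rng.choice(len(samples), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign every 3D sample to its nearest prototype vector.
        dists = np.linalg.norm(samples[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each prototype to the mean of the samples assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = samples[labels == j].mean(axis=0)
    return centroids

def quantise(gesture: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Convert a preprocessed (n, 3) gesture into a 1D sequence of symbols 0..k-1."""
    dists = np.linalg.norm(gesture[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)
```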

4.3 Training and recognition with HMM

The Hidden Markov Model [12] is a stochastic state machine. HMM classification and training are solved using the Viterbi and Baum–Welch algorithms. The global structure of the HMM recognition system is composed of a parallel connection of the trained HMMs [20]. Hence adding a new HMM or deleting an existing one is feasible. In this paper an ergodic, i.e. fully connected, discrete HMM topology was utilised. In the case of gesture recognition from acceleration signals, both ergodic and left-to-right models have been reported as giving similar results [4]. The alphabet size — the codebook size — used here is eight and the number of states in each model is five. Good results have been obtained in earlier studies by using an ergodic model with five states for a set of gestures [4]. It has been reported that the number of states does not have a significant effect on the gesture recognition results [21].
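The article names the algorithms but not their implementation, so the sketch below only illustrates the recognition side of such a system: one discrete HMM per trained gesture, scored with a scaled forward algorithm, and the gesture whose model gives the highest likelihood wins. Baum–Welch training is omitted; the five-state, eight-symbol dimensions are the only values taken from the article, and the class and function names are invented.

```python
import numpy as np

class DiscreteHMM:
    """Minimal discrete HMM used here only for scoring symbol sequences."""
    def __init__(self, pi: np.ndarray, A: np.ndarray, B: np.ndarray):
        self.pi = pi  # initial state probabilities, shape (5,)
        self.A = A    # state transition matrix, shape (5, 5), ergodic (all entries > 0)
        self.B = B    # symbol emission matrix, shape (5, 8) for a codebook of eight symbols

    def log_likelihood(self, symbols: np.ndarray) -> float:
        """Scaled forward algorithm: log P(symbol sequence | model)."""
        alpha = self.pi * self.B[:, symbols[0]]
        log_prob = np.log(alpha.sum())
        alpha /= alpha.sum()
        for s in symbols[1:]:
            alpha = (alpha @ self.A) * self.B[:, s]
            log_prob += np.log(alpha.sum())
            alpha /= alpha.sum()
        return log_prob

def recognise(symbols: np.ndarray, models: dict) -> str:
    """Return the name of the gesture model that best explains the symbol sequence."""
    return max(models, key=lambda name: models[name].log_likelihood(symbols))
```

Because each gesture has its own model, adding or deleting a gesture only adds or removes one entry in the models dictionary, which mirrors the parallel connection of trained HMMs described above.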

4.4 Gesture recognition experiments and results

Systematic tests are required in order to evaluate the accuracy of the gesture training and recognition system. In this section, the recognition accuracy of the system is evaluated by established pattern recognition methods. First, the collected data is described, the methods and experiments are explained, and finally the results are discussed. These tests were performed offline before the multimodal design environment user study. The user study was performed using the implemented real-time recognition system integrated with the design environment prototype (Sect. 5).

Table 3 Gestures used in the recognition accuracy experiments

Fig. 3 Three-dimensional acceleration vector and its symbolic representation for a '>'-shaped gesture

According to the questionnaire's results, eight popular gestures, presented in Table 3, were selected as a test data set for the HMM-based gesture recognition accuracy test. Data collection was performed using a wireless sensor device (SoapBox) equipped with accelerometers. For each gesture, 30 distinct 3D acceleration vectors were collected from one person in two separate sessions over 2 days. The gestures were collected sequentially by repeating the first gesture 15 times, taking a short break and then proceeding to the collection of the next gesture. The total test data set consisted of 240 repetitions. The length of the 3D acceleration vectors varied depending on the duration of the gesture.

The experiments were divided into two parts. First, the recognition rate for each gesture was calculated by using two repetitions for training and the remaining 28 for recognition. The recognition percentage for each gesture was the result of cross-validation, so that, in the case of the two training repetitions, there were 15 training sets and the rest of the data (28 repetitions) was used as the test set 15 times. The procedure was repeated for each gesture and the result was averaged over all the eight gestures. This procedure was then repeated with four, six, eight, ten, and 12 repetitions to find the number of training repetitions required for reaching a proper accuracy.

In the second test, the aim was to reduce the number of required training repetitions in order to make the training process less laborious for the user. Our goal was to find a solution which would keep the number of required training repetitions as low as possible and still provide a recognition rate of over 95%, which should be satisfactory for most applications. The method investigated was to copy the original gesture data and add noise to the copy [13]. The noise-distorted copy was then used as one actual training repetition. The goal of the experiment was to see if the number of repetitions could be reduced and to find the right level of noise to add to the data for optimal results. Two actual gestures were used for training in this test, plus two copy gestures — a total of four. The accuracies in this test are also the result of cross-validation — for each noise parameter value the models were each trained and tested 15 times, the test set consisting of 28 repetitions for each gesture.
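For illustration, a minimal sketch of creating such a noise-distorted copy is given below. The article states only that the noise was uniformly distributed with "SNR 5"; interpreting the SNR as the linear ratio of signal power to noise power is an assumption of this sketch.

```python
import numpy as np

def noisy_copy(gesture: np.ndarray, snr: float = 5.0, rng=None) -> np.ndarray:
    """Copy a gesture and add zero-mean uniform noise at the given signal-to-noise ratio."""
    rng = rng or np.random.default_rng()
    signal_power = np.mean(gesture ** 2)
    noise_power = signal_power / snr          # SNR taken as a linear power ratio (assumption)
    a = np.sqrt(3.0 * noise_power)            # uniform noise on [-a, a] has variance a^2 / 3
    return gesture + rng.uniform(-a, a, size=gesture.shape)

# Hypothetical usage: two recorded repetitions plus one noisy copy of each
# give the four training vectors mentioned above.
# training_set = [rep1, rep2, noisy_copy(rep1), noisy_copy(rep2)]
```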

The result of the first test, the effect of the number of training vectors on the recognition rate, is shown in Table 4. A recognition rate of over 90% was achieved with four training vectors.

Table 4 Recognition rate versus number of training vectors

Training vectors | 1 | 2 | 4 | 6 | 8 | 10 | 12
Recognition rate | 81.2% | 87.2% | 93.0% | 95.3% | 96.1% | 96.6% | 98.9%

The result of the second test was that with two original and two noise-distorted (uniformly distributed noise, SNR 5) training vectors the recognition accuracy is 96.4%. Compared with the situation in which HMMs were trained using only two original training vectors, the gain achieved by adding noised vectors is over nine percentage points. In fact, the result seems to be near the recognition accuracy achieved using ten original training vectors. Clearly, a discrete HMM is not able to make good generalisations from only two training vectors. Adding the noised vectors to the original training set increases the density of the significant features of the gesture in the overall training set. The variation of the gesture becomes better captured and the new training set is a more representative sample of the vectors describing the gesture. Results show that two original vectors with two noise-distorted vectors can capture the variation of the gesture with the same accuracy as ten original vectors [22]. According to preliminary, as yet unpublished empirical results, walking while making a gesture does not cause a significant decrease in the recognition accuracy, though training in this study was performed while stationary.

5 Smart Design Studio

The Smart Design Studio prototype was developed in co-operation with ITEA Ambience project partners. The prototype was realised in the Italdesign-Giugiaro Virtual Reality Centre, which is an interactive work environment supporting different design and engineering activities. The Virtual Reality Centre is equipped with a large wall-sized projection display where engineers and industrial designers present their design sketches and prototypes to their customers. A typical usage scenario for the Virtual Reality Centre presentation starts with preparatory tasks, such as loading virtual models, presentation slides and meeting agenda, and, during the actual presentation, the designer or marketing representative constantly changes between different model views and switches between different programs in the presentation. This requires that either a separate presentation operator changes the programs, models and views according to the presenter's requests or the presenter himself has to move back and forth from the projection wall to the back of the room to operate the computer.

Virtual Reality Centre users had found the existing mouse and keyboard-based interaction impractical and slow to use, so they wanted to develop a new easy-to-use environment — the Smart Design Studio — with different optional interaction modalities. In addition, the users wanted the separate design and presentation software user interfaces to be integrated into one common user interface application supporting effortless changing of modalities. The main objective was to simplify and accelerate the engineering and design development phases by introducing more devices and different interaction modalities. The users were free to select the most suitable control modality for the given task according to their personal preference. Due to the limitations in control capabilities, some modalities supported only a limited set of control tasks. The modality and task selection was done in co-operation with Virtual Reality Centre users. For example, RFID-based control was practical in room access control and image presentation tasks but less practical for CAD model editing. The new modalities besides the existing mouse and keyboard were the following:

– Speech recognition and speech output for voice command-based navigation
– Gesture recognition for controlling the Smart Design Studio with hand movements
– RFID tag card-based physical tangible navigation of multimedia material
– Laser-tracked pen for direct manipulation of objects on the projection wall
– Tablet PC and PDA touch-sensitive display for remote controlling

This paper concerns the experiments on the gesture modality compared with the others. An overview of the Smart Design Studio architecture and modalities is discussed first, followed by a description of the gesture recognition system integrated into the design environment. Finally, the end-user evaluation is presented and the results of the comparison of the modalities are discussed.

5.1 System architecture and interaction modalities

The Smart Design Studio was realised in the existing Virtual Reality Centre, where the interaction was earlier performed using only a mouse and a keyboard for the input and a projection wall for the output. The basic infrastructure supporting seamless multimodal operation of the different design and presentation software was developed by Italdesign-Giugiaro. Figure 4 presents the conceptual overview of the Smart Design Studio Demonstrator.

Fig. 4 Conceptual overview of Smart Design Studio prototype

The Smart Design Studio is implemented using two dedicated servers, Windows and UNIX, connected together with a TCP/IP network. The Windows server is used to run a web server and typical office and presentation tools. Moreover, it has hardware and software interfaces for speech and gesture recognition, RFID tag-based physical tangible objects (PTO) and wireless network connections (WLAN). The UNIX server functions as a platform for the Virtual Room Service Manager (VRSM) server, CAD design and presentation software, the pen tracker interface, and the projection wall controller. The VRSM is used to manage and synchronise interaction events coming from different modalities according to the contextual information about currently active software and control devices. In detail, interaction with the system is permitted through:

– Speech input and output. Through this modality, the users can manage complex applications by simply "talking" with the system using a wireless microphone. Speech recognition can range from basic commands, such as 'start presentation' and 'select from a menu', to a much more complex interaction with the system, such as changing the viewpoint of a model. However, the commands related to 3D model editing were not supported. System responses were generated using a text-to-speech engine to provide feedback and help in conflicting situations. The speech recognition solution was developed by Knowledge.

– Gesture input allows the association of user-definable gestures with specific corresponding commands (such as move, rotate, zoom in, zoom out), enabling the users to interact with the applications in a simple and natural way. Gesture recognition could be used for selecting items from a menu, changing the point of view or moving and rotating a model in the virtual space. The gesture control set was limited to simple basic commands and more complex CAD design was not supported. The gesture recognition system was developed by VTT Electronics and is described in more detail in the following section.


– Physical tangible object (PTO). RFID-tag technology enables the association of digital information with physical objects. The system allows users of the Smart Design Studio to start applications and select designs by simply placing a physical object on a table. Italdesign-Giugiaro and Philips have implemented a solution to present design sketches and documents to their customers by attaching related data objects to RFID cards. Using a tagged card it is possible to launch different applications and visualize different data, such as images and office documents. The RFID receiver has two antennae: one for manipulating data content such as pictures and design models, and the other for application selection, such as text editors or image viewers. The system has been implemented using flat 2D RFID cards with the name of the application or a photo of the respective car model. The latter could just as easily be implemented using a small 3D model of the car. Tangible object recognition offers the advantage of instant access to certain functions, eliminating the need to search through different application menus. It is also very flexible and not restricted to the number of buttons that fit on a remote control. The main advantage, however, is that it enables users to interact with the system using one of their most basic and intuitive skills: handling physical objects. The same technology could be used in access control and presentation environment personalization; so, for example, when the designer enters the room the lighting level is adjusted and the appropriate presentation data is loaded.

– PDA and tablet PC. These devices can wirelessly control the Smart Design Studio via a dedicated and re-configurable browser interface. The navigation is performed using either a touch-sensitive display or a wireless mouse and keyboard. The same interface can also be used with a laptop or workstation PC equipped with a web browser. The control set was limited and it supported only 3D navigation and presentation task related commands.

– IntelliPen is used as a physical pointer device over a large projection screen. It acts as a mouse equivalent allowing direct interaction with CAD applications on a 1:1 scale, thus giving new opportunities for the stylist and engineers to interact with the system. The IntelliPen can control all functions that are controllable by mouse, making it the most versatile control device in the Smart Design Studio. High-precision tracking is based on two laser scanners, giving accuracy high enough to let the designers work on the screen in the same way that they would do with normal input devices. The IntelliPen was developed by Barco for the prototype.

– Projection wall provides visual output for the Smart Design Studio. It has a resolution of 3,200 × 1,120 pixels and the size of the screen is 6.2 × 2.2 m².

In the initial system set up, different input commands are bound to certain functions of the controllable applications by using a specific communication protocol. The protocol defines the applications and the specific commands that are available. There is also a set of global commands, which are available to the user in any context (for example, 'VRSM_help' will activate the context-related help). Speech, gesture and physical tangible objects require the user to explicitly define and personalise the commands and desired control outputs, while the PDA and Tablet PC have a special browser interface. IntelliPen is the only device that does not need any configuration, except for initial calibration, since it works as a mouse equivalent. Because all the information to be presented is handled with the UNIX server, the VRSM is constantly aware of what application is used and currently visible, thus making it possible to ignore commands that are not reasonable for that context — for example, the gesture command Rotate clockwise is ignored in the slide presentation mode. However, this kind of inconsistency is uncommon in a normal presentation situation where the show is controlled by one person familiar with the presentation agenda. The system enables true multimodal interaction by providing the possibility of making the same control commands in many different ways. For instance, the move down command can be provided by gesture, voice, IntelliPen or PDA, and the load command can be provided by voice or physical tangible object.
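The article describes the VRSM behaviour only at this level, so the sketch below is a hypothetical illustration of such context-based command filtering; apart from 'VRSM_help', 'Rotate clockwise', 'Model zoom in' and 'Move down', which are mentioned in the text, the command strings and data structures are invented.

```python
# Hypothetical sketch of VRSM-style context filtering: a command coming from any
# modality is forwarded only if it is global or reasonable for the visible application.
GLOBAL_COMMANDS = {"VRSM_help"}                 # 'VRSM_help' is named in the article

ACCEPTED = {                                    # hypothetical per-application command sets
    "slideshow": {"Next slide", "Previous slide"},
    "3d_viewer": {"Rotate clockwise", "Model zoom in", "Move down"},
}

def dispatch(command: str, active_application: str) -> bool:
    """Return True if the command is forwarded to the active application."""
    if command in GLOBAL_COMMANDS:
        return True
    return command in ACCEPTED.get(active_application, set())

# Example: 'Rotate clockwise' is ignored while the slide presentation is visible.
assert dispatch("Rotate clockwise", "slideshow") is False
assert dispatch("Rotate clockwise", "3d_viewer") is True
```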

5.2 Gesture recognition system as part of the Smart Design Studio

VTT Electronics provided the personalisable gesture modality to be used for control and navigation within different Smart Design Studio applications. In the training phase, users were free to associate different activation and navigation functions with their personal gestures. This enabled the users to present a slideshow or select different viewpoints of CAD software. Gesture input was collected using SoapBox, which uses acceleration sensors to measure both dynamic acceleration (e.g. motion of the box) and static acceleration (e.g. tilt of the box). The acceleration was measured in three dimensions and sampled at a rate of 46 Hz. The measured signal values were wirelessly transmitted from the handheld SoapBox to a receiver SoapBox that was connected to the Windows server with a serial connection. The gesture start and end were marked by pushing the button on the SoapBox at the start of the gesture and releasing it at the end, which then activated either the training or recognition algorithm. This may produce some extra artefacts in the actual gesture data, such as short still parts at the start or end of the actual gesture, which could be filtered out to improve the recognition. The ideal solution would be device operation without the use of buttons, which could be technically compared to continuous speech recognition with always open microphones. In both cases the input data is continuously monitored, and separating the actual data from the background noise (i.e. additional user movement in the case of gesture recognition) is difficult and computationally heavy. Moreover, continuous recognition might produce unintentional recognitions, which may disturb the user. All signal processing and pattern recognition (HMM) software ran on the Windows server. Recognition results could be mapped to different control commands, which were transmitted to the VRSM using TCP/IP socket communication. The mapping between gestures and output functions was done in the training phase by naming the gestures using specific command names, e.g. Model zoom in, for each gesture.
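The article specifies the TCP/IP socket transport and the command naming convention (e.g. Model zoom in) but not the wire format, so the following is a hypothetical sketch; the host name, port, gesture names other than the zoom example, and the newline framing are invented.

```python
import socket

# Hypothetical gesture-to-command mapping defined at training time; only
# 'Model zoom in' is named in the article, the other entries are invented examples.
GESTURE_COMMANDS = {
    "push_forward": "Model zoom in",
    "move_left": "Model move left",
}

def send_command(gesture_name: str, host: str = "vrsm.local", port: int = 5000) -> None:
    """Send the control command for a recognised gesture to the VRSM server."""
    command = GESTURE_COMMANDS.get(gesture_name)
    if command is None:
        return                       # unmapped gesture: do nothing
    with socket.create_connection((host, port)) as sock:
        sock.sendall((command + "\n").encode("utf-8"))
```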

Moreover, the set up also supported two kinds of continuous control (the measure and control category in Table 1) by utilising the tilting angle of the control device and rotation (bearing) detected by an electronic compass. Three different operation modes — discrete gesture commands, tilting, and rotation — could be selected by two buttons on top of the SoapBox. These modes could be utilised for zooming or rotating virtual models. This kind of continuous control was not originally supported in the communication protocol of the VRSM, although during the preliminary user tests it was found to be a very natural way of manipulating certain views of the software and was included in the protocol. Because the electronic compass is very sensitive to metallic and magnetic objects nearby, a separate calibration program was provided to filter out these errors in the signal.
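How the tilt angle was mapped to a control value is not described in the article; purely as an illustration, the sketch below estimates a pitch angle from the static acceleration vector and converts it into a signed zoom rate. The dead zone and gain values are invented.

```python
import math

def pitch_from_acceleration(ax: float, ay: float, az: float) -> float:
    """Pitch angle (radians) of the device estimated from static acceleration."""
    return math.atan2(ax, math.sqrt(ay * ay + az * az))

def tilt_to_zoom_rate(ax: float, ay: float, az: float,
                      dead_zone_deg: float = 5.0, gain: float = 1.0) -> float:
    """Map the tilt angle to a continuous zoom rate; thresholds are hypothetical."""
    pitch_deg = math.degrees(pitch_from_acceleration(ax, ay, az))
    if abs(pitch_deg) < dead_zone_deg:
        return 0.0                    # small tilts are ignored
    return gain * pitch_deg / 90.0    # roughly -1..1 over a 90 degree tilt
```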

5.3 Comparison of modalities

The usefulness of the Smart Design Studio prototype was evaluated with user tests performed in the real environment. The test session started with a brief demonstration and an introduction to the new control modalities of the Smart Design Studio. The usage scenario consisted of a typical presentation session simulation. Each user performed the tasks of a typical design session, closely following the typical work methodologies of the designer in the classic design session and in the meetings. During the test sessions, users were encouraged to give free comments on the operation and functionality of the control devices. After the test, the users were interviewed by the test observers. There was no specific format for the interviews since the questions were asked informally, depending on the user's professional background and their behaviour during the test. However, every user was asked to answer the following open questions:

– What was the most feasible control modality for a certain Smart Design Studio application function?
– Overall impression of the control modalities (like/dislike)
– Did the new Smart Design Studio environment, with new modalities, improve the interaction compared to the previous VRC environment?

In addition, more specific questions were asked to get detailed feedback on the gesture recognition system. The following items were queried from the users:

– Should gesture recognition be user-specific or user-independent?

– Is the personalisation phase too complicated (gesture training)?

– Usefulness of the different interaction modes (gesture, tilting, rotation)

– Ergonomics of the device

In order to cover all the Italdesign-Giugiaro Virtual Reality Centre user types, the following user groups were selected:

– Young designers and computer-aided styling (CAS) operators from the Styling Department
– Senior designers from the Styling Department
– Project managers, engineers and digital mock-up (DMU) operators

Fig. 5 a) SoapBox gesture device and b) gesture-based interaction with the Smart Design Studio


The users' levels of knowledge of the computer-aided tools varied a great deal; some of them use the computer as their main working instrument while others consider it just as a support for their job. The average age of the test subjects was 37 and their educational backgrounds were in economics, telecommunications and electrical engineering or industrial design. The total number of test subjects was 15 and all were male. Figure 5 shows the SoapBox gesture control device and an industrial designer interacting with CAD presentation software by using gestures.

5.4 Test results

User interviews and comments revealed successful aspects of the Smart Design Studio prototype, and targets for further development. Overall, on the positive side, the following aspects were identified: the new opportunity to interact with the system by using a combination of different modalities for different tasks significantly improves the productivity of the work and the usability of the virtual reality centre environment (i.e. the Smart Design Studio). The new modality options were found natural and intuitive to use. Also the operation speed of the system and the possibility of interacting directly with virtual models on the large screen were appreciated. Switching between modalities was easy and the system configuration phase did not require too much effort.

When comparing modalities between different user groups, the most popular control modality in each group was the IntelliPen, as presented in Table 5. Because the IntelliPen acts like a mouse pointer, it provides a natural and instant way of interacting with the system with little or no training. One reason for the popularity of the IntelliPen was that all the functions could be controlled with it. The other modalities — e.g. physical tangible objects or gesture control — could only be used for controlling a small subset of functions in the system. However, when using the IntelliPen in program or menu navigation some functions may not be so feasible — for example, the menu bar may open at the other end of the projection wall, requiring extra walking to make a selection. In these cases, the users preferred to use additional modalities that provided direct shortcuts to certain menu functions, such as move model left, rotate model or activate help. Table 5 also shows that different user groups performed different tasks in the workspace, which affected the preferred control modalities; the designers and computer-aided styling engineers preferred tools that enabled design tasks, while the engineers and DMU operators liked to use more presentation-oriented modalities.

Table 5 Percentage of user types preferring certain control modalities

Users | Speech interaction | Physical tangible objects | Gesture interaction | PDA | IntelliPen
Senior Designer | 12 | 15 | 8 | 10 | 55
Young Designer | 11 | 15 | 12 | 14 | 48
CAS operators | 11.5 | 14.5 | 16 | 7 | 51
Engineers / DMU operators | 7 | 21 | 23 | 14 | 35

Table 6 Percentage of users preferring certain control modalities for controlling certain applications

Applications | Speech interaction | Physical tangible objects | Gesture interaction | PDA | IntelliPen
Opticore | 17 | 13 | 43 | 22 | 5
Catia V5 | 15 | 23 | 15 | 0 | 47
IcemSurf | 14 | 20 | 13 | 0 | 53
Sketch presentation | 30 | 35 | 0 | 35 | 0
Slideshow | 40 | 40 | 0 | 20 | 0

Table 7 Gestures used in the Smart Design Studio experiments


5.4.1 IntelliPen

The IntelliPen was found especially useful in the tasks requiring direct manipulation of virtual models. This is clearly seen in Table 6, where the modalities are evaluated in terms of the application used. Catia V5 and IcemSurf, which are applications used in 3D model design, styling and rendering, require active use of a pointer device. In addition, the users appreciated the possibility of using the IntelliPen in direct interaction with the design model on a 1:1 scale, which is often impossible when working with a mouse-controlled workstation. Despite the fact that the IntelliPen provided the possibility to access every function of the system, it was not preferred in the presentation tasks since other modalities provided direct shortcuts to certain actions and thus more natural interaction.

5.4.2 Gesture interaction

The test results show that the users preferred gesture interaction when using Opticore, which is special software for 3D model visualisation. The gestures were used in 3D navigation in "fly mode" — i.e. moving the model in different directions by making simple spatial up, down, left and right gestures in the x–y plane (Table 7). A similar "fly mode" can also be found in the Catia V5 and IcemSurf applications and it was commonly used during the tests. During the system integration phase, the functionality of the gesture recogniser was tested with the above-mentioned navigation gestures plus push and pull gestures in the z-axis for zooming (Table 7). It was found that the operation of gesture command-based zooming was too stepwise and continuous control would have been better. The continuous tilting and rotation features were implemented and integrated into the Smart Design Studio before the final user evaluation. Positive feedback proved that the modification was justified. Especially when using the rotation feature, some users reported that they had the sensation of "having the model in their hands". The gestures, tilting and rotation were easy to use and the users appreciated the possibility of interacting with the model without the spatial constraints of a mouse and keyboard.

Gesture recognition accuracy was good in the tests and the system was able to recognise both small and large scale gestures performed with different speeds. However, there were some problems with the sensitivity of the rotation. The electronic compass inside the sensor box is sensitive to magnetic fields in the environment, which may change when moving around the Smart Design Studio. During the interviews, there were comments that it would be nice if the system included a user-independent pre-trained library of typical navigation gestures enabling instant usage, and a separate training program for personalisation of the library. However, the gesture models trained for one person were not accurate when used by a person other than the trainer. Most of the recognition errors resulted from varying button usage and sensor box tilting differences between the user who trained the gestures and the test user. Because of the small size of the sensor box, different users easily tilted the box in their hand, especially during gesture movement. This is not a problem if the sensor box is tilted in the same way it was tilted in the training phase, but if there are differences in tilting between training and recognition, the acceleration signal axes are shifted and the recognition rate decreases. The tests showed that the system is not so sensitive to moderate speed and scale differences in gesture movement. However, the initial results suggest that the gesture recognition system is able to recognise gestures user-independently, if the gesture set is trained by a group of users (five or more) and the tilting angle variation is filtered out. We will report the results for user-independent recognition in a later publication.

Moreover, user comments revealed specific gesture interface issues that need to be addressed in the future:

– How to undo operations
– How to correct recognition errors
– How to identify which state the controllable device is in
– When controlling different devices with the same gestures, how to select the device under control

Furthermore, it was found that some sort of feedback is essential, especially in the error cases when nothing happens. This is confirmed by the literature [10]. Overall, the user comments stated that the operation speed of the gesture recognition system was fast and it provided a natural way of interacting with 3D visualisation software. However, concerning ergonomics, for some users the buttons on the sensor box were too close to each other and thus a little cumbersome to use because of the small size of the sensor box. Users proposed that the casing of the sensor box should be redesigned to make it more comfortable and there should be a selection switch for discrete or continuous gesture control modes.

11.3 PDA, physical tangible objects (PTO) and speech interaction

The PDA was found most useful in visualisation (Opticore) and presentation control tasks. Since the http-based control required implementation of an application-specific interface for each application, complicated design and modelling tasks were left out due to the limited input and output capabilities of the PDA. The tagged cards (PTO) were widely used in the presentation tasks, especially in the slideshow and sketch application, which is a thumbnail-based tool for image viewing. Furthermore, positive feedback was given to the option to launch the different applications and load the design models simply by throwing an appropriate data object (card) onto the (antenna) table. Overall, the physical aspect of controlling the data and applications was found natural and comfortable.
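The http-based control mentioned above is only characterised at the protocol level in this study; the sketch below shows the kind of request a PDA front end could issue to an application-specific controller. The host name, URL layout, application names and parameter names are assumptions for illustration only.

import urllib.parse
import urllib.request

def send_control_command(application, command, host="http://studio-server:8080", **params):
    """Send one discrete control command to a per-application
    controller interface over http (hypothetical endpoint layout)."""
    query = urllib.parse.urlencode({"cmd": command, **params})
    url = "{}/{}?{}".format(host, application, query)
    with urllib.request.urlopen(url, timeout=2.0) as response:
        return response.status == 200

# Example calls (application names and commands are illustrative):
# send_control_command("slideshow", "next_slide")
# send_control_command("opticore", "load_model", model_no=3)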

Another modality valued in the slideshow presentation was speech control. Typical speech commands were load model <model_no>, go to next slide, go to previous slide or go to slide <slide_no>. Similar commands were also used for image navigation inside the sketch presentation application. Both slides and images were easily navigable by using speech or tagged cards. Speech input was found practical in some special functions, such as loading models and setting viewpoints of the design and visualisation software. In addition, the user comments revealed that the dialogue of the speech recognition system was practical, providing help and asking questions in conflicting situations; for example, if the user requested "Please rotate the model", the system would generate the prompt "Rotate model, left or right?".
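A minimal sketch of this command-and-clarification behaviour is given below. The command grammar and prompt strings mirror the examples quoted above but are otherwise assumptions, not the configuration of the actual speech recogniser.

import re

# Command patterns mirroring the speech commands quoted above; the
# grammar and prompt texts are assumptions.
COMMANDS = [
    (re.compile(r"load model (\d+)"),               "load_model"),
    (re.compile(r"go to next slide"),               "next_slide"),
    (re.compile(r"go to previous slide"),           "previous_slide"),
    (re.compile(r"go to slide (\d+)"),              "goto_slide"),
    (re.compile(r"rotate the model (left|right)"),  "rotate"),
]

def interpret(utterance):
    """Map a recognised utterance to (command, arguments, prompt).
    When a required parameter is missing, return a clarifying prompt
    instead of a command (dialogue fallback)."""
    text = utterance.lower().strip()
    for pattern, command in COMMANDS:
        match = pattern.search(text)
        if match:
            return command, match.groups(), None
    if "rotate" in text:                            # under-specified request
        return None, (), "Rotate model, left or right?"
    return None, (), "Sorry, I did not understand the command."

# interpret("Please rotate the model")  -> (None, (), "Rotate model, left or right?")
# interpret("go to slide 12")           -> ("goto_slide", ("12",), None)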

12 Discussion

The use of accelerometer-based gesture recognition as a complementary modality for a multimodal system was analysed in two user studies. The first study examined the suitability and type of gestures for controlling selected home appliances and the Smart Design Studio. During the study it was found that, with a small subset of users, a request to fill in the same questionnaire again after a few days produced different gestures for certain tasks. Finding suitable gestures for a certain task therefore seems to require iteration. At first, the users may propose complex gestures, which are easy to understand and relate to a function, e.g. an R-shaped gesture for VCR record (Table 2). Later, the users may discover straightforward gestures that demand less user effort than complex ones, e.g. move gestures in the design environment. Moreover, for certain tasks, there is no similarity between the proposed gestures (Fig. 1). Overall, the results of the study indicated that people prefer to define personal gestures. The gesture personalisation capability was evaluated with a recognition accuracy test, which validated that gesture training and accurate recognition are feasible.

The usefulness of the integrated gesture recognition system was evaluated with real users in the Smart Design Studio with a multimodal interface. The comparison of modalities suggested that the preferred interaction method is task dependent. However, the preference for modalities varied considerably between test participants. Some users preferred using multiple modalities for the same task, and some preferred using just one. Different modalities had different numbers of possible controllable functions. IntelliPen had the widest scope of functions. For speech, gestures, stylus and tangible objects, the number of reasonable control functions was lower, excluding e.g. drawing functions. Despite the limited set of functions for these modalities, they were still preferred for some tasks. All the modalities had the highest precedence percentage for at least one task, suggesting that multiple modalities improve interaction with the system. The results show that users tried to select the most natural control modality available for a given task. Tasks which did not require direct manipulation of screen objects (such as CAD model editing) were typically controlled by modalities that provided command-based discrete control, i.e. speech, gestures, stylus and tangible objects. In addition, these modalities offered the freedom to move around during presentations and control the task remotely, even from the other side of the room, while direct manipulation required the user to stand right in front of the screen, often blocking the view of the audience. The test results were based on multiple test sessions of a few hours' duration over a short period of time. They indicated that interaction can be improved by using different modalities for different tasks. However, user preferences might change during long-term usage, and more extensive tests are required to validate that the results are not based on the novelty effect alone.

Concerning gesture interaction, user feedback included positive comments about the fast operational speed of the gesture recognition system and the naturalness of the interaction in the navigation tasks. The users experimented with and tested different gestures for controlling the Smart Design Studio applications. Interestingly, after freely experimenting with gestures in user groups in the test environment, a consensus was found for a common gesture set for 3D navigation. For this set of design model move commands, a clear spatial association exists. The discovery of a common gesture set reflected user feedback, which suggested that the system should include a user-independent pre-trained library of typical navigation gestures enabling instant usage. Nevertheless, a separate training program was still desired for the personalisation (adding and modifying) of the gestures.

Discovering a common gesture set differs from the questionnaire findings, where consensus was missing. Finding a common gesture set seems to depend on the task. When the chosen set of gestures is intuitive enough for the given task, as for movements in 3D navigation, the users can easily adopt gestures defined by others. For gesture commands having a natural spatial association with the navigation task, a common set agreed by a group of users can be found, but this requires experimentation with a real system. The contradictory questionnaire results show that it is difficult to imagine beforehand which gesture would be useful and practical for a certain task without interacting with the real application. Gestures are a rather new control method, and only a few users have previous experience of interacting with computer systems using gestures. Hence, concrete hands-on experimentation is required for any task to find the best gestures agreed upon among a group of users.


13 Conclusions and future work

Accelerometer-based gesture recognition was studied as an emerging interaction modality, providing new possibilities to interact with mobile devices, consumer electronics, etc. A user questionnaire was circulated to examine the suitability and type of gestures for controlling a design environment and selected home appliances. The results indicated that people prefer to define personal gestures, implying that the gestures should be freely trainable. An experiment to evaluate gesture training and recognition based on signals from 3D accelerometers and machine learning methods was conducted. The results validated that gesture training and accurate recognition are feasible in practice. The usefulness of the developed gesture recognition system was evaluated in a user study with a Smart Design Studio prototype that had a multimodal interface. The results indicated that gesture commands were natural, especially for simple commands with a spatial association. Furthermore, for this type of gesture a common set, agreed upon by a group of users, can be found, but this requires hands-on experimentation with real applications. The test results were based on multiple test sessions over a short period of time. In the future, more extensive sessions are required to acquire more detailed results on the long-term usefulness of the system.

Sensor-based gesture control brings some advantages compared with more traditional modalities. Gestures require no eye focus on the interface, and they are silent. For some tasks, gesture control can be natural and quick. However, many targets remain for future work. The results of user-dependent recognition should be extended to user-independent recognition, as is the goal in speech recognition. The use of buttons should be eliminated, and the tilt of the device should be filtered out. A long-term goal is continuous gesture recognition. In addition, the gesture interface should give feedback to the user by means of vibration or audio. One of the challenges is selecting a controllable device from the environment, in order to enable the use of different devices with the same gestures. Furthermore, in this study, mobility refers to the mobility of the device, such as a mobile phone or a separate control device, which is carried with the person. Testing the recognition accuracy in cases where the user is moving requires further work. The tests should cover the most common forms of movement during which gesture control might be used. However, early experiments have shown that the recogniser maintains its performance while the user is walking. An important topic is practical usability; more user studies are needed to empirically evaluate and develop the gesture modality for a variety of interaction tasks in multimodal systems.

Acknowledgements We gratefully acknowledge research funding from the National Technology Agency of Finland (Tekes) and the Italian Ministry of Education, University and Research (MIUR). We would also like to thank our partners in the ITEA Ambience project.

References

1. Starner T, Auxier J, Ashbrook D, Gandy M (2000) The gesture pendant: a self-illuminating, wearable, infrared computer vision system for home automation control and medical monitoring. In: Proceedings of the fourth international symposium on wearable computers, ISWC 2000, pp 87–95

2. Rekimoto J (2001) GestureWrist and GesturePad: unobtrusive wearable interaction devices. In: Proceedings of the fifth international symposium on wearable computers, ISWC 2001, pp 21–31

3. Sawada H, Hashimoto S (2000) Gesture recognition using an accelerometer sensor and its application to musical performance control. Electron Commun Jpn Part 3, pp 9–17

4. Hoffman F, Heyer P, Hommel G (1997) Velocity profile based recognition of dynamic gestures with discrete hidden Markov models. In: Proceedings of gesture workshop '97, Springer, Berlin Heidelberg New York

5. Tsukada K, Yasumura M (2002) Ubi-finger: gesture input device for mobile use. In: Proceedings of APCHI 2002, Vol. 1, pp 388–400

6. Wilson A, Shafer S (2003) Between u and i: XWand: UI for intelligent spaces. In: Proceedings of the conference on human factors in computing systems, CHI 2003, April 2003, pp 545–552

7. Flanagan J, Mantyjarvi J, Korpiaho K, Tikanmaki J (2002) Recognizing movements of a handheld device using symbolic representation and coding of sensor signals. In: Proceedings of the first international conference on mobile and ubiquitous multimedia, pp 104–112

8. Mantyla V-M, Mantyjarvi J, Seppanen T, Tuulari E (2000) Hand gesture recognition of a mobile device user. In: Proceedings of the international IEEE conference on multimedia and expo, pp 281–284

9. Theodoridis S, Koutroumbas K (1999) Pattern recognition. Academic Press, London

10. Pirhonen A, Brewster S, Holguin C (2002) Gestural and audio metaphors as a means of control for mobile devices. In: Proceedings of CHI 2002, April 2002, pp 291–298

11. Tuulari E, Ylisaukko-oja A (2002) SoapBox: a platform for ubiquitous computing research and applications. In: First international conference, Pervasive 2002, pp 26–28

12. Rabiner L (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, Vol. 77, No. 2

13. Kay S (2000) Can detectability be improved by adding noise? IEEE Signal Process Lett 7(1):8–10

14. Ailisto H, Plomp J, Pohjanheimo L, Strommer E (2003) A physical selection paradigm for ubiquitous computing. In: 1st European symposium on ambient intelligence (EUSAI 2003). Ambient intelligence, Lecture Notes in Computer Science Vol. 2875. Aarts E et al (eds) Springer, Berlin Heidelberg New York, pp 372–383

15. Peltola J, Plomp J, Seppanen T (1999) A dictionary-adaptive speech driven user interface for distributed multimedia platform. In: Euromicro workshop on multimedia and telecommunications, Milan, Italy

16. Kallio S, Kela J, Mantyjarvi J (2003) Online gesture recognition system for mobile interaction. In: IEEE international conference on systems, man & cybernetics, Vol. 3, Washington DC, USA, pp 2070–2076

17. Mantyjarvi J (2003) Sensor-based context recognition for mobile applications. VTT Publications 511

18. Iacucci G, Kela J, Pehkonen P (2004) Computational support to record and re-experience visits. Personal and Ubiquitous Computing, Vol 8 No 2, Springer, Berlin Heidelberg New York, pp 100–109


19. Gersho A, Gray RM (1991) Vector quantization and signal compression. Kluwer, Dordrecht

20. Yoon HS (2001) Hand gesture recognition using combined features of location, angle and velocity. Pattern Recogn 34:491–501

21. Mantyla V-M (2001) Discrete hidden Markov models with application to isolated user-dependent hand gesture recognition. VTT Publications 449

22. Mantyjarvi J, Kela J, Korpipaa P, Kallio S (2004) Enabling fast and effortless customisation in accelerometer based gesture interaction. In: Proceedings of the third international conference on mobile and ubiquitous multimedia, ACM, pp 25–31
