
Future Human Computer Interaction with special focus on input and output techniques

Thomas Hahn, University of Reykjavik, [email protected]

March 26, 2010

Abstract

Human Computer Interaction in the field of input and output techniques has produced a lot of new techniques over the last few years. With the recently released full multi-touch tablets and notebooks, the way people interact with the computer is reaching a new dimension. As humans are used to handling things with their hands, the technology of multi-touch displays and touchpads has brought much more convenience for daily use. Human speech recognition will certainly also play an important part in the future of human computer interaction. This paper introduces techniques and devices that use human hand gestures with multi-touch tablets and video recognition, as well as techniques for voice interaction. Gesture and speech recognition play an important role here, as these are the main communication methods between humans, and they could disrupt the keyboard and mouse as we know them today.

1 Introduction

As mentioned before, much work in the sector of human computer interaction, in the field of input and output techniques, has been done in recent years. Since the release of multi-touch tablets and notebooks, some of these multi-touch techniques are coming into practical use. It will surely not take long until sophisticated techniques enhance them further with human gesture or voice detection. These two new methods will certainly play an important role in how HCI will change and how people can interact more easily with their computer in daily life.

Hewett et al. defined that "Human-computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them." [1] Since the invention of the graphical human computer interface in the 1970s at Xerox PARC, we have been used to a mouse and a keyboard for interacting with the computer and to the screen as a simple output device. With upcoming new technologies these devices are more and more converging with each other, or sophisticated methods are replacing them. This paper therefore mainly deals with these new developments, how they could be implemented in the future, and how they could influence and change daily computer interaction.

For example, with the techniques used for multi-touch devices, described in section 2.1, the screen recently became an input and output tool in one device, so there is no need for extra input devices. Even this is a completely new situation, as we are used to having more than just one device. Section 2.2 covers another promising method of human gesture interaction, the detection of gestures via video devices. Section 2.3 then points out another future technique, human speech detection as an input method, and section 2.4 deals with a method combining video and speech detection. After these sections about the different types of recent human computer interaction work, section 3 deals with the opportunities of these new techniques, how they can be used in the future, and especially how daily life can be changed. It also points out in which fields these new developments can be adopted.

2 Recent Developments

Human gestures and human speech are the most intuitive means humans use to communicate with each other. Yet after the invention of the mouse and the keyboard, no further devices that could replace these two objects as computer input methods were developed for a long time. Because gestures and speech are more human-like methods, a lot of research has been done on how they can be used for communication between computers and human beings. As there are different ways of how gestures can be used as input, this section is divided into multi-touch, video, speech and multi-modal interaction sections. Nowadays many tablets with touch screens are already available, and with the new Apple iPad a full multi-touch product has been released. But there is also a noticeable trend that these methods can be used on bigger screens, like the approach of Miller [2] or Microsoft's Surface [3], to mention only these two. Thus the trend is going more and more not only in the direction of merging input and output devices, but rather towards using every surface as an input and output facility. In this context video is becoming of greater interest, as it uses the full range of human motion gestures and is usable on any surface. Finally, input via speech also takes its part in HCI, as it is the human's easiest way to communicate, but it is certainly something completely different compared with the other two types of input and output methods, as it is more an algorithm than a device. The combination of different input methods, called multi-modal interaction, is then described in the last section.

2.1 Multi-Touch Devices

As mentioned before, this section deals with the technique of the recently released multi-touch devices and with some new enhanced approaches. This method is now becoming common in tablet PCs, for example the new Apple iPad in the notebook sector and the HP TouchSmart in the desktop sector. Thereby the screen becomes an input and output device in one. But multi-touch is also used today in many normal touchpads which offer four-finger navigation. With this invention of a new kind of human computer interaction, much more work has been done in this sector and should sooner or later also come into practical use. Nowadays the usage of touch screens and multi-touch pads seems to be really common, and this appears to be the future of human computer interaction, but there is certainly more enhancement to come, which can be seen in many approaches. In the field of multi-touch products there is a trend towards bigger touchpads in the form of multi-touch screens.

Therefore the technique of the single-touch touchpad, as known from former notebooks, is enhanced, and more fingers offering natural human hand gestures can be used. Thus the user can use up to 10 fingers to fully control things with both hands, as with the 10/GUI system which R. Clayton Miller [2] introduced in 2009. With another upcoming tool, even any surface can be used as such a touch screen in the future, as for example with the Displax™ Multitouch Technology from DISPLAX™ Interactive Systems [4]. From these examples it can be clearly seen that new high-potential techniques are pushing into the market and are going to challenge Apple's iPad and Microsoft's Surface. In the following sections these new tools, and also the related devices which are already on the market, are described in detail.

2.1.1 iPad

The recently introduced iPad from Apple is one of the many implementations of full multi-touch displays, and it is a completely new way for people to interact with their computer. The possibility to use the screen not only as a single-touch display takes Human Computer Interaction to the next level. With the iPad it is possible to use all the finger movements which are also possible with the built-in multi-touchpads of the Apple MacBooks, as described on the Apple homepage.¹

In doing so, the user is able to use up to four fingers at the same time to navigate through the interface. For example, two fingers can be used to zoom and four fingers to browse through windows. By using the screen as one big touchpad, the techniques of the normal touchpad have been enhanced. Still, this technique is just the beginning of the new multi-touch display revolution, which will surely be expanded by increasing the display size.

¹ http://www.apple.com
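To make the finger-count mapping described above concrete, here is a minimal sketch in Python. The event structure and the action names are my own assumptions for illustration; this is not Apple's actual API.

```python
# Minimal sketch: map the number of simultaneous touches to an
# interface action, as described for the iPad above. Event structure
# and action names are hypothetical, not Apple's API.

def dispatch_gesture(touch_points):
    """Map the number of active touches to an interface action."""
    n = len(touch_points)
    if n == 1:
        return "point"           # one finger acts as a pointer
    if n == 2:
        return "zoom"            # e.g. a two-finger pinch
    if n == 4:
        return "switch_window"   # e.g. a four-finger swipe
    return "ignore"

# Example: events with 1, 2 and 4 active touches
for touches in ([(10, 20)],
                [(10, 20), (30, 40)],
                [(1, 1), (2, 2), (3, 3), (4, 4)]):
    print(len(touches), "->", dispatch_gesture(touches))
```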

2.1.2 Microsoft© Surface

Microsoft called their example of a multi-touch screen simply "Surface". With this tool they created a large touch screen tabletop computer. It uses infrared cameras to recognize objects placed on the screen; an object can thereby be a human finger or another tagged item. With this, the device supports recognition of natural human hand gestures as well as interaction with real objects and shape recognition. No extra devices are required, and interaction can be made directly with the hands. With the large 30 inch display, more than one person can interact with the system, and with each other, at the same time. The recognition of objects placed on the tabletop PC then provides more information and interaction; for example, it is possible to browse through different information menus about the placed item and obtain more digital information. On the other hand, the size of the display and the infrared cameras needed underneath it lead to an increase in size. Thus the tool is mainly designed for stationary use, for example as a normal table with which it is then possible to interact.

2.1.3 10/GUI

The 10/GUI system which Miller [2] invented is an enhanced touchpad for desktop computer use which can recognize 10 fingers. With this, human beings can interact with the computer with both hands and use it as a tracking and maybe also as a keyboard device. Miller designed this new touch surface tool especially for use in the desktop field.

To achieve the most ergonomic position, he argues that it is better to have a full multi-touch pad in front of a screen as the keyboard and mouse replacement than to have the whole screen as an input device, as known from other touch screens. This novel system is a wholly different method of computer interaction and is extended with a special graphical user interface for the full use of all 10 fingers. The most remarkable thing about this touchpad, apart from the recognition of more fingers, is the pressure detection of every finger, which is directly indicated on the screen. Every finger can thus be used as a pointing device like the mouse. With this feature it could also be used in the future as a keyboard, with the hands in their normal daily position, without the need to select letters with only one finger. But this first version of 10/GUI is mainly concerned with having the 10 fingers act instead of the mouse, "...as many activities today need only a mouse and windowed information display on the top of that..." [2]. The other innovation of this system is its specially designed user interface for the usage of these 10 fingers. Miller proposes a way to solve the problem of multiple open windows: his solution is a linear arrangement of the active windows, through which the user can browse with two buttons on the outer left and right side of the touch panel.
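Since Miller's materials describe the behavior but not a concrete data format, the following is only a hedged sketch of the per-finger state a 10-finger, pressure-sensitive pad would have to report; all field names and the pressure threshold are assumptions.

```python
# Sketch of per-finger state for a 10-finger pad with pressure
# sensing, in the spirit of 10/GUI. Field names are hypothetical.

from dataclasses import dataclass

@dataclass
class Touch:
    finger_id: int   # stable id 0..9 while the finger stays down
    x: float         # pad coordinates, normalized to 0..1
    y: float
    pressure: float  # 0.0 (resting) .. 1.0 (full press)

def active_pointers(frame, threshold=0.3):
    """Fingers pressing harder than the threshold act as 'clicks';
    lighter touches are only tracked, as 10/GUI indicates on screen."""
    return [t for t in frame if t.pressure >= threshold]

frame = [Touch(0, 0.21, 0.40, 0.1), Touch(1, 0.55, 0.42, 0.8)]
print([t.finger_id for t in active_pointers(frame)])  # -> [1]
```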

Figure 1: Displax’s thin transparent multi-touch surface [4]

2.1.4 Displax™ Multitouch Technology

Another forward-looking multi-touch technology comes from the Future Labs of the Displax™ company. They have invented a new method to turn "...any surface into an interactive multitouch surface." [4] To achieve this goal, their approach uses a very thin transparent film that is attached to the Displax™ Multitouch controller. With this ultra-thin film they are able to turn any surface into a touchscreen of up to 50 inches. This makes it possible to work directly on a big screen just by using your hands, which can be seen as the main advantage over the Microsoft Surface tool, which is tied to a more or less stationary place. Additionally, this interface allows the usage of 16 fingers at the same time, so that more than one user can work on the screen simultaneously. With a weight of just 300 g it is also a very portable tool, besides the fact that it is quite durable, as the film is placed on the back of the surface to protect it from scratches and other damage. Figure 1 illustrates a detail of the thin touch-screen surface, where even usage on transparent surfaces is possible. [4]

2.2 Video Devices

Great efforts have also been made in the area of video input and output devices, for example SixthSense, which Pranav Mistry et al. [5] invented in 2009. The main purpose of such devices lies in the potential for even more interaction with the computer than with normal touch screens or touchpads. These techniques tend to recognize human gestures, such as those of the hands, without the need for additional handheld pointing devices. Feng et al. [6] have already provided an algorithm for real-time natural hand gesture detection in their paper.

With it, the bare hand alone can be used as a pointing device like the mouse. Pranav Mistry et al. [5] improved this pointing input method with the development of a wearable gesture interface which is not only an input but also an output tool, and which will surely have an impact on future human computer interaction. In this section these new methods of handling the interaction are explained.

Figure 2: Components of the SixthSense device [7]

2.2.1 SixthSense

With SixthSense, Pranav Mistry et al. [5] clearly developed the beginning of new techniques for human computer interaction. Their approach uses a wearable gesture interface, shown in Figure 2, which takes HCI to the next level, where communication is done with the hands without any handheld tools. Just as simple is the output of this interface, which is included in the same device: with a small projector the output is directly projected onto any surface you like, which is clearly the main advantage over other devices that are tied to a certain place. So for the whole communication with the computer they use a simple webcam for the user's input and a simple projector for the output. The input can range from simple hand gestures for sorting or editing images to more specific tasks.

The input can also be a keyboard which is projected by the small beamer, thereby fulfilling output and input at the same time. They have also included in the video recognition the ability to react to things other than human hand gestures. For example, it recognizes that the user is reading a newspaper and adds additional media on the topic by projecting the media directly onto the newspaper.
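Mistry et al. track colored marker caps on the user's fingertips with the webcam. A hedged sketch of that vision step is given below; the HSV color range and the single-marker simplification are my assumptions, and the snippet requires the opencv-python and numpy packages.

```python
# Sketch: locate one colored fingertip marker in a webcam frame,
# the kind of tracking step a SixthSense-like system builds on.

import cv2
import numpy as np

LOWER_RED = np.array([0, 120, 120])   # assumed HSV range for a red cap
UPPER_RED = np.array([10, 255, 255])  # would need tuning in practice

def find_marker(frame_bgr):
    """Return the (x, y) centroid of the largest red blob, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_RED, UPPER_RED)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    c = max(contours, key=cv2.contourArea)
    m = cv2.moments(c)
    if m["m00"] == 0:
        return None
    return (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"]))

cap = cv2.VideoCapture(0)             # default webcam
ok, frame = cap.read()
if ok:
    print("marker at:", find_marker(frame))
cap.release()
```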

2.2.2 Skinput

The video recognition tool mentioned above deals with human motion gestures; in particular, it recognizes hand gestures and lets people interact through them with a certain output device. Researchers at Carnegie Mellon University and Microsoft have developed another highly topical approach, called Skinput, that uses the human arm as an input surface. [8]

Figure 3: Acoustic biosensor with projector [8]

They use a small projector fixed around the human biceps to project buttons onto the user's arm. At first sight this seems like the same approach as the SixthSense [5] tool mentioned before.

But when the user then taps on a button, an acoustic biosensor built into a wristband detects the pressed button. This technique works because the sensor detects the different acoustic sounds, which vary due to the underlying bones in the forearm. Detailed results of how accurately this new method works in practical use will be announced at the 28th ACM Conference on Human Factors in Computing Systems² this year. But so far it is stated that with 5 buttons the researchers were able to reach an "...accuracy of 95.5% with the controllers when five points on the arm were designated as buttons." [8] With the different fingers it is then possible to operate even more complex user interfaces, like using a scrolling button or even playing Tetris on the palm. An example of what Skinput looks like, and its technical components, is shown in Figure 3.
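The cited material states the accuracy but not the classifier details, so the following is only a toy illustration of the principle: each tap's sensor window is turned into a feature vector, and a classifier maps it to one of five projected buttons. The features, the synthetic data and the SVM choice are placeholder assumptions; it requires numpy and scikit-learn.

```python
# Toy illustration of Skinput-style tap classification: taps at
# different arm locations yield different acoustic signatures.

import numpy as np
from sklearn.svm import SVC

def features(signal):
    """Toy features for one tap: amplitude stats plus coarse
    frequency-band energies from an FFT of the sensor window."""
    spectrum = np.abs(np.fft.rfft(signal))
    bands = [band.sum() for band in np.array_split(spectrum, 8)]
    return np.array([signal.max(), signal.std(), *bands])

rng = np.random.default_rng(0)
# Fake training set: 5 button locations x 20 recorded taps each
X = np.array([features(rng.normal(scale=1 + loc, size=256))
              for loc in range(5) for _ in range(20)])
y = np.repeat(np.arange(5), 20)

clf = SVC().fit(X, y)
tap = features(rng.normal(scale=3, size=256))
print("predicted button:", clf.predict([tap])[0])
```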

2.2.3 g-speak

With g-speak, Oblong Industries [9] invented a full gesture input/output device with a 3D interface. It can of course be compared with the SixthSense and Skinput devices mentioned before. The difference from the other two methods is clearly the target group: while SixthSense and Skinput are mainly designed for mobile usage, g-speak comes with a more sophisticated user interface and is designed for usage with big screens that take up a lot more space.

The user wears a sort of hand glove to control the interface, as seen in Figure 4. The gestural motions are detected with an accuracy of 0.1 mm at 100 Hz, and two-handed as well as multi-user input is supported. Additionally, the system comes with high-definition graphical output which can be projected onto any screen in front of the user. For detailed operations the user can also drag objects from the big screens to a smaller screen in front of him.

² http://www.chi2010.org/

There the user can operate the interface on a touchscreen and drag the objects back to the big output facilities. Most importantly, the system can be used with any device you choose, for example a desktop screen, touchscreen or handheld device. Furthermore, with the support of multiple users, this interface allows several users to interact with the system and work together very easily. [9]

This can be seen as a form of multi-modal computer interaction interface, where the user has the opportunity to interact with the computer through more than just one device.
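Oblong's spatial operating environment itself is proprietary, but the geometric core of mid-air pointing can be sketched: intersect the tracked pointing ray with the screen plane to get a cursor position. Coordinates, units and the simplified model below are my assumptions.

```python
# Sketch: map a tracked hand position and pointing direction to a
# point on the screen plane via a ray-plane intersection.

import numpy as np

def screen_hit(hand_pos, hand_dir, plane_point, plane_normal):
    """Intersect the ray hand_pos + t*hand_dir with the screen plane;
    return the 3D hit point, or None if pointing away from it."""
    denom = np.dot(plane_normal, hand_dir)
    if abs(denom) < 1e-9:
        return None  # ray parallel to the screen
    t = np.dot(plane_normal, plane_point - hand_pos) / denom
    return hand_pos + t * hand_dir if t > 0 else None

hand = np.array([0.0, 1.2, 2.0])   # meters, in room coordinates
aim = np.array([0.0, 0.1, -1.0])   # roughly toward the screen at z=0
print(screen_hit(hand, aim,
                 np.array([0.0, 1.5, 0.0]),    # a point on the screen
                 np.array([0.0, 0.0, 1.0])))   # screen plane normal
```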

Figure 4: Usage of g-speak on a large-scale screen [9]

2.3 Speech Detection

Speech detection is always mentioned as the most common and straightforward way, after gestural motion, in which people interact with each other. This fact of course also impacts the design of human computer interfaces. Within speech detection the main point is the software: for the speech recognition itself you only need a normal microphone. The only thing you then have to consider is the noise that is recorded along with the intended voice. The major task is to create a good algorithm, not only to separate the noise from the actual voice, but rather to detect what humans are actually saying.
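As a minimal illustration of that first step, separating voice from background noise, here is a classic short-time-energy voice activity detector. The frame size, the percentile-based noise-floor estimate and the threshold factor are all assumptions; real recognizers use far more robust statistical models.

```python
# Sketch: energy-based voice activity detection over 20 ms frames.

import numpy as np

def voice_activity(samples, rate=16000, frame_ms=20, factor=3.0):
    """Return one boolean per frame: True where speech is likely."""
    frame_len = int(rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    noise_floor = np.percentile(energy, 10)  # quietest frames ~ noise
    return energy > factor * noise_floor

rng = np.random.default_rng(1)
audio = rng.normal(scale=0.01, size=16000)          # 1 s of noise
audio[4000:8000] += np.sin(np.arange(4000) * 0.05)  # a loud "voiced" burst
print(voice_activity(audio).astype(int))
```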

In this section I introduce two of the various approaches which can be used for speech detection. First an algorithm to improve spoken natural language recognition is described, and second I describe a more common architecture from Microsoft.

2.3.1 Spoken natural language

Florez-Choque et al. [10] have introduced an approach to improve human computer interaction through spoken natural language. They use a "Hybrid Intelligent System based on Genetic Algorithms Self-Organizing Map to recognize the present phonemes" [10]. Unfortunately, this model is based only on a Spanish language database, so for use with another language the approach obviously has to be adapted in order to achieve full speech recognition. In general, using speech processing and phoneme recognition modules to recognize spoken language is a good approach for speech detection. As mentioned before, it is then necessary to adapt the phoneme recognition module, as the pronunciation differs in every language. More detailed information about their work on improving human computer interaction through spoken natural language can be found in their paper. [10]

2.3.2 Microsoft© SAPI

Microsoft is of course also doing a lot of research on speech recognition, for example with speech technologies such as the Speech Application Programming Interface (SAPI). With this API Microsoft provides on the one hand a converter from fully spoken human audio input to readable text; on the other hand it can also be used to convert written text into human speech with a synthetic voice generator.

The main advantage compared with the approach of Florez-Choque et al. [10] is the fact that this tool comes with a package for nearly every common language, starting with English, German, French, Spanish, and so on. On the one hand this API is available on every Windows OS, but on the other hand it is only made for native Windows applications. So only Windows applications "...can listen for speech, recognize content, process spoken commands, and speak text...". [11] A lot more information about the development of this tool can be found on the Microsoft Speech Technology page itself.
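The text-to-speech half of SAPI can be reached from Python through COM on Windows, for example via the pywin32 package. The SpVoice class and its Speak method are part of SAPI itself; the setup shown here is just one possible route, sketched without error handling.

```python
# Windows only; requires the pywin32 package for COM access.
import win32com.client

# SAPI exposes its synthesizer as the "SAPI.SpVoice" COM class.
voice = win32com.client.Dispatch("SAPI.SpVoice")
voice.Speak("Hello from the Speech Application Programming Interface.")
```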

2.4 Multi-modal: combination of video and speech recognition

As mentioned before, Microsoft is doing a lot of research in the field of speech detection. With the API described in the previous section they provide a good method for navigating interfaces with human language, although this technique still needs a lot of work to reach its goal. Microsoft's researchers have now been working for decades on an accurate speech detection system but have not found the perfect solution yet. A trendsetting example in a slightly different field could be the approach which Microsoft introduced in their vision of a future home. This was originally developed for use in a future kitchen, but it describes perfectly how future Human Computer Interaction may look. In this approach they use multi-modal interfaces, combining video and speech recognition: video is used to detect goods in the kitchen, and video projection displays the user interface directly on the kitchen surface. The detection can be imagined, for instance, as the system recognizing which ingredient is placed on the surface. For navigating through the interface, they then combine this video detection method with speech detection.

The whole demonstration of this device can be found on Microsoft's Future Home website. [12]

It is then obvious that this technology of multi-modal interfaces will surely also influence "normal" Human Computer Interaction, as the same methods can be used in a day-to-day user interface.
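As a toy sketch of this fusion idea: the vision channel supplies the context (which object lies on the surface) while the speech channel supplies the command, and the interface resolves both together. Every name and canned response below is invented for illustration; the real system's recognizers are far more involved.

```python
# Sketch: fuse an object-detection context with a speech command.

CONTEXT_ACTIONS = {
    ("flour", "show recipes"): "Recipes using flour ...",
    ("flour", "how much left"): "About 400 g remaining.",
}

class KitchenInterface:
    def __init__(self):
        self.detected_object = None  # updated by the vision channel

    def on_object_detected(self, label):
        self.detected_object = label

    def on_speech_command(self, command):
        key = (self.detected_object, command)
        return CONTEXT_ACTIONS.get(key, "Sorry, I did not understand.")

ui = KitchenInterface()
ui.on_object_detected("flour")       # the camera sees a flour bag
print(ui.on_speech_command("show recipes"))
```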

3 Applications

It can be clearly seen that the way people interact with their computer has changed over the years, and that the trend is towards a more convenient way to interact with the PC. As shown in the previous section, there are mainly three different types of how interaction with the computer could look in the future:

1. multi-touch recognition,

2. video gesture and pointing recognition,

3. speech recognition.

These are the categories in which Human Computer Interaction will make great advances. The combination of these techniques will then lead to multi-modal interfaces, where all these types are used together in one system. Now that we have heard how these different techniques are used and modeled for practical use, the most important question is how we can benefit from them. The main point here is where these techniques and devices can find application in our daily use. Within this section some possible applications of these new devices are shown. The enumeration above can also be seen as a ranking of the techniques that are likely to come onto the global market. This ranking is based on the fact that multi-touch devices, with the iPad as cutting edge, are already pushing into the market.

3.1 Multi-touch Recognition

The most common form of Human Computer Interaction in the near future is surely the technology of multi-touch displays. The recently introduced iPad is just the beginning of this new trend. But where can we use and benefit from the technique of using multiple fingers and multiple users at the same time? In which applications could there be a potential use for these techniques, and are they maybe already in use, but only in a small market like the military or education sector? Of the three multi-touch types introduced in section 2.1, only the iPad has so far come onto the global market. The iPad is mainly designed for mobile usage, as it is often referred to as a bigger version of the iPhone. With a price of around $500 it also competes with the cheap netbook sector. Price is surely the reason why the other mentioned products are not yet positioned on the global market: they have not yet reached the critical price at which the wide mass could afford to buy them. A wide range of applications for such models is certainly available, especially for tools like the 10/GUI system [2], which is made for home computer use. Most of us would appreciate having a full multi-touchpad in front of the computer screen with which everything could be operated; no unnecessary cables and no separate pointing and typing devices would be needed. It is thus clear that this device could disrupt the mouse and the keyboard as the typical input devices. Another well-known product example is Microsoft's Surface [3]. With their multi-touch large-screen tabletop computer they intend to create a device for many different applications. With the opportunity to work without any additional devices, such as the mouse, it is very intuitive to use.

Microsoft sees applications in the industries of financial services, health care, hospitality, retail and the public sector. These examples make it obvious that this technique has a wide target group. We see the main target group in the public sector, or even more in the entertainment sector, because of the relatively large size of the device compared with other devices that use a lot less space. On the other hand, the main advantage over all other devices is truly the usage of tagged objects: not only human gestures are tracked, but more sophisticated methods are used as well. A completely different kind of touch screen usage is supported by the Displax™ Multitouch Technology [4], with its ultra-thin transparent film that can be spanned over any surface you like. The fact that the whole device weighs only 300 g also makes it very mobile, so that it can be used everywhere, for example at conferences or meetings where several people can collaborate with each other. The fact that the system also allows the usage of 16 fingers brings even more advantages for this interaction. These applications are all mainly for personal or business use, but in which other areas could it be employed? There is no specific answer to this question; in principle, any surface with this film placed on it could become an interactive input device. The DISPLAX™ company itself sees its potential customers in retail and in diverse industries such as telecoms, museums, property, broadcast, pharma or finance. As this technology was primarily developed to integrate a touchscreen into displays, it "...will also be available for LCD manufacturers, audiovisual integrators or gaming platforms...". [4]

3.2 Video Gesture and Pointing Recognition

Video gesture and pointing recognition devices use even more sophisticated

methods to interact with the computer. The basic principle behind these techniques is that the devices are controlled by human gestures and motions. From the recent developments it can be seen that there are two types of applications for video recognition devices, differing in the mobility aspect: while mobility plays an important role for SixthSense [5] and Skinput [8], g-speak [9] is mainly designed for stationary purposes. Nevertheless, all three have the potential to become a future device for Human Computer Interaction; let us start with the possible applications of the g-speak tool. With their tool, Oblong Industries have designed a completely new way to allow free hand gestures as input and output. Besides the gestural part, they also constructed this platform for real-space representation of all input objects and on-screen constructs on multiple screens, as shown in Figure 4. With this tool they use not only the so-called mid-air detection, which recognizes human hand gestures and operates the interface, but also a multi-touch tabletop PC, as described in the previous section. Skinput, as introduced in section 2.2.2, is a new way of how interaction with human fingers can look in the future. Skinput was designed around the fact that mobile devices often do not have very large input displays; therefore it uses the human body, or more precisely the human forearm, as an input surface with several touch buttons. As far as can be seen for now, this system contains some interesting new technology but is limited to pushing buttons. Thus it will find its use in areas where a user operates an interface with just buttons, and it is not very likely to replace the mouse or keyboard in general.

With the prototype of SixthSense, Pranav Mistry et al. demonstrate in a fascinating way how it is going to find its usage in future human computer interaction. As already mentioned, this tool uses human gestures as an input method, but it offers many more possible applications. To name only a few: it can project a map in which the user can zoom and navigate, and it can be used as a photo organizer or as a free painting application when displayed on a wall. Taking pictures by forming the hands into a frame, displaying a watch on the wrist or a keyboard on the palm, and displaying detailed information in newspapers or current flight information on flight tickets are just a few of the many opportunities. Some of these applications are shown in Figure 5. This list of applications highlights the high potential of this new technology, and with the up-to-date components in the prototype costing about $300 it is even affordable for the global market. Thus this device most likely has the highest probability of being the first to push into the market and become the beginning of future interaction with the computer.

3.3 Speech Recognition

Human speech, as mentioned before, is the easiest and most convenient way for people to communicate with each other. But how can this fact be used for interaction between humans and the computer? As described in the section above, there are different approaches to the problem of achieving a good accuracy of correctly detected words. Some of these techniques are already in practical use, like Microsoft's speech application, which comes with all Windows operating systems. Although Microsoft in particular has done a lot of work in this sector, until now they have not been able to develop a method with an accuracy high enough to serve as a 100% reliable input technique.

(a) Map navigation on any surface
(b) Detailed flight information
(c) Keyboard on the palm

Figure 5: Applications of the SixthSense tool [7]

As long as such 100% reliability is not reached, this technique will perhaps find no practical use as a sole human computer interaction alternative. Once this goal is achieved, this method will certainly find its place in many areas. A large target group in general is the business sector, where automatic speech detection would relieve a lot of work; especially the speech-to-text part is important here, as it represents the hardest work. For easier speech recognition tasks, some approaches are already reasonably satisfying in use. For instance, for simple navigation through operating systems this technique is just good enough and already helps many people.

Especially in the medical sector it is really helpful for people with physical disabilities or in situations where the usage of the hands is unsuitable. Beyond that, the common way to interact with the computer is definitely going in the direction of human gesture detection or the usage of multi-modal interfaces, as described in the next section.

3.4 Multi-modal Interfaces

Multi-modal interfaces are being developed in many different combinations of several input methods. We have already seen some examples with the g-speak and SixthSense tools, where gestural aspects are combined with touch techniques. This section mainly deals with the example of Microsoft's Future Home, where video and audio detection are combined for the interaction between the computer and humans; in particular, it considers the advantages of adding the audio part to the interaction. In their example Microsoft presents an outstanding way of how communication with the computer can look in the future. With the usage of video and audio detection, this is brought to the next level. In their approach they introduce these methods primarily for kitchen use. For practical usage it can be clearly seen that this combination brings more flexibility into the interaction. This refers especially to tasks where it is not possible to use the hands to interact with the computer. In Microsoft's example this location is, as mentioned before, the kitchen, where the hands are often needed for cooking and it is very helpful to navigate through an interface with the voice. A few other examples are the medical sector for people with disabilities, the car, the plane, and many more. Unfortunately, most applications demand a high level of accuracy which speech recognition cannot yet achieve.

Of course this technique is sophisticated enough to use for simple navigation through interfaces, as in the kitchen example. On the other hand, the video object recognition in this example is also a good addition for creating a powerful tool.

3.5 Comparison

In the last sections we have heard many details about the tools, the new approaches and their applications. This leads to the question of which tools will come into practical use in the near future. With the recently released iPad and also Microsoft's Surface, the first step into a future of multi-touch and video detection devices has been made. The iPad, with its affordable price, is the first that is really pushing into the global market, and compared with the other products it provides the most advantages with its features. The ultra-thin film developed by Displax™ extends the multi-touch display to a large screen, with the great benefit that it allows multiple users. Nevertheless, the SixthSense tool and g-speak use completely new technology. The only problem there is the matter of price: until now these tools have not reached the critical price at which everybody could afford to buy one. With the specified prototype price of only $300, SixthSense could be a big competitor for the multi-touch devices. We will have to see whether this price is really marketable; if so, it has the highest potential to come onto the global market within the next few years. The main advantage of this tool is definitely its wide range of usage, with which none of the other devices can compete. For a more sophisticated adoption, g-speak is the better solution. It delivers an enhanced interface and the possibility of multi-user interaction, which is its big advantage.

On the other hand, it serves more the business or education sector, as it would be too expensive for individuals. The 10/GUI system comes with a solution for home computer use: it enhances multi-touch pads with an individually designed interface and could thereby displace the mouse and the keyboard in the home PC sector. All things considered, these methods deliver many new opportunities for future human computer interaction. The main factor will be the price, and SixthSense seems to offer the best price for adoption in the near future.

4 Conclusion

This paper introduced a number of approaches to new future Human Computer Interaction methods, and also devices and prototypes in which these techniques are already in use. We have seen that we are tending towards disrupting the usage of the mouse and the keyboard, which we have been used to as computer input devices for the last three decades. Many new methods go in the direction of using human hand gestures and even multi-modal methods to interact with the computer. With the tools described, we have seen that some of these methods are already included in the recently released iPad and Microsoft's Surface. There are certainly more sophisticated methods that will push into the market soon. With the SixthSense and g-speak tools, two enhanced methods of human computer interfaces were developed. Given this, and the fact that we are used to acting with our hands and communicating with our voice, these parts will play a major role in our interaction with the computer. On the other hand, it can be seen that there is still a lot of work left, especially in the sector of human voice detection, though video and multi-touch detection also leave some space for expansion.

Many new approaches will be presented at this year's 28th ACM Conference on Human Factors in Computing Systems, and we will see how other devices take part in this new technology.

References

[1] Hewett, Baecker, Card, Carey, Gasen, Mantei, Perlman, Strong, and Verplank, "ACM SIGCHI curricula for human-computer interaction," 1992/1996. http://old.sigchi.org/cdg/cdg2.html, accessed on 2010, February 23.

[2] R. C. Miller, "10/GUI," 2009. http://10gui.com/, accessed on 2010, March 08.

[3] Microsoft©, "Microsoft© Surface," 2010. http://www.microsoft.com/surface/Default.aspx, accessed on 2010, March 20.

[4] DISPLAX™ Interactive Systems, "Displax Multitouch Technology," 2009. http://www.displax.com/en/future-labs/multitouch-technology.html#/en/future-labs/multitouch-technology.html, accessed on 2010, March 20.

[5] P. Mistry and P. Maes, "SixthSense: a wearable gestural interface," in ACM SIGGRAPH ASIA 2009 Sketches, (Yokohama, Japan), pp. 1–1, ACM, 2009.

[6] Z. Feng, B. Yang, Y. Zheng, Z. Wang, and Y. Li, "Research on 3D hand tracking using particle filtering," in ICNC '08: Proceedings of the 2008 Fourth International Conference on Natural Computation, (Washington, DC, USA), pp. 367–371, IEEE Computer Society, 2008.

[7] P. Mistry, "SixthSense: integrating information with the real world," 2009. http://www.pranavmistry.com/projects/sixthsense/, accessed on 2010, March 20.

[8] C. Harrison, D. Tan, and D. Morris, "Skinput: Appropriating the body as an input surface," 2010. http://www.chrisharrison.net/projects/skinput/, accessed on 2010, March 20.

[9] Oblong Industries Inc., "g-speak spatial operating environment," 2009. http://oblong.com/, accessed on 2010, March 20.

[10] O. Florez-Choque and E. Cuadros-Vargas, "Improving human computer interaction through spoken natural language," pp. 346–350, April 2007.

[11] Microsoft©, "Microsoft© Speech Technologies," 2010. http://www.microsoft.com/speech/default.aspx, accessed on 2010, March 20.

[12] Microsoft©, "Designing home technology for the future," 2009. http://www.microsoft.com/presspass/events/mshome/default.mspx, accessed on 2010, March 20.
