The Interaction


OVERVIEW

- Interaction models help us to understand what is going on in the interaction between user and system. They address the translations between what the user wants and what the system does.

- Ergonomics looks at the physical characteristics of the interaction and how these influence its effectiveness.

- The dialog between user and system is influenced by the style of the interface.

- The interaction takes place within a social and organizational context that affects both user and system.

3


3.1 INTRODUCTION

In the previous two chapters we have looked at the human and the computer respectively. However, in the context of this book, we are not concerned with them in isolation. We are interested in how the human user uses the computer as a tool to perform, simplify or support a task. In order to do this the user must communicate his requirements to the computer.

There are a number of ways in which the user can communicate with the system. At one extreme is batch input, in which the user provides all the information to the computer at once and leaves the machine to perform the task. This approach does involve an interaction between the user and computer but does not support many tasks well. At the other extreme are highly interactive input devices and paradigms, such as direct manipulation (see Chapter 4) and the applications of virtual reality (Chapter 20). Here the user is constantly providing instruction and receiving feedback. These are the types of interactive system we are considering.

In this chapter, we consider the communication between user and system: the interaction. We will look at some models of interaction that enable us to identify and evaluate components of the interaction, and at the physical, social and organizational issues that provide the context for it. We will also survey some of the different styles of interaction that are used and consider how well they support the user.

3.2 MODELS OF INTERACTION

In previous chapters we have seen the usefulness of models to help us to understand complex behavior and complex systems. Interaction involves at least two participants: the user and the system. Both are complex, as we have seen, and are very different from each other in the way that they communicate and view the domain and the task. The interface must therefore effectively translate between them to allow the interaction to be successful. This translation can fail at a number of points and for a number of reasons. The use of models of interaction can help us to understand exactly what is going on in the interaction and identify the likely root of difficulties. They also provide us with a framework to compare different interaction styles and to consider interaction problems.

We begin by considering the most influential model of interaction, Norman’s execution–evaluation cycle; then we look at another model which extends the ideas of Norman’s cycle. Both of these models describe the interaction in terms of the goals and actions of the user. We will therefore briefly discuss the terminology used and the assumptions inherent in the models, before describing the models themselves.


3.2.1 The terms of interaction

Traditionally, the purpose of an interactive system is to aid a user in accomplishing goals from some application domain. (Later in this book we will look at alternative interactions but this model holds for many work-oriented applications.) A domain defines an area of expertise and knowledge in some real-world activity. Some examples of domains are graphic design, authoring and process control in a factory. A domain consists of concepts that highlight its important aspects. In a graphic design domain, some of the important concepts are geometric shapes, a drawing surface and a drawing utensil. Tasks are operations to manipulate the concepts of a domain. A goal is the desired output from a performed task. For example, one task within the graphic design domain is the construction of a specific geometric shape with particular attributes on the drawing surface. A related goal would be to produce a solid red triangle centered on the canvas. An intention is a specific action required to meet the goal.

Task analysis involves the identification of the problem space (which we discussed in Chapter 1) for the user of an interactive system in terms of the domain, goals, intentions and tasks. We can use our knowledge of tasks and goals to assess the interactive system that is designed to support them. We discuss task analysis in detail in Chapter 15. The concepts used in the design of the system and the description of the user are separate, and so we can refer to them as distinct components, called the System and the User, respectively. The System and User are each described by means of a language that can express concepts relevant in the domain of the application. The System’s language we will refer to as the core language and the User’s language we will refer to as the task language. The core language describes computational attributes of the domain relevant to the System state, whereas the task language describes psychological attributes of the domain relevant to the User state.

The system is assumed to be some computerized application, in the context of this book, but the models apply equally to non-computer applications. It is also a common assumption that by distinguishing between user and system we are restricted to single-user applications. This is not the case. However, the emphasis is on the view of the interaction from a single user’s perspective. From this point of view, other users, such as those in a multi-party conferencing system, form part of the system.

3.2.2 The execution–evaluation cycle

Norman’s model of interaction is perhaps the most influential in Human–Computer Interaction, possibly because of its closeness to our intuitive understanding of the interaction between human user and computer [265]. The user formulates a plan of action, which is then executed at the computer interface. When the plan, or part of the plan, has been executed, the user observes the computer interface to evaluate the result of the executed plan, and to determine further actions.

The interactive cycle can be divided into two major phases: execution and evaluation. These can then be subdivided into further stages, seven in all. The stages in Norman’s model of interaction are as follows:

1. Establishing the goal.
2. Forming the intention.
3. Specifying the action sequence.
4. Executing the action.
5. Perceiving the system state.
6. Interpreting the system state.
7. Evaluating the system state with respect to the goals and intentions.

Each stage is, of course, an activity of the user. First the user forms a goal. This is the user’s notion of what needs to be done and is framed in terms of the domain, in the task language. It is liable to be imprecise and therefore needs to be translated into the more specific intention, and the actual actions that will reach the goal, before it can be executed by the user. The user perceives the new state of the system, after execution of the action sequence, and interprets it in terms of his expectations. If the system state reflects the user’s goal then the computer has done what he wanted and the interaction has been successful; otherwise the user must formulate a new goal and repeat the cycle.

Norman uses a simple example of switching on a light to illustrate this cycle. Imagine you are sitting reading as evening falls. You decide you need more light; that is, you establish the goal to get more light. From there you form an intention to switch on the desk lamp, and you specify the actions required, to reach over and press the lamp switch. If someone else is closer the intention may be different – you may ask them to switch on the light for you. Your goal is the same but the intention and actions are different. When you have executed the action you perceive the result, either the light is on or it isn’t, and you interpret this, based on your knowledge of the world. For example, if the light does not come on you may interpret this as indicating the bulb has blown or the lamp is not plugged into the mains, and you will formulate new goals to deal with this. If the light does come on, you will evaluate the new state according to the original goals – is there now enough light? If so, the cycle is complete. If not, you may formulate a new intention to switch on the main ceiling light as well.
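The lamp example can be traced stage by stage in a few lines of Python. This is only an illustrative sketch, not part of Norman's model: the `Lamp` class, the `run_cycle` function and the wording of the trace are hypothetical names introduced here, and the evaluation stage is simplified to "is the lamp lit?" rather than "is there now enough light?".

```python
from dataclasses import dataclass


@dataclass
class Lamp:
    """Hypothetical stand-in for the system: a desk lamp with one switch."""
    plugged_in: bool = True
    on: bool = False

    def press_switch(self):
        # The lamp can only change state if it has power.
        if self.plugged_in:
            self.on = not self.on


def run_cycle(lamp):
    """Walk once through the seven stages and return a trace of each one."""
    trace = []

    # Execution phase
    goal = "get more light"                      # 1. establish the goal
    trace.append("goal: " + goal)
    intention = "switch on the desk lamp"        # 2. form the intention
    trace.append("intention: " + intention)
    actions = [lamp.press_switch]                # 3. specify the action sequence
    trace.append("actions: reach over and press the switch")
    for action in actions:                       # 4. execute the action
        action()
    trace.append("executed")

    # Evaluation phase
    lit = lamp.on                                # 5. perceive the system state
    trace.append("perceived: lamp lit" if lit else "perceived: lamp dark")
    if lit:                                      # 6. interpret the system state
        trace.append("interpreted: the switch worked")
    else:
        trace.append("interpreted: bulb blown or lamp unplugged")
    # 7. evaluate the state against the goal (simplified: lit means satisfied)
    trace.append("evaluated: goal met" if lit else "evaluated: new goal needed")
    return trace


working = run_cycle(Lamp())
unplugged = run_cycle(Lamp(plugged_in=False))
```

With a working lamp the trace ends in "evaluated: goal met"; with the lamp unplugged, the same goal, intention and actions lead to "evaluated: new goal needed", which is the point at which the user would start a new cycle.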

Norman uses this model of interaction to demonstrate why some interfaces cause problems to their users. He describes these in terms of the gulfs of execution and the gulfs of evaluation. As we noted earlier, the user and the system do not use the same terms to describe the domain and goals – remember that we called the language of the system the core language and the language of the user the task language. The gulf of execution is the difference between the user’s formulation of the actions to reach the goal and the actions allowed by the system. If the actions allowed by the system correspond to those intended by the user, the interaction will be effective. The interface should therefore aim to reduce this gulf.

The gulf of evaluation is the distance between the physical presentation of the system state and the expectation of the user. If the user can readily evaluate the presentation in terms of his goal, the gulf of evaluation is small. The more effort that is required on the part of the user to interpret the presentation, the less effective the interaction.

Norman’s model is a useful means of understanding the interaction, in a way that is clear and intuitive. It allows other, more detailed, empirical and analytic work to be placed within a common framework. However, it only considers the system as far as the interface. It concentrates wholly on the user’s view of the interaction. It does not attempt to deal with the system’s communication through the interface. An extension of Norman’s model, proposed by Abowd and Beale, addresses this problem [3]. This is described in the next section.

3.2.3 The interaction framework

The interaction framework attempts a more realistic description of interaction by including the system explicitly, and breaks it into four main components, as shown in Figure 3.1. The nodes represent the four major components in an interactive system – the System, the User, the Input and the Output. Each component has its own language. In addition to the User’s task language and the System’s core language, which we have already introduced, there are languages for both the Input and Output components. Input and Output together form the Interface.

As the interface sits between the User and the System, there are four steps in the interactive cycle, each corresponding to a translation from one component to another, as shown by the labeled arcs in Figure 3.2. The User begins the interactive cycle with the formulation of a goal and a task to achieve that goal. The only way the user can manipulate the machine is through the Input, and so the task must be articulated within the input language. The input language is translated into the core language as operations to be performed by the System. The System then transforms itself as described by the operations; the execution phase of the cycle is complete and the evaluation phase now begins. The System is in a new state, which must now be communicated to the User. The current values of system attributes are rendered as concepts or features of the Output. It is then up to the User to observe the Output and assess the results of the interaction relative to the original goal, ending the evaluation phase and, hence, the interactive cycle. There are four main translations involved in the interaction: articulation, performance, presentation and observation.

Human error – slips and mistakes

Human errors are often classified into slips and mistakes. We can distinguish these using Norman’s gulf of execution.

If you understand a system well you may know exactly what to do to satisfy your goals – you have formulated the correct action. However, perhaps you mistype or you accidentally press the mouse button at the wrong time. These are called slips; you have formulated the right action, but fail to execute that action correctly.

However, if you don’t know the system well you may not even formulate the right goal. For example, you may think that the magnifying glass icon is the ‘find’ function, but in fact it is to magnify the text. This is called a mistake.

If we discover that an interface is leading to errors it is important to understand whether they are slips or mistakes. Slips may be corrected by, for instance, better screen design, perhaps putting more space between buttons. However, mistakes need users to have a better understanding of the systems, so will require far more radical redesign or improved training, perhaps a totally different metaphor for use.

Figure 3.1 The general interaction framework

Figure 3.2 Translations between components

The User’s formulation of the desired task to achieve some goal needs to be articulated in the input language. The tasks are responses of the User and they need to be translated to stimuli for the Input. As pointed out above, this articulation is judged in terms of the coverage from tasks to input and the relative ease with which the translation can be accomplished. The task is phrased in terms of certain psychological attributes that highlight the important features of the domain for the User. If these psychological attributes map clearly onto the input language, then articulation of the task will be made much simpler. An example of a poor mapping, as pointed out by Norman, is a large room with overhead lighting controlled by a bank of switches. It is often desirable to control the lighting so that only one section of the room is lit. We are then faced with the puzzle of determining which switch controls which lights. The result is usually repeated trials and frustration. This arises from the difficulty of articulating a goal (for example, ‘Turn on the lights in the front of the room’) in an input language that consists of a linear row of switches, which may or may not be oriented to reflect the room layout.

Conversely, an example of a good mapping is in virtual reality systems, where input devices such as datagloves are specifically geared towards easing articulation by making the user’s psychological notion of gesturing an act that can be directly realized at the interface. Direct manipulation interfaces, such as those found on common desktop operating systems like the Macintosh and Windows, make the articulation of some file handling commands easier. On the other hand, some tasks, such as repetitive file renaming or launching a program whose icon is not visible, are not at all easy to articulate with such an interface.

At the next stage, the responses of the Input are translated to stimuli for the System. Of interest in assessing this translation is whether the translated input language can reach as many states of the System as is possible using the System stimuli directly. For example, the remote control units for some compact disc players do not allow the user to turn the power off on the player unit; hence the off state of the player cannot be reached using the remote control’s input language. On the panel of the compact disc player, however, there is usually a button that controls the power. The ease with which this translation from Input to System takes place is of less importance because the effort is not expended by the user. However, there can be a real effort expended by the designer and programmer. In this case, the ease of the translation is viewed in terms of the cost of implementation.
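One way to make this coverage question concrete is to treat the System as a small state machine and ask which states each input language can reach. The sketch below is an assumption made for illustration only: the state names, the command names and the transition table describe a hypothetical compact disc player, not any real device.

```python
def reachable_states(start, transitions, commands):
    """Return every state reachable from `start` using only `commands`."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for command in commands:
            nxt = transitions.get((state, command))
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


# Hypothetical core language of the player: (state, command) -> new state.
TRANSITIONS = {
    ("stopped", "play"): "playing",
    ("playing", "pause"): "paused",
    ("paused", "play"): "playing",
    ("playing", "stop"): "stopped",
    ("paused", "stop"): "stopped",
    ("stopped", "power"): "off",
    ("playing", "power"): "off",
    ("paused", "power"): "off",
    ("off", "power"): "stopped",
}

# Two input languages: the front panel has a power button, the remote does not.
PANEL = {"play", "pause", "stop", "power"}
REMOTE = {"play", "pause", "stop"}

panel_states = reachable_states("stopped", TRANSITIONS, PANEL)
remote_states = reachable_states("stopped", TRANSITIONS, REMOTE)

# States the System supports but the remote's input language cannot reach.
missing = panel_states - remote_states
```

Here `missing` contains only the `"off"` state, which is the coverage gap described above for the remote control.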

Once a state transition has occurred within the System, the execution phase of the interaction is complete and the evaluation phase begins. The new state of the System must be communicated to the User, and this begins by translating the System responses to the transition into stimuli for the Output component. This presentation translation must preserve the relevant system attributes from the domain in the limited expressiveness of the output devices. The ability to capture the domain concepts of the System within the Output is a question of expressiveness for this translation.

For example, while writing a paper with some word-processing package, it is necessary at times to see both the immediate surrounding text where one is currently composing, say, the current paragraph, and a wider context within the whole paper that cannot be easily displayed on one screen (for example, the current chapter).

Ultimately, the user must interpret the output to evaluate what has happened. The response from the Output is translated to stimuli for the User which trigger assessment. The observation translation will address the ease and coverage of this final translation. For example, it is difficult to tell the time accurately on an unmarked analog clock, especially if it is not oriented properly. It is difficult in a command line interface to determine the result of copying and moving files in a hierarchical file system. Developing a website using a markup language like HTML would be virtually impossible without being able to preview the output through a browser.


Assessing overall interaction

The interaction framework is presented as a means to judge the overall usability of an entire interactive system. In reality, all of the analysis that is suggested by the framework is dependent on the current task (or set of tasks) in which the User is engaged. This is not surprising since it is only in attempting to perform a particular task within some domain that we are able to determine if the tools we use are adequate. For example, different text editors are better at different things. For a particular editing task, one can choose the text editor best suited for interaction relative to the task. The best editor, if we are forced to choose only one, is the one that best suits the tasks most frequently performed. Therefore, it is not too disappointing that we cannot extend the interaction analysis beyond the scope of a particular task.

DESIGN FOCUS

Video recorder

A simple example of programming a VCR from a remote control shows that all four translations in the interaction cycle can affect the overall interaction. Ineffective interaction is indicated by the user not being sure the VCR is set to record properly. This could be because the user has pressed the keys on the remote control unit in the wrong order; this can be classified as an articulatory problem. Or maybe the VCR is able to record on any channel but the remote control lacks the ability to select channels, indicating a coverage problem for the performance translation. It may be the case that the VCR display panel does not indicate that the program has been set, a presentation problem. Or maybe the user does not interpret the feedback properly, an observational error. Any one or more of these deficiencies would give rise to ineffective interaction.

3.3 FRAMEWORKS AND HCI

As well as providing a means of discussing the details of a particular interaction, frameworks provide a basis for discussing other issues that relate to the interaction. The ACM SIGCHI Curriculum Development Group presents a framework similar to that presented here, and uses it to place different areas that relate to HCI [9].

In Figure 3.3 these aspects are shown as they relate to the interaction framework. In particular, the field of ergonomics addresses issues on the user side of the interface, covering both input and output, as well as the user’s immediate context. Dialog design and interface styles can be placed particularly along the input branch of the framework, addressing both articulation and performance. However, dialog is most usually associated with the computer and so is biased to that side of the framework.


Presentation and screen design relate to the output branch of the framework. The entire framework can be placed within a social and organizational context that also affects the interaction. Each of these areas has important implications for the design of interactive systems and the performance of the user. We will discuss these in brief in the following sections, with the exception of screen design, which we will save until Chapter 5.

3.4 ERGONOMICS

Ergonomics (or human factors) is traditionally the study of the physical characteristics of the interaction: how the controls are designed, the physical environment in which the interaction takes place, and the layout and physical qualities of the screen. A primary focus is on user performance and how the interface enhances or detracts from this. In seeking to evaluate these aspects of the interaction, ergonomics will certainly also touch upon human psychology and system constraints. It is a large and established field, which is closely related to but distinct from HCI, and full coverage would demand a book in its own right. Here we consider a few of the issues addressed by ergonomics as an introduction to the field. We will briefly look at the arrangement of controls and displays, the physical environment, health issues and the use of color. These are by no means exhaustive and are intended only to give an indication of the types of issues and problems addressed by ergonomics. For more information on ergonomic issues the reader is referred to the recommended reading list at the end of the chapter.

Figure 3.3 A framework for human–computer interaction. Adapted from ACM SIGCHI Curriculum Development Group [9]

3.4.1 Arrangement of controls and displays

In Chapter 1 we considered perceptual and cognitive issues that affect the way we present information on a screen and provide control mechanisms to the user. In addition to these cognitive aspects of design, physical aspects are also important. Sets of controls and parts of the display should be grouped logically to allow rapid access by the user (more on this in Chapter 5). This may not seem so important when we are considering a single user of a spreadsheet on a PC, but it becomes vital when we turn to safety-critical applications such as plant control, aviation and air traffic control. In each of these contexts, users are under pressure and are faced with a huge range of displays and controls. Here it is crucial that the physical layout of these be appropriate. Indeed, returning to the less critical PC application, inappropriate placement of controls and displays can lead to inefficiency and frustration. For example, on one particular electronic newsreader, used by one of the authors, the command key to read articles from a newsgroup (y) is directly beside the command key to unsubscribe from a newsgroup (u) on the keyboard. This poor design frequently leads to inadvertent removal of newsgroups. Although this is recoverable it wastes time and is annoying to the user. We saw similar examples in the Introduction to this book including the MacOS X dock. We can therefore see that appropriate layout is important in all applications.

We have already touched on the importance of grouping controls together logically (and keeping opposing controls separate). The exact organization that this will suggest will depend on the domain and the application, but possible organizations include the following:

functional: controls and displays are organized so that those that are functionally related are placed together;

sequential: controls and displays are organized to reflect the order of their use in a typical interaction (this may be especially appropriate in domains where a particular task sequence is enforced, such as aviation);

frequency: controls and displays are organized according to how frequently they are used, with the most commonly used controls being the most easily accessible.
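Two of these organizations are simple enough to sketch in code. The example below is a hypothetical illustration: the control names, their functional categories and the usage log are invented, and a real layout decision would weigh many more factors than a usage count.

```python
from collections import Counter


def group_by_function(controls):
    """Functional organization: collect controls under their shared function."""
    groups = {}
    for name, function in controls:
        groups.setdefault(function, []).append(name)
    return groups


def order_by_frequency(usage_log):
    """Frequency organization: most used control first, ties broken by name."""
    counts = Counter(usage_log)
    return sorted(counts, key=lambda name: (-counts[name], name))


# Hypothetical controls tagged with the function they serve.
controls = [("copy", "edit"), ("paste", "edit"), ("save", "file"), ("print", "file")]

# Hypothetical record of which control was used, in the order it was used.
usage_log = ["save", "copy", "paste", "copy", "paste", "paste", "print"]

groups = group_by_function(controls)
ordering = order_by_frequency(usage_log)
```

In this sketch `groups` places `copy` with `paste` and `save` with `print`, while `ordering` puts `paste` first because it was used most often. A sequential organization would instead follow the order of a typical task, which a usage count alone cannot supply.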

In addition to the organization of the controls and displays in relation to each other, the entire system interface must be arranged appropriately in relation to the user’s position. So, for example, the user should be able to reach all controls necessary and view all displays without excessive body movement. Critical displays should be at eye level. Lighting should be arranged to avoid glare and reflection distorting displays. Controls should be spaced to provide adequate room for the user to manoeuvre.


DESIGN FOCUS

Industrial interfaces

The interfaces to office systems have changed dramatically since the 1980s. However, some care is needed in transferring the idioms of office-based systems into the industrial domain. Office information is primarily textual and slow varying, whereas industrial interfaces may require the rapid assimilation of multiple numeric displays, each of which is varying in response to the environment. Furthermore, the environmental conditions may rule out certain interaction styles (for example, the oil-soaked mouse). Consequently, industrial interfaces raise some additional design issues rarely encountered in the office.

Glass interfaces vs. dials and knobs

The traditional machine interface consists of dials and knobs directly wired or piped to the equipment. Increasingly, some or all of the controls are replaced with a glass interface, a computer screen through which the equipment is monitored and controlled. Many of the issues are similar for the two kinds of interface, but glass interfaces do have some special advantages and problems. For a complex system, a glass interface can be both cheaper and more flexible, and it is easy to show the same information in multiple forms (Figure 3.4). For example, a data value might be given both in a precise numeric field and also in a quick to assimilate graphical form. In addition, the same information can be shown on several screens. However, the information is not located in physical space and so vital clues to context are missing – it is easy to get lost navigating complex menu systems. Also, limited display resolution often means that an electronic representation of a dial is harder to read than its physical counterpart; in some circumstances both may be necessary, as is the case on the flight deck of a modern aircraft.

Figure 3.4 Multiple representations of the same information

Indirect manipulation

The phrase ‘direct manipulation’ dominates office system design (Figure 3.5). There are arguments about its meaning and appropriateness even there, but it is certainly dependent on the user being in primary control of the changes in the interface. The autonomous nature of industrial processes makes this an inappropriate model. In a direct manipulation system, the user interacts with an artificial world inside the computer (for example, the electronic desktop).

In contrast, an industrial interface is merely an intermediary between the operator and the real world. One implication of this indirectness is that the interface must provide feedback at two levels (Figure 3.6). At one level, the user must receive immediate feedback, generated by the interface, that keystrokes and other actions have been received. In addition, the user’s actions will have some effect on the equipment controlled by the interface and adequate monitoring must be provided for this.

The indirectness also causes problems with simple monitoring tasks. Delays due to periodic sampling, slow communication and digital processing often mean that the data displayed are somewhat out of date. If the operator is not aware of these delays, diagnoses of system state may be wrong. These problems are compounded if the interface produces summary information displays. If the data comprising such a display are of different timeliness the result may be misleading.

Figure 3.5 Office system – direct manipulation

Figure 3.6 Indirect manipulation – two kinds of feedback

3.4.2 The physical environment of the interaction

As well as addressing physical issues in the layout and arrangement of the machine interface, ergonomics is concerned with the design of the work environment itself. Where will the system be used? By whom will it be used? Will users be sitting, standing or moving about? Again, this will depend largely on the domain and will be more critical in specific control and operational settings than in general computer use. However, the physical environment in which the system is used may influence how well it is accepted and even the health and safety of its users. It should therefore be considered in all design.

The first consideration here is the size of the users. Obviously this is going to vary considerably. However, in any system the smallest user should be able to reach all the controls (this may include a user in a wheelchair), and the largest user should not be cramped in the environment.

In particular, all users should be comfortably able to see critical displays. For long periods of use, the user should be seated for comfort and stability. Seating should provide back support. If required to stand, the user should have room to move around in order to reach all the controls.


3.4.3 Health issues

Perhaps we do not immediately think of computer use as a hazardous activity but we should bear in mind possible consequences of our designs on the health and safety of users. Leaving aside the obvious safety risks of poorly designed safety-critical systems (aircraft crashing, nuclear plant leaks and worse), there are a number of factors that may affect the use of more general computers. Again these are factors in the physical environment that directly affect the quality of the interaction and the user’s performance:

Physical position: As we noted in the previous section, users should be able to reach all controls comfortably and see all displays. Users should not be expected to stand for long periods and, if sitting, should be provided with back support. If a particular position for a part of the body is to be adopted for long periods (for example, in typing) support should be provided to allow rest.

Temperature: Although most users can adapt to slight changes in temperature without adverse effect, extremes of hot or cold will affect performance and, in excessive cases, health. Experimental studies show that performance deteriorates at high or low temperatures, with users being unable to concentrate efficiently.

Lighting: The lighting level will again depend on the work environment. However, adequate lighting should be provided to allow users to see the computer screen without discomfort or eyestrain. The light source should also be positioned to avoid glare affecting the display.

Noise: Excessive noise can be harmful to health, causing the user pain, and in acute cases, loss of hearing. Noise levels should be maintained at a comfortable level in the work environment. This does not necessarily mean no noise at all. Noise can be a stimulus to users and can provide needed confirmation of system activity.

Time: The time users spend using the system should also be controlled. As we saw in the previous chapter, it has been suggested that excessive use of CRT displays can be harmful to users, particularly pregnant women.

3.4.4 The use of color

In this section we have concentrated on the ergonomics of physical characteristics of systems, including the physical environment in which they are used. However, ergonomics has a close relationship to human psychology in that it is also concerned with the perceptual limitations of humans. For example, the use of color in displays is an ergonomics issue. As we saw in Chapter 1, the visual system has some limitations with regard to color, including the number of colors that are distinguishable and the relatively low blue acuity. We also saw that a relatively high proportion of the population suffers from a deficiency in color vision. Each of these psychological phenomena leads to ergonomic guidelines; some examples are discussed below.

Colors used in the display should be as distinct as possible and the distinction should not be affected by changes in contrast. Blue should not be used to display critical information. If color is used as an indicator it should not be the only cue: additional coding information should be included.

The colors used should also correspond to common conventions and user expecta-tions. Red, green and yellow are colors frequently associated with stop, go andstandby respectively. Therefore, red may be used to indicate emergency and alarms;green, normal activity; and yellow, standby and auxiliary function. These conven-tions should not be violated without very good cause.
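The two guidelines above (follow color conventions, and never let color be the only cue) can be sketched in a few lines of code. This is a minimal illustration only: the `STATUS_STYLES` table and the `render_status` function are hypothetical names invented here, not part of any toolkit.

```python
# Hypothetical status table: each state gets its conventional color
# plus a redundant text label, so color is never the only cue.
STATUS_STYLES = {
    "alarm":   {"color": "red",    "label": "ALARM"},
    "normal":  {"color": "green",  "label": "OK"},
    "standby": {"color": "yellow", "label": "STANDBY"},
}

def render_status(state):
    """Return (color, text) for a state; the text alone identifies the state."""
    style = STATUS_STYLES[state]
    return style["color"], f"[{style['label']}] {state}"
```

Because every state also carries a distinct label, a user with a color vision deficiency, or one viewing a monochrome display, can still tell the states apart.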

However, we should remember that color conventions are culturally determined. For example, red is associated with danger and warnings in most western cultures, but in China it symbolizes happiness and good fortune. The color of mourning is black in some cultures and white in others. Awareness of the cultural associations of color is particularly important in designing systems and websites for a global market. We will return to these issues in more detail in Chapter 10.

3.4.5 Ergonomics and HCI

Ergonomics is a huge area, which is distinct from HCI but sits alongside it. Its contribution to HCI is in determining constraints on the way we design systems and suggesting detailed and specific guidelines and standards. Ergonomic factors are in general well established and understood and are therefore used as the basis for standardizing hardware designs. This issue is discussed further in Chapter 7.

INTERACTION STYLES

Interaction can be seen as a dialog between the computer and the user. The choice of interface style can have a profound effect on the nature of this dialog. Dialog design is discussed in detail in Chapter 16. Here we introduce the most common interface styles and note the different effects these have on the interaction. There are a number of common interface styles including

- command line interface
- menus
- natural language
- question/answer and query dialog
- form-fills and spreadsheets
- WIMP
- point and click
- three-dimensional interfaces.

As the WIMP interface is the most common and complex, we will discuss each of its elements in greater detail in Section 3.6.


3.5.1 Command line interface

The command line interface (Figure 3.7) was the first interactive dialog style to be commonly used and, in spite of the availability of menu-driven interfaces, it is still widely used. It provides a means of expressing instructions to the computer directly, using function keys, single characters, abbreviations or whole-word commands. In some systems the command line is the only way of communicating with the system, especially for remote access using telnet. More commonly today it is supplementary to menu-based interfaces, providing accelerated access to the system’s functionality for experienced users.

Command line interfaces are powerful in that they offer direct access to system functionality (as opposed to the hierarchical nature of menus), and can be combined to apply a number of tools to the same data. They are also flexible: the command often has a number of options or parameters that will vary its behavior in some way, and it can be applied to many objects at once, making it useful for repetitive tasks. However, this flexibility and power brings with it difficulty in use and learning. Commands must be remembered, as no cue is provided in the command line to indicate which command is needed. They are therefore better for expert users than for novices. This problem can be alleviated a little by using consistent and meaningful commands and abbreviations. The commands used should be terms within the vocabulary of the user rather than the technician. Unfortunately, commands are often obscure and vary across systems, causing confusion to the user and increasing the overhead of learning.
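To make the idea of options and multiple objects concrete, here is a small sketch of a command built with Python's standard `argparse` module. The command name `wcount`, its `-w` option and the helper functions are all hypothetical, chosen only for illustration.

```python
import argparse

def build_parser():
    # Hypothetical command 'wcount': one command whose options vary its behavior.
    parser = argparse.ArgumentParser(prog="wcount")
    parser.add_argument("-w", "--words", action="store_true",
                        help="count words instead of lines")
    parser.add_argument("files", nargs="+",
                        help="one or more files to process")
    return parser

def count(text, words=False):
    """Count the words or the lines in a string."""
    return len(text.split()) if words else len(text.splitlines())

def main(argv=None):
    args = build_parser().parse_args(argv)
    for name in args.files:  # the same command applied to many objects at once
        with open(name, encoding="utf-8") as handle:
            print(name, count(handle.read(), words=args.words))
```

If this were saved as a script whose last line calls `main()`, an invocation such as `wcount -w a.txt b.txt` would count the words in both files. Note that nothing on the command line itself tells the user that `-w` exists, which is exactly the recall burden described above.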

    sable.soc.staffs.ac.uk> javac HelloWorldApp
    javac: invalid argument: HelloWorldApp
    use: javac [-g][-O][-classpath path][-d dir] file.java…
    sable.soc.staffs.ac.uk> javac HelloWorldApp.java
    sable.soc.staffs.ac.uk> java HelloWorldApp
    Hello world!!
    sable.soc.staffs.ac.uk>

Figure 3.7 Command line interface

3.5.2 Menus

In a menu-driven interface, the set of options available to the user is displayed on the screen, and selected using the mouse, or numeric or alphabetic keys. Since the options are visible they are less demanding of the user, relying on recognition rather than recall. However, menu options still need to be meaningful and logically grouped to aid recognition. Often menus are hierarchically ordered and the option required is not available at the top layer of the hierarchy. The grouping and naming of menu options then provides the only cue for the user to find the required option. Such systems either can be purely text based, with the menu options being presented as numbered choices (see Figure 3.8), or may have a graphical component in which the menu appears within a rectangular box and choices are made, perhaps by typing the initial letter of the desired selection, or by entering the associated number, or by moving around the menu with the arrowkeys. This is a restricted form of a full WIMP system, described in more detail shortly.
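A purely text-based menu of the kind just described can be sketched as follows. The functions `render_menu` and `select` are hypothetical, and the option table simply reuses the entries of Figure 3.8; selection is accepted either by number or by an unambiguous initial letter or prefix.

```python
def render_menu(title, options):
    """Return the menu as text; options maps a key such as '1' to a label."""
    lines = [title, ""]
    lines += [f"{key}. {label}" for key, label in options.items()]
    return "\n".join(lines)

def select(options, typed):
    """Resolve typed input to a label, by number or by unambiguous prefix.

    Returns None when the input matches nothing or is ambiguous.
    """
    typed = typed.strip().lower()
    if typed in options:
        return options[typed]
    matches = [label for label in options.values()
               if typed and label.lower().startswith(typed)]
    return matches[0] if len(matches) == 1 else None

# Options taken from Figure 3.8.
PAYMENT = {"1": "cash", "2": "check", "3": "credit card",
           "4": "invoice", "9": "abort transaction"}
```

With these options, typing `3` or `cr` selects credit card, while `c` alone is rejected because three labels begin with that letter. This is one reason numbered choices are common in text menus.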

3.5.3 Natural language

Perhaps the most attractive means of communicating with computers, at least at first glance, is by natural language. Users, unable to remember a command or lost in a hierarchy of menus, may long for the computer that is able to understand instructions expressed in everyday words! Natural language understanding, both of speech and written input, is the subject of much interest and research. Unfortunately, however, the ambiguity of natural language makes it very difficult for a machine to understand. Language is ambiguous at a number of levels. First, the syntax, or structure, of a phrase may not be clear. If we are given the sentence

The boy hit the dog with the stick

we cannot be sure whether the boy is using the stick to hit the dog or whether the dog is holding the stick when it is hit.

Even if a sentence’s structure is clear, we may find ambiguity in the meaning of the words used. For example, the word ‘pitch’ may refer to a sports field, a throw, a waterproofing substance or even, colloquially, a territory. We often rely on the context and our general knowledge to sort out these ambiguities. This information is difficult to provide to the machine. To complicate matters more, the use of pronouns and relative terms adds further ambiguity.

    PAYMENT DETAILS                P3-7

    please select payment method:
      1. cash
      2. check
      3. credit card
      4. invoice

      9. abort transaction

Figure 3.8 Menu-driven interface


Given these problems, it seems unlikely that a general natural language interface will be available for some time. However, systems can be built to understand restricted subsets of a language. For a known and constrained domain, the system can be provided with sufficient information to disambiguate terms. It is important in interfaces which use natural language in this restricted form that the user is aware of the limitations of the system and does not expect too much understanding.
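As a rough sketch of what a restricted subset might look like, the following toy interpreter accepts only a handful of verb-plus-filename phrasings in an imagined file-management domain. The vocabulary, the pattern and the function name `interpret` are all assumptions made for illustration; real restricted-language systems are considerably more elaborate.

```python
import re

# Hypothetical restricted vocabulary for a file-management domain.
VERBS = {"show": "SHOW", "display": "SHOW",
         "delete": "DELETE", "remove": "DELETE"}

# Accepts: [please] <verb> [the] [file] <name>
PATTERN = re.compile(
    r"^(?:please\s+)?(\w+)\s+(?:the\s+)?(?:file\s+)?(\S+)$",
    re.IGNORECASE,
)

def interpret(sentence):
    """Return (action, filename), or None if the phrase is outside the subset."""
    match = PATTERN.match(sentence.strip())
    if not match or match.group(1).lower() not in VERBS:
        return None
    return VERBS[match.group(1).lower()], match.group(2)
```

A phrase such as "Please show the file report.txt" is understood, but an equally natural request such as "could you get rid of notes.txt" is not, so the user still has to learn which phrasings the system accepts.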

The use of natural language in restricted domains is relatively successful, but it is debatable whether this can really be called natural language. The user still has to learn which phrases the computer understands and may become frustrated if too much is expected. However, it is also not clear how useful a general natural language interface would be. Language is by nature vague and imprecise: this gives it its flexibility and allows creativity in expression. Computers, on the other hand, require precise instructions. Given a free rein, would we be able to describe our requirements precisely enough to guarantee a particular response? And, if we could, would the language we used turn out to be a restricted subset of natural language anyway?

3.5.4 Question/answer and query dialog

Question and answer dialog is a simple mechanism for providing input to an application in a specific domain. The user is asked a series of questions (mainly with yes/no responses, multiple choice, or codes) and so is led through the interaction step by step. An example of this would be web questionnaires.

These interfaces are easy to learn and use, but are limited in functionality and power. As such, they are appropriate for restricted domains (particularly information systems) and for novice or casual users.

Query languages, on the other hand, are used to construct queries to retrieve information from a database. They use natural-language-style phrases, but in fact require specific syntax, as well as knowledge of the database structure. Queries usually require the user to specify an attribute or attributes for which to search the database, as well as the attributes of interest to be displayed. This is straightforward where there is a single attribute, but becomes complex when multiple attributes are involved, particularly if the user is interested in attribute A or attribute B, or attribute A and not attribute B, or where values of attributes are to be compared. Most query languages do not provide direct confirmation of what was requested, so that the only validation the user has is the result of the search. The effective use of query languages therefore requires some experience. A specialized example is the web search engine.
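The combinations mentioned above (A or B, A and not B, and comparison of values) can be illustrated with SQL, used here as one representative query language through Python's standard `sqlite3` module. The `staff` table, its columns and its rows are invented for this sketch.

```python
import sqlite3

# A small in-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, grade INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", [
    ("Ann", "sales", 3), ("Bob", "sales", 5),
    ("Cy", "design", 5), ("Di", "design", 2),
])

def names(where):
    """Return the sorted names matching a WHERE clause.

    The clause is hard-coded in this sketch; user input should never be
    spliced into SQL text like this.
    """
    rows = conn.execute(f"SELECT name FROM staff WHERE {where} ORDER BY name")
    return [row[0] for row in rows]

a_or_b = names("dept = 'sales' OR grade = 5")           # attribute A or B
a_and_not_b = names("dept = 'sales' AND NOT grade = 5")  # A and not B
compared = names("grade >= 5")                           # comparing values
```

The phrases read a little like English, yet the user must know the exact keywords, the quoting rules, and the table and column names. As the text notes, the only confirmation of what was asked is the list of results that comes back.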

3.5.5 Form-fills and spreadsheets

Form-filling interfaces are used primarily for data entry but can also be useful in data retrieval applications. The user is presented with a display resembling a paper form, with slots to fill in (see Figure 3.9). Often the form display is based upon an actual form with which the user is familiar, which makes the interface easier to use. The user works through the form, filling in appropriate values. The data are then entered into the application in the correct place. Most form-filling interfaces allow easy movement around the form and allow some fields to be left blank. They also require correction facilities, as users may change their minds or make a mistake about the value that belongs in each field. The dialog style is useful primarily for data entry applications and, as it is easy to learn and use, for novice users. However, assuming a design that allows flexible entry, form filling is also appropriate for expert users.

Spreadsheets are a sophisticated variation of form filling. The spreadsheet comprises a grid of cells, each of which can contain a value or a formula (see Figure 3.10). The formula can involve the values of other cells (for example, the total of all cells in this column). The user can enter and alter values and formulae in any order and the system will maintain consistency amongst the values displayed, ensuring that all formulae are obeyed. The user can therefore manipulate values to see the effects of changing different parameters. Spreadsheets are an attractive medium for interaction: the user is free to manipulate values at will and the distinction between input and output is blurred, making the interface more flexible and natural.
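The way a spreadsheet keeps formulae consistent with the values they depend on can be sketched with a very small model. The `Sheet` class below is a hypothetical illustration: formulae are Python functions evaluated on demand, and there is no cycle detection or caching as a real spreadsheet would need.

```python
class Sheet:
    """A toy spreadsheet: each cell holds a plain value or a formula."""

    def __init__(self):
        self.cells = {}

    def set(self, name, content):
        # content is either a value or a function taking the sheet.
        self.cells[name] = content

    def value(self, name):
        # Formulae are re-evaluated on every read, so displayed values
        # always agree with the current contents of the other cells.
        content = self.cells[name]
        return content(self) if callable(content) else content

sheet = Sheet()
sheet.set("A3", lambda s: s.value("A1") + s.value("A2"))  # formula entered first
sheet.set("A1", 10)
sheet.set("A2", 32)
total_before = sheet.value("A3")
sheet.set("A1", 20)            # alter an input...
total_after = sheet.value("A3")  # ...and the total follows
```

Entering the formula before the values it refers to causes no difficulty, which reflects the freedom to work in any order described above.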

Figure 3.9 A typical form-filling interface. Screen shot frame reprinted by permission from Microsoft Corporation


3.5.6 The WIMP interface

Currently many common environments for interactive computing are examples of the WIMP interface style, often simply called windowing systems. WIMP stands for windows, icons, menus and pointers (sometimes windows, icons, mice and pull-down menus), and is the default interface style for the majority of interactive computer systems in use today, especially in the PC and desktop workstation arena. Examples of WIMP interfaces include Microsoft Windows for IBM PC compatibles, MacOS for Apple Macintosh compatibles and various X Windows-based systems for UNIX.

Figure 3.10 A typical spreadsheet

Mixing styles

The UNIX windowing environments are interesting as the contents of many of the windows are often themselves simply command line or character-based programs (see Figure 3.11). In fact, this mixing of interface styles in the same system is quite common, especially where older legacy systems are used at the same time as more modern applications. It can be a problem if users attempt to use commands and methods suitable for one environment in another. On the Apple Macintosh, HyperCard uses a point-and-click style. However, HyperCard stack buttons look very like Macintosh folders. If you double click on them, as you would to open a folder, your two mouse clicks are treated as separate actions. The first click opens the stack (as you wanted), but the second is then interpreted in the context of the newly opened stack, behaving in an apparently arbitrary fashion! This is an example of the importance of consistency in the interface, an issue we shall return to in Chapter 7.


3.5.7 Point-and-click interfaces

In most multimedia systems and in web browsers, virtually all actions take only a single click of the mouse button. You may point at a city on a map and when you click a window opens, showing you tourist information about the city. You may point at a word in some text and when you click you see a definition of the word. You may point at a recognizable iconic button and when you click some action is performed.

This point-and-click interface style is obviously closely related to the WIMP style. It clearly overlaps in the use of buttons, but may also include other WIMP elements. However, the philosophy is simpler and more closely tied to ideas of hypertext. In addition, the point-and-click style is not tied to mouse-based interfaces, and is also extensively used in touchscreen information systems. In this case, it is often combined with a menu-driven interface.

The point-and-click style has been popularized by world wide web pages, which incorporate all the above types of point-and-click navigation: highlighted words, maps and iconic buttons.

3.5.8 Three-dimensional interfaces

There is an increasing use of three-dimensional effects in user interfaces. The most obvious example is virtual reality, but VR is only part of a range of 3D techniques available to the interface designer.

Figure 3.11 A typical UNIX windowing system – the OpenLook system. Source: Sun Microsystems, Inc.


The simplest technique is where ordinary WIMP elements, buttons, scroll bars, etc., are given a 3D appearance using shading, giving the appearance of being sculpted out of stone. By unstated convention, such interfaces have a light source at their top right. Where used judiciously, the raised areas are easily identifiable and can be used to highlight active areas (Figure 3.12). Unfortunately, some interfaces make indiscriminate use of sculptural effects, on every text area, border and menu, so all sense of differentiation is lost.

A more complex technique uses interfaces with 3D workspaces. The objects displayed in such systems are usually flat, but are displayed in perspective when at an angle to the viewer and shrink when they are ‘further away’. Figure 3.13 shows one such system, WebBook [57]. Notice how size, light and occlusion provide a sense of distance. Notice also that as objects get further away they take up less screen space. Three-dimensional workspaces give you extra space, but in a more natural way than iconizing windows.

Figure 3.12 Buttons in 3D say ‘press me’

Figure 3.13 WebBook – using 3D to make more space (Card S.K., Robertson G.G. and York W. (1996). The WebBook and the Web Forager: An Information workspace for the World-Wide Web. CHI96 Conference Proceedings, 111–17. Copyright © 1996 ACM, Inc. Reprinted by permission)

Finally, there are virtual reality and information visualization systems where the user can move about within a simulated 3D world. These are discussed in detail in Chapter 20.

These mechanisms overlap with other interaction styles, especially the use of sculptured elements in WIMP interfaces. However, there is a distinct interaction style for 3D interfaces in that they invite us to use our tacit abilities for the real world, and translate them into the electronic world. Novice users must learn that an oval area with a word or picture in it is a button to be pressed, but a 3D button says ‘push me’. Further, more complete 3D environments invite one to move within the virtual environment, rather than watch as a spectator.

DESIGN FOCUS

Navigation in 3D and 2D

We live in a three-dimensional world. So clearly 3D interfaces are good . . . or are they? Actually, our 3D stereo vision only works well close to us and after that we rely on cruder measures such as ‘this is in front of that’. We are good at moving objects around with our hands in three dimensions, rotating, turning them on their side. However, we walk around in two dimensions and do not fly. Not surprisingly, people find it hard to visualize and control movement in three dimensions.

Normally, we use gravity to give us a fixed direction in space. This is partly through the channels in the inner ear, but also largely through kinesthetic senses – feeling the weight of limbs. When we lose these senses it is easy to become disoriented and we can lose track of which direction is up: divers are trained to watch the direction their bubbles move and if buried in an avalanche you should spit and feel which direction the spittle flows.

Where humans have to navigate in three dimensions they need extra aids such as the artificial horizon in an airplane. Helicopters, where there are many degrees of freedom, are particularly difficult.

Even in the two-dimensional world of walking about we do not rely on neat Cartesian maps in our head. Instead we mostly use models of location such as ‘down the road near the church’ that rely on approximate topological understanding and landmarks. We also rely on properties of normal space, such as the ability to go backwards and the fact that things that are close can be reached quickly. When two-dimensional worlds are not like this, for example in a one-way traffic system or in a labyrinth, we have great difficulty [98].

When we design systems we should take into account how people navigate in the real world and use this to guide our navigation aids. For example, if we have a 3D interface or a virtual reality world we should normally show a ground plane and by default lock movement to be parallel to the ground. In information systems we can recruit our more network-based models of 2D space by giving landmarks and making it as easy to ‘step back’ as to go forwards (as with the web browser ‘back’ button).
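Locking movement to be parallel to the ground amounts to discarding the vertical component of each requested movement. A minimal sketch follows, assuming movement is represented as an (x, y, z) tuple and that the y axis points up; the function name `lock_to_ground` and that axis convention are assumptions for illustration.

```python
def lock_to_ground(move, up=(0.0, 1.0, 0.0)):
    """Remove the component of a 3D movement vector along 'up'.

    'up' is assumed to be a unit vector; the result is parallel to the
    ground plane perpendicular to it.
    """
    along_up = sum(m * u for m, u in zip(move, up))
    return tuple(m - along_up * u for m, u in zip(move, up))
```

For example, a requested movement of `(1.0, 2.0, 3.0)` becomes `(1.0, 0.0, 3.0)`: the user still travels forwards and sideways, but does not drift up off the ground plane unless the lock is deliberately released.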

See the book website for more about 3D vision: /e3/online/seeing-3D/


ELEMENTS OF THE WIMP INTERFACE

We have already noted the four key features of the WIMP interface that give it its name – windows, icons, pointers and menus – and we will now describe these in turn. There are also many additional interaction objects and techniques commonly used in WIMP interfaces, some designed for specific purposes and others more general. We will look at buttons, toolbars, palettes and dialog boxes. Most of these elements can be seen in Figure 3.14.

Together, these elements of the WIMP interfaces are called widgets, and they comprise the toolkit for interaction between user and system. In Chapter 8 we will describe windowing systems and interaction widgets more from the programmer’s perspective. There we will discover that though most modern windowing systems provide the same set of basic widgets, the ‘look and feel’ – how widgets are physically displayed and how users can interact with them to access their functionality – of different windowing systems and toolkits can differ drastically.

3.6.1 Windows

Windows are areas of the screen that behave as if they were independent terminals in their own right. A window can usually contain text or graphics, and can be moved or resized. More than one window can be on a screen at once, allowing separate tasks to be visible at the same time. Users can direct their attention to the different windows as they switch from one thread of work to another.

Figure 3.14 Elements of the WIMP interface – Microsoft Word 5.1 on an Apple Macintosh. Screen shot reprinted by permission from Apple Computer, Inc.

If one window overlaps the other, the back window is partially obscured, and then refreshed when exposed again. Overlapping windows can cause problems by obscuring vital information, so windows may also be tiled, when they adjoin but do not overlap each other. Alternatively, windows may be placed in a cascading fashion, where each new window is offset slightly from the previous window, typically to the right and below. In some systems this layout policy is fixed, in others it can be selected by the user.
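The two layout policies can be sketched as functions that compute window rectangles. Everything here is an illustrative assumption: rectangles are `(x, y, width, height)` tuples with the origin at the top left of the screen, and the signs of `dx` and `dy` decide the direction in which a cascade steps.

```python
def cascade(count, width, height, dx=24, dy=24):
    """Rectangles for a cascade: each window is offset by (dx, dy) from the last."""
    return [(i * dx, i * dy, width, height) for i in range(count)]

def tile(count, screen_w, screen_h):
    """Rectangles for side-by-side columns that adjoin but never overlap."""
    col_w = screen_w // count
    return [(i * col_w, 0, col_w, screen_h) for i in range(count)]

def overlaps(a, b):
    """True if two (x, y, w, h) rectangles share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah
```

Neighboring tiles share an edge but no area, so nothing is hidden, at the cost of each window being small. Cascaded windows keep their full size, but each one obscures most of the window behind it, leaving only a strip along two edges visible.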

Usually, windows have various things associated with them that increase their usefulness. Scrollbars are one such attachment, allowing the user to move the contents of the window up and down, or from side to side. This makes the window behave as if it were a real window onto a much larger world, where new information is brought into view by manipulating the scrollbars.

There is usually a title bar attached to the top of a window, identifying it to the user, and there may be special boxes in the corners of the window to aid resizing, closing, or making as large as possible. Each of these can be seen in Figure 3.15.

In addition, some systems allow windows within windows. For example, in Microsoft Office applications, such as Excel and Word, each application has its own window and then within this each document has a window. It is often possible to have different layout policies within the different application windows.

Figure 3.15 A typical window. Screen shot reprinted by permission from Apple Computer, Inc.


3.6.2 Icons

Windows can be closed and lost for ever, or they can be shrunk to some very reduced representation. A small picture is used to represent a closed window, and this representation is known as an icon. By allowing icons, many windows can be available on the screen at the same time, ready to be expanded to their full size by clicking on the icon. Shrinking a window to its icon is known as iconifying the window. When a user temporarily does not want to follow a particular thread of dialog, he can suspend that dialog by iconifying the window containing the dialog. The icon saves space on the screen and serves as a reminder to the user that he can subsequently resume the dialog by opening up the window. Figure 3.16 shows a few examples of some icons used in a typical windowing system (MacOS X).

Icons can also be used to represent other aspects of the system, such as a wastebasket for throwing unwanted files into, or various disks, programs or functions that are accessible to the user. Icons can take many forms: they can be realistic representations of the objects that they stand for, or they can be highly stylized. They can even be arbitrary symbols, but these can be difficult for users to interpret.

3.6.3 Pointers

The pointer is an important component of the WIMP interface, since the interaction style required by WIMP relies very much on pointing and selecting things such as icons. The mouse provides an input device capable of such tasks, although joysticks and trackballs are other alternatives, as we have previously seen in Chapter 2. The user is presented with a cursor on the screen that is controlled by the input device. A variety of pointer cursors are shown in Figure 3.17.

Figure 3.16 A variety of icons. Screen shot reprinted by permission from Apple Computer, Inc.


The different shapes of cursor are often used to distinguish modes, for example the normal pointer cursor may be an arrow, but change to cross-hairs when drawing a line. Cursors are also used to tell the user about system activity, for example a watch or hour-glass cursor may be displayed when the system is busy reading a file.

Pointer cursors are like icons, being small bitmap images, but in addition all cursors have a hot-spot, the location to which they point. For example, the three arrows at the start of Figure 3.17 each have a hot-spot at the top left, whereas the right-pointing hand on the second line has a hot-spot on its right. Sometimes the hot-spot is not clear from the appearance of the cursor, in which case users will find it hard to click on small targets. When designing your own cursors, make sure the image has an obvious hot-spot.
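The role of the hot-spot can be made concrete with a short sketch. It assumes a 16 by 16 pixel cursor image positioned by its top-left corner; the hot-spot offsets, the function names and the target rectangle are all hypothetical values chosen for illustration.

```python
ARROW_HOT_SPOT = (0, 0)   # tip at the top left of the image
HAND_HOT_SPOT = (15, 7)   # fingertip on the right edge (assumed geometry)

def pointed_at(cursor_x, cursor_y, hot_spot):
    """Screen pixel the cursor points at, given where its image is drawn."""
    return cursor_x + hot_spot[0], cursor_y + hot_spot[1]

def hits(target, point):
    """True if point lies inside the (x, y, w, h) target rectangle."""
    x, y, w, h = target
    return x <= point[0] < x + w and y <= point[1] < y + h
```

With the image drawn at (100, 50), the arrow points at (100, 50) but the hand points at (115, 57). A small 8 by 8 target at (110, 55) is therefore hit by the hand and missed by the arrow, even though both images cover it on screen, which is why an unclear hot-spot makes small targets hard to click.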

3.6.4 Menus

The last main feature of windowing systems is the menu, an interaction technique that is common across many non-windowing systems as well. A menu presents a choice of operations or services that can be performed by the system at a given time. In Chapter 1, we pointed out that our ability to recall information is inferior to our ability to recognize it from some visual cue. Menus provide information cues in the form of an ordered list of operations that can be scanned. This implies that the names used for the commands in the menu should be meaningful and informative.

The pointing device is used to indicate the desired option. As the pointer moves to the position of a menu item, the item is usually highlighted (by inverse video, or some similar strategy) to indicate that it is the potential candidate for selection. Selection usually requires some additional user action, such as pressing a button on the mouse that controls the pointer cursor on the screen or pressing some special key on the keyboard. Menus are inefficient when they have too many items, and so cascading menus are utilized, in which item selection opens up another menu adjacent to the item, allowing refinement of the selection. Several layers of cascading menus can be used.

Figure 3.17 A variety of pointer cursors. Source: Sun Microsystems, Inc.


The main menu can be visible to the user all the time, as a menu bar, and submenus can be pulled down or across from it upon request (Figure 3.18). Menu bars are often placed at the top of the screen (for example, MacOS) or at the top of each window (for example, Microsoft Windows). Alternatives include menu bars along one side of the screen, or even placed amongst the windows in the main ‘desktop’ area. Websites use a variety of menu bar locations, including top, bottom and either side of the screen. Alternatively, the main menu can be hidden and upon request it will pop up onto the screen. These pop-up menus are often used to present context-sensitive options, for example allowing one to examine properties of particular on-screen objects. In some systems they are also used to access more global actions when the mouse is depressed over the screen background.

Pull-down menus are dragged down from the title at the top of the screen, by moving the mouse pointer into the title bar area and pressing the button. Fall-down menus are similar, except that the menu automatically appears when the mouse pointer enters the title bar, without the user having to press the button. Some menus are pin-up menus, in that they can be ‘pinned’ to the screen, staying in place until explicitly asked to go away. Pop-up menus appear when a particular region of the screen, maybe designated by an icon, is selected, but they only stay as long as the mouse button is depressed.

Another approach to menu selection is to arrange the options in a circular fashion. The pointer appears in the center of the circle, and so there is the same distance to travel to any of the selections. This has the advantages that it is easier to select items, since they can each have a larger target area, and that the selection time for each item is the same, since the pointer is equidistant from them all. Compare this with a standard menu: remembering Fitts’ law from Chapter 1, we can see that it will take longer to select items near the bottom of the menu than at the top. However, these pie menus, as they are known [54], take up more screen space and are therefore less common in interfaces.
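The comparison can be made numerically. The sketch below assumes a commonly quoted form of Fitts' law, movement time = a + b log2(D/S + 1), where D is the distance to the target and S its size. The constants `a` and `b`, the item height, the pie radius and the wedge size are illustrative placeholders, not measured values.

```python
import math

def movement_time(distance, size, a=0.1, b=0.1):
    """Fitts' law estimate: a + b * log2(distance / size + 1)."""
    return a + b * math.log2(distance / size + 1)

def linear_menu_times(items, item_height=20):
    """Times for a pull-down menu with the pointer starting at its top edge.

    Item i is centered (i + 0.5) item heights below the starting point.
    """
    return [movement_time((i + 0.5) * item_height, item_height)
            for i in range(items)]

def pie_menu_times(items, radius=60, target=40):
    """Times for a pie menu: every wedge is the same distance from the center."""
    return [movement_time(radius, target) for _ in range(items)]
```

Under these assumptions the estimated time for a linear menu rises steadily from the first item to the last, whereas every item of the pie menu gets the same estimate. Larger wedges (a bigger `target`) reduce that estimate further, at the price of the extra screen space noted above.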

Figure 3.18 Pull-down menu


The major problems with menus in general are deciding what items to include and how to group those items. Including too many items makes menus too long or creates too many of them, whereas grouping causes problems in that items that relate to the same topic need to come under the same heading, yet many items could be grouped under more than one heading. In pull-down menus the menu label should be chosen to reflect the function of the menu items, and items grouped within menus by function. These groupings should be consistent across applications so that the user can transfer learning to new applications. Menu items should be ordered in the menu according to importance and frequency of use, and opposite functionalities (such as ‘save’ and ‘delete’) should be kept apart to prevent accidental selection of the wrong function, with potentially disastrous consequences.

3.6.5 Buttons

Buttons are individual and isolated regions within a display that can be selected by the user to invoke specific operations. These regions are referred to as buttons because they are purposely made to resemble the push buttons you would find on a control panel. ‘Pushing’ the button invokes a command, the meaning of which is usually indicated by a textual label or a small icon. Buttons can also be used to toggle between two states, displaying status information such as whether the current font is italicized or not in a word processor, or selecting options on a web form. Such toggle buttons can be grouped together to allow a user to select one feature from a set of mutually exclusive options, such as the size in points of the current font. These are called radio buttons, since the collection functions much like the old-fashioned mechanical control buttons on car radios. If a set of options is not mutually exclusive, such as font characteristics like bold, italics and underlining, then a set of toggle buttons can be used to indicate the on/off status of the options. This type of collection of buttons is sometimes referred to as check boxes.
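The difference between the two kinds of button group lies in the state they maintain, which the following sketch makes explicit. The class names `RadioGroup` and `CheckBoxGroup` are hypothetical and are not taken from any particular toolkit.

```python
class RadioGroup:
    """Mutually exclusive options: exactly one is selected at any time."""

    def __init__(self, options, selected):
        self.options = list(options)
        self.selected = selected

    def press(self, option):
        if option not in self.options:
            raise ValueError(f"unknown option: {option!r}")
        self.selected = option  # pressing one releases the previous choice


class CheckBoxGroup:
    """Independent options: each toggles on or off by itself."""

    def __init__(self, options):
        self.state = {option: False for option in options}

    def press(self, option):
        self.state[option] = not self.state[option]

    def on(self):
        return [option for option, is_on in self.state.items() if is_on]
```

Pressing `12` and then `14` in a radio group of font sizes leaves only `14` selected, whereas pressing `bold` and then `italic` in a check box group leaves both switched on.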

Keyboard accelerators

Menus often offer keyboard accelerators, key combinations that have the same effect as selecting the menu item. This allows more expert users, familiar with the system, to manipulate things without moving off the keyboard, which is often faster. The accelerators are often displayed alongside the menu items so that frequent use makes them familiar. Unfortunately most systems do not allow you to use the accelerators while the menu is displayed. So, for example, the menu might say

However, when the user presses function key F3 nothing happens. F3 only works when the menu is not displayed – when the menu is there you must press ‘F’ instead! This is an example of an interface that is dishonest (see also Chapter 7).


3.6.6 Toolbars

Many systems have a collection of small buttons, each with icons, placed at the top or side of the window and offering commonly used functions. The function of this toolbar is similar to a menu bar, but as the icons are smaller than the equivalent text more functions can be simultaneously displayed. Sometimes the content of the toolbar is fixed, but often users can customize it, either changing which functions are made available, or choosing which of several predefined toolbars is displayed.

DESIGN FOCUS

Learning toolbars

Although many applications now have toolbars, they are often underused because users simply do notknow what the icons represent. Once learned the meaning is often relatively easy to remember, butmost users do not want to spend time reading a manual, or even using online help to find out whateach button does – they simply reach for the menu.

There is an obvious solution – put the icons on the menus in the same way that accelerator keys arewritten there. So in the ‘Edit’ menu one might find the option

Imagine now selecting this. As the mouse drags down through the menu selections, each highlights inturn. If the mouse is dragged down the extreme left, the effect will be very similar to selecting the iconfrom the toolbar, except that it will be incidental to selecting the menu item. In this way, the toolbaricon will be naturally learned from normal menu interaction.

Selecting the menu option = selecting the icon

This trivial fix is based on accepted and tested knowledge of learning and has been described in more detail by one of the authors elsewhere [95]. Given its simplicity, this technique should clearly be used everywhere, but until recently was rare. However, it has now been taken up in the Office 97 suite and later Microsoft Office products, so perhaps will soon become standard.


3.6.7 Palettes

In many application programs, interaction can enter one of several modes. The defining characteristic of modes is that the interpretation of actions, such as keystrokes or gestures with the mouse, changes as the mode changes. For example, using the standard UNIX text editor vi, keystrokes can be interpreted either as operations to insert characters in the document (insert mode) or as operations to perform file manipulation (command mode). Problems occur if the user is not aware of the current mode. Palettes are a mechanism for making the set of possible modes and the active mode visible to the user. A palette is usually a collection of icons that are reminiscent of the purpose of the various modes. An example in a drawing package would be a collection of icons to indicate the pixel color or pattern that is used to fill in objects, much like an artist’s palette for paint.
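The idea that the same action means different things in different modes can be made concrete with a very small sketch. The Python class below is a toy loosely inspired by the vi example; the class name, the key names and the behavior of `x` are simplifying assumptions, not a description of how vi is implemented. The `status_line` method plays the role of a palette by making the active mode visible.

```python
class ModalEditor:
    """A toy two-mode editor: the same key is interpreted per mode."""

    def __init__(self):
        self.mode = "command"
        self.text = []

    def status_line(self):
        # Making the active mode visible is what a palette does for the user.
        return f"-- {self.mode.upper()} --"

    def press(self, key):
        if self.mode == "insert":
            if key == "ESC":
                self.mode = "command"
            else:
                self.text.append(key)   # every other key is inserted
        else:
            if key == "i":
                self.mode = "insert"
            elif key == "x" and self.text:
                self.text.pop()         # here 'x' deletes (toy behavior)


editor = ModalEditor()
editor.press("i")          # switch to insert mode
editor.press("x")          # 'x' is inserted as a character
text_after_insert = list(editor.text)
editor.press("ESC")        # back to command mode
editor.press("x")          # the same key now deletes
text_after_command = list(editor.text)
```

The same keystroke `x` adds a character in one mode and removes one in the other, which is why a user who has lost track of the mode can do real damage.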

Some systems allow the user to create palettes from menus or toolbars. In the case of pull-down menus, the user may be able to ‘tear off’ the menu, turning it into a palette showing the menu items. In the case of toolbars, he may be able to drag the toolbar away from its normal position and place it anywhere on the screen. Tear-off menus are usually those that are heavily graphical anyway, for example line-style or color selection in a drawing package.

3.6.8 Dialog boxes

Dialog boxes are information windows used by the system to bring the user’s attention to some important information, possibly an error or a warning used to prevent a possible error. Alternatively, they are used to invoke a subdialog between user and system for a very specific task that will normally be embedded within some larger task. For example, most interactive applications result in the user creating some file that will have to be named and stored within the filing system. When the user or system wants to save the file, a dialog box can be used to allow the user to name the file and indicate where it is to be located within the filing system. When the save subdialog is complete, the dialog box will disappear. Just as windows are used to separate the different threads of user–system dialog, so too are dialog boxes used to factor out auxiliary task threads from the main task dialog.

3.7 INTERACTIVITY

When looking at an interface, it is easy to focus on the visually distinct parts (the buttons, menus, text areas) but the dynamics, the way they react to a user’s actions, are less obvious. Dialog design, discussed in Chapter 16, is focussed almost entirely on the choice and specification of appropriate sequences of actions and corresponding changes in the interface state. However, it is typically not used at a fine level of detail and deliberately ignores the ‘semantic’ level of an interface: for example, the validation of numeric information in a forms-based system.

It is worth remembering that interactivity is the defining feature of an interactive system. This can be seen in many areas of HCI. For example, the recognition rate for speech recognition is too low to allow transcription from tape, but in an airline reservation system, so long as the system can reliably recognize yes and no it can reflect back its understanding of what you said and seek confirmation. Speech-based input is difficult, speech-based interaction easier. Also, in the area of information visualization the most exciting developments are all where users can interact with a visualization in real time, changing parameters and seeing the effect.
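The reservation example can be sketched as a simple confirmation loop. The Python below is a minimal illustration under stated assumptions: the recognizer is imagined to return a ranked list of candidate interpretations, and `ask` stands in for a component that speaks a prompt and reliably hears yes or no. The function names, the city names and the scripted replies are all hypothetical.

```python
def confirm(candidates, ask):
    """Reflect each candidate back to the user until one is confirmed.

    candidates: interpretations in the recognizer's order of preference.
    ask: callable taking a prompt and returning "yes" or "no".
    """
    for candidate in candidates:
        if ask(f"Did you say {candidate}?") == "yes":
            return candidate
    return None  # nothing confirmed; the system would need to ask again


# Scripted stand-in for a user: rejects the first guess, accepts the second.
replies = iter(["no", "yes"])
prompts = []


def scripted_ask(prompt):
    prompts.append(prompt)
    return next(replies)


chosen = confirm(["Boston", "Austin"], scripted_ask)
```

Even though the first interpretation is wrong, the interaction as a whole succeeds, because the only thing that must be recognized reliably is the yes or no.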

Interactivity is also crucial in determining the ‘feel’ of a WIMP environment. All WIMP systems appear to have virtually the same elements: windows, icons, menus, pointers, dialog boxes, buttons, etc. However, the precise behavior of these elements differs both within a single environment and between environments. For example, we have already discussed the different behavior of pull-down and fall-down menus. These look the same, but fall-down menus are more easily invoked by accident (and not surprisingly the windowing environments that use them have largely fallen into disuse!). In fact, menus are a major difference between the MacOS and Microsoft Windows environments: in MacOS you have to keep the mouse depressed throughout menu selection; in Windows you can click on the menu bar and a pull-down menu appears and remains there until an item is selected or it is cancelled. Similarly the detailed behavior of buttons is quite complex, as we shall see in Chapter 17.

In older computer systems, the order of interaction was largely determined by the machine. You did things when the computer was ready. In WIMP environments, the user takes the initiative, with many options and often many applications simultaneously available. The exceptions to this are pre-emptive parts of the interface, where the system for various reasons wrests the initiative away from the user, perhaps because of a problem or because it needs information in order to continue.

The major example of this is modal dialog boxes. It is often the case that when a dialog box appears the application will not allow you to do anything else until the dialog box has been completed or cancelled. In some cases this may simply block the application, but you can perform tasks in other applications. In other cases you can do nothing at all until the dialog box has been completed. An especially annoying example is when the dialog box asks a question, perhaps simply for confirmation of an action, but the information you need to answer is hidden by the dialog box!

There are occasions when modal dialog boxes are necessary, for example when a major fault has been detected, or for certain kinds of instructional software. However, the general philosophy of modern systems suggests that one should minimize the use of pre-emptive elements, allowing the user maximum flexibility.

Interactivity is also critical in dealing with errors. We discussed slips and mistakes earlier in the chapter, and some ways to try to prevent these types of errors. The other way to deal with errors is to make sure that the user or the system is able to tell when errors have occurred. If users can detect errors then they can correct them. So, even if errors occur, the interaction as a whole succeeds. Several of the principles in Chapter 7 deal with issues that relate to this. This ability to detect and correct is important both at the small scale of button presses and keystrokes and also at the large scale. For example, if you have sent a client a letter and expect a reply, you can put in your diary a note on the day you expect a reply. If the other person forgets to reply or the letter gets lost in the post you know to send a reminder or ring when the due day passes.

3.8 THE CONTEXT OF THE INTERACTION

We have been considering the interaction between a user and a system, and how this is affected by interface design. This interaction does not occur within a vacuum. We have already noted some of the physical factors in the environment that can directly affect the quality of the interaction. This is part of the context in which the interaction takes place. But this still assumes a single user operating a single, albeit complex, machine. In reality, users work within a wider social and organizational context. This provides the wider context for the interaction, and may influence the activity and motivation of the user. In Chapter 13, we discuss some methods that can be used to gain a fuller understanding of this context, and, in Chapter 14, we consider in more detail the issues involved when more than one user attempts to work together on a system. Here we will confine our discussion to the influence social and organizational factors may have on the user’s interaction with the system. These may not be factors over which the designer has control. However, it is important to be aware of such influences to understand the user and the work domain fully.

Bank managers don’t type . . .

The safe in most banks is operated by at least two keys, held by different employees of the bank. This makes it difficult for a bank robber to obtain both keys, and also protects the bank against light-fingered managers! ATMs contain a lot of cash and so need to be protected by similar measures. In one bank, which shall remain nameless, the ATM had an electronic locking device. The machine could not be opened to replenish or remove cash until a long key sequence had been entered. In order to preserve security, the bank gave half the sequence to one manager and half to another, so both managers had to be present in order to open the ATM. However, these were traditional bank managers who were not used to typing – that was a job for a secretary! So they each gave their part of the key sequence to a secretary to type in when they wanted to gain entry to the ATM. In fact, they both gave their respective parts of the key sequence to the same secretary. Happily the secretary was honest, but the moral is you cannot ignore social expectations and relationships when designing any sort of computer system, however simple it may be.

The presence of other people in a work environment affects the performance of the worker in any task. In the case of peers, competition increases performance, at least for known tasks. Similarly the desire to impress management and superiors improves performance on these tasks. However, when it comes to acquisition of new skills, the presence of these groups can inhibit performance, owing to the fear of failure. Consequently, privacy is important to allow users the opportunity to experiment.

In order to perform well, users must be motivated. There are a number of possible sources of motivation, as well as those we have already mentioned, including fear, allegiance, ambition and self-satisfaction. The last of these is influenced by the user’s perception of the quality of the work done, which leads to job satisfaction. If a system makes it difficult for the user to perform necessary tasks, or is frustrating to use, the user’s job satisfaction, and consequently performance, will be reduced.

The user may also lose motivation if a system is introduced that does not match the actual requirements of the job to be done. Often systems are chosen and introduced by managers rather than the users themselves. In some cases the manager’s perception of the job may be based upon observation of results and not on actual activity. The system introduced may therefore impose a way of working that is unsatisfactory to the users. If this happens there may be three results: the system will be rejected, the users will be resentful and unmotivated, or the user will adapt the intended interaction to his own requirements. This indicates the importance of involving actual users in the design process.

DESIGN FOCUS

Half the picture?

When systems are not designed to match the way people actually work, then users end up having to do ‘work arounds’. Integrated student records systems are becoming popular in universities in the UK. They bring the benefits of integrating examination systems with enrolment and finance systems so all data can be maintained together and cross-checked. All very useful and time saving – in theory. However, one commonly used system only holds a single overall mark per module for each student, whereas many modules on UK courses have multiple elements of assessment. Knowing a student’s mark on each part of the assessment is often useful to academics making decisions in examination boards as it provides a more detailed picture of performance. In many cases staff are therefore supplementing the official records system with their own unofficial spreadsheets to provide this information – making additional work for staff and increased opportunity for error.

On the other hand, the introduction of new technology may prove to be a motivation to users, particularly if it is well designed, integrated with the user’s current work, and challenging. Providing adequate feedback is an important source of motivation for users. If no feedback is given during a session, the user may become bored, unmotivated or, worse, unsure of whether the actions performed have been successful. In general, an action should have an obvious effect to prevent this confusion and to allow early recovery in the case of error. Similarly, if system delays occur, feedback can be used to prevent frustration on the part of the user – the user is then aware of what is happening and is not left wondering if the system is still working.


3.9 EXPERIENCE, ENGAGEMENT AND FUN

Ask many in HCI about usability and they may use the words ‘effective’ and ‘efficient’. Some may add ‘satisfaction’ as well. This view of usability seems to stem mainly from the Taylorist tradition of time and motion studies: if you can get the worker to pull the levers and turn the knobs in the right order then you can shave 10% off production costs.

However, users no longer see themselves as cogs in a machine. Increasingly, applications are focussed outside the closed work environment: on the home, leisure, entertainment, shopping. It is not sufficient that people can use a system, they must want to use it.

Even from a pure economic standpoint, your employees are likely to work better and more effectively if they enjoy what they are doing!

In this section we’ll look at these more experiential aspects of interaction.

3.9.1 Understanding experience

Shopping is an interesting example to consider. Most internet stores allow you to buy things, but do you go shopping? Shopping is as much about going to the shops, feeling the clothes, being with friends. You can go shopping and never intend to spend money. Shopping is not about an efficient financial transaction, it is an experience.

But experience is a difficult thing to pin down; we understand the idea of a good experience, but how do we define it and even more difficult how do we design it?

Csikszentmihalyi [82] looked at extreme experiences such as climbing a rock face in order to understand that feeling of total engagement that can sometimes happen. He calls this flow and it is perhaps related to what some sportspeople refer to as being ‘in the zone’. This sense of flow occurs when there is a balance between anxiety and boredom. If you do something that you know you can do it is not engaging; you may do it automatically while thinking of something else, or you may simply become bored. Alternatively, if you do something completely outside your abilities you may become anxious and, if you are half way up a rock face, afraid. Flow comes when you are teetering at the edge of your abilities, stretching yourself to or a little beyond your limits.

In education there is a similar phenomenon. The zone of proximal development is those things that you cannot quite do yourself, but you can do with some support, whether from teachers, fellow pupils, or electronic or physical materials. Learning is at its best in this zone. Notice again this touching of limits.

Of course, this does not fully capture the sense of experience, and there is an active subfield of HCI researchers striving to make sense of this, building on the work of psychologists and philosophers on the one hand and literary analysis, film making and drama on the other.


3.9.2 Designing experience

Some of the authors were involved in the design of virtual Christmas crackers. These are rather like electronic greetings cards, but are based on crackers. For those who have not come across them, Christmas crackers are small tubes of paper between 8 and 12 inches long (20–30 cm). Inside there are a small toy, a joke or motto and a paper hat. A small strip of card is threaded through, partly coated with gunpowder. When two people at a party pull the cracker, it bursts apart with a small bang from the gunpowder and the contents spill out.

The virtual cracker does not attempt to fully replicate each aspect of the physical characteristics and process of pulling the cracker, but instead seeks to reproduce the experience. To do this the original crackers experience was deconstructed and each aspect of the experience produced in a similar, but sometimes different, way in the new media. Table 3.1 shows the aspects of the experience deconstructed and reconstructed in the virtual cracker.

For example, the cracker contents are hidden inside; no one knows what toy or joke will be inside. Similarly, when you create a virtual cracker you normally cannot see the contents until the recipient has opened it. Even the recipient initially sees a page with just an image of the cracker; it is only after the recipient has clicked on the ‘open’ icon that the cracker slowly opens and you get to see the joke, web toy and mask.

The mask is also worth looking at. The first potential design was to have a picture of a face with a hat on it – well, it wouldn’t rank highly on excitement! The essential feature of the paper hat is that you can dress up. An iconic hat hardly does that.

Table 3.1 The crackers experience [101]

                      Real cracker            Virtual cracker

Surface elements
  Design              Cheap and cheerful      Simple page/graphics
  Play                Plastic toy and joke    Web toy and joke
  Dressing up         Paper hat               Mask to cut out

Experienced effects
  Shared              Offered to another      Sent by email, message
  Co-experience       Pulled together         Sender can’t see content until opened by recipient
  Excitement          Cultural connotations   Recruited expectation
  Hiddenness          Contents inside         First page – no contents
  Suspense            Pulling cracker         Slow . . . page change
  Surprise            Bang (when it works)    WAV file (when it works)


Instead the cracker has a link to a web page with a picture of a mask that you can print, cut out and wear. Even if you don’t actually print it out, the fact that you could changes the experience – it is some dressing up you just happen not to have done yet.

A full description of the virtual crackers case study is on the book website at: /e3/casestudy/crackers/

3.9.3 Physical design and engagement

In Chapter 2 we talked about physical controls. Figure 2.13 showed controllers for a microwave, washing machine and personal MiniDisc player. We saw then how certain physical interfaces were suited for different contexts: smooth plastic controls for an easy clean microwave, multi-function knob for the MiniDisc.

Designers are faced with many constraints:

Ergonomic You cannot physically push buttons if they are too small or too close.

Physical The size or nature of the device may force certain positions or styles of control, for example, a dial like the one on the washing machine would not fit on the MiniDisc controller; high-voltage switches cannot be as small as low-voltage ones.

Legal and safety Cooker controls must be far enough from the pans that you do not burn yourself, but also high enough to prevent small children turning them on.

Context and environment The microwave’s controls are smooth to make them easy to clean in the kitchen.

Aesthetic The controls must look good.

Economic It must not cost too much!

These constraints are themselves often contradictory and require trade-offs to be made. For example, even within the safety category front-mounted controls are better in that they can be turned on or off without putting your hands over the pans and hot steam, but back-mounted controls are further from children’s grasp. The MiniDisc player is another example; it physically needs to be small, but this means there is not room for all the controls you want given the minimum size that can be manipulated. In the case of the cooker there is no obvious best solution and so different designs favor one or the other. In the case of the MiniDisc player the end knob is multi-function. This means the knob is ergonomically big enough to turn and physically small enough to fit, but at the cost of a more complex interaction style.

To add to this list of constraints there is another that makes a major impact on the ease of use and also the ability of the user to become engaged with the device, for it to become natural to use:

Fluidity The extent to which the physical structure and manipulation of the device naturally relate to the logical functions it supports.

This is related closely to the idea of affordances, which we discuss in Section 5.7.2. The knob at the end of the MiniDisc controller affords turning – it is an obvious thing to do. However, this may not have mapped naturally onto the logical functions. Two of the press buttons are for cycling round the display options and for changing sound options. Imagine a design where turning the knob clockwise cycled through the display options and turning it anti-clockwise cycled through the sound options. This would be a compact design satisfying all the ergonomic, physical and aesthetic constraints, but would not have led to as fluid an interaction. The physically opposite motions lead to logically distinct effects. However, the designers did a better job than this! The twist knob is used to move backwards and forwards through the tracks of the MiniDisc – that is, opposite physical movements produce opposite logical effects. Holding the knob out and twisting turns the volume up and down. Again, although the pull action is not a natural mapping, the twist maps very naturally onto controlling the sound level.

As well as being fluid in action, some controls portray by their physical appearance the underlying state they control. For example, the dial on the washing machine both sets the program and reflects the current stage in the washing cycle as it turns. A simple on/off switch also does this. However, it is also common to see the power on computers and hifi devices controlled by a push button – press for on, then press again for off. The button does not reflect the state at all. When the screen is on this is not a problem as the fact that there is something on the screen acts as a very immediate indicator of the state. But if the screen has a power save then you might accidentally turn the machine off thinking that you are turning it on! For this reason, this type of power button often has a light beside it to show you the power is on. A simple switch tells you that itself!
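The difference between the two kinds of power control can be shown in a few lines. The Python sketch below uses two hypothetical classes, `PushButton` and `Switch`, invented for illustration; it models a user who believes the machine is off (the screen is in power save) and acts to turn it on.

```python
class PushButton:
    """Press for on, press again for off; the control shows no state."""

    def __init__(self, power_on=False):
        self.power_on = power_on

    def press(self):
        # The effect depends on hidden state the user may have misjudged.
        self.power_on = not self.power_on


class Switch:
    """A two-position switch; its position is the state."""

    def __init__(self, power_on=False):
        self.power_on = power_on

    def set_on(self):
        # Moving the switch to 'on' has the same result whatever came before.
        self.power_on = True


# The machine is actually on, but the screen is blank from power saving.
button = PushButton(power_on=True)
button.press()      # user intends 'turn on', machine is turned off

switch = Switch(power_on=True)
switch.set_on()     # same intention, machine stays on
```

The push button's single action means ‘change state’, so a wrong belief about the state produces the wrong outcome; the switch's action means ‘be on’, so the belief does not matter.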

3.9.4 Managing value

If we want people to want to use a device or application we need to understand their personal values. Why should they want to use it? What value do they get from using it? Now when we say value here we don’t mean monetary value, although that may be part of the story, but all the things that drive a person. For some people this may include being nice to colleagues, being ecologically friendly, being successful in their career. Whatever their personal values are, if we ask someone to do something or use something they are only likely to do it if the value to them exceeds the cost.

This is complicated by the fact that for many systems the costs such as purchase cost, download time of a free application, learning effort are incurred up front, whereas often the returns – faster work, enjoyment of use – are seen later. In economics, businesses use a measure called ‘net present value’ to calculate what a future gain is worth today; because money can be invested, £100 today is worth the same as perhaps £200 in five years’ time. Future gain is discounted. For human decision making, future gains are typically discounted very highly; many of us are bad at saving for tomorrow or even keeping the best bits of our dinner until last. This means that not only must we understand people’s value systems, but we must be able to offer gains sooner as well as later, or at least produce a very good demonstration of potential future gains so that they have a perceived current value.
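The discounting arithmetic behind the £100 and £200 figures can be worked through in a short Python sketch. It assumes simple annual compounding, which is one common convention rather than anything the text specifies, and the function names are ours.

```python
def present_value(future_amount, annual_rate, years):
    """Value today of an amount received after the given number of years."""
    return future_amount / (1 + annual_rate) ** years


def implied_rate(present_amount, future_amount, years):
    """Annual discount rate that makes the two amounts equivalent."""
    return (future_amount / present_amount) ** (1 / years) - 1


# Treating 100 today as equivalent to 200 in five years implies a rate of
# roughly 15% a year under annual compounding.
rate = implied_rate(100, 200, 5)

# Discounting the future 200 at that rate recovers the 100 of today.
value_today = present_value(200, rate, 5)

# A more heavily discounting decision maker, here at an assumed 50% a year,
# values the same future 200 at much less than 100 today.
impatient_value = present_value(200, 0.5, 5)
```

The last line illustrates the point about human decision making: the higher the personal discount rate, the less a distant benefit counts, which is why up-front costs so easily outweigh later returns.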

When we were preparing the website for the second edition of this book we thought very hard about how to give things that were of value to those who had the book, and also to those who hadn’t. The latter is partly because we are all academics and researchers in the field and so want to contribute to the HCI community, but also of course we would like lots of people to buy the book. One option we thought of was to put the text online, which would be good for people without the book, but this would have less value to people who have the book (they might even be annoyed that those who hadn’t paid should have access). The search mechanism was the result of this process (Figure 3.19). It gives value to those who have the book because it is a way of finding things. It is of value to those who don’t because it acts as a sort of online encyclopedia of HCI. However, because it always gives the chapter and page number in the book it also says to those who haven’t got the book: ‘buy me’. See an extended case study about the design of the book search on the website at /e3/casestudy/search/

Figure 3.19 The web-based book search facility. Screen shot frame reprinted by permission from Microsoft Corporation

3.10 SUMMARY

In this chapter, we have looked at the interaction between human and computer, and, in particular, how we can ensure that the interaction is effective to allow the user to get the required job done. We have seen how we can use Norman’s execution–evaluation model, and the interaction framework that extends it, to analyze the interaction in terms of how easy or difficult it is for the user to express what he wants and determine whether it has been done.

We have also looked at the role of ergonomics in interface design, in analyzing the physical characteristics of the interaction, and we have discussed a number of interface styles. We have considered how each of these factors can influence the effectiveness of the interaction.

Interactivity is at the heart of all modern interfaces and is important at many levels. Interaction between user and computer does not take place in a vacuum, but is affected by numerous social and organizational factors. These may be beyond the designer’s control, but awareness of them can help to limit any negative effects on the interaction.

EXERCISES

3.1 Choose two of the interface styles (described in Section 3.5) that you have experience of using. Use the interaction framework to analyze the interaction involved in using these interface styles for a database selection task. Which of the distances is greatest in each case?

3.2 Find out all you can about natural language interfaces. Are there any successful systems? For what applications are these most appropriate?

3.3 What influence does the social environment in which you work have on your interaction with the computer? What effect does the organization (commercial or academic) to which you belong have on the interaction?

3.4 (a) Group the following functions under appropriate headings, assuming that they are to form the basis for a menu-driven word-processing system – the headings you choose will become the menu titles, with the functions appearing under the appropriate one. You can choose as many or as few menu headings as you wish. You may also alter the wordings of the functions slightly if you wish.

save, save as, new, delete, open mail, send mail, quit, undo, table, glossary, preferences, character style, format paragraph, lay out document, position on page, plain text, bold text, italic text, underline, open file, close file, open copy of file, increase point size, decrease point size, change font, add footnote, cut, copy, paste, clear, repaginate, add page break, insert graphic, insert index entry, print, print preview, page setup, view page, find word, change word, go to, go back, check spelling, view index, see table of contents, count words, renumber pages, repeat edit, show alternative document, help

(b) If possible, show someone else your headings, and ask them to group the functions under your headings. Compare their groupings with yours. You should find that there are areas of great similarity, and some differences. Discuss the similarities and discrepancies.

Why do some functions always seem to be grouped together?
Why do some groups of functions always get categorized correctly?
Why are some less easy to place under the ‘correct’ heading?
Why is this important?

3.5 Using your function groupings from Exercise 3.4, count the number of items in your menus.

(a) What is the average?
What is the disadvantage of putting all the functions on the screen at once?
What is the problem with using lots of menu headings?
What is the problem of using very few menu headings?

Consider the following: I can group my functions either into three menus, with lots of functions in each one, or into eight menus with fewer in each. Which will be easier to use? Why?

(b) Optional experiment
Design an experiment to test your answers. Perform the experiment and report on your results.

3.6 Describe (in words as well as graphically) the interaction framework introduced in Human–Computer Interaction. Show how it can be used to explain problems in the dialog between a user and a computer.

3.7 Describe briefly four different interaction styles used to accommodate the dialog between user and computer.

3.8 The typical computer screen has a WIMP setup (what does WIMP stand for?). Most common WIMP arrangements work on the basis of a desktop metaphor, in which common actions are likened to similar actions in the real world. For example, moving a file is achieved by selecting it and dragging it into a relevant folder or filing cabinet. The advantage of using a metaphor is that the user can identify with the environment presented on the screen. Having a metaphor allows users to predict the outcome of their actions more easily.

Note that the metaphor can break down, however. What is the real-world equivalent of formatting a disk? Is there a direct analogy for the concept of ‘undo’? Think of some more examples yourself.

RECOMMENDED READING

D. A. Norman, The Psychology of Everyday Things, Basic Books, 1988. (Republished as The Design of Everyday Things by Penguin, 1991.)
A classic text, which discusses psychological issues in designing everyday objects and addresses why such objects are often so difficult to use. Discusses the execution–evaluation cycle. Very readable and entertaining. See also his more recent books Turn Signals are the Facial Expressions of Automobiles [267], Things That Make Us Smart [268] and The Invisible Computer [269].

R. W. Bailey, Human Performance Engineering: A Guide for System Designers, Prentice Hall, 1982.
Detailed coverage of human factors and ergonomics issues, with plenty of examples.

G. Salvendy, Handbook of Human Factors and Ergonomics, John Wiley, 1997.
Comprehensive collection of contributions from experts in human factors and ergonomics.

M. Helander, editor, Handbook of Human–Computer Interaction. Part II: User Interface Design, North-Holland, 1988.
Comprehensive coverage of interface styles.

J. Raskin, The Humane Interface: New Directions for Designing Interactive Systems, Addison Wesley, 2000.
Jef Raskin was one of the central designers of the original Mac user interface. This book gives a personal, deep and practical examination of many issues of interaction and its application in user interface design.

M. Blythe, A. Monk, K. Overbeeke and P. Wright, editors, Funology: From Usability to Enjoyment, Kluwer, 2003.
This is an edited book with chapters covering many areas of user experience. It includes an extensive review of theory from many disciplines from psychology to literary theory and chapters giving design frameworks based on these. The theoretical and design base is grounded by many examples and case studies including a detailed analysis of virtual crackers.
