Automation Reliability in Unmanned Aerial Vehicle Control: A Reliance-Compliance Model of Automation Dependence in High Workload
Stephen R. Dixon, Christopher D. Wickens, University of Illinois
HUMAN FACTORS, Vol. 48, No. 3, Fall 2006, pp. 474-486
Page 1: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Automation Reliability in Unmanned Aerial Vehicle Control:

A Reliance-Compliance Model of Automation Dependence

in High Workload

Stephen R. Dixon, Christopher D. Wickens

University of Illinois

HUMAN FACTORS, Vol. 48, No. 3, Fall 2006, pp. 474-486

Page 2: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Introduction

• Unmanned aerial vehicles (UAVs) are now commonly used to fulfill military reconnaissance missions without endangering human pilots.

Page 3: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Imperfect Automation

• Imperfect automation has been shown to create different states of overtrust, undertrust, or calibrated trust, as well as complacency and performance loss.

Page 4: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Diagnostic Failures: Misses and False Alarms

• The focus of the current study was on imperfect automation diagnostic alerting systems.

• There is some evidence that the generic costs of alerting system false alarms may be greater than those of misses.

Page 5: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Reliance Versus Compliance

• Reliance refers to the human operator state when the alert is silent, signaling “all is well.”

• Reliant operators will have ample resources to allocate to concurrent tasks because they rely on the automation to let them know when a problem occurs.

Page 6: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• Compliance describes the operator’s response when the alarm sounds, whether true or false.

• A compliant operator will rapidly switch attention from concurrent activities to the alarm domain.

Page 7: Stephen R. Dixon, Christopher D. Wickens University of Illinois

The Current Study

• UAV simulation provided an ideal test bed for two experiments that examined the issues of imperfect automation in dual-task settings.

• Four hypotheses:

• H1: The symptoms of automation dependence (benefits if correct, costs if incorrect) will emerge primarily at high workload. Automation imperfection driven by misses and false alarms will show qualitatively different effects, as reflected by measures of reliance and of compliance.

Page 8: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• H2: Indices of high reliance will decrease with increasing miss rate.

• H3: Indices of high compliance will decrease with increasing false alarm rate.

• H4: The two vectors of reliance and compliance will show relative independence from each other.

Page 9: Stephen R. Dixon, Christopher D. Wickens University of Illinois

METHODS: EXPERIMENT 1

• Participants: Thirty-two undergraduate and graduate students received $8/hr, plus bonuses of $20, $10, and $5 for first-, second-, and third-place finishes out of groups of 8 participants.

Page 10: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Apparatus

• The experimental simulation ran on an Evans and Sutherland SimFusion 4000q system.

• The experimental environment was subdivided into four separate windows.

Page 11: Stephen R. Dixon, Christopher D. Wickens University of Illinois
Page 12: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• The top left window contained a 3-D egocentric image view of the terrain directly below the UAV (6,000 feet altitude).

• The bottom left window contained a 2-D top-down map of the 20 × 20 mile (32 × 32 km) simulation world.

Page 13: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• The bottom center window contained the message box, with “fly to” coordinates and command target (CT) report questions.

• The bottom right window contained the four system gauges for the system failure monitoring task.

Page 14: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Procedure

• Each participant flew one UAV through 10 consecutive mission legs.

• During each leg, the participant completed three goal-oriented tasks: mission navigation and command target inspection, target of opportunity (TOO) search, and systems monitoring.

Page 15: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• Mission navigation and command target inspection: Once participants arrived at the CT location, they loitered around the target, manipulated a camera via a joystick for closer target inspection, and reported relevant information back to mission command.

Page 16: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• TOO search: A task similar to the CT report, except that the TOOs were much smaller than the CT objects and were camouflaged.

• Systems monitoring: Participants were also required to monitor the system gauges for possible system failures.

Page 17: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Design

• Auditory autoalerts for system failures (SFs) were provided in three of the four conditions, using a between-subjects design.

• Four conditions: A100, A67f, A67m, and baseline.

Page 18: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• A100 condition (A = automation, 100% reliable).

• A67f condition (f = false alarm, 67% reliable): provided 10 true alerts plus an additional 5 false alarms.

• A67m condition (m = miss, 67% reliable): provided 10 true alerts but failed to alert to an additional 5 events (10 true alerts plus 5 misses).
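
As a quick worked check (an editorial addition, not on the original slides), both 67% figures follow from the ratio of correct automation responses to automation-relevant events:

\[
\text{A67f: } \frac{10\ \text{true alerts}}{10\ \text{true alerts} + 5\ \text{false alarms}} = \frac{10}{15} \approx 67\%,
\qquad
\text{A67m: } \frac{10\ \text{alerted failures}}{10\ \text{alerted} + 5\ \text{missed failures}} = \frac{10}{15} \approx 67\%.
\]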

Page 19: Stephen R. Dixon, Christopher D. Wickens University of Illinois

RESULTS: EXPERIMENT 1 - Primary Task: Mission Navigation and CT Inspection

• Tracking error and CT reporting: Planned comparisons revealed no main effect for tracking error (all ps > .10) or for CT reporting speed and accuracy.

• Repeats: The A67m condition generated twice as many repeats as did the A67f condition, t(14) = 2.52, p = .01.

Page 20: Stephen R. Dixon, Christopher D. Wickens University of Illinois

Secondary Task: TOO Monitoring

• TOO detection rates: Detection rates were significantly lower in the A67m (miss) condition than in the A67f (false alarm) condition in both the low-workload, t(12) = 2.25, p < .05, and high-workload trials, t(12) = 2.20, p < .05.

Page 21: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• TOO detection times: The A67f condition may have generated longer detection times than the A67m condition did, t(6) = 1.40, p = .10 (approaching significance).

Page 22: Stephen R. Dixon, Christopher D. Wickens University of Illinois

SF Monitoring

• SF detection rates: Planned comparisons revealed that the 67% reliable conditions resulted in poorer detection rates than did the baseline condition.

• SF detection times: It is interesting to note that the A67m condition resulted in slower detection times than the A67f condition.

Page 23: Stephen R. Dixon, Christopher D. Wickens University of Illinois
Page 24: Stephen R. Dixon, Christopher D. Wickens University of Illinois
Page 25: Stephen R. Dixon, Christopher D. Wickens University of Illinois

DISCUSSION: EXPERIMENT 1

• Perfect automation had a beneficial effect, relative to baseline, on performance in the automated task, but it had no benefit on concurrent task performance.

• Imperfect automation (67%) hurt both the automated task and concurrent tasks, even dropping these below baseline in some cases.

Page 26: Stephen R. Dixon, Christopher D. Wickens University of Illinois

METHODS: EXPERIMENT 2

• The procedures of Experiment 2 replicated those of Experiment 1.

• An A80 condition (A = automation, 80% reliable) failed by giving 1 false alarm and 1 miss during each mission (8 true alarms, 1 miss, and 1 false alarm).

Page 27: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• The A60f condition (f = false alarm, 60% reliable) was created by imposing 3 automation false alarms and 1 automation miss (4 automation failures) out of the 10 possible system failures.

• The A60m condition (m = miss, 60% reliable) resulted in 3 misses and 1 false alarm (6 true alarms plus 3 misses and 1 false alarm).
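
On the reading that each mission involved 10 automation-relevant events, as the bullets suggest (an editorial check, not on the original slides), the stated reliabilities work out as:

\[
\text{A80: } \frac{10 - (1\ \text{miss} + 1\ \text{FA})}{10} = 80\%,
\qquad
\text{A60f: } \frac{10 - (3\ \text{FA} + 1\ \text{miss})}{10} = 60\%,
\qquad
\text{A60m: } \frac{10 - (3\ \text{misses} + 1\ \text{FA})}{10} = 60\%.
\]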

Page 28: Stephen R. Dixon, Christopher D. Wickens University of Illinois

RESULTS: EXPERIMENT 2 - Mission Completion

• Performance in the two 60% reliable conditions was worse than baseline, t(20) = 2.77, p < .05, whereas the A80 condition did not differ from baseline.

• There was no statistical difference between the A60f and A60m conditions.

Page 29: Stephen R. Dixon, Christopher D. Wickens University of Illinois

TOO Monitoring

• TOO detection rates: Planned comparisons revealed no difference between the 60% reliable conditions and baseline, t(20) = 1.17, p > .10, whereas performance in the A80 condition was better than baseline.

• TOO detection times: Detection times in the 60% reliable conditions were worse than baseline, t(16) = 3.09, p < .01, but there was no difference between the A80 condition and baseline.

Page 30: Stephen R. Dixon, Christopher D. Wickens University of Illinois

SF Monitoring

• SF detection rates: Detection rates were reduced in the A60f condition under high workload (50%), as compared with the other conditions (mean = 74%).

• SF detection times: In the high-workload trials, performance in the 60% reliable conditions may have been worse than baseline.

Page 31: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• Complacency effect: Responses were faster when an alert correctly sounded (A60f = 13.93 s; A60m = 3.96 s) than when the alert failed to sound (A60f = 26.05 s; A60m = 23.29 s).

Page 32: Stephen R. Dixon, Christopher D. Wickens University of Illinois
Page 33: Stephen R. Dixon, Christopher D. Wickens University of Illinois
Page 34: Stephen R. Dixon, Christopher D. Wickens University of Illinois

DISCUSSION: EXPERIMENT 2

• Highly reliable automation did not benefit performance in the automated task relative to baseline, but it had a small benefit to concurrent task performance.

• Low-reliability automation (60%) hurt both the automated task and concurrent tasks, with different effects for false alarms and misses.

Page 35: Stephen R. Dixon, Christopher D. Wickens University of Illinois

MODELING OF AUTOMATION DEPENDENCE

• It is possible to assess measures of reliance and compliance.

• Reliance is indexed by (a) performance on secondary or concurrent tasks, and (b) the time required to respond to an unannounced failure.

Page 36: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• Compliance is indexed by the response time and accuracy to an announced system failure (higher compliance → shorter RT) under high workload.
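
A minimal sketch of how such indices could be computed from trial logs, using made-up field names and values for illustration (the original study reports only the derived measures, not code):

```python
from statistics import mean

# Hypothetical per-trial records for one participant in one condition.
# alerted:      did the automation announce the system failure (SF)?
# sf_rt:        time (s) taken to detect the SF
# too_detected: was the concurrent target of opportunity (TOO) found?
trials = [
    {"alerted": True,  "sf_rt": 4.2,  "too_detected": True},
    {"alerted": True,  "sf_rt": 5.1,  "too_detected": True},
    {"alerted": False, "sf_rt": 24.0, "too_detected": False},
    {"alerted": False, "sf_rt": 21.5, "too_detected": True},
]

# Reliance indices: concurrent-task (TOO) performance and
# response time to *unannounced* system failures.
too_detection_rate = mean(t["too_detected"] for t in trials)
rt_unannounced = mean(t["sf_rt"] for t in trials if not t["alerted"])

# Compliance index: response time to *announced* system failures
# (shorter RT = higher compliance).
rt_announced = mean(t["sf_rt"] for t in trials if t["alerted"])

print(f"Reliance:   TOO detection rate = {too_detection_rate:.2f}, "
      f"RT to unannounced SFs = {rt_unannounced:.1f} s")
print(f"Compliance: RT to announced SFs = {rt_announced:.1f} s")
```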

Page 37: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• All four measures of reliance showed a correlation in the expected direction.

• SF automation miss rate correlated with TOO miss rate, r = .50; RT to TOO, r = .73; repeats, r = .76; and RT to SF misses, r = –.97; that is, a higher automation miss rate went with poorer concurrent (TOO) performance but faster detection of unannounced SFs, the signature of reduced reliance.

Page 38: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• For compliance, the correlations were also in the expected direction: the correlation of automation FA rate with RT to SF alerts was r = .37, and with SF miss rate it was r = .73; that is, higher FA rate → less compliance → slower and less accurate responses to the SF alerts.

Page 39: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• Correlations on the pooled data revealed that miss rate → reliance (r = .67, p < .01); miss rate → compliance (r = .07, ns); FA rate → reliance (r = –.50, p = .06); FA rate → compliance (r = .49, p = .11).
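
As an illustration of this kind of pooled correlation analysis (a sketch with placeholder numbers, not the study's data), condition-level automation failure rates can be correlated against the reliance and compliance indices:

```python
from scipy.stats import pearsonr

# Placeholder condition-level summaries (one value per automation condition);
# the actual analysis pooled the conditions of Experiments 1 and 2.
auto_miss_rate  = [0.00, 0.33, 0.00, 0.10, 0.30]   # automation miss rate
too_miss_rate   = [0.20, 0.45, 0.22, 0.25, 0.40]   # operator TOO miss rate (reliance index)
auto_fa_rate    = [0.00, 0.00, 0.33, 0.10, 0.30]   # automation false alarm rate
rt_announced_sf = [4.0, 4.5, 9.0, 5.5, 8.0]        # RT (s) to announced SFs (compliance index)

r_rel, p_rel = pearsonr(auto_miss_rate, too_miss_rate)
r_com, p_com = pearsonr(auto_fa_rate, rt_announced_sf)
print(f"miss rate vs. TOO miss rate (reliance):       r = {r_rel:.2f}, p = {p_rel:.3f}")
print(f"FA rate vs. RT to announced SFs (compliance): r = {r_com:.2f}, p = {p_com:.3f}")
```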

Page 40: Stephen R. Dixon, Christopher D. Wickens University of Illinois

GENERAL DISCUSSION

• A100 performance was superior to baseline performance in the RT to system failures only at high workload, supporting H1.

• People depend on automation even when it is imperfect.

Page 41: Stephen R. Dixon, Christopher D. Wickens University of Illinois

• Our data revealed a strong effect of miss rate on reliance (r = .67), as participants became less trusting of the automation to alert them if a failure occurred.

• High false alert rates had negative effects on compliance, reflecting the “cry wolf” phenomenon.