Taming the Alarm Problem: To boldly go where others have gone before

Alarm system problems have contributed to many tragic losses

Alarms have been used in other domains for a long time

Commercial Aviation
• Air Traffic Control
• Collision Avoidance
• Cockpit Avionics

Continuous Process Industries
• Oil Refineries
• Power grids
• Chemical plants

Alarm systems develop problems over time
• no single alarm believes it is the source of the flood
• easy to add alarms
• hard to remove alarms
• alarm floods
• standing alarms
• nuisance alarms
• poor alarm configurations

2005 BP Texas City Refinery Explosion: $1B economic loss, 15 people killed
• 275 alarms in the 11 minutes before the explosion
• “ … warnings of the developing problem were lost in the plethora of instrument alarms triggered in the control room, many of which were unnecessary and registering with increasing frequency, so operators were unable to appreciate what was actually happening …”

1994 Texaco Milford Haven Refinery Explosion: £400M economic loss

1984 Union Carbide Bhopal Isocyanate Plant: over 3800 people killed
• Few alarms or interlocks in critical locations that might have warned operators of abnormal conditions
• Alarms sounded so many times a week (20 to 30) that there was no way to know what the siren signified
• Emergency signal was identical to that used for other purposes, including practice drills
• Alarm at flare tower was non-operational

2003 East Coast Blackout: $6B economic loss, 50 million people without power
• Failure of the FE control computers and alarm system
• Failed alarm tools were a major factor in the blackout
• The primary server hosting the EMS alarm processing application failed, due either to the stalling of the alarm application, “queuing” to the remote EMS terminals, or some combination of the two

1991 Lauda Air 767 Crash: 223 people killed
• 05:48 BM-2 that keeps that's come on <<thrust reverser warning>>
• 08:27 BM-2 additional system failures may cause in-flight deployment
• 09:50 BM-1 it's probably … moisture or something cause it's not it's not just on it's coming on and off
• 10:04 BM-1 it's just an advisory thing I don't ah could be some moisture in there or somethin'
• 15:01 BM-2 oh reverser's deployed

2010 BP Deepwater Horizon Oil Spill: $40B economic loss, 11 people killed
• Vital warning systems on the Deepwater Horizon oil rig were switched off at the time of the explosion in order to spare workers being woken by false alarms, a federal investigation has heard.
• The revelation that alarm systems on the rig at the centre of the disaster were disabled came in testimony by a chief technician working for Transocean, the drilling company that owned the rig.

“It appears that the medical equipment industry is traversing the same ground with regards to audible alarms as the military and space industries crossed decades ago.” - Rochelle Grober, 1995

Other domains have learned how to tame their alarm problems
1. International Society of Automation (ISA): “SP18 – Alarm Systems Management and Design Guide”
2. Abnormal Situation Management (ASM) Consortium: “Effective Alarm Management Practices”
3. UK Health and Safety Executive (HSE): “Better Alarm Handling”
4. Engineering Equipment & Materials User’s Association (EEMUA): “Alarm Systems: A Guide to Design, Management, and Procurement”
5. NAMUR: “NA 102: Alarm Management”
6. FAA: “Human Factors Design Standard”, Ch 5: Alarms