An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices. - Recognition of The Risk Class Before and After The Risk Control Measure - Yoshio SAKAI Engineering Promotion Center, NIHON KOHDEN CORPORATION Seiko SHIRASAKA The Graduate School of System Design and Management, KEIO University Yasuharu NISHI Department of Systems Engineering, The University of Electro-Communications

Jan 20, 2015

It is difficult to assess the risk of software-intensive medical devices. An extended notation of FTA makes it possible to recognize the risk class before and after the risk control measure, and to see where the software in the system affects the top event of the FTA.

This content is also available as a 6-page paper on the IEEE website.

Page 2: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.

[email protected] 24th ISSRE / 1st MedSRDR 2013 Nov 7, 2013

Flow of the Presentation

1. Explanation of the traditional FTA, which lacks consideration of the software.
2. Explanation of the risk assessment method in ISO 14971, which lacks consideration of the software.
3. Explanation of solutions using an extended notation of FTA.

1. Traditional FTA (OLD)
2. Risk Assessment Method in ISO 14971 (OLD)
3. An Extended Notation of FTA (NEW)

Items 1 and 2 lack consideration of software failure; item 3 is aimed at software-intensive devices.

[Figure: ISO 14971 risk model. Hazard, then a sequence of events, then Hazardous Situation (Exposure, P1), then Harm (P2). Risk combines the Severity of the Harm with the Probability of Occurrence of Harm (P1 × P2).]

Page 3: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The History of FTA (Fault Tree Analysis)

1962: Fault Tree Analysis (FTA) was originally developed at Bell Laboratories by H. A. Watson for the Minuteman missile. FTA was devised at that time because the electronic system could not endure vibration and broke down.

1965: Boeing improved the completeness of FTA.

NOW: FTA is widely used.

The cause of the trouble was hardware failure, not software.

Page 4: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The traditional FTA which lacks consideration of the software.

• When FTA was developed, failures caused by software were not among the failures it considered.

• The traditional FTA does not make clear
– the effect of the risk control measure, that is, the risk before and after it;
– how the software in the system and the risk control measure affect the top event.

• The failure-rate calculation of FTA cannot be used for failures caused by software.

[Diagram: HARDWARE and SOFTWARE, with SOFTWARE marked ×.]

Page 5: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The Traditional Risk Assessment Method

The example is water boiled in an electric kettle.

1. Hazard: the hot water, as thermal energy.
2. Hazardous situation: the cover opens and hot water spills.
3. Harm: someone gets burned.

Fig. 3. ISO 14971

P1 is the probability of a hazardous situation occurring. P2 is the probability of a hazardous situation leading to harm.
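As a small worked example of the relation above, the probability of occurrence of harm is the product P1 × P2. The sketch below assumes hypothetical numeric values for the kettle example; the slides give no numbers, and the function name is chosen only for illustration.

```python
def probability_of_harm(p1: float, p2: float) -> float:
    """Probability of occurrence of harm, P1 x P2, in the ISO 14971 risk model."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p1 * p2


# Hypothetical values for the electric kettle example (not from the slides).
p1 = 0.02  # the cover opens and hot water spills (hazardous situation occurs)
p2 = 0.5   # the spill leads to a burn (hazardous situation leads to harm)
print(probability_of_harm(p1, p2))
```

Such a product is meaningful when P1 can be estimated statistically, as for random hardware failures.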

Software ?

Page 6: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The Estimation of the probability of a hazardous situation

HARDWARE: the failure rate of random hardware failures.

USABILITY: the likelihood of a usability failure, on a scale from HIGH to LOW: Frequent, Probable, Occasional, Remote, Improbable. (Likelihood scale, SOURCE: IEC 80001-2-1, Step by Step)

SOFTWARE:
• Software is invisible.
• Failures caused by software occur systematically, not statistically.

We cannot estimate the probability or the likelihood of a failure caused by software.

Page 7: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Feature of Systematic Failure

Systematic failure is unwanted behaviour which is
• repeatable
– if the conditions can be exactly replicated
• predictable (but not accurately)
– all systems have flaws
• indefensible
– it should not occur, but it is extremely hard to prevent

Page 8: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The definition and explanation of Systematic Failure

This International Standard (NOTE 4):
• sets requirements for the avoidance and control of systematic faults, which are based on experience and judgment from practical experience gained in industry. Even though the probability of occurrence of systematic failures cannot in general be quantified the standard does, however, allow a claim to be made, for a specified safety function, that the target failure measure associated with the safety function can be considered to be achieved if all the requirements in the standard have been met;

SOURCE: IEC 61508-3:2010

Systematic Failure: failure, related in a deterministic way to a certain cause, that can only be eliminated by a change of the design or of the manufacturing process, operational procedures, documentation or other relevant factors

SOURCE: ISO 26262-1:2011

Page 9: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Two types of evaluation of the hazard caused by Systematic Software Failure

1. The probability of such failure shall be assumed to be 100 percent. (IEC 62304:2006)
• The probability is 100%.
• This 100 percent principle was chosen to be conservative, but it is not practical in real applications.

2. If the hazard could arise from a failure of the software, the risk should be evaluated from the following two concerns. (IEC 62304:2006 Amd. 1, this study)
• The 1st concern is the risk level, as the severity of the harm, before the risk control measures.
• The 2nd concern is the risk level, as the severity of the harm, after the risk control measures.
• The evaluation of the residual risk is important, but when software is the cause, the probability of occurrence of harm before the risk control measures is not.

Page 10: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The procedure of evaluation of the hazard caused by Systematic Software Failure

[Diagram: RISK, RISK CONTROL MEASURES, RESIDUAL RISK.]

If the hazardous situation is caused by systematic software failure, safety is affected by
• the hardware acting as the risk control measure, and
• the reliability of the critical software component.

After the risk control measures, we have to evaluate the residual risk to safety.

The probability of occurrence of harm caused by the software before the risk control measures is not necessary for the risk assessment.

Page 11: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Method of evaluating Systematic Failure

After countermeasures are in place, medical device manufacturers can evaluate the residual risk class from the combination of the following:

a. The severity of the residual risk

b. The reliability of the software items that could contribute to a hazardous situation

c. The safe architecture of the software system

None of these is an element of probability.

Page 12: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Relation between the risk control measures and Architecture.


Complicated Software Items (Low cohesion and High coupling)

Segregated Software Items (High cohesion and Low coupling)

Layered Architecture (3 Layers: Presentation, Domain and Data Source)

Result of having continuous addition (A real software system)

Clear Not Clear

Page 13: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The Principles of Electrosurgical Knife

The mode, cut or coagulation, is switched by software.

Mode: Principles

Cut: For cutting, a continuous single frequency sine wave is often employed.

Coagulation: For coagulation, the average power is typically reduced below the threshold of cutting. Generally, the sine wave is turned on and off in rapid succession.

Serious hazardous situations can arise in the software system.

Page 14: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Electrosurgical Knife Block Diagram

The wave is controlled and switched by the software.

The most serious hazard is unintended hemorrhage caused by abnormal output of the electrosurgical knife.

Let's look at the fault tree analysis on the following slides.

[Block diagram: two blocks are labelled "High Risk Software Component".]

Page 15: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Extended Notation of FTA (1)

[Figure: FTA example for the electrosurgical knife.
Top event: Abnormal Output of Electrosurgical Knife.
Other event labels: Abnormal Output caused by Hardware; Unintended Output caused by Software; Output Hardware Failure; High-frequency Wave Failure; Wave Circuit Failure; Timer Failure; Abnormal Monitoring Failure (shown twice); Failure of the Abnormal Detection; A/D Convertor Failure; Cut/Coag Mode Mismatch.
Class labels on the events: Class C, Class C, Class B; Class Bs; Class Bs, Class B; Class Cs.
Gate annotations:
Class C = OR (C, C, B)
Class Bs = AND (Bs, B)
Class A(C)s = AND (C, --Bs)
Class A(C)s = AND (Cs, --Bs)
Class A(C)s = OR (A(C)s, A(C)s)]

1st column from the bottom and on the left side of FTA Example

a. There are three hardware failures.
b. Each failure is classified by its risk level.
c. The three basic events are connected with an OR gate.
d. The highest risk class is adopted by the OR function.

Risk Class Definition (Source IEC 62304:2006)

Class A No injury or damage to health is possible

Class B Non-serious injury is possible

Class C Death or serious injury is possible
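
The OR rule described on this slide (the highest risk class is adopted) can be sketched in a few lines. This is only an illustration under the assumption that risk classes are written as the single letters A, B and C; the names `RISK_CLASSES` and `or_gate` are hypothetical and do not come from the slides.

```python
# Risk classes from IEC 62304:2006, ordered from lowest to highest.
RISK_CLASSES = ["A", "B", "C"]


def or_gate(*classes: str) -> str:
    """Return the highest risk class among the inputs of an OR gate."""
    return max(classes, key=RISK_CLASSES.index)


# The three hardware failures on this slide: Class C, Class C, Class B.
print(or_gate("C", "C", "B"))  # C, matching "Class C = OR (C, C, B)"
```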

Page 16: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Extended Notation of FTA (2)

2nd column from the bottom and on the left side of FTA Example

a. The right basic event is an abnormal monitoring failure.
b. This event is caused by the software.
c. It is described as Class Bs: risk Class B as the impact level, with "s" marking the effect of the software.
d. The abnormal monitoring inhibits and controls the output hardware failure. This is indicated in the AND function as AND(C, --Bs). The number of minus signs shows the number of stages of inhibition. In this case, the risk control measure lowers the risk level by two stages, from C to A.


[Callout: Class C → Class A. The risk control measure (Class Bs), marked "--", lowers the risk level by two stages: Class A(C)s = AND(C, --Bs).]
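
The AND function with an inhibiting risk control measure can be sketched as follows. This is a minimal illustration, not the authors' tooling: it assumes classes are written as strings such as "C", "Cs" or "Bs", that the number of minus marks is passed as an integer, and that the residual class cannot drop below A. The function name `and_with_control` is hypothetical.

```python
LEVELS = ["A", "B", "C"]  # IEC 62304:2006 risk classes, lowest to highest


def and_with_control(event: str, control: str, stages: int) -> str:
    """Combine an event with a risk control measure that inhibits it.

    event   -- class of the inhibited event, e.g. "C" or "Cs"
    control -- class of the risk control measure, e.g. "Bs"
    stages  -- number of minus marks in front of the control
    Returns the residual class, the class before control in parentheses,
    and "s" if software is involved on either side.
    """
    before = event[0]
    # Assumption: the residual class is clamped at the lowest class, A.
    residual = LEVELS[max(0, LEVELS.index(before) - stages)]
    software = event.endswith("s") or control.endswith("s")
    return f"{residual}({before}){'s' if software else ''}"


print(and_with_control("C", "Bs", 2))   # A(C)s, as in AND(C, --Bs)
print(and_with_control("Cs", "Bs", 2))  # A(C)s, as in AND(Cs, --Bs)
```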

Page 17: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Extended Notation of FTA (3)

1st column from the bottom and on the right side of FTA Example

a. The abnormal monitoring failure is caused by the software.
b. The A/D convertor failure is caused by hardware.
c. If a basic event does not inhibit the other basic event, the highest risk class is adopted by the AND function. (This method is inspired by the notation of ASIL decomposition in ISO 26262-9.)
d. The subscript "s" is inherited from the left side to the right side through the function, as the effect of the software on the system.
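
Rules c and d can be sketched together: an AND function without an inhibiting input takes the highest class, and the "s" mark is inherited by the result. The sketch assumes classes are written as strings such as "B" or "Bs" and that "s" propagates whenever any input carries it; the name `and_gate` is hypothetical.

```python
LEVELS = ["A", "B", "C"]  # IEC 62304:2006 risk classes, lowest to highest


def and_gate(*events: str) -> str:
    """AND function with no inhibiting input: adopt the highest class.

    Assumption: the result carries "s" if any input event carries "s".
    """
    highest = max(events, key=lambda e: LEVELS.index(e[0]))
    software = any(e.endswith("s") for e in events)
    return highest[0] + ("s" if software else "")


# Abnormal monitoring failure (software, Bs) and A/D convertor failure (hardware, B).
print(and_gate("Bs", "B"))  # Bs, matching "Class Bs = AND (Bs, B)"
```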


Page 18: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Extended Notation of FTA (4)

1st column from the top of FTA Example

a. The highest risk class is adopted by the OR function. In this case, the risk classes are the same.
b. The risk class of the top event is finally expressed as Class A(C)s.

• The following can be recognized from this notation:
– The risk class of the residual risk is A.
– The highest risk class before the risk control measure is C.
– The software affects the top event or the risk control measure in the system.
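
The whole example tree can be evaluated bottom-up with the rules from the preceding slides. The sketch below is one possible reading of the figure, not the authors' implementation: the tree shape is inferred from the gate annotations, "s" is assumed to propagate whenever any input carries it, and the names `Risk`, `basic`, `gate` and `and_inhibit` are hypothetical.

```python
from dataclasses import dataclass

LEVELS = "ABC"  # IEC 62304:2006 risk classes, lowest to highest


@dataclass(frozen=True)
class Risk:
    residual: str           # class after risk control
    before: str             # highest class before risk control
    software: bool = False  # the "s" mark

    def label(self) -> str:
        core = self.residual if self.residual == self.before \
            else f"{self.residual}({self.before})"
        return core + ("s" if self.software else "")


def basic(level: str, software: bool = False) -> Risk:
    return Risk(level, level, software)


def gate(*inputs: Risk) -> Risk:
    """OR gate, or AND gate without an inhibiting input: take the highest class."""
    return Risk(max((r.residual for r in inputs), key=LEVELS.index),
                max((r.before for r in inputs), key=LEVELS.index),
                any(r.software for r in inputs))


def and_inhibit(event: Risk, control: Risk, stages: int) -> Risk:
    """AND gate whose control input lowers the event by `stages` classes."""
    lowered = LEVELS[max(0, LEVELS.index(event.residual) - stages)]
    return Risk(lowered, event.before, event.software or control.software)


# Left branch: three hardware failures, inhibited by abnormal monitoring (Bs).
output_hw = gate(basic("C"), basic("C"), basic("B"))
hw_branch = and_inhibit(output_hw, basic("B", software=True), stages=2)

# Right branch: mode mismatch (Cs), inhibited by abnormal detection = AND(Bs, B).
detection = gate(basic("B", software=True), basic("B"))
sw_branch = and_inhibit(basic("C", software=True), detection, stages=2)

top = gate(hw_branch, sw_branch)
print(top.label())  # A(C)s
```

Keeping the residual class, the class before control and the software mark as separate fields is what lets the top-event label report all three facts listed above.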


Page 19: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Effectiveness of this Notation

This notation is effective in the following ways.

• The safety analysts can recognize
– the risk class before and after the risk control measure;
– that the software in the system and the risk control measure affect the top event;
– the effect of the risk control, from the minus mark in the AND function.

• When an event in the fault tree carries the mark "s", the safety analysts can find the starting point of the software's effect on system safety.

• When an event carries both the mark "s" and the minus mark, the safety analysts can recognize the risk introduced by changing the software of the risk control measure.

Page 20: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Effectiveness of this Notation

[Figure callouts on the FTA example:]
• The starting point of the software's effect on system safety.
• The risk introduced by changing the software of the risk control measure (marked in two places).

Page 21: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Attention!
• FTA is an excellent way to show the structure of the mechanism by which the top event, an "undesired state of the system", is generated.
• On the other hand, calculating the failure rate on an FTA also has a dangerous side.

Before systematic software failure was recognized, the analysis of a radiation therapy machine named Therac-25 included the software in the fault trees but used a "generic failure rate" of 10^-4 for software events.

This number was justified based on the historical performance of the Therac-25 software. (Source: SAFEWARE by Prof. Nancy Leveson)

Today we understand the characteristics of software well, and we recognize that this is not realistic.

1. The evaluation of the residual risk is important.
2. We can evaluate the severity of the harm before and after the risk control measures. Therefore, we should focus on the architecture of the software system and the structure of the risk control measures.

Page 22: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Thank you. I hope this notation will be used in the real development of medical devices.

Page 23: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


REFERENCES
[1] Dolores R. Wallace, D. Richard Kuhn, "Failure Modes in Medical Device Software: An Analysis of 15 Years of Recall Data", 2001
[2] S. Shirasaka, Y. Sakai, Y. Nishi, "Feature Analysis of Estimated Causes of Failures in Medical Device Software and Proposal of Effective Measures", ISSRE 2011
[3] ISO 14971:2007 Medical devices - Application of risk management to medical devices
[4] ISO 26262-1:2011 Road vehicles - Functional safety - Part 1: Vocabulary
[5] IEC/TR 80001-2-1 Application of risk management for IT-networks incorporating medical devices – Part 2-1: Step-by-step risk management of medical IT-networks – Practical applications and examples
[6] IEC 62304:2006 Medical device software - Software life cycle processes
[7] Katerina Goseva-Popstojanova, Ahmed Hassan, Ajith Guedem, Walid Abdelmoez, Diaa Eldin M. Nassar, Hany Ammar, Ali Mili, "Architectural-Level Risk Analysis Using UML", IEEE Transactions on Software Engineering, vol. 29, no. 10, October 2003
[8] Sherif M. Yacoub, Hany H. Ammar, "A Methodology for Architecture-Level Reliability Risk Analysis", IEEE Transactions on Software Engineering, vol. 28, no. 6, June 2002

Page 24: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Extra Information for this study

Page 25: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Therac-25 FTA
• The probability for the computer to choose the wrong energy is 10^-11.
• The probability for the computer to choose the wrong mode is 4×10^-9.
• A hardware safety device was removed for economic reasons.
• Systematic software failure had not been recognized.
• This number was justified based on the historical performance of the Therac-25 software.

[Figure: Therac-25 fault tree fragment, shown with a VT100 terminal and a PDP-11 computer. Events: "System outputs the wrong energy"; "Computer chooses the wrong energy" (0.00000000001); "Computer chooses the wrong mode" (0.000000004). Callouts question both values: "The probability is 10^-11?" and "The probability is 4×10^-9?"]

Page 26: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


IEC 80001-2-1 Figure 8


Work Sheet Example of Hazard Analysis

Page 27: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


New Hazard Analysis of the real medical devices.

Probability should be replaced with Effect of Risk Control Measure (e.g. Major/Moderate/Minor).

If hardware faults and software errors occur in combination, we should separate the concerns into Hardware, Usability and Software.

Add "Risk Control Measure Type of Concern": SOFTWARE, USABILITY, HARDWARE, COMBINATION of ・・・

Probability should be replaced with Probability, Likelihood, or NA (Software): Not Applicable.
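
One way to picture the revised worksheet is as a record type in which the occurrence column is left empty (NA) for software. This is only a sketch under assumptions: the field names, the `Concern` enum and the `WorksheetRow` class are hypothetical and are not taken from IEC 80001-2-1 or from the slides, and the example row values are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Concern(Enum):
    """The proposed "Risk Control Measure Type of Concern" column."""
    HARDWARE = "hardware"
    USABILITY = "usability"
    SOFTWARE = "software"
    COMBINATION = "combination"


@dataclass
class WorksheetRow:
    hazard: str
    concern: Concern
    # Probability (hardware), likelihood (usability), or None for NA (software).
    occurrence: Optional[str]
    # Effect of the risk control measure: "Major", "Moderate" or "Minor".
    control_effect: str

    def __post_init__(self) -> None:
        if self.concern is Concern.SOFTWARE and self.occurrence is not None:
            raise ValueError("occurrence is not applicable (NA) for software")


# Hypothetical rows for illustration only.
rows = [
    WorksheetRow("Output hardware failure", Concern.HARDWARE, "Remote", "Major"),
    WorksheetRow("Cut/coag mode mismatch", Concern.SOFTWARE, None, "Major"),
]
for row in rows:
    print(row.hazard, row.concern.value, row.occurrence or "NA", row.control_effect)
```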

Page 28: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Separation of The Concern for the risk assessment

[Diagram: three separated concerns, labelled 1st Concern, 2nd Concern and 3rd Concern.
HARDWARE: Probability (statistically).
USABILITY: Likelihood.
SOFTWARE: NA → the risk level, that is, the risk level before the risk control measures and the risk level after the risk control measures.]

Page 29: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


IEC 80001-2-1 Table D.3

Usability: likelihood applies (○). Software: likelihood does not apply (×).

If the hazardous situation originates in the software, we can estimate the risk level only as the severity of the harm after the risk control measures.

Page 30: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


[Diagram: Medical Device System (Hardware & Software). Labels: Requirements Analysis, Risk Assessment, Risk Reduction; User Needs, Intended Use; Hazard, Hazardous Situation & Harm; Software Architecture, Risk Control Measure, Residual Risk. The ISO 14971 risk model (Hazard, sequence of events, Hazardous Situation, Harm; Severity of the Harm; Probability of Occurrence of Harm, P1 × P2; Risk) is annotated "Change the method of the risk assessment!"]

The important aspects

We should focus on the architecture of the software system and the structure of the risk control measures.

Page 31: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


IEC 62304:2006 Amd1 CD 4.3 Software safety classification

This chart and our study use the same classification method.

Page 32: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


The Types of Safety Design

Fault Avoidance

Fail Safe

Fault Tolerance

Error Proof (Fool Proof)

Total Optimization Specific Optimization

Usability

USER

Contrasting Method

Architecture

Specific optimization, as in the fault avoidance approach, is not realistic for large-scale and complicated software systems.

A total optimization approach is reasonable for today's medical device software.

Page 33: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


Safety Design Method: Realization Technique

Fault Avoidance: Formal Method; High Coverage Testing

Fail Safe: Interlock; Lockout; Safeguard

Fault Tolerance: Space Tolerance (Main, Sub); Time Tolerance (1st, 2nd); Information Tolerance (Main Information, Error Correction)

Error Proof / Fool Proof: Easy Operation; Home button; Safety Label

Page 34: An Extended Notation of FTA for Risk Assessment of Software-intensive Medical Devices.


ISO 26262-9 Figure 2 — ASIL decomposition schemes


• If the basic event does not inhibit the other basic event, the highest risk class is adopted by the AND function. (This method is inspired by the notation of ASIL decomposition in ISO 26262-9)

An AND function with no inhibiting risk control element should select the maximum level among the failures, because the notation focuses on the risk class before and after the risk control measures.