Safety-Critical Systems 2 T 79.232 Risk analysis and design for safety Ilkka Herttua
Dec 19, 2015

Page 1: Safety-Critical Systems 2 T 79.232 Risk analysis and design for safety Ilkka Herttua.

Safety-Critical Systems 2

T 79.232

Risk analysis and design for safety

Ilkka Herttua

Page 2:

V - Lifecycle model

[Figure: V lifecycle model. Recoverable labels follow.]

Phases, down the left leg and up the right leg of the V:

- Requirements Analysis
- Systems Analysis & Design
- Software Design
- Software Implementation & Unit Test
- Module Integration & Test
- System Integration & Test
- System Acceptance

Artefacts: Requirements Document, Requirements Model, Test Scenarios, Functional / Architectural Model, Specification Document, Knowledge Base **

** Configuration-controlled knowledge that increases in understanding until completion of the system:

• Requirements Documentation
• Requirements Traceability
• Model Data/Parameters
• Test Definition/Vectors

Page 3:

Overall safety lifecycle

1. Concept
2. System Definition and Application Conditions
3. Risk Analysis
4. System Requirements
5. Apportionment of System Requirements
6. Design and Implementation
7. Manufacture
8. Installation
9. System Validation (including Safety Acceptance and Commissioning)
10. System Acceptance
11. Operation and Maintenance
12. Performance Monitoring
13. Modification and Retrofit
14. Decommissioning and Disposal

Re-apply Lifecycle (see note)

Note: The phase at which a modification enters the lifecycle depends on both the system being modified and the specific modification under consideration.

Page 4:

Risk Analysis

• Risk is a combination of the severity (class) and the frequency (probability) of a hazardous event.

• Risk analysis is the process of evaluating the probability of hazardous events.

• The value of a life? Estimates range from 0.75M to 2M GBP; US figures are higher.

Page 5:

Risk Analysis

• Classes:
- Catastrophic: multiple deaths (>10)
- Critical: a death or severe injuries
- Marginal: a severe injury
- Insignificant: a minor injury

• Frequency categories (events/year):
- Frequent: 0.1
- Probable: 0.01
- Occasional: 0.001
- Remote: 0.0001
- Improbable: 0.00001
- Incredible: 0.000001
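As a rough illustration of how the frequency categories could be applied, the sketch below maps an estimated event rate to a category name. It assumes each figure on the slide is the lower bound of its band, which the slide does not state; the function name `frequency_category` is hypothetical.

```python
# Frequency categories from the slide, as (name, events per year).
# Assumption: each figure is treated as the lower bound of its band.
FREQUENCY_CATEGORIES = [
    ("Frequent", 1e-1),
    ("Probable", 1e-2),
    ("Occasional", 1e-3),
    ("Remote", 1e-4),
    ("Improbable", 1e-5),
    ("Incredible", 1e-6),
]


def frequency_category(events_per_year: float) -> str:
    """Return the first category whose threshold the rate reaches."""
    for name, threshold in FREQUENCY_CATEGORIES:
        if events_per_year >= threshold:
            return name
    # Anything rarer than the last threshold is still treated as "Incredible".
    return "Incredible"
```

For example, a hazardous event estimated at 0.05 events/year would fall into "Probable" under this reading.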

Page 6:

Hazard Analysis

• A hazard is a situation in which there is actual or potential danger to people or to the environment.

• Analytical techniques:
- Failure modes and effects analysis (FMEA)
- Failure modes, effects and criticality analysis (FMECA)
- Hazard and operability studies (HAZOP)
- Event tree analysis (ETA)
- Fault tree analysis (FTA)

Page 7:

Fault Tree Analysis 1

The diagram shows a heater controller for a tank of toxic liquid. The computer controls the heater using a power switch on the basis of information obtained from a temperature sensor. The sensor is connected to the computer via an electronic interface that supplies a binary signal indicating when the liquid is up to its required temperature. The top event of the fault tree is the liquid being heated above its required temperature.

Page 8:

[Figure: fault tree symbol legend]

- Fault event not fully traced to its source
- Basic event, input
- Fault event resulting from other events
- OR connection
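A minimal sketch of how an OR connection combines basic events into a top-event probability, assuming the basic events are statistically independent. The event names and probabilities below are hypothetical placeholders for the heater example; the slides do not give the actual tree or any figures.

```python
from math import prod


def or_gate(probabilities):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic events for "liquid heated above required temperature".
# Names and values are illustrative only, not taken from the slides.
basic_events = {
    "sensor_reads_low": 1e-3,
    "interface_signal_stuck": 5e-4,
    "computer_fault": 1e-4,
    "power_switch_stuck_on": 2e-4,
}

top_event_probability = or_gate(basic_events.values())
```

For small probabilities the OR-gate result is close to, and never larger than, the sum of the input probabilities.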

Page 9:

Risk acceptability

• National/international decision: the level of an acceptable loss (ethical, political and economic)

Risk analysis evaluation:

ALARP – as low as reasonably practicable (UK, USA)
“Societal risk has to be examined when there is a possibility of a catastrophe involving a large number of casualties”

GAMAB – Globalement Au Moins Aussi Bon = not greater than before (France)
“All new systems must offer a level of risk globally at least as good as the one offered by any equivalent existing system”

MEM – minimum endogenous mortality
“Hazard due to a new system would not significantly augment the figure of the minimum endogenous mortality for an individual”

 

Page 10:

Risk acceptability

Tolerable hazard rate (THR): a hazard rate which guarantees that the resulting risk does not exceed a target individual risk.

SIL 4 = 10^-9 < THR < 10^-8 per hour and per function

SIL 3 = 10^-8 < THR < 10^-7

SIL 2 = 10^-7 < THR < 10^-6

SIL 1 = 10^-6 < THR < 10^-5

Potential Loss of Life (PLL): expected number of casualties per year
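The SIL bands above can be read as a simple lookup from a tolerable hazard rate to a SIL. The sketch below is one way to write that lookup; it assumes the lower edge of each band is inclusive and the upper edge exclusive, which the slide leaves open, and `sil_for_thr` is a hypothetical name.

```python
# (SIL, lower bound, upper bound) in hazards per hour and per function,
# as listed on the slide. Boundary handling is an assumption.
SIL_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def sil_for_thr(thr_per_hour: float):
    """Return the SIL whose band contains the THR, or None if outside all bands."""
    for sil, low, high in SIL_BANDS:
        if low <= thr_per_hour < high:
            return sil
    return None
```

A THR of 5e-8 per hour would map to SIL 3 under these assumptions; rates outside the four listed bands return `None`.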

 

Page 11:

Current situation / critical systems

• Based on data on recent failures of critical systems, the following can be concluded:

a) Failures are becoming more and more distributed and are often nation-wide (e.g. commercial systems, such as credit card denial of authorisation).

b) The source of failure is more rarely in hardware (physical faults) and more frequently in system design or end-user operation / interaction (software).

c) The harm caused by failures is mostly economic, but sometimes health and safety concerns are also involved.

d) Failures can impact many different aspects of dependability (dependability = ability to deliver service that can justifiably be trusted).

Page 12:

Examples of computer failures in critical systems

Page 13:

Driving force: federation

• Safety-related systems have traditionally been based on the idea of federation. This means that a failure of any piece of equipment should be confined and should not cause the collapse of the entire system.

• When computers were introduced to safety-critical systems, the principle of federation was in most cases kept in force.

• Applying federation means that the Boeing 757 / 767 flight management control system has 80 distinct microprocessors (300, if redundancy is taken into account). Although having this number of microprocessors is no longer too expensive, there are other problems caused by the principle of federation.

Page 14:

Designing for Safety

• Fault groups:
- requirement/specification errors
- random component failures
- systematic faults in design (software)

• Approaches to tackle problems:
- right system architecture (fault-tolerant)
- reliability engineering (component, system)
- quality management (design and production processes)

Page 15:

Designing for Safety

• Hierarchical design
- simple modules, encapsulated functionality
- separated safety kernel for safety-critical functions

• Maintainability
- preventive versus corrective maintenance
- scheduled maintenance routines for the whole lifecycle
- faults easy to find and repair: short MTTR (mean time to repair)

• Human error
- proper HMI

Page 16:

Hardware Faults

Intermittent faults
- Fault occurs and recurs over time (loose connector)

Transient faults
- Fault occurs and may not recur (lightning)
- Electromagnetic interference

Permanent faults
- Fault persists / physical processor failure (design fault, overcurrent)

Page 17:

Fault Tolerance

• Fault-tolerant hardware
- Achieved mainly by redundancy

Redundancy
- Adds cost, weight, power consumption, complexity

Other means:
- Improved maintenance, single system with better materials (higher MTBF)
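The slides mention MTTR and MTBF without relating them. One common way to combine the two is steady-state availability, MTBF / (MTBF + MTTR). The sketch below assumes that standard formula applies to the system in question; it is not taken from the slides, and the numbers in the usage note are illustrative.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the system is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

With an illustrative MTBF of 999 hours and an MTTR of 1 hour, `availability(999.0, 1.0)` gives 0.999. Either raising MTBF (better materials, redundancy) or lowering MTTR (easier fault finding and repair) moves the figure towards 1.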

Page 18:

Redundancy types

Active redundancy:
- Redundant units are always operating.

Dynamic redundancy (standby):
- Failure has to be detected
- Changeover to another module
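The two steps of dynamic redundancy, failure detection and changeover, can be sketched in software as follows. This is a toy model under stated assumptions: a failure is "detected" when the active module raises an exception, and all class and function names are hypothetical.

```python
class StandbyPair:
    """Toy model of dynamic (standby) redundancy with a single changeover."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.switched = False

    def call(self, value):
        try:
            return self.active(value)
        except Exception:
            # Failure detected. If the standby is already in use, give up.
            if self.switched:
                raise
            # Changeover to the other module, then retry once.
            self.active = self.standby
            self.switched = True
            return self.active(value)


def faulty_primary(value):
    # Hypothetical module that always fails.
    raise RuntimeError("primary module failed")


def healthy_standby(value):
    # Hypothetical module that works: doubles its input.
    return 2 * value


pair = StandbyPair(faulty_primary, healthy_standby)
result = pair.call(21)
```

In real hardware the detection step is the hard part; this sketch sidesteps it by letting the exception stand in for a detected fault.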

Page 19:

Hardware redundancy techniques

Active techniques:
- Parallel (k of N)
- Voting (majority/simple)

Standby:
- Operating: hot standby
- Non-operating: cold standby
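To make the two active techniques concrete, the sketch below shows a majority voter and the reliability of a k-of-N parallel arrangement. It assumes identical, statistically independent units and a perfect voter, neither of which the slide states; the function names are hypothetical.

```python
from collections import Counter
from math import comb


def majority_vote(outputs):
    """Return the value produced by more than half of the units."""
    value, count = Counter(outputs).most_common(1)[0]
    if 2 * count > len(outputs):
        return value
    raise ValueError("no majority among unit outputs")


def k_of_n_reliability(k: int, n: int, unit_reliability: float) -> float:
    """Probability that at least k of n independent, identical units work."""
    r = unit_reliability
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))
```

For three units of reliability 0.9 with 2-of-3 voting, `k_of_n_reliability(2, 3, 0.9)` gives about 0.972, while a simple 1-of-2 parallel pair gives about 0.99.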

Page 20:

Reliability prediction

• Electronic components
- Based on probability and statistics
- MIL-HDBK-217: experimental data on actual device behaviour
- Manufacturer information and allocated circuit types
- Bathtub curve: burn-in, useful life, wear-out
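During the flat "useful life" part of the bathtub curve, the failure rate is usually treated as constant, which gives an exponential reliability function. The sketch below assumes that constant-rate model and, for the series case, that any single part failure fails the whole unit. The part failure rates in the usage note are illustrative, not handbook data.

```python
import math


def reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Probability of surviving `hours` with a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def mtbf_hours(failure_rate_per_hour: float) -> float:
    """Mean time between failures for a constant failure rate."""
    return 1.0 / failure_rate_per_hour


def series_failure_rate(part_rates):
    """Failure rate of a unit that fails when any one of its parts fails."""
    return sum(part_rates)
```

For instance, three parts with illustrative rates of 1e-6, 2e-6 and 2e-6 failures per hour give a unit rate of 5e-6 per hour, that is an MTBF of 200,000 hours.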

Page 21:

Safety-Critical Hardware

Fault detection:
- Routines to check that the hardware works
- Signal comparisons
- Information redundancy: parity check etc.
- Watchdog timers
- Bus monitoring: check that the processor is alive
- Power monitoring
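Two of the detection mechanisms listed above are easy to illustrate in software: a parity check and a watchdog timer. The sketch below assumes even parity and a watchdog that is fed timestamps explicitly, so it can be exercised without real time passing. All names are hypothetical, and a real watchdog is normally a hardware timer that resets the processor.

```python
def even_parity_bit(word: int) -> int:
    """Parity bit that makes the total number of 1 bits even."""
    return bin(word).count("1") % 2


def check_even_parity(word: int, parity_bit: int) -> bool:
    """True if the stored parity bit matches the word (no single-bit error seen)."""
    return even_parity_bit(word) == parity_bit


class Watchdog:
    """Software model of a watchdog: expires if not kicked within the timeout."""

    def __init__(self, timeout_s: float, start_s: float = 0.0):
        self.timeout_s = timeout_s
        self.last_kick_s = start_s

    def kick(self, now_s: float) -> None:
        # Called periodically by healthy software to show it is alive.
        self.last_kick_s = now_s

    def expired(self, now_s: float) -> bool:
        return now_s - self.last_kick_s > self.timeout_s
```

A single parity bit detects any odd number of flipped bits but misses an even number, which is why it is only one layer among the mechanisms listed.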

Page 22:

Safety-Critical Hardware

Possible hardware:

COTS microprocessors
- No safety firmware, least assurance
- Redundancy helps, but common failures are still possible
- Fabrication failures, microcode and documentation errors
- Use components which have a history and statistics.

Page 23:

Safety-Critical Hardware

Special microprocessors
- Collins Avionics/Rockwell AAMP2
- Used in Boeing 747-400 (30+ pieces)
- High cost: bench testing, documentation, formal verification
- Other models: SparcV7, TSC695E, ERC32 (ESA radiation-tolerant), 68HC908GP32 (airbag)

Page 24:

Safety-Critical Hardware

Programmable Logic Controllers (PLC)
• Contain a power supply, an interface and one or more processors
• Designed for high MTBFs
• Firmware
• Program stored in EEPROMs
• Programmed with ladder or function block diagrams

Page 25:

Safety management

• Safety culture/policy of the organisation

- Task for management ( Targets )

• Safety planning

- Task for safety manager ( How to )

• Safety reporting

- All personnel

- Safety log / validation reports

Page 26:

Home assignments

• 4.18 (tolerable risk)

• 5.10 (incompleteness within specification)

Email before 2. March to [email protected]