Evaluation of Commercial Off The Shelf (COTS) Operating System (OS) Malfunction Mitigation Methods
C. Forni, ATK; B. Blake, ATK; R. Hall, Textron; D. Magidson, ARDEC; L. Borshard, ARDEC; J. Fornoff, ARDEC

Mar 27, 2015


Antonio Garza


Summary – Mitigation Cases

[Figure: three block diagrams, each with OS, APP, User Interface, Normal Function and HAZARD blocks. The second diagram adds a Middleware block and the third adds a Wrapper block.]

• No Mitigations Case: an error at any point causes a hazard.
• Middleware: OS-related errors are mitigated; others are still possible.
• Function Wrappers: some OS-related errors are mitigated; others are still possible.

Legend: Hazardous Errors Propagated; Some Hazardous Errors Blocked (lead to normal function); Identified Hazardous Errors Blocked; Identified Hazardous Errors Mitigated.


Summary – Mitigation Cases

[Figure: three block diagrams built from OS, APP, APP Mit, User Interface, Normal Function, HAZARD and Safe blocks. The third diagram adds Middleware, a second APP and a Shared Resource.]

• App Based OS Mitigations: OS errors are mitigated at the App; App errors can still produce hazards.
• App Based OS Mitigations: OS errors are mitigated at the App interface; App errors are mitigated in the App.
• Middleware & App Based OS Mitigations: same as above, plus blocks errors from a 2nd app or shared resource.

More than one App = Middleware necessary!


Summary – Safety Impact Required for OS Changes

[Figure: three block diagrams built from OS, APP, APP Mit, User Interface, Normal Function and HAZARD blocks, with the region marked "Evaluate for Safety Impact". The second and third diagrams show an "OS or HW upgrade" input; the third adds a "Lock Out if Config Changes" block.]

• App Based OS Mitigations: OS errors are mitigated at the App; App errors are mitigated in the App. The App and all mitigations require safety analysis.
• App Based OS Mitigations + Wrapper: same as above, but the wrapper must also be evaluated.
• OS or HW upgrade: Lock Out if Config Changes.
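The "Lock Out if Config Changes" idea can be illustrated with a short sketch. It assumes the application can read a record describing the platform it runs on and compare it with the record that was covered by the safety analysis. The field names (`os`, `os_version`, `board`) and both function names are hypothetical; the slides do not say how the lockout is implemented.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 digest of a platform configuration record."""
    # Sorting keys makes the digest independent of dictionary ordering.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_locked_out(current_config: dict, approved_fingerprint: str) -> bool:
    """True when the running configuration differs from the analyzed one."""
    return config_fingerprint(current_config) != approved_fingerprint


# Hypothetical configuration that the safety analysis was performed against.
approved = {"os": "ExampleOS", "os_version": "1.0", "board": "rev-A"}
approved_fp = config_fingerprint(approved)

# The same platform after an OS upgrade that has not been re-analyzed.
upgraded = dict(approved, os_version="1.1")
```

With this arrangement an OS or hardware upgrade changes the fingerprint, so the application refuses to run its hazardous functions until the new configuration has been evaluated and its fingerprint approved.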


Possible OS Fault Mitigation Methods

• Use an OS with known safety integrity
• Use an OS with middleware of known safety integrity
• Use an OS with all required mitigation within the Application


OS Fault Mitigation Method Selection Process Flow

[Flowchart: the boxes are listed below by type; branch connections are not shown.]

Start: Evaluate Target OS

Decisions:
• Is the target OS safety certified?
• Is there a need for a "context free" assessment of the safety criticality of the OS?
• Are program resources adequate to perform a context free assessment?
• Is there a need to provide OS isolation (reuse, Alt OS, incompatibilities)?
• Is it beneficial and practical to provide mitigations via Middleware or Wrappers?
• Is adequate Middleware available for use with this OS?
• Are program resources adequate to develop adequate Middleware?

Actions:
• Perform OS "Context Free" assessment
• Select Another OS
• Acquire Middleware
• Develop Middleware
• Analyze Application to determine suitability of OS/Middleware and Additional Required Mitigations
• Implement Application Mitigation in the form of Wrappers or Middleware
• Implement In-Line Application Mitigation

End: Done


Evolution of OS Functionality and Hazard Causal Factors

[Figure: four block diagrams. Blocks in each diagram:]

• Diagram 1: Hardware Resource; HW Status; Event Detection and Activation Software; Application Function to be activated on event.
• Diagram 2: Hardware Resource; HW Status; Black Box Function (Status Generation Hardware and/or Software); Status; Event Detection and Activation Software; Application Function to be activated on event.
• Diagram 3: Hardware Resource; HW Status; Black Box Function, w/ Cause for Case 4 (Condition Detection / Event Generation Hardware and/or Software); Event; Event Detection and Activation Software; Application Function to be activated on event.
• Diagram 4: Hardware Resource; HW Status; Black Box Function, w/ Cause for Case 6 (Event Detection Hardware and/or Software; Callback Event Activation Software); Callback Event; Application Function to be activated on event.


Event Processing (activation of an Application Function)

Each case lists its Implementation, the Basis of false activation, and Possible mitigation methods.

Case 1
Implementation: Application directly obtains status from the hardware resource to determine if activation is necessary.
Basis of false activation: Probability of erroneous activation is based on the hardware failure rate and the “detection and activation software” failure rate.
Possible mitigation methods: Mitigation for possible “event detection and activation software” failures is achieved by Application verification of the HW status in cases where function activation is allowed.

Case 2
Implementation: Application obtains status from a ‘black box’ function to determine if activation is necessary.
Basis of false activation: Probability of erroneous activation is based on the “black box” function failure rate and the “detection and activation software” failure rate. Note: Pushing “event detection and activation software” functionality into the “black box” function may result in a complexity decrease for the “event detection and activation software”.
Possible mitigation methods: Mitigation for “event detection and activation software” failures can be achieved by Application verification of the status in cases where function activation is allowed. Mitigation for a portion (“Status Generation Hardware and/or Software”) of the possible “black box” function failures can be achieved by Application verification of the HW in cases where function activation is allowed.

Case 3
Implementation: A ‘black box’ function generates an event to notify application software that the Application Function should be performed.
Basis of false activation: Probability of erroneous activation is based on the “black box” function failure rate and the “event activation software” failure rate.
Possible mitigation methods: Mitigation for “event activation software” failures can be achieved by Application Function verification of the event in cases where function activation is allowed.

Case 4
Implementation: Same as Case 3 with the following exception: the event contains the source of data resulting in the event. Note: Maximum mitigation would use HW Status as the Cause.
Basis of false activation: Probability of erroneous activation is based on the “black box” function failure rate and the “event activation software” failure rate.
Possible mitigation methods: Mitigation for “event activation software” failures can be achieved by Application Function verification as described for Case 2. Mitigation for a portion (“Event Detection Hardware and/or Software”) of the possible “black box” function failures can be achieved by Application verification of the Cause in cases where function activation is allowed.

Case 5
Implementation: Application Function is activated by a ‘black box’ function.
Basis of false activation: Probability of erroneous activation is based on the “black box” function failure rate.
Possible mitigation methods: In cases where function activation is allowed, no mitigation for “black box” function failures can be achieved without information about the source of the callback event.

Case 6
Implementation: Application is activated by a ‘black box’ activation function. The Activation Cause is available to the Application Function for validation. Note: Maximum mitigation would use HW Status as the Cause.
Basis of false activation: Probability of erroneous activation is based on the “black box” function failure rate. The cause data allows the probability of erroneous detection to be reduced to that of the “black box” function at the point where the cause data is generated.
Possible mitigation methods: In cases where function activation is allowed, mitigation for a portion (“Event Detection Hardware and/or Software” and “Callback Event Activation”) of the possible “black box” function failures can be achieved by Application Function verification of the Cause.
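Case 6 can be made concrete with a small sketch in which the callback carries its cause and the Application Function re-checks that cause against the hardware before acting. Everything named here (`CallbackEvent`, `cause_channel`, `on_callback`, and the three callables) is hypothetical and only illustrates the verification step; it is not taken from the presentation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CallbackEvent:
    # Hypothetical cause data: the hardware input the black box reports
    # as the reason for the callback.
    cause_channel: int


def on_callback(
    event: CallbackEvent,
    function_allowed: Callable[[], bool],
    read_hw_status: Callable[[int], bool],
    activate: Callable[[], None],
) -> bool:
    """Run the Application Function only if the reported cause is confirmed."""
    if not function_allowed():
        # Activation is not permitted in the current system state.
        return False
    if not read_hw_status(event.cause_channel):
        # The hardware does not confirm the cause: treat the callback
        # as an erroneous activation and do nothing.
        return False
    activate()
    return True
```

Reading the hardware status directly, rather than trusting a value forwarded by the black box, corresponds to the note that maximum mitigation would use HW Status as the Cause.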


Mitigation Assessment For COTS OS Usage Process Development

[Figure: OS/Application Interfaces. Blocks: Hardware, OS, OS functions, Application, OS Events (App configured), OS Exceptions.]

• IF1: App Control
• IF2: OS Events
• IF3: App Configured Events (Callbacks)
• IF4: OS Function Call


Application Control and Support (IF1) clue list

• What if the OS Aborts Application Load (insufficient resources)?

• What if the OS Aborts Application execution (insufficient resources, hardware and/or execution problems)?

• What if the OS provides insufficient processing time (higher-priority activities, scheduler problem, etc.)?

• What if the OS and/or Application memory (program or data) is corrupted?

• What if an application-created thread is aborted?

• What if execution is performed at random locations in memory within the application?
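One of these clues, insufficient processing time, lends itself to a simple detection sketch. It assumes a periodic application task that can note the time at which each cycle starts; `DeadlineMonitor`, its parameters and the tolerance value are hypothetical. In a real task the timestamps would come from a monotonic clock such as `time.monotonic()`.

```python
from typing import Optional


class DeadlineMonitor:
    """Detects cycles of a periodic task that start later than expected."""

    def __init__(self, period_s: float, tolerance_s: float) -> None:
        self.period_s = period_s
        self.tolerance_s = tolerance_s
        self.last_start_s: Optional[float] = None

    def cycle_started(self, now_s: float) -> bool:
        """Record a cycle start; return False if this cycle began too late."""
        on_time = True
        if self.last_start_s is not None:
            gap = now_s - self.last_start_s
            on_time = gap <= self.period_s + self.tolerance_s
        self.last_start_s = now_s
        return on_time
```

A `False` result only detects the problem. What the application does next (for example, moving to a safe state) is a separate decision that depends on the hazard analysis.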


Application Control and Support (IF1) Analysis Partitioning

• Pre-Execution Analysis
  - Hardware Power-Up and Initialization
  - OS Load and Startup
  - Application Load and Initialization

• Execution Analysis – Partitioned by Fault Source Classes
  - Execution Integrity
  - Data Integrity
  - Resource Integrity

• Post-Execution Analysis
  - Application Termination
  - OS Termination
  - Hardware Power-Down
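For the Data Integrity fault class, one common way for an application to detect a corrupted data element is to store a checksum next to the value and verify it on every read. The sketch below is an illustration under that assumption; the class name `GuardedInt` and the choice of CRC-32 are not from the presentation.

```python
import struct
import zlib


class GuardedInt:
    """Holds a 64-bit signed integer together with a CRC-32 of its bytes."""

    def __init__(self, value: int) -> None:
        self.store(value)

    def store(self, value: int) -> None:
        # Keep the packed bytes and their checksum side by side.
        self._raw = struct.pack("<q", value)
        self._crc = zlib.crc32(self._raw)

    def load(self) -> int:
        # Verify the checksum before handing the value back to the caller.
        if zlib.crc32(self._raw) != self._crc:
            raise ValueError("data element failed its integrity check")
        return struct.unpack("<q", self._raw)[0]
```

A checksum detects corruption of the stored bytes but does not correct it, so the caller still needs a defined response to the raised error.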


Process Chart for Application Control And Support (IF1)

[Three flowcharts. For each, the decision boxes are listed in chart order, followed by the outcome boxes.]

Generalized Execution Integrity mitigation analysis process flow
1. Can a hazardous condition occur if erroneous execution occurs inside the function?
2. Is it possible to detect that the execution fault has occurred?
3. Are there app methods that can adequately mitigate a detected execution fault condition?
Outcomes: Done; Implement App Mitigation; Evaluate Alternate Mitigation.

Generalized Data Integrity mitigation analysis process flow
1. Can a hazardous condition occur if a data element is corrupt?
2. Is it possible to detect if the element has been corrupted?
3. Are there app methods that can adequately mitigate a detected data corruption?
Outcomes: Done; Implement App Mitigation; Evaluate Alternate Mitigation.

Generalized Resource Integrity mitigation analysis process flow
1. Can a hazardous condition occur if the resource fails (insufficient CPU time, resource access, etc.)?
2. Is it possible to detect if the resource has failed?
3. Are there app methods that can adequately mitigate the resource failure?
Outcomes: Done; Implement App Mitigation; Evaluate Alternate Mitigation.
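The three IF1 flows ask the same sequence of questions, so they can be summarized by one small function. The branch directions used below are an assumption, since the extracted charts do not preserve the arrows: no hazard leads to Done, an undetectable fault leads to Evaluate Alternate Mitigation, and a detectable fault leads to Implement App Mitigation only if an adequate app method exists. The names `Outcome` and `analyze` are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    DONE = "Done"
    IMPLEMENT_APP_MITIGATION = "Implement App Mitigation"
    EVALUATE_ALTERNATE_MITIGATION = "Evaluate Alternate Mitigation"


def analyze(hazard_possible: bool, fault_detectable: bool,
            app_can_mitigate: bool) -> Outcome:
    """Walk the three questions of a generalized integrity analysis flow."""
    if not hazard_possible:
        # The fault cannot produce a hazardous condition.
        return Outcome.DONE
    if not fault_detectable:
        # The application cannot react to a fault it cannot see.
        return Outcome.EVALUATE_ALTERNATE_MITIGATION
    if app_can_mitigate:
        return Outcome.IMPLEMENT_APP_MITIGATION
    return Outcome.EVALUATE_ALTERNATE_MITIGATION
```

The same function applies to execution, data and resource integrity; only the meaning of "fault" changes between the three charts.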


OS Events (IF2) clue list

• What if a hardware-fault-related OS exception occurs (processor, interface, resource, memory access, etc.)?

• What if an OS-provided interface exception or an OS operation exception occurs?


Process Chart for OS Events (IF2)

[Flowchart: Type 2 OS Interface analysis process flow. Decision boxes are listed in chart order, followed by the outcome boxes.]

Start: Evaluate Type 2 OS Interface Event
1. Can a hazardous condition occur if the interface event occurs?
2. Is it possible to notify the application of event occurrence?
3. Are there app methods that can adequately mitigate event occurrence?
Outcomes: Done; Implement App Mitigation; Evaluate Alternate Mitigation.


Application Configured Events (IF3) clue list

• What if the triggering event does not occur when it should?

• What if the triggering event occurs when it is not allowed?

• What if erroneous triggering of the event occurs when allowed?
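The first two clues can be illustrated with a guard placed around an application-configured callback. The sketch assumes the application knows when the function is allowed and how long it may go without seeing the event; `GuardedCallback` and its members are hypothetical names. The third clue, erroneous triggering while the function is allowed, needs information about the cause of the event and is not covered by this guard.

```python
from typing import Callable


class GuardedCallback:
    """Wraps a callback to reject disallowed events and spot missing ones."""

    def __init__(self, action: Callable[[], None], max_gap_s: float,
                 now_s: float) -> None:
        self._action = action
        self._max_gap_s = max_gap_s
        self._last_event_s = now_s
        self.allowed = False   # set by the application when activation is permitted
        self.rejected = 0      # count of events that arrived while not allowed

    def on_event(self, now_s: float) -> bool:
        """Called by the OS when the event fires; returns True if acted on."""
        self._last_event_s = now_s
        if not self.allowed:
            self.rejected += 1
            return False
        self._action()
        return True

    def overdue(self, now_s: float) -> bool:
        """True when the event has not occurred within the expected gap."""
        return now_s - self._last_event_s > self._max_gap_s
```

The application would poll `overdue` from an independent periodic task, since a missing event cannot report itself.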


Process Chart for Application Configured Events (IF3)

[Flowchart. Decision boxes are listed in chart order, followed by the outcome boxes.]

Start: Evaluate Type 3 OS Interface Function (Callback)

Decisions:
• Is there potential for indeterminate / undesirable behavior?
• Is the event-activated function always allowed?
• Can a hazardous condition occur if activated when the function is not allowed?
• Are App mitigations adequate to prevent hazardous conditions if activated when the function is not allowed?
• Can a hazardous condition occur if activated erroneously when the function is allowed?
• Are App mitigations adequate to prevent hazardous conditions if erroneously activated when the function is allowed?

Outcomes: No Mitigation Required; Use App Mitigations; Evaluate Alternate Mitigation; Done.

Note: Degradation of other mitigations may constitute a hazardous condition.


OS Function Calls (IF4) clue list

General clue:
• What if the function does not return?

Clues for a function used to set up data:
• What if the function returns without the data being properly set (exception or fault indication not provided, valid exception or fault indication provided, erroneous indication of fault status)?

Clues for a function used to get data:
• Is the status of the acquisition function available on return?
• What if the function returns without the data being properly acquired (exception or fault indication not provided, valid exception or fault indication provided, erroneous indication of fault status)?
• What if the function returns properly with invalid data acquired (stale data, out of range, erroneous data)?

Clues for a function that waits for conditions to be met:
• What if the function returns without the condition being met (exception or fault detected, erroneous condition met indication, condition met previously [stale data], timeout too early, timeout too late)?
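The data-acquisition clues (fault status, stale data, out-of-range data) suggest checks an application can run on whatever an OS call returns. The sketch below assumes the returned data carries a timestamp and a status flag; the `Reading` type, its fields and `check_reading` are hypothetical and stand in for whatever the real OS interface provides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Reading:
    value: float
    timestamp_s: float  # when the data was produced
    status_ok: bool     # fault indication returned with the data


def check_reading(reading: Reading, now_s: float, max_age_s: float,
                  low: float, high: float) -> List[str]:
    """Return the list of problems found; an empty list means usable data."""
    problems = []
    if not reading.status_ok:
        problems.append("fault status reported")
    if now_s - reading.timestamp_s > max_age_s:
        problems.append("stale data")
    if not (low <= reading.value <= high):
        # Written this way so that NaN is also rejected.
        problems.append("out of range")
    return problems
```

These checks cannot catch data that is wrong but plausible, which is why the clue list treats "erroneous data" as a separate question.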


Process Charts for OS Function Calls (IF4)

[Flowchart: Type 4 OS function analysis process flow. Decision boxes are listed in chart order.]

Start: Evaluate Type 4 OS Interface Function (call)

Classification of the function:
• Does the OS IF function perform setup and control activities? Yes: Evaluate Type 4 Setup and Control OS Interface Function.
• Does the OS IF function perform data acquisition or return status? Yes: Evaluate Type 4 Data Acquisition OS Interface Function.
• Does the OS IF function block the calling thread? Yes: Evaluate Type 4 Blocking OS Interface Function.

Function does not return:
1. Could a hazardous condition occur if the function does not return from the call?
2. Is it possible to detect that the call did not return?
3. Can the failure be adequately mitigated by knowing the call did not return?
Outcomes: Done; Implement App Mitigation; Evaluate Alternate Mitigation.
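The question "Is it possible to detect that the call did not return?" can be illustrated by running the call on a worker thread and waiting for it with a deadline. This is a sketch under that assumption; `call_with_deadline` is a hypothetical helper, and a call that never returns keeps occupying its worker thread, so detection alone does not recover the resource.

```python
import threading
from typing import Any, Callable, Tuple


def call_with_deadline(func: Callable[[], Any],
                       timeout_s: float) -> Tuple[bool, Any]:
    """Return (True, result) if func returns in time, else (False, None)."""
    outcome = {}

    def runner() -> None:
        try:
            outcome["value"] = func()
        except Exception as exc:  # hand the failure back to the caller
            outcome["error"] = exc

    # A daemon thread does not keep the process alive if func never returns.
    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["value"]
```

Whether knowing that the call did not return is enough to mitigate the failure is the next question in the chart, and the answer depends on what the application can still do without the result.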


Process Charts for OS Function Calls (IF4) Continued

[Two flowcharts. For each, the decision boxes are listed in chart order, followed by the outcome boxes.]

Type 4 Setup and Control / Data Acquisition OS function analysis process flow

Start: Evaluate Type 4 Setup and Control / Data Acquisition OS Interface Function

Decisions:
• Will a hazardous condition occur if the setup and control / data acquisition function fails to perform?
• Is status available to identify if the function has failed to perform?
• Are there app methods that can adequately mitigate the hazard conditions if not performed?
• Can a hazardous condition occur if the setup / returned data is incorrect when the function is performed?
• Will restricting/validating setup / returned data provide adequate mitigation for hazard conditions?
• Is there other adequate mitigation to handle the results of incorrect setup / received data?

Outcomes: Implement App Mitigation; Evaluate Alternate Mitigation; Done.

Type 4 Blocking OS function analysis process flow

Start: Evaluate Type 4 Blocking OS Interface Function

Decisions:
• Will a hazardous condition occur if the blocking function returns when it shouldn't? (Note: this includes returning late, early, or returning when the conditions are not met.)
• Is status available to identify why the function has returned?
• Are there app methods that can adequately mitigate a detected erroneous return condition?

Outcomes: Implement App Mitigation; Evaluate Alternate Mitigation; Done.
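For the Type 4 Blocking flow, one possible app method against a blocking call that returns when its condition is not met is to re-check the condition independently after every return. The sketch assumes the application has its own way to test the condition; `wait_until_confirmed` and its parameters are hypothetical names.

```python
from typing import Callable


def wait_until_confirmed(blocking_wait: Callable[[], None],
                         condition_is_met: Callable[[], bool],
                         max_attempts: int) -> bool:
    """Block, then confirm the condition; retry a bounded number of times."""
    for _ in range(max_attempts):
        blocking_wait()
        if condition_is_met():
            # The return from the blocking call is backed by an independent check.
            return True
    # The call kept returning without the condition holding.
    return False
```

This covers an early or spurious return. A late return would need a separate timing check, since the condition is true by the time the check runs.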