7/22/04 Report Back: Performance Analysis Track Dr. Carol Smidts Wes Deadrick.


Track Members

• Carol Smidts (UMD) – Track Chair
– Integrating Software into PRA

• Ted Bennett and Paul Wennberg (Triakis)
– Empirical Assurance of Embedded Software Using Realistic Simulated Failure Modes

• Dolores Wallace (GSFC)
– System and Software Reliability

• Bojan Cukic (WVU)
– Compositional Approach to Formal Models

• Kalynnda Berens (GRC)
– Software Safety Assurance of Programmable Logic
– Injecting Faults for Software Error Evaluation of Flight Software

• Hany Ammar (WVU)
– Risk Assessment of Software Architectures

Agenda

• Characterization of the Field

• Problem Statement

• Benefits of Performance Analysis

• Future Directions

• Limitations

• Technology Readiness Levels

Characterization of Field

• Goal: Prediction and Assessment of Software Risk/Assurance Level (Mitigation optimization)

• System characteristics of interest
– Risk (off-nominal situations)
– Reliability, availability, maintainability = Dependability
– Failures - general sense

• Performance Analysis Techniques - modeling and simulation, data analysis, failure analysis, design analysis focused on criticality

Problem Statement

• Why should NASA do performance analysis? - We care if things fail!

• Successfully conducting SW and System Performance Analysis gives us the data necessary to make informed decisions in order to improve performance and overall quality

• Performance analysis permits:
– Ability to determine if/when system meets requirements
– Risk reduction and quantification
– Application of new knowledge to future systems
– A better understanding of the processes by which systems are developed, which enables NASA to exercise continual improvement

Benefits of Performance Analysis

• Reduced development and operating costs

• Manage and optimize current processes, thereby resulting in more efficient and effective processes
– Defined and repeatable process – reduced time to do same volume of work

• Reduces risk and increases safety and reliability

• Better software architecture designs

• More maintainable systems

• Enable NASA to handle more complex systems in the future

• Put the responsibility where it belongs from an organizational perspective - focuses accountability

Future Directions for Performance Analysis

• Automation of modeling and data collection – increased efficiency and accuracy

• A more useful, better reliability model
– useful = user friendly (enable the masses, not just the domain experts), increased usability of the data (learn more from what we have)
– better = greater accuracy and predictability

• Define and follow repeatable methods/processes for data collection and analysis including:
– education and training
– use of simulation
– gold nugget = accurate and complete data

Future Directions for Performance Analysis (Cont.)

• Develop a method for establishing accurate performance predictions earlier in life cycle

• Evolve to refine system level assessment – factor in the human element

• Establish and define an approach to performing trade-off of attributes – reliability, etc.

• Need for early guidance on criticality of components

• Optimize a defect removal model

• Methods and metrics for calculating/defending return on investment of conducting performance analysis

Why not

• Standard traps - Obstacles
– Uncertainty about scalability
– User friendliness
– Lack of generality
– “Not invented here” syndrome

• Costs and benefits
– Difficult to assess and quantify
– Long term project benefit tracking recommended

Technology Readiness Level

• Integrating Software into PRA – Taxonomy (7)

• Test-Based Approach for Integrating SW in PRA (3)

• Empirical Assurance of Embedded Software Using Realistic Simulated Failure Modes (5)

• Maintaining system and SW test consistency (8)

• System Reliability (3)

• Software Reliability (9)

• Compositional Approach to Formal Models (2)

• Software Safety Assurance of Programmable Logic (2)

• Injecting Faults for Software Error Evaluation of Flight Software (9)

• Risk Assessment of Software Architectures (5)

Research Project Summaries

Integrating Software Into PRA

Dr. Carol Smidts, Bin Li

Objective:

• PRA is a methodology to assess the risk of large technological systems

• The objective of this research is to extend current classical PRA methodology to account for the impact of software on mission risk

Integrating Software Into PRA (Cont)

Achievements

1. Developed a software related failure mode taxonomy

2. Validated the taxonomy on multiple projects (ISS, Space Shuttle, X38)

3. Proposed a step-by-step approach to integration in the classical PRA framework with quantification of input and functional failures.
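As a rough illustration of what quantifying software contributions in a PRA-style model can look like, the sketch below adds two software basic events (an input failure and a functional failure) to a simple OR gate alongside a hardware event. It assumes independent events, and every probability and name in it is hypothetical; it is not the step-by-step approach developed in this research.

```python
from math import prod

def or_gate(probabilities):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

def and_gate(probabilities):
    """Probability that all of several independent events occur."""
    return prod(probabilities)

# Hypothetical per-mission failure probabilities (illustrative values only).
p_hardware = 1e-3
p_sw_input_failure = 2e-4        # software receives bad or missing input
p_sw_functional_failure = 5e-4   # software computes the wrong function

# Assumed structure: the software fails if either software failure mode occurs.
p_software = or_gate([p_sw_input_failure, p_sw_functional_failure])

# Top event (loss of function) with and without the software contribution.
p_top_without_sw = or_gate([p_hardware])
p_top_with_sw = or_gate([p_hardware, p_software])

print(f"without software: {p_top_without_sw:.6e}")
print(f"with software:    {p_top_with_sw:.6e}")
```

Comparing the two top-event values shows how much of the estimated mission risk a hardware-only model would leave out under these assumed numbers.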

Problem

Disconnect exists between system and software development loops

[Figure: system and software development loops. SYSTEM: Design/Debug; Analyze/Test/V&V; Model, Simulate, Prototype, ES, etc. SW: Interpretation; Requirements; Analyze/Test/Verify; Design/Debug; Build. Integration Testing.]

Most embedded SW faults found at integration test are traceable to requirements and interface misunderstanding

TRIAKIS Corporation

Approach

• Develop & simulate entire system design using executable specifications (ES)

• Verify total system design with suite of tests

• Simulate controller hardware

• Replace controller ES with simulated HW running object (flight) software

• Test SW using system verification tests

When SW passes all system verification tests, it has correctly implemented all of the tested requirements


Mini-AERCam

IV&V Facility

Empirical Assurance of Embedded SW Using Realistic Simulated Failure Modes

• Problem: FMEA Limitations
– Expensive & time-consuming
– List of possible failure modes extensive
– Focuses on prioritized subset of failure modes

• Approach: Test SW w/sim’d Failures
– Create pure virtual simulation of Mini-AERCam HW & flight environment running on PC
– Induce realistic component/subsystem failures
– Observe flight SW response to induced failures

Can we improve coverage by testing SW response to simulated failures?

Compare results with project-sponsored FMEA, FTA, etc.:
– # Issues uncovered?
– # Failure modes evaluated?
– Effort involved?
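The induce-and-observe loop described above can be sketched in a few lines. Everything here is a hypothetical stand-in: the sensor class, the failure mode names, the thresholds, and the `flight_sw_step` function are invented for illustration and do not represent the Mini-AERCam simulation or its flight software.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SimulatedRangeSensor:
    """Virtual hardware: returns a range reading, or a faulty value when a failure is induced."""
    true_range_m: float
    failure_mode: Optional[str] = None  # None, "stuck_at_zero", or "no_data" (hypothetical modes)

    def read(self) -> Optional[float]:
        if self.failure_mode == "stuck_at_zero":
            return 0.0
        if self.failure_mode == "no_data":
            return None
        return self.true_range_m

def flight_sw_step(reading: Optional[float]) -> str:
    """Stand-in for the software under test: pick a command from one sensor reading."""
    if reading is None or reading <= 0.0:
        return "SAFE_HOLD"  # assumed safe response to an invalid reading
    return "APPROACH" if reading > 5.0 else "STATION_KEEP"

def run_campaign(failure_modes):
    """Induce each failure mode in turn and record the software's response."""
    results = {}
    for mode in failure_modes:
        sensor = SimulatedRangeSensor(true_range_m=12.0, failure_mode=mode)
        results[mode] = flight_sw_step(sensor.read())
    return results

results = run_campaign([None, "stuck_at_zero", "no_data"])
for mode, response in results.items():
    print(f"failure mode {mode!r}: software responded with {response}")
```

Counting the failure modes exercised and the unsafe responses found in such a campaign gives the kind of numbers that could be compared against an FMEA or FTA.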


Software and System Reliability

Dolores Wallace, Bill Farr, Swapna Gokhale

• Addresses the need to evaluate and assess the reliability and availability of large complex software intensive systems by predicting (with associated confidence intervals):
– The number of software/system faults,
– Mean time to failure and restore/repair,
– Availability,
– Estimated release time from testing.
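To make the predicted quantities concrete, the sketch below computes point estimates of mean time to failure, mean time to repair, and steady-state availability using the standard relation A = MTTF / (MTTF + MTTR). The sample data are hypothetical, and the simple averaging is an assumption for illustration; it does not reproduce the reliability models or the confidence intervals mentioned above.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def steady_state_availability(mttf_h, mttr_h):
    """Long-run fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    return mttf_h / (mttf_h + mttr_h)

# Hypothetical observations, in hours (illustrative values only).
times_between_failures_h = [100.0, 150.0, 250.0]
repair_times_h = [2.0, 4.0, 6.0]

mttf_h = mean(times_between_failures_h)
mttr_h = mean(repair_times_h)
availability = steady_state_availability(mttf_h, mttr_h)

print(f"MTTF = {mttf_h:.1f} h, MTTR = {mttr_h:.1f} h, availability = {availability:.4f}")
```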

2003 & 2004 Research

2003 (Software Based)

• Literature search completed

• New models were selected: 1) Enhanced Schneidewind (includes risk assessment and trade-off analysis) and 2) Hypergeometric Model

• Incorporated the new software models into the established public domain tool SMERFS^3

• Applied the new models on a Goddard software project

• Made the latest version of SMERFS^3 available to the general public

2004 (System Based)

• Conducted similar research effort for System Reliability and Availability

• Will enhance SMERFS^3 and validate the system models on a Goddard data set

A Compositional approach to Validation of Formal Models

• Problem
– Significant number of faults in real systems can be traced back to specifications.
– Current methodologies of specification assurance have problems:
  • Theorem Proving: Complex
  • Model Checking: State explosion problems
  • Testing: Incomplete.

• Approach
– Combine them!
  • Use test coverage to build abstractions.
  • Abstractions reduce the size of the state space for model checking.
  • Develop visual interfaces to improve the usability of the method.

Dejan Desovski, Bojan Cukic

Software Fault Injection Process

Kalynnda Berens, Dr. John Crigler, Richard Plastow

• Standardized approach to test systems with COTS and hardware interfaces

• Provides a roadmap of where to look to determine what to test

[Flowchart nodes: Start; Obtain Source Code and Documentation; Identify Interfaces and Critical Sections; Error/Fault Research; Estimate Effort Required; decision "Sufficient time and funds?" (branch labeled Yes); Importance Analysis; Select Subset; Test Case Generation; Fault Injection Testing; Document Results, Metrics, Lessons Learned; Feedback to FCF Project; End]

Programmable Logic at NASA

Kalynnda Berens, Jacqueline Somos

• Issues
– Lack of good assurance of PLCs and PLDs
– Increasing complexity = increasing problems
– Usage and Assurance Survey - SA involved in less than 1/3 of the projects; limited knowledge

• Recommendations
– Trained SA for PLCs
– PLDs – determine what is complex; use process assurance (SA or QA)

• Training Created
– Basic PLC and PLD training aimed at SA
– Process assurance for hardware QA

Year 2 of Research

• What are industry and other government agencies doing for assurance and verification?
– An intensive literature search of white papers, manuals, standards, and other documents that illustrated what various organizations were doing.
– Focused interviews with industry practitioners. Interviews were conducted with assurance personnel (both hardware and software) and engineering practitioners in various industries, including biomedical, aerospace, and control systems.
– Meeting with FAA representatives. Discussions with FAA representatives led to a more thorough understanding of their approach and the pitfalls they have encountered along the way.

• Position paper, with recommendations for NASA Code Q

Current Effort

• Implement some of the recommendations
– Develop coursework to educate software and hardware assurance engineers
– Three courses
  • PLCs for Software Assurance personnel
  • PLDs for Software Assurance personnel
  • Process Assurance for Hardware QA
– Guidebook

• Other recommendations
– For Code Q to implement if desired
– Follow-up CSIP to try software-style assurance on complex electronics

Severity Analysis Methodology

Hany Ammar, Katerina Goseva-Popstojanova, Ajith Guedem, Kalaivani Appukutty, Walid AbdelMoez, and Ahmad Hassan

• We have developed a methodology to assess severity of failures of components, connectors, and scenarios based on UML models

• This methodology is applied on NASA’s Earth Observing System (EOS)

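[Figure: requirements matrix. Rows: requirements R1, R2, …, Rm with their scenarios (S1, S2, S3, …). Columns: failure modes FM1, FM2, …, FMn. Each cell holds a risk factor Rf; the highlighted cell is the risk factor of scenario S1 in failure mode FM2.]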

• Requirements are mapped to UML use case and scenarios

• A failure mode refers to the way in which a scenario fails to achieve its requirement

• According to Dr. Martin Feather’s DDP Process, “The Requirements matrix maps the impacts of each failure mode on each requirement.”

Requirement Risk Analysis Methodology

• We have developed a methodology for assessing requirements based risk using normalized dynamic complexity and severity of failures. This can be used in the DDP process developed at JPL.
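One way to picture how complexity and severity might combine into a risk factor is the sketch below. It assumes that dynamic complexity is normalized by its sum over scenarios and that the risk factor is the product of normalized complexity and a numeric severity index; both choices, and all the numbers, are illustrative assumptions rather than the formulas of the methodology itself.

```python
def normalize(values):
    """Scale a dict of non-negative scores so that they sum to 1."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}

# Hypothetical dynamic complexity per scenario (e.g. measured from UML model execution).
dynamic_complexity = {"S1": 12.0, "S2": 30.0, "S3": 18.0}

# Hypothetical severity index per scenario for one failure mode (higher = worse).
severity = {"S1": 0.95, "S2": 0.50, "S3": 0.25}

normalized_complexity = normalize(dynamic_complexity)

# Assumed combination rule: risk factor = normalized complexity * severity.
risk_factor = {s: normalized_complexity[s] * severity[s] for s in dynamic_complexity}

# Rank scenarios from highest to lowest risk factor.
ranked = sorted(risk_factor, key=risk_factor.get, reverse=True)
for scenario in ranked:
    print(f"{scenario}: risk factor = {risk_factor[scenario]:.3f}")
```

Repeating the calculation for each failure mode would fill one column of a requirements matrix at a time.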

What to Read

• Key works in the field

• Tutorials

• Web sites

– Will be completed at a later time
