© ABB Group 6/6/22 | Slide 1 A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis Heiko Koziolek, Bastian Schlich, Carlos Bilich, ABB Corporate Research, 2010-11-01
A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Dec 03, 2014

Heiko Koziolek

Talk from ISSRE 2010
Transcript
Page 1: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

© ABB Group April 9, 2023 | Slide 1

A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Heiko Koziolek, Bastian Schlich, Carlos Bilich, ABB Corporate Research, 2010-11-01

Page 2: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Architecture-based Software Reliability Analysis (ABSRA)

What?

Typical questions of software architects concerning reliability

„What is the reliability (probability of failures) in my system?“

„How do individual components contribute to the system reliability?“

„Which architectural alternative is best for reliability?“

„Where shall I introduce fault-tolerance mechanisms?“

„How should I distribute my limited testing effort among components?“

Additional questions by ABB

„How much more reliable is a new architecture than a former one?“

„Does ABSRA work on large-scale systems?“

Page 3: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Architecture-based Software Reliability Analysis (ABSRA)

How?

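[Figure: ABSRA workflow. Software components, control flow, and component reliabilities (e.g., R = 0.995, R = 0.982, R = 0.937) are combined into a Markov model; the model is transformed into a Markov model solution and solved to obtain the predicted system reliability (R = 0.9923), which is then used to improve the architecture.]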

Page 4: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Related work
Existing empirical studies

“… very little effort has been devoted to the validation of architecture-based software reliability techniques.”

[Gokhale2007, IEEE Transactions on Dependable and Secure Computing, Vol. 4, No. 1]

Source                        Name     Year   Lang.   LOC          # Components
[Gokhale2004, Perf. Eval.]    SHARPE   1998   C       35,000       30
[Goseva2001, ISSRE]           ESA      2001   C       10,000       3
[Goseva2005, ISSRE]           GCC      2005   C       350,000      13
[Wang2005, JSS]               SMS      2006   C/C++   13,000       15
[Goseva2006, ISSRE]           IDN      2006   C       11,000       6
Our paper                     ABB      2010   C++     >3,000,000   8 (>100)

Page 5: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

System under study: Process control system

Page 6: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

System under study: Process control system
Topology

[Figure: system topology. Labels: Internet, Firewall, Remote Workplaces, Plant / Office Network, Network Isolation Device, Redundant Network, Workplaces, Servers, Controllers, Fieldbus, Remote I/O and Field devices]

Page 7: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

System under study: Process control system
Subsystems within the servers

Page 8: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Which steps are required for ABSRA?

Estimate component failure probabilities

Estimate transition probabilities

Construct the Markov model

Exploit the results

Page 9: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Estimate component failure probabilities
Existing methods

Code metrics [Nagappan2006]

• Validity debated

Reliability growth modeling [IEEE Std 1633-2008]

• Requires component failure reports

Random/statistical testing [Miller1992]

• Does not scale, difficult to apply on components

Fault injection [Gokhale2004]

• Does not determine the current reliability

Explicit failure modeling [Cheung2008]

• Accuracy unknown

Page 10: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Reliability growth modeling
General principle

\[
g(\lambda_i, \psi(i), \alpha) = \frac{[\psi(i)]^{\alpha} \, \lambda_i^{\alpha-1} \exp(-\psi(i)\,\lambda_i)}{\Gamma(\alpha)}, \quad \lambda_i > 0
\]

Littlewood/Verrall Model
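
To make the Littlewood/Verrall principle concrete, here is a minimal Python sketch. It assumes the standard formulation of the model: the time to the i-th failure is exponential with rate lambda_i, and lambda_i is gamma-distributed with shape alpha and rate psi(i). The function names, the linear form psi(i) = beta0 + beta1 * i, and all parameter values are illustrative assumptions; the talk itself fitted the model with CASRE, not with code like this.

```python
import math


def lv_rate_density(lam, alpha, psi_i):
    """Gamma density of the failure rate lambda_i (shape alpha, rate psi_i)."""
    if lam <= 0:
        return 0.0
    return (psi_i ** alpha) * lam ** (alpha - 1) * math.exp(-psi_i * lam) / math.gamma(alpha)


def psi_linear(i, beta0, beta1):
    """Assumed linear growth function psi(i) = beta0 + beta1 * i (hypothetical choice)."""
    return beta0 + beta1 * i


def lv_reliability(t, alpha, psi_i):
    """P(T_i > t): the exponential survival function averaged over the gamma density."""
    return (psi_i / (psi_i + t)) ** alpha


def lv_mean_time_to_failure(alpha, psi_i):
    """Expected time to the i-th failure; finite only for alpha > 1."""
    if alpha <= 1:
        raise ValueError("mean time to failure is infinite for alpha <= 1")
    return psi_i / (alpha - 1)


if __name__ == "__main__":
    alpha, beta0, beta1 = 2.0, 1.0, 2.0  # made-up parameters, not fitted values
    for i in (1, 5, 10):
        psi_i = psi_linear(i, beta0, beta1)
        print(i, lv_reliability(1.0, alpha, psi_i), lv_mean_time_to_failure(alpha, psi_i))
```

With an increasing psi(i), later failure intervals get a higher survival probability for the same mission time t, which is the reliability growth the model is meant to capture.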

Page 11: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Reliability growth modeling
Using the Littlewood/Verrall model on one subsystem

Filtered subsystem bug list

Release dates

Curve fitting in CASRE 3.0
http://www.openchannelsoftware.com/projects/CASRE_3.0/

Page 12: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Reliability growth modeling
Result

R1 = ..., R2 = ..., R3 = ..., R4 = ..., R5 = ..., R6 = ..., R7 = ..., R8 = ...

Page 13: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Which steps are required for ABSRA?

Estimate component failure probabilities

Estimate transition probabilities

Construct the Markov model

Exploit the results

Page 14: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Estimate component transition probabilities
Existing methods

Exploiting design document [Gokhale2007]

• Only static dependencies in SW architecture

Profiling [Goseva2005]

• Complicated filtering of data required

Manual code instrumentation

• Can be time-consuming

Page 15: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Estimate component transition probabilities
Profiling with proprietary tools

Set up and ran the system

Example trace from profiling

Self-coded script
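
The step from a profiling trace to transition probabilities can be sketched as follows. This is only an illustration of the general idea (count the observed control transfers between components, then normalise each row); the trace format, component names, and function name are hypothetical, and the actual script and the proprietary profiler output used in the study are not shown in the slides.

```python
from collections import Counter, defaultdict


def transition_probabilities(trace):
    """Estimate P[src][dst] from an ordered list of visited component names."""
    counts = defaultdict(Counter)
    # Each consecutive pair in the trace is one observed control transfer.
    for src, dst in zip(trace, trace[1:]):
        counts[src][dst] += 1
    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        # Relative frequency of each successor of src.
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


if __name__ == "__main__":
    # Hypothetical, already filtered trace of component visits.
    example_trace = ["A", "B", "A", "C", "A", "B", "End"]
    for src, row in transition_probabilities(example_trace).items():
        print(src, row)
```

In practice most of the effort goes into the filtering that produces such a clean component-level trace, which this sketch takes as given.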

Page 16: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Which steps are required for ABSRA?

Estimate component failure probabilities

Estimate transition probabilities

Construct the Markov model

Exploit the results

Page 17: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Construct the Markov model
Existing state-based methods

[Littlewood1979]

[Cheung1980]

[Laprie1984]

[Kubat1989]

[Gokhale1998]

[Ledoux1999]

[Gokhale1998-2]

[Goseva-Popstojanova2001]

Page 18: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Cheung model
Adding failure and end states, computing reliability

[Cheung1980]
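
A minimal Python sketch of the Cheung-style computation, under the usual assumptions of that model: component i hands control to component j with probability P[i][j], fails with probability 1 - R[i] (absorbing failure state), and the final component reaches the end state with probability R[final]. The function names and the numbers in the example are hypothetical and are not taken from the ABB system.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                for c in range(col, n + 1):
                    m[row][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def cheung_reliability(p, r, start=0, final=None):
    """Probability of reaching the end state from component `start`.

    p[i][j]: transition probability between components (row of `final` all zeros).
    r[i]:    reliability of component i.
    """
    n = len(r)
    final = n - 1 if final is None else final
    # Build (I - Q) with Q[i][j] = r[i] * p[i][j].
    a = [[(1.0 if i == j else 0.0) - r[i] * p[i][j] for j in range(n)]
         for i in range(n)]
    b = [0.0] * n
    b[final] = r[final]  # only the final component moves to the end state
    return solve_linear(a, b)[start]


if __name__ == "__main__":
    # Hypothetical three-component system with a loop from component 1 back to 0.
    p = [[0.0, 1.0, 0.0],
         [0.2, 0.0, 0.8],
         [0.0, 0.0, 0.0]]
    r = [0.995, 0.982, 0.937]
    print(cheung_reliability(p, r))
```

Solving the linear system directly avoids forming the matrix inverse; for a model with only a handful of subsystems either approach is cheap.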

Page 19: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Which steps are required for ABSRA?

Estimate component failure probabilities

Estimate transition probabilities

Construct the Markov model

Exploit the results

Page 20: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Exploit the results
Possibilities

Estimate system reliability [Cheung1980]

• Experience by customers hard to validate

Conduct sensitivity analysis [Gokhale2002]

• Study system reliability for varying component failure rates

Assess costs of bugs [Cheung1980]

• Quantify the effect of an error in component

Evaluate design alternatives [Goseva2001]

• Values for new components need to be guessed

Allocate test budgets efficiently [Pietrantuono2010]

• Test critical components more often

Page 21: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Sensitivity Analysis
Impact of varying subsystem failure rates

http://www.prismmodelchecker.org/
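
The idea behind such a sensitivity analysis can be sketched in plain Python: hold the model fixed, sweep the reliability of one subsystem, and recompute the system reliability each time. This is a stand-in for illustration only, not PRISM input and not the authors' setup; the function names, the fixed-point solver, and the example numbers are all assumptions.

```python
def system_reliability(p, r, max_iter=100_000, tol=1e-15):
    """Cheung-style reliability from component 0 to the end state.

    Fixed-point iteration of x_i = r_i * (e_i + sum_j p[i][j] * x_j), where
    e_i is 1 for the final (last) component and 0 otherwise.
    """
    n = len(r)
    x = [0.0] * n
    for _ in range(max_iter):
        new = [
            r[i] * ((1.0 if i == n - 1 else 0.0)
                    + sum(p[i][j] * x[j] for j in range(n)))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new[0]
        x = new
    return x[0]


def sensitivity(p, r, component, values):
    """System reliability for each candidate reliability of one component."""
    results = []
    for value in values:
        varied = list(r)
        varied[component] = value  # change one component, keep the rest
        results.append((value, system_reliability(p, varied)))
    return results


if __name__ == "__main__":
    # Hypothetical three-component model; not data from the case study.
    p = [[0.0, 1.0, 0.0],
         [0.2, 0.0, 0.8],
         [0.0, 0.0, 0.0]]
    r = [0.995, 0.982, 0.937]
    for value, system in sensitivity(p, r, 1, [0.90, 0.95, 0.99, 1.00]):
        print(f"R2 = {value:.2f} -> system reliability {system:.4f}")
```

Plotting one such curve per subsystem shows which subsystem's failure rate the system reliability reacts to most strongly.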

Page 22: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Evaluation
Cost estimates in person-hours (best/worst case)

Page 23: A Large-Scale Industrial Case Study on Architecture-based Software Reliability Analysis

Conclusions
Lessons learned

Getting failure and transition probabilities is hard

• Time-consuming, error-prone, limited automation

• Main obstacle for ABSRA is data collection

Currently rather simple models

• Do not cover technologies, concurrency, or hardware

• Difficult to evaluate architecture alternatives

• Limited decision support from the predictions

Lack of empirical studies in literature

• Predominantly small systems

• Often dubious techniques for estimating failure rates

• Replicated case studies needed
