Introducing Bayesian Nets in AgenaRisk

An example based on Software Defect Prediction

Typical Applications

• Predicting reliability of critical systems

• Software defect prediction

• Aircraft accident traffic risk

• Warranty return rates of electronic parts

• Operational risk in financial institutions

• Hazards in petrochemical industry







A Bayesian Net for predicting air traffic incidents

A Detailed Example

• What follows is a demo of a simplified version of a Bayesian net model to provide more accurate predictions of software defects

• Many organisations worldwide have now used models based around this one

Predicting software defects

Operational defects

The number of operational defects (i.e. those found by customers) is what we are really interested in predicting.

Residual Defects

Operational defects

We know this is clearly dependent on the number of residual defects.


Residual Defects

Operational defects

Operational usage

But it is also critically dependent on the amount of operational usage. If you do not use the system you will find no defects irrespective of the number there.
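The dependence on usage can be made concrete with a minimal sketch. It assumes, purely for illustration, that each residual defect surfaces in operation independently with a probability that grows with usage; the function name `operational_pmf` and the probability values are hypothetical and not part of the AgenaRisk model.

```python
from math import comb

def operational_pmf(residual: int, p_surface: float) -> list[float]:
    """P(operational defects = k) for k = 0..residual.

    Assumption (not from the slides): each residual defect surfaces
    independently with probability p_surface, which stands in for usage.
    """
    return [
        comb(residual, k) * p_surface**k * (1 - p_surface) ** (residual - k)
        for k in range(residual + 1)
    ]

# Same number of residual defects, two very different amounts of usage.
unused = operational_pmf(residual=50, p_surface=0.0)  # system never used
heavy = operational_pmf(residual=50, p_surface=0.3)   # heavily used system

print("P(no operational defects | unused):", unused[0])
print("P(no operational defects | heavy use):", heavy[0])
```

With no usage, all the probability sits on zero operational defects, however many residual defects there are.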


Residual Defects

Defects Introduced



The number of residual defects is determined by the number you introduce during development…

Residual Defects

Defects found and fixed

Defects Introduced



…minus the number you successfully find and fix

Residual Defects

Defects found and fixed

Defects Introduced


Obviously defects found and fixed is dependent on the number introduced
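The arithmetic behind these three nodes is simple enough to write down. The sketch below uses a hypothetical helper, `residual_defects`, and enforces the constraint just stated: you cannot find and fix more defects than were introduced.

```python
def residual_defects(introduced: int, found_and_fixed: int) -> int:
    """Residual defects = defects introduced minus defects found and fixed."""
    if not 0 <= found_and_fixed <= introduced:
        raise ValueError("found_and_fixed must lie between 0 and introduced")
    return introduced - found_and_fixed

# Illustrative numbers only.
print(residual_defects(introduced=30, found_and_fixed=20))
```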


Residual Defects

Problem complexity

Defects found and fixed

Defects Introduced


The number introduced is influenced by problem complexity…


Residual Defects

Problem complexity


Defects Introduced

Design process quality


…and design process quality


Residual Defects

Testing Effort

Problem complexity


Defects Introduced

Design process quality


Finally, how many defects you find is influenced not just by the number there to find but also by the amount of testing effort


A Model in action

Here is that very simple model with the probability distributions shown


We are looking at an individual software component in a system


The prior probability distributions represent our uncertainty before we enter any specific information about this component.


So the component is just as likely to have very high complexity as very low, and the number of defects found and fixed in testing lies in a wide range with a median of about 20.
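One way to see what the priors say before any observation is entered is to sample components from the net top-down. The sketch below is a toy stand-in for the model, not the AgenaRisk implementation: it uses two levels per factor instead of the ranked scales on the slides, and every probability, the defect cap and all names are hypothetical.

```python
import random
from statistics import median

LEVELS = ("low", "high")
MAX_DEFECTS = 40  # hypothetical cap on defects per component
# Hypothetical probabilities, not taken from the slides or from AgenaRisk.
Q_INTRODUCE = {("low", "high"): 0.2, ("low", "low"): 0.5,
               ("high", "high"): 0.4, ("high", "low"): 0.8}  # (complexity, design)
P_FIND = {"low": 0.2, "high": 0.8}      # keyed by testing effort
P_SURFACE = {"low": 0.05, "high": 0.5}  # keyed by operational usage

def binomial(rng, n, p):
    """Count successes in n independent trials of probability p."""
    return sum(rng.random() < p for _ in range(n))

def sample_component(rng):
    """Draw one component top-down: parents first, then their children."""
    s = {name: rng.choice(LEVELS)
         for name in ("complexity", "design", "testing", "usage")}
    s["introduced"] = binomial(rng, MAX_DEFECTS,
                               Q_INTRODUCE[s["complexity"], s["design"]])
    s["found"] = binomial(rng, s["introduced"], P_FIND[s["testing"]])
    s["residual"] = s["introduced"] - s["found"]
    s["operational"] = binomial(rng, s["residual"], P_SURFACE[s["usage"]])
    return s

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_component(rng) for _ in range(2000)]
share_high = sum(s["complexity"] == "high" for s in samples) / len(samples)
print(f"share with high complexity: {share_high:.2f}")
print("median defects found:", median(s["found"] for s in samples))
```

Because the prior on complexity is uniform here, about half the sampled components come out with high complexity, which is the toy equivalent of "just as likely to be very high as very low".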


As we enter observations about the component the probability distributions update

Here we have entered the observation that this component had 0 defects found and fixed in testing.

Note how the other distributions changed.

The model is doing forward inference to predict defects in operation…

…and backwards inference to make deductions about design process quality.

But actually the most likely explanation is very low testing quality…

…and lower than average complexity.
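The same kind of update can be reproduced on a toy version of the net by brute-force enumeration: list every combination of values, keep the ones that match the observation, and renormalise. This is a sketch under stated assumptions, not the algorithm AgenaRisk uses. The two-level factors, the cap of 8 defects and all probabilities are hypothetical, as are the names `joint` and `query`.

```python
from itertools import product
from math import comb

LEVELS = ("low", "high")
MAX_DEFECTS = 8  # hypothetical cap, kept small so enumeration stays cheap
# Hypothetical probabilities, not taken from the slides or from AgenaRisk.
Q_INTRODUCE = {("low", "high"): 0.1, ("low", "low"): 0.4,
               ("high", "high"): 0.3, ("high", "low"): 0.7}  # (complexity, design)
P_FIND = {"low": 0.1, "high": 0.9}      # keyed by testing quality
P_SURFACE = {"low": 0.05, "high": 0.6}  # keyed by operational usage

def binom(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def joint():
    """Yield every full assignment of the net with its joint probability."""
    for c, d, t, u in product(LEVELS, repeat=4):
        prior = 0.5 ** 4  # uniform priors on the four root nodes
        for i in range(MAX_DEFECTS + 1):
            p_i = binom(i, MAX_DEFECTS, Q_INTRODUCE[c, d])
            for f in range(i + 1):
                p_f = binom(f, i, P_FIND[t])
                r = i - f  # residual defects are deterministic
                for o in range(r + 1):
                    world = {"complexity": c, "design": d, "testing": t,
                             "usage": u, "introduced": i, "found": f,
                             "residual": r, "operational": o}
                    yield world, prior * p_i * p_f * binom(o, r, P_SURFACE[u])

def query(target, evidence):
    """Posterior distribution of one node given observed values."""
    dist = {}
    for world, p in joint():
        if all(world[k] == v for k, v in evidence.items()):
            dist[world[target]] = dist.get(world[target], 0.0) + p
    total = sum(dist.values())
    return {value: p / total for value, p in dist.items()}

evidence = {"found": 0}  # the observation entered on the slide
for node, level in (("testing", "low"), ("design", "high"), ("complexity", "low")):
    print(f"P({node} = {level} | found = 0) =",
          round(query(node, evidence)[level], 3))
operational = query("operational", evidence)  # forward inference
print("expected operational defects:",
      round(sum(k * p for k, p in operational.items()), 3))
```

In this toy model, as on the slides, zero defects found shifts belief towards better design and lower complexity, but shifts it most strongly towards poor testing.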

But if we find out that the complexity is actually high…


then the expected number of operational defects increases

and we become even more convinced of the inadequate testing.


So far we have made no observation about operational usage.


If, in fact, the operational usage is high…

Then we have an example of a component with no defects in test…

…but probably many defects in operation.

But suppose we find out that the test quality was very high.

Then we completely revise our beliefs.

We are now pretty convinced that the module will be fault free in operation

…And the ‘explanation’ is that the design process is likely to be very high quality
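The sequence of observations above (zero defects in test, then high complexity, high usage and finally high test quality) can be replayed on a toy model. The sketch assumes, for illustration only, that a component has 8 independent "slots"; each slot holds a defect with probability q, testing catches it with probability p_find, and an uncaught defect surfaces in operation with probability p_surface. Under that assumption both the probability of finding nothing in test and the expected number of operational defects have closed forms. All numbers and names are hypothetical and are not the Philips or AgenaRisk parameters.

```python
from itertools import product

LEVELS = ("low", "high")
NAMES = ("complexity", "design", "testing", "usage")
N_SLOTS = 8  # hypothetical number of places a defect could be introduced
# Hypothetical probabilities, not taken from the slides or from AgenaRisk.
Q_INTRODUCE = {("low", "high"): 0.1, ("low", "low"): 0.4,
               ("high", "high"): 0.3, ("high", "low"): 0.7}  # (complexity, design)
P_FIND = {"low": 0.1, "high": 0.9}      # keyed by test quality
P_SURFACE = {"low": 0.05, "high": 0.6}  # keyed by operational usage

def p_none_found(c, d, t):
    """P(found = 0): no slot holds a defect that testing caught."""
    return (1 - Q_INTRODUCE[c, d] * P_FIND[t]) ** N_SLOTS

def expected_operational(c, d, t, u):
    """E[operational defects | found = 0] for fixed parent levels."""
    q, pf = Q_INTRODUCE[c, d], P_FIND[t]
    p_hidden = q * (1 - pf) / (1 - q * pf)  # slot hides an uncaught defect
    return N_SLOTS * p_hidden * P_SURFACE[u]

def summarise(**known):
    """Posterior summaries given found = 0 plus any known parent levels."""
    total = low_testing = high_design = defects = 0.0
    for combo in product(LEVELS, repeat=4):
        world = dict(zip(NAMES, combo))
        if any(world[k] != v for k, v in known.items()):
            continue
        # Uniform priors cancel, so the likelihood is the posterior weight.
        w = p_none_found(world["complexity"], world["design"], world["testing"])
        total += w
        low_testing += w * (world["testing"] == "low")
        high_design += w * (world["design"] == "high")
        defects += w * expected_operational(*combo)
    return {"p_testing_low": low_testing / total,
            "p_design_high": high_design / total,
            "expected_operational": defects / total}

s0 = summarise()                                   # only found = 0
s1 = summarise(complexity="high")
s2 = summarise(complexity="high", usage="high")
s3 = summarise(complexity="high", usage="high", testing="high")
for label, s in (("found = 0", s0), ("+ high complexity", s1),
                 ("+ high usage", s2), ("+ high test quality", s3)):
    print(label, {k: round(v, 3) for k, v in s.items()})
```

The toy model moves in the same directions as the slides: high complexity raises both the expected operational defects and the belief in poor testing, and learning that testing was in fact good collapses the defect estimate and leaves a high-quality design as the explanation.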

A Model in action

We reset the model and this time use the model to argue backwards.

Suppose we know that this is a critical component that has a requirement for 0 defects in operation…

The model looks for explanations for such a state of affairs.


The most obvious way to achieve such a result is to not use the component much.

But if we know it will be subject to high usage…

Then the model adjusts the beliefs about the other uncertain variables.


A combination of lower than average complexity…

…higher than average design quality…

…and much higher than average testing quality.

But suppose we cannot assume our testing is anything other than average…

Then better design quality…

…and lower complexity are needed.

But if complexity is very high…

…Then we are left with a very skewed distribution for design process quality.

What the model is saying is that, if these are the true requirements for the component, then you are very unlikely to achieve them unless you have a very good design process.
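Arguing backwards from a requirement works the same way as arguing from an observation: fix operational defects at 0 and ask which settings of the other nodes make that outcome plausible. The sketch below is a toy illustration using a five-point scale; the linear formulas that turn scale positions into probabilities, the 8 "slots" and the function names are all hypothetical and are not AgenaRisk's ranked-node definitions.

```python
from itertools import product

LEVELS = ("very low", "low", "average", "high", "very high")
NAMES = ("complexity", "design", "testing", "usage")
N_SLOTS = 8  # hypothetical number of places a defect could be introduced

def p_zero_operational(complexity, design, testing, usage):
    """P(operational defects = 0) for fixed levels of the four factors.

    Hypothetical linear mappings from scale position (0..4) to probability.
    """
    c, d, t, u = (LEVELS.index(x) for x in (complexity, design, testing, usage))
    q_introduce = 0.1 + 0.1 * c + 0.1 * (4 - d)  # worse with complexity, better with design
    p_find = 0.1 + 0.2 * t                       # better testing finds more
    p_surface = 0.1 + 0.2 * u                    # more usage exposes more
    # A slot causes an operational defect only if a defect is introduced,
    # missed by testing, and then exposed by usage.
    return (1 - q_introduce * (1 - p_find) * p_surface) ** N_SLOTS

def posterior(target, **known):
    """P(target level | operational = 0, known levels), uniform priors."""
    scores = dict.fromkeys(LEVELS, 0.0)
    for combo in product(LEVELS, repeat=4):
        world = dict(zip(NAMES, combo))
        if all(world[k] == v for k, v in known.items()):
            scores[world[target]] += p_zero_operational(**world)
    total = sum(scores.values())
    return {level: score / total for level, score in scores.items()}

usage = posterior("usage")                    # requirement only
testing = posterior("testing", usage="high")  # high usage known
design = posterior("design", usage="high", testing="average",
                   complexity="very high")    # the final scenario
for name, dist in (("usage", usage), ("testing", testing), ("design", design)):
    print(name, {level: round(p, 3) for level, p in dist.items()})
```

With only the requirement entered, low usage is the most probable explanation. Once usage, testing and complexity are pinned down, the distribution for design quality is skewed towards "very high", which is the toy counterpart of the conclusion above.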

Making better decisions

• That was a simplified version of a model produced for Philips

• Helped Philips make critical decisions about when to release software for electronic components

• 95% accuracy in defect prediction – much better than can be achieved by traditional statistical methods

Model Implementation

In AgenaRisk

www.agenarisk.com
