© 2013 Carnegie Mellon University
Analyzing a Multi-Legged Argument Using Eliminative Argumentation
John B. Goodenough, Charles B. Weinstock, Ari Z. Klein, Neil Ernst
December 2013

Transcript
  • Slide 1

    Analyzing a Multi-Legged Argument Using Eliminative Argumentation

    John B. Goodenough, Charles B. Weinstock, Ari Z. Klein, Neil Ernst

    December 2013

  • Slide 2 (Analyzing Multi-Legged Arguments, Goodenough, Dec 2013, © 2013 Carnegie Mellon University)

    Copyright 2013 ACM

    This material is based upon work funded and supported by the Department of Defense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center.

    NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN “AS-IS” BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

    This material has been approved for public release and unlimited distribution.

    This material may be reproduced in its entirety, without modification, and freely distributed in written or electronic form without requesting formal permission. Permission is required for any other use. Requests for permission should be directed to the Software Engineering Institute at permission@sei.cmu.edu.

    DM-0000795

  • Slide 3

    Multi-Legged Argument

    Informal definition
    • “Independent” evidence and argument supporting the same claim, e.g., proving and testing

    How much confidence does each leg contribute?

    How can independence of the legs be determined?

    First, what does it mean to have confidence in a claim?

  • Slide 4

    Gaining Confidence in a Claim

    A classic philosophical problem:
    • How should evidence be used to evaluate belief in a hypothesis?

    Use Induction
    • Enumerative: Support increases as confirming instances are found
    • Eliminative: Support increases as reasons for doubt are eliminated

  • Slide 5 (Gaining Confidence in a Claim; repeats Slide 4, adding:)

    Using past experience as the basis for predicting future behavior

  • Slide 6 (Gaining Confidence in a Claim; repeats Slide 4, adding:)

    Power? Bulb OK? Wired?

    Confidence increases as doubts are eliminated

  • Slide 7

    Eliminative Argumentation

    Multi-Legged Arguments
    • How much confidence does each leg contribute?
      – Depends on the extent to which doubts are eliminated in each leg
    • How can independence of the legs be determined?
      – Look at dependencies among the doubts

    An eliminative argument is visualized in a confidence map, which shows reasons for doubt graphically.

    An eliminative argument shows reasons for doubting an argument’s conclusion and why those doubts are eliminated

  • Slide 8

    Confidence map for “Light turns on”:
    • C1.1: Light turns on
    • C2.1: Switch is connected
    • C2.2: Power is available
    • C2.3: Light bulb is functional
    • Ev3.1: Examination results showing bulb doesn’t rattle when shaken

    Make doubts and inference rules explicit in a confidence map

  • Slide 9

    Lightbulb CM with rebutting defeaters

    • C1.1: Light turns on
    • R2.1: Unless switch is not connected
    • R2.2: Unless no power available
    • R2.3: Unless bulb is dead
    • IR2.4: If these reasons for failure are eliminated, the light will turn on
    • UC3.3: Unless there are unidentified reasons for failure

    Rebutting defeaters (attack claim validity): R → ~C

  • Slide 10

    Lightbulb CM with IR and rebutting defeaters

    (Same confidence map as Slide 9: C1.1, R2.1, R2.2, R2.3, IR2.4, UC3.3)

    Inference rule asserts: ~R → C

  • Slide 11

    Lightbulb CM with UC, IR, and R

    (Same confidence map as Slide 9: C1.1, R2.1, R2.2, R2.3, IR2.4, UC3.3)

    Undercutting defeater (attacks rule sufficiency)

  • Slide 15

    Lightbulb CM with IR3.2 and UC

    • R2.3: Unless bulb is dead
    • Ev3.1: Examination results showing bulb doesn’t rattle when shaken
    • IR3.2: If bulb doesn’t rattle when shaken, bulb is good
    • UM4.1: But the examiner is hard of hearing
    • UC4.2: Unless bulb is not incandescent type

  • Slide 16

    Eliminative Argumentation

    Finding doubts
    • Attack claim (rebutting defeater)
      – Why claim may be false (e.g., R2.3: Unless bulb is dead)
    • Attack evidence (undermining defeater)
      – Why evidence may be invalid (e.g., UM4.1: But the examiner is hard of hearing)
    • Attack inference (undercutting defeater)
      – Premise OK; conclusion uncertain (e.g., UC4.2: Unless bulb is not incandescent type)

  • Slide 17

    A Multi-Legged Argument Example

    Given a system reliability claim:
    • pfd < 10^-3 (with 99% certainty)

    What evidence and argument will give confidence in this claim?
    • 4603 random successful test executions would be necessary and sufficient

    But suppose we can only execute 4000 tests?
    • How much confidence in the claim in this case? Why?
    • What other evidence could increase our confidence in the claim?
    • Static analysis? How much would confidence increase? Why?

    Example is based on a multi-legged argument suggested by Bloomfield and Littlewood (2003)
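The 4603 figure can be reproduced from the statistical-testing relation the deck relies on: n failure-free operationally random tests demonstrate pfd < b with certainty c when (1 − b)^n ≤ 1 − c. A minimal sketch (the function name is illustrative, not from the slides):

```python
import math

def required_tests(pfd_bound, certainty):
    # Smallest n with (1 - pfd_bound)**n <= 1 - certainty:
    # n failure-free operationally random tests rule out
    # pfd >= pfd_bound with the stated statistical certainty.
    return math.ceil(math.log(1 - certainty) / math.log(1 - pfd_bound))

print(required_tests(1e-3, 0.99))  # -> 4603
```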

  • Slide 18

    Multi-Legged AC

    • C1.1: The system is acceptably reliable
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • S2.1: Argue using statistical testing results
      – C3.1: No failures have been observed in a sequence of 4603 operationally random test executions [Cx3.1a: Littlewood & Wright 1997]
      – Ev4.1: A sequence of 4000 operationally random tests showing no failure occurrences
    • S2.2: Argue over absence of statically detectable coding errors
      – C3.2: The code contains no statically detectable coding errors
      – Ev3.2: Static analysis results showing no statically detectable coding errors

  • Slide 23

    ML CM with probability calculation

    • C1.1: The system is acceptably reliable (0.55)
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • R2.1: Unless at least one failure is observed in a sequence of 4000 (or fewer) operationally random test executions (0.55)
    • IR2.2: If no failures are observed in a sequence of 4000 operationally random test executions, then the system is acceptably reliable
      – J2.1a: One failure in 4603 or fewer executions is sufficient to contradict the claim [Littlewood & Wright 1997]
    • UC3.2: Unless fewer than 4603 operationally random tests are executed successfully (0.55)
    • Ev3.1: 4000 operationally random tests showing no failure occurrences (1.00)

    0.55 = (1 – 0.001)^603
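One way to read the slide’s formula: 0.55 is the probability that a system with pfd right at 0.001 would also survive the 603-test shortfall (4603 − 4000) without failing. A quick check:

```python
# Confidence surviving the shortfall when only 4000 of the
# required 4603 failure-free tests are run, at pfd = 0.001.
shortfall = 4603 - 4000
print(round((1 - 0.001) ** shortfall, 2))  # -> 0.55
```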

  • Slide 25

    Validity of Ev3.1

    (Statistical-leg confidence map as on Slide 23, expanding Ev3.1: 4000 operationally random tests showing no failure occurrences (1.00))

    Undermining defeaters on Ev3.1:
    • UM4.1: But the operational profile is inaccurate
    • UM4.2: But the test selection process is not random
    • UM4.3: But the oracle sometimes misclassifies failed tests as successes

  • Slide 27

    Statistical CM with joint calculation

    (Statistical-leg confidence map as on Slide 23: Ev3.1 = 1.00, UC3.2 = 0.55, leg confidence = 0.55)

    The defeaters in each subtree are INDEPENDENT, i.e., the truth of one defeater does not imply the truth of another. In particular:
    • The validity of the rule is unaffected by whether the evidence is valid, and vice versa

  • Slide 35

    0.80 for R2.3

    • C1.1: The system is acceptably reliable (0.16)
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • R2.3: Unless there are statically detectable coding errors (eliminated with confidence 0.80)
      – Ev3.3: Static analysis results showing no static coding errors (0.80)
      – UM4.5: But the static analysis overlooked some statically detectable errors (0.80)
    • IR2.4: If there are no statically detectable coding errors, then the system is acceptably reliable (0.20)
      – UC3.4: Unless there is no basis for determining how much pfd is reduced by the presence of statically detectable coding errors (0.20)
      – UC3.5: Unless non-static coding errors are present that increase the pfd (0.20)
      – UC3.6: Unless design errors exist that increase the pfd

  • Slide 37

    Calculation of static error confidence

    (Static-analysis leg confidence map as on Slide 35: R2.3 eliminated with 0.80, IR2.4 sufficiency 0.20, leg confidence 0.16)

    These defeaters are INDEPENDENT, i.e., the truth of one defeater does not imply the truth of another. In particular:
    • The validity of the rule is unaffected by whether the evidence is valid, and vice versa
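Reading the slide’s numbers as independent elimination probabilities, the static leg’s 0.16 is the product of eliminating R2.3 (0.80) and the inference rule IR2.4 surviving its undercutting defeaters (0.20). A sketch under that assumed reading:

```python
# Assumed reading of the slide:
# leg confidence = P(R2.3 eliminated) * P(IR2.4 holds)
r2_3_eliminated = 0.80  # doubt about statically detectable errors removed
ir2_4_holds = 0.20      # rule sufficiency after UC3.4-UC3.6
print(round(r2_3_eliminated * ir2_4_holds, 2))  # -> 0.16
```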

  • Slide 41

    Calculation of multi-leg confidence

    • C1.1: The system is acceptably reliable (0.62)
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • Statistical leg (evidence 1.00, leg confidence 0.55):
      – R2.1: Unless at least one failure is observed in a sequence of 4000 (or fewer) operationally random test executions (0.55)
      – IR2.2: If no failures are observed in a sequence of 4000 operationally random test executions, then the system is acceptably reliable
    • Static-analysis leg (0.80 × 0.20 = leg confidence 0.16):
      – R2.3: Unless there are statically detectable coding errors (0.80)
      – IR2.4: If there are no static coding errors, then the system is acceptably reliable (0.20)

    0.62 = 1 – (1 – 0.55)(1 – 0.16)

    Each leg is independent of the other because defeaters in each leg are independent:
    • The truth of a defeater in one leg does not imply the truth of a defeater in an “independent” leg, i.e.,
      – The validity of an argument in one leg is unaffected by defects in another leg
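The slide combines the two independent legs as the complement of both legs failing:

```python
# Combined confidence that at least one leg validly supports C1.1,
# assuming the two legs are independent (as argued on the slide).
statistical_leg = 0.55  # testing leg
static_leg = 0.16       # static-analysis leg
combined = 1 - (1 - statistical_leg) * (1 - static_leg)
print(round(combined, 2))  # -> 0.62
```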

  • Slide 42

    Multi-legged argument definitions

    “Standard” definition
    • “Independent” evidence and argument supporting the same claim, e.g., proving and testing

    Eliminative argument definition
    • Two or more argument legs rooted at claim C whose defeaters are independent
      – Two defeaters are independent if the truth of one defeater does not affect the truth of the other
      – An argument leg is defined by an inference rule connecting rebutting defeaters to claim C

    (Two-leg confidence map as on Slide 41)

  • Slide 43

    Protection Against Argumentation Error

    A multi-legged argument is more robust
    • Because defeaters are independent
      – If one leg is defective (i.e., if some defeater is true), the other leg still provides some reason to believe the parent claim
      – The argument is more likely to hold up in the future as more information (making a defeater true) becomes available

  • Slide 44

    Reduction in Doubt

    Each leg attacks alternative (top-level) reasons for doubt
    • Alternative: different (top-level) defeaters and inference rules, i.e.,
      – (/\~Ri) → C, (/\~Rj) → C, where Ri ≠ Rj
      – Defeaters in each leg are independent

    Probability that at least one leg is valid is 1 – (probability that no leg is valid)
    • Probability that no leg is valid: ∏(1 – Li)
    • Assumes the Li are independent
    • Li = 0 implies information from that leg does not increase confidence in the parent claim
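The ∏(1 – Li) expression generalizes Slide 41’s two-leg calculation to any number of independent legs; a small sketch (the helper name is illustrative, not from the slides):

```python
import math

def multi_leg_confidence(leg_confidences):
    # 1 - prod(1 - L_i): probability that at least one
    # independent leg validly supports the parent claim.
    return 1 - math.prod(1 - li for li in leg_confidences)

print(round(multi_leg_confidence([0.55, 0.16]), 2))  # -> 0.62
```

A leg with Li = 0 drops out of the product, matching the slide’s note that such a leg contributes no confidence.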

  • Slide 45

    Summary

    How much confidence does each leg contribute?
    • Depends on the extent to which defeaters are eliminated in a given leg
      – Various rules can be used to determine partial defeater elimination

    How can independence of two legs be determined?
    • By determining that doubts are not shared among the legs
      – i.e., the truth of a defeater in one leg does not imply the truth of a defeater in the other leg

  • Slide 46

    Summary

    Eliminative argumentation (identification of doubts and their elimination) provides a framework for building confidence in an argument and in properties of a system
    • Confidence maps are a visualization of an eliminative argument
    • They explicitly document reasons for doubt and their elimination

    An assurance case provides an argument as justification for a claim

    We seek to provide justification for belief in a claim

    We do so by identifying and eliminating defeaters (doubts) relevant to the claim and the argument

  • Slide 47

    Contact Information

    John B. Goodenough
    SEI Fellow (retired)
    Telephone: +1 412-268-6391
    Email: jbg@sei.cmu.edu

    U.S. Mail
    Software Engineering Institute
    4500 Fifth Avenue
    Pittsburgh, PA 15213-2612
    USA

    Charles B. Weinstock
    Senior Member of the Technical Staff
    Telephone: +1 412-268-7719
    Email: weinstock@sei.cmu.edu

    Ari Z. Klein
    Ph.D. Candidate – Rhetoric
    Telephone: +1 412-268-7700
    Email: azklein@sei.cmu.edu

  • Slide 48

    OBJECTIONS

  • Slide 49

    Objections

    What if a relevant defeater has not been identified?

    What if a defeater cannot be completely eliminated?

    Not all defeaters are of equal importance. How is this handled?

    Eliminative induction (Baconian probability) seems rather weak (compared to Bayesian probability or enumerative induction). What is being gained (and lost) with this approach?

    The potential number of defeaters seems incredibly large for a real system. Is this approach practical?

  • Slide 50 (Objections; repeats Slide 49)

  • Slide 51

    What if a defeater is unidentified?

    Assurance cases are inherently defeasible; it is always possible that something has been omitted
    • Complete confidence (n|n) only reflects what is known at a particular point in time

    Uncertainty about completeness is itself a reason for doubt that needs to be recognized and countered
    • “Not all hazards have been identified”
    • Assessment of a case must consider this as a reason for doubting the adequacy of the case
      – Eliminative argumentation provides a method for identifying where sources of doubt can be found

    Eliminative argumentation provides ways of thinking about and explaining why one should have confidence in a case, or a claim
    • The approach does not, of course, guarantee a sound case
    • But it helps in developing sound and complete arguments

  • Slide 52 (Objections; repeats Slide 49)

  • Slide 53

    Incomplete Defeater Elimination

    We have addressed this in our examples

    We accept that in practical cases there will always be some residual doubt
    • The issue is whether the remaining doubts are considered significant or not

    The general principle is that uneliminated lower-level doubts propagate to higher-level claims
    • The goal is to formulate lower-level defeaters that can be eliminated by appropriate evidence and inference rules

  • Slide 54 (Objections; repeats Slide 49)

  • Slide 55

    Differential Defeater Importance

    The elimination of some defeaters seems more important (in some intuitive sense) than others. A strict eliminative induction (Baconian) approach treats all uneliminated defeaters equally.
    • Consider hazards identified in a safety analysis. All above a certain threshold must be eliminated/mitigated
      – Assessing their relative importance/likelihood is not profitable

    This is a current subject of research

  • Slide 56 (Objections; repeats Slide 49)

  • Slide 57

    Why Use Eliminative Argumentation?

    With eliminative argumentation, we learn something concrete about why a system works
    • With enumerative induction, we at best only learn something statistical (although this can be valuable knowledge)

    Eliminative argumentation avoids “confirmation bias”
    • To the extent evidence eliminates defeaters, we know the argument cannot be invalid for the situations covered by those defeaters

  • Slide 58 (Objections; repeats Slide 49)

  • Slide 59

    Practical Considerations

    The amount of evidence and argument for a real system is inherently quite large
    • Can eliminative argumentation provide a more thorough and cost-effective basis for developing confidence in system behavior?
    • Are assurance efforts more effective and focused?

    More research is needed