© 2013 Carnegie Mellon University
Analyzing a Multi-Legged Argument Using Eliminative Argumentation
John B. Goodenough, Charles B. Weinstock, Ari Z. Klein, Neil Ernst
December 2013

Transcript
  • Slide 1

    Analyzing a Multi-Legged Argument Using Eliminative Argumentation

    John B. Goodenough, Charles B. Weinstock, Ari Z. Klein, Neil Ernst

    December 2013

  • Slide 2 (Analyzing Multi-Legged Arguments, Goodenough, Dec 2013, © 2013 Carnegie Mellon University)

    Copyright 2013 ACM

    This material is based upon work funded and supported by the Department of Defense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center.

    NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN “AS-IS” BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

    This material has been approved for public release and unlimited distribution.

    This material may be reproduced in its entirety, without modification, and freely distributed in written or electronic form without requesting formal permission. Permission is required for any other use. Requests for permission should be directed to the Software Engineering Institute at permission@sei.cmu.edu.

    DM-0000795

  • Slide 3

    Multi-Legged Argument

    Informal definition
    • “Independent” evidence and argument supporting the same claim, e.g., proving and testing

    How much confidence does each leg contribute?

    How can independence of the legs be determined?

    First, what does it mean to have confidence in a claim?

  • Slide 4

    Gaining Confidence in a Claim

    A classic philosophical problem:
    • How should evidence be used to evaluate belief in a hypothesis?

    Use Induction
    • Enumerative: Support increases as confirming instances are found
    • Eliminative: Support increases as reasons for doubt are eliminated

  • Slide 5 (Gaining Confidence in a Claim; repeats Slide 4, adding:)

    Using past experience as the basis for predicting future behavior

  • Slide 6 (Gaining Confidence in a Claim; repeats Slide 4, adding:)

    Power? Bulb OK? Wired?

    Confidence increases as doubts are eliminated

  • Slide 7

    Eliminative Argumentation

    Multi-Legged Arguments
    • How much confidence does each leg contribute?
      – Depends on the extent to which doubts are eliminated in each leg
    • How can independence of the legs be determined?
      – Look at dependencies among the doubts

    An eliminative argument is visualized in a confidence map, which shows reasons for doubt graphically.

    An eliminative argument shows reasons for doubting an argument’s conclusion and why those doubts are eliminated

  • Slide 8

    Confidence map for “Light turns on”:
    • C1.1: Light turns on
    • C2.1: Switch is connected
    • C2.2: Power is available
    • C2.3: Light bulb is functional
    • Ev3.1: Examination results showing bulb doesn’t rattle when shaken

    Make doubts and inference rules explicit in a confidence map

  • Slide 9

    Lightbulb CM with rebutting defeaters

    • C1.1: Light turns on
    • R2.1: Unless switch is not connected
    • R2.2: Unless no power available
    • R2.3: Unless bulb is dead
    • IR2.4: If these reasons for failure are eliminated, the light will turn on
    • UC3.3: Unless there are unidentified reasons for failure

    Rebutting defeaters (attack claim validity): R → ~C

  • Slide 10

    Lightbulb CM with IR and rebutting defeaters

    (Same confidence map as Slide 9: C1.1, R2.1, R2.2, R2.3, IR2.4, UC3.3)

    Inference rule asserts: ~R → C

  • Slide 11

    Lightbulb CM with UC, IR, and R

    (Same confidence map as Slide 9: C1.1, R2.1, R2.2, R2.3, IR2.4, UC3.3)

    Undercutting defeater (attacks rule sufficiency)

  • Slide 15

    Lightbulb CM with IR3.2 and UC

    • R2.3: Unless bulb is dead
    • Ev3.1: Examination results showing bulb doesn’t rattle when shaken
    • IR3.2: If bulb doesn’t rattle when shaken, bulb is good
    • UM4.1: But the examiner is hard of hearing
    • UC4.2: Unless bulb is not incandescent type

  • Slide 16

    Eliminative Argumentation

    Finding doubts
    • Attack claim (rebutting defeater)
      – Why claim may be false (e.g., R2.3: Unless bulb is dead)
    • Attack evidence (undermining defeater)
      – Why evidence may be invalid (e.g., UM4.1: But the examiner is hard of hearing)
    • Attack inference (undercutting defeater)
      – Premise OK; conclusion uncertain (e.g., UC4.2: Unless bulb is not incandescent type)

  • Slide 17

    A Multi-Legged Argument Example

    Given a system reliability claim:
    • pfd < 10^-3 (with 99% certainty)

    What evidence and argument will give confidence in this claim?
    • 4603 random successful test executions would be necessary and sufficient

    But suppose we can only execute 4000 tests?
    • How much confidence in the claim in this case? Why?
    • What other evidence could increase our confidence in the claim?
    • Static analysis? How much would confidence increase? Why?

    Example is based on a multi-legged argument suggested by Bloomfield and Littlewood (2003)
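The 4603 figure can be reproduced from the statistical-testing relation the deck relies on: n failure-free operationally random tests demonstrate pfd < b with certainty c when (1 − b)^n ≤ 1 − c. A minimal sketch (the function name is illustrative, not from the slides):

```python
import math

def required_tests(pfd_bound, certainty):
    # Smallest n with (1 - pfd_bound)**n <= 1 - certainty:
    # n failure-free operationally random tests rule out
    # pfd >= pfd_bound with the stated statistical certainty.
    return math.ceil(math.log(1 - certainty) / math.log(1 - pfd_bound))

print(required_tests(1e-3, 0.99))  # -> 4603
```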

  • Slide 18

    Multi-Legged AC

    • C1.1: The system is acceptably reliable
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • S2.1: Argue using statistical testing results
      – C3.1: No failures have been observed in a sequence of 4603 operationally random test executions [Cx3.1a: Littlewood & Wright 1997]
      – Ev4.1: A sequence of 4000 operationally random tests showing no failure occurrences
    • S2.2: Argue over absence of statically detectable coding errors
      – C3.2: The code contains no statically detectable coding errors
      – Ev3.2: Static analysis results showing no statically detectable coding errors

  • Slide 23

    ML CM with probability calculation

    • C1.1: The system is acceptably reliable (0.55)
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • R2.1: Unless at least one failure is observed in a sequence of 4000 (or fewer) operationally random test executions (0.55)
    • IR2.2: If no failures are observed in a sequence of 4000 operationally random test executions, then the system is acceptably reliable
      – J2.1a: One failure in 4603 or fewer executions is sufficient to contradict the claim [Littlewood & Wright 1997]
    • UC3.2: Unless fewer than 4603 operationally random tests are executed successfully (0.55)
    • Ev3.1: 4000 operationally random tests showing no failure occurrences (1.00)

    0.55 = (1 – 0.001)^603
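One way to read the slide’s formula: 0.55 is the probability that a system with pfd right at 0.001 would also survive the 603-test shortfall (4603 − 4000) without failing. A quick check:

```python
# Confidence surviving the shortfall when only 4000 of the
# required 4603 failure-free tests are run, at pfd = 0.001.
shortfall = 4603 - 4000
print(round((1 - 0.001) ** shortfall, 2))  # -> 0.55
```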

  • Slide 25

    Validity of Ev3.1

    (Statistical-leg confidence map as on Slide 23, expanding Ev3.1: 4000 operationally random tests showing no failure occurrences (1.00))

    Undermining defeaters on Ev3.1:
    • UM4.1: But the operational profile is inaccurate
    • UM4.2: But the test selection process is not random
    • UM4.3: But the oracle sometimes misclassifies failed tests as successes

  • Slide 27

    Statistical CM with joint calculation

    (Statistical-leg confidence map as on Slide 23: Ev3.1 = 1.00, UC3.2 = 0.55, leg confidence = 0.55)

    The defeaters in each subtree are INDEPENDENT, i.e., the truth of one defeater does not imply the truth of another. In particular:
    • The validity of the rule is unaffected by whether the evidence is valid, and vice versa

  • Slide 35

    0.80 for R2.3

    • C1.1: The system is acceptably reliable (0.16)
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • R2.3: Unless there are statically detectable coding errors (eliminated with confidence 0.80)
      – Ev3.3: Static analysis results showing no static coding errors (0.80)
      – UM4.5: But the static analysis overlooked some statically detectable errors (0.80)
    • IR2.4: If there are no statically detectable coding errors, then the system is acceptably reliable (0.20)
      – UC3.4: Unless there is no basis for determining how much pfd is reduced by the presence of statically detectable coding errors (0.20)
      – UC3.5: Unless non-static coding errors are present that increase the pfd (0.20)
      – UC3.6: Unless design errors exist that increase the pfd

  • Slide 37

    Calculation of static error confidence

    (Static-analysis leg confidence map as on Slide 35: R2.3 eliminated with 0.80, IR2.4 sufficiency 0.20, leg confidence 0.16)

    These defeaters are INDEPENDENT, i.e., the truth of one defeater does not imply the truth of another. In particular:
    • The validity of the rule is unaffected by whether the evidence is valid, and vice versa
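Reading the slide’s numbers as independent elimination probabilities, the static leg’s 0.16 is the product of eliminating R2.3 (0.80) and the inference rule IR2.4 surviving its undercutting defeaters (0.20). A sketch under that assumed reading:

```python
# Assumed reading of the slide:
# leg confidence = P(R2.3 eliminated) * P(IR2.4 holds)
r2_3_eliminated = 0.80  # doubt about statically detectable errors removed
ir2_4_holds = 0.20      # rule sufficiency after UC3.4-UC3.6
print(round(r2_3_eliminated * ir2_4_holds, 2))  # -> 0.16
```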

  • Slide 41

    Calculation of multi-leg confidence

    • C1.1: The system is acceptably reliable (0.62)
    • Cx1.1a: The system is acceptably reliable if pfd < 0.001 (with 99% statistical certainty)
    • Statistical leg (evidence 1.00, leg confidence 0.55):
      – R2.1: Unless at least one failure is observed in a sequence of 4000 (or fewer) operationally random test executions (0.55)
      – IR2.2: If no failures are observed in a sequence of 4000 operationally random test executions, then the system is acceptably reliable
    • Static-analysis leg (0.80 × 0.20 = leg confidence 0.16):
      – R2.3: Unless there are statically detectable coding errors (0.80)
      – IR2.4: If there are no static coding errors, then the system is acceptably reliable (0.20)

    0.62 = 1 – (1 – 0.55)(1 – 0.16)

    Each leg is independent of the other because defeaters in each leg are independent:
    • The truth of a defeater in one leg does not imply the truth of a defeater in an “independent” leg, i.e.,
      – The validity of an argument in one leg is unaffected by defects in another leg
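The slide combines the two independent legs as the complement of both legs failing:

```python
# Combined confidence that at least one leg validly supports C1.1,
# assuming the two legs are independent (as argued on the slide).
statistical_leg = 0.55  # testing leg
static_leg = 0.16       # static-analysis leg
combined = 1 - (1 - statistical_leg) * (1 - static_leg)
print(round(combined, 2))  # -> 0.62
```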

  • Slide 42

    Multi-legged argument definitions

    “Standard” definition
    • “Independent” evidence and argument supporting the same claim, e.g., proving and testing

    Eliminative argument definition
    • Two or more argument legs rooted at claim C whose defeaters are independent
      – Two defeaters are independent if the truth of one defeater does not affect the truth of the other
      – An argument leg is defined by an inference rule connecting rebutting defeaters to claim C

    (Two-leg confidence map as on Slide 41)

  • Slide 43

    Protection Against Argumentation Error

    A multi-legged argument is more robust
    • Because defeaters are independent
      – If one leg is defective (i.e., if some defeater is true), the other leg still provides some reason to believe the parent claim
      – The argument is more likely to hold up in the future as more information (making a defeater true) becomes available

  • Slide 44

    Reduction in Doubt

    Each leg attacks alternative (top-level) reasons for doubt
    • Alternative: different (top-level) defeaters and inference rules, i.e.,
      – (/\~Ri) → C, (/\~Rj) → C, where Ri ≠ Rj
      – Defeaters in each leg are independent

    Probability that at least one leg is valid is 1 – (probability that no leg is valid)
    • Probability that no leg is valid: ∏(1 – Li)
    • Assumes the Li are independent
    • Li = 0 implies information from that leg does not increase confidence in the parent claim
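The ∏(1 – Li) expression generalizes Slide 41’s two-leg calculation to any number of independent legs; a small sketch (the helper name is illustrative, not from the slides):

```python
import math

def multi_leg_confidence(leg_confidences):
    # 1 - prod(1 - L_i): probability that at least one
    # independent leg validly supports the parent claim.
    return 1 - math.prod(1 - li for li in leg_confidences)

print(round(multi_leg_confidence([0.55, 0.16]), 2))  # -> 0.62
```

A leg with Li = 0 drops out of the product, matching the slide’s note that such a leg contributes no confidence.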

  • Slide 45

    Summary

    How much confidence does each leg contribute?
    • Depends on the extent to which defeaters are eliminated in a given leg
      – Various rules can be used to determine partial defeater elimination

    How can independence of two legs be determined?
    • By determining that doubts are not shared among the legs
      – i.e., the truth of a defeater in one leg does not imply the truth of a defeater in the other leg

  • Slide 46

    Summary

    Eliminative argumentation (identification of doubts and their elimination) provides a framework for building confidence in an argument and in properties of a system
    • Confidence maps are a visualization of an eliminative argument
    • They explicitly document reasons for doubt and their elimination

    An assurance case provides an argument as justification for a claim

    We seek to provide justification for belief in a claim

    We do so by identifying and eliminating defeaters (doubts) relevant to the claim and the argument

  • Slide 47

    Contact Information

    John B. Goodenough
    SEI Fellow (retired)
    Telephone: +1 412-268-6391
    Email: jbg@sei.cmu.edu

    U.S. Mail
    Software Engineering Institute
    4500 Fifth Avenue
    Pittsburgh, PA 15213-2612
    USA

    Charles B. Weinstock
    Senior Member of the Technical Staff
    Telephone: +1 412-268-7719
    Email: weinstock@sei.cmu.edu

    Ari Z. Klein
    Ph.D. Candidate – Rhetoric
    Telephone: +1 412-268-7700
    Email: azklein@sei.cmu.edu

  • Slide 48

    OBJECTIONS

  • Slide 49

    Objections

    What if a relevant defeater has not been identified?

    What if a defeater cannot be completely eliminated?

    Not all defeaters are of equal importance. How is this handled?

    Eliminative induction (Baconian probability) seems rather weak (compared to Bayesian probability or enumerative induction). What is being gained (and lost) with this approach?

    The potential number of defeaters seems incredibly large for a real system. Is this approach practical?

  • Slide 50 (Objections; repeats Slide 49)

  • Slide 51

    What if a defeater is unidentified?

    Assurance cases are inherently defeasible; it is always possible that something has been omitted
    • Complete confidence (n|n) only reflects what is known at a particular point in time

    Uncertainty about completeness is itself a reason for doubt that needs to be recognized and countered
    • “Not all hazards have been identified”
    • Assessment of a case must consider this as a reason for doubting the adequacy of the case
      – Eliminative argumentation provides a method for identifying where sources of doubt can be found

    Eliminative argumentation provides ways of thinking about and explaining why one should have confidence in a case, or a claim
    • The approach does not, of course, guarantee a sound case
    • But it helps in developing sound and complete arguments

  • Slide 52 (Objections; repeats Slide 49)

  • Slide 53

    Incomplete Defeater Elimination

    We have addressed this in our examples

    We accept that in practical cases there will always be some residual doubt
    • The issue is whether the remaining doubts are considered significant or not

    The general principle is that uneliminated lower-level doubts propagate to higher-level claims
    • The goal is to formulate lower-level defeaters that can be eliminated by appropriate evidence and inference rules

  • Slide 54 (Objections; repeats Slide 49)

  • Slide 55

    Differential Defeater Importance

    The elimination of some defeaters seems more important (in some intuitive sense) than others. A strict eliminative induction (Baconian) approach treats all uneliminated defeaters equally.
    • Consider hazards identified in a safety analysis. All above a certain threshold must be eliminated/mitigated
      – Assessing their relative importance/likelihood is not profitable

    This is a current subject of research

  • Slide 56 (Objections; repeats Slide 49)

  • Slide 57

    Why Use Eliminative Argumentation?

    With eliminative argumentation, we learn something concrete about why a system works
    • With enumerative induction, we at best only learn something statistical (although this can be valuable knowledge)

    Eliminative argumentation avoids “confirmation bias”
    • To the extent evidence eliminates defeaters, we know the argument cannot be invalid for the situations covered by those defeaters

  • Slide 58 (Objections; repeats Slide 49)

  • Slide 59

    Practical Considerations

    The amount of evidence and argument for a real system is inherently quite large
    • Can eliminative argumentation provide a more thorough and cost-effective basis for developing confidence in system behavior?
    • Are assurance efforts more effective and focused?

    More research is needed