Fault Location via State Alteration

CS 206, Fall 2009

Value Replacement: Overview

INPUT: Faulty program and test suite (1+ failing runs)

TASK: (1) Perform value replacements in failing runs
      (2) Rank program statements according to collected information

OUTPUT: Ranked list of program statements

Aggressive state alteration to locate faulty program statements [Jeffrey et al., ISSTA 2008]

Alter State by Replacing Values

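[Figure: three executions side by side. Passing Execution: produces Correct Output. Failing Execution: reaches the ERROR and produces Incorrect Output. Failing Execution with Altered State: reaches the ERROR, then REPLACE VALUES is applied; is the resulting output Correct or Incorrect?]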

Example of a Value Replacement

PASSING EXECUTION: (output: 1)

1: read (x, y);          x = 1, y = 1
2: a := x - y;           x = 1, y = 1, a = 0
3: if (x < y)            x = 1, y = 1 (F)
4:   write (a);
   else
5:   write (a + 1);      output = 1
FAILING EXECUTION: (expected output: 1) (actual output: 3)

1: read (x, y);          x = 1, y = 1
2: a := x + y;           x = 1, y = 1, a = 2     ERROR: plus should be minus
3: if (x < y)            x = 1, y = 1 (F)
4:   write (a);
   else
5:   write (a + 1);      output = 3
STATE ALTERATION: (expected output: 1) (actual output: 1)

1: read (x, y);          x = 1, y = 1
2: a := x + y;           ERROR; REPLACE VALUES: original {x = 1, y = 1, a = 2} -> alternate {x = 0, y = 1, a = 1}
3: if (x < y)            x = 0, y = 1 (T)
4:   write (a);          output = 1
   else
5:   write (a + 1);

Interesting Value Mapping Pair (IVMP):
  Location: statement 2, instance 1
  Original: {a = 2, x = 1, y = 1}
  Alternate: {a = 1, x = 0, y = 1}

Searching for IVMPs in a Failing Run

Step 1: Compute the Value Profile
  Set of values used at each statement with respect to all available test case executions

Step 2: Replace values to search for IVMPs

  For each statement instance in failing run
    For each alternate set of values in value profile
      Replace values to see if an IVMP is found
    Endfor
  Endfor
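The two steps can be sketched in Python on the deck's faulty five-statement program (a := x + y instead of x - y). This is a minimal illustration, not the authors' implementation: the helper names `run`, `value_profile`, and `find_ivmps` are hypothetical, and the sketch assumes every statement executes at most once per run, so a statement number doubles as its only instance.

```python
def run(x, y, patch=None):
    """Execute the faulty program and record the values used at each statement.

    patch is an optional (stmt, values) pair: when control reaches stmt,
    the listed variables are overwritten with the alternate values.
    """
    state, trace = {}, {}

    def visit(stmt, names):
        if patch and patch[0] == stmt:
            state.update(patch[1])          # the value replacement
        trace[stmt] = {n: state[n] for n in names}

    state["x"], state["y"] = x, y           # 1: read (x, y)
    visit(1, ("x", "y"))
    state["a"] = state["x"] + state["y"]    # 2: a := x + y  (fault: should be -)
    visit(2, ("x", "y", "a"))
    state["branch"] = state["x"] < state["y"]   # 3: if (x < y)
    visit(3, ("x", "y", "branch"))
    if state["branch"]:
        state["output"] = state["a"]        # 4: write (a)
        visit(4, ("a", "output"))
    else:
        state["output"] = state["a"] + 1    # 5: write (a + 1)
        visit(5, ("a", "output"))
    return state["output"], trace


def value_profile(tests):
    """Step 1: collect the distinct value sets seen at each statement."""
    profile = {}
    for x, y in tests:
        _, trace = run(x, y)
        for stmt, values in trace.items():
            if values not in profile.setdefault(stmt, []):
                profile[stmt].append(values)
    return profile


def find_ivmps(failing_input, expected, tests):
    """Step 2: try every alternate value set at every executed statement."""
    profile = value_profile(tests)
    _, original = run(*failing_input)
    ivmps = []
    for stmt, orig_values in original.items():
        for alt in profile[stmt]:
            if alt == orig_values:
                continue
            output, _ = run(*failing_input, patch=(stmt, alt))
            if output == expected:          # replacement fixes the output
                ivmps.append((stmt, orig_values, alt))
    return ivmps


tests = [(1, 1), (-1, 0), (0, 0)]
ivmps = find_ivmps((1, 1), expected=1, tests=tests)
```

With the three-test suite used in the deck's example, the sketch finds IVMPs at statements 1, 2, and 5 for the failing input (1, 1), and none at statement 3.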

Searching for IVMPs: Example

1: read (x, y);
2: a := x + y;     // + should be -
3: if (x < y)
4:   write (a);
   else
5:   write (a + 1);

Test Case (x, y)   Actual Output   Expected Output
(1, 1)             3               1
(-1, 0)            -1              -1
(0, 0)             1               1

VALUE PROFILE   (1, 1)                     (-1, 0)                     (0, 0)
1:              x = 1, y = 1               x = -1, y = 0               x = 0, y = 0
2:              x = 1, y = 1, a = 2        x = -1, y = 0, a = -1       x = 0, y = 0, a = 0
3:              x = 1, y = 1, branch = F   x = -1, y = 0, branch = T   x = 0, y = 0, branch = F
4:                                         a = -1, output = -1
5:              a = 2, output = 3                                      a = 0, output = 1

1: read (x, y);

VALUE REPLACEMENT                     RESULTING OUTPUT   IVMP?
{x = 1, y = 1} -> {x = -1, y = 0}     -1                 no

1: read (x, y);

VALUE REPLACEMENT                     RESULTING OUTPUT   IVMP?
{x = 1, y = 1} -> {x = 0, y = 0}      1                  yes

IVMPs Identified:
  stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )

2: a := x + y;

VALUE REPLACEMENT
{x = 1, y = 1, a = 2} -> {x = -1, y = 0, a = -1}
{x = 1, y = 1, a = 2} -> {x = 0, y = 0, a = 0}

IVMPs Identified:
  stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
  stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )

3: if (x < y)

VALUE REPLACEMENT
{x = 1, y = 1, branch = F} -> {x = -1, y = 0, branch = T}
{x = 1, y = 1, branch = F} -> {x = 0, y = 0, branch = F}

Neither replacement produces the expected output, so no IVMP is added for stmt 3.

IVMPs Identified:
  stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
  stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )

5: write (a + 1);

VALUE REPLACEMENT
{a = 2, output = 3} -> {a = 0, output = 1}

IVMPs Identified:
  stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
  stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )
  stmt 5, inst 1: ( {a=2, output=3} -> {a=0, output=1} )

DONE

IVMPs Identified:
  stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
  stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )
  stmt 5, inst 1: ( {a=2, output=3} -> {a=0, output=1} )

IVMPs at Non-Faulty Statements

Causes of IVMPs at non-faulty statements
  Statements in same dependence chain
  Coincidence

Consider multiple failing runs
  Stmt w/ IVMPs in more runs -> more likely to be faulty
  Stmt w/ IVMPs in fewer runs -> less likely to be faulty

Multiple Failing Runs: Example

1: read (x, y);
2: a := x + y;
3: if (x < y)
4:   write (a);
   else
5:   write (a + 1);

      Test Case (x, y)   Actual Output   Expected Output
[A]   (1, 1)             3               1
[B]   (0, 1)             1               -1
[C]   (-1, 0)            -1              -1
[D]   (0, 0)             1               1

Test Case [A] IVMPs:
  stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=1} )
  stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
  stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=1, a=1} )
  stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )
  stmt 5, inst 1: ( {a=2, output=3} -> {a=0, output=1} )
  stmts with IVMPs: {1, 2, 5}

Test Case [B] IVMPs:
  stmt 1, inst 1: ( {x=0, y=1} -> {x=-1, y=0} )
  stmt 2, inst 1: ( {x=0, y=1, a=1} -> {x=-1, y=0, a=-1} )
  stmt 4, inst 1: ( {a=1, output=1} -> {a=-1, output=-1} )
  stmts with IVMPs: {1, 2, 4}

Resulting ranking:
  {1, 2}   MOST LIKELY TO BE FAULTY
  {4, 5}
  {3}      LEAST LIKELY TO BE FAULTY

Ranking Statements using IVMPs

Sort in decreasing order of:
  The number of failing runs in which the statement is associated with at least one IVMP

Break ties using Tarantula technique [Jones et al., ICSE 2002]:

$$\text{suspiciousness}(stmt) = \frac{\text{fraction of failing runs exercising stmt}}{\text{fraction of passing runs exercising stmt} + \text{fraction of failing runs exercising stmt}}$$
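As a rough sketch of this ranking rule, the Python below sorts statements by the number of failing runs with an IVMP and breaks ties with the Tarantula ratio. The function names and the final tie-break on statement number are assumptions made for illustration; the coverage sets are derived by hand from the five-statement example with failing runs (1, 1) and (0, 1) and passing runs (-1, 0) and (0, 0).

```python
def tarantula(stmt, passing_cov, failing_cov):
    """Failing fraction divided by (passing fraction + failing fraction)."""
    fail = sum(stmt in run for run in failing_cov) / len(failing_cov)
    passed = sum(stmt in run for run in passing_cov) / len(passing_cov)
    return fail / (passed + fail) if passed + fail else 0.0


def rank_statements(stmts, ivmp_stmts_per_failing_run, passing_cov, failing_cov):
    """Most suspicious first: IVMP run count, then Tarantula, then stmt number."""
    def key(stmt):
        runs_with_ivmp = sum(stmt in s for s in ivmp_stmts_per_failing_run)
        return (-runs_with_ivmp, -tarantula(stmt, passing_cov, failing_cov), stmt)
    return sorted(stmts, key=key)


failing_cov = [{1, 2, 3, 5}, {1, 2, 3, 4}]   # runs [A] and [B]
passing_cov = [{1, 2, 3, 4}, {1, 2, 3, 5}]   # runs [C] and [D]
ivmp_stmts = [{1, 2, 5}, {1, 2, 4}]          # stmts with IVMPs in [A] and [B]

ranking = rank_statements([1, 2, 3, 4, 5], ivmp_stmts, passing_cov, failing_cov)
```

On this data the ordering groups statements as {1, 2}, then {4, 5}, then {3}, which agrees with the ranking shown for the multiple-failing-runs example.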

Techniques Evaluated

Value Replacement technique
  Consider all available failing runs (ValRep-All)
  Consider only 2 failing runs (ValRep-2)
  Consider only 1 failing run (ValRep-1)

Tarantula technique (Tarantula)
  Consider all available test cases
  Most effective technique known for our benchmarks

Only rank statements exercised by failing runs

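Metric for Comparison

Score for each ranked statement list:

$$\text{score} = \frac{\text{size of list} - \text{rank of the faulty stmt}}{\text{size of list}} \times 100\%$$

  Represents percentage of statements that need not be examined before error is located
  Higher score is better

[Figure: two ranked lists, each ordered from High Suspiciousness to Low Suspiciousness. The list with the faulty statement nearer the high-suspiciousness end gets the Higher Score; the other gets the Lower Score.]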

Benchmark Programs

Program    LOC   # Faulty Ver.   Avg. Suite Size (Pool Size)
tcas       138   41              17 (1608)
totinfo    346   23              15 (1052)
sched      299   9               20 (2650)
sched2     297   9               17 (4130)
ptok       402   7               17 (4130)
ptok2      483   9               23 (4115)
replace    516   31              29 (5542)

129 faulty programs (errors) derived from 7 base programs

Each faulty program is associated with a branch-coverage adequate test suite containing at least 5 failing and 5 passing test cases

Test suite used by Value Replacement, test pool used by Tarantula

Effectiveness Results

Value Replacement technique

[Figure: % of Faulty Programs (0 to 100) versus Score (%) (0 to 100) for ValRep-All, ValRep-2, and ValRep-1.]

Number (%) of faulty programs
Score    ValRep-All    ValRep-2      ValRep-1
≥ 99%    23 (17.8%)    21 (16.3%)    18 (14.0%)
≥ 90%    89 (69.0%)    84 (65.1%)    75 (58.1%)

Effectiveness Results: Comparison to Tarantula

[Figure: % of Faulty Programs (0 to 100) versus Score (%) (0 to 100) for ValRep-All, ValRep-2, ValRep-1, and Tarantula.]

Number (%) of faulty programs
Score    ValRep-All    ValRep-2      ValRep-1      Tarantula
≥ 99%    23 (17.8%)    21 (16.3%)    18 (14.0%)    7 (5.4%)
≥ 90%    89 (69.0%)    84 (65.1%)    75 (58.1%)    48 (37.2%)

Value Replacement: Summary

Highly Effective
  Precisely locates 39 / 129 errors (30.2%)
  Most effective previously known: 5 / 129 (3.9%)

Limitations
  Can require significant computation time to search for IVMPs
  Assumes multiple failing runs are caused by the same error

Handling Multiple Errors

Effectively locate multiple simultaneous errors [Jeffrey et al., ICSM 2009]

Iteratively compute a ranked list of statements to find and fix one error at a time

Three variations of this technique
  MIN: minimal computation; use same list each time
  FULL: full computation; produce new list each time
  PARTIAL: partial computation; revise list each time

Multiple-Error Techniques

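[Figure: flowcharts of the four workflows.

Single Error: Faulty Program and Test Suite -> Value Replacement -> Ranked List of Program Statements -> Developer Find/Fix Error -> Done.

Multiple Errors (MIN): same pipeline, followed by the check "Failing Run Remains?". No -> Done. Yes -> the developer finds and fixes the next error using the same ranked list.

Multiple Errors (FULL): same check, but Yes -> Value Replacement runs again to produce a new ranked list.

Multiple Errors (PARTIAL): same check, but Yes -> Partial Value Replacement revises the ranked list.]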

PARTIAL Technique

Step 1: Initialize ranked lists and locate first error
  For each statement s, compute a ranked list by considering only failing runs exercising s
  Report ranked list with highest suspiciousness value at the front of the list

Step 2: Iteratively revise ranked lists and locate each remaining error
  For each remaining failing run that exercises the statement just fixed, recompute IVMPs
  Update any affected ranked lists
  Report ranked list with the most different elements at the front of the list, compared to previously-selected lists
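Step 1 can be illustrated with a short Python sketch. It assumes that suspiciousness is simply the count of considered failing runs in which a statement has an IVMP, and that ties are ordered by statement number; both are assumptions chosen to match the numbers in the deck's PARTIAL example (runs A, B, C), and the function names are hypothetical.

```python
def ranked_lists(traces, ivmp_stmts, all_stmts):
    """For each statement s, rank all statements using only runs that exercise s.

    traces:     failing run -> statements it executes
    ivmp_stmts: failing run -> statements with at least one IVMP in that run
    """
    lists = {}
    for s in all_stmts:
        runs = [r for r, trace in traces.items() if s in trace]
        susp = {t: sum(t in ivmp_stmts[r] for r in runs) for t in all_stmts}
        # highest suspiciousness first, statement number as tie-break
        lists[s] = sorted(susp.items(), key=lambda kv: (-kv[1], kv[0]))
    return lists


def first_list_to_report(lists):
    """Pick the list whose front element has the highest suspiciousness."""
    return max(sorted(lists), key=lambda s: lists[s][0][1])


traces = {"A": (1, 2, 3, 5), "B": (1, 2, 3, 5), "C": (1, 2, 4, 5)}
ivmp_stmts = {"A": {2, 5}, "B": {1, 2}, "C": {2, 4, 5}}

lists = ranked_lists(traces, ivmp_stmts, all_stmts=[1, 2, 3, 4, 5])
chosen = first_list_to_report(lists)
```

Here the lists for statements 1, 2, and 5 all start with statement 2 at suspiciousness 3, so any of them could be reported; the sketch picks list 1, as the example assumes.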

PARTIAL Technique: Example

[Figure: program with statements 1 to 5 (2 faulty statements); statements 3 and 4 are alternative branches between 2 and 5.]

Failing Run   Execution Trace   Statements with IVMPs
A             (1, 2, 3, 5)      {2, 5}
B             (1, 2, 3, 5)      {1, 2}
C             (1, 2, 4, 5)      {2, 4, 5}

Computed Ranked Lists: statement(suspiciousness)
1:   2(3), 5(2), 1(1), 4(1), 3(0)   [based on runs A, B, C]
2:   2(3), 5(2), 1(1), 4(1), 3(0)   [based on runs A, B, C]
3:   2(2), 1(1), 5(1), 3(0), 4(0)   [based on runs A, B]
4:   2(1), 4(1), 5(1), 1(0), 3(0)   [based on run C]
5:   2(3), 5(2), 1(1), 4(1), 3(0)   [based on runs A, B, C]

Report list 1, 2, or 5 (assume 1) -> Fix faulty statement 2

PARTIAL Technique: Example

[Figure: the same program with statements 1 to 5, now with 1 faulty statement.]

Failing Run   Execution Trace   Statements with IVMPs
C             (1, 2, 4, 5)      {4}

Computed Ranked Lists: statement(suspiciousness)
2:   2(2), 1(1), 4(1), 5(1), 3(0)   [based on runs A, B, C] (C updated)
3:   2(2), 1(1), 5(1), 3(0), 4(0)   [based on runs A, B] (no updates)
4:   4(1), 1(0), 2(0), 3(0), 5(0)   [based on run C] (C updated)
5:   2(2), 1(1), 4(1), 5(1), 3(0)   [based on runs A, B, C] (C updated)

Report list 4 -> Fix faulty statement 4 -> Done

Techniques Compared

(MIN) Only compute ranked list once

(FULL) Fully recompute ranked list each time

(PARTIAL) Compute IVMPs for subset of failing runs and revise ranked lists each time

(ISOLATED) Locate each error in isolation

Benchmark Programs

Experimental Benchmark Programs

Program    # 5-Error Faulty Versions   Average Suite Size (# Failing Runs / # Passing Runs)
tcas       20                          11 (5 / 6)
totinfo    20                          22 (10 / 12)
sched      20                          29 (10 / 19)
sched2     20                          30 (9 / 21)
ptok       2                           32 (8 / 24)
ptok2      11                          29 (5 / 24)
replace    20                          38 (9 / 29)

Each faulty program contains 5 seeded errors, each in a different stmt

Each faulty program is associated with a stmt-coverage adequate test suite such that at least one failing run exercises each error

Effectiveness Results

[Figure: Effectiveness Comparison of Value Replacement Techniques. Bar chart of Avg. Score per Ranked List (%), axis from 50 to 90, for tcas, totinfo, sched, sched2, ptok, ptok2, and replace; series: Isolated, Full, Partial, Min.]

Efficiency of Value Replacement

Searching for IVMPs is time-consuming

Lossy techniques
  Reduce search space for finding IVMPs
  May result in some missed IVMPs
  Performed for single-error benchmarks

Lossless techniques
  Only affect the efficiency of implementation
  Result in no missed IVMPs
  Performed for multi-error benchmarks

5 failing runs x 50,000 stmt instances per run x 15 alt value sets per instance
  = 3.75 million value replacement program executions
  Over 10 days if each execution requires a quarter-second
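The cost estimate can be checked with a few lines of Python. The variable names are only illustrative; the numbers are the ones quoted on the slide.

```python
failing_runs = 5
stmt_instances_per_run = 50_000
alt_value_sets_per_instance = 15
seconds_per_execution = 0.25  # a quarter-second per re-execution

# One re-execution per (failing run, statement instance, alternate value set).
executions = failing_runs * stmt_instances_per_run * alt_value_sets_per_instance
days = executions * seconds_per_execution / (60 * 60 * 24)

print(f"{executions:,} executions, about {days:.1f} days")
```

The product is 3,750,000 executions, which at 0.25 seconds each comes to a little under 11 days of sequential execution.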

Lossy Techniques

Limit considered statement instances
  Find IVMP -> skip all subsequent instances of the same statement in the current run
  Don't find IVMP -> skip statement in subsequent runs

Limit considered alternate value sets
  Only use min <, max <, min >, and max >, as compared to original value

[Figure: alternate values on a number line around orig: min <, (skip), max <, orig, min >, (skip), max >.]
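A minimal sketch of the second reduction, assuming a single numeric value per replacement: keep only the smallest and largest alternates below the original and the smallest and largest alternates above it. How the technique treats multi-variable value sets is not stated here, so `limited_alternates` is a hypothetical helper for the scalar case only.

```python
def limited_alternates(original, alternates):
    """Return at most four alternates: min <, max <, min >, max > the original."""
    below = [v for v in alternates if v < original]
    above = [v for v in alternates if v > original]
    chosen = []
    for group in (below, above):
        if group:
            for v in (min(group), max(group)):
                if v not in chosen:      # a one-element group contributes once
                    chosen.append(v)
    return chosen


picked = limited_alternates(5, [1, 2, 3, 4, 6, 7, 8, 9])
```

For an original value of 5 and alternates 1 through 9, only 1, 4, 6, and 9 are tried; the values in between are skipped.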

Lossless Techniques

[Figure: an Original Execution passing through stmt instance 1, stmt instance 2, and stmt instance 3 (assume 2 alternate value sets at each stmt instance). In the Regular Value Replacement Executions the value replacements are independent of each other, so portions of the original execution are duplicated multiple times: the segments of the original execution are re-run (x 6), (x 4), and (x 2) times.]

Efficiency Improvements:
  (1) Fork child process to do each value replacement in original failing execution
  (2) Perform value replacements in parallel

Lossless Techniques

With Redundant Execution Removed

(no duplication of any portion of original execution)

With Parallelization

(total time required to perform all value replacements is reduced)

Search Reduction by Lossy Techniques

[Figure: Reduction in # of Executions by Lossy Techniques: Single-Error Benchmarks. # Value Replacements (in millions, 0 to 20) for each Faulty Program (0 to 120); series: Full Search, Limited Search.]

# val replacements needed
        Full      Limited
Mean    2.0 M     0.03 M
Max     21.5 M    0.4 M

On average, the total number of executions is reduced by a factor of 67

Time Required for Reduced Search

[Figure: Time Required to Search using Lossy Techniques: Single-Error Benchmarks. % Faulty Programs Completed (0 to 100) versus Time (minutes, 0 to 800).]

Mean: 55.6 min
Max: 846.5 min
< 1 min: 39% of progs

Only 13% of faulty programs required more than 100 minutes of IVMP search time

Time Required with Lossless Techniques

[Figure: Time to Search in Each Faulty Program using Lossless Techniques: Multi-Error Benchmarks. Avg. Time (seconds, 0 to 600) for tcas, totinfo, sched, sched2, ptok, ptok2, and replace; series: Full, Partial, Min.]

With Lossless techniques, multiple errors in a program can be located in minutes.

With Lossy techniques, some single errors require hours to locate.

Execution Suppression

Efficient location of memory errors through targeted state alteration [Jeffrey et al., ICSM 2008]

Alter state in a way that will definitely get closer to the goal each time

Goal: identify first point of memory corruption in a failing execution

Memory Errors and Corruption

Memory errors
  Buffer overflow
  Uninitialized read
  Dangling pointer
  Double free
  Memory leak

Memory corruption
  Incorrect memory location is accessed, or
  Incorrect value is assigned to a pointer variable

Study of Memory Corruption

Traversal of error -> First point of memory corruption -> Failure

Program      LOC       Memory Error Type   Analyzed Input Types
gzip         6.3 K     Global overflow     No crash, Crash 1
man          10.8 K    Global overflow     Crash 1
bc           10.7 K    Heap overflow       No crash, Crash 1
pine         211.9 K   Heap overflow       No crash, Crash 1
mutt         65.9 K    Heap overflow       No crash, Crash 1
ncompress    1.4 K     Stack overflow      No crash, Crash 1, Crash 2
polymorph    1.1 K     Stack overflow      No crash, Crash 1, Crash 2
xv           69.2 K    Stack overflow      No crash, Crash 1
tar          28.4 K    NULL dereference    Crash 1
tidy         35.9 K    NULL dereference    Crash 1
cvs          104.1 K   Double free         Crash 1

Observations from Study

Total distance from point of error traversal until failure can be large

Different inputs triggering memory corruption may result in different crashes or no crashes

Distance from error traversal to first memory corruption is considerably less than distance from first memory corruption to failure

Execution Suppression: High-Level

Program crash reveals memory corruption

Key: assume memory corruption leads to crash

Component 1: suppression
  Iteratively identify first point of memory corruption
  Omit the effect of certain statements during execution

Component 2: variable re-ordering
  Expose crashes where they may not occur
  Helpful since key assumption does not always hold

Suppression: How it Works

While a crash occurs
  Identify accessed location L directly causing crash
  Identify last definition D of location L
  Re-execute program and omit execution of D and anything dependent on it
Endwhile
Report the statement associated with the most recent D (the first point of memory corruption)

Suppression: Example

1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();
9: int a = *p1 + *p2;
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;
13: structArray[*p2]->f = 0;
14: free(p2);
15: free(q2);

Stmt 4: copy-paste error: "x" should be "y"
Stmt 8: clobbers def @ stmt 6
Stmts 9 - 11: propagation
Stmt 12: potential buffer overflow
Stmt 13: potential overflow or NULL dereference
Stmt 15: double free

Suppression: Example

Stmt 4: The error as well as the first point of memory corruption

(Located in 4 executions)
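The suppression loop can be simulated on this 15-statement example with a toy Python model. Everything below is an illustrative assumption rather than the tool's implementation: `PROGRAM` hand-encodes, for each statement, the location it defines, the locations it uses, and the location whose bad value would make it crash; `x0`, `x1`, and `y1` stand for x[0], x[1], and y[1]; and suppression points accumulate across re-executions.

```python
# stmt -> (location defined, locations used, location whose bad value crashes it)
PROGRAM = {
    1: ("p1", [], None),
    2: ("p2", [], None),
    3: ("q1", [], None),
    4: ("q2", [], None),              # the error: q2 wrongly aliases p2
    5: ("x1", ["p1"], None),          # *p1 = readInt()
    6: ("x0", ["p2"], None),          # *p2 = readInt()
    7: ("y1", ["q1"], None),          # *q1 = readInt()
    8: ("x0", ["q2"], None),          # *q2 = readInt(), clobbers stmt 6
    9: ("a", ["x1", "x0"], None),
    10: ("b", ["y1", "x0"], None),
    11: ("c", ["a", "b"], None),
    12: (None, ["c"], "c"),           # intArray[c] = 0
    13: (None, ["p2", "x0"], "x0"),   # structArray[*p2]->f = 0
    14: (None, ["p2"], None),         # free(p2)
    15: (None, ["q2"], "q2"),         # free(q2): double free
}
ERROR_STMT = 4


def execute(suppression_points):
    """One run. Returns (crashing stmt, last definition D of location L) or None."""
    bad, corrupt, last_def = set(), set(), {}
    for stmt, (target, uses, crash_loc) in PROGRAM.items():
        if stmt in suppression_points or any(u in corrupt for u in uses):
            if target:
                corrupt.add(target)   # effect omitted; dependents are skipped too
            continue
        if crash_loc in bad:
            return stmt, last_def[crash_loc]
        if target:
            last_def[target] = stmt
            corrupt.discard(target)
            if stmt == ERROR_STMT or any(u in bad for u in uses):
                bad.add(target)       # wrong value propagates
            else:
                bad.discard(target)
    return None


def locate_first_corruption():
    """Re-execute, suppressing the latest D, until no crash occurs."""
    suppressed, executions, identified, crashes = set(), 0, None, []
    while True:
        executions += 1
        crash = execute(suppressed)
        if crash is None:
            return identified, executions, crashes
        crash_stmt, identified = crash
        crashes.append(crash_stmt)
        suppressed.add(identified)


stmt, executions, crashes = locate_first_corruption()
```

Under these assumptions the model crashes at statements 12, 13, and 15 in turn, suppresses statements 11, 8, and 4, and reports statement 4 after four executions.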

Example: Execution 1 of 4

Stmt:   Loc Defined:
1       p1
2       p2
3       q1
4       q2
5       *p1
6       *p2
7       *q1
8       *p2/*q2
9       a
10      b
11      c
12      CRASH

Action: Suppress definition of c at stmt 11 and all of its effects



Example: Execution 2 of 4

Stmt:   Loc Defined:
1       p1
2       p2
3       q1
4       q2
5       *p1
6       *p2
7       *q1
8       *p2/*q2
9       a
10      b
11      (suppressed)
12      (suppressed)
13      CRASH

Action: Suppress def of *p2/*q2 at stmt 8 and all of its effects


Example: Execution 3 of 4

Stmt:   Loc Defined:
1       p1
2       p2
3       q1
4       q2
5       *p1
6       *p2
7       *q1
8-13    (suppressed)
14
15      CRASH

Action: Suppress q2 at stmt 4 and effects


Example: Execution 4 of 4

Stmt:   Loc Defined:
1       p1
2       p2
3       q1
4       (suppressed)
5       *p1
6       *p2
7       *q1
8-13    (suppressed)
14-15   no crash

Result: stmt 4 identified

Example: Summary

Execution 1: stmts 1 to 11 define p1, p2, q1, q2, *p1, *p2, *q1, *p2/*q2, a, b, c; stmt 12: CRASH!
Execution 2: suppress the definition of c (stmt 11); stmt 13: CRASH!
Execution 3: suppress the definition of *p2/*q2 (stmt 8); stmt 15: CRASH!
Execution 4: suppress the definition of q2 (stmt 4); no crash; REPORT 4

Variable Re-Ordering

Re-order variables in memory prior to execution

Try to cause a crash due to corruption, in cases where crash does not occur

Can overcome limitations of suppression
  Do not terminate prematurely when corruption does not cause crash
  Applicable to executions that do not crash

Position address variables after buffers

Variable Re-Ordering: Example

From program ncompress:

void comprexx(char **fileptr)
{
    int fdin;
    int fdout;
    char tempname[1024];
    strcpy(tempname, *fileptr);
    ...
}

[Figure: positions of tempname, fdout, fdin, and the return addr on the call stack, under the Original Variable Ordering and under the Variable Re-Ordering.]

Original ordering -> no stack smash

Re-ordering -> stack smash

The Complete Algorithm

Execution Suppression Algorithm

exec := original failing execution;
Do
    (A) identifiedStmt, exec := run suppression component using exec;
    (B) reordering, exec := run variable re-ordering component using exec;
While (crashing reordering is found);
Report identifiedStmt;

Explanation

(A) Runs suppression until no further crashes occur
(B) Attempts to expose an additional crash
Do/While loop iterates as long as variable re-ordering exposes a new crash

Execution Suppression: Evaluation

Suppression-only results (no variable re-ordering):

                                            Maximum Static Dependence Distance
                                            From Located Stmt To...
Program      Input Type   # Exec. Required   1st Memory Corruption   Error
gzip         Crash 1      2                  0                       0
man          Crash 1      2                  1                       2
bc           Crash 1      2                  0                       1
pine         Crash 1      2                  0                       5
mutt         Crash 1      3                  0                       1
ncompress    Crash 1      2                  0                       0
             Crash 2      4                  0                       0
polymorph    Crash 1      2                  0                       1
             Crash 2      3                  0                       1
xv           Crash 1      4                  0                       2
tar          Crash 1      2                  0                       0
tidy         Crash 1      2                  0                       0
cvs          Crash 1      2                  0                       0

Execution Suppression: Evaluation

Suppression and variable re-ordering results:

                                                                Maximum Static Dependence Distance
                                                                From Located Stmt To...
Program      Input Type   # Crashes Exposed   # Var R-O Exec.   1st Memory Corruption   Error
gzip         No Crash     0                   15                ---                     ---
man          Crash 1      1                   18                0                       1
bc           No Crash     1                   ---               0                       1
pine         No Crash     1                   ---               0                       5
mutt         No Crash     1                   ---               0                       1
ncompress    No Crash     1                   5                 0                       0
polymorph    No Crash     1                   6                 0                       1
xv           No Crash     1                   135               0                       2

Memory Errors in Multithreaded Programs

Assume programs run on single processor

Two main enhancements required for Execution Suppression

Reproduce failure on multiple executions
  Record failure-inducing thread interleaving
  Replay same interleaving on subsequent executions
  In general, other factors should be recorded/replayed

Identify data race errors
  Data race: concurrent, unsynchronized access of a shared memory location by multiple threads; at least one write
  Identified on-the-fly during suppression

Identifying Data Races

Data races involve WAR, WAW, or RAW dependencies

Identified points of suppression are writes
  Can involve WAR or WAW dependence prior to that point
  Can involve RAW dependence after that point

Monitor for an involved data race on-the-fly during a suppression execution

On-the-fly Data Race Detection

[Figure: timeline of a Suppression Execution. Before the identified suppression point (thread T1 writes to location L) lies the last access to L by a thread other than T1; after it lies the next read from L by a thread other than T1. Synchronization on L is monitored between these accesses.]

WAR or WAW data race may be identified at the suppression point

RAW data race may be identified at the next read from L by a thread other than T1

Potentially-Harmful Data Races

Given two memory accesses involved in a data race, force other thread interleavings to see if the state is altered

Memory access point 1: access to L from thread T1

Memory access point 2: access to L from thread T2

For each ready thread besides T1, re-execute from this point and schedule it in place of T1

If value in L is changed at this point, data race is potentially-harmful

Harmful Data Race Checking Executions

Evaluation with Multithreaded Programs

Multithreaded Benchmark Programs and Results:

Program      LOC     Error Type           # Executions Required   Precisely Identifies Error?
apache       191 K   Data race            3                       yes
mysql-1      508 K   Data race            3                       yes
mysql-2      508 K   Data race            3                       yes
mysql-3      508 K   Uninitialized read   2                       yes
prozilla-1   16 K    Stack overflow       2                       yes
prozilla-2   16 K    Stack overflow       4                       yes
axel         3 K     Stack overflow       3                       yes

Implementing Suppression: General

Global variables
  count: Dynamic instruction count value
  suppress: Suppression mode flag (boolean)

Variables associated with each register and memory word
  lastDef: count value associated with the instruction last defining it
  corrupt: whether associated effects need to be suppressed

At a program instruction (defines target, uses src1 and src2):

  Ensure instruction responsible for a crash can be identified:
    target.lastDef := ++count;

  Carry out suppression as necessary:
    if (current instruction is a suppression point)
      suppress := true; target.corrupt := true;
    if (suppress)
      if (src1.corrupt or src2.corrupt)
        target.corrupt := true;
      else
        execute instruction; target.corrupt := false;

Software/Hardware Support

Software-only implementation can incur relatively high overhead (SW)

Reduce overhead with hardware support
  Existing support in Itanium processors for deferred exception handling: extra bit for registers (HW1)
  Further memory augmentation: extra bit for memory words (HW1 + HW2)

Overheads compared in a simulator

Performance Overhead Comparison

[Figure: Suppression Overhead Comparison: Software and Hardware Support. Overhead (axis 0 to 12) for gzip, man, bc, pine, mutt, ncompress, polymorph, xv, tar, tidy, and cvs; series: SW, HW1, HW1 + HW2.]

Average Overhead: 7.2x (SW), 2.7x (HW1), 1.8x (HW1 + HW2)

Other Dynamic Error Location Techniques

Other state-alteration techniques
  Delta Debugging [Zeller et al., FSE 2002, TSE 2002, ICSE 2005]
    Search in space for values relevant to a failure
    Search in time for failure cause transitions
  Predicate Switching [Zhang et al., ICSE 2006]
    Alter predicate outcome to correct failing output

Value Replacement is more aggressive; Execution Suppression is better targeted for memory errors

Program slicing-based techniques
  Pruning Dynamic Slices with Confidence [Zhang et al., PLDI 2006]
  Failure-Inducing Chops [Gupta et al., ASE 2005]

Invariant-based techniques
  Daikon [Ernst et al., IEEE TSE Feb. 2001]
  AccMon [Zhou et al., HPCA 2007]

Statistical techniques
  Cooperative Bug Isolation [Ben Liblit doctoral dissertation, 2005]
  SOBER [Liu et al., FSE 2005]
  Tarantula [Jones et al., ICSE 2002]

Spectra-based techniques
  Nearest Neighbor [Renieris and Reiss, ASE 2003]

Future Directions

Enhancements to Value Replacement
  Improve scalability
  Study when IVMPs cannot be found at faulty statement

Enhancements to Execution Suppression
  Improve scalability of variable re-ordering
  Other techniques to expose crashes
  Handle memory errors that do not involve corruption

Applications to Fixing Errors
  IVMPs can be used in BugFix [Jeffrey et al., ICPC 2009]
  Comparatively little research in automated techniques for fixing errors

Applications to Tolerating Errors
  Suppression can be used to recover from failures in server programs [Nagarajan et al., ISMM 2009]
  Other applications?
