Fault Location via State Alteration
CS 206, Fall 2009
Value Replacement: Overview
INPUT: Faulty program and test suite (1+ failing runs)
TASK: (1) Perform value replacements in failing runs
      (2) Rank program statements according to collected information
OUTPUT: Ranked list of program statements
Aggressive state alteration to locate faulty program statements [Jeffrey et al., ISSTA 2008]
Alter State by Replacing Values
Passing Execution -> Correct Output
Failing Execution -> ERROR -> Incorrect Output
Failing Execution, Altered State -> ERROR -> REPLACE VALUES -> Correct? / Incorrect?
Example of a Value Replacement
PASSING EXECUTION (x = 1, y = 1; output: 1):
1: read (x, y);        x = 1, y = 1
2: a := x - y;         a = 0
3: if (x < y)          branch = F
4: write (a);
else
5: write (a + 1);      output = 1
Statements executed: 1, 2, 3, 5
Example of a Value Replacement
FAILING EXECUTION (x = 1, y = 1; expected output: 1; actual output: 3):
1: read (x, y);        x = 1, y = 1
2: a := x + y;         a = 2      ERROR: plus should be minus
3: if (x < y)          branch = F
4: write (a);
else
5: write (a + 1);      output = 3
Statements executed: 1, 2, 3, 5
Example of a Value Replacement
STATE ALTERATION (expected output: 1; actual output after replacement: 1):
1: read (x, y);        x = 1, y = 1
2: a := x + y;         Original Values: x = 1, y = 1, a = 2 (ERROR)
                       REPLACE VALUES with Alternate Values: x = 0, y = 1, a = 1
3: if (x < y)          branch = T
4: write (a);          output = 1
else
5: write (a + 1);
Interesting Value Mapping Pair (IVMP):
Location: statement 2, instance 1
Original: {a = 2, x = 1, y = 1}
Alternate: {a = 1, x = 0, y = 1}
Searching for IVMPs in a Failing Run
Step 1: Compute the Value Profile
  Set of values used at each statement with respect to all available test case executions
Step 2: Replace values to search for IVMPs
For each statement instance in failing run
For each alternate set of values in value profile
Replace values to see if an IVMP is found
Endfor
Endfor
Searching for IVMPs: Example
1: read (x, y);
2: a := x + y; // + should be -
3: if (x < y)
4: write (a);
else
5: write (a + 1);

Test Case (x, y)   Actual Output   Expected Output
(1, 1)             3               1
(-1, 0)            -1              -1
(0, 0)             1               1

VALUE PROFILE (values seen at each statement, by test case)
Stmt   (1, 1)                     (-1, 0)                      (0, 0)
1:     x = 1, y = 1               x = -1, y = 0                x = 0, y = 0
2:     x = 1, y = 1, a = 2        x = -1, y = 0, a = -1        x = 0, y = 0, a = 0
3:     x = 1, y = 1, branch = F   x = -1, y = 0, branch = T    x = 0, y = 0, branch = F
4:     (not executed)             a = -1, output = -1          (not executed)
5:     a = 2, output = 3          (not executed)               a = 0, output = 1

Statement 1: read (x, y);
VALUE REPLACEMENT: {x = 1, y = 1} -> {x = -1, y = 0}   RESULTING OUTPUT: -1   IVMP? No
Searching for IVMPs: Example (continued)
Statement 1: read (x, y);
VALUE REPLACEMENT: {x = 1, y = 1} -> {x = 0, y = 0}   RESULTING OUTPUT: 1   IVMP? Yes
Statement 2: a := x + y;
VALUE REPLACEMENT: {x = 1, y = 1, a = 2} -> {x = -1, y = 0, a = -1}   IVMP? No
VALUE REPLACEMENT: {x = 1, y = 1, a = 2} -> {x = 0, y = 0, a = 0}   IVMP? Yes
Statement 3: if (x < y)
VALUE REPLACEMENT: {x = 1, y = 1, branch = F} -> {x = -1, y = 0, branch = T}   IVMP? No
VALUE REPLACEMENT: {x = 1, y = 1, branch = F} -> {x = 0, y = 0, branch = F}   IVMP? No
Statement 5: write (a + 1);
VALUE REPLACEMENT: {a = 2, output = 3} -> {a = 0, output = 1}   IVMP? Yes
DONE
IVMPs Identified:
stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )
stmt 5, inst 1: ( {a=2, output=3} -> {a=0, output=1} )
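The two-step search can be sketched in Python on the five-statement example used in these slides. This is a minimal illustration, not the authors' implementation: the names `run` and `find_ivmps` are hypothetical, the toy program is hard-coded, and each statement is assumed to execute at most once per run (so "instance 1" is the only instance).

```python
def run(x, y, replace_at=None, alt=None):
    """Execute the toy faulty program (a := x + y, where '+' should be '-').

    If replace_at names a statement, the values at that statement are
    overwritten with the alternate value set alt before execution continues.
    Returns (output, profile), where profile maps each executed statement
    to the values observed there.
    """
    state, profile = {}, {}

    def step(stmt, values):
        if stmt == replace_at:
            values = dict(alt)  # value replacement happens here
        state.update(values)
        profile[stmt] = dict(values)

    step(1, {"x": x, "y": y})
    step(2, {"x": state["x"], "y": state["y"], "a": state["x"] + state["y"]})
    step(3, {"x": state["x"], "y": state["y"], "branch": state["x"] < state["y"]})
    if state["branch"]:
        step(4, {"a": state["a"], "output": state["a"]})
    else:
        step(5, {"a": state["a"], "output": state["a"] + 1})
    return state["output"], profile


def find_ivmps(tests, failing_input, expected):
    # Step 1: value profile = every value set seen at each statement in any run.
    value_profile = {}
    for x, y in tests:
        _, prof = run(x, y)
        for stmt, values in prof.items():
            seen = value_profile.setdefault(stmt, [])
            if values not in seen:
                seen.append(values)

    # Step 2: at each statement of the failing run, try each alternate value set.
    _, failing_profile = run(*failing_input)
    ivmps = []
    for stmt, original in failing_profile.items():
        for alt in value_profile[stmt]:
            if alt == original:
                continue
            output, _ = run(*failing_input, replace_at=stmt, alt=alt)
            if output == expected:  # replacement corrected the output
                ivmps.append((stmt, original, alt))
    return ivmps


ivmps = find_ivmps([(1, 1), (-1, 0), (0, 0)], failing_input=(1, 1), expected=1)
for stmt, original, alt in ivmps:
    print(f"stmt {stmt}: {original} -> {alt}")
```

Under these assumptions the search reports IVMPs at statements 1, 2, and 5, which agrees with the list identified in the walkthrough.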
IVMPs at Non-Faulty Statements
Causes of IVMPs at non-faulty statements
  Statements in same dependence chain
  Coincidence
Consider multiple failing runs
  Stmt w/ IVMPs in more runs: more likely to be faulty
  Stmt w/ IVMPs in fewer runs: less likely to be faulty
Multiple Failing Runs: Example
1: read (x, y);
2: a := x + y;
3: if (x < y)
4: write (a);
else
5: write (a + 1);

Test Case (x, y)   Actual Output   Expected Output
[A] (1, 1)         3               1
[B] (0, 1)         1               -1
[C] (-1, 0)        -1              -1
[D] (0, 0)         1               1

Test Case [A] IVMPs:
stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=1} )
stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=1, a=1} )
stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )
stmt 5, inst 1: ( {a=2, output=3} -> {a=0, output=1} )
stmts with IVMPs: {1, 2, 5}

Test Case [B] IVMPs:
stmt 1, inst 1: ( {x=0, y=1} -> {x=-1, y=0} )
stmt 2, inst 1: ( {x=0, y=1, a=1} -> {x=-1, y=0, a=-1} )
stmt 4, inst 1: ( {a=1, output=1} -> {a=-1, output=-1} )
stmts with IVMPs: {1, 2, 4}

MOST LIKELY TO BE FAULTY: {1, 2}, then {4, 5}, then {3}: LEAST LIKELY TO BE FAULTY
Ranking Statements using IVMPs
Sort in decreasing order of:
  The number of failing runs in which the statement is associated with at least one IVMP
Break ties using Tarantula technique [Jones et al., ICSE 2002]:
$$\text{suspiciousness}(s) = \frac{\text{fraction of failing runs exercising } s}{\text{fraction of passing runs exercising } s + \text{fraction of failing runs exercising } s}$$
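A small Python sketch of this ranking rule, applied to test cases [A] through [D] from the multiple-failing-runs example. The function name `rank_statements` and the data layout are hypothetical, and breaking any remaining ties by statement number is an assumption made only to keep the output deterministic.

```python
def rank_statements(ivmp_stmts, coverage, failing):
    """Rank statements from most to least likely faulty.

    ivmp_stmts: failing run -> set of statements with at least one IVMP
    coverage:   run -> set of statements the run exercises
    failing:    set of failing runs (assumed non-empty)
    """
    passing = set(coverage) - set(failing)
    statements = sorted(set().union(*coverage.values()))

    def ivmp_runs(s):
        # Primary key: number of failing runs with an IVMP at statement s.
        return sum(1 for stmts in ivmp_stmts.values() if s in stmts)

    def tarantula(s):
        # Tie-breaker: Tarantula suspiciousness of statement s.
        fail_frac = sum(1 for r in failing if s in coverage[r]) / len(failing)
        pass_frac = (sum(1 for r in passing if s in coverage[r]) / len(passing)
                     if passing else 0.0)
        total = fail_frac + pass_frac
        return fail_frac / total if total else 0.0

    # Assumption: remaining ties are broken by statement number.
    return sorted(statements, key=lambda s: (-ivmp_runs(s), -tarantula(s), s))


coverage = {"A": {1, 2, 3, 5}, "B": {1, 2, 3, 4}, "C": {1, 2, 3, 4}, "D": {1, 2, 3, 5}}
failing = {"A", "B"}
ivmp_stmts = {"A": {1, 2, 5}, "B": {1, 2, 4}}
ranking = rank_statements(ivmp_stmts, coverage, failing)
print(ranking)
```

On this data statements 1 and 2 have IVMPs in both failing runs, statements 4 and 5 in one run each, and statement 3 in none, so the grouping {1, 2}, {4, 5}, {3} from the example is reproduced.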
Techniques Evaluated
Value Replacement technique
  Consider all available failing runs (ValRep-All)
  Consider only 2 failing runs (ValRep-2)
  Consider only 1 failing run (ValRep-1)
Tarantula technique (Tarantula)
  Consider all available test cases
  Most effective technique known for our benchmarks
Only rank statements exercised by failing runs
Metric for Comparison
Score for each ranked statement list:
$$\text{score} = \frac{\text{size of list} - \text{rank of the faulty stmt}}{\text{size of list}} \times 100\%$$
Represents percentage of statements that need not be examined before error is located
Higher score is better
[Figure: two ranked lists ordered from high suspiciousness to low suspiciousness; the list with the faulty statement nearer the high-suspiciousness end has the higher score]
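As a quick worked instance of the score formula, the following sketch evaluates it for a five-statement ranked list. The helper name `score` is hypothetical.

```python
def score(list_size, faulty_rank):
    """Percentage of the ranked list that need not be examined.

    faulty_rank is the 1-based position of the faulty statement in the list.
    """
    return 100.0 * (list_size - faulty_rank) / list_size


# Faulty statement ranked first in a list of 5: the other 4 need not be examined.
print(score(5, 1))
# Faulty statement ranked last: the whole list is examined.
print(score(5, 5))
```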
Benchmark Programs
Program   LOC   # Faulty Ver.   Avg. Suite Size (Pool Size)
tcas      138   41              17 (1608)
totinfo   346   23              15 (1052)
sched     299   9               20 (2650)
sched2    297   9               17 (4130)
ptok      402   7               17 (4130)
ptok2     483   9               23 (4115)
replace   516   31              29 (5542)
129 faulty programs (errors) derived from 7 base programs
Each faulty program is associated with a branch-coverage adequate test suite containing at least 5 failing and 5 passing test cases
Test suite used by Value Replacement, test pool used by Tarantula
Effectiveness Results: Value Replacement technique
[Figure: % of faulty programs versus score (%) for ValRep-All, ValRep-2, and ValRep-1]
Number (%) of faulty programs
Score    ValRep-All   ValRep-2     ValRep-1
≥ 99%    23 (17.8%)   21 (16.3%)   18 (14.0%)
≥ 90%    89 (69.0%)   84 (65.1%)   75 (58.1%)
Effectiveness Results: Comparison to Tarantula
[Figure: % of faulty programs versus score (%) for ValRep-All, ValRep-2, ValRep-1, and Tarantula]
Number (%) of faulty programs
Score    ValRep-All   ValRep-2     ValRep-1     Tarantula
≥ 99%    23 (17.8%)   21 (16.3%)   18 (14.0%)   7 (5.4%)
≥ 90%    89 (69.0%)   84 (65.1%)   75 (58.1%)   48 (37.2%)
Value Replacement: Summary
Highly Effective
  Precisely locates 39 / 129 errors (30.2%)
  Most effective previously known: 5 / 129 (3.9%)
Limitations
  Can require significant computation time to search for IVMPs
  Assumes multiple failing runs are caused by the same error
Handling Multiple Errors
Effectively locate multiple simultaneous errors [Jeffrey et al., ICSM 2009]
Iteratively compute a ranked list of statements to find and fix one error at a time
Three variations of this technique
  MIN: minimal computation; use same list each time
  FULL: full computation; produce new list each time
  PARTIAL: partial computation; revise list each time
Multiple-Error Techniques
Single Error:
  Faulty Program and Test Suite -> Value Replacement -> Ranked List of Program Statements -> Developer Find/Fix Error -> Done
Multiple Errors (MIN):
  Faulty Program and Test Suite -> Value Replacement -> Ranked List of Program Statements -> Developer Find/Fix Error -> Failing Run Remains?
  Yes: find/fix the next error using the same ranked list. No: Done
Multiple Errors (FULL):
  Faulty Program and Test Suite -> Value Replacement -> Ranked List of Program Statements -> Developer Find/Fix Error -> Failing Run Remains?
  Yes: run Value Replacement again to produce a new ranked list. No: Done
Multiple Errors (PARTIAL):
  Faulty Program and Test Suite -> Partial Value Replacement -> Ranked List of Program Statements -> Developer Find/Fix Error -> Failing Run Remains?
  Yes: run Partial Value Replacement again to revise the ranked list. No: Done
PARTIAL Technique
Step 1: Initialize ranked lists and locate first error
  For each statement s, compute a ranked list by considering only failing runs exercising s
  Report ranked list with highest suspiciousness value at the front of the list
Step 2: Iteratively revise ranked lists and locate each remaining error
  For each remaining failing run that exercises the statement just fixed, recompute IVMPs
  Update any affected ranked lists
  Report ranked list with the most different elements at the front of the list, compared to previously-selected lists
PARTIAL Technique: Example
Program (2 faulty statements)
[Figure: program with statements 1 through 5, where statements 3 and 4 lie on alternative paths between 2 and 5]
Failing Run   Execution Trace   Statements with IVMPs
A             (1, 2, 3, 5)      {2, 5}
B             (1, 2, 3, 5)      {1, 2}
C             (1, 2, 4, 5)      {2, 4, 5}
Computed Ranked Lists: (statement^suspiciousness)
List 1: 2^3, 5^2, 1^1, 4^1, 3^0   [based on runs A, B, C]
List 2: 2^3, 5^2, 1^1, 4^1, 3^0   [based on runs A, B, C]
List 3: 2^2, 1^1, 5^1, 3^0, 4^0   [based on runs A, B]
List 4: 2^1, 4^1, 5^1, 1^0, 3^0   [based on run C]
List 5: 2^3, 5^2, 1^1, 4^1, 3^0   [based on runs A, B, C]
Report list 1, 2, or 5 (assume 1)
Fix faulty statement 2
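Step 1 of PARTIAL can be sketched in Python using the three failing runs A, B, and C from this example. This is an illustrative sketch only: the name `per_statement_lists` is hypothetical, suspiciousness is taken to be the number of considered failing runs with an IVMP at the statement, and ties are broken by statement number rather than by Tarantula (an assumption for brevity).

```python
def per_statement_lists(traces, ivmp_stmts):
    """For each statement s, rank all statements using only the failing runs
    whose execution trace includes s.

    traces:     failing run -> tuple of executed statements
    ivmp_stmts: failing run -> set of statements with IVMPs
    Returns {s: [(statement, suspiciousness), ...]} sorted by suspiciousness.
    """
    statements = sorted(set().union(*traces.values()))
    lists = {}
    for s in statements:
        runs = [r for r, trace in traces.items() if s in trace]
        susp = {t: sum(1 for r in runs if t in ivmp_stmts[r]) for t in statements}
        # Assumption: ties broken by statement number.
        lists[s] = sorted(susp.items(), key=lambda kv: (-kv[1], kv[0]))
    return lists


traces = {"A": (1, 2, 3, 5), "B": (1, 2, 3, 5), "C": (1, 2, 4, 5)}
ivmp_stmts = {"A": {2, 5}, "B": {1, 2}, "C": {2, 4, 5}}
lists = per_statement_lists(traces, ivmp_stmts)

# Lists whose front element has the highest suspiciousness are candidates to report.
best = max(ranked[0][1] for ranked in lists.values())
candidates = [s for s, ranked in lists.items() if ranked[0][1] == best]
print(candidates)
```

With this data the candidate lists are those of statements 1, 2, and 5, each headed by statement 2 with suspiciousness 3.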
PARTIAL Technique: Example (continued)
Program (1 faulty statement)
Failing Run   Execution Trace   Statements with IVMPs
C             (1, 2, 4, 5)      {4}
Computed Ranked Lists: (statement^suspiciousness)
List 2: 2^2, 1^1, 4^1, 5^1, 3^0   [based on runs A, B, C] (C updated)
List 3: 2^2, 1^1, 5^1, 3^0, 4^0   [based on runs A, B] (no updates)
List 4: 4^1, 1^0, 2^0, 3^0, 5^0   [based on run C] (C updated)
List 5: 2^2, 1^1, 4^1, 5^1, 3^0   [based on runs A, B, C] (C updated)
Report list 4
Fix faulty statement 4
Done
Techniques Compared
(MIN) Only compute ranked list once
(FULL) Fully recompute ranked list each time
(PARTIAL) Compute IVMPs for subset of failing runs and revise ranked lists each time
(ISOLATED) Locate each error in isolation
Experimental Benchmark Programs
Program   # 5-Error Faulty Versions   Average Suite Size (# Failing Runs / # Passing Runs)
tcas      20                          11 (5 / 6)
totinfo   20                          22 (10 / 12)
sched     20                          29 (10 / 19)
sched2    20                          30 (9 / 21)
ptok      2                           32 (8 / 24)
ptok2     11                          29 (5 / 24)
replace   20                          38 (9 / 29)
Each faulty program contains 5 seeded errors, each in a different stmt
Each faulty program is associated with a stmt-coverage adequate test suite such that at least one failing run exercises each error
Effectiveness Results
[Figure: Effectiveness Comparison of Value Replacement Techniques. Avg. score per ranked list (%) for each benchmark program (tcas, totinfo, sched, sched2, ptok, ptok2, replace); series: Isolated, Full, Partial, Min]
Efficiency of Value Replacement
Searching for IVMPs is time-consuming
Lossy techniques
  Reduce search space for finding IVMPs
  May result in some missed IVMPs
  Performed for single-error benchmarks
Lossless techniques
  Only affect the efficiency of implementation
  Result in no missed IVMPs
  Performed for multi-error benchmarks
5 failing runs x 50,000 stmt instances per run x 15 alt value sets per instance = 3.75 million value replacement program executions
Over 10 days if each execution requires a quarter-second
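The cost estimate above can be checked with a few lines of arithmetic. The variable names are illustrative; the figures are the ones quoted on the slide.

```python
failing_runs = 5
instances_per_run = 50_000       # statement instances per failing run
alt_sets_per_instance = 15       # alternate value sets tried per instance
seconds_per_execution = 0.25     # a quarter-second per re-execution

# One program re-execution per (run, instance, alternate value set) triple.
executions = failing_runs * instances_per_run * alt_sets_per_instance
days = executions * seconds_per_execution / 86_400  # 86,400 seconds per day

print(executions, round(days, 2))
```

The product is 3.75 million executions, which at a quarter-second each comes to a little under 11 days, consistent with "over 10 days".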
Lossy Techniques
Limit considered statement instances
  Find IVMP -> skip all subsequent instances of the same statement in the current run
  Don't find IVMP -> skip statement in subsequent runs
Limit considered alternate value sets
  Only use min <, max <, min >, and max >, as compared to original value
  [Figure: values in order: min <, (skip), max <, orig, min >, (skip), max >]
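One way to read the "min <, max <, min >, max >" rule is sketched below for a single numeric value. The function name `limited_alternates` is hypothetical, and treating each value set as one scalar is a simplifying assumption; the slides do not say how multi-variable value sets are ordered.

```python
def limited_alternates(original, profile_values):
    """Keep at most four alternates: the smallest and largest profile values
    below the original, and the smallest and largest above it.
    Everything in between is skipped."""
    smaller = [v for v in profile_values if v < original]
    larger = [v for v in profile_values if v > original]
    picks = set()
    if smaller:
        picks.update((min(smaller), max(smaller)))  # min < and max <
    if larger:
        picks.update((min(larger), max(larger)))    # min > and max >
    return sorted(picks)


print(limited_alternates(5, [1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

In this example the nine profile values shrink to four alternates, so the statement instance needs four re-executions instead of eight.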
Lossless Techniques
Original Execution: stmt instance 1, stmt instance 2, stmt instance 3 (assume 2 alternate value sets at each stmt instance)
Regular Value Replacement Executions
  Value replacements are independent of each other
  Portions of original execution are duplicated multiple times (x 6, x 4, x 2)
Efficiency Improvements:
  (1) Fork child process to do each value replacement in original failing execution
  (2) Perform value replacements in parallel
With Redundant Execution Removed
  No duplication of any portion of original execution
With Parallelization
  Total time required to perform all value replacements is reduced
Search Reduction by Lossy Techniques
[Figure: Reduction in # of Executions by Lossy Techniques: Single-Error Benchmarks. # value replacements (in millions) for each faulty program; series: Full Search, Limited Search]
# val replacements needed   Full     Limited
Mean                        2.0 M    0.03 M
Max                         21.5 M   0.4 M
On average, total number of executions reduced by a factor of 67
Time Required for Reduced Search
[Figure: Time Required to Search using Lossy Techniques: Single-Error Benchmarks. % faulty programs completed versus time (minutes)]
Mean: 55.6 min
< 1 min: 39% of progs
< 10 min: 60% of progs
< 100 min: 87% of progs
Max: 846.5 min
Only 13% of faulty programs required more than 100 minutes of IVMP search time
Time Required with Lossless Techniques
[Figure: Time to Search in Each Faulty Program using Lossless Techniques: Multi-Error Benchmarks. Avg. time (seconds) for each benchmark program (tcas, totinfo, sched, sched2, ptok, ptok2, replace); series: Full, Partial, Min]
With Lossless techniques, multiple errors in a program can be located in minutes.
With Lossy techniques, some single errors require hours to locate.
Execution Suppression
Efficient location of memory errors through targeted state alteration [Jeffrey et al., ICSM 2008]
Alter state in a way that will definitely get closer to the goal each time
Goal: identify first point of memory corruption in a failing execution
Memory Errors and Corruption
Memory errors
  Buffer overflow
  Uninitialized read
  Dangling pointer
  Double free
  Memory leak
Memory corruption
  Incorrect memory location is accessed, or
  Incorrect value is assigned to a pointer variable
Study of Memory Corruption
Traversal of error -> First point of memory corruption -> Failure
Program     LOC       Memory Error Type   Analyzed Input Types
gzip        6.3 K     Global overflow     No crash, Crash 1
man         10.8 K    Global overflow     Crash 1
bc          10.7 K    Heap overflow       No crash, Crash 1
pine        211.9 K   Heap overflow       No crash, Crash 1
mutt        65.9 K    Heap overflow       No crash, Crash 1
ncompress   1.4 K     Stack overflow      No crash, Crash 1, Crash 2
polymorph   1.1 K     Stack overflow      No crash, Crash 1, Crash 2
xv          69.2 K    Stack overflow      No crash, Crash 1
tar         28.4 K    NULL dereference    Crash 1
tidy        35.9 K    NULL dereference    Crash 1
cvs         104.1 K   Double free         Crash 1
Observations from Study
Total distance from point of error traversal until failure can be large
Different inputs triggering memory corruption may result in different crashes or no crashes
Distance from error traversal to first memory corruption is considerably less than distance from first memory corruption to failure
Execution Suppression: High-Level
Program crash reveals memory corruption
Key: assume memory corruption leads to crash
Component 1: suppression
  Iteratively identify first point of memory corruption
  Omit the effect of certain statements during execution
Component 2: variable re-ordering
  Expose crashes where they may not occur
  Helpful since key assumption does not always hold
Suppression: How it Works
While a crash occurs
  Identify accessed location L directly causing crash
  Identify last definition D of location L
  Re-execute program and omit execution of D and anything dependent on it
Endwhile
Report the statement associated with the most recent D (the first point of memory corruption)
Suppression: Example
1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];            // Stmt 4: copy-paste error: "x" should be "y"
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();            // Stmt 8: clobbers def @ stmt 6
9: int a = *p1 + *p2;          // Stmts 9 - 11: propagation
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;           // Stmt 12: potential buffer overflow
13: structArray[*p2]->f = 0;   // Stmt 13: potential overflow or NULL dereference
14: free(p2);
15: free(q2);                  // Stmt 15: double free
Suppression: Example
Stmt 4: The error as well as the first point of memory corruption (located in 4 executions)

Example: Execution 1 of 4
Stmt: Loc Defined
1: p1, 2: p2, 3: q1, 4: q2, 5: *p1, 6: *p2, 7: *q1, 8: *p2/*q2, 9: a, 10: b, 11: c, 12: CRASH
Action: Suppress definition of c at stmt 11 and all of its effects

Example: Execution 2 of 4
Stmt: Loc Defined
1: p1, 2: p2, 3: q1, 4: q2, 5: *p1, 6: *p2, 7: *q1, 8: *p2/*q2, 9: a, 10: b, 11 and 12: (none), 13: CRASH
Action: Suppress def of *p2/*q2 at stmt 8 and all of its effects

Example: Execution 3 of 4
Stmt: Loc Defined
1: p1, 2: p2, 3: q1, 4: q2, 5: *p1, 6: *p2, 7: *q1, 8 through 14: (none), 15: CRASH
Action: Suppress q2 at stmt 4 and effects

Example: Execution 4 of 4
Stmt: Loc Defined
1: p1, 2: p2, 3: q1, 4: (none), 5: *p1, 6: *p2, 7: *q1, 8 through 15: (none)
Result: stmt 4 identified
Example: Summary
Execution 1: CRASH at stmt 12; suppress stmt 11
Execution 2: CRASH at stmt 13; suppress stmt 8
Execution 3: CRASH at stmt 15; suppress stmt 4
Execution 4: no crash; REPORT 4
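The suppression loop can be replayed on the 15-statement example with a small hand-built model in Python. Everything here is an assumption made for illustration: the `STMTS` table records by hand which location each statement defines and uses (with `*p2` and `*q2` both resolved to `x[0]` because of the error at statement 4), and `CRASH_CAUSE` hard-codes which statements crash and which location is blamed, following the four executions described in these slides. It is not the instrumentation the technique actually uses.

```python
# stmt -> (location defined or None, locations used); aliasing resolved by hand.
STMTS = {
    1: ("p1", []), 2: ("p2", []), 3: ("q1", []), 4: ("q2", []),
    5: ("x[1]", ["p1"]), 6: ("x[0]", ["p2"]), 7: ("y[1]", ["q1"]),
    8: ("x[0]", ["q2"]),
    9: ("a", ["p1", "p2", "x[1]", "x[0]"]),
    10: ("b", ["q1", "q2", "y[1]", "x[0]"]),
    11: ("c", ["a", "b"]),
    12: (None, ["c"]), 13: (None, ["p2", "x[0]"]),
    14: (None, ["p2"]), 15: (None, ["q2"]),
}
# Assumed crash behaviour: statement -> location directly causing the crash.
CRASH_CAUSE = {12: "c", 13: "x[0]", 15: "q2"}


def execute(suppressed):
    """Run once, omitting suppressed statements and anything dependent on them.
    Returns (crashing stmt, stmt that last defined the blamed location) or None."""
    corrupt, last_def = set(), {}
    for stmt in sorted(STMTS):
        target, uses = STMTS[stmt]
        if stmt in suppressed or any(u in corrupt for u in uses):
            if target:
                corrupt.add(target)  # effects of this statement are suppressed
            continue
        if stmt in CRASH_CAUSE:
            return stmt, last_def[CRASH_CAUSE[stmt]]
        if target:
            corrupt.discard(target)
            last_def[target] = stmt
    return None


def locate():
    suppressed, identified, executions = set(), None, 0
    while True:
        executions += 1
        result = execute(suppressed)
        if result is None:           # no crash: report the most recent D
            return identified, executions
        _, identified = result
        suppressed = {identified}    # omit D (and, transitively, its dependents)


print(locate())
```

In this model the crashes occur at statements 12, 13, and 15 in turn, and the fourth execution completes without a crash, so statement 4 is reported after 4 executions.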
Variable Re-Ordering
Re-order variables in memory prior to execution
Try to cause a crash due to corruption, in cases where crash does not occur
Can overcome limitations of suppression
  Do not terminate prematurely when corruption does not cause crash
  Applicable to executions that do not crash
Position address variables after buffers
Variable Re-Ordering: Example
From program ncompress:
void comprexx(char **fileptr)
{
    int fdin;
    int fdout;
    char tempname[1024];
    strcpy(tempname, *fileptr);
    ...
}
On the call stack:
Original Variable Ordering:   tempname | fdout | fdin | $ return addr
Variable Re-Ordering:         fdout | fdin | tempname | $ return addr
Original ordering -> no stack smash
Re-ordering -> stack smash
The Complete Algorithm
Execution Suppression Algorithm
exec := original failing execution;
Do
  (A) identifiedStmt, exec := run suppression component using exec;
  (B) reordering, exec := run variable re-ordering component using exec;
While (crashing reordering is found);
Report identifiedStmt;
Explanation
  (A) Runs suppression until no further crashes occur
  (B) Attempts to expose an additional crash
  Do/While loop iterates as long as variable re-ordering exposes a new crash
Execution Suppression: Evaluation
Suppression-only results (no variable re-ordering):
Maximum Static Dependence Distance From Located Stmt To: 1st Memory Corruption, Error
Program     Input Type   # Exec. Required   1st Memory Corruption   Error
gzip        Crash 1      2                  0                       0
man         Crash 1      2                  1                       2
bc          Crash 1      2                  0                       1
pine        Crash 1      2                  0                       5
mutt        Crash 1      3                  0                       1
ncompress   Crash 1      2                  0                       0
ncompress   Crash 2      4                  0                       0
polymorph   Crash 1      2                  0                       1
polymorph   Crash 2      3                  0                       1
xv          Crash 1      4                  0                       2
tar         Crash 1      2                  0                       0
tidy        Crash 1      2                  0                       0
cvs         Crash 1      2                  0                       0
Execution Suppression: Evaluation
Suppression and variable re-ordering results:
Maximum Static Dependence Distance From Located Stmt To: 1st Memory Corruption, Error
Program     Input Type   # Crashes Exposed   # Var R-O Exec.   1st Memory Corruption   Error
gzip        No Crash     0                   15                ---                     ---
man         Crash 1      1                   18                0                       1
bc          No Crash     1                   ---               0                       1
pine        No Crash     1                   ---               0                       5
mutt        No Crash     1                   ---               0                       1
ncompress   No Crash     1                   5                 0                       0
polymorph   No Crash     1                   6                 0                       1
xv          No Crash     1                   135               0                       2
Memory Errors in Multithreaded Programs
Assume programs run on single processor
Two main enhancements required for Execution Suppression
Reproduce failure on multiple executions
  Record failure-inducing thread interleaving
  Replay same interleaving on subsequent executions
  In general, other factors should be recorded/replayed
Identify data race errors
  Data race: concurrent, unsynchronized access of a shared memory location by multiple threads; at least one write
  Identified on-the-fly during suppression
Identifying Data Races
Data races involve WAR, WAW, or RAW dependencies
Identified points of suppression are writes
  Can involve WAR or WAW dependence prior to that point
  Can involve RAW dependence after that point
Monitor for an involved data race on-the-fly during a suppression execution
On-the-fly Data Race Detection
Suppression Execution, in order:
  Last access to L by a thread other than T1
  (Monitor for synchronization on L)
  Identified suppression point (thread T1 writes to location L): WAR or WAW data race may be identified at this point
  (Monitor for synchronization on L)
  Next read from L by a thread other than T1: RAW data race may be identified at this point
Potentially-Harmful Data Races
Given two memory accesses involved in a data race, force other thread interleavings to see if the state is altered
Memory access point 1: access to L from thread T1
Memory access point 2: access to L from thread T2
For each ready thread besides T1, re-execute from this point and schedule it in place of T1
If value in L is changed at this point, data race is potentially-harmful
Harmful Data Race Checking Executions
Evaluation with Multithreaded Programs
Multithreaded Benchmark Programs and Results:
Program      LOC     Error Type           # Executions Required   Precisely Identifies Error?
apache       191 K   Data race            3                       yes
mysql-1      508 K   Data race            3                       yes
mysql-2      508 K   Data race            3                       yes
mysql-3      508 K   Uninitialized read   2                       yes
prozilla-1   16 K    Stack overflow       2                       yes
prozilla-2   16 K    Stack overflow       4                       yes
axel         3 K     Stack overflow       3                       yes
Implementing Suppression: General
Global variables
  count: Dynamic instruction count value
  suppress: Suppression mode flag (boolean)
Variables associated with each register and memory word
  lastDef: count value associated with the instruction last defining it
  corrupt: whether associated effects need to be suppressed
At a program instruction (defines target, uses src1 and src2):
  Ensure instruction responsible for a crash can be identified:
    target.lastDef := ++count;
  Carry out suppression as necessary:
    if (current instruction is a suppression point)
      suppress := true; target.corrupt := true;
    if (suppress)
      if (src1.corrupt or src2.corrupt)
        target.corrupt := true;
      else
        execute instruction; target.corrupt := false;
Software/Hardware Support
Software-only implementation can incur relatively high overhead (SW)
Reduce overhead with hardware support
  Existing support in Itanium processors for deferred exception handling: extra bit for registers (HW1)
  Further memory augmentation: extra bit for memory words (HW1 + HW2)
Overheads compared in a simulator
Performance Overhead Comparison
[Figure: Suppression Overhead Comparison: Software and Hardware Support. Overhead for each benchmark program (gzip, man, bc, pine, mutt, ncompress, polymorph, xv, tar, tidy, cvs); series: SW, HW1, HW1 + HW2]
Average Overhead: 7.2x (SW), 2.7x (HW1), 1.8x (HW1 + HW2)
Other Dynamic Error Location Techniques
Other state-alteration techniques
  Delta Debugging [Zeller et al. FSE 2002, TSE 2002, ICSE 2005]
    Search in space for values relevant to a failure
    Search in time for failure cause transitions
  Predicate Switching [Zhang et al. ICSE 2006]
    Alter predicate outcome to correct failing output
  Value Replacement is more aggressive; Execution Suppression is better targeted for memory errors
Program slicing-based techniques
  Pruning Dynamic Slices with Confidence [Zhang et al. PLDI 2006]
  Failure-Inducing Chops [Gupta et al. ASE 2005]
Invariant-based techniques
  Daikon [Ernst et al. IEEE TSE Feb. 2001]
  AccMon [Zhou et al. HPCA 2007]
Statistical techniques
  Cooperative Bug Isolation [Ben Liblit doctoral dissertation, 2005]
  SOBER [Liu et al. FSE 2005]
  Tarantula [Jones et al. ICSE 2002]
Spectra-based techniques
  Nearest Neighbor [Renieris and Reiss, ASE 2003]
Future Directions
Enhancements to Value Replacement
  Improve scalability
  Study when IVMPs cannot be found at faulty statement
Enhancements to Execution Suppression
  Improve scalability of variable re-ordering
  Other techniques to expose crashes
  Handle memory errors that do not involve corruption
Applications to Fixing Errors
  IVMPs can be used in BugFix [Jeffrey et al., ICPC 2009]
  Comparatively little research in automated techniques for fixing errors
Applications to Tolerating Errors
  Suppression can be used to recover from failures in server programs [Nagarajan et al., ISMM 2009]
  Other applications?