Minimizing Faulty Executions of Distributed Systems Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula, Arvind Krishnamurthy, Scott Shenker

Dec 15, 2016

Page 1: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Minimizing Faulty Executions of Distributed Systems

Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula, Arvind Krishnamurthy, Scott Shenker

Page 2: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

[Figure: Software Developer; (GBs); nodes Node1 through Node12; a question mark]

Page 5: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

49% of developers’ time spent on debugging! [1]

[1] LaToza, Venolia, DeLine, ICSE ’06


Understanding How Bug Is Triggered

Fixing Problematic Code

Page 7: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Our Goal

Allow Developers To Focus on Fixing the Underlying Bug

Page 8: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Problem Statement

Identify a minimal causal sequence of events that triggers the bug

Page 9: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Outline

Introduction
Background
Randomized Testing with DEMi
Minimization
Evaluation
Conclusion

[Figure: QA Testbed, with a Test Coordinator driving the Software Under Test on Node 1 through Node N]


Page 13: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Randomized Testing with DEMi

[Figure: three processes, each drawn as a stack of App / RPC lib / OS; AspectJ is then added to each stack]

[Figure labels added across the build: msg dst: b; timer dst: b; msg dst: a; message delivery; crash recovery]

External events (events outside the system’s control):
- Crash-recovery
- Process creation
- External message
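The testing loop pictured on these slides can be sketched roughly as follows. This is a toy model rather than DEMi's implementation: the names `Event`, `run_random_schedule`, `handle_a`, and `handle_b` are hypothetical, and the "system under test" is just a dictionary of handler functions. The coordinator keeps a set of pending events and delivers one at random until nothing is left.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str          # "external", "msg", or "timer" (labels assumed for this sketch)
    dst: str           # name of the destination process
    payload: str = ""

def run_random_schedule(handlers, externals, seed, max_steps=100):
    """Start with the external events pending, then repeatedly deliver a
    randomly chosen pending event.

    handlers maps a process name to a function (event) -> list of new events.
    Returns the schedule: events in the order they were delivered.
    """
    rng = random.Random(seed)       # seeded so the same run can be repeated
    pending = list(externals)       # intercepted but not yet delivered
    schedule = []
    while pending and len(schedule) < max_steps:
        event = pending.pop(rng.randrange(len(pending)))
        schedule.append(event)
        # Anything the destination sends in response becomes pending too.
        pending.extend(handlers[event.dst](event))
    return schedule

# Toy system: "a" answers every external message by sending one message to "b".
def handle_a(event):
    return [Event("msg", "b", "ping")] if event.kind == "external" else []

def handle_b(event):
    return []

handlers = {"a": handle_a, "b": handle_b}
externals = [Event("external", "a", "req1"), Event("external", "a", "req2")]
schedule = run_random_schedule(handlers, externals, seed=0)
```

Seeding the random number generator is a design choice made here so that a schedule can be regenerated; the slides do not say how DEMi records its choices.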

Page 27: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Invariant Checking

An invariant is a predicate P over the state of all processes.

[Figure: processes a, b, c, d, e grouped by a brace, with P evaluating to ✔ or ✗]

A faulty execution is one that ends in an invariant violation.

[Figure: example execution made up of external events e1, e2 and internal events i1, i2, i3, i4]
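As a small illustration of the two definitions above, an invariant can be modeled as a function over a snapshot of every process's state. The invariant chosen here (`at_most_one_leader`) and the state layout are assumptions made for the example, not something taken from the slides.

```python
from typing import Callable, Dict

State = Dict[str, dict]                # process name -> that process's local state
Invariant = Callable[[State], bool]    # predicate P over the state of all processes

def at_most_one_leader(state: State) -> bool:
    """Hypothetical invariant P: no two processes believe they are the leader."""
    return sum(1 for s in state.values() if s.get("role") == "leader") <= 1

def is_faulty(final_state: State, invariant: Invariant) -> bool:
    """An execution is faulty if P does not hold in the state it ends in."""
    return not invariant(final_state)

ok_state = {"a": {"role": "leader"}, "b": {"role": "follower"}}
bad_state = {"a": {"role": "leader"}, "b": {"role": "leader"}}
```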

Page 29: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Outline

Introduction
Background
Randomized Testing with DEMi
Minimization
Evaluation
Conclusion

[Figure: QA Testbed, with a Test Coordinator driving the Software Under Test on Node 1 through Node N]

Page 30: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Formal Problem Statement

Given: schedule τ that results in violation of P

Find: locally minimal reproducing sequence τ’:

τ’ violates P, |τ’| ≤ |τ|

τ’ contains a subsequence of the external events of τ

if we remove any external event e from τ’, ¬∃ τ’’ containing the same external events minus e, s.t. τ’’ violates P
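The local-minimality condition can be read as a check over single removals. The sketch below assumes a hypothetical oracle `reproduces(externals)` that answers whether some schedule containing exactly those external events violates P; in practice such an oracle can only be approximated by searching over schedules.

```python
def is_locally_minimal(externals, reproduces):
    """True if `externals` reproduces the violation and dropping any single
    external event makes it stop reproducing."""
    if not reproduces(externals):
        return False
    for i in range(len(externals)):
        without_e = externals[:i] + externals[i + 1:]   # remove one event e
        if reproduces(without_e):
            return False     # a smaller reproducing set exists
    return True

# Toy oracle (an assumption for the example): the bug needs "crash" and "write".
def toy_reproduces(seq):
    return "crash" in seq and "write" in seq
```

For instance, `["crash", "write"]` passes the check under the toy oracle, while `["crash", "read", "write"]` does not, because `"read"` can be removed.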

Page 31: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Formal Problem Statement

After finding τ’, minimize internal events:

remove extraneous message deliveries from τ’

Page 32: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

[Figure: message diagram with four RequestVote messages and four VoteGranted messages]

Page 33: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Minimization

Given τ:

[Figure: the schedule τ, external events e1, e2, …, en interleaved with internal events i1, i2, i4, …, im, ending in a violation ✗]

Straightforward approach: enumerate all schedules τ’ with |τ’| ≤ |τ|, and pick the shortest sequence that reproduces ✗.

[Figure: τ shown as one point in the schedule space]

Page 36: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

O(n!)
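A rough sketch of the straightforward approach shows where the factorial comes from. It treats a schedule as a plain ordering of a fixed set of events, which is a simplification (in a real system, delivering an event can create new events), and `toy_violates` is a made-up stand-in for replaying a schedule and checking the invariant.

```python
from itertools import permutations
from math import factorial

def shortest_failing_schedule(events, violates):
    """Try every ordering of every subset of `events`, shortest first."""
    for length in range(len(events) + 1):
        for candidate in permutations(events, length):
            if violates(candidate):
                return list(candidate)
    return None

# Hypothetical bug: triggered whenever "b" is delivered before "a".
def toy_violates(schedule):
    return ("a" in schedule and "b" in schedule
            and schedule.index("b") < schedule.index("a"))

result = shortest_failing_schedule(["a", "b", "c", "d"], toy_violates)

# Number of full-length orderings alone for n = 10 events.
full_orderings = factorial(10)
```

Even at ten events the full-length orderings number in the millions, which is why the enumeration is impractical for executions with hundreds or thousands of events.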

Page 37: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Observation #1: many schedules are commutative

Adopt DPOR: Dynamic Partial Order Reduction

C. Flanagan, P. Godefroid, “Dynamic Partial-Order Reduction for Model Checking Software”, POPL ‘05
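To make the commutativity observation concrete, the sketch below groups orderings into equivalence classes. It is not DPOR itself: it assumes, purely for illustration, that two deliveries commute whenever they go to different destination processes, so a schedule is characterized by the per-destination delivery order. The function name `canonical` and the event tuples are hypothetical.

```python
from itertools import permutations

def canonical(schedule):
    """Key shared by all schedules that differ only in the relative order of
    deliveries to different destinations (assumed to commute here)."""
    per_dst = {}
    for dst, label in schedule:
        per_dst.setdefault(dst, []).append(label)
    return tuple(sorted((dst, tuple(labels)) for dst, labels in per_dst.items()))

# Four deliveries: two to process "a", one each to "b" and "c".
events = [("a", "m1"), ("a", "m2"), ("b", "m3"), ("c", "m4")]
all_orders = list(permutations(events))
classes = {canonical(order) for order in all_orders}
```

Under this assumption the 24 orderings collapse into 2 classes, one per order of the two deliveries to "a", so only one representative per class needs to be explored.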

Page 38: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

O((n/k)!)

Page 39: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Approach: prioritize schedule space exploration

Assume: fixed time budget

Objective: quickly find small failing schedules

Page 41: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Observation #2: selectively mask original events

[Figure: the schedule τ (external events e1, e2, …, en interleaved with internal events i1, i2, i4, …, im, ending in ✗); below it, the row ext: e1 e2 e3 e4 e5 … en, with some external events marked x]

(Apply Delta Debugging [1])

[1] A. Zeller, R. Hildebrandt, “Simplifying and Isolating Failure-Inducing Input”, IEEE ‘02
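For readers unfamiliar with delta debugging, here is a compact sketch of a ddmin-style loop over a list of external events. The `test` callback is an assumption: it stands for "replay this subsequence of external events and report whether the violation still occurs". The sketch also assumes, as is conventional for ddmin, that the empty list does not reproduce the violation.

```python
def ddmin(events, test):
    """Return a sublist of `events` that still makes test(...) True and from
    which no single event can be removed without test(...) becoming False."""
    assert test(events), "the full list must reproduce the violation"
    n = 2                                     # current number of chunks
    while len(events) >= 2:
        chunk = max(len(events) // n, 1)
        subsets = [events[i:i + chunk] for i in range(0, len(events), chunk)]
        reduced = False
        for subset in subsets:                # try each chunk on its own
            if test(subset):
                events, n, reduced = subset, 2, True
                break
        if not reduced:                       # try removing one chunk at a time
            for i in range(len(subsets)):
                complement = [e for j, s in enumerate(subsets) if j != i for e in s]
                if test(complement):
                    events, n, reduced = complement, max(n - 1, 2), True
                    break
        if not reduced:
            if n >= len(events):              # already at single-event granularity
                break
            n = min(n * 2, len(events))       # split into finer chunks
    return events

# Toy oracle: the violation needs external events e4 and e7.
all_events = ["e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8"]
minimal = ddmin(all_events, lambda sub: "e4" in sub and "e7" in sub)
```

With the toy oracle, the loop discards e1 through e3, e5, e6, and e8 and is left with `["e4", "e7"]`.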

Page 46: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Observation #2: selectively mask original events

τ: [Figure: the original schedule, external events e1, e2, …, en interleaved with internal events i1, i2, i4, …, im, ending in ✗]

ext: e1 e2 e3 e4 e5 … en

sub1: e4 e5 … en

Replay the internal events of τ alongside sub1:

foreach i in τ:
  if i is pending:
    deliver i  # ignore unexpected

sub1 result: i1 i4 … im interleaved with e4 e5 … en, ending in ✗
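The replay loop on this slide can be sketched as below. The representation is an assumption made for the example: the original schedule is a list of `(kind, name)` pairs, and `step(name)` is a hypothetical hook that delivers an event to the system under test and returns the names of internal events that become pending as a result.

```python
def replay(original, externals_kept, step):
    """Replay `original`, masking external events not in `externals_kept`.

    Internal events from the original schedule are delivered only if they are
    actually pending; events that become pending but never appeared in the
    original schedule are left undelivered (the "unexpected" ones).
    """
    pending = set()
    delivered = []
    for kind, name in original:
        if kind == "ext":
            if name not in externals_kept:
                continue                  # masked external event
        elif name not in pending:
            continue                      # original internal event no longer occurs
        pending.discard(name)
        delivered.append(name)
        pending.update(step(name))        # newly enabled internal events
    return delivered, pending

# Toy system: each external event enables some internal events.
causes = {"e1": ["i1"], "e2": ["i2"], "e4": ["i4", "i9"]}
original = [("ext", "e1"), ("int", "i1"), ("ext", "e2"),
            ("int", "i2"), ("ext", "e4"), ("int", "i4")]
delivered, ignored = replay(original, {"e4"}, lambda name: causes.get(name, []))
```

In the toy run, masking e1 and e2 means i1 and i2 never become pending and are skipped, while i9 becomes pending but is ignored because it was not part of the original schedule.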

Page 54: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Observation #2: selectively mask original events

τ: [Figure: the original schedule, ending in ✗]

ext: e4 e5 … en

sub1: i1 i4 … im interleaved with e4 e5 … en, ending in ✗

sub2: e5 … en, replayed with i1 i4 … im, ending in ✔

Explore backtrack points until (i) ✗ or (ii) time budget for sub2 expired
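The stopping rule for a subsequence such as sub2 can be sketched as a budgeted search. Here `candidates` is any iterable of alternative schedules; in this example plain permutations stand in for the backtrack points a DPOR-style explorer would produce, and `violates` is a hypothetical replay-and-check callback.

```python
import time
from itertools import permutations

def explore(candidates, violates, budget_seconds):
    """Try candidate schedules in order until one violates the invariant,
    the time budget expires, or the candidates run out."""
    deadline = time.monotonic() + budget_seconds
    for schedule in candidates:
        if time.monotonic() >= deadline:
            return None, "budget expired"
        if violates(schedule):
            return list(schedule), "violation"
    return None, "exhausted"

# Toy check: the violation appears only if e5 is delivered first.
found, reason = explore(permutations(["i1", "i4", "e5"]),
                        lambda s: s[0] == "e5",
                        budget_seconds=5.0)
```

Using a monotonic clock for the deadline avoids surprises if the wall clock is adjusted during a long minimization run.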

Page 60: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Goal: find minimal schedule that produces violation

Approach: prioritize schedule space exploration

Observation #1: many schedules are commutative

Observation #2: selectively mask original events

Observation #3: some contents should be masked

Observation #4: shrink external message contents

Minimize internal events after externals minimized

Page 61: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Outline

Introduction
Background
Randomized Testing with DEMi
Minimization
Evaluation
Conclusion

[Figure: QA Testbed, with a Test Coordinator driving the Software Under Test on Node 1 through Node N]

Page 62: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Target Systems

Page 63: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

How well does DEMi work?

[Figure: bar chart of Total Events (y-axis 0 to 3000) per Case Study (raft-45, raft-46, raft-56, raft-58a, raft-58b, raft-42, raft-66, spark-2294, spark-3150, spark-9256), comparing Initial Execution with After Minimization]


80% - 97% Reduction!

Page 65: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

How well does DEMi work?

[Figure: bar chart of Total Events (y-axis 0 to 300) per Case Study (raft-45, raft-46, raft-56, raft-58a, raft-58b, raft-42, raft-66, spark-2294, spark-3150, spark-9256), comparing After Minimization with Smallest Manual Trace]


Factor of 1x - 5x from hand-crafted

Page 67: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

How quickly does DEMi work?

[Figure: bar chart of Runtime in Seconds (y-axis 0 to 4000) per Case Study (raft-45, raft-46, raft-56, raft-58a, raft-58b, raft-42, raft-66, spark-2294, spark-3150, spark-9256); legend: Overall Minimization; three bars exceed the axis and are annotated (~12 hours), (~3 hours), and (~35 minutes)]


<10 minutes except 3 cases


Page 69: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

See the paper for…
• How we handle non-determinism
• Handling multithreaded processes
• Supporting other RPC libraries
• Sketch for minimizing production traces
• More in-depth evaluation
• Related work
• …

Page 70: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Conclusion

Open source tool: github.com/NetSys/demi

Read our paper! eecs.berkeley.edu/~rcs/research/nsdi16.pdf

Optimistic that these techniques can be successfully applied more broadly

Thanks for your time! Contact me! [email protected]

Page 71: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Attributions

Inspiration for slide design: Jay Lorch’s IronFleet slides

Graphic Icons: thenounproject.org
• logfile: mantisshrimpdesign
• magnifying glass: Ricardo Moreira
• disk: Anton Outkine
• hook: Seb Cornelius
• bug report: Lemon Liu
• devil: Mourad Mokrane
• Putin: Remi Mercier

Page 72: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Production Traces

Model: feed partially ordered log into single machine DEMi

Require:
- Partial ordering of all message deliveries
- All crash-recoveries logged to disk
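One plausible reading of "feed partially ordered log into single machine DEMi" is that the log is first turned into a single sequence that respects the recorded ordering. The sketch below shows only that linearization step, using Python's standard `graphlib` module (Python 3.9 or later). The event names and the `happens_before` mapping are invented for the example, and the slides do not state that DEMi works this way.

```python
from graphlib import TopologicalSorter

def linearize(happens_before):
    """Return one total order consistent with the partial order.

    happens_before maps each event to the set of events that must precede it.
    """
    return list(TopologicalSorter(happens_before).static_order())

# Hypothetical production log: a sends m1 to b, b then sends m2 to c,
# and c's crash is logged without any ordering constraint.
log = {
    "recv_b_m1": {"send_a_m1"},
    "send_b_m2": {"recv_b_m1"},
    "recv_c_m2": {"send_b_m2"},
    "crash_c": set(),
}
order = linearize(log)
```

Any order returned this way keeps each send before its matching receive; unordered events such as `crash_c` may land anywhere, which is one reason the second requirement (logging crash-recoveries) matters.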

Page 73: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Instrumentation Complexity

Page 74: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Related Work

Thread Schedule Minimization
• Isolating Failure-Inducing Thread Schedules. SIGSOFT ’02.
• A Trace Simplification Technique for Effective Debugging of Concurrent Programs. FSE ’10.

Program Flow Analysis
• Enabling Tracing of Long-Running Multithreaded Programs via Dynamic Execution Reduction. ISSTA ’07.
• Toward Generating Reducible Replay Logs. PLDI ’11.

Best-Effort Replay of Field Failures
• A Technique for Enabling and Supporting Debugging of Field Failures. ICSE ’07.
• Triage: Diagnosing Production Run Failures at the User’s Site. SOSP ’07.

Page 75: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

DDmin in more detail

Page 76: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

DDmin assumptions

Page 77: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Local vs. global minima

Page 78: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Minimization Pace

Page 79: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Dealing With Threads

If you’re lucky: threads are largely independent (Spark)

If you’re unlucky: key insight: a write to shared memory is equivalent to a message delivery

Approach:
• interpose on virtual memory, thread scheduler
• pause a thread whenever it writes to shared memory / disk

Cf. “Enabling Tracing Of Long-Running Multithreaded Programs Via Dynamic Execution Reduction”, ISSTA ‘07

Page 80: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Dealing With Non-Determinism

Interpose on:
- Timers
- Random number generators
- Unordered hash values
- ID allocation

Stop-gap: replay each schedule multiple times
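The stop-gap can be sketched as a small wrapper around a single replay. `replay_once` is a hypothetical callback that replays a schedule and returns True if the violation occurred; `FlakyReplay` is a toy stand-in for a system with residual non-determinism, and the default of five attempts is an arbitrary choice for the example.

```python
def reproduces(schedule, replay_once, attempts=5):
    """Treat a schedule as failing if any of several replays hits the violation."""
    return any(replay_once(schedule) for _ in range(attempts))

class FlakyReplay:
    """Toy replay that reports the violation only on one particular attempt."""
    def __init__(self, fail_on_attempt):
        self.calls = 0
        self.fail_on_attempt = fail_on_attempt

    def __call__(self, schedule):
        self.calls += 1
        return self.calls == self.fail_on_attempt

flaky = FlakyReplay(fail_on_attempt=3)
hit = reproduces(["e4", "i4"], flaky)
```

Because `any` stops at the first success, the wrapper replays only as many times as needed, here three.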

Page 81: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Complete Results

Page 82: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Runtime Breakdown

Page 83: Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula ...

Integrating with other RPC libs

[Figure: three processes, each drawn as a stack of App / RPC lib / OS; labels: DEMi, JVM]