Virtual Synchrony Jared Cantwell
Virtual Synchrony Jared Cantwell. Review Multicast Causal and total ordering Consistent Cuts Synchronized clocks Impossibility of consensus Distributed.

Dec 19, 2015

Transcript
Page 1: Virtual Synchrony Jared Cantwell. Review Multicast Causal and total ordering Consistent Cuts Synchronized clocks Impossibility of consensus Distributed.

Virtual Synchrony

Jared Cantwell

Page 2:

Review

• Multicast
• Causal and total ordering
• Consistent Cuts
• Synchronized clocks
• Impossibility of consensus
• Distributed file systems

Page 3:

Goal

• Distributed programming is hard
• What tools can make it easier?
• What assumptions can make it easier?

Distributed programming is hard! Let’s go shopping!!!

According to http://en.wikipedia.org/wiki/Barbie, Barbie once said “Math is hard!” (misquoted).

Page 4:

The Process Group Approach to Reliable Distributed Computing

• Ken Birman
– Professor, Cornell University

• ISIS
– “toolkit mechanism for distributed programming”
– Financial trading floors
– Telecommunications switching

Page 5:

Virtual Synchrony

• Simplify distributed systems programming by assuming a synchronous environment

• Features:
– Process Groups
– Reliable Multicast
– Fault Tolerance
– Performance

Page 6:

Outline

• Problem / Motivation
• Solution (Virtual Synchrony)
– Assumptions
– Close Synchrony
– Virtual Synchrony

Page 7:

Outline

• Problem / Motivation
• Solution (Virtual Synchrony)
– Assumptions
– Close Synchrony
– Virtual Synchrony

Page 8:

Motivation

• Distributed Programming is hard

Page 9:

Difficulties

• No reliable multicast
• Membership churn
• Message ordering
• State transfers
• Failure atomicity

Page 10:

No Reliable Multicast

[Figure: message timelines for processes p, q, r, comparing “Ideal” and “Reality”]

• UDP, TCP, Multicast not good enough
• What is the correct way to recover?

Page 11:

Membership Churn

[Figure: timelines for processes p, q, r, annotated “Receives new membership” and “Never sent”]

• Membership changes are not instant
• How to handle failure cases?

Page 12:

Message Ordering

[Figure: timelines for processes p, q, r, with messages labeled 1 and 2]

• Everybody wants it!
• How can you know if you have it?
• How can you get it?

Page 13:

State Transfers

• New nodes must get current state
• Does not happen instantly
• How do you handle nodes failing/joining?

[Figure: timelines for processes p, q, r]

Page 14:

Failure Atomicity

[Figure: timelines for processes p, q, r, comparing “Ideal” and “Reality”, with marks “x” and “?”]

• Nodes can fail mid-transmit
• Some nodes receive message, others do not
• Inconsistencies arise!
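
The failure described in these bullets can be reproduced with a toy model. This is only an illustrative sketch: `Replica`, `naive_multicast`, and the `crash_after` parameter are hypothetical names, and the "multicast" is just a loop of point-to-point sends with no recovery protocol.

```python
class Replica:
    """A group member that appends every delivered message to a log."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def deliver(self, msg):
        self.log.append(msg)


def naive_multicast(msg, receivers, crash_after=None):
    """Send msg to each receiver in turn.

    The sender "crashes" once it has completed crash_after sends
    (None means it never crashes).
    """
    for sent, replica in enumerate(receivers):
        if crash_after is not None and sent >= crash_after:
            return  # sender died mid-transmit; the rest get nothing
        replica.deliver(msg)


q, r = Replica("q"), Replica("r")
naive_multicast("x=1", [q, r])                 # ideal: both receive it
naive_multicast("x=2", [q, r], crash_after=1)  # reality: only q receives it

print(q.log)  # ['x=1', 'x=2']
print(r.log)  # ['x=1']
```

After the second call q and r disagree about the latest value, which is the inconsistency that failure atomicity is meant to rule out.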

Page 15:

Motivation Review

• Distributed programming is hard!

• No reliable multicast
• Membership churn
• Message ordering
• State transfers
• Failure atomicity

Page 16:

Outline

• Problem / Motivation
• Solution (Virtual Synchrony)
– Assumptions
– Close Synchrony
– Virtual Synchrony

Page 17:

Assumptions

• WAN of LANs
• Unreliable network
• Flow control at lowest layer
• Clocks not synchronized
• No partitions
– CAP Theorem?

Page 18:

Failure Model

• Nodes crash
• Network is lossy
• Can’t distinguish between the two

Page 19:

Outline

• Problem / Motivation
• Solution (Virtual Synchrony)
– Assumptions
– Close Synchrony
– Virtual Synchrony

Page 20:

Outline

• Problem / Motivation
• Solution (Virtual Synchrony)
– Assumptions
– Close Synchrony
• Model
• Significance
• Issues
– Virtual Synchrony

Page 21:

Model

• Events (all or nothing)
– Internal computation
– Message transmission & delivery
– Membership change

Page 22:

Model

• Synchronous execution

[Figure: synchronous execution timelines for processes p, q, r, s, t, u. Source: Ken’s Slides - 2006]

Page 23:

Significance

• Multicast is always reliable
• Membership is always consistent
• Totally ordered message delivery
• State-transfer happens instantaneously
• Failure Atomicity
– Multicast is a single event

Page 24:

Issues

• Discrete event simulator
• Is it practical?
• Impossible with failures
• Very expensive
– System progresses in lock-step
– Limited by speed of other members

Page 25:

Outline

• Problem / Motivation
• Solution (Virtual Synchrony)
– Assumptions
– Close Synchrony
– Virtual Synchrony

Page 26:

Outline

• Virtual Synchrony
– Asynchronous Execution
– Virtual Synchrony
– ISIS
– Parallels
– Benefits
– Discussion

Page 27:

Asynchronous Execution

• Key to high throughput in distributed systems
• Only wait for responses (or when sends are too fast)
• Communication channel
– Acts as a pipeline
– Not limited by latency

• Not possible with Close Synchrony!!
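
A back-of-the-envelope cost model illustrates why pipelining matters. This is a rough sketch under stated assumptions, not a measurement: the function names, the fixed per-message send cost, and the numbers are all made up for illustration.

```python
def lock_step_time(n_messages, one_way_latency, send_cost):
    # Each send waits for an acknowledgement before the next one starts,
    # so every message pays a full round trip.
    return n_messages * (send_cost + 2 * one_way_latency)


def pipelined_time(n_messages, one_way_latency, send_cost):
    # Sends are issued back to back; only the last message's
    # one-way latency is exposed.
    return n_messages * send_cost + one_way_latency


# Assumed numbers, in microseconds: 1000 messages, 10 ms one-way
# latency, 0.1 ms to push one message into the channel.
n, latency_us, send_us = 1000, 10_000, 100

print(lock_step_time(n, latency_us, send_us))  # 20100000
print(pipelined_time(n, latency_us, send_us))  # 110000
```

Under these assumptions the lock-step sender is dominated by latency, while the pipelined sender is limited mainly by how fast it can push messages into the channel.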

Page 28:

Asynchronous Execution

[Figure: asynchronous execution timelines for processes p, q, r, s, t, u. Source: Ken’s Slides - 2006]

Page 29:

Virtual Synchrony

• Close Synchrony + Asynchronous
• Indistinguishable to application
• So… when can synchronous execution be relaxed?

Page 30:

ISIS

• Communication Framework
• Membership Service
• VS primitives
– ABCAST
– CBCAST

Page 31:

ISIS

• Problem
– Crash and Lossy Network Indistinguishable

• Solution:
– Membership list
– Nonresponsive or failed members are dropped
– Only listed members can participate
– Re-join protocol
– Does Membership exist in all distributed systems?
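
The membership idea in these bullets can be sketched as a timeout-based view manager. This is a single-process toy, not the ISIS protocol: `MembershipService`, its methods, and the explicit `now` timestamps are hypothetical, and a real service would also have to agree on each new view across the group.

```python
class MembershipService:
    """Tracks the current view and drops members that stay silent too long."""

    def __init__(self, members, timeout):
        self.timeout = timeout
        self.view_id = 0
        self.last_heard = {m: 0.0 for m in members}

    @property
    def view(self):
        return sorted(self.last_heard)

    def heartbeat(self, member, now):
        # Only listed members can participate; others must re-join first.
        if member not in self.last_heard:
            return False
        self.last_heard[member] = now
        return True

    def check(self, now):
        # A crashed node and a slow link look the same from here,
        # so both are handled by dropping the silent member.
        suspects = [m for m, t in self.last_heard.items()
                    if now - t > self.timeout]
        for m in suspects:
            del self.last_heard[m]
        if suspects:
            self.view_id += 1
        return suspects

    def rejoin(self, member, now):
        if member not in self.last_heard:
            self.last_heard[member] = now
            self.view_id += 1


svc = MembershipService(["p", "q", "r"], timeout=3.0)
svc.heartbeat("p", 1.0)
svc.heartbeat("q", 1.0)            # r stays silent: crashed, or just slow?
dropped = svc.check(now=4.0)       # r is dropped either way
late = svc.heartbeat("r", 4.5)     # r was only slow, but is now excluded
svc.rejoin("r", 5.0)               # it must re-join to participate again
print(dropped, late, svc.view, svc.view_id)
```

The late heartbeat from r shows the cost of the approach: a member that was merely slow is treated as failed and has to go through the re-join path.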

Page 32:

ISIS

• Atomic Broadcast (ABCAST)
• No message can be delivered to any user until all previous ABCAST messages have been delivered

• Costly to implement

• …But not everyone needs such strong guarantees
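
One well-known way to obtain this kind of total order is a fixed sequencer with a hold-back queue at each receiver. The slides do not say how ISIS implements ABCAST, so the sketch below is a swapped-in illustration; `Sequencer` and `Member` are hypothetical names.

```python
import itertools


class Sequencer:
    """Assigns one global sequence number to every multicast."""

    def __init__(self):
        self._counter = itertools.count()

    def stamp(self, payload):
        return (next(self._counter), payload)


class Member:
    """Delivers messages strictly in sequence-number order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}      # seq -> payload, held back until its turn
        self.delivered = []

    def receive(self, stamped):
        seq, payload = stamped
        self.pending[seq] = payload
        # Deliver as many consecutive messages as are now available.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


seq = Sequencer()
m0, m1, m2 = (seq.stamp(payload) for payload in ["a", "b", "c"])

p, q = Member(), Member()
for msg in (m0, m1, m2):   # p sees the messages in order
    p.receive(msg)
for msg in (m2, m0, m1):   # q sees them reordered by the network
    q.receive(msg)

print(p.delivered)  # ['a', 'b', 'c']
print(q.delivered)  # ['a', 'b', 'c']
```

The extra hop through the sequencer and the hold-back delay at q hint at why the slide calls this guarantee costly.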

Page 33:

ISIS

• Causal Atomic Broadcast (CBCAST)
• Sufficient for most programmers
• Concurrent messages commute
• Weaker than ABCAST
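
Causal delivery is commonly implemented with vector timestamps. The following is a minimal sketch of that general technique, not a transcription of the ISIS code: `CausalMember` and its fields are hypothetical, and failures and membership changes are ignored.

```python
class CausalMember:
    def __init__(self, index, group_size):
        self.index = index
        self.clock = [0] * group_size  # clock[k] = messages from k delivered here
        self.held = []                 # received but not yet deliverable
        self.delivered = []

    def send(self, payload):
        # Sending counts as delivering to oneself.
        self.clock[self.index] += 1
        self.delivered.append(payload)
        return (self.index, list(self.clock), payload)

    def _deliverable(self, sender, stamp):
        # Next message from this sender, and nothing it depends on is missing.
        return stamp[sender] == self.clock[sender] + 1 and all(
            stamp[k] <= self.clock[k] for k in range(len(stamp)) if k != sender
        )

    def receive(self, message):
        self.held.append(message)
        progress = True
        while progress:
            progress = False
            for msg in list(self.held):
                sender, stamp, payload = msg
                if self._deliverable(sender, stamp):
                    self.held.remove(msg)
                    self.clock[sender] = stamp[sender]
                    self.delivered.append(payload)
                    progress = True


p, q, r = (CausalMember(i, 3) for i in range(3))
m1 = p.send("m1")
q.receive(m1)
m2 = q.send("m2")        # m2 causally follows m1

r.receive(m2)            # arrives first, so it is held back
held_first = list(r.delivered)
r.receive(m1)            # now both can be delivered, in causal order
print(held_first, r.delivered)  # [] ['m1', 'm2']
```

Messages with no causal relation carry timestamps that do not constrain each other, so different members may deliver them in different orders, which is where the slide's "concurrent messages commute" requirement comes in.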

Page 34:

When to use CBCAST?

• When any conflicting multicasts are uniquely ordered along a single causal chain

• …..This is Virtual Synchrony

[Figure: timelines for processes p, r, s, t with numbered multicasts (1 to 5 and 1 to 2); each thread corresponds to a different lock. Source: Ken’s Slides - 2006]

Page 35:

Parallels

• Logical time
• Replication in database systems
• Schneider’s state machine approach
• Parallel processor architectures
• Distributed database systems

Page 36:

Benefits

• Assume a closely synchronous model
• Group state and state transfer
• Pipelined communication (async)
• Single event model
• Failure handling

Page 37:

Discussion

• Partitions
• False positives
– Most have them, VS admits it

• False negatives
– Depend on a timeout

Page 38:

Summary

• Programming in distributed systems is hard
• Close Synchrony makes it easier
– Costs too much

• Take asynchronous when you can
• Virtual Synchrony
– Pipelined
– Easy to reason about

Page 39:

Understanding the Limitations of Causally and Totally Ordered Communication

• Authors
– David Cheriton
• Stanford
• PhD – Waterloo
• Billionaire

– Dale Skeen
• PhD – UC Berkeley
• 3-phase commit protocol

Page 40:

The flaws of CATOCS

• Unrecognized causality
• No semantic ordering
• No Efficiency Gain (over State-level Techniques)
• No Scalability

Page 41:

Unrecognized Causality

• External communication is unknown

[Figure: timelines for processes p, q, r, s]

Page 42:

Unrecognized Causality

• Database is external entity

• Causal relation exists, but CATOCS misses it

Page 43:

No Semantic Ordering

• Serialization
– Messages can’t be “grouped together”
– Implementing it eliminates the need for CATOCS

• Causal Memory
– Solution: state-level logical clock

Page 44:

No Efficiency Gain

• Still need state-level techniques
• False causality
– Reduces Performance
– Increased Memory

• Message overhead

Page 45:

No Efficiency Gain

• What if m2 happened to follow m1, but was not causally related?

• CATOCS would still order m2 after m1 (False Causality)
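
The m1/m2 scenario can be made concrete with the usual vector-timestamp delivery test. This is an illustrative sketch only: the `deliverable` helper, the process indices, and the timestamps are assumptions, not taken from the paper.

```python
def deliverable(local_clock, sender, stamp):
    """Vector-timestamp test: is this the next message from `sender`,
    with everything it was sent after already delivered locally?"""
    return stamp[sender] == local_clock[sender] + 1 and all(
        stamp[k] <= local_clock[k] for k in range(len(stamp)) if k != sender
    )


# Assumed process indices: p=0, q=1, r=2.
m1 = (0, [1, 0, 0])  # p multicasts m1
m2 = (1, [1, 1, 0])  # q happened to deliver m1 before sending m2,
                     # although m2's content has nothing to do with m1

r_before = [0, 0, 0]  # r has not received m1 yet (lost or delayed)
r_after = [1, 0, 0]   # r has delivered m1

print(deliverable(r_before, *m2))  # False: m2 waits behind m1
print(deliverable(r_after, *m2))   # True
```

The communication layer only sees that q delivered m1 before sending m2, so r must buffer m2 until m1 arrives, even though the application would have been happy to process m2 immediately.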

Page 46:

No Scalability

• ≈ quadratic growth of expected message buffering

• Rebuttal:
– Worst case
– Impractical use case

Page 47:

Summary

• CATOCS software is overkill
• Communication system doesn’t know everything
• Everything is better at the application level

Page 48:

Conclusions

• Distributed Programming is hard
• Close Synchrony
– Too costly

• Virtual Synchrony
– Limitations

• VS not perfect for all situations