CPSC 668 Set 15: Broadcast 1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch


Feb 10, 2016


Devika Devika

Transcript
Page 1: CPSC 668 Distributed Algorithms and Systems

CPSC 668 Set 15: Broadcast 1

CPSC 668 Distributed Algorithms and Systems

Fall 2006
Prof. Jennifer Welch

Page 2: CPSC 668 Distributed Algorithms and Systems

Broadcast Specifications

• Recall the specification of a broadcast service given in the last set of slides:
• Inputs: bc-send_i(m)
  – an input to the broadcast service
  – p_i wants to use the broadcast service to send m to all the procs
• Outputs: bc-recv_i(m,j)
  – an output of the broadcast service
  – broadcast service is delivering msg m, sent by p_j, to p_i

Page 3: CPSC 668 Distributed Algorithms and Systems

Broadcast Specifications

• A sequence of inputs and outputs (bc-sends and bc-recvs) is allowable iff there exists a mapping κ from each bc-recv_i(m,j) event to an earlier bc-send_j(m) event s.t.
  – κ is well-defined: every msg bc-recv'ed was previously bc-sent (Integrity)
  – κ restricted to bc-recv_i events, for each i, is one-to-one: no msg is bc-recv'ed more than once at any single proc. (No Duplicates)
  – κ restricted to bc-recv_i events, for each i, is onto: every msg bc-sent is received at every proc. (Liveness)
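The three properties above can be checked mechanically on a finite trace. The following is a minimal sketch, not part of the original slides: the event encoding and the function name `check_basic_broadcast` are hypothetical, each (sender, message) pair is assumed to be bc-sent at most once, and the trace is assumed to be complete (so Liveness can be judged on it).

```python
from collections import Counter

def check_basic_broadcast(events, n):
    """Check a finite trace against the basic broadcast spec.

    events: list, in order of occurrence, of
        ("send", i, m)     for bc-send_i(m)
        ("recv", i, m, j)  for bc-recv_i(m, j)
    n: number of processors (ids 0..n-1).
    Assumption: each (sender, m) pair is bc-sent at most once.
    """
    sent = set()        # (j, m) pairs bc-sent so far
    recvd = Counter()   # (i, j, m) -> times p_i bc-recv'ed m from p_j
    integrity = True
    for ev in events:
        if ev[0] == "send":
            _, i, m = ev
            sent.add((i, m))
        else:
            _, i, m, j = ev
            if (j, m) not in sent:   # no earlier matching bc-send
                integrity = False
            recvd[(i, j, m)] += 1
    no_duplicates = all(count == 1 for count in recvd.values())
    liveness = all(recvd[(i, j, m)] >= 1
                   for (j, m) in sent for i in range(n))
    return {"integrity": integrity,
            "no_duplicates": no_duplicates,
            "liveness": liveness}
```

For example, a trace in which p_1 receives a message that p_0 never sent fails Integrity but trivially satisfies the other two properties.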

Page 4: CPSC 668 Distributed Algorithms and Systems

Ordering Properties

• Sometimes we might want a broadcast service that also provides some kind of guarantee on the order in which messages are delivered.
• We can add additional constraints on the mapping:
  – single-source FIFO or
  – totally ordered or
  – causally ordered

Page 5: CPSC 668 Distributed Algorithms and Systems

Single-Source FIFO Ordering

• For all messages m1 and m2 and all p_i and p_j, if p_i sends m1 before it sends m2, and if p_j receives m1 and m2, then p_j receives m1 before it receives m2.
• Phrased carefully to avoid requiring that both messages are received.
  – that is the responsibility of a liveness property
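As an illustration of the definition, here is a small sketch of a checker for the single-source FIFO property. The data layout and the name `is_single_source_fifo` are assumptions made for this example: messages are distinct, and every received message appears in its sender's send list.

```python
def is_single_source_fifo(sends, recvs):
    """sends: sender id -> list of messages in the order they were sent.
    recvs: receiver id -> list of (m, sender) in the order received.
    Assumes messages are distinct and every received message was sent.
    """
    for delivered in recvs.values():
        last_pos = {}   # sender -> send position of the latest msg received from it
        for m, j in delivered:
            pos = sends[j].index(m)
            if pos < last_pos.get(j, -1):
                return False   # received after a msg that the same sender sent later
            last_pos[j] = pos
    return True
```

Note that a receiver that gets only the second of two messages does not violate the property, which matches the careful phrasing of the definition.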

Page 6: CPSC 668 Distributed Algorithms and Systems

Totally Ordered

• For all messages m1 and m2 and all p_i and p_j, if both p_i and p_j receive both messages, then they receive them in the same order.
• Phrased carefully to avoid requiring that both messages are received by both procs.
  – that is the responsibility of a liveness property
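A sketch of a checker for the totally ordered property follows. The function name and input format are hypothetical, and each processor is assumed to receive each message at most once.

```python
from itertools import combinations

def is_totally_ordered(recvs):
    """recvs: processor id -> list of message ids in the order received.
    Assumes no processor receives the same message twice.
    """
    pos = {i: {m: k for k, m in enumerate(seq)} for i, seq in recvs.items()}
    for i, j in combinations(pos, 2):
        # messages received by both p_i and p_j, in p_i's receive order
        common = [m for m in recvs[i] if m in pos[j]]
        for m1, m2 in combinations(common, 2):   # m1 precedes m2 at p_i
            if pos[j][m1] > pos[j][m2]:
                return False
    return True
```

Only pairs of messages received by both processors are compared, so a processor that misses a message cannot by itself cause a violation.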

Page 7: CPSC 668 Distributed Algorithms and Systems

Happens Before for Messages

• Earlier we defined "happens before" relation for events.
• Now extend this definition to messages.
• Assume all communication is through broadcast sends and receives.
• Msg m1 happens before msg m2 if
  – bc-recv event for m1 happens before the bc-send event for m2, or
  – m1 and m2 are sent by the same proc. and m1 is sent before m2 is sent.

Page 8: CPSC 668 Distributed Algorithms and Systems

Causally Ordered

• For all messages m1 and m2 and all p_i, if m1 happens before m2, and if p_i receives both m1 and m2, then p_i receives m1 before it receives m2.
• Phrased carefully to avoid requiring that both messages are received.
  – that is the responsibility of a liveness property
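The causal ordering property can be illustrated with a checker that first derives the happens-before relation on messages from per-processor event histories. This is a sketch under assumptions: the history format and function names are hypothetical, message ids are distinct, and all communication is through broadcast (so the relation is the transitive closure of the two local rules).

```python
def happens_before(histories):
    """histories: proc id -> list of ("send" | "recv", m) in local order.
    Returns the set of pairs (m1, m2) such that m1 happens before m2.
    """
    hb = set()
    for events in histories.values():
        seen = []   # messages this proc has sent or received so far
        for kind, m in events:
            if kind == "send":
                # every msg already sent or received here precedes m
                hb.update((m1, m) for m1 in seen if m1 != m)
            seen.append(m)
    msgs = {m for events in histories.values() for _, m in events}
    for k in msgs:              # transitive closure (Warshall)
        for a in msgs:
            for b in msgs:
                if (a, k) in hb and (k, b) in hb:
                    hb.add((a, b))
    return hb

def is_causally_ordered(histories):
    hb = happens_before(histories)
    for events in histories.values():
        order = [m for kind, m in events if kind == "recv"]
        for x in range(len(order)):
            for y in range(x + 1, len(order)):
                # violation: a later-received msg happens before an earlier one
                if (order[y], order[x]) in hb:
                    return False
    return True
```

In the usual three-processor example, p_1 receives a and then sends b, so a happens before b; any processor that receives b before a violates causal ordering.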

Page 9: CPSC 668 Distributed Algorithms and Systems

Example

[Figure: execution diagram with broadcast messages a and b; not preserved in this transcript]

• single-source FIFO?
• totally ordered?
• causally ordered?

Page 10: CPSC 668 Distributed Algorithms and Systems

Example

[Figure: execution diagram with broadcast messages a and b; not preserved in this transcript]

• single-source FIFO?
• totally ordered?
• causally ordered?

Page 11: CPSC 668 Distributed Algorithms and Systems

Example

[Figure: execution diagram with broadcast messages a and b; not preserved in this transcript]

• single-source FIFO?
• totally ordered?
• causally ordered?

Page 12: CPSC 668 Distributed Algorithms and Systems

Algorithm to Simulate Basic Broadcast on Top of Point-to-Point

• When bc-send_i(m) occurs:
  – p_i sends a separate copy of m to every processor (including itself) using the underlying point-to-point message passing communication system
• When can p_i perform bc-recv_i(m)?
  – when it receives m from the underlying point-to-point message passing communication system
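The algorithm above can be sketched as a small simulation. The `Network` class is a hypothetical stand-in for a correct point-to-point system (every message sent is delivered exactly once, in arbitrary order); it and the class name `BasicBroadcast` are assumptions for illustration only.

```python
import random

class Network:
    """Hypothetical reliable point-to-point layer with arbitrary delivery order."""
    def __init__(self, n, seed=0):
        self.n = n
        self.in_transit = []          # (dest, src, payload)
        self.handlers = {}            # dest -> callback(src, payload)
        self.rng = random.Random(seed)

    def send(self, src, dest, payload):
        self.in_transit.append((dest, src, payload))

    def run(self):
        # deliver messages one at a time in a pseudo-random order
        while self.in_transit:
            k = self.rng.randrange(len(self.in_transit))
            dest, src, payload = self.in_transit.pop(k)
            self.handlers[dest](src, payload)

class BasicBroadcast:
    def __init__(self, i, net):
        self.i, self.net = i, net
        self.delivered = []           # (m, j) in bc-recv order
        net.handlers[i] = self.on_recv

    def bc_send(self, m):
        # one separate copy to every processor, including itself
        for dest in range(self.net.n):
            self.net.send(self.i, dest, m)

    def on_recv(self, src, m):
        self.delivered.append((m, src))   # bc-recv_i(m, src)
```

Each bc-send costs n point-to-point messages, and no ordering guarantee is provided beyond what the lower layer happens to give.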

Page 13: CPSC 668 Distributed Algorithms and Systems

Correctness of Basic Broadcast Algorithm

• Assume the underlying point-to-point message passing system is correct (i.e., conforms to the spec given in previous set of slides).
• Check that the simulated broadcast service satisfies:
  – Integrity
  – No Duplicates
  – Liveness

Page 14: CPSC 668 Distributed Algorithms and Systems

Single-Source FIFO Algorithm

• Assume the underlying communication system is basic broadcast.
• when ssf-bc-send_i(m) occurs:
  – p_i uses the underlying basic broadcast service to bcast m together with a sequence number
  – p_i increments sequence number by 1 each time it initiates a bcast
• when can p_i perform ssf-bc-recv_i(m)?
  – when p_i has bc-recv'ed m with sequence number T and has ssf-bc-recv'ed messages from p_j (the ssf-bc-sender of m) with all smaller sequence numbers
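A minimal sketch of the per-processor state for this algorithm follows. The class name `SsfNode`, the `bc_send` callback, and the choice to start sequence numbers at 0 are assumptions; the underlying basic broadcast is represented only by direct calls to `on_bc_recv`.

```python
class SsfNode:
    """State p_i keeps to build single-source FIFO on top of basic broadcast."""
    def __init__(self, i, n, bc_send):
        self.i = i
        self.seq = 0                              # next sequence number to attach
        self.bc_send = bc_send                    # hypothetical lower-layer hook
        self.next_seq = [0] * n                   # next number expected from each p_j
        self.held = [dict() for _ in range(n)]    # per sender: seq -> m, not yet deliverable
        self.delivered = []                       # (m, j) in ssf-bc-recv order

    def ssf_bc_send(self, m):
        self.bc_send((m, self.seq))
        self.seq += 1

    def on_bc_recv(self, m, seq, j):
        self.held[j][seq] = m
        # deliver as long as the next expected message from p_j is available
        while self.next_seq[j] in self.held[j]:
            msg = self.held[j].pop(self.next_seq[j])
            self.delivered.append((msg, j))       # ssf-bc-recv_i(msg, j)
            self.next_seq[j] += 1
```

If the message with sequence number 1 arrives before the one with sequence number 0, it is held back and both are delivered once number 0 arrives.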

Page 15: CPSC 668 Distributed Algorithms and Systems

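Single-Source FIFO Algorithm

[Figure: layered architecture, top to bottom:
  user of SSF bcast
    ↕ ssf-bc-send / ssf-bc-recv
  SSF alg (timestamps)
    ↕ bc-send / bc-recv
  basic bcast alg (n copies)
    ↕ send / recv
  point-to-point message passing
The bottom two layers together are labeled "basic bcast"; with the SSF alg on top they are labeled "ssf bcast".]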

Page 16: CPSC 668 Distributed Algorithms and Systems

Asymmetric Algorithm for Totally Ordered Broadcast

• Assume underlying communication service is basic broadcast.
• There is a distinguished proc. p_c
• when to-bcast_i(m) occurs:
  – p_i sends m to p_c (either assume the basic broadcast service also has a point-to-point mechanism, or have recipients other than p_c ignore the msg)
• when p_c receives m from p_i from the basic broadcast service:
  – append a sequence number to m and bc-send it

Page 17: CPSC 668 Distributed Algorithms and Systems

Asymmetric Algorithm for Totally Ordered Broadcast

• when can p_i perform to-bc-recv_i(m)?
  – when p_i has bc-recv'ed m with sequence number T and has to-bc-recv'ed messages with all smaller sequence numbers
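The asymmetric algorithm can be sketched with two roles: the sequencer p_c and an ordinary receiver. The class names, the `bc_send` callback, and the use of 0 as the first sequence number are assumptions for this example, and the lower layers are replaced by direct method calls.

```python
class Sequencer:
    """Role of the distinguished processor p_c."""
    def __init__(self, bc_send):
        self.next_seq = 0
        self.bc_send = bc_send        # hypothetical basic-broadcast hook

    def on_request(self, m, i):
        # p_c received m from p_i: attach the next sequence number and bc-send
        self.bc_send((m, i, self.next_seq))
        self.next_seq += 1

class ToReceiver:
    """Role of every processor p_i when receiving from the basic broadcast."""
    def __init__(self):
        self.next_seq = 0
        self.held = {}                # seq -> (m, original sender)
        self.delivered = []           # to-bc-recv order

    def on_bc_recv(self, m, i, seq):
        self.held[seq] = (m, i)
        # deliver strictly in sequence-number order
        while self.next_seq in self.held:
            self.delivered.append(self.held.pop(self.next_seq))
            self.next_seq += 1
```

All ordering decisions are made by p_c alone, which keeps the algorithm simple but makes p_c a bottleneck.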

Page 18: CPSC 668 Distributed Algorithms and Systems

Symmetric Algorithm for Totally Ordered Broadcast

• Assume the underlying communication service is single-source FIFO broadcast.
• Each proc. tags each msg it sends with a timestamp (increasing).
• Each proc. keeps a vector of estimates of the other procs' timestamps:
  – if p_i's estimate for p_j is k, then p_i will not receive any later msg from p_j with timestamp ≤ k.
  – Estimates are updated based on msgs received and "timestamp update" msgs

Page 19: CPSC 668 Distributed Algorithms and Systems

Symmetric Algorithm for Totally Ordered Broadcast

• Each proc. keeps its timestamp to be ≥ all its estimates:
  – when p_i has to increase its timestamp because of the receipt of a message, it sends a timestamp update msg
• A proc. can deliver a msg with timestamp T once every entry in the proc's vector of estimates is at least T.

Page 20: CPSC 668 Distributed Algorithms and Systems

Symmetric Algorithm

when to-bc-send_i(m) occurs:
    ts[i]++
    add (m,ts[i]) to pending
    invoke ssf-bc-send_i((m,ts[i]))

when ssf-bc-recv_i((m,T)) from p_j occurs:
    ts[j] := T
    add (m,T) to pending
    if T > ts[i] then
        ts[i] := T
        invoke ssf-bc-send_i("ts-up",T)

when ssf-bc-recv_i("ts-up",T) from p_j occurs:
    ts[j] := T

invoke to-bc-recv_i(m) when:
    (m,T) is entry in pending with smallest T
    T ≤ ts[k] for all k
result: remove (m,T) from pending
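The symmetric algorithm can be exercised with a small simulation. This is a sketch under several assumptions that are not stated on the slide: the SSF layer is modeled by one FIFO queue per (sender, receiver) pair, a processor ignores its own messages when the SSF layer hands them back (they are already in pending), and ties between equal timestamps are broken by sender id as in Lemma 8.2. The names `ToNode` and `run` are hypothetical.

```python
import random
from collections import deque

class ToNode:
    def __init__(self, i, n, ssf_bc_send):
        self.i, self.n = i, n
        self.ts = [0] * n             # ts[i] is own timestamp, ts[j] an estimate for p_j
        self.pending = set()          # entries (T, sender, m); sender id breaks ties
        self.delivered = []           # to-bc-recv order
        self.ssf_bc_send = ssf_bc_send

    def to_bc_send(self, m):
        self.ts[self.i] += 1
        self.pending.add((self.ts[self.i], self.i, m))
        self.ssf_bc_send(self.i, ("msg", m, self.ts[self.i]))
        self.try_deliver()

    def on_ssf_bc_recv(self, payload, j):
        if j == self.i:
            return                    # assumption: own msgs were handled at send time
        kind, m, T = payload
        self.ts[j] = T
        if kind == "msg":
            self.pending.add((T, j, m))
            if T > self.ts[self.i]:   # keep own timestamp >= all estimates
                self.ts[self.i] = T
                self.ssf_bc_send(self.i, ("ts-up", None, T))
        self.try_deliver()

    def try_deliver(self):
        while self.pending:
            T, j, m = min(self.pending)
            if any(T > self.ts[k] for k in range(self.n)):
                break                 # some proc might still send a smaller timestamp
            self.pending.remove((T, j, m))
            self.delivered.append(m)

def run(n, sends, seed=0):
    """sends: list of (proc id, message), all issued before any receipt."""
    rng = random.Random(seed)
    chan = {(j, i): deque() for j in range(n) for i in range(n)}  # FIFO per pair
    def ssf_bc_send(j, payload):
        for i in range(n):
            chan[(j, i)].append(payload)
    nodes = [ToNode(i, n, ssf_bc_send) for i in range(n)]
    for i, m in sends:
        nodes[i].to_bc_send(m)
    while True:
        ready = [key for key, q in chan.items() if q]
        if not ready:
            break
        j, i = rng.choice(ready)      # arbitrary interleaving, FIFO per sender
        nodes[i].on_ssf_bc_recv(chan[(j, i)].popleft(), j)
    return [node.delivered for node in nodes]
```

Whatever interleaving the simulated network picks, every processor ends up with the same delivery sequence, ordered by (timestamp, sender id).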

Page 21: CPSC 668 Distributed Algorithms and Systems

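[Figure: layered architecture, top to bottom:
  user of TO bcast
    ↕ to-bc-send / to-bc-recv
  symmetric TO alg
    ↕ ssf-bc-send / ssf-bc-recv
  SSF alg (timestamps)
    ↕ bc-send / bc-recv
  basic bcast alg (n copies)
    ↕ send / recv
  point-to-point message passing
The bottom two layers together are labeled "basic bcast"; with the SSF alg they are labeled "ssf bcast"; with the symmetric TO alg on top they are labeled "TO bcast".]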

Page 22: CPSC 668 Distributed Algorithms and Systems

Correctness of Symmetric Algorithm

Lemma (8.2): Timestamps assigned to msgs form a total order (break ties with id of sender).

Theorem (8.3): Symmetric algorithm simulates totally ordered broadcast service.

Proof: Must show top-level outputs of symmetric algorithm satisfy 4 properties, in every admissible execution (relies on underlying ssf-bcast service being correct).

Page 23: CPSC 668 Distributed Algorithms and Systems

Correctness of Symmetric Alg.

Integrity: follows from same property for ssf-bcast.

No Duplicates: follows from same property for ssf-bcast.

Liveness:
• Suppose in contradiction some p_i has some entry (m,T) stuck in its pending set forever, where T is the smallest timestamp of all stuck entries.
• Eventually (m,T) has the smallest timestamp of all entries.
• Why is (m,T) stuck at p_i? Because its estimate of some p_k's timestamp is stuck at some value T' < T.
• But that would mean either p_k never receives (m,T) or p_k's timestamp update msg resulting from p_k receiving (m,T) is never received at p_i, contradicting correctness of the SSF broadcast.

Page 24: CPSC 668 Distributed Algorithms and Systems

Correctness of Symmetric Alg.

Total Ordering: Suppose p_i does to-bc-recv for msg m with timestamp T, and later it does to-bc-recv for msg m' with timestamp T'. Show T < T'.
• By the code, if (m',T') is in p_i's pending set when p_i does the to-bc-recv for m, then T < T'.
• Suppose (m',T') is not yet in p_i's pending set at that time. Let p_j be the proc. that initiated the to-bcast of m'.
• When p_i does the to-bc-recv for m, T ≤ ts[j]. So p_i has received a msg from p_j with timestamp ≥ T.
• By the SSF property, every subsequent msg p_i receives from p_j will have timestamp > T, so T' must be > T.

Page 25: CPSC 668 Distributed Algorithms and Systems

Causal Ordering Algorithms

• The symmetric total ordering algorithm ensures causal ordering:
  – timestamp order extends the happens-before order on messages.
• Causal ordering can also be attained without the overhead of total ordering using an algorithm based on vector clocks…

Page 26: CPSC 668 Distributed Algorithms and Systems

Causal Order Algorithm

when co-bc-send_i(m) occurs:
    vt[i]++
    invoke co-bc-recv_i(m)
    invoke bc-send_i((m,vt))

when bc-recv_i((m,w)) from p_j occurs:
    add (m,w,j) to pending

invoke co-bc-recv_i(m) when:
    (m,w,j) is in pending
    w[j] = vt[j] + 1
    w[k] ≤ vt[k] for all k ≠ j
result: remove (m,w,j) from pending
    vt[j]++

Note: vt[j] records how many msgs from p_j have been co-recv'ed
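A minimal sketch of the vector-clock algorithm follows. The class name `CausalNode` and the `bc_send` callback are hypothetical, the basic broadcast layer is replaced by direct calls to `on_bc_recv`, and the sketch assumes a processor ignores its own message when the basic broadcast returns it (it was already co-bc-recv'ed at send time). The delivery test compares every entry except the sender's, i.e. all k ≠ j.

```python
class CausalNode:
    def __init__(self, i, n, bc_send):
        self.i, self.n = i, n
        self.vt = [0] * n             # vt[j] = number of msgs from p_j co-recv'ed
        self.pending = []             # entries (m, w, j)
        self.delivered = []           # co-bc-recv order
        self.bc_send = bc_send        # hypothetical basic-broadcast hook

    def co_bc_send(self, m):
        self.vt[self.i] += 1
        self.delivered.append(m)                     # co-bc-recv_i(m) locally
        self.bc_send(self.i, (m, list(self.vt)))     # send a copy of vt

    def on_bc_recv(self, payload, j):
        if j == self.i:
            return                    # assumption: own msg already delivered at send
        m, w = payload
        self.pending.append((m, w, j))
        self.try_deliver()

    def deliverable(self, w, j):
        # next msg from p_j, and everything it causally depends on is delivered
        return (w[j] == self.vt[j] + 1 and
                all(w[k] <= self.vt[k] for k in range(self.n) if k != j))

    def try_deliver(self):
        progress = True
        while progress:               # one delivery may unblock others
            progress = False
            for entry in list(self.pending):
                m, w, j = entry
                if self.deliverable(w, j):
                    self.pending.remove(entry)
                    self.vt[j] += 1
                    self.delivered.append(m)
                    progress = True
```

If p_2 receives b (sent by p_1 after p_1 received a) before a, then b waits in pending until a has been delivered.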

Page 27: CPSC 668 Distributed Algorithms and Systems

Correctness of Causal Order Algorithm (Sketch)

Lemma (8.6): The local array variables vt serve as vector clocks.

Theorem (8.7): The algorithm simulates causally ordered broadcast, if the underlying communication system satisfies (basic) broadcast.

Proof: Integrity and No Duplicates follow from the same properties of the basic broadcast. Liveness requires some arguing. Causal Ordering follows from the lemma.

Page 28: CPSC 668 Distributed Algorithms and Systems


Reliable Broadcast

• What do we require of a broadcast service when some of the procs can be faulty?

• Specifications differ from those of the corresponding non-fault-tolerant specs in two ways:

1. proc indices are partitioned into "faulty" and "nonfaulty"

2. Liveness property is modified…

Page 29: CPSC 668 Distributed Algorithms and Systems


Reliable Broadcast Specification

• Nonfaulty Liveness: Every msg bc-sent by a nonfaulty proc is eventually bc-recv'ed by all nonfaulty procs.

• Faulty Liveness: Every msg bc-sent by a faulty proc is bc-recv'ed by either all the nonfaulty procs or none of them.

Page 30: CPSC 668 Distributed Algorithms and Systems

Discussion of Reliable Bcast Spec

• Specification is independent of any particular fault model.
• We will only consider implementations for crash faults.
• No guarantee is given concerning which messages are received by faulty procs.
• Can extend this spec to the various ordering variants:
  – msgs that are received by faulty procs must conform to the relevant ordering property.

Page 31: CPSC 668 Distributed Algorithms and Systems

Spec of Failure-Prone Point-to-Point Message Passing System

• Before we can design an algorithm to implement reliable (i.e., fault-tolerant) broadcast, we need to know what we can rely on from the lower layer communication system.
• Modify the previous point-to-point spec from the no-fault case in two ways:

1. partition proc indices into "faulty" and "nonfaulty"

2. Liveness property is modified…

Page 32: CPSC 668 Distributed Algorithms and Systems

Spec of Failure-Prone Point-to-Point Message Passing System

• Nonfaulty Liveness: every msg sent by a nonfaulty proc to any nonfaulty proc is eventually received.

Note that this places no constraints on messages received by faulty procs.

Page 33: CPSC 668 Distributed Algorithms and Systems

Reliable Broadcast Algorithm

when rel-bc-send_i(m) occurs:
    invoke send_i(m) to all procs

when recv_i(m) from p_j occurs:
    if m has not already been recv'ed then
        invoke send_i(m) to all procs
        invoke rel-bc-recv_i(m)
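The relay idea can be illustrated with a crash-fault simulation. This is a sketch under assumptions: message contents are unique, the `crash_after` parameter (number of copies the sender manages to send before crashing) and the names `RelNode` and `simulate` are hypothetical, and a crashed processor simply stops receiving.

```python
from collections import deque

class RelNode:
    def __init__(self, i, n, send):
        self.i, self.n, self.send = i, n, send
        self.seen = set()             # msgs already recv'ed
        self.delivered = []           # rel-bc-recv order

    def rel_bc_send(self, m, crash_after=None):
        # crash_after is a hypothetical knob: stop after that many copies
        dests = range(self.n) if crash_after is None else range(crash_after)
        for dest in dests:
            self.send(self.i, dest, m)

    def on_recv(self, m, j):
        if m not in self.seen:
            self.seen.add(m)
            for dest in range(self.n):        # relay to all before delivering
                self.send(self.i, dest, m)
            self.delivered.append(m)          # rel-bc-recv_i(m)

def simulate(n, sender, m, crash_after=None):
    """Returns the delivered lists of the nonfaulty processors, by increasing id."""
    queue = deque()
    nodes = [RelNode(i, n, lambda s, d, msg: queue.append((s, d, msg)))
             for i in range(n)]
    crashed = {sender} if crash_after is not None else set()
    nodes[sender].rel_bc_send(m, crash_after)
    while queue:
        src, dest, msg = queue.popleft()
        if dest not in crashed:               # crashed procs take no more steps
            nodes[dest].on_recv(msg, src)
    return [nodes[i].delivered for i in range(n) if i not in crashed]
```

If the sender crashes after reaching a single nonfaulty processor, that processor's relay still gets the message to every other nonfaulty processor; if it crashes before sending any copy, nobody delivers. Both outcomes are consistent with Faulty Liveness.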

Page 34: CPSC 668 Distributed Algorithms and Systems


Correctness of Reliable Bcast Alg

• Integrity: follows from Integrity property of underlying point-to-point msg system.

• No Duplicates: follows from No Duplicates property of underlying point-to-point msg system and the check that this msg was not already received.

• Nonfaulty Liveness: follows from Nonfaulty Liveness property of underlying point-to-point msg system.

• Faulty Liveness: follows from relaying and underlying Nonfaulty Liveness.