Distributed Computing 8. Impossibility of consensus Shmuel Zaks [email protected] ©


Consensus

Input: 1 or 0 to each processor

Output:
- Agreement: all processors decide the same value, 0 or 1
- Termination: all processors eventually decide
- Validity: if all inputs are x, then decide x
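The three requirements can be phrased as checks on one finished execution. The following is a minimal sketch, assuming a hypothetical helper `check_consensus` that receives the inputs and the final decisions as dictionaries, with `None` marking a processor that never decided. The names are illustrative and do not come from the lecture.

```python
from typing import Dict, Optional


def check_consensus(inputs: Dict[str, int],
                    decisions: Dict[str, Optional[int]]) -> Dict[str, bool]:
    """Check agreement, termination and validity on one finished execution.

    decisions[p] is None if processor p never decided (illustrative convention).
    """
    decided = [v for v in decisions.values() if v is not None]
    # Agreement: no two processors decided on different values.
    agreement = len(set(decided)) <= 1
    # Termination: every processor decided.
    termination = all(v is not None for v in decisions.values())
    # Validity only constrains executions in which all inputs are equal.
    input_values = set(inputs.values())
    validity = len(input_values) != 1 or all(v in input_values for v in decided)
    return {"agreement": agreement, "termination": termination, "validity": validity}


if __name__ == "__main__":
    print(check_consensus({"p0": 1, "p1": 1}, {"p0": 1, "p1": 1}))
    print(check_consensus({"p0": 0, "p1": 1}, {"p0": 0, "p1": None}))
```

Such a check only inspects a single execution; the impossibility result is about all admissible executions of a protocol.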

The result: No completely asynchronous consensus protocol can tolerate even a single unannounced process death.

This problem serves a role that is similar to the one served by "the halting problem" in computability theory.

Many problems are equivalent to consensus (or reduce to it).

How do commit protocols in practice deal with this outcome?

Weaken an assumption. For example:
- Computation model: e.g., assume a bounded-delay network
- Fault model: e.g., assume faults occur only at the start

The Model

Message System: reliable
- Delivers all messages correctly
- Exactly once

Processing Model: completely asynchronous
- No assumptions about relative speeds
- Unbounded time in delivering a message

Weak Consensus

- Every process starts with an initial value in {0,1}
- A nonfaulty process decides on a value in {0,1} by entering an appropriate decision state
- All nonfaulty processes are required to choose the same value
- Both 0 and 1 are possible decision values, although perhaps for different initial configurations (trivial solutions, e.g., always "0", are ruled out)

System Model

Processes communicate by means of one global message buffer.

Atomic step:
- Attempt to receive a message
- Perform local computation
- Send an arbitrary but finite set of messages

Consensus Protocol

N processes (N > 1). Each process p has:
- xp: one-bit input register
- yp: output register with values in {b,0,1}
- Unbounded amount of internal storage
- PC: program counter

[Figure: process p with input register xp (0/1), output register yp (0/1/b), unbounded memory, and program counter PC.]

- Fixed starting values in the memory (except the input register)
- The output register starts with b
- The output register is "write once": when a value is written to the output register, the process is "in a decision state"
- A process acts deterministically according to a transition function
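The registers of a process, and in particular the write-once output register, can be sketched as a small class. This is an illustration under assumptions: the class name `Process`, the `decide` method and the dictionary used for the unbounded memory are hypothetical and not part of the lecture.

```python
class Process:
    """One process: input register x, write-once output register y, memory, PC."""

    UNDECIDED = "b"  # initial content of the output register

    def __init__(self, name: str, x: int):
        if x not in (0, 1):
            raise ValueError("the input register holds one bit")
        self.name = name
        self.x = x                    # one-bit input register
        self._y = self.UNDECIDED      # output register, starts with b
        self.memory = {}              # stands in for unbounded internal storage
        self.pc = 0                   # program counter

    @property
    def y(self):
        return self._y

    @property
    def in_decision_state(self) -> bool:
        return self._y != self.UNDECIDED

    def decide(self, v: int) -> None:
        """Write the output register; a second write is rejected (write once)."""
        if v not in (0, 1):
            raise ValueError("a decision value is 0 or 1")
        if self.in_decision_state:
            raise RuntimeError("the output register is write once")
        self._y = v


if __name__ == "__main__":
    p = Process("p0", x=1)
    p.decide(0)
    print(p.y, p.in_decision_state)
```

Making `y` a read-only property keeps the only write path inside `decide`, which is where the write-once rule is enforced.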

Communication System

A message is a pair (p,m):
- p is the name of the destination
- m is a "message value"

The message buffer maintains messages that have been sent but not yet delivered.

We assume a clique topology.

Two operations by a process:
- send(p,m): places (p,m) in the message buffer ("message (p,m) is sent to process p")
- receive(p): either deletes a message (p,m) from the message buffer and returns m ("message (p,m) is received"), or returns the null marker ∅ and leaves the message buffer unchanged

The message system is nondeterministic. However, for each message (p,m) in the message buffer: if receive(p) is performed infinitely many times, then (p,m) is eventually delivered.

In other words: in response to receive(p), if a message (p,m) is in the message buffer, then the message system can return ∅, but only a finite number of times.
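One way to picture this message system is the sketch below. It rests on assumptions that are stronger than the model: the class `MessageBuffer`, the `seed` parameter and the bound `max_nulls` on consecutive empty replies are hypothetical, and messages to the same destination are delivered oldest first, which the model does not require. `None` plays the role of the empty reply.

```python
import random
from collections import defaultdict, deque


class MessageBuffer:
    """Messages that were sent but not yet delivered, grouped by destination."""

    def __init__(self, seed: int = 0, max_nulls: int = 3):
        self._queues = defaultdict(deque)   # destination -> pending message values
        self._nulls = defaultdict(int)      # consecutive empty replies per destination
        self._rng = random.Random(seed)     # source of nondeterminism (seeded here)
        self._max_nulls = max_nulls         # assumption: bound on consecutive empty replies

    def send(self, p, m) -> None:
        """Place (p, m) in the buffer."""
        self._queues[p].append(m)

    def receive(self, p):
        """Return a pending message value for p, or None (buffer unchanged)."""
        queue = self._queues[p]
        if not queue:
            return None
        # The system may answer "nothing" even though a message is pending,
        # but only finitely many times in a row.
        if self._nulls[p] < self._max_nulls and self._rng.random() < 0.5:
            self._nulls[p] += 1
            return None
        self._nulls[p] = 0
        return queue.popleft()


if __name__ == "__main__":
    buf = MessageBuffer(seed=1)
    buf.send("p1", "a")
    buf.send("p1", "b")
    print([buf.receive("p1") for _ in range(10)])
```

Because every pending message moves toward the front of its queue and at most `max_nulls` empty replies separate two deliveries, repeated calls to `receive(p)` deliver each message to p after finitely many attempts.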

[Figure: the message buffer holds (P1,M), (P0,M'), (P2,M'') and (P1,M'''); Process 0 performs receive(0) and gets (P0,M').]

[Figure: Process 1 performs receive(1), gets (P1,M''') and performs send(2,m), which places (P2,m) in the message buffer next to (P1,M) and (P2,M'').]

Configurations

A configuration consists of:
- The internal state of each process
- The contents of the message buffer

Initial configuration:
- Each process p starts with xp=0 or xp=1
- The message buffer is empty

A step consists of a primitive step by a single process p:
- Phase 1: receive(p) is performed
- Phase 2: p enters a new internal state and sends a finite set of messages

A step is completely determined by the pair e = (p,m), called an event.

Event e = (p,m) ("receipt of m by p"). Step of a single process p:
- receive(p) is performed (p receives m)
- p enters a new internal state
- p sends a finite set of messages

Event and step:
- event: syntax
- step: semantics
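Configurations, events and steps can be sketched in a few lines. The code below is an illustration under assumptions: the names `Configuration`, `initial`, `apply_event` and the two-process `toy_transition` are hypothetical, `None` stands for the empty receipt, and the buffer is kept sorted so that equal multisets of pending messages compare equal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Message = Tuple[str, str]              # (destination, message value)
Event = Tuple[str, Optional[str]]      # (p, m); m is None for the empty receipt


@dataclass(frozen=True)
class Configuration:
    states: Tuple[Tuple[str, str], ...]   # (process, internal state), sorted by process
    buffer: Tuple[Message, ...]           # pending messages, kept sorted


def initial(states: Dict[str, str]) -> Configuration:
    """Initial configuration: given internal states, empty message buffer."""
    return Configuration(tuple(sorted(states.items())), ())


def apply_event(config: Configuration, event: Event,
                transition: Callable[[str, str, Optional[str]],
                                     Tuple[str, List[Message]]]) -> Configuration:
    """Return e(C): the step determined by the event e = (p, m)."""
    p, m = event
    pending = list(config.buffer)
    if m is not None:
        if (p, m) not in pending:
            raise ValueError(f"event {event} is not applicable")
        pending.remove((p, m))                       # phase 1: p receives m
    states = dict(config.states)
    states[p], sent = transition(p, states[p], m)    # phase 2: new state, sends
    return Configuration(tuple(sorted(states.items())),
                         tuple(sorted(pending + list(sent))))


def toy_transition(p: str, state: str, m: Optional[str]):
    """Hypothetical protocol: a process greets the other one on its first step."""
    other = "p1" if p == "p0" else "p0"
    if state == "start":
        return "greeted", [(other, "hello")]
    if m == "hello":
        return "done", []
    return state, []


if __name__ == "__main__":
    c0 = initial({"p0": "start", "p1": "start"})
    c1 = apply_event(c0, ("p0", None), toy_transition)
    print(c1)
```

The event carries only the pair (p, m); everything else about the step follows from the deterministic transition function, which is the sense in which a step is determined by its event.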

Events and Schedules

- e(C) denotes the configuration resulting from applying e to C ("e can be applied to C")
- The event (p,∅) can always be applied
- A schedule from C is a finite or infinite sequence σ of events that can be applied, in turn, starting from C
- The associated sequence of steps is called a run

one: event - step
many: schedule - run

If a schedule σ is finite, σ(C) denotes the resulting configuration C', which is "reachable from C".

C' is accessible if it is reachable from an initial configuration.

Lemma 1 ('commutativity')

Lemma 1: Suppose that from some configuration C, the schedules σ1, σ2 lead to configurations C1, C2, respectively. If the sets of processes taking steps in σ1 and σ2, respectively, are disjoint, then σ2 can be applied to C1, and σ1 can be applied to C2, and both lead to the same configuration C3.
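The commutativity claim can be checked on a small example. The sketch below assumes a hypothetical counting protocol: each process counts its own steps, a message value is the name of its sender, and a receiver replies to that sender. The helper names `step` and `run` are illustrative only.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[str, Optional[str]]                  # (p, m); None is the empty receipt
Config = Tuple[Dict[str, int], List[Tuple[str, str]]]


def step(config: Config, event: Event) -> Config:
    """Apply one event of the hypothetical counting protocol."""
    states, buffer = dict(config[0]), list(config[1])
    p, m = event
    if m is not None:
        buffer.remove((p, m))      # raises ValueError if the event is not applicable
        buffer.append((m, p))      # reply to the sender named in m
    states[p] += 1                 # local computation: count own steps
    return states, sorted(buffer)  # sorted, so equal multisets compare equal


def run(config: Config, schedule: List[Event]) -> Config:
    """Apply a finite schedule event by event."""
    for event in schedule:
        config = step(config, event)
    return config


if __name__ == "__main__":
    c = ({"p0": 0, "p1": 0, "p2": 0}, [("p0", "p2"), ("p1", "p2")])
    sigma1 = [("p0", "p2"), ("p0", None)]   # only p0 takes steps
    sigma2 = [("p1", "p2")]                 # only p1 takes steps
    print(run(run(c, sigma1), sigma2) == run(run(c, sigma2), sigma1))
```

The two schedules touch disjoint processes and consume different messages, so neither order can disable an event of the other schedule, and both orders end in the same configuration.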

[Figure: commutativity diamond. From C0, σ1 leads to C1 and σ2 leads to C2; σ2 applied to C1 and σ1 applied to C2 both lead to C3.]

Proof of Lemma 1:
- When σ1 and σ2 each contain a single event, (p1,m1) and (p2,m2): the message buffer of C3 is the same in both orders, p1 moves from internal state A to B, p2 moves from internal state X to Y, and all other processes keep their states unchanged. So the claim holds.
- When σ1 and σ2 contain any run: use induction.

A configuration C has a decision value v if some process p is in a decision state with yp = v (v=0 or v=1).

A consensus protocol is partially correct if it satisfies two conditions:
1. No accessible configuration has more than one decision value.
2. For each v ∈ {0,1}, some accessible configuration has decision value v.

Good news:
- it is nontrivial
- sometimes it decides
- it never decides incorrectly

Bad news:
- termination is not guaranteed
- what about delivering all messages?
- what about failures?

A process p is nonfaulty in a run if it takes infinitely many steps. It is faulty otherwise.

Bad news: a process can be declared faulty only at infinity!

A run is admissible if
- at most one process is faulty, and
- all messages sent to nonfaulty processes are eventually received.

A run is deciding if some process reaches a decision state.

A consensus protocol is totally correct in spite of one fault if:
- it is partially correct, and
- every admissible run is a deciding run.

Theorem: No consensus protocol is totally correct in spite of one fault.

Sketch of proof: Assume that P is totally correct in spite of one fault.
- Show an initial configuration from which each decision is still possible (Lemma 2)
- Show that from such a configuration one can always reach another similar configuration (Lemma 3)
- Conclude, by induction, with an admissible run that never decides: a contradiction.

Let C be a configuration and let V be the set of decision values of configurations reachable from C.
- C is bivalent if |V| = 2
- C is univalent if |V| = 1: if V = {0} then C is 0-valent; if V = {1} then C is 1-valent

(Note: |V| ≠ 0, since P is totally correct.)
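For a protocol whose reachable configurations form a small finite graph, the set V and the valence can be computed by exhaustive search. The sketch below assumes such a graph is given explicitly: the dictionaries `successors` and `decided` and the configuration names are made up for illustration, and real protocols generally have infinitely many reachable configurations.

```python
from typing import Dict, Hashable, Iterable, Optional, Set


def decision_values(start: Hashable,
                    successors: Dict[Hashable, Iterable[Hashable]],
                    decided: Dict[Hashable, Optional[int]]) -> Set[int]:
    """V: decision values of all configurations reachable from start."""
    seen = {start}
    stack = [start]
    values: Set[int] = set()
    while stack:
        c = stack.pop()
        if decided.get(c) is not None:
            values.add(decided[c])
        for nxt in successors.get(c, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return values


def valence(start, successors, decided) -> str:
    v = decision_values(start, successors, decided)
    if v == {0, 1}:
        return "bivalent"
    if v == {0}:
        return "0-valent"
    if v == {1}:
        return "1-valent"
    return "no decision reachable"   # |V| = 0, excluded when P is totally correct


if __name__ == "__main__":
    # Made-up graph: from C one step leads toward deciding 0, another toward deciding 1.
    successors = {"C": ["A", "B"], "A": ["A0"], "B": ["B1"]}
    decided = {"A0": 0, "B1": 1}
    for name in ("C", "A", "B"):
        print(name, valence(name, successors, decided))
```

In this made-up graph the configuration C is bivalent while each of its successors is univalent, which is exactly the kind of step the proof has to avoid.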

Theorem: No consensus protocol is totally correct in spite of one fault.

Proof: Assume that P is totally correct in spite of one fault. We will reach a contradiction.

[Figure legend used from now on: 0-valent configuration; unknown.]

Lemma 2: P has a bivalent initial configuration.

Proof: Assume there is no bivalent initial configuration. But P is partially correct. So there are both 0-valent and 1-valent initial configurations.

[Figure: the set of initial configurations, with a bivalent configuration C marked.]

[Figure: the set of initial configurations, with a 0-valent configuration C0 and a 1-valent configuration C1 marked.]

Two initial configurations are called adjacent if they differ only in the initial value of a single process.

Example (the two configurations differ only in x4):

x0 x1 x2 x3 x4
 0  1  0  1  1
 0  1  0  1  0

Claim: There exists a 0-valent initial configuration C0 adjacent to a 1-valent initial configuration C1.

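Proof by example: change the initial values one process at a time, from a 0-valent initial configuration to a 1-valent one. Somewhere along the chain there are two adjacent configurations, C0 and C1, where the valence switches.

x0 x1 x2 x3 x4
 0  1  0  1  1    0-valent
 1  1  0  1  1
 1  1  0  1  0
 1  1  0  0  0
 1  0  0  0  0    1-valent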

So: there exists a 0-valent initial configuration C0 adjacent to a 1-valent initial configuration C1.

Let p be the process in whose initial value they differ.
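The search for the adjacent pair can be sketched as a walk along a chain of initial configurations. The chain below is the one from the example; the valence labels returned by `made_up_valence` are invented purely for illustration, since the real valences depend on the protocol P. The helper names are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

Inputs = Tuple[int, ...]


def is_adjacent(a: Inputs, b: Inputs) -> bool:
    """Adjacent: the initial values differ for exactly one process."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def find_switch(chain: Sequence[Inputs],
                valence: Callable[[Inputs], int]) -> Optional[Tuple[Inputs, Inputs]]:
    """Return the first consecutive pair whose valences differ, if any."""
    for a, b in zip(chain, chain[1:]):
        if not is_adjacent(a, b):
            raise ValueError("consecutive configurations must be adjacent")
        if valence(a) != valence(b):
            return a, b
    return None


# Chain of initial configurations (x0, ..., x4) from the example.
CHAIN = [
    (0, 1, 0, 1, 1),
    (1, 1, 0, 1, 1),
    (1, 1, 0, 1, 0),
    (1, 1, 0, 0, 0),
    (1, 0, 0, 0, 0),
]


def made_up_valence(inputs: Inputs) -> int:
    """Invented labels: the first two configurations are 0-valent, the rest 1-valent."""
    return 0 if inputs in CHAIN[:2] else 1


if __name__ == "__main__":
    print(find_switch(CHAIN, made_up_valence))
```

Whatever the true labels are, a chain that starts 0-valent and ends 1-valent, with every configuration univalent, must contain at least one such switching pair.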

P is a consensus protocol that is totally correct in spite of one fault.

Consider an admissible deciding run (with schedule σ) from C0 in which process p takes no steps.

σ can also be applied to C1.

The two corresponding configurations are identical, except for the internal state of p.

Both runs reach the same decision x.

x = 1 ⇒ C0 is bivalent
x = 0 ⇒ C1 is bivalent

Contradiction.

[Figure: σ leads from the 0-valent C0 to C' and from the 1-valent C1 to C''; both reach the same decision x.]

So we proved Lemma 2.

Lemma 3: Let
- C be a bivalent configuration of P,
- e = (p,m) be an event that is applicable to C,
- S be the set of configurations reachable from C without applying e, and
- D = e(S) = { e(E) | E ∈ S and e is applicable to E }.

Then D contains a bivalent configuration.

Note: e = (p,m) is applicable to C, so the message (p,m) is in the message buffer, so e is applicable to every E ∈ S.

[Figure: from the bivalent configuration C, events ei ≠ e lead to the configurations E of S; applying e to each of them gives D = e(S).]

Need to prove: D contains a bivalent configuration.

Proof by contradiction: assume that D contains no bivalent configuration.

[Figure: S and D = e(S), with every configuration in D marked 0-valent or 1-valent.]

So: every configuration d ∈ D is 0-valent or 1-valent.

The proof has three steps.

Step 1:

Claim: D contains both 0-valent and 1-valent configurations.

[Figure for Step 1: e = (p,m) maps S to D = e(S), which contains a 0-valent D0 and a 1-valent D1.]

C is bivalent ⇒ there exist Ei, i=0,1, i-valent configurations reachable from C.

For each i there are two cases:
1. Ei ∈ S
2. Ei ∉ S

Case 1: E1 ∈ S (i=1)

Let F1 = e(E1). Then F1 ∈ D, so D contains a 1-valent configuration.

[Figure: E1 in S, and F1 = e(E1) in D.]

Case 2: E0 ∉ S (i=0)

e was applied in reaching E0, so either E0 is in D, or there exists F0 ∈ D from which E0 is reachable. So D contains a 0-valent configuration.

[Figure: F0 in D, with E0 reachable from F0.]

One of Ei and Fi is reachable from the other, and Fi is not bivalent, so Fi is i-valent.

So we know that D contains both 0-valent and 1-valent configurations.

End of step 1. Start of step 2.

Step 2

Claim: There exist C0, C1 ∈ S such that:
- C0 and C1 are neighbors (C1 = e'(C0), e' = (p',m'))
- D0 = e(C0) is 0-valent
- D1 = e(C1) is 1-valent

(Two configurations are neighbors if one results from the other in a single step.)

[Figure for Step 2: neighbors C0 and C1 = e'(C0) in S, with D0 = e(C0) and D1 = e(C1) in D; e = (p,m), e' = (p',m').]

e(C) is 0-valent or 1-valent. Suppose it is 0-valent.

There are both 0-valent and 1-valent configurations in D. They have predecessors in S.

Consider the path in S from C to the predecessor of a 1-valent configuration in D.

Applying e to each configuration on this path, we get a configuration in D, which is 0-valent or 1-valent.

[Figure: a path in S from C to a predecessor of a 1-valent configuration; e maps each configuration on the path into D = e(S), starting with e(C).]

So we get two configurations C0 and C1 that are neighbors in S; i.e., there is an event e' such that C1 = e'(C0).

[Figure: C0 and C1 on the path in S, with D0 = e(C0) and D1 = e(C1) in D.]

So we proved the claim:

There exist C0, C1 ∈ S such that:
- C0 and C1 are neighbors (C1 = e'(C0), e' = (p',m'))
- D0 = e(C0) is 0-valent
- D1 = e(C1) is 1-valent

hw: complete the proof when e(C) is 1-valent.

End of step 2. Start of step 3.

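Step 3: get to a contradiction

Recall: e = (p,m) and e' = (p',m').

Case 1: p' ≠ p

D1 = e'(D0) by Lemma 1, so the 1-valent D1 is reachable from the 0-valent D0: a contradiction.

[Figure: C0 and C1 = e'(C0) in S, with D0 = e(C0), D1 = e(C1), and e' leading from D0 to D1.]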
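The Case 1 computation can be written out as a short chain of equalities, using only the definitions from Step 2 and Lemma 1 applied to the single-event schedules e and e' (their processes p and p' are different, hence disjoint):

```latex
\[
D_1 \;=\; e(C_1) \;=\; e\bigl(e'(C_0)\bigr) \;=\; e'\bigl(e(C_0)\bigr) \;=\; e'(D_0)
\]
```

The third equality is the commutativity step. It shows that D1 is reachable from D0 in a single step, which is impossible when D0 is 0-valent and D1 is 1-valent.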

[Figure, recalled from Step 2 before Case 2 (p' = p): C0 and C1 = e'(C0) in S, with D0 = e(C0) and D1 = e(C1) in D; e = (p,m), e' = (p',m').]

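Case 2: p' = p

Let σ be the schedule of a deciding run from C0 in which p takes no steps, and let A = σ(C0).

By Lemma 1, σ can also be applied to D0 and to D1, leading to E0 = σ(D0), which is 0-valent, and E1 = σ(D1), which is 1-valent. Also by Lemma 1, e(A) = E0 and e(e'(A)) = E1.

The run to A is a deciding run, so A must be univalent. But A cannot be 0-valent, since the 1-valent E1 is reachable from it, and it cannot be 1-valent, since the 0-valent E0 is reachable from it. A contradiction!

[Figure: σ leads from C0 to A, from D0 to E0, and from D1 to E1; e leads from A to E0, and e' followed by e leads from A to E1.]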

So we proved:

Lemma 3: Let
- C be a bivalent configuration of P,
- e = (p,m) be an event that is applicable to C,
- S be the set of configurations reachable from C without applying e, and
- D = e(S) = { e(E) | E ∈ S and e is applicable to E }.

Then D contains a bivalent configuration.

End of proof:

Any deciding run from a bivalent initial configuration goes to a univalent configuration, so there must be some single step that goes from a bivalent to a univalent configuration. We construct a run that avoids such a step.

[Figure: a deciding run from a bivalent configuration to a univalent configuration.]

We construct an infinite non-deciding run.

[Figure: an infinite non-deciding run.]

- Start with a bivalent initial configuration (Lemma 2)
- The run is constructed in stages. Every stage starts with a bivalent configuration and ends with a bivalent configuration
- A queue of processes is maintained, initially in arbitrary order
- The message buffer is ordered according to the time the messages were sent

In each stage:
- C is the bivalent configuration that the stage starts with
- Suppose that process p heads the queue
- Suppose that m is the earliest message to p in the message buffer, if any (or ∅ otherwise)
- e = (p,m)

By Lemma 3 there is a bivalent configuration C' reachable from C by a schedule in which e is the last event.

After applying this schedule, move p to the back of the queue.
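The bookkeeping of the stages (the process queue and the choice of the earliest message) can be sketched as follows. This is only the bookkeeping, under assumptions: the function `stage_events` is hypothetical, it ignores the intermediate events that the schedule from Lemma 3 contains, and it ignores messages sent during the run. `None` stands for the empty receipt.

```python
from collections import deque
from typing import List, Optional, Sequence, Tuple

Message = Tuple[str, str]            # (destination, message value)
Event = Tuple[str, Optional[str]]    # (p, m); None is the empty receipt


def stage_events(processes: Sequence[str],
                 buffer: Sequence[Message],
                 stages: int) -> List[Event]:
    """Return the event e = (p, m) that each stage applies last."""
    queue = deque(processes)         # process queue, initially in the given order
    pending = list(buffer)           # ordered by sending time, oldest first
    chosen: List[Event] = []
    for _ in range(stages):
        p = queue[0]                 # the process that heads the queue
        # Earliest pending message to p, if any.
        m = next((value for dest, value in pending if dest == p), None)
        chosen.append((p, m))
        if m is not None:
            pending.remove((p, m))   # e delivers this message
        queue.rotate(-1)             # move p to the back of the queue
    return chosen


if __name__ == "__main__":
    events = stage_events(["p0", "p1", "p2"],
                          [("p1", "a"), ("p0", "b"), ("p1", "c")],
                          stages=6)
    print(events)
```

The rotation puts every process at the head of the queue once every N stages, and choosing the earliest message means no pending message can be postponed forever.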

In any infinite sequence of stages:
- every process takes infinitely many steps
- every process receives every message sent to it

Therefore, the constructed run:
- is admissible
- never reaches a univalent configuration

So the protocol never reaches a decision, and the protocol is not totally correct in spite of one fault: a contradiction.

Conclusion

Theorem:

No consensus protocol is totally correct in spite of one fault.

hw: which process fails in the infinite run that was constructed for the proof?


One important lesson: In an asynchronous system, there is no way to distinguish between a faulty process and a slow process.

Other tasks not solvable with one faulty processor:
- Input graph: connected; output graph: disconnected

Many extensions and uses.


References

• J. Pachl, E. Korach and D. Rotem, Lower bounds for distributed maximum-finding algorithm. JACM, 1984.

• E. Chang and R. Roberts, An improved algorithm for decentralized extrema-finding in circular configurations of processes, CACM, 1979.

• M. Fischer, N. Lynch, M. Paterson, Impossibility of distributed consensus with one faulty processor, JACM, 1985.
