Message Logging: Pessimistic & Optimistic. CS717 Lecture, 10/16/01-10/18/01. Kamen Yotov
Dec 21, 2015
2
Introduction
Context & Applications
  • Check-pointing
  • Message Logging
      • Pessimistic (failure-free mode suffers)
      • Optimistic (good for failure-free mode)
      • Causal (to be discussed in next lectures...)
Main problems
  • Consistency
      • Orphans
3
Fault Tolerance “Why”s
Flow of events
  • Check-point
  • Log messages
  • Crash
  • Restore
  • Replay
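The flow of events above can be illustrated with a toy, single-process sketch. Everything here is hypothetical (the class `LoggedProcess`, its methods, and the use of in-memory lists to stand in for stable storage); it only assumes that the application is deterministic, so replaying logged messages on top of a check-point rebuilds the pre-crash state.

```python
import copy

class LoggedProcess:
    """Toy deterministic process whose state is the list of delivered payloads."""

    def __init__(self):
        self.state = []        # volatile application state
        self.checkpoint = []   # last check-point (stands in for stable storage)
        self.log = []          # messages logged since the last check-point

    def deliver(self, msg):
        self.log.append(msg)    # log the message first ...
        self.state.append(msg)  # ... then let the application consume it

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)
        self.log.clear()        # earlier messages are no longer needed for replay

    def crash(self):
        self.state = None       # volatile state is lost

    def recover(self):
        self.state = copy.deepcopy(self.checkpoint)  # restore
        for msg in self.log:                         # replay in delivery order
            self.state.append(msg)

p = LoggedProcess()
p.deliver("a")
p.take_checkpoint()
p.deliver("b")
p.deliver("c")
before = list(p.state)
p.crash()
p.recover()
```

After `recover()`, `p.state` equals the state saved in `before`, because the two messages delivered after the check-point are replayed from the log.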
4
Common Assumptions
Fail-stop model
  • Failure eventually detectable by all
Channels
  • Asynchronous
  • Reliable FIFO
  • Unbounded message delivery
  • Failures
      • Transiently dropping
      • No duplication and/or corruption
Stable storage
Spare processing capacity
5
Common goals
Application independence
Application transparency
  • Simple
  • Independent evolution
  • Handles preexisting programs
High throughput
  • Failure-free mode with little overhead
Maximum fault-tolerance
  • Any number of failures
6
Formal Terminology
Delivery (as opposed to receipt)
  • Non-faulty processes eventually deliver all messages that they have received
  • Receive sequence number
      • If p delivers m and m.rsn = l, then m is the l-th message p delivers
Run
  • Sequence of system states
  • Asynchronous
      • Only one process changes state at once
7
Formal Terminology (cont.)
Properties: logical expressions over runs
  • □ - Always
  • ◊ - Eventually
Message determinant
  • #m = <m.src, m.ssn, m.dest, m.rsn, m.data>
  • m.data and m.dest not essential
  • Logging determinants vs. actual messages
Other notation
  • N – set of all processes
  • C – set of failed processes
  • Log(m) – set of processes possessing a copy of #m
  • Depend(m) – set of processes that depend on m
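\[
\mathit{Depend}(m) \;\stackrel{\mathrm{def}}{=}\; \bigl\{\, j \in N \;\bigm|\; \bigl((j = m.\mathit{dest}) \wedge j \text{ has delivered } m\bigr) \;\vee\; \bigl(\exists m' : \mathit{deliver}_{m.\mathit{dest}}(m) \rightarrow \mathit{deliver}_{j}(m')\bigr) \bigr\}
\]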
8
Orphan Properties
Before failure, by definition, every process in Log(m) holds a copy of #m
#m lost if $\mathit{Log}(m) \subseteq C$
stable(m) if #m cannot be lost
p orphan of C if
  • p did not fail
  • $p \in \mathit{Depend}(m)$
  • #m is lost
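The three orphan conditions above translate directly into set operations. The sketch below is illustrative only: the function names and the dictionary-of-sets representation of Log and Depend are assumptions, not something the slides prescribe.

```python
def is_lost(log_m, crashed):
    """#m is lost when every process holding a copy has crashed (Log(m) is a subset of C)."""
    return log_m <= crashed

def orphans(processes, crashed, depend, log):
    """Surviving processes that depend on some message whose determinant is lost."""
    result = set()
    for m in depend:
        if is_lost(log[m], crashed):
            result |= depend[m] - crashed   # dependants of m that did not fail
    return result & processes

N = {"p", "q", "r"}
C = {"q"}                                        # q has crashed
Depend = {"m1": {"q", "r"}, "m2": {"p"}}
Log = {"m1": {"q"}, "m2": {"p", "q"}}            # only q holds #m1

found = orphans(N, C, Depend, Log)
```

Here `found` is `{"r"}`: the determinant of `m1` was held only by the crashed process `q`, and `r` survives while depending on `m1`. The determinant of `m2` is still held by `p`, so it is not lost.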
9
Orphan Properties (cont.)
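Definition:
\[ p \text{ orphan of } C \;\stackrel{\mathrm{def}}{=}\; (p \in N - C) \wedge \exists m : \bigl(p \in \mathit{Depend}(m) \wedge \mathit{Log}(m) \subseteq C\bigr) \]
\[ \forall m : \Box\bigl(\mathit{Log}(m) \subseteq C \Rightarrow \mathit{Depend}(m) \subseteq C\bigr) \]
\[ \forall m : \Box\bigl(\mathit{Depend}(m) \subseteq \mathit{Log}(m)\bigr) \]
\[ \forall m : \Box\bigl(\neg\mathit{stable}(m) \Rightarrow \mathit{Depend}(m) \subseteq \mathit{Log}(m)\bigr) \]
\[ \forall m : \Box\bigl(|\mathit{Log}(m)| \le f \Rightarrow \mathit{Depend}(m) \subseteq \mathit{Log}(m)\bigr) \]
Pessimistic (much stronger property):
\[ \forall m : \Box\bigl(\neg\mathit{stable}(m) \Rightarrow |\mathit{Depend}(m)| \le 1\bigr) \]
Optimistic:
\[ \forall m : \Box\bigl(\neg\mathit{stable}(m) \Rightarrow (\mathit{Log}(m) \subseteq C \Rightarrow \Diamond(\mathit{Depend}(m) \subseteq C))\bigr) \]
Causal:
\[ \forall m : \Box\bigl(|\mathit{Log}(m)| \le f \Rightarrow (\mathit{Depend}(m) \subseteq \mathit{Log}(m) \wedge \Diamond(\mathit{Depend}(m) = \mathit{Log}(m)))\bigr) \]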
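The temporal operators quantify over whole runs, so they cannot be checked on one snapshot. As a rough illustration only, the sketch below evaluates the body of two of the properties (dependants contained in Log(m) for unstable messages, and the pessimistic bound of at most one dependant) on a single system state. The function names and the dictionary representation are assumptions made for this example.

```python
def no_orphans_body(depend, log, stable):
    """Snapshot check: every unstable m has Depend(m) contained in Log(m)."""
    return all(stable[m] or depend[m] <= log[m] for m in depend)

def pessimistic_body(depend, stable):
    """Snapshot check: every unstable m has at most one dependant."""
    return all(stable[m] or len(depend[m]) <= 1 for m in depend)

# q delivered m and then sent a message to r before #m became stable.
depend = {"m": {"q", "r"}}
log = {"m": {"q"}}
stable = {"m": False}
before_logging = (no_orphans_body(depend, log, stable),
                  pessimistic_body(depend, stable))

# Once #m is on stable storage, both bodies hold trivially.
stable["m"] = True
after_logging = (no_orphans_body(depend, log, stable),
                 pessimistic_body(depend, stable))
```

In this snapshot both checks fail while `m` is unstable and succeed once it is stable, which is the situation a pessimistic protocol prevents by blocking `q` until `#m` is logged.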
10
Performance Metrics
Number of forced roll-backs
Time spent blocking
Number of messages
Size of messages
11
On to the real-world stuff!
No additional messages
Any number of failures (including total)
No assumptions about the logging protocol
  • Pessimistic doesn’t require that generality
12
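The Model: Process states
Process states
  • State interval
      • A new one starts on each message received
      • State interval index (auto increment)
[Figure: timelines of processes p1, p2 and p3 split into state intervals; visible labels include intervals 0-1 of p1, interval 3 of p2 and intervals 0-5 of p3]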
13
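The Model: Process states (cont.)
[Figure: timelines of processes p1, p2 and p3 with their state intervals, repeated from the previous slide]
Dependencies between process states (pi depends on pj)
  • Maximum index of any interval of pj on which pi depends
  • Inside a process, each interval depends on the previous one
Dependence vector
  • $d_i = \langle\delta_*\rangle = \langle\delta_1, \delta_2, \delta_3, \delta_4, \ldots, \delta_n\rangle$, $\delta_k \in \{\bot, 0, 1, \ldots\}$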
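A minimal sketch of how a dependence vector might be maintained. It assumes that each message is tagged with the sender's current state interval index and that only this direct dependency is recorded on receipt; the class `Process`, the tag format and the use of `-1` for "no dependency" are all choices made for this example.

```python
BOTTOM = -1  # stands for "no dependency"; smaller than every interval index

class Process:
    def __init__(self, pid, n):
        self.pid = pid
        self.dep = [BOTTOM] * n
        self.dep[pid] = 0          # every process starts in state interval 0

    def send(self):
        """Tag an outgoing message with (sender id, sender's current interval)."""
        return (self.pid, self.dep[self.pid])

    def receive(self, tag):
        """Start a new state interval and record the dependency on the sender."""
        sender, interval = tag
        self.dep[self.pid] += 1
        self.dep[sender] = max(self.dep[sender], interval)

p1, p2, p3 = Process(0, 3), Process(1, 3), Process(2, 3)
p2.receive(p1.send())   # p2 enters interval 1, depends on interval 0 of p1
p3.receive(p2.send())   # p3 enters interval 1, depends on interval 1 of p2
p3.receive(p1.send())   # p3 enters interval 2, depends on interval 0 of p1
```

At the end, `p3.dep` is `[0, 1, 2]`: its own entry is its current interval index, and the other entries are the highest intervals of `p1` and `p2` it has received messages from.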
14
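The Model: System states
Process state – dependence vector
  • $d_i = \langle\delta_*\rangle = \langle\delta_1, \delta_2, \delta_3, \delta_4, \ldots, \delta_n\rangle$, $\delta_k \in \{\bot, 0, 1, \ldots\}$
System state – dependence matrix $n \times n$
  • Row i – process state for $p_i$
  • Diagonal – current state intervals
\[
D = [\delta_{**}] =
\begin{bmatrix}
\delta_{11} & \delta_{12} & \delta_{13} & \cdots & \delta_{1n} \\
\delta_{21} & \delta_{22} & \delta_{23} & \cdots & \delta_{2n} \\
\delta_{31} & \delta_{32} & \delta_{33} & \cdots & \delta_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\delta_{n1} & \delta_{n2} & \delta_{n3} & \cdots & \delta_{nn}
\end{bmatrix}
\]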
15
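The Model: System states (cont.)
S – set of all system states
  • $A = [\alpha_{**}] \in S$ and $B = [\beta_{**}] \in S$
  • $A \preceq B \iff \forall i = 1..n : \alpha_{ii} \le \beta_{ii}$
Partial order different than Lamport’s
  • Orders system states vs. events
  • The only events are state intervals
Lattice
  • $A \sqcup B = [\phi_{**}]$, $\phi_{ik} = (\alpha_{ii} \ge \beta_{ii}) \;?\; \alpha_{ik} : \beta_{ik}$
  • $A \sqcap B = [\phi_{**}]$, $\phi_{ik} = (\alpha_{ii} \le \beta_{ii}) \;?\; \alpha_{ik} : \beta_{ik}$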
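The order and the two lattice operations compare only diagonal entries and then copy whole rows. A small sketch, assuming system states are lists of rows and `-1` stands for "no dependency" (the function names `leq`, `join` and `meet` are made up for this example):

```python
def leq(a, b):
    """A precedes B when every process is at least as far along in B."""
    return all(a[i][i] <= b[i][i] for i in range(len(a)))

def join(a, b):
    """Row by row, keep the row of whichever state is further along."""
    return [list(a[i]) if a[i][i] >= b[i][i] else list(b[i]) for i in range(len(a))]

def meet(a, b):
    """Row by row, keep the row of whichever state is less far along."""
    return [list(a[i]) if a[i][i] <= b[i][i] else list(b[i]) for i in range(len(a))]

A = [[2, 0], [1, 1]]
B = [[1, -1], [1, 3]]

upper = join(A, B)   # row 0 from A, row 1 from B
lower = meet(A, B)   # row 0 from B, row 1 from A
```

`A` and `B` are incomparable here (process 0 is further along in `A`, process 1 in `B`), yet both precede `upper` and both follow `lower`.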
16
The Model: Consistent System states
Consistent state
  • All received messages are either
      • Sent in the current state of the sender, or
      • Can be deterministically sent in the future
  • Messages not yet received are not a problem
Definition: $D = [\delta_{**}] \in S$ is consistent $\iff \forall i, k = 1..n : \delta_{ik} \le \delta_{kk}$
  • A process cannot depend on a state interval of another process that has not been reached yet
$C = \{\, D \in S \mid D \text{ is consistent} \,\}$
  • C is a sub-lattice of S – proof straightforward!
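The consistency definition is a single comparison per matrix entry. A minimal sketch, assuming a system state is a list of rows with `-1` for "no dependency" (the name `is_consistent` is hypothetical):

```python
def is_consistent(d):
    """No process depends on an interval that its owner has not reached yet."""
    n = len(d)
    return all(d[i][k] <= d[k][k] for i in range(n) for k in range(n))

# Process 1 depends on interval 1 of process 0, which process 0 has reached.
ok = [[1, -1],
      [1, 2]]

# Same dependency, but process 0 is still in interval 0.
bad = [[0, -1],
       [1, 2]]
```

`is_consistent(ok)` holds, while `is_consistent(bad)` does not: in `bad`, process 1 has received a message that process 0, in its recorded state, has not yet sent.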
17
The Model: Logging and Stability
logged(i, σ)
  • The message that started state interval σ of process i has been logged on stable storage
checkpoint(i, σ)
  • A check-point containing the state of process i in interval σ exists on stable storage
  • checkpoint(i, 0) is implicit
Effective check-point for σ on i is checkpoint(i, ε), where ε ≤ σ is maximal
$\mathit{stable}(i, \sigma) \iff \forall \alpha : \epsilon < \alpha \le \sigma \Rightarrow \mathit{logged}(i, \alpha)$
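For a single process, stability can be computed from the set of check-pointed interval indices and the set of intervals whose starting message is logged. This is a sketch under those assumptions; `effective_checkpoint` and `is_stable` are names invented for the example.

```python
def effective_checkpoint(checkpoints, sigma):
    """Largest check-pointed interval index not above sigma; interval 0 is implicit."""
    return max(e for e in checkpoints | {0} if e <= sigma)

def is_stable(checkpoints, logged, sigma):
    """Every interval after the effective check-point, up to sigma, must be logged."""
    eps = effective_checkpoint(checkpoints, sigma)
    return all(alpha in logged for alpha in range(eps + 1, sigma + 1))

checkpoints = {3}            # a check-point was taken in interval 3
logged = {1, 2, 4, 5}        # messages that started these intervals are logged
```

With this data, intervals 2, 3 and 5 are stable, but interval 6 is not, because the message that started it has not reached stable storage. Interval 3 is stable through its check-point even though its starting message was never logged.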
18
The Model: Recoverable System states
Recoverable system state
  • System state is consistent
  • All current process states are stable
  • $D = [\delta_{**}] \in S$
      • $\mathit{recoverable}(D) \iff D \in C \wedge \forall i : \mathit{stable}(i, \delta_{ii})$
  • $R = \{\, D \in S \mid \mathit{recoverable}(D) \,\}$
      • R is a sub-lattice of S – proof straightforward!
Theorem: A single maximum recoverable state exists!
  • Proof
      • $R \subseteq S$
      • $A \sqcup B \in R$ if $A, B \in R$; $A, B \preceq A \sqcup B$
      • Therefore the maximum is $\bigsqcup_{D \in R} D$, obviously unique!
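Recoverability combines the two earlier checks: consistency of the matrix and stability of each diagonal entry. In this sketch the stability test is passed in as a callable, and the per-process sets of stable intervals are made-up example data.

```python
def is_consistent(d):
    n = len(d)
    return all(d[i][k] <= d[k][k] for i in range(n) for k in range(n))

def is_recoverable(d, stable):
    """Consistent, and the current interval of every process is stable."""
    return is_consistent(d) and all(stable(i, d[i][i]) for i in range(len(d)))

# Hypothetical stability information: interval indices known to be stable.
stable_intervals = {0: {0, 1}, 1: {0, 1, 2}}

def stable(i, sigma):
    return sigma in stable_intervals[i]

good = [[1, -1], [1, 2]]        # consistent, both diagonal entries stable
unstable = [[2, -1], [1, 2]]    # consistent, but interval 2 of process 0 is not stable
```

`good` is recoverable; `unstable` is consistent but not recoverable, since process 0 could not be restored to interval 2 after a failure.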
19
The Model: Recoverable System states (cont.)
Current recovery state
  • The maximum recoverable state at any time
  • Never decreases
      • With $D = [\rho_{**}]$ the current recovery state, no state interval σ of any process i with $\sigma \le \rho_{ii}$ is ever rolled back
      • Proof:
          • D will always remain consistent
          • $\rho_{ii}$ will always remain stable
          • Since R is a lattice, any new state formed after D will be greater than D
          • In any new current recovery state: state interval index $\ge \rho_{ii}$ for each process
          • Therefore, no state interval $\sigma \le \rho_{ii}$, for any i, will ever need to be rolled back!
20
The Model: Wrap-up…
Corollary 1: If all messages received are eventually logged, no domino effect occurs
If $D = [\rho_{**}]$ is the current recovery state
  • Corollary 2: Any messages sent by process i from a state interval $\sigma \le \rho_{ii}$ may be committed
  • With $\epsilon_i$ being the effective check-point of $\rho_{ii}$
      • Corollary 3: All previous check-points of process i may be discarded
      • Corollary 4: All messages that begin state intervals prior to $\epsilon_i$ may be discarded
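Corollaries 3 and 4 amount to a garbage-collection rule for one process. The sketch below is a hypothetical illustration: `prune`, its arguments and the sample data are invented here, and check-points and logged messages are represented only by their state interval indices.

```python
def prune(checkpoints, logged, current):
    """Drop what is no longer needed once the recovery state reaches `current`.

    checkpoints: interval indices that have a check-point on stable storage
    logged:      interval indices whose starting message is logged
    current:     this process's diagonal entry in the current recovery state
    """
    eps = max(e for e in checkpoints | {0} if e <= current)   # effective check-point
    kept_checkpoints = {e for e in checkpoints if e >= eps}   # Corollary 3
    kept_log = {a for a in logged if a >= eps}                # Corollary 4
    return eps, kept_checkpoints, kept_log

eps, kept_checkpoints, kept_log = prune({3, 6}, {1, 2, 3, 4, 5, 6, 7}, current=5)
```

With the recovery state at interval 5, the effective check-point is the one for interval 3, so the messages that started intervals 1 and 2 can be dropped while everything from interval 3 onward is kept.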
21
The Algorithm: Overview
Keep a current recovery state
On each new interval σ for some process k becoming stable
  • Try to improve the current recovery state, such that:
      • State of process k advances to σ
      • More state intervals from other processes are added to maintain consistency
  • Succeed if all such included intervals are stable
22
The Algorithm: Basic implementation
Notation
  • D = [ρ**] – the current recovery state
  • σ – state interval of process k becoming stable
  • d_k = <δ*> = <δ1, δ2, δ3, δ4, …, δn>, δj ∈ {⊥, 0, 1, …} – state of process k (dependence vector)
Algorithm
    if (σ > ρ_kk) {
        ∀i : ρ_ki ← δ_i                    // update row k of D with d_k for σ
        while (∃ i, j : ρ_ij > ρ_jj)
            if (∃ α ≥ ρ_ij : stable(j, α))  // α - an interval for j
                ∀i : ρ_ji ← δ_i             // update row j of D with d_j for α
            else fail
    }
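The loop above can be sketched in Python as follows. This is an illustrative version, not the implementation from the paper: the function name, the `vectors` table (dependence vector of every interval of every process) and the `stable` sets are assumptions, `-1` stands for "no dependency", and the function works on a copy so that a failed attempt leaves the recovery state untouched.

```python
def advance_recovery_state(D, k, sigma, vectors, stable):
    """Try to advance recovery state D so that process k is at interval sigma.

    vectors[j][a] is the dependence vector of interval a of process j;
    stable[j] is the set of stable interval indices of process j.
    Returns the new matrix, D itself if sigma is already covered,
    or None if some required interval is not stable yet.
    """
    n = len(D)
    if sigma <= D[k][k]:
        return D
    cand = [list(row) for row in D]
    cand[k] = list(vectors[k][sigma])          # update row k
    while True:
        bad = [(i, j) for i in range(n) for j in range(n)
               if cand[i][j] > cand[j][j]]
        if not bad:
            return cand                        # consistent, all rows stable
        i, j = bad[0]
        choices = [a for a in stable[j] if a >= cand[i][j]]
        if not choices:
            return None                        # fail
        cand[j] = list(vectors[j][min(choices)])  # minimum suitable interval of j

# Two processes: interval 1 of process 0 depends on interval 1 of process 1.
vectors = {0: {0: [0, -1], 1: [1, 1]},
           1: {0: [-1, 0], 1: [0, 1]}}
D0 = [[0, -1], [-1, 0]]

stable = {0: {0, 1}, 1: {0}}
first = advance_recovery_state(D0, 0, 1, vectors, stable)   # interval 1 of p1 not stable

stable[1].add(1)
second = advance_recovery_state(D0, 0, 1, vectors, stable)
```

The first attempt fails because the needed interval of process 1 is not yet stable. Once it is, the second attempt pulls that interval in and returns `[[1, 1], [0, 1]]`. Each pass of the loop strictly raises one diagonal entry, so the loop terminates.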
23
The Algorithm: Some details
The chosen α should be the minimum stable state interval with $\alpha \ge \rho_{ij}$
The comparisons $\rho_{ij} > \rho_{jj}$ can be made in any order without affecting the final result
When state interval σ of process k becomes stable, the algorithm finds some recoverable D with $\rho_{kk} = \sigma$
A stable state interval that was rejected need not be checked again until the current recovery state advances
Corollary: When the recovery state advances from some D to D’, the rejected σ’s that need to be rechecked are those with a direct dependency on some α of any process i such that $\rho_{ii} < \alpha < \rho'_{ii}$
24
The Algorithm: Proof of Correctness
The algorithm presented always finds the current recovery state of the system
  • Only finds recoverable system states
  • Any such state found is greater than the previous one
  • Following the observations stated before, all possible new states are considered
  • Therefore, the correct one is always found!
25
The Algorithm: Optimizations & Implementation
Optimization considerations
  • Keeping work list of rows to update D
      • Keep only the one with max index
  • Keeping only the diagonal of D
Implementation
  • Provided in the paper
  • Follows everything said till now
  • Takes advantage of some specifics
26
Conclusions
General Model and Algorithm
  • Work for both pessimistic and optimistic protocols
  • The pessimistic case does not need that generality
Optimistic logging is desirable from a performance standpoint in low-failure environments
Unifies existing approaches to fault tolerance
  • Check-pointing
  • Message Logging
Results
  • Existence of unique maximum recoverable state
  • Never decreases (progress is being made)
  • Domino effect cannot occur
27
Future work list…
Address non-determinism
  • Switch between
      • Check-pointing for the non-deterministic part
      • Check-pointing + message logging elsewhere
Output-driven optimistic message logging and check-pointing
  • Pay attention to communication of the results
Application specific knowledge