Shyh-In Hwang in Yuan Ze University 1 Chapter 6 Consistency and Replication
Reasons for Replication
– To increase the reliability of a system
  • Survive after one replica crashes
  • Better protection against corrupted data
– Performance
– Scalability
  • Numerous clients
  • The size of the geographical area
Price of Replication
– Consistency of the replicas
  • When and how to carry out modifications to all copies
Object Replication (1)
Organization of a distributed remote object shared by two different clients.
Object Replication (2)
a) A remote object capable of handling concurrent invocations on its own.
b) An object adapter is required to handle concurrent invocations
How to protect the object against simultaneous access by multiple clients
Object Replication (3)
a) A distributed system for replication-aware distributed objects.
b) A distributed system responsible for replica management (more common; simpler for application developers).
Data-Centric Consistency Models
The general organization of a logical data store, physically distributed and replicated across multiple processes.
Consistency Models
– A contract between the processes and the data store
– If the processes agree to obey certain rules, the store promises to work correctly.
Notation
– Wi(x)a: a write by process Pi to data item x with the value a
– Ri(x)b: a read from data item x by process Pi returning the value b
– Data items are initially NIL
Strict Consistency
– The most stringent consistency model
– Definition: any read on a data item x returns a value corresponding to the result of the most recent write on x.
Strict Consistency
(a) Strictly consistent memory
    P1:  W(x)a
    P2:          R(x)a
(b) Not strictly consistent
    P1:  W(x)a
    P2:          R(x)NIL  R(x)a
Strict Consistency
– P2 runs on machine A and P1 runs on machine B; the machines are 3 meters apart.
– T1: W1(x) on machine B; T2: R2(x) on machine A, where T2 = T1 + 1 ns
– To meet the strict consistency requirement, the write would have to propagate at 3 m / 10^-9 s = 3 × 10^9 m/s
– The speed of light is only c = 3 × 10^8 m/s, ten times slower than required
Strict Consistency
– Strict consistency is an ideal programming model; however, it is nearly impossible to implement in a distributed system.
– Strict consistency may violate the laws of physics (Einstein’s special theory of relativity).
– It is impossible in a distributed system to assign a unique timestamp to each operation that corresponds to actual global time.
Programmers can often live with weaker models.
When the order of events is essential, semaphores or other synchronization tools should be used.
Sequential Consistency
– Definition (Lamport, 1979): the result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program.
Sequential Consistency
– That is, when processes run concurrently on (possibly) different machines, any valid interleaving of read and write operations is acceptable behavior, but all processes see the same interleaving of operations.
– Nothing is said about time. There is no reference to the “most recent” write operation on an object.
– A process sees writes from all processes but only its own reads.
Sequential Consistency
Two possible results of running the same program:
(a) P1:  W(x)a
    P2:          R(x)NIL  R(x)a
(b) P1:  W(x)a
    P2:          R(x)a    R(x)a
Sequential Consistency
Sequentially consistent data store:
    P1:  W(x)a
    P2:          W(x)b
    P3:                  R(x)b  R(x)a
    P4:                  R(x)b  R(x)a
Sequential Consistency
Not sequentially consistent:
    P1:  W(x)a
    P2:          W(x)b
    P3:                  R(x)b  R(x)a
    P4:                  R(x)a  R(x)b
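The two four-process examples can be checked mechanically. The following is a minimal brute-force sketch, assuming a single data item x and histories small enough to search exhaustively; the function name `sequentially_consistent` and the history encoding are illustrative choices, not part of the slides.

```python
def sequentially_consistent(history, initial=None):
    """Return True if some interleaving that keeps every process's program
    order lets each read return the most recently written value.

    history maps a process name to its list of operations on x, each
    operation being ("W", value) or ("R", value). NIL is modeled as None.
    """
    procs = list(history)

    def search(positions, current):
        # All operations scheduled: a legal sequential order exists.
        if all(positions[p] == len(history[p]) for p in procs):
            return True
        for p in procs:
            i = positions[p]
            if i == len(history[p]):
                continue
            kind, value = history[p][i]
            if kind == "R" and value != current:
                continue  # this read cannot be the next operation
            nxt = dict(positions)
            nxt[p] = i + 1
            if search(nxt, value if kind == "W" else current):
                return True
        return False

    return search({p: 0 for p in procs}, initial)


consistent = {
    "P1": [("W", "a")],
    "P2": [("W", "b")],
    "P3": [("R", "b"), ("R", "a")],
    "P4": [("R", "b"), ("R", "a")],
}
inconsistent = dict(consistent, P4=[("R", "a"), ("R", "b")])

print(sequentially_consistent(consistent))
print(sequentially_consistent(inconsistent))
```

In the second history P3 needs W(x)b before W(x)a while P4 needs the opposite, so no single interleaving satisfies both readers.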
Linearizability
– Definition: the definition of sequential consistency, plus:
  • If ts_OP1(x) < ts_OP2(y), then operation OP1(x) should precede OP2(y) in this sequence
– Ordering according to a set of loosely synchronized clocks with only finite precision
– Stronger than sequential consistency
Linearizability
– Why linearizability?
  • To assist formal verification of concurrent algorithms
Sequential Consistency
Example: consider the following three parallel processes. Each statement is assumed to be indivisible (atomic). How many possible execution sequences are there?

    Process P1        Process P2        Process P3
    x = 1;            y = 1;            z = 1;
    print(y, z);      print(x, z);      print(x, y);
Sequential Consistency

    Process P1        Process P2        Process P3
    x = 1;            y = 1;            z = 1;
    print(y, z);      print(x, z);      print(x, y);

– x, y, and z are initially 0.
– Number of statement orderings that respect program order: 6! / (2 · 2 · 2) = 90
Four Valid Results Accepted by Sequential Consistency
– P1 executes both statements before P2 or P3 starts
– P3 must complete before P1 starts

    (a)                 (b)                 (c)                 (d)
    x = 1;              x = 1;              y = 1;              y = 1;
    print(y, z);        y = 1;              z = 1;              x = 1;
    y = 1;              print(x, z);        print(x, y);        z = 1;
    print(x, z);        print(y, z);        print(x, z);        print(x, z);
    z = 1;              z = 1;              x = 1;              print(y, z);
    print(x, y);        print(x, y);        print(y, z);        print(x, y);

    Prints:    001011   Prints:    101011   Prints:    010111   Prints:    111111
    Signature: 001011   Signature: 101011   Signature: 110101   Signature: 111111
Sequential Consistency
– Signature: a 6-bit string of the output of P1, P2, P3 in that order, which characterizes a particular interleaving of statements
– 90 different valid statement orderings
– Fewer than 64 valid program results under sequential consistency
  • 000000 is not permitted
  • 001001 is not permitted
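The count of 90 orderings and the forbidden signatures can be reproduced by enumeration. This is a small sketch, assuming the three two-statement programs of the example with x, y, and z initially 0; the names `PROGRAMS`, `ORDERS`, and `SIGNATURES` are illustrative.

```python
from itertools import permutations

# Each process runs an assignment followed by a print of the other two variables.
PROGRAMS = {
    1: [("set", "x"), ("print", ("y", "z"))],
    2: [("set", "y"), ("print", ("x", "z"))],
    3: [("set", "z"), ("print", ("x", "y"))],
}


def signature(order):
    """Execute one interleaving (a sequence of process ids) on shared memory
    and return the outputs of P1, P2, P3 concatenated in that order."""
    mem = {"x": 0, "y": 0, "z": 0}
    pc = {1: 0, 2: 0, 3: 0}
    out = {1: "", 2: "", 3: ""}
    for p in order:
        kind, arg = PROGRAMS[p][pc[p]]
        pc[p] += 1
        if kind == "set":
            mem[arg] = 1
        else:
            out[p] = "".join(str(mem[v]) for v in arg)
    return out[1] + out[2] + out[3]


# Distinct arrangements of two steps per process; the k-th occurrence of a
# process id is that process's k-th statement, so program order is preserved.
ORDERS = sorted(set(permutations([1, 1, 2, 2, 3, 3])))
SIGNATURES = {signature(o) for o in ORDERS}

print(len(ORDERS))
print("000000" in SIGNATURES, "001001" in SIGNATURES)
```

Because all three processes share one memory and one global order, a signature such as 001001 would require x = 1 to happen both before and after P3's print, which no interleaving can provide.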
To Express Sequential Consistency
– Ahamad et al. (1993)
– Executions on x, one per process:
  • E1: W1(x)a
  • E2: W2(x)b
  • E3: R3(x)b, R3(x)a
  • E4: R4(x)b, R4(x)a
– H: history
  • Program order must be maintained
  • Data coherence must be respected: R(x) must return the value most recently written to x (most recent in the ordering, not in time)
– H = W2(x)b, R3(x)b, R4(x)b, W1(x)a, R3(x)a, R4(x)a
Sequential Consistency
– Problem of sequential consistency: poor performance (Lipton & Sandberg, 1988)
  • r: read time
  • w: write time
  • t: minimal packet transfer time between nodes
  • r + w ≥ t
– For any sequentially consistent memory, changing the protocol to improve the read performance makes the write performance worse, and vice versa.
Causal Consistency
– Consider the USENET posting example:
  • If the “answer message” arrives before the “question message,” the causal consistency rules are violated.
– Examples
  • When a read is followed later by a write, the two events are potentially causally related.
  • A read is causally related to the write that provided the data the read got.
  • If two processes spontaneously and simultaneously write two variables, these are not causally related.
– Operations that are not causally related are said to be concurrent.
Causal Consistency
– According to Hutto and Ahamad (1990), a causally consistent memory should obey the following condition:
  • Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.
– Overhead: implementing causal consistency requires keeping track of which processes have seen which writes.
Causal Consistency
    P1:  W(x)a                    W(x)c
    P2:          R(x)a   W(x)b
    P3:          R(x)a                    R(x)c  R(x)b
    P4:          R(x)a                    R(x)b  R(x)c
– W(x)b and W(x)c are concurrent.
– This sequence is allowed with causally consistent memory, but not with sequentially consistent memory or strictly consistent memory.
Causal Consistency
A violation of causal memory:
    P1:  W(x)a
    P2:          R(x)a   W(x)b
    P3:                          R(x)b  R(x)a    (wrong order)
    P4:                          R(x)a  R(x)b
– W(x)a and W(x)b are potentially causally related.
Causal Consistency
A correct sequence of events in causal memory (concurrent writes can be seen in a different order on different machines):
    P1:  W(x)a
    P2:          W(x)b
    P3:                  R(x)b  R(x)a
    P4:                  R(x)a  R(x)b
– W(x)a and W(x)b are concurrent.
FIFO Consistency
– PRAM (Pipelined RAM) consistency
  • Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.
  • Easy to implement
  • Requires only that writes originating in a single process be seen everywhere in order
  • All writes generated by different processes are concurrent
FIFO Consistency
– Key difference between sequential and PRAM consistency
  • Sequential consistency: although the order of statement execution and memory references is non-deterministic, at least all processes agree on what it is
  • PRAM consistency: they need not agree
FIFO Consistency
A valid sequence of events for PRAM consistency:
    P1:  W(x)a
    P2:          R(x)a   W(x)b   W(x)c
    P3:                                  R(x)b  R(x)a  R(x)c
    P4:                                  R(x)a  R(x)b  R(x)c
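The PRAM rule can be phrased as a simple per-reader check. The sketch below is a simplification under stated assumptions: a single data item, every written value is unique so a read can be traced back to its write, and only the "writes of one process are seen in issue order" rule is tested. The name `fifo_consistent` is illustrative.

```python
def fifo_consistent(writes, reads):
    """writes: process -> values it wrote, in issue order (values unique).
    reads: process -> values it read, in the order it read them.
    Returns False if some reader sees two writes of one writer out of order."""
    origin = {v: (p, i) for p, vals in writes.items() for i, v in enumerate(vals)}
    for seen in reads.values():
        newest = {}  # writer -> index of the newest write seen so far
        for value in seen:
            writer, index = origin[value]
            if index < newest.get(writer, -1):
                return False  # an older write of this writer appeared later
            newest[writer] = index
    return True


writes = {"P1": ["a"], "P2": ["b", "c"]}
valid_reads = {"P2": ["a"], "P3": ["b", "a", "c"], "P4": ["a", "b", "c"]}
invalid_reads = {"P3": ["c", "b"]}

print(fifo_consistent(writes, valid_reads))
print(fifo_consistent(writes, invalid_reads))
```

P3 and P4 disagree on where W(x)a falls relative to P2's writes, which is allowed here because writes from different processes are treated as concurrent.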
FIFO Consistency
– 001001 is impossible with sequential consistency

    Seen by P1 (a)      Seen by P2 (b)      Seen by P3 (c)
    x = 1;              x = 1;              y = 1;
    print(y, z);        y = 1;              print(x, z);
    y = 1;              print(x, z);        z = 1;
    print(x, z);        print(y, z);        print(x, y);
    z = 1;              z = 1;              x = 1;
    print(x, y);        print(x, y);        print(y, z);

    Prints: 00          Prints: 10          Prints: 01
FIFO Consistency
– Consider two parallel processes P1 and P2 running under different consistency models
  • Sequentially consistent: P1 is killed, P2 is killed, or neither is killed
  • PRAM consistent: both processes can be killed

    P1:                         P2:
    x = 1;                      y = 1;
    if (y == 0) kill(P2);       if (x == 0) kill(P1);
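The difference between the two models for this program can be simulated. The sketch below makes two modeling assumptions that are not in the slides: a killed process executes no further statements, and under PRAM each process works on a local copy whose remote updates may be delivered after both tests have run. All function names are illustrative.

```python
from itertools import permutations


def run_sequential(order):
    """Run one interleaving on a single shared memory.
    Returns the set of processes that were killed."""
    mem = {"x": 0, "y": 0}
    pc = {"P1": 0, "P2": 0}
    killed = set()
    for p in order:
        if p in killed:
            continue  # a killed process executes nothing further
        step = pc[p]
        pc[p] += 1
        if p == "P1":
            if step == 0:
                mem["x"] = 1
            elif mem["y"] == 0:
                killed.add("P2")
        else:
            if step == 0:
                mem["y"] = 1
            elif mem["x"] == 0:
                killed.add("P1")
    return frozenset(killed)


SEQUENTIAL_OUTCOMES = {
    run_sequential(o) for o in set(permutations(["P1", "P1", "P2", "P2"]))
}


def run_pram_delayed():
    """Each process updates its local copy first; delivery of the write to the
    other process is delayed until after both tests have been evaluated."""
    local = {"P1": {"x": 0, "y": 0}, "P2": {"x": 0, "y": 0}}
    local["P1"]["x"] = 1  # P1's write, not yet visible at P2
    local["P2"]["y"] = 1  # P2's write, not yet visible at P1
    killed = set()
    if local["P1"]["y"] == 0:
        killed.add("P2")
    if local["P2"]["x"] == 0:
        killed.add("P1")
    return frozenset(killed)


print(sorted(sorted(o) for o in SEQUENTIAL_OUTCOMES))
print(sorted(run_pram_delayed()))
```

With one shared order, whichever assignment comes first is visible to the other process's test, so at most one process can be killed; with per-process views both tests can still read 0.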
Problem of FIFO Consistency
– Not all applications require even seeing all writes, let alone seeing them in order
– E.g., a process inside a critical section reading and writing some variables in a tight loop
  • Other processes are not supposed to touch the variables until the first process has left its critical section
  • The memory has no way of knowing when a process is in a critical section and when it is not, so it has to propagate all writes to all memories in the usual way
– A better solution is to let the process finish its critical section and then make sure that the final results are sent everywhere => synchronization variable
Weak Consistency
– Synchronization variable
  • Used to synchronize memory
  • When a synchronization completes, all writes done on that machine are propagated outward and all writes done on other machines are brought in.
  • All of shared memory is synchronized
Weak Consistency
– Property 1: Accesses to synchronization variables associated with a data store are sequentially consistent
  • All processes see all accesses to synchronization variables in the same order.
Weak Consistency
– Property 2: No operation on a synchronization variable is allowed to be performed until all previous writes have been completed everywhere
  • Accessing a synchronization variable flushes the pipeline (writes).
Weak Consistency
– Property 3: No read or write operation on data items is allowed to be performed until all previous operations on synchronization variables have been performed.
  • By doing a synchronization before reading shared data, a process can be sure of getting the most recent values.
Weak Consistency
– Most useful when isolated accesses to shared variables are rare, with most coming in clusters (many accesses in a short period, then none for a long time).
– We limit only the time when consistency holds
– With weak consistency, sequential consistency is enforced on groups of operations instead of on individual operations.
Weak Consistency
Having memory be wrong is acceptable. Only when the function f is called does the compiler have to put the current values of a and b back in memory.

    int a, b, c, d, e, x, y;      /* variables */
    int *p, *q;                   /* pointers */
    int f(int *p, int *q);        /* function prototype */

    a = x * x;                    /* a stored in register */
    b = y * y;                    /* b as well */
    c = a*a*a + b*b + a*b;        /* used later */
    d = a * a * c;                /* used later */
    p = &a;                       /* p gets address of a */
    q = &b;                       /* q gets address of b */
    e = f(p, q);                  /* function call */
Weak Consistency
(a) A valid sequence of events for weak consistency
    P1:  W(x)a  W(x)b  S
    P2:                    R(x)a  R(x)b  S
    P3:                    R(x)b  R(x)a  S
(b) An invalid sequence of events for weak consistency
    P1:  W(x)a  W(x)b  S
    P2:                    S  R(x)a    (should be b)
Release Consistency
– Gharachorloo et al., 1990
  • Two kinds of synchronization variables or operations are provided instead of the single one in weak consistency.
    – acquire: a critical region is about to be entered.
    – release: a critical region has just been exited.
  • Able to tell the difference between entering and leaving a critical region, thus more efficient than weak consistency.
– Acquire and release do not have to apply to all of memory. Instead, they may guard only specific shared variables.
  • The shared variables that are kept consistent are said to be protected.
Release Consistency
A valid event sequence for release consistency:
    P1:  Acq(L)  W(x)a  W(x)b  Rel(L)
    P2:                                 Acq(L)  R(x)b  Rel(L)
    P3:                                                         R(x)a
Release Consistency
– It is also possible to use barriers instead of critical regions with release consistency
  • A barrier is a synchronization mechanism that prevents any process from starting phase n + 1 of a program until all processes have finished phase n.
  • When a process arrives at a barrier, it must wait until all other processes get there too. When the last one arrives, all shared variables are synchronized and then all processes are resumed.
  • Departure from the barrier is done on an acquire and arrival is done on a release.
Release Consistency
– When the software does an acquire:
  • All the local copies of the protected variables are brought up to date to be consistent with the remote ones if need be.
  • Doing an acquire does not guarantee that locally made changes will be sent to other machines immediately.
– When the release is done:
  • Protected variables that have been changed are propagated out to other machines.
  • Doing a release does not necessarily import changes from other machines.
Release Consistency
– Rules:
  • Before a read or write operation on shared data is performed, all previous acquires done by the process must have completed successfully.
  • Before a release is allowed to be performed, all previous reads and writes by the process must have completed.
  • Accesses to synchronization variables are FIFO consistent (sequential consistency is not required).
Eager vs. Lazy Release Consistency
– Eager release consistency
  • When a release is done, the process doing the release pushes out all the modified data to all other processes that already have a copy.
Eager vs. Lazy Release Consistency
– Lazy release consistency (Keleher et al., 1992)
  • At the time of release, nothing is sent anywhere.
  • When an acquire is done, the process trying to do the acquire has to get the most recent values of the data from the process or processes holding them.
  • A timestamp protocol can be used to determine which variables have to be transmitted.
  • More efficient: no network traffic until another process does an acquire
  • Repeated acquire-release pairs by the same process in the absence of outside competition are free.
  • E.g., a critical region located inside a loop
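The message-count difference between the two variants can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a real protocol: one lock guards all data, eager release pushes to every other node rather than only to nodes holding a copy, and lazy acquire pulls the whole copy from the previous releaser instead of using timestamps. The classes `Node` and `Cluster` and the function `demo` are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}   # local copy of the protected variables
        self.dirty = {}  # changes made inside the current critical region


class Cluster:
    def __init__(self, names, lazy):
        self.nodes = {n: Node(n) for n in names}
        self.lazy = lazy
        self.last_releaser = None
        self.messages = 0  # number of data transfers between nodes

    def acquire(self, name):
        src = self.last_releaser
        if self.lazy and src is not None and src != name:
            # Lazy: fetch the most recent values from the previous holder.
            self.nodes[name].data.update(self.nodes[src].data)
            self.messages += 1

    def write(self, name, var, value):
        node = self.nodes[name]
        node.data[var] = value
        node.dirty[var] = value

    def release(self, name):
        node = self.nodes[name]
        if not self.lazy and node.dirty:
            # Eager: push the modified data to every other node right away.
            for other in self.nodes.values():
                if other is not node:
                    other.data.update(node.dirty)
                    self.messages += 1
        node.dirty.clear()
        self.last_releaser = name


def demo(lazy):
    """A critical region inside a loop on P1, then one acquire by P2."""
    cluster = Cluster(["P1", "P2", "P3"], lazy)
    for i in range(5):
        cluster.acquire("P1")
        cluster.write("P1", "x", i)
        cluster.release("P1")
    in_loop = cluster.messages
    cluster.acquire("P2")
    return in_loop, cluster.messages, cluster.nodes["P2"].data["x"]


print(demo(lazy=False))
print(demo(lazy=True))
```

In this model the loop costs ten transfers in the eager variant and none in the lazy one, while P2 reads the same final value of x in both.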
Entry Consistency
– Each synchronization variable has a current owner (process).
– The owner can enter and exit critical sections repeatedly without having to send any messages on the network.

    for loop { … Acq(Lx); W(x)I; Rel(Lx); … }
Entry Consistency
– Other processes have to ask the owner to transfer ownership and to supply the latest values.
  • Effect: all accesses are sequentially consistent
– Several processes may simultaneously own a synchronization variable in nonexclusive mode (for read, but not write).
Rules for Entry Consistency (1)
– An acquire access of a synchronization variable is not allowed to perform with respect to a process until all updates to the guarded shared data have been performed with respect to that process.
  • At an acquire, all guarded shared data have been brought up to date.
Rules for Entry Consistency (2)
– Before an exclusive mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode.
  • Before updating a shared data item, a process must enter a critical region in exclusive mode so that no other process can update it at the same time.
Rules for Entry Consistency (3)
– After an exclusive mode access to a synchronization variable has been performed, any other process's next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable's owner.
  • If a process wants to enter a critical region in nonexclusive mode, it must check with the owner to fetch the most recent copies of the guarded shared data.
Entry Consistency
A valid event sequence for entry consistency.
Entry Consistency
  • Requires each ordinary shared data item to be associated with some synchronization variable.
  • When an acquire is done on a synchronization variable, only those data guarded by that synchronization variable are made consistent.
  • By contrast, lazy release consistency does not associate shared data items with locks or barriers, and at acquire time it has to determine empirically which variables it needs.
Entry Consistency
– Advantages
  • Reduces the overhead associated with acquiring and releasing a synchronization variable, since only a few shared data items have to be synchronized.
  • Allows multiple critical sections involving disjoint shared data to execute simultaneously, increasing the amount of parallelism.
– Disadvantages
  • Extra overhead and complexity of associating every shared data item with some synchronization variable.
  • Programming this way is also more complicated and error prone.
Summary of Consistency Models
– Strict
  • Interleaving according to absolute time
  • All processes see the same interleaving
[Figure: operations 1 to 5 issued by P1 and P2 on a time line, and the single interleaving that both P1 and P2 see]
Summary of Consistency Models
– Sequential
  • A process sees writes from all processes, but only its own reads
  • Time does not play a role
  • Any valid interleaving is OK
  • All processes see the same interleaving
[Figure: operations 1 to 5 issued by P1 and P2 (marked operations are writes), and the interleaving that both P1 and P2 see]
Summary of Consistency Models
– Linearizable
  • Similar to strict consistency, but uses synchronized clocks rather than absolute time
  • Interleaving according to synchronized clocks
  • All processes see the same interleaving
[Figure: operations 1 to 5 of P1 and P2 in absolute time and after clock synchronization, and the interleaving that both P1 and P2 see]
Summary of Consistency Models
– Causal
  • Different processes may see different interleavings
[Figure: operations 1 to 5 of P1 and P2 (marked operations are writes) and the interleavings seen by P1 and by P2; labels: potentially causally related, concurrent]
Summary of Consistency Models
– FIFO
  • All writes generated by different processes are concurrent
  • Any valid interleaving is OK
  • Different processes may see different interleavings
[Figure: operations 1 to 5 of P1 and P2 (marked operations are writes), and the different interleavings seen by P1 and by P2]
Summary of Consistency Models
Consistency models not using synchronization operations.

    Consistency      Description
    Strict           Absolute time ordering of all shared accesses matters.
    Linearizability  All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp.
    Sequential       All processes see all shared accesses in the same order. Accesses are not ordered in time.
    Causal           All processes see causally-related shared accesses in the same order.
    FIFO             All processes see writes from a single process in the order they were issued. Writes from different processes may not always be seen in that order.
Summary of Consistency Models
Models with synchronization operations.

    Consistency  Description
    Weak         Shared data can be counted on to be consistent only after a synchronization is done.
    Release      Shared data are made consistent when a critical region is exited.
    Entry        Shared data pertaining to a critical region are made consistent when a critical region is entered.
Client-Centric Consistency Models
– Characteristics
  • Lack of simultaneous updates
  • Most operations are reads
– A very weak consistency model: eventual consistency
– Can be relatively cheap to hide inconsistencies
Eventual Consistency
– DNS
  • Only one authority is allowed to update its domain
  • No write-write conflicts
  • Read-write conflicts: acceptable to propagate updates in a lazy fashion
Eventual Consistency
– WWW
  • Only one authority is allowed to update its site
  • No write-write conflicts
  • Browsers and Web proxies cache pages, which may be out of date
Eventual Consistency
– Inconsistency exists and is tolerable in large-scale distributed and replicated databases
– Eventual consistency
  • If no updates take place for a long time, all replicas will gradually become consistent.
Eventual Consistency
The principle of a mobile user accessing different replicas of a distributed database.
Eventual Consistency
– Client-centric consistency
  • Guarantees for a single client concerning the consistency of accesses to a data store by that client
  • No guarantees concerning concurrent accesses by different clients
Client-Centric Consistency
– How we take care of “eventual”:
  • Monotonic reads consistency
  • Monotonic writes consistency
  • Read your writes consistency
  • Writes follow reads consistency
Monotonic Reads
– Definition
  • If a process reads the value of a data item x, any successive read operation on x by the same process will always return that same value or a more recent value.
– E.g., a user reads emails in San Francisco. Later, he flies to NYC and opens his mailbox again.
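One way to picture the guarantee is a client session that remembers how recent its last read was. The sketch below assumes each replica carries a single increasing version number for x and that a stale replica simply refuses the read; both are illustrative assumptions, and `MailReplica` and `Session` are hypothetical names.

```python
class MailReplica:
    """One local copy of the data store, holding a versioned value of x."""

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.value = []

    def apply(self, version, value):
        # Install an update only if it is newer than what this copy holds.
        if version > self.version:
            self.version, self.value = version, list(value)


class Session:
    """Per-process state used to enforce monotonic reads."""

    def __init__(self):
        self.last_seen = 0  # version returned by the most recent read

    def read(self, replica):
        if replica.version < self.last_seen:
            raise RuntimeError(f"{replica.name} is stale for this process")
        self.last_seen = replica.version
        return replica.value


sf, nyc = MailReplica("San Francisco"), MailReplica("NYC")
sf.apply(1, ["mail from David"])

session = Session()
print(session.read(sf))

try:
    session.read(nyc)  # NYC has not yet received version 1
except RuntimeError as err:
    print("refused:", err)

nyc.apply(2, ["mail from David", "mail from Linda"])
print(session.read(nyc))
```

A real store would more likely forward the read or bring the local copy up to date rather than refuse it; refusing keeps the sketch short.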
Monotonic Reads
The read operations performed by a single process P at two different local copies of the same data store.
a) A monotonic-read consistent data store.
b) A data store that does not provide monotonic reads.
– WS(x1;x2): WS(x1) is part of WS(x2)
[Figure: the annotations use mailbox contents as the example, 1: mail from David and 2: mail from Linda]
Monotonic Writes
– Definition
  • A write operation by a process on a data item x is completed before any successive write operation on x by the same process.
– E.g., updates on a software library
  • Required when part of the state is updated
  • Not necessary when each write operation completely overwrites the value of x
Monotonic Writes
– Monotonic writes resembles data-centric FIFO consistency
  • Write operations by the same process are performed in the correct order everywhere
– Difference: we only consider consistency for a single process here rather than concurrent processes as in FIFO
Monotonic Writes
The write operations performed by a single process P at two different local copies of the same data store.
a) A monotonic-write consistent data store.
b) A data store that does not provide monotonic-write consistency.
[Figure: the annotations use software updates as the example, "Update to Ver1.01" followed by "Update to Ver1.02"]
Monotonic Writes
– A weaker form
  • The effects of a write operation are seen only if all preceding writes have been carried out, but perhaps not in the order in which they were originally initiated.
  • Applicable when write operations are commutative
Read Your Writes
– Definition
  • The effect of a write operation by a process on a data item x will always be seen by a successive read operation on x by the same process.
– E.g., absence of read-your-writes consistency
  • “Read your writes” is not guaranteed when a web browser caches an old copy.
  • Updating passwords may take time
Read Your Writes
a) A data store that provides read-your-writes consistency.
b) A data store that does not.
Writes Follow Reads
– Definition
  • A write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than the one that was read.
– Can guarantee that users see a posting of a reaction to an article only after they have seen the original article.
Writes Follow Reads
a) A writes-follow-reads consistent data store.
b) A data store that does not provide writes-follow-reads consistency.
[Figure: the annotations use a question and its answer as the example, Q: "How is the weather today?" and A: "It is bright and sunny today."]
Distribution Protocols
– Replica placement
– Update propagation
– Epidemic protocols
Replica Placement
The logical organization of different kinds of copies of a data store into three concentric rings.
Permanent Replicas
– Number of permanent replicas is small
– Distribution of web servers: more or less statically configured
  • LAN
  • Mirroring: mirror sites
Server-Initiated Replicas
– Push caches: to install temporary replicas in regions where sudden bursts of requests are coming from
– Web hosting services
  • Dynamically replicating files to servers close to clients
Server-Initiated Replicas
Counting access requests from different clients.
– P is the server in the Web hosting service near C1 and C2 (found by looking up the routing database).
Server-Initiated Replicas
– reqQ(F): number of requests for file F at server Q; delQ(F): deletion threshold; repQ(F): replication threshold

    Action           Condition                                       How
    To remove F      reqQ(F) < delQ(F) and F is not the last copy    Remove F from Q
    To migrate F     repQ(F) > reqQ(F) > delQ(F)                     If cntQ(P, F) > 0.5 * reqQ(F), then migrate F from Q to P
    To replicate F   reqQ(F) > repQ(F)                               If cntQ(P, F) > Threshold * reqQ(F), then replicate F to P
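The three rows of the table amount to a small decision procedure run by server Q for each file F. The sketch below is one possible reading: the function name, the return strings, the default of 0.3 for the unspecified "Threshold", and the handling of the exact boundary values are all assumptions made for illustration.

```python
def placement_action(req, cnt_p, del_threshold, rep_threshold,
                     is_last_copy, rep_fraction=0.3):
    """Decide what server Q does with file F.

    req            -- reqQ(F), requests for F counted at Q
    cnt_p          -- cntQ(P, F), requests for F at Q that P is closest to
    del_threshold  -- delQ(F)
    rep_threshold  -- repQ(F)
    rep_fraction   -- the slide's unspecified "Threshold" (assumed value)
    """
    if req < del_threshold:
        # Too few requests: drop the file unless it is the last copy.
        return "keep" if is_last_copy else "remove"
    if req > rep_threshold:
        # Heavily requested: replicate toward P if P accounts for enough traffic.
        return "replicate to P" if cnt_p > rep_fraction * req else "keep"
    # Between the two thresholds only migration is considered.
    if cnt_p > 0.5 * req:
        return "migrate to P"
    return "keep"


print(placement_action(req=5, cnt_p=1, del_threshold=10, rep_threshold=100, is_last_copy=False))
print(placement_action(req=50, cnt_p=30, del_threshold=10, rep_threshold=100, is_last_copy=False))
print(placement_action(req=200, cnt_p=100, del_threshold=10, rep_threshold=100, is_last_copy=False))
```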
Client-Initiated Replicas
– Client caches
  • To improve access time to data
  • Kept for a limited time
– A nearby client cache might be shared between clients
  • On the same LAN
  • In the same department
  • In the same region
– To place cache servers at specific points in a WAN
Update Propagation
– Update operations on a distributed and replicated data store are generally
  • Initiated at a client
  • Forwarded to one of the copies
  • Propagated from there to the other copies
Update Propagation
– What is propagated
  • A notification of an update
    – Invalidation protocol
    – Little bandwidth
    – Best when the read/write ratio is low
  • Transfer data from one copy to another
    – Best when the read/write ratio is high
  • Propagate the update operation to other copies
    – Active replication
    – Performs updates at minimum bandwidth cost
    – Requires more processing power
Pull versus Push Protocols
– Push-based approach
  • Server-based protocols
  • Often used between permanent replicas and server-initiated replicas
  • Good when the read/update ratio is high
Pull versus Push Protocols
– Pull-based approach
  • Client-based protocols
  • Often used by client caches
  • The client polls the server to see whether an update is needed
  • Good when the read/update ratio is low
  • Drawback: the response time increases in the case of a cache miss
Pull versus Push Protocols
A comparison between push-based and pull-based protocols in the case of multiple-client, single-server systems.

    Issue                    Push-based                                                        Pull-based
    State of server          List of client replicas and caches                                None
    Messages sent            Update (and possibly fetch update later if invalidation is used)  Poll and update
    Response time at client  Immediate (or fetch-update time if invalidation is used)          Fetch-update time
Pull versus Push Protocols
– Lease
  • A promise that the server will push updates to the client for a specified time
  • Dynamically switching between push-based and pull-based strategies
Lease
– (1) Age-based leases
  • A long-lasting lease on data items that are expected to remain unmodified can reduce the number of (lease) update messages
– (2) Renewal-frequency based leases
  • Long-lasting lease to clients whose cache often needs to be refreshed
  • Short-term lease for clients who request only occasionally
  • The server takes care of clients where its data are popular
– (3) State-space based leases
  • Lowers the expiration time of new leases as the server gradually becomes overloaded
  • Dynamically switches to a more stateless mode
Unicasting vs Multicasting
– Cheaper to use multicasting facilities
  • E.g., all replicas are in the same LAN
– Multicasting + push-based approach
– Unicasting + pull-based approach
Epidemic Protocols
– Update propagation in eventually consistent data stores is often implemented by epidemic protocols
– Propagating updates to all replicas in as few messages as possible
  • Advantage: scalability
  • Updates are often aggregated into a single message
Anti-Entropy Propagation Model
– A server P picks server Q at random, and exchanges updates with Q
  • P only pushes its own updates to Q
  • P only pulls in new updates from Q
  • P and Q send updates to each other (push-pull approach): more effective
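The three exchange styles can be compared in a small simulation. This is a sketch under stated assumptions: rounds are synchronous, servers take turns within a round, every server picks one uniformly random peer per round, and a single update "u1" starts at server 0. The function names and the round limit are illustrative.

```python
import random


def anti_entropy_round(state, rng, mode="push-pull"):
    """state[i] is the set of update ids known to server i.
    Every server P picks one random peer Q and exchanges updates with it."""
    n = len(state)
    for p in range(n):
        q = rng.choice([i for i in range(n) if i != p])
        if mode in ("push", "push-pull"):
            state[q] |= state[p]  # P pushes what it knows to Q
        if mode in ("pull", "push-pull"):
            state[p] |= state[q]  # P pulls what Q knows


def rounds_until_consistent(n, mode, seed=0, limit=10_000):
    """Count rounds until all n servers hold the same set of updates."""
    rng = random.Random(seed)
    state = [set() for _ in range(n)]
    state[0].add("u1")
    rounds = 0
    while any(s != state[0] for s in state) and rounds < limit:
        anti_entropy_round(state, rng, mode)
        rounds += 1
    return rounds, state


for mode in ("push", "pull", "push-pull"):
    rounds, _ = rounds_until_consistent(16, mode)
    print(mode, rounds)
```

The round counts depend on the random seed, so no particular numbers are claimed here; running the loop with several seeds gives a feel for how the three modes compare.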
Rumor Spreading (Gossiping)
– To ensure that updates are spread quickly => pushing updates to a number of servers
– Method
  • An infective server P contacts an arbitrary other server Q and tries to push the update to Q
  • If Q is already updated, P stops spreading with some probability
– A really good way to rapidly spread updates
Epidemic Protocols
– Removing data is hard
  • E.g., a chain letter
  • A server will easily be infected again: deletion of a data item destroys all information on the item. When receiving an old copy, a server will interpret it as an update on something it did not have before.
– Solution: death certificate
  • Deletion of a data item is recorded as an update
  • Old copies are treated as old versions, and will not be spread
Epidemic Protocols
– Problem of death certificates
  • The certificates will pile up
  • They should be cleaned up eventually
– Solution: dormant death certificates
  • Death certificates are timestamped with their creation time
  • Assumption: updates propagate to all servers within a known finite time. The death certificates are then removed.
  • Keep a few dormant death certificates
  • Spread death certificates again when needed
Consistency Protocols
– Describes an implementation of a specific consistency model
– Primary-based protocols
  • Remote-write protocols
    – Primary-based remote-write protocol (single copy)
    – Primary-backup protocol (multiple copies)
  • Local-write protocols: the primary copy migrates
    – Primary-based local-write (single copy)
    – Primary-backup (multiple copies)
– Replicated-write protocols
  • Active replication (all copies)
  • Quorum-based protocols (some copies only)
Remote-Write Protocols (1): Primary-based remote-write protocol
Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.
Remote-Write Protocols (2): Primary-backup protocol
– Implementation of sequential consistency
– Processes always see the effects of the most recent write operations
Local-Write Protocols (1): Primary-based local-write protocol
A single copy is migrated between processes.
Local-Write Protocols (1): Primary-based local-write protocol
– A fully distributed, non-replicated version of the data store
– Problem: location of data items
  • Local area: broadcasting
  • Forwarding pointers
  • Home-based approaches
  • Large-scale and widely distributed data stores: hierarchical location service
Local-Write Protocols (2): Primary-backup protocol
The primary migrates to the process wanting to perform an update.
Local-Write Protocols (2): Primary-backup protocol
– Advantage
  • Perform write operations locally, while reading processes can still access their local copies
– Mobile computers
  • The mobile computer becomes the primary server
  • Disconnected: all updates are performed locally; other processes can still perform read operations
  • Connected: propagation of updates from the primary to the backups
Replicated-Write Protocols
– Write operations can be carried out at multiple replicas, instead of only one
  • Active replication: an operation is forwarded to all replicas
  • Quorum-based: majority voting
Active Replication
– Operations need to be carried out in the same order everywhere
  • Lamport timestamps: do not scale well in large distributed systems
  • Sequencer: a central coordinator that assigns a unique sequence number (compare primary-based consistency protocols)
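The sequencer idea can be sketched in a few lines. The code below is an illustration under simplifying assumptions: a single in-process sequencer that never fails, operations modeled as functions on an account balance, and replicas that buffer out-of-order deliveries. The class names `Sequencer` and `OrderedReplica` are hypothetical.

```python
import itertools


class Sequencer:
    """Central coordinator that hands out unique, increasing sequence numbers."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, operation):
        return next(self._counter), operation


class OrderedReplica:
    """Applies operations strictly in sequence-number order."""

    def __init__(self):
        self.balance = 0
        self.next_seq = 1
        self.pending = {}  # operations that arrived too early

    def deliver(self, seq, operation):
        self.pending[seq] = operation
        # Apply every operation whose predecessors have all been applied.
        while self.next_seq in self.pending:
            self.balance = self.pending.pop(self.next_seq)(self.balance)
            self.next_seq += 1


sequencer = Sequencer()
deposit = sequencer.stamp(lambda balance: balance + 100)  # gets number 1
double = sequencer.stamp(lambda balance: balance * 2)     # gets number 2

r1, r2 = OrderedReplica(), OrderedReplica()
for message in (deposit, double):  # network delivers in order to r1
    r1.deliver(*message)
for message in (double, deposit):  # and in reverse order to r2
    r2.deliver(*message)

print(r1.balance, r2.balance)
```

Without the sequence numbers r2 would double first and then deposit, ending at 100 instead of 200; the buffering step is what keeps the replicas identical.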
Active Replication (1)
The problem of replicated invocations.
Transfer $100,000
Active Replication (2)
a) Forwarding an invocation request from a replicated object.
b) Returning a reply to a replicated object.
– The invocation request is assigned the same unique identifier by each replica of B
Quorum-Based Protocols
Three examples of the voting algorithm:
a) A correct choice of read and write set
b) A choice that may lead to write-write conflicts
c) A correct choice, known as ROWA (read one, write all)
Quorum-Based Protocols
– NR: read quorum; NW: write quorum
– All servers in the write quorum get the new version of the data and the new version number
– NR + NW > N: to prevent read-write conflicts
– NW > N/2: to prevent write-write conflicts
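The two quorum constraints are easy to check for a given configuration. The sketch below uses N = 12 with three illustrative choices of NR and NW picked to match the three cases named above (correct, write-write conflict possible, ROWA); the specific numbers and the names `check_quorum` and `read_latest` are assumptions for illustration.

```python
def check_quorum(n, n_r, n_w):
    """Evaluate the two quorum constraints for N replicas."""
    return {
        "prevents_read_write_conflicts": n_r + n_w > n,
        "prevents_write_write_conflicts": n_w > n / 2,
    }


def read_latest(replies):
    """Given (version, value) pairs from a read quorum, return the newest one.
    When NR + NW > N, at least one reply comes from the latest write quorum."""
    return max(replies, key=lambda reply: reply[0])


print(check_quorum(12, 3, 10))   # a correct choice
print(check_quorum(12, 7, 6))    # write quorum is not a majority
print(check_quorum(12, 1, 12))   # ROWA: read one, write all
print(read_latest([(3, "old"), (4, "new"), (3, "old")]))
```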