IM NTU, Distributed Information Systems 2004: Replication Management. Yih-Kuen Tsay, Dept. of Information Management, National Taiwan University.

Jan 04, 2016


Stuart Summers

Replication Management

Yih-Kuen Tsay

Dept. of Information Management

National Taiwan University


Motivations for Replication

• Performance enhancement
  – Client vs. server caching
  – Server pools
  – Replication of immutable vs. changing data

• Increased availability
  – Server failures
  – Network partition and disconnected operation

• Fault tolerance: guarantee correctness in spite of faults


General Requirements

• Replication transparency
  – Clients are not aware of multiple physical copies (replicas) of an object.
  – Clients see one logical copy for each object.

• Consistency
  – Servers perform operations in a way that meets the specification of correctness.


An Architecture for Replication Management

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


About the Servers

• Recoverability

• State machines
  – Consist of state variables and commands
  – Outputs determined by the sequence of requests processed

• Static vs. dynamic set of replica managers
  – Dynamic: servers may crash; new ones may join
  – Static: crashed servers are considered to cease operating (possibly for an indefinite period)
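The state-machine property can be illustrated with a minimal sketch. The account object and its command names (`deposit`, `withdraw`, `read`) are hypothetical and chosen only for illustration. The point is that two replicas fed the same request sequence produce the same outputs.

```python
from dataclasses import dataclass, field


@dataclass
class AccountStateMachine:
    """A deterministic state machine: one state variable plus three commands."""
    balance: int = 0                              # the only state variable
    outputs: list = field(default_factory=list)   # one output per processed request

    def apply(self, command: str, amount: int = 0) -> int:
        # Same state + same command always gives the same new state and output.
        if command == "deposit":
            self.balance += amount
        elif command == "withdraw":
            self.balance -= amount
        elif command != "read":
            raise ValueError(f"unknown command: {command}")
        self.outputs.append(self.balance)
        return self.balance


def run(requests):
    """Feed a request sequence to a fresh replica and return its outputs."""
    replica = AccountStateMachine()
    for command, amount in requests:
        replica.apply(command, amount)
    return replica.outputs


requests = [("deposit", 5), ("withdraw", 2), ("read", 0)]
replica_a_outputs = run(requests)
replica_b_outputs = run(requests)   # an independent replica, same request order
```

Because the outputs depend only on the request sequence, keeping replicas consistent reduces to delivering requests to them in the same order.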


Phases of Request Processing

• Issuance
  – Unicast or multicast (from the front end to replica managers)

• Coordination (to ensure consistency)
  – FIFO ordering, causal ordering, total ordering, …

• Execution (possibly tentative)

• Agreement (to commit or abort)

• Response
  – From one replica manager or several replica managers to the front end

* The ordering of the phases varies for different systems.


Services for Process Groups

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


View-Synchronous Group Communications

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Correctness Criteria

• Linearizability

• Sequential consistency

* Consider individual operations (instead of transactions).


Linearizability

• The interleaved sequence of operations meets the specification of a single correct copy of the objects.

• The order of operations in the interleaving is consistent with the real times at which the operations occurred in the actual execution.


Sequential Consistency

• The one-copy semantics of the replicated objects is respected.

• The order of operations is preserved for each client, i.e., consistent with the program order for each client.

* Every linearizable service is also sequentially consistent.
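The difference between the two criteria can be checked by brute force on a tiny history. The sketch below assumes a single read/write register with initial value 0 and clients that issue operations one at a time. The `Op` record and the function names are hypothetical helpers, not part of the slides.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    client: str
    kind: str      # "write" or "read"
    value: int     # value written, or value the read returned
    start: float   # real time at which the operation was invoked
    end: float     # real time at which the response arrived


def meets_register_spec(order, initial=0):
    """True if the sequence is legal for a single correct copy of a register."""
    current = initial
    for op in order:
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return False
    return True


def respects_program_order(order):
    """Each client's operations appear in the order that client issued them."""
    last_start = {}
    for op in order:
        if op.start < last_start.get(op.client, float("-inf")):
            return False
        last_start[op.client] = op.start
    return True


def respects_real_time(order):
    """If one operation finished before another started, it must come first."""
    for i, later in enumerate(order):
        for earlier in order[:i]:
            if later.end < earlier.start:
                return False
    return True


def is_sequentially_consistent(history):
    return any(meets_register_spec(p) and respects_program_order(p)
               for p in permutations(history))


def is_linearizable(history):
    return any(meets_register_spec(p) and respects_real_time(p)
               for p in permutations(history))


history = [
    Op("c1", "write", 1, start=0.0, end=1.0),
    Op("c2", "read", 0, start=2.0, end=3.0),   # stale read after the write finished
]
```

For this history the stale read can be placed before the write without violating either client's program order, so the history is sequentially consistent. It is not linearizable, because the write completed in real time before the read began.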


The Primary-Backup (Passive) Model

Consistency is easily guaranteed if the replica managers are organized as a group and the primary uses view-synchronous group communication to send updates.

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
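A minimal in-process sketch of the primary-backup flow follows. It assumes plain method calls stand in for view-synchronous group communication and that no failures occur. The class names, method names, and request identifiers are hypothetical.

```python
class ReplicaManager:
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.seen = {}   # request id -> cached response, used to filter duplicates

    def apply(self, request_id, key, value):
        self.store[key] = value
        self.seen[request_id] = "ok"
        return "ok"


class Primary(ReplicaManager):
    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups

    def handle_update(self, request_id, key, value):
        # A retransmitted request is answered from the cache, not re-executed.
        if request_id in self.seen:
            return self.seen[request_id]
        response = self.apply(request_id, key, value)
        # Assumption: this loop stands in for a view-synchronous multicast.
        for backup in self.backups:
            backup.apply(request_id, key, value)
        return response

    def handle_read(self, key):
        return self.store.get(key)


backups = [ReplicaManager("b1"), ReplicaManager("b2")]
primary = Primary("p", backups)
primary.handle_update("req-1", "x", 42)
primary.handle_update("req-1", "x", 42)   # duplicate request, ignored
```

Since every update passes through the primary and reaches the backups before the reply is sent, a backup that takes over holds the same state as the failed primary.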


Active Replication

Each front end sends its requests one at a time to all replica managers using a totally ordered multicast primitive, ensuring that all requests are processed in the same order at all replica managers.

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
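Active replication can be sketched with a toy ordering service. The `TotalOrderMulticast` class is a hypothetical, single-process stand-in for a real totally ordered multicast primitive, and the `add` and `mul` commands are chosen because they do not commute, so the order of delivery matters.

```python
class ActiveReplica:
    def __init__(self):
        self.value = 0

    def execute(self, request):
        op, arg = request
        if op == "add":
            self.value += arg
        elif op == "mul":
            self.value *= arg
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.value


class TotalOrderMulticast:
    """Assumption: delivers each request to all replicas in one agreed order."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.log = []   # the single agreed delivery order

    def send(self, request):
        self.log.append(request)
        return [replica.execute(request) for replica in self.replicas]


replicas = [ActiveReplica() for _ in range(3)]
group = TotalOrderMulticast(replicas)
group.send(("add", 2))
responses = group.send(("mul", 5))   # every replica answers with the same value
```

If one replica had processed `mul` before `add`, it would hold 2 instead of 10, which is why the ordering guarantee is essential here.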


The Gossip Architecture

• A framework for providing high availability of service through lazy replication

• A request normally executed at one replica

• Replicas updated by lazy exchange of gossip messages (containing most recent updates).


Operations in a Gossip Service

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Timestamps

• Each front end keeps a vector timestamp reflecting the latest version accessed.

• The timestamp is attached to every request sent to a replica.

• Two front ends may exchange messages directly; these messages also carry timestamps.

• Timestamps are merged in the usual way for vector timestamps: by taking the element-wise maximum.
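As a small illustration of vector timestamp handling, the sketch below merges two timestamps by element-wise maximum and compares them element by element. The function names `merge` and `leq` and the sample values are illustrative assumptions.

```python
def merge(a, b):
    """Element-wise maximum of two vector timestamps of equal length."""
    return tuple(max(x, y) for x, y in zip(a, b))


def leq(a, b):
    """a <= b holds only if every element of a is <= the matching element of b."""
    return all(x <= y for x, y in zip(a, b))


front_end_ts = (2, 4, 6)   # what the front end has seen so far
reply_ts = (2, 5, 5)       # timestamp returned by a replica
merged = merge(front_end_ts, reply_ts)
```

Here neither input timestamp dominates the other, while the merged timestamp dominates both.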


Timestamps (cont.)

• Each replica keeps a replica timestamp representing those updates it has received.

• It also keeps a value timestamp, reflecting the updates in the replicated value.

• The replica timestamp is attached to the reply to an update, while the value timestamp is attached to the reply to a query.


Timestamp Propagations

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


The Update Log

• Every update, when received by a replica, is recorded in the update log of the replica.

• Two reasons for keeping a log:
  – The update cannot be applied yet; it is held back.
  – It is uncertain if the update has been received by all replicas.

• The entries are sorted by timestamps.


The Executed Operation Table

• The same update may arrive at a replica from a front end and in a gossip message from another replica.

• To prevent an update from being applied twice, the replica keeps a list of identifiers of the updates that have been applied so far.


A Gossip Replica Manager

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Processing Query Requests

• A query request q carries a timestamp q.prev, reflecting the latest version of the value that the front end has seen.

• Request q can be applied (i.e., it is stable) if q.prev ≤ valueTS (the value timestamp of the replica that received q).

• Once q is applied, the replica returns the current valueTS along with the reply.
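The query rule can be sketched as follows. The `QueryReplica` class is hypothetical, and returning `None` for a query that is not yet stable is a simplification made here; a real replica would hold the query back until the missing updates arrive.

```python
def leq(a, b):
    """Vector timestamp comparison: every element of a is <= that of b."""
    return all(x <= y for x, y in zip(a, b))


class QueryReplica:
    def __init__(self, value, value_ts):
        self.value = value
        self.value_ts = value_ts

    def query(self, q_prev):
        # Stability condition: the replica's value must reflect everything
        # the front end has already seen.
        if not leq(q_prev, self.value_ts):
            return None   # simplification: signal that the query must wait
        return self.value, self.value_ts   # reply carries the current valueTS


replica = QueryReplica(value="v1", value_ts=(2, 5, 5))
too_new = replica.query((2, 4, 6))    # front end has seen more than this replica
answered = replica.query((2, 4, 5))
```

The front end would merge the returned timestamp into its own before issuing further requests.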


Processing Update Requests

• For an update u (not a duplicate), replica i
  – increments the i-th element of its replica timestamp replicaTS by one,
  – adds an entry to the log with a timestamp ts derived from u.prev by replacing the i-th element with that of replicaTS, and
  – returns ts to the front end immediately.

• When the stability condition u.prev ≤ valueTS holds, update u is applied and its ts is merged with valueTS.
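The update steps can be put together in a short sketch. The `GossipReplica` class, its field names, the tuple layout of log entries, and the choice to return `None` for a duplicate are all assumptions made for illustration; the replicated value is modeled as a list of applied operations.

```python
def leq(a, b):
    return all(x <= y for x, y in zip(a, b))


def merge(a, b):
    return tuple(max(x, y) for x, y in zip(a, b))


class GossipReplica:
    def __init__(self, index, size):
        self.i = index
        self.replica_ts = [0] * size    # updates received
        self.value_ts = (0,) * size     # updates reflected in the value
        self.log = []                   # entries: (ts, uid, prev, operation)
        self.executed = set()           # executed operation table
        self.value = []                 # applied operations, in order

    def receive_update(self, uid, prev, operation):
        if uid in self.executed or any(entry[1] == uid for entry in self.log):
            return None                 # duplicate: already logged or applied
        self.replica_ts[self.i] += 1
        ts = list(prev)
        ts[self.i] = self.replica_ts[self.i]
        ts = tuple(ts)
        self.log.append((ts, uid, tuple(prev), operation))
        self.apply_stable()
        return ts                       # returned to the front end immediately

    def apply_stable(self):
        progress = True
        while progress:
            progress = False
            for ts, uid, prev, operation in sorted(self.log, key=lambda e: e[0]):
                if uid not in self.executed and leq(prev, self.value_ts):
                    self.value.append(operation)
                    self.value_ts = merge(self.value_ts, ts)
                    self.executed.add(uid)
                    progress = True


replica = GossipReplica(index=0, size=3)
ts1 = replica.receive_update("u1", (0, 0, 0), "set x=1")
ts2 = replica.receive_update("u2", (0, 1, 0), "set y=2")   # depends on replica 1
```

The second update is logged and acknowledged but held back, because it depends on an update from replica 1 that this replica has not yet applied.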


Processing Gossip Messages

• For every gossip message received, a replica does the following:
  – Merge the arriving log with its own; duplicated updates are discarded.
  – Apply updates that have become stable.

• A gossip message need not contain the entire log, if it is certain that some of the updates have been seen by the receiving replica.
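A rough sketch of gossip processing is given below, under simplifying assumptions: the replica state is a plain dictionary, log entries are `(ts, uid, prev, operation)` tuples, and the gossip message carries only a log. All of these names and layouts are hypothetical.

```python
def leq(a, b):
    return all(x <= y for x, y in zip(a, b))


def merge(a, b):
    return tuple(max(x, y) for x, y in zip(a, b))


def process_gossip(state, gossip_log):
    # Step 1: merge the arriving log with the local one, dropping duplicates.
    known = {uid for _, uid, _, _ in state["log"]}
    for entry in gossip_log:
        uid = entry[1]
        if uid not in known:
            state["log"].append(entry)
            known.add(uid)
    # Step 2: apply every update whose stability condition now holds.
    progress = True
    while progress:
        progress = False
        for ts, uid, prev, op in sorted(state["log"], key=lambda e: e[0]):
            if uid not in state["executed"] and leq(prev, state["value_ts"]):
                state["value"].append(op)
                state["value_ts"] = merge(state["value_ts"], ts)
                state["executed"].add(uid)
                progress = True


state = {
    "log": [((1, 0, 0), "u1", (0, 0, 0), "a"),
            ((2, 1, 0), "u2", (0, 1, 0), "c")],   # u2 is held back
    "value_ts": (1, 0, 0),
    "executed": {"u1"},
    "value": ["a"],
}
gossip = [((1, 0, 0), "u1", (0, 0, 0), "a"),      # duplicate, discarded
          ((0, 1, 0), "u3", (0, 0, 0), "b")]      # new update from replica 1
process_gossip(state, gossip)
```

The arrival of `u3` makes the held-back update `u2` stable, so both are applied in the same pass.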


Updates in Bayou

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


About Bayou

• Consistency guarantees

• Merging of updates

• Dependency checks

• Merge procedures


Coda vs. AFS

• More general replication

• Greater tolerance toward server crashes

• Allowing disconnected operations


Transactions with Replicated Data

• A replicated transactional service should appear the same as one without replicated data.

• The effects of transactions performed by various clients on replicated data are the same as if they had been performed one at a time on single data items; this property is called one-copy serializability.


Transactions with Replicated Data (cont.)

• Failures should be serialized with respect to transactions.

• Any failure observed by a transaction must appear to have happened before the transaction started.


Schemes for One-Copy Serializability

• Read one/write all

• Available copies replication

• Schemes that also tolerate network partitioning:
  – Available copies with validation
  – Quorum consensus
  – Virtual partition


Transactions on Replicated Data

[Figure: two clients, each with a front end, run transactions T and U; the labelled operations are getBalance(A) and deposit(B,3), issued to replica managers holding copies of A and B.]

Source: Instructor’s guide for G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Available Copies Replication

• A client's read request on a logical data item may be performed by any available replica, but a client's update request must be performed by all available replicas.

• A local validation procedure is required to ensure that any failure or recovery does not appear to happen during the progress of a transaction.
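The read and update rules can be sketched as below. The `Copy` class and the replica names are hypothetical, and the local validation procedure is deliberately left out, so this only illustrates which copies each kind of request touches.

```python
class Copy:
    def __init__(self, name, value=0):
        self.name = name
        self.value = value
        self.available = True


def read(copies):
    """A read may be served by any one available copy."""
    for copy in copies:
        if copy.available:
            return copy.value
    raise RuntimeError("no available copy")


def write(copies, value):
    """An update must be performed at every available copy."""
    targets = [copy for copy in copies if copy.available]
    if not targets:
        raise RuntimeError("no available copy")
    for copy in targets:
        copy.value = value
    return [copy.name for copy in targets]


copies_of_b = [Copy("M"), Copy("N"), Copy("P")]
copies_of_b[1].available = False          # replica N has failed
written_to = write(copies_of_b, 3)
```

The failed copy keeps its old value, which is why a recovering replica must bring itself up to date before serving requests again.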


Available Copies Replication (cont.)

[Figure: two clients, each with a front end, run transactions T and U; the labelled operations are getBalance(A), deposit(B,3), getBalance(B), and deposit(A,3), issued to replica managers X, Y, M, N, and P holding copies of A and B.]

Source: Instructor’s guide for G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Network Partition

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Available Copies with Validation

• The available copies algorithm is applied within each partition.

• When a partition is repaired, the possibly conflicting transactions that took place in the separate partitions are validated.

• If the validation fails, some of the transactions have to be aborted.


Quorum Consensus Methods

• One way to ensure consistency across different partitions is to make a rule that operations can only be carried out within one of the partitions.

• A quorum is a subgroup of replicas whose size gives it the right to execute operations.

• Version numbers or timestamps may be used to determine whether copies of the data item are up to date.
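A minimal sketch of quorum-based reads and writes with version numbers follows. It assumes one vote per replica and the usual size constraints from weighted voting, R + W > N and W > N/2, which the slides do not state explicitly. The function names and the `(version, value)` layout are hypothetical.

```python
def valid_quorum_sizes(n, r, w):
    """Read and write quorums must overlap, and so must any two write quorums."""
    return r + w > n and 2 * w > n


def quorum_write(copies, quorum, value):
    # Any write quorum overlaps the previous one, so it contains the
    # latest version number.
    new_version = max(copies[i][0] for i in quorum) + 1
    for i in quorum:
        copies[i] = (new_version, value)
    return new_version


def quorum_read(copies, quorum):
    # The copy with the highest version number in the quorum is up to date.
    return max((copies[i] for i in quorum), key=lambda copy: copy[0])


copies = [(0, "old"), (0, "old"), (0, "old")]   # (version, value) per replica
quorum_write(copies, quorum=[0, 1], value="new")
latest = quorum_read(copies, quorum=[1, 2])     # overlaps the write at replica 1
```

With N = 3 and R = W = 2, a partition holding only one replica can form neither quorum, so operations proceed in at most one partition.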


An Example for Quorum Consensus

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Two Network Partitions

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Virtual Partition

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Overlapping Virtual Partitions

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.


Creating Virtual Partitions

Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.