IM NTU
Distributed Information Systems 2004
Distributed Transactions
Yih-Kuen Tsay
Dept. of Information Management
National Taiwan University
Structures of Distributed Transactions
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Flat vs. Nested Transactions
• Both types of transaction invoke operations in more than one server.
• A flat transaction accesses servers’ objects sequentially.
• The subtransactions of a nested transaction can run in parallel (concurrently).
A Nested Banking Transaction
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
* The four subtransactions can run in parallel.
A Distributed Banking Transaction
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
* A transaction identifier may include the server identifier and a serial number.
Atomic Commitment in Flat Transactions
• When a distributed flat transaction comes to an end, either all or none of its operations (in different servers) are carried out.
• If one part of a transaction has to abort for some reason (e.g., a server crash or a failed validation), then the whole transaction must also be aborted.
The Two-Phase Commit Protocol
• A participant (server) is allowed to abort its part of a transaction (even after performing all operations).
• In the first phase, each server votes for the transaction to be committed or aborted.
• In the second phase, every server carries out the joint decision.
• The protocol tolerates server crashes or message losses.
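The two phases above can be sketched as follows. This is a minimal in-memory illustration, not the protocol as specified in the textbook: the class `Participant`, its `will_vote_yes` flag, and the method names are assumptions made for the example, and crashes, timeouts, and lost messages are not modelled.

```python
class Participant:
    """Hypothetical participant; will_vote_yes stands in for its local decision."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "active"

    def can_commit(self):
        # Phase 1: a Yes vote means the participant is prepared to commit.
        self.state = "prepared" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def do_commit(self):
        self.state = "committed"

    def do_abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if the transaction commits, False if it aborts."""
    # Phase 1: collect a vote from every participant.
    votes = [p.can_commit() for p in participants]
    decision = all(votes)
    # Phase 2: every participant carries out the joint decision.
    for p in participants:
        if decision:
            p.do_commit()
        else:
            p.do_abort()
    return decision


servers = [Participant("X"), Participant("Y", will_vote_yes=False)]
print(two_phase_commit(servers))   # False: one No vote aborts everything
print([p.state for p in servers])  # ['aborted', 'aborted']
```

A single No vote is enough to abort at every server, which is the all-or-none property required of atomic commitment.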
The Two-Phase Commit Protocol (cont.)
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
* A participant is prepared to commit when it has recorded the changes and its status in permanent storage.
The Two-Phase Commit Protocol (cont.)
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
The Two-Phase Commit Protocol (cont.)
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Atomic Commitment in Nested Transactions
• When a subtransaction completes, it makes an independent decision either to commit provisionally or to abort.
• A parent transaction may commit even if one of its child transactions has aborted.
• If a parent transaction aborts, then its subtransactions will be forced to abort.
• Subtransactions will not carry out a real commitment unless the entire nested transaction decides to commit.
Deciding Whether to Commit
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
* A provisional commit is not backed up in permanent storage.
Operations in Coordinator for Nested Transactions
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Two-Phase Commit in Nested Transactions
• When a subtransaction provisionally commits, it reports its status and the status of its descendants to its parent.
• When a subtransaction aborts, it just reports abort to its parent.
• Eventually, the top-level transaction receives a list of all subtransactions (except the descendants of an aborted transaction) in the tree, together with the status of each.
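The reporting rule above can be sketched as a walk over the transaction tree. The `Sub` class, the status strings, and the example tree (T1, T11, T12, T2, T21, T22) are assumptions chosen for illustration only.

```python
class Sub:
    """Hypothetical subtransaction node in a nested transaction tree."""

    def __init__(self, name, aborted=False, children=()):
        self.name = name
        self.aborted = aborted
        self.children = list(children)


def report(sub):
    """Return the (name, status) list that `sub` passes up to its parent."""
    if sub.aborted:
        # An aborted subtransaction reports only its own abort;
        # its descendants are left out of the list.
        return [(sub.name, "aborted")]
    statuses = [(sub.name, "provisionally committed")]
    for child in sub.children:
        statuses.extend(report(child))
    return statuses


def collect_at_top_level(children):
    """Gather the reports of all children of the top-level transaction."""
    statuses = []
    for child in children:
        statuses.extend(report(child))
    return statuses


t1 = Sub("T1", children=[Sub("T11", aborted=True), Sub("T12")])
t2 = Sub("T2", aborted=True, children=[Sub("T21"), Sub("T22")])
for name, status in collect_at_top_level([t1, t2]):
    print(name, status)
```

In this example T21 and T22 never appear in the list received at the top level, because their parent T2 aborted.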
Two-Phase Commit in Nested Transactions (cont.)
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
(Flat) Two-Phase Commit Protocol
• The top-level coordinator sends canCommit? to all sub-coordinators in the provisional commit list.
• When a server receives a canCommit? ...
– If it has provisionally committed subtransactions, it
   • prepares those without aborted ancestors for commitment,
   • aborts those with aborted ancestors, and
   • sends a Yes vote to the coordinator.
– Otherwise (it must have failed), it sends a No vote.
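A server's handling of canCommit? as described above might look like the sketch below. It assumes the request carries a list of aborted subtransactions (here `abort_list`) and that the server knows the ancestors of each subtransaction it holds; the function name, the data layout, and the example names are hypothetical.

```python
def can_commit(provisional, abort_list):
    """Handle canCommit? at one server.

    provisional: dict mapping each provisionally committed subtransaction
                 held by this server to the list of its ancestors.
    abort_list:  names of subtransactions known to have aborted.
    Returns (vote, prepared, aborted).
    """
    if not provisional:
        # Nothing provisionally committed here: the server must have failed.
        return "No", [], []
    aborted_names = set(abort_list)
    prepared, aborted = [], []
    for name, ancestors in provisional.items():
        if aborted_names.intersection(ancestors):
            aborted.append(name)    # has an aborted ancestor
        else:
            prepared.append(name)   # safe to prepare for commitment
    return "Yes", prepared, aborted


held = {"T12": ["T1", "T"], "T21": ["T2", "T"]}
print(can_commit(held, ["T11", "T2"]))  # ('Yes', ['T12'], ['T21'])
print(can_commit({}, ["T11", "T2"]))    # ('No', [], [])
```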
The canCommit? Operation for Two-Phase Commit in Nested Transactions
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Concurrency Control in Distributed Transactions
• Each server applies concurrency control to its own objects.
• Every pair of transactions must be serializable in the same order at all servers.
Locking
• Each server maintains locks for its own objects.
• Locks cannot be released until the transaction has been committed or aborted at all servers.
• Distributed deadlocks might occur if different servers impose different orderings on transactions.
Timestamp Ordering
• A globally unique transaction timestamp is issued by the top-level coordinator.
• All servers must agree on how the timestamps are ordered.
• Conflicts are resolved as each operation is performed.
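One common way to obtain globally unique timestamps that all servers order identically is to pair a local counter with a server identifier and compare the pairs lexicographically. The sketch below assumes that scheme; the `Coordinator` class and its method name are hypothetical.

```python
import itertools


class Coordinator:
    """Hypothetical coordinator that issues (counter, server_id) timestamps."""

    def __init__(self, server_id):
        self.server_id = server_id
        self._counter = itertools.count(1)

    def new_timestamp(self):
        # Unique because no two coordinators share a server_id.
        return (next(self._counter), self.server_id)


x, y = Coordinator("X"), Coordinator("Y")
t1 = x.new_timestamp()   # (1, 'X')
t2 = y.new_timestamp()   # (1, 'Y')
t3 = x.new_timestamp()   # (2, 'X')

# Tuple comparison gives every server the same total order.
print(sorted([t3, t2, t1]))  # [(1, 'X'), (1, 'Y'), (2, 'X')]
```

The server identifier only breaks ties, so any server can order two timestamps without contacting another server.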
Optimistic Concurrency Control
• If only one transaction may perform validation at a time, commitment deadlocks might occur.
Transaction T        Transaction U
Read(A) at X         Read(B) at Y
Write(A)             Write(B)
Read(B) at Y         Read(A) at X
Write(B)             Write(A)
Optimistic Concurrency Control (cont.)
• Parallel validation prevents commitment deadlocks.
• A parallel validation checks (among other things) for conflicts between the write operations of the transaction being validated and the write operations of other concurrent transactions.
Optimistic Concurrency Control (cont.)
• To ensure that transactions at different servers are globally serializable, the servers may
– conduct a global validation (checking if there is a cyclic ordering) or
– use the same globally unique transaction number for the same transaction.
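A global validation of the first kind can be pictured as a cycle check over the orderings chosen by the individual servers. The sketch below assumes each server reports its local serialization order as (earlier, later) pairs; the function name and this representation are assumptions made for illustration.

```python
def has_cyclic_ordering(orderings):
    """Return True if the combined (earlier, later) pairs contain a cycle."""
    graph = {}
    for earlier, later in orderings:
        graph.setdefault(earlier, set()).add(later)
        graph.setdefault(later, set())

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}

    def visit(node):
        # Depth-first search; reaching an in-progress node closes a cycle.
        state[node] = IN_PROGRESS
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                return True
            if state[nxt] == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNSEEN and visit(n) for n in graph)


# Server X serializes T before U, but server Y serializes U before T.
print(has_cyclic_ordering([("T", "U"), ("U", "T")]))  # True
# Both servers agree on T before U.
print(has_cyclic_ordering([("T", "U"), ("T", "U")]))  # False
```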
An Interleaving of Three Transactions
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Distributed Deadlocks
• A cycle in the global wait-for graph (but not in any single local one) represents a distributed deadlock.
• A deadlock that is detected but is not really a deadlock is called a phantom deadlock.
• Two-phase locking prevents phantom deadlocks; autonomous aborts may cause phantom deadlocks.
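The first point can be illustrated with three local wait-for graphs, none of which contains a cycle, whose union does. The sketch assumes each transaction waits for at most one other transaction at a time; the server names, edges, and the helper `find_cycle` are made up for the example.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from `start`; return the cycle found, or None."""
    path = [start]
    current = start
    while current in waits_for:
        current = waits_for[current]
        if current in path:
            return path[path.index(current):] + [current]
        path.append(current)
    return None


# Local wait-for graphs: transaction -> transaction it is waiting for.
local_graphs = {
    "X": {"W": "U"},
    "Y": {"U": "V"},
    "Z": {"V": "W"},
}

# No single server sees a cycle.
for server, graph in local_graphs.items():
    print(server, [find_cycle(graph, t) for t in graph])  # [None] each time

# The global graph is the union of the local ones.
global_graph = {}
for graph in local_graphs.values():
    global_graph.update(graph)
print(find_cycle(global_graph, "U"))  # ['U', 'V', 'W', 'U']
```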
Distributed Deadlocks and Wait-For Graphs
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Local and Global Wait-For Graphs
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Edge Chasing
• Initiation: when a server notes that a transaction T starts waiting for another transaction U, which is waiting to access an object at another server, it sends a probe containing T → U to the server of the object at which transaction U is blocked.
Edge Chasing (cont.)
• Detection: servers receive probes, decide whether a deadlock has occurred, and decide whether to forward the probes. When a server receives a probe T → U and finds that the transaction U is waiting for, say V, is itself waiting for an object elsewhere, it forwards a probe T → U → V.
• Resolution: select a transaction in the cycle to abort.
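The detection step can be simulated in a single process as a probe that grows by one edge per server visited. This is a simplified sketch: message passing is replaced by a loop, each transaction is assumed to wait for at most one other, and the names `edge_chase`, `local_edges`, and `blocked_at` are hypothetical.

```python
def edge_chase(local_edges, blocked_at, initiator):
    """Simulate one probe; return the deadlock cycle as a list, or None.

    local_edges: server -> {transaction: transaction it waits for there}
    blocked_at:  transaction -> server holding the object it waits for
    """
    # Initiation: the server where `initiator` is blocked builds the probe.
    home = blocked_at[initiator]
    probe = [initiator, local_edges[home][initiator]]
    while True:
        last = probe[-1]
        server = blocked_at.get(last)
        if server is None:
            return None  # the last transaction is not waiting: no deadlock
        # The probe is forwarded to `server`, which adds its local edge.
        waited_for = local_edges[server][last]
        if waited_for in probe:
            # Detection: the new edge closes a cycle.
            return probe[probe.index(waited_for):] + [waited_for]
        probe.append(waited_for)


local_edges = {"X": {"W": "U"}, "Y": {"U": "V"}, "Z": {"V": "W"}}
blocked_at = {"W": "X", "U": "Y", "V": "Z"}
print(edge_chase(local_edges, blocked_at, "W"))  # ['W', 'U', 'V', 'W']
```

Resolution would then pick one transaction from the returned cycle to abort, for example according to transaction priorities.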
Probes for Detecting Deadlocks
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Independently Initiated Probes
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Probes Traveling Downhill
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Transaction Recovery
• Requirements: durability and failure atomicity
• Specific goal: restore the server with the latest committed versions of its objects.
• Tasks of the recovery manager:
– Save objects in permanent storage (a recovery file)
– Restore objects after a crash
– Reorganize the recovery file and reclaim storage
– Optional: be resilient to media failures
Types of Entry in a Recovery File
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Two Approaches to the Use of Recovery Files
• Logging
– Basic ideas: history of transactions, snapshots, …
– Recovery of objects: forward or backward
– Checkpointing
• Shadow versions
– Basic ideas: map, shadow version, version store, …
– Switching from the old map to the new map
– Checkpointing
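For the logging approach, backward recovery can be sketched as reading the log from its end and keeping, for each object, the newest value written by a committed transaction. The log layout below (tuples tagged "prepared" and "committed"), the `checkpoint` argument, and the account values are simplifying assumptions, not the textbook's recovery file format.

```python
def recover_backward(log, checkpoint):
    """Rebuild object values from a checkpoint plus a log, reading backward.

    log entries (oldest first):
      ("prepared", tid, {object: new_value})  tentative values of tid
      ("committed", tid)                      tid committed
    """
    committed = set()
    restored = {}
    for entry in reversed(log):
        kind, tid = entry[0], entry[1]
        if kind == "committed":
            committed.add(tid)
        elif kind == "prepared" and tid in committed:
            for obj, value in entry[2].items():
                # Reading backward, the first value seen is the newest.
                restored.setdefault(obj, value)
    # Objects untouched by committed transactions keep checkpoint values.
    return {**checkpoint, **restored}


checkpoint = {"A": 100, "B": 200, "C": 300}
log = [
    ("prepared", "T", {"A": 80, "B": 220}),
    ("committed", "T"),
    ("prepared", "U", {"C": 278, "B": 242}),  # U never committed
]
print(recover_backward(log, checkpoint))  # {'A': 80, 'B': 220, 'C': 300}
```

The values written by U are ignored because no commit entry for U reached the log before the crash.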
Log for Banking Service
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Shadow Versions
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
A Log for the Two-Phase Commit Protocol
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.
Recovery of the Two-Phase Commit Protocol
Source: G. Coulouris et al., Distributed Systems: Concepts and Design, Third Edition.