CSL 771: Database Implementation
Transaction Processing
Maya Ramanath
All material (including figures) from: Concurrency Control and Recovery in Database Systems, Phil Bernstein, Vassos Hadzilacos and Nathan Goodman (http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx)
Atomicity and Consistency
• Single transaction
– Execution of a transaction is “all-or-nothing”: either the transaction completes in its entirety, or it “does not even start”
– As if the transaction never existed
– No partial effect may be visible
• 2 outcomes: a transaction COMMITs or ABORTs
Consistency and Isolation
• Multiple transactions
– Concurrent execution can cause an inconsistent database state
– Each transaction is executed as if isolated from the others
Durability
• If a transaction commits, its effects are permanent
• But durability has a bigger scope
– Catastrophic failures (floods, fires, earthquakes)
What we will study…
• Concurrency Control
– Ensuring atomicity, consistency and isolation when multiple transactions are executed concurrently
• Recovery
– Ensuring durability and consistency in case of software/hardware failures
Terminology
• Data item
– A tuple, table, block
• Read (x)
• Write (x, 5)
• Start (T)
• Commit (T)
• Abort (T)
• Active transaction
– A transaction which has neither committed nor aborted
High level model
[Figure: layered components, top to bottom: Transaction 1, Transaction 2, …, Transaction n; Transaction Manager; Scheduler; Recovery Manager; Cache Manager; Disk]
Recoverability (1/2)
• Transaction T aborts
– T wrote some data items
– T’ read items that T wrote
• DBMS has to…
– Undo the effects of T
– Undo the effects of T’
– But T’ has already committed

T                 T’
Read (x)
Write (x, k)
Read (y)
                  Read (x)
                  Write (y, k’)
                  Commit
Abort
Recoverability (2/2)
• Let T1,…,Tn be a set of transactions
• Ti reads a value written by Tk, k < i
• An execution of transactions is recoverable if Ti commits after all such Tk commit

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
                  Commit

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
Commit
                  Commit
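The recoverability condition can be checked mechanically. The following is a minimal sketch, assuming a schedule is encoded as a list of (transaction, operation, item) tuples with operations "r", "w", "c" (commit) and "a" (abort). The encoding and the function names are illustrative and not taken from the source, and the sketch ignores values restored by an abort.

```python
def reads_from(schedule):
    """Return (reader, writer) pairs: reader read an item last written by writer."""
    last_writer = {}   # item -> transaction that wrote it most recently
    pairs = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r" and last_writer.get(item, txn) != txn:
            pairs.add((txn, last_writer[item]))
    return pairs


def is_recoverable(schedule):
    """A committed reader must commit after every writer it read from."""
    commit_pos = {t: i for i, (t, op, _) in enumerate(schedule) if op == "c"}
    for reader, writer in reads_from(schedule):
        if reader in commit_pos:
            if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
                return False
    return True


# The two schedules from the slide (assumed encoding).
first = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "y"), ("T2", "c", None)]
second = first[:3] + [("T1", "c", None), ("T2", "c", None)]
print(is_recoverable(first), is_recoverable(second))
```

In the first schedule T2 commits while T1 is still active, so the check fails; in the second, T1 commits first and the check passes.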
Cascading Aborts (1/2)
• Because T was aborted, T1,…, Tk also have to be aborted

T                 T’                T’’
Read (x)
Write (x, k)
Read (y)
                  Read (x)
                  Write (y, k’)
                                    Read (y)
Abort
Cascading Aborts (2/2)
• Recoverable executions do not prevent cascading aborts
• How can we prevent them then?

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
Commit
                  Commit

T1                T2
Write (x,2)
Commit
                  Read (x)
                  Write (y,2)
                  Commit
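The second schedule suggests the answer: a transaction reads only values written by committed transactions. A minimal sketch of that check follows, assuming the same illustrative encoding of a schedule as (transaction, operation, item) tuples with operations "r", "w", "c" and "a"; the function name is hypothetical.

```python
def avoids_cascading_aborts(schedule):
    """True if every read sees a value written by the reader itself
    or by a transaction that has already committed."""
    committed = set()
    last_writer = {}   # item -> transaction that wrote it most recently
    for txn, op, item in schedule:
        if op == "c":
            committed.add(txn)
        elif op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False
    return True


# Left schedule: T2 reads x before T1 commits.
cascading = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "y"),
             ("T1", "c", None), ("T2", "c", None)]
# Right schedule: T1 commits before T2 reads x.
safe = [("T1", "w", "x"), ("T1", "c", None), ("T2", "r", "x"),
        ("T2", "w", "y"), ("T2", "c", None)]
print(avoids_cascading_aborts(cascading), avoids_cascading_aborts(safe))
```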
What we learnt so far…

Not recoverable:
T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
                  Commit

Recoverable with cascading aborts:
T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
Commit
                  Commit

Recoverable without cascading aborts:
T1                T2
Write (x,2)
Commit
                  Read (x)
                  Write (y,2)
                  Commit
Reading a value, committing a transaction
Strict Schedule (1/2)
• “Undo”-ing the effects of a transaction
– Restore the before image of the data item

T1                T2
Write (x,1)
Write (y,3)
                  Write (y,1)
Commit
                  Read (x)
                  Abort

Equivalent to:
T1                T2
Write (x,1)
Write (y,3)
Commit

Final value of y: 3
Strict Schedule (2/2)
Initial value of x: 1

T1                T2
Write (x,2)
                  Write (x,3)
Abort

Should x be restored to 1 or 3?

T1                T2
Write (x,2)
                  Write (x,3)
Abort
                  Abort

T1 restores x to 3? T2 restores x to 2?

• Do not read or write a value which has been written by an active transaction until that transaction has committed or aborted

T1                T2
Write (x,2)
Abort
                  Write (x,3)
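The strictness rule can be expressed as a check over a schedule. This is a minimal sketch, assuming the illustrative encoding of a schedule as (transaction, operation, item) tuples with operations "r", "w", "c" and "a"; the function name is hypothetical.

```python
def is_strict(schedule):
    """True if no transaction reads or writes an item whose latest writer
    is a different transaction that has neither committed nor aborted."""
    finished = set()   # transactions that have committed or aborted
    last_writer = {}   # item -> transaction that wrote it most recently
    for txn, op, item in schedule:
        if op in ("c", "a"):
            finished.add(txn)
        elif op in ("r", "w"):
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in finished:
                return False
            if op == "w":
                last_writer[item] = txn
    return True


# T2 overwrites x while T1 is still active: not strict.
ambiguous = [("T1", "w", "x"), ("T2", "w", "x"), ("T1", "a", None)]
# T2 waits until T1 has aborted: strict.
delayed = [("T1", "w", "x"), ("T1", "a", None), ("T2", "w", "x")]
print(is_strict(ambiguous), is_strict(delayed))
```

In a strict schedule the before image of an item is always well defined, so undoing an aborted write amounts to restoring that before image.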
The Lost Update Problem
Assume x is your account balance

T1                T2
Read (x)
                  Read (x)
                  Write (x, 200,000)
                  Commit
Write (x, 200)
Commit
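A small simulation makes the lost update concrete. It is a sketch under assumed numbers: the initial balance of 100 and the two deposit amounts are hypothetical, chosen only so that the written values match the 200 and 200,000 on the slide.

```python
def run_interleaved(initial, deposit_t1, deposit_t2):
    """Replay the interleaving on the slide: both transactions read x
    before either one writes it back."""
    x = initial
    t1_local = x                 # T1: Read (x)
    t2_local = x                 # T2: Read (x)
    x = t2_local + deposit_t2    # T2: Write (x), Commit
    x = t1_local + deposit_t1    # T1: Write (x), Commit; overwrites T2's update
    return x


def run_serial(initial, deposit_t1, deposit_t2):
    """Run T1 to completion, then T2."""
    x = initial
    x = x + deposit_t1           # T1: Read, Write, Commit
    x = x + deposit_t2           # T2: Read, Write, Commit
    return x


print(run_interleaved(100, 100, 199_900))   # 200: T2's deposit is lost
print(run_serial(100, 100, 199_900))        # 200100
```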
Serializable Schedules
• Serial schedule
– Simply execute transactions one after the other
• A serializable schedule is one which is equivalent to some serial schedule
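One common way to test equivalence to a serial schedule is conflict serializability: build a precedence graph with an edge Ti → Tk whenever an operation of Ti conflicts with and precedes an operation of Tk, and check that the graph has no cycle. The sketch below assumes this notion of equivalence, which these slides do not spell out, and an illustrative encoding of a schedule as (transaction, operation, item) tuples with operations "r" and "w".

```python
from itertools import combinations


def precedence_edges(schedule):
    """Edge (Ti, Tk) if an operation of Ti precedes a conflicting operation
    of Tk (same item, different transactions, at least one write)."""
    edges = set()
    for (ti, opi, xi), (tk, opk, xk) in combinations(schedule, 2):
        if ti != tk and xi == xk and "w" in (opi, opk):
            edges.add((ti, tk))
    return edges


def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic (checked by repeatedly
    removing transactions that have no incoming edge)."""
    edges = precedence_edges(schedule)
    nodes = {t for t, _, _ in schedule}
    while nodes:
        free = {n for n in nodes
                if not any(b == n and a in nodes for a, b in edges)}
        if not free:
            return False   # every remaining node lies on or behind a cycle
        nodes -= free
    return True


lost_update = [("T1", "r", "x"), ("T2", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
print(is_conflict_serializable(lost_update), is_conflict_serializable(serial))
```

The lost-update interleaving produces edges in both directions between T1 and T2, so it is rejected.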
SERIALIZABILITY THEORY
Serializable Schedules
T1: op11, op12, op13
T2: op21, op22, op23, op24
• Serial schedule
– Simply execute transactions one after the other
op11, op12, op13 op21, op22, op23, op24
• Serializable schedule
– Interleave operations
– Ensure end result is equivalent to some serial schedule
Deadlocks (2/2)
Strategies to deal with deadlocks
• Timeouts
– Leads to inefficiency
• Detecting deadlocks
– Maintain a wait-for graph; a cycle indicates deadlock
– Once a deadlock is detected, break the cycle by aborting a transaction
• New problem: Starvation
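Deadlock detection on a wait-for graph can be sketched as a depth-first search for a cycle. The dictionary representation, the function names and the victim policy below are assumptions made for illustration.

```python
def find_cycle(wait_for):
    """wait_for maps a transaction to the set of transactions it waits for.
    Return the transactions on one cycle, or None if there is no deadlock."""
    path, done = [], set()

    def dfs(t):
        if t in path:                      # back edge: t is already on the path
            return path[path.index(t):]
        if t in done:
            return None
        path.append(t)
        for u in wait_for.get(t, ()):
            cycle = dfs(u)
            if cycle:
                return cycle
        path.pop()
        done.add(t)
        return None

    for t in list(wait_for):
        cycle = dfs(t)
        if cycle:
            return cycle
    return None


def break_deadlock(wait_for, cycle):
    """Abort one transaction on the cycle (hypothetical policy: the largest
    name) and remove it from the graph."""
    victim = max(cycle)
    wait_for.pop(victim, None)
    for waits in wait_for.values():
        waits.discard(victim)
    return victim


graph = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}, "T4": {"T1"}}
cycle = find_cycle(graph)
victim = break_deadlock(graph, cycle)
print(sorted(cycle), victim, find_cycle(graph))
```

A fixed victim policy like this one can abort the same transaction again and again, which is the starvation problem noted above.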
Conservative 2PL
• Avoids deadlocks altogether
– T declares its readset and writeset
– Scheduler tries to acquire all required locks
– If not all locks can be acquired, T waits in a queue
• T never “starts” until all locks are acquired
– Therefore, it can never be involved in a deadlock
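The all-or-nothing lock acquisition can be sketched as follows. The class, its method names and the retry-on-finish policy are assumptions made for illustration, not the scheduler described in the source.

```python
from collections import deque


class ConservativeScheduler:
    def __init__(self):
        self.readers = {}      # item -> set of transactions holding a read lock
        self.writer = {}       # item -> transaction holding the write lock
        self.queue = deque()   # transactions waiting for their full lock set

    def _all_free(self, txn, readset, writeset):
        for x in readset | writeset:
            if self.writer.get(x, txn) != txn:         # write-locked by another
                return False
        for x in writeset:
            if self.readers.get(x, set()) - {txn}:     # read-locked by another
                return False
        return True

    def start(self, txn, readset, writeset):
        """Acquire every declared lock at once, or acquire none and wait."""
        if not self._all_free(txn, readset, writeset):
            self.queue.append((txn, readset, writeset))
            return False
        for x in readset:
            self.readers.setdefault(x, set()).add(txn)
        for x in writeset:
            self.writer[x] = txn
        return True

    def finish(self, txn):
        """Release all locks of txn, then retry the waiting transactions."""
        for holders in self.readers.values():
            holders.discard(txn)
        self.writer = {x: t for x, t in self.writer.items() if t != txn}
        started = []
        for _ in range(len(self.queue)):
            waiting = self.queue.popleft()
            if self.start(*waiting):
                started.append(waiting[0])
        return started


s = ConservativeScheduler()
results = [s.start("T1", {"x"}, {"y"}),   # acquires everything
           s.start("T2", {"y"}, set()),   # y is write-locked: waits
           s.start("T3", {"x"}, set())]   # shared read on x
print(results, s.finish("T1"))
```

Because a waiting transaction holds no locks, no other transaction can ever be waiting for it, so it cannot be part of a wait-for cycle.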
On your own
Strict 2PL (2PL which ensures only strict schedules)
Extra Information
• Assumption: Data items are organized in a tree
Can we come up with a better (more efficient) protocol?
Tree Locking Protocol (1/3)
• Flowchart: Receive ai[x] → is alk[x] set? (RULE 1)
– YES: ai[x] delayed
– NO: check RULE 2, then pi[x] scheduled
• RULE 2: if x is an intermediate node, and y is the parent of x, then ali[x] is possible only if Ti holds ali[y]
• RULE 3: ali[x] cannot be released until ai[x] is completed
• RULE 4: once a lock is released, the same lock may not be re-obtained
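RULE 2 and RULE 4 constrain the lock and unlock sequence of a single transaction, so they can be checked in isolation. The sketch below assumes the tree is given as a child-to-parent dictionary and the transaction's activity as a list of ("lock" | "unlock", node) events; both encodings and the function name are illustrative. Conflicts with other transactions (RULE 1) and operation completion (RULE 3) are not modeled.

```python
def follows_tree_rules(parent, events):
    """Check RULE 2 (hold the parent's lock when locking a non-root node)
    and RULE 4 (never re-obtain a released lock) for one transaction."""
    held, released = set(), set()
    for action, x in events:
        if action == "lock":
            if x in released:                  # RULE 4 violated
                return False
            y = parent.get(x)                  # None for the root
            if y is not None and y not in held:
                return False                   # RULE 2 violated
            held.add(x)
        elif action == "unlock":
            if x not in held:                  # sanity check: nothing to release
                return False
            held.remove(x)
            released.add(x)
    return True


# Hypothetical tree: a is the root, b and c are children of a, d is a child of b.
tree = {"b": "a", "c": "a", "d": "b"}
crabbing = [("lock", "a"), ("lock", "b"), ("unlock", "a"),
            ("lock", "d"), ("unlock", "b"), ("unlock", "d")]
print(follows_tree_rules(tree, crabbing))
```

Note that the valid sequence releases the lock on a before acquiring the lock on d, something two-phase locking would not allow.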
Tree Locking Protocol (2/3)
• Proposition: If Ti locks x before Tk, then for every v which is a descendant of x, if both Ti and Tk lock v, then Ti locks v before Tk.
• Theorem: Tree Locking Protocol always produces serializable schedules
Tree Locking Protocol (3/3)
• Tree Locking Protocol avoids deadlock
• Releases locks earlier than 2PL
BUT
• Needs to know the access pattern to be effective
• Transactions should access nodes from root to leaf
Multi-granularity Locking (1/3)
• Granularity
– Refers to the relative size of the data item
– Attribute, tuple, table, page, file, etc.
• Efficiency depends on granularity of locking
• Allow transactions to lock at different granularities
Multi-granularity Locking (2/3)
• Lock Instance Graph
Source: Concurrency Control and Recovery in Database Systems: Bernstein, Hadzilacos and Goodman
• Explicit and Implicit Locks
• Intention read and intention write locks
• An intention read lock conflicts only with explicit write locks; an intention write lock conflicts with explicit read and write locks; intention locks do not conflict with other intention locks
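The conflicts among the four lock types can be written down as a small table. The sketch assumes the standard multi-granularity compatibility rules, in which an intention read lock is compatible with an explicit read lock; the mode names "r", "w", "ir", "iw" and the helper function are illustrative.

```python
# Unordered pairs of lock modes that conflict (assumed standard rules).
CONFLICTS = {
    frozenset(["r", "w"]),
    frozenset(["w"]),          # w with w
    frozenset(["ir", "w"]),
    frozenset(["iw", "r"]),
    frozenset(["iw", "w"]),
}


def conflicts(a, b):
    """True if lock modes a and b cannot be held on the same node
    by two different transactions."""
    return frozenset([a, b]) in CONFLICTS


for pair in [("ir", "iw"), ("ir", "r"), ("iw", "r"), ("ir", "w")]:
    print(pair, conflicts(*pair))
```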
Multi-granularity Locking (3/3)
• To set rli[x] or irli[x], first hold irli[y] or iwli[y], such that y is the parent of x.
• To set wli[x] or iwli[x], first hold iwli[y], such that y is the parent of x.
• To schedule ri[x] (or wi[x]), Ti must hold rli[y] (or wli[y]) where y = x, or y is an ancestor of x.
• To release irli[x] (or iwli[x]) no child of x can be locked by Ti
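The first two rules imply an order of acquisition: intention locks on every ancestor from the root down, then the explicit lock on the target. A minimal sketch follows, assuming the lock instance graph is a tree given as a child-to-parent dictionary; the node names and the function are hypothetical. For a read it always uses an intention read lock on ancestors, although the rules would also accept an intention write lock there.

```python
def lock_plan(parent, x, mode):
    """Return the (node, lock) pairs to acquire, in order, before Ti may
    read (mode "r") or write (mode "w") node x."""
    intention = {"r": "ir", "w": "iw"}[mode]
    ancestors = []
    node = parent.get(x)
    while node is not None:          # walk up to the root
        ancestors.append(node)
        node = parent.get(node)
    # Lock root-to-leaf so each parent's intention lock is held first.
    return [(n, intention) for n in reversed(ancestors)] + [(x, mode)]


# Hypothetical hierarchy: database > area > file > record.
hierarchy = {"area": "db", "file": "area", "record": "file"}
print(lock_plan(hierarchy, "record", "w"))
# [('db', 'iw'), ('area', 'iw'), ('file', 'iw'), ('record', 'w')]
```

Releasing in the reverse order, leaf to root, satisfies the last rule, since no child is still locked when an intention lock is dropped.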
The Phantom Problem• How to lock a tuple, which (currently)