SIGMOD 99 Efficient Concurrency Control in Multidimensional Access Methods Kaushik Chakrabarti Sharad Mehrotra University of.
SIGMOD 99 http://www-db.ics.uci.edu
Efficient Concurrency Control in Multidimensional Access Methods
Kaushik Chakrabarti Sharad Mehrotra
University of Illinois at Urbana Champaign
University of California at Irvine
Presented at ACM SIGMOD Conference
June 1, 1999
Outline of talk
• Introduction
• Background
• Phantom protection in Generalized Search Trees
– Define granules
– Describe lock protocols
• Experiments
• Conclusion
Introduction
• An increasing number of applications deal with multidimensional data
• Examples: spatial (CAD, GIS), spatio-temporal (moving objects, weather)
• DBMS should allow applications to:
(1) define their own data types and operations
(2) define multidimensional access methods (AMs) for those data types for efficient query processing
• Object-relational (OR) technology solves (1)
• Generalized Search Trees (GiSTs) address (2)
Introduction
• For successful integration, we need to support concurrent accesses via GiSTs
• Concurrency control problems:
(1) Preserve consistency of the data structure
(2) Prevent phantom anomalies
• (1) has been addressed by Kornacker, Mohan and Hellerstein, SIGMOD 97
• This paper addresses the problem of phantom protection in GiSTs
Phantom
• Definition:
– T1 reads a set of items satisfying some <search-condition>
– T2 creates data items that satisfy T1’s <search-condition> and commits
– T1 repeats its scan with the same <search-condition> and gets a different set of items
• Serializability ⇒ No phantoms
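The three steps of the definition can be replayed in a few lines. This is only an illustrative sketch: the in-memory `points` table, the `scan` helper and the rectangular search condition are hypothetical names chosen for the example, and no locking is performed at all, which is why the phantom shows up.

```python
# Hypothetical in-memory "table" of 2-d points; no concurrency control is applied.
points = [(1.0, 1.0), (2.0, 3.0), (8.0, 8.0)]


def scan(table, region):
    """Return all points inside region = (xlo, ylo, xhi, yhi)."""
    xlo, ylo, xhi, yhi = region
    return [(x, y) for (x, y) in table if xlo <= x <= xhi and ylo <= y <= yhi]


region = (0.0, 0.0, 5.0, 5.0)        # T1's <search-condition>

first_read = scan(points, region)    # T1 reads the matching items
points.append((4.0, 4.0))            # T2 inserts a matching item and commits
second_read = scan(points, region)   # T1 repeats the same scan

# Items seen by the second scan but not by the first are phantoms.
phantoms = [p for p in second_read if p not in first_read]
print(phantoms)
```

A phantom-protection protocol has to make T2's insert wait (or conflict) until T1 finishes, even though the inserted item did not exist when T1 first scanned.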
Example
Solution
• Predicate locks: costly
• Granular locks: efficient
Key Range Locking
• ARIES/KVL (Mohan, 1990)
Phantoms in Spatial/Spatio-temporal Databases
• Compute average rainfall over all locations in a 2-d region, where the locations are indexed using a GiST
• Get all objects in a given region from a moving objects database where the objects are indexed using a GiST
Solutions
• Adapting key range locking (KRL): too costly
• Predicate-locking-based strategy by Kornacker, Mohan and Hellerstein, SIGMOD 97
• Our granular-locking-based approach for phantom protection in R-trees, ICDE 98: does not work well when applied to GiSTs (details in paper)
Granular Locking in GiST
• Solution involves
– Define the granules
– Define the lock protocol for the operations
• Challenges
– “nice” granules
– handling overlap among granules
– handling “loss of lock” problem
– high concurrency and low lock overhead
GiST
• Keys can be arbitrary predicates
• An AM can be implemented by specifying some extension methods which dictate the tree operations
Granules in GiST
• Leaf Granules: One per leaf node
• Non-leaf granules: One per non-leaf node
• Lock name: <table-name, index-name, node-id>
• Lock Coverage: defined by Granule Predicate (GP)
– GP(N) = BP(N) if N is root
– GP(N) = BP(N) ∧ GP(P) otherwise, P = parent(N)
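The recursive definition of the granule predicate can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes bounding predicates (BPs) are axis-aligned rectangles, so that the conjunction BP(N) ∧ GP(P) becomes rectangle intersection. The `Node` class and the `intersect` and `granule_predicate` helpers are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xlo, ylo, xhi, yhi)


@dataclass
class Node:
    node_id: int
    bp: Rect                          # bounding predicate, assumed rectangular
    parent: Optional["Node"] = None


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Conjunction of two rectangular predicates; None if unsatisfiable."""
    xlo, ylo = max(a[0], b[0]), max(a[1], b[1])
    xhi, yhi = min(a[2], b[2]), min(a[3], b[3])
    if xlo > xhi or ylo > yhi:
        return None
    return (xlo, ylo, xhi, yhi)


def granule_predicate(n: Node) -> Optional[Rect]:
    """GP(N) = BP(N) at the root, BP(N) AND GP(parent(N)) otherwise."""
    if n.parent is None:
        return n.bp
    parent_gp = granule_predicate(n.parent)
    if parent_gp is None:
        return None
    return intersect(n.bp, parent_gp)


root = Node(1, (0.0, 0.0, 10.0, 10.0))
child = Node(2, (5.0, 5.0, 12.0, 12.0), parent=root)
print(granule_predicate(child))
```

In the example the child's BP sticks out of the root's BP, so the lock on the child's granule covers only the part that lies inside every ancestor's BP.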
Locks
Overlap between granules
• Correctness: p ∧ p’ satisfiable ⇒ lset(p) ∩ lset(p’) ≠ NULL
• Problem does not arise in KRL
• Policies
– Overlap-for-Search & Cover-for-Insert (OSCI)
– Cover-for-Search & Overlap-for-Insert (CSOI)
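The correctness condition says that two conflicting predicates must end up requesting at least one common lock. The sketch below illustrates this for the OSCI policy under one reading of its name: a search locks every granule that overlaps its query, and an insert locks a single granule that covers the new object. Rectangular granules, point objects and all identifiers (`granules`, `search_lset`, `insert_lset`) are assumptions made for the example.

```python
# Hypothetical overlapping granules, each an axis-aligned rectangle.
granules = {
    "g1": (0, 0, 5, 5),
    "g2": (4, 4, 10, 10),
    "g3": (10, 0, 15, 5),
}


def overlaps(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def covers(g, pt):
    """True if rectangle g contains point pt."""
    return g[0] <= pt[0] <= g[2] and g[1] <= pt[1] <= g[3]


def search_lset(query):
    """Overlap-for-Search: lock every granule overlapping the query."""
    return {name for name, g in granules.items() if overlaps(g, query)}


def insert_lset(pt):
    """Cover-for-Insert: lock one granule that covers the new object."""
    for name in sorted(granules):
        if covers(granules[name], pt):
            return {name}
    return set()  # not covered by any granule (would require growth)


query = (3, 3, 6, 6)
new_point = (4.5, 4.5)              # satisfies the query, so the two conflict
common = search_lset(query) & insert_lset(new_point)
print(common)
```

Because the point lies inside the query, any granule covering it also overlaps the query, so the two lock sets cannot be disjoint.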
Loss of lock coverage
Search Protocol
• Acquire a commit duration S lock on the granule corresponding to each index node visited
• Correctness:
– GP(T) ∧ Q is satisfiable ⇒ ∀i Consistent(BP(Pi), Q), where Pi is an ancestor of T
• Note
– No object locks
– No extra cost except that of acquiring the lock (no extra checks)
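A search that follows this protocol might look like the sketch below. It is an assumption-laden illustration: BPs and the query Q are rectangles, `consistent` is a plain overlap test standing in for the GiST Consistent method, and the lock manager is reduced to a list that records requests without ever blocking. Lock names follow the <table-name, index-name, node-id> form; `"t"` and `"idx"` are placeholder names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    bp: tuple                                      # rectangular BP (assumed)
    children: list = field(default_factory=list)   # child nodes (non-leaf)
    objects: list = field(default_factory=list)    # points (leaf)


def consistent(bp, q):
    """Stand-in for GiST Consistent: do the rectangles overlap?"""
    return bp[0] <= q[2] and q[0] <= bp[2] and bp[1] <= q[3] and q[1] <= bp[3]


def contains(q, pt):
    return q[0] <= pt[0] <= q[2] and q[1] <= pt[1] <= q[3]


def search(node, q, locks, table="t", index="idx"):
    """Record a commit duration S lock on the granule of every node visited."""
    locks.append(((table, index, node.node_id), "S", "commit"))
    if not node.children:                          # leaf: no object locks taken
        return [o for o in node.objects if contains(q, o)]
    results = []
    for child in node.children:
        if consistent(child.bp, q):                # descend only where needed
            results.extend(search(child, q, locks, table, index))
    return results


leaf_a = Node(2, (0, 0, 5, 5), objects=[(1, 1), (4, 4)])
leaf_b = Node(3, (6, 6, 10, 10), objects=[(7, 7)])
root = Node(1, (0, 0, 10, 10), children=[leaf_a, leaf_b])

locks = []
hits = search(root, (3, 3, 5, 5), locks)
print(hits, [name[2] for name, mode, duration in locks])
```

The only work added to an ordinary tree search is one lock request per visited node, which matches the note that no extra predicate checks are needed.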
Insert Protocol
• Correctness:
– full coverage
– prevent phantoms due to loss of lock coverage
Insert Protocol
• Case 1: No growth, No split
– commit duration IX lock on g (target granule)
– commit duration X lock on O
• Case 2: Growth, No split
– 2 locks as before
– short duration IX lock on lowest unchanged node (LU-node)
Example
Insert Protocol
• Case 3: No growth, Split
– instant duration SIX on g
– commit duration IX on whichever granule contains O after the split; X on O
– instant duration SIX on each ancestor that splits
• Case 4: Growth, Split
– lock requirements of Cases 2 and 3
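The four insert cases can be summarized as a small lookup. The sketch below only lists which locks are requested, as (mode, target, duration) tuples; it does not implement a lock manager. The function name `insert_lock_plan` and the string labels are hypothetical, and Case 4 is taken here to mean Case 3's locks plus Case 2's short duration IX on the LU-node, which is one reading of "lock requirements of Cases 2 and 3".

```python
def insert_lock_plan(growth, split, splitting_ancestors=()):
    """Return the lock requests for inserting object O into target granule g."""
    plan = []
    if split:
        # Cases 3 and 4: g itself is split.
        plan.append(("SIX", "g", "instant"))
        plan.append(("IX", "granule containing O after split", "commit"))
    else:
        # Cases 1 and 2: g survives unchanged as the target granule.
        plan.append(("IX", "g", "commit"))
    plan.append(("X", "O", "commit"))
    if growth:
        # Cases 2 and 4: some ancestor BP grows; lock the lowest unchanged node.
        plan.append(("IX", "LU-node", "short"))
    if split:
        plan.extend(("SIX", a, "instant") for a in splitting_ancestors)
    return plan


for growth in (False, True):
    for split in (False, True):
        print(growth, split, insert_lock_plan(growth, split, ("parent(g)",) if split else ()))
```

Only the IX on the granule that finally holds O and the X on O are held until commit; the remaining locks are instant or short duration.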
Deletion Protocol
• Problem: g does not cover O after deletion ⇒ would need a commit duration lock on the LU-node
• We do:
– logical deletion (IX on target granule, X on object)
– defer physical deletion till transaction commits
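Logical deletion with deferred physical removal can be sketched as below. The `Leaf` and `Txn` classes are hypothetical, the lock list merely records requests, and releasing the locks at commit is an assumption of this sketch rather than something stated on the slide.

```python
class Leaf:
    def __init__(self, objects):
        # object -> "logically deleted" flag
        self.entries = {o: False for o in objects}

    def visible(self):
        """Objects that have not been logically deleted."""
        return [o for o, deleted in self.entries.items() if not deleted]


class Txn:
    def __init__(self):
        self.locks = []      # recorded lock requests (no blocking modelled)
        self.pending = []    # physical deletions deferred until commit

    def delete(self, leaf, granule, obj):
        self.locks.append(("IX", granule))   # IX on target granule
        self.locks.append(("X", obj))        # X on object
        leaf.entries[obj] = True             # logical deletion: entry stays in the leaf
        self.pending.append((leaf, obj))

    def commit(self):
        for leaf, obj in self.pending:
            del leaf.entries[obj]            # physical deletion happens only now
        self.pending.clear()
        self.locks.clear()                   # assumed: locks released at commit


leaf = Leaf([(1, 1), (2, 2)])
txn = Txn()
txn.delete(leaf, "g", (1, 1))
before_commit = dict(leaf.entries)
txn.commit()
after_commit = dict(leaf.entries)
print(before_commit, after_commit)
```

Until commit the entry stays in the leaf, so the granule does not shrink while the deleting transaction is still active.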
Protocol for Other Operations
• ReadSingle: S lock on object
• UpdateSingle:
– if indexed attributes not changed, IX on g, X on O
– else, deletion followed by insertion
• UpdateScan: same as search for the region, same as UpdateSingle for every object updated
Empirical Evaluation
• Data sets:
– 2-d spatial data: 62,556 2-d points from Sequoia 2000 benchmark
– 3-d feature data: first 3 Fourier coefficients from 480,471 Fourier vectors
Measurements & Parameters
• Performance: Throughput (tps)
• Concurrency: Conflict ratio
• Overhead: # locks, # predicate checks
• Parameters: multiprogramming level (MPL), transaction size, write probability, query size, external think time (fixed at 3 sec), restart delay (fixed at 3 sec)
Implementation
Performance
• 2-d data
[Figure: Throughput vs MPL. x-axis: MPL (10 to 100); y-axis: Throughput (tps), 0 to 16; series: GL, PL]
• 3-d data
[Figure: Throughput vs MPL. x-axis: MPL (10 to 75); y-axis: Throughput (tps), 0 to 10; series: GL, PL]
Performance/Concurrency
• Under various loads
[Figure: Throughput vs Write ratio. x-axis: Write ratio (0.1 to 0.9); y-axis: Throughput (tps), 0 to 15; series: GL, PL]
• Conflict ratio
[Figure: Concurrency vs. MPL. x-axis: MPL; y-axis: Conflict ratio, 0 to 1.5; series: GL, PL]
Overhead
• Search • Insert
[Figure: two "Lock Overhead" panels. x-axis: MPL; y-axis: # locks / # pred. checks (0 to 250 in the first panel, 0 to 800 in the second); series: GL, PL]
Conclusions
• GL is significantly more efficient than PL
• We expect the performance gap to increase with a better implementation (mainly of the lock manager, LM)
• Dimensionality curse is a problem in GL
• GL can be integrated with a consistency protocol for a complete solution to concurrency control in multidimensional AMs