1-1
Dynamic Indexability and the Optimality of B-trees and Hash Tables
Ke Yi
Hong Kong University of Science and Technology
Dynamic Indexability and Lower Bounds for Dynamic One-Dimensional Range Query Indexes, PODS '09
Dynamic External Hashing: The Limit of Buffering, with Zhewei Wei and Qin Zhang, SPAA '09
+ some recent developments
2-3
An index is . . .
An index (database) is a (disk-based) data structure that improves the speed of data retrieval operations (queries) on a database table.
An index is a single number calculated from a set of prices
Dow Jones, S & P, Hang Seng
An index is a list of keywords and their page numbers in a book
An index is an exponent
An index is a finger
An index is a list of academic publications and their citations
An index (search engine) is an inverted list from keywords to webpages
3-3
Hash Table and B-tree
Hash tables and B-trees are taught to undergrads and actually used in all database systems
B-tree: lookups and range queries; Hash table: lookups
External memory model (I/O model):
Memory of size M
Disk partitioned into blocks of size B
Each I/O reads/writes one block
[Figure: the memory-disk hierarchy]
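To make the cost model concrete, here is a minimal sketch of I/O-model cost accounting in Python (the parameter values and function names are mine, purely for illustration):

```python
# Minimal I/O-model accounting sketch (values and names mine).
from math import ceil, log

M = 1_000_000   # memory holds M items
B = 100         # a block holds B items; one I/O moves one block

def scan_ios(N):
    """Reading N consecutive items costs ceil(N/B) I/Os."""
    return ceil(N / B)

def btree_search_ios(N):
    """A B-tree lookup costs about log_B N I/Os."""
    return ceil(log(N) / log(B))

N = 10**9
print(scan_ios(N))           # 10,000,000 I/Os
print(btree_search_ios(N))   # 5 I/Os (vs. ~30 for binary search)
```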
4-4
The B-tree
A range query in $O(\log_B N + K/B)$ I/Os
K: output size
[Figure: a B-tree with its top levels resident in memory; the I/Os below memory number $\log_B N - \log_B M = \log_B \frac{N}{M}$]
The height of a B-tree never goes beyond 5 (e.g., if B = 100, then a B-tree with 5 levels stores $10^{10}$ records). We will assume $\log_B \frac{N}{M} = O(1)$.
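A quick numeric check of this slide's example (B and N are the slide's numbers; K and the code are mine):

```python
# B = 100 gives a 5-level B-tree holding 100^5 = 10^10 records; a range
# query with output size K costs about log_B N + K/B I/Os.
from math import ceil, log

B, N, K = 100, 10**10, 5_000    # K: an arbitrary example output size

height = round(log(N) / log(B))     # 5 levels
query_ios = height + ceil(K / B)    # 5 + 50 = 55 I/Os
print(height, query_ios)
```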
5-3
External Hashing
[Figure: a hash table with one block per hash value, where $h(x)$ = last digit of $x$; e.g., the bucket for 1 holds {1, 21}, the bucket for 2 holds {32, 82}, the bucket for 4 holds {64, 34, 24, 14}, the bucket for 5 holds {55}, and the remaining buckets are empty]
Ideal hash function assumption: h maps each object to a hash value uniformly and independently at random
Expected average cost of a successful (or unsuccessful) lookup is $1 + 1/2^{\Omega(B)}$ disk accesses, provided the load factor is at most a constant smaller than 1 [Knuth, 1973]
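The $1 + 1/2^{\Omega(B)}$ behavior is easy to see in a toy simulation (a sketch under the ideal-hash assumption; all parameters are mine):

```python
# Toy check of external hashing under the ideal-hash assumption: throw
# load*B*n_buckets items into n_buckets uniformly random buckets of
# capacity B, and count the items that overflow their block (these are
# the lookups needing a 2nd I/O). The overflow fraction shrinks rapidly
# as B grows, consistent with the 1 + 1/2^{Omega(B)} bound.
import random

def overflow_fraction(B, n_buckets=10_000, load=0.8):
    counts = [0] * n_buckets
    n_items = int(load * B * n_buckets)
    for _ in range(n_items):
        counts[random.randrange(n_buckets)] += 1
    return sum(max(0, c - B) for c in counts) / n_items

for B in (10, 50, 200):
    print(B, overflow_fraction(B))
```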
6-2
Exact Numbers Calculated by Knuth
The Art of Computer Programming, volume 3, 1998, page 542
Extremely close to ideal
7-4
Now Let’s Go Dynamic
Focus on insertions first: Both the B-tree and the hash table do a search first, then insert into the appropriate block
B-tree: Split blocks when necessary
Hashing: Rebuild the hash table when too full; extensible hashing [Fagin, Nievergelt, Pippenger, Strong, 79]; linear hashing [Litwin, 80]
These resizing operations only add O(1/B) I/Os amortized per insertion; the bottleneck is the first search + insert
We cannot hope for fewer than 1 I/O per insertion, but only if the changes must be committed to disk right away (is that necessary?)
Otherwise we can probably lower the amortized insertion cost by buffering, as for numerous problems in external memory, e.g., the stack, the priority queue, ...; all of them support an insertion in O(1/B) I/Os, the best possible (see the sketch below)
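For intuition, here is a minimal external-memory stack (code mine, not from the talk) showing how buffering reaches O(1/B) amortized I/Os per insertion:

```python
# Minimal external-memory stack (code mine): keep up to 2B items in
# memory; when the buffer fills, flush the bottom B items to disk as
# one block. Each flush costs 1 I/O and pays for B pushes, so pushes
# cost O(1/B) I/Os amortized (and pops likewise).
class ExternalStack:
    def __init__(self, B):
        self.B, self.buf, self.disk, self.ios = B, [], [], 0

    def push(self, x):
        self.buf.append(x)
        if len(self.buf) == 2 * self.B:
            self.disk.append(self.buf[:self.B])   # write one block: 1 I/O
            self.buf = self.buf[self.B:]
            self.ios += 1

    def pop(self):
        if not self.buf and self.disk:
            self.buf = self.disk.pop()            # read one block: 1 I/O
            self.ios += 1
        return self.buf.pop()

s = ExternalStack(B=100)
for i in range(100_000):
    s.push(i)
print(s.ios / 100_000)   # ~0.01 = 1/B I/Os per push
```

Flushing only B of the 2B buffered items keeps alternating pushes and pops from thrashing at a block boundary.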
13-3
Our Main Result
For any dynamic range query index with a query cost of q and an amortized insertion cost of u, the following tradeoff holds:
$q \cdot \log(uB/q) = \Omega(\log B)$, for $q < \alpha \log B$, where $\alpha$ is any constant;
$uB \cdot \log q = \Omega(\log B)$, for all $q$.
Current upper bounds:
q: $\log B$, u: $\frac{1}{B} \log B$
q: $1$, u: $B^{\epsilon}/B$
q: $B^{\epsilon}$, u: $1/B$
Assuming $\log_B \frac{N}{M} = O(1)$, all the bounds are tight!
Can't be true for $B = o(\sqrt{\log n \log\log n})$, since the exponential tree achieves $u = q = O(\sqrt{\log n / \log\log n})$ [Andersson, Thorup, JACM'07]. ($n = N/M$)
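As a numeric sanity check (mine; the values of B and ε are arbitrary), the three upper-bound points can be tested against both branches of the tradeoff:

```python
# Check the slide's (q, u) upper-bound points against the tradeoff
# (code mine). The branch uB * log q = Omega(log B) must hold for all
# q; the branch q * log(uB/q) = Omega(log B) is claimed only for
# q < alpha * log B, so we check it at the q = 1 point. The printed
# values are Omega(log B) up to constant factors (eps for the last row).
from math import log2

B, eps = 2.0**20, 0.5
table = [("log B", log2(B), log2(B) / B),
         ("1",     1.0,     B**eps / B),
         ("B^eps", B**eps,  1.0 / B)]

logB = log2(B)
for name, q, u in table:
    print(f"q = {name:6s}: uB*log2(max(q,2)) = {u * B * log2(max(q, 2)):8.1f}"
          f"  vs log2(B) = {logB:.0f}")

# Small-q branch at q = 1:  1 * log2(uB / 1) = eps * log2(B).
print("q = 1 branch:", log2(B**eps), "vs", logB)
```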
14-1
The real question
How large does B need to be for the buffer tree to be optimal for range reporting?
Update cost: u = amortized I/O cost per insertion
18-3
The Ball-Shuffling Problem
[Figure: B balls are inserted one by one into q bins; dropping a ball into an empty bin costs 1, dropping it into a bin already holding one ball costs 2]
Cost of putting a ball directly into a bin = # balls in the bin + 1
19-5
The Ball-Shuffling Problem
[Figure: a shuffle involving bins that hold 4 balls in total, plus the new ball, for a cost of 5]
Shuffle: take the balls in any subset of the bins, together with the new ball, and redistribute them among those bins arbitrarily
Cost of a shuffle = # balls in the involved bins (including the new ball)
Putting a ball directly into a bin is a special shuffle
Goal: Accommodate all B balls using q bins with minimum cost
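A toy implementation of the game (code mine; a shuffle may redistribute the touched balls arbitrarily, and this sketch simply collects them into one bin):

```python
def play(B, q, choose_bins):
    """Insert B balls into q bins. choose_bins(bins) returns the indices
    of the bins involved in the shuffle for the next ball. The cost of a
    shuffle is the number of balls in the involved bins plus the new
    ball; here all touched balls are collected into the first bin."""
    bins, total = [0] * q, 0
    for _ in range(B):
        idx = choose_bins(bins)
        cost = sum(bins[i] for i in idx) + 1
        total += cost
        for i in idx:
            bins[i] = 0
        bins[idx[0]] = cost
    return total

B, q = 512, 8
# One bin only: cost 1 + 2 + ... + B = Theta(B^2).
print(play(B, 1, lambda bins: [0]))                                   # 131328
# Always drop into the least-loaded bin: Theta(B^2 / q).
print(play(B, q, lambda bins: [min(range(q), key=bins.__getitem__)])) # 16640
```

Smarter strategies that shuffle several bins at once can do better, but the lower bound on the coming slides caps the savings.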
20-5
The Workload Construction
[Figure: the workload; keys on the vertical axis, time on the horizontal axis; insertions arrive in rounds 1, 2, 3, ..., B, and a snapshot of the dynamic index is taken after each round]
Queries that we require the index to cover with q blocks: # queries $\geq 2MB$
21-1
The Workload Construction
[Figure: the same workload; rounds 1, 2, 3, ..., B of insertions over keys (vertical) and time (horizontal)]
There exists a query such that
• the $\leq B$ objects of the query reside in $\leq q$ blocks in all snapshots
• all of its objects are on disk in all B snapshots (the memory can spoil at most M queries per snapshot, so at least $2MB - M \cdot B = MB$ of the queries survive)
• the index moves its objects $uB^2$ times in total
22-3
The Reduction
An index with update cost u and query cost q gives us a solution to the ball-shuffling game with cost $uB^2$ for B balls and q bins
Lower bound on the ball-shuffling problem:
Theorem: The cost of any solution for the ball-shuffling problem is at least
$\Omega(q \cdot B^{1+\Omega(1/q)})$, for $q < \alpha \log B$ where $\alpha$ is any constant;
$\Omega(B \log_q B)$, for any $q$.
⇒ $q \cdot \log(uB/q) = \Omega(\log B)$, for $q < \alpha \log B$, $\alpha$ any constant; $uB \cdot \log q = \Omega(\log B)$, for all $q$.
23-5
Ball-Shuffling Lower Bounds
Theorem: The cost of any solution for the ball-shuffling problem is at least
$\Omega(q \cdot B^{1+\Omega(1/q)})$, for $q < \alpha \log B$ where $\alpha$ is any constant;
$\Omega(B \log_q B)$, for any $q$.
q: cost lower bound
$1$: $B^2$
$2$: $B^{4/3}$
$\log B$: $B \log B$
$B^{\epsilon}$: $B$
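As a quick check of the first row (a worked instance, mine): with a single bin, the i-th ball joins the i - 1 balls already there, so the total cost is

```latex
\[
  \sum_{i=1}^{B} i \;=\; \frac{B(B+1)}{2} \;=\; \Theta(B^2),
\]
```

matching the $q = 1$ entry and consistent with the theorem's first branch $\Omega(1 \cdot B^{1+\Omega(1)})$.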
24-1
Dynamic B-trees
Dynamic Hash Tables
25-3
Dynamic Hash Tables
B-tree query I/O: $O(\log_B \frac{N}{M})$
Hash table query I/O: $1 + 1/2^{\Omega(B)}$; insertion the same
A long-time conjecture in the external memory community:
The insertion cost must be $\Omega(1)$ I/Os if the query cost is required to be $O(1)$ I/Os.
Buffering is useless?
26-9
Dynamic Hash Tables (for successful queries)
Logarithmic method (folklore?)
[Figure: an in-memory table on top of on-disk hash tables of sizes m, 2m, 4m, 8m, ...]
Insertion: $O(\frac{1}{B} \log \frac{N}{M})$
Expected average query: $O(1)$
Improving query time. Idea: Keep one table large enough
[Figure: one large table of size x, plus smaller tables of total size $x/\beta$; when the small tables reach size $x/\beta$, everything is merged into a new large table of size 2x]
For some parameter $\beta = B^c$, $c \leq 1$
Insertion: $O(B^{c-1})$
Query: $1 + O(1/B^c)$
Still far from the target $1 + 1/2^{\Omega(B)}$
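Here is a compact sketch of the logarithmic method pictured above (code mine; merging follows the usual binary-counter pattern, and only the block writes during rebuilds are counted):

```python
# Logarithmic method sketch (code mine): geometrically growing hash
# tables; a full in-memory table of size m is merged into the levels
# like a binary-counter carry. Only block writes are counted.
class LogMethodHash:
    def __init__(self, m, B):
        self.m, self.B = m, B
        self.memory = set()
        self.levels = []          # levels[i] holds up to m * 2^i keys
        self.ios = 0

    def insert(self, key):
        self.memory.add(key)
        if len(self.memory) < self.m:
            return
        carry, self.memory = self.memory, set()
        for i in range(len(self.levels) + 1):
            if i == len(self.levels):
                self.levels.append(set())
            if not self.levels[i]:                # empty level: settle here
                self.levels[i] = carry
                self.ios += len(carry) // self.B  # sequential rebuild
                return
            carry |= self.levels[i]               # merge and carry on
            self.levels[i] = set()

    def lookup(self, key):        # <= 1 I/O per nonempty level
        return key in self.memory or any(key in t for t in self.levels)

h = LogMethodHash(m=1_000, B=100)
for i in range(50_000):
    h.insert(i)
print(h.ios / 50_000, h.lookup(123))   # ~ (1/B) log2(N/m) I/Os per insert
```

Each item is rewritten once per level it passes through, about $\log(N/M)$ times, at $1/B$ I/Os per item per rebuild, which gives the slide's $O(\frac{1}{B} \log \frac{N}{M})$ insertion cost.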
27-1
Query-Insertion Tradeoff for Successful Queries
[Figure: query cost (vertical) vs. insertion cost (horizontal) for successful queries, with upper and lower bounds [Wei, Yi, Zhang, SPAA'09]. Standard hashing sits at insertion $O(1)$ (lower bound $\Omega(1)$) with query cost $1 + 1/2^{\Omega(B)}$. With insertion cost $O(B^{c-1})$, the upper bound achieves query cost $1 + \Theta(1/B^c)$, $c < 1$; other labeled bounds include $1 + \Theta(1/B)$, $1 + \Theta(1/B^c)$ for $c > 1$, $\Omega(B^{c-1})$, and $1 - O(1/B^{(c-1)/4})$]
28-3
Indexability Too Strong!
Naïve solution: For every B items, write a block.
Query cost is 1, insertion is 1/B. Too many possible mappings!
Indexability + information-theoretic argument:
If, with only the information in memory, the hash table cannot locate the item, then querying it takes at least 2 I/Os.
29-1
The Abstraction
Consider the layout of a hash table at any snapshot. Denote all the blocks on disk by $B_1, B_2, \ldots, B_d$. Let $f : U \to \{1, \ldots, d\}$ be any function computable within memory.
We divide the inserted items into 3 zones with respect to f.
Memory zone M: the set of items stored in memory. $t_q = 0$.
Fast zone F: the set of items x such that $x \in B_{f(x)}$. $t_q = 1$.
Slow zone S: the rest of the items. $t_q = 2$.
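A small illustration of the three zones (code mine; the snapshot, the blocks, and f are hypothetical):

```python
# Classify items by their query cost t_q under the three-zone
# abstraction: memory zone (0), fast zone (1), slow zone (2).
def zones(items, memory, blocks, f):
    """blocks: list of sets B_1..B_d (0-indexed here); f maps a key to
    a block index. Returns a dict key -> t_q in {0, 1, 2}."""
    out = {}
    for x in items:
        if x in memory:
            out[x] = 0                  # memory zone M
        elif x in blocks[f(x)]:
            out[x] = 1                  # fast zone F: found in B_f(x)
        else:
            out[x] = 2                  # slow zone S: >= 2 I/Os
    return out

# Hypothetical snapshot: 2 disk blocks, f = parity of the key.
blocks = [{2, 4, 10}, {1, 3, 7}]
print(zones([1, 2, 5, 8, 9], memory={9}, blocks=blocks, f=lambda x: x % 2))
# {1: 1, 2: 1, 5: 2, 8: 2, 9: 0}
```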
30-1
The Key
The hash table can employ a family $\mathcal{F}$ of at most $2^M$ distinct f's.
Note that the current f adopted by the hash table depends on the already inserted items, but the family $\mathcal{F}$ has to be fixed beforehand.
31-1
How about All Queries? (Latest results)
We are essentially talking about the membership problem
Can't use the indexability model
Have to use the cell probe model
32-3
All queries (the membership problem)
(The cell probe model)
[Figure: query cost vs. insertion cost for the membership problem in the cell probe model, with upper and lower bounds. Upper bounds: hashing (query $1 + 1/2^{\Omega(B)}$), the buffer tree (insertion $\frac{1}{B} \log_{M/B} n$), and the truncated buffer tree (insertion $\frac{\ell}{B} \log_{\ell} n$, query $\log_{\ell} n$); other labeled values include $\frac{1}{B} \log n$, $B^{\epsilon}$, $\frac{B}{\log n}$, and $n^{\epsilon}$. Lower bounds: a bound with labeled values 1.1, 0.9, and $\log_{B \log n} n$ [Yi, Zhang, SODA'10], and $1 + 1/2^{\Omega(B)}$ [Verbin, Zhang, STOC'10]]
33-2
THE BIG BOLD CONJECTURE
All these fundamental data structure problems have the same query-update tradeoff in external memory when u = o(1), for sufficiently large B.
Strong implication: The buffer tree (and many of the log-method based structures) is simple, practical, versatile, and optimal!
Partial-sum: all B; Range reporting: $B > n^{\epsilon}$; Predecessor: unknown.