B-trees and Hash Tables Dynamic Indexability and …yike/talks/external-lb-overview.pdfExternal Hashing null null null null null null null 1 21 32 82 64 34 24 55 h(x) = last digit

1-1

Dynamic Indexability and the Optimality ofB-trees and Hash Tables

Ke Yi

Hong Kong University of Science and Technology

Dynamic Indexability and Lower Bounds for Dynamic One-DimensionalRange Query Indexes, PODS ’09

Dynamic External Hashing: The Limit of Buffering, with Zhewei Weiand Qin Zhang, SPAA ’09

+ some latest development

2-1

An index is . . .

An index is a single number calculated from a set of prices

Dow Jones, S & P, Hang Seng

2-2

An index is . . .



An index is a list of keywords and their page numbers in a book

An index is an exponent

An index is a finger

An index is a list of academic publications and their citations

2-3

An index is . . .

An index (database) is a (disk-based) data structure that im-proves the speed of data retrieval operations (queries) on adatabase table.



An index is a list of keywords and their page numbers in a book

An index is an exponent

An index is a finger

An index is a list of academic publications and their citations

An index (search engine) is an inverted list from keywords to webpages

3-1

Hash Table and B-tree

Hash tables and B-trees are taught to undergrads and actuallyused in all database systems

3-2



B-tree: lookups and range queries; Hash table: lookups

3-3



B-tree: lookups and range queries; Hash table: lookups

External memory model (I/O model):

Each I/O reads/writes a block

Memory of size M

Disk partitioned into blocks of size B

Memory

Disk

4-1

The B-tree

4-2

The B-tree

A range query in O(logB N +K/B) I/Os

K: output size

4-3

The B-tree


K: output size

memory

logB N − logBM = logBNM

4-4

The B-tree


K: output size

memory

logB N − logBM = logBNM

The height of B-tree never goes beyond 5 (e.g., if B = 100, thena B-tree with 5 levels stores n = 10 billion records). We willassume logB

NM = O(1).

5-1

External Hashing

null

null

null null

null

null

null

1 21

32 82

64 34 24

55

h(x) = last digit of x

14 null null

5-2

External Hashing

null

null

null null

null

null

null

1 21

32 82

64 34 24

55


14 null null

Ideal hash function assumption: h maps each object to a hash valueuniformly independently at random

5-3

External Hashing

null

null

null null

null

null

null

1 21

32 82

64 34 24

55


14 null null

Ideal hash function assumption: h maps each object to a hash valueuniformly independently at random

Expected average cost of a successful (or unsuccessful) lookup is1 + 1/2Ω(B) disk accesses, provided the load factor is less than aconstant smaller than 1 [Knuth, 1973]

6-1

Exact Numbers Calculated by Knuth

The Art of Computer Programming, volume 3, 1998, page 542

6-2

Exact Numbers Calculated by Knuth

The Art of Computer Programming, volume 3, 1998, page 542

Extremely close to ideal

7-1

Now Let’s Go Dynamic

B-tree: Split blocks when necessary

Focus on insertions first: Both the B-tree and hash table do asearch first, then insert into the appropriate block

Hashing: Rebuild the hash table when too full; extensible hashing[Fagin, Nievergelt, Pippenger, Strong, 79]; linear hashing [Litwin,80]

7-2




These resizing operations only add O(1/B) I/Os amortized perinsertion; bottleneck is the first search + insert


7-3


Cannot hope for lower than 1 I/O per insertion only if thechanges must be committed to disk right away (necessary?)





7-4


Cannot hope for lower than 1 I/O per insertion only if thechanges must be committed to disk right away (necessary?)


Otherwise we probably can lower the amortized insertion cost bybuffering, like numerous problems in external memory, e.g. stack,priority queue,... All of them support an insertion in O(1/B) I/Os— the best possible




8-1

Dynamic B-trees

Dynamic Hash Tables

9-1

Dynamic B-trees for Fast Insertions

LSM-tree [O’Neil, Cheng, Gawlick,O’Neil, Acta Informatica’96]: Log-arithmic method + B-tree

memory

`2M

`M

M

9-2



memory

`2M

`M

M

Insertion: O( `B

log`NM

)

Query: O(log`NM

) (omit the KB

output term)

9-3



memory

`2M

`M

M

Insertion: O( `B

log`NM

)

Query: O(log`NM

) (omit the KB

output term)

Stepped merge tree [Jagadish, Narayan, Seshadri, Sudar-shan, Kannegantil, VLDB’97]: variant of LSM-tree

Insertion: O( 1B

log`NM

)

Query: O(` log`NM

)

9-4



memory

`2M

`M

M

Insertion: O( `B

log`NM

)

Query: O(log`NM

) (omit the KB

output term)

Stepped merge tree [Jagadish, Narayan, Seshadri, Sudar-shan, Kannegantil, VLDB’97]: variant of LSM-tree

Insertion: O( 1B

log`NM

)

Query: O(` log`NM

)

Usually ` is set to be a constant, then they both haveO( 1

B log NM ) insertion and O(log N

M ) query

10-1

More Dynamic B-trees

Y-tree [Jermaine, Datta, Omiecinski, VLDB’99]

Buffer-tree (buffered-repository tree) [Arge, WADS’95; Buchsbaum,

Goldwasser, Venkatasubramanian, Westbrook, SODA’00]

Streaming B-tree [Bender, Farach-Colton, Fineman, Fogel, Kuszmaul,

Nelson, SPAA’07]

10-2






Nelson, SPAA’07]

q ulogB 1

B logB1 1

BBε

Bε 1B

10-3





Deletions? Standard trick: inserting “delete signals”


Nelson, SPAA’07]

q ulogB 1

B logB1 1

BBε

Bε 1B

10-4



No better solutions known ...



Deletions? Standard trick: inserting “delete signals”


Nelson, SPAA’07]

q ulogB 1

B logB1 1

BBε

Bε 1B

Cache-oblivious model [Demaine, Fineman, Iacono, Langerman, Munro,

SODA’10]

11-1

Compare with the rich results in RAM!

Θ(√

logN/ log logN) insertion and query [Andersson, Thorup,JACM’07]

Predecessor

Range reporting

O(logN/ log logN) insertion and O(log logN) query [Mortensen,Pagh, Patrascu, STOC’05]

O(√

logN/ log logN) insertion and query [Andersson, Thorup,JACM’07]

Partial-sum

Θ(logN) insertion query [Patrascu, Demaine, SODA’04]

Other results that depend on the word size w

12-1

Are the EM and DB people just dumb?

13-1

Our Main Result

For any dynamic range query index with a query cost of q and anamortized insertion cost of u, the following tradeoff holdsq · log(uB/q) = Ω(logB), for q < α logB,α is any constant;uB · log q = Ω(logB), for all q.

13-2

Our Main Result


Current upper bounds:q u

logB 1B logB

1 1BB

ε

Bε 1B

Assuming logBNM

= O(1), all the bounds are tight!

13-3

Our Main Result


Current upper bounds:q u

logB 1B logB

1 1BB

ε

Bε 1B

Assuming logBNM

= O(1), all the bounds are tight!

Can’t be true for B = o(√

log n log log n), since the exponentialtree achieves u = q = O(

√log n/ log log n) [Andersson, Thorup,

JACM’07]. (n = N/M)

1B

log NM

log NM

14-1

The real question

How large does B need to be for buffer-treeto be optimal for range reporting?

Known: somewhere betweenΩ(√

log n log log n) and O(nε)

15-1

Lower Bound Model: Dynamic Indexability

Indexability: [Hellerstein, Koutsoupias, Papadimitriou, PODS’97, JACM’02]

15-2



4 7 9 1 2 4 3 5 8 2 6 7 1 8 9 4 5

Objects are stored in disk blocks of size up to B, possibly withredundancy.

15-3



4 7 9 1 2 4 3 5 8 2 6 7 1 8 9 4 5


a query reports 2,3,4,5

15-4



4 7 9 1 2 4 3 5 8 2 6 7 1 8 9 4 5



The query cost is the minimum number of blocks that cancover all the required results (search time ignored!).

cost = 2

15-5



4 7 9 1 2 4 3 5 8 2 6 7 1 8 9 4 5



The query cost is the minimum number of blocks that cancover all the required results (search time ignored!).

cost = 2

Similar in spirit to popular lower bound models: cell probemodel, semigroup model

16-1

Previous Results on Static Indexability

Nearly all external indexing lower bounds are under this model

Tradeoff between space (s) and query time (q)

16-2



2D range queries: s/N · log q = Ω(log(N/B)) [Hellerstein,Koutsoupias, Papadimitriou, PODS’97], [Koutsoupias, Taylor, PODS’98],[Arge, Samoladas, Vitter, PODS’99]


16-3


2D stabbing queries: q · log(s/N) = Ω(log(N/B)) [Arge, Samo-ladas, Yi, ESA’04, Algorithmica’99]




16-4



1D range queries: s = N, q = 1 trivially




16-5



1D range queries: s = N, q = 1 trivially

Adding dynamization makes it much more interesting!




17-1

Dynamic Indexability

Still consider only insertions

17-2



4 5time t:

memory of size M

4 7 9 ← snapshot1 2 7

blocks of size B = 3

17-3



4 5time t:

memory of size M

4 7 9 ← snapshot1 2 7


4 5time t+ 1: 4 7 91 2 6 7 6 inserted

17-4



4 5time t:

memory of size M

4 7 9 ← snapshot1 2 7

1 2 6 7

8 inserted


4 5time t+ 1: 4 7 91 2 6 7 6 inserted

1 2 5time t+ 2: 4 7 9 6 8

17-5



4 5time t:

memory of size M

4 7 9 ← snapshot1 2 7

1 2 6 7

8 inserted


4 5time t+ 1: 4 7 91 2 6 7 6 inserted

1 2 5time t+ 2: 4 7 9 6 8

transition cost = 2

17-6



4 5time t:

memory of size M

4 7 9 ← snapshot1 2 7

1 2 6 7

8 inserted


4 5time t+ 1: 4 7 91 2 6 7 6 inserted

1 2 5time t+ 2: 4 7 9 6 8

transition cost = 2

Update cost: u = amortized transition cost per insertion

18-1

The Ball-Shuffling Problem

→B balls q bins

18-2


→B balls q bins

→ cost = 1

18-3


→B balls q bins

→ cost = 1

→ cost = 2

cost of putting the ball directly into a bin = # balls in the bin + 1

19-1


→B balls q bins

19-2


→B balls q bins

→ cost = 5Shuffle:

19-3


→B balls q bins

→ cost = 5

Cost of shuffling = # balls in the involved bins

Shuffle:

19-4


→B balls q bins

→ cost = 5


Shuffle:

Putting a ball directly into a bin is a special shuffle

19-5


→B balls q bins

→ cost = 5


Shuffle:

Putting a ball directly into a bin is a special shuffle

Goal: Accommodating all B balls using q bins with minimum cost

20-1

The Workload Construction

round 1:

keys

time

20-2


round 1:

round 2:

keys

time

20-3


round 1:

round 2:

keys

time

round 3:

· · ·

round B:

20-4


round 1:

round 2:

keys

time

round 3:

· · ·

round B:

Queries that we require the index to cover with q blocks# queries ≥ 2MB

20-5


round 1:

round 2:

keys

time

round 3:

· · ·

round B:

Queries that we require the index to cover with q blocks# queries ≥ 2MB

snapshot

snapshot

snapshot

snapshot

Snapshots of the dynamic index considered

21-1


round 1:

keys

time

round 2:

round 3:· · ·

round B:

There exists a query such that

• The ≤ B objects of the query reside in ≤ q blocksin all snapshots

• All of its objects are on disk in all B snapshots (wehave ≥MB queries)

• The index moves its objects uB2 times in total

22-1

The Reduction

An index with update cost u and query A gives us asolution to the ball-shuffling game with cost uB2 for Bballs and q bins

22-2

The Reduction


Lower bound on the ball-shuffling problem:

Theorem: The cost of any solution for the ball-shuffling problemis at least

Ω(q ·B1+Ω(1/q)), for q < α logB where α is any constant;Ω(B logq B), for any q.

22-3

The Reduction


Lower bound on the ball-shuffling problem:



q · log(uB/q) = Ω(logB), for q < α logB,α is any constant;uB · log q = Ω(logB), for all q.

⇒

23-1

Ball-Shuffling Lower Bounds



q

cost lower bound

23-2




q

cost lower bound

1

B2

23-3




q

cost lower bound

1

B2

B4/3

2

23-4




q

cost lower bound

1

B2

B4/3

2 logB

B logB

23-5




q

cost lower bound

1

B2

B4/3

2 logB

B logB

Bε

B

24-1

Dynamic B-trees

Dynamic Hash Tables

25-1

Dynamic Hash Tables

B-tree query I/O: O(logBNM )

Hash table query I/O: 1 + 1/2Ω(B); insertion the same

25-2

Dynamic Hash Tables



A long-time conjecture in the external memory community:

The insertion cost must be Ω(1) I/Os if the query cost is requiredto be O(1) I/Os.

25-3

Dynamic Hash Tables



A long-time conjecture in the external memory community:

The insertion cost must be Ω(1) I/Os if the query cost is requiredto be O(1) I/Os.

Buffering is useless?

26-1

Dynamic Hash Tables (for successful queries)

Logarithmic method (folklore?)memory

4m

2m

m

8m

26-2



4m

2m

m

Insertion: O( 1B log N

M )Expected average query: O(1)

8m

26-3



4m

2m

m



Improving query time

Idea: Keep one table large enough

8m

26-4



4m

2m

m





8m

x x/β

For some parameter β = Bc, c ≤ 1

26-5



4m

2m

m





8m

x x/β


26-6



4m

2m

m





8m

x x/β


2x2x

26-7



4m

2m

m





8m

x x/β


2x2xInsertion: O(Bc−1)

26-8



4m

2m

m





8m

x x/β



Query: 1+O(1/Bc)

26-9



4m

2m

m





8m

x x/β



Query: 1+O(1/Bc)

Still far from the target 1+1/Ω(2B)

27-1

Query-Insertion Tradeoff for Successful queries

1 + 1/2Ω(B)

1−O(1/B(c−1)/4)

Ω(Bc−1)

O(Bc−1)

Ω(1)

O(1)

Insertion

Query

1 + Θ(1/B)

1 + Θ(1/Bc), c < 11

upper bounds

lower bounds

1+Θ(1/Bc)c > 1

[Wei, Yi, Zhang, SPAA’09]

standard hashing

28-1

Indexability Too Strong!

Naıve solution: For every B items, write to a block.

Query cost is 1, insertion is 1/B

28-2




Too many possiblemappings!

28-3




Indexabilty + information-theoretical argument

If with only the information in memory, the hash table cannotlocate the item, then querying it takes at least 2 I/Os.

Too many possiblemappings!

29-1

The Abstraction

Consider the layout of a hash table at any snapshot. Denote allthe blocks on disk by B1, B2, . . . , Bd. Let f : U → 1, . . . , dbe any function computable within memory.

We divide items inserted into 3 zones with respect to f .

Memory zone M : set of items stored in memory. tq = 0.

Fast zone F : set of items x such that x ∈ Bf(x). tq = 1.

Slow zone S: The rest of items. tq = 2.

30-1

The Key

The hash table can employ afamily F of at most 2M dis-tinct f ’s.

Note that the current fadopted by the hash table isdependent upon the already in-serted items, but the family Fhas to be fixed beforehand.

31-1

How about All Queries? (Latest results)

We are essentially talking about the membership problem

Can’t use indexability model

Have to use cell probe model

32-1

All queries (the membership problem)

(The cell probe model)

Query

Insert0

upper bounds

lower bounds

1B

log n 11B

logMB

n

`B

log` n, log` n

nε

truncated buffer tree

buffer tree

hashing

1 + 1/2Ω(B)

Bε

Blog n

32-2



Query

Insert0

upper bounds

lower bounds

1B

log n 11B

logMB

n

`B

log` n, log` n

nε


buffer tree

hashing

1.1

[Yi, Zhang, SODA’10]

0.91

1 + 1/2Ω(B)

Bε

Blog n

32-3



Query

Insert0

upper bounds

lower bounds

1B

log n 11B

logMB

n

`B

log` n, log` n

nε


buffer tree

hashing

1.1

[Yi, Zhang, SODA’10]

logB logn n

0.91

1 + 1/2Ω(B)[Verbin, Zhang, STOC’10]

Bε

Blog n

33-1

THE BIG BOLD CONJECTURE

All these fundamental data structure prob-lems have the same query-update tradeoffin external memory when u = o(1), for suf-ficiently large B.

Partial-sum: all B; Range reporting: B > nε; Predecessor: unknown.

33-2

THE BIG BOLD CONJECTURE

All these fundamental data structure prob-lems have the same query-update tradeoffin external memory when u = o(1), for suf-ficiently large B.

Strong implication: The buffer tree (and many

of the log method based structures) is simple, practical,versatile, and optimal!

Partial-sum: all B; Range reporting: B > nε; Predecessor: unknown.

34-1

The End

T HANK YOU

Q and A

B-trees and Hash Tables Dynamic Indexability and …yike/talks/external-lb-overview.pdfExternal Hashing null null null null null null null 1 21 32 82 64 34 24 55 h(x) = last digit

Documents