Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff
Gerth Stølting Brodal, Erik D. Demaine, Jeremy T. Fineman, John Iacono, Stefan Langerman, J. Ian Munro
Result presented at SODA 2010

Feb 24, 2016

Page 1: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Gerth Stølting Brodal, Erik D. Demaine, Jeremy T. Fineman, John Iacono, Stefan Langerman, J. Ian Munro

Result presented at SODA 2010

Page 2: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Dynamic Dictionary

Search(k)

Insert(e)

Delete(k)

Page 3: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

I/O Model [Aggarwal, Vitter 88]

[Figure: the CPU works on a fast memory holding M/B blocks of size B; data moves between fast memory and slow memory in blocks of size B.]

Cost: the number of block transfers (I/Os)

Page 4: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Cache-Oblivious Algorithms [Frigo, Leiserson, Prokop, Ramachandran 99]

• Algorithms not parameterized by M or B
• Analyze in the ideal-cache model: the I/O model, except that an optimal replacement policy is assumed

[Figure: the same two-level memory, with a fast memory of M/B blocks of size B and a slow memory accessed in blocks of size B.]

Page 5: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Cache-Oblivious Dynamic Dictionaries

Cache-Aware                          Search               Insert
B-tree [BM72]                        O(log_B N)           O(log_B N)
Buffered B-tree [BF03]               O((1/ε) log_B N)     O((1/(ε B^{1-ε})) log_B N)*

Cache-Oblivious                      Search               Insert
CO B-tree [BDF-00, BDIW04, BFJ02]    O(log_B N)           O(log_B N + …)
COLA [BFF-CFKN07]                    O(log_2 N)           O((1/B) log_2 N)*
Shuttle Tree [BFF-CFKN07]            O(log_B N)           O((1/B^{Ω(1/(log log B)^2)}) log_B N + …)*
xDict [this paper]                   O((1/ε) log_B N)     O((1/(ε B^{1-ε})) log_B N)*†

* amortized    † assumes M = Ω(B^2)
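To get a feel for the tradeoff in the table, the following sketch plugs sample values into the B-tree and xDict bounds. It drops all hidden constants, so the numbers are only relative indicators and not predicted I/O counts. The function names and the sample values of N, B and ε are chosen here for illustration.

```python
import math


def log_b(n: int, b: int) -> float:
    """log_B N, computed via base-2 logarithms."""
    return math.log2(n) / math.log2(b)


def btree_costs(n: int, b: int) -> tuple[float, float]:
    """(search, insert) for a B-tree: log_B N each, constants dropped."""
    return log_b(n, b), log_b(n, b)


def xdict_costs(n: int, b: int, eps: float) -> tuple[float, float]:
    """(search, amortized insert) from the xDict row, constants dropped."""
    search = (1 / eps) * log_b(n, b)
    insert = log_b(n, b) / (eps * b ** (1 - eps))
    return search, insert


if __name__ == "__main__":
    n, b = 2 ** 30, 2 ** 10  # sample sizes, assumed for illustration
    print("B-tree          :", btree_costs(n, b))
    for eps in (0.125, 0.25, 0.5):
        print(f"xDict eps={eps:<5}:", xdict_costs(n, b, eps))
```

Smaller ε makes inserts cheaper (the B^{1-ε} divisor grows) at the price of a 1/ε factor on searches.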

Page 6: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Building an xDict (ε = 1/2)

lg lg N x-boxes of squaring capacities:

2^{2^1}-box, 2^{2^2}-box, …, 2^{2^{lg lg N}}-box

Insert: insert into the smallest box
• When a box reaches capacity, Flush it and Batch-Insert into the next box
• O((1/√B) log_B x) cost is dominated by the largest box: O((1/√B) log_B N)

Search: search in each x-box
• O(log_B x) cost is dominated by the largest box: O(log_B N)
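The cascade described on this slide can be sketched in a few lines. This is only a structural illustration: each x-box is replaced by a plain sorted Python list, so it has none of the I/O guarantees of a real x-box, and deletions and duplicate keys are not handled. The class name `XDictSketch` and its methods are hypothetical.

```python
import bisect
from heapq import merge


class XDictSketch:
    """Chain of boxes with x = 2^(2^1), 2^(2^2), ...; box capacity is x^2."""

    def __init__(self):
        # boxes[i] is a sorted list standing in for the x-box with x = 2**(2**(i+1)).
        self.boxes = []

    @staticmethod
    def capacity(i: int) -> int:
        x = 2 ** (2 ** (i + 1))
        return x * x

    def insert(self, key):
        if not self.boxes:
            self.boxes.append([])
        bisect.insort(self.boxes[0], key)  # insert into the smallest box
        i = 0
        while len(self.boxes[i]) >= self.capacity(i):
            # "Flush": take the box's contents as one sorted array.
            flushed, self.boxes[i] = self.boxes[i], []
            if i + 1 == len(self.boxes):
                self.boxes.append([])
            # "Batch-Insert": merge the sorted array into the next box.
            self.boxes[i + 1] = list(merge(self.boxes[i + 1], flushed))
            i += 1

    def search(self, key) -> bool:
        # Search in each box in turn.
        for box in self.boxes:
            j = bisect.bisect_left(box, key)
            if j < len(box) and box[j] == key:
                return True
        return False
```

Note that a flushed box holds x^2 elements, which is exactly the x of the next box, so each flush hands the next box a batch of the size that Batch-Insert expects.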

Page 7: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

x-Box = dictionary with capacity x^2

Batch-Insert(D, A): insert Θ(x) presorted objects. Cost: O((1/√B) log_B x) per element

Search(D, κ): cost is O(log_B x)

Flush(D): produce a size-x^2 sorted array A containing all the elements in the x-box D. Cost: O(1/B) per element

Page 8: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Recursive x-Box

• size-x input buffer
• Upper level: at most x^{1/2}/4 subboxes, each a √x-box
• size-x^{3/2} middle buffer
• Lower level: at most x/4 subboxes, each a √x-box
• size-x^2 output buffer

Memory layout: input, …, middle, …, output

Subboxes are stored contiguously in arbitrary order.

Unused (currently empty) subboxes are preallocated.

Page 9: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

x-Box Space Usage

• size-x input buffer
• Upper level: at most x^{1/2}/4 subboxes, each a √x-box
• size-x^{3/2} middle buffer
• Lower level: at most x/4 subboxes, each a √x-box
• size-x^2 output buffer

Theorem: An x-box uses at most c·x^2 space (within a constant factor of its capacity, which is the size of the output buffer).
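One way to see why a bound of this form is plausible is to write the space as a recurrence over the components listed on this slide. The sketch below assumes every subbox is a √x-box and ignores any extra space for lookahead pointers. The symbol S(x) for the space of an x-box and the choice c ≥ 6 are made here for illustration and are not taken from the talk.

```latex
\begin{align*}
S(x) &\le \underbrace{x}_{\text{input}} + \underbrace{x^{3/2}}_{\text{middle}}
        + \underbrace{x^{2}}_{\text{output}}
        + \Bigl(\tfrac{x^{1/2}}{4} + \tfrac{x}{4}\Bigr)\, S(\sqrt{x}) \\
     &\le 3x^{2} + \Bigl(\tfrac{x^{1/2}}{4} + \tfrac{x}{4}\Bigr)\, c\,x
        && \text{(induction: } S(\sqrt{x}) \le c(\sqrt{x})^{2} = cx\text{)} \\
     &\le 3x^{2} + \tfrac{c}{2}\,x^{2} \;\le\; c\,x^{2}
        && \text{for } c \ge 6,\ x \ge 1 .
\end{align*}
```

The factor 1/4 in the subbox counts is what leaves room for the buffers: the subboxes together take at most half of the c·x^2 budget.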

Page 10: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Fractional Cascading within x-Box

• size-x input buffer
• Upper level: at most x^{1/2}/4 subboxes, each a √x-box
• size-x^{3/2} middle buffer
• Lower level: at most x/4 subboxes, each a √x-box
• size-x^2 output buffer

Propagate samples upwards + lookahead pointers

Page 11: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Searching in an x-Box

Describe searches by the recurrence

S(x) = 2S(√x) + O(1), with base case S(<√B) = 0

Solves to O(log_B x), which is O(log_B N) for the largest box

• size-x input buffer
• Upper level: at most x^{1/2}/4 subboxes, each a √x-box
• size-x^{3/2} middle buffer
• Lower level: at most x/4 subboxes, each a √x-box
• size-x^2 output buffer
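The recurrence can be checked numerically. The sketch below replaces the O(1) term by exactly 1 and works with lg x and lg B so that taking a square root becomes halving. The function name `search_cost` and the constant 4 in the bound are choices made here for illustration.

```python
def search_cost(lg_x: float, lg_b: float) -> int:
    """Unit-cost version of S(x) = 2 S(sqrt(x)) + 1 with S(x) = 0 for x < sqrt(B).

    Arguments are base-2 logarithms: lg_x = lg x and lg_b = lg B.
    """
    if lg_x < lg_b / 2:          # base case: x < sqrt(B)
        return 0
    return 2 * search_cost(lg_x / 2, lg_b) + 1


def log_b_of_x(lg_x: float, lg_b: float) -> float:
    """log_B x expressed through base-2 logarithms."""
    return lg_x / lg_b


if __name__ == "__main__":
    for lg_x in (8, 16, 32, 64):
        cost = search_cost(lg_x, 10)
        print(lg_x, cost, 4 * log_b_of_x(lg_x, 10))
```

The recursion has k levels, where k is the smallest integer with x^{1/2^k} < √B, so it visits 2^k - 1 nodes, and 2^k is at most 4 log_B x. That gives S(x) = O(log_B x).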

Page 12: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Flush

• Moves all real elements to the output buffer in sorted order.
• Lookahead pointers are rebuilt to facilitate searches. Most subboxes remain empty.

• size-x input buffer
• Upper level: at most x^{1/2}/4 subboxes, each a √x-box
• size-x^{3/2} middle buffer
• Lower level: at most x/4 subboxes, each a √x-box
• size-x^2 output buffer

Page 13: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Batch-Insert

1. Merge sorted input into input buffer.

[Figure: sorted input + input buffer; middle buffer; output buffer]

Page 14: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Batch-Insert

1. Merge sorted input into input buffer.
2. If input buffer is “full enough,” Batch-Insert into upper-level subboxes (in chunks of Θ(√x)).

[Figure: middle buffer; output buffer]

Page 15: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Batch-Insert

1. Merge sorted input into input buffer.
2. If input buffer is “full enough,” Batch-Insert into upper-level subboxes (in chunks of Θ(√x)).
3. Whenever a subbox is near capacity, Flush it, then split it into two subboxes.

[Figure: input buffer; middle buffer; output buffer]

Page 16: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Batch-Insert

1. Merge sorted input into input buffer.
2. If input buffer is “full enough,” Batch-Insert into upper-level subboxes (in chunks of Θ(√x)).
3. Whenever a subbox is near capacity, Flush it, then split it into two subboxes.
4. If no empty subboxes remain, Flush all of them and merge output buffers into middle buffer.

[Figure: input buffer; middle buffer; output buffer]
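The four steps can be mimicked with a toy model of the top half of an x-box, in which sorted Python lists stand in for the input buffer, the upper-level subboxes and the middle buffer. This is a rough sketch, not the structure from the talk: there is no recursion, no lookahead pointers and no I/O accounting. The thresholds for "full enough" and "near capacity" (both x/2 here), the reading of step 4 as "flush everything when a split is needed but no spare subbox is left", and the name `ToyXBoxTop` are all assumptions made for illustration.

```python
import bisect
import math
from heapq import merge


class ToyXBoxTop:
    """Input buffer, upper-level subboxes and middle buffer of a toy x-box."""

    def __init__(self, x: int):
        root = math.isqrt(x)
        self.chunk = max(1, root)              # chunks of about sqrt(x) elements
        self.full_enough = max(1, x // 2)      # assumed "full enough" threshold
        self.split_at = max(2, x // 2)         # assumed "near capacity" (a sqrt(x)-box holds x)
        self.max_subboxes = max(1, root // 4)  # slide: at most x^(1/2)/4 subboxes
        self.input, self.subboxes, self.middle = [], [], []

    def batch_insert(self, items):
        # Step 1: merge the sorted input into the input buffer.
        self.input = list(merge(self.input, items))
        if len(self.input) < self.full_enough:
            return
        # Step 2: push the buffer into the subboxes, one chunk at a time.
        pending, self.input = self.input, []
        for start in range(0, len(pending), self.chunk):
            for key in pending[start:start + self.chunk]:
                self._route(key)
            self._rebalance()

    def _route(self, key):
        # Send the key to the subbox whose key range contains it.
        if not self.subboxes:
            self.subboxes.append([key])
            return
        mins = [s[0] for s in self.subboxes]
        j = max(0, bisect.bisect_right(mins, key) - 1)
        bisect.insort(self.subboxes[j], key)

    def _rebalance(self):
        while True:
            j = next((i for i, s in enumerate(self.subboxes)
                      if len(s) >= self.split_at), None)
            if j is None:
                return
            if len(self.subboxes) >= self.max_subboxes:
                # Step 4: no spare subbox left, so flush all into the middle buffer.
                self.middle = list(merge(self.middle, *self.subboxes))
                self.subboxes = []
                return
            # Step 3: flush the subbox to a sorted array and split it in two.
            s = self.subboxes[j]
            half = len(s) // 2
            self.subboxes[j:j + 1] = [s[:half], s[half:]]
```

For example, with `box = ToyXBoxTop(256)`, repeated calls such as `box.batch_insert(sorted(batch))` with batches of 16 keys first fill `box.input`, then populate and split `box.subboxes`, and eventually move everything into `box.middle`.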

Page 17: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Generalizing to O((1/(ε B^{1-ε})) log_B N)

Parameterize by 0 < α ≤ 1, where α = ε/(1-ε)

• size-x input buffer
• Upper level: at most x^{1/2}/4 subboxes, each a √x-box
• size-x^{1+α/2} middle buffer
• Lower level: at most x^{1/2+α/2}/4 subboxes, each a √x-box
• size-x^{1+α} output buffer

xDict boxes: 2^{(1+α)^1}-box, 2^{(1+α)^2}-box, …, 2^{(1+α)^i}-box

1/ε overhead comes from geometric sum in xDict
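The geometric sum behind the 1/ε factor can be written out for the search cost. The derivation below assumes, as on the ε = 1/2 slides, that searching an x-box costs O(log_B x) and that a search visits every box of the xDict. The index k of the largest box and the notation x_i = 2^{(1+α)^i} are introduced here for illustration.

```latex
\begin{align*}
\sum_{i=1}^{k} \log_B x_i
  &= \frac{1}{\lg B}\sum_{i=1}^{k}(1+\alpha)^{i}
   = \frac{1}{\lg B}\cdot\frac{(1+\alpha)^{k+1}-(1+\alpha)}{\alpha} \\
  &\le \frac{1+\alpha}{\alpha}\cdot\frac{(1+\alpha)^{k}}{\lg B}
   = \frac{1}{\varepsilon}\,\log_B x_k ,
\end{align*}
```

The last step uses α = ε/(1-ε), which gives 1 + α = 1/(1-ε) and therefore (1+α)/α = 1/ε. Since the largest box has x_k ≤ N, the total is O((1/ε) log_B N), matching the search bound in the results table.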

Page 18: Cache-Oblivious Dynamic Dictionaries with Update/Query Tradeoff

Results Summary

Cache-Aware                          Search               Insert
B-tree [BM72]                        O(log_B N)           O(log_B N)
Buffered B-tree [BF03]               O((1/ε) log_B N)     O((1/(ε B^{1-ε})) log_B N)*

Cache-Oblivious                      Search               Insert
CO B-tree [BDF-00, BDIW04, BFJ02]    O(log_B N)           O(log_B N + …)
COLA [BFF-CFKN07]                    O(log_2 N)           O((1/B) log_2 N)*
Shuttle Tree [BFF-CFKN07]            O(log_B N)           O((1/B^{Ω(1/(log log B)^2)}) log_B N + …)*
xDict [this paper]                   O((1/ε) log_B N)     O((1/(ε B^{1-ε})) log_B N)*†

* amortized    † assumes M = Ω(B^2)