I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis
Pankaj K. Agarwal, Lars Arge, Ke Yi
Duke University
University of Aarhus
The Union-Find Problem
• A universe of N elements: x1, x2, …, xN
• Initially N singleton sets: {x1}, {x2 }, …, {xN}
• Each set has a representative
• Maintain the partition under
– Union(xi, xj) : Joins the sets containing xi and xj
– Find(xi) : Returns the representative of the set containing xi
The Solution
[Figure: sets drawn as trees over nodes labelled a, b, d, e, ...; the representatives are marked]
[Figure: Union(d, h) : link-by-rank]
[Figure: Find(n) : path compression]
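The link-by-rank and path-compression steps named on this slide can be sketched as below. This is a minimal in-memory illustration of the classical structure, not the I/O-efficient algorithm of the talk; the class name `UnionFind` and the choice of integers 0..n-1 as elements are assumptions made for the example.

```python
class UnionFind:
    """Disjoint sets over the integers 0..n-1 (illustrative sketch)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        # First pass: walk up to the root, which is the representative.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point each visited node at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # already in the same set
        # Link-by-rank: hang the lower-rank root under the higher-rank root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx


uf = UnionFind(5)
uf.union(0, 1)
uf.union(3, 4)
same_before = uf.find(0) == uf.find(3)  # False: the two sets are still separate
uf.union(1, 4)
same_after = uf.find(0) == uf.find(3)   # True: the two sets have been joined
```

Both operations follow parent pointers to arbitrary array positions, which is harmless in memory but is the access pattern that becomes costly once the structure no longer fits there.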
Complexity
• O(N α(N)) time for a sequence of N union and find operations [Tarjan 75]
– α(•) : Inverse Ackermann function (grows very slowly!)
– Optimal in the worst case [Tarjan 79, Fredman and Saks 89]
• Batched (off-line) version
– Entire sequence known in advance
– Can be improved to linear on RAM [Gabow and Tarjan 85]
– Not possible on a pointer machine [Tarjan 79]
Simple and Good, as long as …
The entire data structure fits in memory
The I/O Model
Main memory of size M
Disk of infinite size
One I/O transfers B items between memory and disk
Sources of “Non-Locality”
• Two operands in a union
• Nodes on a leaf-to-root path
• Operands in consecutive operations
– Cannot remove for the on-line case
Need to eliminate all of them in order to get less than one I/O per operation!
Our Results
• An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
– Same as sorting
– Optimal in the worst case
• A practical algorithm using O(sort(N) log(N/M)) I/Os
– Implemented
• Applications to terrain analysis
– Topological persistence : O(sort(N)) I/Os
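To get a feel for these bounds, the sketch below evaluates them for one set of parameters and compares them with N, the cost of spending one I/O per operation. The function names, the sample values of N, M and B, and the use of base 2 for log(N/M) are assumptions made for illustration; constant factors hidden by the O-notation are ignored.

```python
import math


def sort_ios(n, m, b):
    """Evaluate (N/B) * log_{M/B}(N/B), ignoring constant factors."""
    blocks = n / b
    return blocks * max(1.0, math.log(blocks, m / b))


def practical_ios(n, m, b):
    """Evaluate sort(N) * log(N/M); base 2 is assumed for the logarithm."""
    return sort_ios(n, m, b) * max(1.0, math.log2(n / m))


# Hypothetical parameters: 2^30 operations, memory of 2^20 items,
# blocks of 2^10 items.
N, M, B = 2**30, 2**20, 2**10

one_io_per_op = N                  # cost if every operation touches the disk
batched = sort_ios(N, M, B)        # bound for the O(sort(N)) algorithm
practical = practical_ios(N, M, B) # bound for the practical algorithm
```

With these sample values, sort(N) comes to about 2^21 I/Os, roughly 500 times fewer than N, and the practical bound is about 10 times sort(N).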