I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, Ke Yi Duke University University of Aarhus.

Dec 18, 2015

Page 1

I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis

Pankaj K. Agarwal, Lars Arge, Ke Yi

Duke University

University of Aarhus

Page 2

The Union-Find Problem

• A universe of N elements: x1, x2, …, xN

• Initially N singleton sets: {x1}, {x2}, …, {xN}

• Each set has a representative

• Maintain the partition under
– Union(xi, xj): joins the sets containing xi and xj
– Find(xi): returns the representative of the set containing xi

Page 3

The Solution

[Figure: each set is drawn as a rooted tree whose root is the set's representative. Union(d, h) is performed with link-by-rank; Find(n) is followed by path compression.]
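As an illustration of the two techniques named on this slide, here is a minimal in-memory sketch in Python. The class and method names are hypothetical; the slide only names link-by-rank and path compression.

```python
class UnionFind:
    """Disjoint-set forest with link-by-rank and path compression."""

    def __init__(self, elements):
        # Every element starts as the root (representative) of its own tree.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        # First pass: walk up to the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # x and y are already in the same set
        # Link-by-rank: hang the lower-rank root under the higher-rank root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx
```

Every find and every union follows parent pointers to arbitrary nodes, which is harmless in memory but becomes the source of non-locality once the structure lives on disk.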

Page 4

Complexity

• O(N α(N)) for a sequence of N union and find operations [Tarjan 75]
– α(•): inverse Ackermann function (grows very slowly!)
– Optimal in the worst case [Tarjan 79, Fredman and Saks 89]

• Batched (off-line) version
– Entire sequence known in advance
– Can be improved to linear on RAM [Gabow and Tarjan 85]
– Not possible on a pointer machine [Tarjan 79]

Page 5

Simple and Good, as long as …

The entire data structure fits in memory

Page 6

The I/O Model

Main memory of size M

Disk of infinite size

One I/O transfers B items between memory and disk

Page 7

Sources of “Non-Locality”

• Two operands in a union

• Nodes on a leaf-to-root path

• Operands in consecutive operations
– Cannot be removed in the on-line case

Need to eliminate all of them in order to get fewer than one I/O per operation!

Page 8

Our Results

• An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
– Same as sorting
– Optimal in the worst case

• A practical algorithm using O(sort(N) log(N/M)) I/Os
– Implemented

• Applications to terrain analysis
– Topological persistence: O(sort(N)) I/Os
  • Implemented
– Contour trees: O(sort(N)) I/Os
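To get a feel for these bounds, the following sketch evaluates them numerically with the constant factors dropped. The function names, the base-2 logarithm for log(N/M), and the clamping to at least one pass are assumptions made for illustration; the slides give only the asymptotic expressions.

```python
import math


def sort_bound(n, m, b):
    """Return (N/B) * log_{M/B}(N/B), i.e. sort(N) without its constant factor.

    Assumes M >= 2B so that the logarithm base M/B is larger than 1.
    """
    blocks = n / b    # number of disk blocks holding the input
    fan_out = m / b   # number of blocks that fit in main memory
    # Assumption: clamp the log factor so at least one pass over the data is counted.
    return blocks * max(1.0, math.log(blocks, fan_out))


def practical_bound(n, m, b):
    """Return sort(N) * log2(N/M), the bound of the practical algorithm (no constants)."""
    return sort_bound(n, m, b) * max(1.0, math.log2(n / m))
```

For example, with N = 2^30 items, M = 2^27 and B = 2^13, `sort_bound` gives roughly 1.6 × 10^5 block transfers, compared with about 10^9 if every operation cost one I/O.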

Page 9

I/O-Efficient Batched Union-Find

• Assumption: no redundant unions
– Each union must join two different sets
– This assumption is removed later

• Two-stage algorithm
– Convert to interval union-find
  • Compute an order on the elements s.t. each union joins two adjacent sets
– Solve batched interval union-find

Page 10

Union Tree

1: Union(d, g)
2: Union(a, c)
3: Union(r, b)
4: Union(a, e)
5: Union(e, i)
6: Union(r, a)
7: Union(a, d)
8: Union(d, h)
9: Union(b, f)

[Figure: two union trees on the nodes r, a, b, c, d, e, f, g, h, i, with edges labeled by the union numbers 1 to 9.]

Equivalent union trees

Page 11

Transforming the Union Tree

[Figure: four successive versions of the union tree on the nodes r, a, b, c, d, e, f, g, h, i with edge weights 1 to 9, showing the tree being transformed step by step.]

Weights along root-to-leaf path decrease

Page 12

Formulating as a Batched Problem

[Figure: the original union tree and the transformed union tree, both with edge weights 1 to 9.]

For each edge, find the lowest ancestor edge with a higher weight
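The rule above can be sketched naively in memory as follows. This is only an illustration of the per-edge query, not the I/O-efficient method of the talk. The function name and the dictionary representation are assumptions, and the example tree is read off the nine unions listed on the Union Tree slide.

```python
def transform(parent, weight):
    """Re-attach every node just below its lowest ancestor edge of higher weight.

    parent[v] is v's parent in the union tree (None for the root);
    weight[v] is the time stamp of the edge (parent[v], v).
    Returns the parent map of the transformed tree; edge weights are unchanged.
    """
    new_parent = {}
    for v, p in parent.items():
        if p is None:
            new_parent[v] = None
            continue
        # Climb while the edge above the current attachment point is lighter.
        a = p
        while parent[a] is not None and weight[a] < weight[v]:
            a = parent[a]
        new_parent[v] = a  # the root if no heavier ancestor edge exists
    return new_parent


# Union tree of the running example (edge weight = number of the union).
parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a",
          "e": "a", "f": "b", "g": "d", "h": "d", "i": "e"}
weight = {"a": 6, "b": 3, "c": 2, "d": 7, "e": 4,
          "f": 9, "g": 1, "h": 8, "i": 5}
transformed = transform(parent, weight)
```

In this sketch, node d (edge weight 7) moves up to the root because no ancestor edge is heavier, while node i (weight 5) moves from e to a, below the edge of weight 6.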

Page 13

Cast in a Geometry Setting

[Figure: the union tree with edge weights 1 to 9 and its geometric representation.]

Euler Tour: in O(sort(N)) I/Os [Chiang et al. 95]

x: weight

y: positions in the tour

Page 14

Cast in a Geometry Setting

[Figure: the union tree with edge weights 1 to 9 and its geometric representation.]

For each edge, find the lowest ancestor edge with a higher weight

For each segment, find the shortest segment above and containing it
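A brute-force sketch of this correspondence: an Euler tour gives every edge a pair of tour positions, an ancestor edge's pair encloses its descendants' pairs, and "above" is read here as "of higher weight". The function names and the quadratic search are assumptions for illustration only; the talk solves this step with distribution sweeping, not with the loop below.

```python
def euler_intervals(children, root):
    """Map each non-root node v to the tour positions at which edge (parent, v) is walked."""
    interval = {}
    clock = 0

    def visit(u):
        nonlocal clock
        for v in children.get(u, []):
            start = clock                 # the tour walks down edge (u, v)
            clock += 1
            visit(v)
            interval[v] = (start, clock)  # the tour walks back up edge (u, v)
            clock += 1

    visit(root)
    return interval


def shortest_heavier_container(interval, weight):
    """For each edge, find the shortest enclosing segment that has a higher weight."""
    answer = {}
    for v, (lo, hi) in interval.items():
        best = None
        for u, (ulo, uhi) in interval.items():
            if ulo < lo and hi < uhi and weight[u] > weight[v]:
                if best is None or uhi - ulo < interval[best][1] - interval[best][0]:
                    best = u
        answer[v] = best  # None: no heavier ancestor edge exists
    return answer


# Edges are named by their lower endpoint; weights are the union numbers.
children = {"r": ["a", "b"], "a": ["c", "d", "e"], "b": ["f"],
            "d": ["g", "h"], "e": ["i"]}
weight = {"a": 6, "b": 3, "c": 2, "d": 7, "e": 4,
          "f": 9, "g": 1, "h": 8, "i": 5}
result = shortest_heavier_container(euler_intervals(children, "r"), weight)
```

Because the enclosing segments of an edge are exactly its ancestor edges, the shortest qualifying segment is the lowest ancestor edge with a higher weight.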

Page 15

Distribution Sweeping

M/B vertical slabs

[Figure labels: "checked here" and "checked recursively"]

Total cost: O(sort(N))

Page 16

In-Order Traversal

[Figure: the transformed union tree (weights along each root-to-leaf path decrease), with its nodes listed below in the order produced by the traversal.]

At u, with children u1, …, uk (in increasing order of weight):

1. Recursively visit the subtree at u1

2. Return u

3. For i = 2, …, k: recursively visit the subtree at ui

Claim: this traversal produces the right order
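The traversal and the claim can be checked on a small example with the sketch below. The transformed tree used here was derived by hand from the nine unions of the Union Tree slide (each node re-attached below its lowest heavier ancestor edge), so it may differ in detail from the slide's drawing. The function names are hypothetical.

```python
def interval_order(children, weight, root):
    """First child's subtree, then the node itself, then the remaining subtrees."""
    order = []

    def visit(u):
        kids = sorted(children.get(u, []), key=lambda v: weight[v])
        if kids:
            visit(kids[0])
        order.append(u)
        for v in kids[1:]:
            visit(v)

    visit(root)
    return order


def unions_join_adjacent_sets(order, unions):
    """Replay the unions in time order and check that each merges two neighboring intervals."""
    pos = {x: i for i, x in enumerate(order)}
    span = [(i, i) for i in range(len(order))]  # interval of the set holding position i
    for x, y in unions:
        (alo, ahi), (blo, bhi) = sorted([span[pos[x]], span[pos[y]]])
        if ahi + 1 != blo:
            return False  # the two sets are not adjacent intervals
        for i in range(alo, bhi + 1):
            span[i] = (alo, bhi)
    return True


# Hand-derived transformed tree; weight[v] is the weight of the edge above v.
children = {"r": ["a", "b", "d", "f", "h"], "a": ["c", "e", "i"], "d": ["g"]}
weight = {"a": 6, "b": 3, "c": 2, "d": 7, "e": 4,
          "f": 9, "g": 1, "h": 8, "i": 5}
unions = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]
order = interval_order(children, weight, "r")
```

On this input the sketch yields the order b, r, c, a, e, i, g, d, h, f, and replaying the nine unions against it merges two adjacent intervals every time.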

Page 17

Solving Interval Union-Find

Union: x: two operands; y: time stamp

Find: x: operand; y: time stamp

Four instances of batched ray shooting: O(sort(N))

Page 18

Handling Redundant Unions

• Union tree becomes a graph

• Compute the minimum spanning tree
– O(sort(N)) I/Os (randomized) [Chiang et al. 95]
– O(sort(N) loglog B) I/Os (deterministic) [Arge et al. 04]
– Deterministic O(sort(N)) I/Os if graph is planar
– Only MST edges are non-redundant
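Why the MST edges are exactly the non-redundant unions can be seen from a small in-memory sketch: replaying the unions in time order and keeping those that merge two different sets is Kruskal's algorithm with the time stamp as edge weight. This is not one of the external-memory MST algorithms cited above, and the function name and sample sequence are hypothetical.

```python
def non_redundant_unions(unions):
    """Return the (time, x, y) unions that join two different sets.

    The kept edges form the minimum spanning forest of the union graph
    when each edge is weighted by its time stamp.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for t, (x, y) in enumerate(unions, start=1):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx
            kept.append((t, x, y))
    return kept


# Unions 3 and 5 connect elements that are already in the same set.
sample = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "a")]
kept = non_redundant_unions(sample)
```

Here `kept` contains unions 1, 2 and 4; unions 3 and 5 close cycles in the union graph and are dropped.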

Page 19

A Practical Algorithm

• Previous algorithm too complicated
– 2 Euler tours
– 4 instances of batched ray shooting
– MST

• A simple and practical algorithm
– Divide-and-conquer
– O(sort(N) log(N/M)) I/Os
– Implemented

Page 20

Applications

1. Topological Persistence

2. Contour Trees

Page 21

Topological Persistence

Page 22

Formulated as Batched Union-Find

• Represented as a triangulated mesh

• Consider minimum-saddle pairs

• When we reach
– A minimum or maximum: do nothing
– A regular point u: issue union(u, v) for a lower neighbor v
– A saddle u: let v and w be nodes from the two connected pieces of u's lower link; issue find(v), find(w), union(u, v), union(u, w)

[Figure label: lower link]
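The case analysis above can be turned into a generator for the batched operation sequence. In the sketch below the vertex classification and the lower-link nodes are taken as given input, and the vertices are assumed to arrive sorted by increasing height. The `Vertex` type, its fields and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Vertex:
    name: str
    kind: str                     # "min", "max", "regular" or "saddle"
    lower: Tuple[str, ...] = ()   # one node per connected piece of the lower link


def batched_operations(vertices: List[Vertex]) -> List[Tuple[str, ...]]:
    """Emit the union/find sequence for vertices sorted by increasing height (assumed)."""
    ops: List[Tuple[str, ...]] = []
    for u in vertices:
        if u.kind in ("min", "max"):
            continue                                   # do nothing
        if u.kind == "regular":
            ops.append(("union", u.name, u.lower[0]))  # a lower neighbor v
        elif u.kind == "saddle":
            v, w = u.lower                             # the two pieces of the lower link
            ops += [("find", v), ("find", w),
                    ("union", u.name, v), ("union", u.name, w)]
    return ops


# Hypothetical mini-terrain: two minima, a regular point, a saddle and a maximum.
terrain = [Vertex("m1", "min"), Vertex("m2", "min"),
           Vertex("p", "regular", ("m1",)),
           Vertex("s", "saddle", ("p", "m2")),
           Vertex("top", "max")]
ops = batched_operations(terrain)
```

The whole sequence is known before any operation is executed, which is what makes the batched algorithm applicable.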

Page 23

Contour Trees

Page 24

Previous Results

• Directly maintain contours
– O(N log N) time [van Kreveld et al. 97]
– Needs union-split-find for circular lists
– Does not extend to higher dimensions

• Two sweeps by maintaining components, then merge
– O(N log N) time [Carr et al. 03]
– Extends to arbitrary dimensions

Page 25

Join Tree and Split Tree

[Figure: a join tree and a split tree on the nodes 1 to 9, with the label "Qualified nodes", followed by a second join tree and split tree in which node 2 no longer appears.]

Page 26

Final Contour Tree

[Figure: join tree, split tree and the resulting contour tree on the nodes 1 to 9.]

Hard to BATCH!

Page 27

Another Characterization

[Figure: join tree, split tree and contour tree on the nodes 1 to 9, with the nodes u, v and w marked.]

Let w be the highest node that is a descendant of v in the join tree and an ancestor of u in the split tree; then (u, w) is a contour tree edge

Now can BATCH!

Page 28

Experiment 1: Random Union-Find

Page 29

Experiment 2: Topological Persistence on Terrain Data

Neuse River Basin of NC

Page 30

Experiment 2: Topological Persistence on Terrain Data

Page 31

Summary

• An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
– Optimal in the worst case

• A practical algorithm using O(sort(N) log(N/M)) I/Os

• Applications to terrain analysis
– Topological persistence: O(sort(N)) I/Os
– Contour trees: O(sort(N)) I/Os

• Open question: on-line case
– Can we get below O(N α(N)) I/Os?

Page 32

Thank you!