Nearest Neighbor Searching Under Uncertainty Wuzhou Zhang Supervised by Pankaj K. Agarwal Department of Computer Science Duke University

Dec 31, 2015


Dylan Woods
Transcript
Page 1:

Nearest Neighbor Searching Under Uncertainty

Wuzhou Zhang

Supervised by Pankaj K. Agarwal

Department of Computer Science

Duke University

Page 2:

Nearest Neighbor Searching (NNS)

Applications

Pattern Recognition, Data Compression

Statistical Classification, Clustering

Databases, Information Retrieval

Computer Vision, etc.

Source: http://en.wikipedia.org/wiki/Nearest_neighbor_search

Page 3:

Nearest Neighbor Searching Under Uncertainty

• Discrete pdf

• Continuous pdf

[Figure: uncertain points whose possible locations are labeled with probabilities 0.1, 0.2, 0.3, and 0.4]

Page 4:

Nearest Neighbor In Expectation
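The formula on this slide did not survive in the transcript. As a minimal sketch, assuming the usual definition (the expected nearest neighbor of a query q is the uncertain point that minimizes the expected distance to q) and assuming discrete pdfs with the Euclidean distance, a brute-force linear scan could look like this. The function names and the sample data are hypothetical, not taken from the talk.

```python
import math

def expected_distance(q, locations, probs):
    """Expected Euclidean distance from q to an uncertain point whose
    location is locations[j] with probability probs[j]."""
    return sum(w * math.dist(q, p) for p, w in zip(locations, probs))

def expected_nn(q, uncertain_points):
    """Index of the uncertain point with the smallest expected distance to q.
    This is a linear scan, not the data structure discussed in the talk."""
    return min(
        range(len(uncertain_points)),
        key=lambda i: expected_distance(q, *uncertain_points[i]),
    )

# Two hypothetical uncertain points, each given as (locations, probabilities).
points = [
    ([(0.0, 0.0), (4.0, 0.0)], [0.5, 0.5]),
    ([(1.0, 1.0), (1.0, 3.0), (3.0, 1.0)], [0.2, 0.4, 0.4]),
]
```

Calling `expected_nn((1.0, 2.0), points)` compares the two expected distances and returns the index of the smaller one.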

Page 5:

Bisector In Case Of Gaussian

For Gaussian distributions, the bisector is a line!

An explicit formula is hard to get!

Figure: http://www.cs.utah.edu/~hal/courses/2009S_AI/Walkthrough/KalmanFilters/

Page 6:

Squared Distance Function

The bisector is simple and beautiful!

In the case of a discrete pdf, the bisector is also a line!

In both cases, compute the Voronoi diagram and solve the problem optimally!

However, the squared distance is not a metric!
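Why the bisector is a line under the squared distance can be sketched with a short derivation. The symbols below (mean \mu_i and total variance \sigma_i^2 of the uncertain point P_i) are notation introduced here for illustration, not taken from the slides.

```latex
% Expected squared distance from a query q to an uncertain point P_i,
% with \mu_i = E[P_i] and \sigma_i^2 = E\|P_i - \mu_i\|^2:
\begin{align*}
E\|q - P_i\|^2
  &= \|q - \mu_i\|^2 + 2\,\langle q - \mu_i,\; E[\mu_i - P_i] \rangle
     + E\|\mu_i - P_i\|^2 \\
  &= \|q - \mu_i\|^2 + \sigma_i^2
  \qquad \text{since } E[\mu_i - P_i] = 0 .
\end{align*}
% Setting the expected squared distances to P_1 and P_2 equal, the
% \|q\|^2 terms cancel and the bisector is linear in q:
\begin{equation*}
2\,\langle q,\; \mu_2 - \mu_1 \rangle
  = \|\mu_2\|^2 - \|\mu_1\|^2 + \sigma_2^2 - \sigma_1^2 .
\end{equation*}
```

The derivation uses only the mean and the variance, so it applies to discrete and Gaussian pdfs alike. The squared distance fails the triangle inequality even for certain points: for 0, 1, 2 on a line, the squared distance from 0 to 2 is 4, while the two shorter steps sum to 2.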

Page 7:

Sampling Continuous Distributions

Sometimes working with continuous distributions is hard...

Lower bounds for other metrics and distributions are also possible...

Let’s focus on discrete pdfs, then...
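The sample sizes and bounds from this slide are not in the transcript. The basic idea of replacing a continuous pdf by a discrete one can still be sketched: draw k samples and give each probability 1/k. Everything below (the function names, the isotropic 2D Gaussian, k = 1000, the fixed seed) is an assumption chosen for illustration.

```python
import random

def sample_discrete_pdf(draw, k, seed=0):
    """Approximate a continuous distribution by k equally weighted samples.
    `draw(rng)` must return one random location."""
    rng = random.Random(seed)
    locations = [draw(rng) for _ in range(k)]
    probs = [1.0 / k] * k
    return locations, probs

def gaussian_2d(mean, sigma):
    """Sampler for an isotropic 2D Gaussian (hypothetical example input)."""
    def draw(rng):
        return (rng.gauss(mean[0], sigma), rng.gauss(mean[1], sigma))
    return draw

locations, probs = sample_discrete_pdf(gaussian_2d((1.0, 2.0), 0.5), k=1000)

# The sample mean should be close to the true mean (1.0, 2.0).
sample_mean = tuple(sum(p[i] for p in locations) / len(locations) for i in range(2))
```

The resulting `(locations, probs)` pair has the same shape as a discrete pdf, so any method for discrete pdfs can be run on it, at the cost of a sampling error that shrinks as k grows.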

Page 8:

Expected Nearest Neighbor In L1 Metric (Manhattan metric)

Page 9:

Expected Nearest Neighbor In L1 Metric (cont.)

Source: Range Searching on Uncertain Data [P. K. Agarwal et al. 2009]
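The slide bodies for the L1 case are missing from the transcript. One property that makes the L1 metric convenient is that the expected distance splits into one term per coordinate, and each term is a convex piecewise-linear function of the query coordinate. The sketch below illustrates that property for a discrete pdf; the function names and the sample data are hypothetical.

```python
def expected_l1_1d(x, coords, probs):
    """One coordinate's contribution: sum_j probs[j] * |x - coords[j]|."""
    return sum(w * abs(x - c) for c, w in zip(coords, probs))

def expected_l1(q, locations, probs):
    """Expected L1 distance from q, computed coordinate by coordinate."""
    return sum(
        expected_l1_1d(q[i], [p[i] for p in locations], probs)
        for i in range(len(q))
    )

def slope_1d(x, coords, probs):
    """Slope of the 1D term at x (x not a breakpoint):
    the probability mass left of x minus the mass right of x."""
    left = sum(w for c, w in zip(coords, probs) if c < x)
    right = sum(w for c, w in zip(coords, probs) if c > x)
    return left - right

# Hypothetical uncertain point with two possible locations.
locations = [(0.0, 0.0), (2.0, 4.0)]
probs = [0.25, 0.75]
value = expected_l1((1.0, 1.0), locations, probs)
```

The breakpoints of each 1D term are the coordinates of the possible locations, and the slope only changes there, which is what makes a piecewise-linear representation possible.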

Page 10:

Geometric Reduction

Page 11:

Building Block: Half-Space Intersection and Convex Hulls

Upper hulls correspond to lower envelopes: an example in 2D

Source: pages 252-253, Computational Geometry: Algorithms and Applications, 3rd Edition [Mark de Berg et al.]
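The correspondence between upper hulls and lower envelopes can be tried out in 2D. The sketch below assumes the duality that maps the line y = m x + b to the point (m, -b): under that convention, the lines on the lower envelope are the ones whose dual points are vertices of the upper convex hull. The function names are hypothetical, and the hull routine is a standard monotone-chain scan rather than anything specific to the talk.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Vertices of the upper convex hull, from left to right."""
    # For each x only the highest point can be on the upper hull.
    highest = {}
    for x, y in points:
        if x not in highest or y > highest[x]:
            highest[x] = y
    hull = []
    for p in sorted(highest.items()):
        # Keep only strict right turns while walking left to right.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def lower_envelope_lines(lines):
    """Lines (m, b), meaning y = m*x + b, that appear on the lower envelope,
    listed by increasing slope."""
    dual_points = [(m, -b) for m, b in lines]
    return [(m, -y) for m, y in upper_hull(dual_points)]

# y = x, y = 1, y = -x: the lower envelope is -|x|, so y = 1 never appears.
example_lines = [(1, 0), (0, 1), (-1, 0)]
envelope = lower_envelope_lines(example_lines)
```

Sorting dominates the running time, so the envelope of n lines is found in O(n log n) time, the same as the hull.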

Page 12:

Segment-tree Based Data Structures for Expected-NN In L1 Metric

Page 13:

Segment-tree Based Data Structures for Expected-NN In L1 Metric (cont.)

Page 14:

Segment-tree Based Data Structures for Expected-NN In L1 Metric (cont.)

Summary of the result

• Size of data structure

• Preprocessing time

• Query time

Page 15:

Approximate L2 Metric

It’s a metric when P is centrally symmetric!
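The construction on this slide is not in the transcript. A common way to approximate the L2 metric is to replace the unit disk by a convex polygon P and measure distances with the gauge of P. The sketch below assumes that P is the regular k-gon circumscribed about the unit circle; the function names and that choice of P are assumptions, not the talk's definition. With this P the polygonal distance never exceeds the Euclidean distance and is at least cos(pi/k) times it.

```python
import math

def polygon_norm(v, k):
    """Gauge of the regular k-gon circumscribed about the unit circle:
    the largest projection of v onto the k unit edge normals."""
    return max(
        v[0] * math.cos(2 * math.pi * i / k) + v[1] * math.sin(2 * math.pi * i / k)
        for i in range(k)
    )

def polygon_distance(p, q, k=16):
    """Polygonal distance from p to q (symmetric only when k is even)."""
    return polygon_norm((q[0] - p[0], q[1] - p[1]), k)

# With k = 16 the value lies between 5*cos(pi/16) and 5 for this pair.
approx = polygon_distance((0.0, 0.0), (3.0, 4.0), k=16)

# A triangle (k = 3) is not centrally symmetric, so the "distance" depends
# on direction and is therefore not a metric.
forward = polygon_norm((1.0, 0.0), 3)
backward = polygon_norm((-1.0, 0.0), 3)
```

For even k the edge normals come in opposite pairs, so P is centrally symmetric and the gauge is a norm. For odd k, `forward` and `backward` differ, which shows why central symmetry is needed.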

Page 16:

Approximate L2 Metric (cont.)

More complex!

Page 17:

Future Work

• Approximate the expected NN in the L2 metric

• Study the complexity of the expected Voronoi diagram

• Study the probability case

Work harder in the near future!

Page 18:

Thanks!

Questions?

Main References:

[1] Pankaj K. Agarwal, Siu-Wing Cheng, Yufei Tao, Ke Yi: Indexing uncertain data. PODS 2009: 137-146

[2] Pankaj K. Agarwal, Lars Arge, Jeff Erickson: Indexing Moving Points. J. Comput. Syst. Sci. 66(1): 207-243 (2003)