A Nonlinear Approach to Dimension Reduction Lee-Ad Gottlieb Weizmann Institute of Science Joint work with Robert Krauthgamer

Dec 17, 2015


Marion Logan
Transcript
Page 1:

A Nonlinear Approach to Dimension Reduction

Lee-Ad Gottlieb

Weizmann Institute of Science

Joint work with Robert Krauthgamer

Page 2:

Data As High-Dimensional Vectors

Data is often represented by vectors in $\mathbb{R}^m$
  For images: color or intensity
  For documents: word frequency

A typical goal: Nearest Neighbor Search
  Preprocess the data so that, given a query vector, the closest vector in the data set can be found quickly.
  Common in various data analysis tasks: classification, learning, clustering.

Page 3:

Curse of Dimensionality

Cost of many useful operations is exponential in dimension
  First noted by Bellman (Bel-61) in the context of PDFs
  Nearest Neighbor Search (Cla-94)

Dimension reduction: represent high-dimensional data in a low-dimensional space
  Specifically: map the given vectors into a low-dimensional space, while preserving most of the data’s “structure”
  Trade off accuracy for computational efficiency

Page 4:

The JL Lemma

Theorem (Johnson-Lindenstrauss, 1984): For every $n$-point Euclidean set $X$ of dimension $d$, there is a linear map $X \to Y$ (Euclidean $Y$) with
  Interpoint distortion $1 \pm \epsilon$
  Dimension of $Y$: $k = O(\epsilon^{-2} \log n)$

Can be realized by a trivial linear transformation
  Multiply the $d \times n$ point matrix by a $k \times d$ matrix of random entries in $\{-1, 0, 1\}$ [Ach-01]

A near-matching lower bound was given by [Alon-03]

Applications in a host of problems in computational geometry

But can we do better?
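The random-matrix realization above can be sketched in a few lines. This is a minimal illustration, not the slide's own code: it stores points as rows (the slide stores them as columns), and it assumes the standard scaling for the sparse $\{-1, 0, 1\}$ construction, with entries $\sqrt{3}\cdot\{+1, 0, -1\}$ drawn with probabilities 1/6, 2/3, 1/6 and the result divided by $\sqrt{k}$. The sizes `n = 20`, `d = 2000`, `k = 500` are arbitrary choices for the demonstration.

```python
import numpy as np

def achlioptas_projection(points, k, rng):
    """Map the rows of `points` (n x d) to k dimensions.

    Uses a k x d matrix with entries sqrt(3) * {+1, 0, -1}, drawn with
    probabilities 1/6, 2/3, 1/6, and scales the result by 1/sqrt(k).
    """
    d = points.shape[1]
    signs = rng.choice([1.0, 0.0, -1.0], size=(k, d), p=[1 / 6, 2 / 3, 1 / 6])
    return points @ (np.sqrt(3.0 / k) * signs).T

def pairwise_distances(points):
    """All interpoint Euclidean distances, as an n x n matrix."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2000))            # n = 20 points in d = 2000 dimensions
Y = achlioptas_projection(X, k=500, rng=rng)

# Ratio of projected to original distance for every pair of distinct points.
off_diagonal = ~np.eye(len(X), dtype=bool)
ratios = pairwise_distances(Y)[off_diagonal] / pairwise_distances(X)[off_diagonal]
print(f"distance ratios lie in [{ratios.min():.3f}, {ratios.max():.3f}]")
```

The printed interval should sit close to 1 and tighten as `k` grows, in line with the $k = O(\epsilon^{-2} \log n)$ bound.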

Page 5:

Doubling Dimension

Definition: Ball $B(x,r)$ = all points within distance $r$ from $x$.

The doubling constant $\lambda$ (of a metric $M$) is the minimum value such that every ball can be covered by $\lambda$ balls of half the radius
  First used by [Ass-83], algorithmically by [Cla-97].
  The doubling dimension is $\dim(M) = \log \lambda(M)$ [GKL-03]

Applications:
  Approximate nearest neighbor search [KL-04, CG-06]
  Distance oracles [HM-06]
  Spanners [GR-08a, GR-08b]
  Embeddings [ABN-08, BRS-07]

Here $\lambda \le 7$.
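For a finite point set, the definition can be probed numerically. The sketch below is an illustration under stated assumptions, not a method from the talk: it only tries balls centered at data points, only the radii it is given, and it covers each ball greedily, so the number it reports is an upper estimate for those balls rather than the exact doubling constant. The function names are hypothetical.

```python
import numpy as np

def greedy_half_radius_cover(points, center, r):
    """Count the balls of radius r/2 (centered at data points) that a
    greedy pass uses to cover the data points inside B(center, r)."""
    ball = points[np.linalg.norm(points - center, axis=1) <= r]
    uncovered = np.ones(len(ball), dtype=bool)
    count = 0
    while uncovered.any():
        c = ball[np.argmax(uncovered)]      # first point not yet covered
        uncovered &= np.linalg.norm(ball - c, axis=1) > r / 2
        count += 1
    return count

def estimate_doubling_constant(points, radii):
    """Largest greedy cover size over all data-point centers and given radii."""
    return max(greedy_half_radius_cover(points, x, r)
               for x in points for r in radii)

line = np.arange(64, dtype=float).reshape(-1, 1)   # 64 evenly spaced points on a line
lam = estimate_doubling_constant(line, radii=[4.0, 8.0, 16.0])
print(lam, np.log2(lam))
```

On this one-dimensional example the greedy pass needs at most 4 half-radius balls, so the estimated dimension $\log_2 4 = 2$ stays a small constant no matter how many points lie on the line.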

Page 6:

The JL Lemma

Theorem (Johnson-Lindenstrauss, 1984): For every $n$-point Euclidean set $X$ of dimension $d$, there is a linear map $X \to Y$ with
  Interpoint distortion $1 \pm \epsilon$
  Dimension of $Y$: $O(\epsilon^{-2} \log n)$

An almost matching lower bound was given by [Alon-03]
  This lower bound considered $n$ roughly equidistant points
  So it had $\dim(X) = \log n$
  So in fact the lower bound is $\Omega(\epsilon^{-2} \dim(X))$

Page 7:

A stronger version of JL?

Open questions:
  Can the JL $\log n$ lower bound be strengthened to apply to spaces with low doubling dimension ($\dim(X) \ll \log n$)?
  Does there exist a JL-like embedding into $O(\dim(X))$ dimensions? [LP-01, GKL-03]
    Even constant distortion would be interesting
    A linear transformation cannot attain this result [IN-07]

Here, we present a partial resolution to these questions: two embeddings that use $\tilde{O}(\dim^2(X))$ dimensions
  Result I: $(1 \pm \epsilon)$ embedding for a single scale, interpoint distances close to some $r$.
  Result II: $(1 \pm \epsilon)$ global embedding into the snowflake metric, where every interpoint distance $s$ is replaced by $s^{1/2}$

Page 8:

Result I – Embedding for Single Scale

Theorem 1 [GK-09]: Fix scale $r > 0$ and range $0 < \delta < 1$. Every finite $X \subset \ell_2$ admits an embedding $f : X \to \ell_2^k$ for $k = \tilde{O}(\log(1/\delta) (\dim X)^2)$, such that
  1. Lipschitz: $\|f(x) - f(y)\| \le \|x - y\|$ for all $x, y \in X$
  2. Bi-Lipschitz at scale $r$: $\|f(x) - f(y)\| \ge \Omega(\|x - y\|)$ whenever $\|x - y\| \in [\delta r, r]$
  3. Boundedness: $\|f(x)\| \le r$ for all $x \in X$

We’ll illustrate the proof for constant range and distortion.

Page 9:

Result I: The construction

We begin by considering the entire point set. Take for example
  scale $r = 20$
  range $\delta = 1/2$
  Assume minimum interpoint distance 1

[Figure: the point set; label: distance 1.]

Page 10:

Step 1: Net extraction

From the point set, we extract a net
  For example, a 4-net

Net properties:
  Covering (covering radius: 4)
  Packing (packing distance: 4)

A consequence of the packing property is that a ball of radius $s$ contains $O(s^{\dim(X)})$ points
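One simple way to obtain a net with both properties is a greedy pass over the points. The sketch below is an illustrative assumption (the slides do not say how the net is computed): a point joins the net only if it is at distance at least `radius` from every net point chosen so far, which gives the packing property directly and the covering property because every rejected point lies within `radius` of some net point.

```python
import numpy as np

def greedy_net(points, radius):
    """Return the indices of a greedy `radius`-net of `points` (n x d)."""
    net = []
    for i, p in enumerate(points):
        # Keep p only if it is far from every net point chosen so far.
        if all(np.linalg.norm(p - points[j]) >= radius for j in net):
            net.append(i)
    return net

rng = np.random.default_rng(1)
X = rng.uniform(0, 20, size=(200, 2))      # hypothetical point set in the plane
net = greedy_net(X, radius=4.0)
N = X[net]

# Covering: distance from each point to its nearest net point.
dist_to_net = np.linalg.norm(X[:, None, :] - N[None, :, :], axis=-1).min(axis=1)
# Packing: distances between distinct net points.
gaps = np.linalg.norm(N[:, None, :] - N[None, :, :], axis=-1)
gaps = gaps[~np.eye(len(N), dtype=bool)]
print(len(net), dist_to_net.max(), gaps.min())
```

The largest value of `dist_to_net` stays below 4 (covering) and the smallest gap is at least 4 (packing), matching the 4-net on the slide.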

Page 11:

Step 1: Net extraction

We want a good embedding for just the net points
  From here on, our embedding will ignore non-net points
  Why is this valid?

Page 12:

Step 1: Net extraction

Kirszbraun theorem (Lipschitz extension, 1934): Given an embedding $f : X \to Y$ with $X \subset S$ (Euclidean space), there exists an extension $f' : S \to Y$
  The restriction of $f'$ to $X$ is equal to $f$
  $f'$ is contractive for $S \setminus X$

Therefore, a good embedding just for the net points suffices
  Smaller net radius $\Rightarrow$ less distortion for the non-net points

[Figure: the extension $f'$; labels: 20, 20.]

Page 13:

Step 2: Padded decomposition

Decompose the space into probabilistic padded clusters

Page 14:

Step 2: Padded decomposition

Decompose the space into probabilistic padded clusters

Cluster properties for a given random partition [GKL03, ABN08]:
  Diameter: bounded by $20 \dim(X)$
  Size: by the doubling property, bounded by $(20 \dim(X))^{\dim(X)}$
  Padding: a point is 20-padded with probability $1 - c$, say 9/10
  Support: $O(\dim(X))$ partitions

[Figure: a padded point in a cluster of diameter $\le 20 \dim(X)$.]
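To make "random partition" and "padded" concrete, here is a sketch of one standard random ball-carving scheme. It is swapped in for illustration only: it does not reproduce the constructions or the constants of [GKL03, ABN08]. It draws one random radius and a random order of the points, and each point joins the first center in that order that lies within the radius. The names `random_ball_partition` and `is_padded` and the parameter `delta` (the diameter bound) are hypothetical.

```python
import numpy as np

def random_ball_partition(points, delta, rng):
    """Randomly partition `points` into clusters of diameter <= delta.

    A radius R is drawn uniformly from [delta/4, delta/2] and the points
    are visited in random order; each still-unassigned point within
    distance R of the visited point joins that point's cluster.
    """
    n = len(points)
    radius = rng.uniform(delta / 4, delta / 2)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    labels = np.full(n, -1)
    for center in rng.permutation(n):
        unassigned = labels == -1
        labels[unassigned & (dists[center] <= radius)] = center
    return labels

def is_padded(points, labels, i, rho):
    """True if every point within distance rho of point i shares its cluster."""
    near = np.linalg.norm(points - points[i], axis=1) <= rho
    return bool(np.all(labels[near] == labels[i]))

rng = np.random.default_rng(2)
X = rng.uniform(0, 50, size=(150, 2))      # hypothetical point set
labels = random_ball_partition(X, delta=20.0, rng=rng)
padded_fraction = np.mean([is_padded(X, labels, i, rho=1.0) for i in range(len(X))])
print(len(np.unique(labels)), padded_fraction)
```

Every point is assigned, because it lies at distance 0 from itself when its own turn comes, and every cluster has diameter at most `delta`, since all members lie within `delta/2` of the cluster center. Drawing several independent partitions corresponds to the family of partitions used in the later steps.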

Page 15:

Step 3: JL on individual clusters

For each partition, consider each individual cluster

Page 16:

Step 3: JL on individual clusters

For each partition, consider each individual cluster
  Reduce dimension using the JL Lemma
    Constant distortion
    Target dimension: logarithmic in size: $O(\log (20 \dim(X))^{\dim(X)}) = \tilde{O}(\dim(X))$
  Then translate some point to the origin

[Figure: JL applied to a cluster.]
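The target dimension follows from one line of arithmetic, spelled out here as a worked step. It uses the cluster-size bound $(20 \dim(X))^{\dim(X)}$ from Step 2 and the fact that JL with constant distortion needs a dimension logarithmic in the number of points; the $\tilde{O}$ absorbs the $\log(20 \dim X)$ factor.

```latex
\log\Bigl( (20 \dim X)^{\dim X} \Bigr)
  \;=\; \dim X \cdot \log(20 \dim X)
  \;=\; \tilde{O}(\dim X)
```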

Page 17:

The story so far…

To review
  Step 1: Extract net points
  Step 2: Build family of partitions
  Step 3: For each partition, apply JL to each cluster, and translate a cluster point to the origin

Embedding guarantees for a single partition
  Intracluster distance: Constant distortion
  Intercluster distance:
    Min distance: 0
    Max distance: $20 \dim(X)$

Not good enough
  Let’s backtrack…

Page 18:

The story so far…

To review
  Step 1: Extract net points
  Step 2: Build family of partitions
  Step 3: For each partition, apply Gaussian transform to each cluster
  Step 4: For each partition, apply JL to each cluster, and translate a cluster point to the origin

Embedding guarantees for a single partition
  Intracluster distance: Constant distortion
  Intercluster distance:
    Min distance: 0
    Max distance: $20 \dim(X)$

Not good enough
  Let’s backtrack…

Page 19:

Step 3: Gaussian transform

For each partition, apply the Gaussian transform to distances within each cluster (Schoenberg’s theorem, 1938)
  $f(t) = (1 - e^{-t^2})^{1/2}$
  Threshold at $s$: $f_s(t) = s (1 - e^{-t^2/s^2})^{1/2}$

Properties for $s = 20$:
  Threshold: Cluster diameter is at most 20 (instead of $20 \dim(X)$)
  Distortion: Small distortion of distances in relevant range

Transform can increase dimension… but JL is the next step
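The two properties listed for $s = 20$ can be checked numerically from the formula for $f_s$. The sketch below only evaluates the distance function $f_s(t) = s (1 - e^{-t^2/s^2})^{1/2}$ on a few sample distances. It does not construct the Euclidean point set whose existence Schoenberg's theorem guarantees, and the sample values of `t` are arbitrary.

```python
import numpy as np

def gaussian_transform(t, s):
    """f_s(t) = s * sqrt(1 - exp(-t^2 / s^2)) for distances t >= 0."""
    t = np.asarray(t, dtype=float)
    # -expm1(-u) equals 1 - exp(-u) and is more accurate for small u.
    return s * np.sqrt(-np.expm1(-(t / s) ** 2))

s = 20.0
t = np.array([0.1, 1.0, 5.0, 10.0, 20.0, 100.0, 400.0])
ft = gaussian_transform(t, s)
for ti, fi in zip(t, ft):
    print(f"t = {ti:6.1f}   f_s(t) = {fi:7.3f}   f_s(t)/t = {fi / ti:.3f}")
```

The transformed distance never exceeds $s$ (the threshold property) and never exceeds $t$, since $1 - e^{-u} \le u$. For distances in $[10, 20]$ the ratio $f_s(t)/t$ stays between roughly 0.8 and 0.94, which is the "small distortion in the relevant range" for the running example.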

Page 20:

Step 4: JL on individual cluster

Steps 3 & 4: new embedding guarantees
  Intracluster: Constant distortion
  Intercluster:
    Min distance: 0
    Max distance: 20 (instead of $20 \dim(X)$)

Caveat: Also smooth the edges

[Figure: Gaussian transform (smaller diameter), then JL (smaller dimension).]

Page 21:

Step 5: Glue partitions

We have an embedding for a single partition
  For padded points, the guarantees are perfect
  For non-padded points, the guarantees are weak

“Glue” together embeddings for all $O(\dim(X))$ partitions
  Concatenate images (and scale down)
  Non-padded case occurs 1/10 of the time, so it gets “averaged away”

Final dimension for non-net points: (number of partitions: $O(\dim(X))$) $\times$ (dimension of each embedding: $\tilde{O}(\dim(X))$) $= \tilde{O}(\dim^2(X))$

Example: $f_1(x) = (1,7,2)$, $f_2(x) = (5,2,3)$, $f_3(x) = (4,8,5)$
  $F(x) = f_1(x) \oplus f_2(x) \oplus f_3(x) = (1,7,2,5,2,3,4,8,5)$
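The gluing step can be sketched with the example vectors above. The slide does not give the scaling factor, so the sketch assumes division by $\sqrt{m}$ for $m$ partitions, which makes the squared glued distance the average of the per-partition squared distances. The images of the second point `y` are hypothetical: in the first partition `y` collapses onto `x` (a "bad" partition), and in the other two the pair is at distance 5.

```python
import numpy as np

def glue(embeddings):
    """Concatenate per-partition images and scale down by sqrt(m)."""
    m = len(embeddings)
    return np.concatenate(embeddings) / np.sqrt(m)

# Images of x from the slide; images of y are made up for illustration.
fx = [np.array([1.0, 7.0, 2.0]), np.array([5.0, 2.0, 3.0]), np.array([4.0, 8.0, 5.0])]
fy = [np.array([1.0, 7.0, 2.0]), np.array([8.0, 6.0, 3.0]), np.array([4.0, 4.0, 2.0])]

Fx, Fy = glue(fx), glue(fy)
per_partition = [np.linalg.norm(a - b) for a, b in zip(fx, fy)]
print(per_partition, np.linalg.norm(Fx - Fy))
```

The per-partition distances are 0, 5 and 5, and the glued distance is $\sqrt{(0 + 25 + 25)/3} \approx 4.08$. One collapsed partition out of three lowers the distance only by a constant factor, which is the sense in which non-padded cases get averaged away.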

Page 22:

Step 6: Kirszbraun extension theorem

Kirszbraun’s theorem extends the embedding to non-net points without increasing dimension

[Figure panels: Embedding; Embedding + K.]

Page 23:

Result I – Review

Steps:
  Net extraction
  Padded Decomposition
  Gaussian Transform
  JL
  Glue partitions
  Extension theorem

Theorem 1 [GK-09]: Every finite $X \subset \ell_2$ admits an embedding $f : X \to \ell_2^k$ for $k = \tilde{O}((\dim X)^2)$, such that
  1. Lipschitz: $\|f(x) - f(y)\| \le \|x - y\|$ for all $x, y \in X$
  2. Bi-Lipschitz at scale $r$: $\|f(x) - f(y)\| \ge \Omega(\|x - y\|)$ whenever $\|x - y\| \in [\delta r, r]$
  3. Boundedness: $\|f(x)\| \le r$ for all $x \in X$

Page 24:

Result I – Extension

Steps:
  Net extraction: $\epsilon$-nets
  Padded Decomposition: larger padding, prob. guarantees
  Gaussian Transform
  JL: already $(1 \pm \epsilon)$
  Glue partitions: higher percentage of padded points
  Extension theorem

Theorem 1 [GK-09]: Every finite $X \subset \ell_2$ admits an embedding $f : X \to \ell_2^k$ for $k = \tilde{O}((\dim X)^2)$, such that
  1. Lipschitz: $\|f(x) - f(y)\| \le \|x - y\|$ for all $x, y \in X$
  2. Gaussian at scale $r$: $\|f(x) - f(y)\| \ge (1 \pm \epsilon) G(\|x - y\|)$ whenever $\|x - y\| \in [\delta r, r]$
  3. Boundedness: $\|f(x)\| \le r$ for all $x \in X$

Page 25:

Result II – Snowflake Embedding

Theorem 2 [GK-09]: For $0 < \epsilon < 1$, every finite subset $X \subset \ell_2$ admits an embedding $F : X \to \ell_2^k$ for $k = \tilde{O}(\epsilon^{-4} (\dim X)^2)$ with distortion $(1 \pm \epsilon)$ to the snowflake: $s \mapsto s^{1/2}$

We’ll illustrate the construction for constant distortion.
  The constant distortion construction is due to [Assouad-83] (for non-Euclidean metrics)
  In the paper, we implement the same construction with $(1 \pm \epsilon)$ distortion

Page 26:

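Snowflake embedding

Basic idea.
  Fix points $x, y \in X$, and suppose $\|x - y\| \sim s$
  Now consider many single scale embeddings:
    $r = 16s$, $r = 8s$, $r = 4s$, $r = 2s$, $r = s$, $r = s/2$, $r = s/4$, $r = s/8$, $r = s/16$

Properties of each single scale embedding:
  Lipschitz: $\|f(x) - f(y)\| \le \|x - y\|$
  Gaussian: $\|f(x) - f(y)\| \ge (1 \pm \epsilon) G(\|x - y\|)$
  Boundedness: $\|f(x)\| \le r$

[Figure: the points $x$ and $y$.]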

Page 27:

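Snowflake embedding

Now scale down each embedding by $r^{1/2}$ (snowflake)
  $r = 16s$: $s \to s^{1/2}/4$
  $r = 8s$: $s \to s^{1/2}/8^{1/2}$
  $r = 4s$: $s \to s^{1/2}/2$
  $r = 2s$: $s \to s^{1/2}/2^{1/2}$
  $r = s$: $s \to s^{1/2}$
  $r = s/2$: $s/2 \to s^{1/2}/2^{1/2}$
  $r = s/4$: $s/4 \to s^{1/2}/2$
  $r = s/8$: $s/8 \to s^{1/2}/8^{1/2}$
  $r = s/16$: $s/16 \to s^{1/2}/4$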

Page 28:

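Snowflake embedding

Join levels by concatenation and addition of coordinates
  $r = 16s$: $s \to s^{1/2}/4$
  $r = 8s$: $s \to s^{1/2}/8^{1/2}$
  $r = 4s$: $s \to s^{1/2}/2$
  $r = 2s$: $s \to s^{1/2}/2^{1/2}$
  $r = s$: $s \to s^{1/2}$
  $r = s/2$: $s/2 \to s^{1/2}/2^{1/2}$
  $r = s/4$: $s/4 \to s^{1/2}/2$
  $r = s/8$: $s/8 \to s^{1/2}/8^{1/2}$
  $r = s/16$: $s/16 \to s^{1/2}/4$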
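The values in the list above form a geometric series centered at $r = s$, which is why the joined embedding has distance of order $s^{1/2}$. The sketch below adds up the squared contributions under simplifying assumptions that are not the paper's construction: it treats the join as plain concatenation over dyadic scales $r = 2^j s$ (ignoring the addition of coordinates that keeps the dimension bounded), and it takes each level's distance before scaling to be $\min(s, r)$, as in the list. The function name and the range of 41 levels are arbitrary.

```python
import math

def scaled_contribution(s, r):
    """Distance contributed at scale r by a pair at distance s, after
    dividing by sqrt(r); before scaling it is taken to be min(s, r)."""
    return min(s, r) / math.sqrt(r)

s = 3.0
levels = range(-20, 21)                    # scales r = s * 2^j
total_sq = sum(scaled_contribution(s, s * 2.0 ** j) ** 2 for j in levels)
print(total_sq / s, math.sqrt(total_sq) / math.sqrt(s))
```

Each level contributes $s \cdot 2^{-|j|}$ to the squared distance, so the total approaches $3s$ and the concatenated distance approaches $\sqrt{3} \cdot s^{1/2}$: a constant multiple of the snowflake distance, dominated by the levels near $r = s$.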

Page 29:

Result II – Review

Steps:
  Take collection of single scale embeddings
  Scale down the embedding for scale $r$ by $r^{1/2}$
  Join embeddings by concatenation and addition

By taking more refined scales (jump by $1 \pm \epsilon$ instead of 2), can achieve $(1 \pm \epsilon)$ distortion to the snowflake

Theorem 2 [GK-09]: For $0 < \epsilon < 1$, every finite subset $X \subset \ell_2$ admits an embedding $F : X \to \ell_2^k$ for $k = \tilde{O}(\epsilon^{-4} (\dim X)^2)$ with distortion $(1 \pm \epsilon)$ to the snowflake: $s \mapsto s^{1/2}$

Page 30:

Conclusion

Gave two $(1 \pm \epsilon)$ distortion low-dimension embeddings for doubling spaces
  Single scale
  Snowflake

This framework can be extended to $L_1$ and $L_\infty$
  Dimension reduction: Can’t use JL
  Extension: Can’t use Kirszbraun
  Threshold: Can’t use the Gaussian

Thank you!