Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform
Nir Ailon, Bernard Chazelle (Princeton University)
Transcript
Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform
Nir Ailon, Bernard Chazelle (Princeton University)
d can be very large; ε-approximation beats the "curse of dimensionality"
[IM98, H01] (Euclidean), [KOR00] (Cube):
Time O(ε⁻² d log n), Space n^{O(ε⁻²)}
Bottleneck: dimension reduction
Using the FJLT: O(d log d + ε⁻³ log² n)
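The FJLT behind this bound applies Φ = PHD to each point: D is a random ±1 diagonal, H the Walsh-Hadamard transform (computable in O(d log d)), and P a sparse random Gaussian matrix. A minimal Python sketch; the sparsity parameter `q` and the scaling below are illustrative choices, not the paper's exact constants:

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform in O(d log d); len(x) must be a power of 2.
    Normalized by 1/sqrt(d) so the transform is orthonormal."""
    x = x.copy()
    d = len(x)
    h = 1
    while h < d:
        for i in range(0, d, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(d)

def fjlt(v, k, q, rng):
    """Sketch of the FJLT Phi = P H D applied to a vector v.
    q is the sparsity of P (probability an entry is nonzero) -- illustrative."""
    d = len(v)
    D = rng.choice([-1.0, 1.0], size=d)       # random sign flip (diagonal D)
    z = fwht(D * v)                            # H D v: spreads mass over coordinates
    # Sparse Gaussian projection P (k x d): N(0, 1/q) entries kept w.p. q,
    # so E[P_ij^2] = 1 and squared norms are preserved in expectation.
    mask = rng.random((k, d)) < q
    P = np.where(mask, rng.normal(0.0, np.sqrt(1.0 / q), (k, d)), 0.0)
    return P @ z / np.sqrt(k)
```

Because HD densifies the input first, the projection P can afford to be sparse without losing concentration, which is where the O(d log d + ε⁻³ log² n) running time comes from.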
The d-Hypercube Case [KOR00]
Binary search on distance ℓ ∈ [d]
For distance ℓ: multiply space by a random matrix Φ ∈ Z₂^{k×d}, k = O(ε⁻² log n)
Φ_ij i.i.d. ∼ biased coin
Preprocess lookup tables for Φx (mod 2)
Our observation: Φ can be made sparse
Using a "handle" to p ∈ P s.t. dist(x,p) ≤ ℓ
Time for each step: O(ε⁻² d log n) → O(d + ε⁻² log n)
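The biased-coin sketch can be tried out directly: for target distance ℓ, each entry of Φ is 1 with a small probability p ≈ 1/ℓ, and each output bit is a parity over the chosen coordinates, so points at distance ≪ ℓ rarely flip a parity while points at distance ≥ ℓ flip each parity nearly half the time. The bias p = 1/ℓ and the sizes below are illustrative choices, not [KOR00]'s exact parameters:

```python
import numpy as np

def biased_matrix(k, d, p, rng):
    """Random k x d matrix over Z_2 with i.i.d. Bernoulli(p) entries
    (the "biased coin"); each row touches only ~ p*d coordinates, i.e. sparse."""
    return (rng.random((k, d)) < p).astype(np.uint8)

def parity_sketch(Phi, x):
    """Map x in {0,1}^d to k parity bits: one inner product mod 2 per row."""
    return (Phi.astype(np.int64) @ x.astype(np.int64)) % 2
```

Comparing sketch Hamming distances then distinguishes "closer than ℓ" from "farther than ℓ", which is what each step of the binary search needs.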
How to make a similar improvement for L₂?
Back to Euclidean Space and Johnson-Lindenstrauss . . .
History of Johnson-Lindenstrauss Dimension Reduction
[JL84]: Projection Φ of R^d onto a random subspace of dimension k = c ε⁻² log n; w.h.p.:
∀ p_i, p_j ∈ P:
‖Φp_i − Φp_j‖₂ = (1 ± O(ε)) ‖p_i − p_j‖₂
An L₂ → L₂ embedding
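[JL84]'s statement is easy to check numerically: project the points onto a random k-dimensional subspace (here, orthonormalized Gaussian columns) and rescale by √(d/k). A small sketch with illustrative parameters:

```python
import numpy as np

def jl_project(P, k, rng):
    """Project the rows of P (n x d) onto a random k-dim subspace of R^d,
    rescaled by sqrt(d/k) so squared distances are preserved in expectation."""
    d = P.shape[1]
    # QR of a Gaussian matrix gives an orthonormal basis of a uniformly
    # random k-dimensional subspace.
    Q, _ = np.linalg.qr(rng.normal(size=(d, k)))
    return (P @ Q) * np.sqrt(d / k)
```

With k on the order of ε⁻² log n, all pairwise distances come out within a (1 ± O(ε)) factor, matching the slide's guarantee.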
History of Johnson-Lindenstrauss Dimension Reduction
[FM87], [DG99]: Simplified proof, improved constant c
Φ ∈ R^{k×d}: a random orthogonal matrix with rows Φ₁, Φ₂, ..., Φ_k:
‖Φ_i‖₂ = 1
Φ_i · Φ_j = 0 (i ≠ j)
History of Johnson-Lindenstrauss Dimension Reduction
[IM98]: Φ ∈ R^{k×d}, Φ_ij i.i.d. ∼ N(0, 1/d); rows Φ₁, Φ₂, ..., Φ_k satisfy:
E ‖Φ_i‖₂² = 1
E [Φ_i · Φ_j] = 0
History of Johnson-Lindenstrauss Dimension Reduction
[A03]: Need only tight concentration of |Φ_i · v|²
Φ ∈ R^{k×d}, Φ_ij i.i.d. ∼ {+1 w.p. 1/2, −1 w.p. 1/2}; rows Φ₁, Φ₂, ..., Φ_k satisfy:
E ‖Φ_i‖₂² = 1
E [Φ_i · Φ_j] = 0
History of Johnson-Lindenstrauss Dimension Reduction
[A03]: Φ ∈ R^{k×d}, Φ_ij i.i.d. ∼ sparse distribution ([A03] uses ±√3 w.p. 1/6 each, 0 w.p. 2/3)
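Both [A03] variants replace Gaussians with very cheap discrete entries. A minimal sketch of the two distributions; the dimensions in the usage are illustrative:

```python
import numpy as np

def sign_jl(v, k, rng):
    """[A03]-style projection: i.i.d. +1/-1 entries, each w.p. 1/2,
    scaled by 1/sqrt(k) so E||output||^2 = ||v||^2."""
    Phi = rng.choice([-1.0, 1.0], size=(k, len(v)))
    return Phi @ v / np.sqrt(k)

def sparse_jl(v, k, rng):
    """Sparse [A03] variant: entries sqrt(3) * {+1 w.p. 1/6, 0 w.p. 2/3,
    -1 w.p. 1/6}; two thirds of the matrix is zero, E[entry^2] = 1."""
    Phi = np.sqrt(3.0) * rng.choice([1.0, 0.0, -1.0], size=(k, len(v)),
                                    p=[1/6, 2/3, 1/6])
    return Phi @ v / np.sqrt(k)
```

Only the concentration of |Φ_i · v|² matters for the proof, which is why such coarse distributions suffice.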
Interesting Problem II
Dimension reduction is sampling
Sampling by random walk:
Expander graphs for uniform sampling
Convex bodies for volume estimation
[Kac59]: Random walk on the orthogonal group
for t = 1..T:
  pick i, j ∈_R [d], θ ∈_R [0, 2π)
  v_i ← v_i cos θ + v_j sin θ
  v_j ← −v_i sin θ + v_j cos θ
Output (v₁, ..., v_k) as the dimension reduction of v
How many steps for the J-L guarantee? [CCL01], [DS00], [P99] . . .
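The walk above translates directly to code. Note that both coordinates must be read before either is written (the slide's two assignments are simultaneous); the scaling √(d/k) on the truncated output is an illustrative choice so that squared norm is preserved in expectation once the walk has mixed:

```python
import numpy as np

def kac_walk(v, T, rng):
    """[Kac59] random walk on the orthogonal group applied to v in R^d.
    Each step rotates a uniformly random coordinate pair by a random angle;
    every step is an exact planar rotation, so ||v|| is preserved."""
    v = np.asarray(v, dtype=float).copy()
    d = len(v)
    for _ in range(T):
        i, j = rng.choice(d, size=2, replace=False)
        theta = rng.uniform(0.0, 2.0 * np.pi)
        vi, vj = v[i], v[j]                      # read both before writing
        v[i] = vi * np.cos(theta) + vj * np.sin(theta)
        v[j] = -vi * np.sin(theta) + vj * np.cos(theta)
    return v

def kac_reduce(v, k, T, rng):
    """Dimension reduction: run the walk, keep the first k coordinates."""
    d = len(v)
    return kac_walk(v, T, rng)[:k] * np.sqrt(d / k)
```

The open question on the slide is exactly how large T must be before (v₁, ..., v_k) satisfies the J-L guarantee.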