High Dimensional Spaces
Foundations of Data Science Course
Ramesh Hariharan
Jan 2014

What is Volume?
Volume of a cuboid with sides $l_1, \dots, l_n$ is $l_1 \cdot l_2 \cdots l_n$
For a general object, integrate:
Decompose the object into infinitesimal $n$-dimensional cuboids
Count the number of such cuboids
Scaling each dimension by $r$ multiplies volume by $r^n$.

Volume of an n-Dimensional Sphere
$V_n(r) = f_n \, r^n$ for radius $r$
$f_1 = 2$
$f_2 = \pi$
$f_3 = \frac{4}{3}\pi$
Does $f_n$ increase or decrease with $n$?

Inductive View of $f_n$
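
Inductive Derivation for $f_n$
$$f_n = 2 f_{n-1} \int_0^{\pi/2} \sin^n(\theta)\, d\theta, \quad n \ge 2$$
$$f_1 = 2$$
$$f_n = 2^n \int_0^{\pi/2} \sin^n(\theta)\, d\theta \int_0^{\pi/2} \sin^{n-1}(\theta)\, d\theta \cdots \int_0^{\pi/2} \sin^1(\theta)\, d\theta$$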
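
Volume of a 1, 2, 3, 4-Dimensional Sphere
$$f_1 = 2$$
$$f_2 = 2^2 \int_0^{\pi/2} \sin^2(\theta)\, d\theta \int_0^{\pi/2} \sin^1(\theta)\, d\theta = \pi$$
$$f_3 = 2^3 \int_0^{\pi/2} \sin^3(\theta)\, d\theta \int_0^{\pi/2} \sin^2(\theta)\, d\theta \int_0^{\pi/2} \sin^1(\theta)\, d\theta = \frac{4}{3}\pi$$
$$f_4 = 2^4 \int_0^{\pi/2} \sin^4(\theta)\, d\theta \int_0^{\pi/2} \sin^3(\theta)\, d\theta \int_0^{\pi/2} \sin^2(\theta)\, d\theta \int_0^{\pi/2} \sin^1(\theta)\, d\theta = \frac{\pi^2}{2}$$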
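
The values of $f_1, \dots, f_4$ can be checked numerically. The sketch below is only an illustration: it evaluates each sine integral with a midpoint rule (a choice made here, not taken from the slides) and applies the recurrence $f_n = 2 f_{n-1} \int_0^{\pi/2} \sin^n(\theta)\, d\theta$. The function names and the step count are assumptions.

```python
import math

def sin_power_integral(n, steps=10_000):
    """Midpoint-rule estimate of the integral of sin^n(theta) over [0, pi/2]."""
    h = (math.pi / 2) / steps
    return h * sum(math.sin((i + 0.5) * h) ** n for i in range(steps))

def f(n):
    """Volume coefficient f_n via f_n = 2 * f_{n-1} * integral(sin^n), with f_1 = 2."""
    value = 2.0
    for m in range(2, n + 1):
        value *= 2.0 * sin_power_integral(m)
    return value

# Compare against the closed-form values 2, pi, 4*pi/3, pi^2/2.
expected = [2.0, math.pi, 4 * math.pi / 3, math.pi ** 2 / 2]
for n, target in enumerate(expected, start=1):
    print(n, f(n), target)
```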
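
Sine Power Integrals
$$\int_0^{\pi/2} \sin^n(\theta)\, d\theta = \frac{n-1}{n} \int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta$$
$$\int_0^{\pi/2} \sin^n(\theta)\, d\theta = \frac{n-1}{n} \cdot \frac{n-3}{n-2} \cdots \frac{1}{2} \cdot \frac{\pi}{2}, \quad \text{for even } n$$
$$\int_0^{\pi/2} \sin^n(\theta)\, d\theta = \frac{n-1}{n} \cdot \frac{n-3}{n-2} \cdots \frac{2}{3}, \quad \text{for odd } n$$
$$\int_0^{\pi/2} \sin^n(\theta)\, d\theta \int_0^{\pi/2} \sin^{n-1}(\theta)\, d\theta = \frac{\pi}{2n}$$
$$\sqrt{\frac{\pi}{2(n+1)}} \le \int_0^{\pi/2} \sin^n(\theta)\, d\theta \le \sqrt{\frac{\pi}{2n}}$$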
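
As a sanity check on these identities, the following sketch computes the sine power integral from the reduction formula (using $\int_0^{\pi/2} \sin^0 = \frac{\pi}{2}$ and $\int_0^{\pi/2} \sin^1 = 1$ as base cases) and compares it with the product identity and the two square-root bounds. The function name `sine_integral` is hypothetical.

```python
import math

def sine_integral(n):
    """Integral of sin^n over [0, pi/2] via I_n = (n-1)/n * I_{n-2}, I_0 = pi/2, I_1 = 1."""
    value = math.pi / 2 if n % 2 == 0 else 1.0
    # Multiply the factors (m-1)/m for m = n, n-2, ... down to 2 (even n) or 3 (odd n).
    for m in range(2 + n % 2, n + 1, 2):
        value *= (m - 1) / m
    return value

for n in (1, 2, 5, 10, 100):
    lower = math.sqrt(math.pi / (2 * (n + 1)))
    upper = math.sqrt(math.pi / (2 * n))
    product = sine_integral(n) * sine_integral(n - 1)
    print(n, lower, sine_integral(n), upper, product, math.pi / (2 * n))
```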
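
The Formula for $f_n$
$$f_n = \frac{\pi^{n/2}}{\left(\frac{n}{2}\right)!}, \quad \text{for even } n$$
$$f_n = \frac{\pi^{(n-1)/2}}{\frac{n}{2}\left(\frac{n}{2}-1\right)\cdots\frac{1}{2}}, \quad \text{for odd } n$$
$f_n \to 0$ as $n \to \infty$!
The biggest unit sphere sits in 5-d!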
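
Both cases of the formula can be written as $f_n = \pi^{n/2} / \Gamma(\frac{n}{2} + 1)$, since $\Gamma(\frac{n}{2}+1) = \frac{n}{2}(\frac{n}{2}-1)\cdots\frac{1}{2}\sqrt{\pi}$ for odd $n$. That gamma-function form is a standard identity rather than something stated on the slide. A short sketch using it to locate the maximum and watch the decay:

```python
import math

def f(n):
    """Unit-sphere volume coefficient f_n = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

values = {n: f(n) for n in range(1, 31)}
best = max(values, key=values.get)  # dimension with the largest unit-sphere volume
print("largest f_n at n =", best, "value", values[best])
for n in (1, 2, 3, 4, 5, 6, 10, 20, 30):
    print(n, values[n])
```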

The Unit Sphere vs the Unit Cube
Corners of a unit cube are distance $\frac{\sqrt{n}}{2}$ from the origin
Center points of each side are distance $\frac{1}{2}$ from the origin
It looks like this
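
A small numeric illustration of the corner distance, assuming the cube has side 1 and is centred at the origin (as the stated distances imply) and that the unit sphere has radius 1. The function name is hypothetical.

```python
import math

def corner_distance(n):
    """Distance from the origin to the corner (1/2, ..., 1/2) of the origin-centred unit cube."""
    return math.dist([0.5] * n, [0.0] * n)

for n in (2, 3, 4, 5, 16, 100):
    d = corner_distance(n)
    where = "outside the unit sphere" if d > 1 else "inside or on the unit sphere"
    print(n, d, where)
```

Under these assumptions the face centres always stay at distance $\frac{1}{2}$, while the corners leave the unit sphere once $n > 4$.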
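
Where is the Volume Concentrated?
How much of the volume is located outside a band of angle $2\alpha$ around the equator?
$$\frac{\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta}{\int_0^{\pi/2} \sin^n(\theta)\, d\theta}$$
Denominator: $\int_0^{\pi/2} \sin^n(\theta)\, d\theta \ge \sqrt{\frac{\pi}{2(n+1)}}$
Numerator: $\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta \le \,?$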
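
$\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta \le \,?$
$$\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta = \int_{\sin^2\alpha}^{1} \frac{1}{2\sqrt{y}}\, (1-y)^{\frac{n-1}{2}}\, dy, \quad y = \cos^2(\theta)$$
$$\le \frac{1}{2\sin\alpha} \int_{\sin^2\alpha}^{1} e^{-y\frac{n-1}{2}}\, dy$$
$$\le \frac{1}{(n-1)\sin\alpha}\, e^{-\frac{n-1}{2}\sin^2\alpha}$$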
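
Volume Fraction outside the $2\alpha$-angle Equatorial Band
$$\frac{\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta}{\int_0^{\pi/2} \sin^n(\theta)\, d\theta} \le \sqrt{\frac{2(n+1)}{\pi}}\, \frac{1}{(n-1)\sin\alpha}\, e^{-\frac{n-1}{2}\sin^2\alpha}$$
For $\alpha \sim \sin(\alpha) = \frac{1}{\sqrt{n}}$, this is $\sim \sqrt{\frac{2}{\pi e}} = 0.4839$
More than half the volume is in a $\frac{2}{\sqrt{n}}$ angle band around the equator.
For $\sin(\alpha) = \frac{a}{\sqrt{n}}$, the above bound is $\sim \sqrt{\frac{2}{\pi}}\, \frac{1}{a}\, e^{-\frac{a^2}{2}}$
Reminiscent of the Normal distribution?
$$2\int_a^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\, dx \le \sqrt{\frac{2}{\pi}}\, \frac{1}{a}\, e^{-\frac{a^2}{2}}$$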
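
To see how loose or tight the bound is, the sketch below estimates the actual ratio of sine integrals with a midpoint rule and compares it with the right-hand side for $\sin(\alpha) = \frac{1}{\sqrt{n}}$. The numerical integration scheme, the function names, and the choice $n = 100$ are assumptions made for illustration.

```python
import math

def sin_power_integral(n, upper, steps=20_000):
    """Midpoint-rule estimate of the integral of sin^n(theta) over [0, upper]."""
    h = upper / steps
    return h * sum(math.sin((i + 0.5) * h) ** n for i in range(steps))

def outside_band_fraction(n, alpha):
    """Ratio of the sine integrals: the fraction lying outside the 2*alpha band."""
    return sin_power_integral(n, math.pi / 2 - alpha) / sin_power_integral(n, math.pi / 2)

def outside_band_bound(n, alpha):
    """Upper bound sqrt(2(n+1)/pi) / ((n-1) sin(alpha)) * exp(-(n-1)/2 * sin^2(alpha))."""
    s = math.sin(alpha)
    return math.sqrt(2 * (n + 1) / math.pi) / ((n - 1) * s) * math.exp(-(n - 1) / 2 * s * s)

n = 100
alpha = math.asin(1 / math.sqrt(n))
fraction = outside_band_fraction(n, alpha)
bound = outside_band_bound(n, alpha)
print(fraction, bound, math.sqrt(2 / (math.pi * math.e)))
```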
Do 2 Equators sum to more than the whole!
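
Surface Area $A_n(r)$ of an n-Dimensional Sphere
$$\int_0^r A_n(r)\, dr = V_n(r)$$
$$\frac{dV_n(r)}{dr} = A_n(r)$$
$A_n(r) = a_n r^{n-1}$, and $a_n = n f_n$
$$a_n = 2 a_{n-1} \int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta$$
$$a_2 = 2\pi$$
$$a_n = 2^{n-1} \pi \int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta \int_0^{\pi/2} \sin^{n-3}(\theta)\, d\theta \cdots \int_0^{\pi/2} \sin^1(\theta)\, d\theta$$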
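
The relation $a_n = n f_n$ and the recurrence for $a_n$ can be cross-checked numerically. This sketch assumes the gamma-function form $f_n = \pi^{n/2} / \Gamma(\frac{n}{2}+1)$ (a standard identity, not written on the slides) and the reduction formula for the sine integrals; all function names are hypothetical.

```python
import math

def f(n):
    """Unit-sphere volume coefficient, written with the gamma function."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def a(n):
    """Unit-sphere surface-area coefficient a_n = n * f_n."""
    return n * f(n)

def sine_integral(n):
    """Integral of sin^n over [0, pi/2], from I_n = (n-1)/n * I_{n-2}, I_0 = pi/2, I_1 = 1."""
    value = math.pi / 2 if n % 2 == 0 else 1.0
    for m in range(2 + n % 2, n + 1, 2):
        value *= (m - 1) / m
    return value

# Left column: n * f_n. Right column: 2 * a_{n-1} * integral of sin^(n-2).
for n in range(3, 8):
    print(n, a(n), 2 * a(n - 1) * sine_integral(n - 2))
```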

Inductive View of $a_n$

Dot Product between a Fixed Unit Vector and a Random Unit Vector
A Spherically Symmetric Random Unit Vector: Probability of lying in any specific patch $P$ on the surface is proportional to the area of $P$.
Dot Product is also the length of the projection of the fixed vector on the random vector.
Dot Product equals $\cos(\theta)$, where $\theta$ is the angle between the two vectors.
$E(\cos^2(\theta))$, $Var(\cos^2(\theta))$, and tail bounds on $\cos^2(\theta)$?
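
$E(\cos^2(\theta))$
$$\frac{\int_0^{\pi/2} \sin^{n-2}(\theta) \cos^2(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta} = \frac{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta - \int_0^{\pi/2} \sin^{n}(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta}$$
$$= 1 - \frac{n-1}{n} = \frac{1}{n}$$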
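
$Var(\cos^2(\theta))$
$$\frac{\int_0^{\pi/2} \sin^{n-2}(\theta) \cos^4(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta} - \frac{1}{n^2}$$
$$= \frac{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta - 2\int_0^{\pi/2} \sin^{n}(\theta)\, d\theta + \int_0^{\pi/2} \sin^{n+2}(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta} - \frac{1}{n^2}$$
$$= 1 - 2\,\frac{n-1}{n} + \frac{(n-1)(n+1)}{n(n+2)} - \frac{1}{n^2} = \frac{2(n-1)}{n^2(n+2)} \le \frac{2}{n^2}$$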
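
The mean $\frac{1}{n}$ and variance $\frac{2(n-1)}{n^2(n+2)}$ can be checked by simulation. The sketch below samples spherically symmetric unit vectors by normalising vectors of independent Gaussians, which is a standard sampling technique assumed here and not described on the slides. By symmetry the fixed unit vector is taken to be the first coordinate axis. The seed, $n = 20$, and the trial count are arbitrary choices.

```python
import math
import random

def random_unit_vector(n, rng):
    """Spherically symmetric random unit vector: normalise n independent Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cos2_samples(n, trials, seed=0):
    """Samples of cos^2(theta) between the fixed vector e_1 and a random unit vector."""
    rng = random.Random(seed)
    return [random_unit_vector(n, rng)[0] ** 2 for _ in range(trials)]

n, trials = 20, 20_000
samples = cos2_samples(n, trials)
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
print("mean", mean, "expected", 1 / n)
print("variance", var, "expected", 2 * (n - 1) / (n * n * (n + 2)))
```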
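
Tail Bounds on $\cos^2(\theta)$
$$\Pr\left(\cos^2(\theta) > \frac{a^2}{n}\right) = \frac{\int_0^{\cos^{-1}\left(\frac{a}{\sqrt{n}}\right)} \sin^{n-2}(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta}$$
$$\le \sqrt{\frac{2(n-1)(n-2)}{\pi}}\, \frac{1}{(n-3)a}\, e^{-\frac{n-3}{2n}a^2} \sim \sqrt{\frac{2}{\pi}}\, \frac{1}{a}\, e^{-\frac{a^2}{2}}$$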

Projection Length of Fixed Unit Vector on Random Unit Vector
With probability $1 - \sqrt{\frac{2}{\pi}}\, \frac{1}{a}\, e^{-\frac{a^2}{2}}$, the projected length is between $0$ and $\frac{a}{\sqrt{n}}$
With probability 0.946, the projected length is between $0$ and $\frac{2}{\sqrt{n}}$
Can we drive the projected length to be much more tightly distributed around $\frac{1}{\sqrt{n}}$?
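
The figure 0.946 is the case $a = 2$ of the expression above. A minimal check, which also compares the bound with the exact two-sided standard normal tail computed through `math.erfc` (that comparison is added here for context):

```python
import math

def tail_bound(a):
    """Asymptotic tail bound sqrt(2/pi) * (1/a) * exp(-a^2 / 2)."""
    return math.sqrt(2 / math.pi) / a * math.exp(-a * a / 2)

def normal_two_sided_tail(a):
    """Exact Pr(|Z| > a) for a standard normal Z, via the complementary error function."""
    return math.erfc(a / math.sqrt(2))

a = 2
print(round(1 - tail_bound(a), 3))           # probability quoted on the slide
print(normal_two_sided_tail(a), tail_bound(a))
```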

Project on to many Random Vectors
Let $X_1, \dots, X_k$ be the projection lengths on to $k$ independent random unit vectors
The resulting $k$-tuple defines a mapping from $n$-dimensional space to $k$-dimensional space
$X = \sqrt{X_1^2 + \cdots + X_k^2}$ is the length of the vector post-mapping
Consider $X^2 = X_1^2 + \cdots + X_k^2$.
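
Sums of Random Variables
Since $X_1^2, \dots, X_k^2$ are i.i.d., $E\left(\frac{X^2}{k}\right) = E(X_1^2)$ and $Var\left(\frac{X^2}{k}\right) = \frac{Var(X_1^2)}{k}$
I.e., the distribution of $\frac{X^2}{k}$ preserves the mean but is much tighter around the mean.
$$\Pr\left(\left|\frac{X^2}{k} - E\left(\frac{X^2}{k}\right)\right| \ge \alpha\right) \ll \Pr\left(|X_1^2 - E(X_1^2)| \ge \alpha\right)$$
$$\Pr\left(|X^2 - E(X^2)| \ge k\alpha\right) \ll \Pr\left(|X_1^2 - E(X_1^2)| \ge \alpha\right)$$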

Approximate Length Preservation in $k$-Dimensional Random Projection
$E(X^2) = \frac{k}{n}$, by Linearity of Expectation
$Var(X^2) \le \frac{2k}{n^2}$, by Linearity of Variance under Independence
With probability $1 - ?$, $X^2$ is in $(1-\varepsilon)\frac{k}{n} \dots (1+\varepsilon)\frac{k}{n}$
If $?$ is as small as $m^{-3}$...
Union Bound: With probability $1 - m^{-1}$, lengths for $m^2$ distinct fixed vectors of arbitrary lengths are all simultaneously approximately preserved, modulo scaling by $\sqrt{\frac{n}{k}}$!!
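
A small simulation of the mapping, offered as a sketch rather than a reference implementation: a fixed unit vector is mapped to its $k$ dot products with independent random unit vectors, and the average of $X^2$ over many trials is compared with $\frac{k}{n}$. Random unit vectors are drawn by normalising Gaussian vectors (a standard technique assumed here). The values of $n$, $k$, the seed, and the trial count are arbitrary.

```python
import math
import random

def random_unit_vector(n, rng):
    """Spherically symmetric random unit vector: normalise n independent Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def project(v, k, rng):
    """Map v in R^n to the k-tuple of dot products with k independent random unit vectors."""
    n = len(v)
    return [sum(x * u for x, u in zip(v, random_unit_vector(n, rng))) for _ in range(k)]

rng = random.Random(1)
n, k, trials = 30, 10, 2000
v = [1.0] + [0.0] * (n - 1)  # a fixed unit vector
sq_lengths = [sum(x * x for x in project(v, k, rng)) for _ in range(trials)]
mean_sq = sum(sq_lengths) / trials
print("mean of X^2:", mean_sq, "expected k/n:", k / n)
print("rescaled length estimate:", math.sqrt(mean_sq * n / k))
```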
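
Asymptotic Tight Concentration for $X^2$
By CLT, for $k \to \infty$, the distribution of $X^2 = \sum_{i=1}^{k} X_i^2$ tends to $N\left(\frac{k}{n}, \le \frac{2k}{n^2}\right)$
$\Pr\left(\left|X^2 - \frac{k}{n}\right| \ge \varepsilon\frac{k}{n}\right)$ should then be $\le \sqrt{\frac{4}{\varepsilon^2 k \pi}}\, e^{-\frac{\varepsilon^2 k}{4}}$
For $k > \frac{12 \log m}{\varepsilon^2}$, this is at most $\frac{1}{m^3}$
How do we show this for finite $k$?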
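
A quick numeric reading of the threshold $k > \frac{12 \log m}{\varepsilon^2}$, assuming the logarithm is natural (which matches the $e^{-\varepsilon^2 k / 4}$ factor). The sample values of $m$ and $\varepsilon$ and the function names are illustrative assumptions.

```python
import math

def min_dimension(m, eps):
    """Threshold 12 * ln(m) / eps^2 on the target dimension k."""
    return 12 * math.log(m) / eps ** 2

def clt_tail(k, eps):
    """Heuristic CLT tail estimate sqrt(4 / (eps^2 k pi)) * exp(-eps^2 k / 4)."""
    return math.sqrt(4 / (eps ** 2 * k * math.pi)) * math.exp(-eps ** 2 * k / 4)

m, eps = 1000, 0.1
k = math.ceil(min_dimension(m, eps))
print("k =", k)
print("tail estimate:", clt_tail(k, eps), "target 1/m^3:", m ** -3)
```

Note that the threshold depends only on $m$ and $\varepsilon$, not on the original dimension $n$.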
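
Tight Concentration and Tail Bound Inequalities
Markov's Inequality for a non-negative random variable $Y$
$$\Pr(Y > k) \le \frac{E(Y)}{k}$$
Chebychev's Inequality
$$\Pr\left(\left|X^2 - \frac{k}{n}\right| \ge \varepsilon\frac{k}{n}\right) \le \frac{Var(X^2)}{\left(\varepsilon\frac{k}{n}\right)^2} \le \frac{2}{\varepsilon^2 k}$$
Not strong enough to yield negative exponential dependence on $k$.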
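
To make the gap concrete, the sketch below compares the $k$ needed to push the Chebychev bound $\frac{2}{\varepsilon^2 k}$ down to $m^{-3}$ with the $\frac{12 \log m}{\varepsilon^2}$ suggested by the CLT heuristic. The function names and the sample values are assumptions; the logarithm is taken as natural.

```python
import math

def chebyshev_k(m, eps):
    """k at which the Chebychev bound 2 / (eps^2 k) equals m^-3."""
    return 2 * m ** 3 / eps ** 2

def exponential_k(m, eps):
    """k suggested by the CLT heuristic: 12 * ln(m) / eps^2."""
    return 12 * math.log(m) / eps ** 2

for m in (10, 100, 1000):
    print(m, chebyshev_k(m, 0.1), exponential_k(m, 0.1))
```

Polynomial dependence on $m$ versus logarithmic dependence is the reason an exponential tail bound is needed.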
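
Lower Tail Bound for $X^2$
Using Markov's inequality on $e^{-tX^2}$, where $t > 0$ (as in Chernoff Bounds):
$$\Pr\left(X^2 < (1-\varepsilon)\frac{k}{n}\right) = \Pr\left(-tX^2 > -t(1-\varepsilon)\frac{k}{n}\right)$$
$$= \Pr\left(e^{-tX^2} > e^{-t(1-\varepsilon)\frac{k}{n}}\right) \le E(e^{-tX^2})\, e^{t(1-\varepsilon)\frac{k}{n}}$$
Since $X^2 = \sum_{i=1}^{k} X_i^2$ and the $X_i$'s are identical and independent:
$$E(e^{-tX^2})\, e^{t(1-\varepsilon)\frac{k}{n}} = E(e^{-tX_i^2})^k\, e^{t(1-\varepsilon)\frac{k}{n}}$$
$E(e^{-tX_i^2}) \le \,?$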
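
$E(e^{-tX_i^2}) \le \,?$
Using $1 - x \le e^{-x} \le 1 - x + \frac{x^2}{2}$, for all $x \ge 0$:
$$E(e^{-tX_i^2}) \le E\left(1 - tX_i^2 + \frac{t^2 X_i^4}{2}\right)$$
$$\le 1 - \frac{t}{n} + \frac{3t^2}{2n^2} \le e^{-\frac{t}{n}\left(1 - \frac{3t}{2n}\right)}$$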
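
Completing the Lower Tail Bound for $X^2$
$$\Pr\left(X^2 < (1-\varepsilon)\frac{k}{n}\right) \le E(e^{-tX_i^2})^k\, e^{t(1-\varepsilon)\frac{k}{n}}$$
$$\le e^{-\frac{kt}{n}\left(1 - \frac{3t}{2n}\right) + \frac{kt}{n}(1-\varepsilon)} \le e^{-\frac{kt}{n}\left(\varepsilon - \frac{3t}{2n}\right)}$$
Setting $t = \frac{n\varepsilon}{3} > 0$ to minimize the above
$$\Pr\left(X^2 < (1-\varepsilon)\frac{k}{n}\right) \le e^{-\frac{k\varepsilon}{3}\left(\varepsilon - \frac{\varepsilon}{2}\right)} \le e^{-\frac{k\varepsilon^2}{6}}$$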
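
The choice $t = \frac{n\varepsilon}{3}$ maximises the exponent $\frac{kt}{n}\left(\varepsilon - \frac{3t}{2n}\right)$, which is a downward parabola in $t$. A brute-force grid search, written only as an illustrative check with arbitrary sample values of $n$, $k$, and $\varepsilon$:

```python
def lower_tail_exponent(t, n, k, eps):
    """Exponent E(t) in the bound exp(-E(t)), with E(t) = (k t / n) * (eps - 3 t / (2 n))."""
    return (k * t / n) * (eps - 3 * t / (2 * n))

n, k, eps = 1000, 200, 0.1
# Grid of t values from just above 0 up to 2 * n * eps / 3, in steps of n * eps / 3000.
grid = [i * n * eps / 3000 for i in range(1, 6001)]
best_t = max(grid, key=lambda t: lower_tail_exponent(t, n, k, eps))
print("best t on grid:", best_t, "claimed optimum:", n * eps / 3)
print("exponent at optimum:", lower_tail_exponent(best_t, n, k, eps), "k eps^2 / 6:", k * eps ** 2 / 6)
```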
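
Upper Tail Bound for $X^2$
As for the Lower Tail Bound, with $t > 0$:
$$\Pr\left(X^2 > (1+\varepsilon)\frac{k}{n}\right) = \Pr\left(tX^2 > t(1+\varepsilon)\frac{k}{n}\right)$$
$$= \Pr\left(e^{tX^2} > e^{t(1+\varepsilon)\frac{k}{n}}\right) \le E(e^{tX^2})\, e^{-t(1+\varepsilon)\frac{k}{n}}$$
Since $X^2 = \sum_{i=1}^{k} X_i^2$ and the $X_i$'s are identical and independent:
$$E(e^{tX^2})\, e^{-t(1+\varepsilon)\frac{k}{n}} = E(e^{tX_i^2})^k\, e^{-t(1+\varepsilon)\frac{k}{n}}$$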
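
The Upper Tail Bound for $X^2$
Setting $y = \cos^2\theta$:
$$\frac{\int_0^{\pi/2} \sin^{n-2}\theta\, e^{t\cos^2\theta}\, d\theta}{\int_0^{\pi/2} \sin^{n-2}\theta\, d\theta} \le \sqrt{\frac{2(n-1)}{\pi}}\, \frac{1}{2} \int_0^1 \frac{(1-y)^{\frac{n-3}{2}}\, e^{ty}}{\sqrt{y}}\, dy$$
Using $1 - y \le e^{-y}$, $\forall y$:
$$\le \sqrt{\frac{2(n-1)}{\pi}}\, \frac{1}{2} \int_0^1 \frac{e^{-y\left(\frac{n-3}{2} - t\right)}}{\sqrt{y}}\, dy$$
Using $\int_0^c u^{-\frac{1}{2}} e^{-u}\, du \le \sqrt{\pi}$ for any $c > 0$, after substituting $u = y\left(\frac{n-3}{2} - t\right)$ with $t < \frac{n-3}{2}$:
$$\le \sqrt{\frac{2(n-1)}{\pi}}\, \frac{1}{2\sqrt{\frac{n-3}{2} - t}}\, \sqrt{\pi} \le \sqrt{\frac{n-1}{n-3-2t}}$$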
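
Completing the Upper Tail Bound for $X^2$
$$E(e^{tX_i^2})^k\, e^{-t(1+\varepsilon)\frac{k}{n}} \le \left(\sqrt{\frac{n-1}{n-3-2t}}\right)^k e^{-t(1+\varepsilon)\frac{k}{n}}$$
Using $(1-x)^{-\frac{1}{2}} \le \sqrt{1 + x + 2x^2} \le e^{\frac{x}{2}(1+2x)}$, for $0 \le x \le \frac{1}{2}$, and constraining $0 < 2t < \frac{n-3}{2}$, $k \ll n$:
$$\left(\sqrt{\frac{n-1}{n-3-2t}}\right)^k \le \left(\sqrt{\frac{n-1}{n-3}}\right)^k \left(1 - \frac{2t}{n-3}\right)^{-\frac{k}{2}} \le e^{O\left(\frac{k}{n}\right) + \frac{tk}{n-3}\left(1 + \frac{4t}{n-3}\right)}$$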
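Completing the Upper Tail Bound for $X^2$
So:
$$E\left(e^{tX_i^2}\right)^k e^{-t(1+\varepsilon)\frac{k}{n}} \le \left[e^{O\left(\frac{k}{n}\right) + \frac{tk}{n-3}\left(1 + \frac{4t}{n-3}\right)}\right]\left[e^{-t(1+\varepsilon)\frac{k}{n}}\right]$$
$$= e^{O\left(\frac{k}{n}\right) + \left[\frac{tk}{n-3} - \frac{tk}{n}\right] + \left[\frac{4t^2k}{(n-3)^2} - \frac{\varepsilon tk}{n}\right]}$$
$$\le e^{O\left(\frac{k}{n}\right) + \left[\frac{4t^2k}{(n-3)^2} - \frac{\varepsilon tk}{n}\right]}$$
Setting $t = \frac{\varepsilon (n-3)^2}{8n}$ and assuming $k \ll n$, we get:
$$\le e^{-\frac{\varepsilon^2 k}{16} + O\left(\frac{k}{n}\right)} \le 2e^{-\frac{\varepsilon^2 k}{16}}$$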
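The substitution step is compressed on the slide. The following worked arithmetic is a sketch of how the terms could be evaluated, assuming $0 < \varepsilon \le 1$. The chosen $t$ is the minimiser of the quadratic $\frac{4t^2k}{(n-3)^2} - \frac{\varepsilon t k}{n}$.

```latex
\begin{align*}
\text{With } t &= \frac{\varepsilon (n-3)^2}{8n}: \\
\frac{4t^2k}{(n-3)^2} &= \frac{\varepsilon^2 (n-3)^2 k}{16 n^2},
\qquad
\frac{\varepsilon t k}{n} = \frac{\varepsilon^2 (n-3)^2 k}{8 n^2}, \\
\frac{4t^2k}{(n-3)^2} - \frac{\varepsilon t k}{n}
  &= -\frac{\varepsilon^2 k}{16}\left(1 - \frac{3}{n}\right)^2
  = -\frac{\varepsilon^2 k}{16} + O\!\left(\frac{k}{n}\right), \\
\frac{tk}{n-3} - \frac{tk}{n}
  &= \frac{3tk}{n(n-3)}
  = \frac{3\varepsilon (n-3) k}{8 n^2}
  = O\!\left(\frac{k}{n}\right).
\end{align*}
```

The last line is why the bracket $\left[\frac{tk}{n-3} - \frac{tk}{n}\right]$ can be absorbed into the $O(k/n)$ term, and $e^{O(k/n)} \le 2$ once $k \ll n$.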
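Wrapping Up: The Johnson-Lindenstrauß Theorem
Given $m$ points $a_1, \ldots, a_m$ in $n$-dimensional space, $m \ge n$, and given $\varepsilon$, $0 < \varepsilon \le 1$.
Choose $k$ random unit vectors $r_1, \ldots, r_k$, where $k = \frac{48 \ln m}{\varepsilon^2} \ll n$.
Define $k$-dimensional points $b_1, \ldots, b_m$, where $b_i = (a_i \cdot r_1, a_i \cdot r_2, \ldots, a_i \cdot r_k)$.
Consider any pair $a_i, a_j$. Then:
$$\frac{|b_i - b_j|}{|a_i - a_j|} = \sqrt{\left(\frac{a_i - a_j}{|a_i - a_j|} \cdot r_1\right)^2 + \left(\frac{a_i - a_j}{|a_i - a_j|} \cdot r_2\right)^2 + \cdots + \left(\frac{a_i - a_j}{|a_i - a_j|} \cdot r_k\right)^2}$$
Then
$$\sqrt{1-\varepsilon}\,\sqrt{\frac{k}{n}} \le \frac{|b_i - b_j|}{|a_i - a_j|} \le \sqrt{1+\varepsilon}\,\sqrt{\frac{k}{n}}$$
with probability at least $1 - \frac{3}{m^3}$, since each pair fails with probability at most $\frac{3}{m^3}$.
By a union bound over the fewer than $\frac{m^2}{2}$ pairs, this holds for all pairs simultaneously with probability at least $1 - \frac{3}{2m}$.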
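The construction can be tried out in a few lines of code. The sketch below is illustrative only and rests on several assumptions that are not in the slides: the random unit vectors are drawn independently as normalised Gaussian vectors, $k$ is passed in as a free parameter so that small experiments run quickly in pure Python, and all function names (`jl_dimension`, `distortion_ratios`, and so on) are hypothetical.

```python
import math
import random

def jl_dimension(m, eps):
    """Target dimension k = 48 ln(m) / eps^2 from the slide, rounded up."""
    return math.ceil(48 * math.log(m) / eps ** 2)

def random_unit_vector(n, rng):
    # Assumption: a normalised Gaussian vector, which is uniform on the sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def project(points, directions):
    # b_i = (a_i . r_1, ..., a_i . r_k)
    return [[sum(x * y for x, y in zip(a, r)) for r in directions]
            for a in points]

def distance(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def distortion_ratios(points, k, rng):
    """Return |b_i - b_j| / (|a_i - a_j| * sqrt(k/n)) for every pair i < j.

    Values close to 1 mean the pairwise distance is preserved up to the
    common scaling factor sqrt(k/n)."""
    n = len(points[0])
    directions = [random_unit_vector(n, rng) for _ in range(k)]
    projected = project(points, directions)
    scale = math.sqrt(k / n)
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratios.append(distance(projected[i], projected[j])
                          / (distance(points[i], points[j]) * scale))
    return ratios

if __name__ == "__main__":
    rng = random.Random(0)
    points = [[rng.gauss(0.0, 1.0) for _ in range(400)] for _ in range(5)]
    ratios = distortion_ratios(points, k=300, rng=rng)
    print(min(ratios), max(ratios))
```

For a given $\varepsilon$, the theorem predicts that every ratio lies in $[\sqrt{1-\varepsilon}, \sqrt{1+\varepsilon}]$ with high probability once $k$ is at least `jl_dimension(m, eps)`. The demo uses a smaller $k$ purely to keep the run time short.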