Sampling, Matrices, Tensors

January 11, 2013


Set-up

A an m × n matrix, real entries.
Attach a probability p_j to the j-th column of A. The p_j sum to 1.
In s i.i.d. trials, pick s columns of A with these probabilities.
Scale the picked columns.
Form the m × s matrix B of sampled, scaled columns.
Want B_{m×s} ≈ A_{m×n}. Makes sense??
Try BB^T ≈ AA^T. Both are m × m!
With the correct scaling, can make it unbiased:

E(BB^T) = AA^T.
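As a concrete sketch (not from the slides; the function name is mine): scaling the picked column j by 1/sqrt(s·p_j) is the scaling that makes BB^T unbiased, which a quick Monte Carlo average confirms.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_columns(A, s, p):
    """Pick s columns of A i.i.d. with probabilities p, scaling the
    picked column j by 1/sqrt(s * p_j); then E[B B^T] = A A^T."""
    idx = rng.choice(A.shape[1], size=s, p=p)
    return A[:, idx] / np.sqrt(s * p[idx])

# Monte Carlo check of unbiasedness: the average of B B^T over many
# independent draws should approach A A^T.
A = rng.standard_normal((5, 40))
p = np.full(40, 1.0 / 40)        # any fixed distribution works for unbiasedness
est = np.zeros((5, 5))
trials = 2000
for _ in range(trials):
    B = sample_columns(A, 10, p)
    est += B @ B.T
est /= trials
```

The choice of p does not affect unbiasedness, only the variance, which is the point of the next slide.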


Length Squared Sampling

Large matrix A. Sampling and scaling some columns to form B got us

E(BB^T) = AA^T.

Minimize variance. [Of a matrix?]
Frieze, K., Vempala: sampling probabilities proportional to the SQUARED LENGTHS of the columns minimize the variance.
Many applications of length-squared sampling:
Estimates of matrix invariants.
Matrix compression by sampling: a sample of rows and columns is sufficient to approximate any matrix. Drineas, K., Mahoney
Approximate maximization of cubic and higher forms.
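A minimal sketch of the length-squared distribution itself (function name mine): p_j = ||a_j||² / ||A||_F².

```python
import numpy as np

def length_squared_probs(A):
    """Frieze-Kannan-Vempala probabilities: p_j proportional to the
    squared Euclidean length of column j, i.e. p_j = ||a_j||^2 / ||A||_F^2."""
    col_sq = np.sum(A * A, axis=0)       # ||a_j||^2 for each column j
    return col_sq / col_sq.sum()         # normalize by ||A||_F^2

A = np.array([[3.0, 0.0],
              [4.0, 1.0]])              # columns have lengths 5 and 1
p = length_squared_probs(A)             # -> [25/26, 1/26]
```

Feeding these p into the column-sampling scheme of the previous slide gives the minimum-variance estimator.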


How many Samples do we need?

We fix one measure of error for this talk, namely the relative spectral norm:

‖AA^T − BB^T‖ / ‖AA^T‖.

How many samples (= s, the number of columns of B) do we need to ensure that, with high probability, the error is at most 0.01?
Let r = rank(A). [Actually, r = ||A||_F² / ||A||² (the stable rank), which is at most the rank, will do.]
Original FKV: s = r³ works.
Drineas, K., Mahoney: s = r² suffices.
Rudelson and Vershynin: s = r log r suffices. Uses some nice ideas from Functional Analysis (decoupling). Simpler proof of the main tool by Ahlswede and Winter in Information Theory.
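The quantity r = ||A||_F² / ||A||² is cheap to compute from the singular values; a sketch (function name mine):

```python
import numpy as np

def stable_rank(A):
    """||A||_F^2 / ||A||_2^2: at most rank(A), and it is all the
    s = r log r sampling bound needs."""
    svals = np.linalg.svd(A, compute_uv=False)   # descending singular values
    return np.sum(svals**2) / svals[0]**2

I4 = np.eye(4)
r1 = np.outer(np.arange(1.0, 4.0), np.ones(5))   # a rank-1 matrix
# stable_rank(I4) == 4.0, stable_rank(r1) ≈ 1.0
```

Unlike the rank, the stable rank is robust to tiny singular values, which is why it is the right parameter for sampling bounds.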


Variance-covariance matrices

v a vector-valued random variable (with a probability distribution, or density, in n-space).

Eg. 1: v is a random column of a fixed matrix.
Eg. 2: v has a general (non-spherical) Gaussian or other density.

How many i.i.d. samples of v are sufficient to ensure a relative-error approximation to the variance-covariance matrix E(vv^T)?
Want: sample variance-covariance matrix ≈_ε true variance-covariance matrix. (M₁ ≈_ε M₂ if x^T M₁ x ≈_ε x^T M₂ x for all x.)
Question raised for log-concave densities by K., Lovász, Simonovits for computing volumes of convex sets. First improvement by Bourgain, then Rudelson to O(n log n), and most recently Srivastava, Vershynin to O(n). Relative error is important (and more difficult) for:
Linear regression, when we are looking for the x minimizing x^T (Var-Covar) x.
Graph, matrix sparsification: Spielman, Srivastava, Batson, Teng.
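For Eg. 2, the estimator in question is just the empirical second moment; a hedged sketch (names mine) checking convergence for a non-spherical Gaussian:

```python
import numpy as np

rng = np.random.default_rng(1)

def empirical_second_moment(samples):
    """Estimate E[v v^T] from i.i.d. samples given as the rows of `samples`."""
    return samples.T @ samples / samples.shape[0]

# Non-spherical Gaussian: v = L g with g standard normal, so E[v v^T] = L L^T.
L = np.array([[2.0, 0.0],
              [1.0, 0.5]])
g = rng.standard_normal((200_000, 2))
V = g @ L.T
M_hat = empirical_second_moment(V)       # should be close to L @ L.T
```

The slide's question is how small the number of rows can be while keeping the relative (quadratic-form) error uniformly small.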


Matrix-Valued Random Variables

Last slide: prove concentration for vv^T, v a random vector. Rank 1.
Generally: concentration for ||X||, X = X₁ + X₂ + ··· + X_n, the X_i independent d × d matrix-valued r.v.'s with

0 ⪯ X_i ⪯ I.

Traditional methods (Wigner, ...): bound E Tr(X₁ + X₂ + ··· + X_n)^m, m large even.
Ahlswede and Winter: a Chernoff bound using the Bernstein method. Crucial: the Golden-Thompson inequality.

Theorem. X_i i.i.d. Pr(X ∉ [(1 − ε)EX, (1 + ε)EX]) ≤ d e^{−ε²n}, for ε ≤ 1.

Tropp: independence suffices; don't need i.i.d. [Lieb's inequality instead of Golden-Thompson.]
Open: prove such concentration for negatively correlated (but not independent) X_i.
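A quick numerical illustration (not a proof; the setup is mine) with rank-one X_i = u_i u_i^T for unit u_i, so 0 ⪯ X_i ⪯ I:

```python
import numpy as np

rng = np.random.default_rng(2)

# X = X_1 + ... + X_n with X_i = u_i u_i^T, u_i uniform unit vectors,
# so each X_i satisfies 0 <= X_i <= I and E[X_i] = I/d.
d, n = 8, 5000
U = rng.standard_normal((n, d))
U /= np.linalg.norm(U, axis=1, keepdims=True)    # unit rows u_i
X = U.T @ U                                      # = sum_i u_i u_i^T
EX = (n / d) * np.eye(d)
# Concentration: X stays within a small spectral-norm neighborhood of E[X].
rel_err = np.linalg.norm(X - EX, 2) / np.linalg.norm(EX, 2)
```

With n >> d log d the relative deviation is small, consistent with the d·e^{−ε²n} tail above.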


Matrix Sparsification

An n × m matrix A. [Think of m >> n.] [Each column is a record in a database.]
Sample s columns of A (with a probability distribution of your choice) to get a matrix B so that for every x:

x^T(AA^T)x ≈_ε x^T(BB^T)x, i.e. |x^T A| ≈_ε |x^T B|.

What probability distribution, and what s?
Length-squared sampling only gives us | |x^T A| − |x^T B| | ≤ 0.01 ||A||. Bad for x with small |x^T A|.
Do length-squared sampling on (basically) A^{−1}A (!!??!!): an isometry, equally good for all x! Spielman, Srivastava, Batson; Drineas, Mahoney, Muthukrishnan.
s = O*(n) will do (whatever m is). Implies:
Theorem. For any n × m matrix A, there is a subset B of O(n) (scaled) columns of A such that for every x,

|x^T A| ≈_{0.01} |x^T B|.
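In modern terms, length-squared sampling of the "whitened" matrix W = (AA^T)^{−1/2}A (one reading of the slide's A^{−1}A, which is my assumption here) is leverage-score sampling; a sketch, function name mine:

```python
import numpy as np

def leverage_score_probs(A):
    """Length-squared probabilities for W = (A A^T)^{-1/2} A.
    From the thin SVD A = U S V^T (full row rank), W = U V^T, so the
    squared length of column j of W is the squared norm of column j
    of V^T: the j-th leverage score. The scores sum to rank(A)."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    scores = np.sum(Vt * Vt, axis=0)     # leverage score of each column
    return scores / scores.sum()

A = np.random.default_rng(3).standard_normal((4, 50))   # n=4, m=50, m >> n
p = leverage_score_probs(A)
```

Because W has orthonormal rows, no direction x is underweighted, which is why this distribution works equally well for all x.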


Page 36: Sampling, Matrices, Tensorsmath.iisc.ac.in/~nmi/downloads/kannan_conf.pdf · Sampling, Matrices, Tensors January 11, 2013 Sampling, Matrices, Tensors January 11, 2013 1 / 1. Set-up

Matrix Sparsification

n ! m matrix A. [Think of m >> n.] [Each column is a record in adatabase.]Sample s columns of A (with a probability distribution of yourchoice) to get matrix B so that for every x :

xT (AAT )x "! xT (BBT )x ' |xT A| "! |xT B|.

What probability distribution and what s ?Length-squared sampling only gives us!!|xT A|#| xT B|

!! % 0.01||A||. Bad for x with small |xT A|.Do length-squared sampling on (basically) A"1A (!!??!!) -Isometry, equally good for all x ! Spielman, Srivatisava, Batsman;Drineas, Mahoney, Muthukrishnans = O#(n) will do (whatever m is). Implies:Theorem For any n ! m matrix A, there is a subset B of O(n)(scaled) columns of A such that for every x ,

|xT A| "0.01 |xT B|.

() Sampling, Matrices, Tensors January 11, 2013 7 / 1


Graph Sparsification - a special case of Matrix Sparsification

Sample edges so as to represent every cut size to within relative error. Then find the sparsest cut in the sampled graph.

Indeed, for graphs, sampling probabilities proportional to effective electrical resistances work and make sparsification possible in nearly linear time. No such fast algorithm is known for general matrix sparsification.

() Sampling, Matrices, Tensors January 11, 2013 8 / 1
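To make the resistance idea concrete, here is a small numerical sketch (not the nearly-linear-time algorithm itself, which avoids the dense pseudoinverse used here): effective resistances are read off the pseudoinverse of the graph Laplacian, edges are sampled i.i.d. proportionally to them, and each sampled edge is reweighted by 1/(s p_e) so the sparsified Laplacian is unbiased. The graph, seed, and sample count are made-up demo values:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12
# Demo graph: a cycle (guaranteeing connectivity) plus random chords.
edges = {(i, i + 1) for i in range(n - 1)} | {(0, n - 1)}
while len(edges) < 30:
    u, v = rng.integers(n, size=2)
    if u != v:
        edges.add((int(min(u, v)), int(max(u, v))))
edges = sorted(edges)

L = np.zeros((n, n))
for u, v in edges:
    L[u, u] += 1; L[v, v] += 1; L[u, v] -= 1; L[v, u] -= 1

# Effective resistance of edge (u, v): (e_u - e_v)^T L^+ (e_u - e_v).
Lp = np.linalg.pinv(L)
R = np.array([Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v in edges])
p = R / R.sum()                     # sample proportionally to resistance

s = 4 * len(edges)                  # crude oversampling for the demo
Ls = np.zeros((n, n))
for t in rng.choice(len(edges), size=s, p=p):
    u, v = edges[t]
    w = 1.0 / (s * p[t])            # reweighting keeps E(Ls) = L
    Ls[u, u] += w; Ls[v, v] += w; Ls[u, v] -= w; Ls[v, u] -= w

print(np.linalg.norm(Ls - L, 2) / np.linalg.norm(L, 2))
```

Sampling by resistance rather than plain length-squared is what equalizes the variance across all cuts; by Foster's theorem, the resistances of a connected n-vertex graph sum to n − 1, which is why a sample size near-linear in n suffices.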


Maximizing Cubic and higher forms

Given an m × n × p array A_{ijk}, find

||A|| = max_{|x|=|y|=|z|=1} A(x, y, z) = Σ_{ijk} A_{ijk} x_i y_j z_k.

All we say here applies to higher forms, A_{ijkl}, etc.

There is no theory or algorithmics as clean and nice as for matrices. In fact, exact maximization is computationally hard for quartic and higher forms.
Theorem Using length-squared sampling, we can find (in polynomial time) x, y, z such that, with high probability,

A(x, y, z) ≥ ||A|| − 0.01 ||A||_F,

where ||A||_F^2 is the sum of squares of all entries of A. [Alas, we cannot replace || · || on the left by || · ||_F or vice versa.] de la Vega, Karpinski, K., Vempala

() Sampling, Matrices, Tensors January 11, 2013 9 / 1


Maximizing cubic forms

Central Problem: Find unit vectors x, y, z to maximize

Σ_{ijk} A_{ijk} x_i y_j z_k.

If we knew the optimizing y, z, then the optimizing x is easy to find: it is just the vector A(·, y, z) (whose i-th component is A(e_i, y, z)) scaled to length 1.

Now, A(e_i, y, z) = Σ_{j,k} A_{ijk} y_j z_k.

The sum can be estimated from just a few terms, namely the values y_j, z_k for a few j, k.
Of course we don't know these values, but FEW ⟹ we can enumerate all possibilities.
How do we make sure the variance is not too high, since the entries can have disparate values? Length-squared sampling works! [Stated here without proof.]

This gives us many candidate x's. How do we check which one is good? For each x, form the matrix A(x). Solve the quadratic-form maximization for this matrix to find the best y, z. Take the best candidate x.

() Sampling, Matrices, Tensors January 11, 2013 10 / 1
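The checking step has a clean linear-algebra form: for a fixed x, the maximum over unit y, z of A(x, y, z) is the top singular value of the matrix A(x) = Σ_i x_i A[i], with the optimal y, z given by the top singular vectors. A small sketch, in which random unit vectors stand in for the enumerated candidates and the tensor and its sizes are invented for the demo:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 6, 7, 8
A = rng.standard_normal((m, n, p))       # a random 3-tensor for the demo

def best_given_x(x):
    """For a fixed unit x, max over unit y, z of A(x, y, z) is the top
    singular value of the n-by-p matrix A(x) = sum_i x_i A[i]; the
    optimal y, z are the corresponding singular vectors."""
    Ax = np.tensordot(x, A, axes=1)      # contract x against the first index
    U, svals, Vt = np.linalg.svd(Ax)
    return svals[0], U[:, 0], Vt[0]

# Random unit vectors stand in for the enumerated candidate x's.
best = (-np.inf, None, None, None)
for c in rng.standard_normal((50, m)):
    c /= np.linalg.norm(c)
    v, y, z = best_given_x(c)
    if v > best[0]:
        best = (v, c, y, z)
val, x, y, z = best

# Sanity check: the reported value equals the trilinear form at (x, y, z).
form = np.einsum('ijk,i,j,k->', A, x, y, z)
print(val, form)
```

The SVD call is exactly the "solve the quadratic-form maximization for the matrix A(x)" subroutine of the last bullet; in the actual algorithm the candidates come from enumerating the few sampled y_j, z_k values rather than from random draws.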


Combinatorial Application of Low-rank approximations

Szemerédi's Regularity Lemma:

Graph G on n vertices (n → ∞).
Can partition the vertex set into O(1) parts so that the edge sets between most pairs of parts behave as if they were thrown in at random with the correct density.

A beautiful theorem with many applications, including the van der Waerden conjecture.
Gowers The number of parts has to be at least a tower of height 1/ε^{20} in the error parameter ε.

() Sampling, Matrices, Tensors January 11, 2013 11 / 1


Weak Regularity Lemma

Vertex set V of a graph partitioned into V_1, V_2, ..., V_k.
The density d_{ij} between parts V_i and V_j is the fraction of pairs, one vertex from each part, that are edges.
Think of each edge between a vertex in V_i and one in V_j as being thrown in at random with probability d_{ij}.
The partition is "weakly" ε-regular if for any subsets S, T of vertices we have

Number of edges between S and T = E(that number, under the random model) ± εn^2.

Frieze, K. There is a weakly ε-regular partition with 2^{1/ε^2} parts. Such a partition can be found in polynomial time.
But why state this in this talk?

() Sampling, Matrices, Tensors January 11, 2013 12 / 1
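The definition can be sanity-checked numerically: on a random graph even a random partition is weakly regular, so the density-based prediction tracks the actual edge count between random S and T. Everything below (graph, partition, subsets, the ε = 0.05 tolerance) is synthetic demo data, and counts are over ordered pairs:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, eps = 200, 4, 0.05
G = np.triu(rng.random((n, n)) < 0.3, 1)
G = G | G.T                              # random simple graph

part = rng.integers(k, size=n)           # labels: vertex v lies in V_{part[v]}
d = np.zeros((k, k))                     # pairwise densities d_ij
for i in range(k):
    for j in range(k):
        block = G[np.ix_(part == i, part == j)]
        d[i, j] = block.mean() if block.size else 0.0

S = rng.random(n) < 0.5                  # a random pair of vertex subsets
T = rng.random(n) < 0.5
actual = G[np.ix_(S, T)].sum()           # ordered pairs (u in S, v in T)
predicted = sum(d[i, j] * (S & (part == i)).sum() * (T & (part == j)).sum()
                for i in range(k) for j in range(k))
print(actual, predicted, eps * n * n)
```

For a structured (non-random) graph, a random partition would of course fail this test; the content of the lemma is that some partition with 2^{1/ε^2} parts always passes it.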


Combinatorial Rank 1 matrices and Regularity

A cut matrix is of the form α u v^T, where α is a real number and u, v are 0-1 vectors.
(Easy) Any matrix can be approximated by a sum of a small number of cut matrices. Specifically, at most 1/ε^2 cut matrices suffice so that the error in "cut norm" is at most ε||A||_F.
Cut Norm: the maximum absolute value of the sum of entries in a rectangle (any subset of rows × any subset of columns).
Hard: Such an approximation can be found.
Easy: Such an approximation gives a weakly regular partition.
A weak regularity partition is not sufficient for many purely structural results. (Otherwise it would contradict the lower bounds for the van der Waerden problem.) It suffices for algorithmic applications.
Extends to higher-dimensional arrays (tensors).

() Sampling, Matrices, Tensors January 11, 2013 13 / 1
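A greedy sketch of the (Easy) existence argument: repeatedly pick a rectangle, subtract the best constant on it (its mean, i.e., a cut matrix α 1_S 1_T^T), and recurse on the residual; subtracting a block mean never increases the Frobenius norm of the residual, which is what drives the 1/ε^2 bound. Finding the rectangle of maximum absolute sum is the hard step; the sign-based rule below is a crude heuristic stand-in, and the matrix and iteration count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
A = rng.standard_normal((n, n))

def heuristic_rectangle(R):
    """Crude stand-in for the (hard) step of finding the rectangle S x T
    with large |sum of entries|: choose T by the sign of the column sums,
    then S by the sign of the row sums restricted to T."""
    t = R.sum(axis=0) > 0
    if not t.any():
        t = ~t
    s = R[:, t].sum(axis=1) > 0
    if not s.any():
        s = ~s
    return s, t

R = A.copy()
for _ in range(10):                      # subtract a few cut matrices
    s, t = heuristic_rectangle(R)
    alpha = R[np.ix_(s, t)].mean()       # best constant on the rectangle
    R[np.ix_(s, t)] -= alpha             # residual after alpha * 1_S 1_T^T

print(np.linalg.norm(R, 'fro') / np.linalg.norm(A, 'fro'))
```

After each subtraction the chosen rectangle sums to zero exactly, so the residual's cut value on that rectangle is eliminated; the Frobenius norm of the residual is monotonically non-increasing.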
