Sparsity, Rank, and All That March 30, 2009 Ben Recht Center for the Mathematics of Information Caltech

Sparsity, Rank, and All That9.520/spring09/Classes/Recht-lecture-Mar-30... · 2009. 3. 30. · Sparsity, Rank, and All That March 30, 2009 Ben Recht. Center for the Mathematics of

Page 1:

Sparsity, Rank, and All That

March 30, 2009

Ben Recht

Center for the Mathematics of Information

Caltech

Page 2:

Underdetermined Linear Systems

• When A has fewer rows than columns, the linear system Ax = b has infinitely many solutions whenever it has any.

• Which one should be selected?

Page 3:

Mining for Biomarkers

• n_patients << n_peaks

• If very few peaks are needed for diagnosis, search for a sparse set of markers

• l1, LASSO, etc.

Page 4:

Recommender Systems

Page 5:

Netflix Prize

• One million big ones!

• Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest accuracy

• 17770 total movies x 480189 total users

• Over 8 billion possible ratings (17770 x 480189 is about 8.5 billion)

• How to fill in the blanks?

Page 6:

Abstract Setup: Matrix Completion

• How do you fill in the missing data?

• X_ij known for black cells, X_ij unknown for white cells

• Rows index movies, columns index users

• Low-rank model: $X = L R^*$, where X is k x n (kn entries), L is k x r, and R* is r x n (r(k+n) entries in the two factors)
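The entry counts on the last two slides can be checked with a few lines of arithmetic. This is a minimal sketch in Python; the rank `r = 10` and the helper name `factor_parameters` are hypothetical choices for illustration, not values from the slides.

```python
# Counts from the Netflix Prize slide; the rank r below is a hypothetical choice.
movies, users = 17770, 480189
known_ratings = 100_000_000

total_entries = movies * users                  # kn entries in the full matrix
observed_fraction = known_ratings / total_entries

def factor_parameters(k, n, r):
    """Number of entries in a k x r factor L plus an n x r factor R."""
    return r * (k + n)

r = 10  # hypothetical rank, not from the slides
print(total_entries)                            # 8532958530
print(round(100 * observed_fraction, 2))        # 1.17 (percent of entries known)
print(factor_parameters(movies, users, r))      # 4979590
```

Under this assumed rank, the factors hold far fewer numbers than the known ratings, which is the sense in which a low-rank model is parsimonious.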

Page 7:

Matrix Rank

• The rank of X is…
– the dimension of the span of the rows
– the dimension of the span of the columns
– the smallest number r such that there exists a k x r matrix L and an n x r matrix R with X = LR*

[Figure: X (k x n) = L (k x r) times R* (r x n)]
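The third characterization can be checked numerically. The sketch below uses Python with NumPy (an assumption, since the slides name no language) and builds a factorization X = LR* from the SVD. The helper name `rank_factorization` and the tolerance are illustrative choices.

```python
import numpy as np

def rank_factorization(X, tol=1e-10):
    """Return (L, R) with X ~= L @ R.conj().T using as few columns as possible."""
    U, sigma, Vh = np.linalg.svd(X, full_matrices=False)
    # Count singular values that are not numerically zero.
    r = int(np.sum(sigma > tol * sigma[0])) if sigma[0] > 0 else 0
    L = U[:, :r] * sigma[:r]      # k x r, columns scaled by singular values
    R = Vh[:r, :].conj().T        # n x r
    return L, R

rng = np.random.default_rng(0)
k, n, r = 6, 9, 2
X = rng.standard_normal((k, r)) @ rng.standard_normal((r, n))  # rank 2 by construction
L, R = rank_factorization(X)
```

Here `L` has as many columns as `np.linalg.matrix_rank(X)` reports, and `L @ R.conj().T` reproduces `X` up to rounding.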

Page 8:

[Diagram: Complex Systems. Labels: Structure, Rank, Dynamics, Sparsity, Predictions, Smoothness]

Page 9:

Parsimonious Models

• Model: $x = \sum_{i=1}^{r} w_i a_i$, with weights $w_i$, atoms $a_i$, and rank r

• Search for best linear combination of fewest atoms

• “rank” = fewest atoms needed to describe the model

• Suppose we want to solve Ax = b with x in M

• M = {all rank r models}

• What happens when dimension(M) is smaller than the number of rows of A?

Page 10:

Plan of Attack

• Encoding parsimony – embeddings, projections, and the atomic norm

• Example 1: Sparse vectors
– Atomic norm = l1
– Decoding via Restricted Isometry
– Decoding via most encodings

• Example 2: Low rank matrices
– Atomic norm = trace norm
– Decoding via Restricted Isometry
– Decoding via most encodings

• Other models and further directions

Page 11:

Whitney’s Theorem

• Any random projection of a d-dimensional manifold into 2d+1 dimensions is an embedding!

• Let $X = \{ t(x-y) : x, y \in M,\ t \in \mathbb{R} \} \subset \mathbb{R}^D$

• If D > 2d+1, a random a is not in X with probability one, since X has dimension at most 2d+1.

• Project orthogonally to a, and call the projection $\pi_a$.

• If there are distinct x, y in M with $\pi_a(x) = \pi_a(y)$, then there is a t with a = t(x-y) ∈ X (contradiction).

Page 12:

Whitney’s Theorem

• Any random projection of a d-dimensional manifold into 2d+1 dimensions is an embedding!

• If any random projection is an embedding, when can we reconstruct points in X from their projected values?

• Given a random encoder, when can we find a low-complexity decoder?

• Answer: need slightly more geometry


Page 13:

Parsimonious Models

• Model: $x = \sum_{i=1}^{r} w_i a_i$, with weights $w_i$, atoms $a_i$, and rank r

• Search for best linear combination of fewest atoms

• “rank” = fewest atoms needed to describe the model

• “natural” heuristic:
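One plausible reading of the “natural” heuristic, assuming it refers to the atomic norm named in the plan of attack: replace the count of atoms by the sum of the absolute weights. The symbol $\mathcal{A}$ for the set of atoms is notation introduced here (it is not the linear map A).

```latex
\|x\|_{\mathcal{A}} \;=\; \inf\Big\{ \sum_i |w_i| \;:\; x = \sum_i w_i a_i,\ a_i \in \mathcal{A} \Big\}
```

With the standard basis vectors as atoms this is the l1 norm, and with unit-norm rank one matrices as atoms it is the trace (nuclear) norm, which matches Examples 1 and 2 in the plan.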

Page 14:

Cardinality

• Vector x has cardinality s if it has at most s nonzeros.

• Atoms are a discrete set of orthogonal points

• Typical Atoms:
– standard basis
– Fourier basis
– Wavelet basis

Page 15:

Cardinality Minimization

• PROBLEM: Find the vector of lowest cardinality that satisfies/approximates the underdetermined linear system Ax = b

• NP-HARD:
– Reduction from EXACT-COVER [Natarajan 1995]
– Hard to approximate
– Known exact algorithms require enumeration
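To make "exact algorithms require enumeration" concrete, here is a minimal brute-force sketch in Python with NumPy. It tries every support of size 1, 2, … and stops at the first one that solves the system, so its cost grows combinatorially with n. The function name `min_cardinality_solution` and the tolerance are assumptions for illustration.

```python
from itertools import combinations
import numpy as np

def min_cardinality_solution(A, b, tol=1e-9):
    """Search supports in order of increasing size; return the first exact fit."""
    m, n = A.shape
    if np.linalg.norm(b) <= tol:
        return np.zeros(n)
    for s in range(1, n + 1):
        for support in combinations(range(n), s):
            cols = list(support)
            # Least squares restricted to the candidate support.
            coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
            if np.linalg.norm(A[:, cols] @ coef - b) <= tol:
                x = np.zeros(n)
                x[cols] = coef
                return x
    return None  # the system has no solution at this tolerance

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 8))
x0 = np.zeros(8)
x0[[1, 6]] = [1.5, -2.0]          # a 2-sparse vector to recover
x_found = min_cardinality_solution(A, A @ x0)
```

Even at this toy size the loop may visit dozens of supports; the number of candidates of size s is n choose s.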

Page 16:

Proposed Heuristic

Cardinality Minimization: $\min_x \ \mathrm{card}(x) \ \text{subject to} \ Ax = b$

Convex Relaxation: $\min_x \ \|x\|_1 \ \text{subject to} \ Ax = b$

• Long history (back to geophysics in the 70s)

• Flurry of recent work characterizing success of this heuristic: Candès, Donoho, Romberg, Tao, Tropp, etc., etc…

• “Compressed Sensing”
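The convex relaxation is a linear program, which is one way to compute it. The sketch below assumes SciPy's `linprog` with the HiGHS solver and uses the standard split x = u - v with u, v ≥ 0. The problem sizes are hypothetical and the helper name `l1_minimize` is illustrative, not the author's code.

```python
import numpy as np
from scipy.optimize import linprog

def l1_minimize(A, b):
    """Solve min ||x||_1 subject to Ax = b as an LP in (u, v) with x = u - v."""
    m, n = A.shape
    cost = np.ones(2 * n)                 # sum(u) + sum(v) equals ||x||_1 at the optimum
    A_eq = np.hstack([A, -A])             # A u - A v = b
    result = linprog(cost, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    u, v = result.x[:n], result.x[n:]
    return u - v

rng = np.random.default_rng(0)
m, n, s = 25, 40, 3                       # hypothetical sizes: 25 measurements, 3 nonzeros
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n)
x0[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
x_hat = l1_minimize(A, A @ x0)
```

For sizes like these, well inside the regime the later slides describe, `x_hat` typically matches `x0` to solver precision.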

Page 17:

Why l1 norm?

[Figure: plot of card(x) and ||x||_1]

Page 18:

• 2d vectors

• Atoms: 1 nonzero, $x^2 + y^2 = 1$

• Convex hull: the l1 ball $|x| + |y| \le 1$

Page 19:

[Figure: axes w1 and w2, with the affine set A(X)=b]

When is this intuition precise?

Page 20:

Restricted Isometry Property (RIP)

• Let $A: \mathbb{R}^n \to \mathbb{R}^m$ be a linear map. For every positive integer s ≤ m, define the s-restricted isometry constant to be the smallest number $\delta_s(A)$ such that

$$(1 - \delta_s(A)) \, \|x\| \le \|Ax\| \le (1 + \delta_s(A)) \, \|x\|$$

holds for all vectors x of cardinality at most s.

• Candès and Tao (2005).
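For very small matrices the restricted isometry constant can be computed by brute force, which also shows why it is hard to certify in general. This sketch assumes the convention with unsquared norms, (1 - δ)||x|| ≤ ||Ax|| ≤ (1 + δ)||x||, and requires s ≤ m. The function name is illustrative.

```python
from itertools import combinations
import numpy as np

def restricted_isometry_constant(A, s):
    """Brute-force delta_s(A) under the unsquared-norm convention (assumes s <= m)."""
    m, n = A.shape
    delta = 0.0
    for support in combinations(range(n), s):
        # Extreme singular values of the column submatrix bound ||Ax|| / ||x||
        # over all x supported on this set.
        sigma = np.linalg.svd(A[:, list(support)], compute_uv=False)
        delta = max(delta, sigma[0] - 1.0, 1.0 - sigma[-1])
    return delta

delta_identity = restricted_isometry_constant(np.eye(4), 2)
# Two identical columns: their difference is a 2-sparse vector in the null space.
A_repeat = np.array([[1.0, 1.0], [0.0, 0.0]])
delta_repeat = restricted_isometry_constant(A_repeat, 2)
```

The identity gives a constant of 0 (a perfect isometry), while the repeated-column matrix gives 1, the value at which uniqueness of sparse solutions can fail.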

Page 21:

RIP ⇒ Unique Sparse Solution

• Theorem: Suppose that $\delta_{2s}(A) < 1$ for some integer s ≥ 1. Then there can be at most one vector x with cardinality less than or equal to s satisfying Ax = b.

• Proof: Assume, on the contrary, that there exist two different vectors, x1 and x2, of cardinality at most s satisfying the linear system (Ax1 = Ax2 = b).

• Then z := x1 - x2 is a nonzero vector of cardinality at most 2s, and Az = 0.

• But then we would have

$$0 = \|Az\| \ge (1 - \delta_{2s}(A)) \, \|z\| > 0,$$

which is a contradiction.

Page 22:

RIP ⇒ Heuristic Succeeds

• Theorem: Let x0 be a vector of cardinality at most s. Let x* be the solution of Ax = Ax0 of smallest l1 norm. Suppose that $\delta_{4s}(A) < 1/4$. Then x* = x0.

• Deterministic condition on A

• Current best bound: $\delta_{2s}(A) < 0.2$ suffices.

Independent of n, m, s

Page 23:

RIP ⇒ Heuristic Succeeds

• Theorem: Let x0 be a vector of cardinality s. Let x* be the solution of Ax = Ax0 of smallest l1 norm. Suppose that s ≥ 1 is such that $\delta_{4s}(A) < 1/4$. Then x* = x0.

• Proof Sketch: Let R := x* - x0 be the error.

• The majority of the mass of R is concentrated in the support of x0: writing T for the support of x0, $\|R_{T^c}\|_1 \le \|R_T\|_1$.

• We can decompose R = R0 + R1 + R2 + …
– R0 is the projection of R onto the support of x0
– Ri have cardinality at most 3s and disjoint support from x0 for i > 0

Page 24:

RIP ⇒ Heuristic Succeeds (cont)

Strictly positive for $\delta_{4s} < 1/4$

• Using a bound from CRT 06

• Proof of l2 constrained version is similar

Page 25:

Nearly Isometric Random Variables

• Let A be a random variable that takes values in linear maps from R^n to R^m.

• We say that A is nearly isometrically distributed if

1. For all x ∈ R^n, $\mathbb{E}\,\|Ax\|^2 = \|x\|^2$ (isometric in expectation)

2. For all 0 < ε < 1 we have $\mathbb{P}\big( \big| \|Ax\|^2 - \|x\|^2 \big| \ge \epsilon \|x\|^2 \big) \le 2 \exp\!\big( -\tfrac{m}{2} \big( \tfrac{\epsilon^2}{2} - \tfrac{\epsilon^3}{3} \big) \big)$ (large deviations unlikely)
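Both conditions can be probed empirically for a Gaussian map. The sketch below assumes i.i.d. Gaussian entries with variance 1/m and hypothetical sizes; it samples the ratio ||Ax||² / ||x||² over many independent draws of A. This is a simulation, not a proof.

```python
import numpy as np

def squared_norm_ratios(m, n, trials, rng):
    """Sample ||Ax||^2 / ||x||^2 for a fixed x and fresh Gaussian A each trial."""
    x = rng.standard_normal(n)
    ratios = np.empty(trials)
    for t in range(trials):
        A = rng.standard_normal((m, n)) / np.sqrt(m)   # entries with variance 1/m
        ratios[t] = np.sum((A @ x) ** 2) / np.sum(x ** 2)
    return ratios

rng = np.random.default_rng(0)
ratios = squared_norm_ratios(m=50, n=200, trials=2000, rng=rng)
mean_ratio = ratios.mean()                             # close to 1: isometric in expectation
tail_fraction = np.mean(np.abs(ratios - 1.0) >= 0.5)   # rare: large deviations unlikely
```

The sample mean sits near 1, and deviations of 50% or more are rare at m = 50; they become rarer as m grows.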

Page 26:

Nearly Isometric RVs obey RIP

• Theorem: Fix 0 < δ < 1. If A is a nearly isometric random variable, then for every 1 ≤ s ≤ m, there exist constants c0, c1 > 0 depending only on δ such that $\delta_s(A) \le \delta$ whenever m ≥ c0 s log(n/s), with probability at least 1 - exp(-c1 m).

• Number of measurements: c0 s log(n/s), where c0 is a constant, s is the intrinsic dimension, and n is the ambient dimension

• Typical scaling for this type of result.

Page 27:

Examples of Restricted Isometries

• A_ij Gaussian with variance 1/m

• A a random projection

• “Most” transformations when properly scaled

Page 28:

Proof of RIP:

• Probability a fixed x is distorted is exponentially small in m (large deviations property)

• Can cover all x on the unit ball in R^s with a finite set of points whose size grows exponentially in s

• Since nearby x’s are distorted similarly, probability any s-sparse x is distorted is at most (number of supports) x (points per support) x (probability a fixed x is distorted)

• So no x is distorted with probability at least 1 - exp(-c1 m) if m ≥ c0 s log(n/s)
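A sketch of the counting behind these bullets, in the style of the Baraniuk, Davenport, DeVore, and Wakin argument cited in the references. The constants C and c are placeholders introduced here, not values from the slides.

```latex
\begin{align*}
\Pr[\text{some $s$-sparse $x$ is distorted}]
  &\le \binom{n}{s} \Big(\frac{C}{\delta}\Big)^{s} \, 2 e^{-c m} \\
  &\le 2 \exp\!\Big( s \log\frac{e n}{s} + s \log\frac{C}{\delta} - c m \Big),
  \qquad \text{using } \binom{n}{s} \le \Big(\frac{e n}{s}\Big)^{s} .
\end{align*}
```

The exponent is negative and of order m once m ≥ c0 s log(n/s) with c0 large enough relative to c, C, and δ, which is where the s log(n/s) scaling comes from.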

Page 29:

The l1 heuristic works!

• The l1 heuristic succeeds (at sparsity level s) for most A with m > c0 s log(n/s)

• Number of measurements: c0 s log(n/s), where c0 is a constant, s is the intrinsic dimension, and n is the ambient dimension

• Approach: Show that a properly scaled random A is nearly an isometry on the set of 4s-sparse vectors.

Page 30:

(Matrix) Rank

• Matrix X has rank r if it has at most r nonzero singular values.

• Atoms are the set of all rank one matrices

• Not a discrete set

Page 31:

[Diagram: applications with rank constraints]

• Controller Design, Model Reduction, System Identification: constraints involving the rank of the Hankel Operator, Matrix, or Singular Values

• Multitask Learning: rank of the Matrix of Classifiers

• Euclidean Embedding: rank of the Gram Matrix

• Recommender Systems: rank of the Data Matrix

Page 32:

Affine Rank Minimization

• PROBLEM: Find the matrix of lowest rank that satisfies/approximates the underdetermined linear system A(X) = b

• NP-HARD:
– Reduce to finding solutions to polynomial systems
– Hard to approximate
– Exact algorithms are awful (doubly exponential)

Page 33:

Singular Value Decomposition (SVD)

• If X is a matrix of size k x n (k ≤ n) then there are matrices U (k x k) and V (n x k) with orthonormal columns such that $X = U \Sigma V^*$

• $\Sigma$ is a diagonal matrix with entries $\sigma_1 \ge \dots \ge \sigma_k \ge 0$

• Fact: If X has rank r, then X has only r non-zero singular values.

• Dimension of rank r matrices: r(k+n-r) ≤ 2nr
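A short degree-of-freedom count, for real matrices, that recovers the dimension r(k+n-r) quoted above. The symbols $U_r$ and $V_r$ (the first r columns of U and V) are notation introduced here.

```latex
\begin{align*}
\underbrace{kr - \tfrac{r(r+1)}{2}}_{U_r \text{ (orthonormal columns)}}
+ \underbrace{r}_{\sigma_1, \dots, \sigma_r}
+ \underbrace{nr - \tfrac{r(r+1)}{2}}_{V_r \text{ (orthonormal columns)}}
&= r(k+n) - r^2 \\
&= r(k + n - r) \;\le\; 2nr \quad \text{since } k \le n .
\end{align*}
```

Each set of r orthonormal columns loses r(r+1)/2 degrees of freedom to the normalization and orthogonality constraints, which is why the count falls below the r(k+n) entries of an unconstrained factorization.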

Page 34:

Proposed Heuristic

Affine Rank Minimization: $\min_X \ \mathrm{rank}(X) \ \text{subject to} \ A(X) = b$

Convex Relaxation: $\min_X \ \|X\|_* \ \text{subject to} \ A(X) = b$, where $\|X\|_* = \sum_i \sigma_i(X)$ is the nuclear norm

• Proposed by Fazel (2002).

• Nuclear norm is the “numerical rank” in numerical analysis

• The “trace heuristic” from controls if X is p.s.d.

Page 35:

Why nuclear norm?

[Figure: plot of rank(X) and ||X||*]

• Just as l1 norm ⇒ sparsity, nuclear norm ⇒ low rank

• Nuclear norm of diagonal matrix = l1 norm of diagonal

Page 36:

Matrix and Vector Norms

• Vector

• Matrix

• Singular Values
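One standard way to line up matrix and vector norms, assuming this is the correspondence the slide intends: each common matrix norm is a vector norm applied to the singular values. The sketch uses NumPy's `np.linalg.norm` with `"nuc"`, `"fro"`, and `2`.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))
sigma = np.linalg.svd(X, compute_uv=False)     # singular values of X

nuclear = np.linalg.norm(X, "nuc")             # l1 norm of the singular values
frobenius = np.linalg.norm(X, "fro")           # l2 norm of the singular values
operator = np.linalg.norm(X, 2)                # l-infinity norm of the singular values

# For a diagonal matrix the nuclear norm is the l1 norm of the diagonal.
d = np.array([3.0, -1.0, 0.5])
nuclear_diag = np.linalg.norm(np.diag(d), "nuc")
```

The last line mirrors the earlier remark that the nuclear norm of a diagonal matrix equals the l1 norm of its diagonal.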

Page 37:

• 2x2 matrices

• plotted in 3d

• Atoms: rank 1, $x^2 + z^2 + 2y^2 = 1$

• Convex hull: the nuclear norm ball

Page 38:

• 2x2 matrices

• plotted in 3d

• Projection onto x-z plane is l1 ball

Page 39:

[Figure: axes w1 and w2, with the affine set A(X)=b]

Page 40:

So how do we compute it? And when does it work?

• 2x2 matrices

• plotted in 3d

• Not polyhedral…

Page 41:

Equivalent Formulations

• Semidefinite embedding:

• Low rank parametrization:
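For reference, the standard forms of these two formulations of nuclear norm minimization, assuming the slide follows the Recht, Fazel, and Parrilo paper in the references. The auxiliary variables $W_1$, $W_2$ are notation from that setting, not recovered from the slide.

```latex
% Semidefinite embedding
\min_{X, W_1, W_2} \ \tfrac{1}{2}\big(\operatorname{Tr} W_1 + \operatorname{Tr} W_2\big)
\quad \text{s.t.} \quad
\begin{bmatrix} W_1 & X \\ X^* & W_2 \end{bmatrix} \succeq 0, \quad A(X) = b

% Low rank parametrization
\min_{L, R} \ \tfrac{1}{2}\big(\|L\|_F^2 + \|R\|_F^2\big)
\quad \text{s.t.} \quad A(L R^*) = b
```

The first is a semidefinite program; the second is nonconvex in (L, R) but has the same optimal value as the nuclear norm problem when the factors are allowed enough columns.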

Page 42:

Computationally: Gradient Descent

• “Method of multipliers”

• Schedule for the penalty parameter controls the noise in the data

• Same global minimum as nuclear norm

• Dual certificate for the optimal solution

• When will this fail and when might it succeed?
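A minimal sketch of gradient descent on a low-rank parametrization, specialized to matrix completion. It swaps in a simpler technique than the slide's method of multipliers: plain gradient descent on a regularized least-squares objective over the factors. All sizes, the step size, and the weight `lam` are hypothetical choices.

```python
import numpy as np

def objective(L, R, M, mask, lam):
    """0.5*||mask*(L R^T - M)||_F^2 + 0.5*lam*(||L||_F^2 + ||R||_F^2)."""
    residual = mask * (L @ R.T - M)
    return 0.5 * np.sum(residual ** 2) + 0.5 * lam * (np.sum(L ** 2) + np.sum(R ** 2))

def gradient_descent(M, mask, L, R, lam=1e-3, step=0.1, iters=3000):
    """Plain gradient descent on the factors L (k x r) and R (n x r)."""
    for _ in range(iters):
        residual = mask * (L @ R.T - M)       # error on observed entries only
        grad_L = residual @ R + lam * L
        grad_R = residual.T @ L + lam * R
        L, R = L - step * grad_L, R - step * grad_R
    return L, R

rng = np.random.default_rng(0)
k, n, r = 20, 30, 2
M = rng.standard_normal((k, r)) @ rng.standard_normal((r, n))
M /= np.linalg.norm(M, 2)                     # scale so the fixed step size is safe
mask = rng.random((k, n)) < 0.6               # hypothetical: about 60% of entries observed
L0 = 0.1 * rng.standard_normal((k, r))
R0 = 0.1 * rng.standard_normal((n, r))
L, R = gradient_descent(M, mask, L0, R0)
start = objective(L0, R0, M, mask, 1e-3)
final = objective(L, R, M, mask, 1e-3)
```

The penalty on the Frobenius norms of the factors is the low-rank stand-in for the nuclear norm; comparing `start` and `final` shows how far the descent has reduced the objective.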

Page 43:

Restricted Isometry Property (RIP)

• Let $A: \mathbb{R}^{k \times n} \to \mathbb{R}^m$ be a linear map. (Without loss of generality, assume k ≤ n throughout.) For every positive integer r ≤ k, define the r-restricted isometry constant to be the smallest number $\delta_r(A)$ such that

$$(1 - \delta_r(A)) \, \|X\|_F \le \|A(X)\| \le (1 + \delta_r(A)) \, \|X\|_F$$

holds for all matrices X of rank at most r.

• Directly adapted from the RIP condition of Candès and Tao (2005).

Page 44:

RIP ⇒ Unique Low-rank Solution

• Theorem: Suppose that $\delta_{2r}(A) < 1$ for some integer r ≥ 1. Then there can be at most one matrix X with rank less than or equal to r satisfying A(X) = b.

• Proof: Assume, on the contrary, that there exist two different matrices, X1 and X2, of rank at most r satisfying the matrix equation (A(X1) = A(X2) = b).

• Then Z := X1 - X2 is a nonzero matrix of rank at most 2r, and A(Z) = 0.

• But then we would have

$$0 = \|A(Z)\| \ge (1 - \delta_{2r}(A)) \, \|Z\|_F > 0,$$

which is a contradiction.

Page 45:

RIP ⇒ Heuristic Succeeds

• Theorem: Let X0 be a matrix of rank r. Let X* be the solution of A(X) = A(X0) of smallest nuclear norm. Suppose that r ≥ 1 is such that $\delta_{5r}(A) < 1/10$. Then X* = X0.

• Deterministic condition on A

• No reason for estimate to be sharp

Independent of k, n, r, m

Page 46:

RIP ⇒ Heuristic Succeeds

• Theorem: Let X0 be a matrix of rank r. Let X* be the solution of A(X) = A(X0) of smallest nuclear norm. Suppose that r ≥ 1 is such that $\delta_{5r}(A) < 1/10$. Then X* = X0.

• Proof Sketch: Let R := X* - X0 be the error.

• The majority of the mass of R is concentrated in the row and column spaces of X0.

• We can decompose R = R0 + R1 + R2 + …
– R0 is concentrated near the row and column space of X0
– Ri have rank at most 3r and orthogonal row/col spaces to X0 for i > 0

• Then we can show

Page 47:

RIP ⇒ Heuristic Succeeds (cont)

Strictly positive for $\delta_{5r} < 1/10$

Page 48:

Nearly Isometric RVs obey RIP

• Theorem: Fix 0 < δ < 1. If A is a nearly isometric random variable, then for every 1 ≤ r ≤ k, there exist constants c0, c1 > 0 depending only on δ such that $\delta_r(A) \le \delta$ whenever m ≥ c0 r(k+n-r) log(kn), with probability at least 1 - exp(-c1 m).

• Number of measurements: c0 r(k+n-r) log(kn), where c0 is a constant, r(k+n-r) is the intrinsic dimension, and kn is the ambient dimension

• Typical scaling for this type of result.

Page 49:

Generic Proof:

• Probability a fixed X is distorted is exponentially small in m (large deviations property)

• I can cover all X with O(D^d) points where d is the intrinsic dimension and D is the embedded/ambient dimension

• Since nearby X’s are distorted similarly, probability any X is distorted is at most O(D^d) times the probability a fixed X is distorted

• So no X is distorted with probability at least 1 - exp(-c1 m) if m ≥ c0 d log D

Page 50:

Proof Sketch

• Show concentration holds for all matrices with same row and column space. (large deviations unlikely)

• Show that the distortion of a subspace of matrices by a linear map is robust to perturbations of the subspace. (maps have bounded norm)

• Provide an ε-net over the set of all subspaces of low-rank matrices (a Grassmann manifold). Show RIP holds at all points in the net with overwhelming probability and hence holds everywhere.

[Equation annotations: apply large deviations property at an ε-net; nearby subspaces have same distortion]

Page 51:

The trace-norm heuristic succeeds!

• If m > c0 r(k+n-r) log(kn), the heuristic succeeds for most A

• Number of measurements: c0 r(k+n-r) log(kn), where c0 is a constant, r(k+n-r) is the intrinsic dimension, and kn is the ambient dimension

• Approach: Show that a random A is nearly an isometry on the manifold of rank 5r matrices.

Recht, Fazel, and Parrilo. 2007.
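The scaling r(k+n-r) log(kn) is easy to tabulate. The sketch below omits the unknown constant c0, so only the growth in r, k, and n is meaningful, not the absolute numbers. The function names are illustrative, and the sizes are those of the 46x81, rank 5 test matrix used in the numerical experiments.

```python
import math

def intrinsic_dimension(k, n, r):
    """Dimension of the set of k x n matrices of rank r."""
    return r * (k + n - r)

def measurement_scaling(k, n, r):
    """Intrinsic dimension times log of ambient dimension (constant c0 omitted)."""
    return intrinsic_dimension(k, n, r) * math.log(k * n)

k, n, r = 46, 81, 5
d = intrinsic_dimension(k, n, r)      # 610
ambient = k * n                       # 3726
scaling = measurement_scaling(k, n, r)
```

At this small size the intrinsic dimension (610) is well below the ambient dimension (3726), but the log factor is not negligible, so the bound is most informative for large matrices.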

Page 52:

Numerical Experiments

• Test “image”

• Rank 5 matrix, 46x81 pixels

• Random Gaussian measurements

• Nuclear norm minimization via SDP (SeDuMi)
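A sketch of how such an experiment could be set up in NumPy. The random rank 5 matrix is a stand-in for the test image, which is not available here, and the number of measurements `m = 600` is a hypothetical choice. Solving the resulting nuclear norm problem would need an SDP or other convex solver, which is not shown.

```python
import numpy as np

def gaussian_measurements(X, m, rng):
    """Return (A, b): a random Gaussian map on vec(X) and its m measurements."""
    k, n = X.shape
    A = rng.standard_normal((m, k * n)) / np.sqrt(m)   # entries with variance 1/m
    return A, A @ X.reshape(-1)

rng = np.random.default_rng(0)
k, n, r = 46, 81, 5
X0 = rng.standard_normal((k, r)) @ rng.standard_normal((r, n))  # stand-in rank 5 "image"
A, b = gaussian_measurements(X0, m=600, rng=rng)
```

Each row of `A` acts on the flattened matrix, so the linear map A(X) of the slides becomes an ordinary matrix-vector product.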

Page 53:

Phase transition

Page 54:

Phase transition

[Figure: phase transition plot. Labels: measurements vs parameters (m/n^2); “normalized” dimension of the rank r matrices; r/n; model-size vs measurements]

Recht, Xu, and Hassibi, 2008

Page 55:

[Figure annotations: gradient descent on low-rank nuclear norm parameterization; mixture of hundreds of models, including nuclear norm]

Page 56:

Parsimonious Models

• Model: $x = \sum_{i=1}^{r} w_i a_i$, with weights $w_i$, atoms $a_i$, and rank r

• Search for best linear combination of fewest atoms

• “rank” = fewest atoms needed to describe the model

Page 57:

Other Directions

• Model: $x = \sum_{i=1}^{r} w_i a_i$, with weights $w_i$, atoms $a_i$, and rank r

• Random Features for Learning (Rahimi & Recht 07-08)
– Atomic norm on basis functions

• Dynamical Systems
– Atomic norm on filter banks

• Multivariate Tensors
– Applications in genetics and vision

• Jordan Algebras, Polynomial Varieties, nonlinear models, completely positive matrices, …

Page 58:

References

• “Some remarks on greedy algorithms.” Ron DeVore and Vladimir Temlyakov. Advances in Computational Mathematics, 5, pp. 173-187, 1996.

• “Decoding by Linear Programming.” Emmanuel Candès and Terence Tao. IEEE Transactions on Information Theory, 51(12), pp. 4203-4215, 2005.

• “Stable Signal Recovery from Incomplete and Inaccurate Measurements.” Emmanuel Candès, Justin Romberg, and Terence Tao. Communications on Pure and Applied Mathematics, 59(8), pp. 1207-1223, 2006.

• “A Simple Proof of the Restricted Isometry Property for Random Matrices.” R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin. Constructive Approximation, 28(3), pp. 253-263, 2008.

• “Guaranteed Minimum Rank Solutions to Linear Matrix Equations via Nuclear Norm Minimization.” Benjamin Recht, Maryam Fazel, and Pablo A. Parrilo. Submitted to SIAM Review. 2007.

• “Necessary and Sufficient Conditions for Success of the Nuclear Norm Heuristic for Rank Minimization.” Benjamin Recht, Weiyu Xu, and Babak Hassibi. Submitted to IEEE Transactions on Information Theory. 2008.

• More extensions on my website: http://www.ist.caltech.edu/~brecht/publications.html