Transcript
Page 1: Dimensionality Reduction

1

Dimensionality Reduction

Haiqin Yang

Page 2: Dimensionality Reduction

2

Outline

Dimensionality reduction vs. manifold learning
Principal Component Analysis (PCA)
Kernel PCA
Locally Linear Embedding (LLE)
Laplacian Eigenmaps (LEM)
Multidimensional Scaling (MDS)
Isomap
Semidefinite Embedding (SDE)
Unified Framework

Page 3: Dimensionality Reduction

3

Dimensionality Reduction vs. Manifold Learning

The two terms are often used interchangeably: both aim to represent data in a low-dimensional space.

Applications:
Data visualization
Preprocessing for supervised learning

Page 4: Dimensionality Reduction

4

Examples

Page 5: Dimensionality Reduction

5

Models

Linear methods:
Principal component analysis (PCA)
Multidimensional scaling (MDS)
Independent component analysis (ICA)

Nonlinear methods:
Kernel PCA
Locally linear embedding (LLE)
Laplacian eigenmaps (LEM)
Semidefinite embedding (SDE)

Page 6: Dimensionality Reduction

6

Principal Component Analysis (PCA)

History: Karl Pearson, 1901
Find projections that capture the largest amounts of variation in the data
Find the eigenvectors of the covariance matrix; these eigenvectors define the new space

[Figure: data in the (x1, x2) plane with the leading principal direction e]

Page 7: Dimensionality Reduction

7

PCA

Definition: Given a set of data points, find the principal axes, i.e., those orthonormal axes onto which the variance retained under projection is maximal.

[Figure: data plotted against Original Variable A and Original Variable B, with principal components PC 1 and PC 2 overlaid]

Page 8: Dimensionality Reduction

8

Formulation

Variance along the first projection direction: var(u1) = var(w^T X) = w^T S w, where S is the covariance matrix of X
Objective: maximize the retained variance, max_w w^T S w, subject to w^T w = 1
Solving procedure: construct the Lagrangian and set its partial derivative with respect to w to zero
Since w ≠ 0, w must be an eigenvector of S with eigenvalue λ1, the largest eigenvalue (see the derivation below)
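A brief reconstruction of that derivation (standard PCA algebra; the Lagrangian itself is not reproduced in the transcript):

```latex
\max_{w}\; w^{\top} S w \quad \text{s.t.}\quad w^{\top} w = 1,
\qquad
L(w,\lambda) \;=\; w^{\top} S w \;-\; \lambda\,\bigl(w^{\top} w - 1\bigr).

\frac{\partial L}{\partial w} \;=\; 2 S w - 2\lambda w \;=\; 0
\;\;\Longrightarrow\;\; S w = \lambda w,
\qquad
w^{\top} S w \;=\; \lambda
\;\;\Longrightarrow\;\; \lambda = \lambda_{1}\ \text{(largest eigenvalue) at the maximum.}
```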

Page 9: Dimensionality Reduction

9

PCA: Another Interpretation A rank-k linear approximation model

Fit the model with minimal reconstruction error

Optimal condition

Objective

can be expressed as SVD of X, TVUX

Page 10: Dimensionality Reduction

10

PCA: Algorithm
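The algorithm itself is not reproduced in the transcript. Below is a minimal NumPy sketch of the eigendecomposition-based PCA described on the preceding slides; the function name and interface are illustrative rather than taken from the slides.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X (n samples x d features) onto the top-k principal axes."""
    Xc = X - X.mean(axis=0)               # center the data
    S = np.cov(Xc, rowvar=False)          # d x d covariance matrix
    vals, vecs = np.linalg.eigh(S)        # eigendecomposition (ascending eigenvalues)
    order = np.argsort(vals)[::-1][:k]    # indices of the k largest eigenvalues
    W = vecs[:, order]                    # principal axes as columns
    return Xc @ W                         # k-dimensional representation
```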

Page 11: Dimensionality Reduction

11

Kernel PCA

History: S. Mika et al., NIPS, 1999
Data may lie on or near a nonlinear manifold, not a linear subspace
Find principal components that are nonlinearly related to the input space via a nonlinear mapping
Objective: perform PCA in the feature space induced by that mapping
Solution found by SVD: U contains the eigenvectors of the feature-space matrix Φ(X)Φ(X)^T, computed in practice through the kernel (Gram) matrix

Page 12: Dimensionality Reduction

12

Kernel PCA

Centering: the kernel matrix must be centered in feature space before the eigendecomposition
Issue: difficult to reconstruct points back in the original input space
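A minimal sketch of kernel PCA with Gram-matrix centering, assuming an RBF kernel (the kernel choice and the gamma parameter are illustrative; the slides do not fix a kernel):

```python
import numpy as np

def kernel_pca(X, k, gamma=1.0):
    """Kernel PCA sketch: build an RBF Gram matrix, center it, take top-k eigenvectors."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # RBF kernel matrix
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one                      # center in feature space
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:k]
    alphas = vecs[:, order] / np.sqrt(np.maximum(vals[order], 1e-12))  # unit-norm feature-space PCs
    return Kc @ alphas                                               # projections of the training points
```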

Page 13: Dimensionality Reduction

13

Locally Linear Embedding (LLE)

History: S. Roweis and L. Saul, Science, 2000
Procedure (see the sketch after this list):
1. Identify the neighbors of each data point
2. Compute the weights that best linearly reconstruct each point from its neighbors
3. Find the low-dimensional embedding vectors that are best reconstructed by the weights determined in Step 2
Constraint: the embedding Y is centered with unit variance
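A NumPy sketch of the three-step procedure above (the parameter names and the regularization term are illustrative assumptions, not from the slides):

```python
import numpy as np

def lle(X, n_components=2, n_neighbors=10, reg=1e-3):
    """Locally Linear Embedding sketch following Steps 1-3 above."""
    n = X.shape[0]
    # Step 1: identify the neighbors of each data point (Euclidean kNN)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nbrs = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    # Step 2: weights that best linearly reconstruct each point from its neighbors
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                          # neighbors centered on x_i
        C = Z @ Z.T                                    # local Gram matrix
        C += reg * np.trace(C) * np.eye(n_neighbors)   # regularize for stability
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()                    # weights sum to one
    # Step 3: embedding = bottom eigenvectors of M = (I - W)^T (I - W)
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]                 # skip the constant eigenvector
```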

Page 14: Dimensionality Reduction

14

LLE Example

Page 15: Dimensionality Reduction

15

Laplacian Eigenmaps (LEM)

History: M. Belkin and P. Niyogi, 2003
Similar in spirit to locally linear embedding, but differs in how the weights and the objective function are set
Weights: heat-kernel weights W_ij = exp(-||x_i - x_j||^2 / t) between neighboring points, zero otherwise
Objective: minimize sum_ij W_ij ||y_i - y_j||^2, which reduces to a generalized eigenproblem on the graph Laplacian
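A minimal sketch under those definitions, assuming heat-kernel weights on a k-nearest-neighbor graph (the neighborhood size and t are illustrative):

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, k, n_neighbors=10, t=1.0):
    """LEM sketch: heat-kernel kNN weights, then the generalized eigenproblem L y = lambda D y."""
    n = X.shape[0]
    D2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)   # squared distances
    idx = np.argsort(D2, axis=1)[:, 1:n_neighbors + 1]          # k nearest neighbors
    W = np.zeros((n, n))
    for i in range(n):
        W[i, idx[i]] = np.exp(-D2[i, idx[i]] / t)               # heat-kernel weights
    W = np.maximum(W, W.T)                                      # symmetrize the graph
    D = np.diag(W.sum(axis=1))                                  # degree matrix
    L = D - W                                                   # graph Laplacian
    vals, vecs = eigh(L, D)                                     # generalized eigenproblem
    return vecs[:, 1:k + 1]                                     # skip the constant eigenvector
```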

Page 16: Dimensionality Reduction

16

LEM Example

Page 17: Dimensionality Reduction

17

Multidimensional Scaling (MDS)

History: T. Cox and M. Cox, 2001
Attempts to preserve pairwise distances
A different formulation from PCA, but yields a similar form of result
Transformation: double-center the matrix of squared pairwise distances, then take its top eigenvectors (classical MDS)
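A sketch of classical MDS implementing that transformation (the function name is illustrative; the Isomap sketch below reuses it):

```python
import numpy as np

def classical_mds(D, k):
    """Classical MDS sketch: D is an n x n matrix of pairwise distances."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * H @ (D**2) @ H                    # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]           # top-k eigenvalues
    scale = np.sqrt(np.maximum(vals[order], 0.0))
    return vecs[:, order] * scale                # embedding coordinates
```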

Page 18: Dimensionality Reduction

18

MDS Example

Page 19: Dimensionality Reduction

19

Isomap

History: J. Tenenbaum et al., Science, 2000
A nonlinear generalization of classical MDS
Perform MDS not in the original space, but in the geodesic space of the manifold
Procedure, similar to LLE (see the sketch after this list):
1. Find the neighbors of each data point
2. Compute geodesic pairwise distances (e.g., shortest-path distances) between all points
3. Embed the data via MDS
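A sketch of those three steps, assuming a k-nearest-neighbor graph and reusing the classical_mds helper from the MDS sketch above; it assumes the neighborhood graph is connected:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def isomap(X, k, n_neighbors=10):
    """Isomap sketch: kNN graph -> geodesic (shortest-path) distances -> classical MDS."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    G = np.full((n, n), np.inf)                     # inf marks non-edges in the dense graph
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        G[i, idx[i]] = D[i, idx[i]]                 # keep only edges to the k nearest neighbors
    geo = shortest_path(G, method="D", directed=False)  # geodesic distances via Dijkstra
    return classical_mds(geo, k)
```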

Page 20: Dimensionality Reduction

20

Isomap Example

Page 21: Dimensionality Reduction

21

Semidefinite Embedding (SDE)

History: K. Weinberger and L. Saul, ICML, 2004
A variation of kernel PCA: the kernel matrix itself is learned by semidefinite programming
Criterion: preserve the distance between two points if they are neighbors of each other, or common neighbors of another point
Procedure: maximize the trace of the kernel matrix subject to the distance-preservation and centering constraints, then apply kernel PCA to the learned kernel (see the sketch below)
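An illustrative sketch using cvxpy (an assumption; the slides do not name a solver). For brevity it enforces only the direct-neighbor distance constraints, omitting the common-neighbor constraints, and it is practical only for small data sets:

```python
import numpy as np
import cvxpy as cp

def sde(X, k, n_neighbors=4):
    """SDE / maximum variance unfolding sketch: learn a kernel matrix by SDP, then kernel PCA."""
    n = X.shape[0]
    D2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    idx = np.argsort(D2, axis=1)[:, 1:n_neighbors + 1]

    K = cp.Variable((n, n), PSD=True)                 # the learned kernel (Gram) matrix
    constraints = [cp.sum(K) == 0]                    # centering constraint
    for i in range(n):
        for j in idx[i]:
            # preserve local distances: ||y_i - y_j||^2 = ||x_i - x_j||^2
            constraints.append(K[i, i] + K[j, j] - 2 * K[i, j] == D2[i, j])
    prob = cp.Problem(cp.Maximize(cp.trace(K)), constraints)
    prob.solve()                                      # requires an SDP-capable solver (e.g., SCS)

    vals, vecs = np.linalg.eigh(K.value)              # kernel PCA on the learned kernel
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))
```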

Page 22: Dimensionality Reduction

22

SDE Example

Page 23: Dimensionality Reduction

23

Unified Framework

All of the previous methods can be cast as kernel PCA
This is achieved by adopting a different kernel definition for each method

Page 24: Dimensionality Reduction

24

Summary

Seven dimensionality reduction methods
Unified framework: kernel PCA

Page 25: Dimensionality Reduction

25

Reference

Ali Ghodsi. Dimensionality Reduction: A Short Tutorial. 2006.