CS 789 ADVANCED BIG DATA - EIGENVECTOR - PRINCIPAL COMPONENTS ANALYSIS Mingon Kang, Ph.D. Department of Computer Science, University of Nevada, Las Vegas * Some contents are adapted from Dr. Hung Huang and Dr. Vassilis Athitsos at UT Arlington
CS 7265 Big Data Analytics - eigenvector - Principal ...mkang.faculty.unlv.edu/teaching/CS789/12.Principal... · Eigenvector & eigenvalue Eigenvector is a non-zero vector that does

May 10, 2020

Eigenvector & eigenvalue

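An eigenvector of a linear transformation is a non-zero vector that stays on the same line through the origin when the transformation is applied to it: it is only scaled by a factor 𝜆, its eigenvalue.

𝐀𝐱 = 𝜆𝐱

where 𝐀 is a square matrix, 𝐱 is an eigenvector, and 𝜆 is the corresponding eigenvalue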

Eigenvector: Geometric exploration

http://www.sineofthetimes.org/eigenvectors-of-2-x-2-matrices-a-geometric-exploration/

Eigenvector & eigenvalue

𝐀 = [[0, 1], [1, 0]], 𝐱 = (1, 1)′

Then, 𝐀𝐱 = (1, 1)′ = 1 · 𝐱, so 𝜆 = 1

𝐀 = [[0, 1], [1, 0]], 𝐱 = (−1, 1)′

Then, 𝐀𝐱 = (1, −1)′ = −1 · 𝐱, so 𝜆 = −1

Sum of 𝜆 = trace(𝐀) = Σ 𝑎𝑖𝑖, and product of 𝜆 = det(𝐀)
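
The two examples on this slide can be checked numerically. The following is a minimal sketch in plain Python; the helper `matvec` and the variable names are illustrative assumptions, not part of the slides. It also compares the eigenvalue sum with the trace and the eigenvalue product with the determinant.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

A = [[0, 1],
     [1, 0]]

x_plus = [1, 1]     # candidate eigenvector with eigenvalue 1
x_minus = [-1, 1]   # candidate eigenvector with eigenvalue -1

Ax_plus = matvec(A, x_plus)
Ax_minus = matvec(A, x_minus)
print(Ax_plus)    # [1, 1], which is 1 * x_plus
print(Ax_minus)   # [1, -1], which is -1 * x_minus

eigenvalues = [1, -1]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(sum(eigenvalues) == trace)               # True: sum of eigenvalues is the trace
print(eigenvalues[0] * eigenvalues[1] == det)  # True: product of eigenvalues is the determinant
```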

Eigenvector & eigenvalue

How to solve it?

𝐀𝐱 = 𝜆𝐱 ⇒ 𝐀𝐱 − 𝜆𝐱 = 0 ⇒ (𝐀 − 𝜆𝐈)𝐱 = 0

𝐱 is non-zero, so 𝐀 − 𝜆𝐈 is a singular matrix ⇒ det(𝐀 − 𝜆𝐈) = 0

E.g.,

𝐀 = [[3, 1], [1, 3]]
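
For a 2 x 2 matrix, det(𝐀 − 𝜆𝐈) = 0 is the quadratic 𝜆² − trace(𝐀)𝜆 + det(𝐀) = 0, so the example can be solved in closed form. The sketch below assumes real eigenvalues and a non-zero off-diagonal entry; the function names `eigenvalues_2x2` and `eigenvector_2x2` are hypothetical helpers introduced only for illustration.

```python
import math

def eigenvalues_2x2(A):
    """Roots of det(A - lambda*I) = lambda^2 - trace*lambda + det = 0."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues; this sketch handles real ones only")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

def eigenvector_2x2(A, lam):
    """A non-zero solution of (A - lam*I) x = 0, assuming A[0][1] != 0."""
    (a, b), _ = A
    # The first row of (A - lam*I) x = 0 reads (a - lam)*x1 + b*x2 = 0.
    return [b, lam - a]

A = [[3.0, 1.0],
     [1.0, 3.0]]
lam_big, lam_small = eigenvalues_2x2(A)
v_big = eigenvector_2x2(A, lam_big)
v_small = eigenvector_2x2(A, lam_small)
print(lam_big, v_big)      # 4.0 [1.0, 1.0]
print(lam_small, v_small)  # 2.0 [1.0, -1.0]
```

Here (3 − 𝜆)² − 1 = 0 gives 𝜆 = 4 and 𝜆 = 2, matching the printed values.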

Principal Components Analysis

Reduce the dimensionality of a data set while preserving as much as possible of the variation present in the data set.

Transform to a new set of variables, the “Principal Components” (PCs)

Example

Projection to principal components

Data with two variables: the variables may be correlated with each other

Projection onto new axes (the principal components), which are uncorrelated

Projection on PCA

Dimensionality Reduction

If we have 100 points on the plot, how many numbers do we need to specify them?

Every point (x, y) is on a line 𝑦 = 𝑎𝑥 + 𝑏

𝑎, 𝑏, and the 100 x-coordinates of the points: 102 numbers instead of 200
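
The counting argument can be sketched in a few lines of Python. The slope, intercept, and random x-coordinates below are made-up values chosen only for illustration.

```python
import random

random.seed(0)
a, b = 2.0, -1.0                                   # hypothetical slope and intercept
xs = [random.uniform(0, 10) for _ in range(100)]   # 100 x-coordinates
points = [(x, a * x + b) for x in xs]              # full representation: 200 numbers

# Compact representation: a, b, and the 100 x-coordinates.
compact = [a, b] + xs
print(len(compact))  # 102

# Rebuild every point from the compact representation.
a2, b2, xs2 = compact[0], compact[1], compact[2:]
rebuilt = [(x, a2 * x + b2) for x in xs2]
print(rebuilt == points)  # True
```

When the points only lie near a line rather than exactly on it, the same idea gives an approximate representation.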

Dimensionality Reduction

Project all points onto a single line

If we find the line, we can approximately represent the data with lower dimensionality.

Vector Projection

Projection of vector 𝐚 onto 𝐛

Orthogonal projection of 𝐚 onto a straight line parallel to 𝐛.

𝐚𝟏 = a1 𝐛̂

a1 is a scalar and 𝐛̂ = 𝐛 / ‖𝐛‖ is the unit vector in the direction of 𝐛

a1 = ‖𝐚‖ cos 𝜃 = 𝐚 ∙ 𝐛̂

∙ is the dot product and 𝜃 is the angle between 𝐚 and 𝐛

https://en.wikipedia.org/wiki/Vector_projection
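
A minimal sketch of the projection in plain Python follows. The function name `project` and the example vectors are assumptions made for illustration.

```python
import math

def project(a, b):
    """Return (a1, a_proj): the scalar projection of a onto b and the projected vector."""
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    if norm_b == 0:
        raise ValueError("cannot project onto the zero vector")
    b_hat = [bi / norm_b for bi in b]               # unit vector along b
    a1 = sum(ai * bi for ai, bi in zip(a, b_hat))   # a . b_hat
    return a1, [a1 * bi for bi in b_hat]

a = [3.0, 4.0]
b = [1.0, 0.0]
a1, a_proj = project(a, b)
print(a1, a_proj)  # 3.0 [3.0, 0.0]

# The leftover part of a is orthogonal to b.
residual = [ai - pi for ai, pi in zip(a, a_proj)]
print(sum(ri * bi for ri, bi in zip(residual, b)))  # 0.0
```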

PCA

Given data 𝐱 = (x1, …, xp)′, a vector of p random variables

𝛂𝐤 is a vector of p constants

New projection: 𝛂𝐤 ∙ 𝐱 = 𝛂𝐤′ 𝐱

Derivation of PCA

How to get principal components

1. Find a linear function of 𝐱, 𝛂𝟏′ 𝐱, with maximum variance

2. Next find another linear function 𝛂𝟐′ 𝐱, uncorrelated with 𝛂𝟏′ 𝐱, with maximum variance

3. Repeat up to the k-th linear function 𝛂𝐤′ 𝐱, where k << p

Derivation of PCA

Find 𝛂𝐤′ 𝐱 with maximum variance

maximize Var(𝛂𝐤′ 𝐱) = 𝛂𝐤′ 𝚺𝛂𝐤, where 𝚺 is the covariance matrix of 𝐱

subject to 𝛂𝐤′ 𝛂𝐤 = 1 (unit length vector)

Use Lagrange multipliers

𝛂𝐤′ 𝚺𝛂𝐤 − 𝜆𝑘(𝛂𝐤′ 𝛂𝐤 − 1)

Derivation of PCA

𝑑/𝑑𝛂𝐤 (𝛂𝐤′ 𝚺𝛂𝐤 − 𝜆𝑘(𝛂𝐤′ 𝛂𝐤 − 1)) = 0

𝚺𝛂𝐤 − 𝜆𝑘𝛂𝐤 = 0

𝚺𝛂𝐤 = 𝜆𝑘𝛂𝐤

Eigenvector equation
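
One way to see this result numerically is to scan all unit vectors in two dimensions and look for the direction with the largest variance. The sketch below is a brute-force illustration, not the method used in the slides; the matrix `Sigma` is a hypothetical covariance matrix with eigenvalues 2 and 4.

```python
import math

Sigma = [[3.0, 1.0],
         [1.0, 3.0]]  # hypothetical covariance matrix (symmetric)

def apply(S, v):
    """S v for a 2x2 matrix S and a 2-D vector v."""
    return (S[0][0] * v[0] + S[0][1] * v[1],
            S[1][0] * v[0] + S[1][1] * v[1])

def variance_along(alpha, S):
    """Var(alpha' x) = alpha' S alpha."""
    s = apply(S, alpha)
    return alpha[0] * s[0] + alpha[1] * s[1]

# Every 2-D unit vector is (cos t, sin t), so alpha' alpha = 1 holds by construction.
best_var, best_alpha = -math.inf, None
for deg in range(180):
    t = math.radians(deg)
    alpha = (math.cos(t), math.sin(t))
    var = variance_along(alpha, Sigma)
    if var > best_var:
        best_var, best_alpha = var, alpha

print(best_var, best_alpha)  # about 4.0 along the direction (1, 1) / sqrt(2)

# At the maximum, Sigma alpha - best_var * alpha should be close to the zero vector.
s = apply(Sigma, best_alpha)
residual = (s[0] - best_var * best_alpha[0], s[1] - best_var * best_alpha[1])
print(residual)
```

The maximizing direction satisfies the eigenvector equation, and the maximum variance equals the corresponding eigenvalue.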

Derivation of PCA

We can obtain eigenvectors and eigenvalues from the equation.

Since Var(𝛂𝐤′ 𝐱) = 𝛂𝐤′ 𝚺𝛂𝐤 = 𝜆𝑘𝛂𝐤′ 𝛂𝐤 = 𝜆𝑘, choose 𝜆𝑘 to be as big as possible

𝜆1 is the largest eigenvalue of 𝚺 and 𝛂𝟏 is the corresponding eigenvector

𝛂𝟏′ 𝐱 is the first principal component of 𝐱
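
As a rough sketch of the first principal component on data, the code below builds a made-up two-variable data set, estimates its covariance matrix, and uses the closed-form largest eigenpair of a symmetric 2 x 2 matrix. The data, the helper names `covariance_matrix` and `largest_eigenpair`, and the choice of the sample covariance with divisor n − 1 are all assumptions for illustration.

```python
import math
import random

random.seed(1)
# Hypothetical 2-D data: the second variable is correlated with the first.
xs = [random.gauss(0, 2) for _ in range(500)]
data = [(x, 0.5 * x + random.gauss(0, 0.5)) for x in xs]

def covariance_matrix(data):
    """Sample mean and 2x2 sample covariance matrix (divisor n - 1)."""
    n = len(data)
    m0 = sum(p[0] for p in data) / n
    m1 = sum(p[1] for p in data) / n
    s00 = sum((p[0] - m0) ** 2 for p in data) / (n - 1)
    s11 = sum((p[1] - m1) ** 2 for p in data) / (n - 1)
    s01 = sum((p[0] - m0) * (p[1] - m1) for p in data) / (n - 1)
    return (m0, m1), [[s00, s01], [s01, s11]]

def largest_eigenpair(S):
    """Largest eigenvalue and unit eigenvector of a symmetric 2x2 matrix with S[0][1] != 0."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    lam = ((a + d) + math.sqrt((a - d) ** 2 + 4 * b * b)) / 2
    v = (b, lam - a)
    norm = math.hypot(*v)
    return lam, (v[0] / norm, v[1] / norm)

mean, Sigma = covariance_matrix(data)
lam1, alpha1 = largest_eigenpair(Sigma)

# Scores on the first principal component: alpha1' (x - mean).
scores = [alpha1[0] * (p[0] - mean[0]) + alpha1[1] * (p[1] - mean[1]) for p in data]
score_var = sum(s * s for s in scores) / (len(scores) - 1)
print(lam1, score_var)  # the two values agree up to rounding
```

The variance of the projected data equals the largest eigenvalue, which is the point of choosing the largest 𝜆.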

Derivation of PCA

Second principal component, 𝛂𝟐′ 𝐱, maximizes 𝛂𝟐′ 𝚺𝛂𝟐 subject to 𝛂𝟐′ 𝛂𝟐 = 1 and to being uncorrelated with 𝛂𝟏′ 𝐱

cov(𝛂𝟏′ 𝐱, 𝛂𝟐′ 𝐱) = 𝛂𝟏′ 𝚺𝛂𝟐 = 𝛂𝟐′ 𝚺𝛂𝟏 = 𝛂𝟐′ 𝜆1𝛂𝟏 = 𝜆1𝛂𝟐′ 𝛂𝟏 = 𝜆1𝛂𝟏′ 𝛂𝟐 = 0

Lagrangian again

𝛂𝟐′ 𝚺𝛂𝟐 − 𝜆2(𝛂𝟐′ 𝛂𝟐 − 1) − 𝜙𝛂𝟐′ 𝛂𝟏

Derivation of PCA

𝑑/𝑑𝛂𝟐 (𝛂𝟐′ 𝚺𝛂𝟐 − 𝜆2(𝛂𝟐′ 𝛂𝟐 − 1) − 𝜙𝛂𝟐′ 𝛂𝟏) = 0

𝚺𝛂𝟐 − 𝜆2𝛂𝟐 − 𝜙𝛂𝟏 = 0

Multiply on the left by 𝛂𝟏′:

𝛂𝟏′ 𝚺𝛂𝟐 − 𝜆2𝛂𝟏′ 𝛂𝟐 − 𝜙𝛂𝟏′ 𝛂𝟏 = 0

0 − 0 − 𝜙 · 1 = 0

Now, 𝜙 = 0

𝚺𝛂𝟐 − 𝜆2𝛂𝟐 = 0
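
The two conditions used in this step, 𝛂𝟏′ 𝛂𝟐 = 0 and 𝛂𝟏′ 𝚺𝛂𝟐 = 0, can be checked numerically for a small case. The sketch below uses a hypothetical 2 x 2 covariance matrix and a closed-form eigen-solver; `eigenpairs_sym2` and `quad` are illustrative helper names.

```python
import math

Sigma = [[3.0, 1.0],
         [1.0, 3.0]]  # hypothetical covariance matrix

def eigenpairs_sym2(S):
    """Eigenpairs of a symmetric 2x2 matrix with S[0][1] != 0, largest eigenvalue first."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    root = math.sqrt((a - d) ** 2 + 4 * b * b)
    pairs = []
    for lam in ((a + d + root) / 2, (a + d - root) / 2):
        v = (b, lam - a)                 # solves the first row of (S - lam*I) v = 0
        norm = math.hypot(*v)
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs

def quad(u, S, v):
    """u' S v for 2-D vectors u and v."""
    return sum(u[i] * S[i][j] * v[j] for i in range(2) for j in range(2))

(lam1, alpha1), (lam2, alpha2) = eigenpairs_sym2(Sigma)
identity = [[1.0, 0.0], [0.0, 1.0]]

dot12 = quad(alpha1, identity, alpha2)  # alpha1' alpha2
cov12 = quad(alpha1, Sigma, alpha2)     # cov(alpha1' x, alpha2' x)
print(lam1, lam2)      # 4.0 2.0
print(dot12, cov12)    # both approximately 0
```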

Derivation of PCA

This process can be repeated for k = 1…p, yielding up to p different eigenvectors of 𝚺 along with the corresponding eigenvalues
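
The repetition can be sketched for p > 2 with power iteration plus deflation. This is a swapped-in numerical technique chosen because it needs only the standard library; the slides do not prescribe an algorithm. The 3 x 3 matrix `Sigma` and the function names are hypothetical, and the sketch assumes a symmetric positive semi-definite matrix with distinct eigenvalues.

```python
import math
import random

def matvec(S, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]

def power_iteration(S, iterations=500, seed=0):
    """Approximate dominant eigenpair of a symmetric positive semi-definite matrix."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in S]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = matvec(S, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # S maps v to zero; nothing left to find
            break
        v = [x / norm for x in w]
    lam = sum(x * y for x, y in zip(v, matvec(S, v)))  # Rayleigh quotient v' S v
    return lam, v

def pca_by_deflation(S, k):
    """First k eigenpairs of S, largest eigenvalue first."""
    S = [row[:] for row in S]  # work on a copy
    pairs = []
    for _ in range(k):
        lam, v = power_iteration(S)
        pairs.append((lam, v))
        # Deflate: subtract lam * v v' so the next run converges to the next eigenvector.
        for i in range(len(S)):
            for j in range(len(S)):
                S[i][j] -= lam * v[i] * v[j]
    return pairs

Sigma = [[4.0, 2.0, 0.0],
         [2.0, 3.0, 0.0],
         [0.0, 0.0, 1.0]]  # hypothetical 3x3 covariance matrix
pairs = pca_by_deflation(Sigma, k=3)
for lam, v in pairs:
    print(round(lam, 4), [round(x, 4) for x in v])
```

Keeping only the first k pairs, with k << p, gives the reduced representation. In practice a library eigen-solver would normally replace this loop.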

Applications of PCA

Visualization of high-dimensional data

Applications of PCA

Eigenfaces

References

Principal Component Analysis by Frank Wood

http://www.stat.columbia.edu/~fwood/Teaching/w4315/Fall2009/pca.pdf

http://www.vision.jhu.edu/teaching/vision08/Handouts/case_study_pca1.pdf