ETC3250/5250: Dimension reduction
Semester 1, 2020
Professor Di Cook
Econometrics and Business Statistics Monash University
Week 4 (b)
PCA vs LDA
Discriminant space: the low-dimensional space in which the class means are furthest apart relative to the common variance-covariance.
The discriminant space is provided by the eigenvectors from an eigen-decomposition of $\Sigma^{-1}\Sigma_B$, where

$$\Sigma_B = \frac{1}{K}\sum_{i=1}^{K} (\mu_i - \mu)(\mu_i - \mu)' \quad\text{and}\quad \Sigma = \frac{1}{K}\sum_{k=1}^{K}\frac{1}{n_k}\sum_{i=1}^{n_k} (x_i - \mu_k)(x_i - \mu_k)'$$
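A minimal sketch of this computation on simulated two-class data (variable names and the simulated means are illustrative, not from the course materials):

```r
# Sketch: discriminant direction as the leading eigenvector of Sigma^{-1} Sigma_B.
# Assumes two classes with a common within-class variance-covariance.
library(MASS)  # for mvrnorm
set.seed(1)

Sigma_w <- matrix(c(1, 0.8, 0.8, 1), 2, 2)
x1 <- mvrnorm(50, mu = c(0, 0), Sigma = Sigma_w)
x2 <- mvrnorm(50, mu = c(2, 1), Sigma = Sigma_w)

mu1 <- colMeans(x1); mu2 <- colMeans(x2)
mu  <- (mu1 + mu2) / 2

# Between-class and pooled within-class variance-covariance (K = 2)
Sigma_B <- (tcrossprod(mu1 - mu) + tcrossprod(mu2 - mu)) / 2
Sigma   <- (cov(x1) + cov(x2)) / 2

# Discriminant space: eigenvectors of Sigma^{-1} Sigma_B
ev <- eigen(solve(Sigma) %*% Sigma_B)
ev$vectors[, 1]   # first (and here only non-trivial) discriminant direction
```

Since $\Sigma_B$ has rank $K - 1 = 1$ here, only the first eigenvalue is non-zero.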
Mahalanobis distance
For two $p$-dimensional vectors, Euclidean distance is

$$d(x, y) = \sqrt{(x - y)'(x - y)}$$

and Mahalanobis distance is

$$d(x, y) = \sqrt{(x - y)'\Sigma^{-1}(x - y)}$$

Which points are closest according to Euclidean distance? Which points are closest relative to the variance-covariance?
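A small numerical comparison (the vectors and $\Sigma$ are illustrative): with strong positive correlation, a displacement along the correlation direction is much closer in Mahalanobis distance than the same Euclidean displacement against it.

```r
# Sketch: Euclidean vs Mahalanobis distance under strong positive correlation.
x <- c(0, 0)
Sigma <- matrix(c(1, 0.9, 0.9, 1), 2, 2)

mahal <- function(a, b, S) sqrt(drop(t(a - b) %*% solve(S) %*% (a - b)))

y1 <- c(1, 1)    # along the correlation direction
y2 <- c(1, -1)   # against the correlation direction

sqrt(sum((x - y1)^2))   # Euclidean: both are sqrt(2)
sqrt(sum((x - y2)^2))
mahal(x, y1, Sigma)     # small: consistent with the covariance
mahal(x, y2, Sigma)     # large: unusual relative to the covariance
```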
Discriminant space
Both have the same means, but two different variance-covariance matrices. The discriminant space depends on the variance-covariance matrix.
Projection pursuit (PP) generalises PCA
PCA:

$$\underset{\phi_{11},\dots,\phi_{p1}}{\text{maximize}} \; \frac{1}{n}\sum_{i=1}^{n}\left(\sum_{j=1}^{p}\phi_{j1}x_{ij}\right)^2 \quad \text{subject to} \quad \sum_{j=1}^{p}\phi_{j1}^2 = 1$$

PP:

$$\underset{\phi_{11},\dots,\phi_{p1}}{\text{maximize}} \; f\left(\sum_{j=1}^{p}\phi_{j1}x_{ij}\right) \quad \text{subject to} \quad \sum_{j=1}^{p}\phi_{j1}^2 = 1$$
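The PCA criterion can be checked numerically: the first principal component direction attains the maximum projected variance over unit vectors, so no other unit direction does better. A sketch on simulated data (the data-generating matrix is illustrative):

```r
# Sketch: the first PC maximises the variance of the projection,
# i.e. it solves the PCA criterion above with f(z) = z^2 averaged over i.
set.seed(2)
X  <- matrix(rnorm(200), ncol = 2) %*% matrix(c(2, 1, 0, 0.5), 2, 2)
Xc <- scale(X, center = TRUE, scale = FALSE)

phi1 <- prcomp(X)$rotation[, 1]        # first PC direction (unit vector)
v_pc <- drop(var(Xc %*% phi1))         # variance along the first PC

# A random unit direction never exceeds it
u   <- rnorm(2); u <- u / sqrt(sum(u^2))
v_u <- drop(var(Xc %*% u))
v_pc >= v_u
```

Projection pursuit replaces the squared-projection criterion with another index $f$, e.g. one that rewards clustering or skewness in the projection.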
MDS
Multidimensional scaling (MDS) finds a low-dimensional layout of points that minimises the difference between distances computed in the p-dimensional space and those computed in the low-dimensional space.
$$\text{Stress}_D(x_1, \dots, x_N) = \left(\sum_{i,j=1;\, i\neq j}^{N} \left(d_{ij} - d_k(i,j)\right)^2\right)^{1/2}$$

where $D$ is an $N \times N$ matrix of distances $(d_{ij})$ between all pairs of points, and $d_k(i,j)$ is the distance between the points in the low-dimensional space.
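The stress of a layout can be computed directly from the two distance sets; a sketch on simulated data (the data are illustrative):

```r
# Sketch: stress of a 1D classical MDS layout of 10 points in 3D.
set.seed(3)
X <- matrix(rnorm(30), ncol = 3)     # 10 points in p = 3 dimensions
d_full <- dist(X)                    # d_ij: distances in the full space
layout <- cmdscale(d_full, k = 1)    # classical MDS into k = 1 dimension
d_low  <- dist(layout)               # d_k(i, j): distances in the layout

stress <- sqrt(sum((d_full - d_low)^2))
stress                               # smaller is a more faithful layout
```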
MDS
Classical MDS is the same as PCA.

Metric MDS incorporates power transformations on the distances, $d_{ij}^r$.

Non-metric MDS incorporates a monotonic transformation of the distances, e.g. rank.

```r
track <- read_csv("data/womens_track.csv")
track_mds <- cmdscale(dist(track[, 1:7])) %>%
  as_tibble() %>%
  mutate(country = track$country)
```
Challenge

For each of these distance matrices, find a layout in 1 or 2D that accurately reflects the full distances.
## # A tibble: 3 x 4
##   name      A     B     C
##   <chr> <dbl> <dbl> <dbl>
## 1 A       0.1   3.2   3.9
## 2 B       3.2  -0.1   5.1
## 3 C       3.9   5.1   0

## # A tibble: 4 x 5
##   name      A     B     C     D
##   <chr> <dbl> <dbl> <dbl> <dbl>
## 1 A       0.1   0.9   2.1   3
## 2 B       0.9   0     1.1   1.9
## 3 C       2.1   1.1   0.1   1.1
## 4 D       3     1.9   1.1  -0.1
Non-linear dimension reduction
t-distributed Stochastic Neighbor Embedding (t-SNE): similar to MDS, except emphasis is placed on grouping observations into clusters. Observations within a cluster are placed close together in the low-dimensional representation, but clusters themselves are placed far apart.
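A minimal sketch, assuming the Rtsne package is installed (the simulated three-cluster data are illustrative):

```r
# Sketch: t-SNE on toy clustered data via the Rtsne package.
library(Rtsne)
set.seed(4)

# Three well-separated clusters in 5 dimensions
X <- rbind(matrix(rnorm(150), ncol = 5),
           matrix(rnorm(150, mean = 5), ncol = 5),
           matrix(rnorm(150, mean = -5), ncol = 5))

# perplexity controls the effective neighbourhood size;
# it must satisfy 3 * perplexity < nrow(X) - 1
res <- Rtsne(X, dims = 2, perplexity = 10)
head(res$Y)   # the 2D embedding; one row per observation
```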
Non-linear dimension reduction
Local linear embedding (LLE): finds nearest neighbours of points, defines interpoint distances relative to neighbours, and preserves these proximities in the low-dimensional mapping. Optimisation is used to solve an eigen-decomposition of the knn distance construction.
Non-linear dimension reduction
Self-organising maps (SOM): first clusters the observations into $k \times k$ groups. Uses the mean of each group, laid out in a constrained 2D grid, to create a 2D projection.
Made by a human with a computer.
Slides at https://iml.numbat.space.
Code and data at https://github.com/numbats/iml.
Created using R Markdown with flair by xaringan, and kunoichi (female ninja) style.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.