Neighbourhood relation preservation (NRP) A rank-based data visualisation quality assessment criterion Jigang Sun PhD studies finished in July 2011 PhD Supervisor: Colin Fyfe, Malcolm Crowe University of the West of Scotland

Jan 03, 2016


Camilla Nichols
Transcript
Page 1

Neighbourhood relation preservation (NRP)

A rank-based data visualisation quality assessment criterion

Jigang Sun

PhD studies finished in July 2011

PhD Supervisors: Colin Fyfe, Malcolm Crowe

University of the West of Scotland

Page 2

Outline

• Multidimensional Scaling (MDS);
• The need for a common quality measure for data visualisation;
• Local continuity meta-criterion (LCMC);
• Definition of neighbourhood relation preservation (NRP);
• Illustration of LCMC and NRP on mappings of data sets created by different methods.

Page 3

Multidimensional Scaling (MDS)

A group of information visualisation methods that project data points from a high-dimensional data space to a low-dimensional, typically two-dimensional, latent space in which the structure of the original data set can be identified by eye.

For example…

Page 4

By LeftSammon, using graph distances, k=20

Samples of high-dimensional data (each image has 28 × 28 = 784 dimensions)

Two-dimensional projection

Page 5

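Various methods

• Classical MDS (CMDS): the stress function to be minimised is defined to be

$$E_{CMDS} = \sum_{i=1}^{N}\sum_{j=i+1}^{N} (L_{ij} - D_{ij})^2 = \sum_{i=1}^{N}\sum_{j=i+1}^{N} E_{ij}^2$$

where

$L_{ij}$ is the Euclidean distance between data points $i$ and $j$ in latent space,

$D_{ij}$ is the Euclidean distance between data points $i$ and $j$ in data space,

$E_{ij} = |L_{ij} - D_{ij}|$ is the distance error.

• Sammon Mapping (1969):

$$E_{Sammon} = \frac{1}{\sum_{i=1}^{N}\sum_{j=i+1}^{N} D_{ij}} \sum_{i=1}^{N}\sum_{j=i+1}^{N} \frac{(L_{ij} - D_{ij})^2}{D_{ij}}$$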

• Each method has its own criterion
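As a rough illustration of how the two criteria differ, the sketch below evaluates the classical MDS stress and the Sammon stress for one fixed layout. The helper names (`pairwise_distances`, `cmds_stress`, `sammon_stress`) and the toy coordinates are assumptions made for this example, not part of the original slides.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Return {(i, j): Euclidean distance} for all pairs i < j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }

def cmds_stress(D, L):
    """Sum of squared distance errors (L_ij - D_ij)^2 over pairs i < j."""
    return sum((L[p] - D[p]) ** 2 for p in D)

def sammon_stress(D, L):
    """Squared errors weighted by 1/D_ij, normalised by the sum of all D_ij."""
    total = sum(D.values())
    return sum((L[p] - D[p]) ** 2 / D[p] for p in D) / total

# Hypothetical toy data: three 3-D points and a 2-D layout of them.
data = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
latent = [(0.0, 0.0), (5.0, 0.0), (0.0, 4.0)]

D = pairwise_distances(data)    # distances in data space
L = pairwise_distances(latent)  # distances in latent space
print("CMDS stress  :", cmds_stress(D, L))
print("Sammon stress:", sammon_stress(D, L))
```

The two numbers are on different scales, which is one reason a stress value from one method cannot be used directly to judge a mapping produced by another.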

Page 6

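Various methods

Instead of

$$E_{Sammon} = \frac{1}{\sum_{i=1}^{N}\sum_{j=i+1}^{N} D_{ij}} \sum_{i=1}^{N}\sum_{j=i+1}^{N} \frac{(L_{ij} - D_{ij})^2}{D_{ij}},$$

we use the base function $F(x) = 2x\ln x,\ x > 0$ to create the LeftSammon mapping:

$$E_{LSammon} = \sum_{i=1}^{N}\sum_{j=i+1}^{N} \left( 2L_{ij}\ln\frac{L_{ij}}{D_{ij}} - 2L_{ij} + 2D_{ij} \right)$$

$$= \sum_{i=1}^{N}\sum_{j=i+1}^{N} \left( \frac{(L_{ij} - D_{ij})^2}{D_{ij}} - \frac{(L_{ij} - D_{ij})^3}{3D_{ij}^2} + \frac{(L_{ij} - D_{ij})^4}{6D_{ij}^3} - \dots + \frac{2(-1)^n (n-2)!}{n!}\,\frac{(L_{ij} - D_{ij})^n}{D_{ij}^{n-1}} + \dots \right)$$

My insight:
1. The above can be performed very efficiently.
2. The higher-order Taylor series terms are better for analysis.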
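The relation between the Sammon term and the LeftSammon term can be checked numerically. The sketch below assumes the base function on this slide reads F(x) = 2x ln x and treats each pair's contribution as F(L) - F(D) - F'(D)(L - D); the function names and the sample distances are hypothetical.

```python
import math

def left_sammon_term(L, D):
    """Divergence of F(x) = 2 x ln x between L and D (both must be > 0)."""
    return 2.0 * (L * math.log(L / D) - L + D)

def taylor_terms(L, D, order):
    """Taylor terms n = 2..order of the divergence, expanded about D."""
    return [
        2.0 * (-1) ** n * math.factorial(n - 2) / math.factorial(n)
        * (L - D) ** n / D ** (n - 1)
        for n in range(2, order + 1)
    ]

D_ij, L_ij = 2.0, 2.2  # hypothetical distances for a single pair
terms = taylor_terms(L_ij, D_ij, 6)
print("second-order term (L - D)^2 / D:", terms[0])
print("series truncated at n = 6      :", sum(terms))
print("full divergence                :", left_sammon_term(L_ij, D_ij))
```

Under these assumptions the first term is the unnormalised Sammon summand, and the remaining terms are what the full divergence adds to it. The series only converges when |L - D| < D.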

Page 7

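Various methods

Use a new base function $F(x) = e^{\lambda x}$ $(\lambda \neq 0)$ to create

LeftExp mapping:

$$E_{LeftExp} = \sum_{i=1}^{N}\sum_{j=i+1}^{N} \left( e^{\lambda L_{ij}} - e^{\lambda D_{ij}} - \lambda e^{\lambda D_{ij}} (L_{ij} - D_{ij}) \right)$$

RightExp mapping:

$$E_{RightExp} = \sum_{i=1}^{N}\sum_{j=i+1}^{N} \left( e^{\lambda D_{ij}} - e^{\lambda L_{ij}} - \lambda e^{\lambda L_{ij}} (D_{ij} - L_{ij}) \right)$$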

• Each method has its own criterion.
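The left and right variants differ only in the order of the arguments, which a short sketch makes concrete. It assumes the base function is F(x) = exp(λx) and that each stress is a plain sum of F(a) - F(b) - F'(b)(a - b) over pairs, with no extra scaling factor; `bregman`, `exp_stress`, the `side` argument and the toy distances are all hypothetical names and values.

```python
import math

def bregman(F, dF, a, b):
    """Divergence d_F(a, b) = F(a) - F(b) - F'(b) * (a - b)."""
    return F(a) - F(b) - dF(b) * (a - b)

def exp_stress(D, L, lam, side="left"):
    """Sum the divergence over all pairs; D and L map (i, j) -> distance."""
    F = lambda x: math.exp(lam * x)
    dF = lambda x: lam * math.exp(lam * x)
    if side == "left":
        # LeftExp: latent distance first, expanded about the data distance.
        return sum(bregman(F, dF, L[p], D[p]) for p in D)
    # RightExp: data distance first, expanded about the latent distance.
    return sum(bregman(F, dF, D[p], L[p]) for p in D)

# Hypothetical distances for two pairs of points.
D = {(0, 1): 1.0, (0, 2): 2.0}
L = {(0, 1): 1.5, (0, 2): 1.5}
print("LeftExp :", exp_stress(D, L, lam=0.5, side="left"))
print("RightExp:", exp_stress(D, L, lam=0.5, side="right"))
```

Because the divergence is not symmetric in its arguments, the two calls return different values for the same layout.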

Page 8

Mappings of open box

Panels: by Sammon's mapping; by LeftSammon mapping

Mappings can be assessed by eye

Page 9

Mappings of open box

Panels: by CMDS; by Isomap; by LeftExp; by RightExp

Page 10

Sammon vs LeftSammon mapping

Panels: by Sammon's mapping; by LeftSammon mapping

• Assessing mapping quality by eye is usually difficult

Page 11

Local continuity meta-criterion (LCMC)

$N_k^{data}(i)$ stands for the set of the $k$ nearest neighbours of point $i$ in data space.

$N_k^{output}(i)$ stands for the set of the $k$ nearest neighbours of point $i$ in output space.

$$LCMC(k) = \frac{1}{Nk} \sum_{i=1}^{N} \left| N_k^{data}(i) \cap N_k^{output}(i) \right| - \frac{k}{N-1}$$

Problem: loose constraints
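A minimal sketch of LCMC follows, assuming the formula on this slide is the average overlap of the two k-nearest-neighbour sets divided by k, minus k/(N - 1). The function names, the tie-break by point index and the 1-D toy data are assumptions for illustration.

```python
import math

def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbours."""
    n = len(points)
    result = []
    for i in range(n):
        # Sort the other points by distance to i; ties are broken by index.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        result.append(set(others[:k]))
    return result

def lcmc(data, output, k):
    """Mean neighbourhood overlap per point and per k, minus k / (N - 1)."""
    n = len(data)
    overlap = sum(
        len(a & b) for a, b in zip(knn_sets(data, k), knn_sets(output, k))
    )
    return overlap / (n * k) - k / (n - 1)

# Hypothetical example: five points on a line and a scrambled layout of them.
data = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
scrambled = [(0.0,), (4.0,), (1.0,), (3.0,), (2.0,)]
print("perfect mapping  :", lcmc(data, data, k=2))
print("scrambled mapping:", lcmc(data, scrambled, k=2))
```

Only membership of the neighbour sets is compared. The order of the neighbours inside a set and their distances do not affect the score, which is one way to read the "loose constraints" remark.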

Page 12

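Rank based quality measures

$$R_{data}(i,j) = \left| \{k : D_{ik} < D_{ij}\} \cup \{k : D_{ik} = D_{ij},\ k < j\} \right|$$

$$R_{output}(i,j) = \left| \{k : L_{ik} < L_{ij}\} \cup \{k : L_{ik} = L_{ij},\ k < j\} \right|$$

• Traditional rank is used in trustworthiness and continuity (T&C)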

• Problem 1: change of intermediate points is not considered

• p is mapped perfectly since rank of p does not change

• Rank is discrete; distance is continuous
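The first problem can be reproduced with a few lines of code. The sketch assumes the rank of j with respect to i counts every point k that is strictly closer to i than j is, plus equally distant points with a smaller index. The function name `rank` and the distance rows are hypothetical.

```python
def rank(row, j):
    """Rank of point j given `row`, the distances from a reference point i.

    Counts the points k that are strictly closer than j, plus ties with k < j.
    The reference point itself (distance 0) is included in the count.
    """
    return sum(
        1
        for k in range(len(row))
        if row[k] < row[j] or (row[k] == row[j] and k < j)
    )

# Hypothetical distances from point 0 to points 0..3.
data_row = [0.0, 1.0, 2.0, 3.0]    # in data space
output_row = [0.0, 2.0, 1.0, 3.0]  # in output space: points 1 and 2 swapped

print("rank of point 3:", rank(data_row, 3), rank(output_row, 3))
print("rank of point 1:", rank(data_row, 1), rank(output_row, 1))
```

Point 3 keeps the same rank in both spaces even though the points lying between it and the reference point have changed order, so a measure based on this rank alone treats point 3 as perfectly mapped.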

Page 13

Rank based quality measures

Problem 2: angle constraint is not considered

Page 14

Neighbourhood relation preservation (NRP)

• Given … and that the difference between the angle piq in data space and in output space is less than …, we say that the neighbourhood relation of p over q with respect to i is preserved. We denote this as … = 1; otherwise … = 0.

• Φ(i,k) = …, t = 1.3

• NRP(k) = 1/N …

Page 15

Assessment to mappings of open box

Page 16

Mappings of MNIST digits

Panels: by CMDS; by Isomap; by LeftExp; by RightExp

Page 17

Assessment of mappings of digits

Page 18

Conclusions

• Multidimensional Scaling (MDS);
• List of objective functions of some MDS methods;
• The need for a common quality measure for data visualisation;
• Local continuity meta-criterion (LCMC);
• Definition of neighbourhood relation preservation (NRP);
• Comparison of LCMC and NRP on mappings of data sets created by different methods.

Thank you! Any questions?