Lecture 8: High Dimensionality Information Visualization CPSC 533C, Fall 2006 Tamara Munzner UBC Computer Science 5 October 2006

Lecture 8: High Dimensionality
Information Visualization
CPSC 533C, Fall 2006

Tamara Munzner

UBC Computer Science

5 October 2006

Readings Covered

Hyperdimensional Data Analysis Using Parallel Coordinates. Edward J. Wegman. Journal of the American Statistical Association, Vol. 85, No. 411 (Sep. 1990), pp. 664-675.

Fast Multidimensional Scaling through Sampling, Springs and Interpolation. Alistair Morrison, Greg Ross, Matthew Chalmers. Information Visualization 2(1), March 2003, pp. 68-77.

Cluster Stability and the Use of Noise in Interpretation of Clustering. George S. Davidson, Brian N. Wylie, Kevin W. Boyack. Proc. InfoVis 2001.

Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration Of High Dimensional Datasets. Jing Yang, Wei Peng, Matthew O. Ward and Elke A. Rundensteiner. Proc. InfoVis 2003.

Further Reading

Visualizing the non-visual: spatial analysis and interaction with information from text documents. James A. Wise et al. Proc. InfoVis 1995.

Hierarchical Parallel Coordinates for Visualizing Large Multivariate Data Sets. Ying-Huey Fua, Matthew O. Ward, and Elke A. Rundensteiner. IEEE Visualization ’99.

Parallel Coordinates: A Tool for Visualizing Multi-Dimensional Geometry. Alfred Inselberg and Bernard Dimsdale. IEEE Visualization ’90.

Parallel Coordinates

- only 2 orthogonal axes in the plane
- instead, use parallel axes!

[Hyperdimensional Data Analysis Using Parallel Coordinates. Edward J. Wegman. Journal of the American Statistical Association, 85(411), Sep 1990, p 664-675.]
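
A minimal sketch of the parallel-axes idea, not taken from Wegman's paper: each data point becomes a polyline with one vertex per axis. The function name, the unit axis spacing, and the per-axis rescaling to [0, 1] are assumptions made for illustration.

```python
def parallel_coords_polyline(point, mins, maxs, axis_spacing=1.0):
    """Return the polyline vertices (x, y) for one data point.

    Axis i is the vertical line x = i * axis_spacing; each value is
    rescaled to [0, 1] along its own axis (an assumed convention).
    """
    vertices = []
    for i, (value, lo, hi) in enumerate(zip(point, mins, maxs)):
        # Guard against a constant dimension (hi == lo).
        y = 0.5 if hi == lo else (value - lo) / (hi - lo)
        vertices.append((i * axis_spacing, y))
    return vertices


# Two 4-dimensional points; the per-axis ranges come from the data itself.
data = [(1.0, 10.0, 5.0, 0.2), (3.0, 20.0, 1.0, 0.8)]
mins = [min(col) for col in zip(*data)]
maxs = [max(col) for col in zip(*data)]
lines = [parallel_coords_polyline(p, mins, maxs) for p in data]
```

Each entry of `lines` can be handed to any line-drawing routine; adding a dimension adds one more axis rather than requiring another orthogonal direction.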

PC: Correlation

[Hyperdimensional Data Analysis Using Parallel Coordinates. Edward J. Wegman. Journal of the American Statistical Association, 85(411), Sep 1990, p 664-675.]

PC: Duality

- rotate-translate
- point-line
- pencil: set of lines coincident at one point

[Parallel Coordinates: A Tool for Visualizing Multi-Dimensional Geometry. Alfred Inselberg and Bernard Dimsdale, IEEE Visualization ’90.]
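
The point-line duality can be checked numerically. This is a sketch under stated assumptions: two adjacent axes sit at horizontal positions 0 and d, and the function names are hypothetical. Every point on the Cartesian line x2 = m*x1 + b becomes a segment between the axes, and all of those segments pass through the single point (d/(1-m), b/(1-m)).

```python
def line_to_dual_point(m, b, d=1.0):
    """Dual point of the Cartesian line x2 = m*x1 + b.

    Assumes the x1 axis is drawn at horizontal position 0 and the x2 axis
    at position d. Slope 1 has no finite dual point, because the segments
    are then parallel to each other.
    """
    if m == 1:
        raise ValueError("slope 1 maps to a point at infinity")
    return (d / (1.0 - m), b / (1.0 - m))


def segment_height_at(x1, x2, t, d=1.0):
    """Height at horizontal position t of the segment from (0, x1) to (d, x2)."""
    return x1 + (x2 - x1) * t / d


# Two different points on the line x2 = -x1 + 2 ...
t, y = line_to_dual_point(-1.0, 2.0)
h_a = segment_height_at(0.0, 2.0, t)   # the point (0, 2)
h_b = segment_height_at(3.0, -1.0, t)  # the point (3, -1)
# ... both segments cross at the dual point.
```

A negative slope puts the dual point between the two axes (0 < t < d), which is one way to read the crossing pattern that negatively correlated dimensions produce.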

PC: Axis Ordering

- geometric interpretations
  - hyperplane, hypersphere
  - points do have intrinsic order
- infovis
  - no intrinsic order, what to do?
  - indeterminate/arbitrary order
    - weakness of many techniques
    - downside: human-powered search
    - upside: powerful interaction technique
  - most implementations
    - user can interactively swap axes
  - Automated Multidimensional Detective
    - Inselberg 99
    - machine learning approach

Hierarchical Parallel Coords: LOD

[Hierarchical Parallel Coordinates for Visualizing Large Multivariate Data Sets. Fua, Ward, and Rundensteiner, IEEE Visualization 99.]

Hierarchical Clustering

- proximity-based coloring
- interaction lecture later:
  - structure-based brushing
  - extent scaling

[Hierarchical Parallel Coordinates for Visualizing Large Multivariate Data Sets. Fua, Ward, and Rundensteiner, IEEE Visualization 99.]

Dimensionality Reduction

- mapping multidimensional space into space of fewer dimensions
  - typically 2D for infovis
  - keep/explain as much variance as possible
  - show underlying dataset structure
- multidimensional scaling (MDS)
  - minimize differences between interpoint distances in high and low dimensions
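
One way to make "minimize differences between interpoint distances" concrete is a stress function. The sketch below uses raw stress (sum of squared distance differences) with Euclidean distance; both choices are assumptions, since the slide does not name a specific objective. It needs Python 3.8+ for `math.dist`.

```python
import math


def raw_stress(high, low):
    """Sum over point pairs of (high-D distance - low-D distance)^2.

    `high` and `low` are parallel lists: low[i] is the low-dimensional
    position assigned to high[i]. Zero means every interpoint distance
    is preserved exactly.
    """
    total = 0.0
    n = len(high)
    for i in range(n):
        for j in range(i + 1, n):
            diff = math.dist(high[i], high[j]) - math.dist(low[i], low[j])
            total += diff * diff
    return total


high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]            # distance 5 in 3D
perfect = raw_stress(high, [(0.0, 0.0), (5.0, 0.0)])  # distance 5 in 2D
too_close = raw_stress(high, [(0.0, 0.0), (4.0, 0.0)])  # distance 4 in 2D
```

An MDS layout algorithm searches for the low-dimensional positions that make a quantity like this as small as possible.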

Dimensionality Reduction: Isomap

- 4096 D: pixels in image
- 2D: wrist rotation, fingers extension

[A Global Geometric Framework for Nonlinear Dimensionality Reduction. J. B. Tenenbaum, V. de Silva, and J. C. Langford. Science 290(5500), pp 2319–2323, Dec 22 2000]

Naive Spring Model

- repeat for all points
  - compute spring force to all other points
  - difference between high dim, low dim distance
  - move to better location using computed forces
- compute distances between all points
  - O(n^2) iteration, O(n^3) algorithm
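
A sketch of one iteration of the naive spring model, assuming a 2D layout, Euclidean distances, and a fixed step size (all illustrative choices, not details from the slide). The nested loop over all pairs is what makes a single iteration O(n^2). Requires Python 3.8+ for `math.dist`.

```python
import math


def spring_iteration(high, low, step=0.1):
    """One O(n^2) pass: move every 2D point along its net spring force."""
    n = len(high)
    new_low = []
    for i in range(n):
        fx = fy = 0.0
        for j in range(n):
            if i == j:
                continue
            d_low = math.dist(low[i], low[j])
            if d_low == 0.0:
                continue  # coincident points give no direction to push along
            d_high = math.dist(high[i], high[j])
            # Positive when the pair is too close in 2D (push apart),
            # negative when it is too far apart (pull together).
            scale = (d_high - d_low) / d_low
            fx += scale * (low[i][0] - low[j][0])
            fy += scale * (low[i][1] - low[j][1])
        new_low.append((low[i][0] + step * fx, low[i][1] + step * fy))
    return new_low


high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]  # 5 apart in 3D
low = [(0.0, 0.0), (1.0, 0.0)]             # only 1 apart in 2D
new_low = spring_iteration(high, low)      # the pair is pushed apart
```

Repeating the pass until the layout settles takes on the order of n iterations in practice, which is where the O(n^3) total on the slide comes from.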

Faster Spring Model [Chalmers 96]

- compare distances only with a few points
  - maintain small local neighborhood set
  - each time pick some randoms, swap in if closer
- small constant: 6 locals, 3 randoms typical
  - O(n) iteration, O(n^2) algorithm
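
The neighbor-set idea can be sketched as follows. This is an illustrative reading of the bullets above, not Chalmers' published code: the function names, the step size, and the rule "keep the 6 closest of neighbors plus randoms" are assumptions. Each point only touches a constant number of others, so one iteration is O(n). Requires Python 3.8+ for `math.dist`.

```python
import math
import random


def draw_randoms(i, n, neighbors, rng, n_randoms=3):
    """Pick up to n_randoms distinct indices that are neither i nor neighbors."""
    excluded = set(neighbors)
    excluded.add(i)
    want = min(n_randoms, n - len(excluded))
    samples = set()
    while len(samples) < want:
        j = rng.randrange(n)
        if j not in excluded:
            samples.add(j)
    return sorted(samples)


def refresh_neighbors(i, high, neighbors, samples, n_locals=6):
    """Swap in any random sample that is closer in high-D than a neighbor."""
    pool = list(neighbors) + list(samples)
    pool.sort(key=lambda j: math.dist(high[i], high[j]))
    return pool[:n_locals]


def chalmers_iteration(high, low, neighbor_sets, rng, step=0.02):
    """One O(n) pass: springs only to the neighbor set plus fresh randoms."""
    n = len(high)
    new_low = []
    for i in range(n):
        samples = draw_randoms(i, n, neighbor_sets[i], rng)
        fx = fy = 0.0
        for j in list(neighbor_sets[i]) + samples:
            d_low = math.dist(low[i], low[j])
            if d_low == 0.0:
                continue
            scale = (math.dist(high[i], high[j]) - d_low) / d_low
            fx += scale * (low[i][0] - low[j][0])
            fy += scale * (low[i][1] - low[j][1])
        new_low.append((low[i][0] + step * fx, low[i][1] + step * fy))
        neighbor_sets[i] = refresh_neighbors(i, high, neighbor_sets[i], samples)
    return new_low


rng = random.Random(0)
high = [(float(k), 0.0, 0.0) for k in range(20)]
low = [(rng.random(), rng.random()) for _ in high]
neighbor_sets = [[] for _ in high]
for _ in range(50):
    low = chalmers_iteration(high, low, neighbor_sets, rng)
```

The neighbor sets start empty and fill up from the random samples, so early iterations behave like pure random sampling and later ones are dominated by close points.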

Parent Finding [Morrison 02, 03]

- lay out a √n subset with [Chalmers 96]
- for all remaining points
  - find "parent": laid-out point closest in high D
  - place point close to this parent
- O(n^(5/4)) algorithm
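
A sketch of the placement step only, assuming the sample has already been laid out. The parent search here is a brute-force scan over the sample, which costs O(n√n) overall for a √n sample; the O(n^(5/4)) bound on the slide needs a faster parent search that is not reproduced here. The function name and the random jitter used for "close to this parent" are assumptions. Requires Python 3.8+ for `math.dist`.

```python
import math
import random


def place_by_parent(high, sample_idx, sample_low, rng, jitter=0.01):
    """Place each non-sample point next to its nearest sample point in high-D.

    Returns (low, parents): low maps every index to a 2D position, and
    parents maps each non-sample index to the sample index it was attached to.
    """
    low = dict(zip(sample_idx, sample_low))
    sample_set = set(sample_idx)
    parents = {}
    for i in range(len(high)):
        if i in sample_set:
            continue
        # Brute-force parent search over the laid-out sample.
        parent = min(sample_idx, key=lambda s: math.dist(high[i], high[s]))
        px, py = low[parent]
        low[i] = (px + rng.uniform(-jitter, jitter),
                  py + rng.uniform(-jitter, jitter))
        parents[i] = parent
    return low, parents


# Two well-separated clumps in 3D; points 0 and 5 play the laid-out sample.
high = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0),
        (0.0, 0.0, 0.1), (10.0, 10.0, 10.0), (10.1, 10.0, 10.0),
        (10.0, 10.1, 10.0)]
low, parents = place_by_parent(high, [0, 5], [(0.0, 0.0), (1.0, 0.0)],
                               random.Random(0))
```

In the full hybrid method the placed points would then be refined with a few spring iterations rather than left at the jittered parent position.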

Issues

- which distance metric: Euclidean or other?
- computation
  - naive: O(n^3)
  - better: O(n^2) Chalmers 96
  - hybrid: O(n√n)

True Dimensionality: Linear

- how many dimensions is enough?
  - could be more than 2 or 3
  - knee in error curve
- example
  - measured materials from graphics
  - linear PCA: 25
  - get physically impossible intermediate points

[A Data-Driven Reflectance Model, SIGGRAPH 2003, W. Matusik, H. Pfister, M. Brand, and L. McMillan, graphics.lcs.mit.edu/~wojciech/pubs/sig2003.pdf]

True Dimensionality: Nonlinear

- nonlinear MDS: 10-15
  - all intermediate points possible
- categorizable by people
  - red, green, blue, specular, diffuse, glossy, metallic, plastic-y, roughness, rubbery, greasiness, dustiness...

[A Data-Driven Reflectance Model, SIGGRAPH 2003, W. Matusik, H. Pfister, M. Brand, and L. McMillan, graphics.lcs.mit.edu/~wojciech/pubs/sig2003.pdf]

MDS Beyond Points

- galaxies: aggregation
- themescapes: terrain/landscapes

[www.pnl.gov/infoviz/graphics.html]

Cluster Stability

- display
  - also terrain metaphor
- underlying computation
  - energy minimization (springs) vs. MDS
  - weighted edges
- do same clusters form with different random start points?
- "ordination"
  - spatial layout of graph nodes

Approach

- normalize within each column
- similarity metric
  - discussion: Pearson’s correlation coefficient
- threshold value for marking as similar
  - discussion: finding critical value
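
The three steps can be sketched as below. The slide does not say which normalization is used, so z-scoring each column is an assumption, as are the function names and the example threshold. Rows are the items being compared; pairs whose Pearson correlation reaches the threshold become weighted edges.

```python
import math


def normalize_columns(rows):
    """Z-score each column (assumed form of 'normalize within each column')."""
    normed_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        normed_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*normed_cols)]


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant row has no defined correlation
    return cov / (sx * sy)


def similarity_edges(rows, threshold):
    """Weighted edges (i, j, r) for every row pair with r >= threshold."""
    edges = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            r = pearson(rows[i], rows[j])
            if r >= threshold:
                edges.append((i, j, r))
    return edges


rows = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
edges = similarity_edges(rows, threshold=0.9)  # only rows 0 and 1 correlate
```

The resulting edge list is the weighted graph that a layout step would then place; choosing the threshold is the "critical value" question raised on the slide.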

Graph Layout

- criteria
  - geometric distance matching graph-theoretic distance
    - vertices one hop away close
    - vertices many hops away far
  - insensitive to random starting positions
    - major problem with previous work!
  - tractable computation
- force-directed placement
  - discussion: energy minimization
    - others: gradient descent, etc
  - discussion: termination criteria
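
The first criterion can be measured directly. In this sketch, graph-theoretic distance is the hop count from a breadth-first search, and the mismatch score (squared difference between hop count and Euclidean distance, summed over vertex pairs) is an illustrative assumption rather than the energy function used in the paper. The graph is assumed undirected and unweighted. Requires Python 3.8+ for `math.dist`.

```python
import math
from collections import deque


def hop_distances(adjacency, source):
    """Hop count from source to every reachable vertex (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def layout_mismatch(adjacency, positions):
    """Sum over vertex pairs of (geometric distance - hop count)^2."""
    total = 0.0
    for a in adjacency:
        for b, hops in hop_distances(adjacency, a).items():
            gap = math.dist(positions[a], positions[b]) - hops
            total += gap * gap
    return total / 2  # each unordered pair was visited twice


# A path graph a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
in_order = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0), "d": (3.0, 0.0)}
scrambled = {"a": (0.0, 0.0), "b": (3.0, 0.0), "c": (1.0, 0.0), "d": (2.0, 0.0)}
good = layout_mismatch(path, in_order)
bad = layout_mismatch(path, scrambled)
```

A layout that keeps one-hop neighbors close and many-hop vertices far scores low; the scrambled layout scores higher because adjacent vertices end up far apart.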

Barrier Jumping

- same idea as simulated annealing
  - but compute directly
  - just ignore repulsion for fraction of vertices
- solves start position sensitivity problem

Results

- efficiency
  - naive approach: O(V^2)
  - approximate density field: O(V)
- good stability
  - rotation/reflection can occur

[Figure panels: different random start; adding noise]

Critique

- real data
  - suggest check against subsequent publication!
- give criteria, then discuss why solution fits
- visual + numerical results
  - convincing images plus benchmark graphs
- detailed discussion of alternatives at each stage
- specific prescriptive advice in conclusion

Dimension Ordering

- in NP, like most interesting infovis problems
  - heuristic
- divide and conquer
  - iterative hierarchical clustering
  - representative dimensions
- choices
  - similarity metrics
  - importance metrics
    - variance
  - ordering algorithms
    - optimal
    - random swap
    - simple depth-first traversal
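
A sketch of the random-swap heuristic for dimension ordering, assuming the goal is to put similar dimensions on adjacent axes. The cost function (sum of dissimilarities between neighbors in the order), the hill-climbing acceptance rule, and the names are illustrative assumptions; the paper's own formulation may differ. At least two dimensions are required.

```python
import random


def ordering_cost(order, dissimilarity):
    """Total dissimilarity between dimensions that are adjacent in the order."""
    return sum(dissimilarity[a][b] for a, b in zip(order, order[1:]))


def random_swap_order(dissimilarity, iterations=1000, seed=0):
    """Hill climbing: swap two random dimensions, keep the swap if it helps."""
    rng = random.Random(seed)
    order = list(range(len(dissimilarity)))
    best = ordering_cost(order, dissimilarity)
    for _ in range(iterations):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        cost = ordering_cost(order, dissimilarity)
        if cost < best:
            best = cost
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return order, best


# Four dimensions summarized by one number each; dissimilarity is |difference|.
summary = [0.0, 10.0, 1.0, 11.0]
dissim = [[abs(a - b) for b in summary] for a in summary]
start_cost = ordering_cost([0, 1, 2, 3], dissim)  # 10 + 9 + 10
order, cost = random_swap_order(dissim)
```

Because only improving swaps are kept, the result can be a local optimum rather than the best ordering, which is the usual trade-off against the exhaustive "optimal" search listed above.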

Spacing, Filtering

- same idea: automatic support
- interaction
  - manual intervention
  - structure-based brushing
  - focus+context, next week

Results: InterRing

- raw, order, distort, rollup (filter)

[Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration Of High Dimensional Datasets. Yang, Peng, Ward, and Rundensteiner. Proc. InfoVis 2003]

Results: Parallel Coordinates

- raw, order/space, zoom, filter

[Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration Of High Dimensional Datasets. Yang, Peng, Ward, and Rundensteiner. Proc. InfoVis 2003]

Results: Star Glyphs

- raw, order/space, distort, filter

[Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration Of High Dimensional Datasets. Yang, Peng, Ward, and Rundensteiner. Proc. InfoVis 2003]

Results: Scatterplot Matrices

- raw, filter

[Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration Of High Dimensional Datasets. Yang, Peng, Ward, and Rundensteiner. Proc. InfoVis 2003]

Critique

- pro
  - approach on multiple techniques
  - real data!
- con
  - always show order then space then filter
    - hard to tell which is effective
    - show ordered vs. unordered after zoom/filter?

Software, Data Resources

www.cs.ubc.ca/~tmm/courses/infovis/resources.html