Data Visualization - An introduction Prof Jan Aerts Biodata Visualization and Analysis ESAT/SCD University of Leuven Belgium twitter: @jandot Google+: +Jan Aerts [email protected] http://biovizanlab.wordpress.com http://saaientist.blogspot.com

Intro to data visualization

Jan 19, 2015


Jan Aerts

Slides used in capita selecta HCI course H05N2A
Page 1: Intro to data visualization

Data Visualization - An introduction

Prof Jan Aerts
Biodata Visualization and Analysis
ESAT/SCD
University of Leuven
Belgium

twitter: @jandot
Google+: +Jan Aerts
[email protected]
http://biovizanlab.wordpress.com
http://saaientist.blogspot.com

Page 2: Intro to data visualization

1. What is data visualization?

Page 3: Intro to data visualization

“A good sketch is better than a long speech” (Napoleon)

Page 4: Intro to data visualization

“A good sketch is better than a long speech” (Napoleon)

shows: size of the army, geographical coordinates, direction that the army was traveling, location of the army with respect to certain dates, temperature along the path of the retreat

Page 5: Intro to data visualization

John Snow - cholera map

Page 6: Intro to data visualization

Shape of Songs: “Like a Prayer” (Madonna)
Martin Wattenberg

Page 8: Intro to data visualization
Page 9: Intro to data visualization

What I use as a definition:

“computer-based visualization systems providing visual representations of datasets intended to help people carry out some task more effectively.” (T Munzner)

Page 10: Intro to data visualization
Page 11: Intro to data visualization

cognition <=> perception
cognitive task => perceptive task

“eyes beat memory”

Page 12: Intro to data visualization

Why do we visualize data?

• record information

• blueprints, photographs, seismographs, ...

• analyze data to support reasoning

• develop & assess hypotheses

• discover errors in data

• expand memory

• find patterns (see Snow’s cholera map)

• communicate information

• share & persuade

• collaborate & revise

Page 13: Intro to data visualization

pictorial superiority effect

“information”

after 72 hr: “informa” (65%) vs “i” (1%)

exploration explanation

Page 14: Intro to data visualization

2. Exploration <-> explanation

Page 15: Intro to data visualization

exploration explanation

Page 16: Intro to data visualization

exploration explanation

visual analytics infographics

Page 17: Intro to data visualization

exploration explanation

visual analytics infographics

Page 18: Intro to data visualization

exploration explanation

visual analytics infographics

hypothesis generation

Page 19: Intro to data visualization

exploration explanation

“visual analytics”

=> identify unexpected patterns

Page 20: Intro to data visualization

J van Wijk

exploration explanation

Page 21: Intro to data visualization

Anscombe’s quartet

• μX = 9.0

• μY = 7.5

• σX = 3.317

• σY = 2.03

• Y = 3 + 0.5X

• R² = 0.67
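The summary statistics on this slide can be checked with a short sketch. It assumes the standard published values of Anscombe's quartet (typed in here, not taken from the slides) and uses a hypothetical helper, `summarize`, that computes the sample mean, sample standard deviation, least-squares line, and R² by hand.

```python
from math import sqrt

# Anscombe's quartet: four datasets of 11 (x, y) pairs each.
# Values are the commonly published ones, not taken from the slides.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample standard deviations, fitted line and R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sd_x": sqrt(sxx / (n - 1)),
        "sd_y": sqrt(syy / (n - 1)),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r2": sxy ** 2 / (sxx * syy),
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

All four datasets print nearly identical summaries, which is the point of the example: the differences only show up once the points are plotted.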

Page 22: Intro to data visualization
Page 23: Intro to data visualization
Page 24: Intro to data visualization

A concrete example: hive plots

Page 25: Intro to data visualization

Martin Krzywinski

same network

Page 26: Intro to data visualization

Martin Krzywinski

different networks!

Page 27: Intro to data visualization

3D, anyone?

Page 28: Intro to data visualization

3D, anyone?

occlusion

interaction complexity

perspective distortion

text legibility

Page 29: Intro to data visualization

Gene interaction data: “gene A regulates gene B”

Functions in the Linux operating system: “function A calls function B”

Page 30: Intro to data visualization

regulator

manager

workhorse
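A minimal sketch of how the three labels could be assigned, assuming (this is not spelled out on the slide) that they describe a node's role in a directed network: a "regulator" only has outgoing edges, a "workhorse" only incoming edges, and a "manager" has both. The function name `classify_nodes` and the example edges are hypothetical.

```python
def classify_nodes(edges):
    """Assign each node a role from its position in a directed edge list.

    edges: iterable of (source, target) pairs, e.g. "gene A regulates gene B"
    or "function A calls function B".
    """
    sources = {source for source, _ in edges}
    targets = {target for _, target in edges}
    roles = {}
    for node in sources | targets:
        if node in sources and node in targets:
            roles[node] = "manager"    # both regulates and is regulated
        elif node in sources:
            roles[node] = "regulator"  # only outgoing edges
        else:
            roles[node] = "workhorse"  # only incoming edges
    return roles


# Hypothetical toy network: A -> B, B -> C, A -> C
example_edges = [("A", "B"), ("B", "C"), ("A", "C")]
print(classify_nodes(example_edges))
```

In a hive plot each role would then map to one axis, so the same rule lays out a gene network and a call graph in a directly comparable way.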

Page 31: Intro to data visualization

3. Why specifically learn about dataviz?

Page 32: Intro to data visualization

Isn’t it all just about using common sense?

Page 33: Intro to data visualization

• huge space of design alternatives => many tradeoffs

• many possibilities known to be ineffective

• avoid random walk through parameter space

• avoid some of our past mistakes

• extensive experimentation has already been done

• guidelines continue to evolve

• we reflect on lessons learned in design studies

• iterative refinement usually wise

Page 34: Intro to data visualization

4. Stages of data visualization

Page 35: Intro to data visualization

How do we get from data to visualization? We need to understand:

• properties of the data

• properties of the image

• the rules mapping data to image

Page 36: Intro to data visualization

4.1. Properties of the data

Page 37: Intro to data visualization

S Stevens “On the Theory of Scales of Measurement” (1946)

Page 38: Intro to data visualization

4.2. Properties of the image - perception

Page 39: Intro to data visualization

Semiology of graphics

• Jacques Bertin, Gauthier-Villars 1967, EHESS 1998

• semiology = study of signs and sign processes, likeness, analogy, metaphor, symbolism, signification, and communication (Wikipedia)

• visual encoding:

• what - points, lines, areas (, patterns, trees/networks, grids)

• where - positional: XY (1D, 2D, 3D)

• how - retinal: Z (size, lightness, texture, colour, orientation, shape)

• when - temporal: animation

Page 40: Intro to data visualization

“marks” - geometric primitives

“channels” - control appearance of marks

H

V

S

Page 41: Intro to data visualization

Gestalt laws - interplay between parts and the whole (Kurt Koffka)

series of principles

Election results Florida:

• black = Bush

• white = Gore

Page 42: Intro to data visualization
Page 43: Intro to data visualization

Gestalt - Principle of Simplicity

We perceive every pattern in the way that gives it the simplest possible structure.

Page 44: Intro to data visualization

Gestalt - Principle of Proximity

Things that are close to each other are seen as belonging together (=> clusters)

Page 45: Intro to data visualization

Gestalt - Principle of Similarity

Things that are similar in some way are perceived as belonging together.

Page 46: Intro to data visualization

Gestalt - Principle of Closure

You will try to complete a pattern.

Page 47: Intro to data visualization

Gestalt - Principle of Connectedness

Things that are connected are perceived as belonging together. This encoding is stronger than similarity, shape, colour, and size.

Page 48: Intro to data visualization

Gestalt - Principle of Good Continuation

Objects that are arranged in a straight or smooth line tend to be seen as a unit.

Page 49: Intro to data visualization

Gestalt - Principle of Common Fate

Objects that move in the same direction tend to be seen as a unit.

Page 50: Intro to data visualization

Gestalt - Principle of Familiarity

Page 51: Intro to data visualization
Page 52: Intro to data visualization
Page 53: Intro to data visualization
Page 54: Intro to data visualization

Gestalt - Principle of Symmetry

Symmetrical areas tend to be seen as figures against asymmetrical backgrounds.

Page 55: Intro to data visualization

Context affects perceptual tasks

Page 56: Intro to data visualization

Pre-attentive vision

= ability of low-level human visual system to rapidly identify certain basic visual properties

• some features “pop out”

• used for:

• target detection

• boundary detection

• counting/estimation

• ...

• visual system takes over => all cognitive power available for interpreting the figure, rather than needing part of it for processing the figure

Page 57: Intro to data visualization

Really fast; see http://www.csc.ncsu.edu/faculty/healey/PP/

Page 58: Intro to data visualization

Limitations of preattentive vision

1. Combining pre-attentive features does not always work: for most channel pairs and all channel triplets the viewer has to resort to “serial search” (e.g. “is there a red square in this picture?”)

2. Speed depends on the channel: use one that is good for categorical data (see “accuracy” further on)

Page 59: Intro to data visualization

4.3. Mapping data to image: visual encoding

Page 60: Intro to data visualization

Language of graphics

• graphics = sign system:

• each mark (point, line, area) represents a data element

• choose visual variables to encode relationships between data elements

• difference, similarity, order, proportion

• only position supports all relationships (see later)

• huge range of alternatives for data with many attributes

• find images that express & effectively convey the information

Page 61: Intro to data visualization

Which encoding should I use?

• From huge list of possibilities, you have to choose the best one.

• Principle of Consistency

• properties of the representation should match properties of the data (e.g. pie chart: area vs radius)

• Principle of Importance Ordering

• encode the most important piece of information in the most “effective” way (i.e. spatial position)
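The "area vs radius" remark can be made concrete with a small sketch. It assumes a value is drawn as a circle and that viewers read the circle's area; the helper `radius_for_value` and its `area_per_unit` parameter are hypothetical names, not part of any plotting library.

```python
import math


def radius_for_value(value, area_per_unit=1.0):
    """Radius of a circle whose AREA (not radius) is proportional to value."""
    return math.sqrt(value * area_per_unit / math.pi)


def circle_area(radius):
    return math.pi * radius ** 2


small, large = 1.0, 4.0  # the data: large is 4x small

# Consistent encoding: area scales with the value, so the radius only doubles.
consistent_ratio = circle_area(radius_for_value(large)) / circle_area(radius_for_value(small))

# Inconsistent encoding: radius scales with the value, so the area grows 16x.
inconsistent_ratio = circle_area(large) / circle_area(small)

print(consistent_ratio, inconsistent_ratio)
```

Scaling the radius directly exaggerates a fourfold difference in the data into a sixteenfold difference in ink, which is the mismatch the Principle of Consistency warns against.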

Page 62: Intro to data visualization
Page 63: Intro to data visualization

Stevens’ psychophysical law

= proposed relationship between the magnitude of a physical stimulus and its perceived intensity or strength
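The law is usually written as a power function, perceived = k · stimulus^n. A minimal sketch follows; the exponent differs per channel, and the value 0.7 used below is only an illustrative assumption for a sublinear channel, not a figure taken from the slides.

```python
def perceived_magnitude(stimulus, exponent, k=1.0):
    """Power-law form: perceived = k * stimulus ** exponent."""
    return k * stimulus ** exponent


# How much bigger does a doubled stimulus look?
linear_ratio = perceived_magnitude(2.0, exponent=1.0) / perceived_magnitude(1.0, exponent=1.0)
sublinear_ratio = perceived_magnitude(2.0, exponent=0.7) / perceived_magnitude(1.0, exponent=0.7)

print(linear_ratio)     # exponent 1: doubling is perceived as doubling
print(sublinear_ratio)  # exponent < 1: doubling is perceived as less than doubling
```

With an exponent below 1 differences are systematically underestimated, and with an exponent above 1 they are overestimated, which is one reason channels are not interchangeable for quantitative data.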

Page 64: Intro to data visualization

Accuracy of quantitative perceptual tasks

Mackinlay

what/where (qualitative)
how much (quantitative)

Page 65: Intro to data visualization

Accuracy of quantitative perceptual tasks

Mackinlay

what/where (qualitative)
how much (quantitative)

Page 66: Intro to data visualization

Accuracy of quantitative perceptual tasks

Mackinlay
“power of the plane”

what/where (qualitative)
how much (quantitative)

Page 67: Intro to data visualization

Accuracy of quantitative perceptual tasks

Mackinlay

what/where (qualitative)
how much (quantitative)

grouping: see Gestalt laws

Page 68: Intro to data visualization

COLOUR

Page 69: Intro to data visualization

COLOUR ... is tricky, and often used wrong

Page 70: Intro to data visualization

Colour space

• = mathematical model to talk about colour

• RGB (red-green-blue)

• most common, but less useful

• HSV (hue-saturation-value)

• more useful
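One way to see why HSV tends to be easier to work with: hue, saturation and value can be changed independently, whereas in RGB all three numbers change at once. The sketch below uses Python's standard `colorsys` module; the helper `hue_palette` and its default saturation and value are hypothetical choices for illustration.

```python
import colorsys


def to_hex(r, g, b):
    """Convert RGB components in [0, 1] to a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


def hue_palette(n, saturation=0.6, value=0.9):
    """n colours with evenly spaced hues and fixed saturation and value."""
    return [to_hex(*colorsys.hsv_to_rgb(i / n, saturation, value)) for i in range(n)]


# Pure red in RGB is hue 0, full saturation, full value in HSV.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))

# A categorical palette: only the hue varies between categories.
print(hue_palette(5))
```

Evenly spaced hues are only a rough starting point, since equal steps in hue are not equally distinguishable to the eye; curated palettes such as those from colorbrewer2.org address that.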

Page 71: Intro to data visualization

colorbrewer2.org

in R: please use RColorBrewer!

Page 72: Intro to data visualization

Context affects colour perception

Page 73: Intro to data visualization

Context affects colour perception

Page 74: Intro to data visualization

Dangers of Depth (3D)

• We do NOT see in 3D; we see in 2.05D.

• occlusion

• interaction complexity

• perspective distortion

Page 75: Intro to data visualization

3D example

Page 76: Intro to data visualization

Lie factor

“lie factor” = (size of effect shown in graphic) / (size of effect in data)
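The ratio can be sketched in a few lines. This assumes "size of effect" means the relative change between two values, (second - first) / first; the function names and the example numbers are hypothetical.

```python
def effect_size(first, second):
    """Relative change from first to second."""
    return (second - first) / first


def lie_factor(shown_first, shown_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    return effect_size(shown_first, shown_second) / effect_size(data_first, data_second)


# Hypothetical chart: the data rise from 10 to 15 (+50%),
# but the bars are drawn 1.0 cm and 4.0 cm tall (+300%).
print(lie_factor(1.0, 4.0, 10.0, 15.0))

# An honest chart draws the bars 2.0 cm and 3.0 cm tall (+50%).
print(lie_factor(2.0, 3.0, 10.0, 15.0))
```

A value close to 1 means the graphic represents the data faithfully; the further it is from 1, the more the graphic overstates or understates the effect.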

Page 77: Intro to data visualization

3D scatter plots are better as series of 2D projections

Page 78: Intro to data visualization

Dynamic data

• animation is good sometimes, but often not:

• we can only follow 3-4 visual cues simultaneously

• change in “mental map”

• change blindness (e.g. http://nivea.psycho.univ-paris5.fr/CBMovies/BarnTrackFlickerMovie.gif)

Page 79: Intro to data visualization

http://vimeo.com/2035117

Page 80: Intro to data visualization
Page 81: Intro to data visualization

5. Interaction

Page 82: Intro to data visualization

Overview, zoom and filter, details on demand
(Shneiderman’s Information Seeking Mantra)

Page 83: Intro to data visualization

Operations on the data

• sorting

• filtering

• browsing/exploring

• comparison

• characterizing trends & distributions

• finding anomalies & outliers

• ...

Page 84: Intro to data visualization

Techniques to support these operations

• re-orderable matrices

• brushing

• linked views

• overview & detail

• focus & context

• ...
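Brushing and linked views can be sketched without any graphics library: a brush in one view produces a set of selected record ids, and every other view uses that same set to highlight its own marks. The `Record` type, the function names and the sample data below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    id: int
    x: float
    y: float
    category: str


def brush(records, x_min, x_max):
    """View 1 (e.g. a scatter plot): ids of records inside the brushed x range."""
    return {r.id for r in records if x_min <= r.x <= x_max}


def linked_view(records, selected_ids):
    """View 2 (e.g. a bar chart): (selected, total) count per category."""
    counts = {}
    for r in records:
        selected, total = counts.get(r.category, (0, 0))
        counts[r.category] = (selected + int(r.id in selected_ids), total + 1)
    return counts


# Hypothetical sample data
sample_records = [
    Record(1, 0.5, 2.0, "a"),
    Record(2, 1.5, 1.0, "a"),
    Record(3, 2.5, 3.0, "b"),
]

selection = brush(sample_records, x_min=1.0, x_max=3.0)
print(selection)
print(linked_view(sample_records, selection))
```

The design choice is that views share only the selection (a set of ids), not their layout, so any number of views can be linked by reading the same set.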

Page 85: Intro to data visualization

6. Validation

Page 86: Intro to data visualization

Evaluate the right thing

Munzner, 2009

Page 87: Intro to data visualization

Slide/picture acknowledgments

• Jeffrey Heer

• Tamara Munzner

• Jessie Kennedy

• Nils Gehlenborg

• Miriah Meyer

Page 88: Intro to data visualization

“I think this presentation went quite well...”