Telling Stories With Data
Class #1, March 20th, 2017
David Newbury — @workergnome
What We're Doing Today:
— Syllabus Review
— (Brief) History of Data Visualization
— (Tiny) Theory of Visualization
— (Nerdy) Overview of Concepts
— (Fake) Data Exploration
Course Website: datastories.davidnewbury.com
Which is biggest?
15012, 8271, 30193, 1189, 9913, 16000, 92481, 49801, 100407, 2910, 3809, 8018, 61528, 18083, 38691, 1800
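One way to feel the difference a picture makes is to draw the same sixteen numbers as a rough text bar chart. This is only an illustrative sketch: the `text_bars` helper and the 40-character width are assumptions made for the example, not something from the slides.

```python
values = [15012, 8271, 30193, 1189, 9913, 16000, 92481, 49801,
          100407, 2910, 3809, 8018, 61528, 18083, 38691, 1800]

def text_bars(numbers, width=40):
    """Return one line per number: the number, then a bar scaled to `width`."""
    biggest = max(numbers)
    lines = []
    for n in numbers:
        # The largest number gets a full-width bar; the rest are proportional.
        bar = "#" * round(n / biggest * width)
        lines.append(f"{n:>7} {bar}")
    return lines

print("\n".join(text_bars(values)))
```

The longest bar is easy to spot at a glance, which is much harder in the comma-separated list.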
Why do we visualize?
(Brief) History of Data Visualization
Tabula Peutingeriana, 5th century CE
Rene Descartes, 1600s
Joseph Priestley, A New Chart of History (1769)
William Playfair (1786 & 1801)
John Snow, London Cholera Map (1854)
Cholera Map
Florence Nightingale, War Deaths (1855)
Charles Minard, March on Moscow (1869)
More recent history.
New York Times
(Tiny) Theory of Visualization
Dataviz is constructed reality.
You are telling a story, not (just) stating facts.
data art
as opposed to
data visualization
as opposed to
statistical graphics
Statistical Graphics
How do I create Statistical Graphs in SAS 9.1.3 without Proc Gplot. UCLA: Statistical Consulting Group. http://www.ats.ucla.edu/stat/sas/notes2/
Data Art
Dear Data, Giorgia Lupi & Stefanie Posavec. http://www.dear-data.com
Two Uses
1.) help people grasp things outside their reach
2.) tell stories
explanatory visualization work
as opposed to
exploratory visualizations
Dataviz is constructed reality.
Do you care how true your story is?
Do you care how accurate your story is?
Are you trying to teach, entertain, or convince?
(Nerdy) Overview of Concepts
What can you visualize?
Potential Subjects.
subways, sheep, the solar system, shoes, sleep, skyline, snow, supermarket, sausages, school, the sea, spiders, staircases, syrup, soap, sawmills, stereos...
...and other things that begin with S.
What are you interested in?
I'm interested in subways.
Data Visualization starts with...
A Question.
What question about your subject are you interested in?
— Are subways more efficient than owning a car?
— How often do I ride the subway in a year?
— What locations have the best access to subways?
— What's the average subway commute in Pittsburgh?
Dimension and Scope are about choosing what to focus on.
Dimension
Which bits of information about a subject are you going to focus on?
Possible Dimensions
number of cars
duration of ride
date of a ride
different lines
number of stops
cost per ride
number of stops per day
time between stops
cleanliness
Scope
Out of the infinite ways to look at your subject, how are you going to choose one?
Possible Scopes
All trains in a day
All the rides that I've been on this year
My train this morning
All of the stops in the city
Each line
Every train stop in the past 50 years
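Dimension and scope can be read as two different cuts through the same records: scope picks which rides to look at, dimension picks which fields of each ride to keep. A minimal sketch, assuming a made-up `Ride` record with invented values (none of this data comes from the slides):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Ride:
    # Each field is one dimension of a subway ride (all values are made up).
    line: str
    ride_date: date
    duration_minutes: float
    stops: int

rides = [
    Ride("Red line", date(2017, 3, 20), 12.5, 4),
    Ride("Blue line", date(2017, 3, 20), 30.0, 9),
    Ride("Red line", date(2016, 11, 2), 18.0, 6),
]

# Scope: "all the rides that I've been on this year" (2017 in this sketch).
this_year = [r for r in rides if r.ride_date.year == 2017]

# Dimension: keep only the duration of each ride that is in scope.
durations = [r.duration_minutes for r in this_year]
print(durations)  # [12.5, 30.0]
```

Changing the scope changes which rows survive; changing the dimension changes which column you read off them.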
(Fake) Data Exploration
Choose one.
subways, sheep, the solar system, shoes, sleep, skyline, snow, supermarket, sausages, school, the sea, spiders, staircases, syrup, soap, sawmills, stereos...
...and other things that begin with S.
TRY IT.
1. Write down your subject
2. Write down your question
3. Write down as many dimensions as you can
4. Write down possible scopes for your data
What does your data look like?
Types of Data
Dates
Numbers
Geo Coordinates
Strings
Categories
Types of Data
number of cars - Numeric
duration of ride - Numeric
date of a ride - Date
different lines - Category
number of stops - Numeric
cost per ride - Category
number of stops per day - Numeric
time between stops - Numeric
cleanliness - String
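In practice data often arrives as plain text, and assigning a type means converting each value. A small sketch, assuming a hypothetical `raw` record with invented values and an assumed `YYYY-MM-DD` date format:

```python
from datetime import date, datetime

# Hypothetical raw values for one ride, as they might arrive in a CSV row.
raw = {
    "number of cars": "6",
    "duration of ride": "12.5",
    "date of a ride": "2017-03-20",
    "different lines": "Red line",
}

# Convert each raw string to the data type that suits it.
ride = {
    # Numeric: a whole number of cars.
    "number of cars": int(raw["number of cars"]),
    # Numeric: duration, taken here to be in minutes (an assumption).
    "duration of ride": float(raw["duration of ride"]),
    # Date: parsed from the assumed YYYY-MM-DD text format.
    "date of a ride": datetime.strptime(raw["date of a ride"], "%Y-%m-%d").date(),
    # Category: kept as a label, one of a fixed set of lines.
    "different lines": raw["different lines"],
}

print(ride)
```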
Two (related) ideas:
Categories & measures
Categories are Discrete Things
Measures are for Counting
number of cars - Measure
duration of ride - Measure
date of a ride - Measure
different lines - Categories
number of stops - Measure
cost per ride - Categories
number of stops per day - Measure
time between stops - Measure
cleanliness - Categories
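The split shows up as soon as you summarize data: categories are what you group by, measures are what you add up inside each group. A minimal sketch with made-up rides (the records and the `stops_per_line` name are assumptions for illustration):

```python
from collections import defaultdict

# Made-up rides; only two dimensions are kept for the example.
rides = [
    {"line": "Red line", "stops": 4},
    {"line": "Blue line", "stops": 9},
    {"line": "Red line", "stops": 6},
]

# "line" is a category: it splits the rides into discrete groups.
# "stops" is a measure: it can be counted and summed within each group.
stops_per_line = defaultdict(int)
for ride in rides:
    stops_per_line[ride["line"]] += ride["stops"]

print(dict(stops_per_line))  # {'Red line': 10, 'Blue line': 9}
```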
A hidden dimension:
David, Daniel, Dawn, Danique
A hidden dimension:
David (1), Daniel (2), Dawn (3), Danique (4)
Position of the item in the group.
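The position is data nobody wrote down: it comes for free from the order of the list. In Python, the built-in `enumerate` makes that hidden dimension explicit. A short sketch:

```python
names = ["David", "Daniel", "Dawn", "Danique"]

# enumerate pairs each item with its position; start=1 matches the slide.
positions = list(enumerate(names, start=1))

labelled = ", ".join(f"{name} ({i})" for i, name in positions)
print(labelled)  # David (1), Daniel (2), Dawn (3), Danique (4)
```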
TRY IT.
1. Choose a scope for your data.
2. Identify which dimensions are relevant.
3. Is each dimension a category or a measure?
Now What?
We need to map our data
from a domain to a range.
Domain
number of cars - 1...8
duration of ride - 30 sec...2 hours
date of a ride - - 24ft...200ft
different lines - Red line, Blue line, Green line, Silver Line, Yellow Line
number of stops - 2...20
cost per ride - "$2.50, $1.75, $3.00, $0.00"
number of stops per day - ??...???
time between stops - 30 sec...5 minutes
Range
Domain is the possible input values
Range is the possible output values
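For a numeric dimension, the simplest mapping from domain to range is a straight-line one. A minimal sketch, assuming a hypothetical `linear_scale` helper; the 0 to 120 minute domain and the 0 to 600 pixel range are invented for the example:

```python
def linear_scale(domain, range_):
    """Build a function that maps values in `domain` onto `range_` linearly."""
    d0, d1 = domain
    r0, r1 = range_

    def scale(value):
        # How far the value sits along the domain, stretched onto the range.
        return r0 + (value - d0) * (r1 - r0) / (d1 - d0)

    return scale

# Assumed example: ride durations of 0 to 120 minutes drawn on 0 to 600 pixels.
duration_to_px = linear_scale((0, 120), (0, 600))
print(duration_to_px(60))  # 300.0
```

The input values never change; picking a different range only changes where they land on the page.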
Data
3, 7, 10, 6, 2
Position of the item in the group.
Domain
[0-10]
[1-5]
Range
X: 400px Y: 800px
Mapping
X: item position Y: numeric value
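Worked through in code, this slide's mapping turns each data point into a pixel coordinate. A sketch under a few assumptions: the `scale` helper is hypothetical, both ranges start at 0, and Y grows upward (many screen coordinate systems grow downward, which would need a flipped range):

```python
def scale(value, domain, range_):
    """Map `value` linearly from `domain` to `range_`."""
    d0, d1 = domain
    r0, r1 = range_
    return r0 + (value - d0) * (r1 - r0) / (d1 - d0)

data = [3, 7, 10, 6, 2]

points = [
    (
        scale(position, (1, 5), (0, 400)),  # X: item position, domain [1-5]
        scale(value, (0, 10), (0, 800)),    # Y: numeric value, domain [0-10]
    )
    for position, value in enumerate(data, start=1)
]

print(points)
# [(0.0, 240.0), (100.0, 560.0), (200.0, 800.0), (300.0, 480.0), (400.0, 160.0)]
```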
Data
3, 7, 10, 6, 2
Position of the item in the group.
Area
Data
3, 7, 10, 6, 2
Position of the item in the group.
Color
Data
val1: 3, 7, 10, 6, 2
val2: 5, 8, 1, 8, 3
val3: Cat, Dog, Cat, Cat, Dog
Position of the item in the group.
Mapping
X: item position Y: val1 Size: val2 Color: val3
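The same mapping can be written out as one visual mark per item. This is a sketch only: the palette colors and the "100 square pixels per unit" size factor are assumptions. Size is mapped to area and the radius is derived from it, so that a value twice as large covers twice the area.

```python
import math

val1 = [3, 7, 10, 6, 2]
val2 = [5, 8, 1, 8, 3]
val3 = ["Cat", "Dog", "Cat", "Cat", "Dog"]

# Assumed palette: each category gets one fixed color.
palette = {"Cat": "orange", "Dog": "steelblue"}

marks = []
for position, (v1, v2, v3) in enumerate(zip(val1, val2, val3), start=1):
    area = v2 * 100  # assumed: 100 square pixels per unit of val2
    marks.append({
        "x": position,                          # X: item position
        "y": v1,                                # Y: val1
        "radius": math.sqrt(area / math.pi),    # Size: val2, as circle area
        "color": palette[v3],                   # Color: val3
    })

for mark in marks:
    print(mark)
```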
Dimensions beyond X and Y.
Color
Size
Shape
Labels
Patterns
Icons
Anything Else You Can Imagine
TRY IT.
1. Identify your domains
2. For each domain, choose a range
3. Draw it!
Finishing Touches
Measures get Axes
Categories get Headers
Labels
Axis
Category Axis
Number Axis
Date Axis
Log Axis
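The log axis is the least obvious of the four: equal distances on the page stand for equal ratios rather than equal differences. A rough sketch, assuming a hypothetical `log_scale` helper, a domain of 1,000 to 1,000,000, and a 300 pixel range:

```python
import math

def log_scale(value, domain, range_):
    """Map a positive `value` onto `range_`, spacing by powers of ten."""
    d0, d1 = domain
    r0, r1 = range_
    span = math.log10(d1) - math.log10(d0)
    return r0 + (math.log10(value) - math.log10(d0)) * (r1 - r0) / span

# Each tenfold step moves the same distance along the axis.
for value in (1_000, 10_000, 100_000, 1_000_000):
    print(value, round(log_scale(value, (1_000, 1_000_000), (0, 300))))
```

This kind of axis helps when a few very large values would otherwise flatten everything else.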
Legends
TRY IT.
1. Add a title to your chart
2. Label your axes
3. Add legends and labels as needed
Review
Dimensions and Scope
Domain and Range
Categories and Measures
Thank You.