-
Data Science: Data Visualization Boot Camp
Relationship
Bubble Plot
Chuck Cartledge, PhD
24 January 2020
-
Type Sample data Hands on Q & A Conclusion References
Files
Table of contents (1 of 1)
1 Type
    Uses
    General considerations
2 Sample data
3 Hands on
4 Q & A
5 Conclusion
6 References
7 Files
-
A definition
“A bubble graph is a variation of a point or line graph where
the data points (dots) have been replaced by circles (bubbles). The
major advantage of a bubble graph versus a point or line graph is
the ability to encode one or more additional variables by means
of the bubble symbol. Bubble graphs might be two or three
dimensional, . . . ”
R. L. Harris [1]
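A minimal sketch of the idea in the definition, assuming the ggplot2 package is installed. The data frame `toy` and its columns `x`, `y`, and `z` are made up for illustration: position encodes two variables, and bubble size encodes a third.

```r
library(ggplot2)

# Hypothetical toy data: x and y give the position, z is the extra variable.
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.0, 3.5, 3.0, 5.0, 4.5),
  z = c(10, 40, 20, 80, 60)
)

# A plain scatterplot shows only x and y.
scatter <- ggplot(toy, aes(x = x, y = y)) +
  geom_point()

# Mapping z to size turns the dots into bubbles and adds a third variable.
bubble <- ggplot(toy, aes(x = x, y = y, size = z)) +
  geom_point(alpha = 0.5)
```

Calling `print(scatter)` and then `print(bubble)` draws the two versions for comparison.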
-
R supplied data set (1 of 2)
Included in the R package ggplot2.
“This dataset contains a subset of the fuel economy data that
the EPA makes available on . It contains only models which had a new
release every year between 1999 and 2008 - this was used as a proxy
for the popularity of the car.”
H. Wickham [2]
library(ggplot2)
?mpg
head(mpg)
Resulting in:
-
R supplied data set (2 of 2)
# A tibble: 6 x 11
  manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class
1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     comp
2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     comp
3 audi         a4      2    2008     4 manual(m6) f        20    31 p     comp
4 audi         a4      2    2008     4 auto(av)   f        21    30 p     comp
5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     comp
6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     comp
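Before choosing which variable to map to bubble size, it can help to check the column types. A small sketch, assuming ggplot2 is installed; `numeric_cols` is a name introduced here for illustration.

```r
library(ggplot2)
data(mpg, package = "ggplot2")

# Size of the full data set (head() shows only the first 6 rows).
print(dim(mpg))

# Numeric columns are candidates for the x, y, and size aesthetics.
numeric_cols <- names(mpg)[sapply(mpg, is.numeric)]
print(numeric_cols)

# Spread of the highway mileage, a possible bubble-size variable.
print(range(mpg$hwy))
```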
-
More recent mileage data
Downloaded from: https://www.fueleconomy.gov/feg/download.shtml
Described at: https://www.fueleconomy.gov/feg/ws/index.shtml#vehicle
We will:
1 Extract csv data from a zip file (39,865 rows)
2 Select certain makes (attempt to replicate the sample
data)
3 Display different data for selected makes/models
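The three steps above can be sketched as follows. This is a rough outline, not the presenter's script: the archive name `vehicles.csv.zip`, the member name `vehicles.csv`, and the helper names `read_zipped_csv` and `select_makes` are assumptions. The `make` and `highway08` column names match the plotting code used later in the deck. The small `toy` data frame stands in for the downloaded file so the selection step can be tried without the download.

```r
# Step 1 (assumed file names): read one CSV member straight out of a zip archive.
read_zipped_csv <- function(zip_path, csv_name) {
  # unz() creates a connection to a single file inside the archive;
  # read.csv() opens it, reads it, and closes it again.
  read.csv(unz(zip_path, csv_name), stringsAsFactors = FALSE)
}

# Step 2: keep only the rows for the requested makes.
select_makes <- function(vehicles, makes) {
  vehicles[vehicles$make %in% makes, , drop = FALSE]
}

# With the real download this might look like:
#   vehicles <- read_zipped_csv("vehicles.csv.zip", "vehicles.csv")

# Stand-in data (invented values) so the selection can be run as is.
toy <- data.frame(
  make      = c("Honda", "Ford", "Audi", "Honda", "Toyota"),
  highway08 = c(33, 25, 29, 36, 31),
  stringsAsFactors = FALSE
)

# Step 3: look at the data for the selected makes.
selected <- select_makes(toy, c("Honda", "Audi"))
print(selected)
```

Keeping the read and the selection in separate functions means the selection can be changed (for example, to add another make) without re-reading the archive.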
-
The first codes. (1 of 3)
-
The first codes. (2 of 3)
rm(list=ls())
library(ggplot2)
data(mpg, package="ggplot2")
mpg_select
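The listing above stops at `mpg_select`. A minimal sketch of how the selection and the base plot object `g` used on the next slide might be set up. The manufacturers listed in `keep` are an illustrative assumption, not recovered from the slide, as are the axis choices (engine displacement against city mileage) and the labels.

```r
library(ggplot2)
data(mpg, package = "ggplot2")

# Assumption: the manufacturers kept here are illustrative only.
keep <- c("audi", "ford", "honda", "hyundai")
mpg_select <- mpg[mpg$manufacturer %in% keep, ]

# Assumption: displacement on x and city mileage on y.
g <- ggplot(mpg_select, aes(x = displ, y = cty)) +
  labs(title = "Bubble chart",
       x = "Engine displacement (liters)",
       y = "City mpg")

# One bubble layer on top of the base object.
p <- g + geom_jitter(aes(col = manufacturer, size = hwy))
```

Building `g` once and adding layers to it afterwards is what lets the later lines compare `geom_point()` and `geom_jitter()` on the same data.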
-
The first codes. (3 of 3)
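# Colour the points by manufacturer
g + geom_point(aes(col=manufacturer))
# Jitter the points to reduce overplotting
g + geom_jitter(aes(col=manufacturer))
# Bubble size shows highway mileage; add one linear fit per manufacturer
g + geom_jitter(aes(col=manufacturer, size=hwy)) +
    geom_smooth(aes(col=manufacturer), method="lm", se=FALSE)
# Same plot with readable legend titles
g + geom_jitter(aes(col=manufacturer, size=hwy)) +
    geom_smooth(aes(col=manufacturer), method="lm", se=FALSE) +
    labs(size = "Highway\n mpg",
         colour = "Brand")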
-
The second codes. (1 of 4)
-
The second codes. (2 of 4)
rm(list=ls())
library(ggplot2)
saveFileName
-
The second codes. (3 of 4)
    labs(subtitle="mpg: Displacement vs City Mileage",
         title="Bubble chart",
         x="Engine displacement (liters)",
         y="City mpg",
         color="Manufacturer")
g + geom_point()
g + geom_point(aes(col=make))
g + geom_jitter(aes(col=make))
g + geom_jitter(aes(col=make, size=highway08)) +
    geom_smooth(aes(col=make), method="lm", se=FALSE) +
    labs(size = "Highway\n mpg",
         colour = "Brand")
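A self-contained sketch of the same layering on a small stand-in data frame, assuming ggplot2 is installed. The columns `make` and `highway08` come from the code above; `displ` and `city08` are assumed names for engine displacement and city mileage, and all values are invented for illustration.

```r
library(ggplot2)

# Invented stand-in for the selected federal data (not real measurements).
vehicles <- data.frame(
  make      = rep(c("Honda", "Ford"), each = 4),
  displ     = c(1.5, 1.8, 2.0, 2.4, 2.0, 2.5, 3.5, 5.0),
  city08    = c(30, 28, 26, 24, 23, 21, 18, 15),
  highway08 = c(38, 36, 34, 31, 32, 29, 25, 22)
)

# Base object: position only.
g <- ggplot(vehicles, aes(x = displ, y = city08)) +
  labs(title = "Bubble chart",
       x = "Engine displacement (liters)",
       y = "City mpg")

# Colour by make, size by highway mileage, one linear trend line per make.
bubble <- g +
  geom_jitter(aes(col = make, size = highway08)) +
  geom_smooth(aes(col = make), method = "lm", se = FALSE) +
  labs(size = "Highway\n mpg", colour = "Brand")
```

`print(bubble)` draws the result; with real data the `vehicles` data frame would be replaced by the rows selected from the downloaded file.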
-
The second codes. (4 of 4)
-
The third codes. (1 of 4)
-
The third codes. (2 of 4)
rm(list=ls())
library(ggplot2)
saveFileName
-
The third codes. (3 of 4)
                       " and average CO2 over time"),
         title="Bubble chart",
         x="Year",
         y="City mpg",
         caption = paste0("Idea taken from \"Practical",
                          " Statistics for Data",
                          " Scientists\", Bruce and Bruce."
                          )
         )
g + geom_point()
g + geom_jitter()
g + geom_jitter(aes(size=co2,
                    shape=as.factor(cylinders)
                    ), alpha=0.5) +
    geom_smooth(colour="green", method="lm", se=FALSE) +
-
The third codes. (4 of 4)
    labs(size = "CO2\nmeasurements",
         shape = "Number\nof cylinders"
         )
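The subtitle fragment in this listing mentions average CO2 over time. One way such yearly averages could be computed is with base R's `aggregate()`. This is a sketch on invented values: the data frame `honda` is a stand-in, not the presenter's data, and the `year` column name is an assumption (`co2` and `cylinders` appear in the code above).

```r
# Invented stand-in for the Honda rows of the federal data.
honda <- data.frame(
  year      = c(2013, 2013, 2014, 2014, 2015, 2015),
  cylinders = c(4, 6, 4, 6, 4, 6),
  co2       = c(300, 400, 290, 390, 280, 380)
)

# Mean CO2 per model year, one row per year.
co2_by_year <- aggregate(co2 ~ year, data = honda, FUN = mean)
print(co2_by_year)
```

The per-year table can be used to check the direction of the trend line that `geom_smooth()` draws through the individual points.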
-
Hands-on
1 The supervisor would like to see the effect of different
“default” themes on the first plot. Show how to use the gray,
linedraw, and classic themes.
2 The CO2 plot displays data for Hondas only. Change the data
selection command to include Fords, and discuss how the
resulting plot could be improved.
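For the first exercise, one possible starting point, assuming ggplot2 is installed. The plot `p` below is a stand-in for the first plot (built from the full `mpg` data rather than the presenter's selection). `theme_gray()`, `theme_linedraw()`, and `theme_classic()` are among ggplot2's built-in complete themes.

```r
library(ggplot2)
data(mpg, package = "ggplot2")

# Stand-in for the first plot.
p <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_jitter(aes(col = manufacturer, size = hwy))

# Adding a complete theme changes the look of the whole plot,
# while the data and the layers stay the same.
themed <- list(
  gray     = p + theme_gray(),
  linedraw = p + theme_linedraw(),
  classic  = p + theme_classic()
)
```

Each variant can be drawn in turn, for example with `print(themed$classic)`.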
-
Q & A time.
Q: How many Harvard MBAs does it take to screw in a lightbulb?
A: Just one. He grasps it firmly and the universe revolves
around him.
-
What have we covered?
Bubble plots:
Require slightly more thought and consideration than scatterplots
Are used to show 3 or more related data sets
Are good for showing gross differences in the third dimension
Next: Columnar histograms (how grouping data can show
patterns)
-
References (1 of 1)
[1] Robert L. Harris, Information Graphics: A Comprehensive
Illustrated Reference, Oxford University Press, 2000.
[2] Hadley Wickham, ggplot2: Elegant Graphics for Data
Analysis, Springer-Verlag New York, 2009.
-
Files of interest
1 Code snippet to create images in this presentation
2 Extract Federal fuel data
## First codes
rm(list=ls())
library(ggplot2)
data(mpg, package="ggplot2")
mpg_select