Bubble Plots as a Model-Free Graphical Tool for Three Continuous Variables and a Flexible R Function to Plot Them
Keith A. Markus and Wen Gu
John Jay College of Criminal Justice, CUNY

Dec 19, 2015

Overview

• Goal: Model-free graphs for 3 continuous variables.

• Some alternative graphs & design issues.
• The R function: bp3way().
• An empirical study.
• Tentative conclusions & future directions.

The Goal

• The goal is to provide a useful graphical representation of the association between 3 continuous variables.

• Often: 2 IVs and 1 DV.
• Model free:
  – Exploratory data analysis.
  – Not a summary of a statistical model.

Why Model Free?

• If the statistical model is correct: model-based graphs can be very efficient.
• If the statistical model is incorrect: model-based graphs can be very misleading.
• E.g., multiple y~x regression lines for values of z. Misleading if...
  – y~x relationship is not linear.
  – Variance in y varies with x or z.
  – Regression lines extrapolate beyond data.

Some Non-Options

• Scatterplot matrix.
• y~x regression lines for fixed z values.
• Factorial design type line plots.

• All good plots for other applications.
• But not good plots for present purpose.

Scatterplot matrix

• Does not attempt to represent 3-way distributions.

• Same data used for all graphs (N = 100)

y~x regression lines for fixed values of z:

• Model dependent: plots model not data.

• Not clear where data leaves off.

Factorial-design type plots for categorized IVs:

• Model dependent (interpolation).

• Arbitrary cuts (quartiles plotted here).

• Loss of information through categorization.

Some Options

• 3D Scatterplots.
  – R Package scatterplot3d: scatterplot3d()
• Co-plots.
  – R base installation: coplot()
• 3-way Bubbleplots.
  – Available from authors: bp3way()
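
The first two options can be tried on any data frame with three numeric columns. The following is a minimal sketch on simulated data (not the N = 100 data set shown in the slides). The variable names x, z, and y are assumptions made for illustration, and the 3D plot is drawn only if the scatterplot3d package happens to be installed.

```r
# Simulated stand-in data (N = 100); NOT the data set used in the slides.
set.seed(1)
d <- data.frame(x = rnorm(100), z = rnorm(100))
d$y <- 0.5 * d$x + 0.3 * d$z + rnorm(100)

# Co-plot from the base installation: y against x within bands of z.
coplot(y ~ x | z, data = d, number = 4)

# 3D scatterplot, drawn only if the scatterplot3d package is available.
if (requireNamespace("scatterplot3d", quietly = TRUE)) {
  scatterplot3d::scatterplot3d(d$x, d$z, d$y,
                               xlab = "x", ylab = "z", zlab = "y")
}
```

The number argument of coplot() sets how many bands of z are used, which is the banding choice the co-plot depends on.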

3D scatterplot:

• Natural extension of 2D scatter plot.
• Relies on 3D illusion: some ambiguity.

Co-plot

• Well suited to perceptual process.

• Relies on banding of z values.

3-Way Bubble Plot

• 2D representation of 3D data.
• People tend to underestimate area.
• No literature.
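
Because viewers tend to underestimate area, the mapping from the third variable to bubble size matters. The sketch below contrasts two common choices using the base R symbols() function: radius proportional to the value, and area proportional to the value. The slides do not say which mapping bp3way() uses, so this is only an illustration; the data and the column names x, y, and b are simulated assumptions.

```r
# Simulated data for illustration only.
set.seed(2)
d <- data.frame(x = runif(30), y = runif(30), b = runif(30, 1, 10))

# Option 1: radius proportional to b, so bubble area grows with b^2.
r_linear <- d$b / max(d$b)
# Option 2: area proportional to b, so radius grows with sqrt(b).
r_area <- sqrt(d$b / max(d$b))

op <- par(mfrow = c(1, 2))
symbols(d$x, d$y, circles = r_linear, inches = 0.25,
        fg = "black", bg = "grey90", main = "Radius ~ b")
symbols(d$x, d$y, circles = r_area, inches = 0.25,
        fg = "black", bg = "grey90", main = "Area ~ b")
par(op)
```

With radius scaling, small values shrink much faster visually than with area scaling, so the two panels can suggest different strengths of association for the same data.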

Some Design Features of the 3-Way Bubble Plot

• Grid designed to make it easier to compare circle sizes across the plot surface.
• Shading designed to accentuate bubbles.
• Limited number of cases plotted avoids overly dense plots (in this case all 100 are plotted).
• Margins avoid bubbles extending outside plot region.

bp3way() function

Usage

bp3way(x)

bp3way(x, xc=1, bc=2, yc=3, proportion=1, random=TRUE, x.margin=.1, y.margin=.1, rad.ex=1, rad.min=NULL, names=c('X', 'B', 'Y'), std=FALSE, fg='black', bg='grey90', tacit=TRUE, ...)

Data Parameters

x is a data frame with at least 1 column.
xc, yc, and bc identify the columns used to plot the x axis, y axis, and bubbles respectively.
names is a vector of variable names used in the plot.
• Easy to switch variables without changing the data.
• User can use same column more than once.
• Out of bounds values return an error.
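
Since bp3way() itself is available only from the authors, the column-selection behavior described above can be illustrated with a hypothetical stand-in built on base R symbols(). The function name bubble_sketch, its bubble scaling, and its error message are assumptions for this sketch and are not the authors' implementation; only the argument names xc, bc, yc, names, fg, and bg follow the usage line.

```r
# Hypothetical stand-in, NOT the authors' bp3way(): it only mimics the
# column-selection arguments described in the slides.
bubble_sketch <- function(x, xc = 1, bc = 2, yc = 3,
                          names = c("X", "B", "Y"),
                          fg = "black", bg = "grey90") {
  cols <- c(xc, bc, yc)
  # Column indices outside the data frame stop with an error.
  if (any(cols < 1 | cols > ncol(x))) {
    stop("xc, bc, and yc must index existing columns of x")
  }
  b <- x[[bc]]
  # Assumed scaling: shift values to be positive, area proportional to value.
  radius <- sqrt(b - min(b) + 0.1 * diff(range(b)))
  symbols(x[[xc]], x[[yc]], circles = radius, inches = 0.2,
          fg = fg, bg = bg, xlab = names[1], ylab = names[3],
          main = paste("Bubble size:", names[2]))
  invisible(cols)
}

# Simulated data for illustration only.
set.seed(3)
d <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
bubble_sketch(d)                                   # columns 1, 2, 3
bubble_sketch(d, xc = 3, bc = 1, yc = 2,
              names = c("c", "a", "b"))            # same data, roles swapped
```

Selecting columns by index is what makes it easy to switch the roles of the three variables without reshaping the data frame.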

Data-sensitive Defaults Help Avoid Bad Plots

• Parameters with data-sensitive defaults:
  – rad.ex: Radius expansion rate.
  – rad.min: Minimum bubble radius.
  – proportion: % of data plotted.
  – margins and grid.
• Other user-specified options include:
  – Plotting a random sample or first % of cases.
  – Standardization of X and Y variables.
  – labels and colors.

Empirical Study

• 3 Plots (Bubbleplot, 3D Scatterplot, Coplot).
  – Between subjects.
  – Within group n = 36.
• 6 Data sets.
  – Within subjects.
• N of subjects = 108.
• N of observations = 108 x 6 = 648.
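
The design counts above can be checked by building the layout as a data frame. This is a sketch of the design structure only; the column names id, graph, and dataset are assumptions, and no study data are involved.

```r
# 3 graph types (between subjects), 36 participants per group.
graph <- rep(c("Bubbleplot", "3D Scatterplot", "Coplot"), each = 36)
subjects <- data.frame(id = seq_along(graph), graph = graph)

# 6 data sets (within subjects): every participant sees all six.
# merge() with no shared column names returns the full cross join.
design <- merge(subjects, data.frame(dataset = 1:6))

nrow(subjects)        # number of participants
nrow(design)          # number of observations
table(design$graph)   # observations per graph type
```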

Four DVs

• Accuracy of interpretation of graphs
  – 0-3 questions answered correctly.
• Confidence in interpretation
  – Range 1-5: average of three 1-5 Likert-scale items.
• Perceived clarity
  – 1-5 Likert-scale item.
• Perceived ease of use
  – 1-5 Likert-scale item.

Univariate Summary

• No floor or ceiling effects, variability in DVs.

Correlations Between Outcomes

              Accuracy   Confidence   Clarity   Ease of Use
Accuracy         1          .061       -.058       -.102
Confidence      .106         1          .497        .471
Clarity        -.118        .586         1          .784
Ease of Use    -.115        .562        .866         1

• Above Diagonal: N = 648 observations.
• Below Diagonal: N = 108 participants.

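Multivariate model fit first

$$y^{*} = \alpha_0 + \alpha_1' \,\mathrm{Data} + \alpha_2' \,(\mathrm{Data} \cdot \mathrm{Graph}) + u_1 \qquad \text{(Level 1)}$$

$$\alpha_0 = \beta_0 + \beta_1' \,\mathrm{Graph} + u_2 \qquad \text{(Level 2)}$$

$$y = \begin{cases} 0 & \text{if } y^{*} \le \tau_1 \\ 1 & \text{if } \tau_1 < y^{*} \le \tau_2 \\ \;\vdots & \\ k & \text{if } \tau_{k-1} < y^{*} \le \tau_k \end{cases} \qquad \text{(Threshold model)}$$

• Third equation not used for confidence DV.
• Full model: Mplus
• Confidence also fit in R using lme() function.
• Nearly identical estimates with R or Mplus.
• Story in interactions, not main effects.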
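
For the confidence DV, substituting Level 2 into Level 1 gives fixed effects for Graph, Data, and the Data by Graph interaction, plus a random intercept per participant. The sketch below shows one way such a model could be specified with lme() from the nlme package. The authors' exact call is not given in the slides, so the formula is an assumption, and the data are simulated noise with no real effects built in; the names id, graph, dataset, and confidence are hypothetical.

```r
library(nlme)

# Simulated layout: 108 participants, 3 graph groups, 6 data sets each.
set.seed(4)
subjects <- data.frame(
  id = factor(1:108),
  graph = factor(rep(c("Bubbleplot", "3D Scatterplot", "Coplot"), each = 36))
)
sim <- merge(subjects, data.frame(dataset = factor(1:6)))

# Simulated outcome: participant intercept (u2) plus residual noise (u1).
u2 <- rnorm(108, sd = 0.5)
sim$confidence <- 3 + u2[as.integer(sim$id)] + rnorm(nrow(sim), sd = 0.7)

# Random-intercept model: Graph, Data, and Data-by-Graph fixed effects.
fit <- lme(confidence ~ graph + dataset + dataset:graph,
           random = ~ 1 | id, data = sim)
anova(fit)
```

The ordinal DVs would additionally need the threshold part of the model, which lme() does not provide.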

Follow-up: Simple Effects

• Shift focus to simple effects because we cannot usefully interpret interactions.
• Protected Wilcoxon-Mann-Whitney exact tests used for Accuracy, Clarity, and Ease of Use DVs.
• Protected t tests used for Confidence DV.
• No one graph consistently better.
• Mostly a story about accuracy.

Accuracy Results

Accuracy Results

Confidence Results

Confidence Results

Perceived Clarity Results

Perceived Clarity Results

Perceived Ease of Use Results

Perceived Ease of Use Results

Tentative Conclusions

• Much remains to be learned about the cognition of these 3 graph types.
  – Coplot may have a slight edge over the other two.
  – But optimal plot seems data dependent.
  – Study included a limited range of data and graph conditions.
  – More detailed perceptual theory is needed to optimize graph design.

• Recommendation for exploratory analysis:
  – Use 2 or more graph types.
  – Cannot predict ahead of time which will work best.
  – Probably useful to look at data more than one way even if one graph were consistently best.

• Recommendation for reporting results:
  – Use model-based graphs if you understand your data well enough to fit a good model.
  – If not, try different model-free graphs and see which seems to work best.

Future Directions

• Identify factors that impact which graph works best.
• Identify design factors that maximize effectiveness of all 3 graph types.
• Increase statistical power:
  – Identify individual difference covariates that account for within condition variance.
  – More sensitive outcome measures.