INTRODUCTION TO CLINICAL RESEARCH How To Make A Bad Plot Karen Bandeen-Roche, Ph.D. July 13, 2010

Dec 30, 2015


INTRODUCTION TO CLINICAL RESEARCH

How To Make A Bad Plot

Karen Bandeen-Roche, Ph.D.

July 13, 2010


How to display data badly

Karl W Broman

Department of Biostatistics and Medical Informatics

University of Wisconsin


Using Microsoft Excel to obscure your data and annoy your readers

Karl W Broman

Department of Biostatistics

http://www.biostat.jhsph.edu/~kbroman


Inspiration

This lecture was inspired by

H Wainer (1984) How to display data badly. American Statistician 38(2):137-147

Dr. Wainer was the first to elucidate the principles of the bad display of data.

The now widespread use of Microsoft Excel has resulted in remarkable advances in the field.


General principles

The aim of good data graphics: Display data accurately and clearly.

Some rules for displaying data badly:

– Display as little information as possible.

– Obscure what you do show (with chart junk).

– Use pseudo-3d and color gratuitously.

– Label badly.

– Use a poorly chosen scale.

– Ignore sig figs.


Displaying data well

• Be accurate and clear.

• Let the data speak.

– Show as much information as possible, taking care not to obscure the message.

• Science not sales.

– Avoid unnecessary frills — esp. gratuitous 3d.

• In tables, every digit should be meaningful. Don’t drop ending 0’s.
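The point about trailing zeros can be illustrated with a minimal Python sketch. The helper name `fixed_decimals` and the sample values are hypothetical, chosen only to show the difference between a fixed-precision format and one that silently drops an ending 0.

```python
def fixed_decimals(values, decimals=1):
    """Format every value with the same number of decimals, keeping ending 0's."""
    return [f"{v:.{decimals}f}" for v in values]


# Illustrative values only (assumed for this sketch).
column = [40.0, 40.5]

# The general-purpose "g" format drops the trailing zero, so the two
# entries appear to have been measured to different precision.
dropped = [f"{v:g}" for v in column]

# A fixed-precision format keeps every entry at one decimal place.
kept = fixed_decimals(column)

print(dropped)
print(kept)
```

Choosing `decimals` is still a judgment call: it should match the precision the measurements actually support.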


Displaying data well

• Show “typical”, “average” values

• Convey extent of “spread”, “variability” in values

• Compare groups clearly

• Label explicitly


Supersaturation of Bile Data Set

• Einarsson K, et al (NEJM 313:277, 1985; reprinted in D-S & T, p. 28 1st ed)

• Supersaturation of bile with cholesterol is necessary for cholesterol gall stones

• Female gender and increasing age are risk factors for gall stones

• Is either gender or age associated with percentage cholesterol saturation of bile?

• Cross-sectional data on 60 healthy Swedish subjects (31 men, 29 women) who were not obese


Bile Data Set

        Men                    Women
 ID   % Saturation      ID   % Saturation
  1        40           32        35
  2        47           33        52
  3        52           34        55
  4        56           35        58
  5        57           36        65
  6        58           37        66
  7        65           38        69
  8        66           39        73
  9        67           40        75
 10        73           41        76
 11        74           42        76
 12        78           43        77
 13        79           44        80
 14        80           45        82
 15        80           46        84
 16        86           47        84
 17        86           48        86
 18        87           49        87
 19        88           50        89
 20        88           51        91
 21        88           52        98
 22        90           53       107
 23       106           54       116
 24       106           55       120
 25       110           56       123
 26       110           57       127
 27       111           58       128
 28       112           59       142
 29       118           60       146
 30       123
 31       137


Measures of the “Average”

• “Average” -- typical or representative value; where the distribution is “centered”

• Different measures of the center -- usually all the same for symmetric distributions (ones that look the same to the right and left of center)

• Median -- value such that half the observations are less than it and half are greater than it (50th percentile)

Males: 86%   Females: 84%

• Mode -- value where the distribution achieves its maximum; the most likely value

Males: 80-90% (85%)   Females: 80-90% (85%)

• Mean -- sum of values divided by the number of values = $\bar{x}$

Males: 84.5%   Females: 88.5%
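As a check on the figures above, here is a minimal Python sketch that recomputes the three measures from the Bile Data Set. The function name `summarize` is hypothetical, and reading the mode off 10-point bins is an assumption based on the "80-90%" wording.

```python
from collections import Counter
from statistics import mean, median

# % cholesterol saturation of bile, transcribed from the Bile Data Set table.
men = [40, 47, 52, 56, 57, 58, 65, 66, 67, 73, 74, 78, 79, 80, 80, 86,
       86, 87, 88, 88, 88, 90, 106, 106, 110, 110, 111, 112, 118, 123, 137]
women = [35, 52, 55, 58, 65, 66, 69, 73, 75, 76, 76, 77, 80, 82, 84, 84,
         86, 87, 89, 91, 98, 107, 116, 120, 123, 127, 128, 142, 146]


def summarize(values, bin_width=10):
    """Return the median, the fullest bin (binned mode), and the mean."""
    # Assumption: the mode is taken over bins of width 10, not raw values.
    bins = Counter(v // bin_width * bin_width for v in values)
    fullest_bin = bins.most_common(1)[0][0]
    return {
        "median": median(values),
        "mode_bin": (fullest_bin, fullest_bin + bin_width),
        "mean": round(mean(values), 1),
    }


for label, values in (("Men", men), ("Women", women)):
    print(label, summarize(values))
```

Here the median is higher for men while the mean is higher for women, a reminder that the measures of center need not agree when a distribution is not symmetric.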


Measures of Spread

• Spread -- variability among the observations

• Different measures of spread, like averages, represent distinct aspects of the distribution

• Interquartile range

– 75th percentile minus 25th percentile -- range of values that contains the middle 50% of the data

Men: 106 - 66 = 40.0%   Women: 111.5 - 71 = 40.5%
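The interquartile ranges above can be recomputed with a short Python sketch. Quartile conventions differ between packages. The default "exclusive" method of `statistics.quantiles` is used here as an assumption, and it happens to reproduce the quartiles shown. The function name `iqr` is hypothetical.

```python
from statistics import quantiles

# % cholesterol saturation of bile, transcribed from the Bile Data Set table.
men = [40, 47, 52, 56, 57, 58, 65, 66, 67, 73, 74, 78, 79, 80, 80, 86,
       86, 87, 88, 88, 88, 90, 106, 106, 110, 110, 111, 112, 118, 123, 137]
women = [35, 52, 55, 58, 65, 66, 69, 73, 75, 76, 76, 77, 80, 82, 84, 84,
         86, 87, 89, 91, 98, 107, 116, 120, 123, 127, 128, 142, 146]


def iqr(values):
    """Return (25th percentile, 75th percentile, interquartile range)."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = quantiles(values, n=4)
    return q1, q3, q3 - q1


for label, values in (("Men", men), ("Women", women)):
    q1, q3, spread = iqr(values)
    print(f"{label}: {q3} - {q1} = {spread}")
```

Other quartile rules (for example `method="inclusive"`) would give slightly different cut points for samples this small.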


Measures of Spread (cont’d)

• Variance = (standard deviation)² = mean squared deviation from the mean

$$\text{variance} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

standard deviation = square root of variance

Men: (24.0%)² = 574 (%²)   Women: (27.6%)² = 761 (%²)

(These figures correspond to dividing by n - 1 rather than n.)
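A minimal Python sketch, using the Bile Data Set values, shows how the choice of divisor affects the result. Dividing by n - 1 (`statistics.variance`) reproduces the variances 574 and 761, with standard deviations near 24.0 and 27.6. Dividing by n (`statistics.pvariance`) gives somewhat smaller values. The function name `spread` is hypothetical.

```python
from statistics import pvariance, stdev, variance

# % cholesterol saturation of bile, transcribed from the Bile Data Set table.
men = [40, 47, 52, 56, 57, 58, 65, 66, 67, 73, 74, 78, 79, 80, 80, 86,
       86, 87, 88, 88, 88, 90, 106, 106, 110, 110, 111, 112, 118, 123, 137]
women = [35, 52, 55, 58, 65, 66, 69, 73, 75, 76, 76, 77, 80, 82, 84, 84,
         86, 87, 89, 91, 98, 107, 116, 120, 123, 127, 128, 142, 146]


def spread(values):
    """Return variance with both divisors and the sample standard deviation."""
    return {
        "variance_n_minus_1": round(variance(values)),  # sum of squares / (n - 1)
        "variance_n": round(pvariance(values)),         # sum of squares / n
        "sd_n_minus_1": round(stdev(values), 1),        # sqrt of the n - 1 variance
    }


for label, values in (("Men", men), ("Women", women)):
    print(label, spread(values))
```

With about 30 observations per group the two divisors differ by roughly 3%, so the distinction matters little for display but should be stated when reporting.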


Some common data displays

• Displays for continuous data

– Histograms / Stem and leaf plots

– Boxplots

• Displays for categorical data: tables

• Displays for relationships of two variables (on same “people”) to each other

– Continuous data: scatterplots

– Categorical data: cross-tabulations


Stem and Leaf Plot: % Saturation, Men

 4* 07
 5* 2678
 6* 567
 7* 3489
 8* 00667888
 9* 0
10* 66
11* 00128
12* 3
13* 7
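The stem-and-leaf display for the men can be rebuilt from the raw values with a few lines of Python. This is a sketch under the assumption that stems are tens and leaves are units. The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# % cholesterol saturation of bile for the men, from the Bile Data Set table.
men = [40, 47, 52, 56, 57, 58, 65, 66, 67, 73, 74, 78, 79, 80, 80, 86,
       86, 87, 88, 88, 88, 90, 106, 106, 110, 110, 111, 112, 118, 123, 137]


def stem_and_leaf(values):
    """Return one text line per stem (tens digit), leaves (units) in order."""
    leaves_by_stem = defaultdict(list)
    for v in sorted(values):
        leaves_by_stem[v // 10].append(v % 10)
    lines = []
    # Walk every stem in the range so that empty stems still get a row.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(d) for d in leaves_by_stem.get(stem, []))
        lines.append(f"{stem:>2}* {leaves}")
    return lines


print("\n".join(stem_and_leaf(men)))
```

Unlike a histogram, the display keeps every individual value recoverable while still showing the shape of the distribution.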


Boxplots of Bile Data

[Figure: boxplots of % saturation by gender (M, W); vertical axis from 40 to 140.]
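The numbers a boxplot encodes can be computed directly. The sketch below assumes the common Tukey convention (whiskers extend to the most extreme points within 1.5 × IQR of the quartiles) and the default quartile rule of `statistics.quantiles`. The software that drew the original figure may use a different convention. The function name `box_stats` is hypothetical.

```python
from statistics import quantiles

# % cholesterol saturation of bile, transcribed from the Bile Data Set table.
men = [40, 47, 52, 56, 57, 58, 65, 66, 67, 73, 74, 78, 79, 80, 80, 86,
       86, 87, 88, 88, 88, 90, 106, 106, 110, 110, 111, 112, 118, 123, 137]
women = [35, 52, 55, 58, 65, 66, 69, 73, 75, 76, 76, 77, 80, 82, 84, 84,
         86, 87, 89, 91, 98, 107, 116, 120, 123, 127, 128, 142, 146]


def box_stats(values):
    """Return the box, whisker ends, and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4)
    step = 1.5 * (q3 - q1)
    low_fence, high_fence = q1 - step, q3 + step
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "whisker_low": inside[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


for label, values in (("M", men), ("W", women)):
    print(label, box_stats(values))
```

Under these assumptions neither group has points beyond the fences, so the whiskers simply run to the minimum and maximum of each group.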


Scatterplot: SBP vs DBP

[Figure: scatterplot with axes labeled SBP and DBP.]


Some really bad plots


Example 1


Example 2

[Figure: Distribution of genotypes. AA 21%, AB 48%, BB 22%, missing 9%.]


Example 3


Example 4


Example 5


Example 6


Example 7


Example 8


Main points once again

• Be accurate and clear.

• Let the data speak.

– Show as much data as possible, taking care not to obscure the message.

• Science not sales.

– Avoid unnecessary frills

– Go for the cleanest display that conveys the necessary info

• In tables, every digit should be meaningful. Don’t drop ending 0’s.


Displaying data well

• Show “typical”, “average” values

• Convey extent of “spread”, “variability” in values

• Compare groups clearly

• Label explicitly


Further reading

• ER Tufte (1983) The visual display of quantitative information. Graphics Press.

• ER Tufte (1990) Envisioning information. Graphics Press.

• ER Tufte (1997) Visual explanations. Graphics Press.

• WS Cleveland (1993) Visualizing data. Hobart Press.

• WS Cleveland (1994) The elements of graphing data. CRC Press.