Descriptive statistics Describing data with numbers: measures of variability

Jan 12, 2016

Joel Bell
Transcript
Page 1: Descriptive statistics Describing data with numbers: measures of variability.

Descriptive statistics

Describing data with numbers: measures of variability

What to describe?

• What is the “location” or “center” of the data?

• How do the data vary?

Measures of Variability

• Range

• Interquartile range

• Variance and standard deviation

• Coefficient of variation

All of these measures are appropriate for measurement data only.

Range

• The difference between the largest and smallest data points.

• Highly affected by outliers.

• Best for symmetric data with no outliers.

What is the range?

[Figure: GPAs of Spring 1998 Stat 250 Students. x-axis: GPA (2.0 to 4.0); y-axis: Frequency (0 to 20).]

Range

Descriptive Statistics

Variable    N     Mean  Median  TrMean   StDev  SE Mean
GPA        92   3.0698  3.1200  3.0766  0.4851   0.0506

Variable  Minimum  Maximum      Q1      Q3
GPA        2.0200   3.9800  2.6725  3.4675

Range = 3.98 - 2.02 = 1.96
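
As a minimal sketch in Python (the language and the helper name `data_range` are illustrative choices, not part of the original slides), the range is simply the maximum minus the minimum. The two values used here are the Minimum and Maximum from the summary output; the 92 individual GPAs are not reproduced in the slides.

```python
def data_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)

# Minimum and maximum GPA taken from the summary output above.
gpa_min, gpa_max = 2.02, 3.98
gpa_range = data_range([gpa_min, gpa_max])

# Rounded to two decimals to hide floating-point noise.
print(round(gpa_range, 2))
```

Because only the two extreme values enter the calculation, a single unusual observation changes the result directly.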

Interquartile range

• The difference between the “third quartile” (75th percentile) and the “first quartile” (25th percentile), so it is the spread of the “middle half” of the values.

• IQR = Q3 - Q1

• Robust to outliers or extreme observations.

• Works well for skewed data.

What is the Interquartile Range?

[Figure: GPAs of Spring 1998 Stat 250 Students. x-axis: GPA (2.0 to 4.0); y-axis: Frequency (0 to 20).]

Interquartile range

Descriptive Statistics

Variable    N     Mean  Median  TrMean   StDev  SE Mean
GPA        92   3.0698  3.1200  3.0766  0.4851   0.0506

Variable  Minimum  Maximum      Q1      Q3
GPA        2.0200   3.9800  2.6725  3.4675

IQR = 3.4675 - 2.6725 = 0.795
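
A rough Python sketch of the same calculation follows. The helper name `iqr` and the small data set are hypothetical. Quartile conventions differ between software packages, so quartiles computed from raw data may not match a given package's output exactly; the `"exclusive"` method of the standard library is one such convention, chosen here only for illustration.

```python
import statistics

def iqr(values):
    """Return Q3 - Q1 using the standard library's 'exclusive' quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1

# Hypothetical small data set, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7]
sample_iqr = iqr(sample)  # Q1 = 2, Q3 = 6

# IQR from the quartiles reported in the GPA summary output.
gpa_iqr = 3.4675 - 2.6725
print(sample_iqr, round(gpa_iqr, 3))
```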

Variance

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

1. Find difference between each data point and mean.

2. Square the differences, and add them up.

3. Divide by one less than the number of data points.
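
The three steps above can be sketched in Python as follows. The function name `sample_variance` and the data values are hypothetical; the standard library's `statistics.variance` is used only as a cross-check.

```python
import statistics

def sample_variance(values):
    """Sample variance, following the three steps literally."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n
    # Step 1: difference between each data point and the mean.
    deviations = [x - mean for x in values]
    # Step 2: square the differences and add them up.
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Step 3: divide by one less than the number of data points.
    return sum_of_squares / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values; mean is 5
s2 = sample_variance(data)       # 32 / 7, about 4.571
matches_stdlib = abs(s2 - statistics.variance(data)) < 1e-12
```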

Variance

• If measuring variance of a population, denoted by σ² (“sigma-squared”).

• If measuring variance of a sample, denoted by s² (“s-squared”).

• Measures (roughly) the average squared deviation of data points from their mean; the divisor is n - 1 rather than n.

• Highly affected by outliers. Best for symmetric data.

• Problem is units are squared.

Standard deviation

• Sample standard deviation is square root of sample variance, and so is denoted by s.

• Units are the original units.

• Measures, roughly, the typical deviation of data points from their mean.

• Also, highly affected by outliers.
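
A short Python sketch of the relationship between the two measures, using hypothetical speeds (not the survey data): the standard deviation is the square root of the variance, which brings the units back from mph squared to mph.

```python
import math
import statistics

speeds_mph = [85, 90, 92, 98, 110]  # hypothetical speeds, mean is 95

s2 = statistics.variance(speeds_mph)  # sample variance, in mph squared
s = math.sqrt(s2)                     # sample standard deviation, in mph

# statistics.stdev computes the same quantity in one call.
same_as_stdev = math.isclose(s, statistics.stdev(speeds_mph))
```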

What is the variance or standard deviation?

[Figure: Fastest Ever Driving Speed, 226 Stat 100 Students, Fall '98 (126 women, 100 men). x-axis: Speed (MPH), 70 to 160.]

Variance or standard deviation

Sex       N    Mean  Median  TrMean  StDev  SE Mean
female  126   91.23   90.00   90.83  11.32     1.01
male    100  106.79  110.00  105.62  17.39     1.74

Sex     Minimum  Maximum     Q1      Q3
female    65.00   120.00  85.00   98.25
male      75.00   162.00  95.00  118.75

Females: s = 11.32 mph and s² = 11.32² = 128.1 mph²

Males: s = 17.39 mph and s² = 17.39² = 302.4 mph²

What is the variance or standard deviation?

[Figure: Fastest Ever Driving Speed, by sex (female, male). x-axis: KPH, 120 to 270.]

Variance or standard deviation

Sex       N    Mean  Median  TrMean  StDev  SE Mean
female  126  152.05  150.00  151.39  18.86     1.68
male    100  177.98  183.33  176.04  28.98     2.90

Sex     Minimum  Maximum      Q1      Q3
female   108.33   200.00  141.67  163.75
male     125.00   270.00  158.33  197.92

Females: s = 18.86 kph and s² = 18.86² = 355.7 kph²

Males: s = 28.98 kph and s² = 28.98² = 839.8 kph²
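
The change from MPH to KPH illustrates a general fact: multiplying every data point by a constant c multiplies the standard deviation by c and the variance by c². The sketch below checks this in Python on hypothetical speeds. The factor 5/3 is an assumption inferred from the tables (for example, a minimum of 65.00 mph becomes 108.33 kph); the slides do not state the conversion used.

```python
import math
import statistics

# Assumed factor, inferred from the tables (65 mph -> 108.33 kph).
MPH_TO_KPH = 5 / 3

speeds_mph = [65, 85, 90, 98, 120]  # hypothetical speeds
speeds_kph = [x * MPH_TO_KPH for x in speeds_mph]

s_mph = statistics.stdev(speeds_mph)
s_kph = statistics.stdev(speeds_kph)
var_mph = statistics.variance(speeds_mph)
var_kph = statistics.variance(speeds_kph)

# The standard deviation scales by c, the variance by c squared.
sd_scales = math.isclose(s_kph, MPH_TO_KPH * s_mph)
var_scales = math.isclose(var_kph, MPH_TO_KPH ** 2 * var_mph)
```

So both s and s² depend on the units of measurement, which makes them awkward for comparing data recorded on different scales.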

Coefficient of Variation

• Ratio of sample standard deviation to sample mean, multiplied by 100: CV = (s / mean) x 100.

• Measures relative variability, that is, variability relative to the magnitude of the data.

• Unitless, so good for comparing variation between two groups.

Coefficient of variation (MPH)

Sex       N    Mean  Median  TrMean  StDev  SE Mean
female  126   91.23   90.00   90.83  11.32     1.01
male    100  106.79  110.00  105.62  17.39     1.74

Sex     Minimum  Maximum     Q1      Q3
female    65.00   120.00  85.00   98.25
male      75.00   162.00  95.00  118.75

Females: CV = (11.32/91.23) x 100 = 12.4

Males: CV = (17.39/106.79) x 100 = 16.3

Coefficient of variation (KPH)

Sex       N    Mean  Median  TrMean  StDev  SE Mean
female  126  152.05  150.00  151.39  18.86     1.68
male    100  177.98  183.33  176.04  28.98     2.90

Sex     Minimum  Maximum      Q1      Q3
female   108.33   200.00  141.67  163.75
male     125.00   270.00  158.33  197.92

Females: CV = (18.86/152.05) x 100 = 12.4

Males: CV = (28.98/177.98) x 100 = 16.3
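
The four calculations above can be reproduced with a small Python helper. The function name `coefficient_of_variation` is hypothetical; the inputs are the StDev and Mean values from the two tables. The results agree in both units, which is the sense in which the CV is unitless.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the CV: (standard deviation / mean) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100

# StDev and Mean from the MPH table.
cv_female_mph = coefficient_of_variation(11.32, 91.23)
cv_male_mph = coefficient_of_variation(17.39, 106.79)

# StDev and Mean from the KPH table.
cv_female_kph = coefficient_of_variation(18.86, 152.05)
cv_male_kph = coefficient_of_variation(28.98, 177.98)

print(round(cv_female_mph, 1), round(cv_female_kph, 1))
print(round(cv_male_mph, 1), round(cv_male_kph, 1))
```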

The most appropriate measure of variability depends on … the shape of the data’s distribution.

Choosing Appropriate Measure of Variability

• If data are symmetric, with no serious outliers, use range and standard deviation.

• If data are skewed, and/or have serious outliers, use IQR.

• If comparing variation across two data sets, use coefficient of variation.
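
To see why outliers drive this choice, the following Python sketch compares the three measures on a hypothetical data set before and after its largest value is replaced by an extreme one. The helper name `spread_summary`, the data, and the `"exclusive"` quartile convention are illustrative assumptions.

```python
import statistics

def spread_summary(values):
    """Return the range, IQR, and sample standard deviation of values."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),
    }

clean = [1, 2, 3, 4, 5, 6, 7]           # hypothetical, symmetric
with_outlier = [1, 2, 3, 4, 5, 6, 100]  # largest value replaced by an outlier

before = spread_summary(clean)
after = spread_summary(with_outlier)

for name in ("range", "iqr", "sd"):
    print(name, round(before[name], 2), round(after[name], 2))
```

In this example the range and the standard deviation grow sharply once the outlier is present, while the IQR is unchanged because the quartiles do not depend on the most extreme values.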