
NUMERICAL DESCRIPTIVE STATISTICS Measures of Variability.

Dec 21, 2015

Page 1: NUMERICAL DESCRIPTIVE STATISTICS Measures of Variability.

NUMERICAL DESCRIPTIVE STATISTICS

Measures of Variability


Another Description of the Data -- Variability

• For Data Set A below, the mean of the 10 observations is 2.60.

SET A: 4,2,3,3,2,2,1,4,3,2

• But each of the following two data sets with 10 observations also has a mean of 2.60

SET B: 2,2,2,2,3,3,3,3,3,3

SET C: 0,0,1,1,4,4,4,4,4,4

• Although sets A, B, and C all have the same mean, the “spread” of the data differs from set to set.
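The claim above can be checked with a minimal Python sketch. The variable names are illustrative and not from the slides, and the high-minus-low difference is used here only as a rough indicator of spread.

```python
from statistics import mean

# Data sets copied from the slide
set_a = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]
set_b = [2, 2, 2, 2, 3, 3, 3, 3, 3, 3]
set_c = [0, 0, 1, 1, 4, 4, 4, 4, 4, 4]

for name, data in [("A", set_a), ("B", set_b), ("C", set_c)]:
    # Every set has the same mean, but the high-minus-low spread differs
    print(name, mean(data), max(data) - min(data))
```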


The “Spread” of the Data

[Figure: three bar charts of grade frequencies (x-axis: Grades, 0 to 4) for Data Set A, Data Set B, and Data Set C. Data Set C has the most “spread”; Data Set B has the least “spread”.]


Measures of Variability

• Population
– Variance σ²
– Standard Deviation σ

• Sample
– Range
– Variance s²
– Standard Deviation s


The Range

• When we are talking about a sample, the range is the difference between the highest and lowest observation

• In the sample there were some A’s (4’s), and the lowest value in the sample was a D (1)
– Sample range = 4 - 1 = 3


Another Approach to Variability

• The range only takes into account the two most extreme values

• A better approach
– Look at the variability of all the data

• In some sense find the “average” deviation from the mean

• The value of an observation minus the mean can be positive or negative

• The plusses and minuses cancel each other out giving an average value of 0

– Need another measure
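The cancellation described above can be seen numerically. This is a small sketch using the 10-observation Data Set A; the names `sample`, `x_bar`, and `avg_deviation` are illustrative assumptions.

```python
from statistics import mean

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A from the slides
x_bar = mean(sample)

# Signed deviations from the mean: some positive, some negative
deviations = [x - x_bar for x in sample]
avg_deviation = sum(deviations) / len(deviations)

# Rounded to suppress tiny floating-point residue
print(round(avg_deviation, 10))
```

Because the positive and negative deviations offset each other, the average signed deviation is zero (up to floating-point rounding) for any data set, so it carries no information about spread.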


How to Average Only Positive Deviations

• MEAN ABSOLUTE DEVIATION (MAD)
– Averages the absolute values of these differences
– Used in quality control/inventory analyses
– But this quantity is hard to work with algebraically and analytically

• POPULATION VARIANCE (σ²)
– Averages the squares of the differences from the mean
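The two measures can be compared side by side. In this sketch the 10 values of Data Set A are treated as if they were a complete population, which is an assumption made only for illustration.

```python
from statistics import mean

data = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A, treated as a population here
mu = mean(data)

# Mean absolute deviation: average of |x - mu|
mad = sum(abs(x - mu) for x in data) / len(data)

# Population variance: average of (x - mu)^2
pop_var = sum((x - mu) ** 2 for x in data) / len(data)

print(round(mad, 4), round(pop_var, 4))
```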


Population Variance Formulas

(1) Definition:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

(2) Shortcut:

$$\sigma^2 = \frac{\sum x_i^2}{N} - \mu^2$$
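As a hedged check that the definition and the shortcut agree, the sketch below applies both to Data Set C, treated as a small population for illustration. The function names are hypothetical; `statistics.pvariance` from the Python standard library is used as an independent reference.

```python
from statistics import mean, pvariance

def pop_var_definition(xs):
    """Average squared deviation from the population mean."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def pop_var_shortcut(xs):
    """Mean of the squares minus the square of the mean."""
    mu = mean(xs)
    return sum(x * x for x in xs) / len(xs) - mu ** 2

population = [0, 0, 1, 1, 4, 4, 4, 4, 4, 4]  # Data Set C, assumed to be a full population

print(round(pop_var_definition(population), 4))
print(round(pop_var_shortcut(population), 4))
print(round(pvariance(population), 4))
```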


EXAMPLE: Calculation of σ²

Using the numbers from the population of 2000 GPA’s: 4,2,1,3,3,3,2,… 2

$$\sigma^2 = \frac{(4-2.39)^2 + (2-2.39)^2 + (1-2.39)^2 + \dots + (2-2.39)^2}{2000} = .92$$

$$\sigma^2 = \frac{(4)^2 + (2)^2 + (1)^2 + \dots + (2)^2}{2000} - (2.39)^2 = .92$$


Standard Deviation

• But the unit of measurement for σ² is:
– Square Grade Points (???)

• What is a square grade point?

• To get back to the original units (grade points), take the square root of σ²

• STANDARD DEVIATION (σ) – the square root of the variance, σ²

$$\sigma = \sqrt{\sigma^2}$$


Calculation of the Standard Deviation (σ)

• For the grade point data:

$$\sigma = \sqrt{.92} = .959$$


Estimating σ²

• SAMPLE VARIANCE (s²)
– The best estimate for σ² is s²

• s² is found by using the sample data and using the formula for σ² except:
– Use x̄ rather than μ
– Divide by (n − 1) in the denominator


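Sample Variance Formulas

(1) Definition:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$$

(2) Shortcut 1:

$$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n-1}$$

(3) Shortcut 2:

$$s^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n-1}$$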
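The three sample variance formulas can be written out directly. This is a minimal sketch with hypothetical function names, applied to the 10-observation sample from Data Set A and compared against `statistics.variance` from the Python standard library.

```python
from statistics import variance

def s2_definition(xs):
    """Sum of squared deviations from x-bar, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

def s2_shortcut1(xs):
    """Uses the sum of squares and the squared sum; no mean needed."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

def s2_shortcut2(xs):
    """Uses the sum of squares and n times x-bar squared."""
    n = len(xs)
    x_bar = sum(xs) / n
    return (sum(x * x for x in xs) - n * x_bar ** 2) / (n - 1)

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A

for f in (s2_definition, s2_shortcut1, s2_shortcut2):
    print(f.__name__, round(f(sample), 4))
print("statistics.variance", round(variance(sample), 4))
```

All three forms are algebraically equivalent, so they should agree up to floating-point rounding.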


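Calculations for s²

• The data from the sample is: 4,2,3,3,2,2,1,4,3,2

$$\bar{x} = 2.6$$

$$s^2 = \frac{(4-2.6)^2 + (2-2.6)^2 + (3-2.6)^2 + \dots + (2-2.6)^2}{9} = .9333$$

$$s^2 = \frac{\left((4)^2 + (2)^2 + (3)^2 + \dots + (2)^2\right) - \frac{(4+2+3+\dots+2)^2}{10}}{9} = .9333$$

$$s^2 = \frac{\left((4)^2 + (2)^2 + (3)^2 + \dots + (2)^2\right) - 10(2.6)^2}{9} = .9333$$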


Sample Standard Deviation, s

• The best estimate for σ is denoted: s

• It is called the sample standard deviation

• s is found by taking the square root of s²

$$s = \sqrt{s^2}$$

For this example,

$$s = \sqrt{.9333} = .9661$$
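A short sketch of the same step in Python, assuming the 10-observation sample 4,2,3,3,2,2,1,4,3,2 used in the slides. It takes the square root of the sample variance and compares the result with `statistics.stdev`.

```python
import math
from statistics import stdev, variance

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]

s_squared = variance(sample)   # sample variance, n - 1 in the denominator
s = math.sqrt(s_squared)       # sample standard deviation

print(round(s_squared, 4), round(s, 4))
print(round(stdev(sample), 4))  # library value for comparison
```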


s² for Grouped Data

• For the grade point example
– 4 occurs 2 times
– 3 occurs 3 times
– 2 occurs 4 times
– 1 occurs 1 time

• To calculate the sample variance, s², rather than write the term down each time:
– Multiply the squared deviations by their class frequencies


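Calculation of s² - Grouped Data

$$s^2 = \frac{2(4-2.6)^2 + 3(3-2.6)^2 + 4(2-2.6)^2 + 1(1-2.6)^2}{9} = .9333$$

$$s^2 = \frac{\left(2(4)^2 + 3(3)^2 + 4(2)^2 + 1(1)^2\right) - \frac{\left(2(4) + 3(3) + 4(2) + 1(1)\right)^2}{10}}{9} = .9333$$

$$s^2 = \frac{\left(2(4)^2 + 3(3)^2 + 4(2)^2 + 1(1)^2\right) - 10(2.6)^2}{9} = .9333$$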
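The grouped calculation can be sketched with a frequency table. The dictionary `freq` (grade value mapped to its count) is an illustrative representation, not something prescribed by the slides.

```python
# Grade value -> number of times it occurs in the sample
freq = {4: 2, 3: 3, 2: 4, 1: 1}

n = sum(freq.values())                                # total observations
x_bar = sum(value * count for value, count in freq.items()) / n

# Weight each squared deviation by its class frequency
s_squared = sum(count * (value - x_bar) ** 2
                for value, count in freq.items()) / (n - 1)

print(n, x_bar, round(s_squared, 4))
```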


Empirical Rule: Interpreting s
(Mound-Shaped Distribution)

• If data forms a mound-shaped distribution
– Within 1s from the mean
  • Approximately 68% of the measurements
– Within 2s from the mean
  • Approximately 95% of the measurements
– Within 3s from the mean
  • Approximately all of the measurements
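One way to see the rule in action is to simulate mound-shaped data and count observations. Everything about the simulated data below (normal draws, a mean of 2.5, a spread of 0.8, a fixed seed) is an assumption chosen only for illustration; `fraction_within` is a hypothetical helper.

```python
import random
from statistics import mean, stdev

def fraction_within(data, k):
    """Share of observations within k sample standard deviations of the mean."""
    x_bar, s = mean(data), stdev(data)
    return sum(abs(x - x_bar) <= k * s for x in data) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
simulated = [rng.gauss(2.5, 0.8) for _ in range(10_000)]  # hypothetical mound-shaped data

for k in (1, 2, 3):
    print(k, round(fraction_within(simulated, k), 3))
```

With data like this, the printed shares should land near 68%, 95%, and almost 100%; real data that is only roughly mound-shaped will match less closely.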


Chebychev’s Inequality: Interpreting s
(Any Distribution)

• If data is not mound-shaped (or shape is unknown)
– Within 2s from the mean
  • At least 75% of the measurements
– Within 3s from the mean
  • At least 88.9% of the measurements
– Within ks from the mean (k > 1)
  • At least 1 − 1/k² of the measurements
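The bound 1 − 1/k² is easy to tabulate. The sketch below uses a hypothetical helper, `chebyshev_lower_bound`, and compares the bound with the observed share for Data Set C, which is clearly not mound-shaped.

```python
from statistics import mean, stdev

def chebyshev_lower_bound(k):
    """Minimum share of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

data_c = [0, 0, 1, 1, 4, 4, 4, 4, 4, 4]  # Data Set C from the slides
x_bar, s = mean(data_c), stdev(data_c)

for k in (2, 3):
    observed = sum(abs(x - x_bar) <= k * s for x in data_c) / len(data_c)
    # The observed share should be at least as large as the bound
    print(k, round(chebyshev_lower_bound(k), 3), observed)
```

The bound is a guaranteed minimum, so the observed share is often much higher than it, as it is here.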


Coefficient of Variation

• Another measure of variability that is frequently used to compare different data sets (even if measured in different units) is the: Coefficient of Variation (CV)

CV = (Standard Deviation/Mean) x 100%
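A minimal sketch of the CV formula, assuming the sample standard deviation is the one intended (the slide does not say whether the sample or population version applies). The function name is hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Standard deviation as a percentage of the mean (sample version assumed)."""
    return stdev(data) / mean(data) * 100

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A
cv = coefficient_of_variation(sample)
print(round(cv, 1))
```

Because the units cancel in the ratio, the result is a plain percentage, which is what makes comparisons across differently measured data sets possible.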


Range Approximation for σ

• If data is relatively mound-shaped a “good” approximation for σ is:

σ ≈ (range)/4

Sometimes, when one is more certain that the sample range captures the entire population of data, statisticians use

σ ≈ (range)/6
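As a rough illustration, the sketch below applies both range approximations to the 10-observation sample from the slides and prints the computed sample standard deviation next to them. With so few observations the approximations are only a coarse guide.

```python
from statistics import stdev

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]

data_range = max(sample) - min(sample)  # highest minus lowest observation
approx_range_4 = data_range / 4         # range/4 approximation
approx_range_6 = data_range / 6         # range/6 approximation

print(approx_range_4, approx_range_6, round(stdev(sample), 4))
```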


Using Excel

• Suppose population data is in cells A2 to A2001
– Population variance (σ²) =VARP(A2:A2001)
– Population standard dev. (σ) =STDEVP(A2:A2001)

• Suppose sample data is in cells A2 to A11
– Sample variance (s²) =VAR(A2:A11)
– Sample standard dev. (s) =STDEV(A2:A11)

• Data Analysis


[Screenshot: Excel Data Analysis dialog, with callouts]
– Where data values are stored
– Check Labels
– Check both: Summary Statistics, Confidence Level
– Enter Name of Output Worksheet


[Screenshot: Excel output worksheet, with callouts]
– Drag to make Column A wider
– Sample Standard Deviation
– Sample Variance
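For readers working outside Excel, Python's standard-library `statistics` module offers functions that divide by N or by n − 1 in the same way. The pairing with the Excel function names in the comments is offered as a rough guide, not as part of the original slides.

```python
from statistics import pstdev, pvariance, stdev, variance

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # the 10-observation sample from the slides

# Population versions divide by N (compare VARP / STDEVP)
print("pvariance:", round(pvariance(sample), 4))
print("pstdev:   ", round(pstdev(sample), 4))

# Sample versions divide by n - 1 (compare VAR / STDEV)
print("variance: ", round(variance(sample), 4))
print("stdev:    ", round(stdev(sample), 4))
```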


Review

• Measures of variability for Populations and Samples
– Range
– Variance
– Standard Deviation

• Interpretation of standard deviation
– Empirical Rule for “mound-shaped” data
– Chebychev’s Inequality for “other” data

• Excel
– Functions
– Data Analysis