Chapter 3 Averages and Variations 3.1 Measures of Central Tendency
Dec 29, 2015

Duane Sanders
Transcript
Page 1: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Chapter 3 Averages and Variations

3.1 Measures of Central Tendency

Page 2: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Mode, Median and Mean

For what kinds of data can we compute the mode, median and mean?

Quantitative data can have a mode, median and mean.

Qualitative data can have a mode.

Page 3: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Mode

The value that occurs most frequently is the mode. Some books describe the mode as the “hump” or local high point in a histogram, which also reflects how frequently a value occurs.
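As a small illustration of the definition above, here is a sketch using Python's standard `statistics` module. The data values are made up for the example and are not from the class. It also shows that a mode can be found for qualitative data.

```python
from statistics import multimode

# Hypothetical data, not taken from the slides.
pulses = [68, 72, 72, 75, 80, 80, 91]
colors = ["red", "blue", "red", "green"]

# multimode returns every value tied for the highest frequency,
# in the order the values first appear in the data.
pulse_modes = multimode(pulses)   # two values tie, so this data has two modes
color_modes = multimode(colors)   # works for qualitative data as well

print(pulse_modes)
print(color_modes)
```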

Page 4: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Median

The median of a data set is the middle data value.

To find it, order the data from smallest to largest; the value in the middle position is the median. For a data set of n values, the middle position is

$$\frac{n + 1}{2}$$

Does anyone detect a potential problem?
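One way to look at the question above: when n is even, (n + 1)/2 is not a whole number, so there is no single middle value. The usual convention is to average the two values on either side of that position. A minimal sketch, with made-up data:

```python
def median(values):
    """Return the middle value of the data after sorting."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 is a whole number;
        # counting from 0, that is index n // 2.
        return ordered[middle]
    # Even n: (n + 1) / 2 falls between two positions,
    # so average the two neighboring values.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_result = median([7, 3, 9])        # middle position is (3 + 1) / 2 = 2
even_result = median([7, 3, 9, 5])    # middle position is (4 + 1) / 2 = 2.5
print(odd_result, even_result)
```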

Page 5: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Mean

You are used to hearing about the “average” on a test. The technical term is the mean.

$$\text{mean} = \frac{\text{sum of data values}}{\text{number of data values}}$$

Trimmed mean is a term for a mean computed after a percentage of the data values are disregarded. A 5% trimmed mean is one where the top 5% and the bottom 5% of the values are thrown out before computing the mean.
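The following sketch computes a mean and a trimmed mean. The scores are hypothetical, and rounding the number of discarded values down to a whole number is an assumption; textbooks and software may round differently.

```python
def mean(values):
    """Sum of data values divided by the number of data values."""
    return sum(values) / len(values)

def trimmed_mean(values, percent):
    """Mean after discarding `percent` of the values from each end of the sorted data."""
    ordered = sorted(values)
    # Assumption: round the number of discarded values down to a whole number.
    cut = int(len(ordered) * percent / 100)
    kept = ordered[cut:len(ordered) - cut]
    return mean(kept)

# Hypothetical scores; the 10 and the 100 are extreme values.
scores = [10, 70, 72, 75, 78, 80, 82, 85, 88, 100]
plain = mean(scores)
trimmed = trimmed_mean(scores, 10)   # drops 1 value from each end
print(plain, trimmed)
```

With these numbers the trimmed mean is higher than the plain mean, because the very low score of 10 no longer pulls the result down.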

Page 7: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Pulse Data

Let's find the mode, the median and the mean of the pulse data from the first day of class.

We just found the population mean (μ) rather than the sample mean ($\bar{x}$).

What is the difference then between μ and $\bar{x}$?

Page 8: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Weighted Averages

Final exams are counted in as part of a weighted average. How is that done?

$$\text{weighted average} = \frac{\sum x w}{\sum w}$$

That is, multiply each data value by its weight, add up those products, then divide by the sum of the weights (typically 1).
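A short sketch of that calculation. The grading scheme (homework 20%, midterm 30%, final exam 50%) and the scores are hypothetical, chosen only to illustrate the formula.

```python
def weighted_average(values, weights):
    """Sum of (value * weight) divided by the sum of the weights."""
    total = sum(x * w for x, w in zip(values, weights))
    return total / sum(weights)

# Hypothetical grading scheme: homework 20%, midterm 30%, final exam 50%.
scores = [90, 80, 70]
weights = [0.2, 0.3, 0.5]

grade = weighted_average(scores, weights)
print(round(grade, 2))
```

Because the function divides by the sum of the weights, it also works when the weights are given as 20, 30 and 50 instead of 0.2, 0.3 and 0.5.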

Page 10: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

3.2 Measures of Variation

Page 11: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

While knowing the mean is important, there are other things you can measure from the data.

These measures tell you about the spread of the data.

Range – difference between largest and smallest value of a data distribution.

Page 12: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance

Variance = measure of how data tends to spread around an expected value (the mean)

Each data point = x
Mean = $\bar{x}$
Deviation = x – $\bar{x}$
Sample size = n
Variance = $s^2$
Standard Deviation = s

Page 13: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance (cont)

Defining Formula

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Computation Formula

$$s^2 = \frac{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}}{n - 1}$$

Page 15: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance (cont)

To find the standard deviation, just take the square root of the variance.

The computational formula tends to be a little easier to do by hand, but we will practice both.

These two formulas ARE the same.
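As a quick check that the defining formula and the computation formula agree, here is a sketch that applies both to the same sample. The five data values are hypothetical.

```python
from math import sqrt

def variance_defining(data):
    """Sample variance: sum of squared deviations from the mean, over n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

def variance_computation(data):
    """Sample variance: (sum of x^2 minus (sum of x)^2 / n), over n - 1."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x ** 2 for x in data)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

data = [2, 4, 4, 6, 9]   # hypothetical sample
v_defining = variance_defining(data)
v_computation = variance_computation(data)
s = sqrt(v_defining)     # standard deviation is the square root of the variance

print(v_defining, v_computation, s)
```

On a computer the two results can differ by a tiny rounding error, so it is safer to compare them with a small tolerance than with exact equality.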

Page 16: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance (cont)

Let's find the variance and the standard deviation of the pulse data, using both formulas.

Page 17: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance (cont)

If an entire population is used instead of a sample, the notation is different and the variance is divided by N rather than n – 1, but the method is otherwise the same.

Each data point = x
Mean = µ
Deviation = x – µ
Population size = N
Variance = $\sigma^2$
Standard Deviation = σ

Page 18: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance (cont)

Defining Formula

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
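To see how the population and sample versions differ on the same numbers, this sketch uses Python's standard `statistics` module, where `pvariance` and `pstdev` divide by N while `variance` and `stdev` divide by n – 1. The data values are hypothetical.

```python
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 6, 9]   # hypothetical values

# Treating the data as an entire population: divide by N.
sigma_squared = pvariance(data)
sigma = pstdev(data)

# Treating the same data as a sample: divide by n - 1.
s_squared = variance(data)
s = stdev(data)

print(sigma_squared, s_squared)
print(sigma, s)
```

The sample variance is always the larger of the two, since the same sum of squared deviations is divided by a smaller number.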

Page 19: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance (cont)

Coefficient of Variation (CV) expresses the standard deviation as a percentage of the sample/population mean.

Sample: $CV = \dfrac{s}{\bar{x}} \cdot 100$

Population: $CV = \dfrac{\sigma}{\mu} \cdot 100$
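A minimal sketch of the sample version of this calculation, using the standard `statistics` module. The data values are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(sample):
    """Sample standard deviation as a percentage of the sample mean."""
    return stdev(sample) / mean(sample) * 100

data = [2, 4, 4, 6, 9]   # hypothetical sample
cv = coefficient_of_variation(data)
print(f"CV = {cv:.1f}%")
```

For the population version, `pstdev` would take the place of `stdev`.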

Page 21: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Variance (cont)

Chebyshev’s Theorem

For any data set, the proportion of the data that lies within k standard deviations on either side of the mean (for k > 1) is at least

$$1 - \frac{1}{k^2}$$

So at least 75% lies within 2 standard deviations of the mean, at least 88.9% within 3 standard deviations, etc.
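The bound from Chebyshev's Theorem is easy to tabulate. This sketch evaluates 1 – 1/k² for a few values of k; the function name is only illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the formula gives 0 or less, which says nothing useful.
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```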

Page 22: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

3.3 Mean/Standard Deviation

What if you use grouped data?

Page 23: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Grouped Data

Lots of data = TEDIOUS, whether you have a calculator or not… Sometimes a general approximation of the mean and standard deviation is enough.

To do this, you begin with a frequency table (remember histograms?).

Page 24: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Grouped Data (cont)

1. Make a frequency table
2. Find the midpoint of each class = x
3. Compute each class frequency = f
4. Total number of entries = n

$$\text{average: } \bar{x} = \frac{\sum x f}{n}$$

Page 26: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Grouped Data (cont)

Defining Formula

$$s^2 = \frac{\sum (x - \bar{x})^2 f}{n - 1}$$

Computation Formula

$$s^2 = \frac{\sum x^2 f - \dfrac{\left(\sum x f\right)^2}{n}}{n - 1}$$

Page 27: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Grouped Data (cont)

Essentially, by using the midpoint and the frequency, you use a representation for ALL data values in that class, without typing in every data value.

It will be a little off, but again, if the data set is huge it isn’t a bad way to approach the problem.
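The grouped-data steps can be sketched as follows. The frequency table here is hypothetical (class limits and frequencies are made up), and the variance uses the computation formula.

```python
from math import sqrt

# Hypothetical frequency table: (lower class limit, upper class limit, frequency f).
classes = [(60, 69, 4), (70, 79, 9), (80, 89, 5), (90, 99, 2)]

midpoints = [(low + high) / 2 for low, high, _ in classes]   # x for each class
freqs = [f for _, _, f in classes]                           # f for each class
n = sum(freqs)                                               # total number of entries

sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
sum_x2f = sum(x ** 2 * f for x, f in zip(midpoints, freqs))

x_bar = sum_xf / n                                   # approximate mean
s_squared = (sum_x2f - sum_xf ** 2 / n) / (n - 1)    # computation formula
s = sqrt(s_squared)                                  # approximate standard deviation

print(x_bar, round(s, 2))
```

Each midpoint stands in for every value in its class, so the results approximate what the raw data would give.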

Page 28: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

3.4 Percentiles

Box/Whiskers Plots

Page 29: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Percentiles

Baby Calculator

Children’s BMI

A percentile ranking allows one to know where the particular data value falls in relation to the entire population.

Page 30: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Percentiles (cont)

The Pth percentile (1 ≤ P ≤ 99) is a value such that P% of the data falls at or below it (and (100 – P)% falls at or above it).

The 60th percentile does NOT mean a score of 60%. It means that 60% of scores fall at or below that value. The 60th percentile could be a score of 80%.

Where have you seen percentiles?
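To make the difference between a score and a percentile concrete, this sketch counts what percentage of a data set falls at or below a given value. The test scores are hypothetical.

```python
def percent_at_or_below(data, value):
    """Percentage of data values that fall at or below `value`."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

# Hypothetical test scores, each given as a percentage correct.
scores = [55, 62, 70, 74, 78, 80, 88, 91, 95, 99]

rank = percent_at_or_below(scores, 80)
print(f"A score of 80% is at the {rank:.0f}th percentile")
```

Here six of the ten scores are at or below 80, so a score of 80% sits at the 60th percentile.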

Page 31: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Percentiles (cont)

Quartiles – special percentiles that are used frequently. The data is divided into fourths, and the dividing values are called quartiles.

2nd Quartile (Q2) – the median
1st Quartile (Q1) – the median of the values below Q2 (exclude Q2)
3rd Quartile (Q3) – the median of the values above Q2 (exclude Q2)
Interquartile Range (IQR) = Q3 – Q1

Page 32: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Percentiles (cont)

Let's find the quartiles for the following 9th grade math class sizes.

10, 11, 12, 12, 14, 15, 16, 17, 19, 20

Median = 14.5

1st Q = 12

3rd Q = 17

IQR = 17 – 12 = 5
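The class size results above can be reproduced with a short sketch of the median-of-halves method. Software packages use several different quartile conventions, so built-in functions may give slightly different values than this one.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median of each half, excluding Q2 itself."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the median position
    upper = ordered[half + n % 2:]    # skips the middle value when n is odd
    return median(lower), median(ordered), median(upper)

class_sizes = [10, 11, 12, 12, 14, 15, 16, 17, 19, 20]
q1, q2, q3 = quartiles(class_sizes)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```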

Page 33: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Percentiles (cont)

Let's find the quartiles for the pulse data.

Why are these values significant? They are needed to make Box and Whiskers Plots.

Page 34: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Box and Whiskers Plots

Page 35: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Box and Whiskers Plots (cont)

The five number summary is used to make a box and whiskers plot:

Lowest Value, Q1, Median, Q3, Highest Value

Let's make a box and whiskers plot for the class size data.

Page 36: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

[Figure: box and whiskers plot of the class size data on an axis from 10 to 20, with the Lowest Value, Q1, Q2 (Median) and Highest Value labeled.]

Page 37: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Box and Whiskers Plots (cont)

Let's make a box and whiskers plot for the pulse data.

Outliers – data values above Q3 + 1.5 × IQR or below Q1 – 1.5 × IQR
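The outlier rule can be sketched with the class size data, using Q1 = 12 and Q3 = 17 from the earlier example. The function names are illustrative only.

```python
def outlier_fences(q1, q3):
    """Return the lower and upper cutoffs: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Return the data values that fall outside the two cutoffs."""
    low, high = outlier_fences(q1, q3)
    return [x for x in data if x < low or x > high]

class_sizes = [10, 11, 12, 12, 14, 15, 16, 17, 19, 20]
low, high = outlier_fences(12, 17)            # Q1 and Q3 from the class size example
outliers = find_outliers(class_sizes, 12, 17)

print(low, high, outliers)
```

None of the class sizes falls outside the cutoffs, so this data set has no outliers by this rule.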

Page 38: Chapter 3 Averages and Variations 3.1 Measures of Central Tendency.

Resources

• http://www.statcan.ca/english/edu/power/ch12/plots.htm

• http://www.statsdirect.com/help/graphics/box_whisker.htm

• http://v8doc.sas.com/sashtml/stat/chap18/sect18.htm