Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data

Dec 20, 2015

Transcript
Page 1: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Chapter 3: Numerically Summarizing Data

3.3 Measures of Central Tendency and Dispersion from Grouped Data

Page 4: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

EXAMPLE Approximating the Mean from a Frequency Distribution

The following frequency distribution represents the time between eruptions (in seconds) for a random sample of 45 eruptions at the Old Faithful Geyser in California. Approximate the mean time between eruptions.
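
The frequency table for this example did not survive in the transcript, so the sketch below uses a small made-up table purely to illustrate the technique. It approximates the mean of grouped data by treating every observation in a class as if it sat at the class midpoint, that is, sum(x_i * f_i) / sum(f_i). The function name `grouped_mean` and the sample midpoints and frequencies are assumptions, not the Old Faithful data.

```python
def grouped_mean(midpoints, frequencies):
    """Approximate the mean of grouped data: sum(x_i * f_i) / sum(f_i)."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    return sum(x * f for x, f in zip(midpoints, frequencies)) / total


# Hypothetical classes (NOT the Old Faithful table): midpoints 5, 15, 25.
example_midpoints = [5, 15, 25]
example_frequencies = [2, 5, 3]
approx_mean = grouped_mean(example_midpoints, example_frequencies)
print(approx_mean)
```

The result is only an approximation because the individual data values inside each class are unknown.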

Page 7: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

EXAMPLE Computing a Weighted Mean

Bob goes to the “Buy the Weigh” Nut store and creates his own bridge mix. He combines 1 pound of raisins, 2 pounds of chocolate-covered peanuts, and 1.5 pounds of cashews. The raisins cost $1.25 per pound, the chocolate-covered peanuts cost $3.25 per pound, and the cashews cost $5.40 per pound. What is the cost per pound of this mix?
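
One way to work this example is as a weighted mean, with the prices per pound as the values and the pounds purchased as the weights. The sketch below assumes that reading; the name `weighted_mean` is illustrative and not from the slides.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Figures from the bridge mix example: price per pound and pounds bought.
prices_per_pound = [1.25, 3.25, 5.40]   # raisins, peanuts, cashews
pounds = [1, 2, 1.5]
cost_per_pound = weighted_mean(prices_per_pound, pounds)
print(round(cost_per_pound, 2))
```

The total cost is $15.85 for 4.5 pounds, so the mix costs about $3.52 per pound.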

Page 10: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

EXAMPLE Approximating the Standard Deviation from a Frequency Distribution

The following frequency distribution represents the time between eruptions (in seconds) for a random sample of 45 eruptions at the Old Faithful Geyser in California. Approximate the standard deviation of the time between eruptions.
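
Because the frequency table is missing from the transcript, the following is a minimal sketch on made-up numbers rather than a solution to the example. It assumes the usual grouped-data approximation for a sample standard deviation, s = sqrt( sum((x_i - mean)^2 * f_i) / (sum(f_i) - 1) ), where x_i are class midpoints and f_i are class frequencies. The name `grouped_sample_std` and the sample table are hypothetical.

```python
import math


def grouped_sample_std(midpoints, frequencies):
    """Approximate the sample standard deviation from grouped data."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    n = sum(frequencies)
    if n < 2:
        raise ValueError("need at least two observations")
    # Approximate mean: every observation is placed at its class midpoint.
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    # Frequency-weighted sum of squared deviations from that mean.
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return math.sqrt(squared_dev / (n - 1))


# Hypothetical classes (NOT the Old Faithful table): midpoints 5, 15, 25.
approx_std = grouped_sample_std([5, 15, 25], [2, 5, 3])
print(round(approx_std, 2))
```

The divisor is n - 1 because the example describes a random sample; for a population the divisor would be the total frequency instead.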

Page 12: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Chapter 3: Numerically Summarizing Data

3.4 Measures of Location

Page 13: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

The z-score represents the number of standard deviations that a data value is from the mean.

It is obtained by subtracting the mean from the data value and dividing this result by the standard deviation.

The z-score is unitless. The z-scores of all the values in a data set have a mean of 0 and a standard deviation of 1.

Page 14: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Population z-score: $z = \frac{x - \mu}{\sigma}$

Sample z-score: $z = \frac{x - \bar{x}}{s}$

Page 15: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

EXAMPLE Using Z-Scores

The mean height of males 20 years or older is 69.1 inches with a standard deviation of 2.8 inches. The mean height of females 20 years or older is 63.7 inches with a standard deviation of 2.7 inches. (Data based on information obtained from the National Health and Examination Survey.) Who is relatively taller: Shaquille O’Neal, whose height is 85 inches, or Lisa Leslie, whose height is 77 inches?

Page 16: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Answer:

Shaquille O’Neal’s z-score: (85 - 69.1)/2.8 ≈ 5.68

Lisa Leslie’s z-score: (77 - 63.7)/2.7 ≈ 4.93

Because O’Neal’s z-score is greater than Leslie’s, O’Neal is relatively taller: his height lies more standard deviations above the mean of his group than Leslie’s does above the mean of hers.
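
The comparison above can be reproduced with a few lines of code. This is a small sketch that applies the z-score definition (data value minus mean, divided by standard deviation) to the figures in the example; the function name `z_score` is an illustrative choice.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Heights in inches, with the group means and standard deviations
# quoted in the example.
z_oneal = z_score(85, 69.1, 2.8)
z_leslie = z_score(77, 63.7, 2.7)
print(round(z_oneal, 2), round(z_leslie, 2))
print("O'Neal" if z_oneal > z_leslie else "Leslie", "is relatively taller")
```

Comparing z-scores rather than raw heights puts both players on the same unitless scale, which is what makes a comparison across the two groups meaningful.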

Page 17: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

The median divides the lower 50% of a set of data from the upper 50% of a set of data. In general, the kth percentile, denoted P_k, of a set of data divides the lower k% of a data set from the upper (100 - k)% of a data set.

Page 18: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Computing the kth Percentile, P_k

Step 1: Arrange the data in ascending order.

Step 2: Compute an index i using the following formula:

$i = \frac{k}{100} \cdot n$

where k is the percentile of the data value and n is the number of individuals in the data set.

Step 3: (a) If i is not an integer, round up to the next highest integer. Locate the ith value of the data set written in ascending order. This number represents the kth percentile. (b) If i is an integer, the kth percentile is the arithmetic mean of the ith and (i + 1)st data values.
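
The three steps for computing the kth percentile can be sketched in code. The version below assumes the index is i = (k/100) * n, which is the form used in the worked answer for the employment ratio data, and it uses integer arithmetic so that the "is i an integer?" test is not affected by floating-point rounding. The function name `kth_percentile` and the demo data are hypothetical.

```python
def kth_percentile(data, k):
    """kth percentile using the round-up / average-two-values rule."""
    if not isinstance(k, int) or not 0 < k < 100:
        raise ValueError("k must be an integer from 1 to 99")
    values = sorted(data)                      # Step 1: ascending order
    n = len(values)
    if n == 0:
        raise ValueError("data must not be empty")
    # Step 2: i = k * n / 100, kept as quotient and remainder.
    whole, remainder = divmod(k * n, 100)
    if remainder != 0:
        # Step 3(a): i is not an integer, so round up to position whole + 1
        # (1-based), which is list index `whole` (0-based).
        return values[whole]
    # Step 3(b): i is an integer, so average the ith and (i + 1)st values.
    return (values[whole - 1] + values[whole]) / 2


# Hypothetical data: the values 1..51, so each value equals its position.
positions = list(range(1, 52))
print(kth_percentile(positions, 60))   # position used for k = 60, n = 51
print(kth_percentile(positions, 33))   # position used for k = 33, n = 51
```

Statistical software often uses different interpolation rules for percentiles, so results from a library routine may differ slightly from this textbook procedure.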

Page 21: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

EXAMPLE Finding a Percentile

For the employment ratio data on the next slide, find the

(a) 60th percentile

(b) 33rd percentile

Page 22: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Answer:

(a) 60th percentile: the index is i = (60/100)(51) = 30.6. Because 30.6 is not an integer, we round up to 31, so the 60th percentile is the 31st data value, 66.1.

(b) 33rd percentile: the index is i = (33/100)(51) = 16.83. Rounding up gives 17, so the 33rd percentile is the 17th data value, 63.6.

Page 25: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Finding the Percentile that Corresponds to a Data Value

Step 1: Arrange the data in ascending order.

Step 2: Use the following formula to determine the percentile of the score, x:

$\text{Percentile of } x = \frac{\text{number of data values less than } x}{n} \cdot 100$

where n is the number of individuals in the data set. Round this number to the nearest integer.
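
The procedure for finding the percentile that corresponds to a data value can be sketched as follows. The sketch assumes the formula is the number of data values less than x, divided by n, times 100, rounded to the nearest integer; the function name `percentile_rank` and the sample data are hypothetical, since the employment ratio table is not part of the transcript.

```python
def percentile_rank(data, x):
    """Percentile of x: share of values strictly below x, as a whole number."""
    values = sorted(data)                      # Step 1: ascending order
    n = len(values)
    if n == 0:
        raise ValueError("data must not be empty")
    below = sum(1 for v in values if v < x)    # count of values less than x
    # Step 2: round to the nearest integer, with halves rounded up
    # (Python's built-in round() would round halves to the even integer).
    return int(100 * below / n + 0.5)


# Hypothetical data, not the employment ratios from the slides.
scores = [10, 20, 30, 40, 50]
print(percentile_rank(scores, 40))
```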

Page 27: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

EXAMPLE Finding the Percentile Rank of a Data Value

Find the percentile rank of the employment ratio of Michigan.

Page 28: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

The most common percentiles are quartiles. Quartiles divide data sets into fourths or four equal parts.

• The 1st quartile, denoted Q1, divides the bottom 25% of the data from the top 75%. Therefore, the 1st quartile is equivalent to the 25th percentile.

• The 2nd quartile divides the bottom 50% of the data from the top 50% of the data, so that the 2nd quartile is equivalent to the 50th percentile, which is equivalent to the median.

• The 3rd quartile divides the bottom 75% of the data from the top 25% of the data, so that the 3rd quartile is equivalent to the 75th percentile.
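
Since the quartiles are the 25th, 50th, and 75th percentiles, they can be computed with the same percentile rule. The sketch below assumes the index i = (k/100) * n with the round-up and average-two-values rule, and checks that the 2nd quartile agrees with the median from Python's `statistics` module. The helper names and the data are hypothetical.

```python
import statistics


def kth_percentile(data, k):
    """kth percentile (k an integer from 1 to 99) via i = k * n / 100."""
    values = sorted(data)
    whole, remainder = divmod(k * len(values), 100)
    if remainder != 0:
        return values[whole]                   # round i up, then index
    return (values[whole - 1] + values[whole]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) as the 25th, 50th, and 75th percentiles."""
    return tuple(kth_percentile(data, k) for k in (25, 50, 75))


# Hypothetical data, not the employment ratios from the slides.
sample = [3, 7, 8, 12, 13, 14, 18, 21, 22, 25, 30]
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)
print(q2 == statistics.median(sample))
```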

Page 31: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

EXAMPLE Finding the Quartiles

Find the quartiles corresponding to the employment ratio data.

Page 32: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

Checking for Outliers Using Quartiles

Step 1: Determine the first and third quartiles of the data.

Step 2: Compute the interquartile range. The interquartile range, or IQR, is the difference between the third and first quartiles. That is, IQR = Q3 - Q1.

Step 3: Compute the fences that serve as cut-off points for outliers.

Lower Fence = Q1 - 1.5(IQR)

Upper Fence = Q3 + 1.5(IQR)

Step 4: If a data value is less than the lower fence or greater than the upper fence, then it is considered an outlier.

Page 36: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

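EXAMPLE Checking for Outliers

Check the employment ratio data for outliers.

Q1 is the 13th data value, 62.9.

Q3 is the 38th data value, 67.2.

IQR = Q3 - Q1 = 67.2 - 62.9 = 4.3

Lower Fence = 62.9 - 1.5(4.3) = 56.45

Upper Fence = 67.2 + 1.5(4.3) = 73.65

The data value 52.7 is less than the lower fence, so it is an outlier.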

Page 37: Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data.

West Virginia
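
The fence computation in the employment ratio example can be checked with a short sketch. It uses only the quartiles and the flagged value quoted in the example; the function name `outlier_fences` is an illustrative choice, and the full data set is not reproduced here.

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) using 1.5 times the IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x, q1, q3):
    """True if x lies below the lower fence or above the upper fence."""
    lower, upper = outlier_fences(q1, q3)
    return x < lower or x > upper


# Quartiles quoted in the example: Q1 = 62.9, Q3 = 67.2.
lower_fence, upper_fence = outlier_fences(62.9, 67.2)
print(round(lower_fence, 2), round(upper_fence, 2))
print(is_outlier(52.7, 62.9, 67.2))
```

Values are rounded only for display, because floating-point subtraction can leave tiny errors in the last digits of the fences.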