Chapter 2: Describing Distributions with Numbers

Math 2200: Elementary Statistics

January 19, 2011

Moore Chapter 2

Objectives · Describing the Center

The Why?

Suppose you have been chosen to lead a group to study comparable salaries for your profession. When displaying this information for your employer to give competitive wages, how would you choose to numerically display the center of your distribution?

The Why?

Suppose you have been chosen to lead a group to study the incomes of people in the surrounding community. When displaying this information to express the needs for the well-being of your fellow citizens, how would you choose to numerically display the center of your distribution?

Numerical Observations

Main Objective

In this section, we shall explore numerical observations of a distribution with our focal point being the center and variability.

We begin by looking at some numerical information that describes the center of a distribution.

Measures of Center

Definition

If the n observations in a sample are denoted by \(x_1, x_2, \ldots, x_n\), then the mean is given by

\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}. \]

Definition

The median is a measure of central tendency that divides the data into two equal parts, half below the median and half above. In the case where the number of observations is even, the median is halfway between the two central values. We shall denote the median by M.

Understanding Sigma Notation

Consider the data set given by \(x_1 = 17\), \(x_2 = 23\), \(x_3 = -15\), \(x_4 = 37\), \(x_5 = 43\), \(x_6 = 2\).

\[ \sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4 = 17 + 23 + (-15) + 37 = 62 \]

\[ \sum_{i=2}^{3} x_i = x_2 + x_3 = 23 + (-15) = 8 \]

\[ \sum_{i=1}^{6} x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = 17 + 23 + (-15) + 37 + 43 + 2 = 107 \]
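Sigma notation corresponds closely to summing a slice of a list. The following is a minimal Python sketch of the three sums above; the helper name `sigma` and the list name `x` are our own choices for illustration and are not part of the course materials.

```python
# Data set from the slide, stored so that x[0] holds x_1, x[1] holds x_2, and so on.
x = [17, 23, -15, 37, 43, 2]


def sigma(values, start, stop):
    """Sum values from index `start` to index `stop`, both 1-based and inclusive,
    mirroring the lower and upper limits written on the sigma symbol."""
    return sum(values[start - 1:stop])


print(sigma(x, 1, 4))  # 62
print(sigma(x, 2, 3))  # 8
print(sigma(x, 1, 6))  # 107
```

Python lists are indexed from 0, so the helper shifts the lower limit by one to match the 1-based subscripts used in the notation.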

Finding the Mean

Find the mean of the aforementioned data set.

Recall that \(\sum_{i=1}^{6} x_i = 107\). Then

\[ \bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{107}{6} \approx 17.833333. \]
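As a quick check of the arithmetic, the mean can be computed directly from its definition. This is a small sketch; the function name `mean` is our own and simply restates the formula (sum of the observations divided by their count).

```python
x = [17, 23, -15, 37, 43, 2]


def mean(values):
    """Arithmetic mean: the sum of the observations divided by how many there are."""
    return sum(values) / len(values)


print(sum(x))             # 107
print(round(mean(x), 6))  # 17.833333
```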

Finding the Median

Find the median of the aforementioned data set.

Step 1: Rewrite the data set in increasing order.

-15, 2, 17, 23, 37, 43

Step 2: If the data set has n observations, then find the location of the midpoint of the data set by the formula \(\frac{n + 1}{2}\). The location of the median in this data set is \(\frac{6 + 1}{2} = 3.5\).

Step 3: Find the median. The median of this data set is given by \(M = \frac{17 + 23}{2} = 20\).
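The three steps above can be sketched in Python as follows. The function name `median` and its internal variable names are our own; the logic follows the slide's procedure (sort, locate position (n + 1)/2, then average the two central values when the location ends in .5).

```python
def median(values):
    """Median via the location rule (n + 1) / 2 on the sorted data."""
    ordered = sorted(values)            # Step 1: ascending order
    n = len(ordered)
    location = (n + 1) / 2              # Step 2: 1-based location of the midpoint
    lower = ordered[int(location) - 1]  # observation at the whole-number position
    if location == int(location):       # odd n: the location is a single observation
        return lower
    upper = ordered[int(location)]      # even n: average the two central values
    return (lower + upper) / 2


x = [17, 23, -15, 37, 43, 2]
print(median(x))  # 20.0
```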

Mean vs. Median

The mean \(\bar{x}\) and median M describe the center of a distribution.

The mean gives the arithmetic average of the observations.

The median is the midpoint of the observations.

The Mode

Definition

The mode is the response that occurs most frequently in a distribution.

Relation to Center

The mode is often a poor measure of central tendency.

Application 1

The following stemplot gives the monthly salaries of a radiologist over the past twelve months, with the stems representing thousands of dollars and the leaves representing hundreds.

4 | 6 6 8
5 | 0 0 0 1 1 4 4 5
6 | 5

Question: What is the mode for this particular distribution?

Answer: The mode is $5000.

Question: How much money did this particular radiologist make?

Answer: This person made $62000.

Question: What is the radiologist’s average monthly salary?

Answer: We can determine the mean by evaluating \(\bar{x} = \frac{\sum_{i=1}^{12} x_i}{12} = \frac{62000}{12}\). Therefore, the average monthly salary is $5166.67.

Question: What is the radiologist’s median monthly income?

Answer: Since the stemplot gives the observations in ascending order, we need only use the formula \(\frac{n + 1}{2}\) to find the location of the median. This gives us \(\frac{12 + 1}{2} = 6.5\) as the location of the median. The sixth and seventh data entries are 5000 and 5100, respectively. Therefore the midpoint of those two values gives us the median income of $5050.
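The four answers for Application 1 can be reproduced by expanding the stemplot into a list of salaries. This is a sketch under the reading that stems are thousands of dollars and leaves are hundreds; the dictionary `stemplot` and the other variable names are our own.

```python
from collections import Counter

# Stem (thousands of dollars) -> leaves (hundreds of dollars), read off the stemplot.
stemplot = {4: [6, 6, 8], 5: [0, 0, 0, 1, 1, 4, 4, 5], 6: [5]}

salaries = [1000 * stem + 100 * leaf
            for stem, leaves in stemplot.items()
            for leaf in leaves]

mode = Counter(salaries).most_common(1)[0][0]  # most frequent salary
total = sum(salaries)                          # income over the twelve months
mean = total / len(salaries)                   # average monthly salary

ordered = sorted(salaries)
# Location (12 + 1) / 2 = 6.5: average the 6th and 7th observations.
median = (ordered[5] + ordered[6]) / 2

print(mode)            # 5000
print(total)           # 62000
print(round(mean, 2))  # 5166.67
print(median)          # 5050.0
```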

Application 2

Ten males and ten females checked their body fat percentage before beginning an exercise regimen. The results from the study are listed in the following sets:

Males = {20.7, 15.8, 32.0, 8.4, 12.8, 29.6, 10.8, 12.8, 16.9, 18.8}

Females = {16.8, 34.2, 22.8, 21.0, 23.8, 29.8, 28.9, 13.4, 20.5, 19.7}.

Question: What is the average body fat percentage for males and females beginning this exercise regimen?

Answer: In this case, the average body fat percentage of each sample group is given by the formula

\[ \bar{x} = \frac{\sum_{i=1}^{10} x_i}{10}. \]

This gives an average body fat percentage of 17.86% for the males, while the females have an average body fat percentage of 23.09%.

Question: What is the average body fat percentage for the entire population in the study?

Answer: For us to calculate the mean of the whole group we need to calculate \(\frac{\sum_{i=1}^{20} x_i}{20}\). This gives us an average body fat percentage of 20.475% for the whole group.

Comment: Recall that the average body fat percentages for the males and females were 17.86% and 23.09%, respectively, while the average body fat percentage for the entire population was 20.475%.

Question: What is the mean of the sample means for the body fat percentage of males and females?

Answer: This involves evaluating the average of the two numbers 17.86 and 23.09, which gives \(\frac{17.86 + 23.09}{2} = 20.475\)%. Notice that this matches the average body fat percentage for the entire population.
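The three averages in Application 2 can be checked with a few lines of Python. This is an illustrative sketch; the helper `mean` and the variable names are our own. Results are rounded when printed because decimal values such as 20.7 are not stored exactly in floating point.

```python
males = [20.7, 15.8, 32.0, 8.4, 12.8, 29.6, 10.8, 12.8, 16.9, 18.8]
females = [16.8, 34.2, 22.8, 21.0, 23.8, 29.8, 28.9, 13.4, 20.5, 19.7]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


mean_males = mean(males)
mean_females = mean(females)
overall = mean(males + females)                # all 20 observations together
mean_of_means = (mean_males + mean_females) / 2

print(round(mean_males, 2))     # 17.86
print(round(mean_females, 2))   # 23.09
print(round(overall, 3))        # 20.475
print(round(mean_of_means, 3))  # 20.475
```

Note that both groups here contain the same number of observations (ten each), so each group mean carries equal weight in the overall mean.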

Sample Means and Population Means

Question

Does the mean of the sample means always equal the mean of the entire population?

Answer

Your WebWork assignment will help provide an answer to this question.

Application 3

A census report gave the following household incomes of 15 families in the Savannah area:

85000, 32000, 68500, 17000, 42000, 175000, 88500, 67000, 39500, 49000, 55000, 59000, 310250, 25000, 1050000.

Question: What is the median household income of the 15 families surveyed?

Answer: After rewriting the observations in ascending order, we must choose the number in the 8th spot to find the median since \(\frac{15 + 1}{2} = 8\). This gives a median household income of $59,000.

Question: What is the mean household income of the 15 families surveyed?

Answer: Notice that the sum of the observations is $2,162,750. This implies the mean household income for the 15 families is $144,183.33.

Question: Does the mean or median provide a better measure of central tendency for this distribution?

Answer: The median is a better approximation for the center of this distribution. In this case, the mean is much larger than the median since our distribution is skewed right. In general, any heavily skewed distribution leads to the median being a better approximation of the center.
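To see how far the largest incomes pull the mean away from the median, the two values can be computed side by side. This is a small sketch using the 15 incomes listed above; the variable names are our own.

```python
incomes = [85000, 32000, 68500, 17000, 42000, 175000, 88500, 67000, 39500,
           49000, 55000, 59000, 310250, 25000, 1050000]

ordered = sorted(incomes)
n = len(ordered)

# n = 15 is odd, so the median sits at the whole-number location (15 + 1) / 2 = 8.
median = ordered[(n + 1) // 2 - 1]
mean = sum(incomes) / n

print(median)          # 59000
print(sum(incomes))    # 2162750
print(round(mean, 2))  # 144183.33

# How many of the 15 families earn less than the mean?
print(sum(1 for income in incomes if income < mean))  # 12
```

Twelve of the fifteen families fall below the mean, which illustrates why the median describes the typical household better for this right-skewed data.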

Application 4

A recent Nielsen rating poll contacted a random sample of Americans to determine the amount of time their family watched television on a Tuesday night. Exactly 250 people were involved in the poll, with 37 people watching no television, 51 people watching 30 minutes of television, 17 people watching 45 minutes of television, 20 people watching 60 minutes of television, 19 people watching 75 minutes of television, 11 people watching 90 minutes of television, 50 people watching 120 minutes of television, and 45 people watching 240 minutes of television. Determine the mean, median, and mode from the given Nielsen rating poll.

Mode = 30 minutes

Median = \(\frac{60 + 75}{2}\) = 67.5 minutes

Mean = \(\frac{37(0) + 51(30) + 17(45) + 20(60) + 19(75) + 11(90) + 50(120) + 45(240)}{250} = \frac{22710}{250}\) = 90.84 minutes
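When data arrive as counts rather than as a raw list, the mode, median, and mean can be read from the frequency table. The sketch below stores the poll as a dictionary mapping minutes watched to the number of people; the names `freq` and `value_at` are our own. The median uses location (250 + 1)/2 = 125.5, so it averages the 125th and 126th ordered observations.

```python
# Minutes of television watched -> number of people reporting that amount.
freq = {0: 37, 30: 51, 45: 17, 60: 20, 75: 19, 90: 11, 120: 50, 240: 45}

n = sum(freq.values())                 # total number of people polled
mode = max(freq, key=freq.get)         # value with the largest count
mean = sum(minutes * count for minutes, count in freq.items()) / n


def value_at(position):
    """Return the observation at a 1-based position in the ordered data,
    found by walking the cumulative counts."""
    running = 0
    for minutes in sorted(freq):
        running += freq[minutes]
        if position <= running:
            return minutes
    raise ValueError("position exceeds the number of observations")


# n is even, so average the two observations on either side of location 125.5.
median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2

print(n)               # 250
print(mode)            # 30
print(median)          # 67.5
print(round(mean, 2))  # 90.84
```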

Percentiles and Boxplots · Standard Deviation

What is a Percentile?

Definition

The kth percentile is a data value such that approximately k% of the observations are at or below this value and approximately (100 − k)% of the observations are above this value.

The Location of the kth Percentile

The kth percentile of a data set containing n observations written in ascending order can be found at the location L, where

\[ L = \frac{k}{100}(n + 1). \]

Warning: There is no universal definition for percentile. As a result, outside resources may provide you with a different definition.

Calculating Percentiles

A small sample of students in a psychology class were chosen to take the Wechsler Adult Intelligence Scale (WAIS) intelligence quotient (IQ) test, with their results listed in the stemplot below.

 7 | 1 1 9
 8 | 4 5 5 5 8 8 8
 9 |
10 | 4 4 4 4 7 7
11 | 0 3
12 | 5 6 9 9 9

Question: Which score would represent the 80th percentile for this sample?

Answer: Since there are 23 observations, we find the location of our desired value is given by \(L = \frac{k}{100}(n + 1) = \frac{80}{100}(23 + 1) = 19.2\). Therefore our desired observation lies two-tenths of the way between our 19th and 20th observations, which are 125 and 126. As a result, the 80th percentile is 125.2.

From Percentile to Numerical Value

Converting Decimals

When the location L of the kth percentile is not a whole number, we can calculate the approximate value that represents the kth percentile in an ascending list of observations. Let d be the decimal part of L, so that L lies between positions n and n + 1 (here n is the whole-number part of L, not the sample size), and let \(a_n\) and \(a_{n+1}\) be the observations in those positions. Then the value A representing the kth percentile is given by

\[ A = a_n + d(a_{n+1} - a_n). \]

Calculating Percentiles Revisited

 7 | 1 1 9
 8 | 4 5 5 5 8 8 8
 9 |
10 | 4 4 4 4 7 7
11 | 0 3
12 | 5 6 9 9 9

Question: Which score would represent the 31st percentile for this sample?

Answer: Since there are 23 observations, we find the location of our desired value is given by \(L = \frac{31}{100}(23 + 1) = 7.44\). Therefore our desired observation lies between the 7th and 8th observations, which are 85 and 88. By applying our previous formula, the 31st percentile is given by 85 + 0.44(88 − 85) = 86.32.
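The location formula and the interpolation rule can be combined into one function. The sketch below follows this course's definition only; as the earlier warning notes, other sources and software packages may use different percentile definitions and give different answers. The function name `percentile` is our own, and the handling of locations that fall before the first or after the last observation (returning the nearest end value) is our assumption, since the slides do not cover that case.

```python
def percentile(values, k):
    """kth percentile using location L = k/100 * (n + 1) with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    location = k / 100 * (n + 1)
    whole = int(location)       # whole-number part of the location
    d = location - whole        # decimal part of the location
    # Assumption (not from the slides): clamp locations outside 1..n to the ends.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    lower = ordered[whole - 1]  # observation at position `whole`
    upper = ordered[whole]      # observation at position `whole + 1`
    return lower + d * (upper - lower)


# WAIS scores read from the stemplot (23 observations).
scores = [71, 71, 79, 84, 85, 85, 85, 88, 88, 88,
          104, 104, 104, 104, 107, 107, 110, 113,
          125, 126, 129, 129, 129]

print(round(percentile(scores, 80), 2))  # 125.2
print(round(percentile(scores, 31), 2))  # 86.32
```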

Quartiles

Definition

When an ordered set of data is divided into four equal parts, the division points are called quartiles. The first quartile, Q1, is a value that has approximately one-fourth (25%) of the observations below it and approximately 75% of the observations above. The second quartile, Q2, has approximately one-half (50%) of the observations below its value. The more common term for the second quartile is the median, which we denote by M. The third quartile, Q3, has approximately three-fourths (75%) of the observations below its value.

Although this may seem like an easy calculation, statisticians do not agree on a single way to compute Q1 and Q3.

Methods for Finding Q1 and Q3

Method 1

With Q1 and Q3 representing the 25th and 75th percentiles, respectively, we can find them by computing the 25th and 75th percentiles of our data set.

Method 2

The first quartile Q1 can be calculated by taking the median of the observations whose position in the ordered list is to the left of the location of the overall median. The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the location of the overall median.

These different methods can provide us with different values for Q1 and Q3, as can be seen in our next example.


Method 1 vs. Method 2

The following stemplot gives the weight (in pounds) of the dogs that visit the veterinarian’s office on a Tuesday.

0 | 8 9
1 | 2 8 8 9 9
2 | 5 5 6 6 9 9 9
3 | 0 1 1 2 7
4 | 1
5 | 2 4 7 7 9 9
6 | 5 7

Question: What is Q1 using Method 1?

Answer: With 28 observations in place, the location of the 25th percentile is (25/100)(28 + 1) = 7.25. This places Q1 a quarter of the way between the 7th and 8th observations, 19 and 25. As a result, we find Q1 = 19 + 0.25(25 − 19) = 20.5.


0 | 8 9
1 | 2 8 8 9 9
2 | 5 5 6 6 9 9 9 *
3 | 0 1 1 2 7
4 | 1
5 | 2 4 7 7 9 9
6 | 5 7

Question: What is Q1 using Method 2?

Answer: We begin by placing an asterisk (*) where the median occurs in our stemplot. Now we take the median of the numbers to the left of the asterisk. Since there are 14 observations here, their median is located at position 7.5. This is between 19 and 25, which gives us a median of 22 for these 14 observations. Hence, Q1 = 22.


Question: What is Q3 using Method 1?

Answer: With 28 observations in place, the location of the 75th percentile is (75/100)(28 + 1) = 21.75. This places Q3 three-quarters of the way between the 21st and 22nd observations, 52 and 54. As a result, we find Q3 = 52 + 0.75(54 − 52) = 53.5.


Question: What is Q3 using Method 2?

Answer: We begin by placing an asterisk (*) where the median occurs in our stemplot. Now we take the median of the numbers to the right of (or below) the asterisk. Since there are 14 observations here, their median is located at position 21.5 of the full ordered list. This is between 52 and 54, which gives us a median of 53 for the last 14 observations. Hence, Q3 = 53.


Comment: Recall that Method 1 provided us with Q1 = 20.5 and Q3 = 53.5, while Method 2 gives us a different result of Q1 = 22 and Q3 = 53. The book uses Method 2 throughout the section, so for convenience we shall use Method 2 for calculating the first and third quartiles for the rest of this chapter.
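The two methods can be sketched in Python as follows. This is a minimal illustration, assuming the dog weights are read off the stemplot in increasing order; the function names `quartile_method1` and `quartiles_method2` are hypothetical and do not come from the text.

```python
from statistics import median

# Dog weights (pounds) read off the stemplot, already in increasing order.
dog_weights = [8, 9, 12, 18, 18, 19, 19, 25, 25, 26, 26, 29, 29, 29,
               30, 31, 31, 32, 37, 41, 52, 54, 57, 57, 59, 59, 65, 67]

def quartile_method1(sorted_data, p):
    """Method 1: interpolate at position (p/100)(n + 1) in the ordered list."""
    n = len(sorted_data)
    position = p / 100 * (n + 1)      # 1-based position, may be fractional
    k = int(position)                 # whole part: the observation just below
    fraction = position - k
    if k < 1:                         # position falls before the first observation
        return sorted_data[0]
    if k >= n:                        # position falls at or after the last observation
        return sorted_data[-1]
    below, above = sorted_data[k - 1], sorted_data[k]
    return below + fraction * (above - below)

def quartiles_method2(sorted_data):
    """Method 2: medians of the observations left and right of the overall median."""
    n = len(sorted_data)
    half = n // 2
    lower = sorted_data[:half]
    upper = sorted_data[half + n % 2:]   # skip the middle observation when n is odd
    return median(lower), median(upper)

q1_m1 = quartile_method1(dog_weights, 25)
q3_m1 = quartile_method1(dog_weights, 75)
q1_m2, q3_m2 = quartiles_method2(dog_weights)
print(q1_m1, q3_m1)   # 20.5 53.5
print(q1_m2, q3_m2)   # 22.0 53.0
```

Statistical software often uses yet another interpolation rule, so quartiles reported by a package may match neither method exactly.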


The Five-Number Summary and Interquartile Range

The Five-Number Summary

The five-number summary of a distribution consists of the minimum, the first quartile, the median, the third quartile, and the maximum, written in order from smallest to largest.

Interquartile Range

The interquartile range, denoted IQR, is the distance between the first and third quartiles, which is given by

IQR = Q3 − Q1.

We often call an observation a suspected outlier if it falls more than 1.5 × IQR below the first quartile or above the third quartile.


Finding the Five-Number Summary

The following stemplot gives the tons of bluefin tuna caught in the Mediterranean Sea over the past 11 years, with the leaves of the stemplot representing thousands.

2 | 1 7 7
3 | 0 0 0 2 3 3 4
4 |
5 |
6 | 8

Question: What is the five-number summary of this data set?

Answer: The minimum and maximum are given by 21000 and 68000, respectively. With 11 observations the median has a location of (11 + 1)/2 = 6, which gives us a median of M = 30000.


Since we are using Method 2 for the quartiles, we begin by marking the median (the 6th observation) in the stemplot. With 5 values on each side of the median, the quartiles are located in the 3rd and 9th spots. This implies Q1 = 27000 and Q3 = 33000. The five-number summary is therefore 21000, 27000, 30000, 33000, 68000.
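As a check on the arithmetic, here is a short Python sketch of the five-number summary with quartiles found by Method 2. The helper name `five_number_summary` and the list `tuna` (the stemplot values written out in tons) are assumptions made for illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, M, Q3, max), with quartiles found by Method 2."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    q1 = median(xs[:half])               # observations left of the median
    q3 = median(xs[half + n % 2:])       # observations right of the median
    return xs[0], q1, median(xs), q3, xs[-1]

# Bluefin tuna catches in tons, read off the stemplot.
tuna = [21000, 27000, 27000, 30000, 30000, 30000,
        32000, 33000, 33000, 34000, 68000]

summary = five_number_summary(tuna)
print(summary)   # (21000, 27000, 30000, 33000, 68000)
```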


Question: Do any suspected outliers exist?

Answer: Recall that the first and third quartiles were given by Q1 = 27000 and Q3 = 33000. This gives the interquartile range of IQR = Q3 − Q1 = 33000 − 27000 = 6000. We look for suspected outliers by taking 1.5 × IQR = 1.5 × 6000 = 9000. Notice that Q1 − 9000 = 18000 is below the minimum of 21000, so no outlier exists below the first quartile. But Q3 + 9000 = 42000 is below the maximum of 68000. As a result, 68000 is a suspected outlier.
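The 1.5 × IQR check can be written as a small Python function. This is only a sketch: the name `suspected_outliers` is hypothetical, and the quartiles are passed in directly (here the Method 2 values from the example) rather than computed.

```python
def suspected_outliers(data, q1, q3, k=1.5):
    """Flag observations more than k * IQR below Q1 or above Q3."""
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    flagged = [x for x in data if x < low_fence or x > high_fence]
    return low_fence, high_fence, flagged

# Bluefin tuna catches in tons, with Q1 = 27000 and Q3 = 33000 from Method 2.
tuna = [21000, 27000, 27000, 30000, 30000, 30000,
        32000, 33000, 33000, 34000, 68000]

low, high, flagged = suspected_outliers(tuna, q1=27000, q3=33000)
print(low, high, flagged)   # 18000.0 42000.0 [68000]
```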


Boxplots

Definition

A boxplot is a graph of the five-number summary with a central box that spans the quartiles Q1 and Q3, a line in the box that marks the median M, and lines extended from the box out to the smallest and largest observations (these are often called whiskers).


The following boxplot displays the tons of bluefin tuna caught in the Mediterranean Sea over the past 11 years according to the previous stemplot.

[Figure: boxplot of the bluefin tuna data; axis marked from 20 000 to 70 000 tons]


Finding Standard Deviation

Consider the data set given by x1 = 10, x2 = −14, x3 = 7, x4 = 13, x5 = 21, x6 = 5, which has mean x̄ = 7.

Determine how to describe how far the data points vary from the center.

xi      xi − x̄
10        3
−14     −21
7         0
13        6
21       14
5        −2

Unfortunately, $\sum_{i=1}^{6} (x_i - \bar{x}) = 0$, so our sum tells us nothing about the spread.
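This is no accident of the particular data set. A short derivation, using only the definition $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, shows that the deviations from the mean sum to zero for any data set:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0.
% For the data above: 3 + (-21) + 0 + 6 + 14 + (-2) = 0.
```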


Improving Standard Deviation

Improve our description of how far the data points vary from the center by squaring each deviation.

xi      (xi − x̄)²
10        9
−14     441
7         0
13       36
21      196
5         4

Notice $\sum_{i=1}^{6} (x_i - \bar{x})^2 = 686$, which gives us a better approach.
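The two tables can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic; the variable names are chosen here for illustration.

```python
data = [10, -14, 7, 13, 21, 5]

mean = sum(data) / len(data)                 # 42 / 6 = 7.0
deviations = [x - mean for x in data]        # xi minus the mean
squared = [d ** 2 for d in deviations]       # squared deviations

for x, d, sq in zip(data, deviations, squared):
    print(f"{x:>4} {d:>6.0f} {sq:>6.0f}")

print(sum(deviations))   # 0.0
print(sum(squared))      # 686.0
```

Squaring removes the signs, so positive and negative deviations can no longer cancel.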


Variance

Definition

The sample variance, $s^2$, provides us with a measure of the spread from center of a data set $x_1, x_2, \ldots, x_n$ with

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}.$$

Definition

The population variance, $\sigma^2$, provides us with a measure of the spread from center of a data set for the population $x_1, x_2, \ldots, x_n$ with

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}.$$

For the most part, we will focus our efforts on sample variance rather than population variance.
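The only difference between the two definitions is the divisor. A minimal Python sketch, with the hypothetical function names `sample_variance` and `population_variance`, makes that explicit:

```python
def sum_of_squares(xs):
    """Sum of squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

def sample_variance(xs):
    """Divide the sum of squared deviations by n - 1."""
    return sum_of_squares(xs) / (len(xs) - 1)

def population_variance(xs):
    """Divide the sum of squared deviations by n."""
    return sum_of_squares(xs) / len(xs)

data = [10, -14, 7, 13, 21, 5]
s2 = sample_variance(data)            # 686 / 5
sigma2 = population_variance(data)    # 686 / 6
print(s2, sigma2)
```

Because n − 1 is smaller than n, the sample variance is always at least as large as the population variance computed from the same numbers.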


From Variance to Standard Deviation

Calculate the variance for the aforementioned data set.

Notice

$$s^2 = \frac{\sum_{i=1}^{6} (x_i - \bar{x})^2}{6 - 1} = \frac{686}{5} = 137.2.$$


Standard Deviation

Definition

The sample standard deviation, $s$, tells us how far the data points $x_1, x_2, \ldots, x_n$ lie from the mean with

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}.$$

Definition

The population standard deviation, $\sigma$, tells us how far the data points $x_1, x_2, \ldots, x_n$ lie from the mean with

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}.$$


Calculate the standard deviation for the aforementioned data set.

Notice

$$s = \sqrt{\frac{\sum_{i=1}^{6} (x_i - \bar{x})^2}{6 - 1}} = \sqrt{\frac{686}{5}} = \sqrt{137.2} \approx 11.713.$$
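The result can be cross-checked with Python's standard `statistics` module, whose `stdev` and `pstdev` functions compute the sample and population standard deviations. The snippet below is a sketch for verification only.

```python
import math
import statistics

data = [10, -14, 7, 13, 21, 5]

s = statistics.stdev(data)        # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(data)   # population standard deviation (divides by n)

print(round(s, 3))                # 11.713
print(math.isclose(s ** 2, 137.2))   # True: squaring s recovers the sample variance
```

Unlike the variance, the standard deviation is in the same units as the original observations, which makes it easier to interpret.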

Moore Chapter 2

Standard Deviation

xi (xi − x)2

-14 441

21 196

Notice6∑

√(xi − x)2

6− 1=

√686

137.2 = 11.713.

Moore Chapter 2

Standard Deviation

xi (xi − x)2

-14 441

21 196

Notice6∑

√(xi − x)2

6− 1=

√686

137.2 = 11.713.

Moore Chapter 2

Chapter 2: Describing Distributions with Numbers · 2017. 1. 23. · Chapter 2: Describing Distributions with Numbers Math 2200: Elementary Statistics January 19, 2011 Moore Chapter

Documents

Chapter 02 Describing Data: Frequency Tables ... - Test...

Describing Distributions With Numbers Chapter 12

Describing Distributions - With Graphs or Tables - UW...

Chapter 4: Describing Distributions

Chapter 02 Describing Data: Frequency Distributions and...

7-1. Continuous Distributions Chapter 77 Continuous...

Chapter 02 Describing Data: Frequency Tables, Frequency...

Chapter 2 Describing Data: Frequency Tables, Frequency...

Displaying and describing data distributions ›...

Exploring Data 1.2 Describing Distributions with Numbers...

Describing Distributions Numerically Measures of Variation.....

Essential Statistics Chapter 21 Describing Distributions...

Chapter 3 Describing Distributions Numerically. Describing.....

IPS Chapter 1 © 2012 W.H. Freeman and Company 1.1:...

Chapter 2 Describing Data: Frequency Tables, … 02 -...

BPS - 3rd Ed. Chapter 21 Describing Distributions with...