Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1
Page 1: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Chapter 2: Descriptive Statistics

Page 2: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Useful screencasts/videos:

Video on creating a frequency distribution by hand: http://screencast.com/t/OGY3ZjJj

Video on using Excel 2007 to create frequency distributions: http://screencast.com/t/tkMv2FMWhJe

Video on using Excel 2007 to create a histogram: http://screencast.com/t/L0u9UI2eI

Page 3: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Section 2.1: Frequency Distributions and Their Graphs

Page 4: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Frequency Distribution - Terminology

Frequency Distribution

A table that shows classes or intervals of data with a count of the number of entries in each class.

The frequency, f, of a class is the number of data entries in the class.

Class      Frequency, f
1 – 5      5
6 – 10     8
11 – 15    6
16 – 20    8
21 – 25    5
26 – 30    4

Page 5: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Determining the Relative Frequency

Relative frequency of a class: the portion or percentage of the data that falls in a particular class.

$\text{relative frequency} = \frac{\text{class frequency}}{\text{sample size}} = \frac{f}{n}$

Class      Frequency, f   Relative frequency
7 – 18     6              6/50 = 0.12
19 – 30    10             10/50 = 0.20
31 – 42    13             13/50 = 0.26

Page 6: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Constructing a Frequency Distribution

The following sample data set lists the number of minutes 50 Internet subscribers spent on the Internet during their most recent session. Construct a frequency distribution that has seven classes.

50 40 41 17 11  7 22 44 28 21
19 23 37 51 54 42 86 41 78 56
72 56 17  7 69 30 80 56 29 33
46 31 39 20 18 29 34 59 73 77
36 39 30 62 54 67 39 31 53 44

Video on computing frequency distribution using this data: http://screencast.com/t/OGY3ZjJj

Page 7: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Expanded Frequency Distribution

Class      Frequency, f   Midpoint   Relative frequency   Cumulative frequency
7 – 18     6              12.5       0.12                 6
19 – 30    10             24.5       0.20                 16
31 – 42    13             36.5       0.26                 29
43 – 54    8              48.5       0.16                 37
55 – 66    5              60.5       0.10                 42
67 – 78    6              72.5       0.12                 48
79 – 90    2              84.5       0.04                 50
           Σf = 50                   Σ(f/n) = 1
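The columns of the expanded table can be recomputed from the raw data. The following is a minimal sketch, assuming the 50 values from the Internet-usage example and the class limits shown in the table (seven classes of width 12, starting at 7); the variable names are illustrative only.

```python
from itertools import accumulate

# Minutes online for the 50 Internet subscribers in the example.
minutes = [
    50, 40, 41, 17, 11, 7, 22, 44, 28, 21,
    19, 23, 37, 51, 54, 42, 86, 41, 78, 56,
    72, 56, 17, 7, 69, 30, 80, 56, 29, 33,
    46, 31, 39, 20, 18, 29, 34, 59, 73, 77,
    36, 39, 30, 62, 54, 67, 39, 31, 53, 44,
]

# Class limits copied from the table: lower limits 7, 19, ..., 79; width 12.
classes = [(lower, lower + 11) for lower in range(7, 91, 12)]

n = len(minutes)
frequencies = [sum(lo <= x <= hi for x in minutes) for lo, hi in classes]
midpoints = [(lo + hi) / 2 for lo, hi in classes]
relative = [f / n for f in frequencies]
cumulative = list(accumulate(frequencies))

for (lo, hi), f, mid, rel, cum in zip(classes, frequencies, midpoints, relative, cumulative):
    print(f"{lo:>2} - {hi:<2}  f={f:<2}  midpoint={mid}  relative={rel:.2f}  cumulative={cum}")
```

The last cumulative frequency equals the sample size, and the relative frequencies add up to 1, which gives a quick check on the table.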

Page 8: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Graphs of Frequency Distributions

Frequency Histogram
A bar graph that represents the frequency distribution.
The horizontal scale is quantitative and measures the data values.
The vertical scale measures the frequencies of the classes.
Consecutive bars must touch.

[Figure: sample histogram; horizontal axis: data values, vertical axis: frequency]

Page 9: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Frequency Histogram (using Midpoints)

Page 10: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Graphs of Frequency Distributions

Relative Frequency Histogram
Has the same shape and the same horizontal scale as the corresponding frequency histogram.
The vertical scale measures the relative frequencies, not frequencies.

[Figure: sample histogram; horizontal axis: data values, vertical axis: relative frequency]

Page 11: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Relative Frequency Histogram

[Figure: relative frequency histogram; class boundaries on the horizontal axis: 6.5, 18.5, 30.5, 42.5, 54.5, 66.5, 78.5, 90.5]

From this graph you can see that 20% of Internet subscribers spent between 18.5 minutes and 30.5 minutes online.

Page 12: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Section 2.2: More Graphs and Displays

Page 13: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Graphing Quantitative Data Sets

Stem-and-leaf plot
Each number is separated into a stem and a leaf.
Similar to a histogram.
Still contains original data values.

Data: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45

For example, 26 has stem 2 and leaf 6.

2 | 1 5 5 6 7 8
3 | 0 6 6
4 | 5
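The stem-and-leaf split can be sketched in a few lines of code. This is an illustrative helper (the name `stem_and_leaf` is not from the slides) and it assumes non-negative whole numbers, with the tens part as the stem and the ones digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by stem (value // 10); the leaves are the ones digits."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

data = [21, 25, 25, 26, 27, 28, 30, 36, 36, 45]
plot = stem_and_leaf(data)

for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Because every leaf is kept, the original data values can be read back from the plot, which is the property the slide highlights.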

Page 14: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Graphing Qualitative Data Sets

Pie Chart
A circle is divided into sectors that represent categories.
The area of each sector is proportional to the frequency of each category.

Page 15: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Section 2.3: Measures of Central Tendency

Page 16: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Measures of Central Tendency

Measure of central tendency
A value that represents a typical, or central, entry of a data set.

Most common measures of central tendency:
◦ Mean
◦ Median
◦ Mode

Page 17: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Measure of Central Tendency: Mean

Mean (average)
The sum of all the data entries divided by the number of entries.
Sigma notation: Σx = add all of the data entries (x) in the data set.

Population mean: $\mu = \frac{\Sigma x}{N}$

Sample mean: $\bar{x} = \frac{\Sigma x}{n}$

Page 18: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Finding a Sample Mean

The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. What is the mean price of the flights?

872 432 397 427 388 782 397

Page 19: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Finding a Sample Mean

872 432 397 427 388 782 397

• The sum of the flight prices is

Σx = 872 + 432 + 397 + 427 + 388 + 782 + 397 = 3695

• To find the mean price, divide the sum of the prices by the number of prices in the sample:

$\bar{x} = \frac{\Sigma x}{n} = \frac{3695}{7} \approx 527.9$

The mean price of the flights is about $527.90.
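The same arithmetic can be checked in code. This is a minimal sketch using the seven flight prices from the example; the variable names are illustrative.

```python
# Flight prices (in dollars) from the example.
prices = [872, 432, 397, 427, 388, 782, 397]

total = sum(prices)       # Σx
n = len(prices)           # number of entries in the sample
sample_mean = total / n   # x̄ = Σx / n

print(f"Σx = {total}, n = {n}, mean ≈ {sample_mean:.1f}")
```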

Page 20: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Measure of Central Tendency: Median

Median
The value that lies in the middle of the data when the data set is ordered.
Measures the center of an ordered data set by dividing it into two equal parts.

If the data set has an
◦ odd number of entries: median is the middle data entry.
◦ even number of entries: median is the mean of the two middle data entries.

Page 21: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Finding the Median

The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. Find the median of the flight prices.

872 432 397 427 388 782 397

Page 22: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Finding the Median

872 432 397 427 388 782 397

• First order the data.

388 397 397 427 432 782 872

• Because there are seven entries (an odd number), the median is the middle, or fourth, data entry.

The median price of the flights is $427.

Page 23: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Finding the Median

The flight priced at $432 is no longer available. What is the median price of the remaining flights?

872 397 427 388 782 397

Page 24: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Finding the Median

872 397 427 388 782 397

• First order the data.

388 397 397 427 782 872

• Because there are six entries (an even number), the median is the mean of the two middle entries.

$\text{Median} = \frac{397 + 427}{2} = 412$

The median price of the flights is $412.
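Both median cases (odd and even number of entries) can be sketched with one small function. The function below is illustrative and follows the rule stated on the slides; it assumes a non-empty list.

```python
def median(values):
    """Middle entry of the ordered data, or the mean of the two middle entries."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                          # odd: the middle entry
    return (ordered[middle - 1] + ordered[middle]) / 2  # even: mean of the middle pair

seven_prices = [872, 432, 397, 427, 388, 782, 397]
six_prices = [872, 397, 427, 388, 782, 397]   # the $432 flight removed

print(median(seven_prices))
print(median(six_prices))
```

Python's standard library offers `statistics.median`, which applies the same rule.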

Page 25: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Measure of Central Tendency: Mode

Mode
The data entry that occurs with the greatest frequency.
If no entry is repeated, the data set has no mode.
If two entries occur with the same greatest frequency, each entry is a mode (bimodal).

Page 26: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Finding the Mode

The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. Find the mode of the flight prices.

872 432 397 427 388 782 397

Page 27: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Finding the Mode

872 432 397 427 388 782 397

• Ordering the data helps to find the mode.

388 397 397 427 432 782 872

• The entry of 397 occurs twice, whereas the other data entries occur only once.

The mode of the flight prices is $397.
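Counting occurrences makes the mode rules concrete. The sketch below uses an illustrative helper named `modes` that returns every entry tied for the greatest frequency, and an empty list when no entry is repeated, following the convention on the mode slide.

```python
from collections import Counter

def modes(values):
    """Return all entries with the greatest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no entry is repeated, so the data set has no mode
    return sorted(value for value, count in counts.items() if count == highest)

prices = [872, 432, 397, 427, 388, 782, 397]
print(modes(prices))
```

A data set with two entries tied for the greatest frequency returns both of them, which corresponds to the bimodal case.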

Page 28: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Finding the Mode

At a political debate a sample of audience members was asked to name the political party to which they belong. Their responses are shown in the table. What is the mode of the responses?

Political Party      Frequency, f
Democrat             34
Republican           56
Other                21
Did not respond      9

Page 29: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Finding the Mode

Political Party      Frequency, f
Democrat             34
Republican           56
Other                21
Did not respond      9

The mode is Republican (the response occurring with the greatest frequency). In this sample there were more Republicans than people of any other single affiliation.
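When qualitative data arrive as a frequency table, the mode is the category with the largest frequency. A minimal sketch, with the table from the example stored in an illustrative dictionary:

```python
# Frequency table from the political debate example.
party_frequencies = {
    "Democrat": 34,
    "Republican": 56,
    "Other": 21,
    "Did not respond": 9,
}

# The mode is the response with the greatest frequency.
mode = max(party_frequencies, key=party_frequencies.get)
print(mode, party_frequencies[mode])
```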

Page 30: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Section 2.4: Measures of Variation

Page 31: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Deviation, Variance, and Standard Deviation

Deviation
The difference between the data entry, x, and the mean of the data set.

Population data set:
◦ Deviation of x = x – μ

Sample data set:
◦ Deviation of x = x – x̄

Page 32: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Finding the Deviation

A corporation hired 10 graduates. The starting salaries for each graduate are shown. Find the deviation of the starting salaries.

Starting salaries (1000s of dollars)

41 38 39 45 47 41 44 41 37 42

Solution:
• First determine the mean starting salary.

$\mu = \frac{\Sigma x}{N} = \frac{415}{10} = 41.5$

Page 33: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Finding the Deviation

• Determine the deviation for each data entry.

Salary ($1000s), x Deviation: x – μ

41 41 – 41.5 = –0.5

38 38 – 41.5 = –3.5

39 39 – 41.5 = –2.5

45 45 – 41.5 = 3.5

47 47 – 41.5 = 5.5

41 41 – 41.5 = –0.5

44 44 – 41.5 = 2.5

41 41 – 41.5 = –0.5

37 37 – 41.5 = –4.5

42 42 – 41.5 = 0.5

Σx = 415 Σ(x – μ) = 0
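The deviation table can be reproduced with a short computation. This is a sketch using the ten starting salaries from the example, treated as a population as on the slide; the variable names are illustrative.

```python
# Starting salaries in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

mu = sum(salaries) / len(salaries)      # population mean, μ = Σx / N
deviations = [x - mu for x in salaries]

for x, d in zip(salaries, deviations):
    print(f"{x} - {mu} = {d}")
print("Sum of deviations:", sum(deviations))
```

The deviations from the mean always sum to zero, which is why the variance is built from squared deviations.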

Page 34: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Deviation, Variance, and Standard Deviation

Population Variance

$\sigma^2 = \frac{\Sigma (x - \mu)^2}{N}$

The numerator, Σ(x – μ)², is the sum of squares, SSx.

Population Standard Deviation

$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\Sigma (x - \mu)^2}{N}}$

Page 35: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Deviation, Variance, and Standard Deviation

Sample Variance

$s^2 = \frac{\Sigma (x - \bar{x})^2}{n - 1}$

Sample Standard Deviation

$s = \sqrt{s^2} = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}}$
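The population and sample formulas differ only in the divisor (N versus n – 1). The sketch below applies both to the starting-salary data from the deviation example. The slides treat those salaries as a population, so the sample figures are shown purely for comparison; the variable names are illustrative.

```python
import math

# Starting salaries in thousands of dollars (from the deviation example).
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

count = len(salaries)
mean = sum(salaries) / count

# Sum of squares: the squared deviations from the mean, added up.
sum_of_squares = sum((x - mean) ** 2 for x in salaries)

population_variance = sum_of_squares / count        # divide by N
population_sd = math.sqrt(population_variance)

sample_variance = sum_of_squares / (count - 1)      # divide by n - 1
sample_sd = math.sqrt(sample_variance)

print(f"SS = {sum_of_squares}")
print(f"population: variance = {population_variance:.2f}, sd = {population_sd:.2f}")
print(f"sample:     variance = {sample_variance:.2f}, sd = {sample_sd:.2f}")
```

Dividing by n – 1 always gives a slightly larger value than dividing by N for the same data.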

Page 36: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Using Technology to Find the Standard Deviation

Sample office rental rates (in dollars per square foot per year) for Miami’s central business district are shown in the table. Use a calculator or a computer to find the mean rental rate and the sample standard deviation. (Adapted from: Cushman & Wakefield Inc.)

Office Rental Rates

35.00 33.50 37.00

23.75 26.50 31.25

36.50 40.00 32.00

39.25 37.50 34.75

37.75 37.25 36.75

27.00 35.75 26.00

37.00 29.00 40.50

24.50 33.00 38.00

Page 37: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Using Technology to Find the Standard Deviation

[Figure: calculator/software output showing the sample mean and the sample standard deviation]
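One way to carry out this computation is with Python's standard `statistics` module, used here as a stand-in for the calculator or software the example refers to. In this sketch, `statistics.stdev` is the sample standard deviation (it divides by n – 1); the list name `rates` is illustrative.

```python
import statistics

# Office rental rates (dollars per square foot per year) from the table.
rates = [
    35.00, 33.50, 37.00,
    23.75, 26.50, 31.25,
    36.50, 40.00, 32.00,
    39.25, 37.50, 34.75,
    37.75, 37.25, 36.75,
    27.00, 35.75, 26.00,
    37.00, 29.00, 40.50,
    24.50, 33.00, 38.00,
]

sample_mean = statistics.mean(rates)
sample_sd = statistics.stdev(rates)   # sample standard deviation (n - 1 divisor)

print(f"n = {len(rates)}")
print(f"sample mean = {sample_mean:.2f}")
print(f"sample standard deviation = {sample_sd:.2f}")
```

For a population, `statistics.pstdev` would be the counterpart, since it divides by N instead.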

Page 38: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Interpreting Standard Deviation

Standard deviation is a measure of the typical amount an entry deviates from the mean.
The more the entries are spread out, the greater the standard deviation.

Page 39: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Interpreting Standard Deviation: Empirical Rule (68 – 95 – 99.7 Rule)

For data with a (symmetric) bell-shaped distribution, the standard deviation has the following characteristics:

• About 68% of the data lie within one standard deviation of the mean.

• About 95% of the data lie within two standard deviations of the mean.

• About 99.7% of the data lie within three standard deviations of the mean.

Page 40: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Interpreting Standard Deviation: Empirical Rule (68 – 95 – 99.7 Rule)

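[Figure: bell-shaped curve; horizontal axis marked at x̄ – 3s, x̄ – 2s, x̄ – s, x̄, x̄ + s, x̄ + 2s, x̄ + 3s]

68% within 1 standard deviation (34% on each side of the mean)

95% within 2 standard deviations (a further 13.5% on each side)

99.7% within 3 standard deviations (a further 2.35% on each side)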

Page 41: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Example: Using the Empirical Rule

In a survey conducted by the National Center for Health Statistics, the sample mean height of women in the United States (ages 20-29) was 64 inches, with a sample standard deviation of 2.71 inches. Estimate the percent of the women whose heights are between 64 inches and 69.42 inches.

Page 42: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Solution: Using the Empirical Rule

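[Figure: bell-shaped curve; horizontal axis marked at x̄ – 3s = 55.87, x̄ – 2s = 58.58, x̄ – s = 61.29, x̄ = 64, x̄ + s = 66.71, x̄ + 2s = 69.42, x̄ + 3s = 72.13; the region from 64 to 66.71 is labeled 34% and the region from 66.71 to 69.42 is labeled 13.5%]

• Because the distribution is bell-shaped, you can use the Empirical Rule.

34% + 13.5% = 47.5% of women are between 64 and 69.42 inches tall.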
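The band-by-band addition used in this solution can be sketched as a small lookup. The helper below is illustrative, not part of the slides. It assumes a bell-shaped distribution and that both endpoints fall a whole number of standard deviations from the mean, within three standard deviations.

```python
# Approximate percent of data in each one-standard-deviation band.
# Key k covers the interval from (mean + k*sd) to (mean + (k+1)*sd).
BAND_PERCENT = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}

def empirical_percent(mean, sd, low, high):
    """Estimate the percent of data between low and high with the Empirical Rule."""
    z_low = round((low - mean) / sd)    # standard deviations from the mean
    z_high = round((high - mean) / sd)
    return sum(BAND_PERCENT[k] for k in range(z_low, z_high))

# Heights of women: mean 64 inches, standard deviation 2.71 inches.
print(empirical_percent(64, 2.71, 64, 69.42))
```

The same helper handles intervals that straddle the mean, for example one standard deviation below to two above.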

Page 45: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Important Formulas

Range = Maximum value – Minimum value

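Sample Standard Deviation: $s = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}}$

Sample Variance: $s^2 = \frac{\Sigma (x - \bar{x})^2}{n - 1}$

Population Standard Deviation: $\sigma = \sqrt{\frac{\Sigma (x - \mu)^2}{N}}$

Population Variance: $\sigma^2 = \frac{\Sigma (x - \mu)^2}{N}$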

Page 46: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Using the Empirical Rule

1. The mean value of homes on a street is $125 thousand with a standard deviation of $5 thousand. The data set has a bell-shaped distribution. Estimate the percent of homes valued between $120 thousand and $135 thousand.

$120 thousand is 1 standard deviation below the mean and $135 thousand is 2 standard deviations above the mean.

68% + 13.5% = 81.5%

[Figure: bell-shaped curve; horizontal axis marked at 105, 110, 115, 120, 125, 130, 135, 140, 145]

Page 47: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

2. An instructor recorded the average number of absences for his students in one semester. For a random sample the data are:

2 4 2 0 40 2 4 3 6

Calculate the mean, the median, and the mode, using the appropriate notation. [Hint: is this a sample or a population?]

Page 48: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

3. Find the class width:

Class      Frequency, f
1 – 5      21
6 – 10     16
11 – 15    28
16 – 20    13

A. 3

B. 4

C. 5

D. 19

Page 49: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

4. The mean annual automobile insurance premium is $950, with a standard deviation of $175. The data set has a bell-shaped distribution. Estimate the percent of premiums that are between $600 and $1300.

A. 68%

B. 75%

C. 95%

D. 99.7%

Page 50: Chapter 2 Descriptive Statistics Larson/Farber 4th ed. 1.

1. 81.5% have a value between $120 and $135 thousand.

2. Σx = 63 and n = 9, so x̄ = 63/9 = 7; median = 3; mode = 2. This is a sample, so these are all sample statistics.

3. (C) 5

4. (C) 95%