Chapter 4: Describing Distributions

Post on 05-Jan-2016

1

Chapter 4: Describing Distributions

4.1 Graphs: good and bad
4.2 Displaying distributions with graphs
4.3 Describing distributions with numbers

2

Dow Jones Industrial Average

3

Pie Graph

4

Definitions

Types of variables:
  Categorical, e.g., gender, type of degree
  Quantitative, e.g., time, mass, force, dollars

The distribution of a variable tells us what values it takes and how often it takes these values.
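As a rough illustration of this definition, the sketch below tabulates the distribution of a categorical variable: each value it takes, and how often. The sample data and the helper name `distribution` are made up for the example and do not come from the slides.

```python
from collections import Counter

# Hypothetical sample of a categorical variable (type of degree).
degrees = ["BA", "BS", "BA", "MS", "BS", "BA", "PhD", "BS", "BA", "MS"]

def distribution(values):
    """Return {value: (count, percent of total)} for a list of observations."""
    counts = Counter(values)
    total = len(values)
    return {value: (n, 100 * n / total) for value, n in counts.items()}

dist = distribution(degrees)
for value, (n, pct) in sorted(dist.items()):
    print(f"{value:>4}: count={n}, percent={pct:.1f}")
```

The counts (or percents) per category are exactly what a bar graph or pie graph of the variable would display.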

5

Bar graph showing a distribution

[Figure: bar graph, "Education Level in U.S. (adults age 25+)". Horizontal axis: Years of Schooling. Vertical axis: Percent of Total, 0 to 50.]

No high school degree: 15.9
High school only: 33.1
1-3 years of college: 25.4
4+ years of college: 25.6

6

Exercises, pp. 207-208

4.1, 4.5

7

Bar graph for 4.1: Lottery Game Sales Distribution

[Figure: bar graph. Horizontal axis: Type of Game. Vertical axis: Sales (million $), 0 to 18000.]

Instant: 16420
3-digit: 5245
4-digit: 2776
Lotto: 8865
Other: 5134

8

Pie Chart for 4.1: Lottery Game Sales Distribution (percent of total)

[Figure: pie chart.]

Instant: 42.7
3-digit: 13.6
4-digit: 7.2
Lotto: 23.1
Other: 13.4
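The pie chart percentages can be recovered from the sales figures on the bar graph for 4.1. The sketch below assumes the sales values are listed in the same order as the game types (Instant, 3-digit, 4-digit, Lotto, Other); the variable names are illustrative.

```python
# Sales in millions of dollars, as read from the bar graph for Exercise 4.1.
# Assumption: values are paired with game types in the order they appear.
sales = {
    "Instant": 16420,
    "3-digit": 5245,
    "4-digit": 2776,
    "Lotto": 8865,
    "Other": 5134,
}

total = sum(sales.values())

# Percent of total for each game, rounded to one decimal place.
percents = {game: round(100 * amount / total, 1) for game, amount in sales.items()}

for game, pct in percents.items():
    print(f"{game:>8}: {pct}%")
```

A pie chart only makes sense when the categories are parts of one whole, which is why the values are converted to percents of the total first.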

9

Misleading Pictogram (p. 209)

Worker Salary

$2000/mo

Manager Salary

$4000/mo

10

Dow Jones Industrial Average: This is a line graph (p. 210)

11

Misleading Graphs?

[Figure: two graphs of Monthly Salary ($) by Year, 1994 and 2004.]

First graph, "Salaries are Going Up!": vertical axis runs from 2000 to 2350.
Second graph, "Salaries Barely Increased": vertical axis runs from 0 to 3000.

12

Making good graphs (p. 213)

Graphs must have labels, legends, and titles.

Make the data stand out. Pay attention to what the eye sees.

3-D is really not necessary!

13

Exercises, pp. 214-216

4.6 through 4.8

14

Homework

Problems, pp. 219-221, to be done in Excel: 4.11, 4.15
Email the Excel file by class time on Monday.

Section 4.2 Reading, pp. 221-242

15

4.2 Displaying Distributions with Graphs

16

Displaying distributions graphically

The distribution of a variable tells us what values it takes and how often it takes these values.

Ways to display distributions for quantitative variables:
  dotplots
  histograms
  stemplots

See example on pp. 221-222.

17

Figure 4.15: A histogram

18

Figure 4.16: A stemplot

19

Histograms

Most common graph of the distribution of a quantitative variable.

How to make a histogram: Example 4.9, p. 224
  Range: 5.7 to 17.6
  Shoot for 6-15 classes (bars)
  Size of intervals: (17.6 - 5.7) / 10 = 1.19 ≈ 1

Read paragraph on p. 226
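The class-width arithmetic and the counting step behind a histogram can be sketched as follows. Only the range (5.7 to 17.6) comes from the slide; the data values, the starting point of 5, and the function names are assumptions made for illustration.

```python
import math

def class_width(lo, hi, target_classes=10):
    """Raw interval size: the range divided by the number of classes aimed for."""
    return (hi - lo) / target_classes

def histogram_counts(data, start, width):
    """Count observations per class [left, left + width), keyed by left endpoint."""
    counts = {}
    for x in data:
        k = math.floor((x - start) / width)
        left = start + k * width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

# Raw width for the slide's range; rounded to a convenient size of 1.
print(round(class_width(5.7, 17.6), 2))

# Hypothetical data spanning the same range as Example 4.9.
data = [5.7, 6.2, 8.4, 8.9, 9.1, 9.5, 10.3, 11.0, 12.8, 17.6]
counts = histogram_counts(data, start=5, width=1)
for left, n in counts.items():
    print(f"{left:>2} to {left + 1:<2} | " + "*" * n)
```

Each class includes its left endpoint and excludes its right endpoint, so no observation is counted twice.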

20

Example 4.9, pp. 224-226

21

Practice Problem: 4.18, p. 226

22

Exercise 4.18

Histogram
  By hand
  Using calculator
Stemplot
  By hand

23

Interpreting the graphical displays

Concentrate on the main features.
Overall pattern (p. 230): shape, center, spread
Outliers: individual observations outside the overall pattern of the graph

24

Example 4.10, p. 230

25

Shape

Symmetric or skewed (p. 231)?
Is it unimodal (one hump) or bimodal (two humps)?

26

Homework

Reading: pp. 221-242

27

Stemplots

Usually reserved for smaller data sets.
Advantage: actual (or rounded) data are provided.
Possible drawback: many people are not used to this type of plot, so the presenter/writer has to describe it.

28

How to make a stemplot, p. 236
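One common way to build a stemplot for two-digit whole numbers is to use the tens digit as the stem and the ones digit as the leaf. The sketch below follows that convention, which may differ in detail from the steps on p. 236; the data and the function name `stemplot` are hypothetical.

```python
from collections import defaultdict

def stemplot(data):
    """Build stemplot lines for nonnegative whole numbers.

    Stems are the tens digits, leaves are the ones digits. Every stem
    between the smallest and largest is listed, even if it has no leaves.
    """
    leaves = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)
        leaves[stem].append(leaf)
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | " + "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(row.rstrip())
    return lines

# Hypothetical data set.
data = [52, 31, 23, 59, 38, 25, 31, 57]
lines = stemplot(data)
print("\n".join(lines))
```

Keeping the empty stem in the display preserves the shape of the distribution, including any gaps.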

29

More problems

Exercises: 4.24, 4.25, and 4.26, p. 233

30

Practice

Exercises 4.30, p. 239 and 4.32, p. 240

4.28, p. 238

31

Wrapping up Section 4.2 …

4.28, p. 238; 4.33, p. 242; 4.36; 4.37

32

4.3 Describing Distributions with Numbers

Until now, we’ve been satisfied with using words to describe the center and spread of distributions. Now, we will use numbers to describe these characteristics of a distribution.

The 5-number summary:
  Center: Median (p. 248)
  Spread: Find the quartiles, Q1 and Q3 (p. 250)
  Spread: Min and Max
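A minimal sketch of the 5-number summary follows. It assumes one common textbook rule for quartiles: Q1 is the median of the observations below the overall median's position, and Q3 is the median of those above it. Calculators and software sometimes use slightly different rules, so results can differ a little. The data and function names are hypothetical, and the data set is assumed to have at least two observations.

```python
def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumed quartile rule: split the sorted data at the median's position,
    leaving the median itself out of both halves when n is odd.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 == 1 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

# Hypothetical data set.
summary = five_number_summary([7, 1, 9, 3, 5, 11, 13])
print(summary)
```

These five numbers are the values a boxplot draws: the ends of the whiskers, the edges of the box, and the line inside the box.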

33

Boxplots

We can use this information to construct a boxplot:

34

Practice

4.46, p. 254
Enter data in the Stat Edit menu in your calculator, and order them.

35

Boxplot vs. Modified Boxplot

The modified boxplot shows outliers; they are marked with a *. The lines extending from the quartiles go to the last number which is not an outlier.

If there are no outliers, the modified boxplot and the regular boxplot are identical.

Below are a boxplot (on the left) and modified boxplot (on the right) for Problem 4.39, p. 245.

36

Side-by-side boxplots (p. 252)

37

Practice

Exercises: 4.50 and 4.49, p. 256

38

Testing for Outliers

Find the Inter-Quartile Range: IQR = Q3 - Q1
Multiply: 1.5*IQR
Outliers on low side: values below Q1 - 1.5*IQR
Outliers on high side: values above Q3 + 1.5*IQR

Are there any numbers outside of these values? If so, they are outliers, and are marked on boxplots with an asterisk. The tail is drawn to the highest (or lowest) value which is not an outlier.
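The outlier test above can be sketched in a few lines. The function names, the data, and the quartile values passed in (Q1 = 3, Q3 = 11) are hypothetical; in practice the quartiles would come from the 5-number summary.

```python
def outlier_fences(q1, q3):
    """Return the (low, high) cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def split_outliers(data, q1, q3):
    """Return the outliers and the whisker ends for a modified boxplot.

    The whisker ends are the lowest and highest values that are not outliers.
    """
    low, high = outlier_fences(q1, q3)
    outliers = [x for x in data if x < low or x > high]
    inside = [x for x in data if low <= x <= high]
    return outliers, (min(inside), max(inside))

# Hypothetical data set with assumed quartiles Q1 = 3 and Q3 = 11.
data = [1, 3, 5, 7, 9, 11, 30]
fences = outlier_fences(3, 11)
outliers, whiskers = split_outliers(data, 3, 11)
print(fences, outliers, whiskers)
```

Here the IQR is 8, so any value more than 12 below Q1 or more than 12 above Q3 is flagged; the value 30 is marked with an asterisk and the upper tail stops at 11.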

39

Measures of Center and Spread

Median and IQR
Mean and Standard Deviation
  Mean is the arithmetic average.
  Standard deviation measures, roughly, the average distance of the observations from their mean.
  Variance is simply the squared standard deviation.

All of these statistics can be calculated by hand, but today we use technology to do this.

We use 1-sample stats on our calculators, or a stats program.
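As one example of using a stats program, Python's standard `statistics` module computes these quantities directly. The sketch assumes the sample versions of the formulas (dividing by n - 1), which is what `statistics.stdev` and `statistics.variance` use; the data set is hypothetical.

```python
import statistics

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)          # arithmetic average
s = statistics.stdev(data)            # sample standard deviation (divides by n - 1)
variance = statistics.variance(data)  # sample variance, equal to s squared

print(f"mean = {mean}")
print(f"s = {s:.3f}")
print(f"variance = {variance:.3f}")

# A data set with no spread has a standard deviation of zero.
no_spread = statistics.stdev([3, 3, 3])
print(f"s for identical values = {no_spread}")
```

The sum of squared deviations from the mean is 32 here, so the sample variance is 32/7 and s is its square root.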

40

Properties of standard deviation (p. 259)

Use s as a measure of spread when you use the mean.
If s = 0, there is no spread.
The larger the value of s, the larger the spread of the distribution.

41

Practice Problem

4.52, p. 263
Mike: 59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69

42

Practice Problem

4.55, p. 263

43

Example 4.21, p. 265

44

Choosing a summary

The book has a section on which summary to use (mean and std. dev., or median with the quartiles).

I like to report all of them.

However, when writing about a distribution, or comparing distributions, we should think about which summary works best. See p. 266.

Skewed, outliers: median and quartiles
Symmetrical, no (or few) outliers: mean and std. dev.

Mean and standard deviation are most common. One reason is that they allow for more sophisticated calculations in more advanced statistics.
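A small experiment shows why the median is preferred when there are outliers. The sketch below uses made-up data: a roughly symmetric set, and the same set with one extreme value added.

```python
import statistics

# Hypothetical, roughly symmetric data with no outliers.
symmetric = [48, 49, 50, 51, 52]

# The same data with one extreme high value added.
with_outlier = symmetric + [500]

mean_before = statistics.mean(symmetric)
median_before = statistics.median(symmetric)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"without outlier: mean = {mean_before}, median = {median_before}")
print(f"with outlier:    mean = {mean_after}, median = {median_after}")
```

A single outlier pulls the mean far from the bulk of the data while the median barely moves, which is the reasoning behind reporting the median and quartiles for skewed distributions.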

45

More Practice …

p. 271: 4.57, 4.58, 4.60
