Five-Number Summary: Smallest Value, First Quartile, Median, Third Quartile, Largest Value


Dec 20, 2015

Transcript
Page 1:

Five-Number Summary

1. Smallest Value
2. First Quartile
3. Median
4. Third Quartile
5. Largest Value

Page 2:

Five-Number Summary

Sample of monthly rents ($) for 70 studio apartments, in ascending order:

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Lowest Value = 425
First Quartile = 445
Median = 475
Third Quartile = 525
Largest Value = 615
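The five values above can be reproduced with a short Python sketch. The slides do not say which percentile rule was used; this sketch assumes a common textbook rule (compute i = (p/100)n; if i is not an integer, round up to the next position; if it is an integer, average the values in positions i and i + 1), which matches the quartiles shown. The function names are illustrative only.

```python
# Monthly rents ($) for the 70 apartments listed on this slide.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile(sorted_values, p):
    """Return the p-th percentile (p an integer, 0 < p < 100) of sorted data.

    Assumed rule: i = (p / 100) * n. If i is not an integer, take the value
    in the next higher position; otherwise average positions i and i + 1.
    """
    n = len(sorted_values)
    whole, remainder = divmod(p * n, 100)
    if remainder:
        # i is fractional: 1-based position whole + 1 is 0-based index whole.
        return sorted_values[whole]
    # i is an integer: average 1-based positions i and i + 1.
    return (sorted_values[whole - 1] + sorted_values[whole]) / 2


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    data = sorted(values)
    return (
        data[0],
        percentile(data, 25),
        percentile(data, 50),
        percentile(data, 75),
        data[-1],
    )


print(five_number_summary(rents))  # (425, 445, 475.0, 525, 615)
```

Other software may use an interpolating rule and report slightly different quartiles for the same data.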

Page 3:

Box Plot

A box is drawn with its ends located at the first and third quartiles.

A vertical line is drawn in the box at the location of the median (second quartile).

[Figure: box plot on an axis from 375 to 625, with Q1 = 445, Q2 = 475, and Q3 = 525 marked.]

Page 4:

Box Plot

Limits are located (not drawn) using the interquartile range (IQR).

Data outside these limits are considered outliers.

The location of each outlier is shown with the symbol *.

Page 5:

Box Plot

The interquartile range is IQR = Q3 - Q1 = 525 - 445 = 80.

The lower limit is located 1.5(IQR) below Q1.

Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325

The upper limit is located 1.5(IQR) above Q3.

Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645

There are no outliers (values less than 325 or greater than 645) in the apartment rent data.
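As a quick check of the limit arithmetic, here is a minimal Python sketch. It takes the quartiles Q1 = 445 and Q3 = 525 from the slides, so IQR = 525 - 445 = 80. The function names and the second test list are made up for illustration.

```python
def box_plot_limits(q1, q3, k=1.5):
    """Return (lower, upper) limits located k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, lower, upper):
    """Return the values lying strictly outside the limits."""
    return [v for v in values if v < lower or v > upper]


lower, upper = box_plot_limits(445, 525)
print(lower, upper)  # 325.0 645.0

# The smallest and largest rents are 425 and 615. If neither extreme is
# outside the limits, no other value in the data can be an outlier.
print(find_outliers([425, 615], lower, upper))  # []

# A made-up list, only to show what a flagged outlier looks like.
print(find_outliers([300, 500, 700], lower, upper))  # [300, 700]
```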

Page 6:

Box Plot

• Whiskers (dashed lines) are drawn from the ends of the box to the smallest and largest data values inside the limits.

[Figure: box plot with whiskers on an axis from 375 to 625. Smallest value inside limits = 425; largest value inside limits = 615.]

Page 7:

Measures of Association Between Two Variables

• Covariance

• Correlation coefficient

Page 8:

Covariance

• Covariance is a measure of linear association between variables.

• Positive values indicate a positive correlation between variables.

• Negative values indicate a negative correlation between variables.

Page 9:

To compute a covariance for variables x and y

For populations:

$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$

For samples:

$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
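The two covariance formulas differ only in the divisor (N for a population, n - 1 for a sample). The Python sketch below shows this with one function; the data are made up for illustration and do not come from the slides.

```python
def covariance(xs, ys, sample=True):
    """Covariance of paired observations.

    Divides by n - 1 when sample is True and by n otherwise.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the paired deviations from the means.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)


# Illustrative data only: y rises with x, so the covariance is positive.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

print(covariance(x, y, sample=False))  # 4.0
print(covariance(x, y, sample=True))   # 5.0
```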

Page 10:

Mortgage Interest Rates and Monthly Home Sales, 1980-2004

[Figure: scatter plot of Mortgage Interest Rate (Percent), vertical axis from 3 to 17, against Monthly Home Sales (thousands), horizontal axis from 15 to 115; n = 299. The plot is labeled with x̄ = 60.3, ȳ = 9.02, and quadrants I, II, III, and IV.]

Page 11:

If the majority of the sample points are located in quadrants II and IV, you have a negative correlation between the variables, as we do in this case. In those quadrants the deviations of x and y from their means have opposite signs, so their products are negative. Thus the covariance will have a negative sign.

Page 12:

The (Pearson) Correlation Coefficient

A covariance will tell you whether two variables are positively or negatively correlated, but it will not tell you the degree of correlation. Moreover, the covariance is sensitive to the unit of measurement. The correlation coefficient does not suffer from these defects.

Page 13:

The (Pearson) Correlation Coefficient

For populations:

$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

For samples:

$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$

Note that:

$$-1 \le \rho_{xy} \le 1 \quad \text{and} \quad -1 \le r_{xy} \le 1$$
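The sketch below computes the sample correlation as the sample covariance divided by the product of the sample standard deviations. It also shows that changing the unit of measurement rescales the covariance but leaves the correlation unchanged. The data and function names are hypothetical, chosen only for illustration.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(xs):
    """Sample standard deviation (the covariance of xs with itself, rooted)."""
    return math.sqrt(sample_covariance(xs, xs))


def correlation(xs, ys):
    """Pearson sample correlation r = s_xy / (s_x * s_y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


# Hypothetical data: distances in yards and matching scores.
yards = [250.0, 260.0, 270.0, 280.0]
scores = [72.0, 71.0, 71.0, 69.0]
feet = [3 * v for v in yards]  # same distances in different units

# The covariance triples when yards become feet...
print(sample_covariance(yards, scores), sample_covariance(feet, scores))
# ...while the correlation is unchanged (up to floating-point rounding).
print(correlation(yards, scores), correlation(feet, scores))
```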

Page 14:

Correlation Coefficient = 1

[Figure: plot of Distance Traveled in 5 Hours (Miles), vertical axis from 0 to 500, against Average Speed (MPH), horizontal axis from 0 to 100.]

Page 15:

Correlation Coefficient = -1

[Figure: plot of Time Spent Swimming (Hours), vertical axis from 0 to 8, against Time Spent Jogging (Hours), horizontal axis from 0 to 8. Annotation: "I have 7 hours per week for exercise".]

Page 16:

Example: Golf Stats

A golfer is interested in investigating the relationship, if any, between driving distance and 18-hole score.

Average Driving Distance (yds.)    Average 18-Hole Score
277.6                              69
259.5                              71
269.1                              70
267.0                              70
255.6                              71
272.9                              69

Page 17:

Using Excel to Compute the Covariance and Correlation Coefficient

Formula Worksheet

     A               B               C                   D
1    Average Drive   18-Hole Score
2    277.6           69              Pop. Covariance     =COVAR(A2:A7,B2:B7)
3    259.5           71              Samp. Correlation   =CORREL(A2:A7,B2:B7)
4    269.1           70
5    267.0           70
6    255.6           71
7    272.9           69

Page 18:

Using Excel to Compute the Covariance and Correlation Coefficient

Value Worksheet

     A               B               C                   D
1    Average Drive   18-Hole Score
2    277.6           69              Pop. Covariance     -5.9
3    259.5           71              Samp. Correlation   -0.9631
4    269.1           70
5    267.0           70
6    255.6           71
7    272.9           69
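The worksheet values can be checked outside Excel. The Python sketch below uses the six golf observations from the slides and divides the covariance by n, which is how the population covariance labeled on the worksheet is defined. The correlation is the same whether n or n - 1 is used, because the divisor cancels. The function names are illustrative.

```python
import math

# Golf data from the slides.
drive = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]  # average drive (yds.)
score = [69, 71, 70, 70, 71, 69]                     # average 18-hole score


def mean(values):
    return sum(values) / len(values)


def population_covariance(xs, ys):
    """Covariance with divisor n (population form)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Pearson correlation from sums of squares and cross-products."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print(round(population_covariance(drive, score), 1))  # -5.9
print(round(correlation(drive, score), 4))            # -0.9631
```

The strongly negative correlation says that, in this small sample, longer average drives go with lower (better) scores.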

Page 19:

The Weighted Mean and Working with Grouped Data

• Weighted mean
• Mean for grouped data
• Variance for grouped data
• Standard deviation for grouped data

Page 20:

GPA Example

A grade point average is a weighted mean. That is, 4-hour courses are weighted more heavily than 3-hour courses when computing a GPA.

Page 21:

The Weighted Mean

$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$

where $w_i$ is the weight attached to observation $i$.

Page 22:

Example: Raw Materials Purchase

Purchase   Cost per Pound ($)   Number of Pounds
1          3.00                 1200
2          3.40                 500
3          2.80                 2750
4          2.90                 1000
5          3.25                 800

Let x1 = 3.00, x2 = 3.40, x3 = 2.80, x4 = 2.90, and x5 = 3.25

Let w1 = 1200, w2 = 500, w3 = 2750, w4 = 1000, and w5 = 800

Thus:

$$\bar{x} = \frac{1200(3.00) + 500(3.40) + 2750(2.80) + 1000(2.90) + 800(3.25)}{1200 + 500 + 2750 + 1000 + 800} = \frac{18{,}500}{6250} = 2.96$$
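The same computation as a minimal Python sketch, using the costs and quantities from the purchase table. The function name is illustrative.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


cost_per_pound = [3.00, 3.40, 2.80, 2.90, 3.25]  # x_i, in dollars
pounds = [1200, 500, 2750, 1000, 800]            # w_i

print(round(weighted_mean(cost_per_pound, pounds), 2))  # 2.96
```

The unweighted mean of the five prices is 3.07. The weighted mean is lower because the largest purchase (2750 pounds) was made at the lowest price.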

Page 23:

Grouped Data

The weighted mean computation can be used to obtain approximations of the mean, variance, and standard deviation for the grouped data.

To compute the weighted mean, we treat the midpoint of each class as though it were the mean of all items in the class.

We compute a weighted mean of the class midpoints using the class frequencies as weights.

Similarly, in computing the variance and standard deviation, the class frequencies are used as weights.

Page 24:

Mean for Grouped Data

For samples:

$$\bar{x} = \frac{\sum f_i M_i}{n}$$

For populations:

$$\mu = \frac{\sum f_i M_i}{N}$$

where $f_i$ is the frequency of class $i$ and $M_i$ is the midpoint of class $i$.

Page 25:

Example: Apartment Rents

Given below is the previous sample of monthly rents for 70 studio apartments, presented here as grouped data in the form of a frequency distribution.

Rent ($)   Frequency
420-439    8
440-459    17
460-479    12
480-499    8
500-519    7
520-539    4
540-559    2
560-579    4
580-599    2
600-619    6

Page 26:

Sample Mean for Grouped Data

Rent ($)   f_i   M_i     f_i M_i
420-439    8     429.5   3436.0
440-459    17    449.5   7641.5
460-479    12    469.5   5634.0
480-499    8     489.5   3916.0
500-519    7     509.5   3566.5
520-539    4     529.5   2118.0
540-559    2     549.5   1099.0
560-579    4     569.5   2278.0
580-599    2     589.5   1179.0
600-619    6     609.5   3657.0
Total      70            34525.0

$$\bar{x} = \frac{34{,}525}{70} = 493.21$$

This approximation differs by $2.41 from the actual sample mean of $490.80.
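The grouped-data mean is a weighted mean of the class midpoints, with the class frequencies as weights. A minimal Python sketch using the rent classes from the slides follows; the tuple layout and function names are assumptions made for illustration.

```python
# (lower class limit, upper class limit, frequency) for each rent class.
rent_classes = [
    (420, 439, 8), (440, 459, 17), (460, 479, 12), (480, 499, 8),
    (500, 519, 7), (520, 539, 4), (540, 559, 2), (560, 579, 4),
    (580, 599, 2), (600, 619, 6),
]


def midpoint(lower, upper):
    """Class midpoint M_i."""
    return (lower + upper) / 2


def grouped_mean(classes):
    """Approximate mean: sum of f_i * M_i divided by the total frequency."""
    n = sum(f for _, _, f in classes)
    total = sum(f * midpoint(lo, hi) for lo, hi, f in classes)
    return total / n


print(sum(f * midpoint(lo, hi) for lo, hi, f in rent_classes))  # 34525.0
print(round(grouped_mean(rent_classes), 2))                     # 493.21
```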

Page 27:

Variance for Grouped Data

For populations:

$$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$$

For samples:

$$s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1}$$

Page 28:

Sample Variance for Grouped Data

Rent ($)   f_i   M_i     M_i - x̄   (M_i - x̄)^2   f_i(M_i - x̄)^2
420-439    8     429.5   -63.7      4058.96        32471.71
440-459    17    449.5   -43.7      1910.56        32479.59
460-479    12    469.5   -23.7      562.16         6745.97
480-499    8     489.5   -3.7       13.76          110.11
500-519    7     509.5   16.3       265.36         1857.55
520-539    4     529.5   36.3       1316.96        5267.86
540-559    2     549.5   56.3       3168.56        6337.13
560-579    4     569.5   76.3       5820.16        23280.66
580-599    2     589.5   96.3       9271.76        18543.53
600-619    6     609.5   116.3      13523.36       81140.18
Total      70                                      208234.29

Page 29:

Sample Variance for Grouped Data

• Sample Variance

$$s^2 = \frac{208{,}234.29}{70 - 1} = 3{,}017.89$$

• Sample Standard Deviation

$$s = \sqrt{3{,}017.89} = 54.94$$

This approximation differs by only $.20 from the actual standard deviation of $54.74.
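The grouped-data variance and standard deviation can be checked with a short Python sketch using the rent classes from the slides. It works with the unrounded mean, so intermediate values may differ slightly from a table built on a rounded mean, but the final results agree to two decimals. The tuple layout and function name are illustrative.

```python
import math

# (lower class limit, upper class limit, frequency) for each rent class.
rent_classes = [
    (420, 439, 8), (440, 459, 17), (460, 479, 12), (480, 499, 8),
    (500, 519, 7), (520, 539, 4), (540, 559, 2), (560, 579, 4),
    (580, 599, 2), (600, 619, 6),
]


def grouped_sample_variance(classes):
    """Approximate s^2 = sum of f_i * (M_i - mean)^2 divided by n - 1."""
    n = sum(f for _, _, f in classes)
    # Pair each class midpoint M_i with its frequency f_i.
    mids = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(f * m for m, f in mids) / n
    squared_dev_sum = sum(f * (m - mean) ** 2 for m, f in mids)
    return squared_dev_sum / (n - 1)


variance = grouped_sample_variance(rent_classes)
std_dev = math.sqrt(variance)

print(round(variance, 2))  # 3017.89
print(round(std_dev, 2))   # 54.94
```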