Chapter 9 Estimation Using a Single Sample

Chapter 9 Estimation Using a Single Sample 2 A point estimate of a population characteristic is a single number that is based on sample data and represents.

Dec 29, 2015

Transcript

Chapter 9

Estimation Using a Single Sample

Copyright (c) 2001 Brooks/Cole, a division of Thomson Learning, Inc.

Point Estimation

A point estimate of a population characteristic is a single number that is based on sample data and represents a plausible value of the characteristic.


Example

A sample of 200 students at a large university is selected to estimate the proportion of students who wear contact lenses. In this sample, 47 wear contact lenses.

Let $\pi$ = the true proportion of all students at this university who wear contact lenses. Consider a “success” to be a student who wears contact lenses.

The statistic

$$p = \frac{\text{number of successes in the sample}}{n}$$

is a reasonable choice for a formula to obtain a point estimate for $\pi$.

Such a point estimate is

$$p = \frac{47}{200} = 0.235$$


Example

A sample of weights of 34 male freshman students was obtained.

185 161 174 175 202 178 202 139 177
170 151 176 197 214 283 184 189 168
188 170 207 180 167 177 166 231 176
184 179 155 148 180 194 176

If one wanted to estimate the true mean weight of all male freshman students, one might use the sample mean as a point estimate for the true mean.

$$\text{sample mean} = \bar{x} = 182.44$$


Example

After looking at a histogram and boxplot of the data, you might notice that the data seem reasonably symmetric with an outlier, so you might use either the sample median or a sample trimmed mean as a point estimate.

[Figure: histogram and boxplot of the weights, horizontal axis marked at 140, 180, 220, 260]

5% trimmed mean (calculated with Minitab) = 180.07

$$\text{sample median} = \frac{177 + 178}{2} = 177.5$$
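The three point estimates quoted for the weight data can be checked with a short Python sketch. The trimming rule used here (drop 5% of the observations from each end, with the count rounded to the nearest whole number) is an assumption chosen to reproduce the Minitab figure, and the helper name `trimmed_mean` is illustrative only.

```python
from statistics import mean, median

# Weights of the 34 male freshman students from the example.
weights = [185, 161, 174, 175, 202, 178, 202, 139, 177,
           170, 151, 176, 197, 214, 283, 184, 189, 168,
           188, 170, 207, 180, 167, 177, 166, 231, 176,
           184, 179, 155, 148, 180, 194, 176]


def trimmed_mean(values, proportion=0.05):
    """Mean after dropping `proportion` of the observations from each end.

    Assumption: the number dropped per end is rounded to the nearest
    integer (0.05 * 34 = 1.7, so 2 values are dropped from each end).
    """
    ordered = sorted(values)
    k = round(proportion * len(ordered))
    return mean(ordered[k:len(ordered) - k])


sample_mean = mean(weights)
sample_median = median(weights)
sample_trimmed = trimmed_mean(weights)

print(f"mean = {sample_mean:.2f}")
print(f"median = {sample_median}")
print(f"5% trimmed mean = {sample_trimmed:.2f}")
```

The outlier (283) pulls the mean above both the median and the trimmed mean, which is why the more resistant estimates may be preferred here.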


Bias

A statistic with mean value equal to the value of the population characteristic being estimated is said to be an unbiased statistic. A statistic that is not unbiased is said to be biased.

[Figure: sampling distribution of an unbiased statistic and sampling distribution of a biased statistic, each shown relative to the true value]


Criteria

Given a choice between several unbiased statistics that could be used for estimating a population characteristic, the best statistic to use is the one with the smallest standard deviation.

[Figure: sampling distributions of unbiased statistics around the true value; the unbiased sampling distribution with the smallest standard deviation is the best choice]
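One way to see this criterion at work is a small simulation. The sketch below assumes a hypothetical normal population (mean 180, standard deviation 20) and compares the sample mean and the sample median as estimators of the population mean. The population, sample size, number of repetitions, and seed are all illustrative choices, not values from the slides.

```python
import random
from statistics import mean, median, pstdev

random.seed(9)  # fixed seed so the run is repeatable

# Hypothetical population and simulation settings (assumptions).
MU, SIGMA, N, REPS = 180.0, 20.0, 25, 2000

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(mean(sample))
    medians.append(median(sample))

# Centers of the two simulated sampling distributions.
center_of_means = mean(means)
center_of_medians = mean(medians)

# Spreads of the two simulated sampling distributions.
sd_of_means = pstdev(means)
sd_of_medians = pstdev(medians)

print(f"mean:   center {center_of_means:.2f}, sd {sd_of_means:.2f}")
print(f"median: center {center_of_medians:.2f}, sd {sd_of_medians:.2f}")
```

For a normal population both statistics are centered near the true mean, but the sample mean varies less from sample to sample, so it would be the preferred choice under this criterion.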


Large-sample Confidence Interval for a Population Proportion

A confidence interval for a population characteristic is an interval of plausible values for the characteristic. It is constructed so that, with a chosen degree of confidence, the value of the characteristic will be captured inside the interval.


Confidence Level

The confidence level associated with a confidence interval estimate is the success rate of the method used to construct the interval.
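The phrase "success rate of the method" can be illustrated by simulation. The sketch below assumes a hypothetical true proportion of 0.4 and samples of size 200, builds the large-sample 95% interval for a proportion (the formula developed later in this chapter) from each simulated sample, and counts how often the interval captures the true value. All settings and names are illustrative assumptions.

```python
import random
from math import sqrt

random.seed(2001)  # fixed seed so the run is repeatable

# Hypothetical true proportion and simulation settings (assumptions).
TRUE_PI, N, REPS = 0.4, 200, 2000


def large_sample_interval(successes, n, z=1.96):
    """Large-sample interval p +/- z * sqrt(p(1 - p) / n)."""
    p = successes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


hits = 0
for _ in range(REPS):
    # Each observation is a success with probability TRUE_PI.
    successes = sum(random.random() < TRUE_PI for _ in range(N))
    low, high = large_sample_interval(successes, N)
    hits += low <= TRUE_PI <= high

coverage = hits / REPS
print(f"fraction of intervals capturing the true proportion: {coverage:.3f}")
```

The printed fraction should land close to 0.95. The confidence level describes the long-run behavior of the procedure, not any single interval.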


Recall

For the sampling distribution of $p$,

$$\mu_p = \pi, \qquad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}},$$

and for large* $n$ the sampling distribution of $p$ is approximately normal.

Specifically, when $n$ is large*, the statistic $p$ has a sampling distribution that is approximately normal with mean $\pi$ and standard deviation $\sqrt{\pi(1-\pi)/n}$.

* $n\pi \ge 10$ and $n(1-\pi) \ge 10$
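As a rough sketch, the large-sample condition and the standard deviation of the sample proportion can be written as two small Python helpers, with `pi` standing for the population proportion. The function names and the values n = 200, pi = 0.235 are illustrative assumptions.

```python
from math import sqrt


def is_large_sample(n, pi):
    """Check the rule of thumb n*pi >= 10 and n*(1 - pi) >= 10."""
    return n * pi >= 10 and n * (1 - pi) >= 10


def sd_of_sample_proportion(n, pi):
    """Standard deviation of the sample proportion: sqrt(pi(1 - pi) / n)."""
    return sqrt(pi * (1 - pi) / n)


# Illustrative values only (assumptions, not from the slides).
n, pi = 200, 0.235
large_enough = is_large_sample(n, pi)
sd = sd_of_sample_proportion(n, pi)
print(large_enough, round(sd, 4))
```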


Some considerations

Approximately 95% of all large samples will result in a value of $p$ that is within

$$1.96\,\sigma_p = 1.96\sqrt{\frac{\pi(1-\pi)}{n}}$$

of the true population proportion $\pi$.


Some considerations

Equivalently, this means that for 95% of all possible samples, $\pi$ will be in the interval

$$p - 1.96\sqrt{\frac{\pi(1-\pi)}{n}} \quad\text{to}\quad p + 1.96\sqrt{\frac{\pi(1-\pi)}{n}}$$

Since $\pi$ is unknown and $n$ is large, we estimate

$$\sqrt{\frac{\pi(1-\pi)}{n}} \quad\text{with}\quad \sqrt{\frac{p(1-p)}{n}}$$

This interval can be used as long as $np \ge 10$ and $n(1-p) \ge 10$.


The 95% Confidence Interval

When $n$ is large, a 95% confidence interval for $\pi$ is

$$\left( p - 1.96\sqrt{\frac{p(1-p)}{n}},\; p + 1.96\sqrt{\frac{p(1-p)}{n}} \right)$$

The endpoints of the interval are often abbreviated by

$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$

where - gives the lower endpoint and + the upper endpoint.


Example

For a project, a student randomly sampled 182 other students at a large university to determine if the majority of students were in favor of a proposal to build a field house. He found that 75 were in favor of the proposal.

Let $\pi$ = the true proportion of students that favor the proposal.


Example - continued

$$p = \frac{75}{182} = 0.4121$$

Since $np = 182(0.4121) = 75 > 10$ and $n(1-p) = 182(0.5879) = 107 > 10$, we can use the formulas given earlier to find a 95% confidence interval for $\pi$.

$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}} = 0.4121 \pm 1.96\sqrt{\frac{0.4121(0.5879)}{182}} = 0.4121 \pm 0.07151$$

The 95% confidence interval for $\pi$ is (0.341, 0.484).
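The field house calculation can be reproduced with a few lines of Python. This is a minimal sketch of the large-sample interval p ± z·sqrt(p(1 − p)/n). The function name `proportion_ci` and the decision to raise an error when the sample-size check fails are illustrative choices, not part of the slides.

```python
from math import sqrt


def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n
    # Rule of thumb from the slides: np >= 10 and n(1 - p) >= 10.
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("sample too small for the large-sample interval")
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# 75 of 182 sampled students favored the proposal.
low, high = proportion_ci(75, 182)
print(f"({low:.3f}, {high:.3f})")  # prints (0.341, 0.484)
```

Because the whole interval lies below 0.5, these data do not support the claim that a majority of students favor the proposal.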


The General Confidence Interval

The general formula for a confidence interval for a population proportion $\pi$ when

1. $p$ is the sample proportion from a random sample, and

2. the sample size $n$ is large ($np \ge 10$ and $n(1-p) \ge 10$)

is

$$p \pm (z\text{ critical value})\sqrt{\frac{p(1-p)}{n}}$$


Finding a z Critical Value

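Finding a z critical value for a 98% confidence interval:

A central area of 0.98 leaves 0.01 in each tail, so the cumulative area to the left of the critical value is 0.9900. Looking up the cumulative area 0.9900 in the body of the standard normal table, we find z = 2.33.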


Some Common Critical Values

Confidence level    z critical value
80%                 1.28
90%                 1.645
95%                 1.96
98%                 2.33
99%                 2.58
99.8%               3.09
99.9%               3.29
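The table lookup can also be done numerically. The sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module: for a confidence level C, the critical value is the point with cumulative area 1 − (1 − C)/2 to its left. The helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist


def z_critical(confidence):
    """z value that captures a central area equal to `confidence`."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  {z_critical(level):.3f}")
```

For example, `z_critical(0.98)` looks up the cumulative area 0.9900 and returns a value that rounds to 2.33.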


Terminology

The standard error of a statistic is the estimated standard deviation of the statistic.

For sample proportions, the standard deviation is

$$\sqrt{\frac{\pi(1-\pi)}{n}}$$

This means that the standard error of the sample proportion is

$$\sqrt{\frac{p(1-p)}{n}}$$


Terminology

The bound on error of estimation, B, associated with a 95% confidence interval is

(1.96)·(standard error of the statistic).

More generally, the bound on error of estimation, B, associated with a confidence interval is

(z critical value)·(standard error of the statistic).


Sample Size

The sample size required to estimate a population proportion $\pi$ to within an amount B with 95% confidence is

$$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2$$

The value of $\pi$ may be estimated by prior information. If no prior information is available, use $\pi = 0.5$ in the formula to obtain a conservatively large value for n. Generally one rounds the result up to the nearest integer.


Sample Size Calculation Example

A TV executive would like to find a 95% confidence interval estimate within 0.03 for the proportion of all households that watch NYPD Blue regularly. How large a sample is needed if a prior estimate for $\pi$ was 0.15?

We have B = 0.03 and the prior estimate of $\pi$ = 0.15.

$$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2 = (0.15)(0.85)\left(\frac{1.96}{0.03}\right)^2 = 544.2$$

A sample of 545 or more would be needed.


Sample Size Calculation Example revisited

Suppose a TV executive would like to find a 95% confidence interval estimate within 0.03 for the proportion of all households that watch NYPD Blue regularly. How large a sample is needed if we have no reasonable prior estimate for $\pi$?

We have B = 0.03 and should use $\pi$ = 0.5 in the formula.

$$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2 = (0.5)(0.5)\left(\frac{1.96}{0.03}\right)^2 = 1067.1$$

The required sample size is now 1068.

Notice that a reasonable ballpark estimate for $\pi$ can lower the needed sample size.


Another Example

A college professor wants to estimate the proportion of students at a large university who favor building a field house with a 99% confidence interval accurate to within 0.02. If one of his students performed a preliminary study and estimated $\pi$ to be 0.412, how large a sample should he take?

We have B = 0.02, a prior estimate $\pi$ = 0.412, and we should use the z critical value 2.58 (for a 99% confidence interval).

$$n = \pi(1-\pi)\left(\frac{2.58}{B}\right)^2 = (0.412)(0.588)\left(\frac{2.58}{0.02}\right)^2 = 4031.4$$

The required sample size is 4032.
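The three sample size calculations follow one pattern, sketched below in Python: n = π(1 − π)(z/B)², rounded up to the next integer. The function name `required_sample_size` and its default arguments (π = 0.5 when no prior estimate is available, z = 1.96 for 95% confidence) are illustrative choices.

```python
from math import ceil


def required_sample_size(bound, pi=0.5, z=1.96):
    """Smallest n with z * sqrt(pi(1 - pi) / n) <= bound.

    pi defaults to 0.5, the conservative choice when no prior
    estimate of the population proportion is available.
    """
    return ceil(pi * (1 - pi) * (z / bound) ** 2)


tv_with_prior = required_sample_size(0.03, pi=0.15)          # 545
tv_no_prior = required_sample_size(0.03)                     # 1068
field_house = required_sample_size(0.02, pi=0.412, z=2.58)   # 4032
print(tv_with_prior, tv_no_prior, field_house)
```

Rounding up rather than to the nearest integer ensures the bound on error is no larger than B.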


One-Sample z Confidence Interval for $\mu$

The general formula for a confidence interval for a population mean $\mu$ when

1. $\bar{x}$ is the sample mean from a random sample,

2. the sample size n is large (generally $n \ge 30$), and

3. $\sigma$, the population standard deviation, is known

is

$$\bar{x} \pm (z\text{ critical value})\frac{\sigma}{\sqrt{n}}$$


One-Sample z Confidence Interval for $\mu$

If n is small (generally n < 30) but it is reasonable to believe that the distribution of values in the population is normal, a confidence interval for $\mu$ (when $\sigma$ is known) is

$$\bar{x} \pm (z\text{ critical value})\frac{\sigma}{\sqrt{n}}$$

Notice that this formula works when $\sigma$ is known and either

1. n is large (generally $n \ge 30$) or

2. the population distribution is normal (any sample size).


Example

A certain filling machine has a true population standard deviation $\sigma = 0.228$ ounces when used to fill catsup bottles. A random sample of 36 “6 ounce” bottles of catsup was selected from the output from this machine and the sample mean was $\bar{x} = 6.018$ ounces.

Find a 90% confidence interval estimate for the true mean fill of catsup from this machine.


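Example I (continued)

$$\bar{x} = 6.018, \quad \sigma = 0.228, \quad n = 36$$

The z critical value is 1.645.

$$\bar{x} \pm (z\text{ critical value})\frac{\sigma}{\sqrt{n}} = 6.018 \pm 1.645\,\frac{0.228}{\sqrt{36}} = 6.018 \pm 0.063$$

90% Confidence Interval

(5.955, 6.081)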
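The catsup calculation can be checked with a short Python sketch of the known-sigma interval, sample mean ± z·σ/√n. The function name `z_interval_for_mean` is illustrative, and the critical value 1.645 is taken from the table of common critical values.

```python
from math import sqrt


def z_interval_for_mean(x_bar, sigma, n, z):
    """Confidence interval for a mean when sigma is known."""
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Sample mean 6.018 oz, sigma 0.228 oz, n = 36, 90% confidence.
low, high = z_interval_for_mean(6.018, 0.228, 36, 1.645)
print(f"({low:.3f}, {high:.3f})")  # prints (5.955, 6.081)
```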


Unknown $\sigma$ - Small Size Samples [All Size Samples]

W. S. Gosset, an English statistician working at the Guinness brewery in Dublin, developed the techniques and derived the Student’s t distributions that describe the behavior of

$$\frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$


t Distributions

If X is a normally distributed random variable, the statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

follows a t distribution with df = n - 1 (degrees of freedom).


t Distributions

The statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

is fairly robust, and the results are reasonable for moderate sample sizes (15 and up) if x is just reasonably centrally weighted. It is also quite reasonable for large sample sizes for distributional patterns (of x) that are not extremely skewed.


[Figure: Comparison of normal and t distributions, showing t curves for df = 2, 5, 10, and 25 alongside the normal curve, horizontal axis from -4 to 4]


t Distributions Continued

Notice: As df increases, t distributions approach the standard normal distribution.

Since each t distribution would require a table similar to the standard normal table, we usually only create a table of critical values for the t distributions.
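The convergence toward the standard normal curve can be checked numerically. The sketch below codes the standard formula for the t density using `math.gamma` and compares the height of each curve at 0 with the height of the standard normal curve. The helper name `t_pdf` and the choice to compare at 0 are illustrative assumptions.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t distribution with `df` degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


normal_peak = NormalDist().pdf(0)

# Gap between the normal peak and each t peak, for growing df.
gaps = [normal_peak - t_pdf(0, df) for df in (2, 5, 10, 25)]
for df, gap in zip((2, 5, 10, 25), gaps):
    print(f"df = {df:2d}: t peak is {gap:.4f} below the normal peak")
```

The gap shrinks as df grows, reflecting how the heavier tails of the t distribution thin out toward the normal shape.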


Central area captured:   0.80    0.90    0.95    0.98    0.99    0.998   0.999
Confidence level:        80%     90%     95%     98%     99%     99.8%   99.9%

Degrees of freedom
1                        3.08    6.31    12.71   31.82   63.66   318.29  636.58
2                        1.89    2.92    4.30    6.96    9.92    22.33   31.60
3                        1.64    2.35    3.18    4.54    5.84    10.21   12.92
4                        1.53    2.13    2.78    3.75    4.60    7.17    8.61
5                        1.48    2.02    2.57    3.36    4.03    5.89    6.87
6                        1.44    1.94    2.45    3.14    3.71    5.21    5.96
7                        1.41    1.89    2.36    3.00    3.50    4.79    5.41
8                        1.40    1.86    2.31    2.90    3.36    4.50    5.04
9                        1.38    1.83    2.26    2.82    3.25    4.30    4.78
10                       1.37    1.81    2.23    2.76    3.17    4.14    4.59
11                       1.36    1.80    2.20    2.72    3.11    4.02    4.44
12                       1.36    1.78    2.18    2.68    3.05    3.93    4.32
13                       1.35    1.77    2.16    2.65    3.01    3.85    4.22
14                       1.35    1.76    2.14    2.62    2.98    3.79    4.14
15                       1.34    1.75    2.13    2.60    2.95    3.73    4.07
16                       1.34    1.75    2.12    2.58    2.92    3.69    4.01
17                       1.33    1.74    2.11    2.57    2.90    3.65    3.97
18                       1.33    1.73    2.10    2.55    2.88    3.61    3.92
19                       1.33    1.73    2.09    2.54    2.86    3.58    3.88
20                       1.33    1.72    2.09    2.53    2.85    3.55    3.85
21                       1.32    1.72    2.08    2.52    2.83    3.53    3.82
22                       1.32    1.72    2.07    2.51    2.82    3.50    3.79
23                       1.32    1.71    2.07    2.50    2.81    3.48    3.77
24                       1.32    1.71    2.06    2.49    2.80    3.47    3.75
25                       1.32    1.71    2.06    2.49    2.79    3.45    3.73
26                       1.31    1.71    2.06    2.48    2.78    3.43    3.71
27                       1.31    1.70    2.05    2.47    2.77    3.42    3.69
28                       1.31    1.70    2.05    2.47    2.76    3.41    3.67
29                       1.31    1.70    2.05    2.46    2.76    3.40    3.66
30                       1.31    1.70    2.04    2.46    2.75    3.39    3.65
40                       1.30    1.68    2.02    2.42    2.70    3.31    3.55
60                       1.30    1.67    2.00    2.39    2.66    3.23    3.46
120                      1.29    1.66    1.98    2.36    2.62    3.16    3.37
z critical values        1.28    1.645   1.96    2.33    2.58    3.09    3.29


One-Sample t Procedures

Suppose that an SRS of size n is drawn from a population having unknown mean $\mu$. The general confidence limits are

$$\bar{x} \pm (t\text{ critical value})\frac{s}{\sqrt{n}}$$

and the general confidence interval for $\mu$ is

$$\left( \bar{x} - (t\text{ critical value})\frac{s}{\sqrt{n}},\; \bar{x} + (t\text{ critical value})\frac{s}{\sqrt{n}} \right)$$


Confidence Interval Example

Ten randomly selected shut-ins were each asked to list how many hours of television they watched per week. The results are

82 66 90 84 75

88 80 94 110 91

Find a 90% confidence interval estimate for the true mean number of hours of television watched per week by shut-ins.


Confidence Interval Example Continued

Calculating the sample mean and standard deviation, we have

$$n = 10, \quad \bar{x} = 86, \quad s = 11.842$$

We find the critical t value of 1.833 by looking on the t table in the row corresponding to df = 9, in the column labeled 90%. Computing the confidence interval for $\mu$ gives

$$\bar{x} \pm t^{*}\frac{s}{\sqrt{n}} = 86 \pm (1.833)\frac{11.842}{\sqrt{10}} = 86 \pm 6.86$$

$$(79.14,\ 92.86)$$
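The shut-in example can be reproduced with Python's standard `statistics` module. The standard library has no t distribution, so this sketch hard-codes the critical value 1.833 quoted in the example (df = 9, 90% confidence) instead of computing it. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Weekly television hours reported by the ten shut-ins.
hours = [82, 66, 90, 84, 75, 88, 80, 94, 110, 91]

# t critical value for df = 9 and 90% confidence, taken from the example
# (assumed here as a constant, not computed).
T_CRITICAL = 1.833

n = len(hours)
x_bar = mean(hours)
s = stdev(hours)                      # sample standard deviation (divisor n - 1)
margin = T_CRITICAL * s / sqrt(n)
low, high = x_bar - margin, x_bar + margin

print(f"x_bar = {x_bar}, s = {s:.3f}")
print(f"90% CI: ({low:.2f}, {high:.2f})")  # prints 90% CI: (79.14, 92.86)
```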


Confidence Interval Example Continued

To calculate the confidence interval, we had to assume that weekly viewing times are normally distributed. Consider the following normal plot of the 10 data points.


Confidence Interval Example Continued

Notice that the normal plot looks reasonably linear, so it is reasonable to assume that the number of hours of television watched per week by shut-ins is normally distributed.

[Figure: Normal Probability Plot, with Hours on the horizontal axis (70 to 110) and Probability on the vertical axis (.001 to .999)]

Average: 86
StDev: 11.8415
N: 10

Anderson-Darling Normality Test
A-Squared: 0.226
P-Value: 0.753

The output comes from Minitab. Typically, if the p-value is more than 0.05, we treat the assumption that the distribution is normal as reasonable.