Statistical Intervals for a Single Sample From only one sample, An interval has been found. Because the sample was ample, The results were quite profound! - author unknown circa 2007 Chapter 8A

Dec 28, 2015


Doreen Clark
Page 1

Statistical Intervals for a Single Sample

From only one sample,

An interval has been found.

Because the sample was ample,

The results were quite profound!

- author unknown circa 2007

Chapter 8A

Page 2

What to Look Forward to this Week

Today only!

Page 3

A Diversion – the sampling distributions or distributions arising from the normal

If Z1, Z2, ..., Zn are independent standard normal random variables, then

$$Z_1^2 + Z_2^2 + \cdots + Z_n^2 \sim \chi^2(n)$$

chi-square distribution with n degrees of freedom

$$T = \frac{Z}{\sqrt{\chi^2(n)/n}} \sim t(n)$$

t-distribution with n degrees of freedom

$$F = \frac{\chi^2(n)/n}{\chi^2(m)/m} \sim F(n, m)$$

F-distribution with n degrees of freedom in the numerator and m degrees of freedom in the denominator
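As a rough check on these three definitions, the sketch below builds each variable directly from standard normal draws using only the Python standard library. The helper names, the seed, and the degrees of freedom (5 and 10) are illustrative assumptions, not part of the slides. The normal and chi-square pieces are drawn independently, which the t and F constructions require.

```python
import random
import statistics

def chi_square_draw(n, rng):
    """Sum of n squared independent standard normals: one chi-square(n) draw."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

def t_draw(n, rng):
    """Z / sqrt(chi-square(n) / n), with Z drawn independently of the chi-square."""
    z = rng.gauss(0.0, 1.0)
    return z / (chi_square_draw(n, rng) / n) ** 0.5

def f_draw(n, m, rng):
    """Ratio of two independent chi-square draws, each divided by its d.f."""
    return (chi_square_draw(n, rng) / n) / (chi_square_draw(m, rng) / m)

rng = random.Random(8)   # arbitrary fixed seed so reruns agree
reps = 20000             # illustrative number of replications

chi_mean = statistics.fmean(chi_square_draw(5, rng) for _ in range(reps))
t_mean = statistics.fmean(t_draw(10, rng) for _ in range(reps))
f_mean = statistics.fmean(f_draw(5, 10, rng) for _ in range(reps))
print(chi_mean, t_mean, f_mean)
```

With enough replications the three averages should land near the theoretical means: n = 5 for the chi-square, 0 for t, and m/(m - 2) = 1.25 for F(5, 10).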

Page 4

What will be Chi-square?

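Let X_i be the ith sample value from a normal population:

$$X_i \sim N(\mu, \sigma^2), \qquad Z_i = \frac{X_i - \mu}{\sigma} \sim N(0, 1)$$

Therefore

$$\sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 \sim \chi^2(n)$$

However

$$\sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 = \frac{(n-1) S^2}{\sigma^2} \sim \chi^2(n-1)$$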
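The loss of one degree of freedom when the sample mean replaces the population mean can be seen in a small simulation. This is only a sketch: the population values (mean 10, standard deviation 4), the sample size of 8, and the seed are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(2015)                  # arbitrary fixed seed
mu, sigma, n, reps = 10.0, 4.0, 8, 20000   # illustrative values only

with_mu, with_xbar = [], []
for _ in range(reps):
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(xs)
    # Standardize around the true mean: chi-square with n d.f.
    with_mu.append(sum(((x - mu) / sigma) ** 2 for x in xs))
    # Standardize around the sample mean: equals (n-1)S^2/sigma^2, n-1 d.f.
    with_xbar.append(sum(((x - xbar) / sigma) ** 2 for x in xs))

mean_with_mu = statistics.fmean(with_mu)      # should be close to n
mean_with_xbar = statistics.fmean(with_xbar)  # should be close to n - 1
print(mean_with_mu, mean_with_xbar)
```

Because a chi-square variable has mean equal to its degrees of freedom, the two averages should sit near 8 and 7 respectively.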

Page 5

The Chi-square Distribution

k = degrees of freedom

Page 6

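The t-distribution, also known as Student’s t

$$f(x) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(x^2/k) + 1]^{(k+1)/2}}, \qquad -\infty < x < \infty$$

ν = k = df (degrees of freedom)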

Page 7

More about Student

The t statistic was introduced by William Sealy Gosset for cheaply monitoring the quality of beer brews. "Student" was his pen name.

Gosset was a statistician for the Guinness brewery in Dublin, Ireland, and was hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness' industrial processes.

Gosset published the t test in Biometrika in 1908, but was forced to use a pen name by his employer who regarded the fact that they were using statistics as a trade secret.

Page 8

The F-distribution

f(x)

B(m,n) is the Beta function

An interesting property:

Page 9

F-Distribution named after R.A. Fisher

Born: 17 February 1890, East Finchley, London, England
Died: 29 July 1962 (aged 72), Adelaide, Australia
Residence: England, Australia
Nationality: British
Field: Statistics, Genetics, Natural selection
Institutions: Rothamsted Experimental Station; University College London; Cambridge University
Alma mater: Cambridge University
Academic advisors: Sir James Jeans; F.J.M. Stratton
Notable students: C.R. Rao
Known for: Maximum likelihood; Fisher information; Analysis of variance
Notable prizes: Royal Medal (1938); Copley Medal (1955)

Page 10

Three types of Intervals

- Confidence Interval – bounds a population parameter or distribution parameter.

- Tolerance Interval – bounds a proportion of the distribution at a certain confidence level.

- Prediction Interval – bounds a single observation; assumptions on the population distribution are critical here.

Page 11

Overheard at a rest stop

I know that my average driving time on this daily route has been 2.3 hours over the last 7 days. However, that is based on a sample and therefore is unlikely to equal my population mean. What I really need is some way to measure how precise this estimate is.

Page 12

A typical prob-stat graduate

Confidence Interval - A statement consisting of two values between which the population parameter is estimated to lie.

Reliability – degree of confidence – the probability with which the population parameter will be “captured” by the two values.

Precision – the length of the confidence interval (a measure of the error in estimating the parameter).

I am 95% confident that my mean driving time to work is between 37.4 minutes and 41.2 minutes.

41.2 – 37.4 = 3.8
Best estimate is midpoint = 39.3
Error = 1.9 minutes

Page 13

The Big Picture of a Confidence Interval

(L, U) is a 100(1 - α)% CI for the population parameter

Page 14

The Bigger Picture of a Confidence Interval

General Approach:

Estimate ± Reliability Factor × Standard Error

- Estimate: our point estimate
- Reliability Factor: our confidence
- Reliability Factor × Standard Error: our precision

Page 15

The Biggest Picture of a Confidence Interval

$$\Pr\{\, l(x_1, \ldots, x_n) \le \theta \le u(x_1, \ldots, x_n) \,\} = 1 - \alpha$$

- α: measure of risk
- l(x_1, ..., x_n) and u(x_1, ..., x_n): measure of uncertainty (precision); these are random variables, i.e. statistics
- θ: population parameter

The length of a confidence interval is a measure of the precision of estimation.

100(1 – α)% of the C.I.s constructed this way contain the mean. Watch the interpretation of this concept.

Page 16

Confidence Interval on Mean of Normal Distribution Variance Known

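$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

has a standard normal distribution, so

$$P\left( -z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2} \right) = 1 - \alpha$$

With a little algebra,

$$P\left( \bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n} \right) = 1 - \alpha$$

This is our 100(1 - α)% Confidence Interval on μ.

$z_{\alpha/2}$ is the upper α/2 percentage point from the standard normal.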

Page 17

Our Very First Real Confidence Interval

A sample of 100 batteries is tested for operating life. They averaged (mean) 10 hours before failing. The manufacturer has assured us that the population variance is 16 hours². Find a 95 percent confidence interval for the mean life of this particular type of battery.

$$n = 100, \quad \bar{x} = 10 \text{ hr}, \quad \sigma = 4 \text{ hr}$$

$$z_{\alpha/2} = z_{.025} = 1.96$$

$$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 10 \pm 1.96 \frac{4}{\sqrt{100}} = (9.216,\ 10.784)$$
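The battery calculation can be reproduced in a few lines. This is a minimal sketch using the standard library's `statistics.NormalDist` for the normal quantile; the function name `z_interval` is a made-up helper, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided CI on the mean of a normal population with known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 point
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Battery example: n = 100, sample mean 10 hr, sigma = sqrt(16) = 4 hr
lower, upper = z_interval(xbar=10.0, sigma=4.0, n=100)
print(round(lower, 3), round(upper, 3))   # prints 9.216 10.784
```

The function computes the quantile rather than hard-coding 1.96, so the same helper covers other confidence levels.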

Page 18

Sample Size and Precision

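$$P\left( \bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n} \right) = 1 - \alpha$$

Equivalently:

$$|\bar{x} - \mu| \le z_{\alpha/2}\,\sigma/\sqrt{n}$$

Define the error by

$$E = |\bar{x} - \mu|$$

Choose n as below and we can be 100(1 - α)% confident that the error will not exceed E.

$$n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2$$

Think of E as a measure of practical significance.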

Page 19

Our Very First Real Confidence Interval Revisited

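For our battery problem, what sample size is required to reduce the error to .5 hr with 99% confidence?

$$n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 = \frac{(2.58)^2 \times 16}{(.5)^2} = 426.0096$$

Since n must be an integer and the error must not exceed E, round up to n = 427.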

Page 20

Problem 8-12

Life of a 75 watt bulb is normally distributed with std dev = 25 hrs. Suppose we want to be 95% confident that the error in estimating mean life is less than 5 hours. Find a sample size.

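$$n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 = \left( \frac{1.96(25)}{5} \right)^2 = 96.04$$

Since n must be an integer and the error must not exceed E, round up to n = 97.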
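Both sample-size problems follow the same recipe, sketched here under the assumption that the table value of z is supplied by the caller (the function name `required_sample_size` is hypothetical). The result is rounded up because any smaller integer would leave the error slightly above E.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(z, sigma, error):
    """Smallest integer n with z * sigma / sqrt(n) <= error."""
    return ceil((z * sigma / error) ** 2)

# Battery problem: 99% confidence, sigma = 4 hr, E = 0.5 hr, table value z = 2.58
n_battery = required_sample_size(2.58, 4.0, 0.5)

# Problem 8-12: 95% confidence, sigma = 25 hr, E = 5 hr, table value z = 1.96
n_bulb = required_sample_size(1.96, 25.0, 5.0)

# Using the unrounded normal quantile instead of the 2.58 table value
z_exact = NormalDist().inv_cdf(0.995)
n_battery_exact = required_sample_size(z_exact, 4.0, 0.5)

print(n_battery, n_bulb, n_battery_exact)
```

The answer is sensitive to how z is rounded: the table value 2.58 and the unrounded quantile (about 2.5758) give slightly different sample sizes for the battery problem.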

Page 21

Problem 8-10

Diameter of holes for a cable harness is normally distributed with a standard deviation of .01 in. A random sample of 10 yields average diameter of 1.5045 in. Find a 99% two-sided confidence interval.

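$$\bar{x} - z_{0.005}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + z_{0.005}\,\sigma/\sqrt{n}$$

$$1.5045 - 2.58(0.01)/\sqrt{10} \le \mu \le 1.5045 + 2.58(0.01)/\sqrt{10}$$

$$1.4963 \le \mu \le 1.5127$$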

Page 22

Interpreting a Confidence Interval

The confidence interval is a random interval.

The appropriate interpretation of a confidence interval (for example on μ) is: the observed interval [l, u] brackets the true value of μ, with confidence 100(1 - α).

Examine Figure 8-1 on the next slide.

Page 23

Figure 8-1 Repeated construction of a confidence interval for μ

Page 24

Repeated Confidence Intervals, gen. samples

Sample   Observations
   1      1.886   1.014  -1.534   0.192   0.801  -0.429  -0.579   0.647   0.149   1.015
   2      1.040  -1.008  -0.225   0.374   0.168   0.607  -1.439  -1.070   1.355   0.994
   3      1.091  -0.447   1.393   1.105  -0.012  -1.986  -1.518   0.749   1.244  -1.123
   4     -1.618   0.874   0.484  -1.761  -0.653  -0.432   1.695   0.487  -1.589  -0.908
   5     -1.246  -0.386   0.222  -0.326   0.969   0.225   0.824  -1.450   0.399   0.566
   6      0.301  -1.002   1.791  -0.212   1.403   0.669  -0.071  -0.306   1.576  -0.171
  ...
  96     -0.778  -0.977   0.361  -1.247  -0.045  -0.213  -1.772  -0.052   0.666   1.273
  97      0.277   0.646  -0.693  -1.306  -1.311  -0.489   0.743  -0.313  -0.219  -1.278
  98      0.725   2.182  -0.855  -0.831   0.359   0.295  -1.639   1.165  -1.099   0.090
  99     -0.227  -1.575   1.890   0.497   0.211   0.408  -0.542   1.423   1.832  -0.073
 100     -0.425  -0.328   0.475   1.241   0.210   1.409   0.641  -2.964  -1.120   0.983

Random samples from a standard normal distribution, N(0,1).

Generated in Excel as NORMSINV(RAND())

Page 25

Repeated Confidence Intervals, hits and misses

  mean   std dev   90% lower   90% upper   95% lower   95% upper   90% misses   95% misses
-0.104    0.717     -0.476       0.268      -0.549       0.340         0            0
 0.375    0.807     -0.044       0.793      -0.125       0.875         0            0
 0.327    0.929     -0.155       0.809      -0.249       0.903         0            0
 0.064    1.117     -0.515       0.643      -0.628       0.756         0            0
 0.335    1.426     -0.405       1.074      -0.549       1.219         0            0
 0.071    1.214     -0.559       0.701      -0.681       0.823         0            0
-0.346    1.080     -0.907       0.214      -1.016       0.324         0            0
 0.403    0.531      0.127       0.678       0.074       0.732         1            1
 0.217    1.124     -0.366       0.800      -0.480       0.914         0            0
 0.534    1.022      0.004       1.064      -0.099       1.168         1            0
 0.387    0.750     -0.002       0.776      -0.078       0.852         0            0
-0.212    1.253     -0.862       0.438      -0.988       0.564         0            0
 0.469    0.759      0.076       0.863      -0.001       0.940         1            0
 0.305    0.612     -0.012       0.622      -0.074       0.684         0            0
 0.374    0.745     -0.013       0.760      -0.088       0.836         0            0
-0.024    0.489     -0.278       0.230      -0.327       0.279         0            0
Totals ->                                                              14            9

Since these are not 10 and 5, respectively, is there an error?
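The hits-and-misses experiment is easy to rerun. The sketch below is one possible version, assuming the population standard deviation is known to be 1 and using the z-interval from the earlier slides; the function name, seed, and default arguments are illustrative. The limits in the table above appear to plug each sample's standard deviation into the z formula instead, so its miss counts need not behave exactly like the ones simulated here.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def count_misses(num_samples, n, confidence, rng, mu=0.0, sigma=1.0):
    """Count z-intervals (sigma known) that fail to contain the true mean."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / sqrt(n)
    misses = 0
    for _ in range(num_samples):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if abs(xbar - mu) > half_width:   # interval misses mu
            misses += 1
    return misses

rng = random.Random(1)   # arbitrary seed; change it to see the counts vary
print(count_misses(100, 10, 0.90, rng), count_misses(100, 10, 0.95, rng))
```

Rerunning with different seeds shows the miss counts scattering around 10 and 5 rather than hitting them exactly, which is the point of the question on this slide.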

Page 26

Repeated Confidence Intervals, .90

n = 100, p = 0.1

misses   prob(p)   Cum Prob
   0     0.0000    0.0000
   1     0.0003    0.0003
   2     0.0016    0.0019
   3     0.0059    0.0078
   4     0.0159    0.0237
   5     0.0339    0.0576
   6     0.0596    0.1172
   7     0.0889    0.2061
   8     0.1148    0.3209
   9     0.1304    0.4513
  10     0.1319    0.5832
  11     0.1199    0.7030
  12     0.0988    0.8018
  13     0.0743    0.8761
  14     0.0513    0.9274
  15     0.0327    0.9601
  16     0.0193    0.9794
  17     0.0106    0.9900
  18     0.0054    0.9954
  19     0.0026    0.9980
  20     0.0012    0.9992

How are these probabilities being generated?

Let X = RV, number of misses. Then X ~ Bin(100, .1) and E[X] = np = 10.
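The table entries are ordinary binomial probabilities, so they can be regenerated directly. A minimal sketch, with hypothetical helper names `binom_pmf` and `binom_cdf`:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

# Rows of the table for n = 100 intervals, each missing with probability 0.1
for misses in (10, 14):
    print(misses, round(binom_pmf(misses, 100, 0.1), 4),
          round(binom_cdf(misses, 100, 0.1), 4))

# Chance of seeing 14 or more misses out of 100 at the 90% level
p_14_or_more = 1.0 - binom_cdf(13, 100, 0.1)
print(round(p_14_or_more, 4))
```

Calling the same functions with p = 0.05 gives the corresponding probabilities for the 95% intervals.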

Page 27

Repeated Confidence Intervals, .95

n = 100, p = 0.05

misses   prob(p)   Cum Prob
   0     0.0059    0.0059
   1     0.0312    0.0371
   2     0.0812    0.1183
   3     0.1396    0.2578
   4     0.1781    0.4360
   5     0.1800    0.6160
   6     0.1500    0.7660
   7     0.1060    0.8720
   8     0.0649    0.9369
   9     0.0349    0.9718
  10     0.0167    0.9885
  11     0.0072    0.9957
  12     0.0028    0.9985

Major Point: Can you see how probability helps us assess the risk associated with statistical inference?

Page 28

One-sided Confidence Bounds

Page 29

Our Very First Real Confidence Interval Revisited Again

A One-Sided Confidence Interval

Based upon the sample of 100 batteries averaging (mean) 10 hours to failure. The manufacturer continues to assure us that the population variance is 16 hours². Find a 95 percent lower confidence interval for the mean life of this particular type of battery.

$$z_{.05} = 1.6449$$

$$\bar{X} - z_{\alpha} \frac{\sigma}{\sqrt{n}} = 10 - 1.6449 \frac{4}{\sqrt{100}} = 9.342 \text{ hr}$$
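The only change from the two-sided case is that all of α goes into one tail, so the quantile is z_α rather than z_{α/2}. A short sketch, with a hypothetical helper name:

```python
from math import sqrt
from statistics import NormalDist

def lower_confidence_bound(xbar, sigma, n, confidence=0.95):
    """One-sided lower bound on the mean, sigma known."""
    z = NormalDist().inv_cdf(confidence)   # z_alpha, not z_{alpha/2}
    return xbar - z * sigma / sqrt(n)

bound = lower_confidence_bound(xbar=10.0, sigma=4.0, n=100)
print(round(bound, 3))   # prints 9.342
```

The one-sided bound of 9.342 hr sits above the two-sided lower limit of 9.216 hr because the critical value drops from 1.96 to about 1.645.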

Page 30

A Transition…

The previous development of a confidence interval was limited in two ways:

- Needed a Normal population

- Needed to know the standard deviation of the Normal distribution

The Central Limit Theorem eliminates the need to explicitly know the population is normal – Z will still be approximately standard normal

We can estimate σ using the sample standard deviation, s.

Page 31

Confidence Interval on Mean of Normal – Variance Unknown

Same form as for the normal. The measure of risk now comes from the t distribution, and we boldly use the sample standard deviation, protected by the heavy-tailed t distribution, even when the sample size is small!

Remember we are still assuming that observations from the underlying population are normally distributed.

100(1 - α) percent confidence interval:

$$\bar{x} - t_{\alpha/2,\,n-1}\, s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\, s/\sqrt{n}$$

where $t_{\alpha/2,\,n-1}$ is the upper 100α/2 percentage point of a t distribution with n - 1 d.f.

Page 32

Where Does it Come From? Do we care?

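From earlier:

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1), \qquad \frac{(n-1) S^2}{\sigma^2} \sim \chi^2(n-1)$$

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1) S^2}{\sigma^2} \Big/ (n-1)}}$$

numerator is standard normal

denominator is the square root of a chi-square divided by its d.f. (n-1)

T has a t distribution with n-1 degrees of freedom.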

Page 33

Are we still caring?

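$$P\left( -t_{\alpha/2} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2} \right) = 1 - \alpha$$

With a little algebra,

$$P\left( \bar{X} - t_{\alpha/2}\, S/\sqrt{n} \le \mu \le \bar{X} + t_{\alpha/2}\, S/\sqrt{n} \right) = 1 - \alpha$$

This is our 100(1 - α)% Confidence Interval on μ.

$t_{\alpha/2}$ is the upper α/2 percentage point from the t-distribution with n - 1 degrees of freedom.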

Page 34

Confidence Interval on Mean of Normal – Variance Unknown

For large samples the distributional assumption is not critical. If the sample size is not large, use the t distribution.

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t distribution with n-1 degrees of freedom if $X_1, X_2, \ldots, X_n$ are a sample from a normal distribution with unknown mean and variance.

As n → ∞, the t distribution becomes standard normal.

Page 35

t Distribution Converges to Standard Normal

For large sample size, use the normal distribution even if the variance is unknown.

Page 36

The t distribution

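$$f(x) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(x^2/k) + 1]^{(k+1)/2}}, \qquad -\infty < x < \infty$$

$t_{\alpha,k}$ is the value of the random variable T with k degrees of freedom above which we have area (probability) α.

Usually k is the number of degrees of freedom associated with the sample standard deviation (n-1).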
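To see the heavy tails and the convergence to the standard normal, the density can be coded straight from its formula. This is a sketch with a hypothetical function name; it relies on `math.gamma`, which overflows for large arguments, so it is only suitable for moderate k.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, k):
    """Density of Student's t with k degrees of freedom, evaluated at x."""
    const = gamma((k + 1) / 2) / (sqrt(pi * k) * gamma(k / 2))
    return const * (1 + x * x / k) ** (-(k + 1) / 2)

std_normal = NormalDist()
for k in (1, 5, 30, 100):
    # Compare the peak and a tail point against the standard normal density
    print(k, t_pdf(0.0, k), t_pdf(3.0, k))
print("normal", std_normal.pdf(0.0), std_normal.pdf(3.0))
```

For small k the density at x = 3 is well above the normal value (heavier tails), and both columns move toward the normal values as k grows.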

Page 37

Our Very First CI using the t-distribution

Sulfur dioxide and nitrogen oxide are products of fossil fuel consumption. These compounds can be carried long distances and converted to acid before being deposited in the form of “acid rain.” The following sulfur dioxide concentrations (in micrograms per cubic meter) were obtained from different locations in a forest thought to have been damaged by acid rain. Estimate the mean concentration in the forest.

52.7   43.9   41.7   71.5   47.6   55.1
62.2   56.5   33.4   61.8   54.3   50
45.3   63.4   53.9   65.5   66.6   70
52.4   38.6   46.1   44.4   60.7   56.4

$$n = 24, \quad \bar{x} = 53.92, \quad s^2 = 101.48, \quad t_{.025,23} = 2.069$$

$$\bar{x} - t_{\alpha/2,\,n-1}\, s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\, s/\sqrt{n}$$

$$53.92 - 2.069 \frac{10.07}{\sqrt{24}} \le \mu \le 53.92 + 2.069 \frac{10.07}{\sqrt{24}} \quad\Rightarrow\quad (49.67,\ 58.17)$$

Average concentration in undamaged areas is 20 μg/m³
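The summary statistics and the interval can be recomputed from the 24 readings. This sketch uses only the standard library, which has no t quantile function, so the critical value 2.069 is taken from the slide rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Sulfur dioxide concentrations (micrograms per cubic meter) from the slide
so2 = [52.7, 43.9, 41.7, 71.5, 47.6, 55.1,
       62.2, 56.5, 33.4, 61.8, 54.3, 50.0,
       45.3, 63.4, 53.9, 65.5, 66.6, 70.0,
       52.4, 38.6, 46.1, 44.4, 60.7, 56.4]

T_CRIT = 2.069   # t_{.025,23}, copied from the slide (not computed here)

n = len(so2)
xbar = mean(so2)
s = stdev(so2)   # sample standard deviation (n - 1 in the denominator)
half_width = T_CRIT * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(n, round(xbar, 2), round(s, 2), round(lower, 2), round(upper, 2))
```

The whole interval lies far above the 20 μg/m³ reported for undamaged areas, which is the comparison the slide invites.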

Page 38

It’s Official Now

Page 39

Stay Tuned – next time…