SECTION 10.2

Estimating a Population Mean

What’s the difference between what we did in Section 10.1 and what we are beginning in Section 10.2?

In reality, the standard deviation σ of the population is unknown, so the procedures from the last section cannot be used as they stand. However, the understanding of the logic of the procedures will continue to be of use.

In order to be more realistic, σ is estimated from the data collected using s, the sample standard deviation.

Conditions for Inference about a Population Mean

1. Random--The data are an SRS from the population or come from a randomized experiment.

2. Normal--Observations from the population have a normal distribution with an unknown mean (μ) and unknown standard deviation (σ), or the sample is large enough to ensure the sampling distribution is approximately normal.

3. Independence--Independence is assumed for the individual observations when calculating a confidence interval. When we are sampling without replacement from a finite population, it is sufficient to verify that the population is at least 10 times the sample size.

CAUTION: Be sure to check that the conditions for constructing a confidence interval for the population mean are satisfied before you perform any calculations.

ROBUSTNESS

ROBUST: Confidence levels do not change very much when certain assumptions are violated.

Fortunately for us, the t-procedures are robust in certain situations.

Therefore . . .

This is when we use the t-procedures:

It’s more important for the data to be an SRS from a population than for the population to have a normal distribution.

If n is less than 15, the data must be close to normal to use t-procedures.

If n is at least 15, the t-procedures can be used except if there are outliers or strong skewness.

If n≥30, t-procedures can be used even in the presence of strong skewness, but outliers must still be examined.

Essentially, as long as there are no significant departures from Normality (especially outliers), the t-procedures still work quite well.
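The sample-size guidelines above can be written out as a small decision helper. This is only a sketch: the function name and its three yes/no inputs are hypothetical, and in practice those inputs come from looking at a boxplot or Normal probability plot, not from a formula.

```python
def t_procedure_advice(n, has_outliers, strongly_skewed, looks_normal=True):
    """Apply the sample-size rules of thumb for t procedures.

    Hypothetical helper for illustration. The boolean arguments are
    judgments the user makes from plots of the data.
    """
    if n < 15:
        # Small samples: the data must look close to Normal.
        ok = looks_normal and not has_outliers and not strongly_skewed
        return "ok" if ok else "avoid"
    if n < 30:
        # Moderate samples: fine unless there are outliers or strong skewness.
        ok = not has_outliers and not strongly_skewed
        return "ok" if ok else "avoid"
    # Large samples: skewness is tolerated, but outliers still need a look.
    return "check outliers" if has_outliers else "ok"


print(t_procedure_advice(10, has_outliers=False, strongly_skewed=False))
print(t_procedure_advice(20, has_outliers=False, strongly_skewed=True))
print(t_procedure_advice(40, has_outliers=True, strongly_skewed=True))
```

The random condition is deliberately not an input here, since no sample size makes up for data that were not produced by random sampling or random assignment.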

Standard Error

In this setting, the sample mean $\bar{x}$ has a sampling distribution that is normal with a mean equal to the population’s mean μ.

Since we do not know σ, we will replace the standard deviation of the sample mean, $\sigma/\sqrt{n}$, with this formula:

$$\frac{s}{\sqrt{n}}$$

This is called the standard error of the sample mean $\bar{x}$.
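As a quick illustration, the standard error s/√n can be computed with Python's standard library. The function name and the sample values are made up for this sketch; `statistics.stdev` divides by n - 1, which matches the sample standard deviation s used here.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(n)


# Made-up measurements, for illustration only.
sample = [12.1, 11.8, 12.4, 12.0, 11.7]
print(round(standard_error(sample), 4))
```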

Degrees of Freedom

Commonly listed as df

Equal to n-1

When a t-distribution has k degrees of freedom, we will write this as t(k)

When the actual df does not appear in Table C, use the greatest df available that is less than your desired df

– This guarantees a wider confidence interval than needed to justify a given confidence level

Density Curves for t Distributions

Bell-shaped and symmetric

Greater spread than a normal curve

As degrees of freedom (or sample size) increases, the t density curves appear more like a normal curve
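The three properties above can be checked numerically. The sketch below codes the standard t density formula directly (the function names are ours) and compares it with the standard normal density at the center and in the tail.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Height at the center (x = 0) and in the tail (x = 3) for several df.
for df in (2, 9, 30, 100):
    print(df, round(t_pdf(0.0, df), 4), round(t_pdf(3.0, df), 4))
print("normal", round(normal_pdf(0.0), 4), round(normal_pdf(3.0), 4))
```

For small df the t curve is lower at the center and higher in the tails than the normal curve, which is the "greater spread" described above. The gap shrinks as df grows.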

Confidence Intervals

A level C confidence interval for μ is

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

– t* is the upper (1-C)/2 critical value for the t(n-1) distribution

– We find t* using the table or our calculator: t* = invT(area to left of t*, df)

– We interpret these the same way we did in the last chapter.

– This interval is exactly correct when the population distribution is Normal and is approximately correct for large n in other cases.
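For readers without Table C or a calculator at hand, the sketch below shows what an invT-style lookup does: it finds the value t* with a given area to its left under the t(df) curve. It is a stand-in built from numerical integration and bisection, not the calculator's own algorithm, and the function names are ours. Where SciPy is installed, `scipy.stats.t.ppf` does the same job.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 by symmetry plus Simpson's rule on [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def inv_t(area_left, df):
    """Bisection search for t with P(T <= t) = area_left (area_left >= 0.5)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_star(confidence, df):
    """Upper (1 - C)/2 critical value of t(df) for confidence level C."""
    return inv_t(1 - (1 - confidence) / 2, df)


print(round(t_star(0.90, 9), 3))  # 1.833
```

For a 90% interval the area to the left of t* is 0.95, not 0.90, because the remaining 10% is split between the two tails.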

INFERENCE TOOLBOX (p 631)

1. PARAMETER: Identify the population of interest and the parameter you want to draw a conclusion about.

2. CONDITIONS: Choose the appropriate inference procedure. VERIFY conditions (Random, Normal, Independent) before using it.

3. CALCULATIONS: If the conditions are met, carry out the inference procedure.

4. INTERPRETATION: Interpret your results in the context of the problem. CONCLUSION, CONNECTION, CONTEXT (meaning that our conclusion about the parameter connects to our work in part 3 and includes appropriate context).

Steps for constructing a CONFIDENCE INTERVAL: DO YOU REMEMBER WHAT THE STEPS ARE???

Example: GOT MILK?

A milk processor monitors the number of bacteria per milliliter in raw milk received for processing. A random sample of 10 one-milliliter specimens from milk supplied by one producer gives the following data:

5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870

Construct a 90% confidence interval.

--We want to estimate μ = the mean number of bacteria per milliliter in all of the milk from this supplier.

--Since we don’t know σ, we should construct a one-sample t interval for μ.

– We must be confident that the data are an SRS from the producer’s milk. We must learn how the sample was chosen to see if it can be regarded as an SRS (we are only told that it is a “random sample”).

– A boxplot and a Normal probability plot of the data show no outliers and no strong skewness. This gives us little reason to doubt the Normality of the population from which this sample was drawn. In practice, we would probably rely on the fact that past measurements of this type have been roughly Normal.

– Since these measurements came from a random sample of specimens, they should be independent (assuming that there were many, at least 100, one-milliliter specimens available at the milk processing facility).

Example: GOT MILK? Cont.

--Entering these data into a calculator gives $\bar{x}$ = 4950 and s = 268.45. With df = 10-1 = 9, a 90% confidence interval for the mean bacteria count per milliliter in this producer’s milk is

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 4950 \pm 1.833 \cdot \frac{268.45}{\sqrt{10}} = 4950 \pm 155.6 = (4794.4,\ 5105.6)$$

--We can say that we are 90% confident that the actual mean number of bacteria per milliliter of milk from this supplier is between 4794.4 and 5105.6, because we used a method that yields intervals that capture the true mean in 90% of all possible samples.
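The GOT MILK? calculation can be reproduced in a few lines of Python. This is a sketch for checking the arithmetic, not a replacement for showing your work: the function name is ours, and t* = 1.833 is typed in as the critical value for df = 9 used in the example.

```python
import math
import statistics

counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]
T_STAR = 1.833  # upper 0.05 critical value of t(9), as used in the example


def one_sample_t_interval(sample, t_star):
    """Return (low, high) for x-bar +/- t* * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin


low, high = one_sample_t_interval(counts, T_STAR)
print(statistics.mean(counts), round(statistics.stdev(counts), 2))  # 4950 268.45
print(round(low, 1), round(high, 1))  # 4794.4 5105.6
```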

Paired t Procedures

Recall, matched pairs studies are a form of block design in which just two treatments are being compared.

Also, experiments are rarely done on randomly selected subjects. Random selection allows us to generalize results to a larger population, but random assignment of treatments to subjects allows us to compare treatments.

Be careful to distinguish a matched pairs setting from a two-sample setting. The real key is independence: in a two-sample setting the two samples are independent of each other, while the two observations in a matched pair are not.

TREAT THE DIFFERENCES from a matched pairs study as a single sample.
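Treating the differences as a single sample looks like this in code. The before/after values are invented purely to show the mechanics, the function name is ours, and t* = 2.776 is the table value for 95% confidence with df = 4. With only five pairs, the Normal condition would have to be checked on the differences before trusting a real interval.

```python
import math
import statistics


def paired_t_interval(first, second, t_star):
    """One-sample t interval computed on the pairwise differences."""
    diffs = [b - a for a, b in zip(first, second)]  # one difference per pair
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    margin = t_star * statistics.stdev(diffs) / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical matched-pairs data: the same five subjects measured twice.
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 12, 16]

low, high = paired_t_interval(before, after, t_star=2.776)  # df = 5 - 1 = 4
print(round(low, 2), round(high, 2))
```

Note that the degrees of freedom come from the number of pairs, not from the total number of measurements.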

TECHNOLOGY

As always, you will be allowed unrestricted use of your calculator on quizzes and tests (as well as the actual AP Exam). For this reason, ALWAYS be certain to write down the values of key numbers that are being used (means, standard deviations, degrees of freedom, significance levels, etc.) along with results of the calculator procedures in order to receive full credit.

The calculator information is available in your book on pages 661-662.

We are now using the T Interval instead of the Z Interval

Plug in exactly what you are asked for
