Introduction to Probability and Statistics, 11th Week (5/24): Hypothesis Testing
Page 1: 11주차

Introduction to Probability and Statistics, 11th Week (5/24)

Hypothesis Testing

Page 2: 11주차

Hypothesis

In statistics, a hypothesis is a claim or statement about a property of a population.

Hypothesis testing is the procedure for testing such a claim or statement.

Example: A conjecture is made that “the average starting salary for computer science graduates is $30,000 per year”.

Page 3: 11주차

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 11.3

Nonstatistical Hypothesis Testing…

A criminal trial is an example of hypothesis testing without the statistics.

In a trial a jury must decide between two hypotheses. The null hypothesis is

H0: The defendant is innocent

The alternative hypothesis or research hypothesis is

H1: The defendant is guilty

The jury does not know which hypothesis is true. They must make a decision on the basis of evidence presented.

Page 4: 11주차


Nonstatistical Hypothesis Testing…

In the language of statistics, convicting the defendant is called rejecting the null hypothesis in favor of the alternative hypothesis. That is, the jury is saying that there is enough evidence to conclude that the defendant is guilty (i.e., there is enough evidence to support the alternative hypothesis).

If the jury acquits, it is stating that there is not enough evidence to support the alternative hypothesis. Notice that the jury is not saying that the defendant is innocent, only that there is not enough evidence to support the alternative hypothesis. That is why we never say that we accept the null hypothesis, although most people in industry will say “We accept the null hypothesis.”

Page 5: 11주차

Question: How can we justify/test this conjecture?

A. What do we need to know to justify this conjecture?

B. Based on what we know, how should we justify this conjecture?

Page 6: 11주차

Answer to A: Randomly select, say, 100 computer science graduates and find out their annual salaries.

---- We need some sample observations, i.e., a sample set!

Page 7: 11주차

Answer to B: That is what we will learn in this chapter.

---- Draw conclusions based on the sample observations.

Page 8: 11주차

Statistical Reasoning

Analyze the sample set in an attempt to distinguish between results that can easily occur and results that are highly unlikely.

Page 9: 11주차

Statistical Decisions

Decisions about populations on the basis of sample information.

Ex) We may wish to decide on the basis of sample data whether a new serum is really effective in curing a disease, or whether one educational procedure is better than another

Page 10: 11주차

Definitions

Null Hypothesis (denoted H0): the statement being tested in a test of hypothesis.

Alternative Hypothesis (denoted H1): what is believed to be true if the null hypothesis is false.

Page 11: 11주차

Null Hypothesis: H0

Must contain condition of equality

=, ≤, or ≥

Test the null hypothesis directly

Reject H0 or fail to reject H0

Page 12: 11주차

Alternative Hypothesis: H1

Must be true if H0 is false

≠, <, or >

‘opposite’ of Null

Example:

H0 : µ = 30 versus H1 : µ > 30

Page 13: 11주차

Statistical Hypotheses and Null Hypotheses

Statistical hypotheses: Assumptions or guesses about the populations involved. Such assumptions may or may not be true.

Null hypotheses (H0): Hypothesis that there is no difference between the procedures. We formulate it if we want to decide whether one procedure is better than another.

Alternative hypotheses (H1): Any hypothesis that differs from a given null hypothesis

Example 1. If the null hypothesis is p = 0.5, possible alternative hypotheses are p = 0.7 or p ≠ 0.5.

Page 14: 11주차


Concepts of Hypothesis Testing (1)…

• There are two hypotheses: one is called the null hypothesis and the other the alternative or research hypothesis. The usual notation is:

• H0: — the ‘null’ hypothesis

• H1: — the ‘alternative’ or ‘research’ hypothesis

• The null hypothesis (H0) will always state that the parameter equals the value specified in the alternative hypothesis (H1)

(H0 is pronounced “H nought”.)

Page 15: 11주차

Stating Your Own Hypothesis

If you wish to support your claim, the claim must be stated so that it becomes the alternative hypothesis.

Page 16: 11주차

Important Notes:

H0 must always contain equality; however some claims are not stated using equality. Therefore sometimes the claim and H0 will not be the same.

Ideally, all claims should be stated so that they are the null hypothesis, so that the most serious error would be a Type I error.

Page 17: 11주차

Tests of Hypotheses and Significance

“Significant”: If on the supposition that a particular hypothesis is true we find that results observed in a random sample differ markedly from those expected under the hypothesis on the basis of pure chance using sampling theory, we would say that the observed differences are significant

We would be inclined to reject the hypothesis if the observed differences are significant.

Tests of hypotheses, tests of significance, or decision rules: Procedures that enable us to decide whether to accept or reject hypotheses or to determine whether observed samples differ significantly from expected results

Page 18: 11주차

Type I Error

The mistake of rejecting the null hypothesis when it is true.

The probability of doing this is called the significance level, denoted by α (alpha).

Common choices for α: 0.05 and 0.01

Example: rejecting a perfectly good parachute and refusing to jump

Page 19: 11주차

Type II Error

The mistake of failing to reject the null hypothesis when it is false.

The probability of doing this is denoted by β (beta).

Example: failing to reject a defective parachute and jumping out of a plane with it.

Page 20: 11주차

Table 7-2 Type I and Type II Errors

                                        True State of Nature
Decision                                The null hypothesis is true         The null hypothesis is false
We decide to reject the null hypothesis Type I error (rejecting a true      Correct decision
                                        null hypothesis)
We fail to reject the null hypothesis   Correct decision                    Type II error (failing to reject
                                                                            a false null hypothesis)

Page 21: 11주차


Types of Errors…

A Type I error occurs when we reject a true null hypothesis (i.e. Reject H0 when it is TRUE)

A Type II error occurs when we don’t reject a false null hypothesis (i.e. Do NOT reject H0 when it is FALSE)

                    H0 is TRUE      H0 is FALSE
Reject H0           Type I error    (correct)
Do not reject H0    (correct)       Type II error

Page 22: 11주차


Type of Errors…

There are two possible errors.

A Type I error occurs when we reject a true null hypothesis. That is, a Type I error occurs when the jury convicts an innocent person. We would want the probability of this type of error to be very small [maybe 0.001: beyond a reasonable doubt] for a criminal trial where a conviction results in the death penalty, whereas for a civil trial, where conviction might result in someone having to “pay for damages to a wrecked auto”, we would be willing to let the probability be larger [0.49: preponderance of the evidence].

P(Type I error) = α [usually 0.05 or 0.01]

Page 23: 11주차


Type of Errors…

A Type II error occurs when we don’t reject a false null hypothesis [accept the null hypothesis]. That occurs when a guilty defendant is acquitted.

In practice, this type of error is by far the most serious mistake we normally make. For example, suppose we test the hypothesis that the amount of medication in a heart pill is equal to a value that will cure your heart problem, and we “accept the null hypothesis that the amount is OK”. Later on we find out that the average amount is WAY too large and people die from “too much medication” [I wish we had rejected the hypothesis and thrown the pills in the trash can]. By then it is too late, because we have shipped the pills to the public.

Page 24: 11주차


Type of Errors…

The probability of a Type I error is denoted as α (Greek letter alpha). The probability of a type II error is β (Greek letter beta).

The two probabilities are inversely related. Decreasing one increases the other, for a fixed sample size.

In other words, you can’t have α and β both real small for any old sample size. You may have to take a much larger sample size, or in the court example, you need much more evidence.

Page 25: 11주차


Type of Errors…

The critical concepts are these:

1. There are two hypotheses, the null and the alternative hypotheses.

2. The procedure begins with the assumption that the null hypothesis is true.

3. The goal is to determine whether there is enough evidence to infer that the alternative hypothesis is true, or the null is not likely to be true.

4. There are two possible decisions:

Conclude that there is enough evidence to support the alternative hypothesis. Reject the null.

Conclude that there is not enough evidence to support the alternative hypothesis. Fail to reject the null.

Page 26: 11주차


Judging the Test…

A statistical test of hypothesis is effectively defined by the significance level (α) and the sample size (n), both of which are selected by the statistics practitioner.

Therefore, if the probability of a Type II error (β) is too large [we have insufficient power], we can reduce it by

increasing α, and/or

increasing the sample size, n.
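
As a rough illustration of this trade-off, the sketch below computes the probability of a Type II error for a two-tailed z test under a known population standard deviation. The function name `type_ii_error` and every numeric value in it are assumptions chosen only for illustration; they are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """P(fail to reject H0: mu = mu0) in a two-tailed z test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical z value
    se = sigma / sqrt(n)                        # standard error of the sample mean
    shift = (mu_true - mu0) / se                # true mean, measured in standard errors from mu0
    # The test statistic is normal with mean `shift`; beta is the mass inside (-z_crit, z_crit).
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

# Hypothetical scenario: H0 says 350, the true mean is 360, sigma is 75.
base = type_ii_error(mu0=350, mu_true=360, sigma=75, n=25, alpha=0.05)
bigger_alpha = type_ii_error(mu0=350, mu_true=360, sigma=75, n=25, alpha=0.10)
bigger_n = type_ii_error(mu0=350, mu_true=360, sigma=75, n=100, alpha=0.05)

print(f"beta (alpha=0.05, n=25):  {base:.3f}")
print(f"beta (alpha=0.10, n=25):  {bigger_alpha:.3f}")
print(f"beta (alpha=0.05, n=100): {bigger_n:.3f}")
```

Under these assumed values, both a larger significance level and a larger sample give a smaller Type II error probability than the baseline, which is the behavior the slide describes.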

Page 27: 11주차


Judging the Test…

The power of a test is defined as 1 – β. It represents the probability of rejecting the null hypothesis when it is false and the true mean is something other than the null value for the mean.

If we are testing the hypothesis that the average amount of medication in blood pressure pills is equal to 6 mg (which is good), and we “fail to reject” the null hypothesis, ship the pills to patients worldwide, only to find out later that the “true” average amount of medication is really 8 mg and people die, we get in trouble. This occurred because P(reject the null | true mean = 7 mg) = 0.32, which would mean that we have a 68% chance of not rejecting the null for these BAD pills and shipping them to patients worldwide.

Page 28: 11주차

Type I and Type II Errors

Type I error: If we reject a hypothesis when it happens to be true.

Type II error: If we accept a hypothesis when it should be rejected.

In order for any tests of hypotheses or decision rules to be good, they must be designed so as to minimize errors of decision.

An attempt to decrease one type of error is accompanied in general by an increase in the other type of error. The only way to reduce both types of error is to increase the sample size, which may or may not be possible.

Page 29: 11주차

Significant Differences

Hypothesis testing is designed to detect significant differences: differences that did not occur by random chance.

In the “one sample” case: we compare a random sample (from a large group) to a population.

We compare a sample statistic to a population parameter to see if there is a significant difference.

Page 30: 11주차

Level of Significance

Level of significance: In testing a given hypothesis, the maximum probability with which we would be willing to risk a Type I error is called the level of significance

Page 31: 11주차

Level of Significance

In practice a level of significance of 0.05 or 0.01 is customary, although other values are used.

If, for example, a 0.05 or 5% level of significance is chosen in designing a test of a hypothesis, then there are about 5 chances in 100 that we would reject the hypothesis when it should be accepted, i.e., whenever the null hypothesis is true, we are about 95% confident that we would make the right decision. In such cases we say that the hypothesis has been rejected at a 0.05 level of significance, which means that we could be wrong with probability 0.05.
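
The "about 5 chances in 100" reading can be checked by simulation. The sketch below is only an illustration under assumed settings (a normal population with known standard deviation, a two-tailed z test, and a hypothetical helper named `type_i_error_rate`): it repeatedly draws samples for which the null hypothesis really is true and counts how often the test rejects anyway.

```python
import random
from math import sqrt

def type_i_error_rate(trials=4000, n=30, mu0=0.0, sigma=1.0, z_crit=1.96, seed=1):
    """Fraction of samples drawn under a TRUE H0 that a two-tailed z test rejects."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        z = (mean - mu0) / (sigma / sqrt(n))  # z score of the sample mean under H0
        if abs(z) > z_crit:                   # falls in the critical region
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

With a critical value of 1.96, the observed rejection rate should land close to 0.05, up to simulation noise.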

Page 32: 11주차

Definition

Test Statistic: a sample statistic or value based on sample data.

Example:

$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}}$

Page 33: 11주차

Definition

Critical Region: the set of all values of the test statistic that would cause a rejection of the null hypothesis.

Page 34: 11주차

Critical Region

• Set of all values of the test statistic that would cause a rejection of the null hypothesis

[Figure: distribution curve with the critical region marked]

Page 35: 11주차


Page 36: 11주차

Critical Region

• Set of all values of the test statistic that would cause a rejection of the null hypothesis

[Figure: distribution curve with two critical regions marked]

Page 37: 11주차

Definition

Critical Value: the value(s) that separate the critical region from the values that would not lead to a rejection of H0.

Page 38: 11주차

Critical Value

Value(s) that separate the critical region from the values that would not lead to a rejection of H0.

[Figure: distribution curve with the critical value (z score) marked]

Page 39: 11주차

Critical Value

Value(s) that separate the critical region from the values that would not lead to a rejection of H0.

[Figure: distribution curve with the critical value (z score) separating the "Fail to reject H0" and "Reject H0" regions]

Page 40: 11주차

Tests Involving the Normal Distribution

Level of significance: 0.05

The critical region (or region of rejection of the hypothesis, or region of significance): the set of z scores outside the range -1.96 to 1.96.

The region of acceptance of the hypothesis (or region of nonsignificance): the set of z scores inside the range -1.96 to 1.96.

Page 41: 11주차

Tests Involving the Normal Distribution

Decision Rule

When the level of significance is 0.01, the value 2.58 should be used instead of 1.96.
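
The cutoffs 1.96 and 2.58 come from the standard normal distribution: each leaves half of the chosen tail probability in each tail. A minimal sketch using Python's standard library follows; the helper name `two_tailed_critical_z` is an assumption made for illustration.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """z value that leaves alpha/2 in each tail of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    z_crit = two_tailed_critical_z(alpha)
    print(f"alpha = {alpha}: reject H0 when |z| > {z_crit:.2f}")
```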

Page 42: 11주차

Two-tailed, Left-tailed, and Right-tailed Tests

Page 43: 11주차

Left-tailed Test

H0: µ ≥ 200

H1: µ < 200

Page 44: 11주차

Left-tailed Test

H0: µ ≥ 200

H1: µ < 200 (the "<" points left)

Page 45: 11주차

Left-tailed Test

H0: µ ≥ 200

H1: µ < 200 (the "<" points left)

[Figure: curve centered at 200; the left tail (Reject H0) holds the values that differ significantly from 200, the rest is Fail to reject H0]

Page 46: 11주차

Right-tailed Test

H0: µ ≤ 200

H1: µ > 200

Page 47: 11주차

Right-tailed Test

H0: µ ≤ 200

H1: µ > 200 (the ">" points right)

Page 48: 11주차

Right-tailed Test

H0: µ ≤ 200

H1: µ > 200 (the ">" points right)

[Figure: curve centered at 200; the right tail (Reject H0) holds the values that differ significantly from 200, the rest is Fail to reject H0]

Page 49: 11주차

Two-tailed Test

H0: µ = 200

H1: µ ≠ 200

Page 50: 11주차

Two-tailed Test

H0: µ = 200

H1: µ ≠ 200

α is divided equally between the two tails of the critical region.

Page 51: 11주차

Two-tailed Test

H0: µ = 200

H1: µ ≠ 200 (≠ means less than or greater than)

α is divided equally between the two tails of the critical region.

Page 52: 11주차

Two-tailed Test

H0: µ = 200

H1: µ ≠ 200 (≠ means less than or greater than)

α is divided equally between the two tails of the critical region.

[Figure: curve centered at 200; both tails (Reject H0) hold the values that differ significantly from 200, the middle is Fail to reject H0]

Page 53: 11주차


Summary of One- and Two-Tail Tests…

[Figure: three curves showing a One-Tail Test (left tail), a Two-Tail Test, and a One-Tail Test (right tail)]

Page 54: 11주차

One-Tailed and Two-Tailed Tests

Two-tailed tests or two-sided tests: When we display interest in extreme values of the statistic S or its corresponding z score on both sides of the mean, i.e., in both tails of the distribution.

One-tailed tests or one-sided tests: When we are interested only in extreme values to one side of the mean, i.e., in one tail of the distribution, as, for example, when we are testing the hypothesis that one process is better than another (which is different from testing whether one process is better or worse than the other).

Page 55: 11주차

P Value

The null hypothesis H0 will be an assertion that a population parameter has a specific value, and the alternative hypothesis H1 will be one of the following assertions:

(i) The parameter is greater than the stated value (right-tailed test).

(ii) The parameter is less than the stated value (left-tailed test).

(iii) The parameter is either greater than or less than the stated value (two-tailed test).

P value of the test: The probability that a value of S in the direction(s) of H1 and as extreme as the one that actually did occur would occur if H0 were true.
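
For the case where the test statistic is a standard normal z score, the three kinds of P value can be sketched as follows. The function name `p_value`, its `tail` argument, and the observed z score are assumptions made for this illustration.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P value of an observed z score; tail is 'right', 'left', or 'two'."""
    std_normal = NormalDist()
    if tail == "right":   # H1: parameter is greater than the stated value
        return 1 - std_normal.cdf(z)
    if tail == "left":    # H1: parameter is less than the stated value
        return std_normal.cdf(z)
    if tail == "two":     # H1: parameter differs from the stated value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")

z_observed = 1.96  # hypothetical observed z score
for tail in ("right", "left", "two"):
    print(f"{tail}-tailed P value: {p_value(z_observed, tail):.4f}")
```

For the same observed z score, the two-tailed P value is twice the smaller one-tailed value, because extreme results in either direction count against H0.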

Page 56: 11주차


Interpreting the p-value…

The smaller the p-value, the more statistical evidence exists to support the alternative hypothesis.

• If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.

• If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.

• If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.

• If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.

We observe a p-value of .0069, hence there is overwhelming evidence to support H1: μ > 170.

Page 57: 11주차


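Interpreting the p-value…

[Figure: p-value scale from 0 to .10. Below .01: Overwhelming Evidence (Highly Significant). From .01 to .05: Strong Evidence (Significant). From .05 to .10: Weak Evidence (Not Significant). Above .10: No Evidence (Not Significant). The observed p = .0069 falls in the first band.]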

Page 58: 11주차

P Value

Page 59: 11주차

P Value

Page 60: 11주차

P Value

Small P values provide evidence for rejecting the null hypothesis in favor of the alternative hypothesis, and large P values provide evidence for not rejecting the null hypothesis in favor of the alternative hypothesis.

The P value and the level of significance do not provide criteria for rejecting or not rejecting the null hypothesis by itself, but for rejecting or not rejecting the null hypothesis in favor of the alternative hypothesis.

When the test statistic S is the standard normal random variable, the table in Appendix C is sufficient to compute the P value, but when S is one of the t, F, or chi-square random variables, all of which have different distributions depending on their degrees of freedom, either computer software or more extensive tables will be needed to compute the P value.

Page 61: 11주차

Special Tests of Significance for Large Samples: Means

Page 62: 11주차

Special Tests of Significance for Large Samples: Means

Page 63: 11주차

Special Tests of Significance for Large Samples: Means

Page 64: 11주차

Special Tests of Significance for Large Samples: Means

Page 65: 11주차

Our Problem:

The education department at a university has been accused of “grade inflation” so education majors have much higher GPAs than students in general.

GPAs of all education majors should be compared with the GPAs of all students. There are 1000s of education majors, far too many to interview. How can this be investigated without interviewing all education majors?

Page 66: 11주차

What we know:

The average GPA for all students is 2.70. This value is a parameter: μ = 2.70.

The statistical information for a random sample of education majors:

$\bar{X}$ = 3.00

s = 0.70

N = 117

Page 67: 11주차

Questions to ask:

Is there a difference between the parameter (2.70) and the statistic (3.00)?

Could the observed difference have been caused by random chance?

Is the difference real (significant)?

Page 68: 11주차

Two Possibilities:

1. The sample mean (3.00) is the same as the pop. mean (2.70).

The difference is trivial and caused by random chance.

2. The difference is real (significant). Education majors are different from all students.

Page 69: 11주차

The Null and Alternative Hypotheses:

1. Null Hypothesis (H0): The difference is caused by random chance.

The H0 always states there is “no significant difference.” In this case, we mean that there is no significant difference between the population mean and the sample mean.

2. Alternative Hypothesis (H1): “The difference is real”.

The H1 always contradicts the H0.

One (and only one) of these explanations must be true. Which one?

Page 70: 11주차

Test the Explanations

We always test the Null Hypothesis.

Assuming that the H0 is true:

What is the probability of getting the sample mean (3.00) if the H0 is true and all education majors really have a mean of 2.70? In other words, the difference between the means is due to random chance.

If the probability associated with this difference is less than 0.05, reject the null hypothesis.

Page 71: 11주차

Test the Hypotheses

Use the .05 value as a guideline to identify differences that would be rare or extremely unlikely if H0 is true.

This “alpha” value delineates the “region of rejection.”

Use the Z score formula for single samples and Appendix A to determine the probability of getting the observed difference.

If the probability is less than .05, the calculated or “observed” Z score will be beyond ±1.96 (the “critical” Z score).

Page 72: 11주차

Two-tailed Hypothesis Test:

When α = .05, then .025 of the area is distributed on either side of the curve in area (C )

The .95 in the middle section represents no significant difference between the population and the sample mean.

The cut-off between the middle section and +/- .025 is represented by a Z-value of +/- 1.96.

[Figure: normal curve with the critical regions (c) beyond Z = -1.96 and Z = +1.96]

Page 73: 11주차

Testing Hypotheses:Using The Five Step Model…

1. Make Assumptions and meet test requirements.

2. State the null hypothesis.

3. Select the sampling distribution and establish the critical region.

4. Compute the test statistic.

5. Make a decision and interpret results.

Page 74: 11주차

Step 1: Make Assumptions and Meet Test Requirements

Random sampling: Hypothesis testing assumes samples were selected using random sampling. In this case, the sample of 117 cases was randomly selected from all education majors.

Level of Measurement is Interval-Ratio: GPA is I-R, so the mean is an appropriate statistic.

Sampling Distribution is normal in shape: This is a “large” sample (N ≥ 100).

Page 75: 11주차

Step 2 State the Null Hypothesis

H0: μ = 2.7 (in other words, H0: $\bar{X}$ = μ)

You can also state H0: No difference between the sample mean and the population parameter.

(In other words, the sample mean of 3.0 is really the same as the population mean of 2.7; the difference is not real but is due to chance.)

The sample of 117 comes from a population that has a GPA of 2.7.

The difference between 2.7 and 3.0 is trivial and caused by random chance.

Page 76: 11주차

Step 2 (cont.) State the Alternate Hypothesis

H1: μ ≠ 2.7 (or, H1: $\bar{X}$ ≠ μ)

Or H1: There is a difference between the sample mean and the population parameter.

The sample of 117 comes from a population that does not have a GPA of 2.7. In reality, it comes from a different population.

The difference between 2.7 and 3.0 reflects an actual difference between education majors and other students.

Note that we are testing whether the population the sample comes from is a different population or is the same as the general student population.

Page 77: 11주차

Step 3 Select Sampling Distribution and Establish the Critical Region

Sampling Distribution = Z

Alpha (α) = .05

α is the indicator of “rare” events.

Any difference with a probability less than α is rare and will cause us to reject the H0.

Page 78: 11주차

Step 3 (cont.) Select Sampling Distribution and Establish the Critical Region

Critical Region begins at Z = ±1.96

This is the critical Z score associated with α = .05, two-tailed test.

If the obtained Z score falls in the Critical Region, or “the region of rejection,” then we would reject the H0.

Page 79: 11주차

Step 4: Use Formula to Compute the Test Statistic (Z for large samples, N ≥ 100)

$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{N}}$

Page 80: 11주차

When the population σ is not known, use the following formula:

$Z = \dfrac{\bar{X} - \mu}{s / \sqrt{N - 1}}$

Page 81: 11주차

Test the Hypotheses

We can substitute the sample standard deviation (s) for the population standard deviation (σ) and correct for bias by substituting N - 1 in the denominator.

Substituting the values into the formula, we calculate a Z score of 4.62.

$Z = \dfrac{3.0 - 2.7}{0.7 / \sqrt{117 - 1}} = 4.62$
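
The Z score of 4.62 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the values given for the education-majors sample; the variable names are illustrative, and the p-value line is an extra check that goes beyond what the slides compute.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 3.00  # mean GPA of the sample of education majors
pop_mean = 2.70     # mean GPA of all students (the parameter)
s = 0.70            # sample standard deviation
n = 117             # sample size

# Z with s substituted for sigma and N - 1 in the denominator, as on the slide
z = (sample_mean - pop_mean) / (s / sqrt(n - 1))

z_crit = 1.96       # two-tailed critical value for alpha = .05
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))  # extra check, not on the slides

print(f"Z = {z:.2f}")
print(f"Falls in the critical region: {abs(z) > z_crit}")
print(f"Two-tailed p-value below .05: {p_two_tailed < 0.05}")
```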

Page 82: 11주차

Step 5 Make a Decision and Interpret Results

The obtained Z score fell in the Critical Region, so we reject the H0.

If the H0 were true, a sample outcome of 3.00 would be unlikely. Therefore, the H0 is judged to be false and is rejected.

Education majors have a GPA that is significantly different from the general student body (Z = 4.62, α = .05).*

*Note: Always report significant statistics.

Page 83: 11주차

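Looking at the curve: (Area C = Critical Region when α = .05)

[Figure: normal curve with the critical regions (c) beyond Z = -1.96 and Z = +1.96; the obtained Z = +4.62 lies far inside the right-hand critical region]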

Page 84: 11주차

Summary:

The GPA of education majors is significantly different from the GPA of the general student body.

In hypothesis testing, we try to identify statistically significant differences that did not occur by random chance.

In this example, the difference between the parameter 2.70 and the statistic 3.00 was large and unlikely (p < .05) to have occurred by random chance.

Page 85: 11주차

Summary (cont.)

We rejected the H0 and concluded that the difference was significant.

It is very likely that education majors have GPAs higher than the general student body.

Page 86: 11주차

Special Tests of Significance for Large Samples: Proportions

Page 87: 11주차

Special Tests of Significance for Large Samples: Proportions

Page 88: 11주차

Special Tests of Significance for Large Samples: Difference of Means

Page 89: 11주차

Special Tests of Significance for Large Samples: Difference of Means

Page 90: 11주차

Special Tests of Significance for Large Samples: Difference of Means

Page 91: 11주차

Special Tests of Significance for Large Samples: Difference of Means

Page 92: 11주차

Special Tests of Significance for Large Samples: Difference of Means

Page 93: 11주차

Special Tests of Significance for Large Samples: Difference of Means

Page 94: 11주차

Special Tests of Significance for Large Samples: Difference of Proportions

Page 95: 11주차

Special Tests of Significance for Large Samples: Difference of Proportions

Page 96: 11주차

Special Tests of Significance for Large Samples: Difference of Proportions

Page 97: 11주차

Special Tests of Significance for Small Samples: Means

Page 98: 11주차

Special Tests of Significance for Small Samples: Means

Page 99: 11주차

Special Tests of Significance for Small Samples: Means

Page 100: 11주차

Special Tests of Significance for Small Samples: Means

Page 101: 11주차

Using the Student’s t Distribution for Small Samples (One Sample T-Test)

When the sample size is small (approximately < 100), the Student’s t distribution should be used (see Appendix B).

The test statistic is known as “t”.

The curve of the t distribution is flatter than that of the Z distribution, but as the sample size increases, the t-curve starts to resemble the Z-curve (see text p. 230 for illustration).

Page 102: 11주차

Degrees of Freedom

The curve of the t distribution varies with sample size (the smaller the size, the flatter the curve)

In using the t-table, we use “degrees of freedom” based on the sample size.

For a one-sample test, df = N – 1.

When looking at the table, find the t-value for the appropriate df = N – 1. This will be the cutoff point for your critical region.

Page 103: 11주차

Formula for one sample t-test:

$t = \dfrac{\bar{X} - \mu}{S / \sqrt{N - 1}}$

Page 104: 11주차

Example

A random sample of 26 sociology graduates had a mean score of 458 on the GRE advanced sociology test, with a standard deviation of 20. Is this significantly different from the population average (µ = 440)?

Page 105: 11주차

Solution (using five step model)

Step 1: Make Assumptions and Meet Test Requirements:

1. Random sample

2. Level of measurement is interval-ratio

3. The sample is small (<100)

Page 106: 11주차

Solution (cont.)

Step 2: State the null and alternate hypotheses.

H0: µ = 440 (or H0: $\bar{X}$ = μ)

H1: µ ≠ 440

Page 107: 11주차

Solution (cont.)

Step 3: Select Sampling Distribution and Establish the Critical Region

1. Small sample, I-R level, so use t distribution.

2. Alpha (α) = .05

3. Degrees of Freedom = N-1 = 26-1 = 25

4. Critical t = ±2.060

Page 108: 11주차

Solution (cont.)

Step 4: Use Formula to Compute the Test Statistic

$t = \dfrac{\bar{X} - \mu}{S / \sqrt{N - 1}} = \dfrac{458 - 440}{20 / \sqrt{26 - 1}} = 4.5$
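
As a check on the arithmetic, the t statistic for the GRE example can be computed directly. This is a minimal sketch with illustrative variable names; the critical value 2.060 is the table value quoted in Step 3 and is entered by hand here, since Python's standard library has no t distribution.

```python
from math import sqrt

sample_mean = 458  # mean GRE score of the sample of sociology graduates
pop_mean = 440     # population average
s = 20             # sample standard deviation
n = 26             # sample size

# One-sample t with N - 1 in the denominator, as in the slide's formula
t = (sample_mean - pop_mean) / (s / sqrt(n - 1))

t_crit = 2.060     # critical t from the slides (df = 25, alpha = .05, two-tailed)
reject = abs(t) > t_crit

print(f"t = {t:.2f}, df = {n - 1}")
print(f"Reject H0: {reject}")
```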

Page 109: 11주차

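Looking at the curve for the t distribution

Alpha (α) = .05

[Figure: t curve with the critical regions (c) beyond t = -2.060 and t = +2.060; the obtained t = +4.50 lies inside the right-hand critical region]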

Page 110: 11주차

Step 5 Make a Decision and Interpret Results

The obtained t score fell in the Critical Region, so we reject the H0 (t (obtained) > t (critical)).

If the H0 were true, a sample outcome of 458 would be unlikely. Therefore, the H0 is judged to be false and is rejected.

Sociology graduates have a GRE score that is significantly different from the general student body (t = 4.5, df = 25, α = .05).

Page 111: 11주차

Testing Sample Proportions:

When your variable is at the nominal (or ordinal) level the one sample z-test for proportions should be used.

If the data are in % format, convert to a proportion first.

The method is the same as the one sample Z-test for means (see above)
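
A hedged sketch of what such a test might look like follows. The slides do not give the formula, so the standard large-sample form z = (p̂ − p0) / sqrt(p0(1 − p0)/N) is assumed here, and the function name and the data (60 "successes" out of 100 against a hypothesized proportion of 0.5) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_proportion_z(successes, n, p0):
    """z statistic for H0: population proportion = p0 (large-sample approximation)."""
    p_hat = successes / n          # sample proportion
    se = sqrt(p0 * (1 - p0) / n)   # standard error of p_hat when H0 is true
    return (p_hat - p0) / se

# Hypothetical data: 60% in the sample, i.e. 0.60 as a proportion
z = one_sample_proportion_z(successes=60, n=100, p0=0.5)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Z = {z:.2f}, two-tailed p-value = {p_two_tailed:.4f}")
print(f"Reject H0 at alpha = .05: {abs(z) > 1.96}")
```

The decision step is the same as in the test for means: compare the obtained Z with the critical Z for the chosen α.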

Page 112: 11주차

Special Tests of Significance for Small Samples: Variance

Page 113: 11주차

Special Tests of Significance for Small Samples: Variance

Page 114: 11주차

Special Tests of Significance for Small Samples: Variance

Page 115: 11주차

Special Tests of Significance for Small Samples: Difference of Means

Page 116: 11주차

Special Tests of Significance for Small Samples: Difference of Means

Page 117: 11주차

Special Tests of Significance for Small Samples: Ratios of Variances

Page 118: 11주차

Special Tests of Significance for Small Samples: Ratios of Variances

Page 119: 11주차

Special Tests of Significance for Small Samples: Ratios of Variances

Page 120: 11주차


Concepts of Hypothesis Testing…

For example, if we’re trying to decide whether the mean is not equal to 350, a large value of $\bar{x}$ (say, 600) would provide enough evidence.

If $\bar{x}$ is close to 350 (say, 355), we could not say that this provides a great deal of evidence to infer that the population mean is different from 350.

Page 121: 11주차


Concepts of Hypothesis Testing (4)…

The two possible decisions that can be made:

Conclude that there is enough evidence to support the alternative hypothesis

(also stated as: reject the null hypothesis in favor of the alternative)

Conclude that there is not enough evidence to support the alternative hypothesis

(also stated as: failing to reject the null hypothesis in favor of the alternative)

NOTE: we do not say that we accept the null hypothesis if a statistician is around…

Page 122: 11주차


Concepts of Hypothesis Testing (2)…

The testing procedure begins with the assumption that the null hypothesis is true.

Thus, until we have further statistical evidence, we will assume:

H0: μ = 350 (assumed to be TRUE)

The next step will be to determine the sampling distribution of the sample mean $\bar{x}$, assuming the true mean is 350.

$\bar{x}$ is normal with mean $\mu_{\bar{x}} = 350$ and standard deviation $\sigma_{\bar{x}} = 75/\sqrt{25} = 15$

Page 123: 11주차


Is the Sample Mean in the Guts of the Sampling Distribution??

Page 124: 11주차


Three ways to determine this: First way

1. Unstandardized test statistic: Is $\bar{x}$ in the guts of the sampling distribution? That depends on what you define as the “guts” of the sampling distribution.

If we define the guts as the center 95% of the distribution [this means α = 0.05], then the critical values that define the guts will be 1.96 standard deviations of $\bar{x}$ on either side of the mean of the sampling distribution [350], or

UCV = 350 + 1.96*15 = 350 + 29.4 = 379.4

LCV = 350 – 1.96*15 = 350 – 29.4 = 320.6
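The critical values above can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the variable names are illustrative, and σ = 75 and n = 25 are read off the standard error calculation 75/SQRT(25) shown earlier.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 350      # hypothesized mean under H0
sigma = 75     # population standard deviation (from 75/SQRT(25))
n = 25         # sample size
alpha = 0.05   # significance level, split over two tails

std_err = sigma / sqrt(n)                      # standard error of x-bar
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

lcv = mu0 - z_crit * std_err   # lower critical value
ucv = mu0 + z_crit * std_err   # upper critical value
print(f"LCV = {lcv:.1f}, UCV = {ucv:.1f}")   # LCV = 320.6, UCV = 379.4
```

Any sample mean between LCV and UCV lies in the “guts”, so the null hypothesis would not be rejected for it.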

Page 125: 11주차

1. Unstandardized Test Statistic Approach

Page 126: 11주차

Three ways to determine this: Second way

2. Standardized test statistic: Since we defined the “guts” of the sampling distribution to be the center 95% [α = 0.05],

If the Z-Score for the sample mean is greater than 1.96, we know that x̄ will be in the reject region on the right side, or

If the Z-Score for the sample mean is less than -1.96, we know that x̄ will be in the reject region on the left side.

Z = (x̄ – μ)/(σ/SQRT(n)) = (370.16 – 350)/15 = 1.344

Is this Z-Score in the guts of the sampling distribution???
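As a quick check, the standardized test statistic can be computed and compared with ±1.96 directly. The sketch below assumes the same values as the slides (x̄ = 370.16, μ = 350, σ = 75, n = 25); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 370.16, 350, 75, 25, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))          # standardized test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for a two-tail test

# The sample mean is in the rejection region only if |z| exceeds the critical value
reject_h0 = abs(z) > z_crit
print(f"z = {z:.3f}, reject H0: {reject_h0}")   # z = 1.344, reject H0: False
```

Since |z| is below 1.96, the Z-Score is in the guts of the sampling distribution.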

Page 127: 11주차

2. Standardized Test Statistic Approach

Page 128: 11주차

Three ways to determine this: Third way

3. The p-value approach (which is generally used with a computer and statistical software): Increase the “Rejection Region” until it “captures” the sample mean.

For this example, since x̄ is to the right of the mean, calculate

P(x̄ > 370.16) = P(Z > 1.344) = 0.0901

Since this is a two tailed test, you must double this area for the p-value.

p-value = 2*(0.0901) = 0.1802

Since we defined the guts as the center 95% [α = 0.05], the reject region is the other 5%. Since our sample mean, x̄, is in the 18.02% region, it cannot be in our 5% rejection region [α = 0.05].
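The p-value calculation can be sketched in Python as follows, again with illustrative variable names. Note that software evaluates the normal curve at z = 1.344 exactly, so the result is roughly 0.179 rather than the table-based 0.1802 (the table lookup rounds z to 1.34); the conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 370.16, 350, 75, 25, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))

# Area to the right of the observed z, then doubled for a two-tail test
p_one_tail = 1 - NormalDist().cdf(z)
p_value = 2 * p_one_tail

reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```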

Page 129: 11주차

3. p-value approach

Page 130: 11주차

Statistical Conclusions:

Unstandardized Test Statistic:

Since LCV (320.6) < x̄ (370.16) < UCV (379.4), we fail to reject the null hypothesis at a 5% level of significance.

Standardized Test Statistic:

Since -Zα/2 (-1.96) < Z (1.344) < Zα/2 (1.96), we fail to reject the null hypothesis at a 5% level of significance.

P-value:

Since p-value (0.1802) > 0.05 [α], we fail to reject the null hypothesis at a 5% level of significance.

Page 131: 11주차

Example 11.1…

A department store manager determines that a new billing system will be cost-effective only if the mean monthly account is more than $170.

A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. The accounts are approximately normally distributed with a standard deviation of $65.

Can we conclude that the new system will be cost-effective?

Page 132: 11주차

Example 11.1…

The system will be cost effective if the mean account balance for all customers is greater than $170.

We express this belief as our research hypothesis, that is:

H1: μ > 170 (this is what we want to determine)

Thus, our null hypothesis becomes:

H0: μ = 170 (this specifies a single value for the parameter of interest). Strictly speaking, H0: μ ≤ 170.

Page 133: 11주차

Example 11.1…

What we want to show:

H1: μ > 170

H0: μ ≤ 170 (we’ll assume this is true)

Normally we put H0 first.

We know:

n = 400,

x̄ = 178, and

σ = 65

σ/SQRT(n) = 65/SQRT(400) = 3.25

α = 0.05

Page 134: 11주차

Example 11.1… Rejection Region…

The rejection region is a range of values such that if the test statistic falls into that range, we decide to reject the null hypothesis in favor of the alternative hypothesis.

The critical value of x̄ is the cutoff beyond which we reject H0.

Page 135: 11주차

Example 11.1…

At a 5% significance level (i.e. α = 0.05), we get [all in one tail]

Zα = Z0.05 = 1.645

Therefore, UCV = 170 + 1.645*3.25 = 175.35

Since our sample mean (178) is greater than the critical value we calculated (175.35), we reject the null hypothesis in favor of H1

OR

Z = (x̄ – μ)/(σ/SQRT(n)) = (178 – 170)/3.25 = 2.46 (> 1.645) Reject null

OR

p-value = P(x̄ > 178) = P(Z > 2.46) = 0.0069 < 0.05 Reject null
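All three routes for Example 11.1 can be checked in one short script. This is a sketch under the example's stated values (n = 400, x̄ = 178, σ = 65, α = 0.05, right-tail test); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, sigma, n, alpha = 170, 178, 65, 400, 0.05

std_err = sigma / sqrt(n)                    # 65/20 = 3.25
z_alpha = NormalDist().inv_cdf(1 - alpha)    # about 1.645, all of alpha in the right tail

# 1. Unstandardized: compare x-bar with the upper critical value
ucv = mu0 + z_alpha * std_err
# 2. Standardized: compare z with z_alpha
z = (x_bar - mu0) / std_err
# 3. p-value: area to the right of the observed z (one tail only)
p_value = 1 - NormalDist().cdf(z)

print(f"UCV = {ucv:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0:", x_bar > ucv, z > z_alpha, p_value < alpha)
```

The three comparisons always agree, because they are the same test expressed on different scales.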

Page 136: 11주차

Example 11.1… The Big Picture…

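[Figure: sampling distribution of x̄ under H0: μ = 170, with the critical value 175.35 marked and the sample mean x̄ = 178 falling in the rejection region to its right.]

Reject H0 in favor of H1: μ > 170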

Page 137: 11주차

Conclusions of a Test of Hypothesis…

If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.

If we fail to reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true. This does not mean that we have proven that the null hypothesis is true!

Keep in mind that committing a Type I error OR a Type II error can be VERY bad depending on the problem.

Page 138: 11주차

One tail test with rejection region on right

The last example was a one tail test, because the rejection region is located in only one tail of the sampling distribution:

More correctly, this was an example of a right tail test.

H1: μ > 170

H0: μ ≤ 170

Page 139: 11주차

One tail test with rejection region on left

The rejection region will be in the left tail.

Page 140: 11주차

Two tail test with rejection region in both tails

The rejection region is split equally between the two tails.

Page 141: 11주차

Example 11.2… Students work

AT&T argues that its rates are such that customers won’t see a difference in their phone bills between AT&T and its competitors. They calculate the mean and standard deviation for all their customers at $17.09 and $3.87 (respectively). Note: We don’t know the true value of σ, so we estimate σ from the data [σ ≈ s = 3.87]; the sample is large, so this is not a concern.

They then sample 100 customers at random and recalculate a monthly phone bill based on competitor’s rates.

What we want to show is the alternative hypothesis:

H1: μ ≠ 17.09. We test this by assuming that the null hypothesis is true:

H0: μ = 17.09

Page 142: 11주차

Example 11.2…

The rejection region is set up so we can reject the null hypothesis when the test statistic is large or when it is small.

That is, we set up a two-tail rejection region. The total area in the rejection region must sum to α, so we divide α by 2.

[Figure: two-tail rejection region; reject H0 when the test statistic is “small” (left tail) or “large” (right tail).]

Page 143: 11주차

Example 11.2…

At a 5% significance level (i.e. α = .05), we have

α/2 = .025. Thus, z.025 = 1.96 and our rejection region is:

z < –1.96 -or- z > 1.96

[Figure: standard normal curve centered at 0, with rejection regions below –z.025 and above +z.025.]

Page 144: 11주차

Example 11.2…

From the data, we calculate x̄ = 17.55

Using our standardized test statistic:

z = (x̄ – μ)/(σ/SQRT(n))

We find that:

z = (17.55 – 17.09)/(3.87/SQRT(100)) = 0.46/0.387 = 1.19

Since z = 1.19 is not greater than 1.96, nor less than –1.96 we cannot reject the null hypothesis in favor of H1. That is “there is insufficient evidence to infer that there is a difference between the bills of AT&T and the competitor.”
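The two-tail test in Example 11.2 can be sketched the same way. The values are those given in the example (x̄ = 17.55, μ = 17.09, s = 3.87 used in place of σ, n = 100); the p-value line is an addition for comparison with the earlier p-value approach and is not computed on the slides.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, s, n, alpha = 17.09, 17.55, 3.87, 100, 0.05

z = (x_bar - mu0) / (s / sqrt(n))              # standardized test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

reject_h0 = z < -z_crit or z > z_crit          # two-tail rejection region
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # doubled because the test is two-tailed

print(f"z = {z:.2f}, reject H0: {reject_h0}, p-value = {p_value:.3f}")
```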

Page 145: 11주차

Probability of a Type II Error – β

A Type II error occurs when a false null hypothesis is not rejected or “you accept the null when it is not true” but don’t say it this way if a statistician is around.

In practice, this is by far the most serious error you can make in most cases, especially in the “quality field”.
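As a rough illustration of how the probability of a Type II error can be computed, the sketch below reuses the right-tail test from Example 11.1 and assumes, purely for illustration, that the true mean is $180. That alternative value is hypothetical and does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 170, 65, 400, 0.05
mu_true = 180   # ASSUMPTION: hypothetical true mean, chosen only for illustration

std_err = sigma / sqrt(n)
ucv = mu0 + NormalDist().inv_cdf(1 - alpha) * std_err   # critical value under H0

# Type II error: x-bar falls below the critical value (so H0 is not rejected)
# even though the true mean is mu_true
beta = NormalDist(mu_true, std_err).cdf(ucv)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

The closer the assumed true mean is to 170, the larger β becomes, because the two sampling distributions overlap more.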

Page 146: 11주차

Probability you ship pills whose mean amount of medication is 7 mg: approximately 67%