Statistical Hypothesis Tests: Notes of STAT6205 by Dr. Fan

Statistical Hypothesis Tests - California State …sfan/SubPages/CSUteach/st6205/lecture notes...Statistical Hypotheses • A statistical hypothesis is an assumption or statement concerning

May 17, 2018


vonhan

Overview

• Introduction to hypothesis tests (Sections 7.1, 7.2)
  o General logic
  o Two types of error
  o Parametric tests for one mean and for proportions
  o What is the best test for a given situation?
• Order statistics (Section 8.3)
• Wilcoxon tests (Section 8.5)


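Statistical Hypotheses

• A statistical hypothesis is an assumption or statement concerning one or more population parameters.

• Simple vs. composite hypotheses: a simple hypothesis specifies the distribution completely (e.g. p = 0.2), while a composite hypothesis allows a range of values (e.g. p < 0.2).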

E.g. A pharmaceutical company wants to be able to claim that for its newest medication the proportion of patients who experience side effects is less than 20%.

Q. What are the two possible conclusions (hypotheses) here?


Hypothesis Tests

• A statistical test checks a statistical hypothesis using data. It involves five steps:

1. Set up the null (Ho) and alternative (H1) hypotheses
2. Find an appropriate test statistic (T.S.)
3. Find the rejection (critical) region (R.R.)
4. Reject Ho if the observed test statistic falls in the R.R.; otherwise, do not reject Ho
5. Report the result in the context of the situation


Determine Ho and H1

• The null hypothesis Ho is the no-change hypothesis
• The alternative hypothesis H1 says that Ho is false

The Logic of Hypothesis Tests: “Assume Ho is a possible truth until proven false”

Analogous to “Presumed innocent until proven guilty”, the logic of the US judicial system

Q: What are the two possible conclusions?


Golden Rule: Ho must be a simple hypothesis.

Practical Rule: If possible, the hypothesis we hope to prove (called the research hypothesis) goes in H1.

Back to the drug example: set up Ho and H1.


Types of Errors

                H0 true                       H0 false
We accept H0    Good! (Correct)               Type II Error, or “β Error”
We reject H0    Type I Error, or “α Error”    Good! (Correct)


More Terms

• α = significance level of a test = Type I error rate
• Power of a test = 1 − Type II error rate = 1 − β
• We control only α, not β, so we don’t say “accept Ho”.
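In symbols, these terms can be written as conditional probabilities. This is a standard restatement, offered as a sketch rather than something taken from the slides:

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ true}) && \text{(Type I error rate, significance level)} \\
\beta &= P(\text{fail to reject } H_0 \mid H_0 \text{ false}) && \text{(Type II error rate)} \\
\text{Power} &= 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ false})
\end{align*}
```

When H1 is composite, β (and hence the power) depends on which parameter value in H1 is actually true.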


Report the Conclusion

• Reject Ho: the data shows strong evidence supporting Ha

E.g. The data shows strong evidence, at the 10% significance level, that the proportion of users who will experience side effects is less than 20%.

• Fail to reject Ho: the data does not provide sufficient evidence supporting Ha

E.g. Based on the data, there is not sufficient evidence at the 5% significance level that the proportion is less than 20%.


Tests for One Mean


Z Test

For normal populations or large samples (n > 30):

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

The computed value of Z is denoted by Z*.
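As a rough illustration of the Z statistic, the R sketch below computes Z* from summary statistics and compares it with an upper-tailed critical value. The function name `z_stat` and all the numbers (sample mean 105, µ0 = 100, σ = 10, n = 25, α = 0.05) are hypothetical and do not come from the notes.

```r
# Z statistic from summary statistics; every input below is a made-up illustration.
z_stat <- function(xbar, mu0, sigma, n) {
  (xbar - mu0) / (sigma / sqrt(n))
}

z_star <- z_stat(xbar = 105, mu0 = 100, sigma = 10, n = 25)  # hypothetical values
z_crit <- qnorm(1 - 0.05)   # upper-tailed critical value at alpha = 0.05
reject <- z_star > z_crit   # TRUE means Z* falls in the rejection region
print(c(z_star = z_star, z_crit = z_crit))
print(reject)
```

For a lower-tailed or two-sided alternative, the comparison would change to `z_star < qnorm(0.05)` or `abs(z_star) > qnorm(1 - 0.05 / 2)`, respectively.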


Types of Tests


Example 1 (Cont.)

Conduct a test of Ho: µ = 2500 vs. H1: µ = 3000 at the 5% significance level.

1) What is the R.R.?

2) What is the power of the test?

Z test is the most powerful test!
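The slide with the data for Example 1 did not survive in this transcript, so σ and n are unknown here. As a sketch only, assuming hypothetical values σ = 1000 and n = 25, the rejection region and the power of the upper-tailed Z test could be computed in R as follows:

```r
# Rejection region and power for Ho: mu = 2500 vs. H1: mu = 3000 (upper-tailed Z test).
# sigma and n are HYPOTHETICAL; the values used in Example 1 are not in this transcript.
mu0 <- 2500; mu1 <- 3000; alpha <- 0.05
sigma <- 1000; n <- 25
se <- sigma / sqrt(n)

# Reject Ho when Z* > z_alpha, i.e. when the sample mean exceeds xbar_crit.
z_alpha   <- qnorm(1 - alpha)
xbar_crit <- mu0 + z_alpha * se

# Power = P(sample mean > xbar_crit) when the true mean is mu1.
power <- 1 - pnorm(xbar_crit, mean = mu1, sd = se)
print(c(xbar_crit = xbar_crit, power = power))
```

Because both hypotheses are simple, the power is a single number; it grows as n increases or as σ shrinks.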


P-Values

• The p-value is the smallest significance level at which Ho would be rejected given the observed value; it is also called the observed significance level.

p-value > α ⇒ fail to reject Ho

p-value < α ⇒ reject Ho (= accept Ha)

• That is, the p-value is the probability of seeing a result as extreme as (or more extreme than) what we observed, given that Ho is true.


• The level of significance (called the α level) is usually 0.05
• p-value > α ⇒ fail to reject Ho (??)
• p-value < α ⇒ reject Ho (= accept Ha)


Computing the p-Value for the Z-Test


Two-sided test: p-value = P(|Z| > |z*|) = 2 × P(Z > |z*|)
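The p-value for each type of alternative can be sketched in R with `pnorm`. The helper name `z_p_value` and the observed value 1.96 are illustrative assumptions, not part of the notes; the one-sided cases use the usual tail probabilities P(Z < z*) and P(Z > z*).

```r
# p-value of a Z test for an observed statistic z_star (illustrative helper).
z_p_value <- function(z_star, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = 2 * pnorm(-abs(z_star)),  # P(|Z| > |z*|)
         less      = pnorm(z_star),            # lower-tailed: P(Z < z*)
         greater   = 1 - pnorm(z_star))        # upper-tailed: P(Z > z*)
}

z_star  <- 1.96  # hypothetical observed value
p_two   <- z_p_value(z_star, "two.sided")
p_upper <- z_p_value(z_star, "greater")
print(c(two_sided = p_two, upper = p_upper))
```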


t Test

• For normal populations with unknown σ

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t(n-1)$$

E.g. Revisit Example 1
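A minimal R sketch of the t test follows, computing the statistic by hand and checking it against the built-in `t.test`. The data vector is hypothetical (the data for Example 1 are not in this transcript), and the upper-tailed alternative is an assumption made for illustration.

```r
# One-sample t test by hand, checked against R's built-in t.test().
# The data vector is HYPOTHETICAL; it is not the data from Example 1.
x   <- c(2610, 2750, 2420, 2910, 2580, 2680, 2830, 2490)
mu0 <- 2500

n      <- length(x)
t_star <- (mean(x) - mu0) / (sd(x) / sqrt(n))          # t statistic
p_val  <- pt(t_star, df = n - 1, lower.tail = FALSE)   # upper-tailed p-value

fit <- t.test(x, mu = mu0, alternative = "greater")    # same test, built in
print(c(t_star = t_star, p_value = p_val))
```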


One Population


Testing Hypotheses about a Proportion

• Three possible Ho and Ha

Ho        Ha        Type
p = po    p ≠ po    Two-sided
p ≥ po    p < po    One-sided (lower-tailed)
p ≤ po    p > po    One-sided (upper-tailed)

In the future, write every Ho as p = po.


The z-test for a Proportion

• When 1) the sample is a random sample and 2) n·po and n(1 − po) are both at least 10, an appropriate test statistic for p is

$$z = \frac{\hat{p} - p_o}{\sqrt{p_o(1 - p_o)/n}}$$


Example: New Drug (Cont.)

1. Ho: p ≥ 20% vs. Ha: p < 20%

2. Z-test statistic; α = 0.05

3. Find the rejection region or p-value

4. Decide whether to reject Ho

5. Report the conclusion in the context of the situation
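The five steps above can be sketched in R. The notes give no data for the drug example, so the sample size `n` and the count `x` below are hypothetical placeholders chosen only to make the arithmetic concrete.

```r
# Lower-tailed z test for one proportion, following the five steps.
# n and x are HYPOTHETICAL counts; the notes do not give data for the drug example.
p0 <- 0.20; alpha <- 0.05      # Step 1: Ho: p = 0.20 vs. Ha: p < 0.20
n  <- 400                      # hypothetical number of patients
x  <- 68                       # hypothetical number with side effects

stopifnot(n * p0 >= 10, n * (1 - p0) >= 10)   # large-sample condition

p_hat  <- x / n
z_star <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # Step 2: test statistic
p_val  <- pnorm(z_star)                            # Step 3: lower-tailed p-value
reject <- p_val < alpha                            # Step 4: decision
print(c(z_star = z_star, p_value = p_val))
print(reject)
```

With these made-up counts, z* is about −1.5 and the p-value is about 0.067, so Ho would not be rejected at α = 0.05. Step 5 would then read: there is not sufficient evidence that the proportion of patients with side effects is less than 20%.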


Hypothesis Test for the Difference of Two Population Proportions

• Step 1. Set up hypotheses

Ho: p1 = p2 and three possible Ha’s:

Ha: p1 ≠ p2 (two-tailed), or
Ha: p1 < p2 (lower-tailed), or
Ha: p1 > p2 (upper-tailed)


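• Step 2. Calculate the test statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where

$$\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$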


• Step 3. Find the p-value

1. The two samples must be independent random samples, and both must be large:

$$n_1\hat{p}_1 \ge 10,\; n_1(1 - \hat{p}_1) \ge 10 \quad\text{and}\quad n_2\hat{p}_2 \ge 10,\; n_2(1 - \hat{p}_2) \ge 10$$

2. When the above conditions are met, use the Z-table to find the p-value.

• Steps 4 and 5 are the same as before


Example: Bike to School

Of 80 randomly selected men, 30 regularly bicycled to campus; of 100 randomly selected women, 20 did.

• Find the p-value for testing (1: men; 2: women):

Ho: p1 = p2 vs. Ha: p1 > p2

Answer: z = 2.60, p = 0.0047
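The stated answer can be reproduced with a few lines of R that apply the pooled two-proportion z statistic to the counts in the example. Only the variable names are introduced here; the data are those given above.

```r
# Two-proportion z test for the bike-to-school example (1: men, 2: women).
x1 <- 30; n1 <- 80
x2 <- 20; n2 <- 100

p1_hat <- x1 / n1
p2_hat <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)   # pooled estimate under Ho: p1 = p2

z_star <- (p1_hat - p2_hat) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_val  <- 1 - pnorm(z_star)       # upper-tailed, since Ha: p1 > p2
print(round(c(z_star = z_star, p_value = p_val), 4))
```

The unrounded z* is about 2.605, so the computed p-value comes out near 0.0046; the 0.0047 in the answer corresponds to looking up z = 2.60 in the Z-table.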


Order Statistics

• Min & Max

• Joint and other orders


Problem 1: Suppose X1, X2, …, X5 are a random sample from U[0,1]. Find the pdf of X(2).

Problem 2: Suppose X1, X2, …, Xn are a random sample from U[0,1]. Show that X(k) ~ beta(k,n-k+1).
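Both problems follow from the general density of the k-th order statistic. The derivation below is a standard sketch, assuming a continuous parent distribution with cdf F and pdf f; it is not taken from the slides.

```latex
\begin{align*}
f_{X_{(k)}}(x) &= \frac{n!}{(k-1)!\,(n-k)!}\,[F(x)]^{k-1}\,[1-F(x)]^{n-k}\,f(x) \\
\text{For } U[0,1]:\quad f_{X_{(k)}}(x) &= \frac{n!}{(k-1)!\,(n-k)!}\,x^{k-1}(1-x)^{n-k}
  = \frac{\Gamma(n+1)}{\Gamma(k)\,\Gamma(n-k+1)}\,x^{k-1}(1-x)^{(n-k+1)-1}, \quad 0 \le x \le 1, \\
&\text{which is the } \mathrm{beta}(k,\, n-k+1) \text{ density (Problem 2).} \\
\text{Problem 1 } (n=5,\ k=2):\quad f_{X_{(2)}}(x) &= \frac{5!}{1!\,3!}\,x(1-x)^{3} = 20\,x(1-x)^{3}, \quad 0 \le x \le 1.
\end{align*}
```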


The CDF of X(k) and example
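The formula on this slide did not survive extraction. A standard form of the CDF, stated here as an assumption about what the slide showed, is P(X(k) ≤ x) = Σ_{j=k}^{n} C(n, j) F(x)^j (1 − F(x))^{n−j}, since X(k) ≤ x exactly when at least k of the n observations are at or below x. The R sketch below evaluates this sum for U[0,1] and compares it with the beta(k, n − k + 1) CDF; the function name `order_stat_cdf` and the values n = 5, k = 2 are illustrative.

```r
# CDF of the k-th order statistic of n iid draws, given Fx = F(x) of the parent:
# P(X(k) <= x) = P(at least k of the n draws are <= x), a binomial tail sum.
order_stat_cdf <- function(Fx, k, n) {
  j <- k:n
  sapply(Fx, function(p) sum(choose(n, j) * p^j * (1 - p)^(n - j)))
}

n <- 5; k <- 2
x <- seq(0, 1, by = 0.1)           # for U[0,1], F(x) = x
cdf_sum  <- order_stat_cdf(x, k, n)
cdf_beta <- pbeta(x, shape1 = k, shape2 = n - k + 1)
print(round(cbind(x, cdf_sum, cdf_beta), 4))
```

The two columns should agree, which is a numerical check of the beta result for uniform samples.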


Wilcoxon Tests

Ho: median of X = median of Y vs. H1: Ho is false

Wilcoxon tests (pp. 448-450)

• Assume the two distributions have similar shapes; they do not need to be normal

• See the supplementary material


Exercise 8.5-9

X = the lifetime of a light bulb of brand A

Y = the lifetime of a light bulb of brand B

Data: (in 100 hours)

X: 5.6 4.6 6.8 4.9 6.1 5.3 4.5 5.8 5.4 4.7

Y: 7.2 8.1 5.1 7.3 6.9 7.8 5.9 6.7 6.5 7.1

(a) Conduct the Wilcoxon test at the 5% level to test whether brand B has a longer lifetime in general.

A: W(Y) = 145 > 128 or Z = 3.024 > 1.645; reject Ho
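The answer to part (a) can be checked in R by ranking the combined sample and applying the normal approximation. This sketch uses the exercise data as given; the mean and variance formulas for W under Ho are the standard ones for the rank-sum statistic with no ties.

```r
# Wilcoxon rank-sum statistic for Exercise 8.5-9 (data in units of 100 hours).
x <- c(5.6, 4.6, 6.8, 4.9, 6.1, 5.3, 4.5, 5.8, 5.4, 4.7)  # brand A
y <- c(7.2, 8.1, 5.1, 7.3, 6.9, 7.8, 5.9, 6.7, 6.5, 7.1)  # brand B
n1 <- length(x); n2 <- length(y)

r <- rank(c(x, y))                    # ranks in the combined sample (no ties here)
W <- sum(r[(n1 + 1):(n1 + n2)])       # W(Y): sum of the ranks of the Y values

# Normal approximation under Ho.
mu_W   <- n2 * (n1 + n2 + 1) / 2
sd_W   <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z_star <- (W - mu_W) / sd_W
print(c(W = W, z_star = round(z_star, 3)))   # W = 145, z* = 3.024
```

R's built-in `wilcox.test(y, x, alternative = "greater")` reports the Mann-Whitney form of the statistic, W(Y) − n2(n2 + 1)/2 = 90, which leads to the same test.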

(b) Construct and interpret a Q-Q plot of these data.


R Code for Q-Q Plot

# Lifetimes in units of 100 hours
x <- c(5.6, 4.6, 6.8, 4.9, 6.1, 5.3, 4.5, 5.8, 5.4, 4.7)  # brand A
y <- c(7.2, 8.1, 5.1, 7.3, 6.9, 7.8, 5.9, 6.7, 6.5, 7.1)  # brand B

# Plot the sorted x values against the sorted y values
qqplot(x, y, xlab = "life time of brand A", ylab = "life time of brand B",
       main = "qqplot of Life time of Brand A vs. Brand B")
