Statistical Inference Week 3: Hypothesis Testing and t-tests

Jul 14, 2015


Eugene Yan
Page 1: Statistical inference: Hypothesis Testing and t-tests

Statistical Inference Week 3: Hypothesis Testing and t-tests

Page 2: Statistical inference: Hypothesis Testing and t-tests

Central Limit Theorem

What is the mean height ($\mu$) of all primary school children in Singapore?

Population = All primary school children in SG

Sample = Anderson Primary
Sample = Damai Primary
Sample = Red Swastika Primary
Sample = Zhenghua Primary
…

$\bar{x}_{\text{Anderson Primary}}$ = Mean height of 100 children from Anderson Primary
$\bar{x}_{\text{Damai Primary}}$ = Mean height of 100 children from Damai Primary
$\bar{x}_{\text{Red Swastika}}$ = Mean height of 100 children from Red Swastika Primary
$\bar{x}_{\text{Zhenghua Primary}}$ = Mean height of 100 children from Zhenghua Primary

Distribution of mean height $\sim N(\text{mean} = \mu,\ \text{standard error} = \sigma / \sqrt{100})$

From the sampling distribution:
- $\text{Mean}(\bar{x}) \approx \mu$
- $\text{SD}(\bar{x}) < \sigma$
- As sample size increases, SD decreases

Page 3: Statistical inference: Hypothesis Testing and t-tests

Central Limit Theorem (CLT)

The distribution of sample statistics (e.g., mean) is approximately normal, regardless of the underlying distribution, with mean = $\mu$ and variance = $\sigma^2 / n$

$\bar{x} \sim N(\text{mean} = \mu,\ \text{standard error} = \sigma / \sqrt{n})$

Further experimentation: http://bitly.com/clt_mean

Distribution of sample means is approximately normal

Mean of sample means ≈ population mean

SD of sample means (standard error) = population sd divided by square root of sample size

Applet source: Mine Çetinkaya-Rundel, Duke University
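The CLT claims on this slide can also be checked with a small simulation. The sketch below is illustrative only: the skewed "population" of heights is synthetic (an assumption, not real data), and the names `population` and `sample_means` are hypothetical.

```python
import random
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic, deliberately skewed "population" of heights in cm (assumption, not real data)
population = [120 + random.expovariate(1 / 15) for _ in range(50_000)]
mu, sigma = fmean(population), stdev(population)

def sample_means(n, draws=2000):
    """Means of `draws` random samples of size n, drawn without replacement."""
    return [fmean(random.sample(population, n)) for _ in range(draws)]

for n in (25, 100):
    means = sample_means(n)
    print(f"n={n}: mean of sample means={fmean(means):.2f} (mu={mu:.2f}), "
          f"SD of sample means={stdev(means):.2f} "
          f"(sigma/sqrt(n)={sigma / n ** 0.5:.2f})")
```

Even though the population here is skewed, the mean of the sample means should land close to `mu`, and their SD should track `sigma / sqrt(n)`, shrinking as `n` grows.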

Page 4: Statistical inference: Hypothesis Testing and t-tests

Conditions for CLT

Independence: Sampled observations must be independent:
- Random sample/assignment
- If sampling without replacement, n < 10% of population

Sample Size/Skew:
- Population should be normal
- If not, sample size should be large (rule of thumb: n > 30)

Page 5: Statistical inference: Hypothesis Testing and t-tests

Confidence Interval

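An interval estimate of a population parameter
- Computed as sample mean +/- a margin of error

$\bar{x} \pm z \times SE$, where $SE = s / \sqrt{n}$

- A 95% confidence interval is built so that about 95% of intervals computed this way from repeated samples contain the true mean; it is $\bar{x} \pm 2SE$ or, more precisely, $\bar{x} \pm 1.96 \times s / \sqrt{n}$

CLT: $\bar{x} \sim N(\text{mean} = \mu,\ \text{standard error} = \sigma / \sqrt{n})$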

Page 6: Statistical inference: Hypothesis Testing and t-tests

Confidence Interval

You have taken a random sample of 100 primary school children in Singapore. Their heights had mean = 150cm and sd = 10cm. Estimate the true average height of primary school children based on this sample using a 95% confidence interval.

We are 95% confident that the mean height of primary school children is between 148.04cm and 151.96cm

Confidence Interval: $\bar{x} \pm z \times SE$

$n = 100$, $\bar{x} = 150$, $sd = 10$

$SE = sd / \sqrt{n} = 10 / \sqrt{100} = 1$

$\bar{x} \pm z \times SE = 150 \pm 1.96 \times 1 = 150 \pm 1.96 = (148.04, 151.96)$
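The same interval can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function name `confidence_interval` is hypothetical, and it assumes the normal approximation used on this slide.

```python
from statistics import NormalDist

def confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Normal-based CI for a mean: sample_mean +/- z * SE, with SE = sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    se = sample_sd / n ** 0.5
    margin = z * se
    return sample_mean - margin, sample_mean + margin

low, high = confidence_interval(150, 10, 100)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # 95% CI: (148.04, 151.96)
```

Passing `confidence=0.99` widens the interval, since the critical z value grows with the confidence level.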

Page 7: Statistical inference: Hypothesis Testing and t-tests

Required sample size for margin of error

Given a target margin of error and confidence level, and information on the standard deviation of sample (or population), we can work backwards to determine the required sample size.

Previous measurements of primary school children's heights show sd = 15cm. What should the sample size be in order to get a 95% confidence interval with a margin of error less than or equal to 1cm?

Margin of error: $\leq 1$cm
Confidence level: 95%
$z = 1.96$
$sd = 15$

$ME = z \times SE$

$1 = 1.96 \times 15 / \sqrt{n}$

$n = (1.96 \times 15 / 1)^2$

$n = (29.4)^2 = 864.36$

Thus, we need a sample size of at least 865 primary school children

Page 8: Statistical inference: Hypothesis Testing and t-tests

Hypothesis Testing

Null hypothesis ($H_0$)
- The status quo that is assumed to be true

Alternative hypothesis ($H_a$)
- An alternative claim under consideration; it requires statistical evidence before we accept it and thus reject the null hypothesis

We treat $H_0$ as true unless the evidence in favour of $H_a$ is so strong that we reject $H_0$ in favour of $H_a$.

Page 9: Statistical inference: Hypothesis Testing and t-tests

Hypothesis Testing

Earlier, we found the sample of 100 primary school children had mean height = 150cm and sd = 10cm. Based on this statistic, does the data support the hypothesis that primary school children on average are shorter than 151cm?

$H_0: \mu = 151$ #primary school students have mean height = 151

$H_a: \mu < 151$ #primary school students have mean height < 151

Page 10: Statistical inference: Hypothesis Testing and t-tests

P-value

Probability of obtaining the observed result, or results that are more “extreme”, given that the null hypothesis is true
- P(observed or more extreme outcome | $H_0$ is true)
- If the p-value is low (i.e., lower than the significance level $\alpha$, usually 5%), then we say that it would be very unlikely to observe the data if the null hypothesis were true, and reject $H_0$
- If the p-value is high (i.e., higher than $\alpha$), we say that it is plausible to observe the data even if the null hypothesis were true, and thus do not reject $H_0$

Page 11: Statistical inference: Hypothesis Testing and t-tests

Hypothesis Testing and P-value

Recall that the sample of 100 primary school children had mean height = 150cm and sd = 10cm. Also take significance level = 0.05

$\bar{x} = 150$cm; $sd = 10$cm; $SE = 10 / \sqrt{100} = 1$ #what we know from the sample

$\bar{X} \sim N(\mu = 151, SE = 1)$ #null hypothesis of the population

Test Statistic:

$Z = (150 - 151) / 1 = -1$

P-value: $P(Z < -1) = 1 - 0.8413 = 0.1587$

Since the p-value is higher than 0.05, we do not reject $H_0$
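The test statistic and p-value above can be computed directly instead of being read from a z-table. This is a sketch of the lower-tailed test on this slide; `one_sided_z_test` is a hypothetical helper name.

```python
from statistics import NormalDist

def one_sided_z_test(sample_mean, sample_sd, n, null_mean):
    """Lower-tailed z-test: returns (z, p) where p = P(Z < z) under H0."""
    se = sample_sd / n ** 0.5
    z = (sample_mean - null_mean) / se
    return z, NormalDist().cdf(z)

z, p = one_sided_z_test(150, 10, 100, 151)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p:.4f}, reject H0: {p < alpha}")
# z = -1.00, p-value = 0.1587, reject H0: False
```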

Page 12: Statistical inference: Hypothesis Testing and t-tests

Hypothesis Testing and P-value

[Figure: sampling distribution under $H_0$ with $\mu = 151$ and 150 marked; tail area 0.1587]

Interpreting p-value
- If in fact primary school children have a mean height of 151cm, there is a 15.9% chance that a random sample of 100 children would yield a sample mean of 150cm or lower
- This is a pretty high probability
- Thus, the sample mean of 150 could plausibly have occurred by chance

Page 13: Statistical inference: Hypothesis Testing and t-tests

Two-sided Hypothesis Testing

Does the data support the hypothesis that the children's mean height is different from 151cm?

$H_0: \mu = 151$ #primary school students have mean height = 151

$H_a: \mu \neq 151$ #primary school students have mean height ≠ 151

P-value: $P(Z < -1) + P(Z > 1) = 2 \times (1 - 0.8413) = 0.3174$

[Figure: sampling distribution under $H_0$ with $\mu = 151$, and 150 and 152 marked; tail areas 0.1587 and 0.1587]
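For the two-sided case, both tails are added together. A minimal sketch, with the hypothetical helper `two_sided_p_value`:

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) under the standard normal: both tails added together."""
    return 2 * NormalDist().cdf(-abs(z))

p_two_sided = two_sided_p_value(-1.0)
print(f"{p_two_sided:.4f}")
```

This gives roughly 0.3173, which differs from the slide's 0.3174 only in the last digit because the slide doubles a table value already rounded to four decimals.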

Page 14: Statistical inference: Hypothesis Testing and t-tests

Hypothesis Testing and Confidence Intervals

If the confidence interval contains the null value, don’t reject $H_0$. If the confidence interval does not contain the null value, reject $H_0$.
- Previously, we found the 95% confidence interval for the mean height of primary school children to be approximately (148, 152). Given that our null value ($\mu = 151$cm under $H_0$) falls within this 95% CI, we do not reject $H_0$.

A two-sided hypothesis test with significance level $\alpha$ is equivalent to a confidence interval with $CL = 1 - \alpha$

A one-sided hypothesis test with significance level $\alpha$ is equivalent to a confidence interval with $CL = 1 - 2\alpha$

[Figure: number line marking 148 cm and 152 cm; 95% confident that the average height is between 148 and 152 cm]
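The equivalence between a two-sided test and a confidence interval can be checked numerically. The sketch below reuses the sample from the earlier slides (mean 150, sd 10, n = 100); the list of null values is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

SAMPLE_MEAN, SAMPLE_SD, N, ALPHA = 150, 10, 100, 0.05
SE = SAMPLE_SD / N ** 0.5

# Confidence interval with CL = 1 - ALPHA
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)
ci_low, ci_high = SAMPLE_MEAN - z_crit * SE, SAMPLE_MEAN + z_crit * SE

def decisions(null_value):
    """Return (inside_ci, rejected_by_test) for a two-sided test of mu = null_value."""
    z = (SAMPLE_MEAN - null_value) / SE
    p = 2 * NormalDist().cdf(-abs(z))
    return ci_low <= null_value <= ci_high, p < ALPHA

for null_value in (147, 149, 151, 153):  # hypothetical null values
    inside, rejected = decisions(null_value)
    print(null_value, "inside CI:", inside, "reject H0:", rejected)
```

For every null value, "inside the CI" and "reject $H_0$" should come out as opposites, which is the equivalence the slide describes.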

Page 15: Statistical inference: Hypothesis Testing and t-tests

Decision Errors

Which error is worse to commit (in a research/business context)?
- Type II: Declaring the defendant innocent when they are actually guilty
- Type I: Declaring the defendant guilty when they are actually innocent

“Better that ten guilty persons escape than that one innocent suffer”

- William Blackstone

                  Fail to reject $H_0$    Reject $H_0$
$H_0$ is True     (correct)               Type I error
$H_0$ is False    Type II error           (correct)

Page 16: Statistical inference: Hypothesis Testing and t-tests

Type I Error rate

We reject $H_0$ when the p-value is less than 0.05 ($\alpha = 0.05$)
- I.e., should $H_0$ actually be true, we do not want to incorrectly reject it more than 5% of the time
- Thus, using a 0.05 significance level is equivalent to having a 5% chance of making a Type I error when $H_0$ is true

Choosing significance levels
- If Type I Error is costly, we choose a lower significance level (e.g., 0.01)
  - E.g., spam filtering
- If Type II Error is costly, we choose a higher significance level (e.g., 0.10)
  - E.g., airport baggage screening

                  Fail to reject $H_0$        Reject $H_0$
$H_0$ is True     (correct)                   Type I error ($\alpha$)
$H_0$ is False    Type II error ($\beta$)     (correct)

Page 17: Statistical inference: Hypothesis Testing and t-tests

Student’s t Distribution

According to CLT, the distribution of sample statistics is approximately normal if either:
- Population is normal
- Sample size is large (n > 30)

If so, and the population sd ($\sigma$) is known, we can use it to compute a z-score

However, sample sizes are sometimes small and we often do not know the standard deviation of the population ($\sigma$)
- Thus, the normal distribution may not be appropriate

Thus, we rely on the t distribution

Page 18: Statistical inference: Hypothesis Testing and t-tests

Shape of the t distribution

Bell shaped but thicker tails than the normal
- Thus, observations are more likely to fall beyond 2sd from the mean
- The thicker tails are helpful in adjusting for the less reliable data on the standard deviation (when n is small and/or $\sigma$ is unknown)

Page 19: Statistical inference: Hypothesis Testing and t-tests

Shape of the t distribution

Has one parameter, degrees of freedom (df), which determines the thickness of the tails
- df refers to the number of independent observations in the data set
- Number of independent observations = sample size minus 1
- E.g., in a sample size of 8, there are (8-1) degrees of freedom

What happens to the shape of the t distribution when df increases?
- It approaches the normal distribution
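The thicker tails, and how they thin out as df grows, can be shown numerically. The sketch below uses only the standard library: it integrates the textbook t density with Simpson's rule rather than calling a statistics package, and the helper names `t_pdf` and `t_tail_beyond` are hypothetical.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_tail_beyond(cutoff, df, steps=2000):
    """P(|T| > cutoff): 1 minus the central area, integrated by Simpson's rule."""
    h = 2 * cutoff / steps
    total = t_pdf(-cutoff, df) + t_pdf(cutoff, df)
    for i in range(1, steps):
        x = -cutoff + i * h
        total += (4 if i % 2 else 2) * t_pdf(x, df)
    return 1 - total * h / 3

normal_tail = 2 * NormalDist().cdf(-2)  # P(|Z| > 2) for the standard normal
for df in (2, 7, 30, 100):
    print(f"df={df}: P(|T| > 2) = {t_tail_beyond(2, df):.4f}")
print(f"normal: P(|Z| > 2) = {normal_tail:.4f}")
```

The tail probability shrinks toward the normal value as df increases, matching the statement that the t distribution approaches the normal distribution.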

Page 20: Statistical inference: Hypothesis Testing and t-tests

When to use the t distribution

In general, we use the t distribution when:
- n is small (n < 30) and/or;
- $\sigma$ is unknown

However, nowadays, our sample sizes are usually above 30
- Thus, why bother with the t distribution?
- Because 95% of the world prefers the t distribution to the normal and you’ll definitely encounter it eventually
- If you’re unsure, use the t distribution, since it approaches the normal distribution as the sample size grows

Page 21: Statistical inference: Hypothesis Testing and t-tests

Independent and Dependent t-tests

When to use independent and dependent t-tests?
- Dependent: when evaluating the effect between two related samples
  - You feed a group of 100 people fast food every day
  - Did they gain weight after 30 days?
- Independent: when evaluating the effect between two independent samples
  - You feed 50 males and 50 females fast food every day
  - Did males or females gain more weight after 30 days?

You conduct a study with two groups and have them exercise three times a day for 30 days (group A = crossfit, group B = yoga).
- How would you test the difference between crossfit and yoga participants?
- How would you test the difference in weight between day 0 and day 30 for yoga participants?
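To make the distinction concrete, the sketch below computes both test statistics by hand. All weights are made-up illustrative numbers, the helper names are hypothetical, and the independent version uses Welch's form (which does not assume equal variances); the slides do not say which variant the author intends.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Dependent (paired) t statistic on per-person differences; df = n - 1."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def independent_t(x, y):
    """Welch's t statistic and approximate df for two independent groups."""
    vx, vy = stdev(x) ** 2 / len(x), stdev(y) ** 2 / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical weights in kg (made up for illustration)
yoga_day0 = [70, 80, 65, 90, 75]
yoga_day30 = [72, 83, 66, 94, 75]
crossfit_change = [1, 2, 0, 3, -1]
yoga_change = [a - b for a, b in zip(yoga_day30, yoga_day0)]

print("paired (yoga, day 0 vs day 30):", paired_t(yoga_day0, yoga_day30))
print("independent (yoga vs crossfit change):", independent_t(yoga_change, crossfit_change))
```

The same people measured twice call for the paired version; two separate groups call for the independent one. To turn these statistics into p-values, a t-distribution CDF is needed; if SciPy is available, `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind` (with `equal_var=False` for Welch's test) do both steps.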

Page 22: Statistical inference: Hypothesis Testing and t-tests

Effect Size

When samples become large enough, you often get significant results
- However, is it practically significant?

Effect size is a simple way to quantify the difference between two groups
- Emphasizes the size of the difference (without effect of sample size)
- Cohen’s d is one of the most common ways to measure effect size

Effect size:

Proper calculation for $SD_{pooled}$:

Simple calculation for $SD_{pooled}$:
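The formulas on this slide did not survive in the transcript, so the sketch below falls back on the standard textbook definitions, which is an assumption about what the slide showed: Cohen's d as the difference in means divided by the pooled SD, a "proper" pooled SD weighted by degrees of freedom, and a "simple" pooled SD that averages the two variances. The function names and data are hypothetical.

```python
import math
from statistics import mean, stdev

def pooled_sd(x, y):
    """Pooled SD weighted by degrees of freedom (assumed 'proper' version)."""
    n1, n2 = len(x), len(y)
    s1, s2 = stdev(x), stdev(y)
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

def simple_pooled_sd(x, y):
    """Root of the average of the two variances (assumed 'simple' version)."""
    return math.sqrt((stdev(x) ** 2 + stdev(y) ** 2) / 2)

def cohens_d(x, y, pooled=pooled_sd):
    """Cohen's d: difference in means expressed in pooled-SD units."""
    return (mean(x) - mean(y)) / pooled(x, y)

# Hypothetical weight changes in kg (made up for illustration)
group_a = [2, 3, 1, 4, 0]
group_b = [1, 2, 0, 3, -1]
print(f"d (proper) = {cohens_d(group_a, group_b):.3f}")
print(f"d (simple) = {cohens_d(group_a, group_b, pooled=simple_pooled_sd):.3f}")
```

With equal group sizes the two pooled SDs coincide; they diverge only when the groups differ in size.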

Page 23: Statistical inference: Hypothesis Testing and t-tests

Time for practice

In this lab session we will cover:
- Independent t-tests
- Dependent (paired) t-tests
- Effect size (Cohen’s d)

GitHub repository: https://github.com/eugeneyan/Statistical-Inference

Page 24: Statistical inference: Hypothesis Testing and t-tests

Thank you for your attention!

Eugene Yan