Basic statistical tests in R Anja Bråthen Kristoffersen Biomedical Research Group

Anja Bråthen Kristoffersen Biomedical Research Group · Famous hypotheses test example: The Design of Experiments (1935), Sir Ronald A. Fisher –A tea party in Cambridge in the

Aug 02, 2019


Basic statistical tests in R

Anja Bråthen Kristoffersen

Biomedical Research Group


Outline

• Example: Tea testing

– Hypothesis testing, type I and type II errors

• More tests and how to find and read help files for different tests in R

2014.01.15 2


Famous hypothesis test example:

The Design of Experiments (1935), Sir Ronald A. Fisher

– A tea party in Cambridge in the 1920s

– A lady claims that she can taste whether the milk was poured into the cup before or after the tea

– All professors agree: impossible

– Fisher: this is statistically interesting!

He organized a test


The lady tasting tea

• Test with 8 trials, 2 cups in each trial

– In each trial: guess which cup had the milk poured in first

• Binomial experiment

– Independent trials

– Two possible outcomes, she guesses right cup (success), wrong cup (failure)

– Constant probability of success in each trial

• X = number of correct guesses in 8 trials, each with probability of success p

– X is binomially distributed: X ~ Binomial(8, p)
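
The binomial model on this slide can be explored directly in R. The sketch below tabulates the distribution of X with dbinom(); the value p = 0.5 (pure guessing) is an assumption made only for illustration.

```r
# Distribution of X = number of correct guesses in n = 8 trials,
# assuming (for illustration only) that she is guessing: p = 0.5
n <- 8
p <- 0.5
x <- 0:n
probs <- dbinom(x, size = n, prob = p)

# Table of P(X = x), rounded for display
print(data.frame(x = x, prob = round(probs, 4)))

# Probability of getting 3, 4 or 5 guesses right
p_middle <- sum(probs[x %in% 3:5])
print(p_middle)
```

Under pure guessing most of the probability mass sits around 4 correct guesses, which is why only high counts can count as evidence of a special ability.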


The lady tasting tea, cont.

• The null (conservative) hypothesis H0

– The one we initially believe in

– She has no special ability to taste the difference: p = 0.5

• The alternative hypothesis H1

– The new claim we wish to test

– She has a special ability to taste the difference: p > 0.5


How many right to be convinced?

• We expect maybe 3, 4 or 5 correct guesses if she has no special ability

• Assume 7 correct guesses

– Is there enough evidence to claim that she has a special ability? With 8 correct guesses it would have been even more obvious!

• What if only 6 correct guesses?

– Then it is not so easy to answer YES or NO

• Need a rule that says something about what it takes to be convinced.


How many right to be convinced?

• Rule: We reject H0 if the observed data have a small probability under H0 (given H0 is true).

• Compute the p-value.

– The probability of obtaining the observed value or something more extreme, given that H0 is true

– NB! The p-value is NOT the probability that H0 is true

Small p-value: reject the null hypothesis

Large p-value: keep the null hypothesis


The lady tasting tea, cont.

• Say: she identified 6 cups correctly

• P-value

– The probability of obtaining the observed value or something more extreme, given that H0 is true

P(X ≥ 6 | H0 true) = P(X = 6 | p = 0.5) + P(X = 7 | p = 0.5) + P(X = 8 | p = 0.5)

= dbinom(6, 8, 0.5) + dbinom(7, 8, 0.5) + dbinom(8, 8, 0.5)

= sum(dbinom(6:8, 8, 0.5)) = 0.1445

• Is this enough to be convinced?

• Need a limit.

– We must know about the types of errors we can make.
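
The p-value on this slide can be computed in more than one way. The sketch below shows three equivalent routes in base R for 6 correct guesses out of 8; the variable names are chosen here for illustration.

```r
# One-sided p-value for 6 correct guesses out of 8 under H0: p = 0.5
n <- 8
observed <- 6

# 1) Sum the point probabilities P(X = 6), P(X = 7), P(X = 8)
p_sum <- sum(dbinom(observed:n, size = n, prob = 0.5))

# 2) Upper tail of the cumulative distribution: P(X > 5) = P(X >= 6)
p_tail <- pbinom(observed - 1, size = n, prob = 0.5, lower.tail = FALSE)

# 3) Exact binomial test with the one-sided alternative p > 0.5
p_test <- binom.test(observed, n, p = 0.5, alternative = "greater")$p.value

print(c(p_sum, p_tail, p_test))
```

Note that pbinom() with lower.tail = FALSE returns P(X > q), so the argument must be observed - 1 to include the observed value itself.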


Two types of error

• Type I error is the most serious

– Wrongly reject the null hypothesis

– Example:

• H0: person is not guilty

• H1: person is guilty

• To say a person is guilty when he is not is far more serious than to say he is not guilty when he is.

             H0 true         H1 true
Accept H0    OK              Type II error
Accept H1    Type I error    OK


When to reject

• Decide on the level of significance of the test

– Choose a level of significance α

– This guarantees P(type I error) ≤ α

– Example

• A level of significance of 0.05 gives at most a 5 % probability of rejecting a true H0

• Reject H0 if the p-value is less than α


Important parameters in hypothesis testing

• Null hypothesis

• Alternative hypothesis

• Level of significance

Must be decided upon before we know the results of the experiment


The lady tasting tea, cont.

• Choose 5 % level of significance

• Conduct the experiment

– Say: she identified 6 cups correctly

– Is this evidence enough?

• P-value

– The probability of obtaining the observed value or something more extreme, given that H0 is true

P(X ≥ 6 | H0 true) = sum(dbinom(6:8, 8, 0.5)) = 0.1445


The lady tasting tea, cont.

• We obtained a p-value of 0.1445

• The rejection rule says

– Reject H0 if p-value is less than the level of significance α

– Since 0.1445 > α = 0.05, we do NOT reject H0

Small p-value: reject the null hypothesis

Large p-value: keep the null hypothesis


The lady tasting tea, cont.

• In the tea party in Cambridge:

– The lady got every trial correct!

• Comment:

– Why does it taste different?

• Pouring hot tea into cold milk makes the milk curdle, but not so pouring cold milk into hot tea*

*http://binomial.csuhayward.edu/applets/appletNullHyp.html

Curdle = to separate (Norwegian: å skille seg)


Area of rejection

• Reject H0 if p-value ≤ α

• Reject H0 if observed value (x) ≥ critical value (xc)

• P(type I error) = P(reject H0 | H0 true) = P(X ≥ xc | p = 0.5)

– xc = 7 → sum(dbinom(7:8, 8, 0.5)) = 0.035 ≤ 0.05

– xc = 6 → sum(dbinom(6:8, 8, 0.5)) = 0.145 > 0.05

• Area of rejection: {x: x ≥ xc} → {x: x ≥ 7}
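
The search for the critical value can be automated. The following is a minimal sketch that computes the type I error probability for every candidate xc and picks the smallest one that stays within α; the names type1 and critical are introduced here for illustration.

```r
# Find the smallest critical value xc with P(X >= xc | p = 0.5) <= alpha
n <- 8
alpha <- 0.05
xc <- 0:n

# Type I error probability for each candidate critical value
type1 <- sapply(xc, function(k) sum(dbinom(k:n, size = n, prob = 0.5)))
print(data.frame(xc = xc, type1 = round(type1, 4)))

# Smallest xc that keeps the type I error at or below alpha
critical <- min(xc[type1 <= alpha])
print(critical)
```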


Type II error

P(type I error) ≤ α

P(type II error) = β

• Want both errors as small as possible

– especially type I.

• β is not explicitly given, depends on H1

• There is one β for each possible value of p under H1

             H0 true         H1 true
Accept H0    OK              Type II error
Accept H1    Type I error    OK


Example: type II error

• P(type II error) = P(not reject H0 | H1 true)

– p = 0.7:

P(not reject H0 | p = 0.7) = 1 - P(reject H0 | p = 0.7) =

1 - P(X ≥ 7 | p = 0.7) = 1 - (1 - P(X < 7 | p = 0.7)) =

P(X ≤ 6 | p = 0.7) = sum(dbinom(0:6, 8, 0.7)) = 0.745

p = 0.7: H0 will wrongly be accepted in 74.5 % of the tests
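
The type II error for this example can be checked in R. This sketch assumes the rejection region {x: x ≥ 7} from the previous slide and a true success probability of 0.7; it uses the cumulative distribution function pbinom() as an alternative to summing dbinom().

```r
# Type II error when the true p is 0.7 and H0 is rejected for X >= 7
n <- 8
xc <- 7
p_true <- 0.7

# beta = P(X <= 6 | p = 0.7), via the cumulative distribution function
beta <- pbinom(xc - 1, size = n, prob = p_true)

# Same quantity as a sum of point probabilities; the sum starts at 0
beta_sum <- sum(dbinom(0:(xc - 1), size = n, prob = p_true))

# Power is the complement of beta
power <- 1 - beta
print(c(beta = beta, power = power))
```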


Power of the test

• The probability that a false H0 is rejected

P(reject H0 | H1 true) = 1 - P(accept H0 | H1 true) = 1 - β

• Compared with a test with low power, a test with large power has:

– larger probability of drawing the right conclusion

– larger probability of rejecting a false null hypothesis

• α and β are connected:

– Decreasing α will give an increased β, which in turn will decrease the power of the test
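
The connection between α and β can be made concrete with the tea example. The sketch below compares three candidate critical values, assuming a true p of 0.7 under H1 (an illustrative choice).

```r
# Trade-off between alpha and beta for critical values xc = 6, 7, 8
n <- 8
p_true <- 0.7   # assumed true success probability under H1
xc <- 6:8

# alpha = P(X >= xc | p = 0.5); beta = P(X < xc | p = p_true)
alpha <- pbinom(xc - 1, size = n, prob = 0.5, lower.tail = FALSE)
beta <- pbinom(xc - 1, size = n, prob = p_true)
power <- 1 - beta

print(data.frame(xc = xc,
                 alpha = round(alpha, 4),
                 beta = round(beta, 4),
                 power = round(power, 4)))
```

Raising the critical value lowers α but raises β, so the power drops.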


Example: power function

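# Power of the test (reject H0 if X >= 7, n = 8) as a function of the true p
p <- seq(0.6, 1, 0.01)
antall <- length(p)  # "antall" is Norwegian for "count"
beta8 <- rep(NA, antall)
for (i in 1:antall) {
  # P(type II error) = P(X <= 6 | p); the sum must start at 0, not 1
  beta8[i] <- sum(dbinom(0:6, 8, p[i]))
}
power8 <- 1 - beta8
plot(p, power8, type = "l")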


http://www.stanford.edu/~stephsus/ basic-stats-tests.pdf


help(shapiro.test)


shapiro.test()
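
A minimal sketch of how shapiro.test() might be called. The two samples are simulated here purely for illustration and are not data from the slides.

```r
# Shapiro-Wilk test of normality on two simulated (hypothetical) samples
set.seed(1)
x_normal <- rnorm(50, mean = 10, sd = 2)  # drawn from a normal distribution
x_skewed <- rexp(50, rate = 1)            # drawn from a skewed distribution

# H0: the sample comes from a normal distribution
res_normal <- shapiro.test(x_normal)
res_skewed <- shapiro.test(x_skewed)

print(res_normal)
print(res_skewed$p.value)
```

A small p-value is evidence against normality; a large p-value does not prove that the data are normal.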


chisq.test()


Reject the null hypothesis and conclude that there are differences between the groups
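
As an illustration of this decision, here is a sketch of chisq.test() on a 2 x 2 table. The counts, group labels and the 5 % level are hypothetical and are not taken from the slides.

```r
# Chi-squared test of independence on a hypothetical 2 x 2 table of counts
counts <- matrix(c(30, 10, 15, 25), nrow = 2,
                 dimnames = list(outcome = c("yes", "no"),
                                 group = c("A", "B")))

# H0: outcome is independent of group
# (for 2 x 2 tables R applies Yates' continuity correction by default)
res <- chisq.test(counts)
print(res)

# Decision at an assumed 5 % level of significance
alpha <- 0.05
reject <- res$p.value < alpha
print(reject)
```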


t.test()


t.test()
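
A minimal sketch of two common uses of t.test(): a one-sample test against a fixed mean and a two-sample comparison. The data are simulated and the group names are hypothetical.

```r
# t-tests on simulated (hypothetical) measurements
set.seed(42)
group_a <- rnorm(20, mean = 5, sd = 1)
group_b <- rnorm(20, mean = 6, sd = 1)

# One-sample test, H0: the mean of group_a equals 5
res_one <- t.test(group_a, mu = 5)

# Two-sample test, H0: the two group means are equal
# (by default R uses the Welch version, which does not assume equal variances)
res_two <- t.test(group_a, group_b)

print(res_one$p.value)
print(res_two)
```

Setting var.equal = TRUE gives the classical pooled-variance test instead of the Welch test.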


t.test(, paired = TRUE)
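
A sketch of the paired version, using R's built-in sleep data set (extra hours of sleep for the same 10 subjects under two drugs). The choice of data set is an assumption made here for illustration.

```r
# Paired t-test on the built-in 'sleep' data: same 10 subjects, two drugs
x <- sleep$extra[sleep$group == 1]
y <- sleep$extra[sleep$group == 2]

# H0: the mean of the within-subject differences is zero
res_paired <- t.test(x, y, paired = TRUE)
print(res_paired)

# A paired t-test is the same as a one-sample t-test on the differences
res_diff <- t.test(x - y)
print(res_diff$p.value)
```

Pairing removes the between-subject variation, which is why it should be used whenever the two measurements come from the same units.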


wilcox.test()

• Same mean? For data that are not normally distributed.
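
A minimal sketch of the unpaired Wilcoxon rank-sum test. The two small samples are hypothetical and were chosen without ties so that R computes an exact p-value.

```r
# Wilcoxon rank-sum (Mann-Whitney) test on two hypothetical samples
x <- c(1.1, 2.3, 3.8, 4.2, 5.9)
y <- c(6.4, 7.7, 8.1, 9.5, 10.2)

# H0: the two samples come from the same distribution (no location shift)
res <- wilcox.test(x, y)
print(res)

# W counts the pairs (x_i, y_j) with x_i > y_j
print(unname(res$statistic))
```

The test uses only the ranks of the observations, so it does not require normally distributed data.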


wilcox.test(, paired = TRUE)
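
A sketch of the paired variant (Wilcoxon signed-rank test). The before/after measurements on six subjects are hypothetical.

```r
# Wilcoxon signed-rank test on hypothetical before/after measurements
before <- c(10.1, 12.4, 9.8, 11.5, 13.0, 10.9)
after  <- c(11.3, 14.0, 10.2, 13.4, 15.6, 11.6)

# H0: the within-subject differences are symmetric around zero
res <- wilcox.test(after, before, paired = TRUE)
print(res)

# V is the sum of the ranks of the positive differences
print(unname(res$statistic))
```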


help(cor.test)
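
A minimal sketch of cor.test() on R's built-in cars data set (speed and stopping distance of 50 cars). The choice of data set is an assumption made here for illustration.

```r
# Test of correlation between speed and stopping distance in 'cars'
# H0: the true correlation is zero
res <- cor.test(cars$speed, cars$dist)  # Pearson correlation by default
print(res)

# Estimated correlation and its 95 % confidence interval
print(res$estimate)
print(res$conf.int)
```

The method argument selects "pearson" (the default), "kendall" or "spearman"; the rank-based alternatives are useful when the relationship is monotone but not linear.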


Multiple hypothesis testing

• Each test is designed to have a given expected proportion of incorrectly rejected null hypotheses; most often this level is 5 %.

• When many tests are performed, the probability of falsely rejecting at least one null hypothesis increases; hence we correct the p-values according to the number of tests performed.
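
How quickly this probability grows can be sketched as follows, under the simplifying assumption (not stated on the slide) that the tests are independent and all null hypotheses are true.

```r
# Probability of at least one false rejection among m independent tests,
# each performed at level alpha, when every null hypothesis is true
alpha <- 0.05
m <- c(1, 5, 10, 50, 100)

# P(at least one type I error) = 1 - P(no type I error in any test)
fwer <- 1 - (1 - alpha)^m
print(data.frame(tests = m, prob_any_false_rejection = round(fwer, 4)))
```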


Example: 10 000 genes

• Q: is gene g, g = 1, …, 10 000, differentially expressed?

• Gives 10 000 null hypotheses: H0^1, H0^2, …, H0^10000

– H0^1: gene 1 not differentially expressed

– …

• Assume: no genes differentially expressed

– H0^g true for all g

• Significance level α = 0.01

– The probability of incorrectly concluding that any one gene is differentially expressed is 0.01, i.e. 0.01 * 10000 = 100 expected wrong rejections of H0^g
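
The expected number of wrong rejections can be illustrated with a small simulation. This sketch relies on the fact that p-values from a continuous test statistic are uniformly distributed on [0, 1] when the null hypothesis is true; the seed and variable names are arbitrary choices.

```r
# Simulate 10 000 tests in which every null hypothesis is true
set.seed(2014)
m <- 10000
alpha <- 0.01

# Under H0, p-values are uniform on [0, 1]
p_values <- runif(m)

# Expected and simulated number of wrong rejections
expected_false <- alpha * m
observed_false <- sum(p_values < alpha)
print(c(expected = expected_false, observed = observed_false))
```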


Need to control the risk of false positives (type I errors)

• Corrected p-value:

– The original p-values do not tell the full story.

– Instead of using the original p-values for decision making, we should use corrected ones.


Different correction methods

• Bonferroni (1935)

– Just multiply all the p-values by the number of tests

– Too conservative

• needs a very small p-value to reject H0

• gives very little power

• Methods that control the family-wise error rate (FWER).

• Methods that control the false discovery rate (FDR).


Family-Wise Error Rate (FWER)

• Control type I errors at a level α

– Bonferroni

– Sidak

– Bonferroni-Holm

– Westfall & Young

• Use one of these if you are most afraid of getting stuff on your significant list that should not have been there


False Discovery Rate (FDR)

• Control the expected proportion of type I errors among the rejected hypotheses

• Technique that applies to a set of p-values

– Benjamini & Hochberg

– Different newer variants of Benjamini & Hochberg

• Use one of these if you are most afraid of missing out on interesting stuff


help(p.adjust)
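
A minimal sketch of p.adjust() comparing two FWER methods (Bonferroni and Holm) with the Benjamini & Hochberg FDR method. The five raw p-values are hypothetical.

```r
# Adjust a small set of hypothetical p-values with three methods
p_raw <- c(0.01, 0.02, 0.03, 0.04, 0.05)

p_bonf <- p.adjust(p_raw, method = "bonferroni")  # FWER, most conservative
p_holm <- p.adjust(p_raw, method = "holm")        # FWER, step-down
p_bh   <- p.adjust(p_raw, method = "BH")          # FDR, Benjamini & Hochberg

print(data.frame(raw = p_raw, bonferroni = p_bonf, holm = p_holm, BH = p_bh))

# Adjusted p-values are compared with alpha in the usual way
alpha <- 0.05
print(colSums(cbind(bonferroni = p_bonf, holm = p_holm, BH = p_bh) <= alpha))
```

The adjusted p-values are ordered BH ≤ Holm ≤ Bonferroni, which reflects the difference in power between the methods.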


False discovery rate (fdr)
