
Lecture 14 Non-parametric hypothesis testing

Feb 23, 2016


rufin

Page 1: Lecture 14 Non-parametric hypothesis testing

Lecture 14 Non-parametric hypothesis testing

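Raw data
High: 33, 34, 32, 38, 29
Medium: 51, 43
Low: 6, 28, 75, 47, 60, 49, 64
Pristine: 25, 27, 22, 19, 21, 46, 31, 30, 25, 31, 25, 34, 93, 24, 28, 57

T-TEST (raw data, pairwise p-values)
High vs. Medium: 0.145265
High vs. Low: 0.172254
High vs. Pristine: 0.931288
Medium vs. Low: 1
Medium vs. Pristine: 0.081749
Low vs. Pristine: 0.211812

Ranked data
High: 14, 12, 15, 11, 19
Medium: 6, 10
Low: 30, 20, 2, 8, 4, 7, 3
Pristine: 23, 22, 27, 29, 28, 9, 16, 18, 23, 16, 23, 12, 1, 26, 20, 5

T-TEST (ranked data, pairwise p-values)
High vs. Medium: 0.121074
High vs. Low: 0.414821
High vs. Pristine: 0.09406
Medium vs. Low: 0.5811
Medium vs. Pristine: 0.020992
Low vs. Pristine: 0.103828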

The ranking of data

The ranking of data eliminates outliers and non-linearities. In most cases it reduces within group variances.

All parametric tests can be applied to ranked data!

Spiders on Mazurian lake islands

Disturbance

Page 2: Lecture 14 Non-parametric hypothesis testing

Effects of ranking

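Raw data
High: 33, 34, 32, 38, 29
Medium: 51, 43
Low: 6, 28, 75, 47, 60, 49, 64
Pristine: 25, 27, 22, 19, 21, 46, 31, 30, 25, 31, 25, 34, 93, 24, 28, 57

Ranked data
High: 14, 12, 15, 11, 19
Medium: 6, 10
Low: 30, 20, 2, 8, 4, 7, 3
Pristine: 23, 22, 27, 29, 28, 9, 16, 18, 23, 16, 23, 12, 1, 26, 20, 5

Summary statistics, raw data
            High      Medium    Low       Pristine
Mean        33.2      47        47        33.625
StdDev      3.271085  5.656854  23.4094   18.5324
CVGroup     10.14954  8.308505  2.00774   1.81439
CVMean      5.123992
CVStdDev    1.299735
CVCVGroup   1.297749

Summary statistics, ranked data
            High      Medium    Low       Pristine
Mean        14.2      8         10.57143  18.625
StdDev      3.114482  2.828427  10.48582  8.317652
CVGroup     4.559345  2.828427  1.008164  2.239214
CVMean      2.784361
CVStdDev    1.620218
CVCVGroup   1.800552

The tabulated CV values equal mean / standard deviation (for example 33.2 / 3.271085 = 10.14954 for the raw High group).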

Ranking often reduces the within group variances.
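The comparison above can be reproduced with a short sketch. It assumes the ranking convention that the tabulated ranks appear to follow (rank 1 for the largest value, tied values sharing the smallest rank); the function names are illustrative and not part of the lecture.

```python
from statistics import mean, stdev

# Spider abundances by disturbance level, as tabulated on the slide.
raw = {
    "High": [33, 34, 32, 38, 29],
    "Medium": [51, 43],
    "Low": [6, 28, 75, 47, 60, 49, 64],
    "Pristine": [25, 27, 22, 19, 21, 46, 31, 30,
                 25, 31, 25, 34, 93, 24, 28, 57],
}

pooled = [x for values in raw.values() for x in values]


def descending_min_rank(x, pool):
    """Rank 1 = largest value; ties share the smallest rank (assumed convention)."""
    return 1 + sum(1 for y in pool if y > x)


# Rank every observation within the pooled sample of all four groups.
ranked = {g: [descending_min_rank(x, pooled) for x in v] for g, v in raw.items()}


def summary(groups):
    """Return {group: (mean, sample standard deviation)}."""
    return {g: (mean(v), stdev(v)) for g, v in groups.items()}


raw_summary = summary(raw)
rank_summary = summary(ranked)

for g in raw:
    print(g, "raw SD:", round(raw_summary[g][1], 4),
          "ranked SD:", round(rank_summary[g][1], 4))
```

In every group the standard deviation of the ranks is smaller than that of the raw values, which is the effect the slide describes.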

Page 3: Lecture 14 Non-parametric hypothesis testing

Paired comparisons of the mean; Wilcoxon’s matched pairs rank test

$$z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$

z is approximately normally distributed.

Island             Spring  Summer  Difference  Sign  Absolute value  Rank  Signed rank  Sorted
Górna E            22.1    8.3     13.9        1     13.9            5     5            -11
Koń                15.1    26.0    -10.9       -1    10.9            7     -7           -10
Kopanka            13.4    41.4    -27.9       -1    27.9            1     -1           -9
Królewski Ostrów   13.1    23.1    -10.0       -1    10.0            9     -9           -8
Maleńka            19.7    10.9    8.7         1     8.7             14    14           -7
Mała Wierzba       9.7     32.5    -22.7       -1    22.7            4     -4           -4
Kopanka N          12.3    9.7     2.6         1     2.6             15    15           -3
Ośrodek            9.7     20.5    -10.8       -1    10.8            8     -8           -2
Piaseczna          34.5    43.5    -9.0        -1    9.0             11    -11          -1
Ruciane - ląd      28.4    16.9    11.5        1     11.5            6     6            5
Mikołajki - ląd    22.4    13.4    9.0         1     9.0             12    12           6
Śluza              13.7    38.8    -25.1       -1    25.1            2     -2           12
Górna W            7.3     16.7    -9.3        -1    9.3             10    -10          13
Wierzba            26.0    17.0    8.9         1     8.9             13    13           14
Wygryńska          15.1    39.2    -24.1       -1    24.1            3     -3           15

Spring and Summer are abundances. Sum of the negative signed ranks: -55. Sum of the positive signed ranks: 65.

Past uses a different algorithm for the same test.

The Wilcoxon test is the non-parametric alternative to the paired t-test, that is, to a one-way repeated measures ANOVA with two measurements per unit.
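As a minimal sketch, the normal approximation can be evaluated directly from the signed ranks tabulated on this slide. The ranks are taken exactly as printed (rank 1 for the largest absolute difference); the more common convention gives rank 1 to the smallest absolute difference and would produce a different W. Taking W as the sum of the positive ranks is an assumption of this sketch.

```python
from math import sqrt

# Signed ranks of the spring - summer differences, copied from the slide.
signed_ranks = [5, -7, -1, -9, 14, -4, 15, -8, -11, 6, 12, -2, -10, 13, -3]

n = len(signed_ranks)
w_plus = sum(r for r in signed_ranks if r > 0)    # sum of positive ranks
w_minus = -sum(r for r in signed_ranks if r < 0)  # sum of negative ranks (absolute)

expected_w = n * (n + 1) / 4                   # expectation of W if no difference
se_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)    # standard error of W

# Assumption: W is the positive rank sum; using w_minus only flips the sign of z.
z = (w_plus - expected_w) / se_w

print("W+ =", w_plus, "W- =", w_minus, "z =", round(z, 4))
```

With these ranks z is close to zero, so the approximation gives no indication of a difference between spring and summer abundances.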

Page 4: Lecture 14 Non-parametric hypothesis testing

Sign test

Island             Spring  Summer  Difference  Sign
Górna E            22.1    8.3     13.9        1
Koń                15.1    26.0    -10.9       -1
Kopanka            13.4    41.4    -27.9       -1
Królewski Ostrów   13.1    23.1    -10.0       -1
Maleńka            19.7    10.9    8.7         1
Mała Wierzba       9.7     32.5    -22.7       -1
Kopanka N          12.3    9.7     2.6         1
Ośrodek            9.7     20.5    -10.8       -1
Piaseczna          34.5    43.5    -9.0        -1
Ruciane - ląd      28.4    16.9    11.5        1
Mikołajki - ląd    22.4    13.4    9.0         1
Śluza              13.7    38.8    -25.1       -1
Górna W            7.3     16.7    -9.3        -1
Wierzba            26.0    17.0    8.9         1
Wygryńska          15.1    39.2    -24.1       -1

Spring and Summer are abundances.

Number of positive signs (1): 6
Number of negative signs (-1): 9

Bernoulli (one-sided): 0.303619
2-sided: 0.607239
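The two probabilities can be checked with a short sketch, assuming that the "Bernoulli" value is the cumulative binomial probability of observing at most as many signs as the rarer sign under p = 1/2. The variable names are illustrative.

```python
from math import comb

# Spring - summer differences as printed on the slide.
differences = [13.9, -10.9, -27.9, -10.0, 8.7, -22.7, 2.6, -10.8,
               -9.0, 11.5, 9.0, -25.1, -9.3, 8.9, -24.1]

n = len(differences)
n_pos = sum(1 for d in differences if d > 0)
n_neg = sum(1 for d in differences if d < 0)

# One-sided probability of at most k signs of the rarer kind when
# positive and negative signs are equally likely.
k = min(n_pos, n_neg)
p_one_sided = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * p_one_sided)

print(n_pos, n_neg, round(p_one_sided, 6), round(p_two_sided, 6))
```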

Page 5: Lecture 14 Non-parametric hypothesis testing

The rank test of Mann and Whitney (U-test)

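$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$

Expected mean if no difference:

$$\mu_U = \frac{n_1 n_2}{2}$$

Expected SE if no difference:

$$SE_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

Test statistic:

$$t = \frac{U - \mu_U}{SE_U} = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$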

The U-test is the nonparametric alternative to the t-test.

Spider abundances

Raw data
Low: 6, 28, 75, 47, 60, 49, 64
Pristine: 25, 27, 22, 19, 21, 46, 31, 30, 25, 31, 25, 34, 93, 24, 28, 57

Ranked data
Low: 1, 10, 22, 17, 20, 18, 21
Pristine: 6, 9, 4, 2, 3, 16, 13, 12, 6, 13, 6, 15, 23, 5, 10, 19

Quantity      Value                      Spreadsheet formula
n             7 (Low), 16 (Pristine)
R             109                        =SUM(EB3:EB18)
U             31                         =EB19*EC19+(EB19+1)*EB19/2-EB20
Mean          56                         =EB19*EC19/2
SE            14.96663                   =(EB19*EC19*(EB19+EC19+1)/12)^0.5
Z             -1.67038                   =(EB21-EB22)/EB23
P(Z)          0.047422                   =NORMDIST(EB21;EB22;EB23;TRUE)
Double sided  0.094844
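The spreadsheet calculation can be sketched in a few lines. The sketch assumes the ranking that the tabulated ranks appear to follow (rank 1 for the smallest value, tied values sharing the smallest rank) and uses the normal approximation; function and variable names are illustrative.

```python
from math import erfc, sqrt

low = [6, 28, 75, 47, 60, 49, 64]
pristine = [25, 27, 22, 19, 21, 46, 31, 30, 25, 31, 25, 34, 93, 24, 28, 57]
pooled = low + pristine


def ascending_min_rank(x, pool):
    """Rank 1 = smallest value; ties share the smallest rank (assumed convention)."""
    return 1 + sum(1 for y in pool if y < x)


n1, n2 = len(low), len(pristine)
r1 = sum(ascending_min_rank(x, pooled) for x in low)  # rank sum of the Low group

u = n1 * n2 + n1 * (n1 + 1) // 2 - r1          # U statistic
mean_u = n1 * n2 / 2                           # expected U if no difference
se_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)      # expected SE if no difference
z = (u - mean_u) / se_u

# Standard normal cumulative probability of z, then the two-sided value.
p_lower = 0.5 * erfc(-z / sqrt(2))
p_two_sided = 2 * min(p_lower, 1 - p_lower)

print(r1, u, mean_u, round(se_u, 5), round(z, 5),
      round(p_lower, 6), round(p_two_sided, 6))
```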

Page 6: Lecture 14 Non-parametric hypothesis testing

As with the t-test, the ANOVA on ranked data yields lower p-values. Ranking levels off the within-group heterogeneity (lower within-group variance), so the test is less conservative.

[Figure: ANOVA results for raw data and ranked data]

Page 7: Lecture 14 Non-parametric hypothesis testing

Kruskal-Wallis test or Kruskal-Wallis one way ANOVA by ranks

$$KW = \frac{12}{N(N+1)} \sum_{i=1}^{r} \frac{R_i^2}{n_i} - 3(N+1)$$

KW is approximately $\chi^2$ distributed. Values can be taken from a $\chi^2$ table with r-1 degrees of freedom.

Raw data ANOVA

Raw data
High: 33, 34, 32, 38, 29
Medium: 51, 43
Low: 6, 28, 75, 47, 60, 49, 64
Pristine: 25, 27, 22, 19, 21, 46, 31, 30, 25, 31, 25, 34, 93, 24, 28, 57

Ranked data
High: 14, 12, 15, 11, 19
Medium: 6, 10
Low: 30, 20, 2, 8, 4, 7, 3
Pristine: 23, 22, 27, 29, 28, 9, 16, 18, 23, 16, 23, 12, 1, 26, 20, 5

            High     Medium   Low       Pristine
ni          5        2        7         16
Ri          71       16       74        298
Ri2 / ni    1008.2   128      782.2857  5550.25

KW          3.370783
Chi2(KW;3)  0.337912
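A minimal sketch of the calculation from the group sizes and rank sums above. The closed-form upper tail of the chi-square distribution used here is valid for 3 degrees of freedom only; it is included to keep the sketch free of external libraries and is not taken from the lecture.

```python
from math import erfc, exp, pi, sqrt

n_i = [5, 2, 7, 16]      # group sizes: High, Medium, Low, Pristine
r_i = [71, 16, 74, 298]  # rank sums of the four groups

N = sum(n_i)
kw = 12 / (N * (N + 1)) * sum(r * r / n for r, n in zip(r_i, n_i)) - 3 * (N + 1)


def chi2_upper_tail_3df(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


df = len(n_i) - 1  # r - 1 = 3 for four groups
p_value = chi2_upper_tail_3df(kw)

print(round(kw, 6), df, round(p_value, 6))
```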

Page 8: Lecture 14 Non-parametric hypothesis testing

The ANOVA gave the more conservative result

Page 9: Lecture 14 Non-parametric hypothesis testing

Random skewers

Diversity of ground beetles along an elevational gradient

Altitude  Number of species  Ranked altitude  Ranked number of species
100       86                 1                2
250       98                 2                1
350       75                 3                4
400       80                 4                3
450       50                 5                7
500       61                 6                5
630       55                 7                6
700       45                 8                9
950       49                 9                8
1030      38                 10               11
1100      39                 11               10
1200      36                 12               12
1300      33                 13               13
1500      27                 14               14

r = 0.969231

Random samples

Four example random samples, each a subset of the rank pairs above (columns in each panel: ranked altitude, ranked number of species). Rank correlations of the four samples:

0.98327  0.95721  0.97105  0.98243

We take 1000 random samples and calculate each time Spearman’s rank order correlation.
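The resampling step might be sketched as follows. The sample size and the random seed are hypothetical (the slide does not state them), and species numbers are ranked in descending order because the tabulated ranks follow that convention. Each subsample keeps its original ranks, as the example panels appear to do.

```python
import random
from math import sqrt

altitude = [100, 250, 350, 400, 450, 500, 630, 700, 950,
            1030, 1100, 1200, 1300, 1500]
species = [86, 98, 75, 80, 50, 61, 55, 45, 49, 38, 39, 36, 33, 27]


def rank(values, descending=False):
    """Ranks starting at 1; assumes no tied values (true for these data)."""
    order = sorted(values, reverse=descending)
    return [order.index(v) + 1 for v in values]


def pearson(x, y):
    """Product-moment correlation; applied to ranks it is Spearman's r."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rank_alt = rank(altitude)
rank_spp = rank(species, descending=True)  # descending, as on the slide
r_full = pearson(rank_alt, rank_spp)

SAMPLE_SIZE = 8          # hypothetical: not given in the lecture
rng = random.Random(1)   # hypothetical seed, for reproducibility only
pairs = list(zip(rank_alt, rank_spp))

correlations = []
for _ in range(1000):
    xs, ys = zip(*rng.sample(pairs, SAMPLE_SIZE))
    correlations.append(pearson(xs, ys))

n_positive = sum(1 for r in correlations if r > 0)
print(round(r_full, 6), n_positive)
```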

Page 10: Lecture 14 Non-parametric hypothesis testing

If there is no trend in species richness we expect a Bernoulli distribution of positive and negative correlations.

Of 1000 rank correlations 623 were positive. The associated probability is

$$p(623, 1000) = \binom{1000}{623} \left(\frac{1}{2}\right)^{1000} = 1.37 \times 10^{-15}$$

Altitude  Number of species
100       86
250       98
350       75
400       80
450       50
500       61
630       55
700       45
950       49
1030      38
1100      39
1200      36
1300      33
1500      27

It’s highly probable that there is an altitudinal trend in species richness.
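The binomial probability of exactly 623 positive correlations out of 1000 can be evaluated with exact integer arithmetic. The tail probability of 623 or more positives is added here as a common alternative; it is not part of the slide.

```python
from math import comb

n, k = 1000, 623

# Probability of exactly k positive correlations when positive and
# negative correlations are equally likely (p = 1/2).
p_exact = comb(n, k) / 2 ** n

# Not on the slide: probability of k or more positive correlations.
p_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

print(p_exact, p_tail)
```

Both values are of the order of 10^-15, so the conclusion does not depend on which of the two is used.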

Page 11: Lecture 14 Non-parametric hypothesis testing
Page 12: Lecture 14 Non-parametric hypothesis testing

What kind of test to be used?

Errors are normally distributed around the mean

Errors are not normally distributed around the mean

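Comparing two means: t-test

$$t = \frac{\text{effect size}}{\text{overall standard error}}$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2 + \sigma_2^2}} \sqrt{N}$$

Comparing two variances: F-test

$$F = \frac{\sigma_1^2}{\sigma_2^2}$$

Comparing two distributions: Chi2-test, Kolmogorov-Smirnov test

$$\chi^2 = \sum_{i=1}^{k} \frac{(Obs_i - Exp_i)^2}{Exp_i}$$

$$KS = \max(Obs_{cum} - Exp_{cum})$$

Comparing expectation and observation: Chi2-test, G-test

$$\chi^2 = \sum_{i=1}^{k} \frac{(Obs_i - Exp_i)^2}{Exp_i}$$

$$G = 2 \sum_{i=1}^{k} O_i \ln \frac{O_i}{E_i}$$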
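As an illustration of the Chi2 and G statistics, the sketch below evaluates both for a small set of counts. The observed and expected counts are hypothetical and chosen only to show the arithmetic; they do not come from the lecture.

```python
from math import log

# Hypothetical counts for three categories (illustration only).
observed = [18, 22, 20]
expected = [20, 20, 20]

# Chi2 statistic: sum of squared deviations scaled by the expectation.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# G statistic: twice the sum of observed counts times the log ratio.
g = 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)

print(round(chi2, 4), round(g, 4))
```

For counts this close to their expectations the two statistics are nearly identical; both would be compared with a chi-square distribution.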

Page 13: Lecture 14 Non-parametric hypothesis testing

Errors are not normally distributed around the mean

Comparing two means

Comparing expectation and observation

Analyzing dependencies between two variables

Sign test, Monte Carlo simulation

U-test, Wilcoxon test

Sign test

Rank correlation

Comparing two means

Studying structure

Monte Carlo simulation

Page 14: Lecture 14 Non-parametric hypothesis testing

Homework and literature

Refresh:

• U-test
• Wilcoxon matched pairs test
• Sign test
• Kruskal Wallis test
• Raw and ranked data
• Tied ranks

Literature:

Łomnicki: Statystyka dla biologów
http://statsoft.com/textbook/