BINF702 SPRING 2014 - CHAPTER 9 NONPARAMETRIC METHODS 1 BINF702 SPRING 2014 Chapter 9 - Nonparametric Methods
BINF702 SPRING 2014 - George Mason University

Nov 20, 2021

dariahiddleston
Transcript
Page 1: BINF702 SPRING 2014 - George Mason University

BINF702 SPRING 2014 - CHAPTER 9 NONPARAMETRIC METHODS 1

BINF702 SPRING 2014

Chapter 9 - Nonparametric Methods

Page 2: BINF702 SPRING 2014 - George Mason University

Why Nonparametric Methods?

Our previous estimation and hypothesis-testing methods assumed that the form of the underlying distribution of the data was known.

These are parametric statistical methods.

Nonparametric methods make fewer assumptions about the underlying distribution from which the data were drawn.

Page 3: BINF702 SPRING 2014 - George Mason University

Section 9.1 Introduction (Data Types)

Def. 9.1 – Cardinal data are data that are on a scale where it is meaningful to measure the distance between possible data values.

Ex.

Height, weight, cholesterol level

Def. 9.2 – For cardinal data, if the zero point is arbitrary, then the data are on an interval scale; if the zero point is fixed, then the data are on a ratio scale.

Ex.

Temperature is on an interval scale unless it is measured in kelvins, which have a fixed zero point.

Height is on a ratio scale.

Page 4: BINF702 SPRING 2014 - George Mason University

Ordinal and Nominal Data

Def. 9.3 – Ordinal data are data that can be ordered but do not have specific numeric values. Thus, common arithmetic cannot be performed on ordinal data in a meaningful way.

Ex.

Cancer classification (normal, mild, moderate, severe)

Def. 9.4 – Data are on a nominal scale if different data values can be classified into categories but the categories have no specific ordering.

Disease names

Page 5: BINF702 SPRING 2014 - George Mason University

Summary of the Types of Data

Page 6: BINF702 SPRING 2014 - George Mason University

Section 9.2 The Sign Test

Example 9.7 – Dermatology

Suppose we wish to compare the effectiveness of two ointments (A, B) in reducing excessive redness in people who cannot otherwise be exposed to sunlight. Ointment A is randomly applied to the left arm or right arm, and ointment B is applied to the corresponding area on the other arm. The person is then exposed to 1 hour of sunlight, and the two arms are compared for degree of redness. Suppose only the following qualitative assessments can be made:

The A arm is not as red as the B arm.

The B arm is not as red as the A arm.

The arms are equally red.

Of 45 people tested with the condition, 22 are better off on the A arm, 18 are better off on the B arm, and 5 are equally well off on both arms. How can we decide if this evidence is sufficient to conclude that ointment A is better than ointment B?

Page 7: BINF702 SPRING 2014 - George Mason University

Section 9.2.1 Normal-Theory Method

Let xi = degree of redness on the A arm and yi = degree of redness on the B arm for the i-th person, and let di = xi – yi. Consider the hypothesis H0: D = 0 versus H1: D ≠ 0, where D = the population median of the di (the 50th percentile of the underlying distribution of the di). We can't observe the di themselves, but we can observe C = the number of people for whom di > 0.

Under H0, P(di > 0) = ½ among the nonzero di. We will form a binomial-based test statistic using the normal approximation to the binomial. Hence we require npq ≥ 5, i.e., n(1/2)(1/2) ≥ 5, so n ≥ 20.

Page 8: BINF702 SPRING 2014 - George Mason University

Section 9.2.1 The Sign Test (Normal Theory Methods)

Eq. 9.1 – The Sign Test. To test the hypothesis H0: D = 0 versus H1: D ≠ 0, where the number of nonzero di's = n ≥ 20 and C = the number of di's where di > 0, if

$$C > c_2 = \frac{n}{2} + \frac{1}{2} + z_{1-\alpha/2}\sqrt{n/4} \quad\text{or}\quad C < c_1 = \frac{n}{2} - \frac{1}{2} - z_{1-\alpha/2}\sqrt{n/4}$$

then H0 is rejected. Otherwise H0 is accepted.
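As a rough illustration of Eq. 9.1, the critical values can be computed in R for the ointment data of Example 9.7 (n = 40 nonzero differences, C = 22). The level alpha = 0.05 and the variable names are assumptions made for this sketch.

```r
# Sketch of the Eq. 9.1 rejection rule; alpha = 0.05 is an assumed level.
n     <- 40      # nonzero differences in Example 9.7 (22 + 18)
C     <- 22      # people better off on the A arm
alpha <- 0.05

z  <- qnorm(1 - alpha / 2)
c1 <- n / 2 - 1 / 2 - z * sqrt(n / 4)   # lower critical value
c2 <- n / 2 + 1 / 2 + z * sqrt(n / 4)   # upper critical value

reject <- (C > c2) || (C < c1)          # TRUE would mean "reject H0"
print(c(c1 = c1, c2 = c2))
print(reject)
```

Since C lies between c1 and c2, H0 is not rejected at this level.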

Page 9: BINF702 SPRING 2014 - George Mason University

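Section 9.2.1 The Sign Test (Normal Theory Methods)

Eq. 9.2 Computation of the p-Value for the Sign Test (Normal-Theory Method)

$$p = \begin{cases} 2\left[1 - \Phi\left(\dfrac{C - n/2 - 0.5}{\sqrt{n/4}}\right)\right] & \text{if } C > n/2 \\[2ex] 2\,\Phi\left(\dfrac{C - n/2 + 0.5}{\sqrt{n/4}}\right) & \text{if } C < n/2 \\[2ex] 1.0 & \text{if } C = n/2 \end{cases}$$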

Page 10: BINF702 SPRING 2014 - George Mason University

Section 9.2.1 The Sign Test (Normal Theory Methods)

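Alternate p-value computation

$$p = 2\left[1 - \Phi\left(\frac{|C - D| - 1}{\sqrt{n}}\right)\right] \text{ if } C \ne D, \qquad p = 1.0 \text{ if } C = D$$

where C = number of di > 0 and D = number of di < 0 (here D is a count, not the median in H0).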
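The alternate form agrees with Eq. 9.2 because C + D = n, so C – n/2 = (C – D)/2. A minimal sketch of the check in R, using the Example 9.7 counts; the function name sign.alt is hypothetical.

```r
# Hypothetical helper for the alternate p-value formula.
# C = number of di > 0, D = number of di < 0, n = C + D.
sign.alt <- function(C, D) {
  n <- C + D
  if (C == D) return(1.0)
  2 * (1 - pnorm((abs(C - D) - 1) / sqrt(n)))
}

p_alt <- sign.alt(22, 18)

# Same quantity written in the Eq. 9.2 form (C > n/2 branch).
p_eq92 <- 2 * (1 - pnorm((22 - 40 / 2 - 0.5) / sqrt(40 / 4)))

print(c(alternate = p_alt, eq9.2 = p_eq92))
```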

Page 11: BINF702 SPRING 2014 - George Mason University

Example 9.8 (Eq. 9.2 in R)

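sign.approx <- function(C, D, n) {
  # Two-sided sign-test p-value, normal approximation (Eq. 9.2).
  # C = number of di > 0, D = number of di < 0 (kept for the call
  # signature, not used), n = number of nonzero di.
  denom <- sqrt(n / 4)
  if (C > n / 2) {
    num <- C - n / 2 - 0.5
    p <- 2 * (1 - pnorm(num / denom))
  } else if (C < n / 2) {
    num <- C - n / 2 + 0.5
    p <- 2 * pnorm(num / denom)
  } else {
    # C == n/2: no evidence either way
    p <- 1.0
  }
  return(p)
}

> source("F:/fall2004/binf702/sign.approx.R")

> sign.approx(18, 22, 40)

[1] 0.6352563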

Page 12: BINF702 SPRING 2014 - George Mason University

Section 9.2.2 – Exact Method

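Eq. 9.3 Computation of the p-Value for the Sign Test (Exact Test, n < 20)

$$\text{If } C > n/2,\quad p = 2 \sum_{k=C}^{n} \binom{n}{k} \left(\frac{1}{2}\right)^{n}$$

$$\text{If } C < n/2,\quad p = 2 \sum_{k=0}^{C} \binom{n}{k} \left(\frac{1}{2}\right)^{n}$$

$$\text{If } C = n/2,\quad p = 1.0$$

This equation is a special case of Equation 7.44.

Where did this n come from?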
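A minimal sketch of Eq. 9.3 in R. Each term is a binomial probability with p = 1/2, and (1/2)^k (1/2)^(n-k) = (1/2)^n, so dbinom(k, n, 0.5) gives the summand directly. The function name sign.exact is hypothetical.

```r
# Hypothetical helper for the exact two-sided sign test (Eq. 9.3).
# C = number of di > 0, n = number of nonzero di.
sign.exact <- function(C, n) {
  if (C > n / 2) {
    p <- 2 * sum(dbinom(C:n, n, 0.5))   # upper tail, doubled
  } else if (C < n / 2) {
    p <- 2 * sum(dbinom(0:C, n, 0.5))   # lower tail, doubled
  } else {
    p <- 1.0
  }
  p
}

print(sign.exact(8, 10))
```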

Page 13: BINF702 SPRING 2014 - George Mason University

Section 9.2.2 – Exact Method

Suppose we want to compare two different types of eye drops (A, B) that are intended to prevent redness in people with hay fever. Drug A is randomly given to one eye and drug B to the other eye. The redness is noted at baseline and after 10 minutes by an observer who is not aware of which drug is administered to which eye. We find that for 15 people with an equal amount of redness in each eye at baseline, after 10 minutes the drug A eye is less red than the drug B eye for 8 people; the drug B eye is less red than the drug A eye for 2 people; and the eyes are equally red for 5 people. Assess the statistical significance of the results. Solve this using binom.test in R.
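One possible way to set this up with binom.test, as a sketch: the 5 people with equally red eyes are dropped, leaving n = 10 nonzero differences, 8 of them in one direction. The variable names are assumptions.

```r
# Exact sign test for the eye-drop example via binom.test.
n_a_better <- 8     # drug A eye less red
n_b_better <- 2     # drug B eye less red
n <- n_a_better + n_b_better   # ties (5 people) are excluded

res <- binom.test(n_a_better, n, p = 0.5, alternative = "two.sided")
p_value <- res$p.value
print(p_value)
```

With p = 0.5 the two-sided binomial p-value is twice the tail probability, which is the quantity in Eq. 9.3.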

Page 14: BINF702 SPRING 2014 - George Mason University

Another Sign Test Example - I

The data are a subset of data reported by Ijzermans (1970) from an investigation into the susceptibility to corrosion of a stainless steel alloy. The data we will work with are the percentages of chromium in 12 samples of the alloy. We are interested in testing the hypothesis that the median percentage of chromium content is 18% against the alternative that it is not.

data <- read.table("table3_9", header = TRUE)

data

Sample X.CR

1 1 17.4

2 2 17.9

3 3 17.6

4 4 18.1

5 5 17.6

6 6 18.9

7 7 16.9

8 8 17.5

9 9 17.8

10 10 17.4

11 11 24.6

12 12 26.0

cro <- data[,2]

mu <- 18 # hypothesized value

b <- sum(cro > mu) # test statistic

b

[1] 4

Page 15: BINF702 SPRING 2014 - George Mason University

Another Sign Test Example - II

Can you solve this in R?
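One way to finish the chromium example is an exact sign test through binom.test. This is a sketch: the data vector is typed in from the table on the previous slide rather than read from the file, and m0 is just a name for the hypothesized median.

```r
# Exact sign test of H0: median chromium content = 18%.
cro <- c(17.4, 17.9, 17.6, 18.1, 17.6, 18.9,
         16.9, 17.5, 17.8, 17.4, 24.6, 26.0)
m0 <- 18                      # hypothesized median

b <- sum(cro > m0)            # observations above the hypothesized median
n <- sum(cro != m0)           # observations not equal to it

res <- binom.test(b, n, p = 0.5, alternative = "two.sided")
p_value <- res$p.value
print(c(b = b, n = n, p = p_value))
```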

Page 16: BINF702 SPRING 2014 - George Mason University

Another Sign Example

Suppose an ophthalmologist reviews fundus photographs of 30 patients with macular degeneration both before and 3 months after receiving a laser treatment. To assess the efficacy of treatment, each patient is rated as improved, remained the same, or declined. If 20 patients improved, 7 declined, and 3 remained the same, then assess whether or not patients undergoing this treatment are showing significant change from baseline to 3 months afterward. Report a p-value.
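A sketch of one way to answer this in R. There are n = 27 nonzero changes, so the normal-theory method (Eq. 9.2) applies; the exact binomial p-value is computed alongside it for comparison. The variable names are assumptions.

```r
# Sign test for the laser-treatment example.
C <- 20            # improved
D <- 7             # declined
n <- C + D         # the 3 unchanged patients are excluded

# Normal-theory p-value (Eq. 9.2, C > n/2 branch).
z <- (C - n / 2 - 0.5) / sqrt(n / 4)
p_normal <- 2 * (1 - pnorm(z))

# Exact p-value for comparison.
p_exact <- binom.test(C, n, p = 0.5)$p.value

print(c(normal = p_normal, exact = p_exact))
```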

Page 17: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test)

Ex. 9.10 Consider the data in Ex. 9.7 from a different perspective. We assumed that the only possible assessment was that the degree of sunburn with ointment A was either better or worse than that with ointment B. Suppose instead that the degree of burn can be quantified on a 10-point scale, with 10 being the worst burn and 1 being no burn at all. We can now compute di = xi – yi, where xi = degree of burn for ointment A and yi = degree of burn for ointment B. If di is positive, then ointment B is doing better than ointment A; if di is negative, then ointment A is doing better than ointment B. For example, if di = +5, then the degree of redness is 5 units greater on the ointment A arm than on the ointment B arm, whereas if di = -3, then the degree of redness is 3 units less on the ointment A arm than on the ointment B arm. How can this additional information be used to test if the ointments are equally effective?

Page 18: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test)

We wish to test H0 : D = 0 vs. H1 : D != 0 where D = median score difference between the ointment A and ointment B arms. If D < 0, then ointment A is better; if D > 0, then ointment B is better. We will assume that the di have an underlying continuous distribution.

We can’t use a paired t-test because the data are ordinal.

Page 19: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test)

Eq. 9.4 Ranking Procedure for the Wilcoxon Signed-Rank Test

1. Arrange the differences di in order of absolute value.

2. Count the number of differences with the same absolute value.

3. Ignore the observations where di = 0, and rank the remaining observations from 1, for the observations with the lowest absolute value, up to n, for the observations with the highest absolute value.

4. If there is a group of several observations with the same absolute value, then find the lowest rank in the range = 1 + R and the highest rank in the range = G + R, where R = the highest rank used prior to considering this group and G = the number of differences in the range of ranks for the group. Assign the average rank = (lowest rank in the range + highest rank in the range)/2 as the rank for each difference in the group.

Page 20: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test)

Ex. 9.11

From Table 9.1 we note that 14 people have absolute value 1. This group's ranks range from 1 to 14, with an average rank of (1 + 14) / 2 = 7.5.

The group of 10 people with absolute value 2 has ranks ranging from (1 + 14) to (10 + 14), i.e., 15 to 24, with an average rank of (15 + 24) / 2 = 19.5.
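In R, the average-rank rule of Eq. 9.4 is what rank() does by default (ties.method = "average"). A small sketch restricted to the two groups discussed in Ex. 9.11; the vector name abs_d is an assumption.

```r
# Average ranks for the first two tied groups of Ex. 9.11:
# 14 differences with |d| = 1 and 10 differences with |d| = 2.
abs_d <- c(rep(1, 14), rep(2, 10))

r <- rank(abs_d)          # default ties.method = "average"
avg_by_group <- tapply(r, abs_d, unique)
print(avg_by_group)
```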

Page 21: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test)

Eq. 9.5 Wilcoxon Signed-Rank Test (Normal Approximation Method for Two-Sided Level α Test)

1. Rank the differences as shown in Eq. 9.4.

2. Compute the rank sum R1 of the positive differences.

3. Compute

$$T = \frac{\left|R_1 - \dfrac{n(n+1)}{4}\right| - \dfrac{1}{2}}{\sqrt{n(n+1)(2n+1)/24}}$$

if there are no ties (i.e., no groups of differences with the same absolute value).

Page 22: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test) Eq. 9.5 (cont.)

3. Compute

$$T = \frac{\left|R_1 - \dfrac{n(n+1)}{4}\right| - \dfrac{1}{2}}{\sqrt{n(n+1)(2n+1)/24 - \sum_{i=1}^{g}\left(t_i^3 - t_i\right)/48}}$$

if there are ties, where t_i refers to the number of differences with the same absolute value in the i-th tied group and g is the number of tied groups.

Page 23: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test) Eq. 9.5 (cont.)

4. If T > z_{1-α/2}, then reject H0. Otherwise, accept H0.

5. The p-value for the test is given by p = 2 × [1 – Φ(T)].

6. This test should be used only if the number of nonzero differences is >= 16 and if the difference scores have an underlying continuous symmetric distribution.

Page 24: BINF702 SPRING 2014 - George Mason University

Section 9.3 The Wilcoxon Signed-Rank Test (A Nonparametric Analog of the Paired t-Test) Eq. 9.5 (cont.)

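Ex. 9.12

$$R_1 = 10(7.5) + 6(19.5) + 2(28.0) = 248$$

$$E(R_1) = 40(41)/4 = 410$$

$$Var(R_1) = 40(41)(81)/24 - \left[(14^3 - 14) + (10^3 - 10) + (7^3 - 7) + (1^3 - 1) + (2^3 - 2) + (2^3 - 2) + (3^3 - 3) + (1^3 - 1)\right]/48 = 5449.75$$

or, equivalently,

$$Var(R_1) = \left[14(7.5)^2 + 10(19.5)^2 + \cdots + 40^2\right]/4 = 5449.75$$

$$T = \left(|248 - 410| - 1/2\right)/73.82 = 2.19$$

$$p = 2\left[1 - \Phi(2.19)\right] = .029$$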
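The arithmetic of Ex. 9.12 can be retraced in R as a sketch. The tie-group sizes and the counts of positive differences are read off the expressions above; the variable names (t_i, n_pos, T_stat) are assumptions, and T_stat is used instead of T because T is shorthand for TRUE in R.

```r
# Ex. 9.12 by hand: signed-rank statistic with the tie correction (Eq. 9.5).
t_i   <- c(14, 10, 7, 1, 2, 2, 3, 1)   # tie-group sizes, ordered by |d|
n_pos <- c(10, 6, 2, 0, 0, 0, 0, 0)    # positive differences in each group

n  <- sum(t_i)                          # 40 nonzero differences
hi <- cumsum(t_i)                       # highest rank in each group
lo <- hi - t_i + 1                      # lowest rank in each group
avg_rank <- (lo + hi) / 2               # 7.5, 19.5, 28, ...

R1     <- sum(n_pos * avg_rank)         # rank sum of positive differences
ER1    <- n * (n + 1) / 4
VarR1  <- n * (n + 1) * (2 * n + 1) / 24 - sum(t_i^3 - t_i) / 48
T_stat <- (abs(R1 - ER1) - 0.5) / sqrt(VarR1)
p      <- 2 * (1 - pnorm(T_stat))

print(c(R1 = R1, ER1 = ER1, VarR1 = VarR1, T = T_stat, p = p))
```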

Page 25: BINF702 SPRING 2014 - George Mason University

The Wilcoxon Signed Rank Test in R - I

wilcox.test package:stats R Documentation

Wilcoxon Rank Sum and Signed Rank Tests

Description:

Performs one and two sample Wilcoxon tests on vectors of data; the

latter is also known as 'Mann-Whitney' test.

## Default S3 method:

wilcox.test(x, y = NULL,

alternative = c("two.sided", "less", "greater"),

mu = 0, paired = FALSE, exact = NULL, correct = TRUE,

conf.int = FALSE, conf.level = 0.95, ...)

Page 26: BINF702 SPRING 2014 - George Mason University

The Wilcoxon Signed Rank Test in R - II

Arguments:

x: numeric vector of data values.

y: an optional numeric vector of data values.

alternative: a character string specifying the alternative hypothesis,

must be one of '"two.sided"' (default), '"greater"' or

'"less"'. You can specify just the initial letter.

mu: a number specifying an optional location parameter.

paired: a logical indicating whether you want a paired test.

exact: a logical indicating whether an exact p-value should be

computed.

correct: a logical indicating whether to apply continuity correction

in the normal approximation for the p-value.

Page 27: BINF702 SPRING 2014 - George Mason University

The Wilcoxon Signed Rank Test in R - III

conf.int: a logical indicating whether a confidence interval should be

computed.

conf.level: confidence level of the interval.

formula: a formula of the form 'lhs ~ rhs' where 'lhs' is a numeric

variable giving the data values and 'rhs' a factor with two

levels giving the corresponding groups.

data: an optional data frame containing the variables in the model

formula.

subset: an optional vector specifying a subset of observations to be

used.

na.action: a function which indicates what should happen when the data

contain 'NA's. Defaults to 'getOption("na.action")'.

...: further arguments to be passed to or from methods.

Page 28: BINF702 SPRING 2014 - George Mason University

The Wilcoxon Signed Rank Test in R - IV

Details:

The formula interface is only applicable for the 2-sample tests.

If only 'x' is given, or if both 'x' and 'y' are given and

'paired' is 'TRUE', a Wilcoxon signed rank test of the null that

the distribution of 'x' (in the one sample case) or of 'x-y' (in

the paired two sample case) is symmetric about 'mu' is performed.

Otherwise, if both 'x' and 'y' are given and 'paired' is 'FALSE',

a Wilcoxon rank sum test (equivalent to the Mann-Whitney test) is

carried out. In this case, the null hypothesis is that the

location of the distributions of 'x' and 'y' differ by 'mu'.

By default (if 'exact' is not specified), an exact p-value is

computed if the samples contain less than 50 finite values and

there are no ties. Otherwise, a normal approximation is used.

Page 29: BINF702 SPRING 2014 - George Mason University

In the Presence of Ties

The p-values from wilcox.test are not really correct in the presence of ties, so when the data contain ties one should install the exactRankTests package and use it instead.

library(exactRankTests)

One then uses

wilcox.exact

Page 30: BINF702 SPRING 2014 - George Mason University

Example 9.12 in R
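A sketch of how Ex. 9.12 might be run with wilcox.test. The difference vector is rebuilt from the group counts in Ex. 9.12; the magnitudes 4 to 8 for the five largest groups are placeholders (only their ordering affects the ranks), and all of those differences are taken as negative because they do not contribute to R1.

```r
# Ex. 9.12 with wilcox.test (normal approximation, continuity correction).
# Magnitudes 4..8 are placeholder values; only their order matters.
d <- c(rep( 1, 10), rep( 2, 6), rep( 3, 2),   # positive differences
       rep(-1,  4), rep(-2, 4), rep(-3, 5),   # negative differences
       rep(-4,  1), rep(-5, 2), rep(-6, 2), rep(-7, 3), rep(-8, 1))

# exact = FALSE requests the normal approximation, which is what
# Eq. 9.5 describes; the data contain ties in any case.
res <- wilcox.test(d, mu = 0, exact = FALSE, correct = TRUE)

V <- unname(res$statistic)   # rank sum of the positive differences
p_value <- res$p.value
print(c(V = V, p = p_value))
```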

Page 31: BINF702 SPRING 2014 - George Mason University

Example 9.12 in R With the Tie Robust Test

Page 32: BINF702 SPRING 2014 - George Mason University

Another Example of the Wilcoxon Signed-Ranks Test

A panel of ten interviewers was asked to rate the two final candidates on a scale of 1 to 20 in terms of their suitability for a vacant post. Is one candidate rated significantly higher than the other by the interviewers?

Interviewer Candidate 1 Candidate 2

1 14 10

2 17 7

3 12 14

4 16 6

5 14 14

6 10 4

7 17 10

8 12 4

9 6 11

10 18 6

Page 33: BINF702 SPRING 2014 - George Mason University

Another Example of the Wilcoxon Signed-Ranks Test
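A sketch of the interview example with wilcox.test. Interviewer 5 gave both candidates the same score, so that pair is dropped, and the remaining differences contain a tie; exact = FALSE therefore asks for the normal approximation explicitly.

```r
# Paired signed-rank test for the interview ratings.
x <- c(14, 17, 12, 16, 14, 10, 17, 12,  6, 18)   # Candidate 1
y <- c(10,  7, 14,  6, 14,  4, 10,  4, 11,  6)   # Candidate 2

res <- wilcox.test(x, y, paired = TRUE, exact = FALSE, correct = TRUE)

V <- unname(res$statistic)   # rank sum of the positive differences
p_value <- res$p.value
print(c(V = V, p = p_value))
```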

Page 34: BINF702 SPRING 2014 - George Mason University

Example Repeated With the Ties Robust Test
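With only 9 nonzero differences, an exact p-value that respects the ties can also be obtained in base R by enumerating all 2^9 sign assignments. This is a brute-force sketch for illustration, not the algorithm used inside exactRankTests or coin; it agrees with the p-value of 0.02734 in the coin output for this example.

```r
# Exact signed-rank p-value by full enumeration (feasible for small n).
x <- c(14, 17, 12, 16, 14, 10, 17, 12,  6, 18)
y <- c(10,  7, 14,  6, 14,  4, 10,  4, 11,  6)

d <- x - y
d <- d[d != 0]                 # drop zero differences
r <- rank(abs(d))              # average ranks handle the tie
n <- length(d)

obs <- sum(r[d > 0])           # observed rank sum of positive differences

# Every way of labelling the n ranks as positive (1) or negative (0).
signs  <- as.matrix(expand.grid(rep(list(c(0, 1)), n)))
Rplus  <- as.vector(signs %*% r)
center <- sum(r) / 2           # expected rank sum under H0

# Two-sided: assignments at least as far from the center as observed.
p_exact <- mean(abs(Rplus - center) >= abs(obs - center))
print(c(R_plus = obs, p = p_exact))
```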

Page 35: BINF702 SPRING 2014 - George Mason University

Example Repeated With the coin library

> library(coin)

> x = c(14,17,12,16,14,10,17,12,6,18)

> y = c(10, 7, 14, 6, 14, 4, 10, 4, 11, 6)

> wilcoxsign_test(x ~ y, alternative = "two.sided", distribution = exact())

Exact Wilcoxon-Signed-Rank Test

data: y by x (neg, pos)

stratified by block

Z = 2.1936, p-value = 0.02734

alternative hypothesis: true mu is not equal to 0

Page 36: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

Example 9.15 Ophthalmology Different genetic types of the disease retinitis pigmentosa (RP) are thought to have different rates of progression, with the dominant form of the disease progressing the slowest, the recessive form of the disease the next slowest, and the sex-linked form of the disease the quickest. This hypothesis can be tested by comparing the visual acuity of people ages 10-19 who have different genetic types of RP. Suppose there are 25 people with dominant disease and 30 people with sex-linked disease. The best-corrected visual acuities (i.e., with appropriate glasses) in the better eyes of these people are presented in Table 9.2. How can the data be used to test if the median visual acuity is different in the two groups?

Page 37: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

What do we wish to test in example 9.15?

H0: medianD = medianSL versus H1:medianD != medianSL where medianD and medianSL are the median visual acuities in the dominant and sex-linked groups respectively.

The two-sample t test for independent samples would ordinarily be appropriate, except that visual acuity cannot be given a specific numerical value that ophthalmologists would agree on.

Page 38: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

Eq. 9.6 Ranking Procedure for the Wilcoxon Rank-Sum Test

1. Combine the data from the two groups and order the values from the lowest to the highest or, in the case of visual acuity, from the best visual acuity (20-20) to the worst visual acuity (20-80).

2. Assign ranks to the individual values, with the best visual acuity (20-20) having the lowest rank and the worst visual acuity (20-80) having the highest rank, or vice versa.

3. If a group of observations has the same value, then compute the range of ranks for the group, as was done for the signed-rank test in eq. 9.4 and assign the average rank for each observation in the group.

Page 39: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

Eq. 9.7 Wilcoxon Rank-Sum Test (Normal Approximation Method for Two-Sided Level α Test)

1. Rank the observations as shown in Eq. 9.6.

2. Compute the rank sum R1 in the first sample (the choice of sample is arbitrary).

3. Compute (assuming no ties)

$$T = \frac{\left|R_1 - \dfrac{n_1(n_1 + n_2 + 1)}{2}\right| - \dfrac{1}{2}}{\sqrt{\left(\dfrac{n_1 n_2}{12}\right)(n_1 + n_2 + 1)}}$$

Page 40: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

Eq. 9.7 Wilcoxon Rank-Sum Test (Normal Approximation Method for Two-Sided Level α Test)

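3. Compute (if there are ties)

$$T = \frac{\left|R_1 - \dfrac{n_1(n_1 + n_2 + 1)}{2}\right| - \dfrac{1}{2}}{\sqrt{\left(\dfrac{n_1 n_2}{12}\right)\left[n_1 + n_2 + 1 - \dfrac{\sum_{i=1}^{g} t_i\left(t_i^2 - 1\right)}{(n_1 + n_2)(n_1 + n_2 - 1)}\right]}}$$

unless R1 = n1(n1 + n2 + 1)/2, in which case T = 0,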

Page 41: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

where t_i refers to the number of observations with the same value in the i-th tied group, and g is the number of tied groups.

4. If T > z_{1-α/2}, then reject H0; otherwise, accept H0.

5. Compute the exact p-value by p = 2 × [1 – Φ(T)].

6. This test should be used only if both n1 and n2 are at least 10, and if there is an underlying continuous distribution.

Page 42: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

Consider Ex. 9.17
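A sketch of the Eq. 9.7 computation for the visual acuity data, using the rank vectors that appear in the R sessions for Ex. 9.17 in this chapter (x = dominant, y = sex-linked; the entries are already average ranks). The names t_i and T_stat are assumptions.

```r
# Rank-sum statistic with the tie correction (Eq. 9.7) for Ex. 9.17.
x <- c(rep(3.5, 5), rep(13.5, 9), rep(25.5, 6), rep(34.0, 3), rep(42.5, 2))
y <- c(rep(3.5, 1), rep(13.5, 5), rep(25.5, 4), rep(34.0, 4), rep(42.5, 8),
       rep(50, 5), rep(53.5, 2), rep(55, 1))

n1 <- length(x); n2 <- length(y); N <- n1 + n2
R1  <- sum(x)                              # x already holds ranks
t_i <- as.vector(table(c(x, y)))           # sizes of the tied groups

ER1   <- n1 * (N + 1) / 2
VarR1 <- (n1 * n2 / 12) * ((N + 1) - sum(t_i * (t_i^2 - 1)) / (N * (N - 1)))
T_stat <- if (R1 == ER1) 0 else (abs(R1 - ER1) - 0.5) / sqrt(VarR1)
p <- 2 * (1 - pnorm(T_stat))

# Cross-check against the built-in normal approximation.
p_builtin <- wilcox.test(x, y, exact = FALSE, correct = TRUE)$p.value

print(c(R1 = R1, T = T_stat, p = p, p_builtin = p_builtin))
```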

Page 43: BINF702 SPRING 2014 - George Mason University

Section 9.4 The Wilcoxon Rank-Sum Test (A Nonparametric Analog to the t test for Two Independent Samples)

Page 44: BINF702 SPRING 2014 - George Mason University

Ex 9.17 Repeated With the ties Robust Test

> x = c(rep(3.5,5),rep(13.5,9),rep(25.5,6),rep(34.0,3),rep(42.5,2))

> y = c(rep(3.5,1),rep(13.5,5),rep(25.5,4),rep(34.0,4),rep(42.5,8), rep(50,5),rep(53.5,2), rep(55,1))

> wilcox.exact(x,y,paired=FALSE)

Exact Wilcoxon rank sum test

data: x and y

W = 154, p-value = 8.496e-05

alternative hypothesis: true mu is not equal to 0

Page 45: BINF702 SPRING 2014 - George Mason University

Ex 9.17 Repeated With the coin library

> xxx = c(rep(3.5,5),rep(3.5,1),rep(13.5,9),rep(13.5,5),rep(25.5,6),rep(25.5,4),rep(34.0,3),rep(34.0,4),rep(42.5,2), rep(42.5,8),rep(50,5),rep(53.5,2), rep(55,1))

> gene = factor(c(rep("dominant",5),rep("sex-linked",1),rep("dominant",9),rep("sex-linked",5),rep("dominant",6),rep("sex-linked",4),rep("dominant",3),rep("sex-linked",4),rep("dominant",2),rep("sex-linked",8), rep("sex-linked",5),rep("sex-linked",2), rep("sex-linked",1)))

> wilcox_test(xxx ~ gene, alternative = "two.sided", distribution = exact())

Exact Wilcoxon Mann-Whitney Rank Sum Test

data: xxx by gene (dominant, sex-linked)

Z = -3.7975, p-value = 8.496e-05

alternative hypothesis: true mu is not equal to 0

Page 46: BINF702 SPRING 2014 - George Mason University

Challenge Problems

Page 47: BINF702 SPRING 2014 - George Mason University

Problem # 1

Suppose we wish to compare the recovery times of patients after two different versions of some operation, say removing the gallbladder. Operation A is performed through a vertical incision; Operation B through an oblique incision. The operations are performed alternately (A, B, A, B, etc.) on a consecutive series of patients suffering from gallbladder disease, and the recovery times (say, number of days in the hospital after the operation, including the day of operation and the day of discharge from the hospital) are then collected as follows.

Days to recover from operation (DTRFO)

Patient #   DTRFO   Patient #   DTRFO
1           16      2           18
3           20      4           19
5           25      6           15
7           19      8           16
9           22      10          21
11          15      12          17
13          22      14          17
15          19      16          14

Page 48: BINF702 SPRING 2014 - George Mason University

Problem # 1 Solution

Can you solve this in R?
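One possible approach, as a sketch: the two operations are given to different patients, so the samples are independent and a Wilcoxon rank-sum test applies. The code assumes the table is read in patient order 1 to 16 with operation A on the odd-numbered patients. Both groups have only 8 patients, below the guideline of at least 10 per group for the normal approximation, and the data contain ties, so the approximate p-value should be read with caution.

```r
# Problem 1: rank-sum test comparing recovery times after operations A and B.
days <- c(16, 18, 20, 19, 25, 15, 19, 16,
          22, 21, 15, 17, 22, 17, 19, 14)    # patients 1..16
op   <- rep(c("A", "B"), times = 8)          # assumed alternation A, B, A, B, ...

A <- days[op == "A"]
B <- days[op == "B"]

# Normal approximation with continuity correction (ties are present).
res <- wilcox.test(A, B, exact = FALSE, correct = TRUE)

W <- unname(res$statistic)   # R's W = rank sum of A minus n1(n1 + 1)/2
p_value <- res$p.value
print(c(W = W, p = p_value))
```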

Page 49: BINF702 SPRING 2014 - George Mason University

Problem # 2

A man who was sick in bed decided to count the number of advertisements delivered per day by 2 competing radio stations, WHO and WHY. Each morning he tossed a penny to determine which station he would listen to that day; WHO won the toss 5 times and WHY won 3 times. Eight days of this made his illness fatal, but he left us the following results to analyze:

WHO (Sample A) = 341, 326, 360, 305, 326

WHY (Sample B) = 352, 382, 347

WHY’s average is obviously higher than WHO’s, but is the difference statistically significant?

Page 50: BINF702 SPRING 2014 - George Mason University

Problem 2 Solution

Can you solve this in R?
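One possible approach, as a sketch: with samples of 5 and 3 the normal approximation is not appropriate, but the exact rank-sum distribution can be enumerated directly, since there are only choose(8, 3) = 56 ways to assign three of the eight ranks to WHY. This brute-force enumeration is an illustration in base R, not the algorithm used by wilcox.exact.

```r
# Problem 2: exact rank-sum test by enumerating all rank assignments.
who <- c(341, 326, 360, 305, 326)    # Sample A
why <- c(352, 382, 347)              # Sample B

r  <- rank(c(who, why))              # average ranks (326 appears twice)
n2 <- length(why)
obs <- sum(r[(length(who) + 1):length(r)])   # observed rank sum for WHY

# Rank sums for every possible choice of n2 ranks out of all 8.
all_sums <- combn(r, n2, FUN = sum)
center   <- n2 * (length(r) + 1) / 2          # expected rank sum under H0

# Two-sided: assignments at least as far from the center as observed.
p_exact <- mean(abs(all_sums - center) >= abs(obs - center))
print(c(rank_sum_WHY = obs, p = p_exact))
```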

Page 51: BINF702 SPRING 2014 - George Mason University

Problem # 3

Suppose you are investigating a new sleeping pill called Nockout and want to compare it with a standard sedative called phenobarbitone. The effect of a sedative can vary quite a bit from one person to another, so it is best to try both drugs on every person taking part in the experiment. So you collect 10 people suffering from chronic insomnia and on one night give half of them (selected at random) Nockout pills and the other half phenobarbitone, and observe the number of hours that each person sleeps. You are able to assess the sleep length only to the nearest ¼ hour. A few nights later, when you can be sure that the effect of the first sedative has worn off completely, each person is given the second pill, which is whichever pill he didn’t get the first time, and the hours of sleep are noted again. The results of such a trial, expressed in hours of sleep, are given below.

Page 52: BINF702 SPRING 2014 - George Mason University

Problem # 3 Data

Patient      J.B.   R.A.   S.T.   S.L.   P.Q.   E.V.   J.T.   L.O.   E.M.   B.O.
With Pheno   7.5    7      7      5.75   4.25   9.25   8      7.25   8.5    7.75
With Nock    8      6      6.75   5      4.5    8      7.5    6.25   8      7.75

Page 53: BINF702 SPRING 2014 - George Mason University

Problem # 3 Solution

Can you solve this in R?
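One possible approach, as a sketch: each person receives both drugs, so the data are paired and the Wilcoxon signed-rank test applies. Patient B.O. has a zero difference and is dropped, leaving 9 nonzero differences with several ties; this is below the guideline of 16 nonzero differences for the normal approximation, so the approximate p-value is only indicative.

```r
# Problem 3: paired signed-rank test, phenobarbitone vs. Nockout.
pheno <- c(7.5, 7, 7,    5.75, 4.25, 9.25, 8,   7.25, 8.5, 7.75)
nock  <- c(8,   6, 6.75, 5,    4.5,  8,    7.5, 6.25, 8,   7.75)

# Normal approximation with continuity correction (zeros and ties present).
res <- wilcox.test(pheno, nock, paired = TRUE, exact = FALSE, correct = TRUE)

V <- unname(res$statistic)   # rank sum of the positive (pheno - nock) differences
p_value <- res$p.value
print(c(V = V, p = p_value))
```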

Page 54: BINF702 SPRING 2014 - George Mason University

Chapter 9 Homework

9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 9.10, 9.11, 9.12