
Sep 09, 2018


McNemar’s Test, Correlation, Regression

Arthur Berg, Pennsylvania State University


McNemar’s Test Correlation Linear Regression

Tonsillectomy & Hodgkin’s Lymphoma

Tonsillectomy rates in the US in children under the age of 15:

Year    # per 10,000
1965    166
1986    12
2000    <2

Hodgkin’s Lymphoma linked with Tonsillectomy?

S. Johnson and R. Johnson, “Tonsillectomy history in Hodgkin’s disease”, NEJM (1972)

Hodgkin’s data

A study involved 85 patients with Hodgkin’s disease, each of whom had a normal sibling. Unpaired data:

             Tonsillectomy   No Tonsillectomy
Hodgkin’s         41                44
Control           33                52


Tonsillectomy & Hodgkin’s Lymphoma

> mat <- matrix(c(41, 33, 44, 52), 2, 2)
> prop.table(mat, 1)
          [,1]      [,2]
[1,] 0.4823529 0.5176471
[2,] 0.3882353 0.6117647
> chisq.test(mat)

        Pearson's Chi-squared test with Yates' continuity correction

data:  mat
X-squared = 1.1726, df = 1, p-value = 0.2789
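The statistic above can be reproduced by hand. The following is a minimal sketch, assuming the usual Yates-corrected formula for a 2x2 table; the names `n`, `ad_bc`, `stat`, and `p` are illustrative and do not appear in the slides.

```r
# Unpaired 2x2 table: rows = Hodgkin's / control,
# columns = tonsillectomy / no tonsillectomy
mat <- matrix(c(41, 33, 44, 52), 2, 2)

n <- sum(mat)

# Yates-corrected statistic for a 2x2 table:
# n * (|ad - bc| - n/2)^2 / (product of the four marginal totals)
ad_bc <- mat[1, 1] * mat[2, 2] - mat[1, 2] * mat[2, 1]
stat <- n * (abs(ad_bc) - n / 2)^2 / (prod(rowSums(mat)) * prod(colSums(mat)))

# Upper-tail probability of a chi-squared distribution with 1 degree of freedom
p <- pchisq(stat, df = 1, lower.tail = FALSE)

round(c(statistic = stat, p.value = p), 4)
```

Rounded to four decimals, this should agree with the `X-squared` and `p-value` printed by `chisq.test(mat)`.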


Tonsillectomy & Hodgkin’s Lymphoma

Hodgkin’s data

Paired data:

                                      Sibling
                             No Tonsillectomy   Tonsillectomy
Patient   No Tonsillectomy          37                 7
          Tonsillectomy             15                26

> mat <- matrix(c(37, 15, 7, 26), 2, 2)
> mcnemar.test(mat)

        McNemar's Chi-squared test with continuity correction

data:  mat
McNemar's chi-squared = 2.2273, df = 1, p-value = 0.1356
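McNemar's statistic depends only on the two off-diagonal (discordant) cells of the paired table; the concordant counts 37 and 26 do not enter it. A minimal sketch of the continuity-corrected calculation follows; the names `n12`, `n21`, `stat`, and `p` are illustrative.

```r
# Paired 2x2 table from the slide
mat <- matrix(c(37, 15, 7, 26), 2, 2)

# Discordant pairs: patient and sibling differ in tonsillectomy status
n12 <- mat[1, 2]  # 7
n21 <- mat[2, 1]  # 15

# Continuity-corrected McNemar statistic: (|n12 - n21| - 1)^2 / (n12 + n21)
stat <- (abs(n12 - n21) - 1)^2 / (n12 + n21)

# Compare against a chi-squared distribution with 1 degree of freedom
p <- pchisq(stat, df = 1, lower.tail = FALSE)

round(c(statistic = stat, p.value = p), 4)
```

Here the statistic is 49 / 22, which matches the value reported by `mcnemar.test(mat)`.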


Insulin Sensitivity (Y) vs. C20-22 Fatty Acids (X)


> y <- c(250, 220, 145, 115, 230, 200, 330, 400, 370, 260, 270, 530, 375)
> x <- c(17.9, 18.3, 18.3, 18.4, 18.4, 20.2, 20.3, 21.8, 21.9, 22.1, 23.1, 24.2, 24.4)
> plot(x, y, pch = 16, cex = 2)

[Figure: scatter plot of y against x]


> cor(x, y)
[1] 0.7700025
> cor.test(x, y)

        Pearson's product-moment correlation

data:  x and y
t = 4.0026, df = 11, p-value = 0.002077
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3804100 0.9274906
sample estimates:
      cor
0.7700025


CI and p-value

- The CI is not symmetric about r.
- CI interpretation: Assuming the data were randomly sampled from a larger population, there is a 95% chance that this range includes the population correlation coefficient.
- p-value interpretation: If the null hypothesis were true, what is the chance that 13 randomly picked subjects would have an r greater than 0.77 or less than -0.77?
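The asymmetry comes from how the interval is built: `cor.test` bases its confidence interval on Fisher's z-transform, so the interval is symmetric on the transformed scale and becomes lopsided when mapped back to the r scale. The sketch below reproduces the interval, the t statistic, and the p-value by hand; the names `z`, `se`, `ci`, `t_stat`, and `p` are illustrative.

```r
# Data from the slides: x = C20-22 fatty acids, y = insulin sensitivity
x <- c(17.9, 18.3, 18.3, 18.4, 18.4, 20.2, 20.3,
       21.8, 21.9, 22.1, 23.1, 24.2, 24.4)
y <- c(250, 220, 145, 115, 230, 200, 330, 400,
       370, 260, 270, 530, 375)

n <- length(x)
r <- cor(x, y)

# Fisher z-transform: atanh(r) is approximately normal with SE 1 / sqrt(n - 3)
z  <- atanh(r)
se <- 1 / sqrt(n - 3)

# Symmetric interval on the z scale, mapped back to the r scale with tanh
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

# t statistic and two-sided p-value for H0: population correlation = 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

round(c(lower = ci[1], upper = ci[2], t = t_stat, p = p), 4)
```

The lower limit lies about 0.39 below r = 0.77, while the upper limit lies only about 0.16 above it.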


Pearson’s correlation coefficient is not robust

> y2 <- y
> y2[12] <- 30
> plot(x, y2, pch = 16, cex = 2)

[Figure: scatter plot of y2 against x]


> cor.test(x, y2)

        Pearson's product-moment correlation

data:  x and y2
t = 0.8235, df = 11, p-value = 0.4277
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.3574666  0.6991377
sample estimates:
      cor
0.2409823
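One way to see how strongly a single observation can drive r is to compare three versions of the same data: the original, the version with observation 12 changed to 30, and a version with observation 12 left out entirely. The third comparison is an added illustration, not part of the slides, and the names `r_original`, `r_altered`, and `r_dropped` are illustrative.

```r
# Data from the slides
x <- c(17.9, 18.3, 18.3, 18.4, 18.4, 20.2, 20.3,
       21.8, 21.9, 22.1, 23.1, 24.2, 24.4)
y <- c(250, 220, 145, 115, 230, 200, 330, 400,
       370, 260, 270, 530, 375)

# Same modification as in the slides: replace observation 12 (530) with 30
y2 <- y
y2[12] <- 30

r_original <- cor(x, y)            # all 13 original points
r_altered  <- cor(x, y2)           # one response value changed
r_dropped  <- cor(x[-12], y[-12])  # observation 12 removed (n = 12)

round(c(original = r_original, altered = r_altered, dropped = r_dropped), 3)
```

Changing one of the 13 responses moves r from about 0.77 to about 0.24, which is the lack of robustness the slide title refers to.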


Correlation vs Causation

Possible explanations for the observed correlation:

- The lipid content of the membranes determines insulin sensitivity.
- The insulin sensitivity of the membranes somehow affects lipid content.
- Both insulin sensitivity and lipid content are under the control of some other factor, perhaps a hormone.
- Lipid content, insulin sensitivity, and other factors are all part of a complex molecular/biochemical/physiological network, perhaps with positive and/or negative feedback components. In this case, the observed correlation is just a peek at a much more complicated set of relationships.
- The two variables don’t correlate in the population at all, and the observed correlation in this sample was a coincidence.


Assumptions

- Random and independent data: $(X_i, Y_i)$
- X and Y are paired
- Both X and Y are stochastic, not experimentally controlled
- Normally distributed
- No outliers
- Linear relationship


$R^2$

$R^2$ is the fraction of the variance shared between the two variables.

$0.77^2 = 0.59$

- 59% of the variability in insulin sensitivity is associated with variability in lipid content.
- 59% of the variability in lipid content is associated with variability in insulin sensitivity.
- Knowing the lipid content of the membranes lets you explain 59% of the variance in the insulin sensitivity.
- The remaining 41% of the variance is explained by other factors.
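The symmetry in the first two statements can be checked numerically: squaring the correlation gives the same value as the R-squared of a simple linear regression, whichever variable is treated as the response. A minimal sketch, with illustrative names `r2_from_cor`, `r2_y_on_x`, and `r2_x_on_y`:

```r
# Data from the slides: x = C20-22 fatty acids, y = insulin sensitivity
x <- c(17.9, 18.3, 18.3, 18.4, 18.4, 20.2, 20.3,
       21.8, 21.9, 22.1, 23.1, 24.2, 24.4)
y <- c(250, 220, 145, 115, 230, 200, 330, 400,
       370, 260, 270, 530, 375)

# Squared Pearson correlation
r2_from_cor <- cor(x, y)^2

# R-squared from regressing y on x, and from regressing x on y
r2_y_on_x <- summary(lm(y ~ x))$r.squared
r2_x_on_y <- summary(lm(x ~ y))$r.squared

round(c(from_cor = r2_from_cor, y_on_x = r2_y_on_x, x_on_y = r2_x_on_y), 4)
```

All three values are equal (about 0.5929), so $R^2$ by itself says nothing about which variable, if either, drives the other.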


> summary(lm(y ~ x))

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-102.96  -65.05   25.64   61.23  116.11

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -486.542    193.716  -2.512  0.02890 *
x             37.208      9.296   4.003  0.00208 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 75.9 on 11 degrees of freedom
Multiple R-squared:  0.5929,    Adjusted R-squared:  0.5559
F-statistic: 16.02 on 1 and 11 DF,  p-value: 0.002077
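The two coefficient estimates can be recovered from summary statistics alone. The sketch below uses the standard least-squares formulas for simple linear regression; the names `slope`, `intercept`, and `slope_alt` are illustrative.

```r
# Data from the slides: x = C20-22 fatty acids, y = insulin sensitivity
x <- c(17.9, 18.3, 18.3, 18.4, 18.4, 20.2, 20.3,
       21.8, 21.9, 22.1, 23.1, 24.2, 24.4)
y <- c(250, 220, 145, 115, 230, 200, 330, 400,
       370, 260, 270, 530, 375)

# Least-squares slope: sample covariance divided by the variance of x
slope <- cov(x, y) / var(x)

# The fitted line passes through the point (mean(x), mean(y))
intercept <- mean(y) - slope * mean(x)

# Equivalent form of the slope, expressed through the correlation coefficient
slope_alt <- cor(x, y) * sd(y) / sd(x)

round(c(intercept = intercept, slope = slope, slope_alt = slope_alt), 3)
```

The second form of the slope also shows why the t test for the slope and the t test for the correlation give the same p-value (0.002077) in the outputs above: the slope is zero exactly when r is zero.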


> fit <- lm(y ~ x)
> plot(x, y, pch = 16, cex = 2)
> abline(fit, lwd = 3)

[Figure: scatter plot of y against x with the fitted regression line]


> plot(fit, 1)

[Figure: Residuals vs Fitted plot (x-axis: Fitted values, y-axis: Residuals); observations 12, 11, and 4 are labeled]
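The observations labeled 12, 11, and 4 in the residual plot are the ones with the largest residuals in absolute value. A minimal sketch for finding them directly from the fit, with illustrative names `res` and `top3`:

```r
# Data from the slides: x = C20-22 fatty acids, y = insulin sensitivity
x <- c(17.9, 18.3, 18.3, 18.4, 18.4, 20.2, 20.3,
       21.8, 21.9, 22.1, 23.1, 24.2, 24.4)
y <- c(250, 220, 145, 115, 230, 200, 330, 400,
       370, 260, 270, 530, 375)

fit <- lm(y ~ x)

# Residual = observed y minus fitted y
res <- residuals(fit)

# Indices of the three observations with the largest absolute residuals
top3 <- order(abs(res), decreasing = TRUE)[1:3]

# Show x, y, fitted value, and residual for those observations
data.frame(obs = top3, x = x[top3], y = y[top3],
           fitted = round(fitted(fit)[top3], 1),
           residual = round(res[top3], 2))
```

Listing these rows next to the plot makes it easier to judge whether a large residual reflects an unusual subject or a problem with the linear model.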


Things to Look Out For

- Look at the residuals.
- Make sure you have a well-defined response variable.
- Consider the use of weighted regression.
- Be mindful of spurious regression.
- Be cautious of extrapolating beyond your data.
- Distinguish statistical significance from scientific or practical significance.
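The warning about extrapolation can be illustrated with the insulin sensitivity fit from earlier. The sketch below predicts at three x values chosen purely for illustration (0, 20, and 40), two of which lie outside the observed range; `new_x`, `pred`, and `outside` are illustrative names.

```r
# Data from the slides: x = C20-22 fatty acids, y = insulin sensitivity
x <- c(17.9, 18.3, 18.3, 18.4, 18.4, 20.2, 20.3,
       21.8, 21.9, 22.1, 23.1, 24.2, 24.4)
y <- c(250, 220, 145, 115, 230, 200, 330, 400,
       370, 260, 270, 530, 375)

fit <- lm(y ~ x)

# Observed range of the predictor: 17.9 to 24.4
range(x)

# Hypothetical x values; 0 and 40 are far outside the observed range
new_x <- data.frame(x = c(0, 20, 40))
pred <- predict(fit, newdata = new_x)

# Flag predictions that require extrapolation
outside <- new_x$x < min(x) | new_x$x > max(x)

data.frame(x = new_x$x, predicted = round(pred, 1), extrapolated = outside)
```

At x = 0 the prediction equals the intercept, roughly -487, a negative value that would be hard to interpret as an insulin sensitivity. The line describes the data only over the range in which x was actually observed.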
