Introduction to Biostatistics (BIO/EPI 540): Contingency Tables. Acknowledgement: Thanks to Professor Pagano (Harvard School of Public Health) for lecture material.

Mar 31, 2015

Page 1

Introduction to Biostatistics

(BIO/EPI 540)

Contingency Tables

Acknowledgement: Thanks to Professor Pagano (Harvard School of Public Health) for lecture material

Page 2

Contingency Tables

• Nominal data that are grouped into categories are often presented in the form of contingency tables

• Rows denote levels of one variable (e.g. disease)

• Columns denote the levels of the other variable (e.g. exposure)

Page 3

Example – Discrete Outcomes

Consider whether the rate of caesarean deliveries differs between subjects who received electronic fetal monitoring (EFM) and those who did not.

Sample of 5,824 deliveries: of these, 2,850 were EFM exposed and 2,974 were not. 358 of the 2,850 had c-sections, as did 229 of the 2,974.

The number of c-sections in each group is binomial, with n very large.
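As a quick check on the counts above, the observed c-section rate in each group can be computed directly. This is a minimal sketch; the variable names are our own and do not come from the lecture.

```python
# Counts taken from the slide: c-sections and group sizes.
csections_efm, n_efm = 358, 2850        # EFM-exposed deliveries
csections_no_efm, n_no_efm = 229, 2974  # deliveries without EFM

# Sample proportion of c-sections in each group.
rate_efm = csections_efm / n_efm
rate_no_efm = csections_no_efm / n_no_efm

print(f"EFM exposed: {rate_efm:.3f}")
print(f"No EFM:      {rate_no_efm:.3f}")
```

The two sample rates are roughly 12.6% and 7.7%; the question for the rest of the lecture is whether a gap of this size could plausibly arise by chance.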

Page 4

Example – Discrete Outcomes

Do the c-section rates differ?

Chi square test

Proceed as usual:

1. If there is no difference (null hypothesis), what do we expect to see?

2. How does this compare to what we have observed? (statistic & its distribution)

Page 5

Data-Contingency table

Caesarean       EFM Exposure
Delivery        Yes        No         Total
Yes             358        229          587
No            2,492      2,745        5,237
Total         2,850      2,974        5,824

If the c-section rate is the same in both populations, then ignore column classification and go with totals.

Page 6

2x2 Table – Null Hypothesis

• Ho: The proportion of C-sections among patients receiving EFM is identical to the proportion of C-sections among patients who do not receive EFM

• Ha: The proportion of C-sections among patients receiving EFM is different from the proportion of C-sections among patients who do not receive EFM

Page 7

Probability of c-section

From the totals we can estimate:
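The slide's formula did not survive in the transcript. Under the assumption that the estimate is the pooled proportion taken from the row and grand totals of the table, it can be written as below; the symbol \hat{p} is our own notation.

```latex
\hat{p} \;=\; \frac{358 + 229}{2{,}850 + 2{,}974} \;=\; \frac{587}{5{,}824} \;\approx\; 0.101
```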

Page 8

Expected counts under Ho

What do we expect to see if EFM has no effect?

EFM exposed (2,850 mothers): about 287 c-sections expected

No EFM (2,974 mothers): about 300 c-sections expected
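The expected counts under Ho can be sketched from the marginal totals alone: each cell is its row total times its column total, divided by the grand total. The dictionary layout and names below are illustrative assumptions, not part of the lecture.

```python
# Marginal totals from the 2x2 table in the slides.
row_totals = {"c-section": 587, "no c-section": 5237}
col_totals = {"EFM": 2850, "no EFM": 2974}
grand_total = 5824

# Under Ho, expected count = row total * column total / grand total.
expected = {
    (row, col): row_totals[row] * col_totals[col] / grand_total
    for row in row_totals
    for col in col_totals
}

for (row, col), value in expected.items():
    print(f"{row:>13} | {col:<6} | {value:8.2f}")
```

Rounded to whole numbers, these are the expected counts 287, 300, 2,563 and 2,674 that appear next to the observed counts in the table on Page 9.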

Page 9

Observed and Expected counts – Contingency Table

Expected counts, if independence of row and column classification is true, are shown in parentheses:

C-sect    EFM Exposure?
          Yes               No                Total
Yes       358 (287)         229 (300)           587
No        2,492 (2,563)     2,745 (2,674)     5,237
Total     2,850             2,974             5,824

Page 10

Chi Square Test

Chi Square Goodness of fit

(Table page A-26)
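The formula on this slide is missing from the transcript. Assuming it is the usual Pearson chi-square statistic, it sums the squared gap between observed and expected counts, scaled by the expected count, over all cells. The symbols O_i and E_i (observed and expected count in cell i) are our own notation.

```latex
\chi^2 \;=\; \sum_{i=1}^{rc} \frac{(O_i - E_i)^2}{E_i},
\qquad \text{with } (r-1)(c-1) \text{ degrees of freedom for an } r \times c \text{ table.}
```

For a 2x2 table this gives 1 degree of freedom, and the statistic is compared against the chi-square distribution table.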

Page 11

Continuity correction factor

In 2x2 tables (only) we apply a continuity correction factor:
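The corrected formula is also missing from the transcript. Assuming the standard Yates continuity correction is meant, each absolute difference is reduced by 0.5 before squaring; O_i and E_i again denote observed and expected counts and are our own notation.

```latex
\chi^2 \;=\; \sum_{i=1}^{4} \frac{\left(\,|O_i - E_i| - 0.5\,\right)^2}{E_i},
\qquad \text{with } 1 \text{ degree of freedom.}
```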

Page 12

Example

For the EFM and c-section example above:

Note: This is a two-sided test
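The worked calculation for the EFM example can be sketched as follows, assuming the continuity-corrected statistic is the one intended. The helper names `expected_counts`, `chi_square_2x2` and `p_value_1df` are hypothetical, and the p-value uses the identity that for 1 degree of freedom the upper tail of the chi-square distribution equals erfc(sqrt(x/2)).

```python
import math

# Observed 2x2 table: rows = c-section yes/no, columns = EFM yes/no.
observed = [[358, 229], [2492, 2745]]


def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_2x2(table, correction=True):
    """Chi-square statistic, optionally with the 0.5 continuity correction."""
    expected = expected_counts(table)
    adjust = 0.5 if correction else 0.0
    return sum(
        (abs(o - e) - adjust) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def p_value_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(statistic / 2))


stat = chi_square_2x2(observed)
print(f"chi-square = {stat:.2f}, p = {p_value_1df(stat):.2e}")
```

With unrounded expected counts the statistic comes out a little above 37, far beyond the 1-df critical value of 3.84 at the 0.05 level, so the hypothesis of equal c-section rates would be rejected.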

Page 13

Equivalent Tests

• The above example can be analyzed equivalently using a two-sample test of proportions (Chapter 14.6)

• The two-sample test of proportions (Z test) and the Chi-Square test are mathematically equivalent
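The equivalence can be checked numerically on the EFM data. The sketch below assumes the Z test uses the pooled proportion in its standard error and the chi-square statistic is computed without the continuity correction; under those assumptions the squared Z statistic equals the chi-square statistic. Variable names are our own.

```python
import math

# C-sections and group sizes for the EFM and no-EFM groups.
x1, n1 = 358, 2850
x2, n2 = 229, 2974

# Two-sample Z test of proportions with a pooled standard error.
p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Pearson chi-square on the same data, without continuity correction.
observed = [[x1, x2], [n1 - x1, n2 - x2]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)

print(f"z^2 = {z ** 2:.4f}, chi-square = {chi2:.4f}")
```

The two printed values agree up to floating-point error. Once the continuity correction is applied to the chi-square statistic, the match is no longer exact.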

Page 14

Assumptions – Chi square test

• The Chi square test is an asymptotic test, i.e. it is appropriate only when the sample size is large

• The Chi square test treats the row totals and column totals of the data as fixed (i.e. not random)

Page 15

Assumptions – 2 sample test of proportions

• The Z test is also an asymptotic test. It assumes that the Central Limit Theorem for sample means (i.e. proportions) holds, so it is appropriate only when the sample size is large

• The Z test assumes that the proportions in each group being compared are random variables

Page 16

Extending to multiple categories: r x c Tables

e.g. Accuracy of Death Certificates

Hospit.    Certificate Status                       Total
           Conf.       Inacc.       Incorr.
           Accur.      No Ch.       Recode
Comm.      157         18           54                229
Teach.     268         44           34                346
Total      425         62           88                575

Page 17

e.g.

Expected counts under independence are shown in parentheses:

Hospital    Certificate Status                                Total
            Confirmed        Inaccurate       Incorrect
            Accurate         No Change        Recoded
Comm.       157 (169.3)      18 (24.7)        54 (35.0)         229
Teach.      268 (255.7)      44 (37.3)        34 (53.0)         346
Total       425              62               88                575

tabi 157 18 54 \ 268 44 34
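The same r x c calculation can be sketched in Python for the death certificate data. This is an illustration using only the standard library, not a reproduction of the Stata output. It assumes no continuity correction (that correction is used for 2x2 tables only), and the closed-form p-value exp(-x/2) is valid only because this table has exactly 2 degrees of freedom.

```python
import math

# Observed counts: rows = hospital type, columns = certificate status.
observed = [
    [157, 18, 54],   # Comm.
    [268, 44, 34],   # Teach.
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-square statistic summed over all r x c cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For exactly 2 df, the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

for exp_row in expected:
    print([round(e, 1) for e in exp_row])
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.2e}")
```

The printed expected counts match the values shown in parentheses in the table, which is a useful sanity check before reading off the test statistic.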

Page 18

Summary

• Contingency Tables
  – Analysis of 2x2 tables
  – Analysis of r x c tables

• Equivalence between Chi square test and two sample test of proportions