Biostatistics course Part 14 Analysis of binary paired data Dr. Sc. Nicolas Padilla Raygoza Department of Nursing and Obstetrics Division Health Sciences and Engineering University of Guanajuato Campus Celaya-Salvatierra

Apr 02, 2015

Page 1

Biostatistics course

Part 14

Analysis of binary paired data

Dr. Sc. Nicolas Padilla Raygoza

Department of Nursing and Obstetrics

Division Health Sciences and Engineering

University of Guanajuato

Campus Celaya-Salvatierra

Page 2

Biosketch

Medical Doctor, Autonomous University of Guadalajara.

Pediatrician, certified by the Mexican Council of Certification on Pediatrics.

Postgraduate Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London.

Master of Sciences with emphasis in Epidemiology, Atlantic International University.

Doctorate of Sciences with emphasis in Epidemiology, Atlantic International University.

Associate Professor B, School of Nursing and Obstetrics of Celaya, University of Guanajuato.

[email protected]

Page 3

Competencies

The reader will know how to present paired binary data.

He (she) will apply a hypothesis test for paired binary data: McNemar's Chi-squared test.

He (she) will calculate a confidence interval for the difference between two paired proportions.

He (she) will obtain the Odds Ratio and its confidence interval for matched case-control studies.

Page 4

Introduction

In Parts 12 and 13 of the biostatistics course, we learned the methods for comparing two proportions estimated from independent samples.

If the observations in a study are not independent, we need to use different methods.

Two types of studies often give rise to observations that are not independent:

Repeated observations in the same individual

Matched case-control studies

Page 5

Example

Tuberculosis can be diagnosed by inoculating a culture medium and observing whether Mycobacterium tuberculosis grows.

In an experiment to compare two culture media for the diagnosis of tuberculosis, sputum samples from 100 patients were plated on both media.

Half of each sample was plated on medium A and the other half on medium B.

Page 6

Example

In a study examining the relation between breast cancer and oral contraceptives, women with breast cancer were matched with women without breast cancer, selected from electoral registries.

This is an example of a matched case-control study, where each individual with breast cancer is matched with an individual of similar age, to control for the potential confounding effect of age.

Page 7

Showing categorical paired data

Using the Z test or the Chi-squared test with paired data is a mistake, because those tests do not take into account the paired nature of the data.

Patient   Culture A   Culture B
1         -           -
2         -           -
3         +           +
4         +           -
5         +           +
6         -           -
7         -           +
8         -           -
9         -           -
10        -           -
11        +           -
12        +           +
13        +           -
14        +           +
15        +           -
16        +           +
17        +           +
18        -           -
19        -           -
20        +           -
21        -           -
22        +           +
23        +           +
24        +           +
25        +           -
26        -           -
27        +           +
28        -           -
29        +           +
30        +           +
31        -           -
32        +           +
33        +           +
34        +           -
35        -           -
36        +           +
37        +           +
38        +           -
39        +           +
40        +           -
41        +           -
42        +           +
43        -           -
44        +           +
45        +           -
46        +           +
47        -           -
48        +           -
49        -           -
50        -           +

Page 8

Showing categorical paired data

Patient   Culture A   Culture B
51        -           -
52        -           -
53        +           +
54        +           -
55        +           +
56        -           -
57        -           +
58        -           -
59        -           -
60        -           -
61        +           -
62        +           +
63        +           -
64        +           +
65        +           -
66        +           +
67        +           +
68        -           -
69        -           -
70        +           -
71        -           -
72        +           +
73        +           +
74        +           +
75        +           -
76        -           -
77        +           +
78        -           -
79        +           +
80        +           +
81        -           -
82        +           +
83        +           +
84        +           -
85        -           -
86        +           +
87        +           +
88        +           -
89        +           +
90        +           -
91        +           -
92        +           +
93        -           -
94        +           +
95        +           -
96        +           +
97        -           -
98        +           -
99        -           -
100       -           +

Page 9

Showing categorical paired data

The experiment compared the capacity of the two culture media to detect Mycobacterium tuberculosis.

The results were positive (+) or negative (-).

We are interested in comparing the number of positive samples obtained with each culture medium.

The table summarizes the results:

Culture medium   +     -     Total
A                64    36    100
B                44    56    100

Page 10

Showing categorical paired data

From this, do you think that medium A is better than medium B at detecting the tuberculosis bacilli?

To make an adequate analysis, we need to compare the results with both media in each subject.

There are four combinations of results that can occur in each subject:

Combination   Medium A   Medium B   Number of pairs
1             +          +          k
2             +          -          r
3             -          +          s
4             -          -          m

Page 11

Showing categorical paired data

To compare the results of each subject, we need to count how many times each combination occurs.

An easy way to show the values is to tabulate the results from one medium against the results from the other.

              Medium B
Medium A      +        -        Total
+             k        r        k + r
-             s        m        s + m
Total         k + s    r + m    N

Page 12

Showing categorical paired data

The pairs with the same result in both media are pairs in agreement (concordant pairs), and they give no information on which medium is better at detecting the bacilli.

The remaining pairs had different results in the two media: 24 were positive in medium A and negative in medium B, and 4 were negative in medium A and positive in medium B.

The pairs whose results differ between the two media are called discordant pairs.

              Medium B
Medium A      +      -      Total
+             40     24     64
-             4      32     36
Total         44     56     100
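The counting step described above can be sketched in Python. This is only an illustration: the function name `tally_pairs` is an assumption made for this sketch, and the ten pairs used are patients 1 to 10 from the listing of results, not the full sample.

```python
from collections import Counter


def tally_pairs(pairs):
    """Count paired binary results into the four cells k, r, s, m.

    Each pair is (result in medium A, result in medium B), coded "+" or "-".
    """
    counts = Counter(pairs)
    return {
        "k": counts[("+", "+")],  # positive in both media
        "r": counts[("+", "-")],  # positive in A, negative in B (discordant)
        "s": counts[("-", "+")],  # negative in A, positive in B (discordant)
        "m": counts[("-", "-")],  # negative in both media
    }


# Patients 1 to 10 from the listing (medium A, medium B).
first_ten = [
    ("-", "-"), ("-", "-"), ("+", "+"), ("+", "-"), ("+", "+"),
    ("-", "-"), ("-", "+"), ("-", "-"), ("-", "-"), ("-", "-"),
]

cells = tally_pairs(first_ten)
print(cells)
```

Applied to all 100 pairs, the same tally should reproduce the cells 40, 24, 4 and 32 of the table.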

Page 13

Hypothesis test for binary paired data

If there were no difference between the media, we would expect similar numbers of discordant pairs in each direction, r ≈ s.

We can use the McNemar test to assess whether the difference between the two numbers of discordant pairs is greater than what we would expect by chance.

To test the null hypothesis that there is no difference between the two proportions, we use the McNemar test:

\[ \chi^2_{\text{paired}} = \frac{(|r - s| - 1)^2}{r + s} \]

Subtracting 1 gives us a continuity correction.

Page 14

Hypothesis test for binary paired data

In the study of two culture media for tuberculosis bacilli:

24 were positive in medium A and negative in medium B

4 were negative in medium A and positive in medium B

\[ \chi^2_{\text{paired}} = \frac{(|r - s| - 1)^2}{r + s} = \frac{(|24 - 4| - 1)^2}{24 + 4} = \frac{361}{28} = 12.89, \quad p < 0.05 \]

We reject the null hypothesis of no difference between the media.
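As a check on the arithmetic, the McNemar statistic with continuity correction can be computed with a minimal Python sketch. The function names are assumptions made for this illustration. The p-value uses the fact that the upper tail of a chi-squared distribution with 1 degree of freedom equals erfc(√(x/2)), so only the standard library is needed.

```python
import math


def mcnemar_chi2(r, s):
    """McNemar chi-squared statistic with continuity correction."""
    return (abs(r - s) - 1) ** 2 / (r + s)


def chi2_1df_p_value(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Discordant pairs from the culture media example: r = 24, s = 4.
chi2 = mcnemar_chi2(24, 4)
p_value = chi2_1df_p_value(chi2)
print(f"chi2 = {chi2:.2f}, p < 0.05: {p_value < 0.05}")
```

Only the discordant pairs enter the calculation; the 72 concordant pairs do not affect the statistic.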

Page 15

Confidence intervals for the difference of two paired proportions

We know that the difference between the proportions of paired data can be calculated as:

\[ p_1 - p_2 = \frac{r - s}{N} \]

Where:

r and s are the numbers of discordant pairs

N is the total number of pairs

The standard error of the difference between paired proportions is:

\[ SE(p_1 - p_2) = \frac{\sqrt{r + s}}{N} \]

Page 16

Confidence intervals for the difference of two paired proportions

The general formula to calculate a 95% confidence interval is:

\[ \text{Estimate} \pm 1.96 \times SE \]

From the table of sputum culture results with media A and B, we can use the values of r and s to calculate the 95% confidence interval for the difference between the paired proportions:

\[ \frac{r - s}{N} \pm 1.96 \times \frac{\sqrt{r + s}}{N} = \frac{24 - 4}{100} \pm 1.96 \times \frac{\sqrt{24 + 4}}{100} = 0.2 \pm 0.10 = 0.1 \text{ to } 0.3 = 10\% \text{ to } 30\% \]

A confidence interval from 0.1 to 0.3 means that, in the population, the percentage of cultures positive for the bacilli could be between 10 and 30 percentage points higher with medium A than with medium B.
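The same interval can be reproduced with a short Python sketch. The function name `paired_difference_ci` and its `z` parameter are assumptions made for this illustration; the formulas are the difference (r - s) / N and the standard error √(r + s) / N given in this part of the course.

```python
import math


def paired_difference_ci(r, s, n, z=1.96):
    """Difference between paired proportions with its confidence interval.

    r, s: numbers of discordant pairs; n: total number of pairs.
    """
    diff = (r - s) / n
    se = math.sqrt(r + s) / n
    return diff, diff - z * se, diff + z * se


# Culture media example: r = 24, s = 4, N = 100.
diff, lower, upper = paired_difference_ci(24, 4, 100)
print(f"difference = {diff:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

Rounded to two decimals, this gives a difference of 0.20 with limits 0.10 and 0.30, in agreement with the hand calculation.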

Page 17

Odds Ratio for paired data

In case-control studies, we usually want to evaluate the risk associated with exposure to a risk factor; for these studies, we need a measure of effect.

In case-control studies, we use the Odds Ratio (OR), which is the odds of exposure in the cases divided by the odds of exposure in the controls.

The calculation of the OR with matched data is based on the discordant pairs, as is the difference between proportions of paired data.

Page 18

Odds Ratios for paired data

Table of exposure in cases against exposure in controls

                 Controls
Cases            Exposed    Non-exposed
Exposed          k          r
Non-exposed      s          m

k = number of pairs where both the case and the control were exposed.

r = number of pairs where the case was exposed and the control was not exposed.

s = number of pairs where the case was not exposed and the control was exposed.

m = number of pairs where neither the case nor the control was exposed.

Page 19

Odds Ratio for paired data

The Odds Ratio is calculated as the ratio of the two groups of discordant pairs.

\[ OR = \frac{r}{s} = \frac{\text{pairs with case exposed, control not exposed}}{\text{pairs with case not exposed, control exposed}} \]

Page 20

Odds Ratio for paired data

The table shows the results of a matched case-control study designed to investigate the association between the use of oral contraceptives (OCC) and thromboembolism.

                   Controls
Cases              Use OCC    Do not use OCC
Use OCC            10         57
Do not use OCC     13         95

Page 21

Confidence intervals for paired OR

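Calculating the confidence interval is a little more complicated.

It is calculated using the square root of the value of the McNemar chi-squared statistic, instead of a standard error.

The 95% confidence interval for the OR from paired data is:

\[ OR^{\,1 \pm 1.96 / \chi_{\text{paired}}}, \qquad \chi_{\text{paired}} = \sqrt{\chi^2_{\text{paired}}} \]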

Page 22

Odds Ratio for paired data

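\[ OR = \frac{57}{13} = 4.4 \]

\[ \chi^2_{\text{paired}} = \frac{(|57 - 13| - 1)^2}{57 + 13} = 26.41 \]

\[ \chi_{\text{paired}} = \sqrt{26.41} = 5.14 \]

Then, the 95% confidence interval is:

\[ 4.4^{\,1 - 1.96/5.14} \text{ to } 4.4^{\,1 + 1.96/5.14} \]

\[ 4.4^{\,0.62} \text{ to } 4.4^{\,1.38} \]

2.5 to 7.7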
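The calculation for the oral contraceptive example can be sketched in Python as follows. The function name `paired_odds_ratio_ci` is an assumption made for this illustration. Unlike the hand calculation, the sketch does not round the OR to 4.4 before raising it to the power, so intermediate values may differ slightly in later decimals.

```python
import math


def paired_odds_ratio_ci(r, s, z=1.96):
    """Matched-pairs odds ratio with a confidence interval based on the McNemar statistic.

    r: pairs with case exposed and control not exposed.
    s: pairs with case not exposed and control exposed.
    """
    odds_ratio = r / s
    # Square root of the McNemar chi-squared statistic (with continuity correction).
    chi = math.sqrt((abs(r - s) - 1) ** 2 / (r + s))
    lower = odds_ratio ** (1 - z / chi)
    upper = odds_ratio ** (1 + z / chi)
    return odds_ratio, lower, upper


# Discordant pairs from the thromboembolism table: r = 57, s = 13.
odds_ratio, lower, upper = paired_odds_ratio_ci(57, 13)
print(f"OR = {odds_ratio:.1f}, 95% CI = {lower:.1f} to {upper:.1f}")
```

To one decimal place the result agrees with the values above: an OR of 4.4 with limits 2.5 and 7.7.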
