EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Mar 30, 2015

Luca Saville
Transcript
Page 1: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

EPI809/Spring 2008

Chapter 10

Hypothesis testing: Categorical Data Analysis

Page 2: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Learning Objectives

1. Comparison of binomial proportions using the Z and χ² tests
2. Explain the χ² test for independence of 2 variables
3. Explain Fisher's test for independence
4. McNemar's test for correlated data
5. Kappa statistic
6. Use of SAS PROC FREQ

Page 3: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Data Types

[Diagram: Data divides into Quantitative and Qualitative; Quantitative divides into Discrete and Continuous.]

Page 4: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Qualitative Data

1. Qualitative random variables yield responses that can be put in categories. Example: Gender (Male, Female)
2. Measurement or count reflects the number in each category
3. Nominal (no order) or ordinal (order) scale
4. Data can be collected as continuous but recoded to categorical data. Example: systolic blood pressure (hypotension, normal tension, hypertension)

Page 5: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Hypothesis Tests Qualitative Data

[Diagram: Qualitative Data branches into Proportion and Independence. Proportion: 1 pop. uses the Z test, 2 pop. uses the Z test, 2 or more pop. uses the χ² test. Independence uses the χ² test.]

Page 6: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Z Test for Differences in Two Proportions

Page 12: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Hypotheses for Two Proportions

Research Questions

Hypothesis   No Difference /    Pop 1 ≥ Pop 2 /    Pop 1 ≤ Pop 2 /
             Any Difference     Pop 1 < Pop 2      Pop 1 > Pop 2
H0           p1 - p2 = 0        p1 - p2 ≥ 0        p1 - p2 ≤ 0
Ha           p1 - p2 ≠ 0        p1 - p2 < 0        p1 - p2 > 0

Page 13: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Z Test for Difference in Two Proportions

1. Assumptions
   - Populations are independent
   - Populations follow the binomial distribution
   - Normal approximation can be used for large samples (all expected counts ≥ 5)
2. Z-test statistic for two proportions

$$Z = \frac{(\hat p_1 - \hat p_2) - (p_1 - p_2)}{\sqrt{\hat p\,(1 - \hat p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \qquad \text{where } \hat p = \frac{X_1 + X_2}{n_1 + n_2}$$

Page 14: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

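Sample Distribution for Difference Between Proportions

$$\hat p_1 - \hat p_2 \sim N\left(p_1 - p_2;\ \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right)$$

$$\hat p_1 - \hat p_2 \sim N\left(0;\ pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right) \text{ under } H_0: p_1 = p_2, \qquad \hat p = \frac{x_1 + x_2}{n_1 + n_2}$$

$$\bar X_1 - \bar X_2 \sim N\left(\mu_1 - \mu_2;\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$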

Page 15: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Z Test for Two Proportions Thinking Challenge

You're an epidemiologist for the US Department of Health and Human Services. You're studying the prevalence of disease X in two states (MA and CA). In MA, 74 of 1500 people surveyed were diseased, and in CA, 129 of 1500 were diseased. At the .05 level, does MA have a lower prevalence rate?

Page 21: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Z Test for Two Proportions Solution*

$$\hat p_{MA} = \frac{X_{MA}}{n_{MA}} = \frac{74}{1500} = .0493 \qquad \hat p_{CA} = \frac{X_{CA}}{n_{CA}} = \frac{129}{1500} = .0860$$

$$\hat p = \frac{X_{MA} + X_{CA}}{n_{MA} + n_{CA}} = \frac{74 + 129}{1500 + 1500} = .0677$$

$$Z = \frac{(.0493 - .0860) - 0}{\sqrt{.0677\,(1 - .0677)\left(\frac{1}{1500} + \frac{1}{1500}\right)}} = -4.00$$

Page 24: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Z Test for Two Proportions Solution*

H0: p_MA - p_CA = 0
Ha: p_MA - p_CA < 0
α = .05
n_MA = 1500, n_CA = 1500
Critical Value(s): Z = -1.645 (reject H0 if Z < -1.645; lower-tail area .05)

Test Statistic: Z = -4.00

Decision: Reject H0 at α = .05

Conclusion: There is evidence that the prevalence in MA is less than in CA
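The hand calculation for the MA and CA example can be checked with a short SAS data step. This is a minimal sketch and not part of the original slides; the data set name `ztest` and all variable names are made up for illustration. It pools the two samples under H0, forms the Z statistic, and uses the SAS functions `probnorm` and `probit` for the lower-tail p-value and critical value.

```sas
/* Sketch only: data set and variable names are hypothetical */
data ztest;
  x_ma = 74;   n_ma = 1500;               /* diseased / surveyed in MA */
  x_ca = 129;  n_ca = 1500;               /* diseased / surveyed in CA */
  p_ma = x_ma / n_ma;                     /* sample proportion, MA */
  p_ca = x_ca / n_ca;                     /* sample proportion, CA */
  p_pool = (x_ma + x_ca) / (n_ma + n_ca); /* pooled proportion under H0 */
  se = sqrt(p_pool * (1 - p_pool) * (1/n_ma + 1/n_ca));
  z = (p_ma - p_ca) / se;                 /* test statistic */
  crit = probit(0.05);                    /* lower-tail critical value at alpha = .05 */
  pvalue = probnorm(z);                   /* P(Z <= z), matching Ha: p_MA - p_CA < 0 */
run;

proc print data=ztest noobs;
  var p_ma p_ca p_pool z crit pvalue;
run;
```

The printed `z` should agree with the slide's Z = -4.00 up to rounding, and `crit` with the tabled value -1.645.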

Page 25: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Between 2 Categorical Variables

Page 27: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence

1. Shows if a relationship exists between 2 qualitative variables, but does not show causality
2. Assumptions
   - Multinomial experiment
   - All expected counts ≥ 5
3. Uses two-way contingency table

Page 29: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Contingency Table

1. Shows the number of observations from 1 sample jointly in 2 qualitative variables

                  Residence
Disease Status    Urban    Rural    Total
Disease              63       49      112
No disease           15       33       48
Total                78       82      160

Rows: levels of variable 1 (Disease Status). Columns: levels of variable 2 (Residence).

Page 32: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Hypotheses & Statistic

1. Hypotheses
   - H0: Variables are independent
   - Ha: Variables are related (dependent)
2. Test statistic

$$\chi^2 = \sum_{\text{all cells}} \frac{\left[n_{ij} - E(n_{ij})\right]^2}{E(n_{ij})}$$

   where $n_{ij}$ is the observed count and $E(n_{ij})$ is the expected count.

   Degrees of freedom: (r - 1)(c - 1), where r is the number of rows and c is the number of columns

Page 33: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Expected Counts

1. Statistical independence means joint probability equals product of marginal probabilities
2. Compute marginal probabilities and multiply for joint probability
3. Expected count is sample size times joint probability
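Written out as a formula, the three steps above collapse into one expression. The symbols $R_i$ (total of row $i$), $C_j$ (total of column $j$) and $n$ (sample size) are notation introduced here for illustration; they do not appear on the slides.

$$E(n_{ij}) = n \cdot \underbrace{\frac{R_i}{n}}_{\text{row marginal}} \cdot \underbrace{\frac{C_j}{n}}_{\text{column marginal}} = \frac{R_i \, C_j}{n}$$

One factor of $n$ cancels, which is why the expected count can be computed directly from the row total, column total and sample size.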

Page 38: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Expected Count Example

                  Residence
                  Urban    Rural
Disease Status    Obs.     Obs.     Total
Disease             63       49      112
No Disease          15       33       48
Total               78       82      160

Marginal probability (Disease row) = 112/160

Marginal probability (Urban column) = 78/160

Joint probability = (112/160) · (78/160)

Expected count = 160 · (112/160) · (78/160) = 54.6

Page 41: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

Expected Count Calculation

Expected count = (Row total × Column total) / Sample size

                  Residence
                  Urban           Rural
Disease Status    Obs.    Exp.    Obs.    Exp.    Total
Disease             63    54.6      49    57.4      112
No Disease          15    23.4      33    24.6       48
Total               78    78        82    82        160

Disease, Urban: 112 × 78 / 160 = 54.6
Disease, Rural: 112 × 82 / 160 = 57.4
No Disease, Urban: 48 × 78 / 160 = 23.4
No Disease, Rural: 48 × 82 / 160 = 24.6
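The same expected counts can be requested from PROC FREQ rather than computed by hand. The sketch below is not from the original slides; the data set name `residence` and the variable names `status` and `area` are hypothetical. It relies on the `EXPECTED` option of the TABLES statement, which prints the expected cell frequencies under independence.

```sas
/* Sketch only: data set and variable names are hypothetical */
data residence;
  length status $ 10 area $ 5;   /* avoid truncating "NoDisease" to 8 characters */
  input status $ area $ count;
  datalines;
Disease   Urban 63
Disease   Rural 49
NoDisease Urban 15
NoDisease Rural 33
;
run;

proc freq data=residence order=data;
  weight count;                  /* each input line is a cell count, not one subject */
  tables status*area / expected norow nocol nopercent;
run;
```

The `norow`, `nocol` and `nopercent` options only suppress percentages so that each cell shows the observed frequency next to its expected value.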

Page 42: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Example on HIV

You randomly sample 286 sexually active individuals and collect information on their HIV status and history of STDs. At the .05 level, is there evidence of a relationship?

             HIV
STDs Hx      No     Yes    Total
No           84      32      116
Yes          48     122      170
Total       132     154      286

Page 48: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Solution

E(n_ij) ≥ 5 in all cells

             HIV
             No              Yes
STDs Hx      Obs.    Exp.    Obs.    Exp.    Total
No             84    53.5      32    62.5      116
Yes            48    78.5     122    91.5      170
Total         132   132       154   154        286

STDs No, HIV No: 116 × 132 / 286 = 53.5
STDs No, HIV Yes: 116 × 154 / 286 = 62.5
STDs Yes, HIV No: 170 × 132 / 286 = 78.5
STDs Yes, HIV Yes: 170 × 154 / 286 = 91.5

Page 49: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Solution

$$\chi^2 = \sum_{\text{all cells}} \frac{\left[n_{ij} - E(n_{ij})\right]^2}{E(n_{ij})} = \frac{\left[n_{11} - E(n_{11})\right]^2}{E(n_{11})} + \frac{\left[n_{12} - E(n_{12})\right]^2}{E(n_{12})} + \cdots + \frac{\left[n_{22} - E(n_{22})\right]^2}{E(n_{22})}$$

$$= \frac{(84 - 53.5)^2}{53.5} + \frac{(32 - 62.5)^2}{62.5} + \cdots + \frac{(122 - 91.5)^2}{91.5} = 54.29$$
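For reference, the sum can be written out with all four cells. Every cell deviates from its rounded expected count by 30.5 in absolute value, so each numerator is $30.5^2 = 930.25$:

$$\chi^2 \approx \frac{930.25}{53.5} + \frac{930.25}{62.5} + \frac{930.25}{78.5} + \frac{930.25}{91.5} \approx 17.39 + 14.88 + 11.85 + 10.17 = 54.29$$

This figure uses expected counts rounded to one decimal place. Carrying the unrounded expected counts (for example $116 \times 132 / 286 \approx 53.538$) gives about 54.15, which is why the SAS output reports a Chi-Square value of 54.1502 rather than 54.29. The decision is unaffected, since both values are far above the critical value 3.841.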

Page 52: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence Solution

H0: No relationship
Ha: Relationship
α = .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s): χ² = 3.841 (reject H0 if χ² > 3.841; upper-tail area α = .05)

Test Statistic: χ² = 54.29

Decision: Reject H0 at α = .05

Conclusion: There is evidence of a relationship

Page 53: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence SAS CODES

    /* One line per cell of the 2x2 table: STDs level, HIV level, cell count */
    data dis;
      input STDs HIV count;
      cards;
    1 1 84
    1 2 32
    2 1 48
    2 2 122
    ;
    run;

    /* WEIGHT treats count as a cell frequency; CHISQ requests the chi-square tests */
    proc freq data=dis order=data;
      weight count;
      tables STDs*HIV / chisq;
    run;

Page 54: EPI809/Spring 2008 1 Chapter 10 Hypothesis testing: Categorical Data Analysis.

χ² Test of Independence SAS OUTPUT

Statistics for Table of STDs by HIV

Statistic                        DF       Value      Prob
----------------------------------------------------------
Chi-Square                        1     54.1502    <.0001
Likelihood Ratio Chi-Square       1     55.7826    <.0001
Continuity Adj. Chi-Square        1     52.3871    <.0001
Mantel-Haenszel Chi-Square        1     53.9609    <.0001
Phi Coefficient                          0.4351
Contingency Coefficient                  0.3990
Cramer's V                               0.4351
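The critical value used in the hand solution and the p-value in the Chi-Square row can be cross-checked with SAS probability functions. This is a minimal sketch, not part of the original slides; the data set name `chk` and its variable names are hypothetical. `cinv` returns a chi-square quantile and `probchi` returns the chi-square cumulative probability.

```sas
/* Sketch only: data set and variable names are hypothetical */
data chk;
  crit = cinv(0.95, 1);              /* upper 5% point of chi-square with 1 df */
  pval = 1 - probchi(54.1502, 1);    /* upper-tail p-value for the reported statistic */
run;

proc print data=chk noobs;
  format pval e10.;                  /* scientific notation, since the p-value is very small */
run;
```

`crit` should be close to the tabled 3.841, and `pval` should fall well below .0001, consistent with the `<.0001` shown in the output.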