Top Banner
1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)
23

1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

Dec 21, 2015

Download

Documents

Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

1

PH 240A: Chapter 12

Mark van der LaanUniversity of California Berkeley

(Slides by Nick Jewell)

Page 2: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

2

Regression Models: Motivation

Stratification methods break down (wrt precision) with large number of strata and moderate sample sizes

Stratification leads to high degree of freedom tests for interaction, for example, and therefore low power

Refined measures of exposure lead to high degree of freedom tests for association and therefore low power

Want parsimonious descriptions of patterns of risk

Regression Models (how to diet wrt degrees of freedom)

Page 3: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

3

Regression Models: Linking Effects

xx

x

x

X, numerical measure of Exposure E

P(D|E=x)

x1 x2 x4x3

Page 4: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

4

Linear Models (think CHD and body weight)

Form of model:

Interpretation of model parameters a and b:

b = ER associated with unit increase in X b(x*-x) = ER assoc. with increase in X from x* to x

bxaxXDPpx +=== )|(

[ ] [ ]b

bxaxba

XxdpxXDPpp

aXDPp

xx

=

+−++=

=−+==−

===

+

)1(

)|()1|(

)0|(

1

0Choose X = 0 value carefully!

Choose the scale ofX carefully!

Page 5: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

5

Linear Models

Pros Good modeling Excess Risk

Cons Can’t be applied to case-control data Can predict probabilities < 0 or > 1

Page 6: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

6

Log Linear Models

Form of model:

Interpretation of model parameters a and b:

( ) ( )

bxax

x

ep

bxaxXDPp

+=

+=== )|(loglog

( ) ( )

( ) ( ) [ ] [ ]

bp

p

bxaxbapp

aXDPp

x

x

xx

=⎟⎟⎠

⎞⎜⎜⎝

⎛⇒

+−++=−

===

+

+

1

1

0

log

)1(loglog

)0|(loglog Choose X = 0 value carefully!

Page 7: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

7

Log Linear Models

Interpretation of model parameters a and b:

eb = RR associated with unit increase in X

eb(x*-x) = RR assoc. with increase in X from x* to x

Choose the scale ofX carefully!

Page 8: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

8

Log Linear Models

Pros Good modeling Relative Risk

Cons Can’t be applied to case-control data Can predict probabilities > 1

Page 9: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

9

Logistic Regression Models

Form of model:

Interpretation of model parameters a and b:

( )

bxa

bxa

bxax

x

x

e

e

ep

bxaxXDp

p

+

+

+− +≡

+=

+===⎟⎟⎠

⎞⎜⎜⎝

11

1

|for oddslog1

log

)(

Choose X = 0 value carefully!

Page 10: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

10

Logistic Regression Models

Interpretation of model parameters a and b:

eb = OR associated with unit increase in X

eb(x*-x) = OR assoc. with increase in X from x* to x

Choose the scale ofX carefully!

Page 11: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

11

Logistic Regression Models

Pros Good modeling Odds Ratio Can be applied to case-control data Predicted probabilities always lie between 0 and 1

Cons Harder to interpret

Page 12: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

12

Logistic Regression Models

Page 13: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

13

1991 US Infant Mortality

Mother’s Marital Status

Infant Mortality

Unmarried

(X = 1)

Married(X = 0)

Total

Death 16,712 18,784 35,496

Live at 1 Year

1,197,142

2,878,421

4,075,563

Total 1,213,854

2,897,205

4,111,059

)0073.00065.00138.0(

0073.00065.0 :ModelLinear

01 =−=−=+=

ppERxpx

)7531.0)0065.0/0138.0log(logloglog

0385.50065.0loglog(

7531.00385.5)log( :Modellinear -Log

01

0

==−==−===

+−=

ppbRRpa

xpx

p1 = 16,712/1,213,854 = 0.0138p0 = 18,784/2,897,205 = 0.0065

)7604.0)1/log()1/log(log

0320.5)9935.0/0065.0log()1/log((

7604.00320.5)1/log( :Model Logistic

0011

00

=−−−==−==−=

+−=−

ppppbORppa

xpp xx

Page 14: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

14

Infant Mortality and Marital Status: Various Models

Page 15: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

15

Logistic Regression: Body Weight and CHD

Page 16: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

16

CHD and Body Weight: Various Models

Page 17: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

17

Multiple Logistic Regression Models

Form of model:

( )

kk

kk

kkk

k

k

xbxbxba

xbxbxba

xbxbxbaxx

kk

kkxx

xx

e

e

ep

xbxbxba

xXxXDp

p

++++

++++

++++− +≡

+=

++++=

===⎟⎟⎠

⎞⎜⎜⎝

L

L

LK

K

K

L

K

2211

2211

22111

1

1

111

,,|for oddslog1

log

)(,,

2211

11,,

,,

Think, e.g., D = CHD, X1 = Behavior type, X2 = Body weight

Page 18: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

18

Multiple Logistic Regression Models

Interpretation of model parameters a, b1, . . ., bk :

( ) ( )

[ ] [ ] 122112211

,,,,,,,,,1,,,1,,,,,,

,,,1,,,1

0,,0

0,,0

)1(

)1/(log)1/(log)1/(

)1/(log

1log

21212121

2121

2121

bxbxbxbaxbxbxba

pppppp

pp

ap

p

kkkk

xxxxxxxxxxxxxxxxxx

xxxxxx

kkkk

kk

kk

=++++−+++++=

−−−=⎟⎟⎠

⎞⎜⎜⎝

=⎟⎟⎠

⎞⎜⎜⎝

++++

LL

KKKKKK

KK

K

K Choose 0 covariatevalues carefully!

Choose covariate scales carefully!

Page 19: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

19

Indicator (Dummy) Variables for Discrete Exposures

Goal: to model exposures with several discrete levels without assuming dose response Body weight (dose response)

Body weight (no assumed pattern)

bxap

pX

x

x +=⎟⎟⎠

⎞⎜⎜⎝

⎪⎪⎪

⎪⎪⎪

>≤<≤<≤<

=1

log

lbs 180 wt4lbs180wt1703lbs170wt1602lbs160wt1501

lbs150wt0

⎩⎨⎧ >

=⎩⎨⎧ ≤<

=

⎩⎨⎧ ≤<

=⎩⎨⎧ ≤<

=

otherwise0

lbs180wt1

otherwise0

lbs180wt1701

otherwise0

lbs170wt1601

otherwise0

lbs160wt1501

43

21

XX

XX

44332211,,

,,

41

41

1log xbxbxbxba

p

p

xx

xx ++++=⎟⎟⎠

⎞⎜⎜⎝

− K

K

Page 20: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

20

Indicator (Dummy) Variables for Discrete Exposures

Model:44332211

,,

,,

41

41

1log xbxbxbxba

p

p

xx

xx ++++=⎟⎟⎠

⎞⎜⎜⎝

− K

K

lbs 180 wt

lbs180wt170

lbs170wt160

lbs160wt150

lbs150wt

odds log

4

3

2

1

⎪⎪⎪

⎪⎪⎪

>+≤<+≤<+≤<+

=

babababa

a

No assumed pattern!

Page 21: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

21

Interpretation of Slope Coefficients with Dummy

Variables

lbs 180 wt

lbs180wt170

lbs170wt160

lbs160wt150

lbs150wt

odds log

4

3

2

1

⎪⎪⎪

⎪⎪⎪

>+≤<+≤<+≤<+

=

babababa

a

384.0)32594/50558log(

lbs) 150 baseline, vs.lbs 170-(160 log

068.0)32505/31558log(

lbs) 150 baseline, vs.lbs 160-(150 log

2

1

=××==<

=××==<

bOR

bOR

M

764.0

)50131/50566log(

068.0832.0

lbs) 160- 150 vs.lbs 180-(170 log 13

=××=

−=−= bbOR

CHD Event

D not D

Body Wt

(lbs)

150 32 558

150-160

31 505

160-170

50 594

170-180

66 501

>180 78 739

Page 22: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

22

Indicator (Dummy) Variables for Discrete Exposures

Model Parameter Estimate OR

a -2.839

b 0.180 1.20

a -2.859

b1 0.068 1.07

b2 0.384 1.47

b3 0.832 2.30

b4 0.610 1.84

bxap

p

x

x +=⎟⎟⎠

⎞⎜⎜⎝

−1log

4433

2211,,

,,

1log

41

41

xbxb

xbxbap

p

xx

xx

++

++=⎟⎟⎠

⎞⎜⎜⎝

− K

K

Page 23: 1 PH 240A: Chapter 12 Mark van der Laan University of California Berkeley (Slides by Nick Jewell)

23

Logistic Regression: Body Weight and CHD

Dose response (linear) X

No pattern (dummy vars.) X1, . . . , X4