
Gompertz Maximum Likelihood Estimation of Truncated Death Distributions

Joshua R. Goldstein

November 8, 2019


Big Picture


Our challenge

We see only part of the picture (e.g., deaths aged 70 to 87).

No estimates of who died before or after

How can we estimate death rates without denominators?
How can we estimate e(65) differences between groups?


Our idea

We can combine

the observed distribution of deaths (over a limited range)
our external knowledge of human mortality age patterns

The hope is that this will produce good estimates of mortality rates, of e(65), and of differences between groups.


Today’s agenda

Intro to Maximum Likelihood Estimation
Example with simulated data
Attempt at validation with HMD
Preliminary try at NUMIDENT
Lessons and directions


Truncated Maximum Likelihood (in theory)


Philosophy

For given data X, we can compute a likelihood associated with a particular value of the parameter θ.

We then choose the value θ̂ that maximizes this likelihood.


A simple example

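Likelihood for observation i with value $x_i$, here with the exponential density $f_\lambda(x) = \lambda e^{-\lambda x}$:

$$L_i = L(\lambda \mid x_i) = f_\lambda(x_i)$$

Likelihood for all observations:

$$L = \prod_i L_i$$

Log-likelihood:

$$\mathcal{L} = \sum_i \log L_i$$

If we observe $x_1 = 3$ and $x_2 = 5$, then

$$L(\lambda \mid x_1) = \lambda e^{-3\lambda}$$

$$L(\lambda \mid x_2) = \lambda e^{-5\lambda}$$

$$L(\lambda \mid x_1, x_2) = \lambda^2 e^{-8\lambda}$$

$$\mathcal{L} = \sum_i \log L_i = 2 \log \lambda - 8\lambda$$

$$\frac{d\mathcal{L}}{d\lambda} = \frac{2}{\lambda} - 8 = 0$$

So, $\hat{\lambda}_{MLE} = 2/8 = 0.25$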
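The same steps carry over to any number of observations. As a sketch, assuming the exponential density $f_\lambda(x) = \lambda e^{-\lambda x}$ implied by the likelihoods in this example and $n$ independent observations (the symbol $n$ is introduced here for illustration only):

```latex
\begin{aligned}
\mathcal{L}(\lambda) &= \sum_{i=1}^{n} \log\left(\lambda e^{-\lambda x_i}\right)
  = n \log \lambda - \lambda \sum_{i=1}^{n} x_i \\
\frac{d\mathcal{L}}{d\lambda} &= \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda}_{MLE} = \frac{n}{\sum_{i=1}^{n} x_i}
\end{aligned}
```

With $n = 2$ and $x_1 + x_2 = 8$ this reduces to $2/8 = 0.25$, the value found by hand.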


We did this by hand, but can also do it with the computer

lambda.vec <- seq(.01, 1, .01)
loglik.vec = 2 * log(lambda.vec) - 8 * lambda.vec
plot(lambda.vec, loglik.vec)
abline(v = lambda.vec[which.max(loglik.vec)])

[Figure: loglik.vec plotted against lambda.vec]


For a truncated distribution, we observe only from a to b

We can define the conditional distribution

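$$f_{trunc}(x) = \frac{f_\theta(x)}{\int_a^b f_\theta(x)\,dx} = \frac{f_\theta(x)}{F_\theta(b) - F_\theta(a)}$$

with likelihood

$$L(\theta \mid x) = \prod_i \frac{f_\theta(x_i)}{F_\theta(b) - F_\theta(a)}$$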

And then we maximize that.
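To make the truncated likelihood concrete, here is a minimal sketch in R that reuses the exponential model from the simple example rather than the Gompertz. The window (a = 2, b = 10), the true rate, the sample size, and the function name trunc.exp.negLL are all assumptions chosen for illustration; none of them come from the slides.

```r
## Sketch: MLE of an exponential rate when only values in (a, b) are observed.
set.seed(1)
lambda.true <- 0.25
a <- 2
b <- 10
x.all <- rexp(5000, rate = lambda.true)
x.obs <- x.all[x.all > a & x.all < b]   # values outside (a, b) are never seen

## negative log-likelihood of the truncated density f(x) / (F(b) - F(a))
trunc.exp.negLL <- function(lambda, x, a, b) {
  log.num <- dexp(x, rate = lambda, log = TRUE)
  log.den <- log(pexp(b, rate = lambda) - pexp(a, rate = lambda))
  -sum(log.num - log.den)
}

## one-dimensional search for the rate that minimizes the negative log-likelihood
fit <- optimize(trunc.exp.negLL, interval = c(0.001, 5),
                x = x.obs, a = a, b = b)
lambda.hat <- fit$minimum          # truncated MLE
lambda.naive <- 1 / mean(x.obs)    # ignores truncation, shown for comparison
```

Comparing lambda.hat and lambda.naive with lambda.true gives a feel for how much the normalizing term F(b) - F(a) matters in a given window.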


Validation with Simulated Gompertz


Simulated Gompertz, without truncation

source("hmd_validation_functions.R")
N = 2000
set.seed(13)
x <- rgompertz.M(N, b = 1/10, M = 75)
Dx <- table(floor(x))
plot(names(Dx), Dx, type = "h",
     ylab = "Death Counts", xlab = "Age",
     main = "2000 Simulated Gompertz Deaths")

[Figure: "2000 Simulated Gompertz Deaths"; x-axis: Age; y-axis: Death Counts]
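The slides do not show the body of rgompertz.M. One way such a sampler could work is inverse-CDF sampling. The sketch below assumes the Gompertz hazard is parameterized as h(x) = b * exp(b * (x - M)), with M the modal age at death; the function name rgomp.sketch is hypothetical, and its draws will not reproduce the author's simulated values.

```r
## Hypothetical helper (not the author's rgompertz.M): draw Gompertz ages at
## death with hazard h(x) = b * exp(b * (x - M)), where M is the modal age.
rgomp.sketch <- function(N, b, M) {
  u <- runif(N)
  ## invert F(x) = 1 - exp(-exp(-b * M) * (exp(b * x) - 1)) at u
  log(1 - exp(b * M) * log(1 - u)) / b
}

set.seed(13)
x <- rgomp.sketch(2000, b = 1/10, M = 75)
Dx <- table(floor(x))   # death counts by single year of age
```

The resulting Dx can be plotted with the same plot() call as on the slide.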


MLE estimation

source("hmd_validation_functions.R")
N = 1000
set.seed(13)
x <- rgompertz.M(N, b = 1/10, M = 75)
Dx <- table(floor(x))
fit <- counts.trunc.gomp.est(Dx = Dx, x.left = 0, x.right = 200,
                             b.start = 1/9, M.start = 80)
(b.hat = exp(fit$par[1]))
## [1] 0.09572687
(M.hat = exp(fit$par[2]))
## [1] 75.01612


How did we do?

[Figure: "2000 Simulated Gompertz Deaths, with MLE fit"; x-axis: Age; y-axis: Death Counts]


Now artificially truncate to ages 65-90

l <- 65
h <- 90
x.trunc <- x[x > l & x < h]
Dx <- table(floor(x.trunc))
fit <- counts.trunc.gomp.est(Dx = Dx, x.left = l, x.right = h,
                             b.start = 1/9, M.start = 80)
(b.hat = exp(fit$par[1]))
## [1] 0.1083535
(M.hat = exp(fit$par[2]))
## [1] 75.42279


Plot the fit

[Figure: "Simulated Truncated Gompertz, with MLE fit"; x-axis: Age; y-axis: Deaths]


Other ingredients

A model
    Here we pick Gompertz, with parameters β and M.
Some code to implement the optimization routine
Validation
    Test on simulated Gompertz deaths, to see if we estimate the right values
    Test on HMD to see if it works with real data
Application
    NUMIDENT
    (Weighted Censoc)
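The slides name the Gompertz parameters but do not write the model out. For reference, a common form with a slope parameter $b$ (written β on this slide) and a modal age at death $M$ is sketched below. That the author's functions use exactly this parameterization is an assumption.

```latex
\begin{aligned}
h(x) &= b\, e^{b(x - M)} && \text{(hazard)} \\
S(x) &= \exp\left(-e^{-bM}\left(e^{bx} - 1\right)\right) && \text{(survival from age 0)} \\
f(x) &= h(x)\, S(x) && \text{(density of deaths)}
\end{aligned}
```

Under this form the death density $f(x)$ peaks at $x = M$, which is why $M$ can be read as the modal age at death.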


Major Assumptions

Gompertz model is appropriate
Uniform coverage across ages to preserve the cohort distribution of deaths


Code

Gompertz functions

source("hmd_validation_functions.R")

##
fit <- counts.trunc.gomp.est(Dx = Dx, x.left = 0, x.right = 200,
                             b.start = 1/9, M.start = 80)
(b.hat = exp(fit$par[1]))
## [1] 0.1716812
(M.hat = exp(fit$par[2]))
## [1] 78.73153

Optimization

## wrapper function that calls optim()
counts.trunc.gomp.est <- function(Dx, x.left, x.right, b.start, M.start){...}

## our negative log-likelihood
d.counts.gomp.negLL <- function(par, Dx, x.left, x.right){...}

## usage
fit <- counts.trunc.gomp.est(Dx = my.Dx, x.left = 70, x.right = 87,
                             b.start = 1/9, M.start = 80)
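The bodies of counts.trunc.gomp.est and d.counts.gomp.negLL are elided on the slide. The sketch below shows one way functions with the same arguments could be written. It is not the author's code. It assumes the hazard h(x) = b * exp(b * (x - M)), death counts in single-year age cells, and parameters optimized on the log scale (consistent with the exp(fit$par[...]) calls on the slides). All names ending in .sketch are hypothetical.

```r
## Gompertz CDF with modal age M (assumed parameterization)
pgomp.sketch <- function(x, b, M) {
  1 - exp(-exp(-b * M) * (exp(b * x) - 1))
}

## negative log-likelihood for death counts Dx by single year of age,
## observed only on [x.left, x.right); par = c(log(b), log(M))
negLL.sketch <- function(par, Dx, x.left, x.right) {
  b <- exp(par[1])
  M <- exp(par[2])
  age <- as.numeric(names(Dx))
  ## probability of dying in [age, age + 1), clipped to the window
  lo <- pmax(age, x.left)
  hi <- pmin(age + 1, x.right)
  p.cell <- pgomp.sketch(hi, b, M) - pgomp.sketch(lo, b, M)
  ## probability of dying anywhere inside the window
  p.window <- pgomp.sketch(x.right, b, M) - pgomp.sketch(x.left, b, M)
  -sum(as.numeric(Dx) * (log(p.cell) - log(p.window)))
}

## wrapper around optim(); the log scale keeps b and M positive
fit.sketch <- function(Dx, x.left, x.right, b.start, M.start) {
  optim(par = c(log(b.start), log(M.start)), fn = negLL.sketch,
        Dx = Dx, x.left = x.left, x.right = x.right)
}

## usage on simulated deaths truncated to ages 65-90
set.seed(13)
u <- runif(20000)
x <- log(1 - exp(75 / 10) * log(1 - u)) / (1 / 10)  # Gompertz, b = 1/10, M = 75
x.trunc <- x[x >= 65 & x < 90]
Dx <- table(floor(x.trunc))
fit <- fit.sketch(Dx, x.left = 65, x.right = 90, b.start = 1/9, M.start = 80)
b.hat <- exp(fit$par[1])
M.hat <- exp(fit$par[2])
```

Dividing each cell probability by p.window is the count-data analogue of dividing the density by F(b) - F(a) in the truncated likelihood.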


Without Truncation

N = 1000
set.seed(13)
x <- rgompertz.M(N, b = 1/10, M = 75)
Dx <- table(floor(x))
fit <- counts.trunc.gomp.est(Dx = Dx, x.left = 0, x.right = 200,
                             b.start = 1/9, M.start = 80)
(b.hat = exp(fit$par[1]))
## [1] 0.09572687
(M.hat = exp(fit$par[2]))
## [1] 75.01612


Plot the fit

[Figure: Dx plotted against names(Dx)]


Validating Method with HMD


Our approach

Use HMD death counts for the years and ages for which we have NUMIDENT deaths (1988 to 2005, ages 65 and over)

Fit Gompertz and see if we match
    Mortality rates
    e(65)


Import and prepare HMD data

library(data.table)
## read in and define age and cohort
dt.mx <- fread("~/Documents/hmd/hmd_statistics/death_rates/Mx_1x1/USA.Mx_1x1.txt")
dt.mx[ , x := as.numeric(Age)]
## Warning in eval(jsub, SDenv, parent.frame()): NAs introduced by coercion
## (the NA comes from the open age group "110+", recoded on the next line)
dt.mx[Age == "110+" , x := 110]
dt.mx[, cohort := Year - x]
dt <- fread("~/Documents/hmd/hmd_statistics/deaths/Deaths_1x1/USA.Deaths_1x1.txt")
dt[ , x := as.numeric(Age)]
## Warning in eval(jsub, SDenv, parent.frame()): NAs introduced by coercion
dt[Age == "110+" , x := 110]
dt[, cohort := Year - x]
##
## make array, age x cohort x sex.
dt.long <- melt(dt, measure.vars = c("Male", "Female"), variable.name = "sex", value.name = "Dx")
## make array
## (my.cohorts is not defined on this slide; it must be set before this line)
my.array <- dt.long[cohort %in% my.cohorts, xtabs(Dx ~ x + cohort + sex)]
dimnames(my.array)[[3]] <- c("m", "f") ## for backwards compatibility


Fit HMD

## [1] 1900
## [1] 1901
## [1] 1902
## [1] 1903
## [1] 1904
## [1] 1905
## [1] 1906
## [1] 1907
## [1] 1908
## [1] 1909
## [1] 1910
## [1] 1911
## [1] 1912
## [1] 1913
## [1] 1914
## [1] 1915
## [1] 1916
## [1] 1917
## [1] 1918
## [1] 1919
## [1] 1920
## [1] 1921
## [1] 1922
## [1] 1923


Plot parameter values

[Figure, two panels: b.vec plotted against my.cohorts; M.vec plotted against my.cohorts]


Look at cohort of 1920, pretending we observe only ages 68 to 85

[Figure: "HMD females cohort of 1920, observed and fit"; x-axis: names(Dx); y-axis: Dx]

Very high M is not crazy: Period life table for 2016 has mode at age 88.

e(65)

[Figure: "Male cohort e(65) estimates from truncated Gompertz MLE"; x-axis: cohort; y-axis: e(65)]

For reference, period e(65, 2016) = 18.36.
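The slides do not show how e(65) is derived from the fitted parameters. One possible route, sketched here under the assumption that the hazard is h(x) = b * exp(b * (x - M)), is to integrate the conditional survival curve numerically. The function name ex.gomp.sketch and the parameter values are illustrative only.

```r
## Hypothetical helper: remaining life expectancy at age x0 under a Gompertz
## model, computed as the integral of S(x) / S(x0) from x0 to infinity.
ex.gomp.sketch <- function(x0, b, M) {
  S <- function(x) exp(-exp(-b * M) * (exp(b * x) - 1))  # survival from age 0
  integrate(function(x) S(x) / S(x0), lower = x0, upper = Inf)$value
}

## illustrative parameters (the values used in the simulation slides)
e65 <- ex.gomp.sketch(65, b = 1/10, M = 75)
```

Plugging in each cohort's b.hat and M.hat in place of the illustrative values would give a cohort series of e(65) estimates.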


Mortality rates


Selected Mortality Rates

[Figure: "MLE fits (dashed) vs HMD observations (solid) Males"; x-axis: cohort; y-axis: Mac (log-scale); curves labeled by age, 68 to 98 in steps of 2]

Overall fit is remarkably good.
But we shouldn't rely on it for unobserved ages (see the old-age decline in the upper right).

Period perspective

[Figure: "MLE estimated hazards (dashed) vs HMD (solid), period perspective by age"; x-axis: Year; y-axis: Mxt (log-scale); curves labeled by age, 68 to 98 in steps of 2]


Trying out the method with NUMIDENT
