Generalized linear mixed models (GLMMs) for dependent compound risk models Emiliano A. Valdez, PhD, FSA joint work with H. Jeong, J. Ahn and S. Park University of Connecticut Seminar Talk at Yonsei University Seoul, Korea 15 May 2017 Jeong/Ahn/Park/Valdez (U. of Connecticut) Seminar Talk - Yonsei University 15 May 2017 1 / 21
Generalized linear mixed models (GLMMs) for dependent compound risk models
Emiliano A. Valdez, PhD, FSA
joint work with H. Jeong, J. Ahn and S. Park
University of Connecticut
Seminar Talk at Yonsei University
Seoul, Korea
15 May 2017
Outline
Introduction
  Data structure
The frequency-severity model
  Exponential dispersion family
  Covariates
Compound risk models
  Generalized linear mixed models
Model specifications
Model estimates
  Claim frequency
  Claim severity
  Tweedie models
Model comparison
  Gini index
Conclusion
Appendix A - Singapore
  Insurance market
Data structure
“Policyholder” $i$ is followed over time $t = 1, \ldots, T_i$ years, where $T_i$ is at most 9 years.
Unit of analysis “it”: a registered vehicle insured $i$ over time $t$ (year).
For each “it”, there could be several claims, $k = 0, 1, \ldots, n_{it}$.
Available information: number of claims $n_{it}$, amount of claim $y_{itk}$, exposure $e_{it}$ and covariates (explanatory variables) $x_{it}$.
Covariates often include age, gender, vehicle type, driving history and so forth.
We will model the pair $(n_{it}, c_{it})$, where
\[
c_{it} =
\begin{cases}
\dfrac{1}{n_{it}} \sum_{k=1}^{n_{it}} y_{itk}, & n_{it} > 0, \\
0, & n_{it} = 0,
\end{cases}
\]
is the observed average claim size.
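The definition of the pair $(n_{it}, c_{it})$ can be illustrated with a minimal sketch. The function name and the claim amounts below are hypothetical and are not taken from the data used in the talk.

```python
from typing import Sequence, Tuple


def frequency_and_average_severity(claims: Sequence[float]) -> Tuple[int, float]:
    """Return (n_it, c_it) for one policy-year from its list of claim amounts y_itk."""
    n = len(claims)
    # c_it is the mean claim amount when there is at least one claim, and 0 otherwise.
    c = sum(claims) / n if n > 0 else 0.0
    return n, c


# Hypothetical policy-years: one with two claims, one with no claims.
with_claims = frequency_and_average_severity([1200.0, 800.0])
no_claims = frequency_and_average_severity([])
print(with_claims, no_claims)
```

The zero-claim branch mirrors the second case of the definition, so every policy-year yields a pair even when no claim is observed.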
The frequency-severity model
The traditional approach to predicting/estimating insurance claims distributions:
Cost of Claims = Frequency × Severity
The joint density of the number of claims $N$ and the average claim size $\bar{C}$ can be decomposed as
\[
f(N, \bar{C} \mid x) = f(N \mid x) \times f(\bar{C} \mid N, x)
\]
joint = frequency × conditional severity.
This natural decomposition allows us to investigate/model each component separately, and it does not preclude us from assuming that $N$ and $\bar{C}$ are independent.
For notation, we write $C = N\bar{C}$ for the aggregate claims.
Exponential dispersion family
We say $Y$ comes from an exponential dispersion family (EDF) if its density has the form
\[
f(y) = \exp\left[ \frac{y\theta - \psi(\theta)}{\phi} + c(y; \phi) \right],
\]
where $\theta$ and $\phi$ are location and scale parameters, respectively, and $\psi(\theta)$ and $c(y; \phi)$ are known functions.
The following well-known relations hold for these distributions:
mean: $\mu = E[Y] = \psi'(\theta)$
variance: $Var[Y] = \phi \psi''(\theta) = \phi V(\mu)$
Reproductive EDF: If $Y_1, \ldots, Y_N$ are mutually independent and belong to EDF$(\theta, \phi)$, then their average $\bar{Y}$ also belongs to the family, as EDF$(\theta, \phi/N)$.
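The reproductive property can be checked in closed form for the Gamma case. The sketch below assumes a Gamma distribution parameterized by shape and scale, with mean $\mu$ = shape × scale and dispersion $\phi$ = 1/shape. It relies on two standard facts: a sum of independent Gammas with a common scale adds the shapes, and dividing by $N$ divides the scale. The function names and numbers are illustrative only.

```python
def gamma_from_edf(mu: float, phi: float):
    """Map EDF parameters (mean mu, dispersion phi) to Gamma (shape, scale)."""
    return 1.0 / phi, mu * phi


def edf_from_gamma(shape: float, scale: float):
    """Map Gamma (shape, scale) back to EDF parameters (mean mu, dispersion phi)."""
    return shape * scale, 1.0 / shape


def average_of_iid_gammas(shape: float, scale: float, n: int):
    """Distribution of the average of n i.i.d. Gamma(shape, scale) variables."""
    # The sum is Gamma(n * shape, scale); dividing by n rescales it to scale / n.
    return n * shape, scale / n


mu, phi, n = 2000.0, 0.5, 4  # hypothetical values
shape, scale = gamma_from_edf(mu, phi)
mu_bar, phi_bar = edf_from_gamma(*average_of_iid_gammas(shape, scale, n))
print(mu_bar, phi_bar)  # mean is unchanged; dispersion is phi / n
```

The mean of the average stays at $\mu$ while the dispersion shrinks to $\phi/N$.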
Examples of members of this family
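Normal $N(\mu, \sigma^2)$ with $\theta = \mu$ and $\phi = \sigma^2$, $V(\mu) = 1$
Gamma$(\alpha, \beta)$ with $\theta = -\beta/\alpha$ and $\phi = 1/\alpha$, $V(\mu) = \mu^2$
Inverse Gaussian$(\alpha, \beta)$ with $\theta = -\tfrac{1}{2}\beta^2/\alpha^2$ and $\phi = \beta/\alpha^2$, $V(\mu) = \mu^3$
Poisson$(\mu)$ with $\theta = \log\mu$ and $\phi = 1$, $V(\mu) = \mu$
Binomial$(m, p)$ with $\theta = \log[p/(1-p)]$ and $\phi = 1$, $V(\mu) = \mu(1 - \mu/m)$
Negative Binomial$(r, p)$ with $\theta = \log(1-p)$ and $\phi = 1$, $V(\mu) = \mu(1 + \mu/r)$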
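The relations $\mu = \psi'(\theta)$ and $V(\mu) = \psi''(\theta)$ can be verified numerically for two of these members. The sketch below assumes the standard cumulant functions $\psi(\theta) = e^{\theta}$ for the Poisson and $\psi(\theta) = -\log(-\theta)$ for the Gamma, and uses central finite differences. The helper name and step size are illustrative choices.

```python
import math


def first_and_second_derivative(psi, theta: float, h: float = 1e-3):
    """Central finite-difference approximations of psi'(theta) and psi''(theta)."""
    d1 = (psi(theta + h) - psi(theta - h)) / (2.0 * h)
    d2 = (psi(theta + h) - 2.0 * psi(theta) + psi(theta - h)) / (h * h)
    return d1, d2


def psi_poisson(theta: float) -> float:
    return math.exp(theta)


def psi_gamma(theta: float) -> float:
    return -math.log(-theta)  # defined for theta < 0


mu = 2.5  # hypothetical mean
# Poisson: theta = log(mu), so psi' should be mu and psi'' should be V(mu) = mu.
mean_pois, var_fn_pois = first_and_second_derivative(psi_poisson, math.log(mu))
# Gamma: theta = -1/mu, so psi' should be mu and psi'' should be V(mu) = mu^2.
mean_gam, var_fn_gam = first_and_second_derivative(psi_gamma, -1.0 / mu)
print(mean_pois, var_fn_pois, mean_gam, var_fn_gam)
```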
The link function and linear predictors
The linear predictor is a function of the covariates, or predictor variables.
The link function connects the mean of the response to this linear predictor in the form $\eta = g(\mu)$.
$g$ is called the link function, where $\mu = E(Y)$ is the mean of the response and $\eta$ is the linear predictor.
The transformed mean follows a linear model:
\[
\eta = x_i'\beta = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}
\]
with $p$ predictor variables (or covariates).
Inverting the link recovers the mean: $\mu = g^{-1}(x_i'\beta) = g^{-1}(\eta)$.
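As a small illustration of the link and its inverse, the sketch below assumes a log link, a common choice for claim counts; the slides do not say which link was used. The coefficients and covariate values are hypothetical.

```python
import math
from typing import Sequence


def linear_predictor(x: Sequence[float], beta: Sequence[float]) -> float:
    """eta = beta_0 + beta_1 * x_1 + ... + beta_p * x_p, with beta[0] the intercept."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))


def log_link(mu: float) -> float:
    """g(mu) = log(mu)."""
    return math.log(mu)


def inverse_log_link(eta: float) -> float:
    """g^{-1}(eta) = exp(eta)."""
    return math.exp(eta)


beta = [-2.0, 0.3, 0.5]  # hypothetical coefficients (intercept first)
x = [1.0, 0.0]           # hypothetical dummy-coded covariates
eta = linear_predictor(x, beta)
mu = inverse_log_link(eta)
print(eta, mu)
```

Applying `log_link` to `mu` returns `eta`, which is the sense in which the link is inverted.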
Covariates
Vehicle Type: automotive (A) or others (O).
Marital status: married (M), single (S), others (O)
Gender: male (M) or female (F).
Coverage type: Comprehensive (Comp) or Others
Age: in years, grouped into 3 categories: Young (Y), Middle Aged (M), Old (O)
Compound risk models
See Garrido, et al. (2016)
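In the case of independence, we have (with $C = N\bar{C}$ the aggregate claims and $\bar{C}$ the average claim size):
Mean: $E(C) = E(N)E(\bar{C})$
Variance: $Var(C) = E(N)Var(\bar{C}) + Var(N)[E(\bar{C})]^2$
If we do not assume independence, we have
Mean: $E(C) = E[N E(\bar{C} \mid N)] \neq E(N)E(\bar{C})$
Variance: $Var(C) = E[N Var(\bar{C} \mid N)] + Var[N E(\bar{C} \mid N)]$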
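The difference between the two mean formulas can be seen by exact enumeration over a small claim-count distribution. Everything in this sketch is hypothetical: the probabilities, the base severity of 1000, and the form `base * exp(gamma * n)` for the conditional mean severity given $N = n$, which is assumed to be defined by that formula for every $n$. Setting `gamma = 0` removes the dependence.

```python
import math

# Hypothetical distribution of the claim count N.
pmf = {0: 0.7, 1: 0.2, 2: 0.1}


def conditional_mean_severity(n: int, gamma: float, base: float = 1000.0) -> float:
    """Hypothetical E(average severity | N = n); gamma = 0 means no dependence on N."""
    return base * math.exp(gamma * n)


def aggregate_mean(gamma: float) -> float:
    """E[N * E(severity | N)], the mean of aggregate claims under dependence."""
    return sum(p * n * conditional_mean_severity(n, gamma) for n, p in pmf.items())


def product_of_means(gamma: float) -> float:
    """E(N) * E(severity), which is valid only under independence."""
    mean_n = sum(p * n for n, p in pmf.items())
    mean_sev = sum(p * conditional_mean_severity(n, gamma) for n, p in pmf.items())
    return mean_n * mean_sev


independent = (aggregate_mean(0.0), product_of_means(0.0))
dependent = (aggregate_mean(-0.5), product_of_means(-0.5))
print(independent, dependent)
```

With `gamma = 0` the two quantities coincide; with a nonzero `gamma` they differ, which is the inequality stated for the dependent case.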
Generalized linear mixed models
GLMMs extend GLMs by allowing for random, or subject-specific, effects in the linear predictor, which reflects the idea that there is natural heterogeneity across subjects. For insurance applications, our subject $i$ is usually a policyholder observed over $T_i$ periods.
Given the vector $b_i$ of random effects for subject $i$, $Y_{it}$ belongs to the EDF:
\[
f(y_{it} \mid b_i) = \exp\left[ \frac{y_{it}\theta_{it} - \psi(\theta_{it})}{\phi} + c(y_{it}; \phi) \right]
\]
with the following conditional relations:
mean: $\mu_{it} = E[Y_{it} \mid b_i] = \psi'(\theta_{it})$
variance: $Var[Y_{it} \mid b_i] = \phi \psi''(\theta_{it}) = \phi V(\mu_{it})$
Link function: $g(\mu_{it}) = x_{it}'\beta + z_{it}'b_i$
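To see what a random effect does to the marginal distribution, consider a deliberately simplified case. The sketch assumes a Poisson count with a log link and a single normal random intercept $b_i \sim N(0, \sigma^2)$, rather than the negative binomial baseline used in the talk. The closed forms follow from the lognormal mean and the law of total variance. The function names and numbers are illustrative.

```python
import math


def conditional_mean(eta_fixed: float, b: float) -> float:
    """mu_it = exp(x'beta + b_i): inverse log link applied to fixed part plus intercept."""
    return math.exp(eta_fixed + b)


def marginal_moments(eta_fixed: float, sigma: float):
    """Marginal mean and variance of a Poisson count when b_i ~ N(0, sigma^2)."""
    # E[exp(b)] = exp(sigma^2 / 2) for a mean-zero normal b.
    mean = math.exp(eta_fixed + 0.5 * sigma ** 2)
    # Var(N) = E[Var(N | b)] + Var(E[N | b]) = mean + (exp(sigma^2) - 1) * mean^2.
    variance = mean + (math.exp(sigma ** 2) - 1.0) * mean ** 2
    return mean, variance


eta_fixed = math.log(0.1)  # hypothetical fixed-effect part x'beta
mean0, var0 = marginal_moments(eta_fixed, 0.0)  # no random effect
mean1, var1 = marginal_moments(eta_fixed, 0.8)  # with a random intercept
print(mean0, var0, mean1, var1)
```

Without the random effect the variance equals the mean; with it the marginal variance exceeds the mean, so the subject-specific effect by itself produces overdispersion.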
Model specifications
We calibrated models with the following specifications:
For the count of claims (frequency): negative binomial for our baseline model
  more suitable than the traditional Poisson model because it can handle overdispersion and individual unobserved effects
For the severity of claims: Gamma (due to its reproductive property)
For the random effects for GLMMs: Normal with mean zero and unknown variances
  different variances for frequency and claim severity
The Tweedie model is a special case of the GLM family and it has a mass at zero (avoids modeling frequency and severity separately).
  in some cases, its density has no explicit form, but the variance function has the form $V(\mu) = \mu^p$, e.g. compound Poisson-Gamma with $1 < p < 2$.
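The link between the compound Poisson-Gamma model and the Tweedie variance function can be made concrete. The sketch below uses the standard parameter mapping, which is stated here as an assumption rather than taken from the slides: with Poisson rate $\lambda$ and Gamma shape $\alpha$ and scale $\gamma$, the mean is $\mu = \lambda\alpha\gamma$, the power is $p = (\alpha+2)/(\alpha+1)$ and the dispersion is $\phi = \lambda^{1-p}(\alpha\gamma)^{2-p}/(2-p)$. The function name and parameter values are hypothetical.

```python
import math


def compound_poisson_gamma_to_tweedie(lam: float, shape: float, scale: float):
    """Map (Poisson rate, Gamma shape, Gamma scale) to Tweedie (mu, phi, p)."""
    mu = lam * shape * scale
    p = (shape + 2.0) / (shape + 1.0)  # always strictly between 1 and 2
    phi = lam ** (1.0 - p) * (shape * scale) ** (2.0 - p) / (2.0 - p)
    return mu, phi, p


lam, shape, scale = 4.0, 2.0, 3.0  # hypothetical parameters
mu, phi, p = compound_poisson_gamma_to_tweedie(lam, shape, scale)

# Direct variance of a compound Poisson sum: lam * E[Y^2] for Gamma severities Y.
variance_direct = lam * shape * scale ** 2 * (1.0 + shape)
variance_tweedie = phi * mu ** p

# Probability of no claims at all, i.e. the point mass at zero.
prob_zero = math.exp(-lam)
print(mu, phi, p, variance_direct, variance_tweedie, prob_zero)
```

The two variance computations agree, and `prob_zero` is the mass at zero that lets a single Tweedie model stand in for the separate frequency and severity parts.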
Negative binomial model for claim frequency
[Table of parameter estimates with columns NB GLM and NB GLMM; values not recoverable from the transcript]
Using the Gini index to compare models
See Frees, et al. (2016).
In computing the Gini index, we employ the following steps:
1 Sort the observed claims $\{Y_i \mid i = 1, \ldots, M\}$ according to the risk scores $\{S_i \mid i = 1, \ldots, M\}$ (in our case, the estimated value from each method) in ascending order. That is, calculate $\{R_i \mid i = 1, \ldots, M\}$, where each $R_i$ is the rank of $S_i$ between 1 and $M$, so that $R_1 = \arg\max(S_i)$.
2 Compute
\[
F_{\mathrm{score}}(m/M) = \sum_{i=1}^{M} 1(R_i \le m)/M, \qquad
F_{\mathrm{loss}}(m/M) = \frac{\sum_{i=1}^{M} Y_i \, 1(R_i \le m)}{\sum_{i=1}^{M} Y_i}
\]
for each $m = 1, \ldots, M$.
3 Plot $F_{\mathrm{score}}(m/M)$ on the x-axis and $F_{\mathrm{loss}}(m/M)$ on the y-axis.
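The steps above can be sketched in a few lines. Two points are assumptions rather than content of the slides: rank 1 is given to the smallest score (ascending order), and the curve is summarized by the commonly used Gini index, twice the area between the 45-degree line and the curve, computed here with the trapezoid rule. The function names and the scores and losses are hypothetical.

```python
from typing import List, Sequence, Tuple


def ordered_lorenz_curve(scores: Sequence[float],
                         losses: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return the points (F_score(m/M), F_loss(m/M)) for m = 0, 1, ..., M."""
    M = len(scores)
    # Indices of observations sorted by risk score, ascending.
    order = sorted(range(M), key=lambda i: scores[i])
    total_loss = sum(losses)
    f_score, f_loss = [0.0], [0.0]  # start the curve at the origin
    cumulative = 0.0
    for m, i in enumerate(order, start=1):
        cumulative += losses[i]
        f_score.append(m / M)
        f_loss.append(cumulative / total_loss)
    return f_score, f_loss


def gini_index(f_score: Sequence[float], f_loss: Sequence[float]) -> float:
    """Twice the area between the 45-degree line and the curve (trapezoid rule)."""
    area_under_curve = sum(
        (f_score[k] - f_score[k - 1]) * (f_loss[k] + f_loss[k - 1]) / 2.0
        for k in range(1, len(f_score))
    )
    return 1.0 - 2.0 * area_under_curve


scores = [0.1, 0.4, 0.2, 0.9]        # hypothetical risk scores
losses = [0.0, 300.0, 100.0, 600.0]  # hypothetical observed claims
f_score, f_loss = ordered_lorenz_curve(scores, losses)
gini = gini_index(f_score, f_loss)
print(f_score, f_loss, gini)
```

A score that ranks the costly risks last pushes the curve below the 45-degree line and yields a larger index, so a higher value suggests better separation of risks.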
Comparing the Gini index for the two-part models
Comparing the Gini index for the Tweedie models
Concluding remarks
We are still in the early stages of this work.
In particular, this is an early exploration of the promise of using GLMMs to account for dependency between frequency and severity.
This work hopes to extend the literature on this type of dependency:
  H. Jeong (2016), work on "Simple Compound Risk Model with Dependent Structure"
  Garrido, et al. (2016)
  Frees and Wang (2006), Frees, et al. (2011)
  Czado (2012)
  Shi, et al. (2015)
Further work is needed to measure the financial impact of using GLMMs versus other models:
  risk classification
  a posteriori prediction and applications in bonus-malus systems
  modified insurance coverage and reinsurance applications
Appendix A - Singapore
A bit about Singapore 1
Singa Pura: Lion city.
Location: 136.8 km N of the equator, between longitudes 103 deg 38' E and 104 deg 06' E [islands between Malaysia and Indonesia].
Size: very tiny [647.5 sq km, of which 10 sq km is water].
Climate: very hot and humid [23-30 deg Celsius].
Ethnic groups: Chinese 74%, Malay 13%, Indian 9%.
Languages: Chinese, Malay, Tamil, English.
1 Updated: February 2010
Insurance market in Singapore
As of 2009, the market consists of 45 general insurers, 8 life insurers and 7 writing both; 17 general reinsurers, 2 life reinsurers and 7 writing both. Singapore is also the largest captive domicile in Asia, with 59 registered captives.
The Monetary Authority of Singapore (MAS) is the supervisory/regulatory body; it also helps promote Singapore as an international financial center.