Hidden Regular Variation in Joint Tail Modeling with Likelihood Inference via the MCEM Algorithm
Dan Cooley, Grant Weller
Department of Statistics, Colorado State University
Funding: NSF-DMS-0905315; Weather and Climate Impacts Assessment Program (NCAR); 2011-2012 SAMSI UQ program
Motivating Example: Daily Air Pollution, Leeds UK
[Figure: scatterplots of daily maximum NO2 against SO2 at Leeds, UK. Left panel: original scale, with regions A1, A2, A3 marked. Right panel: Frechet scale.]
Data exhibit asymptotic independence (Heffernan and Tawn, 2004).
Outline
• Hidden Regular Variation
• Sum Characterization of HRV
• Estimation via MCEM
• Application: air pollution data
When Multivariate Regular Variation Fails
Multivariate Regular Variation:
$$ t\,P\left[\frac{R}{b(t)} > r,\ W \in B\right] \xrightarrow{v} r^{-\alpha} H(B). $$
In some cases, the angular measure H degenerates on some regions of N, masking sub-asymptotic dependence features.
Example: asymptotic independence in d = 2:
$$ \lim_{z \to z^{+}} P(Z_1 > z \mid Z_2 > z) = 0. $$
• H consists of point masses at {0} and {1} (using ‖ · ‖1)
• e.g. bivariate Gaussian with correlation ρ < 1
Normalization by b(t) kills off sub-asymptotic dependence structure.
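The vanishing conditional probability can be seen in a small simulation. The following is a minimal sketch, not taken from the talk: it draws bivariate Gaussian pairs with an assumed ρ = 0.5 and estimates P(Z1 > z | Z2 > z) at increasingly high marginal quantiles. The function name `conditional_exceedance`, the sample size, and the quantile levels are illustrative choices. Because the probability depends only on ranks, the thresholds are set on the Gaussian scale rather than after a transformation to Frechet margins.

```python
import random
from statistics import NormalDist


def conditional_exceedance(rho, levels, n=400_000, seed=1):
    """Estimate P(X1 > q | X2 > q) for a bivariate Gaussian at each quantile level."""
    rng = random.Random(seed)
    thresholds = [NormalDist().inv_cdf(p) for p in levels]
    exceed = [0] * len(levels)  # counts of X2 > q
    joint = [0] * len(levels)   # counts of X1 > q and X2 > q
    s = (1.0 - rho * rho) ** 0.5
    for _ in range(n):
        x2 = rng.gauss(0.0, 1.0)
        x1 = rho * x2 + s * rng.gauss(0.0, 1.0)  # gives corr(x1, x2) = rho
        for i, q in enumerate(thresholds):
            if x2 > q:
                exceed[i] += 1
                if x1 > q:
                    joint[i] += 1
    return [j / e for j, e in zip(joint, exceed)]


levels = [0.9, 0.99, 0.999]
chi = conditional_exceedance(rho=0.5, levels=levels)
for p, c in zip(levels, chi):
    print(f"quantile {p}: estimated P(Z1 > z | Z2 > z) = {c:.3f}")
```

The estimates shrink as the quantile level rises, which is the finite-sample signature of asymptotic independence even though the components are clearly dependent at moderate levels.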
Hidden Regular Variation (Resnick, 2002)
A regularly varying random vector Z exhibits hidden regular variation on a subcone C0 ⊂ C if ν(C0) = 0 and there exists {b0(t)}, b0(t) → ∞ with b0(t)/b(t) → 0, such that
$$ t\,P\left[\frac{Z}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) $$
as t → ∞ in M+(C0).
• Scaling: ν0(tA) = t^{−α0} ν0(A) for measurable A ⊂ C0, with α0 ≥ α
• ν0 is Radon but not necessarily finite.
Equivalently,
$$ t\,P\left[\frac{R}{b_0(t)} > r,\ W \in B\right] \xrightarrow{v} r^{-\alpha_0} H_0(B) $$
for B a Borel set of N0 = C0 ∩ N (e.g. N0 = (0,1)).
H0 is called the hidden angular measure.
Example: bivariate Gaussian
Consider Z with Frechet margins and Gaussian dependence, ρ ∈ [0,1). Recall ν places mass only on the axes of C.
Define η = (1+ρ)/2, the coefficient of tail dependence (Ledford and Tawn, 1997).
• Z exhibits hidden regular variation of order α0 = 1/η
• The density of the hidden measure ν0 can be written
$$ \nu_0(dr \times dw) = \frac{1}{\eta}\, r^{-(1 + 1/\eta)}\,dr \times \underbrace{\frac{1}{4\eta}\,\{w(1-w)\}^{-1/(2\eta) - 1}\,dw}_{H_0(dw)} $$
H0 is infinite on (0,1).
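To make the divergence concrete, the sketch below numerically integrates the hidden angular density over [ε, 1 − ε] for shrinking ε. It is an illustration only: the helper names `h0_density` and `h0_mass`, the midpoint rule, the grid size, and the choice η = 0.75 are assumptions made here, not part of the talk.

```python
def h0_density(w, eta):
    """Hidden angular density (1 / (4 eta)) * {w (1 - w)}^(-1/(2 eta) - 1) on (0, 1)."""
    return (w * (1.0 - w)) ** (-1.0 / (2.0 * eta) - 1.0) / (4.0 * eta)


def h0_mass(eps, eta, n_grid=200_000):
    """Midpoint-rule approximation of H0([eps, 1 - eps])."""
    width = (1.0 - 2.0 * eps) / n_grid
    total = 0.0
    for i in range(n_grid):
        total += h0_density(eps + (i + 0.5) * width, eta)
    return total * width


eta = 0.75  # corresponds to rho = 0.5
masses = {eps: h0_mass(eps, eta) for eps in (0.1, 0.01, 0.001)}
for eps, mass in masses.items():
    print(f"eps = {eps}: H0([eps, 1 - eps]) is approximately {mass:.2f}")
```

Near w = 0 the density behaves like a constant times w^(−1 − 1/(2η)), which is not integrable at the origin, so the mass keeps growing as ε shrinks.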
Tail Equivalence (Maulik and Resnick, 2004)
Two random vectors X and Y are tail equivalent on the cone C∗ if
$$ t\,P\left[\frac{X}{b^{*}(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \quad\text{and}\quad t\,P\left[\frac{Y}{b^{*}(t)} \in \cdot\,\right] \xrightarrow{v} c\,\nu(\cdot) $$
as t → ∞ in M+(C∗) for c > 0.
‘Extremes of X and Y samples taken in C∗ will have the same asymptotic properties.’
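Mixture Characterization of HRV (Maulik and Resnick, 2004)
Suppose Z is regularly varying on C with hidden regular variation on C0:
$$ t\,P\left[\frac{Z}{b(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \ \text{ in } M_+(C) \quad\text{and}\quad t\,P\left[\frac{Z}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{ in } M_+(C_0) $$
with ν(C0) = 0 and b0(t)/b(t) → 0 as t → ∞.
• Let Y be RV(α) with support only on C \ C0.
• Let V = R0θ0, where R0 has tail P(R0 > t) = 1/b0←(t) and θ0 ∼ H0, with H0 finite.
• Then Z is tail equivalent to a mixture of Y and V on both C and C0.
This works because Y is supported only on C \ C0, so it does not interfere with the hidden regular variation on C0.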
[Figure: three scatterplots on axes from 0 to 100: Y (Y[,1] against Y[,2]), V (V[,1] against V[,2]), and YV (YV[,1] against YV[,2]).]
Construction of Y + V
Define Y = RW, with P(R > r) ∼ 1/b←(r) and W drawn from the limiting angular measure H. Notice that Y has support only on C \ C0.
Let V ∈ [0,∞)^d be regularly varying on C0 with limit measure ν0:
$$ t\,P\left[\frac{V}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{ in } M_+(C_0). $$
Further assume that on C,
$$ P(\|V\| > r) \sim c\,r^{-\alpha^{*}} $$
as r → ∞, with c > 0 and
$$ \alpha^{*} > \alpha \vee (\alpha_0 - \alpha). $$
Assume R, W, V are independent.
Tail Equivalence Result
Then
$$ t\,P\left[\frac{Y + V}{b(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \ \text{ in } M_+(C) $$
(Jessen and Mikosch, 2006).
Furthermore, tail equivalence (Maulik and Resnick, 2004) also holds on C0:
Theorem. With Y and V as defined above,
$$ t\,P\left[\frac{Y + V}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{ in } M_+(C_0). $$
View Z as a sum of ‘first-order’ Y and ‘second-order’ V.
The sum Y + V is tail equivalent to Z on both C and C0.
Simulation when ν0 is finite.
[Figure: three scatterplots on axes from 0 to 1200, arranged as Y (Y1 against Y2) + V (V1 against V2) = Y + V (Y1 + V1 against Y2 + V2).]
No point falls exactly on an axis.
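A simulation of this kind can be sketched as follows. Everything specific here is an assumption made for illustration, not the setup behind the figure: d = 2, asymptotic independence so that W takes the values 0 and 1 with equal probability, R standard Pareto with α = 1, R0 Pareto with α0 = 2, and W0 drawn from a Beta(2, 2) density standing in for a finite hidden angular measure. With these choices ‖V‖ has tail index α∗ = 2, which satisfies α∗ > α ∨ (α0 − α) = 1.

```python
import random


def simulate_sum(n, alpha=1.0, alpha0=2.0, seed=42):
    """Simulate Y, V and Y + V in d = 2 under the illustrative assumptions above."""
    rng = random.Random(seed)
    ys, vs, sums = [], [], []
    for _ in range(n):
        # First-order part: Pareto radius, angle on one of the two axes.
        r = (1.0 - rng.random()) ** (-1.0 / alpha)
        w = 1.0 if rng.random() < 0.5 else 0.0
        y = (r * w, r * (1.0 - w))
        # Second-order part: lighter-tailed Pareto radius, angle inside (0, 1).
        r0 = (1.0 - rng.random()) ** (-1.0 / alpha0)
        w0 = rng.betavariate(2.0, 2.0)  # hypothetical finite hidden angular measure
        v = (r0 * w0, r0 * (1.0 - w0))
        ys.append(y)
        vs.append(v)
        sums.append((y[0] + v[0], y[1] + v[1]))
    return ys, vs, sums


ys, vs, sums = simulate_sum(5000)
on_axis_y = sum(1 for y in ys if y[0] == 0.0 or y[1] == 0.0)
on_axis_sum = sum(1 for s in sums if s[0] == 0.0 or s[1] == 0.0)
print(f"Y points on an axis: {on_axis_y} of {len(ys)}")
print(f"Y + V points on an axis: {on_axis_sum} of {len(sums)}")
```

Every Y lies on an axis, while adding V moves each point into the interior of the cone.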
Infinite Measure Example: Bivariate Gaussian
Z has Frechet margins and Gaussian dependence (ρ < 1). Recall: H0 is infinite on N0 = (0,1).
Poses difficulty near the axes of C.
Proposed construction of V:
• Restrict to the subcone C0^ε of C0 whose angular components lie in N0^ε = [ε, 1 − ε], for ε ∈ (0,1/2).
• Simulate W0 from the probability density H0(dw)/H0(N0^ε)
• Let R0 follow a Pareto distribution with tail index α0 = 1/η
• V = [R0W0, R0(1 − W0)]^T
Y + V is tail equivalent to Z on C and C0^ε.
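The construction of V listed above can be sketched in a few lines. This is one possible implementation under stated assumptions, not the authors' code: W0 is drawn by rejection sampling with a uniform proposal (any valid sampler would do), R0 is an exact Pareto with unit scale, and the names `sample_w0` and `simulate_v` as well as ε = 0.1 and η = 0.75 are illustrative.

```python
import random


def sample_w0(eps, eta, rng):
    """Rejection-sample W0 with density proportional to {w(1-w)}^(-1/(2 eta) - 1) on [eps, 1 - eps]."""
    power = -1.0 / (2.0 * eta) - 1.0
    bound = (eps * (1.0 - eps)) ** power  # the unnormalized density peaks at the endpoints
    while True:
        w = rng.uniform(eps, 1.0 - eps)
        if rng.random() * bound <= (w * (1.0 - w)) ** power:
            return w


def simulate_v(n, eps, eta, seed=7):
    """Draw n realizations of V = [R0 W0, R0 (1 - W0)]."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        w0 = sample_w0(eps, eta, rng)
        r0 = (1.0 - rng.random()) ** (-eta)  # Pareto with tail index alpha0 = 1 / eta
        out.append((r0 * w0, r0 * (1.0 - w0)))
    return out


v = simulate_v(2000, eps=0.1, eta=0.75)
angles = [v1 / (v1 + v2) for v1, v2 in v]
print(f"min angle = {min(angles):.3f}, max angle = {max(angles):.3f}")
```

The uniform proposal becomes inefficient for small ε because the density concentrates near the endpoints; a proposal tailored to that shape would be preferable in that case.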
Sum representation of bivariate Gaussian
Example with ρ = 0.5 (n = 2500):
[Figure: three scatterplots on axes from 0 to 800: Z (z1 against z2), Y + V with ε = 0.01, and Y + V with ε = 0.1 (Y1 + V1 against Y2 + V2).]
For any set completely contained in C0^ε we achieve the correct limit measure ν0.
Choice of ε involves a trade-off between:
• Size of the subcone on which tail equivalence holds
• Threshold at which Y + V is a useful approximation
• Biases due to choice of ε calculated.
Inference via the EM Algorithm
Observe realizations from Z, tail equivalent to Y + V. Assume parametric forms and perform ML inference via EM.
If we assume Z = Y + V,
$$
\begin{aligned}
\log f(z;\theta) &= \int \log f(z,y,v;\theta)\, f(y,v \mid z;\theta^{(k)})\,dy\,dv - \int \log f(y,v \mid z;\theta)\, f(y,v \mid z;\theta^{(k)})\,dy\,dv \\
&=: Q(\theta \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}).
\end{aligned}
$$
Here: Z and Y + V are only tail equivalent; θ governs tail behavior of Y and V. Requires a modification of the EM setup.
EM for Extremes
Consider distributions with densities gY(y; θ) and gV(v; θ) which are tail equivalent to the true distributions; i.e.,
$$ g_Y(y;\theta) \cong f_Y(y;\theta) \ \text{ for } \|y\| > r_Y^{*}, \qquad g_V(v;\theta) \cong f_V(v;\theta) \ \text{ for } \|v\| > r_V^{*}. $$
Complete likelihood is based on limiting Poisson point processes for Y and V.
• E step: expectation is taken with respect to g(y,v|z; θ).
• M step: maximization is taken over only ‘large’ y and v.
We show
$$ H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}) \ge 0 $$
using Jensen’s inequality.
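For reference, the Jensen step in the standard EM setting (assuming Z = Y + V exactly, with conditional density f(y, v | z; θ)) can be written out as below. This is the textbook argument; the modified algorithm applies an analogous bound with g in place of f, and those details are not reproduced on the slide.

```latex
\begin{aligned}
H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)})
  &= -\int \log\!\left[\frac{f(y,v \mid z;\theta)}{f(y,v \mid z;\theta^{(k)})}\right]
      f(y,v \mid z;\theta^{(k)})\,dy\,dv \\
  &\ge -\log \int \frac{f(y,v \mid z;\theta)}{f(y,v \mid z;\theta^{(k)})}\,
      f(y,v \mid z;\theta^{(k)})\,dy\,dv
      && \text{(Jensen, since $-\log$ is convex)} \\
  &= -\log \int f(y,v \mid z;\theta)\,dy\,dv \;\ge\; 0 ,
\end{aligned}
```

where the last integral runs over the support of f(· | z; θ^(k)) and is therefore at most 1. Combined with log f(z; θ) = Q(θ | θ^(k)) − H(θ | θ^(k)), this gives log f(z; θ) − log f(z; θ^(k)) ≥ Q(θ | θ^(k)) − Q(θ^(k) | θ^(k)), so any θ that increases Q also increases the likelihood.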
MCEM
Natural framework for MCEM.
At the E step of the (k + 1)th iteration, simulate from
$$ g_Y(y;\theta^{(k)})\, g_V(z - y;\theta^{(k)}) \propto g(y,v \mid z;\theta^{(k)}) $$
for all z and use the simulated realizations to compute
$$ Q_m(\theta \mid \theta^{(k)}) = \frac{1}{m} \sum_{j=1}^{m} \ell(\theta; z, y_j, v_j), $$
employing Poisson point process likelihoods for large realizations of Y and V.
Key idea: likelihood only depends on θ for ‘large’ y and v!
Uncertainty estimates obtained via Louis’ method.
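The simulate-then-average structure of MCEM can be illustrated on a deliberately simple toy problem. The sketch below is not the model from the talk: it uses light-tailed exponential components instead of regularly varying ones, an ordinary complete-data likelihood instead of a Poisson point process likelihood, no thresholds r∗Y or r∗V, and it treats the rate of V as known. All names and settings (`sample_y_given_z`, `mcem_rate`, the rates, m, the number of iterations) are hypothetical.

```python
import math
import random


def sample_y_given_z(z, rate_y, rate_v, rng):
    """Draw y given y + v = z; the conditional density is proportional to
    exp(-rate_y * y) * exp(-rate_v * (z - y)) on [0, z], a truncated exponential."""
    c = rate_y - rate_v
    u = rng.random()
    if abs(c) < 1e-12:
        return u * z  # uniform on [0, z] when the two rates agree
    return -math.log1p(u * math.expm1(-c * z)) / c  # inverse-CDF draw


def mcem_rate(zs, rate_v, rate_init, m=10, n_iter=30, seed=3):
    """Toy MCEM for the rate of Y when Z = Y + V and the rate of V is known."""
    rng = random.Random(seed)
    rate = rate_init
    for _ in range(n_iter):
        # Monte Carlo E step: m draws of the latent y for every observed z.
        total_y = 0.0
        for z in zs:
            for _ in range(m):
                total_y += sample_y_given_z(z, rate, rate_v, rng)
        # M step: up to terms free of the rate, Q_m = n * log(rate) - rate * total_y / m,
        # which is maximized at rate = n * m / total_y.
        rate = len(zs) * m / total_y
    return rate


data_rng = random.Random(0)
true_rate_y, rate_v = 1.0, 3.0
zs = [data_rng.expovariate(true_rate_y) + data_rng.expovariate(rate_v) for _ in range(2000)]
estimate = mcem_rate(zs, rate_v=rate_v, rate_init=0.3)
print(f"MCEM estimate of the rate of Y: {estimate:.3f}")
```

The closed-form M step is a convenience of the toy model; in general Q_m has to be maximized numerically.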
Example w/ Infinite Hidden Measure
Simulate n = 10000 realizations from a bivariate Gaussian distribution with correlation ρ, transform marginals to unit Frechet.
Tail equivalent on C and C0^ε to Y + V, where V has angular measure
$$ H_0(dw) = \frac{1}{4\eta}\,\{w(1-w)\}^{-1/(2\eta) - 1}\,dw. $$
Aim: estimate η = (1 + ρ)/2 from the ε-restricted model.
• Must select both ε and r∗V
• Trade-off in finite sample estimation problems
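As a point of comparison only, η can also be estimated with a much simpler tool than the ε-restricted MCEM fit: a Hill estimator applied to the structure variable T = min(Z1, Z2), in the spirit of Ledford and Tawn (1997). The sketch below swaps in that technique; it is not the procedure described here. The sample size, the number of upper order statistics k, and the function names are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def to_unit_frechet(x, std_normal=NormalDist()):
    """Map a standard normal value to the unit Frechet scale: z = -1 / log(Phi(x))."""
    return -1.0 / math.log1p(-std_normal.cdf(-x))  # uses Phi(x) = 1 - Phi(-x)


def hill_eta(n, rho, k, seed=11):
    """Hill estimate of eta from the k largest values of T = min(Z1, Z2)."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    t = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rho * x1 + s * rng.gauss(0.0, 1.0)
        # The marginal transform is increasing, so min can be taken before transforming.
        t.append(to_unit_frechet(min(x1, x2)))
    t.sort(reverse=True)
    threshold = t[k]  # the (k + 1)-th largest value of T
    return sum(math.log(t[i] / threshold) for i in range(k)) / k


eta_hat = hill_eta(n=100_000, rho=0.5, k=1000)
print(f"Hill estimate of eta: {eta_hat:.3f} (target (1 + rho) / 2 = 0.75)")
```

For Gaussian dependence the joint tail carries a slowly varying factor, so estimates at finite thresholds tend to sit somewhat below (1 + ρ)/2.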
Infinite Hidden Measure Results
Shown for η = 0.75 (ρ = 0.5)
[Figure, left panel: mean estimates of η plotted against rV (22, 45, 100, 200) for ε = 0.185, 0.2, 0.215, 0.23. Right panel: coverage rates of 95% CIs against rV for the same values of ε.]
Air Pollution Data
[Figure: scatterplots of daily maximum NO2 against SO2 at Leeds, UK. Left panel: original scale, with regions A1, A2, A3 marked. Right panel: Frechet scale.]
• Strong evidence for asymptotic independence
• Aim: estimate risk set probabilities
Competing Approaches
Examine three modeling approaches:
1. Assume asymptotic dependence; i.e. that ν(·) places mass on the entire cone C. Fit a bivariate logistic angular dependence model to the largest 10% of observations (in terms of L1 norm). Estimate β = 0.713.
2. Assume asymptotic independence and ignore any possible hidden regular variation.
3. Assume asymptotic independence and hidden regular variation. Fit the ε-restricted infinite hidden measure model via MCEM. Select r∗V = 7.5 and ε = 0.3. Estimate η = 0.748.
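One way to see why the choice of approach matters for tail risk is to compare how fast a joint exceedance probability P(Z1 > z, Z2 > z) decays on the Frechet scale. Under asymptotic dependence it decays like z^(−1); under hidden regular variation like z^(−1/η), ignoring slowly varying factors; and for exactly independent components (η = 1/2, included here only as a reference case) like z^(−2). The sketch below computes the factor by which each probability changes when z grows tenfold. The helper `decay_factor` and the tenfold scale are illustrative; no constants or fitted probabilities from the talk are reproduced.

```python
def decay_factor(tail_index, scale=10.0):
    """Factor applied to a probability proportional to z**(-tail_index) when z is multiplied by scale."""
    return scale ** (-tail_index)


eta = 0.748  # estimate reported for approach 3
factors = {
    "asymptotic dependence (index 1)": decay_factor(1.0),
    "hidden regular variation (index 1/eta)": decay_factor(1.0 / eta),
    "exact independence (index 2)": decay_factor(2.0),
}
for label, factor in factors.items():
    print(f"{label}: probability multiplied by {factor:.4f} per tenfold increase in z")
```

The hidden regular variation rate falls between the two extremes, which is why extrapolations from the three approaches diverge as the risk region moves further into the tail.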
This work introduces a sum representation for regularly varying random vectors possessing hidden regular variation.
• Useful representation for finite samples
• Asymptotically justified by tail equivalence result
• Difficulty arises when H0 is infinite: restrict to a compact cone to simulate V
• Likelihood estimation via modified MCEM algorithm
• Captures tail dependence in the presence of asymptotic independence
• Improved estimation of tail risk set probabilities
References
Heffernan, J. E. and Tawn, J. A. (2004). A conditional approach for multivariate extreme values. Journal of the Royal Statistical Society, Series B, 66:497–546.
Jessen, A. and Mikosch, T. (2006). Regularly varying functions. University of Copenhagen, Laboratory of Actuarial Mathematics.
Ledford, A. and Tawn, J. (1997). Modelling dependence within joint tail regions. Journal of the Royal Statistical Society, Series B, 59:475–499.
Maulik, K. and Resnick, S. (2004). Characterizations and examples of hidden regular variation. Extremes, 7(1):31–67.
Resnick, S. (2002). Hidden regular variation, second order regular variation and asymptotic independence. Extremes, 5(4):303–336.
Weller, G. and Cooley, D. (2013). A sum decomposition for hidden regular variation in joint tail modeling with likelihood inference via the MCEM algorithm. Submitted.