Hidden Regular Variation in Joint Tail Modeling with ...web.math.ku.dk/~mikosch/extremes_2013/cooleyCopenhagen.pdf · Likelihood estimation via modi ed MCEM algorithm Captures tail

Hidden Regular Variation in Joint Tail Modeling with Likelihood Inference via the MCEM Algorithm

Dan Cooley, Grant Weller

Department of Statistics, Colorado State University

Funding: NSF-DMS-0905315

Weather and Climate Impacts Assessment Program (NCAR)

2011-2012 SAMSI UQ program


Motivating Example: Daily Air Pollution, Leeds UK

[Figure: two scatterplots of NO2 versus SO2. Panel "Daily max pollution at Leeds, UK": both axes from 0 to 500. Panel "Frechet Scale": both axes from 0 to 400. Regions A1, A2, A3 are marked.]

Data exhibit asymptotic independence (Heffernan and Tawn, 2004).


Outline

• Hidden Regular Variation

• Sum Characterization of HRV

• Estimation via MCEM

• Application: air pollution data


When Multivariate Regular Variation Fails

Multivariate Regular Variation:

$$ t\,P\left[\frac{R}{b(t)} > r,\ W \in B\right] \xrightarrow{v} r^{-\alpha} H(B). $$

In some cases, the angular measure H degenerates on some regions of N, masking sub-asymptotic dependence features.

Example: asymptotic independence in d = 2:

$$ \lim_{z \to z^{+}} P(Z_1 > z \mid Z_2 > z) = 0. $$

• H consists of point masses at {0} and {1} (using ‖ · ‖1)

• e.g. bivariate Gaussian with correlation ρ < 1

Normalization by b(t) kills off sub-asymptotic dependence structure.
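The asymptotic independence limit above can be checked empirically. The following is a minimal sketch, not part of the talk: it estimates P(Z1 > z | Z2 > z) at increasingly high marginal quantiles for a simulated bivariate Gaussian sample. The correlation 0.5, the sample size, the quantile levels and the helper name `empirical_chi` are all illustrative assumptions. Because the conditional probability is invariant to increasing marginal transformations, the Gaussian scale can be used directly.

```python
import numpy as np

def empirical_chi(x, y, u):
    """Estimate P(X > q_u | Y > q_u), with q_u the empirical u-quantile of each margin."""
    qx, qy = np.quantile(x, u), np.quantile(y, u)
    return np.mean((x > qx) & (y > qy)) / np.mean(y > qy)

rng = np.random.default_rng(0)
rho = 0.5  # illustrative correlation (assumption)
cov = [[1.0, rho], [rho, 1.0]]
sample = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

levels = [0.90, 0.99, 0.999]
chis = [empirical_chi(sample[:, 0], sample[:, 1], u) for u in levels]
for u, c in zip(levels, chis):
    print(f"u = {u}: estimated conditional exceedance probability = {c:.3f}")
```

The estimates shrink as the quantile level rises, which is the finite-sample signature of asymptotic independence; the slow rate of decay is the sub-asymptotic dependence that the first-order limit hides.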


Hidden Regular Variation (Resnick, 2002)

A regularly varying random vector Z exhibits hidden regular variation on a subcone C0 ⊂ C if ν(C0) = 0 and there exists {b0(t)}, b0(t) → ∞ with b0(t)/b(t) → 0, such that

$$ t\,P\left[\frac{Z}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) $$

as t → ∞ in M+(C0).

• Scaling: $\nu_0(tA) = t^{-\alpha_0}\nu_0(A)$ for measurable A ⊂ C0, with α0 ≥ α

• ν0 is Radon but not necessarily finite.

Equivalently,

$$ t\,P\left[\frac{R}{b_0(t)} > r,\ W \in B\right] \xrightarrow{v} r^{-\alpha_0} H_0(B) $$

for B a Borel set of N0 = C0 ∩ N (e.g. N0 = (0,1)).

H0 is called the hidden angular measure.


Example: bivariate Gaussian

Consider Z with Fréchet margins and Gaussian dependence, ρ ∈ [0,1). Recall ν places mass only on the axes of C.

Define η = (1+ρ)/2, the coefficient of tail dependence (Ledford and Tawn, 1997).

• Z exhibits hidden regular variation of order α0 = 1/η

• The density of the hidden measure ν0 can be written

$$ \nu_0(dr \times dw) = \frac{1}{\eta}\, r^{-(1 + 1/\eta)}\,dr \times \underbrace{\frac{1}{4\eta}\,\{w(1-w)\}^{-1/(2\eta) - 1}\,dw}_{H_0(dw)} $$

H0 is infinite on (0,1).
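As a rough numerical illustration of the coefficient of tail dependence, η can be estimated from the tail of min(Z1, Z2), whose tail index is 1/η (Ledford and Tawn, 1997). The sketch below uses a Hill estimator, which is a swapped-in simple technique and not the MCEM approach of this talk. The sample size, the number of order statistics k, the rank-based transform to unit Fréchet margins and the helper names are assumptions made for illustration.

```python
import numpy as np

def to_unit_frechet(x):
    """Rank-transform a sample to approximately unit Frechet margins."""
    n = len(x)
    ranks = np.argsort(np.argsort(x)) + 1          # ranks 1..n
    return -1.0 / np.log(ranks / (n + 1.0))

def hill_eta(t, k):
    """Hill estimate of eta from the k largest values of t."""
    srt = np.sort(t)
    threshold = srt[-k - 1]
    return np.mean(np.log(srt[-k:] / threshold))

rng = np.random.default_rng(1)
rho = 0.5                                           # so eta = (1 + rho) / 2 = 0.75
x = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=100_000)
z = np.column_stack([to_unit_frechet(x[:, 0]), to_unit_frechet(x[:, 1])])

t = z.min(axis=1)                                   # structure variable min(Z1, Z2)
eta_hat = hill_eta(t, k=1000)
print(f"Hill estimate of eta: {eta_hat:.3f}")
```

At finite thresholds the estimate is typically pulled somewhat below (1 + ρ)/2 by the slowly varying part of the joint tail, so it should be read as a diagnostic rather than a precise estimate.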


Tail Equivalence (Maulik and Resnick, 2004)

Two random vectors X and Y are tail equivalent on the cone C∗ if

$$ t\,P\left[\frac{X}{b^{*}(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \quad\text{and}\quad t\,P\left[\frac{Y}{b^{*}(t)} \in \cdot\,\right] \xrightarrow{v} c\,\nu(\cdot) $$

as t → ∞ in M+(C∗) for c > 0.

‘Extremes of X and Y samples taken in C∗ will have the same asymptotic properties.’


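Mixture Characterization of HRV (Maulik and Resnick, 2004)

Suppose Z is regularly varying on C with hidden regular variation on C0:

$$ t\,P\left[\frac{Z}{b(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \ \text{in } M_+(C) \quad\text{and}\quad t\,P\left[\frac{Z}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{in } M_+(C_0) $$

with ν(C0) = 0 and b0(t)/b(t) → 0 as t → ∞.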

• Let Y be RV(α) with support only on C \ C0.

• Let $V = R_0\theta_0$, where $R_0$ has tail $\bar F_{R_0}(t) = 1/b_0^{\leftarrow}(t)$ and $\theta_0 \sim H_0$, with $H_0$ finite.

• Then Z is tail equivalent to a mixture of Y and V on both C and C0.

Works because Y’s support doesn’t mess with the HRV.


[Figure: three scatterplots, all axes from 0 to 100: Y[,1] versus Y[,2]; V[,1] versus V[,2]; YV[,1] versus YV[,2].]


Construction of Y + V

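Define Y = RW, with $P(R > r) \sim 1/b^{\leftarrow}(r)$ and W drawn from the limiting angular measure H. Notice that Y has support only on C \ C0.

Let $V \in [0,\infty)^d$ be regularly varying on C0 with limit measure ν0:

$$ t\,P\left[\frac{V}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{in } M_+(C_0). $$

Further assume that on C,

$$ P(\|V\| > r) \sim c\,r^{-\alpha^{*}} $$

as r → ∞, with c > 0 and

$$ \alpha^{*} > \alpha \vee (\alpha_0 - \alpha). $$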

Assume R, W, V are independent.


Tail Equivalence Result

Then

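$$ t\,P\left[\frac{Y + V}{b(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \ \text{in } M_+(C) $$

(Jessen and Mikosch, 2006).

Furthermore, tail equivalence (Maulik and Resnick, 2004) also holds on C0:

Theorem. With Y and V as defined above,

$$ t\,P\left[\frac{Y + V}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{in } M_+(C_0). $$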

View Z as a sum of ‘first-order’ Y and ‘second-order’ V.

The sum Y + V is tail equivalent to Z on both C and C0.
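The sum construction can be sketched in a few lines for d = 2. This is an illustration under stated assumptions, not the authors' code: the first-order angular measure H is taken to be point masses of 1/2 on each axis (asymptotic independence), the radial parts are exact Pareto with α = 1 and α0 = 4/3, and the finite hidden angular measure is an arbitrary Beta(2, 2) choice. With these values α∗ = α0 satisfies α∗ > α ∨ (α0 − α).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
alpha, alpha0 = 1.0, 4.0 / 3.0   # illustrative tail indices (assumption)

# First-order part Y = R * W: W sits on one of the two axes with probability 1/2.
R = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha)      # Pareto(alpha) on [1, inf)
on_first_axis = rng.uniform(size=n) < 0.5
W = np.where(on_first_axis[:, None], [1.0, 0.0], [0.0, 1.0])
Y = R[:, None] * W

# Second-order part V = R0 * (W0, 1 - W0): lighter tail, angles in the interior.
R0 = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha0)    # Pareto(alpha0) on [1, inf)
W0 = rng.beta(2.0, 2.0, size=n)                        # hypothetical finite H0
V = R0[:, None] * np.column_stack([W0, 1.0 - W0])

Z_approx = Y + V   # tail-equivalent stand-in for Z under these assumptions
print("Y points on an axis:", np.mean(Y.min(axis=1) == 0.0))
print("Y + V points on an axis:", np.mean(Z_approx.min(axis=1) == 0.0))
```

Every realization of Y lies exactly on an axis, while adding V moves each point into the interior of the cone without changing the first-order tail behavior.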


Simulation when ν0 is finite.

[Figure: three scatterplots, all axes from 0 to 1200, arranged as Y (Y1 versus Y2) + V (V1 versus V2) = Y + V (Y1 + V1 versus Y2 + V2).]

No point falls exactly on an axis.


Infinite Measure Example: Bivariate Gaussian

Z has Fréchet margins and Gaussian dependence (ρ < 1). Recall: H0 is infinite on N0 = (0,1).

Poses difficulty near the axes of C.

Proposed construction of V:

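• Restrict to $C_0^{\varepsilon} = C_0 \cap N_0^{\varepsilon}$, where $N_0^{\varepsilon} = [\varepsilon, 1 - \varepsilon]$ for ε ∈ (0, 1/2).

• Simulate W0 from probability density $H_0(dw)/H_0(N_0^{\varepsilon})$

• Let R0 follow a Pareto distribution with α = 1/η

• $V = [R_0 W_0,\ R_0(1 - W_0)]^{T}$

Y + V is tail equivalent to Z on C and $C_0^{\varepsilon}$.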
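The three simulation steps above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: W0 is drawn by numerically inverting the CDF of the density proportional to {w(1 − w)}^(−1/(2η) − 1) on [ε, 1 − ε] over a grid, and the function name `sample_w0`, the grid size, and the values η = 0.75 and ε = 0.1 are assumptions.

```python
import numpy as np

def sample_w0(eta, eps, size, rng, grid_size=20001):
    """Draw angles from H0 restricted to [eps, 1 - eps] by grid-based CDF inversion."""
    w = np.linspace(eps, 1.0 - eps, grid_size)
    dens = (w * (1.0 - w)) ** (-1.0 / (2.0 * eta) - 1.0)   # unnormalized H0 density
    # Trapezoidal cumulative integral, then normalize to a CDF on the grid.
    cdf = np.concatenate([[0.0], np.cumsum(0.5 * (dens[1:] + dens[:-1]) * np.diff(w))])
    cdf /= cdf[-1]
    return np.interp(rng.uniform(size=size), cdf, w)

rng = np.random.default_rng(3)
eta, eps, n = 0.75, 0.1, 10_000       # illustrative values (assumption)

W0 = sample_w0(eta, eps, n, rng)
R0 = (1.0 - rng.uniform(size=n)) ** (-eta)   # Pareto with tail index 1 / eta on [1, inf)
V = np.column_stack([R0 * W0, R0 * (1.0 - W0)])

print("angle range:", W0.min(), W0.max())
print("mean angle:", W0.mean())
```

Normalizing by the total mass on the grid plays the role of dividing by H0 of the restricted angular set, which is finite only because ε > 0.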


Sum representation of bivariate Gaussian

Example with ρ = 0.5 (n = 2500):

[Figure: three scatterplots, all axes from 0 to 800: Z (z1 versus z2); Y + V with ε = 0.01 (Y1 + V1 versus Y2 + V2); Y + V with ε = 0.1 (Y1 + V1 versus Y2 + V2).]

For any set completely contained in $C_0^{\varepsilon}$ we achieve the correct limit measure ν0.

Choice of ε involves a trade-off between:

• Size of the subcone on which tail equivalence holds

• Threshold at which Y + V is a useful approximation

• Biases due to choice of ε calculated.


Inference via the EM Algorithm

Observe realizations from Z, tail equivalent to Y + V. Assume parametric forms and perform ML inference via EM.

If we assume Z = Y + V,

$$
\begin{aligned}
\log f(z;\theta) &= \int \log f(z,y,v;\theta)\, f(y,v \mid z;\theta^{(k)})\,dy\,dv \\
&\quad - \int \log f(y,v \mid z;\theta)\, f(y,v \mid z;\theta^{(k)})\,dy\,dv \\
&:= Q(\theta \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}).
\end{aligned}
$$

Here: Z and Y + V are only tail equivalent; θ governs tail behavior of Y and V. Requires a modification of the EM setup.


EM for Extremes

Consider distributions with densities gY(y; θ) and gV(v; θ) which are tail equivalent to the true distributions; i.e.,

$$
\begin{aligned}
g_Y(y;\theta) &\cong f_Y(y;\theta) \quad \text{for } \|y\| > r^{*}_Y, \\
g_V(v;\theta) &\cong f_V(v;\theta) \quad \text{for } \|v\| > r^{*}_V.
\end{aligned}
$$

Complete likelihood is based on limiting Poisson point processes for Y and V.

• E step: expectation is taken with respect to g(y,v|z; θ).

• M step: maximization is taken over only ‘large’ y and v.

We show

$$ H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}) \ge 0 $$

using Jensen’s inequality.
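The Jensen step follows the classical EM argument. As a sketch, write p(y, v | z; θ) for whichever conditional density defines H (treating the standard and modified setups the same way is an assumption of this sketch, and the proof in the underlying paper may differ in detail):

```latex
\begin{aligned}
H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)})
  &= -\int \log\frac{p(y,v \mid z;\theta)}{p(y,v \mid z;\theta^{(k)})}\,
       p(y,v \mid z;\theta^{(k)})\,dy\,dv \\
  &\ge -\log \int \frac{p(y,v \mid z;\theta)}{p(y,v \mid z;\theta^{(k)})}\,
       p(y,v \mid z;\theta^{(k)})\,dy\,dv \\
  &= -\log \int p(y,v \mid z;\theta)\,dy\,dv \;=\; -\log 1 \;=\; 0 .
\end{aligned}
```

The inequality is Jensen's inequality applied to the concave logarithm. If the integral is taken over a restricted region, the final integral is at most 1 and the bound still holds. Consequently any θ that increases Q relative to θ(k) cannot decrease the log-likelihood.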


MCEM

Natural framework for MCEM.

At the E step of the (k + 1)th iteration, simulate from

$$ g_Y(y;\theta^{(k)})\, g_V(z - y;\theta^{(k)}) \propto g(y, v \mid z;\theta^{(k)}) $$

for all z and use the simulated realizations to compute

$$ Q_m(\theta \mid \theta^{(k)}) = \frac{1}{m}\sum_{j=1}^{m} \ell(\theta; z, y_j, v_j), $$

employing Poisson point process likelihoods for large realizations of Y and V.

Key idea: likelihood only depends on θ for ‘large’ y and v!

Uncertainty estimates obtained via Louis’ method.
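The E and M steps can be illustrated on a deliberately simple toy problem. The sketch below is not the extremes model of this talk: it assumes Y ~ N(θ, 1) and V ~ N(0, 1) with only Z = Y + V observed, uses ordinary Gaussian log-likelihoods in place of Poisson point process likelihoods, and samples the conditional distribution exactly because it is available in closed form. All names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, n_iter = 2000, 50, 30
theta_true = 3.0

# Toy data (assumption): Y ~ N(theta, 1), V ~ N(0, 1); only z = y + v is observed.
z = rng.normal(theta_true, 1.0, size=n) + rng.normal(0.0, 1.0, size=n)

theta = 0.0   # starting value theta^(0)
for _ in range(n_iter):
    # E step: for each z, simulate m draws of y from the density proportional to
    # g_Y(y; theta^(k)) * g_V(z - y), which here is N((z + theta^(k)) / 2, 1/2).
    y_sim = rng.normal((z + theta) / 2.0, np.sqrt(0.5), size=(m, n))
    # M step: Q_m(theta) averages log g_Y(y_j; theta) + log g_V(z - y_j) over draws;
    # only the first term involves theta, and it is maximized by the sample mean.
    theta = y_sim.mean()

print(f"MCEM estimate: {theta:.3f}, direct MLE (mean of z): {z.mean():.3f}")
```

In this toy case the iteration settles near the mean of z, which is the maximum likelihood estimate of θ, up to Monte Carlo noise controlled by m. In the setting of the talk the conditional draws would generally require a sampling scheme rather than a closed form.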


Example w/ Infinite Hidden Measure

Simulate n = 10000 realizations from a bivariate Gaussian distribution with correlation ρ, transform marginals to unit Fréchet.

Tail equivalent on C and $C_0^{\varepsilon}$ to Y + V, where V has angular measure

$$ H_0(dw) = \frac{1}{4\eta}\,\{w(1-w)\}^{-1/(2\eta) - 1}\,dw. $$

Aim: estimate η = (1 + ρ)/2 from the ε-restricted model.

• Must select both ε and r∗V

• Trade-off in finite sample estimation problems


Infinite Hidden Measure Results

Shown for η = 0.75 (ρ = 0.5)

[Figure: two panels plotted against rV = 22, 45, 100, 200, with one curve for each of ε = 0.185, 0.2, 0.215, 0.23. Panel "Mean estimates of η": vertical axis from 0.65 to 0.90. Panel "Coverage rates of 95% CI": vertical axis from 0.0 to 1.0.]


Air Pollution Data

[Figure: two scatterplots of NO2 versus SO2. Panel "Daily max pollution at Leeds, UK": both axes from 0 to 500. Panel "Frechet Scale": both axes from 0 to 400. Regions A1, A2, A3 are marked.]

• Strong evidence for asymptotic independence

• Aim: estimate risk set probabilities


Competing Approaches

Examine three modeling approaches:

1. Assume asymptotic dependence; i.e. that ν(·) places mass on the entire cone C. Fit a bivariate logistic angular dependence model to largest 10% of observations (in terms of L1 norm). Estimate β = 0.713.

2. Assume asymptotic independence and ignore any possible hidden regular variation.

3. Assume asymptotic independence and hidden regular variation. Fit the ε-restricted infinite hidden measure model via MCEM. Select $r^{*}_V = 7.5$ and ε = 0.3. Estimate η = 0.748.


Results - risk set estimates

Model             P(Z ∈ A1)   Expected #   p-val
1 (asy. dep.)     0.0297      59.04        0.480
2 (asy. indep.)   0.0120      23.86        8.17 × 10^−5
3 (Y + V)         0.0261      51.89        0.210
Empirical         0.0292      58           −




2 (asy. indep.)   0.0002      0.40         0.009
3 (Y + V)         0.0018      3.58         0.274
Empirical         0.0025      5            −




2 (asy. indep.)   0           0            1
3 (Y + V)         0.0002      0.40         0.704
Empirical         0           0            −


Summary

This work introduces a sum representation for regularly varying random vectors possessing hidden regular variation.

• Useful representation for finite samples

• Asymptotically justified by tail equivalence result

• Difficulty arises when H0 is infinite: restrict to a compact cone to simulate V

• Likelihood estimation via modified MCEM algorithm

• Captures tail dependence in the presence of asymptotic independence

• Improved estimation of tail risk set probabilities


References

Heffernan, J. E. and Tawn, J. A. (2004). A conditional approach for multivariate extreme values. Journal of the Royal Statistical Society, Series B, 66:497–546.

Jessen, A. and Mikosch, T. (2006). Regularly varying functions. University of Copenhagen, Laboratory of Actuarial Mathematics.

Ledford, A. and Tawn, J. (1997). Modelling dependence within joint tail regions. Journal of the Royal Statistical Society, Series B, 59:475–499.

Maulik, K. and Resnick, S. (2004). Characterizations and examples of hidden regular variation. Extremes, 7(1):31–67.

Resnick, S. (2002). Hidden regular variation, second order regular variation and asymptotic independence. Extremes, 5(4):303–336.

Weller, G. and Cooley, D. (2013). A sum decomposition for hidden regular variation in joint tail modeling with likelihood inference via the MCEM algorithm. Submitted.
