Estimation of Extreme Value at Risks Using
CAViaR Models
Midori Nagai
Graduate School of Economics, Hitotsubashi University
January 2016
Abstract
This paper presents a method for estimating extreme VaR by integrating a CAViaR model and extreme value theory. We model the dynamics of VaR by a CAViaR model and estimate the parameters by applying extreme value theory. We compare the performance of our method with that of estimating the parameters by conventional quantile regression. In the simulation, our method outperforms the quantile regression method in that it estimates the VaR more accurately, while in the empirical study, our method is inferior to the quantile regression method in terms of the violation rate.
Contents
1 Introduction
2 Methods
2.1 CAViaR model
2.2 Extreme value theory
2.2.1 Basic theory
2.2.2 Quantile estimation
2.3 Estimation of extreme VaR
3 Simulation Study
4 Real Data Analysis
5 Conclusion
1 Introduction
Value-at-Risk (VaR) was developed in the early 1990s in response to the market
crises that occurred around 1990, such as the stock market crash on Wall Street in 1987 and
the breakdown of the European Monetary System in 1992. Nowadays VaR is a widely
used measure of market risk in risk management. VaR is a measure of the maximum
potential loss of a certain portfolio within a given time period for a given confidence
level. More specifically, conditional on the current information Ωt, VaR is defined as
the θ-quantile of the conditional return distribution, where 1 − θ ∈ (0, 1) represents
the confidence level associated with VaR. Let yt be the return of a portfolio. We are
then interested in estimating the VaR of $y_t$, $\mathrm{VaR}_t$, defined by
$$P[\,y_{t+1} < \mathrm{VaR}_{t+1} \mid \Omega_t\,] = \theta.$$
Since VaR is conceptually simple, it has become a standard measure of market risk in
risk management.
Despite its conceptual simplicity, estimating VaR is a challenging statistical problem.
So far various approaches have been developed to forecast VaR, but none of the
methods gives a fully satisfactory solution, because the distribution of portfolio returns
typically changes over time. The most widely used approach is to use fully parametric time
series models such as ARCH or GARCH models to capture the dynamics of volatilities.
The main weakness of this approach is that it requires assumptions on the shape of
the return distribution. The original ARCH and GARCH models assumed normality,
which was soon realized to be inadequate. Replacing it with more fat-tailed
and possibly skewed distributions, such as Student-t distributions, is widely thought
to be effective. Nevertheless, there is still no consensus on which distribution should be assumed.
Among nonparametric methods, the most popular approach is historical simulation.
With this method, we obtain an estimate of VaR as an empirical quantile
of historical returns from a window of the most recent periods, which is computationally simple.
Despite its tractability, the VaR estimates can
be unstable, especially at extreme quantiles. On the other hand, other approaches
such as Conditional Autoregressive Value at Risk (CAViaR) models and applications of
extreme value theory are known to have an advantage even in such extreme-quantile estimation.
The CAViaR model, introduced by Engle and Manganelli (2004), is based on a
quantile regression. It is a dynamic quantile model in that conditional quantiles follow
autoregressive processes. The autoregressive structure of the quantile is quite natural
since the series of financial returns empirically tend to exhibit volatility clustering
and the quantile of the distribution is tightly linked to the variance. The advantage
of this model is that no explicit distributional assumptions need to be made due to
modeling the quantile directly. Empirical evidence has shown that a CAViaR model is
competitive with other VaR models (Bao et al. 2006, Yu et al. 2010).
We can use quantile regression introduced by Koenker and Bassett (1978) to esti-
mate the parameters of a CAViaR model. However, the estimates of high quantiles by
a quantile regression tend to be unstable due to data sparsity in the tail areas. The
quantile level θ of VaR is often set as 0.01 or sometimes lower. In such cases, it is dif-
ficult to forecast VaR using a CAViaR model with parameters estimated by a quantile
regression without making distributional assumptions.
Extreme value theory (EVT) provides a solid framework to analyze rare events
and focuses only on the tail of the distribution. EVT has been applied in many fields
including VaR estimation. A simple and widely used EVT approach for VaR estimation
is to fit a generalized Pareto distribution to the returns that exceed a particular threshold.
See, for example, Danielsson and DeVries (1997) and Pownall and Koedijk (1999). On
the other hand, McNeil and Frey (2000) applied EVT to the residuals of a Gaussian
AR(1)-GARCH(1,1) model, an approach called the conditional-EVT method. Bystrom (2004) extended
McNeil and Frey’s approach to the block maxima method. For more details on applications
of EVT to VaR estimation, see McNeil et al. (2015).
In this paper, we present a method to estimate extreme VaR using both a CAViaR
model and EVT. We first estimate “not extreme” quantiles from CAViaR models by a
quantile regression, then extrapolate the extreme quantile from the estimated quantiles
via EVT. The extrapolation method is based on Wang et al. (2012) which gives a
method for conditional extreme quantile estimation. By integrating CAViaR and EVT, we can
overcome the instability of quantile regression at extreme levels while still avoiding
extra assumptions on the return distribution, which might lead to misspecification.
The rest of the article is organized as follows. In Section 2, we briefly review
the CAViaR model and the extreme value theory. We also introduce the estimation
methods. In Section 3, we conduct a simulation study to assess the finite sample
performance of the proposed method. Section 4 presents an empirical application to
real data, and Section 5 concludes the paper.
2 Methods
2.1 CAViaR model
Engle and Manganelli (2004) introduced CAViaR to model conditional VaR. This ap-
proach models the quantile directly, instead of modeling the whole distribution. Em-
pirically it is well known that volatilities of stock market returns are usually autocor-
related. Since the quantile is tightly linked to the variance of the distribution, it is
natural to expect that the VaR is also autocorrelated.
Engle and Manganelli (2004) proposed several CAViaR specifications. In this paper
we adopt the following three models. Let $Q_{Y_t}(\theta \mid \Omega_{t-1})$ denote the θ-quantile of $Y_t$
conditional on the information $\Omega_{t-1}$, i.e., $Q_{Y_t}(\theta \mid \Omega_{t-1}) := \inf\{y : F_{Y_t}(y \mid \Omega_{t-1}) \ge \theta\}$,
where $F_{Y_t}(\cdot \mid \Omega_{t-1})$ is the conditional distribution function of the return $Y_t$.
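[Symmetric Absolute Value]
$$Q_{Y_{t+1}}(\theta \mid \Omega_t) = \beta_1 + \beta_2 Q_{Y_t}(\theta \mid \Omega_{t-1}) + \beta_3 |y_t|$$
[Asymmetric Slope]
$$Q_{Y_{t+1}}(\theta \mid \Omega_t) = \beta_1 + \beta_2 Q_{Y_t}(\theta \mid \Omega_{t-1}) + \beta_3 \max(y_t, 0) + \beta_4 \max(-y_t, 0)$$
[Indirect GARCH]
$$Q_{Y_{t+1}}(\theta \mid \Omega_t) = \left(\beta_1 + \beta_2 Q_{Y_t}(\theta \mid \Omega_{t-1})^2 + \beta_3 y_t^2\right)^{1/2}$$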
In the Symmetric Absolute Value model, the quantiles respond symmetrically to past
returns. The autoregressive parameter β2 must satisfy |β2| < 1 for the process to be
mean-reverting. In the Asymmetric Slope model, the symmetry assumption is relaxed
in that the quantile responds differently to positive and negative
returns.
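The three recursions can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function name `caviar_path`, the model labels `"sav"`, `"as"`, `"ig"`, and the starting value `q0` are assumptions introduced here.

```python
import math


def caviar_path(returns, beta, model, q0):
    """Iterate a CAViaR recursion over a return series.

    returns : sequence of returns y_1, ..., y_T
    beta    : coefficients (3 values for "sav" and "ig", 4 for "as")
    model   : "sav", "as" or "ig" (hypothetical labels used only here)
    q0      : assumed starting quantile for the first period
    Returns [q_1, ..., q_{T+1}], where q_{t+1} is built from q_t and y_t.
    """
    path = [q0]
    for y in returns:
        prev = path[-1]
        if model == "sav":  # Symmetric Absolute Value
            b1, b2, b3 = beta
            nxt = b1 + b2 * prev + b3 * abs(y)
        elif model == "as":  # Asymmetric Slope
            b1, b2, b3, b4 = beta
            nxt = b1 + b2 * prev + b3 * max(y, 0.0) + b4 * max(-y, 0.0)
        elif model == "ig":  # Indirect GARCH
            b1, b2, b3 = beta
            nxt = math.sqrt(b1 + b2 * prev ** 2 + b3 * y ** 2)
        else:
            raise ValueError("unknown model: " + repr(model))
        path.append(nxt)
    return path
```

The last element of the returned list is the one-step-ahead quantile forecast implied by the given coefficients.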
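The above CAViaR models are correctly specified if the data are generated from
the following models:
$$y_t = \sigma_t \varepsilon_t, \quad \varepsilon_t \sim \text{i.i.d.}(0, 1),$$
$$\sigma_t = \begin{cases}
\beta_1^* + \beta_2^* \sigma_{t-1} + \beta_3^* |y_{t-1}| & \text{(Symmetric Absolute Value)} \\
\beta_1^* + \beta_2^* \sigma_{t-1} + \beta_3^* \max(y_{t-1}, 0) + \beta_4^* \max(-y_{t-1}, 0) & \text{(Asymmetric Slope)} \\
\left(\beta_1^* + \beta_2^* \sigma_{t-1}^2 + \beta_3^* y_{t-1}^2\right)^{1/2} & \text{(Indirect GARCH)}
\end{cases}$$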
The parameters of CAViaR models can be estimated by quantile regression. Consider
a quantile regression model
$$y_t = f_t(\beta) + \varepsilon_{t\theta}$$
with $Q_\theta(\varepsilon_{t\theta} \mid \Omega_t) = 0$, where $Q_\theta(\cdot \mid \Omega_t)$ is the conditional θ-quantile function of the i.i.d. errors $\varepsilon_{t\theta}$. Then
we can define the quantile regression estimator of β as
$$\hat\beta = \arg\min_\beta \sum_{t=1}^{T} \rho_\theta(y_t - f_t(\beta)),$$
where $\rho_\theta(u) := u(\theta - I(u < 0))$ is the so-called check function. For details on
quantile regression, see Koenker (2005).
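To make the check function concrete, the following sketch evaluates the quantile regression objective and, for the simplest possible model (a constant fit), finds its minimizer by brute force over the sample points. The helper names are hypothetical, and the constant-fit search is only an illustration of the objective, not the optimizer used in this paper.

```python
def check_loss(u, theta):
    """Check function rho_theta(u) = u * (theta - I(u < 0))."""
    return u * (theta - (1.0 if u < 0 else 0.0))


def qr_objective(y, fitted, theta):
    """Sum of check losses of the residuals y_t - f_t."""
    return sum(check_loss(yt - ft, theta) for yt, ft in zip(y, fitted))


def best_constant_quantile(y, theta):
    """Minimise the objective over constant fits taken from the sample.

    For a constant model f_t = c the minimiser is an empirical
    theta-quantile, so searching over the observed values is enough.
    """
    candidates = sorted(y)
    return min(candidates, key=lambda c: qr_objective(y, [c] * len(y), theta))
```

Positive residuals are weighted by θ and negative residuals by 1 − θ, which is why the minimizer moves toward the lower tail as θ decreases.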
When θ is not an extreme quantile level, we can appropriately estimate the quantile
by simply minimizing the objective function. However, if the quantile level θ is extreme,
i.e., θ ≈ 0 or 1, quantile regression estimators tend to be unstable due to data sparseness in
the tail area, especially for heavy-tailed distributions. This motivates us to apply EVT
to the estimation.
2.2 Extreme value theory
Extreme Value Theory (EVT) deals with extreme and rare events. In many cases, it
is difficult to draw inferences on such events, since they may lie beyond the
range of the available data. EVT can be a strong tool in such situations. The pioneering
works on EVT are Fisher and Tippett (1928) and Gnedenko (1943), which were later
extended by Gumbel (1958). The theory has been applied in many fields, such as
hydrology and wind engineering, as well as finance and insurance from the viewpoint of
risk analysis. For the details of the theory, see Embrechts et al. (1997) and de Haan and
Ferreira (2006).
2.2.1 Basic theory
Let $X_1, X_2, \ldots, X_n$ be i.i.d. random variables with a distribution function $F$. EVT is
concerned with the limit behavior of the sample maxima, $M_n := \max(X_1, X_2, \ldots, X_n)$,
while the central limit theorem is concerned with that of the partial sum, $X_1 + X_2 + \cdots + X_n$.
The distribution function of the sample maxima $M_n$ is
$$P(M_n \le x) = P(X_1 \le x, X_2 \le x, \ldots, X_n \le x) = F^n(x).$$
However, this has no practical value, since $F^n(x)$ converges to zero for all $x < x^*$ and
converges to 1 for $x \ge x^*$ as $n \to \infty$, where $x^*$ is the right endpoint of $F$, i.e.,
$x^* := \sup\{x : F(x) < 1\}$. Therefore, in order to obtain a nondegenerate limit distribution,
we need to normalize the sample maxima $M_n$.
Under weak assumptions, it is known that there exist sequences of constants
$a_n > 0$ and $b_n \in \mathbb{R}$ $(n = 1, 2, \ldots)$ such that
$$\lim_{n\to\infty} P\left(\frac{M_n - b_n}{a_n} \le x\right) = \lim_{n\to\infty} F^n(a_n x + b_n) = G(x), \tag{1}$$
for some non-degenerate distribution $G$. If $G$ exists, it is known that $G$ must be the
so-called generalized extreme value distribution:
$$G_\gamma(x) = \begin{cases} \exp\left(-(1+\gamma x)^{-1/\gamma}\right), & \gamma \ne 0, \\ \exp(-\exp(-x)), & \gamma = 0, \end{cases}$$
where 1+γx > 0. The shape parameter γ is called the extreme value index, which determines
the tail behavior. If the relation (1) holds for some G = Gγ, we say that the distribution
function F is in the domain of attraction of Gγ, and we write F ∈ D(Gγ).
The class of extreme value distributions can be divided into three subclasses
according to the value of γ.
(a) For γ > 0, $G_\gamma$ can be reparametrized as
$$\Phi_\alpha(x) := \begin{cases} 0, & x \le 0, \\ \exp(-x^{-\alpha}), & x > 0, \end{cases}$$
where α = 1/γ. This class is often called the Fréchet class of distributions. The
right endpoint of this distribution is infinity and the distribution has a heavy
tail. The right tail of the distribution decreases like a power law, and moments of
order greater than or equal to α do not exist. The Student-t distribution and the
Cauchy distribution belong to the domain of attraction of this type of extreme value
distribution. In this paper, we are interested in this type of distribution.
(b) For γ = 0,
$$G_0(x) = \exp(-\exp(-x)).$$
This is often called the Gumbel distribution. Though the right endpoint of the
distribution is infinity, the distribution has a rather light tail in that it declines
exponentially and all moments exist. Normal and lognormal distributions belong to
the domain of attraction of this type of extreme value distribution.
(c) For γ < 0, with α = −1/γ, we can write $G_\gamma$ as
$$\Psi_\alpha(x) := \begin{cases} \exp(-(-x)^{\alpha}), & x < 0, \\ 1, & x \ge 0. \end{cases}$$
This class is called the Weibull class of distributions. Since the right endpoint of
the distribution is finite, it has a short tail. The uniform distribution, for
example, belongs to the domain of attraction of this type of extreme value distribution.
There is another limit result for extreme values. The limit relationship (1)
with $G(x) = G_\gamma(x) = \exp(-(1+\gamma x)^{-1/\gamma})$ is equivalent to
$$\lim_{t \to x^*} P\left(\frac{X - t}{f(t)} > x \,\middle|\, X > t\right) = \lim_{t \to x^*} \frac{1 - F(t + x f(t))}{1 - F(t)} = (1+\gamma x)^{-1/\gamma}, \tag{2}$$
where $f$ is some positive nondecreasing function and $x^* = \sup\{x : F(x) < 1\}$. This
means that the conditional distribution of $(X - t)/f(t)$ given $X > t$, often referred to
as the excess distribution over the threshold $t$, has the limit distribution
$$H_\gamma(x) := \begin{cases} 1 - (1+\gamma x)^{-1/\gamma}, & \gamma \ne 0, \\ 1 - \exp(-x), & \gamma = 0, \end{cases}$$
as $t \to x^*$, where $x \ge 0$ when $\gamma \ge 0$, and $0 \le x \le -1/\gamma$ when $\gamma < 0$. This class of
distribution functions is called the generalized Pareto distribution. The estimation of
extreme value distributions is often based on this type of limit formula.
We have the same types of limiting distributions for the maxima of strictly stationary
time series. Let $(X_1, X_2, \ldots, X_n)$ be a strictly stationary sequence with marginal
distribution $F$, let $(\tilde X_1, \tilde X_2, \ldots, \tilde X_n)$ denote an i.i.d. process with the same distribution
function $F$, and let $M_n := \max(X_1, X_2, \ldots, X_n)$ and $\tilde M_n := \max(\tilde X_1, \tilde X_2, \ldots, \tilde X_n)$ denote the maxima
of the stationary series and the i.i.d. series, respectively. Assume the i.i.d. process $\tilde X_i$ is in the
maximum domain of attraction of $G_\gamma$, $F \in D(G_\gamma)$. Then for many processes $X_i$, there
exists $\theta \in (0, 1]$ such that
$$\lim_{n\to\infty} P\left(\frac{M_n - b_n}{a_n} \le x\right) = G_\gamma^{\theta}(x). \tag{3}$$
Note that the normalizing constants and the extreme value index are still the same
as in the independent case. The condition for a strictly stationary process $X_i$ to have the
extreme value distribution is known as the $D(u_n)$ condition, obtained by Leadbetter (1974).
The condition $D(u_n)$ is relatively weak, so that many processes, including
linear processes and ARCH and GARCH processes, satisfy it. For the details
of the condition, see Leadbetter et al. (1983) and Embrechts et al. (1997). The value
θ is called the extremal index. The extremal index equals 1 when the
series is i.i.d. or weakly dependent. In general, serial dependence leads to a clustering
of large values. Therefore, the maximum of a stationary time series is stochastically
smaller than that of an i.i.d. sequence with the same marginal distribution function.
The reciprocal of the extremal index, 1/θ, can be interpreted as the mean cluster size.
Together with (1) and (3), we have
$$P\left(\frac{\tilde M_{n\theta} - b_n}{a_n} \le x\right) \sim P\left(\frac{M_n - b_n}{a_n} \le x\right)$$
as $n \to \infty$. This means that the distribution of the maximum of n observations from
a strictly stationary time series with extremal index θ can be approximated by the
distribution of the maximum of nθ < n observations from an i.i.d. series with the same
marginal distribution. Thus the convergence of the maximum of a time series
to the extreme value distribution is slower than that of an i.i.d. series.
2.2.2 Quantile estimation
Let $x_p := F^{-1}(1 - p)$ be the extreme quantile we want to estimate. The relation (2) is
equivalent to (Theorem 1.1.6 of de Haan and Ferreira (2006))
$$\lim_{t\to\infty} \frac{U(tx) - U(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma},$$
where $U(t) = F^{-1}(1 - 1/t)$ and $a(t)$ is a positive function with $f(t) = a(1/(1 - F(t)))$.
This implies
$$U(tx) \approx U(t) + a(t)\,\frac{x^\gamma - 1}{\gamma}.$$
Since $x_p = U(1/p)$, by letting $t = n/k$ and $x = k/(np)$ with $k \to \infty$, $k/n \to 0$ as $n \to \infty$,
we have
$$x_p \approx U\left(\frac{n}{k}\right) + a\left(\frac{n}{k}\right) \frac{\left(\frac{k}{np}\right)^\gamma - 1}{\gamma}.$$
Therefore, by replacing $U(n/k)$, $a(n/k)$, and γ with suitable estimators, we have
an estimator of $x_p$.
When the extreme value index γ is positive, a simpler version is available. The basic
assumption $F \in D(G_\gamma)$ with $\gamma > 0$ is equivalent to (Corollary 1.2.10 of de Haan and Ferreira
(2006))
$$\lim_{t\to\infty} \frac{U(tx)}{U(t)} = x^\gamma, \quad x > 0.$$
Then similarly, we have
$$x_{p_n} \approx U\left(\frac{n}{k}\right)\left(\frac{k}{n p_n}\right)^\gamma. \tag{4}$$
We can estimate $U(n/k)$ by the empirical quantile $X_{n-k,n}$, where $X_{1,n} \le \cdots \le X_{n,n}$
are the order statistics. Hence, the estimator of the quantile becomes
$$\hat x_{p_n} := X_{n-k,n}\left(\frac{k}{n p_n}\right)^{\hat\gamma}, \tag{5}$$
where $\hat\gamma$ denotes a suitable estimator of the extreme value index γ depending only on
the k largest order statistics.
Several methods have been proposed to estimate the extreme value index γ. For
the case γ > 0, a simple and widely used estimator is the Hill estimator developed by
Hill (1975),
$$\hat\gamma := \frac{1}{k}\sum_{i=1}^{k} \log\frac{X_{n-i+1,n}}{X_{n-k,n}}. \tag{6}$$
This estimator can be derived by several different methods, which suggests that the Hill
estimator is quite natural. One approach is based on a reformulation of the condition
$F \in D(G_\gamma)$. We can rewrite the condition as
$$\lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)} = x^{-1/\gamma}, \quad x > 0.$$
Using partial integration, we have
$$\lim_{t\to\infty} \frac{\int_t^\infty (\log u - \log t)\, dF(u)}{1 - F(t)} = \gamma.$$
Then replacing $F$ by the empirical distribution function $F_n$, and $t$ by the intermediate
order statistic $X_{n-k,n}$, we have the estimator (6). The replacement of $t$ is motivated by
the fact that $X_{n-k,n} \to \infty$ a.s., provided $k \to \infty$ and $k/n \to 0$ as $n \to \infty$; see Proposition
4.1.14 of Embrechts et al. (1997).
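As an illustration of the Hill estimator (6) and the quantile estimator (5), here is a minimal Python sketch for an i.i.d. sample with a heavy right tail. The function names are hypothetical, and the code assumes that the threshold order statistic is positive.

```python
import math


def hill_estimator(x, k):
    """Hill estimator based on the k largest order statistics.

    With ascending order statistics X_{1,n} <= ... <= X_{n,n}, the
    threshold is X_{n-k,n}, which must be positive.
    """
    xs = sorted(x)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = xs[n - k - 1]  # X_{n-k,n} in 0-based indexing
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    # xs[n - i] is X_{n-i+1,n} for i = 1, ..., k
    return sum(math.log(xs[n - i] / threshold) for i in range(1, k + 1)) / k


def extreme_quantile(x, k, p):
    """Estimate x_p = F^{-1}(1 - p) by X_{n-k,n} * (k / (n p))^gamma_hat."""
    xs = sorted(x)
    n = len(xs)
    gamma_hat = hill_estimator(x, k)
    return xs[n - k - 1] * (k / (n * p)) ** gamma_hat
```

Both functions use only the k largest observations and the threshold, so the bulk of the sample does not influence the tail estimate.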
The asymptotic properties of the Hill estimator have been investigated in various
models. Consistency has been proved for quite general time series models, including
infinite-order moving averages and ARCH and GARCH processes, by
Hsing (1991) and Resnick and Starica (1995, 1998). Asymptotic normality has also been
proved in many situations; see Resnick and Starica (1997) and Hill (2010).
The Hill estimator can be used only when the extreme value index is positive,
corresponding to the heavy-tailed situation. In the general case γ ∈ R, we can use other
estimators such as the moment estimator, the maximum likelihood estimator, and the
Pickands estimator. However, since we focus on heavy-tailed distributions in this paper,
we use the Hill estimator alone for the estimation.
To justify the quantile estimator (5), n/k should be large enough to justify the
approximation (4). On the other hand, k itself should also be sufficiently large so that
the intermediate order statistic $X_{n-k,n}$ can estimate the intermediate quantile $U(n/k)$
well enough. Thus the following conditions are usually imposed in the extreme value
literature:
$$k \to \infty, \quad n/k \to \infty, \quad \text{as } n \to \infty.$$
In practice, the choice of k is quite a difficult problem. There is a trade-off between
the bias and the variance of $\hat\gamma$. A smaller value of k leads to a larger variance due to the lack
of observations, while a larger k results in a severe bias because “not extreme”
data are included in the estimation. Theoretically, we can choose the optimal k by minimizing
the mean squared error of the estimator. However, the optimal k depends on unknown
parameters which are difficult to estimate in practice. Therefore, a commonly used
approach to choosing k is to plot the estimator of γ against k, which is often called the Hill plot,
and then choose the value of k corresponding to the first stable part of the plot.
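The data behind a Hill plot can be produced in one pass over the largest observations. The sketch below is illustrative only: `hill_path` is a hypothetical name, the plotting itself is left to the reader, and the observations used must be positive.

```python
import math


def hill_path(x, k_max):
    """Return [(k, gamma_hat_k)] for k = 1, ..., k_max.

    gamma_hat_k is the Hill estimate using the k largest observations
    with threshold X_{n-k,n}. Requires k_max < n and a positive
    threshold at k_max.
    """
    desc = sorted(x, reverse=True)  # desc[0] = X_{n,n}, desc[k] = X_{n-k,n}
    if k_max >= len(desc) or desc[k_max] <= 0:
        raise ValueError("need k_max < n and positive upper order statistics")
    logs = [math.log(v) for v in desc[:k_max + 1]]
    path = []
    running = 0.0  # sum of the logs of the k largest observations
    for k in range(1, k_max + 1):
        running += logs[k - 1]
        path.append((k, running / k - logs[k]))
    return path
```

Plotting the second component against the first gives the Hill plot; a region where the estimates level off is a candidate range for k.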
2.3 Estimation of extreme VaR
To estimate the conditional extreme VaR by CAViaR, we adopt the estimation method
proposed by Wang et al. (2012). They integrated quantile regression and EVT to
estimate conditional extreme quantiles. The method estimates extreme conditional
quantiles in the following way. First, we use quantile regression to estimate the
conditional quantiles for intermediate quantile levels. Then the estimated intermediate
conditional quantiles are used to estimate the extreme conditional quantiles. We should
note that although Wang et al. (2012) assumed the quantile regression model to be
linear, we apply the method to nonlinear models.
We assume FYt(·|Ωt−1) is in the maximum domain of attraction of an extreme value
distribution Gγ(·), denoted by FYt(·|Ωt−1) ∈ D(Gγ). In this paper, the extreme value
index γ is restricted to be γ > 0, corresponding to heavy-tailed distributions, since
market returns generally show heavy-tailed structures.
Let $Q_{Y_t}(\theta \mid \Omega_{t-1}) = \inf\{y : F_{Y_t}(y \mid \Omega_{t-1}) \ge \theta\}$ denote the θ-th conditional quantile
of $Y_t$ given $\Omega_{t-1}$. Our main objective is to estimate the extreme conditional quantiles
$Q_{Y_t}(\theta_n \mid \Omega_{t-1})$, with $\theta_n \to 1$ as $n \to \infty$. We consider the following quantile regression
model:
$$Q_{Y_t}(\theta \mid \Omega_{t-1}) = f(\Omega_{t-1}; \beta_\theta),$$
where $\Omega_t$ is a vector of information available at time $t$, and the parameter vector $\beta_\theta$
depends on θ.
First we estimate a sequence of intermediate conditional quantiles. Define a sequence
of quantile levels $0 < \theta_{n-k} < \theta_{n-k+1} < \cdots < \theta_m < 1$, where $m = n - [n^\eta]$
for some $0 < \eta < 1$, $[u]$ denotes the integer part of $u$, and $\theta_j = j/(n+1)$. We
assume $k = k_n \to \infty$ and $k/n \to 0$ as $n \to \infty$, and $n^\eta < k$. We estimate the quantiles
at these levels by quantile regression:
$$\hat\beta_{\theta_j} = \arg\min_\beta \sum_{t=1}^{n} \rho_{\theta_j}\left(y_t - f(\Omega_{t-1}; \beta)\right),$$
where $\rho_\theta(u) = u(\theta - I(u < 0))$.
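For $j = n-k, \ldots, m$, define $\hat Q_{Y_t}(\theta_j \mid \Omega_{t-1}) = f(\Omega_{t-1}; \hat\beta_{\theta_j})$. These can be roughly
regarded as the upper order statistics of a sample from $F_{Y_t}(\cdot \mid \Omega_{t-1})$. Since we assumed
$F_{Y_t}(\cdot \mid \Omega_{t-1}) \in D(G_\gamma)$ with the extreme value index γ > 0, we can estimate γ by the Hill
estimator based on the pseudo order statistics $\hat Q_{Y_t}(\theta_{n-k} \mid \Omega_{t-1}), \ldots, \hat Q_{Y_t}(\theta_{n-[n^\eta]} \mid \Omega_{t-1})$,
$$\hat\gamma = \frac{1}{k - [n^\eta]} \sum_{j=1}^{k-[n^\eta]} \log\frac{\hat Q_{Y_t}(\theta_{n-[n^\eta]-j+1} \mid \Omega_{t-1})}{\hat Q_{Y_t}(\theta_{n-k} \mid \Omega_{t-1})},$$
where the sum runs over the $k - [n^\eta]$ pseudo order statistics above the threshold $\hat Q_{Y_t}(\theta_{n-k} \mid \Omega_{t-1})$.
Consequently, $Q_{Y_t}(\theta_n \mid \Omega_{t-1})$ can be estimated by
$$\hat Q_{Y_t}(\theta_n \mid \Omega_{t-1}) = \left(\frac{1-\theta_{n-k}}{1-\theta_n}\right)^{\hat\gamma} \hat Q_{Y_t}(\theta_{n-k} \mid \Omega_{t-1}).$$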
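The extrapolation step can be sketched as follows, assuming the intermediate conditional quantile estimates are already available and positive. The function name and argument layout are assumptions made for this illustration; the quantile regression fits that produce the inputs are not shown.

```python
import math


def extrapolate_extreme_quantile(q_inter, theta_inter, theta_n):
    """Hill-type extrapolation from intermediate quantile estimates.

    q_inter     : estimated quantiles at the levels theta_inter, ordered
                  from theta_{n-k} (first) to theta_{n-[n^eta]} (last);
                  all values are assumed to be positive
    theta_inter : the matching quantile levels, increasing
    theta_n     : the extreme level to extrapolate to
    Returns (gamma_hat, extrapolated quantile).
    """
    if len(q_inter) != len(theta_inter) or len(q_inter) < 2:
        raise ValueError("need at least two matching quantiles and levels")
    base_q, base_theta = q_inter[0], theta_inter[0]
    if base_q <= 0:
        raise ValueError("the pseudo order statistics must be positive")
    # Average log-ratio of the pseudo order statistics above the threshold.
    log_ratios = [math.log(q / base_q) for q in q_inter[1:]]
    gamma_hat = sum(log_ratios) / len(log_ratios)
    scale = ((1.0 - base_theta) / (1.0 - theta_n)) ** gamma_hat
    return gamma_hat, scale * base_q
```

The first element of `q_inter` plays the role of the threshold, so the average is taken over the remaining k − [n^η] values.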
3 Simulation Study
In this section, we conduct a simulation study to compare the performance
of our EVT method and the conventional quantile regression method for estimating
CAViaR models. We consider the three CAViaR models described in the previous
section. The data are generated from
$$y_t = \sigma_t \varepsilon_t, \quad \varepsilon_t \sim \text{i.i.d. } t^*(3), \quad t = 1, \ldots, T,$$
where $t^*(3)$ is the standardized Student-t distribution with 3 degrees of freedom. Note
that the extreme value index γ of t(3) is 1/3 > 0.
The conditional standard deviation processes σt are specified as follows according to each of the
CAViaR models.
[Case 1] Symmetric Absolute Value model
$$\sigma_t = 0.1 + 0.8\sigma_{t-1} + 0.03|y_{t-1}|$$
[Case 2] Asymmetric Slope model
$$\sigma_t = 0.1 + 0.8\sigma_{t-1} + 0.01\max(y_{t-1}, 0) + 0.03\max(-y_{t-1}, 0)$$
[Case 3] Indirect GARCH model
$$\sigma_t^2 = 0.1 + 0.8\sigma_{t-1}^2 + 0.1 y_{t-1}^2$$
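The three data-generating processes can be simulated with the Python standard library alone. This is a sketch under stated assumptions: the starting values `sigma0` and `y0` and the seed handling are not given in the text, and the function names are hypothetical.

```python
import math
import random


def standardized_t3(rng):
    """Draw from the Student-t distribution with 3 d.f., scaled to unit variance."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
    t3 = z / math.sqrt(chi2 / 3.0)
    return t3 / math.sqrt(3.0)  # Var(t(3)) = 3


def simulate_returns(case, T, seed=0, sigma0=1.0, y0=0.0):
    """Simulate y_t = sigma_t * eps_t for Case 1, 2 or 3.

    sigma0 and y0 are assumed starting values, not taken from the text.
    Returns (returns, sigmas), each of length T.
    """
    rng = random.Random(seed)
    sigma, y_prev = sigma0, y0
    ys, sigmas = [], []
    for _ in range(T):
        if case == 1:  # Symmetric Absolute Value
            sigma = 0.1 + 0.8 * sigma + 0.03 * abs(y_prev)
        elif case == 2:  # Asymmetric Slope
            sigma = (0.1 + 0.8 * sigma
                     + 0.01 * max(y_prev, 0.0) + 0.03 * max(-y_prev, 0.0))
        elif case == 3:  # Indirect GARCH
            sigma = math.sqrt(0.1 + 0.8 * sigma ** 2 + 0.1 * y_prev ** 2)
        else:
            raise ValueError("case must be 1, 2 or 3")
        y_prev = sigma * standardized_t3(rng)
        ys.append(y_prev)
        sigmas.append(sigma)
    return ys, sigmas
```

In practice one would probably discard an initial burn-in stretch so that the arbitrary starting values have little influence.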
We consider three sample sizes, T = 500, 1000 and 2000. We generate T observations
from these models, then estimate $Q_{Y_{T+1}}(\theta \mid \Omega_T)$ for θ = 0.01, 0.005 and 0.001. The true
one-step-ahead quantile is $Q_{Y_{T+1}}(\theta \mid \Omega_T) = \sigma_{T+1} T_3^{*-1}(\theta)$, with $T_3^{*-1}(\theta) = T_3^{-1}(\theta)/\sqrt{3}$, where
$T_3^{-1}$ is the inverse Student-t cdf with 3 degrees of freedom. Therefore, the true CAViaR models become
[Case 1] Symmetric Absolute Value model
$$Q_{Y_{T+1}}(\theta \mid \Omega_T) = 0.1\,T_3^{*-1}(\theta) + 0.8\,Q_{Y_T}(\theta \mid \Omega_{T-1}) + 0.03\,T_3^{*-1}(\theta)\,|y_T|$$
[Case 2] Asymmetric Slope model
$$Q_{Y_{T+1}}(\theta \mid \Omega_T) = 0.1\,T_3^{*-1}(\theta) + 0.8\,Q_{Y_T}(\theta \mid \Omega_{T-1}) + 0.01\,T_3^{*-1}(\theta)\max(y_T, 0) + 0.03\,T_3^{*-1}(\theta)\max(-y_T, 0)$$
[Case 3] Indirect GARCH model
$$Q_{Y_{T+1}}(\theta \mid \Omega_T) = \left(0.1\,T_3^{*-1}(\theta) + 0.8\,Q_{Y_T}(\theta \mid \Omega_{T-1})^2 + 0.1\,T_3^{*-1}(\theta)\,y_T^2\right)^{1/2}$$
We calculate the true quantiles from these formulas. The number of replications is
500 for each scenario.
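The standardized t(3) quantile can be computed without external libraries, because the Student-t cdf with 3 degrees of freedom has a closed form. The sketch below inverts it by bisection; the function names and the bracketing interval are assumptions made for illustration.

```python
import math


def t3_cdf(x):
    """Closed-form cdf of the Student-t distribution with 3 degrees of freedom."""
    s = x / math.sqrt(3.0)
    return 0.5 + (s / (1.0 + s * s) + math.atan(s)) / math.pi


def t3_quantile(theta, lo=-1e6, hi=1e6):
    """Invert t3_cdf by bisection on the assumed bracket [lo, hi]."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if t3_cdf(mid) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def true_quantile(sigma, theta):
    """Quantile of sigma * eps, where eps is t(3) scaled to unit variance."""
    return sigma * t3_quantile(theta) / math.sqrt(3.0)
```

Here `sigma` is the conditional scale of the return being forecast, so the true one-step-ahead quantile follows directly from the simulated scale process.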
To optimize the objective function in the quantile regression estimation, we follow
Engle and Manganelli (2004). First, we generate 1000 parameter vectors, drawing each
component from the uniform distribution U(0, 1) if the true parameter is positive, and from
U(−1, 0) if the true parameter is negative. Specifically, β2 in all the models is generated from
U(0, 1) and the others from U(−1, 0). Then we compute the objective function for
each of these vectors and select the 15 vectors that produce the lowest values of the
objective function. These vectors are used as initial values for the optimization. For
each of these initial values, we first run the Nelder-Mead simplex algorithm. Then we
use the resulting optimal parameters as the initial parameters for a quasi-Newton algorithm
to get new optimal parameters, and run the simplex algorithm again with the new
initial values. We repeat this procedure until the convergence criterion is satisfied.
The convergence criterion is set to $10^{-3}$ for the parameter values. Finally, we select
the vector that produces the lowest value of the objective function among the 15
optimization procedures.
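The first stage of this procedure, drawing random parameter vectors and keeping the 15 with the lowest objective value, can be sketched for the Symmetric Absolute Value model as follows. Only this screening stage is shown; the Nelder-Mead and quasi-Newton refinements are omitted. The function names and the starting quantile `q0` are assumptions made for this illustration.

```python
import random


def check_loss(u, theta):
    """Check function rho_theta(u) = u * (theta - I(u < 0))."""
    return u * (theta - (1.0 if u < 0 else 0.0))


def sav_objective(beta, y, theta, q0):
    """Quantile regression objective for the Symmetric Absolute Value model.

    q0 is an assumed quantile for the first observation; each later
    quantile is updated from the previous quantile and return.
    """
    b1, b2, b3 = beta
    q, total = q0, 0.0
    for yt in y:
        total += check_loss(yt - q, theta)
        q = b1 + b2 * q + b3 * abs(yt)
    return total


def select_initial_vectors(y, theta, q0, n_draws=1000, n_keep=15, seed=0):
    """Draw beta1, beta3 from U(-1, 0) and beta2 from U(0, 1); keep the best."""
    rng = random.Random(seed)
    draws = [(rng.uniform(-1.0, 0.0), rng.uniform(0.0, 1.0), rng.uniform(-1.0, 0.0))
             for _ in range(n_draws)]
    draws.sort(key=lambda b: sav_objective(b, y, theta, q0))
    return draws[:n_keep]
```

Each of the returned vectors would then serve as a starting point for a local optimizer of the reader's choice.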
To estimate the quantile by the EVT method, we set the tuning parameters η = 0.2
and $k = [5T^{1/3}]$.
The Hill estimator is defined only when the data are positive. However, though the
true quantiles cannot be negative, the estimated quantiles can sometimes be negative.
Therefore we apply a shift to the data to compute the Hill estimator. The size of the shift
is set to the minimum of the data. More specifically, let $x_1, \ldots, x_n$ be the observed
data and let $M_n$ denote their maximum. We transform the data to $x_1 + M_n, \ldots, x_n + M_n$,
then obtain the quantile estimators of the transformed data, denoted by $\hat Q^*_{\theta_{n-k}}, \ldots, \hat Q^*_{\theta_n}$,
which are used to calculate the Hill estimator $\hat\gamma$. Then we estimate the shifted quantile
by
$$\hat Q^*_n = \left(\frac{1-\theta_{n-k}}{1-\theta_n}\right)^{\hat\gamma} \hat Q^*_{n-k},$$
and the quantile estimate becomes
$$\hat Q(\theta_n \mid x) = \hat Q^*_n - M_n.$$
We should note that one of the main weaknesses of the Hill estimator is that it is not
shift invariant. Although a shift of the data does not affect the value of the extreme
value index γ, the convergence speed of the Hill estimator may change due to the shift.
To compare the performance, we focus on the mean absolute error (MAE) and
the root mean squared error (RMSE). The MAE is defined by $\frac{1}{n}\sum_{i=1}^{n} |q_i^0 - \hat q_i|$, and the
RMSE by $\sqrt{\frac{1}{n}\sum_{i=1}^{n} (q_i^0 - \hat q_i)^2}$, where $q_i^0$ denotes the true quantile, $\hat q_i$ is the estimated
quantile, $i$ indexes the replications, and $n$ is the number of replications. Meanwhile, the true
conditional quantile varies across replications, since it depends on the time-varying standard
deviation of the returns. Therefore, we also compare normalized versions of MAE and RMSE, defined
by $\frac{1}{n}\sum_{i=1}^{n} |(q_i^0 - \hat q_i)/q_i^0|$ and $\sqrt{\frac{1}{n}\sum_{i=1}^{n} ((q_i^0 - \hat q_i)/q_i^0)^2}$.
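The four error measures translate directly into code. The following sketch uses hypothetical function names and assumes that the true quantiles are nonzero, so that the normalized versions are well defined.

```python
import math


def mae(true_q, est_q):
    """Mean absolute error over replications."""
    return sum(abs(q0 - q) for q0, q in zip(true_q, est_q)) / len(true_q)


def rmse(true_q, est_q):
    """Root mean squared error over replications."""
    return math.sqrt(sum((q0 - q) ** 2 for q0, q in zip(true_q, est_q)) / len(true_q))


def normalized_mae(true_q, est_q):
    """MAE of the errors relative to the true quantile (true_q must be nonzero)."""
    return sum(abs((q0 - q) / q0) for q0, q in zip(true_q, est_q)) / len(true_q)


def normalized_rmse(true_q, est_q):
    """RMSE of the errors relative to the true quantile (true_q must be nonzero)."""
    return math.sqrt(
        sum(((q0 - q) / q0) ** 2 for q0, q in zip(true_q, est_q)) / len(true_q)
    )
```

For example, with true quantiles (−2, −4) and estimates (−1, −5), both the MAE and the RMSE equal 1, while the normalized MAE equals 0.375.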
Table 1: MAE and RMSE of QYT+1(θ|ΩT).

                      Symmetric Absolute Value  Asymmetric Slope    Indirect GARCH
T     Method  θ       MAE          RMSE         MAE       RMSE      MAE       RMSE
2000  QR      0.01    1.294        1.482        1.485     1.818     0.650     1.089
2000  QR      0.005   1.751        2.063        1.979     2.312     0.889     1.520
2000  QR      0.001   3.976        4.875        4.045     4.547     2.384     4.115
2000  EVT     0.01    0.375        0.496        0.961     1.148     0.693     1.033
2000  EVT     0.005   0.583        0.732        1.081     1.398     0.945     1.383
2000  EVT     0.001   1.603        1.809        1.287     2.085     2.186     2.946
1000  QR      0.01    1.367        1.658        1.396     1.572     0.683     1.036
1000  QR      0.005   1.875        2.280        1.983     2.183     1.020     1.710
1000  QR      0.001   3.859        4.823        4.034     4.358     3.213     5.082
1000  EVT     0.01    0.414        0.525        0.847     1.125     0.700     1.018
1000  EVT     0.005   0.647        0.781        0.923     1.319     1.000     1.435
1000  EVT     0.001   1.713        1.906        1.221     1.832     2.356     3.076
500   QR      0.01    1.544        2.133        1.375     1.580     0.913     1.493
500   QR      0.005   2.070        2.647        2.032     2.262     1.352     2.169
500   QR      0.001   3.659        3.952        4.022     4.292     3.411     5.277
500   EVT     0.01    0.538        0.688        0.701     0.952     0.907     1.701
500   EVT     0.005   0.806        0.985        0.805     1.143     1.313     2.316
500   EVT     0.001   1.927        2.195        1.333     1.845     2.974     4.421
Table 2: Normalized versions of MAE and RMSE of QYT+1(θ|ΩT).

                      Symmetric Absolute Value  Asymmetric Slope    Indirect GARCH
T     Method  θ       MAE          RMSE         MAE       RMSE      MAE       RMSE
2000  QR      0.01    0.596        0.672        0.628     0.699     0.169     0.224
2000  QR      0.005   0.629        0.740        0.649     0.711     0.179     0.240
2000  QR      0.001   0.809        0.968        0.763     0.800     0.284     0.428
2000  EVT     0.01    0.161        0.199        0.419     0.484     0.184     0.250
2000  EVT     0.005   0.198        0.234        0.363     0.450     0.195     0.264
2000  EVT     0.001   0.319        0.347        0.235     0.353     0.258     0.320
1000  QR      0.01    0.637        0.759        0.605     0.655     0.191     0.251
1000  QR      0.005   0.677        0.806        0.671     0.714     0.227     0.363
1000  QR      0.001   0.786        0.904        0.780     0.813     0.417     0.667
1000  EVT     0.01    0.184        0.219        0.370     0.453     0.197     0.273
1000  EVT     0.005   0.226        0.260        0.310     0.409     0.219     0.302
1000  EVT     0.001   0.348        0.376        0.228     0.313     0.297     0.373
500   QR      0.01    0.708        0.906        0.599     0.665     0.257     0.394
500   QR      0.005   0.741        0.916        0.689     0.745     0.296     0.447
500   QR      0.001   0.758        0.811        0.782     0.804     0.432     0.654
500   EVT     0.01    0.243        0.285        0.311     0.417     0.248     0.412
500   EVT     0.005   0.285        0.326        0.275     0.385     0.281     0.439
500   EVT     0.001   0.393        0.430        0.255     0.341     0.369     0.490
Figure 1: Estimated and true quantiles of the Symmetric Absolute Value model with T = 500 and the quantile level θ = 0.01 (panels: True, EVT, QR).
Table 1 reports the MAE and RMSE, and Table 2 shows their normalized versions.
EVT denotes our EVT-based estimation method and QR denotes the conventional
quantile regression method.
Both tables show that the EVT-based estimation method is useful for forecasting
VaR with each of the CAViaR models. According to Table 1, the EVT method outperforms
the QR method for the Symmetric Absolute Value model and the Asymmetric
Slope model at every quantile level and sample size. In the Indirect GARCH case, we
cannot judge which method performs better when θ = 0.01 or 0.005. In the more
extreme case, θ = 0.001, however, the EVT method clearly outperforms the quantile
Figure 2: Estimated and true quantiles of the Symmetric Absolute Value model with T = 500 and the quantile level θ = 0.001 (panels: True, EVT, QR).
regression method. Looking at the normalized versions of MAE and RMSE, in the Symmetric
Absolute Value model and the Asymmetric Slope model, the EVT method also performs
much better than the quantile regression method in all situations. On the other
hand, in the Indirect GARCH model with θ = 0.01, the quantile regression method performs
better than the EVT method. However, when θ = 0.001, the EVT method outperforms
the quantile regression method for all sample sizes. When θ = 0.005, the
EVT method is better when T = 500 and 1000, that is, when the sample size is rather small.
Consequently, the EVT method outperforms the conventional quantile regression method,
especially when the quantile level is more extreme and the sample size is small.
Figures 1 and 2 plot the quantiles $Q_{Y_{T+1}}(\theta \mid \Omega_T)$ estimated by the Symmetric Absolute
Value model and the true quantiles for each simulation, where T = 500. The quantile level is
θ = 0.01 in Figure 1 and θ = 0.001 in Figure 2. Although both the QR and EVT methods
underestimate the true quantiles, the EVT estimates are closer to the true quantiles than
the QR estimates. The EVT method seems to be quite stable, especially when θ = 0.01.
On the other hand, the QR estimates are considerably unstable even when θ = 0.01.
4 Real Data Analysis
In this section, we apply the proposed method to empirical data to study its performance
in forecasting the VaR. We use the daily log return series of the Japanese
Nikkei 225 stock index from January 1990 to December 2014; the length of the data
is 5903. Table 3 provides the summary statistics of the return series. The standard
errors are in parentheses, and JB denotes the Jarque-Bera test statistic. The table shows
that the return series is negatively skewed and has heavy tails, as evidenced by the
significantly positive kurtosis. Consequently, normality is strongly rejected by
the Jarque-Bera test.
Table 3: Summary statistics of the Nikkei 225 log-return series

Mean      Variance  Skewness  Kurtosis  JB
−0.005    2.306     −0.209    5.047     6306.5
(0.020)             (0.032)   (0.064)
We estimate the VaR for 3903 days, from February 2, 1999 to December 30, 2014.
We apply the rolling window method. To estimate the VaR of time t with window size
T , we use rt−T , . . . , rt−1 where rt denotes the return, to estimate the parameters of
the CAViaR models and then obtain the VaR estimates from the estimated models.
We examine all the three CAViaR models, and set the window size T = 500, 1000
and 2000. The estimated quantile levels are θ = 0.01, 0.005 and 0.001. For the EVT
method, k is set to 35 for T = 500, 45 for T = 1000, and 50 for T = 2000. The data are
also shifted, as in the simulation in the previous section, to calculate the Hill estimator. The
Table 4: Violation rates of the Nikkei 225

              Symmetric Absolute Value  Asymmetric Slope    Indirect GARCH
T     θ       QR           EVT          QR        EVT       QR        EVT
500   0.01    0.01051      0.02101      0.00999   0.02000   0.01102   0.01742
500   0.005   0.00461      0.01486      0.00615   0.01256   0.00538   0.01102
500   0.001   0.00231      0.00692      0.00205   0.00410   0.00103   0.00436
1000  0.01    0.00999      0.02358      0.01256   0.02230   0.01128   0.02050
1000  0.005   0.00641      0.01332      0.00743   0.01333   0.00615   0.01153
1000  0.001   0.00333      0.00743      0.00333   0.00538   0.00205   0.00436
2000  0.01    0.01461      0.02870      0.01435   0.02665   0.01461   0.02768
2000  0.005   0.00974      0.02127      0.00118   0.01691   0.01128   0.01640
2000  0.001   0.00666      0.01153      0.00897   0.00743   0.00461   0.00641
size of the shift is set to the maximum return in the window.
The relative performance of each estimation method is evaluated in terms of the
violation ratio. A violation occurs when a realized return falls below the estimated VaR.
The violation ratio is defined as the ratio of the number of violations to the total
number of VaR forecasts. If the VaR is estimated well, the violation ratio will be close
to the quantile level θ of the VaR.
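The definition of the violation ratio translates directly into code. The following minimal sketch assumes that the realized returns and the VaR forecasts are aligned element by element; the function name and the toy numbers are illustrative only.

```python
def violation_ratio(returns, var_forecasts):
    """Share of periods in which the realized return falls below the VaR forecast."""
    if len(returns) != len(var_forecasts) or not returns:
        raise ValueError("inputs must be non-empty and of equal length")
    violations = sum(1 for r, v in zip(returns, var_forecasts) if r < v)
    return violations / len(returns)

# Toy example: only the first return falls below its VaR forecast.
toy_returns = [-0.03, 0.01, -0.01, 0.02]
toy_var = [-0.02, -0.02, -0.02, -0.02]
print(violation_ratio(toy_returns, toy_var))
```

A ratio well above θ means the forecasts are violated too often, and a ratio well below θ means the forecasts are overly conservative.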
The results are reported in Table 4. Our EVT-based VaR estimation method
performs worse than the conventional quantile regression estimation method in every
model. The violation ratios of the EVT estimators are considerably higher than the
quantile levels, which means that the VaR forecasts are too liberal. One possible reason for this result
is the underestimation of the extreme value index caused by shifting the data,
which makes the estimated tail too light. As a result, the estimated quantiles tend
to be higher than the true quantiles.
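The sensitivity of the Hill estimator to a location shift can be illustrated numerically. The sketch below is an illustration under simplifying assumptions, not the procedure used in this paper: it draws an i.i.d. Pareto sample with extreme value index 0.5, adds a hypothetical positive shift of 10, and compares the two Hill estimates. The sample size, k and the size of the shift are arbitrary choices.

```python
import math
import random

def hill_estimator(data, k):
    """Hill estimator of the extreme value index from the k largest observations."""
    xs = sorted(data, reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(data)")
    if xs[k] <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    log_threshold = math.log(xs[k])  # log of the (k+1)-th largest value
    return sum(math.log(x) - log_threshold for x in xs[:k]) / k

rng = random.Random(1)
gamma = 0.5
# Pareto sample: U^(-gamma) with U uniform on (0, 1] has extreme value index gamma.
sample = [(1.0 - rng.random()) ** (-gamma) for _ in range(20000)]
shifted = [x + 10.0 for x in sample]  # hypothetical positive shift

hill_plain = hill_estimator(sample, k=500)
hill_shifted = hill_estimator(shifted, k=500)
print(hill_plain, hill_shifted)
```

For positive data, a positive shift c always lowers the estimate, because (x + c)/(y + c) < x/y whenever x > y > 0. Every log-spacing in the sum shrinks, and the tail looks lighter than it is.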
Figures 3 and 4 plot the estimated VaR from the Symmetric Absolute Value model with
T = 500 together with the observed Nikkei 225 return series. Contrary to the simulation
results, the EVT estimates are higher than the QR estimates, leading to the overestimation of
Figure 3: Nikkei 225 log-return series and forecasted VaR series from the Symmetric Absolute Value model with T = 500, θ = 0.01 (legend: Returns, EVT, QR; horizontal axis from 0 to 4000, vertical axis from −0.3 to 0.1)
the quantiles. When the volatility of the returns appears to change, the EVT estimates
become considerably lower than the QR estimates. The range of fluctuation
of the VaR forecast by EVT is smaller than that by QR, which is consistent with the
simulation results.
5 Conclusion
In this paper, we proposed a VaR estimation method by integrating a CAViaR model
and extreme value theory. We model the dynamics of quantiles by a CAViaR model
Figure 4: Nikkei 225 log-return series and forecasted VaR series from the Symmetric Absolute Value model with T = 500, θ = 0.001 (legend: Returns, EVT, QR; horizontal axis from 0 to 4000, vertical axis from −0.3 to 0.1)
and estimate extreme quantiles by applying extreme value theory. We first estimate
the intermediate quantile series by quantile regression, and then extrapolate to the extreme quantiles
from the estimated intermediate quantiles. The simulation study showed that this extrapolation leads
to more accurate quantile estimation than the conventional quantile regression.
The empirical analysis of the log-return series of a stock index, however, showed that
our estimation method performs worse than the conventional quantile regression
estimation method in terms of the violation ratio. This result may be caused by shifting
the data to construct the Hill estimator, which is not shift invariant. Shifting the
data directly affects the value of the Hill estimator and therefore also affects the estimated
quantiles. This is a major disadvantage of the Hill estimator.
For further study, we suggest using other estimators of the extreme value index, such as the
maximum likelihood estimator or the moment estimator. Adopting those estimators allows us to estimate a general
γ ∈ R, which relaxes the heavy-tailed assumption. In
addition, the asymptotic properties of the Hill estimator and the quantile estimator based
on the pseudo order statistics should be investigated. The i.i.d. assumption and the linearity
assumption of Wang et al. (2012) are violated in this paper.
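To illustrate how a general γ ∈ R can be estimated, the sketch below implements the standard moment estimator, γ = M1 + 1 − (1/2)(1 − M1^2/M2)^(−1), where M1 and M2 are the first and second empirical moments of the log-spacings above the (k+1)-th largest observation. This is an illustration on simulated i.i.d. data, not an analysis from this paper. The sample size, k and the two test distributions are arbitrary assumptions.

```python
import math
import random

def moment_estimator(data, k):
    """Moment estimator of the extreme value index from the k largest observations."""
    xs = sorted(data, reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(data)")
    if xs[k] <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    log_threshold = math.log(xs[k])
    spacings = [math.log(x) - log_threshold for x in xs[:k]]
    m1 = sum(spacings) / k
    m2 = sum(s * s for s in spacings) / k
    if m2 == 0.0 or m1 * m1 >= m2:
        raise ValueError("degenerate sample: the top observations are all tied")
    return m1 + 1.0 - 0.5 / (1.0 - m1 * m1 / m2)

rng = random.Random(2)
# Heavy-tailed Pareto sample with extreme value index 0.5.
pareto = [(1.0 - rng.random()) ** (-0.5) for _ in range(20000)]
# Uniform sample on [1, 2]: a finite right endpoint, extreme value index -1.
uniform = [rng.uniform(1.0, 2.0) for _ in range(20000)]

gamma_pareto = moment_estimator(pareto, k=500)
gamma_uniform = moment_estimator(uniform, k=500)
print(gamma_pareto, gamma_uniform)
```

The first term M1 is the Hill estimator, which is positive by construction. The correction term is what allows the estimate to become negative for short-tailed data such as the uniform sample.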
Acknowledgements
I would like to express my deepest appreciation to Prof. Eiji Kurozumi for his helpful
suggestions and constructive comments. Without his guidance and persistent help, this
paper would not have been possible. I would also like to express my appreciation to
the Center of Financial Engineering Education for supporting my research.
References
[1] Bao, Y., Lee, T. H., and Saltoglu, B. (2006), “Evaluating Predictive Performance
of Value-at-Risk Models in Emerging Markets: a Reality Check,” Journal of
Forecasting, 25, 101–128.
[2] Byström, H. N. E. (2004), “Managing Extreme Risks in Tranquil and Volatile
Markets Using Conditional Extreme Value Theory,” International Review of Financial
Analysis, 13(2), 133–152.
[3] Danielsson, J., and de Vries, C. G. (1997), “Tail Index and Quantile Estimation
with Very High Frequency Data,” Journal of Empirical Finance, 4(2), 241–257.
[4] Drees, H. (2003), “Extreme Quantile Estimation for Dependent Data, with
Applications to Finance,” Bernoulli, 9(4), 617–657.
[5] Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997), Modelling Extremal
Events, Springer-Verlag Berlin Heidelberg.
[6] Engle, R. F., and Manganelli, S. (2004), “CAViaR: Conditional Autoregressive
Value at Risk by Regression Quantiles,” Journal of Business & Economic
Statistics, 22(4), 367–381.
[7] Fisher, R. A., and Tippett, L. H. C. (1928), “Limiting Forms of the Frequency
Distribution of the Largest or Smallest Member of a Sample,” Mathematical
Proceedings of the Cambridge Philosophical Society, 24(2), 180–190.
[8] Gnedenko, B. (1943), “Sur la Distribution Limite du Terme Maximum d’une Série
Aléatoire,” Annals of Mathematics, 423–453.
[9] Gumbel, E. J. (1958), Statistics of Extremes, Columbia University Press, New
York.
[10] de Haan, L., and Ferreira, A. (2006), Extreme Value Theory: An Introduction,
New York: Springer.
[11] Hill, B. M. (1975), “A Simple General Approach to Inference about the Tail of a
Distribution,” The Annals of Statistics, 3(5), 1163–1174.
[12] Hill, J. B. (2010), “On Tail Index Estimation for Dependent, Heterogeneous Data,”
Econometric Theory, 26(5), 1398–1436.
[13] Hsing, T. (1991), “On Tail Index Estimation Using Dependent Data,” The Annals
of Statistics, 1547–1569.
[14] Koenker, R., and Bassett, G. (1978), “Regression Quantiles,” Econometrica, 46,
33–50.
[15] Koenker, R. (2005), Quantile Regression, Cambridge: Cambridge University Press.
[16] Kuester, K., Mittnik, S., and Paolella, M. S. (2006), “Value-at-Risk Prediction:
a Comparison of Alternative Strategies,” Journal of Financial Econometrics, 4,
53–89.
[17] Leadbetter, M. R. (1974), “On Extreme Values in Stationary Sequences,”
Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 28(4), 289–303.
[18] Leadbetter, M. R., Lindgren, G., and Rootzén, H. (1983), Extremes and Related
Properties of Random Sequences and Processes, Springer, Berlin.
[19] McNeil, A. J., and Frey, R. (2000), “Estimation of Tail-Related Risk Measures for
Heteroscedastic Financial Time Series: an Extreme Value Approach,” Journal of
Empirical Finance, 7(3), 271–300.
[20] McNeil, A. J., Frey, R., and Embrechts, P. (2015), Quantitative Risk Management:
Concepts, Techniques and Tools, Princeton University Press.
[21] Pownall, R. A., and Koedijk, K. G. (1999), “Capturing Downside Risk in
Financial Markets: the Case of the Asian Crisis,” Journal of International Money and
Finance, 18(6), 853–870.
[22] Resnick, S., and Stărică, C. (1995), “Consistency of Hill’s Estimator for
Dependent Data,” Journal of Applied Probability, 139–167.
[23] Resnick, S., and Stărică, C. (1997), “Asymptotic Behavior of Hill’s Estimator
for Autoregressive Data,” Communications in Statistics. Stochastic Models, 13(4),
703–721.
[24] Resnick, S., and Stărică, C. (1998), “Tail Index Estimation for Dependent Data,”
The Annals of Applied Probability, 8(4), 1156–1183.
[25] Wang, J. H., Li, D., and He, X. (2012), “Estimation of High Conditional Quantiles
for Heavy-Tailed Distributions,” Journal of the American Statistical Association,
107, 1453–1464.
[26] Yu, P. L. H., Li, W. K., and Jin, S. (2010), “On Some Models for Value-at-Risk,”
Econometric Reviews, 29(5-6), 622–641.