RiskMetrics Journal Volume 8, Number 1 Winter 2008 › Volatility Forecasts and At-the-Money Implied Volatility › Inflation Risk Across the Board › Extensions of the Merger Arbitrage Risk Model › Measuring the Quality of Hedge Fund Data › Capturing Risks of Non-transparent Hedge Funds
a RiskMetrics Group Publication
On the Cover:
Weights wk(∆T) as a function of the forecast horizon ∆T.
This article explores the relationship between realized volatility, implied volatility and several forecasts for the volatility built from multiscale linear ARCH processes. The forecasts are derived from the process equations, and the parameters are set according to different risk methodologies (RM1994, RM2006). An empirical analysis across multiple time horizons shows that a forecast provided by an I-GARCH(1) process (one time scale) does not correctly capture the dynamics of the realized volatility. An I-GARCH(2) process (two time scales, similar to GARCH(1,1)) is better, but only a long memory LM-ARCH process (multiple time scales) correctly replicates the dynamics of the realized volatility. Moreover, the forecasts provided by the LM-ARCH process are close to the implied volatility. The relationship between market models for the forward variance and the volatility forecasts provided by ARCH processes is investigated. The structure of the forecast equations is identical, but with different coefficients. Yet the process equations for the variance induced by the process equations for an ARCH model are very different from those postulated for a market model, and not of any usual diffusive type when derived from ARCH.
1 Introduction
The intuition behind volatility is to measure price fluctuations, or equivalently, the typical magnitude
for the price changes. Yet beyond the first intuition, volatility is a fairly complex concept for various
reasons. First, turning this intuition into formulas and numbers is partly arbitrary, and many
meaningful and useful definitions of volatilities can be given. Second, the volatility is not directly
“observed” or traded, but rather computed from time series (although this situation is changing
indirectly through the ever increasing and sophisticated option market and the volatility indexes). For
trading strategies, options and risk evaluations, the valuable quantity is the realized volatility, namely
the volatility that will occur between the current time t and some time in the future t + ∆T. As this
quantity is not available at time t, a forecast needs to be constructed. Clearly, a better forecast of the
realized volatility leads to better decisions in all these applications.
At a time t, a forecast for the realized volatility can be constructed from the (underlying) price time
series. In this paper, multiscale ARCH processes are used. On the other hand, a liquid option market
furnishes the implied volatility, corresponding to the “market” forecast for the realized volatility. On
the theoretical side, an “instantaneous”, or effective, volatility σeff is needed to define processes, and
the forward variance. Therefore, at a given time t, we have mainly one theoretical instantaneous
volatility and three notions of “observable” volatility (forecasted, implied and realized). This paper
studies the empirical relationship between these three time series, as a function of the forecast horizon
∆T.1
The main line of this work is to model the underlying time series by multicomponent ARCH
processes, and to derive an implied volatility forecast. This forecast is close to the implied volatility
of the at-the-money option. Such an approach produces a volatility surface based only on the
underlying time series, and therefore a surface can be inferred even when option data is poor or not
available. This article does not address the issue of the full surface, but only the implied volatility for
the at-the-money options, called the backbone.
A vast literature on implied volatility and its dynamic already exists. In this article, we will review
some recent developments on market models for the forward variance. These models focus on the
volatility as a process, and many process equations can be set that are compatible with a martingale
condition for the volatility. On the other hand, the volatility forecast induced by a multicomponent
ARCH process also leads to process equations for the volatility itself. These two approaches to
the volatility process are contrasted, showing the formal similarity in the structure of the forecasts,
but the very sharp difference in the processes for the volatility. If the price time series behaves
according to some ARCH process, then the implication for volatility modeling is far-reaching, as the
usual structure based on Wiener processes cannot be used.
This paper is organized as follows. The required definitions for the volatilities and forward variance
are given in the next section. The various multicomponent ARCH processes are introduced in
Section 3, and the induced volatility forecasts and processes are given in Sections 4 and 5. The market
models and the associated volatility dynamics are presented in Section 6. The relationship between
market models, options and the ARCH forecasts is discussed in Section 7. Section 8 presents an
empirical investigation of the relationship between the forecasted, implied and realized volatilities.
Section 9 concludes.
1 There exists already an abundant literature on this topic, and (Poon 2005) published a book nicely summarizing the
available publications (approximately 100 articles on volatility forecasting alone!).
2 Definitions and setup of the problem
2.1 General
We assume that we stand at time t, with the corresponding information set Ω(t). The time increment for the
processes and the granularity of the data is denoted by δt, and is one day in the present work. We
assume that there exists an instantaneous volatility, denoted by σeff(t), which corresponds to the
annualized expected standard deviation of the price in the next time step δt. This is a useful quantity
for the definitions, but this volatility is essentially unobserved. In a process, σeff gives the magnitude
of the returns, as in (9) below.
2.2 Realized volatility
The realized volatility corresponds to the annualized standard deviation of the returns in the interval
between t and t + ∆T:

σ²(t, t + ∆T) = (1 year)/(n δt) ∑_{t < t′ ≤ t+∆T} r²(t′)   (1)

where r(t) are the (unannualized) returns measured over the time interval δt, and the ratio 1 year/δt
annualizes the volatility. The empirical section is done with daily data, and the returns are evaluated
over a one-day interval. If the returns do not overlap in the sum, then ∆T = n δt. At the time t, the
realized volatility cannot be evaluated from the information set Ω(t). The realized volatility is the
useful quantity we would like to forecast and to relate to the implied volatility.
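As a concrete illustration of Eq. (1), the following minimal sketch computes the annualized realized volatility from a window of daily returns. The function name and the 260-day year convention are our assumptions, not from the article.

```python
# Realized volatility per Eq. (1): annualized standard deviation of the
# (unannualized) daily returns over the window (t, t + DeltaT].
import math

def realized_volatility(returns, days_per_year=260):
    """Annualized realized volatility from non-overlapping daily returns."""
    n = len(returns)
    if n == 0:
        raise ValueError("need at least one return")
    # (1 year)/(n * delta_t) * sum of squared returns, with delta_t = 1 day.
    variance = (days_per_year / n) * sum(r * r for r in returns)
    return math.sqrt(variance)

# Constant 1% daily moves give sqrt(260) * 1% ~ 16% annualized volatility.
vol = realized_volatility([0.01] * 20)
```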
2.3 Forward variance
The expected cumulative variance is defined by

V(t, t + ∆T) = ∫_t^{t+∆T} dt′ E[σ²eff(t′) | Ω(t)]   (2)

and the forward variance by

v(t, t + ∆T) = ∂V(t, t + ∆T)/∂∆T = E[σ²eff(t + ∆T) | Ω(t)].   (3)
The cumulative variance is an extensive quantity, as it is proportional to ∆T. For empirical
investigation, it is simpler to work with an intensive quantity, as this removes a trivial dependency on
the time horizon. For this reason, the cumulative variance is used only in the theoretical part (hence
also the continuum definition with an integral), whereas the forecasted volatility is used in the
empirical part.
The variance enters into the variable leg of a variance swap, and as such, it is tradable. Related
tradable instruments are the volatility indexes like the VIX (but the relation is indirect as the index is
defined through the implied volatility of a basket of options). Because volatility is tradable, the forward
variance should be a martingale:

E[v(t′, T) | Ω(t)] = v(t, T).   (4)
For the volatility, this condition is quite weak, as it also follows from the chain rule for conditional
expectation

E[ E[σ²eff(T) | Ω(t′)] | Ω(t) ] = E[σ²eff(T) | Ω(t)]   for t < t′ < T   (5)

and from the definition of the forward variance as a conditional expectation. Therefore, any forecast
built as a conditional expectation produces a martingale for the forward variance.
At this level, there is a formal analogy with interest rates, with the (zero coupon) interest rate and
forward rate being analogous to the cumulative variance and forward variance. Therefore, some ideas
and equations can be borrowed from the interest rate field. For example, on the modeling side, one
can write a process for the cumulative variance or for the variance swap, the latter being more
convenient as the martingale condition gives simpler constraints on the possible equations. In this
paper, the ARCH path is followed, using a multiscale process for the underlying. The forward variance
is computed as an expectation, and therefore the martingale property follows. In Section 6, this
ARCH approach is contrasted with a direct model for the forward variance, where the martingale
condition has to be explicitly enforced.
2.4 The forecasted volatility
The forecasted volatility is defined by

σ²(t, t + ∆T) = (1/n) ∑_{t < t′ ≤ t+∆T} E[σ²eff(t′) | Ω(t)].   (6)

Up to a normalization and the transformation of the integral into a discrete sum, this definition is
similar to the expected cumulative variance.
2.5 The implied volatility
As usual, the implied volatility is defined as the volatility to insert into the Black-Scholes equation so
as to recover the market price of the option. The implied volatility σBS(m, ∆T) is a function of the
moneyness m and of the time to maturity ∆T. The moneyness can be defined in various ways, with
most definitions similar to m ≃ ln(F/K), and with F the forward price F = S exp(r ∆T). The (forward)
at-the-money option corresponds to m = 0. The backbone is the implied volatility at the money,
σBS(∆T) = σBS(m = 0, ∆T), as a function of the time to maturity ∆T. For a given time to maturity
∆T, the implied volatility as a function of moneyness is called the smile.
Intuitively, the implied volatility surface can loosely be decomposed into backbone × smile. The
rationale for this decomposition is that the two directions depend on different option features. The
backbone is related to the expected volatility until the option expiry:

σ(t, t + ∆T) = σBS(m = 0, ∆T)(t).   (7)

In the Black-Scholes formula, the volatility appears only through the combination ∆T σ²,
corresponding to the cumulative expected variance. In the other direction, the smile is the fudge factor
used to remedy the incomplete modeling of the underlying by a Gaussian random walk. The Black-Scholes
model has the key advantage of being solvable, but does not include many stylized facts like
heteroskedasticity, fat tails, or the leverage effect. These shortcomings translate into various “features” of
the smile.
In principle, (7) should be checked using empirical data. Yet this comparison raises a number of
issues, on both sides of the equation. On the left-hand side, the variance forecast should be computed
using some equations and the time series for the underlying. The forecasting scheme, with its
estimated parameters, is subject to errors. On the right-hand side, the option market has its own
idiosyncrasies, for example related to demand and supply. Such effects can be clearly observed by
computing the implied volatility corresponding to the option bid or ask prices. These points are
discussed in more detail in Section 8. Therefore, (7) should be taken only as a first-order approximation.
3 Multicomponent ARCH processes
3.1 The general setup
The basic idea of a multicomponent ARCH process is to measure historical volatilities using
exponential moving averages on a set of time horizons, and to compute the effective volatility for the
next time step as a convex combination of the historical volatilities. A first process along these lines
was introduced in (Dacorogna, Muller, Olsen, and Pictet 1998), and this family of processes was
thoroughly developed and explored in (Zumbach and Lynch 2001; Lynch and Zumbach 2003;
Zumbach 2004). A particularly simple process with long memory is used to build the RM2006 risk
methodology (Zumbach 2006). One of the key advantages of these multicomponent processes is that
forecasts for the variance can be computed analytically. We will use this property to explore their
relations with the option implied volatility.
In order to build the process, the historical volatilities are measured by exponential moving averages
(EMA) at time scales τk:

σ²k(t) = µk σ²k(t − δt) + (1 − µk) r²(t),   k = 1, · · · , n   (8)

with decay coefficients µk = exp(−δt/τk). The process time increment, δt, is one day in this
work. Let us emphasize that the σk are computed from historical data, and there are no hidden
stochastic processes like in a stochastic volatility model.
The “effective” variance σ²eff is a convex combination of the σ²k and of the mean variance σ²∞:

σ²eff(t) = ∑_{k=1}^n wk σ²k(t) + w∞ σ²∞ = σ²∞ + ∑_{k=1}^n wk (σ²k(t) − σ²∞),
1 = ∑_{k=1}^n wk + w∞.

Finally, the price follows a random walk with volatility σeff:

r(t + δt) = σeff(t) ε(t + δt).   (9)

Depending on the number of components n, the time horizons τk and weights wk, a number of
interesting processes can be built. The processes we are using to compare with implied volatility are
given in the next subsections.
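The general setup above can be sketched in a few lines: EMA variances at several time scales per Eq. (8), a convex combination giving σ²eff, and a Gaussian return per Eq. (9). All parameter values here are illustrative assumptions, not the article's calibrations.

```python
# Minimal simulation of a multicomponent linear ARCH process (w_inf = 0).
import math
import random

random.seed(42)

taus = [4.0, 16.0, 64.0]                    # characteristic times tau_k (days)
mus = [math.exp(-1.0 / t) for t in taus]    # decay coefficients mu_k = exp(-dt/tau_k)
weights = [0.5, 0.3, 0.2]                   # convex weights w_k, summing to 1

sigma2 = [1e-4] * len(taus)                 # EMA variances sigma_k^2 (daily units)
r = 0.01                                    # last observed return

for _ in range(1000):
    # Eq. (8): EMA update of each component with the latest squared return.
    sigma2 = [m * s + (1.0 - m) * r * r for m, s in zip(mus, sigma2)]
    # Convex combination: effective variance for the next step (linear case).
    sigma2_eff = sum(w * s for w, s in zip(weights, sigma2))
    # Eq. (9): the return is the effective volatility times an iid N(0,1) draw.
    r = math.sqrt(sigma2_eff) * random.gauss(0.0, 1.0)
```

By construction every σ²k stays positive, since each update is a convex combination of a positive variance and a squared return.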
On general grounds, we make the distinction between affine processes, for which the mean volatility is
fixed by σ∞ and w∞ > 0, and linear processes, for which w∞ = 0. The terms linear and affine qualify
the equations for the variance. The linear processes are very interesting for forecasting volatility as
they have no mean volatility parameter σ∞, which clearly would be time series dependent. However,
their asymptotic properties are singular, and affine processes should be used in Monte Carlo
simulations. This subtle difference between the two classes of processes is discussed in detail in
(Zumbach 2004). As this paper deals with volatility forecasts, only the linear processes are used.
3.2 I-GARCH(1)
The I-GARCH(1) model corresponds to a one-component linear process:

σ²(t) = µ σ²(t − δt) + (1 − µ) r²(t)
σ²eff(t) = σ²(t).

It has one parameter τ (or equivalently µ). This process is equivalent to the integrated GARCH(1,1)
process (Engle and Bollerslev 1986), and with a given value for µ it is equivalent to the standard
RiskMetrics methodology (RM1994). Its advantage is to be the simplest, but it does not capture
mean reversion for the forecast (that is, that long-term forecasts should converge to the mean
volatility).

For the empirical evaluation, the characteristic time has been fixed a priori to τ = 16 business days,
corresponding to µ ≈ 0.94.
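The τ = 16 choice can be checked directly: µ = exp(−1/16) recovers the familiar RM1994 value µ ≈ 0.94. A sketch with illustrative return inputs; note that with a single component and no mean term, the conditional expectation of the variance is flat in the horizon, which is exactly the missing mean reversion noted above.

```python
# I-GARCH(1) / RM1994-style EWMA variance update with tau = 16 business days.
import math

tau = 16.0                      # characteristic time, business days
mu = math.exp(-1.0 / tau)       # ~0.9394, close to the RM1994 value mu = 0.94

sigma2 = 1e-4                   # current EMA variance (daily units)
for r in [0.012, -0.008, 0.005]:            # illustrative daily returns
    sigma2 = mu * sigma2 + (1.0 - mu) * r * r

# The I-GARCH(1) forecast at any horizon DeltaT is just today's variance.
forecast = sigma2
```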
3.3 I-GARCH(2) and GARCH(1,1)
The I-GARCH(2) process corresponds to a two-component linear model:

σ²1(t) = µ1 σ²1(t − δt) + (1 − µ1) r²(t),
σ²2(t) = µ2 σ²2(t − δt) + (1 − µ2) r²(t),
σ²eff(t) = w1 σ²1(t) + w2 σ²2(t).

It has three parameters τ1, τ2 and w1. Even if this process is linear, it has mean reversion for time
scales up to τ2, with σ2 playing the role of the mean volatility.

The GARCH(1,1) process (Engle and Bollerslev 1986) corresponds to the one-component affine
model:

σ²1(t) = µ1 σ²1(t − δt) + (1 − µ1) r²(t),
σ²eff(t) = (1 − w∞) σ²1(t) + w∞ σ²∞.

It has three parameters τ1, w∞ and σ∞. In this form, the analogy between the I-GARCH(2) and
GARCH(1,1) processes is clear, with the long-term volatility σ2 playing a similar role as the mean
volatility σ∞.
Given a process, the parameters need to be estimated on a time series. GARCH(1,1) is more
problematic in that respect because σ∞ is clearly time series dependent. A good procedure is to
estimate the parameters on a moving historical sample, say in a window between t − ∆T′ and t for a
fixed span ∆T′. With this setup, the mean variance σ²∞ is essentially the sample variance ∑ r²
computed on the estimation window. This is a rectangular moving average, similar to an EMA but for
the weights given to the past. This argument shows that I-GARCH(2) and (a continuously
re-estimated on a moving window) GARCH(1,1) behave similarly. A detailed analysis of both
processes in (Zumbach 2004) shows that they have similar forecasting power, with an advantage to
I-GARCH(2).

In this work, we use the I-GARCH(2) process with two parameter sets fixed a priori to some
reasonable values. The first set is τ1 = 4 business days, τ2 = 512 business days, w1 = 0.843 and w2 =
0.157. The second set is τ1 = 16 business days, τ2 = 512 business days, w1 = 0.804 and w2 = 0.196.
The values for the weights are obtained according to the long memory ARCH process, but with only
two given τ components.
3.4 Long Memory ARCH
The idea of a long memory process is to use a multicomponent ARCH model with a large number of
components, but a simple analytical form for the characteristic times τk and the weights wk. For the long
memory ARCH process, the characteristic times τk increase as a geometric series:

τk = τ1 ρ^{k−1},   k = 1, · · · , n,   (10)

while the weights decay logarithmically:

wk = (1/C) (1 − ln(τk)/ln(τ0)),   (11)
C = ∑k (1 − ln(τk)/ln(τ0)).

This choice produces lagged correlations for the volatility that decay logarithmically, as observed in
the empirical data (Zumbach 2006). The parameters are taken as for the RM2006 methodology,
namely τ1 = 4 business days, τn = 512 business days, ρ = √2 and the logarithmic decay factor
τ0 = 1560 days = 6 years.
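Eqs. (10) and (11) with the RM2006 parameters above can be evaluated in a few lines; with τ1 = 4, τn = 512 and ρ = √2, the geometric grid contains n = 15 components. A sketch; the variable names are ours.

```python
# LM-ARCH characteristic times and weights, Eqs. (10)-(11), RM2006 parameters.
import math

tau1, tau_n, rho, tau0 = 4.0, 512.0, math.sqrt(2.0), 1560.0

# Geometric grid tau_k = tau_1 * rho**(k-1), up to tau_n.
taus = []
t = tau1
while t <= tau_n * (1.0 + 1e-9):    # small tolerance for float round-off
    taus.append(t)
    t *= rho

# Logarithmically decaying weights, normalized so that the w_k sum to one.
raw = [1.0 - math.log(t) / math.log(tau0) for t in taus]
C = sum(raw)
w = [x / C for x in raw]
```

Since every τk is below τ0 = 1560 days, each weight is positive, and the shortest time scale carries the largest weight.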
Figure 1: Weights wk(∆T) as a function of the forecast horizon ∆T (in days, logarithmic scale). Long memory process with w∞ = 0.1 and τk = 2, 4, 8, 16, · · · , 256 days. The weight profiles for increasing characteristic times τk have decreasing initial values, and their maxima move from left to right.
4 Forward variance and multicomponent ARCH processes
For multiscale ARCH processes (I-GARCH, GARCH(1,1), long-memory ARCH, etc.), the forward
variance can be computed analytically (Zumbach 2004; Zumbach 2006). The idea is to compute the
conditional expectation of the process equations, from which iterative relations can be deduced. Then,
some algebra and matrix computations produce the following form for the forward variance:

v(t, t + ∆T) = E[σ²eff(t + ∆T) | Ω(t)] = σ²∞ + ∑_{k=1}^n wk(∆T) (σ²k(t) − σ²∞).   (12)

The weights wk(∆T) can be computed by a recursion formula depending on the decay coefficients µk
and with initial values given by wk = wk(1). The equation for the forecast of the realized volatility has
the same form, but the weights wk(∆T) are different.

Let us emphasize that this can be done for all processes in this class (linear and affine). Moreover, the
σ²k(t) are computed from the underlying time series; namely, there is no hidden stochastic volatility to
estimate. This makes volatility forecasts particularly easy in this framework.

For a multicomponent ARCH process, the intuition for the forecast can be understood from a graph of
the weights wk(∆T) as a function of the forecast horizon ∆T, as given in Figure 1. For short forecast
horizons, the volatilities with the shorter time horizons dominate. As the forecast horizon gets larger,
the weights of the short-term volatilities decay while the weights of the longer time horizons increase.
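For a linear process, one way to see where the horizon-dependent weights wk(∆T) of Eq. (12) come from is to iterate the conditional expectation of Eq. (8): stacking the component variances in a vector s(t), one step gives E[s(t+1) | Ω(t)] = M s(t) with M = diag(µ) + (1−µ) wᵀ, so the weights at horizon ∆T are the row vector wᵀ M^∆T. This matrix formulation is our reconstruction of the recursion, not a formula quoted from the article; the I-GARCH(2) parameters are those given in Section 3.3.

```python
# Horizon-dependent forecast weights w_k(DeltaT) for a linear two-component
# process, via the one-step expectation matrix M_kj = mu_k*delta_kj + (1-mu_k)*w_j.
import math

taus = [4.0, 512.0]
w = [0.843, 0.157]
mus = [math.exp(-1.0 / t) for t in taus]
n = len(w)

M = [[mus[k] * (1.0 if k == j else 0.0) + (1.0 - mus[k]) * w[j]
      for j in range(n)] for k in range(n)]

def weights_at_horizon(dT):
    """w(DeltaT) = w^T M^DeltaT, by repeated row-vector times matrix products."""
    row = list(w)
    for _ in range(dT):
        row = [sum(row[k] * M[k][j] for k in range(n)) for j in range(n)]
    return row

w1 = weights_at_horizon(1)
w250 = weights_at_horizon(250)
```

Because the rows of M sum to one, the weights sum to one at every horizon, and as ∆T grows the weight shifts from the fast component (τ1 = 4 days) to the slow one (τ2 = 512 days), reproducing the qualitative behavior of Figure 1.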
Figure 2: Sum of the weights ∑k wk(∆T) = 1 − w∞, as a function of the forecast horizon ∆T. Same parameters as in Figure 1.
The weight for a particular horizon τk peaks at a forecast horizon similar to τk; for example, the Burgundy curve corresponds to τ = 32 days, and its maximum is around this value. Figure 2 shows the sum of the volatility coefficients ∑k wk = 1 − w∞. This shows the increasing weight of the mean volatility as the forecast horizon gets longer. Notice that this behavior corresponds to our general intuition about forecasts: short-term forecasts depend mainly on the recent past, while long-term forecasts need to use more information from the distant past. The nice feature of the multicomponent ARCH process is that the forecast weights are derived from the process equations, and that they have a similar content to the process equations (linear or affine, one or multiple time scales).
5 The induced volatility process
The multicomponent ARCH processes are stochastic processes for the return, in which the volatilities are convenient intermediate quantities. It is important to realize that the volatilities σk and σeff are useful and intuitive in formulating a model, but they can be completely eliminated from the equations. An important advantage of this class of processes is that the forward variance v(t, t + ∆T) can be computed analytically. Going in the opposite direction, we want to eliminate the return, namely to derive the equivalent process equations for the dynamics of the forward variance induced by a multicomponent ARCH process. This will allow us to make contact with some models for the forward variance that are available in the literature and presented in the next section.
Equation (8) for σk can be rewritten as

dσ²k(t) = σ²k(t) − σ²k(t − δt)   (13)
 = (1 − µk) [−σ²k(t − δt) + ε²(t) σ²eff(t − δt)]
 = (1 − µk) [σ²eff(t − δt) − σ²k(t − δt) + (ε²(t) − 1) σ²eff(t − δt)].

The equation can be simplified by introducing the annualized variances vk = (1 year/δt) σ²k,
veff = (1 year/δt) σ²eff and a new random variable χ with

χ = ε² − 1   such that   E[χ(t)] = 0,   χ(t) ≥ −1.   (14)
Assuming that the time increment δt is small compared to the time scales τk in the model, the
following approximation can be used:

1 − µk = δt/τk + O(δt²).   (15)

In the present derivation, this expansion is used only to make contact with the usual form for
processes, but no terms of higher order are neglected. Exact expressions are obtained by replacing
δt/τk by 1 − µk in the equations below.
These notations and approximations allow the equivalent equations

dvk = (δt/τk) [veff − vk + χ veff],   (16a)
veff = ∑k wk vk + w∞ v∞.   (16b)

The process for the forward variance is given by

dv∆T = ∑k wk(∆T) dvk   (17)

with dv∆T(t) = v(t, t + ∆T) − v(t − δt, t − δt + ∆T).
The content of (16a) is the following. The term δt (veff − vk)/τk gives a mean reversion toward the
current effective volatility veff at a time scale τk. This structure is fairly standard, except for veff, which
is given by a convex combination of all the variances vk. The random term, however, is unusual. All the
variances share the same random factor δt χ/τk, which has a standard deviation of order δt instead of
the usual √δt appearing in Gaussian models.
An interesting property of this equation is that it enforces positivity for vk through a somewhat unusual
mechanism. Equation (16a) can be rewritten as

dvk = (δt/τk) [−vk + (χ + 1) veff].   (18)

Because χ ≥ −1, the term (χ + 1) veff is never negative, and as δt vk(t − δt)/τk is smaller than vk(t − δt),
this implies that vk(t) is always positive (even for a finite δt). Another difference from the usual
random process is that the distribution of χ is not Gaussian. In particular, if ε has a fat-tailed
distribution (as seems required in order to have a data generating process that reproduces the
properties of the empirical time series), the distribution of χ also has fat tails.
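The positivity mechanism of Eq. (18) can be illustrated numerically: with χ = ε² − 1 ≥ −1, the update vk → vk (1 − δt/τk) + (δt/τk)(χ + 1) veff can never drive vk below zero when δt ≤ τk. A sketch with a fat-tailed ε built as a standardized Student-t-like innovation via a gamma mixture; all parameter values are illustrative.

```python
# Simulate the induced variance process of Eq. (18) and observe that the
# component variances v_k stay strictly positive even with fat-tailed shocks.
import math
import random

random.seed(7)

def fat_tailed_eps(nu=5.0):
    """Student-t-like innovation, rescaled to unit variance so E[chi] = 0."""
    g = random.gauss(0.0, 1.0)
    chi2 = random.gammavariate(nu / 2.0, 2.0)   # chi-squared with nu dof
    t = g / math.sqrt(chi2 / nu)
    return t * math.sqrt((nu - 2.0) / nu)

dt = 1.0
taus = [4.0, 64.0]                  # time scales tau_k, both >= dt
w = [0.7, 0.3]                      # convex weights (linear case, w_inf = 0)
v = [0.02, 0.03]                    # annualized component variances v_k

for _ in range(2000):
    v_eff = sum(wk * vk for wk, vk in zip(w, v))
    eps = fat_tailed_eps()
    chi = eps * eps - 1.0           # chi >= -1 by construction
    # Eq. (18): the (chi+1)*v_eff term is never negative.
    v = [vk + (dt / tk) * (-vk + (chi + 1.0) * v_eff)
         for vk, tk in zip(v, taus)]
```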
The continuum limit of the GARCH(1,1) process was already investigated by (Nelson 1990). In this
limit, GARCH(1,1) is equivalent to a stochastic volatility process where the variance has its own
source of randomness. Yet Nelson constructed a different limit from the one above, because he fixes the GARCH
parameters α0, α1 and β1. The decay coefficient is given by α1 + β1 = µ and is therefore fixed. With
µ = exp(−δt/τ), fixing µ and taking the limit δt → 0 is equivalent to τ → 0. Because the
characteristic time τ of the EMA goes to zero, the volatility process becomes independent of the
return process, and the model converges toward a stochastic volatility model. A more interesting limit
is to take τ fixed and δt → 0, as in the computation above. Notice that the computation is done with a
finite time increment δt; the existence of a proper continuum limit δt → 0 for a process defined by
(16) and (17) is likely not a simple question.
Let us emphasize that the derivation of the volatility process induced by the ARCH structure
involves only elementary algebra. Essentially, if the price follows an ARCH process (one or multiple
time scales, with or without mean σ∞), then the volatility follows a process according to (16). The
structure of this process involves a random term of order δt, and therefore it cannot be reduced to a
Wiener or Lévy process. This is a key difference from the processes used in finance that were
developed to capture the price diffusion.
The implications of (16) are important as they show a key difference between ARCH and stochastic
volatility processes. This clearly has implications for option pricing, but also for risk evaluation. In a
risk context, the implied volatility is a risk factor for any portfolio that contains options, and it is
likely better to model the dynamics of the implied volatility by a process with a similar structure.
6 Market model for the variance
In the literature, the models for the implied volatility are dominated by stochastic volatility processes,
essentially assuming that the implied volatility “has its own life”, independent of the underlying. In
this vast literature, a recent direction is to write processes directly for the forward variance. Recent
works in this direction include (Buehler 2006; Bergomi 2005; Gatheral 2007). Following this direction, we
present here simple linear processes for the forward variance, and discuss the relation with
multicomponent ARCH in the next section.
The general idea is to write a model for the forward variance

v(t, t + ∆T) = G(vk(t); ∆T),   (19)

where G is a given function of the (hidden) random factors vk. In principle, the random factors can
appear everywhere in the equation, for example as a random characteristic time τk. Yet
Buehler has shown that strong constraints exist on the possible random factors, for example
forbidding random characteristic times. In this paper, only linear models will be discussed, and
therefore the random factors appear as variances vk.
The dynamics for the random factors vk are given by processes

dvk = µk(v) dt + ∑_{α=1}^d σαk(v) dWα,   k = 1, · · · , n.   (20)

The processes have d sources of randomness dWα, and the volatility σαk(v) can be any function of the
factors.
As such, the model is essentially unconstrained, but the martingale condition (4) for the forward
variance still has to be enforced. Through standard Ito calculus, the variance curve model together
with the martingale condition leads to a constraint between G(v; ∆T), µ(v) and σ(v):

∂∆T G(v; ∆T) = ∑_{i=1}^n µi ∂vi G(v; ∆T) + ∑_{i,j=1}^n ∑_{α=1}^d σαi σαj ∂²vi,vj G(v; ∆T).   (21)

A given function G is said to be compatible with a dynamic for the factors if this condition is valid.
The compatibility constraint is fairly weak, and many processes can be written for the forward
variance that are martingales. As already mentioned, we consider only functions G that are linear in
the risk factors. Therefore ∂²vi,vj G = 0, leading to first-order differential equations that can be solved
by elementary techniques. For this class of models, the condition does not involve the volatility σαk(v)
of the factors, which therefore can be chosen freely.
6.1 Example: one-factor market model
The forward variance is parameterized by

G(v1; ∆T) = v∞ + w1 e^{−∆T/τ1} (v1 − v∞),   (22)

which is compatible with the stochastic volatility dynamic

dv1 = −(v1 − v∞) dt/τ1 + γ v1^β dW   for β ∈ [1/2, 1].   (23)

The parameter w1 can be chosen freely, and for identification purposes the choice w1 = 1 is often
made. Because G is linear in v1, there is no constraint on β. The value β = 1/2 corresponds to the
Heston model, β = 1 to the lognormal model. This model is somewhat similar to the GARCH
process, with one characteristic time τ1, a mean volatility v∞, and the volatility of the volatility
(vol-of-vol) γ. This model is not rich enough to describe the empirical forward variance dynamics,
which involve multiple time scales.
6.2 Example: two-factor market model
The linear model with two factors,

G(v; ∆T) = v∞ + w1 e^{−∆T/τ1} (v1 − v∞)
 + (1/(1 − τ1/τ2)) (−w1 e^{−∆T/τ1} + (w1 + w2) e^{−∆T/τ2}) (v2 − v∞)   (24)
 = v∞ + w1(∆T) (v1 − v∞) + w2(∆T) (v2 − v∞),   (25)

is compatible with the dynamics

dv1 = −(v1 − v2) dt/τ1 + γ v1^β dW1,   (26)
dv2 = −(v2 − v∞) dt/τ2 + γ v2^β dW2.

The parameters w1 and w2 can be chosen freely, and for identification purposes the choice w1 = 1 and
w2 = 0 is often made. Notice the similarity of (25) with the Svensson parameterization for the yield
curve.

The linear model can be solved explicitly for n components, but the ∆T dependency in the coefficients
wk(∆T) becomes increasingly complex. It is therefore not natural in this approach to create the
equivalent of a long-memory model with multiple time scales.
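The shape of the two-factor curve of Eq. (24) is easy to explore numerically. The sketch below uses the usual identification w1 = 1, w2 = 0; the numerical parameter values (variance levels, time scales) are illustrative assumptions.

```python
# Two-factor linear market model: forward variance curve G(v; DeltaT), Eq. (24).
import math

tau1, tau2 = 4.0, 512.0
w1, w2 = 1.0, 0.0            # the usual identification choice
v_inf = 0.04                 # mean (long-run) variance level

def G(v1, v2, dT):
    """Forward variance at horizon dT, given the two factor levels."""
    a = w1 * math.exp(-dT / tau1)
    b = ((-w1 * math.exp(-dT / tau1) + (w1 + w2) * math.exp(-dT / tau2))
         / (1.0 - tau1 / tau2))
    return v_inf + a * (v1 - v_inf) + b * (v2 - v_inf)

# At DeltaT = 0 the curve starts at v1; at long horizons it relaxes to v_inf.
short_end = G(0.09, 0.05, 0.0)
long_end = G(0.09, 0.05, 5000.0)
```

With w1 = 1 and w2 = 0, the two exponentials cancel at ∆T = 0, so the short end of the curve is exactly the first factor v1, while both factor contributions decay to zero at long horizons.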
7 Market models and options
Assuming a liquid option market, the implied volatility surface can be extracted, and from its
backbone, the forward variance v(t, t + ∆T) is computed. At a given time t, given a market model
G(vk(t); ∆T), the risk factors vk(t) are estimated by fitting the function G(∆T) on the forward
variance curve. It is therefore important for the function G(∆T) to have enough possible shapes to
accommodate the various forward variance curves. This estimation procedure for the risk factors
gives the initial condition vk(t). Then, the postulated dynamics for the risk factors induce a dynamic
for G, and hence for the forward variance.

Notice that in this approach, there is no relation with the underlying and its dynamics. For this reason,
the possible processes are weakly constrained, and the parameters need to be estimated independently
(for example the characteristic times τk). Another drawback of this approach is that it relies on the
empirical forward variance curve, and therefore a liquid option market is a prerequisite.
Our choice of notations makes clear the formal analogy of the market model with the forecasts
produced by a multicomponent ARCH process. Except for the detailed shapes of the functions
wk(∆T), the equations (12) and (25) have the same structure. They are however quite different in
spirit: the vk are computed from the underlying time series in the ARCH approach, whereas in a
market model approach, the vk are estimated from the forward variance curve obtained from the
option market. In other words, ARCH leads to a genuine forecast based on the underlying, whereas
the market model provides a constrained fit of the empirical forward curve. Beyond this formal
analogy, the dynamics for the risk factors are quite different, as the ARCH approach leads to the
unusual (16a), whereas market models use the familiar generic Gaussian process in (20).
8 Comparison of the empirical implied, forecasted and realized volatilities
As explained in Section 4, a multicomponent ARCH process provides us with a forecast for the
realized volatility, and the forecast is directly related to the underlying process and its properties. At a
given timet, there are three volatilities (implied, forecasted and realized) for each forecast horizon
∆T. Essentially, the implied and forecasted volatilities aretwo forecasts for the realized volatility. In
this section, we investigate the relationship between these three volatilities and the forecast horizon
∆T. When analyzing the empirical statistics and comparing these three volatilities, several factors
should be kept in mind.
1. For short forecast horizons (∆T = 1 day and 5 days), the number of returns in ∆T is small and
therefore the realized volatility estimator (computed with daily data) has a large variance.
2. The forecastability decreases with increasing ∆T.
18 Volatility Forecasts and At-the-Money Implied Volatility
3. The forecast and implied volatilities are “computed” using the same information set, namely the
history up to t. This is different from the realized volatility, computed using the information in
the interval [t, t + ∆T]. Therefore, we expect the distance between the forecast and implied to be
the smallest.
4. The implied volatility has some idiosyncrasies related to the option market, for example supply
and demand, or the liquidity of the underlying necessary to implement the replication strategy.
Similarly, an option bears volatility risk, and a related volatility risk premium can be expected.
These particular effects should bias the implied volatility upward.
5. From the raw options and underlying prices, the computations leading to the implied volatility
are complex, and therefore error-prone. This includes dependencies on the original data
providers. An example is given by the time series for CAC 40 implied volatility, where during a
given period, the implied volatility above three months jumps randomly between a realistic
value and a much higher value. This is likely created by quotes for the one-year option that are
quite off the “correct” price (see Figure 3). Yet this data quality problem is inherent to the
original data provider and the option market, and reflects the difficulty in computing clean and
reliable implied volatility surfaces.
6. The options are traded for fixed maturity dates, whereas the convenient volatility surface is given
for constant time to maturity. Therefore, some interpolation and extrapolation needs to be done.
In particular, the short times to maturity (one day, five days) most of the time need an
extrapolation, as the options are traded at best with one expiry for each month. This is clearly a
difficult and error-prone procedure.
7. The ARCH-based forecasts are dependent on the choice of the process and the associated
parameters.
8. As the forecast horizon increases, the dynamic of the volatility gets slower and the actual
number of independent volatility points decreases (as 1/∆T). Therefore, the statistical
uncertainty of the statistics increases with ∆T.
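Point 1 above can be made concrete with a small sketch (our own illustration, not the paper's exact estimator; the function name and the 260-business-day annualization convention are assumptions):

```python
import numpy as np

# Hypothetical helper: annualized realized volatility in %, estimated from
# the last dT daily returns, annualized with a 260-business-day year.
def realized_vol(returns, dT, days_per_year=260):
    window = np.asarray(returns[-dT:], dtype=float)
    # Square root of the annualized mean squared daily return.
    return 100.0 * np.sqrt(days_per_year * np.mean(window ** 2))
```

With dT = 1 or 5, the mean runs over very few squared returns, which is why the realized volatility estimator at short horizons has a large variance.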
Because of the above points, each volatility has some peculiarities, and therefore we do not have a
firm anchor point on which to base our comparison. Given that we are on floating ground, our goals are fairly
modest. Essentially, we want to show that processes with one or two time scales are not good enough,
and that the long-memory process provides for a very good forecast with an accuracy comparable to
the implied volatility. The processes used in the analysis are I-GARCH(1), I-GARCH(2) with two sets
Figure 3
Volatility time series for USD/EUR (top) and CAC 40 (bottom), six-month forecast horizon
[Figure: annualized volatility in %, 2003 to 2006, for the realized, implied, I-GARCH(1), I-GARCH(2) and LM-ARCH series.]
of parameters, and LM-ARCH. The equations for the processes are given in Section 3, along with the values for the parameters.
The best way to visualize the dynamic of the three volatilities is to use a movie of the σ[∆T] time evolution. In a movie, the properties of the various volatilities, their dynamics and relationships are very clear. Unfortunately, the present printed paper does not allow for such a medium, and we have to rely on conventional statistics to present their properties.
The statistics are computed for two time series: the USD/EUR foreign exchange rate and the CAC 40 stock index. The ATM implied volatility data originate from JP Morgan Chase for USD/EUR and Egartech for the CAC 40 index; the underlying prices originate from Reuters. The time series for the volatilities are shown in Figure 3 for a six-month forecast horizon. The time series are fairly short (about six years for USD/EUR and four years for CAC 40). This clearly makes statistical inferences
difficult, as the effective sample size is fairly small. On the USD/EUR panel, the lagging behavior of
the forecast and implied volatility with respect to the realized volatility is clearly observed. For the
CAC 40, the data sample contains an abrupt drop in the realized volatility at the beginning of 2003.
This pattern was difficult to capture for the models with long-term mean reversion. In 2005 and early
2006, the implied volatility data are also not reliable: first there are two “competing” streams of
implied volatility at ∼12% and ∼18%, before a period at the end of 2005 where there is likely no
update in the data stream. This shows the difficulty of obtaining reliable implied volatility data, even
from a major data supplier.
For the statistics, in order to ease the comparison between the graphs, all the horizontal and vertical
scales are identical, the color is fixed for a given forecast model, and the line type is fixed for a given
volatility pair. The graphs are presented for the mean absolute error (MAE)
MAE(x, y) = (1/n) ∑_t |x(t) − y(t)|,   (27)
where n is the number of terms in the sum. Other measures of distance, like the root mean square error,
give very similar figures. The volatility forecast depends on the ARCH process. The parameters for
the processes have been selected a priori to some reasonable values, and no optimization was done.
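Equation (27) is straightforward to compute; as a small sketch (the function name is ours):

```python
def mae(x, y):
    """Mean absolute error between two aligned volatility series, as in (27)."""
    if len(x) != len(y):
        raise ValueError("series must be aligned on the same dates")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)
```

The same loop with squared differences and a final square root gives the root mean square error mentioned above.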
The overall relationship between the three volatilities can be understood from Figure 4. The pair of
volatilities with the closest relationship is the implied and forecasted volatilities, because they are
built upon the same information set. The distance with the realized volatility is larger, with similar
values for implied-realized and forecast-realized. This shows that it is quite difficult to assert which
one of the implied and forecasted volatility provides for a better forecast of the realized volatility. All
the distances have a global U-shape form as a function of ∆T. This originates in points 1 and 2
above, and leads to a minimum around one month for the measure of distances. The distance is larger
for shorter ∆T because of the bad estimator for the realized volatility, and larger for longer ∆T
because of the decreasing forecastability.
Figure 5 shows the distances for given volatility pairs, depending on the process used to build the
forecast. The forecast-implied distance shows clear differences between processes (left panels). The
I-GARCH(1) process lacks mean reversion, an important feature of the volatility dynamic. The
I-GARCH(2) process with parameter set 1 is handicapped by the short characteristic time for the first
EMA (4 days); this leads to a noisy volatility estimator and subsequently to a noisy forecast. The
same process with a longer characteristic time for the first EMA (16 days, parameter set 2) shows
much improved performance up to a time horizon comparable to the long EMA (260 days). Finally,
the LM-ARCH produces the best forecast. As the forecast becomes better (one time scale → two time
scales → multiple time scales), the distance between the implied and forecasted volatilities decreases.
Figure 4
MAE distances between volatility pairs for EUR/USD, grouped by forecast method
The vertical axis gives the MAE for the annualized volatility in %, the horizontal axis the forecast time
interval ∆T in days.
[Figure: four panels, I-GARCH(1), I-GARCH(2) parameter set 1, I-GARCH(2) parameter set 2 and LM-ARCH, each showing the forecast–implied, forecast–realized and implied–realized MAE curves on a logarithmic horizontal axis.]
Figure 5
MAE distances between volatility pairs, grouped by pairs
The vertical axis gives the MAE for the annualized volatility in %, the horizontal axis the forecast time interval ∆T in days.
For EUR/USD, the mean volatility is around 10% (the precise value depending on the volatility and
time horizon), and the MAE is in the 1 to 2% range. This shows that in this time to maturity range, we
can build a good estimator of the ATM implied volatility based only on the underlying time series.
The distance forecast-realized is larger than the forecast-implied volatility (right panel), with the long
memory process giving the smallest distance. The only exception is the I-GARCH(1) process applied
to the CAC 40 time series, due to the particular abrupt drop in the realized volatility in early 2003.
This shows the limit of our analysis due to the fairly small data sample. Clearly, to gain statistical
power requires longer time series for implied volatility, as well as a cross-sectional study over many
time series.
9 Conclusion
The ménage à trois between the forecasted, implied and realized volatilities is quite a complex affair,
where each participant has its own peculiarities. The salient outcome is that the forecasted and
implied volatilities have the closest relationship, while the realized volatility is more distant as it
incorporates a larger information set. This picture is dependent to some extent on the quality of the
volatility forecast: the multiscale dynamic of the long memory ARCH process is shown to capture
correctly the dynamic of the volatility, while the I-GARCH(1) and I-GARCH(2) processes are not rich
enough in their time scale structures. This conclusion falls in line with the RM2006 risk methodology,
where the same process is shown to capture correctly the lagged correlation for the volatility.
The connection with the market model for the forward variance shows the parallel structure of the
volatility forecasts provided by both approaches. However, their dynamics are very different
(postulated for the forward volatility market models, induced by the ARCH structure for the
multicomponent ARCH processes). Moreover, the volatility process induced by the ARCH equations
is of a different type from the usual price process, because the random term is of order δt instead of
the √δt used in diffusive equations. This emphasizes a fundamental difference between price and
volatility processes. A clear advantage of the ARCH approach is to deliver a forecast based only on
the properties of the underlying time series, with a minimal number of parameters that need to be
estimated (none in our case, as all the parameters correspond to the values used in RM1994 and
RM2006). This point brings us to a nice and simple common framework to evaluate risks as well as the
implied volatilities of at-the-money options.
For the implied volatility surface, the problem is still not completely solved, as the volatility smile
needs to be described in order to capture the full implied volatility surface. Any multicomponent
ARCH process will capture some (symmetric) smile, due to the heteroskedasticity. Moreover, fat-tail
innovations will make the smile stronger, as the process becomes increasingly distant from a Gaussian
random walk. Yet, adding an asymmetry in the smile, as observed for stocks and stock indexes,
requires enlarging the family of processes to capture asymmetry in the distribution of returns. This is
left for further work.
References
Bergomi, L. (2005). Smile dynamics II. Risk 18, 67–73.
Buehler, H. (2006). Consistent variance curve models. Finance and Stochastics 10, 178–203.
Dacorogna, M. M., U. A. Muller, R. B. Olsen, and O. V. Pictet (1998). Modelling short-term
volatility with GARCH and HARCH models. In C. Dunis and B. Zhou (Eds.), Nonlinear
Modelling of High Frequency Financial Time Series, pp. 161–176. John Wiley.
Engle, R. F. and T. Bollerslev (1986). Modelling the persistence of conditional variances.
Econometric Reviews 5, 1–50.
Gatheral, J. (2007). Developments in volatility derivatives pricing. Presentation at “Global
Derivatives”, Paris, May 23.
Lynch, P. and G. Zumbach (2003, July). Market heterogeneities and the causal structure of
volatility. Quantitative Finance 3, 320–331.
Nelson, D. (1990). ARCH models as diffusion approximations. Journal of Econometrics 45, 7–38.
Inflation markets have evolved significantly in recent years. In addition to stronger issuance programs of inflation-linked debt from governments, derivatives have developed, allowing a broader set of market participants to start trading inflation as a new asset class. These changes call for modifications of risk management and pricing models. While the real rate framework allowed us to apply the familiar nominal bond techniques to linkers, it does not provide a consistent view with inflation derivative markets, and limits our ability to report inflation risk. We thus introduce in detail the concept of break-even inflation and develop associated pricing models. We describe various adjustments for taking into account indexation mechanisms and seasonality in realized inflation. The adjusted break-even framework consolidates views across financial products and geography. Inflation risk can now be explicitly defined and monitored as any other risk class.
1 Introduction
Even though inflation is an old topic for academics, interest from the financial community only began
recently. This is somewhat due to historical reasons. Inflation was and is a social and macroeconomic
matter, and has consequently been a concern for economists, politicians and policy makers, not for
financial market participants. High inflation (and of course deflation) being perceived as a bad signal
for the health of an economy, efforts concentrated on the understanding of the main inflation drivers
rather than on the risks inflation represents, especially for financial markets. One of the most famous
sentences of Milton Friedman, “Inflation is always and everywhere a monetary phenomenon,”
suggests that monetary fluctuations (thus, money markets) are meaningful indicators of inflation risks
we might face in financial markets. While this claim is supported by most economic theories, the
monetary explanation cannot be transposed in the same way onto financial markets. Persistent shocks
on money markets effectively determine a large fraction of long-term inflation moves, but short and
mid-term fluctuations of inflation-linked assets bear another risk which has to be analyzed in its own
right.
Quantifying inflation risk on financial markets is today a major concern. The markets developed
quickly over the last five to ten years and we expect them to continue to evolve. On the one hand, after
26 Inflation Risk Across the Board
several years of low inflation across industrialized countries, signals of rising inflation have appeared.
Rising commodity and energy prices are typical examples. On the other hand, more and more players
are coming to inflation markets both on the supply and demand sides. Two decades ago, people
considered equities as a good hedge against inflation, but equities appear to be little correlated with
inflation, and demand for pure inflation hedges has dramatically increased. Pension funds and insurers
are the most active and really triggered new attention. Today, they not only face pressure from their
rising liabilities but also from regulators.
Even though inflation is not a new topic for us, we thus need to brush the cobwebs off the techniques used
to understand and quantify inflation risk, examining the new perspectives and problems offered by
evolving inflation markets. In this paper, we first explore how inflation is measured. While everybody
can agree on what a rate of interest is just by looking at the evolution of a savings account, inflation
hits different people differently and depends on consumption habits. We highlight seasonality effects
and their impact on the volatility of short-term inflation. We then survey the structure of the
inflation-linked asset class. We are left with the necessity to consider inflation risk in a new way. For
that purpose, we come back to the well-known Irving Fisher paradigm linking real and nominal
economies, and define in a clean way the concept of break-even inflation. We then describe various
adjustments required by the indexation mechanisms of inflation markets so as to make break-even a
suitable quantity for risk management. Adjusted break-evens allow us to consistently consider and
measure inflation risk across assets and markets. Finally, we illustrate the methodology through actual
data over the last years, considering break-evens, nominal and real yields.
2 Measuring economic inflation
The media relay inflation figures on a regular basis. It is important to understand where numbers
come from and their implications. An inflation rate is measured off a so-called Consumer Price Index
(CPI) which varies significantly across countries and through time. The annual inflation rate is
commonly reported. It constitutes the percentage change in the index over the prior year. We
underline hereafter that this is a meaningful way to measure economic inflation. Exploiting economic
inflation would however be a challenging task for financial markets which require higher frequency
data. The monthly change on most CPIs is more suitable but should be considered with care, taking
into account seasonality in the index.
2.1 Consumer Price Indices
A consumer price index is a basket containing household expenditures on marketable goods and
services. The total value of this index is usually scaled to 100 for some reference date. As an
example, consider the Harmonized Index of Consumer Prices ex-tobacco (HICPx). This index applies
to the Euro zone, and is defined as a weighted average of local HICPx. The weights across countries
are given by two-year lagged total consumption, and the composition of a local index is revised on a
yearly basis. While the country composition has been generally stable, we notice significant changes
with the entry of Greece in 2001 and with the growth of Ireland. The contribution of Germany has
decreased from 35% to 28%.
The composition of the HICPx across main categories of expenditures has also significantly evolved
over the last ten years.1 We can observe that expenditures for health and insurances have grown by
6%. We would have also expected a large growth in the role of housing expenditures as real estate and
energy take a greater share of the budget today, but this category stayed barely constant (from 16.1%
in 1996 to 15.9% in 2007). This is a known issue, but no corrective action has been taken yet for fear
of a jump in short-term inflation and an increase in the volatility of the index.
These considerations point out that inflation as measured through CPI baskets is a dynamic concept,
likely to capture individual exposure to changes in purchasing power with a lag.2 To some extent, the
volatility of market-implied inflation contains the volatility in the definition of the index itself.
2.2 Measure of realized inflation
Beyond the issue of how representative any CPI is, measuring an inflation rate off a CPI raises timing
issues. We define the annualized realized inflation rate i(t,T) from t to T corresponding to an index
CPI(t) as

i(t,T) = ( CPI(T) / CPI(t) )^{1/(T−t)} − 1.   (1)
Subtleties in the way information on the index is gathered and disclosed are problematic for financial
markets. First, CPI values are published monthly or quarterly only while markets trade on a daily
basis. Second, a CPI is officially published with a lag, because of the time needed for collecting prices
of its constituents. Finally, the data gathering process implies that a CPI value can hardly be assigned
to a precise date t. An arbitrary date is however settled. By convention, this reference date is the first
day of the month for which the collecting process started.

1 Data on the index and its composition can be found on the Eurostat web site epp.eurostat.ec.europa.eu.
2 The most famous consequence of this lag is known as the Boskin effect.
As an illustration, let us consider the HICPx with the information available on 04 October 2007. The
last published value was 104.19, on 17 September 2007. This value corresponds to the reference date
01 August 2007. The previous value, 104.14, published in August, corresponds to 01 July 2007. On
October 4, we could thus compute the annualized realized inflation rate corresponding, by
convention, to 01 July 2007 through 01 August 2007: i = (104.19/104.14)^12 − 1 ≈ 0.58%, but the
inflation rate between August 1 and October 4 was still unknown.
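Equation (1) and the worked HICPx example above can be checked numerically (a sketch; the function name is ours):

```python
def annualized_inflation(cpi_start, cpi_end, years):
    """Annualized realized inflation i(t,T) of equation (1), T - t in years."""
    return (cpi_end / cpi_start) ** (1.0 / years) - 1.0

# HICPx values quoted in the text.
monthly = annualized_inflation(104.14, 104.19, 1.0 / 12.0)  # one month: ~0.58%
annual = annualized_inflation(102.46, 104.19, 1.0)          # one year:  ~1.69%
```

The monthly figure is annualized by compounding twelve times, which is why a 0.05-point index move produces a 0.58% rate.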
Such a low inflation figure, 0.58%, sounds unusual. The media commonly report inflation over the
past year. On October 4, we could thus reference inflation of 104.19/102.46 − 1 ≈ 1.69% in the Euro zone,
102.46 being the HICPx value for the reference date 01 August 2006. The top graph in Figure 1
shows the evolution of these two inflation measures. The annual inflation rate has been relatively
stable in the Euro zone, certainly as the result of controlled monetary policies. The high volatility in
the monthly inflation rate is striking, with many negative values. Of course, we cannot talk about
deflation in June (−0.31%) and inflation in July (0.58%). The repeating peaks and troughs, noticeably
pronounced over the last years, and the nature of some constituents of the CPI, argue for a
systematic pattern (seasonality) in monthly data which has to be filtered out.
2.3 Seasonality
Seasonality in demand for commodities and energy is a well established phenomenon. There exist
several methods which extract seasonality from a time series. Typically, the trend is estimated through
autoregressive (AR) processes and seasonal patterns through moving averages (MA). We apply here a
proven model developed by the US Bureau of Census for economic time series, known as the X-11
method.3 We set a multiplicative model on the CPI itself such that
CPI(t) = S(t) CPI_sa(t),   (2)

where CPI_sa(t) represents the seasonally adjusted CPI, and S(t) the seasonal pattern.
Returning to monthly estimates of the inflation rate, the top graph in Figure 1 shows that most of the
volatility in monthly inflation measured off the unadjusted CPI time series is due to seasonality. The
seasonally adjusted monthly inflation rate for July is about 1.57% and for June about 1.72%, figures
3 In practice, the time series is decomposed into three components: the trend, the seasonality and a residual. The X-11
method consists in a sequential and iterative estimation of the trend and of the seasonality by operating moving averages in
time series and in cross section for each month. See for instance (Shishkin, Young, and Musgrave 1967) for further details.
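The X-11 procedure itself is elaborate; as an illustration of the multiplicative model (2), here is a simplified ratio-to-moving-average decomposition (our own sketch, not X-11, which iterates the same idea with additional smoothing and outlier treatment; the function names are ours):

```python
import numpy as np

# Multiplicative model CPI(t) = S(t) * CPI_sa(t), as in equation (2).
def seasonal_factors(cpi):
    """cpi: monthly index values, first value for January. Returns the
    twelve seasonal factors S, normalized to average 1."""
    cpi = np.asarray(cpi, dtype=float)
    n = len(cpi)
    trend = np.full(n, np.nan)
    for t in range(6, n - 6):
        # Centered 12-month moving average, half weights at the end points.
        trend[t] = (0.5 * cpi[t - 6] + cpi[t - 5:t + 6].sum()
                    + 0.5 * cpi[t + 6]) / 12.0
    ratios = cpi / trend
    # Average the detrended ratios month by month, then normalize.
    s = np.array([np.nanmean(ratios[m::12]) for m in range(12)])
    return s / s.mean()

def seasonally_adjust(cpi, s):
    """Divide out the seasonal pattern: CPI_sa(t) = CPI(t) / S(t)."""
    return np.asarray(cpi, dtype=float) / np.resize(s, len(cpi))
```

On a synthetic index built as a smooth trend times a fixed monthly pattern, this recovers the pattern closely; on real CPI data, the X-11 refinements matter.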
Figure 1
Realized inflation implied by HICP ex-tobacco
Annualized inflation reported on monthly and annual CPI variations, with and without seasonality
adjustments. Seasonality component S(t) estimated using the X-11 method.
in line with the annual estimate. This proves that seasonality has to be an important ingredient when
measuring inflation risk on financial markets, exactly as it is for commodities.
The bottom graph of Figure 1 presents the evolution of the seasonal pattern. The relative impact of
seasonality on the HICP ex-tobacco has doubled over the last ten years. The effect is more
pronounced in winter. Even though the pattern is different, it is interesting to notice that the same
comments apply to the US CPI (and actually for all developed countries). We will show in Section 5
that modeling a stochastic seasonal pattern is pointless.
3 Structure of the market
Inflation-linked financial instruments are now established as an asset class, but the growth of this
market4 is a new phenomenon mainly driven by increased demand. Conventional wisdom used to
be that inflation-linked liabilities could be hedged with equities, producing a low demand for pure
inflation products. The past ten years have demonstrated the disconnection between inflation figures
and the evolution of equity markets, and highlighted the necessity to develop inflation markets.
We examined various measures of correlation between the main equity index and the corresponding
CPI in the US, the UK and in France.5 Our results were between -20% and 40%, concentrated around
0% to 20%. An institution with inflation-indexed liabilities could not be satisfied with even the best
case, a 40% hedging ratio. Demand for inflation products has thus exploded, reinforced by new
regulations (the International Financial Reporting Standards, IFRS, and Solvency II in Europe).
Restructuring activities from banks have also created a liquid derivatives market. This evolution of
inflation markets has shifted the way inflation-linked products and inflation risk have to be assessed.
While in the early stages the analogy with nominal bonds was convenient for the treatment of
inflation-linked bonds, this is no longer true with derivatives.
3.1 The inflation-linked bond market
Governments structured inflation-linked bonds (ILB) and created these markets as alternative issuance
programs to standard treasury notes. This supply is mainly driven by cost of funding concerns. The
risk premium that governments should grant to investors on ILB should be slightly lower than on
nominal bonds. In addition, governments have risk diversification incentives on their debt portfolio.
Both outgoing flows (such as wages) and incoming flows (such as taxes) are directly linked to
inflation. As a consequence, some proportion of ILB should be issued; importantly for the
development of these markets, governments committed to provide consistent and transparent issuance
plans several years ahead.
Inflation-linked bonds are typically bonds with coupons and principal indexed to inflation. The
coupon is fixed in real terms and inflated by the realized inflation from the issuance of the bond (or
4 Bloomberg reports a tenfold increase in the HICPx-linked market, to over €100 million outstanding, since 2001.
5 We considered monthly versus yearly returns and realized inflation, using expanding, rolling and non-overlapping
windows. Rolling windows show a high volatility in the correlation itself, though the statistical significance of these results
is pretty low. Introducing a lag does not modify the outputs.
Figure 2
Cash flow indexation mechanism
Inflation protected period for holding a cash flow from t to T. L stands for the indexation lag, and t_l for
the last reference date for which the CPI value has been published.
from a base CPI value).6 The indexation mechanism is designed to overcome the problems associated
with the CPI, namely its publication frequency and its publication lag. This is achieved by introducing
a so-called indexation lag L.7 Typical lag periods range from two to eight months.
Flows are indexed looking at the CPI value corresponding to a reference date L months into the past.
With such a scheme, the inflation protected period and the holding period do not exactly match, as
illustrated in Figure 2. The nominal coupon C_N to be paid at time T to an investor who acquired this
right at time t on a real coupon C_R is then given by C_N = C_R · CPI(T−L)/CPI(t−L). The lag is set such that the base
CPI value CPI(t−L) is known at time t, provided that an interpolation method is also specified to
compute this value from the CPI values of the closest reference dates. This mechanism is mandatory
for calculations of accrued interest.
The approach yields a simple pricing formula (Section 4.2 justifies it) in terms of real discount
rates R_R(t, ·). Denoting t_1, ..., t_N the coupon payment dates and CPI_0 the value of the reference CPI at
the issuance of the bond (typically, CPI_0 = CPI(t_0 − L) where t_0 is the issuance date), we get

PV(t) = [CPI(t−L) / CPI_0] ( ∑_{i; t<t_i} C_R / (1 + R_R(t, t_i))^{t_i − t} + 100 / (1 + R_R(t, t_N))^{t_N − t} ).   (3)
This equation implies that in real terms, an ILB can be considered as a plain vanilla nominal bond
6 This structure is now standard, though other structures with capital-only or coupon-only indexation can still be found.
7 Depending on the context, L should be understood as a number of months or as a fraction of a year.
with fixed coupon C_R, but substituting real rates for nominal rates. (In fact, this equation effectively
defines the real rates.) Risk is represented by the daily variations in real rates, or real yields, since
the real rate curve is the only unknown from today to tomorrow. Surprisingly, we cannot isolate
inflation risk here even though nominal rates, real rates and inflation risks are connected quantities.
The standard convention is to quote ILB prices in real terms, that is, to quote the real coupon bond,
which then must be inflated by the above index ratio. For instance, on December 3, the French OATe
2.25 7/25/2020 linked to the HICPx is quoted with a real clean price of 103.177. Using the
appropriate index ratio of 1.0894 leads to a clean PV of €112.401.8
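Equation (3) can be sketched as follows (our own illustration; the function name, the flat real rate and the dates are assumptions, not market data):

```python
def ilb_pv(cpi_lagged, cpi_base, real_coupon, times, real_rates):
    """Equation (3): PV of an inflation-linked bond per 100 of principal.
    times: years from today to each coupon date t_i;
    real_rates: the real discount rates R_R(t, t_i); principal paid at t_N."""
    index_ratio = cpi_lagged / cpi_base  # CPI(t-L) / CPI_0
    pv_real = sum(real_coupon / (1.0 + r) ** ti
                  for ti, r in zip(times, real_rates))
    pv_real += 100.0 / (1.0 + real_rates[-1]) ** times[-1]
    return index_ratio * pv_real
```

The quoting convention above is just the real price times the index ratio: 103.177 × 1.0894 ≈ 112.40 for the OATe example (clean prices, so accrued interest is left aside here).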
3.2 Development of derivatives
Liability hedging sparked off this boom. Pension funds are largely exposed to inflation moves:
indirectly, when retirement pensions are partially linked to inflation through indexation to final wages;
directly, when pensions are explicitly linked to a CPI.9
Inflation swap markets have been developed to answer more precisely the needs of liability-driven
investments. Stripping ILB into zero-coupon bonds, financial intermediaries are now able to propose
customized inflation-protected cashflow schedules. In its simplest form, a zero-coupon inflation swap
is written on the inflation rate from a given CPI. The inflation payer pays at the maturity T of the
contract the increase in the index, CPI(T−L)/CPI(t−L), times a predefined notional. The protection buyer pays a
fixed rate on the same notional. The fixed rate is determined at the inception such that the value of the
swap is null.10 With inflation swaps, and similar derivatives, this fixed rate is the quantity at risk,
and expresses mainly views on expected inflation. By convention, inflation swaps are quoted through
this fixed rate for a set of full-year maturities. Thus, the natural risk factors for inflation swaps are the
quoted swap rates, rather than the real rates from before. In illiquid markets, we could accept different
risk factor definitions, but today, with liquid markets and significant intermediation activity, we
require a consistent view. Break-even inflation as defined later in this paper fills the gap while
accounting for the intrinsic features of consumer price indices.
8 This ratio is computed taking into account a three-month lag, a linear interpolation method and a three-day settlement
period. The HICPx values can be downloaded from the Eurostat web site.
9 The tendency in Europe is to the direct linkage of pensions. Previously, most pensions were indirectly linked through
the average wage over the last years before retirement.
10 Such a structure is very effective for liability hedging: the whole capital can still be invested in risky assets while ILB
lock money in low returns. Among the possible swaps, one would prefer year-on-year inflation swaps, which pay inflation
yearly. This strongly limits the amount of cash which has to be paid at maturity.
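The zero-coupon swap just described can be sketched as a payoff function (our own illustration; market conventions vary, and the indexation lag is ignored here for simplicity):

```python
def zc_swap_payoff(cpi_final, cpi_initial, fixed_rate, maturity_years,
                   notional=1.0):
    """Net payoff at maturity T to the inflation receiver of a zero-coupon
    inflation swap: the CPI ratio increase against a compounded fixed rate."""
    inflation_leg = cpi_final / cpi_initial - 1.0
    fixed_leg = (1.0 + fixed_rate) ** maturity_years - 1.0
    return notional * (inflation_leg - fixed_leg)
```

At inception, the quoted fixed rate is set so that the value of this payoff is null, which is what makes the swap rate itself the natural risk factor.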
Figure 3
Inflation Markets
3.3 More players with different incentives
Figure 3 depicts the global structure of inflation markets. Pension funds, insurers, corporates and
retail banks look both for protection and portfolio diversification. Banks stand across the board, as
intermediaries competing with hedge funds on inflation arbitrages, and as buyers looking for money
market investments and diversification. On the supply side, utilities and large corporates issue
inflation swaps or ILB so as to reduce their cost of funding. The inflation swap market has indeed
consistently granted a small premium (0-50bps) to inflation payers. In addition to standard
factors—credit and liquidity risk—this premium contains restructuring fees and a reward for a small
portion of inflation risk which cannot be transferred from bond markets to swap markets because of
indexation rules.
4 The concept of break-even inflation
Ideally, we would like to define and use expected inflation as a consistent risk factor on both markets.
Given that inflation-linked assets are written in terms of CPI ratios, we might believe that expected
inflation can be extracted from inflation swap quotes, or from ILB prices. Unfortunately, measuring
expected inflation is challenging and cannot be done without an explicit dependency on the nominal
interest rate curve. Even though prices of inflation products can be observed, we show that no
model-independent inflation expectations can be derived.
We previously defined the realized inflation measure i(t,T) in (1) and can thus define the expected
inflation I(t,T) as

I(t,T) = E_t[ (CPI(T)/CPI(t))^{1/(T−t)} ] − 1,  (4)
where the expectation is taken under the physical measure. Given that future inflation is uncertain, a
premium is necessarily embedded into the above expectation.11 But making an assumption about this
premium is not sufficient either. Using standard concepts of asset pricing theory, we demonstrate
hereafter that one can only extract forward CPI values—or expectations of CPI values under the
corresponding nominal forward measures. These forward CPI values are the main inputs entering into the
break-even inflation concept.
From now on, we leave aside the annualization, consider perfect indexation (L = 0) and assume that
the realized CPI is observable on a daily basis. We remark that we would be able to observe the
expected inflation if we could observe the expected CPI:

E_t[ CPI(T)/CPI(t) ] = (1/CPI(t)) E_t[ CPI(T) ].  (5)
4.1 The exchange rate analogy
Let us first call the nominal world our physical “home” world: in this world, any good or amount is
expressed in a monetary unit which is a currency (say, $US). We can consider other worlds that we
would still describe as nominal worlds but in which the monetary unit is another currency (say, €):
the “foreign” worlds. In a complete market with absence of arbitrage opportunities, a unique measure
exists to value all goods and assets in our nominal “home” world, the risk neutral measure. Through a
11 There is no consensus in the academic literature about the sign and the magnitude of this risk premium. In the US, recent papers evaluated the premium at up to 50bp, while the European premium might be insignificant. See for instance (Buraschi and Jiltsov 2005) and (Hordahl and Tristani 2007).
change of numeraire, typically given by the exchange rate dynamics, pricing can be done in any
currency using the risk neutral measure of the “foreign” world.
It is common to consider inflation analogously to this setup by defining a CPI basket as a new
monetary unit. We refer to this as a basket unit, as opposed to a dollar unit. The world where all goods
and amounts are expressed in basket units is the real world. Because of the completeness argument,
pricing of real assets can be done equivalently in the real economy or in the nominal economy
through a change of numeraire. The change of numeraire is given by the CPI price itself. As with
exchange rates, CPI(t) is the spot exchange rate to convert one basket unit in the real economy to the
nominal economy, that is, to $.
4.2 Standard pricing of linkers
As we discussed, linkers are traditionally priced in the real world through real coupon bonds. Let us
consider the simplest linker, a perfectly indexed inflation-linked discount bond (ILDB)—with price
P(t,T;t) $—issued at time t, which gives the right to receive a cash flow of CPI(T)/CPI(t) at T. A linker can
obviously be decomposed into a deterministic linear combination of ILDB matching the coupon
payment dates. We further introduce the discount bonds B_{N,N}(t,T) and B_{R,R}(t,T) at date t with
maturity date T in, respectively, the nominal world and the real world, each in its own monetary unit.
In other words, these bond prices are obtained under the risk neutral measures P*_N and P*_R of the
nominal and real worlds respectively:
B_{x,x}(t,T) = E_t^{P*_x}[ exp( −∫_t^T r_x(s) ds ) ],  (6)
where r_x(s) is the short rate in the corresponding world at time s. We can express the price of this real
discount bond in nominal terms B_{N,R}(t,T) using the spot CPI as B_{N,R}(t,T) = CPI(t) B_{R,R}(t,T). This
does not correspond to the price of an investment paying in our “home” world, since it settles to one
CPI unit at time T: clearly, (1/CPI(t)) B_{N,R}(t,T) ≠ P(t,T;t), as they do not provide the payoff in the same
units. This is illustrated in Figure 4 through the black and the red arrows.
Though pricing of ILDB can be done using expectations on future CPIs (this is developed in the next
section), a straightforward observation leads to (3) and a pricing model in terms of real rates. At time
t_0, paying P(t_0,T;t_0) $ for one ILDB issued at t_0 and maturing at T yields a payoff of CPI(T)/CPI(t_0) $ at
maturity. The same payoff can be locked in by investing B_{R,R}(t_0,T) $ in the real world—thus paying
(1/CPI(t_0)) B_{R,R}(t_0,T) real units for 1/CPI(t_0) real discount bonds maturing at T—and converting the payoff
back into dollars at maturity. This is materialized in Figure 5, where we explicitly mark dollar flows.
Figure 4
ILDB and real discount bond prices in the nominal world
Figure 5
Investment flows for replicating an ILDB
Since this is a self-financing replicating strategy which yields the same payoff in all states of nature
at T, in absence of arbitrage opportunities, the price of the ILDB at time t such that t_0 ≤ t ≤ T is equal
to the value of the replicating strategy, and thus given by

P(t,T;t_0) = [CPI(t)/CPI(t_0)] B_{R,R}(t,T) = [1/CPI(t_0)] B_{N,R}(t,T) $.  (7)

Do note the following implication: at issuance, the dollar value of the ILDB is exactly equal to the
real discount bond price, as if a CPI unit were worth $1. This justifies (3), and this simple pricing
model advocates for the use of real rates as risk factors.
4.3 Expectations of future CPI values
4.3.1 The forward CPI value
Let us derive the forward CPI value, which will be the building block of break-evens. Since we
ultimately have to price instruments in the nominal world—this is our “home” world—we consider the
forward price F_{N,R}(t,T_1,T_2) of a real discount bond delivered at T_1 and maturing at T_2, expressed in
nominal terms. Using the exchange rate analogy, we define the forward CPI value F_{CPI}(t,T) at t for
delivery date T as the forward price when T_1 = T_2 = T: F_{CPI}(t,T) = F_{N,R}(t,T,T). In absence of
arbitrage opportunities (AOA), we obtain

F_{N,R}(t,T_1,T_2) = CPI(t) B_{R,R}(t,T_2) / B_{N,N}(t,T_1).  (8)

Notice that given a set of nominal discount bond prices and a set of real discount bond prices in their
respective units, the values of the forward CPI are known.
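To make (8) concrete, here is a minimal sketch in Python; the spot CPI level and the two discount bond prices are hypothetical inputs, not market data from the paper.

```python
# Sketch of equation (8) in the case T1 = T2 = T: the forward CPI value
# implied by nominal and real discount bond prices. All inputs are made up.

def forward_cpi(cpi_t, b_rr, b_nn):
    """F_CPI(t,T) = CPI(t) * B_RR(t,T) / B_NN(t,T)."""
    return cpi_t * b_rr / b_nn

cpi_t = 104.5   # hypothetical spot CPI level
b_nn = 0.9231   # hypothetical nominal discount bond price, maturity T
b_rr = 0.9418   # hypothetical real discount bond price (in real units)
f_cpi = forward_cpi(cpi_t, b_rr, b_nn)
print(round(f_cpi, 2))
```

When real rates are below nominal rates (B_RR > B_NN), the forward CPI sits above the spot CPI, which is just positive break-even inflation in disguise.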
4.3.2 Link with CPI expectations
In our complete market, the AOA condition implies that at time t, the following condition should hold:

B_{N,N}(t,T) = E_t^{P*_R}[ (CPI(t)/CPI(T)) exp( −∫_t^T r_R(s) ds ) ],  (9)

since investing in the nominal world should be equivalent to investing in the real world with an initial
nominal amount and converting the output back to the nominal world (see Figure 6). The standard
relationship between risk neutral P* and forward measures P^T applied to the real world leads to

B_{N,N}(t,T) = CPI(t) B_{R,R}(t,T) E_t^{P^T_R}[ 1/CPI(T) ].  (10)
Figure 6
Non-arbitrage condition between real and nominal worlds
Combining (8) and (10) implies that

1/F_{CPI}(t,T) = E_t^{P^T_R}[ 1/CPI(T) ].  (11)
Considering the AOA condition from the real world leads to a similar equation,

B_{R,R}(t,T) = E_t^{P*_N}[ (CPI(T)/CPI(t)) exp( −∫_t^T r_N(s) ds ) ]  ⇒  F_{CPI}(t,T) = E_t^{P^T_N}[ CPI(T) ].  (12)
This shows that the expected CPI value E_t[CPI(T)] cannot be directly observed. Of course, assuming
dynamics for the CPI itself, its interactions with nominal rates and the shape of the inflation risk
premium, we could derive the exact relationship between the forward CPI and the expected CPI. In
particular, if the CPI dynamics and the nominal rate dynamics were independent, F_{CPI}(t,T) would be
equal to the expected CPI under the nominal risk neutral measure.
4.4 The break-even inflation
The above equations actually define the quantity that we can extract—free of modeling bias—which
we refer to as the zero-coupon break-even inflation (BEI). In discrete annual compounding, it can be
characterized by

BEI(t,T) = ( F_{CPI}(t,T)/CPI(t) )^{1/(T−t)} − 1,  (13)

or in continuous compounding,

BEI_c(t,T) = [1/(T−t)] log( F_{CPI}(t,T)/CPI(t) ).  (14)
Notice that because of (8) and (10), the definition (13) is equivalent to

BEI(t,T) = [1+R_N(t,T)] / [1+R_R(t,T)] − 1,  (15)

where R_R(t,·) and R_N(t,·) are respectively the real and the nominal zero-coupon rates.
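The Fisher-type identity (15) is a one-liner; the 4.5% nominal and 2.1% real zero rates below are illustrative, not market quotes.

```python
# Sketch of equation (15): zero-coupon break-even inflation from nominal and
# real zero-coupon rates, in annual compounding. Rates are illustrative.

def break_even(r_nominal, r_real):
    """BEI(t,T) = (1 + R_N) / (1 + R_R) - 1."""
    return (1.0 + r_nominal) / (1.0 + r_real) - 1.0

bei = break_even(0.045, 0.021)   # 4.5% nominal vs 2.1% real zero rate
print(f"{bei:.4%}")
```

Note that the ratio form differs slightly from the common approximation BEI ≈ R_N − R_R; the gap is second order in the rates.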
Equation (15) is the well known Fisher equation, with break-evens substituted for expected inflation.
However, with stochastic inflation (equivalently, stochastic future CPI values) the relationship is verified by
break-evens only. Equation (10) provides more insight on the components included in the break-even
inflation: the complete Fisher equation involves the expected inflation I(t,T) together with a premium
term π(t,T)—which contains the inflation risk premium—and a term v(t,T)—the correlation correction
between interest rates and the CPI dynamics, with a convexity adjustment, depending on the model;
interest rate and CPI dynamics provide expressions for both.12
4.5 Pricing in the nominal world only
Using a zero-coupon break-even curve and a nominal zero-coupon interest rate curve, we can now
rely on a new pricing model for linkers. This new framework enables the explicit modeling of
inflation risk and standard interest rate risk. An inflation-linked bond can indeed be modeled as a
stochastic coupon bond. The price P(t,T;t_0,C_R) at time t of a linker maturing at T, issued at t_0 with a
real coupon of C_R, can be computed under the risk neutral measure P*_N through
P(t,T;t_0,C_R) = E_t^{P*_N}[ Σ_{i; t<t_i} (CPI(t_i)/CPI(t_0)) C_R exp( −∫_t^{t_i} r_N(s) ds ) + (CPI(T)/CPI(t_0)) 100 exp( −∫_t^T r_N(s) ds ) ]  (17)

= [CPI(t)/CPI(t_0)] ( Σ_{i; t<t_i} E_t^{P^{t_i}_N}[ CPI(t_i)/CPI(t) ] C_R B_{N,N}(t,t_i) + E_t^{P^T_N}[ CPI(T)/CPI(t) ] 100 B_{N,N}(t,T) )  (18)

= [CPI(t)/CPI(t_0)] ( Σ_{i; t<t_i} C_R ( [1+BEI(t,t_i)] / [1+R_N(t,t_i)] )^{t_i−t} + 100 ( [1+BEI(t,T)] / [1+R_N(t,T)] )^{T−t} )  (19)
where P^s_N is the forward measure for time s in the nominal world. Similarly, one can show that the
fixed rate of an inflation swap is equal to the break-even inflation for the same horizon.
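A minimal sketch of pricing formula (19), assuming flat break-even and nominal zero curves; the coupon, maturity, curve levels and index ratio below are all made up for illustration.

```python
# Sketch of equation (19): a linker priced as a stochastic-coupon bond off a
# break-even curve and a nominal zero curve. Flat curves for simplicity.

def linker_price(index_ratio, coupon, cash_flow_times, bei, r_n):
    """P = index_ratio * ( sum_i C_R * ((1+BEI)/(1+R_N))^(t_i - t)
                           + 100 * ((1+BEI)/(1+R_N))^(T - t) )."""
    growth = (1.0 + bei) / (1.0 + r_n)   # indexed, discounted per-year factor
    pv_coupons = sum(coupon * growth ** tau for tau in cash_flow_times)
    pv_redemption = 100.0 * growth ** cash_flow_times[-1]
    return index_ratio * (pv_coupons + pv_redemption)

# hypothetical 10y linker: 2% real coupon, index ratio CPI(t)/CPI(t0) = 1.05,
# flat 2.3% break-evens, flat 4.5% nominal zero rates
price = linker_price(1.05, 2.0, list(range(1, 11)), 0.023, 0.045)
print(round(price, 2))
```

The ratio (1+BEI)/(1+R_N) plays exactly the role of a real discount factor, which is why (19) reproduces the real-rate pricing of Section 4.2.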
12 This can be derived from Ito’s lemma applied to the dynamics of the CPI, and nominal or real rates.
5 Adjusted break-evens as risk factors
Importantly, the methodology presented above does not depend on perfect indexation. The indexation
lag and the publication lag can be taken into account through slight adjustments in the definition of
the break-even inflation and in pricing formulas. We now detail those adjustments, beginning with
seasonality. We then discuss two types of break-evens that can be used in a risk context. The second
type of BEI is motivated by requirements of homogeneity and portability, two essential characteristics
for a consistent evaluation of inflation risk across both asset classes and countries.
5.1 Including seasonality
We showed in Section 2.3 that the predictable seasonal pattern should be stripped out from observed
values of the CPI. This seasonality component should be included into our modeling of break-evens
by defining seasonally adjusted break-evens. Assuming that the seasonal pattern is deterministic, we
can combine (2), (11) and (13), defining a seasonally adjusted break-even B̃EI(t,T) through

(1+BEI(t,T))^{T−t} = [S(t,T)/S(t,t)] (1+B̃EI(t,T))^{T−t},  (20)

where the tilde denotes the seasonally adjusted quantity and S(t,T) is the seasonality estimated at t
projected for T. The projection is done by repeating the last whole year’s seasonal pattern S(t), as
extracted in Section 2.3. Between estimated seasonal monthly values S(t), a linear interpolation is performed.
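The seasonal adjustment in (20) can be sketched by inverting the identity for the adjusted break-even. The seasonal factors below are hypothetical, and the convention that the quoted (unadjusted) break-even carries the seasonal ratio follows our reading of (20).

```python
# Sketch of (20): converting a quoted (unadjusted) zero-coupon break-even
# into its seasonally adjusted counterpart. Seasonal factors are made up.

def seasonally_adjust(bei_quoted, horizon, s_t, s_T):
    """Invert (1+BEI)^(T-t) = [S(t,T)/S(t,t)] * (1+BEI_adj)^(T-t)."""
    lhs = (1.0 + bei_quoted) ** horizon * s_t / s_T
    return lhs ** (1.0 / horizon) - 1.0

# a 1y quoted break-even of 2.40% with a small seasonal peak at the horizon
adj = seasonally_adjust(0.024, 1.0, s_t=1.000, s_T=1.002)
print(f"{adj:.4%}")
```

As the text notes, the S ratio shrinks geometrically in importance as the horizon grows, so the adjustment mainly matters for short maturities.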
Since inflation swap quotes constitute (a special type of) break-evens, we explore the impact of
seasonality on the strike rate of an inflation swap. Figure 7 presents results on the EU HICPx and US
CPI inflation. We will refer to these BEI as adjusted and unadjusted standard break-evens. In the
figure, we observe that US inflation was expected to increase, while the European curve was flat. US
inflation swaps traded about 25bp to 50bp above European inflation swaps. Further, the impact of
seasonality vanishes quickly with increasing swap maturity. Since the seasonal pattern changes
slowly,13 any implication on mid- and long-term unadjusted BEI would be insignificant. Short-term
unadjusted BEI are well predicted by the seasonal pattern observed over the previous year. Figure 7
underlines the benefits of using adjusted BEI as risk factors. Because of the swings created by the
seasonality, we would overestimate short-term inflation risk by considering unadjusted BEI. This will
be shown later. Unless otherwise specified, we consider seasonally adjusted BEI in the remainder of this paper.
13 Recall Figure 1.
Figure 7
Impact of seasonality on inflation swap quotes
Inflation swap quotes are interpolated using smoothed splines. Data as of 03 December 2007
[Two panels, US and EU: break-even inflation swap rates (%) against maturity (0 to 30 years), each showing the adjusted curve, the unadjusted curve and the raw market quotes.]
5.2 Homogeneity and portability
The validation of a pricing model for risk management purposes imposes some constraints on the
input data, or equivalently on the risk factors entering into equations. When introducing imperfect
indexation, definition (13) has to be adapted for our break-even to satisfy those constraints. First, our
ability to estimate and model the distribution of a risk factor depends on our capacity to observe a
homogenous sample of this factor through time. Homogeneity comes in two flavors: the observable
should refer to the same theoretical quantity—in particular, to the same horizon—and it should be
forward looking. Second, risk analyses at the portfolio level call for a consistent modeling of the same
risk across markets and assets. As far as inflation is concerned, we would like inflation swaps and
inflation-linked bonds to rely on the same risk factor. Trading on the two asset classes—for instance
through inflation-linked swaps—might even require interchangeability of break-evens derived from
inflation swaps and from linkers.14 This involves disentangling specific market conventions from the
risk factors themselves.
14 In general, an inflation-linked swap consists of swapping a linker against a floating nominal leg.
5.2.1 The standard BEI
We characterize a first type of break-even by coming back to ILDB. Relaxing the perfect indexation
assumption implies that the payoff of an inflation-linked discount bond refers to lagged inflation. Its
price is given by

P(t,T;t_0,L) = E_t^{P*_N}[ (CPI(T−L)/CPI(t_0−L)) exp( −∫_t^T r_N(s) ds ) ]  (21)

= [CPI(t−L)/CPI(t_0−L)] E_t^{P*_N}[ (CPI(T−L)/CPI(t−L)) exp( −∫_t^T r_N(s) ds ) ]  (22)

= [CPI(t−L)/CPI(t_0−L)] E_t^{P^T_N}[ CPI(T−L)/CPI(t−L) ] B_{N,N}(t,T).  (23)
Equation (23) shows that we can embed the indexation lag within the definition of the break-even
inflation by setting

(1+BEI(t,t−L,T−L))^{T−t} = F_{CPI}(t,T−L)/CPI(t−L) = [S(t,T−L)/S(t,t−L)] (1+B̃EI(t,t−L,T−L))^{T−t},  (24)

where the tilde denotes the seasonally adjusted quantity. This definition conveniently allows us to stick
to (19) without additional effort beyond modifying the upfront index ratio. This is a standard market
practice. We thus refer to (24) as the standard BEI.
From a risk perspective, seasonally adjusted standard break-evens do satisfy some of the
aforementioned criteria on risk factors. We can indeed derive a standard zero-coupon BEI curve on a
daily basis from inflation swap quotes, as well as from treasuries and inflation-indexed bonds.
However, this simple approach comes at the cost of deriving a quantity which is not purely at risk.
Figure 8 shows how the protected period—and so the BEI—decomposes. We distinguish three parts.
From the base indexation date t−L up to the last observed reference date t_l, the inflation rate is
actually known. Between t_l and the analysis date t, the inflation rate, if not fully known, is strongly
predictable. The third, forward-looking part is the true period at risk.
Portability of standard BEI is also a concern. Conventions for computing the base index value
CPI(t−L) vary from one market to another, and leaving this component within the break-even
significantly restricts the way it can be used. For instance, French linkers indexed to the HICPx (the
OATe) obtain their base index value by interpolation between two prior observed CPI values, while
HICPx inflation swaps choose the CPI value for a single reference date. Because of these different
conventions, the discrepancy between standard BEI derived from inflation swaps and from linkers
varies in a predictable way through a month, which is unsatisfactory. The HICPx interpolation rules
additionally create spurious jumps when the protected period changes.
Figure 8
Standard and fully adjusted break-evens
Zero-coupon break-even inflation defined off a position held from t to T. L stands for the indexation
lag, and t_l for the reference date at which the last CPI value was published.
5.2.2 Fully adjusted BEI
Acknowledging these defects in the standard BEI, another common practice is to define “forward”
break-evens BEI_f(t,T,L) as the risk factors by stripping the first two parts of the protected period:

(1+BEI_f(t,T,L))^{T−t} = (1+BEI(t,t−L,T))^{T−t+L} / (1+BEI(t,t−L,t))^{L} = [S(t,T)/S(t,t)] (1+B̃EI_f(t,T,L))^{T−t}.  (25)
We underline that the “forward” label is a misuse of language, since BEI_f(t,T,L) is nothing more than
a spot break-even measure. The forward standard BEI is thus the quantity that we have to use when
applying the Fisher equation (15). More importantly, it is obvious that none of the standard BEI
drawbacks are corrected.15
We can nevertheless define another type of break-even. Coming back to Figure 8, we could partially
adjust for the indexation lag and consider a break-even over the non-deterministic protected period
from t_l to T−L. This would break the homogeneity condition, since on a daily basis for a quoted
constant time-to-maturity instrument (such as an inflation swap), the interval [t_l, T−L] increases
slowly before contracting again when a new CPI value is published for t_{l+1}. Homogeneity of the risk
factor can be satisfied by defining a break-even on the third period only, as shown in Figure 8:

(1+BEI(t,T,L))^{T−t−L} = F_{CPI}(t,T−L)/CPI(t) = [S(t,T−L)/S(t,t)] (1+B̃EI(t,T,L))^{T−t−L}.  (26)
15 The defects could be corrected if we could observe the price of an overnight inflation swap or a one-day linker.
This break-even cannot be observed, since the current CPI is unknown at t. Because of the high
predictability of CPI(t), we can define a fully adjusted break-even through a forecast CPI*(t) as

(1+BEI(t,T,L))^{T−t−L} = F_{CPI}(t,T−L)/CPI*(t).  (27)

Various strategies can be applied so as to obtain CPI*(t). Notice that the forecasting model can be
designed under the physical or the risk neutral probability measure, since the risk premium should be
nearly null for such a small horizon.16 Regression-based strategies on log CPI could be operated.17 A
conservative approach can also be designed, for instance, by assuming that the past year’s inflation is
the best forecast of the increase in the CPI from the last published value. We apply this strategy in the
following figures and data.
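The conservative nowcast and definition (27) can be sketched as follows; every number below is illustrative, not data from the paper.

```python
# Sketch of (27) with a conservative nowcast: the CPI forecast CPI*(t) grows
# the last published CPI value at the trailing one-year inflation rate.

def cpi_nowcast(last_cpi, past_year_inflation, years_since_publication):
    """CPI*(t): extrapolate the last published CPI at trailing inflation."""
    return last_cpi * (1.0 + past_year_inflation) ** years_since_publication

def fully_adjusted_bei(f_cpi_lagged, cpi_star, horizon_minus_lag):
    """(1 + BEI(t,T,L))^(T-t-L) = F_CPI(t, T-L) / CPI*(t)."""
    return (f_cpi_lagged / cpi_star) ** (1.0 / horizon_minus_lag) - 1.0

# hypothetical inputs: last CPI print 104.1 two months ago, 3% trailing inflation
cpi_star = cpi_nowcast(last_cpi=104.1, past_year_inflation=0.030,
                       years_since_publication=2.0 / 12.0)
bei = fully_adjusted_bei(f_cpi_lagged=110.0, cpi_star=cpi_star,
                         horizon_minus_lag=2.0)
print(f"{bei:.4%}")
```

Because CPI*(t) reacts to the latest published CPI values, the resulting break-even moves with realized inflation faster than the standard BEI does, which is the behavior discussed in Section 5.2.3.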
5.2.3 The adjustments in practice
We highlight the differences between the standard and the fully adjusted break-evens by looking at the
European (HICPx) and the US inflation swap markets. Figure 9 presents the term structures of adjusted
break-evens extracted from market quotes. As expected, differences are significant on the short term.
On the one hand, the standard BEI smoothes out expectations of future inflation by realized inflation
over the indexation lag period. When realized inflation has been higher than expected, it tends to lower
the true expectations of future inflation, and vice versa. For instance, given that the indexation lag on
US inflation swaps is three months, the US standard curve is influenced by the realized inflation
over August. The August inflation was high at 3.36% and is likely to bias the standard BEI curve, which
displays a short term value of 2.38%. On the other hand, fully adjusted BEI can suffer from bad
predictions of the current index value CPI*(t). Our conservative forecasting strategy slightly
underestimated the August to November inflation with an estimate of 2.5%, while ex post
we observed a 3.03% inflation rate. We nevertheless observe here that the implied short term
break-even—about 2.92%—is more in line with the last inflation realizations. Put simply, fully
adjusted break-evens react more quickly to realized inflation.18
Figure 10 shows the implications of the various adjustments on volatility estimates. Most inflation swap
markets offer a decreasing volatility with the break-even horizon, though the US market is an
16 With the development of inflation markets, liquid futures on inflation could for instance be used.
17 Prices of futures on energy and commodities, money market rates, etc. are potential forecasting variables.
18 Let us point out that even though we do not provide details here, several subtle issues were taken into account. First, the market conventions for the indexation method differ between the European and US markets. Both markets use a three-month lag, but the US market uses a linear interpolation while the European swaps are indexed between index reference dates directly. Second, data and break-evens have to be interpolated. This has been done through smoothed splines applied on seasonally adjusted break-evens and forward break-evens.
Figure 9
Adjusted break-even term structures
Forward standard BEI and fully adjusted BEI term structures from the HICPx and the US CPI inflation
swaps. All curves are seasonally adjusted. Data as of 05 November 2007
[Two panels, EU and US: BEI swap rates (%) against maturity (5 to 30 years), each showing the standard seasonally adjusted and the fully adjusted seasonally adjusted curves.]
exception. Seasonality effects are presented on the standard adjusted curves only. The figure exhibits the
benefits of adjusting break-evens for seasonality by removing meaningless waves. From previous
comments about differences between the two adjusted BEI, we could expect the standard BEI
methodology to underestimate short-term volatility. This is confirmed in Figure 10.
We further check the adequacy of the break-evens with classic assumptions of risk models. It is for
instance standard to assume that risk factors follow a normal—log-normal for a positive
variable—distribution or a t-distribution. Looking at the evolution of standard break-evens especially
casts doubt on such an assumption (see Figure 11). The data displays different regimes with several
jumps. We performed a Jarque-Bera test, a goodness-of-fit test against the normal distribution. For
standard BEI, the null hypothesis of normality is rejected at the 5% level for all maturities up to ten
years. On the contrary, for the fully adjusted break-even the same test cannot reject the null hypothesis
for all maturities above two years. Given that there is only one market point below—the one year
point—this can be interpreted as a good signal and advocates for the use of fully adjusted BEI.
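A numpy-only sketch of the Jarque-Bera statistic applied to simulated break-even changes (the paper's market data is not reproduced here): a Gaussian series keeps the statistic small, while a series with occasional jumps, mimicking the regimes visible in standard BEI, inflates it.

```python
# Jarque-Bera statistic: JB = n/6 * (skew^2 + (kurtosis - 3)^2 / 4),
# approximately chi-squared with 2 degrees of freedom under normality.

import numpy as np

def jarque_bera(x):
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    s2 = (d ** 2).mean()
    skew = (d ** 3).mean() / s2 ** 1.5
    kurt = (d ** 4).mean() / s2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

rng = np.random.default_rng(0)
normal_changes = rng.normal(0.0, 0.0005, size=2000)      # Gaussian-like moves
jumpy_changes = normal_changes + np.where(               # occasional regime jumps
    rng.uniform(size=2000) < 0.02, 0.005, 0.0)
print(jarque_bera(normal_changes), jarque_bera(jumpy_changes))
```

The jumpy series produces a statistic far beyond the chi-squared(2) critical value of about 5.99 at the 5% level, which is the mechanism behind the rejection for short-maturity standard BEI.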
Figure 10
Volatility (annualized) across the term structure of BEI from HICPx inflation swaps
Volatility computed using a decay factor of 0.94. “U” indicates a non-seasonally adjusted curve; “SA”
stands for seasonally adjusted. Data as of 05 November 2007
[EU BEI swap volatility (%) against maturity (1 to 10 years), showing the standard seasonally adjusted, standard unadjusted and fully adjusted seasonally adjusted curves.]
Figure 11
Short- and mid-term BEI from HICPx inflation swaps
Extensions of the Merger Arbitrage Risk Model
A traditional VaR approach is not suitable to assess the risk of merger arbitrage hedge funds. We recently proposed a simple two- or three-state model that captures the risk characteristics of the deals in which merger arbitrage funds invest. Here, we refine the model, and demonstrate that it captures merger and acquisition risk characteristics using over 4000 historical deals. We then measure the risk of a realistic sample portfolio. The risk measures that we obtain are consistent with those of actual hedge funds. Finally, we present a statistical model for the probability of success and show that we beat the market in an out-of-sample study, suggesting that there is a potential “alpha” for merger arbitrage hedge funds.
1 Introduction
The merger arbitrage strategy consists of capturing the spread between the market and bid prices that
occurs when a merger or acquisition is announced. There are two main types of mergers: cash
mergers and stock mergers. In a cash merger, the acquirer offers to exchange cash for the target
company’s equity. In a stock merger, the acquirer offers its common stock to the target in lieu of cash.
Let us consider a cash merger in more detail. Company A decides to acquire Company B, for example
for a vertical synergy (B is a supplier of A). Company A announces that it offers a given price for
each share of B. The price of stock B will immediately jump to (almost) that level. However, the
transaction typically will not be effective for a number of months, as it is subject to regulatory
clearance, shareholder approval, and other matters. During the interim, the stock price of B actually
trades at a discount with respect to the offer price, since there is a risk that the deal fails. Usually, the
discount decreases as the effective date approaches and vanishes at the effective date.
In a stock merger, company A offers to exchange a fixed number of its shares for each share of B. The
stock price of B trades at a discount with respect to the share price of A (rescaled by the exchange
ratio) as long as the deal is not closed.
With a cash merger, the arbitrageur simply buys the target company’s stock. As mentioned above, the
target’s stock sells at a discount to the payment promised, and profits can be made by buying the
Figure 1
Cash deals. Share price of target (thick line) and bid offer (dotted line)
λ_equity for equity deals, and N_t is a point process taking values

N(t) = 1 if t ≥ t_0 + Λ,  0 if t < t_0 + Λ.  (4)

Finally, the initial condition is

S̃_{t_0} = S_{t_0},  (5)

where S̃ denotes the virtual stock price. We can easily integrate this process and get, for t < t_0 + Λ,

S̃_t = S_{t_0} e^{ΔZ},  (6)

where ΔZ follows a normal distribution with mean (μ − σ²/2)(t − t_0) and standard deviation σ√(t − t_0).
For t = t_0 + Λ we get

S̃_{t_0+Λ} = S_{t_0} e^{ΔZ}(1 − J).  (7)
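The virtual price process (6)-(7) can be simulated directly; the parameters below are illustrative, with the jump size J set to the λ_equity = 0.2 value estimated in Section 3.

```python
# Sketch of (6)-(7): a lognormal virtual stock price with a downward jump of
# size J at the resolution date t0 + Λ. Daily drift/volatility convention.

import math
import random

def virtual_price(s0, mu, sigma, elapsed, jumped, jump_size, rng):
    """S_t = S_t0 * exp(dZ) * (1 - J)^N(t),
    dZ ~ N((mu - sigma^2/2) * dt, sigma^2 * dt)."""
    dz = rng.gauss((mu - 0.5 * sigma ** 2) * elapsed,
                   sigma * math.sqrt(elapsed))
    s = s0 * math.exp(dz)
    return s * (1.0 - jump_size) if jumped else s

rng = random.Random(7)
before = virtual_price(50.0, 0.0, 0.02, 60.0, False, 0.2, rng)  # day 60, pre-resolution
after = virtual_price(50.0, 0.0, 0.02, 135.0, True, 0.2, rng)   # failure at t0 + Λ
print(round(before, 2), round(after, 2))
```

Setting sigma to zero isolates the jump: the price then ends exactly at S_t0 (1 − J) when the deal fails.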
3 Parameters estimation and model validation
3.1 Virtual stock price
The parameters of the model are estimated using historical information on deals. The transaction
details (such as announcement date, effective date, type of deal, and so forth) are obtained from
Thomson One Banker. We consider pure cash or equity deals between public companies from 1996 to
2006 worldwide where the target company offered value is over $100 million. We consider those
deals for which we can also obtain stock prices from DataMetrics.
The daily drift μ is set to zero, and the ex-ante deal daily volatility is estimated using one year of daily
returns, equally weighted.
The intensity parameters λ_cash and λ_equity of the shock are obtained by moment matching. Conditional
on deal failure, the expected value of the stock price is

E[ S̃_{t_0+Λ} | C = 0 ] = S_{t_0} e^{(μ−σ²/2)Λ}(1 − λ·).  (8)

Assuming μ − σ²/2 ≈ 0, we get

E[ S̃_{t_0+Λ}/S_{t_0} | C = 0 ] = (1 − λ·).  (9)

Using the 131 withdrawn cash deals in our database, we get λ_cash = −0.07 ± 0.06; using the 33
withdrawn equity deals, we get λ_equity = 0.2 ± 0.1. Hence we set

λ_cash = 0  and  λ_equity = 0.2.  (10)
3.2 Deal length
We model the deal length Λ with a Weibull distribution having parameters a and b,

F(t) = 1 − e^{−(t/a)^b}.  (11)

This distribution is assumed to be universal. Using 1075 realized deal lengths (measured in days), we
obtain the following boundaries at the 95% level of confidence:

143 < a < 154  (12)
1.43 < b < 1.56  (13)

This corresponds to an average deal length of

L = 135 days.  (14)
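The Weibull model (11) and the implied mean length a·Γ(1 + 1/b) can be checked numerically, taking the midpoints of the confidence intervals (12)-(13) as point estimates.

```python
# Sketch of (11) and the implied average deal length of a Weibull(a, b).

import math

def weibull_cdf(t, a, b):
    """F(t) = 1 - exp(-(t/a)^b)."""
    return 1.0 - math.exp(-((t / a) ** b))

a, b = 148.5, 1.495                       # midpoints of (12) and (13)
mean_length = a * math.gamma(1.0 + 1.0 / b)
print(round(mean_length))                 # close to the quoted ~135 days
```

The closed-form mean a·Γ(1 + 1/b) reproduces the average deal length in (14) to within a day, a consistency check on the reported parameter intervals.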
3.3 Test of the main hypothesis
As stated above, the main hypothesis is the “existence” of a virtual stock price that is reached only in
case of withdrawal. For cash deals, λ_cash = 0, meaning the stock prices after withdrawal should follow
a lognormal distribution, with a volatility σ_i different for each deal i. Hence, the normalized residuals

u_i = log( S^i_{t_0+Λ_i} / S^i_{t_0} ) / ( σ_i √Λ_i )  (15)

should follow a standard normal distribution. The p-value of a Kolmogorov-Smirnov test using the
131 withdrawn deals is 93%, implying that we cannot reject the main hypothesis.
For equity deals, λ_equity = 0.2, and the residuals defined as above do not follow a normal distribution.
Instead we study the residuals

v_i = S^i_{t_0+Λ} / S^i_{t_0}.  (16)

These should be distributed as

e^{ΔZ_i}(1 − J),  (17)

where ΔZ_i follows a normal distribution with a parameter σ_i different for each deal. We set the
volatility equal to the average of the σ_i, and use Monte Carlo to obtain a sample distributed according
to (17). We then compare this sample to our 33 withdrawn deals using a two-sample
Kolmogorov-Smirnov test. The result is a p-value of 53%. Again, we cannot reject the
hypothesis, confirming the validity of our model.
4 Risk measurement application
We want to measure the risk of a sample portfolio consisting of 30 pure cash deals pending at the end of
2006. All deals are described by
• the target company,
• the bid offer K,
• the date of announcement t_0,
• the probability of success π,
Table 1
VaR using the merger arbitrage risk model and the traditional risk model
VaR level Merger Arb Model Traditional Equity Model
95% 1.37% 7.25%
99% 2.21% 10.24%
Table 2
Dispersion of historical VaRs for merger arbitrage hedge funds
VaR level 1st quartile median 3rd quartile
95% 0.81% 1.29% 1.68%
99% 2.17% 2.92% 4.90%
and are assumed independent. We set the probability of success π to the historical value of 86% (see
Section 5).
We forecast the P&L distribution of the portfolio at a risk horizon of one month using Monte Carlo
simulations. For each deal, one iteration is as follows:
1. Draw an effective date using the Weibull distribution.
2. If the risk horizon is subsequent to the effective date, draw a completion indicator. If the risk
horizon is before the effective date, the deal stays in place.
3. If the completion indicator indicates failure, draw a virtual stock price, and calculate the loss. If
the completion indicator indicates success, calculate the profit.
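The three steps above can be sketched in a small Monte Carlo loop. The portfolio below is not the paper's: deal prices, pre-announcement (virtual) price levels, volatilities and Weibull parameters are all made up, with λ_cash = 0 as estimated in Section 3, so withdrawn cash deals revert to a pure lognormal virtual price.

```python
# Monte Carlo sketch of the three-step iteration for a cash-deal portfolio.

import math
import random

def simulate_pnl(deals, horizon, pi, rng):
    """One equal-weighted P&L draw; a deal is (market price, bid,
    pre-announcement price, daily vol, Weibull a, Weibull b)."""
    pnl = 0.0
    for price, bid, pre, sigma, a, b in deals:
        length = a * (-math.log(rng.random())) ** (1.0 / b)  # step 1: deal length
        if length > horizon:
            continue                       # still pending at the risk horizon
        if rng.random() < pi:              # step 2-3: completed, capture spread
            pnl += (bid - price) / price
        else:                              # withdrawn: revert to virtual price
            dz = rng.gauss(-0.5 * sigma ** 2 * length,
                           sigma * math.sqrt(length))   # lambda_cash = 0: no jump
            pnl += (pre * math.exp(dz) - price) / price
    return pnl / len(deals)

rng = random.Random(42)
deals = [(99.0, 100.0, 90.0, 0.015, 148.0, 1.5)] * 30   # 30 identical toy deals
pnls = sorted(simulate_pnl(deals, 21, 0.86, rng) for _ in range(5000))
var95 = -pnls[int(0.05 * len(pnls))]
print(f"95% VaR: {var95:.2%}")
```

The asymmetry of the strategy is visible in the sorted P&L vector: many small gains from completed deals against rare, large losses from the few withdrawals that fall inside the horizon.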
Table 1 reports the VaRs at two different confidence levels obtained from the model, as well as VaRs
obtained from modeling the positions as simple equities following a log-normal distribution. We
notice that our model produces lower risk measures, consistent with our expectation.
For more evidence, we compare these monthly VaRs with the historical monthly VaRs of 41 merger
arbitrage hedge funds obtained from the HFR database. Table 2 shows that the dispersion of the hedge
fund VaRs contains our model’s results. We conclude that our new model consistently captures the
risk of a merger arbitrage hedge fund, and that the traditional model likely overstates risk.
5 Probability of success
In the risk measurement application above, the probability of success was unconditional on the deal, and set to the historical estimate using all deals worldwide from 1996 to 2006:
π_historical = N_success / N_total = 4176 / 4879 = 86%.    (18)
A deal-specific probability of success can be inferred from the observed spread in the market as in (Daul 2007),

π_implied = π(Δ, S_t0, K, r_free).    (19)
Alternately, we may fit an empirical model. We will use a logistic regression and assume that the probability of success is a function of observable factors X_i as

π_empirical = 1 / (1 + e^(−Σ_i b_i X_i)).    (20)
If the factor sensitivity b_i is positive, then larger X_i lead to a higher probability of success, assuming other factors are constant.
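As an illustration, eq. (20) can be evaluated with the sensitivities later reported in Table 3; the factor values X_i below describe a purely hypothetical friendly, same-sector cash deal:

```python
import math

def success_probability(b, x):
    """Logistic model (20): pi = 1 / (1 + exp(-sum_i b_i * X_i))."""
    return 1.0 / (1.0 + math.exp(-sum(bi * xi for bi, xi in zip(b, x))))

# Sensitivities b_i as estimated in Table 3 (constant term first), and
# hypothetical factor values X_i: friendly attitude, 15% premium,
# EV/EBITDA of 8, same sector, log relative size 2, cash deal,
# trailing-deals ratio 1.1.
b = [-1.09, 1.79, 0.76, 0.44, 0.33, 0.44, 0.34, -0.29]
x = [1.0,   1.0,  0.15, 8.0,  1.0,  2.0,  1.0,  1.1]

pi_empirical = success_probability(b, x)
```

For this stylized deal the model yields a probability of success well above the unconditional historical 86%, as one would expect for a friendly cash deal.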
We consider the following factors:
• Target attitude:
X_i = { 1 if Friendly, 0 if Neutral, −1 if Hostile }
• Premium: the relative extra amount the bidder offers. Its magnitude should be an indicator of the acquirer's interest.

X_i = (K − S_t0) / S_t0
• Multiple: the ratio of enterprise value (EV), calculated by adding the target's debt to the deal value, to the EBITDA, an accounting measure of cash flows.

X_i = EV / EBITDA
• Industrial sector: by acquiring a target in the same industrial sector, the acquirer increases its market share. This could influence deal success.

X_i = { 1 if same sectors, 0 if different sectors }
Table 3
Logistic regression on 1322 deals
Factor b_i p-value
Constant -1.09 0.04
Target attitude 1.79 0.00
Premium 0.76 0.17
Multiple 0.44 0.00
Industrial sectors 0.33 0.15
Relative size 0.44 0.00
Deal type 0.34 0.16
Trailing number of deals -0.29 0.19
• Relative size of acquirer to the target

X_i = log(Acquirer assets / EV).
• Deal type

X_i = { 1 if cash, 0 if equity }
• Trailing number of deals. The number of deals is cyclical; the position in that cycle should
influence deal completion.

X_i = N_deals in last 12 months / yearly average of N_deals
We have 1322 realized deals (completed or withdrawn) with all factors available. Table 3 shows the
results obtained from the logistic regression. We see that attitude, multiple and relative size are very
relevant factors (very small p-values). The premium, having the target and the acquirer in the same
industrial sector, and the deal type are relevant to some extent. The sensitivity for the trailing number
of deals is counterintuitive: it appears that a large number of deals announced might catalyze less
convincing deals.
To assess the predictive power of our model we perform an out-of-sample test, and compute the
so-called cumulative accuracy profile (CAP) curve. The model parameters are fit using the 873 (66%)
oldest deals. We then infer the probability of success for the remaining 449 (34%) deals. After sorting
the deals by their probability of success obtained with the statistical model (from less probable to
most probable), the CAP curve is calculated as the cumulative ratio of failures as a function of the
cumulative ratio of all deals.
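The construction just described can be sketched as follows; the probabilities and failure flags in the toy example are made up for illustration:

```python
def cap_curve(p_success, failed):
    """CAP curve: deals sorted from least to most probable success;
    y = cumulative share of failures captured within the first x share
    of deals. Returns the (x, y) points of the curve."""
    order = sorted(range(len(p_success)), key=lambda i: p_success[i])
    n, n_fail = len(p_success), sum(failed)
    xs, ys, cum = [0.0], [0.0], 0
    for rank, i in enumerate(order, start=1):
        cum += failed[i]          # failed[i] is 1 for a withdrawn deal
        xs.append(rank / n)
        ys.append(cum / n_fail)
    return xs, ys

# Toy example: 6 deals, 2 failures, and a model that assigns the
# failures the lowest probabilities of success.
p = [0.20, 0.35, 0.55, 0.70, 0.90, 0.95]
f = [1, 1, 0, 0, 0, 0]
xs, ys = cap_curve(p, f)
```

For a perfect model the curve reaches 1 as soon as x equals the overall failure ratio; for a useless model it stays on the diagonal CAP(x) = x.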
Figure 6
CAP curve for the out-of-sample test (OOS) and the implied probability of success
[Plot: CAP curves for the out-of-sample statistical model ("OOS") and the implied probability of success, with both axes running from 0 to 1.]
The 449 out-of-sample deals have an overall failure ratio of 10.2%. If the model were perfect, then
the first 10.2% of deals as sorted by our model would have contained all of the failed deals, and we
would have CAP(x) = 1 for x ≥ 10.2%. If the model were useless, the ordering would be random, and
we would have CAP(x) = x. In Figure 6 we show the result for the out-of-sample test, the
market-implied probability of success and the two limiting cases. We clearly see that our model beats
the market, suggesting that there is a potential "alpha" for merger arbitrage hedge funds. Looking
closer at the lower left corner, we further notice that the CAP curve for the statistical model follows
the perfect limiting case up to about 5%. This means that our statistical model ranks the first half of
the withdrawn deals perfectly as the worst ones.
6 Conclusion
The specifics of merger arbitrage deals can be captured by introducing a binomial completion indicator
and a virtual stock price modeled as a simple jump-diffusion process. This model has been validated
using a large set of deals. A merger arbitrage hedge fund would benefit from using this model to
measure the risk of its portfolio in a VaR framework and/or to perform stress tests using, for example,
the probability of deal success.

Finally, we have developed a statistical model for the probability of success and showed in an
out-of-sample analysis that its forecasting power is superior to that of the market.
References
Daul, S. (2007). Merger arbitrage risk model. RiskMetrics Journal 7(1), 129–141.
Moore, K. M., G. C. Lai, and H. R. Oppenheimer (2006). The behavior of risk arbitrageurs in
mergers and acquisitions. The Journal of Alternative Investments, Summer.
This paper discusses and investigates the quality of hedge fund databases. The accuracy of hedge fund return data is taken for granted in most empirical studies. We show, however, that hedge fund return time series often exhibit peculiar and most likely "man-made" patterns, which are worth recognizing. We develop a statistical testing methodology which can detect these patterns. Based on these tests, we devise a data quality score for rating hedge funds and, more generally, hedge fund databases. In an empirical study we show how this data quality score can be used when exploring a hedge fund database. Thereby we can confirm, by different means, many of the insights of (Liang 2003) concerning the quality of hedge fund return data. In a last step we try to estimate the impact of imperfect data on performance measurement by defining a "data quality bias". The main goals of this paper are to increase awareness of the practical limitations of hedge fund data and to suggest a tool for the quantification of financial data quality.
1 Introduction
The past years have seen a rapid growth of the hedge fund industry and an enormous increase of the
assets that this investor segment controls. Originally accessible only to institutional investors or very
wealthy individuals, hedge funds are nowadays much better established among the broad public. In many
countries, even retail investors can place money into hedge funds.
There are several reasons why hedge funds have been so successful in attracting new money. Hedge
fund risk-return profiles are perceived as superior to those of classical long-only mutual funds. Hedge
funds are flexible and basically unregulated. In contrast to mutual funds, any kind of financial
investment is permitted. Hedge funds may for instance go short, or invest their capital in futures,
derivative securities, commodities and other asset classes that are not accessible to mutual funds.
Furthermore, they can borrow money in order to create leverage on their portfolio. Many hedge funds
seek to achieve absolute returns. Therefore, they have no traditional benchmark such as a stock or
bond index or a blend of indices. Mutual funds, however, are required to more or less track a
benchmark and are therefore much more exposed to bearish market conditions. And indeed, a majority
of hedge funds did convincingly well in the aftermath of the burst of the technology bubble and the
September 11 terrorist attacks.
Parallel to the success and the maturing of the hedge fund industry, academics were beginning to take
an interest in how hedge fund managers achieve their profits and whether successful track records are
due to skill or just luck. The questions of the sources of hedge fund returns and performance persistence
have been addressed in many empirical studies, and the literature on these topics is still growing.
Needless to say, this research heavily relies on hedge fund return data and statistical methods to
analyze them. Concerning the data, several providers offer hedge fund databases. These databases
differ substantially in their coverage of funds and in the information supplied beyond the return time
series. The hedge fund database business is rather fragmented. There does not seem to be a "gold
standard" for hedge fund returns. As a matter of fact, no database exists that would provide full
coverage. The diversity of hedge fund databases used in articles can explain dissimilar quantitative
results.
While increasingly complicated stochastic models are being used for the description of hedge fund
returns, there are certain limitations on the data side. These limitations are only marginally discussed,
if not neglected. The purpose of this paper is not to provide refinements of models or to present yet
another large empirical study. Instead, we are concerned about what forms the backbone of hedge
fund research: hedge fund data. Our focus lies on data quality aspects, a topic that is somewhat
disregarded in the literature. From our experience, hedge fund return data is not always beyond all
doubt. To assess the plausibility of hedge fund return data, we propose an objective and
mathematically sound method. We then use this method to analyze a hedge fund database. We also
quantify the impact which imperfect data may have.
1.1 Issues with hedge fund data
We have recently examined hedge fund databases of several providers, and have come to doubt the
quality of the return data. In this paper, we work exclusively with the Barclay1 database. In all
databases, similar issues were identified.
To illustrate the aforementioned issues, we consider the following example of an active onshore
long-short hedge fund. Its return and assets under management time series are displayed in Table 1.
The following peculiarities are striking:
• The returns of the year 2000 are repeated in 2001. This is obviously a serious data error.
Interestingly, the assets under management do not show any recurring patterns.
1Barclay Hedge is a company specialized in the field of hedge fund and managed futures performance measurement
and portfolio management; see www.barclayhedge.com. We benefited from the excellent support by Sol Waksman and his
client service team of Barclay Hedge. This is gratefully acknowledged.
Table 1
Monthly returns of an active long-short hedge fund, January 1999 through December 2002. "AUM" stands for assets under management (in $M). The full time series covers January 1991 through September 2007. The symbol "+" signifies that the return value appears at least twice in the entire time series. The boxes frame the blocks of recurring returns.
Date AUM Return (%) Date AUM Return (%)
Jan–1999 10.7 0.20 + Jan–2001 22.5 4.40 +
Feb–1999 10.7 6.50 Feb–2001 22.5 0.20 +
Mar–1999 15.5 3.70 + Mar–2001 34.0 0.00 +
Apr–1999 15.5 -6.30 Apr–2001 34.0 5.40 +
May–1999 15.5 -0.90 + May–2001 34.0 6.40 +
Jun–1999 23.8 2.90 + Jun–2001 43.0 0.40 +
Jul–1999 23.8 0.10 + Jul–2001 43.0 1.10 +
Aug–1999 23.8 4.10 + Aug–2001 43.0 -2.60 +
Sep–1999 28.1 -0.80 + Sep–2001 48.0 -8.60 +
Oct–1999 28.1 -2.10 + Oct–2001 48.0 4.00 +
Nov–1999 28.1 -3.00 Nov–2001 48.0 3.50 +
Dec–1999 26.8 5.70 Dec–2001 48.0 3.10 +
Jan–2000 26.8 4.40 + Jan–2002 48.0 0.90 +
Feb–2000 26.8 0.20 + Feb–2002 50.0 -0.90 +
Mar–2000 36.1 0.00 + Mar–2002 50.0 3.40 +
Apr–2000 36.1 5.40 + Apr–2002 50.0 1.80 +
May–2000 36.1 6.40 + May–2002 51.5 1.60 +
Jun–2000 28.0 0.40 + Jun–2002 51.0 -0.90 +
Jul–2000 28.0 1.10 + Jul–2002 50.0 -5.70
Aug–2000 28.0 -2.60 + Aug–2002 51.0 -0.80 +
Sep–2000 25.0 -8.60 + Sep-2002 51.0 -2.30
Oct–2000 25.0 4.00 + Oct–2002 53.0 -0.60 +
Nov–2000 25.0 3.50 + Nov–2002 60.0 3.30 +
Dec–2000 22.5 3.10 + Dec–2002 62.0 1.10 +
• The returns appear to be rounded.
• Many return values appear more than once in the time series (depicted by the symbol “+”).
Note that this is partially caused by the rounding.
• Appearance of zero returns (for instance in March 2000). It is rather unlikely that a fund has
returns exactly equal to zero.
The recurrence of one year of return data is clearly a serious error. We admit that in the Barclay
database such extreme problems appear for a handful of funds only. Much more frequent is the
recurrence of blocks of length two or three. For instance, the January to March returns of a certain
year would recur in one of the subsequent years. We picked the long-short fund because it provides an
exemplary illustration of all types of problems that we have encountered. Once again, it must be
stressed that such irregularities are by no means restricted to Barclay, but were evident in all databases
we examined.
An important question is why for this particular fund the data quality is so poor. One argument could
be that the fund is exposed to illiquid markets or instruments, which would make an accurate
valuation difficult. In the Barclay database, we find the following description of the fund’s investment
objectives:
Long/Short stocks and other securities and instruments traded in public markets.
Emphasis is to manage the portfolio with near zero beta. Focus on companies with market
capitalization between 200 mm and 2.5 bb. Uses quantitative screening and fundamental
analysis to identify undervalued equities with strong cash flow to purchase. The short
strategy uses proprietary quantitative screens and fundamental analysis to identify short
opportunities with a non-price based catalyst, potential for negative earnings surprise, and
overvaluation. Overvaluation is a necessary but not sufficient condition to be short.
This description indicates that the fund's positions are probably not particularly illiquid, and that it
should be feasible to supply an exact valuation once per month. It appears either that the fund does
have valuation difficulties, or at minimum that it does not report the exact valuations to Barclay.
Since the reported monthly performance numbers appear unreliable, one might postulate that the
long-short fund in question does in general not properly value its assets every month and that
therefore only crude return estimates are reported. Two arguments speak against this. First, the
long-short fund is audited every December, information provided by the Barclay database and
absolutely plausible in view of the fund size. Therefore, one would expect that in December some sort
of equalization is applied, meaning that the December return is determined in such a way that the
actual one-year return is matched. This return figure would most likely be a number with two digits
after the decimal. However, for the fund in question we do not see any numbers with more than one
digit after the decimal. Second, the long-short fund is open for new investments, and subscription is
possible every month. This would imply that the fund is able to quantify the net asset value of its
holdings on a monthly basis.
Finally, the information reported to data providers is not audited by a third party and cannot be
thoroughly reviewed by the data provider due to the mass of funds that report. One also has to be
aware that hedge funds report voluntarily, and therefore the willingness of fund managers to revise
numbers in order to ensure data accuracy might be rather limited.
To conclude our reasoning: the most likely explanation for the questionable return history is a certain
negligence exercised by the fund when reporting to Barclay.
1.2 Goals and organization of the paper
The lesson from the previous example is that hedge fund return data can be problematic. While the
conclusions from empirical hedge fund research might be unaffected in qualitative terms, it is clear
that inaccurate data could have an impact when industry-wide numbers such as performance, Sharpe
ratio or alpha are calculated. In the past, people have cared a lot about data biases such as the
survivorship bias, which occurs when failed hedge funds are excluded from performance studies. The
survivorship bias generally leads to an overstatement of performance. The accuracy of return data
itself is, however, mostly taken for granted, and the impact of the data quality on the analysis is rarely
questioned. This is in surprising contrast to the attention paid to the "classical" data biases. This
paper tries to fill that gap and to increase awareness of the practical limitations of hedge fund data.
We regard inaccurate data as another cause of performance misrepresentation. In this context we
would like to introduce a new term: data quality bias. It is by no means our aim to excoriate
hedge fund data providers, which are reliant on hedge funds reporting accurately. The providers'
hands are tied when it comes to verification of the performance numbers. Perhaps this paper
contributes to preparing the ground for improvement of hedge fund data quality in the future.
The only way to assess the accuracy of hedge fund returns in a systematic and objective manner is
via a quantification. We first devise tests that detect the kind of problems discussed above. The results
of these tests are then combined into a single number, which serves as a measure or score for how
plausible a hedge fund return time series is. The score of a group of funds is just the average score. We
will also call it the data quality score. Applying the data quality score to the Barclay hedge fund data, we
provide a small study showing results which often have an intuitive explanation. It is important to be
aware that our score rates the quality of returns only. It disregards other aspects of data quality.2
A score for data quality is a powerful tool and can serve multiple purposes. It allows for instance
comparison of different groups of funds with respect to data accuracy. We note that, if properly
adapted, the ideas and principles presented in this paper can be applied to any kind of financial data.
Besides identifying problematic samples, a score of data quality helps one to monitor the
improvement of data quality over time, to confirm the effectiveness of data improvement measures,
and to differentiate between competing databases.
The paper is organized as follows. In a section on preliminaries, we provide a brief survey on the
hedge fund literature. In the subsequent section we introduce and discuss our data quality score. This
is then followed by an empirical study using the Barclay database. Finally we conclude.
2 Preliminaries
First, we give a classification of the hedge fund literature. Such an overview helps to make the
connection between this paper and the literature. Secondly, we provide more details on hedge fund
databases and their biases. Lastly, we cite and summarize the literature on hedge fund data quality
that we are aware of.
2.1 Classification of hedge fund literature
As mentioned earlier, virtually any hedge fund research relies on return data. There are basically three
interrelated main streams of academic research, addressing the following matters:
Hedge fund performance persistence
Here the main question is whether the performance achieved by a fund relative to its peers3 is
consistent over time, or in other words, whether the outperformers of a certain time period are likely to
remain outperformers in the next time period, and vice versa. Miscellaneous methods have been
2For a readable survey on data quality, we refer to (Karr, Sanil, and Banks 2006).
3That is, other hedge funds which pursue a similar investment strategy.
applied, and the hedge fund databases of various providers were used, in order to investigate whether
hedge fund performance persists. The answers are mixed, and it seems that the community has not yet
reached a consensus. We refer to (Eling 2007) for a comprehensive overview of the literature on
hedge fund performance persistence.
Sources of hedge fund returns
We have used the terms "performance" and "track record" without being explicit. Comparisons and
rankings of hedge funds (or any investments) based on raw returns would not be sensible, because one
would neglect the risks that have been taken in order to achieve these returns. Therefore, risk-adjusted
performance measures should be used when ranking funds. The most classical measure is the Sharpe
ratio. Since the statistical properties of hedge fund returns are often not in accordance with
the Gaussian law (skewness and fat tails), however, many people resort to a generalized Jensen's alpha. Alpha
is the regression intercept of a multi-factor linear model, and (together with the mean-zero
idiosyncratic return) represents the skill of the manager, that is, what is unique about the manager's
investment strategy. Building a factor model that contains all common driving factors (or sources of
hedge fund returns) is not trivial. Capturing hedge funds' trading strategies through a linear model
requires the use of non-linear factors. There have been many proposals for hedge-fund-specific style
factors; see for instance (Hasanhodzic and Lo 2007).
Hedge fund return anomalies
This topic is probably closest to the main theme of this paper, and for this reason we elaborate a bit
more on it. While the question about the economic sources of returns searches for the factors that
determine the performance of hedge funds, the anomalies research stream deals rather with what one
could call the fine structure of hedge fund return processes, or in other words, the peculiarities that
they exhibit.
Hedge fund returns are mostly not available at frequencies higher than monthly. There are several
reasons for this. Unlike mutual funds, hedge funds are privately organized investment vehicles and
often not subject to regulation; therefore there are no binding reporting standards. Moreover, there is
still secrecy around hedge funds. Managers are reluctant to disclose information on a daily basis, even
something as basic as realized returns. This is particularly the case for those that trade in illiquid
markets, because it is feared that disclosed information could be abused by competitors. From an
operational point of view, managers do not want to have the burden of daily subscriptions or
redemptions necessitating the calculation of daily returns because they want to have the freedom to
fully concentrate on investment operations.
A great deal of the anomalies literature is concerned with the serial correlation of hedge fund returns.
The occurrence of pronounced autocorrelations is remarkable because it seems to contradict the
efficient markets hypothesis. However, due to lock-out and redemption periods, it would hardly be
possible to take advantage of these autocorrelations. Getmansky, Lo and Makarov (2004) show that
serial correlations are most likely induced by return smoothing. These authors argue that the exposure
of the fund to illiquid assets or markets leads to return smoothing when a portfolio is valued. This also
explains why funds that invest in very liquid assets (such as CTAs4 and long-only hedge funds) rarely
show significant autocorrelations. Getmansky, Lo and Makarov (2004) propose the use of
autocorrelations as a proxy for a hedge fund's illiquidity exposure. They moreover stress that the
naive estimator overstates the Sharpe ratio because it ignores the autocorrelation structure of hedge
fund returns.
Bollen and Pool (2006) go a step further and insinuate that some managers might smooth returns
artificially by underreporting gains and diminishing losses, a practice they call "conditional
smoothing". The authors also devise a screen to detect funds that apply conditional smoothing. Such
screens could be used by investors or regulators as an early warning system. Conditional smoothing
does not necessarily imply purely fraudulent behavior of the managers. It can be partially explained
by the pressure that managers face to accord with the widespread myth of hedge funds as
generators of absolute returns. However, history shows that many hedge fund fraud cases came along
with misrepresentations of returns, and so it might be worthwhile to have a closer look at those funds
which appear to misrepresent returns.
In a subsequent paper (Bollen and Pool 2007), the same authors look at the distribution of hedge fund
returns and give evidence that it has a discontinuity at zero. Again, the explanation is the tendency to
avoid reporting negative returns. It is tempting for a hedge fund manager to report something like
0.02% for an actual return of, say, -0.09%. If such practices are followed by a non-negligible number
of managers, a discontinuity at zero will occur. The authors test for other possible causes, but return
misrepresentation turns out to be the most likely.
Similar in nature is a study by (Agarwal, Daniel, and Naik 2006), which shows that average hedge
fund returns are significantly higher during December than during the rest of the year. The analysis
indicates that the “December spike” is most likely related to the compensation schemes of hedge
4CTA stands for commodity trading advisor and actually comes from legal terminology. The CTA strategy is also
referred to as managed futures. A CTA manager follows trends and invests in commodities, currencies, interest rates,
futures and other liquid assets.
funds. These tempt managers to inflate December returns in order to achieve a better year-end
performance, which in turn leads to higher compensation. An equalization is then made in the
subsequent year. Another piece of research giving evidence that hedge fund managers are driven by
the incentive structure is (Brown, Goetzmann, and Park 2001). There, it is shown that hedge funds that
perform well in the first six months of the year tend to reduce their volatility in the second half of the
year. It seems very hard to avoid such behavior other than through modifying the incentive systems.
2.2 Hedge fund databases and biases
Hedge fund data is marketed by several providers. Some are small vendors focusing on hedge fund
data alone, whereas others operate within a large data provider company that covers a variety of other
financial segments. Currently there are about twenty providers, many of which offer additional
services such as hedge fund index calculation.
The various hedge fund databases differ in coverage and in the information supplied besides pure
returns or assets under management. The differences with respect to coverage are considerable. The
estimated coverage of the largest databases is no more than 60% of all hedge funds.5 A reason for this
is the fact that managers typically report to one or two, but hardly to all existing databases. Some
funds prefer not to report at all, particularly if they are not interested in attracting new investors. Some
providers are specialized in the collection of data of certain hedge fund segments and would even
exclude others.6 For all these reasons, there is no database yet which fully represents all hedge funds.
Indices constructed on the basis of one database again share the problem of inadequate representation of the
hedge fund universe. This issue led EDHEC, a French risk and asset management research institution
supported by universities, to construct a family of indices of hedge fund indices. These indices are
meant to combine the information content of the various data provider indices in an optimal fashion;
see (Edhec-Risk Asset Management Research 2006).
Apart from inadequate representation, which leads to biased estimates of performance, there are other
data biases which play an important role. Numerous articles discuss these biases and provide
estimates of their magnitude.
Survivorship bias. This bias, mostly upward, is created when funds that have been liquidated, or
have stopped reporting, are removed from the database. Survivorship bias has also been
discussed in the mutual fund performance literature, but it is particularly pronounced for hedge
5See (Lhabitant 2004).
6As an example, HFR excludes CTAs from their database.
funds because their attrition rate is significantly higher than that of mutual funds. Nowadays,
providers are aware of this issue and make sure that the collected data of defunct funds does not get
erased. Most hedge fund providers offer so-called graveyard databases, that is, databases
containing the "dead" funds.
Backfill bias. This bias occurs when hedge funds that join a database are allowed to report some of
their past return history. This again leads to overstatements of the historical performance of all
funds, because hedge funds will most likely start reporting during a period of success. A simple
remedy to limit this bias is to record the date when the fund joined the database.
Self-reporting bias. Recall that hedge funds report voluntarily. There might be differences between
reporting and non-reporting funds, and these differences are difficult to quantify. An indirect way
is to look at funds-of-hedge funds. The performance of funds-of-hedge funds can serve as a proxy
for the performance of the "market portfolio" of hedge funds.
Another big difficulty, leading to potentially distorted numbers, is the style classification of hedge
funds. First, the style classification used by a provider is hardly ever perfect. Second, the style is
self-proclaimed by the manager. Third, the investment style pursued by a fund may change over time,
but the databases we know do not treat style as a time series item. A lot more could be said about
hedge fund databases and their biases; see the excellent survey given in (Lhabitant 2004).
2.3 Hedge fund data quality
Note that the biases presented above are uniquely connected to the way the providers collect and
manage data, and to the willingness of managers to report returns and other information. Most of
these studies take the accuracy of hedge fund returns for granted. We are aware of two papers raising
questions regarding this assumption. In (Liang 2000), differences between the HFR and TASS
databases are explored. The returns and NAVs of funds that appear in both databases are compared.
The returns coincide in only about 47% of the cases and the NAVs in about 83% of the cases. The
second article (Liang 2003) finds that the average annual return discrepancy between identical hedge
funds in TASS and the US Offshore Fund Directory is as large as 0.97%. Liang also compares
onshore versus offshore equivalents of TASS and different snapshots of TASS. Furthermore, he
identifies factors which influence return discrepancies by means of regressions. He finds that audited
funds have a much lower return discrepancy than non-audited funds. Moreover, funds listed on
exchanges, funds of funds, funds open to the public, funds invested in a single industrial sector and
funds with low leverage generally have fewer return discrepancies than other funds. Similarly to us,
Liang questions the accuracy of hedge fund return data. He measures data quality in terms of return
discrepancies across databases. Liang does not, however, ascertain which of the two data sources is
more credible and therefore of higher quality.
Our paper has a similar scope, but we highlight two differences in our approach:
• We rate the quality of a single database and the funds therein in absolute terms. We do not
depend on comparisons across databases. In contrast, Liang quantifies data quality in relative
terms by looking at return discrepancies.
• We can assess the data quality of all funds, since we do not rely on matching funds between
different databases. In contrast, Liang's approach can only determine the data quality for the
intersection of funds in two databases.
3 A data quality score
In this section, we devise a quality score for fund return time series. Inspired by the patterns found in
the long-short hedge fund of the introduction, we first define five statistical tests for time series of
returns. For a fund, the quality score is the number of rejected tests. For a group of funds, the quality
score is the average of the fund scores. For illustrative purposes, we finally compare the quality of
stock returns and fund of hedge fund returns.
3.1 Testing for patterns
As announced, we design five tests to detect patterns in return data. A test is rejected if the return data
exhibits the corresponding pattern. We suppose that the monthly returns are expressed as percentages
with two decimal places as in Table 1. We begin by describing the tests in a rather loose way:
1. For test T1, the number z1 of returns exactly equal to zero is evaluated. If z1 is “too large”, T1 is
rejected.
2. Test T2 is based on the inverse z2 of the proportion of unique values in the time series. If z2 is
“too large” (or, equivalently, the proportion of unique values is too small), T2 is rejected.
3. Test T3 looks at runs of the time series. A run is a sequence of consecutive observations that are
identical. To give an example, (2.31, 2.31, 2.31) would be a run of length three. If the length z3
of the longest run is “too large”, T3 is rejected.
4. In test T4 the number z4 of different recurring blocks of length two is evaluated. A block is
considered recurring if it reappears in the sequence without an overlap. For example, consider
the sequence (1.25, 4.57, −2.08, 8.21, 8.21, 8.21, 1.25, 4.57, −2.08).
The sequence contains two different recurring blocks of length two: (1.25, 4.57) and
(4.57, −2.08). Note that (8.21, 8.21) is not a recurring block because of the no-overlap rule.
The test T4 is rejected if z4 is “too large”.
5. Test T5 is based on the sample distribution of the second digit after the decimal. If this
distribution is “unlikely”, T5 is rejected.
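In code, the first four counting statistics can be sketched as follows. This is a hypothetical Python illustration (the paper gives no code, and the function names are ours); the statistic z5 is treated separately via the χ² distance (4) defined later in this section.

```python
def z1(returns):
    """Number of returns exactly equal to zero (test T1)."""
    return sum(1 for r in returns if r == 0.0)

def z2(returns):
    """Inverse of the proportion of unique values in the series (test T2)."""
    return len(returns) / len(set(returns))

def z3(returns):
    """Length of the longest run of identical consecutive values (test T3)."""
    longest, current = 1, 1
    for prev, cur in zip(returns, returns[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest

def z4(returns):
    """Number of distinct blocks of length two that recur without overlap (test T4)."""
    blocks = list(zip(returns, returns[1:]))
    count = 0
    for block in set(blocks):
        pos = [i for i, b in enumerate(blocks) if b == block]
        # recurring without overlap: two occurrences at least two positions apart
        if any(q - p >= 2 for p in pos for q in pos):
            count += 1
    return count
```

For the example above, `z4` counts (1.25, 4.57) and (4.57, −2.08) but not the overlapping (8.21, 8.21).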
It is evident that there are overlaps between the five tests. Tests T2 and T5 check for concentration of
return values and rounding, whereas T3 and T4 are meant to uncover repetitions in the data.
So far we have been unspecific about the thresholds for rejecting the tests. The role of the thresholds
is to discriminate between patterns appearing just by chance and those that are caused by real
problems in the data. Fixed thresholds would not be useful, since the longer the time series, the more
likely certain features such as recurring blocks occur by chance. Moreover, the volatility plays an
important role; funds with a very low volatility will feature a high concentration in certain return
values because the range of the data is limited.
We thus set the thresholds on a per time series basis. To this end, we suppose that monthly fund
returns are independent and identically distributed normal random variables rounded to two digits
after the decimal:
r_t \overset{\text{i.i.d.}}{\sim} \bar{\mathcal{N}}(\mu, \sigma^2), \qquad t = 1, \ldots, n. \qquad (1)
Here the notation N̄ highlights that the normal random variables are rounded, and n denotes the length
of the return time series. Under the distributional assumption (1) we next compute for each test Ti the
probability that the corresponding test statistic Zi is equal to or larger than the actually observed zi:
p_i = P_{\mu,\sigma^2;\,n}(Z_i \ge z_i). \qquad (2)
If this probability is small, it implies that we have observed an unlikely event, and so the pattern can
be considered significant. Note that pi is the p-value of the test Ti under the null hypothesis (1).
Instead of working with thresholds, we can equivalently use levels of significance and reject tests if
the p-values fall below these levels. We chose a common level of significance equal to 1%
because this makes all tests comparable. Summarizing, we have:
\text{reject } T_i \iff p_i < \alpha = 1\%. \qquad (3)
Now there are a couple of practical issues to consider. For computing the p-values, we replace the
unknown parameters µ and σ² in (2) by the sample mean µ̂ and the sample variance σ̂² of the
returns, respectively. The numerical values of pi are then obtained by Monte Carlo simulation.
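A minimal sketch of this Monte Carlo scheme, with a generic test statistic passed as a function. The helper name, the simulation count, and the zero-count statistic used for illustration are our own choices, not from the paper.

```python
import random
import statistics

def count_zeros(returns):
    """Illustrative test statistic: number of returns exactly equal to zero (T1)."""
    return sum(1 for r in returns if r == 0.0)

def simulate_pvalue(returns, stat, n_sim=500, seed=0):
    """Monte Carlo estimate of the p-value P(Z >= z_obs) in (2), under the
    model (1) of i.i.d. normal returns rounded to two decimals, with mu and
    sigma replaced by their sample estimates."""
    rng = random.Random(seed)
    mu = statistics.mean(returns)
    sigma = statistics.pstdev(returns)
    z_obs = stat(returns)
    n = len(returns)
    hits = 0
    for _ in range(n_sim):
        sim = [round(rng.gauss(mu, sigma), 2) for _ in range(n)]
        if stat(sim) >= z_obs:
            hits += 1
    return hits / n_sim
```

A series with many exact zeros yields a p-value near zero, while a series with none yields a p-value of one, since Z ≥ 0 always holds.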
We have yet to define the test statistic Z5. We first determine, via Monte Carlo simulation, the
distribution of the second digit after the decimal under (1) with µ = µ̂ and σ² = σ̂². The probability
that this digit is equal to k is denoted by qk. We have found that for the range of volatilities σ ≥ 0.5
the digit is close to being equidistributed on {0, 1, . . . , 9}. For a sample of n returns, the number of
occurrences of k as the second digit after the decimal is denoted by nk. We define Z5 as the distance
between the sample distribution of the second digit and its distribution under (1). This distance is
measured through the classical χ² goodness-of-fit test statistic:
Z_5 = \sum_{k=0}^{9} \frac{(n_k - n q_k)^2}{n q_k}. \qquad (4)
Note that we use Monte Carlo simulation for the calculation of the p-values of T5; we do not resort to
a χ² approximation of the distribution of Z5.
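The statistic (4) might be computed as in the following sketch, where both the digit probabilities qk and the observed counts nk are obtained as described above. The helper names and the simulation size are our own assumptions.

```python
import random
from collections import Counter

def second_digit(r):
    """Second digit after the decimal of a return rounded to two decimals."""
    return int(round(abs(r) * 100)) % 10

def z5(returns, n_sim=4000, seed=0):
    """Chi-square distance (4) between the observed second-digit counts n_k
    and the digit distribution q_k simulated under the rounded-normal
    model (1), with mu and sigma set to their sample estimates."""
    rng = random.Random(seed)
    n = len(returns)
    mu = sum(returns) / n
    sigma = (sum((r - mu) ** 2 for r in returns) / n) ** 0.5
    # estimate q_k by simulating rounded normal returns
    sim = Counter(second_digit(round(rng.gauss(mu, sigma), 2))
                  for _ in range(n_sim))
    q = [sim[k] / n_sim for k in range(10)]
    counts = Counter(second_digit(r) for r in returns)
    return sum((counts[k] - n * q[k]) ** 2 / (n * q[k])
               for k in range(10) if q[k] > 0)
```

A heavily rounded series, where every return ends in a zero, produces a far larger Z5 than a series whose second digits are spread over 0–9.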
To conclude the definition of the tests, a couple of remarks are warranted. From visual inspection of
return time series we have developed a sense of imperfect data. We have concentrated on patterns in
sequences of return numbers. This resulted in the tests T1–T5. Of course these tests are not necessarily
exhaustive, since there might be other patterns which we are not aware of. Although we believe that
testing for faulty outliers of extreme returns would be highly relevant, the only feasible way of
doing so would be a comparison of identical funds across various databases, which we did not
pursue.
It is evident that there is a speculative element in our method. We can merely conjecture that a certain
return time series is inaccurate. The only way to validate our approach would be to call up all the
funds with problematic data, which is clearly beyond the scope of this paper.
An assumption which might lead to objections is the hypothesis (1) of i.i.d. normally distributed
percentage returns rounded to two decimal places. This model is needed to
estimate the p-values, and we are aware that it is crude. It could easily be replaced by a model
allowing for skewness and heavy tails. Note that our tests are discrete in nature and rather indifferent
to the return distribution. For this reason we conjecture that the approach is to some degree robust
with respect to the chosen model for the return distribution. At least on a qualitative level we expect
that the distributional assumptions have little impact on the results of Section 4.
Another choice we have made is the level of significance α = 1%, which is somewhat arbitrary, as with any statistical testing problem. We have taken a rather low α because we wish to be prudent about rejecting funds and want to keep the Type I error7 low. Another reason for taking a low α is the increase of the Type I error if the five tests are applied jointly. The Type I error of the five tests applied jointly is smaller than or equal to 5%. This is a consequence of the Bonferroni inequality:
P_{\mu,\sigma^2;\,n}(\text{at least one } T_i \text{ is rejected}) \le \sum_{i=1}^{5} P_{\mu,\sigma^2;\,n}(T_i \text{ rejected}) = 5\alpha. \qquad (5)
3.2 Definition of the quality score
The data quality score of a fund is just the number of rejected tests. Using (3), the score can be formally written as
s = \sum_{i=1}^{5} \mathbf{1}\{p_i < \alpha\}. \qquad (6)
Note that high values of s correspond to a low data quality, and vice versa. For a group F of funds, the score is the average fund score:
S = \frac{1}{|F|} \sum_{j \in F} s_j. \qquad (7)
Note that S is the average number of rejected tests (per fund) and lies between zero and five. If hypothesis (1) holds true for each fund, we have that
E(S) = \frac{1}{|F|} \sum_{j \in F} E(s_j) \le 5\alpha, \qquad (8)
since P_{µ,σ²;n}(pi < α) ≤ α; note that we do not have equality because the Zi are discrete. Our rationale for testing a group of funds is to compare its score (7) with the upper bound (8) on the expected number of rejected tests. If the score exceeds this bound by far, we infer that there are issues with some of the underlying return time series. In the next section we provide an illustration with stocks and funds of hedge funds.
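Equations (6) and (7) translate directly into code; a minimal sketch with our own function names:

```python
ALPHA = 0.01  # common level of significance

def fund_score(pvalues):
    """Quality score (6) of one fund: the number of tests rejected at level ALPHA."""
    return sum(1 for p in pvalues if p < ALPHA)

def group_score(pvalues_per_fund):
    """Group score (7): the average fund score. Under the model (1) its
    expectation is bounded by 5 * ALPHA, inequality (8)."""
    scores = [fund_score(p) for p in pvalues_per_fund]
    return sum(scores) / len(scores)
```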
3.3 A reality check
As a first application of the data quality score, we would like to demonstrate the fundamentally different quality of equity returns and Barclay funds of hedge funds returns. We chose funds of hedge funds because this is one of the best categories in the Barclay database with respect to data quality.
7That is, the likelihood of rejecting a “good” return time series by chance.
Table 2
Data quality score of monthly return data

          Stocks    Funds of Hedge Funds
Score     0.04      0.23
P(s=0)    96.77%    85.60%
P(s=1)    2.84%     8.65%
P(s=2)    0.39%     3.61%
P(s=3)    0.00%     1.47%
P(s=4)    0.00%     0.63%
P(s=5)    0.00%     0.04%
The equity data was obtained from the Compustat database. We took the monthly returns of the
members of the RiskMetrics US stock universe, which is used to produce equity factors for the
RiskMetrics Equity Factor Model.8 This universe contains the ordinary shares of the largest US
companies and consists of roughly 2000 stocks. The time window was chosen such that the stock
return time series length matches the average length of the funds of hedge funds return time series.
The results in Table 2 speak for themselves and support our hypothesis that there are issues with
hedge fund data. Note that for the stocks the Bonferroni inequality (5) is “on average” respected, since
P(\text{at least one test is rejected}) = P(s > 0) = 1 - P(s = 0) = 3.23\%. \qquad (9)
Here P(s = 0) is the proportion of return time series with a quality score of zero, that is, with no
rejected tests. The quality score for the stocks also obeys the inequality (8). All this indicates that, as
expected, the equity data does not exhibit any of the patterns we are concerned with.
These positive findings are contrasted by the funds of hedge funds return data. Here many more funds
than predicted by the model (1) display patterns; inequality (8) is clearly breached. This leads us to
conclude that the accuracy of funds of hedge funds data cannot always be taken for granted.
8See (Straumann and Garidi 2007).
4 Analyzing a hedge fund database
This section presents an empirical study, from which many conclusions about data quality can be
drawn. The study also demonstrates the power of the previously defined data quality score and the
mechanics for using it. We explore the Barclay database with a particular view towards its quality.
4.1 The Barclay database
In this section, we describe the Barclay database and discuss the filtering rules which we have applied.
We look at the October 2007 snapshot of the complete Barclay database, which contains CTAs, funds
of funds, and hedge funds. We also consider inactive funds. A fund is called inactive if it has stopped
reporting to Barclay; note that this does not necessarily mean that it no longer exists. All in all,
the database contains 11,701 funds. In order that all funds contain consistent information, we apply
certain filters. These six filters, discussed next, reduce the Barclay database to a
universe of 8574 funds.
First, we remove the 591 funds with no strategy classification. We only admit funds that report returns
net of all fees. This leads to a further exclusion of 204 funds. The next filter deals with duplicates. For
many funds there exist multiple classes, sometimes denominated in different currencies. Also there
are funds coexisting in an onshore and offshore version. We want to restrict ourselves to one class or
version only. Similarly to (Christory, Daul, and Giraud 2006), we devised an algorithm to find fund
duplicates. This algorithm is based on a comparison of Barclay’s manager identifiers, strategy
classifications, the roots of the fund names, and the investment objectives. The latter are stored in a
text field and consist of longer written summaries of the type shown in the introduction of this paper.
All in all, we remove 1184 duplicates. The next filter removes the 702 funds which have not reported
more than one year of monthly returns. Since we consider the assets under management (AUM) as a
very important piece of information, we next remove all 375 funds where the AUM time series is
missing.9 We mention that we have removed the leading and trailing zeros in all return time series,
since we interpret the latter as missing data. The occurrence of runs of zeros of length three or more
strictly inside the return time series is also interpreted as missing data. We omit the corresponding
71 funds. The remaining funds have no gaps in their return series. We also mention that all funds of
the Barclay database have a monthly return frequency.
The Barclay strategy classification consists of 79 items. We mapped these categories to the following
broad EDHEC categories: CTA, Fund of Hedge Funds, Long/Short Equity, Relative Value, Event
9We admit however gaps of missing data in the AUM time series.
Table 3
Summary of Barclay hedge fund strategies
Active Inactive Total
CTA 769 1633 2402
Funds of Funds 1800 582 2382
Hedge Funds 2325 1465 3790
Long/Short Equity 1183 737 1920
Relative Value 459 404 863
Event Driven 260 167 427
Emerging Markets 321 89 410
Global Macro 102 68 170
Total 4894 3680 8574
Driven, Emerging Markets and Global Macro.10
Table 3 summarizes the number of funds in each strategy broken down by status. Barclay is known
for providing a large coverage on CTAs, and this can also be seen from the numbers. In the CTA
category there is a high proportion of inactive funds. This seems to be a database legacy artefact.
Among the funds active during the 1980s and 1990s, the CTA category clearly dominates. We
conclude that Barclay mainly focused on CTAs during these times. Moreover the CTAs exhibit a
higher attrition rate than the other categories.11 As we will see below, the CTA class contains many
“micro-funds” with less than one million dollars in AUM. It is not surprising that small funds have a
higher likelihood of disappearing than large funds. This point has also been addressed and confirmed
by (Grecu, Malkiel, and Saha 2007). We mention that many of these tiny funds are, legally speaking,
not CTAs because their managers do not hold an SEC licence; since they invest similarly to CTAs,
Barclay nevertheless categorizes them as CTAs.12
10We refer to www.edhec-risk.com/indexes for a concise description of these strategies.
11The average annual attrition rates in the period 1990–2006 are: 12.3% for CTAs, 3.9% for funds of funds and 6.2% for hedge funds.
12From personal communication with the Barclay Hedge client services.
4.2 Overview of the data quality
After applying the filters, we determine the score for every fund. Tables 4 and 5 give an overview of
scores and rejection probabilities. In overall data quality, global macro and funds of funds do best and
CTAs worst.
The favorable data quality of funds of funds is not unexpected. Fund of funds managers are not
directly involved in trading activities, and their role is to some extent also administrative. It is in their
interest to have precise knowledge of the NAVs of the funds they are invested in. All these factors
increase the likelihood that their reporting to the database vendors is accurate. Our result is similar to
that of (Liang 2003). In his comparison of identical funds in TASS and the US Offshore Fund
Directory, he found that among the funds of funds there were no return discrepancies at all.
The satisfactory quality of global macro fund data is positive. A possible explanation is that global
macro funds are active in liquid markets: currency and interest rate markets. For this reason, the
valuation of the assets is relatively straightforward for global macro funds, and this in turn should
induce a good data quality.
In contrast, we have not found a convincing explanation for the relatively bad data quality of
long-short funds. Since long-short funds trade in the equity markets, which are rather liquid, we
would have expected a better result for this category. It surprises us that the relative value and
emerging markets hedge funds, which are active on less liquid markets, outperform the long-short
strategy in terms of data quality. We did not gain any insight into the high score of the long-short
funds either by using the more granular strategy categorization by Barclay. In the next section, we
have a closer look at possible factors which affect data quality. But even accounting for these factors,
we have not uncovered the reasons for the poor score of long-short equity funds.
The unfavorable data quality of CTAs is caused by the many tiny funds of this group; see
Section 4.3.2 below.
For all strategies, either test T2 on the proportion of unique values or test T5 on the distribution of the
second digit after the decimal is rejected most often. We have verified that in one third of the cases
where T2 or T5 is rejected, rounding appears to be the main cause for rejection. The next most frequent
problem concerns the occurrence of zeros (test T1). Less frequent are recurring blocks (test T4). The
occurrence of runs (test T3) is least common in the return data.
Table 4
Data quality scores for Barclay funds
Score P(s=0) P(s=1) P(s=2) P(s=3) P(s=4) P(s=5)
CTA 0.45 75.19% 11.70% 7.79% 3.66% 1.62% 0.04%
Funds of Funds 0.23 85.60% 8.65% 3.61% 1.47% 0.63% 0.04%
Figure 1
Data quality score as a function of time series length
[Line plot; x-axis: time series length (0–200), y-axis: quality score (0–1); curves for CTA, Funds of Funds and Hedge Funds.]
4.3 Predictors of data quality
In the previous discussion, we did not make use of any covariates, which would possibly help to explain the score values. The goal of this section is to find the explanatory factors for data quality.
4.3.1 Time series length
Figure 1 displays the data quality score versus the length of the return time series.13 The plot shows nicely that the longer the time series, the higher the data quality score. This relationship is close to linear. It is straightforward to give an explanation: the longer the return time series, the higher the probability that errors have occurred during its recording.
13The curves have been estimated through a binning approach. In each category, bins containing approximately 200 funds with similar return time series lengths are constructed. For each bin one determines the average time series length together with the average score of all funds therein, and this gives one point in the xy-plane.
Table 6
Assets under management ($M) by time series length
33rd and 66th percentiles of AUM reported for each time series length category.
Length (yrs.) (1,3] (3,6] > 6
Percentile 33% 66% 33% 66% 33% 66%
CTA 0.9 5.0 1.5 9.3 5.9 35.6
Funds of Funds 13.8 52.6 26.1 85.3 33.9 127.4
Hedge Funds 11.3 47.5 17.6 72.7 31.1 105.5
Long/Short Equity 9.3 44.1 15.5 66.1 27.2 80.3
Relative Value 10.3 48.0 18.3 75.7 34.8 129.5
Event Driven 12.8 44.1 24.2 105.7 46.5 131.0
Emerging Markets 22.0 66.0 27.5 73.4 32.5 115.4
Global Macro 9.8 34.9 21.5 70.7 36.1 131.3
4.3.2 Assets under management (AUM)
In this section, we consider fund size, defined as the time average of the AUM series. In Table 6, we
consider the fund size in relation to time series length. From comparing the percentiles across the
three categories of time series lengths, we see that the longer the time series, the higher in general the
AUM. The CTA category contains the funds with the lowest AUM. As we alluded to earlier, it is striking
how many tiny CTA funds exist. We utilize the AUM percentiles from Table 6 to divide the funds into
small, medium and large size categories.
We present the quality scores in Table 7. First we remark that bucketing by time series length is
necessary in order to remove the strong effect of this factor on the data quality score. We would
expect funds with low AUM to exhibit a poorer data quality than large funds since they most likely
have fewer resources for accurate valuation and reporting.14 For CTAs, the quality improves (scores
decrease) with greater size in each of the time series length categories; our conjecture holds true. For
funds of funds or hedge funds, there is no such clear relationship. We have investigated whether
auditing could play a role for this result by further subdividing the groups into audited and
non-audited funds, but this has revealed that auditing or non-auditing, respectively, is not the cause.
14We mention that (Liang 2003) established such a relationship indirectly by giving the argument that fund size and
auditing are strongly related and by establishing a positive effect of auditing on the size of the return discrepancies.
Table 7
Data quality score by time series length and assets under management
Number of funds in each category in parentheses
Length (yrs.)    (1,3]                 (3,6]                 > 6
Fund Size        Small Medium Large    Small Medium Large    Small Medium Large
the argument is similar. It is likely that a fund no longer concentrates on accurate reporting shortly before it liquidates. Table 9 shows that this conjecture holds true for CTAs only. For funds of funds, hedge funds and all subcategories, there is no striking relationship between data quality score and fund status.
4.3.5 Concluding remarks concerning predictors of data quality
Summarizing, we have found that the time series length and whether a fund is audited or not are the most important predictors for the data quality score. For the other tested predictors, there are no conclusive results which would hold across all fund categories. AUM and fund status have some predictive power for the class of CTA funds. Small values of AUM have a clearly negative impact on the data quality of CTAs. Inactive CTAs have a generally lower data quality than active CTAs.
We have presented an exploratory analysis of the predictive power of certain factors for data quality. Additional factors could have been tested. An alternative approach would have been to fit some generalized linear model to the data quality scores. The advantage of a model-based analysis would be the straightforward and mechanical assessment of the significance of factors, basically by looking at p-values of estimated model parameters. The disadvantage is that we would have to trust a black box. Since the data quality score is a new concept and since we wanted to gain a certain intuition about it, we have preferred performing an exploratory data analysis, which consists of looking at tables and plots.
4.4 Is there an improvement of data quality over time?
This section is concerned with the evolution of data quality through time. To this end, we look at two equally long time periods: 1997–2001 and 2002–2006. For each time period, we consider those funds that reported returns during the full period. We calculate the data quality score for the time series restricted to the corresponding time period. Note that this leads to some simplification of the analysis by virtue of the fact that all funds have equally long time series consisting of 60 monthly returns. The division into small, medium and large funds is as discussed in the previous sections.
The results of Table 10 indicate that there is a clear improvement of the quality for CTAs. Indeed, for all fund size groups the score of the second period is lower than for the first period. For all other strategies, the picture is mixed. For funds of funds and hedge funds, the data quality stays more or less at the same level. Most striking is the considerable decrease of quality for large long-short funds. At this time we do not have an explanation for this result.
Table 10
Evolution of data quality score through time
Fund Size Small Medium Large
Time Period 1997-2001 2002-2006 1997-2001 2002-2006 1997-2001 2002-2006
Summarizing the results, it is all in all fair to say that the data quality in the Barclay database has
improved. Possible reasons for this improvement could be the general increase of transparency in the
hedge fund world during the past decade and the advances of information technology, which have
generally facilitated the collection and management of large amounts of data.
4.5 The data quality bias
We finally estimate the data quality bias, as announced earlier in this paper. The data quality bias
measures the impact of imperfect data on performance. We adapt the common definition of data
biases to the case of data quality. The data quality bias is defined as the average annual performance
difference of the entire universe of funds and the group of funds with data quality score equal to zero.
Following the practice of the hedge fund literature, performances of funds are averaged using equal
weights. A positive data quality bias indicates that the inclusion of funds with imperfect return data
leads to an overstatement of the performance, and vice versa. The data quality bias for a certain
strategy is obtained by restricting the universe to this strategy. Also recall that the survivorship bias is
the average annual performance difference of the living funds and the entire universe of funds.
The results are presented in Table 11. We mention that we have taken into account the non-USD
currencies when calculating the performances. The numbers indicate that data quality could be a
non-negligible source of performance misrepresentation.
Table 11
Data quality bias and survivorship bias 1997-2006 (annualized)
Data Quality Bias(%) Survivorship Bias(%)
CTA 0.16 1.75
Funds of Funds -0.14 0.07
Hedge Funds 0.48 0.67
Long/Short Equity 0.64 0.60
Relative Value -0.14 0.73
Event Driven 1.50 -0.17
Emerging Markets -0.01 0.84
Global Macro 0.24 1.93
Both the data quality and the survivorship bias are almost negligible for the funds of funds. For CTAs,
the data quality bias is small compared to the huge survivorship bias; note that the large survivorship
bias for CTAs is to a large extent due to their high attrition rate. Recall that the data quality of CTAs is
generally low; nevertheless the data quality bias is not outrageous. For hedge funds, the data quality
bias is of the same order of magnitude as the survivorship bias. Concluding this section, we would like to
stress the point that data biases are of course not additive.
4.6 Regularizing hedge fund return data
As a last piece of the analysis of the Barclay database, we would like to study the uses of the quality
score for “cleaning” hedge fund data. We appeal to the previously given overview on the hedge fund
return anomalies literature, where we cited the work of (Bollen and Pool 2007). These authors found
a significant discontinuity at zero for the pooled distribution of hedge fund returns reported to the
CISDM database. First we would like to verify whether a similar observation can be made when
using the Barclay return data. We apply a kernel density estimator to percentage returns, using a
Gaussian kernel with a bandwidth equal to 0.025. The results in Figure 2 are quite
illustrative. For the left-hand plot, the estimator is applied to all 8574 funds in the Barclay database,
whereas for the right-hand plot the 1738 funds with data quality issues are removed.
Note that the kernel density estimate for the raw Barclay return data in the left-hand plot is very
wiggly. This wiggliness is induced by those funds which have heavily rounded returns. The next
Figure 2
Kernel density estimates for pooled distribution of Barclay fund returns
All funds (left, “Barclay raw”) and funds with quality score of zero (right, “Barclay cleaned”)
[Two density plots; x-axis: return (%), from −2 to 2; y-axis: density, from 0 to 0.25.]
observation is the pronounced jump of the density at zero, which is in line with (Bollen and Pool 2007). Before we move to the right-hand plot, we stress that for both plots the kernel density estimates are based on the same bandwidth and evaluated at identical grid points. In the right-hand plot, the wiggliness almost disappears. We are not surprised by this, because the tests T1–T5 reject time series with heavily rounded returns. Also note that the density curve is still very steep at zero; however, the jump size appears to be slightly less pronounced. This example shows that removing funds with a nonzero data quality score can to some extent regularize hedge fund return data.
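A Gaussian kernel density estimator of the kind used for Figure 2 can be sketched as follows. This is a simplified illustration with our own function name, using the bandwidth of 0.025 quoted above; the actual data would be the pooled percentage returns of the Barclay funds.

```python
import math

def gaussian_kde(data, grid, bandwidth=0.025):
    """Gaussian kernel density estimate evaluated at each point of `grid`.

    Each observation contributes a normal bump of width `bandwidth`;
    the sum is normalized so the estimate integrates to one.
    """
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2)
                       for d in data)
            for x in grid]
```

Evaluating the estimate on a fine grid covering the data and summing grid-cell areas recovers a total mass close to one, which is a quick sanity check on the normalization.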
5 Conclusions
In this paper, we have provided a comprehensive discussion of quality issues in hedge fund return data. Hedge fund data quality is a topic which is often avoided, maybe because it is perceived as not particularly fruitful or just boring. The main goal of this paper was to increase the awareness of irregularities and patterns in hedge fund return data, to suggest methods for finding them, and to quantify their severity.
Using a simple, natural and mathematically sound rationale, we introduced tests and devised a novel
scoring method for quantifying the quality of return time series data. By means of an empirical study
using the Barclay database, we then demonstrated how such a score can be used for exploring
databases. Our findings conformed to a large extent with results from other articles. This can be seen
as a partial validation of the score approach.
While the score approach is appealing and can be applied in an almost mechanical fashion, it seems to
us that uncritically computing data quality scores could prove harmful. Most hedge fund databases
have grown organically, and every analysis must respect that there are legacy issues. It would be very
wrong to ascribe excessive importance to numerical score values without looking at the underlying
causes.
References
Agarwal, V., N. Daniel, and N. Naik (2006). Why is Santa Claus so kind to hedge funds? The
December return puzzle! Working paper, Georgia State University.
Bollen, N. and V. Pool (2006). Conditional return smoothing in the hedge funds industry.
Forthcoming, Journal of Financial and Quantitative Analysis.
Bollen, N. and V. Pool (2007). Do hedge fund managers misreport returns? Working paper,
Vanderbilt University.
Brown, S., W. Goetzmann, and J. Park (2001). Careers and survival: competition and risk in the
hedge fund and CTA industry. Journal of Finance 56(5), 1869–1886.
Christory, C., S. Daul, and J.-R. Giraud (2006). Quantification of hedge fund default risk. Journal
of Alternative Investments 9(2), 71–86.
Edhec-Risk Asset Management Research (2006). EDHEC investable hedge fund indices. Available
at http://www.edhec-risk.com.
Eling, M. (2007). Does hedge fund performance persist? Overview and new empirical evidence.
Working paper, University of St. Gallen.
Getmansky, M., A. Lo, and I. Makarov (2004). An econometric model of serial correlation and
illiquidity in hedge fund returns. Journal of Financial Economics 74, 529–609.
Grecu, A., B. Malkiel, and A. Saha (2007). Why do hedge funds stop reporting their performance?
Journal of Portfolio Management 34(1), 119–126.
Hasanhodzic, J. and A. Lo (2007). Can hedge-fund returns be replicated? The linear case. Journal
of Investment Management 5(2), 5–45.
Karr, A., A. Sanil, and D. Banks (2006). Data quality: a statistical perspective. Statistical
Capturing Risks of Non-transparent Hedge Funds

We present a model that captures risks of hedge funds using only their historical performance as input. This statistical model is a multivariate distribution where the marginals derive from an AR(1)/AGARCH(1,1) process with t5 innovations, and the dependency is a grouped-t copula. The process captures all relevant static and dynamic characteristics of hedge fund returns, while the copula enables us to go beyond linear correlation and capture strategy-specific tail dependency. We show how to estimate parameters and then successfully backtest our model and some peer models using 600+ hedge funds.
1 Introduction
Investors taking stakes in hedge funds usually do not get full transparency into the funds’ exposures.
Hence, in order to perform their monitoring function, investors would benefit from models based only
on hedge funds’ past performance.
The first type of models consists of linear factor decompositions. These are potentially very powerful,
but no clear results have emerged and intensive research is ongoing. We present here a less ambitious
but successful second approach based on statistical processes. We are able to accurately forecast the
risk taken by one or more hedge funds only using their past track record.
In this article we first describe the static and dynamic characteristics of hedge fund returns that we
wish to capture. Then we introduce an extension of the usual GARCH process and detail its
parametrization. This model encompasses other standard processes, enabling us to backtest all of the
models consistently. Finally we study the dependence of hedge funds, going beyond linear correlation
to introduce tail dependency.
∗The author would like to thank G. Zumbach for helpful discussion.
96 Capturing Risks of Non-transparent Hedge Funds
2 Characterizing hedge fund returns
We start by presenting descriptive statistics of hedge fund returns. To that end, we use the
information from the HFR database. This database consists of (primarily monthly) historical returns
for hedge funds. We assume that what is reported for each hedge fund is the monthly return at time t
(measured in months), defined as

r_t = \frac{NAV_t - NAV_{t-1}}{NAV_{t-1}},  (1)

where NAV_t is the net asset value of the hedge fund at time t. This return is considered net of all
hedge fund fees. We consider only the 680 hedge funds with more than 10 years of data (i.e. at least
120 observations). This will enable us to perform extensive out-of-sample backtesting afterwards.
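The return definition in Equation (1) is straightforward to compute. The sketch below uses an illustrative NAV series, not data from the HFR database.

```python
# Monthly return r_t = (NAV_t - NAV_{t-1}) / NAV_{t-1}, as in Equation (1).
def monthly_returns(nav):
    """Convert a net asset value series into simple monthly returns."""
    return [(nav[t] - nav[t - 1]) / nav[t - 1] for t in range(1, len(nav))]

# Illustrative NAV series (hypothetical values).
nav = [100.0, 102.0, 99.96, 104.0]
returns = monthly_returns(nav)  # first two entries: 0.02, -0.02
```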
We first analyze the shape of the distribution of the monthly returns. The classical tests for normality
are the Jarque-Bera and Lilliefors tests. At a 95% confidence level, both tests reject the normality
hypothesis for the vast majority of hedge funds: out of the 680 hedge funds, the Jarque-Bera test
rejects 598, and the Lilliefors test rejects 498.
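As a sketch of how such a test works, the Jarque-Bera statistic can be computed directly from sample skewness and kurtosis; the rejection threshold (5.99, the 95% quantile of a chi-square with 2 degrees of freedom) and the synthetic data below are illustrative, not the HFR sample.

```python
import numpy as np

def jarque_bera(x):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), asymptotically chi^2(2) under normality."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    s2 = (d ** 2).mean()
    S = (d ** 3).mean() / s2 ** 1.5   # sample skewness
    K = (d ** 4).mean() / s2 ** 2     # sample kurtosis
    return n / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)

# Reject normality at the 95% level when JB exceeds 5.99.
x = np.random.default_rng(0).standard_t(5, size=120)  # fat-tailed sample
rejected = jarque_bera(x) > 5.99
```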
A common assertion about hedge fund returns is that they are skewed. Judging by the sample
skewness, this is certainly what we would conclude. However, this quantity is sensitive to outliers
and is not a robust statistic. We opt instead for testing symmetry using the Wilcoxon signed rank sum

W = \sum_{i=1}^{N} \phi_i R_i,  (2)

where R_i is the rank of the absolute value of sample i, \phi_i is its sign and N is the number of
samples. This test rejects only 26 of the 680 hedge funds at a significance level of 95%. We conclude
that the bulk of the hedge funds do not display asymmetric returns, but that tail events and small
sample sizes produce high sample skewness.
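Equation (2) can be sketched as follows; ties in the absolute values are not handled, which the full test statistic would require.

```python
import numpy as np

def wilcoxon_w(x):
    """W = sum_i phi_i * R_i, Equation (2): R_i ranks |x_i|, phi_i is the sign
    of x_i. Assumes distinct absolute values (no tie correction)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(np.abs(x))) + 1  # ranks 1..N
    return float(np.sum(np.sign(x) * ranks))

# For symmetric data W fluctuates around zero; a large |W| signals asymmetry.
w = wilcoxon_w([0.01, -0.02, 0.03, -0.015])
```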
After describing the static behavior of hedge fund returns, we analyze their dynamics by calculating
various one-month lagged correlation coefficients. We consider the following correlation coefficients:
• Return-lagged return ρ(r_t, r_{t−1}),
• Volatility-lagged volatility ρ(σ_t, σ_{t−1}) and
• Volatility-lagged return ρ(σ_t, r_{t−1}).
If the time series have no dynamics (such as white noise), then the autocorrelation coefficients
should follow a normal distribution with variance 1/N. Hence a coefficient is significant at 95% if it falls
Figure 1
Distributions of one-month lagged correlation coefficients across hedge funds
Only significantly non-zero values reported.
[Three histograms of significant coefficients: Return − Lagged Return, Volat − Lagged Volat and Volat − Lagged Return, each over the range −1 to 1.]
outside [−2/\sqrt{N}, 2/\sqrt{N}]. Using the 680 hedge funds, we found 336 significant coefficients for the
return-lagged return correlation, 348 for the volatility-lagged volatility correlation and 192 for the
volatility-lagged return correlation.
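The three coefficients and the 2/√N significance band can be estimated as below. The use of absolute de-meaned returns as a monthly volatility proxy is our assumption, since the article does not spell out its volatility estimator.

```python
import numpy as np

def lagged_coefficients(r):
    """One-month lagged correlations of Section 2; |coef| > 2/sqrt(n) is
    significant at the 95% level. Volatility is proxied by |r_t - mean(r)|."""
    r = np.asarray(r, dtype=float)
    vol = np.abs(r - r.mean())
    corr = lambda a, b: float(np.corrcoef(a, b)[0, 1])
    n = len(r) - 1  # number of lagged pairs
    return {
        "ret_lagret": corr(r[1:], r[:-1]),
        "vol_lagvol": corr(vol[1:], vol[:-1]),
        "vol_lagret": corr(vol[1:], r[:-1]),
        "threshold": 2.0 / np.sqrt(n),
    }
```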
We summarize the significant coefficients in histograms in Figure 1. The first coefficient is the
correlation between the return and the lagged return. The coefficients are essentially positive,
meaning that some of one month’s return is transferred to the next month. This could have different
origins: valuation issues, trading strategies or even smoothing. The second coefficient is the
correlation between the volatility and the lagged volatility. These are essentially positive and imply
heteroscedasticity, or non-constant volatility. Finally, the third coefficient is the correlation between
the volatility and the lagged return. These coefficients are both positive and negative, suggesting that
hedge fund managers adapt their strategies (increasing or decreasing the risk taken) to upwards or
downwards markets.
We also examined the dependency between the three coefficients, but did not observe any structure.
We conclude that the three dynamic characteristics are distinct, and must be captured separately by
our model. To summarize, we have found that hedge fund returns are non-normal but not necessarily
skewed, and that they have at least three distinct dynamic properties.
3 The univariate model
3.1 The process
To describe the univariate process of hedge fund returns, we start with a GARCH(1,1) model to
capture heteroscedasticity. We then extend the model by introducing autocorrelation in the returns and
an asymmetric volatility response. The innovations are also generalized by using an asymmetric
t-distribution.
The process is

r_{t+1} = \bar{r} + \alpha (r_t - \bar{r}) + \sigma_t \varepsilon_t,  (3)

\sigma_t^2 = (\omega_\infty - \alpha^2)\,\sigma_\infty^2 + (1 - \omega_\infty)\,\tilde{\sigma}_t^2,  (4)

\tilde{\sigma}_t^2 = \mu\,\tilde{\sigma}_{t-1}^2 + (1 - \mu)\,[1 - \lambda\,\mathrm{sign}(r_t)]\,(r_t - \bar{r})^2.  (5)
The parameters of the model are thus

\bar{r}, \alpha, \omega_\infty, \sigma_\infty, \mu, \lambda,  (6)

and the distribution shape of the innovations \varepsilon_t.
When \bar{r}, α, λ = 0, the model reduces to the standard GARCH(1,1) model written in a different way
(Zumbach 2004). In this form, the GARCH(1,1) process appears with the elementary forecast σ_t
given by a convex combination of the long-term volatility σ_∞ and the historical volatility \tilde{σ}_t. The
historical volatility \tilde{σ}_t is measured by an exponential moving average (EMA) at the time horizon
τ = −1/log(µ). The parameter ω_∞ ranges from 0 to 1 and can be interpreted as the volatility of the
volatility.

The parameter α ∈ [−1,1] induces autocorrelation in the returns. The parameter λ ∈ [−1,1] yields
positive (negative) correlation between the lagged return and the volatility if λ is negative (positive).
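To make the dynamics concrete, here is a minimal simulation of Equations (3)–(5), using the universal parameter values estimated later (ω_∞ = 0.55, µ = 0.85) and, for simplicity, normal innovations in place of the t5 ones; all other numeric inputs are illustrative.

```python
import numpy as np

def simulate_agarch(n, rbar=0.005, alpha=0.2, omega_inf=0.55, sigma_inf=0.03,
                    mu=0.85, lam=0.1, seed=0):
    """Simulate the AR(1)/AGARCH process of Equations (3)-(5).
    Normal innovations stand in for the t5 ones used in the article."""
    rng = np.random.default_rng(seed)
    r = np.empty(n + 1)
    r[0] = rbar
    s2_tilde = sigma_inf ** 2               # EMA (historical) variance, Eq. (5)
    for t in range(n):
        # Effective variance: combination of long-term and EMA variance, Eq. (4)
        s2 = (omega_inf - alpha ** 2) * sigma_inf ** 2 + (1 - omega_inf) * s2_tilde
        r[t + 1] = rbar + alpha * (r[t] - rbar) + np.sqrt(s2) * rng.standard_normal()
        # Asymmetric EMA update, Eq. (5): lam != 0 makes the volatility react
        # differently to positive and negative returns.
        s2_tilde = (mu * s2_tilde
                    + (1 - mu) * (1 - lam * np.sign(r[t + 1])) * (r[t + 1] - rbar) ** 2)
    return r[1:]
```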
The innovations ε_t are i.i.d. random variables with E[ε_t] = 0 and E[ε_t^2] = 1. We choose an asymmetric
generalization of the t-distribution introduced by Hansen (Hansen 1994). The parameters are λ′ and ν,
and the density is given by

g_{\lambda',\nu}(\varepsilon) =
\begin{cases}
bc \left[ 1 + \frac{1}{\nu-2} \left( \frac{b\varepsilon + a}{1 - \lambda'} \right)^2 \right]^{-(\nu+1)/2}, & \varepsilon \le -a/b, \\
bc \left[ 1 + \frac{1}{\nu-2} \left( \frac{b\varepsilon + a}{1 + \lambda'} \right)^2 \right]^{-(\nu+1)/2}, & \varepsilon > -a/b,
\end{cases}  (7)
with

a = 4\lambda' c\,\frac{\nu-2}{\nu-1},  (8)

b^2 = 1 + 3\lambda'^2 - a^2,  (9)

c = \frac{\Gamma((\nu+1)/2)}{\sqrt{\pi(\nu-2)}\,\Gamma(\nu/2)}.  (10)
For λ′ = 0 this distribution reduces to the usual t-distribution with ν degrees of freedom; for λ′ > 0 it
is right skewed and for λ′ < 0 it is negatively skewed.
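A direct transcription of Equations (7)–(10) using only the standard library; treat it as a sketch of the density, not a fitted model.

```python
import math

def hansen_skewt_pdf(eps, lam_p, nu):
    """Hansen's skewed t density, Equations (7)-(10); lam_p in (-1, 1), nu > 2.
    For lam_p = 0 it reduces to the standardized t density with nu dof."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * (nu - 2)) * math.gamma(nu / 2))
    a = 4 * lam_p * c * (nu - 2) / (nu - 1)
    b = math.sqrt(1 + 3 * lam_p ** 2 - a ** 2)
    scale = 1 - lam_p if eps <= -a / b else 1 + lam_p
    return b * c * (1 + ((b * eps + a) / scale) ** 2 / (nu - 2)) ** (-(nu + 1) / 2)
```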
3.2 Parametrization
The choice of our parametrization enables us to separate the different parameter estimations. First, we
set

\alpha = \rho(r_t, r_{t-1}).  (11)
For λ = 0, this implies

E[(r_t - \bar{r})^2] = \sigma_\infty^2,  (12)

justifying the estimation of σ_∞ by the sample standard deviation. The expected return \bar{r} is set to the
historical average return.
In order to reduce overfitting, we set some parameters to fixed values. We make the hypothesis that
the volatility dynamics and the tails of the innovations are universal, implying fixed values for ω_∞, µ
and ν.
We obtain ω_∞ and µ by analyzing the correlation function for the pure GARCH case (that is, λ = 0
and α = 0). We rewrite the process, introducing

\beta_0 = \sigma_\infty^2 (1-\mu)\,\omega_\infty,  (13)

\beta_1 = (1-\omega_\infty)(1-\mu),  (14)

\beta_2 = \mu.  (15)

Assuming an average return \bar{r} = 0, the process becomes

r_{t+1} = \sigma_t \varepsilon_t,  (16)

\sigma_t^2 = \beta_0 + \beta_1 r_t^2 + \beta_2 \sigma_{t-1}^2.  (17)
Figure 2
Tail distribution of the innovations
[Log–log plot of the tail cdf of the realized innovations, compared with t5 and normal distributions.]
The autocorrelation function for r_t^2 decays geometrically1

\rho_k = \rho(r_t^2, r_{t-k}^2) = \rho_1 (\beta_1 + \beta_2)^{k-1},  (18)

with

\rho_1 = \beta_1 + \frac{\beta_1^2 \beta_2}{1 - 2\beta_1\beta_2 - \beta_2^2}.  (19)
We evaluate the sample autocorrelation function

\rho_k = \rho(r_t^2, r_{t-k}^2),  (20)

across the 680 hedge funds. We then fit the cross-sectional average ρ_k to the geometric decay (18).
Finally, we transform the estimated parameters back to our parametrization, yielding ω_∞ = 0.55 and
µ = 0.85.
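The fit of the average sample ACF to the decay (18) amounts to a linear regression of log ρ_k on k; the sketch below takes an already-computed ACF as input, standing in for the cross-sectional HFR average.

```python
import numpy as np

def fit_geometric_acf(x2_acf):
    """Fit rho_k = rho_1 * g**(k-1), Equation (18), to a sample ACF of squared
    returns by regressing log(rho_k) on k; returns (rho_1, g = beta_1 + beta_2)."""
    acf = np.asarray(x2_acf, dtype=float)
    k = np.arange(1, len(acf) + 1)
    mask = acf > 0                       # the log fit needs positive values
    slope, intercept = np.polyfit(k[mask], np.log(acf[mask]), 1)
    g = float(np.exp(slope))             # estimate of beta_1 + beta_2
    rho1 = float(np.exp(intercept + slope))  # fitted value at k = 1
    return rho1, g
```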
To estimate the tail parameter ν of the innovations, we compute the realized innovations, setting
λ = 0, ω_∞ = 0.55 and µ = 0.85. Since we hypothesize that the innovation distribution is universal, we
aggregate all realized innovations and plot the tail distribution. We see in Figure 2 that a value of
ν = 5 is the optimal choice.
Finally, the remaining parametersλ andλ′ are obtained for each hedge fund using maximum
likelihood estimation. Table 1 recapitulates all parameters and their estimation.
1See (Ding and Granger 1996).
Table 1
Parameter estimation

Parameter  Effect captured           Scope       Value / estimator
\bar{r}    expected return           individual  mean(r)
α          autocorrelation           individual  ρ(r_t, r_{t−1})
ω_∞        volatility of volatility  universal   0.55
σ_∞        long-term volatility      individual  std(r)
µ          EMA decay factor          universal   0.85
λ          dynamic asymmetry         individual  MLE
ν          innovation tails          universal   5
λ′         innovation asymmetry      individual  MLE
4 Backtest
We follow the framework set in (Zumbach 2007). The process introduced in Section 3.1 yields a
forecast at time t for the next month’s return

\hat{r}_{t+1} = \bar{r} + \alpha (r_t - \bar{r}),  (21)

and a forecast of the volatility σ_t. At time t+1, we know the realized return r_{t+1} and can evaluate the
realized residual

\varepsilon_t = \frac{r_{t+1} - \hat{r}_{t+1}}{\sigma_t}.  (22)
Next, we calculate the probtile

z_t = t_5(\varepsilon_t),  (23)

where t_5(x) is the cumulative distribution function of the innovations. These probtiles should be
uniformly distributed through time and across hedge funds. To quantify the quality of our model, we
calculate the relative exceedance

\delta(z) = \mathrm{cdf}_{\mathrm{emp}}(z) - z,  (24)

where cdf_emp is the empirical distribution function of the probtiles, and introduce the distance

d = \int_0^1 |\delta(z)|\,dz.  (25)
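The distance (25) can be approximated on a grid; the probtile samples below are illustrative.

```python
import numpy as np

def distance_d(probtiles, n_grid=1001):
    """Approximate d = integral over [0,1] of |cdf_emp(z) - z|, Eqs. (24)-(25)."""
    z = np.sort(np.asarray(probtiles, dtype=float))
    grid = np.linspace(0.0, 1.0, n_grid)
    cdf_emp = np.searchsorted(z, grid, side="right") / len(z)
    # Mean of |delta| on a uniform grid over [0, 1] approximates the integral.
    return float(np.mean(np.abs(cdf_emp - grid)))

# Perfectly uniform probtiles give d near 0; a point mass at 0.5 gives d near 0.25.
```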
We have calculated this distance across all times and all hedge funds, and report the results in Figure
3. The first result (labeled “AR(0) normal”) is the usual normal distribution with no dynamics. Then
Figure 3
Average distance d across all hedge funds
[Bar chart of d, between 0 and 0.08, for the models AR(1) AGARCH asym. t, AR(1) AGARCH t, AR(1) GARCH t, AR(1) GARCH normal, AR(1) normal and AR(0) normal.]
from top to bottom, we add successively autocorrelation, heteroscedasticity with normal innovations, heteroscedasticity with t5 innovations, an asymmetric response in the dynamic volatility and finally asymmetry in the innovations. We see that compared to the usual static normal distribution, the best model reduces the distance between the realized residuals and the modeled residuals by more than a factor of three. The two major improvements come from introducing heteroscedasticity and fat tails in the innovations. The last step (adding innovation asymmetry) does not improve the results, as we might have suspected from the earlier Wilcoxon tests, and further induces over-fitting.
5 The multivariate extension
Let us now consider the case of N hedge funds simultaneously, as in a fund of hedge funds. We have seen in Section 4 that the appropriate univariate model is the AR(1) plus AGARCH plus t5-distributed innovations. We now consider multivariate innovations where the marginals are t5-distributions and the dependency is modeled by a copula.
This structure enables us to capture tail dependency, which is not possible with linear correlation alone but is present within hedge funds. Figure 4 presents an illustrative extreme example of two hedge fund return time series. We see that most of the time the two hedge funds behave differently, while in one month they both experience tail events. These joint tail events are a display of tail dependency.
Figure 4
Tail dependency in hedge fund returns
[Two monthly return series from 1985 to 2000, ranging from −0.5 to 0.2, with one month in which both funds experience a tail event.]
The multivariate distribution of the N innovations is given by

F(\varepsilon^1, \ldots, \varepsilon^N) = U\big(t_5(\varepsilon^1), \ldots, t_5(\varepsilon^N)\big),  (26)

where U(u_1, \ldots, u_N)—a multivariate uniform distribution—is the copula to estimate.
5.1 Copula and tail dependency
Consider two random variables X and Y with marginal distributions F_X and F_Y. The upper tail
dependency is

\lambda_u = \lim_{q \to 1} P\left[ X > F_X^{-1}(q) \mid Y > F_Y^{-1}(q) \right],  (27)

and analogously the lower tail dependency is

\lambda_\ell = \lim_{q \to 0} P\left[ X \le F_X^{-1}(q) \mid Y \le F_Y^{-1}(q) \right].  (28)
These coefficients do not depend on the marginal distributions of X and Y, but only on their copula.
See (Nelsen 1999) for more details.
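The nonparametric estimates behind Figure 5 amount to conditional tail counts at a fixed quantile q, with the limits (27) and (28) read off by extrapolation. A sketch:

```python
import numpy as np

def tail_coefficients(u, v, q):
    """Empirical conditional tail probabilities of Equations (27)-(28) at a
    fixed quantile q: pass q close to 1 for the upper tail, close to 0 for
    the lower tail. Inputs are probtile (uniform-margin) samples."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    up, lo = v > q, v <= q
    upper = float(np.mean(u[up] > q)) if up.any() else float("nan")
    lower = float(np.mean(u[lo] <= q)) if lo.any() else float("nan")
    return lower, upper
```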
The probtiles (defined in Section 4) of the N hedge funds observed at time t,

(z_t^1, \ldots, z_t^N) = U_t^N,  (29)
Figure 5
Upper and lower tail dependency as a function ofq, fixed income arbitrage hedge funds
[Tail dependency estimates for q between 0 and 0.05; values range from −0.02 to 0.14.]
constitute a realization of the copula U_t^N. Since our univariate model has extracted all of the process
dynamics, we are free to reorder and aggregate our observations across time. We make the further
assumption that within a strategy, all pairs of hedge funds have the same dependence structure. Thus,
we may interpret each observation of innovations for a pair of hedge funds in a strategy at a given
time as a realization from a universal (for the strategy) bivariate copula. So from M historical periods
on N hedge funds, we extract MN(N−1)/2 realizations.
From this bivariate sample we can infer the upper and lower tail dependency nonparametrically using
Equations (27) and (28). We calculate the coefficient for fixed values of q as shown in Figure 5 and
extrapolate the value for the limiting case.
We can also obtain a parametric estimation of the tail dependency by fitting the realized copula
between two hedge funds to a t-copula. The parameters of such a copula are the correlation matrix ρ
(which we estimate using Kendall’s τ) and the degrees of freedom ν_cop (which we estimate using
maximum likelihood). The tail dependency is symmetric and is obtained by2

\lambda_\ell = \lambda_u = 2 - 2\, t_{\nu_{cop}+1}\left( \sqrt{\nu_{cop}+1}\, \frac{\sqrt{1-\rho_{12}}}{\sqrt{1+\rho_{12}}} \right),  (30)

where ρ_{12} is the correlation coefficient between the two hedge funds.
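Equation (30) is easy to evaluate numerically; the sketch below relies on scipy.stats.t for the cdf t_{ν+1}.

```python
from scipy.stats import t

def t_copula_tail_dep(rho12, nu_cop):
    """lambda_l = lambda_u = 2 - 2 * t_{nu+1}(sqrt(nu+1) * sqrt(1-rho)/sqrt(1+rho)),
    Equation (30)."""
    arg = ((nu_cop + 1.0) * (1.0 - rho12) / (1.0 + rho12)) ** 0.5
    return float(2.0 - 2.0 * t.cdf(arg, df=nu_cop + 1))

# Higher correlation or fewer degrees of freedom -> stronger tail dependency.
```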
2See (Embrechts, McNeil, and Straumann 2002).
Table 2
Estimated tail dependency coefficients

Strategy                  N    Emp. lower  Emp. upper  λ ± σ_λ      ν_cop
Convertible Arbitrage     16   0.2         0.1         0.18 ± 0.09  6
Distressed Securities     18   0.06        0.05        0.05 ± 0.09  10
Emerging Markets          29   0           0           0.07 ± 0.06  8
Equity Hedge              103  0.05        0           0.04 ± 0.05  10
Equity Market Neutral     16   0.04        0.04        0.02 ± 0.03  9
Equity Non-Hedge          32   0.1         0           0.17 ± 0.06  5
Event-Driven              38   0.17        0           0.11 ± 0.08  7
Fixed Income              28   0.09        0           0.03 ± 0.07  9
Foreign Exchange          14   0           0.1         0.03 ± 0.09  10
Macro                     27   0           0.05        0.03 ± 0.08  10
Managed Futures           58   0           0.07        0.05 ± 0.07  9
Merger Arbitrage          10   0           0.15        0.20 ± 0.17  5
Relative Value Arbitrage  20   0.1         0           0.04 ± 0.09  10
Short Selling             7    0           0           0.50 ± 0.22  3
Table 2 shows the results for all strategies. We report the nonparametric lower and upper coefficients,
as well as the results of the parametric estimation. Since in the parametric case the coefficient
depends on the correlation between each pair of hedge funds, we report the average and its standard
deviation across fund pairs. We also show the estimated degrees of freedom of the copula, ν_cop. We
see that the two estimates of tail dependence are consistent.
5.2 The multivariate model
To capture the different tail dependencies within each strategy we use a generalization of the t-copula,
namely the grouped-t copula (Daul, DeGiorgi, Lindskog, and McNeil 2003). We first partition the N
hedge funds into m groups (strategies) labeled k, with dimension s_k and parameter ν_k.
Then let Z be a random vector following a multivariate normal distribution of dimension N with linear
correlation matrix ρ, and let G_ν be the distribution function of

\sqrt{\nu / \chi^2_\nu},  (31)

where χ_ν^2 follows a chi-square distribution with ν degrees of freedom. Introducing U, a uniformly
distributed random variable independent of Z, we define

R_k = G_{\nu_k}^{-1}(U),  (32)
and

Y = \left( R_1 Z_1, \ldots, R_1 Z_{s_1}, R_2 Z_{s_1+1}, \ldots, R_2 Z_{s_2}, \ldots \right).  (33)
As a result, for instance, the group of random variables (Y_1, \ldots, Y_{s_1}) has an s_1-dimensional multivariate
t-distribution with ν_1 degrees of freedom.

Finally, using the univariate distribution function of the innovations, we get a random vector of
innovations

\left[ t_5^{-1}(t_{\nu_1}(Y_1)), \ldots, t_5^{-1}(t_{\nu_m}(Y_N)) \right],  (34)

following a meta grouped-t distribution with linear correlation matrix ρ and with different tail
dependency in each group (strategy). The tail dependencies are captured by the ν_k's and differ
in general from the ν = 5 degrees of freedom of the innovations.
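The construction (31)–(33) can be sampled directly; the two-group setup in the test below (correlation matrix, group sizes and ν_k values) is illustrative, and scipy supplies the chi-square quantile function.

```python
import numpy as np
from scipy.stats import chi2

def sample_grouped_t(corr, group_sizes, group_nus, n_samples, seed=0):
    """Draw from the grouped-t construction, Equations (31)-(33): a common
    uniform U drives one mixing variable R_k = G_{nu_k}^{-1}(U) per group,
    where G_nu is the cdf of sqrt(nu / chi2_nu)."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    n_dim = chol.shape[0]
    out = np.empty((n_samples, n_dim))
    for i in range(n_samples):
        z = chol @ rng.standard_normal(n_dim)     # correlated normals Z
        u = rng.uniform()                         # shared uniform U
        start = 0
        for size, nu in zip(group_sizes, group_nus):
            # G^{-1}(u) = sqrt(nu / F_chi2^{-1}(1 - u)), Equations (31)-(32)
            r_k = np.sqrt(nu / chi2.ppf(1.0 - u, df=nu))
            out[i, start:start + size] = r_k * z[start:start + size]  # Eq. (33)
            start += size
    return out
```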
6 Conclusion
We have presented a model that captures all the static, dynamic and dependency characteristics of
hedge fund returns. Individual hedge fund returns are non-normally distributed, and show
autocorrelation and heteroscedasticity. Their volatility adapts when the hedge fund manager under- or
outperforms. Concerning multiple hedge funds, we have looked at joint events and noticed that tail
dependency is present.
Our model consists of a univariate process and a copula structure on the innovations of that process.
The univariate process is an asymmetric generalization of a GARCH(1,1) process, while the
dependency is captured by a grouped-t copula with different tail dependency for each strategy. This
model shows compelling out-of-sample backtesting results.
This approach can be applied to any hedge fund and in particular to non-transparent ones. Using only
hedge fund historical performance we may forecast the risk of a portfolio of hedge funds. A
straightforward model extension permits analysis of portfolios of hedge funds mixed with other asset
classes.
References
Daul, S., E. DeGiorgi, F. Lindskog, and A. McNeil (2003). The grouped t-copula with an
application to credit risk. Risk 16, 73–76.
Ding, Z. and C. W. J. Granger (1996). Modeling volatility persistence of speculative returns: A
new approach. Journal of Econometrics 73, 185–215.
Embrechts, P., A. McNeil, and D. Straumann (2002). Correlation and dependence in risk
management: Properties and pitfalls. In M. Dempster (Ed.), Risk Management: Value-at-Risk
and Beyond, pp. 176–223. Cambridge University Press.
Hansen, B. E. (1994). Autoregressive conditional density estimation. International Economic
Review 35(3), 705–730.
Nelsen, R. (1999). An Introduction to Copulas. Springer, New York.
Zumbach, G. (2004). Volatility processes and volatility forecast with long memory. Quantitative
Finance 4, 70–86.
Zumbach, G. (2007). Backtesting risk methodologies from one day to one year. RiskMetrics
Journal 7(1), 17–60.