Department of Economics and Business
Aarhus University
Fuglesangs Allé 4
DK-8210 Aarhus V
Denmark
Email: [email protected]
Tel: +45 8716 5515
On the identification of fractionally cointegrated VAR
models with the F(d) condition
Federico Carlini and Paolo Santucci de Magistris
CREATES Research Paper 2013-44
Federico Carlini∗ Paolo Santucci de Magistris∗
December 11, 2013
Abstract
This paper discusses identification problems in the fractionally
cointegrated
system of Johansen (2008) and Johansen and Nielsen (2012). The
identification
problem arises when the lag structure is over-specified, such
that there exist several
equivalent re-parametrizations of the model associated with
different fractional
integration and cointegration parameters. The properties of
these multiple non-
identified sub-models are studied and a necessary and sufficient
condition for the
identification of the fractional parameters of the system is
provided. The condition
is named F(d). The assessment of the F(d) condition in the
empirical analysis
is relevant for the determination of the fractional parameters
as well as the lag
structure.
Keywords: Fractional Cointegration; Cofractional Models;
Identification; Lag
Selection.
JEL Classification: C19, C32
∗The authors acknowledge support from CREATES - Center for
Research in Econometric Analysis of Time Series (DNRF78), funded by
the Danish National Research Foundation. The authors are grateful to
Søren Johansen, Morten Nielsen, Niels Haldrup and Rocco Mosconi for
insightful comments on this work. The authors would like to thank
the participants at the Third Long Memory Symposium
(Aarhus 2013).
1 Introduction
The last decade has witnessed an increasing interest in the
statistical definition and
evaluation of the concept of fractional cointegration, as a
generalization of the idea of
cointegration to processes with fractional degrees of
integration. In the context of long-
memory processes, fractional cointegration allows linear
combinations of I(d) processes
to be I(d − b), with d ∈ R+ and 0 < b ≤ d. More specifically,
the concept of fractional
cointegration implies the existence of one, or more, common
stochastic trends, integrated
of order d, with short-period departures from the long-run
equilibrium integrated of
order d− b. The coefficient b is the degree of fractional
reduction obtained by the linear
combination of I(d) variables, namely the cointegration gap.
Interestingly, the seminal paper by Engle and Granger (1987)
already introduced the
idea of common trends between I(d) processes, but the subsequent
theoretical works, see
among many others Johansen (1988), have mostly been dedicated to
cases with integer
orders of integration. Notable methodological works in the field
of fractional cointegration
are Robinson and Marinucci (2003) and Christensen and Nielsen
(2006), which develop
regression-based semi-parametric methods to evaluate whether two
fractional stochastic
processes share common trends. More recently, Nielsen and
Shimotsu (2007) provide a
testing procedure to evaluate the cointegration rank of the
multivariate coherence matrix
of two, or more, fractionally differenced series. Despite the
effort spent in defining testing
procedures for the presence of fractional cointegration, the
literature in this area lacked
a coherent multivariate model explicitly characterizing the
joint behaviour of fractionally
cointegrated processes. Only recently, Johansen (2008) and
Johansen and Nielsen (2012)
have proposed the FCVARd,b model, an extension of the well-known
VECM to fractional
processes, which represents a tool for a direct modeling and
testing of fractional cointe-
gration. Johansen (2008) and Johansen and Nielsen (2012) study
the properties of the
model and provide a method to obtain consistent estimates when
the lag structure of the
model is correctly specified.
The present paper shows that the FCVARd,b model is not globally
identified, i.e. for
a given number of lags, k, there may exist several sub-models
with the same conditional
densities but different values of the parameters, which
therefore cannot be identified. The
multiplicity of non-identified sub-models can be characterized
for any FCVARd,b model
with k lags. An analogous identification problem, for the FIVARb
model, induced by the
generalized lag operator is discussed in Tschernig et al.
(2013a,b).
A solution for this identification problem is provided in this
paper. It is proved that
the I(1) condition in the VECM of Johansen (1988) can be
generalized to the fractional
context. This condition is named F(d), and it is a necessary
and sufficient condition for
the identification of the system. This condition can be used to
correctly identify the lag
structure of the model and to consistently estimate the
parameter vector.
The consequence of the lack of identification of the FCVARd,b is
investigated from
a statistical point of view. Indeed, as a consequence of the
identification problem, the
expected likelihood function is maximized at
several parameter vectors
when the lag order is not correctly specified. Hence, the
fractional and co-fractional
parameters cannot be uniquely estimated if the true lag
structure is not correctly deter-
mined. Therefore, a lag selection procedure, integrating the
likelihood ratio test with an
evaluation of the F(d) condition, is proposed and tested. A
simulation study shows that
the proposed method provides the correct lag specification in
most cases.
Finally, a further identification issue is discussed. It is
proved that there is a potentially
large number of parameter sets associated with different
choices of lag length and
cointegration rank for which the conditional density of the
FCVARd,b model is the same.
This problem has practical consequences when testing for the
nullity of the cointegration
rank and the true lag length is unknown. For example, under
certain restrictions on the
sets of parameters, the FCVARd,b with full rank and k lags is
equivalent to the FCVARd,b
with rank 0 and k+1 lags. It is shown that the evaluation of the
F(d) condition provides
a solution to this identification problem and works in most
cases.
This paper is organized as follows. Section 2 discusses the
identification problem
from a theoretical point of view. Section 3 discusses the
consequences of the lack of
identification on the inference on the parameters of the
FCVARd,b model. Section 4
presents the method to optimally select the number of lags and
provides evidence, based
on simulation, on the performance of the method in finite
samples. Section 5 discusses
the problems that arise when the cointegration rank and the lag length are
both unknown. Section
6 concludes the paper.
2 The Identification Problem
This Section provides a discussion of the identification problem
related to the FCVARd,b
model
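$$H_k:\quad \Delta^{d}X_t = \alpha\beta'\Delta^{d-b}L_bX_t + \sum_{i=1}^{k}\Gamma_i\Delta^{d}L_b^{i}X_t + \varepsilon_t, \qquad \varepsilon_t \sim \text{iid } N(0,\Omega) \tag{1}$$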
where Xt is a p-dimensional vector, α and β are p × r matrices,
where r defines the
cointegration rank.1 Ω is the positive definite covariance
matrix of the errors, and Γj, for
j = 1, . . . , k, are p × p matrices loading the short-run
dynamics. The operator $L_b := 1 - \Delta^{b}$
is the so-called fractional lag operator, which, as noted by
Johansen (2008), is necessary
for characterizing the solutions of the system. Hk defines the
model with k lags and
θ = vec(d, b, α, β,Γ1, ...,Γk,Ω) is the parameter vector.
Similarly to Johansen (2010), the
concept of identification and equivalence between two models is
formally introduced by
the following definition.
Definition 2.1 Let {Pθ, θ ∈ Θ} be a family of probability
measures, that is, a statistical
model. We say that a parameter function g(θ) is identified if
g(θ1) ≠ g(θ2) implies that
Pθ1 ≠ Pθ2. On the other hand, if Pθ1 = Pθ2 and g(θ1) ≠ g(θ2),
the two models are
equivalent or not identified.
As noted by Johansen and Nielsen (2012), the parameters of the
FCVARd,b model
in (1) are not identified, i.e. there exist several equivalent
sub-models associated with
different values of the parameter vector, θ.
An illustration of the identification problem is provided by the
following example.
Consider the FCVARd,d model with one lag2
$$H_1:\quad \Delta^{d}X_t = \alpha\beta' L_dX_t + \Gamma_1\Delta^{d}L_dX_t + \varepsilon_t$$
1The results of this Section are obtained under the maintained
assumption that the cointegration rank is known and such that 0 <
r < p. An extension to the case of unknown rank and number of lags
is presented in Section 5.
2To simplify the exposition, we consider the case FCVARd,b with
d = b.
where d > 0. Consider the following two restrictions, leading
to the sub-models:
H1,0 : H1 under the constraint Γ1 = 0 (2)
H1,1 : H1 under the constraint Γ1 = Ip + αβ′ (3)
Interestingly, these two sets of restrictions lead to equivalent
sub-models with different
parameter vectors. The sub-model H1,0 can be formulated as:
$$\Delta^{d_1}X_t = \alpha\beta' L_{d_1}X_t + \varepsilon_t \tag{4}$$
where d1 is the fractional parameter under H1,0, and the
restriction Γ1 = 0 corresponds
to a FCVARd,d model with no lags, H0. After a simple
manipulation, the sub-model H1,1
can be written as
$$\Delta^{2d_2}X_t = \alpha\beta' L_{2d_2}X_t + \varepsilon_t \tag{5}$$
where d2 is the fractional parameter under H1,1. From (5) it
emerges that also H1,1 is
equivalent to H0. This means that the two sub-models, H1,0 and
H1,1, are equivalent,
with d1 = 2d2. The fractional order of the system is the same in
both cases, i.e. F(d1) =
F(2d2). Hence, under H1,0 the process Xt has the same fractional
order as under H1,1,
but the latter is represented by an integer multiple of the
parameter d2. In the example
above, the identification condition is clearly violated, as the
conditional densities of H1,0
and H1,1 are
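$$p(X_1,\ldots,X_T,\theta_1 \mid X_0,X_{-1},\ldots) = p(X_1,\ldots,X_T,\theta_2 \mid X_0,X_{-1},\ldots) \tag{6}$$
where $\theta_1 = \mathrm{vec}(d_1,\alpha,\beta,\Gamma_1^{(1)},\Omega)$ and
$\theta_2 = \mathrm{vec}(\tfrac{1}{2}d_1,\alpha,\beta,\Gamma_1^{(2)},\Omega)$, with
$\Gamma_1^{(1)} = 0$ and $\Gamma_1^{(2)} = I_p + \alpha\beta'$.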
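The "simple manipulation" that leads from H1,1 to (5) can be spelled out. The following is a sketch that uses only the definition $L_d = 1 - \Delta^{d}$ and the restriction Γ1 = Ip + αβ′, with d written in place of d2 to lighten the notation:

```latex
\begin{align*}
\Delta^{d}X_t &= \alpha\beta'(1-\Delta^{d})X_t
                + (I_p+\alpha\beta')\,\Delta^{d}(1-\Delta^{d})X_t + \varepsilon_t \\
\varepsilon_t &= \bigl[I_p + \alpha\beta' - (I_p+\alpha\beta')\bigr]\Delta^{d}X_t
                - \alpha\beta' X_t + (I_p+\alpha\beta')\,\Delta^{2d}X_t \\
              &= (I_p+\alpha\beta')\,\Delta^{2d}X_t - \alpha\beta' X_t \\
\Longrightarrow\quad
\Delta^{2d}X_t &= \alpha\beta'(1-\Delta^{2d})X_t + \varepsilon_t
                = \alpha\beta' L_{2d}X_t + \varepsilon_t .
\end{align*}
```

The terms in $\Delta^{d}$ cancel exactly, so only the powers 0 and 2d remain. This is why H1,1 with parameter d2 coincides with H0 evaluated at 2d2.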
The identification problem outlined above has practical
consequences when the true model has k0 lags
but a FCVARd,b model with k lags is considered.
Suppose that the Hk0
model is
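$$H_{k_0}:\quad \Delta^{d_0}X_t = \alpha_0\beta_0'\Delta^{d_0-b_0}L_{b_0}X_t + \sum_{i=1}^{k_0}\Gamma_i^{0}\Delta^{d_0}L_{b_0}^{i}X_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0,\Omega_0) \tag{7}$$
with k0 lags and $|\alpha_{0,\perp}'\Gamma^{0}\beta_{0,\perp}| \neq 0$, where
$\Gamma^{0} = I_p - \sum_{i=1}^{k_0}\Gamma_i^{0}$. When a model Hk with k > k0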
is considered for Xt, then Hk0 corresponds to the set of
restrictions Γk0+1 = Γk0+2 = ... =
Γk = 0 imposed on Hk. As shown in the example above, there are
several equivalent
sub-models to that under the restriction Γk0+1 = Γk0+2 = ... =
Γk = 0.
Therefore, the aim of this Section is to study the number and
the nature of these
equivalent sub-models, in order to provide a necessary and
sufficient condition to identify
the fractional parameters d0 and b0 as the parameters d and b of
the model Hk.
The following Proposition states the necessary and sufficient
condition, called the
F(d) condition, for identification of the parameters of the
model Hk0.
Proposition 2.2 For any k0 ≥ 0 and k ≥ k0, the F(d) condition,
$|\alpha_\perp'\Gamma\beta_\perp| \neq 0$, where
$\Gamma = I_p - \sum_{i=1}^{k}\Gamma_i$, is a necessary and sufficient condition for the
identification of the set
of parameters of Hk0 in equation (7).
Corollary 2.3
i) Given k0 and k, with k ≥ k0, the number of equivalent
sub-models that can be obtained
from Hk is $m = \lfloor (k+1)/(k_0+1) \rfloor$, where ⌊x⌋ denotes the greatest integer less than or equal to
x.
ii) For any k ≥ k0, all the equivalent sub-models are found for
parameter values $d_j = d_0 - \frac{j}{j+1}b_0$ and $b_j = b_0/(j+1)$ for j = 0, 1, ..., m − 1.
Proposition 2.2 has important consequences. First, the
condition |α′⊥Γβ⊥| ≠ 0
holds only for the sub-model for which d = d0 and b = b0, i.e.
for the sub-model
corresponding to the restrictions Γk0+1 = Γk0+2 = ... = Γk = 0.
In the example above,
the F(d) condition is verified only for H1,0, while |α′⊥Γβ⊥| = 0 for H1,1, since
Γ = Ip − (Ip + αβ′) = −αβ′ and α′⊥α = 0. Second, for k ≫ k0, the (m − 1)-th
sub-model is such that
dm−1 ≈ d0 − b0 and bm−1 ≈ 0, i.e. it is located close to the boundary
of the parameter space.
As a consequence of Corollary 2.3 i), there are cases for which k
> k0 does not imply lack of
identification. For example, when k = 2 and k0 = 1, m = ⌊3/2⌋ = 1, so there are no
sub-models of H2 that
are equivalent to the one at d = d0, b = b0,
$\Gamma_1 = \Gamma_1^{0}$ and Γ2 = 0.
Table 1 summarizes the number of equivalent sub-models for
different values of k0 and
k. When k0 is small there are several equivalent sub-models for
small choices of k. When
k0 increases, multiple equivalent sub-models are obtained only
for large k. For example,
when k0 = 5, two equivalent sub-models are obtained only
from suitable restrictions
of a model with at least k = 11 lags, such as H11.
k0 ↓  k →    0   1   2   3   4   5   6   7   8   9  10  11  12
0            1   2   3   4   5   6   7   8   9  10  11  12  13
1            –   1   1   2   2   3   3   4   4   5   5   6   6
2            –   –   1   1   1   2   2   2   3   3   3   4   4
3            –   –   –   1   1   1   1   2   2   2   2   3   3
4            –   –   –   –   1   1   1   1   1   2   2   2   2
5            –   –   –   –   –   1   1   1   1   1   1   2   2
Table 1: Table reports the number of equivalent models (m) for
different combinations of k and k0. When k0 > k the Hk is
under-specified.
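The entries of Table 1 and the parameter values of the equivalent sub-models follow mechanically from Corollary 2.3. A small sketch is given below; the function names are illustrative, and exact rational arithmetic (`fractions.Fraction`) is used only so that the fractions of d0 are not blurred by floating-point rounding.

```python
from fractions import Fraction

def n_equivalent(k0, k):
    """Number m of equivalent sub-models of H_k, Corollary 2.3 i): floor((k+1)/(k0+1))."""
    if k < k0:
        raise ValueError("H_k is under-specified when k < k0")
    return (k + 1) // (k0 + 1)

def equivalent_parameters(d0, b0, k0, k):
    """Pairs (d_j, b_j), j = 0, ..., m-1, from Corollary 2.3 ii)."""
    m = n_equivalent(k0, k)
    return [(d0 - Fraction(j, j + 1) * b0, b0 / (j + 1)) for j in range(m)]

# Row k0 = 1 of Table 1, for k = 1, ..., 12.
print([n_equivalent(1, k) for k in range(1, 13)])

# d0 = b0 = 0.8 with k0 = 0 and k = 2: sub-models at d = 4/5, 2/5 and 4/15.
d0 = b0 = Fraction(4, 5)
for d_j, b_j in equivalent_parameters(d0, b0, k0=0, k=2):
    print(float(d_j), float(b_j))
```

With d0 = b0 = 0.8, k0 = 0 and k = 2, the three pairs are located at d = 0.8, 0.4 and 4/15 ≈ 0.2667.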
The next Section discusses the consequences of the lack of
identification on the esti-
mation of the FCVARd,b parameters when the true number of lags
is unknown.
3 Identification and Inference
This Section illustrates, by means of numerical examples, the
problems in the estimation
of the parameters of the FCVARd,b that are induced by the lack
of identification outlined
in Section 2. In particular, the F(d) condition can be used to
correctly identify the
fractional parameters d and b when model Hk is fitted on the
data.
As shown in Johansen and Nielsen (2012), the parameters of the
FCVARd,b can be
estimated in two steps. First, the parameters d̂ and b̂ are
obtained by maximizing the
profile log-likelihood
$$\ell_T(\psi) = -\log\det(S_{00}(\psi)) - \sum_{i=1}^{r}\log(1-\lambda_i(\psi)) \tag{8}$$
where ψ = (d, b)′. The eigenvalues λi(ψ) and the matrix S00(ψ) are obtained from the
residuals, Rit(ψ) for i = 0, 1,
of the reduced rank regression of $\Delta^{d}X_t$ on $\sum_{j=1}^{k}\Delta^{d}L_b^{j}X_t$ and of
$\Delta^{d-b}L_bX_t$ on $\sum_{j=1}^{k}\Delta^{d}L_b^{j}X_t$, respectively.
The moment matrices Sij(ψ) for i, j = 0, 1 are
$$S_{ij}(\psi) = T^{-1}\sum_{t=1}^{T}R_{it}(\psi)R_{jt}(\psi)'$$
and λi(ψ) for i = 1, . . . , p are the solutions, sorted in
decreasing order, of the generalized
eigenvalue problem
$$\det\left[\lambda S_{11}(\psi) - S_{10}(\psi)S_{00}^{-1}(\psi)S_{01}(\psi)\right] = 0. \tag{9}$$
Second, given d̂ and b̂, the estimates α̂, β̂, Γ̂j for j = 1, .
. . , k and Ω̂ are found by
reduced rank regression.
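The regressors that enter the profile likelihood, namely $\Delta^{d}X_t$, $\Delta^{d-b}L_bX_t$ and $\Delta^{d}L_b^{j}X_t$, require fractional filters applied to a finite sample. The sketch below is one possible way to compute them, under the assumption (not stated in the text) that the series is zero before the first observation, so that the binomial expansion of $(1-L)^d$ is truncated at the sample start. The names `frac_diff` and `frac_lag` are illustrative.

```python
import numpy as np

def frac_diff(x, d):
    """Apply Delta^d = (1 - L)^d to x (T or T x p), treating pre-sample values as zero."""
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    # Coefficients of the binomial expansion: pi_0 = 1, pi_j = pi_{j-1} * (j - 1 - d) / j.
    pi = np.ones(T)
    for j in range(1, T):
        pi[j] = pi[j - 1] * (j - 1 - d) / j
    out = np.zeros_like(x)
    for t in range(T):
        # sum_{j=0}^{t} pi_j x_{t-j}
        out[t] = pi[: t + 1] @ x[t::-1]
    return out

def frac_lag(x, b):
    """Fractional lag operator L_b = 1 - Delta^b."""
    x = np.asarray(x, dtype=float)
    return x - frac_diff(x, b)

x = np.array([1.0, 2.0, 4.0])
print(frac_diff(x, 1.0))  # ordinary first difference when d = 1
print(frac_lag(x, 1.0))   # ordinary lag when b = 1
```

For d = b = 1 the two operators reduce to the usual first difference and lag, which gives a quick sanity check. A regressor such as $\Delta^{d}L_b^{j}X_t$ can then be built by applying `frac_lag` j times and `frac_diff` once. The truncated filters also satisfy the composition rule $\Delta^{d}\Delta^{d} = \Delta^{2d}$, which is the property behind the equivalence of (4) and (5).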
The values of ψ that maximize ℓT (ψ) must be found numerically.
Therefore, we
explore, by means of Monte Carlo simulations, the effect of the
lack of identification of
the FCVARd,b model on the expected profile likelihood when k
> k0. Since the asymptotic
value of ℓT (ψ) is not a closed-form function of the model
parameters, we approximate the
asymptotic behaviour of ℓT (ψ) by averaging over S simulations,
setting a large number
T of observations. This provides an estimate of the expected
profile likelihood, E[ℓT (ψ)].
Therefore, we generate S = 100 samples from model (7), each with T = 50,000 observations,
for different choices of k0 and with p = 2. The fractional parameters
of the system are
d0 = 0.8 and b0 = d0; this choice simplifies the presentation of the results
without loss of generality. The loading and cointegration vectors are α =
[0.5, −0.5] and β = [1, −1],
and the matrices $\Gamma_i^{0}$ for i = 1, ..., k0 are chosen such that the
roots of the characteristic
polynomial are outside the fractional circle, see Johansen
(2008).3 The average profile
log-likelihood, ℓ̄T (ψ), and the average F(d) condition, F̄(d),
are then computed with
respect to a grid of alternative values for d = [dmin, . . . ,
dmax].4
Figure 1 reports the values of ℓ̄T (d) and F̄(d) when k = 1 lag
is chosen but k0 = 0. It
clearly emerges that two equally likely sub-models are found,
corresponding to d = 0.4 and
d = 0.8. However, F̄(d) is equal to zero when d = 0.4.
Consistent with the theoretical
results presented in Section 2, the other value of the parameter
d that maximizes ℓ̄T (d)
is found around d = 0.8, where F̄(d) is far from zero.
Similarly, as reported in Figure 2,
when k = 2 and k0 = 0, the likelihood function presents three
humps around d = 0.8,
3The purpose of these simulations is purely illustrative, so
that we do not explore the behaviour of E[ℓT (ψ)] for other
parameter values. All the source code is available upon request
from the authors.
4The values of dmin and dmax presented in the graphs change in
order to improve the clarity of the plots.
d = 0.4 and d = 0.2667 = d0/3. As in the previous case, the
estimates corresponding to
d = 0.4 and d = 0.2667 should be discarded due to the nullity of
the F(d) condition.
A slightly more complex picture emerges when k0 > 0. Figures
3 and 4 report ℓ̄T (d)
and F̄(d) when k0 = 1 while k = 2 and k = 3 are chosen. When k =
2 the ℓ̄T (d) function
has a single large hump in the region of d = 0.8, thus
supporting the theoretical results
outlined above, i.e. when k = 2 and k0 = 1 there is no lack of
identification. However,
another interesting feature emerges. The ℓ̄T (d) function is
flat and high in the region
around d = 0.5. This may produce identification problems in
finite samples. This issue
is discussed further in Section 3.1. When k = 3 we expect
m = ⌊4/2⌋ = 2 equivalent
sub-models at d = d0 = 0.8 and d = d0/2 = 0.4.
Indeed, in Figure 4,
the line ℓ̄T (d) has two humps around the values of d = 0.4 and
d = 0.8. As expected, in
the region around d = 0.4 the average F(d) condition is near
0.
3.1 Identification in Finite Sample
In Section 2, the mathematical identification issues of the
FCVARd,b have been discussed
in theory. The purpose of this Section is to shed some light on
the consequences of the lack
of mathematical identification in finite samples. From the
analysis above, we know that
the expected profile likelihood displays multiple equivalent
maxima at
fractions of d0 for some k > k0.
In finite samples, however, the profile likelihood function
displays multiple humps,
but just one global maximum when k is larger than k0. Figure 5
reports the finite sample
profile likelihood function, ℓT (d), of model H1 obtained from
two simulated paths of
(7) with k0 = 0 and T = 1000. The plots highlight the behaviour
of ℓT (d) and the
consequences of the lack of identification, since the global
maximum of ℓT (d) is around
d = 0.4 in Panel a), while it is around 0.8 in Panel b).
Moreover, the lag structure of the FCVARd,b model induces poor
finite sample identi-
fication, namely weak identification, also for those cases in
which mathematical identifi-
cation is expected. For example, as shown in Figure 3, when k0 =
1 and k = 2 the average
profile likelihood is high in a neighbourhood of d = 0.5, even
though in theory there is
no sub-model equivalent to the one corresponding to d = 0.8. The
problem worsens if
we look at the profile likelihoods, ℓT (d), for a given T =
1000. As in Figure 5, Figure 6
reports the shape of the finite sample profile likelihood
function, ℓT (d), relative to two
simulated paths of (7) when k0 = 1 and H2 is estimated. When the
global maximum is
in a neighbourhood of d = 0.4, Panel a), F(d) is close to
zero, thus suggesting that
the estimated matrices Γ̂1 and Γ̂2 are such that |α′⊥Γβ⊥| = 0. This evidence suggests that,
in empirical applications, it is crucial to evaluate the F(d)
condition when selecting the
optimal lag length.
4 Lag selection and the F(d) condition
In practical applications the true number of lags is unknown.
Commonly, the lag selection
in the VECM framework is carried out following a
general-to-specific approach. Starting
from a large value of k, the optimal lag length is chosen by a
sequence of likelihood-ratio
tests for the hypothesis Γk = 0, until the nullity of the matrix
Γk is rejected. At each step
of this iteration, the profile likelihood function ℓT (d) must
be computed. If k is larger
than k0, then there is a non-zero probability that the maximum
of ℓT (d) will be found
in a neighbourhood of the values of d that are fractions of d0
and for which |α′⊥Γβ⊥| = 0.
For example, similarly to the evidence shown in Panel a) of
Figure 5, it may happen that,
when k = 1, max ℓT (d) is found in a region near d = 0.4 when d0
= 0.8 and k0 = 0. If
the likelihood ratio test
$$LR = 2\cdot\left[\ell_T\big(\hat d^{(k=1)}\big) - \ell_T\big(\hat d^{(k=0)}\big)\right] \tag{10}$$
rejects the null hypothesis, then the set of parameters which
maximizes the likelihood in
this case will correspond to θ = (d0/2, b0/2, α, β,Γ1 = Ip +
αβ′).
In order to avoid this problem, we suggest integrating
the top-down approach
for the selection of the lags with an evaluation of the F(d)
condition. Since the value
of |α̂′⊥Γ̂β̂⊥| is a point estimate, confidence bands around its
value are needed in order to evaluate whether it is statistically different from
zero. Therefore, we rely on a
bootstrap approach to evaluate the nullity of |α̂′⊥Γ̂β̂⊥|. The
suggested algorithm for the
lag selection in the FCVAR model is as follows:
1. Evaluate Lk = max ℓT (d) for the FCVAR for a given large k;
2. Evaluate Lk−1 = max ℓT (d) for the FCVAR with k − 1 lags;
3. Compute the value of the LR test (10) for k and k − 1, which
is distributed as χ²(p²), where p² is the number of degrees of freedom, see Johansen and
Nielsen (2012).
4. Iterate points 2. and 3. until the null hypothesis is
rejected, at lag length k̃.
5. Evaluate the F(d) condition in d̂k̃, b̂k̃, i.e.
$|\hat\alpha_{\perp,\tilde k}'\hat\Gamma_{\tilde k}\hat\beta_{\perp,\tilde k}|$, namely F(d̂k̃).
6. Generate S pseudo trajectories from the re-sampled residuals
of the Hk̃ model.
7. For fixed d̂k̃, b̂k̃, estimate the matrices $\hat\alpha^{s}_{\perp,\tilde k}$,
$\hat\beta^{s}_{\perp,\tilde k}$ and $\hat\Gamma^{s}_1,\ldots,\hat\Gamma^{s}_{\tilde k}$ with reduced rank
regression, for s = 1, ..., S.
8. Compute the Fs(d) condition, for s = 1, ..., S.
9. Compute the quantiles, qα and q1−α, of the empirical
distribution of Fs(d).
10. If both F(d̂k̃) and 0 belong to the bootstrapped confidence
interval, then iterate
1.-9. for k̃ − i, for i = 1, ..., k̃ until the LR test rejects
the null and F(d̂k̃−i) is
statistically different from zero.
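Steps 9 and 10 of the algorithm can be sketched as follows. This is only an illustration of the decision rule, assuming that the bootstrap values Fs(d) from step 8 are already available as an array and are computed with a consistent normalization of the orthogonal complements; the function names and the default level α = 0.05 are hypothetical choices, not taken from the paper.

```python
import numpy as np

def fd_bootstrap_interval(fd_boot, level=0.05):
    """Quantiles q_alpha and q_{1-alpha} of the bootstrap distribution of F^s(d) (step 9)."""
    fd_boot = np.asarray(fd_boot, dtype=float)
    q_lo, q_hi = np.quantile(fd_boot, [level, 1.0 - level])
    return float(q_lo), float(q_hi)

def fd_indistinguishable_from_zero(fd_hat, fd_boot, level=0.05):
    """Step 10: True when both fd_hat and 0 lie inside the bootstrap interval.

    In that case the selected lag length is suspected to be over-specified
    and the search continues with fewer lags.
    """
    q_lo, q_hi = fd_bootstrap_interval(fd_boot, level)
    return bool(q_lo <= fd_hat <= q_hi and q_lo <= 0.0 <= q_hi)

# Artificial bootstrap draws, used only to show the two possible outcomes.
draws_around_zero = np.linspace(-1.0, 1.0, 201)
draws_away_from_zero = np.linspace(0.5, 1.5, 201)
print(fd_indistinguishable_from_zero(0.1, draws_around_zero))     # True
print(fd_indistinguishable_from_zero(1.0, draws_away_from_zero))  # False
```

In an application the draws would come from the re-sampled residuals of step 6. The evenly spaced arrays above are placeholders rather than simulated output.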
Table 2 reports the results on the performance of the lag
selection procedure that
exploits the information on the F(d) condition to infer the
correct number of lags. The
lag selection method follows the procedure outlined above,
starting from k = 10 lags. It
clearly emerges that in more than 90% of the cases (between 91% and 97%
across designs) the true
number of lags is selected,
thus avoiding the identification problems discussed in Section
2. A different picture
emerges from Table 3. The selection procedure based only on
likelihood ratio tests is
not robust to the identification problem and it selects the correct
lag length much less often.
Indeed, the correct lag length is selected in only about 50% of the cases
(46% to 57%) with 500 observations.
As expected, the performance slightly improves when T = 1000, and
the percentage of
correctly specified models increases to about 65% (64% to 69%).
k0 ↓  k →    0   1   2   3   4   5   6   7   8   9  10
T = 500
0           97   0   1   0   0   0   0   0   1   0   1
1            1  96   0   0   0   1   0   0   1   0   1
2            0   2  91   2   1   2   0   0   1   1   0
T = 1000
0           93   3   0   1   1   0   0   0   0   2   0
1            0  95   2   1   1   0   0   0   1   0   0
2            0   0  96   2   0   0   1   0   0   0   1
Table 2: Table reports the percentage coverage probabilities in
which a specific lag length k is selected using the F(d) condition
together with the LR test. The reported results are based on 100
generated paths from the Hk0 model with k0 = 0, 1, 2 and T = 500
and T = 1000 observations. The bootstrapped confidence intervals for
the F(d) condition are based on S = 200 draws.
k0 ↓  k →    0   1   2   3   4   5   6   7   8   9  10
T = 500
0           56   0   0   0   0   0   3   6   9  13  13
1            0  57   0   0   0   2   2   7  10  11  11
2            0   0  46   1   2   4   4   9  11  10  13
T = 1000
0           64   3   1   1   1   4   4   1   7   7   7
1            0  67   2   2   1   3   2   3   6   8   6
2            0   0  69   2   3   2   3   1   3  10   7
Table 3: Table reports the percentage coverage probabilities in
which a specific lag length k is selected with a general-to-specific
approach using a sequence of LR tests. The reported results are
based on 100 generated paths from the Hk0 model with k0 = 0, 1, 2
and T = 500 and T = 1000 observations.
5 Unknown cointegration rank
This Section extends the previous results to the case of unknown
rank, r, which is of
relevance in empirical applications. The FCVARd,b model with
cointegration rank 0 ≤
r ≤ p is defined as:
$$H_{r,k}:\quad \Delta^{d}X_t = \Pi\Delta^{d-b}L_bX_t + \sum_{i=1}^{k}\Gamma_i\Delta^{d}L_b^{i}X_t + \varepsilon_t$$
where r is the rank of the p× p matrix Π.
Compared to the case discussed in previous sections, model Hr,k
exhibits further
identification issues. For example, the model with k = 1 lag and
rank 0 ≤ r ≤ p, is given
by
$$H_{r,1}:\quad \Delta^{d}X_t = \Pi\Delta^{d-b}L_bX_t + \Gamma_1\Delta^{d}L_bX_t + \varepsilon_t \tag{11}$$
where the parameter vector is θ = (d, b, Π, Γ1). Consider the following two
sub-models
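$$H_{p,0}:\quad \Delta^{d_1}X_t = \Pi^{1}\Delta^{d_1-b_1}L_{b_1}X_t + \varepsilon_t \tag{12}$$
and
$$H_{0,1}:\quad \Delta^{d_2}X_t = \Gamma_1^{2}\Delta^{d_2}L_{b_2}X_t + \varepsilon_t \tag{13}$$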
The sub-model Hp,0 is a reparameterization of H0,1 because (12)
can be written as
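$$\left[\Delta^{d_1-b_1}(-\Pi^{1}) + \Delta^{d_1}(I_p + \Pi^{1})\right]X_t = \varepsilon_t \tag{14}$$
and (13) is given by
$$\left[\Delta^{d_2}(I - \Gamma_1^{2}) + \Delta^{d_2+b_2}\Gamma_1^{2}\right]X_t = \varepsilon_t \tag{15}$$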
If $I - \Gamma_1^{2} = \Pi^{1}$, d1 = d2 + b2 and b1 = b2, the two sub-models
represent the same process and
d1 ≥ b1 > 0 implies d2 + b2 > b2. Hence, the probability
densities satisfy
$$p(X_1,\ldots,X_T;\theta_1 \mid X_{-1},\ldots) = p(X_1,\ldots,X_T;\theta_2 \mid X_{-1},\ldots)$$
when
$$\theta_1 = (d_1,\, b_1,\, \Pi^{1},\, 0), \qquad \theta_2 = (d_2+b_2,\, b_2,\, 0,\, I-\Pi^{1}).$$
However, the sub-model H0,1 is not always a reparameterization
of Hp,0. In fact, given
the expansions in (14) and (15), it follows that
$$p(X_1,\ldots,X_T;\theta_3 \mid X_{-1},X_{-2},\ldots) = p(X_1,\ldots,X_T;\theta_4 \mid X_{-1},X_{-2},\ldots) \tag{16}$$
where
$$\theta_3 = (d_2,\, b_2,\, 0,\, \Gamma_1^{2}), \qquad \theta_4 = (d_2-b_2,\, b_2,\, I-\Gamma_1^{2},\, 0).$$
The equality (16) holds if and only if θ4 is such that d2 − b2 ≥
b2 > 0. This implies that
H0,1 = Hp,0 ∩ {d ≥ 2b}. Hence, the nesting structure H0,1 ⊂ Hp,0
follows.
The next proposition extends this example to a general number of
lags k and rank r.5
Proposition 5.1 Consider the FCVARd,b, Hr,k, with k > 0 and 0 ≤ r ≤ p. The following
statements hold:
• For any k > 0, model H0,k is equivalent to Hp,k−1, if the
restriction d > 2b holds.
• For any k > 0, model H0,k is equivalent to Hr,k−1 with 0 < r < p, if and only if
|α′⊥Γβ⊥| = 0 in the latter.
• The nesting structure of the FCVARd,b model is represented by
the following scheme:
H0,0 ⊂ H0,1 ⊂ H0,2 ⊂ · · · ⊂ H0,k
 ∩      ∩      ∩              ∩
H1,0 ⊂ H1,1 ⊂ H1,2 ⊂ · · · ⊂ H1,k
 ∩      ∩      ∩              ∩
 ⋮      ⋮      ⋮      ⋱       ⋮
 ∩      ∩      ∩              ∩
Hp,0 ⊂ Hp,1 ⊂ Hp,2 ⊂ · · · ⊂ Hp,k
with
H0,1 ⊂ Hp,0, H0,2 ⊂ Hp,1, . . . , H0,k ⊂ Hp,k−1.
Clearly, the nesting structure of the FCVAR impacts on the joint
selection of the
number of lags and the cointegration rank. Indeed, the
likelihood ratio statistic for
5A similar identification problem arises in the FAR(k) model of
Johansen and Nielsen (2010),
$$\Delta^{d}X_t = \pi\Delta^{d-b}L_bX_t + \sum_{i=1}^{k}\gamma_i\Delta^{d}L_b^{i}X_t + \varepsilon_t.$$
Similarly to the FCVARd,b model, the FAR(k) has the following
nesting structure:
H0,0 ⊂ H0,1 ⊂ H0,2 ⊂ · · · ⊂ H0,k
 ∩      ∩      ∩              ∩
H1,0 ⊂ H1,1 ⊂ H1,2 ⊂ · · · ⊂ H1,k
with
H0,1 ⊂ H1,0, H0,2 ⊂ H1,1, . . . , H0,k ⊂ H1,k−1.
cointegration rank r, LRr,k, is given by
$$-2\log LR_{r,k}(H_{r,k}\mid H_{p,k}) = T\cdot\left(\ell_{p,k}(\hat d_p,\hat b_p) - \ell_{r,k}(\hat d_r,\hat b_r)\right)$$
where ℓr,k is the profile likelihood of the FCVARd,b model with
rank r and k lags. Analo-
gously, d̂r,k and b̂r,k are the arguments that maximize ℓr,k.
The asymptotic properties of
the LRr statistics for given k are provided in Johansen and
Nielsen (2012).
Under the null hypothesis H0,k, it follows from Proposition 5.1
that the LR test statistic
LR0,k = −2 log LR(H0,k|Hp,k) is equal to LRp,k−1 = −2
log LR(Hp,k−1|Hp,k).6 Hence,
the equality of the test statistics LR0,k and LRp,k−1 influences
the top-down sequence
of tests for the joint identification of the cointegration rank
and the lag length. Indeed,
assuming that the top-down procedure for the optimal lag
selection terminates in Hp,k−1,
it would be impossible to test whether the optimal model is
Hp,k−1 or H0,k. Therefore, a
problem of joint selection of k and r arises in the
FCVARd,b when the rank is potentially
equal to 0 or p. A trivial solution to this issue is to exclude
the models with rank equal
to zero when selecting rank and lag length.
6 Conclusion
This paper discussed in detail the identification problem in the
FCVARd,b model of Johansen (2008),
whereby the fractional order of the system
cannot be uniquely determined
when the lag structure is over-specified. In particular, the
multiplicity of equivalent sub-
models is provided in closed form given k and k0. It is also
shown that a necessary and
sufficient condition for the identification is that the F(d)
condition, i.e. |α′⊥Γβ⊥| 6= 0, is
fulfilled. A simulation study highlights the practical problem
of multiple humps in the
expected profile log-likelihood function as a consequence of the
identification problem
and the over-specification of the lag structure. The simulations
also show that the true
parameters can be detected by evaluating the F(d) condition. The
simulation study also
reveals a problem of weak identification, characterized by the
presence of local and global
maxima of the profile likelihood function in finite samples. It
is also shown that the
6Both tests, LRp,k−1 and LR0,k, are asymptotically χ²(p²)
distributed.
F(d) condition is necessary and sufficient for identification
also when the cointegration
rank is unknown and such that 0 < r < p. It is proved that
model H0,k is equivalent
to model Hp,k−1 under certain conditions on d and b, but the
F(d) condition does not provide any
information for the identification in this case. A solution to
this issue, which does not
exclude rank equal to zero, is left for future research.
References
Christensen, B. J. and Nielsen, M. Ø. (2006). Semiparametric
analysis of stationary fractional cointegration and the
implied-realized volatility relation. Journal of Econometrics.
Engle, R. and Granger, C. (1987). Cointegration and error
correction: representation, estimation, and testing.
Econometrica, 55:251–276.
Johansen, S. (1988). Statistical analysis of cointegration
vectors. Journal of Economic Dynamics and Control, 12:231–254.
Johansen, S. (2008). A representation theory for a class of
vector autoregressive models for fractional processes.
Econometric Theory, 24(3):651–676.
Johansen, S. and Nielsen, M. Ø. (2010). Likelihood inference for
a nonstationary fractional autoregressive model. Journal of
Econometrics, 158(1):51–66.
Johansen, S. and Nielsen, M. Ø. (2012). Likelihood inference for
a fractionally cointegrated vector autoregressive model.
Econometrica, 80(6):2667–2732.
Nielsen, M. Ø. and Shimotsu, K. (2007). Determining the
cointegration rank in nonstationary fractional systems by the
exact local Whittle approach. Journal of Econometrics,
141:574–596.
Robinson, P. M. and Marinucci, D. (2003). Semiparametric
frequency domain analysis of fractional cointegration. In
Robinson, P. M., editor, Time Series with Long Memory,
pages 334–373. Oxford University Press.
Tschernig, R., Weber, E., and Weigand, R. (2013a). Fractionally
integrated VAR models with a fractional lag operator and
deterministic trends: Finite sample identification and two-step
estimation. Technical report.
Tschernig, R., Weber, E., and Weigand, R. (2013b). Long-run
identification in a fractionally integrated system. Journal of
Business & Economic Statistics, 31(4):438–450.
Proof of Proposition 2.2 when k0 = 0 and k = 1
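Let us define the FCVARd,b model with one lag, H1, as
$$\Delta^{d}X_t = \alpha\beta'\Delta^{d-b}L_bX_t + \Gamma_1\Delta^{d}L_bX_t + \varepsilon_t \tag{17}$$
which can be written as
$$\left\{\Delta^{d}\left[I + \alpha\beta' - \Gamma_1\right] + \Delta^{d-b}\left[-\alpha\beta'\right] + \Delta^{d+b}\Gamma_1\right\}X_t = \varepsilon_t \tag{18}$$
Similarly, the model Hk0 with k0 = 0 lags in (7) can be
rewritten as
$$\left\{\Delta^{d_0}\left[I + \alpha_0\beta_0'\right] + \Delta^{d_0-b_0}\left[-\alpha_0\beta_0'\right]\right\}X_t = \varepsilon_t \tag{19}$$
Imposing I + αβ′ − Γ1 = 0, it follows that
$$\Delta^{d+b}\Gamma_1 = (I + \alpha_0\beta_0')\Delta^{d_0} \tag{20}$$
and the condition
$$-\alpha\beta'\Delta^{d-b} = -\alpha_0\beta_0'\Delta^{d_0-b_0} \tag{21}$$
is satisfied when d = d0 − b0/2 and b = b0/2. The other
equivalent sub-model corresponds to Hk0 in (7) with k0 = 0, with
$\alpha_0\beta_0' = \alpha\beta'$, Γ1 = 0, d = d0 and b = b0.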
When Hk0 has k0 = 0 lags and model Hk with k > 0 is
considered, then model Hk
can be rewritten as
$$\sum_{i=-1}^{k}\Psi_i\Delta^{d+ib}X_t = \varepsilon_t \tag{22}$$
where $\sum_{i=-1}^{k}\Psi_i = I_p$, $\Psi_{-1} = -\alpha\beta'$ and $\Psi_0 = \alpha\beta' + \Gamma$.
Similarly, the Hk0 model with k0 = 0 lags is given by
$$\sum_{i=-1}^{0}\Psi_{i,0}\Delta^{d_0+ib_0}X_t = \varepsilon_t, \quad \text{with } \Psi_{-1,0} + \Psi_{0,0} = I_p. \tag{23}$$
It is possible to show that k + 1 sub-models equivalent to Hk0
can be obtained by
imposing suitable restrictions on the matrices Ψi, i = −1, ..., k,
of the model Hk. The
equivalent sub-models, Hk,j, j = 0, 1, . . . , k, are found at
$$\begin{aligned}
\Psi_{-1} &= \Psi_{-1,0} && \text{corresponding to } d - b = d_0 - b_0 \\
\Psi_{j} &= \Psi_{0,0} && \text{corresponding to } d + jb = d_0 \\
\Psi_{s} &= 0, && s \neq j,\ s \geq 0
\end{aligned} \tag{24}$$
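This system entails that all sub-models $H_{k,j}$, $j = 1, \dots, k$, are such that $\Psi_{-1} = -\alpha\beta' = -\alpha_0\beta_0' = \Psi_{-1,0}$ and $\Psi_0 = 0$. This implies that $\alpha\beta' + \Gamma = \Psi_0 = 0$, that is, $\Gamma = -\alpha\beta'$, so that $\alpha_\perp'\Gamma\beta_\perp = -\alpha_\perp'\alpha\beta'\beta_\perp = 0$. Hence, the sub-models for $j = 1, \dots, k$ are such that $|\alpha_\perp'\Gamma\beta_\perp| = 0$. Only for $j = 0$ is the condition $|\alpha_\perp'\Gamma\beta_\perp| \neq 0$ satisfied.
Hence, verifying $|\alpha_\perp'\Gamma\beta_\perp| \neq 0$ is sufficient for the identification of the parameters of $H_{k_0}$. □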
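The last step of the argument, that $\alpha\beta' + \Gamma = 0$ forces $|\alpha_\perp'\Gamma\beta_\perp| = 0$, can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' procedure: the helper names `orth_complement` and `f_condition` are hypothetical, the orthogonal complements are obtained from a singular value decomposition, and the values of $\alpha$ and $\beta$ are arbitrary. The value of the determinant depends on the basis chosen for the complements, but whether it is zero does not.

```python
import numpy as np

def orth_complement(a):
    """Orthonormal basis of the orthogonal complement of the column space
    of a (p x r matrix assumed to have full column rank)."""
    _, r = a.shape
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, r:]

def f_condition(alpha, beta, gamma):
    """Determinant |alpha_perp' Gamma beta_perp| (hypothetical helper)."""
    a_perp = orth_complement(alpha)
    b_perp = orth_complement(beta)
    return np.linalg.det(a_perp.T @ gamma @ b_perp)

# Illustrative (assumed) values with p = 3 and r = 1.
alpha = np.array([[-0.5], [0.2], [0.1]])
beta = np.array([[1.0], [-1.0], [0.0]])

# Restriction of the sub-models with j >= 1: Gamma = -alpha beta'.
print(f_condition(alpha, beta, -alpha @ beta.T))
# An unrestricted Gamma (here the identity, i.e. no lagged terms).
print(f_condition(alpha, beta, np.eye(3)))
```

The first determinant should be zero up to rounding error, while the second is bounded away from zero for these values.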
The following table summarizes the set of restrictions that have to be imposed on an Hk model in order to find k + 1 sub-models equivalent to the model Hk0 with k0 = 0:

Matrices in Hk →   Ψ−1     Ψ0     Ψ1     Ψ2     Ψ3     · · ·   Ψk
Hk,0               Ψ−1,0   Ψ0,0   0      0      0      · · ·   0
Hk,1               Ψ−1,0   0      Ψ0,0   0      0      · · ·   0
Hk,2               Ψ−1,0   0      0      Ψ0,0   0      · · ·   0
Hk,3               Ψ−1,0   0      0      0      Ψ0,0   · · ·   0
...                ...     ...    ...    ...    ...    ...     ...
Hk,k               Ψ−1,0   0      0      0      0      · · ·   Ψ0,0

Table 4: Restrictions imposed on the Hk model when the model Hk0 is a FCVARd,b with k0 = 0 lags.
Proof of Proposition 2.2
Let us define the model $H_{k_0}$ under $k_0 \geq 0$ as
$$\sum_{i=-1}^{k_0} \Psi_{i,0}\,\Delta^{d_0+ib_0} X_t = \varepsilon_t \tag{25}$$
and the model $H_k$ with $k > k_0$ as
$$\sum_{i=-1}^{k} \Psi_i\,\Delta^{d+ib} X_t = \varepsilon_t \tag{26}$$
It is possible to show that, for a given $k_0$, $m$ sub-models equivalent to the DGP (25) can be obtained by imposing suitable restrictions on the matrices $\Psi_i$, $i = -1, \dots, k$, of the model $H_k$. The equivalent sub-models, $H_{k,j}$, $j = 0, 1, \dots, m-1$, are found in correspondence of
$$\begin{aligned}
\Psi_{-1} &= \Psi_{-1,0} && \text{corresponding to } d - b = d_0 - b_0 \\
\Psi_{(\ell+1)(j+1)-1} &= \Psi_{\ell,0} && \text{corresponding to } d + [(\ell+1)(j+1)-1]\,b = d_0 + \ell b_0 \\
\Psi_{s} &= 0 && \text{for } s \geq 0,\ s \neq (\ell+1)(j+1)-1
\end{aligned} \tag{27}$$
for $\ell = 0, \dots, k_0$ and $j = 0, 1, \dots, m-1$.
The restriction $\Psi_0 = \alpha\beta' + \Gamma = 0$, implying $|\alpha_\perp'\Gamma\beta_\perp| = 0$ with $\Gamma = I - \sum_{i=1}^{k}\Gamma_i$, is always imposed for the sub-models $H_{k,j}$ when $j \geq 1$.
As in the case $k_0 = 0$, $\Psi_{-1,0} = -\alpha_0\beta_0'$ and $\Psi_{-1} = -\alpha\beta'$ load the terms $\Delta^{d_0-b_0}X_t$ and $\Delta^{d-b}X_t$, respectively. This implies that $d_0 - b_0 = d - b$. For a given $j > 0$, a system of $k_0 + 2$ equations (27) in $d$ and $b$ is derived from the restrictions on the matrices $\Psi_i$. The solution of this system is $b = b_0/(j+1)$ and $d = d_0 - \frac{j}{j+1}b_0$. All sub-models $H_{k,j}$, $j = 1, \dots, m-1$, are such that $\Psi_{-1} = -\alpha\beta' = -\alpha_0\beta_0' = \Psi_{-1,0}$ and $\Psi_0 = 0$. This implies that $\alpha\beta' + \Gamma = \Psi_0 = 0$. Hence, the sub-models for $j = 1, \dots, m-1$ are such that $|\alpha_\perp'\Gamma\beta_\perp| = 0$. Only for $j = 0$ is the condition $|\alpha_\perp'\Gamma\beta_\perp| \neq 0$ satisfied.
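For a given $k > k_0$, the largest index restricted in (27), $(k_0+1)(j+1)-1$, cannot exceed $k$, so the number of possible restrictions on $\Psi_i$ that satisfy the system in (27) is $\lfloor \frac{k+1}{k_0+1} \rfloor$. Hence, the number of equivalent sub-models is $m = \lfloor \frac{k+1}{k_0+1} \rfloor$.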
Finally, the following table reports the set of restrictions to be imposed on the H6 model to have m = ⌊7/2⌋ = 3 sub-models equivalent to the model Hk0 with k0 = 1 lag.

Matrices in H6 →   Ψ−1     Ψ0     Ψ1     Ψ2     Ψ3     Ψ4     Ψ5     Ψ6
H6,0               Ψ−1,0   Ψ0,0   Ψ1,0   0      0      0      0      0
H6,1               Ψ−1,0   0      Ψ0,0   0      Ψ1,0   0      0      0
H6,2               Ψ−1,0   0      0      Ψ0,0   0      0      Ψ1,0   0

Table 5: Restrictions imposed on the H6 model when the model Hk0 is a FCVARd,b with k0 = 1 lag.
□
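The restrictions in (27) and the count $m = \lfloor (k+1)/(k_0+1) \rfloor$ lend themselves to a short enumeration. The sketch below is only an illustration under stated assumptions: the function name `equivalent_submodels` and the dictionary layout are hypothetical, and exact fractions are used so that the exponent equalities can be checked without rounding. For each $j$ it records which $\Psi_s$ of $H_k$ takes the value $\Psi_{\ell,0}$, together with the implied $(d, b)$.

```python
from fractions import Fraction

def equivalent_submodels(k, k0, d0, b0):
    """Enumerate the sub-models H_{k,j} of H_k equivalent to H_{k0}, as in (27).

    Returns one dict per j with the implied (d, b) and a 'placement' map
    s -> l meaning Psi_s = Psi_{l,0}; every other Psi_s with s >= 0 is zero.
    """
    if k < k0:
        raise ValueError("k must be at least k0")
    d0, b0 = Fraction(d0), Fraction(b0)
    m = (k + 1) // (k0 + 1)  # number of equivalent sub-models
    submodels = []
    for j in range(m):
        placement = {(l + 1) * (j + 1) - 1: l for l in range(k0 + 1)}
        b = b0 / (j + 1)
        d = d0 - j * b  # equals d0 - j/(j+1) * b0
        submodels.append({"j": j, "d": d, "b": b, "placement": placement})
    return submodels

# Setting of Table 5: k = 6, k0 = 1 (d0 and b0 are illustrative values).
for sub in equivalent_submodels(6, 1, Fraction(4, 5), Fraction(4, 5)):
    print(sub["j"], float(sub["d"]), float(sub["b"]), sub["placement"])
```

With $k = 6$ and $k_0 = 1$ the placements reproduce the three rows of Table 5. With $k_0 = 0$, $k = 2$ and $d_0 = b_0 = 0.8$, the implied values of $d$ are $0.8$, $0.4$ and $4/15 \approx 0.2667$.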
Proof of Proposition 5.1
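Consider the model
$$H_{r,k}: \quad \Delta^{d} X_t = \Pi\,\Delta^{d-b} L_b X_t + \sum_{j=1}^{k} \Gamma_j\,\Delta^{d} L_b^{j} X_t + \varepsilon_t$$
It can be written as
$$\sum_{i=-1}^{k} \Psi_i\,\Delta^{d+ib} X_t = \varepsilon_t$$
where $\Psi_{-1} = -\Pi$, $\Psi_0 = I + \Pi - \sum_{i=1}^{k}\Gamma_i$ and $\Psi_k = (-1)^{k+1}\Gamma_k$.
Consider the two sub-models of $H_{r,k}$ with the following two restrictions:
$H_{p,k-1}$: $\Pi$ is a $p \times p$ matrix of full rank and $\Gamma_k = 0$
$H_{0,k}$: $\Pi = 0$
The model $H_{p,k-1}$ can be written as
$$\sum_{i=-1}^{k-1} \tilde{\Psi}_i\,\Delta^{\tilde{d}+i\tilde{b}} X_t = \varepsilon_t$$
where $\tilde{\Psi}_{-1} = -\Pi$, $\tilde{\Psi}_0 = I + \Pi - \sum_{i=1}^{k-1}\Gamma_i$ and $\tilde{\Psi}_{k-1} = (-1)^{k}\Gamma_{k-1}$.
The model $H_{0,k}$ can be written as
$$\sum_{i=0}^{k} \bar{\Psi}_i\,\Delta^{\bar{d}+i\bar{b}} X_t = \varepsilon_t$$
because $\bar{\Psi}_{-1} = 0$, $\bar{\Psi}_0 = I + 0 - \sum_{i=1}^{k}\Gamma_i$ and $\bar{\Psi}_k = (-1)^{k+1}\Gamma_k$.
The two sub-models are equal if
$$\tilde{\Psi}_{-1} = \bar{\Psi}_0, \quad \tilde{\Psi}_0 = \bar{\Psi}_1, \quad \dots, \quad \tilde{\Psi}_{k-1} = \bar{\Psi}_k$$
and
$$\begin{aligned}
\tilde{d} - \tilde{b} &= \bar{d} \\
\tilde{d} &= \bar{d} + \bar{b} \\
&\ \,\vdots \\
\tilde{d} + (k-1)\tilde{b} &= \bar{d} + k\bar{b}
\end{aligned} \tag{28}$$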
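Given that the FCVAR$_{d,b}$ model assumes that $d \geq b > 0$, it must hold that $\tilde{d} \geq \tilde{b}$ and $\bar{d} \geq \bar{b}$. From (28), $\bar{b} = \tilde{b}$ and $\bar{d} = \tilde{d} - \tilde{b}$, so the inequality $\tilde{d} \geq \tilde{b}$ is always verified, but $\bar{d} \geq \bar{b}$ is verified if and only if $\tilde{d} \geq 2\tilde{b}$.
Therefore, $H_{0,k} \subset H_{p,k-1}$.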
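The mapping between the fractional parameters of the two sub-models can be sketched in a few lines. The function below is a hypothetical helper, not part of the paper: it solves the exponent system (28) for $(\bar{d}, \bar{b})$ given $(\tilde{d}, \tilde{b})$, verifies each equation of the system, and reports whether the implied parameters respect $\bar{d} \geq \bar{b} > 0$. The numerical inputs are arbitrary.

```python
def h0k_parameters(d_tilde, b_tilde, k):
    """Solve the exponent system (28) for (d_bar, b_bar).

    Returns (d_bar, b_bar, admissible), where admissible indicates whether
    d_bar >= b_bar > 0 holds, i.e. whether d_tilde >= 2 * b_tilde.
    """
    b_bar = b_tilde
    d_bar = d_tilde - b_tilde
    # Each line of (28) reads d_tilde + (i - 1) * b_tilde = d_bar + i * b_bar.
    for i in range(k + 1):
        lhs = d_tilde + (i - 1) * b_tilde
        rhs = d_bar + i * b_bar
        if abs(lhs - rhs) > 1e-12:
            raise AssertionError("system (28) is not satisfied")
    admissible = d_bar >= b_bar > 0
    return d_bar, b_bar, admissible

# Illustrative (assumed) values: the first pair satisfies d_tilde >= 2 b_tilde,
# the second does not.
print(h0k_parameters(0.9, 0.3, k=2))
print(h0k_parameters(0.8, 0.5, k=2))
```

Only parameter pairs with $\tilde{d} \geq 2\tilde{b}$ map into an admissible $H_{0,k}$ model, which is the source of the strict inclusion.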
Consider the case in which $\Pi$ is a reduced rank matrix with $0 < r < p$. $H_{r,k-1}$ and $H_{0,k}$ are equivalent if the system of equations (28) holds. In this case, $\bar{\Psi}_0$ is equal to $I - \sum_{i=1}^{k}\bar{\Gamma}_i = -\tilde{\alpha}\tilde{\beta}'$. Hence, the models $H_{r,k-1}$ are equivalent to $H_{0,k}$ if and only if the F(d) condition is equal to 0. □
A Figures
[Plot: "Expected Likelihood and F(d) condition for different values of d". Horizontal axis: d. Series: expected log-likelihood and F(d) condition, with a zero line and markers at d = d* = 0.8 and d = d*/2 = 0.4.]
Figure 1: The figure reports simulated values of l̄(d) and F̄(d) for different values of d ∈ [0.2, 1.2]. The DGP is generated with k0 = 0 lags and a model Hk with k = 1 lag is fitted.
[Plot: "Expected Profile Likelihood and F(d) condition for different values of d". Horizontal axis: d. Series: expected log-likelihood and F(d) condition, with a zero line and markers at d = d* − 2b*/3 = 0.2667, d = d* − b*/2 = 0.4 and d = d* = 0.8.]
Figure 2: The figure reports simulated values of l̄(d) and F̄(d) for different values of d ∈ [0.2, 1.2]. The DGP is generated with k0 = 0 lags and a model Hk with k = 2 lags is fitted.
[Plot: "Expected Likelihood and F(d) condition for different values of d". Horizontal axis: d. Series: expected profile likelihood and F(d) condition.]
Figure 3: The figure reports simulated values of l̄(d) and F̄(d) for different values of d ∈ [0.4, 1]. The DGP is generated with k0 = 1 lag and a model Hk with k = 2 lags is fitted.
[Plot: "Expected Likelihood Function and F(d) condition for different values of d". Horizontal axis: d. Series: expected likelihood and F(d) condition.]
Figure 4: The figure reports simulated values of l̄(d) and F̄(d) for different values of d ∈ [0.3, 0.8]. The DGP is generated with k0 = 1 lag and a model Hk with k = 3 lags is fitted.
[Plots, two panels: (a) Maximum around d = 0.4; (b) Maximum around d = 0.8. Horizontal axis: d. Series in each panel: l(d) and F(d) condition, with a zero line.]
Figure 5: The figure reports the values of the profile likelihood l(d) and F(d) for different values of d ∈ [0.35, 0.9] for two different simulated paths with T = 1000 of the FCVARd,d when k0 = 0 and model H1 is estimated on the data.
[Plots, two panels: (a) Maximum around d = 0.4; (b) Maximum around d = 0.8. Horizontal axis: d. Series in each panel: l(d) and F(d), with a zero line.]
Figure 6: The figure reports the values of the profile likelihood l(d) and F(d) for different values of d ∈ [0.35, 0.9] for two different simulated paths with T = 1000 of the FCVARd,d when k0 = 1 and model H2 is estimated on the data.