FORECASTING EXCHANGE RATES USING EXTENDED
MARKOV SWITCHING MODELS
by
Hok-hoi Fung
A thesis
submitted in partial fulfillment
of the requirements for the degree of
Master of Philosophy in the Department of Economics
The Chinese University of Hong Kong
June 1995
ABSTRACT
FORECASTING EXCHANGE RATES USING EXTENDED
MARKOV SWITCHING MODELS
by
Hok-hoi Fung
The literature documents that the 2-state Markov model with time-varying
transition probabilities (TVTP) and the 3-state Markov model are better suited
to describing exchange rate movements than the simple 2-state Markov model.
Exchange rates appear to have been more stable since the Louvre Accord of March
1987, which implies the emergence of a new state characterized by low variance
and less drift. A Markov model with more than two states is therefore needed to
capture these characteristics. The assumption of constant transition
probabilities in the simple Markov model may also be unrealistic, so a more
flexible structure should be developed. The TVTP model has the extra
flexibility to allow the transition probabilities to adjust before a rise or
decline in the exchange rate, and hence helps to identify the current state or
regime and to forecast when the exchange rate switches regimes. Three major
currencies, the German mark, the Japanese yen and the British pound, are
examined over the period from September 1973 to June 1992. The results indicate
that these extensions offer a better in-sample description and generate more
satisfactory forecasts for some currencies than the simple 2-state Markov
model. For instance, the German mark is more likely to follow a stochastic
process generated by the TVTP model, while the 3-state Markov model describes
the Japanese yen much better. However, the Markov models do not outperform the
random walk in out-of-sample forecasting over short horizons.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER
1. INTRODUCTION
2. LITERATURE REVIEW
3. METHODOLOGY
   Formulation of the TVTP Model
   Filtered and Smoothed Probabilities
   Maximization of the Expected Log-likelihood
4. EMPIRICAL RESULTS
   The Simple 2-state Markov Switching Model
   The TVTP Model
   The 3-state Markov Switching Model
5. OUT-OF-SAMPLE FORECASTING
6. CONCLUSION
APPENDICES
BIBLIOGRAPHY
LIST OF TABLES
Table
1. Parameter Estimates for the 2-state Markov and the TVTP Model
2. Parameter Estimates for the 3-state Markov Model
3. Post-sample Root Mean Squared Forecast Errors
4. Post-sample Mean Absolute Forecast Errors
LIST OF FIGURES
Figure
1. The Smoothed Probabilities of the 2-state Markov and the TVTP Models for Each Currency
2. The Transition Probabilities of the TVTP Model for Each Currency
3. The Smoothed Probabilities of the 3-state Markov Model for Each Currency
CHAPTER 1
INTRODUCTION
The Markov switching model, originally devised by Hamilton, is claimed to
model well a random process that experiences discrete shifts in the values of
its underlying parameters. During the past decade, the Markov model has been
broadly applied to economic issues such as business cycle dynamics (Goodwin
1993 and Filardo 1994), the term structure of interest rates (Hamilton 1988),
stock price volatility (Hamilton and Susmel 1994) and exchange rate analysis
(Engel and Hamilton 1990; Kaminsky 1993 and Engel 1994). With regard to
exchange rates, strong evidence has been found that the first difference in the
log of the exchange rates of many major currencies experiences shifts in regime
from low or negative to high or positive expected values.
Nevertheless, the recent literature employs only the simple 2-state Markov
model, or its variants with constant transition probabilities, to trace
exchange rate movements. Engel (1994) suggested, as a direction for future
work, that a Markov model with more than two states may be useful in exchange
rate analysis. Filardo (1994) introduced time-varying instead of constant
transition probabilities (TVTP) in his study of business cycle dynamics and
found a much better fit and forecasting performance. A natural question
therefore emerges: can these extensions yield superior exchange rate forecasts
when compared to the simple Markov model, and even to the driftless random
walk? Or do these formulations instead weaken predictive power because
overfitting occurs or because the market fundamentals "distort" the estimates?
The latter concern is based on the fact that many structural exchange rate
models, which use market fundamentals as explanatory variables, produce
unsatisfactory forecasts (Meese and Rogoff 1983).
These questions and their answers constitute the main theme of the
discussion. In this paper, the 3-state Markov and TVTP models are estimated for
the US dollar exchange rates against the German mark, the Japanese yen and the
British pound, and are compared to the simple Markov and random walk models in
terms of forecasting performance. These two extensions are expected to
outperform the simple 2-state Markov model because each has a supporting
rationale. With regard to the 3-state Markov model, Engel (1994) pointed out
that the poorer forecasts offered by the simple 2-state Markov model (when
compared to the zero-drift random walk) can be attributed to the Louvre Accord
of March 1987. This accord seems to have stabilized exchange rates and implied
the emergence of a new state characterized by low variance and less drift
(called the narrow spread) during his post-sample forecast period. He therefore
argued that the Markov model would perform better if it allowed for a third
state to incorporate the narrow spread characteristics of the exchange rate
process. The TVTP model, for its part, has the extra flexibility to allow the
transition probabilities to adjust before a rise or decline in the exchange
rate, and hence helps to identify the current state or regime and to forecast
when the exchange rate switches regimes. The structure of the TVTP model is
therefore expected to give it satisfactory predictive power.
The paper is organized as follows. The literature review is presented in
Chapter Two. Chapter Three delineates the formulation of the TVTP model with
the 2-state Markov model as a nested alternative. Chapter Four reports the
estimates of the 2-state and the 3-state Markov, and the TVTP models. Chapter
Five compares their forecasting performance. Conclusions are offered in Chapter
Six.
CHAPTER 2
LITERATURE REVIEW
The Markov model was first applied to exchange rate analysis by Engel and
Hamilton (1990). Their study was motivated by the observation of apparent long
swings in several exchange rates against the dollar over the period from the
third quarter of 1973 to the first quarter of 1988, a phenomenon that violated
the predictions of structural exchange rate models. For instance, if the US
real interest rate were driven up by fiscal or monetary policy, the Dornbusch
(1976) sticky price model predicts only a one-time upward jump in the value of
the dollar, after which the dollar would depreciate gradually to equate
expected returns across countries. In reality, however, the dollar was instead
much stronger for the subsequent one or two years after a rise in the real
interest rate during that period. They therefore asked whether exchange rates
were generated by a stochastic process that incorporated long swings as its
systematic part, or simply followed a random walk with directionless drift.
They applied the Markov model and assumed that any quarter's change in the
exchange rate derived from one of two regimes or states, which corresponded to
episodes of a rising or a falling rate respectively. The in-sample results were
robust: two different regimes were identified for each exchange rate, and a
given regime was likely to persist for several years, which supports the
phenomenon of long swings. However, the post-sample forecasting ability of the
Markov model was poor when compared to the zero-drift random walk on the basis
of the mean squared forecast error. To provide a more complete and thorough
comparison of forecasting performance, Engel (1994) examined eighteen exchange
rates and chose thirteen of them for the contest. The post-sample forecast
period began in the second quarter of 1986 and ended in the first quarter of
1991. It was found that although the Markov model was superior in forecasting
the direction of change in exchange rates, it was still outperformed by the
zero-drift random walk in minimizing the mean squared forecast error.
Kaminsky (1993) argued that the poor forecasting performance of the Markov
model could be explained by its implicit assumption that investors used only
past exchange rate observations to make their forecasts. This information set
might not be enough to forecast changes in the economic environment which would
affect the future path of the exchange rate. In that framework, it was
therefore assumed that investors also included the announcements made by
Federal Reserve officials in their information set. The news provided by such
announcements might be wrong in some situations, but it helped investors to
identify the current exchange rate regime. The model using this "imperfect"
information was termed the Markov model with imperfect regime classification
information (IRCI). The monthly dollar per British pound rate over the period
from March 1976 to December 1987 was examined, and it was shown that although
the IRCI model had a higher forecasting performance than the simple Markov
model, its forecasts were not superior to those offered by the zero-drift
random walk.
Filardo (1994) proposed another scheme to improve the forecasting
performance of the Markov model in business cycle analysis. The transition
probabilities of the Markov model were allowed to evolve as a logistic function
of economic indicators, and were called time-varying transition probabilities
(TVTP). Such flexibility would provide valuable additional information about
whether a particular business cycle phase had occurred and whether a turning
point was imminent. The idea was similar to that of the IRCI model in that the
indicators helped to identify the current regime (expansion or contraction)
which the economy was in. The US business cycles over the period from January
1948 to August 1992 were considered, and they were measured by the growth rate
in national output. The logarithmic first difference of seasonally adjusted
total industrial output from the Federal Reserve was chosen as a proxy for the
growth rate. Data on the Composite Index of Eleven Leading Indicators, the
Stock and Watson Experimental Index of Seven Leading Indicators, the Standard
and Poor's Composite Stock Index and so on were used as the economic
indicators. It was found that the TVTP model offered more satisfactory
forecasts than the Markov model and time series models such as the ARIMA and
VAR.
CHAPTER 3
METHODOLOGY
The formulation of the TVTP model is presented in this chapter, and the
Markov models can be constructed by following the same rationale. The details of
the Markov models are described in Hamilton (1990,1993 and 1994).
Formulation of the TVTP Model
Following Diebold, Lee and Weinbach (1994), let the regime or state that a
given process is in at time t be indexed by an unobserved random variable
$s_t$, which takes on the value zero or one, and assume that changes in an
exchange rate, $e_t$, follow such a process. When $s_t = 0$, the observed
change $e_t$ is assumed to be drawn from the $N(\mu_0, \sigma_0^2)$
distribution, whereas if $s_t = 1$, $e_t$ is drawn from the
$N(\mu_1, \sigma_1^2)$ distribution. Under this specification, the density
function of $e_t$ conditional on $s_t = i$, $i = 0$ or 1, at time t is given by
$$
p(e_t \mid s_t = i; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma_i}
\exp\left[-\frac{(e_t - \mu_i)^2}{2\sigma_i^2}\right]
\tag{1}
$$
where $\theta$ is a vector of distribution parameters which consists of the
means ($\mu_0$, $\mu_1$) and variances ($\sigma_0^2$, $\sigma_1^2$) associated
with the two regimes.
$s_t$ is assumed to follow a first-order, two-state Markov process with the
transition probability matrix shown as follows, in which the rows are indexed
by $s_{t-1}$ and the columns by $s_t$.
$$
P_t =
\begin{bmatrix}
p_t^{00} & p_t^{01} \\
p_t^{10} & p_t^{11}
\end{bmatrix}
=
\begin{bmatrix}
\dfrac{\exp(x_{t-1}'\alpha)}{1 + \exp(x_{t-1}'\alpha)} & 1 - p_t^{00} \\
1 - p_t^{11} & \dfrac{\exp(x_{t-1}'\beta)}{1 + \exp(x_{t-1}'\beta)}
\end{bmatrix}
\tag{2}
$$
where $p_t^{ik}$ is the conditional probability $p(s_t = k \mid s_{t-1} = i)$
at time t for $i, k = 0$ or 1 and $x_{t-1}$ is an $n \times 1$ vector of market
fundamentals. The probabilities listed above are assumed to evolve as a
logistic function of such fundamentals. It is obvious, but worth noting, that
when the last $n-1$ terms of the $n \times 1$ probability parameter vectors
$\alpha$ and $\beta$ are set to zero, the transition probabilities are time
invariant, so that $p_t^{00}$ and $p_t^{11}$ are simply constants and the model
reduces to the simple Markov model.
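As an illustration of (1) and (2), the following minimal Python sketch evaluates the regime densities and the logistic transition probabilities. The function names and all numerical values are hypothetical and are not taken from the thesis; it is assumed here that the first element of $x_{t-1}$ is a constant term, so that zeroing the remaining coefficients gives constant probabilities.

```python
import math

def normal_density(e, mu, sigma2):
    """Equation (1): N(mu, sigma2) density evaluated at the observed change e."""
    return math.exp(-(e - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def logistic(z):
    """exp(z) / (1 + exp(z)), arranged to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def transition_matrix(x_prev, alpha, beta):
    """Equation (2): rows are indexed by s_{t-1}, columns by s_t."""
    p00 = logistic(sum(a * x for a, x in zip(alpha, x_prev)))
    p11 = logistic(sum(b * x for b, x in zip(beta, x_prev)))
    return [[p00, 1.0 - p00], [1.0 - p11, p11]]

# Hypothetical inputs: x_{t-1} = (constant, interest rate differential).
x_prev = [1.0, 0.5]
P = transition_matrix(x_prev, alpha=[2.0, -1.0], beta=[1.5, 0.8])

# With the slope coefficients set to zero, the matrix no longer depends on
# the fundamentals, which is the simple Markov model as a nested case.
P_const_a = transition_matrix([1.0, 0.5], alpha=[2.0, 0.0], beta=[1.5, 0.0])
P_const_b = transition_matrix([1.0, 9.9], alpha=[2.0, 0.0], beta=[1.5, 0.0])
```

Each row of the returned matrix sums to one by construction, and `P_const_a` equals `P_const_b` even though the fundamentals differ.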
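In connection with (1), let $\lambda = [\theta', \alpha', \beta', \rho]'$ be a
$(2n+5) \times 1$ vector of population parameters, where $\rho$ is the
unconditional probability of $s_1 = 0$. This vector characterizes the joint
probability density or sample likelihood function
$p(e_T, e_{T-1}, \ldots, e_2, e_1 \mid x_T, x_{T-1}, \ldots, x_2, x_1; \lambda)$
of the observed data, and our task is then to find the value of the parameter
$\lambda$ that maximizes this function.
$p(e_T, e_{T-1}, \ldots, e_2, e_1 \mid x_T, x_{T-1}, \ldots, x_2, x_1; \lambda)$
is defined as
$$
p(e_T, e_{T-1}, \ldots, e_2, e_1 \mid x_T, x_{T-1}, \ldots, x_2, x_1; \lambda)
= \sum_{s_T=0}^{1} \sum_{s_{T-1}=0}^{1} \cdots \sum_{s_1=0}^{1}
p(E_T, S_T \mid X_T; \lambda)
\tag{3}
$$
and
$$
\begin{aligned}
p(E_T, S_T \mid X_T; \lambda)
&= p(e_1, s_1 \mid x_1; \lambda) \prod_{t=2}^{T}
   p(e_t, s_t \mid E_{t-1}, S_{t-1}, X_{t-1}; \lambda) \\
&= p(e_1 \mid s_1; \theta)\, p(s_1) \prod_{t=2}^{T}
   p(e_t \mid s_t; \theta)\, p(s_t \mid s_{t-1}, x_{t-1}; \alpha, \beta) \\
&= p(e_1 \mid s_1; \theta)\, p(s_1) \prod_{t=2}^{T}
   p(e_t \mid s_t; \theta)\, p_t^{s_{t-1} s_t}
\end{aligned}
\tag{4}
$$
where $E_t$, $S_t$ and $X_t$ are vectors or matrices containing the respective
past information on e, s and x through date t.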
In practice, however, the numerical maximization of (3) using ordinary
differentiation methods to evaluate the slope of the sample likelihood is
computationally difficult. Therefore, the EM algorithm is used to find the
optimal parameter values. The EM algorithm is an iterative method that
maximizes the expected log likelihood conditional on the observed data, instead
of the sample likelihood (3). It is straightforward to show (see Appendix 1)
that the sequence of estimates which maximize the expected log likelihood at
each iteration converges to the maximum likelihood estimates of the sample
likelihood. The expected log likelihood at the j-th iteration is defined as
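$$
\begin{aligned}
& E\left[\log p(E_T, S_T \mid X_T; \lambda) \mid E_T, X_T; \lambda^{j-1}\right] \\
&= \sum_{S_T} \left[\log p(E_T, S_T \mid X_T; \lambda)\right]
   p(S_T \mid E_T, X_T; \lambda^{j-1}) \\
&= p(s_1 = 0 \mid E_T, X_T; \lambda^{j-1})
   \left[\log p(e_1 \mid s_1 = 0; \theta) + \log \rho\right] \\
&\quad + \left(1 - p(s_1 = 0 \mid E_T, X_T; \lambda^{j-1})\right)
   \left[\log p(e_1 \mid s_1 = 1; \theta) + \log(1 - \rho)\right] \\
&\quad + \sum_{t=2}^{T} \Bigg\{ \sum_{i=0}^{1}
   p(s_t = i \mid E_T, X_T; \lambda^{j-1}) \log p(e_t \mid s_t = i; \theta) \\
&\qquad\qquad + \sum_{i=0}^{1} \sum_{k=0}^{1}
   p(s_t = k, s_{t-1} = i \mid E_T, X_T; \lambda^{j-1}) \log p_t^{ik} \Bigg\}
\end{aligned}
\tag{5}
$$
where $p(s_t = i \mid E_T, X_T; \lambda^{j-1})$ and
$p(s_t = k, s_{t-1} = i \mid E_T, X_T; \lambda^{j-1})$ are the smoothed state
probabilities conditional on "the best guess" of $\lambda$ at iteration j-1.
Hence, if we know the smoothed probabilities, we can maximize (5) immediately
with respect to the parameter vector $\lambda$. However, a central question is
raised: how do we determine the smoothed probabilities?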
Filtered and Smoothed Probabilities
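The smoothed probabilities can be calculated after the filtered probabilities
have been found. Let
$\phi^i_{t|t} = p(s_t = i \mid E_t, X_t; \lambda^{j-1})$ be the filtered
probability of $s_t = i$ for $i = 0$ or 1 and
$\phi^k_{t+1|t} = p(s_{t+1} = k \mid E_t, X_t; \lambda^{j-1})$ be the filtered
probability of $s_{t+1} = k$ for $k = 0$ or 1; both are estimated on the time t
information set. Then $\phi^i_{t|t}$ is called a posterior probability and
interpreted as
$$
\begin{aligned}
\phi^i_{t|t}
&= \frac{p(e_t, s_t = i \mid E_{t-1}, X_{t-1}; \lambda^{j-1})}
        {p(e_t \mid E_{t-1}, X_{t-1}; \lambda^{j-1})} \\
&= \frac{p(e_t \mid s_t = i; \theta^{j-1})\,
         p(s_t = i \mid E_{t-1}, X_{t-1}; \lambda^{j-1})}
        {p(e_t \mid E_{t-1}, X_{t-1}; \lambda^{j-1})} \\
&= \frac{p(e_t \mid s_t = i; \theta^{j-1})\, \phi^i_{t|t-1}}
        {p(e_t \mid E_{t-1}, X_{t-1}; \lambda^{j-1})}
\end{aligned}
\tag{6}
$$
where
$$
p(e_t \mid E_{t-1}, X_{t-1}; \lambda^{j-1})
= \sum_{i=0}^{1} p(e_t \mid s_t = i; \theta^{j-1})\, \phi^i_{t|t-1}
\tag{7}
$$
and $\phi^k_{t+1|t}$ is called a prior probability and interpreted as
$$
\phi^k_{t+1|t} = \sum_{i=0}^{1} p^{ik}_{t+1}\, \phi^i_{t|t}
\tag{8}
$$
where $p(e_t \mid s_t = i; \theta^{j-1})$ is given by (1).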
It is more convenient to express the derivation of the filtered probabilities
in matrix form. Let
$\Phi_{t|t} = [\phi^0_{t|t}, \phi^1_{t|t}]'$,
$\Phi_{t+1|t} = [\phi^0_{t+1|t}, \phi^1_{t+1|t}]'$ and
$\eta_t = [p(e_t \mid s_t = 0; \theta^{j-1}),\; p(e_t \mid s_t = 1; \theta^{j-1})]'$
be the matrices considered here. Then
$$
\Phi_{t|t} = \frac{\Phi_{t|t-1} * \eta_t}{\mathbf{1}'(\Phi_{t|t-1} * \eta_t)}
\tag{9}
$$
and
$$
\Phi_{t+1|t} = P_{t+1}'\, \Phi_{t|t}
\tag{10}
$$
where $P_{t+1}$ is the transition probability matrix as in (2), and
$\mathbf{1}$ is a column vector whose elements are unity. The symbol *
denotes element-by-element multiplication.
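The recursion in (9) and (10) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the two-observation data and the constant transition matrix are hypothetical, and the transition matrices are assumed to be stored with rows indexed by the current state, as in (2).

```python
import math

def hamilton_filter(densities, transitions, prior_first):
    """
    densities[t]   : [p(e_t | s_t=0), p(e_t | s_t=1)], the vector eta_t
    transitions[t] : matrix P_{t+1} for the move out of period t (rows = s_t)
    prior_first    : [rho, 1 - rho], the prior for the first observation
    Returns the filtered vectors Phi_{t|t}, the predicted vectors Phi_{t+1|t}
    and the sample log likelihood accumulated from equation (7).
    """
    filtered, predicted, loglik = [], [], 0.0
    prior = list(prior_first)
    for eta, P in zip(densities, transitions):
        joint = [prior[i] * eta[i] for i in range(2)]      # Phi_{t|t-1} * eta_t
        marginal = sum(joint)                              # equation (7)
        posterior = [j / marginal for j in joint]          # equation (9)
        # Equation (10): the k-th element is sum_i p^{ik} * phi^i_{t|t}.
        prior = [sum(P[i][k] * posterior[i] for i in range(2)) for k in range(2)]
        filtered.append(posterior)
        predicted.append(prior)
        loglik += math.log(marginal)
    return filtered, predicted, loglik

# Hypothetical two-period example with a constant transition matrix.
P = [[0.9, 0.1], [0.2, 0.8]]
densities = [[0.2, 0.1], [0.1, 0.3]]
filtered, predicted, loglik = hamilton_filter(densities, [P, P], prior_first=[0.5, 0.5])
```

As a by-product, the sum of the logged denominators from (7) gives the log of the sample likelihood (3), so the filter avoids the explicit sum over all state paths.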
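Using (9) and (10), one can calculate the values of the filtered probabilities
$\Phi_{t|t}$ and $\Phi_{t+1|t}$ for each time t by iteration in the sample.
Given the filtered probabilities, the smoothed probabilities can be determined
by following the algorithm developed by Kim (1993). Let
$\phi^i_{t|T} = p(s_t = i \mid E_T, X_T; \lambda^{j-1})$ be the smoothed
probability of $s_t = i$ on the time T information set and defined as
$$
\begin{aligned}
\phi^i_{t|T}
&= \sum_{k=0}^{1} p(s_t = i, s_{t+1} = k \mid E_T, X_T; \lambda^{j-1}) \\
&= \sum_{k=0}^{1} p(s_t = i \mid s_{t+1} = k, E_T, X_T; \lambda^{j-1})\,
   p(s_{t+1} = k \mid E_T, X_T; \lambda^{j-1}) \\
&= \sum_{k=0}^{1} p(s_t = i \mid s_{t+1} = k, E_t, X_t; \lambda^{j-1})\,
   \phi^k_{t+1|T} \\
&= \sum_{k=0}^{1}
   \frac{p(s_t = i, s_{t+1} = k \mid E_t, X_t; \lambda^{j-1})}
        {p(s_{t+1} = k \mid E_t, X_t; \lambda^{j-1})}\, \phi^k_{t+1|T} \\
&= \phi^i_{t|t} \sum_{k=0}^{1}
   \frac{p^{ik}_{t+1}\, \phi^k_{t+1|T}}{\phi^k_{t+1|t}}
\end{aligned}
\tag{11}
$$
provided that
$p(s_t = i \mid s_{t+1} = k, E_T, X_T; \lambda^{j-1}) =
p(s_t = i \mid s_{t+1} = k, E_t, X_t; \lambda^{j-1})$.¹
The algorithm can be made clear and compact in matrix form as
$$
\Phi_{t|T} = \Phi_{t|t} * \left\{ P_{t+1}
\left[ \Phi_{t+1|T} \,(\div)\, \Phi_{t+1|t} \right] \right\}
\tag{12}
$$
where $\Phi_{t|T} = [\phi^0_{t|T}, \phi^1_{t|T}]'$. The symbol $(\div)$
denotes element-by-element division. Therefore, the smoothed probabilities can
be found by iterating on (12) backward for t = T-1, T-2, ..., 2, 1. It is worth
noting that the algorithm starts with $\Phi_{T|T}$, which is obtained from (9)
for t = T.
¹The details of the proof can be found in Hamilton (1994).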
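The backward recursion for the smoothed probabilities can be sketched as follows. The function name and the inputs are hypothetical: the filtered and predicted vectors below are taken as given for a two-period example with a constant transition matrix (rows indexed by the current state) and are not results from the thesis.

```python
def kim_smoother(filtered, predicted, transitions):
    """
    filtered[t]    : Phi_{t|t}
    predicted[t]   : Phi_{t+1|t}
    transitions[t] : matrix P_{t+1} (rows = s_t, columns = s_{t+1})
    Returns the smoothed vectors Phi_{t|T} for every period.
    """
    T = len(filtered)
    smoothed = [None] * T
    smoothed[T - 1] = list(filtered[T - 1])   # the recursion starts from Phi_{T|T}
    for t in range(T - 2, -1, -1):
        # Element-by-element division Phi_{t+1|T} (/) Phi_{t+1|t}.
        ratio = [smoothed[t + 1][k] / predicted[t][k] for k in range(2)]
        P = transitions[t]
        # phi^i_{t|T} = phi^i_{t|t} * sum_k p^{ik}_{t+1} * ratio_k
        smoothed[t] = [
            filtered[t][i] * sum(P[i][k] * ratio[k] for k in range(2))
            for i in range(2)
        ]
    return smoothed

# Hypothetical inputs for two periods.
P = [[0.9, 0.1], [0.2, 0.8]]
filtered = [[2.0 / 3.0, 1.0 / 3.0], [0.4, 0.6]]
predicted = [[2.0 / 3.0, 1.0 / 3.0], [0.44, 0.56]]
smoothed = kim_smoother(filtered, predicted, [P, P])
```

In this example the first-period probability of state 0 falls from 2/3 (filtered) to 0.48 (smoothed), because the second observation makes state 1 more likely in hindsight.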
Maximization of the Expected Log-likelihood
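After the smoothed probabilities are obtained, the expected log likelihood
(5) is maximized directly with respect to the parameter $\lambda$. The
resulting 2n+5 expressions for the likelihood estimates are listed as follows.
$$
\mu_i^{j} = \frac{\sum_{t=1}^{T} e_t\, p(s_t = i \mid E_T, X_T; \lambda^{j-1})}
                 {\sum_{t=1}^{T} p(s_t = i \mid E_T, X_T; \lambda^{j-1})}
\tag{13}
$$
$$
(\sigma_i^{2})^{j} =
\frac{\sum_{t=1}^{T} (e_t - \mu_i^{j})^2\, p(s_t = i \mid E_T, X_T; \lambda^{j-1})}
     {\sum_{t=1}^{T} p(s_t = i \mid E_T, X_T; \lambda^{j-1})}
\tag{14}
$$
for i = 0, 1
$$
\rho^{j} = p(s_1 = 0 \mid E_T, X_T; \lambda^{j-1})
\tag{15}
$$
$$
\begin{aligned}
\alpha^{j} ={}& \left[ \sum_{t=2}^{T} x_{t-1}\,
   p(s_{t-1} = 0 \mid E_T, X_T; \lambda^{j-1})\,
   \frac{\partial p_t^{00}}{\partial \alpha'} \right]^{-1} \\
& \times \sum_{t=2}^{T} x_{t-1} \left\{
   p(s_t = 0, s_{t-1} = 0 \mid E_T, X_T; \lambda^{j-1})
   - p(s_{t-1} = 0 \mid E_T, X_T; \lambda^{j-1})
   \left[ p_t^{00} - \frac{\partial p_t^{00}}{\partial \alpha'}\, \alpha^{j-1} \right]
   \right\}
\end{aligned}
\tag{16}
$$
$$
\begin{aligned}
\beta^{j} ={}& \left[ \sum_{t=2}^{T} x_{t-1}\,
   p(s_{t-1} = 1 \mid E_T, X_T; \lambda^{j-1})\,
   \frac{\partial p_t^{11}}{\partial \beta'} \right]^{-1} \\
& \times \sum_{t=2}^{T} x_{t-1} \left\{
   p(s_t = 1, s_{t-1} = 1 \mid E_T, X_T; \lambda^{j-1})
   - p(s_{t-1} = 1 \mid E_T, X_T; \lambda^{j-1})
   \left[ p_t^{11} - \frac{\partial p_t^{11}}{\partial \beta'}\, \beta^{j-1} \right]
   \right\}
\end{aligned}
\tag{17}
$$
Because the transition probabilities are non-linear functions of $\alpha$ and
$\beta$, the derivation of (16) and (17) has made use of a linear
approximation of $p_t^{00}$ and $p_t^{11}$ by a first-order Taylor series
expansion around $\alpha^{j-1}$ and $\beta^{j-1}$ respectively, at which
$p_t^{00}$, $p_t^{11}$ and their derivatives are evaluated.²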
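The closed-form part of this maximization step, namely the updates of the regime means, the regime variances and the initial-state probability, amounts to weighted averages with the smoothed probabilities as weights. A minimal Python sketch follows; the function name and the data are hypothetical, and the linearized updates for $\alpha$ and $\beta$ are deliberately left out.

```python
def m_step_moments(e, smoothed):
    """
    e        : list of observed changes e_1, ..., e_T
    smoothed : smoothed[t] = [p(s_t=0 | all data), p(s_t=1 | all data)]
    Returns the updated means, variances and rho (probability that s_1 = 0).
    """
    means, variances = [], []
    for i in range(2):
        weights = [row[i] for row in smoothed]
        total = sum(weights)
        # Probability-weighted mean of the observations for regime i.
        mu = sum(w * x for w, x in zip(weights, e)) / total
        # Probability-weighted variance around the updated mean.
        var = sum(w * (x - mu) ** 2 for w, x in zip(weights, e)) / total
        means.append(mu)
        variances.append(var)
    rho = smoothed[0][0]   # smoothed probability of state 0 in the first period
    return means, variances, rho

# Hypothetical data in which the regimes are known with certainty, so the
# updates reduce to ordinary sample moments within each regime.
e = [1.0, -1.0, 2.0, -2.0]
smoothed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
means, variances, rho = m_step_moments(e, smoothed)
```

When the smoothed probabilities are all zero or one, as in this toy input, the formulas collapse to the sample mean and variance of the observations assigned to each regime.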
A new set of parameter estimates obtained from (13) to (17) is then used to
re-calculate the smoothed probabilities, and then the estimates again. This
procedure will continue until a stopping criterion is met. The difference between
two successive estimates below 10"^ is chosen as such criterion, and the final set
of estimates is treated as the maximum likelihood estimates of the TVTP model.
²The derivation of expressions (16) and (17) is given in Appendix 2.
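The iteration and its stopping rule can be sketched generically in Python. Everything named here is hypothetical: `update` stands in for one full pass of smoothing followed by re-estimation, the toy update function only mimics a converging sequence of estimates, and the tolerance is an illustrative value rather than the one used in the thesis.

```python
def iterate_until_converged(update, start, tol=1e-8, max_iter=1000):
    """
    Repeats `update` until the largest absolute change between two
    successive parameter vectors falls below `tol`.
    Returns the final estimates and the number of iterations used.
    """
    current = list(start)
    for iteration in range(1, max_iter + 1):
        new = update(current)
        change = max(abs(a - b) for a, b in zip(new, current))
        if change < tol:
            return new, iteration
        current = new
    raise RuntimeError("no convergence within max_iter iterations")

# Toy stand-in for one EM pass: each call moves halfway to a fixed point.
fixed_point = [1.0, 2.0]

def toy_update(params):
    return [0.5 * (p + f) for p, f in zip(params, fixed_point)]

estimates, n_iter = iterate_until_converged(toy_update, start=[0.0, 0.0])
```

In an actual estimation, `update` would wrap the filter, the smoother and the updating formulas, and the returned vector would play the role of the maximum likelihood estimates.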
CHAPTER 4
EMPIRICAL RESULTS
The exchange rates examined in this paper are the US dollar against the
German mark, the Japanese yen and the British pound. Monetary theories state
that the exchange rate could be affected by such exogenous variables as the
difference in money supply changes, the difference in output changes, the
difference in interest rates and the difference in inflation rates between the
corresponding countries. Therefore, various combinations of the lagged values
of these variables have been tried as the market fundamentals that influence
the transition probabilities. The interest rate differential ($R_{t-1}$),
proxied by the treasury bill rate differential or the call money rate
differential, on its own does best in estimation and forecasting, and its
results are reported in this thesis (other combinations were also tried). In
mathematical terms, this functional relation can be expressed as
$p_t^{ik} = f(x_{t-1}) = f(R_{t-1})$. The data on exchange rates and market
fundamentals are monthly series obtained from the International Financial
Statistics (IFS). The sample period begins in January 1976 for the German mark
and in September 1973 for the Japanese yen and the British pound, and ends in
January 1988. The starting point of the sample period for each currency depends
on the availability of data.
The Simple 2-state Markov Switching Model
The results are summarized in Table 1, and it can be seen that some
estimates of the means are significant at the 5 or 10 percent level.³
³The estimates of the unconditional probability $\rho$ for each model are suppressed from Tables 1 and 2 because they are not the main concern of this paper.
Table 1. Parameter estimates for the 2-state Markov and the TVTP model. Monthly, 1976:1-1988:1 for German mark, 1973:9-1988:1 for Japanese yen and British pound
Exch. rate    German mark                Japanese yen               British pound
Parameter     Markov (2-state)   TVTP    Markov (2-state)   TVTP    Markov (2-state)   TVTP
Panel C3. The Smoothed Probabilities of the TVTP Model for British Pound
Figure 1. The Smoothed Probabilities of the 2-state Markov and the TVTP Models for Each Currency
Two important points should be noted here. Although the transition
probabilities of the TVTP model evolve over time as shown in Figure 2, almost
all coefficients on the market fundamentals are not statistically different
from zero. Yet under the null hypothesis that the simple Markov model is the
true model, the likelihood ratio (LR) test rejects this hypothesis at the 1
percent significance level in favor of the TVTP model for the German mark and
the Japanese yen. These seemingly contradictory results may indicate that
although the market fundamentals can, to a large extent, help to find the true
parameter values of $\mu_0$, $\mu_1$, $\sigma_0^2$ and $\sigma_1^2$, they
cannot give any significant indication of the current and future regimes
because of the high variability of the estimated $\alpha$ and $\beta$.
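For reference, a likelihood ratio test of the simple Markov model against the TVTP model can be sketched as below. The sketch rests on assumptions that are not stated in this part of the thesis: the statistic is taken to be twice the difference in maximized log likelihoods, it is compared with a chi-square distribution whose degrees of freedom equal the 2(n-1) zero restrictions on $\alpha$ and $\beta$, and the log-likelihood values are invented for illustration. The closed-form tail probability used here is valid only for an even number of degrees of freedom, which always holds for 2(n-1).

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with an even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

def likelihood_ratio_test(loglik_restricted, loglik_unrestricted, n):
    """
    loglik_restricted   : maximized log likelihood of the simple Markov model
    loglik_unrestricted : maximized log likelihood of the TVTP model
    n                   : length of x_{t-1}, including the constant term
    Returns the LR statistic and its p-value.
    """
    lr = 2.0 * (loglik_unrestricted - loglik_restricted)
    df = 2 * (n - 1)   # n-1 slope terms set to zero in each of alpha and beta
    return lr, chi2_sf_even(lr, df)

# Hypothetical log likelihoods, with x_{t-1} = (constant, R_{t-1}), so n = 2.
lr, p_value = likelihood_ratio_test(-410.0, -404.0, n=2)
reject_at_1_percent = p_value < 0.01
```

With two restrictions, the 1 percent critical value is about 9.21, so the hypothetical statistic of 12.0 would lead to rejection of the simple Markov model.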
Panel A1. The Transition Probability of the TVTP Model for German Mark (horizontal axis: Month, 1976.01 to 1988.01)