A Generalized Asymmetric Student-t Distribution

    with Application to Financial Econometrics

    Dongming Zhu

    HSBC School of Business, Peking University

    John W. Galbraith

    Department of Economics, McGill University

    April 14, 2009

    Abstract

This paper proposes a new class of asymmetric Student-t (AST) distributions, investigates its properties, gives procedures for estimation, and indicates applications in financial econometrics. We derive analytical expressions for the cdf, quantile function, moments, and quantities useful in financial econometric applications such as the expected shortfall. A stochastic representation of the distribution is also given. Although the AST density does not satisfy the usual regularity conditions for maximum likelihood estimation, we establish consistency, asymptotic normality and efficiency of ML estimators and derive an explicit analytical expression for the asymptotic covariance matrix. A Monte Carlo study indicates generally good finite-sample conformity with these asymptotic properties.

    JEL classification codes: C13, C16

Key words: asymmetric distribution, expected shortfall, maximum likelihood estimation.

The support of the Social Sciences and Humanities Research Council of Canada (SSHRC) and the Fonds québécois de la recherche sur la société et la culture (FQRSC) is gratefully acknowledged.


    1 Introduction

The Student-t distribution is commonly used in finance and risk management, particularly to model conditional asset returns for which the tails of the normal distribution are almost invariably found to be too thin. For example, Bollerslev (1987) used the Student-t to model the distribution of foreign exchange rate returns; Mittnik, Rachev and Paolella (1998) fitted a return distribution using a number of parametric distributions including the Student-t, and found that the partially asymmetric Weibull, Student-t and the asymmetric stable distributions provide the best fit according to various measures. Recent applications include Alberg et al. (2008) and Franses et al. (2008).

Hansen (1994) was the first to consider a skewed Student-t distribution to model skewness in conditional distributions of financial returns. Since then, several skew extensions of the Student-t distribution have been proposed for financial and other applications; see for example Fernandez and Steel (1998), Theodossiou (1998), Branco and Dey (2001), Bauwens and Laurent (2002), Jones and Faddy (2003), Sahu et al. (2003), Azzalini and Capitanio (2003), Aas and Haff (2006) and others.

All but two of these skew t-type distributions have two tails with identical polynomial rates of decay. The first of the exceptions is the skew extension of Jones and Faddy (2003), which has two tail parameters to control the left and right tail behavior, respectively, but does not embody a third that allows skewness to change independently of the tail parameters. The second is due to Aas and Haff (2006), who argued for a special case of the generalized hyperbolic (GH) distribution, called the GH Student-t distribution, in which one tail is determined by a polynomial rate, while the other has exponential behavior. For detailed descriptions of various skew Student-t type distributions, refer to the review in Aas and Haff (2006). However, in general, a skewness parameter mainly controls the asymmetry of the central part of a distribution. Therefore a class of generalized asymmetric Student-t (AST) distributions which has one skewness parameter and two tail parameters offers the potential to improve our ability to fit and forecast empirical data in the tail regions which are critical to risk management and other financial econometric applications. In this paper, we propose such a class of distributions, describe estimation methods and investigate properties of the distribution and of the estimators.

There are various methodologies for generation of a skewed Student-t distribution. One is the two-piece method; Hansen (1994) used this method to propose the first skew extension to the Student-t. More generally, Fernandez and Steel (1998) introduced a skewness parameter to any univariate pdf which is unimodal and symmetric, resulting in a skewed version of the Student-t equivalent to that of Hansen (1994); Bauwens and Laurent (2002) generalized the procedure used in Fernandez and Steel (1998) to the multivariate case. A second methodology is the perturbation approach of Azzalini and Capitanio (2003), which can generate the multivariate skew elliptical distributions proposed by Branco and Dey (2001) and Sahu et al. (2003) using the conditioning method.¹ In this paper we will extend the two-piece method to allow the additional parameter.

Allowing an additional parameter offers the potential to fit more subtle features of the distribution than is possible with two-parameter versions, with the attendant potential for better descriptions of tail phenomena, and better predictions of quantities such as expected shortfall which depend on the shape of the tail. Of course, relatively large sample sizes may be necessary in order to realize this potential: even if a three-parameter form provides in principle a better description of a given type of data, the two-parameter approximation may not be detectably poorer in a finite sample. However, we show here by simulation that the parameters can be distinguished in realistic sample sizes, and in a companion empirical study (Zhu and Galbraith 2009) we show that improved fit and forecast performance can be observed in financial return data.

The paper is organized as follows. Section 2 gives the definition of the AST distribution, and section 3 provides an interpretation of the parameters and gives some properties such as a stochastic representation and analytical expressions for the cdf, quantiles, moments, value at risk and expected shortfall. In section 4 we establish consistency and asymptotic normality of the MLE, and section 5 provides some finite-sample Monte Carlo results. Technical results and proofs are collected in the appendices.

    2 Definition of the AST Distribution

The asymmetric Student-t (AST) distribution proposed in this paper is defined as follows. Its standard (location parameter zero, scale parameter unity) probability density function has the form

f_AST(y; α, ν₁, ν₂) =
  (α/α*) K(ν₁) [1 + (1/ν₁)(y/(2α*))²]^(−(ν₁+1)/2),              y ≤ 0,
  ((1−α)/(1−α*)) K(ν₂) [1 + (1/ν₂)(y/(2(1−α*)))²]^(−(ν₂+1)/2),  y > 0,    (1)

where α ∈ (0, 1) is the skewness parameter, ν₁ > 0 and ν₂ > 0 are the left and right tail parameters respectively, K(ν) ≡ Γ((ν+1)/2)/[√(πν) Γ(ν/2)] (where Γ(·) is the gamma function), and α* is defined as

α* = αK(ν₁)/[αK(ν₁) + (1−α)K(ν₂)].    (2)

¹ By using the conditioning method, Branco and Dey (2001) and Sahu et al. (2003) construct two different classes of multivariate skew Student-t distributions, which however coincide in the univariate case.

Denoting by μ and σ the location (center) and scale parameters, respectively, the general form of the AST density is expressed as (1/σ) f_AST((y − μ)/σ; α, ν₁, ν₂). Note that

(α/α*) K(ν₁) = ((1−α)/(1−α*)) K(ν₂) = αK(ν₁) + (1−α)K(ν₂) ≡ B.    (3)
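As a concrete check on the definition, the density (1)-(2) is straightforward to evaluate numerically. The following is a minimal sketch (our own illustration, not the authors' code; the names `K` and `ast_pdf` are ours):

```python
import math

def K(nu):
    # K(nu) = Gamma((nu+1)/2) / [sqrt(pi*nu) * Gamma(nu/2)], as in (1)
    return math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))

def ast_pdf(y, alpha, nu1, nu2):
    """Standard AST density (1): location 0, scale 1."""
    a = alpha * K(nu1) / (alpha * K(nu1) + (1 - alpha) * K(nu2))  # alpha*, eq. (2)
    if y <= 0:
        return (alpha / a) * K(nu1) * (1 + (y / (2 * a)) ** 2 / nu1) ** (-(nu1 + 1) / 2)
    return ((1 - alpha) / (1 - a)) * K(nu2) * (1 + (y / (2 * (1 - a))) ** 2 / nu2) ** (-(nu2 + 1) / 2)
```

At y = 0 both branches equal B = αK(ν₁) + (1−α)K(ν₂), which is exactly the continuity property recorded in (3).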

The AST density (1) is continuous and unimodal with mode at the center, y = μ = 0, and is everywhere differentiable at least once. In the limit as α approaches either 0 or 1, the shape of the density resembles a Student-t truncated at the mode. The parameter α* provides scale adjustments respectively to the left and right parts of the density so as to ensure continuity of the density under changes of the shape parameters (α, ν₁, ν₂).

A new parameterization of a skewed Student-t (SST) distribution is given by letting ν₁ = ν₂ = ν (implying α* = α) in the AST; the general form of its density is

f_SST(y; α, ν, μ, σ) =
  (1/σ) K(ν) [1 + (1/ν)((y − μ)/(2ασ))²]^(−(ν+1)/2),      y ≤ μ;
  (1/σ) K(ν) [1 + (1/ν)((y − μ)/(2(1−α)σ))²]^(−(ν+1)/2),  y > μ.    (4)

This parameterization of the SST is equivalent to those of Hansen (1994) and Fernandez and Steel (1998), but it will provide an interesting new interpretation of the skewness parameter in terms of L_p distances. By reparameterization with α = 1/(1 + γ²) and σ = (γ + 1/γ)/2, the SST (4) becomes that of Fernandez and Steel (1998); letting α = (1 − λ)/2, σ = (1/b)√((ν − 2)/ν) and μ = −a/b, the density (4) becomes that of Hansen (1994, eq. 10). With

α = 1/2, the SST reduces to the general form of the Student-t distribution. The skewed Cauchy and skewed normal distributions are special cases of the SST with ν = 1 and ν = ∞, respectively. By the skewness measure of Arnold and Groeneveld (1995), the SST density is skewed to the right for α < 1/2.

When one of the tail parameters goes to infinity, say ν₂ → ∞, the AST behaves as a Student-t on the left side and as a Gaussian on the right side, implying one heavy tail and one exponential tail. This type of tail behavior is similar to that of the GH Student-t in Aas and Haff (2006). With these two tail parameters the AST can accommodate empirical distributions of daily returns of financial assets that are often skewed and have one heavy tail and one relatively thin tail. A potential disadvantage of the AST, compared with the GH Student-t and that of Jones and Faddy (2003), is the fact that the density function is differentiable only once at the mode μ; however, this is not an impediment in applications, because we can show that the usual √T asymptotics of the MLE still hold for the AST.

The definition in (1) above is useful in theoretical analysis, but it will sometimes be convenient to re-scale for computations and applications. We can give an alternative definition of the AST density as follows:

f_AST(y; θ) =
  (1/σ) [1 + (1/ν₁)((y − μ)/(2ασK(ν₁)))²]^(−(ν₁+1)/2),      y ≤ μ;
  (1/σ) [1 + (1/ν₂)((y − μ)/(2(1−α)σK(ν₂)))²]^(−(ν₂+1)/2),  y > μ,    (5)

where θ = (α, ν₁, ν₂, μ, σ)^T and μ and σ are the location and scale parameters respectively. From the rescaled AST density (5), we can clearly observe the effects of the shape parameters on the distribution. This form also yields a simple closed-form expression for the information matrix of the maximum likelihood estimator (MLE).

    3 Properties of the AST distribution

3.1 Stochastic representation, moments, and implications of parameters

Suppose that Y is a random variable with the standard AST density (μ = 0, σ = 1). Define a ∧ b ≡ min{a, b} and a ∨ b ≡ max{a, b}, denote by F_t(·; ν) the cdf of the standard Student-t with (possibly non-integer) degrees of freedom ν, and by F_t⁻¹(·; ν) the inverse function of F_t(·; ν). The cdf and quantile function of the AST r.v. Y are given by a straightforward calculation as follows:

F_AST(y) = 2α F_t((y ∧ 0)/(2α*); ν₁) + 2(1−α) [F_t((y ∨ 0)/(2(1−α*)); ν₂) − 1/2]    (6)

and

F_AST⁻¹(p) = 2α* F_t⁻¹((p ∧ α)/(2α); ν₁) + 2(1−α*) F_t⁻¹((p ∨ α + 1 − 2α)/(2(1−α)); ν₂),    (7)

where α* is defined in (2). Note that F_t(0; ν) = 1/2, which implies F_AST(0) = α and F_AST⁻¹(α) = 0 (recall that α is the skewness parameter here). This means that the α-quantile of a standard AST r.v. is always zero. For a general AST with location μ and scale σ, the location μ corresponds to the α-quantile of the general AST random variable. This is the basic interpretation of the parameters α and μ.
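The closed forms (6) and (7) require only the Student-t cdf and quantile function; the following is a sketch using `scipy.stats.t` (our illustration; the function names are ours):

```python
import math
from scipy import stats

def K(nu):
    return math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))

def a_star(alpha, nu1, nu2):
    return alpha * K(nu1) / (alpha * K(nu1) + (1 - alpha) * K(nu2))  # eq. (2)

def ast_cdf(y, alpha, nu1, nu2):
    # eq. (6); stats.t.cdf(., nu) is F_t(.; nu)
    a = a_star(alpha, nu1, nu2)
    return (2 * alpha * stats.t.cdf(min(y, 0) / (2 * a), nu1)
            + 2 * (1 - alpha) * (stats.t.cdf(max(y, 0) / (2 * (1 - a)), nu2) - 0.5))

def ast_quantile(p, alpha, nu1, nu2):
    # eq. (7); stats.t.ppf(., nu) is F_t^{-1}(.; nu)
    a = a_star(alpha, nu1, nu2)
    return (2 * a * stats.t.ppf(min(p, alpha) / (2 * alpha), nu1)
            + 2 * (1 - a) * stats.t.ppf((max(p, alpha) + 1 - 2 * alpha) / (2 * (1 - alpha)), nu2))
```

In particular, `ast_cdf(0, alpha, nu1, nu2)` returns α, which is the interpretation of the skewness parameter noted above.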

A stochastic representation of the AST is useful in studying properties of the distribution, and in simulation studies. Denote by T(ν) a random variable having the standard Student-t distribution with ν degrees of freedom. Consider three independent random variables U, T(ν₁) and T(ν₂), where U ~ U(0, 1), the uniform distribution on [0, 1]. Define

Y = α* |T(ν₁)| [sign(U − α) − 1] + (1 − α*) |T(ν₂)| [sign(U − α) + 1],    (8)

where α* is defined as in (2) and sign(x) = +1 if x > 0, −1 if x ≤ 0. Then Y has the standard AST density (1). Recalling the absolute moments of the standard Student-t,

E|T(ν)|^r = ν^(r/2) Γ((r+1)/2) Γ((ν−r)/2) / [√π Γ(ν/2)],    r ∈ (−1, ν),    (9)

the representation (8) yields the conditional moments

E(|Y|^r | Y ≤ 0) = (2α*)^r E|T(ν₁)|^r,    r ∈ (−1, ν₁),    (10)

E(|Y|^r | Y > 0) = [2(1−α*)]^r E|T(ν₂)|^r,    r ∈ (−1, ν₂),    (11)
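The representation (8) gives an immediate sampling recipe: draw U, T(ν₁) and T(ν₂) independently and combine them. A sketch of such a sampler (ours, assuming NumPy's `standard_t` generator):

```python
import math
import numpy as np

def K(nu):
    return math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))

def ast_sample(n, alpha, nu1, nu2, seed=0):
    """Draw n variates from the standard AST via the representation (8)."""
    rng = np.random.default_rng(seed)
    a = alpha * K(nu1) / (alpha * K(nu1) + (1 - alpha) * K(nu2))  # alpha*, eq. (2)
    u = rng.uniform(size=n)
    t1 = rng.standard_t(nu1, size=n)
    t2 = rng.standard_t(nu2, size=n)
    s = np.where(u > alpha, 1.0, -1.0)  # sign(U - alpha), with sign(0) = -1
    return a * np.abs(t1) * (s - 1) + (1 - a) * np.abs(t2) * (s + 1)
```

With probability α the draw is −2α*|T(ν₁)| (left piece); otherwise it is 2(1−α*)|T(ν₂)| (right piece), so P(Y ≤ 0) = α as required.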

    6

  • 8/13/2019 Asymmetric t

    7/37

and

E|Y|^r = α E(|Y|^r | Y ≤ 0) + (1−α) E(|Y|^r | Y > 0)
       = α(2α*)^r E|T(ν₁)|^r + (1−α)[2(1−α*)]^r E|T(ν₂)|^r,    (12)

where r ∈ (−1, ν₁ ∧ ν₂), implying that the r-th absolute moment of the standard AST r.v. Y can be obtained by combining (9) and (12). Similarly, for any positive integer k < ν₁ ∧ ν₂, the k-th moment is given by

E(Y^k) = α(−2α*)^k E|T(ν₁)|^k + (1−α)[2(1−α*)]^k E|T(ν₂)|^k.    (13)

In particular, the mean and variance of a standard AST random variable are:

E(Y) = 4 [ −α α* ν₁K(ν₁)/(ν₁ − 1) + (1−α)(1−α*) ν₂K(ν₂)/(ν₂ − 1) ]
     = 4B [ −α*² ν₁/(ν₁ − 1) + (1−α*)² ν₂/(ν₂ − 1) ],    (14)

Var(Y) = 4 [ α α*² ν₁/(ν₁ − 2) + (1−α)(1−α*)² ν₂/(ν₂ − 2) ]
       − 16B² [ −α*² ν₁/(ν₁ − 1) + (1−α*)² ν₂/(ν₂ − 1) ]²,    (15)
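Formulas (14)-(15) can be sketched directly (our illustration; the mean requires ν₁, ν₂ > 1 and the variance ν₁, ν₂ > 2):

```python
import math

def K(nu):
    return math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))

def ast_mean_var(alpha, nu1, nu2):
    """Mean and variance of the standard AST from (14)-(15)."""
    B = alpha * K(nu1) + (1 - alpha) * K(nu2)
    a = alpha * K(nu1) / B  # alpha*, eq. (2)
    mean = 4 * B * (-a ** 2 * nu1 / (nu1 - 1) + (1 - a) ** 2 * nu2 / (nu2 - 1))
    ey2 = 4 * (alpha * a ** 2 * nu1 / (nu1 - 2)
               + (1 - alpha) * (1 - a) ** 2 * nu2 / (nu2 - 2))
    return mean, ey2 - mean ** 2
```

In the symmetric case α = 1/2, ν₁ = ν₂ = ν, the mean is zero and the variance reduces to the familiar Student-t value ν/(ν − 2).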

where K(ν) and B are defined respectively in (1) and (3). We see that all the moments can be expressed simply and conveniently in terms of the Gamma function. For the skew Student-t, where ν₁ = ν₂ = ν and α* = α, we can obtain simplified expressions for various moments:

E(Y^k) = (2√ν)^k [ (−1)^k α^(k+1) + (1−α)^(k+1) ] Γ((k+1)/2) Γ((ν−k)/2) / [√π Γ(ν/2)],    (16)

E(|Y|^r) = (2√ν)^r [ α^(r+1) + (1−α)^(r+1) ] Γ((r+1)/2) Γ((ν−r)/2) / [√π Γ(ν/2)],    (17)

where k is a non-negative integer less than ν, and −1 < r < ν.

An interpretation of the parameters can be given by using the conditional L_r-norm deviations,

d_L(r) ≡ [E(|Y − μ|^r | Y ≤ μ)]^(1/r),    d_R(r) ≡ [E(|Y − μ|^r | Y > μ)]^(1/r),    (18)


where r > 0 is any given constant, μ is the location parameter and here μ = 0 for the standard AST r.v. Y. Substituting (10) and (11) into (18) yields

d_L(r) = 2α* (E|T(ν₁)|^r)^(1/r),    d_R(r) = 2(1−α*) (E|T(ν₂)|^r)^(1/r).    (19)

As we know from the alternative definition (5), the parameters ν₁ and ν₂ separately control the shapes of the left and right sides of the AST, so they can be referred to as the left and right shape parameters respectively. We can see this point also from the left and right conditional generalized kurtosis, defined for every r > 0 as

kur_L(r) ≡ [d_L(2r)/d_L(r)]^(2r) = E|T(ν₁)|^(2r) / (E|T(ν₁)|^r)²,    (20)

kur_R(r) ≡ [d_R(2r)/d_R(r)]^(2r) = E|T(ν₂)|^(2r) / (E|T(ν₂)|^r)²,    (21)

where each depends on only one of the shape parameters ν₁ and ν₂. For the case in which ν₁ = ν₂ = ν, the skewness parameter α has an interesting interpretation. Recall that α* = α when ν₁ = ν₂. It follows from (19) that

d_L(r) = 2α (E|T(ν)|^r)^(1/r),    d_R(r) = 2(1−α) (E|T(ν)|^r)^(1/r).    (22)

This implies that the ratio of the probability (α) that Y occurs on the left side of μ to the probability (1−α) that Y occurs on the right side of μ is equal to the ratio of the left deviation d_L(r) to the right deviation d_R(r), i.e., α/(1−α) = d_L(r)/d_R(r). Define d(r) ≡ d_L(r) + d_R(r), the total conditional deviation; then for any r > 0,

α = d_L(r)/d(r) = d_L(r)/[d_L(r) + d_R(r)],    (23)

implying that the skewness parameter α can also be interpreted as the ratio of the left deviation d_L(r) to the total deviation d_L(r) + d_R(r).

By substituting (9) into (20) and (21), the left and right (generalized) kurtosis are given as follows:

kur_L(r, ν₁) = √π Γ(r + 1/2) Γ(ν₁/2 − r) Γ(ν₁/2) / [Γ((r+1)/2) Γ((ν₁−r)/2)]²,    r ∈ (−1/2, ν₁/2), ν₁ > 0,    (24)

and

kur_R(r, ν₂) = √π Γ(r + 1/2) Γ(ν₂/2 − r) Γ(ν₂/2) / [Γ((r+1)/2) Γ((ν₂−r)/2)]²,    r ∈ (−1/2, ν₂/2), ν₂ > 0.    (25)


We can show that both kur_L(r, ν) and kur_R(r, ν) are strictly decreasing in ν and strictly increasing in r (see Lemma 7 in Appendix A). From the expressions for kur_L(r, ν₁) and kur_R(r, ν₂) in (24) and (25), the heaviness of the left (or right) tail of the AST is controlled by only ν₁ (or ν₂). If ν₁ < ν₂, then kur_L(r, ν₁) > kur_R(r, ν₂), implying that the left tail is heavier than the right one; the smaller the value of ν₁ (or ν₂), the heavier the left (or the right) tail. If ν_i > 4, then for r = 2 the left (or right) kurtosis has the simple expression kur_L(2, ν₁) = 3 + 6/(ν₁ − 4) (or kur_R(2, ν₂) = 3 + 6/(ν₂ − 4)).
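The Gamma-function form (24)-(25) and the r = 2 special case are easy to check numerically (a sketch; the name `kur` is ours):

```python
import math

def kur(r, nu):
    """Generalized kurtosis (24)/(25): E|T(nu)|^(2r) / (E|T(nu)|^r)^2, r in (-1/2, nu/2)."""
    g = math.gamma
    return (math.sqrt(math.pi) * g(r + 0.5) * g(nu / 2 - r) * g(nu / 2)
            / (g((r + 1) / 2) * g((nu - r) / 2)) ** 2)
```

For example, `kur(2.0, 6.0)` equals 3 + 6/(6 − 4) = 6, and for fixed r the value falls as ν rises, reflecting the thinner tail.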

    3.2 Value at Risk and Expected Shortfall

The Value at Risk (VaR) and the Expected Shortfall (ES) are two very widely used risk measures, defined for a standard AST random variable Y at a confidence level p, or at a point q in the support of the distribution, as

VaR_AST(p) ≡ F_AST⁻¹(p),    ES_AST(q) ≡ E(Y | Y < q).

We will now show that each of these risk measures can be expressed in terms of the cdf and pdf of the standard Student-t, F_t(·; ν) and f_t(·; ν), with parameter values of ν₁ and ν₂. VaR_AST(p) has been given in (7); we can express ES_AST(q) in terms of F_t(q; ν_i) and ES_t(q; ν_i) (see (40) in Lemma 8 of Appendix A), where ES_t(q; ν) ≡ E(T(ν) | T(ν) < q) is the expected shortfall of a standard Student-t r.v. T(ν) with ν degrees of freedom, and i = 1, 2. Note that ES_t(q; ν) can be simply expressed as

ES_t(q; ν) = −(ν/(ν − 1)) (1 + q²/ν) f_t(q; ν) / F_t(q; ν).    (26)

Then, substituting ES_t(·; ν) into (40), we obtain the expression for ES_AST(q) as follows:

ES_AST(q) = (4B/F_AST(q)) { −α*² (ν₁/(ν₁ − 1)) [1 + (1/ν₁)((q ∧ 0)/(2α*))²]^((1−ν₁)/2)
            − (1−α*)² (ν₂/(ν₂ − 1)) ( [1 + (1/ν₂)((q ∨ 0)/(2(1−α*)))²]^((1−ν₂)/2) − 1 ) },    (27)

where again a ∧ b = min{a, b} and a ∨ b = max{a, b}, and B is defined as in (3). For q < 0, ν₂ vanishes from the expression and existence of the ES requires only ν₁ > 1.
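The Student-t building block (26) can be checked against direct numerical integration of the tail mean (a sketch; the name `es_t` is ours):

```python
from scipy import stats

def es_t(q, nu):
    """Expected shortfall E(T | T < q) of the standard Student-t, eq. (26)."""
    return -(nu / (nu - 1)) * (1 + q * q / nu) * stats.t.pdf(q, nu) / stats.t.cdf(q, nu)
```

Since it is a conditional mean over the region below q, the value always lies below q itself.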


When considering the ES as a function of the confidence level p, by taking q = VaR_AST(p) = F_AST⁻¹(p) we obtain

ES_AST(p) ≡ ES_AST(F_AST⁻¹(p))
  = (4B/p) { −α*² (ν₁/(ν₁ − 1)) [1 + (1/ν₁)(F_t⁻¹((p ∧ α)/(2α); ν₁))²]^((1−ν₁)/2)
    − (1−α*)² (ν₂/(ν₂ − 1)) ( [1 + (1/ν₂)(F_t⁻¹((p ∨ α + 1 − 2α)/(2(1−α)); ν₂))²]^((1−ν₂)/2) − 1 ) }.

    4 Asymptotic Properties of the MLE

We now investigate asymptotic properties and finite-sample performance of ML estimators of the parameters of the AST distribution. In order to obtain a relatively straightforward form of the information matrix of the MLE, we adopt the alternative definition of the AST density given in (5). This density is a parameter transformation (re-scaling) of the original one in (1). For any one-to-one parameter transformation φ = h(θ), the information matrices of the MLEs for θ and φ, denoted by J(φ) and I(θ), can be shown to have the following relationship:

J⁻¹(φ) = ∇h(θ) I⁻¹(θ) [∇h(θ)]^T,

where ∇h(θ) = ∂h(θ)/∂θ^T is the matrix whose element in the i-th row and j-th column is ∂φ_i/∂θ_j, i, j = 1, 2, ..., 5.

Now consider the MLE of the parameters of the AST. Let f(y; θ) be the AST density (5). The true value of θ is denoted by θ₀ = (α₀, ν₁₀, ν₂₀, μ₀, σ₀). Suppose that θ₀ ∈ Θ ≡ {θ | θ = (α, ν₁, ν₂, μ, σ), ν₁, ν₂, σ > 0, α ∈ (0, 1), μ ∈ (−∞, +∞)}, the parameter space. Given an i.i.d. sample y = (y₁, y₂, ..., y_T) of size T, we can write the log-likelihood function l_T(θ | y) ≡ Σ_{t=1}^T ln f(y_t; θ) as follows:

l_T(θ | y) = −T ln σ − ((ν₁ + 1)/2) Σ_{t=1}^T ln[ 1 + (1/ν₁)((y_t − μ)/(2ασK(ν₁)))² ] 1(y_t ≤ μ)
             − ((ν₂ + 1)/2) Σ_{t=1}^T ln[ 1 + (1/ν₂)((y_t − μ)/(2(1−α)σK(ν₂)))² ] 1(y_t > μ).

Note that because the log-likelihood function is differentiable only once at μ, the AST distribution does not satisfy the usual regularity conditions under which the ML estimator has √T asymptotics. In this case, however, we can still establish the usual asymptotics by using Theorem 2.5 in Newey and McFadden (1994, p. 2131) and Theorem 3, as well as its corollary, in Huber (1967). In addition, we obtain the closed-form expression for the Fisher information matrix I(θ). We use the notation H(θ) for the Hessian matrix.
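The log-likelihood above is, for example, what one would hand to a numerical optimizer; a vectorized sketch (ours, not the authors' estimation code):

```python
import math
import numpy as np

def K(nu):
    return math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))

def ast_loglik(theta, y):
    """Log-likelihood l_T(theta | y) for the rescaled density (5).
    theta = (alpha, nu1, nu2, mu, sigma)."""
    alpha, nu1, nu2, mu, sigma = theta
    y = np.asarray(y, dtype=float)
    zl = (y - mu) / (2 * alpha * sigma * K(nu1))        # left standardization
    zr = (y - mu) / (2 * (1 - alpha) * sigma * K(nu2))  # right standardization
    left = np.log1p(zl ** 2 / nu1) * (y <= mu)          # indicator 1(y_t <= mu)
    right = np.log1p(zr ** 2 / nu2) * (y > mu)          # indicator 1(y_t > mu)
    return (-y.size * math.log(sigma)
            - 0.5 * (nu1 + 1) * left.sum()
            - 0.5 * (nu2 + 1) * right.sum())
```

At y_t = μ both logarithmic terms vanish, so l_T = −T ln σ there; the kink of the indicator at μ is exactly the once-only differentiability discussed above.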

Proposition 1 The MLE θ̂_T of θ₀ is consistent, i.e., θ̂_T →_p θ₀.

Proof: See Appendix B.

Proposition 2 The information matrix equality I(θ₀) = −H(θ₀) holds. The elements of the Fisher information matrix, denoted by σ_ij, are

σ_ij ≡ E[ (∂ ln f(y_t; θ₀)/∂θ_i) (∂ ln f(y_t; θ₀)/∂θ_j) ],    (28)

where σ_ij = σ_ji, θ_j represents the j-th element of the parameter vector θ = (α, ν₁, ν₂, μ, σ)^T, and:

σ₁₁ = 3 [ (ν₁+1)/(α(ν₁+3)) + (ν₂+1)/((1−α)(ν₂+3)) ];    σ₁₂ = −1/(ν₁+1) + ν₁D(ν₁)/(ν₁+3);

σ₁₃ = 1/(ν₂+1) − ν₂D(ν₂)/(ν₂+3);    σ₁₄ = −(2/(3σ)) σ₁₁;    σ₁₅ = (2/σ) [ ν₁/(ν₁+3) − ν₂/(ν₂+3) ];

σ₂₂ = (α/2) [ ν₁D²(ν₁)/(ν₁+3) − 2D(ν₁)/(ν₁+1) − D′(ν₁) ];    σ₂₃ = 0;    σ₂₅ = (α/σ) σ₁₂;

σ₂₄ = (1/σ) [ 1/(ν₁+1) − ((ν₁+1)/(ν₁+3)) D(ν₁) ];    σ₃₄ = −(1/σ) [ 1/(ν₂+1) − ((ν₂+1)/(ν₂+3)) D(ν₂) ];

σ₃₃ = ((1−α)/2) [ ν₂D²(ν₂)/(ν₂+3) − 2D(ν₂)/(ν₂+1) − D′(ν₂) ];    σ₃₅ = ((1−α)/σ) σ₁₃;

σ₄₄ = (1/(4σ²)) [ (ν₁+1)/(α(ν₁+3)K²(ν₁)) + (ν₂+1)/((1−α)(ν₂+3)K²(ν₂)) ];    σ₄₅ = −(2/(3σ)) σ₁₅;

σ₅₅ = (2/σ²) [ αν₁/(ν₁+3) + (1−α)ν₂/(ν₂+3) ];    (29)

where all the σ_ij are evaluated at the true values (α₀, ν₁₀, ν₂₀, μ₀, σ₀), K(·) is defined in (1), D(ν) ≡ ψ((ν+1)/2) − ψ(ν/2), and ψ(·) ≡ Γ′(·)/Γ(·) is the digamma function.


    Proof: See Appendix C.

Note that for the SST (ν₁ = ν₂ = ν), its score component ∂ln f/∂ν is the sum of the AST score components ∂ln f/∂ν₁ and ∂ln f/∂ν₂. Thus, by combining the terms of σ_ij involving ν₁ and ν₂, i.e., σ₁₂ + σ₁₃, σ₂₂ + σ₃₃, σ₂₄ + σ₃₄ and σ₂₅ + σ₃₅, we can obtain the information matrix for the MLE of the SST parameters (α, ν, μ, σ); the result appears in Gomez et al. (2007, Proposition 2.2).

Proposition 3 The MLE θ̂_T of θ₀ is asymptotically normal,

√T (θ̂_T − θ₀) →_D N(0, I⁻¹(θ₀)),

where I(θ₀) is the Fisher information matrix,

I(θ₀) ≡ E[ (∂/∂θ ln f(y_t; θ₀)) (∂/∂θ ln f(y_t; θ₀))^T ],

provided by (29); it can be consistently estimated by I(θ̂_T).

Proof: See Appendix B.

From the proof of the Proposition, we can see that I(θ) is continuous in some neighborhood of θ₀, so it follows from the consistency of θ̂_T that I(θ̂_T) is a consistent estimator of I(θ₀).

    5 Simulation performance of the MLE

To assess the asymptotic properties of the MLE in finite samples we report a numerical investigation of bias and variance of the estimators using sample sizes of T = 1000 and 5000. We choose μ₀ = 0, σ₀ = 1 and various different true values of (α, ν₁, ν₂): α = 0.3, 0.7 and (ν₁, ν₂) = (0.7, 2.5), (2.0, 2.0), and (2.0, 5.0); these cases are representative of a larger number of simulations producing qualitatively similar results. For each set of true values of the parameters and every sample size, N = 10000 simulated samples are drawn from the AST distribution with that set of parameter values, and then ML estimates θ̂_i (i = 1, 2, ..., N) are obtained using these samples. We obtain the sample means and standard errors of the MLEs of the parameters on these 10,000 replications, denoted respectively by M(θ̂) and SE(θ̂),

M(θ̂) = (1/N) Σ_{i=1}^N θ̂_i,    SE(θ̂) = [ (1/N) Σ_{i=1}^N (θ̂_i − M(θ̂))² ]^(1/2),

and compare these standard errors with the theoretical standard deviations, which are taken from the square roots of the diagonal elements of the Cramér-Rao bound (i.e., I⁻¹(θ)/T). Simulation results are reported in Table 1(a/b)


and in Figure 1. All entries in Table 1a (labeled "mean") report M(θ̂), and those in Table 1b ("se") report SE(θ̂).

To describe the ratios of simulated standard errors SE(θ̂) to theoretical ones from I⁻¹(θ)/T, we report results graphically in Figure 1 for the larger set of sample sizes T = {1000, 2000, ..., 10000}; these results are conveniently viewed in graphical form since asymptotically the ratio will converge on unity, and we wish to see examples of the speed of convergence and the degree of finite-sample discrepancy.

Finally, note that all random samples from a standard AST are generated by using the stochastic representation (8) of the AST multiplied by B ≡ αK(ν₁) + (1−α)K(ν₂), i.e. X ≡ B·Y, which has the AST density (5) with shape parameters (α, ν₁, ν₂), location μ = 0 and scale σ = 1.

From these simulations the estimates of all parameters appear asymptotically unbiased in each case and their variances appear to be approaching the Cramér-Rao bound. However, ML estimates of the tail parameters have a slower convergence rate than those of the other parameters. In fact, skewness, scale and location parameters can be estimated well even for sample sizes smaller than 500; however, even for moderately large values of the tail parameters, such as 5.0, a sample size of 1000 or 2000 may not be large enough to give a good estimate. The highest variances that we observed arose for large values of ν₁ where α was less than 0.5, and correspondingly for ν₂ where α was greater than 0.5; the last lines of Table 1a/1b and panel D of Figure 1 illustrate such a case. Note in Figure 1 that the vertical scale in panel D differs from those of panels A-C.

Estimates of tail parameters depend crucially on the relatively sparse tail observations, suggesting that, relative to the Student-t and the SST which have only a single tail parameter, approximately double the number of observations will be needed in order to obtain good tail parameter estimates in the AST, because the two tail parameters in the AST are distinct. For α = 0.3 in our simulation studies there are fewer observations on the left side, so estimates of the left tail parameter should show poorer finite-sample performance than those of the right tail parameter. Note that a smaller (larger) value of a tail parameter implies a heavier (thinner) tail, so that there are more (fewer) observations in the tail. As well, the shape of the distribution changes less with a one-unit change in the tail parameters when the value is large; that is, sensitivity of shape is greater at small values. These considerations suggest that we should observe lower standard errors for small tail parameter values than for large values.


    6 Concluding remarks

Many processes display a relative frequency of extreme values which far exceeds what could be accounted for by a Gaussian distribution. This is true in particular for financial data, where the Student-t distribution has commonly been found valuable in modelling conditional returns. However, equality of the relative frequency of extreme returns in left and right tails (losses and gains) often seems violated in practice. Hence generalizations of the Student-t that allow asymmetry are potentially valuable in empirical modelling and forecasting.

The present study offers a three-parameter form which is more general than those available in the literature. The proposed distribution allows analytical computation of important quantities related to risk, and ML estimation of parameters with the usual √T asymptotics. We show by simulation that finite-sample performance of ML estimation is reasonable, and also through empirical analysis that the potential of the more general form is realized both in better in-sample fits, and in better forecasts of tail-dependent quantities of interest such as the expected shortfall. This distribution therefore appears to offer a device for continuing to increase the subtlety of our understanding of financial returns and other heavy-tailed data.

References

[1] Aas, K. and Haff, I.H. (2006). The generalized hyperbolic skew Student's t-distribution. Journal of Financial Econometrics, 4(2), 275-309.

[2] Abramowitz, M. and Stegun, I.A. (1970). Handbook of Mathematical Functions. National Bureau of Standards, Applied Math. Series 55, U.S. Department of Commerce.

[3] Alberg, D., H. Shalit and R. Yosef (2008). Estimating stock market volatility using asymmetric GARCH models. Applied Financial Economics, 18, 1201-1208.

[4] Anderson, T. and Darling, D. (1952). Asymptotic theory of certain goodness of fit criteria based on stochastic processes. The Annals of Mathematical Statistics, 23, 193-212.


[5] Arnold, B.C. and Groeneveld, R.A. (1995). Measuring skewness with respect to the mode. The American Statistician, 49, 34-38.

[6] Artin, E. (1964). The Gamma Function. Holt, Rinehart and Winston, Inc.

[7] Azzalini, A. and A. Capitanio (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. Journal of the Royal Statistical Society B, 65, 367-389.

[8] Bauwens, L. and Laurent, S. (2002). A new class of multivariate skew densities, with application to GARCH models. Journal of Business and Economic Statistics.

[9] Bollerslev, T. (1987). A conditional heteroskedastic time series model for speculative prices and rates of return. Review of Economics and Statistics, 69, 542-547.

[10] Branco, M.D. and Dey, D.K. (2001). A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79, 99-113.

[11] Farrell, O.J. and Ross, B. (1963). Solved Problems: Gamma and Beta Functions, Legendre Polynomials, Bessel Functions. The Macmillan Company, New York.

[12] Fernandez, C. and Steel, M.F.J. (1998). On Bayesian modelling of fat tails and skewness. Journal of the American Statistical Association, 93, 359-371.

[13] Franses, P.H., M. van der Leij and R. Paap (2008). A simple test for GARCH against a stochastic volatility model. Journal of Financial Econometrics, 6, 291-306.

[14] Gomez, H.W., F.J. Torres and H. Bolfarine (2007). Large-sample inference for the epsilon-skew-t distribution. Communications in Statistics - Theory and Methods, 36, 73-81.

[15] Hansen, B.E. (1994). Autoregressive conditional density estimation. International Economic Review, 35, 705-730.

[16] Huber, P.J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In: L.M. LeCam and J. Neyman, eds., Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley: University of California Press.


[17] Jones, M.C. and Faddy, M.J. (2003). A skew extension of the t distribution, with applications. Journal of the Royal Statistical Society, Series B, 65, 159-174.

[18] Kagan, A.M., Linnik, Yu.V. and Rao, C.R. (1973). Characterization Problems in Mathematical Statistics. Wiley, New York.

[19] Kotz, S., Kozubowski, T.J. and Podgorski, K. (2001). The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance. Birkhauser, Boston.

[20] Mittnik, S. and Paolella, M.S. (2003). Prediction of financial downside-risk with heavy-tailed conditional distributions. In: S.T. Rachev, ed., Handbook of Heavy Tailed Distributions in Finance.

[21] Mittnik, S., Rachev, T. and Paolella, M.S. (1998). Stable Paretian modelling in finance: Some empirical and theoretical aspects. In: Adler et al., eds., A Practical Guide to Heavy Tails. Birkhauser.

[22] Newey, W.K. and McFadden, D. (1994). Large sample estimation and hypothesis testing. In: R. Engle and D. McFadden, eds., Handbook of Econometrics, Vol. 4, Amsterdam: North-Holland.

[23] Sahu, S.K., D.K. Dey and M.D. Branco (2003). A new class of multivariate skew distributions with applications to Bayesian regression models. The Canadian Journal of Statistics, 31, 129-150.

[24] Theodossiou, P. (1998). Financial data and the skewed generalized t distribution. Management Science, 44(12), 1650-1661.

[25] Zhu, D. and J.W. Galbraith (2009). Forecasting expected shortfall with a generalized asymmetric Student-t distribution. Working paper: www.mcgill.ca/economics/papers.


    7 Appendix A

Appendix A provides some lemmas that will be used in the proofs of properties of the AST.

We will use the Gamma function of a positive real variable only, and so for x > 0 we take Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt as our definition of the Gamma function. Γ(x) has derivatives of arbitrarily high order:

Γ′(x)/Γ(x) = −C − 1/x + Σ_{i=1}^∞ (1/i − 1/(x + i)),    (30)

d^(k−1)/dx^(k−1) [Γ′(x)/Γ(x)] = Σ_{i=0}^∞ (−1)^k (k−1)! / (x + i)^k,    for k ≥ 2;    (31)

C is Euler's constant. See e.g. Artin (1964, pp. 16 ff.) for these and other properties. Let ψ(x) ≡ Γ′(x)/Γ(x); this is called the digamma function.

    Lemma 4 Let D(ν) ≡ ψ((ν+1)/2) − ψ(ν/2) for any ν > 0. Then ψ(ν) is strictly increasing while D(ν) is strictly decreasing, and the following equalities hold:

    ψ(ν+1) = 1/ν + ψ(ν),   D(ν+2) = −2/[ν(ν+1)] + D(ν).   (32)

    Proof. From (31), taking k = 2, we get ψ′(x) = Σ_{i=0}^{∞} 1/(x+i)² > 0 for all x, implying that ψ(ν) is a strictly increasing function. From the above expression for ψ′(x), we can also see that ψ′(x) is a strictly decreasing function for x > 0, implying D′(ν) = (1/2)[ψ′((ν+1)/2) − ψ′(ν/2)] < 0. So D(ν) is strictly decreasing. If the first equality in (32) holds, then the second one is easily verified. Now we proceed to show the first equality. In fact, differentiating both sides of Γ(x+1) = xΓ(x) leads to Γ′(x+1)/Γ(x) = 1 + xψ(x), and then rewriting it does yield the first equality in (32).
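    Both recurrences in (32) are exact identities, so they can be confirmed to machine precision with SciPy's digamma (a verification sketch only; the helper name `D` mirrors the lemma's notation):

    ```python
    # Check the two recurrences in (32): psi(v+1) = 1/v + psi(v) and
    # D(v+2) = -2/(v(v+1)) + D(v), with D(v) = psi((v+1)/2) - psi(v/2).
    from scipy.special import digamma

    def D(v):
        return digamma((v + 1) / 2) - digamma(v / 2)

    for v in (0.5, 1.0, 2.3, 7.0):
        assert abs(digamma(v + 1) - (1 / v + digamma(v))) < 1e-12
        assert abs(D(v + 2) - (-2 / (v * (v + 1)) + D(v))) < 1e-12
    print("recurrences verified")
    ```
    
    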

    Lemma 5 For any ν > 0, recall K(ν) ≡ Γ((ν+1)/2)/[√(πν) Γ(ν/2)]. Then the following equalities hold:

    1/ν + 2K′(ν)/K(ν) = ψ((ν+1)/2) − ψ(ν/2) = D(ν),   (33)

    [ν/(ν+2j)]^{1/2} K(ν)/K(ν+2j) = 1 for j = 0;  ν/(ν+1) for j = 1;  ν(ν+2)/[(ν+1)(ν+3)] for j = 2.   (34)

    Proof. The proofs are immediate. For equality (33), taking the log of the expression for K(ν) and then differentiating both sides, it follows that

    K′(ν)/K(ν) = d ln K(ν)/dν = d[ ln Γ((ν+1)/2) − ln Γ(ν/2) − (1/2) ln π − (1/2) ln ν ]/dν
    = (1/2)[ ψ((ν+1)/2) − ψ(ν/2) − 1/ν ].

    So we have shown that (33) holds. From the definition of K(ν), the left side of equality (34) is expressed as

    [ν/(ν+2j)]^{1/2} K(ν)/K(ν+2j) = [Γ((ν+1)/2)/Γ(ν/2)] · [Γ(ν/2 + j)/Γ((ν+1)/2 + j)].

    Using the fact Γ(x+1) = xΓ(x), the proof of equality (34) is easily completed.
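    Identity (33) can be checked by a finite-difference approximation of K′(ν), and the j = 1, 2 cases of (34) are exact. The sketch below (illustrative, using `gammaln` for numerical stability) does both:

    ```python
    # Check (33), 1/v + 2K'(v)/K(v) = D(v), by central finite difference,
    # and the j = 1, 2 cases of (34) exactly.
    import math
    from scipy.special import gammaln, digamma

    def K(v):
        return math.exp(gammaln((v + 1) / 2) - gammaln(v / 2)) / math.sqrt(math.pi * v)

    def D(v):
        return digamma((v + 1) / 2) - digamma(v / 2)

    v, h = 3.2, 1e-6
    Kprime = (K(v + h) - K(v - h)) / (2 * h)
    assert abs(1 / v + 2 * Kprime / K(v) - D(v)) < 1e-6

    for j, target in ((1, v / (v + 1)), (2, v * (v + 2) / ((v + 1) * (v + 3)))):
        lhs = math.sqrt(v / (v + 2 * j)) * K(v) / K(v + 2 * j)
        assert abs(lhs - target) < 1e-12
    print("Lemma 5 identities verified")
    ```
    
    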

    Lemma 6 For any ν > 0, we have the following integral equalities:

    ∫₀^∞ (1 + z²)^{−(ν+1)/2} dz = 1/[2√ν K(ν)],   (35)

    ∫₀^∞ (1 + z²)^{−(ν+1)/2} ln(1 + z²) dz = D(ν)/[2√ν K(ν)],   (36)

    ∫₀^∞ (1 + z²)^{−(ν+1)/2} [ln(1 + z²)]² dz = [D²(ν) − 2D′(ν)]/[2√ν K(ν)],   (37)

    where K(ν) and D(ν) are defined as above, and D′(ν) is the derivative function of D(ν). Below we will continue to use Γ(·), ψ(·), K(·) and D(·) to denote these functions defined earlier.

    Proof. For the Student-t density f_t(x; ν) = K(ν)(1 + x²/ν)^{−(ν+1)/2}, from ∫_{−∞}^{∞} f_t(x; ν) dx = 1 and the change of variable z = x/√ν, we obtain equality (35) immediately. Differentiating both sides of equality (35) with respect to ν, by Lemma 3.6 of Newey and McFadden (1994, p. 2152), which ensures that the order of differentiation and integration can be interchanged, it follows that

    −(1/2) ∫₀^∞ (1 + z²)^{−(ν+1)/2} ln(1 + z²) dz = −(1/[2√ν K(ν)]) · (1/2)[ 1/ν + 2K′(ν)/K(ν) ].

    Rewriting this equality and combining with equality (33) yields (36). Similarly, by differentiating both sides of (36) with respect to ν and combining with (33), we obtain (37).


    Lemma 7 The inequalities ∂kur_L(r, ν)/∂ν < 0 and ∂kur_L(r, ν)/∂r > 0 hold for the AST.

    Proof. Taking the log of kur_L(r, ν), we have two partial derivatives as follows:

    ∂ ln kur_L/∂ν = (1/2)[ ψ(ν/2) − ψ((ν−r)/2) ] − (1/2)[ ψ((ν−r)/2) − ψ((ν−2r)/2) ] < 0,   (38)

    ∂ ln kur_L/∂r = [ ψ((2r+1)/2) − ψ((r+1)/2) ] + [ ψ((ν−r)/2) − ψ((ν−2r)/2) ] > 0.   (39)

    The inequality (38) can be verified by using the mean value theorem and the fact that ψ′(x) is strictly decreasing for x > 0; then (39) follows immediately because ψ(x) is an increasing function.

    Lemma 8 For the AST with μ = 0 and σ = 1, the expected shortfall ES_AST(q) ≡ E(Y | Y < q) can be expressed as

    ES_AST(q) = (1/F_AST(q)) { 4α²K(ν₁) ES_t( min(q,0)/(2αK(ν₁)); ν₁ ) F_t( min(q,0)/(2αK(ν₁)); ν₁ )
    + 4(1−α)²K(ν₂) [ ES_t( max(q,0)/(2(1−α)K(ν₂)); ν₂ ) F_t( max(q,0)/(2(1−α)K(ν₂)); ν₂ ) − (1/2) ES_t(0; ν₂) ] },   (40)

    where F_t(·; ν) and ES_t(·; ν) denote the cdf and the expected shortfall of the standard Student-t distribution with ν degrees of freedom; the general case follows by the location-scale transformation.

    Proof. The expression for ES_AST(q) in (40) is unified from the following two cases. When q ≤ 0, by the definition of expected shortfall, the change of variable u = x/(2αK(ν₁)), and F_AST(q) = 2αF_t( q/(2αK(ν₁)); ν₁ ) for q ≤ 0, we have

    ES_AST(q) ≡ E(Y | Y < q) = [ ∫_{−∞}^{q} x f_AST(x; α, ν₁, ν₂) dx ] / F_AST(q)
    = [ ∫_{−∞}^{q} (x/K(ν₁)) f_t( x/(2αK(ν₁)); ν₁ ) dx ] / F_AST(q)
    = 4α²K(ν₁) [ ∫_{−∞}^{q/(2αK(ν₁))} u f_t(u; ν₁) du ] / [ 2αF_t( q/(2αK(ν₁)); ν₁ ) ]
    = 2αK(ν₁) ES_t( q/(2αK(ν₁)); ν₁ ).

    Similarly, when q > 0, using the change of variable z = x/(2(1−α)K(ν₂)), and noting that

    ∫_{−∞}^{0} (x/K(ν₁)) f_t( x/(2αK(ν₁)); ν₁ ) dx = 4α²K(ν₁) ES_t(0; ν₁) F_t(0; ν₁)

    and ∫_{−∞}^{q} z f_t(z; ν) dz = ES_t(q; ν) F_t(q; ν) for any q, we obtain

    ES_AST(q) = [ ∫_{−∞}^{0} (x/K(ν₁)) f_t( x/(2αK(ν₁)); ν₁ ) dx + ∫_{0}^{q} (x/K(ν₂)) f_t( x/(2(1−α)K(ν₂)); ν₂ ) dx ] / F_AST(q)
    = (1/F_AST(q)) { 4α²K(ν₁) ES_t(0; ν₁) F_t(0; ν₁) + 4(1−α)²K(ν₂) [ ∫_{−∞}^{q/(2(1−α)K(ν₂))} z f_t(z; ν₂) dz − ∫_{−∞}^{0} z f_t(z; ν₂) dz ] }
    = (1/F_AST(q)) { 4α²K(ν₁) ES_t(0; ν₁) F_t(0; ν₁) + 4(1−α)²K(ν₂) [ ES_t( q/(2(1−α)K(ν₂)); ν₂ ) F_t( q/(2(1−α)K(ν₂)); ν₂ ) − (1/2) ES_t(0; ν₂) ] }.

    8 Appendix B

    Appendix B is devoted to establishing consistency and asymptotic normality of the MLE of all parameters of the AST distribution.

    Proof of Proposition 1 (consistency of MLE). The consistency of the MLE θ̂_T can be shown by verifying the conditions of Theorem 2.5 in Newey and McFadden (1994, p. 2131), which holds under conditions that are primitive and also quite weak. Condition (ii) of Theorem 2.5, compactness of the parameter set, is ensured by considering a compact parameter set Θ such that it includes the true parameter θ₀ as an interior point. Condition (iii) of Theorem 2.5 requires that the log-likelihood ln f(y | θ) be continuous at each θ ∈ Θ with probability one. This condition holds by inspection. We only need to check the identification condition and dominance condition (corresponding to conditions (i) and (iv) of Theorem 2.5 respectively).

    For the identification condition, it is sufficient to show that for any given θ ≠ θ₀ and θ ∈ Θ,

    ln f(y | θ) ≠ ln f(y | θ₀)   (41)

    on a set of positive probability. The fact that the AST random variable Y has a positive probability on any interval will be used in the proof. If μ ≠ μ₀, say μ > μ₀, then on the interval (μ₀, μ] the log-density function ln f(y | θ) is strictly increasing, but ln f(y | θ₀) decreases strictly, so (41) holds for y ∈ (μ₀, μ]. Now suppose μ = μ₀. We can show that (41) is true on (−∞, μ₀] or (μ₀, +∞) respectively if ν₁ ≠ ν₀₁ or ν₂ ≠ ν₀₂. In fact, assuming ν₂ ≠ ν₀₂ and letting C(θ) = (ν₂+1)/2, for y ∈ (μ₀, +∞) we have ln f(y | θ) = −ln σ − C(θ) ln R(y; θ) with μ = μ₀, and ln f(y | θ₀) = −ln σ₀ − C(θ₀) ln R(y; θ₀), where R(y; θ) is defined in (55). Note that R(y; θ) with μ = μ₀ and R(y; θ₀) are quadratic and strictly increasing in y on (μ₀, +∞). Thus the two log-density functions intersect at no more than two points, so that (41) holds on (μ₀, +∞). Similarly, for μ = μ₀, ν₁ = ν₀₁ and ν₂ = ν₀₂, it is easy to show that (41) holds if α ≠ α₀ or σ ≠ σ₀ (see Newey and McFadden, p. 2126).

    The dominance condition of Theorem 2.5, E[sup_{θ∈Θ} |ln f(Y | θ)|] < ∞, can be checked as follows. The log-density is bounded above on the compact set Θ, and |ln f(y | θ)| is bounded by a linear combination of 1, ln L(y; θ)1(y < μ) and ln R(y; θ)1(y > μ), with coefficients bounded uniformly on Θ. Using equalities (63) and (64), the dominance condition follows.

    Proof of Proposition 3 (asymptotic normality of the MLE). The proof of the asymptotic normality result proceeds by verifying the conditions of Theorem 3 as well as its corollary in Huber (1967). Following the notation of Huber (1967), let ψ(y, θ) = ∂ ln f(y, θ)/∂θ, the score vector, and set

    λ(θ) = E ψ(y, θ),   u(y, θ, d) ≡ sup_{τ ∈ D} |ψ(y, τ) − ψ(y, θ)|,   (43)

    where D ≡ {τ ∈ Θ : |τ − θ| ≤ d} and all expectations are always taken with respect to the true underlying distribution f(y; θ₀) with θ₀ = (α₀, ν₀₁, ν₀₂, σ₀, μ₀). Similar to Example 1 of Huber (1967), condition (N-1) (i.e., for each fixed θ, ψ(y, θ) is measurable and separable: see Assumption (A-1) of Huber (1967)) is immediate; both conditions (N-2) and (N-4), i.e. λ(θ₀) = 0 and E[|ψ(y, θ₀)|²] < ∞, follow from Lemma 10 and the existence of the information matrix I(θ₀) established in Appendix C. It remains to verify condition (N-3): there are strictly positive numbers a, b, c, d₀ such that

    |λ(θ)| ≥ a|θ − θ₀| for |θ − θ₀| ≤ d₀,   (44)
    E[u(y, θ, d)] ≤ b d for |θ − θ₀| + d ≤ d₀,   (45)
    E[u(y, θ, d)²] ≤ c d for |θ − θ₀| + d ≤ d₀.   (46)

    Condition (44) follows from the continuous differentiability of λ(θ) at θ₀ together with the nonsingularity of ∂λ(θ₀)/∂θ = −I(θ₀): by (56)–(60), each element of ∂ψ(y, θ)/∂θ is a combination of terms of the form

    A(θ̃)(ln L)^{i} L^{−j} 1(y < μ),  A_{ij}(θ̃)(μ − y)^{i} L^{−j} 1(y < μ)   (47)

    and the corresponding terms in R(y, θ) and 1(y > μ),   (48)

    where A(·) and A_{ij}(·) are some continuous functions of the shape and scale parameters (α, ν₁, ν₂, σ), and L = L(y, θ) and R = R(y, θ) are defined as in (54) and (55); by Lemma 9 the moments of all such terms exist and are continuous in θ.

    Now we check condition (45). Separate the location parameter from the other parameters, θ = (μ, β) with β = (α, ν₁, ν₂, σ), and write τ = (m, b). Then

    u(y, θ, d) ≤ sup_{τ ∈ D} |ψ(y, m, b) − ψ(y, m, β)| + sup_{|m − μ| ≤ d} |ψ(y, m, β) − ψ(y, μ, β)|.   (49)

    The condition (45) is easily verified for the first part in (49), because the location is fixed and ψ(y, m, ·) as a function of β is smooth enough. For the second part in (49), note from (56) to (60) that each element of ψ(y, m, β) can be expressed in the following form:

    C(β) + [ Σ_{i=1}^{2} C_{1i}(β)(ln L)^{i−1} + Σ_{i=3}^{4} C_{1i}(β)(μ − y)^{i−3} L^{−1} ] 1(y < μ)
    + [ Σ_{i=1}^{2} C_{2i}(β)(ln R)^{i−1} + Σ_{i=3}^{4} C_{2i}(β)(y − μ)^{i−3} R^{−1} ] 1(y > μ),   (50)

    where C(·) and C_{ij}(·) are also certain continuous functions of β = (α, ν₁, ν₂, σ). Without loss of generality, we need just to prove that

    E[ sup_{|m−μ|≤d} |1(y < m) − 1(y < μ)| · |ψ(y, μ, β)| ] ≤ b d,   (51)

    E[ sup_{|m−μ|≤d} |ln L(y, m, β) − ln L(y, μ, β)| 1(y < μ) ] ≤ b d,   (52)

    and

    E[ sup_{|m−μ|≤d} |(m − y)^k L(y, m, β)^{−1} − (μ − y)^k L(y, μ, β)^{−1}| 1(y < μ) ] ≤ b d,   (53)

    where k = 0, 1. The similar inequalities for R(y, θ) can be proved in the same way. Equation (51) is immediate by (42) and the boundedness of |ψ(y, μ, β)| f(y; θ₀). The other two equations (52) and (53) are easily verified by using the mean-value theorem. Finally, verification of condition (46) is similar.

    9 Appendix C

    Appendix C is devoted to deriving a closed-form expression for the information matrix and to verifying the information matrix equality.

    Suppose that y_t (t = 1, 2, ..., T) are i.i.d. observations from the AST with density f(y; θ₀) defined in (5), where θ₀ = (α₀, ν₀₁, ν₀₂, σ₀, μ₀). Expectations are always taken with respect to the true underlying distribution f(y; θ₀). Let

    L ≡ L(y_t; θ) ≡ 1 + (1/ν₁) [ (y_t − μ) / (2ασK(ν₁)) ]²,   (54)

    R ≡ R(y_t; θ) ≡ 1 + (1/ν₂) [ (y_t − μ) / (2(1−α)σK(ν₂)) ]²,   (55)

    where θ = (α, ν₁, ν₂, σ, μ) ∈ Θ, the parameter space. Then the log-density function with parameter θ is

    ln f(y_t; θ) = −ln σ − [(ν₁+1)/2] ln L(y_t; θ) 1(y_t < μ) − [(ν₂+1)/2] ln R(y_t; θ) 1(y_t > μ)

    and the score vector for observation t, ∂ ln f(y_t; θ)/∂θ, is given by

    ∂ ln f/∂α = [(ν₁+1)/α] [1 − 1/L(y_t; θ)] 1(y_t < μ) − [(ν₂+1)/(1−α)] [1 − 1/R(y_t; θ)] 1(y_t > μ),   (56)

    ∂ ln f/∂ν₁ = { −(1/2) ln L(y_t; θ) + [(ν₁+1)/2] D(ν₁) [L(y_t; θ) − 1]/L(y_t; θ) } 1(y_t < μ),   (57)

    ∂ ln f/∂ν₂ = { −(1/2) ln R(y_t; θ) + [(ν₂+1)/2] D(ν₂) [R(y_t; θ) − 1]/R(y_t; θ) } 1(y_t > μ),   (58)

    ∂ ln f/∂μ = [(ν₁+1)/2] [1/L(y_t; θ)] (1/ν₁) { 2(y_t − μ)/[2ασK(ν₁)]² } 1(y_t < μ)
    + [(ν₂+1)/2] [1/R(y_t; θ)] (1/ν₂) { 2(y_t − μ)/[2(1−α)σK(ν₂)]² } 1(y_t > μ),   (59)

    ∂ ln f/∂σ = −1/σ + [(ν₁+1)/σ] [1 − 1/L(y_t; θ)] 1(y_t < μ) + [(ν₂+1)/σ] [1 − 1/R(y_t; θ)] 1(y_t > μ),   (60)

    where we used equality (33) in the expressions for the components ∂ ln f/∂ν₁ and ∂ ln f/∂ν₂
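    A quick way to validate score expressions of this kind is to compare them with central finite differences of the log-density. The sketch below codes the log-density directly from (54)–(55) and the displayed ln f, with parameters ordered (α, ν₁, ν₂, σ, μ); all function names are illustrative:

    ```python
    # Finite-difference check of the score expressions (56)-(60).
    import math
    import numpy as np
    from scipy.special import gammaln, digamma

    def Kf(v):
        return math.exp(gammaln((v + 1) / 2) - gammaln(v / 2)) / math.sqrt(math.pi * v)

    def Df(v):
        return digamma((v + 1) / 2) - digamma(v / 2)

    def logf(y, a, v1, v2, s, m):
        if y < m:
            L = 1 + ((y - m) / (2 * a * s * Kf(v1))) ** 2 / v1
            return -math.log(s) - (v1 + 1) / 2 * math.log(L)
        R = 1 + ((y - m) / (2 * (1 - a) * s * Kf(v2))) ** 2 / v2
        return -math.log(s) - (v2 + 1) / 2 * math.log(R)

    def score(y, a, v1, v2, s, m):
        # components ordered as (alpha, nu1, nu2, sigma, mu)
        if y < m:
            L = 1 + ((y - m) / (2 * a * s * Kf(v1))) ** 2 / v1
            return np.array([
                (v1 + 1) / a * (1 - 1 / L),                               # (56)
                -0.5 * math.log(L) + (v1 + 1) / 2 * Df(v1) * (1 - 1 / L), # (57)
                0.0,                                                      # (58)
                -1 / s + (v1 + 1) / s * (1 - 1 / L),                      # (60)
                (v1 + 1) / v1 * (y - m) / (L * (2 * a * s * Kf(v1)) ** 2) # (59)
            ])
        R = 1 + ((y - m) / (2 * (1 - a) * s * Kf(v2))) ** 2 / v2
        return np.array([
            -(v2 + 1) / (1 - a) * (1 - 1 / R),
            0.0,
            -0.5 * math.log(R) + (v2 + 1) / 2 * Df(v2) * (1 - 1 / R),
            -1 / s + (v2 + 1) / s * (1 - 1 / R),
            (v2 + 1) / v2 * (y - m) / (R * (2 * (1 - a) * s * Kf(v2)) ** 2)
        ])

    theta = np.array([0.3, 3.0, 5.0, 1.2, 0.1])
    h = 1e-6
    for y in (-0.8, 0.9):
        num = np.array([(logf(y, *(theta + h * e)) - logf(y, *(theta - h * e)))
                        / (2 * h) for e in np.eye(5)])
        assert np.allclose(num, score(y, *theta), atol=1e-5)
    print("score expressions match numerical gradients")
    ```
    
    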

    . To derive the information matrix I(θ₀) ≡ E[ ∂_θ ln f(y_t; θ₀) ∂_θ ln f(y_t; θ₀)′ ] and the Hessian H(θ₀) ≡ E[ ∂²_θ ln f(y_t; θ₀) ], and to verify the information matrix equality I(θ₀) = −H(θ₀), the following Lemma is needed.

    Lemma 9 For any j, m = 0, 1, 2, ..., the following moment equalities hold:

    E{ [L(y_t; θ₀)]^{−j} 1(y_t < μ₀) } = α [ν₁/(ν₁+2j)]^{1/2} K(ν₁)/K(ν₁+2j)   (61)
        = α if j = 0;  αν₁/(ν₁+1) if j = 1;  αν₁(ν₁+2)/[(ν₁+1)(ν₁+3)] if j = 2;

    E{ [R(y_t; θ₀)]^{−j} 1(y_t > μ₀) } = (1−α) [ν₂/(ν₂+2j)]^{1/2} K(ν₂)/K(ν₂+2j)   (62)
        = 1−α if j = 0;  (1−α)ν₂/(ν₂+1) if j = 1;  (1−α)ν₂(ν₂+2)/[(ν₂+1)(ν₂+3)] if j = 2;

    E{ ln L(y_t; θ₀) [L(y_t; θ₀)]^{−j} 1(y_t < μ₀) } = α [ν₁/(ν₁+2j)]^{1/2} [K(ν₁)/K(ν₁+2j)] D(ν₁+2j)   (63)
        = αD(ν₁) if j = 0;  α[ν₁/(ν₁+1)] D(ν₁+2) if j = 1;

    E{ ln R(y_t; θ₀) [R(y_t; θ₀)]^{−j} 1(y_t > μ₀) } = (1−α) [ν₂/(ν₂+2j)]^{1/2} [K(ν₂)/K(ν₂+2j)] D(ν₂+2j)   (64)
        = (1−α)D(ν₂) if j = 0;  (1−α)[ν₂/(ν₂+1)] D(ν₂+2) if j = 1;

    E{ (y_t − μ₀) [ln L(y_t; θ₀)]^m [L(y_t; θ₀)]^{−j} 1(y_t < μ₀) } = − 2^{m+1} m! ν₁ α² σ [2K(ν₁)]² / [ 2(ν₁ + 2j − 1)^{m+1} ],   (65)

    E{ (y_t − μ₀) [ln R(y_t; θ₀)]^m [R(y_t; θ₀)]^{−j} 1(y_t > μ₀) } = 2^{m+1} m! ν₂ (1−α)² σ [2K(ν₂)]² / [ 2(ν₂ + 2j − 1)^{m+1} ],   (66)

    (the last two being valid for ν₁ + 2j > 1 and ν₂ + 2j > 1 respectively);

    E{ [ln L(y_t; θ₀)]² 1(y_t < μ₀) } = α [D²(ν₁) − 2D′(ν₁)],   (67)

    E{ [ln R(y_t; θ₀)]² 1(y_t > μ₀) } = (1−α) [D²(ν₂) − 2D′(ν₂)],   (68)

    where the right-hand sides of all the equalities from (61) to (68) are evaluated at the true values (α₀, ν₀₁, ν₀₂, σ₀, μ₀).

    Proof.² We discuss equalities (61), (63), (65) and (67). The other equalities can be proved in the same manner. Note that, for any j, m = 0, 1, 2, ...,

    EL1(j, m) ≡ E{ [L(y_t; θ)]^{−j} [ln L(y_t; θ)]^m 1(y_t < μ) } = ∫_{−∞}^{μ} [L(y; θ)]^{−j} [ln L(y; θ)]^m (1/σ)[L(y; θ)]^{−(ν₁+1)/2} dy,

    and that L(y; θ) = 1 + (1/ν₁)[ (y − μ)/(2ασK(ν₁)) ]². Then using the change of variable z = (μ − y)/(2ασ√ν₁ K(ν₁)) yields

    EL1(j, m) = 2α√ν₁ K(ν₁) ∫₀^∞ (1 + z²)^{−(ν₁+2j+1)/2} [ln(1 + z²)]^m dz.   (69)

    Setting m = 0, m = 1, and (j, m) = (0, 2) respectively, and correspondingly taking into account equality (35) with ν = ν₁ + 2j, equality (36) with ν = ν₁ + 2j, and equality (37) with ν = ν₁, we obtain equalities (61), (63), and (67). These proofs use (34). Now consider equality (65). Denote by EL2(j, m) the expectation on the left side of equality (65), and note that the change of variable z = (1/ν₁)[ (μ − y)/(2ασK(ν₁)) ]² yields

    EL2(j, m) = −( ν₁[2ασK(ν₁)]² / (2σ) ) ∫₀^∞ (1 + z)^{−(ν₁+2j+1)/2} [ln(1 + z)]^m dz.   (70)

    Subject to ν₁ + 2j − 1 > 0, by integration by parts it follows that

    EL2(j, m) = m ( 2/(ν₁+2j−1) ) EL2(j, m−1) = m! ( 2/(ν₁+2j−1) )^m EL2(j, 0).

    A straightforward calculation for (70) gives EL2(j, 0) = −ν₁[2ασK(ν₁)]² / [σ(ν₁+2j−1)].

    ² For simplicity, we omit the subscript on the true parameters θ₀ in all the following proofs.

    Lemma 10 The score vector for observation t, ∂ ln f(y_t; θ)/∂θ, satisfies the equation

    E[ ∂ ln f(y_t; θ₀)/∂θ ] = 0.   (71)

    Proof. By using the equalities from (61) to (68), this Lemma is easily verified. In fact,

    (i). E[∂ ln f/∂α] = [(ν₁+1)/α] E[(1 − 1/L)1(y_t < μ)] − [(ν₂+1)/(1−α)] E[(1 − 1/R)1(y_t > μ)]
    = [(ν₁+1)/α][ α − αν₁/(ν₁+1) ] − [(ν₂+1)/(1−α)][ (1−α) − (1−α)ν₂/(ν₂+1) ] = 1 − 1 = 0.

    (ii). E[∂ ln f/∂ν₁] = E[ { −(1/2) ln L + [(ν₁+1)/2] D(ν₁)(L − 1)/L } 1(y_t < μ) ]
    = −(α/2)D(ν₁) + [(ν₁+1)/2] D(ν₁) α[ 1 − ν₁/(ν₁+1) ] = 0.

    (iii). Similarly, we have E[∂ ln f/∂ν₂] = E[ { −(1/2) ln R + [(ν₂+1)/2] D(ν₂)(R − 1)/R } 1(y_t > μ) ] = 0.

    (iv). E[∂ ln f/∂μ] = [(ν₁+1)/ν₁] (1/[2ασK(ν₁)]²) E[(y_t − μ)L^{−1}1(y_t < μ)] + [(ν₂+1)/ν₂] (1/[2(1−α)σK(ν₂)]²) E[(y_t − μ)R^{−1}1(y_t > μ)]
    = [(ν₁+1)/ν₁] (1/[2ασK(ν₁)]²) · ( −ν₁[2ασK(ν₁)]²/[σ(ν₁+1)] ) + [(ν₂+1)/ν₂] (1/[2(1−α)σK(ν₂)]²) · ( ν₂[2(1−α)σK(ν₂)]²/[σ(ν₂+1)] )
    = −1/σ + 1/σ = 0.

    (v). E[∂ ln f/∂σ] = −1/σ + [(ν₁+1)/σ] E[(1 − 1/L)1(y_t < μ)] + [(ν₂+1)/σ] E[(1 − 1/R)1(y_t > μ)]
    = −1/σ + α/σ + (1−α)/σ = 0.

    Proof of Proposition 2. We prove this by computing expectations on both sides of the following equations and then verifying them:

    E[ ∂ ln f(y_t; θ)/∂θ_i · ∂ ln f(y_t; θ)/∂θ_j ] = −E[ ∂² ln f(y_t; θ)/∂θ_i ∂θ_j ],  i, j = 1, 2, ..., 5.

    In the proof, the fact that 1(y_t < μ)1(y_t > μ) = 0 and the equalities (61)–(68) are used repeatedly. In addition, we use E[∂ ln f(y_t; θ₀)/∂θ] = 0 shown in (71) and D(ν+2) = −2/[ν(ν+1)] + D(ν) given in (32). Note that by the construction of the AST distribution, the left-tail parameter ν₁ and the right-tail parameter ν₂ have a symmetry property. Hence we do not consider the terms of the information matrix equality involving the right-tail parameter ν₂.

    (a)

    E[(∂ ln f/∂α)²] = [(ν₁+1)/α]² E[(1 − 1/L)²1(y_t < μ)] + [(ν₂+1)/(1−α)]² E[(1 − 1/R)²1(y_t > μ)]
    = [(ν₁+1)/α]² · 3α/[(ν₁+1)(ν₁+3)] + [(ν₂+1)/(1−α)]² · 3(1−α)/[(ν₂+1)(ν₂+3)]
    = 3(ν₁+1)/[α(ν₁+3)] + 3(ν₂+1)/[(1−α)(ν₂+3)];

    −E[∂² ln f/∂α²] = [(ν₁+1)/α²] E[(1 − 1/L)1(y_t < μ)] + [2(ν₁+1)/α²] E[(1/L − 1/L²)1(y_t < μ)]
    + [(ν₂+1)/(1−α)²] E[(1 − 1/R)1(y_t > μ)] + [2(ν₂+1)/(1−α)²] E[(1/R − 1/R²)1(y_t > μ)]
    = 1/α + 2ν₁/[α(ν₁+3)] + 1/(1−α) + 2ν₂/[(1−α)(ν₂+3)]
    = 3(ν₁+1)/[α(ν₁+3)] + 3(ν₂+1)/[(1−α)(ν₂+3)].

    (b)

    E[(∂ ln f/∂ν₁)²] = E[ { (1/4)(ln L)² + [(ν₁+1)/2]² D²(ν₁)(1 − 1/L)² − [(ν₁+1)/2] D(ν₁)(1 − 1/L) ln L } 1(y_t < μ) ]
    = (α/4)[D²(ν₁) − 2D′(ν₁)] + [(ν₁+1)/2]² D²(ν₁) · 3α/[(ν₁+1)(ν₁+3)] − [(ν₁+1)/2] D(ν₁) · α[ D(ν₁) − (ν₁/(ν₁+1))D(ν₁+2) ]
    = (α/2) [ ν₁D²(ν₁)/(ν₁+3) − 2D(ν₁)/(ν₁+1) − D′(ν₁) ];

    −E[∂² ln f/∂ν₁²] = −[ D(ν₁) + ((ν₁+1)/2)D′(ν₁) ] E[(1 − 1/L)1(y_t < μ)] + [(ν₁+1)/2] D²(ν₁) E[(1/L − 1/L²)1(y_t < μ)]
    = −[ D(ν₁) + ((ν₁+1)/2)D′(ν₁) ] · α/(ν₁+1) + [(ν₁+1)/2] D²(ν₁) · αν₁/[(ν₁+1)(ν₁+3)]
    = (α/2) [ ν₁D²(ν₁)/(ν₁+3) − 2D(ν₁)/(ν₁+1) − D′(ν₁) ].


    (c). Since (y_t − μ)² = ν₁[2ασK(ν₁)]²(L − 1) on {y_t < μ} and (y_t − μ)² = ν₂[2(1−α)σK(ν₂)]²(R − 1) on {y_t > μ},

    E[(∂ ln f/∂μ)²] = [(ν₁+1)²/ν₁] (1/[2ασK(ν₁)]²) E[(1/L − 1/L²)1(y_t < μ)] + [(ν₂+1)²/ν₂] (1/[2(1−α)σK(ν₂)]²) E[(1/R − 1/R²)1(y_t > μ)]
    = [(ν₁+1)²/ν₁] (1/[2ασK(ν₁)]²) · αν₁/[(ν₁+1)(ν₁+3)] + [(ν₂+1)²/ν₂] (1/[2(1−α)σK(ν₂)]²) · (1−α)ν₂/[(ν₂+1)(ν₂+3)]
    = (1/(4σ²)) [ (ν₁+1)/(α(ν₁+3)K²(ν₁)) + (ν₂+1)/((1−α)(ν₂+3)K²(ν₂)) ];

    −E[∂² ln f/∂μ²] = −[(ν₁+1)/ν₁] (1/[2ασK(ν₁)]²) E[(1/L − 2/L²)1(y_t < μ)] − [(ν₂+1)/ν₂] (1/[2(1−α)σK(ν₂)]²) E[(1/R − 2/R²)1(y_t > μ)]
    = [(ν₁+1)/ν₁] (1/[2ασK(ν₁)]²) · αν₁/(ν₁+3) + [(ν₂+1)/ν₂] (1/[2(1−α)σK(ν₂)]²) · (1−α)ν₂/(ν₂+3)
    = (1/(4σ²)) [ (ν₁+1)/(α(ν₁+3)K²(ν₁)) + (ν₂+1)/((1−α)(ν₂+3)K²(ν₂)) ],

    using E[(1/L − 2/L²)1(y_t < μ)] = −αν₁/(ν₁+3) and the analogous equality for R.

    (d)

    E[(∂ ln f/∂σ)²] = 1/σ² − (2/σ²)[α + (1−α)] + [(ν₁+1)/σ]² · 3α/[(ν₁+1)(ν₁+3)] + [(ν₂+1)/σ]² · 3(1−α)/[(ν₂+1)(ν₂+3)]
    = (2/σ²) [ αν₁/(ν₁+3) + (1−α)ν₂/(ν₂+3) ];

    −E[∂² ln f/∂σ²] = −1/σ² + [(ν₁+1)/σ²] E[(1 − 1/L)(1 + 2/L)1(y_t < μ)] + [(ν₂+1)/σ²] E[(1 − 1/R)(1 + 2/R)1(y_t > μ)]
    = −1/σ² + [(ν₁+1)/σ²] · 3α/(ν₁+3) + [(ν₂+1)/σ²] · 3(1−α)/(ν₂+3)
    = (2/σ²) [ αν₁/(ν₁+3) + (1−α)ν₂/(ν₂+3) ].

    (e)

    E[(∂ ln f/∂σ)(∂ ln f/∂ν₁)] = −(1/σ) E[∂ ln f/∂ν₁] + [(ν₁+1)/σ] { −(1/2) E[(1 − 1/L) ln L · 1(y_t < μ)] + [(ν₁+1)/2] D(ν₁) E[(1 − 1/L)²1(y_t < μ)] }
    = 0 + [(ν₁+1)/σ] { −(1/2)[ αD(ν₁)/(ν₁+1) + 2α/(ν₁+1)² ] + [(ν₁+1)/2] D(ν₁) · 3α/[(ν₁+1)(ν₁+3)] }
    = (α/σ) [ ν₁D(ν₁)/(ν₁+3) − 1/(ν₁+1) ],

    where E[(1 − 1/L) ln L · 1(y_t < μ)] = α[ D(ν₁) − (ν₁/(ν₁+1))D(ν₁+2) ] = αD(ν₁)/(ν₁+1) + 2α/(ν₁+1)² by (32);

    −E[∂² ln f/∂σ∂ν₁] = −(1/σ) E[(1 − 1/L)1(y_t < μ)] + [(ν₁+1)/σ] D(ν₁) E[(1/L − 1/L²)1(y_t < μ)]
    = −(1/σ) · α/(ν₁+1) + [(ν₁+1)/σ] D(ν₁) · αν₁/[(ν₁+1)(ν₁+3)]
    = (α/σ) [ ν₁D(ν₁)/(ν₁+3) − 1/(ν₁+1) ].


    (f)

    E[(∂ ln f/∂μ)(∂ ln f/∂α)] = [(ν₁+1)²/(αν₁)] (1/[2ασK(ν₁)]²) E[(y_t − μ)(1/L − 1/L²)1(y_t < μ)] − [(ν₂+1)²/((1−α)ν₂)] (1/[2(1−α)σK(ν₂)]²) E[(y_t − μ)(1/R − 1/R²)1(y_t > μ)]
    = [(ν₁+1)²/(αν₁)] · ( −2ν₁/[σ(ν₁+1)(ν₁+3)] ) − [(ν₂+1)²/((1−α)ν₂)] · ( 2ν₂/[σ(ν₂+1)(ν₂+3)] )
    = −(2/σ) [ (ν₁+1)/(α(ν₁+3)) + (ν₂+1)/((1−α)(ν₂+3)) ];

    −E[∂² ln f/∂μ∂α] = (2/α)[(ν₁+1)/ν₁] (1/[2ασK(ν₁)]²) E[(y_t − μ)L^{−2}1(y_t < μ)] − (2/(1−α))[(ν₂+1)/ν₂] (1/[2(1−α)σK(ν₂)]²) E[(y_t − μ)R^{−2}1(y_t > μ)]
    = (2/α)[(ν₁+1)/ν₁] · ( −ν₁/[σ(ν₁+3)] ) − (2/(1−α))[(ν₂+1)/ν₂] · ( ν₂/[σ(ν₂+3)] )
    = −(2/σ) [ (ν₁+1)/(α(ν₁+3)) + (ν₂+1)/((1−α)(ν₂+3)) ].

    (g)

    E[(∂ ln f/∂α)(∂ ln f/∂σ)] = −(1/σ) E[∂ ln f/∂α] + [(ν₁+1)²/(ασ)] E[(1 − 1/L)²1(y_t < μ)] − [(ν₂+1)²/((1−α)σ)] E[(1 − 1/R)²1(y_t > μ)]
    = 0 + [(ν₁+1)²/(ασ)] · 3α/[(ν₁+1)(ν₁+3)] − [(ν₂+1)²/((1−α)σ)] · 3(1−α)/[(ν₂+1)(ν₂+3)]
    = (3/σ) [ (ν₁+1)/(ν₁+3) − (ν₂+1)/(ν₂+3) ] = (2/σ) [ ν₁/(ν₁+3) − ν₂/(ν₂+3) ];

    −E[∂² ln f/∂α∂σ] = (2/σ)[(ν₁+1)/α] E[(1/L − 1/L²)1(y_t < μ)] − (2/σ)[(ν₂+1)/(1−α)] E[(1/R − 1/R²)1(y_t > μ)]
    = (2/σ)[(ν₁+1)/α] · αν₁/[(ν₁+1)(ν₁+3)] − (2/σ)[(ν₂+1)/(1−α)] · (1−α)ν₂/[(ν₂+1)(ν₂+3)]
    = (2/σ) [ ν₁/(ν₁+3) − ν₂/(ν₂+3) ].

    (h). Note that (∂ ln f/∂ν₁)(∂ ln f/∂ν₂) = 0 and ∂² ln f/∂ν₁∂ν₂ = 0, since 1(y_t < μ)1(y_t > μ) = 0. Then we have

    E[(∂ ln f/∂ν₁)(∂ ln f/∂ν₂)] = −E[∂² ln f/∂ν₁∂ν₂] = 0.

    (i)

    E[(∂ ln f/∂ν₁)(∂ ln f/∂μ)] = [(ν₁+1)/ν₁] (1/[2ασK(ν₁)]²) { −(1/2) E[(y_t − μ)(ln L)L^{−1}1(y_t < μ)] + [(ν₁+1)/2] D(ν₁) E[(y_t − μ)(1/L − 1/L²)1(y_t < μ)] }
    = [(ν₁+1)/ν₁] { (1/2) · 2ν₁/[σ(ν₁+1)²] − [(ν₁+1)/2] D(ν₁) · 2ν₁/[σ(ν₁+1)(ν₁+3)] }
    = (1/σ) [ 1/(ν₁+1) − (ν₁+1)D(ν₁)/(ν₁+3) ];

    −E[∂² ln f/∂ν₁∂μ]: since

    ∂² ln f/∂ν₁∂μ = [(ν₁+1)/ν₁] [ (y_t − μ)/[2ασK(ν₁)]² ] { [1/(ν₁+1)] L^{−1} − D(ν₁) L^{−2} } 1(y_t < μ),

    we have

    −E[∂² ln f/∂ν₁∂μ] = −[(ν₁+1)/ν₁] { [1/(ν₁+1)] · ( −ν₁/[σ(ν₁+1)] ) − D(ν₁) · ( −ν₁/[σ(ν₁+3)] ) }
    = (1/σ) [ 1/(ν₁+1) − (ν₁+1)D(ν₁)/(ν₁+3) ].


    (j)

    E[(∂ ln f/∂ν₁)(∂ ln f/∂α)] = [(ν₁+1)/α] { −(1/2) E[(1 − 1/L) ln L · 1(y_t < μ)] + [(ν₁+1)/2] D(ν₁) E[(1 − 1/L)²1(y_t < μ)] }
    = [(ν₁+1)/α] { −(1/2)[ αD(ν₁)/(ν₁+1) + 2α/(ν₁+1)² ] + [(ν₁+1)/2] D(ν₁) · 3α/[(ν₁+1)(ν₁+3)] }
    = ν₁D(ν₁)/(ν₁+3) − 1/(ν₁+1);

    −E[∂² ln f/∂ν₁∂α] = −(1/α) E[(1 − 1/L)1(y_t < μ)] + [(ν₁+1)/α] D(ν₁) E[(1/L − 1/L²)1(y_t < μ)]
    = −(1/α) · α/(ν₁+1) + [(ν₁+1)/α] D(ν₁) · αν₁/[(ν₁+1)(ν₁+3)]
    = ν₁D(ν₁)/(ν₁+3) − 1/(ν₁+1).

    (k)

    E[(∂ ln f/∂μ)(∂ ln f/∂σ)] = [(ν₁+1)²/(σν₁)] (1/[2ασK(ν₁)]²) E[(y_t − μ)(1/L − 1/L²)1(y_t < μ)] + [(ν₂+1)²/(σν₂)] (1/[2(1−α)σK(ν₂)]²) E[(y_t − μ)(1/R − 1/R²)1(y_t > μ)]
    = [(ν₁+1)²/(σν₁)] · ( −2ν₁/[σ(ν₁+1)(ν₁+3)] ) + [(ν₂+1)²/(σν₂)] · ( 2ν₂/[σ(ν₂+1)(ν₂+3)] )
    = (2/σ²) [ (ν₂+1)/(ν₂+3) − (ν₁+1)/(ν₁+3) ]

    (the term −(1/σ)E[∂ ln f/∂μ] vanishes by (71));

    −E[∂² ln f/∂μ∂σ] = (2/σ)[(ν₁+1)/ν₁] (1/[2ασK(ν₁)]²) E[(y_t − μ)L^{−2}1(y_t < μ)] + (2/σ)[(ν₂+1)/ν₂] (1/[2(1−α)σK(ν₂)]²) E[(y_t − μ)R^{−2}1(y_t > μ)]
    = (2/σ)[(ν₁+1)/ν₁] · ( −ν₁/[σ(ν₁+3)] ) + (2/σ)[(ν₂+1)/ν₂] · ( ν₂/[σ(ν₂+3)] )
    = (2/σ²) [ (ν₂+1)/(ν₂+3) − (ν₁+1)/(ν₁+3) ].

    (l). By the symmetry property of ν₁ and ν₂, similarly, we have

    E[(∂ ln f/∂ν₂)²] = −E[∂² ln f/∂ν₂²] = ((1−α)/2) [ ν₂D²(ν₂)/(ν₂+3) − 2D(ν₂)/(ν₂+1) − D′(ν₂) ];

    E[(∂ ln f/∂ν₂)(∂ ln f/∂α)] = −E[∂² ln f/∂ν₂∂α] = 1/(ν₂+1) − ν₂D(ν₂)/(ν₂+3);

    E[(∂ ln f/∂ν₂)(∂ ln f/∂μ)] = −E[∂² ln f/∂ν₂∂μ] = −(1/σ) [ 1/(ν₂+1) − (ν₂+1)D(ν₂)/(ν₂+3) ];

    E[(∂ ln f/∂ν₂)(∂ ln f/∂σ)] = −E[∂² ln f/∂ν₂∂σ] = ((1−α)/σ) [ ν₂D(ν₂)/(ν₂+3) − 1/(ν₂+1) ].


    Table 1a
    Means of simulated MLEs of AST parameters
    T = 1000, 5000; 10000 replications; σ = 1, μ = 0

       T    (α, ν₁, ν₂)     mean α̂   mean ν̂₁   mean ν̂₂   mean σ̂    mean μ̂
     1000   0.3, 0.7, 2.5    0.300     0.709     2.547     1.001     6×10⁻⁴
     5000   0.3, 0.7, 2.5    0.300     0.701     2.511     1.000     3×10⁻⁵
     1000   0.3, 2.5, 2.5    0.301     2.668     2.543     1.001     0.002
     5000   0.3, 2.5, 2.5    0.300     2.534     2.510     1.000     5×10⁻⁴
     1000   0.3, 2.0, 5.0    0.301     2.083     5.311     1.001     0.002
     5000   0.3, 2.0, 5.0    0.300     2.018     5.051     1.000     6×10⁻⁴
     1000   0.8, 0.7, 2.5    0.798     0.702     2.806     1.002    −0.002
     5000   0.8, 0.7, 2.5    0.800     0.700     2.559     1.000    −3×10⁻⁴
     1000   0.8, 2.5, 2.5    0.798     2.533     2.826     1.001    −0.003
     5000   0.8, 2.5, 2.5    0.800     2.506     2.543     1.000    −4×10⁻⁴
     1000   0.8, 2.0, 5.0    0.798     2.021     10.93     1.001    −0.003
     5000   0.8, 2.0, 5.0    0.800     2.003     5.282     1.000    −7×10⁻⁴

    Table 1b
    Simulation standard errors of MLEs of AST parameters
    T = 1000, 5000; 10000 replications; σ = 1, μ = 0

       T    (α, ν₁, ν₂)     se(α̂)    se(ν̂₁)   se(ν̂₂)   se(σ̂)    se(μ̂)
     1000   0.3, 0.7, 2.5    0.022     0.078     0.323    0.042     0.025
     5000   0.3, 0.7, 2.5    0.010     0.033     0.133    0.019     0.011
     1000   0.3, 2.5, 2.5    0.030     0.688     0.334    0.038     0.035
     5000   0.3, 2.5, 2.5    0.013     0.244     0.139    0.017     0.015
     1000   0.3, 2.0, 5.0    0.029     0.435     1.364    0.036     0.035
     5000   0.3, 2.0, 5.0    0.013     0.171     0.471    0.016     0.015
     1000   0.8, 0.7, 2.5    0.023     0.042     1.905    0.051     0.027
     5000   0.8, 0.7, 2.5    0.010     0.019     1.435    0.023     0.012
     1000   0.8, 2.5, 2.5    0.026     0.301     2.535    0.038     0.031
     5000   0.8, 2.5, 2.5    0.011     0.128     0.313    0.017     0.013
     1000   0.8, 2.0, 5.0    0.028     0.207     26.90    0.038     0.034
     5000   0.8, 2.0, 5.0    0.012     0.089     1.228    0.017     0.015
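    The Monte Carlo design behind Tables 1a-1b can be sketched as follows: draw AST samples by inverting the cdf (under the parameterization of (54)-(55), F(μ) = α and each tail is a rescaled Student-t), then maximize the sample log-likelihood. The sampler, optimizer settings and seed below are illustrative choices, not the paper's implementation:

    ```python
    # Sketch of one replication of the simulation experiment: simulate
    # an AST sample by inverse-cdf, then fit all five parameters by ML.
    import math
    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import gammaln
    from scipy.stats import t

    def Kf(v):
        return math.exp(gammaln((v + 1) / 2) - gammaln(v / 2)) / math.sqrt(math.pi * v)

    def ast_rvs(n, a, v1, v2, s, mu, rng):
        u = rng.uniform(size=n)
        left = u < a
        y = np.empty(n)
        y[left] = mu + 2 * a * s * Kf(v1) * t.ppf(u[left] / (2 * a), v1)
        y[~left] = mu + 2 * (1 - a) * s * Kf(v2) * t.ppf(
            0.5 + (u[~left] - a) / (2 * (1 - a)), v2)
        return y

    def negll(p, y):
        a, v1, v2, s, mu = p
        if not (0 < a < 1 and v1 > 0 and v2 > 0 and s > 0):
            return np.inf
        z1 = (y - mu) / (2 * a * s * Kf(v1))
        z2 = (y - mu) / (2 * (1 - a) * s * Kf(v2))
        lo = -(v1 + 1) / 2 * np.log1p(z1 ** 2 / v1)
        hi = -(v2 + 1) / 2 * np.log1p(z2 ** 2 / v2)
        return -(np.where(y < mu, lo, hi) - math.log(s)).sum()

    rng = np.random.default_rng(42)
    truth = (0.3, 2.5, 2.5, 1.0, 0.0)
    y = ast_rvs(5000, *truth, rng)
    fit = minimize(negll, truth, args=(y,), method="Nelder-Mead",
                   options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-8})
    print(np.round(fit.x, 3))  # should lie close to (0.3, 2.5, 2.5, 1.0, 0.0)
    ```

    A full replication of the tables would start the optimizer away from the truth, loop this over 10000 draws and report the mean and standard deviation of the estimates across replications.
    
    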
