Top Banner
Restricted Maximum Likelihood Estimation 1 Running head: Restricted Maximum Likelihood Estimation Implementing Restricted Maximum Likelihood (REML) Estimation in Structural Equation Models Mike W.-L. Cheung National University of Singapore Jan 2012 Cheung, W. L. M. (conditionally accepted). Implementing restricted maximum likelihood (REML) estimation in structural equation models. Structural Equation Modeling.
22

Restricted Maximum Likelihood Estimation 1 Running head ...

Apr 21, 2022

Download

Documents

dariahiddleston
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 1

Running head: Restricted Maximum Likelihood Estimation

Implementing Restricted Maximum Likelihood (REML) Estimation

in Structural Equation Models

Mike W.-L. Cheung

National University of Singapore

Jan 2012

Cheung, W. L. M. (conditionally accepted). Implementing restricted maximum likelihood

(REML) estimation in structural equation models. Structural Equation Modeling.

Page 2: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 2

Abstract

Structural equation modeling (SEM) is now a generic modeling framework for many

multivariate techniques applied in the social and behavioral sciences. Many statistical models

can be considered either as special cases of SEM or as part of the latent variable modeling

framework. One popular extension is the use of SEM to conduct linear mixed-effects modeling

(LMM) such as cross-sectional multilevel modeling and latent growth modeling. It is well known

that LMM can be formulated as structural equation models. However, one main difference

between the implementations in SEM and LMM is that maximum likelihood (ML) estimation is

usually used in SEM while restricted (or residual) maximum likelihood (REML) estimation is

the default method in most LMM packages. This paper shows how REML estimation can be

implemented in SEM. Two empirical examples on latent growth model and meta-analysis are

used to illustrate the procedures implemented in OpenMx. Issues related to implementing REML

in SEM will be discussed.

Key words: structural equation modeling, multilevel modeling, meta-analysis, restricted

maximum likelihood estimation

Page 3: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 3

Implementing Restricted Maximum Likelihood (REML) Estimation

in Structural Equation Models

Structural equation modeling (SEM) is now a generic modeling framework for many

multivariate techniques used in the social and behavioral sciences. Many statistical models can

be considered either as special cases of SEM or as part of the latent variable modeling

framework. To name a few, for example, MANOVA (Raykov, 2001), item response theory

(Takane & Deleeuw, 1987), categorical data analysis (B. O. Muthén, 1984), mixture modeling

(Lubke & B. O. Muthén, 2005), and meta-analysis (Cheung, 2008, 2010, in press). Most popular

SEM packages, e.g., LISREL (Jöreskog & Sörbom, 1996), EQS (Bentler, 2004) and Mplus (B.

O. Muthén & L. K. Muthén, 2010) provide several estimation methods, such as maximum

likelihood (ML), generalized least squares, and weighted least squares. Arguably, ML estimation

is the most popular estimation method in SEM. One reason of its popularity is that ML

estimation is the natural choice when SEM is integrated with other techniques such as mixture

modeling and handing missing data.

One popular extension of SEM is to conduct linear mixed-effects modeling (LMM).

LMM, also known as multilevel models or hierarchical linear models, is yet another popular

technique to model data spanning more than one levels. Popular applications are to model nested

structure in cross-sectional and longitudinal data. It is well known that LMM can be formulated

as structural equation models (Bauer, 2003; Bollen & Curran, 2006; Chou, Bentler, & Pentz,

1998; Curran, 2003; Mehta & Neale, 2005; Mehta, & West, 2000; Rovine & Molenaar, 2000).

Nowadays, most SEM packages have implemented LMM into the general SEM framework. The

Page 4: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 4

main advantage of integrating LMM into the SEM framework is not only that researchers can

use their favorite SEM packages to conduct LMM. The key benefit of the integration is to get the

best of both worlds – multilevel structural equation modeling (MLSEM). By using MLSEM,

models involving latent variables can be analyzed for data spanning more than one levels.

Although the mathematical models of LMM are the same for those implemented in SEM

and in conventional LMM packages such as HLM (Raudenbush & Bryk, 2002), MLwiN

(Goldstein, 2011) and nlme (Pinheiro & Bates, 2000), there are clear differences in these two

implementations. One of them is that predictors are usually considered as random variables in

the SEM framework while they are treated as fixed design matrix in the LMM framework.

Therefore, means and variance-covariance matrix of the predictors have to be estimated in SEM

while they are not estimated in the LMM framework.

Another main difference is on the estimation method. ML estimation is the usual choice

in SEM while both ML and restricted (or residual) maximum likelihood (REML) estimation can

be used in the LMM approach. It is well known in the LMM literature that variance components

based on the ML estimation are negatively biased. REML estimation was proposed to minimize

the bias. However, REML is not without its own limitations. Since the fixed-effect parameters

are removed before estimating the variance components, there is no fixed-effect estimate in

REML method. Ad hoc calculations are required to compute the fixed-effect estimates. In order

words, likelihood ratio test cannot be applied to compare models involving the fixed-effect

parameters. “As to the question 'ML' or 'REML?',” Searle, Casella, and McCulloch (1992)

succinctly summarized that “there is probably no hard and fast answer.” Thus, it was not the

intention of this paper to argue whether ML or REML is preferable. Although there are pros and

Page 5: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 5

cons of using REML, REML is always the default estimation in conventional LMM packages.

Many LMM users prefer REML when analyzing nested data.

Since SEM is becoming more and more popular as an integrated framework for data

analysis, some LMM users may want to use it to conduct LMM. As noted by Bauer (2003, p.

163), one of the current limitations of the SEM approach is that “the restricted maximum

likelihood estimator commonly used with multilevel models currently has no counterpart in

SEM.” The lack of REML estimation can be an obstacle for some users, especially those who

want to replicate their findings in conventional LMM packages.

The main purpose of this paper is to illustrate how REML estimation can be implemented

in SEM. OpenMx (Boker et al., 2011), an open source SEM package in R (R Development Core

Team, 2011), is used to demonstrate the procedures. Readers from the LMM tradition may find

the results more comparable to those obtained from conventional LMM packages. Moreover, this

paper also attempts to shed light on how new estimation method can be implemented in SEM.

The remaining sections of the paper are organized as follows. The next section contains a brief

review of how REML estimation can be obtained via analyzing transformed data and log-

likelihood function. Two empirical examples on latent growth model and meta-analysis are then

presented to demonstrate the procedures. The final section discusses issues of implementing

REML in SEM.

Estimating Parameters with REML Method

There are two equivalent model specifications on LMM. One is based on a single

equation that specifies both fixed effects and random effects (Laird & Ware, 1982) while the

other approach is to represent the models in two levels (Raudenbush & Bryk, 2002). In this paper

Page 6: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 6

we use the model advocated by Laird and Ware (1982). This model specification uses the long

format which is different from the SEM specification that uses the wide format. As we will

discuss later, this model representation makes it easier to implement the REML estimation.

Under this model representation, data of the dependent variable in the level 1 unit are stacked

into a single column vector iy . iy represents the ith group of participants in cross-sectional data

whereas it represents repeated measures of the ith subject in longitudinal data. This format is

generally known as the long format. The model for iy of the ith unit is

iiiii +Z+X= euβy (1)

where iX is the design matrix of the fixed effects, β is the parameter vector of the fixed effects,

iZ is the design matrix of the random effects, iu is the random effects for the ith unit and ie is

the residuals (Pinheiro & Bates, 2000). The random effects and the residuals are normally

distributed with GNi 0,u and ii RN 0,e . It should be noted that iX is the design matrix

combining both level-1 and level-2 predictors. Moreover, the between group variance

components G is the same for all units while Ri may vary across units.

The log-likelihood function without the constant term on iy is

)()(log2

1-log 1 βyβyy iii

TiiiiiML XVX+V=;RG,β,l , (2)

where iTiii R+GZZ=V . This log-likelihood function is the same as that in SEM by setting the

model implied mean vectors and variance covariance matrix as βθμ ii X= and

iTiii R+GZZ=Σ θ , respectively. Because of this similar between SEM and LMM, SEM may be

Page 7: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 7

used to conduct latent growth models (Bollen & Curran, 2006) and cross-sectional multilevel

analysis (Mehta & Neale, 2005). The estimates are based on the ML estimation because both

fixed-effects and variance components are estimated simultaneously.

One problem of the ML estimation is that the estimated variance components are

negatively biased. Take the ML estimate of variance in normally distributed data as an example.

If we knew the population mean , an unbiased estimate on variance is nμxi /2 where n is

the sample size. Since we seldom know the population mean in practice, the ML estimate of the

variance is nxxi /2 where x is the sample mean. It is always true that

nμxnxx ii // 22 . Thus, the ML estimate of variance is negatively biased. A well-

known unbiased estimate on variance is 1/2 nxxi that adjusts the uncertainty in

estimating x . The situation is more complicated in LMM because it cannot be corrected by a

simple scalar adjustment.

The basic idea behind the REML estimation is to remove the fixed-effects parameters

before estimating the variance components. A contrast matrix is chosen in such a way that the

fixed-effects parameters are not estimated. Since the fixed-effects parameters are not part of the

model, the estimated variance components will not be biased by treating the fixed-effects

estimates as known. Let us consider the model of stacking all iy into a single column vector,

eβy +Zu+X= (3)

where X and e are stacked over all the units, and kZ,ZZDiag=Z ...2,1, and

k,Diag=u uuu ...2,1, are block diagonal matrices. Instead of analyzing y with ML estimation,

Page 8: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 8

we may analyze its residuals y~ . The model on y~ is

eβy +ZuA+AX=~ (4)

where TT XXXXI=A1

with p arbitrary rows removed and I is an identity matrix and p is

the number of columns of X (Harville, 1977; Patterson & Thompson, 1971). Several

characteristics bear explaining. After the calculations, the contrast A (without deleting the p

arbitrary rows) is a k by k matrix where k is the length of y~ . Since the rank of this matrix is (k-

p), it is not of full rank. Thus, p redundant rows have to be deleted. Harville (1977) showed that

these p rows can be arbitrarily selected without affecting the results. The common practice is to

delete the last p rows. Therefore, A becomes a (k-p) by k matrix after deleting the last p rows. It

is of importance to note that y~ is now a column vector with (k-p) rows. In practice, the transpose

of it is used in the analysis. That is, the final data is a row vector with (k-p) columns of variables.

After the transformation, the expected value of y~ is

0

ββ

βy

=

XXXXXX=

XXXXXI=E

TT

TT

1

1~

(5)

The population means of y~ is always zero regardless of what β is. Effectively, the variance

components can be estimated without estimating β. The expected variance-covariance matrix of

y~ is

TAVA=Cov y~ (6)

Page 9: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 9

where iV,VVDiag=V ...2,1, . Before the transformation, the between unit of y is independent.

However, the between units become systematically related after the transformation. In the

context of SEM, the variance components can be estimated by fitting a model with 0θ =μ and

TT AR+ZGZA=Σ θ .

Alternatively, the log-likelihood function of the model may be used to directly estimate

the variance components with REML. The log-likelihood function without the constant term on

y~ is

XVX+XVX+V=R;G,l TTREML

11 )()(log2

1-log αyαyy (7)

where yα 111 VXXVX= TT . Although α is in the log-likelihood function, it is not a function

of β . Thus, the log-likelihood function does not include the fixed-effects parameters. Once the

variance components have been estimated, we may computer the fixed-effects by

yβ 111 ˆ)ˆ(ˆ VXXVX TT . (8)

In the context of SEM, we may fit βθ X=μ by fixing R+ZGZ=Σ T ˆˆθ obtained from the

REML estimation.

Empirical Examples

Two examples on the latent growth model and mixed-effects meta-analysis were used to

illustrate the procedures of the REML estimation. Since latent growth model is fitted the same

way as cross-sectional multilevel model under the LMM framework, results can be readily

applicable to cross-sectional data. OpenMx (Boker et al., 2011) was used to conduct the analysis.

R code for the analyses based on the transformed data and on the log-likelihood is listed in the

Page 10: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 10

Appendix.

Latent growth model. A classic example from Potthoff and Roy (1964) was used to

illustrate fitting a linear growth model. The data set consists of measurements of the distance

from the pituitary gland to the pterygomaxillary fissure taken every two years from 8 years of

age until 14 years of age on a sample of 27 children. This data set has been used in many LMM

papers and textbooks (see Pinheiro & Bates, 2000 for the details). Eight was subtracted from

years of age to improve the numerical stability. Thus, the new intercept is located at the age of 8

though it is not estimated under the REML estimation. The model for one subject is

e

e

e

e

u

u

y

y

y

y

1

0

1

0

4

3

2

1

61

41

21

01

61

41

21

01

. (9)

where 0β and 1β are the average intercept and average slope, respectively. To be consistent with

conventional assumptions in LMM, the level-1 residuals are assumed homogeneous.

Nevertheless, this assumption can be easily relaxed. The REML estimates of )var(1

0

u

u is

051.0089.0

559.3 while the ML estimate is

046.0095.0

383.3. The ML estimate of the variance

component is slightly smaller than that with REML estimation.

Meta-analysis. The second example draws from a classic data set in meta-analysis.

Colditz et al. (1994) collected effect sizes from 13 clinical trials examining the effectiveness of

the Bacillus Calmette-Guerin (BCG) vaccine for preventing tuberculosis. A total of 1264 articles

or abstracts were identified for meta-analysis on the effectiveness of the BCG. Thirteen studies

Page 11: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 11

were used in the illustration. The same data set has been used in many papers illustrating meta-

analytic techniques (e.g., Berkey et al., 1995; Viechtbauer & Cheung, 2010).

The effect size is the relative risk that indicates the ratio of the risk of having tuberculosis

in the BCG vaccine group compared to a group without BCG vaccine. Publication year was used

as a potential study characteristic to predict the effect size. The earliest publication year of the

data was 1948 that was subtracted from the data to improve numerical stability. This analysis is

also known as the mixed-effects meta-analysis or meta-regression. The model is

13

2

1

0

0

0

1

0

13

2

1

00

00

00

100

010

001

281

11

01

e

e

e

u

u

u

y

y

y

, (10)

where 0β and 1β are the average intercept and the slope, respectively. Mixed-effects meta-analysis

can be considered as a special case of LMM (Raudenbush & Bryk, 2002) and SEM (Cheung,

2008, in press) by treating the sampling variance var(e1) to var(e13) as fixed and known. The

estimated amount of heterogeneity based on the REML is 0.267. In contrast, the ML estimate of

the amount of heterogeneity is 0.212 which is slightly smaller than that with REML.

Discussion

This paper shows how REML estimation, a popular estimation method in LMM, can be

implemented in SEM. Instead of analyzing the raw data, variance components based on the

REML estimation can be obtained by analyzing the transformed data in which the fixed-effects

parameters have been removed. Since the transformed data do not include the fixed-effects

parameters, REML can efficiently adjust the negatively bias of the ML estimates. The following

Page 12: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 12

sections discuss some possible extensions and issues related to the implementation of REML

estimation in SEM.

Computation Efficiency and Numerical Stability

As demonstrated above, there are two equivalent approaches to implementing the REML

estimation in SEM. One approach is to analyze the transformed data with Equations 4 to 6. The

other approach is to fit the log-likelihood function in Equation 7 directly. Analyzing the

transformed data seems to be more appealing to many SEM users because only the implied mean

and implied variance-covariance matrix are required. There is one main limitation, however. The

input data is a (k-p) columns of variables with one row (or subject in SEM). Mehta and Neale

(2005) has also noted that this approach was not efficient when “subjects” are treated as

“variables” in handling nested data. The situation becomes worse in implementing REML

estimation. It is because the implied variance-covariance matrix is a matrix with many

constraints as shown in Equation 6. Thus, numerical stability may be an issue. On the other hand,

the approach based on the log-likelihood function is more stable. The main limitation is that most

SEM packages cannot implement arbitrary fitting functions. Future research needs to address

how to obtain REML estimates more efficiently.

Possible Extensions

One main characteristic of the model in Equation 3 is that it specifies the fixed-effects

design matrix and the random-effects design matrix separately. A transformation matrix based on

the fixed effects is used to remove the fixed-effects parameters. This means that the REML

estimation can be readily extended to other LMM such as multivariate LMM, multivariate meta-

analysis and cross-classified LMM (Goldstein, 2011). As long as these models can be specified

Page 13: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 13

via Equation 3, the transformation matrix A can be calculated to remove the fixed-effects

parameters. For example, Cheung (2011) uses a SEM approach to conduct univariate and

multivariate meta-analysis. REML using Equation 7 may be used to obtain the variance

components in meta-analysis.

Generalization to Models with Latent Variables

The current REML implementation is based on the model in Equations 3 and 4 that

specify the fixed effects and the random effects separately. This approach does not work for

models that cannot be parameterized as Equation 3. Take a nonlinear growth model with

estimated factor loadings as an example (Bollen & Curran, 2006). Suppose there are 4 measures,

the model with estimated factor loadings is

e

e

e

e

u

u

y

y

y

y

1

0

4

31

0

4

3

4

3

2

1

1

1

11

01

1

1

11

01

. (11)

This model captures the non-linear trajectories in development by estimating the trajectories

from data empirically. Since 3λ and 4λ are included in both design matrices, it is not possible to

find a transformation matrix A without estimating 3λ and 4λ . Further research should address

how REML estimation can be implemented in general MLSEM.

To summarize, the addition of REML estimation in SEM makes the SEM approach more

appealing to the LMM users. They may find the results more comparable to conventional LMM

packages. This approach is especially useful when the number of subjects is small.

Page 14: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 14

References

Bauer, D. (2003). Estimating multilevel linear models as structural equation models. Journal of

Educational and Behavioral Statistics, 28(2), 135-167. doi:10.3102/10769986028002135

Bentler, P. M. (2004). EQS 6 [Computer software]. Encino, CA: Multivariate Software.

Berkey, C. S., Hoaglin, D. C., Mosteller, F., & Colditz, G. A. (1995). A random-effects regression

model for meta-analysis. Statistics in Medicine, 14(4), 395-411.

doi:10.1002/sim.4780140406

Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., Spies, J., et al. (2011).

OpenMx: An Open Source Extended Structural Equation Modeling Framework.

Psychometrika, 76(2), 306-317. doi:10.1007/s11336-010-9200-6

Bollen, K. A., & Curran, P. (2006). Latent Curve Models: A Structural Equation Perspective.

Hoboken, N.J.: Wiley-Interscience.

Cheung, M. W. L. (2008). A model for integrating fixed-, random-, and mixed-effects meta-

analyses into structural equation modeling. Psychological Methods, 13(3), 182-202.

doi:10.1037/a0013163

Cheung, M. W. L. (2010). Fixed-effects meta-analyses as multiple-group structural equation

models. Structural Equation Modeling: A Multidisciplinary Journal, 17(3), 481-509.

doi:10.1080/10705511.2010.489367

Cheung, M. W. L. (2011). metaSEM: Meta-analysis: A Structural Equation Modeling Approach.

R package version 0.7-0. Retrieved November 11, 2011, from

http://courses.nus.edu.sg/course/psycwlm/Internet/metaSEM/

Cheung, M. W. L. (in press). Multivariate meta-analysis as structural equation models. Structural

Page 15: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 15

Equation Modeling: A Multidisciplinary Journal.

Chou, C.-P., Bentler, P. M., & Pentz, M. A. (1998). Comparisons of two statistical approaches to

study growth curves: The multilevel model and the latent curve analysis. Structural

Equation Modeling: A Multidisciplinary Journal, 5(3), 247.

doi:10.1080/10705519809540104

Colditz, G. A., Brewer, T. F., Berkey, C. S., Wilson, M. E., Burdick, E., Fineberg, H. V., &

Mosteller, F. (1994). Efficacy of BCG Vaccine in the Prevention of Tuberculosis. JAMA:

The Journal of the American Medical Association, 271(9), 698 -702.

doi:10.1001/jama.1994.03510330076038

Curran, P. (2003). Have multilevel models been structural equation models all along?

Multivariate Behavioral Research, 38(4), 529-568. doi:10.1207/s15327906mbr3804_5

Goldstein, H. (2011). Multilevel statistical models (4th ed.). Hoboken, N.J.: Wiley.

Harville, D. A. (1977). Maximum Likelihood Approaches to Variance Component Estimation

and to Related Problems. Journal of the American Statistical Association, 72(358), 320-

338. doi:10.2307/2286796

Jöreskog, K. G., & Sörbom, D. (1996). LISREL 8: A user’s reference guide. Chicago, IL:

Scientific Software International, Inc.

Laird, N. M., & Ware, J. H. (1982). Random-Effects Models for Longitudinal Data. Biometrics,

38(4), 963-974. doi:10.2307/2529876

Lubke, G. H., & Muthén, B. (2005). Investigating Population Heterogeneity With Factor Mixture

Models. Psychological Methods, 10(1), 21-39. doi:10.1037/1082-989X.10.1.21

Mehta, P. D., & Neale, M. C. (2005). People are variables too: multilevel structural equations

Page 16: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 16

modeling. Psychological Methods, 10(3), 259-284. doi:10.1037/1082-989X.10.3.259

Mehta, P., & West, S. (2000). Putting the individual back into individual growth curves.

Psychological Methods, 5(1), 23-43. doi:10.1037//1082-989X.5.1.23

Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical,

and continuous latent variable indicators. Psychometrika, 49(1), 115-132.

doi:10.1007/BF02294210

Muthén, B. O., & Muthén, L. K. (2010). Mplus user’s guide (6th ed.). Los Angeles, CA: Muthén

& Muthén.

Patterson, H. D., & Thompson, R. (1971). Recovery of inter-block information when block sizes

are unequal. Biometrika, 58(3), 545 -554. doi:10.1093/biomet/58.3.545

Pinheiro, J. C., & Bates, D. M. (2000). Mixed Effects Models in S and S-Plus. New York:

Springer.

Potthoff, R. F., & Roy, S. N. (1964). A generalized multivariate analysis of variance model useful

especially for growth curve problems. Biometrika, 51(3-4), 313 -326.

doi:10.1093/biomet/51.3-4.313

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: applications and data

analysis methods. Thousand Oaks: Sage Publications.

Raykov, T. (2001). Testing Multivariable Covariance Structure and Means Hypotheses via

Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary

Journal, 8(2), 224. doi:10.1207/S15328007SEM0802_4

R Development Cor Team. (2011). R: A Language and Environment for Statistical Computing.

Vienna, Austria. Retrieved from http://www.R-project.org/

Page 17: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 17

Rovine, M., & Molenaar, P. (2000). A structural modeling approach to a multilevel random

coefficients model. Multivariate Behavioral Research, 35(1), 51-88.

doi:10.1207/S15327906MBR3501_3

Searle, S. R., Casella, G., & McCulloch, C. E. (1992). Variance Components. New York: Wiley-

Interscience.

Takane, Y., & Deleeuw, J. (1987). On the relationship between item response theory and factor-

analysis of discretized variables. Psychometrika, 52(3), 393-408.

doi:10.1007/BF02294363

Viechtbauer, W., & Cheung, M. W. L. (2010). Outlier and influence diagnostics for meta-

analysis. Research Synthesis Methods, 1(2), 112-125. doi:10.1002/jrsm.11

Page 18: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 18

Author Note

Mike W.-L. Cheung, Department of Psychology, National University of Singapore.

This research was supported by the Academic Research Fund Tier 1 (R581-000-111-112)

from the Ministry of Education, Singapore.

Correspondence concerning this article should be addressed to Mike W.-L. Cheung,

Department of Psychology, Faculty of Arts and Social Sciences, National University of

Singapore, Block AS4, Level 2, 9 Arts Link, Singapore 117570. Tel: (65) 6516-3702; Fax: (65)

6773-1843; E-mail: [email protected].

Page 19: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 19

Endnote

1The data sets, R code, and output are available at

http://courses.nus.edu.sg/course/psycwlm/internet/remlSEM.zip.

Page 20: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 20

Appendix

Obtaining Variance Components with REML in R:

Illustration with Linear Growth Model and Mixed-effects Meta-analysis

library(OpenMx) # Library to conduct SEM #### Example on linear growth model data(Orthodont, package="nlme") # Sample data Orthodont # Show the data y <- Orthodont$distance k <- length(y) # no. of effect sizes n <- 27 X <- cbind(1, Orthodont$age-8) # Design matrix p <- ncol(X) # no. of predictors + intercept # Numerically more stable approach: N <- diag(k) - X %&% solve( crossprod(X) ) N <- diag(k) - X %*% solve( t(X)%*% X ) %*% t(X) # M is the trasformation matrix M <- N[, 1:(k-p)] # remove redundant columns # Transformed effect size reml_y <- matrix(y, nrow=1) %*% M remlVars <- paste("X", 1:(k-p), sep="") dimnames(reml_y) <- list(NULL, remlVars) # Analysis based on the transformed effect size reml1 <- mxModel("REML for growth model with transformation", mxData(observed=reml_y, type="raw"), mxMatrix("Full", nrow=k, ncol=(k-p), values=c(M), free=FALSE, name="M"), mxMatrix("Full", ncol=2, nrow=4, free=FALSE, values=c(rep(1,4),c(0,2,4,6)), byrow=FALSE, name="lambda"), mxMatrix("Symm", ncol=2, nrow=2, free=TRUE, values=c(3,0.1,0.5), labels=c("var_intercept","cov_is","var_slope"), name="cov_f"), mxMatrix("Diag", ncol=4, nrow=4, free=TRUE, values=2, labels=c("var_e","var_e","var_e","var_e"), name="cov_e"), mxMatrix("Iden", ncol=n, nrow=n, name="Id"), mxAlgebra( t(M) %&% (Id %x% (lambda %&% cov_f + cov_e)), name="expCov"), mxMatrix("Full", ncol=(k-p), nrow=1,free=FALSE,values=0, name="expMean"), mxFIMLObjective(covariance="expCov", means="expMean", dimnames=remlVars) ) reml1.fit <- mxRun(reml1) summary(reml1.fit) # Analysis based on the log-likelihood reml2 <- mxModel("REML for growth model with likelihood function", mxMatrix(type="Full", nrow=k, ncol=1, values=y, free=FALSE, name="Y"), mxMatrix(type="Full", nrow=k, ncol=p, values=c(X), free=FALSE, name="X"), mxMatrix("Full", ncol=2, nrow=4, free=FALSE, values=c(rep(1,4),c(0,2,4,6)), byrow=FALSE, name="lambda"), mxMatrix("Symm", ncol=2, nrow=2, free=TRUE, values=c(3,0.1,0.5),

Page 21: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 21

labels=c("var_intercept","cov_is","var_slope"), name="cov_f"), mxMatrix("Diag", ncol=4, nrow=4, free=TRUE, values=2, labels=c("var_e","var_e","var_e","var_e"), name="cov_e"), mxMatrix("Iden", ncol=n, nrow=n, name="Id"), mxAlgebra( Id %x% (lambda %*% cov_f %*% t(lambda) + cov_e), name="V_new"), mxAlgebra( solve(V_new), name="W"), mxAlgebra( solve(t(X)%*%W%*%X) %*% t(X) %*% W %*% Y, name="alpha"), mxAlgebra( ( log(det( V_new )) + log(det(t(X)%*%W%*%X)) + t(Y-X%*%alpha)%*%W%*%(Y-X%*%alpha) ), name="obj"), mxAlgebraObjective("obj") ) reml2.fit <- mxRun(reml2) summary(reml2.fit) #### Example on mixed-effects meta-analysis data(dat.bcg) # Sample data # Calculate effect size dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, append=TRUE) dat # Show the data y <- dat$yi # Effect size v <- dat$vi # Sampling variance X <- cbind(1, dat.bcg$year) # Design matrix k <- length(y) # no. of studies p <- ncol(X) # no. of predictors + intercept # Numerically more stable approach: N <- diag(k) - X %&% solve( crossprod(X) ) N <- diag(k) - X %*% solve( t(X)%*% X ) %*% t(X) # M is the trasformation matrix M <- N[, 1:(k-p)] # remove redundant columns # transformed effect size y_star <- matrix(y, nrow=1) %*% M selVars <- paste("X", 1:(k-p), sep="") dimnames(y_star) <- list(NULL, selVars) # Analysis based on the transformed effect size ma1 <- mxModel("REML for meta analysis with transformation", mxData(observed=y_star, type="raw"), mxMatrix(type="Diag", nrow=k, ncol=k, values=c(v), free=FALSE, name="V"), mxMatrix(type="Diag", nrow=k, ncol=k, values=0.5, lbound=c(0.00001), free=c(TRUE), labels=rep("tau2", k), name="Tau2"), mxMatrix(type="Full", nrow=k, ncol=(k-p), values=c(M), free=FALSE, name="M"), mxAlgebra( t(M) %&% (Tau2+V), name="expCov"), mxMatrix(type="Full", nrow=1, ncol=(k-p), free=FALSE, values=0, name="expMean"), mxFIMLObjective(means="expMean", covariance="expCov", dimnames=selVars) ) ma1.fit <- mxRun(ma1) summary(ma1.fit)

Page 22: Restricted Maximum Likelihood Estimation 1 Running head ...

Restricted Maximum Likelihood Estimation 22

# Analysis based on the log-likelihood ma2 <- mxModel("REML for meta analysis with likelihood function", mxMatrix(type="Full", nrow=k, ncol=1, values=y, free=FALSE, name="Y"), mxMatrix(type="Diag", nrow=k, ncol=k, values=c(v), free=FALSE, name="V"), mxMatrix(type="Full", nrow=k, ncol=2, values=c(X), free=FALSE, name="X"), mxMatrix(type="Diag", nrow=k, ncol=k, values=0.5, lbound=c(0.00001), free=c(TRUE), labels=rep("tau2", k), name="Tau2"), mxAlgebra( solve(Tau2+V), name="W"), mxAlgebra( solve(t(X)%*%W%*%X) %*% t(X) %*% W %*% Y, name="alpha"), mxAlgebra( ( log(det(V+Tau2)) + log(det(t(X)%*%W%*%X)) + t(Y-X%*%alpha)%*%W%*%(Y-X%*%alpha) ), name="obj"), mxAlgebraObjective("obj") ) ma2.fit <- mxRun(ma2) summary(ma2.fit)