Sprott Letters Working Papers
A Critique of Partial Least Squares, and a Preliminary Assessment of an Alternative Estimation Method
D. Roland Thomas, Irene R.R. Lu, and Marzena Cedzynski
February 2007 SL 2007-002
About the authors Irene R. R. Lu is Assistant Professor at the School of Administrative Studies, York University, and a graduate of the Ph.D. in Management program of the Sprott School of Business. D. Roland Thomas is Professor of Quantitative Methods, and Marzena Cedzynski a doctoral candidate, in the Sprott School of Business.
The research of the first author was supported by grants from the Natural Sciences and Engineering Research Council of Canada and from the National Program on Complex Data Structures.
Abstract Partial least squares (PLS) is sometimes used as an alternative to covariance-based structural equation modeling (SEM). This paper briefly reviews currently available SEM techniques, and provides a critique of the perceived advantages of PLS over covariance-based SEM as commonly cited by PLS users. Specific attention is drawn to the primary disadvantage of PLS, namely the lack of consistency of its parameter estimates. The instrumental variables (IV) / two stage least squares (2SLS) method of estimation is then described and presented as a potential alternative to PLS that might yield its perceived advantages without succumbing to its primary disadvantage. Preliminary simulation results show that: PLS parameter estimates exhibit substantial bias when the number of items is moderate; SEM-based methods yield lower bias; and IV/2SLS estimates may indeed provide a viable ordinary least squares (OLS)-based alternative to PLS.
PA5. The apparently common belief that PLS does not require interval-scaled data is
bizarre, and totally incorrect. Nothing in the theory of OLS regression justifies such a claim.
PA6 and PA10. The claim that PLS is predictive as opposed to confirmatory derives from
the use of OLS regression, which maximizes $R^2$ for each estimated equation in the system.
However, the principle of least squares is also fundamental to confirmatory techniques such as
analysis of variance (ANOVA), so that the predictive claim should be treated cautiously. Many
studies based on PLS discuss their results entirely in confirmatory terms (see, for example,
Barclay et al., 1995) and do not use predictive measures, e.g., cross-validation. Also, contrary to
the implication of PA10 that only PLS generates latent variable scores, scores can be predicted
once the parameters of the measurement models (2) and (3) have been estimated by covariance-based
SEM techniques. The “regression” method for generating factor scores yields individual
predictions (scores) that minimize the mean-squared error of prediction, i.e., they minimize $E\|\hat{\eta} - \eta\|^2$. Thus covariance-based SEM can also be said to be prediction oriented. The package
Mplus offers a wide range of scoring methods, for both continuous and discrete SEM methods.
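The mean-squared-error property of the regression method can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes a one-factor measurement model with hypothetical loadings, treats the model parameters as known rather than estimated, and compares regression-method scores with Bartlett scores, another common linear scoring rule. All names and numerical values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-factor measurement model: y_j = lam_j * eta + eps_j
lam = np.array([0.9, 0.8, 0.7, 0.6, 0.5])  # assumed loadings
theta = 1.0 - lam**2                        # unique variances, so Var(y_j) = 1
phi = 1.0                                   # assumed variance of the latent variable

n = 20000
eta = rng.normal(0.0, np.sqrt(phi), size=n)
eps = rng.normal(0.0, np.sqrt(theta), size=(n, lam.size))
y = eta[:, None] * lam + eps

# Model-implied covariance matrix of the items
sigma = phi * np.outer(lam, lam) + np.diag(theta)

# Regression-method weights: Sigma^{-1} * lam * phi
w_reg = np.linalg.solve(sigma, lam * phi)

# Bartlett weights: Theta^{-1} lam / (lam' Theta^{-1} lam)
w_bart = (lam / theta) / (lam @ (lam / theta))

scores_reg = y @ w_reg
scores_bart = y @ w_bart

# Empirical mean-squared error of prediction for each scoring rule
mse_reg = np.mean((scores_reg - eta) ** 2)
mse_bart = np.mean((scores_bart - eta) ** 2)
print(f"MSE regression scores: {mse_reg:.4f}")
print(f"MSE Bartlett scores:   {mse_bart:.4f}")
```

Under these assumptions the regression scores should show the smaller mean-squared error, at the cost of being shrunk towards zero, whereas the Bartlett scores are conditionally unbiased but more variable.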
PA7. OLS methods applied to variables that are free of measurement error do yield
consistent parameter estimates even when observations are correlated. However, corresponding
parameter standard errors will be biased. PLS uses resampling methods to estimate these standard
errors, and standard resampling methods require independent, identically distributed
observations. Thus, if finite item bias is small enough to be ignored, the claim will be correct if
applied to point estimates, but false whenever inferential techniques are applied.
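The distinction between point estimates and inference can be sketched with a small simulation. The Python example below is an illustration under assumed conditions, not a reproduction of any analysis in this paper: observations are generated in hypothetical clusters that share a predictor value and a random effect, the variables are free of measurement error, and a naive bootstrap that resamples rows as if they were independent is compared with the actual sampling variability of the OLS slope.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_clustered_data(n_clusters=50, cluster_size=10, beta=0.5):
    """Observations within a cluster share a predictor value and a random effect."""
    x = np.repeat(rng.normal(size=n_clusters), cluster_size)
    u = np.repeat(rng.normal(size=n_clusters), cluster_size)  # shared cluster effect
    y = beta * x + u + rng.normal(size=x.size)
    return x, y

def ols_slope(x, y):
    """Slope of the simple OLS regression of y on x (with intercept)."""
    xc = x - x.mean()
    return xc @ (y - y.mean()) / (xc @ xc)

# Sampling distribution of the OLS slope over repeated clustered samples
slopes = np.array([ols_slope(*make_clustered_data()) for _ in range(1000)])
mean_slope = slopes.mean()      # point estimates should centre near beta = 0.5
true_sd = slopes.std(ddof=1)    # actual sampling variability of the slope

# Naive bootstrap on a single sample: rows are resampled as if they were i.i.d.
x, y = make_clustered_data()
idx = rng.integers(0, x.size, size=(500, x.size))
boot_slopes = np.array([ols_slope(x[i], y[i]) for i in idx])
naive_boot_se = boot_slopes.std(ddof=1)

print(f"mean slope over replications: {mean_slope:.3f}")
print(f"true sampling SD of slope:    {true_sd:.3f}")
print(f"naive bootstrap SE:           {naive_boot_se:.3f}")
```

In this setup the average slope should sit close to the true value, while the naive bootstrap standard error should understate the true sampling variability, which is the sense in which the claim holds for point estimates but not for inference.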
PA8. This belief is false, as explained in the earlier description of PLS, unless the number
of items is very large and/or the measurement error is negligible.
Instrumental Variables (IV) / Two Stage Least Squares (2SLS)
Though popular in econometrics, the IV/2SLS approach to estimation has seen relatively
limited use in the fields of factor analysis and structural equation modeling. The IV/2SLS
approach to estimation in fact refers to a family of related techniques, among which several
other approaches under all conditions, particularly for five items per latent variable. Even with
Cronbach's alpha values above 0.85, biases in the PLS estimates are still appreciable. The bias in PLS is
similar in magnitude to that obtained when standardized total scores are directly used in an OLS
regression (Lu, 2004). The decrease in PLS bias predicted by measurement error theory is also
evident in the results of Table 2, biases for 10 items being considerably smaller than for 5 items
per latent variable. The IV/2SLS estimates do not depend on the number of items and produce
smaller biases, confirming the earlier suggestion that IV/2SLS is a potential alternative to PLS
that does not share its primary disadvantage. The ML-SEM and the discrete-SEM also exhibit
much smaller biases than PLS (all below 3%). For the sample sizes considered here, neither ML-SEM
nor discrete-SEM exhibited convergence problems. As noted, this comparison of PLS with
the SEM and IV/2SLS approaches is preliminary. A more detailed comparison under different
conditions, e.g., sample size, model complexity, number of categories, skewness, kurtosis, is
needed before definitive conclusions can be drawn.
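The qualitative pattern described above can be reproduced in a small Python sketch, offered as a simplified illustration rather than the simulation design used for Table 2. Two substitutions are made and should be kept in mind: OLS on standardized total scores stands in for PLS (the text notes that the two show bias of similar magnitude), and the IV/2SLS estimator is a bare-bones scaling-indicator version in the spirit of Bollen (1996), with the first item of each latent variable as the scaling indicator and the remaining items of the exogenous latent variable as instruments. The true coefficient, error variances, sample size, and function names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
BETA = 0.6  # hypothetical true structural coefficient (latent variances are 1)

def simulate(n, n_items, err_sd=1.0):
    """Two latent variables, each measured by n_items items with unit loadings."""
    xi = rng.normal(size=n)
    eta = BETA * xi + rng.normal(scale=np.sqrt(1 - BETA**2), size=n)
    x = xi[:, None] + rng.normal(scale=err_sd, size=(n, n_items))
    y = eta[:, None] + rng.normal(scale=err_sd, size=(n, n_items))
    return x, y

def ols(X, y):
    """OLS coefficients of y on X, with an intercept prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

def total_score_slope(x, y):
    """OLS slope using standardized total scores (stand-in for PLS)."""
    sx, sy = x.sum(axis=1), y.sum(axis=1)
    zx = (sx - sx.mean()) / sx.std()
    zy = (sy - sy.mean()) / sy.std()
    return ols(zx, zy)[1]

def iv_2sls_slope(x, y):
    """2SLS: scaling indicators x[:, 0] and y[:, 0]; instruments are the other x items."""
    Z = x[:, 1:]
    g = ols(Z, x[:, 0])                # stage 1: regress scaling indicator on instruments
    x1_hat = g[0] + Z @ g[1:]
    return ols(x1_hat, y[:, 0])[1]     # stage 2: regress y scaling indicator on fitted values

results = {}
for n_items in (5, 10):
    est = np.array([[total_score_slope(x, y), iv_2sls_slope(x, y)]
                    for x, y in (simulate(500, n_items) for _ in range(200))])
    results[n_items] = est.mean(axis=0) / BETA - 1.0  # relative bias of each estimator
    print(n_items, "items: total-score bias %.3f, 2SLS bias %.3f" % tuple(results[n_items]))
```

Under these assumptions the total-score estimates should be attenuated, less so with 10 items than with 5, while the 2SLS estimates should stay close to the true coefficient for both item counts.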
Table 3 displays the effects of sample size and the number of items per latent variable on
the biases in $R^2$ for the two IV/2SLS methods. It can be seen that the biases of both approaches
are relatively insensitive to sample size and to the number of manifest items for all conditions
shown, all relative biases being below 3%.
Summary and Conclusions
The primary advantages claimed for PLS over covariance-based SEM have been
examined and while some have little basis, others are legitimate. However, PLS suffers from a
serious deficiency, namely that its parameter estimates lack consistency. The IV/2SLS technique
shares the primary advantages claimed for PLS, namely freedom from distributional assumptions,
and robustness to model misspecifications. In fact, IV/2SLS is superior to PLS in the latter
regard, being robust to misspecifications of a clearly specified form. In addition, since the
IV/2SLS approach is non-iterative, and requires only two applications of OLS regression,
convergence and model identification (given sufficient IVs) are not issues. Moreover, the
IV/2SLS technique is very easy to use, being programmed as a single step in most statistical
software, such as SAS, SPSS and Stata. Finally, it generates consistent parameter estimates.
Preliminary simulation results confirm that finite item bias in PLS parameter estimates can be
serious, and that IV/2SLS is a potential alternative to PLS that is free of this problem. However,
further studies are required before definitive recommendations can be made.
References
Arminger, G., & Schoenberg, R. J. (1989). Pseudo-maximum likelihood estimation and a test for misspecification in mean and covariance structure models. Psychometrika, 54, 409-425.
Barclay, D. W., Higgins, C., & Thompson, R. (1995). The partial least squares (PLS) approach to causal modeling: Personal computer adaptation and use as an illustration. Technology Studies, 2(2), 285-309.
Bentler, P. M. (1982). Confirmatory factor analysis via noniterative estimation: A fast, inexpensive method. Journal of Marketing Research, 19, 417-424.
Bollen, K. A. (1996). An alternative two stage least squares (2SLS) estimator for latent variable equations. Psychometrika, 61, 109-121.
Bollen, K. A. (2001). Two-stage least squares and latent variable models: Simultaneous estimation and robustness to misspecifications. In R. Cudeck, S. Du Toit, & D. Sörbom (Eds.), Structural equation modeling: Present and future (pp. 119-138). Lincolnwood, IL: Scientific Software.
Bollen, K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53, 605-634.
Bollen, K. A., & Stine, R. A. (1990). Direct and indirect effects: Classical and bootstrap estimates of variability. In C. C. Clogg (Ed.), Sociological methodology (pp. 115-140). Oxford: Basil Blackwell.
Boomsma, A. (1983). On the robustness of LISREL (maximum likelihood estimation) against small sample size and nonnormality. Amsterdam: Sociometric Research Foundation.
Browne, M.W. (1984). Asymptotic distribution free methods in analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62-83.
Chin, W.W. (1995). Partial least squares is to LISREL as principal components analysis is to factor analysis. Technology Studies, 2, 315-319.
Chin, W.W. (1998). The partial least squares approach to structural equation modeling. In G. A. Marcoulides (Ed.), Modern methods for business research (pp. 295-336). Mahwah, New Jersey: Lawrence Erlbaum Associates.
Chin, W.W. (2001). PLS-graph user’s guide. C.T. Bauer College of Business, University of Houston, USA.
Croon, M. (2002). Using predicted latent scores in general latent structure models. In G. A. Marcoulides & I. Moustaki (Eds.), Latent variable and latent structure models (pp. 195-223). Mahwah, NJ: Lawrence Erlbaum Associates.
Dijkstra, T. (1983). Some comments on maximum likelihood and partial least squares methods. Journal of Econometrics, 21, 67-90.
Fuller, W.A. (1987). Measurement error models. New York: Wiley.
Gefen, D., Straub, D. W., & Boudreau, M. C. (2000). Structural equation modeling and regression: Guidelines for research practice. Communications of the Association for Information Systems, 4(7), 1-77.
Hanafi, M., & Qannari, E. M. (2005). An alternative algorithm to the PLS B problem. Computational Statistics and Data Analysis, 48, 63-67.
Hoyle, R. H., & Panter, A. T. (1995). Writing about structural equation models. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 158-176). Thousand Oaks, CA: Sage.
Jöreskog, K. G. (1970). A general method for analysis of covariance structures. Biometrika, 57, 239-251.
Jöreskog, K. G., & Sörbom, D. (1996). LISREL 8 user's reference guide. Chicago: Scientific Software International.
Jöreskog, K. G. & Wold, H. (1982). The ML and PLS techniques for modelling with latent variables: Historical and comparative aspects. In K. G. Jöreskog & H. Wold (Eds.), Systems under indirect observation: Causality, structure, prediction (Vol. 1, pp. 263-270). Amsterdam: North-Holland.
Lance, C. E., Cornwell, J. M., & Mulaik, S. A. (1988). Limited information parameter estimates for latent or mixed manifest and latent variable models. Multivariate Behavioral Research, 23, 155-67.
Lu, I. R. R. (2004). Latent variable modeling in business research: A comparison of regression based on IRT and CTT scores with structural equation models. Doctoral dissertation, Carleton University, Canada.
Lu, I. R. R., Thomas, D. R., & Orser, B. J. (2004). Latent variable modeling in business research: A comparison of two-step approach with structural equation modeling. Proceedings of Administrative Sciences Association of Canada, 32nd Annual ASAC Conference, Québec City, Québec, June 5-8.
Lu, I. R. R., Thomas, D. R., & Zumbo, B. D. (2005). Embedding IRT in structural equation models: A comparison with regression based on IRT scores. Structural Equation Modeling, 12(2), 263-277.
Muthén, B. O. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115–132.
Muthén, L. K., & Muthén, B. O. (2001). Mplus user's guide. Los Angeles, CA: Muthén & Muthén.
Muthén, B. O., & Kaplan, D. (1992). A comparison of some methodologies for the factor analysis of non-normal Likert variables: A note on the size of the model. British Journal of Mathematical and Statistical Psychology, 45, 19-30.
Satorra, A. (1992). Asymptotic robust inferences in the analysis of mean and covariance structures. In P. Marsden, (Ed.), Sociological methodology (pp. 249-278). Oxford, England: Blackwell Publishers.
Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye and C. Clogg (Eds.), Latent variable analysis in developmental research (pp. 285-305). Newbury Park, CA: Sage.
Schneeweiss, H. (1993). Consistency at large in models with latent variables. In K. Haagen, D. J. Bartholomew, & M. Deistler (Eds.), Statistical modelling and latent variables (pp. 299-320). Amsterdam: Elsevier Science Publishers.
Tenenhaus, M., Vinzi, V. E., Chatelin Y-M., & Lauro, C. (2005). PLS path modeling. Computational Statistics and Data Analysis, 48, 159-205.
West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 56-75). Thousand Oaks, CA: Sage.
Wold, H. (1982). Soft modeling: The basic design and some extensions. In K. G. Jöreskog & H. Wold (Eds.), Systems under indirect observation: Causality, structure, prediction (Vol. 2, pp. 1-54). Amsterdam: North-Holland.
Wold, H. (1985). Systems analysis by partial least squares. In P. Nijkamp, H. Leitner, & N. Wrigley (Eds.), Measuring the unmeasurable (pp. 221-251). Boston: Martinus Nijhoff.
Wolins, L. (1995). A Monte-Carlo study of constrained factor-analysis using maximum likelihood and unweighted least-squares. Educational and Psychological Measurement, 55(4), 545-557.