A Primer on Structural Equation Models: Part 1. Confirmatory Factor Analysis.

Michael A. Babyak, PhD 1 and Samuel B. Green, PhD 2

1 Department of Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC
2 Division of Psychology in Education, Arizona State University, Tempe, Arizona

Correspondence to: Dr. Michael Babyak, Box 3119, DUMC, Durham, NC 27710; email: [email protected]; fax: (919) 684-8629

Word Count: 6,098 Tables: 1 Figures: 3
In the first of a two-part didactic series on structural equation modeling, we present an
introduction to the basic concepts underlying confirmatory factor analysis. We use
examples with simplified fictitious data to demonstrate the underlying mathematical
model and the conventions for nomenclature and graphical representation of the model.
We then show how parameter estimates are generated for the model with the maximum
likelihood function. Finally, we discuss several ways in which model fit is evaluated and
also briefly introduce the concept of model identification. Sample code in the EQS and
Mplus software languages is provided in an online appendix. A list of resources for
further study also is included.
Structural equation modeling (SEM) is a general data-analytic method for the
assessment of models that specify relationships among variables. SEM involves
investigating two primary models: the measurement model that links measures to factors
and the structural model that links factors to each other. In the first installment of this
two-part series, we will discuss confirmatory factor analysis (CFA), which is the method
for specifying, estimating, and assessing measurement models. The second installment
will be published at a later date and will focus on the full structural equation model,
which includes both measurement and structural components. SEM is highly flexible–it
can be used to carry out a large variety of analytic procedures. With flexibility, of
course, comes complexity, and even two overly long papers would not provide sufficient
space to address the many facets of modern SEM. Nevertheless, we hope to provide at
least a glimpse and intuitive understanding of the basic concepts of SEM. In the
following pages, we will offer an introduction to confirmatory factor analysis, including
the purpose of CFA, the specification of models, computation of estimates of the model
parameters, and assessment of model fit.
SEM in psychosomatic and medical research
Although SEM is used quite frequently in some fields, such as psychology,
education, sociology, and genetics, research using SEM appears comparatively
infrequently in psychosomatic and medical journals. There is at least a small irony to the
relative scarcity of SEM in medical and psychosomatic research in that the technique
actually has its direct roots in biology. In the 1920s, geneticist Sewall Wright first
developed an important component of SEM, path analysis, in an attempt to better
understand the complex relations among variables that might determine the birth weight
of guinea pig offspring (1). In the late 1970s, Joreskog and Goldberger (2) developed
and championed the full SEM model, successfully integrating factor analytic technology
with path analysis. SEM has begun to appear more often in psychosomatic research, and
has even begun to make an occasional foray into high-profile medical journals. For
example, in a recent paper published in the New England Journal of Medicine, Calis et al.
(3) used the path modeling portion of SEM to estimate a set of complex associations
among malaria, HIV, and various nutritional deficiencies. In a recent commentary on
post-traumatic stress disorder (PTSD) that appeared in JAMA, Bell and Orcutt
explicitly point out the potential utility of SEM in their area of study:
“Structural equation modeling is particularly well suited for examining complex
associations between multiple constructs; such constructs are often represented as latent
constructs and are assumed to be free of measurement error.” Nevertheless, very few
papers using SEM to address any topic have graced the pages of JAMA. Of course, we
tend to agree with Bell and Orcutt. SEM is particularly useful as an aid in understanding
how variables might interrelate in a system. It is especially useful when those
variables comprise several facets that overlap to some extent. In the
psychosomatic medicine domain, for example, Rosen et al. (4) use SEM to estimate the
association of global subjective health with psychological distress, social support and
physical function. Rosen et al. operationalize these four study variables as latent
variables, that is, factors instantiated by a set of observed measures that were thought to
reflect the respective factors.
Our fictionalized example
In order to demonstrate the CFA part of SEM, we will focus on a fictionalized
example based on a real question in psychosomatic research. Our example bears some
similarity to the model studied by Rosen et al., although ours is, of necessity, much
simpler. We draw on the literature that has accumulated in our field suggesting that a variety
of psychosocial constructs, including trait hostility, anger, anxiety, and depressive
symptoms, appear to be risk factors for the development of coronary artery disease
(CAD) (for a review, see Suls and Bunde (5)). Despite the large number of papers
published about the topic, however, a number of interesting fundamental questions
remain. For example, do depression, hostility, anger, and anxiety each uniquely pose a
risk for heart disease? Or, because these variables tend to overlap, are they really just
manifestations of a broader underlying trait of “negative affect”? And is it really that
underlying trait that confers the risk?
In this first installment, we will focus on how SEM can be used to study the
question of whether these variables might be manifestations of one or more broader
dimensions. Understanding the measurement properties of the variables under study is a
critical first step in carrying out any empirical research study—we have to know what we
are measuring and how well we are doing it before drawing any robust conclusions about
findings that concern those variables. In the case of our example, in this preliminary
‘measurement model’ phase we can address questions about the measurement properties
of the variables under study, such as: are hostility, depression, and anxiety really distinct
constructs? Or are they really just slightly different manifestations of the same negative
affectivity phenomenon? Understanding the measurement properties of the negative
affect variables will later help illuminate the “specific versus general” risk question. Of
course, in the world outside the confines of this tutorial, we also would draw from
substantive theory to make such arguments work. Since our focus here is within the more
narrow borders of the SEM technique itself, we will not delve very deeply into the
substantive aspects of the analyses we will present. Our aim here is simply to get your
feet wet in understanding how the technique might be applied to such questions.
Purpose of Confirmatory Factor Analysis
CFA, as well as exploratory factor analysis (EFA), defines factors that account for
covariability or shared variance among measured variables and ignores the variance that is
unique to each of the measures. Broadly speaking, either can be a useful technique for
(a) understanding the structure underlying a set of measures, (b) reducing redundancy
among a set of measured variables by representing them with a smaller number of factors,
and (c) exploiting that redundancy and hence improving the reliability and validity of
measures. However, the purposes of EFA and CFA, and accordingly the methods
associated with them, are different. The goal of EFA is to discover a set of as-yet-
unknown or unverified factors based on the data, although a priori hypotheses based on
the literature may help guide some decisions in the EFA process. In other words, we start
with correlations among some set of variables, and although we may have some reason to
believe they will cluster following a certain pattern, we generally submit the correlations
to the software and allow the algorithm to tell us how many factors there may be and
what particular variables belong to those factors. In contrast, in CFA, we have to start
with one or more explicit hypotheses about the number of factors and how the variables
are related to those factors. CFA accomplishes this by assessing constraints imposed on
factor models based on the a priori hypotheses about measures. If these constraints are
inconsistent with the pattern of relationships among measured variables, the model with
its imposed constraints is rejected. Given that the focus of CFA is on hypothesized models,
let’s first describe how these models are specified before considering how the parameters
of models are estimated and how the fit of models to the data are assessed.
Model Specification
With CFA, we hypothesize a model that specifies the relationship between
measured variables and presumed underlying factors.1 The model includes parameters we
want to estimate based on the data (i.e., freely estimated parameters) and parameters we
constrain to particular values based on our understanding of our data and the literature
(i.e., constrained or fixed parameters). It is the constraints on model parameters that
produce lack of fit.
In this section we will consider three prototypical CFA models, each with a
different substantive interpretation. We will present each prototypical model and discuss
it in the context of our negative affect example. The example for the first two prototypes
involves postulating a factor structure underlying four measures: a trait hostility measure,
an anger measure, an anxiety measure, and a depressive symptoms score. Each of the
measures is derived from summing the items on a self-report instrument designed to
measure that construct. For the third prototypical model, we extend this example by
breaking the depressive symptoms measure out into three domains of symptoms:
affective, somatic, and cognitive.

1 We use the terms factor and latent variable interchangeably throughout this paper.
Single Factor Model
The single factor model is the simplest CFA model. Nevertheless, we devote
considerable attention to it in order to introduce the basic concept of CFA and
conventional SEM terminology.
A single factor model specifies a unidimensional model, hypothesizing that a
single dimension underlies a set of measures. Unidimensionality is often seen as a
desirable characteristic of a set of measures, in particular because multiple measures can
be reduced to a single measure, thus improving parsimony, and also because the single
dimension is typically more reliable than a given individual component. As with any
structural equation model, a single factor model can be presented pictorially as a path
diagram or in equation form. Figure 1 is a graphical representation of a model with a
single factor (F1) underlying four measures, X1 (hostility), X2 (anger), X3 (anxiety), and
X4 (depressive symptoms). By convention, the factor is depicted as a circle, which
represents a latent variable, while the observed measures are squares, which represent
observable or indicator variables. A single-headed arrow between two variables indicates
the direction of the effect of the one variable on the other. Within the context of our
example, we are postulating that a factor called negative affect (F1) underlies or
determines the observed scores on the hostility, anger, anxiety, and depressive symptom
measures. Statistically, we believe these four measures are correlated because they have a
common underlying factor, negative affect. In other words, the model reflects the belief
that changes in the unobserved latent variable, negative affect, are presumed to result in
changes in the four variables that we have actually measured.
Continuing with Figure 1, a variable with arrows pointing only away from it is
called exogenous. A variable with one or more arrows pointing to it, even if one or more
arrows are pointing away from it, is called endogenous. One equation is associated with
each endogenous variable. Accordingly, the model in Figure 1 involves four endogenous
variables and therefore four equations:
X1 = λ11F1 + E1
X2 = λ21F1 + E2
X3 = λ31F1 + E3
X4 = λ41F1 + E4.

The lambdas (λ) are factor weights or loadings, which can be interpreted essentially like
regression coefficients. For example, for every one unit increase in the negative affect
factor, F1, the expected change in hostility, X1, will be λ11.
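These equations can be made concrete with a small simulation. The sketch below is written in Python rather than the EQS or Mplus code of the online appendix, and its loading values are purely hypothetical; it generates data from a single factor model and shows that the four measures correlate only because they share F1:

```python
# A minimal sketch of the single factor model X_i = lambda_i1 * F1 + E_i.
# The loadings are hypothetical, chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

lambdas = np.array([0.8, 0.7, 0.6, 0.75])   # hostility, anger, anxiety, depression
error_sd = np.sqrt(1.0 - lambdas**2)        # unique SDs so each X has variance 1

F1 = rng.standard_normal(n)                 # latent negative affect, variance 1
E = rng.standard_normal((n, 4)) * error_sd  # unique components, independent of F1
X = F1[:, None] * lambdas + E               # the four measured variables

# The measures covary only through F1: corr(X_i, X_j) is about lambda_i * lambda_j
print(np.corrcoef(X, rowvar=False).round(2))
```

With these values, for example, the correlation between the hostility and anger measures comes out near λ11 × λ21 = 0.8 × 0.7 = 0.56, which is exactly the pattern the single factor structure implies.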
Observed measures are not likely to be pure indicators of a factor, but almost
certainly contain unique components, frequently referred to as residuals or errors (E). A
unique component for a measure includes two kinds of variability: reliable information
that is specific to that measure but not related to the factor, and unreliable information,
otherwise known as measurement error. Because errors are not directly observable, they
are also latent variables and are represented in our path diagram as circles. For the
hostility measure in our example, the unique component might include the specific
component of agitation as well as measurement error due to inattentiveness of
respondents and ambiguity of the items on this measure.
Finally, our path diagram also includes double-headed curved arrows. If an arrow
begins and returns to the same exogenous variable, it represents the variance of that
variable. A double-headed arrow could also be drawn between any two errors to represent
a covariance between them, but we chose not to include error covariances in our model to
avoid unnecessary complexity.
The model parameters, or unknowns, which we seek to estimate or constrain
based on our understanding of a study, are associated with the single-headed and double-
headed arrows in our diagrams and, by convention, are shown as Greek letters. In
addition to the lambdas, the parameters for the model in Figure 1 are the variance of the
factor (σ²F1) and the variances of the errors (σ²E1 through σ²E4). As shown at the bottom of the
figure, the model parameters can also be presented in three matrices: the phi matrix (Φ)
containing the variances and covariances among factors, the lambda matrix (Λ) that
includes all factor weights, and the theta matrix (Θ) that includes the variances and
covariances among the errors.
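To make the three matrices concrete, the sketch below (Python with hypothetical parameter values; the paper's appendix code is written for EQS and Mplus) builds Φ, Λ, and Θ for the Figure 1 model and combines them in the standard CFA fashion to produce the covariance matrix that the model implies for the four measures:

```python
# The three parameter matrices of the single factor model, with hypothetical values.
import numpy as np

Lambda = np.array([[0.8], [0.7], [0.6], [0.75]])  # factor weights (4 measures x 1 factor)
Phi = np.array([[1.0]])                           # factor variance, fixed to 1
Theta = np.diag([0.36, 0.51, 0.64, 0.4375])       # error variances; covariances fixed to 0

# Model-implied covariance matrix: Lambda * Phi * Lambda' + Theta
Sigma = Lambda @ Phi @ Lambda.T + Theta
print(Sigma.round(3))
```

The error variances here were chosen so that each measure has unit variance; the implied covariance of any two measures is then simply the product of their loadings.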
We now turn to a concept that concerns all SEM, including our present CFA
models, the idea of “free” or “fixed” (also referred to as constrained) parameters. When
we conduct a conventional multiple regression, we typically deal directly only with free
parameters, those we wish to estimate.2 In a multiple regression model, the free
parameters are the regression coefficients. We tell the software or algorithm ahead of
time to calculate those values. In SEM, however, we specify not only which parameters
are to be estimated, but most critically, we also specify which parameters are constrained
or fixed, that is, which parameters are not to be estimated. In SEM, parameters can be
constrained in a number of ways, including fixing them to a specific value or to be equal
to each other. For example, in our negative affect CFA model, we could constrain the
loading (λ) for the anxiety variable to be some value that we had obtained in a prior study.
Or, we could constrain the loading for anxiety to equal the loading of anger.

2 There are in fact a number of constraints even in a conventional multiple regression, but these are typically just part of the underlying assumptions of the simple form of the model. For example, unless we explicitly specify the model differently, the relations between the predictors and the response variable are all assumed to be linear, which is, in a broad sense, a constraint. There are a number of other such constraints in most statistical models.

These
constraints are generally used when we have a very specific theoretical question about
those parameters. A constraint that is far more frequently used is one in which the
parameter is constrained to a value of zero. For example, our single factor model includes
no covariances among errors (i.e., all zeros in the off-diagonal positions of the theta
matrix). Substantively, these constraints on the error covariances reflect our belief that
there are no other factors other than the negative affect factor that systematically
influence the variability in the measured variables. Also reflecting that there are no
additional factors underlying the measures, we don’t specify any additional parameters in
the phi and lambda matrices. These are less obvious constraints in that they are made by
simply omitting any reference to additional factors; the net result, however, is that of
constraining the variances and weights of additional factors to zero. This latter point will
become clearer when we present our second example below. If one or more of these
constraints are incorrect, the model is likely to fit poorly and be rejected.
Another important class of constraints consists of those used to define an arbitrary metric
for a factor. The metric constraint is often a bit mysterious to SEM novices, and while
the precise mathematical details are not critical to our purposes here, we will describe
how this constraint is applied and what it accomplishes. Factor metrics are arbitrary
because factors are latent variables, which are unobservable and hence do not have
an inherent metric. We can assign a metric for a factor by either fixing its variance to 1,
as we did, or fixing one of its weights to 1. The choice should have no effect on the fit
for the relatively simple models considered in this article. Fixing the factor variance to 1
in effect standardizes the factor, and the resultant loadings can be interpreted as
standardized. In more complex models, however, it may be necessary to fix one of the
factor weights (lambdas) to 1 rather than to fix the variance of the factor to 1.
Researchers often select the weight associated with the measured variable that is believed
to have the strongest relationship with the underlying factor. In our negative affect
example, we might choose to fix the weight of the best developed depressive symptom
measure to 1. Fixing a factor weight to 1 puts all the other loadings on a
scale that is relative to the depressive symptom scale. Very broadly speaking,
constraining one of the loadings to equal 1 is a distant cousin to selecting a reference
category for a set of dummy variables in a linear model, in that it provides a point of
comparison for the other effects. In addition to setting the metric of a latent variable, the
constraint also helps the algorithm estimate the remaining free parameters by making it
more likely that the model is identified. We will discuss the concept of identification in
more detail later in this tutorial, but the central idea of identification is that we cannot
have fewer data points than unknowns (free parameters) in the model. Finally, as we
noted above, the constraints to define a factor’s metric do not influence model fit. All
other model constraints, however, have potential effects on the fit of the model to the
data.
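The point that metric-setting constraints leave fit untouched can be verified numerically. In this sketch (hypothetical loading values again), the single factor model is parameterized once with the factor variance fixed to 1 and once with the first loading fixed to 1; both versions imply the identical covariance matrix:

```python
# Two equivalent ways to set the factor's metric, with hypothetical values.
import numpy as np

# Parameterization A: factor variance fixed to 1 (standardized factor)
lam_a = np.array([[0.8], [0.7], [0.6], [0.75]])
phi_a = np.array([[1.0]])

# Parameterization B: first loading fixed to 1; the factor variance absorbs the scale
lam_b = lam_a / lam_a[0, 0]        # loadings now relative to the reference measure
phi_b = phi_a * lam_a[0, 0] ** 2   # factor variance becomes 0.8^2 = 0.64

theta = np.diag([0.36, 0.51, 0.64, 0.4375])
sigma_a = lam_a @ phi_a @ lam_a.T + theta
sigma_b = lam_b @ phi_b @ lam_b.T + theta
print(np.allclose(sigma_a, sigma_b))  # True: the metric choice does not change fit
```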
All of the free parameters in our model (i.e., those not constrained to 1 or 0) are
estimated based on the data. If the model fits, we interpret these estimated parameters to
evaluate, for example, which measure is the best indicator of the factor. As in the case of
regression analysis, interpretation is best performed by examining the standardized
weights (i.e., when the factor and measures are transformed to z-scores). If the model
fails to fit, we should not interpret the estimated parameters because their values may
have been adversely affected by the potential misspecification of the model. If our model
1 provides adequate fit (which we will define later) and each of the factor loadings is
substantial, we would conclude that there is evidence to support the idea that there may
be a single latent variable underlying the four observed measures. However, this result
does not mean that the one factor model is the only structure that might produce good fit.
The good fit only means that it is one of the possible models that fits well. Apart from
the obvious theoretical implications of a well-fitting one factor model with high factor
loadings, the result also suggests that we might feel fairly safe using just the single
negative affect latent variable as a predictor (or as an outcome or mediator) in a more
extensive structural model. Of course, if the model does not fit, we will have to test
alternative models in order to understand whether there is a structure that might fit the
observed data better.
Correlated Factors Model
Our second model is a correlated factors model, which specifies that two or more
factors underlie a set of measured variables and also that the factors are correlated. For
simplicity, we will consider a two-factor model, but our discussion is relevant to models
with more than two factors.
In Figure 2 we present a model for our four measures but now with two correlated
factors. As with our path diagram for a single factor model, we have circles for latent
variables (i.e., factors and errors), squares for measured variables, single-headed arrows
for effects of one variable on another, double-headed curved arrows for variances of
exogenous variables, and a double-headed arrow for the covariance between the two
factors. Within the context of our negative affect example, we might speculate that the
hostility and anger measures are related to one another by a shared characteristic, such as
agitation, and also, to some degree, distinct from the anxiety and depressive symptom measures.
In other words, the model should include a factor (F1) affecting the hostility and anger
measures (X1 and X2), and another factor (F2) affecting the anxiety and depressive
symptom measures (X3 and X4).
Model parameters are associated with all single-headed and double-headed arrows
and are presented in matrix form at the bottom of Figure 2. Constraints can be imposed
on the model parameters. As previously presented, we can define the metric for factors by
constraining their variances to 1 or one of their weights to 1. In this instance, we
arbitrarily chose to set the factor variances to 1 (i.e., σ²F1 = 1 and σ²F2 = 1).
All constraints besides those to determine the metric of factors can produce lack
of fit and are evaluated in assessing the quality of a model. For example, the effects of
factors on measures, as shown by arrows between factors and measures in the path
diagram, can be represented as equations,
X1 = λ11F1 + 0·F2 + E1
X2 = λ21F1 + 0·F2 + E2
X3 = 0·F1 + λ32F2 + E3
X4 = 0·F1 + λ42F2 + E4.
As shown, the equations indicate that a number of factor loadings are constrained to zero
such that each measured variable is associated with one and only one factor. The
specified structure is consistent with the idea of simple structure, an objective frequently
felt to be desirable with EFA. In addition, a measure is less likely to be misinterpreted if
it is a function of only one factor. Given the advantages of this structure, researchers
frequently begin with specifying models that constrain factor loadings for a measure to be
associated with one and only one factor. In other words, each measure has one weight
that is freely estimated, and all other weights (potential crossloadings) between that
measure and other factors are constrained to 0.
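The zero constraints in these equations correspond to zeros in the lambda matrix. The sketch below (hypothetical values) writes out the correlated two-factor model's matrices, with each measure loading on only one factor and the factor covariance freely estimated:

```python
# Matrices for the correlated two-factor model of Figure 2, hypothetical values.
import numpy as np

Lambda = np.array([[0.8, 0.0],    # X1 hostility   -> F1 only
                   [0.7, 0.0],    # X2 anger       -> F1 only
                   [0.0, 0.6],    # X3 anxiety     -> F2 only
                   [0.0, 0.75]])  # X4 depression  -> F2 only
phi_21 = 0.5                      # factor covariance (a correlation, since variances are 1)
Phi = np.array([[1.0, phi_21],
                [phi_21, 1.0]])   # both factor variances fixed to 1
Theta = np.diag([0.36, 0.51, 0.64, 0.4375])

Sigma = Lambda @ Phi @ Lambda.T + Theta
# Measures on different factors covary only through the factor correlation:
# cov(X1, X3) = 0.8 * phi_21 * 0.6 = 0.24
print(Sigma.round(3))
```

Setting phi_21 to 0 instead would force all cross-factor covariances to 0, the overly restrictive orthogonal specification that the text notes is likely to conflict with reality.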
Other parameters in our model that may be freely estimated or constrained are the
covariance between the factors and the variances and covariances among errors. (a) With
CFA, we typically allow the factors to be correlated by freely estimating the covariances
between factors. If we constrained all factor covariances to be equal to zero (i.e.,
orthogonal factors) and also constrained many of the factor loadings to be equal to zero
(e.g., each measure being associated with only one factor), we would be hypothesizing a
model that does not allow for correlations among measures associated with different
factors. This model is likely to conflict with reality and be rejected empirically. In
addition, this model would be inconsistent with many psychological theories that suggest
underlying correlated dimensions. The decision to allow for correlated factors is in stark
contrast with practice in EFA, where researchers routinely choose varimax rotation
resulting in orthogonal factors. However, in EFA, we can still obtain good fit to data in
that all factor loadings are freely estimated (i.e., all measured variables are a function of
all factors), permitting correlations among all measured variables. (b) We usually think of
our measured variables as being unreliable to some degree and thus must freely estimate
the error variances. In most CFA models, we begin by constraining all covariance
between errors to be 0. By imposing these constraints, we are implying that the
correlations among measures are purely a function of the specified factors.
If this model fits our data, we again have a structure that is consistent with the
data, but still cannot rule out other specifications that also might fit. A good fit for the
two factor model suggests that although all four measures may share variability, anger
and hostility may represent an underlying construct that is relatively distinct from the
construct that underlies the anxiety and depression measure. We might interpret the
anger and hostility factor as something like ‘opposition’ or perhaps ‘aggression,’ while
the depression and anxiety might be interpreted as something like ‘withdrawal.’ Again,
this result would have implications regarding whether we would be better off using these
two separate latent variables rather than the single negative affect variable.
In practice, we would typically compare the unidimensional model with the two
correlated factor model. We can do this formally in SEM by comparing the difference
between the fit of the models. We pointed out earlier that a well-fitting model does not
guarantee that it is the correct model. For this reason, SEM procedures such as CFA are
at their scientific best when there are several theoretically plausible models available to
compare. We will discuss fit a bit later, and model comparison in the second installment.
For now, we turn to one more type of model structure, just to further illustrate the kinds
of models that can be represented.
Bifactor Model
A bifactor model may include a general factor associated with all measures and
one or more group factors associated with a limited number of measures (6, 7). In Figure
3 we present a model for six measures with one general factor and one group factor.
Keeping with our negative affect example, we replace the single depressive symptom
measure with three separate scores of affective, cognitive, and somatic symptoms of
depression. Let’s say that X1, X2, and X3 are the hostility, anger, and anxiety measures,
and X4, X5, and X6 are the affective, cognitive, and somatic depressive symptom
measures. Due to space limitations, we will only briefly describe the specification of this
model.
As typically applied, we are unlikely to obtain a bifactor model with EFA in that
an objective of this method (with rotation) is to obtain simple structure, which is
generally intolerant of a general factor. In contrast, in CFA, we choose which parameters
to estimate freely and which to constrain to 0. Thus, we can allow for a general factor as
well as group factors. Most frequently, bifactor models have been suggested as
appropriate for item measures associated with psychological scales (see (8)). Although
interesting measures are likely to assess a general trait or factor, they are also likely to
include more specific aspects of that trait, that is, group factors. In contrast with the
previous model, factors for a bifactor model are typically specified to be uncorrelated
(i.e., the factor covariances are constrained to 0). In our example, this model suggests that
the three depressive symptom measures are to some extent distinct from the other three
measures, but that a broader general factor, which might be called negative affect, also
underlies all six measures.3

3 See Reise, Morizot, and Hays (2007) for a discussion of bifactor models. They suggest, for example, that items on an appropriately developed scale of depression would assess not only the general factor of depression, but that subsets of items would also assess group factors representing such aspects as somatization and feelings of hopelessness.
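In matrix terms, the bifactor structure amounts to a lambda matrix with a general column that is free for all six measures and a group column that is free only for the three depressive symptom measures, with the factor covariance constrained to 0. A sketch with hypothetical values:

```python
# Matrices for the bifactor model of Figure 3, hypothetical values.
import numpy as np

#                  general  group (depression)
Lambda = np.array([[0.70, 0.00],   # X1 hostility
                   [0.60, 0.00],   # X2 anger
                   [0.65, 0.00],   # X3 anxiety
                   [0.60, 0.50],   # X4 affective symptoms
                   [0.55, 0.45],   # X5 cognitive symptoms
                   [0.50, 0.40]])  # X6 somatic symptoms
Phi = np.eye(2)                    # uncorrelated factors, variances fixed to 1
Theta = np.diag(1.0 - (Lambda ** 2).sum(axis=1))  # unique variances (unit-variance measures)

Sigma = Lambda @ Phi @ Lambda.T + Theta
# X4 and X5 covary through both factors; X1 and X4 only through the general factor
print(Sigma.round(3))
```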
Estimation of Free Parameters
Next we will consider how free parameters are estimated. We will discuss the
estimation using the model presented in Figure 1 with four measures and a single factor.
Hats (^) are placed on top of model parameters in recognition that we are now working
with sample data as opposed to model parameters at the population level.
SEM software typically allows a variety of input data formats, including raw
case-level data, the observed covariances among the study measures, or the correlations
and standard deviations of the measures. Regardless of the form of the data that you enter
into the software, the standard maximum likelihood estimation algorithm ultimately uses
the variances and covariances among the measured variables. If you input data as a
covariance matrix, the software will use this matrix directly; if you input data as raw
cases or correlations and standard deviations among measures, the software will convert
them to a covariance matrix before conducting the SEM analyses. These variances and
covariances are elements in the sample covariance matrix, S. The specified model with
its freely estimated parameters tries to reproduce this covariance matrix. The reproduced
matrix based on the model (also called the model-implied covariance matrix) is Σ̂Model, and
the equation linking it to the model parameters is