Multivariate Statistics
Confirmatory Factor Analysis I
W. M. van der Veld, University of Amsterdam
Overview
• Digression: the expectation
• Formal specification
• Exercise 2
• Estimation
– ULS
– WLS
– ML
• The χ²-test
• General confirmatory factor analysis approach
Digression: the expectation
• The mean: E(x) = x̄ = (1/N) Σᵢ xᵢ
• The variance: var(x) = E[(x − E(x))²] = (1/N) Σᵢ (xᵢ − x̄)²
• The covariance: cov(x,y) = E[(x − E(x))(y − E(y))] = (1/N) Σᵢ (xᵢ − x̄)(yᵢ − ȳ)
• If the variables are expressed in deviations from their mean: E(x)=E(y)=0, then:
var(x) = E(xx) = E(x²)
cov(x,y) = E(xy)
• If the variables are expressed as standard scores: E(x²)=E(y²)=1, then:
cov(x,y) = cor(x,y) = E(xy)
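These definitions can be checked numerically. A minimal sketch in Python; the data are simulated, so all numbers are illustrative:

```python
# Mean, variance, and covariance written as expectations (averages).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=100_000)           # simulated, illustrative data
y = 0.5 * x + rng.normal(0.0, 1.0, size=100_000)

mean_x = x.mean()                                # E(x) = (1/N) * sum(x_i)
var_x = ((x - mean_x) ** 2).mean()               # var(x) = E[(x - E(x))^2]
cov_xy = ((x - mean_x) * (y - y.mean())).mean()  # cov(x,y) = E[(x-E(x))(y-E(y))]

# Deviation scores: E(x) = 0, so var(x) = E(x^2).
xd = x - mean_x
print(np.isclose(var_x, (xd ** 2).mean()))       # True

# Standard scores: E(x^2) = E(y^2) = 1, so cov(x,y) = cor(x,y) = E(xy).
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
print(np.isclose((zx * zy).mean(), np.corrcoef(x, y)[0, 1]))  # True
```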
Formal specification
• The full model in matrix notation: x = Λξ + δ
• The variables are expressed in deviations from their means, so: E(x)=E(ξ)=E(δ)=0.
• The latent ξ-variables are uncorrelated with the unique components (δ), so: E(ξδ’)=E(δξ’)=0
• On the left side we need the covariance (or correlation) matrix. Hence:
• E(xx’) = Σ = E[(Λξ + δ)(Λξ + δ)’]
– Σ is the covariance matrix of the x variables.
• Σ = E[(Λξ + δ)(ξ’Λ’ + δ’)]
• Σ = E(Λξξ’Λ’ + δξ’Λ’ + Λξδ’ + δδ’)
Formal specification
• The factor equation: x = Λξ + δ
• E(x)=E(ξ)=E(δ)=0 and E(ξδ’)=E(δξ’)=0
• Σ = E(Λξξ’Λ’ + δξ’Λ’ + Λξδ’ + δδ’)
• Σ = E(Λξξ’Λ’) + E(δξ’Λ’) + E(Λξδ’) + E(δδ’)
• Σ = ΛE(ξξ’)Λ’ + E(δξ’)Λ’ + ΛE(ξδ’) + E(δδ’)
• Σ = ΛE(ξξ’)Λ’ + 0·Λ’ + Λ·0 + E(δδ’)
• Σ = ΛΦΛ’ + Θδ
– E(ξξ’) = Φ, the variance-covariance matrix of the factors
– E(δδ’) = Θδ, the variance-covariance matrix of the unique components
• This is the covariance equation: Σ = ΛΦΛ’ + Θδ
• Now relax, and see the powerful possibilities of this equation.
• Exercise: formulate expressions for the variances of, and the correlations between, the x variables in terms of the parameters of the model, now via the formal way.
• It is assumed that: E(xi)=E(ξi)=E(δi)=0, and E(δiδj)=E(δixj)=E(δiξ)=0 (for i≠j).
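Before working the exercise by hand, the covariance equation itself can be verified numerically. A minimal sketch with made-up parameter values (all numbers are hypothetical):

```python
# Numerically verify Sigma = Lambda Phi Lambda' + Theta_delta
# for a hypothetical two-factor model with four indicators.
import numpy as np

Lambda = np.array([[0.7, 0.0],
                   [0.6, 0.0],
                   [0.0, 0.8],
                   [0.0, 0.5]])
Phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])               # factor (co)variances
Theta = np.diag([0.51, 0.64, 0.36, 0.75])  # unique variances

# Implied covariance matrix of the x variables:
Sigma = Lambda @ Phi @ Lambda.T + Theta

# Simulate data that obey the factor equation x = Lambda xi + delta:
rng = np.random.default_rng(1)
N = 500_000
xi = rng.multivariate_normal(np.zeros(2), Phi, size=N)
delta = rng.normal(0.0, np.sqrt(np.diag(Theta)), size=(N, 4))
x = xi @ Lambda.T + delta

# The sample covariances match the implied Sigma up to sampling error:
print(np.allclose(np.cov(x, rowvar=False), Sigma, atol=0.01))  # True
```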
Exercise 2
[Path diagram: two correlated factors ξ1 and ξ2, with correlation φ21; x1 and x2 load on ξ1 with loadings λ11 and λ21; x3 and x4 load on ξ2 with loadings λ32 and λ42; each xi has a unique component δi.]
Exercise 2
• The factor equation is: x = Λξ + δ
• The covariance equation then is: Σ = ΛΦΛ’ + Θδ
• This provides the required expression.
Σ = ΛΦΛ’ + Θδ, written out in matrices:

Λ =
| λ11   0  |
| λ21   0  |
|  0   λ32 |
|  0   λ42 |

Φ =
| φ11  φ21 |
| φ21  φ22 |

Θδ = diag(θ1, θ2, θ3, θ4)
Exercise 2
Carrying out the multiplication:

ΛΦΛ’ =
| λ11φ11λ11   λ11φ11λ21   λ11φ21λ32   λ11φ21λ42 |
| λ21φ11λ11   λ21φ11λ21   λ21φ21λ32   λ21φ21λ42 |
| λ32φ21λ11   λ32φ21λ21   λ32φ22λ32   λ32φ22λ42 |
| λ42φ21λ11   λ42φ21λ21   λ42φ22λ32   λ42φ22λ42 |

so that Σ = ΛΦΛ’ + diag(θ1, θ2, θ3, θ4).
Exercise 2
Adding the unique variances Θδ to the diagonal:

Σ =
| λ11φ11λ11 + θ1   λ11φ11λ21        λ11φ21λ32        λ11φ21λ42       |
| λ21φ11λ11        λ21φ11λ21 + θ2   λ21φ21λ32        λ21φ21λ42       |
| λ32φ21λ11        λ32φ21λ21        λ32φ22λ32 + θ3   λ32φ22λ42       |
| λ42φ21λ11        λ42φ21λ21        λ42φ22λ32        λ42φ22λ42 + θ4  |
Exercise 2
On the left-hand side, Σ contains the population variances and covariances:

Σ =
| σ11  σ12  σ13  σ14 |
| σ21  σ22  σ23  σ24 |
| σ31  σ32  σ33  σ34 |
| σ41  σ42  σ43  σ44 |

• Because both matrices are symmetric, we skip the upper diagonal.
Exercise 2
• Let’s list the variances and covariances, reading off the lower triangle:

σ11 = λ11φ11λ11 + θ1
σ21 = λ21φ11λ11      σ22 = λ21φ11λ21 + θ2
σ31 = λ32φ21λ11      σ32 = λ32φ21λ21      σ33 = λ32φ22λ32 + θ3
σ41 = λ42φ21λ11      σ42 = λ42φ21λ21      σ43 = λ42φ22λ32      σ44 = λ42φ22λ42 + θ4
Exercise 2
• The variances of the x variables:
σ11 = λ11φ11λ11 + θ1
σ22 = λ21φ11λ21 + θ2
σ33 = λ32φ22λ32 + θ3
σ44 = λ42φ22λ42 + θ4
• The covariances between the x variables:
σ21 = λ21φ11λ11
σ31 = λ32φ21λ11
σ41 = λ42φ21λ11
σ32 = λ32φ21λ21
σ42 = λ42φ21λ21
σ43 = λ42φ22λ32
• We already assumed that: E(xi)=E(ξi)=E(δi)=0, and E(δiδj)=E(δixj)=E(δiξ)=0 (for i≠j).
• If we standardize the variables x and ξ so that: var(xi)=var(ξi)=1,
• then we can write:

Results Exercise 1
ρ12 = λ11λ21
ρ13 = λ11φ21λ32
ρ14 = λ11φ21λ42
ρ23 = λ21φ21λ32
ρ24 = λ21φ21λ42
ρ34 = λ32λ42
Exercise 2
• The variance equations become:
1 = λ11λ11 + θ1
1 = λ21λ21 + θ2
1 = λ32λ32 + θ3
1 = λ42λ42 + θ4
• and the covariance equations become:
ρ21 = λ21λ11
ρ31 = λ32φ21λ11
ρ41 = λ42φ21λ11
ρ32 = λ32φ21λ21
ρ42 = λ42φ21λ21
ρ43 = λ42λ32
• Which is the same result as in the intuitive approach, but using a different notation: φii = var(ξii) and φij = cov(ξij), or when standardized cor(ξij).
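These standardized decomposition rules follow mechanically from the covariance equation. A short sketch with hypothetical loadings and factor correlation:

```python
# With standardized x and xi, the model-implied correlations are simple
# products of loadings (and phi21 when the path crosses the two factors).
import numpy as np

l11, l21, l32, l42 = 0.7, 0.6, 0.8, 0.5  # hypothetical loadings
phi21 = 0.4                              # hypothetical factor correlation

Lambda = np.array([[l11, 0], [l21, 0], [0, l32], [0, l42]])
Phi = np.array([[1.0, phi21], [phi21, 1.0]])
theta = 1.0 - np.diag(Lambda @ Phi @ Lambda.T)  # chosen so that var(x_i) = 1
Sigma = Lambda @ Phi @ Lambda.T + np.diag(theta)

assert np.isclose(Sigma[1, 0], l11 * l21)          # rho_21 = lambda11*lambda21
assert np.isclose(Sigma[2, 0], l11 * phi21 * l32)  # rho_31 = lambda11*phi21*lambda32
assert np.isclose(Sigma[3, 1], l21 * phi21 * l42)  # rho_42 = lambda21*phi21*lambda42
print(np.diag(Sigma))  # all ones: the x variables are standardized
```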
Estimation
• The model parameters can normally be estimated if the model is identified.
• Let’s assume for the sake of simplicity that our variables are standardized, except for the unique components.
• The decomposition rules only hold for the population correlations and not for the sample correlations.
• Normally, we know only the sample correlations.
• It is easily shown that the solution is different for different models.
• So an efficient estimation procedure is needed.
Estimation
• There are several general principles.
• We will discuss:
– the Unweighted Least Squares (ULS) procedure
– the Weighted Least Squares (WLS) procedure
• Both procedures are based on the residuals between the sample correlations (S) and the expected values of the correlations (Σ̂).
• Thus estimation means minimizing the difference S − Σ̂.
• The expected values of the correlations are a function of the model parameters, which we found earlier: Σ̂ = Λ̂Φ̂Λ̂’ + Θ̂δ
ULS Estimation
• The ULS procedure suggests to look for the parameter values that minimize the unweighted sum of squared residuals:

FULS(S, Σ̂) = Σᵢ (sᵢ − σ̂ᵢ)², with Σ̂ = Λ̂Φ̂Λ̂’ + Θ̂δ

• where i runs over the unique elements of the correlation matrix.
• Let’s see what this does for the example used earlier with the four indicators.
[Path diagram: one factor ξ with four indicators x1, x2, x3, x4; loadings λ11, λ21, λ31, λ41; unique components δ11, δ22, δ33, δ44.]

Sample correlation matrix S:

     x1   x2   x3   x4
x1  1.0
x2  .42  1.0
x3  .56  .48  1.0
x4  .35  .30  .40  1.0
ULS Estimation
• FULS =
(.42 − λ11λ21)² + (.56 − λ11λ31)² + (.35 − λ11λ41)² +
(.48 − λ21λ31)² + (.30 − λ21λ41)² +
(.40 − λ31λ41)² +
(1 − (λ11² + var(δ11)))² + (1 − (λ21² + var(δ22)))² +
(1 − (λ31² + var(δ33)))² + (1 − (λ41² + var(δ44)))²
• The estimation procedure looks (iteratively) for the values of all the parameters that minimize the function FULS.
• Advantages:
– Consistent estimates without distributional assumptions on the x’s.
– So for large samples ULS is approximately unbiased.
• Disadvantages:
– There is no statistical test associated with this procedure (only the RMR).
– The estimators are scale dependent.
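The FULS above can be minimized with a generic optimizer. A sketch using scipy rather than a dedicated SEM program; the starting values are arbitrary choices of this sketch:

```python
# Minimize F_ULS for the one-factor, four-indicator example.
import numpy as np
from scipy.optimize import minimize

S = np.array([[1.00, 0.42, 0.56, 0.35],
              [0.42, 1.00, 0.48, 0.30],
              [0.56, 0.48, 1.00, 0.40],
              [0.35, 0.30, 0.40, 1.00]])

def f_uls(params):
    lam = params[:4]                 # loadings lambda_11 .. lambda_41
    theta = params[4:]               # unique variances var(delta_i)
    Sigma = np.outer(lam, lam) + np.diag(theta)
    resid = S - Sigma
    # sum over the unique elements: lower triangle including the diagonal
    return np.sum(np.tril(resid) ** 2)

res = minimize(f_uls, x0=np.array([0.5] * 8), method="BFGS")
# Loadings are identified only up to a common sign flip:
print(np.round(np.abs(res.x[:4]), 2))  # close to [0.7, 0.6, 0.8, 0.5]
```

With these correlations the model fits perfectly, so FULS reaches (numerically) zero at the minimum.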
WLS Estimation
• The WLS procedure suggests to look for the parameter values that minimize the weighted sum of squared residuals:

FWLS(S, Σ̂) = Σᵢ wᵢ (sᵢ − σ̂ᵢ)², with Σ̂ = Λ̂Φ̂Λ̂’ + Θ̂δ

• where i runs over the unique elements of the correlation matrix.
• These weights can be chosen in different ways.
Maximum Likelihood Estimation
• The most commonly used procedure, the Maximum Likelihood (ML) estimator, can be specified as a special case of the WLS estimator.
• The ML estimator provides standard errors for the parameters and a test statistic for the fit of the model for much smaller samples.
• But this estimator is developed under the assumption that the observed variables have a multivariate normal distribution.
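For reference, the standard form of the ML fit function (not written out on the slide) is FML = log|Σ̂| + tr(SΣ̂⁻¹) − log|S| − p, which is zero when the implied matrix reproduces S exactly. A minimal sketch:

```python
# The ML fit function for covariance structure models.
import numpy as np

def f_ml(S, Sigma):
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_S - p

S = np.array([[1.0, 0.42],
              [0.42, 1.0]])
print(f_ml(S, S))          # 0.0: perfect fit
print(f_ml(S, np.eye(2)))  # > 0: misfit when the correlation is ignored
```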
The χ²-test
• Without a statistical test we don’t know whether our theory holds.
• The test statistic t used is the value of the fitting function (FML) at its minimum.
• If the model is correct, t is χ²(df) distributed.
• Normally the model is rejected if t > C,
• where C is the value of χ² for which: pr(χ²df > C) = α.
• See the appendices in many statistics books.
• But the χ² should not always be trusted, like any other similar test statistic.
• A robust check is to look at:
– the residuals, and
– the expected parameter change (EPC).
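The rejection rule t > C can be computed directly. A sketch with a hypothetical test value; the common scaling t = (N − 1)·FML at the minimum is an assumption, not stated above:

```python
# Compare a chi-square test statistic with its critical value.
from scipy.stats import chi2

t, df, alpha = 4.2, 2, 0.05   # hypothetical test value, df, and significance level
C = chi2.ppf(1 - alpha, df)   # critical value: pr(chi2_df > C) = alpha
print(round(C, 2))            # 5.99
print("reject" if t > C else "do not reject")  # do not reject
```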
General CFA approach
• A model is specified with observed and latent variables.
• Correlations (covariances) between the observed variables can be expressed in the parameters of the model (decomposition rules).
• If the model is identified, the parameters can be estimated.
• A test of the model can be performed if df > 0.
• Possible misspecifications (an unacceptable χ²) can be detected.
• Corrections in the model can be introduced: adjusting the theory.
[Flow diagram: Theory → Model; Reality → Data (via the data collection process); comparing Model and Data leads to model modification.]