Assessing Hypothesis Testing: Fit Indices
Kline Chapter 8 (Stop at 210); Beaujean Appendix A
Fit Indices
• It’s complicated, yo.
– There are a lot of them.
– They do not imply that you are perfectly right.
– They have guidelines, but the guidelines are not perfect either.
– People misuse them.
– Etc.
Fit Indices
• Limitations:
– Fit statistics indicate an overall/average model fit.
• That means there can be bad sections even when the overall fit is good.
– There is no one magical number/summary.
– They do not tell you where a misspecification occurs.
Fit Indices
• Limitations:
– Do not tell you the predictive value of a model.
– Do not tell you if the model is theoretically meaningful.
Fit Indices
• What size? I need a rule?!
– Everyone cites Hu and Bentler (1999) for the gold standards.
– Same problem Cohen had (we love rules).
• So when the fit is messy, cite Kline (page 197) for reasons that's not a bad thing.
– That section is an interesting read, especially if you have trouble publishing, but it is not crucial to understanding fit indices.
Fit Indices
• Model test statistic – examines whether the reproduced correlation matrix matches the sample correlation matrix.
– Sometimes called "badness of fit" statistics.
– You want these to be small.
Fit Indices
• Traditional NHST = reject-support context.
– You reject the null to show that your research hypothesis is correct.
• SEM hypothesis testing = accept-support context.
– You do not reject the null, showing that your model is consistent with the population.
Model Test Statistic
• Chi-square
– Formula: χ² = (N − 1)F_ML
– F_ML is the minimum of the fit function under ML estimation.
– p values are based on the df for your model and a chi-square distribution.
– You want this to be nonsignificant.
• But this is a catch-22!
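To make the arithmetic concrete, here is a minimal Python sketch of the chi-square computation (the F_ML, N, and df values are hypothetical, not from the readings; the p-value uses the closed-form chi-square survival function, which keeps the sketch dependency-free but only works for even df):

```python
import math

def model_chi_square(f_ml, n):
    """Model chi-square from the minimized ML fit function: chi2 = (N - 1) * F_ML."""
    return (n - 1) * f_ml

def chi_square_sf(x, df):
    """P(X > x) for a chi-square variable; closed form, valid for EVEN df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires even df")
    return math.exp(-x / 2) * sum((x / 2) ** k / math.factorial(k)
                                  for k in range(df // 2))

# Hypothetical fit: F_ML = 0.08 at the minimum, N = 200, model df = 24.
chi2 = model_chi_square(0.08, 200)
p = chi_square_sf(chi2, 24)
print(round(chi2, 2), round(p, 3))  # nonsignificant p = the model is retained
```

Note how a larger N inflates χ² for the same F_ML, which is exactly the sample-size bias discussed below.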
Model Test Statistic
• Chi-square is biased by:
– Multivariate non-normality
– Correlation size – bigger correlations can be bad for you (harder to estimate all that variance)
– Unique variance
– Sample size
Model Test Statistic
• Everyone reports chi-square, but people tend to ignore significant values.
– (I'm sort of eh on his YOU MUST PAY ATTN OR DIE talk in this section.)
Model Test Statistic
• Normed chi-square (χ²/df) – this used to be widely reported and used.
– The criterion was that values < 3.00 indicated good models.
– Now most people have moved away from this procedure (i.e., don't use this one).
– Also considered an absolute fit index.
Model Test Statistic
• Since χ² is biased by a bunch of things, you can use a robust estimator to help fix that bias.
– Satorra-Bentler
– Yuan-Bentler
– See page 157 for all the different robust options.
Fit Indices
• Many of the statistics described next are compared to the following model types:
– Independence – a model assuming no relationships between the variables (i.e., parameters are not significant).
– Saturated – a model in which all possible parameters are estimated (i.e., df = 0).
Fit Indices
• Both types of statistical inference have their problems, especially in SEM, because it is easy to get test statistics that you would normally reject, even with good model fit.
• The decision tends to be too black and white (to reject or not to reject!).
Fit Indices
• Alternative Fit Indices
– Not traditionally a dichotomous yes-no decision.
– Do not distinguish between sampling error and evidence against the model.
Fit Indices
• Alternative Fit Indices:
– Absolute Fit Indices
– Incremental Fit Indices
– Parsimony-Adjusted Indices
– Predictive Fit Indices
Fit Indices
• Absolute fit indices
– Proportion of the covariance matrix explained by the model.
– You can think about these as sort of an R².
– Want these values to be high.
GFI
• Do not use this sucker unless you want to get a nasty review.
– GFI, AGFI, PGFI
• Lots of research showing it's positively biased.
SRMR
• Standardized root mean (square) residual
• Want small values:
– Excellent < .06 (not a typo; different than the book)
– Good < .08
– Acceptable < .10
– Eeek > .10
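A sketch of what the SRMR actually averages: the squared differences between the sample and model-implied correlations, over the p(p+1)/2 unique matrix elements. The two matrices below are made-up numbers for a hypothetical 3-variable model:

```python
import math

def srmr(sample_corr, implied_corr):
    """SRMR: root of the mean squared residual over the unique
    (lower-triangle, diagonal included) correlation matrix elements."""
    p = len(sample_corr)
    total, count = 0.0, 0
    for i in range(p):
        for j in range(i + 1):  # lower triangle including the diagonal
            total += (sample_corr[i][j] - implied_corr[i][j]) ** 2
            count += 1
    return math.sqrt(total / count)

# Hypothetical 3-variable example with small off-diagonal residuals.
sample  = [[1.00, 0.50, 0.30],
           [0.50, 1.00, 0.40],
           [0.30, 0.40, 1.00]]
implied = [[1.00, 0.46, 0.33],
           [0.46, 1.00, 0.38],
           [0.33, 0.38, 1.00]]
print(round(srmr(sample, implied), 3))  # well under .06 = excellent
```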
Fit Indices
• Incremental Fit Indices
– Also known as comparative fit indices.
– Measure the improvement over the independence model (remember, that's the one with no relationships between the variables).
– Not necessarily the best indices.
CFI
• Comparative Fit Index
• Values range from 0 to 1 (sometimes you'll get slightly over 1, which usually indicates something is wrong).
• Want high values:
– Excellent > .95
– Good > .90
– Blah < .90
• All the following fit indices use the same cutoff rules.
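A minimal sketch of the CFI computation from the model and baseline chi-squares (the χ² and df values are hypothetical). The max(…, 0) terms keep the noncentrality estimates from going negative, which is what caps CFI near 1:

```python
def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """CFI: 1 minus the ratio of model noncentrality to baseline noncentrality."""
    d_model = max(chi2_model - df_model, 0)
    d_base = max(chi2_baseline - df_baseline, d_model)
    return 1.0 if d_base == 0 else 1 - d_model / d_base

# Hypothetical results: model chi2(24) = 36, independence model chi2(36) = 480.
print(round(cfi(36, 24, 480, 36), 3))  # above .95 = excellent
```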
NFI
• Normed Fit Index
• Related to the CFI; the NFI was said to underestimate fit in small samples.
• NFI = 1 − (χ²_model / χ²_baseline)
IFI
• Incremental Fit Index
• Also known as Bollen's non-normed fit index.
• A modified NFI that doesn't depend on sample size so much.
• IFI = (χ²_baseline − χ²_model) / (χ²_baseline − df_model)
RFI
• Relative Fit Index
• Also known as Bollen's normed fit index.
• RFI = 1 − (χ²_model / df_model) / (χ²_baseline / df_baseline)
TLI
• Tucker-Lewis Index
• Also known as the Bentler-Bonett Non-Normed Fit Index.
• TLI = ( (χ²_baseline / df_baseline) − (χ²_model / df_model) ) / ( (χ²_baseline / df_baseline) − 1 )
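To see how the four incremental indices compare on the same input, here is a sketch computing all of them from one set of hypothetical values (model χ²(24) = 36, independence model χ²(36) = 480):

```python
def incremental_indices(chi2_m, df_m, chi2_b, df_b):
    """NFI, IFI, RFI, and TLI: four ways of scaling improvement over baseline."""
    nfi = 1 - chi2_m / chi2_b
    ifi = (chi2_b - chi2_m) / (chi2_b - df_m)
    rfi = 1 - (chi2_m / df_m) / (chi2_b / df_b)
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1)
    return nfi, ifi, rfi, tli

for name, value in zip(("NFI", "IFI", "RFI", "TLI"),
                       incremental_indices(36, 24, 480, 36)):
    print(name, round(value, 3))
```

Note that NFI ignores df entirely, while RFI and TLI work with the χ²/df ratios, so they reward parsimony.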
Fit Indices
• Parsimony-adjusted indices
– These include penalties for model complexity.
• Normally, more paths = better fit.
– These will have smaller values for simpler models.
RMSEA
• Root mean square error of approximation
• Parsimony-adjusted index
• Want small values:
– Excellent < .06 (not a typo; different than the book)
– Good < .08
– Acceptable < .10
– Eeek > .10
• Report the confidence interval!
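A sketch of one common RMSEA formulation, which scales the excess chi-square (misfit beyond the df) by both df and N − 1; this is where the parsimony adjustment lives. Values are hypothetical:

```python
import math

def rmsea(chi2, df, n):
    """RMSEA: per-df, per-subject excess misfit; 0 whenever chi2 <= df."""
    return math.sqrt(max(chi2 - df, 0) / (df * (n - 1)))

# Hypothetical: model chi2(24) = 36 with N = 200.
print(round(rmsea(36, 24, 200), 3))
```

Dividing by df means that, for the same χ², a simpler model (more df) gets a smaller RMSEA.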
Pclose
• Tests whether the RMSEA is in the excellent range.
• You want p > .50, showing a high probability that the population RMSEA is at or below .05 (close fit).
Fit Indices
• Predictive fit indices
– Estimate model fit in a hypothetical replication of the study with the same sample size, randomly drawn from the population.
– Not always used.
– Often used for model comparisons (see below).
– Often also considered parsimony-adjusted indices.
Model Comparisons
• Let's say you want to adjust your model.
– You can compare the adjusted model to the original model to determine if the adjustment is better.
• Let's say you want to compare two different models.
– You can compare their fits to see which is better.
Model Comparisons
• Nested models
– If you can create one model from another by the addition or subtraction of parameters, then the models are nested.
• Model A is said to be nested within Model B if Model B is a more complicated version of Model A.
– For example, a one-factor model is nested within a two-factor model, as the one-factor model can be viewed as a two-factor model in which the correlation between the factors is perfect.
Nested Models
• Chi-square difference test
– Compute |Model 1 χ² − Model 2 χ²|.
– Compute |Model 1 df − Model 2 df|.
– Use a chi-square table to look up the p < .05 critical value for the difference in df.
– See if the chi-square difference is greater than that critical value.
• If yes, you say the model with the lower chi-square is better.
• If no, you say they fit the same and go with the simpler model.
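The steps above can be sketched directly (the χ² values are hypothetical; the critical value is the one you would look up in a chi-square table, e.g. 3.84 for a 1-df difference at p < .05):

```python
def chi_square_difference(chi2_1, df_1, chi2_2, df_2, critical_value):
    """Chi-square difference test for nested models.

    Returns the chi-square difference, the df difference, and whether
    the models differ at the alpha level implied by critical_value."""
    d_chi2 = abs(chi2_1 - chi2_2)
    d_df = abs(df_1 - df_2)
    return d_chi2, d_df, d_chi2 > critical_value

# Hypothetical: original model chi2(24) = 36 vs. modified model chi2(23) = 29,
# compared against the 1-df critical value of 3.84.
d_chi2, d_df, different = chi_square_difference(36, 24, 29, 23, 3.84)
print(d_chi2, d_df, different)  # different = prefer the lower-chi2 model
```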
Nested Models
• CFI difference test
– Subtract Model 1 CFI − Model 2 CFI.
– If the change is more than .01, then the models are considered different.
– This version is not biased by the sample size issues that affect chi-square.
Nested Models
• So how can I tell what to change?
• NOTE: JUST CHANGE ONE THING AT A TIME!
• Use modification indices!
– They tell you what the chi-square change would be if you add the suggested path.
– Based on χ²(1) – called a Lagrange multiplier.
• Remember that the p < .05 critical value for 1 df is 3.84.
Nested Models
• Can be tested with lavaan using the anova() function.
– More on this in the multigroup section.
Non-Nested Models
• AIC – Akaike Information Criterion
• BIC – Bayesian Information Criterion
• SABIC – Sample-Size-Adjusted BIC
– All of these penalize you for having more complex models. If all other things are equal, they favor the simpler model.
Non-Nested Models
• All of the information criteria estimate how well the model would cross-validate in a future sample.
• You want them to be small, so you pick the model with the smaller value (there is no firm rule for how big the difference must be).
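A sketch of one common formulation of the AIC and BIC from a model's log-likelihood (the log-likelihoods, parameter counts, and N below are all hypothetical). Note how the BIC's log(N) penalty hits the more complex model harder than the AIC's flat penalty of 2 per parameter:

```python
import math

def aic(loglik, n_params):
    """AIC = -2 * log-likelihood + 2 * (number of free parameters)."""
    return -2 * loglik + 2 * n_params

def bic(loglik, n_params, n):
    """BIC = -2 * log-likelihood + log(N) * (number of free parameters)."""
    return -2 * loglik + math.log(n) * n_params

# Hypothetical: Model A (more complex) vs. Model B (simpler), N = 200.
model_a = aic(-2610.0, 20), bic(-2610.0, 20, 200)
model_b = aic(-2612.0, 17), bic(-2612.0, 17, 200)
print(model_a, model_b)  # smaller is better on both criteria
```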
Non-Nested Models
• ECVI – expected cross-validation index
– ECVI = F_min + 2t / (n − p − 2)
• t = number of parameters estimated
• p = number of observed variables
• Again, you want small values, so you pick the model with the smallest ECVI.
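A sketch of the ECVI comparison using the slide's formula; the F_min values, parameter counts, N, and number of observed variables are all hypothetical:

```python
def ecvi(f_min, t, n, p):
    """ECVI per the slide's formula: F_min + 2t / (n - p - 2),
    where t = parameters estimated and p = observed variables."""
    return f_min + (2 * t) / (n - p - 2)

# Hypothetical comparison of two non-nested models, N = 200, p = 8 variables.
model_a = ecvi(0.18, 20, 200, 8)  # better raw fit, more parameters
model_b = ecvi(0.22, 17, 200, 8)  # worse raw fit, fewer parameters
print(round(model_a, 3), round(model_b, 3))  # pick the smaller ECVI
```

Here the parameter penalty is not enough to overturn Model A's better raw fit, but with a smaller N the 2t / (n − p − 2) term grows and can flip the decision toward the simpler model.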
OMG!
• So what to do?
– Mainly, people report: χ²(df), RMSEA, SRMR, CFI.
– Determine the type of model change to use the right model comparison statistic.
Example
• Two models
– Let's make the models.
– And compare them with their fit indices.
– Are they nested or not?
– See the handout on Blackboard.