Assessing Hypothesis Testing: Fit Indices
Kline Chapter 8 (Stop at 210); Beaujean Appendix A
Fit Indices
• It’s complicated, yo.
– There are a lot of them.
– They do not imply that you are perfectly right.
– They have guidelines, but the guidelines are not perfect either.
– People misuse them.
– Etc.
Fit Indices
• Limitations:
– Fit statistics indicate an overall/average model fit.
• That means there can be bad sections even when the overall fit is good.
– There is no one magical number/summary.
– They do not tell you where a misspecification occurs.
Fit Indices
• Limitations:
– Do not tell you the predictive value of a model.
– Do not tell you if the model is theoretically meaningful.
Fit Indices
• What size? I need a rule?!
– Everyone cites Hu and Bentler (1999) for the gold standards.
– Same problem Cohen had (we love rules).
• So when the fit is messy, cite Kline (page 197) for reasons that's not a bad thing.
– That section is an interesting read, especially if you have trouble publishing, but it is not crucial to understanding fit indices.
Fit Indices
• Model test statistic – examines whether the reproduced correlation matrix matches the sample correlation matrix.
– Sometimes called "badness of fit" statistics.
– You want these to be small.
Fit Indices
• Traditional NHST = reject-support context.
– You reject the null to show that your research hypothesis is correct.
• SEM hypothesis testing = accept-support context.
– You do not reject the null, showing that your model is consistent with the population.
Model Test Statistic
• Chi-square
– Formula: χ² = (N − 1)F_ML
– F_ML is the minimum of the fit function under ML estimation.
– p values are based on the df for your model and a chi-square distribution.
– You want this to be nonsignificant.
• But this is a catch-22!
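To make the arithmetic concrete, here is a minimal Python sketch of the chi-square computation (the F_ML, N, and df values are hypothetical, not from the readings; the p-value uses the closed-form chi-square survival function, which keeps the sketch dependency-free but only works for even df):

```python
import math

def model_chi_square(f_ml, n):
    """Model chi-square from the minimized ML fit function: chi2 = (N - 1) * F_ML."""
    return (n - 1) * f_ml

def chi_square_sf(x, df):
    """P(X > x) for a chi-square variable; closed form, valid for EVEN df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires even df")
    return math.exp(-x / 2) * sum((x / 2) ** k / math.factorial(k)
                                  for k in range(df // 2))

# Hypothetical fit: F_ML = 0.08 at the minimum, N = 200, model df = 24.
chi2 = model_chi_square(0.08, 200)
p = chi_square_sf(chi2, 24)
print(round(chi2, 2), round(p, 3))  # nonsignificant p = the model is retained
```

Note how a larger N inflates χ² for the same F_ML, which is exactly the sample-size bias discussed below.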
Model Test Statistic
• Chi-square is biased by:
– Multivariate non-normality
– Correlation size – bigger correlations can be bad for you (harder to estimate all that variance)
– Unique variance
– Sample size
Model Test Statistic
• Everyone reports chi-square, but people tend to ignore significant values.
– (I'm sort of eh on his YOU MUST PAY ATTN OR DIE talk in this section.)
Model Test Statistic
• Normed chi-square (χ²/df) – this used to be widely reported and used.
– The criterion was that values < 3.00 indicated good models.
– Now most people have moved away from this procedure (i.e., don't use this one).
– Also considered an absolute fit index.
Model Test Statistic
• Since χ² is biased by a bunch of things, you can use a robust estimator to help fix that bias.
– Satorra-Bentler
– Yuan-Bentler
– See page 157 for all the different robust options.
Fit Indices
• Many of the statistics described next are compared to the following model types:
– Independence – a model assuming no relationships between the variables (i.e., parameters are not significant).
– Saturated – a model in which all possible parameters are estimated (i.e., df = 0).
Fit Indices
• Both types of statistical inference have their problems, especially in SEM, because it is easy to get test statistics that you would normally reject, even with good model fit.
• The decision tends to be too black and white (to reject or not to reject!).
Fit Indices
• Alternative Fit Indices
– Not traditionally a dichotomous yes-no decision.
– Do not distinguish between sampling error and evidence against the model.
Fit Indices
• Alternative Fit Indices:
– Absolute Fit Indices
– Incremental Fit Indices
– Parsimony-Adjusted Indices
– Predictive Fit Indices
Fit Indices
• Absolute fit indices
– Proportion of the covariance matrix explained by the model.
– You can think about these as sort of an R².
– Want these values to be high.
GFI
• Do not use this sucker unless you want to get a nasty review.
– GFI, AGFI, PGFI
• Lots of research showing it's positively biased.
SRMR
• Standardized root mean (square) residual
• Want small values:
– Excellent < .06 (not a typo; different than the book)
– Good < .08
– Acceptable < .10
– Eeek > .10
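A sketch of what the SRMR actually averages: the squared differences between the sample and model-implied correlations, over the p(p+1)/2 unique matrix elements. The two matrices below are made-up numbers for a hypothetical 3-variable model:

```python
import math

def srmr(sample_corr, implied_corr):
    """SRMR: root of the mean squared residual over the unique
    (lower-triangle, diagonal included) correlation matrix elements."""
    p = len(sample_corr)
    total, count = 0.0, 0
    for i in range(p):
        for j in range(i + 1):  # lower triangle including the diagonal
            total += (sample_corr[i][j] - implied_corr[i][j]) ** 2
            count += 1
    return math.sqrt(total / count)

# Hypothetical 3-variable example with small off-diagonal residuals.
sample  = [[1.00, 0.50, 0.30],
           [0.50, 1.00, 0.40],
           [0.30, 0.40, 1.00]]
implied = [[1.00, 0.46, 0.33],
           [0.46, 1.00, 0.38],
           [0.33, 0.38, 1.00]]
print(round(srmr(sample, implied), 3))  # well under .06 = excellent
```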
Fit Indices
• Incremental Fit Indices
– Also known as comparative fit indices.
– Measure the improvement over the independence model (remember, that's the one with no relationships between the variables).
– Not necessarily the best indices.
CFI
• Comparative Fit Index
• Values range from 0 to 1 (sometimes you'll get slightly over 1, which usually indicates something is wrong).
• Want high values:
– Excellent > .95
– Good > .90
– Blah < .90
• All the following fit indices use the same cutoff rules.
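A minimal sketch of the CFI computation from the model and baseline chi-squares (the χ² and df values are hypothetical). The max(…, 0) terms keep the noncentrality estimates from going negative, which is what caps CFI near 1:

```python
def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """CFI: 1 minus the ratio of model noncentrality to baseline noncentrality."""
    d_model = max(chi2_model - df_model, 0)
    d_base = max(chi2_baseline - df_baseline, d_model)
    return 1.0 if d_base == 0 else 1 - d_model / d_base

# Hypothetical results: model chi2(24) = 36, independence model chi2(36) = 480.
print(round(cfi(36, 24, 480, 36), 3))  # above .95 = excellent
```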
NFI
• Normed Fit Index
• Related to the CFI; the NFI was said to underestimate fit in small samples.
• NFI = 1 − (χ²_model / χ²_baseline)
IFI
• Incremental Fit Index
• Also known as Bollen's non-normed fit index.
• A modified NFI that doesn't depend on sample size so much.
• IFI = (χ²_baseline − χ²_model) / (χ²_baseline − df_model)
RFI
• Relative Fit Index
• Also known as Bollen's normed fit index.
• RFI = 1 − (χ²_model / df_model) / (χ²_baseline / df_baseline)
TLI
• Tucker-Lewis Index
• Also known as the Bentler-Bonett Non-Normed Fit Index.
• TLI = ( (χ²_baseline / df_baseline) − (χ²_model / df_model) ) / ( (χ²_baseline / df_baseline) − 1 )
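To see how the four incremental indices compare on the same input, here is a sketch computing all of them from one set of hypothetical values (model χ²(24) = 36, independence model χ²(36) = 480):

```python
def incremental_indices(chi2_m, df_m, chi2_b, df_b):
    """NFI, IFI, RFI, and TLI: four ways of scaling improvement over baseline."""
    nfi = 1 - chi2_m / chi2_b
    ifi = (chi2_b - chi2_m) / (chi2_b - df_m)
    rfi = 1 - (chi2_m / df_m) / (chi2_b / df_b)
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1)
    return nfi, ifi, rfi, tli

for name, value in zip(("NFI", "IFI", "RFI", "TLI"),
                       incremental_indices(36, 24, 480, 36)):
    print(name, round(value, 3))
```

Note that NFI ignores df entirely, while RFI and TLI work with the χ²/df ratios, so they reward parsimony.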
Fit Indices
• Parsimony-adjusted indices
– These include penalties for model complexity.
• Normally, more paths = better fit.
– These will have smaller values for simpler models.
RMSEA
• Root mean square error of approximation
• Parsimony-adjusted index
• Want small values:
– Excellent < .06 (not a typo; different than the book)
– Good < .08
– Acceptable < .10
– Eeek > .10
• Report the confidence interval!
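A sketch of one common RMSEA formulation, which scales the excess chi-square (misfit beyond the df) by both df and N − 1; this is where the parsimony adjustment lives. Values are hypothetical:

```python
import math

def rmsea(chi2, df, n):
    """RMSEA: per-df, per-subject excess misfit; 0 whenever chi2 <= df."""
    return math.sqrt(max(chi2 - df, 0) / (df * (n - 1)))

# Hypothetical: model chi2(24) = 36 with N = 200.
print(round(rmsea(36, 24, 200), 3))
```

Dividing by df means that, for the same χ², a simpler model (more df) gets a smaller RMSEA.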
Pclose
• Tests whether the RMSEA is in the excellent range.
• You want p > .50, showing a high probability that the population RMSEA is at or below .05 (close fit).
Fit Indices
• Predictive fit indices
– Estimate model fit in a hypothetical replication of the study with the same sample size, randomly drawn from the population.
– Not always used.
– Often used for model comparisons (see below).
– Often also considered parsimony-adjusted indices.
Model Comparisons
• Let's say you want to adjust your model.
– You can compare the adjusted model to the original model to determine if the adjustment is better.
• Let's say you want to compare two different models.
– You can compare their fits to see which is better.
Model Comparisons
• Nested models
– If you can create one model from another by the addition or subtraction of parameters, then the models are nested.
• Model A is said to be nested within Model B if Model B is a more complicated version of Model A.
– For example, a one-factor model is nested within a two-factor model, as the one-factor model can be viewed as a two-factor model in which the correlation between the factors is perfect.
Nested Models
• Chi-square difference test
– Compute |Model 1 χ² − Model 2 χ²|.
– Compute |Model 1 df − Model 2 df|.
– Use a chi-square table to look up the p < .05 critical value for the difference in df.
– See if the chi-square difference is greater than that critical value.
• If yes, you say the model with the lower chi-square is better.
• If no, you say they fit the same and go with the simpler model.
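The steps above can be sketched directly (the χ² values are hypothetical; the critical value is the one you would look up in a chi-square table, e.g. 3.84 for a 1-df difference at p < .05):

```python
def chi_square_difference(chi2_1, df_1, chi2_2, df_2, critical_value):
    """Chi-square difference test for nested models.

    Returns the chi-square difference, the df difference, and whether
    the models differ at the alpha level implied by critical_value."""
    d_chi2 = abs(chi2_1 - chi2_2)
    d_df = abs(df_1 - df_2)
    return d_chi2, d_df, d_chi2 > critical_value

# Hypothetical: original model chi2(24) = 36 vs. modified model chi2(23) = 29,
# compared against the 1-df critical value of 3.84.
d_chi2, d_df, different = chi_square_difference(36, 24, 29, 23, 3.84)
print(d_chi2, d_df, different)  # different = prefer the lower-chi2 model
```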
Nested Models
• CFI difference test
– Subtract Model 1 CFI − Model 2 CFI.
– If the change is more than .01, then the models are considered different.
– This version is not biased by the sample size issues that affect chi-square.
Nested Models
• So how can I tell what to change?
• NOTE: JUST CHANGE ONE THING AT A TIME!
• Use modification indices!
– They tell you what the chi-square change would be if you add the suggested path.
– Based on χ²(1) – called a Lagrange multiplier.
• Remember that the p < .05 critical value for 1 df is 3.84.
Nested Models
• Can be tested with lavaan using the anova() function.
– More on this in the multigroup section.
Non-Nested Models
• AIC – Akaike Information Criterion
• BIC – Bayesian Information Criterion
• SABIC – Sample-Size-Adjusted BIC
– All of these penalize you for having more complex models. If all other things are equal, they favor the simpler model.
Non-Nested Models
• All of the information criteria estimate how well the model would cross-validate in a future sample.
• You want them to be small, so you pick the model with the smaller value (there is no firm rule for how big the difference must be).
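A sketch of one common formulation of the AIC and BIC from a model's log-likelihood (the log-likelihoods, parameter counts, and N below are all hypothetical). Note how the BIC's log(N) penalty hits the more complex model harder than the AIC's flat penalty of 2 per parameter:

```python
import math

def aic(loglik, n_params):
    """AIC = -2 * log-likelihood + 2 * (number of free parameters)."""
    return -2 * loglik + 2 * n_params

def bic(loglik, n_params, n):
    """BIC = -2 * log-likelihood + log(N) * (number of free parameters)."""
    return -2 * loglik + math.log(n) * n_params

# Hypothetical: Model A (more complex) vs. Model B (simpler), N = 200.
model_a = aic(-2610.0, 20), bic(-2610.0, 20, 200)
model_b = aic(-2612.0, 17), bic(-2612.0, 17, 200)
print(model_a, model_b)  # smaller is better on both criteria
```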
Non-Nested Models
• ECVI – expected cross-validation index
– ECVI = F_min + 2t / (n − p − 2)
• t = number of parameters estimated
• p = number of observed variables
• Again, you want small values, so you pick the model with the smallest ECVI.
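A sketch of the ECVI comparison using the slide's formula; the F_min values, parameter counts, N, and number of observed variables are all hypothetical:

```python
def ecvi(f_min, t, n, p):
    """ECVI per the slide's formula: F_min + 2t / (n - p - 2),
    where t = parameters estimated and p = observed variables."""
    return f_min + (2 * t) / (n - p - 2)

# Hypothetical comparison of two non-nested models, N = 200, p = 8 variables.
model_a = ecvi(0.18, 20, 200, 8)  # better raw fit, more parameters
model_b = ecvi(0.22, 17, 200, 8)  # worse raw fit, fewer parameters
print(round(model_a, 3), round(model_b, 3))  # pick the smaller ECVI
```

Here the parameter penalty is not enough to overturn Model A's better raw fit, but with a smaller N the 2t / (n − p − 2) term grows and can flip the decision toward the simpler model.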
OMG!
• So what to do?
– Mainly, people report: χ²(df), RMSEA, SRMR, CFI.
– Determine the type of model change to use the right model comparison statistic.
Example
• Two models
– Let's make the models.
– And compare them with their fit indices.
– Are they nested or not?
– See the handout on Blackboard.