13th Aug, 2020

Universität Trier

Q&A


Discussion

Started 15th Jul, 2020

My question concerns the rather unclear practice of error correlation that many scholars encounter when conducting SEM analyses. Scholars quite often report correlating error terms to improve the overall goodness of fit of their models. Hermida (2015), for instance, provided an in-depth analysis of this issue and pointed out that in many social-science studies researchers do not provide appropriate justification for correlating errors. I have read in Harrington (2008) that correlated measurement errors can result from similar or nearly identical wording of the statements that participants are asked to assess. Another way to justify such correlations concerns longitudinal studies, where error covariances can be specified a priori based on the nature of the study variables.

In my personal case, I have two items with modification indices above 20. The relevant row of the modification-index output reads:

       lhs op   rhs     mi   epc sepc.lv sepc.all sepc.nox
    12 item1 ~~ item2 25.788 0.471  0.471    0.476    0.476

After correlating the errors, the model fit appears very good (the model consists of 5 first-order latent factors and 2 second-order latent factors; n = 168; around 23 items). However, I am concerned with how to justify the error-term correlation. In my case the wording of the two items is very similar: "With other students in English language class I feel supported" (item 1) and "With other students in English language class I feel supported" (item 2) (Likert scale from 1 to 7). According to Harrington (2008), this is enough to justify the correlation between their errors.
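For readers working in lavaan (the output above is in lavaan's modindices() format), the respecification being described amounts to adding a single line to the model syntax. The sketch below uses hypothetical factor and item names, since the full model is not shown:

```r
# Hypothetical lavaan model syntax; factor and item names are placeholders.
# Measurement model for one of the factors:
support =~ item1 + item2 + item3 + item4

# Post-hoc respecification suggested by the modification index (mi = 25.788):
# free the error covariance between the two similarly worded items.
item1 ~~ item2
```

Whether this one-line change is defensible is exactly the question discussed in the answers below.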

However, I would appreciate any comments on whether similar wording of the questions is sufficient justification for correlating the errors.

Any further real-life examples of item/question wording, or articles on the same topic, are also much appreciated.

Dear Artem and Marcel,

there are two problems with correlating errors post hoc:

1) The error covariance is causally unspecific (as is any correlation). If one possibility is true, namely that both items additionally measure an omitted latent variable, then estimating the error covariance will make the model fit, but the omitted latent variable is still not explicitly contained in the model. This may be unproblematic if that latent is just the response reaction to a specific word contained in both items, but sometimes it may be a substantive latent variable missing from the model, whose omission will bias the effects of the other latent variables that are contained in it.

2) While issue #1 still presumes that the factor model is correct (but the items *in addition* share a further cause), the need to estimate error covariances can also emerge as a sign of a fundamental misspecification of the factor model: if the factor model is too simple (e.g., you test a 1-factor model whereas the true structure contains more factors), then the only proposal the algorithm can make is to estimate error covariances. These can be interpreted as valves in a technical system: opening the valves will reduce the pressure but not solve the problem. On the contrary, your model will fit, but it is worse than before.

One simple ad-hoc test is to estimate the error covariance and then to include further variables in the model which correlate with (or receive effects from, or emit effects to) the latent target variable. You will often see that the model which fitted a minute ago (due to the estimation of the error covariance) again shows substantial misfit, as the factor model is still wrong and cannot explain the new restrictions and correlations between the indicators and the newly added variables.
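This ad-hoc test could be sketched in lavaan model syntax roughly as follows (all variable names are hypothetical, not from the original post):

```r
# Step 1 (placeholder names): the model whose fit was "rescued" by a
# post-hoc error covariance.
target =~ y1 + y2 + y3
y1 ~~ y2           # post-hoc error covariance

# Step 2: enlarge the model with a further variable related to the
# target factor. If the factor model is fundamentally misspecified,
# substantial misfit typically reappears despite the error covariance.
target =~ y1 + y2 + y3
y1 ~~ y2
target ~ z         # z: an additional observed variable affecting the latent
```

Each step would be fitted separately and the fit compared; the point is that the error covariance only patches the covariance between y1 and y2 and cannot explain the new restrictions involving z.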

Please note that the goal in CFA/SEM is not to get a fitting model! The (mis)fit of the model is just a tool to evaluate its causal correctness. If data fit were the essential goal, then SE modeling would be easy: just saturate the model and you always get a perfect fit.

One aspect is the post-hoc justification of error covariances: I remember once reading MacCallum (I think it was him), who wrote that he knows no colleague who would not have enough imagination to come up with an idea that explains a post-hoc need for an error covariance. :)

Hence, besides the causal issues noted above, there are statistical problems with regard to overfitting and capitalization on chance (as with any other post-hoc change to the model). That is: better look at your items before testing the model and think about whether there could be reasons that lead to an error covariance.

One example is the longitudinal case, where error covariances between the same items measured at different occasions are expected and are included from the beginning.
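In lavaan syntax, such an a-priori longitudinal specification might look like this (a sketch with hypothetical variable names):

```r
# Two-wave CFA (placeholder names): the same item administered twice
# shares item-specific variance, so its error covariance is specified
# a priori rather than added post hoc.
f_t1 =~ x1_t1 + x2_t1 + x3_t1
f_t2 =~ x1_t2 + x2_t2 + x3_t2
x1_t1 ~~ x1_t2
x2_t1 ~~ x2_t2
x3_t1 ~~ x3_t2
```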

If you have to include error covariances post hoc, carefully consider other potential reasons (mainly the more fundamental issues noted in #2) and replicate the study. Replication in a causal-inference context should always imply an enlargement of the model (i.e., including new variables).

Best,

Holger

Gerbing, D. W., & Anderson, J. C. (1984). On the meaning of within-factor correlated measurement errors. Journal of Consumer Research, 11, 572-580.

Landis, R. S., Edwards, B. D., & Cortina, J. M. (2009). On the practice of allowing correlated residuals among indicators in structural equation models. In C. E. Lance & R. J. Vandenberg (Eds.), Statistical and Methodological Myths and Urban Legends: Doctrine, Verity and Fable in Organizational and Social Sciences (pp. 193-215). New York: Routledge.

Dear all,

I have some questions about the correlation of errors in SEM.

My team and I submitted a paper including an SEM, and we were asked: "A bias analysis / sensitivity analysis for measurement error is encouraged under different assumptions about the correlations between these measurement errors." Does this mean correlating the error variances?

(I am working in R.)

Thank you

Aline
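For what it is worth, one common reading of such a request, sketched here in lavaan model syntax under that assumption (all variable names hypothetical), is to fix the error covariance to several assumed values, re-fit each time, and see how much the substantive estimates change:

```r
# Sensitivity-analysis sketch (placeholder names): instead of freely
# estimating the error covariance, fix it to an assumed value, re-fit,
# and repeat with other values (e.g. 0, 0.2, 0.4), comparing the
# structural parameter estimates across runs.
f  =~ y1 + y2 + y3
y1 ~~ 0.2 * y2   # assumed error covariance; vary this value across runs
```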


Dear Artem Zadorozhnyy

you have, in a way, provided the answer to your question yourself. There are instances, especially when adapting or developing new scales, where correlated errors cannot be avoided. Under appropriate circumstances I do not foresee grave issues, as long as your model fit improves considerably and you do not hide the fact that you left the "confirmatory realm" in favour of exploratory endeavours.

Since you asked for **real-life examples**: I have developed an instrument to measure self-efficacy beliefs of prospective and in-service teachers who teach an interdisciplinary subject.

Both questions appear one after another; both refer to competencies in general, and one of them lists these competencies. The fit improved considerably after allowing the errors to correlate (e.g., RMSEA from .105 to .065; CFI from .962 to .987; TLI from .936 to .976; even the Chi-square became nearly non-significant, from p < .001 to p = .043).

Hope that helps. May I ask you for the complete titles of Hermida and Harrington in return? I am sure I can use them to "justify" my correlations.

Best wishes from Germany

Marcel


I was pleased to read your correspondence.

I want to ask: have you encountered, in CFA, the creation of covariances among the errors of all reversed items? In my model these items sound very similar (50% of the items), and the fit improves significantly if I correlate their errors. However, if I correlate the errors of any non-reversed items, the fit does not change. Am I making a mistake in my thinking or inference? (Sorry, I am a beginner.)
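A pattern like this is often discussed in terms of a wording (method) effect shared by the reversed items. One alternative to pairwise error covariances, sketched here in lavaan syntax with hypothetical names, is an explicit method factor loading only on the reversed items:

```r
# Substantive factor plus an orthogonal method factor for the reversed
# items (placeholder names), instead of many pairwise error covariances.
f      =~ y1 + y2 + y3r + y4r
method =~ y3r + y4r
method ~~ 0 * f     # method factor kept uncorrelated with the trait factor
```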


