Article

The golden rule is that there are no golden rules: A commentary on Paul Barrett's recommendations for reporting model fit in structural equation modelling

Abstract

Paul Barrett offers a challenging and timely call for a re-examination of fit assessment strategies in structural equation modelling (SEM). He points out that widely accepted cutoff values for approximate fit indices have come to be treated as if they were test statistics. Paul cites four recent studies of the behaviour of fit indices under varying data conditions which demonstrate that universal indicative cutoff values cannot be trusted. Based upon these studies, Paul advocates the abandonment of approximate fit indices and greater reliance on the chi square test and a broader assessment strategy that includes predictive accuracy. I share Paul’s concerns about the lax standards often adopted in model testing and I agree with most of his arguments. However, the authors he cites in support of his recommendation to abandon approximate fit indices do not reach the same conclusion as Paul. In my response to Paul’s article, I discuss some conditions under which it could be legitimate to accept a model which has failed the chi square test and I contend that approximate fit indices can play a useful part in a multi-faceted strategy for determining model adequacy, provided they are not elevated to the status of golden rules.

... SPSS AMOS 21 maximum likelihood estimates demonstrated adequate model fit, χ²(2, N = 395) = 21.606, p < 0.001, TLI = 0.914, CFI = 0.983 (Jackson et al. 2009; Markland 2007). ...
... SPSS AMOS 21 maximum likelihood estimates demonstrated adequate model fit, χ²(2, N = 375) = 17.996, p < 0.001, TLI = 0.923, CFI = 0.985 (Jackson et al. 2009; Markland 2007). ...
... SPSS AMOS 21 maximum likelihood estimates demonstrated adequate model fit, χ²(3, N = 400) = 50.293, p < 0.001, TLI = 0.867, CFI = 0.960 (Jackson et al. 2009; Markland 2007). ...
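The three snippets above report a chi-square value, its degrees of freedom, and the sample size, which is enough to recover an RMSEA point estimate. The following is a minimal sketch, assuming the common formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). Some programs divide by N rather than N − 1, so published values may differ slightly. The function name `rmsea` is illustrative and does not come from the cited studies.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate from a model chi-square (assumes the N - 1 convention)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # The numerator is floored at zero so that chi2 < df yields RMSEA = 0.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Values copied from the three snippets above: (chi-square, df, N).
reported = [(21.606, 2, 395), (17.996, 2, 375), (50.293, 3, 400)]

for chi2, df, n in reported:
    print(f"chi2({df}, N = {n}) = {chi2}: RMSEA = {rmsea(chi2, df, n):.3f}")
```

Under this formula all three estimates come out above 0.10, even though the CFI values in the same snippets are 0.96 or higher. That is one concrete reason to look at several indices rather than a single cutoff.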
Article
Political marketing campaigns expend enormous effort each campaign season to influence voter turnout. This cyclical democratic process and nonstop news cycle foster an environment of media malaise. Voter pessimism undercuts participation through increased perceived alikeness among ballot options. Differentiation and consolidation theory describe the voting decision process as reconciling rational and irrational information. Voters seek out differences to decide among presented options. More politically interested voters are more likely to vote. Counterintuitively, higher political organizational avocational interest is related to higher perceived alikeness. Across three studies, higher perceived alikeness of parties, candidates, and issues was related to a lower likelihood to vote (LTV). Conditional voting ineffectual beliefs exacerbated these indirect effects on LTV. In a saturated marketing atmosphere with massive spending during each election cycle, we discuss implications to influence LTV based on results.
... Alternatively, Saris et al. (2009) have proposed a useful approach for the inspection of local misfit, allowing for a more accurate quantification of the number and size of misspecifications present in a model. Whatever approach is used, it becomes clear that model fit indices alone are insufficient to judge a model and need to be included as part of a broader assessment strategy including multiple criteria (Markland, 2007). Surprisingly, this issue has been overlooked in past UWES-S research, where the validity of the model has primarily been judged based on fit indices alone. ...
... The presence of these fit indices, along with clearly defined factors, might lead some researchers to erroneously conclude that the model is acceptable when, in fact, it is seriously misspecified. Once again, our results caution against relying solely on fit indices in prior UWES-S research for model evaluation, emphasizing that a good-fitting model does not guarantee that the model is sound (Markland, 2007). ...
Article
Academic engagement plays a crucial role in students’ learning and performance. One of the most popular measures for assessing this construct is the Utrecht Work Engagement Scale for Students (UWES-S), which is based on a tridimensional conceptualization consisting of dedication, vigor, and absorption. However, prior research on its factor structure has yielded inconsistent results, and the substantial correlations between dimensions raise doubts about their empirical distinctiveness. Thus, questions remain whether academic engagement is experienced as a global construct, or as its three components. The present study addressed this issue by examining the dimensionality of both UWES-S17 and UWES-S9 using a comprehensive factor-analytic framework. One- to four-factor CFA and ESEM models, along with corresponding bifactor-CFA and bifactor-ESEM models, were tested using data from 453 Ecuadorian university students. The results indicated that ESEM yielded superior fit indices and less correlated factors compared to CFA. However, discriminant validity test did not support the distinctiveness of UWES-S factors, and bifactor analyses consistently demonstrated a strong general factor and weak or collapsed specific factors. These findings were remarkably consistent across both UWES-S versions. Collectively, the results suggest that academic engagement, as currently operationalized by the UWES-S, can be considered as a unidimensional rather than a multidimensional construct. Implications for conceptualization, measurement, and research on academic engagement are discussed.
... The cutoffs for the root mean square error of approximation (RMSEA) were equal to or below 0.10, according to the criteria proposed by Weston and Gore (2006). For the Comparative Fit Index (CFI), Normed Fit Index, Incremental Fit Index, and the adjusted goodness of fit (Tucker-Lewis index [TLI]), the values recommended by various authors are 0.90 or 0.95 (Hair et al., 1999; Markland, 2007; Marsh et al., 2004), although these values should not be considered absolute values. ...
... The standardized root mean square residual value was below 0.08, and this model is considered to be a good fit. Regarding the CFI and TLI, values were above 0.90, in line with the values recommended by various authors (Markland, 2007; Marsh et al., 2004). ...
Article
Forgiveness plays an important role in couple relationships, as it is essential in overcoming interpersonal offenses and related to the well‐being of the relationship. To date, no valid instruments are available for Spanish populations to evaluate forgiveness within marital relationships. This study aims to adapt and evaluate the Marital Offense‐Specific Forgiveness Scale (MOFS), comparing the behavior of the scale in two cultural contexts: Spain and the United States. Two studies were conducted: the first with 389 participants to evaluate the behavior of the scale and to explore the dimensionality of the Spanish version of the MOFS using exploratory factor analysis (EFA); the second study used a sample of 361 Spanish and 119 American participants, conducting a confirmatory factor analysis (CFA) and an invariance factor analysis. The EFA revealed two factors: Avoidance–Resentment and Benevolence. Using CFA, the factorial structure of the MOFS was confirmed, with results indicating that the proposed model presents a similar fit to the original version.
... or .95. However, these values should not be considered absolute values (Hair et al., 1999; Markland, 2007; Marsh et al., 2004). ...
... The CFI, TLI, IFI and NFI had scores equal to or above .90, coinciding with those recommended by various authors (Hair et al., 1999; Markland, 2007; Marsh et al., 2004). As shown in Figure 1, the factor loadings for the original model range from .33 (item 23) to .92 (item 2). ...
Article
Title: Interpersonal forgiveness: validation of the Enright Forgiveness Inventory (EFI-30) in a Spanish sample. Abstract: Background: Numerous studies have demonstrated the importance of interpersonal forgiveness after a specific offense for improving the health and well-being of individuals. Despite its importance, there is an evident lack of forgiveness evaluation instruments adapted to the Spanish context. The Enright Forgiveness Inventory (EFI-30) is the questionnaire that implements one of the most established and recognized theoretical models in the area of forgiveness worldwide. The aim of the present study is to adapt the EFI-30 for the Spanish population and evaluate its psychometric properties. Method: A sample of 426 undergraduate and master's students (98 men, 328 women) aged from 18 to 30 years (M = 21.24; SD = 2.91) completed the EFI-30 after its adaptation to the Spanish context, as well as the Transgression Related Interpersonal Motivations Inventory (TRIM-18), the Remedial Strategies Scale (RSS) and the Depression Anxiety and Stress Scale (DASS-21). Results: The Confirmatory Factor Analysis showed a good fit to the original six-factor structure (CFI = .91, TLI = .90, IFI = .91, RMSEA = .067). The reliability of the subscales and of the overall instrument was good and similar to the original version. The results showed adequate convergent and criterion validity. Conclusions: The EFI-30 shows adequate psychometric properties within the Spanish context and is an appropriate instrument for evaluating interpersonal forgiveness of a specific offense in research and clinical intervention. Keywords: Interpersonal forgiveness. Assessment. Validation. Spain. Forgiveness seeking.
... Hu and Bentler [55] recommend that the RMSEA should be equal to or smaller than 0.06. Meanwhile, according to Browne and Cudeck [56], RMSEA < 0.05 indicates a good fit; 0.05 to 0.08 indicates an adequate fit, and 0.08 to 0.10 indicates an acceptable fit. In addition, the chi-square test was known to be sensitive to sample size; therefore, a statistically significant chi-square might be acceptable when other fit indices indicated a good model fit [56]. ...
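The RMSEA bands quoted in the snippet above can be written as a small lookup. This is a sketch under stated assumptions: the snippet lists 0.05 and 0.08 as the edge of two bands each, so the boundary handling here (0.05 and 0.08 both fall in "adequate") is a choice made for illustration, and the label for values above 0.10 is not from the source. The function name `rmsea_band` is hypothetical.

```python
def rmsea_band(value: float) -> str:
    """Map an RMSEA value onto the bands quoted from Browne and Cudeck in the snippet."""
    if value < 0:
        raise ValueError("RMSEA cannot be negative")
    if value < 0.05:
        return "good"
    if value <= 0.08:  # assumption: both boundary values count as "adequate"
        return "adequate"
    if value <= 0.10:
        return "acceptable"
    return "outside the quoted bands"  # the snippet does not name this range


def meets_hu_bentler(value: float) -> bool:
    """Check the stricter 'equal to or smaller than 0.06' recommendation from the snippet."""
    return value <= 0.06


for v in (0.04, 0.067, 0.09, 0.12):
    print(v, rmsea_band(v), meets_hu_bentler(v))
```

The two functions can disagree: a value of 0.067 sits in the "adequate" band yet misses the 0.06 recommendation, which is the kind of conflict that makes any single threshold hard to treat as a rule.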
Article
During the COVID-19 pandemic, online learning has become more frequently used and has carried over cultural characteristics. In China, grandparents exert a great impact on parent–child relationships and on children’s online learning process. This study proposed six models and examined the roles of various Chinese family members (father, mother, grandparents) and their online accompaniment time in promoting preschoolers’ math learning. A total of 3552 participants were recruited to complete online questionnaires about demographics, household adult–child interactions, online company time investment, and math language performance. We found that the relationships between father time investment online and children’s math language performance were mediated by the amount of time that maternal grandparents spent with children on online learning. To contextualize these findings, we discussed the unique Chinese cultural aspects of the grandparent–parent–children relationship during the development of online math language performance in Chinese families.
... The factorial structure of the instrument was evaluated through confirmatory factor analysis (CFA). Several fit indices were calculated for the evaluation of the different proposed models, combining absolute and relative fit indices [34,35]. Among the absolute ones: the p-value associated with the chi-square statistic (χ²), the ratio between χ² and degrees of freedom (gl; χ²/gl) and GFI (goodness-of-fit index). ...
... Next, the original dimensionality theoretically proposed by Krech et al. [22] was analyzed with CFA and, following authors such as Markland [35], several models were formulated and analyzed, as the data suggested, and the most relevant results were reported. Considering the above in the analysis of scale items, it was appropriate to perform and compare several structural regression models to check the best fit. ...
Preprint
The aim of the present study was to analyze the psychometric properties of the Questionnaire for Disruptive Behavior in Physical Education (CCD-EF) in the Mexican context. A non-experimental, cross-sectional, correlational-causal study was designed in which 378 girls (Mean = 13.99; SD = .30) and 375 boys (Mean = 14.02; SD = .33), all high school students participated. The psychometric properties of the scale were analyzed by means of different exploratory and confirmatory analyses that demonstrate that this instrument with four correlated factors, and as higher order models, is valid, reliable, and invariant as a function of sex. A regression model with latent variables showed a positive and significant prediction of boredom with Physical Education on disruptive behaviors, finding that this prediction is higher in boys than in girls. The CCD-EF has proven to be a reliable and valid instrument to use with Mexican high school students.
... Cutoffs tailored to the setting of interest are generally more appropriate than fixed cutoffs whenever the setting falls outside the limited range of simulation scenarios from which these cutoffs were derived (such as those by Hu and Bentler, 1999). Therefore, methodologists are increasingly urging that fixed cutoffs should be abandoned and replaced by tailored (or "dynamic") cutoffs (e.g., Markland, 2007; Marsh et al., 2004; McNeish & Wolf, 2023a; Niemand & Mai, 2018; Nye & Drasgow, 2011). ...
Article
To evaluate model fit in structural equation modeling, researchers commonly compare fit indices against fixed cutoff values (e.g., CFI ≥ .950). However, methodologists have cautioned against overgeneralizing cutoffs, highlighting that cutoffs permit valid judgments of model fit only in empirical settings similar to the simulation scenarios from which these cutoffs originate. This is because fit indices are not only sensitive to misspecification but are also susceptible to various model, estimation, and data characteristics. As a solution, methodologists have proposed four principal approaches to obtain so-called tailored cutoffs, which are generated specifically for a given setting. Here, we review these approaches. We find that none of these approaches provides guidelines on which fit index (out of all fit indices of interest) is best suited for evaluating whether the model fits the data in the setting of interest. Therefore, we propose a novel approach combining a Monte Carlo simulation with receiver operating characteristic (ROC) analysis. This so-called simulation-cum-ROC approach generates tailored cutoffs and additionally identifies the most reliable fit indices in the setting of interest. We provide R code and a Shiny app for an easy implementation of the approach. No prior knowledge of Monte Carlo simulations or ROC analysis is needed to generate tailored cutoffs with the simulation-cum-ROC approach.
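To make the ROC step of such an approach concrete, here is a toy sketch in Python. It is not the authors' R code or Shiny app. It assumes that a Monte Carlo simulation has already produced CFI values for correctly specified and for misspecified models (the numbers below are hypothetical placeholders, not simulation results), and it assumes Youden's J (sensitivity + specificity − 1) as the criterion for choosing a cutoff, which is one common option and not necessarily the one used in the article.

```python
def youden_cutoff(correct, misspecified):
    """Return the (cutoff, J) pair that maximizes Youden's J for a 'higher is better' index.

    A model is flagged as misspecified when its index value falls below the cutoff.
    """
    best_cutoff, best_j = None, -1.0
    # Every observed value is tried as a candidate cutoff, in ascending order.
    for cutoff in sorted(set(correct) | set(misspecified)):
        sensitivity = sum(v < cutoff for v in misspecified) / len(misspecified)
        specificity = sum(v >= cutoff for v in correct) / len(correct)
        j = sensitivity + specificity - 1
        if j > best_j:  # ties keep the lowest (most lenient) cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical CFI values standing in for simulation output.
cfi_correct = [0.97, 0.98, 0.96, 0.99, 0.95]
cfi_misspecified = [0.90, 0.93, 0.94, 0.96, 0.89]

cutoff, j = youden_cutoff(cfi_correct, cfi_misspecified)
print(f"tailored CFI cutoff: {cutoff}, Youden's J: {j:.2f}")
```

Running the same function on several indices and comparing the resulting J values is one simple way to see which index separates the two populations best in a given setting.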
... Despite the importance of model fit in CFA, a consensus on best practices remains elusive and the topic remains hotly debated in the methodological literature (e.g., Barrett, 2007; Markland, 2007; Marsh et al., 2004; Ropovik, 2015; Steiger, 2007). Different classes of approaches have been proposed (e.g., local fit and global fit) as have different methods within each approach (e.g., exact fit vs. approximate fit). ...
Article
Recent reviews report that about 80% of empirical factor analyses are applied to Likert-type responses and that it is exceedingly common to treat Likert-type item responses as continuous. However, traditional model fit index cutoffs like the root-mean-square error of approximation ≤ .06 or comparative fit index ≥ .95 were derived to have 90+% sensitivity to misspecification with continuous responses. A disconnect therefore emerges whereby traditional methodological guidelines assume continuous responses whereas empirical data often contain Likert-type responses. We provide an illustrative simulation study to show that this disconnect is not innocuous—the sensitivity of traditional cutoffs to misspecification is close to 100% with continuous responses but can fall considerably if 5-point Likert responses are treated as continuous in some conditions. In other conditions, the reverse may occur, and traditional cutoffs may be too strict. Generally, applying traditional cutoffs to Likert-type responses can adversely impact conclusions about fit adequacy. This article aims to address this prevalent issue by extending the dynamic fit index (DFI) framework to accommodate Likert-type responses. DFI is a simulation-based method that was initially intended to address changes in cutoff sensitivity to misspecification because of model characteristics (e.g., number of items, strength of loadings). Here, we propose extending DFI so that it also accounts for data characteristics (e.g., number of Likert scale points, response distribution). Two simulations are included to demonstrate that—with 5-point Likert-type responses—the proposed method (a) improves upon traditional cutoffs, (b) improves upon DFI cutoffs based on multivariate normality, and (c) consistently maintains 90+% sensitivity to misspecification.
... Due to the lack of normal distribution of the variables, robust estimation methods were used (Maximum Likelihood Robust [MLR]) (Finney et al., 2016). The goodness-of-fit of the hypothesized models was assessed based on the following indices: 1) Satorra-Bentler χ² and, given its high sensitivity to sample size (Jöreskog & Sörbom, 1993; Markland, 2007), also the relative chi-squared (χ²/df), taking as fit criterion a value less than 2; 2) the comparative fit index (CFI), the Tucker-Lewis index (TLI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR). The fit criteria are identified as values for CFI and TLI ≥ .95, ...
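The relative chi-square criterion in the snippet above is simple arithmetic, sketched below. The default threshold of 2 follows that snippet; other sources quoted on this page use a looser value such as 5, so it is left as a parameter. The function names and the example values are hypothetical.

```python
def relative_chi_square(chi2: float, df: int) -> float:
    """Return chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi2 / df


def passes_relative_chi_square(chi2: float, df: int, threshold: float = 2.0) -> bool:
    """True when chi2/df is strictly below the chosen threshold."""
    return relative_chi_square(chi2, df) < threshold


# Hypothetical model: chi-square of 45.0 on 30 degrees of freedom.
print(relative_chi_square(45.0, 30))             # 1.5
print(passes_relative_chi_square(45.0, 30))      # True with the threshold of 2
print(passes_relative_chi_square(45.0, 30, 1.2)) # False with a stricter threshold
```

Because the verdict flips with the threshold, the ratio is best reported alongside the raw χ², df, and N so that readers can apply their own criterion.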
Article
Measuring attachment in adulthood is still a challenge. Despite progress in developing brief instruments, there are currently no instruments that assess attachment to significant persons without being limited to a specific type of relationship. The present study aims to develop a brief scale to assess attachment to significant persons (SP), as well as to provide evidence of validity and reliability. For this purpose, the brief Spanish version of the Experiences in Close Relationships instrument was used. 385 emerging adults, divided into two groups, Spanish psychology students and psychotherapists, completed the study online. A two-factor structure (Anxiety and Avoidance) was supported through confirmatory factor analysis. Likewise, evidence of convergent and concurrent validity, respectively, was provided through correlations with the Inventory of Parent and Peer Attachment and the Relational Needs Satisfaction Scale. The scale also demonstrated gender (men vs. women) and age (18-25 years vs. 26-30 years) invariance and adequate internal consistency. The study has allowed us to obtain a brief 11-item psychometrically robust scale (the ECR-SP11), which helps to understand attachment styles in clinical practice and psychotherapeutic research. The instrument's applicability through more heterogeneous samples should be explored.
... WLSMV-based chi-square (χ²) and the general model significance (p) were also reported. Given that these statistics are highly conditioned by sample size (Markland, 2007), we did not use them to assess model fit. Following the pre-registration guidelines, excellent model fit was considered when the CFI and TLI were ≥ .95 and RMSEA ≤ .05 ...
Article
Sexual desire is a complex construct with important implications for sexual functioning and well-being. In this research, we translated the Sexual Desire Inventory (SDI-2), a widely used scale for assessing sexual desire, into 25 languages from English and used data from the International Sex Survey (ISS) to (a) investigate its psychometric properties (i.e., factorial structure, reliability, validity, and measurement invariance) and (b) explore the expression of sexual desire across different countries, genders, and sexual orientations. A total of 82,243 participants from 42 countries completed the SDI-2, along with other sexuality-related scales. Confirmatory factor analysis supported a three-factor solution for the SDI-2 (CFI = .980; RMSEA = .060), encompassing the domains of "Partner-related," "Attractive-person-related," and "Solitary" sexual desire. The reliability of the total score and subscales was excellent. Likewise, correlations with other sexuality-related variables were positive yet weak-to-moderate in effect size. Measurement invariance tests supported its use across countries, languages, genders, and sexual orientations. Analysis of SDI-2 scores according to these variables supported its ability to capture group-based differences in sexual desire. In sum, the SDI-2 constitutes a psychometrically robust measure for the assessment of sexual desire in non-clinical samples with utility in large-scale cross-cultural studies.
... Confirmatory factor analysis is preferred in validity analysis and is used to test the accuracy of the specified structure. Within this analysis process, no general consensus has been reached in the literature on a specific threshold value for evaluating model fit (Marsh et al., 2004; Markland, 2007; İzmir et al., 2023). However, some widely accepted fit criteria, Kline (2016) and Hair et al. ...
Article
The aim of this study is to investigate the mediating role of job stress in the effect of role overload on job satisfaction. To conduct this research, a questionnaire was administered to 508 teachers working in various provinces of Türkiye. Teachers working in the public and private sectors participated in the survey. The data obtained were analyzed using SPSS 24 and the Process Macro (4.2). The hypotheses in the study were developed on the basis of Job Demands-Resources Theory. The analysis showed that the effect of role overload on job stress is positive and significant, that its effect on job satisfaction is negative and nonsignificant, and that job stress has a negative and significant effect on job satisfaction. Job stress was found to fully mediate the effect of role overload on job satisfaction. These results show that education policy makers and managers working in the education sector need to produce policies that will enable teachers to increase their productivity and job satisfaction and to reduce their job stress.
... CFA demonstrates reasonable values of fit indices (χ²/df = 3.98, RMSEA = 0.08, CFI = 0.85, TLI = 0.84). Prior research has suggested no standard value for model fit indices (Markland, 2007). Values of χ²/df below 5 and RMSEA about 0.08 indicate an acceptable fit. ...
Article
Purpose: This study aims to examine how star–nonstar exchange (SNE) influences nonstars’ performance using social information processing theory. Design/methodology/approach: A time-lagged survey approach is utilized to collect data from 531 nonstars in China. Structural equation modeling and process macro models are applied to test the moderated mediation model of this study. Findings: Results reveal that SNE has a positive effect on nonstars’ performance through their psychological empowerment, with task complexity moderating the relationship between psychological empowerment and innovative performance. However, no moderating effect was found for routine performance. Originality/value: Although previous research has delved into how leader–member exchange and team–member exchange influence employee performance, this study uniquely concentrates on how the exchange relationship between star performers and nonstars influences nonstars’ performance, a dimension that has generally been overlooked in existing literature. Findings are important for understanding SNE influence on nonstars’ performance while managing task complexity.
... The goodness-of-fit of the hypothesized models was assessed based on the following indices: (1) the Satorra-Bentler χ², to analyze the divergence between the sample variance-covariance matrix and that generated by the hypothesized model. This index is very sensitive to sample size (Jöreskog and Sörbom, 1993; Markland, 2007), so the relative chi-square (χ²/df) has also been considered; a value below 2 is strictly considered to be indicative of an acceptable fit of the model; (2) the comparative fit index (CFI), the Tucker-Lewis index (TLI), the root mean square error of approximation (RMSEA) and the standardized root mean square residual (SRMR). Excellent fit of the model was identified when the CFI and the TLI were ≥0.95, ...
Article
Introduction: This study is based on the paradigm of collaborative law and the current absence of instruments that evaluate the lawyer-client relationship as a function of the needs of the family system. The objective was to construct and validate an instrument, conceptualizing the lawyer-client relationship as a helping relationship. Method: Two groups of experts and 239 parents (58% mothers and 42% fathers), users of Family Visitation Centers, participated in the study. The content, construct, and criterion validity of the instrument, as well as its invariance for both parents, were analyzed. Results: The resulting 12-item instrument has been shown to have a two-dimensional structure, invariant for both parents, with high psychometric solidity. Discussion: The LCR scale seems to be a valuable and effective measure for use in a legal context, with important correlations with the parents’ psychological well-being, leading to a promising and relevant instrument for the holistic approach to the divorce process.
... indicates a good fit, .05 to < .08 indicates a reasonable fit, and .08 to .10 indicates a mediocre fit. The χ² statistic is known to be sensitive to sample size, and a significant χ² may be acceptable when other fit indices indicate good model fit (Markland, 2007). ...
... RMSEA ≤ .08; Browne & Cudeck, 1992; Hu & Bentler, 1999; Markland, 2007), and standardized loadings were expected to be "good" or better for all items (.55 or above; Comrey & Lee, 1992). ...
... Nonetheless, these warnings have gone unheeded (Jackson et al., 2009) and narrowly tailored recommendations were transformed into immutable golden rules in the organizational and psychological sciences (Markland, 2007; Marsh et al., 2004). ...
... Though there are debates about how covariance structure model fit should be evaluated (e.g., Barrett, 2007; Bentler, 2007; Markland, 2007; McIntosh, 2007; Steiger, 2007), the prevailing approach over the last 25 years has been to compare approximate fit indices like RMSEA and CFI to traditional cutoffs such as RMSEA ≤ .06 and CFI ≥ .95 (or .96) ...
... This study chose to report some representative indices after reviewing several related studies (Boomsma, 2000; Hayduk et al., 2007; Hooper et al., n.d.; Markland, 2007). The requirements for the indicators of model fit are as follows: χ² (chi-square), which may be as high as 5 (Wheaton et al., 1977); NFI (Normed Fit Index; Hooper et al., n.d.), which can be as low as 0.8; IFI (incremental fit index) and CFI (comparative fit index), both of which should be higher than 0.9; PNFI (parsimony normed fit index), which should be higher than 0.5 (West et al., 2012; Weston & Gore Jr, 2006); and RMSEA (root mean square error of approximation), which indicates a good fit if it is lower than 0.08 and a mediocre fit if it is between 0.08 and 0.1 (MacCallum et al., 1996). All of these were computed to assess the model goodness-of-fit. ...
Article
Objective: This study evaluated the psychometric properties of the Chinese version of the 12-item Expectations Regarding Aging Scale (C-ERA) in China. Method: The C-ERA was finalized after forward translation, back translation, synthesis, pilot testing, and expert panel review. A convenience sample of 458 community dwellers was used. Reliability and validity were investigated using content validity, exploratory factor analysis (EFA), confirmatory factor analysis (CFA), split-half reliability, item analysis, convergent validity, and discriminant validity. Results: The content validity index at item level was 0.831. The overall Cronbach's coefficient of the scale was 0.953, and the split-half reliabilities were 0.904 and 0.933, respectively. Three factors were extracted by EFA, with a cumulative variance contribution rate of 84.24%, and CFA confirmed an acceptable model fit. Convergent validity and discriminant validity were acceptable except for the physical health dimension. Conclusion: The psychometric properties of the C-ERA were confirmed, providing evidence for the assessment of expectations regarding aging. Full text: https://www.airitilibrary.com/Article/Detail/P20151116003-N202301170014-00005
... Specifically, obtaining an insignificant chi-square is the ultimate goal, indicating the acceptance of the null hypothesis (Markland, 2007). RMSEA assesses the difference between the estimated and the observed variance-covariance matrices in each degree of freedom (Shi and Maydeu-Olivares, 2020). ...
... Referring to that, the RMSEA value in our model can be accepted. Overall, in the current literature, researchers have cautioned against overinterpreting fit indices, and some have even questioned the applicability of universal cutoff values such as for TLI and RMSEA to determine adequate model fit (Fan & Sivo, 2007;Kenny et al., 2015;Markland, 2007;Marsh, Hau, & Wen, 2004). Thus, interpretations of model fit statistics should be done carefully. ...
Article
The present study examines the quality and domains of teacher-toddler interactions and associations with structural characteristics using data from 95 German early childcare settings. The results of the confirmatory factor analysis supported a two-factor structure of interaction quality assessed by the CLASS Toddler: emotional and behavioural support (EBS) and engaged support for learning (ESL). The EBS domain showed higher quality ratings (M = 5.33, SD = .59) than the ESL domain (M = 3.23, SD = .70). Structural equation modelling was applied to estimate associations between those domains and structural characteristics within classrooms. Structural characteristics predicting interaction quality were teachers’ age (for EBS), teachers’ education (for ESL) and children’s age composition in the classroom (for EBS and ESL). Overall, the two-factor structure of CLASS Toddler could be replicated. For high-quality interactions, teacher and classroom characteristics are crucial but need to be carefully distinguished. Beyond their limitations, these findings have implications that are discussed.
... Referring to that, the RMSEA value in our model can be accepted. Overall, in the current literature, researchers have cautioned against overinterpreting fit indices, and some have even questioned the applicability of universal cutoff values such as for TLI and RMSEA to determine adequate model fit (Fan & Sivo, 2007;Kenny et al., 2015;Markland, 2007;Marsh, Hau, & Wen, 2004). Thus, interpretations of model fit statistics should be done carefully. ...
Conference Paper
This study examines domains of teacher-toddler interactions in early childcare settings and investigates their associations with structural characteristics. Ninety-five toddler classrooms located in Bavaria, Germany were observed and rated with CLASS Toddler (La Paro et al., 2012) and data on teachers and conditions within the childcare settings were assessed through self-report questionnaires. Confirmatory factor analyses and structural equation modeling were conducted using R. The results were in favor of a two-, instead of a one- or three-factor structure, which included the ‘emotional and behavioral support’ (EBS) and the ‘engaged support for learning’ (ESL) domains. The model showed a feasible fit: χ²(19)=56.51, p<.001, CFI=.91, TLI=.87, RMSEA=.14 (CI90:.10-.19), SRMR=.07. The EBS domain had higher quality levels (M=5.33, SD=.59) than the ESL domain (M=3.23, SD=.70). Of structural characteristics that were examined, teachers’ age, teachers’ education and children’s age composition in the classroom showed (marginally) significant relations with interaction domains (model fit: χ²(43)=81.65, p<.001, CFI=.91, TLI=.88, RMSEA=.10 (CI90:.06-.13), SRMR=.06). Less EBS was observed when teachers were older (B=-.02, SE=.01, p=.01) and more ESL when teachers had a higher educational level (B=.12, SE=.07, p=.10). Mixed-aged classrooms were negatively associated with EBS (B=-.35, SE=.15, p=.02) and ESL (B=-.33, SE=.16, p=.04). Overall, the two-factor structure of CLASS Toddler could be replicated and the quality-levels were in the expected range. For high quality interactions, teacher characteristics and circumstances within classrooms are influential but need to be distinguished carefully. Beyond limitations, these findings have implications for further research, policy and practice in toddler childcare settings that are discussed.
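The RMSEA values reported in this abstract can be roughly cross-checked from the reported χ² and degrees of freedom. The sketch below is illustrative only: it assumes the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))), and it assumes N = 95 for both models, since the abstract reports 95 classrooms but does not state the analysis sample size per model. The helper name `rmsea` is hypothetical, and some software divides by N rather than N − 1.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square (hypothetical helper).

    Uses sqrt(max(chi2 - df, 0) / (df * (n - 1))); the N - 1 convention
    is an assumption, as some programs use N instead.
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Chi-square and df are taken from the abstract; N = 95 is an assumption.
cfa_rmsea = rmsea(56.51, 19, 95)  # two-factor CFA model
sem_rmsea = rmsea(81.65, 43, 95)  # model with structural characteristics

print(f"CFA model: RMSEA = {cfa_rmsea:.3f}")
print(f"SEM model: RMSEA = {sem_rmsea:.3f}")
```

Under these assumptions the estimates come out at roughly 0.145 and 0.098, close to the reported .14 and .10. Small differences are to be expected, because the estimator, the sample size convention and any missing-data handling used in the original analysis are not stated in the abstract.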
... Unfortunately, current reporting practice shows that researchers apply cutoffs rather uniformly, even in the presence of data or analysis characteristics that can differ markedly from the ones in the simulation studies (for an overview, see Jackson et al., 2009;McNeish & Wolf, 2023a). It appears that repeatedly voiced concerns against overgeneralizations of cutoffs have gone largely unheeded (e.g., Heene et al., 2011;Markland, 2007;Marsh et al., 2004;McNeish & Wolf, 2023a;Niemand & Mai, 2018;Nye & Drasgow, 2011). The widespread, in fact near-universal, practice of relying on (fixed) cutoffs for GOFs in model evaluation is alarming, given the lingering uncertainty about the applicability of fixed cutoffs for GOFs to scenarios hitherto uncharted by simulation studies. ...
Article
Full-text available
To evaluate model fit in confirmatory factor analysis, researchers compare goodness-of-fit indices (GOFs) against fixed cutoff values (e.g., CFI > .950) derived from simulation studies. Methodologists have cautioned that cutoffs for GOFs are only valid for settings similar to the simulation scenarios from which cutoffs originated. Despite these warnings, fixed cutoffs for popular GOFs (i.e., χ², χ²/df, CFI, RMSEA, SRMR) continue to be widely used in applied research. We (1) argue that the practice of using fixed cutoffs needs to be abandoned and (2) review time-honored and emerging alternatives to fixed cutoffs. We first present the most in-depth simulation study to date on the sensitivity of GOFs to model misspecification (i.e., misspecified factor dimensionality and unmodeled cross-loadings) and their susceptibility to further data and analysis characteristics (i.e., estimator, number of indicators, number and distribution of response options, loading magnitude, sample size, and factor correlation). We included all characteristics identified as influential in previous studies. Our simulation enabled us to replicate well-known influences on GOFs and establish hitherto unknown or underappreciated ones. In particular, the magnitude of the factor correlation turned out to moderate the effects of several characteristics on GOFs. Second, to address these problems, we discuss several strategies for assessing model fit that take the dependency of GOFs on the modeling context into account. We highlight tailored (or “dynamic”) cutoffs as a way forward. We provide convenient tables with scenario-specific cutoffs as well as regression formulae to predict cutoffs tailored to the empirical setting of interest.
... As for the SRMR index, a value of 0.0611 was obtained, with appropriate values considered to be between 0.05 and 0.08. The CFI and TLI show a value higher than 0.90, which coincides with the values recommended by Marsh et al. (2004) and Markland (2007). As can be seen in Figure 1, the range of factor loadings for the model varies between 0.625 (item 13) and 0.898 (item 29) (Supplementary Table S3). ...
Article
Full-text available
Introduction Self-forgiveness has been a complex construct to define, which has resulted in a shortage of instruments that adequately measure it as a process. In Spain, until now there is only one validated instrument to measure self-forgiveness, for this reason the present study aims to validate the Enright Self-Forgiveness Inventory (ESFI). Method A sample of 276 people (84 men, 192 women) aged from 18 to 25 years, completed the Enright Self-Forgiveness Inventory (ESFI) after its adaptation to Spanish, as well as the Enright Forgiveness Inventory-30 (EFI-30), the Narcissistic Personality Inventory (NPI), the Short form of Social Desirability Scale (M-C SDS), the Scale of psychological wellbeing (RYFF) and the Depression, Anxiety and Stress Scale-21 (DASS-21). Results The Confirmatory Factor Analysis showed a good fit for the original six-factors structure (CFI = 0.93, TLI = 0.92, RMSEA = 0.063). The results showed good psychometric qualities (both validity and reliability) and association between self-forgiveness and social desirability, depression, anxiety, narcissistic traits, and purpose in life as expected theoretically. Discussion The ESFI-30 shows good psychometric properties within the Spanish context and is an appropriate instrument for evaluating self-forgiveness for research and clinical intervention.
... We examined all the three indices and compared them to the recommended cutoffs in order to control both Type I and Type II errors (Hu & Bentler, 1999). We did not use the χ² test for model evaluation purpose because even strong models will produce a significant deviation in fit with large enough sample sizes (Markland, 2007). Models that included all the predictors generally had poor model fits. ...
Article
Full-text available
Background For peer assessment, reliability (i.e., consistency in ratings across peers) and validity (i.e., consistency of peer ratings with instructors or experts) are frequently examined in the research literature to address a central concern of instructors and students. Although the average levels are generally promising, both reliability and validity can vary substantially from context to context. Meta‐analyses have identified a few moderators that are related to peer assessment reliability/validity, but they have lacked statistical power to systematically investigate many moderators or disentangle correlated moderators. Objectives The current study fills this gap by addressing what variables influence peer assessment reliability/validity using a large‐scale, cross‐context dataset from a shared online peer assessment platform. Methods Using multi‐level structural equation models, we examined three categories of variables: (1) variables related to the context of peer assessment; (2) variables related to the peer assessment task itself; and (3) variables related to rating rubrics of peer assessment. Results and Conclusions We found that the extent to which assessment documents varied in quality on the given rubric played a central role in mediating the effect from different predictors to peer assessment reliability/validity. Other variables that are significantly associated with reliability and validity included: Education Level, Language, Discipline, Average Ability of Peer Raters, Draft Number, Assignment Number, Class Size, Average Number of Raters, and Length of Rubric Description. The results provide information to guide practitioners on how to improve reliability and validity of peer assessments.
... Subsequently, CFA examined one-factor, three-factor, and three-factor bifactor models using WLSMV estimation [48,49]. Criteria for evaluating an acceptable model fit were established a priori: RMSEA values ≤ 0.08 and CFI values ≥ 0.90 [50][51][52]. ...
Article
Full-text available
Depression is a common and debilitating condition that impacts individuals with various cultural backgrounds, medical conditions, and life circumstances. Thus, assessment tools need to be useful among different cultural groups. The 21-item Teate Depression Inventory (TDI) was developed in Italy, is designed to assess major depression, and focuses on cognitive and affective rather than somatic symptoms. This study aims to examine the factor structure and concurrent validity of the TDI English version among a non-clinical population in the United States. Participants included 398 adults (mean age 19.89 years, SD = 2.72, range: 18 to 46 years old) who completed the TDI and The Center for Epidemiologic Studies Depression Scale-Revised (CESD-R). The results supported a three-factor bifactor structure of the TDI (Positive Affect, Negative Affect, and Daily Functioning), which largely corresponds to the Tripartite Model of affective disorders. These findings support the use of TDI scores as measures of depressive symptoms among U.S. young adults, offering researchers and practitioners a brief and useful tool.
... was indicative of poor model fit. However, as mentioned previously, the chi-square test as a measure of goodness of fit in SEM is heavily influenced by sample size, and models that include large samples are at substantial risk of incorrectly rejecting the null hypothesis (Markland, 2007;Shi, Lee, & Maydeu-Olivares, 2019). Therefore, other fit indices that are less sensitive to sample size are more likely to provide a valid indication of model fit. ...
Article
Full-text available
Objective: In most countries, men are at higher risk than women for suicide death. Research focused on masculinity and men's mental health increasingly demonstrates that relationships between gender and various health outcomes, including suicidality, are complex as these relationships can be further explained by certain psychological processes or health behaviors. The objective of this study was to extend this area of research in a national sample of US men (n = 785) by investigating if their adherence to certain hegemonic masculine gender role norms (toughness and self-reliance through mechanical skills) is associated with the suppression of distressing thoughts and if thought suppression then increases their risk for suicidal thoughts and behaviors. Methods: Men in the US who have recently experienced a stressful life event completed an anonymous online survey. Structural Equation Modeling (SEM) was used to test for direct and indirect effects (i.e., mediation) between variables. Results: Men's engagement in thought suppression mediated the relationship between self-reliance and suicidality. The norm of toughness was both directly related to suicidality and mediated by thought suppression. Conclusions: Thought suppression appears to be a process that provides some explanation for the relationships between hegemonic masculine norms and suicidality in men, though this study indicated it may play only a small role. Research continues to build that certain masculine norms, such as self-reliance and toughness, are particularly concerning for men's health. Highlights: Men's thought suppression mediates the relationship between self-reliance and suicidality. Men's toughness impacts suicidality both directly and via engagement in thought suppression. These findings have implications for interventions that help men manage distressing thoughts.
... For the CFA process, the literature fails to agree on a golden rule to evaluate the model fit (Marsh et al., 2004;Markland, 2007). According to Kline (2016), and Hair et al. (2014), the most widely used indices in the evaluation of the model fit are χ²/df < 3; CFI > .90; ...
Article
Full-text available
Consumers use the country image as not only heuristics to predict the quality of the products but also as a symbol of the self by which they affiliate themselves with certain groups and differentiate from others. This study intends to understand the effects of COI on product and service quality perceptions and a set of behavioral intentions through a holistic perspective in the automobile industry. Moreover, as a complementary element, some insights into the conceptualization and the measurement of the country image are meant to be gained. The findings of this study verify the assertions in the current literature on the two-dimensional country image construct, which consists of cognition and affect. The cognition-oriented country image scales threaten the validity of studies conducted in this area because the results obtained with cognition-oriented scales are inadvertently attributed to the (general) country image construct consisting of both cognitive and affective elements. In line with the service dominant logic, it is identified that a holistic approach is required predicting the effects of the country image on quality perceptions. Even in such a pure product category as automobiles, strong associations were identified between country image and perceived service quality. Therefore, regardless of the content of the market offering, it is important that quality must be evaluated under two separate dimensions as (physical) product quality and service quality. With the help of this two-dimensional conceptualization of country image and quality perceptions, country of origin element attached to the market offering can be transformed into actual behaviors.
... and the difference between these two models were nonsignificant, Δχ²(16) = 13.81 (p = 0.612). Considering the previous discussions about the goodness of fit indexes (e.g., Hu & Bentler, 1999;Markland, 2007), the hypothesized model can be deemed as having acceptable model fit. ...
Article
Full-text available
The present research emphasizes the role of learning in response changes to infection threats and suggests a new instrument. This preliminary study aims to develop a brief tool (SITS: Sensitivity to Infection Threats Scale) that measures individuals’ health sensitivity to infection threats. The present research utilized the Brief Symptom Inventory—phobic anxiety and hostility subscales and the newly developed SITS. The reliability and validity of SITS were examined through construct, divergent, and convergent validity as well as internal consistency and test-retest reliability. The underlying dimensions were explored through an exploratory factor analysis (EFA N = 142; Mage = 20.29, SDage = 2.34), and the EFA dimensions were confirmed through a confirmatory factor analysis (CFA N = 236; Mage = 20.36, SDage = 2.24). The EFA and CFA results supported a correlated four-factor model and the 20-item structure of the SITS. These four factors included Preoccupied, Avoidant, Physiological, and Cautionary Sensitivities. The overall scale and subscales had good internal consistency, test-retest reliability, and convergent and divergent validity. The SITS is a reliable scale and has the potential to deepen our understanding of human behaviour in responding to infection threats.
... The Chao1 index was used to describe the microbial community richness, and the Shannon index was used to represent the microbial community diversity (Xu et al., 2022). Model accuracy was tested with goodness-of-fit index (CFI) (p > 0.05) and Akaike information criterion (AIC; Markland, 2007). ...
Article
Full-text available
Phosphite, a reduced form of orthophosphate, is characterized by high solubility, and transportation efficiency and can be used as potential phosphorus fertilizer, plant biostimulant and supplemental fertilizer in agriculture. However, the effects of phosphite fertilizer on soil properties and microorganisms are poorly understood. This study evaluated the effects of phosphate and phosphite fertilizers on the different forms of phosphorus, alkaline phosphatase (ALP) activity, and phoD-harboring bacterial community in the alfalfa (Medicago sativa) field. The study used four concentrations (30, 60, 90, and 120 mg P₂O₅ kg⁻¹ soil) of phosphate (KH₂PO₄) and phosphite (KH₂PO₃) fertilizers for the alfalfa field treatment. The results showed that both phosphite and phosphate fertilizers increased the total phosphorus (TP) and available phosphorus (AP) contents in the soil. The phosphorus content of the phosphite-treated soil was lower than that of the phosphate-treated one. TP, inorganic phosphate (Pi), and AP negatively regulated ALP activity, which decreased with increasing phosphate and phosphite fertilizers concentrations. Furthermore, high-throughput sequencing analysis identified 6 phyla and 29 families, which were classified from the altered operational taxonomic units (OTUs) of the soil samples. The redundancy analysis (RDA) revealed that pH, TP, AP and Pi were significantly related to the phoD-harboring bacterial community structure. The different fertilizer treatments altered the key families, contributing to soil ALP activities. Frankiaceae, Sphingomonadaceae, and Rhizobiaceae positively correlated with ALP activity in phosphite-treated soil. Moreover, the structural equation model (SEM) revealed that ALP activity was affected by the phoD-harboring bacterial community through altered organic phosphorus (Po), AP, total nitrogen (TN), soil organic carbon (SOC), and pH levels under phosphate fertilizer treatment.
However, the effect was achieved through positive regulation of pH and AP under phosphite fertilizer. Thus, the changes in soil properties and phoD-harboring bacteria in response to phosphate and phosphite treatments differed in the alfalfa field. This study is the first to report the effects of phosphite on the soil properties of an alfalfa field and provides a strong basis for phosphite utilization in the future. Highlights – Phosphite and phosphate increase the total phosphorus and available phosphorus. – The pH was the dominant factor influencing the phoD-harboring bacterial community under phosphite fertilizer. – The response of soil properties and phoD-harboring bacterial community to phosphate and phosphite fertilizers differed in the alfalfa field.
... The model fit was considered acceptable when the χ²/df was < 3, the CFI and the IFI were ≥.90, the RMSEA ≤.08, and the SRMR ≤.10 (Hooper et al., 2008). For the sake of transparency, Satorra-Bentler chi-square (χ²) and general model significance (p) were reported; however, given that χ² is highly sensitive to sample size (Markland, 2007), which in our study exceeds the standards required for conducting these types of analysis (Hair et al., 2010), these indices were not employed to assess the adequacy of the SEM model. ...
Article
Full-text available
Introduction/objective: Worries regarding COVID-19 and its economic, social, and psychological consequences, together with the strict measures implemented to control this health crisis, have threatened the mental health of adolescents. The aim of this study was to test the mediating role of resilience and life satisfaction in the association between COVID-19 related worries and mental health among adolescents and young adults. Method: A total of 3485 participants between 14-29 years of age (mean age = 19.68, SD = 3.36) completed an online survey regarding pandemic-related worries, resilience, life satisfaction, and emotional symptoms (depression, anxiety, and stress). Structural Equation Modeling (SEM) was performed to test multi-group invariance. Results: Resilience and life satisfaction partly mediated the relationship between pandemic-related worries and emotional symptoms. Pandemic-related worries were positively associated with emotional symptoms. Resilience and life satisfaction mediated the impact of pandemic-related worries on emotional symptoms. The tested model was invariant according to gender and age. Conclusions: Our findings go beyond the context of the current pandemic, highlighting how young people’s worries regarding extraordinary circumstances may negatively impact on their mental health. This study highlights the mediating role of life satisfaction and resilience, thus emphasising the need for promoting these aspects to improve the mental health of young people during this global health crisis.
... Researchers use multiple goodness-of-fit indices to evaluate whether the developed model indicates a good fit to the data, such as "Model Chi-Square" (χ²), "Root Mean Square Error of Approximation" (RMSEA), and "Standardized Root Mean Square Residual" (SRMR) (Hooper et al., 2008). Achieving an insignificant chi-square is the ultimate goal since it indicates the acceptance of the null hypothesis (Markland, 2007). Researchers recommend using the chi-square divided by the degree of freedom (χ²/df) to measure the model fit. ...
Article
Although existing research identified influences of age and gender on Automated Vehicle (AV) acceptance, the underlying reasons were not revealed. A potential reason is that age and gender are exogenous variables, which do not change by other variables. There must exist endogenous variables, such as the built environment and personal factors, such as affordability, travel needs, exposure to AV knowledge, acting as mediating factors that bridge the exogenous variables and AV acceptance. However, these mediating effects have not been discovered, validated, and quantified. Therefore, this paper provides a new viewpoint in unveiling how ages and genders influence acceptance of AV by quantitatively revealing hidden mediating effects focusing on the built environment and personal factors. A statewide survey was conducted in Kentucky. Besides demographical information, respondents’ personal information such as travel needs, affordability, exposure to AV knowledge, and the built environment were collected. Results reveal that males with high levels of travel needs and affordability better accept AV due to higher familiarity and more experience riding AV. Younger adults are more likely to have higher AV acceptance levels than older adults because younger adults tend to live in an urban setting with higher exposure levels to AV technology. Results suggest that experience in riding an AV, the most influential factor, improves acceptance by 44.8%. The research informs transportation agencies of a better understanding of how people of different ages and genders accept AV.
... "Diversity" was indicated by the Shannon diversity index; "Composition" was indicated by NMDS. Model adequacy was determined by χ² tests (p > 0.05), goodness-of-fit index (GFI, > 0.9), Akaike Information Criteria (AIC), and root mean square error of approximation (RMSEA, < 0.05) (Markland, 2007). ...
Article
Soil microbes play an important role in nutrient cycling in agricultural soils and can be influenced by tillage. Conservation tillage aims to reduce energy inputs and conserve soil and water by decreasing disturbance of soil and returning a portion of crop residue. The response of the soil microbial community to conservation tillage is complex and quantitative analysis is largely absent regarding how tillage and depth, combined with soil properties, affect soil microbial diversity and community composition. Here, the diversity and composition of the soil bacterial community and its relationship with soil properties were explored by high-throughput sequencing technology and Structural Equation Modeling (SEM) at two soil depths (0–5 cm and 15–20 cm) under conventional tillage and no tillage with residue retention. No tillage significantly increased alpha diversity (Chao 1 and Shannon) at 0-5 cm and altered the composition of the bacterial community, coinciding with changes in physical-chemical properties. Proteobacteria, Actinobacteria, Acidobacteria, Chloroflexi, and Gemmatimonadetes were the most abundant phyla across all samples. Alpha diversity was significantly correlated with soil bulk density (BD) and pH. The SEM showed that tillage and depth explained 86% of the bacterial diversity and 84% of the composition. In addition, tillage and depth had indirect effects on bacterial diversity and composition by affecting soil BD, pH and soil organic carbon. Results indicate that the soil bacterial community is altered by conservation tillage, especially in the topsoil, and highlight the importance of soil physical-chemical properties in shaping the diversity and composition of the soil bacterial community. Our findings contribute to a broad understanding of tillage disturbance and differentiated effects in the soil profile for the bacterial community.
... Model fit was evaluated based on root mean square error of approximation (RMSEA) index, standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker-Lewis index (TLI) results. Since χ² criterion is sensitive to the sample size [45], it was not used in the goodness of model fit assessment. However, the difference between models was focused on the change in χ² test. ...
Article
Full-text available
Risk factors for depression in older adults include significant interpersonal losses, increasing social isolation, and deteriorating physical abilities and health that require healthcare. The effects of unmet healthcare needs on depression in older adults are understudied. This study aimed to analyze the association between unmet healthcare needs and symptoms of depression, sleep, and antidepressant medication while controlling for other significant factors among older adults. For this study, we used a multinational database from The Survey of Health, Ageing and Retirement in Europe (SHARE), containing data of individuals aged 50 and older. The final sample used in this research consisted of 39,484 individuals from 50 to 100 years (mean − 71.15, SD ± 9.19), 42.0 percent of whom were male. Three path models exploring relationships between symptoms of depression at an older age and unmet healthcare needs were produced and had a good model fit. We found that unmet healthcare needs were directly related to depression, activity limitations were related to depression directly and through unmet healthcare needs, whereas financial situation mostly indirectly through unmet healthcare needs. We discuss how depression itself could increase unmet healthcare needs.
... A one-factor CFA model underlying all eight items was also fit and compared. Absolute cut-offs indicating "adequate" or "good" fit for common CFA fit statistics (e.g., RMSEA < 0.08, CFI > 0.90) have been widely criticized (81,82). We report them but focus on comparing them between models noting that a model with smaller RMSEA and larger CFI than another model indicates better fit (83). ...
Article
Full-text available
Background Individuals with psychiatric diagnoses who are unemployed or underemployed are likely to disproportionately experience financial hardship and, in turn, lower life satisfaction (LS). Understanding the mechanisms through which financial hardship affects LS is essential to inform effective economic empowerment interventions for this population. Aim To examine if subjective financial hardship (SFH) mediates the relationship between objective financial hardship (OFH) and LS, and whether hope, and its agency and pathways components, further mediate the effect of SFH on LS among individuals with psychiatric diagnoses seeking employment. Methods We conducted structured interviews with participants (N = 215) of two peer-run employment programs using indicators of OFH and SFH and standardized scales for hope (overall hope, hope agency, and hope pathways) and LS. Three structural equation models were employed to test measurement models for OFH and SFH, and mediational relationships. Covariates included gender, age, psychiatric diagnosis, race/ethnicity, education, income, employment status, SSI/SSDI receipt, and site. Results Confirmatory factor analysis (CFA) for items measuring OFH and SFH supported two separate hypothesized factors. OFH had a strong and significant total effect on SFH [standardized beta (B) = 0.68] and LS (B = 0.49), and a weak-to-moderate effect on hope (B = –0.31). SFH alone mediated up to 94% of the effect of OFH on LS (indirect effect B = –0.46, p < 0.01). The effect of SFH on LS through hope was small (indirect effect B = –0.09, p < 0.05), primarily through hope agency (indirect effect B = –0.13, p < 0.01) and not hope pathways. Black and Hispanic ethno-racial identification seemed to buffer the effect of financial hardship on hope and LS. Individuals identifying as Black reported significantly higher overall hope (B = 0.41–0.47) and higher LS (B = 0.29–0.46), net of the effect of OFH and SFH. 
Conclusion SFH is a strong mediator of the relationship between OFH and LS in our study of unemployed and underemployed individuals with psychiatric diagnoses. Hope, and particularly its agency component, further mediate a modest but significant proportion of the association between SFH and LS. Economic empowerment interventions for this population should address objective and subjective financial stressors, foster a sense of agency, and consider the diverse effects of financial hardship across ethno-racial groups.
... In order to study the psychometric properties of the original ISC dimensionalization (Castillo et al., 2001), structural equation modelling was performed. Different absolute and relative fit indices were calculated (Bentler, 2007; Markland, 2007), such as the p-value associated with the chi-square test, the ratio of χ² to degrees of freedom (χ²/df), goodness of fit index (GFI), normed fit index (NFI), non-normed fit index (NNFI), and comparative fit index (CFI). The estimated parameters were considered significant when the t-value was higher than 1.96 (p < 0.05). ...
Article
Full-text available
Objectives: This article had two main objectives. The first was to adapt and validate a questionnaire measuring intrinsic satisfaction when learning through a second language (English). To do so, the satisfaction/enjoyment and boredom items were adapted to this new setting. The second was to analyze, in a CLIL context, sex and age differences in these two variables (satisfaction/enjoyment and boredom). Methods: For this study, 3355 students were surveyed in the region of Andalucía (southern Spain), with a statistical confidence level of 99% and a margin of error below 2%. The analysis showed the instrument to be valid and reliable, which means that a new tool is available for future studies, and even for teaching in new approaches to FLL, such as gamification. Discussion: Regarding sex and age differences, the main finding was significantly higher boredom values in boys, as opposed to enjoyment in girls, in CLIL contexts. Conclusion: This might shed light on performance depending on sex and age, which could be useful in other studies and in adapting CLIL practice.
... Using less restrictive criteria, values ≥0.90 for the CFI and the IFI, ≤0.08 for the RMSEA, and ≤0.10 for the SRMR were considered acceptable (Hooper et al., 2008). For the sake of transparency, Satorra-Bentler chi-square (χ²), general model significance (p), and relative chi-square (χ²/df) were reported; however, given that χ² is highly sensitive to sample size (Jöreskog & Sörbom, 1993; Markland, 2007), which in our study far exceeds the standards required for conducting this type of analysis (Hair et al., 2010), these indices were not employed to assess the adequacy of the CFA models. ...
Article
The pandemic context presents remarkable psychological challenges for adolescents and young adults. The aim of the present work was to construct and study the psychometric properties of a scale in Spanish language (W-COV) to measure their worries related to the pandemic. Participants were 5559 people aged between 14 and 25 years old (M = 19.05; SD = 3.28). Self-report data were collected using a cross-sectional and cross-cultural design. Participants were from 5 Spanish-speaking countries. Instruments were W-COV to assess worries about COVID-19 and its consequences; DASS-21 for anxiety, depression and stress; and SWLS for life satisfaction. Exploratory, confirmatory and multi-group factor analyses were conducted to determine the factorial structure of the W-COV and its measurement invariance (configural, metric, scalar and error variance). Correlational and regression analyses were also performed to study convergent and predictive validity. The results suggest that W-COV presents a bifactorial structure: (1) a general factor of worries about COVID-19; and (2) three different factors: worries about health, economic and psychosocial consequences from COVID-19. The internal reliability indices Cronbach's α and Omega were adequate. With respect to the invariance results, the instrument can be used interchangeably in the five countries considered, in both genders and in two different age groups (12–17 and 18–25). Regarding validity, W-COV factors were positively associated with anxiety, depression and stress, and negatively predicted life satisfaction. In conclusion, W-COV is a reliable and valid instrument for researchers and health care professionals to assess the psychological impact of the pandemic on mental health of young Ibero-Americans.
... Model accuracy was tested with χ² (p > 0.05), goodness-of-fit index (GFI > 0.9), and Akaike information criterion (AIC) (Markland, 2007). ...
Article
Full-text available
The excessive application of phosphorus (P) fertilizer is becoming a major agricultural problem, which reduces the utilization rate of the P fertilizer and degrades soil quality. The following five P fertilizer treatments were investigated to determine how they affect soil properties, enzyme activity, and bacterial and fungal community structure: 1) no P fertilizer (P0); 2) farmers’ traditional P fertilization scheme (FP); 3) 30% reduction in P fertilizer application (P1, microbial blended fertilizer as base fertilizer); 4) 30% reduction in P fertilizer application (P2, diammonium phosphate as starting fertilizer); 5) 30% reduction in P fertilizer application (P3, microbial inoculum seed dressing). The P fertilizer reduction combined with microbial fertilizer significantly increased soil organic matter (SOM), total phosphorus (TP), available phosphorus (AP), and available potassium (AK) contents, as well as acid phosphatase activity (ACP); however, soil urease activity was significantly reduced. Moreover, the P fertilizer reduction combined with microbial fertilizer significantly increased the relative abundance of potentially beneficial genera (i.e., Bacillus, Pseudomonas, Penicillium, and Acremonium) and potentially pathogenic genera (i.e., Fusarium, Gibberella, and Drechslera). The structural equation model (SEM) revealed that different P fertilizer reduction systems had significant indirect effects on bacterial and fungal community structures. The results suggested that the P fertilizer reduction combined with microbial fertilizer systems regulated the pathogenic and beneficial genera, creating a microbial community that is favorable for maize growth. Moreover, the findings highlighted the importance of soil properties in determining the soil bacterial and fungal community structure.
... An excellent model fit was identified when the χ²/df was between 1 and 2, the CFI and the IFI were ≥0.95, the RMSEA ≤ 0.05, and the SRMR ≤ 0.05 (Bagozzi & Yi, 2011). For the sake of transparency, Satorra-Bentler chi-square (χ²) and general model significance (p) were reported; however, given that χ² is highly sensitive to sample size (Markland, 2007), these indices were not employed to assess the adequacy of the SEM model. ...
Article
Full-text available
The aim of this study was to test whether resilience and life satisfaction (two traditional protective factors) mediate between COVID‐19 related worries and the development of symptoms of depression, anxiety, and stress in adolescents and young adults. Participants were 392 adolescents and young adults (70.20% female) aged between 12 and 25 years (M = 17.05 years, SD = 3.08). Participants completed the COVID‐19 related worries scale, the CD‐RISC to analyse resilience, the Satisfaction with Life Scale, and the Depression, Anxiety, and Stress Scales‐21 to study emotional symptoms. Descriptive analyses and Pearson correlations were conducted, together with structural equation modeling testing a mediational model and multigroup invariance. Results show that resilience and life satisfaction play a mediating role in the relation between the COVID‐19 related worries and emotional symptoms (depression, anxiety, and stress). This study highlights the role of protective factors on adolescents' and young adults' emotional symptoms during the COVID‐19 pandemic.
... value necessary to consider it a satisfactory fit for the model. Nevertheless, it has been shown that this statistic is highly conditioned by sample size (Jöreskog & Sörbom, 1993;Markland, 2007). For this reason, it may be more appropriate to use other indices considered less sensitive to sample size to assess the adequacy of the factorial solutions. ...
Article
Full-text available
Cyberchondria refers to excessive and repeated online health-related searching, which is associated with increased distress and anxiety. The Cyberchondria Severity Scale (CSS) is the most widely used measure for assessment of cyberchondria, and its shortened version (CSS-12) has recently been developed. The aim of the present study was to develop the Spanish version of the CSS-12 and test its psychometric properties. A community sample of 432 Spanish-speaking adults (67.6% women; mean age = 36.00 ± 15.22 years) completed the Spanish translation of CSS-12 along with measures of health anxiety and of obsessive-compulsive, anxiety, and depressive symptoms. The Spanish version of the CSS-12 comprises a general cyberchondria factor and four specific factors (‘excessiveness’, ‘compulsion’, ‘distress’, and ‘reassurance’). Multi-group confirmatory factor analysis indicated measurement invariance across gender groups. Internal consistency values for the total score and subscales were good to excellent. The CSS-12 showed strong correlations with health anxiety, and moderate to low correlations with anxiety, obsessive-compulsive, and depressive symptoms, supporting the convergent and divergent validity of the CSS-12, respectively. In conclusion, these results show that the CSS-12 is a valid and reliable tool for measuring cyberchondria in both genders in the general Spanish population.
Article
Sexual consent is essential for preventing sexual assault. Research on sexual consent in Spain is hampered by the lack of an appropriate measure. The aim of this study was to translate and assess the psychometric properties of the Sexual Consent Scale–Revised using a sample of 557 Spanish individuals between 18 and 60 years old. An exploratory factor analysis followed by a confirmatory factor analysis supported the Spanish SCS-R factorial structure. Reliability analyses and invariance testing demonstrated its good psychometric properties and invariance with respect to age and gender. The Spanish SCS-R is a useful tool to evaluate sexual consent among young men and women as well as men and women older than 30 years of age.
Article
Full-text available
Workplace ostracism is one of the most common forms of passive workplace mistreatment, in which employees are ignored and excluded from the workplace. The primary goal of this study is to investigate the relationship between workplace ostracism and workplace behavior, as well as how neuroticism moderates this relationship. This is a descriptive research project, and a cross-sectional research design with exposure and outcome constraints was used. Data were acquired from 180 employees working in private firms in India and analyzed using PLS-SEM. A conceptual model based on the COR (Conservation of Resources) principle is also constructed to describe the impact of workplace ostracism on employee behavior. Workplace ostracism has a negative impact on employee job performance but does not affect deviant behavior. The results also show that whereas neuroticism has a stronger moderating effect on the link between workplace ostracism and job performance, it does not influence deviant behavior, which goes against our expectations. While previous studies have mostly focused on the moderating effect of psychological or motivational constructs on the association between workplace exclusion and behavior, the author has added to the existing body of knowledge by examining one of the Big-Five Personality Dimensions, namely Neuroticism, and its moderating effect.
Article
Depression, anxiety, and sleep disturbance are common among school‐age children and can impair functioning. Schools are in a unique position to assess and refer these children for intervention services, but standardized screening is underutilized. One challenge with screening is the lack of psychometrically strong mental health screening tools that can be efficiently and effectively administered in school settings. The Behavioral Health Works program provides a web‐based platform and a multidimensional screening tool (the Behavioral Health Screen [BHS]) that can help overcome implementation barriers. Because parents have unique, valuable perspectives on reporting the mental health concerns of their children, a parent report version for younger children was developed. This study examines the psychometric properties of the internalizing and sleep scales of the BHS parent report version for assessing children (BHS‐PRC). Participants included 420 parents of children ages 6–12, who completed the BHS‐PRC. Results supported evidence of good internal structure, partial measurement equivalence (across race, gender, age, and education groups), discrimination in item response theory analysis, classification accuracy, and convergent and discriminant validity. Overall, the BHS‐PRC demonstrates strong psychometric characteristics and has the potential to assist schools' mental health screening as part of a multitiered system of support.
Article
Racially marginalized communities are socially and politically active, yet there is limited work that examines the psychological forces underlying how People of Color engage in cross-racial solidarity and collective action. We propose a model of politicized racial identity and collective action to Asian American participation in own-group collective action and African American collective action. In Study 1, we tested the model using correlational data. In Study 2, we used an experiment to explore whether politicized identities predict collective action. Results support the relation between politicized identities and collective action. Politicized Person of Color identity predicted Asian American engagement in both own-group-oriented collective action (Study 2) and African American-oriented (Study 1, Study 2) collective action. Further, politicized Asian American identity predicted Asian American engagement in own-group collective action (Study 1). These findings provide empirical evidence for the role of politicized identities in predicting collective action, including cross-racial solidarity with African Americans.
Article
Full-text available
Consolidating the goals of service-learning in tertiary institutions offering Technical and Vocational Education (TVE) is gaining acceptance in other parts of the developing world. Despite the adoption of experiential learning in education, higher institutions of learning employ it without considering pertinent statistical methods to guide the selection of the goals that can direct its implementation in practice. Hence, this study deployed structural equation modeling (SEM) to consolidate the goals of service-learning for TVE in Nigeria. The study was guided by three research questions, and one null hypothesis was tested at the 0.05 alpha level. A descriptive survey design was employed. A sample of two hundred and sixty-seven (267) lecturers and administrators was randomly selected from fifteen tertiary institutions. The Goals of Service-Learning in Technical and Vocational Education Questionnaire (GOSLITVEQ), comprising 15 items, was used for data collection. A reliability coefficient of 0.83 was obtained for the questionnaire using Cronbach's alpha. Structural equation modelling (SEM), specifically confirmatory factor analysis (CFA), was employed for data analysis using Analysis of Moment Structures (AMOS). The findings revealed eight goals of service-learning that were deemed relevant for implementation in TVE. Based on the findings, the paper recommends these eight goals to Nigerian tertiary education institutions offering TVE for implementation of service-learning.
Article
Full-text available
Goodness-of-fit (GOF) indexes provide "rules of thumb"—recommended cutoff values for assessing fit in structural equation modeling. Hu and Bentler (1999) proposed a more rigorous approach to evaluating decision rules based on GOF indexes and, on this basis, proposed new and more stringent cutoff values for many indexes. This article discusses potential problems underlying the hypothesis-testing rationale of their research, which is more appropriate to testing statistical significance than evaluating GOF. Many of their misspecified models resulted in a fit that should have been deemed acceptable according to even their new, more demanding criteria. Hence, rejection of these acceptable-misspecified models should have constituted a Type 1 error (incorrect rejection of an "acceptable" model), leading to the seemingly paradoxical results whereby the probability of correctly rejecting misspecified models decreased substantially with increasing N. In contrast to the application of cutoff values to evaluate each solution in isolation, all the GOF indexes were more effective at identifying differences in misspecification based on nested models. Whereas Hu and Bentler (1999) offered cautions about the use of GOF indexes, current practice seems to have incorporated their new guidelines without sufficient attention to the limitations noted by Hu and Bentler (1999).
Article
Full-text available
Fit indexes were compared with respect to a specific type of model misspecification. Simple structure was violated with some secondary loadings that were present in the true models that were not specified in the estimated models. The χ² test, Comparative Fit Index, Goodness-of-Fit Index, Incremental Fit Index, Nonnormed Fit Index, root mean squared error of approximation, standardized root mean square residual, and the χ²/df values were investigated. Simulated data sets with 3 sample sizes (250, 500, and 1,000 cases), 4 levels of main loadings (.40, .50, .60, and .80), 2 numbers of factors (4, 8), and 2 types of association matrix (covariance, correlation) were the basis for maximum likelihood estimation of orthogonal and oblique factor models. Some correlations between fit indexes were low. Moreover, small distortions from simple structure did not lead to misfit in the RMSEA and SRMR, but they often led to misfit in the incremental fit indexes. This result may be of interest for research on personality traits, where small violations of simple structure are very common.
Article
Full-text available
As the use of structural equation modeling (SEM) has increased, confusion has grown concerning the correct use of and the conclusions that can be legitimately drawn from these methodologies. It appears that much of the controversy surrounding SEM is related to the degree of certainty with which causal statements can be drawn from these procedures. SEM is discussed in relation to the conditions necessary for providing causal evidence. Both the weaknesses and the strengths of SEM are examined. Although structural modeling cannot ensure that necessary causal conditions have been met, it is argued that SEM methods may offer the potential for tentative causal inferences to be drawn when used with carefully specified and controlled designs. Keeping in mind that no statistical methodology can in and of itself determine causality, specific guidelines are suggested to help researchers approach a potential for providing causal evidence with SEM procedures.
Article
Full-text available
Justification, in the vernacular language of philosophy of science, refers to the evaluation, defense, and confirmation of claims of truth. In this article, we examine some aspects of the rhetoric of justification, which in part draws on statistical data analysis to shore up facts and inductive inferences. There are a number of problems of methodological spirit and substance that in the past have been resistant to attempts to correct them. The major problems are discussed, and readers are reminded of ways to clear away these obstacles to justification. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Factor analysis, path analysis, structural equation modeling, and related multivariate statistical methods are based on maximum likelihood or generalized least squares estimation developed for covariance structure models (CSMs). Large-sample theory provides a chi-square goodness-of-fit test for comparing a model (M) against a general alternative M based on correlated variables. It is suggested that this comparison is insufficient for M evaluation. A general null M based on modified independence among variables is proposed as an additional reference point for the statistical and scientific evaluation of CSMs. Use of the null M in the context of a procedure that sequentially evaluates the statistical necessity of various sets of parameters places statistical methods in covariance structure analysis into a more complete framework. The concepts of ideal Ms and pseudo chi-square tests are introduced, and their roles in hypothesis testing are developed. The importance of supplementing statistical evaluation with incremental fit indices associated with the comparison of hierarchical Ms is also emphasized. Normed and nonnormed fit indices are developed and illustrated. (43 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
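As a rough illustration of the normed and nonnormed fit indices this abstract introduces, the sketch below uses the standard Bentler-Bonett formulas, which compare a target model's chi-square against that of the null (independence) model. The function names and the chi-square values are hypothetical and chosen only for illustration; they do not come from any study listed here.

```python
def normed_fit_index(chi2_null, chi2_model):
    """NFI: proportional reduction in chi-square relative to the null model."""
    return (chi2_null - chi2_model) / chi2_null


def nonnormed_fit_index(chi2_null, df_null, chi2_model, df_model):
    """NNFI (Tucker-Lewis form): same comparison, adjusted for degrees of freedom."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


# Hypothetical values: null model chi2 = 1000 on 45 df, target model chi2 = 80 on 35 df.
nfi = normed_fit_index(1000.0, 80.0)
nnfi = nonnormed_fit_index(1000.0, 45, 80.0, 35)
print(f"NFI = {nfi:.3f}, NNFI = {nnfi:.3f}")
```

The NNFI is not bounded by 0 and 1, which is why it is called nonnormed; values slightly above 1 can occur when the target model's χ²/df falls below 1.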
Article
Full-text available
In previous research (Hu & Bentler, 1998, 1999), 2 conclusions were drawn: standardized root mean squared residual (SRMR) was the most sensitive to misspecified factor covariances, and a group of other fit indexes were most sensitive to misspecified factor loadings. Based on these findings, a 2-index strategy (that is, SRMR coupled with another index) was proposed in model fit assessment to detect potential misspecification in both the structural and measurement model parameters. Based on our reasoning and empirical work presented in this article, we conclude that SRMR is not necessarily most sensitive to misspecified factor covariances (structural model misspecification), the group of indexes (TLI, BL89, RNI, CFI, Gamma hat, Mc, or RMSEA) are not necessarily more sensitive to misspecified factor loadings (measurement model misspecification), and the rationale for the 2-index presentation strategy appears to have questionable validity. The assessment of model fit in structural equation modeling (SEM) has long been a thorny issue in SEM application. As a result, the issues related to model fit assessment in SEM analysis have been at the forefront of theoretical and empirical research over the years. Research in this area has focused on different issues concerning the use and interpretation of model fit indexes. Studies typically examined the performance characteristics of different fit indexes under different data conditions ... (Structural Equation Modeling, 12(3), 343–367. Copyright © 2005, Lawrence Erlbaum Associates, Inc.)
Article
Full-text available
Model evaluation is one of the most important aspects of structural equation modeling (SEM). Many model fit indices have been developed. It is not an exaggeration to say that nearly every publication using the SEM methodology has reported at least one fit index. Most fit indices are defined through test statistics. Studies and interpretation of fit indices commonly assume that the test statistics follow either a central chi-square distribution or a noncentral chi-square distribution. Because few statistics in practice follow a chi-square distribution, we study properties of the commonly used fit indices when dropping the chi-square distribution assumptions. The study identifies two sensible statistics for evaluating fit indices involving degrees of freedom. We also propose linearly approximating the distribution of a fit index/statistic by a known distribution or the distribution of the same fit index/statistic under a set of different conditions. The conditions include the sample size, the distribution of the data as well as the base-statistic. Results indicate that, for commonly used fit indices evaluated at sensible statistics, both the slope and the intercept in the linear relationship change substantially when conditions change. A fit index that changes the least might be due to an artificial factor. Thus, the value of a fit index is not just a measure of model fit but also of other uncontrollable factors. A discussion with conclusions is given on how to properly use fit indices.
Article
Full-text available
After 4 decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred .05 criterion—still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that H 0 is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H 0 one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods is suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences, on replication.
Article
Full-text available
In confirmatory factor analysis, hypothesized models reflect approximations to reality so that any model can be rejected if the sample size is large enough. The appropriate question is whether the fit is adequate to support the model, and a large number of fit indexes have been proposed for this purpose. In the present article, we examine the influence of sample size on different fit indexes for both real and simulated data. Contrary to claims by Bentler and Bonett (1980), their incremental fit index was substantially affected by sample size. Contrary to claims by Joreskog and Sorbom (1981), their goodness-of-fit indexes provided by LISREL were substantially affected by sample size. Contrary to claims by Bollen (1986), his new incremental fit index was substantially affected by sample size. Hoelter's (1983) critical N index was also substantially affected by sample size. Of the more than 30 indexes considered, the Tucker-Lewis (1973) index was the only widely used index that was relatively independent of sample size. However, four new indexes based on the same form as the Tucker-Lewis index were also relatively independent of sample size. (C) 1988 by the American Psychological Association
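The claim that any approximate model can be rejected given a large enough sample can be sketched numerically. The example below assumes the usual large-sample relation χ² = (N − 1)·F, where F is the minimized fit function value, and the standard RMSEA formula. It holds F fixed across sample sizes, which ignores sampling variability and is purely a simplification for illustration. The values of F and df are hypothetical; 36.415 is the .05 critical value of the chi-square distribution on 24 degrees of freedom.

```python
import math


def chi_square_statistic(f_min, n):
    """Model chi-square under the large-sample relation chi2 = (N - 1) * F."""
    return (n - 1) * f_min


def rmsea(chi2, df, n):
    """Root mean square error of approximation, truncated at zero."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


F_MIN = 0.10          # hypothetical, small but nonzero misfit
DF = 24               # hypothetical model degrees of freedom
CRITICAL_05 = 36.415  # chi-square critical value for df = 24, alpha = .05

for n in (100, 250, 500, 1000, 5000):
    chi2 = chi_square_statistic(F_MIN, n)
    verdict = "rejected" if chi2 > CRITICAL_05 else "retained"
    print(f"N = {n:5d}  chi2 = {chi2:7.1f}  {verdict}  RMSEA = {rmsea(chi2, DF, n):.3f}")
```

Under these assumptions the same degree of misfit is retained at N = 100 and N = 250 but rejected from N = 500 upward, while the RMSEA levels off near sqrt(F/df) instead of growing with N.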
Article
It has been proposed that a clear separation of measurement from structural reasons for model failure can be obtained via a procedure testing 4 nested models: (a) a factor model, (b) a confirmatory factor model, (c) the anticipated structural equation model, and (d) possibly, a more constrained model. Advocates of the 4-step procedure contend that these nested models provide a trustworthy way of determining whether one's model is failing as a result of structural (conceptual) inadequacy, or as a result of measurement misspecification. We argue that measurement and structural issues cannot be unambiguously separated by the 4 steps, and that the seeming separation is incomplete at best and illusory at worst. The prime difficulty is that the 4-step procedure is incapable of determining whether the proposed model contains the proper number of factors. As long as the number of factors is in doubt, measurement and structural assessments remain dubious and entwined. The assessment of model fit raises additional difficulties because the researcher is implicitly favoring the null hypothesis, and the logic of the root mean square error of approximation (RMSEA) as a test of "close fit" is inconsistent with the logic of the 4-step. These discussions question whether factor analysis can dependably determine the proper number of factors, and argue against the routine use of .05 as the probability target for structural equation model chi-square fit.
Article
In confirmatory factor analysis, hypothesized models reflect approximations to reality so that any model can be rejected if the sample size is large enough. The appropriate question is whether the fit is adequate to support the model, and a large number of fit indexes have been proposed for this purpose. In the present article, we examine the influence of sample size on different fit indexes for both real and simulated data. Contrary to claims by Bentler and Bonett (1980), their incremental fit index was substantially affected by sample size. Contrary to claims by Joreskog and Sorbom (1981), their goodness-of-fit indexes provided by LISREL were substantially affected by sample size. Contrary to claims by Bollen (1986), his new incremental fit index was substantially affected by sample size. Hoelter's (1983) critical N index was also substantially affected by sample size. Of the more than 30 indexes considered, the Tucker-Lewis (1973) index was the only widely used index that was relatively independent of sample size. However, four new indexes based on the same form as the Tucker-Lewis index were also relatively independent of sample size. The purpose of the present investigation was to examine the influence of sample size on goodness-of-fit indicators used in confirmatory factor analysis (CFA). Although the present investigation was limited to CFA, the problems, issues, and most of the results generalize to the analysis of covariance structures. The advantages of CFA are well-known, and numerous introductions to the LISREL approach used in the present investigation are available elsewhere (e.g., Bagozzi, 1980; Joreskog &
Article
any application of structural equation modeling (SEM) must involve the specification of one or more models to be evaluated / it is critical for researchers using SEM to have a sound working knowledge of procedures and strategies for model specification / provide a detailed presentation of procedures for model specification as well as a discussion of related issues, such as the existence of equivalent models / discuss and offer recommendations regarding strategies for model construction / focus on the general case of conventional linear structural equation models (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
This article examines the adequacy of the “rules of thumb” conventional cutoff criteria and several new alternatives for various fit indexes used to evaluate model fit in practice. Using a 2‐index presentation strategy, which includes using the maximum likelihood (ML)‐based standardized root mean squared residual (SRMR) and supplementing it with either Tucker‐Lewis Index (TLI), Bollen's (1989) Fit Index (BL89), Relative Noncentrality Index (RNI), Comparative Fit Index (CFI), Gamma Hat, McDonald's Centrality Index (Mc), or root mean squared error of approximation (RMSEA), various combinations of cutoff values from selected ranges of cutoff criteria for the ML‐based SRMR and a given supplemental fit index were used to calculate rejection rates for various types of true‐population and misspecified models; that is, models with misspecified factor covariance(s) and models with misspecified factor loading(s). The results suggest that, for the ML method, a cutoff value close to .95 for TLI, BL89, CFI, RNI, and Gamma Hat; a cutoff value close to .90 for Mc; a cutoff value close to .08 for SRMR; and a cutoff value close to .06 for RMSEA are needed before we can conclude that there is a relatively good fit between the hypothesized model and the observed data. Furthermore, the 2‐index presentation strategy is required to reject reasonable proportions of various types of true‐population and misspecified models. Finally, using the proposed cutoff criteria, the ML‐based TLI, Mc, and RMSEA tend to overreject true‐population models at small sample size and thus are less preferable when sample size is small.
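The 2-index presentation strategy summarized above can be sketched as a simple decision function. The cutoffs below are the values quoted in the abstract (SRMR close to .08; TLI, BL89, CFI, RNI, and Gamma Hat close to .95; Mc close to .90; RMSEA close to .06). Treating "close to" as a hard threshold is a simplification, and the function and dictionary names are hypothetical.

```python
# Cutoffs quoted in the abstract; "close to" is treated here as a hard threshold.
SRMR_CUTOFF = 0.08
RMSEA_CUTOFF = 0.06
MINIMUM_CUTOFFS = {
    "TLI": 0.95,
    "BL89": 0.95,
    "CFI": 0.95,
    "RNI": 0.95,
    "GammaHat": 0.95,
    "Mc": 0.90,
}


def passes_two_index_rule(srmr, index_name, index_value):
    """Return True if SRMR and one supplemental index both meet their cutoffs."""
    if srmr > SRMR_CUTOFF:
        return False
    if index_name == "RMSEA":
        # RMSEA is a badness-of-fit index: smaller values are better.
        return index_value <= RMSEA_CUTOFF
    # All other supplemental indexes: larger values are better.
    return index_value >= MINIMUM_CUTOFFS[index_name]


print(passes_two_index_rule(0.05, "CFI", 0.96))    # both cutoffs met
print(passes_two_index_rule(0.09, "CFI", 0.99))    # SRMR too high
print(passes_two_index_rule(0.05, "RMSEA", 0.07))  # RMSEA too high
```

A mechanical check of this kind is exactly what several of the other entries on this page caution against when it is used in isolation, so it is best read as a screening aid alongside the chi-square test and other evidence of model adequacy.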
Article
This study evaluated the sensitivity of maximum likelihood (ML)-, generalized least squares (GLS)-, and asymptotic distribution-free (ADF)-based fit indices to model misspecification, under conditions that varied sample size and distribution. The effect of violating assumptions of asymptotic robustness theory also was examined. Standardized root-mean-square residual (SRMR) was the most sensitive index to models with misspecified factor covariance(s), and Tucker-Lewis Index (1973; TLI), Bollen's fit index (1989; BL89), relative noncentrality index (RNI), comparative fit index (CFI), and the ML- and GLS-based gamma hat, McDonald's centrality index (1989; Mc), and root-mean-square error of approximation (RMSEA) were the most sensitive indices to models with misspecified factor loadings. With ML and GLS methods, we recommend the use of SRMR, supplemented by TLI, BL89, RNI, CFI, gamma hat, Mc, or RMSEA (TLI, Mc, and RMSEA are less preferable at small sample sizes). With the ADF method, we recommend the use of SRMR, supplemented by TLI, BL89, RNI, or CFI. Finally, most of the ML-based fit indices outperformed those obtained from GLS and ADF and are preferable for evaluating model fit. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Statistical rituals largely eliminate statistical thinking in the social sciences. Rituals are indispensable for identification with social groups, but they should be the subject rather than the procedure of science. What I call the “null ritual” consists of three steps: (1) set up a statistical null hypothesis, but do not specify your own hypothesis nor any alternative hypothesis, (2) use the 5% significance level for rejecting the null and accepting your hypothesis, and (3) always perform this procedure. I report evidence of the resulting collective confusion and fears about sanctions on the part of students and teachers, researchers and editors, as well as textbook writers.
Article
Factor analysis, path analysis, structural equation modeling, and related multivariate statistical methods are based on maximum likelihood or generalized least squares estimation developed for covariance structure models. Large-sample theory provides a chi-square goodness-of-fit test for comparing a model against a general alternative model based on correlated variables. This model comparison is insufficient for model evaluation: In large samples virtually any model tends to be rejected as inadequate, and in small samples various competing models, if evaluated, might be equally acceptable. A general null model based on modified independence among variables is proposed to provide an additional reference point for the statistical and scientific evaluation of covariance structure models. Use of the null model in the context of a procedure that sequentially evaluates the statistical necessity of various sets of parameters places statistical methods in covariance structure analysis into a more complete framework. The concepts of ideal models and pseudo chi-square tests are introduced, and their roles in hypothesis testing are developed. The importance of supplementing statistical evaluation with incremental fit indices associated with the comparison of hierarchical models is also emphasized. Normed and nonnormed fit indices are developed and illustrated.
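The normed and nonnormed fit indices mentioned above both compare a target model with the null (independence) model. The sketch below uses the formulas as they are commonly stated for these indices; they are not quoted from the abstract, and the function names and example values are hypothetical.

```python
def normed_fit_index(chi2_null, chi2_model):
    """Normed fit index: proportional reduction in chi-square relative to the null model."""
    return (chi2_null - chi2_model) / chi2_null

def nonnormed_fit_index(chi2_null, df_null, chi2_model, df_model):
    """Nonnormed fit index: same comparison, adjusted for degrees of freedom."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)

# Hypothetical chi-square values for illustration only.
print(normed_fit_index(1000.0, 100.0))
print(nonnormed_fit_index(1000.0, 10, 20.0, 10))
```

Because the nonnormed index adjusts for degrees of freedom, it can fall outside the 0 to 1 range, whereas the normed index cannot when the target model fits at least as well as the null model.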
Causation issues in structural equation modeling research
  • Bullock