# Current Methodological Considerations in Exploratory and Confirmatory Factor Analysis

Abstract

Researchers must make numerous choices when conducting factor analyses, each of which can have significant ramifications for the model results. They must decide on an appropriate sample size to achieve accurate parameter estimates and adequate power, a factor model and estimation method, a method for determining the number of factors and evaluating model fit, and a rotation criterion. Unfortunately, researchers continue to use outdated methods in each of these areas. The present article provides a current overview of these areas in an effort to provide researchers with up-to-date methods and considerations in both exploratory and confirmatory factor analysis. A demonstration is provided to illustrate current approaches. Choosing between confirmatory and exploratory methods is also discussed, as researchers often make incorrect assumptions about the application of each.


- ... Goodness of fit was assessed based on the following [30]: standardised mean square residual (SMSR), root mean square error of approximation (RMSEA), Tucker-Lewis Index (TLI), and the comparative fit index (CFI). In line with recommendations, the following values were used as indicators of good fit: SMSR < 0.08 [29][30][31][32], RMSEA < 0.06 [30][31][32], TLI values > 0.9, and CFI close to 1 [29,30,32]. Of note, although chi-square is the traditional fit index used to evaluate model fit, it is very sensitive to sample size and can be inflated in large samples [30,31,33] and so this was not used as a goodness of fit (GFI) index for the CFA model. Measurement Invariance. ...
- Article
- Feb 2019

Recent studies highlight the role of excessive mind wandering in attention-deficit/hyperactivity disorder (ADHD) and its association with impairment. We believe assessing mind wandering could be especially relevant to individuals, including many females, who present with less externalising manifestations of ADHD. Using a new measure based on ADHD patient reports, the Mind Excessively Wandering Scale (MEWS), we previously found adults with ADHD had elevated levels of mind wandering that contributed to impairment independently of core ADHD symptoms. Using data from an online general population survey, the current study assessed the factor-structure, reliability, validity and measurement invariance of the MEWS. We also investigated sex differences in mind wandering, as well as ADHD symptoms, impairment and wellbeing in those with and without ADHD. The MEWS had a unidimensional structure, was invariant across sex, age and ADHD status, and accounted for unique variance in impairment and wellbeing beyond core ADHD symptoms. Among those with ADHD, we found no evidence for sex differences in mind wandering and among those without ADHD, males had higher scores. We also found similar levels of hyperactivity/impulsivity, emotional lability, and impairment in males and females with ADHD, but males reported greater inattention and lower wellbeing. Results suggest the MEWS is a reliable and valid instrument measuring the same construct across sex, age and ADHD status, which could aid diagnosis and monitoring of outcomes.

- ... The full range of possible values were observed except for Item 18. Normality among the observed variables was assessed in terms of skewness and kurtosis, and univariate and multivariate normality were violated (Zs > 1.96). This result required the use of a robust maximum likelihood method in EFA and CFA (Schmitt 2011). The Kaiser-Meyer-Olkin measure of sampling adequacy² was satisfied (.75). ... Notes: 2. Values of .60 and above are required for good FA (Tabachnick & Fidell 2014). 3. Simulation research has indicated that the parallel analysis is the best empirical method for selecting the appropriate number of factors (Schmitt 2011). In the analysis, a series of data sets generated using the sample size and number of variables of the original data set is used to examine whether real nonrandom factors exist. ...
- Article
- Apr 2019

This paper reports an approximate replication of Matsuda & Gobel (2004) for the psychometric validation of the Foreign Language Reading Anxiety Scale (FLRAS). Their study examined the structural aspects of the FLRAS developed by Saito, Horwitz & Garza (1999). The results showed that the FLRAS measured three different subcomponents of foreign language reading anxiety; none of the factors predicted foreign language performance in content-based and four-skill classes. The present study aimed to reconfirm the psychometric validity of the FLRAS because it has been widely employed to make foreign language reading anxiety researchable. Our study retained the same methodology, with the exception of the measurements of classroom performance and reading proficiency. Matsuda & Gobel's conclusions were reproduced by showing a weak relationship between classroom performance and foreign language reading anxiety measured by the three-factor model of the FLRAS. However, this study newly demonstrated a strong association of reading-anxiety subcomponents with learners' reading proficiency. The administration, scoring, and interpretation methods of the FLRAS were reconsidered based on the replicated results.

- ... Therefore, we confirmed the original number of components using another method. Various methods have been recommended by which to confirm the best number of components [24,25]. MAP approaches have been suggested as suitable for deciding the number of components [25][26][27]. In the present study, the six components of the K-HBQOL were extracted by K1 and MAP approaches, and floor and ceiling effects for each item (N = 168). ...

Purpose: The purpose of this study was to conduct a psychometric evaluation of the Korean version of the Hepatitis B Quality of Life Questionnaire (K-HBQOL), which is designed to assess the quality of life of patients with the hepatitis B virus (HBV). Methods: The K-HBQOL was developed by converting the original English version to Korean using a back-and-forth translation method. The translated questionnaire was distributed to 168 adults with HBV. Descriptive statistics were used to summarize the demographic characteristics of these participants. Confirmatory factor analysis and exploratory principal components analysis (CFA and PCA) were performed, applying varimax rotation and using Kaiser’s eigenvalue-greater-than-one rule to examine the factor structure. The minimum average partial rule was also used to identify the number of components. Results: The original factor model was not confirmed by CFA in this sample. Six components (Psychological Well-being, Stigmatization, Anticipation Anxiety, Weakness, Vitality, and Vulnerability) were extracted by PCA, with 69.11% of the total variance explained. In this process, one component present in the original factor model (Transmission) was not found, while another component (Weakness) was extracted. Conclusions: This study revealed the psychometric characteristics of the Korean version of the HBQOL for patients with hepatitis B. We suggest a study with a larger sample is needed to evaluate validity and reliability of the K-HBQOL for other Korean populations.
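
The eigenvalue-greater-than-one (K1) rule named in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure the authors ran: the function name `kaiser_rule` and the simulated item data are hypothetical, and the data are assumed to be an observations-by-variables NumPy array.

```python
import numpy as np


def kaiser_rule(data):
    """Count components whose correlation-matrix eigenvalue exceeds 1.

    `data` is an (n_observations, n_variables) array. Returns the number of
    components retained and the share of total variance they explain.
    """
    corr = np.corrcoef(data, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    n_retained = int(np.sum(eigenvalues > 1.0))
    # Eigenvalues of a correlation matrix sum to the number of variables.
    explained = float(eigenvalues[:n_retained].sum() / eigenvalues.sum())
    return n_retained, explained


# Hypothetical data: six items driven by two uncorrelated components plus noise.
rng = np.random.default_rng(0)
scores = rng.standard_normal((500, 2))
pattern = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                    [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
items = scores @ pattern.T + 0.6 * rng.standard_normal((500, 6))
n_components, share = kaiser_rule(items)
```

The second return value corresponds to the kind of "total variance explained" figure quoted in the abstract. Several of the citing studies listed here prefer parallel analysis or the MAP test over this rule when choosing the number of factors.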
- ... The resulting survey will be piloted by administering it to staff at eight VHA medical centers that are not a part of the evaluation but are involved in process improvement efforts. Using the pilot data, we will conduct a serially exploratory factor analysis with at least three items per factor, and reliable fit statistics [30]. We will then examine these items using a confirmatory factor analysis, looking for appropriate fit statistics, loadings, and clinical relevance [31,32]. ...

Background: The goal of Lean Enterprise Transformation (LET) is to go beyond simply using Lean tools and instead embed Lean principles and practices in the system so that it becomes a fundamental, collective mindset of the entire enterprise. The Veterans Engineering Resource Center (VERC) launched the Veterans Affairs (VA) LET pilot program to improve quality, safety, and the Veteran’s experience. A national evaluation will examine the pilot program sites’ implementation processes, outcomes and impacts, and abilities to improve LET adoption and sustainment. This paper describes the evaluation design for the VA LET national evaluation and describes development of a conceptual framework to evaluate LET specifically in healthcare settings. Methods: A targeted literature review of Lean evaluation frameworks was performed to inform the development of the conceptual framework. Key domains were identified by a multidisciplinary expert group and then validated with key stakeholders. The national evaluation design will examine LET implementation using qualitative, survey, and quantitative methods at ten VA facilities. Qualitative data include site visits, interviews, and field observation notes. Survey data include an employee engagement survey to be administered to front-line staff at all pilot sites. Quantitative data include site-level quality improvement metrics collected by the Veterans Services Support Center. Qualitative, quantitative, and mixed-methods analyses will be conducted to examine implementation of LET strategic initiatives and variations in implementation success across sites. Discussion: This national evaluation of a large-scale LET implementation effort will provide insights helpful to other systems interested in embarking on a Lean journey. Additionally, we created a multi-faceted conceptual framework to capture the specific features of a Lean healthcare organization. This framework will guide this evaluation and may be useful as an assessment tool for other organizations interested in implementing Lean principles at an enterprise level. Electronic supplementary material: The online version of this article (10.1186/s12913-019-3919-2) contains supplementary material, which is available to authorized users.
- ... To control the error of the non-response bias [82], an exploratory factor analysis (EFA) was employed [83]. The EFA analysis indicated that the RBE is not significant, being 32.99% of the total explained variance. ...
- Article
- Dec 2018

Currently, there is a growing number of businesses which organize their operations in the form of projects. One of the key success factors in the area of project management is building successful relationships with project stakeholders. Using stakeholder theory perspective and looking through the lens of family involvement, the study addresses two research questions: 1. how do family firms perceive the difficulty in building relationships with external stakeholders compared to other project management difficulties; 2. does organizing work in the form of projects redefine the significance of family involvement in the difficulties of building relationships with external stakeholders. To answer these questions, 154 Polish family-owned enterprises, considered as representatives of Eastern European emerging economies, were surveyed. The results indicate that family involvement strongly influences the difficulties in building relationships with external stakeholders, but only in those companies which at the time of the survey were not managing projects. In the firms employing project management practices, only the factor related to increasing the number of employees had a facilitating effect on the studied phenomenon. On the contrary, in the case of family firms not managing projects, the growth in the number of employees increased the difficulty in building relationships with external stakeholders. The findings add to the research on the role of family involvement in building relationships with a firm’s external stakeholders.

- ... Though debate exists regarding which rotation method is most appropriate for a given data set (Henson & Roberts, 2006), most scholars agree that an orthogonal rotation should be employed in instances where variables are correlated and oblique rotation when variables are not correlated (Schmitt, 2011). Given the aforementioned correlations reported in Table 2 demonstrated minimal corollary relationships between COP and ILP variables and rather robust correlations within both COP and ILP variables, the present study employed both orthogonal (varimax) and oblique (promax) rotations in the EFAs for purposes of robustness. ...

Purpose: Despite increased scholarly inquiry regarding intelligence-led policing and popularity among law enforcement agencies around the globe, ambiguity remains regarding the conceptual foundation and appropriate measurement of ILP. Although most scholars agree that ILP is indeed a unique policing philosophy, there is less consensus regarding the relationship between ILP and the ever-present model of community-oriented policing (COP). Consequently, there is a clear need to study the empirical distinctions and overlaps in these policing philosophies as implemented by U.S. law enforcement agencies. Methods: Data were gleaned from the 2007 LEMAS and 2009 NIJ Intelligence surveys. A total of 227 unique police agencies in the United States are included. A series of bivariate, exploratory factor analyses, and structural models are used to determine discriminatory or convergent validity across COP and ILP constructs. Findings: The goal was to answer the question: Are these two policing philosophies being implemented as separate and distinct strategies? Results of our exploratory and structural models indicate that COP and ILP loaded on unique latent constructs. This affirms the results of the bivariate correlations, and indicates that COP and ILP have discriminant measurement validity. In other words, COP and ILP are conceptually distinct, even when implemented in police departments across the United States. Implications of these findings, and suggestions for future research are discussed. Originality: This is the first study to empirically test the discriminant or convergent validity of COP and ILP.
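
For readers unfamiliar with what an orthogonal rotation such as varimax does to a loading matrix, the following is a minimal sketch using a commonly used SVD-based iteration. It is an illustration only, not the routine used in the study above; the function name `varimax`, its parameters, and the example loadings are hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x factors) loading matrix toward the varimax criterion.

    The rotation matrix is kept orthogonal at every step, so the rotated
    factors remain uncorrelated and each variable's communality is unchanged.
    """
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Matrix whose SVD gives the next orthogonal rotation.
        column_ss = np.diag(np.sum(rotated ** 2, axis=0))
        target = rotated ** 3 - rotated @ column_ss / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = float(np.sum(s))
        # Stop once the criterion no longer improves by a relative tolerance.
        if new_criterion < criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation


# Hypothetical unrotated loadings for six variables on two factors.
unrotated = np.array([[0.7, 0.3], [0.8, 0.4], [0.6, 0.3],
                      [0.5, -0.6], [0.4, -0.7], [0.5, -0.5]])
rotated, rotation = varimax(unrotated)
```

An oblique method such as promax starts from a solution like this one and then relaxes the orthogonality constraint so that the factors may correlate; that step is not shown here.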
- ... [32,33] This approach is recommended after a priori models have demonstrated poor confirmatory factor analytic fit and when researchers do not have strong hypotheses for model development. [32,34] We randomly divided our analytic sample dataset into two equal-sized datasets using SPSS's random case-selection function. One dataset, the training dataset, was employed to identify the factor structure. ...
- Article
- Dec 2018
- PALLIATIVE MED

Background: Ensuring a good death in individuals with advanced disease is a fundamental goal of palliative care. However, the lack of a validated patient-centered measure of quality of dying and death in advanced cancer has limited quality assessments of palliative-care interventions and outcomes. Aim: To examine item characteristics and the factor structure of the Quality of Dying and Death Questionnaire in advanced cancer. Design: Cross-sectional study with pooled samples. Setting/participants: Caregivers of deceased advanced-cancer patients (N = 602; mean ages = 56.39–62.23 years), pooled from three studies involving urban hospitals, a hospice, and a community care access center in Ontario, Canada, completed the Quality of Dying and Death Questionnaire 8–10 months after patient death. Results: Psychosocial and practical item ratings demonstrated negative skewness, suggesting positive perceptions; ratings of symptoms and function were poorer. Of four models evaluated using confirmatory factor analyses, a 20-item, four-factor model, derived through exploratory factor analysis and comprising Symptoms and Functioning, Preparation for Death, Spiritual Activities, and Acceptance of Dying, demonstrated good fit and internally consistent factors (Cronbach’s α = 0.70–0.83). Multiple regression analyses indicated that quality of dying was most strongly associated with Symptoms and Functioning and that quality of death was most strongly associated with Preparation for Death (p < 0.001). Conclusion: A new four-factor model best characterized quality of dying and death in advanced cancer as measured by the Quality of Dying and Death Questionnaire. Future research should examine the value of adding a connectedness factor and evaluate the sensitivity of the scale to detect intervention effects across factors.

- ... Starting from the largest eigenvalue, factors are retained as long as their empirical eigenvalue is greater than the eigenvalue of its random counterpart. Parallel analysis has been one of the most studied and accurate dimensionality assessment methods for continuous and categorical variables to date (Crawford et al., 2010; Garrido et al., 2013, 2016; Ruscio & Roche, 2012; Schmitt, 2011; Timmerman & Lorenzo-Seva, 2011). ...
- Preprint
- Dec 2018

Exploratory graph analysis (EGA) is a highly accurate technique that was recently proposed within the framework of network psychometrics to estimate the number of factors underlying multivariate data. Unlike other methods, EGA produces a visual guide (a network plot) that not only indicates the number of dimensions to retain, but also which items cluster together and their level of association. However, although previous studies have found EGA to be superior to traditional methods, they are limited in the conditions considered. These issues are here addressed through an extensive simulation study that incorporates a wide range of plausible structures that may be found in practice. Additionally, a new variant of EGA based on the triangulated maximally filtered graph approach (EGAtmfg) is evaluated, and both are compared with five widely used and/or recommended factor analytic techniques. Overall, EGA and EGAtmfg are found to perform as well as the most accurate traditional method, parallel analysis, and to produce the best large-sample properties of all the methods evaluated. To increase use and transparency, we present a straightforward R tutorial on how to use and interpret EGA, and apply it to the scores from a well-known Big Five personality test. Finally, we offer a set of practical guidelines for applied researchers, and outline next steps for large scale assessments in health beyond psychology.

- ... The initial 55 items were examined for missing values (<0.01%), which were replaced using a random forest iterative process with 10,000 trees (proportion of falsely classified = 0.01). We conducted EFA using a weighted least squares (WLS) approach (with an oblique, Oblimin rotation), which is a preferred technique for analyzing ordinal data that are prone to non-normality (Schmitt, 2011). Each run was preceded by computing Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, and the optimal number of factors was determined using parallel analysis with 1,000 resamples. ...
- Article
- Feb 2019
- SCAND J PSYCHOL

Mounting evidence suggests that experiences of forgiveness vary across cultures. However, culturally-sensitive conceptualizations of forgiveness lack empirical support, in part because psychometrically sound instruments designed to capture unique aspects of forgiveness in non-Western cultures are rare. For this reason, we developed the Collectivist-Sensitive Trait Forgivingness Scale (TFS-CS), which is designed to measure trait forgivingness within societies characterized by a blend of individualistic and collectivistic worldviews. In Study 1 (N = 597), exploratory factor analysis revealed a 16-item three-factor structure of third-party forgiveness, collectivistic forgiveness, and interpersonal resentment among South Africans. In Study 2 (N = 897), the three-factor model replicated in an independent South African sample. Findings also offered preliminary evidence supporting the construct validity of the TFS-CS. Overall, these studies support a conceptualization of trait forgivingness with similarities and differences relative to Western models and highlight the importance of appreciating the influence of culture when measuring forgiveness.

- ... Regarding the estimation method, we used a weighted least square mean adjusted (WLSM) estimator, because the observed variables (items) were ordered categorically and this estimation method is more accurate than the maximum likelihood method (Schmitt, 2011). We also handled missing data using the WLS estimation method in the presence of missing data. ...
- Article
- Jan 2019
- SCH EFF SCH IMPROV

The purpose of this study was to examine the relationship between teaching quality and students’ harmonious passion, deep strategy to learn, and epistemic curiosity in mathematics in 1,003 high school students. Data were analyzed using multilevel structural equation modeling, and results showed support for the hypotheses tested. First, we found that teaching quality – specifically, providing an optimal challenge, focusing on the process, and offering positive feedback – predicted students’ harmonious passion. Second, students’ harmonious passion predicted, at the individual and class level, students’ deep strategy to learn. Third, students’ harmonious passion predicted, at the individual and class level, students’ epistemic curiosity. Findings were discussed regarding their implications for educational practice and methodological suggestions for future research.

- ... and RMSEA values lower than .06 were considered indicative of a good structural fit (Schmitt, 2011). Tests for the Δχ² between nested models calculated with the WLSMV estimator were undertaken following Satorra and Bentler (2010). ...
- Article
- Jan 2019

Academic expectations play a significant role in the quality of student adaptation and academic success. Previous research suggests that expectations are a multidimensional construct, making it crucial to test the measures used for this important characteristic. Because assessment of student adaptation to higher education comprises a multitude of personal and contextual variables, including expectations, shortened versions of assessment instruments are critical. In this article, confirmatory factor analysis was used to obtain a short version of the Academic Perceptions Questionnaire–Expectations (APQ-E). Participants were 3,017 first-year Portuguese college students. The results support the use of a shorter version of 24 items, distributed over six dimensions, with good reliability and validity.

- ... Based on existing evidence (Garrido et al., 2013), PA PCA seems to produce better results than PA EFA. PA is supported by strong evidence from simulation studies (Hubbard & Allen, 1987; Humphreys & Montanelli, 1975; Peres-Neto et al., 2005; Velicer et al., 2000; Zwick & Velicer, 1986) and is generally considered to be the method of choice (e.g., Hayton et al., 2004; Schmitt, 2011). However, there are two weaknesses associated with PA, initially suggested by Horn (1965). ...
- Article
- Jan 2019
- PSYCHOL METHODS

Exploratory factor analyses are commonly used to determine the underlying factors of multiple observed variables. Many criteria have been suggested to determine how many factors should be retained. In this study, we present an extensive Monte Carlo simulation to investigate the performance of extraction criteria under varying sample sizes, numbers of indicators per factor, loading magnitudes, underlying multivariate distributions of observed variables, as well as how the performance of the extraction criteria are influenced by the presence of cross-loadings and minor factors for unidimensional, orthogonal, and correlated factor models. We compared several variants of traditional parallel analysis (PA), the Kaiser-Guttman Criterion, and sequential χ² model tests (SMT) with 4 recently suggested methods: revised PA, comparison data (CD), the Hull method, and the Empirical Kaiser Criterion (EKC). No single extraction criterion performed best for every factor model. In unidimensional and orthogonal models, traditional PA, EKC, and Hull consistently displayed high hit rates even in small samples. Models with correlated factors were more challenging, where CD and SMT outperformed other methods, especially for shorter scales. Whereas the presence of cross-loadings generally increased accuracy, non-normality had virtually no effect on most criteria. We suggest researchers use a combination of SMT and either Hull, the EKC, or traditional PA, because the number of factors was almost always correctly retrieved if those methods converged. When the results of this combination rule are inconclusive, traditional PA, CD, and the EKC performed comparatively well. However, disagreement also suggests that factors will be harder to detect, increasing sample size requirements to N ≥ 500.

- ... Third, the Kaiser-Meyer-Olkin (KMO) test was conducted to assess for sampling adequacy (≥0.70). We performed a parallel analysis of the overall PANSS to determine the number of factors to be extracted (Schmitt, 2011). Parallel analysis indicated five factors that exceeded the mean eigenvalue of randomly generated data across 5,000 iterations. ...

Objectives: The present study examines the latent factor structure of general psychopathology and investigates the mediating role of unmet psychosocial concerns, motivation, and medication side effects in the relationship between general psychopathology and quality of life (QOL) impairment in patients with schizophrenia. Methods: A total of 251 patients completed self-report measures of unmet psychosocial concerns, motivation, medication side effects, and physical/mental QOL impairment. The severity of schizophrenia was assessed on the Positive and Negative Syndrome Scale. Results: Exploratory factor analysis revealed one latent factor (emotional distress) of general psychopathology. Mediation path analyses controlling for confounding variables revealed significant indirect effects of unmet psychosocial concerns, motivation, and medication side effects on emotional distress and physical/mental QOL impairment. Conclusions: Our findings suggest that identifying optimal methods of managing co-occurring emotional distress as well as secondary psychosocial factors on psychological health may improve QOL among patients diagnosed with schizophrenia.
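
The parallel-analysis procedure described in the excerpt above (compare observed eigenvalues with the mean eigenvalue of randomly generated data) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: it uses eigenvalues of the full correlation matrix and normally distributed random data, and the function name `parallel_analysis`, its parameters, and the simulated items are hypothetical.

```python
import numpy as np


def parallel_analysis(data, n_iterations=1000, seed=0):
    """Horn-style parallel analysis on correlation-matrix eigenvalues.

    Starting from the largest eigenvalue, a factor is retained for as long
    as the observed eigenvalue exceeds the mean eigenvalue obtained from
    random normal data with the same number of observations and variables.
    """
    n_obs, n_vars = data.shape
    rng = np.random.default_rng(seed)
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    random_eigs = np.empty((n_iterations, n_vars))
    for i in range(n_iterations):
        noise = rng.standard_normal((n_obs, n_vars))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    reference = random_eigs.mean(axis=0)

    n_factors = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:
            break  # stop at the first eigenvalue that fails the comparison
        n_factors += 1
    return n_factors, observed, reference


# Hypothetical data: six items driven by two uncorrelated factors plus noise.
rng = np.random.default_rng(1)
scores = rng.standard_normal((500, 2))
pattern = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                    [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
items = scores @ pattern.T + 0.6 * rng.standard_normal((500, 6))
n_factors, observed, reference = parallel_analysis(items, n_iterations=200)
```

The excerpt reports 5,000 iterations with the mean as the reference value; a stricter variant replaces `random_eigs.mean(axis=0)` with a high percentile such as `np.percentile(random_eigs, 95, axis=0)`.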
- ... EFA in SPSS (Version 23) used principal axis factoring and promax rotation because the items on each of the measures were correlated. To determine the number of factors to retain within EFA, we used both parallel analysis (Horn, 1965) and the minimum average partial test (MAP test; Velicer, 1976), which more accurately select the number of factors within EFA compared to the traditional methods of eigenvalue-greater-than-one or the Kaiser criterion (Schmitt, 2011). We conducted parallel analysis with the fapara package, and the MAP test used the minap package in STATA (Version 14). ...
- Article
- Jan 2019

Objective: Civic engagement during emerging adulthood (18–25 years) has been found to be a protective factor for mental illness and substance abuse. However, few measures to assess levels of civic behaviors and attitudes among emerging adults are available. This study investigates the factorial validity of a combined measure of civic behavior and attitudes in a sample of undergraduate college students. Method: Two samples of first-year undergraduate students at a private university completed an online survey during fall and spring terms. We investigated the factor structure of the Civic Mindedness and Civic Acts measures in two steps: (a) exploratory factor analysis (EFA; n = 226), and (b) confirmatory factor analysis (CFA; n = 352). Results: EFA revealed a 4-factor, 15-item structure that aligned with the original Civic Mindedness and Civic Acts measures; all factor loadings were greater than .40. CFA found the 4-factor structure to have marginally acceptable fit with no modifications (χ²(84) = 225.99, p < .05; RMSEA = .07 [.06–.08]; CFI = .92). Conclusions: We provide preliminary evidence for factorial validity of a combined measure of civic engagement. Implications for future research to confirm whether this measure is appropriate for broad use during emerging adulthood are discussed. © 2019 by the Society for Social Work and Research. All rights reserved.

- ... The confirmatory procedure then becomes data-driven and exploratory and may well lead to the models fitting well by chance only (MacCallum et al., 1992; Browne, 2001). Schmitt (2011) concludes his article by stating the same idea: "EFA and CFA are mostly differentiated by including or not including cross-loadings, respectively, and are not only 'exploratory' or only 'confirmatory' as CFA can be used to explore with MIs and EFA can be used to confirm when a priori cross-loadings are hypothesized." In this context, it is worth noting that the ESEM approach can be used as a tool of both confirmatory analyses (as in the present study) and exploratory analyses; not too much weight should be placed on the purely semantic issue of the method being called "exploratory". ...
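
As a worked check on the CFA fit statistics reported in the civic engagement abstract above (model chi-square of 225.99 on 84 degrees of freedom, n = 352, RMSEA = .07), the RMSEA point estimate can be recomputed from the chi-square. This is a sketch using the standard formula; the function name `rmsea` and the `use_n_minus_1` flag are hypothetical, and software packages differ in whether they divide by N or N − 1.

```python
import math


def rmsea(chi_square, df, n, use_n_minus_1=True):
    """Point estimate of RMSEA from a model chi-square, its df, and sample size."""
    scale = (n - 1) if use_n_minus_1 else n
    # Excess of chi-square over its degrees of freedom, floored at zero.
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * scale))


# Values taken from the abstract above.
estimate = rmsea(225.99, 84, 352)
rounded = round(estimate, 2)  # rounds to the reported .07

# Compare against the RMSEA < .06 cutoff that several excerpts on this page cite.
meets_cutoff = estimate < 0.06
```

The recomputed value sits above the .06 cutoff, which is consistent with the authors describing the fit as only marginally acceptable.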
- ... Our determination of the appropriate number of factors to retain was based on eigenvalues (values over 1.0), proportion of variance explained by each factor and cumulative variance explained, scree plot (number of factors with eigenvalues over 1.0 and above the change of scree slope), parallel analyses (factors with eigenvalues above those from randomly generated data), factor loadings (statistically significant and over 0.40), theoretical interpretability, and practical utility (Brown, 2015; Schmitt, 2011; Schmitt, Sass, Chappelle, & Thompson, 2018). The scree plots and parallel analyses were conducted in the program R (version 3.5.0; ...
- Article
- Nov 2018
- J SCHOOL PSYCHOL

Future orientation (FO) has received increasing attention for its positive effects on adolescent well-being and successful transition to adulthood. Although numerous measures of FO exist, most are not developmentally appropriate for diverse populations of adolescents, do not assess all theoretical components of FO, and/or were not developed for administration in schools. Additionally, the invariance of existing measures across racial/ethnic groups has not been examined using appropriately rigorous procedures. Using data from 2575 students in grades 6–9, this study examined the psychometric quality and measurement invariance of the FO scale on the School Success Profile (SSP) across African American (34.8%), Latino (27.0%), and European American (38.1%) subsamples. A one-factor model fit the data well in all three groups. Analyses identified only a small number of noninvariant parameters, supporting the conclusion that the scale has partial measurement invariance across the three groups. On average, African Americans had significantly higher levels of FO than the other two groups; mean scores for Latinos and European Americans were lower and statistically equivalent to each other. Construct validity of the SSP FO scale was also supported by findings of medium-sized relationships of FO scores to scores on five other constructs: low grades, school engagement, parent educational support, psychological distress, and school behavior. Multiple group tests of the magnitude and direction of the validity relationships indicated statistical equivalence across the three groups. Results support the use of the SSP FO scale by school psychologists to assess FO and to evaluate the effects of interventions targeting FO as a promoter of well-being and school success. - ... 
Herewith, a further distinction can be made between exploratory and confirmatory approaches (Hurley et al., 1997; Schmitt, 2011; Tukey, 1980) depending on whether or not one has a preliminary idea about the positions of the variables on the constructs. In this paper, we will focus on the exploratory case, in which no such a priori knowledge is available. ...
- Article
- Apr 2019

In psychology, many studies measure the same variables in different groups. In case of a large number of variables and when a strong a priori idea about underlying latent constructs is lacking, researchers often start with reducing the variables to a few principal components in an exploratory way. Herewith, one often wants to evaluate whether the components represent the same construct in the different groups. To this end, it makes sense to remove outlying variables that have significantly different loadings on the extracted components across the groups, hampering an equivalent interpretation of the components. Moreover, identifying such outlying variables is important to test theories about which variables behave similarly or differently across groups. In this paper, we first scrutinize the lower bound congruence method (LBCM; De Roover, Timmerman, & Ceulemans, 2017) that was recently proposed for solving the outlying variable detection problem. LBCM investigates how Tucker’s congruence between the loadings of the obtained cluster loading matrices improves when specific variables are discarded. We show that LBCM has the tendency to output outlying variables that are either false positives or concern very small and thus practically insignificant loading differences. In order to address this issue, we present a new heuristic: the Lower and Resampled Upper Bound Congruence Method (LRUBCM). The method uses a resampling technique to obtain a sampling distribution for the congruence coefficient, under the hypothesis that no outlying variable is present. In a simulation study, we show that the LRUBCM method outperforms LBCM. We finally illustrate the use of the method by means of empirical data. - ... Regarding the internal structure, since the original questionnaire is at an early stage of development and this is the first iteration of the translated tool, confirmatory factor analysis seemed premature and restrictive [26]. 
Hence, exploratory factor analysis was performed as it would allow to identify items that might be problematic. ...

Background. Utilization of the emergency department (ED) by patients seeking relief from chronic pain (CP) has increased. These patients often face stigmatization, and the ED is no exception. The French-Canadian Chronic Pain Myth Scale (CPMS) was developed to evaluate common societal misconceptions about CP including among healthcare providers. To our knowledge, no tool of this nature is available in English. Objectives. This study thus aimed at determining to what extent a new English adaptation of the CPMS could provide valid scores among US emergency nurses. The internal consistency, construct validity, and internal structure of the translated scale were thus examined. Methods. After careful translation of the scale, the English CPMS was administered to 482 emergency nurses and its validity was explored through a web-based cross-sectional study. Results. Acceptable reliability (α > 0.7) was reported for the first and third subscales. The second subscale's reliability coefficient was below the cutoff (α = 0.67) but is still considered adequate. As expected, statistically significant differences were found between nurses suffering from CP vs nurses not suffering from CP, supporting the construct validity of the scale. After exploratory factor analysis, similar internal structure was found supporting the 3-factorial nature of the original CPMS. Conclusion. Our results provide support for the preliminary validity of the English CPMS to measure knowledge, beliefs, and attitudes towards CP among emergency nurses in the United States.
- ... • Information theory-based index: this index is pertinent to compare several alternative models with data adjustments. Akaike Information Criterion (AIC) [96]. It is mandatory to analyze the fit and validity of the identified variables, to assess the quality of the collected data, for further evaluation of the defined general model. ...
- Article
- Mar 2019

Nowadays, Lean Manufacturing, Industry 4.0, and Sustainability are important concerns for the companies and in a general way for the society, principally, the influence of the two production philosophies, Lean Manufacturing and Industry 4.0, in the three main pillars of sustainability: economic, environmental, and social. According to the literature review done in this work, these relations are not well known and are dispersed by different sustainability’s criteria. To address this gap, this research proposes a structural equation model, with six hypotheses, to quantitatively measure the effects of Lean Manufacturing and Industry 4.0, in Sustainability. To statistically validate such hypotheses, we collected 252 valid questionnaires from industrial companies of Iberian Peninsula (Portugal and Spain). Results show that: (1) it is not conclusive that Lean Manufacturing is correlated with any of the sustainability pillars; and (2) Industry 4.0 shows a strong correlation with the three sustainability pillars. These results can contribute as an important decision support for the industrial companies and its stakeholders, even because not all the results are in line with other opinions and studies. - ... It is important to emphasize that a possible limitation with these studies is that they both use PCA instead of EFA. PCA uses the total variance of the indicators assuming measurement without error and thus has a different goal (data reduction) than EFA which focuses on extraction of common latent factors (Floyd & Widaman, 1995;Schmitt, 2011). ...Article
- Feb 2019
- ASSESSMENT

Psychometric evaluations of the Resilience Scale for Adolescents (READ) have yielded inconsistent support for the original five-factor solution, with different modifications being proposed. The aim of the present article was to investigate the psychometric properties and factor structure of the READ using both confirmatory and exploratory methods, and to evaluate how the scale fits within the theoretical framework of resilience. Data stem from the population-based youth@hordaland-study of 9,596 adolescents from 16 to 19 years of age. Using confirmatory factor analysis, the original five-factor model yielded relatively poor fit. A better model fit was identified for a different five-factor structure using exploratory methods including two new personal factors measuring (a) Goal Orientation and (b) Self-Confidence. This division was supported by low secondary loadings and moderate correlations between the factors, and gender differences in the mean scores. Although the READ is a multidimensional measure that includes individual, family, and social factors related to the resilience process, some important aspects of resilience have not been included. - ArticleFull-text available
- Dec 2018

The Ways of Coping Questionnaire (WCQ) is used extensively in health research, but the measurement properties and suitability of the WCQ for people with Parkinson’s disease (PD) have not been psychometrically assessed. If the WCQ does not align with its original 8-factor structure in a PD population, the use of the WCQ subscales may not be appropriate. The present study used confirmatory factor analysis (CFA), exploratory factor analysis (EFA), and multiple-group EFA to determine the ideal factor structure of the WCQ in a PD sample. The original 8 factors of the WCQ were not reproduced. EFA revealed a 6-factor structure, including Distancing, Faith, Avoidance, Seeking Social Support, Planful Problem Solving, and Confrontive coping. As motor symptom severity may impact coping, the stability of the 6-factor structure was examined across motor symptom severity (mild and moderate), remaining consistent. Higher levels of overall motor severity were associated with increased use of faith and avoidance style coping. These findings suggest that the 6-factor structure of the WCQ may be more appropriate for assessing coping styles in PD. - Article
- Dec 2018

Purpose: The purpose of this paper is to develop a scale to empirically measure the self-centered leadership (SCL) pattern in Arab organizations. Design/methodology/approach: This paper depends on two Egyptian samples. It has conducted exploratory factor analysis, confirmatory factor analysis and multiple regression analyses to generate the proposed SCL measurement scale. Findings: The analyses have revealed that the new measurement scale is valid and reliable. They have also confirmed the multidimensional structure of the self-centered leadership construct. Originality/value: The Arab leadership literature is short of scales which take into consideration the specialties of the Arab cultures. Therefore, this study fills a lacuna in international research which examines Arab leadership behaviors from a culture-bound perspective.
- Objective. The Dental Activities Test (DAT) was developed to be used by dental, nursing, and other health professionals to assess the ability of persons with dementia to perform oral health-related activities and aid care planning. The instrument was designed as a unitary scale and has excellent internal consistency, test-retest reliability, interrater reliability, and construct validity. This study examines the underlying factor structure of the DAT among older adults in assisted living settings. Methods. In a secondary analysis of the data from the original study, the results of testing of 90 older adults with normal to severely impaired cognition from three assisted living communities in North Carolina from March 2013 to February 2014 were studied. An exploratory factor analysis was used to assess the dimensionality of the presumed unitary assessment scale. Results. Two-factor structures were explored. A one-factor model demonstrated acceptably mixed model fit, and a two-factor model had good model fit with moderate correlation between the two factors (r = 0.667, p < 0.05). All the items in the one-factor model demonstrated significant factor loadings (loadings ≥ 0.39, all p < 0.05), while the loadings of some items in the two-factor model (nonsignificant or cross-loadings, loadings < 0.40) did not meet the criteria of factor selection. The one-factor structure was preferred based on the criteria of Scree Plot, eigenvalue, and factor interpretability in relation to clinical relevance. Conclusions. The study provided preliminary evidence that the Dental Activities Test has a unidimensional construct among older adults with cognitive impairment. It suggested that this instrument can be used as a unitary scale to assess dental-related function in persons with dementia. Future testing, including using a confirmatory factor analysis, in a new sample is needed to further assess the usefulness and psychometric properties of this instrument.
- Article
- Mar 2019

Determining the structure and assessing the psychometric properties of multidimensional scales before their application is a prerequisite of scaling theory. This involves splitting a sample of adequate size randomly into two halves and first performing exploratory factor analysis (EFA) on one half-sample in order to assess the construct validity of the scale. Secondly, this structure is validated by carrying out confirmatory factor analysis (CFA) on the second half. As in any statistical analysis – whether univariate, bivariate, or multivariate – the first and most important consideration is to ascertain the level of measurement of the input variables, in this instance the defining items of the scale. This guides the correct choice of the methods to be used. In this paper, we carry out the investigation and assessment of the 2006 European Social Survey six-dimensional instrument of wellbeing for Germany and the Netherlands when items are considered as both ordinal and pseudo-interval. - Article
- Dec 2018
- CHILD HEALTH CARE

The current study seeks to examine the factor structure of the Procedural Coping Questionnaire (PCQ) in a sample of healthy children. In addition, the relation between the PCQ, a measure of medical procedure-specific coping strategies, was examined in relation to the Children’s Behavioral Styles Scale (CBSS), a measure of dispositional coping style. A sample of 192 healthy children completed an 18-item version of the PCQ and the CBSS. Exploratory factor analyses were conducted to examine factor structure of the PCQ. Correlations between each factor of the PCQ and subscales of the CBSS were examined. The EFA yielded a two-factor solution for the final 15 items of the PCQ retained, fitting theoretical conceptualizations of engagement/approach and disengagement/avoidant coping styles. Moderate to strong correlations were found between the Approach/Avoidant factors of the PCQ and the Monitoring/Blunting subscales of the CBSS. The results support the PCQ as a measure of situation-specific (i.e., medical procedure) approach-avoidant coping styles in healthy children. - ArticleFull-text available
- Jan 2019

Purpose: The purpose of this study was to construct and test a predictive model for physical activity adherence for secondary prevention among patients with coronary artery disease. Methods: Two hundred and eighty-two patients with coronary artery disease were recruited at cardiology outpatient clinics in four general hospitals and the data collection was conducted from September 1 to October 19, 2015. Results: The model fit indices for the final hypothetical model satisfied the recommended levels: χ²/df = 0.77, adjusted goodness of fit index = .98, comparative fit index = 1.00, normal fit index = 1.00, incremental fit index = 1.00, standardized root mean residual = .01, root mean square error of approximation = .03. Autonomy support (β = .50), competence (β = .27), and autonomous motivation (β = .31) had significant direct effects on physical activity adherence for secondary prevention among patients with coronary artery disease. This variable explained 35.1% of the variance in physical activity adherence. Conclusion: This study showed that autonomy support from healthcare providers plays a key role in promoting physical activity adherence for secondary prevention among patients with coronary artery disease. The findings suggest that developing intervention programs to increase feelings of competence and autonomous motivation through autonomy support from healthcare providers are needed to promote physical activity adherence for secondary prevention among patients with coronary artery disease.
- Article
- Jan 2019
- STRESS HEALTH

Although the Parenting Daily Hassles Intensity Scale is a common measure, it has been relatively unclear whether users should employ the 15‐item form that quantifies routine parenting hassles on two dimensions of intensity or the 20‐item form that assumes a single dimension underlies the responses on the scale. To help address this gap, Bayesian confirmatory factor analysis was used to investigate the structural validity of the 15‐ and 20‐item forms in a sample of 174 mothers with at least one young child ( = 6.040, sd = 0.492). Results of the Bayesian analysis did not provide empirical support for either form. A subsequent exploratory factor analysis indicated that six of the hassles that appear to address challenging child behavior tended to cluster onto one latent factor while eleven hassles that appear to speak to routine parenting chores tended to cluster onto a second factor. A follow‐up Bayesian analysis indicated that intensity scores can be approximated well under the 17‐item form (ppp = .124). Accordingly, researchers and clinicians are encouraged to consider the 17‐item form when addressing their measurement needs. - Article
- Feb 2019
- Alcohol Clin Exp Res

Background The alcohol consumption patterns of young adults are of concern. Critically, tertiary students consume greater quantities of alcohol, are at increased risk of injury/harm, and have higher rates of alcohol use disorders (AUD) as compared to their non‐university enrolled peers. The Brief Young Adult Alcohol Consequences Questionnaire (BYAACQ) is one of several tools utilised to explore adverse alcohol‐related outcomes among tertiary students. Alcohol intake behaviour, assessed via retrospective summary measures, has been linked to BYAACQ score. It is unclear, however, how drinking assessed in real‐time, in conjunction with variables such as age of drinking onset, might predict severity of adverse alcohol consequences as captured by the BYAACQ. Methods The psychometric properties of the BYAACQ were explored using a large Australian sample of tertiary students (N = 893). A subsample (n = 504) provided alcohol intake information in real‐time (21 days; event‐ and notification‐contingent) via a smartphone app (CNLab‐A) plus details related to age of drinking onset, drug use, parental alcohol/drug use, and anxiety/depression symptomology. Results Average BYAACQ score was 7.23 (SD = 5.47). Classical and item response theory analyses revealed inconsistencies related to dimensionality, progressive item severity, and male/female differential item functioning. Current drinking – namely, frequency of intake and quantity per drinking occasion – plus age of drinking onset predicted BYAACQ score after controlling for age, other drug use, and depression symptomology. Conclusions The BYAACQ is a sound tool for use with Australian samples. Information related to current drinking, age of drinking onset, and drug use is useful for predicting severity of alcohol use consequences. These markers might enable tertiary institutions to better target students who could benefit from prevention/intervention programs. - Article
- Apr 2019
- AIDS BEHAV

Adolescent HIV self-management is a complex phenomenon that has been poorly researched. A mixed-method explorative sequential research design was used to develop an instrument to measure adolescent HIV self-management in the context of the Western Cape, South Africa. The development and validation were undertaken in four phases: (i) individual interviews and focus groups with adolescents aged 13 to 18, their caregivers and healthcare workers (n = 56); (ii) item identification; (iii) item refinement through cognitive interviewing (n = 11), expert review (n = 11) and pilot testing (n = 33); and (iv) psychometric evaluation (n = 385). The final scale consists of five components with 35 items encompassing the construct of adolescent HIV self-management. The developed scale had acceptable reliability (0.84) and stability (0.76). Factor analysis indicated a good model fit that supports the structural validity (RMSEA = 0.052, p = 0.24; RMR = 0.065; CFI = 0.9). Higher self-management was associated with better HIV-related and general health outcomes, which supports the criterion and convergent validity of the instrument.

- This simulation study compared maximum likelihood (ML) estimation with weighted least squares means and variance adjusted (WLSMV) estimation. The study was based on confirmatory factor analyses with 1, 2, 4, and 8 factors, based on 250, 500, 750, and 1,000 cases, and on 5, 10, 20, and 40 variables with 2, 3, 4, 5, and 6 categories. There was no model misspecification. The most important results were that with 2 and 3 categories the rejection rates of the WLSMV chi-square test corresponded much more closely to the rejection rates expected at an alpha level of .05 than did the rejection rates of the ML chi-square test. The magnitude of the loadings was more precisely estimated by means of WLSMV when the variables had only 2 or 3 categories. The sample size for WLSMV estimation did not need to be larger than the sample size for ML estimation.
- The performance of five methods for determining the number of components to retain (Horn's parallel analysis, Velicer's minimum average partial [MAP], Cattell's scree test, Bartlett's chi-square test, and Kaiser's eigenvalue greater than 1.0 rule) was investigated across seven systematically varied conditions (sample size, number of variables, number of components, component saturation, equal or unequal numbers of variables per component, and the presence or absence of unique and complex variables). We generated five sample correlation matrices at each of two sample sizes from the 48 known population correlation matrices representing six levels of component pattern complexity. The performance of the parallel analysis and MAP methods was generally the best across all situations. The scree test was generally accurate but variable. Bartlett's chi-square test was less accurate and more variable than the scree test. Kaiser's method tended to severely overestimate the number of components. We discuss recommendations concerning the conditions under which each of the methods are accurate, along with the most effective and useful methods combinations.
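As a rough illustration of two of the retention rules compared in the abstract above, the following sketch contrasts Horn's parallel analysis with Kaiser's eigenvalue-greater-than-1 rule on simulated data. Everything in it is an assumption made for illustration: the helper name `parallel_analysis`, the 95th-percentile threshold, the number of random data sets, and the two-factor data-generating model are not taken from the cited study. It works on the correlation matrix with ones on the diagonal, so it is the component-based variant.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Component-based parallel analysis (hypothetical helper for illustration)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Eigenvalues from uncorrelated normal data of the same shape.
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    # Retain leading components until one fails to exceed its threshold.
    n_retain = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_retain += 1
    return n_retain, observed, threshold


# Simulated example (assumed model): two uncorrelated factors,
# four items each, all loadings 0.8, unique variance 0.36.
rng = np.random.default_rng(42)
n_obs = 500
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.8
loadings[4:, 1] = 0.8
factors = rng.standard_normal((n_obs, 2))
unique = 0.6 * rng.standard_normal((n_obs, 8))
data = factors @ loadings.T + unique

n_retain, observed, threshold = parallel_analysis(data)
n_kaiser = int((observed > 1.0).sum())  # Kaiser's rule: eigenvalues above 1.0
print("parallel analysis:", n_retain, "| Kaiser rule:", n_kaiser)
```

With a clean, strongly determined structure like this one, the two rules would be expected to agree; the overestimation by Kaiser's rule reported in the abstract concerns less favorable conditions, which this sketch does not attempt to reproduce.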
- The most popular measures of multidimensional constructs typically fail to meet standards of good measurement: goodness of fit, measurement invariance, lack of differential item functioning, and well-differentiated factors that are not so highly correlated as to detract from their discriminant validity. Part of the problem, the authors argue, is undue reliance on overly restrictive independent cluster models of confirmatory factor analysis (ICM-CFA) in which each item loads on one, and only one, factor. Here the authors demonstrate exploratory structural equation modeling (ESEM), an integration of the best aspects of CFA and traditional exploratory factor analyses (EFA). On the basis of responses to the 11-factor Motivation and Engagement Scale (n = 7,420, Mage = 14.22), we demonstrate that ESEM fits the data much better and results in substantially more differentiated (less correlated) factors than corresponding CFA models. Guided by a 13-model taxonomy of ESEM full-measurement (mean structure) invariance, the authors then demonstrate invariance of factor loadings, item intercepts, item uniquenesses, and factor variances-covariances, across gender and over time. ESEM has broad applicability to other areas of research that cannot be appropriately addressed with either traditional EFA or CFA and should become a standard tool for use in psychometric tests of psychological assessment instruments.
- Structural equation modeling is a well-known technique for studying relationships among multivariate data. In practice, high dimensional nonnormal data with small to medium sample sizes are very common, and large sample theory, on which almost all modeling statistics are based, cannot be invoked for model evaluation with test statistics. The most natural method for nonnormal data, the asymptotically distribution free procedure, is not defined when the sample size is less than the number of nonduplicated elements in the sample covariance. Since normal theory maximum likelihood estimation remains defined for intermediate to small sample size, it may be invoked but with the probable consequence of distorted performance in model evaluation. This article studies the small sample behavior of several test statistics that are based on maximum likelihood estimator, but are designed to perform better with nonnormal data. We aim to identify statistics that work reasonably well for a range of small sample sizes and distribution conditions. Monte Carlo results indicate that Yuan and Bentler's recently proposed F-statistic performs satisfactorily.
- Article
- Sep 1999

Despite the widespread use of exploratory factor analysis in psychological research, researchers often make questionable decisions when conducting these analyses. This article reviews the major design and analytical decisions that must be made when conducting a factor analysis and notes that each of these decisions has important consequences for the obtained results. Recommendations that have been made in the methodological literature are discussed. Analyses of 3 existing empirical data sets are used to illustrate how questionable decisions in conducting factor analyses can yield problematic results. The article presents a survey of 2 prominent journals that suggests that researchers routinely conduct analyses using such questionable methods. The implications of these practices for psychological research are discussed, and the reasons for current practices are reviewed. (PsycINFO Database Record (c) 2012 APA, all rights reserved) - The authors surveyed exploratory factor analysis (EFA) practices in three organizational journals from 1985 to 1999 to investigate purposes for conducting EFA and to update and extend Ford, MacCallum, and Tait’s (1986) review. Ford et al. surveyed the same journals from 1975 to 1984, concluding that researchers often applied EFA poorly (e.g., relying too heavily on principal components analysis [PCA], eigenvalues greater than 1 to choose the number of factors, and orthogonal rotations). Fabrigar, Wegener, MacCallum, and Strahan (1999) reached a similar conclusion based on a much smaller sample of studies. This review of 371 studies shows reason for greater optimism. The tendency to use multiple number-of-factors criteria and oblique rotations has increased somewhat. Most important, the authors find that researchers tend to make better decisions when EFA plays a more consequential role in the research. They stress the importance of careful and thoughtful analysis, including decisions about whether and how EFA should be used.
- The factor analysis literature includes a range of recommendations regarding the minimum sample size necessary to obtain factor solutions that are adequately stable and that correspond closely to population factors. A fundamental misconception about this issue is that the minimum sample size, or the minimum ratio of sample size to the number of variables, is invariant across studies. In fact, necessary sample size is dependent on several aspects of any given study, including the level of communality of the variables and the level of overdetermination of the factors. The authors present a theoretical and mathematical framework that provides a basis for understanding and predicting these effects. The hypothesized effects are verified by a sampling study using artificial data. Results demonstrate the lack of validity of common rules of thumb and provide a basis for establishing guidelines for sample size in factor analysis.
- Article
- Jul 2010

Horn’s parallel analysis (PA) is an empirical method to decide how many components in a principal component analysis (PCA) or factors in a common factor analysis (CFA) drive the variance observed in a data set of n observations on p variables (Horn, 1965). This decision of how many components - Article
- Dec 2006

Structural equation modeling (SEM), by segregating measurement errors from the true scores of attributes, provides a methodology to model the latent variables, such as attitudes, IQ, personality traits, political liberalism or conservatism, and socio-economic status, directly. The methodology of SEM has enjoyed tremendous developments since 1970, and is now widely applied. The idea of multiple indicators for a latent variable is from factor analysis. SEM is often regarded as an extension of factor analysis in the psychometric literature. This methodology also covers several widely used statistical models in various disciplines. This chapter presents several specific models before introducing the general mean and covariance structures. It is noted that unlike the exploratory factor analysis (EFA) model, the zero loadings in a general confirmatory factor analysis (CFA) model are specified a priori based on subject-matter knowledge, and their number and placement guarantee that the model is identified without any rotational constraints. - An examination of the use of exploratory and confirmatory factor analysis by researchers publishing in Personality and Social Psychology Bulletin over the previous 5 years is presented, along with a review of recommended methods based on the recent statistical literature. In the case of exploratory factor analysis, an examination and recommendations concerning factor extraction procedures, sample size, number of measured variables, determining the number of factors to extract, factor rotation, and the creation of factor scores are presented. These issues are illustrated via an exploratory factor analysis of data from the University of California, Los Angeles, Loneliness Scale. In the case of confirmatory factor analysis, an examination and recommendations concerning model estimation, evaluating model fit, sample size, the effects of non-normality of the data, and missing data are presented. 
These issues are illustrated via a confirmatory factor analysis of data from the Revised Causal Dimension Scale.
- Article
- Nov 2006
- COUNS PSYCHOL

The authors conducted a content analysis on new scale development articles appearing in the Journal of Counseling Psychology during 10 years (1995 to 2004). The authors analyze and discuss characteristics of the exploratory and confirmatory factor analysis procedures in these scale development studies with respect to sample characteristics, factorability, extraction methods, rotation methods, item deletion or retention, factor retention, and model fit indexes. The authors uncovered a variety of specific practices that were at variance with the current literature on factor analysis or structural equation modeling. They make recommendations for best practices in scale development research in counseling psychology using exploratory and confirmatory factor analysis. - In mean and covariance structure analysis, the chi-square difference test is often applied to evaluate the number of factors, cross-group constraints, and other nested model comparisons. Let model Ma be the base model within which model Mb is nested. In practice, this test is commonly used to justify Mb even when Ma is misspecified. The authors study the behavior of the chi-square difference test in such a circumstance. Monte Carlo results indicate that a nonsignificant chi-square difference cannot be used to justify the constraints in Mb. They also show that when the base model is misspecified, the z test for the statistical significance of a parameter estimate can also be misleading. For specific models, the analysis further shows that the intercept and slope parameters in growth curve models can be estimated consistently even when the covariance structure is misspecified, but only in linear growth models. Similarly, with misspecified covariance structures, the mean parameters in multiple group models can be estimated consistently under null conditions.
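The mechanics of the chi-square difference test discussed in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the helper names `chi2_sf` and `chi_square_difference` and the fit statistics in the example are hypothetical, and the p-value is computed from the standard closed-form chi-square tail for integer degrees of freedom using only the Python standard library. Note the caveat from the abstract itself: when the base model Ma is misspecified, a nonsignificant result from this test does not justify the constraints in Mb.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the known tails for df = 2 (exponential) or df = 1 (erfc),
    # then apply the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += math.exp(-y) * y ** a / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_difference(chi2_a, df_a, chi2_b, df_b):
    """Compare base model Ma with the more constrained, nested model Mb."""
    delta_chi2 = chi2_b - chi2_a
    delta_df = df_b - df_a
    if delta_df <= 0:
        raise ValueError("Mb must have more degrees of freedom than Ma")
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical fit statistics, chosen only to show the calculation.
delta_chi2, delta_df, p_value = chi_square_difference(
    chi2_a=85.3, df_a=48, chi2_b=97.9, df_b=52
)
print(f"delta chi2 = {delta_chi2:.2f}, delta df = {delta_df}, p = {p_value:.4f}")
```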
- Given the proliferation of factor analysis applications in the literature, the present article examines the use of factor analysis in current published research across four psychological journals. Notwithstanding ease of analysis due to computers, the appropriate use of factor analysis requires a series of thoughtful researcher judgments. These judgments directly affect results and interpretations. The authors examine across studies (a) the decisions made while conducting exploratory factor analyses (N = 60) and (b) the information reported from the analyses. In doing so, they present a review of the current status of factor analytic practice, including comment on common errors in use and reporting. Recommendations are proffered for future practice as regards analytic decisions and reporting in empirical research.
- Article
- Mar 1990
- EDUC PSYCHOL MEAS

Employment of the bootstrap method to approximate the sampling variation of eigenvalues is explicated, and its usefulness is amplified by an illustration in conjunction with two commonly used number-of-factors criteria: eigenvalues larger than one and the scree test. Confidence intervals for eigenvalues are approximated for sample correlation matrices that have ones and squared multiple correlation coefficients on the diagonals. The results demonstrate the usefulness of the bootstrap method in providing information about the sampling variability of eigenvalues, knowledge that affords a basis for more informed decisions regarding the number of factors when employing common criteria. Further, this information can be obtained with little difficulty, and the approach avoids tenuous assumptions of symmetric confidence intervals.
- Expectations for reporting factor analysis results as part of construct validation are explored in the context of emerging views of measurement validity. Desired practices are discussed regarding both exploratory factor analysis (e.g., principal components analysis) and confirmatory factor analysis (e.g., LISREL and EQS factor analyses). A short computer program for conducting parallel analysis is appended.
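The bootstrap approach to eigenvalue sampling variability described in the first abstract above can be sketched along the following lines. This is an assumption-laden illustration rather than the cited procedure: the helper name `bootstrap_eigenvalue_ci`, the number of resamples, and the simulated one-factor data are invented here, and only the correlation matrix with ones on the diagonal is handled (the abstract also covers squared multiple correlations on the diagonal). Percentile intervals are used because they do not assume symmetry.

```python
import numpy as np


def bootstrap_eigenvalue_ci(data, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap intervals for correlation-matrix eigenvalues
    (hypothetical helper for illustration)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        # Resample cases (rows) with replacement.
        rows = rng.integers(0, n, size=n)
        corr = np.corrcoef(data[rows], rowvar=False)
        boot[b] = np.linalg.eigvalsh(corr)[::-1]  # largest first
    tail = 100.0 * (1.0 - level) / 2.0
    lower = np.percentile(boot, tail, axis=0)
    upper = np.percentile(boot, 100.0 - tail, axis=0)
    return lower, upper


# Simulated example (assumed model): one factor, six items, loadings 0.7.
rng = np.random.default_rng(7)
n_obs = 300
factor = rng.standard_normal((n_obs, 1))
unique = np.sqrt(1 - 0.7 ** 2) * rng.standard_normal((n_obs, 6))
data = 0.7 * factor + unique

lower, upper = bootstrap_eigenvalue_ci(data)
for k, (lo, hi) in enumerate(zip(lower, upper), start=1):
    print(f"eigenvalue {k}: [{lo:.2f}, {hi:.2f}]")
```

An interval that lies entirely above or below 1.0 gives some sense of how stable an eigenvalues-greater-than-one decision would be for that component.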
- Article
- Apr 1998
- MULTIVAR BEHAV RES
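
The bootstrap approach to eigenvalue sampling variability summarized in the abstract above can be illustrated with a minimal sketch. It assumes NumPy, a plain percentile bootstrap, and correlation matrices with ones on the diagonal only (the squared-multiple-correlation variant is not shown). The function name `bootstrap_eigenvalue_ci` and the simulated one-factor data are hypothetical and are not taken from the cited study.

```python
import numpy as np


def bootstrap_eigenvalue_ci(data, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap intervals for the eigenvalues of a correlation matrix.

    Hypothetical helper for illustration; rows of `data` are cases, columns are variables.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        corr = np.corrcoef(data[idx], rowvar=False)
        boot[b] = np.sort(np.linalg.eigvalsh(corr))[::-1]  # descending order
    alpha = 1.0 - level
    # Percentile limits need not be symmetric around the sample eigenvalue.
    lower = np.percentile(boot, 100 * alpha / 2, axis=0)
    upper = np.percentile(boot, 100 * (1 - alpha / 2), axis=0)
    return lower, upper


# Simulated data (an assumption): six indicators of a single common factor.
rng = np.random.default_rng(42)
factor = rng.normal(size=(300, 1))
data = 0.7 * factor + rng.normal(scale=0.7, size=(300, 6))

lower, upper = bootstrap_eigenvalue_ci(data, n_boot=500)
for k, (lo, hi) in enumerate(zip(lower, upper), start=1):
    print(f"eigenvalue {k}: [{lo:.2f}, {hi:.2f}]")
```

An interval that straddles 1.0 would signal that the eigenvalue-greater-than-one rule is unstable for that eigenvalue in a sample of this size.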

We evaluated whether "more is ever too much" for the number of indicators (p) per factor (p/f) in confirmatory factor analysis by varying sample size (N = 50-1000) and p/f (2-12 items per factor) in 35,000 Monte Carlo solutions. For all N's, solution behavior steadily improved (more proper solutions, more accurate parameter estimates, greater reliability) with increasing p/f. There was a compensatory relation between N and p/f: large p/f compensated for small N and large N compensated for small p/f, but large-N and large-p/f was best. A bias in the behavior of the χ² was also demonstrated where apparent goodness of fit declined with increasing p/f ratios even though approximating models were "true". Fit was similar for proper and improper solutions, as were parameter estimates from improper solutions not involving offending estimates. We also used the 12-p/f data to construct 2, 3, 4, or 6 parcels of items (e.g., two parcels of 6 items per factor, three parcels of 4 items per factor, etc.), but the 12-indicator (nonparceled) solutions were somewhat better behaved. At least for conditions in our simulation study, traditional "rules" implying fewer indicators should be used for smaller N may be inappropriate and researchers should consider using more indicators per factor than is evident in current practice.
- Article
- Jan 2001
- MULTIVAR BEHAV RES

The use of analytic rotation in exploratory factor analysis will be examined. Particular attention will be given to situations where there is a complex factor pattern and standard methods yield poor solutions. Some little known but interesting rotation criteria will be discussed and methods for weighting variables will be examined. Illustrations will be provided using Thurstone's 26 variable box data and other examples.
- Article
- Jun 1996
- PSYCHOL METHODS

A framework for hypothesis testing and power analysis in the assessment of fit of covariance structure models is presented. We emphasize the value of confidence intervals for fit indices, and we stress the relationship of confidence intervals to a framework for hypothesis testing. The approach allows for testing null hypotheses of not-good fit, reversing the role of the null hypothesis in conventional tests of model fit, so that a significant result provides strong support for good fit. The approach also allows for direct estimation of power, where effect size is defined in terms of a null and alternative value of the root-mean-square error of approximation fit index proposed by J. H. Steiger and J. M. Lind (1980). It is also feasible to determine minimum sample size required to achieve a given level of power for any test of fit in this framework. Computer programs and examples are provided for power analyses and calculation of minimum sample sizes.
- This study is a methodological-substantive synergy, demonstrating the power and flexibility of exploratory structural equation modeling (ESEM) methods that integrate confirmatory and exploratory factor analyses (CFA and EFA), as applied to substantively important questions based on multidimensional students' evaluations of university teaching (SETs). For these data, there is a well established ESEM structure but typical CFA models do not fit the data and substantially inflate correlations among the nine SET factors (median rs = .34 for ESEM, .72 for CFA) in a way that undermines discriminant validity and usefulness as diagnostic feedback. A 13-model taxonomy of ESEM measurement invariance is proposed, showing complete invariance (factor loadings, factor correlations, item uniquenesses, item intercepts, latent means) over multiple groups based on the SETs collected in the first and second halves of a 13-year period. Fully latent ESEM growth models that unconfounded measurement error from communality showed almost no linear or quadratic effects over this 13-year period. Latent multiple indicators multiple causes models showed that relations with background variables (workload/difficulty, class size, prior subject interest, expected grades) were small in size and varied systematically for different ESEM SET factors, supporting their discriminant validity and a construct validity interpretation of the relations. A new approach to higher order ESEM was demonstrated, but was not fully appropriate for these data. Based on ESEM methodology, substantively important questions were addressed that could not be appropriately addressed with a traditional CFA approach.
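
The RMSEA-based power framework described earlier in this list (effect size defined by a null and an alternative RMSEA value) can be sketched as follows. This is an illustration under stated assumptions, not the authors' published program: it assumes SciPy's noncentral chi-square distribution, noncentrality computed as (N - 1) × df × RMSEA², and an upper-tail test in which the alternative RMSEA exceeds the null value. The default values 0.05 and 0.08 and the helper names `rmsea_power` and `min_n_for_power` are hypothetical choices made for the example.

```python
from scipy.stats import ncx2


def rmsea_power(n, df, rmsea0=0.05, rmsea1=0.08, alpha=0.05):
    """Power to reject H0: RMSEA = rmsea0 when the true RMSEA is rmsea1 (> rmsea0)."""
    nc0 = (n - 1) * df * rmsea0 ** 2  # noncentrality under the null
    nc1 = (n - 1) * df * rmsea1 ** 2  # noncentrality under the alternative
    crit = ncx2.ppf(1 - alpha, df, nc0)  # critical value of the test statistic
    return float(ncx2.sf(crit, df, nc1))


def min_n_for_power(df, target=0.80, max_n=100_000, **kwargs):
    """Smallest N (simple linear search) at which power reaches `target`."""
    for n in range(10, max_n + 1):
        if rmsea_power(n, df, **kwargs) >= target:
            return n
    raise ValueError("target power not reached within max_n")


power_small = rmsea_power(100, df=50)
power_large = rmsea_power(500, df=50)
n_required = min_n_for_power(df=50)
print(power_small, power_large, n_required)
```

The linear search is slow but transparent; a bisection search would give the same answer with fewer evaluations.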
- Goodness-of-fit (GOF) indexes provide "rules of thumb"—recommended cutoff values for assessing fit in structural equation modeling. Hu and Bentler (1999) proposed a more rigorous approach to evaluating decision rules based on GOF indexes and, on this basis, proposed new and more stringent cutoff values for many indexes. This article discusses potential problems underlying the hypothesis-testing rationale of their research, which is more appropriate to testing statistical significance than evaluating GOF. Many of their misspecified models resulted in a fit that should have been deemed acceptable according to even their new, more demanding criteria. Hence, rejection of these acceptable-misspecified models should have constituted a Type 1 error (incorrect rejection of an "acceptable" model), leading to the seemingly paradoxical results whereby the probability of correctly rejecting misspecified models decreased substantially with increasing N. In contrast to the application of cutoff values to evaluate each solution in isolation, all the GOF indexes were more effective at identifying differences in misspecification based on nested models. Whereas Hu and Bentler (1999) offered cautions about the use of GOF indexes, current practice seems to have incorporated their new guidelines without sufficient attention to the limitations noted by Hu and Bentler (1999).
- In exploratory factor analysis, when the number of factors exceeds the true number of factors, the likelihood ratio test statistic no longer follows the chi-square distribution due to a problem of rank deficiency and nonidentifiability of model parameters. As a result, decisions regarding the number of factors may be incorrect. Several researchers have pointed out this phenomenon, but it is not well known among applied researchers who use exploratory factor analysis. We demonstrate that overfactoring is one cause for the well-known fact that the likelihood ratio test tends to find too many factors.
- As part of the development of a comprehensive strategy for structural equation model building and assessment, a Monte Carlo study evaluated the effectiveness of different exploratory factor analysis extraction and rotation methods for correctly identifying the known population multiple‐indicator measurement model. The exploratory methods fared well in recovering the model except in small sample sizes with highly correlated factors, and even in those situations most of the indicators were correctly assigned to the factors. Surprisingly, the orthogonal varimax rotation did as well as the more sophisticated oblique rotations in recovering the model, and generally yielded more accurate estimates. These results demonstrate that exploratory factor analysis can contribute to a useful heuristic strategy for model specification prior to cross‐validation with confirmatory factor analysis.
- Exploratory factor analysis (EFA) has long been used in the social sciences to depict the relationships between variables/items and latent traits. Researchers face many choices when using EFA, including the choice of rotation criterion, which can be difficult given that few research articles have discussed and/or demonstrated their differences. The goal of the current study is to help fill this gap by reviewing and demonstrating the utility of several rotation criteria. Furthermore, this article discusses and demonstrates the importance of using factor pattern loading standard errors for hypothesis testing. The choice of a rotation criterion and the use of standard errors in evaluating factor loadings are essential so researchers can make informed decisions concerning the factor structure. This study demonstrates that depending on the rotation criterion selected, and the complexity of the factor pattern matrix, the interfactor correlations and factor pattern loadings can vary substantially. It is also illustrated that the magnitude of the factor loading standard errors can result in different factor structures. Implications and future directions are discussed.
- Article, full-text available
- Nov 2006

The evaluation of assessment dimensionality is a necessary stage in the gathering of evidence to support the validity of interpretations based on a total score, particularly when assessment development and analysis are conducted within an item response theory (IRT) framework. In this study, we employ polytomous item responses to compare two methods that have received increased attention in recent years (Rasch model and Parallel analysis) with a method for evaluating assessment structure that is less well-known in the educational measurement community (TETRAD). The three methods were all found to be reasonably effective. Parallel Analysis successfully identified the correct number of factors and while the Rasch approach did not show the item misfit that would indicate deviation from clear unidimensionality, the pattern of residuals did seem to indicate the presence of correlated, yet distinct, factors. TETRAD successfully confirmed one dimension in the single-construct data set and was able to confirm two dimensions in the combined data set, yet excluded one item from each cluster, for no obvious reasons. The outcomes of all three approaches substantiate the conviction that the assessment of dimensionality requires a good deal of judgment.
- Article
- Jul 2009
- STRUCT EQU MODELING

Exploratory factor analysis (EFA) is a frequently used multivariate analysis technique in statistics. Jennrich and Sampson (1966) solved a significant EFA factor loading matrix rotation problem by deriving the direct Quartimin rotation. Jennrich was also the first to develop standard errors for rotated solutions, although these have still not made their way into most statistical software programs. This is perhaps because Jennrich's achievements were partly overshadowed by the subsequent development of confirmatory factor analysis (CFA) by Jöreskog (1969). The strict requirement of zero cross-loadings in CFA, however, often does not fit the data well and has led to a tendency to rely on extensive model modification to find a well-fitting model. In such cases, searching for a well-fitting measurement model may be better carried out by EFA (Browne, 2001). Furthermore, misspecification of zero loadings usually leads to distorted factors with over-estimated factor correlations and subsequent distorted structural relations. This article describes an EFA-SEM (ESEM) approach, where in addition to or instead of a CFA measurement model, an EFA measurement model with rotations can be used in a structural equation model. The ESEM approach has recently been implemented in the Mplus program. ESEM gives access to all the usual SEM parameters and the loading rotation gives a transformation of structural coefficients as well. Standard errors and overall tests of model fit are obtained. Geomin and Target rotations are discussed. Examples of ESEM models include multiple-group EFA with measurement and structural invariance testing, test–retest (longitudinal) EFA, EFA with covariates and direct effects, and EFA with correlated residuals. Testing strategies with sequences of EFA and CFA models are discussed. Simulated and real data are used to illustrate the points.
- Article
- Jun 1995
- EDUC PSYCHOL MEAS

One of the most important decisions that can be made in the use of factor analysis is the number of factors to retain. Numerous studies have consistently shown that Horn's parallel analysis is the most nearly accurate methodology for determining the number of factors to retain in an exploratory factor analysis. Although Horn's procedure is relatively accurate, it still tends to err in the direction of indicating the retention of one or two more factors than is actually warranted or of retaining poorly defined factors. A modification of Horn's parallel analysis based on Monte Carlo simulation of the null distributions of the eigenvalues generated from a population correlation identity matrix is introduced. This modification allows identification of any desired upper 1 − α percentile, such as the 95th percentile of this set of distributions. The 1 − α percentile then can be used to determine whether an eigenvalue is larger than what could be expected by chance. Horn based his original procedure on the average eigenvalues derived from this set of distributions. The modified procedure reduces the tendency of the parallel analysis methodology to overextract. An example is provided that demonstrates this capability. A demonstration is also given that indicates that the parallel analysis procedure and its modification are insensitive to the distributional characteristics of the data used to generate the eigenvalue distributions.
- Article
- Nov 1992
- SOCIOL METHOD RES
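
The percentile-based modification of parallel analysis described in the abstract above (comparing observed eigenvalues with an upper percentile, rather than the mean, of eigenvalues from uncorrelated random data) can be sketched as below. This is a minimal illustration assuming NumPy, normally distributed random data, and eigenvalues of the full correlation matrix. The function name `parallel_analysis`, its parameters, and the simulated two-factor data are hypothetical and do not reproduce the cited author's program.

```python
import numpy as np


def sorted_eigenvalues(x):
    """Eigenvalues of the correlation matrix of x, largest first."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]


def parallel_analysis(data, n_sims=500, percentile=95, seed=0):
    """Return (number of factors to retain, observed eigenvalues, thresholds).

    percentile=None uses the mean of the random eigenvalues (the original
    averaging rule); a number uses that upper percentile instead.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = sorted_eigenvalues(data)
    random_eigs = np.empty((n_sims, p))
    for s in range(n_sims):
        # Uncorrelated variables, so the population correlation matrix is the identity.
        random_eigs[s] = sorted_eigenvalues(rng.normal(size=(n, p)))
    if percentile is None:
        threshold = random_eigs.mean(axis=0)
    else:
        threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Retain factors up to the first eigenvalue that fails to beat its threshold.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold


# Simulated data (an assumption): two uncorrelated factors, four indicators each.
rng = np.random.default_rng(1)
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.7
loadings[4:, 1] = 0.7
scores = rng.normal(size=(400, 2))
data = scores @ loadings.T + rng.normal(scale=np.sqrt(0.51), size=(400, 8))

n_factors, observed, threshold = parallel_analysis(data)
print(n_factors)
```

Passing `percentile=None` switches to the mean-based rule, which makes it easy to compare the two retention decisions on the same data.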

This article is concerned with measures of fit of a model. Two types of error involved in fitting a model are considered. The first is error of approximation which involves the fit of the model, with optimally chosen but unknown parameter values, to the population covariance matrix. The second is overall error which involves the fit of the model, with parameter values estimated from the sample, to the population covariance matrix. Measures of the two types of error are proposed and point and interval estimates of the measures are suggested. These measures take the number of parameters in the model into account in order to avoid penalizing parsimonious models. Practical difficulties associated with the usual tests of exact fit of a model are discussed and a test of “close fit” of a model is suggested.
- This article compares maximum likelihood (ML) estimation to three variants of two-stage least squares (2SLS) estimation in structural equation models. The authors use models that are both correctly and incorrectly specified. Simulated data are used to assess bias, efficiency, and accuracy of hypothesis tests. Generally, 2SLS with reduced sets of instrumental variables performs similarly to ML when models are correctly specified. Under correct specification, both estimators have little bias except at the smallest sample sizes and are approximately equally efficient. As predicted, when models are incorrectly specified, 2SLS generally performs better, with less bias and more accurate hypothesis tests. Unless a researcher has tremendous confidence in the correctness of his or her model, these results suggest that a 2SLS estimator should be considered.
- Factor analysis models with ordinal indicators are often estimated using a 3-stage procedure where the last stage involves obtaining parameter estimates by least squares from the sample polychoric correlations. A simulation study involving 324 conditions (1,000 replications per condition) was performed to compare the performance of diagonally weighted least squares (DWLS) and unweighted least squares (ULS) in the procedure's third stage. Overall, both methods provided accurate and similar results. However, ULS was found to provide more accurate and less variable parameter estimates, as well as more precise standard errors and better coverage rates. Nevertheless, convergence rates for DWLS are higher. Our recommendation is therefore to use ULS, and, in the case of nonconvergence, to use DWLS, as this method might converge when ULS does not.
- The decision of how many factors to retain is a critical component of exploratory factor analysis. Evidence is presented that parallel analysis is one of the most accurate factor retention methods while also being one of the most underutilized in management and organizational research. Therefore, a step-by-step guide to performing parallel analysis is described, and an example is provided using data from the Minnesota Satisfaction Questionnaire. Recommendations for making factor retention decisions are discussed.
- Article, full-text available
- Oct 2009

Assessing the correctness of a structural equation model is essential to avoid drawing incorrect conclusions from empirical research. In the past, the chi-square test was recommended for assessing the correctness of the model but this test has been criticized because of its sensitivity to sample size. As a reaction, an abundance of fit indexes have been developed. The result of these developments is that structural equation modeling packages are now producing a large list of fit measures. One would think that this progression has led to a clear understanding of evaluating models with respect to model misspecifications. In this article we question the validity of approaches for model evaluation based on overall goodness-of-fit indexes. The argument against such usage is that they do not provide an adequate indication of the “size” of the model's misspecification. That is, they vary dramatically with the values of incidental parameters that are unrelated to the misspecification in the model. This is illustrated using simple but fundamental models. As an alternative method of model evaluation, we suggest using the expected parameter change in combination with the modification index (MI) and the power of the MI test.
- Article
- Jan 2007
- STRUCT EQU MODELING

Some authors have suggested that sample size in covariance structure modeling should be considered in the context of how many parameters are to be estimated (e.g., Kline, 2005). Previous research has examined the effect of varying sample size relative to the number of parameters being estimated (N:q). Although some support has been found for this effect, the effect size appears to be small compared to other influences, such as indicator reliability and sample size (Jackson, 2003). Efforts to extend this work to the case where models are intentionally misspecified are described in this article. In addition to varying the number of observations per estimated parameter, several other known influences on model fit were varied such as sample size, the degree of misspecification, number of variables per factor, and the communality of the measured variables. The results suggest that decreasing the number of parameters to be estimated while holding sample size constant can help detect misspecification errors, and some fit indexes were more sensitive to this manipulation than others. In general, the effects of N:q were small relative to other experimental effects.
- Article
- Jan 2010
- STRUCT EQU MODELING

Two general frameworks have been proposed for evaluating statistical power of tests of model fit in structural equation modeling (SEM). Under the Satorra–Saris (1985) approach, to evaluate the power of the test of fit of Model A, a Model B, within which A is nested, is specified as the alternative hypothesis and considered as the true model. We then determine the power of the test of fit of A when B is true. Under the MacCallum–Browne–Sugawara (1996) approach, power is evaluated with respect to the test of fit of Model A against an alternative hypothesis specifying a true degree of model misfit. We then determine the power of the test of fit of A when a specified degree of misfit is assumed to exist as the alternative hypothesis. In both approaches the phenomenon of isopower is present, which means that different alternative hypotheses (in the Satorra–Saris approach) or combinations of alternative hypotheses and other factors (in the MacCallum–Browne–Sugawara approach) yield the same level of power. We show how these isopower alternatives can be defined and identified in both frameworks, and we discuss implications of isopower for understanding the results of power analysis in applications of SEM.
- Exploratory factor analysis (EFA) is a commonly used statistical technique for examining the relationships between variables (e.g., items) and the factors (e.g., latent traits) they depict. There are several decisions that must be made when using EFA, with one of the more important being choice of the rotation criterion. This selection can be arduous given the numerous rotation criteria available and the lack of research/literature that compares their function and utility. Historically, researchers have chosen rotation criteria based on whether or not factors are correlated and have failed to consider other important aspects of their data. This study reviews several rotation criteria, demonstrates how they may perform with different factor pattern structures, and highlights for researchers subtle but important differences between each rotation criterion. The choice of rotation criterion is critical to ensure researchers make informed decisions as to when different rotation criteria may or may not be appropriate. The results suggest that depending on the rotation criterion selected and the complexity of the factor pattern matrix, the interpretation of the interfactor correlations and factor pattern loadings can vary substantially. Implications and future directions are discussed.
- Article
- Mar 1990
- PSYCHOL BULL

A number of goodness-of-fit indices for the evaluation of multivariate structural models are expressed as functions of the noncentrality parameter in order to elucidate their mathematical properties and, in particular, to explain previous numerical findings. Most of the indices considered are shown to vary systematically with sample size. It is suggested that H. Akaike's (1974) information criterion cannot be used for model selection in real applications and that there are problems attending the definition of parsimonious fit indices. A normed function of the noncentrality parameter is recommended as an unbiased absolute goodness-of-fit index, and the Tucker–Lewis index and a new unbiased counterpart of the Bentler–Bonett index are recommended for those investigators who might wish to evaluate fit relative to a null model.
- Article
- Jan 2005
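
The idea that common fit indices are functions of the estimated noncentrality (chi-square minus degrees of freedom) can be made concrete with a short worked example. The sketch below uses the widely used textbook definitions of RMSEA, CFI, and TLI and the N − 1 convention for RMSEA; some software divides by N instead. The function name `fit_indices` and all numeric inputs are hypothetical and are not taken from any of the studies listed here.

```python
import math


def fit_indices(chi2_model, df_model, chi2_null, df_null, n):
    """RMSEA, CFI, and TLI from model and null (baseline) chi-square statistics."""
    d_model = max(chi2_model - df_model, 0.0)  # estimated noncentrality, target model
    d_null = max(chi2_null - df_null, 0.0)     # estimated noncentrality, null model
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    denom = max(d_model, d_null)
    cfi = 1.0 if denom == 0 else 1.0 - d_model / denom
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    # TLI is undefined when the null model's chi-square equals its df.
    tli = (ratio_null - ratio_model) / (ratio_null - 1.0)
    return {"rmsea": rmsea, "cfi": cfi, "tli": tli}


# Hypothetical values chosen only for illustration.
result = fit_indices(chi2_model=85.0, df_model=40, chi2_null=1200.0, df_null=55, n=300)
print({name: round(value, 3) for name, value in result.items()})
```

Because RMSEA divides the noncentrality by both the degrees of freedom and the sample size, the same chi-square value can imply very different RMSEA values in small and large samples.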

Following the introduction, we divide the discussion of goodness of fit (GOF) into three broad sections. In the first and the most substantial (in terms of length) section, we provide a technical summary of the GOF literature. In this section we take a reasonably uncritical perspective on the role of GOF testing, providing an almost encyclopedic summary of GOF indices and their behavior in relation to a variety of criteria. Then we introduce some complications related to GOF testing that have not been adequately resolved and may require further research. In the final section, we place the role of GOF within the broader context of model evaluation. Taking the role of devil's advocate, we challenge the appropriateness of current GOF practice, arguing that current practice is leading structural equation modeling (SEM) research into counterproductive directions that run the risk of undermining good science and marginalizing the usefulness of SEM as a research tool.
- Article
- Jan 2004

Investigation of the structure underlying variables (or people, or time) has intrigued social scientists since the early origins of psychology. Conducting one's first factor analysis can yield a sense of awe regarding the power of these methods to inform judgment regarding the dimensions underlying constructs. This book presents the important concepts required for implementing two disciplines of factor analysis: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). The book may be unique in its effort to present both analyses within the single rubric of the general linear model. Throughout the book canons of best factor analytic practice are presented and explained. The book has been written to strike a happy medium between accuracy and completeness versus overwhelming technical complexity. An actual data set, randomly drawn from a large-scale international study involving faculty and graduate student perceptions of academic libraries, is presented in Appendix A. Throughout the book different combinations of these variables and participants are used to illustrate EFA and CFA applications.
- Exploratory factor analysis (EFA) is generally regarded as a technique for large sample sizes (N), with N = 50 as a reasonable absolute minimum. This study offers a comprehensive overview of the conditions in which EFA can yield good quality results for N below 50. Simulations were carried out to estimate the minimum required N for different levels of loadings (λ), number of factors (f), and number of variables (p) and to examine the extent to which a small N solution can sustain the presence of small distortions such as interfactor correlations, model error, secondary loadings, unequal loadings, and unequal p/f. Factor recovery was assessed in terms of pattern congruence coefficients, factor score correlations, Heywood cases, and the gap size between eigenvalues. A subsampling study was also conducted on a psychological dataset of individuals who filled in a Big Five Inventory via the Internet. Results showed that when data are well conditioned (i.e., high λ, low f, high p), EFA can yield reliable results for N well below 50, even in the presence of small distortions. Such conditions may be uncommon but should certainly not be ruled out in behavioral research data.
- Article
- Nov 2009
- STRUCT EQU MODELING

A common question asked by researchers is, "What sample size do I need for my study?" Over the years, several rules of thumb have been proposed. In reality there is no rule of thumb that applies to all situations. The sample size needed for a study depends on many factors, including the size of the model, distribution of the variables, amount of missing data, reliability of the variables, and strength of the relations among the variables. The purpose of this article is to demonstrate how substantive researchers can use a Monte Carlo study to decide on sample size and determine power. Two models are used as examples, a confirmatory factor analysis (CFA) model and a growth model. The analyses are carried out using the Mplus program (Muthén & Muthén, 1998).
- Article
- Jan 1999
- STRUCT EQU MODELING

This article examines the adequacy of the “rules of thumb” conventional cutoff criteria and several new alternatives for various fit indexes used to evaluate model fit in practice. Using a 2‐index presentation strategy, which includes using the maximum likelihood (ML)‐based standardized root mean squared residual (SRMR) and supplementing it with either Tucker‐Lewis Index (TLI), Bollen's (1989) Fit Index (BL89), Relative Noncentrality Index (RNI), Comparative Fit Index (CFI), Gamma Hat, McDonald's Centrality Index (Mc), or root mean squared error of approximation (RMSEA), various combinations of cutoff values from selected ranges of cutoff criteria for the ML‐based SRMR and a given supplemental fit index were used to calculate rejection rates for various types of true‐population and misspecified models; that is, models with misspecified factor covariance(s) and models with misspecified factor loading(s). The results suggest that, for the ML method, a cutoff value close to .95 for TLI, BL89, CFI, RNI, and Gamma Hat; a cutoff value close to .90 for Mc; a cutoff value close to .08 for SRMR; and a cutoff value close to .06 for RMSEA are needed before we can conclude that there is a relatively good fit between the hypothesized model and the observed data. Furthermore, the 2‐index presentation strategy is required to reject reasonable proportions of various types of true‐population and misspecified models. Finally, using the proposed cutoff criteria, the ML‐based TLI, Mc, and RMSEA tend to overreject true‐population models at small sample size and thus are less preferable when sample size is small.
- Article
- Dec 1998
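
The 2-index presentation strategy summarized in the abstract above pairs SRMR with one supplemental index. A minimal sketch of that decision rule follows, using the approximate cutoffs quoted in the abstract. The abstract describes these values as "close to" the stated numbers, so treating them as hard thresholds is a simplifying assumption, and other work in this list cautions against applying such cutoffs mechanically. The names `CUTOFFS`, `SRMR_CUTOFF`, and `two_index_flag` are hypothetical.

```python
# Approximate cutoffs quoted in the abstract for ML estimation (treated here as hard limits).
CUTOFFS = {
    "TLI": 0.95,
    "BL89": 0.95,
    "CFI": 0.95,
    "RNI": 0.95,
    "GammaHat": 0.95,
    "Mc": 0.90,
    "RMSEA": 0.06,
}
SRMR_CUTOFF = 0.08


def two_index_flag(srmr, index_name, index_value):
    """Return True when both SRMR and the supplemental index meet their cutoffs."""
    if index_name not in CUTOFFS:
        raise ValueError(f"unknown supplemental index: {index_name}")
    cutoff = CUTOFFS[index_name]
    if index_name == "RMSEA":
        supplement_ok = index_value <= cutoff  # smaller values indicate better fit
    else:
        supplement_ok = index_value >= cutoff  # larger values indicate better fit
    return srmr <= SRMR_CUTOFF and supplement_ok


print(two_index_flag(0.05, "CFI", 0.96))    # both criteria met
print(two_index_flag(0.09, "CFI", 0.97))    # SRMR above its cutoff
print(two_index_flag(0.05, "RMSEA", 0.07))  # RMSEA above its cutoff
```

A flag of this kind is best read as a screening aid alongside residuals and parameter estimates, not as a verdict on the model.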

This study evaluated the sensitivity of maximum likelihood (ML)-, generalized least squares (GLS)-, and asymptotic distribution-free (ADF)-based fit indices to model misspecification, under conditions that varied sample size and distribution. The effect of violating assumptions of asymptotic robustness theory also was examined. Standardized root-mean-square residual (SRMR) was the most sensitive index to models with misspecified factor covariance(s), and Tucker-Lewis Index (1973; TLI), Bollen's fit index (1989; BL89), relative noncentrality index (RNI), comparative fit index (CFI), and the ML- and GLS-based gamma hat, McDonald's centrality index (1989; Mc), and root-mean-square error of approximation (RMSEA) were the most sensitive indices to models with misspecified factor loadings. With ML and GLS methods, we recommend the use of SRMR, supplemented by TLI, BL89, RNI, CFI, gamma hat, Mc, or RMSEA (TLI, Mc, and RMSEA are less preferable at small sample sizes). With the ADF method, we recommend the use of SRMR, supplemented by TLI, BL89, RNI, or CFI. Finally, most of the ML-based fit indices outperformed those obtained from GLS and ADF and are preferable for evaluating model fit.