# Exploratory Factor Analysis - Science method

Questions related to Exploratory Factor Analysis
• asked a question related to Exploratory Factor Analysis
Question
I conducted an Exploratory Factor Analysis using Principal Axis Factoring with Promax rotation, resulting in the identification of 8 factors. Now, I aim to examine potential associations between these extracted factor scores (treated as continuous variables) and other variables in my research. However, I am uncertain about the most appropriate statistical approach to compute these factor scores.
Should I compute the average sum of each factor, employ regression, or utilize the Bartlett method, Anderson-Rubin, any others….?
Any insights or alternative approaches are highly appreciated.
Thank you!
Hello Christian-Joseph,
Here's a link to a very readable introduction to some of the considerations associated with different alternatives to the computation of factor scores:
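To make the options named in the question concrete, here is a minimal sketch of two of them, regression (Thurstone) and Bartlett factor scores, for an oblique solution. The loadings, factor correlations, and simulated data are hypothetical illustrations, not values from this thread; in practice the pattern matrix, factor correlation matrix, and uniquenesses would come from the fitted EFA.

```python
import numpy as np

def regression_weights(R, pattern, phi):
    """Thurstone regression weights: W = R^{-1} S, with structure S = pattern @ phi."""
    structure = pattern @ phi
    return np.linalg.solve(R, structure)

def bartlett_weights(pattern, uniqueness):
    """Bartlett weights: W = Psi^{-1} L (L' Psi^{-1} L)^{-1}."""
    psi_inv_l = pattern / uniqueness[:, None]
    return psi_inv_l @ np.linalg.inv(pattern.T @ psi_inv_l)

# Hypothetical two-factor oblique solution with six items (illustration only).
pattern = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
phi = np.array([[1.0, 0.3], [0.3, 1.0]])           # factor correlations
common = pattern @ phi @ pattern.T
uniqueness = 1.0 - np.diag(common)
R = common + np.diag(uniqueness)                    # model-implied correlation matrix

# Simulated standardized item responses (assumption: multivariate normal).
rng = np.random.default_rng(0)
Z = rng.multivariate_normal(np.zeros(6), R, size=300)
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)

scores_reg = Z @ regression_weights(R, pattern, phi)
scores_bart = Z @ bartlett_weights(pattern, uniqueness)
```

Regression scores maximize the correlation with the underlying factor, while Bartlett scores are constructed to be unbiased for it, which is one reason the two methods can rank cases slightly differently.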
• asked a question related to Exploratory Factor Analysis
Question
Hello,
I have a data set with N = 369 individuals measured at a single time point. The goal of the study is to create an assessment of psychological safety (PS). The assessment is a self-report measure asking participants to indicate how psychologically safe they feel using a unipolar 5-point Likert scale ranging from 1 (not at all) to 5 (extremely).
In addition to the assessment I am creating, I also measured a number of demographic variables (e.g., age, salary) and a few additional measures of team environment for validation (e.g., an existing measure of PS, level of team interdependence).
My primary goal is to run exploratory factor analysis (EFA). This is the first time anyone has conceptualized PS as multidimensional, so one of the main aims is to uncover the potential factor structure of PS and to identify candidate items for deletion.
In order to prepare for the EFA analyses, I am cleaning the data by following recommendations in (the excellent) Tabachnick & Fidell (2013, 6th ed).
I am currently at the point where I am checking the data for multivariate outliers, starting with Mahalanobis distance. And I cannot find explicit guidelines regarding which variables I should be including as "IVs" in the analysis.
QUESTION: Which variables should I be including in my search for multivariate outliers? Do I include all variables, or only my target variables?
Specifically, do I include only the variables that represent the item pool for my forthcoming PS assessment? Or do I include all the PS items AND demographic variables, the existing PS assessment, interdependence measure, etc.??
I ran the Mahalanobis distance analyses 2 times using both approaches, and found substantial differences:
• TIME 1 - With just the PS assessment variables --> I identified n = 28 multivariate outliers.
• TIME 2 - With PS items + demographics, etc. --> I identified n = 10 multivariate outliers (all identified as outliers in the TIME 1 analysis).
Syntax I am using - the bolded variables are the ones I am questioning if I should include or not:
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA COLLIN TOL
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT Subjno
/METHOD=ENTER Age Salary Edu WorkStructure TeamSize Tenure_OnTeam JapaneseBizEnviron EdmondsonPS_TOT Interdep_TOT Valued_TOT PS_1 PS_48 PS_141 PS_163 PS_43 PS_53 PS_73 PS_133 PS_135 PS_19 PS_60_xl26 PS_93 PS_106_xl26 PS_143 PS_58 PS_86 PS_182 PS_56 PS_69 PS_103 PS_164 PS_22 PS_35 PS_91 PS_30 PS_59 PS_63 PS_90 PS_131 PS_140 (**Note, PS assessment var list is truncated b/c large number)
/RESIDUALS = OUTLIERS(MAHAL)
/SAVE MAHAL.
Thank you so much, both of your answers were extremely helpful! I ultimately made the decision to run the analyses 3 times and then compare results - (1) full data set, (2) with n = 10 multivariate outliers removed, and (3) with n = 28 multivariate outliers removed. It was time-consuming, but luckily there were very few differences in results, so I didn't have to make any hard decisions.
I will have to look into the analyses David Morse suggested. In selecting my rating scale labels, I used research performed by Beckstead (2014). This study attempted to map the strength of adjectives that are typically attached to Likert scales, thus allowing researchers to select rating scale labels that are "spaced an equal distance apart," and to proceed with analyses under a crude assumption of interval scaling. But given that this spacing assumption is based on a single study, it's hardly satisfactory.
Thank you again both of you!
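For readers working outside SPSS, the multivariate outlier screen discussed in this thread can be sketched as follows. This is a minimal illustration on made-up two-variable data, not the poster's data set; the variable set passed to the function is exactly the choice the question is about. The cutoff uses the p < .001 chi-square criterion, and the closed form used here holds only for two variables.

```python
import math
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the sample centroid."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)                 # sample covariance (n - 1)
    solved = np.linalg.solve(cov, centered.T).T   # cov^{-1} @ centered rows
    return np.einsum("ij,ij->i", centered, solved)

# Hypothetical data: 50 points on a unit circle plus one extreme case.
angles = np.linspace(0, 2 * np.pi, 50, endpoint=False)
X = np.vstack([np.column_stack([np.cos(angles), np.sin(angles)]),
               [10.0, 10.0]])

d2 = mahalanobis_sq(X)

alpha = 0.001
# Chi-square critical value for df = 2 ONLY (closed form -2 ln(alpha)).
# For k variables, scipy.stats.chi2.ppf(1 - alpha, df=k) gives the general value.
cutoff = -2 * math.log(alpha)
flagged = np.flatnonzero(d2 > cutoff)
```

Because the degrees of freedom equal the number of variables entered, adding demographics and validation measures changes both the distances and the cutoff, which is consistent with the two runs flagging different numbers of cases.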
• asked a question related to Exploratory Factor Analysis
Question
I'm trying to run a polychoric correlation with Stata v13, but I'm confused.
Below is a do-file with detailed information on how to do this using standard methods of performing factor analysis (polychoric, factormat):
• asked a question related to Exploratory Factor Analysis
Question
Hey guys,
I have found two multi-item scales in my previous research for my master's thesis. I want to know if I can compute an EFA for both the dependent and the independent variable.
Yes, you can run exploratory factor analysis for both the regressor variable and the criterion.
• asked a question related to Exploratory Factor Analysis
Question
I'm doing a validation study and for construct validity, I'll analyze my data through EFA and CFA. Should the same data be used in both processes, or should we divide the data? And if we divide the data, what percent of the data should be included in EFA and what percent in CFA?
As mentioned earlier, it is not recommended to use the same data for EFA and CFA. As for how to divide the dataset between EFA and CFA, there are no specific criteria (at least, I have not come across any in the last 13 years). A 50-50 split is one approach; another is to consider statistical power, for example making the EFA subsample large enough to produce reliable factors.
The following references will be helpful:
Beavers, A. S., Lounsbury, J. W., Richards, J. K., Huck, S. W., Skolits, G. J., & Esquivel, S. L. (2013). Practical considerations for using exploratory factor analysis in educational research. Practical Assessment, Research, and Evaluation, 18(1), 6.
Kyriazos, T. A. (2018). Applied psychometrics: sample size and sample power considerations in factor analysis (EFA, CFA) and SEM in general. Psychology, 9(08), 2207.
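The random split described in this answer can be sketched as below. The sample size of 400, the 50-50 fraction, and the seed are assumptions for illustration only; as the answer notes, there is no fixed rule for the proportion.

```python
import random

def split_for_efa_cfa(n_cases, efa_fraction=0.5, seed=42):
    """Randomly partition case indices into an EFA subsample and a CFA holdout."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)      # seeded so the split is reproducible
    cut = round(n_cases * efa_fraction)
    return sorted(indices[:cut]), sorted(indices[cut:])

# Hypothetical sample of 400 respondents, split evenly.
efa_idx, cfa_idx = split_for_efa_cfa(400)
```

Fixing the seed and reporting it lets others reproduce exactly which cases went into each analysis.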
• asked a question related to Exploratory Factor Analysis
Question
Can you please help me with how to use exploratory factor analysis as the method for my research? My research is about the dimensions of the commission of illegal activities. Please help, I'm a new researcher.
It will probably help people to answer if you say which books and other sources that you have been reading to learn about this, which software you are most comfortable with, why you want to use EFA as opposed to other methods like CFA or PCA, provide more information about your research, and provide more specific questions.
• asked a question related to Exploratory Factor Analysis
Question
What is meant by collecting data separately for newly developed questions to run EFA in PLS-SEM? Do I need to make two separate questionnaire sets, one consisting of questions adopted from prior literature and the other of the self-developed questions?
Is it possible to run EFA on the data collected in a pilot study and run only the CFA for the empirical work?
Newly developed questionnaires mean that the construct validity of the scale is not yet checked, so to ensure construct and discriminant validities researchers normally go for EFA followed by CFA.
The following sources may help:
An Easy Approach to Exploratory Factor Analysis: Marketing Perspective, Noor Ul Hadi (researchgate.net)
Specifying the Problem of Measurement Models Misspecification in Management Sciences Literature (researchgate.net)
• asked a question related to Exploratory Factor Analysis
Question
Dear RG community,
I have been trying to help my PhD student with an EFA. We have ended up with a very nice stabilised solution (using principal axis factoring, varimax rotation and extracting a fixed number of factors) apart from the determinant of the correlation matrix being too low. We had initially checked for correlations > 0.8 in absolute value but did not find any so we went back and removed one item from each pair where the bivariate correlation was > 0.7 in absolute value. This improved the determinant but it was still considerably lower than the recommended threshold of 0.00001. What do you suggest we should do: continue to reduce the bivariate correlation threshold and remove more items, just live with the low determinant value as what we are really interested in is the scales themselves derived from the EFA, or is there another way to detect and remove multicollinearity?
Hello Maksim,
Factoring, and replacing individual variates as IVs with factor scores to which they contribute, _is_ one way to deal with collinearity (most often in linear models, such as regression), so long as you do not rotate the factor structure obliquely.
However, the original query for this thread was asking about the presence of multicollinearity in EFA, so that was the question to which I was responding.
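As an alternative to scanning pairwise correlations, one common diagnostic is the squared multiple correlation (SMC) of each item with all other items, read off the inverse of the correlation matrix. The sketch below is an illustration under assumed values: the 3 x 3 matrix is made up, and any cutoff for what counts as "too high" is a judgment call rather than something stated in this thread.

```python
import numpy as np

def collinearity_diagnostics(R):
    """Return the determinant of R and each variable's SMC with the others."""
    R = np.asarray(R, dtype=float)
    det = np.linalg.det(R)
    # SMC_i = 1 - 1 / (R^{-1})_ii; values near 1 mark near-redundant items.
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    return det, smc

# Hypothetical correlation matrix: items 1 and 2 overlap strongly.
R_demo = np.array([[1.00, 0.85, 0.30],
                   [0.85, 1.00, 0.25],
                   [0.30, 0.25, 1.00]])
det_demo, smc_demo = collinearity_diagnostics(R_demo)
```

Unlike a pairwise threshold, the SMC also picks up an item that is well predicted by a combination of several others even when no single correlation is large.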
• asked a question related to Exploratory Factor Analysis
Question
Hello, all!
I have translated and heavily edited a validated survey for use in a new context. This involved changing language, context, and cutting items from 56 to 30.
I'm now trying to find the underlying factor structure of this data, which I assume will differ from the 8 factors in the original 56-item scale.
For the purposes of my research, I tried to have respondents compare their expectations with their preferences for a given item. So, an item would present a statement such as (hypothetically) "During class, students make connections between content from different subjects." This would be followed by two Likert-type questions: "I would like this to be true" and "I expect this will be true". Respondents answered each of these sub-questions for every item, on a 6-point scale (no neutral).
I'm a bit stuck, conceptually, on how to approach factor analysis with these paired questions. Conceptually, I'd assume that it shouldn't matter whether I included both "Q_a" and "Q_b", since those should load onto the same factor. But then, I think, the nature of the questions might confound such an analysis.
Does anyone have any wisdom or literature on how this type of paired response factor analysis has been done?
Thank you!
You could use discrepancy scores between judged importance and expectation as a variable, and characterize it as akin to a goodness of perceived fit indicator for an individual.
Whether these resultant values are better handled as a scale vs. grouped, common factors depends on a number of concerns (again, back to specific research question/s, as well as data collection scheme, empirical data characteristics). As you've indicated that these data would be collected at a common time (once; at beginning of term/year), that helps to avoid the volatility influence.
Ordinal IRT models don't require collapsing variable scores into dichotomous form.
LCA can use ordinal scores as well as dichotomous scores for categorical variables. However, if you are successful in scaling or factoring the fit scores, you can argue that the resulting scale scores or factor scores are interval strength, and use one of many clustering methods instead of LCA. I mention this because it sounds as if you are very interested in subtyping your cases into qualitatively distinct groups. Alternatively, cases may be the objects of interest in factoring instead of variables. In Cattell's explanation, this is referred to as "Q-type" factoring.
• asked a question related to Exploratory Factor Analysis
Question
Dear researchers, can both EFA and CFA be applied to the database obtained at the end of the scale development process at the same time?
Hello Ömer,
I'm uncertain as to what you mean by "...at the same time."
If you are uncertain as to the underlying structure for a set of variables, and have no prior evidence, claim, or theory as to what that structure should be, then EFA is certainly a viable option for exploring and making initial judgments as to what factor space might be appropriate.
Once you determine a defensible candidate structure, it is certainly a good idea to verify that the structure can adequately account for observed relationships among the variables by using CFA with data not used to derive the EFA.
The reason for not applying to the very same data as used for the EFA is the concern that the EFA process can be opportunistic, resulting in a solution which overfits the data set from which it was derived, and therefore may not generalize.
You can find a number of published studies in which author/s have taken part of data set to derive a candidate structure via EFA, then evaluated that structure, via CFA, with the remainder of the data set.
• asked a question related to Exploratory Factor Analysis
Question
I am currently engaged in a study that applies regression and ANOVA models to several latent variables, including entrepreneurial passion, risk attitude, and entrepreneurial self-efficacy.
In the context of this study, I am seeking a rigorous and straightforward method for determining the factor loadings and latent variable scores for each participant. I am particularly interested in going beyond the traditional methods of simply calculating average or sum scores for these latent variables.
I believe estimating these factors would be more precise using an approach similar to that employed in Covariance-Based Structural Equation Modeling (CB-SEM) models.
Could you provide guidance on how to implement this approach effectively? Would you recommend specific statistical techniques or software tools for this purpose?
How can I ensure the validity and reliability of the obtained factor loadings and latent variable scores? Any advice or resources you could share would be greatly appreciated.
Thanks Giacomo!
• asked a question related to Exploratory Factor Analysis
Question
1- I determined exploratory factor analysis (eight dimensions and 37 questions were finalized).
2- Then I checked the measurement model. In the measurement model, only 13 questions had factor loadings greater than 0.5.
3- In this case, the previous dimensions taken from the exploratory factor analysis cannot be used, because one has been removed and the three dimensions have only 2 questions.
4- In this new mode, it is not possible to define a model with hidden variables, because we only have the indicators. Because the number of indicators for each dimension in this new mode is not enough, so the dimensions cannot be considered. Because according to Hair (2010), at least three questions are needed for each variable.
Question:
What do you recommend?
What are you measuring? Considering the number of dimensions and factors, I suspect that you are measuring a well-defined concept. Factor analysis is a tool that should be based on a strong theory. Also, providing information on the parameters used in EFA is helpful. How did you come up with the measurement model? The measurement model assumes a well-established factor structure. It seems that you didn't establish that in EFA. If EFA is not providing a stable solution, CFA will not provide any better solution.
• asked a question related to Exploratory Factor Analysis
Question
Studies from any field are welcome, as many as possible :)
One way to identify articles that use exploratory factor analysis (EFA) factor by factor is to observe whether they report the results of Bartlett's test of sphericity for each factor. Bartlett's test reports whether there is a possible factor solution for the entire set of variables being analysed; if researchers report a p-value and/or statistic for this test for each factor, it means that the EFA was necessarily performed on a factor-by-factor basis.
As complementary information, EFA is not useful as a preliminary step to scales that have already been validated in the literature, and clearly performing it on a factor-by-factor basis goes against the main aims of EFA.
However, I believe that the use of EFA on a factor-by-factor basis is a much more widespread practice than we might think.
Moyano-Fuentes, J., Maqueira-Marín, J.M., Martínez-Jurado, P.J. and Sacristán-Díaz, M. (2021), "Extending lean management along the supply chain: impact on efficiency", Journal of Manufacturing Technology Management, Vol. 32 No. 1, pp. 63-84. https://doi.org/10.1108/JMTM-10-2019-0388
I have even found some CFA factor-by-factor examples:
Tortorella, G. L., Giglio, R., & van Dun, D. H. (2019). Industry 4.0 adoption as a moderator of the impact of lean production practices on operational performance improvement. International Journal of Operations & Production Management, 39(6/7/8), 860–886. doi:10.1108/ijopm-01-2019-0005
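For reference, the statistic behind Bartlett's test of sphericity mentioned in this answer can be computed from a correlation matrix as sketched below. The example matrix and sample size are made up; the p-value would come from a chi-square distribution with the returned degrees of freedom, which is left out here to keep the sketch dependency-light.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and df for H0: the correlation matrix is an identity."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)              # ln|R|, numerically stable
    stat = -((n - 1) - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return stat, df

# Hypothetical example: two items correlated at .80, n = 100 respondents.
stat_demo, df_demo = bartlett_sphericity([[1.0, 0.8], [0.8, 1.0]], n=100)
```

Because the statistic depends on the whole correlation matrix entered, a separate value reported for each factor implies that a separate matrix, and hence a separate analysis, was run per factor.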
• asked a question related to Exploratory Factor Analysis
Question
The result of the exploratory factor analysis for the measurement tool I used is not compatible with the original scale. For example, in the original study, item 3 is in factor 1, but in the study I conducted, it is in factor 3. In general, all items in the scale show factor loadings in different sub-dimensions than the original ones. How should I proceed?
When your EFA results don't match the original scale, first re-examine your data and methodology. Consider differences in sample characteristics, cultural or contextual factors, and any potential methodological issues. You may choose to report the new factor structure if it is theoretically justifiable and supported by your data. Additionally, assess the reliability and validity of the new structure before proceeding with further analysis.
• asked a question related to Exploratory Factor Analysis
Question
Good morning,
I am not an expert on Factor Analysis, so I hope the explanation of my problem makes sense.
My current task is to perform analyses on an older data set from experiments my lab conducted a couple of years ago. We have 212 individual items, consisting of a dozen or so demographic questions and items from a total of 24 different scales that measure separate constructs. Given the large number of items and constructs, I would like to reduce the number of dimensions to achieve a more clear starting point for theory development. Obviously, an exploratory factor analysis is a good choice for this.
My question is whether I have to input the 200 or so individual non-demographic items in the EFA, or whether I can instead just use the 24 composite variables/constructs and reduce the number of dimensions from there. My hesitation using all the individual items is that an EFA would simply return something very similar to the composite variables, as the constructs generally have a quite high internal consistency and are fairly distinct from each other based on theory. The obvious caveat with using the composite variables is that it is not something I have seen done much, and as I am not an expert on EFA, I am unsure as to whether there is a major methodological road block to using composite variables that I am unaware of.
Best wishes,
Pascal
Hey Pascal
200 items is a lot for an EFA. But why EFA? You have some ideas about the structure of the data, as you have "24 different scales that measure separate constructs". You have a lot of scales with relatively few items. Why not address this question in separate stages? First, you could test the dimensionality of each of the 24 scales separately using CFA to check that each scale is measuring only one latent variable. Second, when you have sorted out the structures at the level of the scales, you can look at how they are correlated, or indeed factor analyze composite scores for all the scales together.
Mark
• asked a question related to Exploratory Factor Analysis
Question
Hi
I'm going to do a cluster analysis, after conducting an exploratory factor analysis for a series of Likert scale questions.
The cluster analysis will of course include other variables such as demographics, together with the factor analysis products.
My question is, do I use Factor Score (regression based or Bartlett based), for those 'Factors' from EFA or do I use mean score (average of the Likert scale questions that manifest individual factor)?
I know factor scores have many advantages over mean scores, but factor scores have a standard deviation of 1 and a mean of 0, which makes the subsequent cluster analysis hard to interpret. Mean scores seem to make more sense to interpret.
So, which score should I use for the subsequent cluster analysis: mean score, factor score, or something else?
Thank you.
Hey Junxiong, I was thinking the same as Carlo - why not do the factor analysis and cluster analysis in one model. There are factor-mixture models that allow the combination of measurement and mixture models. A really accessible and well-written general overview of these methods is available here
doi: 10.1080/10705511.2013.824786
There are also some great video resources, and as usual QuantFish is excellent
Mark
• asked a question related to Exploratory Factor Analysis
Question
Hi all,
Kindly help me to know when can we use Exploratory Factor Analysis(EFA) and Confirmatory Factor analysis(CFA)?
Regards,
In addition to the resources provided by Holger Steinmetz, I would like to add that EFA is almost always unnecessary unless you are really exploring the number of factors. In most cases, we are not doing that. We usually have at least some ideas about which variables should measure which factor(s), and how many factors there are. Therefore, CFA is typically the more appropriate and more powerful (more flexible) method because it allows us to test our a priori hypotheses about factor structures and to compare competing factor models statistically.
• asked a question related to Exploratory Factor Analysis
Question
In the context of regression, methods for detecting collinearity are well described in the literature. In the context of exploratory factor analysis (EFA), however, I am facing a situation where I get highly unstable factor loadings and factor correlations from different bootstrapped samples (despite a sample size of more than 700), which I assume is due to multicollinearity. I am not sure how to explore its source. The common methods rely on the intercorrelation among explanatory variables (i.e., the latent factors in EFA), but the intercorrelation itself is highly unstable.
My search for references on this topic was not very successful. Any explanation and/or reference on multicollinearity in EFA would be appreciated.
Ali
Hi,
there is (to my knowledge) no such thing as collinearity in factor models. To the contrary: highly correlated indicators should lead to a much more stable situation, as those indicators would clearly load onto the same factor.
Best
Holger
• asked a question related to Exploratory Factor Analysis
Question
Hello,
I applied exploratory factor analysis with network analysis to data from healthy and diseased patients. The analysis shows different clusters of parameters; some are similar in both groups, and some groups cluster differently. For instance, the parameters IleValLeu are similarly clustered in both groups whereas the parameters Pro Ala are not (see figure).
How shall I interpret the data? For instance, in the Pro/Ala case, I expected to see some differences between the groups, but they look pretty much the same to me.
Is the difference about correlation? The scatterplot of the data shows slightly different regression models, but nothing compelling.
Is it the value itself? But again there is no real difference in the value distribution between the two groups.
So, what is the actual outcome of the network analysis?
Thank you
Dear Luigi,
if I understood your question correctly, I would say that to interpret results of the network analysis you need to understand the components of the network. What are nodes? What are links? What does it mean if a link connects two nodes? What does it mean if nodes are not connected?
For example, if I read your network correctly, you can easily see that Ala and Pro are more strongly correlated in your disease network than in the control. Is it significant? Is it important? The answer is beyond network analysis and lies in your experimental design and the other information you have. You can apply a statistical test to check whether the difference in the strength of correlation is statistically significant, but you still need to add a biological level of interpretation.
What is good about network analysis is that you can use graph-theory-based properties to get more layers of differences between your conditions if, again, it makes sense for your data.
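One standard way to test whether a correlation differs between two independent groups, as suggested in the reply, is Fisher's r-to-z comparison. The sketch below assumes the healthy and diseased groups contain different individuals; the correlations and group sizes are hypothetical, not values from the poster's figure.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided

# Hypothetical values: r(Ala, Pro) = .60 in one group, .20 in the other, 50 per group.
z_demo, p_demo = compare_independent_correlations(0.60, 50, 0.20, 50)
```

A significant result only says that the two correlations differ; whether that difference matters biologically is the separate question the reply raises.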
• asked a question related to Exploratory Factor Analysis
Question
Hypothetically, if I would like to validate a scale and I need to explore its latent factors first with EFA followed by a CFA to validate its structure, do I need a new data set for the CFA? Some argue that randomly splitting one dataset into two is also acceptable. My question is: can I apply EFA to the full dataset and then randomly select half of it to conduct a CFA?
• asked a question related to Exploratory Factor Analysis
Question
Dear Research Scholars, I hope you are doing well!
I am a Ph.D. scholar in Education, and I am now working on my thesis. Kindly guide me on when to perform the EFA: should it be run on the pilot study data or on the actual research data?
Regards
Ph.D. In Education , faculty of Education,
University of Sindh, Jamshoro
When you need to hypothesise a connection between variables, exploratory factor analysis is the method to apply.
• asked a question related to Exploratory Factor Analysis
Question
I am conducting an exploratory factor analysis, and to determine the number of factors I used parallel analysis.
How can I determine the number of factors correctly in Stata, or in another tool?
When I run parallel analysis in Stata based on principal axis factoring, for example, all of the eigenvalues from the parallel analysis are lower than 1, which suggests retaining all factors in this case.
When, out of curiosity, I use principal component factors, or even principal component analysis (I know this is not EFA), it suggests retaining 3 factors (which satisfies me).
Hi there,
the most basic question is whether a common factor model (which underlies EFA) is reasonable. Are your indicators of a kind that makes it reasonable to assume that they were caused by one or several underlying common causes? What does the correlation matrix look like?
If a PCA is better, then this suggests that the former assumption is unreasonable. This is fine, but keep in mind that a PC is a simple composite of the indicators without any surplus meaning.
HTH
Holger
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272-299. https://doi.org/10.1037/1082-989X.4.3.272
Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and Social Psychology Bulletin, 28(12), 1629-1646. https://doi.org/10.1177/014616702237645
Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift's electric factor analysis machine. Understanding Statistics, 2, 13-43.
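Since the question asks about other tools, here is a minimal sketch of Horn's parallel analysis based on the eigenvalues of the unreduced correlation matrix (the principal-component variant the poster also tried). The simulated two-factor data set, the 95th-percentile reference, and the number of simulations are assumptions for illustration; a variant based on principal axis factoring would compare eigenvalues of the reduced matrix instead, and those are not expected to exceed 1.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that exceed those of random normal data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    reference = np.percentile(simulated, percentile, axis=0)
    exceeds = observed > reference
    # Retain factors up to the first eigenvalue that fails to beat the reference.
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, reference, n_retain

# Hypothetical data: 500 cases, six items, two uncorrelated factors (loadings .8).
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                     [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
demo = factors @ loadings.T + 0.6 * rng.standard_normal((500, 6))

observed, reference, n_retain = parallel_analysis(demo)
```

The key point for the question above is that the comparison is against the simulated reference eigenvalues, not against the fixed value of 1.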
• asked a question related to Exploratory Factor Analysis
Question
I am running a PCA in JASP and SPSS with the same settings. However, the PCA in SPSS shows some factors with negative values, while in JASP all of them are positive.
In addition, when running an EFA in JASP, it allows me to produce results with Maximum Loading, while SPSS does not. JASP goes so far with the EFA that I can choose to extract 3 factors and somehow get the results that one would have expected from previous research. However, SPSS does not run under the Maximum Loading setting, regardless of whether I set it to 3 factors or to Eigenvalue.
Has anyone come across the same problem?
UPDATE: Screenshots were updated. EFA also shows results in SPSS, just without cumulative values, because value(s) are over 1. But why the difference in positive and negative factor loadings between JASP and SPSS?
Thank you, I found the issues and now it gives me very similar results
• asked a question related to Exploratory Factor Analysis
Question
When designing the questionnaire for EFA, what do I need to keep in mind when it comes to the order of the questions?
More specifically, does the order of the questions need to be completely randomized or is it generally allowed to still ask questions in topic blocks according to potential factors/constructs I have in mind?
Thanks everyone!
A randomized order of items can have the advantage that it may prevent response styles (such as saying "yes" to every item, skipping items, other careless response styles). When items are arranged according to topic/same attribute, participants may get bored and employ a careless response style, especially when the items are similarly worded. They may also find it easier to figure out what attributes you are trying to measure, which may or may not be desirable for your study.
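If a randomized order is chosen, one simple way to implement it is a separate, reproducible shuffle per respondent, sketched below. The item labels, respondent ID, and seed are hypothetical, and most survey platforms offer built-in randomization that would replace this.

```python
import random

def item_order_for(respondent_id, items, seed=2024):
    """Return a reproducible, respondent-specific random order of the items."""
    rng = random.Random(f"{seed}-{respondent_id}")   # string seed per respondent
    order = list(items)
    rng.shuffle(order)
    return order

# Hypothetical item pool grouped by intended construct (labels are made up).
items = ["A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2", "C3"]
order_r17 = item_order_for(17, items)
```

Storing the seed and respondent ID makes it possible to reconstruct the presented order later, for example to check for position effects.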
• asked a question related to Exploratory Factor Analysis
Question
For my research project I am adding new items to a previously validated scale. In the previous research, they performed an exploratory factor analysis revealing a two-factor structure, but the internal consistency of one of the scales was quite poor, so the aim of my study was to add new words to improve the internal consistency. Do I need to do another exploratory factor analysis, as the scale will now have new words, or can I do a confirmatory factor analysis because I'm still using the same scale?
Megan Burkinshaw, I'm really sorry: I have only just now (~18 months later) noticed your question to me. It seems I overlooked a notification concerning it. Although you've probably moved on, I'll answer in case it helps others.
My answer is, yes, I do recommend doing EFA when new words (I assume you mean items, not words - but the same probably applies to words) have been added to a scale.
Feel free to get back - and, again, my apologies for a late reply.
• asked a question related to Exploratory Factor Analysis
Question
I did the confirmatory factor analysis of the questionnaire. If, in each subscale, I select the questions that have the highest factor loadings, treat these questions as a short version, and carry out the other steps to verify the validity and reliability of the questionnaire, is this the correct way?
When I do exploratory factor analysis, I get fewer subscales than there are in the long-form questionnaire.
Can anyone recommend a research paper or chapter detailing best practice for developing a short version of an existing scale we have developed?
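The selection step described in the question (keeping the highest-loading items in each subscale) can be sketched as follows. The subscale names, item labels, and loadings are invented for illustration, and this shows only the mechanical step the poster describes, not an endorsement of it as best practice; content coverage and revalidation of the short form would still need to be checked.

```python
def pick_top_items(loadings, per_subscale=3):
    """For each subscale, keep the items with the largest absolute loadings."""
    short_form = {}
    for subscale, items in loadings.items():
        ranked = sorted(items, key=lambda name: abs(items[name]), reverse=True)
        short_form[subscale] = ranked[:per_subscale]
    return short_form

# Hypothetical standardized CFA loadings (made-up values).
loadings = {
    "subscale_A": {"a1": 0.82, "a2": 0.55, "a3": 0.71, "a4": 0.64, "a5": 0.48},
    "subscale_B": {"b1": 0.60, "b2": 0.77, "b3": -0.69, "b4": 0.52},
}
short_form = pick_top_items(loadings)
```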
• asked a question related to Exploratory Factor Analysis
Question
In exploratory factor analysis, to name the dimensions, some items are different and heterogeneous from other items of the same dimension, and they make it difficult to name each category. For example, in the first category "personality type of the manager", in the 3rd category, two items "level of analysis of the spouse" and "wife's perception of the existence of social justice", in the 5th category, the item "organizational level of the manager (senior, middle, operational manager) ", in category 6, the item "Education level of wife" and in category 7, the item "Job transition of wife".
Regardless of the heterogeneous items above with other items in the same category, the naming of each category has been done as follows. The question is: what to do with heterogeneous objects? Are they deleted?
Category 1. Family intimacy: spouse's relationship with family; years of married life; wife's compassionate analysis of the company events recounted by the manager; Manager's personality type
Category 3. Manager's attitude towards his wife: manager's mental health; The amount of analysis of the wife; Manager's leadership style; The manager's level of trust in his wife; Wife's perception of the existence of social justice
Category 5. Spouse's power of influence on the manager: manager's organizational level (senior, middle, operational manager); The wife's effort to show support to the manager; Spouse's ability to regulate manager's emotions; The extent of the spouse's influence on the manager's decision-making; wife being a housewife
Category 6. Environment: the level of stability of the environment; Accompanying men (in the role of manager) with their working wives; Manager's income; spouse's level of education; The degree of balance between work and family by the manager
Category 7. Spouse's physical health: spouse's job transfer; wife's physical health; manager's physical health
My perspective on this issue is that it doesn't make sense to keep heterogeneous/multidimensional items on the same factor. The very goal of factor analysis is to achieve unidimensionality within a given factor--you don't want items that measure very different things on the same factor because it makes the factor uninterpretable. That is, all items that are used as measures of the same factor should (ideally) be unidimensional (i.e., measure only the given factor with zero or at least minimal cross-loadings on other factors).
Things like, for example, "stability of the environment", "manager's income", and "spouse's level of education" are very highly distinct dimensions that would not be expected to load on a single factor to begin with. It does not make sense to put or keep them on the same factor. The interpretation of such a factor would be unclear or impossible.
• asked a question related to Exploratory Factor Analysis
Question
Here is the code I'm using:
fa(vars, nfactors = 4, rotate = "oblimin", fm = "pa")  # fa() is from the psych package
Hi everyone, I just wanted to let you know that scaling the variables fixed this issue for me, and it changed the results only very slightly (one variable was no longer removed during the iteration process).
• asked a question related to Exploratory Factor Analysis
Question
Hi. I recently developed new survey items to measure extrinsic motivation. In the exploratory factor analysis, four items intended to measure extrinsic motivation loaded onto the same factor with negative scores (-.822; -.813; -.808; -.553). However, in the subsequent confirmatory factor analysis, the same items produced positive scores (.78; .80; .67; .56). I do not have a theory about why this could be and I would appreciate any help with explaining this finding.
Thank you.
Hello Richard,
Rotational schemes in EFA are arithmetically driven, not theoretically driven. So, having one factor rotated 180 degrees "out of phase" with the theoretical orientation only means that: (a) loadings and factor pattern values; and (b) correlations with other factors would all be reversed in sign if the orientation were changed to the intended direction.
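The sign reversal described above can be checked numerically. The following is a minimal sketch, assuming a one-factor model in which the correlation implied for items i and j is the product of their loadings; the function name `implied_correlations` is hypothetical, and the loadings are the four EFA values quoted in the question. Reflecting the factor (multiplying every loading by -1) leaves every implied correlation unchanged, so the two orientations describe the same data equally well.

```python
# EFA loadings quoted in the question (all negative on the same factor).
efa_loadings = [-0.822, -0.813, -0.808, -0.553]


def implied_correlations(loadings):
    """Matrix of products l_i * l_j for a one-factor model.

    Off-diagonal entries are the model-implied correlations;
    diagonal entries are the item communalities.
    """
    n = len(loadings)
    return [[loadings[i] * loadings[j] for j in range(n)] for i in range(n)]


# Reflect the factor: reverse the sign of every loading.
reflected_loadings = [-value for value in efa_loadings]

original = implied_correlations(efa_loadings)
reflected = implied_correlations(reflected_loadings)

# The implied correlations are identical under both orientations.
same_fit = all(
    abs(original[i][j] - reflected[i][j]) < 1e-12
    for i in range(len(efa_loadings))
    for j in range(len(efa_loadings))
)
print(same_fit)
```

Because only the products of loadings enter the model, the direction of a factor is arbitrary, which is why an EFA and a CFA can report loadings of opposite sign for the same items.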
As for your follow-up question about the three "amotivation" items, do note that, even if you conceptualize a bipolar variable, there's no assurance that EFA will yield one. In my experience, it's not uncommon for items which have different valence to generate two factors instead of one. The "true" structure could be one bipolar, two, bi-factor, or second order. The good news is, you can evaluate these via CFA to see if the one you were aiming for works well with the data.
• asked a question related to Exploratory Factor Analysis
Question
I used confirmatory factor analysis without using the exploratory factor analysis first. The assumption that supported this was the clear presence of the factors in the literature. Is it correct not to go for the exploratory factor analysis and jump directly to CFA when you have clearly established factors in the literature and theory?
Yes, absolutely. There is no point in "exploring" factors when you have a clear theory or prior knowledge about the factors and how the observed variables (measurements) relate to them. In that case, you want to "confirm" (test) rather than explore/determine the number and nature of the factors. EFA is only needed when you have no prior theory and need to explore or when you have a CFA solution that completely fails to fit and you don't know why.
• asked a question related to Exploratory Factor Analysis
Question
Can I split a factor that has been identified through EFA?
N = 102; 4 factors have been identified.
However, one of the 4 actually contains two different ideas that are evidently factoring together. I am working on trying to explain how they go together, but it is very easy to explain them as two separate factors.
When I conduct a confirmatory analysis, the model fit is better with them separate. . . but running a confirmatory analysis on the same sample of subjects that I conducted the exploratory analysis on appears to be frowned upon.
Eric Orr, performing EFA and CFA on the same dataset serves no purpose. EFA is used to extract factors for the first time from a dataset, whereas CFA is used to verify factors extracted from a separate dataset.
However, exploratory factor analysis (EFA) is commonly employed to determine a measure's factor structure and to assess its internal reliability. When researchers have no theories regarding the nature of the underlying factor structure of their measure, EFA is frequently advised.
• asked a question related to Exploratory Factor Analysis
Question
Exploratory Factor Analysis and Confirmatory Factor Analysis are used in scale development studies. Rasch Analysis method can also be used in scale development. There are some researchers who consider the Rasch Analysis as up-to-date analysis. Frankly, I don't think so, but is there a feature that makes EFA, CFA or Rasch superior to each other in Likert type scale development?
Rasch is better for a unidimensional scale
• asked a question related to Exploratory Factor Analysis
Question
I am conducting an EFA for a big sample and nearly 100 variables, but no matter what I do, the determinant keeps its ~0 value.
What should I do now?
Hello Assia,
When computed for a correlation matrix, the determinant may range from 0 (no independent variation) to 1 (no relationship among the variables at all). Lower values generally would suggest the data set _does_ share common variance, and therefore may be amenable to factoring.
A variable set with a determinant of exactly zero would indicate that at least one variable is an exact linear combination of the others; in the extreme case, all variables are isomorphic and interchangeable (one gives exactly the same information as any other). You could contrive such a data set (for example, by asking adults their age four times in a row and treating the responses as separate variables); it should be apparent that scores on those variables would correlate perfectly (hence, a zero determinant for the matrix).
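The range of the determinant can be illustrated with three small correlation matrices. This is only a sketch: the matrices are made-up examples, and `determinant` is a hypothetical helper written with plain Gaussian elimination so that no external library is needed.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


# No relationship among the variables: determinant is 1.
uncorrelated = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Three copies of the same variable: determinant is 0.
redundant = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
# Moderate shared variance (all r = .5): determinant lies in between.
moderate = [[1, 0.5, 0.5], [0.5, 1, 0.5], [0.5, 0.5, 1]]

for name, matrix in [("uncorrelated", uncorrelated),
                     ("redundant", redundant),
                     ("moderate", moderate)]:
    print(name, round(determinant(matrix), 4))
```

With around 100 variables, the determinant is a product of many numbers below 1 and can be extremely close to zero even when no variable is strictly redundant, so a value of ~0 is not by itself proof of a problem.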
• asked a question related to Exploratory Factor Analysis
Question
I plan to develop a semi-structured assessment tool and further validate it on a relatively small sample of below 50 (clinical sample). I have been asked by the research committee to consider factor analysis.
So in this context, I wanted to know if anyone has used regularized factor analysis for tool validation which is recommended for small sample sizes?
The sample size is quite small, but if it is above 100 you can try. There have been studies that used exploratory factor analysis on such small samples. However, you need to check the KMO and Bartlett's Test of Sphericity to assess the adequacy of your data. Try reading the following papers, which support smaller samples for EFA.
De Winter, J.C.F., Dodou, D., & Wieringa, P.A. (2009). Exploratory factor analysis with small sample sizes. Multivariate Behavioral Research, 44, 147–181.
Wirth, R. J., & Edwards, M. C. (2007). Item factor analysis: current approaches and future directions. Psychological Methods, 12, 58-79.
Barrett, P. (2007). Structural equation modelling: Adjudging model fit. Personality and Individual Differences, 42, 815-824.
• asked a question related to Exploratory Factor Analysis
Question
I am aware that a high degree of normality in the data is desirable when maximum likelihood (ML) is chosen as the extraction method in EFA and that the constraint of normality is less important if principal axis factoring (PAF) is used as the method of extraction.
However, we have a couple of items in which the data are highly skewed to the left (i.e., there are very few responses at the low end of the response continuum). Does that put the validity of our EFAs at risk even if we use PAF?
This is a salient issue in some current research I'm involved in because the two items are among a very small number of items that we would like, if possible, to load on one of our anticipated factors.
Christian and Ali, thanks for your posts. Appreciated. I'll follow up on both of them.
Robert
• asked a question related to Exploratory Factor Analysis
Question
Dear all,
I am conducting research on the impact of blockchain traceability for charitable donations on donation intentions (experimental design with multiple conditions, i.e., no traceability vs. blockchain traceability).
One scale/factor measures “likelihood to donate” consisting of 3 items (dependent variable).
Another ”trust” factor, consisting of 4 items (potential mediator).
Furthermore, a “perception of quality” consisting of 2 items (control).
And a scale “prior blockchain knowledge” consisting of 4 items (control).
My question is: since all these scales are taken from prior research, is CFA sufficient? Or, since the factors are from different studies (and thus have never been used together in one survey/model) should I start out with an EFA?
For instance, I am concerned that one (or perhaps both) items of ”perception of charity quality” might also load on the “trust”-scale. e.g., the item “I am confident that this charity uses money wisely”
Kim Fleche, I'm glad my post was helpful. Please be aware that coefficient alpha (often less appropriately referred to as Cronbach's alpha) is a much more complex, uninformative, and deceptive metric than many researchers seem to appreciate.
A major feature of coefficient alpha is that it is highly dependent on the number of items involved. Because of that, a small number of nicely correlated items can have quite a low coefficient alpha. Conversely, by the time there are 20 or more items, the value of alpha can be quite high despite little association between many of those items.
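The dependence on the number of items can be made concrete with the standardized form of coefficient alpha, which depends only on the number of items k and the mean inter-item correlation. This is a minimal sketch; the function name `standardized_alpha` and the two example scales are assumptions chosen for illustration, not values from the posts above.

```python
def standardized_alpha(n_items, mean_r):
    """Standardized coefficient alpha: k * r / (1 + (k - 1) * r)."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# A short scale with nicely correlated items (hypothetical values).
short_scale = standardized_alpha(n_items=3, mean_r=0.40)

# A long scale with weakly correlated items (hypothetical values).
long_scale = standardized_alpha(n_items=20, mean_r=0.15)

print(round(short_scale, 2))  # 0.67
print(round(long_scale, 2))   # 0.78
```

The 20-item scale reaches the higher alpha even though its items are far less strongly related, which is the pattern described in the post.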
The following publications might be helpful:
• asked a question related to Exploratory Factor Analysis
Question
Greetings,
I am a DBA student conducting a study about "Factors Impacting Employee Turnover in the Medical Device Industry in the UAE."
My research model consists of 7 variables, out of which:
• 5 Variables measured using multi-item scales adapted from literature ex. Perceived External Prestige (6 items), Location (4 items), Flextime (4 items),.. etc.
• 2 are nominal variables
I want to conduct a reliability analysis using SPSS & I thought I need to do the below?
1. Conduct reliability test using SPSS Cronbach's alpha for each construct (except for nominal variables)
2. Deal with low alpha coefficients (how to do so?)
3. Conduct Exploratory Factor Analysis to test for discriminant validity
Am I thinking right? Attached are my results up to now..
Thank you
This issue is not my specialty. With my best wishes.
• asked a question related to Exploratory Factor Analysis
Question
Hi everyone,
I have longitudinal data for the same set of 300 subjects over seven years. Can I use 'year' as a control variable? Initially, I used one-way ANOVA and found no significant difference across the seven years in each construct.
Which approach is more appropriate: pooling the time series after ANOVA (if not significant), or using 'year' as a control variable?
I think that when no significant difference is found, it is improper to use it as a control variable. It is better to allow PLS to create its own groups, if any are present in the data.
• asked a question related to Exploratory Factor Analysis
Question
Hello everyone,
As the title suggests, I am trying to figure out how to compute a reduced correlation matrix in R. I am running an Exploratory Factor Analysis using Maximum Likelihood as my extraction method, and am first creating a scree plot as one method to help me determine how many factors to extract. I read in Fabrigar and Wegener's (2012) Exploratory Factor Analysis, from their Understanding Statistics collection, that using a reduced correlation matrix when creating a scree plot for EFA is preferable compared to the unreduced correlation matrix. Any help is appreciated!
Thanks,
Alex
• asked a question related to Exploratory Factor Analysis
Question
I'm translating a very short scale of 12 items to assess therapeutic alliance in children. I have 61 responses and I wonder whether that number of subjects is acceptable for running an Exploratory Factor Analysis. I know there is a suggestion of 5 participants per item for EFA and 10 participants per item for CFA. However, the number of participants here seems very small for these analyses. What is your opinion?
Hello Leandro,
The vague answer is, more cases is generally better than fewer cases. There are two reasons:
1. The factor model you seek must, ideally, be capable of providing good estimates for the (12*11/2 =) 66 unique observed relationships among your 12 variables (here, item scores). That's a lot to ask of 61 cases.
2. In general, the smaller the N of cases, the more volatile are the observed relationships among the variables from sample to sample. Hence, the less likely that your sample data will accurately represent the correlations that may exist in the population. As these correlations are what the EFA is intended to account for, if they are incorrect then you likely will identify a factor model that does not generalize well to the population. The literature is full of studies in which one set of authors using a small or modest sample claims their EFA shows "different" results than those from another set of authors, even if ostensibly drawing from the same population.
Can you still proceed? Yes, of course--the numbers won't leap up from your data file and protest! However, do be mindful that: (a) guidelines such as "at least 100 cases" and "10-20 cases per variable" for EFA abound; and (b) you likely would want to characterize your results as tentative or exploratory, rather than as a definitive solution to the question of the true factor structure.
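The arithmetic behind the answer above can be written out as a short sketch. The function name `unique_correlations` is hypothetical; the inputs (12 items, 61 cases) come from the question.

```python
def unique_correlations(n_variables):
    """Number of distinct off-diagonal correlations among p variables: p(p-1)/2."""
    return n_variables * (n_variables - 1) // 2


n_items = 12
n_cases = 61

n_correlations = unique_correlations(n_items)
cases_per_item = n_cases / n_items

print(n_correlations)            # 66
print(round(cases_per_item, 2))  # 5.08
```

So the sample just meets a "5 cases per item" guideline while offering fewer cases than there are correlations to be reproduced, which is consistent with treating the results as tentative.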
• asked a question related to Exploratory Factor Analysis
Question
Exploratory factor analysis was conducted to determine the underlying constructs of the questionnaire. The results show the % variance explained by each factor. What does % variance mean in EFA? How to interpret the results? How can I explain the % variance to a non-technical person in simple non-statistical language?
The % variance explained by a factor in a given variable is often called communality (h^2). It gives the proportion of individual differences in an observed (measured) variable that is accounted for by the common factor. The communality is similar to R-squared in regression, except that the independent variable is a latent factor. Under certain conditions, the communality is equal to the reliability of a variable, namely when the variable only measures the common factor and measurement error.
In summary, the communality gives you an estimate of a variable's ability to reliably measure the common factor. The closer the communality to 1, the more reliable the variable as a measure of the factor.
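A small worked example may help separate two quantities that are easy to mix up: the communality of a variable (a row-wise sum of squared loadings) and the "% of variance" reported for a factor (a column-wise sum of squared loadings divided by the number of variables). This is a sketch under stated assumptions: the loading matrix is invented, the factors are assumed to be uncorrelated (with oblique rotations these sums cannot simply be added), and the variables are assumed to be standardized.

```python
# Hypothetical loadings: 4 standardized variables (rows) on 2 orthogonal factors (columns).
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.6],
    [0.2, 0.5],
]

n_variables = len(loadings)
n_factors = len(loadings[0])

# Communality of each variable: sum of its squared loadings across factors.
communalities = [sum(value ** 2 for value in row) for row in loadings]

# Variance accounted for by each factor: sum of squared loadings down its column.
factor_ss = [sum(row[f] ** 2 for row in loadings) for f in range(n_factors)]

# "% of variance" per factor: that sum relative to the total variance
# (one unit per standardized variable).
percent_variance = [100 * ss / n_variables for ss in factor_ss]

print([round(h2, 2) for h2 in communalities])    # [0.65, 0.53, 0.37, 0.29]
print([round(p, 1) for p in percent_variance])   # [29.5, 16.5]
print(round(sum(percent_variance), 1))           # 46.0
```

In plain language, one could say that the two factors together capture a bit under half of the differences between people across the four questions, and that the first factor captures the larger share.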
• asked a question related to Exploratory Factor Analysis
Question
I have a good result but with low variance explained (less than 50%) in exploratory factor analysis, and I have read some discussions about whether total variance explained of < 50% is acceptable in the social sciences. Please recommend papers that address this issue or give me your suggestions.
Hello Sirwarit,
There are a lot of possible answers to your query as it was posted.
1. It could be that the set of variables you are attempting to factor simply do not share common latent variable underpinnings, and therefore no parsimonious factor structure (other than, perhaps, something approaching the "worst case" of k factors for k variables) exists.
2. It could be that the variables are poorly measured and/or suffer from high levels of specificity, such that the common variance is "low," according to your appraisal. Better versions or measurements of the variables might alleviate this problem.
3. You could have extracted too few factors to account for the variation in the variable set. Alternatively, a chunk of the variables in your set simply might not belong in the set, and this is causing your solution(s) to have low variance accounted for.
4. Your judgment that less than 50% of variance being explained by the factor structure you've selected from your EFA is inadequate might be too pessimistic. I've seen a lot of published studies in which EFA solutions adopted by author/s account for 35-60 percent of variance. Often, this coincides with author/s' decision to jettison a number of the variables from the solution because of low variable-factor loadings. Are these ideal structures? Probably not, but one of the reasons for engaging in EFA is to refine a variable set into a group which does have common latent underpinnings...so, it is not unreasonable to expect that some variables might not end up being included.
I agree with Imran Anwar that it wasn't clear what you meant by having obtained good results.
• asked a question related to Exploratory Factor Analysis
Question
I am using the Environmental Motives Scale in a new population. My sample size is 263.
The results of my exploratory factor analysis showed 2 factors (Eigenvalue>1 and with loadings >0.3) - Biospheric Factor and Human Factor
Cronbach alpha was high for both factors (>0.8)
However, unexpectedly, confirmatory factor analysis showed that the model did not fit well:
RMSEA= 0.126, TLI =0.872 and SRMR = 0.063, AIC = 6786
After a long time on YouTube, I then checked the residual matrix and found that the standardized covariance residual between two of the items in the Biospheric factor was 7.480. From what I understand, values > 3 indicate that there may be additional factor/s accounting for the correlation besides the named factor. I therefore tried covarying the error terms of those two items and rechecked the model fit using CFA.
Results of this model show much better model fit.
RMSEA = 0.083, TLI = 0.945, SRMR = 0.043, AIC = 6731 (not as much difference as I thought there would be)
The questions I am now left with (which google does not seem to have the answer to) are:
1. Is it acceptable to covary the error terms to improve model fit?
2. How does covarying error terms impact on the scoring of the individual scales? Can I still add up the items to measure biospheric vs human scales as I would have without the covarying terms?
I would be so grateful for any insight or assistance.
Thank you
Tabitha
Tabitha Osler Many software programs for confirmatory factor analysis (CFA) will allow you to estimate and save individual factor scores as new variables to your data set. These scores can then be used in the same way as a conventional scale score (e.g., sum score of items). One advantage of factor scores is that they take the different item loadings into account and that they would allow you to also account for the error correlation parameter when estimating the factor scores.
I'm not sure which software program you're using. In any case, the program manual/user's guide should have information regarding options for estimating and saving factor scores.
• asked a question related to Exploratory Factor Analysis
Question
I used exploratory factor analysis for 4 latent variables. The table called "Total Variance Explained" has a column "% of Variance". Is it similar to average variance extracted?
What are the steps for assessing discriminant validity in SPSS? I ran the factor analysis, then computed the latent variables as observed variables, and after that I ran the correlations. Is this the correct process?
Values should be greater than .05.
• asked a question related to Exploratory Factor Analysis
Question
Hi, I am working on a project about ethical dilemmas. This project requires development of a new questionnaire that should be valid and reliable. We started with collecting the items from the literature (n= 57), performed content validity where irrelevant items were removed (n=46), and piloted it to get the level of internal consistency. Results showed that the questionnaire has a high level of content validity and internal consistency. We were requested to perform exploratory factor analysis to confirm convergent and discriminant validity.
Extraction: PCA
Rotation: varimax
Results: the items' communalities were higher than 0.6.
KMO: 0.70
Bartlett's test is significant.
Number of extracted factors: 11, with total explained variance of 60%.
My issue is that 6 factors contain only 2 items. Should I remove all these items?
Note that the items are varied: each one describes a different situation, and all they share is that they are ethical dilemmas. Deleting them will affect the questionnaire's overall ability to assess participants' level of difficulty with, and the frequency of, such situations.
EFA is new concept for me; I am really confused by this data.
Eman ALI Dyab, if you have been asked to conduct exploratory factor analysis (EFA), I suggest you follow that instruction rather than conducting confirmatory factor analysis.
There are other things you should consider, however. Given that your items' communalities are > .60, I think you don't need to abide by the rule of thumb that you have 10 x the number of participants as there are items, but I think you do need to ensure you have enough participants. Maybe 200 would be sufficient.
Also, given that you have 11 factors in the data, I wonder whether you are using the Kaiser criterion to determine the number of factors. That method has been criticised for at least 20 years. A better method is to use the scree test in conjunction with parallel analysis. The scree plot is often quite easy to interpret. I am attaching information about parallel analysis in case that helps.
Apart from that, I think it would be better to use EFA, not PCA (PCA is not really EFA, and you have been asked to conduct EFA, anyhow) and that you use an oblique (e.g., promax), rather than varimax, rotation. (Use of varimax has been criticised for quite a long time - though a lot of researchers keep using it, probably because they blindly follow what the crowd is doing.)
My inclination would be to remove items that load only on two-item factors. However, before doing that, I'd go back and use the scree plot in conjunction with parallel analysis, EFA with an oblique rotation, and a series of EFAs in which poorly performing items were successively removed.
All the best for your research.
• asked a question related to Exploratory Factor Analysis
Question
Do you know of any well-regarded article, published in a Scopus-indexed journal, that discusses which extraction method is best for Exploratory Factor Analysis (EFA) in SPSS: 'Principal Components' or 'Principal Axis Factoring'?
Hi, Mohammad Kamrul Ahsan. Here are some good papers on EFA that I hope will help you.
Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift's electric factor analysis machine. Understanding Statistics, 2(1), 13-43.
Yong, A. G., & Pearce, S. (2013). A beginner's guide to factor analysis: Focusing on exploratory factor analysis. Tutorials in Quantitative Methods for Psychology, 9(2), 79-94.
Costello, A. B., & Osborne, J. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research, and Evaluation, 10(1), 7.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272.
Williams, B., Onsman, A., & Brown, T. (2010). Exploratory factor analysis: A five-step guide for novices. Australasian Journal of Paramedicine, 8(3).
• asked a question related to Exploratory Factor Analysis
Question
I collected 109 responses for 60 indicators to measure the status of urban sustainability as a pilot study. As far as I know, I cannot run EFA because each indicator requires at least 5 responses, but I do not know whether I can run PCA with limited responses. Could you please advise me on the applicability of PCA or any other possible analysis?
I would recommend you read about the difference between EFA and PCA first. Whether or not you should run an EFA has nothing to do with the number of response options on the indicators, five or otherwise. In general, EFA is preferable to PCA as it is considered to be the 'real' factor analysis. There are many threads on RG on this issue.
Best
Marcel
• asked a question related to Exploratory Factor Analysis
Question
Query 1)
Can the mirt exploratory factor analysis method be used to determine factor structure in marketing/management research? Most of the studies that I have gone through are related to educational testing. My objective is to extract factors to be used in subsequent analysis (Regression/SEM). My data comprise questions like the following.
Data sample for Rasch factors
Thinking about your general shopping habits, do you ever:
a. Buy something online
b. Use your cell phone to buy something online
c. Watch product review videos online
RESPONSE CATEGORIES: Yes = 1, No = 0
Data sample for graded factors
Thinking about ride-hailing services such as Uber or Lyft, do you think the following statements describe them well?
a. Are less expensive than taking a taxi
c. Use drivers who you would feel safe riding with
d. Save their users time and stress
e. Are more reliable than taking a taxi or public transportation
RESPONSE CATEGORIES: Yes = 3, Not sure = 2, No = 1
Query 2)
If we use mirt exploratory factor analysis with the Rasch model for dichotomous items and the graded model for polytomous items, do these models by default use tetrachoric correlations for the Rasch model and polychoric correlations for the graded model? My objective is to extract factors to be used in subsequent analysis (Regression/SEM). Note: I am using R for data analysis.
I really appreciate that you took the time to answer my question. Maybe I was unable to ask the question properly, but my objective is to create factors from an underlying battery of items with different scales. So my question is simply: can I use mirt to perform EFA to create factors to be used in subsequent analysis (Regression/SEM)?
One more thing: I would like to know what exactly you mean by "easy" and "difficult" items in your answer.
• asked a question related to Exploratory Factor Analysis
Question
Hi,
I used a self-efficacy tool for my sample. According to the original article, there is only one factor in the tool. However, in the Exploratory factor analysis for my sample, two factors were found. How can I interpret this result?
Basically, a statistically significant result is one that is unlikely to be due to chance, and it is determined by two important variables: sample size and effect size. The sample size refers to the size of the sample for your experiment.
Best Regards
Dr. Fatemeh Khozaei
• asked a question related to Exploratory Factor Analysis
Question
Hi, I have run an exploratory factor analysis (principal axis factoring, oblique rotation) on 16 items using a 0.4 threshold. This yielded two factors, which I had anticipated as the survey was constructed to measure two constructs. Two items had factor loadings <0.4 (from Factor 1) so I removed them, leaving 14. However, upon closer inspection, one of the items from Factor 2 loaded on to Factor 1 (|B| = 0.460).
The distinction between the two constructs is very clear so there should not be any misunderstanding on the part of the participants (n = 104). I'm unsure of what to do. I checked the Cronbach's alpha for each factor: Factor 1 (a = .835 with the problematic item, a = .839 without) and Factor 2 (a = .791).
Do I remove the item? Any advice would be very much appreciated. Thank you!
I think we might have a misunderstanding about communalities versus loadings. In SPSS output, the communalities are provided before the loadings, and it's the extraction communalities you need to pay attention to. I think they need to all be above ~.50 if you want to work with a smallish sample.
You could be spot on when referring to wording of items in your scale possibly not being satisfactory. With regard to the scale that I referred to in my post above, I'd forgotten to mention that a number of the items have wording that I regard as quite unpolished. Some are even silly.
One last note, please don't worry if the coefficient alphas for your two factors are quite low. With as few as 6 or 7 items, the alphas can be quite low (down around .70, maybe a touch less) without cause for concern. Check out the interitem correlations!
• asked a question related to Exploratory Factor Analysis
Question
Can anyone help me with the sample size calculation for the exploratory factor analysis? Do you know how to calculate it and with which statistical program? Thank you.
Fernando Calvo, there are a number of recommendations concerning the desirable sample size for exploratory factor analysis - and they don't always agree with each other.
The simplest recommendation is to aim for 10 times the number of people as there are items to be submitted to EFA. So, if you have 20 items, you'd need 200 people. That's the recommendation in the following chapter:
Dixon, AE. Exploratory factor analysis. In Plichta SB, Kelvin EA, editors. Munro's statistical methods for health care research. 6th ed. Philadelphia, PA. Wolters Kluwer; 2013. pp. 371–398.
There are other recommendations, however. My colleagues and I dealt with them in an article we'd had published nearly 2 years ago:
Ma, K., Trevethan, R., & Lu, S. (2019). Measuring teacher sense of efficacy: Insights and recommendations concerning scale design and data analysis from research with preservice and inservice teachers in China. Frontiers of Education in China, 14(4), 612–686. https://doi.org/10.1007/s11516-019-0029-1
It's an open-access article. Check out the middle paragraph on page 628 for several recommendations, and choose whatever seems most appropriate to your situation.
All the best with your research.
• asked a question related to Exploratory Factor Analysis
Question
Can the mirt exploratory factor analysis method be used to determine factor structure in marketing/management research? Most of the studies that I have gone through are related to educational testing. My objective is to extract factors to be used in subsequent analysis (Regression/SEM). My data comprise questions like the following.
Data sample for Rasch factors
Thinking about your general shopping habits, do you ever:
a. Buy something online
b. Watch product review videos online
RESPONSE CATEGORIES: 1 Yes, 2 No
Data sample for graded factors
Thinking about ride-hailing services such as Uber or Lyft, do you think the following statements describe them well?
a. Are less expensive than taking a taxi
b. Save their users time and stress
RESPONSE CATEGORIES: 1 Yes, 2 No, 3 Not sure
The mirt package generally concentrates on Item Response Theory. You can perform the Exploratory Factor Analysis you want more easily with the "fa" function in the "psych" package in R. You will also find many different and up-to-date estimation methods of factor analysis in this package.
• asked a question related to Exploratory Factor Analysis
Question
I have extracted factor scores after EFA and then used K-means to identify clusters.
But I am confused about how to validate the number of clusters in SPSS, as it does not show any AIC or BIC value on the basis of which I can compare solutions and finalise the number of clusters.
SPSS macros calculating most important cluster validity indices are available in the library "Internal clustering criteria" found on "Kirill's SPSS macros page" (easily googled).
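For readers who want to see what one internal cluster validity index looks like outside SPSS, here is a minimal pure-Python sketch of the mean silhouette width. It is not the macro library mentioned above; the function name `silhouette_score`, the toy factor scores, and the labelings are all hypothetical. Higher values suggest better-separated clusters, so the index can be compared across candidate numbers of clusters.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    widths = []
    for i, p in enumerate(points):
        # Mean distance from point i to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster: width is defined as 0
            widths.append(0.0)
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)    # separation
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical factor scores on two factors for six respondents.
scores = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
sensible = [0, 0, 0, 1, 1, 1]   # labels matching the two visible groups
scrambled = [0, 1, 0, 1, 0, 1]  # labels that ignore the structure
print(silhouette_score(scores, sensible), silhouette_score(scores, scrambled))
```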
• asked a question related to Exploratory Factor Analysis
Question
I have adopted my questionnaire from previous literature. I want to know if I still need to carry out EFA before carrying out PLS-SEM for my thesis?
@Zia Aslam The reference you shared here is an amazing landmark paper for the scholars doing survey-based primary data research. A must read manuscript indeed. Below is the attached manuscript in pdf.
• asked a question related to Exploratory Factor Analysis
Question
Hi,
I am working on exploratory factor analysis in SPSS using promax rotation.
Upon checking the pattern matrix, I found one item with a factor loading greater than 1.0. Should I ignore it if a loading is greater than 1.0?
Thanks much.
Hello Ramya,
The pattern matrix gives the regression weights for how scores on a given variable are estimated to be formed, given some value/score on each of the respective factors. These weights can be > |1|. The structure matrix gives the simple correlation of each variable with scores on each factor, not adjusted for the relationships among factors. As such, the structure matrix should not include any values that exceed |1|.
As the pattern matrix furnishes "affiliation" values that do adjust for the relationships among factors, most folks would opt to review that first for understanding what's going on in the factor solution. However, full understanding of a factor solution requires thoughtful review of (a) the pattern and structure matrices; (b) the factor correlation matrix; (c) the variable correlation matrix; (d) variable communalities (after extraction/rotation); and of course, any relevant theory as well as understanding of measurement issues associated with the respective variables.
Negative values are also not uncommon in factor pattern matrices, just as negative regression weights may be observed in multiple linear regression. Again, recall that the purpose of the pattern matrix is to show how scores on the respective factors would be optimally weighted so as to explain differences in scores on each observed variable. Whenever you have multiple factors (and therefore a reason to rotate a factor solution in the first place), the "optimal" combination of factor scores to "explain" scores on an observed variable may have positive, negative, or both valences of estimated weights.
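The relationship between the two matrices described above can be made concrete: with correlated factors, the structure matrix equals the pattern matrix multiplied by the factor correlation matrix. The sketch below uses made-up numbers (a two-factor solution with a factor correlation of 0.6) to show a pattern weight above 1 alongside structure correlations that stay within the usual bounds. The function name and all values are hypothetical.

```python
def structure_from_pattern(pattern, phi):
    """Structure matrix S = P * Phi (pattern weights times factor correlations)."""
    n_factors = len(phi)
    return [
        [sum(row[k] * phi[k][j] for k in range(n_factors)) for j in range(n_factors)]
        for row in pattern
    ]


# Hypothetical oblique two-factor solution.
phi = [[1.0, 0.6],
       [0.6, 1.0]]          # factor correlation matrix
pattern = [[1.05, -0.20],   # item 1: pattern weight above 1 on factor 1
           [0.10, 0.70]]    # item 2

structure = structure_from_pattern(pattern, phi)

for p_row, s_row in zip(pattern, structure):
    # Communality of an item = sum over factors of pattern weight * structure correlation.
    communality = sum(p * s for p, s in zip(p_row, s_row))
    print([round(s, 3) for s in s_row], round(communality, 3))
```

In this toy example the negative weight on the second factor offsets the weight above 1 on the first, which is why the item's correlations with the factors and its communality remain below 1.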
• asked a question related to Exploratory Factor Analysis
Question
Hello everyone,
I have run a confirmatory factor analysis in R to assess the translated version of an existing questionnaire. It is unidimensional and consists of 16 adjective-based items rated on a 7-point Likert Scale.
Here are the results:
X2= 627.197, df= 104, p= 0.000, RMSEA= 0.109, CI 90%= [0.101, 0.117], SRMR= 0.063, CFI= 0.839, TLI= 0.814
I am aware of all the cutoffs, and my RMSEA result is troublesome. On the other hand, as I delve into studies on similar topics, I see that some of them simply reported results like these as satisfactory and didn't conduct an Exploratory Factor Analysis.
What I am wondering is whether my results are acceptable enough to simply report them and run no EFA study.
Or should I run EFA and then gather data again based on the model proposed by EFA results?
Sara
Hello all,
if you allow, I would present a different perspective. The chi-square is highly significant, meaning that the model fails to predict the data beyond the amount expected by chance. This misfit may be trivial or substantial; you don't know which. A large sample size does not cause misfit; it increases the statistical power (as it should, this being a statistical test). The cut-off values of fit indices are completely bogus, as they cannot be generalized beyond the scenarios tested in the Monte Carlo simulations that underlay the values. The misspecifications in those scenarios were artificial and minor and do not represent the fundamental misspecifications a CFA model often implies.
Hence you have to diagnose the problems and come up with a proposal.
Here's a thread in which a paper is cited that shows the process how to do it
Best,
Holger
• asked a question related to Exploratory Factor Analysis
Question
The items used in my study have been adapted from the instruments that have been developed by some previous researchers. Most of the recently published articles didn't show the EFA results.
I would not feel confident adapting an instrument for which previous articles show only limited psychometric characteristics in the first place. I also concur that you should decide what to report depending on the size of your modifications. However, if you already have a factor structure in mind, why not choose a CFA instead?
Best
Marcel
• asked a question related to Exploratory Factor Analysis
Question
The Project is incomplete, Please open the file attached to see!
Having read the project, I recommend you use PAF with oblique rotation. The component matrix may be ignored; use the pattern matrix instead. Name the factors based on the items loading on them. Compute item means and standard deviations. Run a Cronbach's alpha reliability test for each factor using the items loading on it.
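As an illustration of the reliability step, here is a minimal sketch of Cronbach's alpha for the items loading on one factor, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The function name and the response data are hypothetical; in practice SPSS or another package would report this directly.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns of equal length."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items must have the same number of respondents")
    # Total score per respondent across the k items.
    totals = [sum(col[i] for col in items) for i in range(n)]
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical 5-point responses from six respondents to three items on one factor.
factor_items = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 2, 5, 5],
    [3, 5, 2, 2, 4, 4],
]
print(round(cronbach_alpha(factor_items), 3))
```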
• asked a question related to Exploratory Factor Analysis
Question
I have data containing three-level items (Yes / No / Not sure). Is it technically right to convert the data to numeric type and perform EFA (Exploratory Factor Analysis) to extract factor scores for use in subsequent analysis?
There are latent variable models for categorical observed variables. Look up the nominal IRT model if you want continuous latent variables. This is implemented in mirt. If you want categorical latent variables, google latent class models.
• asked a question related to Exploratory Factor Analysis
Question
I assessed the psychometric features of a construct, and after exploratory factor analysis more than half of the items were excluded. How does this affect construct/content validity?
Any ideas or suggestions for reading?
If you change a questionnaire around you are basically going back to the development stage. Fine, your re-analysed data may achieve good statistics on the sample data that you reduced but you are simply 'modelling the data' at this stage. As a reviewer I would reject any research that relies for its conclusions on this re-analysis. What you now need to do is to gather a fresh sample on your re-constructed questionnaire and run the whole gamut of statistics on the fresh sample. You will find (perhaps to your surprise but not to the surprise of an experienced psychometrician) that those lovely values you derived from your previous, reduced and edited subset are weakened - if they hold at all.
• asked a question related to Exploratory Factor Analysis
Question
Dear researchers, I am a master's student, now writing my graduation thesis. I am studying how the six dimensions of post-purchase customer experience influence customer satisfaction and, in turn, repurchase intention. I have adapted the measurement scales of the six dimensions to make them more applicable to my study context. My question is: do I have to conduct exploratory factor analysis in SPSS? I have done that, but there are many cross-loadings. I tried different methods, but the results still do not look good. Two dimensions of post-purchase customer experience (customer support and product quality) load on the same new factor, which I feel is not acceptable because they are very different. I understand there may be some problems with my questionnaire, but I have no chance to improve it now.
I tried using SmartPLS for my analysis, and the factor analysis in this software looks great, but I think the factor analysis there is CFA rather than EFA. So can I skip EFA and do CFA directly?
I will need to finish my thesis in 1 month, and I really need your help. Thank you!
Do only CFA if you have adopted a standard scale and made minor changes. But if you have constructed the whole scale yourself, that is, written new items, you would have to do an EFA on one sample and a CFA on a separate sample.
• asked a question related to Exploratory Factor Analysis
Question
Hello RG researchers,
I am a bit confused due to different questions and comments.
Well, I have a single factor containing 11 items (Likert rating). For the EFA, I am using SPSS (maximum likelihood) and I use lavaan and Amos for the CFA. I've got three questions:
1. The KMO and Bartlett's test criteria are met, while the normality tests (Kolmogorov-Smirnov and Shapiro-Wilk) are not (both are significant). So, am I fine to proceed with the EFA, or do I need to use Satorra-Bentler or Yuan-Bentler adjustments (if so, what software do I need)?
2. Should I check normality for each item, or is checking the normality of the overall variable enough?
3. For divergent validity, I use two other variables aside from my main questionnaire. Do they also need to be normally distributed?
Sara
1. To assess normality, I recommend interpreting the skewness and kurtosis coefficients instead of statistical tests. If there are normality problems, parameters can be estimated with unweighted least squares in SPSS.
2. A multivariate normality test is sufficient.
3. That depends on the statistic to be used. If normality is an assumption of that statistic, yes.
Good luck
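The skewness and kurtosis coefficients recommended in point 1 can be sketched as follows. This is a minimal illustration using simple moment-based formulas; SPSS reports bias-corrected versions, so its values will differ slightly, especially in small samples. The function names and the example responses are hypothetical.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2^(3/2). Zero for symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3. Zero for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical responses to one Likert item, piled up at the high end.
item = [5, 5, 4, 5, 3, 5, 4, 5, 2, 5, 4, 5]
print(round(skewness(item), 3), round(excess_kurtosis(item), 3))
```

A negative skewness here would reflect the long tail toward low scores; how large a departure from zero is tolerable depends on the guideline you follow.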
• asked a question related to Exploratory Factor Analysis
Question
I am examining results from an exploratory factor analysis (using Mplus) and it seems like the two-factor solution fits the data better than the one factor solution (per the RMSEA, chi-square LRT, CFI, TLI, and WRMR). Model fit for the one factor model was, in fact, poor (e.g., RMSEA = .10, CFI = .90). In the two factor model, the two latent factors were strongly correlated (.75) and model fit was satisfactory (e.g., RMSEA = .07, CFI = .94). The scree plot, a parallel analysis, and eigenvalue > 1, however, all seem to point to the one-factor model.
I am not sure whether I should retain the one- or two-factor model. I'm also not sure whether I should look at other parameters/model estimates to determine how many factors to retain. Theoretically, both models make sense. I intend to use these models to conduct an IRT analysis (a uni- or multidimensional graded response model, depending on the number of factors I retain).
0.70 is acceptable
• asked a question related to Exploratory Factor Analysis
Question
Currently, I am performing a factor analysis on 6 items.
I read that the residual plot can be used to assess the assumptions of normality, homoscedasticity, and linearity. However, I do not understand which residuals to use for this analysis. Do I need to examine 15 different plots for each combination of the 6 items?
Why don't you apply the regression method?
• asked a question related to Exploratory Factor Analysis
Question
Hello all,
This is my first time doing CFA in AMOS.
Initially, I developed a 17-item, 5-factor scale for a specific industry, based on theory from other industries. This proposed scale was tested with two datasets from a single institution, the first with n=91 (year 1) and the second with n=119 (year 2). EFA identified 3 underlying factors in both datasets; no items were deleted.
During year 3, a sample of n=690 consisting of participants from all over the nation was used to do CFA using SPSS AMOS. The output follows:
1. Based on EFA (3 factors, 17 items)
a) Chisquare = 1101.449 and df= 116 [χ2/DF = 9.495]
b) GFI = 0.805
c) NFI = 0.898
d) IFI = 0.908
e) TLI = 0.892
f) CFI = 0.908
g) RMSEA =0.111 (PClose 0.000)
h) Variance
Estimate S.E. C.R. P Label
F1 .573 .056 10.223 ***
F2 .668 .043 15.453 ***
F3 .627 .040 15.620 ***
i) Covariance
Estimate S.E. C.R. P Label
F1 <--> F2 .446 .036 12.502 ***
F1 <--> F3 .365 .032 11.428 ***
2) Based on theory (5 factors, 17 items)
a) Chisquare = 440.594 and df= 109 [χ2/DF = 4.042]
b) GFI = 0.926
c) NFI = 0.959
d) IFI = 0.969
e) TLI = 0.961
f) CFI = 0.969
g) RMSEA =0.066 (PClose 0.000)
h) Variance
Estimate S.E. C.R. P Label
F1 .677 .047 14.334 ***
F2 .670 .043 15.493 ***
F3 .648 .054 12.100 ***
F4 .741 .061 12.103 ***
F5 .627 .040 15.620 ***
i) Covariance
Estimate S.E. C.R. P Label
F1 <--> F2 .503 .036 14.057 ***
F1 <--> F3 .581 .041 14.262 ***
F1 <--> F4 .546 .041 13.388 ***
F1 <--> F5 .398 .032 12.321 ***
F2 <--> F3 .457 .036 12.848 ***
F2 <--> F4 .403 .035 11.405 ***
F2 <--> F5 .458 .033 13.899 ***
F3 <--> F4 .553 .042 13.036 ***
F3 <--> F5 .360 .032 11.275 ***
F4 <--> F5 .358 .033 10.754 ***
My questions:
1. Do I have to normalize the data before the CFA? (I am finding conflicting information, since my scale is a Likert scale and extreme values are not really outliers.)
2. Can I report that the theory-based model is a better fit than the EFA-based model? Would doing so be appropriate?
3. Is there anything else I need to do?
Any guidance will be greatly appreciated.
Thank you,
Sivarchana
Hi Robert Trevethan thanks for your question.
I think ML would be suited to continuous normal data, as you suggest, and robust ML for skewed/non-normal continuous (or interval data with say 7-11 categories). So far I have only collected Likert, so because that involves polychoric correlations ML might not estimate accurately. I don't actually know if PAF in SPSS is equal or superior to, say, a DWLS in R. I think the main point I wanted to emphasise is to use PAF instead of PCA - I got this wrong in my earlier days - as PAF will be more accurate if using SPSS.
I must say Robert, your answers really make me think, which is good :) If I need to be corrected, I'm open.
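The χ2/df ratios and RMSEA values reported in the question above can be recomputed from the chi-square, degrees of freedom, and sample size (n = 690) given there. The sketch below assumes the common point-estimate formula RMSEA = sqrt(max(χ2 − df, 0) / (df × (N − 1))); some software divides by N instead of N − 1, which makes little difference at this sample size. The function names are hypothetical.

```python
import math


def chi2_df_ratio(chi2, df):
    """Normed chi-square: chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA point estimate, assuming the (N - 1) version of the formula."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


N = 690  # year-3 sample size reported in the question
models = {
    "EFA-based, 3 factors": (1101.449, 116),
    "theory-based, 5 factors": (440.594, 109),
}
for name, (chi2, df) in models.items():
    print(f"{name}: chi2/df = {chi2_df_ratio(chi2, df):.3f}, "
          f"RMSEA = {rmsea(chi2, df, N):.3f}")
```

Recomputing reported indices this way is a quick check that the sample size and degrees of freedom in the output are the ones you think they are.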
• asked a question related to Exploratory Factor Analysis
Question
What is the best method or criterion for choosing the best item when cross-loadings of items are evident in exploratory factor analysis? Thanks!
Hello Musa,
The answer depends on your research objectives, on whether you use orthogonal or oblique rotation (assuming more than one factor, of course), and your degree of love for Thurstone's "simple structure" solution.
If you use oblique rotation, and factors have moderate to strong correlations, then you'll tend to get a lot of cross-loadings. That doesn't make the structure wrong, nor does it automatically imply that you should jettison all such variables from your solution. At the same time, this is not to say that sticking with orthogonal rotations will eliminate cross-loadings from occurring.
If you embrace the simple structure solution, then this means you want to see variable-factor loadings of near 1 (sign is irrelevant) on one factor, and near zero on all other factors. The reasons for this as a desired pattern of loadings (for some) are that it: (a) makes interpretation of salience very straightforward, (b) eliminates cross-loading acceptability debates, and (c) often makes the characterization of a factor somewhat easier.
However, if you do insist that all variables have salient loadings on one and only one factor, then you may end up excluding variables which are germane to the identification of factors. The consequence might be reduced validity for factor scores as representing the intended construct.
• asked a question related to Exploratory Factor Analysis
Question
I am working on developing a new scale. On running the EFA, only one factor emerged clearly, while the other two factors were messy, with multiple items from the different factors loading on them.
1. Can I remove the cross-loading items one by one, re-running the analysis each time, to reach a better factor structure?
2. If multiple items still load on one factor, what criteria should I use to determine what this factor is?
Exploratory factor analysis (EFA) is a method for exploring the underlying structure of a set of observed variables, and is a crucial step in the scale development process. ... After extracting the best factor structure, we can obtain a more interpretable factor solution through factor rotation.
• asked a question related to Exploratory Factor Analysis
Question
Can the measured variables that remain ungrouped in exploratory factor analysis be included as separate variables during the structural equation modeling (SEM) of the latent variables observed in factor analysis? Please help me with some references.
Thanks, Imran and David.
• asked a question related to Exploratory Factor Analysis
Question