
# Exploratory Factor Analysis - Science method


Questions related to Exploratory Factor Analysis

I used confirmatory factor analysis without first using exploratory factor analysis. The assumption that supported this was the clear presence of the factors in the literature. Is it correct to skip exploratory factor analysis and jump directly to CFA when you have clearly established factors in the literature and theory?

Can I split a factor that has been identified through EFA?

N = 102; 4 factors have been identified.

However, one of the 4 actually has two different ideas that obviously are factoring together. I am working on trying to explain how they go together but it is very easy to explain them as two separate factors.

When I conduct a confirmatory analysis, the model fit is better with them separate... but running a confirmatory analysis on the same sample of subjects on which I conducted the exploratory analysis appears to be frowned-upon behavior.

I am conducting an EFA for a big sample and nearly 100 variables, but no matter what I do, the determinant keeps its ~0 value.

What should I do now?
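A near-zero determinant of the correlation matrix usually signals multicollinearity: with ~100 variables, some are close to linear combinations of others. A minimal numpy sketch (simulated data, not the asker's) shows how a single redundant variable collapses the determinant, and how dropping it restores a healthy value:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 5))

# Add a sixth variable that is almost a linear combination of the others.
redundant = x.sum(axis=1) + rng.normal(scale=0.01, size=n)
data = np.column_stack([x, redundant])

det_full = np.linalg.det(np.corrcoef(data, rowvar=False))
det_reduced = np.linalg.det(np.corrcoef(x, rowvar=False))

print(det_full)     # near zero: the redundant column collapses the determinant
print(det_reduced)  # comfortably away from zero once it is dropped
```

In practice this suggests hunting for pairs or sets of items with very high correlations (or very high squared multiple correlations) and removing or combining them before re-running the EFA.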

I plan to develop a semi-structured assessment tool and further validate it on a relatively small sample of below 50 (clinical sample). I have been asked by the research committee to consider factor analysis.

So in this context, I wanted to know if anyone has used regularized factor analysis, which is recommended for small sample sizes, for tool validation.

Exploratory Factor Analysis and Confirmatory Factor Analysis are used in scale development studies. The Rasch analysis method can also be used in scale development. Some researchers consider Rasch analysis the more up-to-date method. Frankly, I don't think so, but is there a feature that makes EFA, CFA, or Rasch superior to the others for Likert-type scale development?

I am aware that a high degree of normality in the data is desirable when maximum likelihood (ML) is chosen as the extraction method in EFA and that the constraint of normality is less important if principal axis factoring (PAF) is used as the method of extraction.

However, we have a couple of items in which the data are highly skewed to the left (i.e., there are very few responses at the low end of the response continuum). Does that put the validity of our EFAs at risk even if we use PAF?

This is a salient issue in some current research I'm involved in because the two items are among a very small number of items that we would like, if possible, to load on one of our anticipated factors.

Dear all,

I am conducting research on the impact of blockchain traceability for charitable donations on donation intentions (experimental design with multiple conditions, i.e., no traceability vs. blockchain traceability).

One scale/factor measures “likelihood to donate” consisting of 3 items (dependent variable).

Another ”trust” factor, consisting of 4 items (potential mediator).

Furthermore, a “perception of quality” consisting of 2 items (control).

And a scale “prior blockchain knowledge” consisting of 4 items (control).

My question is: since all these scales are taken from prior research, is CFA sufficient? Or, since the factors are from different studies (and thus have never been used together in one survey/model) should I start out with an EFA?

For instance, I am concerned that one (or perhaps both) items of ”perception of charity quality” might also load on the “trust”-scale. e.g., the item “I am confident that this charity uses money wisely”

Curious to hear your opinions on this, thank you in advance!

Greetings,

I am a DBA student conducting a study about "Factors Impacting Employee Turnover in the Medical Device Industry in the UAE."

My research model consists of 7 variables, out of which:

- 5 variables measured using multi-item scales adapted from the literature, e.g., Perceived External Prestige (6 items), Location (4 items), Flextime (4 items), etc.
- 2 are nominal variables

I want to conduct a reliability analysis using SPSS, and I thought I would need to do the following:

- Conduct reliability test using SPSS Cronbach's alpha for each construct (except for nominal variables)
- Deal with low alpha coefficients (how to do so?)
- Conduct Exploratory Factor Analysis to test for discriminant validity

Am I thinking right? Attached are my results so far.

Thank you

Hi everyone,

I have longitudinal data for the same set of 300 subjects over seven years. Can I use 'year' as a control variable? Initially, I used one-way ANOVA and found no significant difference across the seven years in each construct.

Which approach is more appropriate: pooling the time series after ANOVA (if not significant) or using 'year' as a control variable?

Hello everyone,

As the title suggests, I am trying to figure out how to compute a reduced correlation matrix in R. I am running an Exploratory Factor Analysis using Maximum Likelihood as my extraction method, and am first creating a scree plot as one method to help me determine how many factors to extract. I read in Fabrigar and Wegener's (2012) *Exploratory Factor Analysis*, from their Understanding Statistics collection, that using a reduced correlation matrix when creating a scree plot for EFA is preferable to using the unreduced correlation matrix. Any help is appreciated! Thanks,

Alex
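The reduced correlation matrix simply replaces the 1s on the diagonal with communality estimates, most commonly the squared multiple correlations (SMCs), computable as 1 − 1/diag(R⁻¹). The arithmetic is small enough to show language-agnostically; here is a Python sketch with a toy matrix (in R, if memory serves, `psych::smc()` computes the same SMCs, so `diag(R) <- psych::smc(R)` builds the reduced matrix):

```python
import numpy as np

def reduced_correlation(R):
    """Replace the unit diagonal of a correlation matrix with
    squared multiple correlations (SMCs), a common communality estimate."""
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)
    return R_reduced

# Toy correlation matrix for three variables (illustrative values only).
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

R_red = reduced_correlation(R)
eigvals = np.linalg.eigvalsh(R_red)[::-1]  # a scree plot would chart these

print(np.diag(R_red))  # SMCs, each between 0 and 1
print(eigvals)         # eigenvalues of the reduced matrix, descending
```

The scree plot is then drawn from the eigenvalues of `R_red` rather than of `R`; note that eigenvalues of a reduced matrix can legitimately be negative or below 1.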

I'm conducting the translation of a very short scale of 12 items to assess therapeutic alliance in children. I have 61 answers, and I wonder if that number of subjects is acceptable for running Exploratory Factor Analysis. I know there is a suggestion of 5 participants per item for EFA and 10 participants per item for CFA. However, the number of participants here seems very small for these analyses. What is your opinion?

Exploratory factor analysis was conducted to determine the underlying constructs of the questionnaire. The results show the % variance explained by each factor. What does % variance mean in EFA? How to interpret the results? How can I explain the % variance to a non-technical person in simple non-statistical language?
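For grounding: the % variance for each factor is its eigenvalue (the sum of its squared loadings) divided by the total variance, which for standardized items equals the number of items. A minimal sketch with hypothetical eigenvalues from a 6-item analysis:

```python
import numpy as np

# Hypothetical eigenvalues from a 6-item correlation matrix
# (for standardized items, total variance = number of items = 6).
eigenvalues = np.array([3.0, 1.5, 0.6, 0.4, 0.3, 0.2])

pct_variance = 100 * eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(pct_variance)

print(pct_variance)  # the first factor explains 50% of the total variance
print(cumulative)    # the first two factors together explain 75%
```

For a non-technical audience: "of all the variation in people's answers across the questionnaire, this factor accounts for half; the first two factors together account for three quarters."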

I have good results but low explained variance (less than 50%) in exploratory factor analysis, and I have read some discussions about whether total explained variance below 50% is acceptable in the social sciences. Please recommend papers that address this issue, or give me your suggestions.

Thanks in advance.

I am using the Environmental Motives Scale in a new population. My sample size is 263.

The results of my exploratory factor analysis showed 2 factors (eigenvalues > 1, loadings > 0.3): a Biospheric factor and a Human factor.

**Cronbach's alpha** was high for both factors (>0.8).

However, unexpectedly, confirmatory factor analysis showed that the model did not fit well:

RMSEA= 0.126, TLI =0.872 and SRMR = 0.063,

**AIC = 6786**. After a long time on YouTube, I then checked the residual matrix and found that the standardized covariance residual between two of the items in the Biospheric factor was 7.480. From what I understand, values >3 indicate that there may be additional factor(s) accounting for correlation besides the named factor. I therefore tried covarying the error terms of those two items and rechecked the model fit using CFA.

Results of this model show much better model fit.

RMSEA = 0.083, TLI = 0.945, SRMR = 0.043,

**AIC = 6731** (not as much difference as I thought there would be). The questions I am now left with (which Google does not seem to have the answer to) are:

1. Is it acceptable to covary the error terms to improve model fit?

2. How does covarying error terms impact the scoring of the individual scales? Can I still add up the items to measure the biospheric vs. human scales as I would have without the covarying terms?

I would be so grateful for any insight or assistance.

Thank you

Tabitha

I used exploratory factor analysis for 4 latent variables. The output table called "Total Variance Explained" has a "% of variance" column. Is it similar to average variance extracted?

What are the steps for assessing discriminant validity in SPSS? I run the factor analysis, then compute the latent variables into observed variables, and after that I run the correlations. Is that the correct process?

Thanks for your attention

Hi, I am working on a project about ethical dilemmas. This project requires development of a new questionnaire that should be valid and reliable. We started with collecting the items from the literature (n= 57), performed content validity where irrelevant items were removed (n=46), and piloted it to get the level of internal consistency. Results showed that the questionnaire has a high level of content validity and internal consistency. We were requested to perform exploratory factor analysis to confirm convergent and discriminant validity.

Extraction: PCA

Rotation: varimax

Results: the items' communalities were higher than 0.6.

KMO = 0.70

Bartlett's test is significant.

Number of extracted factors: 11, with total explained variance of 60%.

My issue is that 6 factors contain only 2 items each. Should I remove all these items?

Note that the items are varied: each one describes a different situation, and they share only the fact that they are ethical dilemmas. Deleting them will affect the questionnaire's overall ability to assess participants' level of difficulty and the frequency of such situations.

EFA is a new concept for me; I am really confused by this data.

Do you know of any renowned article, published in a Scopus-indexed journal, describing which method is best for conducting Exploratory Factor Analysis (EFA) in SPSS: 'Principal Component' or 'Principal Axis Factoring'?

I collected 109 responses for 60 indicators to measure the status of urban sustainability as a pilot study. As far as I know, I cannot run EFA, as each indicator requires at least 5 responses, but I do not know whether I can run PCA with this limited number of responses. Could you please advise on the applicability of PCA or any other possible analysis?

Query1)

Can the mirt exploratory factor analysis method be used to determine factor structure in marketing/management research studies? Most of the studies I have gone through relate to educational testing.
My objective is to extract factors to be used in subsequent analysis (Regression/SEM)
My data is comprised of questions like:
Data sample for Rasch Factors
Thinking about your general shopping habits, do you ever:
a. Buy something online
b. Use your cell phone to buy something online
c. Watch product review videos online
RESPONSE CATEGORIES:
Yes = 1
No = 0
Data sample for graded Factors
Thinking about ride-hailing services such as Uber or Lyft, do you think the following statements describe them well?
a. Are less expensive than taking a taxi
c. Use drivers who you would feel safe riding with
d. Save their users time and stress
e. Are more reliable than taking a taxi or public transportation
RESPONSE CATEGORIES:
Yes = 3
Not sure = 2
No = 1
Query 2) If we use mirt exploratory factor analysis with the Rasch model for dichotomous items and the graded model for polytomous items, do these models by default use tetrachoric correlations for the Rasch model and polychoric correlations for the graded model?
My objective is to extract factors to be used in subsequent analysis (Regression/SEM)
Note: I am using R for data analysis

Hi,

I used a self-efficacy tool for my sample. According to the original article, there is only one factor in the tool. However, in the Exploratory factor analysis for my sample, two factors were found. How can I interpret this result?

Thank you so much for your answer in advance!

Hi, I have run an exploratory factor analysis (principal axis factoring, oblique rotation) on 16 items using a 0.4 threshold. This yielded two factors, which I had anticipated, as the survey was constructed to measure two constructs. Two items had factor loadings <0.4 (from Factor 1), so I removed them, leaving 14. However, upon closer inspection, one of the items from Factor 2 loaded onto Factor 1 (|*B*| = 0.460).

The distinction between the two constructs is very clear, so there should not be any misunderstanding on the part of the participants (n = 104). I'm unsure of what to do. I checked Cronbach's alpha for each factor: Factor 1 (*a* = .835 with the problematic item, *a* = .839 without); Factor 2 is *a* = .791.

Do I remove the item? Any advice would be very much appreciated. Thank you!

Can anyone help me with the sample size calculation for the exploratory factor analysis? Do you know how to calculate it and with which statistical program? Thank you.

Can the mirt exploratory factor analysis method be used to determine factor structure in marketing/management research studies? Most of the studies I have gone through relate to educational testing.
My objective is to extract factors to be used in subsequent analysis (Regression/SEM)
My data is comprised of questions like:
Data sample for Rasch Factors
Thinking about your general shopping habits, do you ever:
a. Buy something online
b. Watch product review videos online
RESPONSE CATEGORIES:
1 Yes
2 No
Data sample for graded Factors
Thinking about ride-hailing services such as Uber or Lyft, do you think the following statements describe them well?
a. Are less expensive than taking a taxi
b. Save their users time and stress
RESPONSE CATEGORIES:
1 Yes
2 No
3 Not sure

I'm trying to run a polychoric correlation with Stata v13, but I'm confused.

I have adopted my questionnaire from previous literature. I want to know if I still need to carry out EFA before carrying out PLS-SEM for my thesis?

Hi,

I am working on exploratory factor analysis in SPSS using promax rotation.

Upon checking the pattern matrix, there is one item with a factor loading greater than 1.0. Should I ignore loadings greater than 1.0?

Also, I realised there are a few negative loadings in the pattern matrix. Can negative loadings still be considered in the analysis?

Thanks much.

Hello everyone,

I have run a confirmatory factor analysis in R to assess the translated version of an existing questionnaire. It is unidimensional and consists of 16 adjective-based items rated on a 7-point Likert Scale.

Here are the results:

**χ² =** 627.197,

**df =** 104,

**p =** 0.000,

**RMSEA =** 0.109,

**90% CI =** [0.101, 0.117],

**SRMR =** 0.063,

**CFI =** 0.839,

**TLI =** 0.814

I am aware of all the cutoffs; the result for my RMSEA is troublesome. On the other hand, as I delve into similar topics, some of them just reported such results as satisfactory and didn't conduct an Exploratory Factor Analysis.

What I am wondering is: are my results acceptable enough to just report them and run no EFA study?

Or should I run EFA and then gather data again based on the model proposed by EFA results?

Thanks for your time,

Sara

The items used in my study have been adapted from the instruments that have been developed by some previous researchers. Most of the recently published articles didn't show the EFA results.

The project is incomplete; please open the attached file to see!

I have data containing three-level items (Yes / No / Not sure). Is it technically correct to transform the data into numeric type and perform EFA (Exploratory Factor Analysis) to extract factor scores for use in subsequent analysis?

Hey guys,

I have found two multi-item scales in previous research for my master's thesis. I want to know whether I can compute an EFA for the dependent and for the independent variable.

I assessed the psychometric features of a construct, and after exploratory factor analysis more than half of the items were excluded. How was the construct/content validity influenced?

Any ideas or suggestions for reading?

Dear researchers, I am a master's student, now writing my graduation thesis. I am studying how the six dimensions of post-purchase customer experience influence customer satisfaction and, in turn, repurchase intention. I have adapted the measurement scales of the six dimensions of post-purchase customer experience to make them more applicable to my study context. My question is: do I have to conduct exploratory factor analysis in SPSS? I have done that, but there are so many cross-loadings; I tried different methods, but the results still do not look good. Two dimensions of post-purchase customer experience (customer support and product quality) load onto the same new factor, which I feel is not acceptable, because they are very different. I understand there may be some problems with my questionnaire, but I have no chance to improve it now.

I tried to use SmartPLS for my analysis, and the factor analysis in this software looks great, but I think the factor analysis there is CFA rather than EFA. So can I skip EFA and do CFA directly?

I will need to finish my thesis in 1 month, and I really need your help. Thank you!

Hello RG researchers,

I am a bit confused due to different questions and comments.

Well, I have a single factor containing 11 items (Likert rating). For the EFA, I am using SPSS (maximum likelihood) and I use lavaan and Amos for the CFA. I've got three questions:

1. The KMO and Bartlett's test criteria are met, while the normality tests (Kolmogorov-Smirnov and Shapiro-Wilk) are not (both are significant). So, can I proceed with the EFA, or do I need to use Satorra-Bentler or Yuan-Bentler adjustments (and if so, what software should I use)?

2. Should I check normality for each item, or is checking the variable's normality enough?

3. For divergent validity, I use two other variables aside from my main questionnaire. Do they also need to be normally distributed?

Thanks for your time,

Sara

I am examining results from an exploratory factor analysis (using Mplus) and it seems like the two-factor solution fits the data better than the one factor solution (per the RMSEA, chi-square LRT, CFI, TLI, and WRMR). Model fit for the one factor model was, in fact, poor (e.g., RMSEA = .10, CFI = .90). In the two factor model, the two latent factors were strongly correlated (.75) and model fit was satisfactory (e.g., RMSEA = .07, CFI = .94). The scree plot, a parallel analysis, and eigenvalue > 1, however, all seem to point to the one-factor model.

I am not sure whether I should retain the one- or two-factor model. I'm also not sure whether I should look at other parameters/model estimates to determine how many factors to retain. Theoretically, both models make sense. I intend to use these models to conduct an IRT analysis (a uni- or multidimensional graded response model, depending on the number of factors I retain).

Thank you in advance!

Currently, I am performing a factor analysis on 6 items.

I read that the residual plot can be used to assess the assumptions of normality, homoscedasticity, and linearity. However, I do not understand which residuals to use for this analysis. Do I need to examine 15 different plots for each combination of the 6 items?
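On the count: pairwise residual plots correspond to the unordered pairs of the 6 items, C(6, 2) = 15, which is where that number comes from. A quick check:

```python
from itertools import combinations
from math import comb

# Hypothetical item labels for a 6-item factor analysis.
items = ["item1", "item2", "item3", "item4", "item5", "item6"]
pairs = list(combinations(items, 2))  # every distinct pair, order ignored

print(len(pairs))   # 15 pairwise plots
print(comb(6, 2))   # same count from the binomial coefficient
```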

Hello all,

This is my first time doing CFA AMOS.

Initially, I developed a 17-item, 5-factor scale for a specific industry based on theory from other industries. This proposed scale was tested with two datasets: the first with n=91 (year 1) and the second with n=119 (year 2), both from a single institution. EFA identified 3 underlying factors in both datasets; no items were deleted.

During year 3, a sample of n=690 consisting of participants from all over the nation was used for CFA in SPSS AMOS. The output follows:

1. Based on EFA (3 factors, 17 items)

a) Chisquare = 1101.449 and df= 116 [χ2/DF = 9.495]

b) GFI = 0.805

c) NFI = 0.898

d) IFI = 0.908

e) TLI = 0.892

f) CFI = 0.908

g) RMSEA =0.111 (PClose 0.000)

h) Variance

Estimate S.E. C.R. P Label

F1 .573 .056 10.223 ***

F2 .668 .043 15.453 ***

F3 .627 .040 15.620 ***

i) Covariance

Estimate S.E. C.R. P Label

F1 <--> F2 .446 .036 12.502 ***

F1 <--> F3 .365 .032 11.428 ***

2) Based on theory (5 factors, 17 items)

a) Chisquare = 440.594 and df= 109 [χ2/DF = 4.042]

b) GFI = 0.926

c) NFI = 0.959

d) IFI = 0.969

e) TLI = 0.961

f) CFI = 0.969

g) RMSEA =0.066 (PClose 0.000)

h) Variance

Estimate S.E. C.R. P Label

F1 .677 .047 14.334 ***

F2 .670 .043 15.493 ***

F3 .648 .054 12.100 ***

F4 .741 .061 12.103 ***

F5 .627 .040 15.620 ***

i) Covariance

Estimate S.E. C.R. P Label

F1 <--> F2 .503 .036 14.057 ***

F1 <--> F3 .581 .041 14.262 ***

F1 <--> F4 .546 .041 13.388 ***

F1 <--> F5 .398 .032 12.321 ***

F2 <--> F3 .457 .036 12.848 ***

F2 <--> F4 .403 .035 11.405 ***

F2 <--> F5 .458 .033 13.899 ***

F3 <--> F4 .553 .042 13.036 ***

F3 <--> F5 .360 .032 11.275 ***

F4 <--> F5 .358 .033 10.754 ***

My questions:

1. Do I have to normalize the data before the CFA? (I am finding conflicting information, since my scale is a Likert scale and extreme values are not really outliers.)

2. Can I report that the theory-based model is a better fit than the EFA-based model? Would doing so be appropriate?

3. Is there anything else I need to do?
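One way to address question 2: if the 3-factor model is treated as nested within the 5-factor model (same 17 items, with factors merged), a chi-square difference test can compare them. This is a sketch using the fit values reported above; with ordinal Likert items and robust estimation, a scaled difference test would be more defensible:

```python
from scipy.stats import chi2

# Reported values from the two AMOS runs above.
chisq_3f, df_3f = 1101.449, 116   # EFA-based 3-factor model
chisq_5f, df_5f = 440.594, 109    # theory-based 5-factor model

delta_chisq = chisq_3f - chisq_5f   # ~660.855
delta_df = df_3f - df_5f            # 7
p_value = chi2.sf(delta_chisq, delta_df)

print(delta_chisq, delta_df)
print(p_value)  # far below .05: the 5-factor model fits significantly better
```

A significant difference here favors the less constrained (5-factor) model, which agrees with the descriptive indices reported (lower RMSEA, higher CFI/TLI).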

Any guidance will be greatly appreciated.

Thank you,

Sivarchana

What is the best method or criterion for choosing the best item when cross-loading of items is evident in exploratory factor analysis? Thanks!

I am working on developing a new scale. On running the EFA, only one factor emerged clearly, while the other two factors were messy, with multiple items loading from different factors.

1. Can I remove the cross-loading items one by one, re-running the analysis each time, to reach a better factor structure?

2. If multiple items still load on one factor, what criteria should I use to determine what this factor is?

Can the measured variables that remain ungrouped in exploratory factor analysis be included as separate variables during the structural equation modeling (SEM) of the latent variables observed in factor analysis? Please help me with some references.

For my research project, I am adding new items to a previously validated scale. In the previous research, an exploratory factor analysis revealed a two-factor structure, but the internal consistency of one of the subscales was quite poor, so the aim of my study is to add new words to improve the internal consistency. Do I need to do another exploratory factor analysis, since the scale will now have new words, or can I do a confirmatory factor analysis because I'm still using the same scale?

I am developing a questionnaire and first performing an exploratory factor analysis. After I have the final factor structure, I plan on regressing the factor scores on some demographic covariates. Since I am anticipating missing item responses, I am thinking of imputing the item scores before combining them into factor scores (by average or sum).

I came across a paper that suggested using mice in Stata and specifying the factor scores as passive variables. I am wondering if this is the best approach, since I have read that passive variables may be problematic. Or are there any alternative solutions? Thank you!

Here is a link to the paper, and the stata codes are included in the Appendix.

Which items should we start excluding first: the items below the loading cut-off we have selected (0.32, 0.40, or 0.50), the items that do not load on any factor, or the items that load on two, three, or more factors? Is there a rule about the order of these procedures?

Hi everybody!

I'm performing EFA on a database of 400 observations containing 39 variables that I'm trying to group. I'm using maximum likelihood and applying a varimax rotation.

I have eliminated all the variables with communalities < 0.4. I know this can be a bit "relaxed", but overall I don't have communalities that are very high (0.67 is the highest, and that for only two variables). I then dropped the variables with loadings < 0.4 and eliminated the variables that cross-load (usually with loadings just above 0.4).

After performing all these steps, I have 3 clearly defined factors with 19 variables in total (F1: 8 variables, F2: 7 variables, F3: 4 variables). Is it acceptable to drop that many variables?

Thanks in advance!

Mauricio

Dear Research fellows,

I hope you are well and safe.

What are the factors to identify high potential employees?

Are there any differences between these factors among different companies around the world?

I have not found many studies using exploratory factor analysis on high-potential employees in different companies.

I would be grateful to learn more about this subject.

Thanks in advance,

I developed my own survey based on previous themes from qualitative data. Pairs of questions were extracted from subscales from previous questionnaires in the field, and then adapted to fit the survey context. I have now completed data collection and I ran a CFA on the 'a-priori' factors and questions that were developed, and the model fit wasn't great!

I went back to the data and conducted an EFA to see what factors did work together, and the reliability, plus overall model fit when doing CFA was much, much better. The factors extracted during EFA weren't that far from the original themes, except for a couple of questions being moved around.

Therefore, my question is - is this a done thing? As this was a data-driven survey, would it be acceptable to run EFA and go by this factor structure to continue with the rest of my stats? Or should I just stick with the original 'a-priori' factor structure and deal with the consequences?

Thanks!

Hi all,

I've conducted an exploratory factor analysis with 365 participants, and the data is largely reliable and has gone according to plan. However, when looking at the rotated component matrix (varimax rotation), one of my questions doesn't load onto ANY factor whatsoever (no number is displayed). I've tried to find answers to this, but can't seem to find any papers that explain what to do when this happens. I would assume that in this case I should discount the question altogether?

Can anyone advise on best practice, or guide me towards a paper that may answer my question?

Thanks!

Hello SEM Fans,

I conducted an ESEM with target rotation using personality data.
1) Am I right that with target rotation, the calculation can only be made using listwise deletion? In that case, I would pay for a better model fit with lower statistical power, right?
2) Is there any possibility to simulate missing data in a way multiple imputation does?
3) In the case that ESEM is not possible because too many cases have missing data, could it be a solution to do a regular SEM with Bartlett's factor scores from SPSS instead of latent variables?

Thank you and best regards
Stefan

Hi everybody!

I have a market research study in which we asked customers to evaluate the company they buy from most on multiple attributes (30), on a scale of 1 to 10. We want to understand the different market segments that might exist based on this evaluation. We have around 300 answers.

We have no prior construct for how the different attributes might load onto different factors, so we are using EFA. My question is whether we should also perform a CFA after the EFA. I have been doing some research about this, and some people say you can run CFA on the same data as the EFA, while others say you shouldn't. Any opinions? Our ultimate goal is to get the factor scores for each customer and perform a segmentation using k-means.

Thanks in advance!

I am currently carrying out research on retail service quality and customer loyalty.

**Brief Description of the Study:**

In our scope, Retail Service Quality is the independent variable. It has five dimensions (Physical Aspects, Reliability, Personal Interaction, Problem Solving, and Policy), measured by 22 questions or items. Meanwhile, Customer Loyalty is the dependent variable, measured by 8 questions or items.

**Research Methodology**

First of all, I intend to perform EFA to factorize the items of the independent variable (22 questions), which will give the unique and uncorrelated factors under the independent variable.

Meanwhile, I don't want to perform the EFA on all 30 questions or items, of which 8 items measure the dependent variable. Generally, there is a correlation or association between independent and dependent variables; therefore, I used only the 22 items under the independent variable to perform the EFA and construct the unique model.

Secondly, I wish to perform CFA with the items of the factors identified under the independent variable via EFA, together with the items of the dependent variable that were not factorized via EFA (8 items), to confirm the whole model, which contains both independent and dependent variables in the measurement model.

Thereafter, I would also like to perform Structural Equation Modeling (SEM) to find the pathways between the independent variables (Physical Aspects, Reliability, Personal Interaction, Problem Solving, and Policy) and the dependent variable, Customer Loyalty.

**Sampling Adequacy**

What is the required sample size for advanced statistical tools such as Exploratory Factor Analysis and Structural Equation Modeling?

Is a sample size of two thousand (2,000) enough, too much, or too little to get a significant model?

If it is too much, how can we handle it smoothly and still obtain a significant model?

I need confirmation regarding Exploratory Factor Analysis (EFA). From the literature I have read, we must include all variables and items in the research model when running EFA.

When a new research model is developed by combining 2 established theories, is there any possibility and justification for running EFA on theory A first and then EFA on theory B? Or can I separate the EFA based on independent and dependent variables?

Please advise...

Hi,

I am running an Exploratory Factor Analysis (EFA) on SPSS and trying different extraction methods. The Principal Component Analysis (PCA) suggested 4 factors whose Eigenvalue (EV>1). The Parallel Analysis suggested that factor 4 be dropped. This is also consistent with the Scree plot output.

When trying the same procedure using Principal Axis Factoring (PAF), I immediately get 3 factors with EV>1. However, when I try to ascertain that result in the Parallel Analysis, all EV I get are lower than 1.

While I intend to proceed with a 3-factor solution, I am nevertheless curious to know what the Parallel Analysis result (EV < 1) suggests about the extraction method. Thank you for your kind suggestions.

Regards,

Kimo
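For context, parallel analysis retains factors whose observed eigenvalues exceed those from random data of the same dimensions; with PAF, both observed and random eigenvalues come from reduced correlation matrices, which is why all of them can legitimately fall below 1. A minimal PCA-based sketch with simulated data (one strong common factor):

```python
import numpy as np

def parallel_analysis(data, n_iter=100, percentile=95, seed=0):
    """PCA-based parallel analysis: retain factors whose observed
    eigenvalues exceed those of random normal data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    random_eigs = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.normal(size=(n, p))
        random_eigs[i] = np.sort(
            np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    return int(np.sum(observed > threshold)), observed, threshold

# Toy data: 6 variables driven by one strong common factor.
rng = np.random.default_rng(1)
f = rng.normal(size=(300, 1))
data = 0.7 * f + 0.5 * rng.normal(size=(300, 6))

n_factors, observed, threshold = parallel_analysis(data)
print(n_factors)  # only the first eigenvalue beats the random-data threshold
```

The PAF variant replaces the diagonal with squared multiple correlations before the eigendecomposition, for both the observed and the random matrices; the retention rule (observed beats random) stays the same, regardless of whether the eigenvalues are above or below 1.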

I am working on validating a multi-dimensional scale with 8 variables and 56 items in total. How large a sample size is required to run exploratory factor analysis?

There are two types of Factor Analysis, as we know: Confirmatory Factor Analysis (CFA) and Exploratory Factor Analysis (EFA). Do you know of any other kind?

Hello researchers,

I have a four-factor model determined a priori (some emergent theory) to which my data fits quite well when I do confirmatory factor analysis (CFI>0.98; TLI>0.98; RMSEA<0.03; SRMR<0.04; BIC-sample adjusted = 8107)

When I first perform exploratory factor analysis on the data, I arrive at a two-factor model that combines the items of the four-factor model: factors 1 & 2 of the four-factor model (with over 85% covariance) load highly on the first factor of the two-factor model, and factors 3 & 4 (with over 85% covariance) load highly on the second factor. On conducting another CFA, the data fits the two-factor model from the EFA quite well (CFI>0.97; TLI>0.97; RMSEA<0.03; SRMR<0.04; BIC-sample adjusted = 8117).

The 2-factor structure seems simpler in terms of the number of factors but has **slightly** lower fit statistics. (Q1) How do I interpret these differences between the two structures? I.e., what can go wrong with the interpretation if I rank and prioritize items by their regression weights on the respective factors using the two-factor structure rather than the original four-factor structure? (Q2) Which model comparison tests (or theoretical arguments) can I use to determine which of the two models best explains the underlying structure of the phenomenon? Any thoughts?
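One standard answer to (Q2) is an information criterion comparison: BIC penalizes the extra parameters of the four-factor model, so the model with the lower BIC is preferred even if its raw fit is slightly worse. A minimal sketch (the log-likelihoods, parameter counts, and sample size below are hypothetical illustrations, not the questioner's actual values):

```python
from math import log

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: -2*logL + k*ln(n).
    Lower values indicate a better trade-off between fit and parsimony."""
    return -2.0 * log_likelihood + n_params * log(n_obs)

# Hypothetical illustration: a model with a slightly worse likelihood
# but fewer free parameters can still win on BIC.
bic_4factor = bic(log_likelihood=-3950.0, n_params=58, n_obs=400)
bic_2factor = bic(log_likelihood=-3965.0, n_params=49, n_obs=400)
```

A BIC difference of roughly 10 or more is conventionally read as strong evidence for the lower-BIC model; smaller differences (like the 8107 vs. 8117 reported above) are weaker and leave more room for the theoretical argument to decide.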

I am currently trying to create a scale to measure a multi-dimensional parenting construct. There is currently no strongly established theory about the construct and I am investigating it in an age group that has not typically been the focus of parenting researchers. I created a list of 26 items based on a qualitative study and have done an EFA on the data. Almost half of the items are skewed and some are quite kurtotic due to low base rates of those parenting behaviours. However, I believe that these items are theoretically relevant to the construct of interest. Due to high skew/kurtosis/presence of non-normality, I used polychoric correlations for the EFA. A 3 factor solution was recommended.

My questions are:

1) The determinant of the matrix is less than .00001 but Bartlett's and KMO are good (fit indices are generally good as well). I have read in previous discussions online that <.00001 determinants may arise due to high kurtosis in items. Does anyone know of a reference/resource that explains this in more detail and/or has recommendations that it's not the end of the world?

2) A number of the skewed/kurtotic items have low communalities (<.40) even though their factor loadings are above .40. What are the best practices or existing rules of thumb for eliminating items to refine the scale? Should I delete the items with low communalities (despite the sufficient factor loadings) and then re-run the EFA? Or should I delete items based on low factor loadings (<.40) and then re-run the EFA? If the latter, would it be necessary to do anything with (i.e., eliminate) the items that have low communality, or just leave them?

Thanks very much in advance.
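The loading-vs-communality situation in question 2 is easy to reproduce numerically: with orthogonal factors, an item's communality is the sum of its squared loadings, so a .45 loading alone only yields a communality near .20. A small numpy sketch (the loading matrix is a made-up toy example):

```python
import numpy as np

# Toy loading matrix (hypothetical): 5 items on 2 orthogonal factors.
loadings = np.array([
    [0.70, 0.10],
    [0.65, 0.05],
    [0.45, 0.15],   # passes a .40 loading cut-off...
    [0.10, 0.80],
    [0.05, 0.75],
])

# Communality = sum of squared loadings across factors (orthogonal case).
communalities = (loadings ** 2).sum(axis=1)

# ...yet item 3's communality is only ~.23, below a .40 threshold.
low_communality = communalities < 0.40

# A near-zero determinant of the item correlation matrix signals
# near-redundant (highly collinear) items; here the model-implied
# matrix R = L L' + diag(uniquenesses) is well-conditioned.
R = loadings @ loadings.T + np.diag(1.0 - communalities)
det_R = np.linalg.det(R)
```

This is why the two elimination rules can disagree: the loading criterion looks at one factor at a time, while the communality criterion measures how much of the item's total variance the whole factor solution explains.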

Hello.

I conducted a pilot survey (n=67). I modified the survey based on respondent feedback and a principal component analysis, and I administered it to a larger sample (n=561). I ran an exploratory factor analysis on the new data, omitting the demographic variables. I obtained three factors that work out nicely. The first two factors yielded acceptable Cronbach's alpha coefficients (.784 and .772, respectively), but the third factor has a coefficient of .485. One of my research questions was whether factor 3 affects factor 1, and the factor correlation matrix shows a correlation coefficient of 0.488.

My questions are:

1) Is it okay to omit demographic variables in an exploratory factor analysis? I am not testing demographics as a construct; I want to find the correlations between the different demographics (e.g., type of math, grade level, school location) and the factors.

2) Do I conduct further statistical analysis (i.e., frequency and correlation) on the first two factors only, since those have acceptable Cronbach's alpha but the third one does not?

3) I heard that any correlation of 0.32 and above between factors is considered acceptable. Is that true? If that is the case, should I say that factors 1 and 3 have a strong correlation but then indicate that the low internal reliability of factor 3 does not allow us to generalize the results to the general population?

4) In general, if we ever eliminate any variables in an EFA, do we completely ignore those variables in further statistical analysis?

I'm sorry for so many questions. I have found a lot of information about how to conduct an EFA, but not what to do with and after the results.

Thank you in advance for your help.
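The .485 alpha for factor 3 above is easy to sanity-check by hand, since Cronbach's alpha depends only on the item variances and the variance of the total score. A minimal sketch with simulated data (the simulated items are illustrative, not the questioner's survey):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) array:
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var / total_var)

# Demo: 4 items sharing a common trait give high alpha;
# 4 unrelated items give alpha near zero.
rng = np.random.default_rng(0)
trait = rng.standard_normal((200, 1))
coherent = trait + 0.7 * rng.standard_normal((200, 4))
unrelated = rng.standard_normal((200, 4))
alpha_hi = cronbach_alpha(coherent)
alpha_lo = cronbach_alpha(unrelated)
```

Note that alpha also rises mechanically with the number of items, so a short factor (like a 2- or 3-item factor) can show a low alpha even when its items are reasonably correlated.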

Many studies use Exploratory Factor Analysis followed by Confirmatory Factor Analysis when theory is available. My research has no established model; can I apply Exploratory Factor Analysis alone, and can the result be taken as a developed scale?

Factor analysis is classified into two types: exploratory and confirmatory factor analysis. I need a clear explanation of the difference between the two.

Thank you in Advance!!!

I designed a questionnaire about job health. Data collection and an exploratory factor analysis have been completed. Can I use the participants' responses to report the job health of the sample?

I have seven latent variables, but the exploratory factor analysis yielded only six factors, and some questions loaded on different factors than intended. My query: when specifying the CFA model, should I use the factor assignments from the EFA, or should I keep the seven latent variables from my original questionnaire and ignore the EFA results?

Graduate research (Ph.D.) students very often face the dilemma of whether to use EFA along with CFA (or CCA) when conducting measurement model analysis in SEM in basic business research with perception-based latent variables. Finally, there is an article (reference given below) that discusses these construct measurement concepts in detail from PLS-SEM and CB-SEM perspectives.

**Hair, J. F., Howard, M. C., & Nitzl, C. (2020). Assessing measurement model quality in PLS-SEM using confirmatory composite analysis. Journal of Business Research, 109, 101-110. doi:10.1016/j.jbusres.2019.11.069**

I hope this well-written paper will now serve as a primary reference for all those who seek relief from the long-standing confusion about the use of EFA/CFA/CCA when conducting measurement model analysis, especially in SEM (PLS-SEM and CB-SEM).

I hope respected business research scholars will further enlighten our thinking on the matter by providing their insights on the topic.

Best regards.

What can be the advantages of performing EFA prior to CFA, when we are building an instrument against a pre-defined conceptual framework?

From a factor analysis perspective, what is the difference between exploratory factor analysis and confirmatory factor analysis? What would be the ideal sample size for EFA and CFA?

I have reviewed many articles about pilot studies; in some, the authors run only a reliability test, while in others they run an EFA along with the reliability test. Is it necessary to run an EFA along with the reliability test when the items have already been adapted from the literature and content validity has been established by calculating the content validity index (CVI)? (Note that my sample size is less than 50.)

- Most of us use adapted scales when conducting research, but there is confusion about the application of exploratory factor analysis (EFA) to such scales. If the constructs are already identified in the questionnaire, should we apply EFA to the full scale, or should it be applied to each construct separately?

Hi,

I conducted a principal axis factoring (PAF) with oblimin rotation on a scale with 5 items. Two factors were extracted (using Kaiser's criterion (eigenvalue > 1) and scree plot analysis): 2 items load on factor 1 (F1) and 3 items on factor 2 (F2). Items do not cross-load.

All items are measuring the feeling of efficacy but different dimensions of it. F1 seemingly measures self-efficacy (my family & friends (item 1), and I (item 2) can do something to...). F2 describes collective efficacy (people in general (item 3), businesses (item 4), governments (item 5) can do something to...).

If I force PAF to extract only 1 factor (to measure "general" efficacy), all item factor loadings are still >.5, and Cronbach's alpha of the 5-item scale is above .7.

However, I am not sure which would be the better way to proceed in building a regression model with "efficacy" as one or two independent variables: go for the 1-scale solution (to avoid a 2-item "self-efficacy" scale) and simply mention that efficacy has two dimensions measured together in one variable, or split efficacy into two scales? I have seen both approaches in the literature.

Hi,

I have performed factor analysis in order to categorize 31 variables into major groups. The result yielded 9 factors; however, in the rotated component matrix I found two factors with only one variable each, though with high loading values. Is it acceptable to keep a factor with only one variable, or do I have to delete it?

Hi,

I want to construct an index **'X'** (like a 'wealth index' or 'gender development index') with three hierarchical scores (**low-medium-high**) for Bangladesh. I have found a study in which the authors initially included 8 factors that are generally considered valid domains of 'X' in Bangladesh. After EFA and CFA, they excluded 3 factors and declared the remaining 5 factors the 'valid factors' of 'X'. **I will use DHS data, as they did.** Now, **my question is: "If I want to create an index of 'X' for Bangladesh with low-medium-high scores** (and run a further logistic regression), **then, with proper citation of that study, can I use those 5 valid factors/variables directly in my study** (without a further CFA)?"

I am writing a paper assessing the unidimensionality of multiple-choice mathematics test items. The test was scored right or wrong, which means the data are on a nominal (dichotomous) scale. Some earlier studies I have consulted used exploratory factor analysis, but with my limited experience in data management, I think ordinary factor analysis may not work here. Unidimensionality is one of the assumptions for dichotomously scored items in IRT. I need professional guidance, if possible including recommended software and a manual.
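One common rough check for unidimensionality of dichotomous items is to eigendecompose the inter-item correlation matrix and look at the ratio of the first to the second eigenvalue; a dominant first eigenvalue suggests a single underlying dimension. A minimal sketch (the simulated Rasch-like data are an illustrative assumption; tetrachoric correlations, e.g. via dedicated IRT software, are preferable to the Pearson/phi correlations used here for brevity):

```python
import numpy as np

def eigen_ratio(binary_scores):
    """Rough unidimensionality check for 0/1 item scores: eigendecompose
    the inter-item (phi) correlation matrix and return the ratio of the
    first to the second eigenvalue. A large ratio (rules of thumb often
    cite > 3) is read as evidence of one dominant dimension."""
    R = np.corrcoef(np.asarray(binary_scores, dtype=float), rowvar=False)
    ev = np.sort(np.linalg.eigvalsh(R))[::-1]
    return ev[0] / ev[1]

# Demo: 8 fairly discriminating items driven by a single latent trait,
# so the first eigenvalue should clearly dominate the second.
rng = np.random.default_rng(2)
theta = rng.standard_normal((500, 1))          # one latent ability per examinee
difficulty = np.linspace(-1.0, 1.0, 8)         # item difficulties
p_correct = 1.0 / (1.0 + np.exp(-2.0 * (theta - difficulty)))
scores = (rng.random((500, 8)) < p_correct).astype(int)
ratio = eigen_ratio(scores)
```

For a proper analysis of dichotomous items, factor analysis of tetrachoric correlations or an IRT-based dimensionality test (e.g. as implemented in R's `mirt` or `psych` packages) would be the more defensible route.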