# How to interpret factor scores from Exploratory Factor Analysis?

I've applied several factor extraction methods to a fairly small dataset (low-level features extracted from image content). My problem is interpreting the factor scores I obtain, which range from negative to positive values with no known minimum or maximum. The handbooks I have read mostly explain how to conduct a factor analysis and very rarely discuss how to interpret the output.

## Popular Answers

Frederick H. Navarro · Walden University

If the observable measures captured the dimensions "fun", "hard", "happy", "tough", and "sad", then hidden Factor 1 can be described as "fun" (loading = .80), "happy" (loading = .75), and "NOT Sad" (loading = -.89).

I hope this helps!

Frederick H. Navarro · Walden University

Factor Analysis and Complex Systems

Factor analysis is also a technique to identify emergent multistability within the phase space of a system of responses to a set of measures. The Five Factor Model of personality is an inductive model (Friedman & Schustack, 2012) which identified the emergent multistable factors (e.g., Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) from repeated factor analyses of behavioral-affective measures (Digman, 1990).

From a complex dynamic systems perspective, the Five Factors can be thought of as “global structural patterns, which emerge from (both linear and non-linear) interactions among the system’s components through phase space…characterized as emergent (attractor-based) collectives.” (Juarrero, 2012, Abstract, parenthetical phrases added). A system’s phase space represents all possible states of the system, and the attractors within the phase space represent regions of multistability and different trajectories of the system.

Example statements used by Holgado-Tello, Carrasco-Ortiz, del Barrio Gándara, and Moscoso (2009) to detect the Five Factors included “I share my things with other people” and “I like to watch the TV news and to know what happens in the world” (p. 78). Based on the 5-point scale Holgado-Tello et al. (2009) used to assess the 65 statements, the size of the phase space for the entire system of items and response options can be calculated as 5^65, or about 2.7 × 10^45.
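The phase-space figure can be checked with a line of arithmetic. The sketch below assumes 65 items with 5 response options each, so that a respondent's full set of answers is one of 5^65 possible patterns; the variable names are illustrative only.

```python
# Count the distinct response patterns for 65 items on a 5-point scale.
n_items = 65      # statements in the questionnaire
n_options = 5     # response options per statement

phase_space_size = n_options ** n_items   # exact integer arithmetic
print(f"{float(phase_space_size):.1e}")   # on the order of 10^45
```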

Assuming the responses to these statements are based on behavioral tendencies in specific contexts (e.g., occasions where “sharing with others” can occur or when the news is on TV), the responses can be said to be contextually bound and linked to specific environmental-social contexts.

So the Five Factors can be said to “emerge” from the response behavior of large numbers of individuals to dynamically changing environmental-social situations consistent with complex adaptive systems (Miller & Page, 2007).

References

Digman, J. M. (1990, February). Personality structure: emergence of the five-factor model. Annual Review of Psychology, 41, 417-440. doi: 10.1146/annurev.ps.41.020190.002221.

Friedman, H. S. & Schustack, M. W. (2012). Personality: Classic theories and modern research (5th Ed.). MA: Allyn & Bacon ISBN: 0-205-05017-4

Holgado-Tello, F., Carrasco-Ortiz, M., del Barrio Gándara, M., & Moscoso, S. (2009). Factor analysis of the Big Five Questionnaire using polychoric correlations in children. Quality & Quantity: International Journal of Methodology, 43(1), 75-85. doi:10.1007/s11135-007-9085-3.

Juarrero, A. (2012). Complex dynamical systems theory. The Cognitive Edge. Retrieved from http://www.cognitive-edge.com/uploads/articles/100608%20Complex_Dynamical_Systems_Theory.pdf

Miller, J. H. & Page, S. E. (2007). Complex adaptive systems. New Jersey: Princeton University Press.

## All Answers (39)

Mohammad Tahir · Sugar Crops Research Institute Mardan

I would like you to go through this page:

http://www.hawaii.edu/powerkills/UFA.HTM for this description:

......"Scaling. A scientist often wishes to develop a scale on which individuals, groups, or nations can be rated and compared. The scale may refer to such phenomena as political participation, voting behavior, or conflict. A problem in developing a scale is to weight the characteristics being combined. Factor analysis offers a solution by dividing the characteristics into independent sources of variation (factors). Each factor then represents a scale based on the empirical relationships among the characteristics. As additional findings, the factor analysis will give the weights to employ for each characteristic when combining them into the scales. The factor score results are actually such scales, developed by summing characteristics times these weights. "

Hope this helps...
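The idea in the quoted passage, that a factor score is a sum of characteristics times weights, can be sketched in a few lines. This is only an illustration under stated assumptions: the data are simulated, the weights come from the first principal component of the correlation matrix (one of several possible scoring methods), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 cases, 4 characteristics driven by one common source.
common = rng.normal(size=(200, 1))
X = common @ np.array([[0.9, 0.8, 0.7, -0.6]]) + 0.5 * rng.normal(size=(200, 4))

# Standardize each characteristic (mean 0, SD 1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Leading eigenvalue and eigenvector of the correlation matrix.
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)      # eigenvalues in ascending order
lam, v = eigvals[-1], eigvecs[:, -1]

loadings = v * np.sqrt(lam)               # correlation of each variable with the component
weights = v / np.sqrt(lam)                # score coefficients
scores = Z @ weights                      # one scale value per case: sum of variables times weights

print(np.round(loadings, 2))
print(round(scores.mean(), 3), round(scores.std(ddof=1), 3))
```

Scores built this way are standardized, so a case's score says how many standard deviations it lies above or below the sample average on that scale. The overall sign of a component is arbitrary, so only the relative signs of the loadings carry meaning.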

Janavi Kumar · Kansas State University

http://en.wikipedia.org/wiki/Factor_analysis#Criteria_for_determining_the_number_of_factors

Then choose the factor loading cut-off level. To help visually, you can copy your factor scores into Excel and paste the scores for each factor into a separate column (three factors, three columns). The scores you treat as significant are those above the cut-off level, so negative scores are out. With this information, consider your research question and see what makes the most sense. Now consider the remaining scores on each factor: the closer they are to 1, the more important they are in explaining the variation in that factor. There may be relationships among the elements in each factor that the research question or background information can help you interpret. Do the same for the other factors.

"More than one interpretation can be made of the same data factored the same way, and factor analysis cannot identify causality."

Bernardo Chaves · Washington State University

Michael T Weaver PhD, RN, FAAN · University of Florida

Assuming that you used a principal components extraction, and incorporated all variables in each retained factor, then each factor contains a portion of the total variance that is reproducible by a linear combination of the items factored.

You MAY be able to identify (stick a name on) each of the factors you retain by looking at the sign of the weights that make up the loadings; doesn't always work out, though.

Using a rotation may help clarify what each factor represents...
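To illustrate how a rotation can clarify what each factor represents, here is a minimal sketch of varimax using a common SVD-based iteration. It is not the routine of any particular statistics package, and the loading matrix is hypothetical: a clean two-factor pattern deliberately hidden by a 30-degree rotation.

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix toward simple structure."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like matrix of the varimax criterion at the current rotation.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                      # nearest orthogonal matrix
        if criterion > 0 and s.sum() < criterion * (1 + tol):
            break                              # criterion has stopped improving
        criterion = s.sum()
    return loadings @ rotation, rotation

# Hypothetical loadings with a clean two-factor structure...
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
# ...hidden by an arbitrary 30-degree rotation, as an unrotated solution might be.
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(unrotated, 2))   # every variable loads on both columns
print(np.round(rotated, 2))     # each variable loads mainly on one column
```

The rotation is orthogonal, so each variable's communality (its row sum of squared loadings) is unchanged; only the way the explained variance is split between factors changes.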

Scot W McNary · Towson University

Factor scores, on the other hand, are composites of the variables that are used to make the latent factor into an observed variable; to give it a scale. You would use factor scores, for example, when the factor was of interest to use as a predictor or outcome in a regression analysis and you were not planning to use latent variable modeling methods.

Many different factor score computation methods result in factor scores with mean = 0 and SD = 1. Various statistical packages have an assortment of these methods for computing factor scores, but they all produce very highly correlated alternatives. Even a unit weighted factor score method (for all variables with loadings > .3 weight = 1, else weight = 0) produces factor scores very highly correlated with other methods. There's a good paper by Grice on factor scores (see his helpful website, pdf of the paper is there too: http://psychology.okstate.edu/faculty/jgrice/factorscores/).
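The claim that unit-weighted scores track weighted scores closely can be checked on simulated data. The sketch below is an illustration, not a reproduction of any package's scoring method: it assumes a single common factor, uses first-principal-component weights as the "weighted" method, and applies the > .3 rule described above for the unit weights. All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Hypothetical one-factor data; the last variable is pure noise.
true_loadings = np.array([0.8, 0.7, 0.7, 0.6, 0.6, 0.0])
f = rng.normal(size=(n, 1))
E = rng.normal(size=(n, true_loadings.size))
X = f * true_loadings + E * np.sqrt(1 - true_loadings**2)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
lam, v = eigvals[-1], eigvecs[:, -1]
if v.sum() < 0:                      # the sign of an eigenvector is arbitrary
    v = -v
loadings = v * np.sqrt(lam)

weighted_scores = Z @ (v / np.sqrt(lam))           # mean 0, SD 1 by construction
unit_weights = (loadings > 0.3).astype(float)      # weight 1 if loading > .3, else 0
unit_scores = Z @ unit_weights

r = np.corrcoef(weighted_scores, unit_scores)[0, 1]
print(np.round(loadings, 2), round(r, 3))
```

In practice this means a score of, say, +1.5 is read the same way whichever method produced it: the case sits about one and a half standard deviations above the sample mean on that factor.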

Alessandro Giuliani · Istituto Superiore di Sanità

http://pubs.acs.org/doi/abs/10.1021/ci2005127

Clearly you must know the sense of what you have measured, but I take it for granted that a scientist knows what he or she is doing; this work of exegesis is crucial for your analysis.

Having given a name to your components, you turn to the scores and, depending on the system at hand, you may see that component 1 is significantly related to an external variable Y (say a drug treatment, a continuous variable such as age, a location, or anything else you are interested in; after all, if you performed a study, you wanted to test some hypothesis about the outside world). Component scores are like any other variable, and you can correlate them with any feature of interest of your statistical units.

In some other cases it is more convenient to start from the factor scores, as in microarray analysis, where the genes are the statistical units. For example, you may find that the loadings on factor 2 separate cancer patients from healthy subjects (interpretation by loadings). You then turn to the scores, order them by absolute value, and look at which genes have the highest scores on the discriminating component. A biologist (you, or one you work with) will look at these 'extreme relevant genes' and say something about the biology, like here:

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0013668

http://www.febsletters.org/article/S0014-5793(01)02973-8/abstract

or even here:

http://www.biomedcentral.com/1471-2105/7/194/abstract/

Reynaldo Rocha-Chávez · National Polytechnic Institute

Phei-Chin Lim · University Malaysia Sarawak

Thank you Bernardo Chaves, Michael Weaver, Scot McNary and Reynaldo Rocha-Chávez for your suggestions.

To check my understanding of factor analysis: there are two types of output, factor loadings and factor scores. Since we are approaching our research question from an exploratory perspective, or trying to build a model, we try different combinations and methods with different numbers of factors and cut-off levels before settling on a 'suitable' configuration. The rotated factor loadings are then used to interpret the factors before giving each a name or label. As I read somewhere, factor scores can be treated like any other variable for further investigation. Hence, we use factor scores to obtain scales for the named factors (e.g. something like the scale used in a questionnaire: strongly agree, agree, neutral, disagree, strongly disagree). Am I on the right track?

Alessandro Giuliani · Istituto Superiore di Sanità

On the contrary, if you rotate you completely lose the physical meaning of the components, and this is very detrimental to your analysis.

Reynaldo Rocha-Chávez · National Polytechnic Institute

Alessandro Giuliani · Istituto Superiore di Sanità

This is a very interesting point. Those who work in psychology or sociology want factor analysis to confirm their prior statements (this is the reason for rotation): the 'conceptual content' is given, and statistics is used to obtain usable scales that reflect that conceptual content.

Those who work in the natural sciences (physics, chemistry) do exactly the opposite. The components (by the way, this name comes from analytical chemistry, where it indicates the components of a mixture) give rise to the correlation structure of the variables (e.g. the peaks of a spectrum), so the same variable can be influenced by different components, as in the composition of forces in physics, or as with different molecules sharing the same absorbance peak (among many others they do not share).

I think this should be the right attitude in psychology too (see the attached paper), because if we are not sure of the stability of our constructs across different data sets in physics... can you imagine in psychology?

But again this is a complex and intermingled question....

Stanisław Matusik · Akademia Wychowania Fizycznego im. Bronisława Czecha w Krakowie

Stanisław Matusik · Akademia Wychowania Fizycznego im. Bronisława Czecha w Krakowie

Reynaldo Rocha-Chávez · National Polytechnic Institute

I think this difference between the social sciences and the natural sciences comes from the difference in the type of objects studied. In analytical chemistry there is an implicit assumption that the nature of the objects of study does not depend on the physical place where they are. In social science, however, the stability of a factorial structure depends on a specific culture. So the technique in the social sciences is to repeat a study until the cultural limits of that factorial structure are discovered.

On the other hand, here is a clarifying example. Take in each row different measures of one specific cylinder; each row may be selected by a random process. In the columns one can put height, area of the base, lateral area, base radius, and others. The solution without rotation could not be interpreted; the rotated solution, on the other hand, could easily be interpreted in terms of the differences in the nature of the measures: lengths versus areas, measures of the base versus the height of the cylinder.

Allan John Brimicombe · University of East London

Reynaldo Rocha-Chávez · National Polytechnic Institute

I agree with your viewpoint.

Allan John Brimicombe · University of East London

I always understood that the positive and negative signs in the rotated solution reflected the positive and negative correlations in the correlation matrix... I've tried to find a reference for this but couldn't run one to ground, so I could be wrong.

Alessandro Giuliani · Istituto Superiore di Sanità

You say: 'On the other hand, here is a clarifying example. Take in each row different measures of one specific cylinder; each row may be selected by a random process. In the columns one can put height, area of the base, lateral area, base radius, and others. The solution without rotation could not be interpreted; the rotated solution, on the other hand, could easily be interpreted in terms of the differences in the nature of the measures: lengths versus areas, measures of the base versus the height of the cylinder.'

The point is exactly what you are saying. In the natural sciences we are interested in the cylinder's orientation and specific shape, given that the nature of the measures is already known and thus not interesting at all. In other words, the measures are simply probes of reality and not the other way around. As a matter of fact, any scientific paper starts with a description of the measures (already known) and ends with the specific correlation structure they acquire when they interact with a 'piece of matter' (whatever this piece of matter is: a protein, a set of rats, a population...), given that the aim is to increase knowledge about the piece of matter (i.e. the real thing). It is like here, where a set of well-known and clearly defined measures is made to interact with a set of proteins so that their correlation structure highlights the specific structural principles of the proteins (not the nature of the measures, which we already know). In this case it is mandatory to use a physically motivated solution such as unrotated PCA.

Thank you for this exchange of ideas.

Alessandro

Mohamed Ismail Mohideen Bawa · South Eastern University of Sri Lanka

Factor scores are useful to rank and prioritise the factors. Then you can suggest which factors are most important and need more focus. Recommendations should also be based on these rankings.

John F Golding · University of Westminster

On a practical basis, I have found it useful to suppress variable loadings < .4 on each factor. This cleans up the overall factor picture and makes it easier for the reader to understand the factor table. I do not think there is any hard and fast rule, but I would be interested to learn of one.
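Suppressing small loadings for presentation is easy to do by hand. The sketch below assumes a hypothetical two-factor loading table and the .4 cut-off mentioned above; the first-factor values for "fun", "happy" and "sad" echo the example given earlier in the thread, and the rest are made up.

```python
import numpy as np

items = ["fun", "happy", "sad", "hard", "tough"]
# Hypothetical loadings on two factors (rows follow the order of `items`).
loadings = np.array([[0.80, 0.12],
                     [0.75, -0.05],
                     [-0.89, 0.20],
                     [0.10, 0.72],
                     [0.35, 0.66]])

cutoff = 0.4
# Keep a loading only if its absolute value reaches the cut-off; blank the rest.
display = np.where(np.abs(loadings) >= cutoff, loadings, np.nan)

for name, row in zip(items, display):
    cells = ["     " if np.isnan(x) else f"{x:5.2f}" for x in row]
    print(f"{name:<6}", *cells)
```

The comparison uses the absolute value, so a strong negative loading such as -.89 stays in the table; only the display changes, not the fitted model.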

Allan John Brimicombe · University of East London

You really want to be working with rotated factor scores to interpret them. Varimax rotation is most commonly used. This ensures that the correlation between factors (also referred to as components or dimensions) is minimised. (Don't forget to use the scree plot to limit the number of dimensions you are working with to what is absolutely necessary.) Then a rule of thumb is that any variable scoring 0.7 or more in a dimension is a key one for interpreting what that dimension means. In fact, in exploratory factor analysis you would progressively weed out variables that do not have a rotated score of 0.7 or more in any dimension. The end result is often that each variable contributes strongly (0.7 or more) to only one dimension, and the interpretation of each dimension then becomes much easier.

Ashwini Zaware · National Environmental Engineering Research Institute

Once you know that some components describe most of the factor they belong to, what are we supposed to do next? Do we remove the describing components and run the analysis again? And then, once sufficient factor analysis has been carried out, what is the next step?

Allan John Brimicombe · University of East London

In response to Ashwini's question of what next...

If, as I have suggested, for an exploratory analysis you have weeded out all variables that do not contribute strongly (>= 0.7 rotated score) in any dimension, you are usually left (in my experience) with each variable contributing strongly to only one dimension, and you are therefore better able to describe unambiguously what each dimension is. The purpose of the factor analysis is usually to follow through to multiple linear regression (and therefore you shouldn't include the dependent variable in the factor analysis). You can do one or more of three things:

However, remember, just because you find n dimensions from your collection of variables, it doesn't mean they will all significantly contribute in a multiple linear regression model. You may still need to weed out some of the dimensions in building a good regression model.
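The point that not every dimension need contribute to a regression can be illustrated with simulated data. This is a sketch under stated assumptions: two hypothetical dimensions (four variables driven by one source, two by another), unrotated principal component scores as predictors, ordinary least squares via `numpy.linalg.lstsq`, and an outcome that by construction depends only on the first source.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400

# Hypothetical predictors: four variables driven by f1, two driven by f2.
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.4 * rng.normal(size=n) for _ in range(4)]
    + [f2 + 0.4 * rng.normal(size=n) for _ in range(2)]
)
# The dependent variable is kept OUT of the factor analysis.
y = 2.0 * f1 + rng.normal(size=n)

# Scores on the two largest components of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
top = np.argsort(eigvals)[::-1][:2]
scores = Z @ (eigvecs[:, top] / np.sqrt(eigvals[top]))

# Regress y on an intercept plus the two score columns.
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
print(np.round(coef, 2))   # intercept, dimension 1, dimension 2
```

Here the second dimension is a perfectly real feature of the predictors, yet its coefficient should come out small next to the first, which is the situation in which you would drop it from the regression model.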

Godfrey Tumwesigye · Kyambogo University

Negative factor loadings can be caused by negatively worded items; recode those items in reverse order. If that is not the case, i.e. there are no negatively worded items, then discard the items with negative factor loadings.

Allan John Brimicombe · University of East London

Where negative factor loadings occur, it indicates that the variable has a negative correlation with the dimension, similar to a negative coefficient in a regression model. Therefore I wouldn't suggest discarding all variables with negative factor loadings without first studying what part they may play in any model that is being built.

Godfrey Tumwesigye · Kyambogo University

Yes, Allan. Discarding depends on the relationships among the factors. I have done an EFA of job attitudes including job satisfaction, organisational commitment and turnover intentions (TOI). Consistent with theory, whereas the factor loadings on commitment and job satisfaction are positive, those on turnover intentions are negative. The negativity is explained by theory, so I can't discard the items measuring TOI.
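The two views above can be reconciled with a small simulation: a negative loading carries information, and reverse-coding a negatively worded item only flips its sign. The sketch assumes hypothetical 5-point items ("happy", "fun", "sad") driven by one simulated trait and uses first-principal-component loadings; none of it comes from real survey data.

```python
import numpy as np

def first_loadings(X):
    """Loadings on the first principal component, oriented so the first item is positive."""
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    loadings = eigvecs[:, -1] * np.sqrt(eigvals[-1])
    return loadings if loadings[0] > 0 else -loadings

def likert(latent):
    """Map a continuous value onto a 1..5 response scale."""
    return np.clip(np.round(3 + latent), 1, 5)

rng = np.random.default_rng(3)
n = 500
mood = rng.normal(size=n)                         # hypothetical underlying trait
happy = likert(mood + 0.5 * rng.normal(size=n))
fun = likert(mood + 0.5 * rng.normal(size=n))
sad = likert(-mood + 0.5 * rng.normal(size=n))    # negatively worded item

before = first_loadings(np.column_stack([happy, fun, sad]))
after = first_loadings(np.column_stack([happy, fun, 6 - sad]))   # reverse-code: 1<->5, 2<->4

print(np.round(before, 2))
print(np.round(after, 2))
```

The magnitude of the third loading is the same in both runs, so the item is equally informative either way; discarding it for having a negative sign would throw that information away.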

Mariano Pierantozzi · Università Politecnica delle Marche

Dear all, can I ask a question inside a question?

Do the results of a factor analysis, the numbers I get as a result, have units? That is, if I want to investigate the effects of temperature and volume on pressure, do the results have the units of pressure? Thanks a lot, and sorry if this is a silly question; I'm not an expert in statistics, but I'm curious.

Alessandro Giuliani · Istituto Superiore di Sanità

Dear Mariano, it depends on the kind of algorithm you use. If you extract the eigenvectors from the correlation matrix, which is the obligatory choice when dealing with variables with different measurement units, the factor scores are dimensionless z-scores. But if you have variables with the same measurement units, you can extract the eigenvectors from the covariance matrix, thereby keeping the same measurement unit.
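A quick way to see the difference is to change the unit of one variable and watch what happens to the scores. The sketch below uses simulated data and a hypothetical helper, `first_component_scores`; it compares first-component scores from the correlation matrix with those from the covariance matrix before and after converting one column from metres to centimetres.

```python
import numpy as np

def first_component_scores(X, use_correlation):
    """Scores on the first principal component of the correlation or covariance matrix."""
    centered = X - X.mean(axis=0)
    if use_correlation:
        centered = centered / X.std(axis=0, ddof=1)   # z-scores: the units cancel
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    v = eigvecs[:, -1]
    if v.sum() < 0:                                   # fix the arbitrary sign
        v = -v
    return centered @ v

rng = np.random.default_rng(2)
common = rng.normal(size=(250, 1))
X_m = common + 0.6 * rng.normal(size=(250, 3))   # hypothetical lengths, all in metres
X_mixed = X_m.copy()
X_mixed[:, 0] *= 100                              # first column converted to centimetres

corr_same = np.allclose(first_component_scores(X_m, True),
                        first_component_scores(X_mixed, True))
cov_same = np.allclose(first_component_scores(X_m, False),
                       first_component_scores(X_mixed, False))
print(corr_same, cov_same)
```

The correlation-based scores should be unaffected by the unit change, while the covariance-based scores shift toward the variable with the largest numerical variance, which is why the covariance route only makes sense when all variables share one unit.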

Mariano Pierantozzi · Università Politecnica delle Marche

Dear Alessandro Giuliani,

thanks a lot.

Thank you very much, and congratulations on your enormous expertise!

Mariano

Alessandro Giuliani · Istituto Superiore di Sanità

Too kind.

Bye,

Alessandro

Augusto Teoi · University of Campinas

Dear all, since it seems to concern the same subject (interpreting factor analysis output), I'm reposting a question I asked separately.

Does anybody know where I can find published material on Exploratory Factor Analysis with one factor, used to create an importance ranking?

To clarify: in order to analyze a Critical Success Factor survey and rank the factors' importance, I'd run an SPSS Factor Analysis (Principal Component Analysis) with "1" factor to be extracted, using Varimax rotation.

Alessandro Giuliani · Istituto Superiore di Sanità

Dear Augusto,

probably the attached file could be of interest to you

Augusto Teoi · University of Campinas

Dear Giuliani, thanks for the prompt support! I'll check that out!

Quratulain Shaikh · Princess Nora bint Abdul Rahman University

Hi guys,

I am doing factor analysis for the first time, and I found out that instead of principal component analysis we should use maximum likelihood estimation. The problem is that with maximum likelihood I am getting all items > 0.6, whereas with PCA I can omit a few. Is it OK to keep them all?

I know this is silly but I am not a statistician. Please advise.

Suman De · Sambalpur University Institute of Information

Hello everybody

I have conducted a factor analysis to extract the latent factors leading to gender inequality in the workplace. I extracted 5 factors from 20 variables. Now I want to rank all the variables by their contribution to gender inequality. I have ranked the variables by factor score; is this a correct method for my objective?

Please suggest ways to do the ranking.
