
A replication of the S factor among US states

  • Ulster Institute for Social Research

A dataset of 127 variables concerning socioeconomic outcomes for US states was analyzed. Of these, 81 were used in a factor analysis. The analysis revealed a general socioeconomic factor. This factor correlated .961 with one from a previous analysis of socioeconomic data for US states.
IQ and socioeconomic development across Regions of the UK: a reanalysis
A reanalysis of Carl (2015) revealed that the inclusion of London had a strong effect on the S loadings
of the crime and poverty variables. S factor scores from a dataset without London and without redundant
variables were strongly related to IQ scores, r = .87. The Jensen coefficient for this relationship was .86.
Carl (2015) analyzed socioeconomic inequality across 12 regions of the UK. In my reading of his
paper, I thought of several analyses that Carl had not done. I therefore asked him for the data and he
shared it with me. For a fuller description of the data sources, refer back to his article.
Redundant variables and London
Including (nearly) perfectly correlated variables can skew an extracted factor. For this reason, I created
an alternative dataset where variables that correlated above |.90| were removed. The following pairs of
strongly correlated variables were found:
1. median.weekly.earnings and log.weekly.earnings r=0.999
2. GVA.per.capita and log.GVA.per.capita r=0.997
3. R.D.workers.per.capita and log.weekly.earnings r=0.955
4. log.GVA.per.capita and log.weekly.earnings r=0.925
5. economic.inactivity and children.workless.households r=0.914
In each case, the first of the pair was removed from the dataset. However, this resulted in a dataset with
11 cases and 11 variables, which is impossible to factor analyze. For this reason, I left in the last pair.
Furthermore, because capitals are known to sometimes strongly affect results (Kirkegaard, 2015a,
2015b, 2015d), I also created two further datasets without London: one with the redundant variables
and one without. Thus, there were 4 datasets:
1. A dataset with London and redundant variables.
2. A dataset with redundant variables but without London.
3. A dataset with London but without redundant variables.
4. A dataset without London and without redundant variables.
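The redundancy filter described above can be sketched in a few lines. This is a minimal illustration in Python, not the R code used for the analysis; the function names (`redundant_pairs`, `drop_first_of_each_pair`) and the toy variables are hypothetical, and the only things taken from the text are the |r| > .90 threshold and the rule of removing the first variable of each pair.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def redundant_pairs(data, threshold=0.90):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold,
    strongest pair first. `data` maps variable names to lists of values."""
    pairs = []
    for a, b in combinations(data, 2):
        r = pearson(data[a], data[b])
        if abs(r) > threshold:
            pairs.append((a, b, r))
    return sorted(pairs, key=lambda p: -abs(p[2]))


def drop_first_of_each_pair(data, pairs):
    """Remove the first-named variable of each flagged pair."""
    to_drop = {a for a, _, _ in pairs}
    return {name: values for name, values in data.items() if name not in to_drop}


# Toy data (made up for illustration, not Carl's dataset).
toy = {
    "earnings": [1.0, 2.0, 3.0, 4.0, 5.0],
    "log_earnings": [0.0, 0.69, 1.10, 1.39, 1.61],
    "crime": [3.0, 1.0, 4.0, 1.0, 5.0],
}
pairs = redundant_pairs(toy)
reduced = drop_first_of_each_pair(toy, pairs)
print(pairs)
print(sorted(reduced))
```

Leaving a pair in, as was done for the last pair above, amounts to passing only a subset of the flagged pairs, for example `drop_first_of_each_pair(toy, pairs[:-1])`.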
Factor analysis
Each of the four datasets was factor analyzed. Figure 1 shows the loadings.
Removing London strongly affected the loading of the crime variable, which changed from moderately
positive to moderately negative. The poverty variable also changed strongly, from slightly negative
to strongly negative. Both changes go towards a purer S factor (desirable outcomes with positive
loadings, undesirable outcomes with negative loadings). Removing the redundant variables did not
have much effect.
As a robustness check, I verified that these results were stable across 30 different factor analytic
methods (see footnote 1). They were: all loadings and scores correlated near 1.00. For my analysis, I used
those extracted with the combination of minimum residuals and regression.
Due to London's strong effect on the loadings, one should check that the two methods developed for
finding such cases can identify it (Kirkegaard, 2015c). Figure 2 shows the results from these two
methods (mean absolute residual and change in factor size):
1 There are 6 different extraction methods and 5 scoring methods supported by the fa() function from the
psych package (Revelle, 2015). Thus, there are 6 × 5 = 30 combinations.
Figure 1: S factor loadings in four analyses.
As can be seen, London was identified as a far outlier using both methods.
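To make the idea behind one of these metrics concrete, here is a rough Python sketch of a leave-one-out "change in factor size" calculation. It rests on several assumptions that are not from the paper: the first principal component of the correlation matrix stands in for the extracted factor (the analysis itself used minimum residuals), "factor size" is taken to be the share of variance captured by that component, and the function names and toy data are hypothetical. The mean absolute residual metric is not sketched here.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def first_component_share(columns, iterations=500):
    """Share of total variance captured by the first principal component of the
    correlation matrix, found by power iteration (a stand-in for factor size)."""
    p = len(columns)
    R = [[pearson(a, b) for b in columns] for a in columns]
    # Non-uniform start vector, to avoid starting orthogonal to the top eigenvector.
    v = [1.0 + 0.1 * k for k in range(p)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        eigenvalue = norm  # converges to the largest eigenvalue once v has unit length
    return eigenvalue / p


def change_in_factor_size(columns):
    """For each case, the change in first-component share when that case is dropped.
    A large positive value means the data are more unidimensional without the case."""
    n_cases = len(columns[0])
    full = first_component_share(columns)
    changes = []
    for i in range(n_cases):
        reduced = [col[:i] + col[i + 1:] for col in columns]
        changes.append(first_component_share(reduced) - full)
    return changes


# Toy data (made up): five cases follow one pattern, the sixth breaks it on variable b.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [1.1, 1.9, 3.2, 3.8, 5.1, 1.0]
c = [0.9, 2.1, 2.9, 4.2, 4.8, 6.5]
changes = change_in_factor_size([a, b, c])
most_mixed = max(range(len(changes)), key=lambda i: changes[i])
print(most_mixed, [round(x, 3) for x in changes])
```

In this toy example the sixth case plays the role London plays in the real data: dropping it makes the remaining cases line up much more closely on a single dimension.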
S scores and IQ
Carl's dataset also contains IQ scores for the regions. These correlate .87 with the S factor scores from
the dataset without London and without redundant variables. Figure 3 shows the scatter plot.
However, it is possible that IQ is not really related to the latent S factor, but only to the remaining
(non-S) variance in the extracted S scores. For this reason I used Jensen's method (method of correlated
vectors) (Jensen, 1998). Figure 4 shows the results.
Figure 2: Mixedness metrics for the complete dataset.
Figure 3: Scatter plot of S and IQ scores for regions of the UK.
Jensen's method thus supported the claim that IQ scores and the latent S factor are related.
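For readers unfamiliar with the method, the computation can be sketched as follows: correlate the vector of indicator loadings on S with the vector of each indicator's correlation with IQ. This is a minimal Python illustration, not the R code used for the analysis; the function name `correlated_vectors`, the loadings, and the data are all made up for the example.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_vectors(loadings, indicators, criterion):
    """Correlate each indicator's factor loading with that indicator's
    correlation with the criterion variable (here, IQ)."""
    names = list(loadings)
    loading_vector = [loadings[name] for name in names]
    criterion_vector = [pearson(indicators[name], criterion) for name in names]
    return pearson(loading_vector, criterion_vector)


# Toy data (made up): five regions, three indicators, hypothetical S loadings.
iq = [95.0, 98.0, 100.0, 102.0, 105.0]
indicators = {
    "income": [1.0, 2.0, 3.0, 4.0, 5.0],
    "poverty": [5.0, 4.0, 3.0, 2.0, 1.0],
    "health": [2.0, 1.0, 4.0, 3.0, 5.0],
}
loadings = {"income": 0.9, "poverty": -0.8, "health": 0.5}
jensen_coefficient = correlated_vectors(loadings, indicators, iq)
print(round(jensen_coefficient, 3))
```

A coefficient near 1 means that indicators loading more strongly on the factor are also those more strongly related to the criterion, which is the pattern expected if the latent factor itself drives the association.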
Discussion and conclusion
My reanalysis revealed some interesting results regarding the effect of London on the loadings. This
was made possible by data sharing and shows why sharing data is very important (Wicherts & Bakker,
2012).
Supplementary material
R source code and datasets are available at the OSF.
References
Carl, N. (2015). IQ and socioeconomic development across Regions of the UK. Journal of Biosocial
Science, 1–12.
Jensen, A. R. (1998). The g factor: the science of mental ability. Westport, Conn.: Praeger.
Kirkegaard, E. O. W. (2015a). Examining the S factor in Mexican states. The Winnower. Retrieved from
Kirkegaard, E. O. W. (2015b). Examining the S factor in US states. The Winnower. Retrieved from
Kirkegaard, E. O. W. (2015c). Finding mixed cases in exploratory factor analysis. The Winnower.
Retrieved from
Figure 4: Jensen's method for the S factor's relationship to IQ scores.
Kirkegaard, E. O. W. (2015d). The S factor in Brazilian states. The Winnower. Retrieved from
Revelle, W. (2015). psych: Procedures for Psychological, Psychometric, and Personality Research
(Version 1.5.4). Retrieved from
Wicherts, J. M., & Bakker, M. (2012). Publish (your data) or (let the data) perish! Why not publish
your data too? Intelligence, 40(2), 73–76.