IQ and socioeconomic development across Regions of the UK: a reanalysis
A reanalysis of Carl (2015) revealed that the inclusion of London had a strong effect on the S loadings
of the crime and poverty variables. S factor scores from a dataset without London and without redundant
variables were strongly related to IQ scores, r = .87. The Jensen coefficient for this relationship was .86.
Carl (2015) analyzed socioeconomic inequality across 12 regions of the UK. While reading his paper,
I thought of several analyses that Carl had not done. I therefore asked him for the data, which he
shared with me. For a fuller description of the data sources, see his article.
Redundant variables and London
Including (nearly) perfectly correlated variables can skew an extracted factor. For this reason, I created
an alternative dataset from which variables that correlated above .90 in absolute value were removed.
The following pairs of strongly correlated variables were found:
1. median.weekly.earnings and log.weekly.earnings r=0.999
2. GVA.per.capita and log.GVA.per.capita r=0.997
3. R.D.workers.per.capita and log.weekly.earnings r=0.955
4. log.GVA.per.capita and log.weekly.earnings r=0.925
5. economic.inactivity and children.workless.households r=0.914
In each case, the first variable of the pair was removed from the dataset. However, this resulted in a
dataset with 11 cases and 11 variables, which is impossible to factor analyze. For this reason, I kept
both variables of the last pair.
Furthermore, because capitals are known to sometimes strongly affect results (Kirkegaard, 2015a,
2015b, 2015d), I also created two further datasets without London: one with the redundant variables,
one without. Thus, there were 4 datasets:
1. A dataset with London and with redundant variables.
2. A dataset without London but with redundant variables.
3. A dataset with London but without redundant variables.
4. A dataset without London and without redundant variables.
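The construction of the four datasets can be sketched as follows. This is a minimal Python illustration, not the R code used for the analysis. The toy values, the region labels other than London, and the helper names (`pearson`, `redundant_pairs`, `drop_first_of_pairs`, `without_case`) are all hypothetical, and the sketch assumes the correlations are computed on the complete dataset.

```python
from itertools import combinations
from math import log, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def redundant_pairs(columns, threshold=0.90):
    """Return (first, second, r) for every variable pair with |r| > threshold."""
    found = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            found.append((a, b, r))
    return found


def drop_first_of_pairs(columns, pairs, keep=()):
    """Drop the first variable of each pair, except pairs listed in `keep`."""
    to_drop = {a for a, b, _ in pairs if (a, b) not in keep}
    return {name: vals for name, vals in columns.items() if name not in to_drop}


def without_case(columns, cases, case):
    """Remove one case (row) from every variable."""
    i = cases.index(case)
    return {name: vals[:i] + vals[i + 1:] for name, vals in columns.items()}


# Hypothetical toy data: 5 cases, 3 variables (not the values from Carl's dataset).
cases = ["London", "Region A", "Region B", "Region C", "Region D"]
earnings = [10.0, 4.0, 5.0, 6.0, 7.0]
full = {
    "earnings": earnings,
    "log.earnings": [log(v) for v in earnings],
    "crime": [3.0, 9.0, 2.0, 7.0, 5.0],
}

pairs = redundant_pairs(full)
reduced = drop_first_of_pairs(full, pairs)

# The four variants: with/without London crossed with with/without redundant variables.
variants = {
    ("London", "redundant"): full,
    ("no London", "redundant"): without_case(full, cases, "London"),
    ("London", "no redundant"): reduced,
    ("no London", "no redundant"): without_case(reduced, cases, "London"),
}
```

The `keep` argument corresponds to the exception made above for the last pair: passing a pair there leaves both of its variables in the dataset.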
Each of the four datasets was factor analyzed. Figure 1 shows the loadings.
Removing London strongly affected the loading of the crime variable, which changed from moderately
positive to moderately negative. The poverty variable also saw a large change, from slightly negative to
strongly negative. Both changes are in the direction towards a purer S factor (desirable outcomes with
positive loadings, undesirable outcomes with negative loadings). Removing the redundant variables did
not have much effect.
As a check, I investigated whether these results were stable across 30 different factor analytic
methods (see footnote 1). They were: all loadings and scores correlated near 1.00. For my analysis,
I used those extracted with the combination of minimum residuals and regression.
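The stability check can be sketched as follows. This is a Python illustration of the logic only. The option names are the extraction (`fm`) and scoring (`scores`) choices of `fa()` as I understand them from the psych documentation and should be checked against the installed version. The loading vectors are hypothetical stand-ins for the output of three method combinations, and `min_pairwise_r` is a helper introduced here for illustration.

```python
from itertools import combinations, product
from math import sqrt

# Assumed option names for psych::fa(); verify against the package documentation.
EXTRACTION = ["minres", "wls", "gls", "pa", "ml", "minchi"]
SCORING = ["regression", "Thurstone", "tenBerge", "Anderson", "Bartlett"]

# Every extraction method paired with every scoring method: 6 * 5 combinations.
combos = list(product(EXTRACTION, SCORING))


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def min_pairwise_r(vectors):
    """Smallest correlation between any two vectors (loadings or scores)."""
    return min(pearson(u, v) for u, v in combinations(vectors, 2))


# Hypothetical loadings from three method combinations (not real results).
toy_loadings = [
    [0.90, 0.80, -0.70, 0.60, -0.50],
    [0.91, 0.79, -0.71, 0.62, -0.49],
    [0.88, 0.81, -0.69, 0.59, -0.52],
]
stable = min_pairwise_r(toy_loadings) > 0.99
```

Reporting the minimum pairwise correlation is one simple way to summarize "all correlated near 1.00"; if even the least similar pair of solutions agrees, the choice of method matters little.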
Due to London's strong effect on the loadings, one should check that the two methods developed for
finding such cases can identify it (Kirkegaard, 2015c). Figure 2 shows the results from these two
methods (mean absolute residual and change in factor size):
1 There are 6 extraction methods and 5 scoring methods supported by the fa() function from the psych
package (Revelle, 2015). Thus, there are 6*5 = 30 combinations.
Figure 1: S factor loadings in four analyses.
As can be seen, London was identified as a far outlier using both methods.
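As a rough illustration of the first of these methods, the sketch below computes a mean absolute residual per case. It reflects my reading of the method, not code from Kirkegaard (2015c): each indicator is regressed on the factor scores, and the absolute residuals are averaged within each case. The data, the case labels, and the function names are hypothetical, and in practice the indicators would be standardized first so that they contribute on the same scale.

```python
def ols_residuals(x, y):
    """Residuals from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def mean_absolute_residuals(scores, indicators, cases):
    """Average absolute residual per case across all indicators."""
    totals = [0.0] * len(cases)
    for values in indicators.values():
        for i, res in enumerate(ols_residuals(scores, values)):
            totals[i] += abs(res)
    return {case: total / len(indicators) for case, total in zip(cases, totals)}


# Hypothetical example: the case "Mixed" does not fit the general factor structure.
cases = ["A", "B", "Mixed", "D", "E"]
scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
indicators = {
    "x1": [-2.0, -1.0, 3.0, 1.0, 2.0],
    "x2": [2.0, 1.0, -3.0, -1.0, -2.0],
}

mar = mean_absolute_residuals(scores, indicators, cases)
worst = max(mar, key=mar.get)
```

A case with a much larger value than the rest is poorly described by the factor, which is the sense in which London stands out here.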
S scores and IQ
Carl's dataset also contains IQ scores for the regions. These correlate .87 with the S factor scores from
the dataset without London and without redundant variables. Figure 3 shows the scatter plot.
However, it is possible that IQ is not really related to the latent S factor, but only to the remaining
(non-S) variance in the extracted S scores. For this reason I used Jensen's method (method of correlated
vectors) (Jensen, 1998), which correlates the indicators' S loadings with their correlations with IQ.
Figure 4 shows the results.
Figure 2: Mixedness metrics for the complete dataset.
Figure 3: Scatter plot of S and IQ scores for regions of the UK.
Jensen's method thus supported the claim that IQ scores and the latent S factor are related.
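The logic of Jensen's method can be sketched as follows: the vector of S loadings is correlated with the vector of each indicator's correlation with IQ. This is a minimal Python illustration. The variable names and values are hypothetical, not those from the analysis, and `jensen_coefficient` is a helper name introduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def jensen_coefficient(loadings, criterion_r):
    """Correlate indicator loadings with indicator-by-criterion correlations."""
    names = sorted(loadings)  # align the two vectors by indicator name
    return pearson(
        [loadings[n] for n in names],
        [criterion_r[n] for n in names],
    )


# Hypothetical S loadings and indicator correlations with IQ.
s_loadings = {"income": 0.9, "education": 0.7, "crime": -0.6, "poverty": -0.8}
iq_correlations = {"income": 0.8, "education": 0.6, "crime": -0.5, "poverty": -0.7}

coefficient = jensen_coefficient(s_loadings, iq_correlations)
```

A coefficient near 1 means that the indicators that load most strongly on S are also those most strongly related to IQ, which is what one expects if IQ is related to the latent factor itself.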
Discussion and conclusion
My reanalysis revealed some interesting results regarding the effect of London on the loadings. This
was made possible by data sharing, which demonstrates the importance of this practice (Wicherts & Bakker,
2012).
R source code and datasets are available at the OSF.
Carl, N. (2015). IQ and socioeconomic development across Regions of the UK. Journal of Biosocial
Science, 1–12. http://doi.org/10.1017/S002193201500019X
Jensen, A. R. (1998). The g factor: the science of mental ability. Westport, Conn.: Praeger.
Kirkegaard, E. O. W. (2015a). Examining the S factor in Mexican states. The Winnower. Retrieved from
Kirkegaard, E. O. W. (2015b). Examining the S factor in US states. The Winnower. Retrieved from
Kirkegaard, E. O. W. (2015c). Finding mixed cases in exploratory factor analysis. The Winnower.
Retrieved from https://thewinnower.com/papers/finding-mixed-cases-in-exploratory-factor-
Figure 4: Jensen's method for the S factor's relationship to IQ scores.
Kirkegaard, E. O. W. (2015d). The S factor in Brazilian states. The Winnower. Retrieved from
Revelle, W. (2015). psych: Procedures for Psychological, Psychometric, and Personality Research
(Version 1.5.4). Retrieved from http://cran.r-project.org/web/packages/psych/index.html
Wicherts, J. M., & Bakker, M. (2012). Publish (your data) or (let the data) perish! Why not publish
your data too? Intelligence, 40(2), 73–76. http://doi.org/10.1016/j.intell.2012.01.004