S and G in Italian regions: Re-analysis of Lynn's data and new data
Emil O. W. Kirkegaard1
The S factor in Italian regions was examined by reanalyzing data published by Lynn (2010) as
well as new data compiled from the Italian statistics agency (7 and 10 socioeconomic variables,
respectively). The S factors from the two datasets were highly correlated (.92) and both were strongly
correlated with a G factor from PISA scores (.93 and .88).
Key words: Italy, regions, social inequality, S factor, general socioeconomic factor, IQ, intelligence,
cognitive ability, PISA, cognitive sociology
One can study a given human trait at many levels. Probably the most common is the individual level.
The next most common is the international level, and the least common is perhaps the intranational level.
This last one can be done at various levels too, e.g. state, region, commune, and city. These divisions usually vary
The study of general intelligence (GI) at these higher levels has been called the ecology of intelligence
by Richard Lynn (Lynn, 1979, 1980) and the sociology of intelligence by Linda Gottfredson
(Gottfredson, 1998). Lynn has since published a number of papers on the regions of Italy (Lynn, 2010,
2012; Piffer & Lynn, 2014). The present study re-analyses some of this data. After this a new, larger,
more diverse dataset is presented and analyzed.
2. Lynn’s 2010 data
True to his style, Lynn (2010) contains the raw data used for his analysis. This is fortunate because it
means anyone can re-analyze them. His paper contains the following variables:
1 University of Aarhus, Denmark. Email: firstname.lastname@example.org
Page 1 of 10.
1. 3x PISA subtests: reading, math, science
2. An average of these PISA scores
3. An IQ derived from the average
4. Stature for 1855, 1910, 1927 and 1980
5. Per capita income for 1970 and 2003
6. Infant mortality for 1955 and 1999
7. Literacy 1880
8. Years of education 1951, 1971 and 2001
These data are given for 12 Italian regions.
Lynn himself merely did correlational analysis and discussed the results. The data however can be
usefully factor analyzed to extract a G (from the three PISA subtests) and an S factor (from all the
Lynn’s choice of variables is quite odd. They are not all from the same years, presumably because he
picked them from various other papers instead of going to the Italian statistics website to fetch some
himself. This raises the question of how to analyze them. Three approaches were used: 1) factor
analysis of the old data alone, 2) factor analysis of the new data alone, and 3) factor analysis of all
the data. The two factor analyses of the limited datasets did not reveal anything of interest that was
not also shown in the full analysis, so only the results from the full analysis are shown (Figure 1).
There were no surprises to be seen.
The loadings for the G factor with the PISA subtests were all .99. The scatter plot for G and S is shown
in Figure 2.
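The extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for this paper: it substitutes the first principal component of the correlation matrix for a proper factor analysis, and the function names and the five-region scores are invented for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_component_loadings(columns, iterations=200):
    """Loadings of each variable on the first principal component.

    `columns` is a list of variables, each a list of regional values.
    Power iteration on the correlation matrix approximates the dominant
    eigenvector; loadings are that vector times sqrt(eigenvalue).
    """
    k = len(columns)
    R = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    v = [1.0 + 0.1 * i for i in range(k)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector v.
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(k)) for i in range(k))
    loadings = [a * math.sqrt(eigenvalue) for a in v]
    # Fix the sign so that the first variable loads positively.
    if loadings[0] < 0:
        loadings = [-a for a in loadings]
    return loadings

def component_scores(columns, loadings):
    """Unstandardized regional scores: loading-weighted sum of z-scores."""
    n = len(columns[0])
    z = []
    for col in columns:
        m = sum(col) / n
        sd = math.sqrt(sum((a - m) ** 2 for a in col) / (n - 1))
        z.append([(a - m) / sd for a in col])
    return [sum(l * zc[r] for l, zc in zip(loadings, z)) for r in range(n)]

# Hypothetical subtest scores for five regions (not the PISA data).
reading = [480, 495, 510, 470, 520]
maths = [478, 499, 512, 465, 525]
science = [482, 493, 515, 468, 522]

g_loadings = first_component_loadings([reading, maths, science])
g_scores = component_scores([reading, maths, science], g_loadings)
print([round(l, 2) for l in g_loadings])
```

The same two functions can be applied to a set of socioeconomic indicators to obtain S loadings and S scores, after which the G and S scores can be correlated across regions.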
Figure 1: S factor loadings in Lynn's dataset.
The method of correlated vectors was then applied, as shown in Figure 3.
There was a moderate positive relationship. But given the lack of diversity and the small number of
indicators, not much can be concluded from this.
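For readers unfamiliar with the method of correlated vectors, the computation can be sketched as below. This is a minimal illustration under stated assumptions: each indicator is correlated with the regional G scores, and the resulting vector is then correlated with the vector of the indicators' S loadings. The function names, the toy data and the S loadings are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlated_vectors(indicators, s_loadings, g_scores):
    """Return (MCV correlation, vector of indicator-by-G correlations).

    `indicators` is a list of variables (one list of regional values each),
    `s_loadings` holds each indicator's loading on S, in the same order,
    and `g_scores` holds the regional G scores.
    """
    r_with_g = [pearson(col, g_scores) for col in indicators]
    return pearson(s_loadings, r_with_g), r_with_g

# Hypothetical data: five regions, three indicators.
g_scores = [1, 2, 3, 4, 5]
indicators = [
    [1, 2, 3, 4, 5],  # rises with G
    [5, 4, 3, 2, 1],  # falls with G
    [2, 1, 4, 3, 5],  # rises with G, less cleanly
]
s_loadings = [0.9, -0.8, 0.6]  # assumed loadings, for illustration only

mcv_r, r_with_g = correlated_vectors(indicators, s_loadings, g_scores)
print(round(mcv_r, 2), [round(r, 2) for r in r_with_g])
```

With only a handful of indicators, as here, the resulting correlation rests on very few data points, which is why a small and homogeneous set of indicators limits what the method can show.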
Figure 2: Scatter plot of G and S for Lynn's dataset.
Figure 3: Method of correlated vectors applied to the G x S in Lynn's dataset.
3. New data
Being dissatisfied with the data Lynn reported, I decided to collect more data. The PISA 2012 results
have PISA scores for more regions than before which allows for an analysis with more cases. This also
means that one can use more variables in the factor analysis. The new PISA data has 22 regions, so one
can use about 11 variables (Zhao, 2009). However, due to some missing data, only 21 regions were
available for analysis (Südtirol had some missing data). So I decided to use 10 variables.
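The arithmetic behind these numbers appears to be a rule of thumb of roughly two cases per variable. That ratio is inferred from the figures given above rather than quoted from Zhao (2009), and the helper name is hypothetical:

```python
def max_variables(n_cases, cases_per_variable=2):
    """Largest number of variables allowed under an assumed cases-per-variable ratio."""
    return n_cases // cases_per_variable

print(max_variables(22))  # 11
print(max_variables(21))  # 10
```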
To get data for the analysis, I used the same approach as in a previous publication on the S factor in
US states (Kirkegaard, 2015). Data were selected and downloaded from the Italian statistics
agency, IStat (http://www.istat.it/en/). As mentioned earlier, for the method of correlated vectors (MCV)
to work well, one needs a large, diverse selection of variables, so that there is diversity in their S
loadings (not just direction of loading).
The following 10 variables were used:
1. Political participation index, 9 years
2. Percent with normal weight, 9 years
3. Percent smokers, 10 years
4. Intentional homicide rate, 4 years
5. Total crime rate, 4 years
6. Unemployment, 10 years
7. Life expectancy males, 10 years
8. Total fertility rate, 10 years
9. Interpersonal trust index, 5 years
10. No savings percent, 10 years
It was attempted to fetch approximately the last 10 years of data for each variable, which were then
For cognitive data, the regional scores for reading, mathematics and science from PISA 2012 were used
(OECD, 2014, Annex B2).
3.1. Factor analysis
Factor analysis was carried out as before. The loadings are shown in Figure 4.
We see two odd results. Total crime (TC) rate has a slight positive loading (.16) while intentional
homicide rate has a strong negative loading (-.72). Lynn (1979) reported a similar finding. He
explained it as being due to urbanization, which increases population density which increases crime
rates (more opportunities, more interpersonal conflicts). An alternative hypothesis is that the total crime
rate is being increased by immigrants who live mostly in the north. Perhaps one can get crime rates for
natives only to test this. A third hypothesis is that it has to do with differences in the legal system, for
instance, prosecutor practice in determining which actions to pull into the legal system.
The second odd finding is that fertility has a positive loading. Generally, it has been found that fertility
has a slight negative correlation with GI and S factor indicators at the individual level (Lynn, 2011). It
has also been found that internationally, GI has a strong negative relationship, -.5 to -.7 depending on
measure, to fertility (Lynn & Harvey, 2008; Shatz, 2008). I have also previously reported a group-level
correlation of about -.50 among Danish immigrant groups (Kirkegaard, 2014). However, if one
examines European countries only, one sees that fertility is relatively ‘high’ (a bit below 2) in the
northern countries (Nordic countries, UK), and low in the southern and eastern countries. This means
that the correlation of fertility between countries in Europe and IQ (e.g. PISA) is positive. Perhaps this
has some relevance to the current finding. One hypothesis is that immigrants are pulling the fertility up
in the northern regions.
There is little to report from the factor analysis of PISA results. All loadings were between .98 and .99.
Figure 5 shows the correlation between G and S scores.
Figure 4: S factor loadings for the new dataset.
As before, the relationship was positive. It was much stronger this time, however, perhaps due to the
more varied selection of variables.
3.2. Cross-dataset stability
Finally, the G and S scores from the two datasets were compared to examine cross-dataset stability.
For one case, Lynn’s dataset had data for a merged region. To make the datasets comparable, the same
regions were merged in the new dataset. The scatter plots are shown in Figures 7-8.
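The merging and comparison step can be sketched as follows. This is only an illustration under stated assumptions: the sub-regions are merged by an unweighted mean (a population-weighted mean would be an equally plausible choice), and the region labels, scores and function names are hypothetical rather than taken from either dataset.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def merge_regions(scores, parts, merged_name):
    """Replace the regions in `parts` by one region holding their unweighted mean."""
    merged = {k: v for k, v in scores.items() if k not in parts}
    merged[merged_name] = sum(scores[p] for p in parts) / len(parts)
    return merged

# Hypothetical S scores; "B1" and "B2" appear as one region "B" in the older data.
new_scores = {"A": 1.0, "B1": 0.5, "B2": 0.1, "C": -1.2}
old_scores = {"A": 0.9, "B": 0.2, "C": -1.0}

new_merged = merge_regions(new_scores, ["B1", "B2"], "B")

# Correlate only the regions present in both datasets, in a fixed order.
common = sorted(set(new_merged) & set(old_scores))
r = pearson([new_merged[k] for k in common], [old_scores[k] for k in common])
print(common, round(r, 2))
```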
Figure 6: Method of correlated vectors applied to the G x S relationship in the new
These revealed very high cross-dataset agreement.
The results for the regional G and S in Italian regions are especially strong. They rival even the
international S factor in their correlation with the G estimates. Italy really is a very divided country.
Stability across datasets was strong, so Lynn’s odd choice of data was not inflating the results.
The method of correlated vectors gave stronger results in the dataset with more and more diverse
indicator variables for S, as would be expected if the correlation was artificially low in the first dataset
due to restriction of range in the S loadings.
Figure 7: Cross-dataset correlation of G.
Figure 8: Cross-dataset correlation of S.
All project files (R source code, data files, plots) are available at https://osf.io/ebr3p/.
Thanks to Davide Piffer for catching an error and for help in matching the regions up from the two
Gottfredson, L. S. (1998). Jensen, Jensenism, and the sociology of intelligence. Intelligence, 26(3),
Kirkegaard, E. O. W. (2014). Criminality and fertility among Danish immigrant populations. Open
Differential Psychology. Retrieved from http://openpsych.net/ODP/2014/03/criminality-and-
Kirkegaard, E. O. W. (2015). Examining the S factor in US states. The Winnower. Retrieved from
Lynn, R. (1979). The social ecology of intelligence in the British Isles. British Journal of Social and
Clinical Psychology, 18(1), 1–12. https://doi.org/10.1111/j.2044-8260.1979.tb00297.x
Lynn, R. (1980). The social ecology of intelligence in France. British Journal of Social and Clinical
Psychology, 19(4), 325–331. https://doi.org/10.1111/j.2044-8260.1980.tb00360.x
Lynn, R. (2010). In Italy, north–south differences in IQ predict differences in income, education, infant
mortality, stature, and literacy. Intelligence, 38(1), 93–100.
Lynn, R. (2011). Dysgenics: genetic deterioration in modern populations. London: Ulster Institute for
Lynn, R. (2012). IQs in Italy are higher in the north: A reply to Felice and Giugliano. Intelligence,
40(3), 255–259. https://doi.org/10.1016/j.intell.2012.02.007
Lynn, R., & Harvey, J. (2008). The decline of the world’s IQ. Intelligence, 36(2), 112–120.
OECD (Ed.). (2014). What students know and can do: student performance in mathematics, reading
and science (Rev. ed., Febr. 2014). Paris: OECD.
Piffer, D., & Lynn, R. (2014). New evidence for differences in fluid intelligence between north and
south Italy and against school resources as an explanation for the north–south IQ differential.
Intelligence, 46, 246–249. https://doi.org/10.1016/j.intell.2014.07.006
Shatz, S. M. (2008). IQ and fertility: A cross-national study. Intelligence, 36(2), 109–111.
Zhao, N. (2009, March 23). The Minimum Sample Size in Factor Analysis. Retrieved November 16,