Why are (the best) women so good at chess?

Participation rates and gender differences in intellectual domains

Merim Bilalić 1,*, Kieran Smallbone 2, Peter McLeod 1 and Fernand Gobet 3

1 Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford OX1 3UD, UK

2 Manchester Centre for Integrative Systems Biology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK

3 Centre for Cognition and Neuroimaging, Brunel University, Uxbridge, Middlesex UB8 3PH, UK

A popular explanation for the small number of women at the top level of intellectually demanding activities

from chess to science appeals to biological differences in the intellectual abilities of men and women. An

alternative explanation is that the extreme values in a large sample are likely to be greater than those in a

small one. Although the performance of the 100 best German male chess players is better than that of the

100 best German women, we show that 96 per cent of the observed difference would be expected given

the much greater number of men who play chess. There is little left for biological or cultural explanations to

account for. In science, where there are many more male than female participants, this statistical sampling

explanation, rather than differences in intellectual ability, may also be the main reason why women are

under-represented at the top end.

Keywords: gender differences; participation (base) rates; chess; intellectual activities;

intelligence; science

1. INTRODUCTION

The former president of Harvard University, Lawrence Summers, expressed a view widely held by both academic researchers (e.g. Geary 1998; Kimura 1999; Pinker 2002) and laypeople when he suggested that innate biological differences in intellectual abilities may explain why the most successful scientists and engineers are predominantly male (Summers 2005). Recent research confirms that, although there is little difference between the average scores of men and women on intelligence and aptitude tests, highly intelligent people are predominantly male (Deary et al. 2003; Irwing & Lynn 2005). As Irwing & Lynn (2005) said: ‘different proportions of men and women with high IQs ... may go some way to explain the greater numbers of men achieving distinctions of various kinds for which a high IQ is required, such as chess Grandmasters, Fields medallists for mathematics, Nobel prize winners and the like’ (p. 519; emphasis added).

Despite the increasing proportion of women in intellectually demanding professions, they are still under-represented at the top level in science (Long 2001; Xie & Shauman 2003; Ceci & Williams 2007) and engineering (Long 2001). There are various possible explanations for this apart from innate biological differences (for critiques and negative findings see Kerkman et al. 2000; Spelke 2005; Lachance & Mazzocco 2006). Socialization and different interests (Benbow et al. 2000; Ayalon 2003), gender roles (Massa et al. 2005), gatekeeper effects (Davidson & Cooper 1992; Steele 1997; Huffman & Torres 2002), cultural differences (Andreescu et al. 2008) and higher participation rates of men (Charness & Gerchak 1996; Chabris & Glickman 2006) have all been proposed. Here, we show that in chess, an intellectually demanding activity where men dominate at the top level, the difference in the performance of the best men and women is largely accounted for by the difference that would be expected, given the much greater number of men who participate. Despite the clear superiority of the top male players, there is, in reality, very little performance gap in favour of men for non-statistical theories to explain.

Before considering cultural or biological explanations

for better performance by the top performers in one of two

groups of different size, a simple statistical explanation

must be considered. Even if two groups have the same

average (mean) and variability (s.d.), the highest performing

individuals are more likely to come from the larger

group. The greater the difference in size between the two

groups, the greater is the difference to be expected

between the top performers in the two groups. Nothing

about underlying differences between the groups can be

concluded from the preponderance of members of the

larger group at the far ends of the distribution until one

can show that this preponderance is greater than would be

expected on statistical sampling grounds.
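The sampling argument above can be illustrated with a small simulation. The sketch below is not part of the original analysis: it assumes two groups drawn from the same standard normal distribution, and the function name, group sizes, trial count and seed are all illustrative choices. Under these assumptions every individual is equally likely to be the overall best, so the larger group should supply the top performer in a fraction n_large / (n_large + n_small) of trials.

```python
import random


def share_of_trials_won_by_larger_group(n_large: int, n_small: int,
                                        trials: int = 400, seed: int = 1) -> float:
    """Fraction of trials in which the best score comes from the larger group.

    Both groups are drawn from the same N(0, 1) distribution, so any
    advantage of the larger group is purely a sampling effect.
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability only
    wins = 0
    for _ in range(trials):
        best_large = max(rng.gauss(0.0, 1.0) for _ in range(n_large))
        best_small = max(rng.gauss(0.0, 1.0) for _ in range(n_small))
        if best_large > best_small:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # Illustrative 16 : 1 size ratio; the theoretical share is 16/17.
    print(share_of_trials_won_by_larger_group(1600, 100))
```

With a 16 : 1 size ratio the simulated share should sit close to 16/17, even though the two groups have identical means and standard deviations.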

It is difficult to quantify how participation rates influence the number of outstanding men and women in fields such as science and engineering because both achievement and participation rates are difficult to measure. But it is straightforward in chess because there is an objective measure of achievement and the number of male and female participants is known. Chess has an interval scale, the Elo rating, for measuring skill level. Every serious player has an Elo rating that is obtained on the basis of their results against other players of known rating (see Elo 1986). National federations record the ratings of all players who take part in competitions. Hence, it is possible to test whether the difference between the best male and female performers is any greater than would be expected given the different numbers of male and female chess players.

Proc. R. Soc. B (2009) 276, 1161–1165
doi:10.1098/rspb.2008.1576
Published online 23 December 2008

*Author and address for correspondence: Section for Experimental MR, Department of Neuroradiology, Tübingen University, Hoppe-Seyler-Street 3, 72076 Tübingen, Germany (merim.bilalic@med.uni-tuebingen.de).

Received 31 October 2008
Accepted 1 December 2008
This journal is © 2008 The Royal Society

2. MATERIAL AND METHODS

We developed an analytic method to estimate the expected

difference between the top male and female performers based

on the overall male and female participation rates using the

parameters of central tendency (mean) and variability (s.d.)

of the underlying population. A novelty of this method is

that it can be used to estimate the values of the extreme

members of very large samples (such as ours, where

n = 120 399). This method is described in detail in

appendix A. Our approach is based on the work of Charness &

Gerchak (1996) who showed that the difference between the

top male and female player in the world in 1994 was similar to

that predicted by the relative numbers of male and female

players in the ratings of the international chess federation

(FIDE). A limitation of this study was the statistical

problem that the estimate of the extreme value from a sample

tends to be highly variable (Glickman & Chabris 1996;

Glickman 1999). Also, the world’s top female player at the

time, the Hungarian Judit Polgár, was a phenomenon, by far

the strongest female player the world has ever known. She is

currently ranked 27th of all players in the world but she is the

only female player in the top 100. The fact that the best

female player is an outlier in her population, combined with

the problem of high variability of the extreme value, means

that conclusions drawn on the basis of the performance of the

top player alone may not be applicable to top players in

general. The FIDE rating list used by Charness and Gerchak

only reports players of average strength and above. We used

the German rating list that lists all players. Most importantly,

instead of estimating just the top male and female player, we

estimated the expected performance of the best 100 male and

female performers.
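The point about the variability of the extreme value can be made concrete with a short simulation. This is a sketch under stated assumptions rather than the procedure used in the paper: scores are drawn from a standard normal distribution, and the function name, sample size, number of trials and seed are hypothetical. It compares how much the single best score and the 100th best score fluctuate from one sample to the next.

```python
import heapq
import random
import statistics


def spread_of_kth_best(n: int, k: int, trials: int = 100, seed: int = 2) -> float:
    """Standard deviation, across repeated samples, of the k-th best score.

    Each trial draws n scores from N(0, 1) and records the k-th largest.
    """
    rng = random.Random(seed)  # illustrative fixed seed
    kth_values = []
    for _ in range(trials):
        sample = (rng.gauss(0.0, 1.0) for _ in range(n))
        # nlargest returns the k largest values in descending order,
        # so the last element is the k-th best score.
        kth_values.append(heapq.nlargest(k, sample)[-1])
    return statistics.stdev(kth_values)


if __name__ == "__main__":
    print("s.d. of the best score:      ", spread_of_kth_best(2000, 1))
    print("s.d. of the 100th best score:", spread_of_kth_best(2000, 100))
```

In runs of this kind the top score should vary several times more than the 100th best, which is one reason to base conclusions on a range of top players rather than on the single best player.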

We applied this method to the population of all German

players recorded by the German chess federation (Deutscher

Schachbund). With over 3000 rated tournaments in a year,

the German chess federation is one of the largest and the best

organized national chess federations in the world. Given that

almost all German tournaments are rated, including events

such as club championships, all competitive and most hobby

players in Germany can be found on the rating list. The rating

itself is based on the same assumptions as the Elo rating used

by the international chess federation. The two correlate

highly (r = 0.93).

We considered the players in the list published in April

2007. A small number of the best male and female chess players

from all over the world participate in tournaments rated by the

German chess federation and dominate the top of the rating

list. We have excluded all foreign players from the analyses so

that our conclusions about the expected performance of the

best male and female players, based on the total number and

performance of male and female German players, are only

applied to German players. Figure 1 shows the distribution of

ratings for the German chess population. The distribution is

approximately normal with mean of 1461 and s.d. of 342.

Rated men (113 386) greatly outnumber rated women (7013);

that is, there are 16 male chess players for every woman.

3. RESULTS

Figure 2 shows the real difference in rating for each of the

top 100 pairs of male and female players, and the

difference to be expected for each pair given the much

larger number of male players. The expected superiority of

male players varies from approximately 270 Elo points for

the best male player to approximately 440 Elo points for

the 100th. Figure 2 shows that, in fact, the top three

women are better than would be expected. The next 70

pairs show a small but consistent advantage for men—

their superiority over the corresponding female player is a

little greater than would be expected purely from

the relative numbers of male and female players.

From approximately the 80th pair the advantage shifts.

The female players are slightly better than would be

expected. Averaged over the 100 top players, the expected

male superiority is 341 Elo points and the real one is 353

points. Therefore 96 per cent of the observed difference

between male and female players can be attributed to a

simple statistical fact—the extreme values from a large

sample are likely to be bigger than those from a small one.
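The arithmetic behind the 96 per cent figure is a single division of the two averages quoted above. The variable names in this sketch are illustrative; the two input values are the ones reported in the text.

```python
# Average gaps over the top 100 pairs, as reported in the text (Elo points).
expected_gap = 341   # gap predicted from participation rates alone
observed_gap = 353   # gap actually observed in the rating list

# Share of the observed gap accounted for by the statistical expectation.
share_explained = expected_gap / observed_gap

# Elo points left over for any other explanation.
residual = observed_gap - expected_gap

print(f"share explained: {share_explained:.3f}, residual: {residual} Elo points")
```

The ratio is about 0.966, which the text reports as 96 per cent, and the residual is 12 Elo points.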

4. DISCUSSION

Chess has long been renowned as the intellectual activity

par excellence (Newell et al. 1958) and male dominance at

chess is frequently cited as an example of innate male

intellectual superiority (e.g. Howard 2005; Irwing & Lynn

2005). The reason seems obvious—the best male players

are indisputably better than the best female players. For

example: not a single woman has been world champion;

only 1 per cent of Grandmasters, the best players in the

world, are female; and there is only one woman among

the best 100 players in the world. When considering such

a seemingly convincing example of real world male

superiority, one can easily forget to consider the great

disparity in the number of participants and the statistical

consequences of this for the probable gender of the

best players.

Figure 1. The distribution of the German chess rating with the best-fit normal curve superimposed (x-axis: rating; y-axis: no. of players). n = 120 399, μ = 1461, σ = 342, 16 : 1 men to women ratio.


This was the case when the chess portal ChessBase

asked some of the best female players to explain male

dominance in chess (Ahmadov 2007). None of the

interviewed women even mentioned the greatly differing

participation rates and their consequences for the probable

gender of the top performers. Similarly, at a recent

gathering of more than 20 experts on gender difference

to discuss the reasons for the paucity of women at the top

of science, a broad range of reasons was discussed, but

there was no mention of participation rates (Ceci &

Williams 2007).

One way to avoid the conclusion based on participation

rates would be to argue that the base participation rate for

women used in this study underestimates the real

participation rate. It is possible that there is a self-selection

process based on the innate biological differences in

intellectual abilities, and that the effects of this self-

selection are already observable in the rating list we used.

Women may be inferior in the intellectual abilities that are

important for successful chess playing. This innate

disadvantage may lead women to give up on chess in

greater numbers than more successful men. The small

number of women is then a consequence of their greater

drop-out, which in turn is produced by their innate lack of

the intellectual abilities required to succeed at chess.

Differential participation rates may explain the discrepancy

at the top, but the difference is itself a direct product

of innate differences in intellectual abilities.

This argument sounds reasonable but it rests on a

controversial assumption. It requires that there should be

innate differences between men and women in the

intellectual abilities required for success at chess.

The topic of gender differences in cognitive abilities is a hotly debated one, which lacks conclusive evidence (for example, Geary 1998; Kimura 1999; Kerkman et al. 2000; Pinker 2002; Spelke 2005; Summers 2005; Lachance & Mazzocco 2006; Ceci & Williams 2007). Even if such differences exist, it is unclear which, if any, intellectual abilities are associated with chess skill (for a recent review, see Bilalić et al. 2007). Whatever the final resolution of these debates, there is little empirical evidence to support the hypothesis of differential drop-out rates between males and females. A recent study of 647 young chess players, matched for initial skill, age and initial activity, found that drop-out rates for boys and girls were similar (Chabris & Glickman 2006). Our study does not deal directly with the reasons why there are so few women in competitive chess. These may have to do with selective drop-out before tournament play starts in the early stages of learning to play chess. We can speculate about the reasons for low participation rates of women in competitive intellectual endeavours (as is often done, e.g. Steele 1997; Benbow et al. 2000; Kerkman et al. 2000; Massa et al. 2005; Spelke 2005; Summers 2005; Lachance & Mazzocco 2006; Andreescu et al. 2008) but empirical evidence is scarce.

This study demonstrates that the great discrepancy in

the top performance of male and female chess players can

be largely attributed to a simple statistical fact—more

extreme values are found in larger populations. Once

participation rates of men and women are controlled for,

there is little left for biological, environmental, cultural or

other factors to explain. This simple statistical fact is often

overlooked by both laypeople and experts. In other

domains such as science and engineering, where the

predominance of men at the top is offered as evidence of

the biological superiority of men, large differences

between the number of women and men engaged in

these activities are evident (Long 2001; Xie & Shauman 2003). In these areas of life, it is not possible to estimate the performance of the top women and men and their participation rates as precisely as it is in chess. But until the effect of participation rates has been allowed for, the greater number of men among the most successful people should not be cited as evidence of innate differences between male and female intellectual abilities.

Figure 2. The differences between the real ratings of the best 100 female and male chess players and the differences expected on the basis of the common distribution of male and female ratings and the number of male and female players (x-axis: no. of pair; y-axis: rating difference). The expected difference was obtained by subtracting the estimated rating for the nth female from the estimated rating for the nth male. The ratings were estimated using the participation rates of men and women and the parameters of their shared population (mean and s.d.). Black triangles, expected differences; white squares, real differences.

We are grateful to Frank Hoppe for providing us with the

German database and Eric-Jan Wagenmakers for his

comments on an earlier draft of the paper. Supported by an

ESRC Post-doctoral Fellowship to M.B.

APPENDIX A

A.1 The expected difference between the kth best male and female players

Standard methods (e.g. David & Nagaraja 2003) allow exact calculation of the expected score of the kth best player, but this proves numerically difficult for large sample sizes (such as ours, where n = 120 399). Here we use the size of the population to our advantage, deriving a simple approximation valid for large n.

Suppose we have a population of independent, identically distributed variables; in our case, the sample is all ratings from the German chess federation for both genders. We sort and relabel the sample such that $X_{n,n} \le \dots \le X_{n,1}$; that is, the notation $X_{n,k}$ denotes the kth best score from the sample, which has size n. For example, $X_{n,1}$ and $X_{n,n}$ denote the best and worst ratings, respectively, in our sample. Now defining $E_{n,k} = E(X_{n,k})$, the expected value (or mean) of $X_{n,k}$, we have the recurrence relationship (Harter 1961)

$$E_{n,k+1} = \frac{1}{k}\left(n E_{n-1,k} - (n-k) E_{n,k}\right),$$

which is equivalent to

$$E_{n,k+1} = \frac{n!}{k!\,(n-k-1)!} \sum_{m=n-k}^{n} (-1)^{k+m-n} \binom{k}{n-m} \frac{E_{m,1}}{m}.$$

Thus we see that the expected value of the (k+1)th best performer of a sample of size n may be written as a sum of terms of the form $E_{m,1}$ (the best performer from a sample of size m), where the indexing variable takes values between n − k and n.
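The recurrence can be checked numerically for a small sample. The sketch below is an illustration and not the authors' code: it assumes standard normal scores, computes the expected kth best value directly by integrating against the order-statistic density with a simple trapezoidal rule, and compares this with the value given by the recurrence. The function names, integration limits and step count are assumptions chosen for the example.

```python
import math


def normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def normal_sf(x: float) -> float:
    # Survival function 1 - cdf, computed directly to keep upper-tail accuracy.
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def expected_kth_best(n: int, k: int, lo: float = -10.0, hi: float = 10.0,
                      steps: int = 4000) -> float:
    """E_{n,k}: expected k-th largest of n independent N(0, 1) draws.

    Integrates x times the density of the k-th largest value
    (n - k draws below x, k - 1 draws above x) with the trapezoidal rule.
    """
    coeff = math.factorial(n) / (math.factorial(k - 1) * math.factorial(n - k))
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        density = (coeff * normal_pdf(x)
                   * normal_cdf(x) ** (n - k) * normal_sf(x) ** (k - 1))
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * density
    return total * h


def next_from_recurrence(n: int, k: int) -> float:
    """E_{n,k+1} computed from E_{n-1,k} and E_{n,k} via the recurrence."""
    return (n * expected_kth_best(n - 1, k)
            - (n - k) * expected_kth_best(n, k)) / k


if __name__ == "__main__":
    print("direct     E_{10,2}:", expected_kth_best(10, 2))
    print("recurrence E_{10,2}:", next_from_recurrence(10, 1))
```

Direct integration is practical only for small n, because the factorial coefficient and the high powers of the distribution function become unwieldy for very large samples; this is the numerical difficulty that motivates the large-n approximation developed in this appendix.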

Suppose that $X_i \sim N(0,1)$ are drawn from a standard normal distribution. If the sample size is large, then the expected value of the best performer increases linearly with the logarithm of sample size, i.e. $E_{m,1} \approx c_1 + c_2 \ln m$, where $c_1 = 1.25$ and $c_2 = 0.287$ (figure 3).
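The log-linear approximation can be compared with a direct simulation, in the spirit of figure 3. The sketch below is illustrative only: the constants are the ones quoted in the text, while the function names, sample sizes, number of trials and seed are assumptions, and the simulation uses fewer samples than the 1000 per point mentioned in the figure caption.

```python
import math
import random

C1, C2 = 1.25, 0.287  # constants quoted in the text for the log-linear fit


def log_linear_max(m: int) -> float:
    """Approximate expected maximum of m standard normal draws."""
    return C1 + C2 * math.log(m)


def simulated_mean_max(m: int, trials: int = 200, seed: int = 0) -> float:
    """Average maximum of m standard normal draws over repeated samples."""
    rng = random.Random(seed)  # illustrative fixed seed
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(m))
    return total / trials


if __name__ == "__main__":
    for m in (1000, 2000):
        print(m, round(simulated_mean_max(m), 3), round(log_linear_max(m), 3))
```

For sample sizes around 1000, the simulated mean maximum and the fitted line should agree to within the sampling noise of the simulation.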

If n − k is large, then each value taken by m in the previous sum will be large. Hence, we may apply the log-linear approximation to each occurrence of $E_{m,1}$, obtaining

$$E_{n,k+1} \approx c_1 + c_2 \frac{n!}{k!\,(n-k-1)!} \sum_{m=n-k}^{n} (-1)^{k+m-n} \binom{k}{n-m} \frac{\ln m}{m} = c_1 + c_2 \frac{n!}{k!\,(n-k-1)!} (-1)^k \nabla^k f(n).$$

Here, $f(x) = \ln x / x$ and $\nabla g(x) = g(x) - g(x-1)$ denotes the backward difference operator (Abramowitz & Stegun 1972). Furthermore, the backward difference operator is a close approximation of the differential operator, and hence

$$\nabla^k f(x) \approx \frac{d^k}{dx^k} f(x) = (-1)^k k! \, \frac{\ln x - H(k)}{x^{k+1}},$$

where $H(k) = \sum_{j=1}^{k} j^{-1}$ denotes the kth harmonic number and we define $H(0) = 0$. Substituting, we find that the expected value of the kth best performer is given by

$$E_{n,k} \approx c_1 + c_2 \frac{n!}{(n-k)!\, n^k} \left(\ln n - H(k-1)\right).$$
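The step that replaces the kth backward difference of f(x) = ln x / x by its kth derivative can be checked numerically. The sketch below is a sanity check under illustrative assumptions (the function names and the evaluation point x = 1000 are not from the paper); it expands the backward difference as an alternating binomial sum and compares it with the closed-form derivative.

```python
import math


def f(x: float) -> float:
    return math.log(x) / x


def backward_difference(k: int, x: float) -> float:
    """k-th backward difference of f at x: sum_j (-1)^j C(k, j) f(x - j)."""
    return sum((-1) ** j * math.comb(k, j) * f(x - j) for j in range(k + 1))


def harmonic(k: int) -> float:
    """k-th harmonic number, with harmonic(0) == 0."""
    return sum(1.0 / j for j in range(1, k + 1))


def kth_derivative(k: int, x: float) -> float:
    """Closed form (-1)^k k! (ln x - H(k)) / x^(k+1)."""
    return ((-1) ** k * math.factorial(k)
            * (math.log(x) - harmonic(k)) / x ** (k + 1))


if __name__ == "__main__":
    x = 1000.0  # illustrative evaluation point
    for k in (1, 2, 3):
        print(k, backward_difference(k, x), kth_derivative(k, x))
```

The two quantities should agree more closely as x grows relative to k, which is consistent with the requirement that n be large and k relatively small. For much larger x or k, double-precision cancellation in the alternating sum would make this particular check unreliable.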

This holds if $X_i \sim N(0,1)$ are drawn from a standard normal distribution. Moving to a general normal distribution, if $X_i \sim N(\mu, \sigma^2) = \mu + \sigma N(0,1)$, we find

$$E_{n,k} \approx (\mu + c_1 \sigma) + c_2 \sigma \frac{n!}{(n-k)!\, n^k} \left(\ln n - H(k-1)\right).$$

Given a distribution with known mean μ and s.d. σ, this final formula defines the expectation of the kth highest value within a sample of size n, valid provided n is large and k is relatively small. As such, it affords us a method for estimating the expected rating of a range of top players from the German chess data for each gender; indeed, we use the formula to calculate the expected ratings of the top 100 male and female players using the mean and s.d. of the population (the German chess data), in turn allowing us to determine the expected difference in rating between those players.
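The final formula is simple enough to evaluate directly. The following sketch is an illustration rather than the authors' implementation: it plugs in the population mean, s.d. and numbers of rated men and women reported in the paper, together with the constants c1 and c2 from this appendix. The function names are hypothetical, and because the constants are rounded the output need not match the published figures exactly.

```python
import math

C1, C2 = 1.25, 0.287             # log-linear constants from this appendix
MU, SIGMA = 1461.0, 342.0        # population mean and s.d. reported in the paper
N_MEN, N_WOMEN = 113_386, 7_013  # rated players reported in the paper


def harmonic(k: int) -> float:
    """k-th harmonic number, with harmonic(0) == 0."""
    return sum(1.0 / j for j in range(1, k + 1))


def expected_kth_best_rating(n: int, k: int,
                             mu: float = MU, sigma: float = SIGMA) -> float:
    """Approximate expected rating of the k-th best of n players."""
    # n! / ((n - k)! n^k) written as a product to avoid huge factorials.
    falling = 1.0
    for j in range(k):
        falling *= (n - j) / n
    return ((mu + C1 * sigma)
            + C2 * sigma * falling * (math.log(n) - harmonic(k - 1)))


def expected_gap(k: int) -> float:
    """Expected rating gap between the k-th best man and the k-th best woman."""
    return (expected_kth_best_rating(N_MEN, k)
            - expected_kth_best_rating(N_WOMEN, k))


if __name__ == "__main__":
    for k in (1, 50, 100):
        print(k, round(expected_gap(k), 1))
    mean_gap = sum(expected_gap(k) for k in range(1, 101)) / 100
    print("mean over top 100:", round(mean_gap, 1))
```

For k = 1 the expression reduces to c2 σ ln(n_men / n_women), so the expected gap for the top pair depends only on the ratio of participants and the spread of ratings, not on the population mean.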

REFERENCES

Abramowitz, M. & Stegun, I. A. 1972 Handbook of

mathematical functions with formulas, graphs, and mathematical

tables. New York, NY: Dover.

Ahmadov, Z. 2007 Women in chess—a matter of opinion.

Retrieved 26 April 2008 from http://www.chessbase.com/

newsdetail.asp?newsid=4132.

Andreescu, T., Gallian, J. A., Kane, J. M. & Mertz, J. E. 2008

Cross-cultural analysis of students with exceptional talent

in mathematical problem solving. Notices Am. Math. Soc.

55, 1248–1260.

Ayalon, H. 2003 Women and men go to university:

mathematical background and gender differences in

choice of field in higher education. Sex Roles 48,

277–290. (doi:10.1023/A:1022829522556)

Benbow, C. P., Lubinski, D., Shea, D. L. & Eftekhari-Sanjani,

H. 2000 Sex differences in mathematical reasoning ability

at age 13: their status 20 years later. Psychol. Sci. 11,

474–480. (doi:10.1111/1467-9280.00291)

Bilalić, M., McLeod, P. & Gobet, F. 2007 Does chess need

intelligence?—a study with young chess players. Intelligence

35, 457–470. (doi:10.1016/j.intell.2006.09.005)

Figure 3. Plot demonstrating that the expected value of the best performer increases linearly with the logarithm of sample size, if the sample is large and drawn from a standard normal distribution (x-axis: ln (sample size); y-axis: maximum value). Each data point represents the mean value and s.d. of 1000 samples. The linear fit is shown using all populations that have 1000 or more members (ln 1000 ≈ 7).

Ceci, J. S. & Williams, M. W. 2007 Why aren’t more women in

science? Top researchers debate the evidence. Washington, DC:

American Psychological Association.

Chabris, C. F. & Glickman, M. E. 2006 Sex differences in

intellectual performance: analysis of a large cohort of

competitive chess players. Psychol. Sci. 17, 1040–1046.

(doi:10.1111/j.1467-9280.2006.01828.x)

Charness, N. & Gerchak, Y. 1996 Participation rates and

maximal performance. Psychol. Sci. 7, 46–51. (doi:10.

1111/j.1467-9280.1996.tb00665.x)

David, H. A. & Nagaraja, H. N. 2003 Order statistics.

New York, NY: Wiley.

Davidson, M. J. & Cooper, C. L. 1992 Shattering the glass

ceiling—the woman manager. London, UK: Paul Chapman

Publishing.

Deary, I. J., Thorpe, G., Wilson, V., Starr, J. M. & Whalley,

L. J. 2003 Population sex differences in IQ at age 11: the

Scottish mental survey 1932. Intelligence 31, 533–542.

(doi:10.1016/S0160-2896(03)00053-9)

Elo, A. E. 1986 The rating of chessplayers, past and present.

New York, NY: Arco.

Geary, D. C. 1998 Male, female: the evolution of human sex

differences. Washington, DC: American Psychological

Association.

Glickman, M. E. 1999 Parameter estimation in large dynamic

paired comparison experiments. Appl. Stat. 48, 377–394.

(doi:10.1111/1467-9876.00159)

Glickman, M. E. & Chabris, C. F. 1996 Using chess ratings as

data in psychological research. Boston, MA: Department of

Mathematics and Statistics, Boston University. Unpublished

manuscript, available from www.wjh.harvard.edu/

cfc/Glickman1996.pdf

Harter, H. L. 1961 Expected values of normal order statistics.

Biometrika 48, 151–165. (doi:10.1093/biomet/48.1-2.151)

Howard, R. W. 2005 Are gender differences in high achievement

disappearing? A test in one intellectual domain. J. Biosoc. Sci.

37, 371–380. (doi:10.1017/S0021932004006868)

Huffman, L. M. & Torres, L. 2002 It’s not only ‘who you know’

that matters: gender, personal contacts, and job lead quality.

Gend. Soc. 16, 793–813. (doi:10.1177/089124302237889)

Irwing, P. & Lynn, L. 2005 Sex differences in means and

variability on the progressive matrices in university

students: a meta-analysis. Br. J. Psychol. 96, 505–524.

(doi:10.1348/000712605X53542)

Kerkman, D. D., Wise, J. C. & Harwood, E. A. 2000

Impossible ‘mental rotation’ problems: a mismeasure of

women’s spatial abilities? Learn. Individ. Differ. 12,

253–269. (doi:10.1016/S1041-6080(01)00039-5)

Kimura, D. 1999 Sex and cognition. Cambridge, MA: MIT

Press.

Lachance, J. A. & Mazzocco, M. M. 2006 A longitudinal

analysis of sex differences in math and spatial skills in

primary school age children. Learn. Individ. Differ. 16,

195–216. (doi:10.1016/j.lindif.2005.12.001)

Long, S. J. 2001 From scarcity to visibility: gender differences in

the careers of doctoral scientists and engineers. Washington,

DC: National Academy Press.

Massa, L. J., Mayer, R. E. & Bohon, L. M. 2005 Individual

differences in gender role beliefs influence spatial ability

test performance. Learn. Individ. Differ. 15, 99–111.

(doi:10.1016/j.lindif.2004.11.002)

Newell, A., Shaw, J. C. & Simon, H. A. 1958 Chess-playing

programs and the problem of complexity. IBM J. Res. Dev.

2, 320–335.

Pinker, S. 2002 The blank slate: the modern denial of human

nature. New York, NY: Viking.

Spelke, E. S. 2005 Sex differences in intrinsic aptitude for

mathematics and science? A critical review. Am. Psychol.

60, 950–958. (doi:10.1037/0003-066X.60.9.950)

Steele, C. M. 1997 A threat in the air: how stereotypes shape

intellectual identity and performance. Am. Psychol. 52,

613–629. (doi:10.1037/0003-066X.52.6.613)

Summers, L. H. 2005 Remarks at NBER conference on

diversifying the science and engineering workforce.

Retrieved 26 April 2008 from www.president.harvard.

edu/speeches/2005/nber.html.

Xie, Y. & Shauman, K. A. 2003 Women in science: career

processes and outcomes. Cambridge, MA: Harvard

University Press.
