Which Gender Plays More Beautiful Chess?

Abstract

Chess is typically a male-dominated sport. Women play the game as well, but usually against each other; the reasons for this are debatable. Aesthetics is also an important part of the game and the reason why many people play. It is an essential component of the (even more) male-dominated world of chess composition. In this research, our goal was to determine whether games between men and games between women show any statistically significant difference in terms of aesthetics. Using a computational aesthetics model, we analyzed two sets of games (one small, one large) between males and between females, irrespective of playing strength and age. In the smaller set we found no difference, but in the larger set the games between men were, on average, more beautiful than those between women. This suggests that men are more likely to have a better artistic sense in the game and therefore appreciate it more. It might also help to explain the relative non-existence of master female chess problem composers. It follows that, similarly, women may have better artistic senses in other games or domains as compared to men. Video Version: https://www.youtube.com/watch?v=dCElBB6zvZs
A. Iqbal
College of Information Technology, Universiti Tenaga Nasional, Malaysia.
1 INTRODUCTION & REVIEW
Chess is a science, sport and art all at the same time. It has been studied extensively in artificial intelligence (AI), psychology and other fields over the last 60 years, is regulated by the World Chess Federation (FIDE) and recognized by the International Olympic Committee (IOC), and is also featured in specialized magazines and newspapers worldwide as puzzles with creative and artistic solutions.
Chess is dominated by men in the sense that the greatest players (i.e. having the highest Elo rating) have consistently been men. The highest rated chess player of all time, for instance, is Magnus Carlsen of Norway with a peak rating of 2882 in May 2014. The highest rated female player of all time is Judit Polgar of Hungary, with a peak rating of 2735 in 2005.
The difference in ratings between the aforementioned players may not seem like very much, but at the grandmaster level even a few rating points can be difficult to earn and are easily lost. In the top 100 players list for May 2015, Judit Polgar’s rating mentioned above would sit at 23rd place. The highest rated woman on that list is Hou Yifan of China at 55th place, rated 2686 (World Chess Federation, 2015). Viswanathan Anand of India, notably, at the ‘advanced’ age of 45 years, is world number 2 (rated 2804). This is just 5 years away from being eligible as a ‘senior’ category player in chess. Fig. 1 shows photographs of some of these chess icons for the curious reader (Autopilot, 2012; Stefan64, 2008; d'Andorra, 2012).
To our knowledge, there has been no documented research looking specifically at aesthetics or beauty in the game in relation to gender. However, previous work with regard to chess-playing ability and gender has found, for instance, that women are more risk-averse in playing (Gerdes & Gränsmark, 2010), and that more males enter the sport at the lower levels of the game, which translates to more men at the highest levels (Chabris & Glickman, 2006).
Fig. 1. Top male and female players: Magnus Carlsen, Judit Polgar and Hou Yifan.
Some research suggests that women play as well as men only when they are playing against other women but show a performance drop when playing against men (Maass, D'Ettole & Cadinu, 2008). Yet other research suggests that the lower performance of women in chess may be due to statistical sampling, i.e. far fewer women in the game (Bilalić et al., 2009). However, when a veteran male grandmaster in the modern day claims the difference is genetic, there can be a media storm and much controversy (Friedel, 2015).
Our intention in this research is not to go into details about the dynamics and interplay of the gender roles in contrast to other areas where there are such discrepancies. We are simply interested in demonstrating whether the data and experimental results show, quantitatively, that there is indeed such a difference in this particular domain. The results of this research can therefore be used in a larger tapestry discussing ideas related to gender differences in aesthetics. Our background is not in gender studies, so such an exploration of ideas would be better handled by people whose background is; they are free to use our data and results.
In section 2, we explain the experimental materials and assumptions. In section 3, we present the results and a brief discussion. We conclude the paper with section 4, summarizing the main points and suggesting some directions for further work.
2 EXPERIMENTAL SETUP
For our experimental work, we used the updated ChessBase Big Database 2015 (6,251,221 games) as the primary resource for games (ChessBase Shop, 2015a). We also used an experimentally validated computational chess aesthetics model that is able to assess beauty in the game using various domain-related aesthetic principles and themes. Its evaluations with regard to three-movers and studies (a longer type of chess problem) correlate positively and well with domain-competent and expert human assessment.
The model uses well-known aesthetic principles in the game (e.g. economy, sacrifice) and themes (e.g. pin, fork) in combination with stochastic components in order to produce an aesthetics score for a given three-move mate sequence or endgame study. The second time a sequence or study is evaluated, its score may be slightly different, much as a human judge assessing a sequence a second time may change his mind slightly. So typically, one cycle is used for all evaluations. This is not only faster but also more akin to how human judgements are made. The aesthetics model is incorporated into the prototype computer program, Chesthetica, and this software was used to perform the evaluations. Further details about the aesthetics model are available in (Iqbal et al., 2012).
The first task was to filter the 6+ million games in the database to those that ended with the white pieces checkmating black. This could be done automatically in a few hours using the ChessBase 13 software (ChessBase Shop, 2015b). This left 157,358 games. Separating the games based on gender was not relevant at this point because this collection needed to be filtered again for mate-in-3 ‘exclusivity’ (using the Chesthetica software). This means ensuring that the last three moves of the line played in a game are actually a forced mate-in-3 line in that position, and not something that occurred because the winner got lucky or the opponent overlooked a possible defense. Doing so increases the likelihood that more thought and skill went into the actual game and the final winning sequence. A forced line is also typically considered more beautiful than one where the opponent could have defended longer or escaped checkmate. Even though there may be more than one way to checkmate, if the line played was not forced, it suggests a lack of skill or attention on the part of the opponent.
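
The idea of ‘exclusivity’ described above can be illustrated with a minimal sketch. It is not the Chesthetica implementation, which is not described here; it assumes a hypothetical toy game tree (nested dictionaries of made-up move labels) in place of a real chess move generator, and checks that every attacking move in the played line keeps the mate forced against all defensive replies.

```python
MATE = "mate"   # leaf: the side to move has been checkmated
SAFE = "safe"   # leaf: the defender has escaped, no mate follows


def attacker_can_force(node, attacker_moves_left):
    """Attacker to move: True if some move forces mate within the move budget."""
    if not isinstance(node, dict) or attacker_moves_left == 0:
        return False
    return any(defender_is_lost(child, attacker_moves_left - 1)
               for child in node.values())


def defender_is_lost(node, attacker_moves_left):
    """Defender to move: True if already mated, or if every reply still loses."""
    if node == MATE:
        return True
    if not isinstance(node, dict) or not node:
        return False
    return all(attacker_can_force(child, attacker_moves_left)
               for child in node.values())


def played_line_is_forced(root, line, n=3):
    """Check that `line` (attacker first, alternating) is a forced mate in n."""
    node, left = root, n
    for ply, move in enumerate(line):
        if not isinstance(node, dict) or move not in node:
            return False
        child = node[move]
        if ply % 2 == 0:              # attacker's move: mate must stay forced
            if left == 0:
                return False
            left -= 1
            if not defender_is_lost(child, left):
                return False
        node = child
    return node == MATE


# Hypothetical two-move example: after "a1" every defence loses,
# whereas after "b1" the defender could have escaped with "d4".
TOY_TREE = {
    "a1": {"d1": {"a2": MATE}, "d2": {"a3": MATE, "a4": SAFE}},
    "b1": {"d3": {"b2": MATE}, "d4": {"b3": SAFE}},
}
print(played_line_is_forced(TOY_TREE, ["a1", "d1", "a2"], n=2))
print(played_line_is_forced(TOY_TREE, ["b1", "d3", "b2"], n=2))
```

In the second line the game still ended in mate, but only because the defender overlooked an escape, which is the kind of game the exclusivity filter is meant to discard.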
Filtering for mate-in-3 exclusivity took about 5 days of continuous processing on a single desktop computer as each of the 150,000+ games had to be tested for the existence of forced mates and whether the line actually played in the game was one of them. We could not go back further than three moves or test games that did not end in mate, even though the aesthetics model theoretically supports it, because the number of moves one would need to go back cannot be the same for each game and would therefore be arbitrary and lack consistency. Exclusivity filtering left 34,868 games.
From these we were left with the difficult task of isolating games between men and games between women. Curiously, there is no easy way to do this using the aforementioned world-standard chess database management software (ChessBase 13). Furthermore, typical chess player names, especially to those not accustomed to them, often do not clearly reflect the player's gender. So we decided to run a search for the terms “(Women)” and “(Men)” in ‘any field’, which returned the tournaments that were sensible enough to include those terms in their titles. We also got a handful of hits with the term “girls”. Unfortunately, the term “boys” returned nothing as tournament names tend not to feature that word.
The result was 1,069 games between women (or girls) and only 115 games between men (or boys). There was no filtering based on age or playing strength as this study is concerned more with gender differences and aesthetic quality of play. We also did not have too many games left to work with. We managed, however, to identify enough additional games between males to bring the 115 set to 1,069 as well. We also created a random subset of the 1,069 games between females consisting of just 115 games. The result was two pairs of sets, games between men and games between women, with matching sample sizes of 115 and 1,069 games respectively. We evaluated the smaller sample first using Chesthetica.
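
Drawing the 115-game subset from the 1,069 games between women can be sketched as follows. This is only an illustration: the game identifiers, the function name and the fixed seed are assumptions made for reproducibility of the example, not details of how the subset in this study was drawn.

```python
import random


def balanced_subset(games, size, seed=0):
    """Return a reproducible random subset of `size` games, without replacement."""
    if size > len(games):
        raise ValueError("subset size exceeds the number of available games")
    rng = random.Random(seed)          # local generator, so the draw is repeatable
    return rng.sample(games, size)


# Hypothetical stand-ins for the 1,069 game records between women.
women_games = [f"game_{i}" for i in range(1069)]
small_women_set = balanced_subset(women_games, 115)
print(len(small_women_set))
```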
It is worth noting that the aesthetic evaluation of thousands of chess positions like these by human experts would not be cost-effective, feasible in a reasonable amount of time, or even consistent and reliable. This sort of experiment can only be carried out using a computational aesthetics model. For the statistical testing of means, an F-test was first performed on a pair of samples to determine if the two-tailed T-test should assume equal (TTEV) or unequal variances (TTUV). A significance level of 5% was used for all tests.
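
The two-step procedure described above (variance check first, then the matching t-test) can be sketched in plain Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, the critical F value is supplied by the caller (e.g. from a table) rather than computed, and only the test statistic and degrees of freedom are returned, not the p-value.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio(a, b):
    """F statistic: the larger sample variance divided by the smaller one."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


def t_equal_var(a, b):
    """Student's t with pooled variance (TTEV); returns (t, df)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def t_unequal_var(a, b):
    """Welch's t (TTUV) with Welch-Satterthwaite df; returns (t, df)."""
    na, nb = len(a), len(b)
    sa, sb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


def compare_means(a, b, f_critical):
    """Use TTUV if the variance ratio exceeds f_critical, otherwise TTEV."""
    if variance_ratio(a, b) > f_critical:
        return ("TTUV",) + t_unequal_var(a, b)
    return ("TTEV",) + t_equal_var(a, b)


# Made-up aesthetic scores, purely for illustration.
set_a = [1.9, 1.7, 2.1, 1.6, 1.8]
set_b = [1.7, 1.8, 1.6, 1.9, 1.5]
print(compare_means(set_a, set_b, f_critical=6.39))
```

In practice a statistics package would also return the p-value to compare against the 5% significance level.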
3 RESULTS & DISCUSSION
The experimental results we obtained were interesting. The first set of 115 games between men and 115 games between women was aesthetically analyzed using Chesthetica and we found the mean for the former was 1.847 and for the latter, 1.810. The difference (TTUV) was not statistically significant. The aesthetics score is typically used for ranking purposes so even a numerically small difference would rank one composition or game ahead of another (or the average for a set ahead of another set). The second set of 1,069 games was analyzed and the mean for the games between men was 1.769 and the mean for the games between women was 1.720. The difference (TTEV) this time was statistically significant: t(2136) = -2.094, P = 0.036.
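
As a rough consistency check on the figures reported for the larger set, the degrees of freedom and the two-tailed P value can be recomputed from the sample sizes and the t statistic alone. The sketch below assumes that, with over two thousand degrees of freedom, the t distribution is close enough to a standard normal for this purpose; the function name is hypothetical.

```python
from statistics import NormalDist


def two_tailed_p_normal(t):
    """Two-tailed p-value using a standard normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


n_men = n_women = 1069
df = n_men + n_women - 2            # pooled (equal-variance) degrees of freedom
p = two_tailed_p_normal(-2.094)     # t statistic as reported in the text
print(df, round(p, 3))
```

Under that approximation the sketch gives 2136 degrees of freedom and a P value that rounds to 0.036, in line with the reported result.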
So the larger set exposed a significant difference in average aesthetic quality between games between females and games between males or, to be precise, between their critical last three forced winning moves. What does this mean? In order to put these results in perspective, consider the data in Table 1. The first two columns show the results just explained. The third column is a collection of 1,069 games of the same kind between strong players (not exclusively men, but largely) with an Elo rating above 2500 taken from Big Database 2011. The fourth column is a collection of 1,069 published chess compositions (three-movers) taken from the Meson Database (The BDS Website, 2015). Chess compositions are usually composed with aesthetics in mind.
Table 1. Average aesthetic scores for various sets.
Women vs. Women    Men vs. Men    Elo > 2500    #3 Compositions
1.720              1.769          1.789         2.281
A single factor analysis of variance (ANOVA) test was performed across all four sets comparing the aesthetic means and the differences were found to be statistically significant: F(3, 4272) = 254.95, p < 0.001. This suggests that games between stronger players rank higher aesthetically than those between weaker players. It also shows, as expected, that published chess problems rank the highest. We used the ANOVA test because it is suitable for including all available data and comparing group means. If there were just two sets to compare, for example, we might have used a TTEV, TTUV or even the Mann-Whitney U test.
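
For readers unfamiliar with the test, a single factor ANOVA can be sketched in a few lines of plain Python. The function name and the sample data are hypothetical; the sketch returns only the F statistic and its degrees of freedom, leaving the p-value to a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up scores for three small groups, purely for illustration.
print(one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]))
```

With four sets of 1,069 scores each, the degrees of freedom come out as 4 - 1 = 3 and 4 × 1,069 - 4 = 4,272, matching the F(3, 4272) reported above.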
Consider the data in Table 2, which shows the results of two sets of 1,069 games taken from Big Database 2011. The column to the right is a set of games where the mates were exclusive and forced whereas the column to the left is a set where there was no forced mate in the position yet white won anyway. It is generally accepted that such positions, i.e. where there was no forced win, are considered less beautiful, even in real games. We could only test using the games of weak players (irrespective of gender, by the way) because unforced wins seldom occur at high levels of play, especially with games that end in mate.
Table 2. Average aesthetic scores of forced vs. unforced wins.
Elo < 1500 (Forced)
1.702
The difference (TTUV) in mean aesthetics here was also statistically significant: t(2114) = 3.956, P < 0.001. This suggests, as expected, that games where the checkmate did not stem from a forced position should rank lower aesthetically than those where the checkmate was forced. Given the results in Table 1, the idea that aesthetics tends to improve with playing quality is also supported. Even very strong players can only wish they played games where the win actually resembled a chess composition. Finally, consider the data in Table 3. This shows the aesthetic scores of 1,069 games randomly selected from a collection of games between two chess engines (i.e. Rybka 3 and Fritz 8) at 10 minutes + 10 seconds time controls and 1 minute + 1 second time controls.
Table 3. Average aesthetic scores of games between chess engines.
Rybka 3 vs. Fritz 8 (10+10)    Rybka 3 vs. Fritz 8 (1+1)
1.979                          1.992
The assumption might have been that, given less time to make a move (i.e. 1+1), the quality of the wins would be lower, like it is with human players. However, the difference in means (TTEV) was not significant. This suggests that computers, regardless of their playing strength and ‘experience’ (if any), have the same style of play or perhaps just no conscious or unconscious appreciation of art and how to win in a more appealing way, if possible. This style and appreciation for beauty is something that usually comes with time and experience in human players.
It is still interesting, however, that on average, the aesthetic quality of engine vs. engine games is higher than even that of games between strong players (1.789) but expectedly lower than that of chess compositions by experienced humans (2.281). We confirmed this using an ANOVA test as before on the four relevant sets: F(3, 4272) = 190.1, p < 0.001. Games between chess engines may be more aesthetic or beautiful because they probably stem from more precise and logical moves which tend to lead to better economy of pieces in the checkmate (a known principle of beauty in the game). Both games between chess engines and between human experts, however, tend to lack the artistry and paradoxical nature of chess problems, which is what makes the latter so appealing and an art form.
Returning to the question of the aesthetic quality of games between women as compared to between men, the experimental evidence would suggest that games between men do indeed rank higher than games between women in terms of beauty. This may not be evident, however, using relatively small sets of games (i.e. around 100). Using a set of at least 1,000 games is recommended, depending on availability. Playing strength may be regarded as a separate issue.
Do the results then imply that women have less artistic appreciation of the game or play in a less artistic fashion even though they may be good players? Perhaps. It would help explain the relative non-existence of master female composers of chess problems and the lack of notable award-winning ‘brilliant’ games between women, such as Adolf Anderssen’s ‘Immortal Game’ and Bobby Fischer’s ‘Game of the Century’.
Are there or can there be exceptions to the rule? Certainly, as with most things. Logically, it would also follow that there are likely domains where women fare better aesthetically than men. Understanding these domains and differences better would add to our body of knowledge about the human brain and gender differences. It may even help optimize human performance in domains where equal gender distribution is not a requirement.
4 CONCLUSIONS
In this research we have shown experimentally that games between female players tend to be of lower aesthetic quality than games between male players. This is likely true, at least, in games that end decisively in checkmate and where the win can be traced back to three forced moves (as explained in section 2). We are unable to draw any real conclusions about playing strength from the experiments.
How might this finding be useful? For one thing, it may be of interest to psychologists who study aesthetic perception in males and females. Chess, as a domain of investigation, would be a place to investigate why females tend to lack artistry, or at least rank lower in artistry, compared to males.
Likewise, there are probably domains where females fare better artistically. Neuroscientists may be able to provide more insight into that question after designing and conducting the right experiments. In terms of aesthetics in chess, the difference discovered in this research may or may not extend to games that do not end in mate but, as mentioned earlier, the longer the sequence that needs to be investigated, the less reliable the experiment. There is little reason to believe it does not, however. The availability of data is also a significant issue. For instance, there are simply not enough published compositions by females (of the same level) to compare against those by men, so chess aesthetics in one of its finer forms cannot, at this time, be investigated for consistency with the finding of this research.
In general, what we have demonstrated should not be taken too seriously as it merely opens a point of inquiry related to the game that may not have been adequately considered before. The technology to investigate such a question has only relatively recently become available and computers today can already compose interesting chess problems of some aesthetic merit on their own by integrating information from various domains. Hundreds of examples of computer-generated chess problem compositions are available at (Iqbal, 2015), for instance. So we are optimistic about further technological progress in this area and related ones in the years to come.
Whether or not the difference between the genders identified in this research can be confirmed or stands the test of time remains to be seen, however. It should also be interesting to explore if the more established question of a difference in playing strength between males and females relates in any way to the aesthetics of gameplay. This is because it is often assumed that stronger players play more effectively and effectiveness tends to correlate with aesthetics.
ACKNOWLEDGEMENTS
We would like to thank Frederic Friedel (co-founder of ChessBase) and Woman Grandmaster Jana Krivec for their assistance. This research was sponsored, in part, by the Universiti Tenaga Nasional grant, J510050547.
REFERENCES
1. Autopilot (2012). http://upload.wikimedia.org/wikipedia/commons/0/04/Magnus_Carlsen_cropped.jpg, 27 April.
2. Bilalić, M., Smallbone, K., McLeod, P., & Gobet, F. (2009). Why are (the Best) Women So Good at Chess? Participation Rates and Gender Differences in Intellectual Domains. Proceedings of the Royal Society B: Biological Sciences, 276(1659), 1161-1165.
3. Chabris, C. F., & Glickman, M. E. (2006). Sex Differences in Intellectual Performance: Analysis of a Large Cohort of Competitive Chess Players. Psychological Science, 17(12), 1040-1046.
4. ChessBase Shop (2015a). Big Database 2015, https://shop.chessbase.com/en/products/big_database_2015, ChessBase, Hamburg, Germany.
5. ChessBase Shop (2015b). ChessBase 13, http://shop.chessbase.com/en/products/chessbase13_starter_package_engl, ChessBase, Hamburg, Germany.
6. d'Andorra, Federació d'Escacs Valls (2012). http://www.flickr.com/photos/feva/7930695086, 4 September.
7. Friedel, F. (2015). Chess Gender Debate in the International Press, ChessBase News, 21 April, http://en.chessbase.com/post/chess-gender-debate-in-the-international-press, Hamburg, Germany.
8. Gerdes, C., & Gränsmark, P. (2010). Strategic Behavior Across Gender: A Comparison of Female and Male Expert Chess Players. Labour Economics, 17(5), 766-775.
9. Iqbal, A. (2015). YouTube. https://www.youtube.com/c/AzlanIqbal
10. Iqbal, A., van der Heijden, H., Guid, M., & Makhmali, A. (2012). Evaluating the Aesthetics of Endgame Studies: A Computational Model of Human Aesthetic Perception. Computational Intelligence and AI in Games, IEEE Transactions on, 4(3), 178-191.
11. Maass, A., D'Ettole, C., & Cadinu, M. (2008). Checkmate? The Role of Gender Stereotypes in the Ultimate Intellectual Sport. European Journal of Social Psychology, 38(2), 231-245.
12. Stefan64 (2008). http://upload.wikimedia.org/wikipedia/commons/c/ce/Judit_The_Look_Polgar.jpg.
13. The BDS Website (2015). Meson Introduction, http://www.bstephen.me.uk/index.php/meson
14. World Chess Federation (2015). Standard Top 100 Players. https://ratings.fide.com/top.phtml?list=men