# Move-by-Move Dynamics of the Advantage in Chess Matches Reveals Population-Level Learning of the Game.

**ABSTRACT** The complexity of chess matches has attracted broad interest since the game's invention. This complexity and the availability of a large number of recorded matches make chess an ideal model system for the study of population-level learning of a complex system. We systematically investigate the move-by-move dynamics of the white player's advantage from over seventy thousand high level chess matches spanning over 150 years. We find that the average advantage of the white player is positive and that it has been increasing over time. Currently, the average advantage of the white player is ∼0.17 pawns but it is exponentially approaching a value of 0.23 pawns with a characteristic time scale of 67 years. We also study the diffusion of the move dependence of the white player's advantage and find that it is non-Gaussian, has long-ranged anti-correlations and that after an initial period with no diffusion it becomes super-diffusive. We find that the duration of the non-diffusive period, corresponding to the opening stage of a match, is increasing in length and exponentially approaching a value of 15.6 moves with a characteristic time scale of 130 years. We interpret these two trends as resulting from learning of the features of the game. Additionally, we find that the exponent α characterizing the super-diffusive regime is increasing toward a value of 1.9, close to the ballistic regime. We suggest that this trend is due to the increased broadening of the range of abilities of chess players participating in major tournaments.


H. V. Ribeiro¹,²,∗, R. S. Mendes¹, E. K. Lenzi¹, M. del Castillo-Mussot³, and L. A. N. Amaral²,⁴,†

¹ Departamento de Física and National Institute of Science and Technology for Complex Systems, Universidade Estadual de Maringá, Maringá, PR 87020, Brazil

² Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA

³ Departamento de Estado Sólido, Instituto de Física, Universidad Nacional Autónoma de México, Distrito Federal, México

⁴ Northwestern Institute on Complex Systems (NICO), Northwestern University, Evanston, IL 60208, USA


Introduction

The study of biological and social complex systems has been the focus of intense interest for at least three decades [1]. Elections [2], popularity [3], population growth [4], collective motion of birds [5] and bacteria [6] are just some examples of complex systems that physicists have tackled in these pages. An aspect rarely studied due to the lack of enough data over a long enough period is the manner in which agents learn the best strategies to deal with the complexity of the system. For example, as the number of scientific publications increases, researchers must learn how to choose which papers to read in depth [7]; while in earlier times word-of-mouth or listening to a colleague’s talk were reliable strategies, nowadays the journal in which the study was published or the number of citations have become, in spite of their many caveats, indicators that seem to be gaining in popularity.

In order to understand how population-level learning occurs in the “real world,” we study it here in a model system. Chess is a board game that has fascinated humans ever since its invention in sixth-century India [8]. Chess is an extraordinarily complex game with $10^{43}$ legal positions and $10^{120}$ distinct matches, as roughly estimated by Shannon [9]. Recently, Blasius and Tönjes [10] have shown that scale-free distributions naturally emerge in the branching process in the game tree of the first game moves in chess. Remarkably, this breadth of possibilities emerges from a small set of well-defined rules. This marriage of simple rules and complex outcomes has made chess an excellent test bed for studying cognitive processes such as learning [11, 12] and also for testing artificial intelligence algorithms such as evolutionary algorithms [13].

∗Electronic address: hvr@dfi.uem.br

†Electronic address: amaral@northwestern.edu

arXiv:1212.2787v1 [physics.data-an] 12 Dec 2012

The very best chess players can foresee the development of a match 10–15 moves into the future, thus making appropriate decisions based on their expectations of what the opponent will do. Even though supercomputers can execute many more calculations and hold much more information in a quickly accessible mode, it was not until heuristic rules were developed to prune the set of possibilities that computers became able to consistently beat human players. Nowadays, even mobile chess programs such as Pocket Fritz™ (http://chessbase-shop.com/en/products/pocket_fritz_4) have an Elo rating [14] of 2938, which is higher than that of the current best chess player (Magnus Carlsen, with an Elo rating of 2835; see http://fide.com).

The ability of many chess engines to accurately evaluate the strength of a position enables us to numerically evaluate the move-by-move white player advantage A(m) and to determine the evolution of the advantage during the course of a chess match. In this way, we can probe the patterns of the game to a degree not before possible and can attempt to uncover population-level learning in the historical evolution of chess match dynamics. Here, we focus on the dynamical aspects of the game by studying the move-by-move dynamics of the white player’s advantage A(m) from over seventy thousand high level chess matches.

We have accessed the portable game notation (PGN) files of 73,444 high level chess matches made freely available by PGN Mentor™ (http://www.pgnmentor.com). These data span the last two centuries of chess history and cover the most important worldwide chess tournaments, including the World Championships, Candidate Tournaments, and the Linares Tournaments (see supplementary Table 1). White won 33% of these matches, black won 24% and 43% ended in a draw. For each of these 73,444 matches, we estimated A(m) using the Crafty™ [15] chess engine, which has an Elo rating of 2950 (see Methods Section A). The white player advantage A(m) takes into account the differences in the number and the value of pieces, as well as the advantage related to the placement of pieces. It is usually measured in units of pawns, meaning that in the absence of other factors, it varies by one unit when a pawn (the piece with the lowest value) is captured. A positive value indicates that the white player has the advantage and a negative one indicates that the black player has the advantage. Figure 1A illustrates the move dependence of A for 50 matches selected at random from the database. Intriguingly, A(m) visually resembles the “erratic” movement of diffusive particles.

Results

We first determined how the mean value of the advantage depends on the move number m across all matches with the same outcome (Fig. 1B). We observed an oscillatory behavior around a positive value with a period of 1 move for both match outcomes. This oscillatory behavior reflects the natural progression of a match, that is, the fact that the players alternate moves. Not surprisingly, for matches ending in a draw the average oscillates around an almost stable value, while for white wins it increases systematically and for black wins it decreases systematically.
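The averaging over matches described above can be sketched as follows. This is a minimal illustration, not the authors’ code: the function names and the toy trajectories are hypothetical, and each trajectory is assumed to be a list of engine evaluations sampled once per half-move. The second helper applies a window-2 moving average, the smoothing the text later uses for visualization, which cancels an oscillation with a period of one move.

```python
from statistics import fmean


def mean_advantage_by_move(trajectories):
    """Average A over all matches still in progress at each half-move index."""
    longest = max(len(t) for t in trajectories)
    means = []
    for k in range(longest):
        # Shorter matches simply stop contributing once they have ended.
        values = [t[k] for t in trajectories if len(t) > k]
        means.append(fmean(values))
    return means


def moving_average(values, window=2):
    """Simple moving average; window=2 averages each pair of consecutive points."""
    return [fmean(values[i:i + window]) for i in range(len(values) - window + 1)]


# Toy trajectories (invented numbers, in pawn units) of different lengths.
toy = [
    [0.2, 0.1, 0.3, 0.1],
    [0.4, 0.1, 0.3],
    [0.3, 0.1],
]
means = mean_advantage_by_move(toy)           # alternates between about 0.3 and 0.1
smoothed = moving_average(means, window=2)    # the alternation averages out
print(means, smoothed)
```

In this toy input the mean alternates between two levels from one half-move to the next, and the smoothed curve is flat.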

Figure 1B suggests an answer to a historical debate among chess players: Does playing white yield an advantage? Some players and theorists argue that because the white player starts the game, white has the “initiative,” and that black must endeavor to equalize the situation. Others argue that playing black is advantageous because white has to reveal the first move. Chess experts usually mention that white wins more matches as evidence of this advantage. However, the winning percentage does not indicate the magnitude of this advantage. In our analysis, we not only confirm the existence of an advantage in playing white, but also estimate its value as 0.14 ± 0.01 by averaging the values of the mean for matches ending in draws.

FIG. 1: Diffusive dynamics of white player’s advantage. (A) Evolution of the advantage A(m) for 50 matches selected at random. We highlight the trajectories from three World Chess Championship matches: the 6th match between Anand (playing white) and Kramnik in 2008 (green line), the 2nd match between Karpov (playing white) and Kasparov in 1985 (red line), and the 1st match between Spassky (playing white) and Petrosian in 1969 (blue line). (B) Mean value of the advantage as a function of move number for matches ending in draws (squares), white wins (circles) and black wins (triangles). Note the systematically alternating values and the initial positive values of these means for all outcomes. For white wins, the mean advantage increases with m, while for black wins it decreases. For draws, the mean advantage is approximately a positive constant. We estimated the advantage of playing white to be 0.14 ± 0.01 and the horizontal dashed line represents this value. (C) Variance of the advantage as a function of move number for matches ending in draws (squares), white wins (circles) and black wins (triangles). Note the very similar profile of the variance for white and black wins. Note also that there is practically no diffusion for the initial 7–10 moves, corresponding to the opening period, a very well studied stage of the game. After the opening stage, the trajectories exhibit a faster than diffusive spreading. For draws, we find this second regime (10 < m < 100) to be superdiffusive and characterized by an exponent α = 1.49 ± 0.01, as shown by the dashed line. For wins, the variance presents a more complex behavior. For 10 ≲ m ≲ 40 the variance increases faster than ballistic (hyper-diffusion), but for later stages it displays a behavior similar to that found for draws. (D) Variance of advantage evaluated after grouping the matches by length and outcome. For draws (continuous lines), the different match lengths do not change the power-law dependence of the variance. For wins (dashed lines), the variance systematically approaches the profile obtained for draws as the matches become longer. We further note the existence of a very fast diffusive regime for the last moves of each grouping.

[Legend values recovered from the figure: 15-18, 18-23, 23-29, 29-37, 37-47, 47-59, 59-75, 75-95, 95-119.]

We next investigated the diffusive behavior by evaluating the dependence of the variance of A on the move number m (Fig. 1C). After grouping the matches by match outcome, we observed for all outcomes that there is practically no diffusion during the initial moves. These moves correspond to the opening period of the match, a stage very well studied and for which there are recognized sequences of moves that result in balanced positions. After this initial stage, the variance exhibits an anomalous diffusive spreading. For matches ending in a draw, we found a super-diffusive regime (10 < m < 100) that is described by a power law with an exponent α = 1.49 ± 0.01. We note the very similar profile of the variance of matches ending in white or black wins.

Matches ending in a win display a hyper-diffusive regime (α > 2), a signature of nonlinearity and out-of-equilibrium systems [16]. In fact, the behavior for matches ending in wins is quite complex and dependent on the match length (Fig. 1D). While grouping the matches by length does not change the variance profile of draws, for wins it reveals a very interesting pattern: As the match length increases, the variance profile becomes similar to the profile of draws, with the only differences occurring in the last moves. This result thus suggests that the behavior of the advantage of matches ending in a win is very similar to that of a draw. The main difference occurs in the last few moves, where an avalanche-like effect makes the advantage undergo large fluctuations.

FIG. 2: Historical changes in chess player demographics. (A) Number of new Chess Grandmaster titles awarded annually by the world chess organization (http://fide.com) and the number of players who have participated in the Chess Olympiad (http://www.olimpbase.org) since 1970. Note the increasing trends in these quantities. (B) Average players’ age when receiving the Grandmaster title. (C) Average Elo rating and (D) standard deviation of the Elo rating of players who have participated in the Chess Olympiad. Note the nearly constant value of the average, while the standard deviation has increased dramatically.

[Panel A legend recovered from the figure: Grandmasters; Olympic Players (×10²).]
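The variance analysis described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors’ pipeline: the function names are hypothetical, all trajectories are assumed to have equal length, and the exponent α is estimated by an ordinary least-squares fit of log variance against log move number. The synthetic input is an unbiased random walk, for which normal diffusion (α close to 1) is expected.

```python
import math
import random
from statistics import pvariance


def variance_by_move(trajectories):
    """Ensemble variance of A at each move index (equal-length trajectories)."""
    return [pvariance(column) for column in zip(*trajectories)]


def power_law_exponent(ms, variances):
    """Least-squares slope of log(variance) versus log(m)."""
    xs = [math.log(m) for m in ms]
    ys = [math.log(v) for v in variances]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Synthetic ensemble: unbiased random walks (step size 0.1 is an arbitrary choice).
rng = random.Random(0)
n_walks, n_steps = 2000, 100
walks = []
for _ in range(n_walks):
    a, path = 0.0, []
    for _ in range(n_steps):
        a += rng.gauss(0.0, 0.1)
        path.append(a)
    walks.append(path)

variances = variance_by_move(walks)
ms = range(10, n_steps + 1)                 # fit window, skipping the first steps
alpha = power_law_exponent(ms, variances[9:])
print(round(alpha, 2))
```

Applied to real data, the fit window would have to be restricted to the regime of interest, for example 10 < m < 100 for draws as in the text.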

Historical Trends

Chess rules have been stable since the 19th century. This stability increased the game’s popularity (Fig. 2A) and enabled players to work toward improving their skill. A consequence of these efforts is the increasing number of Grandmasters (the highest title that a player can attain) and the decreasing average age at which players receive this honor (Figs. 2A and 2B). Intriguingly, the average player’s fitness (measured as the Elo rating [14]) in Olympic tournaments has remained almost constant, while the standard deviation of the players’ fitness has increased fivefold (Figs. 2C and 2D). These historical trends prompt the question of whether there has been a change in the diffusive behavior of the match dynamics over the last 150 years.

To answer this question, we investigated the evolution of the profile of the mean advantage for different periods (Fig. 3A). For easier visualization, we applied a moving average with window size two to the mean values of A(m). The horizontal lines show the average values of the means for 20 < m < 40 and the shaded areas are 95% confidence intervals obtained via bootstrapping. The average values are significantly different, showing that the baseline white player advantage has increased over the last 150 years. We found that this increase is well described by an exponential approach with a characteristic time of 67.0 ± 0.1 years to an asymptotic value of 0.23 ± 0.01 pawns (Fig. 3C). Our results thus suggest that chess players are learning how to maximize the advantage of playing white and that this advantage is bounded.

FIG. 3: Historical trends in the dynamics of highest level chess matches. (A) Mean value of the advantage of matches ending in a draw for three time periods. These curves were smoothed by using a moving average over windows of size 2. The horizontal lines are the averaged values of the mean for 20 < m < 40 and the shaded regions are 95% confidence intervals for these averaged values. (B) Variance of the advantage of matches ending in a draw for three time periods. The shaded regions are 95% confidence intervals for the variance and the colored dashed lines indicate power law fits to each data set. The horizontal dashed line represents the average variance for the most recent data set and for 1 < m < 10. Note the systematic increase of α and of the number of moves in the opening. The symbols on this line indicate the values of $m_\times$, the number of moves at which the diffusion of the advantage changes behavior. The rightmost symbol represents the extrapolated maximum value $m_\times$ = 15.6 ± 0.6. (C) Time evolution of the white player advantage for matches ending in draws. The solid line represents an exponential approach to an asymptotic value. The estimated plateau value is 0.23 ± 0.01 pawns and the characteristic time is 67.0 ± 0.1 years. Time evolution of (D) the exponent α and (E) the crossover move $m_\times$. The solid lines are fits to exponential approaches to the asymptotic values α = 1.9 ± 0.1 and $m_\times$ = 15.6 ± 0.6. The estimated characteristic times for convergence are 128 ± 9 years for the diffusive exponent and 130 ± 12 years for the crossover move. Based on the conjecture that α and $m_\times$ are approaching limiting values, we plotted a continuous line in Fig. 3B to represent this limiting regime.
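An exponential approach to a plateau, as reported for the white advantage, can be fitted with a simple scan over the characteristic time. The sketch below is only an illustration under assumptions: the text does not state the fitting procedure, the function name is hypothetical, and the data are synthetic. The plateau (0.23 pawns) and characteristic time (67 years) are the values quoted in the text, while the amplitude and the time origin are invented. For each candidate τ the model y(t) = y_inf + b·exp(−t/τ) is linear in (y_inf, b), so those two parameters follow from ordinary least squares.

```python
import math


def fit_exponential_approach(ts, ys, taus):
    """Fit y(t) = y_inf + b * exp(-t / tau) by scanning tau over a grid."""
    n = len(ts)
    y_bar = sum(ys) / n
    best = None
    for tau in taus:
        es = [math.exp(-t / tau) for t in ts]
        e_bar = sum(es) / n
        see = sum((e - e_bar) ** 2 for e in es)
        sey = sum((e - e_bar) * (y - y_bar) for e, y in zip(es, ys))
        b = sey / see                      # least-squares amplitude for this tau
        y_inf = y_bar - b * e_bar          # least-squares plateau for this tau
        sse = sum((y_inf + b * e - y) ** 2 for e, y in zip(es, ys))
        if best is None or sse < best[0]:
            best = (sse, tau, y_inf, b)
    _, tau, y_inf, b = best
    return tau, y_inf, b


# Synthetic, noise-free data; years are counted from an arbitrary origin.
years = list(range(0, 151, 10))
truth = [0.23 - 0.18 * math.exp(-t / 67.0) for t in years]

tau, plateau, amplitude = fit_exponential_approach(years, truth, taus=range(20, 201))
print(tau, round(plateau, 3), round(amplitude, 3))
```

With noisy data the grid resolution and the range of τ limit the precision of the estimate, and uncertainties would need a separate procedure such as bootstrapping.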

Next, we considered the time evolution of the variance for matches ending in draws (Fig. 3B). Surprisingly, α seems to be approaching a value close to that for a ballistic regime. We found that the exponent α follows an exponential approach with a characteristic time of 128 ± 9 years to the asymptote α = 1.9 ± 0.1 (Fig. 3D). We surmise that this trend is directly connected to an increase in the typical difference in fitness among players. Specifically, the presence of fitness in a diffusive process has been shown to give rise to ballistic diffusion [17]. For an illustration of how differences in fitness are related to a ballistic regime (α = 2), assume that

$$A_i(m + 0.5) = A_i(m) + \Phi_i + \eta(m) \tag{1}$$

describes the advantage of the white player in a match i, where the difference in fitness between the two players is $\Phi_i$ and η(m) is a Gaussian variable. $\Phi_i > 0$ yields a positive drift in $A_i(m)$, thus modeling a match where the white player is better. Assuming that the fitness $\Phi_i$ is drawn from a distribution with finite variance $\sigma_\Phi^2$, the drift contributes a term to the variance that grows as $m^2$, whereas independent noise contributes a term that grows only linearly in m. For large m it therefore follows that

$$\sigma^2(m) \sim \sigma_\Phi^2 \, m^2. \tag{2}$$

Thus, α = 2. In the case of chess, the diffusive scenario is not determined purely by the fitness of players. However, differences in fitness are certainly an essential ingredient and thus Eq. (1) can provide insight into the data of Fig. 3D by suggesting that the typical difference in skill between players has been increasing.

A striking feature of the results of Fig. 3B is the drift of the crossover move $m_\times$ at which the power-law regime begins. We observe that $m_\times$ is exponentially approaching an asymptote at 15.6 ± 0.6 moves with a characteristic time of 130 ± 12 years (Fig. 3E). Based on the existence of limiting values for α and $m_\times$, we plot in Figure 3B an extrapolated power law to represent the limiting diffusive regime (continuous line). We have also found that the distributions of the match lengths for wins and draws display exponential decays with characteristic lengths of 13.22 ± 0.02 moves for draws and 11.20 ± 0.02 moves for wins. Moreover, we find that these characteristic lengths have changed over the history of chess. For matches ending in draws, we observed a statistically significant growth of approximately 3.0 ± 0.7 moves per century. For wins, we find no statistical evidence of growth and the characteristic length can be approximated by a constant mean of 11.3 ± 0.6 moves (supplementary Fig. 1).

A question posed by the time evolution of these quantities is whether the observed changes are due to learning by chess players over time or due to a secondary factor such as changes in the organization of chess tournaments. In order to determine the answer to this question, we analyze the type of tournaments included in the database. We find that 88% of the tournaments in the database use “round-robin” pairing (all-play-all) and that there has been an increasing tendency to employ this pairing scheme (supplementary Fig. 2). In order to further strengthen our conclusions, we analyze the matches in the database obtained by excluding tournaments that do not use round-robin pairing. This procedure has the advantage that it reduces the effect of non-random sampling. As shown in supplementary Fig. 3, this procedure does not change the results of our analyses.

We next studied the distribution profile of the advantage. We use the normalized advantage

$$\xi(m) = \frac{A(m) - \langle A(m) \rangle}{\sigma(m)}, \tag{3}$$

where ⟨A(m)⟩ is the mean value of the advantage after m moves and σ(m) is the standard deviation. Figures 4A and 4B show the positive tails of the cumulative distribution of ξ(m) for draws and wins for 10 ≤ m ≤ 70. We observe a good data collapse, which indicates that the advantages are statistically self-similar, since after scaling they follow the same universal distribution. Moreover, Figs. 4C and 4D show that the distribution profile of the normalized advantage is quite stable over the last 150 years. These distributions obey a functional form that is significantly different from a Gaussian distribution (dashed line in the previous plots). In particular, we observe a more slowly decaying tail, showing the existence of large fluctuations even for matches ending in draws.

FIG. 4: Scale invariance and non-Gaussian properties of the white player’s advantage. Positive tails of the cumulative distribution function for the normalized advantage ξ(m) for matches ending in (A) draws and (B) wins. Each line in these plots represents a distribution for a different value of m in the range 10 to 70. By match outcome, the distributions for different values of m exhibit a good data collapse with tails that decay more slowly than a Gaussian distribution (dashed line). Average cumulative distribution for matches ending in (C) draws and (D) wins for four time periods. We estimated the error bars using bootstrapping. These data support the hypothesis of scaling, that is, the distributions follow a universal non-Gaussian functional form. The negative tails present a very similar shape (see supplementary Fig. 4).

Another intriguing question is whether there is memory in the evolution of the white player’s advantage. To investigate this hypothesis, we consider the time series of advantage increments ∆A(m) = A(m + 0.5) − A(m) for all 5,154 matches ending in a draw that are longer than 50 moves. We used detrended fluctuation analysis (DFA, see Methods Section B) to obtain the Hurst exponent for each match (Fig. 5A). We find h distributed around 0.35 (Fig. 5B), which indicates the presence of long-range anti-correlations in the evolution of A(m). A value of h < 0.5 indicates the presence of an anti-persistent behavior, that is, the alternation between large and small values of ∆A(m) occurs much more frequently than by chance. This result also agrees with the oscillating behavior of the mean advantage (Fig. 1B). We also find that the Hurst exponent h has evolved over time (Fig. 5C). In particular, we note that the anti-persistent behavior has increased significantly in the two most recent periods, indicating that the alternating behavior has intensified in this period. We have found a very similar behavior for matches ending in wins after removing the last few moves in the match (supplementary Fig. 5).
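The normalization of Eq. (3), used earlier in this section to test for data collapse, can be sketched as follows. This is a minimal illustration with hypothetical names and invented numbers. It assumes equal-length trajectories and uses the population standard deviation, a choice the text does not specify. The move indices must be restricted to those with nonzero σ(m), which is consistent with the range 10 ≤ m ≤ 70 used in the text, where diffusion has already set in.

```python
from statistics import fmean, pstdev


def normalized_advantage(trajectories):
    """Return xi[i][k] = (A_i(k) - <A(k)>) / sigma(k) for equal-length trajectories."""
    columns = list(zip(*trajectories))
    means = [fmean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]   # population standard deviation (assumption)
    return [
        [(a - mu) / sd for a, mu, sd in zip(t, means, sigmas)]
        for t in trajectories
    ]


# Toy ensemble: three matches, three move indices (invented values in pawn units).
toy = [
    [0.1, 0.5, 1.0],
    [0.3, 0.1, -0.2],
    [0.2, 0.3, 0.7],
]
xi = normalized_advantage(toy)
for row in xi:
    print([round(v, 3) for v in row])
```

After this rescaling every move index has zero mean and unit variance across matches, so the distributions for different m can be compared on a common axis.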

Discussion

We have characterized the advantage dynamics of chess matches as a self-similar, super-diffusive and long-ranged-memory process. Our investigation provides insights into the complex process of creating and disseminating knowledge of a complex system at the population level. By studying 150 years of high level chess, we presented evidence that the dynamics of chess matches have evolved over time in such a way that they appear to be approaching a steady state. The baseline advantage of the white player, the cross-over move $m_\times$, and the diffusive exponent α are exponentially approaching asymptotes with different characteristic times. We hypothesized that the evolution of α is closely related to an increase in the difference of fitness among players, while the evolution of the baseline advantage of the white player indicates that players are learning better ways to exploit this advantage. The increase in the cross-over move $m_\times$ suggests that the opening stage of a match is becoming longer, which may also be related to a collective learning process. As discussed earlier, hypothesized historical changes in pairing scheme during tournaments cannot explain these findings.
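The link between fitness differences and the diffusive exponent, which the drift model of Eq. (1) is meant to illustrate, can be checked numerically. The sketch below is a toy simulation under assumptions, not a model fitted to the data: the function names, the noise level (0.1), the spread of Φ (0.05) and the ensemble size are all arbitrary choices, and Φ is drawn from a zero-mean Gaussian. Each step of the loop corresponds to one half-move.

```python
import math
import random
from statistics import pvariance


def simulate_match(phi, n_half_moves, noise_sd, rng):
    """Iterate A(m + 0.5) = A(m) + phi + eta(m) starting from A = 0."""
    a, path = 0.0, []
    for _ in range(n_half_moves):
        a += phi + rng.gauss(0.0, noise_sd)
        path.append(a)
    return path


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    x_bar, y_bar = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(lx, ly))
    return num / sum((x - x_bar) ** 2 for x in lx)


def ensemble_exponent(phi_sd, n_matches=1000, n_half_moves=200, seed=1):
    """Exponent of the ensemble variance for a given spread of fitness differences."""
    rng = random.Random(seed)
    paths = [
        simulate_match(rng.gauss(0.0, phi_sd), n_half_moves, 0.1, rng)
        for _ in range(n_matches)
    ]
    variances = [pvariance(column) for column in zip(*paths)]
    steps = range(20, n_half_moves + 1)      # skip the earliest steps in the fit
    return loglog_slope(steps, variances[19:])


alpha_equal = ensemble_exponent(phi_sd=0.0)    # identical players: normal diffusion
alpha_mixed = ensemble_exponent(phi_sd=0.05)   # spread in fitness: near-ballistic
print(round(alpha_equal, 2), round(alpha_mixed, 2))
```

In this toy setting a wider spread of Φ pushes the fitted exponent from about 1 toward 2, which is the qualitative behavior invoked in the text. The specific value obtained depends on the arbitrary parameters chosen here.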

Page 8

8

BCA

Fluctuation function, F(s)

0.34

0.35

0.36

0.37

1880 1920 1960 2000

Hurst exponent, h

Year

0.1

0.2

0.3

2481632

Scale, s

h

0

1

2

3

4

5

0.00.2

Hurst exponent, h

0.40.60.8

Probability distribution

original

shuffled

FIG. 5: Long-range correlations in white player’s advantage. (A) Detrended fluctuation analysis (DFA, see Methods

Section B) of white player’s advantage increments, that is, ∆A(m) = A(m + 0.5) − A(m), for a match ended in a draw and

F(s) and the scale s is a power-law where the exponent is the Hurst exponent h. Thus, in this log-log plot the relationship is

by straight lines with an average Pearson correlation coefficient of 0.892±0.002. (B) Distribution of the estimated Hurst exponent

to the distribution with mean 0.35 and standard-deviation 0.1. Since h < 0.5, it implies an anti-persistent behavior (see Fig.

line is a Gaussian fit to the data with mean 0.54 and standard-deviation 0.09. Note that the shuffled procedure removed the

correlations, confirming the existence of long-range correlations in A(m). (C) Historical changes in the mean Hurst exponent

recent matches.

selected at random from the database. For series with long-range correlations, the relationship between the fluctuation function

approximated by a straight line with slope equal to h = 0.345. In general, we find all these relationships to be well approximated

h obtained using DFA for matches longer than 50 moves that ended in a draw (squares). The continuous line is a Gaussian fit

1B). We have also evaluated the distribution of h using the shuffled version of these series (circles). For this case, the dashed

h. Note the significantly small values of h in recent periods, showing that the anti-persistent behavior has increased for more


Methods

Estimating A(m)

The core of a chess program is called the chess engine. The chess engine is responsible for finding the best moves given a particular arrangement of pieces on the board. In order to find the best moves, the chess engine enumerates and evaluates a huge number of possible sequences of moves. The evaluation of these possible moves is made by optimizing a function that usually defines the white player’s advantage. The way that the function is defined varies from engine to engine, but some key aspects, such as the difference in the weighted number of pieces, are always present. Other theoretical aspects of chess such as mobility, king safety, and center control are also typically considered in a heuristic manner. A simple example is the definition used for the GNU Chess program in 1987 (see http://alumni.imsa.edu/~stendahl/comp/txt/gnuchess.txt). There are tournaments between these programs aiming to compare the strength of different engines. The results we present were all obtained using the Crafty™ engine [15]. This is a free program that is ranked 24th in the Computer Chess Rating Lists (CCRL - http://www.computerchess.org.uk/ccrl). We have also compared the results of subsets of our database with other engines, and the estimate of the white player advantage proved robust against those changes.
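As a rough illustration of the piece-count term mentioned above, the sketch below computes White's material balance in pawn units from the piece-placement field of a FEN string. The piece values (1, 3, 3, 5, 9) are the conventional textbook ones and the function name `material_advantage` is hypothetical; this is not the evaluation function of Crafty or of any other engine, which also include positional terms and a search over move sequences.

```python
# Conventional piece values in pawn units (an assumption; real engines
# tune their own weights and add positional terms on top of material).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_advantage(fen_placement):
    """Return White's material minus Black's, in pawns.

    `fen_placement` is the first field of a FEN string, where uppercase
    letters are White pieces and lowercase letters are Black pieces.
    """
    advantage = 0
    for ch in fen_placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits (runs of empty squares) and '/' separators
        advantage += value if ch.isupper() else -value
    return advantage


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
BLACK_KNIGHT_MISSING = "r1bqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"

print(material_advantage(START))
print(material_advantage(BLACK_KNIGHT_MISSING))
```

A positive value favors White and a negative value favors Black, which matches the sign convention used for A(m) in the text.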

DFA

DFA consists of four steps [18, 19]:

i) We define the profile

$$Y(i) = \sum_{k=1}^{i} \left[ \Delta A(k) - \langle \Delta A \rangle \right],$$

where k indexes the successive increments ∆A(m) of the series and ⟨∆A⟩ is their mean;

ii) We cut Y(i) into $N_s = N/s$ non-overlapping segments of size s, where N is the length of the series;

iii) For each segment, a local polynomial trend (here, we have used a linear function) is calculated and subtracted from Y(i), defining $Y_s(i) = Y(i) - p_\nu(i)$, where $p_\nu(i)$ represents the local trend in the ν-th segment;

iv) We evaluate the fluctuation function

$$F(s) = \left[ \frac{1}{N_s} \sum_{\nu=1}^{N_s} \langle Y_s(i)^2 \rangle_\nu \right]^{1/2},$$

where $\langle Y_s(i)^2 \rangle_\nu$ is the mean square value of $Y_s(i)$ over the data in the ν-th segment.

If A(m) is self-similar, the fluctuation function F(s) displays a power-law dependence on the time scale s, that is, $F(s) \sim s^h$, where h is the Hurst exponent.
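The four steps can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the code used for the results reported here: the function names are hypothetical, N/s is rounded down (leftover points at the end of the series are dropped), the local trend is a least-squares straight line as stated in step iii, and the input is synthetic noise rather than engine evaluations of real matches.

```python
import math
import random
from statistics import fmean


def profile(increments):
    """Step i: cumulative sum of the mean-subtracted increments."""
    mean = fmean(increments)
    total, out = 0.0, []
    for x in increments:
        total += x - mean
        out.append(total)
    return out


def detrended_sq_sum(segment):
    """Step iii: sum of squared residuals about the least-squares line."""
    s = len(segment)
    t_mean, y_mean = (s - 1) / 2, fmean(segment)
    s_tt = sum((t - t_mean) ** 2 for t in range(s))
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = s_ty / s_tt
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(segment))


def fluctuation(increments, s):
    """Steps ii-iv: F(s) for one segment size s (needs s >= 3)."""
    y = profile(increments)
    n_seg = len(y) // s  # leftover points at the end are ignored
    if s < 3 or n_seg < 1:
        raise ValueError("need 3 <= s <= len(increments)")
    total = sum(detrended_sq_sum(y[v * s:(v + 1) * s]) for v in range(n_seg))
    return math.sqrt(total / (n_seg * s))


def hurst_exponent(increments, scales):
    """Slope of log F(s) versus log s from an ordinary least-squares fit."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(fluctuation(increments, s)) for s in scales]
    x_mean, y_mean = fmean(xs), fmean(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return num / sum((x - x_mean) ** 2 for x in xs)


# Synthetic check: differenced white noise is anti-persistent; shuffling
# destroys the ordering and should push h up toward the uncorrelated value.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4097)]
anti = [b - a for a, b in zip(noise, noise[1:])]
shuffled = anti[:]
rng.shuffle(shuffled)
scales = [4, 8, 16, 32, 64]
print(hurst_exponent(anti, scales), hurst_exponent(shuffled, scales))
```

The comparison at the end mirrors the shuffling test described for Fig. 5B: the estimate for the ordered anti-persistent series should fall well below 0.5, while the shuffled copy should land near the uncorrelated value. Scales much below 4 are best avoided with linear detrending, since a straight line fits very short segments almost exactly.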

[1] Amaral LAN, Ottino JM (2004) Augmenting the framework for the study of complex systems. Eur. Phys. J. B 38: 147-162.

[2] Fortunato S, Castellano C (2007) Scaling and universality in proportional elections. Phys. Rev. Lett. 99: 138701.

[3] Ratkiewicz J, Fortunato S, Flammini A, Menczer F, Vespignani A (2010) Characterizing and modeling the dynamics of online popularity. Phys. Rev. Lett. 105: 158701.

[4] Rozenfeld HD, Rybski D, Andrade JS, Batty M, Stanley HE, Makse HA (2008) Laws of population growth. Proc. Natl. Acad. Sci. USA 105: 18702-18707.


[5] Bialek W, Cavagna A, Giardina I, Mora T, Silvestri E, Viale M, Walczak AM (2012) Statistical mechanics for natural flocks of birds. Proc. Natl. Acad. Sci. USA 109: 4786-4791.

[6] Peruani F, Starruß J, Jakovljevic V, Søgaard-Andersen L, Deutsch A, Bär M (2012) Collective motion and nonequilibrium cluster formation in colonies of gliding bacteria. Phys. Rev. Lett. 108: 098102.

[7] Stringer MJ, Sales-Pardo M, Amaral LAN (2008) Effectiveness of journal ranking schemes as a tool for locating information. PLoS ONE 3: e1683.

[8] O’Brien C (1994) Checkmate for chess historians. Science 265: 1168-1169.

[9] Shannon CE (1950) Programming a computer for playing chess. Philosophical Magazine 41: 314.

[10] Blasius B, Tönjes R (2009) Zipf’s law in the popularity distribution of chess openings. Phys. Rev. Lett. 103: 218701.

[11] Gobet F, de Voogt A, Retschitzki J (2004) Moves in Mind: The Psychology of Board Games. New York: Psychology Press.

[12] Saariluoma P (1995) Chess Players’ Thinking: A Cognitive Psychological Approach. London: Routledge.

[13] Fogel DB, Hays TJ, Hahn SL, Quon J (2004) A self-learning evolutionary chess program. Proceedings of the IEEE 92: 1947-1954.

[14] Elo A (1978) The Rating of Chess Players, Past and Present. London: Batsford.

[15] Hyatt RM, Crafty Chess v23.3, www.craftychess.com (accessed on July 2011).

[16] Siegle P, Goychuk I, Hänggi P (2010) Origin of hyperdiffusion in generalized Brownian motion. Phys. Rev. Lett. 105: 100602.

[17] Skalski GT, Gilliam JF (2000) Modeling diffusive spread in a heterogeneous population: a movement study with stream fish. Ecology 81: 1685-1700.

[18] Peng CK, Buldyrev SV, Havlin S, Simons M, Stanley HE, Goldberger AL (1994) Mosaic organization of DNA nucleotides. Phys. Rev. E 49: 1685-1689.

[19] Kantelhardt JW, Koscielny-Bunde E, Rego HHA, Havlin S, Bunde A (2001) Detecting long-range correlations with detrended fluctuation analysis. Physica A 295: 441-454.


Supporting information

TABLE S1: Full description of our chess database. This table shows all the tournaments that comprise our database. The PGN files are freely available at http://www.pgnmentor.com/files.html. Specifically, the files we have used are those grouped under the sections “Tournaments”, “Candidates and Interzonals” and “World Championships”.

Tournament Years

World Championships

FIDE Championship 1996,1998-2000,2002,2004-2008,2010

PCA Championship 1993,1995

World Championship 1886,1889,1890,1892,1894,1896,1907-1910,1921,1927,1929,1934,1935,1937,1948,1951,1954,1957,1958,1960,1961,1963,1966,1969,1972,1978,1981,1985,1987,1990,1993

Candidates and Interzonals

Candidates 1950,1953,1959,1962,1965,1968,1971,1974,1980,1983,1985,1990,1994

Interzonals 1948,1952,1955,1958,1962,1964,1967,1970,1973,1976,1979,1982,1985,1987,1990,1993

WCC Qualifier 1998,2002,2007,2009

PCA Candidates 1994

PCA Qualifier 1993

World Cup 2005

Open Tournaments

AVRO 1938

Aachen 1868

Altona 1869,1872

Amsterdam 1889,1920,1936,1976-1981,1985,1987,1988,1991,1993-1996

Bad 1977

BadElster 1937-1939

BadHarzburg 1938,1939

BadKissingen 1928,1980,1981

BadNauheim 1935-1937

BadNiendorf 1927

BadOeynhausen 1922

BadPistyan 1912,1922

Baden 1870,1925,1980

Barcelona 1929,1935,1989

Barmen 1869,1905

Belfort 1988

Belgrade 1964,1993,1997

Berlin 1881,1897,1907,1920,1926

Bermuda 2005

Bern 1932

Beverwijk 1967

Biel 1992,1997,2004,2006,2007

Bilbao 2009


Birmingham 1858

Bled 1931,1961,1979

Bournemouth 1939

Bradford 1888,1889

Breslau 1889,1912,1925

Bristol 1861

Brussels 1986-1988

Bucharest 1953

Budapest 1896,1913,1921,1926,1929,1940,1952,2003

Budva 1967

Buenos Aires 1939,1944,1960,1970,1980,1994

Bugojno 1978,1980,1982,1984,1986

Cambridge 1904

Cannes 2002

Carlsbad 1907

Carrasco 1921,1938

Chicago 1874,1982

Cleveland 1871

Coburg 1904

Cologne 1877,1898

Copenhagen 1907,1916,1924,1934

Dallas 1957

Debrecen 1925

Dortmund 1928,1973,1975-1989,1991-2007

DosHermanas 1991-1997,1999,2001,2003,2005

Dresden 1892,1926

Duisburg 1929

Dundee 1867

Dusseldorf 1862,1908

Enghien 2003

Foros 2007

Frankfurt 1878,1887,1923,1930

Geneva 1977

Giessen 1928

Gijon 1944,1945

Gothenburg 1909,1920

Groningen 1946

Hague 1928

Hamburg 1885,1910,1921

Hannover 1902

Hastings 1895,1919,1922,1923,1925-1927,1929-1938,1945,1946,1949,1950,1953,1954,1957,1959-1962,1964,1966,1969-2004

Havana 1913,1962,1963,1965

Heidelberg 1949

Hilversum 1973

Hollywood 1945


Homburg 1927

Hoogeveen 2003

Johannesburg 1979,1981

Karlovy 1948

Karlsbad 1911,1923,1929

Kecskemet 1927

Kemeri 1937,1939

Kiel 1893

Kiev 1903

Krakow 1940

Kuibyshev 1942

LakeHopatcong 1926

LasPalmas 1973-1978,1980-1982,1991,1993,1994,1996

Leiden 1970

Leipzig 1876,1877,1879,1894

Leningrad 1934,1937,1939

Leon 1996

Liege 1930

Linares 1981,1983,1985,1988-1995,1997-2007

Ljubljana 1938

Ljubojevic 1975,1977

Lodz 1907,1935,1938

London 1862,1866,1872,1876,1877,1883,1892,1900,1922,1927,1932,1946,1980,1982,1984,1986

LosAngeles 1963

Lugano 1970

Lviv 2000

Madrid 1943,1996,1997,1998

Maehrisch 1923

Magdeburg 1927

Manchester 1857,1890

Manila 1974,1975

Mannheim 1914

MardelPlata 1928,1934,1936,1942-1957,1959-1962,1965-1972,1976,1979,1981,1982

Margate 1935,1939

Marienbad 1925

Meran 1924

Merano 1926

Merida 2000,2001

Milan 1975

MonteCarlo 1901-1904,1967

Montecatini 2000

Montevideo 1941

Montreal 1979

Moscow 1899,1901,1920,1925,1935,1947,1956,1966,1967,1971,1975,1981,1985,1992,2005-2007


Munich 1900,1941,1942,1993

Netanya 1968,1973

NewYork 1857,1880,1889,1893,1894,1913,1915,1916,1918,1924,1927,1931,1940,1951

Nice 1930

Niksic 1978,1983

Noordwijk 1938

Nottingham 1936

Novgorod 1994-1997

NoviSad 1984

Nuremberg 1883,1896,1906

Oslo 1984

Ostende 1905-1907,1937

Palma 1967,1968,1970,1971

Paris 1867,1878,1900,1924,1925,1933

Parnu 1937,1947,1996

Pasadena 1932

Philadelphia 1876

Podebrady 1936

Poikovsky 2004-2007

Polanica 1998,2000

Portoroz 1985

Prague 1908,1943

Ramsgate 1929

ReggioEmilia 1985-1989,1991,1992

Reykjavik 1987,1988,1991

Riga 1995

Rogaska 1929

Rosario 1939

Rotterdam 1989

Rovinj 1970

Salzburg 1943

SanAntonio 1972

SanRemo 1930

SanSebastian 1911,1912

SantaMonica 1966

Sarajevo 1984,1999,2000

Scarborough 1930

Semmering 1926

Skelleftea 1989

Skopje 1967

Sliac 1932

Sochi 1973,1982

Sofia 2005,2007

SovietChamp 1920,1923-1925,1927,1929,1931,1933,1934,1937,1939,1940,1944,1945,1947-1953,1955-1981,1983-1991


StLouis 1904

StPetersburg 1878,1895,1905,1909,1913

Stepanakert 2005

Stockholm 1930

Stuttgart 1939

Sverdlovsk 1943

Swinemunde 1930,1931

Szcawno 1950

Teeside 1975

Teplitz 1922

TerApel 1997

Tilburg 1977-1989,1991-1994,1996-1998

Titograd 1984

Trencianske 1941,1949

Triberg 1915,1921

Turin 1982

Ujpest 1934

Vienna 1873,1882,1898,1899,1903,1907,1908,1922,1923,1937,1996

Vilnius 1909,1912

Vinkovci 1968

Vrbas 1980

Waddinxveen 1979

Warsaw 1947

WijkaanZee 1968-2007

Winnipeg 1967

Zagreb 1965

Zandvoort 1936