Comparative Election Fraud Detection
Walter R. Mebane, Jr.
Kirill Kalinin
August 7, 2009
Abstract
Elections in Russia are widely believed to be fraudulent in various ways, a claim some
support especially by looking at voter turnout, others by looking at vote counts’ digits. We
use polling station level data from the Russian Duma elections of 2003 and 2007 and
presidential elections of 2004 and 2008 to examine how several methods for diagnosing
election fraud complement one another. The methods include estimating the distribution of
turnout, measuring the relationship between turnout and party support and testing for vote
counts’ second digits following the distribution implied by Benford’s Law. Anomalies the
methods detect are worse by the end of the period under study than at the beginning. The digit
test detects anomalies beyond those suggested by a simple idea that turnout in many places
was fraudulently inflated.
Prepared for presentation at the Annual Meeting of the American Political Science Association, Toronto, Canada,
Sept 3–6, 2009. A previous version was presented at the 2009 Annual Meeting of the Midwest Political Science
Association.
Professor, Department of Political Science and Department of Statistics, University of Michigan,
Haven Hall, Ann Arbor, MI 48109-1045 (E-mail: wmebane@umich.edu).
Ph.D. student, Department of Political Science, University of Michigan (E-mail:
kkalinin@umich.edu).
Introduction
The least one can say about national elections in Russia over the most recent election cycles is
that they have become less competitive, with fewer political parties presenting candidates for both
the Duma (federal parliament) and the presidency. The United Russia [UR] party, associated with
Vladimir Putin, has unquestionably become increasingly dominant. But many observers go
further and argue that during the recent period Russian elections have become increasingly unfree
and unfair. Myagkov and Ordeshook (2008) argue that during the past 15 years “falsifications in
the form of stuffed ballot boxes and artificially augmented election counts” have become
prevalent throughout the country (see also Myagkov, Ordeshook, and Shaikin 2008, 2009).
Since 2000 there have been increases in institutional barriers that suppress electoral
competition between parties and candidates: tougher registration requirements; the rise of a
parliamentary threshold for parties; cancellation of protest voting; and cancellation of the
minimum electoral threshold. In the electoral campaign there has been excessive positive
informational and financial support of candidates favored by the Kremlin along with negative
campaigning against alternative candidates and parties. Administrative changes decreased the
transparency of elections. Electoral commission activities became more closed off from the
public, with independent public observation being canceled and the rights of the legal observers
being frequently violated. The OSCE identified serious problems in the 2004 election (OSCE
Office for Democratic Institutions and Human Rights 2004), and by 2008 problems had become
so severe that international observer groups declined to observe the election (OSCE Office for
Democratic Institutions and Human Rights 2008).
Lubarev, Buzin, and Kynev (2007) argue that administrative changes since 2000 increased the
extent to which officials from all levels of the government participated in election administration.
Since 2003 both federal and regional administrative resources as well as the mass media have
been deployed in favor of UR, which thereby received substantial informational advantages, and
against its main competitor the KPRF (Communist party) (Buzin and Lubarev 2008). The
Kremlin now controls the appointments of regional executives (each governor can be fired for
“loss of trust”), which has made them responsible for delivering “recommended” electoral figures
to the Kremlin. After gubernatorial elections were canceled in December 2004, by the spring of
2007, 70 of 85 governors had announced they were participating in the party of power (Gel’man
2007). Thus, it appears that the entire regional state apparatus is now at the service of the party of
power, making it one large electoral “political machine.” This operation is characterized by
control over the mass media, administrative pressure on both opposition and voters, and possibly
falsifications as well. In fact, state officials have excessive control over all levels of electoral
commissions, including precincts (UIKs), territorial (TIKs) and regional commissions.
Allegations point to a wide variety of methods used to distort reported votes (Kalinin 2008).
Many of these methods relate to voter turnout and so as markers for fraud may appear ambiguous
to the extent they resemble efforts to boost genuine electoral support. In the 2004 presidential
election, due to the effectiveness of administrative resources and the popularity of Putin, the
outcome of presidential elections was essentially predetermined. As a result, neither of the
opposition parties was promoting its leader as their candidate (Buzin and Lubarev 2008, 26).
Nonetheless a wide variety of methods was used to increase turnout, including forced voting by
absentee certificates. According to Buzin and Lubarev, in 214 territories the 2004 election was
the first election where the turnout rate and the share of votes for the winner simultaneously
exceeded 90 percent (Buzin and Lubarev 2008, 26). This phenomenon occurred frequently in
republics as well as in Rostovskaya, Tumenskaya, Chukotskii and Yamalo-Nenetskii AO.
Voters in Russia do not personally register to vote, but all eligible voters are assigned to
specific UIKs depending on where they live. There is a permanent gap between the number of
real voters and the “listed voters”—the average number of unaccounted voters in Russia is 2–5
million people—and on election days there are always extensive corrections of the voter lists
(Arbatskaya 2004). The large-scale character of these corrections depends on the specific ways
the voter lists are formed, methods that differ in different territories. Arbatskaya argues that this
correction of voter lists phenomenon opens the door for administrative tyranny, violating the
democratic rights of citizens (Arbatskaya 2004, 224–226).
The federal elections of 2007 and 2008 took place under different conditions. The president’s
popularity remained high, and the “party of power” controlled not only the Duma but also many
federal legislatures, encompassing the majority of regional heads, mayors of big cities, and other
representatives of political, administrative and economic elites. Putin had promised not to change
the Constitution to allow himself to stand for reelection, so he was preparing for his successor.
Between 2003 and 2007, the Duma election was changed from a mixed system to a system based
entirely on proportional representation. The UR party list was headed by Putin, and it also
included a majority of the governors. In the absence of any viable political competitors and
Putin’s unique position in UR’s list (he was the only one in its federal list), the 2007 federal
elections were labeled as a referendum for the all-national Leader. The lack of competition and the
absence of the “against all” (Protiv vseh) option on the ballot—this protest voting option was
prohibited after the 2004 election—produced a danger of low turnout. According to Buzin and
Lubarev, the main task of federal authorities in 2007 and 2008 was twofold: to secure the victory
of Kremlin candidates, and to secure high turnout (Buzin and Lubarev 2008, 184, 257–258).
Therefore Buzin and Lubarev claim that vote falsifications were not solely about shifting votes
from one candidate to another, but rather about simultaneously increasing the number of votes
and the number of voters. These goals can be implemented by “stuffing” the ballot boxes (vbros)
or “adding figures to protocols” (pripiska) (Buzin and Lubarev 2008, 184).
Like Myagkov et al. (2009), Lubarev et al. (2007) and Buzin and Lubarev (2008) argue that
direct falsifications played a much larger role in the federal elections of 2007 and 2008 than they
had in the federal elections and regional elections of the 1990s and early 2000s. Buzin and
Lubarev (2008) present electoral data, observer reports and multiple stories from observers and
ordinary voters that illustrate the growth of crude falsifications and their widespread character, a
pattern they refer to as “mass administrational electoral technology.” Buzin and Lubarev (2008)
conclude that compared to all other elections, the elections of 2007 and 2008 showed that direct
falsifications started to affect the results of elections, by affecting the distribution of votes
between the candidates.
Myagkov et al. (2009), Lubarev et al. (2007) and Buzin and Lubarev (2008) all emphasize
what they claim is fraudulent voter turnout. Buzin and Lubarev (2008) state that along with
stuffed ballot boxes, the easiest and the most popular technique is to change figures in UIKs’
protocols by UIKs or even more often by TIKs (territories). Buzin and Lubarev are astonished by
how widespread direct falsifications are and about the “courage” of falsificators, who appear
confident that they are supported by administration and courts (Buzin and Lubarev 2008, 177). As
Buzin and Lubarev point out “the insolence with which the protocols are changed in TIKs,
knowing that the copy is already given out to observers, is explained by their confidence in
impunity, being assured that falsificators and law machinery, including courts, are acting
together” (Buzin and Lubarev 2008, 177). They argue that the federal elections of 2007 and 2008
showed widespread discrepancies between data derived from UIKs and official data produced by
TIKs and the Gas VIBORI system (the internet-accessible election reporting system).
In this paper we use UIK-level data from the 2003 and 2007 Duma elections and the 2004 and
2008 presidential elections to show that it is useful to augment analysis of Russian elections that
focuses on voter turnout statistics with information about the distribution of the second significant
digits in UIK-level vote counts.
Tests of vote counts based on the so-called second-digit Benford’s Law (2BL) distribution
have figured prominently in work on election forensics (Mebane 2006a,b, 2007a,b, 2008b). The
analysis in Mebane (2007a) ultimately focuses on the conditional means of the second digits in
collections of vote counts, measuring how these means differ from the means expected according
to the 2BL distribution. The conditioning factors in that analysis, which examined data from the
2006 election in Mexico, were the partisan affiliations of mayors in Mexican municipalities.
Mebane (2008a) and Kalinin (2008) combined an examination of UIK vote counts’ second-digit
conditional means with outlier detection methods (Mebane and Sekhon 2004) to try to diagnose
which of several hypothesized methods for fraud may have affected the votes reported for Russian
presidential candidates in 2004 and 2008.
Nonparametric Regression 2BL Test
The 2BL test used in this paper involves comparing the arithmetic mean of the vote counts’
second digits to the mean value expected if the digits are 2BL-distributed. This test adapts an idea
used in Grendar, Judge, and Schechter (2007)’s analysis that focuses on the first significant digit
and is intended to identify what they describe as generalized Benford distributions. Grendar et al.
suggest that data that do not conform to Benford’s law may have first digits that match a member
of a specified class of exponential families. Mebane (2006b) argues that vote counts in general do
not have digits that match Benford’s law at all. In particular, the distribution of the first digits of
vote counts is undetermined. Mebane (2006b) demonstrates a pair of naturalistic models that
produce simulated vote counts with second digits but not first digits that are distributed roughly as
specified by Benford’s law. Nonetheless we can use the mean of the second digits to test how
closely the digits match the 2BL distribution. According to Benford’s law, the expected relative
frequency $q_j$ with which the second significant digit is $j$ is (rounded)
$(q_0, \ldots, q_9) = (.120, .114, .109, .104, .100, .097, .093, .090, .088, .085)$. Given 2BL-distributed
counts, the value expected for the second-digit mean $\bar{j}$ is approximately
$\bar{j}_B = \sum_{j=0}^{9} j q_j = 4.187$.
Mebane (2006b) and Mebane (2007a) suggest that vote counts whose second digits follow the
2BL distribution are unproblematic, while departures from the 2BL distribution indicate that some
kind of manipulation has occurred. Whether the manipulation the second-digit test may detect
constitutes any kind of fraud is something that needs to be established by additional evidence.
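The rounded 2BL frequencies and the expected second-digit mean quoted above can be recomputed from the general form of Benford’s law for the first two significant digits. The Python sketch below does this; the function names are illustrative choices made here and do not come from the paper or from any software the authors used.

```python
import math


def second_digit_benford():
    """Return [q_0, ..., q_9]: Benford's-law probabilities of the second significant digit."""
    # q_j sums log10(1 + 1/(10*d1 + j)) over the possible first digits d1 = 1..9.
    return [
        sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))
        for d2 in range(10)
    ]


def second_digit(count):
    """Second significant digit of a positive integer count, or None if count < 10."""
    if count < 10:
        return None
    return int(str(count)[1])


q = second_digit_benford()
expected_mean = sum(j * qj for j, qj in enumerate(q))

print([round(qj, 3) for qj in q])   # rounded frequencies, to compare with the list in the text
print(round(expected_mean, 3))      # expected second-digit mean under 2BL
```

Counts below 10 have no second significant digit, so the helper returns None for them; how the authors treated such counts is not stated in the text.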
The test is, first, whether $\bar{j}$ differs from $\bar{j}_B$ and, second, whether it differs in a way that
depends on observed conditioning factors. The conditioning factor in the current analysis is
reported voter turnout, measured as the proportion of registered voters who voted at each UIK.¹
For vote counts $y_i$ observed for UIKs indexed by $i$, we nonparametrically regress the second
digits on the turnout proportion $x_i$. To estimate the nonparametric regressions we use the package
sm (Bowman and Azzalini 1997) for the statistical programming environment R (R Development
Core Team 2005).
¹ Specifically the value we use to measure turnout is the sum of the number of ballots given out to voters before
election day, the number given to voters in polling places and the number given to voters outside of polling places
divided by the number of registered voters.
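To make the idea of regressing second digits on turnout concrete, the following Python sketch uses a simple Gaussian-kernel (Nadaraya-Watson) smoother with a rough pointwise standard error. This is a stand-in chosen for brevity, not the estimator implemented in the sm package, and the UIK records, bandwidth and function names are all made up for illustration.

```python
import math

BENFORD_2BL_MEAN = 4.187  # expected second-digit mean quoted in the text


def second_digit(count):
    """Second significant digit of a vote count, or None if the count is below 10."""
    return int(str(count)[1]) if count >= 10 else None


def kernel_regression(x, y, grid, bandwidth):
    """Gaussian-kernel (Nadaraya-Watson) estimate of E[y | x] at each grid point.

    Returns (grid_point, estimate, rough_standard_error) tuples.
    """
    results = []
    for g in grid:
        w = [math.exp(-0.5 * ((xi - g) / bandwidth) ** 2) for xi in x]
        total = sum(w)
        mean = sum(wi * yi for wi, yi in zip(w, y)) / total
        # Kernel-weighted residual variance around the local mean.
        var = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y)) / total
        # Approximate standard error: sqrt(var * sum(w^2)) / sum(w).
        se = math.sqrt(var * sum(wi * wi for wi in w)) / total
        results.append((g, mean, se))
    return results


# Made-up UIK records: (turnout proportion, vote count for one party).
toy_uiks = [(0.48, 312), (0.52, 275), (0.55, 431), (0.61, 198),
            (0.66, 540), (0.72, 367), (0.81, 629), (0.93, 884)]
pairs = [(t, second_digit(v)) for t, v in toy_uiks if second_digit(v) is not None]
turnout = [t for t, _ in pairs]
digits = [d for _, d in pairs]

for g, mean, se in kernel_regression(turnout, digits, [0.5, 0.7, 0.9], bandwidth=0.1):
    print(f"turnout {g:.1f}: mean second digit {mean:.2f} (se {se:.2f}), "
          f"2BL mean {BENFORD_2BL_MEAN}")
```

With real data the bandwidth would need to be chosen with care, and the grid should stay within the observed range of turnout so that the kernel weights do not all vanish.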
Turnout, Votes and Manipulations in Russia 2003–2008
Start by considering some of the facts about the distribution of turnout in recent Russian elections
that support suspicions that the elections were, increasingly, affected by fraud. Figures 1 and 2
display kernel density estimates² for UIK-level turnout in the Duma elections of 2003 and 2007
and the presidential elections of 2004 and 2008.³ Following Myagkov et al. (2009), we consider
separately data from republics and data from other regions (“oblasts”). The figures mirror results
presented by Buzin and Lubarev (2008, Appendix, Illustration 38), which they attribute to S. A.
Shpilkin.
*** Figures 1 and 2 about here ***
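A kernel density estimate of turnout of the kind plotted in Figures 1 and 2 can be sketched in a few lines of Python. R’s density() documents a Gaussian kernel and the bw.nrd0 rule-of-thumb bandwidth as its defaults; the paper does not say which settings were used, so treating those defaults as the relevant ones is an assumption here. The turnout values are invented, the function names are illustrative, and the code has not been checked against R output.

```python
import math
import statistics


def rule_of_thumb_bandwidth(values):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34) or sd  # fall back to sd if the IQR is zero
    return 0.9 * spread * len(values) ** (-0.2)


def gaussian_kde(values, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values)
        for g in grid
    ]


# Made-up turnout proportions, with two UIKs reporting 100 percent turnout.
toy_turnout = [0.52, 0.55, 0.58, 0.60, 0.61, 0.63, 0.65, 0.70, 1.00, 1.00]
bandwidth = rule_of_thumb_bandwidth(toy_turnout)
grid = [i / 100 for i in range(-50, 151)]
density = gaussian_kde(toy_turnout, grid, bandwidth)

peak = max(zip(density, grid))
print(f"bandwidth {bandwidth:.4f}, highest density {peak[0]:.2f} at turnout {peak[1]:.2f}")
```

In the toy data the two UIKs at 100 percent turnout produce a separate bump at the right edge of the estimated density, which is the kind of spike the figures are read for.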
The progression of figures shows worse distributions in 2007 and 2008 than in the earlier two
years. The distributions are also worse in the presidential election years than in the Duma election
years. The top row of Figure 1 shows the distribution for 2003. For both oblasts and republics
there is a spike of UIKs with turnout at or very near 100 percent. A higher proportion of UIKs in
the republics than in the oblasts have this feature. But in oblasts most of the UIKs have turnout
following a relatively smooth unimodal distribution, and in republics many of the UIKs have
turnout following such a distribution. In 2004 (the second row of Figure 1), the proportion of UIKs with
turnout near 100 percent increases noticeably in oblasts and very substantially in republics. In
oblasts the distribution also exhibits spikes at locations corresponding to the excess of turnout
values at 70%, 80% and 90% noticed by Shpilkin and Shulgin (Buzin and Lubarev
2008, 201). The distributions for 2007 (top row of Figure 2) show spikes of UIKs at or near 100
percent turnout similar to those observed in 2004. In the distribution for oblasts, spikes are
apparent at round number percentages of turnout above 60%. The distributions for 2008 (bottom
row of Figure 2) have proportions of UIKs with turnout at or near 100 percent comparable to
2004. The distribution for oblasts shows very pronounced spikes at round number percentages of
turnout, and in the distribution for republics a spike is evident near 75% turnout.
² These densities are computed using R’s density() function.
³ All vote and turnout data were downloaded from the website of the Central Election Commission of the Russian
Federation, http://www.vybory.izbirkom.ru/region/izbirkom.
Buzin and Lubarev (2008, 201) argue that the only acceptable explanation for the spiked
distributions is a wide-spread adjustment of turnout to specific “rounded” figures. Inspecting the
last digits of the original UIK-level turnout counts adds to the impression that many of them are
faked. If the turnout counts reflected the natural complex of processes that cause people to vote or
not to vote, we would expect the counts’ last digits to be uniformly distributed (i.e., each digit
zero through nine would occur equally often) (Beber and Scacco 2008). Table 1 shows that the
distribution of the last digits in the actual turnout counts from 2003–2008 is very often far from
uniform. The table shows for each digit the signed square root of the discrepancy between the
observed frequency of the digit and the frequency of 0.1 expected if the distribution is uniform.⁴
A value of 2.0 or greater in magnitude represents a significant discrepancy. The table shows that
there are always too many zeros, with one exception too few nines, and usually too many fives.
Year 2003 for UIKs in oblasts is the only situation where neither the number of fives nor the
number of nines is significantly discrepant from the expected uniform distribution, and that subset
of UIKs is the only one for which the overall Pearson chi-square statistic is not statistically
significant at the .05 test level. As measured by the overall chi-squared statistics, the extent of the
discrepancy from the uniform expectation increases monotonically as one moves from 2003 to
2008. Turnout fakery seems to be much worse at the end of the time period than at the beginning.
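The last-digit check described above can be sketched as follows. The code assumes, following the notes to Table 1, that each per-digit entry is the signed square root of that digit’s contribution to the Pearson chi-squared statistic, so that the squared entries sum to the overall statistic. The function name and both sets of counts are invented for illustration.

```python
import math
from collections import Counter

CHI2_CRITICAL_9DF_05 = 16.919  # upper 5% point of chi-squared with 9 degrees of freedom


def last_digit_test(counts):
    """Return (signed_roots, chi_square) for the null that last digits are uniform on 0..9."""
    n = len(counts)
    freq = Counter(c % 10 for c in counts)
    signed_roots = []
    for digit in range(10):
        p = freq[digit] / n
        # Square root of this digit's chi-squared contribution, signed by p - 0.1.
        root = math.sqrt(n * (p - 0.1) ** 2 / 0.1)
        signed_roots.append(math.copysign(root, p - 0.1))
    chi_square = sum(r * r for r in signed_roots)
    return signed_roots, chi_square


uniform_counts = list(range(100, 200))           # every last digit appears 10 times
rounded_counts = [10 * k for k in range(1, 51)]  # every count ends in zero

_, chi_uniform = last_digit_test(uniform_counts)
roots_rounded, chi_rounded = last_digit_test(rounded_counts)

print(chi_uniform <= CHI2_CRITICAL_9DF_05)   # uniform digits: no rejection
print(chi_rounded > CHI2_CRITICAL_9DF_05)    # all zeros: rejection
print([round(r, 1) for r in roots_rounded])  # positive for digit 0, negative elsewhere
```

The second toy data set is an extreme case of rounding to multiples of ten; real counts with a milder excess of zeros and fives would give smaller entries with the same pattern of signs.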
*** Table 1 about here ***
⁴ If $p_j$ is the observed frequency of digit $j$ and $N$ is the number of UIKs, then the signed square root statistic is
$\mathrm{sign}(p_j - 0.1)\,[N(p_j - 0.1)^2/0.1]^{1/2}$.
Myagkov et al. (2009) emphasize the way turnout is associated with votes for the party of
power at the rayon level, and Buzin and Lubarev (2008, 204) discuss similar kinds of relationships
using UIK-level data. Both discussions make the point that where turnout is very high, support
for UR tends also to be very high, and support for other parties, notably the KPRF, tends to be
relatively low. Figures 3–6 illustrate these relationships for these two parties. These figures show
a solid line representing the nonparametric regression of the vote proportions on the turnout
proportions bounded by dashed lines indicating 95% confidence bounds. A dotted line shows the
unconditional mean vote proportion. Figure 3 shows the results for UR in republics, with one plot
for the UIKs in each year. Clearly mean support for UR is much greater in UIKs where turnout is
very high. The increase in mean UR vote share from its approximate floor (for turnout roughly 50
percent) to its peak at turnout equalling 100 percent is greater in 2003 than in 2007 but also
greater in 2008 than in 2004.⁵ Figure 4 shows the results for UR in oblasts. In the Duma election
years, mean support for UR no longer peaks at turnout equal to 100 percent but instead
reaches a maximum for turnout at around 90 percent. In the presidential election years, mean
support for UR does have a maximum at the highest level of turnout. The gain in mean support
from floor to maximum is now greater in 2007 than in 2003 and in 2008 than in 2004.
*** Figures 3 and 4 about here ***
Figures 5 and 6, which show the same kinds of scatterplots and nonparametric regression
lines, in contrast show mean support for the KPRF decreasing once turnout increases beyond a
certain level. The relationships for UIKs in republics, in Figure 5, show mean KPRF support
declining throughout the distribution of turnout in 2003, but in 2004, 2007 and 2008 a decline in
mean support sets in only for turnout greater than about 60 percent. The decline from ceiling to
minimum is greater in 2003 than in 2007 but greater in 2008 than in 2004. The relationships for
UIKs in oblasts, in Figure 6, show mean KPRF support declining only for turnout greater than a
certain level in all four years. In 2003 and 2004 the decline begins once turnout reaches about 80
percent, but in 2007 and 2008 the decline starts when turnout reaches about 50 percent. The
ceiling to minimum declines in mean KPRF support are also larger in 2007 than in 2003 and in
2008 than in 2004.
*** Figures 5 and 6 about here ***
Reported turnout certainly looks suspicious when its distribution and the distribution of
turnout counts’ last digits are viewed on their own, and turnout is clearly related to the mean
support for UR and the KPRF. Plots computed for other parties resemble the ones shown here for
the KPRF. Such a pattern of UR tending to gain support in places where the KPRF and other
parties are tending to lose support strongly suggests that vote switching is possibly occurring.
⁵ For 2003 we use the proportional representation votes, to match the electoral system in place in 2007.
The 2BL test may provide further evidence on this point. Simulations reported in Mebane
(2006b) and Mebane (2008a) suggest that variations from the 2BL distribution can occur both
when vote counts are artificially increased and when they are artificially reduced. As Mebane
(2008a) observes, it is unclear whether an artificial increase in vote counts will mean that the
mean second digit, $\bar{j}$, also increases, or whether an artificial decrease implies that $\bar{j}$ decreases. But
we might expect that if substantial vote switching is occurring, we should see significant
departures from the 2BL expected mean, $\bar{j}_B$, for both the receiver party and the donor party in
places where the vote switching is happening. In the current case, we might expect nonparametric
regression lines to show that the expected second digit differs from the 2BL expected value for
both UR and the KPRF for the same values of turnout, if vote switching is taking place.
Figure 7 shows the first of a series of graphs intended to allow such parallel assessments. Each
plot in the figure shows a solid line representing the nonparametric regression of the second digits
on the turnout proportions, with a pair of dashed lines indicating the boundaries of a 95%
confidence interval. A horizontal dotted line locates $\bar{j}_B = 4.187$. A rug plot along the bottom of
each plot locates the observed values of turnout. The first question is whether there is any range
of turnout values for which the nonparametric regression curve’s confidence interval does not
contain $\bar{j}_B$. If so, we will then ask whether the same region of discrepancy is found for both UR
and KPRF.
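The first question posed above, whether any range of turnout has a confidence interval that excludes the 2BL mean, amounts to scanning a fitted curve for runs of grid points whose band lies wholly above or wholly below 4.187. The Python sketch below shows one way to do that scan; the pointwise normal-approximation band, the function name and the fitted values are assumptions made for illustration, not output from the paper’s analysis.

```python
def discrepancy_intervals(grid, estimate, std_err, target=4.187, z=1.96):
    """Return (lo, hi, sign) for each maximal run of grid points whose interval
    estimate +/- z * std_err excludes target; sign is +1 above target, -1 below."""
    intervals = []
    run_start, run_sign, prev = None, 0, None
    for g, m, s in zip(grid, estimate, std_err):
        if m - z * s > target:
            sign = 1
        elif m + z * s < target:
            sign = -1
        else:
            sign = 0
        if sign != run_sign:
            if run_sign != 0:
                intervals.append((run_start, prev, run_sign))  # close the finished run
            run_start, run_sign = g, sign
        prev = g
    if run_sign != 0:
        intervals.append((run_start, prev, run_sign))
    return intervals


# Hypothetical fitted second-digit means and standard errors on a turnout grid.
grid = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
fitted = [4.2, 4.6, 4.5, 4.2, 3.8, 3.7]
std_err = [0.1] * 6

print(discrepancy_intervals(grid, fitted, std_err))
# [(0.5, 0.6, 1), (0.8, 0.9, -1)]
```

The reported endpoints are grid points, so the resolution of the intervals is limited by the spacing of the grid.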
*** Figure 7 about here ***
In all four years in republics, shown in Figure 7, the nonparametric regression curves for the
UR vote counts’ second digits have roughly the same shape, but across the years the regions
where $\bar{j}$ significantly differs from $\bar{j}_B$ vary somewhat. In 2003, $\bar{j}$ differs significantly from
$\bar{j}_B$ only for turnout in the interval roughly (0.45–0.6).⁶ In this region, $\bar{j} > \bar{j}_B$. In 2007,
$\bar{j}$ differs from $\bar{j}_B$ for roughly the same values of turnout in the same way, but $\bar{j}$ also is
significantly less than $\bar{j}_B$ for turnout in the interval roughly (0.7–0.9). In 2004, $\bar{j}$ differs from
$\bar{j}_B$ for turnout in the interval roughly (0.6–0.8), and in that interval $\bar{j} < \bar{j}_B$. For 2008,
$\bar{j} < \bar{j}_B$ significantly for turnout in roughly (0.65–0.9), and $\bar{j} > \bar{j}_B$ significantly for turnout
greater than roughly 0.975. The graphs for UIKs in oblasts, shown in Figure 8, show roughly the
same pattern as in the republics for 2003 and 2007. For 2004, the oblasts graph shows
$\bar{j} > \bar{j}_B$ significantly for turnout in the interval roughly (0.3–0.5) and $\bar{j} < \bar{j}_B$ significantly
for turnout in roughly (0.55–0.8). The graph for 2008 is similar. None of the graphs for oblasts
shows a significant discrepancy between $\bar{j}$ and $\bar{j}_B$ at the very highest levels of turnout.
⁶ Magnification is probably needed to see the differences discussed in this section.
*** Figure 8 about here ***
Comparing these figures to those for the second digits of the KPRF vote counts, we see in the
plot for republics (Figure 9) very different patterns. In 2003, $\bar{j} < \bar{j}_B$ significantly for turnout in
two intervals, roughly (0.3–0.55) and greater than 0.75. The results for 2004 are approximately
similar. In 2007 the lower interval shrinks to roughly (0.55–0.6) and the upper interval is also
slightly smaller (greater than about 0.85). In 2008, $\bar{j} > \bar{j}_B$ significantly for turnout in roughly
(0.6–0.7), while $\bar{j} < \bar{j}_B$ significantly for turnout greater than about 0.9.
*** Figure 9 about here ***
The intervals of turnout for which $\bar{j} \neq \bar{j}_B$ significantly for UR overlap with the intervals for
which $\bar{j} \neq \bar{j}_B$ significantly for the KPRF in all four years. For 2003, 2004 and 2007, the overlaps
occur for turnout values in the vicinity of 0.5, and $\bar{j} > \bar{j}_B$ for UR but $\bar{j} < \bar{j}_B$ for the KPRF. For
2008 overlaps occur for most of the turnout values greater than about 0.65, and once again
$\bar{j} - \bar{j}_B$ has opposite signs for UR and for the KPRF. Such patterns strongly suggest vote switching.
Notably, in every year except 2008, the pattern that suggests vote switching occurs for moderate
levels of turnout and not at the highest levels. The pattern of $\bar{j}$ for the KPRF in republics in the
earlier years clearly suggests something irregular was happening in the highest turnout UIKs.
Perhaps, as in some of the simulations reported by Mebane (2006b) and Mebane (2008a), vote
switching was also occurring in those years but it did not rise to levels sufficient to trigger a 2BL
signal in the receiver party’s vote counts.
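The comparison made above, looking for turnout ranges where both parties depart from the 2BL mean in opposite directions, can be written as a small interval computation. The Python sketch below uses the approximate 2003 republic intervals reported in the text as inputs; capping “greater than 0.75” at a turnout of 1.0 and the function name are assumptions made here.

```python
def opposite_sign_overlaps(intervals_a, intervals_b):
    """Return (lo, hi) turnout ranges where a discrepancy interval of one party
    overlaps a discrepancy interval of the other party with the opposite sign."""
    overlaps = []
    for lo_a, hi_a, sign_a in intervals_a:
        for lo_b, hi_b, sign_b in intervals_b:
            lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
            if lo < hi and sign_a * sign_b < 0:
                overlaps.append((lo, hi))
    return overlaps


# Approximate 2003 republic intervals from the text: (low turnout, high turnout, sign).
ur_2003 = [(0.45, 0.60, +1)]                       # second-digit mean above the 2BL mean
kprf_2003 = [(0.30, 0.55, -1), (0.75, 1.00, -1)]   # second-digit mean below the 2BL mean

print(opposite_sign_overlaps(ur_2003, kprf_2003))
# [(0.45, 0.55)]
```

The single overlap sits around a turnout of 0.5, which matches the description of the 2003 pattern in the text.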
The results for $\bar{j}$ for the KPRF for UIKs in oblasts, shown in Figure 10, are similar to those for
republics. With minor differences the comparison between those conditional means and the
conditional means for UR suggests the same kind of conclusion.
*** Figure 10 about here ***
Conclusion
Anomalies the methods detect are worse by the end of the period under study than at the
beginning. The second-digit test detects anomalies beyond those suggested by a simple idea that
turnout in many places was fraudulently inflated.
References
Arbatskaya, Marina. 2004. How Many Voters Are there in Russia? (Political-geographical analysis
of a General Number of the Russian Voters and Level of their Activity. 1990-2004) (In Russian:
Skol’ko zhe izbiratelei v Rossii? (Politiko-geograficheskii analiz obschego chisla rossiiskih
izbiratelei i urovnya ih aktivnosti. 1990-2004)). Irkutsk: Institute of Geography SB RAS.
Beber, Bernd and Alexandra Scacco. 2008. “What the Numbers Say: A Digit-Based Test for
Election Fraud Using New Data from Nigeria.” Paper prepared for the Annual Meeting of the
American Political Science Association, Boston, MA, August 28–31, 2008.
Bowman, Adrian W. and Adelchi Azzalini. 1997. Applied Smoothing Techniques for Data
Analysis: The Kernel Approach with S-Plus Illustrations. Oxford: Clarendon Press.
Buzin, Andrei and Arkadii Lubarev. 2008. Crime without Punishment: Administrative
Technologies of Federal Elections of 2007-2008 (In Russian: Prestupleniye bez nakazaniya.
Administrativniye tekhnologii federal’nih viborov 2007-2008 godov). Moscow: Nikkolo M.
Gel’man, Vladimir. 2007. “Political Trends in the Russian regions on the Eve of State Duma
Elections.” Russian Analytical Digest (21): 27.
Grendar, Marian, George Judge, and Laura Schechter. 2007. “An Empirical Non-Parametric
Likelihood Family of Data-Based Benford-Like Distributions.” Physica A 380 (1): 429–438.
Kalinin, Kirill. 2008. “Electoral Frauds in Russia: 2004 and 2008 Presidential Elections
Compared.” Paper presented at the 2008 Annual Meeting of the Midwest Political Science
Association, Chicago, IL, April 3.
Lubarev, Arkadii, Andrei Buzin, and Alexander Kynev. 2007. Dead Souls: Methods of
Falsifications of Electoral Results and the Struggle Against Them (In Russian: Mertviye dushi.
Metodi falsifikacii itogov golosovaniya i bor’ba s nimi). Moscow: Nikkolo M.
Mebane, Walter R., Jr. 2006a. “Detecting Attempted Election Theft: Vote Counts, Voting Machines
and Benford’s Law.” Paper prepared for the 2006 Annual Meeting of the Midwest Political
Science Association, Chicago, IL, April 20–23.
Mebane, Walter R., Jr. 2006b. “Election Forensics: Vote Counts and Benford’s Law.” Paper
prepared for the 2006 Summer Meeting of the Political Methodology Society, UC-Davis, July
20–22.
Mebane, Walter R., Jr. 2007a. “Election Forensics: Statistics, Recounts and Fraud.” Paper
presented at the 2007 Annual Meeting of the Midwest Political Science Association, Chicago, IL,
April 12–16.
Mebane, Walter R., Jr. 2007b. “Evaluating Voting Systems To Improve and Verify Accuracy.”
Paper presented at the 2007 Annual Meeting of the American Association for the Advancement
of Science, San Francisco, CA, February 16, 2007, and at the Bay Area Methods Meeting,
Berkeley, March 2, 2007.
Mebane, Walter R., Jr. 2008a. “Election Forensics: Outlier and Digit Tests in America and
Russia.” Paper presented at The American Electoral Process conference, Center for the Study of
Democratic Politics, Princeton University, May 1–3, 2008.
Mebane, Walter R., Jr. 2008b. “Election Forensics: The Second-digit Benford’s Law Test and
Recent American Presidential Elections.” In R. Michael Alvarez, Thad E. Hall, and Susan D.
Hyde, editors, The Art and Science of Studying Election Fraud: Detection, Prevention, and
Consequences. Washington, DC: Brookings Institution.
Mebane, Walter R., Jr. and Jasjeet S. Sekhon. 2004. “Robust Estimation and Outlier Detection for
Overdispersed Multinomial Models of Count Data.” American Journal of Political Science 48
(Apr.): 392–411.
Myagkov, Mikhail, Peter C. Ordeshook, and Dimitry Shaikin. 2009. The Forensics of Election
Fraud: With Applications to Russia and Ukraine. New York: Cambridge University Press.
Myagkov, Misha and Peter C. Ordeshook. 2008. “Russian Elections: An Oxymoron of
Democracy.” Caltech/MIT Voting Technology Project, VTP Working Paper #63, Mar 2008.
Myagkov, Misha, Peter C. Ordeshook, and Dimitry Shaikin. 2008. “Estimating the Trail of Votes
in Russia’s Elections and the Likelihood of Fraud.” In R. Michael Alvarez, Thad E. Hall, and
Susan D. Hyde, editors, The Art and Science of Studying Election Fraud: Detection, Prevention,
and Consequences. Washington, DC: Brookings Institution.
OSCE Office for Democratic Institutions and Human Rights. 2004. “Russian Federation
Presidential Election, 14 March 2004: OSCE/ODIHR Election Observation Mission Report.” Warsaw,
June 2, 2004.
OSCE Office for Democratic Institutions and Human Rights. 2008. “OSCE/ODIHR Regrets that
Restrictions Force Cancellation of Election Observation Mission to Russian Federation.” Press
release, Warsaw, February 7, 2008.
R Development Core Team. 2005. R: A Language and Environment for Statistical Computing. R
Foundation for Statistical Computing. Vienna, Austria. ISBN 3-900051-07-0.
URL http://www.R-project.org
Table 1: Distribution of Last Digits for UIK Vote Totals in Russian Elections 2003–2008

                 2003              2004              2007              2008
Digit        Repub.   Oblast   Repub.   Oblast   Repub.   Oblast   Repub.   Oblast
0               6.2      3.0      9.9      4.7     10.5      7.7     15.4     10.5
1               2.0      0.1      1.8      1.3      0.8      0.1      1.4      0.7
2               0.6      0.4      1.1      0.5      1.3      1.7      2.1      0.2
3               1.2      1.0      1.8      0.7      0.6      1.4      1.9      2.1
4               0.7      0.3      3.3      0.8      3.4      1.0      3.7      1.3
5               3.1      0.9      2.1      3.3      2.1      0.2      2.7      0.8
6               1.8      0.9      0.1      0.0      2.8      1.8      1.3      1.1
7               0.2      0.9      2.0      2.2      1.1      0.9      3.1      0.0
8               0.7      1.5      0.7      2.1      0.3      0.8      1.4      1.0
9               3.5      0.8      3.4      4.0      2.3      3.5      3.2      4.9
$\chi^2_L$     69.9     15.4    137.2     60.9    144.2     82.0    292.2    143.0
n            17,008   77,305   17,600   77,824   17,875   77,928   17,865   78,383
Notes: Entries for each digit show signed square roots of chi-squared statistics implied by the null hypothesis that the total number of votes cast at each UIK (polling station) has a uniformly distributed last digit. The $\chi^2_L$ statistics show the overall Pearson chi-squared statistic (9 degrees of freedom). n shows the number of UIKs.
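The computation described in the notes to Table 1 can be illustrated with a minimal sketch. It assumes the per-digit entries are the standardized deviations (observed minus expected, divided by the square root of the expected count) whose squares sum to the Pearson statistic. The function name `last_digit_test` and the sample vote totals are hypothetical and are not taken from the paper's data or code.

```python
import math
from collections import Counter

# 0.95 quantile of the chi-squared distribution with 9 degrees of freedom
CRITICAL_VALUE_05_DF9 = 16.919


def last_digit_test(vote_totals):
    """Test the null hypothesis that last digits are uniform on 0..9.

    Returns (signed_roots, chi2): signed_roots[d] is the signed square
    root of digit d's contribution to the Pearson chi-squared statistic,
    and chi2 is the overall statistic (9 degrees of freedom).
    """
    counts = Counter(total % 10 for total in vote_totals)
    expected = len(vote_totals) / 10.0  # uniform null: n/10 per digit
    signed_roots = []
    chi2 = 0.0
    for digit in range(10):
        deviation = counts.get(digit, 0) - expected
        signed_roots.append(deviation / math.sqrt(expected))
        chi2 += deviation ** 2 / expected
    return signed_roots, chi2


# Hypothetical totals: 100..199 is uniform in its last digit; the extra
# twenty totals ending in 0 mimic an excess of rounded counts.
sample_totals = list(range(100, 200)) + [500] * 20
roots, chi2 = last_digit_test(sample_totals)
for digit, root in enumerate(roots):
    print(f"digit {digit}: {root:+.2f}")
print(f"chi2 = {chi2:.1f}, exceeds 5% critical value: {chi2 > CRITICAL_VALUE_05_DF9}")
```

In this toy input only the digit 0 has a positive entry, which is the pattern an excess of rounded totals would produce.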
Figure 1: UIK turnout, 2003 and 2004
[Density plots of the distribution of turnout across UIKs; vertical axis: Density. Panels: 2003 Oblasts (N = 77757, bandwidth = 0.0146); 2003 Republics (N = 17347, bandwidth = 0.02346); 2004 Oblasts (N = 77826, bandwidth = 0.01452); 2004 Republics (N = 17600, bandwidth = 0.01942).]
Figure 2: UIK turnout, 2007 and 2008
[Density plots of the distribution of turnout across UIKs; vertical axis: Density. Panels: 2007 Oblasts (N = 77930, bandwidth = 0.01422); 2007 Republics (N = 17875, bandwidth = 0.02052); 2008 Oblasts (N = 78384, bandwidth = 0.01447); 2008 Republics (N = 17865, bandwidth = 0.01786).]
Figure 3: UIK United Russia vote proportion by turnout, 2003, 2004, 2007 and 2008, republics
[Four panels; horizontal axis: turnout proportion. Vertical axes: Edinaya Rossiya vote proportion (2003); Putin vote proportion (2004); EDINAYA ROSSIYA vote proportion (2007); Medvedev vote proportion (2008).]
Figure 4: UIK United Russia vote proportion by turnout, 2003, 2004, 2007 and 2008, oblasts
[Four panels; horizontal axis: turnout proportion. Vertical axes: Edinaya Rossiya vote proportion (2003); Putin vote proportion (2004); EDINAYA ROSSIYA vote proportion (2007); Medvedev vote proportion (2008).]
Figure 5: UIK Communist vote proportion by turnout, 2003, 2004, 2007 and 2008, republics
[Four panels; horizontal axis: turnout proportion. Vertical axes: KPRF vote proportion (2003); Haritonov vote proportion (2004); Kommunisticheskaya vote proportion (2007); Zyuganov vote proportion (2008).]
Figure 6: UIK Communist vote proportion by turnout, 2003, 2004, 2007 and 2008, oblasts
[Four panels; horizontal axis: turnout proportion. Vertical axes: KPRF vote proportion (2003); Haritonov vote proportion (2004); Kommunisticheskaya vote proportion (2007); Zyuganov vote proportion (2008).]
Figure 7: UIK United Russia second-digit by turnout, 2003, 2004, 2007 and 2008, republics
[Four panels; horizontal axis: turnout proportion. Vertical axes: 2003 Edinaya Rossiya 2d digit; 2004 Putin 2d digit; 2007 EDINAYA ROSSIYA 2d digit; 2008 Medvedev 2d digit.]
Figure 8: UIK United Russia second-digit by turnout, 2003, 2004, 2007 and 2008, oblasts
[Four panels; horizontal axis: turnout proportion. Vertical axes: 2003 Edinaya Rossiya 2d digit; 2004 Putin 2d digit; 2007 EDINAYA ROSSIYA 2d digit; 2008 Medvedev 2d digit.]
Figure 9: UIK Communist second-digit by turnout, 2003, 2004, 2007 and 2008, republics
[Four panels; horizontal axis: turnout proportion. Vertical axes: 2003 KPRF 2d digit; 2004 Haritonov 2d digit; 2007 Kommunisticheskaya 2d digit; 2008 Zyuganov 2d digit.]
Figure 10: UIK Communist second-digit by turnout, 2003, 2004, 2007 and 2008, oblasts
[Four panels; horizontal axis: turnout proportion. Vertical axes: 2003 KPRF 2d digit; 2004 Haritonov 2d digit; 2007 Kommunisticheskaya 2d digit; 2008 Zyuganov 2d digit.]
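As a point of reference for the second-digit figures, the following minimal sketch computes the second-digit distribution implied by Benford's Law and its mean, which is about 4.187, and compares it with the mean second digit of a set of vote counts. The function names and the sample counts are hypothetical. This is an illustration of the standard second-digit Benford formula, not the authors' code, and it does not reproduce the smoothing over turnout shown in the figures.

```python
import math


def benford_second_digit_probs():
    """P(second digit = j) for j = 0..9 under Benford's Law.

    Sums log10(1 + 1/(10k + j)) over the possible first digits k = 1..9.
    """
    return [
        sum(math.log10(1.0 + 1.0 / (10 * k + j)) for k in range(1, 10))
        for j in range(10)
    ]


def benford_second_digit_mean():
    """Expected value of the second digit under Benford's Law."""
    return sum(j * p for j, p in enumerate(benford_second_digit_probs()))


def second_digit(count):
    """Second significant digit of a positive integer, or None if count < 10."""
    text = str(count)
    return int(text[1]) if len(text) >= 2 else None


def observed_second_digit_mean(counts):
    """Mean second digit over the counts that have at least two digits."""
    digits = [second_digit(c) for c in counts if c >= 10]
    return sum(digits) / len(digits)


# Hypothetical vote counts for one candidate across a few polling stations.
sample_counts = [123, 45, 7, 908, 310, 262]
print([round(p, 4) for p in benford_second_digit_probs()])
print(round(benford_second_digit_mean(), 3))
print(round(observed_second_digit_mean(sample_counts), 3))
```

Counts below 10 have no second digit and are dropped here; that handling is an assumption of this sketch.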