All content in this area was uploaded by Alexander Siegenfeld on Nov 07, 2022
How Democracies Polarize: A Multilevel Perspective
Sihao Huang,¹,² Alexander F. Siegenfeld,¹,² and Andrew Gelman³
¹ Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
² Center for Constructive Communication, MIT Media Lab, Cambridge, MA, USA
³ Departments of Statistics and Political Science, Columbia University, New York, NY, USA
Democracies employ elections at various scales to select officials at the corresponding levels of
administration. The geographic distribution of political opinion, the policy issues delegated to
each level, and the multilevel interactions between elections can all greatly impact the makeup
of these representative bodies. This perspective is not new: the adoption of federal systems has
been motivated by the idea that they possess desirable traits not provided by democracies on a
single scale. Yet most existing models of polarization do not capture how nested local and national
elections interact with heterogeneous political geographies. We begin by developing a framework to
describe the multilevel distribution of opinions and analyze the flow of variance among geographic
scales, applying it to historical data in the United States from 1912 to 2020. We describe how
unstable elections can arise due to the spatial distribution of opinions and how tradeoffs occur
between national and local elections. We also examine multi-dimensional spaces of political opinion,
for which we show that a decrease in local salience can constrain the dimensions along which elections
occur, preventing a federal system from serving as an effective safeguard against polarization. These
analyses, based on the interactions between elections and opinion distributions at various scales, offer
insights into how democracies can be strengthened to mitigate polarization and increase electoral
representation.
I. INTRODUCTION
Polities often span vast geographic regions and encom-
pass groups with diverse interests. Democracies con-
tend with this heterogeneity by distributing represen-
tation and governance across levels. Since at least the
eighteenth century, political philosophers have empha-
sized the need to balance collective action with citizen
representation, with Montesquieu arguing that multilevel
governance combines “the internal advantages of a [small]
republic” with the “external force of a monarchical gov-
ernment” [1]. Other scholars like James Madison also be-
lieved that decentralized governance could guard against
polarization. Madison asserted in Federalist No. 10 that
a “pure democracy . . . can admit of no cure for the mis-
chief of factions,” while a well constructed Union has a
“tendency to break and control the violence of faction”
[2]. One of their key contentions was that democracies
must be built as multiscale systems, motivated by the
heterogeneous distribution of voters and the need to de-
volve policy-making responsibilities. This advocacy was
taken up with great enthusiasm: by the turn of the 21st
century, 95% of all democracies were electing subnational
tiers of government [3].
How this multilevel system interacts with political ge-
ography can greatly impact the type of polarization ob-
served in a country. Even if the overall set of voter
opinions is fixed, considerably different electoral out-
comes can result depending on the geographic distribu-
tions of these opinions [4–6]. For a hypothetical nation
in which differing opinions are geographically well-mixed,
most political contention would be resolved locally while
larger-scale politics remains depolarized. On the other
extreme, in a nation in which voter opinions are per-
fectly sorted into districts, all contention must be re-
solved through larger-scale elections or legislative bod-
ies. Real democracies lie between these extremes, with
opinion variance spread across levels. Disagreement is
resolved in a multilevel fashion, with some compromise
achieved through local government or the election of leg-
islators from localized districts and some through larger-
scale elections and legislative bodies.
Put mathematically, democratic governance at any
particular scale is a mean-field treatment¹ of the electorate,
in that the disparate views and needs must be
collapsed into a single instrument that works on average.
However, the distribution of political opinions is often
poorly described by a mean-field theory due to strong ge-
ographic correlations across multiple spatial scales, born
out of factors such as urban history, clustering, and so-
cial ties [8, 9]. For instance, models that assume no geo-
graphic correlations between voters yield behaviors gov-
erned by the central limit theorem, which have been em-
pirically shown to overestimate the probability of close
elections in larger jurisdictions. These inappropriate as-
sumptions have misleading political implications regard-
ing how jurisdictions should be weighted in bodies such
as the Electoral College [10].
Furthermore, such correlations in political opinion suggest
a potential mismatch between the electorate and the
mean-field instruments that would represent it, since such
instruments are best suited for systems in which deviations
from the mean are sufficiently uncorrelated. This mismatch
can lead to failures in the electoral process, which include
unstable elections (in which a slight change in electorate
opinions can lead to a large swing in the election outcome)
and negative representation (in which a shift in one’s
opinion position can move the outcome in the opposite
direction) [11]. Democracies, therefore, need to take into
consideration not only how the geographic distribution of
opinions affects the fairness of multistage elections (e.g.,
via districting and apportionment) [9, 12], but also the
makeup of representative bodies and the devolution of
policy scope across scales.
¹ See section III of reference [7] for a non-technical review of
mean-field theory and the conditions under which it applies.
arXiv:2211.01249v1 [stat.AP] 2 Nov 2022
Nationalization of political discourse further aggra-
vates the discrepancy between how institutions are de-
signed and how political opinions are distributed. The
United States, for instance, has seen a decline in
spatially-bound media, a deepening urban-rural divide,
and further centralization of government authority in
recent decades. Gubernatorial elections have become
increasingly aligned with state-level presidential votes
since the 1980s and are almost perfectly predicted to-
day using presidential ballots in those districts without
state-specific information [13]. The original architects
of America’s federal system had assumed that the “first
and most natural attachment of the people will be to
the governments of their respective states” [14], but this
premise has been gradually eroded. Campaign contribu-
tions, voter turnout, and search interests indicate that
Americans identify overwhelmingly with national party
politics, undermining the ability of the federal system to
guard against polarization.
Starting with a framework that takes the distribution
of political opinions as given (and therefore applies re-
gardless of the mechanism of opinion formation), we de-
rive a model for understanding polarization and repre-
sentation in multilevel democracies. We first provide a
mathematical formalism for describing opinion hetero-
geneity in section II and examine how this geographic
heterogeneity affects elections in the context of social ties
and segregation in section III A. We connect this formal-
ism to empirical data by analyzing how the spread of
opinion variance across spatial scales has changed over
time in the United States and how these changes relate
to the level of polarization in representative bodies.
The analysis thus far applies to electoral systems with
any number of parties or active issue dimensions. How-
ever, phenomena such as the ideological alignment be-
tween local and national elections require us to explicitly
consider multiple issue dimensions [15]. In section IV,
we explore the consequences of opinion measurements
and introduce the concept of an election subspace. This
multidimensional analysis offers an explanation for why
measurements of mass polarization often lag behind elite
polarization and reveals that elections have a tendency
to occur along the axis of maximum opinion variance. In
a system with multiple levels, a tradeoff emerges between
national and local elections, resulting in higher electoral
variance for elections that play a bigger role in defining
political discourse. Combining these arguments with an
analysis of social interactions and the geographic distri-
bution of opinions leads to the conclusion that greater na-
tional (as opposed to local) salience leads to increased po-
larization and instability in larger-scale elections. These
results parallel the situation in the United States, in
which “hollowed-out,” “top-heavy” parties that used to
be largely local have led to increasingly unstable national
elections and non-competitive local offices [16].
Scholarship on fiscal federalism has shown that “not
all federations are created equal” [17]. Different relationships
between the political composition and fiscal structures of
various levels of government can lead to vastly
divergent financial outcomes. Each level of government
also has its comparative advantages. For instance, the
need for tight feedback and diseconomies of scale may
make local governments more effective implementers of
developmental policy, while the requirement to coordi-
nate regional policies that prevent a race to the bottom
can make the national government more suited to redis-
tribution [18].
We suggest that in addition to matching the multilevel
complexity [7] of government to that of the policy envi-
ronment, the multilevel distribution of opinions must also
be considered. In other words, the efficacy of a federal
democracy rests on three pillars: the multilevel struc-
ture of political institutions, the multilevel complexity
of the policy environment, and the multilevel distribu-
tion of political opinions. The first two pillars cannot
be considered independently of the third if citizen opin-
ions are to be well represented. Apart from normative
concerns, a failure to stably represent these opinions can
result in gridlock and extreme levels of polarization. Our
analysis finds a tradeoff between larger-scale governance
and the satisfaction of citizen preferences—the variance
of which increases with geographic scale. Devolving pow-
ers to local levels can reduce negative representation and
electoral instability, particularly if such powers are along
issue dimensions for which there is substantial geographic
polarization and segregation.
II. HOW DIFFERENCES IN OPINION ARE
DISTRIBUTED ACROSS GEOGRAPHIC SCALES
Prior works have noted that political polarization is
fractal in nature, meaning that it persists at every scale
as one zooms into the map [9, 22]. To quantify this geo-
graphic heterogeneity, we begin with a method of break-
ing up the variance in opinion into the variance arising
from each scale—e.g., the variance in opinions among
towns, counties, states, and even countries (as is the case
of European Parliament elections)—by conditioning the
law of total variance upon multiple scales:
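\[
\mathrm{Var}(z) = \mathrm{E}\big(\mathrm{Var}(z \mid W_1)\big) + \mathrm{E}\big(\mathrm{Var}(\mathrm{E}(z \mid W_1) \mid W_2)\big) + \dots + \mathrm{E}\big(\mathrm{Var}(\mathrm{E}(z \mid W_{N-1}) \mid W_N)\big) + \mathrm{Var}\big(\mathrm{E}(z \mid W_N)\big). \tag{1}
\]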
A derivation can be found in appendix A. Here, z is a
random variable that samples across all individual opinions
in the country. The W_i are random variables that correspond
to regions at scale i, with W_N corresponding to regions at
the largest scale. Each W_i has a probability weight
proportional to the population of the region it denotes. As
an example, the total variance of the opinion distribution
in the U.S., Var(z), can be broken down into the sum of the
average within-city variance of individual opinions, the
average within-county variance of the mean opinions of
cities, the average within-state variance of the mean
opinions of counties, and the variance of the mean opinions
of U.S. states. This breakdown holds for any random variable
z, so it can be employed for opinion distributions, election
outcomes, etc.
FIG. 1. (a) Using a map of precinct-level returns [19], we provide an illustration of a coarse-graining process in which
progressively larger numbers of precincts are grouped together. (b) We employ a variation of this method based on k-d tree
partitioning such that each branch contains an equal number of precincts to decompose the opinion variance added at each scale
(see equation 1) for the 2016 and 2020 presidential elections across the continental U.S. Two lines are shown for each election
year: the black line represents the added variance with geographically-based aggregation on precinct-level data [20, 21], while
the dashed purple line shows the case when precincts are randomly aggregated with no regard to geography. The logarithmic
x-axis indicates the number of regions into which precincts are grouped. For example, at the smallest scale (toward the right),
no precincts are grouped together and the number of regions is equal to the number of precincts, while at larger scales, precincts
are grouped into successively fewer regions. The line of slope 1 in the random aggregation case is characteristic of the central
limit theorem. The much smaller slope in the geographic aggregation lines indicates the presence of correlations that persist to
large scales. (c) Here, we show the total (rather than added) variance at each level of resolution for the two elections. As the
resolution/number of regions is increased (i.e., scale is decreased/regions are disaggregated), the total variance between regions
increases.
Since political institutions (e.g., city governments,
state legislatures, and Congress) operate at different
scales, this perspective enables us to quantify the levels
at which geographic polarization occurs. As representa-
tive bodies capture the total variance at the scale of the
election district, one may observe very different levels of
polarization in their chambers even if the total opinion
variance of the population Var(z) is held constant. For
instance, if next-door neighbors differ greatly but there
is little variance among the average political opinion of
towns and cities (strong polarization at small scales), lo-
cal politics may be contentious, with a moderate climate
in state and national chambers. Similarly, if towns and
cities have divergent opinions but there is little variance
between the aggregate opinions of states (strong polar-
ization at large scales), we might expect contentious pol-
itics at the state level, with national differences remain-
ing moderate. When there is still substantial variance
in opinion when aggregated at the state level, we may
expect a tribal Congress and presidential elections that
divide rather than unite.
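The decomposition in equation 1 can be checked numerically. The following is a minimal sketch on synthetic data, not the precinct data analyzed in this paper: it assumes three nested scales (individuals within counties within states), equal populations in every region so that all probability weights are uniform, and arbitrarily chosen spreads at each scale. All variable names are illustrative.

```python
import numpy as np

# Synthetic, equal-sized nested regions (an assumption made for simplicity):
# states contain counties, and counties contain individuals.
rng = np.random.default_rng(0)
n_states, n_counties, n_people = 5, 8, 200

state_shift = rng.normal(0.0, 0.3, size=(n_states, 1, 1))
county_shift = rng.normal(0.0, 0.5, size=(n_states, n_counties, 1))
noise = rng.normal(0.0, 1.0, size=(n_states, n_counties, n_people))
z = state_shift + county_shift + noise  # individual opinions

# Terms of the decomposition (population variances, i.e. ddof=0).
within_county = z.var(axis=2).mean()            # E[Var(z | W1)]
county_means = z.mean(axis=2)                   # E[z | W1]
within_state = county_means.var(axis=1).mean()  # E[Var(E[z | W1] | W2)]
state_means = county_means.mean(axis=1)         # E[z | W2]
between_states = state_means.var()              # Var(E[z | W2])

total = z.var()
parts = within_county + within_state + between_states
print(f"total variance     = {total:.6f}")
print(f"sum of scale terms = {parts:.6f}")
```

With unequal populations, each mean and variance over regions would need to be weighted by population, as stated in the definition of the W_i.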
Figure 1 shows the multiscale breakdown of variance
in the U.S. using precinct-level presidential results from
2016 and 2020 [20, 21]. This plot can be interpreted as
the change in population variance with patch size [23, 24],
illustrating the fractal nature of polarization when one
looks at the electoral map with sequentially finer reso-
lution. The distribution of opinions also possesses geo-
graphic correlations that persist at large scales: although
randomly aggregating regions with no regard to geog-
raphy will yield central limit-type behavior (as shown
by the roughly diagonal lines in (c), where the standard
deviation of the average scales with the number of
precincts n as 1/√n), the geographically aggregated data
has a much smaller slope. This correlation has implications
when considering the relative voting power of individuals in
smaller and larger states [10].
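The contrast between random and geographic aggregation can be illustrated with a toy calculation. This sketch does not reproduce the k-d tree analysis of figure 1. It assumes a hypothetical one-dimensional "geography" in which precinct opinions follow a smooth large-scale trend plus independent local noise, and it compares grouping neighboring precincts against grouping shuffled ones.

```python
import numpy as np

rng = np.random.default_rng(1)
n_precincts, group_size = 16384, 64
position = np.arange(n_precincts) / n_precincts

# Hypothetical precinct opinions: a smooth large-scale trend plus local noise.
opinions = np.sin(2 * np.pi * position) + rng.normal(0.0, 1.0, n_precincts)

def std_of_group_means(values, size):
    """Standard deviation of the means of consecutive groups of `size` values."""
    return values.reshape(-1, size).mean(axis=1).std()

# Neighboring precincts grouped together (geographic aggregation).
geographic = std_of_group_means(opinions, group_size)
# Precincts grouped with no regard to position (random aggregation).
shuffled = std_of_group_means(rng.permutation(opinions), group_size)
# Central-limit expectation for uncorrelated values: sigma / sqrt(n).
clt_prediction = opinions.std() / np.sqrt(group_size)

print(f"geographic: {geographic:.3f}")
print(f"shuffled:   {shuffled:.3f}  (CLT prediction {clt_prediction:.3f})")
```

In this toy setting, the shuffled grouping follows the 1/√n scaling, while the geographic grouping retains most of the large-scale variance.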
Figure 2 shows that within-county variance has de-
creased in the U.S. starting around the mid-1980s, trans-
lating into a rise in inter-regional variance at larger scales.
Although the growth in partisan polarization has its roots
in many complex factors, it has coincided with the rise
of large-scale variance and what is often described as the
nationalization of American politics [13]. Splitting the
data at the inflection point, the average margin in
electoral votes in presidential elections (which roughly
captures state-scale polarization) stood at 307 from 1916
to 1984, while the average margin from 1988 to 2020
decreased significantly to 138. In the following sections,
we continue to explore the implications of this multilevel
polarization and discuss the potential relationship between
these trends.
FIG. 2. (a) Extending the analysis in figure 1, we use U.S. county-level presidential returns reaching back to 1912 [25] to
identify the flow of variance across scales, normalized by the total variance p(1 − p), where p is the vote share of the winning
presidential candidate. The lines correspond to the added variance at each scale, where the scale is labeled by the number of
groups in which counties are geographically aggregated. For instance, the top (dark purple) line corresponds to the variance
added by individual counties (≈3000 groups), and the bottom (light yellow) line corresponds to the variance added at the
largest scale, for which the U.S. is divided geographically into four groups. The vast majority of the variance in the system
(around 95%) is contained within the county level. (b) Instead of the variance added at each scale, we can explicitly examine
the variance between mean county opinions and the variance between mean state opinions (dashed lines indicate LOESS fits).
The total inter-county variance is equal to the sum of all lines in panel (a), excluding the within-county variance. We observe a
sharp increase in inter-county variance after 1990, which coincides with the continued increase in partisan distance in Congress
as measured by the difference in average DW-NOMINATE scores across the two parties [26] shown in (c).
Elections acting at each scale contain only the cumula-
tive variance up to that level, represented by the sum of
terms in equation 1 up to the corresponding scale. Thus,
larger-scale elections must resolve more variance. The
variance that must be resolved for any electoral or policy
decision depends on the scale where that choice is made,
independent of any intermediate representation. As we
show in appendix A, in any legislative body, the variance
among the legislators plus the variance between each leg-
islator and her/his constituents always equals or exceeds
the variance of the entire electorate that the legislative
body represents.
In other words, for any policy choice at a given scale,
the cumulative variance (along the relevant issue dimen-
sion) up to that scale must be settled via the electoral sys-
tem or the agency of public officials. This is a mathemat-
ical statement of the arguments advanced by John Milton
and James Harrington, who saw the virtues of a federal
government in tailoring services to the expressed needs
of various subpopulations [27]. By customizing policies
for each administrative unit, multilevel governance al-
lows variance to be resolved in an efficient way without
pushing it up to higher (e.g., national) levels. This is
especially true if there is little disagreement within but
significant disagreement among administrative units at
that geographic scale.
The idea that a fixed quantity of variance must be
resolved has implications for how responsibility is dis-
tributed in a multilevel system. While elevating the scale
of policy implementation may be necessary to match the
policy’s multiscale complexity with that of the issue it
is attempting to address [7], three potential drawbacks
should be kept in mind. First, if differences in polit-
ical opinion arise from genuinely different needs, then
the one-size-fits-all approaches necessitated by larger-
scale decision-making will be suboptimal. Second, larger-
scale policy implementation will require more compro-
mise, with some forced to be bound by the opinions of
those residing in completely different areas of the country.
Third, attempting to compromise over too much variance
in opinion at too large a scale can lead to a destabilizing
amount of political polarization.
III. ELECTIONS, SOCIAL TIES, AND
SEGREGATION
Having established a framework to study the geographic
distribution of opinions, we now turn our attention to the
effects of elections. Here, we introduce a general model of
elections to be used throughout the rest of the manuscript:
let S be the space of all possible opinions, which we assume
can be embedded in a d-dimensional opinion space R^d for
some d. Then, an election is defined as the process
y: S^n → S that outputs the opinion of the election winner
y ∈ S, where n is the total number of potential voters. Any
election, regardless
or the presence of a multi-tiered aggregation process like
the Electoral College), can be described this way, as a
map from a set of citizen opinions to that of an elected
official. In this formulation, candidate positions are en-
dogenous; in other words, the space of possible outcomes
of an election is not the discrete set of the positions of
candidates who happen to run, but rather the space of
all possible candidate positions that could arise. Thus,
the election outcome can vary continuously with the elec-
torate (though it need not necessarily do so—see below),
even though any particular election will end up being a
choice between a finite number of candidates. Defining
elections as maps from a given set of electorate opinions
S^n requires the opinions be considered at a particular
snapshot in time. Thus, there is the implicit assumption
that the geographic opinion distribution will be qualita-
tively similar regardless of the precise time at which the
opinions are considered, e.g., one year before the elec-
tion, one month before the election, or the election day
itself. This is a good approximation when voters have
relatively coherent and stable opinions in the timescale
of interest [28]. In this section, we consider phenomena
that apply regardless of the dimension d of the opinion
space, while in section IV we consider explicitly
multidimensional phenomena, i.e., phenomena that cannot be
explained if d = 1. Although the ideas in this section apply
for all d, we will assume d = 1 for ease of exposition.
As described in previous work (with slightly different
notation) [11], two key failure modes of an election are
instability and negative representation. Heuristically, in-
stability refers to the phenomenon in which small changes
in the electorate can cause large swings in the election
outcome; for instance, the U.S. presidency swung from
Obama to Trump, and then Trump to Biden, despite rel-
atively small changes in electorate opinion. These large
swings in outcomes correspond to the “alternate domi-
nation of one faction over another” George Washington
characterized as a “frightful despotism” in his farewell
address [29].
We formalize this notion by defining an election to be
unstable if the function y: S^n → S is discontinuous, i.e.,
if an arbitrarily small change in electorate opinions can
cause a finite shift in election outcome. We can also speak
of the magnitude of instability, which corresponds to the
magnitude of change in the election outcome that an ar-
bitrarily small change in electorate opinions can produce.
Instability can never be directly observed, as it involves a
counterfactual in which the electorate has slightly differ-
ent opinions. Nonetheless, instability can be inferred if
swings in outcome from election to election are far larger
than could be plausibly expected of swings in electorate
opinions.² We should also expect some stochasticity in
electorate opinions, which will result in noise of similar
or lesser magnitude in the outcomes of stable elections.
However, in unstable elections, small fluctuations in elec-
torate opinions (whether treated as random or part of the
model) could shift the outcome between radically differ-
ent candidates.
² U.S. presidential elections from 1944 to 2012 were analyzed and
found to undergo a phase transition from stability to instability
around 1970 [11].
Negative representation refers to the phenomenon in
which a leftward shift in electorate opinions causes the
election outcome to move to the right, or vice versa. For
instance, in U.S. elections, a leftward shift in progressive
voters may result in their becoming disillusioned with
both major-party candidates, causing them to not vote
at all or vote for a third-party candidate, which could lead
election outcomes to move to the right. Another mechanism
by which negative representation can arise is through the
party primary systems: a shift in voters of one party
away from the center may result in a less electable party
nominee, which would shift the ultimate election outcome
in the opposite direction. Formally, the representation of
the opinion x_i of an individual i can be defined as the
causal effect of a shift in that opinion on the election
outcome,³
\[
r_i = \frac{\partial y}{\partial x_i}, \tag{2}
\]
and can be positive or negative.
We consider instability undesirable for two reasons.
First, only a small change in electorate opinions is neces-
sary to significantly change the election outcome, which
both makes elections more susceptible to harmful influ-
ences (e.g., special interests) and also thereby incentivizes
such influences. Second, unstable elections necessarily
contain negatively represented opinions,⁴ to which the
election is, perversely, anti-responsive. The relationship
between negative representation and instability holds
regardless of the election mechanism (e.g., the existence of
party primaries, the presence or absence of the Electoral
College,⁵ etc.) or the opinion distribution of the
electorate. We direct the reader to reference [11] for more
details.
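To make the two failure modes concrete, the following sketch uses a toy election that is an assumption of this example and is not claimed to be the model of reference [11]: the winning position y maximizes the summed Gaussian utility Σ_i exp(−(x_i − y)²/2). The function names `election` and `representation` are hypothetical. Representation is obtained by implicit differentiation of the first-order condition and checked against a finite difference; instability is shown by nudging a single voter in an evenly split, polarized electorate.

```python
import numpy as np

def election(x):
    """Toy election: the winner y maximizes sum_i exp(-(x_i - y)^2 / 2)."""
    grid = np.linspace(-6.0, 6.0, 2401)
    total = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2).sum(axis=0)
    y = grid[np.argmax(total)]           # coarse global search
    for _ in range(50):                  # Newton refinement of the maximum
        d = x - y
        w = np.exp(-0.5 * d ** 2)
        y -= np.sum(d * w) / np.sum((d ** 2 - 1.0) * w)
    return y

def representation(x):
    """r_i = dy/dx_i from implicit differentiation of the first-order condition."""
    d = x - election(x)
    curvature = (d ** 2 - 1.0) * np.exp(-0.5 * d ** 2)
    return curvature / curvature.sum()

# Unimodal electorate: voters more than one unit from the winner have r_i < 0.
x = np.linspace(-2.5, 2.5, 101)
r = representation(x)

# Finite-difference check of r for the leftmost voter.
eps = 1e-4
up, down = x.copy(), x.copy()
up[0] += eps
down[0] -= eps
r0_numeric = (election(up) - election(down)) / (2 * eps)

# Polarized electorate: two equal clusters, so the outcome hinges on tiny shifts.
cluster = np.linspace(-0.5, 0.5, 50)
polarized = np.concatenate([cluster - 2.0, cluster + 2.0])
nudged_right, nudged_left = polarized.copy(), polarized.copy()
nudged_right[0] += 1e-3   # leftmost voter moves slightly right
nudged_left[0] -= 1e-3    # leftmost voter moves slightly left
y_after_right = election(nudged_right)
y_after_left = election(nudged_left)

print(f"sum of r = {r.sum():.6f}, r[leftmost] = {r[0]:.4f}, r[median] = {r[50]:.4f}")
print(f"outcome after nudge right: {y_after_right:.3f}, after nudge left: {y_after_left:.3f}")
```

In this toy model, a leftward nudge of one voter by 0.001 moves the outcome from near −2 to near +2. This is both a discontinuity in y and an instance of negative representation, consistent with the claim that unstable elections contain negatively represented opinions.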
For ease of notation, we will often consider the elec-
tion to act on a distribution of electorate opinions f(x),
such that the election outcome can be written as y(f)
and representation as r_i = r(f, x_i). This differs from the
more general formalism above in that it cannot distin-
guish between who holds which opinions.
³ As shown in reference [11], this definition of representation generalizes
the Owen-Shapley voting power index that is commonly
employed in the election literature. There exist specific election
functions (i.e., the election outcome as a function of electorate
opinions) which recover the deterministic and probabilistic
Owen-Shapley indices under their respective assumptions.
⁴ For the case of unstable elections, representation may need to
be defined for a specific finite change in opinion rather than as
a derivative, since the derivative may not exist; see ref. [11] for
more details.
⁵ The fact that the Electoral College and the popular vote can
yield such different election outcomes is itself a symptom of electoral
instability. A stable election would only be so close as
to be swayed by such factors if the candidates themselves were
relatively similar.
A. Accounting for social ties
Ideally, democracies are not just mechanisms for opin-
ion aggregation but forums through which citizens and
representatives collaborate to reach a common solution.
This concept of deliberation is sometimes argued to be
the source of democratic legitimacy, embodying the ideas
of rational legislation and participatory governance [30].
Tocqueville described deliberation as being driven by
“enlightened self-interest” [31]: a compulsion for citizens
to take into account the opinions of others—particularly
those they interact closely with—to maximize long-term
payoff. This consideration is more likely to occur among
individuals with strong social ties, which in turn are geo-
graphically correlated [32, 33]. Scholars have thus argued
that deliberation is a scale-dependent phenomenon, with
Plato and Aristotle famously stating that the ideal size
of a polis should not exceed 5040 citizens [34]. Indeed, a
key argument made for federal governments is that they
combine the ability for small states to foster participation
with the advantages of a large republic [35].
Here, we develop a general model to explore how po-
larization and representation are affected by social ties.
Unlike previous studies that examine how social networks
affect information transfer [36] and opinion formation
[37–39], we do not make assumptions about how pref-
erences diffuse and evolve. Rather, we take the opinions
of voters as given (as described in section III above) and
impose on them a change in voting behavior based on the
set of social neighbors to capture the multiscale effects of
these interactions. We show that although social ties
can be beneficial in encouraging deliberation, this type
of interaction may conversely aggravate polarization if
insular patterns of political socialization emerge [40]. In
section III B, we use this model to understand the geo-
graphic interactions between social ties and elections held
at various levels.
A citizen’s effective opinion x′ is defined as a weighted
average of the opinions of themselves and their neighbors:
\[
x'_i = \sum_j T_{ij} x_j, \tag{3}
\]
where T_ij is some social connectivity matrix defined such
that for each individual i, Σ_j T_ij = 1 (so that
translational invariance is maintained), yielding an
effective opinion distribution f̂(x′) on which the election
acts.⁶ In other words, in the presence of social ties, the
election outcome is given by y(f̂) rather than y(f). Because
the model does not make explicit assumptions about opinion
dynamics and can accommodate a wide variety of social
network structures (encoded by T_ij), it can be expected
to be applicable to a wide variety of real-world scenarios.
⁶ More generally, T_ij could also be negative. Negative weights capture
the effect that socializing with certain people causes one to
vote further away from rather than closer to their ideal points
[41]. Negative and positive weights correspond to the effects
of threat and contact theories of interpersonal interactions,
respectively. These notions describe simultaneous and opposing
forces but often operate on different spatial scales; contact
requires frequent inter-personal interactions while threat may be
perceived on a large scale because economic or political compe-
We expect social ties to make representation more eq-
uitable. Indeed, representation can be calculated to be:
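\[
r_i = \left.\frac{\partial y(\hat{f})}{\partial x_i}\right|_{x_{k \neq i}} = \sum_j T_{ji} \left.\frac{\partial y(\hat{f})}{\partial x'_j}\right|_{x'_{k \neq j}} = \sum_j T_{ji}\, r(\hat{f}, x'_j). \tag{4}
\]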
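The middle equality in equation 4 is an application of the chain rule to equation 3. A short sketch of that step, using only the symbols already defined:

```latex
\[
\frac{\partial x'_j}{\partial x_i}
  = \frac{\partial}{\partial x_i} \sum_k T_{jk} x_k
  = T_{ji}
\quad\Longrightarrow\quad
\left.\frac{\partial y(\hat{f})}{\partial x_i}\right|_{x_{k \neq i}}
  = \sum_j \left.\frac{\partial y(\hat{f})}{\partial x'_j}\right|_{x'_{k \neq j}}
    \frac{\partial x'_j}{\partial x_i}
  = \sum_j T_{ji}\, r(\hat{f}, x'_j).
\]
```

Note the transposed index order T_ji: individual i inherits a share of the representation of every individual j who gives weight to i's opinion.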
When social connectivity T_ij is positive ∀ i, j (as it is
in a deliberative democracy), representation will tend
to be more evenly distributed. However, in general, the
total representation will not increase, and individual
representation will remain O(1/N), where N is the size of
the electorate. For instance, for differentiable elections,
we have the exact result ∫_{−∞}^{∞} f(x) r(f, x) dx = 1 [11].
We can also define social representation as the change
in election outcome with respect to the effective opinion,
holding the selfish opinions of everyone else constant:
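\[
\left.\frac{\partial y}{\partial x'_i}\right|_{x_{k \neq i}} = \left.\frac{\partial x_i}{\partial x'_i}\right|_{x_{k \neq i}} \left.\frac{\partial y}{\partial x_i}\right|_{x_{l \neq i}} = \frac{1}{T_{ii}}\, r_i. \tag{5}
\]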
This social representation measure captures the positive-sum
nature of individuals taking into account each other’s
preferences. In a hypothetical group in which everyone
valued the opinions of others equally, every individual
would have a social representation of 1, capturing the fact that
all individuals (such that everyone’s effective preferences
were the same), the government would be fully responsive
to that effective preference.
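Equations 3 to 5 can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the connectivity matrix built by the hypothetical helper `social_matrix` (weight 1 − `strength` on oneself, with the remainder spread evenly over a five-person group), the use of a median-voter election, and the parameter name `strength`, which is not claimed to match any parameterization used in this paper.

```python
import numpy as np

def social_matrix(n, strength):
    """Hypothetical T: (1 - strength) on oneself, `strength` spread evenly over all n."""
    return (1.0 - strength) * np.eye(n) + strength * np.full((n, n), 1.0 / n)

def median_election(opinions):
    """Toy election: the outcome is the median opinion."""
    return np.median(opinions)

def representation(x, T, eps=1e-6):
    """Finite-difference estimate of r_i = dy/dx_i when the election acts on T @ x."""
    r = np.empty(len(x))
    for i in range(len(x)):
        shifted = x.copy()
        shifted[i] += eps
        r[i] = (median_election(T @ shifted) - median_election(T @ x)) / eps
    return r

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.5])  # "selfish" opinions
n = len(x)
no_ties = np.eye(n)                         # nobody weighs anyone else
ties = social_matrix(n, strength=0.4)       # rows sum to 1, as required

r_alone = representation(x, no_ties)        # only the median voter is represented
r_ties = representation(x, ties)            # equation 4: representation is shared
social_r = r_ties / np.diag(ties)           # equation 5
variance_ratio = (ties @ x).var() / x.var() # effective variance shrinks

print("r without ties:", np.round(r_alone, 3))
print("r with ties:   ", np.round(r_ties, 3))
print("social r:      ", np.round(social_r, 3))
print("variance ratio:", round(variance_ratio, 3))
```

In this sketch the total representation stays at 1 while being spread across all five individuals, and the variance of the effective opinions falls by the factor (1 − `strength`)².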
B. Multiscale effects of geographic segregation
As social ties distribute the representation of individu-
als, they have the potential to reduce the amount of neg-
ative representation. They can also potentially decrease
the degree to which the election is prone to instability.
Whether these benefits are realized, however, depends
on the way in which social ties are distributed across the
opinion distribution, which will in turn depend on the
geographic distribution of both social ties and political
opinion.
Social ties that span an electorate reduce the effective
variance of its opinion distribution. Consider the opin-
ion distribution in figure 3b consisting of two normally-
distributed subpopulations. If we consider a connectivity
tition may operate at a state or national level [42]. Similarly, if
social media—which connect people on a national scale—do lead
to a more negative evaluation of differing opinions [43], we may
consider a model in which larger-scale connections have negative
weights while smaller-scale, face-to-face interactions have posi-
tive weights, strengthening the effects described in this section.
FIG. 3. (a) We illustrate the effects of social ties using a bimodal opinion distribution, similar to the aggregate voter and
representative ideal points across the U.S. as estimated by Bafumi and Herron [12, 44]. (b) Within a single locale, social
ties among all the members reduce the variance of each component distribution and shift their means closer to each other.
For sufficiently strong social ties (parameterized by w), the election becomes unimodal, as seen when w= 0.5. (c) Next, we
extend this framework to the multiscale case with two locales. The two locales may be completely identical, or they may each
be politically segregated such that one is biased toward the first peak (black dashed line) and the other toward the second
peak (gray dashed line), e.g., one being a majority-Democratic and the other a majority-Republican jurisdiction. The overall
opinion distribution (and thus the total variance) is the same in both cases. (d) However, as a result of local social ties, the
segregated (heterogeneous) system—in which much of the variance is between the locales—will display more polarization than
the homogeneous system—in which all of the variance is within the locales.
matrix $T_{ij} = w/(n-1)$ for $i \neq j$, where $n$ is the size of
the electorate (and thus $T_{ij} = 1 - w$ for $i = j$), the opinion
distribution is transformed, reducing the distance between
the means of the two subpopulations by a factor
of $1 - w$, while also decreasing their respective scales by
the same amount. More generally, homogeneous social
ties decrease the effective variance regardless of the pre-
cise form of the opinion distribution; see appendix B.
Thus, to the extent that social ties are geographically lo-
calized, this analysis implies that electoral polarization
is less problematic locally than at larger scales.
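The shrinkage produced by homogeneous ties can be illustrated numerically. The sketch below applies the connectivity matrix described above to a synthetic bimodal sample; the sample itself (two Gaussians centered at -1 and 1) and the function names are hypothetical choices for illustration. For a finite electorate the deviations from the mean scale by exactly $1 - wn/(n-1)$, which approaches $1 - w$ as $n$ grows.

```python
import random
import statistics

def apply_homogeneous_ties(x, w):
    """Effective opinions under T_ii = 1 - w and T_ij = w / (n - 1) for i != j."""
    n = len(x)
    total = sum(x)
    return [(1 - w) * xi + w * (total - xi) / (n - 1) for xi in x]

def shrink_ratio(x, w):
    """Ratio of the effective to the original standard deviation."""
    return statistics.pstdev(apply_homogeneous_ties(x, w)) / statistics.pstdev(x)

# Hypothetical bimodal electorate: two normally distributed subpopulations.
rng = random.Random(0)
x = [rng.gauss(-1.0, 0.5) for _ in range(500)] + [rng.gauss(1.0, 0.5) for _ in range(500)]

n = len(x)
ratios = {w: shrink_ratio(x, w) for w in (0.0, 0.25, 0.5)}
exact = {w: 1 - w * n / (n - 1) for w in ratios}
```

Because every deviation from the mean is rescaled by the same factor, both the separation of the two peaks and their widths shrink together, and the mean opinion is unchanged.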
However, while social ties decrease instability within
well-connected locales, they can also increase the overall
polarization of the system depending on the structure of
social connections. Political homophily—in which citi-
zens associate themselves with people of similar political
views—has been observed to be a key process in many
social networks [45]. Consider two groups of separate
ideologies that are socially disconnected from one another.
This is an extreme form of affective polarization, wherein
citizens become unwilling to socialize across party lines
due to the emergence of partisanship as a social identity
[46–48]. Social ties cause the effective opinions within
each group to cluster more sharply around their respec-
tive means. For example, calculating the effective opin-
ion distributions using a bimodal distribution, the cen-
ters of the two Gaussians stay the same, but since their
scales are decreased to $\hat{\sigma} = \sigma(1 - w)$, the effect of social
ties tends to increase instability (appendix B). Growing
instability and the structure of social ties can be self-
reinforcing across the span of several elections, especially
when institutions like strong party systems begin to steer
public opinion [49, 50].
Thus far, we have discussed the effects of social
ties in a hypothetical election where voters are either
well-connected or fully segregated according to partisan
identity. We can generalize this model across multi-
ple levels—following the framework introduced in sec-
tion II—by assuming that social ties are present with
varying strengths at each scale. This assumption is based
on two mechanisms. First, elections at each scale (e.g.,
mayoral or gubernatorial elections) mediate interactions
between voters. Second, despite the complexity of human
networks, social interactions—particularly high-salience
ones that involve face-to-face exchanges—are often geo-
graphically correlated. Therefore, we introduce a hier-
archy of coupling weights wi, each corresponding to the
density of social interactions among citizens within the
same scale-$i$ region.
This model yields two key results. First, since the de-
gree of social ties varies at different scales, the effective
total variance in a country is affected not only by the
opinion variance of the electorate, but also by how those differences
are distributed across levels. Specifically, the terms
in equation 1 are transformed as
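$$\begin{aligned}
\mathrm{Var}(z') ={}& \mathrm{E}(\mathrm{Var}(z \mid W_1)) \left(1 - \sum_{i=1}^{N+1} w_i\right)^2 \\
&+ \mathrm{E}(\mathrm{Var}(\mathrm{E}(z \mid W_1) \mid W_2)) \left(1 - \sum_{i=2}^{N+1} w_i\right)^2 \\
&+ \cdots + \mathrm{Var}(\mathrm{E}(z \mid W_N)) \left(1 - w_{N+1}\right)^2,
\end{aligned} \tag{6}$$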
where $z'$ is a random variable sampling over the effective
opinions and $W_i$ are once again random variables
corresponding to regions at scale $i$. (Note that the scale
$N + 1$ corresponds to the entire country, and thus $w_{N+1}$
denotes the strength of nationwide social ties.) If social
ties are stronger at smaller scales, it would be preferable
for a larger portion of the total variance to be present in
those levels (see figure 3d).
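Equation 6 can be evaluated directly once the variance added at each scale is known. The sketch below is a hedged illustration: the function name, the two-level setup, and all numerical values are hypothetical, chosen to mimic the homogeneous and segregated two-locale systems of figure 3 with local ties only.

```python
def effective_variance(added_variance, weights):
    """Evaluate equation 6.

    added_variance[k] is the k-th variance term of equation 1, ordered from
    the smallest scale (variance within scale-1 regions) to the largest
    (variance between the largest regions).
    weights[k] is the tie strength w_{k+1}; the last entry is the nationwide
    tie strength w_{N+1}.
    """
    if len(added_variance) != len(weights):
        raise ValueError("need one tie strength per variance term")
    total = 0.0
    for k, var_k in enumerate(added_variance):
        # Each term is damped by all ties acting at its own scale or larger.
        damping = 1.0 - sum(weights[k:])
        total += var_k * damping ** 2
    return total

# Hypothetical example: one local scale plus the national scale.
weights = [0.5, 0.0]        # strong local ties, no nationwide ties
homogeneous = [1.0, 0.0]    # all variance lies within locales
segregated = [0.4, 0.6]     # same total variance, mostly between locales

eff_homogeneous = effective_variance(homogeneous, weights)
eff_segregated = effective_variance(segregated, weights)
```

With identical total variance, the segregated configuration retains more effective variance, because local ties damp only the within-locale term.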
Second, we find that increasing segregation at any par-
ticular scale, such that opinions in regions below that
scale are made more homogeneous and opinions above
it are made more heterogeneous, can stabilize local elec-
tions while destabilizing larger-scale ones. This type of
segregation could occur through partisan sorting, urban-
ization, or reverse effects [51–53]. Furthermore, increas-
ing the relative strength of local ties in this situation is
at the detriment of larger-scale elections, essentially ag-
gravating the effect of insular communities. This result
is a multilevel generalization of our previous finding that
segregating social ties across party lines increases the like-
lihood of electoral instability. A detailed discussion can
be found in appendix C.
Although this model of social ties is certainly a simpli-
fied one, it captures the idea of deliberation that is cen-
tral to many arguments for federalism and participatory
democracy. If people of different political opinions were
geographically randomly distributed, election instability
would be significantly reduced, since any individual—
even if they are upset that their own views were not rep-
resented in government—would be surrounded by many
people whose views are represented and would recognize
the need for compromise. In this limit, political opinions
would be well-described by a mean-field theory (i.e., any
individual opinion could be described as the mean opin-
ion plus some uncorrelated noise) and so could be well-
represented by a single instrument (e.g., the national gov-
ernment), although other considerations may still favor
more local forms of representation and policy-making.
The inability of a top-heavy political system to stably
represent the U.S. electorate can be viewed as a con-
sequence of a geographic opinion distribution that is in
reality not so well-mixed.7
7Without deliberation to cut down political polarization, a
Congress consisting of multiple members does little to ameliorate
this problem, as Congress must still come to a single decision for
the entire nation on any given piece of legislation. As discussed
in section II, the problem of compromise is simply shifted from
the electorate to Congress.
FIG. 4. (a) As each candidate aims to build a coalition
of voters, the contest occurs on an axis that approximately
spans the direction of maximum variance. The election can be
approximated as acting on the projection f(x) of the multidi-
mensional opinion distribution onto this axis. (b) The same
process can be repeated for a subset of the whole electorate
(marked in gray), which corresponds to a local election. The
introduction of interactions between the two axes may result
in a local election axis (marked in solid red) that differs from
the axis that would maximize the projected variance of the
local opinion distribution.
IV. MULTIDIMENSIONAL PREFERENCES
Up to this point, we have discussed behaviors whose
essential forms can be described in a single-dimensional
opinion space. While this is a reasonable simplification
for isolated elections—where our results on the multiscale
composition of variance and the effect of social ties hold
regardless of dimensionality—a multidimensional space
is needed to model how different elections interact. We
now explore phenomena that cannot be explained with-
out explicit reference to a multi-issue space, beginning
with the problem of how opinions that may lie in a high-
dimensional space can be measured.
A. Election axes
Measurements of political opinions are projections of
voter preferences along the directions spanned by the set
of instruments (e.g., poll questions) used. This leads to
two immediate observations. First, unless a set of ba-
sis vectors spanning the whole space is constructed, the
measurement does not yield complete information on the
opinion distribution. Second, it is generally difficult to
conclude that there is no mass polarization as a polar-
ized distribution may not have a high variance (or not ap-
pear as bimodal) when projected onto a smaller subspace.
Even a comprehensive study that integrates polling data
on a large number of opinion dimensions will not neces-
sarily uncover the full structure of the space.
Care needs to be taken when operationalizing the opin-
ion distribution with poll-based measurements. Besides
being inadequate for fully reconstructing the opinion
space, polls are typically constrained along a number of
natural axes: easily pollable issues or those often dis-
cussed by political elites. This idea has been explored
empirically. For instance, Broockman (2016) showed that
commonly used ideological scores are poor measures of
policy preferences. Analysis using a wider range of mea-
surement axes finds legislators to be “similarly moderate
as voters, not more extreme” [54]. This result aligns
with the finding by Ansolabehere et al. that apparently
unstable and incoherent voter opinions are manifesta-
tions of measurement error. Increasing the number of
survey items improves the stability of opinion, steadily
approaching that of party identification [28].
The idea of a measurement axis can be generalized by
modeling elections as measurements along the ideological
positions of the candidates, yielding an election subspace
spanned by the candidates, $\mathbb{R}^{\min[c-1,d]} \subseteq \mathbb{R}^d$, where $c$ is
the number of candidates in a given election. In contrast
to poll questions, candidates can more flexibly take po-
sitions across a broad range of issue dimensions, many
of which may be illegible. Our analysis thus far applies
to multilevel democracies generally and does not depend
on the nature of the electoral system. For simplicity, we
now restrict ourselves to two-party elections that oper-
ate on an election axis defined by the line containing the
positions of the two candidates (though, as we will discuss,
such an axis with $d = 1$ is still a useful concept
in multi-candidate elections). Despite the complexities
of elections in such a high-dimensional space, two-party
elections can be summarized by the one-dimensional
axis $\hat{e}$ spanned by the two candidates, together with the
one-dimensional opinion distribution $f(x)$ created from
the projections of every opinion position onto $\hat{e}$, as illustrated
in figure 4. We might expect this election axis
to often roughly coincide with the axis along which political
discourse occurs, with $f(x)$ representing the distribution of
opinions concerning this discourse.
Defining an election subspace does not impose any ad-
ditional assumptions onto a multidimensional election.
In particular, the election axis represents a choice of how
to abstract the election process rather than an assump-
tion about the election mechanism. Polarization can then
be assessed from the one-dimensional opinion distribu-
tion that arises from the projection of electorate opin-
ions onto the line spanned by candidate positions. This
coarse-graining enables us to capture the relevant large-
scale behavior of the election—regardless of the details
of how outcomes may emerge from a multidimensional
space—and build up to a picture of cross-level interac-
tions in section IV B.
We now examine one particular model, in which the
election is produced by dividing the opinion space across
the dominant cleavage line with one party residing in each
cluster. A simple way to capture this process mathemat-
ically is via k-means clustering. When k= 2, two natural
clusters are formed by minimizing the mean-squared dis-
tances between the cluster centroids and their surround-
ing samples. We can interpret this as a process in which
candidates attempt to minimize the overall ideological
distance between them and their supporters. Such a pro-
cess sets a general axis of discourse along the means of
the two clusters, $\vec{\mu}_1 - \vec{\mu}_2$, the normalized version of which
we define as the election axis $\hat{e}$. This roughly corresponds to
splitting the electorate along the axis of greatest variance
in the case of a two-party election; see appendix D.
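The correspondence between the two-cluster election axis and the direction of greatest variance can be checked on synthetic data. The following is a minimal sketch, not the procedure of appendix D: it uses a plain Lloyd's-algorithm implementation of k-means with k = 2, a deterministic farthest-point initialization that we chose for reproducibility, power iteration for the top principal axis, and a hypothetical two-dimensional opinion sample.

```python
import math
import random

def mean(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def sqdist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def unit(v):
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]

def two_means(points, iters=100):
    """Lloyd's algorithm with k = 2 and a farthest-point initialization."""
    center = mean(points)
    c1 = max(points, key=lambda p: sqdist(p, center))
    c2 = max(points, key=lambda p: sqdist(p, c1))
    for _ in range(iters):
        g1 = [p for p in points if sqdist(p, c1) <= sqdist(p, c2)]
        g2 = [p for p in points if sqdist(p, c1) > sqdist(p, c2)]
        if not g1 or not g2:
            break
        new1, new2 = mean(g1), mean(g2)
        if new1 == c1 and new2 == c2:
            break
        c1, c2 = new1, new2
    return c1, c2

def election_axis(points):
    """Normalized difference between the two cluster means."""
    c1, c2 = two_means(points)
    return unit([a - b for a, b in zip(c1, c2)])

def top_principal_axis(points, iters=200):
    """Top eigenvector of the sample covariance matrix via power iteration."""
    m, n, d = mean(points), len(points), len(points[0])
    cov = [[sum((p[a] - m[a]) * (p[b] - m[b]) for p in points) / n
            for b in range(d)] for a in range(d)]
    v = unit([1.0] * d)
    for _ in range(iters):
        v = unit([sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)])
    return v

# Hypothetical electorate: two opinion clusters in a two-issue space.
rng = random.Random(1)
points = ([[rng.gauss(2.0, 0.7), rng.gauss(1.0, 0.7)] for _ in range(200)]
          + [[rng.gauss(-2.0, 0.7), rng.gauss(-1.0, 0.7)] for _ in range(200)])

e_hat = election_axis(points)
pc1 = top_principal_axis(points)
alignment = abs(sum(a * b for a, b in zip(e_hat, pc1)))  # |cosine| of the angle
```

For this sample the two directions nearly coincide, so `alignment` is close to 1. The agreement is expected to be looser for distributions without a clear dominant cleavage.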
More generally, the election axis lies perpendicular to
the traditional notion of a partisan cleavage [55, 56].8 If
the partisan cleavage is aligned with the dominant social
cleavage, the election takes place along the axis of high-
est variance. This is true especially when parties start to
shape the dominant social cleavage (e.g., via homophily,
as discussed in section III A) over the course of multiple
elections, unless the parties or issues are going through
significant reorientation [56]. Consequently, the polar-
ization measured along the electoral axis will be greater
than or equal to that found along an arbitrary set of
measurement axes. This provides a plausible explana-
tion for why surveys of both partisan and elite polariza-
tion are often trailed by measurements of mass polar-
ization: candidates are simply positioned along the axis
with the greatest projected variance. The social cleavage
definition also makes the election axis a useful construct
in multi-candidate elections as it indicates the primary
direction of discourse.
B. Interactions among election axes at different
scales
Next, we explore how multilevel interactions affect po-
litical polarization using the concept of election axes de-
scribed in the previous section. Within a democratic sys-
tem, elections are not independent from each other be-
cause municipal, state, and national politics are nested
[57]. Institutional effects pull lower salience elections to-
ward the direction of discourse in more dominant contests
[58, 59]. Interactions may also be driven by the existence
of a strong party system [60], shared funding resources,
the delocalization of news media [13], or the simple fact
that the same politicians are often active across multiple
8 In a Euclidean space, dividing voters according to which party
they are closest to results in a boundary that is perpendicular to
the election axis. When there are multiple parties, these cleav-
ages partition the opinion space into Voronoi cells, each contain-
ing the set of voters closest to the corresponding centroid.
scales of government. Such effects are more pronounced
in countries with advanced party systems but have been
described across a wide range of democracies [61].
Rather than posit specific models, we use the election
axis as a coarse-grained description of multilevel dynam-
ics through which the effects of various interactions are
included. The first way contests can affect each other
is via issue activation: elections amplify the difference
in candidate opinions as campaign messaging and media
coverage focus on how their positions diverge. For candidates
positioned at $\vec{x}_1$ and $\vec{x}_2$, communications are typically
aligned with their differences, described by the unit
vector $\hat{e} = (\vec{x}_1 - \vec{x}_2)/|\vec{x}_1 - \vec{x}_2|$. In a system with multiple
elections, a highly salient contest (such as a presidential
election) can activate issues along $\hat{e}$ and focus public attention
onto that axis, even in mayoral or gubernatorial
elections where it might not represent the primary issues
relevant to that level of governance. This effect persists
even in elections where all the candidates belong to the
same party.
Second, partisan and institutional effects can constrain
candidate positions. We employ the formalism devel-
oped in section III A to model these forces as ties among
members of each political faction across scales (such as
ties between local and national candidates of the same
party). Consider two elections which, without mutual
interactions, span election axes $\hat{e}_a$ and $\hat{e}_b$. As detailed
in appendix F, within-party ties across the two elections
shift the axes toward each other, bringing them
to $\hat{e}'_a = w_a \hat{e}_a + (1 - w_a)\hat{e}_b$ and $\hat{e}'_b = w_b \hat{e}_b + (1 - w_b)\hat{e}_a$,
respectively, where $w_a, w_b \in [0, 1]$. The effect of these
ties is that the angle subtended by the two axes is now
reduced. Generally, $w_a \neq w_b$, as interactions between the
two elections may be asymmetrical, especially if they oc-
FIG. 5. Relative salience of local and national elections
and the dimensionality of congressional voting patterns. The
former is measured via the ratio of voter turnout of American
gubernatorial to presidential primaries held in the same year
(dashed line indicates the LOESS fit). The latter is measured
via the variance in DW-NOMINATE scores on the second
axis divided by the first axis, as computed from congressional
roll-call data [26]; higher values of this variance ratio indicate
a greater importance of issues outside the liberal-conservative
axis.
cur at different scales, or if one is more politically salient
than the other. This asymmetry can also be driven by
the relative distribution of policy responsibilities between
the two elections.
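The pull between two election axes can be illustrated with a small numerical sketch of the blending rule given above. The initial axes and the weights below are hypothetical; the blended vectors are left unnormalized because the angle between them does not depend on their lengths.

```python
import math

def blend(e_self, e_other, w):
    """Post-interaction axis: w * e_self + (1 - w) * e_other, with w in [0, 1]."""
    return [w * s + (1 - w) * o for s, o in zip(e_self, e_other)]

def angle_between(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

# Hypothetical pre-interaction axes: a local and a national contest that are
# initially orthogonal in a two-issue space.
e_a = [1.0, 0.0]
e_b = [0.0, 1.0]

# Asymmetric coupling: election a (more salient) is pulled less than election b.
w_a, w_b = 0.9, 0.6

e_a_new = blend(e_a, e_b, w_a)
e_b_new = blend(e_b, e_a, w_b)

angle_before = angle_between(e_a, e_b)
angle_after = angle_between(e_a_new, e_b_new)
```

In this example the less salient election (smaller $w_b$) moves further from its original axis, and the angle between the two contests shrinks well below its initial 90 degrees.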
Such interactions consist of what Hijino and Ishima
(2021) termed “multi-level muddling,” wherein candi-
dates adopt messages that appeal to performances and
issues in levels of government other than the one in which
they are seeking office [62]. In the U.S., for instance,
state-level candidates often focus on issues that resonate
across the country [63], sometimes even de-emphasizing
policy issues that are under their purview [64].
Ultimately, both media framing and institutional ties
have the same effect: the direction of discourse becomes
aligned across elections at different levels, reducing the
angular dispersion between their axes. Although interactions
in a multilevel system can be described more generally
(as elaborated in appendix E), we shall demonstrate
these effects using a simple system with local-national
ties. For instance, suppose a country has a set
of pre-interaction local axes $\hat{e}_\ell = (\hat{e}_{\ell,1}, \hat{e}_{\ell,2}, \hat{e}_{\ell,3}, \ldots)$,
corresponding to congressional elections, and a national
axis $\hat{e}_N$, corresponding to the presidential election.9 The
existence of cross-level interactions has two key implica-
tions.
First, these interactions mean that congressional
representatives—each chosen from different regional
electorates—will be drawn from elections with more
aligned axes if multilevel interactions are strong and national
salience is high. If the national axis $\hat{e}_N$ has a strong
pull, winning candidates are likely to be clustered along
the national axis of discourse, rather than being scattered
across the ideological space due to the diversity of local
and regional concerns. The way in which this clustering
leads to congressional polarization will be discussed in
more detail in section IV C. Furthermore, to the extent
that state politics are influenced by these national con-
cerns, state governments will cease to serve as a check
against national polarization (for instance, state legisla-
tors often draw voting districts and pass voting laws in
line with their national party).
Second, as outlined in the previous section, the elec-
tion axis determines how the multidimensional opinion
space is projected onto the election as a one-dimensional
distribution f(x). A tradeoff can arise in the variance
projected against national and local elections depending
on the relative salience between the two.
If the salience of local politics is high, then each local
election freely picks its direction of discourse $\hat{e}_{\ell,i}$,
while the national election, with a weaker agenda-setting
capacity, is pulled by an aggregate of these local
axes. The pull results in an effective axis $\hat{e}'_N$ that may not
correspond exactly to the national cleavage. In this case,
9 We note that although congressional elections elect a candidate
for federal office, we refer to their election axes as local since such
axes coarse-grain regional electorates.
local elections tend to maximize their projected variance,
while national elections do not.
Conversely, if national salience is high, presidential
elections would occur against the main national
cleavage, setting up a contest along the direction of max-
imum variance. Because the dominance of national dis-
course pulls local candidates away from strictly local con-
tests, opinion variance against their axes can be lower
than what would otherwise be observed against the op-
timal local divide. In such a climate, one may observe
increasing polarization at the national level while local
elections simultaneously become more dominated by sin-
gle parties [65].
As Gelman (2014) discussed, the recent appearance of
close elections is relatively unusual in American history.
For instance, in less nationalized periods like the early
twentieth century, Democrats were largely content with
controlling the urban political machines and the American
South [66]. Parties resorted to national politics only
as a brokering mechanism. The rise of federal spending
eventually incentivized them to invest in national con-
tests whenever possible [67], leading to an increase in the
number of close contests. The recent multilevel effects
of nationalization may be seen in party platforms: us-
ing automated and manual content analysis, Hopkins et
al. (2022) found that within-party variation among local
platforms in the U.S. has decreased significantly since the
mid-1990s, while between-party differences in the topics
discussed diverged over the same period [68].
Although these two cases highlight a tradeoff between
national and local polarization, the scale at which highly
divided contests occur can matter in the long run. In
particular, polarization at larger scales can have ripple
effects not present at smaller scales by deepening the
national cleavage across multiple elections. Members of
each party may further congregate in opinion through
internal interactions and agenda setting [50], forming a
positive feedback loop that aggravates the social divide.
We can further describe the effect of multilevel con-
straints on democratic accountability using the concept
of representation from section III. Assuming the derivative
exists, we can generalize equation 2 to a multidimensional
space, writing the representation of opinion $x_i$ as
$r^i_{\mu\nu} = \partial y_\mu / \partial (x_i)_\nu$ [11]. This is a rank-two tensor where
the first index corresponds to the direction of change in
the election outcome and the second index corresponds
to the direction of opinion change.
Any change in opinion can be broken down into its
components along and orthogonal to the election axis $\hat{e}$,
enabling us to split $r^i_{\mu\nu}$ into on-axis and off-axis representation;
see appendix G for details.
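The decomposition of an opinion change into on-axis and off-axis parts is an ordinary orthogonal projection. The sketch below illustrates only that geometric step, not the full tensor treatment of appendix G; the axis and the opinion change are hypothetical, and the axis is assumed to have unit length.

```python
def split_along_axis(delta, e_hat):
    """Split an opinion change into components along and orthogonal to e_hat.

    ASSUMPTION: e_hat is a unit vector.
    """
    dot = sum(d * e for d, e in zip(delta, e_hat))
    on_axis = [dot * e for e in e_hat]
    off_axis = [d - o for d, o in zip(delta, on_axis)]
    return on_axis, off_axis

# Hypothetical two-issue example.
e_hat = [0.6, 0.8]    # unit-length election axis
delta = [1.0, 0.0]    # a voter shifts opinion on the first issue only

on_axis, off_axis = split_along_axis(delta, e_hat)
```

The two parts sum to the original change, and the off-axis part has zero projection onto the election axis, so only the on-axis part is visible to a contest that is confined to $\hat{e}$.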
For instance, if representation occurs only along the
election axis, perhaps due to sparse public discourse
along orthogonal directions, a regional election can of-
fer representation on local issues only if the election axis
is aligned with such matters. The nationalization of re-
gional elections may prevent such an alignment [13]. Such
a behavior is not predetermined by our model, however.
There may also be other cases in which candidates are
free and willing to move in orthogonal directions in re-
sponse to changes in public opinion, offering an avenue
for representation even if the on-axis contest is unstable
and, by extension, contains negatively represented opin-
ions [11].
C. Guarding against polarization
In this section, we consider in more detail the effect
of multilevel interactions on legislative bodies. As men-
tioned in the previous section, as the salience of national
elections increases, the election axes of regional elections
become focused in the same direction and politicians con-
gregate around their respective clusters. For example, in
the U.S., higher salience in national politics will lead to
a more one-dimensional Congress in a two-party system.
Conversely, greater salience in local politics—and thus
greater freedom in selecting ideal platforms—results in
a more varied Congress, with variance distributed along
different dimensions.
The framers of the American constitution expected
state-level loyalties to far outweigh those to the new na-
tion so that local attachments could counterbalance the
centralizing tendencies of a large republic [69]. Madison
wrote about this in Federalist No. 10: in a world where
local opinions are distributed among different issue di-
mensions, factions are suppressed “by their number and
local situation,” leading them to “lose their efficacy in
proportion to the number combined together.” This vari-
ation, he argues, prevents the formation of a dominant
faction at the national level. As the premise that local
attachments trump national ones breaks down, we have
observed an erosion of local authority and the national-
ization of regional politics [13]. The purported ability of
a federal system to insulate a country from factionaliza-
tion, therefore, has also fallen short.
We can formulate the effects of multiple issue dimen-
sions more precisely with a simple mathematical model.
Consider an opinion space where all voters have equal
extremity such that they are equidistant to the mean. If
variation in opinions only exists across one issue dimen-
sion (such as the case of extreme nationalization), there
are exactly two groups, each located a distance $r$ from
the center. In this situation, the total variance in the
distribution is simply $\sigma^2 = \sum_i (x_i - \bar{x})^2 / P = r^2$, with $P$
being the total population.
Next, we add issue dimensions to the system. If there is
an equal amount of variance on each axis—corresponding
to a case where disagreement occurs across a multitude
of issue dimensions—then opinions are distributed evenly
on a spherical surface of radius $r$ and dimension $n - 1$,
where $n$ is the number of issue dimensions. Since the
average squared distance between the opinions and the
mean is $r^2$, this yields a covariance matrix of the form
$K_{ab} = \sigma^2 \delta_{ab}$ with $\sigma^2 = r^2/n$. This implies that in a system
where all opinions are equally far from the center,
the variance $\sigma^2$ projected onto any one-dimensional election
axis decreases as the number of directions in which
opinions vary increases.10 Thus, the effectiveness of a
pluralistic system in guarding against polarization is re-
duced if opinions collapse to a low-dimensional space.
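The $r^2/n$ scaling can be verified with a short Monte Carlo sketch. Opinions are drawn uniformly on a sphere of radius $r$ in $n$ issue dimensions by normalizing Gaussian vectors, and the variance projected onto a single axis is compared with $r^2/n$. The sample size, the seed, and the choice of the first coordinate as the election axis are arbitrary illustrative choices.

```python
import math
import random

def random_point_on_sphere(n, r, rng):
    """Uniform sample on the sphere of radius r in n dimensions."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [r * v / norm for v in g]

def projected_variance(points, axis):
    """Variance of the opinions after projection onto a unit-length axis."""
    proj = [sum(p_i * a_i for p_i, a_i in zip(p, axis)) for p in points]
    m = sum(proj) / len(proj)
    return sum((x - m) ** 2 for x in proj) / len(proj)

rng = random.Random(42)
r = 1.0
samples = 20000

results = {}
for n in (1, 2, 5, 10):
    points = [random_point_on_sphere(n, r, rng) for _ in range(samples)]
    axis = [1.0] + [0.0] * (n - 1)   # by symmetry, any unit axis is equivalent
    results[n] = projected_variance(points, axis)
```

With $n = 1$ the sample reduces to the two groups at $\pm r$ and the projected variance is close to $r^2$; as $n$ grows, the same total extremity is spread over more directions and the variance seen along any single election axis falls roughly as $r^2/n$.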
As the salience of national elections becomes high, the
opinions of representatives in Congress become more one-
dimensional, meaning that the effects of polarization are
significantly more pronounced. A multilevel democracy
would be more effective in guarding against factions if
local concerns were more salient than national ones, as
Madison seemed to have assumed.
The dimensionality of Congress can be indirectly mea-
sured through the explained variance of the first and
second principal axes in roll-call votes. Although partisan
alignment is likely driven by a range of complex
issues, figure 5 shows that dimensional collapse has oc-
curred during the same period as the relative salience of
regional politics—as measured by the ratio of turnout in
gubernatorial to presidential elections—has decreased.
The domination of national issues over local ones can
diminish the diversity of opinions, intensifying polariza-
tion along a single direction. It also suggests a vicious
cycle: the more salient national issues are, the more at-
tention is paid to national governance, and the more cit-
izens expect issues to be solved by the national govern-
ment. The national government may then take on more
responsibility relative to local ones, which in turn results
in greater national salience.
D. Opinion aggregation across multiple issue
dimensions
The multilevel breakdown of variance presented in sec-
tion II is directly applicable to individual issues when
considering a multidimensional space of opinions. Just
like the single-dimensional case, each issue can have a
varying degree of cumulative variance at each geographic
scale. A country can have disagreements on economic
policy at the smallest scales (e.g., between individuals
and their neighbors) while opinions on gun ownership
are locally homogeneous and significantly divided at the
largest scales (e.g., between north and south).
The set of cumulative variances of these issues deter-
mines the natural axis of discourse for an election at a
particular scale (before interactions with other election
axes are considered). Under certain assumptions (see ap-
pendix D), if the distribution of opinions within an elec-
10 By a similar argument, Rodden (2021) showed that affective po-
larization intensifies when more issue dimensions are added [70].
While this may at first seem to be in conflict with our result, these
two conclusions represent two sides of the same coin: we hold
overall polarization constant (by placing everyone on a sphere
of equal extremity to the mean), while Rodden’s model assumes
that partisan polarization (i.e., the variance projected onto the
1D axis between the parties) is held constant.
torate at scale $i$ is summarized by a covariance matrix
$\tilde{K}_i$, the election axis is approximated by the top eigenvector
of this matrix, with the associated eigenvalue corresponding
to the opinion variance projected onto that
issue dimension for a country (i.e., the multidimensional
generalization of figure 1) can be written as the sum of
the added covariance matrices at each scale.
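The eigenvector approximation can be illustrated with a two-issue sketch. The added covariance matrices below are hypothetical, loosely following the example of economic policy (contested locally) and gun ownership (contested between regions). We also assume, for illustration, that the covariance seen by an election at a given scale is the sum of the matrices added at that scale and at all smaller scales. The top eigenvector is obtained by power iteration and the projected variance by the Rayleigh quotient.

```python
import math

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def top_eigenpair(K, iters=500):
    """Top eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    d = len(K)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = mat_vec(K, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(K, v)))  # Rayleigh quotient
    return eigenvalue, v

# Hypothetical added covariance at each scale; issue order: (economy, guns).
added_local = [[0.5, 0.0], [0.0, 0.1]]      # neighbors disagree on the economy
added_national = [[0.1, 0.0], [0.0, 1.0]]   # regions disagree on guns

# ASSUMPTION: cumulative covariance = sum of added matrices up to that scale.
K_local = added_local
K_national = mat_add(added_local, added_national)

var_local, axis_local = top_eigenpair(K_local)
var_national, axis_national = top_eigenpair(K_national)
```

In this example the natural local axis points along the economic dimension, while the natural national axis points along the gun-ownership dimension, with a larger projected variance at the national scale.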
We illustrate the effect that issue aggregation can have
on a country with two cases. First, consider a situation
where differences in electoral opinion are concentrated
along similar axes for regions above scale i(for exam-
ple, the added variance above the county scale is mainly
along the standard liberal-conservative axis), whereas the
added variance below scale iis dispersed along a wide
range of issue dimensions (e.g., people in each town dif-
fer greatly in their preferences for education and polic-
ing). In such a country, local elections would not have a
dominant axis of discourse, but a clear issue dimension
with high variance emerges nationally. The flip side can
also occur: if disagreement among citizens occurs locally
along a small subset of issue dimensions but exists nation-
ally on a wide range of issues, local elections will be more
polarized with stable election axes but larger-scale elec-
tions will lack a dominant axis. This latter case reflects
the phenomenon described in section IV C, wherein the
diversity of issues relevant at the national scale guards
against polarization.
While the implementation of any given policy may be
easiest at a particular level of governance, one must also
take into account the multiscale distribution of opinions
to ensure that polarizing issues, in Madison’s words, “lose
their efficacy.” Moving issues that are most contentious at
a particular electoral scale to other levels of government
can serve as a counterweight against the emergence of a
highly polarized dominant election axis.
V. SUMMARY
It has been argued that federal democracies offer ad-
vantages in encouraging political participation, the pro-
tection of individual rights and liberties, and economic
efficiency [35]. However, these are not intrinsic proper-
ties of federal systems. Realizing these benefits depends
on the relationship between different levels of governance,
the multiscale complexity of the policy problems each has
to tackle, and the geographic structure of the electorate.
While the link between federal governments and their
policy environments has been extensively studied, our
analysis provides a new framework for describing how
the third pillar—the multilevel distribution of political
opinions—couples with democratic processes.
As we outline in section II, varying degrees of polariza-
tion occur along different issue dimensions at each level
of governance. This leads to a tradeoff in assigning policy
scopes by implementational efficiency alone: in addition
to considering the advantages and disadvantages of en-
acting policy at a given level of governance, one must also
take into account the amount of polarization that needs
to be resolved at that level. Factoring in the idea that
deliberation and political participation are only optimal
at certain scales—as scholars from Plato to Tocqueville
have argued—we show how geographic segregation af-
fects political polarization and representation, and how
the distribution of variances at different scales changes
the stability of local and national elections. We then ex-
plore the structural factors of national-local relationships
by extending the framework of multilevel polarization
to a multidimensional opinion space. We demonstrate
that increasing national salience can result in elections
occurring predominantly along a single one-dimensional
axis, spoiling the purported insulation against polariza-
tion that a federal system provides.
These multilevel considerations suggest a strong link
between political nationalization and polarization. Both
voter turnout and engagement in local politics have de-
creased significantly over the past few decades [71–73],
while polarization at larger scales (e.g., the variance be-
tween counties, the variance between congressional dis-
tricts, and the variance between states) has grown. Re-
solving more issues via local elections can help reduce in-
stability and increase representation by distributing po-
larization more evenly across scales and leveraging social
effects to encourage deliberation. Doing so can also trans-
fer some of the political salience of national elections to
state and local ones and increase local turnout.
This paper has touched on a wide range of topics, with
the purpose of providing new mathematical and concep-
tual frameworks for future research, rather than defini-
tive answers. Our unifying theme is that in addition
to matching the comparative advantages of each level of
government with its policy environment, it is also essen-
tial to consider how differences in opinions are distributed
geographically. When political preferences are geograph-
ically clustered, devolving the relevant policies to lower
geographic levels can reduce the risk of polarized and un-
stable national elections. Only with a careful balancing
of the policy issues tackled at each level of government
can the full advantages of a federal system be realized.
ACKNOWLEDGMENTS
We thank Jonathan Rodden for conversations on fiscal
federalism, polarization, and multidimensional opinion
spaces, and Shigeo Hirano for insight on the nationalization
of multilevel politics. We would like to express our
appreciation to Deb Roy for his comments and support
of this project at the Center for Constructive Communi-
cation. The authors also gratefully acknowledge the U.S.
Office of Naval Research, the National Science Founda-
tion Graduate Research Fellowship Program under grant
no. 1122374, and the Hertz Foundation for partial sup-
port of this research.
[1] C. Montesquieu, De l’Esprit des Lois (Geneva, 1748;
English translation, “The Spirit of Laws,” reprinted by
Prometheus Books, Amherst, N.Y, 2002).
[2] J. Madison, Federalist no. 10: The same subject contin-
ued: The union as a safeguard against domestic faction
and insurrection, New York Daily Advertiser (1787).
[3] Decentralization: Rethinking government, in Enter-
ing the 21st Century: World Development Report,
1999/2000, edited by S. Yusuf, A. Altaf, W. Dillinger,
S. Evenett, M. Fay, V. Henderson, C. Kenny, and W. Wu
(Oxford University Press, 2000) pp. 107–124.
[4] M. G. Kendall and A. Stuart, The law of the cubic pro-
portion in election results, British Journal of Sociology
1, 183 (1950).
[5] A. Gelman and G. King, A unified method of evaluat-
ing electoral systems and redistricting plans, American
Journal of Political Science 38, 514 (1994).
[6] N. McCarty, J. Rodden, B. Shor, C. Tausanovitch, and
C. Warshaw, Geography, uncertainty, and polarization,
Political Science Research and Methods 7, 775 (2019).
[7] A. F. Siegenfeld and Y. Bar-Yam, An introduction to
complex systems science and its applications, Complexity
2020, 6105872 (2020).
[8] J. Rodden, The geographic distribution of political pref-
erences, Annual Review of Political Science 13, 321
(2010).
[9] A. Gelman, J. N. Katz, and F. Tuerlinckx, The mathe-
matics and statistics of voting power, Statistical Science
17, 420 (2002).
[10] A. Gelman, J. N. Katz, and J. Bafumi, Standard vot-
ing power indexes do not work: An empirical analysis,
British Journal of Political Science 34, 657 (2004).
[11] A. F. Siegenfeld and Y. Bar-Yam, Negative representation
and instability in democratic elections, Nature
Physics 16, 186 (2020).
[12] J. Bafumi and M. C. Herron, Preference aggregation, rep-
resentation, and elected American political institutions,
in Midwest Political Science Association Annual National
Conference (2007).
[13] D. J. Hopkins, The Increasingly United States: How and
Why American Political Behavior Nationalized, Chicago
studies in American politics (University of Chicago Press,
2018).
[14] J. Madison, Federalist no. 46: The influence of the state
and federal governments compared, New York Packet
(1788).
[15] W. E. Oates, Fiscal Federalism (Harcourt Brace Jo-
vanovich, New York, 1972).
[16] D. Schlozman and S. Rosenfeld, The hollow parties, in
Can America Govern Itself?, edited by F. E. Lee and
N. McCarty (Cambridge University Press, 2019) 1st ed.,
pp. 120–152.
[17] J. Rodden, Hamilton’s Paradox: The Promise and Peril
of Fiscal Federalism (Cambridge University Press, 2006).
[18] P. E. Peterson, The Price of Federalism (Brookings In-
stitution, Washington, D.C, 1995).
[19] J. Lai and J. Whalen, Pennsylvania, Polarized, The
Philadelphia Inquirer (2019).
[20] R. Rohla, M. Bloch, L. Buchanan, J. Katz, and
K. Quealy, An extremely detailed map of the
2016 election, https://www.nytimes.com/interactive/2018/upshot/election-2016-voting-precinct-maps.html
(2018).
[21] A. Park, C. Smart, R. Taylor, and M. Watkins, Presidential
precinct data for the 2020 general election,
https://github.com/TheUpshot/presidential-precinct-map-2020 (2021).
[22] J. Rodden, Why Cities Lose: The Deep Roots of the
Urban-Rural Political Divide (Basic Books, New York,
2019).
[23] H. F. Smith, An empirical law describing heterogeneity
in the yields of agricultural crops, Journal of Agricultural
Science 28, 1 (1938).
[24] P. Whittle, On the variation of yield variance with plot
size, Biometrika 43, 337 (1956).
[25] D. Leip, Atlas of U.S. presidential elections, datasets,
https://doi.org/10.7910/DVN/XX3YJ4 (2017).
[26] J. B. Lewis, K. Poole, H. Rosenthal, A. Boche, A. Rud-
kin, and L. Sonnet, Voteview: Congressional roll-call
votes database, https://voteview.com/ (2021).
[27] S. H. Beer, To Make a Nation: The Rediscovery of Amer-
ican Federalism (Harvard University Press, 1994).
[28] S. Ansolabehere, J. Rodden, and J. M. Snyder, The
strength of issues: Using multiple measures to gauge pref-
erence stability, ideological constraint, and issue voting,
American Political Science Review 102, 215 (2008).
[29] G. Washington, The Address of Gen. Washington to the
People of America on His Declining the Presidency of the
United States, American Daily Advertiser (1796).
[30] J. Bohman and W. Rehg, eds., Deliberative Democracy:
Essays on Reason and Politics (MIT Press, Cambridge,
Mass, 1997).
[31] A. Tocqueville, H. C. Mansfield, and D. Winthrop,
Democracy in America (University of Chicago Press,
2002).
[32] M. Bailey, R. Cao, T. Kuchler, J. Stroebel, and A. Wong,
Social connectedness: Measurement, determinants, and
effects, Journal of Economic Perspectives 32, 259 (2018).
[33] D. J. Crandall, L. Backstrom, D. Cosley, S. Suri, D. Hut-
tenlocher, and J. Kleinberg, Inferring social ties from
geographic coincidences, Proceedings of the National
Academy of Sciences 107, 22436 (2010).
[34] Plato, M. Schofield, and T. Griffith, Plato: Laws (Cam-
bridge University Press, 2016) p. 183.
[35] R. P. Inman and D. L. Rubinfeld, Democratic Federalism:
The Economics, Politics, and Law of Federal Governance
(Princeton University Press, 2020).
[36] A. J. Stewart, M. Mosleh, M. Diakonova, A. A. Arechar,
D. G. Rand, and J. B. Plotkin, Information gerrymander-
ing and undemocratic decisions, Nature 573, 117 (2019).
[37] R. A. Holley and T. M. Liggett, Ergodic theorems for
weakly interacting infinite systems and the voter model,
Annals of Probability 3, 643 (1975).
[38] R. Hegselmann and U. Krause, Opinion dynamics and
bounded confidence models, analysis, and simulation,
Journal of Artificial Societies and Social Simulation 5
(2002).
[39] C. Borghesi and J.-P. Bouchaud, Spatial correlations
in vote statistics: A diffusive field model for decision-
making, European Physical Journal B 75, 395 (2010).
[40] R. J. Johnston, A Question of Place: Exploring the Prac-
tice of Human Geography (Blackwell, 1991).
[41] C. C. MacInnis and E. Page-Gould, How can intergroup
interaction be bad if intergroup contact is good? Explor-
ing and reconciling an apparent paradox in the science of
intergroup relations, Perspectives on Psychological Sci-
ence 10, 307 (2015).
[42] M. Biggs and S. Knauss, Explaining membership in the
British National Party: A multilevel analysis of contact
and threat, European Sociological Review 28, 633 (2012).
[43] C. A. Bail, Breaking the Social Media Prism: How to
Make Our Platforms Less Polarizing (Princeton Univer-
sity Press, 2021).
[44] A. Gelman, D. Park, B. Shor, and J. Cortina, Red State,
Blue State, Rich State, Poor State: Why Americans Vote
the Way They Do, 2nd ed. (Princeton University Press,
2009).
[45] G. A. Huber and N. Malhotra, Political homophily in so-
cial relationships: Evidence from online dating behavior,
Journal of Politics 79, 269 (2017).
[46] S. Iyengar, Y. Lelkes, M. Levendusky, N. Malhotra, and
S. J. Westwood, The origin and consequences of affec-
tive polarization in the United States, Annual Review of
Political Science 22, 129 (2019).
[47] E. J. Finkel, C. A. Bail, M. Cikara, P. H. Ditto, S. Iyen-
gar, S. Klar, L. Mason, M. C. McGrath, B. Nyhan, D. G.
Rand, L. J. Skitka, J. A. Tucker, J. J. Van Bavel, C. S.
Wang, and J. N. Druckman, Political sectarianism in
America, Science 370, 533 (2020).
[48] J. N. Druckman, S. Klar, Y. Krupnikov, M. Levendusky,
and J. B. Ryan, Affective polarization, local contexts and
public opinion in America, Nature Human Behaviour 5,
28 (2021).
[49] M. E. McCombs and D. L. Shaw, The agenda-setting
function of mass media, Public Opinion Quarterly 36,
176 (1972).
[50] G. W. Cox and M. D. McCubbins, Setting the Agenda:
Responsible Party Government in the U.S. House of Rep-
resentatives (Cambridge University Press, 2005).
[51] B. Bishop and R. G. Cushing, The Big Sort: Why the
Clustering of Like-Minded America is Tearing Us Apart
(Mariner Books, Boston, 2009).
[52] G. J. Martin and S. W. Webster, Does residential sorting
explain geographic polarization?, Political Science Re-
search and Methods 8, 215 (2020).
[53] E. Kaplan, J. Spenkuch, and R. Sullivan, Partisan spatial
sorting in the United States: A theoretical and empiri-
cal overview, Journal of Public Economics 211, 104668
(2022).
[54] D. E. Broockman, Approaches to studying policy repre-
sentation, Legislative Studies Quarterly 41, 181 (2016).
[55] S. M. Lipset and S. Rokkan, Cleavage structures, party
systems, and voter alignments: An introduction, in Party
Systems and Voter Alignments: Cross-National Perspec-
tives (Free Press, New York, 1967) pp. 1–64.
[56] G. Miller and N. Schofield, Activists and partisan realign-
ment in the United States, American Political Science
Review 97 (2003).
[57] S. N. Golder, I. Lago, A. Blais, E. Gidengil, and
T. Gschwend, Multi-Level Electoral Politics, Vol. 1 (Ox-
ford University Press, 2017).
[58] K. Reif and H. Schmitt, Nine second-order national
elections – a conceptual framework for the analysis of European
election results, European Journal of Political Research
8, 3 (1980).
[59] M. Golder, Presidential coattails and legislative frag-
mentation, American Journal of Political Science 50, 34
(2006).
[60] G. W. Cox and M. D. McCubbins, Legislative Leviathan:
Party Government in the House (University of California
Press, 1993).
[61] M. P. Jones and S. Mainwaring, The nationalization of
parties and party systems: An empirical measure and an
application to the Americas, Party Politics 9, 139 (2003).
[62] K. V. L. Hijino and H. Ishima, Multi-level muddling:
Candidate strategies to “nationalize” local elections,
Electoral Studies 70 (2021).
[63] M. Carr, G. Gamm, and J. Phillips, Origins of the culture
war: Social issues in state party platforms, 1960–2014,
Presented at the American Political Science Association
(2016).
[64] J. M. Grumbach, From backwaters to major policymak-
ers: Policy polarization in the states, 1970–2014, Per-
spectives on Politics 16, 416 (2018).
[65] D. Schleicher, Why is there no partisan competition in
city council elections? The role of election law, Journal
of Law and Politics 23 (2008).
[66] A. Gelman, The twentieth-century reversal: How did
the Republican states switch to the Democrats and vice
versa?, Statistics and Public Policy 1, 1 (2014).
[67] T. Ferguson, Golden Rule: The Investment Theory of
Party Competition and the Logic of Money-Driven Polit-
ical Systems (University of Chicago Press, 1995).
[68] D. J. Hopkins, E. Schickler, and D. Azizi, From many divides,
one? The polarization and nationalization of American
state party platforms, 1918–2017, SSRN Electronic
Journal 10.2139/ssrn.3772946 (2020).
[69] J. T. Levy, Federalism, liberalism, and the separation
of loyalties, American Political Science Review 101, 459
(2007).
[70] J. Rodden, Keeping your enemies close: Electoral rules
and partisan polarization, in The New Politics of Inse-
curity, edited by F. M. Rosenbluth and M. Weir (Cam-
bridge University Press, 2021) pp. 129–160.
[71] J. E. Oliver, S. E. Ha, and Z. Callen, Local Elections and
the Politics of Small-Scale Democracy (Princeton Uni-
versity Press, 2012).
[72] K. L. Einstein and V. Kogan, Pushing the city limits:
Policy responsiveness in municipal government, Urban
Affairs Review 52, 3 (2016).
[73] B. F. Schaffner, J. H. Rhodes, and R. J. La Raja,
Hometown Inequality: Race, Class, and Representation
in American Local Politics (Cambridge University Press,
2020).
[74] J.-F. Wang, T.-L. Zhang, and B.-J. Fu, A measure of
spatial stratified heterogeneity, Ecological Indicators 67,
250 (2016).
[75] C. Ding and X. He, K-means clustering via principal component
analysis, Proceedings, Twenty-First International
Conference on Machine Learning, ICML 2004 1 (2004).
Appendix A: Multiscale total variance
As described in section II, we can break up the total
variance of a random variable $z$ into the variance arising
at each geographic scale. In the context of multilevel
polarization, $z$ may correspond to political opinion of a
random voter within a country. The country can be partitioned
into a nested hierarchy of regions of increasing
scale; as just one example, scale 1 could correspond to
precincts, scale 2 to counties, and scale 3 to states. Letting
$W_n$ be a random variable denoting regions at scale $n$
with probabilities proportional to their populations, the
law of total variance yields
$$\mathrm{Var}(z) = \mathrm{E}(\mathrm{Var}(z \mid W_n)) + \mathrm{Var}(\mathrm{E}(z \mid W_n)) \tag{A1}$$
which decomposes the total variance into the variance
within and between the scale-$n$ regions, respectively.
This decomposition is related to the spatial stratification
of heterogeneity (see, e.g., the $q$-statistic developed
by Wang et al. [74]). Recursively applying equation A1
to each of the two terms on its right-hand side and noting
that $W_{i+1}$ is determined by $W_i$ (since smaller-scale
regions are nested within larger-scale ones), we obtain
$$\mathrm{E}(\mathrm{Var}(z \mid W_n)) = \sum_{i=0}^{n-1} \mathrm{E}(\mathrm{Var}(\mathrm{E}(z \mid W_i) \mid W_{i+1})) \tag{A2}$$
$$\mathrm{Var}(\mathrm{E}(z \mid W_n)) = \sum_{i=n}^{N} \mathrm{E}(\mathrm{Var}(\mathrm{E}(z \mid W_i) \mid W_{i+1})) \tag{A3}$$
$$\mathrm{Var}(z) = \sum_{i=0}^{N} \mathrm{E}(\mathrm{Var}(\mathrm{E}(z \mid W_i) \mid W_{i+1})) \tag{A4}$$
Equation A4 is equivalent to equation 1 of the main
text. Here, we employ the notational shorthand
$\mathrm{E}(\mathrm{Var}(\mathrm{E}(z \mid W_0) \mid W_1)) = \mathrm{E}(\mathrm{Var}(z \mid W_1))$, since $W_0 = z$,
and $\mathrm{E}(\mathrm{Var}(\mathrm{E}(z \mid W_N) \mid W_{N+1})) = \mathrm{Var}(\mathrm{E}(z \mid W_N))$,
since $W_{N+1}$ takes on only a single value (as it corresponds
to the entire region in question).
The terms on the right-hand sides of equations A2–A4
correspond to the added variance at scale $i + 1$, while the
left-hand side of equation A3 corresponds to the total
variance above scale $n$ (see figure 1).
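As a rough illustration of the decomposition in equation A4, the sketch below computes the added variance at each scale for a toy electorate with two nested levels (counties within states). The data, the county and state labels, and the function names are all hypothetical. The code relies on the fact that, for nested regions, each term E(Var(E(z|W_i)|W_{i+1})) equals the population-weighted mean squared gap between the conditional means at adjacent scales.

```python
from collections import defaultdict

def mean(values):
    return sum(values) / len(values)

def group_means(voters, key_index):
    """Mean opinion of each region; key_index picks the county (1) or state (2) label."""
    groups = defaultdict(list)
    for voter in voters:
        groups[voter[key_index]].append(voter[0])
    return {label: mean(opinions) for label, opinions in groups.items()}

def added_variances(voters):
    """voters: list of (opinion, county, state) tuples, counties nested in states.

    Returns [within counties, between counties within states, between states]."""
    county_mean = group_means(voters, 1)
    state_mean = group_means(voters, 2)
    nation_mean = mean([z for z, _, _ in voters])
    # Conditional means E(z|W_0) = z, E(z|W_1), E(z|W_2), E(z|W_3), one entry per voter.
    levels = [
        [z for z, _, _ in voters],
        [county_mean[c] for _, c, _ in voters],
        [state_mean[s] for _, _, s in voters],
        [nation_mean for _ in voters],
    ]
    n = len(voters)
    # Averaging over voters weights each region by its population.
    return [
        sum((a - b) ** 2 for a, b in zip(levels[i], levels[i + 1])) / n
        for i in range(len(levels) - 1)
    ]

# Hypothetical toy electorate: (opinion, county, state).
VOTERS = [
    (-2.0, "c1", "A"), (-1.0, "c1", "A"), (0.0, "c2", "A"), (1.0, "c2", "A"),
    (1.0, "c3", "B"), (2.0, "c3", "B"), (3.0, "c4", "B"), (4.0, "c4", "B"),
]

added = added_variances(VOTERS)
overall_mean = mean([z for z, _, _ in VOTERS])
total = mean([(z - overall_mean) ** 2 for z, _, _ in VOTERS])
print("added variances:", added)
print("sum:", sum(added), "total variance:", total)
```

For this toy data the three added variances are 0.25, 1.0, and 2.25, which sum to the total variance of 3.5, as equation A4 requires.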
Equation A2 denotes the minimum amount of variance
that needs to be resolved on average by political decisions
made at scale $n$ (regardless of whether such a decision is
made through direct democracy, a single executive, or
a legislature), since the smallest mean square distance
achievable between the outcome and the electorate is the
variance of the electorate opinions. Formally,
$$\int_{-\infty}^{\infty} (x - y)^2 f(x)\, dx \geq \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx \tag{A5}$$
for all outcomes $y$, since the mean $\mu$ of $f(x)$ minimizes
$\int_{-\infty}^{\infty} (x - y)^2 f(x)\, dx$. For instance, local elections
in the U.S. need only resolve differences in opinions
from within the locales, state elections must resolve
both within-locale and within-state differences between
locales, and national elections must resolve the total vari-
ance Var(z), which consists of within-locale, within-state,
and between-state differences.
While the analysis in this section has focused on one-
dimensional random variables, it can easily be extended
to multidimensional political opinions and outcomes by
simply replacing the law of total variance with the law of
total covariance.
Appendix B: Effects of social ties on fully mixed and
fully segregated electorates
We introduced the idea of effective opinions in sec-
tion III A, wherein people vote as if they held a dif-
ferent political position because of their affiliation with
their neighbors. This appendix explores how these so-
cial ties influence political polarization when people con-
sider the opinions of others in the electorate equally (the
“fully-connected” case) and when people’s ties are de-
termined purely by their political affiliation (the “segre-
gated” case). We then discuss an interpolation between
the two extremes in appendix C where disagreement and
social ties possess a multiscale structure.
We first derive how the opinion distribution transforms
under these ties. Generally, the effective opinion $x'$ of a
voter can be written as some function of their opinion $x$.
For monotone transformations $x' = t(x)$, the distribution
of the random variable $x'$ in terms of $x$ is given by
$$f_{x'}(x') = f_x(t^{-1}(x')) \left| \frac{d}{dx'} t^{-1}(x') \right|. \tag{B1}$$
From this point on, we simply notate $f_{x'}(x')$ as $\hat{f}(x')$
and the original distribution as $f(x)$. Recall from equation 3
that a citizen’s effective opinion is defined as a
weighted average of their own opinions and those of their
neighbors, $x'_i = \sum_j T_{ij} x_j$. If we consider a connectivity
matrix where a voter takes into account the opinions
of every other member of the electorate equally, i.e.,
$T_{ij} = w/(n - 1)$ for $i \neq j$ (where $n$ is the size of the
electorate) while weighting their own position as $T_{ij} = 1 - w$
for $i = j$, we can write
$$x' = t(x) = x(1 - w) + w\bar{x} \tag{B2}$$
for $n \gg 1$. Here, $w$ denotes the weight with which each
person accounts for the opinion of others and $\bar{x}$ is the
average of the opinion distribution. Using equation B1, we
can write the transformed opinion distribution in terms
of $f(x)$:
$$\hat{f}(x') = f\!\left(\frac{x' - w\bar{x}}{1 - w}\right) \frac{1}{1 - w}. \tag{B3}$$
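The variance of the transformed distribution is then
given by
$$\hat{\sigma}^2 = \int_{-\infty}^{\infty} (x' - \bar{x})^2 f\!\left(\frac{x' - w\bar{x}}{1 - w}\right) \frac{1}{1 - w}\, dx' = \int_{-\infty}^{\infty} u^2 (1 - w)^2 f(\bar{x} + u)\, du = \sigma^2 (1 - w)^2, \tag{B4}$$
using the substitution $u = (x' - \bar{x})/(1 - w)$, so that
$du = dx'/(1 - w)$, where $\sigma^2 = \int_{-\infty}^{\infty} (x - \bar{x})^2 f(x)\, dx$ is
the variance of $f(x)$. This is a general result for any $f(x)$: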
in a fully-connected locale, the variance of the opinion
distribution decreases as the social weight wis increased
from 0 to 1.
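The $(1 - w)^2$ scaling can be checked numerically. The sketch below is illustrative only: it assumes the weighted-average form $x' = (1 - w)x + w\bar{x}$ (the form consistent with equation B3), and the bimodal sample, the seed, and the value of `w` are arbitrary choices rather than values from the paper.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def effective_opinions(opinions, w):
    """Fully connected electorate in the large-n limit: each voter moves
    a fraction w of the way toward the overall mean opinion."""
    x_bar = sum(opinions) / len(opinions)
    return [(1.0 - w) * x + w * x_bar for x in opinions]

random.seed(0)
# Hypothetical bimodal electorate; the result does not depend on the shape of f(x).
opinions = ([random.gauss(-1.0, 0.5) for _ in range(500)]
            + [random.gauss(1.5, 0.8) for _ in range(500)])

w = 0.3  # assumed social weight
ratio = variance(effective_opinions(opinions, w)) / variance(opinions)
print("variance ratio:", ratio)
print("(1 - w)^2:     ", (1.0 - w) ** 2)
```

Because the transformation is affine, the two printed values agree up to floating-point error for any sample, which mirrors the statement that the result holds for any $f(x)$.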
Next, we examine how social ties affect election stability
in this fully-connected case. For simplicity (and because
this distribution provides a precisely solvable case),
we consider two normally distributed subpopulations of
equal variance $\sigma^2$. This may describe two political parties
with voters clustering around their respective means
$\mu_A$ and $\mu_B$:
$$f(x) = \pi_A e^{-\frac{(x - \mu_A)^2}{2\sigma^2}} + \pi_B e^{-\frac{(x - \mu_B)^2}{2\sigma^2}}, \tag{B5}$$
where $\pi_A$ and $\pi_B$ are the relative sizes of the populations.
For this distribution,
$$J \equiv \frac{(\mu_A - \mu_B)^2}{4(\sigma^2 + a^2)} \tag{B6}$$
for some positive constant $a$ gives a dimensionless measure
of the degree of polarization. Under a particular
class of models, instability (see section III) occurs whenever
$J > 1$ [11]; however, all of our arguments here will
hold as long as a larger value of $J$ (which corresponds to
a more hollowed-out center, relative to the length scale
$a$) is more likely to produce instability.
For social ties that result in an effective opinion distribution
described by the transformation in equation B2,
the distribution average, $\bar{x} = (\pi_A\mu_A + \pi_B\mu_B)/(\pi_A + \pi_B)$,
stays the same, while the means of the two subpopulations
are shifted to $\hat{\mu}_A = \bar{x}w + \mu_A(1 - w)$ and
$\hat{\mu}_B = \bar{x}w + \mu_B(1 - w)$, respectively, and their variances are
decreased to $\hat{\sigma}^2 = \sigma^2(1 - w)^2$. The overall result of this
transformation is to decrease the dimensionless polarization
$J$ to
$$\hat{J} = \frac{(\mu_A - \mu_B)^2 (1 - w)^2}{4(\sigma^2 (1 - w)^2 + a^2)} < J, \tag{B7}$$
thus reducing the likelihood or magnitude of instability.
Having examined the case of a fully-connected locale,
we now turn our attention to the situation where so-
cial ties are highly segregated. In this limit, affective
polarization arises where individuals sort their social in-
teractions solely according to partisan affiliation. This
corresponds to a graph with two disconnected components,
where one component is fully populated by members
whose opinion distribution is drawn from a Gaussian
centered at $\mu_A$, and the other from the Gaussian at
$\mu_B$. Each component is internally connected with weight
$w$. Since this is just a sum of two fully-connected populations,
we can apply the same transformation (equation B3)
for each group. The effective means $\hat{\mu}_A$ and $\hat{\mu}_B$
stay the same because members of the two groups do not
influence each other. However, because the widths of the
two Gaussians decrease to $\hat{\sigma} = \sigma(1 - w)$, social ties may
turn a stable election into an unstable one, since
$$\hat{J} = \frac{(\mu_A - \mu_B)^2}{4(\sigma^2 (1 - w)^2 + a^2)} > J. \tag{B8}$$
In contrast to the fully-connected case, increasing the
strength of social ties can hollow out the middle, increas-
ing rather than decreasing the possibility of instability in
the election.
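The contrast between equations B7 and B8 can be sketched numerically. In the example below, the parameter values (`MU_A`, `MU_B`, `SIGMA`, `A`, `W`) and the function names are hypothetical, chosen only so that the baseline sits at the $J = 1$ threshold cited for one class of models; the code simply evaluates equation B6 on the transformed means and widths.

```python
def polarization(mu_a, mu_b, sigma, a):
    """Dimensionless polarization J of equation B6."""
    return (mu_a - mu_b) ** 2 / (4.0 * (sigma ** 2 + a ** 2))

def j_fully_connected(mu_a, mu_b, sigma, a, w, pi_a=0.5, pi_b=0.5):
    """Equation B7: both the subpopulation means and their widths contract."""
    x_bar = (pi_a * mu_a + pi_b * mu_b) / (pi_a + pi_b)
    mu_a_hat = x_bar * w + mu_a * (1.0 - w)
    mu_b_hat = x_bar * w + mu_b * (1.0 - w)
    return polarization(mu_a_hat, mu_b_hat, sigma * (1.0 - w), a)

def j_segregated(mu_a, mu_b, sigma, a, w):
    """Equation B8: the means stay put and only the widths contract."""
    return polarization(mu_a, mu_b, sigma * (1.0 - w), a)

# Hypothetical parameters for illustration.
MU_A, MU_B, SIGMA, A, W = 1.0, -1.0, 0.8, 0.6, 0.5

j_base = polarization(MU_A, MU_B, SIGMA, A)
j_fc = j_fully_connected(MU_A, MU_B, SIGMA, A, W)
j_seg = j_segregated(MU_A, MU_B, SIGMA, A, W)
print(f"baseline J        = {j_base:.3f}")
print(f"fully connected J = {j_fc:.3f}")
print(f"segregated J      = {j_seg:.3f}")
```

Under these assumed values the fully-connected electorate ends up below the baseline and the segregated one above it, matching the inequalities in B7 and B8.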
Appendix C: Effects of social ties across multiple
scales
Here we consider the effect of social ties in a multilevel
setting, under the assumption that the strength of social
ties is correlated with geographic proximity.
We denote the opinion distribution of each region as
$f_{s_1,s_2,\ldots,s_N}(x)$, where $s_1, s_2, \ldots, s_N$ are indices that
specify the location of that region (e.g., $s_1$ may denote the
city, $s_2$ the county, and $s_3$ the state). Dropping an index
indicates an implicit sum: $f_{s_2,\ldots,s_N}(x)$ is the total opinion
distribution of the scale-2 region denoted by $s_2, \ldots, s_N$,
which is equal to the sum of the opinion distributions of
all of the scale-1 regions it contains.
We consider the simplest model that allows for the
strength of social ties to vary with geographic scale, denoting
the strength of social ties within scale-$n$ regions
by $w_n$. (For instance, in a three-scale model, $w_1$ could
correspond to within-precinct ties, $w_2$ to within-county
ties, and $w_3$ to state-wide ties.) This model is more flexible
than it may seem, since one can define arbitrarily
many scales, allowing one to approach a continuum of
possible strengths of social ties, with regions at each scale
defined so as to give the desired strength of social ties be-
tween various groups of individuals (although of course
this freedom is constrained by the nested structure of the
regions at various scales).
Using a formulation similar to equation B2, we write
the effective opinion of a voter residing in a region specified
by $s_1, s_2, \ldots, s_N$ as
$$x' = x\beta + w_1\bar{x}_{s_1,\ldots,s_N} + \cdots + w_N\bar{x}_{s_N} + w_{N+1}\bar{x}, \tag{C1}$$
where $\beta = 1 - \sum_{j=1}^{N+1} w_j$. The means follow the same
summing notation, with $\bar{x}_{s_n,s_{n+1},\ldots,s_N}$ being the mean of
the scale-$n$ region containing the voter ($\bar{x}$ denotes the
average opinion of the entire nation). Since the effective
opinion distribution of a region specified by $s_n, \ldots, s_N$
is the sum of the transformed distributions of all of
the scale-1 regions it contains (each shifted by varying
amounts since they are affected by social ties to different
populations), we can write
$$\hat{f}_{s_n,\ldots,s_N}(x') = \frac{1}{\beta} \sum_{s_{n-1} \in s_n} \cdots \sum_{s_2 \in s_3} \sum_{s_1 \in s_2} f_{s_1,\ldots,s_N}\!\left(\frac{x' - w_1\bar{x}_{s_1,\ldots,s_N} - w_2\bar{x}_{s_2,\ldots,s_N} - \cdots - w_{N+1}\bar{x}}{\beta}\right). \tag{C2}$$
The effective variance of opinions within locales at the
smallest scale, specified by the distribution $\hat{f}_{s_1,\ldots,s_N}(x')$,
is reduced by the effects of ties across $w_1$ through $w_{N+1}$.
However, the effective variance of the means $\bar{x}_{s_1,\ldots,s_N}$
is reduced only through interactions from $w_2$ through
$w_{N+1}$. This nested structure means that the effective
variance at any particular level is only reduced by interactions
across larger scales, yielding a multiscale distribution
of effective variance of the form
$$\mathrm{Var}(z') = \mathrm{E}(\mathrm{Var}(z \mid W_1)) \Big(1 - \sum_{i=1}^{N+1} w_i\Big)^2 + \mathrm{E}(\mathrm{Var}(\mathrm{E}(z \mid W_1) \mid W_2)) \Big(1 - \sum_{i=2}^{N+1} w_i\Big)^2 + \cdots + \mathrm{Var}(\mathrm{E}(z \mid W_N)) (1 - w_{N+1})^2, \tag{C3}$$
employing a similar notation to equation 1 where $z$ is a
random variable that samples over voter opinions $x$ and
$z'$ samples over effective opinions $x'$. This equation is
equivalent to equation 6 in the main text. The degree to
which social ties reduce the effective variance increases at
smaller scales: the variance within locales, for instance,
is decreased by ties among members of the locale and ties
to others across the country. The variance of the locale
means, on the other hand, is decreased only by cross-
locale ties. When social ties are in play, what matters is
not only the total variance of opinions in a country but
how that variance is distributed across scales. A coun-
try with more opinion differences at smaller (rather than
larger) scales would have a lower effective total variance.
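To illustrate equation C3, the following sketch evaluates the effective total variance for two hypothetical countries that have the same total variance but distribute it differently across three scales. The added variances, the weights, and the function name are assumptions made for this example only.

```python
def effective_total_variance(added, weights):
    """Equation C3.

    added[i]   : added variance at scale i + 1, for i = 0..N
    weights[j] : social-tie weight w_{j+1}, for j = 0..N
    The added variance at scale i + 1 is shrunk by (1 - sum of w_{i+1}..w_{N+1})^2."""
    if len(added) != len(weights):
        raise ValueError("need one weight per scale")
    total = 0.0
    for i, var_i in enumerate(added):
        shrink = 1.0 - sum(weights[i:])
        total += var_i * shrink ** 2
    return total

# Hypothetical weights (w_1, w_2, w_3) and two countries with total variance 4.0.
WEIGHTS = [0.3, 0.2, 0.1]
LOCALLY_DIVIDED = [3.0, 0.5, 0.5]     # most variance within the smallest regions
NATIONALLY_DIVIDED = [0.5, 0.5, 3.0]  # most variance between the largest regions

var_local = effective_total_variance(LOCALLY_DIVIDED, WEIGHTS)
var_national = effective_total_variance(NATIONALLY_DIVIDED, WEIGHTS)
print(f"effective variance, locally divided:    {var_local:.3f}")
print(f"effective variance, nationally divided: {var_national:.3f}")
```

With these assumed numbers the locally divided country has the lower effective variance, since small-scale differences are damped by ties at every scale while large-scale differences are damped only by the widest ties.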
Akin to the analysis performed in appendix B, we can
also examine how social ties affect the stability of elec-
tions. For simplicity, we will consider only two scales, but
similar results can be obtained for any number of scales.
Consider, for instance, two states with the same total
variance. In the first state, all counties have exactly the
same opinion distribution, which we model as the sum
of two normal distributions with means of $\mu_A = \Delta$ and
$\mu_B = -\Delta$ (since without loss of generality, we can take
the distributions to be centered at 0) and each with variance
$\sigma^2$. In the second state, all counties have unimodal
opinion distributions, with half of the counties having
opinion distributions centered at $\mu_A = \Delta$ and the other
half at $\mu_B = -\Delta$.
For the first state, where all counties are internally
polarized but identical to each other, the effective opinion
distribution is given by
$$\hat{f}_{s_2}(x') = \frac{1}{\beta} \sum_{s_1 \in s_2} f_{s_1,s_2}(x'/\beta), \tag{C4}$$
since $\bar{x}_{s_1,s_2} = \bar{x}_{s_2} = 0$ for all counties $s_1$. This is akin to
the case of a fully-connected, single-scale locale from the
previous appendix: social ties decrease the dimensionless
polarization $J$ to
$$\hat{J} = \frac{\Delta^2 \beta^2}{\sigma^2 \beta^2 + a^2} < J, \tag{C5}$$
reducing the magnitude and likelihood of instability.
In the second state, the effective opinion distribution
is given by
$$\hat{f}_{s_2}(x') = \frac{1}{\beta} \sum_{s_1 \in s_2} f_{s_1,s_2}\!\left(\frac{x' - w_1\bar{x}_{s_1,s_2}}{\beta}\right), \tag{C6}$$
since $\bar{x}_{s_2} = 0$ but the counties have different means
$\bar{x}_{s_1,s_2}$ equal to either $\mu_A = \Delta$ or $\mu_B = -\Delta$. This
transformation narrows each county's opinion distribution by a factor of
$\beta = 1 - w_1 - w_2$ (reducing its variance by $\beta^2$), thus making
the within-county populations more sharply peaked, but shrinks
the distance between the county means only by a factor of
$(1 - w_2)$. Computing the effective $J$ yields
$$\hat{J} = \frac{\Delta^2 (1 - w_2)^2}{\sigma^2 (1 - w_1 - w_2)^2 + a^2}. \tag{C7}$$
In this case, $\hat{J}$ is always equal to or larger than the $\hat{J}$
computed for the first state, where all the polarization is
concentrated at the smallest scale, so the second state has
a higher chance of instability.
Increasing $w_1$ here increases $\hat{J}$, while increasing $w_2$
decreases $\hat{J}$; whether $\hat{J}$ is greater than or less than
$J$ will depend on the precise values of the parameters.
Intuitively, by decreasing local variance, stronger local
social ties can result in a more hollowed-out center of
the effective opinion distribution if there is substantial
heterogeneity at larger scales, leading to a higher likeli-
hood of larger-scale instability. Thus, for an electorate
with substantial geographic segregation, the effect of so-
cial ties can be to decrease polarization and instability
for local elections, while simultaneously increasing them
for state and national elections.
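The comparison between the two hypothetical states (equations C5 and C7) can be tabulated directly. In the sketch below, the values of `DELTA`, `SIGMA`, and `A`, the grid of weights, and the function names are illustrative assumptions; the code only evaluates the two closed-form expressions and compares them.

```python
def j_baseline(delta, sigma, a):
    """Polarization without social ties (equation B6 with mu_A - mu_B = 2 * delta)."""
    return delta ** 2 / (sigma ** 2 + a ** 2)

def j_identical_counties(delta, sigma, a, w1, w2):
    """Equation C5: every county is internally polarized and identical."""
    beta = 1.0 - w1 - w2
    return delta ** 2 * beta ** 2 / (sigma ** 2 * beta ** 2 + a ** 2)

def j_sorted_counties(delta, sigma, a, w1, w2):
    """Equation C7: unimodal counties centered at +delta or -delta."""
    beta = 1.0 - w1 - w2
    return delta ** 2 * (1.0 - w2) ** 2 / (sigma ** 2 * beta ** 2 + a ** 2)

# Hypothetical parameters for illustration.
DELTA, SIGMA, A = 1.0, 0.5, 0.5
GRID = [0.0, 0.2, 0.4]

results = []
for w1 in GRID:
    for w2 in GRID:
        identical = j_identical_counties(DELTA, SIGMA, A, w1, w2)
        segregated = j_sorted_counties(DELTA, SIGMA, A, w1, w2)
        results.append((w1, w2, identical, segregated))
        print(f"w1={w1:.1f} w2={w2:.1f}  identical={identical:.3f}  sorted={segregated:.3f}")

print(f"baseline J = {j_baseline(DELTA, SIGMA, A):.3f}")
```

On this grid the geographically sorted state never has a smaller effective $J$ than the state with identical counties, because the two expressions share a denominator and $(1 - w_2)^2 \geq (1 - w_1 - w_2)^2$ whenever the weights are non-negative and sum to less than one.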
We can frame this in the perspective of opinion sort-
ing. We saw in the previous appendix that the segrega-
tion of opinions within a particular locale increases the
degree of instability in an election. The multiscale model
provides a more general result: segregation at any partic-
ular scale, where opinions in regions below that scale are
homogenized and opinions above it are made more heterogeneous,
can create strictly more instability for larger-scale
elections even when the overall opinion distribution
of the system is held constant.
Appendix D: Choice of election axes
Here we elaborate upon the concept of the election axis
described in section IV A and provide one model that
links the election axis to the multidimensional opinion
distribution of the electorate. The axis, which is simply
the direction of discourse spanned by the eventual candi-
dates in the general election, exists independently of any
modeling assumptions, including those described below.
We consider a specific model in which coalitions among
potential voters (e.g., political parties or groups of parties)
form so as to minimize the ideological distance
among potential voters within each coalition. Letting
$P = \{P_1, \ldots, P_k\}$ be a partition of the set of all electorate
opinions into $k$ coalitions, membership in each coalition
is then given by
$$\arg\min_{P} \sum_{i=1}^{k} \sum_{\vec{x} \in P_i} |\vec{x} - \vec{\mu}_i|^2, \tag{D1}$$
where $\vec{\mu}_i$ is the mean of $P_i$.
For $k = 2$, if opinions lie in the Euclidean space $\mathbb{R}^d$,
the boundary between the two clusters is always given by
a flat hyperplane of dimension $d - 1$, since the Euclidean distance
minimization corresponds to a linear kernel. We focus
on the $k = 2$ case as it provides an analogue for the
dominant social cleavage even in multi-party elections,
although it is also possible to define a similar process for
an arbitrary $k$ and a corresponding election subspace of
dimension $k - 1$.
Under these assumptions, the election axis is given by
the difference in the means of the two clusters, $\vec{\mu}_1 - \vec{\mu}_2$.
A useful heuristic is to think of this axis as an approximation
to the first principal component $v_1$ of the
opinion distribution $f(\vec{x})$: as shown by Ding and He [75],
the first principal component of the opinion distribution
$f(\vec{x})$ approximates the continuous solutions to the discrete
cluster membership indicators for 2-means clustering.
Even though the categorical constraint of the two
clusters means that the centroid vector and the first principal
component do not coincide exactly, we can employ
$v_1$ as the leading-order proxy for the election axis.
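As a rough numerical illustration of this heuristic, the following sketch runs 2-means clustering (equation D1 with $k = 2$) on synthetic opinions and compares the centroid difference with the first principal component. The synthetic Gaussian opinions, the random initialization, and the helper names `two_means_axis` and `first_principal_component` are assumptions made for illustration only; they are not taken from the paper.

```python
import math
import random


def two_means_axis(points, iters=50, seed=0):
    """Lloyd's algorithm with k = 2; returns the unit vector along mu_1 - mu_2."""
    rng = random.Random(seed)
    mu1, mu2 = rng.sample(points, 2)  # assumed initialization: two random opinions
    for _ in range(iters):
        c1, c2 = [], []
        for x in points:
            # Assign each opinion to the nearer centroid (Euclidean distance).
            (c1 if math.dist(x, mu1) <= math.dist(x, mu2) else c2).append(x)
        if not c1 or not c2:
            break  # degenerate split; keep the previous centroids
        mu1 = [sum(col) / len(c1) for col in zip(*c1)]
        mu2 = [sum(col) / len(c2) for col in zip(*c2)]
    diff = [a - b for a, b in zip(mu1, mu2)]
    norm = math.hypot(*diff)
    return [v / norm for v in diff]


def first_principal_component(points, iters=200):
    """Top eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(points), len(points[0])
    mean = [sum(col) / n for col in zip(*points)]
    centered = [[x[j] - mean[j] for j in range(d)] for x in points]
    cov = [[sum(r[j] * r[k] for r in centered) / n for k in range(d)]
           for j in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        v = [sum(cov[j][k] * v[k] for k in range(d)) for j in range(d)]
        norm = math.hypot(*v)
        v = [c / norm for c in v]
    return v


# Synthetic opinions: a 3-dimensional Gaussian stretched along the first issue.
rng = random.Random(1)
opinions = [[rng.gauss(0, 4.0), rng.gauss(0, 1.0), rng.gauss(0, 0.5)]
            for _ in range(2000)]

axis = two_means_axis(opinions)
v1 = first_principal_component(opinions)
cosine = abs(sum(a * b for a, b in zip(axis, v1)))
print(f"|cos(angle between centroid axis and v1)| = {cosine:.4f}")
```

For strongly elongated distributions such as this one, the two directions should nearly coincide; for multimodal or nearly isotropic opinion distributions the agreement can be weaker.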
Appendix E: Election axis interactions
Elections in a democracy interact with each other via a
variety of mechanisms, such as partisan ties, shared elec-
torates, and media-driven alignment. In this section, we
develop a general framework for how two election axes,
specified by unit vectors $\hat{e}_a$ and $\hat{e}_b$, may couple to each
other. We then show in the following appendix that this
formulation is compatible with the idea of effective opin-
ions (as introduced in section III A).
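We can describe the interaction of two elections by
writing their new axes $\hat{e}'_a$ and $\hat{e}'_b$ as a linear combination
of the original directions:
$$\hat{e}'_a = \frac{w_a \hat{e}_a + (1 - w_a)\hat{e}_b}{|w_a \hat{e}_a + (1 - w_a)\hat{e}_b|}, \qquad
\hat{e}'_b = \frac{w_b \hat{e}_b + (1 - w_b)\hat{e}_a}{|w_b \hat{e}_b + (1 - w_b)\hat{e}_a|}, \tag{E1}$$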
where $w_a, w_b \in [0, 1]$. A combination of this form assumes
that the coupled elections lie within the span of
the original axes. In other words, it assumes that no new
political issues are produced by the interaction; the old
issue dimensions are merely mixed. The relationship between
elections $a$ and $b$ can be asymmetrical: they may
be conducted at different levels (e.g., one is national and
the other is local), one election may be more salient than
the other, or they may simply involve electorates with
different populations.

FIG. 6. (a) When politicians are equally connected regardless of affiliation, the angle $\theta$ between axes $\hat{e}_a$ and $\hat{e}_b$ stays constant
and the distances between the two parties ($|\vec{D}'_a - \vec{R}'_a|$ and $|\vec{D}'_b - \vec{R}'_b|$) decrease linearly with the social weight $m$. (b) On the
other hand, under the assumptions stated in appendix F, if only intra-party ties exist, the angle $\theta'$ between the axes
decreases with $m$.
To see the effect of increasing the interaction strengths
$1 - w_a$ and $1 - w_b$, note that $\hat{e}_a$ and $\hat{e}_b$ span a plane, in which
the initial angle between them is some $\theta_0$. If
we pick a coordinate system where $\hat{e}_a = (1, 0, 0, \ldots)$ and
$\hat{e}_b = (\cos\theta_0, \sin\theta_0, 0, \ldots)$, we can write
$\hat{e}'_a = (\cos\theta_a, \sin\theta_a, 0, \ldots)$ and
$\hat{e}'_b = (\cos\theta_b, \sin\theta_b, 0, \ldots)$ for some
$\theta_a, \theta_b \in [0, \theta_0]$. Since $\theta_a$ increases as $w_a$ decreases and
$\theta_b$ decreases as $w_b$ decreases, the angle between the
two axes, $\theta_b - \theta_a$, decreases with stronger interactions (i.e.,
smaller values of $w_a$ and/or $w_b$).
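The pairwise coupling of equation E1 can be sketched numerically as follows. In this sketch $w$ is the weight an election places on its own axis, so $1 - w$ is the share taken from the other axis; the initial angle $\theta_0 = \pi/3$ and the sampled weights are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math


def normalize(v):
    """Rescale a vector to unit length."""
    norm = math.hypot(*v)
    return [c / norm for c in v]


def couple_axes(e_a, e_b, w_a, w_b):
    """Equation E1: mix two unit axes and renormalize each result."""
    new_a = normalize([w_a * x + (1 - w_a) * y for x, y in zip(e_a, e_b)])
    new_b = normalize([w_b * y + (1 - w_b) * x for x, y in zip(e_a, e_b)])
    return new_a, new_b


def angle_between(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


theta_0 = math.pi / 3  # assumed initial angle between the two axes
e_a = [1.0, 0.0]
e_b = [math.cos(theta_0), math.sin(theta_0)]

for w in (1.0, 0.9, 0.7):
    new_a, new_b = couple_axes(e_a, e_b, w, w)
    print(f"w = {w:.1f}: angle = {angle_between(new_a, new_b):.3f} rad")
```

With $w_a = w_b = 1$ the axes are unchanged, and the angle between them shrinks as more weight is shifted onto the other election's axis.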
We can extend this two-axis model to a country with
elections at scales $\alpha, \beta, \gamma, \ldots$. In the most general setting,
a multilevel system can have both horizontal interactions
(e.g., those driven by shared party mobilization between
states) and vertical interactions (e.g., those driven
by the effect of a shared electorate in nested elections).
Each scale consists of a set of axes, $\alpha = \{\hat{\alpha}_1, \hat{\alpha}_2, \ldots\}$,
$\beta = \{\hat{\beta}_1, \hat{\beta}_2, \ldots\}$, etc. The un-normalized effective axis
for an election $\hat{e}_i$, produced as a result of multilevel
interactions, can be written as
$$e'_i = w\,\hat{e}_i + (1 - w)\left[\sum_{j,\ \hat{\alpha}_j \neq \hat{e}_i}^{|\alpha|} A_{ij}\,\hat{\alpha}_j + \sum_{j,\ \hat{\beta}_j \neq \hat{e}_i}^{|\beta|} B_{ij}\,\hat{\beta}_j + \cdots\right], \tag{E2}$$
where $A_{ij}, B_{ij}, \ldots$ represent interaction matrices with
elections at the respective scales and $1 - w$ parameterizes
the total strength of these interactions. The normalized
axis is given by $\hat{e}'_i = e'_i / |e'_i|$. A version of this model with
two scales is described in the main text.
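A minimal sketch of equation E2 for a single election $i$ is given below. The function name `effective_axis`, the example axes, and the interaction weights (standing in for one row of $A_{ij}$ and $B_{ij}$) are all hypothetical values chosen for illustration.

```python
import math


def effective_axis(own_axis, w, scales):
    """Equation E2 followed by normalization, for one election i.

    scales: list of (weights, axes) pairs, one per scale. weights[j] plays
    the role of A_ij (or B_ij, ...) and axes[j] is the unit axis of another
    election j at that scale (election i itself is excluded).
    """
    d = len(own_axis)
    mixed = [0.0] * d
    for weights, axes in scales:
        for weight, axis in zip(weights, axes):
            for k in range(d):
                mixed[k] += weight * axis[k]
    raw = [w * own_axis[k] + (1 - w) * mixed[k] for k in range(d)]
    norm = math.hypot(*raw)
    return [c / norm for c in raw]


# Hypothetical three-issue example: one local election, one national axis,
# and two neighbouring local elections at the same scale.
local_axis = [0.0, 1.0, 0.0]
national_axes = [[1.0, 0.0, 0.0]]
neighbour_axes = [[0.0, 0.6, 0.8], [0.0, 0.8, 0.6]]
scales = [([0.6], national_axes), ([0.2, 0.2], neighbour_axes)]

for w in (1.0, 0.8, 0.6):
    axis = effective_axis(local_axis, w, scales)
    alignment = sum(x * y for x, y in zip(axis, national_axes[0]))
    print(f"w = {w:.1f}: alignment with national axis = {alignment:.3f}")
```

In this toy setting the local axis rotates toward the national axis as the weight on the election's own axis falls.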
Generalizing the pairwise interaction shown in equation E1,
the multilevel interactions help to focus all the
election axes in a country (whose discourse may originally
be oriented along any number of issue dimensions)
toward a single direction. For instance, we can measure
the dispersion of the axes $\hat{e}_i$ via the circular variance of
the subtended angle $\theta'_i = \cos^{-1}(\hat{e}_i \cdot \hat{e}_N)$,
$$\mathrm{Var}(\theta') = 1 - \frac{1}{n}\sqrt{\left(\sum_{i=1}^{n} \cos\theta'_i\right)^2 + \left(\sum_{i=1}^{n} \sin\theta'_i\right)^2}, \tag{E3}$$
where $i$ indexes all the elections in the country.
This decreases monotonically with stronger social ties.
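The dispersion measure can be sketched as below, using the standard definition of circular variance (one minus the mean resultant length of the angles). The reference axis standing in for $\hat{e}_N$, the four example axes, and the simple mixing rule `mix_toward` are assumptions for illustration.

```python
import math


def circular_variance(axes, reference):
    """Circular variance of the angles between each unit axis and a reference."""
    thetas = []
    for e in axes:
        dot = sum(x * y for x, y in zip(e, reference))
        thetas.append(math.acos(max(-1.0, min(1.0, dot))))
    c = sum(math.cos(t) for t in thetas)
    s = sum(math.sin(t) for t in thetas)
    return 1.0 - math.hypot(c, s) / len(thetas)


def mix_toward(axis, reference, w):
    """Hypothetical coupling: keep weight w on the axis, 1 - w on the reference."""
    raw = [w * x + (1 - w) * y for x, y in zip(axis, reference)]
    norm = math.hypot(*raw)
    return [c / norm for c in raw]


reference = [1.0, 0.0]  # stands in for the reference axis e_N
axes = [[math.cos(t), math.sin(t)] for t in (0.2, 0.9, 1.4, 2.0)]

var_before = circular_variance(axes, reference)
coupled = [mix_toward(e, reference, 0.5) for e in axes]
var_after = circular_variance(coupled, reference)
print(f"circular variance before coupling: {var_before:.3f}")
print(f"circular variance after coupling:  {var_after:.3f}")
```

In this example the variance falls once every axis is pulled toward the reference direction, and it is zero when all axes coincide.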
Although the focusing effect can, in principle, lead to
greater coherence in a country’s legislature, it also col-
lapses the dimensionality of its political discourse. As
described in section IV C, the lower dimensionality tends
to increase the projected variance along the first principal
axis of legislator opinions, an effect that can be measured
via DW-NOMINATE scores.
Appendix F: Interactions via partisan ties
Here, we explore a specific model of how elections inter-
act based on the effective opinion model in section III A
to provide an example of how interactions between elec-
tion axes may arise. We show that this model is consis-
tent with the more general coarse-grained formulation of
axis interactions introduced above in appendix E. Other
specific models will also be consistent with the coarse-
grained formulation. While real-world elections will of
course not follow the precise dynamics given here, they
may nonetheless be well described in aggregate by the
coarse-grained formulation.
Take, as an example, elections $a$ and $b$ with Democratic
and Republican candidates located at $\vec{D}_a$, $\vec{R}_a$, and $\vec{D}_b$,
$\vec{R}_b$ respectively. The candidates span election axes
$\hat{e}_i = (\vec{D}_i - \vec{R}_i)/|\vec{D}_i - \vec{R}_i|$, where $i \in \{a, b\}$. Equation B1 tells us
that in order to compute the transformed positions when
all opinion holders are equally connected, we find the
mean $\bar{x}$ of the distribution and shift the original opinions
proportional to a weight $m \in [0, 1]$. If all the politicians
in a country are equally connected, the new position of
the Democratic candidate in election $i$ can be written as
$$\vec{D}'_i = \bar{x}\,m + \vec{D}_i\,(1 - m). \tag{F1}$$
The expression for the Republican candidate follows
identically. As a result, the distance between candidates
of opposing parties,
$$|\vec{D}'_i - \vec{R}'_i| = (1 - m)\,|\vec{D}_i - \vec{R}_i|, \tag{F2}$$
decreases as the strength of social ties is increased. Furthermore,
the angle $\theta' = \cos^{-1}(\hat{e}'_a \cdot \hat{e}'_b)$ stays the same.
This can be seen in figure 6a: because $\vec{D}_i$ and $\vec{R}_i$ are
both shifted toward the center of mass by the same proportion,
they form a pair of similar triangles with the
transformed candidate positions. As a result, the axes
are always translated parallel to their original directions.
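Equations F1 and F2 can be checked numerically with a short sketch. The candidate positions and the weight $m = 0.4$ are hypothetical, and $\bar{x}$ is taken here to be the mean of the four candidate positions purely for illustration; the same conclusions hold for any common target point.

```python
import math


def shift_toward(point, target, m):
    """Equation F1: move a position toward `target` with social weight m."""
    return [t * m + p * (1 - m) for p, t in zip(point, target)]


def unit_axis(d, r):
    """Unit vector pointing from the Republican to the Democratic position."""
    diff = [x - y for x, y in zip(d, r)]
    norm = math.hypot(*diff)
    return [c / norm for c in diff]


def angle(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


# Hypothetical candidate positions in a two-dimensional opinion space.
D_a, R_a = [-1.0, 0.2], [1.0, -0.2]
D_b, R_b = [-0.3, 1.0], [0.4, -1.0]
candidates = [D_a, R_a, D_b, R_b]
x_bar = [sum(col) / len(candidates) for col in zip(*candidates)]  # assumed mean

m = 0.4
D_a2, R_a2, D_b2, R_b2 = (shift_toward(p, x_bar, m) for p in candidates)

ratio_a = math.dist(D_a2, R_a2) / math.dist(D_a, R_a)
theta_before = angle(unit_axis(D_a, R_a), unit_axis(D_b, R_b))
theta_after = angle(unit_axis(D_a2, R_a2), unit_axis(D_b2, R_b2))
print(f"distance ratio = {ratio_a:.3f} (compare with 1 - m = {1 - m:.3f})")
print(f"angle before = {theta_before:.4f} rad, after = {theta_after:.4f} rad")
```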
However, we obtain a different result if interactions
that result in changes in effective political opinions exist
predominantly within (rather than between) parties.
When social ties only occur in-party, we let the effective
opinions of Democrats move towards $\bar{x}_D = p_a \vec{D}_a + p_b \vec{D}_b$
and those of Republicans move towards $\bar{x}_R = p_a \vec{R}_a + p_b \vec{R}_b$,
with $p_a$ and $p_b$ parameterizing the relative size or
salience of elections $a$ and $b$ (and with $p_a + p_b = 1$).
If effective opinions are pulled toward these means with
weight $m$, the new election axes become $\hat{e}'_i = \vec{e}\,'_i / |\vec{e}\,'_i|$,
where
$$\begin{aligned}
\vec{e}\,'_a &= (\bar{x}_D - \bar{x}_R)\,m + (\vec{D}_a - \vec{R}_a)(1 - m) \\
&= (\vec{D}_a - \vec{R}_a)(1 - m + p_a m) + (\vec{D}_b - \vec{R}_b)\,p_b m.
\end{aligned} \tag{F3}$$
A similar expression can be obtained for $\vec{e}\,'_b$. Identifying
$m - p_a m = p_b m$ with $1 - w_a$, we see that this is a linear
combination of $\vec{D}_a - \vec{R}_a$ and $\vec{D}_b - \vec{R}_b$ of the form presented
in equation E1. As we increase the level of within-party
socialization $m$, the angle between the two election axes
$\hat{e}'_a$ and $\hat{e}'_b$ decreases.
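The in-party case can be illustrated with a small sketch that evaluates both lines of equation F3 and tracks the angle between the resulting axes. The candidate positions, the salience $p_a = 0.6$, and the sampled values of $m$ are hypothetical.

```python
import math


def in_party_axes(D_a, R_a, D_b, R_b, p_a, m):
    """Un-normalized axes after in-party ties of weight m (first line of F3)."""
    p_b = 1.0 - p_a
    x_D = [p_a * x + p_b * y for x, y in zip(D_a, D_b)]  # Democratic party mean
    x_R = [p_a * x + p_b * y for x, y in zip(R_a, R_b)]  # Republican party mean
    gap = [d - r for d, r in zip(x_D, x_R)]
    e_a = [g * m + (d - r) * (1 - m) for g, d, r in zip(gap, D_a, R_a)]
    e_b = [g * m + (d - r) * (1 - m) for g, d, r in zip(gap, D_b, R_b)]
    return e_a, e_b


def angle(u, v):
    """Angle in radians between two (not necessarily unit) vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    cosine = dot / (math.hypot(*u) * math.hypot(*v))
    return math.acos(max(-1.0, min(1.0, cosine)))


# Hypothetical candidate positions and relative salience of election a.
D_a, R_a = [-1.0, 0.2], [1.0, -0.2]
D_b, R_b = [-0.3, 1.0], [0.4, -1.0]
p_a = 0.6
p_b = 1.0 - p_a

angles = {}
for m in (0.0, 0.5, 0.9):
    e_a, e_b = in_party_axes(D_a, R_a, D_b, R_b, p_a, m)
    angles[m] = angle(e_a, e_b)
    print(f"m = {m:.1f}: angle between axes = {angles[m]:.3f} rad")

# Second line of F3, evaluated at m = 0.5 for comparison with the first line.
m_check = 0.5
e_a_first, _ = in_party_axes(D_a, R_a, D_b, R_b, p_a, m_check)
e_a_closed = [(da - ra) * (1 - m_check + p_a * m_check) + (db - rb) * p_b * m_check
              for da, ra, db, rb in zip(D_a, R_a, D_b, R_b)]
```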
Appendix G: Multidimensional representation
Recall from section III that the representation $r_i$ of
an individual $i$ is defined as the effect of a shift in their
opinion $x_i$ on the election outcome. While representation
is either positive or negative in the one-dimensional case,
in a multidimensional opinion space, the outcome of an
election can change in any direction relative to the change
in $x_i$.
This requires us to generalize the representation $r_i$ to
a tensor, taking into account the direction in which the
outcome changes for a given change in opinion. Assuming
the derivative exists, we write the representation of
opinion $x_i$ as
$$r^i_{\mu\nu} = \frac{\partial y_\mu}{\partial x^i_\nu}, \tag{G1}$$
where the second index corresponds to the direction of
opinion change and the first index corresponds to the
direction of change in the election outcome. This is a
rank-two tensor, but because the opinion space can be
embedded in $\mathbb{R}^n$, the metric is simply $\delta_{\mu\nu}$. Working in
a Euclidean space enables us to lower all indices for notational
simplicity. A similar notion of multidimensional
representation was first presented in the first supplemental
section of [11].
A change in opinion in direction $\hat{c}$ can always be broken
down into a component along the election axis $\hat{e}$ and a
component along some orthogonal axis $\hat{o}$:
$$\hat{c} = a\hat{e} + b\hat{o}. \tag{G2}$$
The differential representation along $\hat{c}$, i.e., the change in
the election outcome along $\hat{c}$ for a change in opinion in
the same direction, can be written as
$$r_{\hat{c}} = \hat{c}_\mu r^i_{\mu\nu} \hat{c}_\nu = a^2\,\hat{e}_\mu r^i_{\mu\nu} \hat{e}_\nu + b^2\,\hat{o}_\mu r^i_{\mu\nu} \hat{o}_\nu + ab\left(\hat{e}_\mu r^i_{\mu\nu} \hat{o}_\nu + \hat{o}_\mu r^i_{\mu\nu} \hat{e}_\nu\right). \tag{G3}$$
The final term, which represents the change in the election
outcome orthogonal to the change in opinion, vanishes
when $\hat{e}$ is an eigenvector of $r^i_{\mu\nu}$.
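The decomposition in equation G3 can be verified numerically with a short sketch. The tensor entries, the angle `phi` that sets $a$ and $b$, and the helper name `bilinear` are hypothetical; the second check additionally assumes a symmetric tensor, for which an eigenvector $\hat{e}$ makes both cross terms vanish.

```python
import math


def bilinear(u, r, v):
    """Contraction u_mu r_{mu nu} v_nu for a rank-two tensor r."""
    return sum(u[i] * r[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


# Hypothetical representation tensor in a two-dimensional opinion space.
r = [[2.0, 0.5], [-0.3, 1.0]]
e_hat, o_hat = [1.0, 0.0], [0.0, 1.0]  # election axis and an orthogonal axis

phi = 0.7  # direction of the opinion change relative to the election axis
a, b = math.cos(phi), math.sin(phi)
c_hat = [a * e + b * o for e, o in zip(e_hat, o_hat)]

total = bilinear(c_hat, r, c_hat)
on_axis = a * a * bilinear(e_hat, r, e_hat) + a * b * bilinear(o_hat, r, e_hat)
off_axis = b * b * bilinear(o_hat, r, o_hat) + a * b * bilinear(e_hat, r, o_hat)
print(f"total = {total:.4f}, on-axis + off-axis = {on_axis + off_axis:.4f}")

# Assumed symmetric tensor whose eigenvector is the election axis.
r_sym = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
e_eig, o_perp = [s, s], [s, -s]
cross = bilinear(e_eig, r_sym, o_perp) + bilinear(o_perp, r_sym, e_eig)
print(f"cross term for eigenvector axis = {cross:.2e}")
```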
In general, equation G3 allows us to write the total
representation as the sum of contributions from on-axis
changes in opinion, $a^2\,\hat{e}_\mu r^i_{\mu\nu} \hat{e}_\nu + ab\,\hat{o}_\mu r^i_{\mu\nu} \hat{e}_\nu$, and off-axis
changes in opinion, $b^2\,\hat{o}_\mu r^i_{\mu\nu} \hat{o}_\nu + ab\,\hat{e}_\mu r^i_{\mu\nu} \hat{o}_\nu$. These components
may have very different properties depending on
the election process. For instance, if negative representation
occurs strongly along the election axis (perhaps
due to its correlation with national political discourse),
changes in opinion along orthogonal directions may provide
a good avenue for political representation. However,
there may also be situations in which only changes in
opinion along the direction of the axis are represented.
For instance, in a local election dominated by national
discourse, local issues that do not fall along the direction
of national discourse may have little effect on the election
outcome.