
United States Precinct Boundaries and Statewide Partisan Election Results

Abstract

We describe the creation and verification of databases of all precinct boundaries used in the United States 2016, 2018, and 2020 November general elections, enhanced with election results for all partisan statewide offices. United States election officials report election results in the smallest geographic reporting unit, known as the precinct. Scholars and practitioners find these election results valuable for numerous use cases. However, these data cannot be augmented with other geographically-bound data, such as U.S. Census data, without precinct boundaries. Here we describe the collection of precinct boundary data from state and local election officials, provided variously in GIS formats, as images, as text descriptions, and – in rare cases – verbally. We describe how we verify boundaries with other election data, such as geocoded voter registration files. Our open-source data has appeared in redistricting litigation argued before the United States Supreme Court, and has been used by state and local redistricting authorities, media organizations, advocacy groups, scholars, and a vibrant community of mapping enthusiasts.
SCIENTIFIC DATA | (2024) 11:1173 | https://doi.org/10.1038/s41597-024-04024-2
www.nature.com/scientificdata
Brian Amos, Steven Gerontakis & Michael McDonald ✉

A requirement for democratic accountability is that governments report election data. In the United States there
is no national entity responsible for administering elections; responsibility devolves to sub-national state and
local election administrators. Without centralized national administration, election officials report election data
in non-standard formats that pose significant barriers to creating a unified national database1. To preserve the
secret ballot, election administrators report aggregate election results in geographic units commonly known as
states, districts, counties, townships, and precincts (the names of sub-state entities vary across the country). The
smallest of these geographies are precincts, which are on the scale of neighborhoods. Election officials use precincts
to identify the polling location at which a voter will cast an in-person vote and the offices that will appear
on the ballot2. To augment election results with other contextual data one requires their geographic boundaries3.
State, district, county, and township boundaries – and contextual data related to them – are readily available
from the U.S. Census Bureau and other sources. However, there is no nationwide database of accurate precinct
boundaries, which poses challenges to their collection and standardization. To fill this gap, we created national
precinct boundary databases enhanced with vote counts for candidates of all partisan statewide offices for the
170,098 precincts used in the 2020 U.S. November general election; the 152,217 used in 2018 (CA is in-process
at the time of this writing), and the 177,202 used in 2016.
There are notable efforts to collect precinct boundaries from the roughly 3,000 local election officials across
the country who are the primary data curators. The U.S. Census Bureau requests that states provide precinct boundaries
collected from their localities so they may be included in the Census Bureau’s geographies for reporting
demographic statistics. The principal use case of these data is redistricting of legislative districts following the
decennial census. Phase Two of the Census Bureau’s Redistricting Data Program – when the Bureau collects
precinct boundaries – takes place in a year ending in ‘7’ preceding a decennial census
(https://www.census.gov/programs-surveys/decennial-census/about/rdo.html). Unfortunately, these boundaries can be
inaccurate geographic representations of precincts used in most elections3. Precinct boundaries are time-bound, in that
election officials frequently modify precincts from one election to the next for administrative reasons. Additionally,
our quality control routinely identifies incorrect boundaries even when obtained directly from local election
officials. It is for these reasons that a labor-intensive effort is required to collect and verify precinct boundaries used
in each election. Following the 2010 census, a team from Harvard and Stanford universities attempted to collect
2008 general election precinct boundaries, but discontinued their efforts (https://projects.iq.harvard.edu/eda/home).
Authors of this essay contributed to that effort and seek to continue it.

1Wichita State University, Department of Political Science, Wichita, Kansas, 67260, USA. 2University of Florida,
Department of Political Science, Gainesville, Florida, 32611, USA. 3These authors contributed equally: Brian Amos,
Steven Gerontakis, Michael McDonald. e-mail: michael.mcdonald@ufl.edu

Content courtesy of Springer Nature, terms of use apply. Rights reserved

Stakeholders find geographically-bound precinct election results important to understanding democratic
processes, representation, and governance. A primary use is redistricting, where stakeholders desire district
partisanship measures derived from statewide office elections4. Our databases are incorporated into online
redistricting mapping and evaluation tools, such as Dave’s Redistricting App (https://davesredistricting.org/),
DistrictBuilder (https://www.districtbuilder.org/), and PlanScore (https://planscore.org/). Numerous media
organizations use our database in their election coverage, including CNN, New York Times, Washington Post, and
Wall Street Journal. State governments used our databases for their redistricting, including New York, Ohio, and
Virginia. A related use case is voting rights litigation, where experts analyze precinct data to estimate racial voting
patterns using a technique known as ecological inference5. Among the court cases using our databases was
a successful challenge to Alabama’s congressional districts as a racial gerrymander decided by the U.S. Supreme
Court in Allen v. Milligan 599 U.S. 1 (2023). Scholars use our databases to evaluate redistricting outcomes6,7, to
develop new partisan gerrymandering metrics and solutions8–11, to analyze effects of the U.S. Census Bureau’s
differential privacy policies12, to estimate racial voting patterns among congressional districts13, to analyze polarization
among state legislators14, to analyze Latino voting patterns15, to analyze suburban voting patterns16, to
analyze U.S. COVID policies17–19, to analyze local crime policies20, to map campaign donation patterns21, and to
augment other geographically bound databases with partisan voting data8,22.
Statewide partisan offices on the ballot vary among the states depending on the election. In a presidential
election year, all states have the presidential election on their November ballot. Appearances of other offices vary
with timing of U.S. Senate elections and state laws regarding when and what state offices are elected. As depicted
in Table 1, we report precinct election results for thirty-six offices spanning the 2016, 2018, and 2020 November
general elections, with a combined 1,736,409 cells.
Methods
We produce nationwide electronic geographic information system (GIS) databases of precincts used in the 2016,
2018, and 2020 United States November general elections, augmented with precinct-level election results for
statewide partisan offices. On occasion we also produce and publicly release databases for selected elections other
than November federal elections, such as primary elections, when we have capacity to create these databases.
There are two important data production phases: the creation of a precinct boundary database for each state and
the creation of precinct-level election results that can be joined to these boundaries.
States and the federal government use various names for what are generally known as precincts. States may
also call these “wards” or “election districts.” The U.S. Census Bureau acknowledges this naming variety by
calling these geographies “voting tabulation districts” or VTDs. We use the common name “precincts” to refer
to any such small geographic boundaries used by election officials for managing elections and reporting election
results. Our naming convention includes smaller geographies created when election officials occasionally
report election results for precincts split by legislative district boundaries. Likewise, the local government
that manages federal and state elections is generally what is known as a county. States may have other names
for county-equivalents, such as “parishes,” and sub-county governments such as “townships” or “districts” may
administer elections. We use the common name “county” to refer to any local government responsible for maintaining
precinct boundaries and reporting election results.
There is no standard format for precinct boundary data. Frequently, governments
publish precinct boundaries in an electronic format known as shapefiles – a proprietary GIS format developed
by software vendor ESRI, and since cracked for general use. We prefer election officials to provide a shapefile (or
any similar electronic GIS format; hereafter we simply call all GIS formats “shapefiles”) since this allows us to
edit boundaries if we detect and verify errors. We obtain shapefiles from the U.S. Census Bureau, state election
officials, and local governments. We ultimately convert all boundary data we ingest into the shapefile format for
re-dissemination. This approach allows us to attach precincts’ election results to the boundary data as attributes
of the precinct shapes.
The second-most frequent map format election officials provide us is a map image. Sometimes a map image
is itself created with GIS software, but election officials may be unable to provide us with the shapefile. Usually
this is because election officials did not create the map; another local government agency or an outside vendor
created it. When possible we navigate local government contacts or submit open records requests to obtain the
canonical shapefile. These steps are not always fruitful, as the outside agent may charge a fee or be otherwise
unable to provide the shapefile. Not all map images are created by GIS software. Sometimes election officials will
provide a hand-drawn map. Sometimes county officials cannot provide an electronic image file, in which case
election officials may take a cell-phone picture of their map and forward it to us.
The third-most frequent map format election officials provide us with is a written or verbal description. We
encounter this situation in rural counties or small townships that have few precincts and little GIS capacity.
While this may seem antiquated, in reality election officials do not commonly manage which precincts their
registered voters are assigned to using electronic maps. Instead, they manage their voter registration databases
using master street address files that identify which precinct each street address range is located in (e.g., even
numbered 100–198 Main Street is associated with precinct 1). When we geocode voter registration files during
our quality assurance protocols we may detect errors in these master street address files when we observe a
street range assigned to one precinct while surrounding neighbors are assigned to another23. We have consulted
with the Colorado and Virginia state governments to identify and rectify these errors, and alerted numerous
localities elsewhere of potential issues. These errors pose a dilemma as to which boundaries to report. Since our
use case involves the geographic location of voters, we generally opt to generate modified precinct boundaries
that include these verified assignment errors.
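
The neighbor comparison described above can be illustrated with a small sketch. It is not our production protocol: the function name, the planar coordinates, and the choice of `k` nearest neighbors with a unanimity threshold are all assumptions made for illustration. It flags a geocoded voter when the nearest neighbors are overwhelmingly assigned to a single different precinct.

```python
import math
from collections import Counter

def flag_isolated_assignments(voters, k=4, threshold=1.0):
    """Return indexes of voters whose k nearest neighbors mostly sit in another precinct.

    voters: list of (x, y, precinct) tuples in a projected (planar) coordinate
    system. k and threshold are illustrative assumptions, not documented values.
    """
    flagged = []
    for i, (x, y, precinct) in enumerate(voters):
        # Brute-force distances to every other voter; fine for a small example.
        others = sorted(
            (math.dist((x, y), (ox, oy)), other_precinct)
            for j, (ox, oy, other_precinct) in enumerate(voters)
            if j != i
        )[:k]
        if not others:
            continue
        top_precinct, count = Counter(p for _, p in others).most_common(1)[0]
        if top_precinct != precinct and count / len(others) >= threshold:
            flagged.append(i)
    return flagged

voters = [
    (0, 0, "P1"), (0, 1, "P1"), (1, 0, "P1"), (1, 1, "P1"),
    (0.5, 0.5, "P2"),  # surrounded by P1 addresses: a candidate assignment error
    (10, 10, "P2"), (10, 11, "P2"), (11, 10, "P2"), (11, 11, "P2"),
]
suspects = flag_isolated_assignments(voters)
```

A flag of this kind is only a prompt for review. Voters near a true precinct boundary can also have neighbors in another precinct, which is why a strict threshold is used in the sketch and why flagged cases still need to be verified against other sources.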
Lastly, in rare cases local officials do not respond to our multiple requests for their precinct boundaries (in extreme
cases we have made repeated contact attempts spanning more than a year). In these cases we rely upon other
information that provides clues as to the precinct boundaries, such as geocoded voter registration addresses with
precinct identifiers, or other local districts and governmental units that local election officials align precinct
boundaries with (depending on state and local practices). A detail we frequently wrestle with is local municipality
annexations in localities that require municipal and precinct boundaries to coincide. We often rely on
published annexation reports to verify the correctness of the precinct boundaries. When we detect conflicts, we
contact local election officials to resolve them.
We create our own GIS precinct maps or modify shapefiles that we verify have errors. We nearly always have
a reference precinct map to start from that we obtained from the U.S. Census Bureau or a shapefile created by us
from a prior election. When we do not have any initial resource, we use a machine learning algorithm that we created to
fabricate a starter precinct map from census geography as a basemap using geocoded voter registration addresses
and their associated precincts. Automation rarely yields usable precinct boundaries without extensive additional
editing, since geocodes are imprecise, particularly in rural areas23. Once we have a base map, we examine ancillary
data to create the most accurate representation that we can.
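
To give a sense of what a starter map involves, the following sketch assigns each census block to the precinct held by the plurality of voters geocoded inside it. This is a deliberately simple plurality heuristic standing in for the machine learning algorithm mentioned above, whose details are not described here; the function name and the block and precinct identifiers are hypothetical.

```python
from collections import Counter, defaultdict

def starter_assignment(voter_records):
    """Assign each census block to its plurality precinct.

    voter_records: iterable of (block_id, precinct_id) pairs, one per geocoded
    voter. Returns {block_id: (precinct_id, share)}, where share is the fraction
    of the block's geocoded voters recorded in the winning precinct.
    """
    tallies = defaultdict(Counter)
    for block_id, precinct_id in voter_records:
        tallies[block_id][precinct_id] += 1

    assignment = {}
    for block_id, counts in tallies.items():
        precinct_id, n = counts.most_common(1)[0]
        assignment[block_id] = (precinct_id, n / sum(counts.values()))
    return assignment

records = [("B1", "P1"), ("B1", "P1"), ("B1", "P2"), ("B2", "P2")]
starter = starter_assignment(records)
# Blocks with a low share are likely split by a precinct line or affected by
# geocoding error, and would need manual editing.
needs_review = [b for b, (_, share) in starter.items() if share < 0.9]
```

Blocks with no geocoded voters receive no assignment at all in this sketch, which is one reason automated output of this kind still requires extensive editing.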
In Tables2, 3 we present statistics drawn from our extensive documentation to provide a sense of the scope
of our work. Table2 presents the states from Alabama to Missouri and Table3 presents Montana to Wyoming,
Oce 2016 2018 2020
Attorney General 34108 100486 34551
Auditor 24490 36862 24743
Chief Financial Ocer 143 6152 144
City Council Member 143
Clerk of the Supreme Court 672
Commissioner of Agriculture 4425 24086 4333
Commissioner of Insurance 10763 8660 10982
Commissioner of Labor 2704 4609 2662
Commissioner of Public Lands 7197 13045 7464
Commissioner of School and Public Lands 737
Comptroller 10090 39439
Corporation Commissioner 1951 1948
Council Chairman 143
Delegate to the U.S. House 143 143 144
Governor 26124 114083 25247
Lieutenant Governor 13939 21865 14577
President 177202 170198
Mayor 143
Public Service Commissioner 5129 5074 5073
Public Utilities Commissioner 747 737 737
Railroad Commissioner 8832 8936 9014
Secretary of Commonwealth 2173
Secretary of State 18759 72828 17811
State Appeals Court 4196 6190 6551
State Board of Education 4812 4801 4756
State Controller 3004
State Court of Criminal Appeals 8832 10928 10986
State Mine Inspector 1489
State Supreme Court 15028 15126 15565
State University Regent 3010 3136
Superintendent of Public Instruction 8771 9761 3328
Tax Commissioner 424
Treasurer 28473 61680 29141
U.S. House 5688 3618 3609
U.S. Senate 115565 112084 78044
University Board of Regents/Trustees/Governors 4812 4801 4756
Tot a l 177202 152717 170198
Tab le 1. Statewide Oce Precinct Count Note: CA 2018 is in production at the time of this writing.
Content courtesy of Springer Nature, terms of use apply. Rights reserved
4
SCIENTIFIC DATA | (2024) 11:1173 | https://doi.org/10.1038/s41597-024-04024-2
www.nature.com/scientificdata
www.nature.com/scientificdata/
with totals for the entire United States. e rst column presents the number of geographical “units” within each
state, which are counties for all states except those that use state legislative districts to group precincts (Alaska
and Delaware) and those that use cities or towns (Connecticut, Maine, Massachusetts, New Hampshire, Rhode
Island, and Vermont). In the next three columns we present the number of geographical units where, for the
2020, 2018, and 2016 November elections, we edited at least one precinct boundary by splitting a precinct into
two or more parts, merging it with another precinct, or drawing new precincts. In a given election, we edit at
least one precinct – and oen more – in about one-sixth of the geographical units. ese are the geographic units
where we make changes. Our quality assurance protocols include checking all precincts whether they require
further editing or not.
In seven states we did not modify any precincts across all three election years (Alaska, Hawaii, Maryland,
Nevada, Oklahoma, Utah, and Wisconsin). State election officials in all of these states – except Nevada – provide
a statewide precinct shapefile. Indeed, nearly all states have a statewide map either disseminated by the state or
available from the Census Bureau’s Phase 2 redistricting data program production. We welcome a statewide
precinct map, but its existence does not guarantee our mapping work is done. We often find issues – errors or
out-of-date boundaries – that we resolve by collecting precinct maps from localities.
Tables2, 3 under-represent the breadth of our mapping work. We do not count certain tasks that we perform
regularly. We do not count adjusting precincts for city and town annexations in states where precincts do not
cross local government boundaries. When we collect maps from local governments, electronic representations
of local governments’ boundaries may not nicely conform with one another due to technical GIS details, such as
diering map projections. We do not count ensuring precinct boundaries from dierent sources do not result in
overlapping precincts – and in a rare cases do not extend outside state boundaries. We do not count where we
created precinct maps from sources other than those obtained from election ocials such as local land parcel
maps, which require extra eort to identify as a source for precinct boundaries and to collect. We do not count
when we create electronic representations of local governments’ precincts that localities adopt as their ocial
precinct map.

State                 Total    Units Adjusted      Non-Precinct Vote Allocated
                      Units    2020  2018  2016    2020  2018  2016
Alabama                  67      67    67    67       Y     Y     Y
Alaska                   20       0     0     0       Y     Y     Y
Arizona                  15       3     3     2       N     N     N
Arkansas                 75       9    26    36       N     Y     Y
California               58       6     -     0       N     -     N
Colorado                 64       7     6     7       N     N     N
Connecticut             169      55    56    57       N     N     N
Delaware                 21       4     3     0       Y     Y     Y
District of Columbia      1       0     1     1       N     N     N
Florida                  67      29    28    34       Y     Y     N
Georgia                 159      20     9     5       N     N     N
Hawaii                    5       0     0     0       N     N     N
Idaho                    44      16    18    14       Y     Y     Y
Illinois                102       2     4    11       Y     N     Y
Indiana                  92      43    56    56       N     Y     Y
Iowa                     99      12     7     6       N     N     N
Kansas                  105      21    20    11       N     N     N
Kentucky                120       0     -    10       Y     -     Y
Louisiana                64       7     5     5       Y     Y     Y
Maine                   425       3     3     0       Y     Y     Y
Maryland                 24       0     0     0       N     Y     Y
Massachusetts           351      16    19    15       N     N     N
Michigan                 83       8     4     2       Y     Y     Y
Minnesota                87       0     0     0       N     N     N
Mississippi              82      82    82    82       N     N     N
Missouri                115      60    90    90       Y     Y     Y
Table 2. Counties with Precinct Changes and States Requiring Vote Reallocation (Alabama to Missouri). Notes:
“Units” for all states are counties except AK and DE (state legislative districts) and CT, ME, MA, NH, RI, and
VT (cities and townships). KY 2018 had no statewide partisan election. CA 2018 is in production at the time of
this writing.

Precinct boundary data are most valuable when augmented with election results. After
constructing a statewide precinct map for an election of interest, we merge election results for partisan statewide
offices. This step serves as an additional validation check, as described below. Our focus on statewide partisan
offices is primarily to measure the “normal vote,” the partisanship of a precinct, district, locality, or
other geography4. The scope of offices includes U.S. President, U.S. Senator, Governor, and other statewide offices.
U.S. President is the only office appearing on all ballots in the presidential election years of 2016 and 2020. Other
offices vary with the timing of U.S. Senate elections – a third of the seats are scheduled for election every two years
– and the varied state offices whose terms expire in a given election. In a midterm election year, such as 2018, the
president does not appear on the ballot, but states tend to elect governors and other state offices. Not all states do,
however, raising the occasional possibility that a state has no partisan statewide office on the ballot. When no statewide
partisan office is on the ballot, we try to collect precinct boundaries augmented with U.S. House results. We
also include U.S. House races for single-district states since these qualify as a statewide partisan office.
We create databases of precinct-level vote totals for every partisan statewide office. We attempt to tally results
for every candidate, including write-in candidates, when available. When election officials recognize “official”
write-in candidates through a formal ballot qualification process, election officials may report their vote totals
alongside all major and minor party candidates, which allows us to report votes for each of these candidates.
Sometimes election officials may choose to report all write-in candidates in a single category, in which case we
report only the aggregate write-in tallies.
We collect election results in a suitable electronic format to merge with precinct boundaries. Most often
election officials report candidates’ votes in an electronic spreadsheet of some sort. Some states and counties
still report election results in portable document format (pdf), either generated from software or a
scanned image, requiring conversion to an electronic spreadsheet. Our experience with pdf to spreadsheet conversion
software is that we nearly always need to perform additional cleaning and reformatting.
Merging election results with precinct boundaries requires a common and unique identifier in both data
sources. Sometimes precinct identifiers in these databases are different. Often these are minor inconsistencies
resolved through visual inspection of names. Infrequently, we may contact local election officials to resolve
inconsistencies. Our preference is to produce shapefiles with precinct identifiers as they appear in election
results so that others may merge more election results data, if desired.
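
As a minimal sketch of the matching step, the following compares precinct identifiers from a boundary file against those in an election results file and lists what fails to match. The function names, the `(county, precinct)` key, and the normalization rule (uppercasing and collapsing whitespace) are assumptions for illustration; in practice remaining mismatches are resolved by inspection rather than by rule.

```python
def normalize(name):
    """Uppercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.upper().split())

def match_precincts(boundary_keys, result_keys):
    """Match (county, precinct) identifiers from two sources.

    Returns (matched, unmatched_boundaries, unmatched_results). `matched` maps
    each boundary identifier to the identifier as it appears in the results.
    """
    boundaries = {(normalize(c), normalize(p)): (c, p) for c, p in boundary_keys}
    results = {(normalize(c), normalize(p)): (c, p) for c, p in result_keys}

    matched = {boundaries[k]: results[k] for k in boundaries.keys() & results.keys()}
    unmatched_boundaries = sorted(boundaries[k] for k in boundaries.keys() - results.keys())
    unmatched_results = sorted(results[k] for k in results.keys() - boundaries.keys())
    return matched, unmatched_boundaries, unmatched_results

boundary_keys = [("Adams", "Precinct 1"), ("Adams", "precinct  2"), ("Baker", "Precinct 1")]
result_keys = [("ADAMS", "PRECINCT 1"), ("Adams", "Precinct 2"), ("Baker", "Pct 1")]
matched, no_results, no_boundary = match_precincts(boundary_keys, result_keys)
```

Here the two Adams precincts match despite differences in case and spacing, while "Precinct 1" and "Pct 1" in Baker are left for visual inspection. Keying on county and precinct together reflects that precinct names alone are not always unique within a state.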

State                 Total    Units Adjusted      Non-Precinct Vote Allocated
                      Units    2020  2018  2016    2020  2018  2016
Montana                  56       1     1     1       N     N     N
Nebraska                 93      21     0     0       Y     Y     Y
Nevada                   17       0     0     0       Y     Y     Y
New Hampshire           221       2     2     2       N     N     N
New Jersey               21       0     1     4       Y     Y     Y
New Mexico               33       6     7     7       N     N     N
New York                 62      14    23    15       Y     Y     Y
North Carolina          100       3     0     0       Y     Y     Y
North Dakota             53       8    24    24       N     N     N
Ohio                     88      18    41    45       N     N     N
Oklahoma                 77       0     0     0       Y     Y     Y
Oregon                   36       4     4     4       N     N     N
Pennsylvania             67      64    65    64       N     N     N
South Carolina           46       1     1     0       Y     Y     Y
South Dakota             66      39    40    40       Y     Y     Y
Tennessee                95       8     3     1       Y     Y     Y
Texas                   254      11    11    12       Y     Y     Y
Utah                     29       0     0     0       Y     Y     Y
Vermont                 247      25    25    25       Y     Y     Y
Virginia                133      36    32    37       Y     Y     Y
Washington               39       0     0     0       N     N     N
West Virginia            55      55    55    55       N     N     N
Wisconsin                72       0     0     0       Y     Y     Y
Wyoming                  23      10    10    10       N     N     N
U.S. Total            4,528     791   846   852      25    26    26
Table 3. Counties with Precinct Changes and States Requiring Vote Reallocation (Montana to Wyoming
and U.S. Total). Notes: “Units” for all states are counties except AK and DE (state legislative districts) and CT,
ME, MA, NH, RI, and VT (cities and townships). KY 2018 had no statewide partisan election. CA 2018 is in
production at the time of this writing.

“At-large” precincts may cover an entire state, county, or a sub-unit within a county, such as districts or localities.
Some special-purpose precincts have very small boundaries, typically consisting of the city block where
an election office is located. We typically treat these government office precincts the same as at-large precincts
in our data processing. Depending on the locality, election officials may create these precincts to report election
results for mail ballots, in-person early votes, overseas votes, provisional ballots, disability-assistance votes, or
some combination of these. Not all localities report election results by voting method in special at-large precincts,
as some may tabulate and report these votes within the voters’ home geographically-bound precincts.
When possible, we collect election results that allocate these votes to voters’ home precincts; these
reports are often available from county – not state – election officials, entailing additional data collection.
The last three columns of Tables 2 and 3 provide a sense of the extent to which we allocate at-large votes for
geographic units within a state during the 2020, 2018 and 2016 November elections. In about half the states we
apportion at least one – usually more – geographic unit’s votes. Typically, we apportion a candidate’s at-large
votes to geographically-bound precincts proportional to the candidate’s votes within the geographically-bound
precincts. For example, if Joe Biden receives 100 votes in an at-large precinct and two geographically-bound
precincts within it have 600 and 400 votes for Biden, we apportion 60 of the at-large Biden votes to the first precinct
and 40 to the second. We apportion fractions such that the largest remainders are awarded first so that the
resultant precinct counts are whole numbers and tally correctly to the county-level results.
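
The apportionment rule can be sketched in a few lines. This is an illustrative implementation of proportional allocation with largest remainders, not our production code; the function name, the tie-breaking order (ties go to the precinct listed first), and the error raised when a candidate has no votes in any geographic precinct are assumptions.

```python
def apportion_at_large(at_large_votes, precinct_votes):
    """Allocate one candidate's at-large votes across geographic precincts.

    precinct_votes maps precinct id -> the candidate's votes in that precinct.
    Each precinct first receives the whole-number part of its proportional
    share; leftover votes go one at a time to the largest remainders.
    """
    total = sum(precinct_votes.values())
    if total == 0:
        # Assumption: the source does not say how this case is handled.
        raise ValueError("candidate has no votes in any geographic precinct")

    # divmod keeps the arithmetic in integers: (whole share, remainder numerator).
    shares = {p: divmod(at_large_votes * v, total) for p, v in precinct_votes.items()}
    allocation = {p: whole for p, (whole, _) in shares.items()}

    leftover = at_large_votes - sum(allocation.values())
    by_remainder = sorted(shares, key=lambda p: shares[p][1], reverse=True)
    for p in by_remainder[:leftover]:
        allocation[p] += 1
    return allocation

biden = apportion_at_large(100, {"Precinct 1": 600, "Precinct 2": 400})
# biden == {"Precinct 1": 60, "Precinct 2": 40}
```

Because every leftover vote is assigned, the allocated votes always sum to the original at-large total, which is what allows the adjusted precinct counts to tally to the county-level results.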
An important distinction between our database and other precinct databases, such as the database produced
by the MIT Election Data and Science Lab1, is that these other data providers report at-large precincts as
separate rows and do not disaggregate to geographically-bound precincts as we do. These differing approaches
primarily reflect our respective use cases. We are most interested in measuring candidate votes cast within the
geographic bounds of a precinct, even if we must estimate votes by disaggregating votes election officials report
in at-large precincts. Our approach permits analyses that account for the political character of a geographic unit
that splits counties, such as a legislative district. Their use case is primarily to measure candidate votes within
precincts of all types, which enables analyses of voting by different methods.
When we complete our allocations, we verify our vote tallies with official county-level election reports. We
investigate discrepancies when precinct election results do not match exactly. In most cases vote tally discrepancies
reveal errors in our data production, but we encounter very rare circumstances where either precinct-level
or county-level election results are in error or incomplete. Sometimes discrepancies are by design. States may
censor small vote tallies to protect voters’ confidentiality and the secret ballot. The North Carolina State Board of
Elections adds a small amount of noise to their state’s precinct results per state law whenever a candidate receives
one hundred percent of the vote within a reporting unit and voters’ choices would be revealed.
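
A check of this kind might look like the following sketch, which sums precinct-level votes by county and candidate and reports every tally that differs from the official county figure. The function name and the input layout are hypothetical; the candidate field names in the example follow the ten-character convention used in our data records.

```python
from collections import defaultdict

def county_discrepancies(precinct_rows, official_totals):
    """Compare summed precinct votes with official county totals.

    precinct_rows: iterable of (county, candidate, votes).
    official_totals: dict mapping (county, candidate) -> official votes.
    Returns {(county, candidate): (precinct_sum, official)} for mismatches only.
    """
    sums = defaultdict(int)
    for county, candidate, votes in precinct_rows:
        sums[(county, candidate)] += votes

    mismatches = {}
    for key in set(sums) | set(official_totals):
        ours, theirs = sums.get(key, 0), official_totals.get(key, 0)
        if ours != theirs:
            mismatches[key] = (ours, theirs)
    return mismatches

rows = [
    ("Adams", "G20PREDBID", 60), ("Adams", "G20PREDBID", 40),
    ("Adams", "G20PRERTRU", 55), ("Adams", "G20PRERTRU", 44),
]
official = {("Adams", "G20PREDBID"): 100, ("Adams", "G20PRERTRU"): 100}
problems = county_discrepancies(rows, official)
```

An empty result means every county tally matches exactly. A mismatch is a starting point for investigation, since it may reflect an error on our side, an error in an official report, or deliberate censoring or noise.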
As described, there are two components to our data collection, precinct boundaries and precinct
election results. It is possible to replicate collection of precinct election results. Indeed, a team of MIT scholars
created databases of precinct election results, and our teams have shared information on our data collection
efforts1. Retrospective replication of precinct boundaries following the procedures we describe is technically possible,
but not practical. An independent team or scholar would encounter logistical difficulties reproducing retrospective
versions of our precinct boundary maps. Election officials rarely archive precinct boundary data or maps,
even when they are produced in electronic formats that would facilitate archiving. As time passes and turnover
occurs within election offices, institutional memory about precinct boundaries fades. Prospective continuation of
our data production is feasible, but is labor intensive, more so without our knowledge of where hurdles exist and
how to navigate over them.
Data Records
We post our 2016, 2018, and 2020 general election shapefiles for each of the fifty U.S. states plus the District of
Columbia on public archives. Our original repository is the Harvard Dataverse24–26. We mirror the Harvard
Dataverse archive at the data archive of the Election Lab at the University of Florida
(https://election.lab.ufl.edu/data-archive/). We post updates to our databases at both locations. Some databases in
addition to the 2016, 2018, and 2020 general election shapefiles can be found only on the Election Lab archive. These
databases include primary elections and state general elections – such as those taking place in odd-numbered years –
that are held outside federal general election years.
Shapefiles are used for GIS applications and are in practice a collection of files, one of which is a file that
includes the attributes stored in a dBase format accessible by most statistical and spreadsheet software. Statewide
precinct boundary shapefiles tend to be large, so we produce a separate shapefile for each state. This assists us
and our user base in two ways. From our end, our production workflow is to release each state as it is completed
rather than delaying release until a nationwide file is complete. This in turn enables our users to obtain data for their
states of interest as soon as we complete our work.
The attributes for each precinct record include information to uniquely identify precincts within a state.
Precinct names are not always unique among counties within a state. Precincts are named or numbered by local
election administrators, so two localities within the same state can use the same precinct
identifier, particularly when they number precincts sequentially. We standardize geographic identifiers within a
state, but do not adopt a standardized schema across states. States need as few as one geographic field (Delaware
– which embeds state legislative district identifiers in its precinct codes) and as many as fourteen fields (Georgia –
which identifies parts of precincts split by legislative districts) to uniquely identify precincts. In some
cases, these identifiers are partly redundant, with separate fields identifying a geography with a code and a
long text name. We adopt the full schema used by a state for its election results, which facilitates the merging
of election results to precinct boundaries beyond the statewide partisan offices we provide. A further complication
is that a state’s schema may change from one election to the next.
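The uniqueness problem described above can be illustrated with a minimal sketch. The field names COUNTY and PRECINCT and the sample rows are hypothetical, chosen only for illustration; the actual identifier fields differ by state and are listed in the documentation for each database.

```python
from collections import Counter

def duplicate_keys(records, key_fields):
    """Return the key tuples that occur more than once in `records`."""
    counts = Counter(tuple(rec[f] for f in key_fields) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)

# Hypothetical attribute rows: two counties that both number precincts sequentially.
precincts = [
    {"COUNTY": "Adams", "PRECINCT": "1"},
    {"COUNTY": "Adams", "PRECINCT": "2"},
    {"COUNTY": "Baker", "PRECINCT": "1"},
]

# The precinct number alone collides across counties ...
name_only = duplicate_keys(precincts, ["PRECINCT"])
# ... while the combination of county and precinct is unique.
county_and_name = duplicate_keys(precincts, ["COUNTY", "PRECINCT"])

print(name_only)        # [('1',)]
print(county_and_name)  # []
```

A check of this kind is one way a user might confirm which combination of fields forms a unique key before merging additional election results onto the boundaries.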
We identify individual candidates by a ten-character code, for example, G20PRERTRU. The first character
denotes the election type, which can be ‘G’ for a general election, ‘C’ for recount results, ‘P’ for a primary, ‘S’
for a special election, and ‘R’ for a runoff election. The second and third characters denote the last two digits of
the year of the election, ‘16,’ ‘18,’ or ‘20.’ The fourth through sixth characters reference the office code, a list of
which is provided in Table 4. The seventh character is a political party code. Major political parties are identified
as ‘D’ for Democrat and ‘R’ for Republican; codes for various minor state political parties are identified in our
documentation. The eighth through tenth characters represent the first three characters of a candidate’s name.
Unusual exceptions to our candidate schema are described in our documentation.
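As an illustration of the code layout described above, the following sketch splits a ten-character candidate code into its parts. The function and class names are hypothetical and are not part of our databases. The sketch assumes all election years fall in the 2000s and does not handle the exceptions noted in our documentation.

```python
from typing import NamedTuple

# Election type codes as described in the text.
ELECTION_TYPES = {
    "G": "general",
    "C": "recount",
    "P": "primary",
    "S": "special",
    "R": "runoff",
}

class CandidateCode(NamedTuple):
    election_type: str
    year: int
    office: str       # three-character office code (see Table 4)
    party: str        # one-character party code
    name_prefix: str  # first three characters of the candidate's name

def parse_candidate_code(code: str) -> CandidateCode:
    """Split a ten-character candidate code into its five components."""
    if len(code) != 10:
        raise ValueError(f"expected 10 characters, got {len(code)}: {code!r}")
    if code[0] not in ELECTION_TYPES:
        raise ValueError(f"unknown election type {code[0]!r} in {code!r}")
    if not code[1:3].isdigit():
        raise ValueError(f"year digits are not numeric in {code!r}")
    return CandidateCode(
        election_type=ELECTION_TYPES[code[0]],
        year=2000 + int(code[1:3]),  # assumption: elections in the 2000s
        office=code[3:6],
        party=code[6],
        name_prefix=code[7:10],
    )

parsed = parse_candidate_code("G20PRERTRU")
print(parsed)
```

For the example code G20PRERTRU, the parts are a general election, the year 2020, the office code PRE (President), the party code R (Republican), and the name prefix TRU.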
Technical Validation
We verify precinct boundaries with processes similar to those we use to draw maps from scratch
– geocoding and comparing boundaries to existing local boundaries. In addition, we compare precincts to prior
versions available to us through our work, either drawn ourselves or collected from other sources. Boundaries
that do not change when we expect they should are suspect. Election officials in rapidly growing urban areas
often create new precincts with new polling places to better meet voting demand. Election officials often conform
precinct boundaries to other local political boundaries, so we may expect new precinct boundaries
following legislative redistricting at any level of government, especially when precincts define districts for local
governments such as city or county legislatures. In the extreme, we observe precinct boundaries that appear to
be at least a decade out of date in that they are the same as those submitted to the Census Bureau as part of its
2010 Phase 2 Redistricting Data collection.
The merging of election results serves as another verification check. The number of geographically-bound
precincts with reported election results should align with the number found on a map, setting aside at-large
precincts. Precinct names may provide clues that precinct boundaries changed. Sometimes election officials split
a precinct into two or more precincts because its number of registered voters has grown to a point where
voters are better served with the creation of a new polling location. Election officials will often signify these
child precincts with a suffix of ‘A’ and ‘B’ or ‘1’ and ‘2,’ which serve as indicators of areas needing attention. In the
reverse, local election officials may also consolidate two or more precincts into one precinct, usually resulting in
the disappearance of suffixes or a precinct name. We may also detect a boundary realignment when one precinct
unexpectedly gains votes over the last election and a neighbor loses votes. Some localities name precincts after
their polling place, and name changes may – but do not always – signal new boundaries. Precinct changes due
to local annexations are not always obvious from elections data, since these relatively small adjustments do not
often result in name changes or changes in the number of precincts. For these, we collect annexation notices filed
by local governments.
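The merge check and the suffix clue described above can be sketched as follows. This is an illustrative heuristic under stated assumptions, not a description of our production workflow: the function names, the suffix pattern, and the sample identifiers are all hypothetical, and at-large precincts are assumed to have been set aside beforehand.

```python
import re

def compare_precinct_ids(result_ids, map_ids):
    """Return (ids reported in results but absent from the map,
    ids on the map but absent from the results)."""
    results, shapes = set(result_ids), set(map_ids)
    return sorted(results - shapes), sorted(shapes - results)

# Assumed suffix styles: a trailing capital letter ("12A") or a
# delimited number ("12-1", "12.2"). Real naming conventions vary by locality.
CHILD = re.compile(r"(.+?)(?:[A-Z]|[-.]\d+)")

def possible_splits(result_ids, map_ids):
    """Group unmatched result ids under an unmatched map id that looks like their parent."""
    missing_from_map, missing_from_results = compare_precinct_ids(result_ids, map_ids)
    orphans = set(missing_from_results)
    splits = {}
    for pid in missing_from_map:
        match = CHILD.fullmatch(pid)
        if match and match.group(1) in orphans:
            splits.setdefault(match.group(1), []).append(pid)
    return splits

# Hypothetical example: the map still shows precinct 12, while the
# election results report child precincts 12A and 12B.
map_ids = ["11", "12", "13"]
result_ids = ["11", "12A", "12B", "13"]

unmatched = compare_precinct_ids(result_ids, map_ids)
splits = possible_splits(result_ids, map_ids)
print(unmatched)  # (['12A', '12B'], ['12'])
print(splits)     # {'12': ['12A', '12B']}
```

A flagged parent only marks an area needing attention; whether the boundary actually changed still has to be confirmed against other sources.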
In the course of our work we have encountered oddities. One rural election office burned down along with
all its election data. Some rural counties allow voters to decide which precinct they live in and which polling
place they vote at, creating intermingled precincts that defy boundaries, so we create one precinct for the
entire county. We have discovered individuals, and even an entire neighborhood, assigned to vote in the wrong
county, which we verified with election officials. On rare occasions we identify errors in precinct boundaries
and in certified vote totals. We work with election officials to correct these issues so that overall election
administration may be improved. We have shared election maps we created with election officials who do not have
GIS capacity, so they may have accurate representations. Our work has even been included in a few localities’
2020 Phase 2 Redistricting Data Program transmissions of precinct boundaries to the U.S. Census Bureau. We strive
for perfection but know that, in working with big data, we will not catch all errors. Our large user-base (our
databases have over two hundred thousand downloads) includes thousands of mapping enthusiasts who create
election results maps for dissemination on social media, and tens of thousands of users who create DIY
redistricting plans using online mapping applications. Our users act as crowd-sourcing agents, and we welcome
and research their error reports.
Table 4. Statewide Office Codes.
Code  Office
AGR   Agriculture Commissioner
ATG   Attorney General
AUD   Auditor
COC   Corporation Commissioner
COU   City Council Member
DEL   Delegate to the U.S. House
GOV   Governor
H##   U.S. House, where ## is the district number (‘AL’ denotes at large)
INS   Insurance Commissioner
LAB   Labor Commissioner
LAN   Commissioner of Public Lands
LTG   Lieutenant Governor
PRE   President
PSC   Public Service Commissioner
RRC   Railroad Commissioner
SAC   State Appeals Court (in Alabama, Civil Appeals Court)
SCC   State Court of Criminal Appeals
SOS   Secretary of State
SSC   State Supreme Court
SPI   Superintendent of Public Instruction
TRE   Treasurer
USS   U.S. Senate

Usage Notes
Our databases are released under a Creative Commons Attribution 4.0 International license
(https://creativecommons.org/licenses/by/4.0/deed.en). Users are welcome to share and adapt our work as long as they
provide appropriate credit. Unfortunately, attribution has at times been challenging, perhaps due to the success of
our work. We have observed peer-reviewed published research attribute our work to other organizations that
re-disseminate our databases. We hope this essay will provide future users a viable and persistent citation to our
work.

Code availability
We used no customized software for our databases.
Received: 11 June 2024; Accepted: 22 October 2024;
Published: xx xx xxxx


Acknowledgements
We thank our funding supporters: the Alfred P. Sloan Foundation, the Houston Endowment, Resilient
Democracy, and individual donors to the University of Florida Foundation’s Election Science Group account.
Research assistants who assisted with data collection include Maxwell Clarke, Robert Della Salle, Karl Klarner,
Sara Loving, Evan Smith, and Mario Villegas. Michal Migurski independently provided some data assistance. We
would like to thank numerous state and local officials who kindly responded to our requests.
Author contributions
S.G. was primarily responsible for data collection and processing. B.A. was primarily responsible for some data
collection and processing, and voter file geocoding. M.M. was primarily responsible for project management and
fundraising. All authors reviewed the manuscript.

Competing interests
The authors declare no competing interests.

Additional information
Correspondence and requests for materials should be addressed to M.M.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access is article is licensed under a Creative Commons Attribution-NonCommercial-
NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribu-
tion and reproduction in any medium or format, as long as you give appropriate credit to the original author(s)
and the source, provide a link to the Creative Commons licence, and indicate if you modied the licensed mate-
rial. You do not have permission under this licence to share adapted material derived from this article or parts of
it. e images or other third party material in this article are included in the article’s Creative Commons licence,
unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative
Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use,
you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by-nc-nd/4.0/.
© e Author(s) 2024