SCIENTIFIC DATA | (2024) 11:1173 | https://doi.org/10.1038/s41597-024-04024-2
www.nature.com/scientificdata
United States Precinct Boundaries
and Statewide Partisan Election
Results
Brian Amos, Steven Gerontakis & Michael McDonald ✉
A requirement for democratic accountability is that governments report election data. In the United States there
is no national entity responsible for administering elections; responsibility devolves to sub-national state and
local election administrators. Without centralized national administration, election officials report election data
in non-standard formats that pose significant barriers to creating a unified national database1. To preserve the
secret ballot, election administrators report aggregate election results in geographic units commonly known as
states, districts, counties, townships, and precincts (the names of sub-state entities vary across the country). The
smallest of these geographies are precincts, which are on the scale of neighborhoods. Election officials use
precincts to identify the polling location at which a voter will cast an in-person vote and the offices that will appear
on the ballot2. To augment election results with other contextual data one requires their geographic boundaries3.
State, district, county, and township boundaries – and contextual data related to them – are readily available
from the U.S. Census Bureau and other sources. However, there is no nationwide database of accurate precinct
boundaries, which poses challenges to their collection and standardization. To fill this gap, we created national
precinct boundary databases enhanced with vote counts for candidates of all partisan statewide offices for the
170,198 precincts used in the 2020 U.S. November general election; the 152,217 used in 2018 (CA is in-process
at the time of this writing), and the 177,202 used in 2016.
There are notable efforts to collect precinct boundaries from the roughly 3,000 local election officials across
the country who are the primary data curators. The U.S. Census Bureau requests states provide precinct
boundaries collected from their localities so they may be included in the Census Bureau’s geographies for reporting
demographic statistics. The principal use case of these data is the redistricting of legislative districts following the
decennial census. Phase Two of the Census Bureau’s Redistricting Data Program – when the Bureau collects
precinct boundaries – takes place in a year ending in ‘7’ preceding a decennial census
(https://www.census.gov/programs-surveys/decennial-census/about/rdo.html). Unfortunately, these boundaries can be
inaccurate geographic representations of precincts used in most elections3. Precinct boundaries are time-bound, in that
election officials frequently modify precincts from one election to the next for administrative reasons. Additionally,
our quality control routinely identifies incorrect boundaries even when obtained directly from local election
1Wichita State University, Department of Political Science, Wichita, Kansas, 67260, USA. 2University of Florida,
Department of Political Science, Gainesville, Florida, 32611, USA. 3These authors contributed equally: Brian Amos,
Steven Gerontakis, Michael McDonald. ✉e-mail: michael.mcdonald@ufl.edu
officials. For these reasons, a labor-intensive effort is required to collect and verify precinct boundaries used
in each election. Following the 2010 census, a team from Harvard and Stanford universities attempted to collect
2008 general election precinct boundaries, but discontinued their efforts (https://projects.iq.harvard.edu/eda/home).
Authors of this essay contributed to that effort and seek to continue it.
Stakeholders find geographically-bound precinct election results important to understanding democratic
processes, representation, and governance. A primary use is redistricting, where stakeholders desire district
partisanship measures derived from statewide office elections4. Our databases are incorporated into online
redistricting mapping and evaluation tools, such as Dave’s Redistricting App (https://davesredistricting.org/),
DistrictBuilder (https://www.districtbuilder.org/), and PlanScore (https://planscore.org/). Numerous media
organizations use our database in their election coverage, including CNN, New York Times, Washington Post, and
Wall Street Journal. State governments used our databases for their redistricting, including New York, Ohio, and
Virginia. A related use case is voting rights litigation, where experts analyze precinct data to estimate racial
voting patterns using a technique known as ecological inference5. Among the court cases using our databases was
a successful challenge to Alabama’s congressional districts as a racial gerrymander decided by the U.S. Supreme
Court in Allen v. Milligan, 599 U.S. 1 (2023). Scholars use our databases to evaluate redistricting outcomes6,7, to
develop new partisan gerrymandering metrics and solutions8–11, to analyze effects of the U.S. Census Bureau’s
differential privacy policies12, to estimate racial voting patterns among congressional districts13, to analyze
polarization among state legislators14, to analyze Latino voting patterns15, to analyze suburban voting patterns16, to
analyze U.S. COVID policies17–19, to analyze local crime policies20, to map campaign donation patterns21, and to
augment other geographically bound databases with partisan voting data8,22.
Statewide partisan offices on the ballot vary among the states depending on the election. In a presidential
election year, all states have the presidential election on their November ballot. Appearances of other offices vary
with timing of U.S. Senate elections and state laws regarding when and what state offices are elected. As depicted
in Table 1, we report precinct election results for thirty-six offices spanning the 2016, 2018, and 2020 November
general elections, with a combined 1,736,409 cells.
Methods
We produce nationwide electronic geographic information system (GIS) databases of precincts used in the 2016,
2018, and 2020 United States November general elections, augmented with precinct-level election results for
statewide partisan offices. We on occasion produce and publicly release databases for selected elections other
than November federal elections, such as primary elections, when we have capacity to create these databases.
There are two important data production phases: the creation of a precinct boundary database for each state and
the creation of precinct-level election results that can be joined to these boundaries.
States and the federal government use various names for what are generally known as precincts. States may
also call these “wards” or “election districts.” The U.S. Census Bureau acknowledges this naming variety by
calling these geographies “voting tabulation districts” or VTDs. We use the common name “precincts” to refer
to any such small geographic boundaries used by election officials for managing elections and reporting
election results. Our naming convention includes smaller geographies created when election officials occasionally
report election results for precincts split by legislative district boundaries. Likewise, the local government
that manages federal and state elections is generally what is known as a county. States may have other names
for county-equivalents, such as “parishes,” and sub-county governments such as “townships” or “districts” may
administer elections. We use the common name “county” to refer to any local government responsible for
maintaining precinct boundaries and reporting election results.
There is no standard format for precinct boundary data. Frequently, governments
publish precinct boundaries in an electronic format known as shapefiles – a proprietary GIS format developed
by software vendor ESRI, and since opened for general use. We prefer election officials to provide a shapefile (or
any similar electronic GIS format; hereafter we simply call all GIS formats “shapefile”) since this allows us to
edit boundaries if we detect and verify errors. We obtain shapefiles from the U.S. Census Bureau, state election
officials, and local governments. We ultimately convert all boundary data we ingest into the shapefile format for
re-dissemination. This approach allows us to attach precincts’ election results to the boundary data as attributes
of the precinct shapes.
The second-most frequent map format election officials provide us is a map image. Sometimes a map image
is itself created with GIS software, but election officials may be unable to provide us with the shapefile. Usually
this is because election officials did not create the map; another local government agency or an outside vendor
created it. When possible we navigate local government contacts or submit open records requests to obtain the
canonical shapefile. These steps are not always fruitful, as the outside agent may charge a fee or be otherwise
unable to provide the shapefile. Not all map images are created by GIS software. Sometimes election officials will
provide a hand-drawn map. Sometimes county officials cannot provide an electronic image file, in which case
election officials may take a cell-phone picture of their map and forward it to us.
The third-most frequent map format election officials provide us with is a written or verbal description. We
encounter this situation in rural counties or small townships that have few precincts and little GIS capacity.
While this may seem antiquated, in reality election officials do not commonly manage which precincts their
registered voters are assigned to using electronic maps. Instead, they manage their voter registration databases
using master street address files that identify which precinct each street address range is located in (e.g.,
even-numbered 100–198 Main Street is associated with precinct 1). When we geocode voter registration files during
our quality assurance protocols we may detect errors in these master street address files when we observe a
street range assigned to one precinct while surrounding neighbors are assigned to another23. We have consulted
with the Colorado and Virginia state governments to identify and rectify these errors, and alerted numerous
localities elsewhere of potential issues. These errors pose a dilemma as to which boundaries to report. Since our
use case involves the geographic location of voters, we generally opt to generate modified precinct boundaries
that include these verified assignment errors.
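As an illustration of how a master street address file ties address ranges to precincts, the following is a minimal sketch. The record layout (`StreetRange` and its fields) and the function `lookup_precinct` are hypothetical names chosen for this example; they are not the schema used by any election office or by our production code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class StreetRange:
    """One row of a hypothetical master street address file."""
    street: str    # street name, stored upper-case
    low: int       # lowest house number in the range
    high: int      # highest house number in the range
    parity: str    # "even", "odd", or "both"
    precinct: str  # precinct assigned to every address in the range


def lookup_precinct(ranges: Iterable[StreetRange],
                    street: str, number: int) -> Optional[str]:
    """Return the precinct for a street address, or None if no range matches."""
    wanted = street.upper()
    for r in ranges:
        if r.street != wanted or not (r.low <= number <= r.high):
            continue
        if r.parity == "even" and number % 2 != 0:
            continue
        if r.parity == "odd" and number % 2 != 1:
            continue
        return r.precinct
    return None


# The example from the text: even-numbered 100-198 Main Street is precinct 1.
# The odd-numbered range and its precinct are invented for illustration.
ranges = [
    StreetRange("MAIN STREET", 100, 198, "even", "1"),
    StreetRange("MAIN STREET", 101, 199, "odd", "2"),
]
print(lookup_precinct(ranges, "Main Street", 150))
```

A quality check of the kind described above would compare the precinct returned for a geocoded address with the precincts of its geographic neighbors and flag ranges that disagree.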
Lastly, on rare occasions local officials do not respond to our multiple requests for their precinct boundaries (in extreme
cases we have made repeated contact attempts spanning more than a year). In these cases we rely upon other
information that provides clues as to the precinct boundaries, such as geocoded voter registration addresses with
precinct identifiers, or other local districts and governmental units that local election officials align precinct
boundaries with (depending on state and local practices). A detail we frequently wrestle with is local
municipality annexations in localities that require municipal and precinct boundaries to coincide. We often rely on
published annexation reports to verify the correctness of the precinct boundaries. When we detect conflicts, we
contact local election officials to resolve them.
We create our own GIS precinct maps or modify shapefiles that we verify have errors. We nearly always have
a reference precinct map to start from that we obtained from the U.S. Census Bureau or a shapefile created by us
from a prior election. When we do not have any initial resource, we use a machine learning algorithm that we created
to generate a starter precinct map from census geography as a basemap using geocoded voter registration addresses
and their associated precincts. Automation rarely yields usable precinct boundaries without extensive additional
editing, since geocodes are imprecise, particularly in rural areas23. Once we have a base map, we examine
ancillary data to create the most accurate representation that we can.
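The idea of seeding a starter map from geocoded voters can be sketched as follows. This is not the machine learning algorithm referred to above, whose details are not given here; it is a much simpler plurality rule, shown only to make the inputs and outputs concrete. The function name, the input layout, and the 0.9 review threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def starter_block_assignment(
    geocoded_voters: Iterable[Tuple[str, str]],
) -> Dict[str, Tuple[str, float]]:
    """Assign each census block to the precinct most of its voters report.

    geocoded_voters yields (block_id, precinct_id) pairs, one per voter.
    Returns {block_id: (precinct_id, share_of_voters_agreeing)}.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for block_id, precinct_id in geocoded_voters:
        counts[block_id][precinct_id] += 1

    assignment = {}
    for block_id, tally in counts.items():
        precinct, votes = tally.most_common(1)[0]
        assignment[block_id] = (precinct, votes / sum(tally.values()))
    return assignment


def blocks_needing_review(assignment, threshold=0.9):
    """Blocks where voters disagree enough that a person should look."""
    return sorted(b for b, (_, share) in assignment.items() if share < threshold)


voters = [("b1", "P1"), ("b1", "P1"), ("b1", "P2"), ("b2", "P2")]
result = starter_block_assignment(voters)
print(result)
print(blocks_needing_review(result))
```

Blocks with mixed precinct labels may reflect imprecise geocodes, a precinct line that splits the block, or an error in the address file, which is one reason automated output of this kind needs manual editing. Blocks with no registered voters receive no assignment at all under this rule.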
In Tables 2 and 3 we present statistics drawn from our extensive documentation to provide a sense of the scope
of our work. Table 2 presents the states from Alabama to Missouri and Table 3 presents Montana to Wyoming,
Office 2016 2018 2020
Attorney General 34108 100486 34551
Auditor 24490 36862 24743
Chief Financial Officer 143 6152 144
City Council Member 143
Clerk of the Supreme Court 672
Commissioner of Agriculture 4425 24086 4333
Commissioner of Insurance 10763 8660 10982
Commissioner of Labor 2704 4609 2662
Commissioner of Public Lands 7197 13045 7464
Commissioner of School and Public Lands 737
Comptroller 10090 39439
Corporation Commissioner 1951 1948
Council Chairman 143
Delegate to the U.S. House 143 143 144
Governor 26124 114083 25247
Lieutenant Governor 13939 21865 14577
President 177202 170198
Mayor 143
Public Service Commissioner 5129 5074 5073
Public Utilities Commissioner 747 737 737
Railroad Commissioner 8832 8936 9014
Secretary of Commonwealth 2173
Secretary of State 18759 72828 17811
State Appeals Court 4196 6190 6551
State Board of Education 4812 4801 4756
State Controller 3004
State Court of Criminal Appeals 8832 10928 10986
State Mine Inspector 1489
State Supreme Court 15028 15126 15565
State University Regent 3010 3136
Superintendent of Public Instruction 8771 9761 3328
Tax Commissioner 424
Treasurer 28473 61680 29141
U.S. House 5688 3618 3609
U.S. Senate 115565 112084 78044
University Board of Regents/Trustees/Governors 4812 4801 4756
Total 177202 152717 170198
Table 1. Statewide Office Precinct Count. Note: CA 2018 is in production at the time of this writing.
with totals for the entire United States. The first column presents the number of geographical “units” within each
state, which are counties for all states except those that use state legislative districts to group precincts (Alaska
and Delaware) and those that use cities or towns (Connecticut, Maine, Massachusetts, New Hampshire, Rhode
Island, and Vermont). In the next three columns we present the number of geographical units where, for the
2020, 2018, and 2016 November elections, we edited at least one precinct boundary by splitting a precinct into
two or more parts, merging it with another precinct, or drawing new precincts. In a given election, we edit at
least one precinct – and often more – in about one-sixth of the geographical units. These counts cover only the
geographic units where we make changes; our quality assurance protocols check all precincts, whether or not they
require further editing.
In seven states we did not modify any precincts across all three election years (Alaska, Hawaii, Maryland,
Nevada, Oklahoma, Utah, and Wisconsin). State election officials in all of these states – except Nevada – provide
a statewide precinct shapefile. Indeed, nearly all states have a statewide map either disseminated by the state or
available from the Census Bureau’s Phase 2 redistricting data program production. We welcome a statewide
precinct map, but its existence does not guarantee our mapping work is done. We often find issues – errors or
out-of-date boundaries – that we resolve by collecting precinct maps from localities.
Tables 2 and 3 under-represent the breadth of our mapping work. We do not count certain tasks that we perform
regularly. We do not count adjusting precincts for city and town annexations in states where precincts do not
cross local government boundaries. When we collect maps from local governments, electronic representations
of local governments’ boundaries may not nicely conform with one another due to technical GIS details, such as
differing map projections. We do not count ensuring precinct boundaries from different sources do not result in
overlapping precincts – and, in rare cases, do not extend outside state boundaries. We do not count where we
created precinct maps from sources other than those obtained from election officials such as local land parcel
maps, which require extra effort to identify as a source for precinct boundaries and to collect. We do not count
when we create electronic representations of local governments’ precincts that localities adopt as their official
precinct map.
Precinct boundary data are most valuable when augmented with election results. After
constructing a statewide precinct map for an election of interest, we merge election results for partisan statewide
                        Total  Units Adjusted        Non-Precinct Vote Allocated
State                   Units  2020  2018  2016      2020  2018  2016
Alabama                    67    67    67    67      Y     Y     Y
Alaska                     20     0     0     0      Y     Y     Y
Arizona                    15     3     3     2      N     N     N
Arkansas                   75     9    26    36      N     Y     Y
California                 58     6     –     0      N     –     N
Colorado                   64     7     6     7      N     N     N
Connecticut               169    55    56    57      N     N     N
Delaware                   21     4     3     0      Y     Y     Y
District of Columbia        1     0     1     1      N     N     N
Florida                    67    29    28    34      Y     Y     N
Georgia                   159    20     9     5      N     N     N
Hawaii                      5     0     0     0      N     N     N
Idaho                      44    16    18    14      Y     Y     Y
Illinois                  102     2     4    11      Y     N     Y
Indiana                    92    43    56    56      N     Y     Y
Iowa                       99    12     7     6      N     N     N
Kansas                    105    21    20    11      N     N     N
Kentucky                  120     0     –    10      Y     –     Y
Louisiana                  64     7     5     5      Y     Y     Y
Maine                     425     3     3     0      Y     Y     Y
Maryland                   24     0     0     0      N     Y     Y
Massachusetts             351    16    19    15      N     N     N
Michigan                   83     8     4     2      Y     Y     Y
Minnesota                  87     0     0     0      N     N     N
Mississippi                82    82    82    82      N     N     N
Missouri                  115    60    90    90      Y     Y     Y
Table 2. Counties with Precinct Changes and States Requiring Vote Reallocation (Alabama to Missouri). Notes:
“Units” for all states are counties except AK and DE (state legislative districts) and CT, ME, MA, NH, RI, and
VT (cities and townships). KY 2018 had no statewide partisan election. CA 2018 is in production at the time of
this writing.
offices. This step serves as an additional validation check, as described below. Our focus on statewide partisan
offices is primarily to measure the “normal vote,” the partisanship of a precinct, district, locality, or
other geography4. The scope of offices includes U.S. President, U.S. Senator, Governor, and other statewide offices.
U.S. President is the only office appearing on all ballots in the presidential election years of 2016 and 2020. Other
offices vary with the timing of U.S. Senate elections – a third of the seats are scheduled for election every two years
– and the varied state offices whose terms expire in a given election. In a midterm election year, such as 2018, the
president does not appear on the ballot, but states tend to elect governors and other state offices. Not all states do,
however, raising the occasional possibility a state has no partisan statewide office on the ballot. When no
statewide partisan office is on the ballot, we try to collect precinct boundaries augmented with U.S. House results. We
also include U.S. House races for single-district states since these qualify as a statewide partisan office.
We create databases of precinct-level vote totals for every partisan statewide office. We attempt to tally results
for every candidate, including write-in candidates, when available. When election officials recognize “official”
write-in candidates through a formal ballot qualification process, election officials may report their vote totals
alongside all major and minor party candidates, which allows us to report votes for each of these candidates.
Sometimes election officials may choose to report all write-in candidates in a single category, in which case we
report only the aggregate write-in tallies.
We collect election results in a suitable electronic format to merge with precinct boundaries. Most often
election officials report candidates’ votes in an electronic spreadsheet of some sort. Some states and counties
still report election results in portable document format (pdf), either generated from software or a
scanned image, requiring conversion to an electronic spreadsheet. Our experience with pdf to spreadsheet
conversion software is that we nearly always need to perform additional cleaning and reformatting.
Merging election results with precinct boundaries requires a common and unique identifier in both data
sources. Sometimes precinct identifiers in these databases are different. Often these are minor inconsistencies
resolved through visual inspection of names. Infrequently, we may contact local election officials to resolve
inconsistencies. Our preference is to produce shapefiles with precinct identifiers as they appear in election
results so that others may merge more election results data, if desired.
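The matching step can be sketched as below: identifiers from the two sources are normalized, matched, and the leftovers are listed for visual inspection. The normalization rules (upper-casing, treating hyphens as spaces, collapsing whitespace) and the function names are assumptions made for this example; real inconsistencies vary by state and often need manual resolution.

```python
from typing import Dict, Iterable, List, Tuple


def normalize_id(name: str) -> str:
    """Reduce trivial formatting differences between two spellings of a precinct."""
    return " ".join(name.upper().replace("-", " ").split())


def match_precincts(
    boundary_ids: Iterable[str], result_ids: Iterable[str]
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Match boundary identifiers to election-result identifiers.

    Returns (matched, boundary_only, results_only), where matched maps the
    boundary spelling to the election-results spelling.
    """
    boundaries = {normalize_id(x): x for x in boundary_ids}
    results = {normalize_id(x): x for x in result_ids}

    shared = boundaries.keys() & results.keys()
    matched = {boundaries[k]: results[k] for k in shared}
    boundary_only = sorted(boundaries[k] for k in boundaries.keys() - shared)
    results_only = sorted(results[k] for k in results.keys() - shared)
    return matched, boundary_only, results_only


matched, boundary_only, results_only = match_precincts(
    ["Ward 1-A", "Ward 2"], ["WARD 1 A", "Ward 3"]
)
print(matched, boundary_only, results_only)
```

The two leftover lists are the cases that call for visual inspection or a call to the local election office. This sketch assumes normalized identifiers are unique within the area being matched; where they are not, the county or other geographic fields would need to be part of the key.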
“At-large” precincts may cover an entire state, county, or a sub-unit within a county, such as districts or
localities. Some special-purpose precincts have very small boundaries, typically consisting of the city block where
an election office is located. We typically treat these government office precincts the same as at-large precincts
in our data processing. Depending on the locality, election officials may create these precincts to report election
                        Total  Units Adjusted        Non-Precinct Vote Allocated
State                   Units  2020  2018  2016      2020  2018  2016
Montana                    56     1     1     1      N     N     N
Nebraska                   93    21     0     0      Y     Y     Y
Nevada                     17     0     0     0      Y     Y     Y
New Hampshire             221     2     2     2      N     N     N
New Jersey                 21     0     1     4      Y     Y     Y
New Mexico                 33     6     7     7      N     N     N
New York                   62    14    23    15      Y     Y     Y
North Carolina            100     3     0     0      Y     Y     Y
North Dakota               53     8    24    24      N     N     N
Ohio                       88    18    41    45      N     N     N
Oklahoma                   77     0     0     0      Y     Y     Y
Oregon                     36     4     4     4      N     N     N
Pennsylvania               67    64    65    64      N     N     N
South Carolina             46     1     1     0      Y     Y     Y
South Dakota               66    39    40    40      Y     Y     Y
Tennessee                  95     8     3     1      Y     Y     Y
Texas                     254    11    11    12      Y     Y     Y
Utah                       29     0     0     0      Y     Y     Y
Vermont                   247    25    25    25      Y     Y     Y
Virginia                  133    36    32    37      Y     Y     Y
Washington                 39     0     0     0      N     N     N
West Virginia              55    55    55    55      N     N     N
Wisconsin                  72     0     0     0      Y     Y     Y
Wyoming                    23    10    10    10      N     N     N
U.S. Total              4,528   791   846   852      25    26    26
Table 3. Counties with Precinct Changes and States Requiring Vote Reallocation (Montana to Wyoming
and U.S. Total). Notes: “Units” for all states are counties except AK and DE (state legislative districts) and CT,
ME, MA, NH, RI, and VT (cities and townships). KY 2018 had no statewide partisan election. CA 2018 is in
production at the time of this writing.
results for mail ballots, in-person early votes, overseas votes, provisional ballots, disability-assistance votes, or
some combination of these. Not all localities report election results by voting methods in special at-large
precincts, as some may tabulate and report these votes within the voters’ home geographically-bound precincts.
When possible, we collect election results that allocate these votes to voters’ home precincts and often these
reports are available from county – not state – election officials, entailing additional data collection.
The last three columns of Tables 2 and 3 provide a sense of the extent to which we allocate at-large votes for
geographic units within a state during the 2020, 2018 and 2016 November elections. In about half the states we
apportion at least one – usually more – geographic unit’s votes. Typically, we apportion a candidate’s at-large
votes to geographically-bound precincts proportional to the candidate’s votes within the geographically-bound
precincts. For example, if Joe Biden receives 100 votes in an at-large precinct and two geographically-bound
precincts within it have 600 and 400 votes for Biden, we apportion 60 of the at-large Biden votes to the first
precinct and 40 to the second. We apportion fractions such that the largest remainders are awarded first so that the
resultant precinct counts are whole numbers and tally correctly to the county-level results.
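The proportional allocation with largest remainders can be written compactly. The following is a minimal sketch of that general technique, not our production code: the function name is hypothetical, ties between equal remainders are broken alphabetically by precinct identifier (an assumption, since the tie rule is not specified above), and the case where a candidate has no votes in any geographically-bound precinct is simply rejected.

```python
from typing import Dict


def apportion_at_large(at_large_votes: int,
                       precinct_votes: Dict[str, int]) -> Dict[str, int]:
    """Split one candidate's at-large votes across precincts.

    Each precinct gets a share proportional to the candidate's votes there.
    Whole votes are assigned first; the votes left over go one at a time to
    the precincts with the largest fractional remainders, so the result is
    made of whole numbers that sum exactly to at_large_votes.
    """
    if at_large_votes < 0:
        raise ValueError("at_large_votes must be non-negative")
    total = sum(precinct_votes.values())
    if total == 0:
        raise ValueError("candidate has no votes in any precinct to weight by")

    # Integer arithmetic avoids floating-point rounding surprises:
    # the exact share of precinct p is (at_large_votes * v_p) / total.
    numerators = {p: at_large_votes * v for p, v in precinct_votes.items()}
    allocation = {p: n // total for p, n in numerators.items()}
    remainders = {p: n % total for p, n in numerators.items()}

    leftover = at_large_votes - sum(allocation.values())
    by_remainder = sorted(remainders, key=lambda p: (-remainders[p], p))
    for p in by_remainder[:leftover]:
        allocation[p] += 1
    return allocation


# The example from the text: 100 at-large votes, precincts with 600 and 400.
print(apportion_at_large(100, {"first": 600, "second": 400}))
```

Because every leftover vote is handed out, the precinct totals plus the allocated at-large votes always sum to the county-level figure for that candidate.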
An important distinction between our database and other precinct databases, such as the database
produced by the MIT Election Data and Science Lab1, is that these other data providers report at-large precincts as
separate rows and do not disaggregate to geographically-bound precincts as we do. These differing approaches
primarily reflect our respective use cases. We are most interested in measuring candidate votes cast within the
geographic bounds of a precinct, even if we must estimate votes by disaggregating votes election officials report
in at-large precincts. Our approach permits analyses that account for the political character of a geographic unit
that splits counties, such as a legislative district. Their use case is primarily to measure candidate votes within
precincts of all types, which enables analyses of voting by different methods.
When we complete our allocations, we verify our vote tallies with official county-level election reports. We
investigate discrepancies when precinct election results do not match exactly. In most cases vote tally
discrepancies reveal errors in our data production, but we encounter very rare circumstances where either precinct-level
or county-level election results are in error or incomplete. Sometimes discrepancies are by design. States may
censor small vote tallies to protect voters’ confidentiality and the secret ballot. The North Carolina State Board of
Elections adds a small amount of noise to their state’s precinct results per state law whenever a candidate receives
one hundred percent of the vote within a reporting unit and voters’ choices would be revealed.
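A check of this kind amounts to summing precinct rows by county and candidate and comparing against the official county figures. The sketch below shows one way to do that; the function name, the input layout, and all vote counts are invented for illustration, and the candidate field names merely follow the ten-character pattern described in the Data Records section.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # (county, candidate field)


def tally_discrepancies(
    precinct_rows: Iterable[Tuple[str, str, int]],
    county_totals: Dict[Key, int],
) -> Dict[Key, Tuple[int, int]]:
    """Compare summed precinct votes with official county totals.

    precinct_rows yields (county, candidate_field, votes).
    Returns {(county, candidate_field): (precinct_sum, official_total)}
    for every pair that does not match exactly.
    """
    sums: Dict[Key, int] = defaultdict(int)
    for county, candidate, votes in precinct_rows:
        sums[(county, candidate)] += votes

    mismatches = {}
    for key in set(sums) | set(county_totals):
        ours, official = sums.get(key, 0), county_totals.get(key, 0)
        if ours != official:
            mismatches[key] = (ours, official)
    return mismatches


rows = [
    ("County X", "G20PREDBID", 60),
    ("County X", "G20PREDBID", 40),
    ("County X", "G20PRERTRU", 30),
]
official = {("County X", "G20PREDBID"): 100, ("County X", "G20PRERTRU"): 31}
print(tally_discrepancies(rows, official))
```

An exact-match rule is deliberately strict. Where a state censors small tallies or adds noise by design, the flagged differences are expected and would be documented rather than corrected.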
As described, there are two components to our data collection, precinct boundaries and precinct
election results. It is possible to replicate collection of precinct election results. Indeed, a team of MIT scholars
created databases of precinct election results, and our teams have shared information on our data collection
efforts1. Retrospective replication of precinct boundaries following the procedures we describe is technically
possible, but not practical. An independent team or scholar would encounter logistical difficulties reproducing
retrospective versions of our precinct boundary maps. Election officials rarely archive precinct boundary data or maps,
even when they are produced in electronic formats that would facilitate archiving. As time passes and turnover
occurs within election offices, institutional memory about precinct boundaries fades. Prospective continuation of
our data production is feasible, but is labor intensive, more so without our knowledge of where hurdles exist and
how to navigate over them.
Data Records
We post our 2016, 2018, and 2020 general election shapefiles for each of the fifty U.S. states plus the District of
Columbia on public archives. Our original repository is the Harvard Dataverse24–26. We mirror the Harvard
Dataverse archive at the Election Lab at the University of Florida data archive
(https://election.lab.ufl.edu/data-archive/). We post updates to our databases at both locations. Some databases in
addition to the 2016, 2018, and 2020 general election shapefiles can be found only on the Election Lab archive. These
databases include primary elections and state general elections – such as those taking place in odd-numbered years –
that are held outside federal general election years.
Shapefiles are used for GIS applications and are in practice a collection of files, one of which is a file that
includes the attributes stored in a dBase format accessible by most statistical and spreadsheet software. Statewide
precinct boundary shapefiles tend to be large, so we produce a separate shapefile for each state. This assists us
and our user-base two-fold. From our end, our production workflow is to release each state as it is completed
rather than delaying release until a nationwide file is complete. This enables our users to obtain data for their
states of interest as soon as we complete our work.
The attributes for each precinct record include information to uniquely identify precincts within a state.
Precinct names are not always unique among counties within a state. Precincts are named or numbered by local
election administrators and it is thus possible two localities within the same state can use the same precinct
identifier, particularly when they sequentially number precincts. We standardize geographic identifiers within a
state, but do not adopt a standardized schema across states. States have as few as one geographic field (Delaware
– which embeds state legislative district identifiers in its precinct codes) and up to fourteen fields (Georgia –
which identifies parts of precincts split by legislative districts) needed to uniquely identify precincts. In some
cases, these identifiers are duplicates to a degree, with separate fields identifying a geography with a code and a
long text name. We adopt the full schema used by a state for their election results, which facilitates the merging
of election results to precinct boundaries beyond the statewide partisan offices we provide. A further complication
is that data schemas may change from one election to the next.
We identify individual candidates by a ten-character code, for example, G20PRERTRU. The first character
denotes the election type, which can be ‘G’ for a general election, ‘C’ for recount results, ‘P’ for a primary, ‘S’
for a special election, and ‘R’ for a runoff election. The second and third characters denote the last two digits of
the year of the election, ‘16,’ ‘18,’ or ‘20.’ The fourth through sixth characters reference the office code, a list of
which is provided in Table 4. The seventh character is a political party code. Major political parties are identified
as ‘D’ for Democrat and ‘R’ for Republican; codes for various minor state political parties are identified in our
documentation. The eighth through tenth characters represent the first three characters of a candidate’s name.
Unusual exceptions to our candidate schema are described in our documentation.
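The ten-character layout lends itself to a small parser. The sketch below follows the positions described above; the names `CandidateField` and `parse_candidate_code` are hypothetical, the expansion of the two-digit year to 20xx is an assumption that holds for the 2016 to 2020 data, and the exceptions noted in our documentation are not handled.

```python
from typing import NamedTuple

# Election-type codes as listed in the text.
ELECTION_TYPES = {
    "G": "general",
    "C": "recount",
    "P": "primary",
    "S": "special",
    "R": "runoff",
}


class CandidateField(NamedTuple):
    election_type: str
    year: int
    office: str       # three-character office code, e.g. "PRE"
    party: str        # one-character party code, e.g. "D" or "R"
    name_prefix: str  # first three characters of the candidate's name


def parse_candidate_code(code: str) -> CandidateField:
    """Split a ten-character candidate field name into its parts."""
    if len(code) != 10:
        raise ValueError(f"expected 10 characters, got {len(code)}: {code!r}")
    if code[0] not in ELECTION_TYPES:
        raise ValueError(f"unknown election type {code[0]!r} in {code!r}")
    if not code[1:3].isdigit():
        raise ValueError(f"year digits expected at positions 2-3 in {code!r}")
    return CandidateField(
        election_type=ELECTION_TYPES[code[0]],
        year=2000 + int(code[1:3]),  # assumption: all years fall in 20xx
        office=code[3:6],
        party=code[6],
        name_prefix=code[7:10],
    )


print(parse_candidate_code("G20PRERTRU"))
```

Applied to the attribute column names of a state file, a parser like this makes it straightforward to select, for instance, every general-election field for a given office.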
Technical Validation
We utilize processes to verify precinct boundary correctness similar to those we use to draw maps from scratch
– geocoding and comparing boundaries to existing local boundaries. In addition, we compare precincts to prior
versions available to us through our work, either drawn ourselves or collected from other sources. Boundaries
that do not change when we expect they should are suspect. Election officials in rapidly growing urban areas
often create new precincts with new polling places to better meet voting demand. Election officials often
conform precinct boundaries with other local political boundaries, so we may expect new precinct boundaries
following legislative redistricting at any level of government, especially when precincts define districts for local
governments such as city or county legislatures. In the extreme, we observe precinct boundaries that appear to
be at least a decade out of date in that they are the same as those submitted to the Census Bureau as part of their
2010 Phase 2 Redistricting Data collection.
The merging of election results serves as another verification check. The number of geographically-bound
precincts with reported election results should align with the number found on a map, setting aside at-large
precincts. Precinct names may provide clues that precinct boundaries changed. Sometimes election officials split
precincts into two or more precincts because a precinct’s number of registered voters has grown to a point where
voters are better served with the creation of a new polling location. Election officials will often signify these
child precincts with a suffix of ‘A’ and ‘B’ or ‘1’ and ‘2,’ which serve as indicators of areas needing attention. In the
reverse, local election officials may also consolidate two or more precincts into one precinct, usually resulting in
the disappearance of suffixes or a precinct name. We may also detect a boundary realignment when one precinct
unexpectedly gains votes over the last election and a neighbor loses votes. Some localities name precincts after
their polling place, and name changes may – but do not always – signal new boundaries. Precinct changes due
to local annexations are not always obvious from elections data, since these relatively small adjustments do not
often result in name changes or changes in the number of precincts. For these, we collect annexation notices filed
by local governments.
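The name-based checks described above can be sketched in a few lines. The example below is a hypothetical illustration, not the authors' procedure: the function names `unmatched_names` and `flag_possible_splits`, the sample precinct names, and the rule that a child name extends its parent by one or two characters are all assumptions made for the sketch. Flagged precincts are only candidates for manual review.

```python
def unmatched_names(results_names, map_names):
    """Return precinct names found only in the results and only on the map.

    At-large precincts have no geography, so they are expected to appear
    in the first list and can be set aside by the reviewer.
    """
    results, shapes = set(results_names), set(map_names)
    return sorted(results - shapes), sorted(shapes - results)


def flag_possible_splits(previous_names, current_names):
    """Flag precincts that vanished and may have been split into children.

    A vanished parent is flagged when at least two new names extend it by
    a short suffix, such as '12' becoming '12A' and '12B'. The one-or-two
    character suffix rule is an assumption for this sketch.
    """
    previous, current = set(previous_names), set(current_names)
    new_names = current - previous
    flagged = {}
    for parent in sorted(previous - current):
        children = sorted(
            name for name in new_names
            if name.startswith(parent) and 1 <= len(name) - len(parent) <= 2
        )
        if len(children) >= 2:
            flagged[parent] = children
    return flagged


# Hypothetical precinct names from two consecutive elections.
print(flag_possible_splits(["12", "13"], ["12A", "12B", "13"]))
print(unmatched_names(["1", "2", "AT-LARGE"], ["1", "2", "3"]))
```

Calling `flag_possible_splits` with the two arguments swapped flags the reverse case, where suffixed precincts disappear and a single consolidated name takes their place.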
In the course of our work we have encountered oddities. One rural election office burned down along with
all its election data. Some rural counties allow voters to decide which precinct they live in, and which polling
place they will vote at, creating intermingled precincts that defy boundaries; for these, we create one precinct for the
entire county. We’ve discovered individuals and even an entire neighborhood assigned to vote in the wrong
county, which we verified with election officials. On rare occasions we identify errors in precinct boundaries
and in certified vote totals. We work with election officials to correct these issues so that overall election
administration may be improved. We’ve shared election maps we’ve created with election officials who do not have
GIS capacity, so they may have accurate representations. Our work has even been included in a few localities’
2020 Phase 2 Redistricting Data Program transmission of precinct boundaries to the U.S. Census. We strive for
perfection but know the reality of working with big data is we will not catch all errors. Our large user-base (our
databases have over two hundred thousand downloads) includes thousands of mapping enthusiasts who create
election results maps for dissemination on social media. They include tens of thousands of users who create DIY
redistricting plans using online mapping applications. Our users act as crowd-sourcing agents, and we welcome
and research their error reports.
Code Office
AGR Agriculture Commissioner
ATG Attorney General
AUD Auditor
COC Corporation Commissioner
COU City Council Member
DEL Delegate to the U.S. House
GOV Governor
H## U.S. House, where ## is the district number (‘AL’ denotes at large)
INS Insurance Commissioner
LAB Labor Commissioner
LAN Commissioner of Public Lands
LTG Lieutenant Governor
PRE President
PSC Public Service Commissioner
RRC Railroad Commissioner
SAC State Appeals Court (in Alabama, Civil Appeals Court)
SCC State Court of Criminal Appeals
SOS Secretary of State
SSC State Supreme Court
SPI Superintendent of Public Instruction
TRE Treasurer
USS U.S. Senate
Table 4. Statewide Office Codes.
Our databases are released under a Creative Commons Attribution 4.0 International license
(https://creativecommons.org/licenses/by/4.0/deed.en). Users are welcome to share and adapt our work as long as they
provide appropriate credit. Unfortunately, attribution has at times been challenging, perhaps due to the success of
our work. We have observed peer-reviewed published research attribute our work to other organizations that
re-disseminate our databases. We hope this essay will provide future users a viable and persistent citation to our
work.
We used no customized software for our databases.
Received: 11 June 2024; Accepted: 22 October 2024;
Published: xx xx xxxx
We thank our funding supporters: the Alfred P. Sloan Foundation, the Houston Endowment, Resilient
Democracy, and individual donors to the University of Florida Foundation’s Election Science Group account.
Research assistants who assisted with data collection include Maxwell Clarke, Robert Della Salle, Karl Klarner,
Sara Loving, Evan Smith, and Mario Villegas. Michal Migurski independently provided some data assistance. We
would like to thank numerous state and local officials who kindly responded to our requests.
Author contributions
S.G. was primarily responsible for data collection and processing. B.A. was primarily responsible for some data
collection and processing, and voter file geocoding. M.M. was primarily responsible for project management and
fundraising. All authors reviewed the manuscript.
The authors declare no competing interests.
Correspondence and requests for materials should be addressed to M.M.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License, which permits any non-commercial use, sharing, distribution
and reproduction in any medium or format, as long as you give appropriate credit to the original author(s)
and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed
material. You do not have permission under this licence to share adapted material derived from this article or parts of
it. The images or other third party material in this article are included in the article’s Creative Commons licence,
unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative
Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use,
you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by-nc-nd/4.0/.
© The Author(s) 2024