
Event-level prediction of urban crime reveals a signature of enforcement bias in US cities

Authors: Victor Rotaru1,2, Yi Huang1, Timmy Li1,2, James Evans3,4,5 and Ishanu Chattopadhyay1,4,6 ✉

Abstract and Figures

Policing efforts to thwart crime typically rely on criminal infraction reports, which implicitly manifest a complex relationship between crime, policing and society. As a result, crime prediction and predictive policing have stirred controversy, with the latest artificial intelligence-based algorithms producing limited insight into the social system of crime. Here we show that, while predictive models may enhance state power through criminal surveillance, they also enable surveillance of the state by tracing systemic biases in crime enforcement. We introduce a stochastic inference algorithm that forecasts crime by learning spatio-temporal dependencies from event reports, with a mean area under the receiver operating characteristic curve of ~90% in Chicago for crimes predicted per week within ~1,000 ft. Such predictions enable us to study perturbations of crime patterns that suggest that the response to increased crime is biased by neighbourhood socio-economic status, draining policy resources from socio-economically disadvantaged areas, as demonstrated in eight major US cities.

Rotaru et al. introduce a transparent crime forecasting algorithm that reveals inequities in police enforcement and suggests an enforcement bias in eight US cities.
Predictive performance of Granger networks. a,b, Out-of-sample AUC for predicting violent (a) and property (b) crimes. The prediction is made 1 week in advance, and the event is registered as a successful prediction if we get a hit within ±1 day of the predicted date. c, Distribution of AUC, on average and individually for violent and property crimes. Our mean AUC is close to 90%. d–f, Influence diffusion and perturbation space. If we are able to infer a model that predicts event dynamics at a specific spatial tile (the target) using observations from a source tile Δ days in the past, we say that the source tile is within the influencing neighbourhood of the target location with a delay of Δ. d, Spatial radius of influence for 0.5, 1, 2 and 3 weeks, for violent (upper panel) and property (lower panel) crimes. Note that the influencing neighbourhoods, as defined by our model, are large, approaching a radius of 6 miles; given the geometry of the City of Chicago, this maps to a substantial percentage of the total area of the urban space under consideration, demonstrating that crime exerts long-range, almost city-wide influences. e, Extent of a few inferred neighbourhoods at a time delay of at most 3 days. f, Average rate of influence diffusion, measured by the number of inferred predictive models that transduce influence as we consider longer and longer time delays. Note that the rate of influence diffusion falls rapidly for property crimes, dropping to zero in about 1 week, whereas for violent crimes the influence continues to diffuse even after 3 weeks.
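The influencing-neighbourhood construction in this caption can be made concrete. Below is a minimal sketch, assuming the inference step yields (source, target, delay) triples, one per successfully inferred predictive model; the triple format and tile coordinates are illustrative, not the paper's actual data structures.

```python
from collections import defaultdict

def influencing_neighbourhood(links, target, max_delay):
    """Group the source tiles that predict `target` by their time delay.

    `links` is an iterable of (source, target, delay) triples, one per
    inferred predictive model; `max_delay` is in days.
    """
    nbhd = defaultdict(list)  # delay (days) -> list of source tiles
    for src, tgt, delay in links:
        if tgt == target and delay <= max_delay:
            nbhd[delay].append(src)
    return dict(nbhd)

# Hypothetical links between grid tiles, with delays in days.
links = [((3, 4), (5, 5), 2), ((9, 9), (5, 5), 12), ((5, 6), (5, 5), 1)]
print(influencing_neighbourhood(links, target=(5, 5), max_delay=3))
# {2: [(3, 4)], 1: [(5, 6)]}
```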
… 
https://doi.org/10.1038/s41562-022-01372-0
1 Department of Medicine, University of Chicago, Chicago, IL, USA. 2 Department of Computer Science, University of Chicago, Chicago, IL, USA. 3 Department of Sociology, University of Chicago, Chicago, IL, USA. 4 Committee on Quantitative Methods in Social, Behavioral, and Health Sciences, University of Chicago, Chicago, IL, USA. 5 Santa Fe Institute, Santa Fe, NM, USA. 6 Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA. e-mail: ishanu@uchicago.edu
The emergence of large-scale data and ubiquitous data-driven modelling has sparked widespread government interest in the possibility of predictive policing1–5, that is, predicting crime before it happens to enable anticipatory enforcement. Such efforts, however, do not document the distribution of crime in isolation but rather its complex relationship with policing and society. In this study, we re-conceptualize the process of crime prediction, build methods to improve upon the state of the art and use this to diagnose both the distribution of reported crime and biases in enforcement. The history of statistics has co-evolved with the history of criminal prediction, but also with the history of enforcement critique. Siméon Poisson published the Poisson distribution and his theory of probability in an analysis of the number of wrongful convictions in a given country6. Andrey Markov introduced Markov processes to show that dependencies between outcomes could still obey the central limit theorem, to counter Pavel Nekrasov's argument that, because Russian crime reports obeyed the law of large numbers, "decisions made by criminals to commit crimes must all be independent acts of free will"7.
In this study, we conceptualize the prediction of criminal reports as that of modelling and predicting a system of spatio-temporal point processes unfolding in a social context. We report an approach to predict crime in cities at the level of individual events, with predictive accuracy far greater than has been achieved in the past. Rather than simply increasing the power of states by predicting the when and where of anticipated crime, our tools allow us to audit them for enforcement biases, and garner deep insight into the nature of the dynamical processes through which policing and crime co-evolve in urban spaces.
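Casting criminal reports as spatio-temporal point processes begins, in practice, with discretizing the city into spatial tiles and binning events into per-tile daily counts; each tile's series is then one component process, with the other tiles' histories available as predictors. A minimal preprocessing sketch, assuming events arrive as (latitude, longitude, day) records on a square grid; all parameters here are illustrative:

```python
import numpy as np

def rasterize(events, lat0, lon0, tile_deg, n_rows, n_cols, n_days):
    """Bin (lat, lon, day) event records into an (n_rows, n_cols, n_days) count array."""
    counts = np.zeros((n_rows, n_cols, n_days), dtype=int)
    for lat, lon, day in events:
        r = int(np.floor((lat - lat0) / tile_deg))   # tile row
        c = int(np.floor((lon - lon0) / tile_deg))   # tile column
        if 0 <= r < n_rows and 0 <= c < n_cols and 0 <= day < n_days:
            counts[r, c, day] += 1
    return counts

# counts[r, c, :] is the event series for one tile; thresholding it at zero
# gives the binary per-day labels used for event-level prediction.
```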
Classical investigations into the mechanics of crime8–10 have recently given way to event-level crime predictions that have enticed police forces to deploy them preemptively and stage interventions targeted at lowering crime rates. These efforts have generated multivariate models of time-invariant hotspots11–13 and estimates of both long- and short-term dynamic risks13. One of the earliest approaches to predictive policing was based on the use of epidemic-type aftershock sequences4,5, originally developed to model seismic phenomena. While these approaches have suggested the possibility of predictive policing, many achieve only limited out-of-sample performance4,5. More recently, deep learning architectures have yielded better results14. Machine learning and artificial intelligence-based systems, however, are often black boxes producing little insight regarding the social system of crime and its rules of organization. Moreover, the issue of how enforcement interacts with, modulates and reinforces crime has rarely been addressed in the context of precise event predictions.
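For context, the epidemic-type aftershock sequence (ETAS) models cited here describe event risk as a background rate plus self-excitation: each past event transiently raises the intensity of further events nearby. A standard simplified form of the conditional intensity (not necessarily the exact parameterization of refs. 4,5) is:

```latex
% Background rate \mu(s) plus self-excitation from past events i with t_i < t:
% \kappa is the productivity, g a temporal decay (e.g. exponential),
% f a spatial kernel over the distance to the triggering event.
\lambda(s, t) = \mu(s) + \kappa \sum_{i : t_i < t} g(t - t_i)\, f(\lVert s - s_i \rVert)
```

Forecasts then flag the cells where λ is currently highest, which is why such models capture short-range, short-delay contagion well but not the long-range, time-delayed interactions discussed below.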
A forecast competition for identifying hotspots prospectively in the City of Portland was organized by the National Institute of Justice (NIJ) in 2017 (https://nij.ojp.gov/funding/real-time-crime-forecasting-challenge), which led to the development of multiple effective approaches15,16 leveraging point processes to model event dynamics, but not accounting for long-range and time-delayed emergent interactions between spatial locations. Such approaches, although laudable for demonstrating that event-level prediction is possible with actionable accuracy, do not allow for the elucidation of enforcement bias. Informing predictions with the emergent structure of interactions allows us to significantly outperform solutions submitted to the NIJ challenge and simulate realistic enforcement alternatives and consequences.
Results and discussion
Here we show that crime in cities may be predicted reliably one or more weeks in advance, enabling model-based simulations that reveal both the pattern of reported infractions and the pattern of corresponding police enforcement. We learn from publicly recorded historical event logs, and validate on events in the following year beyond those in the training sample. Using incidence data from the City of Chicago, our spatio-temporal network inference algorithm infers patterns of past event occurrences and constructs a communicating network (the Granger network) of local estimators to predict …
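The validation protocol described here, training on historical logs and scoring events in the following year, amounts to an out-of-time split with per-tile ROC analysis. A minimal sketch under stated assumptions: `X` holds lagged features for one target tile, `y` the binary per-day event labels, and `LogisticRegression` is a stand-in for the paper's local estimators, not their actual model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def out_of_time_auc(X, y, split_day):
    """Fit on days before `split_day`, return AUC on the held-out days."""
    model = LogisticRegression(max_iter=1000)
    model.fit(X[:split_day], y[:split_day])
    scores = model.predict_proba(X[split_day:])[:, 1]   # P(event) per day
    return roc_auc_score(y[split_day:], scores)

# Averaging this AUC over tiles gives a distribution like that in the figure;
# note the held-out window must contain both event and non-event days for
# the AUC to be defined.
```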
Nature Human Behaviour 6, 1056–1068 (August 2022). www.nature.com/nathumbehav
Article
A new algorithmic tool developed by Rotaru and colleagues can more accurately predict crime events in US cities. Predictive crime modelling can produce powerful statistical tools, but there are important considerations for researchers to take into account to avoid their findings being misused and doing more harm than good.
Article
People frequently compare the racial composition of stopped individuals with the racial composition of the local residential population to assess unequal policing. This type of evaluation rests on the assumption that the census-derived population accurately reflects the population at risk of being stopped. For vehicle stops, existing research indicates that this assumption is very problematic, resulting in highly unreliable assessments of black-white policing disparities. However, there is little research on the significance of this assumption for stopped urban pedestrians. Analyzing more than 100,000 investigatory stops in Chicago, the present study finds that, similar to vehicle stops, most pedestrian investigations do not involve neighborhood residents, and estimates of racial disproportionality are inflated when this issue is ignored. Still, the degree to which estimates are inflated appears less than that previously reported for vehicle stops, and sizable racial disparities remain unexplained after the issue is taken into account. Implications for future research are discussed.
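The benchmark issue this study raises, stops compared against the residential census rather than the population actually at risk of being stopped, can be illustrated with a toy disproportionality ratio; the "at-risk" numbers below are hypothetical, standing in for the study's adjustment for non-resident pedestrians.

```python
def disproportionality(stops, benchmark):
    """Each group's share of stops divided by its share of the benchmark."""
    s_tot, b_tot = sum(stops.values()), sum(benchmark.values())
    return {g: (stops[g] / s_tot) / (benchmark[g] / b_tot) for g in stops}

stops = {"black": 700, "white": 300}
census = {"black": 500, "white": 500}      # residential benchmark
at_risk = {"black": 600, "white": 400}     # hypothetical at-risk benchmark
print(disproportionality(stops, census))   # {'black': 1.4, 'white': 0.6}
print(disproportionality(stops, at_risk))  # ≈ {'black': 1.17, 'white': 0.75}
```

Switching benchmarks shrinks the estimated disparity without eliminating it, mirroring the study's finding.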
Article
Crime forecasts are sensitive to the spatial discretizations on which they are defined. Furthermore, while the Predictive Accuracy Index (PAI) is a common evaluation metric for crime forecasts, most crime forecasting methods are optimized using maximum likelihood or other smooth optimization techniques. Here we present a novel methodology that jointly (1) selects an optimal grid size and orientation and (2) learns a scoring function with the aim of directly maximizing PAI. Our method was one of the top performing submissions in the 2017 NIJ Crime Forecasting challenge, winning 9 of the 20 PAI categories under the name of team PASDA. We illustrate the model on data provided through the competition from the Portland Police Department.
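The Predictive Accuracy Index (PAI) being maximized here is the share of crimes captured by the flagged cells divided by the share of area flagged, so it rewards concentrating hits into little space. A minimal sketch:

```python
def pai(hits, total_crimes, flagged_area, total_area):
    """PAI = (fraction of crimes captured) / (fraction of area flagged)."""
    return (hits / total_crimes) / (flagged_area / total_area)

# Capturing 30% of crimes while flagging 5% of the city gives PAI = 6.
print(pai(hits=30, total_crimes=100, flagged_area=5, total_area=100))  # 6.0
```

Because PAI jumps discontinuously as the grid shifts, it resists smooth optimization, which helps explain why jointly searching grid size and orientation, as this entry describes, is non-trivial.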
Article
This article describes Team Kernel Glitches' solution to the National Institute of Justice's (NIJ) Real-Time Crime Forecasting Challenge. The goal of the NIJ Real-Time Crime Forecasting Competition was to maximize two different crime hotspot scoring metrics for calls-for-service to the Portland Police Bureau (PPB) in Portland, Oregon during the period from March 1, 2017 to May 31, 2017. Our solution to the challenge is a spatiotemporal forecasting model combining scalable randomized Reproducing Kernel Hilbert Space (RKHS) methods for approximating Gaussian processes with autoregressive smoothing kernels in a regularized supervised learning framework. Our model can be understood as an approximation to the popular log-Gaussian Cox Process model: we discretize the spatiotemporal point pattern and learn a log intensity function using the Poisson likelihood and highly efficient gradient-based optimization methods. Model hyperparameters, including quality of RKHS approximation, spatial and temporal kernel lengthscales, number of autoregressive lags, bandwidths for smoothing kernels, as well as cell shape, size, and rotation, were learned using cross-validation. Resulting predictions exceeded baseline KDE estimates by 0.157. Performance improvements over baseline predictions were particularly large for sparse crimes over short forecasting horizons.
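The pipeline this entry describes, discretize the point pattern, approximate a Gaussian process with randomized kernel features, and fit a log-intensity under a Poisson likelihood, can be sketched compactly. The feature map below is the standard random Fourier approximation to an RBF kernel, and plain gradient ascent replaces the team's optimizer; both are stand-ins, not their exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def rff(X, n_features=200, lengthscale=1.0):
    """Random Fourier features approximating an RBF kernel on rows of X."""
    W = rng.normal(scale=1.0 / lengthscale, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def fit_log_intensity(Phi, counts, steps=2000, lr=0.1):
    """Poisson regression counts ~ Poisson(exp(Phi @ w)) by gradient ascent."""
    w = np.zeros(Phi.shape[1])
    for _ in range(steps):
        lam = np.exp(Phi @ w)
        w += lr * Phi.T @ (counts - lam) / len(counts)  # d/dw of log-likelihood
    return w

# Toy usage: 100 grid cells with 2-D centroids and observed counts.
X = rng.uniform(0.0, 1.0, size=(100, 2))
counts = rng.poisson(2.0, size=100)
Phi = rff(X)
w = fit_log_intensity(Phi, counts)
print(np.exp(Phi @ w)[:5])  # fitted intensities for the first five cells
```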
Article
Global methods that fit a single forecasting method to all time series in a set have recently shown surprising accuracy, even when forecasting large groups of heterogeneous time series. We provide the following contributions that help understand the potential and applicability of global methods and how they relate to traditional local methods that fit a separate forecasting method to each series:
• Global and local methods can produce the same forecasts without any assumptions about similarity of the series in the set.
• The complexity of local methods grows with the size of the set while it remains constant for global methods. This result supports the recent evidence and provides principles for the design of new algorithms.
• In an extensive empirical study, we show that purposely naïve algorithms derived from these principles show outstanding accuracy. In particular, global linear models provide competitive accuracy with far fewer parameters than the simplest of local methods.
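The local-versus-global contrast is easy to make concrete with least-squares AR(p) models: a local method fits one coefficient vector per series, so its parameter count grows with the set, while a global method pools every series into one regression with a constant parameter count. A minimal sketch (illustrative, not the paper's algorithms):

```python
import numpy as np

def ar_design(series, p):
    """Lagged design matrix and targets for an AR(p) fit on one series."""
    X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    return X, series[p:]

def fit_local(series_set, p):
    """One AR(p) per series: p * len(series_set) parameters in total."""
    return [np.linalg.lstsq(*ar_design(s, p), rcond=None)[0] for s in series_set]

def fit_global(series_set, p):
    """A single AR(p) pooled across all series: p parameters, any set size."""
    Xs, ys = zip(*(ar_design(s, p) for s in series_set))
    return np.linalg.lstsq(np.vstack(Xs), np.concatenate(ys), rcond=None)[0]
```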
Article
The number of predictive technologies used in the U.S. criminal justice system is on the rise. Yet there is little research to date on the reception of algorithms in criminal justice institutions. We draw on ethnographic fieldwork conducted within a large urban police department and a midsized criminal court to assess the impact of predictive technologies at different stages of the criminal justice process. We first show that similar arguments are mobilized to justify the adoption of predictive algorithms in law enforcement and criminal courts. In both cases, algorithms are described as more objective and efficient than humans’ discretionary judgment. We then study how predictive algorithms are used, documenting similar processes of professional resistance among law enforcement and legal professionals. In both cases, resentment toward predictive algorithms is fueled by fears of deskilling and heightened managerial surveillance. Two practical strategies of resistance emerge: foot-dragging and data obfuscation. We conclude by discussing how predictive technologies do not replace, but rather displace discretion to less visible—and therefore less accountable—areas within organizations, a shift which has important implications for inequality and the administration of justice in the age of big data.
Article
The history of policing in the United States is a history of tension between the police and the public, especially in marginalised communities, where the legitimacy of the police and their interventions has been most questioned. Marginalised, often minority, communities frequently complain about over- and under-policing, that is, policing that harasses local residents but does not address serious crime. In recent years, concerns with the institutional legitimacy of the police in the US and elsewhere have risen in public discussions and in scientific research. Current models of police legitimacy tend to focus on transactions between the police and the public over matters of procedural justice; however, taking a more contextual view of police interventions in communities provides opportunities to look beyond transactions and sort out the socio-cultural acceptance of the police against the myriad of services they provide to communities. Here we focus on census tracts in Boston, merging calls for service data with perceptual survey data. We find significant differences in the types of police services requested by advantaged and disadvantaged communities. Public-initiated calls for service are largely for emergency response matters as opposed to crime prevention and community restoration; police-initiated services, however, are more evenly distributed across prevention, response, and restoration. While residents of disadvantaged, high-crime communities request the police more often, they perceive themselves as unwilling to report crime. Additionally, they perceive their communities as unsafe while also viewing the police as less legitimate.
Article
The unequal spatial distribution of crime is an enduring feature of cities. While research suggests that spatial diffusion processes heighten this concentration, the actual mechanisms of diffusion are not well understood as research rarely measures the ways in which people, groups, and behaviors connect neighborhoods. This study considers how a particular behavior, criminal co-offending, creates direct and indirect pathways between neighborhoods. Analyzing administrative records and survey data, the authors find that individual acts of co-offending link together to create a “network of neighborhoods,” facilitating the diffusion of crime over time and across space and, in so doing, create pathways between all Chicago neighborhoods. Statistical analyses demonstrate that these neighborhood networks are (1) stable over time; (2) generated by important structural characteristics, social processes, and endogenous network properties; and (3) a better predictor of the geographic distribution of crime than traditional spatial models.
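The "network of neighbourhoods" induced by co-offending can be illustrated as a simple projection: every incident links the neighbourhoods of the offenders involved, and edge weights accumulate across incidents. A toy sketch, with neighbourhood names and the input format assumed for illustration:

```python
from collections import Counter
from itertools import combinations

def neighbourhood_network(incidents):
    """Weighted neighbourhood-to-neighbourhood edges from co-offending incidents.

    `incidents`: iterable of lists, each naming the neighbourhoods of the
    co-offenders involved in one incident.
    """
    edges = Counter()
    for hoods in incidents:
        for a, b in combinations(sorted(set(hoods)), 2):
            edges[(a, b)] += 1   # undirected edge, canonical order
    return edges

print(neighbourhood_network([["Austin", "Lawndale"],
                             ["Austin", "Englewood", "Lawndale"]]))
# Counter({('Austin', 'Lawndale'): 2, ('Austin', 'Englewood'): 1,
#          ('Englewood', 'Lawndale'): 1})
```

Such an edge list can then feed the kinds of statistical network models the study uses to predict the geographic distribution of crime.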