Article

Forecasting Domestic Violence: A Machine Learning Approach to Help Inform Arraignment Decisions

Authors: Richard Berk, Susan B. Sorenson, and Geoffrey Barnes

Abstract

Arguably the most important decision at an arraignment is whether to release an offender until the date of his or her next scheduled court appearance. Under the Bail Reform Act of 1984, threats to public safety can be a key factor in that decision. Implicitly, a forecast of “future dangerousness” is required. In this article, we consider in particular whether usefully accurate forecasts of domestic violence can be obtained. We apply machine learning to data on over 28,000 arraignment cases from a major metropolitan area in which an offender faces domestic violence charges. One of three possible post-arraignment outcomes is forecasted within two years: (1) a domestic violence arrest associated with a physical injury, (2) a domestic violence arrest not associated with a physical injury, and (3) no arrests for domestic violence. We incorporate asymmetric costs for different kinds of forecasting errors so that very strong statistical evidence is required before an offender is forecasted to be a good risk. When an out-of-sample forecast of no post-arraignment domestic violence arrests within two years is made, it is correct about 90 percent of the time. Under current practice within the jurisdiction studied, approximately 20 percent of those released after an arraignment for domestic violence are arrested within two years for a new domestic violence offense. If magistrates used the methods we have developed and released only offenders forecasted not to be arrested for domestic violence within two years after an arraignment, as few as 10 percent might be arrested. The failure rate could be cut nearly in half. Over a typical 24-month period in the jurisdiction studied, well over 2,000 post-arraignment arrests for domestic violence perhaps could be averted.
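One way to make the asymmetric-cost idea concrete is with class weights in an off-the-shelf classifier. The sketch below is illustrative only: the 10-to-1 false-negative-to-false-positive cost ratio, the synthetic predictors, and the use of scikit-learn's RandomForestClassifier are assumptions, not the authors' actual forecasting pipeline.

```python
# Minimal sketch: encoding an asymmetric cost ratio in a classifier.
# The 10:1 cost ratio and the synthetic data are illustrative assumptions,
# not the paper's actual arraignment data or procedure.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 8))                      # stand-in predictors (priors, age, etc.)
logits = X[:, 0] + 0.5 * X[:, 1] - 1.5
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)  # 1 = new DV arrest

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Weight the "new arrest" class 10x: missing a future offender (false negative)
# is treated as ten times costlier than over-detaining a good risk (false positive).
clf = RandomForestClassifier(n_estimators=500, class_weight={0: 1, 1: 10}, random_state=0)
clf.fit(X_train, y_train)

tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
print(f"false negatives: {fn}, false positives: {fp}")
```

In practice a cost ratio can also be imposed by asymmetric sampling of the training data or by shifting the classification threshold; the weighting above is simply the most compact way to show the idea.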


... In terms of prediction, Berk et al. (2016) applied supervised ML to forecast the future dangerousness of offenders in over 18,000 arraignment cases from a metropolitan area in which the offender faced DV charges [28]. Another crime prediction study on DV was conducted by Wijenayake et al. (2018) [45]. ...
... Studies using a national database (n = 3) are mostly cross-sectional surveys, and self-reported responses lead to recall bias or lapse of time issues, which may affect the accuracy of the prediction. As discussed by Berk et al. (2016) [28], limited electronic data and court documents for DV cases can be extracted for ML algorithms' development. Some jurisdictions' records are still in written form instead of electronic. ...
Article
Full-text available
Domestic violence (DV) is a public health crisis that threatens both the mental and physical health of people. With the unprecedented surge in data available on the internet and electronic health record systems, leveraging machine learning (ML) to detect obscure changes and predict the likelihood of DV from digital text data is a promising area of health science research. However, there is a paucity of research discussing and reviewing ML applications in DV research. Methods: We extracted 3588 articles from four databases. Twenty-two articles met the inclusion criteria. Results: Twelve articles used the supervised ML method, seven articles used the unsupervised ML method, and three articles applied both. Most studies were published in Australia (n = 6) and the United States (n = 4). Data sources included social media, professional notes, national databases, surveys, and newspapers. Random forest (n = 9), support vector machine (n = 8), and naïve Bayes (n = 7) were the top three algorithms, while the most used automatic algorithm for unsupervised ML in DV research was latent Dirichlet allocation (LDA) for topic modeling (n = 2). Eight types of outcomes were identified, while three purposes of ML and related challenges were delineated and are discussed. Conclusions: Leveraging the ML method to tackle DV holds unprecedented potential, especially in classification, prediction, and exploration tasks, and particularly when using social media data. However, adoption challenges, data source issues, and lengthy data preparation times are the main bottlenecks in this context. To overcome those challenges, early ML algorithms have been developed and evaluated on DV clinical data.
... Machine learning analytics may be better matched to the complexity of the decisionmaking processes reflected in criminal justice data, and complexity of data ensemble in administrative data, than generalized linear modelling (Brennan & Oliver, 2013). Although machine learning techniques have been used to predict criminal offending, and even rare violent events, with reasonable success (Berk, 2017;Berk et al., 2016;Berk & Sorenson, 2020), there have been very few applications to police behavior. Emerging literature proposes that machine learning is a powerful tool for developing risk assessments (Berk, 2021), but it remains that stratifying behavior for analysis provides considerable insight into those specific behaviors. ...
... Machine learning analytics have been used in recent years to interrogate policing data with considerable accuracy (Berk et al., 2009;Berk et al., 2016;Cubitt, Wooden, & Roberts, 2021). For example, analytical processes of this type have been used to forecast domestic violence (Berk, 2019;Berk et al., 2016;Grogger et al., 2021) and high-harm offense types (Berk et al., 2009;Cubitt & Morgan, 2022). Machine learning, while remaining an underutilized analytical methodology in the field of policing, offers a viable alternative to generalized linear modelling, by allowing data to be interrogated with greater granularity. ...
Article
The power to use force is a defining characteristic of policing, one that is accompanied by a responsibility to exercise these powers in the circumstances deemed necessary. This study analyzes data from four policing agencies to predict the likelihood of an officer drawing and pointing their firearm at a use of force incident. Findings suggest that situational factors were important in influencing whether an officer may draw and point their firearm. However, a priming effect, in which officers were more likely to draw their firearms when dispatched to an incident, may also be present. The rate that officers drew and pointed their firearms varied between jurisdictions, as did the nature of the incidents. Caution should be exercised in generalizing the results of single-site studies on police use of force, or introducing research into policy beyond the jurisdiction in which it was performed.
... Despite the wide usage of risk assessments in almost every jurisdiction in the USA, there has been surprisingly little discussion of how many FPs should be tolerated in exchange for avoiding FNs. Previous studies show that some stakeholders in the criminal justice system prefer the model that has more FPs than FNs (Barnes & Hyatt, 2012;Berk, 2019;Berk et al., 2016;Netter, 2007;Oswald et al., 2018). They believe that it is costlier to have FNs (inaccurately classifying high-risk individuals as low-risk individuals) than to have FPs (inaccurately classifying low-risk individuals as high-risk individuals) (Barnes & Hyatt, 2012;Berk, 2019;Berk et al., 2016). However, stakeholders are uncertain about how much they can sacrifice individual liberty and other costs of FPs in order to reduce potential threats to public safety and other costs of FNs (Brauneis & Goodman, 2017). ...
... Using algorithms in risk assessment to predict someone's risk of getting involved in crime is more accurate than assessments by criminal justice decision-makers (Berk, 2019;Berk et al., 2016). Similar to predictions made by humans, however, algorithms have errors. ...
Article
Full-text available
Objectives We examine public attitudes towards false positives and false negatives in criminal justice risk assessment and how people’s choices differ in varying offenses and stages. Methods We use data from a factorial survey experiment conducted with a sample of 575 Americans. Respondents were randomly assigned to different conditions in the vignette for the criminal justice process and the offense severity and were asked to choose the cost ratio. Results While people prefer the cost ratio with higher false positives, the degree to which they accept false positives is lower than the cost ratios of existing risk assessments. The offense severity impacts people’s acceptance of false positives. Meanwhile, numeracy influences people’s decisions on the cost ratio. Conclusions To our knowledge, this is the first study to investigate public opinion on the cost ratio in risk assessments. We suggest that public opinion on the cost ratio can be an alternative way to find the ideal cost ratio.
... Maryland's urban centers, including Baltimore, encounter diverse crime challenges, particularly in the field of violent crime, while its suburban and rural regions encounter different types of criminal activity, such as property crimes and drugrelated offenses. Machine learning models offer the potential to analyze and predict these varying crime patterns at a granular level, empowering law enforcement and policymakers with the tools needed to proactively address crime [4,5]. ...
... Deeper or unbounded trees could memorize the training data, but cross-validation revealed that a depth ≈ 10 was optimal, likely because this prevents overfitting the smaller training sets. Similarly, the gradient boosting models (XGBoost, CatBoost, etc.) were tuned with a learning rate of ≈ 0.1 and moderate tree depths (3–6), with early stopping rounds or regularization applied to curb overfitting (see Table 6 for details). The SVR model required tuning of the kernel hyperparameters (e.g., using an RBF kernel with C ≈ 10 and γ ≈ 0.1 was found to be best). ...
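For readers unfamiliar with the tuning workflow sketched in that excerpt, the snippet below shows cross-validated grid search over a gradient boosting regressor and an RBF-kernel SVR. The grids, the synthetic data, and the scoring choice are illustrative assumptions, not the study's actual configuration.

```python
# Sketch of cross-validated hyperparameter tuning; grids and data are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.3, size=400)

gbr = GridSearchCV(
    GradientBoostingRegressor(n_estimators=300, n_iter_no_change=10),  # early stopping
    {"learning_rate": [0.05, 0.1, 0.2], "max_depth": [3, 4, 5, 6]},
    cv=5, scoring="neg_root_mean_squared_error",
)
svr = GridSearchCV(
    SVR(kernel="rbf"),
    {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]},
    cv=5, scoring="neg_root_mean_squared_error",
)
for name, search in [("gradient boosting", gbr), ("SVR (RBF)", svr)]:
    search.fit(X, y)
    print(name, search.best_params_, round(-search.best_score_, 3))
```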
Article
Full-text available
This study advances crime analysis methodologies in Maryland by leveraging sophisticated machine learning (ML) techniques designed to cater to the state’s varied urban, suburban, and rural contexts. Our research utilized an enhanced combination of machine learning models, including random forest, gradient boosting, XGBoost, extra trees, and advanced ensemble methods like stacking regressors. These models have been meticulously optimized to address the unique dynamics and demographic variations across Maryland, enhancing our capability to capture localized crime trends with high precision. Through the integration of a comprehensive dataset comprising five years of detailed police reports and multiple crime databases, we executed a rigorous spatial and temporal analysis to identify crime hotspots. The novelty of our methodology lies in its technical sophistication and contextual sensitivity, ensuring that the models are not only accurate but also highly adaptable to local variations. Our models’ performance was extensively validated across various train–test split ratios, utilizing R-squared and RMSE metrics to confirm their efficacy and reliability for practical applications. The findings from this study contribute significantly to the field by offering new insights into localized crime patterns and demonstrating how tailored, data-driven strategies can effectively enhance public safety. This research importantly bridges the gap between general analytical techniques and the bespoke solutions required for detailed crime pattern analysis, providing a crucial resource for policymakers and law enforcement agencies dedicated to developing precise, adaptive public safety strategies.
... The judicial system in Spain is based on the principle of 'in dubio pro reo' (in case of doubt, the rights of the accused prevail). With this in mind, we should put on a balance some issues such as the cost of releasing people at risk rather than detain a person who will not reoffend, as some authors have put into practise in their research with a cost ratio, for example, where one false negative has the cost of 10 false positives (Berk et al. 2016). Other sectors, however, demand greater protection for the victim, seeking the lowest number of false negatives. ...
... The judge does not hand down a prison sentence to a convict based on his or her risk level. The risk assessment may only temporarily impact some rights, such as imposing a restraining order or, in the worst-case scenario, pretrial incarceration whilst awaiting the final sentence (see Berk et al. 2016 for a practical improvement of judicial arraignment practises in domestic violence using a machine learning forecasting system). Therefore, we are considering two very different kinds of risk, which are easily ponderable. ...
Article
Full-text available
Violence risk assessment is an internationally recognised methodology, aimed to manage different forms of violence. Most risk assessment tools, as is the case of the reviewed one, are designed to protect victims in the context of pressure, little time, or little information. This paper presents a reply to Valdivia et al. (AI & Society, July 2024) criticism of the algorithm for intimate partner violence risk assessment—EPV—used in the Basque Country. They concluded that more than 50% of high-risk victims are in danger, using results from a pilot version of the instrument, not the reviewed one published in 2010, nor the system in use since May 2013. In addition, qualitative information from a single professional generates global criticisms of the tool. Neither the current cut-off points nor the real weighting of the items nor the real risk management procedure are considered, and the personal opinion of a judge is assumed to be better than the use of tools when the accumulated research shows the opposite. When EPV risk assessment reports are used in courts, they may only temporarily affect some perpetrator rights, imposing restraining orders or, in the worst case, pretrial prison waiting for sentence. However, risk management can save the life of the victim. Cautions and suggestions related to the judicial context, such as improving risk reports or training judicial professionals, are shared. However, Valdivia et al.’s paper leads to misconceptions that extend to different sectors when echoing the wrong conclusions of their paper.
... Random forests are often used to identify predictive variables, ranked in terms of "variable importance". Random forests have been used to establish variable importance in public health research across a wide variety of topics, including predicting violence [31,32], assessing biosecurity practices [33], and examining phenotypic risk factors for temporomandibular disorders [34]. Random forests are useful for rapidly assessing large, complex datasets, and this method allowed us to examine all 92 items of the EPII without a priori assumptions. ...
... In this exploratory analysis, we sought to identify which items of the EPII explained the most variability in the Kessler-6 score, a measure of non-specific stress. Random forests are a useful tool for establishing variable importance and have been used in a wide variety of different contexts [31][32][33][34]. Unlike traditional regression techniques, which use indirect metrics such as p-values and measures of model fit to establish variable importance, random forests compute internal metrics for variable importance by calculating the change in model mean squared error when each variable is randomly permuted [31,36,37]. ...
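The permutation-importance idea described in these excerpts can be reproduced in a few lines. The sketch below is a minimal illustration on synthetic data, not the EPII analysis itself: importance is measured as the increase in mean squared error when a feature's values are shuffled.

```python
# Minimal sketch of permutation-based variable importance with a random forest;
# the data and feature labels are illustrative, not the EPII/Kessler-6 dataset.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 5))
y = 2 * X[:, 0] + X[:, 3] + rng.normal(scale=0.5, size=600)  # stand-in stress score

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
forest = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_train, y_train)

# Importance = increase in MSE when a feature's values are randomly permuted,
# breaking its link to the outcome while leaving everything else intact.
result = permutation_importance(forest, X_test, y_test, n_repeats=20,
                                scoring="neg_mean_squared_error", random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"item_{i}: {result.importances_mean[i]:.3f}")
```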
Article
Full-text available
While the COVID-19 pandemic has negatively impacted many occupations, teachers and school staff have faced unique challenges related to remote and hybrid teaching, less contact with students, and general uncertainty. This study aimed to measure the associations between specific impacts of the COVID-19 pandemic and stress levels in Minnesota educators. A total of 296 teachers and staff members from eight middle schools completed online surveys between May and July of 2020. The Epidemic Pandemic Impacts Inventory (EPII) measured the effects of the COVID-19 pandemic according to nine domains (i.e., Economic, Home Life). The Kessler-6 scale measured non-specific stress (range: 0–24), with higher scores indicating greater levels of stress. Random forest analysis determined which items of the EPII were predictive of stress. The average Kessler-6 score was 6.8, indicating moderate stress. Three EPII items explained the largest amount of variation in the Kessler-6 score: increase in mental health problems or symptoms, hard time making the transition to working from home, and increase in sleep problems or poor sleep quality. These findings indicate potential areas for intervention to reduce employee stress in the event of future disruptions to in-person teaching or other major transitions during dynamic times.
... Supervised ML is easy to implement compared to other types of ML. It fits data to a function predefined in the algorithm, e.g., backward substitution and Gaussian regressions [41,42]. Unsupervised ML deduces errors and decisions without explicit feedback, while reinforcement learning uses a reward-based agent for data-set optimization [42]. In our current research, a variety of datasets are involved, mainly comprised of absorption bands, magnetic moments, and biofouling as per UV and IR spectroscopic analytical techniques. ...
Article
Full-text available
The synthesis of many transition metal complexes containing 3,5-diamino-1,2,4-triazole (Hdatrz) as a ligand with different counter anions Br⁻, Cl⁻, ClO₄⁻ and SO₄²⁻ has been studied extensively, but the chemistry of transition metal nitrate and acetate compounds and their reactivity is relatively unexplored. In this research work, two new series of metal(II) complexes (M = Ni, Co, and Zn) {[Ni3(Hdatrz)6(H2O)6](NO3)6 (1), [Co3(Hdatrz)6(H2O)6](NO3)6 (2), [Zn3(Hdatrz)6(H2O)6](NO3)6 (3), [Ni3(Hdatrz)6(H2O)6](OAc)6 (4), [Co3(Hdatrz)6(H2O)6](OAc)6 (5) and [Zn3(Hdatrz)6(H2O)6](OAc)6 (6)} have been synthesized. These synthesized complexes were characterized by various physicochemical techniques such as UV-vis spectroscopy, Fourier transform infrared spectroscopy, and magnetic susceptibility measurements. All six complexes were found to be trinuclear and bridged through the Hdatrz ligand. Spectral data suggested a distorted octahedral environment around the central metal ions in these complexes. It also revealed that the NH and OH groups are involved in hydrogen bonding. These complexes were tested against the fungal strains Colletotrichum gloeosporioides and Aspergillus niger. These synthesized complexes have not been observed to have antifungal activities. The K-nearest neighbours machine learning algorithm was used to evaluate the analytical characteristics and solubility behavior of the metal complexes. Machine learning models provide results with 75% precision.
... Then, the proportions can be seen as statistical estimates. For example, a confusion table for release decisions at arraignments from a given month might be used to draw inferences about a full year of arraignments in that jurisdiction (Berk et al., 2016). Likewise, a confusion table for the housing decisions made for prison inmates (e.g., low security housing versus high security housing) from a given prison in a particular jurisdiction might be used to draw inferences about placement decisions in other prisons in the same jurisdiction (Berk and de Leeuw, 1999). ...
Preprint
Objectives: Discussions of fairness in criminal justice risk assessments typically lack conceptual precision. Rhetoric too often substitutes for careful analysis. In this paper, we seek to clarify the tradeoffs between different kinds of fairness and between fairness and accuracy. Methods: We draw on the existing literatures in criminology, computer science and statistics to provide an integrated examination of fairness and accuracy in criminal justice risk assessments. We also provide an empirical illustration using data from arraignments. Results: We show that there are at least six kinds of fairness, some of which are incompatible with one another and with accuracy. Conclusions: Except in trivial cases, it is impossible to maximize accuracy and fairness at the same time, and impossible simultaneously to satisfy all kinds of fairness. In practice, a major complication is different base rates across different legally protected groups. There is a need to consider challenging tradeoffs.
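The role of base rates mentioned in that abstract can be illustrated with a small simulation: even when a risk score is thresholded identically for two groups, so that false positive and false negative rates match, differing base rates push the positive predictive values apart. The groups, score, and threshold below are synthetic assumptions, not the arraignment data used in the paper.

```python
# Sketch: identical thresholds with different base rates yield unequal PPV.
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(3)

def simulate_group(base_rate, n=20000):
    y = (rng.random(n) < base_rate).astype(int)   # true rearrest
    score = 0.4 * y + 0.6 * rng.random(n)         # an imperfect risk score
    yhat = (score > 0.5).astype(int)              # same threshold for every group
    return y, yhat

for name, base_rate in [("group A", 0.5), ("group B", 0.2)]:
    y, yhat = simulate_group(base_rate)
    tn, fp, fn, tp = confusion_matrix(y, yhat).ravel()
    print(f"{name}: FPR {fp/(fp+tn):.2f}, FNR {fn/(fn+tp):.2f}, PPV {tp/(tp+fp):.2f}")
```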
... Advances in statistical techniques show great promise for the field. For example, machine learning is in wide use in many settings, including those addressing violence against women (5). Machine learning can help inform criminal justice and child custody decisions when administrative actions depend on forecasts of problematic behavior. ...
Article
Violence against women, especially intimate partner violence, is recognized as a global public health issue due to its prevalence and global reach. This article outlines the scope of the issue, with respect to its prevalence, health outcomes, and risk factors, and identifies key milestones that led to its global recognition: methodological and data advances, acknowledgment as a criminal justice and health issue, support by the global women's movement, and the robust evidence demonstrating that intimate partner violence is preventable. Key issues for the future include recognition and consideration of intersectionality in research, improvements in the measurement of other forms of violence against women, and the need to scale up prevention efforts that have documented success. Violence against women is an urgent priority as it affects individuals, their families and surroundings, and the entire global health community.
... Were drugs or alcohol involved? Romantic couples in the process of separating can place officers in situations that are emotionally charged (Berk et al., 2016). Articulated threats of violence might be predictive as well. ...
Article
Full-text available
Purpose Police officers in the USA are often put in harm’s way when responding to calls for service. This paper provides a demonstration of concept for how machine learning procedures combined with conformal prediction inference can be properly used to forecast the amount of risk associated with each dispatch. Accurate forecasts of risk can help improve officer safety. Methods The unit of analysis is each of 1,928 calls to 911 involving weapons offenses. Using data from the calls and other information, we develop a machine learning algorithm to forecast the risk that responding officers will face. Uncertainty in those forecasts is captured by nested conformal prediction sets. Results For approximately a quarter of a holdout sample of 100 calls, a forecast of high risk was correct with odds of at least 3 to 1. For approximately another quarter of the holdout sample, a forecast of low risk was correct with odds of at least 3 to 1. For remaining cases, insufficiently reliable forecasts were identified. A result of “can’t tell” is an appropriate assessment when the data are deficient. Conclusions Compared to current practice at the study site, we are able to forecast with a useful level of accuracy the risk for police officers responding to calls for service. With better data, such forecasts could be substantially improved. We provide examples.
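A minimal split-conformal sketch conveys the flavor of the prediction sets described above; it is not the paper's nested conformal procedure, and the classifier, synthetic data, and 75 percent coverage target are assumptions for illustration.

```python
# Split-conformal prediction sets for a binary risk label (illustrative only).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(3000, 6))
p = 1 / (1 + np.exp(-(X[:, 0] + X[:, 1])))
y = (rng.random(3000) < p).astype(int)                     # 1 = high-risk call

X_fit, X_rest, y_fit, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_new, y_cal, y_new = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_fit, y_fit)

# Nonconformity score: one minus the predicted probability of the true label.
probs_cal = clf.predict_proba(X_cal)
scores = 1 - probs_cal[np.arange(len(y_cal)), y_cal]
alpha = 0.25                                               # target ~75% coverage
qhat = np.quantile(scores, np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores))

# Prediction set for a new call: every label whose score clears the threshold.
probs_new = clf.predict_proba(X_new[:5])
for i, row in enumerate(probs_new):
    pred_set = [label for label in (0, 1) if 1 - row[label] <= qhat]
    print(f"call {i}: prediction set {pred_set}")  # one label, or both when it "can't tell"
```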
... Lastly, there has been a more recent movement in using machine learning to conduct risk assessment. One developed algorithm used 36 variables to predict which individuals would not commit a new domestic violence offense in the next 2 years (Berk et al., 2016). The program was correct 89% of the time. ...
... Notwithstanding variations between study sites about population prevalence, the limited repeated IPV in Sweden, especially in same-sex dyads, presents a methodological and practical consideration: How can we accurately predict three to four same-sex or eight to 11 different-sex dyads out of 100 that would contact the police again in the next 140-180 days after the first reported IPV? As Bland and Ariel (2020) outline, some of these challenges confront statisticians and clinicians alike, but their conclusion is often unsatisfactory: predicting rare events can be done with a low false-positive rate and a high false-negative rate (Barnes & Hyatt, 2012;Berk et al., 2016;Berk & Sorenson, 2020;Sherman, 2013; see also Turner et al., 2019). More research is needed, preferably with multiple layers of data from the police and its partner agencies as well as self-reported data. ...
Article
Full-text available
In recent years more attention has been given to the ways in which mixed-sex and same-sex intimate partner violence (IPV) couples report crimes to the police. Specifically, what patterns of repetition, intermittency between contacts with the police, and harm trajectories over time exist, and are there variations between same-sex and mixed-sex dyads? We explore all eligible IPV reported in Sweden over 1,000 days (n = 14,939) and use descriptive statistics to examine differences between different victims and offenders. We code IPV offences within three levels of harm recognized by law and develop a tiered approach to harm quantification that supports the growing evidence that not all IPV harm is the same. Based on official records, IPV usually ends following the first contact with the police, as nine out of ten dyads never call again. Variations across cisgender and sexual identity groups exist: Repeat same-sex IPV is not as common as mixed-sex IPV but is reported more quickly to the police after it had occurred once. In the 1,000-day follow-up period, same-sex dyads do not call the police more than four times and the repeat-incident trends seem to be driven primarily by outliers. Moreover, we find an overall pattern of decreasing time intervals between each additional contact, but no overall pattern of escalating severity over time. However, the overall severity trend is driven by female-victim-male-offender dyads: male offenders are more likely to cause escalation of harm, while two out of five male–male repeat IPV dyads experience escalation in harm. We discuss the theoretical and practical implications of these findings, which overall illustrate the importance of observing IPV in typological terms, rather than as a continuum.
... We have created a dichotomous target variable (poor mental health, Yes or No) based on the aforementioned ordinal variable and we have investigated the following algorithms: Multilayer Perceptron (MLP), logistic regression, Support Vector Machine (linear, RBF, and polynomial kernels), random forest, and AdaBoost [33,34,37,38]. All models are developed using Python Scikit-Learn (version 1.1.1). ...
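A compact way to run the comparison listed in that excerpt is cross-validation over the same set of scikit-learn estimators; the synthetic data and the F1 scoring below are illustrative assumptions, not the study's dataset or tuning.

```python
# Sketch: cross-validated comparison of the classifiers named above.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, weights=[0.7, 0.3], random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM (linear)": SVC(kernel="linear"),
    "SVM (RBF)": SVC(kernel="rbf"),
    "SVM (poly)": SVC(kernel="poly"),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```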
... Usually, they are intimidated by continuous exposure to the same scenario, for example, a chain of deaths of family members. In this regard, continuous exposure to such experiences leads to neurotic depression (Berk, 2016). ...
Chapter
Full-text available
Depression being a behavioural health disorder is a serious health concern in Zimbabwe and all over the world. If depression goes unaddressed, the consequences are detrimental and have an impact on the way one behaves as an individual and at the societal level. Despite the number of individuals who could benefit from treatment for behavioural health concerns, their difficulties are often unidentified and unaddressed through treatment. Technology carries the unrealised potential to identify people at risk of behavioural health conditions and to inform prevention and intervention strategies.
... Importantly, the development of these tools has historically been human-driven, with developers selecting risk factors on the basis of their own experience and knowledge of empirical research. However, data-driven approaches, where the design and refinement of risk assessment tools are determined through statistical modelling and machine learning algorithms, are increasingly being used with promising results (e.g., Berk, Sorenson & Barnes 2016; Leung & Trimboli 2022). ...
Article
Full-text available
This study examines how accurately the refined Family Violence Risk Assessment Tool (FVRAT) predicts repeat domestic violence. Developed on the basis of a previous validation study of an earlier, much longer version of the tool, the refined FVRAT consists of 10 checkbox items, along with sections recording victim and officer judgements. These are used to inform police responses in the Australian Capital Territory. A sample of over 450 unique reports of violence involving current and former intimate partners between March and December 2020 in which police used the refined FVRAT was examined. Repeat domestic violence was measured based on whether a subsequent report of domestic violence was made to police within six months. Consistent with the previous study, the refined FVRAT predicts repeat domestic violence at least moderately well. Victim judgements were also shown to enhance the tool's ability to correctly identify repeat domestic violence, although the findings also suggest some caution is warranted in using these judgements.
... Likewise, the use of artificial intelligence and, specifically, ML and the procedures indicated above, has interesting advantages over regressions. One of them is the possibility of working even in the presence of multicollinearity among the predictors, or of working with a large group of independent variables or "inputs", in line with the approach of [17]. This work states that, in the presence of predictors with little predictive power, which are normally eliminated from the models, the accuracy of the prediction can be significantly increased thanks to the aggregation of variables performed by ML. ...
Article
Full-text available
Intimate partner violence against women (IPVW) is a pressing social issue which poses a challenge in terms of prevention, legal action, and reporting the abuse once it has occurred. However, a significant number of female victims who file a complaint against their abuser and initiate legal proceedings subsequently withdraw charges for different reasons. Research in this field has been focusing on identifying the factors underlying women victims’ decision to disengage from the legal process to enable intervention before this occurs. Previous studies have applied statistical models to use input variables and make a prediction of withdrawal. However, none have used machine learning models to predict disengagement from legal proceedings in IPVW cases. This could represent a more accurate way of detecting these events. This study applied machine learning (ML) techniques to predict the decision of IPVW victims to withdraw from prosecution. Three different ML algorithms were optimized and tested with the original dataset to assess the performance of ML models against non-linear input data. Once the best models had been obtained, explainable artificial intelligence (xAI) techniques were applied to search for the most informative input features and reduce the original dataset to the most important variables. Finally, these results were compared to those obtained in the previous work that used statistical techniques, and the set of most informative parameters was combined with the variables of the previous study, showing that ML-based models had better predictive accuracy in all cases and that by adding one new variable to the previous work's predictive model, the accuracy to detect withdrawal improved by 7.5%.
... The precision rates were generally low across the four models. However, given that PPO applicants were rare in the population (i.e., 1.6%), the model performance was acceptable (Berk et al., 2016). As the objective of the study was to identify the true PPO applicants in the population, recall was prioritized, and thus we favored LR and XGB models over the other two models. ...
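The recall-over-precision reasoning in that excerpt can be demonstrated on synthetic rare-event data: with a positive rate of roughly 1 to 2 percent, loosely mirroring the 1.6% PPO base rate mentioned above, lowering the decision threshold recovers more true positives at the cost of precision. The model, threshold values, and data below are assumptions for illustration.

```python
# Sketch: precision/recall trade-off when the positive class is rare.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(5)
n = 50000
X = rng.normal(size=(n, 6))
p = 1 / (1 + np.exp(-(X[:, 0] + X[:, 1] - 5.5)))      # rare-event probability (~1-2%)
y = (rng.random(n) < p).astype(int)
print(f"positive rate: {y.mean():.3f}")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)

# Lowering the decision threshold trades precision for recall, the preferable
# trade when the goal is to miss as few true applicants as possible.
probs = model.predict_proba(X_te)[:, 1]
for threshold in (0.5, 0.3):
    pred = (probs >= threshold).astype(int)
    print(f"threshold {threshold}: precision {precision_score(y_te, pred):.2f}, "
          f"recall {recall_score(y_te, pred):.2f}")
```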
Article
Full-text available
Purpose Identifying pertinent risk factors is an essential first step for early detection and upstream prevention of spousal violence. However, limited research has examined the risk factors of spousal violence in the Asian context. This study aimed to understand the spousal violence issue in Singapore by (1) identifying the pertinent risk factors that could predict the likelihood of applying for a Personal Protection Order (PPO) - an order restraining a respondent from committing family violence against a person, and (2) understanding the relationship between various risk factors and the likelihood of PPO application. Method Linked administrative data of ever-married Singapore residents born in 1980 and 1985 (N = 51,853) were analyzed, using machine learning and network approaches. Results Results indicated that the pertinent risk factors associated with PPO application included lower educational attainment, staying in a public rental flat, early marriage and parenthood, childhood maltreatment, prior history of being respondent to PPO, offending behaviors, and mental illness. Conclusions Findings could aid in identifying individuals and families at-risk and informing upstream efforts to combat spousal violence issues. First responders, such as police or social workers, could utilize the relevant risk factor as a guide in cases of suspected family violence to identify at-risk individuals and families in a timely manner and minimize adverse effects.
... The use of machine learning techniques has become increasingly more prevalent in the social sciences, and criminology and criminal justice have begun to embrace machine learning to take advantage of the growth in the amount and complexity of data available and the potential for more robust, replicable, accurate and efficient modelling and forecasting compared to traditional approaches (Di Franco & Santurro, 2021;Grimmer et al., 2021;Hindman, 2015). As machine learning has spread across disciplines and into the social sciences, criminology and criminal justice researchers have begun to adopt machine learning techniques in a number of areas, including policing, risk-need assessment, domestic violence, violent and criminal recidivism, inmate misconduct and cost-benefit of criminal justice policy (Berk et al., 2016;Elluri et al., 2019;Ghasemi et al., 2021;Manning et al., 2018;Ngo et al., 2019;Travaini et al., 2022). ...
Article
Background: Although there is general consensus about the behavioural, clinical and sociodemographic variables that are risk factors for reoffending, optimal statistical modelling of these variables is less clear. Machine learning techniques offer an approach that may provide greater accuracy than traditional methods. Aim: To compare the performance of advanced machine learning techniques (classification trees and random forests) to logistic regression in classifying correlates of rearrest among adult probationers and parolees in the United States. Method: Data were from the subgroup of people on probation or parole who had taken part in the National Survey on Drug Use and Health for the years 2015-2019. We compared the performance of logistic regression, classification trees and random forests, using receiver operating characteristic curves, to examine the correlates of arrest within the past 12 months. Results: We found that machine learning techniques, specifically random forests, possessed significantly greater accuracy than logistic regression in classifying correlates of arrest. Conclusions: Our findings suggest the potential for enhanced risk classification. The next step would be to develop applications for criminal justice and clinical practice to inform better support and management strategies for former offenders in the community.
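A hedged sketch of the kind of comparison reported above, scoring logistic regression and a random forest by cross-validated ROC AUC; the synthetic data stand in for the survey variables and are an assumption for illustration.

```python
# Sketch: comparing logistic regression and a random forest by ROC AUC.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=3000, n_features=15, n_informative=8,
                           flip_y=0.05, random_state=0)
for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("random forest", RandomForestClassifier(n_estimators=500, random_state=0))]:
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")  # out-of-sample AUC
    print(f"{name}: mean AUC = {auc.mean():.3f}")
```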
... The false-negative rate is high but set against a current lack of any formal prediction and associated intervention, so it provides a better approach than current practice. More sophisticated prediction models can reduce the error rate (see Berk et al. 2016;Hu et al. 2021;Yang and Olafsson 2011). ...
Article
Full-text available
While criminology and policing studies focus primarily on offenders and their behaviours, there has been an increasing focus on victims and victimology. In this paper, we argue that practitioners and scholars alike can benefit from shifting their focus on police records towards victims. Observing data on victims can lead to greater police efficiencies, particularly in the area of prevention. We review some of the arguments for such a change, then explore evidence on 380,169 victims in Kent, UK, during a 6-year period, to illustrate how to achieve new and feasible targets by focusing on a victim rather than an offender as the unit of analysis. Finally, we explore policy implications, in terms of harm reduction, prevention of repeat victimization, and triaging opportunities.
... More sophisticated and valid algorithms should therefore be used, including "logistic regression, naive Bayes, tree-augmented naive Bayes, random forests, gradient boosting, and weighted subspace random forests" (Turner et al., 2021). These instruments are more fitting for a variety of statistical performance reasons (see Kuhn and Johnson, 2013 and more recently Berk et al., 2016). However, whilst these models are useful, they are underutilised by law enforcement. ...
Article
Full-text available
Purpose A recent body of evidence investigated repeated intimate partner violence (IPV) using crime harm indices (the severity of victimisation), instead of crime counts (the number of additional victimisation incidents). Yet, the predictive utility of harm scores in IPV remains unclear – except that high-harm IPV is not usually followed by any additional IPV incidents. The authors take cases of repeat IPV from North Zealand Police, Denmark, to predict subsequent IPV harm and counts based on the level of harm of the first reported IPV offence. Design/methodology/approach Using the Danish crime harm index (CHI) to estimate harm levels, non-linear regression models are applied (due to the non-linear nature of the data) to show that the CHI level of the index offence validly predicts gains in future CHI but does not predict IPV counts. Findings The findings suggest that whilst high-harm IPV is a rare event and repeat high-harm IPV even rarer, when they do occur, escalation in harm is likely to occur. Practical implications A simple metric of harm of the first reported IPV offence can validly predict future harm – however, scholars should apply more fitting analytical techniques than crude descriptive statistics, which fail to take into account the non-linear distribution of police records. Originality/value This is the first study to show the value of predicting future harm based on prior harm in IPV.
... Robbins intends this division between morally sensitive contexts and 'neutral contexts' to largely map onto the distinction between contexts in which we intuitively feel comfortable with the use of opaque AI and contexts in which this opacity seems potentially problematic. Commonly identified ethically problematic contexts of use are those such as judicial sentencing (Berk et al. 2016;Barry-Jester et al. 2015), predictive policing (Ahmed 2018;Ensign et al. 2017;Joh 2017;O'Neil 2016) and medical diagnosis (de Bruijne 2016; Dhar and Ranganathan 2015;Erickson et al. 2017). He writes, One reason that using inexplicable decisions in morally sensitive contexts like the ones listed above is wrong is that we must ensure that the decisions are not based on inappropriate considerations… Combine this fact with using ML algorithms for decisions that have moral significance (i.e. ...
Article
Full-text available
The increasing demand for transparency in AI has recently come under scrutiny. The question is often posted in terms of “epistemic double standards”, and whether the standards for transparency in AI ought to be higher than, or equivalent to, our standards for ordinary human reasoners. I agree that the push for increased transparency in AI deserves closer examination, and that comparing these standards to our standards of transparency for other opaque systems is an appropriate starting point. I suggest that a more fruitful exploration of this question will involve a different comparison class. We routinely treat judgments made by highly trained experts in specialized fields as fair or well grounded even though—by the nature of expert/layperson division of epistemic labor—an expert will not be able to provide an explanation of the reasoning behind these judgments that makes sense to most other people. Regardless, laypeople are thought to be acting reasonably—and ethically—in deferring to the judgments of experts that concern their areas of specialization. I suggest that we reframe our question regarding the appropriate standards of transparency in AI as one that asks when, why, and to what degree it would be ethical to accept opacity in AI. I argue that our epistemic relation to certain opaque AI technology may be relevantly similar to the layperson’s epistemic relation to the expert in certain respects, such that the successful expert/layperson division of epistemic labor can serve as a blueprint for the ethical use of opaque AI.
... A deep dive into the statistical and algorithmic underpinnings of the tools commonly used by criminal justice agencies is beyond the scope of this paper, but Richard Berk and his colleagues have published several helpful explanations of the nuts and bolts of many of these methods (Berk, 2006, 2008, 2010, 2011, 2021; Berk, Kriegler, & Baek, 2006; Berk, Sorenson, & Barnes, 2016; Berk, Sorenson, Barnes, Kurtz, & Ahlman, 2009). See these papers for further information. ...
Article
Full-text available
We address the organization of criminal justice forecasting and implications for its use in criminal justice policymaking. We argue that the use of forecasting is relatively widespread in criminal justice agency settings, but it is used primarily to inform decision-making and practice rather than to formulate and test new policy proposals. Using predictive policing and prison population forecasting as our main examples of the range of forecasting methods adopted in criminal justice practice, we describe their uses, how their use is organized, and the implications of the organizational arrangements for the transparent, reviewable, and consensual use of forecasting. We point out that both prison population forecasting and predictive policing have long histories that have led to advances in methodology. Prison population forecasting has generally become embedded in budget decision-making processes that contribute to greater transparency in method and applications. Predictive policing has been less transparent in method and use, partly because the methods are more complicated and rely on larger amounts of data, but it generally has not been used in ways to foster community engagement and build public support. Concerns about the legitimacy of its use persist.
... In addition to the agentic nature of the outcome in question, Bushway and Smith (2007) outline other issues involved with risk prediction in criminal justice applications, notably the availability of data used in such models of prediction of criminal behavior that may not accurately reflect all the inputs in the decision process. The authors make a persuasive argument that identifying high risk offenders is difficult on the basis of official data, typically in the form of simpler models which include fewer inputs (see also Bushway 2020). Even with the growing literature on the use of machine learning techniques in crime applications that invoke risk assessments (Berk 2017;Berk et al. 2016), this literature has come to focus on complications related to algorithmic fairness and potential discrimination that can result from the reliance on these methods for the purposes of prediction (Kleinberg et al. 2018;Ludwig and Mullainathan 2021;Stevenson and Doleac 2021). For instance, Kleinberg et al. (2018) describe the issue of judicial decisions regarding pretrial detention as a 'promising' machine learning application given the simplicity of the prediction task and the ostensible availability of large amounts of data in which to inform the decision process. ...
Article
Full-text available
Introduction/Aim Extant tests of developmental theories have largely refrained from moving past testing models of association to building models of prediction, as have other fields with an intervention focus. With this in mind, we test the prognostic capacity to predict offending outcomes in early adulthood derived from various developmental theories. Methods Using 734 subjects from the Rochester Youth Development Study (RYDS), we use out-of-sample predictions based on 5-fold cross-validation and compare the sensitivity, specificity and positive predictive value of three different prognostic models to predict arrest and serious, persistent offending in early adulthood. The first uses only predictors measured in early adolescence, the second uses dynamic trajectories of delinquency from ages 14–22, and the third uses a combination of the two. We further consider how early in adolescence the trajectory models calibrate prediction. Results Both the early adolescent risk factor only model and the dynamic trajectory model were poor at prognosticating both arrest and persistent offending in early adulthood, which is manifest in the large rate of false positive cases. Conclusion Furthermore, existing developmental theories would be well served to move beyond cataloging risk factors and draw more heavily on refinements, including a greater focus on human agency in life course patterns of offending.
... (Chen et al., 2020; Van Noordt, 2020; Veale & Brass, 2019). Health & Safety (Machine Learning): understand and help prevent workplace injuries and illnesses; early diagnostics systems (Barth & Arnold, 1999; Berk et al., 2016; Kankanhalli et al., 2019; Uzun, 2020) ...
Article
Full-text available
Technological advancements have created notable turning points throughout the history of humanity. Influential transformations in the administrative structure are the result of modern technological discoveries. The artificial intelligence (AI) revolution and algorithms now affect daily lives, communities, and government structures more than ever. Governments are the main coordinators of technological transition and supervisors of the activities of modern public administration systems. Hence, public administration and policies have crucial responsibilities in integrating, governing, and regulating AI technology. This article concentrates on the big questions of AI in the public administration and policy literature. The big questions discussion started by Robert Behn in 1995 draws attention to the big questions as the primary driving force of a public administration research agenda. The fundamental motivation of the big questions approach is shaped by the fact that “questions are as important as answers.” Integrating AI into public administration and the policy-making process allows numerous opportunities. However, AI technology also contains multiple threats and risks in economic, social, and even political structures in the long term. This article aims to identify big questions and discuss potential answers and solutions from an AI governance research agenda perspective.
... Machine learning, a subcategory of AI, is referred to as the process of implementing algorithms and recognizing patterns from the data to facilitate decision-making [13]. Decision-making examples include healthcare operational decisions [14] and decisions for risk forecasts [15,16]. As a subfield of machine learning, deep learning is typically represented by layered-structure algorithms, also known as artificial neural networks (ANN). ...
Article
Full-text available
Artificial Intelligence (AI)-based formulation development is a promising approach for facilitating the drug product development process. AI is a versatile tool that contains multiple algorithms that can be applied in various circumstances. Solid dosage forms, represented by tablets, capsules, powder, granules, etc., are among the most widely used administration methods. During the product development process, multiple factors including critical material attributes (CMAs) and processing parameters can affect product properties, such as dissolution rates, physical and chemical stabilities, particle size distribution, and the aerosol performance of the dry powder. However, the conventional trial-and-error approach for product development is inefficient, laborious, and time-consuming. AI has been recently recognized as an emerging and cutting-edge tool for pharmaceutical formulation development which has gained much attention. This review provides the following insights: (1) a general introduction of AI in the pharmaceutical sciences and principal guidance from the regulatory agencies, (2) approaches to generating a database for solid dosage formulations, (3) insight on data preparation and processing, (4) a brief introduction to and comparisons of AI algorithms, and (5) information on applications and case studies of AI as applied to solid dosage forms. In addition, the powerful technique known as deep learning-based image analytics will be discussed along with its pharmaceutical applications. By applying emerging AI technology, scientists and researchers can better understand and predict the properties of drug formulations to facilitate more efficient drug product development processes.
... Likewise, the use of ML and the procedures indicated above has interesting advantages over regressions. One of them is the possibility of working even in the presence of multicollinearity among the predictors, or of working with a large group of independent variables or "inputs", in line with the approach of [14]. This work states that, in the presence of predictors with little predictive power, which are normally eliminated from the models, the accuracy of the prediction can be significantly increased thanks to the aggregation of variables performed by ML. ...
Preprint
Full-text available
Intimate partner violence (IPV) is a pressing social issue which poses a challenge in terms of prevention, legal action, and reporting the abuse once it has occurred. In this last case, out of the total of female victims that file a complaint against their abuser and initiate the legal proceedings, a significant number withdraw from it for different reasons. In this field, it is interesting to detect the victims that disengage from the legal process so that professionals can intervene before it occurs. Previous studies have applied statistical models to use input variables and make a prediction of withdrawal. However, no study has been found in the literature that uses machine learning models to predict disengagement from the legal proceedings in IPV cases, which could be a better option for detecting these events with higher precision. Therefore, in this work, a novel application of machine learning techniques to predict the decision of victims of IPV to withdraw from prosecution is studied. For this purpose, three different ML algorithms have been optimized and tested with the original dataset to demonstrate the performance of ML models against non-linear input data. Once the best models have been obtained, explainable artificial intelligence (xAI) techniques have been applied to search for the most informative input features and reduce the original dataset to the most important variables. Finally, these results have been compared to those obtained in the previous work that used statistical techniques, and the set of most informative parameters has been combined with the variables of the previous study, showing that ML-based models have better predictive accuracy in all cases and that by adding one new variable to the previous work's subset, the accuracy to detect withdrawal improves by 7.5%.
... Increasingly, traditional research questions are being addressed with ML algorithms. These questions mainly center on policy prediction issues, but topics cover a wide range of policy areas: prediction of mortality (Kleinberg et al., 2015), targeting inspection in health policy (Kang et al., 2013), student academic performance (Halde et al., 2016), and attempted criminal parole (Berk et al., 2016). In the field of public management, there are few, but increasing numbers of studies exploring organizational reputation (Anastasopoulos & Whitford, 2018) and budget orientation (Anastasopoulos et al., 2020) with ML algorithms. ...
Article
The recent rapid development of artificial intelligence (AI) is expected to transform how governments work by enhancing the quality of decision-making. Despite rising expectations and the growing use of AI by governments, scholarly research on AI applications in public administration has lagged. In this study, we fill gaps in the current literature on the application of machine learning (ML) algorithms with a focus on revenue forecasting by local governments. Specifically, we explore how different ML models perform on predicting revenue for local governments and compare the relative performance of revenue forecasting by traditional forecasters and several ML algorithms. Our findings reveal that traditional statistical forecasting methods outperform ML algorithms overall, while one of the ML algorithms, KNN, is more effective in predicting property tax revenue. This result is particularly salient for public managers in local governments to handle foreseeable fiscal challenges through more accurate predictions of revenue.
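As a rough illustration of the KNN-versus-traditional-forecasting comparison, the sketch below fits a KNN regressor and a simple linear model to lagged values of a synthetic revenue series; the series, lag structure, and models are assumptions, not the study's data or methods.

```python
# Sketch: lagged-feature revenue forecasting with KNN vs. a linear model.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(6)
t = np.arange(120)
revenue = 100 + 0.8 * t + 10 * np.sin(t / 6) + rng.normal(scale=2, size=120)

# Build lagged features: predict the next period from the previous four periods.
lags = 4
X = np.column_stack([revenue[i:len(revenue) - lags + i] for i in range(lags)])
y = revenue[lags:]
X_tr, X_te, y_tr, y_te = X[:90], X[90:], y[:90], y[90:]

for name, model in [("KNN (k=5)", KNeighborsRegressor(n_neighbors=5)),
                    ("linear", LinearRegression())]:
    model.fit(X_tr, y_tr)
    err = mean_absolute_percentage_error(y_te, model.predict(X_te))
    print(f"{name}: MAPE = {err:.3f}")
```

On a trending series like this one, the linear model often extrapolates better, which is broadly consistent with the abstract's finding that traditional methods outperform ML overall; KNN tends to do better on repetitive, non-trending patterns.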
Article
Neural networks are a key component of formulation design, and the integration of artificial intelligence (AI) into drug development is revolutionizing the pharmaceutical industry. To solve the issues of cost, accuracy, and efficiency, AI-powered models—in particular, deep learning networks—are being used more and more to forecast and optimize medication compositions. Neural networks are capable of predicting solubility, stability, and bioavailability as well as suggesting optimal compositions by examining large datasets and identifying non-linear correlations between formulation components. The time required to produce new medications is greatly decreased by this methodology, which speeds up the conventional trial-and-error method. AI may also improve personalized medicine by customizing medication formulas to meet the demands of each patient. The use of neural networks in drug formulation is examined in this research, which also highlights recent developments, difficulties, and potential paths for AI-powered drug development.
Article
Background Domestic violence (DV) is a significant public health concern affecting the physical and mental well-being of numerous women, imposing a substantial health care burden. However, women facing DV often encounter barriers to seeking in-person help due to stigma, shame, and embarrassment. As a result, many survivors of DV turn to online health communities as a safe and anonymous space to share their experiences and seek support. Understanding the information needs of survivors of DV in online health communities through multiclass classification is crucial for providing timely and appropriate support. Objective The objective was to develop a fine-tuned large language model (LLM) that can provide fast and accurate predictions of the information needs of survivors of DV from their online posts, enabling health care professionals to offer timely and personalized assistance. Methods We collected 294 posts from Reddit subcommunities focused on DV shared by women aged ≥18 years who self-identified as experiencing intimate partner violence. We identified 8 types of information needs: shelters/DV centers/agencies; legal; childbearing; police; DV report procedure/documentation; safety planning; DV knowledge; and communication. Data augmentation was applied using GPT-3.5 to expand our dataset to 2216 samples by generating 1922 additional posts that imitated the existing data. We adopted a progressive training strategy to fine-tune GPT-3.5 for multiclass text classification using 2032 posts. We trained the model on 1 class at a time, monitoring performance closely. When suboptimal results were observed, we generated additional samples of the misclassified ones to give them more attention. We reserved 184 posts for internal testing and 74 for external validation. Model performance was evaluated using accuracy, recall, precision, and F1-score, along with CIs for each metric. Results Using 40 real posts and 144 artificial intelligence–generated posts as the test dataset, our model achieved an F1-score of 70.49% (95% CI 60.63%-80.35%) for real posts, outperforming the original GPT-3.5 and GPT-4, fine-tuned Llama 2-7B and Llama 3-8B, and long short-term memory. On artificial intelligence–generated posts, our model attained an F1-score of 84.58% (95% CI 80.38%-88.78%), surpassing all baselines. When tested on an external validation dataset (n=74), the model achieved an F1-score of 59.67% (95% CI 51.86%-67.49%), outperforming other models. Statistical analysis revealed that our model significantly outperformed the others in F1-score (P=.047 for real posts; P<.001 for external validation posts). Furthermore, our model was faster, taking 19.108 seconds for predictions versus 1150 seconds for manual assessment. Conclusions Our fine-tuned LLM can accurately and efficiently extract and identify DV-related information needs through multiclass classification from online posts. In addition, we used LLM-based data augmentation techniques to overcome the limitations of a relatively small and imbalanced dataset. By generating timely and accurate predictions, we can empower health care professionals to provide rapid and suitable assistance to survivors of DV.
Article
Data on incidents involving violence against women is becoming increasingly accessible, thanks in part to the Violence Against Women Act of 1994 and its reauthorizations. Technology facilitates the sharing of data and qualitative experiences in online forums and social media platforms, which has led to growing demand for analytical tools like data mining and machine learning algorithms to handle these large-scale data sources. This research note provides an overview of the application of big data techniques to research on violence against women and contributes to the discussion on ethical concerns for artificial intelligence in the violence against women research field.
Article
This study examined the use of machine learning in detecting deception among 210 individuals reporting homicides or missing persons to 911. The sample included an equal number of false allegation callers (FAC) and true report callers (TRC) identified through case adjudication. Independent coders, unaware of callers’ deception, analyzed each 911 call using 86 behavioral cues. Using the random forest model with k-fold cross-validation and repeated sampling, the study achieved an accuracy rate of 68.2% for all 911 calls, with sensitivity and specificity at 68.7% and 67.7%, respectively. For homicide reports, accuracy was higher at 71.2%, with a sensitivity of 77.3% but slightly lower specificity at 65.0%. In contrast, accuracy decreased to 61.4% for missing person reports, with a sensitivity of 49.1% and notably higher specificity at 73.6%. Beyond accuracy, key cues distinguishing FACs from TRCs were identified and included cues like “Blames others,” “Is self-dramatizing,” and “Is uncertain and insecure.”
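For readers unfamiliar with the evaluation design described above, the following is an illustrative sketch only: a random forest scored with repeated k-fold cross-validation, accumulating sensitivity and specificity across folds. The feature matrix is simulated to mimic 210 calls coded on 86 behavioral cues; it is not the study's data.

```python
# Illustrative sketch: random forest + repeated k-fold CV, reporting
# accuracy, sensitivity, and specificity. Data are synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.metrics import confusion_matrix

# Stand-in for 210 calls coded on 86 behavioral cues (half FAC, half TRC).
X, y = make_classification(n_samples=210, n_features=86, n_informative=12,
                           weights=[0.5, 0.5], random_state=0)

clf = RandomForestClassifier(n_estimators=500, random_state=0)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)

tp = tn = fp = fn = 0
for train_idx, test_idx in cv.split(X, y):
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    c = confusion_matrix(y[test_idx], pred, labels=[0, 1])
    tn += c[0, 0]; fp += c[0, 1]; fn += c[1, 0]; tp += c[1, 1]

sensitivity = tp / (tp + fn)   # true-positive rate (flagging deceptive callers)
specificity = tn / (tn + fp)   # true-negative rate
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```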
Article
We evaluate the impacts of adopting algorithmic risk assessments in sentencing. We find that judges changed sentencing practices in response to the risk assessment, but that discretion played a large role in mediating its impact. Judges deviated from the recommendations associated with the algorithm in systematic ways, suggestive of alternative objectives. As a result, risk assessment did not lead to detectable gains in terms of public safety or reduced incarceration rates. Using simulations, we show that strict adherence to the sentencing recommendations associated with the algorithm would have had benefits (less incarceration) but also some costs (increased sentences for youth). (JEL D81, D91, H76, K41, K42)
Book
Full-text available
Discussing social media-related scholarship found in criminology, legal studies, policing, courts, corrections, victimization, and crime prevention, this book presents the current state of our knowledge on the impact of social media and the major sociological frameworks employed to study the U.S. justice system. Building a theoretical framework for the study of social media and criminal justice in each chapter, the chapters provide a systematic reflection of extant research on social media in cybercrime, operations of courts, administration of institutional and community corrections, law enforcement, and crime prevention. The book fills the gap between the contemporary state of knowledge regarding social media and criminal justice with respect to both empirical evidence and types of sociological frameworks being employed to explore and identify the societal costs and benefits of our growing dependence upon social media. In addition to providing an up-to-date overview of our current state of knowledge, this book highlights important areas of future research, wherein the benefits of social media can be expanded and the negative aspects of its broadening use can be minimized. Social Media and Criminal Justice will be of interest to students, scholars and practitioners in the areas of judicial administration, corrections management, law enforcement, and criminal justice-engaged community-based nonprofit organizations involved in court-referred treatment and/or active collaboration with local law enforcement agencies.
Article
Full-text available
We explore the feasibility of using machine learning on a police dataset to forecast domestic homicides. Existing forecasting instruments based on conventional statistical methods focus on non-fatal revictimization, produce outputs with limited predictive validity, or both. We implement a "super learner," a machine learning paradigm that combines roughly a dozen machine learning models to achieve higher recall and AUC than any single model used alone. We purposely incorporate police records only, rather than multiple data sources, to illustrate the practical utility of the super learner, as additional datasets are often unavailable due to confidentiality considerations. Using London Metropolitan Police Service data, our model outperforms all extant domestic homicide forecasting tools: the super learner detects 77.64% of homicides, with a precision score of 18.61% and a 71.04% Area Under the Curve (AUC), which, collectively and severally, are assessed as "excellent." Implications for theory, research, and practice are discussed.
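The "super learner" above is an ensemble that lets a meta-learner weight the predictions of several base learners. As a hedged sketch of that idea, the snippet below uses scikit-learn's StackingClassifier as a rough stand-in (not the authors' implementation, which combined roughly a dozen learners on police records) and scores recall and AUC on synthetic, rare-outcome data.

```python
# Sketch of a stacked ("super learner"-style) ensemble scored on recall and AUC.
# Data are synthetic; the base learners are arbitrary illustrative choices.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score, roc_auc_score

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.97, 0.03], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

super_learner = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=300, random_state=1)),
                ("gbm", GradientBoostingClassifier(random_state=1)),
                ("lr", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba", cv=5)

super_learner.fit(X_tr, y_tr)
prob = super_learner.predict_proba(X_te)[:, 1]
pred = (prob >= 0.1).astype(int)   # low threshold to favor recall on a rare outcome
print("recall:", recall_score(y_te, pred), "AUC:", roc_auc_score(y_te, prob))
```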
Article
Full-text available
Domestic violence against women is prevalent in Liberia, with nearly half of women reporting physical violence. However, research on the biosocial factors contributing to this issue remains limited. This study aims to predict women's vulnerability to domestic violence using a machine learning approach, leveraging data from the Liberian Demographic and Health Survey (LDHS) conducted in 2019–2020. We employed seven machine learning algorithms to achieve this goal, including ANN, KNN, RF, DT, XGBoost, LightGBM, and CatBoost. Our analysis revealed that the LightGBM and RF models achieved the highest accuracy in predicting women's vulnerability to domestic violence in Liberia, with 81% and 82% accuracy rates, respectively. One of the key features identified across multiple algorithms was the number of people who had experienced emotional violence. These findings offer important insights into the underlying characteristics and risk factors associated with domestic violence against women in Liberia. By utilizing machine learning techniques, we can better predict and understand this complex issue, ultimately contributing to the development of more effective prevention and intervention strategies.
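The workflow described above amounts to comparing several classifiers on tabular survey features and then inspecting which features drive the best model. The sketch below illustrates that pattern under stated assumptions: the feature names are hypothetical stand-ins for LDHS variables, the data are synthetic, and only scikit-learn estimators are used rather than the full set of seven algorithms.

```python
# Sketch: compare a few classifiers by cross-validated accuracy, then rank
# the random forest's feature importances. Feature names are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

feature_names = [f"survey_item_{i}" for i in range(15)]  # hypothetical LDHS-style variables
X, y = make_classification(n_samples=4000, n_features=15, n_informative=6, random_state=7)

models = {"RF": RandomForestClassifier(n_estimators=400, random_state=7),
          "GBM": GradientBoostingClassifier(random_state=7),
          "KNN": KNeighborsClassifier(n_neighbors=15)}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")

rf = models["RF"].fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:5]
print("top features:", [feature_names[i] for i in top])
```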
Article
Algorithmic risk assessments are being deployed in an increasingly broad spectrum of domains including banking, medicine, and law enforcement. However, there is widespread concern about their fairness and trustworthiness, and people are also known to display algorithm aversion, preferring human assessments even when they are quantitatively worse. Thus, how does the framing of who made an assessment affect how people perceive its fairness? We investigate whether individual algorithmic assessments are perceived to be more or less accurate, fair, and interpretable than identical human assessments, and explore how these perceptions change when assessments are obviously biased against a subgroup. To this end, we conducted an online experiment that manipulated how biased risk assessments are in a loan repayment task, and reported the assessments as being made either by a statistical model or a human analyst. We find that predictions made by the model are consistently perceived as less fair and less interpretable than those made by the analyst despite being identical. Furthermore, biased predictive errors were more likely to widen this perception gap, with the algorithm being judged even more harshly for making a biased mistake. Our results illustrate that who makes risk assessments can influence perceptions of how acceptable those assessments are - even if they are identically accurate and identically biased against subgroups. Additional work is needed to determine whether and how decision aids should be presented to stakeholders so that the inherent fairness and interpretability of their recommendations, rather than their framing, determines how they are perceived.
Article
In New Mexico and many other jurisdictions, judges may detain defendants pretrial if the prosecutor proves, through clear and convincing evidence, that releasing them would pose a danger to the public. However, some policymakers argue that certain classes of defendants should have a “rebuttable presumption” of dangerousness, shifting the burden of proof to the defense. Using data on over 15,000 felony defendants who were released pretrial in a 4‐year period in New Mexico, we measure how many of them would have been detained by various presumptions, and what fraction of these defendants in fact posed a danger in the sense that they were charged with a new crime during pretrial supervision. We consider presumptions based on the current charge, past convictions, past failures to appear, past violations of conditions of release, and combinations of these drawn from recent legislative proposals. We find that for all these criteria, at most 8% of the defendants they identify are charged pretrial with a new violent crime (felony or misdemeanor), and at most 5% are charged with a new violent felony. The false‐positive rate, that is, the fraction of defendants these policies would detain who are not charged with any new crime pretrial, ranges from 71% to 90%. The broadest legislative proposals, such as detaining all defendants charged with a violent felony, are little more accurate than detaining a random sample of defendants released under the current system, and would jail 20 or more people to prevent a single violent felony. We also consider detention recommendations based on risk scores from the Arnold Public Safety Assessment (PSA). Among released defendants with the highest risk score and the “violence flag,” 7% are charged with a new violent felony and 71% are false positives. We conclude that these criteria for rebuttable presumptions do not accurately target dangerous defendants: they cast wide nets and recommend detention for many pretrial defendants who do not pose a danger to the public.
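The false-positive reasoning in the abstract above is simple arithmetic over the group a presumption would detain. The sketch below walks through that calculation with illustrative counts chosen only to mirror the reported ranges; they are not the New Mexico figures.

```python
# Back-of-the-envelope sketch of the false-positive calculation. Counts are illustrative.
detained_by_rule = 1000          # released defendants the presumption would have detained
charged_new_violent_felony = 50  # of those, charged pretrial with a new violent felony
charged_any_new_crime = 250      # of those, charged pretrial with any new crime

violent_felony_rate = charged_new_violent_felony / detained_by_rule
false_positive_rate = (detained_by_rule - charged_any_new_crime) / detained_by_rule
detained_per_felony_prevented = detained_by_rule / charged_new_violent_felony

print(f"violent-felony rate among detained: {violent_felony_rate:.0%}")      # 5%
print(f"false-positive rate: {false_positive_rate:.0%}")                     # 75%
print(f"people detained per violent felony averted: {detained_per_felony_prevented:.0f}")  # 20
```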
Article
This research explores the potential of supervised machine learning models to support the decision-making process in demobilizing ex-combatants in the peace process in Colombia. Recent works apply machine learning in analyzing crime and national security; however, there are no previous studies in the specific contexts of demobilization in an armed conflict. Therefore, the present paper makes a significant contribution by training and evaluating four machine learning models, using a database composed of 52,139 individuals and 21 variables. From the obtained results, it was possible to conclude that the XGBoost algorithm is the most suitable for predicting the future status of an ex-combatant. The XGBoost presented an AUC score of 0.964 in the cross-validation stage and an AUC of 0.952 in the test stage, evidencing the high reliability of the model.
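As a hedged sketch of the evaluation pattern above (a cross-validated AUC followed by a held-out test AUC), the snippet below fits an XGBoost classifier on synthetic tabular data. It assumes the third-party xgboost package is installed; the data and hyperparameters are illustrative, not those of the demobilization study.

```python
# Sketch: XGBoost with 5-fold cross-validated AUC and a held-out test AUC. Synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier  # assumes the xgboost package is available

X, y = make_classification(n_samples=20000, n_features=21, n_informative=10, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=3)

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                      eval_metric="logloss", random_state=3)
cv_auc = cross_val_score(model, X_tr, y_tr, cv=5, scoring="roc_auc").mean()
model.fit(X_tr, y_tr)
test_auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"cross-validated AUC = {cv_auc:.3f}, test AUC = {test_auc:.3f}")
```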
Conference Paper
Machine learning applications related to high-stakes decisions are often surrounded by significant amounts of controversy. This has led to increasing interest in interpretable machine learning models. A well-known class of interpretable models is that of decision trees (DTs), which mirror a common strategy used by humans to arrive at solutions through a series of well-defined decisions. However, much of previous research on DTs for criminal justice predictions has focused primarily on collections (ensembles) of DTs whose results are aggregated together. Such DT ensembles are used to help improve accuracy; however, their increased complexity and deviation from human decision-making processes makes them much less interpretable compared to single-DT approaches. In this paper, we present a new DT model for criminal recidivism prediction that is designed with high interpretability, accuracy, and fairness as core objectives. The interpretability of the model stems from its formulation in terms of a single DT structure, while accuracy is achieved through an intensive optimization process of DT parameters that is carried out using a novel evolutionary algorithm. Through extensive experiments, we analyze the performance of our proposed EADTC (Evolutionary Algorithm Decision Tree for Crime prediction) method on relevant datasets. Our experiments show that the EADTC approach achieves competitive accuracy and fairness with respect to state-of-the-art ensemble DT models, while achieving higher interpretability due to the simpler, single-DT structure.
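The interpretability argument above rests on the fact that a single, shallow tree can be read as a short list of rules. The sketch below is not the EADTC algorithm (which tunes the tree with an evolutionary search); it only illustrates the single-tree idea on synthetic data, with hypothetical feature names, by printing the fitted rules.

```python
# Sketch: a shallow single decision tree whose rules can be printed and read directly.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=3000, n_features=8, n_informative=5, random_state=5)
feature_names = [f"x{i}" for i in range(8)]  # hypothetical recidivism predictors

tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=100, random_state=5)
tree.fit(X, y)

# Every prediction can be traced through at most three human-readable splits.
print(export_text(tree, feature_names=feature_names))
```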
Article
How are children affected when states prohibit child welfare agencies from discriminating against same‐sex couples who wish to foster or adopt? This question stands at the heart of a debate between governments that seek to impose such antidiscrimination requirements and child welfare agencies that challenge them on religious freedom grounds. Yet until now there has been no reliable evidence on whether and how antidiscrimination rules for these agencies impact children. We have conducted the first nationwide study of how child outcomes vary when states adopt such antidiscrimination rules for child welfare agencies. Analyzing 20 years of child welfare data (2000–2019), we estimate that state antidiscrimination rules both (1) modestly increase children's success at finding foster and permanent homes, and (2) greatly reduce the average time to place children in such homes. These effects vary among subgroups, such that children who are most likely to find a home are generally not affected by state antidiscrimination requirements, whereas children who are least likely to find a home (primarily older children and children with various disabilities) benefit substantially from antidiscrimination measures. We estimate that the effect of antidiscrimination rules is equivalent to 15,525 additional children finding permanent homes and 360,000 additional children finding foster homes, nationwide, over a period of 20 years. Overall, the project offers two key contributions: First, it provides empirical grounding for some of the most heated constitutional and political battles of the culture wars. Second, it advances empirical legal studies by bringing machine learning causal inference to law.
Article
Machine learning algorithms are becoming ubiquitous in modern life. When used to help inform human decision making, they have been criticized by some for insufficient accuracy, an absence of transparency, and unfairness. Many of these concerns can be legitimate, although they are less convincing when compared with the uneven quality of human decisions. There is now a large literature in statistics and computer science offering a range of proposed improvements. In this article, we focus on machine learning algorithms used to forecast risk, such as those employed by judges to anticipate a convicted offender's future dangerousness and by physicians to help formulate a medical prognosis or ration scarce medical care. We review a variety of conceptual, technical, and practical features common to risk algorithms and offer suggestions for how their development and use might be meaningfully advanced. Fairness concerns are emphasized.
Article
Full-text available
When threats to public safety are a factor in sentencing decisions, forecasts of "future dangerousness" are necessarily being made. Sometimes the forecasts are effectively mandatory. Federal judges, for example, are required to assess risk in every case. Under 18 U.S.C. § 3553(a)(2)(C), "[t]he court, in determining the particular sentence to be imposed, shall consider...(2) the need for the sentence imposed...(C) to protect the public from further crimes of the defendant..." A judge must look into the future, determine the likelihood and seriousness of criminal behavior, and within certain bounds, sentence to minimize the harm that could result. Ideally, the forecasts should be highly accurate. They also should be derived from procedures that are practical, transparent, and sensitive to the consequences of forecasting errors. However, there is usually no compelling guidance on precisely how these goals can best be achieved. Subjective judgment, sometimes called "clinical judgment," is an approach that relies on intuition guided by experience. As discussed below, the resulting risk assessments are often wildly inaccurate and their rationale opaque. "Actuarial" methods depend on data that allow one to link "risk factors" to various outcomes of interest. The associations found can then be used to forecast those outcomes when they are not known. Over the past several decades, regression statistical procedures have dominated the actuarial determination of empirically based risk factors. By and large, this enterprise has been a success. But the increasing availability of very large datasets coupled with new data analysis tools promise dramatically better success in the future. Machine learning will be a dominant statistical driver. There is now a substantial and compelling literature in statistics and computer science showing that machine learning statistical procedures will forecast at least as accurately, and typically more accurately, than older approaches commonly derived from various forms of regression analysis.
Article
Full-text available
As drug arrests and jail overcrowding added pressure to increase pretrial release in localities during the 1980s and 1990s, the need to manage a larger and higher-risk pretrial population of defendants awaiting adjudication in the community became a high priority for justice agencies. In the late 1990s Philadelphia officials sought to discover the ingredients of a successful supervision strategy through four interlinked field experiments to provide an empirical basis for a major reform of the pretrial release system. The results of the linked randomized experiments question common assumptions about “supervision,” its impact and effectiveness, about the underlying nature of the noncompliant defendant, and deterrence implications. The study emphasizes the importance of interpreting the findings in the context of implementation of the policy reform. Findings suggest that facilitative notification strategies wield little influence on defendant behavior and that deterrent aims are undermined by the system's failure to deliver consequences for defendant noncompliance during pretrial release. The most significant contribution of the article is its illustration of a major evidence-based policy reform undertaken by a major court system.
Article
Full-text available
This article examines the effectiveness of using different kinds of written reminders to reduce misdemeanor defendants' failure-to-appear (FTA) rates. A subset of defendants was surveyed after their scheduled court date to assess their perceptions of procedural justice and trust and confidence in the courts. Reminders reduced FTA overall, and more substantive reminders (e.g., with information on the negative consequences of FTA) were more effective than a simple reminder. FTA varied depending on several offense and offender characteristics, such as geographic location (urban vs. rural), type of offense, and number of offenses. The reminders were somewhat more effective for Whites and Hispanics than for Blacks. Defendants with higher institutional confidence and those who felt they had been treated more fairly by the criminal justice system were more likely to appear, though the effectiveness of the reminder was greatest among misdemeanants with low levels of trust in the courts. The implications for public policy and pretrial services are discussed.
Article
THE MISSION OF the Office of the Federal Detention Trustee (OFDT) is to manage and regulate the federal detention programs and the Justice Prisoner and Alien Transportation System (JPATS) by establishing a secure and effective operating environment that drives efficient and fair expenditure of appropriated funds. One of the primary responsibilities of OFDT is to review existing detention practices and develop alternatives to improve mission efficiency and cost effectiveness. OFDT and the entire justice system recognize that in some cases the most operationally efficient and cost-effective utilization of funds involves the use of alternatives to secured detention for certain defendants awaiting trial. The Department of Justice (acting through the US Marshals Service and OFDT) provides the Federal Judiciary with supplemental funding to support alternatives to pretrial detention. Alternatives to pretrial detention include, but are not limited to, third-party custodian, substance abuse testing, substance abuse treatment, location monitoring, halfway house, community housing or shelter, mental health treatment, sex offender treatment, and computer monitoring. Pretrial services agencies can recommend any of these alternatives to detention as conditions of pretrial release, and the judicial officer can set one or more of the alternatives to detention as conditions of bail in lieu of secured detention. Consistent with the mission of OFDT, the current study was sponsored by OFDT with support from the Administrative Office of the US Courts. The purpose of this research effort was twofold: identify statistically significant and policy-relevant predictors of pretrial outcome to identify federal criminal defendants who are most suited for pretrial release without jeopardizing the integrity of the judicial process or the safety of the community, in particular release predicated on participation in an alternatives-to-detention program; and develop recommendations for the use of OFDT funding that supports the Federal Judiciary's alternatives-to-detention program. The study employed data provided by the Administrative Office of the US Courts, Office of Probation and Pretrial Services (OPPS) that described all persons charged with criminal offenses in the federal courts between October 1, 2001 and September 30, 2007 who were processed by the federal pretrial services system (N=565,178). All federal districts, with the exception of the District of Columbia, were represented in the study. The research included six primary objectives: (1) identify statistically significant and policy-relevant predictors of the pretrial risk of federal criminal defendants, and develop a classification scheme to scale the risk persons arrested for federal criminal offenses pose if released pending trial; the risk classification scheme should allow for the future development of an instrument that federal pretrial services officers could use to assess the risk of individual criminal defendants; (2) examine persons charged with federal criminal offenses over the past seven (7) years, assess how the average pretrial risk level of federal criminal defendants has changed, and assess whether the change in the average risk level has resulted in changes in the pretrial release/detention rate and pretrial failure rate; (3) examine defendants released pending trial with the condition of participation in an alternative to detention, identify the level of pretrial risk these defendants pose, and, controlling for risk level, assess whether participation in an alternative to detention mitigated the risk of pretrial failure; (4) assess the efficacy of the alternatives-to-detention program at reducing federal criminal justice costs, particularly costs associated with pretrial secured detention, and identify a population most suited, both programmatically and economically, for pretrial release with conditions of alternatives to detention; (5) examine how federal pretrial services currently assesses the pretrial risk federal criminal defendants pose and the effectiveness of those practices in reducing unwarranted detention and preventing failures to appear and danger to the community while pending trial; and (6) identify best practices relating to the determination of pretrial risk and recommendations to release or detain a defendant pending trial, particularly as they relate to the assessment of pretrial risk and the administration of the alternatives-to-detention program.
Article
Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today’s most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing.
Article
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ∗∗∗, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
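The "internal estimates" described in the abstract above include the out-of-bag (OOB) error that a random forest computes as it trains and measures of variable importance. The sketch below illustrates both with scikit-learn rather than the original implementation; the data are synthetic.

```python
# Sketch: out-of-bag error estimate and permutation-style variable importance
# for a random forest. Synthetic data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=2000, n_features=10, n_informative=4, random_state=2)

rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=2)
rf.fit(X, y)
print("OOB accuracy (internal error estimate):", round(rf.oob_score_, 3))

imp = permutation_importance(rf, X, y, n_repeats=10, random_state=2)
print("variable importances:", imp.importances_mean.round(3))
```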
Article
Research Summary A substantial and powerful literature in statistics and computer science has clearly demonstrated that modern machine learning procedures can forecast more accurately than conventional parametric statistical models such as logistic regression. Yet, several recent studies have claimed that for criminal justice applications, forecasting accuracy is about the same. In this article, we address the apparent contradiction. Forecasting accuracy will depend on the complexity of the decision boundary. When that boundary is simple, most forecasting tools will have similar accuracy. When that boundary is complex, procedures such as machine learning, which proceed adaptively from the data, will improve forecasting accuracy, sometimes dramatically. Machine learning has other benefits as well, and effective software is readily available. Policy Implications The complexity of the decision boundary will in practice be unknown, and there can be substantial risks to gambling on simplicity. Criminal justice decision makers and other stakeholders can be seriously misled with rippling effects going well beyond the immediate offender. There seems to be no reason for continuing to rely on traditional forecasting tools such as logistic regression.
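The decision-boundary point above can be made concrete with a small, purely illustrative experiment (not drawn from the article's data): on a nearly linear boundary the two methods score about the same, while on a strongly non-linear boundary the adaptive learner pulls ahead.

```python
# Sketch: logistic regression vs. random forest on a simple and a complex boundary.
from sklearn.datasets import make_classification, make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

simple_X, simple_y = make_classification(n_samples=2000, n_features=5, n_informative=2,
                                         n_clusters_per_class=1, random_state=4)
complex_X, complex_y = make_moons(n_samples=2000, noise=0.25, random_state=4)

for label, (X, y) in {"simple boundary": (simple_X, simple_y),
                      "complex boundary": (complex_X, complex_y)}.items():
    lr = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
    rf = cross_val_score(RandomForestClassifier(n_estimators=300, random_state=4), X, y, cv=5).mean()
    print(f"{label}: logistic={lr:.3f}  random forest={rf:.3f}")
```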
Article
Concern over the injustice of the money bail system led the founders of the Vera Institute of Justice to design and implement the Manhattan Bail Project in 1961. The Project demonstrated that people with strong ties to the community could be safely released from custody without bail merely on their promise to return to court—called release on recognizance. Federal, state, and local officials should be encouraged to examine their systems and implement a more just, more rational, and less costly system of ensuring appearance and protecting public safety while those charged but presumed innocent await the disposition of the charges. Toward that end, Attorney General Eric Holder convened a national conference on bail and criminal justice in June 2011 that presented another opportunity to realize the Manhattan Bail Project's mission: bringing pretrial justice to the significant proportion of impoverished defendants brought before the criminal courts.
Article
Statistically based risk assessment devices are widely used in criminal justice settings. Their promise remains largely unfulfilled, however, because assumptions and premises requisite to their development and application are routinely ignored and/or violated. This article provides a brief review of the most salient of these assumptions and premises, addressing the base rate and selection ratios, methods of combining predictor variables and the nature of criterion variables chosen, cross-validation, replicability, and generalizability. The article also discusses decision makers’ choices to add or delete items from the instruments and suggests recommendations for policy makers to consider when adopting risk assessments. Suggestions for improved practice, practical and methodological, are made.
Article
Objectives Recent legislation in Pennsylvania mandates that forecasts of "future dangerousness" be provided to judges when sentences are given. Similar requirements already exist in other jurisdictions. Research has shown that machine learning can lead to usefully accurate forecasts of criminal behavior in such settings. But there are settings in which there is insufficient IT infrastructure to support machine learning. The intent of this paper is to provide a prototype procedure for making forecasts of future dangerousness that could be used to inform sentencing decisions when machine learning is not practical. We consider how classification trees can be improved so that they may provide an acceptable second choice. Methods We apply a version of classification trees available in R, with some technical enhancements to improve tree stability. Our approach is illustrated with real data that could be used to inform sentencing decisions. Results Modest-sized trees grown from large samples can forecast well and in a stable fashion, especially if the small fraction of indecisive classifications are found and accounted for in a systematic manner. But machine learning is still to be preferred when practical. Conclusions Our enhanced version of classification trees may well provide a viable alternative to machine learning when machine learning is beyond local IT capabilities.
Article
The Bail Reform Act of 1984 changed the law dictating release and detention decisions in federal court. Since its passage, few studies have examined judicial decision-making in this context. Legal research enables us to account for the structure and interpretation of federal detention laws and to analyze previously neglected measures of legal factors in our analyses. We use US Sentencing Commission data on a sample of defendants who were sentenced in 2007 (N = 31,043). We find that legal factors—particularly length of criminal history, having committed a violent or otherwise serious offense, and having committed the offense while under supervision of the criminal justice system—have the strongest relationships with the presentence detention outcome. A defendant’s age, race, and ethnicity have weaker relationships with detention. When we compare defendants who are similarly situated with respect to legal factors, the probability of detention is similar regardless of age, race, and ethnicity.
Article
Recent studies of sentencing under the Federal Sentencing Guidelines suggest that unwarranted disparity has been reduced, but not eliminated. A number of studies conclude that legally irrelevant variables, including race/ethnicity and gender, continue to affect the sentences imposed on federal offenders. Research conducted at the state level also reveals that offender characteristics interact to create a punishment penalty for young, unemployed black and Hispanic male offenders. Our study builds on this research. Using data on drug offenders sentenced in three U.S. District Courts, we test for direct, indirect, and interactive effects of race and ethnicity on sentence severity. We find that gender, age, and employment status, but not race/ethnicity, have direct effects on sentencing, and that the effects of gender and employment status are conditioned by race/ethnicity. On the other hand, our results provide no support for our hypotheses that young black and Hispanic males and unemployed black and Hispanic...
Article
Durkheimian, Marxist, and Weberian theories provide contrasting views of the influences of the social structure of areas and communities on law and the legal process. In light of these theories, we examine how various aspects of community social structure differentially affect criminal punishments administered to whites and nonwhites. Using county-level data from the state of Washington, we regress white and nonwhite rates of imprisonment on measures of crime and arrest rates, county social structure, and court workload. This analysis indicates that nonwhites—but not whites—are particularly likely to be sentenced to prison in urbanized counties and in counties with relatively large minority populations. We conclude by presenting material from interviews with justice officials which sheds light on the perceptual and political processes that link structural conditions to patterns of criminal punishment.
Article
This study examines the relative effects of a number of legal and extralegal factors on (1) the decision to release on recognizance and (2) the decision on amount of money bail. Social science research on these issues has been sparse compared to that on other phases of the criminal justice process. Findings from a regression analysis show that the first step of the bond disposition process, the recognizance decision, is influenced by several factors. The demeanor of defendants in open court is the most important. Net of other influences, good demeanor increases the probability of release on recognizance by 34.8%. In cases where recognizance is denied, only two variables are related significantly to the amount of money bond. Net of other influences, a felony offense (as opposed to a misdemeanor) increases predicted bail by $2,300, and poor demeanor increases the predicted bail required by $1,600.
Article
The present study uses data on the processing of felony defendants in large urban courts to examine Hispanic, black, and white differences at the pretrial release stage. The major finding is that Hispanic defendants are more likely to be detained than white and black defendants. And, racial/ethnic differences are most pronounced in drug cases. In fact, Hispanic defendants suffer a triple burden at the pretrial release stage as they are the group most likely to be required to pay bail to gain release, the group that receives the highest bail amounts, and the group least able to pay bail. These findings are consistent with a focal concerns perspective of criminal case processing that suggests Hispanics as a newly immigrated group are especially prone to harsher treatment in the criminal case process.
Article
In this paper, we compare the results from a randomized clinical trial to the results from a regression discontinuity quasi-experiment when both designs are implemented in the same setting. We find that the results from the two approaches are effectively identical. We attribute the comparability in part to recent statistical developments that make the model required for the analysis of data from a regression discontinuity design easier to determine. These developments make an already strong quasi-experimental design even stronger. KeywordsQuasi-experiment-Randomized clinical trial-Regression discontinuity design
Article
In this paper, we attempt to forecast which prison inmates are likely to engage in very serious misconduct while incarcerated. Such misconduct would usually be a major felony if committed outside of prison: drug trafficking, assault, rape, attempted murder and other crimes. The binary response variable is problematic because it is highly unbalanced. Using data from nearly 10,000 inmates held in facilities operated by the California Department of Corrections, we show that several popular classification procedures do no better than the marginal distribution unless the data are weighted in a fashion that compensates for the lack of balance. Then, random forests performs reasonably well, and better than CART or logistic regression. Although less than 3% of the inmates studied over 24 months were reported for very serious misconduct, we are able to correctly forecast such behavior about half the time.
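The weighting idea above can be sketched as follows (a hedged illustration on synthetic data, not the California corrections data): with roughly 3% positives, an unweighted classifier can score well by rarely predicting the rare class, whereas class weights make false negatives costlier during tree growing and push the forest to flag more of the rare cases.

```python
# Sketch: unweighted vs. cost-weighted random forest on a badly unbalanced outcome.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=10000, n_features=20, weights=[0.97, 0.03], random_state=6)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=6)

for label, weight in [("unweighted", None), ("cost-weighted", {0: 1, 1: 20})]:
    rf = RandomForestClassifier(n_estimators=300, class_weight=weight, random_state=6)
    rf.fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, rf.predict(X_te)).ravel()
    print(f"{label}: tn={tn} fp={fp} fn={fn} tp={tp}")
```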
Book
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Article
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.
Article
Forecasts of future dangerousness are often used to inform the sentencing decisions of convicted offenders. For individuals who are sentenced to probation or paroled to community supervision, such forecasts affect the conditions under which they are to be supervised. The statistical criterion for these forecasts is commonly called recidivism, which is defined as a charge or conviction for any new offence, no matter how minor. Only rarely do such forecasts make distinctions on the basis of the seriousness of offences. Yet seriousness may be central to public concerns, and judges are increasingly required by law and sentencing guidelines to make assessments of seriousness. At the very least, information about seriousness is essential for allocating scarce resources for community supervision of convicted offenders. The paper focuses only on murderous conduct by individuals on probation or parole. Using data on a population of over 60,000 cases from Philadelphia's Adult Probation and Parole Department, we forecast whether each offender will be charged with a homicide or attempted homicide within 2 years of beginning community supervision. We use a statistical learning approach that makes no assumptions about how predictors are related to the outcome. We also build in the costs of false negative and false positive charges and use half of the data to build the forecasting model, and the other half of the data to evaluate the quality of the forecasts. Forecasts that are based on this approach offer the possibility of concentrating rehabilitation, treatment and surveillance resources on a small subset of convicted offenders who may be in greatest need, and who pose the greatest risk to society.
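One simple way to see what "building in" asymmetric error costs means is through the decision threshold, shown in the sketch below. This is an assumption-laden stand-in for the paper's procedure (which builds the costs into the learning itself): if a missed homicide (false negative) is treated as, say, ten times more costly than a false alarm, the probability threshold for flagging a case drops from 0.5 toward cost_fp / (cost_fp + cost_fn).

```python
# Sketch: cost-sensitive decision threshold. Cost values are illustrative only.
cost_false_negative = 10.0   # failing to flag someone later charged with homicide
cost_false_positive = 1.0    # flagging someone who is not

# Predict "high risk" when p > cost_fp / (cost_fp + cost_fn), the cost-minimizing rule.
threshold = cost_false_positive / (cost_false_positive + cost_false_negative)
print(f"decision threshold on the predicted probability: {threshold:.3f}")  # ~0.091
```

With a 10-to-1 cost ratio, any case whose predicted probability exceeds roughly 0.09 would be flagged for intensive supervision, rather than requiring a probability above 0.5.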
Article
In this paper, we report on the development of a short screening tool that deputies in the Los Angeles Sheriff's Department could use in the field to help forecast domestic violence incidents in particular households. The data come from over 500 households to which sheriff's deputies were dispatched in the fall of 2003. Information on potential predictors was collected at the scene. Outcomes were measured during a three month follow-up. The data were analyzed with modern data mining procedures in which true forecasts were evaluated. A screening instrument was then developed based on a small fraction of the information collected. Making the screening instrument more complicated did not improve forecasting skill. Taking the relative costs of false positives and false negatives into account, the instrument correctly forecasted future calls for service about 60% of the time. Future calls involving domestic violence misdemeanors and felonies were correctly forecast about 50% of the time. The 50% figure is especially important because such calls require a law enforcement response and yet are a relatively small fraction of all domestic violence calls for service. A number of broader policy implications follow. It is feasible to construct a quick-response, domestic violence screener that is practical to deploy and that can forecast with useful skill. More informed decisions by police officers in the field can follow. Although the same kinds of predictors are likely to be effective in a wide variety of jurisdictions, the particular indicators selected will vary in response to local demographics and the local costs of forecasting errors. It is also feasible to evaluate such quick-response threat assessment tools for their forecasting accuracy. But, the costs of forecasting errors must be taken into account. Also, when the data used to build the forecasting instrument are also used to evaluate its accuracy, inflated estimates of forecasting skill are likely.
Article
The setting of bond in a first appearance court in one southeastern judicial district was examined to determine its relationship with official standards based on the recommendations of the American Bar Association advisory committee on standards for criminal justice and the National Advisory Commission on Criminal Justice Standards and Goals. Eighteen measures of five different recommended standards were considered. Only seriousness of charge showed apparent strength in its relationship with bond. The authors suggest a "facility hypothesis": that court officials gravitated toward factors, such as seriousness of charge, that may be readily processed and understood within constraints of time and organization. As added support for this hypothesis, defendants' demeanor in court is also shown to be significantly related to bond in the present study. These legal and personal criteria may be more readily identified as indicators of defendants' culpability than many other considerations recommended by the study commissions. Use of other official recommendations may require changes in the concepts of defendants held by court personnel or drastic changes in the organization of first appearance in court.
The Bail Reform Act of 1984
  • D N Adair
Adair, D. N. (2006) The Bail Reform Act of 1984. Washington, DC: Federal Judicial Center.
Wrongful Convictions: A New Exoneration Registry Tests Stubborn Judges
  • Cohen A.
Cohen, A. (2012) "Wrongful Convictions: A New Exoneration Registry Tests Stubborn Judges," The Atlantic, May 21.
Bail Decisions: Research Summary
  • L Devers
Devers, L. (2011) Bail Decisions: Research Summary. Washington, DC: Bureau of Justice Assistance, U.S. Department of Justice.
Developing a National Model for Pretrial Risk Assessment, research summary from the Laura and John Arnold Foundation
  • Arnold Foundation
Arnold Foundation (2013) "Developing a National Model for Pretrial Risk Assessment," research summary from the Laura and John Arnold Foundation. Available at: www.arnoldfoundation.org.
The Bail Reform Act of 1984
Federal Judicial Center (1993) The Bail Reform Act of 1984, 2d ed. Washington, DC: Federal Judicial Center.
Introduction to the Manhattan Bail Project
  • J E McElroy
McElroy, J. E. (2011) "Introduction to the Manhattan Bail Project," 24(1) Federal Sentencing Reporter 8.
Batterer Intervention Programs Often Do Not Change Offender Behavior
  • National Institute of Justice
National Institute of Justice (2011) Batterer Intervention Programs Often Do Not Change Offender Behavior. Available at: http://www.nij.gov/topics/crime/intimate-partner-violence/interventions/Pages/batterer-intervention.aspx.
Crime, Social Structure and Criminal Punishment: White and Nonwhite Rates of Imprisonment
  • Bridges
Statistical Modeling: The Two Cultures
  • Breiman