
Sameer Deshpande, Professor (Assistant) at University of Wisconsin–Madison
Publications: 54
Reads: 4,818
Citations: 417
Publications (54)
Many studies have reported associations between later-life cognition and socioeconomic position in childhood, young adulthood, and mid-life. However, the vast majority of these studies are unable to quantify how these associations vary over time and with respect to several demographic factors. Varying coefficient (VC) models, which treat the covari...
We study the impact of teenage sports participation on early-adulthood health using data from the National Study of Youth and Religion. We focus on two primary outcomes measured at ages 23 to 28 (self-rated health and PHQ9 Patient Depression Questionnaire score) and control for several demographic and socioeconomic confounders. To probe the possibi...
Current implementations of Bayesian Additive Regression Trees (BART) are based on axis-aligned decision rules that recursively partition the feature space using a single feature at a time. Several authors have demonstrated that oblique trees, whose decision rules are based on linear combinations of features, can sometimes yield better predictions t...
Although it is an extremely effective, easy-to-use, and increasingly popular tool for nonparametric regression, the Bayesian Additive Regression Trees (BART) model is limited by the fact that it can only produce discontinuous output. Initial attempts to overcome this limitation were based on regression trees that output Gaussian Processes instead o...
Expected points is a value function fundamental to player evaluation and strategic in-game decision-making across sports analytics, particularly in American football. To estimate expected points, football analysts use machine learning tools, which are not equipped to handle certain challenges. They suffer from selection bias, display counter-intuit...
Estimating varying treatment effects in randomized trials with noncompliance is inherently challenging since variation comes from two separate sources: variation in the impact itself and variation in the compliance rate. In this setting, existing frequentist and flexible machine learning methods are highly sensitive to the weak instruments problem,...
We will study the impact of adolescent sports participation on early-adulthood health using longitudinal data from the National Study of Youth and Religion. We focus on two primary outcomes measured at ages 23–28 — self-rated health and total score on the PHQ9 Patient Depression Questionnaire — and control for several potential confounders related...
We introduce a three-step framework to determine at which pitches Major League batters should swing. Unlike traditional plate discipline metrics, which implicitly assume that all batters should always swing at (resp. take) pitches inside (resp. outside) the strike zone, our approach explicitly accounts not only for the players and umpires involved...
Background
Artificial turf fields and environmental conditions may influence sports concussion risk, but existing research is limited by uncontrolled confounding factors, limited sample size, and the assumption that risk factors are independent of one another. The purpose of this study was to examine how playing surface, time of season, and game te...
As a baseball game progresses, batters appear to perform better the more times they face a particular pitcher. The apparent drop-off in pitcher performance from one time through the order to the next, known as the Time Through the Order Penalty (TTOP), is often attributed to within-game batter learning. Although the TTOP has largely been accepted w...
We introduce a three-step framework to determine, on a per-pitch basis, whether batters in Major League Baseball should swing at a pitch. Unlike traditional plate discipline metrics, which implicitly assume that all batters should always swing at (resp. take) pitches inside (resp. outside) the strike zone, our approach explicitly accounts not only for...
Modern statistics provides an ever-expanding toolkit for estimating unknown parameters. Consequently, applied statisticians frequently face a difficult decision: retain a parameter estimate from a familiar method or replace it with an estimate from a newer or more complex one. While it is traditional to compare estimates using risk, such comparison...
Accurate estimation of the change in crime over time is a critical first step towards better understanding of public safety in large urban environments. Bayesian hierarchical modeling is a natural way to study spatial variation in urban crime dynamics at the neighborhood level, since it facilitates principled “sharing of information” between spatia...
Objective:
Estimate agricultural work's effect on hemoglobin (Hgb) level in men. A negative effect may indicate presence of chronic kidney disease of uncertain etiology.
Methods:
We use Demographic and Health Surveys data from seven African and Asian countries and use matching to control for seven confounders.
Results:
On average, Hgb levels w...
Test log-likelihood is commonly used to compare different models of the same data and different approximate inference algorithms for fitting the same probabilistic model. We present simple examples demonstrating how comparisons based on test log-likelihood can contradict comparisons according to other objectives. Specifically, our examples show tha...
Default implementations of Bayesian Additive Regression Trees (BART) represent categorical predictors using several binary indicators, one for each level of each categorical predictor. Regression trees built with these indicators partition the levels using a "remove one at a time" strategy. Unfortunately, the vast majority of partitions of the level...
We demonstrate how Hahn et al.'s Bayesian Causal Forests model (BCF) can be used to estimate conditional average treatment effects for the longitudinal dataset in the 2022 American Causal Inference Conference Data Challenge. Unfortunately, existing implementations of BCF do not scale to the size of the challenge data. Therefore, we developed flexBC...
We will study the impact of adolescent sports participation on early-adulthood health using longitudinal data from the National Study of Youth and Religion. We focus on two primary outcomes measured at ages 23–28 (self-rated health and total score on the PHQ9 Patient Depression Questionnaire) and control for several potential confounders rela...
As a baseball game progresses, batters appear to perform better the more times they face a particular pitcher. The apparent drop-off in pitcher performance from one time through the order to the next, known as the Time Through the Order Penalty (TTOP), is often attributed to within-game batter learning. Although the TTOP has largely been accepted w...
We study the asymptotic properties of the multivariate spike-and-slab LASSO (mSSL) proposed by Deshpande et al. (2019) for simultaneous variable and covariance selection. Specifically, we consider the sparse multivariate linear regression problem where $q$ correlated responses are regressed onto $p$ covariates. In this problem, the goal is to estima...
The Gaussian chain graph model simultaneously parametrizes (i) the direct effects of $p$ predictors on $q$ correlated outcomes and (ii) the residual partial covariance between pairs of outcomes. We introduce a new method for fitting sparse Gaussian chain graph models with spike-and-slab LASSO (SSL) priors. We develop an Expectation-Conditional Maxim...
Background
Chronic kidney disease of uncertain etiology (CKDu) has been found at high frequency in several lowland agricultural areas. Whether CKDu occurs in other countries with large agricultural populations remains uncertain, primarily due to lack of systematic data on kidney function. Hemoglobin (Hgb) levels are an ancillary marker for kidney d...
Many youths participate in sports, and it is of interest to understand the impact of youth sports participation on later-life outcomes. However, prospective studies take a long time to complete and retrospective studies may be more practical and time-efficient to address some questions. We pilot a retrospective survey of youth sports participation...
We examined the association between early-life participation in collision sports and later-life cognitive health over a 28-year period in a population-based sample drawn from the longitudinal Swedish Adoption/Twin Study of Aging (1987-2014). Cognitive measures included the Mini-Mental State Examination and performance across multiple cognitive doma...
Gaussian processes (GPs) are used to make medical and scientific decisions, including in cardiac care and monitoring of carbon dioxide emissions. But the choice of GP kernel is often somewhat arbitrary. In particular, uncountably many kernels typically align with qualitative prior knowledge (e.g. function smoothness or stationarity). But in practic...
Modern statistics provides an ever-expanding toolkit for estimating unknown parameters. Consequently, applied statisticians frequently face a difficult decision: retain a parameter estimate from a familiar method or replace it with an estimate from a newer or more complex one. While it is traditional to compare estimators using risk, such comparisons ar...
Many modern data analyses benefit from explicitly modeling dependence structure in data, such as measurements across time or space, ordered words in a sentence, or genes in a genome. Cross-validation is the gold standard to evaluate these analyses but can be prohibitively slow due to the need to re-run already-expensive learning algorithms many t...
Concerned about potentially increased risk of neurodegenerative disease, several health professionals and policy makers have proposed limiting or banning youth participation in American-style tackle football. Given the large affected population (over 1 million boys play high school football annually), careful estimation of the long-term health effe...
American football is the most popular high school sport yet its association with health in adulthood has not been widely studied. We investigated the association between high school football and self-rated health, obesity, and pain in adulthood using a retrospective cohort study of the Wisconsin Longitudinal Study from 1957 to 2004. We matched 925...
Accurate estimation of the change in crime over time is a critical first step towards better understanding of public safety in large urban environments. Bayesian hierarchical modeling is a natural way to study spatial variation in urban crime dynamics at the neighborhood level, since it facilitates principled "sharing of information" between spati...
Using high-resolution player tracking data made available by the National Football League (NFL) for their 2019 Big Data Bowl competition, we introduce the Expected Hypothetical Completion Probability (EHCP), an objective framework for evaluating plays. At the heart of EHCP is the question “on a given passing play, did the quarterback throw the pass...
Using high-resolution player tracking data made available by the National Football League (NFL) for their 2019 Big Data Bowl competition, we introduce the Expected Hypothetical Completion Probability (EHCP), an objective framework for evaluating plays. At the heart of EHCP is the question "on a given passing play, did the quarterback throw the pass...
American football is the most popular high school sport and is among the leading causes of injury among adolescents. While there has been considerable recent attention on the link between football and cognitive decline, there is also evidence of higher than expected rates of pain, obesity, and lower quality of life among former professional players,...
More than 1 million students play high school American football annually, but many health professionals have recently questioned its safety or called for its ban. These concerns have been partially driven by reports of chronic traumatic encephalopathy (CTE), increased risks of neurodegenerative disease, and associations between concussion history a...
A large body of work links traumatic brain injury (TBI) in adulthood to the onset of Alzheimer's disease (AD). AD is the chief cause of dementia, leading to reduced cognitive capacity and autonomy and increased mortality risk. More recently, researchers have sought to investigate whether TBI experienced in early-life may influence trajectories of c...
This dissertation explores Bayesian model selection and estimation in settings where the model space is too vast to rely on Markov Chain Monte Carlo for posterior calculation. First, we consider the problem of sparse multivariate linear regression, in which several correlated outcomes are simultaneously regressed onto a large set of covariates, whe...
In Reply: Hoffmann contends that the clinical measures that were used in our study may not be sensitive to deficits associated with traumatic brain injury (TBI) or chronic traumatic encephalopathy (CTE), pointing to case studies that demonstrated a dissociation between cognitive and behavioral frontal lobe functions, the latter of which are difficul...
We propose a Bayesian procedure for simultaneous variable and covariance selection using continuous spike-and-slab priors in multivariate linear regression models where q possibly correlated responses are regressed onto p predictors. Rather than relying on a stochastic search through the high-dimensional model space, we develop an ECM algorithm sim...
We propose a Bayesian procedure for simultaneous variable and covariance selection using continuous spike-and-slab priors in multivariate linear regression models where q possibly correlated responses are regressed onto p predictors. Rather than relying on a stochastic search through the high-dimensional model space, we develop an ECM algorithm sim...
Importance
American football is the largest participation sport in US high schools and is a leading cause of concussion among adolescents. Little is known about the long-term cognitive and mental health consequences of exposure to football-related head trauma at the high school level.
Objective
To estimate the association of playing high school fo...
Causal effects are commonly defined as comparisons of the potential outcomes under treatment and control, but this definition is threatened by the possibility that the treatment or control condition is not well-defined, existing instead in more than one version. A simple, widely applicable analysis is proposed to address the possibility that the tr...
Causal effects are commonly defined as comparisons of the potential outcomes under treatment and control, but this definition is threatened by the possibility that the treatment or control condition is not well-defined, existing instead in more than one version. A simple, widely applicable analysis is proposed to address the possibility that the tr...
Since the advent of high-resolution pitch tracking data (PITCHf/x), many in the sabermetrics community have attempted to quantify a Major League Baseball catcher's ability to "frame" a pitch (i.e. increase the chance that a pitch is called as a strike). Especially in the last three years, there has been an explosion of interest in the "art of pitch...
Since the advent of high-resolution pitch tracking data (PITCHf/x), many in the sabermetrics community have attempted to quantify a Major League Baseball catcher’s ability to “frame” a pitch (i.e. increase the chance that a pitch is called a strike). Especially in the last 3 years, there has been an explosion of interest in the “art of pitch f...
A potential causal relationship between head injuries sustained by NFL players and later-life neurological decline may have broad implications for participants in youth and high school football programs. However, brain trauma risk at the professional level may be different than that at the youth and high school levels and the long-term effects of p...
Traditional NBA player evaluation metrics are based on scoring differential or some pace-adjusted linear combination of box score statistics like points, rebounds, assists, etc. These measures treat performances with the outcome of the game still in question (e.g. tie score with five minutes left) in exactly the same way as they treat performances...
Traditional NBA player evaluation metrics are based on scoring differential or some pace-adjusted linear combination of box score statistics like points, rebounds, assists, etc. These measures treat performances with the outcome of the game still in question (e.g. tie score with five minutes left) in exactly the same way as they treat performances...
Experimental scratch resistance testing provides two numbers: the penetration depth $R_p$ and the healing depth $R_h$. In molecular dynamics computer simulations, we create a material consisting of N statistical chain segments by polymerization; a reinforcing phase can be included. Then we simulate the movement of an indenter and response of the segmen...
Polypropylene (PP) based materials for nonreusable syringe applications have been investigated, some of them containing an internal liquid lubricant. Hardness, tensile properties, and friction measured by two distinct procedures have been determined. We report three series of results: for nonirradiated samples; for samples directly after stopping t...
We have studied nine thermoplastic vulcanizate elastomers (TPVs) in four series: as made, after accelerated aging, after γ irradiation, after both irradiation and aging. The materials exhibit two glass transitions, one seen in cross-linked regions and the other in un-crosslinked amorphous regions. Three techniques of determination of glass transiti...