Brian J. Reich's research while affiliated with NC State Education Assistance Authority and other places

Publications (282)

Article
Spatially dependent data arises in many applications, and Gaussian processes are a popular modeling choice for these scenarios. While Bayesian analyses of these problems have proven to be successful, selecting prior distributions for these complex models remains a difficult task. In this work, we propose a principled approach for setting prior dist...
Article
Full-text available
The escalating frequency and severity of global wildfires necessitate an in-depth understanding and monitoring of wildfire smoke impacts, specifically its contribution to fine particulate matter (PM2.5). We propose a data-fusion method to study wildfire contribution to PM2.5 using satellite-derived smoke plume indicators and PM2.5 monitoring data....
Preprint
Full-text available
High spatial resolution wind data are essential for a wide range of applications in climate, oceanographic and meteorological studies. Large-scale spatial interpolation or downscaling of bivariate wind fields having velocity in two dimensions is a challenging task because wind data tend to be non-Gaussian with high spatial variability and heterogen...
Preprint
Full-text available
The escalating frequency and severity of global wildfires necessitate an in-depth understanding and monitoring of wildfire smoke impacts, specifically its contribution to fine particulate matter (PM2.5). We propose a data-fusion method to study wildfire contribution to PM2.5 using satellite-derived smoke plume indicators and PM2.5 monitoring data....
Article
Full-text available
Changes in annual maximum flood (AMF), which are usually detected using simple trend tests (e.g., Mann‐Kendall test (MKT)), are expected to change design‐flood estimates. We propose an alternate framework to detect significant changes in design‐flood between two periods and evaluate it for synthetically generated AMF from the Log‐Pearson Type‐3 (LP...
Article
Geostationary weather satellites collect high‐resolution data comprising a series of images. The Derived Motion Winds (DMW) Algorithm is commonly used to process these data and estimate atmospheric winds by tracking features in the images. However, the wind estimates from the DMW Algorithm are often missing and do not come with uncertainty measures...
Preprint
Full-text available
We propose a model to flexibly estimate joint tail properties by exploiting the convergence of an appropriately scaled point cloud onto a compact limit set. Characteristics of the shape of the limit set correspond to key tail dependence properties. We directly model the shape of the limit set using Bézier splines, which allow flexible and parsimo...
Preprint
Gaussian processes (GP) and Kriging are widely used in traditional spatio-temporal modelling and prediction. These techniques typically presuppose that the data are observed from a stationary GP with parametric covariance structure. However, processes in real-world applications often exhibit non-Gaussianity and nonstationarity. Moreover, likelihoo...
Article
Full-text available
Background: Fractional exhaled nitric oxide (FeNO) is a marker of airway inflammation. Elevated FeNO has been associated with environmental exposures; however, studies from tropical countries are limited. Using data from the Infants' Environmental Health Study (ISA) birth cohort, we evaluated medical conditions and environmental exposures' associa...
Article
Extreme environmental events frequently exhibit spatial and temporal dependence. These data are often modeled using max-stable processes (MSPs) that are computationally prohibitive to fit for as few as a dozen observations. Supposedly computationally efficient approaches like the composite likelihood remain computationally burdensome with a few hundr...
Article
Biological sex and gender are critical variables in biomedical research, but are complicated by the presence of sex‐specific natural hormone cycles, such as the estrous cycle in female rodents, typically divided into phases. A common feature of these cycles is fluctuating hormone levels, which induce sex differences in many behaviors controlled by...
Preprint
Spatially dependent data arises in many biometric applications, and Gaussian processes are a popular modelling choice for these scenarios. While Bayesian analyses of these problems have proven to be successful, selecting prior distributions for these complex models remains a difficult task. In this work, we propose a principled approach for setting...
Article
Adjusting for an unmeasured confounder is generally an intractable problem, but in the spatial setting it may be possible under certain conditions. We derive necessary conditions on the coherence between the exposure and the unmeasured confounder that ensure the effect of exposure is estimable. We specify our model and assumptions in the spectral d...
Preprint
Full-text available
Extreme streamflow is a key indicator of flood risk, and quantifying the changes in its distribution under non-stationary climate conditions is key to mitigating the impact of flooding events. We propose a non-stationary process mixture model (NPMM) for annual streamflow maxima over the central US (CUS) which uses downscaled climate model precipita...
Article
Wildland fire smoke contains hazardous levels of fine particulate matter (PM2.5), a pollutant shown to adversely affect health. Estimating fire-attributable PM2.5 concentrations is key to quantifying the impact on air quality and subsequent health burden. This is a challenging problem since only total PM2.5 is measured at monitoring stations and bo...
Article
Predicting the response at an unobserved location is a fundamental problem in spatial statistics. Given the difficulty in modeling spatial dependence, especially in non-stationary cases, model-based prediction intervals are at risk of misspecification bias that can negatively affect their validity. Here we present a new approach for model-free nonp...
Preprint
Full-text available
Standard causal inference characterizes treatment effect through averages, but the counterfactual distributions could be different in not only the central tendency but also spread and shape. To provide a comprehensive evaluation of treatment effects, we focus on estimating quantile treatment effects (QTEs). Existing methods that invert a nonsmooth...
Preprint
Full-text available
We develop an R package SPQR that implements the semi-parametric quantile regression (SPQR) method in Xu and Reich (2021). The method begins by fitting a flexible density regression model using monotonic splines whose weights are modeled as data-dependent functions using artificial neural networks. Subsequently, estimates of conditional density and...
Article
Since the arrival of porcine epidemic diarrhea virus (PEDV) in the United States in 2013, elimination and control programs have had partial success. The dynamics of its spread are hard to quantify, though previous work has shown that local transmission and the transfer of pigs within production systems are most associated with the spread of PEDV. O...
Preprint
Marine conservation preserves fish biodiversity, protects marine and coastal ecosystems, and supports climate resilience and adaptation. Despite the importance of establishing marine protected areas (MPAs), research on the effectiveness of MPAs with different conservation policies is limited due to the lack of quantitative MPA information. In this...
Article
Full-text available
Global earth monitoring aims to identify and characterize land cover change like construction as it occurs. Remote sensing makes it possible to collect large amounts of data in near real-time over vast geographic areas and is becoming available in increasingly fine temporal and spatial resolution. Many methods have been developed for data from a si...
Article
Understanding the effects of interventions, such as restrictions on community and large group gatherings, is critical to controlling the spread of COVID-19. Susceptible-Infectious-Recovered (SIR) models are traditionally used to forecast the infection rates but do not provide insights into the causal effects of interventions. We propose a spatiotem...
Article
Full-text available
Analyzing massive spatial datasets using a Gaussian process model poses computational challenges. This is a problem prevailing heavily in applications such as environmental modeling, ecology, forestry and environmental health. We present a novel approximate inference methodology that uses profile likelihood and Krylov subspace methods to estimate t...
Article
Many spatial phenomena exhibit interference, where exposures at one location may affect the response at other locations. Because interference violates the stable unit treatment value assumption, standard methods for causal inference do not apply. We propose a new causal framework to recover direct and spill‐over effects in the presence of spatial i...
Preprint
The dynamics that govern disease spread are hard to model because infections are functions of both the underlying pathogen as well as human or animal behavior. This challenge is increased when modeling how diseases spread between different spatial locations. Many proposed spatial epidemiological models require trade-offs to fit, either by abstracti...
Preprint
Full-text available
Quantifying changes in the probability and magnitude of extreme flooding events is key to mitigating their impacts. While hydrodynamic data are inherently spatially dependent, traditional spatial models such as Gaussian processes are poorly suited for modeling extreme events. Spatial extreme value models with more realistic tail dependence characte...
Article
Full-text available
Aim Decades of research on species distributions has revealed geographic variation in species‐environment relationships for a given species. That is, the way a species uses the local environment varies across geographic space. However, the drivers underlying this variation are contested and still largely unexplored. Niche traits that are conserved...
Preprint
Since the arrival of porcine epidemic diarrhea virus (PEDV) in the United States in 2013, elimination and control programs have had partial success. The dynamics of its spread are hard to quantify, though previous work has shown that local transmission and the transfer of pigs within production systems are most associated with the spread of PEDV. O...
Article
Full-text available
Background Only a few studies have compared environmental pesticide air concentrations with specific urinary metabolites to evaluate pathways of exposure. Therefore, we compared pyrimethanil and chlorpyrifos concentrations in air with urinary 4-hydroxypyrimethanil (OHP, metabolite of pyrimethanil) and 3,5,6-trichloro-2-pyridinol (TCPy, metabolite of c...
Preprint
Extreme environmental events frequently exhibit spatial and temporal dependence. These data are often modeled using max-stable processes (MSPs). MSPs are computationally prohibitive to fit for as few as a dozen observations, with supposedly computationally efficient approaches like the composite likelihood remaining computationally burdensome with a...
Article
Full-text available
Nontuberculous mycobacteria (NTM) are opportunistic pathogens that cause chronic pulmonary disease (PD). NTM infections are thought to be acquired from the environment; however, the basal environmental factors that drive and sustain NTM prevalence are not well understood. The highest prevalence of NTM PD cases in the United States is reported from...
Article
Full-text available
Background Little is known about the effects of pesticides on children’s respiratory and allergic outcomes. We evaluated associations of prenatal and current pesticide exposures with respiratory and allergic outcomes in children from the Infants’ Environmental Health Study in Costa Rica. Methods Among 5-year-old children (n=303), we measured prena...
Article
Spectral methods are important for both theory and computation in spatial data analysis. When data lie on a grid, spectral approaches can take advantage of the discrete Fourier transform for fast computation. If data are not on a grid, then low-rank processes with Fourier basis functions may be sufficient approximations. However, deciding which bas...
Preprint
A key task in the emerging field of materials informatics is to use machine learning to predict a material's properties and functions. A fast and accurate predictive model allows researchers to more efficiently identify or construct a material with desirable properties. As in many fields, deep learning is one of the state-of-the-art approaches, but...
Article
The analysis of solar irradiance has important applications in predicting solar energy production from solar power plants. Although the sun provides more energy every day than we need, the variability caused by environmental conditions affects electricity production. Recently, new statistical models have been proposed to provide stochastic simulati...
Article
Full-text available
Objectives This research evaluates whether environmental exposures (pesticides and smoke) influence respiratory and allergic outcomes in women living in a tropical, agricultural environment. Methods We used data from 266 mothers from the Infants’ Environmental Health cohort study in Costa Rica. We evaluated environmental exposures in women by meas...
Preprint
Near real-time change detection is important for a variety of Earth monitoring applications and remains a high priority for remote sensing science. Data sparsity, subtle changes, seasonal trends, and the presence of outliers make detecting actual landscape changes challenging. Adams and MacKay (2007) introduced Bayesian Online Changepoint Detection (BOCPD...
Preprint
In Bayesian analysis, the selection of a prior distribution is typically done by considering each parameter in the model. While this can be convenient, in many scenarios it may be desirable to place a prior on a summary measure of the model instead. In this work, we propose a prior on the model fit, as measured by a Bayesian coefficient of determin...
Article
Full-text available
Estimates of daily air pollution concentrations with complete spatial and temporal coverage are important for supporting epidemiologic studies and health impact assessments. While numerous approaches have been developed for modeling air pollution, they typically only consider each pollutant separately. We describe a spatial multipollutant data fusi...
Article
Flexible estimation of multiple conditional quantiles is of interest in numerous applications, such as studying the effect of pregnancy-related factors on low and high birth weight. We propose a Bayesian non-parametric method to simultaneously estimate non-crossing, non-linear quantile curves. We expand the conditional distribution function of the...
Article
Full-text available
Background Pesticides and metals may disrupt thyroid function, which is key to fetal brain development. Objectives To evaluate if current-use pesticide exposures, lead and excess manganese alter free thyroxine (FT4), free triiodothyronine (FT3), and thyroid stimulating hormone (TSH) concentrations in pregnant women from the Infants' Environmental...
Article
Full-text available
Geospatial models are crucial for identifying likely ‘hot-spots’ of Bt resistance evolution in Helicoverpa zea (Lepidoptera: Noctuidae), thereby improving regional insecticide resistance management (IRM) strategies and planted refuge compliance. To characterize H. zea distributions in relation to land use, we used historical trapping data collected...
Article
Full-text available
BACKGROUND Helicoverpa zea (Boddie) damage to Bt cotton and maize has increased due to widespread Bt resistance across the USA Cotton Belt. Our objective was to link Bt crop production patterns to cotton damage through a series of spatial and temporal surveys of commercial fields to understand how Bt crop production relates to greater than expected...
Article
Spatial extremes are common for climate data as the observations are usually referenced by geographic locations and dependent when they are nearby. An important goal of extremes modeling is to estimate the T-year return level. Among the methods suitable for modeling spatial extremes, perhaps the simplest and fastest approach is the spatial generali...
Article
The scientific rigor and computational methods of causal inference have had great impacts on many disciplines but have only recently begun to take hold in spatial applications. Spatial causal inference poses analytic challenges due to complex correlation structures and interference between the treatment at one location and the outcomes at others. I...
Article
High spatiotemporal resolution maps of surface vegetation from remote sensing data are desirable for vegetation and disturbance monitoring. However, due to the current limitations of imaging spectrometers, remote sensing datasets of vegetation with high temporal frequency of measurements have lower spatial resolution, and vice versa. In this resear...
Article
Full-text available
Short-term forecasting is an important tool in understanding environmental processes. In this paper, we incorporate machine learning algorithms into a conditional distribution estimator for the purposes of forecasting tropical cyclone intensity. Many machine learning techniques give a single-point prediction of the conditional distribution of the t...
Article
Full-text available
Land surface phenology (LSP) is a consistent and sensitive indicator of climate change effects on Earth's vegetation. Existing methods of estimating LSP require time series densities that, until recently, have only been available from coarse spatial resolution imagery such as MODIS (500 m) and AVHRR (1 km). LSP products from these datasets have imp...
Article
We study the problem of sparse signal detection on a spatial domain. We propose a novel approach to model continuous signals that are sparse and piecewise-smooth as the product of independent Gaussian processes (PING) with a smooth covariance kernel. The smoothness of the PING process is ensured by the smoothness of the covariance kernels of the Ga...
Article
The study of microbiomes has become a topic of intense interest in the last several decades as the development of new sequencing technologies has made DNA data accessible across disciplines. In this paper, we analyze a global dataset to investigate environmental factors that affect the topsoil microbiome. As yet, much associated work has focused on linking...
Article
Malaria is an infectious disease affecting a large population across the world, and interventions need to be efficiently applied to reduce the burden of malaria. We develop a framework to help policy-makers decide how to allocate limited resources in real time for malaria control. We formalize a policy for the resource allocation as a sequence of de...
Article
Scientists use imaging to identify objects of interest and infer properties of these objects. The locations of these objects are often measured with error, which when ignored leads to biased parameter estimates and inflated variance. Current measurement error methods require an estimate or knowledge of the measurement error variance to correct thes...
Preprint
Full-text available
Understanding the effects of interventions, such as restrictions on community and large group gatherings, is critical to controlling the spread of COVID-19. Susceptible-Infectious-Recovered (SIR) models are traditionally used to forecast the infection rates but do not provide insights into the causal effects of interventions. We propose a spatiotem...
Preprint
Full-text available
Unobserved spatial confounding variables are prevalent in environmental and ecological applications where the system under study is complex and the data are often observational. Instrumental variables (IVs) are a common way to address unobserved confounding; however, the efficacy of using IVs on spatial confounding is largely unknown. This paper ex...
Preprint
Full-text available
We propose a non-parametric method to simultaneously estimate non-crossing, non-linear quantile curves. We expand the conditional distribution function of the response in $\mathcal{I}$-spline basis functions where the coefficients are further modeled as functions of the covariates using feed-forward neural networks. By leveraging the approximation...
Article
Full-text available
Geostatistical modeling for continuous point‐referenced data has been extensively applied to neuroimaging because it produces efficient and valid statistical inference. However, diffusion tensor imaging (DTI), a neuroimaging technique characterizing the brain's anatomical structure, produces a positive definite (p.d.) matrix for each voxel. Current...
Article
Due to their flexibility and predictive performance, machine-learning based regression methods have become an important tool for predictive modeling and forecasting. However, most methods focus on estimating the conditional mean or specific quantiles of the target quantity and do not provide the full conditional distribution, which contains uncerta...
Article
Full-text available
Wildland fire (wildfire; bushfire) pollution contributes to poor air quality, a risk factor for premature death. The frequency and intensity of wildfires are expected to increase; improved tools for estimating exposure to fire smoke are vital. New-generation satellite-based sensors produce high-resolution spectral images, providing real-time inform...
Article
Full-text available
Studies on diffusion tensor imaging (DTI) quantify the diffusion of water molecules in a brain voxel using an estimated 3 × 3 symmetric positive definite (p.d.) diffusion tensor matrix. Due to the challenges associated with modelling matrix‐variate responses, the voxel‐level DTI data are usually summarized by univariate quantities, such as fraction...
Article
Wendelberger, LJ, Reich, BJ, Wilson, AG. Multi‐model penalized regression. Stat Anal Data Min: The ASA Data Sci Journal. 2021; 1 ‐ 25. https://doi.org/10.1002/sam.11496 The above article from Statistical Analysis and Data Mining, published online on 13 January 2021 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement be...
Preprint
Full-text available
Analyzing massive spatial datasets using a Gaussian process model poses computational challenges. This is a problem prevailing heavily in applications such as environmental modeling, ecology, forestry and environmental health. We present a novel approximate inference methodology that uses profile likelihood and Krylov subspace methods to estimate the...
Preprint
Full-text available
Adjusting for an unmeasured confounder is generally an intractable problem, but in the spatial setting it may be possible under certain conditions. In this paper, we derive necessary conditions on the coherence between the treatment variable of interest and the unmeasured confounder that ensure the causal effect of the treatment is estimable. We sp...
Article
Humans are concurrently exposed to chemically, structurally and toxicologically diverse chemicals. A critical challenge for environmental epidemiology is to quantify the risk of adverse health outcomes resulting from exposures to such chemical mixtures and to identify which mixture constituents may be driving etiologic associations. A variety of st...
Preprint
Advances in sensing and computation have accelerated at unprecedented rates and scales, in turn creating new opportunities for natural resources managers to improve adaptive and predictive management practices by coupling large environmental datasets with machine learning (ML). Yet, to date, ML models often remain inaccessible to managers working o...
Article
Fine particulate matter, PM2.5, has been documented to have adverse health effects, and wildland fires are a major contributor to PM2.5 air pollution in the USA. Forecasters use numerical models to predict PM2.5 concentrations to warn the public of impending health risk. Statistical methods are needed to calibrate the n...
Article
Prior distributions for high-dimensional linear regression require specifying a joint distribution for the unobserved regression coefficients, which is inherently difficult. We instead propose a new class of shrinkage priors for linear regression via specifying a prior first on the model fit, in particular, the coefficient of determination, and the...
Preprint
Short-term forecasting is an important tool in understanding environmental processes. In this paper, we incorporate machine learning algorithms into a conditional distribution estimator for the purposes of forecasting tropical cyclone intensity. Many machine learning techniques give a single-point prediction of the conditional distribution of the t...
Preprint
Full-text available
In spatial statistics, a common objective is to predict the values of a spatial process at unobserved locations by exploiting spatial dependence. In geostatistics, Kriging provides the best linear unbiased predictor using covariance functions and is often associated with Gaussian processes. However, when considering non-linear prediction for non-Ga...
Preprint
The scientific rigor and computational methods of causal inference have had great impacts on many disciplines, but have only recently begun to take hold in spatial applications. Spatial causal inference poses analytic challenges due to complex correlation structures and interference between the treatment at one location and the outcomes at others....
Preprint
Many spatial phenomena exhibit treatment interference where treatments at one location may affect the response at other locations. Because interference violates the stable unit treatment value assumption, standard methods for causal inference do not apply. We propose a new causal framework to recover direct and spill-over effects in the presence of...
Preprint
Predicting the response at an unobserved location is a fundamental problem in spatial statistics. Given the difficulty in modeling spatial dependence, especially in non-stationary cases, model-based prediction intervals are at risk of misspecification bias that can negatively affect their validity. Here we present a new approach for model-free spat...
Article
An important problem in modern forensic analyses is identifying the provenance of materials at a crime scene, such as biological material on a piece of clothing. This procedure, which is known as geolocation, is conventionally guided by expert knowledge of the biological evidence and therefore tends to be application specific, labour intensive and...
Article
Full-text available
Ecological occupancy modeling has historically relied on high‐quality, low‐quantity designed‐survey data for estimation and prediction. In recent years, there has been a large increase in the amount of high‐quantity, unknown‐quality opportunistic data. This has motivated research on how best to combine these two data sources in order to optimize in...
Preprint
Model fitting often aims to fit a single model, assuming that the imposed form of the model is correct. However, there may be multiple possible underlying explanatory patterns in a set of predictors that could explain a response. Model selection without regarding model uncertainty can fail to bring these patterns to light. We present multi-model pe...
Preprint
We establish causal effect models that allow for time- and spatially varying causal effects. Under the standard sequential randomization assumption, we show that the local causal parameter can be identified based on a class of estimating equations. To borrow information from nearby locations, we adopt the local estimating equation approach via loca...
Article
We propose a novel mixture Generalized Pareto (MIXGP) model to calibrate extreme precipitation forecasts. This model is able to describe the marginal distribution of observed precipitation and capture the dependence between climate forecasts and the observed precipitation under suitable conditions. In addition, the full range distribution of precip...
Preprint
The max-stable process is an asymptotically justified model for spatial extremes. In particular, we focus on the hierarchical extreme-value process (HEVP), which is a particular max-stable process that is conducive to Bayesian computing. The HEVP and all max-stable process models are parametric and impose strong assumptions including that all margi...
Preprint
Wildland fire smoke contains hazardous levels of fine particulate matter (PM2.5), a pollutant shown to adversely affect health. Estimating fire-attributable PM2.5 concentrations is key to quantifying the impact on air quality and subsequent health burden. This is a challenging problem since only total PM2.5 is measured at monitoring stations and both...