Sam J. Silva’s research while affiliated with University of Southern California and other places


Publications (62)


The 5 Human Development Index regions over which equity is defined in the penalty $\mathcal{P}$.
MSE (black) and $\mathcal{P}$ (blue) for each neural network ensemble trained to predict (a) DTR and (c) TAS with varying $\alpha$. (b, d) show the same as (a, c), respectively, but only for $\alpha \le 0.25$. Error bars represent standard error of the ensemble mean.
MSE (black) and $\mathcal{P}$ (blue) for each neural network ensemble trained to predict DTR with varying $\alpha$. The square and triangular points represent MSE over land and ocean, respectively. Error bars represent standard error of the ensemble mean.
MSE for (a) DTR and (b) TAS in each Human Development Index region for several values of $\alpha$. The black square line shows the mean of the colored bars, or simply the MSE over land, and the black triangle line shows the MSE over the ocean.
MSE of the neural network's DTR predictions averaged over the test data for (a) $\alpha = 0$, (b) 0.05, (c) 0.1, and (d) 0.25. Values over the ocean are masked.
Enforcing Equity in Neural Climate Emulators
  • Article
  • Full-text available

April 2025

·

4 Reads

William Yik

·

Sam J. Silva

Neural network emulators have become an invaluable tool for climate prediction tasks but do not have an inherent ability to produce equitable predictions (e.g., predictions which are equally accurate across different regions or groups of people). This motivates the need for explicit internal representations of fairness. To that end, we draw on methods for enforcing physical constraints in emulators and propose a custom loss function which penalizes predictions of unequal quality across any prespecified regions or categories, here defined using the Human Development Index. This loss function weighs a standard error metric against another which captures inequity between groups, allowing us to adjust the priority of each. Our results show that emulators trained with our loss function provide more equitable predictions. We empirically demonstrate that an appropriate selection of an equity priority can minimize loss of performance, mitigating the tradeoff between accuracy and equity.
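The weighting described in the abstract above can be illustrated with a minimal sketch. The exact form of the paper's loss is not given on this page, so two things here are assumptions: that the loss is a convex combination (1 - alpha) * MSE + alpha * P, and that the penalty P is the variance of the per-region MSE. The function name equity_weighted_loss and the toy data are hypothetical.

```python
import numpy as np

def equity_weighted_loss(y_true, y_pred, region_ids, alpha):
    """Hypothetical equity-aware loss: (1 - alpha) * MSE + alpha * P.

    ASSUMPTION: P is the variance of the per-region MSE. The penalty
    actually used in the paper may be defined differently.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    region_ids = np.asarray(region_ids)

    sq_err = (y_true - y_pred) ** 2
    mse = sq_err.mean()

    # Mean squared error within each region (e.g., each HDI region).
    region_mse = np.array(
        [sq_err[region_ids == r].mean() for r in np.unique(region_ids)]
    )
    penalty = region_mse.var()  # zero when all regions are equally accurate

    loss = (1.0 - alpha) * mse + alpha * penalty
    return loss, mse, penalty

# Toy example: region 1 is predicted much worse than region 0.
y_true = [0.0, 0.0, 0.0, 0.0]
y_pred = [1.0, 1.0, 3.0, 3.0]
regions = [0, 0, 1, 1]
loss, mse, penalty = equity_weighted_loss(y_true, y_pred, regions, alpha=0.25)
```

With alpha = 0 this sketch reduces to plain MSE, and larger alpha values shift priority toward equalizing error across regions.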


The locations used in this work, falling within 5 broad categories: marine (Pacific, Atlantic, and Indian Oceans, Graciosa, Kennaook), biogenic (the Amazon, Congo, and Borneo rainforests, and the Ozarks), urban (Los Angeles, Paris, Kinshasa, Beijing), desert (El Djouf), and polar (Utqiagvik and McMurdo Station).
The distribution of all two‐reaction cycle timescales for 16 locations, sampled from the surface (∼1,000 hPa) at noon local time in July. Colors follow the dominant local chemical sources described in Section 2.1: ocean‐dominated regions are in dark blue, desert in orange‐brown, biogenically‐dominated regions are in green, urban locations are in gray, and the remote polar regions are in light blue.
The bivariate distributions of two‐reaction cycle timescales along secondary dimensions, including (a) NOx concentration, (b) HO2 concentration, (c) OH reactivity, and (d) proportion of OH reactivity determined by reactions with NOx (plotted for urban and biogenic locations only, for clarity of visualization). The distributions use all two‐reaction cycles with timescales less than 10⁸ s (∼3.2 years) cycle⁻¹ from the surface locations at 539 daytime samples (determined by a threshold of greater than 0.4 for the cosine of solar zenith angle) in July, October, January, and April. This includes all noon locations plotted in Figure 2 except McMurdo Station, which is in polar night in the snapshot of local noon timescale distributions in Figure 2. The median timescales of all other distributions from Figure 2 are shown as stars. Topographic isolines indicate the number of points contained within the bounds of the two‐dimensional kernel density estimate, in increments of 10%.
The hexbin plots, with color of each hex showing the number of cycles contained in that hexbin, show the frequency of cycle timescales having rate‐limiting timesteps. Each plot captures the cycles across all 16 locations at the surface at local noon. On the horizontal axis we plot the timescale of the full cycle, while on the vertical axis we track the contribution of the longest participating reaction timescale in a particular cycle to the overall cycle timescale. For each cycle length across two‐, three‐, and four‐reaction cycles, the density of cycles with a rate‐limiting timestep decreases for cycles faster than 1 s.
Characterizing the Speed of Chemical Cycling in the Atmosphere

February 2025

·

33 Reads

Emy W. Li

·

Patrick Obin Sturm

·

Sam J. Silva

·

[...]

·

Christoph A. Keller

Chemical cycling drives the production and loss of many important atmospheric constituents. The speed of atmospheric chemical cycling is a particularly valuable indicator for characterizing and measuring the effects of such cycles on oxidant chemistry, air quality, and climate. Here, we apply graph theoretical methods to explicitly quantify and analyze the characteristic timescales of gas‐phase chemical cycles in the troposphere and stratosphere, as simulated by the GEOS‐Chem chemical mechanism. We identify all two‐, three‐, and four‐reaction cycles in the mechanism and calculate a characteristic timescale for each individual cycle. We find that the speed of chemical cycling varies by orders of magnitude at any given location but tends to be faster in urban‐ and biogenically‐dominated chemical regions, and slower during the night. We further quantify the fraction of cycling that contains a rate‐determining step, and explicitly demonstrate the large potential for mechanisms to recycle oxidants like OH.
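As a rough illustration of the cycle analysis described in the abstract above, the sketch below enumerates short directed cycles in a toy reaction graph and assigns each a timescale. Several things are assumptions rather than the paper's method: the graph is a simple species-to-species graph whose edges carry a reaction timescale in seconds, the cycle timescale is taken as the sum of its reaction timescales, and the rate-limiting contribution is the longest reaction timescale divided by that sum. The species names and numbers are hypothetical.

```python
def find_cycles(edges, max_len=4):
    """Enumerate simple directed cycles with 2..max_len reactions.

    `edges` maps (source, target) -> reaction timescale in seconds.
    Each cycle is reported once, starting from its smallest node.
    """
    adjacency = {}
    for (src, dst) in edges:
        adjacency.setdefault(src, []).append(dst)

    cycles = []

    def extend(path):
        for nxt in adjacency.get(path[-1], []):
            if nxt == path[0] and len(path) >= 2:
                cycles.append(tuple(path))  # closed the loop
            elif nxt > path[0] and nxt not in path and len(path) < max_len:
                extend(path + [nxt])

    for start in sorted(adjacency):
        extend([start])
    return cycles


def cycle_timescale(cycle, edges):
    """Return (total timescale, fraction due to the slowest reaction).

    ASSUMPTION: the cycle timescale is the sum of its reaction timescales.
    """
    steps = [
        edges[(cycle[i], cycle[(i + 1) % len(cycle)])]
        for i in range(len(cycle))
    ]
    total = sum(steps)
    return total, max(steps) / total


# Hypothetical toy mechanism with placeholder species A, B, C.
toy_edges = {("A", "B"): 2.0, ("B", "A"): 8.0, ("B", "C"): 1.0, ("C", "A"): 1.0}
for cyc in find_cycles(toy_edges):
    print(cyc, cycle_timescale(cyc, toy_edges))
```

A fraction close to 1 in this sketch would indicate a cycle dominated by a single slow reaction, which is one way to think about a rate-determining step.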


Figure 1. Daily minimum temperature (A) and daily mortality counts (B) in LA County, California from 1 January 2014 to 31 December 2019. Triangles mark days at or above the 99th percentile of daily minimum temperatures. Predicted daily mortality from two models are plotted with different colors: TS with a natural cubic spline of 8 degrees of freedom (black) and time-stratified CC (gray). Plotted values are for a reference day of the week (Sunday) for clarity of visualization.
Figure 2. Relative average bias for four methods under data-generating Scenario 1 (A) and Scenario 2 (B), with panels for each version of the daily minimum temperature variable. cts, continuous; numbers are percentile cutoffs for increasingly rare exposures.
Figure 3. Relative efficiency for four methods under data-generating Scenario 1 (A) and Scenario 2 (B), with panels for each version of the daily minimum temperature variable. cts, continuous; numbers are percentile cutoffs for increasingly rare exposures.
Efficiency of case-crossover versus time-series study designs for extreme heat exposures

February 2025

·

19 Reads

Environmental Epidemiology

Background: Time-stratified case-crossover (CC) and Poisson time series (TS) are two popular methods for relating acute health outcomes to time-varying ubiquitous environmental exposures. Our aim is to compare the performance of these methods in estimating associations between rare, extreme heat exposures and mortality, an increasingly relevant exposure in our changing climate. Methods: Daily mortality data were simulated in various scenarios similar to observed Los Angeles County data from 2014 to 2019 (N = 367,712 deaths). We treated observed temperature as either a continuous or dichotomized variable and controlled for day of week and a smooth function of time. Five temperature dichotomization cutoffs between the 80th and 99th percentile were chosen to investigate the effects of extreme heat events. In each of 10,000 simulations, the CC and several TS models with varying degrees of freedom for time were fit to the data. We reported bias, variance, and relative efficiency (ratio of variance for a “reference” TS method to variance of another method) of temperature association estimates. Results: CC estimates had larger uncertainty than TS methods, with the relative efficiency of CC ranging from 91% under the 80th percentile cutoff to 80% under the 99th percentile cutoff. As previously reported, methods best capturing data-generating time trends generally had the least bias. Additionally, TS estimates for observed Los Angeles data were larger with less uncertainty. Conclusions: We provided new evidence that, compared with TS, CC has increasingly poor efficiency for rarer exposures in ecological study settings with shared, regional exposures, regardless of underlying time trends. Analysts should consider these results when applying either TS or CC methods.
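The relative efficiency defined in the abstract above (variance of a reference TS method divided by the variance of another method) can be computed from simulation output as in the following sketch. The function name and the three-element arrays are hypothetical placeholders for the per-simulation estimates, and the use of the sample variance (ddof=1) is an assumption.

```python
import numpy as np

def bias_and_relative_efficiency(estimates, reference_estimates, true_value):
    """Summarize simulation estimates for one method against a reference.

    bias                = mean(estimates) - true_value
    relative efficiency = var(reference_estimates) / var(estimates)

    ASSUMPTION: sample variances (ddof=1) are used.
    """
    estimates = np.asarray(estimates, dtype=float)
    reference_estimates = np.asarray(reference_estimates, dtype=float)

    bias = estimates.mean() - true_value
    rel_eff = reference_estimates.var(ddof=1) / estimates.var(ddof=1)
    return bias, rel_eff

# Illustrative numbers only, not results from the study.
cc_estimates = [0.0, 2.0, 4.0]   # hypothetical case-crossover estimates
ts_reference = [1.0, 2.0, 3.0]   # hypothetical reference time-series estimates
bias, rel_eff = bias_and_relative_efficiency(cc_estimates, ts_reference, true_value=2.0)
```

A relative efficiency below 1 (below 100%) in this convention means the method's estimates are more variable than those of the reference method.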


A Nudge to the Truth: Atom Conservation as a Hard Constraint in Models of Atmospheric Composition Using a Species-Weighted Correction

November 2024

·

27 Reads

·

2 Citations

ACS ES&T Air

Computational models of atmospheric composition are not always physically consistent. For example, not all models respect fundamental conservation laws such as conservation of atoms in an interconnected chemical system. In well performing models, these unphysical deviations are often ignored because they are frequently minor, and thus only need a small nudge to perfectly conserve mass. Here we introduce a method that anchors a prediction from any numerical model to physically consistent hard constraints, nudging concentrations to the nearest solution that respects the conservation laws. This closed-form model-agnostic correction uses a single matrix operation to minimally perturb the predicted concentrations to ensure that atoms are conserved to machine precision. To demonstrate this approach, we train a gradient boosting decision tree ensemble to emulate a small reference model of ozone photochemistry and test the effect of the correction on accurate but nonconservative predictions. The nudging approach minimally perturbs the already well-predicted results for most species, but decreases the accuracy of important oxidants, including radicals. We develop a weighted extension of this nudging approach that considers the uncertainty and magnitude of each species in the correction. This species-level weighting approach is essential to accurately predict important low concentration species such as radicals. We find that applying the species-weighted correction slightly improves overall accuracy by nudging unphysical predictions to a more likely mass-conserving solution.
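One common closed-form way to realize a correction of the kind described in the abstract above is a projection onto the set of concentrations that satisfy the atom balance. The sketch below is written under that assumption and is not taken from the paper: E is an atoms-by-species matrix, atom_totals holds the atom counts to be conserved, and the optional weights vector is a hypothetical per-species weighting (larger weight means that species absorbs more of the correction).

```python
import numpy as np

def nudge_to_conservation(c_pred, E, atom_totals, weights=None):
    """Minimally perturb c_pred so that E @ c equals atom_totals.

    Solves: minimize (c - c_pred)^T W^{-1} (c - c_pred)  s.t.  E c = atom_totals
    Closed form: c = c_pred - W E^T (E W E^T)^{-1} (E c_pred - atom_totals)

    ASSUMPTION: W is diagonal; with weights=None it is the identity,
    giving an ordinary orthogonal projection.
    """
    c_pred = np.asarray(c_pred, dtype=float)
    E = np.asarray(E, dtype=float)
    if weights is None:
        W = np.eye(c_pred.size)
    else:
        W = np.diag(np.asarray(weights, dtype=float))

    residual = E @ c_pred - atom_totals          # atom imbalance
    correction = W @ E.T @ np.linalg.solve(E @ W @ E.T, residual)
    return c_pred - correction

# Toy system: species NO, NO2, O3; rows count N and O atoms.
E = np.array([[1.0, 1.0, 0.0],
              [1.0, 2.0, 3.0]])
c_initial = np.array([1.0, 2.0, 3.0])            # hypothetical concentrations
atom_totals = E @ c_initial
c_pred = np.array([1.1, 1.8, 3.2])               # hypothetical emulator output

c_plain = nudge_to_conservation(c_pred, E, atom_totals)
c_weighted = nudge_to_conservation(c_pred, E, atom_totals, weights=c_pred**2)
```

Note that this sketch enforces only the linear atom balance; it does not by itself guarantee non-negative concentrations.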


Figure 2. All possible 3-node motif isomorphism classes are studied in this work, along with species- and reaction-centered chemical explanations.
Figure 3. The three 3-node motifs present in the bimolecular reaction are shown in Figure 1. Motifs are shown as red arrows, and their motif isomorphism classes are labeled (see Figure 2).
Figure 4. Distribution of motifs for all six isomorphism classes across all three chemical mechanisms studied in this work.
Figure 5. The fraction of isomorphism classes centered on the HOx and NOx chemical families.
Figure 6. The scaled z-score of the isomorphism class prevalence in each of the three mechanisms is compared to a random baseline. Transparent bars are not statistically significant.
Graph characterization of higher-order structure in atmospheric chemical reaction mechanisms

November 2024

·

7 Reads

Environmental Data Science

Atmospheric chemical reactions play an important role in air quality and climate change. While the structure and dynamics of individual chemical reactions are fairly well understood, the emergent properties of the entire atmospheric chemical system, which can involve many different species that participate in many different reactions, are not well described. In this work, we leverage graph-theoretic techniques to characterize patterns of interaction (“motifs”) in three different representations of gas-phase atmospheric chemistry, termed “chemical mechanisms.” These widely used mechanisms, the Master Chemical Mechanism, the GEOS-Chem mechanism, and the Super-Fast mechanism, vary dramatically in scale and application, but they all generally aim to simulate the abundance and variability of chemical species in the atmosphere. This motif analysis quantifies the fundamental patterns of interaction within the mechanisms, which are directly related to their construction. For example, the gas-phase chemistry in the very small Super-Fast mechanism is entirely composed of bimolecular reactions, and its motif distribution matches that of an individual bimolecular reaction well. The larger and more complex mechanisms show emergent motif distributions that differ strongly from any specific reaction type, consistent with their complexity. The proposed motif analysis demonstrates that while these mechanisms all have a similar design goal, their higher-order structure of interactions differs strongly and thus provides a novel set of tools for exploring differences across chemical mechanisms.
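To make the idea of a 3-node motif distribution concrete, the sketch below counts weakly connected 3-node induced subgraphs in a small directed graph and groups them by isomorphism class. The graph representation is an assumption: a single bimolecular reaction A + B -> C + D is encoded as a directed bipartite graph with one reaction node R (edges from reactants to R and from R to products). All function names are hypothetical, and the brute-force canonical labeling is only practical for very small graphs.

```python
from collections import Counter
from itertools import combinations, permutations

def canonical_form(nodes, edges):
    """Smallest relabeled edge tuple over all orderings of the 3 nodes."""
    best = None
    for perm in permutations(range(3)):
        relabel = dict(zip(nodes, perm))
        form = tuple(sorted((relabel[u], relabel[v]) for u, v in edges))
        if best is None or form < best:
            best = form
    return best

def is_weakly_connected(nodes, edges):
    """Check connectivity while ignoring edge direction."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for m in neighbors[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(nodes)

def count_three_node_motifs(edge_list):
    """Count connected 3-node induced subgraphs per isomorphism class."""
    edge_set = set(edge_list)
    nodes = sorted({n for edge in edge_set for n in edge})
    counts = Counter()
    for trio in combinations(nodes, 3):
        induced = [(u, v) for (u, v) in edge_set if u in trio and v in trio]
        if induced and is_weakly_connected(trio, induced):
            counts[canonical_form(trio, induced)] += 1
    return counts

# ASSUMED encoding of A + B -> C + D with a reaction node "R".
bimolecular = [("A", "R"), ("B", "R"), ("R", "C"), ("R", "D")]
motif_counts = count_three_node_motifs(bimolecular)
```

Under this encoding the toy reaction yields three isomorphism classes: two reactants converging on the reaction, the reaction diverging to two products, and four reactant-reaction-product chains.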


Ozone dry deposition through plant stomata: Multi-model comparison with flux observations and the role of water stress as part of AQMEII4 Activity 2

October 2024

·

95 Reads

·

1 Citation

A substantial portion of tropospheric O3 dry deposition occurs after diffusion of O3 through plant stomata. Simulating stomatal uptake of O3 in 3D atmospheric chemistry models is important in the face of increasing drought-induced declines in stomatal conductance and enhanced ambient O3. Here, we present a comparison of the stomatal component of O3 dry deposition (egs) from chemical transport models and estimates of egs from observed CO2, latent heat, and O3 flux. The dry deposition schemes were configured as single-point models forced with data collected at flux towers. We conducted sensitivity analyses to study the impact of model parameters that control stomatal moisture stress on modeled egs. Examining six sites around the Northern Hemisphere, we find that the seasonality of observed flux-based egs agrees with the seasonality of simulated egs at times during the growing season, with disagreements occurring during the later part of the growing season at some sites. We find that modeled water stress effects are too strong in a temperate-boreal transition forest. Some single-point models overestimate summertime egs in a seasonally water-limited Mediterranean shrubland. At all sites examined, modeled egs was sensitive to parameters that control the vapor pressure deficit stress. At specific sites that experienced substantial declines in soil moisture, the simulation of egs was highly sensitive to parameters that control the soil moisture stress. The findings demonstrate the challenges in accurately representing the effects of moisture stress on the stomatal sink of O3 during observed increases in dryness due to ecosystem-specific plant-resource interactions.


Figure 1. Visualization of the corrective approach for the primary photolytic cycle. The 3 physically consistent predictions (truth, correction, and weighted nudge) lie on a yellow line that represents the conservative manifold on which both nitrogen and oxygen atoms are conserved.
Figure 2. The updated Julia photochemical mechanism includes peroxyacetyl nitrate (PAN) chemistry, as well as important precursors and associated radicals. The full reaction list is contained in S3 in the supplement.
Figure 3. Distributions of the atom imbalance for each element.
Figure 4. Scatter plot of key species with the uncorrected predictions, corrected predictions and the weighted correction.
A nudge to the truth: atom conservation as a hard constraint in models of atmospheric composition using an uncertainty-weighted correction

August 2024

·

198 Reads

Computational models of atmospheric composition are not always physically consistent. For example, not all models respect fundamental conservation laws such as conservation of atoms in an interconnected chemical system. In well performing models, these nonphysical deviations are often ignored because they are frequently minor, and thus only need a small nudge to perfectly conserve mass. Here we introduce a method that anchors a prediction from any numerical model to physically consistent hard constraints, nudging concentrations to the nearest solution that respects the conservation laws. This closed-form model-agnostic correction uses a single matrix operation to minimally perturb the predicted concentrations to ensure that atoms are conserved to machine precision. To demonstrate this approach, we train a gradient boosting decision tree ensemble to emulate a small reference model of ozone photochemistry and test the effect of the correction on accurate but non-conservative predictions. The nudging approach minimally perturbs the already well-predicted results for most species, but decreases the accuracy of important oxidants, including radicals. We develop a weighted extension of this nudging approach that considers the uncertainty and magnitude of each species in the correction. This species-level weighting approach is essential to accurately predict important low concentration species such as radicals. We find that applying the uncertainty-weighted correction to the nonphysical predictions slightly improves overall accuracy, by nudging the predictions to a more likely mass-conserving solution.





Citations (41)


... Another approach first learns a new concentration or a concentration change ignoring atom balance, and then projects the solution onto the linear manifold allowed by atom balance [26,27]. This linear manifold is described by the atom-molecule matrix $E \in \mathbb{R}^{a \times m}$ for $a$ atoms and $m$ species in the system, with $E_{i,j}$ defined as the number of atoms $i$ in species $j$. All compositions allowed by atom conservation satisfy the following: ...

Reference:

Machine learning surrogate models for mechanistic kinetics: Embedding atom balance AND positivity
A Nudge to the Truth: Atom Conservation as a Hard Constraint in Models of Atmospheric Composition Using a Species-Weighted Correction
  • Citing Article
  • November 2024

ACS ES&T Air

... The thematic evolution analysis demonstrates an augmentation in the diversity of research subjects pertaining to both climate change and NCDs, accompanied by a shift in the relative weighting of these themes over time. With the growing interest in climate change, the diversity and definitions of the parameters utilized to accurately ascertain its impact have expanded, novel analysis methods have been developed, and the necessity to investigate the relationship with additional health outcomes and risk factors has emerged [62][63][64]. In the first period, 2001-2005, the theme "climate change" was dominant in terms of climate change. ...

The future of climate health research: An urgent call for equitable action- and solution-oriented science

Environmental Epidemiology

... We update all bimolecular and termolecular reactions to use rate constants recommended by JPL 19-5 [42]. All photolytic rates are obtained using the KPP Standalone Interface within a GEOS-CF run [43], at a surface grid cell containing Los Angeles used in prior work exploring chemical cycling in the atmosphere [44]. ...

Characterizing the Speed of Chemical Cycling in the Atmosphere
  • Citing Preprint
  • July 2024

... Despite the requirement of Ni by phytoplankton, surface ocean Ni concentrations are never fully depleted (>1.5 nmol kg⁻¹). Hypotheses for the persistence of moderate surface Ni concentrations include the presence of poorly bioavailable Ni or a low biological demand relative to other (macro)nutrients (Archer et al., 2020; John et al., 2022, 2024; Lemaitre et al., 2022; Middag et al., 2020; Yang et al., 2021). ...

Biogeochemical Fluxes of Nickel in the Global Oceans Inferred From a Diagnostic Model

... lag 0) effects of extreme heat on healthcare utilization for the morbidities of interest. Prior literature has shown that the strongest effects of extreme heat are mostly immediate and primarily observed on the same day, with some observing effects on lag day 1, with effects diminishing after lag day 1 [6, 21-25]. ...

Does socioeconomic and environmental burden affect vulnerability to extreme air pollution and heat? A case-crossover study of mortality in California

Journal of Exposure Science & Environmental Epidemiology

... A machine learning framework was used to predict residential electrical demand at varying temporal and spatial resolutions. The analysis used smart meters electricity records on an hourly basis, together with weather data, building characteristics, and socioeconomic indicators [27]. Architectures based on convolutional neural networks (CNN) and long short-term memory (LSTM) networks have also gained significant popularity in electricity load forecasting [28][29][30]. ...

A machine learning framework to estimate residential electricity demand based on smart meter electricity, climate, building characteristics, and socioeconomic datasets
  • Citing Article
  • March 2024

Applied Energy

... Machine learning (ML), and particularly neural networks (NNs), provides a framework for automatic symmetry discovery due to its capacity to identify patterns from data. Recent advancements have shown that ML models can uncover physical insights from data in unknown or partially known physical systems [3,4], including governing equations of dynamical systems [5,6], conservation laws [7,8,9,10,11,12], or physically relevant quantities [13,14,15]. Various aspects of ML-assisted symmetry discovery have been explored using different strategies. ...

Interpretable conservation laws as sparse invariants
  • Citing Article
  • February 2024

PHYSICAL REVIEW E

... Model-Agnostic methods such as LIME, SHAP, and Partial Dependence Plot (PDP) are more popular than Model-Specific as they are not dependent on any ML model. However, the studies [33,34] show that XAI methods often generate misleading feature importance. The observations [33] in different experiments show that unimportant features can be deemed to be relevant and important features to be irrelevant in SHAP. ...

Limitations of XAI Methods for Process-Level Understanding in the Atmospheric Sciences
  • Citing Article
  • December 2023

... Areas with higher albedo will reflect more sunlight, resulting in higher Land Surface Temperatures (LST), according to Zhou et al. (2013) and Varamesh et al. (2022). Built-up areas generally have higher albedo values than any other natural covers thus contributing to increased reflection, as highlighted by Schlaerth et al. (2023). Additionally, barren lands also possess relatively high albedo levels (Varamesh et al. 2022). ...

Albedo as a Competing Warming Effect of Urban Greening

... However, low fidelity achieved by subsetting data provides more leverage and we use it for pruning dimensions. A Two-Step HPO method proposed by (Yu et al., 2023) demonstrates that starting with small training and validation data subsets and progressively using more data is an effective and economical way to narrow choices and value ranges and converge, if not to a global, at least to a decent local minimum. However, the authors caution against using the proposed approach with adaptive tuning techniques, such as Bayesian HPO, because selections depend on history and may be based on different fidelity trials. ...

Two-Step Hyperparameter Optimization Method: Accelerating Hyperparameter Search by Using a Fraction of a Training Dataset
  • Citing Article
  • November 2023