Article

Beyond PCA: Additional Dimension Reduction Techniques to Consider in the Development of Climate Fingerprints

American Meteorological Society
Journal of Climate

Abstract

Dimension reduction techniques are an essential part of the climate analyst's toolkit. Due to the enormous scale of climate data, dimension reduction methods are used to identify major patterns of variability within climate dynamics, to create compelling and informative visualizations, and to quantify major named modes such as the El Niño-Southern Oscillation. Principal Components Analysis (PCA), also known as the method of empirical orthogonal functions (EOFs), is the most commonly used form of dimension reduction, characterized by a remarkable confluence of attractive mathematical, statistical, and computational properties. Despite its ubiquity, PCA suffers from several difficulties relevant to climate science: high computational burden with large data sets, decreased statistical accuracy in high dimensions, and difficulties comparing across multiple data sets. In this paper, we introduce several variants of PCA that address these problems and are likely to be of use in the climate sciences. Specifically, we introduce non-negative, sparse, and tensor PCA and demonstrate how each approach provides superior pattern recognition in climate data. We also discuss approaches to comparing PCA-family results within and across data sets in a domain-relevant manner. We demonstrate these approaches through an analysis of several runs of the E3SM climate model from 1991 to 1995, focusing on the simulated response to the Mt. Pinatubo eruption; our findings are consistent with a recently identified stratospheric warming fingerprint associated with this type of stratospheric aerosol injection.

... In this approach, spatial and/or temporal patterns are established under various disturbances (i.e., greenhouse gases, aerosol loading, etc.) and matched to observations [2,3]. Although the past few years have seen extensions of fingerprinting to regional analyses [4,5], multiple variables [6,7] and challenging problems with very small signal-to-noise ratios [8,9,10,11], the method is designed to work within a single step. There has been some recent work to develop conditional multi-step fingerprinting methods [12], but this field is still in its infancy. ...
Preprint
Full-text available
Disturbances to the climate system, both natural and anthropogenic, have far reaching impacts that are not always easy to identify or quantify using traditional climate science analyses or causal modeling techniques. In this paper, we develop a novel technique for discovering and ranking the chain of spatio-temporal downstream impacts of a climate source, referred to herein as a source-impact pathway, using Random Forest Regression (RFR) and SHapley Additive exPlanation (SHAP) feature importances. Rather than utilizing RFR for classification or regression tasks (the most common use case for RFR), we propose a fundamentally new RFR-based workflow in which we: (i) train random forest (RF) regressors on a set of spatio-temporal features of interest, (ii) calculate their pair-wise feature importances using the SHAP weights associated with those features, and (iii) translate these feature importances into a weighted pathway network (i.e., a weighted directed graph), which can be used to trace out and rank interdependencies between climate features and/or modalities. We adopt a tiered verification approach to verify our new pathway identification methodology. In this approach, we apply our method to ensembles of data generated by running two increasingly complex benchmarks: (i) a set of synthetic coupled equations, and (ii) a fully coupled simulation of the 1991 eruption of Mount Pinatubo in the Philippines performed using a modified version 2 of the U.S. Department of Energy's Energy Exascale Earth System Model (E3SMv2). We find that our RFR feature importance-based approach can accurately detect known pathways of impact for both test cases.
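The RFR-plus-SHAP workflow described above reduces to a small loop. Below is a minimal sketch, not the authors' code: the feature names, synthetic array, and source/impact pair are hypothetical stand-ins, and only the general pattern of the three steps is illustrated.

```python
import numpy as np
import networkx as nx
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
names = ["AOD", "strat_T", "surf_T", "precip"]   # hypothetical climate features
X = rng.standard_normal((500, len(names)))       # stand-in spatio-temporal features

G = nx.DiGraph()
for j, target in enumerate(names):
    # Step (i): train a random forest regressor of each feature on all the others.
    drivers = [n for i, n in enumerate(names) if i != j]
    Xd = np.delete(X, j, axis=1)
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xd, X[:, j])
    # Step (ii): mean |SHAP| value of each driver quantifies its importance.
    sv = shap.TreeExplainer(rf).shap_values(Xd)
    for driver, w in zip(drivers, np.abs(sv).mean(axis=0)):
        # Step (iii): importances become weighted edges driver -> target.
        G.add_edge(driver, target, weight=float(w))

# Rank candidate source-impact pathways by total edge weight.
paths = nx.all_simple_paths(G, source="AOD", target="precip", cutoff=3)
ranked = sorted(paths, key=lambda p: -sum(G[u][v]["weight"] for u, v in zip(p, p[1:])))
print(ranked[:3])
```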
Article
Full-text available
This work documents version two of the Department of Energy's Energy Exascale Earth System Model (E3SM). E3SMv2 is a significant evolution from its predecessor E3SMv1, resulting in a model that is nearly twice as fast and with a simulated climate that is improved in many metrics. We describe the physical climate model in its lower horizontal resolution configuration consisting of 110 km atmosphere, 165 km land, 0.5° river routing model, and an ocean and sea ice with mesh spacing varying between 60 km in the mid‐latitudes and 30 km at the equator and poles. The model performance is evaluated with Coupled Model Intercomparison Project Phase 6 Diagnosis, Evaluation, and Characterization of Klima simulations augmented with historical simulations as well as simulations to evaluate impacts of different forcing agents. The simulated climate has many realistic features of the climate system, with notable improvements in clouds and precipitation compared to E3SMv1. E3SMv1 suffered from an excessively high equilibrium climate sensitivity (ECS) of 5.3 K. In E3SMv2, ECS is reduced to 4.0 K which is now within the plausible range based on a recent World Climate Research Program assessment. However, a number of important biases remain including a weak Atlantic Meridional Overturning Circulation, deficiencies in the characteristics and spectral distribution of tropical atmospheric variability, and a significant underestimation of the observed warming in the second half of the historical period. An analysis of single‐forcing simulations indicates that correcting the historical temperature bias would require a substantial reduction in the magnitude of the aerosol‐related forcing.
Article
Full-text available
Aerosol particles originate from anthropogenic emissions, volcanic eruptions, biomass burning, and fossil fuel combustion, and their radiative effect is one of the most uncertain factors in climate change. Fine aerosol particles can also cause irreversible damage to the human respiratory system. This study analysed the spatial and temporal variations of global aerosol optical depth (AOD, 550 nm) during 1980–2018 using MERRA-2 aerosol reanalysis products and investigated the effects of natural/anthropogenic emissions of different types of aerosols on AOD values. The results show that the global annual mean AOD values remained high, with significant fluctuations, during 1980–1995 and showed a consistent decreasing and less volatile trend after 1995. Spatially, the AOD values are relatively higher in the Northern Hemisphere than in the Southern Hemisphere, especially in North Africa (0.329), Northern India (0.235), and Eastern China (0.347), because of the intensive natural/anthropogenic aerosol emissions there. The sulphate-based aerosols emitted by biomass burning and anthropogenic sources are the main types of aerosols worldwide, especially in densely populated and industrialized regions such as East Asia and Europe. Dust aerosols are the main aerosol type in desert areas; for example, the AOD and AODP values for the Sahara Desert are 0.3178 and 75.32%, respectively. Black carbon (BC) and organic carbon (OC) aerosols are primary or secondary products of carbon emissions from fossil fuels, biomass burning, and open burning; thus, the regions with high BC and OC aerosol loadings are mainly located in densely populated or vegetated areas such as East Asia, South Asia, and Central Africa. Sea salt aerosols are mainly found in coastal areas along warm current pathways. This study could help researchers in the fields of atmospheric science, environmental protection, air pollution, and ecology to understand the global spatial–temporal variations and main driving factors of aerosol loadings.
Article
Full-text available
Global climate models are central tools for understanding past and future climate change. The assessment of model skill, in turn, can benefit from modern data science approaches. Here we apply causal discovery algorithms to sea level pressure data from a large set of climate model simulations and, as a proxy for observations, meteorological reanalyses. We demonstrate how the resulting causal networks (fingerprints) offer an objective pathway for process-oriented model evaluation. Models with fingerprints closer to observations better reproduce important precipitation patterns over highly populated areas such as the Indian subcontinent, Africa, East Asia, Europe and North America. We further identify expected model interdependencies due to shared development backgrounds. Finally, our network metrics provide stronger relationships for constraining precipitation projections under climate change as compared to traditional evaluation metrics for storm tracks or precipitation itself. Such emergent relationships highlight the potential of causal networks to constrain longstanding uncertainties in climate change projections. Algorithms to assess causal relationships in data sets have seen increasing applications in climate science in recent years. Here, the authors show that these techniques can help to systematically evaluate the performance of climate models and, as a result, to constrain uncertainties in future climate change projections.
Article
Full-text available
A framework for analyzing and benchmarking climate model outputs is built upon δ‐MAPS, a recently developed complex network analysis method. The framework allows for the possibility of highlighting quantifiable topological differences across data sets, capturing the magnitude of interactions including lagged relationships, and quantifying the modeled internal variability, changes in domain properties, and changes in their connections over space and time. A set of four metrics is proposed to assess and compare the modeled domains' shapes, strengths, and connectivity patterns. δ‐MAPS is applied to investigate the topological properties of sea surface temperature from observational data sets and in a subset of the Community Earth System Model (CESM) Large Ensemble, focusing on the past 35 years and on the 20th and 21st centuries. Model ensemble members are mapped in a reduced metric space to quantify internal variability and average model error. It is found that network properties are on average robust whenever individual member or ensemble trends are removed. The assessment identifies biases in the CESM representation of the connectivity patterns that stem from too-strong autocorrelations of domain signals and from the overestimation of the El Niño–Southern Oscillation amplitude and its thermodynamic feedback onto the tropical band in most members.
Article
Full-text available
El Niño and Southern Oscillation (ENSO) is the most prominent year-to-year climate fluctuation on Earth, alternating between anomalously warm (El Niño) and cold (La Niña) sea surface temperature (SST) conditions in the tropical Pacific. ENSO exerts its impacts on remote regions of the globe through atmospheric teleconnections, affecting extreme weather conditions worldwide. However, these teleconnections are inherently nonlinear and sensitive to ENSO SST anomaly patterns and amplitudes. In addition, teleconnections are modulated by variability in the oceanic mean state outside the tropics and by land and sea-ice extent. The character of ENSO as well as the ocean mean state have changed since the 1990s, which might be due to either natural variability, or anthropogenic forcing, or their combined influences. This has resulted in changes in ENSO atmospheric teleconnections in terms of precipitation and temperature in various parts of the globe. In addition, changes in ENSO teleconnection patterns have affected their predictability and the statistics of extreme events. However, the short observational record does not allow us to clearly distinguish which changes are robust and which are not. Climate models suggest that ENSO teleconnections will change because the mean atmospheric circulation will change due to anthropogenic forcing in the 21st century, which is independent of whether ENSO properties change or not. However, future ENSO teleconnection changes do not currently show strong inter-model agreement from region to region, highlighting the importance of identifying factors that affect uncertainty in future model projections.
Article
Full-text available
Interest in stratospheric aerosol and its role in climate has increased over the last decade due to the observed increase in stratospheric aerosol since 2000 and the potential for changes in the sulfur cycle induced by climate change. This review provides an overview about the advances in stratospheric aerosol research since the last comprehensive assessment of stratospheric aerosol was published in 2006. A crucial development since 2006 is the substantial improvement in the agreement between in situ and space-based inferences of stratospheric aerosol properties during volcanically quiescent periods. Furthermore, new measurement systems and techniques, both in situ and space-based, have been developed for measuring physical aerosol properties with greater accuracy and for characterizing aerosol composition. However, these changes induce challenges to constructing a long-term stratospheric aerosol climatology. Currently, changes in stratospheric aerosol levels less than 20% cannot be confidently quantified. The volcanic signals tend to mask any non-volcanically driven change, making them difficult to understand. While the role of carbonyl sulfide (OCS) as a substantial and relatively constant source of stratospheric sulfur has been confirmed by new observations and model simulations, large uncertainties remain with respect to the contribution from anthropogenic sulfur dioxide (SO2) emissions. New evidence has been provided that stratospheric aerosol can also contain small amounts of non-sulfate matter such as black carbon and organics. Chemistry-climate models have substantially increased in quantity and sophistication. In many models the implementation of stratospheric aerosol processes is coupled to radiation and/or stratospheric chemistry modules to account for relevant feedback processes.
Article
Full-text available
Global-mean surface temperature is affected by both natural variability and anthropogenic forcing. This study is concerned with identifying and removing from global-mean temperatures the signatures of natural climate variability over the period January 1900-March 2009. A series of simple, physically based methodologies are developed and applied to isolate the climate impacts of three known sources of natural variability: the El Niño-Southern Oscillation (ENSO), variations in the advection of marine air masses over the high-latitude continents during winter, and aerosols injected into the stratosphere by explosive volcanic eruptions. After the effects of ENSO and high-latitude temperature advection are removed from the global-mean temperature record, the signatures of volcanic eruptions and changes in instrumentation become more clearly apparent. After the volcanic eruptions are subsequently filtered from the record, the residual time series reveals a nearly monotonic global warming pattern since ~1950. The results also reveal coupling between the land and ocean areas on the interannual time scale that transcends the effects of ENSO and volcanic eruptions. Globally averaged land and ocean temperatures are most strongly correlated when ocean leads land by ~2-3 months. These coupled fluctuations exhibit a complicated spatial signature with largest-amplitude sea surface temperature perturbations over the Atlantic Ocean.
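The regression-based removal described above can be illustrated in miniature: regress the global-mean series on a lagged ENSO index and subtract the fitted component. The sketch below uses synthetic data; the index, lag search, and coefficients are stand-ins, not the paper's actual methodology.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 1200                                   # monthly samples
enso = rng.standard_normal(n)              # stand-in ENSO index (e.g., Nino3.4)
gmt = 0.002 * np.arange(n) + 0.3 * np.roll(enso, 3) + 0.2 * rng.standard_normal(n)

# Find the lag at which the ENSO index best explains global-mean temperature,
# then regress and subtract the fitted ENSO component from the record.
def corr_at(lag):
    return np.corrcoef(gmt[lag:], enso[: n - lag])[0, 1]

best = max(range(0, 13), key=lambda lag: abs(corr_at(lag)))
slope = np.polyfit(enso[: n - best], gmt[best:], 1)[0]
residual = gmt.copy()
residual[best:] -= slope * enso[: n - best]   # ENSO-removed series
print(best, slope)
```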
Article
Full-text available
Optimal signal detection theory has been applied in a search through 100 yr of surface temperature data for the climate response to four specific radiative forcings. The data used come from 36 boxes on the earth and were restricted to the frequency band 0.06-0.13 cycles yr⁻¹ (16.67-7.69 yr) in the analysis. Estimates were sought of the strengths of the climate response to solar variability, volcanic aerosols, greenhouse gases, and anthropogenic aerosols. The optimal filter was constructed with a signal waveform computed from a two-dimensional energy balance model (EBM). The optimal weights were computed from a 10,000-yr control run of a noise-forced EBM and from 1000-yr control runs from coupled ocean-atmosphere models at Geophysical Fluid Dynamics Laboratory (GFDL) and Max-Planck Institute; the authors also used a 1000-yr run using the GFDL mixed layer model. Results are reasonably consistent across these four separate model formulations. It was found that the component of the volcanic response perpendicular to the other signals was very robust and highly significant. Similarly, the component of the greenhouse gas response perpendicular to the others was very robust and highly significant. When the sum of all four climate forcings was used, the climate response was more than three standard deviations above the noise level. These findings are considered to be powerful evidence of anthropogenically induced climate change.
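The optimal-weighting idea can be sketched as generalized least squares: given candidate signal patterns F, an internal-variability covariance C estimated from a control run, and observations y, the estimated signal strengths are beta = (F' C⁻¹ F)⁻¹ F' C⁻¹ y. Below is a toy numpy version; dimensions and data are illustrative, not the EBM-specific construction used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 36, 4                      # 36 spatial boxes, 4 candidate forcings
F = rng.standard_normal((p, k))   # model-predicted signal patterns (columns)

# Estimate internal-variability covariance from a long "control run".
control = rng.standard_normal((1000, p))
C = np.cov(control, rowvar=False) + 1e-6 * np.eye(p)  # regularize for inversion

beta_true = np.array([0.2, 1.0, 1.5, 0.5])
y = F @ beta_true + rng.multivariate_normal(np.zeros(p), C)

# Optimal (GLS) fingerprint estimate and its covariance.
Ci = np.linalg.inv(C)
cov_beta = np.linalg.inv(F.T @ Ci @ F)
beta_hat = cov_beta @ F.T @ Ci @ y
signal_to_noise = beta_hat / np.sqrt(np.diag(cov_beta))
print(beta_hat, signal_to_noise)
```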
Article
Full-text available
Large changes in the hydrology of the western United States have been observed since the mid-twentieth century. These include a reduction in the amount of precipitation arriving as snow, a decline in snowpack at low and midelevations, and a shift toward earlier arrival of both snowmelt and the centroid (center of mass) of streamflows. To project future water supply reliability, it is crucial to obtain a better understanding of the underlying cause or causes for these changes. A regional warming is often posited as the cause of these changes without formal testing of different competitive explanations for the warming. In this study, a rigorous detection and attribution analysis is performed to determine the causes of the late winter/early spring changes in hydrologically relevant temperature variables over mountain ranges of the western United States. Natural internal climate variability, as estimated from two long control climate model simulations, is insufficient to explain the rapid increase in daily minimum and maximum temperatures, the sharp decline in frost days, and the rise in degree-days above 0°C (a simple proxy for temperature-driven snowmelt). These observed changes are also inconsistent with the model-predicted responses to variability in solar irradiance and volcanic activity. The observations are consistent with climate simulations that include the combined effects of anthropogenic greenhouse gases and aerosols. It is found that, for each temperature variable considered, an anthropogenic signal is identifiable in observational fields. The results are robust to uncertainties in model-estimated fingerprints and natural variability noise, to the choice of statistical down-scaling method, and to various processing options in the detection and attribution method.
Article
Full-text available
Variability of the atmospheric and oceanic circulations in the earth system gives rise to an array of naturally occurring dynamical modes. Instead of being spatially independent or spatially uniform, climate variability in different parts of the globe is orchestrated by one or a combination of several climate modes, and global changes take place with a distinctive spatial pattern resembling that of the modes-related climate anomalies. Climate impact on the dynamics of terrestrial and marine biosphere also demonstrates clear signals for the mode effects. In this review, we view modes as an important attribute of climate variability, changes, and impact and emphasize the emerging concept that future climate changes may be manifest as changes in the leading modes of the climate system. The focus of this review is on three of the leading modes: the North Atlantic Oscillation, the El Niño-Southern Oscillation, and the Pacific Decadal Oscillation.
Article
Full-text available
 A multi-fingerprint analysis is applied to the detection and attribution of anthropogenic climate change. While a single fingerprint is optimal for the detection of climate change, further tests of the statistical consistency of the detected climate change signal with model predictions for different candidate forcing mechanisms require the simultaneous application of several fingerprints. Model-predicted climate change signals are derived from three anthropogenic global warming simulations for the period 1880 to 2049 and two simulations forced by estimated changes in solar radiation from 1700 to 1992. In the first global warming simulation, the forcing is by greenhouse gas only, while in the remaining two simulations the direct influence of sulfate aerosols is also included. From the climate change signals of the greenhouse gas only and the average of the two greenhouse gas-plus-aerosol simulations, two optimized fingerprint patterns are derived by weighting the model-predicted climate change patterns towards low-noise directions. The optimized fingerprint patterns are then applied as a filter to the observed near-surface temperature trend patterns, yielding several detection variables. The space-time structure of natural climate variability needed to determine the optimal fingerprint pattern and the resultant signal-to-noise ratio of the detection variable is estimated from several multi-century control simulations with different CGCMs and from instrumental data over the last 136 y. Applying the combined greenhouse gas-plus-aerosol fingerprint in the same way as the greenhouse gas only fingerprint in a previous work, the recent 30-y trends (1966–1995) of annual mean near surface temperature are again found to represent a significant climate change at the 97.5% confidence level. However, using both the greenhouse gas and the combined forcing fingerprints in a two-pattern analysis, a substantially better agreement between observations and the climate model prediction is found for the combined forcing simulation. Anticipating that the influence of the aerosol forcing is strongest for longer term temperature trends in summer, application of the detection and attribution test to the latest observed 50-y trend pattern of summer temperature yielded statistical consistency with the greenhouse gas-plus-aerosol simulation with respect to both the pattern and amplitude of the signal. In contrast, the observations are inconsistent with the greenhouse-gas only climate change signal at a 95% confidence level for all estimates of climate variability. The observed trend 1943–1992 is furthermore inconsistent with a hypothesized solar radiation change alone at an estimated 90% confidence level. Thus, in contrast to the single pattern analysis, the two pattern analysis is able to discriminate between different forcing hypotheses in the observed climate change signal. The results are subject to uncertainties associated with the forcing history, which is poorly known for the solar and aerosol forcing, the possible omission of other important forcings, and inevitable model errors in the computation of the response to the forcing. Further uncertainties in the estimated significance levels arise from the use of model internal variability simulations and relatively short instrumental observations (after subtraction of an estimated greenhouse gas signal) to estimate the natural climate variability. The resulting confidence limits accordingly vary for different estimates using different variability data. 
Despite these uncertainties, however, we consider our results sufficiently robust to have some confidence in our finding that the observed climate change is consistent with a combined greenhouse gas and aerosol forcing, but inconsistent with greenhouse gas or solar forcing alone.
Article
Full-text available
The response to anthropogenic changes in climate forcing occurs against a backdrop of natural internal and externally forced climate variability that can occur on similar temporal and spatial scales. Internal climate variability, by which we mean climate variability not forced by external agents, occurs on all time-scales from weeks to centuries and millennia. Slow climate components, such as the ocean, have particularly important roles on decadal and century time-scales because they integrate high-frequency weather variability (Hasselmann, 1976) and interact with faster components. Thus the climate is capable of producing long time-scale internal variations of considerable magnitude without any external influences. Externally forced climate variations may be due to changes in natural forcing factors, such as solar radiation or volcanic aerosols, or to changes in anthropogenic forcing factors, such as increasing concentrations of greenhouse gases or sulphate aerosols.
Article
Full-text available
In many multivariate statistical techniques, a set of linear functions of the original p variables is produced. One of the more difficult aspects of these techniques is the interpretation of the linear functions, as these functions usually have nonzero coefficients on all p variables. A common approach is to effectively ignore (treat as zero) any coefficients less than some threshold value, so that the function becomes simple and the interpretation becomes easier for the users. Such a procedure can be misleading. There are alternatives to principal component analysis which restrict the coefficients to a smaller number of possible values in the derivation of the linear functions, or replace the principal components by "principal variables." This article introduces a new technique, borrowing an idea proposed by Tibshirani in the context of multiple regression, where similar problems arise in interpreting regression equations. This approach is the so-called LASSO, the "least absolute shrinkage and selection operator," in which a bound is introduced on the sum of the absolute values of the coefficients, and in which some coefficients consequently become zero. We explore some of the properties of the new technique, both theoretically and using simulation studies, and apply it to an example.
Article
Full-text available
We present a penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix. We approximate the matrix X as X ≈ Σ_k d_k u_k v_k^T, where d_k, u_k, and v_k minimize the squared Frobenius norm of the approximation error, subject to penalties on u_k and v_k. This results in a regularized version of the singular value decomposition. Of particular interest is the use of L1 penalties on u_k and v_k, which yields a decomposition of X using sparse vectors. We show that when the PMD is applied using an L1 penalty on v_k but not on u_k, a method for sparse principal components results. In fact, this yields an efficient algorithm for the "SCoTLASS" proposal (Jolliffe and others 2003) for obtaining sparse principal components. This method is demonstrated on a publicly available gene expression data set. We also establish connections between the SCoTLASS method for sparse principal component analysis and the method of Zou and others (2006). In addition, we show that when the PMD is applied to a cross-products matrix, it results in a method for penalized canonical correlation analysis (CCA). We apply this penalized CCA method to simulated data and to a genomic data set consisting of gene expression and DNA copy number measurements on the same set of samples.
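A minimal sketch of the rank-1 PMD iteration for the sparse-PCA case (L1 penalty on v only) follows. This Lagrangian soft-thresholding form is a simplification of the paper's norm-constrained formulation, and the penalty level `lam` is an arbitrary illustrative choice.

```python
import numpy as np

def soft(a, lam):
    """Soft-thresholding operator, the prox of the L1 penalty."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def pmd_rank1(X, lam=0.1, n_iter=100):
    """Rank-1 penalized matrix decomposition: L1 penalty on v, none on u."""
    v = np.linalg.svd(X, full_matrices=False)[2][0]   # init from leading right SV
    for _ in range(n_iter):
        u = X @ v
        u /= np.linalg.norm(u)            # unconstrained unit-norm update for u
        v = soft(X.T @ u, lam)            # sparsity-inducing update for v
        nv = np.linalg.norm(v)
        if nv == 0:                       # lam too large: all loadings shrunk away
            break
        v /= nv
    d = u @ X @ v                         # singular-value analogue
    return d, u, v

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 200))
d, u, v = pmd_rank1(X, lam=0.5)
print(d, np.count_nonzero(v), "nonzero loadings of", v.size)
```

Larger `lam` yields sparser loadings in v; with lam=0 the iteration reduces to the ordinary power method for the leading singular vectors.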
Article
Full-text available
Data from the satellite-based Special Sensor Microwave Imager (SSM/I) show that the total atmospheric moisture content over oceans has increased by 0.41 kg/m² per decade since 1988. Results from current climate models indicate that water vapor increases of this magnitude cannot be explained by climate noise alone. In a formal detection and attribution analysis using the pooled results from 22 different climate models, the simulated "fingerprint" pattern of anthropogenically caused changes in water vapor is identifiable with high statistical confidence in the SSM/I data. Experiments in which forcing factors are varied individually suggest that this fingerprint "match" is primarily due to human-caused increases in greenhouse gases and not to solar forcing or recovery from the eruption of Mount Pinatubo. Our findings provide preliminary evidence of an emerging anthropogenic signal in the moisture content of earth's atmosphere.
Article
In this paper we develop a method to quantify the accuracy of different pattern extraction techniques for the additive space–time modes often assumed to be present in climate data. It has previously been shown that the standard technique of principal component analysis (PCA; also known as empirical orthogonal functions) may extract patterns that are not physically meaningful. Here we analyze two modern pattern extraction methods, namely dynamical mode decomposition (DMD) and slow feature analysis (SFA), in comparison with PCA. We develop a Monte Carlo method to generate synthetic additive modes that mimic the properties of climate modes described in the literature. The datasets composed of these generated modes do not satisfy the assumptions of any pattern extraction method presented. We find that both alternative methods significantly outperform PCA in extracting local and global modes in the synthetic data. These techniques had a higher mean accuracy across modes in 60 out of 60 mixed synthetic climates, with SFA slightly outperforming DMD. We show that in the majority of simple cases PCA extracts modes that are not significantly better than a random guess. Finally, when applied to real climate data these alternative techniques extract a more coherent and less noisy global warming signal, as well as an El Niño signal with a clearer spectral peak in the time series, and a more physically plausible spatial pattern.
Article
Ensembles of climate model simulations are commonly used to separate externally forced climate change from internal variability. However, much of the information gained from running large ensembles is lost in traditional methods of data reduction such as linear trend analysis or large-scale spatial averaging. This paper demonstrates how a pattern recognition method (signal-to-noise-maximizing pattern filtering) extracts patterns of externally forced climate change from large ensembles and identifies the forced climate response with up to ten times fewer ensemble members than simple ensemble averaging. It is particularly effective at filtering out spatially coherent modes of internal variability (e.g., El Niño, North Atlantic Oscillation), which would otherwise alias into estimates of regional responses to forcing. This method is used to identify forced climate responses within the 40-member Community Earth System Model (CESM) large ensemble, including an El Niño-like response to volcanic eruptions and forced trends in the North Atlantic Oscillation. The ensemble-based estimate of the forced response is used to test statistical methods for isolating the forced response from a single realization (i.e., individual ensemble members). Low-frequency pattern filtering is found to skillfully identify the forced response within individual ensemble members and is applied to the HadCRUT4 reconstruction of observed temperatures, whereby it identifies slow components of observed temperature changes that are consistent with the expected effects of anthropogenic greenhouse gas and aerosol forcing.
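One common formulation of signal-to-noise-maximizing pattern filtering is a generalized eigenproblem: find weights w that maximize the ratio of forced (ensemble-mean) variance to internal-variability variance. Below is a toy sketch with synthetic ensemble data; the paper's estimator includes additional steps (e.g., EOF truncation) omitted here.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)
members, years, p = 40, 100, 30
forced = np.outer(np.linspace(0, 1, years), rng.standard_normal(p))  # common signal
ens = forced[None] + rng.standard_normal((members, years, p))        # + internal noise

ens_mean = ens.mean(axis=0)                 # estimate of the forced response
devs = (ens - ens_mean).reshape(-1, p)      # internal variability across members

S = np.cov(ens_mean, rowvar=False)          # "signal" covariance
N = np.cov(devs, rowvar=False)              # "noise" covariance

# Patterns maximizing the ratio of signal to noise variance solve S w = lambda N w.
vals, vecs = eigh(S, N)
w = vecs[:, -1]                             # leading signal-to-noise-maximizing weights
filtered = ens_mean @ w                     # time series of the forced component
print(vals[-1], filtered[:5])
```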
Article
Contents: Introduction; Life Course Data in Criminology; The Nondurable Goods Index; Bone Shapes from a Paleopathology Study; Modeling Reaction Time Distributions; Zooming in on Human Growth; Time Warping Handwriting and Weather Records; How do Bone Shapes Indicate Arthritis?; Functional Models for Test Items; Predicting Lip Acceleration from Electromyography; Variable Seasonal Trend in the Goods Index; The Dynamics of Handwriting Printed Characters; A Differential Equation for Juggling.
Article
Traditional tensor decompositions such as the CANDECOMP/PARAFAC (CP) and Tucker decompositions yield higher-order principal components that have been used to understand tensor data in areas such as neuroimaging, microscopy, chemometrics, and remote sensing. Sparsity in high-dimensional matrix factorizations and principal components has been well-studied, exhibiting many benefits; less attention has been given to sparsity in tensor decompositions. We propose two novel tensor decompositions that incorporate sparsity: the Sparse Higher-Order SVD and the Sparse CP Decomposition. The latter solves an L1-norm penalized relaxation of the single-factor CP optimization problem, thereby automatically selecting relevant features for each tensor factor. Through experiments and a scientific data analysis example, we demonstrate the utility of our methods for dimension reduction, feature selection, signal recovery, and exploratory data analysis of high-dimensional tensors.
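A minimal sketch of a single-factor sparse CP update in the spirit described above: each alternating step contracts the tensor against the other factors and then soft-thresholds. This rank-1, penalty-form variant is a simplification for illustration, not the paper's exact algorithm.

```python
import numpy as np

def soft(a, lam):
    """Soft-thresholding operator for the L1 penalty."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_cp_rank1(T, lam=0.1, n_iter=50):
    """Rank-1 CP with an L1 penalty on each factor, by alternating updates."""
    I, J, K = T.shape
    rng = np.random.default_rng(0)
    b, c = rng.standard_normal(J), rng.standard_normal(K)
    b, c = b / np.linalg.norm(b), c / np.linalg.norm(c)
    for _ in range(n_iter):
        # Each update contracts the tensor against the other two factors,
        # then soft-thresholds to induce sparsity, then renormalizes.
        a = soft(np.einsum("ijk,j,k->i", T, b, c), lam)
        a /= max(np.linalg.norm(a), 1e-12)
        b = soft(np.einsum("ijk,i,k->j", T, a, c), lam)
        b /= max(np.linalg.norm(b), 1e-12)
        c = soft(np.einsum("ijk,i,j->k", T, a, b), lam)
        c /= max(np.linalg.norm(c), 1e-12)
    weight = np.einsum("ijk,i,j,k->", T, a, b, c)
    return weight, a, b, c

rng = np.random.default_rng(4)
T = rng.standard_normal((20, 30, 40))
w, a, b, c = sparse_cp_rank1(T, lam=0.3)
print(w, [np.count_nonzero(f) for f in (a, b, c)])
```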
Article
Volcanic eruptions are an important natural cause of climate change on many timescales. A new capability to predict the climatic response to a large tropical eruption for the succeeding 2 years will prove valuable to society. In addition, to detect and attribute anthropogenic influences on climate, including effects of greenhouse gases, aerosols, and ozone-depleting chemicals, it is crucial to quantify the natural fluctuations so as to separate them from anthropogenic fluctuations in the climate record. Studying the responses of climate to volcanic eruptions also helps us to better understand important radiative and dynamical processes that respond in the climate system to both natural and anthropogenic forcings. Furthermore, modeling the effects of volcanic eruptions helps us to improve climate models that are needed to study anthropogenic effects. Large volcanic eruptions inject sulfur gases into the stratosphere, which convert to sulfate aerosols with an e-folding residence time of about 1 year. Large ash particles fall out much quicker. The radiative and chemical effects of this aerosol cloud produce responses in the climate system. By scattering some solar radiation back to space, the aerosols cool the surface, but by absorbing both solar and terrestrial radiation, the aerosol layer heats the stratosphere. For a tropical eruption this heating is larger in the tropics than in the high latitudes, producing an enhanced pole-to-equator temperature gradient, especially in winter. In the Northern Hemisphere winter this enhanced gradient produces a stronger polar vortex, and this stronger jet stream produces a characteristic stationary wave pattern of tropospheric circulation, resulting in winter warming of Northern Hemisphere continents. This indirect advective effect on temperature is stronger than the radiative cooling effect that dominates at lower latitudes and in the summer. The volcanic aerosols also serve as surfaces for heterogeneous chemical reactions that destroy stratospheric ozone, which lowers ultraviolet absorption and reduces the radiative heating in the lower stratosphere, but the net effect is still heating. Because this chemical effect depends on the presence of anthropogenic chlorine, it has only become important in recent decades. For a few days after an eruption the amplitude of the diurnal cycle of surface air temperature is reduced under the cloud. On a much longer timescale, volcanic effects played a large role in interdecadal climate change of the Little Ice Age. There is no perfect index of past volcanism, but more ice cores from Greenland and Antarctica will improve the record. There is no evidence that volcanic eruptions produce El Niño events, but the climatic effects of El Niño and volcanic eruptions must be separated to understand the climatic response to each.
Article
Much of my work in recent years has been devoted to understanding the hydrological and energy cycles. The incoming radiant energy from the sun is transformed into various forms (internal heat, potential energy, latent energy, and kinetic energy) moved around in various ways primarily by the atmosphere and oceans, stored and sequestered in the ocean, land, and ice components of the climate system, and ultimately radiated back to space as infrared radiation. The requirement for an equilibrium climate mandates a balance between the incoming and outgoing radiation and further mandates that the flows of energy are systematic. The imbalance at top of atmosphere from increasing greenhouse gases from human activities creates warming. The central concern with geoengineering fixes to global warming is that the cure could be worse than the disease. The problem of global warming arises from the buildup of greenhouse gases such as carbon dioxide from burning of fossil fuels and other human activities that change the composition of the atmosphere. However, the solution proposed is to reduce the incoming sunshine by emulating a volcanic eruption. In between the incoming solar radiation and the outgoing longwave radiation is the entire weather and climate system and the operation of the hydrological cycle. The eruption of Mount Pinatubo in 1991 is used as an analog for the geoengineering and show that there was a substantial decrease in precipitation over land and a record decrease in runoff and streamflow in 1992, suggesting that major adverse effects, such as drought, could arise from such geoengineering solutions.
Article
Empirical orthogonal functions (EOFs) are widely used in climate research to identify dominant patterns of variability and to reduce the dimensionality of climate data. EOFs, however, can be difficult to interpret. Rotated empirical orthogonal functions (REOFs) have been proposed as more physical entities with simpler patterns than EOFs. This study presents a new approach for finding climate patterns with simple structures that overcomes the problems encountered with rotation. The method achieves simplicity of the patterns by using the main properties of EOFs and REOFs simultaneously. Orthogonal patterns that maximise variance subject to a constraint that induces a form of simplicity are found. The simplified empirical orthogonal function (SEOF) patterns, being more ‘local’, are constrained to have zero loadings outside the main centre of action. The method is applied to winter Northern Hemisphere (NH) monthly mean sea level pressure (SLP) reanalyses over the period 1948–2000. The ‘simplified’ leading patterns of variability are identified and compared to the leading patterns obtained from EOFs and REOFs. Copyright © 2005 Royal Meteorological Society.
Article
 The multi-variate optimal fingerprint method for the detection of an externally forced climate change signal in the presence of natural internal variability is extended to the attribution problem. To determine whether a climate change signal which has been detected in observed climate data can be attributed to a particular climate forcing mechanism, or combination of mechanisms, the predicted space–time dependent climate change signal patterns for the candidate climate forcings must be specified. In addition to the signal patterns, the method requires input information on the space–time dependent covariance matrices of the natural climate variability and of the errors of the predicted signal patterns. The detection and attribution problem is treated as a sequence of individual consistency tests applied to all candidate forcing mechanisms, as well as to the null hypothesis that no climate change has taken place, within the phase space spanned by the predicted climate change patterns. As output the method yields a significance level for the detection of a climate change signal in the observed data and individual confidence levels for the consistency of the retrieved climate change signal with each of the forcing mechanisms. A statistically significant climate change signal is regarded as consistent with a given forcing mechanism if the statistical confidence level exceeds a given critical value, but is attributed to that forcing only if all other candidate climate change mechanisms (from a finite set of proposed mechanisms) are rejected at that confidence level. Although all relations can be readily expressed in standard matrix notation, the analysis is carried out using tensor notation, with a metric given by the natural-variability covariance matrix. This simplifies the derivations and clarifies the invariant relation between the covariant signal patterns and their contravariant fingerprint counterparts. The signal patterns define the reduced vector space in which the climate trajectories are analyzed, while the fingerprints are needed to project the climate trajectories onto this reduced space.
Article
There is increasingly clear evidence that human influence has contributed substantially to the large-scale climatic changes that have occurred over the past few decades. Attention is now turning to the physical implications of the emerging anthropogenic signal. Of particular interest is the question of whether current climate models may be over- or under-estimating the amplitude of the climate system's response to external forcing, including anthropogenic. Evidence of a significant error in a model-simulated response amplitude would indicate the existence of amplifying or damping mechanisms that are inadequately represented in the model. The range of uncertainty in the factor by which we can scale model-simulated changes while remaining consistent with observed change provides an estimate of uncertainty in model-based predictions. With any model that displays a realistic level of internal variability, the problem of estimating this factor is complicated by the fact that it represents a ratio between two incompletely known quantities: both observed and simulated responses are subject to sampling uncertainty, primarily due to internal chaotic variability. Sampling uncertainty in the simulated response can be reduced, but not eliminated, through ensemble simulations. Accurate estimation of these scaling factors requires a modification of the standard "optimal fingerprinting" algorithm for climate change detection, drawing on the conventional "total least squares" approach discussed in the statistical literature. Code for both variants of optimal fingerprinting can be found on http://www.climateprediction.net/detection.
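The total least squares idea referenced above can be sketched with an SVD: stacking the simulated response x and observations y column-wise, the smallest right singular vector of [x, y] yields a scaling estimate that allows for noise in both quantities. Below is a toy one-pattern comparison against ordinary least squares, with synthetic data in place of noise-normalized climate fields.

```python
import numpy as np

rng = np.random.default_rng(8)
p = 100
signal = rng.standard_normal(p)                  # true response pattern
x = signal + 0.3 * rng.standard_normal(p)        # model-simulated response (noisy)
y = 0.8 * signal + 0.3 * rng.standard_normal(p)  # observations (noisy, scaled)

# Ordinary least squares attributes all noise to y and biases the scaling low.
beta_ols = (x @ y) / (x @ x)

# Total least squares: the smallest right singular vector of [x y] defines the
# fit that minimizes orthogonal residuals, allowing noise in both variables.
_, _, Vt = np.linalg.svd(np.column_stack([x, y]), full_matrices=False)
vx, vy = Vt[-1]
beta_tls = -vx / vy
print(beta_ols, beta_tls)
```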
Article
The SCoTLASS problem, principal component analysis modified so that the components satisfy the Least Absolute Shrinkage and Selection Operator (LASSO) constraint, is reformulated as a dynamical system on the unit sphere. The LASSO inequality constraint is tackled by an exterior penalty function. A globally convergent algorithm is developed based on the projected gradient approach. The algorithm is illustrated numerically and discussed on a well-known data set. (c) 2004 Elsevier B.V. All rights reserved.
Article
This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N-way array. Decompositions of higher-order tensors (i.e., N-way arrays with N ≥ 3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, and elsewhere. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decomposition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal component analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2, as well as nonnegative variants of all of the above. The N-way Toolbox, Tensor Toolbox, and Multilinear Engine are examples of software packages for working with tensors.
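The Tucker decomposition named above is easy to sketch via the truncated higher-order SVD (HOSVD), which computes one SVD per mode unfolding. A compact numpy version follows; ranks and data are arbitrary illustrative choices.

```python
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along the given mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated HOSVD: a Tucker decomposition via mode-wise SVDs."""
    factors = []
    for mode, r in enumerate(ranks):
        U = np.linalg.svd(unfold(T, mode), full_matrices=False)[0][:, :r]
        factors.append(U)
    core = T
    for U in factors:
        # Contract the current leading axis with its factor; tensordot appends
        # the new (reduced) axis last, so cycling through axis 0 visits every
        # mode once and restores the original mode order at the end.
        core = np.tensordot(core, U, axes=(0, 0))
    return core, factors

rng = np.random.default_rng(5)
T = rng.standard_normal((10, 12, 14))
core, factors = hosvd(T, (3, 3, 3))
print(core.shape, [U.shape for U in factors])
```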
Chapter
Most statistical analyses involve one or more observations taken on each of a number of individuals in a sample, with the aim of making inferences about the general population from which the sample is drawn. In an increasing number of fields, these observations are curves or images. Curves and images are examples of functions, since an observed intensity is available at each point on a line segment, a portion of a plane, or a volume. For this reason, we call observed curves and images ‘functional data,’ and statistical methods for analyzing such data are described by the term ‘functional data analysis.’ It is the smoothness of the processes generating functional data that differentiates this type of data from more classical multivariate observations. This smoothness means that we can work with the information in the derivatives of functions or images. This article includes several illustrative examples.
Article
Climate and weather constitute a typical example where high dimensional and complex phenomena meet. The atmospheric system is the result of highly complex interactions between many degrees of freedom or modes. In order to gain insight in understanding the dynamical/physical behaviour involved it is useful to attempt to understand their interactions in terms of a much smaller number of prominent modes of variability. This has led to the development by atmospheric researchers of methods that give a space display and a time display of large space‐time atmospheric data. Empirical orthogonal functions (EOFs) were first used in meteorology in the late 1940s. The method, which decomposes a space‐time field into spatial patterns and associated time indices, contributed much in advancing our knowledge of the atmosphere. However, since the atmosphere contains all sorts of features, e.g. stationary and propagating, EOFs are unable to provide a full picture. For example, EOFs tend, in general, to be difficult to interpret because of their geometric properties, such as their global feature, and their orthogonality in space and time. To obtain more localised features, modifications, e.g. rotated EOFs (REOFs), have been introduced. At the same time, because these methods cannot deal with propagating features, since they only use spatial correlation of the field, it was necessary to use both spatial and time information in order to identify such features. Extended and complex EOFs were introduced to serve that purpose. Because of the importance of EOFs and closely related methods in atmospheric science, and because the existing reviews of the subject are slightly out of date, there seems to be a need to update our knowledge by including new developments that could not be presented in previous reviews. This review proposes to achieve precisely this goal. The basic theory of the main types of EOFs is reviewed, and a wide range of applications using various data sets are also provided. Copyright © 2007 Royal Meteorological Society
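The basic EOF construction this review builds on is a single SVD of the space-time anomaly matrix. A minimal sketch follows, with synthetic data standing in for a gridded field; area weighting and detrending, which real analyses typically include, are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(6)
n_time, n_space = 600, 1000          # e.g., monthly fields on a flattened grid
field = rng.standard_normal((n_time, n_space))

anom = field - field.mean(axis=0)    # remove the time-mean at each grid point
# SVD of the anomaly matrix: rows of Vt are the EOFs (spatial patterns),
# and columns of U scaled by s are the principal-component time series.
U, s, Vt = np.linalg.svd(anom, full_matrices=False)

eofs = Vt                             # spatial patterns
pcs = U * s                           # associated time indices
var_explained = s**2 / np.sum(s**2)   # fraction of variance per EOF
print(var_explained[:5])
```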
Article
The lasso penalizes a least squares regression by the sum of the absolute values (L1-norm) of the coefficients. The form of this penalty encourages sparse solutions (with many coefficients equal to 0). We propose the 'fused lasso', a generalization that is designed for problems with features that can be ordered in some meaningful way. The fused lasso penalizes the L1-norm of both the coefficients and their successive differences. Thus it encourages sparsity of the coefficients and also sparsity of their differences, i.e. local constancy of the coefficient profile. The fused lasso is especially useful when the number of features p is much greater than N, the sample size. The technique is also extended to the 'hinge' loss function that underlies the support vector classifier. We illustrate the methods on examples from protein mass spectroscopy and gene expression data. Copyright 2005 Royal Statistical Society.
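The fused lasso objective is easy to state directly with a convex-optimization library. A small sketch using cvxpy follows; the penalty levels `lam1` and `lam2` and the synthetic data are illustrative choices.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(7)
n, p = 40, 200                       # p >> n, the regime the fused lasso targets
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[60:80] = 2.0               # a locally constant block of coefficients
y = X @ beta_true + 0.5 * rng.standard_normal(n)

beta = cp.Variable(p)
lam1, lam2 = 1.0, 5.0                # illustrative penalty levels
objective = cp.Minimize(
    cp.sum_squares(y - X @ beta)      # least-squares loss
    + lam1 * cp.norm1(beta)           # sparsity in the coefficients
    + lam2 * cp.norm1(cp.diff(beta))  # sparsity in successive differences
)
cp.Problem(objective).solve()
print(np.count_nonzero(np.round(beta.value, 3)))
```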
Wagman, B. M., L. P. Swiler, K. Chowdhary, and B. Hillman, 2021: The fingerprints of stratospheric aerosol injection in E3SM. Tech. Rep. SAND2021-11522R, Sandia National Laboratories. https://doi.org/10.2172/1821542.
Bonfils, C., and coauthors: Detection and attribution of temperature changes in the mountainous western United States.
Yeh, S.-W., and coauthors: ENSO atmospheric teleconnections and their response to greenhouse gas forcing.