Article

Ensemble Data Assimilation without Perturbed Observations

Authors: Jeffrey S. Whitaker, Thomas M. Hamill

Abstract

The ensemble Kalman filter (EnKF) is a data assimilation scheme based on the traditional Kalman filter update equation. An ensemble of forecasts is used to estimate the background-error covariances needed to compute the Kalman gain. It is known that if the same observations and the same gain are used to update each member of the ensemble, the ensemble will systematically underestimate analysis-error covariances. This will cause a degradation of subsequent analyses and may lead to filter divergence. For large ensembles, it is known that this problem can be alleviated by treating the observations as random variables, adding random perturbations to them with the correct statistics.
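The effect described in the abstract is easy to reproduce numerically. Below is a minimal NumPy sketch of a stochastic EnKF analysis step (a hypothetical toy setup; the function name and dimensions are illustrative, not code from the paper): with perturb_obs=False every member is updated with the same observation and gain and the analysis spread collapses below its correct value, while perturbing the observations with covariance R restores it on average.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_update(X, y, H, R, perturb_obs=True):
    """One stochastic EnKF analysis step on an (n_state, n_ens) ensemble X.

    With perturb_obs=False every member is updated with the same
    observation and gain, which systematically shrinks the analysis spread.
    """
    n, m = X.shape
    A = X - X.mean(axis=1, keepdims=True)            # ensemble perturbations
    Pb = A @ A.T / (m - 1)                           # background covariance
    K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R)   # Kalman gain
    if perturb_obs:
        # a different random perturbation of the observation for each member
        eps = rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
        Y = y[:, None] + eps
    else:
        Y = np.repeat(y[:, None], m, axis=1)
    return X + K @ (Y - H @ X)

# toy setup: 2 state variables, the first one observed, 200 members
X = rng.normal(size=(2, 200))
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
y = np.array([0.3])
Xa_pert = enkf_update(X, y, H, R, perturb_obs=True)
Xa_same = enkf_update(X, y, H, R, perturb_obs=False)
```

Comparing Xa_same[0].std() against Xa_pert[0].std() shows the deterministic-observation update producing a noticeably smaller spread in the observed variable, which is exactly the systematic underestimation the abstract describes.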


... EnKF snow assimilation performs slightly better in higher-altitude regions compared to the DI method and causes less distortion of the hydrological and energy budget (Arsenault et al. 2013). However, the observations must be perturbed for the EnKF, which may lead to additional bias (Whitaker and Hamill 2002). Therefore, many approaches have been proposed to improve the EnKF, including the ensemble Kalman smoother (EnKS) and the ensemble square root filter (EnSRF). ...
... Therefore, many approaches have been proposed to improve the EnKF, including the ensemble Kalman smoother (EnKS) and the ensemble square root filter (EnSRF). Unlike the EnKF, the EnSRF does not require perturbing the observations, thus avoiding additional errors, and applies the observation error to depict the impacts of observation uncertainties on the assimilation (Whitaker and Hamill 2002). Furthermore, the performance of the ensemble DA depends on the ensemble spread. ...
... The EnKF is a widely used data assimilation method that employs the Monte Carlo method to estimate the model error (Evensen 1994). However, this method requires the observations to be randomly perturbed, which results in additional error in the assimilation (Whitaker and Hamill 2002). To overcome this, several ensemble assimilation methods have been proposed that do not require the addition of perturbation into the observations. ...
Article
Ensemble data assimilation (DA) is an efficient approach to reduce snow simulation errors by combining observation and land surface modeling. However, the spread between ensemble members of the simulated snowpack is small, which typically occurs during long periods with 100% snow cover fraction (SCF) or snow-free conditions. Here, we apply a hybrid DA method, in which direct insertion (DI) is a supplement to the ensemble square root filter (EnSRF), to assimilate the spaceborne SCF into a land surface model, driven by China Meteorological Administration Land Data Assimilation System high-resolution climate forcings over northern China during the snow season in 2021/22. Compared to the open-loop experiment (without SCF assimilation), the root-mean-square error (RMSE) of SCF is reduced by 6% through the original EnSRF and is even lower (by 14%) in the combined DI and EnSRF (EnSRFDI) experiment. The results reveal the ability of both EnSRF and EnSRFDI to improve the SCF estimation over regions where the snow cover is low, while only EnSRFDI is able to efficiently reduce the RMSE over areas with high SCF. Moreover, the SCF assimilation is also observed to improve the snow depth and soil temperature simulations, with the Kling–Gupta efficiency (KGE) increasing at 60% and 56%–70% of stations, respectively, particularly under conditions with near-freezing temperature, in which reliable simulations are typically challenging. Our results demonstrate that the EnSRFDI hybrid method can be applied for the assimilation of spaceborne observational snow cover to improve land surface simulations and snow-related operational products. Significance Statement Due to the small spread between the seasonal snowpack of ensemble simulations, ensemble snow cover fraction (SCF) data assimilation (DA) proves to be ineffective.
Therefore, we apply a hybrid method that combines the direct insertion (DI) and ensemble square root filter (EnSRF) to assimilate the spaceborne SCF into a land surface model (LSM) driven by high-resolution climate forcings. Our results reveal the applicability of the EnSRFDI to further improve snow cover simulations over regions with high SCF. Furthermore, the DA experiments were validated through a large number of in situ observations from the China Meteorological Administration. The uncertainties of snow depth and soil temperature simulations are also slightly reduced by the SCF DAs, particularly over regions with a poor LSM performance.
... In the field of Earth science, the EBKFs, including the ensemble Kalman filter (EnKF; Evensen [26]), ensemble Kalman filter with perturbed observations (hereinafter EnKF-PO; Burgers et al. [27]), ensemble square root filter (EnSRF; Whitaker and Hamill [28]), and ensemble transform Kalman filter (ETKF; Bishop et al. [29]), have been widely used to estimate a large number of state variables of the atmospheric and ocean models. ...
... The EnKF-PO [27] deals with perturbed observations as random variables to solve the problem of the systematic underestimation of the error covariance appearing in the EnKF. However, the additional sampling error relevant to perturbed observations can lead to inaccurate estimation results due to the systematic underestimation (biased estimation) of the error covariance calculated by the EnKF-PO when a small number of ensemble members are used [28]. ...
... Unlike the EnKF-PO, the EnSRF [28] utilizes the reduced Kalman gain to deal with the issue of the systematic underestimation of the posterior error covariance appearing in the EnKF. Therefore, it does not lead to an extra source of sampling error associated with perturbed observations as in the EnKF-PO, thus providing lower estimation error than the EnKF-PO for a small ensemble size. ...
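The reduced Kalman gain these excerpts refer to has a simple closed form for a single scalar observation. The sketch below is a toy NumPy transcription of that serial square-root update (the function name and interface are illustrative): the mean is updated with the full gain K, the perturbations with alpha*K, where alpha = 1/(1 + sqrt(R/(H P^b H^T + R))), so no observation perturbations are needed.

```python
import numpy as np

def ensrf_update_scalar(X, y, h, r):
    """Serial ensemble square-root filter update for one scalar observation.

    X : (n_state, n_ens) ensemble, h : (n_state,) observation-operator row,
    y : observed value, r : observation-error variance.
    The ensemble mean is updated with the full Kalman gain K, while the
    perturbations use the reduced gain alpha*K, so the observation never
    needs to be perturbed.
    """
    n, m = X.shape
    xm = X.mean(axis=1)
    A = X - xm[:, None]                    # perturbations about the mean
    hA = h @ A                             # obs-space perturbations, (m,)
    hph = hA @ hA / (m - 1)                # H P^b H^T (scalar)
    pht = A @ hA / (m - 1)                 # P^b H^T, shape (n,)
    K = pht / (hph + r)                    # full Kalman gain
    alpha = 1.0 / (1.0 + np.sqrt(r / (hph + r)))  # reduced-gain factor
    xm_a = xm + K * (y - h @ xm)           # mean update with K
    A_a = A - np.outer(alpha * K, hA)      # perturbation update with alpha*K
    return xm_a[:, None] + A_a

# one scalar state observed directly: P^b = 1, r = 1, hence K = 0.5
X = np.array([[1.0, 2.0, 3.0]])
Xa = ensrf_update_scalar(X, y=2.0, h=np.array([1.0]), r=1.0)
```

For this toy case the analysis ensemble variance comes out as (1 - K) P^b = 0.5, exactly the Kalman-predicted value, without perturbed observations and hence without the extra sampling error discussed above.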
Article
Full-text available
Due to the unavailability of GPS indoors, various indoor pedestrian positioning approaches have been designed to estimate the position of the user leveraging sensory data measured from inertial measurement units (IMUs) and wireless signal receivers, such as pedestrian dead reckoning (PDR) and received signal strength (RSS) fingerprinting. This study is similar to the previous study in that it estimates the user position by fusing noisy positional information obtained from the PDR and RSS fingerprinting using the Bayes filter in the indoor pedestrian positioning system. However, this study differs from the previous study in that it uses an enhanced state estimation approach based on the ensemble transform Kalman filter (ETKF), called QETKF, as the Bayes filter for the indoor pedestrian positioning instead of the SKPF proposed in the previous study. The QETKF estimates the updated user position by fusing the predicted position by the PDR and the positional measurement estimated by the RSS fingerprinting scheme using the ensemble transformation, whereas the SKPF calculates the updated user position by fusing them using both the unscented transformation (UT) of UKF and the weighting method of PF. In the field of Earth science, the ETKF has been widely used to estimate the state of the atmospheric and ocean models. However, the ETKF algorithm does not consider the model error in the state prediction model; that is, it assumes a perfect model without any model errors. Hence, the error covariance estimated by the ETKF can be systematically underestimated, thereby yielding inaccurate state estimation results due to underweighted observations. The QETKF proposed in this paper is an efficient approach to implementing the ETKF applied to the indoor pedestrian localization system that should consider the model error. Unlike the ETKF, the QETKF can avoid the systematic underestimation of the error covariance by considering the model error in the state prediction model. 
The main goal of this study is to investigate the feasibility of the pedestrian position estimation for the QETKF in the indoor localization system that uses the PDR and RSS fingerprinting. Pedestrian positioning experiments performed using the indoor localization system implemented on the smartphone in a campus building show that the QETKF can offer more accurate positioning results than the ETKF and other ensemble-based Kalman filters (EBKFs). This indicates that the QETKF has great potential in performing better position estimation with more accurately estimated error covariances for the indoor pedestrian localization system.
... Instead of the retrieved fog area, the retrieved cloud water path (CWP) is used as the indicator of sea fog and low-level stratus. The main idea is to compare the forecasted CWP for each member with the observation and set the pseudo-observation based on the best members, which is assimilated with the ensemble Kalman filter (EnKF; Evensen, 1994; Whitaker & Hamill, 2002) to modify all the members. Its advantages are described as follows: (a) the retrieval algorithms used to obtain CWP from passive microwave measurements have been developed for more than 50 years since 1962 (Barrett & Chung, 1962) and are reliable; (b) the relationship between MABL humidity and the liquid water content within fog and low-level stratus is robust and can be represented by the model well. ...
... The Grid-point Statistical Interpolation (GSI)/EnKF system version 3.5 (Shao et al., 2016) is applied in assimilation. The system contains two separate algorithms for calculating the analysis increments, of which the serial Ensemble Square Root Filter (EnSRF; Whitaker & Hamill, 2002) is chosen. The default control variables for analysis include U, V, air temperature (T), geopotential height (GPH) and q_v. ...
Article
Full-text available
Numerical forecast of the sea fog is sensitive to the initial moist stratification within the marine atmospheric boundary layer (MABL). This study develops an online assimilation method to improve the MABL thermal and moist structures in sea fog ensemble forecasts based on the Weather Research and Forecasting model and Grid‐point Statistical Interpolation/EnKF system. It uses the satellite‐retrieved cloud water path (CWP) as the indicator of sea fog and low‐level stratus to determine the best ensemble members at each grid point. The relative humidity and cloud water profiles are extracted from the best members to generate a series of pseudo‐observations, which are assimilated to update all the members by EnKF method. The new method significantly improves the ensemble forecast of five widespread advection fog events over the Yellow Sea, which can be attributed to the decrease of both missed and spurious fog areas. The case study shows that assimilating the information of both humidity and cloud water outperforms assimilating either of them, while the impact of directly assimilating CWP observation is insignificant. The analysis increments of cloud water, thermal and moist structures in MABL together contribute to the correction of forecasted sea fog. The generation of pseudo‐observations can use the dynamic compatibility of the model to alleviate the impact of erroneous data in the observation, leading to the low sensitivity of the new method to CWP retrieval error.
... The standard Kalman filter [5] is efficient for low-dimensional problems and can process only linear systems. For nonlinear problems including a large number of variables, the ensemble Kalman filter (EnKF) [6,7], the ensemble square root filter (EnSRF) [8,9], the reduced rank square root Kalman filter (RRSQRT) [10,11], or the unscented Kalman filter (UKF) [12][13][14] are suitable choices. A comprehensive overview and a comparison of different Kalman filters can be found in [15,16]. ...
... The matrices P_xy,k and P_yy,k in Equations (8) and (9) contain the error covariance matrix of the state P_k ∈ R^(n×n), well known from the KF algorithms: P_xy,k = P_k · D^T and P_yy,k = D · P_k · D^T + R_k. ...
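In code, the Kalman gain follows directly from these two matrices. A minimal NumPy sketch under assumed toy dimensions (n = 4 state variables, p = 2 observations; the variable names mirror the excerpt's notation and the data are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 4, 2                        # hypothetical sizes: 4 states, 2 obs
L = rng.normal(size=(n, n))
P = L @ L.T                        # state error covariance (SPD by construction)
D = rng.normal(size=(p, n))        # measurement matrix
R = 0.1 * np.eye(p)                # observation-error covariance

P_xy = P @ D.T                     # cross covariance   P_k D^T
P_yy = D @ P @ D.T + R             # innovation covariance   D P_k D^T + R_k
K = P_xy @ np.linalg.inv(P_yy)     # Kalman gain   K = P_xy P_yy^{-1}
```

The gain K is the (n × p) matrix that maps an innovation in observation space to an increment in state space; P_yy is symmetric positive definite, so the inverse is well defined.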
Article
Full-text available
The simultaneous consideration of a numerical model and of different observations can be achieved using data-assimilation methods. In this contribution, the ensemble Kalman filter (EnKF) is applied to obtain the system-state development and also an estimation of unknown model parameters. An extension of the Kalman filter used is presented for the case of uncertain model parameters, which should not or cannot be estimated due to a lack of necessary measurements. It is shown that incorrectly assumed probability density functions for present uncertainties adversely affect the model parameter to be estimated. Therefore, the problem is embedded in a multilayered uncertainty space consisting of the stochastic space, the interval space, and the fuzzy space. Then, we propose classifying all present uncertainties into aleatory and epistemic ones. Aleatorically uncertain parameters can be used directly within the EnKF without an increase in computational costs and without the necessity of additional methods for the output evaluation. Epistemically uncertain parameters cannot be integrated into the classical EnKF procedure, so a multilayered uncertainty space is defined, leading to inevitable higher computational costs. Various possibilities for uncertainty quantification based on probability and possibility theory are shown, and the influence on the results is analyzed in an academic example. Here, uncertainties in the initial conditions are of less importance compared to uncertainties in system parameters that continuously influence the system state and the model parameter estimation. Finally, the proposed extension using a multilayered uncertainty space is applied on a multi-degree-of-freedom (MDOF) laboratory structure: a beam made of stainless steel with synthetic data or real measured data of vertical accelerations. Young’s modulus as a model parameter can be estimated in a reasonable range, independently of the measurement data generation.
... We also describe an optimal sensor algorithm based on an ensemble Kalman filter framework. Additional and more complete descriptions of DA algorithms are available in Evensen (1994), Anderson and Anderson (1999), Whitaker and Hamill (2002), Goosse et al. (2006, 2012b), Dubinkina and Goosse (2013), Steiger et al. (2014), Comboul et al. (2015), Hakim et al. (2016), Tardif et al. (2019), Franke et al. (2020), Tierney et al. (2020b), King et al. (2021), and Osman et al. (2021). ...
... The class implements an ensemble square-root Kalman filter (Andrews, 1968), which processes ensemble means and deviations separately. This separation precludes the need for perturbed observations (Whitaker and Hamill, 2002) and provides several opportunities for enhanced computational efficiency. ... Instead, all records are processed simultaneously, which we refer to as a "block update". Block updates afford several advantages over sequential processing: they are typically faster on modern computer architectures, their results do not depend on the order in which proxy records are assimilated, and they permit the use of full error covariance matrices for R. By contrast, sequential processing only permits the use of independent error variances, and the final results will vary with the order of the proxies when using nonlinear forward models. ...
Article
Full-text available
Paleoclimate data assimilation (DA) is a tool for reconstructing past climates that directly integrates proxy records with climate model output. Despite the potential for DA to expand the scope of quantitative paleoclimatology, these methods remain difficult to implement in practice due to the multi-faceted requirements and data handling necessary for DA reconstructions, the diversity of DA methods, and the need for computationally efficient algorithms. Here, we present DASH, a MATLAB toolbox designed to facilitate paleoclimate DA analyses. DASH provides command line and scripting tools that implement common tasks in DA workflows. The toolbox is highly modular and is not built around any specific analysis, and thus DASH supports paleoclimate DA for a wide variety of time periods, spatial regions, proxy networks, and algorithms. DASH includes tools for integrating and cataloguing data stored in disparate formats, building state vector ensembles, and running proxy (system) forward models. The toolbox also provides optimized algorithms for implementing ensemble Kalman filters, particle filters, and optimal sensor analyses with variable and modular parameters. This paper reviews the key components of the DASH toolbox and presents examples illustrating DASH's use for paleoclimate DA applications.
... The localization function of Equation 3 is applied to the Kalman gain matrix rather than the background error covariance matrix, because the former has smaller dimensions than the latter in the usual case that p is smaller than n. For a serial EnKF (e.g., Anderson, 2001;Whitaker & Hamill, 2002), the observations are assimilated one by one, and thus the localized Kalman gain can be processed through each column separately, which makes the localization practical. Moreover, a localization function applied to the background error covariance matrix should be positive semidefinite like the covariance matrix, whereas a localization applied to the Kalman gain avoids this requirement. ...
... For an EnKF with perturbed observations, random noises from a normal distribution N(0, R) are added to the synthetic observations before they are assimilated by each ensemble member. The reason to choose the EnKF with perturbed observations rather than the deterministic flavors of EnKF (e.g., Anderson, 2001;Whitaker & Hamill, 2002) is that the same Kalman gain Equation 2 is used for the update of the ensemble mean and ensemble perturbations. Therefore, only one neural network is trained for localization, rather than training two separate networks for localizing the Kalman gain and reduced gain. ...
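The column-wise gain localization described in these excerpts can be sketched in a few lines. In the toy example below (all names and values are illustrative, and a Gaussian damping is used as a simple stand-in for the compactly supported Gaspari–Cohn taper), the gain column belonging to a single observation in a serial filter is multiplied elementwise by a distance-dependent taper:

```python
import numpy as np

# In a serial EnKF, observations are assimilated one at a time, so the
# Kalman gain for one observation is a single column of length n and
# localization reduces to an elementwise taper on that column.
n = 8
state_pos = np.arange(n, dtype=float)    # hypothetical 1-D grid positions
obs_pos = 3.0                            # position of the observation
k = np.full(n, 0.5)                      # toy unlocalized gain column

c = 2.0                                  # localization length scale
dist = np.abs(state_pos - obs_pos)
taper = np.exp(-0.5 * (dist / c) ** 2)   # Gaussian damping (stand-in for
                                         # a Gaspari–Cohn function)
k_loc = taper * k                        # localized gain column
```

The taper equals one at the observation location and damps increments at distant grid points, which is exactly the mitigation of sampling error that motivates localization.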
Article
Full-text available
Flow‐dependent background error covariances estimated from short‐term ensemble forecasts suffer from sampling errors due to limited ensemble sizes. Covariance localization is often used to mitigate the sampling errors, especially for high dimensional geophysical applications. Most applied localization methods, empirical or adaptive ones, multiply the Kalman gain or background error covariances by a distance‐dependent parameter, which is a simple linear filtering model. Here two localization methods based on convolutional neural networks (CNNs) learning from paired data sets are proposed. The CNN‐based localization function (CLF) aims to minimize the sampling error of the estimated Kalman gain, and the CNN‐based empirical localization function (CELF) aims to minimize the posterior error of state variables. These two CNN‐based localization methods can provide localization functions that are nonlinear, spatially and temporally adaptive, and non‐symmetric with respect to displacement, without requiring any prior assumptions for the localization functions. Results using the Lorenz05 model show that CLF and CELF can better capture the structures of the Kalman gain than the best Gaspari and Cohn (GC) localization function and the adaptive reference localization method. For both perfect‐ and imperfect‐model experiments, CLF produces smaller errors of the Kalman gain, prior and posterior than the best GC and reference localization, especially for spatially averaged observations. Without model error, CELF has smaller prior and posterior errors than the best GC and reference localization for spatially averaged observations, while with model error, CELF has smaller prior and posterior errors than the best GC and reference localization for single‐point observations.
... Although the reflectivity is not part of the model prognostic variables, the assimilation of radar reflectivity still provides an opportunity for the improvement of severe convection prediction and is widely used in scientific research and weather prediction departments. In addition, various assimilation methods, including optimal interpolation, simple initialization techniques, complex cloud analysis (Lin et al., 1993; Albers et al., 1996; Weygandt et al., 2002), the ensemble Kalman filter (EnKF) (Snyder and Zhang, 2003; Caya et al., 2005; Ruckstuhl and Janjić, 2018), the ensemble square-root filter (EnSRF) (Whitaker and Hamill, 2002), three-dimensional variational (3DVar) (Crook, 1997, 1998; Gao et al., 1999; Hu et al., 2006), four-dimensional variational (4DVar) (Wang et al., 2013), hybrid variational and ensemble approaches (Gao and Stensrud, 2014; Wang and Wang, 2017), etc., have been proven to be effective for reflectivity assimilation. ...
... The region where the composite reflectivity (CRF) is greater than the threshold is designated CR; otherwise, it is designated NCR. In this study, the reflectivity is directly assimilated through the EnSRF method (Whitaker and Hamill, 2002). In CR, 3D reflectivity that is greater than the threshold is assimilated through the EnSRF method. ...
Article
Full-text available
Increasing convection information in initial field and weakening spurious convection information are hot topics in numerical weather prediction. How to economically assimilate radar reflectivity in the non‐convective region (NCR) is the focus of this study. This study proposes a new assimilation scheme for the two‐dimensional (2D) composite reflectivity in NCR through the ensemble square‐root filter method. A single‐column observation test and two convective cases are studied to verify the assimilation effect. Three experiments are designed for each test, including a control experiment (Exp_CTL) that only assimilates convective reflectivity and two assimilation experiments that assimilate reflectivity in NCR (one assimilates 3D weak reflectivity [Exp_RF] and the other assimilates 2D composite reflectivity [Exp_CRF]) based on the Exp_CTL. The results of real‐case studies show that the new scheme has two clear advantages. One is that it can save approximately three quarters of the assimilation workload. The other is that the new scheme has the most significant effect on weakening spurious convection and decreasing the false alarm ratios of precipitation and reflectivity.
... The assimilation uses the ensemble square root filter (Whitaker and Hamill, 2002) to assimilate historical observations y into the 80-member ensemble of 20CRv3 (x_b), yielding x_a. For the following, see Brönnimann (2022). ...
Article
Full-text available
Global surface air temperature increased by ca. 0.5 °C from the 1900s to the mid-1940s, also known as Early 20th Century Warming (ETCW). However, the ETCW started from a particularly cold phase, peaking in 1908–1911. The cold phase was global but more pronounced in the Southern Hemisphere than in the Northern Hemisphere and most pronounced in the Southern Ocean, raising the question of whether uncertainties in the data might play a role. Here we analyse this period based on reanalysis data and reconstructions, complemented with newly digitised ship data from 1903–1916, as well as land observations. The cooling is seen consistently in different data sets, though with some differences. Results suggest that the cooling was related to a La-Niña-like pattern in the Pacific, a cold tropical and subtropical South Atlantic, a cold extratropical South Pacific, and cool southern midlatitude land areas. The Southern Annular Mode was positive, with a strengthened Amundsen–Bellingshausen seas low, although the spread of the data products is considerable. All results point to a real climatic phenomenon as the cause of this anomaly and not a data artefact. Atmospheric model simulations are able to reproduce temperature and pressure patterns, consistent with a real and perhaps ocean-forced signal. Together with two volcanic eruptions just before and after the 1908–1911 period, the early 1900s provided a cold start into the ETCW.
... EnKF is an advanced DA method that represents the uncertainties of state variables using a stochastic ensemble. The ensemble square root filter (EnSRF) approach, introduced by Whitaker and Hamill [31], was used to constrain the NOx emissions in this study. EnSRF obviates the need to perturb observations and has computational advantages over the original EnKF. ...
Article
Full-text available
Accurate estimates of fossil fuel CO2 (FFCO2) emissions are of great importance for climate prediction and mitigation regulations but remain a significant challenge for accounting methods relying on economic statistics and emission factors. In this study, we employed a regional data assimilation framework to assimilate in situ NO2 observations, allowing us to combine observation-constrained NOx emissions coemitted with FFCO2 and grid-specific CO2-to-NOx emission ratios to infer the daily FFCO2 emissions over China. The estimated national total for 2016 was 11.4 PgCO2·yr–1, with an uncertainty (1σ) of 1.5 PgCO2·yr–1 that accounted for errors associated with atmospheric transport, inversion framework parameters, and CO2-to-NOx emission ratios. Our findings indicated that widely used “bottom-up” emission inventories generally ignore numerous activity level statistics of FFCO2 related to energy industries and power plants in western China, whereas the inventories are significantly overestimated in developed regions and key urban areas owing to exaggerated emission factors and inexact spatial disaggregation. The optimized FFCO2 estimate exhibited more distinct seasonality with a significant increase in emissions in winter. These findings advance our understanding of the spatiotemporal regime of FFCO2 emissions in China.
... Many implementations use an ad-hoc tuning method to specify a localisation. They assume the correlations decrease with distance and, in order to mitigate the sampling error, either apply a cut off beyond which the correlations are zeroed (Houtekamer and Mitchell, 1998) or a damping to distant correlations (Houtekamer and Mitchell, 2001;Roh et al., 2015;Whitaker and Hamill, 2002). The shape of the damping is commonly a Gaspari-Cohn function which is similar to a Gaussian in shape but has compact support requiring it to go to zero at a finite distance (Gaspari and Cohn, 1999): it is identified by a width parameter. ...
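The Gaspari–Cohn function mentioned here is a fifth-order piecewise rational correlation function with compact support. A direct NumPy transcription (with c taken as the half-width parameter, so the taper reaches exactly zero at twice that distance):

```python
import numpy as np

def gaspari_cohn(d, c):
    """Gaspari–Cohn fifth-order compactly supported correlation function.

    d : distance(s), c : half-width parameter. Returns values in [0, 1]
    that decrease with distance and are exactly zero for d >= 2c.
    """
    r = np.abs(np.atleast_1d(np.asarray(d, dtype=float))) / c
    taper = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri = r[inner]
    taper[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                    - (5.0 / 3.0) * ri**2 + 1.0)
    ro = r[outer]
    taper[outer] = ((1.0 / 12.0) * ro**5 - 0.5 * ro**4 + 0.625 * ro**3
                    + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0
                    - (2.0 / 3.0) / ro)
    return taper
```

Multiplying ensemble covariances (or gain columns) elementwise by this taper damps distant correlations and zeroes them beyond a finite distance, which is the compact-support property that distinguishes it from a plain Gaussian.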
... The Kalman filter technique is a linear analysis based on Bayes's theorem for linear systems; however, it has several variants which have been designed for nonlinear systems; e.g. the extended Kalman filter (Jazwinski, 1970), reduced rank square root filter (Hamill and Snyder, 2000; Voorrips et al., 1999; Whitaker and Hamill, 2002), ensemble Kalman Filter (Evensen, 2003) and sigma-point Kalman filter (Van Der Merwe et al., 2004). These variants can be used to improve the ensemble wave parameters such as significant wave height (H_s) from a model with low computational cost. ...
... The s scaling factors are optimized using an ensemble Kalman smoother as described in Peters et al. (2005), which is based on the ensemble square root filter presented by Whitaker and Hamill (2002). In the filter, the error covariance matrix P (both a priori and a posteriori) of size [s × s] is represented by information in a smaller dimension N, which corresponds to the number of ensemble members and is set to 192 in our applications. ...
Article
Full-text available
We present the first application of the ICOsahedral Nonhydrostatic model with Aerosols and Reactive Trace gases (ICON-ART) to inverse modeling of greenhouse gas fluxes with an ensemble Kalman smoother. For this purpose, we extended ICON-ART to efficiently handle gridded emissions, generate an ensemble of perturbed emissions during runtime and use nudging on selected variables to keep the simulations close to analyzed meteorology. We show that the system can optimize total and anthropogenic European CH4 fluxes on a national scale in an idealized setup using pseudo-observations from a realistic network of measurement stations. However, we were unable to constrain the sum of the natural emission sources of comparatively low magnitude. Also, regions with low emissions and regions with low observational coverage could not be optimized individually for lack of observational constraints. Furthermore, we investigated the sensitivities towards different inversion parameters and design choices with 15 sensitivity runs using the same idealized setup, demonstrating the robustness of the approach provided some minimal requirements of the setup are met (e.g., number of ensemble members). Subsequently, we applied the system to real in situ observations from 28 European stations for three years: 2008, 2013 and 2018. We used a priori anthropogenic fluxes from the EDGARv6 inventory and a priori natural fluxes from peatlands and mineral soils, inland waters, the ocean, biofuels and biomass burning, and geology. Our results for the year 2018 indicate that anthropogenic emissions may be underestimated in EDGARv6 by ca. 25 % in the Benelux countries and, to a smaller degree, in northwestern France and southern England. In the rest of the domain, anthropogenic fluxes are corrected downwards by the inversion, suggesting an overestimation in the a priori. 
For most countries, this means that the a posteriori country-total anthropogenic emissions are closer to the values reported to the United Nations Framework Convention on Climate Change (UNFCCC) than the a priori emissions from EDGARv6. Aggregating the a posteriori emissions across the EU27 + UK results in a total of 17.4 Tg yr−1, while the a priori emissions were 19.9 Tg yr−1. Our a posteriori is close to the total reported to the UNFCCC of 17.8 Tg yr−1. Natural emissions are reduced from their a priori magnitude almost everywhere, especially over Italy and Romania–Moldova, where a priori geological emissions are high, and over the United Kingdom and Scandinavia, where emissions from peatlands and wetlands were possibly unusually low during the hot and dry summer of 2018. Our a posteriori anthropogenic emissions for the EU27 + UK fall within the range estimated by global top-down studies but are lower than most other regional inversions. However, many of these studies have used observations from different measurement stations or satellite observations. The spatial pattern of the emission increments in our results, especially the increase in the Benelux countries, also agrees well with other regional inversions.
... Zhang & Weng, 2015). The PSU WRF-EnKF system uses the EnSRF (Whitaker & Hamill, 2002) variation of the EnKF. Community Radiative Transfer Model (CRTM) version 2.3.0 is used to calculate simulated IR and MW BTs with microphysics-consistent scattering properties inferred from non-spherical particles following Sieron et al. (2017, 2018), together with Adaptive Background Error Inflation to help treat the nonlinearities and non-Gaussianities associated with all-sky BT assimilation. ...
Article
Full-text available
Plain Language Summary Satellite observations are the backbone of modern weather forecast operations, especially for severe weather monitoring and prediction. However, they are also severely underutilized by computer weather models. Many satellite observations impacted by clouds and precipitation are not used in these models, despite their ability to characterize important features of ongoing severe weather events. This study focuses on satellite observations at microwave (MW) frequencies that are impacted by clouds and precipitation. We explore their potential benefits by incorporating them into a computer weather model using data assimilation and examining their impact on severe weather forecasts. We utilize the 10 August 2020 Midwest derecho that produced extensive wind damage as a case study. Because these MW observations contain contributions from precipitating particles within the derecho, assimilating these observations yields better depictions of the derecho structure in the computer model. Consequently, more accurate forecasts of surface gusts are produced. The results of this study suggest a promising avenue for improving severe weather forecasts worldwide in the future, especially for regions that lack the resources and infrastructure to support high‐spatiotemporal‐resolution weather observations.
... EnKF relies on ensemble statistics to compute the error covariance matrix during the data assimilation process. In practical data assimilation, various ensemble filters [22] and derivative methods (e.g., the Ensemble Transform Kalman Filter, Ensemble Square-root Kalman Filter, Ensemble Adjustment Kalman Filter [25][26][27], and Unscented Kalman Filter) have been proposed to implement ensemble updating in the Kalman filter. Among them, the Ensemble Adjustment Kalman Filter (EAKF) developed by Anderson [25] can decompose observations into a series of scalars and assimilate them in turn, making it well suited to coupled data assimilation problems with multiple model components. ...
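The serial, one-observation-at-a-time processing described in the excerpt above can be sketched as follows (a minimal illustration using a square-root update so that no observation perturbations are needed; all names and values are our own, not from the cited papers):

```python
import numpy as np

def serial_assimilate(ens, obs, H, r):
    """Assimilate p scalar observations one at a time (serial-filter sketch).

    ens : (n_ens, n) prior ensemble
    obs : (p,) observed values; H : (p, n) rows of a linear obs operator
    r   : (p,) observation-error variances
    """
    for y, h, rj in zip(obs, H, r):
        xb = ens.mean(axis=0)
        anom = ens - xb
        hx = anom @ h                                  # obs-space anomalies
        s = hx @ hx / (len(ens) - 1)                   # H P^b H^T (scalar)
        gain = (anom.T @ hx / (len(ens) - 1)) / (s + rj)
        alpha = 1.0 / (1.0 + np.sqrt(rj / (s + rj)))   # square-root reduction
        xa = xb + gain * (y - h @ xb)                  # mean update
        ens = xa + anom - np.outer(hx, alpha * gain)   # anomaly update
    return ens

rng = np.random.default_rng(4)
ens0 = rng.normal(size=(20, 4))
H = np.eye(4)[:2]                                      # observe components 0 and 1
ens1 = serial_assimilate(ens0, np.array([1.0, -1.0]), H, np.array([0.5, 0.5]))
```

Because each scalar observation updates the full state through the ensemble-estimated covariance, the loop reproduces the joint update without forming a p × p matrix.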
Article
Full-text available
This paper introduces a novel ensemble adjustment Kalman filter (EAKF) that integrates a machine-learning approach. The conventional EAKF adopts linear and Gaussian assumptions, making it difficult to handle cross-component updates in strongly coupled data assimilation (SCDA). The new approach employs nonlinear variable relationships established by a deep neural network (DNN) during the analysis stage of the EAKF, which nonlinearly projects observation increments into the state variable space. It can diminish errors in estimating cross-component error covariance arising from insufficient ensemble members, thereby improving the SCDA analysis. A conceptual coupled model is employed in this paper to conduct twin experiments, validating the DNN–EAKF’s capability to outperform the conventional EAKF in SCDA. The results reveal that the DNN–EAKF can make SCDA superior to weakly coupled data assimilation (WCDA) with a limited ensemble size. The root-mean-squared errors are reduced by up to 70%, while the anomaly correlation coefficients are increased by up to 20% when the atmospheric observations are used to update the ocean variables directly. The other model components can also be improved through SCDA. This approach is anticipated to offer insights for future methodological integrations of machine learning and data assimilation and provide methods for SCDA applications in coupled general circulation models.
... where x is the state vector (dimension n × 1) and y is the observation vector (dimension p × 1), the classic ensemble square root filter (EnSRF; Whitaker & Hamill, 2002) updates the prior ensemble mean ...
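The excerpt trails off at the mean update. For reference, the EnSRF mean update for a single scalar observation can be sketched as follows (a minimal illustration with our own variable names; the gain is estimated from the ensemble):

```python
import numpy as np

def ensrf_mean_update(xb_ens, y, h, r):
    """EnSRF mean update for one scalar observation (sketch).

    xb_ens : (n_ens, n) prior ensemble
    y      : scalar observed value
    h      : (n,) row of a linear observation operator
    r      : scalar observation-error variance
    """
    n_ens = len(xb_ens)
    xb_mean = xb_ens.mean(axis=0)
    anom = xb_ens - xb_mean                   # prior perturbations x'
    hx = anom @ h                             # H x' for each member
    pb_hT = anom.T @ hx / (n_ens - 1)         # P^b H^T
    hpb_hT = hx @ hx / (n_ens - 1)            # H P^b H^T (scalar)
    gain = pb_hT / (hpb_hT + r)               # Kalman gain K
    return xb_mean + gain * (y - h @ xb_mean)

rng = np.random.default_rng(0)
ens = rng.normal(size=(40, 3))
xa_mean = ensrf_mean_update(ens, y=1.0, h=np.array([1.0, 0.0, 0.0]), r=0.5)
```

The mean is pulled toward the observation by the fraction K of the innovation, exactly as in the standard Kalman filter.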
Article
Full-text available
For paleoclimate data assimilation (PDA), a hybrid gain analog offline ensemble Kalman filter (HGAOEnKF) is proposed. It keeps the benefits of the analog offline ensemble Kalman filter (AOEnKF), which constructs analog ensembles from existing climate simulations with joint information of the proxies. The analog ensembles can provide a more accurate prior ensemble mean and more “flow‐dependent” error covariances than randomly sampled ensembles. HGAOEnKF further incorporates the benefits of static prior error covariances, which capture large‐scale error correlations better and mitigate sampling errors more effectively than the sample prior error covariances, through a hybrid gain approach within an ensemble framework. Observing system simulation experiments are conducted for various data assimilation methods, using ensemble simulations from the Community Earth System Model‐Last Millennium Ensemble Project. Results show that using static prior error covariances estimated from a sufficiently large sample set is beneficial for the traditional offline ensemble Kalman filter (OEnKF) and the AOEnKF. The HGAOEnKF method is superior to the OEnKF and AOEnKF with and without static prior error covariances, especially for the reconstruction of extreme events. The advantages of HGAOEnKF over OEnKF and AOEnKF, with and without static prior error covariances, persist across varying sample sizes and in the presence of model errors.
... The superscripts f and a denote the forecast and the analysis estimates. We use the Ensemble Square Root Filter (EnSRF), which solves the analysis without perturbing observations and performs better than the stochastic EnKF (Whitaker and Hamill, 2002). The ensemble anomaly is updated as follows: ...
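The anomaly update that the excerpt cuts off can be sketched as follows. In the EnSRF of Whitaker and Hamill (2002), anomalies are updated with a reduced gain K̃ = αK so that the analysis variance is correct without perturbing observations (a minimal single-observation sketch; names are illustrative):

```python
import numpy as np

def ensrf_anomaly_update(anom, h, r):
    """EnSRF anomaly update for one scalar observation (sketch).

    Uses the reduced gain of Whitaker and Hamill (2002),
        K_tilde = alpha * K,  alpha = 1 / (1 + sqrt(r / (H P H^T + r))),
    so no observation perturbations are needed.
    """
    n_ens = anom.shape[0]
    hx = anom @ h                             # H x' for each member
    s = hx @ hx / (n_ens - 1)                 # H P^b H^T (scalar)
    gain = (anom.T @ hx / (n_ens - 1)) / (s + r)
    alpha = 1.0 / (1.0 + np.sqrt(r / (s + r)))
    return anom - np.outer(hx, alpha * gain)  # x'^a = x'^b - K_tilde H x'^b

rng = np.random.default_rng(1)
h = np.array([1.0, 0.0, 0.0])
ens = rng.normal(size=(40, 3))
anom_b = ens - ens.mean(axis=0)
anom_a = ensrf_anomaly_update(anom_b, h, r=0.5)

# obs-space variances: the EnSRF yields exactly s * r / (s + r)
s_b = (anom_b @ h) @ (anom_b @ h) / 39
s_a = (anom_a @ h) @ (anom_a @ h) / 39
```

The reduced gain makes the updated obs-space variance match the Kalman filter value (1 − KH)·HPᵇHᵀ exactly, member by member, rather than only in expectation.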
Preprint
Full-text available
This study emphasises the importance of soil moisture (SM) in subseasonal-to-seasonal (S2S) predictions at midlatitudes. To address this, we introduce the Norwegian Climate Prediction Model Land (NorCPM-Land), a land reanalysis framework tailored for integration with the Norwegian Climate Prediction Model (NorCPM). NorCPM-Land assimilates blended SM data from the European Space Agency’s Climate Change Initiative into a 30-member offline simulation of the Community Land Model with fluxes from the coupled model. The assimilation of SM data reduces the error in SM by 10.5 % when validated against independent SM observations. It also improves latent heat flux estimates, illustrating that the adjustment of underlying SM significantly augments the capacity to model land surface dynamics. We evaluate the added value of land initialisation for subseasonal predictions by comparing the performance of hindcasts (retrospective predictions) using the standard NorCPM with a version where the land initial condition is taken from the NorCPM-Land reanalysis. The hindcasts cover the period 2000 to 2019 with four start dates per year. Land initialisation improves predictions up to a 3.5-month lead time for SM and a 1.5-month lead time for temperature and precipitation. The largest improvements are observed in regions with significant land-atmosphere coupling, such as the central United States, the Sahel, and central India. It also better captures extreme (high and low) temperature events in parts of Europe, the United States, and Asia at mid and high latitudes. Overall, our study provides further evidence for the significant role of SM content in enhancing the accuracy of subseasonal predictions, and it provides a technique for improved land initialisation, utilising the same model employed in climate predictions.
... The Pennsylvania State University Weather Research and Forecasting model Ensemble Kalman Filter (PSU-WRF-EnKF) cycling data assimilation system was developed by Meng and Zhang [3,4] and Zhang et al. [8]. The EnKF method employed in this study is the serial EnKF with square root modification, called the Ensemble Square Root Filter (EnSRF, [6]). To avoid filter divergence, the relax-to-prior-perturbation (RTPP) technique and Gaspari and Cohn [1] localization were used. ...
Chapter
Full-text available
The assimilation of radar reflectivity and radial velocity from the WSR-88D radar and lidar water vapor profile observations could improve the forecast of the location and timing of bores associated with nocturnal convection. This study describes the assimilation of such data observed during the 2015 Plains Elevated Convection at Night (PECAN) field campaign. The model and data assimilation system employed is the Pennsylvania State University Weather Research and Forecasting model Ensemble Kalman Filter (PSU-WRF-EnKF) cycling data assimilation system. The lidars include the Atmospheric Lidar for Validation, Interagency Collaboration and Education (ALVICE), the University of Wyoming King Air compact Raman lidar, the Atmospheric Radiation Measurement (ARM) Raman lidar, and the National Center for Atmospheric Research (NCAR) micropulse Differential Absorption Lidar (DIAL). The bore propagation was observed by both radar and the ALVICE lidar in Kansas on 14 July 2015, and nocturnal convection initiated near the bore/density current. Without assimilating any observations, the WRF model did not simulate the location of the bore well, even though it captured the bore structure quite well. The assimilation of WSR-88D radar observations improved the location and timing forecasts of the nocturnal convection and the associated bore.
... DART can perform both stochastic and deterministic EnKF algorithms, where only the former perturbs observations. When background and observation error distributions are near-Gaussian, the use of perturbed observations is known to degrade the quality of the ensemble-mean analysis relative to that produced by deterministic filters (Anderson, 2001;Whitaker and Hamill, 2002) and a theoretically similar deterministic variationally based EDA (Bowler et al., 2012). On the other hand, as the forecast model and observation operators become increasingly nonlinear, there is evidence that an EDA with perturbed observations avoids some pathological behaviors that appear in deterministic EnKFs (Lawson and Hansen, 2004;Anderson, 2010;Lei et al., 2010;Anderson, 2020). ...
Article
Full-text available
An ensemble of 3D ensemble-variational (En-3DEnVar) data assimilations is demonstrated with the Joint Effort for Data assimilation Integration (JEDI) with the Model for Prediction Across Scales – Atmosphere (MPAS-A) (i.e., JEDI-MPAS). Basic software building blocks are reused from previously presented deterministic 3DEnVar functionality and combined with a formal experimental workflow manager in MPAS-Workflow. En-3DEnVar is used to produce an 80-member ensemble of analyses, which are cycled with ensemble forecasts in a 1-month experiment. The ensemble forecasts approximate a purely flow-dependent background error covariance (BEC) at each analysis time. The En-3DEnVar BECs and prior ensemble-mean forecast errors are compared to those produced by a similar experiment that uses the Data Assimilation Research Testbed (DART) ensemble adjustment Kalman filter (EAKF). The experiment using En-3DEnVar produces a similar ensemble spread to and slightly smaller errors than the EAKF. The ensemble forecasts initialized from En-3DEnVar and EAKF analyses are used as BECs in deterministic cycling 3DEnVar experiments, which are compared to a control experiment that uses 20-member MPAS-A forecasts initialized from Global Ensemble Forecast System (GEFS) initial conditions. The experimental ensembles achieve mostly equivalent or better performance than the off-the-shelf ensemble system in this deterministic cycling setting, although there are many obvious differences in configuration between GEFS and the two MPAS ensemble systems. An additional experiment that uses hybrid 3DEnVar, which combines the En-3DEnVar ensemble BEC with a climatological BEC, increases tropospheric forecast quality compared to the corresponding pure 3DEnVar experiment. The JEDI-MPAS En-3DEnVar is technically working and useful for future research studies. 
Tuning of observation errors and spread is needed to improve performance, and several algorithmic advancements are needed to improve computational efficiency for larger-scale applications.
... The assimilation uses the Ensemble Square Root filter (Whitaker and Hamill, 2002) to assimilate historical observations y into the 80-member ensemble of 20CRv3 (xb), yielding xa. For the following, see Brönnimann (2022). ...
Preprint
Full-text available
Global surface air temperature increased by ca. 0.5 °C from the 1900s to the mid-1940s, also known as Early Twentieth Century Warming (ETCW). However, the ETCW started from a particularly cold phase, peaking in 1908–1911. The cold phase was global but more pronounced in the Southern Hemisphere than in the Northern Hemisphere and most pronounced in the Southern Ocean, raising the question whether uncertainties in the data might play a role. Here we analyse this period based on reanalysis data and reconstructions, complemented with newly digitized ship data from 1903–1916 as well as land observations. The cooling is seen consistently in different data sets, though with some differences. Results suggest that the cooling was related to a La Niña-like pattern in the Pacific, a cold tropical and subtropical South Atlantic, a cold extratropical South Pacific, and cool Southern midlatitude land areas. The Southern Annular Mode was positive, with a strengthened Amundsen-Bellingshausen seas low, although the spread of the data products is considerable. All results point to a real climatic phenomenon as the cause of this anomaly and not a data artefact. Atmospheric model simulations are able to reproduce temperature and pressure patterns, consistent with a real and perhaps ocean-forced signal. Together with two volcanic eruptions just before and after the 1908–1911 period, the early 1900s provided a cold start into the ETCW.
... Given the development of observation networks and advanced data assimilation strategies, timely and dynamics-based emission estimates with high temporal resolutions can be achieved by harmonically constraining the atmospheric-chemical model with dense observations of trace gas compounds through an optimal assimilation methodology. The ensemble Kalman smoother (EnKS) (Whitaker et al., 2002; Peters et al., 2007; Peng et al., 2015), as a four-dimensional (4D) assimilation algorithm, makes use of chemical observations from the past to the future to provide an optimal estimate of source emissions; it can capture the "error of the day" and construct fine emission characteristics with high temporal and spatial resolutions by using short-term ensemble forecasts (Kalnay, 2002). Since 2013, fine particulate matter pollution (PM2.5, particles smaller than 2.5 µm in diameter), the most urgent threat to public health, has persistently decreased, while ground-based observations of PM2.5 have progressively increased (Huang et al., 2018). ...
Article
Full-text available
Timely, continuous, and dynamics-based estimates of PM2.5 emissions with a high temporal resolution can be objectively and optimally obtained by assimilating observed surface PM2.5 concentrations using flow-dependent error statistics. The annual dynamics-based estimates of PM2.5 emissions averaged over mainland China for the years 2016–2020, without biomass burning emissions, are 7.66, 7.40, 7.02, 6.62, and 6.38 Tg, respectively, which are very close to the values of the Multi-resolution Emission Inventory (MEIC). Annual PM2.5 emissions in China have consistently decreased by approximately 3 % to 5 % from 2017 to 2020. Significant PM2.5 emission reductions occurred frequently in regions with large PM2.5 emissions. COVID-19 could cause a significant reduction of PM2.5 emissions in the North China Plain and the northeast of China in 2020. The magnitudes of PM2.5 emissions were greater in the winter than in the summer. PM2.5 emissions show an obvious diurnal variation that varies significantly with the season and urban population. Compared to the diurnal variations of PM2.5 emission fractions estimated based on diurnal variation profiles from the US and EU, the estimated PM2.5 emission fractions are 1.25 % larger during the evening, the morning peak is 0.57 % smaller in winter and 1.05 % larger in summer, and the evening peak is 0.83 % smaller. Improved representations of PM2.5 emissions across timescales can benefit emission inventories, regulation policy, and emission trading schemes, especially for high-temporal-resolution air quality forecasting and policy response to severe haze pollution or rare human events with significant socioeconomic impacts.
... Ensemble assimilation runs were created using the Pennsylvania State University ensemble Kalman filter (PSU-EnKF) data assimilation system (Zhang et al. 2006;Weng and Zhang 2012), which employs a serial ensemble Kalman filter (Whitaker and Hamill 2002). An important advantage of ensemble assimilation techniques is that they take advantage of flow-dependent forecast errors to characterize both the state (ensemble mean) and its uncertainty (ensemble spread). ...
Article
Data from rawinsondes launched during intensive observation periods (IOPs) of the Ontario Winter Lake-effect Systems (OWLeS) field project reveal that elevated mixed layers (EMLs) in the lower troposphere were relatively common near Lake Ontario during OWLeS lake-effect events. Conservatively, EMLs exist in 193 of the 290 OWLeS IOP soundings. The distribution of EML base pressure derived from the OWLeS IOP soundings reveals two classes of EML, one that has a relatively low-elevation base (900 – 750 hPa) and one that has a relatively high-elevation base (750 – 500 hPa). It is hypothesized that the former class of EML, which is the focus of this research, is, at times, the result of mesoscale processes related to individual Great Lakes. WRF reanalysis fields from a case study during the OWLeS field project provide evidence of two means by which low-elevation base EMLs can originate from the lake-effect boundary layer convection and associated mesoscale circulations. First, such EMLs can form within the upper-level outflow branches of mesoscale solenoidal circulations. Evacuated Great Lake-modified convective boundary layer air aloft then lies above ambient air of a greater static stability, forming EMLs. Second, such EMLs can form in the absence of a mesoscale solenoidal circulation when Great Lake-modified convective boundary layers overrun ambient air of a greater density. The reanalysis fields show that EMLs and layers of reduced static stability tied to Great Lake-modified convective boundary layers can extend downwind for hundreds of kilometers from their areas of formation. Operational implications and avenues for future research are discussed.
... , is referred to as the Kalman gain K. To account for a bias in the covariance analysis, we used the ensemble square root filter as proposed by Whitaker and Hamill (2002) and updated the ensemble mean and the anomalies from the ensemble mean individually, yielding the separate Eqs. (3) and (4). ...
Article
Full-text available
Climate reconstructions give insights in monthly and seasonal climate variability in the past few hundred years. However, for understanding past extreme weather events and for relating them to impacts, for example through crop yield simulations or hydrological modelling, reconstructions on a weather timescale are needed. Here, we present a data set of 258 years of daily temperature and precipitation fields for Switzerland from 1763 to 2020. The data set was reconstructed with the analogue resampling method, which resamples meteorological fields for a historical period based on the most similar day in a reference period. These fields are subsequently improved with data assimilation for temperature and bias correction for precipitation. Even for an early period prior to 1800 with scarce data availability, we found good validation results for the temperature reconstruction especially in the Swiss Plateau. For the precipitation reconstruction, skills are considerably lower, which can be related to the few precipitation measurements available and the heterogeneous nature of precipitation. By means of a case study of the wet and cold years from 1769 to 1772, which triggered widespread famine across Europe, we show that this data set allows more detailed analyses than hitherto possible.
... For the stochastic EnKF, perturbations are added to the ensemble equivalent of the observation, accounting for nonsymmetric observation errors (van Leeuwen, 2020). Perturbing observations introduces additional sampling error in the analysis, which for applications with few ensemble members might be a significant contribution (Sakov and Oke, 2008a; Whitaker and Hamill, 2002). In the DEnKF, the assumption of a small KH term leads to a simplified ensemble transform matrix (ETM) used to update the ensemble anomalies. ...
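As a rough illustration of the DEnKF idea mentioned in the excerpt above (a sketch of Sakov and Oke's scheme for a single scalar observation, with our own names, not code from the cited work), the mean is updated with the full Kalman gain while the anomalies use half the gain:

```python
import numpy as np

def denkf_update(ens, y, h, r):
    """DEnKF sketch (Sakov and Oke, 2008): a deterministic EnKF where the
    anomalies are updated with HALF the Kalman gain, so no observation
    perturbations are needed. Single scalar observation, linear operator h."""
    n_ens = ens.shape[0]
    xb = ens.mean(axis=0)
    anom = ens - xb
    hx = anom @ h                              # obs-space anomalies
    s = hx @ hx / (n_ens - 1)                  # H P^b H^T
    gain = (anom.T @ hx / (n_ens - 1)) / (s + r)
    xa = xb + gain * (y - h @ xb)              # full gain for the mean
    anom_a = anom - 0.5 * np.outer(hx, gain)   # half gain for the anomalies
    return xa + anom_a

rng = np.random.default_rng(2)
ens_b = rng.normal(size=(30, 2))
ens_a = denkf_update(ens_b, y=0.5, h=np.array([1.0, 0.0]), r=1.0)
```

The half-gain anomaly update is the first-order approximation of the exact square-root factor, which is what the small-KH assumption buys.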
Article
Full-text available
An operational ocean and sea ice forecast model, Barents-2.5, is implemented for short-term forecasting at the coast off northern Norway, the Barents Sea, and the waters around Svalbard. Primary forecast parameters are sea ice concentration (SIC), sea surface temperature (SST), and ocean currents. The model also provides input data for drift modeling of pollutants, icebergs, and search-and-rescue applications in the Arctic domain. Barents-2.5 has recently been upgraded to include an ensemble prediction system with 24 daily realizations of the model state. SIC, SST, and in situ hydrography are constrained through the ensemble Kalman filter (EnKF) data assimilation scheme executed in daily forecast cycles with a lead time up to 66 h. Here, we present the model setup and validation in terms of SIC, SST, in situ hydrography, and ocean and ice velocities. In addition to the model's forecast capabilities for SIC and SST, the performance of the ensemble in representing the model's uncertainty and the performance of the EnKF in constraining the model state are discussed.
... Each of the above studies that assimilate ABI all-sky observations for convective-scale NWP apply the ensemble Kalman filter (EnKF; Whitaker and Hamill 2002) method for DA. The EnKF is widely utilized for convection-allowing DA and NWP due to its ability to sample flowdependent background error covariances and easily generate ensemble perturbations for the forecast period. ...
Article
The Advanced Baseline Imager (ABI) aboard the GOES-16 and GOES-17 satellites provides high-resolution observations of cloud structures that could be highly beneficial for convective-scale DA. However, only clear-air radiance observations are typically assimilated at operational centers due to a variety of problems associated with cloudy radiance data. As such, many questions remain about how to best assimilate all-sky radiance data, especially when using hybrid DA systems such as EnVar, wherein a nonlinear observation operator can lead to cost function gradient imbalance and slow minimization. Here, we develop new methods for assimilating all-sky radiance observations in EnVar using the novel Rapid Refresh Forecasting System (RRFS) that utilizes the Finite-Volume Cubed-Sphere (FV3) model. We first modify the EnVar solver by directly including brightness temperature (Tb) as a state variable. This modification improves the balance of the cost function gradient and speeds up minimization. Including Tb as a state variable also improves the model fit to observations and increases forecast skill compared to utilizing a standard state vector configuration. We also evaluate the impact of assimilating ABI all-sky radiances in RRFS for a severe convective event in the central Great Plains. Assimilating the radiance observations results in better spin-up of a tornadic supercell. These data also aid in suppressing spurious convection by reducing the snow hydrometeor content near the tropopause and weakening spurious anvil clouds. The all-sky radiance observations pair well with reflectivity observations that remove primarily liquid hydrometeors (i.e., rain) closer to the surface. Additionally, the benefits of assimilating the ABI observations continue into the forecast period, especially for localized convective events.
... The first one is a stochastic interpretation of the Kalman filter equations where every member assimilates perturbed observations upon a perturbed background state issued from the perturbed forecast integration (e.g., Houtekamer and Mitchell 2001;Houtekamer et al. 2014). Alternatively, square-root schemes avoid randomization (Whitaker and Hamill 2002;Tippett et al. 2003) but are more prone to errors when the conditions deviate from Gaussianity (Lawson and Hansen 2004). ...
Article
A three-dimensional ensemble-variational (3DEnVar) data assimilation algorithm has been developed for the high-resolution AROME NWP system. Building on previous work on 3DEnVar for AROME, we describe a formulation of the 3DEnVar that is based on the traditional square-root preconditioning. The localization may be performed either in gridpoint or spectral space, and allows for cross-covariances between surface pressure and the three-dimensional variables. The scheme has capacity for dual resolution, with the ensemble running at a lower 3.2-km spatial resolution than the deterministic AROME running at 1.3 km. This formulation is compatible with the variational bias correction scheme used in AROME. Hybrid covariances are implemented with climatological covariances at 1.3-km resolution being combined with ensemble perturbations that are interpolated to high resolution on the fly in the computation of the gradient. Hybrid 3DEnVar has an increased computational cost compared to 3DVar, which is partly mitigated by the use of dual resolution and the adoption of a flexible convergence criterion in the minimization. To get the full benefit from the ensemble scheme, it is recommended 1) to increase ensemble size from 25 to 50 members and 2) to decrease the localization length scale for the benefit of high-density radar observations. With those changes, the 3DEnVar outperforms the operational AROME-France 3DVar by a significant margin on the first 12 h of forecast range, as evidenced by a 3-month summer experiment. Finally, a case study reports on the improved prediction of heavy rainfall that frequently occurs in the Mediterranean region.
... The community Gridpoint Statistical Interpolation (GSI; version 3.7) based Ensemble Kalman Filter (EnKF, version 1.3) (Wu et al. 2002; Kleist et al. 2009; Wang and Lei 2014; Wang et al. 2013; Johnson et al. 2015; Liu et al. 2018), with extensions to ingest satellite radiances (Jones et al. 2020; Johnson et al. 2022), is used to assimilate observations in this study. This version of the EnKF system uses a square root filter (EnSRF; Whitaker and Hamill 2002). Relaxation to prior perturbation is used to generate sufficient variance among the ensemble members, where 80% of the perturbation inflation occurred with the prior ensemble and 20% of the additional perturbations occurred with the posterior ensemble. ...
Article
On 28 April 2019, hourly forecasts from the operational High-Resolution Rapid Refresh (HRRR) model consistently predicted an isolated supercell storm late in the day near Dodge City, Kansas, that subsequently was not observed. Two convection-allowing model (CAM) ensemble runs are created to explore the reasons for this forecast error and implications for severe weather forecasting. The 40-member CAM ensembles are run using the HRRR configuration of the WRF-ARW Model at 3-km horizontal grid spacing. The Gridpoint Statistical Interpolation (GSI)-based ensemble Kalman filter is used to assimilate observations every 15 min from 1500 to 1900 UTC, with resulting ensemble forecasts run out to 0000 UTC. One ensemble only assimilates conventional observations, and its forecasts strongly resemble the operational HRRR with all ensemble members predicting a supercell storm near Dodge City. In the second ensemble, conventional observations plus observations of WSR-88D radar clear-air radial velocities, WSR-88D diagnosed convective boundary layer height, and GOES-16 all-sky infrared brightness temperatures are assimilated to improve forecasts of the preconvective environment, and its forecasts have half of the members predicting supercells. Results further show that the magnitude of the low-level meridional water vapor flux in the moist tongue largely separates members with and without supercells, with water vapor flux differences of 12% leading to these different outcomes. Additional experiments that assimilate only radar or satellite observations show that both are important to predictions of the meridional water vapor flux. This analysis suggests that mesoscale environmental uncertainty remains a challenge that is difficult to overcome. Significance Statement Forecasts from operational numerical models are the foundation of weather forecasting. 
There are times when these models make forecasts that do not come true, such as 28 April 2019 when successive forecasts from the operational High-Resolution Rapid Refresh (HRRR) model predicted a supercell storm late in the day near Dodge City, Kansas, that subsequently was not observed. Reasons for this forecast error are explored using numerical experiments. Results suggest that relatively small changes to the prestorm environment led to large differences in the evolution of storms on this day. This result emphasizes the challenges to operational severe weather forecasting and the continued need for improved use of all available observations to better define the atmospheric state given to forecast models.
... While the basic KF is inefficient to use in applications with large state spaces (due to the difficulty with respect to propagating very large error covariance matrices from one time to the next), the ensemble Kalman filter (EnKF; Evensen, 1994) is a popular and tractable approximation, which also allows nonlinear systems to be treated. The EnKF exists in many flavours, e.g. in stochastic (Burgers et al., 1998;Houtekamer and Mitchell, 1998) and square root forms (Bishop et al., 2001;Whitaker and Hamill, 2002), which, like the standard KF, are all based on Bayes' theorem and assume that errors in observed, prior, and posterior quantities are Gaussian distributed. Under the EnKF, the prior distribution is described by an ensemble of model forecast states and the posterior distribution by an ensemble of posterior states found by assimilating current observational information. ...
Article
Full-text available
The paper presents a simplification of the Kalman smoother that can be run as a post-processing step using only minimal stored information from a Kalman filter analysis, which is intended for use with large model products such as reanalyses of the Earth system. A simple decay assumption is applied to cross-time error covariances, and we show how the resulting equations relate formally to the fixed-lag Kalman smoother and how they can be solved to give a smoother analysis along with an uncertainty estimate. The method is demonstrated in the Lorenz (1963) idealised system, applied with both an extended and an ensemble Kalman filter and smoother. In each case, the root mean square errors (RMSEs) against the truth, for both assimilated and unassimilated (independent) data, of the new smoother analyses are substantially smaller than for the original filter analyses, while being larger than for the full smoother solution. Up to 70 % (40 %) of the full smoother error reduction is achieved with respect to the extended (ensemble) filters, respectively. The uncertainties derived for the new smoother also agree remarkably well with the actual RMSE values throughout the assimilation period. The ability to run this smoother very efficiently as a post-processor should allow it to be useful for really large model reanalysis products, and especially for ensemble products that are already being developed by various operational centres.
... If Equations 1 and 2 are implemented using ensembles, the observations should be perturbed with samples of the observation error; otherwise, the analysis error covariance P^a will be underestimated (Tippett et al., 2003; Whitaker & Hamill, 2002). Such perturbations will introduce artificial disturbances to the method, weakening its robustness. ...
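The underestimation mentioned in the excerpt above is easy to demonstrate in a scalar toy problem (made-up numbers, purely illustrative): updating every member with the same unperturbed observation shrinks the ensemble variance to (1 − K)²·s instead of the correct Kalman value (1 − K)·s, whereas perturbing the observations recovers the right spread on average:

```python
import numpy as np

# Scalar toy problem: prior variance s, observation-error variance r.
rng = np.random.default_rng(3)
s, r, y = 2.0, 1.0, 0.7
k = s / (s + r)                                   # Kalman gain
prior = rng.normal(0.0, np.sqrt(s), size=100_000)

# Same unperturbed observation for all members: variance -> (1 - k)^2 * s
unperturbed = prior + k * (y - prior)
# Perturbed observations (Burgers et al., 1998): variance -> (1 - k) * s
perturbed = prior + k * (y + rng.normal(0.0, np.sqrt(r), prior.size) - prior)

var_wrong = unperturbed.var()
var_right = perturbed.var()
var_theory = (1 - k) * s                          # correct analysis variance
```

With s = 2 and r = 1, K = 2/3, so the unperturbed update gives a variance near 2/9 ≈ 0.22, well below the correct 2/3 ≈ 0.67 that the perturbed update (and a square-root filter) reproduces.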
Article
Full-text available
Although data assimilation‐based targeted observation approaches have been widely used, obtaining optimal observation sites over a specific region for a prediction target remains a challenge. Hence, this study developed a more practical region‐optional targeted observation method by introducing a projection vector, allowing for the prediction targeted region to be different from the observation one. By minimizing the analysis error variance in a targeted region, the method identified optimal sites through a sequential assimilation framework. This region‐optional method was applied in the targeted observation study of the sea surface temperature (SST) prediction associated with Indian Ocean Dipole (IOD). The first 10 optimal observation sites were identified, with seven sites in the west IO, and three in the east. The results were further validated by conducting observation system simulation experiments using an ensemble adjustment Kalman filter assimilation system in the Community Earth System Model (CESM). The assimilation of observations from the 10 optimal sites was capable of reducing root mean squared errors (RMSEs) by 38.2% when assessing the SST across the IOD key regions, significantly more than the reduction from 10 optimal sites identified via the conventional method, or that from 10 random sites. This improvement was primarily due to the error reduction in the eastern IO, where SST RMSEs were reduced by >50%. The proposed region‐optional targeted observation method can seek optimal sites in any region of interest and is not confined to the targeted region as in conventional algorithms, thus providing a more reasonable method for designing optimal observation networks.
Preprint
The ensemble-based tangent linear model (ETLM), which evolves the static background error covariance (BEC) in time within the data assimilation window, is beneficial for improving the analysis and avoids the need to develop the tangent linear forecast model and its adjoint as in four-dimensional variational data assimilation (4DVar). This study proposes to apply the filtered ETLM (FETLM), which is composed of ensemble perturbations at a limited number of time slots and localized with a quasi-Gaussian filter in a way that avoids calculating the inverse matrix. Tests with the Lorenz (1996) model showed that the FETLM evolved the static BEC as in 4DVar and improved the analysis. In experiments with the FETLM implemented in the hourly updated regional forecast system, the Rapid Refresh Forecast System, the imbalance of the analysis was mitigated due to the flow-dependent BEC. Moreover, the cross-variable covariance in the FETLM made it possible to assimilate radar reflectivity effectively even without hydrometeors as control variables.
Chapter
Full-text available
With the increasing frequency of natural disasters, anticipating hydrometeorological events with the potential to harm people and the environment is ever more urgent. In this context, this chapter addresses the role of hydrological modeling as a fundamental component of an early-warning system, providing examples of the potential for implementing this tool in several regions of Brazil. It also discusses some of the challenges that persist in disaster forecasting, such as the need to improve the accuracy of precipitation forecasts, the integration of atmospheric and hydrological models, and the uncertainties inherent to modeling. This analysis highlights the continuing need for effective monitoring strategies and for improving the response to natural disasters in Brazil.
Chapter
Full-text available
Despite the well-recognized influence of climate on agriculture, agrometeorological modeling is relatively recent. In several fields of science, such as meteorology, agronomy, and hydrology, models play a crucial role by simulating complex processes through mathematical or computational structures. Agrometeorological modeling, in turn, plays a fundamental role in simulating crop behavior, growth cycles, and responses to environmental conditions, contributing both to scientific understanding and to supporting decision-making. In this context, this chapter offers a comprehensive analysis of the impact of meteorological elements on the growth and development of agricultural crops. It also explores several categories of agrometeorological models, emphasizing their applications. This diversity of applications extends to the management of water and energy resources through integrated agro-hydrological modeling. Finally, the chapter presents approaches for addressing the main limitations associated with agrometeorological modeling.
Article
Capabilities to assimilate Geostationary Operational Environmental Satellite "R-series" (GOES-R) Geostationary Lightning Mapper (GLM) flash extent density (FED) data within the operational Gridpoint Statistical Interpolation ensemble Kalman filter (GSI-EnKF) framework were previously developed and tested with a mesoscale convective system (MCS) case. In this study, such capabilities are further developed to assimilate GOES GLM FED data within the GSI ensemble-variational (EnVar) hybrid data assimilation (DA) framework. The results of assimilating the GLM FED data using 3DVar and pure En3DVar (PEn3DVar, using 100% ensemble covariance and no static covariance) are compared with those of EnKF/DfEnKF for a supercell storm case. The focus of this study is to validate the correctness and evaluate the performance of the new implementation rather than to compare the performance of FED DA among different DA schemes. Only the results of 3DVar and PEn3DVar are examined and compared with EnKF/DfEnKF. Assimilation of a single FED observation shows that the magnitude and horizontal extent of the analysis increments from PEn3DVar are generally larger than those from EnKF, which is mainly caused by the different localization strategies used in EnKF/DfEnKF and PEn3DVar, as well as the integration limits of the graupel mass in the observation operator. Overall, the forecast performance of PEn3DVar is comparable to EnKF/DfEnKF, suggesting correct implementation.
Article
Hyperspectral infrared (IR) satellites can provide high-resolution vertical profiles of the atmospheric state, which significantly contributes to the forecast skill of numerical weather prediction, especially for regions with sparse observations. One challenge in assimilating the hyperspectral radiances is how to effectively extract the observation information, due to the interchannel correlations and correlated observation errors. An adaptive channel selection method is proposed, which is implemented within the data assimilation scheme and selects the radiance observation with the maximum reduction of variance in observation space. Compared to the commonly used channel selection method based on the maximum entropy reduction (ER), the adaptive method can provide flow-dependent and time-varying channel selections. The performance of the adaptive selection method is evaluated by assimilating only the synthetic Fengyun-4A ( FY-4A ) GIIRS IR radiances in an observing system simulation experiment (OSSE), with model resolutions from 7.5 to 1.5 km and then 300 m. For both clear-sky and all-sky conditions, the adaptive method generally produces smaller RMS errors of state variables than the ER-based method given similar amounts of assimilated radiances, especially with fine model resolutions. Moreover, the adaptive method has minimum RMS errors smaller than or approaching those with all channels assimilated. For the intensity of the tropical cyclone, the adaptive method also produces smaller errors of the minimum dry air mass and maximal wind speed at different levels, compared to the ER-based selection method. Significance Statement Assimilating satellite radiances has been essential for numerical weather prediction. Hyperspectral infrared satellites provide high-resolution vertical profiles for the atmospheric state and can further improve the numerical weather prediction. 
Due to limited computational resources, and correlated observations and associated errors, efficient and effective ways to assimilate the hyperspectral radiances are needed. An adaptive channel selection method that is incorporated with data assimilation is proposed. The adaptive channel selection can effectively extract the information from hyperspectral radiances under both clear- and all-sky conditions, with increased model resolutions from kilometers to subkilometers.
Article
Full-text available
The Modern Era Reanalysis (ModE-RA) is a global monthly paleo-reanalysis covering the period between 1421 and 2008. To reconstruct past climate fields, an offline data assimilation approach is used, blending together information from an ensemble of transient atmospheric model simulations and observations. In the early period, ModE-RA utilizes natural proxies and documentary data, while from the 17th century onward instrumental measurements are also assimilated. The impact of each observation on the reconstruction is stored in the observation feedback archive, which provides additional information on the input data such as preprocessing steps and the regression-based forward models. The monthly resolved reconstructions include estimates of the most important climate fields. Furthermore, we provide a reconstruction, ModE-RAclim, which together with ModE-RA and the model simulations makes it possible to disentangle the role of observations and model forcings. ModE-RA is best suited to studying intra-annual to multi-decadal climate variability and to analyzing the causes and mechanisms of past extreme climate events.
Article
Probabilistic forecasts in oceanographic applications, such as drift trajectory forecasts for search‐and‐rescue operations, face challenges due to high‐dimensional complex models and sparse spatial observations. We discuss localisation strategies for assimilating sparse point observations and compare the implicit equal‐weights particle filter and a localised version of the ensemble‐transform Kalman filter. First, we verify these methods thoroughly against the analytic Kalman filter solution for a linear advection diffusion model. We then use a non‐linear simplified ocean model to do state estimation and drift prediction. The methods are rigorously compared using a wide range of metrics and skill scores. Our findings indicate that both methods succeed in approximating the Kalman filter reference for linear models of moderate dimensions, even for small ensemble sizes. However, in high‐dimensional settings with a non‐linear model, we discover that the outcomes are significantly influenced by the dependence of the ensemble Kalman filter on relaxation and the particle filter's sensitivity to the chosen model error covariance structure. Upon proper relaxation and localisation parametrisation, the ensemble Kalman filter version outperforms the particle filter in our experiments.
Article
Full-text available
Space exploration is increasingly moving toward small platforms that deliver high scientific return at significantly lower cost. However, miniaturized spacecraft pose challenges from both the technological and the mission analysis points of view. While the former is in constant evolution thanks to manufacturers, the latter remains an open issue, since it is still based on a traditional approach that cannot cope with the new platforms' peculiarities. In this work, a revised preliminary mission analysis approach, merging nominal trajectory optimization with a complete navigation assessment, is formulated in a general form, and the three main blocks composing it are identified. The integrated approach is then specialized for a cislunar test-case scenario, represented by the transfer trajectory from a low lunar orbit to a halo orbit of the CubeSat LUMIO, and each block is modeled with mathematical means. Eventually, optimal solutions minimizing the total costs are sought, showing the benefits of an integrated approach.
Article
Motivated by the observation that the interannual variability of the North Atlantic Oscillation (NAO) is associated with the ensemble emergence of individual NAO events occurring on the intraseasonal time scale, one naturally wonders how the intraseasonal processes cause the interannual variability, and what the dynamics are underlying the multiscale interaction. Using a novel time-dependent and spatially localized multiscale energetics formalism, this study investigates the dynamical sources for the NAO events with different phases and interannual regimes. For the positive-phase events (NAO ⁺ ), the intraseasonal-scale kinetic energy ( K ¹ ) over the North Atlantic sector is significantly enhanced for NAO ⁺ occurring in the negative NAO winter regime (NW), compared to those in the positive winter regime (PW). It is caused by the enhanced inverse cascading from synoptic transients and reduced energy dispersion during the life cycle of NAO ⁺ in NW. For the negative-phase events (NAO ⁻ ), K ¹ is significantly larger during the early and decay stages of NAO ⁻ in NW than that in PW, whereas the reverse occurs in the peak stage. Inverse cascading and baroclinic energy conversion are primary drivers in the formation of the excessive K ¹ during the early stage of NAO ⁻ in NW, whereas only the latter contributes to the larger K ¹ during the decay stage of NAO ⁻ in NW compared to that in PW. The barotropic transfer from the mean flow, inverse cascading and baroclinic energy conversion are all responsible for the strengthened K ¹ in the peak stage of NAO ⁻ in PW.
Chapter
By using an ensemble data assimilation system, this chapter reviews the impact of assimilating observable and retrievable ground-based scanning weather radar information. Pseudo-observations of temperature, humidity, and dual-polarimetric parameters are assimilated in addition to radial velocity and reflectivity. Instead of assimilating the information inside the weather system, retrieved moisture information surrounding precipitation systems could be further obtained by collocated dual-wavelength radars. Via the studies of idealized and different high-impact weather cases, analyses and quantitative precipitation forecasts are examined and validated.
Article
The success of the National Severe Storms Laboratory’s (NSSL) experimental Warn-on-Forecast System (WoFS) to provide useful probabilistic guidance of severe and hazardous weather is mostly due to the frequent assimilation of observations, especially radar observations. Phased-array radar (PAR) technology, which is a potential candidate to replace the current U.S. operational radar network, would allow for even more rapid assimilation of radar observations by providing full-volumetric scans of the atmosphere every ∼1 min. Based on previous studies, more frequent PAR data assimilation can lead to improved forecasts, but it can also lead to ensemble underdispersion and suboptimal observation assimilation. The use of stochastic and perturbed parameter methods to increase ensemble spread is a potential solution to this problem. In this study, four stochastic and perturbed parameter methods are assessed using a 1-km-scale version of the WoFS and include the stochastic kinetic energy backscatter (SKEB) scheme, the physically based stochastic perturbation (PSP) scheme, a fixed perturbed parameters (FPP) method, and a novel surface-model scheme blending (SMSB) method. Using NSSL PAR observations from the 9 May 2016 tornado outbreak, experiments are conducted to assess the impact of the methods individually, in different combinations, and with different cycling intervals. The results from these experiments reveal the potential benefits of stochastic and perturbed parameter methods for future versions of the WoFS. Stochastic and perturbed parameter methods can lead to more skillful forecasts during periods of storm development. Moreover, a combination of multiple methods can result in more skillful forecasts than using a single method. Significance Statement Phased-array radar technology allows for more frequent assimilation of radar observations into ensemble forecast systems like the experimental Warn-on-Forecast System. 
However, more frequent radar data assimilation can eventually cause issues for prediction systems due to the lack of ensemble spread. Thus, the purpose of this study is to explore the use of four stochastic and perturbed parameter methods in a next-generation Warn-on-Forecast System to generate ensemble spread and help prevent the issues from frequent radar data assimilation. Results from this study indicate the stochastic and perturbed parameter methods can improve forecasts of storms, especially during storm development.
Article
The Advanced Very High‐Resolution Radiometer (AVHRR) is a broad‐band, five channel scanner sensing in the visible, near infrared and thermal infrared portions of the electromagnetic spectrum. AVHRR instruments on‐board polar orbiting satellites have data records spanning 30 years. The radiances from the infra‐red channels of AVHRR have been directly assimilated over the ice‐free ocean in the NASA Goddard Earth Observing System (GEOS) since 2017 to constrain skin sea surface temperature (SST). The GEOS system already uses an advanced bulk SST and a skin SST model which incorporates diurnal warming and cool skin temperature on the surface of the ocean. The positive contribution of this effort to numerical weather prediction (NWP) makes it desirable to extend the skin SST data assimilation procedure to reanalysis applications. The AVHRR data from platforms that were previously unexplored for NWP applications are assimilated at the Global Modeling and Assimilation Office (GMAO) at NASA for an upcoming reanalysis project. This study uses results from reanalysis experiments to assess the impact of assimilating AVHRR radiances over the open ocean on the atmospheric state variables, focusing on the analyzed skin temperatures that include the active skin SST model. It is demonstrated that including the skin temperature variations in atmospheric analysis is generally beneficial. The addition of radiance measurements from AVHRR provides further improvements to the overall reanalysis system performance by helping to correct the diurnal heating and reducing errors in the background departure of hyperspectral radiance observations, which are an essential component of atmospheric reanalyses.
Article
Tropical cyclones (TCs) and tropical depressions (TDs), hereafter collectively referred to as tropical storms, often exhibit large year‐to‐year variability in the South Pacific Ocean basin. Many past studies have examined this variability in relation to the El Niño Southern Oscillation (ENSO) phenomenon, particularly using observational data from the post‐satellite era (i.e., after the 1970s when TC observations became more consistent). However, less emphasis is placed on how tropical storms are modulated at interdecadal and decadal time scales such as due to Interdecadal Pacific Oscillation (IPO). This is because post‐satellite data are available for relatively short time period (i.e., post‐1970s), limiting our understanding of the IPO–TC relationship in the South Pacific. Here, using NOAA‐CIRES 20th Century Reanalysis (20CR) dataset, we reconstruct historical records (1871–2014) of TC and depression proxies for the South Pacific Ocean basin, and then utilize these reconstructed proxies to first understand the connections between TC–ENSO and TC–IPO over the 20th century, and then investigate the combined effects of ENSO–IPO effects on TCs and depressions. Results show that La Niña (El Niño) is more dominant on TC activity than El Niño (La Niña) over the western subregion 140–170° E (eastern sub‐region, 170–220° E) as expected. We also show that TC numbers are strongly modulated by the IPO phenomenon with, on average, more TCs occurring during the positive phase than during the negative phase of the IPO in both western and eastern sub‐regions. We show for the first time (using a long‐term reconstructed TC dataset) that the combined phases of El Niño and + IPO account for increased TC activity, as opposed to the combined phase of La Niña and −IPO, in the eastern sub‐region. Similarly, the combined phase of La Niña and + IPO, as opposed to the combined phase of El Niño and −IPO, account for increased TC activity in the western sub‐region. 
However, unlike TCs, the patterns of ENSO variability seem to be reversed for TDs. Changes in large‐scale environmental conditions, such as environmental vertical wind shear, low‐level cyclonic relative vorticity, mid‐tropospheric relative humidity and sea surface temperature are linked to the various modes of variability patterns and their synergistic relationships.
Article
Full-text available
A rational approach is used to identify efficient schemes for data assimilation in nonlinear ocean–atmosphere models. The conditional mean, a minimum of several cost functionals, is chosen for an optimal estimate. After stating the present goals and describing some of the existing schemes, the constraints and issues particular to ocean–atmosphere data assimilation are emphasized. An approximation to the optimal criterion satisfying the goals and addressing the issues is obtained using heuristic characteristics of geophysical measurements and models. This leads to the notion of an evolving error subspace, of variable size, that spans and tracks the scales and processes where the dominant errors occur. The concept of error subspace statistical estimation (ESSE) is defined. In the present minimum error variance approach, the suboptimal criterion is based on a continued and energetically optimal reduction of the dimension of error covariance matrices. The evolving error subspace is characterized by error singular vectors and values, or in other words, the error principal components and coefficients. Schemes for filtering and smoothing via ESSE are derived. The data–forecast melding minimizes variance in the error subspace. Nonlinear Monte Carlo forecasts integrate the error subspace in time. The smoothing is based on a statistical approximation approach. Comparisons with existing filtering and smoothing procedures are made. The theoretical and practical advantages of ESSE are discussed. The concepts introduced by the subspace approach are as useful as the practical benefits. The formalism forms a theoretical basis for the intercomparison of reduced dimension assimilation methods and for the validation of specific assumptions for tailored applications. 
The subspace approach is useful for a wide range of purposes, including nonlinear field and error forecasting, predictability and stability studies, objective analyses, data-driven simulations, model improvements, adaptive sampling, and parameter estimation.
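The core idea of an evolving, reduced-rank error subspace characterized by error singular vectors and values can be sketched with an SVD truncation of an ensemble perturbation matrix (a toy illustration of the dimension-reduction principle, not the ESSE algorithm itself; all dimensions, ranks, and the synthetic data are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, k = 100, 30, 5          # state dim, ensemble size, retained subspace rank

# Synthetic perturbation matrix: k dominant error directions plus weak noise.
signal = rng.standard_normal((n, k)) @ np.diag([5., 4., 3., 2., 1.]) \
         @ rng.standard_normal((k, m))
Xp = signal + 0.1 * rng.standard_normal((n, m))

# Error singular vectors/values: the leading directions span the subspace.
U, s, _ = np.linalg.svd(Xp, full_matrices=False)
E = U[:, :k]                                      # error subspace basis

P_full = Xp @ Xp.T / (m - 1)                      # full-rank sample covariance
P_red = E @ np.diag(s[:k] ** 2 / (m - 1)) @ E.T   # reduced-rank covariance

frac = np.trace(P_red) / np.trace(P_full)
print(frac)   # fraction of error variance captured by the rank-k subspace
```

When the dominant errors really do live in a low-dimensional subspace, the rank-k covariance retains almost all of the error variance at a fraction of the storage and propagation cost, which is the practical motivation for the subspace approach.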
Article
Full-text available
A suboptimal Kalman filter called the ensemble transform Kalman filter (ET KF) is introduced. Like other Kalman filters, it provides a framework for assimilating observations and also for estimating the effect of observations on forecast error covariance. It differs from other ensemble Kalman filters in that it uses ensemble transformation and a normalization to rapidly obtain the prediction error covariance matrix associated with a particular deployment of observational resources. This rapidity enables it to quickly assess the ability of a large number of future feasible sequences of observational networks to reduce forecast error variance. The ET KF was used by the National Centers for Environmental Prediction in the Winter Storm Reconnaissance missions of 1999 and 2000 to determine where aircraft should deploy dropwindsondes in order to improve 24–72-h forecasts over the continental United States. The ET KF may be applied to any well-constructed set of ensemble perturbations. The ET KF technique supersedes the ensemble transform (ET) targeting technique of Bishop and Toth. In the ET targeting formulation, the means by which observations reduced forecast error variance was not expressed mathematically. The mathematical representation of this process provided by the ET KF enables such things as the evaluation of the reduction in forecast error variance associated with individual flight tracks and assessments of the value of targeted observations that are distributed over significant time intervals. It also enables a serial targeting methodology whereby one can identify optimal observing sites given the location and error statistics of other observations. This allows the network designer to nonredundantly position targeted observations. Serial targeting can also be used to greatly reduce the computations required to identify optimal target sites. For these theoretical and practical reasons, the ET KF technique is more useful than the ET technique.
The methodology is illustrated with observation system simulation experiments involving a barotropic numerical model of tropical cyclonelike vortices. These include preliminary empirical tests of ET KF predictions using ET KF, 3DVAR, and hybrid data assimilation schemes—the results of which look promising. To concisely describe the future feasible sequences of observations considered in adaptive sampling problems, an extension to Ide et al.'s unified notation for data assimilation is suggested.
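The transform idea described above can be sketched as a deterministic square-root analysis step (a minimal sketch in the style of ensemble transform filters, using the symmetric square root; the variable names and this particular formulation are assumptions, not the paper's exact algorithm):

```python
import numpy as np

def etkf_analysis(Xb, y, H, R):
    """Deterministic ensemble-transform analysis step.

    Xb : (n, m) background ensemble; y : (p,) observations;
    H  : (p, n) observation operator; R : (p, p) obs-error covariance.
    """
    m = Xb.shape[1]
    xbm = Xb.mean(axis=1, keepdims=True)
    Xp = Xb - xbm                             # background perturbations
    Yp = H @ Xp                               # perturbations in obs space
    Rinv = np.linalg.inv(R)

    # Analysis covariance expressed in ensemble (weight) space.
    Pa_t = np.linalg.inv((m - 1) * np.eye(m) + Yp.T @ Rinv @ Yp)
    wa = Pa_t @ Yp.T @ Rinv @ (y - (H @ xbm).ravel())   # mean-update weights

    # Symmetric square root of (m-1) * Pa_t gives the transform matrix.
    ev, V = np.linalg.eigh((m - 1) * Pa_t)
    Wa = V @ np.diag(np.sqrt(ev)) @ V.T

    return xbm + Xp @ wa[:, None] + Xp @ Wa   # analysis ensemble
```

No perturbed observations are needed: the transform Wa rescales the background perturbations so that the analysis ensemble covariance matches the Kalman update of the sample background covariance exactly, which is the property exploited for rapidly evaluating candidate observing networks.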
Article
Full-text available
An ensemble Kalman filter may be considered for the 4D assimilation of atmospheric data. In this paper, an efficient implementation of the analysis step of the filter is proposed. It employs a Schur (elementwise) product of the covariances of the background error calculated from the ensemble and a correlation function having local support to filter the small (and noisy) background-error covariances associated with remote observations. To solve the Kalman filter equations, the observations are organized into batches that are assimilated sequentially. For each batch, a Cholesky decomposition method is used to solve the system of linear equations. The ensemble of background fields is updated at each step of the sequential algorithm and, as more and more batches of observations are assimilated, evolves to eventually become the ensemble of analysis fields. A prototype sequential filter has been developed. Experiments are performed with a simulated observational network consisting of 542 radiosonde and 615 satellite-thickness profiles. Experimental results indicate that the quality of the analysis is almost independent of the number of batches (except when the ensemble is very small). This supports the use of a sequential algorithm. A parallel version of the algorithm is described and used to assimilate over 100 000 observations into a pair of 50-member ensembles. Its operation count is proportional to the number of observations, the number of analysis grid points, and the number of ensemble members. In view of the flexibility of the sequential filter and its encouraging performance on a NEC SX-4 computer, an application with a primitive equations model can now be envisioned.
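The sequential processing described here can be sketched for the simplest case of one scalar observation per batch, with a Schur-product localization of the ensemble covariance (a perturbed-observation update on an identity-type observation operator; the function name, taper handling, and setup are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def serial_enkf_update(X, obs, obs_idx, r, rho, rng):
    """Assimilate scalar observations one at a time (perturbed-obs EnKF).

    X       : (n, m) ensemble
    obs     : iterable of scalar observed values
    obs_idx : state index observed by each scalar (identity-type operator)
    r       : observation-error variance
    rho     : (n, n) localization matrix (Schur product with ens. covariance)
    """
    X = X.copy()
    n, m = X.shape
    for y, j in zip(obs, obs_idx):
        Xp = X - X.mean(axis=1, keepdims=True)
        # Localized covariance between each state element and the obs site.
        pbh = rho[:, j] * (Xp @ Xp[j] / (m - 1))      # (n,)
        K = pbh / (pbh[j] + r)                         # Kalman gain column
        y_pert = y + rng.normal(0.0, np.sqrt(r), m)    # perturbed observations
        X = X + np.outer(K, y_pert - X[j])
    return X
```

Because each scalar update only inverts a 1 × 1 innovation variance, the cost grows linearly with the number of observations, which is the property that makes the sequential (and parallelizable) formulation attractive for large observation sets.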
Article
Full-text available
The usefulness of a distance-dependent reduction of background error covariance estimates in an ensemble Kalman filter is demonstrated. Covariances are reduced by performing an elementwise multiplication of the background error covariance matrix with a correlation function with local support. This reduces noisiness and results in an improved background error covariance estimate, which generates a reduced-error ensemble of model initial conditions. The benefits of applying the correlation function can be understood in part from examining the characteristics of simple 2 × 2 covariance matrices generated from random sample vectors with known variances and covariance. These show that noisiness in covariance estimates tends to overwhelm the signal when the ensemble size is small and/or the true covariance between the sample elements is small. Since the true covariance of forecast errors is generally related to the distance between grid points, covariance estimates generally have a higher ratio of noise to signal with increasing distance between grid points. This property is also demonstrated using a two-layer hemispheric primitive equation model and comparing covariance estimates generated by small and large ensembles. Covariances from the large ensemble are assumed to be accurate and are used as a reference for measuring errors from covariances estimated from a small ensemble. The benefits of including distance-dependent reduction of covariance estimates are demonstrated with an ensemble Kalman filter data assimilation scheme. The optimal correlation length scale of the filter function depends on ensemble size; larger correlation lengths are preferable for larger ensembles. The effects of inflating background error covariance estimates are examined as a way of stabilizing the filter. It was found that more inflation was necessary for smaller ensembles than for larger ensembles.
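The noise-versus-signal argument can be illustrated numerically: sample a covariance from a small ensemble drawn from a known truth, taper it elementwise with a distance-dependent function, and compare the errors (a toy 1D grid; the Gaussian taper stands in for a compactly supported correlation function, and all length scales are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 40, 10                                    # grid points, ensemble size

# "True" covariance: correlation decaying with grid distance.
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
P_true = np.exp(-(dist / 3.0) ** 2)

# Draw a small ensemble from the true distribution, estimate its covariance.
L = np.linalg.cholesky(P_true + 1e-8 * np.eye(n))
X = L @ rng.standard_normal((n, m))
Xp = X - X.mean(axis=1, keepdims=True)
P_small = Xp @ Xp.T / (m - 1)

taper = np.exp(-(dist / 6.0) ** 2)               # distance-dependent taper
err_raw = np.abs(P_small - P_true).mean()
err_loc = np.abs(taper * P_small - P_true).mean()
print(err_raw, err_loc)   # tapering typically reduces the mean error
```

Distant entries of the raw 10-member estimate are almost pure sampling noise; the elementwise taper damps them toward their (near-zero) true values while leaving the well-estimated short-range covariances nearly untouched.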
Article
Full-text available
The possibility of performing data assimilation using the flow-dependent statistics calculated from an ensemble of short-range forecasts (a technique referred to as ensemble Kalman filtering) is examined in an idealized environment. Using a three-level, quasigeostrophic, T21 model and simulated observations, experiments are performed in a perfect-model context. By using forward interpolation operators from the model state to the observations, the ensemble Kalman filter is able to utilize nonconventional observations. In order to maintain a representative spread between the ensemble members and avoid a problem of inbreeding, a pair of ensemble Kalman filters is configured so that the assimilation of data using one ensemble of short-range forecasts as background fields employs the weights calculated from the other ensemble of short-range forecasts. This configuration is found to work well: the spread between the ensemble members resembles the difference between the ensemble mean and the true state, except in the case of the smallest ensembles. A series of 30-day data assimilation cycles is performed using ensembles of different sizes. The results indicate that (i) as the size of the ensembles increases, correlations are estimated more accurately and the root-mean-square analysis error decreases, as expected, and (ii) ensembles having on the order of 100 members are sufficient to accurately describe local anisotropic, baroclinic correlation structures. Due to the difficulty of accurately estimating the small correlations associated with remote observations, a cutoff radius beyond which observations are not used, is implemented. It is found that (a) for a given ensemble size there is an optimal value of this cutoff radius, and (b) the optimal cutoff radius increases as the ensemble size increases.
Article
Full-text available
This paper discusses an important issue related to the implementation and interpretation of the analysis scheme in the ensemble Kalman filter. It is shown that the observations must be treated as random variables at the analysis steps. That is, one should add random perturbations with the correct statistics to the observations and generate an ensemble of observations that then is used in updating the ensemble of model states. Traditionally, this has not been done in previous applications of the ensemble Kalman filter and, as will be shown, this has resulted in an updated ensemble with a variance that is too low. This simple modification of the analysis scheme results in a completely consistent approach if the covariance of the ensemble of model states is interpreted as the prediction error covariance, and there are no further requirements on the ensemble Kalman filter method, except for the use of an ensemble of sufficient size. Thus, there is a unique correspondence between the error statistics from the ensemble Kalman filter and the standard Kalman filter approach.
Article
Full-text available
The ring-shedding process in the Agulhas Current is studied using the ensemble Kalman filter to assimilate Geosat altimeter data into a two-layer quasi-geostrophic ocean model. The properties of the ensemble Kalman filter are further explored with focus on the analysis scheme and the use of gridded data. The Geosat data consist of 10 fields of gridded sea-surface height anomalies separated 10 days apart, which are added to a climatic mean field. This corresponds to a huge number of data values, and a data reduction scheme must be applied to increase the efficiency of the analysis procedure. Further, it is illustrated how one can resolve the rank problem occurring when too large a data set or too small an ensemble is used.
Article
Full-text available
The need for unified notation in atmospheric and oceanic data assimilation arises from the field's rapid theoretical expansion and the desire to translate it into practical applications. Self-consistent notation is proposed that bridges sequential and variational methods, on the one hand, and operational usage, on the other. Over various other mottoes for this risky endeavor, the authors selected: "When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean --- neither more nor less." (Lewis Carroll, 1871.)
Article
This paper provides an overview of atmospheric data assimilation. It is shown how data assimilation developed historically from the requirement to provide initial conditions for numerical weather prediction. The basic concepts of atmospheric data assimilation are discussed, starting with the scalar case, and progressing through three dimensional spatial analysis to the full four dimensional problem. The most advanced algorithms (4DVAR and the Kalman filter) are introduced briefly and their relation to the simpler algorithms explored. The control of undesirable high frequency oscillations is sketched. The present state of atmospheric data assimilation is discussed and possible future developments are suggested.
Article
Anticipating the opportunity to make supplementary observations at locations that can depend upon the current weather situation, the question is posed as to what strategy should be adopted to select the locations, if the greatest improvement in analyses and forecasts is to be realized. To seek a preliminary answer, the authors introduce a model consisting of 40 ordinary differential equations, with the dependent variables representing values of some atmospheric quantity at 40 sites spaced equally about a latitude circle. The equations contain quadratic, linear, and constant terms representing advection, dissipation, and external forcing. Numerical integration indicates that small errors (differences between solutions) tend to double in about 2 days. Localized errors tend to spread eastward as they grow, encircling the globe after about 14 days. In the experiments presented, 20 consecutive sites lie over the ocean and 20 over land. A particular solution is chosen as the true weather. Every 6 h observations are made, consisting of the true weather plus small random errors, at every land site, and at one ocean site to be selected by the strategy being considered. An analysis is then made, consisting of observations where observations are made and previously made 6-h forecasts elsewhere. Forecasts are made for each site at ranges from 6 h to 10 days. In all forecasts, a slightly weakened external forcing is used to simulate the model error. This process continues for 5 years, and mean-square forecast errors at each site at each range are accumulated. Strategies that attempt to locate the site where the current analysis, as made without a supplementary observation, is most greatly in error are found to perform better than those that seek the oceanic site to which a chosen land site is most sensitive at a chosen range.
Among the former are strategies based on the `breeding' method, a variant of singular vectors, and ensembles of `replicated' observations; the last of these outperforms the others. The authors speculate as to the applicability of these findings to models with more realistic dynamics or without extensive regions devoid of routine observations, and to the real world.
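The 40-variable model described above is what is now commonly called the Lorenz-96 system. A minimal sketch of its standard formulation, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indexing; the forcing F = 8 and the step size are conventional illustrative choices, not taken from the abstract:

```python
import numpy as np

def lorenz96_tendency(x, forcing=8.0):
    """Tendency of the Lorenz (1996) model:
    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, cyclic in i."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt, forcing=8.0):
    """One fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Spin up from a small perturbation of the unstable fixed point x_i = F.
n = 40
x = 8.0 * np.ones(n)
x[0] += 0.01
for _ in range(1000):   # 1000 steps of dt = 0.05
    x = rk4_step(x, 0.05)
```

Twin experiments like the one summarized above integrate two such trajectories from slightly different initial states and track the growth of their difference.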
Article
A nonlinear differential equation of the Riccati type is derived for the covariance matrix of the optimal filtering error. The solution of this "variance equation" completely specifies the optimal filter for either finite or infinite smoothing intervals and stationary or nonstationary statistics. The variance equation is closely related to the Hamiltonian (canonical) differential equations of the calculus of variations. Analytic solutions are available in some cases. The significance of the variance equation is illustrated by examples which duplicate, simplify, or extend earlier results in this field. The Duality Principle relating stochastic estimation and deterministic control problems plays an important role in the proof of theoretical results. In several examples, the estimation problem and its dual are discussed side-by-side. Properties of the variance equation are of great interest in the theory of adaptive systems. Some aspects of this are considered briefly.
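The discrete-time analogue of this "variance equation" is the Riccati recursion for the filtering-error covariance. A toy sketch that iterates the recursion to its steady state for an assumed two-dimensional system (the matrices are illustrative, not from the paper):

```python
import numpy as np

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])   # state-transition matrix
Q = 0.01 * np.eye(2)         # model-error covariance
H = np.array([[1.0, 0.0]])   # observe the first component only
R = np.array([[0.25]])       # observation-error variance

P = np.eye(2)                # initial error covariance
for _ in range(200):
    P = A @ P @ A.T + Q                              # forecast (time-update) step
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain
    P = (np.eye(2) - K @ H) @ P                      # analysis (measurement-update) step
```

At the fixed point of this iteration, P solves the discrete algebraic Riccati equation, giving the stationary filter the abstract refers to.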
Article
Knowledge of the probability distribution of initial conditions is central to almost all practical studies of predictability and to improvements in stochastic prediction of the atmosphere. Traditionally, data assimilation for atmospheric predictability or prediction experiments has attempted to find a single ''best'' estimate of the initial state. Additional information about the initial condition probability distribution is then obtained primarily through heuristic techniques that attempt to generate representative perturbations around the best estimate. However, a classical theory for generating an estimate of the complete probability distribution of an initial state given a set of observations exists. This nonlinear filtering theory can be applied to unify the data assimilation and ensemble generation problem and to produce superior estimates of the probability distribution of the initial state of the atmosphere (or ocean) on regional or global scales. A Monte Carlo implementation of the fully nonlinear filter has been developed and applied to several low-order models. The method is able to produce assimilations with small ensemble mean errors while also providing random samples of the initial condition probability distribution. The Monte Carlo method can be applied in models that traditionally require the application of initialization techniques without any explicit initialization. Initial application to larger models is promising, but a number of challenges remain before the method can be extended to large realistic forecast models.
Article
This paper considers several filtering methods of stochastic nature, based on Monte Carlo drawing, for the sequential data assimilation in nonlinear models. They include some known methods such as the particle filter and the ensemble Kalman filter and some others introduced by the author: the second-order ensemble Kalman filter and the singular extended interpolated filter. The aim is to study their behavior in the simple nonlinear chaotic Lorenz system, in the hope of getting some insight into more complex models. It is seen that these filters perform satisfactorily, but the new filters introduced have the advantage of being less costly. This is achieved through the concept of second-order-exact drawing and the selective error correction, parallel to the tangent space of the attractor of the system (which is of low dimension). Also introduced is the use of a forgetting factor, which could significantly enhance filter stability in this nonlinear context.
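Of the Monte Carlo filters compared above, the bootstrap particle filter is the simplest to sketch. A minimal scalar example with an assumed random-walk model and Gaussian observation errors (all numbers are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

n_particles, n_steps = 500, 20
q, r = 0.1, 0.5          # model-error and observation-error variances

truth = 0.0
particles = rng.normal(0.0, 1.0, n_particles)
for _ in range(n_steps):
    truth += rng.normal(0.0, np.sqrt(q))       # evolve the "true" state
    y = truth + rng.normal(0.0, np.sqrt(r))    # noisy observation of it
    # Propagate each particle through the stochastic model.
    particles += rng.normal(0.0, np.sqrt(q), n_particles)
    # Weight by the Gaussian observation likelihood, then resample.
    w = np.exp(-0.5 * (y - particles) ** 2 / r)
    w /= w.sum()
    particles = rng.choice(particles, size=n_particles, p=w)

estimate = particles.mean()
```

The resampled particles are random draws from (an approximation of) the posterior, which is exactly the property the abstract contrasts against the cheaper second-order-exact drawing schemes.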
Article
An estimate of the mean effect of ensemble‐averaging on forecast skill, under idealized ‘perfect model’ conditions, is obtained from a set of eight independent 50‐day winter ensemble forecast experiments made with a hemispheric version of the Meteorological Office (UKMO) 5‐level general circulation model. Each ensemble forecast consisted of seven individual integrations. Initial conditions for these were obtained by adding spatially correlated perturbations to a given wintertime analysis, and a further integration created in the same manner was used to represent nature, giving the perfect model approach. The ensemble‐mean forecast shows a clear improvement in amplitude and phase skill compared with individual forecasts, the period of significant predictability for daily fields being increased by 50%. The improvement in skill is consistent with simple theoretical estimates based on the perfect model assumption. These calculations are used to deduce how ensemble‐mean forecast skill should vary with the size of ensemble. The superiority of the ensemble‐mean is maintained when forecasts are spatially smoothed or time‐averaged. The spread of an ensemble distribution can in principle give an a priori indication of forecast skill. A moderate level of correlation between ensemble spread and the forecast skill of the ensemble‐mean is found on the hemispheric scale. The extent to which the potential benefits of ensemble forecasting may be achieved in reality depends on the model's practical forecast skill. Since the practical skill of the 5‐level model is rather low, an ensemble‐mean forecast is on average no better than an individual forecast up to the normal limit of deterministic predictability. However, in four experiments where the individual forecasts show skill beyond this point, the ensemble‐mean forecast does give increased skill. 
Spatial variations in both the practical and perfect model skills of an ensemble‐mean anomaly field are found to be related to corresponding variations in the statistical significance of the anomaly field. For example, the average perfect model skill, in regions where the ensemble‐mean anomaly is significantly different from zero, exceeds the full field skill in all experiments for forecast days 1–15, and in all but two cases for days 16–30.
Article
Bayesian probabilistic arguments are used to derive idealized equations for finding the best analysis for numerical weather prediction. These equations are compared with those from other published methods in the light of the physical characteristics of the NWP analysis problem; namely the predetermined nature of the basis for the analysis, the need for approximation because of large-order systems, the underdeterminacy of the problem when using observations alone, and the availability of prior relationships to resolve the underdeterminacy. Prior relationships result from (1) knowledge of the time evolution of the model (which together with the use of a time distribution of observations constitutes four-dimensional data assimilation); (2) knowledge that the atmosphere varies slowly (leading to balance relationships); (3) other nonlinear relationships coupling parameters and scales in the atmosphere. Methods discussed include variational techniques, smoothing splines, Kriging, optimal interpolation, successive corrections, constrained initialization, the Kalman-Bucy filter, and adjoint model data assimilation. They are all shown to relate to the idealized analysis, and hence to each other. Opinions are given on when particular methods might be more appropriate. By comparison with the idealized method some insight is gained into appropriate choices of parameters in the practical methods.
Article
A new sequential data assimilation method is discussed. It is based on forecasting the error statistics using Monte Carlo methods, a better alternative than solving the traditional and computationally extremely demanding approximate error covariance equation used in the extended Kalman filter. The unbounded error growth found in the extended Kalman filter, which is caused by an overly simplified closure in the error covariance equation, is completely eliminated. Open boundaries can be handled as long as the ocean model is well posed. Well-known numerical instabilities associated with the error covariance equation are avoided because storage and evolution of the error covariance matrix itself are not needed. The results are also better than what is provided by the extended Kalman filter since there is no closure problem and the quality of the forecast error statistics therefore improves. The method should be feasible also for more sophisticated primitive equation models. The computational load for reasonable accuracy is only a fraction of what is required for the extended Kalman filter and is given by the storage of, say, 100 model states for an ensemble size of 100 and thus CPU requirements of the order of the cost of 100 model integrations. The proposed method can therefore be used with realistic nonlinear ocean models on large domains on existing computers, and it is also well suited for parallel computers and clusters of workstations where each processor integrates a few members of the ensemble.
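The analysis step of such an ensemble Kalman filter, in the perturbed-observation form that the present paper examines, can be sketched for a toy problem as follows (state size, ensemble size, and error variances are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

n_ens, n_state = 100, 3
xb = rng.normal(0.0, 1.0, size=(n_ens, n_state))   # background ensemble
H = np.array([[1.0, 0.0, 0.0]])                    # observe first component only
r = 0.5                                            # observation-error variance
y = 1.0                                            # the (single) observation

Pb = np.cov(xb, rowvar=False)                      # sample background covariance
K = Pb @ H.T / (H @ Pb @ H.T + r)                  # Kalman gain (scalar observation)

# Each member assimilates a differently perturbed copy of the observation;
# without these perturbations the analysis spread would be systematically
# too small, which is the issue the present paper addresses.
y_pert = y + rng.normal(0.0, np.sqrt(r), size=n_ens)
xa = xb + (y_pert[:, None] - xb @ H.T) * K.ravel()
```

The ensemble square root filter discussed in the citing passages achieves the same mean update while removing the need for the random perturbations `y_pert`.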
Article
Thesis (Ph. D.)--Princeton University, 1991. Includes bibliographical references.
Article
A theory for estimating the probability distribution of the state of a model given a set of observations exists.
Article
This article focuses on the construction, directly in physical space, of simply parametrized covariance functions for data-assimilation applications. A self-contained, rigorous mathematical summary of relevant topics from correlation theory is provided as a foundation for this construction. Covariance and correlation functions are defined, and common notions of homogeneity and isotropy are clarified. Classical results are stated, and proven where instructive. Included are smoothness properties relevant to multivariate statistical-analysis algorithms where wind/wind and wind/mass correlation models are obtained by differentiating the correlation model of a mass variable. The Convolution Theorem is introduced as the primary tool used to construct classes of covariance and cross-covariance functions on three-dimensional Euclidean space R³. Among these are classes of compactly supported functions that restrict to covariance and cross-covariance functions on the unit sphere S², and that vanish identically on subsets of positive measure on S². It is shown that these covariance and cross-covariance functions on S², referred to as being space-limited, cannot be obtained using truncated spectral expansions. Compactly supported and space-limited covariance functions determine sparse covariance matrices when evaluated on a grid, thereby easing computational burdens in atmospheric data-analysis algorithms.
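The best-known member of this family is the fifth-order piecewise-rational correlation function, commonly cited as Gaspari and Cohn's Eq. (4.10), which is exactly zero beyond twice its length scale and is widely used for covariance localization in ensemble filters. A sketch:

```python
import numpy as np

def gaspari_cohn(r, c):
    """Compactly supported fifth-order piecewise-rational correlation
    function; exactly zero for separations |r| >= 2c."""
    z = np.atleast_1d(np.abs(np.asarray(r, dtype=float)) / c)
    out = np.zeros_like(z)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi = z[inner]
    out[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                  - (5.0 / 3.0) * zi**2 + 1.0)
    zo = z[outer]
    out[outer] = ((1.0 / 12.0) * zo**5 - 0.5 * zo**4 + 0.625 * zo**3
                  + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0
                  - (2.0 / 3.0) / zo)
    return out
```

Evaluated on a grid, this function yields the sparse covariance (localization) matrices the abstract mentions, since all entries for separations beyond 2c are exactly zero rather than merely small.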
Mon. Wea. Rev., 127, 1385–1407.
Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor. Soc., 112, 1177–1194.
Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.
Murphy, J. M., 1988: The impact of ensemble forecasts on predictability. Quart. J. Roy. Meteor. Soc., 114, 89–125.
Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207.
van Leeuwen, P. J., 1999: Comment on "Data assimilation using an ensemble Kalman filter technique." Mon. Wea. Rev., 127, 1374–1377.
Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.
Maybeck, P. S., 1979: Square root filtering. Stochastic Models, Estimation and Control, Vol. 1, Academic Press, 411 pp.
Houtekamer, P. L., and H. L. Mitchell, 1999: Reply. Mon. Wea. Rev., 127, 1378–1379.