Article

True global error maps for SMAP, SMOS, and ASCAT soil moisture data based on machine learning and triple collocation analysis


Abstract

Quantifying the accuracy of satellite-based soil moisture (SM) data is important for a number of key applications, such as combining satellite-based SM products for long-term SM analyses, assimilating SM data into land surface models, and providing quality flags to mask poor-quality SM data. A range of statistical methods have been proposed to estimate error statistics for large-scale SM datasets, including the instrumental variable (IV) method, triple collocation analysis (TCA), and quadruple collocation analysis (QCDA). While requiring only two input products, the IV method imposes the additional assumption that one input product possesses serially uncorrelated errors, thus limiting its scope compared to TCA. Likewise, QCDA requires four independent SM data products that are difficult to obtain and may not always be available for analysis. Nonetheless, TCA-based methods still cannot provide truly global error maps for satellite SM products due to the limited number of independent SM products and difficulties with baseline TCA assumptions. Moreover, temporal sampling requirements for TCA are often impractical because of low SM retrieval skill in forested and arid areas, as well as in regions prone to radio frequency interference. Here, we seek to fill significant spatial gaps in TCA results using machine learning (ML) and therefore provide spatially complete error maps for the satellite-based SM data products derived from the Soil Moisture Active Passive (SMAP), Soil Moisture and Ocean Salinity (SMOS), and Advanced Scatterometer (ASCAT) systems. Furthermore, we use SHapley Additive exPlanations (SHAP) values, a model-agnostic technique for interpreting ML models, to examine the impact of various environmental conditions on the quality of satellite-based SM retrievals.
Globally, and across all three products, 72.0% of the error information missing from a TCA-based analysis, due to either the lack of valid data or the inability of TCA to provide reliable results, can be reconstructed from the ensemble prediction mean of the ML models. Overall, we found that 22.7% (a.m.) and 34.2% (p.m.) of the Earth's SM dynamics (between 60° S and 60° N) have not been investigated properly across all three satellite missions.


... The basic information about these datasets, including their spatial and temporal resolutions, is summarized in Table 2. The MODIS LAI product is employed to evaluate the impact of vegetation conditions on SM products, as previous studies have shown that vegetation plays a crucial role in the performance of remotely sensed SM data [22,26]. The surface soil temperature provided by ERA5-land is utilized to examine the influence of MSST on the performance of the four SM retrievals [46]. ...
... Remarkably, the coverage and the amount of data available for the four SM products are significantly limited after pre-processing. This limitation is primarily due to factors such as the proportion of water bodies, frozen soil conditions, RFI, and errors from algorithmic spurious inversions [8,26,67], which are generally identifiable through the quality markers of each product. In particular, coarse-resolution SM products in Jiangsu Province are especially affected by the extensive water system distribution, highlighting the need for the development of SM products that incorporate water body corrections in the inversion algorithms, or the creation of higher-resolution products to meet the application requirements of various disciplines [4,6,68]. ...
... Nevertheless, we also applied the Triple Collocation Analysis (TCA) method, which is based on mathematical statistics, does not require ground truth data, and is unaffected by spatial representativeness errors. By constructing traditional (modeled, active, and passive) triple collocations [26,71], we found that the ranking of R values obtained from TCA is nearly consistent with the site-based validation results presented in this study (Figure S9), demonstrating the robustness of our findings. Furthermore, the SM inversion is affected not only by dynamic environmental factors such as vegetation density, soil temperature, and surface soil wetness but also by static conditions such as soil properties, land cover, and climatic zones, and these factors need to be analyzed jointly [59]. ...
Article
Full-text available
Accurate surface soil moisture (SM) data are crucial for agricultural management in Jiangsu Province, one of the major agricultural regions in China. However, the seasonal performance of different SM products in Jiangsu is still unknown. To address this, this study aims to evaluate the applicability of four L-band microwave remotely sensed SM products, namely, the Soil Moisture Active Passive Single-Channel Algorithm at Vertical Polarization Level 3 (SMAP SCA-V L3, hereafter SMAP-L3), SMOS-SMAP-INRAE-BORDEAUX (SMOSMAP-IB), Soil Moisture and Ocean Salinity in version IC (SMOS-IC), and SMAP-INRAE-BORDEAUX (SMAP-IB) in Jiangsu at the seasonal scale. In addition, the effects of dynamic environmental variables such as the leaf area index (LAI), mean surface soil temperature (MSST), and mean surface soil wetness (MSSM) on the performance of the above products are investigated. The results indicate that all four SM products exhibit significant seasonal differences when evaluated against in situ observations between 2016 and 2022, with most products achieving their highest correlation (R) and unbiased root-mean-square difference (ubRMSD) scores during the autumn. Conversely, their performance significantly deteriorates in the summer, with ubRMSD values exceeding 0.06 m³/m³. SMOS-IC generally achieves better R values across all seasons but has limited temporal availability, while SMAP-IB typically has the lowest ubRMSD values, even reaching 0.03 m³/m³ during morning observation in the winter. Additionally, the sensitivity of different products’ skill metrics to environmental factors varies across seasons. For ubRMSD, SMAP-L3 shows a general increase with LAI across all four seasons, while SMAP-IB exhibits a notable increase as the soil becomes wetter in the summer. Conversely, wet conditions notably reduce the R values during autumn for most products.
These findings are expected to offer valuable insights for the appropriate selection of products and the enhancement of SM retrieval algorithms.
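The two skill metrics named in this abstract, R and ubRMSD, can be illustrated with a minimal sketch. It uses the standard textbook definitions (Pearson correlation, and RMSD after removing each series' own mean); the function names and the NaN-masking convention are assumptions for illustration, not the study's code.

```python
import numpy as np


def _paired(product, reference):
    """Return both series restricted to time steps where neither is NaN."""
    p = np.asarray(product, dtype=float)
    r = np.asarray(reference, dtype=float)
    valid = ~np.isnan(p) & ~np.isnan(r)
    return p[valid], r[valid]


def ubrmsd(product, reference):
    """Unbiased RMSD: RMSD computed after removing each series' mean."""
    p, r = _paired(product, reference)
    anomalies = (p - p.mean()) - (r - r.mean())
    return float(np.sqrt(np.mean(anomalies ** 2)))


def pearson_r(product, reference):
    """Temporal Pearson correlation between a product and the reference."""
    p, r = _paired(product, reference)
    return float(np.corrcoef(p, r)[0, 1])
```

Because the means are removed first, a product with a constant offset from the in situ series has a ubRMSD of zero. This is why ubRMSD is usually reported alongside, rather than instead of, the bias.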
... TCA is a technique that allows for the assessment and quantification of error characteristics in three independent data sources without relying on reference data pre-assumed to be "true" (McColl et al., 2014). This method has been widely adopted in the uncertainty evaluation of remote sensing products across various fields, including soil moisture (Kim et al., 2023), sea surface salinity (Hoareau et al., 2018), and sea surface temperature (Saleh and Al-Anzi, 2021). For error statistics based on TCA, we selected the fractional mean-squared error (fMSE) and the squared correlation coefficient. ...
... The foundational assumptions of TCA are important for its application (Kim et al., 2023). These assumptions are as follows: (1) a linear relationship exists between each dataset and the true signal, (2) the errors among the datasets are orthogonal, and (3) there is no correlation among the errors of different datasets. ...
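Under these assumptions, the error statistics mentioned above (fMSE and the squared correlation coefficient) can be obtained from the 3x3 covariance matrix of the three datasets. The following is a minimal sketch of the covariance-based formulation commonly attributed to McColl et al. (2014); the function name and return layout are assumptions for illustration, and the inputs are assumed to be collocated, gap-free series.

```python
import numpy as np


def triple_collocation(x, y, z):
    """Covariance-based TCA for three collocated series.

    Returns (error_variance, rho_squared, fmse), each an array of
    length 3 ordered as (x, y, z).
    """
    q = np.cov(np.vstack([x, y, z]).astype(float))  # 3x3 covariance matrix
    # Variance of the true signal as scaled into each dataset's own units.
    signal = np.array([
        q[0, 1] * q[0, 2] / q[1, 2],
        q[0, 1] * q[1, 2] / q[0, 2],
        q[0, 2] * q[1, 2] / q[0, 1],
    ])
    total = np.diag(q)
    error_variance = total - signal  # random error variance per dataset
    rho_squared = signal / total     # squared correlation with the truth
    fmse = 1.0 - rho_squared         # fractional mean-squared error
    return error_variance, rho_squared, fmse
```

With finite samples the estimates can fall outside their physical range (for example, a negative error variance), which is one way in which TCA can fail to provide reliable results for a given grid cell.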
Article
Full-text available
Long time series of spatiotemporally continuous phytoplankton functional type (PFT) data are essential for understanding marine ecosystems and global biogeochemical cycles as well as for effective marine management. In this study, we integrated artificial intelligence (AI) technology with multisource marine big data to develop a spatial–temporal–ecological ensemble model based on deep learning (STEE-DL). This model generated the first AI-driven global daily gap-free 4 km PFT chlorophyll a concentration product from 1998 to 2023 (AIGD-PFT). The AIGD-PFT significantly enhances the accuracy and spatiotemporal coverage of quantifying eight major PFTs: diatoms, dinoflagellates, haptophytes, pelagophytes, cryptophytes, green algae, prokaryotes, and Prochlorococcus. The model input encompasses (1) physical oceanographic, biogeochemical, and spatiotemporal information and (2) ocean colour data (OC-CCI v6.0) that have been gap-filled using a discrete cosine transform–penalized least squares (DCT-PLS) approach. The STEE-DL model utilizes an ensemble strategy with 100 residual neural network (ResNet) models, applying Monte Carlo and bootstrapping methods to estimate the optimal PFT chlorophyll a concentration and assess the model uncertainty through ensemble means and standard deviations. The model's performance was validated using multiple cross-validation strategies – random, spatial-block, and temporal-block methods – combined with in situ data, demonstrating STEE-DL's robustness and generalization capability. The daily updates and seamless nature of the AIGD-PFT data product capture the complex dynamics of coastal regions effectively. Finally, through a comparative analysis using a triple-collocation analysis (TCA) approach, the competitive advantages of the AIGD-PFT data product over existing products were validated. The complete product dataset (1998–2023) can be freely downloaded from 10.11888/RemoteSen.tpdc.301164 (Zhang and Shen, 2024a).
... TCA is a technique that allows for the assessment and quantification of error characteristics in three independent data sources without relying on reference data pre-assumed to be "true" (McColl et al., 2014). This method has been widely adopted in the uncertainty evaluation of remote sensing products across various fields, including soil moisture (Kim et al., 2023), sea surface salinity (Hoareau et al., 2018), and sea surface temperature (Saleh and Al-Anzi, 2021). ...
... The foundational assumptions of TCA are important for its application (Kim et al., 2023): (1) conducting TCA assessments for diatoms, prokaryotes, and Haptophytes. Before evaluation, the AIGD-PFT products were merged monthly and resampled to 1° resolution along with EOF-PFT and NOBM-monthly. ...
Preprint
Full-text available
Long time series of spatiotemporally continuous phytoplankton functional type (PFT) products are essential for understanding marine ecosystems, global biogeochemical cycles, and effective marine management. In this study, by integrating artificial intelligence (AI) technology with multi-source marine big data, we have developed a Spatial–Temporal–Ecological Ensemble model based on Deep Learning (STEE-DL), and then generated the first AI-driven Global Daily gap-free 4 km PFTs product from 1998 to 2023 (AIGD-PFT), significantly enhancing the accuracy and spatiotemporal coverage of quantifying eight major PFTs (i.e., Diatoms, Dinoflagellates, Haptophytes, Pelagophytes, Cryptophytes, Green Algae, Prokaryotes, and Prochlorococcus). The input data encompass physical oceanographic, biogeochemical, spatiotemporal information, and ocean color data (OC-CCI v6.0) that have been gap-filled using a Discrete Cosine Transform with a Penalized Least Square (DCT-PLS) approach. The STEE-DL model utilizes an ensemble strategy with 100 ResNet models, applying Monte Carlo and bootstrapping methods to estimate optimal PFT values and assess model uncertainty through ensemble means and standard deviations. The model's performance was validated using multiple cross-validation strategies—random, spatial-block, and temporal-block—combined with in-situ data, demonstrating STEE-DL's robustness and generalization capability. The daily updates and seamless nature of the AIGD-PFT product capture the complex dynamics of coastal regions effectively. Finally, through a comparative analysis using a triple-collocation (TC) approach, the competitive advantages of the AIGD-PFT product over existing products were validated. 
The AIGD-PFT product not only provides the foundation for detailed analyses of PFT trends, interannual variability, and the impacts of climate change on phytoplankton composition across various temporal and spatial scales, but also has the potential to facilitate precise quantification of marine carbon flux and enhances the accuracy of biogeochemical models. A video demonstration is available at https://doi.org/10.5446/67366 (Zhang and Shen, 2024a). The complete product dataset (1998–2023) can be freely downloaded at https://doi.org/10.11888/RemoteSen.tpdc.301164 (Zhang and Shen, 2024b).
... These techniques aim to capture the random errors of three datasets from different measurement systems without requiring a fixed truth reference dataset, under the assumption that these three datasets are independent and linearly related to the true product. This method has been widely employed in the hydrological sector for error characterization, particularly for SM products (Gruber et al., 2016; Chen et al., 2018; Kim et al., 2023a; Kim et al., 2023b). Despite its extensive use in characterizing errors in different microwave satellite SM datasets, only a few previous studies have initially investigated its application in evaluating CYGNSS-derived SM using different sets of triple SM products. ...
Article
Soil moisture (SM) is a key variable in hydrometeorology and climate systems. With the growing interest in capturing fine-scale SM variability for effective hydroclimate applications, spaceborne L-band bistatic radar systems using Global Navigation Satellite System-Reflectometry (GNSS-R) technology hold great potential to meet the demand for high spatiotemporal resolution SM data. Although primarily designed for tropical cyclone monitoring, the first GNSS-R satellite constellation, the Cyclone Global Navigation Satellite System (CYGNSS) mission, has demonstrated the benefits of reliably monitoring diurnal SM dynamics through its initial seven-year data record, thanks to its high revisit frequency at sub-daily intervals. Nevertheless, SM retrieval from CYGNSS, particularly as linked with its distinctive features, remains poorly understood, while numerous existing uncertainties and open issues can restrict its effective SM retrieval and practical applications in the next operating stages. Unlike other review papers, this work aims to bridge this knowledge gap in CYGNSS SM retrieval by highlighting noteworthy design properties based on analyses of its real-world data, while providing a synthesis of recent advances in eliminating external uncertainty factors and improving SM inversion methods. Despite its potential, CYGNSS SM retrieval faces both general and particular challenges arising from common issues in retrieval algorithms for conventional GNSS-R satellites and unique data limitations tied to its technical design. Scientific debates over the contributions of coherent and incoherent components in total CYGNSS signals and accurate partitioning of these two parts are defined as the key algorithm-related challenges to resolve, along with correcting attenuation effects of vegetation and surface roughness.
The data-related challenges involve variations in CYGNSS's spatial footprint, temporal frequency, and signal penetration depth across different land surface conditions, inadequate consideration of CYGNSS incidence angle change, excessive dependence on a reference SM dataset for inversion model calibration/training or validation, and computational demands for processing rapid multi-sampling CYGNSS data retrieval. Future research pathways highlight leveraging cutting-edge machine learning/deep learning algorithms to enhance CYGNSS SM data quantity and quality and better interpret its complex interactions with other hydroclimate variables. Assimilating CYGNSS SM data streams into physical models to improve the prediction of related variables and climate extremes also presents a promising prospect.
... Please refer to Text S2 for further details on the TCA method applied in this study. This study utilizes ASCAT, SMOS, and AMSR2 SM data, as detailed in Kim et al. (2023). The ASCAT data include the TU Wien algorithm applied to Metop-A, -B, and -C satellites, specifically the ASCAT SM Climate Data Record (CDR) version 7 at 12.5 km resolution (H119 and extended H120 datasets). ...
Article
Full-text available
This study investigates the potential benefit of assimilating soil moisture (SM) data retrieved from Soil Moisture Active Passive (SMAP) in improving global SM estimates and enhancing the weather forecast skill of the Korean Integrated Model (KIM). The 36‐km SMAP L2 SM retrievals are assimilated into the Noah land surface model (LSM) using the ensemble Kalman filter scheme through the National Aeronautics and Space Administration Land Information System (LIS) that is weakly coupled to KIM. A suite of cycling experiments of the KIM–LIS system that includes land and atmospheric data assimilation (DA) and five‐day weather forecasts are conducted over the global domain from March to July 2022. In the SMAP SM DA, two different SM bias correction methods, namely the cumulative distribution function matching and anomaly‐based bias correction, are applied to correct systematic biases between SMAP and Noah‐LSM before assimilation. The global triple collocation analysis reveals that compared to the control case (without SM DA), significant improvements in the global SM estimates are achieved by assimilating the SMAP data, especially by employing the anomaly‐based bias correction. Notably, the improved SM initial conditions lead to an improved screen‐level specific humidity and air temperature analysis and forecasts when the results are compared against the European Centre for Medium Range Weather Forecasts‐Integrated Forecasting System analysis. The beneficial impacts of the SMAP DA on the atmospheric variables extend up to an atmospheric level of 700 hPa. Prominent improvements in the KIM forecast skill by the SMAP DA applying the anomaly‐based bias correction are observed in the northern part of Africa and West and Central Asia with stronger impacts for longer forecast lead time. This paper demonstrates the feasibility of assimilating the SMAP data within KIM–LIS to enhance the KIM weather forecasts in the lower atmosphere when a proper SM bias correction method is applied.
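Of the two bias correction methods named in this abstract, cumulative distribution function (CDF) matching is the easier to illustrate. The sketch below is a generic empirical quantile mapping, not the LIS implementation used in the study; the function and argument names are hypothetical.

```python
import numpy as np


def cdf_match(obs, obs_climatology, model_climatology):
    """Rescale observations so their distribution matches the model's.

    Each observation is converted to its empirical non-exceedance
    probability within the observation climatology, then replaced by the
    model value found at the same probability.
    """
    obs_sorted = np.sort(np.asarray(obs_climatology, dtype=float))
    model_sorted = np.sort(np.asarray(model_climatology, dtype=float))
    obs_probs = np.linspace(0.0, 1.0, obs_sorted.size)
    model_probs = np.linspace(0.0, 1.0, model_sorted.size)
    # Probability of each observation under the observation climatology.
    p = np.interp(np.asarray(obs, dtype=float), obs_sorted, obs_probs)
    # Model value at the same probability.
    return np.interp(p, model_probs, model_sorted)
```

This mapping removes differences in mean, variance, and higher moments between the satellite and model climatologies while preserving the rank order of the observations. Values outside the climatological range are clamped to the model's extremes by `np.interp`.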
... Scheer et al. (2023) confirmed the linkage to ground ice content, although only a limited number of samples were available, and Chen et al. (2023) suggested the retrieval of equivalent water depth from InSAR subsidence. [Table 1. Summary of remote sensing techniques for near-surface soil moisture estimation (modified after Engman (1991); Moran et al. (2004); Wang and Qu (2009)); surviving cell text: (Wrona et al., 2017), (Kim et al., 2023), Active] ...
Preprint
Full-text available
The identification of spatial soil moisture patterns is of high importance for various applications in high latitude permafrost regions, but challenging with common remote sensing approaches due to high landscape heterogeneity. Seasonal thawing and freezing of near-surface soil lead to subsidence-heave cycles in the presence of ground ice, which can exhibit magnitudes of several centimeters. Our investigations document higher Sentinel-1 InSAR seasonal subsidence rates for locations with higher near-surface soil moisture compared to dryer ones. Based on this, we demonstrate that the relationship of thawing degree days – a measure of seasonal heating – and subsidence signals can be interpreted to assess spatial variations of near-surface soil moisture. A range of challenges, however, need to be addressed. We discuss the implications of using different sources of temperature data for deriving thawing degree days on the results. Atmospheric effects must be considered, as simple spatial filtering can suppress large-scale permafrost-related subsidence signals and lead to the underestimation of displacement values, making GACOS-corrected results preferable for the tested sites. Seasonal subsidence rate retrieval which considers these aspects provides a valuable tool for distinguishing between wet and dry landscape features, which is relevant for permafrost degradation monitoring in Arctic lowland permafrost regions. Spatial resolution constraints, however, remain for smaller typical permafrost features which drive wet versus dry conditions such as high and low centred polygons.
... However, it is challenging to define the heterogeneity as it is made up of different aspects, such as the level of diversity of surface conditions or the spatial distribution of these conditions. While many methods exist to assess the level of diversity (such as the Gini-Simpson index used in Wang et al. (2024) or Kim et al. (2023)), how to account for the spatial distribution is still not sufficiently understood. Consequently, the ...
[Table residue: ubRMSD model terms, listed as term, coefficient, value (parenthesized figure as given in the source)]
Intercept, β0: 7.42e-2 (5.24e-4)
Vegetation, β1: 6.7e-3 (6.20e-4)
Clay, β2: 4.94e-3 (6.49e-4)
Sand, β3: −6.18e-4 (6.46e-4)
Soil density, β4: −2.86e-3 (6.45e-4)
SOC, β5: 5.89e-3 (7.15e-4)
Open water, β6: 2.23e-3 (5.57e-4) ...
... In these cases, SLHF decreases with increasing AMP-LST suggesting that SLHF may be conditioned by water availability. Although satellite soil moisture products dominate the spatial variability in most scenarios, they are prone to significant errors in densely forested areas (Kim et al 2020, 2023) and therefore we have excluded these data from our analysis. Instead, we have considered satellite LAI (figure S2) to further understand the spatial distribution of SLHF model/products co-variability with AMP-LST (figure S3). ...
Article
Full-text available
The Amazon basin plays a crucial role in the global hydrological cycle and the climate system. Removal of latent heat from the surface covered by the tropical forest through evapotranspiration is a key process that still requires further research due to the complex nature of the involved processes, lack of observations and different model assumptions. Here we present an assessment of the consistency between different latent heat flux datasets through an indirect comparison against the daily amplitude of surface temperature and vegetation status estimated from satellite observations. Our study is based on the hypothesis that the observational satellite data can be used to provide hints on how realistically fluxes are represented in different datasets. Results show that the datasets diverge inside the basin in both space and time, but it is possible to identify areas under water-limited conditions, especially around the borders of the basin and some regions over eastern/southeastern Amazonia. Despite these differences, a clear link between daily amplitude of surface temperature, leaf area index and latent heat flux can be observed over particular areas and seasons, where correlations reach values close to −0.98 (0.94) for surface temperature (leaf area index), indicating that satellite observations are suitable for assessing the representation of the partitioning of energy fluxes in models and widely used datasets.
... We used SHapley Additive exPlanations (SHAP) values (Lundberg and Lee, 2017; Lundberg et al., 2019), a model-agnostic technique for interpreting ML models, to explore functional correlations between the variables and forest age (Besnard et al., 2021). SHAP derives the Shapley additive contribution values from coalitional game theory (Kim et al., 2023). By examining the contribution of each input variable to the model's output, SHAP can identify the primary drivers of the model's predictions and provide insights into the underlying causes that influence forest age (Sun et al., 2023). ...
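The coalitional game theory idea behind SHAP can be sketched with an exact Shapley value computation for a single prediction. This brute-force version enumerates every coalition of features, so it is only practical for a handful of inputs; it is an illustration of the definition, not the approximation algorithms used by SHAP tooling or by the cited studies. Replacing absent features with a single baseline value is an assumption made here for simplicity.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley contributions of each feature of x to model(x).

    `model` is any callable taking a list of feature values. Features
    outside a coalition are set to their `baseline` value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Probability weight of a coalition of this size preceding i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi
```

The contributions are additive: they sum to the difference between the prediction for `x` and the prediction for the baseline, which is the property that makes per-variable attributions comparable across samples.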
Article
Full-text available
A high-resolution, spatially explicit forest age map is essential for quantifying forest carbon stocks and carbon sequestration potential. Prior attempts to estimate forest age on a national scale in China have been limited by sparse resolution and incomplete coverage of forest ecosystems, attributed to complex species composition, extensive forest areas, insufficient field measurements, and inadequate methods. To address these challenges, we developed a framework that combines machine learning algorithms (MLAs) and remote sensing time series analysis for estimating the age of China's forests. Initially, we identify and develop the optimal MLAs for forest age estimation across various vegetation divisions based on forest height, climate, terrain, soil, and forest-age field measurements, utilizing these MLAs to ascertain forest age information. Subsequently, we apply the LandTrendr time series analysis to detect forest disturbances from 1985 to 2020, with the time since the last disturbance serving as a proxy for forest age. Ultimately, the forest age data derived from LandTrendr are integrated with the result of MLAs to produce the 2020 forest age map of China. Validation against independent field plots yielded an R² ranging from 0.51 to 0.63. On a national scale, the average forest age is 56.1 years (standard deviation of 32.7 years). The Qinghai–Tibet Plateau alpine vegetation zone possesses the oldest forest with an average of 138.0 years, whereas the forest in the warm temperate deciduous-broadleaf forest vegetation zone averages only 28.5 years. This 30 m-resolution forest age map offers crucial insights for comprehensively understanding the ecological benefits of China's forests and sustainably managing China's forest resources. The map is available at 10.5281/zenodo.8354262 (Cheng et al., 2023a).
Article
Full-text available
In recent years, black-box machine learning approaches have become a dominant modeling paradigm for knowledge extraction in remote sensing. Despite the potential benefits of uncovering the inner workings of these models with explainable AI, a comprehensive overview summarizing the used explainable AI methods and their objectives, findings, and challenges in remote sensing applications is still missing. In this paper, we address these issues by performing a systematic review to identify the key trends of how explainable AI is used in remote sensing and shed light on novel explainable AI approaches and emerging directions that tackle specific remote sensing challenges. We also reveal the common patterns of explanation interpretation, discuss the extracted scientific insights in remote sensing, and reflect on the approaches used for explainable AI methods evaluation. As such, our review provides a complete summary of the state-of-the-art of explainable AI in remote sensing. Further, we give a detailed outlook on the challenges and promising research directions, representing a basis for novel methodological development and a useful starting point for new researchers in the field.
Article
The fusion of active and passive microwave measurements is expected to provide more robust surface soil moisture (SSM) mapping across various environmental conditions compared to the use of a single sensor. Thus, the integration of the newest L-band passive (i.e., Soil Moisture Active Passive, SMAP) and the active (i.e., the Advanced Scatterometer, ASCAT) observations provides an opportunity for SSM mapping with improved accuracy. However, this integration remains largely underexplored. In this context, the integration of SMAP brightness temperature (TB) and ASCAT backscattering coefficients for global-scale SSM estimation was investigated, by fully considering the potential error sources in conventional radiative transfer models (RTMs) as well as other SSM linked factors. Based on ground measurements from globally distributed dense networks with mitigated mismatch issues and spatial/temporal independent evaluation strategies, this study: (i) comprehensively evaluated four classical machine learning approaches, including Random Forest (RF), Long-Short Term Memory (LSTM), Support Vector Machine (SVM), and Cascaded Neural Network (CNN), and chose the best performing RF method to implement the final integration of SSM; (ii) compared the integration retrievals to those made using data from a single sensor (SMAP or ASCAT) with the same machine learning framework, as well as to the SMAP passive, ASCAT active, and ESA CCI active-passive combined SSM products. The results show the integration retrievals achieve satisfactory performance by obtaining an averaged unbiased root mean squared error (ubRMSE) of 0.042 m³/m³ and a temporal correlation of 0.756, which are superior to machine learning based SSM estimated from a single active or passive sensor, and also outperform the SMAP, ASCAT, and ESA CCI products.
Moreover, the temporal resolution is evidently improved compared to the SMAP and ASCAT SSM products, with a temporal ratio exceeding 60% for most areas across the globe. Therefore, blending active and passive measurements affords a more reliable SSM mapping with increased sampling at the global scale, and could contribute to improved hydro-ecological applications.
Article
Full-text available
Accurate wildfire severity mapping (WSM) is crucial in environmental damage assessment and recovery strategies. Machine learning (ML) and remote sensing technologies are extensively integrated and employed as powerful tools for WSM. However, the intricate nature of ML algorithms often leads to 'black box' systems, obscuring the decision-making process and significantly limiting stakeholders' ability to comprehend the basis of predictions. This opacity hinders efforts to enhance performance and risks exacerbating overfitting. The present study proposes an innovative WSM approach that incorporates qualitative and quantitative feature selection techniques within the Explainable AI (XAI) framework. The methodology aims to enhance the precision of WSM and provide insights into the factors contributing to model decisions, thereby increasing the interpretability of predictions and streamlining models to improve performance. To achieve this objective, we employed the SHapley Additive exPlanations (SHAP)-Forward Stepwise Selection (FSS) method to demonstrate its efficacy in elucidating the qualitative and quantitative impacts of predictors on the performance, accuracy, and interpretability of ML algorithms designed for WSM. Utilizing post-fire imagery from Sentinel-2 (S2), we analyzed ten bands to generate 225 unique spectral indices utilizing five different calculations: normalized, algebraic sum, difference, ratio, and product forms. Combined with the original S2 bands, this resulted in 235 potential predictors for ML classifications. A random forest model was subsequently developed using these predictors and optimized through extensive hyperparameter tuning, achieving an overall accuracy (OA) of 0.917 and a Kappa statistic of 0.896. The most influential predictors were identified using SHAP values, with an FSS process narrowing them down to the 12 most critical for effective WSM, as evidenced by stabilized OA and Kappa values (0.904 and 0.881, respectively).
Further validation using a ninefold spatial cross-validation technique demonstrated the method's consistent performance across different data partitions, with OA values ranging from 0.705 to 0.894 and Kappa values from 0.607 to 0.867. By providing a more accurate and comprehensible XAI-based method for WSM, this research contributes to the broader field of environmental monitoring and disaster response, underscoring the potential of integrated qualitative and quantitative analysis to enhance ML models' capabilities.
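The predictor counts quoted in this abstract (225 indices, 235 predictors) are consistent with applying each of the five calculation forms to every unordered pair of the ten bands. The abstract does not state that this is how the indices were built, so the pairing scheme in the sketch below is an assumption, and the band labels are hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical labels for the ten Sentinel-2 bands used in the study.
bands = [f"band_{k}" for k in range(1, 11)]

# The five calculation forms named in the abstract, applied to a band pair.
forms = {
    "normalized": lambda a, b: (a - b) / (a + b),
    "sum": lambda a, b: a + b,
    "difference": lambda a, b: a - b,
    "ratio": lambda a, b: a / b,
    "product": lambda a, b: a * b,
}

# Assumption: one index per (unordered band pair, form) combination.
index_names = [
    f"{form}({first},{second})"
    for first, second in combinations(bands, 2)
    for form in forms
]

n_indices = len(index_names)            # 45 pairs x 5 forms
n_predictors = n_indices + len(bands)   # indices plus the original bands
```

Under this reading, the 45 band pairs times five forms give the 225 indices, and adding the ten original bands gives the 235 candidate predictors that the SHAP-FSS procedure then reduces to 12.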
Preprint
Full-text available
A spatially explicit, high-resolution forest age map is critical for quantifying forest carbon stock and carbon sequestration potential. Previous endeavours to estimate forest age in China at national scale were mainly limited to sparse resolution or incomplete coverage of forest ecosystems because of complex species composition, vast forest areas, insufficient field measurements, and the lack of effective methods. To overcome these limitations, we construct a framework for estimating China’s forest age by combining remote-sensing time series analysis with machine learning algorithms (MLAs) based on massive field measurements and remote-sensing datasets. Specifically, the LandTrendr time series analysis is first applied to detect forest disturbances from 1985 to 2020, with the time since the last disturbance serving as a proxy for forest age. Next, for pixels where no disturbance is detected, machine learning algorithms are used to estimate forest age from independent variables, including forest height, climate, terrain, soil, and forest-age field measurements. Finally, MLA models are established for each vegetation division and used to estimate forest ages. Combining these two methods produces a spatially explicit 30-m-resolution forest-age map for China in the year 2020. Validation against independent field plots produces an R² from 0.51 to 0.63. Nationally, the average forest age is 56.1 years (standard deviation = 32.7 years), where the Qinghai-Tibet Plateau alpine vegetation zone has the oldest forest with an average of 138.0 years, whereas the forest in the warm temperate deciduous-broadleaf forest vegetation zone averages only 28.5 years. This 30-m-resolution forest-age map provides vital information for accurately understanding the ecological benefits of China’s forests and sustainably managing China’s forest resources.
Article
Despite long‐standing efforts, hydrologists still lack robust tools for calibrating land surface model (LSM) streamflow estimates within ungauged basins. Using surface soil moisture estimates from the Soil Moisture Active Passive Level 4 Soil Moisture (L4_SM) product, precipitation observations, and streamflow gauge measurements for 617 medium‐scale (200–10,000 km²) basins in the contiguous United States, we measure the temporal (Spearman) rank correlation between antecedent (i.e., pre‐storm) surface soil moisture (ASM) and the storm‐scale runoff coefficient (RC; the fraction of storm‐scale precipitation accumulation converted into streamflow). In humid and semi‐humid basins, this rank correlation is shown to be sufficiently strong to allow for the substitution of storm‐scale RC observations (available only in basins that are both lightly regulated and gauged) with high‐quality ASM values (available quasi‐globally from L4_SM) in streamflow calibration procedures. Using this principle, we define a new, basin‐wise LSM streamflow calibration approach based on L4_SM alone and successfully apply it to identify LSM configurations that produce a high rank correlation with observed RC. However, since the approach cannot detect RC bias, it is less successful in identifying LSM configurations with low mean‐absolute error.
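The rank correlation described above can be sketched in a few lines of standard-library Python. This is only an illustration: the function names and the six storm values are hypothetical and are not taken from the study, and ties are handled with average ranks, which is one common convention.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical storm events for one basin (illustrative values only):
# antecedent surface soil moisture (m3/m3) and storm-scale runoff coefficient.
asm = [0.12, 0.18, 0.25, 0.31, 0.22, 0.35]
rc = [0.02, 0.05, 0.20, 0.11, 0.06, 0.40]
print(round(spearman(asm, rc), 3))
```

Because only ranks enter the calculation, any bias or monotonic rescaling of the runoff coefficient leaves the result unchanged, which is consistent with the abstract's remark that the approach cannot detect RC bias.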
Article
Backscatter measured by scatterometers and Synthetic Aperture Radars is sensitive to the dielectric properties of the soil and normally increases with increasing soil moisture content. However, when the soil is dry, the radar waves penetrate deeper into the soil, potentially sensing subsurface scatterers such as near-surface rocks and stones. In this paper we propose an exponential model to describe the impact of such subsurface scatterers on C-Band backscatter measurements acquired by the Advanced Scatterometer (ASCAT) on board the METOP satellites. The model predicts an increase of the subsurface scattering contributions with decreasing soil wetness that may counteract the signal from the soil surface. This may cause anomalous backscatter signals that deteriorate soil moisture retrievals from ASCAT. We test whether this new model is able to explain ASCAT observations better than a bare soil backscatter model without a subsurface scattering term, using k-fold cross validation and the Bayesian Information Criterion for model selection. We find that arid landscapes with Leptosols and Arenosols represent ideal environmental conditions for the occurrence of subsurface scattering. Nonetheless, subsurface scattering may also become important in more humid environments during dry spells. We conclude that subsurface scattering is a widespread phenomenon that (i) needs to be accounted for in active microwave soil moisture retrievals and (ii) has potential for soil mapping, particularly in arid and semi-arid environments.
Article
We show that, contrary to popular assumptions, predictions from machine learning potentials built upon high-dimensional atom-density representations almost exclusively occur in regions of the representation space which lie outside the convex hull defined by the training set points. We then propose a perspective to rationalize the domain of robust extrapolation and accurate prediction of atomistic machine learning potentials in terms of the probability density induced by training points in the representation space.
Article
Monitoring the thermal state of permafrost (TSP) is important in many environmental science and engineering applications. However, such data are generally unavailable, mainly due to the lack of ground observations and the uncertainty of traditional physical models. This study produces novel permafrost datasets for the Northern Hemisphere (NH), including predictions of the mean annual ground temperature (MAGT) at the depth of zero annual amplitude (DZAA) (approximately 3 to 25 m) and active layer thickness (ALT) with 1 km resolution for the period of 2000–2016, as well as estimates of the probability of permafrost occurrence and permafrost zonation based on hydrothermal conditions. These datasets integrate unprecedentedly large amounts of field data (1002 boreholes for MAGT and 452 sites for ALT) and multisource geospatial data, especially remote sensing data, using statistical learning modeling with an ensemble strategy. Thus, the resulting data are more accurate than those of previous circumpolar maps (bias = 0.02 ± 0.16 °C and RMSE = 1.32 ± 0.13 °C for MAGT; bias = 2.71 ± 16.46 cm and RMSE = 86.93 ± 19.61 cm for ALT). The datasets suggest that the areal extent of permafrost (MAGT ≤ 0 °C) in the NH, excluding glaciers and lakes, is approximately 14.77 (13.60–18.97) × 10⁶ km² and that the areal extent of permafrost regions (permafrost probability > 0) is approximately 19.82 × 10⁶ km². The areal fractions of humid, semiarid/subhumid, and arid permafrost regions are 51.56 %, 45.07 %, and 3.37 %, respectively. The areal fractions of cold (≤ −3.0 °C), cool (−3.0 °C to −1.5 °C), and warm (> −1.5 °C) permafrost regions are 37.80 %, 14.30 %, and 47.90 %, respectively. These new datasets based on the most comprehensive field data to date contribute to an updated understanding of the thermal state and zonation of permafrost in the NH. The datasets are potentially useful for various fields, such as climatology, hydrology, ecology, agriculture, public health, and engineering planning.
All of the datasets are published through the National Tibetan Plateau Data Center (TPDC), and the link is 10.11888/Geocry.tpdc.271190 (Ran et al., 2021a).
Article
Soil moisture performs a key function in the hydrologic process and understanding the global-scale water cycle. However, estimations of soil moisture taken from current sun-synchronous orbit (SSO) satellites are limited in that they are neither spatially nor temporally continuous. This limitation creates discontinuous soil moisture observations from space and hampers our understanding of the fundamental processes that control the surface hydrologic cycle across both time and space domains. Here, we propose to use frequent soil moisture observations from NASA's constellation of eight micro-satellites called the Cyclone Global Navigation Satellite System (CYGNSS) together with the Soil Moisture Active Passive (SMAP) to assimilate subdaily-scale soil moisture into a land surface model (LSM). Our results, which are based on triple collocation analysis (TCA), show how current advances in satellite systems can fill the subdaily-scale gaps left by past soil moisture observations and eventually add value to global-scale soil moisture estimates in LSMs. Overall, TCA-based fractional mean square errors (fMSE) of LSM soil moisture are improved by 61% with the synergetic assimilation of CYGNSS data with SMAP soil moisture observations. However, assimilating satellite-based soil moisture over dense vegetation areas can degrade the performance of LSMs as these areas propagate erroneous soil moisture information to LSMs. To our knowledge, this study is the first global assimilation of GNSS-based soil moisture observations in land surface models.
Article
Passive microwave remote sensing observations at L-band provide key and global information on surface soil moisture and vegetation water content, which are related to the Earth water and carbon cycles. Only two space-borne L-band sensors are currently operating: SMOS, launched at the end of 2009 and thus now providing a 10-year global data set, and SMAP, launched at the beginning of 2015. This study provides a state-of-the-art scientific overview of the SMOS-IC retrieval data set based on the SMOS L-band observations. This SMOS product aims at improved performance and independence of auxiliary data, key features for robust applications. The SMOS-IC product includes both a soil moisture (SM) and an L-band vegetation optical depth (L-VOD) data set, which currently form the basis of several studies evaluating the impact of climate and anthropogenic activities on aboveground carbon stocks. Since the release of the first version, the algorithm has been significantly changed in support of key applications, but no document is available to report these changes. This paper fills this gap by analyzing key science questions related to the product development, reviewing application results and presenting an extensive description of the latest version of the product (version 2) considering changes in comparison to the previous version (V105). For the future it is planned to merge the SMOS and SMAP L-VOD data sets to ensure L-VOD data continuity in the event of failure of one of the space-borne SMOS or SMAP sensors.
Preprint
The vegetation optical depth (VOD), a vegetation index retrieved from passive or active microwave remote sensing systems, is related to the intensity of microwave extinction effects within the vegetation canopy layer. This index is only marginally impacted by effects from atmosphere, clouds and sun illumination, and is thus increasingly used for ecological applications at large scales. Newly released VOD products show different abilities in monitoring vegetation features, depending on the algorithm used and the satellite frequency. VOD is increasingly sensitive to the upper vegetation layer as the frequency increases (from L- through C- to X-band), offering different capacities to monitor seasonal changes of the leafy and/or woody vegetation components, vegetation water status and aboveground biomass. This study evaluated nine recently developed/reprocessed VOD products from the AMSR2, SMOS and SMAP space-borne instruments for monitoring structural vegetation features related to phenology, height and aboveground biomass. For monitoring the seasonality of green vegetation (herbaceous and woody foliage), we found that X-VOD products, particularly from the LPDR-retrieval algorithm, outperformed the other VOD products in regions that are not densely vegetated, where they showed higher temporal correlation values with optical vegetation indices (VIs). However, LPDR X-VOD time series failed to detect changes in VOD after rainfall events whereas most other VOD products could do so, and overall daily variations are less pronounced in LPDR X-VOD. Results show that the reprocessed VODCA C- and X-VOD have almost comparable performance and VODCA C-VOD correlates better with VIs than other C-VOD products. Low-frequency L-VOD, particularly the new version (V2) of SMOS-IC, shows a higher temporal correlation with VIs, similar to C-VOD, in medium-densely vegetated biomes such as savannas (R ~ 0.70) than in other short vegetation types.
Because the L-VOD indices are more sensitive to the non-green vegetation components (trunks and branches) than higher frequency products, they are well correlated with aboveground biomass (R ~ 0.91 across space between predicted and observed values for both SMOS-IC V2 and SMAP MT-DCA). However, when compared with forest canopy height, results at L-band are not systematically better than C- and X-VOD products. This revealed specific VOD retrieval issues for some ecosystems, e.g., boreal regions. It is expected that these findings can contribute to algorithm refinements, product enhancements and further development of the use of VOD for monitoring above-ground vegetation biomass, vegetation dynamics and phenology.
Article
With the increasing utilization of satellite-based soil moisture products, a primary challenge is knowing their accuracy and robustness. This study presents a comprehensive assessment over China of three widely used global satellite soil moisture products, i.e., Soil Moisture Active Passive (SMAP), European Space Agency (ESA) Climate Change Initiative (CCI) Soil Moisture, and Soil Moisture and Ocean Salinity (SMOS). In situ soil moisture from 1682 stations and the Variable Infiltration Capacity (VIC) model are used to evaluate the performance of SMAP_L3, ESA_CCI_SM_COMBINED, and SMOS_CATDS_L3 from 31 March 2015 to 3 June 2018. The Triple Collocation (TC) approach is used to minimize the uncertainty (e.g., scale issue) during the validation process. The TC analysis is conducted using three triplets, i.e., [SMAP-Insitu-VIC], [CCI-Insitu-VIC], [SMOS-Insitu-VIC]. In general, SMAP is the most reliable product, reflecting the main spatiotemporal characteristics of soil moisture, while SMOS has the lowest accuracy. The results demonstrate that the overall root mean square error of SMAP, CCI, and SMOS is 0.040, 0.028, and 0.107 m³ m⁻³, respectively. The overall temporal correlation coefficient of SMAP, CCI, and SMOS is 0.68, 0.65, and 0.38, respectively. The overall fractional root mean square error of SMAP, CCI, and SMOS is 0.707, 0.750, and 0.897, respectively. In irrigated areas, the accuracy of CCI is reduced due to the land surface model (which does not consider irrigation) used for the rescaling of the CCI_COMBINED soil moisture product during the merging process, while SMAP and SMOS preserve the irrigation signal. The quality of SMOS is most strongly impacted by land surface temperature, vegetation, and soil texture, while the quality of CCI is the least affected by these factors. With the increase of Radio Frequency Interference, the accuracy of SMOS decreases dramatically, followed by SMAP and CCI.
Higher representativeness error of in situ stations is noted in regions with higher topographic complexity. This study helps to provide a guideline for the application of satellite soil moisture products in scientific research and gives some references (e.g., modify data algorithm according to the main error sources) for improving the data quality.
Article
Machine learning methods have been remarkably successful for a wide range of application areas in the extraction of essential information from data. An exciting and relatively recent development is the uptake of machine learning in the natural sciences, where the major goal is to obtain novel scientific insights and discoveries from observational or simulated data. A prerequisite for obtaining a scientific outcome is domain knowledge, which is needed to gain explainability, but also to enhance scientific consistency. In this article we review explainable machine learning in view of applications in the natural sciences and discuss three core elements which we identified as relevant in this context: transparency, interpretability, and explainability. With respect to these core elements, we provide a survey of recent scientific works that incorporate machine learning and the way that explainable machine learning is used in combination with domain knowledge from the application areas.
Article
Long-term gridded precipitation products are crucial for several applications in hydrology, agriculture and climate sciences. Currently available precipitation products suffer from space and time inconsistency due to the non-uniform density of ground networks and the difficulties in merging multiple satellite sensors. The recent “bottom-up” approach that exploits satellite soil moisture observations for estimating rainfall through the SM2RAIN (Soil Moisture to Rain) algorithm is suited to build a consistent rainfall data record as a single polar orbiting satellite sensor is used. Here we exploit the Advanced SCATterometer (ASCAT) on board three Meteorological Operational (MetOp) satellites, launched in 2006, 2012, and 2018, as part of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Polar System programme. The continuity of the scatterometer sensor is ensured until the mid-2040s through the MetOp Second Generation Programme. Therefore, by applying the SM2RAIN algorithm to ASCAT soil moisture observations, a long-term rainfall data record will be obtained, starting in 2007 and lasting until the mid-2040s. The paper describes the recent improvements in data pre-processing, SM2RAIN algorithm formulation, and data post-processing for obtaining the SM2RAIN–ASCAT quasi-global (only over land) daily rainfall data record at a 12.5 km spatial sampling from 2007 to 2018. The quality of the SM2RAIN–ASCAT data record is assessed on a regional scale through comparison with high-quality ground networks in Europe, the United States, India, and Australia. 
Moreover, an assessment on a global scale is provided by using the triple-collocation (TC) technique allowing us also to compare these data with the latest, fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA5), the Early Run version of the Integrated Multi-Satellite Retrievals for Global Precipitation Measurement (IMERG), and the gauge-based Global Precipitation Climatology Centre (GPCC) products. Results show that the SM2RAIN–ASCAT rainfall data record performs relatively well at both a regional and global scale, mainly in terms of root mean square error (RMSE) when compared to other products. Specifically, the SM2RAIN–ASCAT data record provides performance better than IMERG and GPCC in data-scarce regions of the world, such as Africa and South America. In these areas, we expect larger benefits in using SM2RAIN–ASCAT for hydrological and agricultural applications. The limitations of the SM2RAIN–ASCAT data record consist of the underestimation of peak rainfall events and the presence of spurious rainfall events due to high-frequency soil moisture fluctuations that might be corrected in the future with more advanced bias correction techniques. The SM2RAIN–ASCAT data record is freely available at 10.5281/zenodo.3405563 (Brocca et al., 2019) (recently extended to the end of August 2019).
Article
The physical parameterization of key processes in land surface models (LSMs) remains uncertain, and new techniques are required to evaluate LSM accuracy over large spatial scales. Given the role of soil moisture in the partitioning of surface water fluxes (between infiltration, runoff, and evapotranspiration), surface soil moisture (SSM) estimates represent an important observational benchmark for such evaluations. Here, we apply SSM estimates from the NASA Soil Moisture Active Passive Level‐4 product (SMAP_L4) to diagnose bias in the correlation between SSM and surface runoff for multiple Noah‐Multiple Physics (Noah‐MP) LSM parameterization cases. Results demonstrate that Noah‐MP surface runoff parameterizations often underestimate the correlation between prestorm SSM and the event‐scale runoff coefficient (RC; defined as the ratio between event‐scale streamflow and precipitation volumes). This bias can be quantified against an observational benchmark calculated using streamflow observations and SMAP_L4 SSM and applied to explain a substantial fraction of the observed basin‐to‐basin (and case‐to‐case) variability in the skill of event‐scale RC estimates from Noah‐MP. Most notably, a low bias in LSM‐predicted SSM/RC correlation squanders RC information contained in prestorm SSM and reduces LSM RC estimation skill. Based on this concept, a novel case selection strategy for ungauged basins is introduced and demonstrated to successfully identify poorly performing Noah‐MP parameterization cases.
Article
Permafrost is a key element of the cryosphere and an essential climate variable in the Global Climate Observing System. There is no remote-sensing method available to reliably monitor the permafrost thermal state. To estimate permafrost distribution at a hemispheric scale, we employ an equilibrium state model for the temperature at the top of the permafrost (TTOP model) for the 2000–2016 period, driven by remotely-sensed land surface temperatures, down-scaled ERA-Interim climate reanalysis data, tundra wetness classes and landcover map from the ESA Landcover Climate Change Initiative (CCI) project. Subgrid variability of ground temperatures due to snow and landcover variability is represented in the model using subpixel statistics. The results are validated against borehole measurements and reviewed regionally. The accuracy of the modelled mean annual ground temperature (MAGT) at the top of the permafrost is ± 2 °C when compared to permafrost borehole data. The modelled permafrost area (MAGT < 0 °C) covers 13.9 × 10⁶ km² (ca. 15% of the exposed land area), which is within the range or slightly below the average of previous estimates. The sum of all pixels having isolated patches, sporadic, discontinuous or continuous permafrost (permafrost probability > 0) is around 21 × 10⁶ km² (22% of exposed land area), which is approximately 2 × 10⁶ km² less than estimated previously. Detailed comparisons at a regional scale show that the model performs well in sparsely vegetated tundra regions and mountains, but is less accurate in densely vegetated boreal spruce and larch forests.
Article
The effective applications of land surface models (LSMs) and hydrologic models pose a varied set of data input and processing needs, ranging from ensuring consistency checks to more derived data processing and analytics. This article describes the development of the Land surface Data Toolkit (LDT), which is an integrated framework designed specifically for processing input data to execute LSMs and hydrological models. LDT not only serves as a preprocessor to the NASA Land Information System (LIS), which is an integrated framework designed for multi-model LSM simulations and data assimilation (DA) integrations, but also as a land-surface-based observation and DA input processor. It offers a variety of user options and inputs to processing datasets for use within LIS and stand-alone models. The LDT design facilitates the use of common data formats and conventions. LDT is also capable of processing LSM initial conditions and meteorological boundary conditions and ensuring data quality for inputs to LSMs and DA routines. The machine learning layer in LDT facilitates the use of modern data science algorithms for developing data-driven predictive models. Through the use of an object-oriented framework design, LDT provides extensible features for the continued development of support for different types of observational datasets and data analytics algorithms to aid land surface modeling and data assimilation.
Article
Knowledge of the temporal error structure for remotely sensed surface soil moisture retrievals can improve our ability to exploit them for hydrologic and climate studies. This study employs a triple collocation analysis to investigate both the total variance and temporal auto-correlation of errors in Soil Moisture Active and Passive (SMAP) products generated from two separate soil moisture retrieval algorithms, the vertically-polarized brightness temperature based Single Channel Algorithm (SCA-V, the current baseline SMAP algorithm) and the Dual Channel Algorithm (DCA). A key assumption made in SCA-V is that real-time vegetation opacity can be accurately captured using only a climatology for vegetation opacity. Results demonstrate that, while SCA-V generally outperforms DCA, SCA-V can produce larger total errors when this assumption is significantly violated by inter-annual variability in vegetation health and biomass. Furthermore, larger auto-correlated errors in SCA-V retrievals are found in areas with relatively large vegetation opacity deviations from climatological expectations. This implies that a significant portion of the auto-correlated error in SCA-V is attributable to the violation of its vegetation opacity climatology assumption and suggests that utilizing a real (as opposed to climatological) vegetation opacity time series in the SCA-V algorithm would reduce the magnitude of auto-correlated soil moisture retrieval errors.
Article
Over land the vegetation canopy affects the microwave brightness temperature by emission, scattering and attenuation of surface soil emission. Attenuation, as represented by vegetation optical depth (VOD), is a potentially useful ecological indicator. The NASA Soil Moisture Active Passive (SMAP) mission carries significant potential for VOD estimates because of its radio frequency interference mitigation efforts and because the L-band signal penetrates deeper into the vegetation canopy than the higher frequency bands used for many previous VOD retrievals. In this study, we apply the multi-temporal dual-channel retrieval algorithm (MT-DCA) to derive global VOD, soil moisture, and effective scattering albedo estimates from SMAP Backus-Gilbert enhanced brightness temperatures posted on a 9 km grid and with three day revisit time. SMAP VOD values from the MT-DCA follow expected global distributions and are shown to be highly correlated with canopy height. They are also broadly similar in magnitude (though not always in seasonal amplitude) to European Space Agency Soil Moisture and Ocean Salinity (SMOS) VOD. The SMOS VOD values are based on angular brightness temperature information while the SMAP measurements are at a constant incidence angle, requiring an alternate approach to VOD retrieval presented in this study.
Conference Paper
Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
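The idea of assigning each feature an additive importance value can be illustrated with an exact Shapley-value calculation on a toy model. The sketch below is not the `shap` library and not the estimators proposed in the paper: it enumerates every feature coalition directly, which is only feasible for a handful of features, and it assumes (as a simplification) that an "absent" feature is replaced by a fixed baseline value. The model, instance, and baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all coalitions of features.

    value_fn takes a frozenset of "present" feature indices and returns
    the model output for that coalition.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def model(x):
    """Toy model with one additive term and one interaction term."""
    return 2.0 * x[0] + x[1] * x[2]


instance = (1.0, 2.0, 3.0)   # hypothetical input to explain
baseline = (0.0, 0.0, 0.0)   # assumed stand-in for "feature absent"


def coalition_value(present):
    """Model output with absent features set to the baseline."""
    x = [instance[j] if j in present else baseline[j] for j in range(3)]
    return model(x)


phi = shapley_values(coalition_value, 3)
print(phi)
# Additivity: attributions sum to f(instance) - f(baseline).
print(sum(phi), model(instance) - model(baseline))
```

In this toy case the additive term is attributed entirely to the first feature, while the interaction term is split evenly between the two features that form it.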
Article
The Advanced Microwave Scanning Radiometer 2 (AMSR2), a follow-up microwave sensor to the AMSR for Earth Observing System (AMSR-E), was launched on the Global Change Observation Mission 1 – Water (GCOM-W1) satellite in May 2012. It is as yet unclear if instrumental improvements in AMSR2 over AMSR-E have led to better soil moisture (SM) estimates, especially since there is no overlapping period of data between the sensors. This study focuses on comparing the results of AMSR2 and AMSR-E SM over Australia, distinguishing four Köppen climate zones to determine if AMSR2 is better than AMSR-E. This is achieved by selecting two year-long comparative time periods from the operating periods of AMSR-E and AMSR2, based on their statistical similarities in modeled SM as a proxy, using Modern Era Retrospective-analysis for Research and Applications-Land (MERRA-L). The AMSR2 and AMSR-E C- and X-band SM derived from the Land Parameter Retrieval Model (LPRM) was evaluated. Both AMSR2 C- and X-band SM products were found to show similar temporal patterns and spatial agreement with AMSR-E C- and X-band SM, supported by unbiased root mean square difference (ubRMSD) and R-values with MERRA-L SM, respectively. Using lag-based instrumental variable analysis to estimate the random error component of SM retrievals, the noise-to-signal ratios in AMSR2 X-band SM were found to be slightly higher than their AMSR-E counterparts. The improvements in AMSR2, such as the superior radiometric sensitivity and spatial resolution, have therefore not led to statistically significant differences in performance for LPRM retrievals at 1/2° × 1/2° grid resolution, when compared with AMSR-E. However, similarities in the metrics for AMSR2 and AMSR-E SM suggest that AMSR2 provides a valuable continuation to AMSR-E.
Article
The conditional merging (CM) spatial interpolation technique was applied to obtain the composite soil moisture products using the AMSR2 and in situ soil moisture for 51 days from summer through late fall of 2012 on the Korean Peninsula. The ‘leave one out cross-validation’ analysis was conducted to assess the performance of the composite soil moisture products in estimating the soil moisture at ungaged locations. The control variable for comparison was the soil moisture products obtained by spatially interpolating the in situ soil moisture data measured at eight gage locations using the Ordinary Kriging (KR) technique. The results show that the composite soil moisture products are more accurate than the in situ only soil moisture products in estimating the soil moisture for the following cases: (1) when the spatial correlation of in situ soil moisture data is low, which occurs when there is little rainfall and where the altitude is high (mountainous areas), and (2) where the gage density is low or the area is located further away from the in situ gages. For both cases, the KR method cannot use enough information due to the low spatial correlation of the in situ measurement for interpolation, while the CM method can take advantage of the satellite soil moisture measurement not affected by the spatial correlation of the in situ data.
Article
To date, triple collocation (TC) analysis is one of the most important methods for the global-scale evaluation of remotely sensed soil moisture data sets. In this study we review existing implementations of soil moisture TC analysis as well as investigations of the assumptions underlying the method. Different notations that are used to formulate the TC problem are shown to be mathematically identical. While many studies have investigated issues related to possible violations of the underlying assumptions, only a few TC modifications have been proposed to mitigate the impact of these violations. Moreover, assumptions that are often understood as a limitation unique to TC analysis are shown to be common also to other conventional performance metrics. Noteworthy advances in TC analysis have been made in the way error estimates are being presented by moving from the investigation of absolute error variance estimates to the investigation of signal-to-noise ratio (SNR) metrics. Here we review existing error presentations and propose the combined investigation of the SNR (expressed in logarithmic units), the unscaled error variances, and the soil moisture sensitivities of the data sets as an optimal strategy for the evaluation of remotely-sensed soil moisture data sets.
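As a small worked illustration of expressing the SNR in logarithmic units, the sketch below converts a signal variance and an error variance into decibels and maps the result to the squared correlation with the unknown truth. The function names are hypothetical, and the mapping assumes the usual linear error model with errors uncorrelated with the signal.

```python
import math


def snr_db(signal_var, error_var):
    """Signal-to-noise ratio in decibels: 10 * log10(signal / noise)."""
    return 10.0 * math.log10(signal_var / error_var)


def rho_squared_from_snr_db(db):
    """Squared correlation with the truth implied by an SNR in dB.

    Assumes errors are uncorrelated with the signal, so that
    rho^2 = SNR / (1 + SNR) with SNR in linear units.
    """
    snr = 10.0 ** (db / 10.0)
    return snr / (1.0 + snr)


# Equal signal and error variance gives 0 dB and rho^2 = 0.5;
# each doubling of the signal variance adds about 3 dB.
for signal_var in (0.5, 1.0, 2.0, 4.0):
    db = snr_db(signal_var, 1.0)
    print(f"{db:6.2f} dB  rho^2 = {rho_squared_from_snr_db(db):.3f}")
```

One practical appeal of the decibel scale is symmetry: a product whose error variance is twice its signal variance sits as far below 0 dB as a product with half the error variance sits above it.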
Article
Triple collocation analysis (TCA) enables estimation of error variances for three or more products that retrieve or estimate the same geophysical variable using mutually independent methods. Several statistical assumptions regarding the statistical nature of errors (e.g., mutual independence and orthogonality with respect to the truth) are required for TCA estimates to be unbiased. Even though soil moisture studies commonly acknowledge that these assumptions are required for an unbiased TCA, no study has specifically investigated the degree to which errors in existing soil moisture datasets conform to these assumptions. Here these assumptions are evaluated both analytically and numerically over four extensively instrumented watershed sites using soil moisture products derived from active microwave remote sensing, passive microwave remote sensing, and a land surface model. Results demonstrate that nonorthogonal and error cross-covariance terms represent a significant fraction of the total variance of these products. However, the overall impact of error cross correlation on TCA is found to be significantly larger than the impact of nonorthogonal errors. Because of the impact of cross-correlated errors, TCA error estimates generally underestimate the true random error of soil moisture products.
Article
Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion as a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects. Our results shed light on interesting differences between human vision and current DNNs, and raise questions about the generality of DNN computer vision.
Article
Calibration and validation of geophysical measurement systems typically requires knowledge of the “true” value of the target variable. However, the data considered to represent the “true” values often include their own measurement errors, biasing calibration and validation results. Triple collocation (TC) can be used to estimate the root-mean-square-error (RMSE), using observations from three mutually-independent, error-prone measurement systems. Here, we introduce Extended Triple Collocation (ETC): using exactly the same assumptions as TC, we derive an additional performance metric, the correlation coefficient of the measurement system with respect to the unknown target, $\rho_{t,X_i}$. We demonstrate that $\rho_{t,X_i}^2$ is the scaled, unbiased signal-to-noise ratio, and provides a complementary perspective compared to the RMSE. We apply it to three collocated wind datasets. Since ETC is as easy to implement as TC, requires no additional assumptions, and provides an extra performance metric, it may be of interest in a wide range of geophysical disciplines.
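As a rough sketch of the extra metric described above, the squared correlation of each product with the unknown target can be written with the same covariances that TC uses, for example ρ² = (σ_xy σ_xz) / (σ_xx σ_yz) for product x. The function names and the synthetic data below are assumptions made for illustration and are not taken from the cited paper's code.

```python
import random


def cov(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def etc_rho_squared(x, y, z):
    """Squared correlation of each product with the unknown truth (ETC).

    Relies on the same assumptions as TC: mutually uncorrelated errors
    that are also orthogonal to the truth.
    """
    cxy, cxz, cyz = cov(x, y), cov(x, z), cov(y, z)
    return (
        cxy * cxz / (cov(x, x) * cyz),
        cxy * cyz / (cov(y, y) * cxz),
        cxz * cyz / (cov(z, z) * cxy),
    )


def synthetic_triplet(n=20000, seed=1):
    """Hypothetical data: three scaled, noisy views of a unit-variance truth."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [1.0 * t + rng.gauss(0.0, 0.1) for t in truth]  # true rho^2 = 1/1.01
    y = [0.8 * t + rng.gauss(0.0, 0.2) for t in truth]  # true rho^2 = 0.64/0.68
    z = [1.2 * t + rng.gauss(0.0, 0.3) for t in truth]  # true rho^2 = 1.44/1.53
    return x, y, z


print(etc_rho_squared(*synthetic_triplet()))
```

Because ρ² is dimensionless, it can be compared across products that report the variable in different units or dynamic ranges, which the RMSE alone does not allow.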
Conference Paper
The Cyclone Global Navigation Satellite System (CYGNSS) is a spaceborne mission concept focused on tropical cyclone (TC) inner core process studies. CYGNSS attempts to resolve the principal deficiencies with current TC intensity forecasts, which lie in inadequate observations and modeling of the inner core. CYGNSS consists of 8 GPS bistatic radar receivers deployed on separate nanosatellites. The primary science driver is rapid sampling of ocean surface winds in the inner core of tropical cyclones.
Article
This is the second part of a study on continental-scale water and energy flux analysis and validation conducted in phase 2 of the North American Land Data Assimilation System project (NLDAS-2). The first part concentrates on a model-by-model comparison of mean annual and monthly water fluxes, energy fluxes and state variables. In this second part, the focus is on the validation of simulated streamflow from four land surface models (Noah, Mosaic, Sacramento Soil Moisture Accounting (SAC-SMA), and Variable Infiltration Capacity (VIC) models) and their ensemble mean. Comparisons are made against 28 years (1 October 1979 to 30 September 2007) of United States Geological Survey observed streamflow for 961 small basins and 8 major basins over the conterminous United States (CONUS). Relative bias, anomaly correlation and Nash-Sutcliffe Efficiency (NSE) statistics at daily to annual time scales are used to assess model-simulated streamflow. The Noah (the Mosaic) model overestimates (underestimates) mean annual runoff and underestimates (overestimates) mean annual evapotranspiration. The SAC-SMA and VIC models simulate the mean annual runoff and evapotranspiration well when compared with the observations. The ensemble mean is closer to the mean annual observed streamflow for both the 961 small basins and the 8 major basins than is the mean from any individual model. All of the models, as well as the ensemble mean, have large daily, weekly, monthly, and annual streamflow anomaly correlations for most basins over the CONUS, implying strong simulation skill. However, the daily, weekly, and monthly NSE analysis results are not necessarily encouraging, in particular for daily streamflow. The Noah and Mosaic models are useful (NSE > 0.4) only for about 10% of the 961 small basins, the SAC-SMA and VIC models are useful for about 30% of the 961 small basins, and the ensemble mean is useful for about 42% of the 961 small basins.
As the time scale increases, the NSE increases as expected. However, even for monthly streamflow, the ensemble mean is useful for only 75% of the 961 small basins.
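For readers unfamiliar with the NSE statistic used above, a minimal sketch of its standard definition follows: one minus the ratio of the squared simulation error to the variance of the observations about their mean. The function names and the sample streamflow values are hypothetical. Only the 0.4 "useful" threshold comes from the abstract.

```python
def nash_sutcliffe(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(sim) != len(obs):
        raise ValueError("sim and obs must have the same length")
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for s, o in zip(sim, obs))
    obs_spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_spread


def is_useful(sim, obs, threshold=0.4):
    """Apply the NSE > 0.4 criterion quoted in the abstract."""
    return nash_sutcliffe(sim, obs) > threshold


# Hypothetical daily streamflow values (arbitrary units), for illustration only.
observed = [1.0, 3.0, 2.0, 5.0, 4.0]
simulated = [1.5, 2.5, 2.5, 4.5, 4.0]
print(nash_sutcliffe(simulated, observed), is_useful(simulated, observed))
```

An NSE below zero means the simulation is a worse predictor than the observed mean. This is why a model can show a high anomaly correlation and still score a low NSE when its bias or amplitude is off.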
Article
The European Space Agency's Soil Moisture and Ocean Salinity (SMOS) mission is perturbed by radio frequency interferences (RFIs) that jeopardize part of its scientific retrieval in certain areas of the world, particularly over continental areas in Europe, Southern Asia, and the Middle East. Areas affected by RFI might experience data loss or underestimation of soil moisture and ocean salinity retrieval values. To alleviate this situation, the SMOS team has put strategies in place that, one year after launch, have already improved the RFI situation in Europe where half of the sources have been successfully localized and switched off.
Article
The land surface albedo in the NCAR Community Climate System Model (CCSM2) is calculated based on a two-stream approximation, which does not include the effect of three-dimensional vegetation structure on radiative transfer. The model albedo (including monthly averaged albedo, direct albedo at local noon, and the solar zenith angle dependence of albedo) is evaluated using the Moderate Resolution Imaging Spectroradiometer (MODIS) Bidirectional Reflectance Distribution Function (BRDF) and albedo data acquired during July 2001-July 2002. The model monthly averaged albedos in February and July are close to the MODIS white-sky albedos (within 0.02 or statistically insignificant) over about 40% of the global land between 60°S and 70°N. However, CCSM2 significantly underestimates albedo by 0.05 or more over deserts (e.g., the Sahara Desert) and some semiarid regions (e.g., parts of Australia). The difference between the model direct albedo at local noon and the MODIS black-sky albedo for the near-infrared (NIR) band (with wavelength > 0.7 μm) is larger than the difference for the visible band (with wavelength < 0.7 μm) for most snow-free regions. For eleven model grid cells with different dominant plant functional types, the model diffuse NIR albedo is higher by 0.05 or more than the MODIS white-sky albedo in five of these cells. Direct albedos from the model and MODIS (as computed using the BRDF parameters) increase with solar zenith angles, but model albedo increases faster than the MODIS data. These analyses and the MODIS BRDF and albedo data provide a starting point toward developing a BRDF-based treatment of radiative transfer through a canopy for land surface models that can realistically simulate the mean albedo and the solar zenith angle dependence of albedo.
Article
This first paper of the two-part series describes the objectives of the community efforts in improving the Noah land surface model (LSM), documents, through mathematical formulations, the augmented conceptual realism in biophysical and hydrological processes, and introduces a framework for multiple options to parameterize selected processes (Noah-MP). The Noah-MP's performance is evaluated at various local sites using high temporal frequency data sets, and results show the advantages of using multiple optional schemes to interpret the differences in modeling simulations. The second paper focuses on ensemble evaluations with long-term regional (basin) and global scale data sets. The enhanced conceptual realism includes (1) the vegetation canopy energy balance, (2) the layered snowpack, (3) frozen soil and infiltration, (4) soil moisture-groundwater interaction and related runoff production, and (5) vegetation phenology. Sample local-scale validations are conducted over the First International Satellite Land Surface Climatology Project (ISLSCP) Field Experiment (FIFE) site, the W3 catchment of Sleepers River, Vermont, and a French snow observation site. Noah-MP shows apparent improvements in reproducing surface fluxes, skin temperature over dry periods, snow water equivalent (SWE), snow depth, and runoff over Noah LSM version 3.0. Noah-MP improves the SWE simulations due to more accurate simulations of the diurnal variations of the snow skin temperature, which is critical for computing available energy for melting. Noah-MP also improves the simulation of runoff peaks and timing by introducing a more permeable frozen soil and more accurate simulation of snowmelt. We also demonstrate that Noah-MP is an effective research tool by which modeling results for a given process can be interpreted through multiple optional parameterization schemes in the same model framework.
Article
Many physical, chemical and biological processes taking place at the land surface are strongly influenced by the amount of water stored within the upper soil layers. Therefore, many scientific disciplines require soil moisture observations for developing, evaluating and improving their models. One of these disciplines is meteorology where soil moisture is important due to its control on the exchange of heat and water between the soil and the lower atmosphere. Soil moisture observations may thus help to improve the forecasts of air temperature, air humidity and precipitation. However, until recently, soil moisture observations had only been available over a limited number of regional soil moisture networks. This has hampered scientific progress as regards the characterisation of land surface processes not just in meteorology but many other scientific disciplines as well. Fortunately, in recent years, satellite soil moisture data have increasingly become available. One of the freely available global soil moisture data sets is derived from the backscatter measurements acquired by the Advanced Scatterometer (ASCAT) that is a C-band active microwave remote sensing instrument flown on board of the Meteorological Operational (METOP) satellite series. ASCAT was designed to observe wind speed and direction over the oceans and was initially not foreseen for monitoring soil moisture over land. Yet, as argued in this review paper, the characteristics of the ASCAT instrument, most importantly its wavelength (5.7 cm), its high radiometric accuracy, and its multiple-viewing capabilities make it an attractive sensor for measuring soil moisture. Moreover, given the operational status of ASCAT, and its promising long-term prospects, many geoscientific applications might benefit from using ASCAT soil moisture data. Nonetheless, the ASCAT soil moisture product is relatively complex, requiring a good understanding of its properties before it can be successfully used in applications. 
To provide a comprehensive overview of the major characteristics and caveats of the ASCAT soil moisture product, this paper describes the ASCAT instrument and the soil moisture processor and near-real-time distribution service implemented by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). A review of the most recent validation studies shows that the quality of ASCAT soil moisture product is - with the exception of arid environments - comparable to, and over some regions (e.g. Europe) even better than currently available soil moisture data derived from passive microwave sensors. Further, a review of applications studies shows that the use of the ASCAT soil moisture product is particularly advanced in the fields of numerical weather prediction and hydrologic modelling. But also in other application areas such as yield monitoring, epidemiologic modelling, or societal risks assessment some first progress can be noted. Considering the generally positive evaluation results, it is expected that the ASCAT soil moisture product will increasingly be used by a growing number of rather diverse land applications.
Article
A new version of a digital global map of irrigation areas was developed by combining irrigation statistics for 10825 sub-national statistical units and geo-spatial information on the location and extent of irrigation schemes. The map shows the percentage of each 5 arc minute by 5 arc minute cell that was equipped for irrigation around the year 2000. It is thus an important data set for global studies related to water and land use. This paper describes the data set and the mapping methodology and gives, for the first time, an estimate of the map quality at the scale of countries, world regions and the globe. Two indicators of map quality were developed for this purpose, and the map was compared to irrigated areas as derived from two remote sensing based global land cover inventories. We plan to further improve that data set; therefore comments, information and data that might contribute to that effort are highly welcome.
Article
The contrast between the point-scale nature of current ground-based soil moisture instrumentation and the ground resolution (typically >10² km²) of satellites used to retrieve soil moisture poses a significant challenge for the validation of data products from current and upcoming soil moisture satellite missions. Given typical levels of observed spatial variability in soil moisture fields, this mismatch confounds mission validation goals by introducing significant sampling uncertainty in footprint-scale soil moisture estimates obtained from sparse ground-based observations. During validation activities based on comparisons between ground observations and satellite retrievals, this sampling error can be misattributed to retrieval uncertainty and spuriously degrade the perceived accuracy of satellite soil moisture products. This review paper describes the magnitude of the soil moisture upscaling problem and measurement density requirements for ground-based soil moisture networks. Since many large-scale networks do not meet these requirements, it also summarizes a number of existing soil moisture upscaling strategies which may reduce the detrimental impact of spatial sampling errors on the reliability of satellite soil moisture validation using spatially sparse ground-based observations.
Article
Ground-based multifrequency (L-band to W-band, 1.41-90 GHz) and multiangular (20°-50°) bipolarized (V and H) microwave radiometer observations, acquired over a dense wheat field, are analyzed in order to assess the sensitivity of brightness temperatures (Tb) to land surface properties: surface soil moisture (mv) and vegetation water content (VWC). For each frequency, a combination of microwave Tb observed at either two contrasting incidence angles or two polarizations is used to retrieve mv and VWC, through regressed empirical logarithmic equations. The retrieval performance of the regression is used as an indicator of the sensitivity of the microwave signal to either mv or VWC. In general, L-band measurements are shown to be sensitive to both mv and VWC, with lowest root mean square errors (0.04 m³·m⁻³ and 0.52 kg·m⁻², respectively) obtained at H polarization, 20° and 50° incidence angles. In spite of the dense vegetation, it is shown that mv influences the microwave observations from L-band to K-band (23.8 GHz). The highest sensitivity to soil moisture is observed at L-band in all configurations, while observations at higher frequencies, from C-band (5.05 GHz) to K-band, are only moderately influenced by mv at low incidence angles (e.g., 20°). These frequencies are also shown to be very sensitive to VWC in all the configurations tested. The highest frequencies (Q- and W-bands) are shown to be moderately sensitive to VWC only. These results are used to analyze the response of W-band emissivities derived from the Advanced Microwave Sounding Unit instruments over northern France.
Article
The Soil Moisture Active Passive (SMAP) mission is one of the first Earth observation satellites being developed by NASA in response to the National Research Council's Decadal Survey. SMAP will make global measurements of the soil moisture present at the Earth's land surface and will distinguish frozen from thawed land surfaces. Direct observations of soil moisture and freeze/thaw state from space will allow significantly improved estimates of water, energy, and carbon transfers between the land and the atmosphere. The accuracy of numerical models of the atmosphere used in weather prediction and climate projections is critically dependent on the correct characterization of these transfers. Soil moisture measurements are also directly applicable to flood assessment and drought monitoring. SMAP observations can help monitor these natural hazards, resulting in potentially great economic and social benefits. SMAP observations of soil moisture and freeze/thaw timing will also reduce a major uncertainty in quantifying the global carbon balance by helping to resolve an apparent missing carbon sink on land over the boreal latitudes. The SMAP mission concept will utilize L-band radar and radiometer instruments sharing a rotating 6-m mesh reflector antenna to provide high-resolution and high-accuracy global maps of soil moisture and freeze/thaw state every two to three days. In addition, the SMAP project will use these observations with advanced modeling and data assimilation to provide deeper root-zone soil moisture and net ecosystem exchange of carbon. SMAP is scheduled for launch in the 2014-2015 time frame.
Article
Estimating accurate surface soil moisture (SM) dynamics from space, and knowing the error characteristics of these estimates, is of great importance for the application of satellite-based SM data throughout many Earth Science/Environmental Engineering disciplines. Here, we introduce the Bayesian inference approach to analyze the error characteristics of widely used passive and active microwave satellite-derived SM data sets, at different overpass times, acquired from the Soil Moisture Active Passive (SMAP), Soil Moisture and Ocean Salinity (SMOS), and Advanced Scatterometer (ASCAT) missions. In particular, we apply Bayesian hierarchical modeling (BHM) and triple collocation analysis (TCA) to investigate the relative importance of different environmental factors and human activities on the accuracy of satellite-based data. To start, we compare the BHM-based sensitivity analysis method to the classic multiple regression models using a frequentist approach, which includes complete pooling and no-pooling models that have been widely used for sensitivity analysis in the field of remote sensing, and demonstrate the BHM's adaptability and great potential for providing insight into sensitivity analysis that can be used by various remote sensing research communities. Next, we conduct an uncertainty analysis on BHM's model parameters using a full range of uncertainties to assess the association of various environmental factors with the accuracy of satellite-derived SM data. We focus on investigating human-induced error sources such as disturbed surface soil layers caused by irrigation activities on microwave satellite systems, naturally introduced error sources such as vegetation and soil organic matter, and errors related to the disregard of SM retrieval algorithmic assumptions, such as the thermal equilibrium assumed for passive microwave systems.
Based on the BHM-based sensitivity analysis, we find that assessments of SM data quality with a single variable should be avoided, since numerous other factors simultaneously influence their quality. As such, this provides a useful framework for applying Bayesian theory to the investigation of the error characteristics of satellite-based SM data and other time-varying geophysical variables.
Article
Accurate specification of spatiotemporal errors of remotely sensed soil moisture (SM) data is essential for a correct assessment of their utility and optimally integrating multiple SM products or assimilating them into hydrological models. Although Triple Collocation Analysis (TCA) has been widely used to provide SM errors, the impact of rescaling technique on the TCA error estimates has not received major attention, which can lead to biased and inaccurate error estimates. Moreover, current knowledge about time-variant SM errors derived from TCA is still very limited, which hampers the advance of applying time-variant errors in data merging and data assimilation studies efficiently. Based on these considerations, this work aims to advance the use of the TCA for characterizing errors with a focus on the rescaling techniques, and validating TCA-based time-variant errors using global ground measurements in 759 grid cells. To this end, the Advanced Scatterometer (ASCAT) and four passive-based SM products, including Soil Moisture and Ocean Salinity Level-3 (SMOS-L3), SMOS INRA-CESBIO (SMOS-IC), Soil Moisture Active Passive Level-3 (SMAP-L3), and SMAP INRAE BORDEAUX (SMAP-IB) SM products, were considered. The time-variant error term here denotes an aggregate error magnitude over a 101-day moving-time-window. It is found that different selection of the rescaling technique considered in TCA led to TCA error estimates with significantly different accuracy when ground-based errors are regarded as the benchmark. The optimal combination strategy to implement TCA is applying TCA to SM anomalies and rescaling the errors by coefficients derived from the TCA model. Pearson's correlation with ground-based time-variant errors is 0.62, 0.72, 0.83, 0.89, and 0.93 for SMAP-IB, SMAP-L3, SMOS-IC, SMOS-L3, and ASCAT SM, respectively. 
Considering time-variant errors in applications is necessary since time-variant errors deviate from time-invariant errors by 50% when errors are rescaled by the TCA model parameters. Time-invariant errors are greater than time-variant errors when SM products are rescaled against a reference dataset, while the opposite conclusion can be drawn when errors are rescaled by the TCA coefficients. TCA- and ground-based methods provide consistent evaluations in 74.7% (77.3%), 75.8% (79.8%), 79.6% (81.1%), and 78.6% (79.7%) of the analysis period on average (median) for the TCA implementations with SMAP-L3, SMAP-IB, SMOS-L3, and SMOS-IC SM, respectively. The error analysis reveals that TCA typically underestimated ASCAT errors while overestimating passive SM errors when considering ground-based evaluation as the benchmark. Moreover, TCA was found to have relatively less power to efficiently characterize SM errors in croplands when compared with other land cover types. This study validated TCA time-variant errors using ground measurements and compared TCA- and ground-based evaluation performances on a global scale. Our work draws particular attention to the selection of the rescaling technique in TCA, which is crucial for accurately characterizing SM errors and efficiently using them in various hydrometeorological applications.
Article
The global validation of remotely sensed and/or modeled geophysical products is often complicated by a lack of suitable ground observations for comparison. By cross-comparing three independent collocated observations, triple collocation (TC) can solve for geophysical product errors in error-prone systems. However, acquiring three independent products for a geophysical variable of interest can be challenging. Here, a double instrumental variable based algorithm (IVd) is proposed as an extension of the existing single instrumental variable (IVs) approach to estimate product error standard deviation (σ) and product-truth correlation (R) using only two independent products - an easier requirement to meet in practice. An analytical examination of the IVd method suggests that it is less prone to bias and has reduced sampling errors relative to IVs. Results from an example application of the IVd method to precipitation product error estimation show that IVd-based σ and R are good approximations of reference values obtained from TC at the global extent. In addition to their spatial consistency, IVd estimated error metrics also have only marginal (less than 5%) relative biases versus a TC baseline. Consistent with our earlier analytical analysis, these empirical results are shown to be superior to those obtained by IVs. However, several caveats for the IVd approach should be acknowledged. As with TC and IVs, IVd estimates are less robust when the signal-to-noise ratio of geophysical products is very low. Additionally, IVd may be significantly biased when geophysical products have strongly contrasting error auto-correlations.
Article
Using the first full annual cycle of Cyclone Global Navigation Satellite System (CyGNSS) observations, we investigated the limitations and capabilities of CyGNSS observations for soil moisture (SM) estimates (0–5 cm). A relative signal-to-noise ratio (rSNR) value from a CyGNSS-derived delay-Doppler map is introduced to improve the temporal resolution of SM derived from Soil Moisture Active Passive (SMAP) data. We then evaluated the CyGNSS-derived rSNR using ground-based SM measurements and the triple collocation method with SMAP and modeled SM products. We found that CyGNSS can provide useful SM estimates over moderately vegetated regions (correlation coefficient of the individual data: 0.77) but shows degraded performance over arid and densely vegetated regions (correlation coefficient of the individual data: 0.68 and 0.67). However, when rSNR data is combined with SM data from SMAP, daily SM estimates can be achieved. These results show that synergistic use of CyGNSS observations can improve on SM estimates from current satellite systems.
Article
The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), is the latest atmospheric reanalysis of the modern satellite era produced by NASA's Global Modeling and Assimilation Office (GMAO). MERRA-2 assimilates observation types not available to its predecessor, MERRA, and includes updates to the Goddard Earth Observing System (GEOS) model and analysis scheme so as to provide a viable ongoing climate analysis beyond MERRA's terminus. While addressing known limitations of MERRA, MERRA-2 is also intended to be a development milestone for a future integrated Earth system analysis (IESA) currently under development at GMAO. This paper provides an overview of the MERRA-2 system and various performance metrics. Among the advances in MERRA-2 relevant to IESA are the assimilation of aerosol observations, several improvements to the representation of the stratosphere including ozone, and improved representations of cryospheric processes. Other improvements in the quality of MERRA-2 compared with MERRA include the reduction of some spurious trends and jumps related to changes in the observing system and reduced biases and imbalances in aspects of the water cycle. Remaining deficiencies are also identified. Production of MERRA-2 began in June 2014 in four processing streams and converged to a single near-real-time stream in mid-2015. MERRA-2 products are accessible online through the NASA Goddard Earth Sciences Data Information Services Center (GES DISC).
Article
Two passive microwave missions are currently operating at L-band to monitor surface soil moisture (SM) over continental surfaces. The SMOS sensor, based on an innovative interferometric technology enabling multi-angular signatures of surfaces to be measured, was launched in November 2009. The SMAP sensor, based on a large mesh reflector 6 m in diameter providing a conically scanning antenna beam with a surface incidence angle of 40°, was launched in January of 2015. Over the last decade, an intense scientific activity has focused on the development of the SM retrieval algorithms for the two missions. This activity has relied on many field (mainly tower-based) and airborne experimental campaigns, and since 2010–2011, on the SMOS and Aquarius space-borne L-band observations. It has relied too on the use of numerical, physical and semi-empirical models to simulate the microwave brightness temperature of natural scenes for a variety of scenarios in terms of system configurations (polarization, incidence angle) and soil, vegetation and climate conditions. Key components of the inversion models have been evaluated and new parameterizations of the effects of the surface temperature, soil roughness, soil permittivity, and vegetation extinction and scattering have been developed. Among others, global maps of select radiative transfer parameters have been estimated very recently. Based on this intense activity, improvements of the SMOS and SMAP SM inversion algorithms have been proposed. Some of them have already been implemented, whereas others are currently being investigated. In this paper, we present a review of the significant progress which has been made over the last decade in this field of research with a focus on L-band, and a discussion on possible applications to the SMOS and SMAP soil moisture retrieval approaches.
Article
A study to determine radio-frequency interference (RFI) in low-frequency passive microwave observations of the Advanced Microwave Scanning Radiometer-2 (AMSR2) is performed. RFI detection methods, such as the spectral difference method, have already been applied on microwave satellite sensors. However, these methods may result in false RFI detection, particularly in zones with extreme environmental conditions. To overcome this problem, this paper proposes an approach that uses the additional 7.3-GHz channel of the AMSR2 sensor in a new RFI detection method. This method uses calculated standard errors of estimate to detect RFI contamination in 6.9- and 7.3-GHz observations. It was found that 6.9-GHz observations are mainly contaminated in the USA, India, Japan, and parts of Europe. The 7.3-GHz observations are contaminated in South America, Ukraine, the Middle East, Southeast Asia, and Russia. The fact that these channels are not affected by RFI in exactly the same regions is useful for studies that prefer C-band brightness temperature observations (e.g., soil moisture retrieval algorithms). Therefore, a decision tree approach was set up to determine RFI and to select reliable brightness temperature observations in the lowest frequency free of any man-made contamination. The result is a reduction of the total contaminated pixels in the 6.9-GHz observations of 66% for horizontal observations and even 85% for vertical observations when 7.3 and 10.7 GHz are used. By linking RFI maps with civilization maps, this paper further shows that RFI sources at the C-band frequency are mainly located in urbanized areas.
Article
This study assesses two remotely sensed soil moisture products from the Advanced Microwave Scanning Radiometer 2 (AMSR2), a sensor onboard the Global Change Observation Mission 1 — Water (GCOM-W1) that was launched in May 2012. The soil moisture products were retrieved by the Japan Aerospace Exploration Agency (JAXA) algorithm and the Land Parameter Retrieval Model (LPRM) developed by the VU University Amsterdam, in collaboration with the National Aeronautics and Space Administration (NASA). The two products are compared at the global scale. In addition, the products are evaluated against field measurements from 47 stations from the COsmic-ray Soil Moisture Observing System (COSMOS) network which are located in the United States (36 stations), Australia (7 stations), Europe (2 stations) and Africa (2 stations).
Article
Radio frequency interference (RFI) detection techniques have different challenges and opportunities for interferometric radiometers such as the Microwave Imaging Radiometer using Aperture Synthesis on the Soil Moisture and Ocean Salinity (SMOS) mission. SMOS does not have highly oversampled temporal resolution or subband filters for oversampled spectral resolution, as do other radiometers with enhanced RFI detection capabilities. It does, however, have multisampled angular resolution in the sense that a single location is viewed from many different angles of incidence. This paper compares and contrasts RFI detection algorithms that use measurements made at a variety of different levels of SMOS signal processing, including the visibility domain, brightness temperature spatial domain, and brightness temperature angular domain. The angular domain detection algorithm, in particular, is developed and characterized in detail. Examples of the algorithms applied to cases with RFI (to assess detection skill) and without RFI (to assess false-alarm behavior) are considered.
Article
Soil moisture is widely recognized as a state variable governing the mass and energy balance between the land surface and the atmosphere. Knowledge of it is therefore of utmost importance for many applications, including flood and landslide prediction. In alpine catchments, soil moisture estimation is a very difficult task because of complex topography, high vegetation density, and the presence of snow and outcrops. In this study, the possibility of estimating soil moisture for these areas by using modelled and satellite data is investigated. Specifically, an updated version of a soil water balance model, which takes the snowmelt process into account, is employed. Moreover, satellite-derived soil moisture observations obtained by the Advanced SCATterometer (ASCAT) sensor onboard the MetOp satellite are tested by considering two products: the Surface Soil Moisture (SSM) and the Soil Water Index (SWI). The latter is obtained through the application of an exponential filter and aims to reduce the differences in layer depth between in situ measurements (10 cm) and satellite data (~2-3 cm). Quality-checked in situ soil moisture measurements collected at four continuous monitoring sites in Valle d'Aosta (North Italy) are used to test the accuracy of modelled and satellite estimates. Notwithstanding the above issues, results indicated the potential not only of modelling approaches but also, unexpectedly, of satellite data to retrieve soil moisture in high elevation regions (> 1000 m a.s.l.). Indeed, by correctly estimating the snowmelt contribution, the agreement between modelled and observed data is quite good, with correlation coefficient values, r, in the range 0.795-0.940. The ASCAT-derived SWI product also provides satisfactory results, with r = 0.635-0.869. Based on these findings, in situ, modelled and satellite soil moisture data will be used in order to improve the Civil Protection Alert System.
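The exponential filter mentioned in this abstract is commonly implemented in a recursive form. The sketch below assumes that widely used recursive formulation and is not taken from the study itself: the function name `soil_water_index`, the default characteristic time `T` (in the same units as the timestamps), and the sample values are all illustrative assumptions.

```python
import math


def soil_water_index(times, ssm, T=10.0):
    """Recursive exponential filter turning surface soil moisture into SWI.

    times : observation times in ascending order (e.g. days)
    ssm   : surface soil moisture values at those times
    T     : characteristic time length, same units as `times` (assumed value)

    Update rule (assumed recursive formulation):
        K_n   = K_{n-1} / (K_{n-1} + exp(-(t_n - t_{n-1}) / T))
        SWI_n = SWI_{n-1} + K_n * (ssm_n - SWI_{n-1})
    initialised with K_1 = 1 and SWI_1 = ssm_1.
    """
    if len(times) != len(ssm):
        raise ValueError("times and ssm must have the same length")

    swi = []
    gain = 1.0
    prev_t = None
    prev_swi = None
    for t, ms in zip(times, ssm):
        if prev_t is None:
            current = ms  # first observation initialises the filter
        else:
            gain = gain / (gain + math.exp(-(t - prev_t) / T))
            current = prev_swi + gain * (ms - prev_swi)
        swi.append(current)
        prev_t, prev_swi = t, current
    return swi


# Illustrative (made-up) series with irregular sampling in days.
smoothed = soil_water_index([0, 1, 3, 4], [0.20, 0.35, 0.30, 0.25], T=5.0)
```

A larger `T` gives a smoother series that responds more slowly to individual surface observations, which is why the filtered product is often compared with deeper in situ measurements.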
Article
Knowledge of land surface water, energy, and carbon conditions is of critical importance due to its impact on many real-world applications such as agricultural production, water resource management, and flood, weather, and climate prediction. Land Information System (LIS) is a software framework that integrates the use of satellite and ground-based observational data along with advanced land surface models and computing tools to accurately characterize land surface states and fluxes. LIS employs scalable, high performance computing and data management technologies to deal with the computational challenges of high resolution land surface modeling. To make the LIS products transparently available to end users, LIS also includes a number of highly interactive visualization components. The LIS components are designed using object-oriented principles, with flexible, adaptable interfaces and modular structures for rapid prototyping and development. In addition, the interoperable features in LIS enable the definition, intercomparison, and validation of land surface modeling standards and the reuse of a high quality land surface modeling and computing system.